Commit Graph

14908 Commits

Author SHA1 Message Date
Matthew Poremba
472c697d88 arch-vega: Implement v_mfma_i32_16x16x16i8
Tested using AMD labs notes examples located on github:

https://github.com/amd/amd-lab-notes/blob/release/matrix-cores/
    src/mfma_i32_16x16x16i8.cpp

Change-Id: Ib0e50162288528012b6d3395e1f629ebf12e8e54
2024-01-03 15:41:06 -06:00
Matthew Poremba
cc75281802 gpu-compute: Update code object to latest LLVM
The AMDKernelCode struct is very outdated. Most of the fields are no
longer used and have been replaced with new fields that are used.
Therefore in order to support the new fields the code object needs to be
updated. The new structure is based on the table located at
https://llvm.org/docs/AMDGPUUsage.html#code-object-v3-kernel-descriptor

Most notably this adds the new compute_pgm_rsrc3 and kernarg preload
fields which are new features in gfx90a (MI200). The accum_offset in
compute_pgm_rsrc3 and kernarg preload values are necessary to run
applications which enable those features and therefore a way to check
their values is needed.

Also notable is the removal of enable_sgpr_workgroup_id_{X,Y,Z}. These
seem to be unused in all versions of ROCm that gem5 supports and
therefore these fields can be removed. They are replaced with a reserved
field in the new code object.

Change-Id: I5542442e1e5961b05e17affad0adb5186d6d9d1a
2024-01-03 15:41:06 -06:00
Matthew Poremba
7e1b27969f arch-vega: Improve FLAT disassembly
Use the opSelectorToRegSym which will print the full range of VGPRs
(e.g., will now print v[2:3] instead of v2 when the source / dest is
64-bits). Fixes atomic disassembly prints. Now shows "glc" if GLC bit is
enabled. Fixes some VGPR fields being printed as an SGPR in places where
the 9-bit register index bit is implied (e.g., VDST).

This makes it easier to use a GPUExec trace to match with LLVM
disassembly when debugging.

Change-Id: Ia163774850f0054243907aca8fc8d0361e37fdd5
2024-01-03 10:40:34 -06:00
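The register-range printing described in 7e1b27969f can be illustrated with a small sketch. The helper name `vgpr_sym` and its parameters are hypothetical and are not gem5's `opSelectorToRegSym`; the sketch only assumes that each VGPR holds 32 bits, so a 64-bit operand spans two consecutive registers.

```python
def vgpr_sym(index: int, bits: int) -> str:
    """Format a VGPR operand, printing a range when it spans several registers.

    Hypothetical helper for illustration only; assumes 32 bits per VGPR.
    """
    dwords = max(1, bits // 32)
    if dwords == 1:
        return f"v{index}"
    # Multi-dword operands are shown as an inclusive range, e.g. v[2:3].
    return f"v[{index}:{index + dwords - 1}]"
```

With this sketch, a 64-bit operand starting at register 2 is printed as `v[2:3]`, while a 32-bit operand is printed as `v2`.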
Matthew Poremba
bc69ab0a1f arch-vega: Add VOP3P encodings and packed 16b insts
This adds the VOP3P and VOP3P_MAI encodings from the MI200 spec. These
instructions are used for packed math and miSIMD instructions. The first
19 VOP3P opcodes are implemented and validated against hardware. This
includes all instructions which operate on one dword containing two
packed 16-bit values of fp16, int16_t, or uint16_t.

Implement one MFMA instruction for now which was also validated against
hardware.
2024-01-03 10:40:34 -06:00
Matthew Poremba
4903fe2db1 arch-arm: Allow fplib to be used outside of ARM build
This is useful in other ISAs to implement FP16 computation. For example,
it can be used in the GPU model. The ARM specific misc register is
ignored in that case.

Change-Id: I339ac0ccd9be4371b0f220ad99068e5e12b3d263
2024-01-03 10:40:34 -06:00
Matthew Poremba
8c016ebbbc gpu-compute: Implement packed workitem ABI init
This initialization method is used in gfx90a (MI200). Rather than using
three VGPRs for X,Y,Z dimensions of the kernel, pack them into one
register with 10 bits for each dimension.

Change-Id: I8e5b681c8287779ff9f80451d6028e862322294a
2024-01-03 10:40:34 -06:00
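The packed layout described in 8c016ebbbc can be sketched as follows. This is a minimal illustration, assuming X occupies bits 0-9, Y bits 10-19, and Z bits 20-29 of a single 32-bit register; the bit order and the function names are assumptions made for this example, not taken from the gem5 source.

```python
from typing import Tuple

MASK_10 = 0x3FF  # ten bits per dimension


def pack_workitem_id(x: int, y: int, z: int) -> int:
    """Pack three workitem IDs into one 32-bit value (assumed bit layout)."""
    for value in (x, y, z):
        if not 0 <= value <= MASK_10:
            raise ValueError("each workitem ID must fit in 10 bits")
    return x | (y << 10) | (z << 20)


def unpack_workitem_id(packed: int) -> Tuple[int, int, int]:
    """Recover the (x, y, z) workitem IDs from a packed register value."""
    return (
        packed & MASK_10,
        (packed >> 10) & MASK_10,
        (packed >> 20) & MASK_10,
    )
```

Packing and then unpacking returns the original triple, which is the property an ABI init routine would rely on when it writes one VGPR instead of three.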
Matthew Poremba
5e45233484 gpu-compute: Add gfx version to HSA task entry
The version is necessary for determining the correct ABI init process.
Add it to the task queue so it is accessible when doing ABI init.

Change-Id: If77434b0f93614057b5c40fcf612d59b54e05dbb
2024-01-03 10:40:34 -06:00
Bobby R. Bruce
0615ba4748 misc: Merge branch 'release-staging-v23-1' into develop
Change-Id: I091b7788d67f1803ddb8f9c4f5661f1f24c3b594
2023-12-27 12:42:51 -08:00
Bobby R. Bruce
2fe738911e misc: Change version information to develop for v24.0
Change-Id: I5a29cd574256f8a0f8963567ead0af45c1fce9f2
2023-12-27 12:29:52 -08:00
Bobby R. Bruce
4ea676471a misc: Merge branch 'release-staging-v23-1' into develop
This is just a sanity check to ensure all changes are in the `develop`
branch.

Change-Id: I7f6e6709338ae9a386bc7527d5cf4daf10d768c2
2023-12-27 11:42:56 -08:00
Bobby R. Bruce
012c4a3fbd misc: Merge branch stable into release-staging-v23-1
Change-Id: I3903331ec4c9d7ba83656bbf579ac3c1cac8518f
2023-12-27 11:41:50 -08:00
Bobby R. Bruce
646b1f4882 cpu: 'suppressFuncErrors' -> 'pkt->suppressFuncError()' fix
Change-Id: If4aa71e9f6332df2a3daa51b69eaad97f6603f6b
2023-12-21 18:46:25 -08:00
Harshil Patel
70aeaaa0e9 mem: Updated bytesRead and bytesWritten stat (#705)
- The bytesRead and bytesWritten stat had duplicate names. Updated
bytesRead and bytesWritten for dram_interface and nvm_interface

Change-Id: I7658e8a0d12ef6b95819bcafa52a85424f01ac76
2023-12-21 18:46:02 -08:00
Bobby R. Bruce
c4146d8813 misc: Fix 'maybe-uninitialized' warn turn off (#706)
https://github.com/gem5/gem5/pull/696 was implemented incorrectly and
causes an error when running with GCC 12.1. This patch fixes the error.
2023-12-21 18:45:56 -08:00
Bobby R. Bruce
e95389920a misc: Turn off 'maybe-uninitialized' warn for regex include (#696)
https://github.com/gem5/gem5/pull/636 triggered a bug with the GCC
compiler and its interaction with the CPP stdlib regex library, outlined
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105562.

This was causing the gem5 Compiler tests to fail for GCC-12:
https://github.com/gem5/gem5/actions/runs/7219055796

This fix turns off the 'maybe-uninitialized' warning when we include the
regex headers in "src/kern/linux/helpers.cc".
2023-12-21 18:45:47 -08:00
Harshil Patel
5288dbbf90 mem: Updated bytesRead and bytesWritten stat (#705)
- The bytesRead and bytesWritten stat had duplicate names. Updated
bytesRead and bytesWritten for dram_interface and nvm_interface

Change-Id: I7658e8a0d12ef6b95819bcafa52a85424f01ac76
2023-12-21 10:21:40 -08:00
Bobby R. Bruce
25e0e96741 misc: Fix 'maybe-uninitialized' warn turn off (#706)
https://github.com/gem5/gem5/pull/696 was implemented incorrectly and
causes an error when running with GCC 12.1. This patch fixes the error.
2023-12-21 10:21:20 -08:00
Bobby R. Bruce
82b5c332b7 tests: Fix Daily memory tests (#695)
Fixes a series of issues in the Daily memory tests causing test failure.
Discussed in #697.
2023-12-20 13:11:25 -08:00
Bobby R. Bruce
2f58f1c87b misc: Turn off 'maybe-uninitialized' warn for regex include (#696)
https://github.com/gem5/gem5/pull/636 triggered a bug with the GCC
compiler and its interaction with the CPP stdlib regex library, outlined
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105562.

This was causing the gem5 Compiler tests to fail for GCC-12:
https://github.com/gem5/gem5/actions/runs/7219055796

This fix turns off the 'maybe-uninitialized' warning when we include the
regex headers in "src/kern/linux/helpers.cc".
2023-12-20 13:10:56 -08:00
Bobby R. Bruce
213d0b0bfe cpu: 'suppressFuncErrors' -> 'pkt->suppressFuncError()' fix
Change-Id: If4aa71e9f6332df2a3daa51b69eaad97f6603f6b
2023-12-20 09:15:15 -08:00
Giacomo Travaglini
4f5d4b9baf mem-ruby: Implement WriteUniqueZero CHI transaction (#692)
The WriteUniqueZero is an immediate write to a Snoopable address region
that does not require any data transfer (cacheline is zeroed)

Change-Id: Ia8c9b40e08a3b7d613f0b62ce0ac4b0547860871

Reviewed-by: Tiago Muck <tiago.muck@arm.com>

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-12-19 11:12:50 +00:00
Tiberiu Bucur
27d89379d2 sim: Remove trailing / from proc/meminfo special path (#689)
Note: A bug was identified in that one of the special file paths,
namely /proc/meminfo, contained an extra trailing /, implicitly making
the incorrect assumption that meminfo was a directory, when it is, in
fact, a (pseudo-)file. This was causing applications in SE mode to fail
to open the meminfo pseudo-file with errno 13. This commit fixes this
issue.

Change-Id: I93fa81cab49645d70775088f1e634f067b300698
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2023-12-18 17:28:10 -08:00
Alexander Richardson
2700f392cb tests: Silence Clang 16 warnings (#679)
I was trying to build with clang 16 and ran into these -Werror warnings

Change-Id: I9207990fcfe9c1a5485945294969f21d1d812a7c
2023-12-18 14:57:11 -08:00
Tiberiu Bucur
9b0bf33f79 sim: Remove trailing / from proc/meminfo special path (#689)
Note: A bug was identified in that one of the special file paths,
namely /proc/meminfo, contained an extra trailing /, implicitly making
the incorrect assumption that meminfo was a directory, when it is, in
fact, a (pseudo-)file. This was causing applications in SE mode to fail
to open the meminfo pseudo-file with errno 13. This commit fixes this
issue.

Change-Id: I93fa81cab49645d70775088f1e634f067b300698
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2023-12-17 22:07:39 -08:00
Giacomo Travaglini
a008cd2611 mem-ruby: Implement a dummy StashOnceShared/Unique (#688)
Stash requests will simply be discarded by the Home Node. This will
return a CompI response to the RNF.

Change-Id: I9c2ce5d4d42f380d1a554933d381cf8a8590ba22

Reviewed-by: Tiago Muck <tiago.muck@arm.com>

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-12-16 14:43:45 -08:00
Bobby R. Bruce
29b77260f3 arch-x86: Fix two_byte_opcodes.isa 0x6 -> 0x0 (#666)
This bug was introduced by https://github.com/gem5/gem5/pull/593 and
caused Issue https://github.com/gem5/gem5/issues/664.

Change-Id: Ia55de364ee8260e1fe315e37e1cffbc71ab229fb
2023-12-14 00:41:09 -08:00
Roger Chang
654e7c6019 arch: Fix inst flag of RISC-V vector store macro instructions
Correct the instruction flags of RISC-V vector store instructions, such
as `vse64_v` and `vse32_v`. The `vse64_v` code in `decoder.isa` is
`Mem_vc.as<uint64_t>()[i] = Vs3_ud[i];`, which generates the code
`Mem.as<uint64_t>()[i] = Vs3[i];`. The current `assignRE` regex marks
the operand `Mem` as `dest` only for formats like `Mem = Rd` or
`Mem[i] = Rd`, because the code ` = Rd` or `[i] = Rd` respectively
matches `assignRE`. For the expression `Mem.as<uint64_t>()[i]`, the
operand `Mem` is falsely marked as `src` because the code
`.as<uint64_t>()[i]` does not match `assignRE`.

The PR ensures the operand `Mem` is marked as `dest` for formats like
`Mem.as<xxx>()[i] = yyy`.

Change-Id: I9c57986a64f1efb81eb9c7ade90712b118e0788d
2023-12-14 00:05:26 -08:00
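The matching problem described in 654e7c6019 can be reproduced with a small sketch. Both patterns below are hypothetical stand-ins written for this example; the real `assignRE` in gem5's ISA parser, and the way the fix extends it, may differ. The sketch only shows why text such as `.as<uint64_t>()[i]` between the operand and the `=` defeats a pattern that expects an optional index followed by an assignment.

```python
import re

# Hypothetical stand-in for the original pattern: an optional "[index]"
# followed by "=" (but not "==").
OLD_ASSIGN_RE = re.compile(r"(\[[^\]]+\])?\s*=(?!=)")

# Hypothetical extended pattern: additionally accepts a ".name<type>()"
# accessor in front of the index.
NEW_ASSIGN_RE = re.compile(r"((\.\w+(<[^>]+>)?\(\))?\[[^\]]+\])?\s*=(?!=)")


def is_dest(code: str, operand: str, assign_re) -> bool:
    """Return True if the text right after `operand` looks like an assignment."""
    pos = code.find(operand)
    if pos < 0:
        return False
    return assign_re.match(code, pos + len(operand)) is not None


STORE = "Mem.as<uint64_t>()[i] = Vs3[i];"
```

Under these assumptions, `is_dest(STORE, "Mem", OLD_ASSIGN_RE)` is false, so `Mem` would be treated as a source, while `is_dest(STORE, "Mem", NEW_ASSIGN_RE)` is true. The simpler forms `Mem = Rd` and `Mem[i] = Rd` match both patterns.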
Roger Chang
a9f8db7044 arch-riscv: Fix the vector store indexed instructions declaration
Change-Id: I6f8701ef0819c22eda8cb20d09c40101f2d001a0
2023-12-14 00:05:17 -08:00
Harshil Patel
9aab380775 arch-riscv: fix riscv matched board for se mode (#677) 2023-12-14 00:05:02 -08:00
Bobby R. Bruce
d8cc530597 stdlib: Add get_local_path() call to Looppoint resources
Due to a change introduced in https://github.com/gem5/gem5/pull/625, a
gem5 resource will not download any external files until
`get_local_path()` is called. In the construction of the Looppoint
Resources this function was not called; the `local_path` variable was
accessed directly. As such, an error occurred.

The downside of this fix is that the Looppoint resources' external files
are downloaded when `obtain_resource` is called, thus the bandwidth
savings introduced with https://github.com/gem5/gem5/pull/625 will not
occur for Looppoint resources. However,
https://github.com/gem5/gem5/issues/644 proposes a fix which would
supersede the https://github.com/gem5/gem5/pull/625 solution.

Change-Id: I52181382a03e492ec1cb58b01e71bc4820af9ccc
2023-12-14 00:04:10 -08:00
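The failure mode described in d8cc530597 is typical of lazily fetched resources. The sketch below is a generic illustration, not the gem5 standard library code: the class `LazyResource`, its `_local_path` attribute, and the `fetch` callable are hypothetical names chosen for this example.

```python
class LazyResource:
    """Hypothetical resource that defers its download until the path is requested."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that downloads and returns a path
        self._local_path = None  # stays unset until get_local_path() runs

    def get_local_path(self):
        """Trigger the download on first use and return the local path."""
        if self._local_path is None:
            self._local_path = self._fetch()
        return self._local_path
```

Code that reads `_local_path` directly before anything has called `get_local_path()` sees `None` rather than a usable path, which is the kind of error the commit describes. Going through the accessor triggers the download exactly once.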
Bobby R. Bruce
301fb3f509 stdlib: Remove 'additional_params' value type assert
The value of a `WorkloadResource`'s additional parameter may not always
be a string. It can be any JSON value (an integer, a list, a dict, etc.).
For Looppoint resources we have additional parameters such as a list of
region start points.

The assert inside workloads checking the type of the value breaks
certain use cases and is therefore removed in this commit.

Change-Id: Iecb1518082c28ab3872a8de888c76f0800261640
2023-12-14 00:04:00 -08:00
Harshil Patel
5ac9598133 Arch-riscv: Add chosen node
Change-Id: I458665caec08856cd8e61d2cd7a5b0dc5c35d469
2023-12-14 00:00:41 -08:00
Harshil Patel
7ce69b56be arch-riscv: Update riscv matched board
- Update riscv matched board to work with new
RiscvBootloaderKernelWorkload

Change-Id: Ic20b964f33e73b76775bfe18798bd667f36253f6
2023-12-14 00:00:30 -08:00
Yu-Cheng Chang
db286903ee stdlib: Fix the chi protocol of arm boot tests (#658)
Change-Id: I63f17a73b2e16bc26d9b41babc63439a6040791f
2023-12-13 23:59:00 -08:00
Harshil Patel
c66862f6e3 arch-riscv: fix riscv matched board for se mode (#677) 2023-12-13 13:16:08 -08:00
Bobby R. Bruce
695c350f31 stdlib,resources: Fix obtaining gem5 Looppoint resources (#675)
There were two small bugs preventing gem5 from obtaining Looppoint
resources.

1. When obtained via a `WorkloadResource` there was an assert which
assumed the values in the resource's DB entry's `additional_parameter`
field were of type string. This is not the case. For Looppoint resources
there are additional parameters which are arrays.
2. Due to changes introduced in https://github.com/gem5/gem5/pull/625,
the Looppoint CSV and JSON files were not being downloaded when needed.
This was fixed by replacing access to the `local_path` variable with a
call to `get_local_path()`.
2023-12-13 12:49:57 -08:00
Bobby R. Bruce
da3e3b806d arch-riscv: squash walks with tlb hits in startWalkWrapper (#672)
Because each vector load is fragmented into 64 byte cache-aligned
chunks, and one page-table walk is issued per fragment on tlb miss,
walks start to accumulate on a pending queue, which is processed in a
blocking way (no pending walks can be issued while one is being
processed). This adds noticeable latency on vector loads when VLEN is
sufficiently large.

This commit fixes the issue by allowing walks to be squashed if a TLB
lookup hits just before starting the walk on `startWalkWrapper`. This
idea was taken from the ARM walker.
2023-12-13 12:45:40 -08:00
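The squashing idea in da3e3b806d can be sketched in a few lines. This is a simplified model under stated assumptions, not the RISC-V walker code: a dictionary stands in for the TLB, pending walks are identified only by a virtual page number, and `process_pending_walks` and `do_walk` are hypothetical names.

```python
from collections import deque


def process_pending_walks(pending, tlb, do_walk):
    """Drain `pending`, skipping walks whose page is already in `tlb`.

    Returns the number of walks that were squashed.
    """
    squashed = 0
    while pending:
        vpn = pending.popleft()
        if vpn in tlb:
            # An earlier walk already filled this entry, so the queued
            # walk is dropped instead of being performed again.
            squashed += 1
            continue
        tlb[vpn] = do_walk(vpn)
    return squashed
```

If several fragments of one vector load fall in the same page, only the first queued walk is performed in this model; the rest hit in the TLB when they reach the head of the queue and are discarded.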
Saúl Adserias
78f23ad2df arch-riscv: squash walks with tlb hits in startWalkWrapper
Change-Id: I1bdfd7b2ee02ddee5a2d4c13bafc8c472f555f61
2023-12-13 16:40:46 +01:00
Giacomo Travaglini
8d09e95420 arch-arm: Partial SVE2 Implementation (#657)
Instructions added:

BGRP, RAX1, EOR3, BCAX,
XAR & TBX, PMUL, PMULLB/T, SMULLB/T and UMULLB/T

Move from gerrit [1]

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/70277

Change-Id: Ia135ba9300eae312b24342bcbda835fef6867113
2023-12-13 10:27:19 +00:00
Bobby R. Bruce
4eb81296b1 stdlib: Add get_local_path() call to Looppoint resources
Due to a change introduced in https://github.com/gem5/gem5/pull/625, a
gem5 resource will not download any external files until
`get_local_path()` is called. In the construction of the Looppoint
Resources this function was not called; the `local_path` variable was
accessed directly. As such, an error occurred.

The downside of this fix is that the Looppoint resources' external files
are downloaded when `obtain_resource` is called, thus the bandwidth
savings introduced with https://github.com/gem5/gem5/pull/625 will not
occur for Looppoint resources. However,
https://github.com/gem5/gem5/issues/644 proposes a fix which would
supersede the https://github.com/gem5/gem5/pull/625 solution.

Change-Id: I52181382a03e492ec1cb58b01e71bc4820af9ccc
2023-12-12 14:28:11 -08:00
Bobby R. Bruce
4adeb24a4f stdlib: Remove 'additional_params' value type assert
The value of a `WorkloadResource`'s additional parameter may not always
be a string. It can be any JSON value (an integer, a list, a dict, etc.).
For Looppoint resources we have additional parameters such as a list of
region start points.

The assert inside workloads checking the type of the value breaks
certain use cases and is therefore removed in this commit.

Change-Id: Iecb1518082c28ab3872a8de888c76f0800261640
2023-12-12 14:23:04 -08:00
Bobby R. Bruce
eff08ba113 mem: Add a flag on AbstractMemory to control statistics collection (#656)
The stats initialization in AbstractMemory allocates space according to
the maximum number of requestors in the System. This may cause issues in
multi-system simulations.
Suppose there are two systems, A and B. A has one requestor and a
memory, while B has two requestors. When the requestor with requestor id
2 sends requests to the memory in A, the simulator crashes because
requestor id 2 is outside the allocated space.

The current solution is to add a SysBridge between A and B which
rewrites the requestor id to a valid one. This solution works, but the
bridge must be placed at the correct boundary, which may not be easy. In
addition, the stats would record mapped data, which may not be accurate.

To reduce the complexity, we add a flag to AbstractMemory to control
the stats. If users don't want the statistics and want to solve the
cross-system issue simply, they can disable statistics collection.
The flag is True by default so as not to disturb current users.
2023-12-12 13:13:30 -08:00
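The crash and the flag described in eff08ba113 can be modelled with a small sketch. The class `MemoryStats` and the parameter name `collect_stats` are assumptions made for this example and may not match the actual AbstractMemory parameter; the point is only that per-requestor counters sized for one system break when a requestor id from another system arrives.

```python
class MemoryStats:
    """Hypothetical per-requestor byte counters for a memory object."""

    def __init__(self, max_requestors: int, collect_stats: bool = True):
        self.collect_stats = collect_stats
        # Counters are sized by the owning system's requestor count.
        self.bytes_read = [0] * max_requestors if collect_stats else []

    def record_read(self, requestor_id: int, nbytes: int) -> None:
        if not self.collect_stats:
            return  # statistics disabled: any requestor id is accepted
        # Raises IndexError when requestor_id is outside the allocated range.
        self.bytes_read[requestor_id] += nbytes
```

In this model, a memory sized for one requestor raises `IndexError` when requestor id 2 reads from it, whereas the same access is silently accepted when the object is created with `collect_stats=False`.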
Bobby R. Bruce
c8cc193db8 arch,arch-riscv: Fix inst flag of RISC-V vector store macro instructions (#674)
Correct the instruction flags of RISC-V vector store instructions, such
as `vse64_v` and `vse32_v`. The `vse64_v` code in `decoder.isa` is
`Mem_vc.as<uint64_t>()[i] = Vs3_ud[i];`, which generates the code
`Mem.as<uint64_t>()[i] = Vs3[i];`. The current `assignRE` regex marks
the operand `Mem` as `dest` only for formats like `Mem = Rd` or
`Mem[i] = Rd`, because the code ` = Rd` or `[i] = Rd` respectively
matches `assignRE`. For the expression `Mem.as<uint64_t>()[i]`, the
operand `Mem` is falsely marked as `src` because the code
`.as<uint64_t>()[i]` does not match `assignRE`.

The PR ensures the operand `Mem` is marked as `dest` for formats like
`Mem.as<xxx>()[i] = yyy`.
2023-12-12 13:07:50 -08:00
Bobby R. Bruce
37e4173351 arch-x86: Fix two_byte_opcodes.isa 0x6 -> 0x0 (#666)
This bug was introduced by https://github.com/gem5/gem5/pull/593 and
caused Issue https://github.com/gem5/gem5/issues/664.

Change-Id: Ia55de364ee8260e1fe315e37e1cffbc71ab229fb
2023-12-12 08:21:27 -08:00
Roger Chang
bedc3c597c arch: Fix inst flag of RISC-V vector store macro instructions
Correct the instruction flags of RISC-V vector store instructions, such
as `vse64_v` and `vse32_v`. The `vse64_v` code in `decoder.isa` is
`Mem_vc.as<uint64_t>()[i] = Vs3_ud[i];`, which generates the code
`Mem.as<uint64_t>()[i] = Vs3[i];`. The current `assignRE` regex marks
the operand `Mem` as `dest` only for formats like `Mem = Rd` or
`Mem[i] = Rd`, because the code ` = Rd` or `[i] = Rd` respectively
matches `assignRE`. For the expression `Mem.as<uint64_t>()[i]`, the
operand `Mem` is falsely marked as `src` because the code
`.as<uint64_t>()[i]` does not match `assignRE`.

The PR ensures the operand `Mem` is marked as `dest` for formats like
`Mem.as<xxx>()[i] = yyy`.

Change-Id: I9c57986a64f1efb81eb9c7ade90712b118e0788d
2023-12-12 17:04:31 +08:00
Roger Chang
10d344a942 arch-riscv: Fix the vector store indexed instructions declaration
Change-Id: I6f8701ef0819c22eda8cb20d09c40101f2d001a0
2023-12-12 16:36:49 +08:00
Bobby R. Bruce
ea1226119c arch-riscv: Update riscv matched board (#654)
- Update riscv matched board to work with new
RiscvBootloaderKernelWorkload

Change-Id: Ic20b964f33e73b76775bfe18798bd667f36253f6
2023-12-08 13:33:09 -08:00
Yu-Cheng Chang
10a0c950da stdlib: Fix the chi protocol of arm boot tests (#658)
Change-Id: I63f17a73b2e16bc26d9b41babc63439a6040791f
2023-12-07 16:10:45 -08:00
Giacomo Travaglini
81d3c6307d arch-arm: add Sve mla and mls indexed (#596)
This contains the implementation of the indexed versions of the MLA and
MLS instructions from the ARM SVE2 ISA specification.
2023-12-07 21:47:35 +00:00
Harshil Patel
0f0317ad16 Arch-riscv: Add chosen node
Change-Id: I458665caec08856cd8e61d2cd7a5b0dc5c35d469
2023-12-06 20:10:56 -08:00