Commit Graph

21722 Commits

Author SHA1 Message Date
Hoa Nguyen
500da4306b arch: Mark FailUnimplemented instructions as Invalid instructions (#1247)
This is a follow-up on the discussion here [1].

The IsInvalid flag was previously defined as an instruction that does
not appear in the ISA. However, a micro-architecture can choose to not
recognize an instruction in and raise illegal instruction fault even if
the instruction is in the ISA.

This change modifies the definition of a Invalid instruction such that,
if a StaticInst instruction is marked as IsInvalid, it means the
instruction is not recognized by the decoder. This means that any
instruction recognized by the decoder are not invalid, even if the
instruction is not in the official ISA spec; e.g., m5
pseudo-instructions.

Note that instructions that are recognized by the decoder but are chosen
to act as a nop are not invalid. This applies to WarnUnimplemented
instructions, e.g. hint instructions.

[1] https://github.com/gem5/gem5/pull/1071

Change-Id: I1371b222d8b06793d47f434d0f148c5571672068

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-06-17 12:44:05 -07:00
Giacomo Travaglini
2804311f7b cpu-o3: Revert "Do not set Executed on load instruction to be replayed" (#1251)
Reverts gem5/gem5#1182

This is breaking O3 execution. Investigating the matter
2024-06-17 12:24:43 -07:00
Matt Sinclair
6776bebbf6 gpu-compute,mem-ruby: Add RubyHitMiss flag for TCP and TCC cache (#1226)
Add hit and miss print for TCP and TCC cache with RubyHitMiss debug flag

Change-Id: I4430532b901811e03d9b077b61e2eca4557b34e1
2024-06-17 12:47:47 -05:00
Matthew Poremba
50e4209a4a arch-vega: Various MI300 fixes for PyTorch tests (#1249)
- Fix address calculation issue with scratch_* instructions when SVE bit
is 0.
- Fix ds_swizzle_b32 not mapping to execution unit.
- Implement VOP3 V_FMAC_B32.
- Fix architected scratch address register being clobbered.

Tested with MNIST from PyTorch quickstart tutorial and nanoGPT on
mi300.py.
2024-06-17 07:59:47 -07:00
Jarvis Jia
3a2bf47d57 Add default value and change Ruby address format specifier
Change-Id: I8fbaf34745e90589e610d3b9bd423937e7ebdc3d
2024-06-17 03:27:25 -05:00
Jarvis Jia
edb2e76077 Merge branch 'develop' into rubyhitmiss 2024-06-17 15:57:50 +08:00
Matthew Poremba
2b0ca93517 gpu-compute: Fix architected flat scratch
Currently writing to SRF which is incorrect, as the physical register
number can be clobbered by another wavefront if registers get renamed to
the physical register number.

Fix this by actually architecting the register, i.e., there is a
dedicated "hardware" register in the wavefront class.

Change-Id: I94e9e463eed348b2928cae884c1c20566c00984d
2024-06-15 15:46:33 -07:00
Matthew Poremba
2f5842d253 arch-vega: Add valid flag to ds_swizzle_b32
Currently the flag is just Load and there is a long comment explaining
why. This does not meet any of the scoreboard check requirements:

https://github.com/gem5/gem5/blob/develop/src/gpu-compute/scoreboard_check_stage.cc#L230-L241

Add a generic ALU flag as well so the instruction executes instead of
panicking.

Change-Id: I54b2d20d47fad5e8f05f927328433aab7db7d862
2024-06-15 14:28:59 -07:00
Matthew Poremba
42369eab2c arch-vega: Implement MI300 FLAT SVE bit
For scratch instructions only, this bit specifies if an offset in a VGPR
should be used for address calculation. This is new in MI300 and was
previously the LDS bit. The LDS bit is rarely used and in fact gem5 does
not even check this bit.

This fixes a bug when SADDR == 0x7f (i.e., no SGPR should be used) where
a VGPR was being added to the address when it should have been ignored.

Change-Id: I9864379692df6795b25b58b98825da05d18fc5db
2024-06-15 14:28:59 -07:00
Matthew Poremba
1dab4be002 arch-vega: Implement VOP3 V_FMAC_F32
A version of V_FMAC_F32 with extra modifiers from VOP3 format.

Change-Id: Ib6b41b0a3ceb91269b91a0287dfc94bc73e4d217
2024-06-15 14:28:58 -07:00
Matthew Poremba
f91d14fe46 gpu-compute: Add MFMA stats (#1248)
Add dynamic instruction counts for MFMAs.

Change-Id: I976b01344577cf011aeb3dd648a8c0017281c4e3
2024-06-15 13:04:00 -07:00
Minje Jun
b8e21a2d32 cpu-o3: Do not set Executed on load instruction to be replayed (#1182)
A load instruction can be replayed when
1) it's strictly ordered or
2) it falls into load-store forwarding mismatch.

Case 1 was considered in executeLoad function but the case 2 wasn't. It
causes the case-2 replayed load instruction to violate the assertion
condition "assert(!load_inst->isExecuted())" in LSQUnit::read. This
commit fixes the problem by adding consideration of the case 2 in
LSQUnit::executeLoad.

Co-authored-by: Minje Jun <minje.jun@samsung.com>
2024-06-14 10:12:26 -07:00
Matthew Poremba
3cf638e217 gpu-compute, util-m5: add GPU kernel exit events (#1217)
The GPUFS scripts include support for dumping and resetting
stats at kernel boundaries by identifying specific GPU kernel 
exit events. This commit extends that support to work with 
GPU SE-mode support.

Change-Id: I662233ae71e2987d90af1fd0100e29036b2ef1c6
2024-06-14 08:13:27 -07:00
Jason Lowe-Power
21ffd91529 cpu,arch: Add IsInvalid flag to Unknown insts (#1071)
The IsInvalid flag indicates that the static instruction is not part of
the executing ISA and not part of m5's pseudo-instructions. This flag
provides a way to recognize an illegal instruction at the decode stage.
2024-06-13 16:26:35 -07:00
Matthew Poremba
b3d9dc42d4 configs: Add replacement policy options for GPUFS (#1230)
GPU_VIPER.py was modified to use these options but they did not exist,
breaking GPUFS. This commit adds them to fix the issue.

Change-Id: I0095f400ea606c4e8d91a41870ef208465cef803
2024-06-13 11:23:50 -07:00
Jarvis Jia
87c0d7732c Merge branch 'develop' into rubyhitmiss 2024-06-12 17:30:35 -04:00
Jarvis Jia
edfc139c40 Change black format
Change-Id: I3733b31baf187e0d3d38d971d9423a1b1afe2296

gpu-compute: add GPU RubyHitMiss for TCP and TCC

Change-Id: I4430532b901811e03d9b077b61e2eca4557b34e1

gpu-compute: Add RubyHitMiss flag for TCP and TCC cache

Change-Id: I4e5d1127c84b9eb1060ec9ba0b6638267449eda5

gpu-compute: Add RubyHitMiss flag for TCP and TCC cache

Change-Id: I4e5d1127c84b9eb1060ec9ba0b6638267449eda5

Remove space

Change-Id: I401f528c6f128ba0956bdbc232e8f2ae37bf648c
2024-06-12 16:04:36 -05:00
Jarvis Jia
b6b2e8c6c5 Black format
Change-Id: If224c106262bae25127675160ea78386eedace3b
2024-06-12 15:57:04 -05:00
Jarvis Jia
0ebcddea95 Update apu_se.py to remove part not needed
Change-Id: I06df4e0a67ccd2b7a45296ff65bf26c2b465a934
2024-06-12 15:54:13 -05:00
Matthew Poremba
be0a7937c1 mem-ruby: Fix deadlock in GPU_VIPER when issuing atomic requests (#1216)
When a compute unit issues several requests to the same line,
the requests wait in the L2 if it is a writeback cache. If the line is
invalid initially and the first request is atomic in nature, the L2
cache issues a request to main memory. On data return, the cache line
transitions to M but doesn't wake up the other requests, resulting in
a deadlock. This commit adds a wakeup call on data return for atomics
and fixes potential deadlocks.
2024-06-12 10:10:32 -07:00
Harshil Patel
74afea471d cpu: Revert "Don't change to suspend if the thread status is halted" (#1225)
Reverts gem5/gem5#1039
2024-06-12 00:20:06 -07:00
Bobby R. Bruce
f9abf6bb08 stdlib: Improve gem5 PyStats (#996)
This PR incorporates numerous improvements and fixes to the gem5
PyStats. This includes:

* PyStats now support SimObject Vectors. The PyStats representing them
are subscribable and therefore acceptable by accessing an index: e.g.,:
`simobjectvec[0]`. (This replaces the `Vector` group PyStat)
* Adds the `SparseHist` PyStats.
* Adds the `Vector2d` to PyStats.
* The `Distribution` PyStats is fixed to be a vector of Scalars.
* Tests added for the PyStat's Vector and bugs fixed.
2024-06-12 00:19:08 -07:00
Bobby R. Bruce
261490f23c misc,tests: Revert merge version to 'v4' from 'v4.0.0'
'v4.0.0' wasn't working. The following error was occurred:

```
Can't find 'action.yml', 'action.yaml' or 'Dockerfile' for action 'actions/upload-artifact/merge@v4.0.0'.
```

Change-Id: I658b0fe292df029501fbc1286acb06f4014ae4e1
2024-06-12 00:14:27 -07:00
Vishnu Ramadas
42b9a9666e mem-ruby: Add instSeqNum to atomic responses from GPU L2 caches
This commit adds instSeqNum to the atomic responses in
GPU_VIPER-TCC.sm. This will be useful when debugging issues related to
GPU atomic transactions

Change-Id: Ic05c8e1a1cb230abfca2759b51e5603304aadaa3
2024-06-11 20:35:43 -05:00
Vishnu Ramadas
943d1f1453 mem-ruby: Fix deadlock in GPU_VIPER when issuing atomic requests
When a compute unit issues several requests to the same line,
the requests wait in the L2 if it is a writeback cache. If the line is
invalid initially and the first request is atomic in nature, the L2
cache issues a request to main memory. On data return, the cache line
transitions to M but doesn't wake up the other requests, resulting in
a deadlock. This commit adds a wakeup call on data return for atomics
and fixes potential deadlocks.

Change-Id: I8200ce6e77da7c8b4db285c0cc8b8ca0dfa7d720
2024-06-11 20:33:46 -05:00
Bobby R. Bruce
7e45ec0ff0 stdlib: Fix m5.ext.pystats __init__.py
Addresses Jason's complaint that wildcare imports should be avoided, in
accordance with PEP008:
https://github.com/gem5/gem5/pull/996#discussion_r1621051601.

Change-Id: I72266df43d3ec4ede3f45c3e34e2e05e1990bd6b
2024-06-11 16:26:24 -07:00
Bobby R. Bruce
8fc4d3f793 misc,tests: Update daily test artifact actions to v4.0.0
Change-Id: I711fa36639e925ce958e0484a31ee6a4dde87dbe
2024-06-11 15:43:40 -07:00
Matt Sinclair
8a44e97a10 gpu-compute: Added functions to choose replacement policies for GPU (#1213)
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem
replacement
policies  for TCC, TCP and SQC caches for GPU
2024-06-11 15:08:42 -05:00
Hoa Nguyen
d528a6bd2d arch: Flag all ISAs Unknown instruction as IsInvalid
Change-Id: I096138a157c4e2063c5f4f4324c21c1463dddb65
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-06-11 18:48:29 +00:00
Hoa Nguyen
369029d2be cpu: Add IsInvalid flag to StaticInstFlags
The IsInvalid flag indicates that the static instruction is not part
of the executing ISA and not part of m5's pseudo-instructions. This
flag provides a way to recognize an illegal instruction at the decode
stage.

Change-Id: I2779c6edcd8c5e6a77ea11cad3ff73bacb79d800
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-06-11 18:48:29 +00:00
Harry Chiang
d198380489 base: Fix uninitialized variable warning in symtab.test.cc (#1221)
This warning is appeared when I add warning related flags to LINKFLAGS
and turn on LTO to build unit tests.
2024-06-11 10:53:00 -07:00
Jarvis Jia
4fea51b598 Black format change
Change-Id: I95cbf5b97601ef3b6ca26bc1a1835305929ffcab
2024-06-10 22:52:56 -05:00
Jarvis Jia
8e268d42e2 gpu-compute: Provided m5ops support for gpu
Adding m5 stat dump and reset into python script through different exit
event

Change-Id: I662233ae71e2987d90af1fd0100e29036b2ef1c6
2024-06-10 20:56:08 -05:00
Jarvis Jia
cf5e316a92 Change black format
Change-Id: I3733b31baf187e0d3d38d971d9423a1b1afe2296
2024-06-10 16:33:18 -05:00
Jarvis Jia
3404369e68 gpu-compute: Added functions to choose replacement policies for GPU
Adding RP_choose function to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies  for TCC, TCP and SQC caches for GPU

Change-Id: I86cc41cca19f8e0d24d8cf015e2e034a1fc4bc43
2024-06-10 16:24:20 -05:00
Jarvis Jia
ccdfe00998 gpu-compute: Added functions to choose replacement policies for GPU
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies for TCC, TCP and SQC caches for GPU

Change-Id: If84a13babf1006ad41a557747c45d48ce2ce22a9
2024-06-10 16:22:41 -05:00
Jarvis Jia
3c8c783bc3 gpu-compute: Added functions to choose replacement policies for GPU
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies for TCC, TCP and SQC caches for GPU
2024-06-10 15:13:21 -05:00
Jarvis Jia
c158ce22bf gpu-compute: Added functions to choose replacement policies for GPU
Adding RP_choose function to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies  for TCC, TCP and SQC caches for GPU
2024-06-10 15:11:17 -05:00
Jarvis Jia
7c410797d1 Adding functions to choose replacement policies for GPU
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies  for TCC, TCP and SQC caches for GPU
2024-06-10 14:09:09 -05:00
Jarvis Jia
5b44eca64e Adding functions to choose replacement policies for GPU
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies  for TCC, TCP and SQC caches for GPU
2024-06-10 13:58:24 -05:00
Alexander Richardson
3cfc550fc0 arch-arm,mem: Don't hardcode secure mode accesses for semihosting (#1200)
When accessing memory using functionalAccess(), the MMU could tell us to
use a nonsecure access even though the CPU is operating in secure mode.
I noticed this while trying to run a simple semihosting hello world with
the MMU+caches enabled and the semihosting calls ended up reading from
memory instead of the caches due to an S/NS mismatch.

See also https://github.com/gem5/gem5/pull/1198 which happens to also
mask the issue I saw, but I believe both changes are needed.

Change-Id: I9e6b9839b194fbd41938e2225449c74701ea7fee
2024-06-09 14:08:54 -07:00
Saúl
5cfad84a98 arch-riscv: correctly set dynamic VLEN for all arith instructions (#1187)
Some arithmetic instructions of the riscv vector extension where still
using the default VLEN=256 instead of the dynamic one through the
inherited `vlen` attribute. Most of them only use this to calculate the
effective index for the mask element like so:

```
uint32_t ei = i + vtype_VLMAX(vtype, vlen, true) * this->microIdx;
if (this->vm || elem_mask(v0, ei)) {
...
```

This means that instructions will wrongly compute the mask index in the
second and subsequent micro instructions (`microIdx` > 0). This commit
fixes this by adding the corresponding `set_vlen` snippet to the
affected instruction formats.

Change-Id: Ib041de972d6938490741a9fb4c214a6a5172c34e
2024-06-07 22:33:56 -07:00
Alexander Richardson
ec5881ec4e arch-arm: avoid using an uninitialized variable use in MMU walks (#1198)
While running a simple Arm32 binary, I noticed that all memory
transactions were being marked as NS instead of S once I turn on the MMU
(even though the page tables have the NS bit set to zero). The result of
this was that semihosting calls were failing since they were using
functional accesses with the SECURE flag set, but the caches only
contained NS tagged entries so these accesses always read stale values
from DRAM.

Digging through the Arm MMU code it appears that the NS bit lookup was
being keyed of the `secureLookup` flag which is only used for long
descriptors. I believe 0c28712f51 should
have used isSecure instead of secureLookup. To avoid using these
uninitialized values in the future I wrapped the LPAE state in a
std::optional to ensure that it is only accessed once initialized.

Change-Id: Ibc406ed3f4cfa768f470e34a5eca3c1a2bf45cd8
2024-06-07 08:59:28 +01:00
Alexander Richardson
8e5fbcbbbb arch-generic: flush streams after semihosting write calls (#1202)
The SYS_WRITEC and SYS_WRITE0 calls are specified as writing to the
debug channel, so it is a reasonable expectation for these messages to
be visibile immediately after the semihosting call.

Change-Id: I8e6e9a7aab593a59e82ecb9cf4603c18c7a8acbe
2024-06-06 09:57:36 +01:00
Alexander Richardson
abbb94af8b dev-arm: Fix -Wdeprecated-copy warning (#1197)
Clang warns as follows: `warning: definition of implicit copy
constructor for 'TranslResult' is deprecated because it has a
user-declared copy assignment operator`

Change-Id: Ic701d8522aac75d569f4f513f54de91f76a17e48
2024-06-05 12:36:38 +01:00
Ivana Mitrovic
a764b9be1c Revert "arch-x86: Fix TLB Assertion Error on CFLUSH" (#1196)
Reverts gem5/gem5#1080 as it is not a good fix.
2024-06-04 10:26:53 -07:00
Hoa Nguyen
40ef8f3afb dev: Remove an extra file in virtio (#1191)
`src/dev/virtio/VirtIORng 2.py` is identical to
`src/dev/virtio/VirtIORng.py`, and the former does not appear in any
build script.

Change-Id: I9c5f1b1a3809d1c7028b630c32310e540613e232

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-06-04 08:40:41 -07:00
dependabot[bot]
500bdc5302 misc: bump tqdm from 4.66.3 to 4.66.4 (#1192)
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.3 to 4.66.4.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 06:35:35 -07:00
dependabot[bot]
8c98dcb7cf misc: bump pre-commit from 3.7.0 to 3.7.1 (#1193)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.7.0
to 3.7.1.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 06:34:53 -07:00
Lukas Zenick
dad5c7b6f7 arch-x86: Fix TLB Assertion Error on CFLUSH (#1080)
Fixed the assertion statement in the cpu's translation.hh file so that
it doesn't fail the assertion if the cache is clean.

I compile this c code to `test`
```c
#include <stdio.h>

static inline void clflush(volatile void *p) {
    __asm__ volatile ("clflush (%0)" : : "r"(p) : "memory");
}

int main() {
    int data = 42;  // Example variable

    printf("Value before clflush: %d\n", data);

    clflush(&data);

    printf("Value after clflush: %d\n", data);

    return 0;
}
```
And run it with this script
`./build/X86/gem5.opt configs/learning_gem5/part1/two_level.py ./test`
In order to verify that it no longer fails the assertion check.

GitHub Issue: #862 
Change-Id: I6004662e7c99f637ba0ddb07d205d1657708e99f
2024-06-03 10:17:10 -07:00