derek/gem5

Author	SHA1	Message	Date
Hoa Nguyen	500da4306b	arch: Mark FailUnimplemented instructions as Invalid instructions (#1247 ) This is a follow-up on the discussion here [1]. The IsInvalid flag was previously defined to mark an instruction that does not appear in the ISA. However, a micro-architecture can choose not to recognize an instruction and raise an illegal instruction fault even if the instruction is in the ISA. This change modifies the definition of an Invalid instruction such that, if a StaticInst instruction is marked as IsInvalid, it means the instruction is not recognized by the decoder. This means that any instruction recognized by the decoder is not invalid, even if the instruction is not in the official ISA spec; e.g., m5 pseudo-instructions. Note that instructions that are recognized by the decoder but are chosen to act as a nop are not invalid. This applies to WarnUnimplemented instructions, e.g. hint instructions. [1] https://github.com/gem5/gem5/pull/1071 Change-Id: I1371b222d8b06793d47f434d0f148c5571672068 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-06-17 12:44:05 -07:00
Giacomo Travaglini	2804311f7b	cpu-o3: Revert "Do not set Executed on load instruction to be replayed" (#1251 ) Reverts gem5/gem5#1182 This is breaking O3 execution. Investigating the matter	2024-06-17 12:24:43 -07:00
Matt Sinclair	6776bebbf6	gpu-compute,mem-ruby: Add RubyHitMiss flag for TCP and TCC cache (#1226 ) Add hit and miss print for TCP and TCC cache with RubyHitMiss debug flag Change-Id: I4430532b901811e03d9b077b61e2eca4557b34e1	2024-06-17 12:47:47 -05:00
Matthew Poremba	50e4209a4a	arch-vega: Various MI300 fixes for PyTorch tests (#1249 ) - Fix address calculation issue with scratch_* instructions when SVE bit is 0. - Fix ds_swizzle_b32 not mapping to execution unit. - Implement VOP3 V_FMAC_F32. - Fix architected scratch address register being clobbered. Tested with MNIST from PyTorch quickstart tutorial and nanoGPT on mi300.py.	2024-06-17 07:59:47 -07:00
Jarvis Jia	3a2bf47d57	Add default value and change Ruby address format specifier Change-Id: I8fbaf34745e90589e610d3b9bd423937e7ebdc3d	2024-06-17 03:27:25 -05:00
Jarvis Jia	edb2e76077	Merge branch 'develop' into rubyhitmiss	2024-06-17 15:57:50 +08:00
Matthew Poremba	2b0ca93517	gpu-compute: Fix architected flat scratch Currently writing to SRF which is incorrect, as the physical register number can be clobbered by another wavefront if registers get renamed to the physical register number. Fix this by actually architecting the register, i.e., there is a dedicated "hardware" register in the wavefront class. Change-Id: I94e9e463eed348b2928cae884c1c20566c00984d	2024-06-15 15:46:33 -07:00
Matthew Poremba	2f5842d253	arch-vega: Add valid flag to ds_swizzle_b32 Currently the flag is just Load and there is a long comment explaining why. This does not meet any of the scoreboard check requirements: https://github.com/gem5/gem5/blob/develop/src/gpu-compute/scoreboard_check_stage.cc#L230-L241 Add a generic ALU flag as well so the instruction executes instead of panicking. Change-Id: I54b2d20d47fad5e8f05f927328433aab7db7d862	2024-06-15 14:28:59 -07:00
Matthew Poremba	42369eab2c	arch-vega: Implement MI300 FLAT SVE bit For scratch instructions only, this bit specifies if an offset in a VGPR should be used for address calculation. This is new in MI300 and was previously the LDS bit. The LDS bit is rarely used and in fact gem5 does not even check this bit. This fixes a bug when SADDR == 0x7f (i.e., no SGPR should be used) where a VGPR was being added to the address when it should have been ignored. Change-Id: I9864379692df6795b25b58b98825da05d18fc5db	2024-06-15 14:28:59 -07:00
Matthew Poremba	1dab4be002	arch-vega: Implement VOP3 V_FMAC_F32 A version of V_FMAC_F32 with extra modifiers from VOP3 format. Change-Id: Ib6b41b0a3ceb91269b91a0287dfc94bc73e4d217	2024-06-15 14:28:58 -07:00
Matthew Poremba	f91d14fe46	gpu-compute: Add MFMA stats (#1248 ) Add dynamic instruction counts for MFMAs. Change-Id: I976b01344577cf011aeb3dd648a8c0017281c4e3	2024-06-15 13:04:00 -07:00
Minje Jun	b8e21a2d32	cpu-o3: Do not set Executed on load instruction to be replayed (#1182 ) A load instruction can be replayed when 1) it is strictly ordered or 2) it hits a load-store forwarding mismatch. Case 1 was handled in the executeLoad function but case 2 was not. As a result, a load replayed under case 2 violates the assertion "assert(!load_inst->isExecuted())" in LSQUnit::read. This commit fixes the problem by also handling case 2 in LSQUnit::executeLoad. Co-authored-by: Minje Jun <minje.jun@samsung.com>	2024-06-14 10:12:26 -07:00
Matthew Poremba	3cf638e217	gpu-compute, util-m5: add GPU kernel exit events (#1217 ) The GPUFS scripts include support for dumping and resetting stats at kernel boundaries by identifying specific GPU kernel exit events. This commit extends that support to work with GPU SE-mode support. Change-Id: I662233ae71e2987d90af1fd0100e29036b2ef1c6	2024-06-14 08:13:27 -07:00
Jason Lowe-Power	21ffd91529	cpu,arch: Add IsInvalid flag to Unknown insts (#1071 ) The IsInvalid flag indicates that the static instruction is not part of the executing ISA and not part of m5's pseudo-instructions. This flag provides a way to recognize an illegal instruction at the decode stage.	2024-06-13 16:26:35 -07:00
Matthew Poremba	b3d9dc42d4	configs: Add replacement policy options for GPUFS (#1230 ) GPU_VIPER.py was modified to use these options but they did not exist, breaking GPUFS. This commit adds them to fix the issue. Change-Id: I0095f400ea606c4e8d91a41870ef208465cef803	2024-06-13 11:23:50 -07:00
Jarvis Jia	87c0d7732c	Merge branch 'develop' into rubyhitmiss	2024-06-12 17:30:35 -04:00
Jarvis Jia	edfc139c40	Change black format Change-Id: I3733b31baf187e0d3d38d971d9423a1b1afe2296 gpu-compute: add GPU RubyHitMiss for TCP and TCC Change-Id: I4430532b901811e03d9b077b61e2eca4557b34e1 gpu-compute: Add RubyHitMiss flag for TCP and TCC cache Change-Id: I4e5d1127c84b9eb1060ec9ba0b6638267449eda5 gpu-compute: Add RubyHitMiss flag for TCP and TCC cache Change-Id: I4e5d1127c84b9eb1060ec9ba0b6638267449eda5 Remove space Change-Id: I401f528c6f128ba0956bdbc232e8f2ae37bf648c	2024-06-12 16:04:36 -05:00
Jarvis Jia	b6b2e8c6c5	Black format Change-Id: If224c106262bae25127675160ea78386eedace3b	2024-06-12 15:57:04 -05:00
Jarvis Jia	0ebcddea95	Update apu_se.py to remove part not needed Change-Id: I06df4e0a67ccd2b7a45296ff65bf26c2b465a934	2024-06-12 15:54:13 -05:00
Matthew Poremba	be0a7937c1	mem-ruby: Fix deadlock in GPU_VIPER when issuing atomic requests (#1216 ) When a compute unit issues several requests to the same line, the requests wait in the L2 if it is a writeback cache. If the line is invalid initially and the first request is atomic in nature, the L2 cache issues a request to main memory. On data return, the cache line transitions to M but doesn't wake up the other requests, resulting in a deadlock. This commit adds a wakeup call on data return for atomics and fixes potential deadlocks.	2024-06-12 10:10:32 -07:00
Harshil Patel	74afea471d	cpu: Revert "Don't change to suspend if the thread status is halted" (#1225 ) Reverts gem5/gem5#1039	2024-06-12 00:20:06 -07:00
Bobby R. Bruce	f9abf6bb08	stdlib: Improve gem5 PyStats (#996 ) This PR incorporates numerous improvements and fixes to the gem5 PyStats. This includes: * PyStats now support SimObject Vectors. The PyStats representing them are subscriptable and therefore accessible by index, e.g., `simobjectvec[0]`. (This replaces the `Vector` group PyStat) * Adds the `SparseHist` PyStats. * Adds the `Vector2d` to PyStats. * The `Distribution` PyStats is fixed to be a vector of Scalars. * Tests added for the PyStat's Vector and bugs fixed.	2024-06-12 00:19:08 -07:00
Bobby R. Bruce	261490f23c	misc,tests: Revert merge version to 'v4' from 'v4.0.0' 'v4.0.0' wasn't working. The following error occurred: ``` Can't find 'action.yml', 'action.yaml' or 'Dockerfile' for action 'actions/upload-artifact/merge@v4.0.0'. ``` Change-Id: I658b0fe292df029501fbc1286acb06f4014ae4e1	2024-06-12 00:14:27 -07:00
Vishnu Ramadas	42b9a9666e	mem-ruby: Add instSeqNum to atomic responses from GPU L2 caches This commit adds instSeqNum to the atomic responses in GPU_VIPER-TCC.sm. This will be useful when debugging issues related to GPU atomic transactions Change-Id: Ic05c8e1a1cb230abfca2759b51e5603304aadaa3	2024-06-11 20:35:43 -05:00
Vishnu Ramadas	943d1f1453	mem-ruby: Fix deadlock in GPU_VIPER when issuing atomic requests When a compute unit issues several requests to the same line, the requests wait in the L2 if it is a writeback cache. If the line is invalid initially and the first request is atomic in nature, the L2 cache issues a request to main memory. On data return, the cache line transitions to M but doesn't wake up the other requests, resulting in a deadlock. This commit adds a wakeup call on data return for atomics and fixes potential deadlocks. Change-Id: I8200ce6e77da7c8b4db285c0cc8b8ca0dfa7d720	2024-06-11 20:33:46 -05:00
Bobby R. Bruce	7e45ec0ff0	stdlib: Fix m5.ext.pystats __init__.py Addresses Jason's complaint that wildcard imports should be avoided, in accordance with PEP 8: https://github.com/gem5/gem5/pull/996#discussion_r1621051601. Change-Id: I72266df43d3ec4ede3f45c3e34e2e05e1990bd6b	2024-06-11 16:26:24 -07:00
Bobby R. Bruce	8fc4d3f793	misc,tests: Update daily test artifact actions to v4.0.0 Change-Id: I711fa36639e925ce958e0484a31ee6a4dde87dbe	2024-06-11 15:43:40 -07:00
Matt Sinclair	8a44e97a10	gpu-compute: Added functions to choose replacement policies for GPU (#1213 ) Adding RP_choose functions to change replacement policies among TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement policies for TCC, TCP and SQC caches for GPU	2024-06-11 15:08:42 -05:00
Hoa Nguyen	d528a6bd2d	arch: Flag all ISAs Unknown instruction as IsInvalid Change-Id: I096138a157c4e2063c5f4f4324c21c1463dddb65 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-06-11 18:48:29 +00:00
Hoa Nguyen	369029d2be	cpu: Add IsInvalid flag to StaticInstFlags The IsInvalid flag indicates that the static instruction is not part of the executing ISA and not part of m5's pseudo-instructions. This flag provides a way to recognize an illegal instruction at the decode stage. Change-Id: I2779c6edcd8c5e6a77ea11cad3ff73bacb79d800 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-06-11 18:48:29 +00:00
Harry Chiang	d198380489	base: Fix uninitialized variable warning in symtab.test.cc (#1221 ) This warning appears when I add warning-related flags to LINKFLAGS and turn on LTO to build unit tests.	2024-06-11 10:53:00 -07:00
Jarvis Jia	4fea51b598	Black format change Change-Id: I95cbf5b97601ef3b6ca26bc1a1835305929ffcab	2024-06-10 22:52:56 -05:00
Jarvis Jia	8e268d42e2	gpu-compute: Provided m5ops support for gpu Adding m5 stat dump and reset into python script through different exit event Change-Id: I662233ae71e2987d90af1fd0100e29036b2ef1c6	2024-06-10 20:56:08 -05:00
Jarvis Jia	cf5e316a92	Change black format Change-Id: I3733b31baf187e0d3d38d971d9423a1b1afe2296	2024-06-10 16:33:18 -05:00
Jarvis Jia	3404369e68	gpu-compute: Added functions to choose replacement policies for GPU Adding RP_choose function to change replacement policies among TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement policies for TCC, TCP and SQC caches for GPU Change-Id: I86cc41cca19f8e0d24d8cf015e2e034a1fc4bc43	2024-06-10 16:24:20 -05:00
Jarvis Jia	ccdfe00998	gpu-compute: Added functions to choose replacement policies for GPU Adding RP_choose functions to change replacement policies among TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement policies for TCC, TCP and SQC caches for GPU Change-Id: If84a13babf1006ad41a557747c45d48ce2ce22a9	2024-06-10 16:22:41 -05:00
Jarvis Jia	3c8c783bc3	gpu-compute: Added functions to choose replacement policies for GPU Adding RP_choose functions to change replacement policies among TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement policies for TCC, TCP and SQC caches for GPU	2024-06-10 15:13:21 -05:00
Jarvis Jia	c158ce22bf	gpu-compute: Added functions to choose replacement policies for GPU Adding RP_choose function to change replacement policies among TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement policies for TCC, TCP and SQC caches for GPU	2024-06-10 15:11:17 -05:00
Jarvis Jia	7c410797d1	Adding functions to choose replacement policies for GPU Adding RP_choose functions to change replacement policies among TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement policies for TCC, TCP and SQC caches for GPU	2024-06-10 14:09:09 -05:00
Jarvis Jia	5b44eca64e	Adding functions to choose replacement policies for GPU Adding RP_choose functions to change replacement policies among TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement policies for TCC, TCP and SQC caches for GPU	2024-06-10 13:58:24 -05:00
Alexander Richardson	3cfc550fc0	arch-arm,mem: Don't hardcode secure mode accesses for semihosting (#1200 ) When accessing memory using functionalAccess(), the MMU could tell us to use a nonsecure access even though the CPU is operating in secure mode. I noticed this while trying to run a simple semihosting hello world with the MMU+caches enabled and the semihosting calls ended up reading from memory instead of the caches due to an S/NS mismatch. See also https://github.com/gem5/gem5/pull/1198 which happens to also mask the issue I saw, but I believe both changes are needed. Change-Id: I9e6b9839b194fbd41938e2225449c74701ea7fee	2024-06-09 14:08:54 -07:00
Saúl	5cfad84a98	arch-riscv: correctly set dynamic VLEN for all arith instructions (#1187 ) Some arithmetic instructions of the riscv vector extension were still using the default VLEN=256 instead of the dynamic one through the inherited `vlen` attribute. Most of them only use this to calculate the effective index for the mask element like so: ``` uint32_t ei = i + vtype_VLMAX(vtype, vlen, true) * this->microIdx; if (this->vm || elem_mask(v0, ei)) { ... ``` This means that instructions will wrongly compute the mask index in the second and subsequent micro instructions (`microIdx` > 0). This commit fixes this by adding the corresponding `set_vlen` snippet to the affected instruction formats. Change-Id: Ib041de972d6938490741a9fb4c214a6a5172c34e	2024-06-07 22:33:56 -07:00
Alexander Richardson	ec5881ec4e	arch-arm: avoid using an uninitialized variable in MMU walks (#1198 ) While running a simple Arm32 binary, I noticed that all memory transactions were being marked as NS instead of S once I turned on the MMU (even though the page tables have the NS bit set to zero). The result of this was that semihosting calls were failing since they were using functional accesses with the SECURE flag set, but the caches only contained NS tagged entries so these accesses always read stale values from DRAM. Digging through the Arm MMU code it appears that the NS bit lookup was being keyed off the `secureLookup` flag which is only used for long descriptors. I believe `0c28712f51` should have used isSecure instead of secureLookup. To avoid using these uninitialized values in the future I wrapped the LPAE state in a std::optional to ensure that it is only accessed once initialized. Change-Id: Ibc406ed3f4cfa768f470e34a5eca3c1a2bf45cd8	2024-06-07 08:59:28 +01:00
Alexander Richardson	8e5fbcbbbb	arch-generic: flush streams after semihosting write calls (#1202 ) The SYS_WRITEC and SYS_WRITE0 calls are specified as writing to the debug channel, so it is a reasonable expectation for these messages to be visible immediately after the semihosting call. Change-Id: I8e6e9a7aab593a59e82ecb9cf4603c18c7a8acbe	2024-06-06 09:57:36 +01:00
Alexander Richardson	abbb94af8b	dev-arm: Fix -Wdeprecated-copy warning (#1197 ) Clang warns as follows: `warning: definition of implicit copy constructor for 'TranslResult' is deprecated because it has a user-declared copy assignment operator` Change-Id: Ic701d8522aac75d569f4f513f54de91f76a17e48	2024-06-05 12:36:38 +01:00
Ivana Mitrovic	a764b9be1c	Revert "arch-x86: Fix TLB Assertion Error on CFLUSH" (#1196 ) Reverts gem5/gem5#1080 as it is not a good fix.	2024-06-04 10:26:53 -07:00
Hoa Nguyen	40ef8f3afb	dev: Remove an extra file in virtio (#1191 ) `src/dev/virtio/VirtIORng 2.py` is identical to `src/dev/virtio/VirtIORng.py`, and the former does not appear in any build script. Change-Id: I9c5f1b1a3809d1c7028b630c32310e540613e232 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-06-04 08:40:41 -07:00
dependabot[bot]	500bdc5302	misc: bump tqdm from 4.66.3 to 4.66.4 (#1192 ) Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.3 to 4.66.4. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-06-04 06:35:35 -07:00
dependabot[bot]	8c98dcb7cf	misc: bump pre-commit from 3.7.0 to 3.7.1 (#1193 ) Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.7.0 to 3.7.1. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-06-04 06:34:53 -07:00
Lukas Zenick	dad5c7b6f7	arch-x86: Fix TLB Assertion Error on CFLUSH (#1080 ) Fixed the assertion statement in the CPU's translation.hh file so that it doesn't fail the assertion if the cache is clean. I compile this C code to `test` ```c #include <stdio.h> static inline void clflush(volatile void *p) { __asm__ volatile ("clflush (%0)" : : "r"(p) : "memory"); } int main() { int data = 42; // Example variable printf("Value before clflush: %d\n", data); clflush(&data); printf("Value after clflush: %d\n", data); return 0; } ``` And run it with this script `./build/X86/gem5.opt configs/learning_gem5/part1/two_level.py ./test` in order to verify that it no longer fails the assertion check. GitHub Issue: #862 Change-Id: I6004662e7c99f637ba0ddb07d205d1657708e99f	2024-06-03 10:17:10 -07:00

21722 Commits
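
The mask-index problem described in commit 5cfad84a98 (arch-riscv, #1187) can be illustrated with a small sketch. This is not gem5 code: `elems_per_reg` and `effective_index` are hypothetical helpers, and the sketch assumes the per-register element count is simply VLEN / SEW (that is, LMUL >= 1). The configured VLEN of 128 and SEW of 32 are example values chosen for illustration only.

```python
# Hypothetical model of the effective mask index from the commit message:
#   ei = i + (elements per register) * microIdx
# Assumption: elements per register = VLEN / SEW (LMUL >= 1).

DEFAULT_VLEN = 256  # default VLEN named in the commit message


def elems_per_reg(vlen_bits: int, sew_bits: int) -> int:
    """Number of elements held by one vector register."""
    return vlen_bits // sew_bits


def effective_index(i: int, micro_idx: int, vlen_bits: int, sew_bits: int) -> int:
    """Index of the mask element for element i of micro-op micro_idx."""
    return i + elems_per_reg(vlen_bits, sew_bits) * micro_idx


configured_vlen = 128  # example dynamic VLEN (assumption)
sew = 32               # example element width in bits (assumption)

# The first micro-op is unaffected: the VLEN term is multiplied by zero.
first_stale = effective_index(3, 0, DEFAULT_VLEN, sew)
first_correct = effective_index(3, 0, configured_vlen, sew)

# From the second micro-op on, a stale VLEN selects the wrong mask bit.
second_stale = effective_index(0, 1, DEFAULT_VLEN, sew)
second_correct = effective_index(0, 1, configured_vlen, sew)

print(first_stale, first_correct, second_stale, second_correct)
```

With these example values the two computations agree for `microIdx` 0 and diverge for `microIdx` 1, which matches the commit's statement that only the second and subsequent micro instructions are affected.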