derek/gem5

Author	SHA1	Message	Date
Jason Lowe-Power	21ffd91529	cpu,arch: Add IsInvalid flag to Unknown insts (#1071 ) The IsInvalid flag indicates that the static instruction is not part of the executing ISA and not part of m5's pseudo-instructions. This flag provides a way to recognize an illegal instruction at the decode stage.	2024-06-13 16:26:35 -07:00
Matthew Poremba	be0a7937c1	mem-ruby: Fix deadlock in GPU_VIPER when issuing atomic requests (#1216 ) When a compute unit issues several requests to the same line, the requests wait in the L2 if it is a writeback cache. If the line is invalid initially and the first request is atomic in nature, the L2 cache issues a request to main memory. On data return, the cache line transitions to M but doesn't wake up the other requests, resulting in a deadlock. This commit adds a wakeup call on data return for atomics and fixes potential deadlocks.	2024-06-12 10:10:32 -07:00
Harshil Patel	74afea471d	cpu: Revert "Don't change to suspend if the thread status is halted" (#1225 ) Reverts gem5/gem5#1039	2024-06-12 00:20:06 -07:00
Bobby R. Bruce	f9abf6bb08	stdlib: Improve gem5 PyStats (#996) This PR incorporates numerous improvements and fixes to the gem5 PyStats. This includes: * PyStats now support SimObject Vectors. The PyStats representing them are subscriptable and therefore accessible by index, e.g., `simobjectvec[0]`. (This replaces the `Vector` group PyStat) * Adds the `SparseHist` PyStats. * Adds the `Vector2d` to PyStats. * The `Distribution` PyStats is fixed to be a vector of Scalars. * Tests added for the PyStat's Vector and bugs fixed.	2024-06-12 00:19:08 -07:00
Vishnu Ramadas	42b9a9666e	mem-ruby: Add instSeqNum to atomic responses from GPU L2 caches This commit adds instSeqNum to the atomic responses in GPU_VIPER-TCC.sm. This will be useful when debugging issues related to GPU atomic transactions Change-Id: Ic05c8e1a1cb230abfca2759b51e5603304aadaa3	2024-06-11 20:35:43 -05:00
Vishnu Ramadas	943d1f1453	mem-ruby: Fix deadlock in GPU_VIPER when issuing atomic requests When a compute unit issues several requests to the same line, the requests wait in the L2 if it is a writeback cache. If the line is invalid initially and the first request is atomic in nature, the L2 cache issues a request to main memory. On data return, the cache line transitions to M but doesn't wake up the other requests, resulting in a deadlock. This commit adds a wakeup call on data return for atomics and fixes potential deadlocks. Change-Id: I8200ce6e77da7c8b4db285c0cc8b8ca0dfa7d720	2024-06-11 20:33:46 -05:00
Bobby R. Bruce	7e45ec0ff0	stdlib: Fix m5.ext.pystats __init__.py Addresses Jason's complaint that wildcard imports should be avoided, in accordance with PEP 8: https://github.com/gem5/gem5/pull/996#discussion_r1621051601. Change-Id: I72266df43d3ec4ede3f45c3e34e2e05e1990bd6b	2024-06-11 16:26:24 -07:00
Hoa Nguyen	d528a6bd2d	arch: Flag all ISAs Unknown instruction as IsInvalid Change-Id: I096138a157c4e2063c5f4f4324c21c1463dddb65 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-06-11 18:48:29 +00:00
Hoa Nguyen	369029d2be	cpu: Add IsInvalid flag to StaticInstFlags The IsInvalid flag indicates that the static instruction is not part of the executing ISA and not part of m5's pseudo-instructions. This flag provides a way to recognize an illegal instruction at the decode stage. Change-Id: I2779c6edcd8c5e6a77ea11cad3ff73bacb79d800 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-06-11 18:48:29 +00:00
Harry Chiang	d198380489	base: Fix uninitialized variable warning in symtab.test.cc (#1221) This warning appears when I add warning-related flags to LINKFLAGS and turn on LTO to build unit tests.	2024-06-11 10:53:00 -07:00
Alexander Richardson	3cfc550fc0	arch-arm,mem: Don't hardcode secure mode accesses for semihosting (#1200 ) When accessing memory using functionalAccess(), the MMU could tell us to use a nonsecure access even though the CPU is operating in secure mode. I noticed this while trying to run a simple semihosting hello world with the MMU+caches enabled and the semihosting calls ended up reading from memory instead of the caches due to an S/NS mismatch. See also https://github.com/gem5/gem5/pull/1198 which happens to also mask the issue I saw, but I believe both changes are needed. Change-Id: I9e6b9839b194fbd41938e2225449c74701ea7fee	2024-06-09 14:08:54 -07:00
Saúl	5cfad84a98	arch-riscv: correctly set dynamic VLEN for all arith instructions (#1187) Some arithmetic instructions of the riscv vector extension were still using the default VLEN=256 instead of the dynamic one through the inherited `vlen` attribute. Most of them only use this to calculate the effective index for the mask element like so: ``` uint32_t ei = i + vtype_VLMAX(vtype, vlen, true) * this->microIdx; if (this->vm || elem_mask(v0, ei)) { ... ``` This means that instructions will wrongly compute the mask index in the second and subsequent micro instructions (`microIdx` > 0). This commit fixes this by adding the corresponding `set_vlen` snippet to the affected instruction formats. Change-Id: Ib041de972d6938490741a9fb4c214a6a5172c34e	2024-06-07 22:33:56 -07:00
Alexander Richardson	ec5881ec4e	arch-arm: avoid using an uninitialized variable in MMU walks (#1198) While running a simple Arm32 binary, I noticed that all memory transactions were being marked as NS instead of S once I turned on the MMU (even though the page tables have the NS bit set to zero). The result of this was that semihosting calls were failing since they were using functional accesses with the SECURE flag set, but the caches only contained NS tagged entries so these accesses always read stale values from DRAM. Digging through the Arm MMU code it appears that the NS bit lookup was being keyed off the `secureLookup` flag which is only used for long descriptors. I believe `0c28712f51` should have used isSecure instead of secureLookup. To avoid using these uninitialized values in the future I wrapped the LPAE state in a std::optional to ensure that it is only accessed once initialized. Change-Id: Ibc406ed3f4cfa768f470e34a5eca3c1a2bf45cd8	2024-06-07 08:59:28 +01:00
Alexander Richardson	8e5fbcbbbb	arch-generic: flush streams after semihosting write calls (#1202) The SYS_WRITEC and SYS_WRITE0 calls are specified as writing to the debug channel, so it is a reasonable expectation for these messages to be visible immediately after the semihosting call. Change-Id: I8e6e9a7aab593a59e82ecb9cf4603c18c7a8acbe	2024-06-06 09:57:36 +01:00
Alexander Richardson	abbb94af8b	dev-arm: Fix -Wdeprecated-copy warning (#1197 ) Clang warns as follows: `warning: definition of implicit copy constructor for 'TranslResult' is deprecated because it has a user-declared copy assignment operator` Change-Id: Ic701d8522aac75d569f4f513f54de91f76a17e48	2024-06-05 12:36:38 +01:00
Ivana Mitrovic	a764b9be1c	Revert "arch-x86: Fix TLB Assertion Error on CFLUSH" (#1196 ) Reverts gem5/gem5#1080 as it is not a good fix.	2024-06-04 10:26:53 -07:00
Hoa Nguyen	40ef8f3afb	dev: Remove an extra file in virtio (#1191 ) `src/dev/virtio/VirtIORng 2.py` is identical to `src/dev/virtio/VirtIORng.py`, and the former does not appear in any build script. Change-Id: I9c5f1b1a3809d1c7028b630c32310e540613e232 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-06-04 08:40:41 -07:00
Lukas Zenick	dad5c7b6f7	arch-x86: Fix TLB Assertion Error on CFLUSH (#1080 ) Fixed the assertion statement in the cpu's translation.hh file so that it doesn't fail the assertion if the cache is clean. I compile this c code to `test` ```c #include <stdio.h> static inline void clflush(volatile void *p) { __asm__ volatile ("clflush (%0)" : : "r"(p) : "memory"); } int main() { int data = 42; // Example variable printf("Value before clflush: %d\n", data); clflush(&data); printf("Value after clflush: %d\n", data); return 0; } ``` And run it with this script `./build/X86/gem5.opt configs/learning_gem5/part1/two_level.py ./test` In order to verify that it no longer fails the assertion check. GitHub Issue: #862 Change-Id: I6004662e7c99f637ba0ddb07d205d1657708e99f	2024-06-03 10:17:10 -07:00
Yu-Cheng Chang	5d3f1c3316	arch-riscv: Add rvZext to BranchTarget (#1173 ) Ensure the upper xlen bits are all zeros Change-Id: Id81330eced907d21320bc1af85ad38fb6e95f6b1	2024-06-03 10:03:51 -07:00
Matthew Poremba	00dcd5b0bc	arch-vega: Implement literals for 64b dest operands This feature has been available since Vega10 but was never implemented. MI300 adds a few new instructions that make use of this more often (e.g., v_mov_b64). Change-Id: Ieeb7834462b76d77c0030f49622d0de09f90c9e4	2024-05-31 13:41:46 -07:00
Matthew Poremba	6c8caf83c6	arch-vega: Implement V_ACCVGPR_MOV_B32 instruction This instruction is a simple move from accumulation register to accumulation register. It is essentially a move with the accumulation offset added to the register index. Change-Id: Ic93ae72599b75c91213f56ebafe5bbd7b2867089	2024-05-31 09:32:35 -07:00
Matthew Poremba	7cdb69bf21	arch-vega: Fill in scratch insts to match flat/global Flat, scratch, and global share the same instruction implementation with different address calculations essentially. These instructions were already implemented but not added to the decoder. This commit adds the remaining scratch instructions which have a shared instruction implementation. Change-Id: I8f2e9ceb221294dce1b81c45745b642f0592d985	2024-05-31 09:32:34 -07:00
Bobby R. Bruce	a0de33110b	arch-vega: Fix clang comp error due to constant exp (#1183) The lines `constexpr int B_I = std::ceil(64.0f / (N * M / H));` caused the following compilation error in clang Version 16: ``` error: constexpr variable 'B_I' must be initialized by a constant expression ``` `std::ceil` is not a const expression. Therefore instances of this expression in instructions.hh have been replaced with a constant expression friendly alternative. This is causing our compiler tests to fail: https://github.com/gem5/gem5/actions/runs/9288296434/job/25559409142 Change-Id: I74da1dab08b335c59bdddef6581746a94107f370	2024-05-30 09:44:34 -07:00
NSurawar	efbfdeabd7	mem-ruby: Reduce handshaking between CorePair and dir (#1117) Currently when data is downgraded by MOESI_AMD_Base-CorePair (e.g. due to a replacement) this requires a 4-way handshake between the CorePair and the dir. Specifically, the CorePair sends a message telling the dir it'd like to downgrade, then the dir sends an ACK back, then the CorePair writes the data back, and finally the dir ACKs the writeback. This is very inefficient and not representative of how modern protocols downgrade a request. Accordingly, this commit updates the downgrade support such that the CorePair writes back the data immediately and then the dir ACKs it. Thus, this approach requires only a 2-way handshake. Change-Id: I7ebc85bb03e8ce46a8847e3240fc170120e9fcd6 Co-authored-by: Neeraj Surawar <neerajs@hyrule.cs.wisc.edu>	2024-05-30 09:36:29 -07:00
Bobby R. Bruce	c0a64c4862	stdlib: Move SimStat specific variable sets out of loop Change-Id: I6e1f4c01a52ae904e9a6c6582b5b413f94c1cb05	2024-05-30 03:03:29 -07:00
Bobby R. Bruce	7f0290985f	stdlib,tests: Add Pyunit tests to check Pyunit nav, fix bugs Bugs fixed of note: 1. The 'find' method has been fixed to work. This involved making 'children' a class implemented per-subclass as required. 2. The 'get_all_stats_of_name' method has been removed. This was not working at all correctly and is largely doing what 'find' does. 3. The functionality to get an element in a vector via an attribute call (i.e., self.vector1 == self.vector[1]) has been implemented, thus maintaining backwards compatibility with the regular Python stats. Change-Id: I31a4ccc723937018a3038dcdf491c82629ddbbb2	2024-05-30 03:02:06 -07:00
Bobby R. Bruce	2d4a213046	stdlib: Make PyStat SimStat inherit from Group The SimStat Object is nothing more than a group of other SimStats and is therefore logically a group. With this, functionality can be shared more easily. Change-Id: I5dce23a02d5871e640b422654ca063e590b1429a	2024-05-30 02:56:13 -07:00
ylldummy	7fa0342a7c	mem-cache: Fix maybe-uninitialized warning (#1179) When the compiler tries to inline a vector construction that uses a default-constructed ReplaceableEntry as the default value, it can complain about the uninitialized member. Let's provide basic initialization to the members. Example codepath: SignaturePathV2 constructor -> GlobalHistoryEntry() as init_value to AssociativeSet -> AssociativeSet initialize vector<Entry> with init_value	2024-05-29 10:41:35 -07:00
Bobby R. Bruce	6d174c43e4	stdlib: Expand and simplify PyStats __init__.py 1. Adds newly added PyStat classes to "__init__.py", ensuring they can all be accessed via a `m5.ext.pystats` import. 2. Simplifies the layout of "__init__.py" to just import all classes from all files. Change-Id: I43bfc5e7ff1aec837e661905304c6fb10b00c90e	2024-05-29 08:22:49 -07:00
Bobby R. Bruce	b161172f65	arch-arm: Fix memory attributes of table walks (#1180 ) This PR is doing the following: 1) Fixing memory attributes of partial translation entries (table walks) 2) Properly setting the cacheability of table walks	2024-05-29 08:07:44 -07:00
Nicholas Mosier	9027d5c3e2	arch-x86: set AF=0 when logical instructions execute (#1171 ) Fix #1168. Prevent logical instructions like AND, OR, and TEST from having input dependencies on the previous value of the Zaps register (ZF+AF+PF+SF) by having them set AF=0, rather than not modifying AF.	2024-05-29 08:04:44 -07:00
Nicholas Mosier	a54d3198a8	arch-x86: break 32/64-bit mov's input dependency on prior dest value (#1172 ) Fix #1169. Break the input dependency of 32-bit and 64-bit 'mov' micro-ops on the prior value in the destination register. Such a dependency is required for 8-bit and 16-bit moves, as they do not completely overwrite the value in the destination register. However, it is unnecessary for 32-bit moves (which implicitly zero the upper 32 bits) and 64-bit moves. This patch implements the fix by adding a new code template field inside the generated constructors of X86StaticInst's, called `invalidate_srcs`, which instruction implementations like `mov` can use to conditionally invalidate particular source registers as needed. In `mov`'s case, this is when the data size is 32 or 64 bits. Change-Id: Ib2aef6be6da08752640ea3414b90efb7965be924	2024-05-29 07:54:03 -07:00
Matthew Poremba	07f6b7c59c	dev-amdgpu: Fix pending PCI RLC doorbell (#1157 ) SDMA RLC queues do not currently remove their doorbell mapping. This can cause issues re-registering the queue and prevents the pending doorbells feature from working. In addition the data value of the doorbell (the ring buffer rptr) is not saved, leading to UB when this workaround is used. This commit removes the doorbell mapping from the gpu device when the SDMA engine unmaps an RLC queue and copies the next doorbell value to the pending packet as was originally intended. Change-Id: Ifd551450f439c065579afcf916f8ff192e7598ab	2024-05-29 07:15:46 -07:00
Giacomo Travaglini	c4ed23a10b	arch-arm: Implement HCR_EL2 force broadcast for EL1&0 TLBIs (#1175 ) According to the Arm architecture reference manual, it is possible to force the broadcast of the following TLBIs: AArch64: TLBI VMALLE1, TLBI VAE1, TLBI ASIDE1, TLBI VAAE1, TLBI VALE1, TLBI VAALE1, IC IALLU, TLBI RVAE1, TLBI RVAAE1, TLBI RVALE1, and TLBI RVAALE1. AArch32: BPIALL, TLBIALL, TLBIMVA, TLBIASID, DTLBIALL, DTLBIMVA, DTLBIASID, ITLBIALL, ITLBIMVA, ITLBIASID, TLBIMVAA, ICIALLU, TLBIMVAL, and TLBIMVAAL. Via the HCR_EL2.FB bit Change-Id: Ib11aa05cd202fadfbd9221db7a2043051196ecbd Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-05-29 11:54:24 +01:00
Giacomo Travaglini	e9dcb906b4	arch-arm: Set memory attributes for partial table entries Change-Id: I80adcead410f226c323e4d781adb1ff17a386986 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-05-29 09:30:58 +01:00
Giacomo Travaglini	09f0c20be2	arch-arm: Use HCR_EL2.CD for stage2 table walks When determining the cacheability of table walks, SCTLR.C should only be used in stage1 EL1&0 translations. Stage2 translations should rely on HCR_EL2.CD instead Change-Id: I1b0830bc3fb5086f68d7a7a1560c7fed5d126d28 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-05-29 09:30:58 +01:00
Giacomo Travaglini	854662f48f	arch-arm: Check OSH domain as well for cacheability attribute Make table walks uncacheable if marked as uncacheable in either inner or outer shareable domain Change-Id: I5898a3b91b5b919e0beda6c6fe896394e3ab94df Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-05-29 09:30:58 +01:00
Matthew Poremba	e82cf20150	mem-ruby: Remove VIPER StoreThrough temp cache storage (#1156) StoreThrough in VIPER when the TCP is disabled, GLC bit is set, or SLC bit is set will bypass the TCP, but will temporarily allocate a cache entry seemingly to handle write coalescing with valid blocks. It does not attempt to evict a block if the set is full and the address is invalid. This causes a panic if the set is full as there is no spare cache entry to use temporarily for DataBlk manipulation. However, a cache block is not required for this. This commit removes using a cache block for StoreThrough with invalid blocks as there is no existing data to coalesce with. It creates no-allocate variants of the actions needed in StoreThrough and pulls the DataBlk information from the in_msg instead. Non-invalid blocks do not have this panic as they have a cache entry already. Fixes issues with StoreThroughs on more aggressive architectures like MI300. Change-Id: Id8687eccb991e967bb5292068cbe7686e0930d7d	2024-05-28 11:02:00 -07:00
Ivana Mitrovic	5ec1acaf5f	arch-arm: TLBIs targeting EL2 regime are executable from S state (#1176 ) Those AArch64 instructions/registers were labelled as executable from EL3 only if SCR_EL3.NS == 1. This is not valid anymore after the introduction of FEAT_SEL2	2024-05-28 10:54:18 -07:00
Matthew Poremba	1dfaa224ff	arch-vega: Fix GCC 13 build errors (#1162 ) The new static analysis in GCC 13 finds issues with operand.hh. This commit fixes the error so that gem5 compiles when BUILD_GPU is true. Change-Id: I6f4b0d350f0cabb6e356de20a46e1ca65fd0da55	2024-05-28 07:58:28 -07:00
Giacomo Travaglini	27c7647fee	arch-arm: Use monWrite a shorter version Change-Id: I8da8a39238eb100315d3df496f55a6bf3da948c6 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-05-28 11:20:52 +01:00
Giacomo Travaglini	6995a99d77	arch-arm: TLBIs targeting EL2 regime are executable from S state Those AArch64 instructions/registers were labelled as executable from EL3 only if SCR_EL3.NS == 1. This is not valid anymore after the introduction of FEAT_SEL2 Change-Id: Ie7b56f3fe779c3a99d4f0ef937c7c8ec0530b00e Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-05-28 11:20:32 +01:00
Giacomo Travaglini	10dbfb8bb7	arch-arm: Rewrite performTlbi to use map instead of switch (#1166) This is making it easier for TLBI instructions to share code. Common code (in the form of tlbi* functions) closely matches the instruction description in the Arm pseudocode Change-Id: If10c22fb4a7df2bcd0335e9761286ad3c458722b Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-05-28 11:03:07 +01:00
Bobby R. Bruce	8f0ed46061	stdlib: Move `_m5.stats.processDumpQueue` to call-once This commit addresses Jason's comment (https://github.com/gem5/gem5/pull/996#discussion_r1613870880) which highlighted that putting the `_m5.stats.processDumpQueue` call in the iteration through the `root` object in `get_simstat` caused this function to be potentially called many times when it only needs to be called once. This change moves the call to just before this iteration, so it will therefore only be called once (if required) per `get_simstat` execution. Change-Id: I16908b6dee063a0df7877a19e215883963bfb081	2024-05-27 08:35:21 -07:00
Yu-Cheng Chang	4f6fdbf8bf	arch-riscv: Fix c.jalr and c.jr instruction (#1163) Bit 0 of the register value should be 0 in the jump address. Mishandling the jump address may cause an infinite run or a segmentation fault. gem5 issue: https://github.com/gem5/gem5/issues/981	2024-05-25 20:18:42 -07:00
Bobby R. Bruce	0f6bd24c95	stdlib: Fix get_simstat to accept lists of SimObjects Change-Id: Iae12a0ac88f9646acb00e73d70f83b1e2ff94ac9	2024-05-23 14:54:59 -07:00
Bobby R. Bruce	c0a1fa33fe	stdlib: Improve PyStat support for SimObject Vectors Change-Id: Iba0c93ffa5c4b18acf75af82965c63a8881df189	2024-05-23 14:54:59 -07:00
Bobby R. Bruce	178679cbfd	stdlib: Add SparseHist to PyStats This is inclusive of tests to ensure they have been implemented correctly. Change-Id: I5c84d5ffdb7b914936cfd86ca012a7b141eeaf42	2024-05-23 14:54:59 -07:00
Bobby R. Bruce	b5e8804cd4	stdlib: Remove 'Vector' group subclass This was not used and easily confused with the other 'Vector' in PyStats. Change-Id: I9294bb0ae04db0537c87a5f50ce023fc83d587b8	2024-05-23 14:54:59 -07:00
Bobby R. Bruce	6ae3692057	stdlib: Add Vector2d to PyStats Change-Id: Icb2f691abf88ef4bac8d277e421329edb000209b	2024-05-23 14:54:59 -07:00
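
Note on a0de33110b (arch-vega constant expression fix): the commit message does not show the replacement it used. One common constant-expression-friendly way to round a quotient up is integer ceiling division. The sketch below assumes `N`, `M`, and `H` are positive integers and that `N * M / H` is itself an integer division, as the quoted expression suggests. `ceilDiv` and `Shape` are hypothetical names for illustration, not gem5 identifiers.

```cpp
// Hypothetical helper (not taken from gem5): ceiling of a / b for
// positive integers. Unlike std::ceil, this is usable in a constant
// expression on compilers that do not treat std::ceil as constexpr.
constexpr int ceilDiv(int a, int b)
{
    return (a + b - 1) / b;
}

// Illustrative wrapper; the parameter names only mirror the commit message.
template <int M, int N, int H>
struct Shape
{
    // Same value as ceil(64.0f / (N * M / H)) when N * M / H is a
    // positive integer.
    static constexpr int B_I = ceilDiv(64, N * M / H);
};

// Compile-time checks of the rounding behaviour.
static_assert(ceilDiv(64, 16) == 4, "exact division stays exact");
static_assert(ceilDiv(64, 24) == 3, "inexact division rounds up");
static_assert(Shape<4, 4, 1>::B_I == 4, "64 / (4 * 4 / 1) == 4");
```

Because the checks are `static_assert`s, a compiler that accepts the translation unit has already verified them.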

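Note on 4f6fdbf8bf (arch-riscv c.jalr / c.jr fix): in RISC-V, a register-indirect jump clears the least-significant bit of the computed target address. The sketch below shows that masking step in isolation. `jumpTarget` is a hypothetical helper for illustration and is not the gem5 implementation.

```cpp
#include <cstdint>

// Hypothetical helper (not taken from gem5): form the jump target of a
// register-indirect jump by clearing bit 0 of the source register value.
constexpr std::uint64_t jumpTarget(std::uint64_t rs1)
{
    return rs1 & ~std::uint64_t{1};
}

// An odd register value is rounded down to the preceding even address.
static_assert(jumpTarget(0x1001) == 0x1000, "bit 0 is cleared");
// An already even value is left unchanged.
static_assert(jumpTarget(0x2000) == 0x2000, "even targets are unchanged");
```

Skipping the mask would leave an odd program counter, which matches the kind of misbehaviour the commit message describes.
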
15178 Commits