derek/gem5

Author	SHA1	Message	Date
Bobby R. Bruce	646df63e56	misc: Fix typos in util/dockerfiles/README.md Change-Id: I5488301543bfff21279b6c0b1aae841574efee95 Co-authored-by: Harshil Patel <harshilp2107@gmail.com>	2024-08-15 10:45:53 -07:00
Bobby R. Bruce	0c26ee5f71	util-docker: Replace gem5 v24.0 clone with wget This is more efficient. Change-Id: Idd57343183a8667425dbc036ad0c7c18581898f5	2024-08-14 14:08:44 -07:00
Bobby R. Bruce	dcb04a72fc	util-docker,tests: Remove Ubuntu 20.04 Docker Change-Id: I1d4bbebaa4b6f064b5f40a95d066bbf092cf103f	2024-08-13 16:15:49 -07:00
Bobby R. Bruce	9f93c8ac9c	util-docker: Revert docker image tag to 'latest' Change-Id: Iafe92716725e6b3cecfeba57098c3a7efaf73d97	2024-08-13 16:13:33 -07:00
Bobby R. Bruce	59455daa85	util-docker: Fix correct common platform comment Change-Id: Ifc703b47b1e59522ba01f4c2b59a4863779eefb1	2024-08-13 16:12:45 -07:00
Bobby R. Bruce	8b61490df1	util-docker: Update dockerfiles README Change-Id: I39bca04b3770bd51203944d69d0fbecff85055f8	2024-08-13 16:09:08 -07:00
Bobby R. Bruce	bef452ce72	misc,tests: Update supported GCC and Clang compilers - GCC: v10 to v14 - Clang: v14 to v18 Change-Id: I6cd1686ffff0f08686a231b6b4936da343d53831	2024-08-13 16:09:06 -07:00
Bobby R. Bruce	b68c2ef37f	util-docker: Add vim to 24.04-all-deps Ubuntu Docker Change-Id: I898a0fddcdcf8a876fcbbe11795e858395ad9740	2024-08-13 16:08:05 -07:00
Bobby R. Bruce	3875dcdfd7	util-docker: Update the sst Dockerfile 1. Builds on top of the Ubuntu 24.04 all-deps image. 2. Unify the download, build, install, and cleanup steps. Change-Id: I4c2bf8e571dfd228f7df8372cda0f428de59af51	2024-08-13 16:08:05 -07:00
Bobby R. Bruce	2c0c933a3a	util-docker: Cleanup the systemc docker 1. Uses the ubuntu-24.04_all-deps as the base image. 2. Unifies the build and cleanup into a single step, thus reducing the size of the image. Change-Id: I63b5dad2af0e8b1f6be8ad1f28321c743f36b2dc	2024-08-13 16:08:05 -07:00
Bobby R. Bruce	58aad68329	util-docker: Order targets in docker-bake This improves readability. The targets order matches that in the default group. Change-Id: I1102aeb48bc256df9b58032a327ec663e5733a98	2024-08-13 16:08:03 -07:00
Bobby R. Bruce	9978b4ea4c	util-docker: Add 'devcontainer' to default bake group Change-Id: I4b245cabd6e384cab780bd22b0f8b40d9819b92b	2024-08-13 16:07:22 -07:00
Bobby R. Bruce	4956c475f4	util-docker: Set the GPU Docker images to build only to x86 These images won't work and make no sense compiling to any platform other than X86. These are used in SE mode simulations where the host platform matters. Change-Id: I47405e930bf511fabcbc93d0b08ee2fb2c556869	2024-08-13 16:07:22 -07:00
Bobby R. Bruce	c1a562083d	util-docker: Set 'pull' to 'always' This ensures a `docker pull` command is always run before building. Change-Id: If1a66b9b426d5843459e0308a64f13a11c0c6ed2	2024-08-13 16:07:21 -07:00
Bobby R. Bruce	9d635dea55	util-docker: Improve Docker gcc and clang builder 1. Uses the all-dependencies image as the base image. 2. Has all compilers use Ubuntu 24.04. Notes: This change implicitly changes our supported compilers to GCC v10 to v13 and Clang v14 to v18. This will be fully incorporated into the project later. Change-Id: Id8e2141ea64a34c7e3532605f6ecb7d9ccb76951	2024-08-13 16:07:15 -07:00
Bobby R. Bruce	a1eefb6ed8	util-docker: Remove old/unused/unnecessary Dockerfiles * Unsupported compilers. * Unused cross compilers. * The gem5-all-min-deps image. Change-Id: Iaab64e5e6685b0a538c38b2979fae86f01bc53e8	2024-08-13 16:05:47 -07:00
Bobby R. Bruce	e03a20bdb4	util-docker: Remove version from systemc Docker context This simplifies things slightly. Change-Id: I1263e385f7adeb2b83cdc09f7f6903be9193c467	2024-08-13 16:05:47 -07:00
Bobby R. Bruce	c291678881	util-docker: Fix docker-bake.hcl sst context The context for sst is the "sst" directory. Change-Id: Ic120cca13a9e4df02b98d101ad8e16c296807c2d	2024-08-13 16:05:47 -07:00
Bobby R. Bruce	e82e824f08	util-docker: Breakup long (>79 char) lines in docker-bake Change-Id: I5488301543bfff21279b6c0b1aae841574efee95	2024-08-13 16:05:46 -07:00
Bobby R. Bruce	3640559a12	misc,tests: Fix compiler tests (add missing `,`) (#1459 )	2024-08-13 06:54:12 -07:00
Alexander Richardson	f6f547fb62	arch-arm: Fix incorrect behaviour of VFNMS and VFNMA (#1420 ) This was found while comparing a diverging execution against QEMU traces and checking for the first mismatched program counter. Fortunately this was caused by a branch shortly after this incorrect computation but still took a long time to track down. There are two issues here: the decoder had inverted the cases for S and A, and the sign bit was wrong for VFN*.	2024-08-13 09:05:52 +01:00
Matthew Poremba	c359b53a19	arch-vega: Update microscaling format scaling and denorm handling (#1451 ) This PR has 3 commits: - Update scaling methods to scale by multiplication or division when upcasting or downcasting respectively. - Preserve the sign when a microscaling conversion results in NaN or infinity to match hardware. - Rework rounding to handle cases where conversion results in a denormal number in the output type so that the value is correct.	2024-08-12 07:00:26 -07:00
Matthew Poremba	7d46c50663	arch-vega: Swizzle multi-dword scratch requests (#1445 ) Scratch memory requests that are larger than one dword are using a different memory layout than global instructions. Rather than being placed contiguously, each dword is interleaved 64 lanes * 4 bytes away as described in Section 9.1.5.2. "Swizzled Buffer Addressing" in the MI300 specification. This was verified by comparing MI300 output (which uses scratch_ instructions) with MI200 (which uses buffer instructions). MI300 FashionMNIST bs=1 now matches CPU reference. This requires several changes to the instruction implementations: - For stores, data in the GPUDynInst can be swizzled before the data is written to memory. This is easy to do using a helper method. This is done in the template<int N> variant of initMemWrite. To use this x2 stores are changed to use template<int N> rather than loading a U64. The swizzle function is renamed to swizzleAddr to avoid confusion with swizzleData. - For loads, data is unswizzled in completeAcc when writing register values. This is not as easy to implement as a helper and is thus implemented for the three load instructions that load more than one dword. - Accessing swizzled data requires at least one packet per dword. A new GPU memory helper is added to create these packets for scratch requests specifically. This is called in the template<int N> variant of initMemRead / initMemWrite. Loads and stores of x2 are changed to use this variant instead of accessing a U64. The GPUDynInst status vector restrictions are increased to allow for swizzled x4 accesses. For simplicity this does not currently support misaligned swizzled accesses and will panic upon seeing such a case. Change-Id: Ic686c51e28e0af029a043d5a5b3d4069f2cb94f9	2024-08-12 06:58:48 -07:00
Matthew Poremba	62a2c09d4b	arch-vega: Rework rounding for microscaling conversions The current implementation does not correctly convert subnormal numbers (numbers that fill the underflow gap around zero in floating-point arithmetic). This commit reworks the rounding code to get correct results. First, the min_exp is set to 0 which allows for numbers to become subnormal when rounding. Second, the rounding code now uses something closer to "GRS" rounding (guard, round, sticky), which represents the first bit removed when rounding to a smaller type, the second bit removed, and whether any of the other bits removed are one. More details can be found in the code comments. Change-Id: Idcd2f1e4383e4012fc3abf73b1f73c847d44f67b	2024-08-10 10:23:07 -07:00
Matthew Poremba	bdba981753	arch-vega: Preserve sign of NaN/Inf for microscaling types The implementation of microscaling formats uses the Open Compute Project specification which includes a sign bit for NaN and infinity. This should be preserved when a conversion results in NaN or infinity. Change-Id: Id9e99324c6486e256c699016aff301d5f06814d5	2024-08-10 10:23:07 -07:00
Matthew Poremba	c1251f51c1	arch-vega: Introduce two scaling methods for microscaling types Currently there is only a scale() method which multiplies a microscaling type by an int8 value. This should only be applied when upcasting to a larger type after conversion to match hardware. When downcasting to a smaller type, the scaling method should divide by the int8 value before conversion. This commit adds both scaling methods. Change-Id: Ibafa8caa389cde4df609e536cd53bd2289959420	2024-08-10 10:23:07 -07:00
Robert Hauser	e980780efd	arch-riscv: Extend wfi behavior (#1364 ) At the moment, a hart does not halt if there are pending interrupts. However, an implementation can also consider the enable status of the individual interrupts, i.e., a halted hart would only resume if there are locally enabled pending interrupts. This commit introduces this behavior. The wfi behavior is controlled by the new configuration variable wfi_pending_resume of RiscvISA. Change-Id: I316239f9732c6e73e6ad692491bca08d773dd995 --------- Signed-off-by: Robert Hauser <robert.hauser@uni-rostock.de>	2024-08-09 11:28:15 -07:00
Marleson Graf	b8001a861b	mem-ruby,sim-se: Clear LL/SC locks after functional writes (#1404 ) Functional writes atomically update all copies of a data block, so they should invalidate any pending LL/SC locks, just like a conventional write would. Change-Id: Ic79d2d8d24901f1b6a2ce81dc0e2decc84c0ebbc	2024-08-09 09:30:37 -07:00
Bobby R. Bruce	8593f69f0a	util: Fix MongoDB script requirements.txt (#1426 ) Dependency Bot appears to have had difficulty with this file: https://github.com/gem5/gem5/security/dependabot/29 This PR: 1. Removes the weird "```" which could not be parsed. 2. Ups PyMongo to a more secure version.	2024-08-08 13:01:29 -07:00
MMysore2	33e3bc4ff1	Updating Traffic Generators (#1416 ) Added documentation for `strided_generator.py` and `strided_generator_core.py`. Updated clarity of documentation for `linear_generator.py`, `linear_generator_core.py`, `random_generator.py`, and `random_generator_core.py`. Made `max_addr` exclusive instead of inclusive for strided and linear traffic generation in `strided_gen.cc` and `linear_gen.cc`.	2024-08-08 12:46:10 -07:00
Matthew Poremba	85c48a36ec	dev-amdgpu: Fix issues found by address sanitizer (#1430 ) These commits primarily fix the SDMA engine which was (1) using pointer arithmetic on a variable returned by new and then attempting to free the modified pointer and (2) using a buffer after it was freed due to the DMA device calling completion event before Ruby actually completed. Some minor fixes are included: Stop using uninitialized value as packet context and using same request pointer for two separate packets for GPU invalidations.	2024-08-08 11:14:50 -07:00
Ivana Mitrovic	ba0c3cc29a	misc: Update GitHub badge links (#1428 ) Change-Id: Iaead9f6146a90c9b2a671b9b78a318869ca739e6	2024-08-08 08:44:26 -07:00
Yangyu Chen	ce07203c5f	arch-riscv: use sign-extend for all address generation (#1316 ) In gem5, we use the same code base for RISC-V 32 and 64. However, if we need to allow modifiable XLEN control on CSR.mstatus in the future, we should follow the RISC-V ISA manual to sign-extend all the register results, including PC and GPR. If this feature is implemented, the simulator needs to handle user-mode in RV32 while CSR.SATP is set to Sv39. In this case, 0x80000000 and 0xffffffff80000000 are different addresses in the 64-bit S-Mode perspective, but they are the same in the 32-bit U-Mode perspective. We should avoid this wrong behavior happening before we implement this feature. Thus, we need to sign-extend the results of all the addresses, including the PC and memory addresses, which currently use zero-extend. As specified in the RISC-V ISA manual, we use zero-extend in narrow XLEN mode for the physical address implemented in TLB. Changes based on spec: 1. Sign-extend narrow XLEN: https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-b7a445a-2024-07-02/src/machine.adoc?plain=1#L567 2. Zero-extend physical address: https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-b7a445a-2024-07-02/src/supervisor.adoc?plain=1#L1670 Signed-off-by: Yangyu Chen <cyy@cyyself.name>	2024-08-08 08:41:35 -07:00
Matt Sinclair	86f7fae86b	gpu-compute: fix GPU TLB outstandingReqs vs. associativity (#1431 ) The GPU TLB maxOutstandingReqs field gets limited by the associativity. In the current setup, this means that the max outstanding requests is 32 even though the setup is for 64 entries. Update the associativity to be 64 entries. Change-Id: I2104e4647d97bf4d1cf5ac447e38ad6ac6a1a0d8	2024-08-07 21:37:36 -05:00
Matthew Poremba	84fedecafe	gpu-compute: Update Requests for invalidations The SQC and TCC invalidations share a Request pointer which they both modify. This can cause some problems, so use a different request pointer for each invalidate. The setContext call is also removed as the value being assigned to it is uninitialized. Change-Id: I82ea7aa44a4f4515c1560993caa26cc6a89355af	2024-08-07 14:37:49 -07:00
Matthew Poremba	db0d5f19cf	dev-amdgpu: Add cleanup events for SDMA SDMA packets which use dmaVirtWrites call their completion event before the write takes place in the Ruby protocol. This causes a use-after-free issue, corrupting random memory locations and leading to random errors. This commit adds a cleanup event for each packet that uses DMA and sets the cleanup latency as 10000 ticks. In atomic mode, the writes complete exactly 2000 ticks after the completion event is called and therefore a fixed latency can be used. This is not tested with timing mode, which does not work with GPUFS at the moment, so a warning is added to give an idea where to look in case the same issue occurs once timing mode is supported. Change-Id: I9ee2689f2becc46bb7794b18b31205f1606109d8	2024-08-07 14:37:49 -07:00
Matt Sinclair	03ddd0b75f	gpu-compute: fix GPU TLB outstandingReqs vs. associativity The GPU TLB maxOutstandingReqs field gets limited by the associativity. In the current setup, this means that the max outstanding requests is 32 even though the setup is for 64 entries. Update the associativity to all 64 entries. Change-Id: I2104e4647d97bf4d1cf5ac447e38ad6ac6a1a0d8	2024-08-07 16:16:01 -05:00
Matthew Poremba	0d0b68266c	dev-amdgpu: Fix bad free in SDMA The SDMA engine copies data in chunks. It currently uses the pointer returned from new[] and manipulates it using pointer arithmetic. This modified pointer is then passed to the completion function which deletes the pointer. Since it is not the original pointer allocated by new[] this triggers issues in ASAN. Change-Id: I03ccf026633285e75005509445c62fcbda8eb978	2024-08-07 12:54:45 -07:00
Saili Karkare	bd228af5cf	Updating hex addr printing (#1385 ) This change changes the addresses that are printed when TrafficGen DebugFlag is enabled. Previously, hex strings were printed without a preceding 0x. This change fixes that to distinguish between decimal and hex.	2024-08-07 02:31:21 -07:00
Bobby R. Bruce	811e8c0fb4	util-docker,tests: Up clang support: >=v10 (#1415 ) The compiler tests are failing due to a compile bug in Clang 7: https://github.com/gem5/gem5/actions/runs/10170081794 Given Ubuntu 20.04 APT installs v10 by default (i.e., with `apt install clang`). This is the oldest LTS Ubuntu version. It therefore seems sensible to drop support for older (<v10) versions of clang.	2024-08-07 00:38:08 -07:00
Bobby R. Bruce	eabb625870	util-docker,tests: Up clang support: >=v10 The compiler tests are failing due to a compile bug in Clang 7: https://github.com/gem5/gem5/actions/runs/10170081794 Given Ubuntu 20.04 APT installs v10 by default (i.e., with `apt install clang`). This is the oldest LTS Ubuntu version. It therefore seems sensible to drop support for older (<v10) versions of clang. Change-Id: I4c48223b80306422beac1464c09f03397c156ba1	2024-08-07 00:35:34 -07:00
Bobby R. Bruce	bbc49aa914	misc: Stable merge to dev (#1424 )	2024-08-06 21:02:35 -07:00
Bobby R. Bruce	8885d60399	misc: Merge branch stable branch into develop Change-Id: Ie391ea7eeb86a6e862e910e7d150edde0059cc54	2024-08-06 21:02:06 -07:00
Bobby R. Bruce	bb290aaff5	misc: Change devcontainer for isca tutorial and bootcamp (#1282 )	2024-08-06 19:45:48 -07:00
Robert Hauser	ba704a01b2	misc: Fix typo in multisim code snippet (#1417 )	2024-08-06 13:54:16 -07:00
Bobby R. Bruce	bd53bad5cf	mem: Fix "Need is_secure arg" prefetcher crash (#1374 ) This PR fixes the "Need is_secure arg" crash that occurs when using IndirectMemoryPrefetcher, SignaturePathPrefetcher, SignaturePathPrefetcherV2, STeMSPrefetcher, and PIFPrefetcher. This was done by changing some variables to have the type AssociativeSet<...> instead of AssociativeCache<...> and adding in "false" or an existing value for the value of the secure bit in some function calls. Further changes may be needed to move away from hard-coding values.	2024-08-06 13:01:40 -07:00
Erin Le	6dbe2bca7b	mem: Add constexprs to spatio_temporal_memory_streaming.cc Change-Id: I6fa3d9f9a9d89d59d9ec1fc97c152bea3059f87d	2024-08-06 00:06:38 +00:00
Erin Le	f325949ba5	mem: remove stray comment from signature_path_v2.cc Change-Id: I5ddd2ddd6a9cb4fb032b48870c5ef6b0dc9533c0	2024-08-05 23:10:10 +00:00
Erin Le	2db021b27b	mem: Comment removal and adding constexpr to is_secure bools This commit removes some comments and adds constexpr in front of "bool is_secure..." in pif.cc, signature_path.cc, and signature_path_v2.cc Change-Id: Icafe1d7c97d1d3fbf6abc12ba87ebb596255b96f	2024-08-05 15:43:40 -07:00
Erin Le	9adf44ed1f	mem: use is_secure instead of hardcoded false in prefetcher crash This modifies the crash fix so that the function calls that were modified use a local variable called `is_secure` instead of a hardcoded `false`. Some of these existed previously so it made more sense to use them, while others were newly added in to mark where the code might need to be changed later. Change-Id: I0c0d14b74f0ccf70ee5fe7c8b01ed0266353b3c1	2024-08-05 15:43:40 -07:00
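
Commit ce07203c5f in the list above contrasts sign-extension with zero-extension of 32-bit addresses. The following is a minimal Python sketch of that arithmetic only, not gem5 code; the helper names `zero_extend` and `sign_extend` are hypothetical and chosen for illustration. It shows why 0x80000000 and 0xffffffff80000000 are distinct as 64-bit values yet identical once truncated to 32 bits.

```python
MASK64 = (1 << 64) - 1  # results are kept within 64 bits


def zero_extend(value: int, bits: int = 32) -> int:
    """Keep the low `bits` bits; all upper bits become zero."""
    return value & ((1 << bits) - 1)


def sign_extend(value: int, bits: int = 32) -> int:
    """Copy bit `bits - 1` into every upper bit of a 64-bit result."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    # XOR-then-subtract propagates the sign bit upward.
    return ((value ^ sign_bit) - sign_bit) & MASK64


addr32 = 0x80000000
zext = zero_extend(addr32)
sext = sign_extend(addr32)
print(hex(zext))  # 0x80000000
print(hex(sext))  # 0xffffffff80000000
```

Both results map back to the same 32-bit value under `zero_extend`, which is the aliasing the commit message describes for a 32-bit U-mode view of a 64-bit address space.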

21911 Commits