
derek/gem5

Author	SHA1	Message	Date
Ali Nezhadi Khelejani	1512eddd43	misc: Update on-create.sh (#1477 ) After merging the old personal gem5 repository with the stable version v24, I tried to run the project inside the `.devcontainer` environment. During the image build process, I encountered the following error: ```sh [7683 ms] Start: Run in container: /bin/sh -c ./.devcontainer/on-create.sh fatal: detected dubious ownership in repository at '/workspaces/gem5' To add an exception for this directory, call: git config --global --add safe.directory /workspaces/gem5 [7724 ms] onCreateCommand failed with exit code 128. Skipping any further user-provided commands. ``` This error occurred due to an ownership permission problem, which I resolved by adding the following line.	2024-08-20 11:15:33 -07:00
Bobby R. Bruce	0857442e44	util-docker: Cleanup, refactor, better document Dockerfiles (#1292 ) * Removes the "docker-compose.yaml" in favor of "docker-bake.hcl". This uses the `docker buildx` tool which has the advantage of enabling multi-platform builds where desired. By default all images are built targeting `linux/arm64`, `linux/amd64` and `linux/riscv64` as targets with the exception of the GPU images where only `linux/amd64` makes sense. * Remove unused/older Docker build targets (these can easily be re-added but they were not regularly built or have any current usage). * Update "README.md" to better describe these Dockerfiles and how they are built. * Simplify GCC and Clang compiler images. Each uses the Ubuntu 24.04 All Deps image as a base then specializes the compiler on top. * To simplify things, all compiler versions are built from 24.04. This means narrowing the supported versions from GCC v10 to v14 and Clang v14 to v18. * Fix some bugs in the "docker-bake.hcl" thus ensuring all targets may be built from it. * Cleanup the systemc and sst images: reducing their size and building them off the common 24.04 ubuntu base image.	2024-08-20 09:45:47 -07:00
Harshil Patel	ce4c2c6495	dev,arch-x86: Added softstrobe mode to intel8254 timer (#1447 ) This PR should fix the #1195	2024-08-19 12:21:31 -07:00
Bobby R. Bruce	7413d3217c	docs,misc: RELEASE-NOTES.md updates for v24.1 (#1460 )	2024-08-19 10:58:29 -07:00
Bobby R. Bruce	f600db4a98	gpu-compute,tests: Move GPU tests to testlib (#1270 ) A new host tag `gcn_gpu` has been added. This allows for selection of those GPU tests which depend upon the gcn-gpu docker image to run. In addition to this, the square GPU tests have been moved to the CI tests. This ensures some GPU code is compiled and run on every PR.	2024-08-19 10:58:06 -07:00
Yangyu Chen	b0d81ec8a2	arch-riscv: fix GDB breakpoint issue for RV32 (#1470 ) Since PR #1316, we use sign-extend for all address generation, including PC, to match the ISA specification for modifiable XLEN. However, when we set a breakpoint using remote GDB, our address is not sign-extended. This causes the breakpoint to be set at the wrong address, as specified in Issue #1463. This PR fixes the issue by sign-extending the address when setting a breakpoint. This also matches the RISC-V ISA Specification that "must sign-extend results to fill the entire widest supported XLEN in the destination register." Change-Id: I9b493bf8ad5b1ef45a9728bb40fc5e38250fe9c3 Signed-off-by: Yangyu Chen <cyy@cyyself.name>	2024-08-19 10:25:39 -07:00
Bobby R. Bruce	cad4307951	util-docker: Re-add env variables to SST Change-Id: I653baeb69f8be1501766b57337f6643e00d7dd60	2024-08-19 10:08:20 -07:00
Yu-Cheng Chang	aa4fe362a5	arch-riscv: Sign-extend the address in newPCState (#1471 ) From #1316, creating the new PCState should sign-extend the address to avoid wrong address issue. Change-Id: I884b4e3708f5f1cc49cfd44d51bec5a2b63cc47a	2024-08-19 08:21:42 -07:00
Giacomo Travaglini	280871245b	arch-arm: Redirect VHE for ZCR_EL1 (#1472 ) Change-Id: Iff83d25257065503dc02728461823bc9985dbab3 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-08-16 22:49:49 +01:00
Bobby R. Bruce	646df63e56	misc: Fix typos in util/dockerfiles/README.md Change-Id: I5488301543bfff21279b6c0b1aae841574efee95 Co-authored-by: Harshil Patel <harshilp2107@gmail.com>	2024-08-15 10:45:53 -07:00
Alexander Richardson	646f994efb	arch-arm: Fix incorrect operation of VRINT* instructions (#1325 ) After a lot of debugging and comparing traces I noticed that vrintp was giving different results from QEMU. An input of 0x3f800000 (1.0) was being passed to the fplib helpers as (uint32_t)1 which has a completely different floating-point interpretation and the result was therefore completely wrong. I've fixed this as well as all remaining implicit float-to-int conversions in the ARM instruction execution. There are more -W(implicit-)float-conversion warnings in the other executors, but for now this fixes the issue I was seeing. Change-Id: Ifdeee745ca155d7f4504ac4c54235ac431acdeb9	2024-08-15 11:01:48 +01:00
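The value mix-up described in this commit message can be reproduced outside gem5. The following is a minimal Python sketch, not the gem5 code: the helper names `float_to_bits` and `bits_to_float` are hypothetical and only illustrate why passing the converted integer 1 instead of the raw bit pattern 0x3f800000 gives a completely different floating-point value.

```python
import struct


def float_to_bits(value: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Correct: hand the raw bit pattern of 1.0 to a helper that expects bits.
correct_bits = float_to_bits(1.0)

# Buggy: an implicit float-to-int conversion turns 1.0 into the integer 1,
# which the helper then interprets as a bit pattern (a tiny subnormal).
buggy_bits = int(1.0)

correct_value = bits_to_float(correct_bits)
buggy_value = bits_to_float(buggy_bits)
```

Here `correct_bits` is 0x3f800000 and decodes back to 1.0, while `buggy_bits` decodes to a subnormal number very close to zero.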
Bobby R. Bruce	0c26ee5f71	util-docker: Replace gem5 v24.0 clone with wget This is more efficient. Change-Id: Idd57343183a8667425dbc036ad0c7c18581898f5	2024-08-14 14:08:44 -07:00
Setu	629bf84e10	mem: Stride Prefetcher Fix (#1449 ) This PR fixes the issues mentioned in #1448. Note that this contribution is the result of a joint collaboration with @AbhishekUoR This PR introduces the following 4 changes: 1. It changes the addresses which are used to compute the stride to cache line aligned addresses (the current version uses word aligned addresses) 2. It correctly returns if the stride does not match (as opposed to issuing prefetches using the new stride incorrectly) 3. It returns if the new stride is 0, indicating multiple reads from the same cache line. 4. It removes code which is no longer necessary after the addition of changes number 1 and 3. Change-Id: Ic346d0e15df6d07e2b93289c8d6b89b4c2f45a34 --------- Co-authored-by: Abhishek Shailendra Singh <abs218@leigh.edu>	2024-08-14 07:16:10 -07:00
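The first three changes listed in this commit message can be sketched as follows. This is a simplified Python illustration, not the gem5 prefetcher: the 64-byte line size, the `StrideEntry` class, the `observe` function, and the `degree` parameter are all assumptions, and the confidence tracking a real stride prefetcher would use is omitted.

```python
BLOCK_SIZE = 64  # assumed cache line size in bytes


def line_addr(addr: int) -> int:
    """Align an address down to its cache line (change 1)."""
    return addr & ~(BLOCK_SIZE - 1)


class StrideEntry:
    """Hypothetical per-stream state: last line address and last stride."""

    def __init__(self, last_addr: int, stride: int = 0):
        self.last_addr = line_addr(last_addr)
        self.stride = stride


def observe(entry: StrideEntry, addr: int, degree: int = 2) -> list:
    """Update the entry with a new access and return prefetch addresses."""
    current = line_addr(addr)
    new_stride = current - entry.last_addr
    if new_stride == 0:
        # Another read from the same cache line: nothing to learn (change 3).
        return []
    entry.last_addr = current
    if new_stride != entry.stride:
        # Stride mismatch: record the new stride but issue nothing (change 2).
        entry.stride = new_stride
        return []
    return [current + entry.stride * i for i in range(1, degree + 1)]


entry = StrideEntry(last_addr=0x1000)
same_line = observe(entry, 0x1008)   # same cache line as 0x1000
first_miss = observe(entry, 0x1040)  # stride changes from 0 to 0x40
prefetches = observe(entry, 0x1080)  # stride 0x40 repeats
```

Only the third access, which repeats the previously recorded stride, produces prefetch candidates.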
Bobby R. Bruce	dcb04a72fc	util-docker,tests: Remove Ubuntu 20.04 Docker Change-Id: I1d4bbebaa4b6f064b5f40a95d066bbf092cf103f	2024-08-13 16:15:49 -07:00
Bobby R. Bruce	9f93c8ac9c	util-docker: Revert docker image tag to 'latest' Change-Id: Iafe92716725e6b3cecfeba57098c3a7efaf73d97	2024-08-13 16:13:33 -07:00
Bobby R. Bruce	59455daa85	util-docker: Fix correct common platform comment Change-Id: Ifc703b47b1e59522ba01f4c2b59a4863779eefb1	2024-08-13 16:12:45 -07:00
Bobby R. Bruce	8b61490df1	util-docker: Update dockerfiles README Change-Id: I39bca04b3770bd51203944d69d0fbecff85055f8	2024-08-13 16:09:08 -07:00
Bobby R. Bruce	bef452ce72	misc,tests: Update supported GCC and Clang compilers - GCC: v10 to v14 - Clang: v14 to v18 Change-Id: I6cd1686ffff0f08686a231b6b4936da343d53831	2024-08-13 16:09:06 -07:00
Bobby R. Bruce	b68c2ef37f	util-docker: Add vim to 24.04-all-deps Ubuntu Docker Change-Id: I898a0fddcdcf8a876fcbbe11795e858395ad9740	2024-08-13 16:08:05 -07:00
Bobby R. Bruce	3875dcdfd7	util-docker: Update the sst Dockerfile 1. Builds on top of the Ubuntu 24.04 all-deps image. 2. Unify the download, build, install, and cleanup steps. Change-Id: I4c2bf8e571dfd228f7df8372cda0f428de59af51	2024-08-13 16:08:05 -07:00
Bobby R. Bruce	2c0c933a3a	util-docker: Cleanup the systemc docker 1. Uses the ubuntu-24.04_all-deps as the base image. 2. Unifies the build and cleanup into a single step, thus reducing the size of the image. Change-Id: I63b5dad2af0e8b1f6be8ad1f28321c743f36b2dc	2024-08-13 16:08:05 -07:00
Bobby R. Bruce	58aad68329	util-docker: Order targets in docker-bake This improves readability. The target order matches that in the default group. Change-Id: I1102aeb48bc256df9b58032a327ec663e5733a98	2024-08-13 16:08:03 -07:00
Bobby R. Bruce	9978b4ea4c	util-docker: Add 'devcontainer' to default bake group Change-Id: I4b245cabd6e384cab780bd22b0f8b40d9819b92b	2024-08-13 16:07:22 -07:00
Bobby R. Bruce	4956c475f4	util-docker: Set the GPU Docker images to build only to x86 These images won't work and make no sense compiling to any platform other than X86. These are used in SE mode simulations where the host platform matters. Change-Id: I47405e930bf511fabcbc93d0b08ee2fb2c556869	2024-08-13 16:07:22 -07:00
Bobby R. Bruce	c1a562083d	util-docker: Set 'pull' to 'always' This ensures a `docker pull` command is always run before building. Change-Id: If1a66b9b426d5843459e0308a64f13a11c0c6ed2	2024-08-13 16:07:21 -07:00
Bobby R. Bruce	9d635dea55	util-docker: Improve Docker gcc and clang builder 1. Uses the all-dependencies image as the base image. 2. Has all compilers use Ubuntu 24.04. Notes: This change implicitly changes our supported compilers to GCC v10 to v13 and Clang v14 to v18. This will be fully incorporated into the project later. Change-Id: Id8e2141ea64a34c7e3532605f6ecb7d9ccb76951	2024-08-13 16:07:15 -07:00
Bobby R. Bruce	a1eefb6ed8	util-docker: Remove old/unused/unnecessary Dockerfiles * Unsupported compilers. * Unused cross compilers. * The gem5-all-min-deps image. Change-Id: Iaab64e5e6685b0a538c38b2979fae86f01bc53e8	2024-08-13 16:05:47 -07:00
Bobby R. Bruce	e03a20bdb4	util-docker: Remove version from systemc Docker context This simplifies things slightly. Change-Id: I1263e385f7adeb2b83cdc09f7f6903be9193c467	2024-08-13 16:05:47 -07:00
Bobby R. Bruce	c291678881	util-docker: Fix docker-bake.hcl sst context The context for sst is the "sst" directory. Change-Id: Ic120cca13a9e4df02b98d101ad8e16c296807c2d	2024-08-13 16:05:47 -07:00
Bobby R. Bruce	e82e824f08	util-docker: Breakup long (>79 char) lines in docker-bake Change-Id: I5488301543bfff21279b6c0b1aae841574efee95	2024-08-13 16:05:46 -07:00
Bobby R. Bruce	3640559a12	misc,tests: Fix compiler tests (add missing `,`) (#1459 )	2024-08-13 06:54:12 -07:00
Alexander Richardson	f6f547fb62	arch-arm: Fix incorrect behaviour of VFNMS and VFNMA (#1420 ) This was found while comparing a diverging execution against QEMU traces and checking for the first mismatched program counter. Fortunately this was caused by a branch shortly after this incorrect computation but still took a long time to track down. There are two issues here: the decoder had inverted the cases for S and A, and the sign bit was wrong for VFN*.	2024-08-13 09:05:52 +01:00
Matthew Poremba	c359b53a19	arch-vega: Update microscaling format scaling and denorm handling (#1451 ) This PR has 3 commits: - Update scaling methods to scale by multiplication or division when upcasting or downcasting respectively. - Preserve the sign when a microscaling conversion results in NaN or infinity to match hardware. - Rework rounding to handle cases where conversion results in a denormal number in the output type so that the value is correct.	2024-08-12 07:00:26 -07:00
Matthew Poremba	7d46c50663	arch-vega: Swizzle multi-dword scratch requests (#1445 ) Scratch memory requests that are larger than one dword are using a different memory layout than global instructions. Rather than being placed contiguously, each dword is interleaved 64 lanes * 4 bytes away as described in Section 9.1.5.2. "Swizzled Buffer Addressing" in the MI300 specification. This was verified by comparing MI300 output (which uses scratch_ instructions) with MI200 (which uses buffer instructions). MI300 FashionMNIST bs=1 now matches CPU reference. This requires several changes to the instruction implementations: - For stores, data in the GPUDynInst can be swizzled before the data is written to memory. This is easy to do using a helper method. This is done in the template<int N> variant of initMemWrite. To use this x2 stores are changed to use template<int N> rather than loading a U64. The swizzle function is renamed to swizzleAddr to avoid confusion with swizzleData. - For loads, data is unswizzled in completeAcc when writing register values. This is not as easy to implement as a helper and is thus implemented for the three load instructions that load more than one dword. - Accessing swizzled data requires at least one packet per dword. A new GPU memory helper is added to create these packets for scratch requests specifically. This is called in the template<int N> variant of initMemRead / initMemWrite. Loads and stores of x2 are changed to use this variant instead of accessing a U64. The GPUDynInst status vector restrictions are increased to allow for swizzled x4 accesses. For simplicity this does not currently support misaligned swizzled accesses and will panic upon seeing such a case. Change-Id: Ic686c51e28e0af029a043d5a5b3d4069f2cb94f9	2024-08-12 06:58:48 -07:00
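The layout difference described in this commit message, contiguous dwords versus dwords interleaved 64 lanes * 4 bytes apart, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the gem5 `swizzleAddr` helper: the function names are hypothetical, and `first` stands for whatever address a lane's first dword already resolves to.

```python
WAVE_LANES = 64   # lanes per wavefront, as stated in the commit message
DWORD_BYTES = 4   # bytes per dword


def contiguous_dword_addrs(first: int, num_dwords: int) -> list:
    """Global-style layout: a lane's dwords sit next to each other."""
    return [first + d * DWORD_BYTES for d in range(num_dwords)]


def swizzled_dword_addrs(first: int, num_dwords: int) -> list:
    """Scratch-style layout: consecutive dwords of one lane are
    64 lanes * 4 bytes apart."""
    stride = WAVE_LANES * DWORD_BYTES
    return [first + d * stride for d in range(num_dwords)]


x2_contiguous = contiguous_dword_addrs(0x100, 2)
x2_swizzled = swizzled_dword_addrs(0x100, 2)
```

Because the dwords of one lane are no longer adjacent, each dword needs its own memory packet, which matches the commit's note that swizzled accesses require at least one packet per dword.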
Matthew Poremba	62a2c09d4b	arch-vega: Rework rounding for microscaling conversions The current implementation does not correctly convert subnormal numbers (number that fill the underflow gap around zero in floating-point arithmetic). This commit reworks the rounding code to get correct results. First, the min_exp is set to 0 which allows for numbers to become subnormal when rounding. Second, the rounding code now uses something closer to "GRS" rounding (guard, round, sticky) which represent the first bit removed when rounding to a smaller type, the next second bit removed, and whether any of the other bits removed are one. More details can be found in the code comments. Change-Id: Idcd2f1e4383e4012fc3abf73b1f73c847d44f67b	2024-08-10 10:23:07 -07:00
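The guard, round, and sticky bits mentioned in this commit message can be illustrated with a small integer example. The Python sketch below is a generic round-to-nearest-even routine, not the gem5 implementation: the function name `round_grs` is hypothetical, round-to-nearest-even is an assumed rounding mode, and a carry out of the mantissa after the increment is left to the caller.

```python
def round_grs(mantissa: int, shift: int) -> int:
    """Drop the low `shift` bits of mantissa, rounding to nearest even.

    guard  = first (most significant) bit removed
    round  = second bit removed
    sticky = 1 if any of the remaining removed bits is set
    """
    if shift <= 0:
        return mantissa
    kept = mantissa >> shift
    guard = (mantissa >> (shift - 1)) & 1
    round_bit = (mantissa >> (shift - 2)) & 1 if shift >= 2 else 0
    sticky = int(mantissa & ((1 << (shift - 2)) - 1) != 0) if shift >= 3 else 0
    # Round up when above the halfway point, or exactly halfway with an
    # odd kept value (ties go to the even neighbour).
    if guard and (round_bit or sticky or (kept & 1)):
        kept += 1
    return kept


tie_odd = round_grs(0b1011_100, 3)    # halfway, kept value is odd
tie_even = round_grs(0b1010_100, 3)   # halfway, kept value is even
above_half = round_grs(0b1010_101, 3)
below_half = round_grs(0b1010_011, 3)
```

The two tie cases show why the sticky bit matters: without it, a value just above the halfway point would be indistinguishable from an exact tie.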
Matthew Poremba	bdba981753	arch-vega: Preserve sign of NaN/Inf for microscaling types The implementation of microscaling formats uses the Open Compute Project specification which includes a sign bit for NaN and infinity. This should be preserved when a conversion results in NaN or infinity. Change-Id: Id9e99324c6486e256c699016aff301d5f06814d5	2024-08-10 10:23:07 -07:00
Matthew Poremba	c1251f51c1	arch-vega: Introduce two scaling methods for microscaling types Currently there is only a scale() method which multiplies a microscaling type by an int8 value. This should only be applied when upcasting to a larger type after conversion to match hardware. When downcasting to a smaller type, the scaling method should divide by the int8 value before conversion. This commit adds both scaling methods. Change-Id: Ibafa8caa389cde4df609e536cd53bd2289959420	2024-08-10 10:23:07 -07:00
Robert Hauser	e980780efd	arch-riscv: Extend wfi behavior (#1364 ) At the moment, a hart does not halt if there are pending interrupts. However, an implementation can also consider the enable status of the individual interrupts, i.e., a halted hart would only resume if there are locally enabled pending interrupts. This commit introduces this behavior. The wfi behavior is controlled by the new configuration variable wfi_pending_resume of RiscvISA. Change-Id: I316239f9732c6e73e6ad692491bca08d773dd995 --------- Signed-off-by: Robert Hauser <robert.hauser@uni-rostock.de>	2024-08-09 11:28:15 -07:00
Marleson Graf	b8001a861b	mem-ruby,sim-se: Clear LL/SC locks after functional writes (#1404 ) Functional writes atomically update all copies of a data block, so they should invalidate any pending LL/SC locks, just like a conventional write would. Change-Id: Ic79d2d8d24901f1b6a2ce81dc0e2decc84c0ebbc	2024-08-09 09:30:37 -07:00
Bobby R. Bruce	8593f69f0a	util: Fix MongoDB script requirements.txt (#1426 ) Dependency Bot appears to have had difficulty with this file: https://github.com/gem5/gem5/security/dependabot/29 This PR: 1. Removes the weird "```" which could not be parsed. 2. Ups PyMongo to a more secure version.	2024-08-08 13:01:29 -07:00
MMysore2	33e3bc4ff1	Updating Traffic Generators (#1416 ) Added documentation for `strided_generator.py` and `strided_generator_core.py`. Updated clarity of documentation for `linear_generator.py`, `linear_generator_core.py`, `random_generator.py`, and `random_generator_core.py`. Made `max_addr` exclusive instead of inclusive for strided and linear traffic generation in `strided_gen.cc` and `linear_gen.cc`.	2024-08-08 12:46:10 -07:00
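The effect of treating `max_addr` as exclusive can be sketched with a toy linear generator. This Python snippet is an assumption-laden illustration, not the logic in `linear_gen.cc`: the function `linear_addrs` and its wrap-around condition are hypothetical and only show that no generated block touches `max_addr` itself.

```python
def linear_addrs(min_addr: int, max_addr: int, block_size: int, count: int) -> list:
    """Generate `count` block addresses in [min_addr, max_addr), wrapping
    back to min_addr when the next block would not fit below max_addr."""
    addrs = []
    addr = min_addr
    for _ in range(count):
        addrs.append(addr)
        addr += block_size
        if addr + block_size > max_addr:
            # max_addr is exclusive: the block starting here would reach
            # or cross it, so restart from the beginning of the range.
            addr = min_addr
    return addrs


sequence = linear_addrs(min_addr=0, max_addr=256, block_size=64, count=5)
```

With an inclusive bound, the same range would also produce an access starting at address 256.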
Matthew Poremba	85c48a36ec	dev-amdgpu: Fix issues found by address sanitizer (#1430 ) These commits primarily fix the SDMA engine which was (1) using pointer arithmetic on a variable returned by new and then attempting to free the modified pointer and (2) using a buffer after it was freed due to the DMA device calling completion event before Ruby actually completed. Some minor fixes are included: Stop using uninitialized value as packet context and using same request pointer for two separate packets for GPU invalidations.	2024-08-08 11:14:50 -07:00
Ivana Mitrovic	ba0c3cc29a	misc: Update GitHub badge links (#1428 ) Change-Id: Iaead9f6146a90c9b2a671b9b78a318869ca739e6	2024-08-08 08:44:26 -07:00
Yangyu Chen	ce07203c5f	arch-riscv: use sign-extend for all address generation (#1316 ) In gem5, we use the same code base for RISC-V 32 and 64. However, if we need to allow modifiable XLEN control on CSR.mstatus in the future, we should follow the RISC-V ISA manual to sign-extend all the register results, including PC and GPR. If this feature is implemented, the simulator needs to handle user-mode in RV32 while CSR.SATP is set to Sv39. In this case, 0x80000000 and 0xffffffff80000000 are different addresses in the 64-bit S-Mode perspective, but they are the same in the 32-bit U-Mode perspective. We should avoid this wrong behavior happening before we implement this feature. Thus, we need to sign-extend the results of all the addresses, including the PC and memory addresses, which currently use zero-extend. As specified in the RISC-V ISA manual, we use zero-extend in narrow XLEN mode for the physical address implemented in TLB. Changes based on spec: 1. Sign-extend narrow XLEN: https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-b7a445a-2024-07-02/src/machine.adoc?plain=1#L567 2. Zero-extend physical address: https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-b7a445a-2024-07-02/src/supervisor.adoc?plain=1#L1670 Signed-off-by: Yangyu Chen <cyy@cyyself.name>	2024-08-08 08:41:35 -07:00
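The difference between the two extension schemes for the addresses named in this commit message can be checked with a short Python sketch. The helper names `sign_extend32` and `zero_extend32` are hypothetical and are not gem5 functions; they only model widening a 32-bit value to 64 bits.

```python
MASK32 = 0xFFFF_FFFF
UPPER32 = 0xFFFF_FFFF_0000_0000


def zero_extend32(value: int) -> int:
    """Widen a 32-bit value to 64 bits by filling the upper half with 0."""
    return value & MASK32


def sign_extend32(value: int) -> int:
    """Widen a 32-bit value to 64 bits by copying bit 31 into the upper half."""
    low = value & MASK32
    if low & 0x8000_0000:
        return low | UPPER32
    return low


zero_ext = zero_extend32(0x8000_0000)
sign_ext = sign_extend32(0x8000_0000)
# From a 32-bit perspective both forms name the same address.
same_low_half = (sign_ext & MASK32) == (zero_ext & MASK32)
```

The two results differ as 64-bit values, which is why a breakpoint address supplied without sign extension would not match a sign-extended PC.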
Matt Sinclair	86f7fae86b	gpu-compute: fix GPU TLB outstandingReqs vs. associativity (#1431 ) The GPU TLB maxOutstandingReqs field gets limited by the associativity. In the current setup, this means that the max outstanding requests is 32 even though the setup is for 64 entries. Update the associativity to be 64 entries. Change-Id: I2104e4647d97bf4d1cf5ac447e38ad6ac6a1a0d8	2024-08-07 21:37:36 -05:00
Matthew Poremba	84fedecafe	gpu-compute: Update Requests for invalidations The SQC and TCC invalidations share a Request pointer which they both modify. This can cause some problems, so use a different request pointer for each invalidate. The setContext call is also removed as the value being assigned to it is uninitialized. Change-Id: I82ea7aa44a4f4515c1560993caa26cc6a89355af	2024-08-07 14:37:49 -07:00
Matthew Poremba	db0d5f19cf	dev-amdgpu: Add cleanup events for SDMA SDMA packets which use dmaVirtWrites call their completion event before the write takes place in the Ruby protocol. This causes a use-after-free issue corrupting random memory locations, leading to random errors. This commit adds a cleanup event for each packet that uses DMA and sets the cleanup latency as 10000 ticks. In atomic mode, the writes complete exactly 2000 ticks after the completion event is called and therefore a fixed latency can be used. This is not tested with timing mode, which does not work with GPUFS at the moment, so a warning is added to give an idea where to look in case the same issue occurs once timing mode is supported. Change-Id: I9ee2689f2becc46bb7794b18b31205f1606109d8	2024-08-07 14:37:49 -07:00
Matt Sinclair	03ddd0b75f	gpu-compute: fix GPU TLB outstandingReqs vs. associativity The GPU TLB maxOutstandingReqs field gets limited by the associativity. In the current setup, this means that the max outstanding requests is 32 even though the setup is for 64 entries. Update the associativity to all 64 entries. Change-Id: I2104e4647d97bf4d1cf5ac447e38ad6ac6a1a0d8	2024-08-07 16:16:01 -05:00
Matthew Poremba	0d0b68266c	dev-amdgpu: Fix bad free in SDMA The SDMA engine copies data in chunks. It currently uses the pointer returned from new[] and manipulates it using pointer arithmetic. This modified pointer is then passed to the completion function which deletes the pointer. Since it is not the original pointer allocated by new[] this triggers issues in ASAN. Change-Id: I03ccf026633285e75005509445c62fcbda8eb978	2024-08-07 12:54:45 -07:00
Saili Karkare	bd228af5cf	Updating hex addr printing (#1385 ) This change changes the addresses that are printed when TrafficGen DebugFlag is enabled. Previously, hex strings were printed without a preceding 0x. This change fixes that to distinguish between decimal and hex.	2024-08-07 02:31:21 -07:00


21922 Commits