21911 Commits

Author SHA1 Message Date
Bobby R. Bruce
646df63e56 misc: Fix typos in util/dockerfiles/README.md
Change-Id: I5488301543bfff21279b6c0b1aae841574efee95

Co-authored-by: Harshil Patel <harshilp2107@gmail.com>
2024-08-15 10:45:53 -07:00
Bobby R. Bruce
0c26ee5f71 util-docker: Replace gem5 v24.0 clone with wget
This is more efficient.

Change-Id: Idd57343183a8667425dbc036ad0c7c18581898f5
2024-08-14 14:08:44 -07:00
Bobby R. Bruce
dcb04a72fc util-docker,tests: Remove Ubuntu 20.04 Docker
Change-Id: I1d4bbebaa4b6f064b5f40a95d066bbf092cf103f
2024-08-13 16:15:49 -07:00
Bobby R. Bruce
9f93c8ac9c util-docker: Revert docker image tag to 'latest'
Change-Id: Iafe92716725e6b3cecfeba57098c3a7efaf73d97
2024-08-13 16:13:33 -07:00
Bobby R. Bruce
59455daa85 util-docker: Fix correct common platform comment
Change-Id: Ifc703b47b1e59522ba01f4c2b59a4863779eefb1
2024-08-13 16:12:45 -07:00
Bobby R. Bruce
8b61490df1 util-docker: Update dockerfiles README
Change-Id: I39bca04b3770bd51203944d69d0fbecff85055f8
2024-08-13 16:09:08 -07:00
Bobby R. Bruce
bef452ce72 misc,tests: Update supported GCC and Clang compilers
- GCC: v10 to v14
- Clang: v14 to v18

Change-Id: I6cd1686ffff0f08686a231b6b4936da343d53831
2024-08-13 16:09:06 -07:00
Bobby R. Bruce
b68c2ef37f util-docker: Add vim to 24.04-all-deps Ubuntu Docker
Change-Id: I898a0fddcdcf8a876fcbbe11795e858395ad9740
2024-08-13 16:08:05 -07:00
Bobby R. Bruce
3875dcdfd7 util-docker: Update the sst Dockerfile
1. Builds on top of the Ubuntu 24.04 all-deps image.
2. Unify the download, build, install, and cleanup steps.

Change-Id: I4c2bf8e571dfd228f7df8372cda0f428de59af51
2024-08-13 16:08:05 -07:00
Bobby R. Bruce
2c0c933a3a util-docker: Cleanup the systemc docker
1. Uses the ubuntu-24.04_all-deps as the base image.
2. Unifies the build and cleanup into a single step, thus reducing the
   size of the image.

Change-Id: I63b5dad2af0e8b1f6be8ad1f28321c743f36b2dc
2024-08-13 16:08:05 -07:00
Bobby R. Bruce
58aad68329 util-docker: Order targets in docker-bake
This improves readability. The target order matches that in the default
group.

Change-Id: I1102aeb48bc256df9b58032a327ec663e5733a98
2024-08-13 16:08:03 -07:00
Bobby R. Bruce
9978b4ea4c util-docker: Add 'devcontainer' to default bake group
Change-Id: I4b245cabd6e384cab780bd22b0f8b40d9819b92b
2024-08-13 16:07:22 -07:00
Bobby R. Bruce
4956c475f4 util-docker: Set the GPU Docker images to build only to x86
These images won't work on, and make no sense compiling for, any platform
other than X86. They are used in SE mode simulations, where the host
platform matters.

Change-Id: I47405e930bf511fabcbc93d0b08ee2fb2c556869
2024-08-13 16:07:22 -07:00
Bobby R. Bruce
c1a562083d util-docker: Set 'pull' to 'always'
This ensures that a `docker pull` command is always run before building.

Change-Id: If1a66b9b426d5843459e0308a64f13a11c0c6ed2
2024-08-13 16:07:21 -07:00
Bobby R. Bruce
9d635dea55 util-docker: Improve Docker gcc and clang builder
1. Uses the all-dependencies image as the base image.
2. Has all compilers use Ubuntu 24.04.

Note: This change implicitly changes our supported compilers to GCC v10
to v13 and Clang v14 to v18. This will be fully incorporated into the
project later.

Change-Id: Id8e2141ea64a34c7e3532605f6ecb7d9ccb76951
2024-08-13 16:07:15 -07:00
Bobby R. Bruce
a1eefb6ed8 util-docker: Remove old/unused/unnecessary Dockerfiles
* Unsupported compilers.
* Unused cross compilers.
* The gem5-all-min-deps image.

Change-Id: Iaab64e5e6685b0a538c38b2979fae86f01bc53e8
2024-08-13 16:05:47 -07:00
Bobby R. Bruce
e03a20bdb4 util-docker: Remove version from systemc Docker context
This simplifies things slightly.

Change-Id: I1263e385f7adeb2b83cdc09f7f6903be9193c467
2024-08-13 16:05:47 -07:00
Bobby R. Bruce
c291678881 util-docker: Fix docker-bake.hcl sst context
The context for sst is the "sst" directory.

Change-Id: Ic120cca13a9e4df02b98d101ad8e16c296807c2d
2024-08-13 16:05:47 -07:00
Bobby R. Bruce
e82e824f08 util-docker: Breakup long (>79 char) lines in docker-bake
Change-Id: I5488301543bfff21279b6c0b1aae841574efee95
2024-08-13 16:05:46 -07:00
Bobby R. Bruce
3640559a12 misc,tests: Fix compiler tests (add missing ,) (#1459)
2024-08-13 06:54:12 -07:00
Alexander Richardson
f6f547fb62 arch-arm: Fix incorrect behaviour of VFNMS and VFNMA (#1420)
This was found while comparing a diverging execution against QEMU traces
and checking for the first mismatched program counter. Fortunately this
was caused by a branch shortly after this incorrect computation but still
took a long time to track down.

There are two issues here: the decoder had inverted the cases for *S and
*A, and the sign bit was wrong for VFN*.
2024-08-13 09:05:52 +01:00
Matthew Poremba
c359b53a19 arch-vega: Update microscaling format scaling and denorm handling (#1451)
This PR has 3 commits:
- Update scaling methods to scale by multiplication or division when
upcasting or downcasting respectively.
- Preserve the sign when a microscaling conversion results in NaN or
infinity to match hardware.
- Rework rounding to handle cases where conversion results in a denormal
number in the output type so that the value is correct.
2024-08-12 07:00:26 -07:00
Matthew Poremba
7d46c50663 arch-vega: Swizzle multi-dword scratch requests (#1445)
Scratch memory requests that are larger than one dword are using a
different memory layout than global instructions. Rather than being
placed contiguously, each dword is interleaved 64 lanes * 4 bytes away
as described in Section 9.1.5.2. "Swizzled Buffer Addressing" in the
MI300 specification. This was verified by comparing MI300 output (which
uses scratch_ instructions) with MI200 (which uses buffer instructions).
MI300 FashionMNIST bs=1 now matches CPU reference.

This requires several changes to the instruction implementations:
- For stores, data in the GPUDynInst can be swizzled before the data is
written to memory. This is easy to do using a helper method. This is
done in the template<int N> variant of initMemWrite. To use this, x2
stores are changed to use template<int N> rather than loading a U64. The
swizzle function is renamed to swizzleAddr to avoid confusion with
swizzleData.
- For loads, data is unswizzled in completeAcc when writing register
values. This is not as easy to implement as a helper and is thus
implemented for the three load instructions that load more than one
dword.
- Accessing swizzled data requires at least one packet per dword. A new
GPU memory helper is added to create these packets for scratch requests
specifically. This is called in the template<int N> variant of
initMemRead / initMemWrite. Loads and stores of x2 are changed to use
this variant instead of accessing a U64.

The GPUDynInst status vector restrictions are increased to allow for
swizzled x4 accesses. For simplicity this does not currently support
misaligned swizzled accesses and will panic upon seeing such a case.

Change-Id: Ic686c51e28e0af029a043d5a5b3d4069f2cb94f9
2024-08-12 06:58:48 -07:00
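Illustration for 7d46c50663 (not part of the commit): a minimal Python sketch of the address layout the message describes, assuming each dword of a lane's multi-dword scratch access sits 64 lanes * 4 bytes after the previous one. The function names and the per-lane base address are hypothetical and do not come from the gem5 sources.

```python
WAVEFRONT_LANES = 64  # lanes per wavefront, as stated in the commit message
DWORD_BYTES = 4       # bytes per dword


def contiguous_dword_addrs(lane_base_addr, num_dwords):
    """Addresses of a lane's dwords when they are placed back to back."""
    return [lane_base_addr + k * DWORD_BYTES for k in range(num_dwords)]


def swizzled_dword_addrs(lane_base_addr, num_dwords):
    """Addresses of a lane's dwords when each one is interleaved
    64 lanes * 4 bytes = 256 bytes after the previous one."""
    stride = WAVEFRONT_LANES * DWORD_BYTES
    return [lane_base_addr + k * stride for k in range(num_dwords)]


if __name__ == "__main__":
    # An x2 access for a lane whose first dword lives at 0x1000.
    print([hex(a) for a in contiguous_dword_addrs(0x1000, 2)])
    print([hex(a) for a in swizzled_dword_addrs(0x1000, 2)])
```

Because consecutive dwords of one lane are no longer adjacent, a single packet cannot cover them, which matches the commit's note that swizzled accesses need at least one packet per dword.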
Matthew Poremba
62a2c09d4b arch-vega: Rework rounding for microscaling conversions
The current implementation does not correctly convert subnormal numbers
(numbers that fill the underflow gap around zero in floating-point
arithmetic). This commit reworks the rounding code to get correct
results.

First, the min_exp is set to 0, which allows numbers to become
subnormal when rounding. Second, the rounding code now uses something
closer to "GRS" rounding (guard, round, sticky). These bits represent the
first bit removed when rounding to a smaller type, the second bit
removed, and whether any of the other bits removed are one. More details
can be found in the code comments.

Change-Id: Idcd2f1e4383e4012fc3abf73b1f73c847d44f67b
2024-08-10 10:23:07 -07:00
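Illustration for 62a2c09d4b (not part of the commit): a small Python sketch of guard/round/sticky rounding on an integer mantissa. It assumes round-to-nearest-even as the rounding mode, which the commit message does not state; the message only says the code uses "something closer to" GRS rounding. The function name is hypothetical.

```python
def round_grs(mantissa, shift):
    """Drop the low `shift` bits of a non-negative integer mantissa.

    guard  = first (most significant) bit removed
    round  = second bit removed
    sticky = 1 if any remaining removed bit is set
    Rounding mode here is round-to-nearest-even (an assumption).
    """
    if shift <= 0:
        return mantissa
    kept = mantissa >> shift
    guard = (mantissa >> (shift - 1)) & 1
    round_bit = (mantissa >> (shift - 2)) & 1 if shift >= 2 else 0
    if shift >= 3:
        sticky = int((mantissa & ((1 << (shift - 2)) - 1)) != 0)
    else:
        sticky = 0
    # Round up when above the halfway point, or exactly halfway with an
    # odd kept value (ties go to even).
    if guard and (round_bit or sticky or (kept & 1)):
        kept += 1
    return kept


if __name__ == "__main__":
    for m in (0b1001, 0b1010, 0b1011, 0b1110):
        print(bin(m), "->", round_grs(m, 2))
```

A caller would still have to handle the case where rounding up carries out of the mantissa width and bumps the exponent; that step is left out of this sketch.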
Matthew Poremba
bdba981753 arch-vega: Preserve sign of NaN/Inf for microscaling types
The implementation of microscaling formats uses the Open Compute Project
specification which includes a sign bit for NaN and infinity. This
should be preserved when a conversion results in NaN or infinity.

Change-Id: Id9e99324c6486e256c699016aff301d5f06814d5
2024-08-10 10:23:07 -07:00
Matthew Poremba
c1251f51c1 arch-vega: Introduce two scaling methods for microscaling types
Currently there is only a scale() method which multiplies a microscaling
type by an int8 value. This should only be applied when upcasting to
a larger type after conversion to match hardware. When downcasting to a
smaller type, the scaling method should divide by the int8 value before
conversion.

This commit adds both scaling methods.

Change-Id: Ibafa8caa389cde4df609e536cd53bd2289959420
2024-08-10 10:23:07 -07:00
Robert Hauser
e980780efd arch-riscv: Extend wfi behavior (#1364)
At the moment, a hart does not halt if there are pending interrupts.
However, an implementation can also consider the enable status of the
individual interrupts, i.e., a halted hart would only resume if there
are locally enabled pending interrupts. This commit introduces this
behavior. The wfi behavior is controlled by the new configuration
variable wfi_pending_resume of RiscvISA.

Change-Id: I316239f9732c6e73e6ad692491bca08d773dd995

---------

Signed-off-by: Robert Hauser <robert.hauser@uni-rostock.de>
2024-08-09 11:28:15 -07:00
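Illustration for e980780efd (not part of the commit): a Python sketch of the two wake-up policies the message contrasts. The `mip`/`mie` arguments stand for the pending and enable bitmasks. The polarity of `wfi_pending_resume` is an assumption made here (True keeps the original behavior of resuming on any pending interrupt); the commit message does not say which value selects which policy.

```python
def wfi_should_resume(mip, mie, wfi_pending_resume):
    """Decide whether a hart halted by wfi should resume.

    mip: bitmask of pending interrupts
    mie: bitmask of locally enabled interrupts
    wfi_pending_resume: assumed polarity, True means any pending
        interrupt resumes the hart, False means only pending
        interrupts that are also enabled do.
    """
    if wfi_pending_resume:
        return mip != 0
    return (mip & mie) != 0


if __name__ == "__main__":
    pending_only = 0b1000  # one interrupt pending, none enabled
    print(wfi_should_resume(pending_only, 0b0000, True))
    print(wfi_should_resume(pending_only, 0b0000, False))
```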
Marleson Graf
b8001a861b mem-ruby,sim-se: Clear LL/SC locks after functional writes (#1404)
Functional writes atomically update all copies of a data block, so they
should invalidate any pending LL/SC locks, just like a conventional
write would.

Change-Id: Ic79d2d8d24901f1b6a2ce81dc0e2decc84c0ebbc
2024-08-09 09:30:37 -07:00
Bobby R. Bruce
8593f69f0a util: Fix MongoDB script requirements.txt (#1426)
Dependency Bot appears to have had difficulty with this file:
https://github.com/gem5/gem5/security/dependabot/29

This PR:

1. Removes the weird "```" which could not be parsed.
2. Ups PyMongo to a more secure version.
2024-08-08 13:01:29 -07:00
MMysore2
33e3bc4ff1 Updating Traffic Generators (#1416)
Added documentation for `strided_generator.py` and
`strided_generator_core.py`.

Updated clarity of documentation for `linear_generator.py`,
`linear_generator_core.py`, `random_generator.py`, and
`random_generator_core.py`.

Made `max_addr` exclusive instead of inclusive for strided and linear
traffic generation in `strided_gen.cc` and `linear_gen.cc`.
2024-08-08 12:46:10 -07:00
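Illustration for 33e3bc4ff1 (not part of the commit): a Python sketch of what an exclusive `max_addr` means for a linear address stream. The function and its wrap-around rule are hypothetical and are not taken from `linear_gen.cc`; the sketch assumes `max_addr > min_addr` and only shows that `max_addr` itself is never issued.

```python
def linear_addrs(min_addr, max_addr, block_size, count):
    """Generate `count` addresses in [min_addr, max_addr), wrapping
    back to min_addr once the next address would reach max_addr."""
    addrs = []
    addr = min_addr
    for _ in range(count):
        addrs.append(addr)
        addr += block_size
        if addr >= max_addr:  # exclusive upper bound
            addr = min_addr
    return addrs


if __name__ == "__main__":
    print(linear_addrs(0, 256, 64, 6))
```

With an inclusive bound the same parameters would also issue address 256, so the range would cover one block more than `max_addr - min_addr` bytes.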
Matthew Poremba
85c48a36ec dev-amdgpu: Fix issues found by address sanitizer (#1430)
These commits primarily fix the SDMA engine which was (1) using pointer
arithmetic on a variable returned by new and then attempting to free the
modified pointer and (2) using a buffer after it was freed due to the
DMA device calling the completion event before Ruby actually completed.

Some minor fixes are included: stop using an uninitialized value as packet
context and stop using the same request pointer for two separate packets
for GPU invalidations.
2024-08-08 11:14:50 -07:00
Ivana Mitrovic
ba0c3cc29a misc: Update GitHub badge links (#1428)
Change-Id: Iaead9f6146a90c9b2a671b9b78a318869ca739e6
2024-08-08 08:44:26 -07:00
Yangyu Chen
ce07203c5f arch-riscv: use sign-extend for all address generation (#1316)
In gem5, we use the same code base for RISC-V 32 and 64.

However, if we need to allow modifiable XLEN control on CSR.mstatus in
the future, we should follow the RISC-V ISA manual and sign-extend all
the register results, including PC and GPR. If this feature is
implemented, the simulator needs to handle user-mode in RV32 while
CSR.SATP is set to Sv39. In this case, 0x80000000 and 0xffffffff80000000
are different addresses in the 64-bit S-Mode perspective, but they are the
same in the 32-bit U-Mode perspective. We should avoid this wrong behavior
before we implement this feature.

Thus, we need to sign-extend the results of all the addresses, including
the PC and memory addresses, which currently use zero-extend. As
specified in the RISC-V ISA manual, we use zero-extend in narrow XLEN
mode for the physical address implemented in TLB.

Changes based on spec:
1. Sign-extend narrow XLEN:
https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-b7a445a-2024-07-02/src/machine.adoc?plain=1#L567
2. Zero-extend physical address:
https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-b7a445a-2024-07-02/src/supervisor.adoc?plain=1#L1670

Signed-off-by: Yangyu Chen <cyy@cyyself.name>
2024-08-08 08:41:35 -07:00
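Illustration for ce07203c5f (not part of the commit): a Python sketch of the difference between sign-extending and zero-extending a 32-bit value into a 64-bit register, using the 0x80000000 example from the message. The helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits=32):
    """Sign-extend the low `bits` bits of `value` to 64 bits."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & MASK64


def zero_extend(value, bits=32):
    """Zero-extend the low `bits` bits of `value` to 64 bits."""
    return value & ((1 << bits) - 1)


if __name__ == "__main__":
    addr32 = 0x80000000
    print(hex(sign_extend(addr32)))
    print(hex(zero_extend(addr32)))
```

The two results differ only when bit 31 is set, which is exactly the case where a 32-bit U-Mode address and its 64-bit S-Mode view could disagree.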
Matt Sinclair
86f7fae86b gpu-compute: fix GPU TLB outstandingReqs vs. associativity (#1431)
The GPU TLB maxOutstandingReqs field gets limited by the associativity.
In the current setup, this means that the max outstanding requests is 32
even though the setup is for 64 entries. Update the associativity to be
64 entries.

Change-Id: I2104e4647d97bf4d1cf5ac447e38ad6ac6a1a0d8
2024-08-07 21:37:36 -05:00
Matthew Poremba
84fedecafe gpu-compute: Update Requests for invalidations
The SQC and TCC invalidations share a Request pointer which they both
modify. This can cause some problems, so use a different request pointer
for each invalidate. The setContext call is also removed as the value
being assigned to it is uninitialized.

Change-Id: I82ea7aa44a4f4515c1560993caa26cc6a89355af
2024-08-07 14:37:49 -07:00
Matthew Poremba
db0d5f19cf dev-amdgpu: Add cleanup events for SDMA
SDMA packets which use dmaVirtWrites call their completion event before
the write takes place in the Ruby protocol. This causes a use-after-free
issue that corrupts random memory locations, leading to random errors. This
commit adds a cleanup event for each packet that uses DMA and sets the
cleanup latency as 10000 ticks. In atomic mode, the writes complete
exactly 2000 ticks after the completion event is called and therefore a
fixed latency can be used. This is not tested with timing mode, which
does not work with GPUFS at the moment, so a warning is added to give an
idea where to look in case the same issue occurs once timing mode is
supported.

Change-Id: I9ee2689f2becc46bb7794b18b31205f1606109d8
2024-08-07 14:37:49 -07:00
Matt Sinclair
03ddd0b75f gpu-compute: fix GPU TLB outstandingReqs vs. associativity
The GPU TLB maxOutstandingReqs field gets limited by the associativity.
In the current setup, this means that the max outstanding requests is
32 even though the setup is for 64 entries.  Update the associativity
to all 64 entries.

Change-Id: I2104e4647d97bf4d1cf5ac447e38ad6ac6a1a0d8
2024-08-07 16:16:01 -05:00
Matthew Poremba
0d0b68266c dev-amdgpu: Fix bad free in SDMA
The SDMA engine copies data in chunks. It currently uses the pointer
returned from new[] and manipulates it using pointer arithmetic. This
modified pointer is then passed to the completion function which deletes
the pointer. Since it is not the original pointer allocated by new[]
this triggers issues in ASAN.

Change-Id: I03ccf026633285e75005509445c62fcbda8eb978
2024-08-07 12:54:45 -07:00
Saili Karkare
bd228af5cf Updating hex addr printing (#1385)
This change updates the addresses that are printed when the TrafficGen
DebugFlag is enabled. Previously, hex strings were printed without a
preceding 0x. This change adds the prefix to distinguish between decimal
and hex.
2024-08-07 02:31:21 -07:00
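Illustration for bd228af5cf (not part of the commit): a short Python sketch of why the 0x prefix matters. gem5's debug output is produced in C++, so this only mirrors the idea with Python's format specification; the function name is hypothetical.

```python
def format_addr(addr):
    """Format an address as hex with a leading 0x."""
    return f"{addr:#x}"


if __name__ == "__main__":
    addr = 4096
    print(f"{addr:x}")        # no prefix: reads like the decimal number 1000
    print(format_addr(addr))  # prefixed: unambiguous hex
```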
Bobby R. Bruce
811e8c0fb4 util-docker,tests: Up clang support: >=v10 (#1415)
The compiler tests are failing due to a compile bug in Clang 7:
https://github.com/gem5/gem5/actions/runs/10170081794

Ubuntu 20.04 APT installs v10 by default (i.e., with `apt install
clang`), and this is the oldest LTS Ubuntu version. It therefore seems
sensible to drop support for older (<v10) versions of clang.
2024-08-07 00:38:08 -07:00
Bobby R. Bruce
eabb625870 util-docker,tests: Up clang support: >=v10
The compiler tests are failing due to a compile bug in Clang 7:
https://github.com/gem5/gem5/actions/runs/10170081794

Ubuntu 20.04 APT installs v10 by default (i.e., with `apt install
clang`), and this is the oldest LTS Ubuntu version. It therefore seems
sensible to drop support for older (<v10) versions of clang.

Change-Id: I4c48223b80306422beac1464c09f03397c156ba1
2024-08-07 00:35:34 -07:00
Bobby R. Bruce
bbc49aa914 misc: Stable merge to dev (#1424)
2024-08-06 21:02:35 -07:00
Bobby R. Bruce
8885d60399 misc: Merge branch stable branch into develop
Change-Id: Ie391ea7eeb86a6e862e910e7d150edde0059cc54
2024-08-06 21:02:06 -07:00
Bobby R. Bruce
bb290aaff5 misc: Change devcontainer for isca tutorial and bootcamp (#1282)
2024-08-06 19:45:48 -07:00
Robert Hauser
ba704a01b2 misc: Fix typo in multisim code snippet (#1417)
2024-08-06 13:54:16 -07:00
Bobby R. Bruce
bd53bad5cf mem: Fix "Need is_secure arg" prefetcher crash (#1374)
This PR fixes the "Need is_secure arg" crash that occurs when using
IndirectMemoryPrefetcher, SignaturePathPrefetcher,
SignaturePathPrefetcherV2, STeMSPrefetcher, and PIFPrefetcher. This was
done by changing some variables to have the type AssociativeSet<...>
instead of AssociativeCache<...> and adding in "false" or an existing
value for the value of the secure bit in some function calls. Further
changes may be needed to move away from hard-coding values.
2024-08-06 13:01:40 -07:00
Erin Le
6dbe2bca7b mem: Add constexprs to spatio_temporal_memory_streaming.cc
Change-Id: I6fa3d9f9a9d89d59d9ec1fc97c152bea3059f87d
2024-08-06 00:06:38 +00:00
Erin Le
f325949ba5 mem: remove stray comment from signature_path_v2.cc
Change-Id: I5ddd2ddd6a9cb4fb032b48870c5ef6b0dc9533c0
2024-08-05 23:10:10 +00:00
Erin Le
2db021b27b mem: Comment removal and adding constexpr to is_secure bools
This commit removes some comments and adds constexpr in front
of "bool is_secure..." in pif.cc, signature_path.cc, and
signature_path_v2.cc

Change-Id: Icafe1d7c97d1d3fbf6abc12ba87ebb596255b96f
2024-08-05 15:43:40 -07:00
Erin Le
9adf44ed1f mem: use is_secure instead of hardcoded false in prefetcher crash
This modifies the crash fix so that the function calls that were
modified use a local variable called `is_secure` instead of a
hardcoded `false`. Some of these existed previously so it made
more sense to use them, while others were newly added to mark
where the code might need to be changed later.

Change-Id: I0c0d14b74f0ccf70ee5fe7c8b01ed0266353b3c1
2024-08-05 15:43:40 -07:00