Without the "gem5/gpu" directory specified, this test attempted to run
the entire test suite, which caused the daily and weekly tests to fail.
This change specifies the directory.
After merging the old personal gem5 repository with the stable version
v24, I tried to run the project inside the `.devcontainer` environment.
During the image build process, I encountered the following error:
```sh
[7683 ms] Start: Run in container: /bin/sh -c ./.devcontainer/on-create.sh
fatal: detected dubious ownership in repository at '/workspaces/gem5'
To add an exception for this directory, call:
git config --global --add safe.directory /workspaces/gem5
[7724 ms] onCreateCommand failed with exit code 128. Skipping any further user-provided commands.
```
This error occurred due to an ownership permission problem, which I
resolved by adding the following line.
* Removes the "docker-compose.yaml" in favor of "docker-bake.hcl". This
uses the `docker buildx` tool, which has the advantage of enabling
multi-platform builds where desired. By default all images are built
for `linux/arm64`, `linux/amd64` and `linux/riscv64`, with the
exception of the GPU images, where only `linux/amd64` makes sense.
* Remove unused/older Docker build targets (these can easily be re-added
but they were not regularly built and have no current use).
* Update "README.md" to better describe these Dockerfiles and how they
are built.
* Simplify GCC and Clang compiler images. Each uses the Ubuntu 24.04 All
Deps image as a base then specializes the compiler on top.
* To simplify things, all compiler versions are built from 24.04. This
means **narrowing the supported versions to the ranges GCC v10 to v14
and Clang v14 to v18**.
* Fix some bugs in the "docker-bake.hcl" thus ensuring all targets may
be built from it.
* Clean up the systemc and sst images: reducing their size and building
them off the common Ubuntu 24.04 base image.
Some syscalls were incorrectly using 64-bit
integers instead of VPtr guest pointers,
causing parameter value corruption. This
commit addresses this issue.
Change-Id: If9e27a7c776b802dda18979d1a83a76c23557359
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
As with off_t, some syscalls were using
incorrectly sized parameters in place of a guest-defined
size_t. This commit changes the signature of said
syscalls and adds the size_t typedef to the
arch-dependent Linux OSs.
Change-Id: Iece43814971a8e6275d25f6789e41528d241d1f4
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Some system calls were using incorrect sizing for
offset parameters, which was causing the ABI to pass
wrong values due to size mismatches. One such syscall
is lseek, which in the Arm syscall table was
incorrectly marked as llseek, which does not exist
in aarch64 Linux. In addition, the off_t alias for
general Linux was changed from an unsigned to a
signed type, to accurately reflect the behaviour
in the real-life Linux operating system.
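
To illustrate the kind of size and signedness mismatch described above,
here is a minimal Python sketch (not gem5 code). It assumes, purely for
illustration, a guest whose off_t is a signed 32-bit integer and whose
raw argument register is read by the simulator as an unsigned 64-bit
value. The helper name `as_signed` is hypothetical.

```python
def as_signed(value: int, bits: int) -> int:
    """Reinterpret the low `bits` bits of `value` as two's complement."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


# Raw register contents for a guest call such as lseek(fd, -1, SEEK_END),
# assuming a 32-bit signed off_t in the guest.
raw_register = 0xFFFFFFFF

wrong_offset = raw_register                 # read as an unsigned 64-bit value
right_offset = as_signed(raw_register, 32)  # read as the guest's signed off_t

print(wrong_offset)  # 4294967295
print(right_offset)  # -1
```

Reading the argument with the guest's own width and signedness is what
keeps a negative seek offset from turning into a large positive one.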
Change-Id: Iada4b66a8933466c162ba9ec901dbdae73c73a18
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
For some Arm syscalls, this commit adds either the implementation or
the ignoreFunc to the corresponding entry in the syscall table. These
syscalls were required in order to test the fix for the incorrect
parameter size bug in SE mode.
Change-Id: Ifc6d87e2decf1bf96ecd81de6690f92927377bf8
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This commit adds the --param option to the starter_se
configuration script for the Arm ISA. This is in order
to support attaching remote debugger sessions.
Change-Id: I2d8cc9f677f731948872003cca6066d1072ad570
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
A new host tag `gcn_gpu` has been added. This allows for selection of
those GPU tests which depend upon the gcn-gpu docker image to run.
In addition to this, the square GPU tests have been moved to the CI
tests. This ensures some GPU code is compiled and run on every PR.
Since PR #1316, we use sign-extend for all address generation, including
PC, to match the ISA specification for modifiable XLEN. However, when we
set a breakpoint using remote GDB, our address is not sign-extended.
This causes the breakpoint to be set at the wrong address, as specified
in Issue #1463. This PR fixes the issue by sign-extending the address
when setting a breakpoint. This also matches the RISC-V ISA
Specification that "must sign-extend results to fill the entire widest
supported XLEN in the destination register."
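
The effect of the missing sign extension can be sketched in a few lines
of Python. This is an illustration only, not the gem5 remote GDB code.
It assumes a hart running with a 32-bit XLEN inside a 64-bit address
representation, and the function name `sign_extend` is hypothetical.

```python
def sign_extend(value: int, from_bits: int, to_bits: int = 64) -> int:
    """Sign-extend the low `from_bits` bits of `value` to `to_bits` bits.

    The result is returned as an unsigned `to_bits`-wide integer.
    """
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value & (1 << (from_bits - 1)):
        # Fill every bit above `from_bits` with ones.
        value |= ((1 << to_bits) - 1) & ~low_mask
    return value


gdb_addr = 0x80000000             # address as sent by the debugger
pc = sign_extend(gdb_addr, 32)    # address as generated by the simulator

print(hex(gdb_addr))  # 0x80000000
print(hex(pc))        # 0xffffffff80000000
```

Because the two values differ, a breakpoint stored under the
unextended address would never compare equal to the sign-extended PC.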
Change-Id: I9b493bf8ad5b1ef45a9728bb40fc5e38250fe9c3
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
`get_default_kernel_root_val()` now prioritizes
the explicit disk_device passed by the user over the
default implemented by the board.
Also adjusts the syntax for selecting this value in
`set_kernel_disk_workload()` for consistency.
Change-Id: Icddcf438f5b96c2288c3cc608782f191df2c394e
After a lot of debugging and comparing traces I noticed that vrintp was
giving different results from QEMU. An input of 0x3f800000 (1.0) was
being passed to the fplib helpers as (uint32_t)1 which has a completely
different floating-point interpretation and the result was therefore
completely wrong.
I've fixed this as well as all remaining implicit float-to-int
conversions in the ARM instruction execution. There are more
-W(implicit-)float-conversion warnings in the other executors, but for
now this fixes the issue I was seeing.
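
The difference between reinterpreting the bits of a float and
converting its value to an integer can be reproduced in Python with the
standard `struct` module. This is a sketch of the failure mode only;
the helper names are hypothetical and this is not the fplib interface.

```python
import struct


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def float_to_bits(value: float) -> int:
    """Return the 32-bit IEEE 754 single-precision pattern of `value`."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


value = bits_to_float(0x3F800000)
print(value)                      # 1.0

# Intended: pass the raw bit pattern through unchanged.
print(hex(float_to_bits(value)))  # 0x3f800000

# Buggy: an implicit value conversion yields the integer 1, which a
# bit-pattern-based helper then reads as the smallest positive subnormal.
converted = int(value)
print(converted)                  # 1
print(bits_to_float(converted) == 2.0 ** -149)  # True
```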
Change-Id: Ifdeee745ca155d7f4504ac4c54235ac431acdeb9
This PR fixes the issues mentioned in #1448.
**Note that this contribution is the result of a joint collaboration
with @AbhishekUoR**
This PR introduces the following 4 changes:
1. It changes the addresses which are used to compute the stride to
cache line aligned addresses (the current version uses word aligned
addresses)
2. It correctly returns if the stride does not match (as opposed to
issuing prefetches using the new stride incorrectly)
3. It returns if the new stride is 0, indicating multiple reads from the
same cache line.
4. It removes code which is no longer necessary after the addition of
changes number 1 and 3.
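
The control flow described by changes 1 to 3 can be sketched as
follows. This is a simplified Python model, not the gem5 prefetcher:
the `StrideEntry` class, the `observe` function, and the 64-byte default
line size are all assumptions, and any confidence tracking a real
prefetcher performs is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StrideEntry:
    last_line_addr: int  # cache-line-aligned address of the last access
    stride: int = 0      # last observed stride, in bytes


def observe(entry: StrideEntry, addr: int, line_size: int = 64) -> Optional[int]:
    """Return the line address to prefetch, or None if nothing is issued.

    `line_size` is assumed to be a power of two.
    """
    # Change 1: compute strides from cache-line-aligned addresses.
    line_addr = addr & ~(line_size - 1)
    new_stride = line_addr - entry.last_line_addr

    # Change 3: a zero stride means another read from the same line.
    if new_stride == 0:
        return None

    entry.last_line_addr = line_addr

    # Change 2: on a stride mismatch, record the new stride and return
    # without prefetching.
    if new_stride != entry.stride:
        entry.stride = new_stride
        return None

    return line_addr + entry.stride


entry = StrideEntry(last_line_addr=0x0)
print(observe(entry, 0x44))  # None (first stride observed)
print(observe(entry, 0x48))  # None (same cache line)
print(observe(entry, 0x80))  # 192 (stride of 0x40 confirmed)
```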
Change-Id: Ic346d0e15df6d07e2b93289c8d6b89b4c2f45a34
---------
Co-authored-by: Abhishek Shailendra Singh <abs218@leigh.edu>
1. Builds on top of the Ubuntu 24.04 all-deps image.
2. Unify the download, build, install, and cleanup steps.
Change-Id: I4c2bf8e571dfd228f7df8372cda0f428de59af51
1. Uses the ubuntu-24.04_all-deps as the base image.
2. Unifies the build and cleanup into a single step, thus reducing the
size of the image.
Change-Id: I63b5dad2af0e8b1f6be8ad1f28321c743f36b2dc
These images won't work on, and it makes no sense to compile them for,
any platform other than X86. These are used in SE mode simulations where
the host platform matters.
Change-Id: I47405e930bf511fabcbc93d0b08ee2fb2c556869
1. Uses the all-dependencies image as the base image.
2. Has all compilers use Ubuntu 24.04.
Notes: This change implicitly changes our supported compilers to GCC v10
to v13 and Clang v14 to v18. This will be fully incorporated into the
project later.
Change-Id: Id8e2141ea64a34c7e3532605f6ecb7d9ccb76951
This was found while comparing a diverging execution against QEMU traces
and checking for the first mismatched program counter. Fortunately this
was caused by a branch shortly after this incorrect computation but still
took a long time to track down.
There are two issues here: the decoder had inverted the cases for *S and
*A, and the sign bit was wrong for VFN*.
This PR has 3 commits:
- Update scaling methods to scale by multiplication or division when
upcasting or downcasting respectively.
- Preserve the sign when a microscaling conversion results in NaN or
infinity to match hardware.
- Rework rounding to handle cases where conversion results in a denormal
number in the output type so that the value is correct.
Scratch memory requests that are larger than one dword are using a
different memory layout than global instructions. Rather than being
placed contiguously, each dword is interleaved 64 lanes * 4 bytes away
as described in Section 9.1.5.2. "Swizzled Buffer Addressing" in the
MI300 specification. This was verified by comparing MI300 output (which
uses scratch_ instructions) with MI200 (which uses buffer instructions).
MI300 FashionMNIST bs=1 now matches CPU reference.
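
As a rough illustration of the layout difference, the following Python
sketch lists the byte addresses of the dwords of one multi-dword
access under each scheme. It is one reading of the description above
(a fixed stride of 64 lanes * 4 bytes between consecutive dwords of the
same lane), not the gem5 helper; all names are hypothetical.

```python
from typing import List

WAVEFRONT_LANES = 64  # lanes per wavefront, as stated above
DWORD_BYTES = 4


def contiguous_dword_addrs(first_dword_addr: int, num_dwords: int) -> List[int]:
    """Global-style layout: the dwords of one access are adjacent."""
    return [first_dword_addr + i * DWORD_BYTES for i in range(num_dwords)]


def swizzled_dword_addrs(first_dword_addr: int, num_dwords: int) -> List[int]:
    """Scratch-style layout: dwords are 64 lanes * 4 bytes apart."""
    stride = WAVEFRONT_LANES * DWORD_BYTES  # 256 bytes
    return [first_dword_addr + i * stride for i in range(num_dwords)]


# An x4 access starting at 0x1000.
print([hex(a) for a in contiguous_dword_addrs(0x1000, 4)])
# ['0x1000', '0x1004', '0x1008', '0x100c']
print([hex(a) for a in swizzled_dword_addrs(0x1000, 4)])
# ['0x1000', '0x1100', '0x1200', '0x1300']
```

Under this reading, every dword of a swizzled access lands 256 bytes
from the previous one, so each has to be fetched separately.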
This requires several changes to the instruction implementations:
- For stores, data in the GPUDynInst can be swizzled before the data is
written to memory. This is easy to do using a helper method. This is
done in the template<int N> variant of initMemWrite. To use this, x2
stores are changed to use template<int N> rather than loading a U64. The
swizzle function is renamed to swizzleAddr to avoid confusion with
swizzleData.
- For loads, data is unswizzled in completeAcc when writing register
values. This is not as easy to implement as a helper and is thus
implemented for the three load instructions that load more than one
dword.
- Accessing swizzled data requires at least one packet per dword. A new
GPU memory helper is added to create these packets for scratch requests
specifically. This is called in the template<int N> variant of
initMemRead / initMemWrite. Loads and stores of x2 are changed to use
this variant instead of accessing a U64.
The GPUDynInst status vector restrictions are increased to allow for
swizzled x4 accesses. For simplicity this does not currently support
misaligned swizzled accesses and will panic upon seeing such a case.
Change-Id: Ic686c51e28e0af029a043d5a5b3d4069f2cb94f9
The current implementation does not correctly convert subnormal numbers
(numbers that fill the underflow gap around zero in floating-point
arithmetic). This commit reworks the rounding code to get correct
results.
First, the min_exp is set to 0 which allows for numbers to become
subnormal when rounding. Second, the rounding code now uses something
closer to "GRS" rounding (guard, round, sticky). These represent the
first bit removed when rounding to a smaller type, the second bit
removed, and whether any of the remaining removed bits are one. More
details can be found in the code comments.
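
For readers unfamiliar with the terminology, here is a textbook
round-to-nearest-even sketch using guard, round, and sticky bits on an
integer mantissa. It is not the gem5 implementation, which the text
above only describes as "something closer to" GRS rounding; the function
name and interface are hypothetical.

```python
def round_grs(mantissa: int, drop_bits: int) -> int:
    """Drop the low `drop_bits` bits of `mantissa`, rounding to nearest even.

    The caller must renormalize if the result overflows the target width.
    """
    if drop_bits <= 0:
        return mantissa

    kept = mantissa >> drop_bits
    # Guard: the first (most significant) bit removed.
    guard = (mantissa >> (drop_bits - 1)) & 1
    # Round: the second bit removed, if there is one.
    round_bit = (mantissa >> (drop_bits - 2)) & 1 if drop_bits >= 2 else 0
    # Sticky: whether any of the remaining removed bits is one.
    sticky_mask = (1 << max(drop_bits - 2, 0)) - 1
    sticky = int((mantissa & sticky_mask) != 0)

    # Round up when above the halfway point, or exactly halfway and odd.
    if guard and (round_bit or sticky or (kept & 1)):
        kept += 1
    return kept


print(round_grs(0b1011, 2))  # 3  (2.75 rounds up)
print(round_grs(0b1010, 2))  # 2  (2.5 ties to even)
print(round_grs(0b1110, 2))  # 4  (3.5 ties to even)
```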
Change-Id: Idcd2f1e4383e4012fc3abf73b1f73c847d44f67b
The implementation of microscaling formats uses the Open Compute Project
specification which includes a sign bit for NaN and infinity. This
should be preserved when a conversion results in NaN or infinity.
Change-Id: Id9e99324c6486e256c699016aff301d5f06814d5
Currently there is only a scale() method which multiplies a microscaling
type by an int8 value. This should only be applied when upcasting to
a larger type after conversion to match hardware. When downcasting to a
smaller type, the scaling method should divide by the int8 value before
conversion.
This commit adds both scaling methods.
Change-Id: Ibafa8caa389cde4df609e536cd53bd2289959420
At the moment, a hart does not halt if there are pending interrupts.
However, an implementation can also consider the enable status of the
individual interrupts, i.e., a halted hart would only resume if there
are locally enabled pending interrupts. This commit introduces this
behavior. The wfi behavior is controlled by the new configuration
variable wfi_pending_resume of RiscvISA.
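
The two resume conditions can be contrasted with a small Python
sketch. It models the pending and enable bits as plain integers in the
spirit of the RISC-V `mip` and `mie` registers. The function and its
`require_enabled` flag are hypothetical stand-ins; how that flag maps
onto the values of wfi_pending_resume is not stated here and is left
open.

```python
def hart_should_resume(mip: int, mie: int, require_enabled: bool) -> bool:
    """Decide whether a hart halted by wfi should resume.

    mip: bitmask of pending interrupts.
    mie: bitmask of locally enabled interrupts.
    require_enabled: if True, only pending interrupts that are also
        enabled count; if False, any pending interrupt counts.
    """
    pending = mip
    if require_enabled:
        pending &= mie
    return pending != 0


# One interrupt is pending but not enabled.
print(hart_should_resume(0b1000, 0b0000, require_enabled=False))  # True
print(hart_should_resume(0b1000, 0b0000, require_enabled=True))   # False
# The same interrupt, now enabled.
print(hart_should_resume(0b1000, 0b1000, require_enabled=True))   # True
```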
Change-Id: I316239f9732c6e73e6ad692491bca08d773dd995
---------
Signed-off-by: Robert Hauser <robert.hauser@uni-rostock.de>
Functional writes atomically update all copies of a data block, so they
should invalidate any pending LL/SC locks, just like a conventional
write would.
Change-Id: Ic79d2d8d24901f1b6a2ce81dc0e2decc84c0ebbc
Dependency Bot appears to have had difficulty with this file:
https://github.com/gem5/gem5/security/dependabot/29
This PR:
1. Removes the weird "```" which could not be parsed.
2. Ups PyMongo to a more secure version.