Hashing the `src` directory is too costly, with some runners reaching
timeout. Also, as we only have 10GB of cache, it makes sense to have
more coarse-grained caching.
It makes considerably more sense to set the `sim_quantum` parameter in
the Processor. The `_pre_instantiate` functions now make this possible.
It makes much more sense for the Root object to be created within the
board and passed where required. Creating it in the Simulator class is
not required.
For this to work, the signature of the `_pre_instantiate` function in
`AbstractBoard` has been updated to return the Root object.
This is deprecated in favor of the board determining whether the
simulation is FS or SE. Usually this will be contingent on which
`set_workload` function has been called. Regardless, it is the board's
responsibility. The user should no longer need to declare this
explicitly.
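A minimal sketch of the idea, using a toy class rather than the real gem5 board: whichever workload setter is called records the mode, and the board answers the FS/SE question itself. `ToyBoard` and its internals are hypothetical, and the method names only mirror the `set_workload` naming pattern mentioned above; this is not the actual `AbstractBoard` implementation.

```python
from typing import Optional


class ToyBoard:
    """Hypothetical stand-in for a board; not the gem5 AbstractBoard API."""

    def __init__(self) -> None:
        # The mode is unknown until a workload has been set.
        self._is_fs: Optional[bool] = None

    def set_kernel_disk_workload(self, kernel: str, disk_image: str) -> None:
        # Booting a kernel from a disk image implies full-system (FS) mode.
        self._is_fs = True

    def set_se_binary_workload(self, binary: str) -> None:
        # Running a single binary implies syscall-emulation (SE) mode.
        self._is_fs = False

    def is_fullsystem(self) -> bool:
        # The board, not the user, decides between FS and SE.
        if self._is_fs is None:
            raise RuntimeError("No workload set; FS/SE mode is undetermined.")
        return self._is_fs
```

With this shape the user never passes an explicit FS/SE flag; asking for the mode before setting a workload is reported as an error rather than guessed.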
The Weekly GPU tests are failing due to a timeout. I found the testing
timeout was set to 5 hours; we have frequently been close to reaching
it, and recent changes to the tests have pushed us consistently over.
The two main things that appear to have caused this are:
~~1. Moving the X86_VEGA compilation into the same step as the running
of the tests.~~ (I take this back: the timeout is per-job, so it
shouldn't matter how work is divided among steps in the job. However,
keeping it separate does no harm, and merging the two steps did coincide
with failures occurring. I'll play it safe for now.)
2. Reducing the number of threads per GitHub Actions runner, thus
slowing job execution.
In addition, we've added more tests to this weekly GPU suite, though I
don't believe we have got to running these tests yet. The timeout
appears to always have been triggered before this.
This PR increases the timeout to 3 days and moves the compilation into a
separate step.
**Update:** The same changes were made for the Daily tests too, as they
appear to have the same problem.
Two faults:
1. You can't give descriptions in the docker-bake file for
single-platform builds. They must be in the Dockerfile.
2. The gpu docker image definition in docker-bake.hcl was not overriding
the "common" settings as previously thought. This was causing builds to
sometimes target the wrong platform, along with various other weird
bugs. This has been fixed in this patch.
Invalidate requests align to the system cache line size. This causes
problems if the GPU cache hierarchy's cache line size differs from the
system's, as the unaligned requests never return, leading to deadlock
on deferred dispatch.
This commit uses the cache line size from the GPU memory manager and
makes the cache line size there non-optional.
Tested with multiple RubySystems where CPU side was 64B and GPU side was
128B cache lines.
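To illustrate the mismatch, here is a small sketch of aligning the same address to a 64B and a 128B line, the sizes from the test setup above. The helper `align_down` and the example address are assumptions for illustration only, not gem5 code.

```python
def align_down(addr: int, line_size: int) -> int:
    """Round addr down to a multiple of line_size (a power of two)."""
    if line_size <= 0 or line_size & (line_size - 1):
        raise ValueError("line_size must be a positive power of two")
    return addr & ~(line_size - 1)


SYSTEM_LINE = 64   # CPU-side cache line size in bytes
GPU_LINE = 128     # GPU-side cache line size in bytes

addr = 0x1040  # arbitrary example address
sys_aligned = align_down(addr, SYSTEM_LINE)  # stays at 0x1040
gpu_aligned = align_down(addr, GPU_LINE)     # moves down to 0x1000

# An address aligned for the system is not necessarily aligned for the GPU.
mismatch = sys_aligned != gpu_aligned
```

Here an invalidate aligned with the system line size would target `0x1040`, which is not a line boundary for a 128B GPU cache.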
Vega10 is no longer officially supported by ROCm, and ROCm is starting
to use some packet types that are not supported. The Vega10 configs were
originally kept to allow users to use older disk images with newer gem5.
Going forward, the gem5 version and gem5-resources releases will be
required to match, to prevent lingering old configs.
As a replacement for vega10*.py, mi300.py or mi200.py should be used.
HIP examples, cookbook, and rodinia configs can be replaced with the
standard flow of building / obtaining the GPU application and running
using mi300.py or mi200.py as they do not require any input options and
therefore do not require changes to the disk image.
The clone3 syscall, implemented in commit 87e774c, is currently only
handled for x86-64 in gem5. Modern glibc versions use clone3 instead of
clone for process and thread creation (e.g. issue #1204).
This commit enables the clone3 syscall in riscv64 by adding the
corresponding handler call, as well as its arguments struct.
The FMAXV, FMINV, FMAXNMV, FMINNMV and ADDV instructions perform
recursive reduction. Different reduction methods lead to different
results when handling NaN values.
Reuse the template of `twoRegAcrossInstX`. Add one more option
`recursive` for recursive reduction.
Change-Id: I69e690ce7668baee818542d3ea463f7a5f269a69
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
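For the FMAXV/FMINV reduction-order change above, a small sketch shows why the order matters once NaN is involved. It uses a naive comparison-based max as a stand-in; this is an assumption for illustration and does not model the Arm FMAX/FMAXNM NaN rules or the `twoRegAcrossInstX` template.

```python
import math
from typing import Callable, List


def cmp_max(a: float, b: float) -> float:
    # Any comparison with NaN is False, so the result depends on operand order.
    return a if a > b else b


def sequential_reduce(op: Callable[[float, float], float],
                      values: List[float]) -> float:
    # Left-to-right reduction: ((v0 op v1) op v2) op v3 ...
    acc = values[0]
    for v in values[1:]:
        acc = op(acc, v)
    return acc


def recursive_reduce(op: Callable[[float, float], float],
                     values: List[float]) -> float:
    # Split in half, reduce each half, then combine the two results.
    if len(values) == 1:
        return values[0]
    half = len(values) // 2
    return op(recursive_reduce(op, values[:half]),
              recursive_reduce(op, values[half:]))


lanes = [3.0, 1.0, math.nan, 2.0]
seq = sequential_reduce(cmp_max, lanes)  # 2.0
rec = recursive_reduce(cmp_max, lanes)   # 3.0
```

The two orders disagree on the same input, which is why the reduction has to follow the order the architecture specifies.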
This commit fixes a bug in the viota instruction.
Two different instructions can refer to the same StaticInstPtr because
of how the decoder behaves, as shown in [this section of the
code](https://github.com/gem5/gem5/blob/stable/src/arch/riscv/decoder.cc#L98-L100).
So every first micro-op should reset the cnt variable in the macro-op.
Change-Id: Id311a05cfed41b01e16fd7256d9baa166aee49da
Co-authored-by: Jack Yung-Chen Lin <jack622@andestech.com>
This commit changes metric units (e.g. kB, MB, and GB) to binary units
(KiB, MiB, GiB) in various files. This PR covers files that were missed
by a previous PR that also made these changes.
This change adds MADT entries to the X86Board. Previously, the kernel in
full-system mode was complaining about an `ACPI BIOS Error (bug): Invalid
table length 0x24 in RSDT/XSDT (20190816/tbutils-291)`. This patch fixes
the invalid length and initializes all the tables correctly.
Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>
This PR does a simple refactoring of some partitioning policies. It
moves existing functionality into partitioning policy (PP) methods so
that they can be called multiple times throughout the simulation,
allowing dynamic adjustment of the partitioning scheme.
- Add `isExternalAbort()` in `AbortFault<T>` to determine external
abort.
- Add `virtual isExternalAbort()` in `ArmFault` so the method can be
used in base class.
- Set iss.ea by `isExternalAbort()`.
Change-Id: I01c22dc46958ab424b389af96d3c3b6243cbc671
The External Data Abort may not set TranMethod, which leads to an
assertion failure.
- Make `ArmFault::update` virtual.
- Implement override `update` in `AbortFault<T>` to set TranMethod.
Change-Id: I49e18799df8420b214b6059ffa756a13edf343d5
This will allow gem5 to configure the maximum capacity of a
partition dynamically during simulation, rather than
having it statically defined at construction time.
Change-Id: Ib55c9990a6bc2930abaf2438c13337acc643520f
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This way we only need to store one unsigned integer instead of
two. We also won't need to recompute the total number of cache blocks
once we adapt this policy to be dynamically modifiable.
Change-Id: Ia8cf906539d1891b6cdb821f2a74628127dc68c6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Add decoder support and functionality for the AArch32 VCVTA, VCVTP,
VCVTN and VCVTM instructions. Both 16-bit and 32-bit variants are
supported.
Only the A32 encoding is supported.
Change-Id: I6ece0e1b779f9a7cc9d709894a49a7fdcda28373
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Replace std::uniform_*_distribution with custom code
to make random number generation in gem5 portable across
compilers.
Of note, FP random number generation was not uniformly
distributed, and this PR does not fix that issue.
Thanks to Chandana S. Deshpande (deshpande.s.chandana@gmail.com)
for uncovering the issue.
Co-authored-by: Arthur Perais <arthur.perais@univ-grenoble-alpes.fr>
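As an illustration of the portability point in the random number change above, the sketch below derives a bounded integer from raw 64-bit words using rejection sampling, so the result depends only on the raw generator output and not on a library's distribution implementation. Rejection sampling is one common technique chosen here as an assumption; this is not a description of the code gem5 actually uses, and `uniform_int` is a hypothetical helper.

```python
import random
from typing import Callable

U64_RANGE = 1 << 64  # number of distinct 64-bit words


def uniform_int(next_u64: Callable[[], int], lo: int, hi: int) -> int:
    """Return an integer in [lo, hi] built from raw 64-bit words."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    span = hi - lo + 1
    if span > U64_RANGE:
        raise ValueError("range does not fit in 64 bits")
    # Largest multiple of span that fits; words at or above it are rejected
    # so that every residue is equally likely.
    limit = U64_RANGE - (U64_RANGE % span)
    while True:
        x = next_u64()
        if x < limit:
            return lo + (x % span)


# Example raw source: Python's seeded Mersenne Twister.
rng = random.Random(42)
samples = [uniform_int(lambda: rng.getrandbits(64), 1, 6) for _ in range(1000)]
```

Given the same sequence of raw words, this mapping produces the same values on any platform, which is the property a library-defined distribution does not guarantee.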
This refactor attempts to homogenize all of RISC-V's vector
(macro/micro) instruction classes so that ELEN and VLEN are guaranteed
to be class attributes. Since both are constant, all instructions
receive them during decoding and pass them through to their vector base
class.
This allows the removal of VLEN from the PC state and also from some
constructor default parameters (solves issue #1207).
Change-Id: I6f0471004335f49b00b015c37e95dc7f9569e303
Moving the getRvType and getPrivilegeModeSet static methods into
RiscvISA::RemoteGDB as virtual methods allows classes derived from
RiscvISA::RemoteGDB to override them without changing many methods in
the base class.
Change-Id: I3cbb9cf1fdee4a298e903bb4a0a5683c042b749d
64kB, in these cases, will cast to 64KiB regardless. To improve
readability and understanding of these objects, this patch changes their
SI prefix (kB -> KiB).
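For reference, the difference between the two prefixes under their strict definitions can be worked out as below. The helper `to_bytes` is hypothetical and only illustrates the arithmetic; it is not how gem5 parses size strings.

```python
# Decimal (SI) and binary (IEC) multipliers, in bytes.
MULTIPLIERS = {
    "kB": 1000,
    "MB": 1000 ** 2,
    "GB": 1000 ** 3,
    "KiB": 1024,
    "MiB": 1024 ** 2,
    "GiB": 1024 ** 3,
}


def to_bytes(value: int, unit: str) -> int:
    """Convert a size with a unit suffix to bytes, strictly by definition."""
    return value * MULTIPLIERS[unit]


strict_kb = to_bytes(64, "kB")    # 64000 bytes
strict_kib = to_bytes(64, "KiB")  # 65536 bytes
gap = strict_kib - strict_kb      # 1536 bytes
```

Writing `64KiB` states the intended 65536 bytes directly instead of relying on `kB` being treated as a binary unit.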
System (Misc) register accesses are not the only trappable instructions.
We move the exception generation logic (generateTrap) from
MiscRegOp64 to the base ArmStaticInst class.