Commit Graph

20794 Commits

Author SHA1 Message Date
Hoa Nguyen
c3acfdc9b8 arch-riscv: Copy Misc Regs when swiching cpus (#479)
Misc Regs might contain rather important information about the state of
a core, e.g., information in CSR registers.

This patch enforces copying the CSR registers when switching cpus. The
bug and the proposed fix are reported here [1].

[1] https://github.com/gem5/gem5/issues/451

Change-Id: I611782e6e3bcd5530ddac346342a9e0e44b0f757

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-18 10:51:37 -07:00
Harshil Patel
7bd0b99635 tests: Changed percent atomics to 0 in memtest to fix daily test (#477) 2023-10-18 10:09:45 -07:00
Bobby R. Bruce
334df18dce arch-riscv: Add bootloader+kernel workload (#390)
Aims to boot OpenSBI + Linux kernel.
2023-10-18 09:17:12 -07:00
Bobby R. Bruce
e9fe9cb001 util: Improve GitHub Action runners: Enable KVM; Better Cleanup; Better Tooling (#470)
This PR adds the following the GitHub Actions runners:

1. Enables KVM to be run within docker containers within the VMs, if
permitted. Now, any Docker containers wanting to use KVM must create
containers with the `--device /dev/kvm` argument. This may make it hard
or impossible to utilize with GitHub Actions. Nonetheless it is enabled.
2. Improves the docker prune step (a cleanup step carried out after each
job) so it now removes all the Docker images in the VM.
3. Adds the "halt-helper.sh" script which automatically, and safely,
halts (shutsdown) all the VMs so maintenance tasks can be undertaken.
2023-10-17 11:38:40 -07:00
Andreas Sandberg
42d1c8b3c3 cpu: Restructure RAS (#428) 2023-10-17 19:14:13 +01:00
David Schall
5387e67114 cpu: Restructure RAS
The return address stack (RAS) is restructured to be a separate SimObject.
This enables disabling the RAS and better separation of the functionality.
Furthermore, easier statistics and debugging.

Change-Id: I8aacf7d4c8e308165d0e7e15bc5a5d0df77f8192
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-10-17 15:30:56 +00:00
Bobby R. Bruce
3783afff5d util: Enable KVM on VMs and ensure working in Runners
This patch:

1. Adds setup scripting to "provision_root.sh" to setup and enable KVM,
   for the 'vagrant' user, for VMs which are capable of this.
2. Runs a check on each VM to see if KVM can be run sucessfully within a
   docker container. If so, the GitHub Actions runner is given a 'kvm'
   label. It is unknown at this time if GitHub Runners can utlized KVM
   but it is open to their processes.

Change-Id: Idfcbb7bfa3e5b7cc47d29aea50fb1ebcafdb7acc
2023-10-16 21:21:31 -07:00
Bobby R. Bruce
d18087af96 util: Add halt-helper.sh
This script helps use safely halt vagrant VMs.

Change-Id: I2f2f36b93f82e07756d069334db178604a9915b3
2023-10-16 21:14:09 -07:00
Bobby R. Bruce
adb5470996 arch-arm: Fix (other) line-length errors (#468)
https://github.com/gem5/gem5/pull/459 missed a couple.
This commit should complete the task.
2023-10-16 17:47:46 -07:00
Bobby R. Bruce
4b9c4e1e17 misc: Add --all to Runner docker system prune
Without `--all` `docker prune --force --volumes` will remove everything
exception non-dangling images. For an image to be considered dangling it
must be untagged and/or not used by a container at that time. As most of
the images we download are tagged (e.g., `:latest`) then most of our
images are never removed without the inclusion of `--all` which will
remove any image not currently used by a container.

Images were starting to accumulate on runners. This will ensure they do
not and are cleaned after each job run.

Change-Id: I6d8441a11d22fdcf827e9c44422dbcf02cf600e0
2023-10-16 13:33:30 -07:00
Bobby R. Bruce
aaefda3b08 arch-arm: Fix line-length error in branch64.is
Change-Id: I62c5d5fd47927a838e6731a464fc7e6d8afab768
2023-10-16 10:57:03 -07:00
Hoa Nguyen
d048ad34d6 arch-riscv: Change to VS bits to DIRTY for rvv insts changing vregs (#376)
This is similar to [1] and [2].

Essentially, the VS bits of STATUS CSR keep track of the state of
the vector registers. (VS bits == DIRTY) means the content of vector
registers have been updated since the last time the VS bits were
updated.

This chain of changes is supposed to change the VS bits to DIRTY for if
any
vector register is potentially updated.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/pull/370

Change-Id: I0427890dadc63b74a470d7405807dcfcad18005b
2023-10-16 10:07:40 -07:00
Yu-Cheng Chang
2825bc1d55 misc: Add missing RISCV valid ISA option to README.md (#462)
The list of valid ISA options should be same as the website:
https://www.gem5.org/documentation/general_docs/building

Change-Id: Id5ace5b0356ec35634caec5b11159551801c0615
2023-10-16 09:45:28 -07:00
Hoa Nguyen
9b2b6cd8d2 arch-riscv: Mark vector configuration insts as vector insts (#463) 2023-10-16 09:40:09 -07:00
Bobby R. Bruce
a9464a41f5 stdlib,resources: Generalize exception for request retry (#466)
In commit bbc301f2f0 the generalized
`Exception` was changed back to the more specific `HTTPError`.

In this case we do not desire specific error handling. If the connection
to the database fails I want the exception handled in the way outlined:
i.e., i want the connection to be retried 4 times before giving up. With
`HTTPError`, only `HTTPError`s warrent a retry.

Changing this to `HTTPError` cause tests to fail due to a failure to
retry downloading of a resource. Here is an example:
https://github.com/gem5/gem5/actions/runs/6521543885/job/17710779784

In this case `request.urlopen` raised a `URLError`. I suspect this was
some issued to do with reaching the DNS servers. It likely would've
succeeded if it had just tried again.
2023-10-16 09:39:44 -07:00
Bobby R. Bruce
322b105b9d arch-arm: Fix (another) line-length error in misc.cc
https://github.com/gem5/gem5/pull/459 missed one.
This commit should complete the task.

Change-Id: I0aeba79d6f13ddc45effe141945f5636b75daecc
2023-10-16 09:37:51 -07:00
Bobby R. Bruce
5240c07d3c util: Fix runners to extent to max disk size (#460)
THe `lvextend` command extends the logical volume. However, the
`resize2fs` command is needed to extend the filesystem to fill the
logical volume.

Prior to this patch the filesystem ran out of space despite there being
enough room in the volume. This was just wasted free space.
2023-10-16 09:20:13 -07:00
Bobby R. Bruce
97f4b44dd3 arch-arm: Fix line-length error in misc.cc (#459) 2023-10-16 08:35:54 -07:00
Giacomo Travaglini
f9cf8bf8a2 cpu, arch-arm: Add IsPseudo tag for gem5 pseudo instructions (#465)
This only applies to pseudo instructions with their own encoding (m5
ops)... In other
words memory mapped m5 operations are not supported. This make sense as
they should
rather be treated as device accesses... Though it is something to take
into consideration
when relying on the flag
2023-10-16 16:15:05 +01:00
Bobby R. Bruce
d42eeb6b68 cpu: Explicitly define cache_line_size -> 64-bit unsigned int (#329)
While it's plausible to define the cache_line_size as a 32-bit unsigned
int, the use of cache_line_size is way out of its original scope.

cache_line_size has been used to produce an address mask, which masking
out the offset bits from an address. For example, [1], [2], [3], and
[4]. However, since the cache_line_size is an "unsigned int", the type
of the value is not guaranteed to be 64-bit long. Subsequently, the bit
twiddling hacks in [1], [2], [3], and [4] produce 32-bit mask, i.e.,
0x00000000FFFFFFC0.

This behavior at least caused a problem in LLSC in RISC-V [5], where the
load reservation (LR) relies on the mask to produce the cache block
address. Two distinct 64-bit addresses can be mapped to the same cache
block using the above mask.

This patch explicitly defines cache_line_size as a 64-bit unsigned int
so the cache block mask can be produced correctly for 64-bit addresses.

[1]
3bdcfd6f7a/src/cpu/simple/atomic.hh (L147)
[2]
3bdcfd6f7a/src/cpu/simple/timing.hh (L224)
[3]
3bdcfd6f7a/src/cpu/o3/lsq_unit.cc (L241)
[4]
3bdcfd6f7a/src/cpu/minor/lsq.cc (L1425)
[5]
3bdcfd6f7a/src/arch/riscv/isa.cc (L787)
2023-10-16 07:50:35 -07:00
Jason Lowe-Power
d702d3b90a misc: fix clang13 overloaded-virtual warning (#454)
Like #363 clang is also unhappy about the overloaded virtual. However,
clang needs to have the diagnostic in a different place

Fixes #437
2023-10-16 07:23:08 -07:00
Giacomo Travaglini
3f925c4084 arch-arm: Mark gem5 pseudo-ops with IsPseudo flag
Change-Id: I9c8a146d73596597f28cdeca22ad7b7b01b381a7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-16 13:42:23 +01:00
Giacomo Travaglini
a3b1bfdbf0 cpu: Add a IsPseudo StaticInstFlag for gem5 pseudo-ops
Being able to recognise pseudo ops from the static instruction
pointer is actually quite useful in several circumstances

Change-Id: Ib39badf9aabba15ab3ebe7a8e9717583412731e4
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-16 13:41:04 +01:00
Giacomo Travaglini
2e85c95f4b arch-arm: Remove Jazelle state + ThumbEE support (#364)
This PR removes Jazelle state (while still keeping a "Trivial Jazelle
implementation",
see Arm Architecture Reference Manual) and ThumbEE support
2023-10-16 09:41:44 +01:00
Jason Lowe-Power
20f5555f30 python: Enable -m switch on gem5 binary (#453)
With -m, you can now run a module from the command line that is embedded
in the gem5 binary.
This will allow us to put some common "scripts" in the stdlib instead of
in the "configs" directory.
2023-10-14 20:08:06 -07:00
Matthew Poremba
ca2592d3ba configs: Fix missing param exchange for GPUFS (#457)
PR #367 adds an option to configs/ruby/GPU_VIPER.py that was not added
to the corresponding dGPU equal for GPUFS and thus all GPUFS runs are
failing. Fixed in this patch.
2023-10-14 20:07:39 -07:00
Daniel Kouchekinia
4931fb0010 mem-ruby: Always pass on GPU atomics to dir in write-through TCC (#367)
Added checks to ensure that atomics are not performed in the TCC when it
is configured as a write-through cache. Also added SLC bit overwrite to
ensure directory preforms atomics when there is a write-through TCC.

Change-Id: I4514e6c8022aeb7785f2c59871cd9acec8161ed8
2023-10-14 06:39:50 -07:00
Yu-Cheng Chang
a3c51ca38c arch-riscv: Fix write back register issue of vmask_mv_micro (#443)
After removing the setRegOperand in VecRegOperand
https://github.com/gem5/gem5/pull/341. The vmask_vm_micro will not write
back to register because tmp_d0 is not the reference type. The PR will
make tmp_d0 as reference of regFile.

Change-Id: I2a934ad28045ac63950d4e2ed3eecc4a7d137919
2023-10-13 15:20:42 -07:00
Matthew Poremba
7706e958e5 mem-ruby: Update cache recorder to use RubyPort and remove BUILD_GPU guards (#448)
This PR updates cache recorder to use a vector of RubyPorts for cache
cooldown and warmup instead of Sequencer or GPUCoalescer vectors (refer
to issue #403 for more details). It also removes the extra guards that
were added in #377 to prevent compile-time failures in non-GPU builds.
2023-10-13 14:36:45 -07:00
Kaustav Goswami
68af3f45c9 tests: updated the nightly tests to use SST 13.0.0 (#441)
PR https://github.com/gem5/gem5/pull/396 updates the gem5 SST bridge to
use StandardMem in SST. This change updates the nightly tests to use SST
13.0.0 instead of SST 11.1.0. It also updates the dockerfile.

Change-Id: I5c109c40379d2f09601a1c9f19c51dd716c6582e

---------

Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>
Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
2023-10-13 14:31:35 -07:00
Andreas Sandberg
59f96deb0f cpu: Refactor indirect predictor (#429) 2023-10-13 11:35:02 +01:00
Giacomo Travaglini
1c45cdcc41 arch-arm: Remove legacy ThumbEE references
ThumbEE had already been removed but there were still some
references to it dangling around. We were also signaling
ThumbEE as being available through HWCAPS in SE which
was not correct. This patch is fixing it

Change-Id: I8b196f5bd27822cd4dd8b3ab3ad9f12a6f54b047
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-13 09:25:48 +01:00
Giacomo Travaglini
a33f3d3967 arch-arm: Remove Jazelle state support
Jazelle state has been officially removed in Armv8.
Every AArch32 implementation must still support the
"Trivial Jazelle implementation", which means that while
the instruction set has been removed, it is still possible
for privileged software to access some Jazelle registers
like JIDR,JMCR, and JOSCR which are just treated as RAZ

Change-Id: Ie403c4f004968eb4cb45fa51067178a550726c87
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-13 09:25:48 +01:00
Vishnu Ramadas
8d54a5cbab mem-ruby: Remove BUILD_GPU guards from ruby coalescer models
A previous commit added BUILD_GPU guards to gpu coalescer models since
a related cache recorder commit added GPU support. This is no longer
needed since the cache recorder moved to using a vector of RubyPorts
instead of Sequencer/GPUCoalescer pointers. This commit removes
BUILD_GPU guards from the Ruby coalescer models

Change-Id: I23a7957d82524d6cd3483d22edfb35ac51796eca
2023-10-12 14:53:29 -05:00
Vishnu Ramadas
08c1af1b16 mem-ruby: Use RubyPort vector to access Ruby in cache recorder
Previously, the cache recorder used a vector of sequencer pointers to
access Ruby objects. A recent commit updated the cache recorder to also
maintain a vector of GPUCoalescer pointers in order for GPUs to support
flushin. This added redundant code to the cache recorder. This commit
replaces the sequencer and GPUCoalescer vectors with a vector of
RubyPort pointers so that the code does not contain redundant lines

Change-Id: Id5da33fb870f17bb9daef816cc43c0bcd70a8706
2023-10-12 14:49:06 -05:00
Bobby R. Bruce
3455d9e68d misc,tests: Add dummy jobs to workflows for status checks (#444)
With this change we can block PRs if the compiler, daily, and/or weekly
tests are failing by setting these dummy jobs as require status checks.
2023-10-12 11:14:08 -07:00
Bobby R. Bruce
bf1c10d4b2 tests,misc: Update CI Tests 'testlib-quick' runs-on
Here it's more sensible to use a GitHub hosted runner. This job is
miniscule and is used to check the other tests have completed
successfully. It makes sense for this not to be on our own self-hosted
runner.

Change-Id: I5377e025334d43eaedd0fc61e5c708ba61255d28
2023-10-12 07:37:11 -07:00
Bobby R. Bruce
3816ea5633 misc,tests: Add dummy jobs to workflows for status checks
Change-Id: I52e42b6f93cfbb1a8e4800a3f6e264d49bebb06c
2023-10-12 07:37:04 -07:00
Matthew Poremba
4d336c0636 arch-vega: Implement buffer_atomic_cmpswap (#439)
This is a standard compare and swap but implemented on vector memory
buffer instructions (i.e., it is the same as FLAT_ATOMIC_CMPSWAP with
MUBUF's special address calculation).

This was tested using a Tensile kernel, a backend for rocBLAS, which is
used by PyTorch and Tensorflow. Prior to this patch both ML frameworks
crashed. With this patch they both make forward progress.

Change-Id: Ie76447a72d210f81624e01e1fa374e41c2c21e06
2023-10-12 07:33:40 -07:00
Matthew Poremba
7bae5464dc arch-vega: Ignore s_setprio instruction instead of panic (#438)
This instruction is used by ML frameworks to prioritize certain
wavefronts. Since gem5 does not have any support for wavefront
scheduling based on priority (besides wavefront age), we ignore this
instruction and warn_once rather than calling panic. Since hardware can
override this priority anyways, we can be sure that ignoring the value
will not inhibit forward progress resulting in application hangs.

Change-Id: Ic5eef14f9685dd2b316c5cf76078bb78d5bfe3cc
2023-10-12 07:32:58 -07:00
Matthew Poremba
f7ad8fe435 configs: GPUFS option to disable KVM perf counters (#433)
Add a --no-kvm-perf option to disable KVM perf counters for GPUFS
scripts. This is useful for users who have KVM enabled but configured
with more restrictive settings, which seems to be the default in newer
Linux distros.

Change-Id: I7508113d0f7c74deb21ea7b2770522885a0ec822
2023-10-11 14:20:27 -07:00
Matthew Poremba
4b7f25fcb6 arch-vega: Ignore s_setprio instruction instead of panic
This instruction is used by ML frameworks to prioritize certain
wavefronts. Since gem5 does not have any support for wavefront
scheduling based on priority (besides wavefront age), we ignore this
instruction and warn_once rather than calling panic. Since hardware can
override this priority anyways, we can be sure that ignoring the value
will not inhibit forward progress resulting in application hangs.

Change-Id: Ic5eef14f9685dd2b316c5cf76078bb78d5bfe3cc
2023-10-11 15:55:16 -05:00
Matthew Poremba
4b85a1710e arch-vega: Implement buffer_atomic_cmpswap
This is a standard compare and swap but implemented on vector memory
buffer instructions (i.e., it is the same as FLAT_ATOMIC_CMPSWAP with
MUBUF's special address calculation).

This was tested using a Tensile kernel, a backend for rocBLAS, which is
used by PyTorch and Tensorflow. Prior to this patch both ML frameworks
crashed. With this patch they both make forward progress.

Change-Id: Ie76447a72d210f81624e01e1fa374e41c2c21e06
2023-10-11 15:42:50 -05:00
Bobby R. Bruce
c855dbf7c5 configs,ext: Updated the gem5 SST Bridge to use SST 13.0.0 (#396)
This change updates the gem5 SST Bridge to use SST 13.0.0. Changes are
made to replace SimpleMem class to StandardMem class as SimpleMem will
be deprecated in SST 14 and above. In addition, the translator.hh is
updated to translate more types of gem5 packets. A new parameter `ports`
was added on SST's side when invoking the gem5 component which does not
require recompiling the gem5 component whenever a new outgoing bridge is
added in a gem5 config.
2023-10-11 13:34:48 -07:00
Bobby R. Bruce
70b6b53e54 misc,python: Add pyupgrade to pre-commit (#424)
This adds the [pyupgrade](https://github.com/asottile/pyupgrade) hook to
pre-commit.

This hook automatically upgrades the syntax to the recommended standards
for the newer version of the language.
2023-10-11 09:07:09 -07:00
Matthew Poremba
da11427ba6 gpu-compute: Update tokens for flat global/scratch (#408)
Memory instructions acquire coalescer tokens in the schedule stage.
Currently this is only done for buffer and flat instructions, but not
flat global or flat scratch. This change now acquires tokens for flat
global and flat scratch instructions. This provides back-pressure to the
CUs and helps to avoid deadlocks in Ruby.

The change also handles returning tokens for buffer, flat global, and
flat scratch instructions. This was previously only being done for
normal flat instructions leading to deadlocks in some applications when
the tokens were exhausted.

To simplify the logic, added a needsToken() method to GPUDynInst which
return if the instruction is buffer or any flat segment.

The waitcnts were also incorrect for flat global and flat scratch. We
should always decrement vmem and exp count for stores and only normal
flat instructions should decrement lgkm. Currently vmem/exp are not
decremented for flat global and flat scratch which can lead to deadlock.
This change set fixes this by always decrementing vmem/exp and lgkm only
for normal flat instructions.

Change-Id: I673f4ac6121e4b5a5e8491bc9130c6d825d95fc5
2023-10-11 09:00:10 -07:00
Andreas Sandberg
891250192d arch-arm: Implement FEAT_TCR2 and FEAT_SCTLR2 (#416)
This is simply adding the new Armv8.9 registers defined in the related
features:

- FEAT_TCR2
- FEAT_SCTLR2
2023-10-11 10:14:31 +01:00
David Schall
f65df9b959 cpu: Refactor indirect predictor
Simplify indirect predictor interface. Several of the existing
functions where merged together into four clear once. Those
four are similar to the main direction predictor interface.
'lookup', 'update', 'squash' and 'commit'. This makes the
interface much more clear, allows better functionality isolation
and makes it simpler to develop new predictor models.

A new parameter is added to allow additional buffer space for
speculative path history.

Change-Id: I6d6b43965b2986ef959953a64c428e50bc68d38e
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-10-11 07:50:32 +00:00
Bobby R. Bruce
c4156b06fb python: Fix base logic in MetaSimObject
This ensures `class Foo` is considered equivalent to `class
Foo(object)`.

Change-Id: I65a8aec27280a0806308bbc9d32281dfa6a8f84e
2023-10-10 21:47:08 -07:00
Bobby R. Bruce
298119e402 misc,python: Run pre-commit run --all-files
Applies the `pyupgrade` hook to all files in the repo.

Change-Id: I9879c634a65c5fcaa9567c63bc5977ff97d5d3bf
2023-10-10 21:47:07 -07:00