We add the following registers which are added by SME:
* ID_AA64SMFR0_EL1
* SVCR
* SMIDR_EL1
* SMPRI_EL1
* SMPRIMAP_EL2
* SMCR_EL3
* SMCR_EL2
* SMCR_EL12
* SMCR_EL1
* TPIDR2_EL0
* MPAMSM_EL1
In addition we extend some of the existing registers with SME support
(SCR_EL3, CPACR_EL1, CPTR_EL2, CPTR_EL3, etc). These regisers are
responsible for enabling SME itself, or for configuring the trapping
behaviour for the differernt ELs.
In addition we implement some dummy registers as they are officially
required by SME, but gem5 itself doesn't actually support the features
yet (FGT, HCX).
Jira Issue: https://gem5.atlassian.net/browse/GEM5-1289
Change-Id: I18ba65fb9ac2b7a4b4f361998564fb5d472d1789
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64335
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
We add support for the matrix registers to the Arm architecture. This
will be used to implement support for Arm's Scalable Matrix Extension
(SME) in subsequent commits.
We add an implementation of a matrix register for the Arm
architecture. These are akin to 2D vector registers in the sense that
they can be dynamically viewed as a variety of element sizes. As
widening the element size would reduce the matrix size by a factor of
element size, we instead layer multiple tiles of wider elements onto
the underlying matrix storage in order to retain square matrices.
We separate the storage of the matrix from the different views one can
have. The potential views are:
* Tiles: View the matrix as one or more tiles using a specified
element size. As the element size increases the number of indexable
tiles increases. When using the smallest granularity element size
(bytes) there is a single tile. As an example, using 32-bit elements
yields 4 tiles. Tiles are interleaved onto the underlaying matrix
modulo element size. A tile supports 2D indexing ([][]), with the
first index specifying the row index, and the second the column
(element index within the row).
* A Horizontal/Vertical slice (row or a column) of a tile: Take the
aforementioned tile, and extract a specified row or column slice
from it. A slice supports standard []-based indexing. A tile slice
must use the same underlying element type as is used for the tile.
* A Horizontal/Vertical slice (row or column) of the underlying matrix
storage: Treat the matrix register as an array of vectors (rows or
columns, rows preferred due to them being indepependent of the
element size being used).
On simulator start-up the matrix registers are initialised to a
maximum size. At run-time the used size can by dynamically
adjusted. However, please note that as the matrix register class
doesn't know if a smaller size is being used, the class itself doesn't
do any bounds checking itself. This is left to the user.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-1289
Change-Id: I6a6a05154846e4802e9822bbbac00ab2c39538ed
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64334
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
We now explicitly check in both the cache and the MSHRs if writes are
masked or not before promoting to a whole-line write. Failure to do
this previously was resulting in data loss when dirty data was present
in lower level caches and a coincidentally aligned and
cache-line-sized masked write occured.
Change-Id: I9434590d8b22e4d993167d789eb9d15a2e866bf1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64340
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This is essentially the same as how the reset signals were exported
from the CortexR52 which I used as an example, except here there is
only one reset. I passed through with the same name rather than calling
it "model_reset" as in the CortexR52 since the pass through is trivial,
and renaming the signal with no additional functionality seemed like it
would just create confusion. In the CortexR52 case it makes more sense
since there are multiple reset lines that need to be toggled to
actually cause a reset, and a level of abstraction is actually helpful.
Change-Id: I6b61fed6eb1566d131d4b0367fe4ae65031b25f8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67351
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Currently the gem5 standard library does not define a class to represent
a cluster of CPUs.
The SubSystem class has been extended in some python modules [1] to
define clock/voltage domains shared by a group of CPUs (the cluster),
and to provide some utility functions for top level configs.
This patch is moving the aforementioned class within the gem5 standard
library, to let other ISAs and scripts make use of it.
Adding a cpu cluster class to the gem5 library will have the
benefit of standardizing the interface to cpus in the toplevel
configs
Most of the new class still resides in the python world: we want the
class to be as generic as possible and we want to make its use
optional
[1]: https://github.com/gem5/gem5/blob/v22.0.0.0/\
configs/example/arm/devices.py#L96
Change-Id: Idb05263a244e28bffa9eac811c6deb62ebb76a74
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65891
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The port_wrapper classes convert the Request/ResponsePort from
inherit-base to callback registrations. This help 'composition over
inheritance' that most design pattern follows, which help reducing
code length and increase reusability.
Change-Id: Ia13cc62507ac8425bd7cf143a2e080d041c173f9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67232
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
-Wno-free-nonheap-object can happen at compile or link time depending on
the versions. To better disable this false alarm, we move the memory
management part into .cc file, so the check is always done at link time.
This change also removes the global flags so other code is still checked
with the flags.
Change-Id: I8f1e20197b25c90b5f439e2ecc474bd99e4f82ed
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67237
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
When using the typed register template, most functionality of the class
can be controlled using callbacks. For instance, callbacks can be
installed to handle reads or writes to a register without having to
subclass the template and override those methods using inheritance.
The recently added reset() method did not follow this pattern though,
which has two problems. First, it's inconsistent with how the class is
normally used. Second, once you've defined a subclass, the reader,
writer, etc, callbacks still expect the type of the original class.
That means these have to either awkwardly use a type different from the
actual real type of the register, or use awkward, inefficient, and/or
dangerous casting to get back to the true type.
To address these problems, this change adds a resetter(...) method
which works like the reader(...) or writer(...) methods to optionally
install a callback to implement any special reset behavior.
Change-Id: Ia74b36616fd459c1dbed9304568903a76a4b55de
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67203
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
66d4a158 added support for AMD's GPU cache bypassing flags (GLC
for bypassing L1 caches, SLC for bypassing all caches). However,
it did not add a transition for the situation where the cache line
is currently I (Invalid). This commit adds this support, which
resolves an assert failure in Pannotia workloads when this situation
arises.
Change-Id: I59a62ce70c01dd8b73aacb733fb3d1d0dab2624b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67201
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
66d4a158 added support for AMD's GPU cache bypassing flags (GLC
for bypassing L1 caches, SLC for bypassing all caches). However,
for applications that use the GLC flag but intermix GLC- and
non-GLC accesses to the same address, this previous commit
has a bug. This bug manifests when the address is currently
valid in the L1 (TCP). In this case, the previous commit chose
to evict the line before letting the bypassing access to proceed.
However, to do this the previous commit was using the inv_invDone
action as part of the process of evicting it. This action is only
intended to be called when load acquires are being performed
(i.e., when the entire L1 cache is being flash invalidated). Thus,
calling inv_invDone for a GLC (or SLC) bypassing request caused an
assert failure since the bypassing request was not performing a
load acquire.
This commit resolves this by changing the support in this case to
simply invalidate the entry in the cache.
Change-Id: Ibaa4976f8714ac93650020af1c0ce2b6732c95a2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67199
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Some clients (e.g. fastmodel integration) would like to catch specific
warning messages from SystemC. Adding facilities to chain extra report
handler (instead of just replacing the default one), that are run
after the default/set handler.
Change-Id: I8ef140fc897ae5eee0fc78c70caf081f625efbfd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67234
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When decode width is larger than fetch width, the skid buffer
overflow happens at decode stage. The decode stage assumes
that fetch stage sends instructions as many as the fetch width,
but it sends them at decode width rate.
This patch makes the decode stage set its skid buffer size
according to the decode width.
Change-Id: I90ee43d16c59a4c9305c77bbfad7e4cdb2b9cffa
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67231
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Hanhwi Jang <jang.hanhwi@gmail.com>
Reviewed-by: Tom Rollet <tom.rollet@huawei.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Two W->WI transitions, on events RdBlk and Atomic in the GPU L2 cache
coherence protocol do not clear the request from the request queue upon
completing the transition. This action is not performed in the respone
path. This update adds the p_popRequestQueue action to each of these
transitions to remove the stale request from the queue.
Change-Id: Ia2679fe3dd702f4df2bc114f4607ba40c18d6ff1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67192
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
An earlier commit added support for GLC and SLC AMDGPU instruction
modifiers. These modifiers enable cache bypassing when set. The GLC/SLC
flag information was being threaded through all the way to memory and
back so that appropriate actions could be taken upon receiving a request
and corresponding response. This commit removes the threading and adds
the bypass flag information to TBE. Requests populate this
entry and responses access it to determine the correct set of actions to
execute.
Change-Id: I20ffa6682d109270adb921de078cfd47fb4e137c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67191
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>