The VGPR-offset used when SGPR-base addressing is used can be signed in
Vega. These are global instructions of the format:
`global_load_dword v0, v1, s[0:1]`. This is not explicitly stated in the
ISA manual however based on compiler output the offset can be negative.
This changeset assigns the offset to a signed 32-bit integer and the
compiler takes care of the signedness in the expression which calculates
the final address. This fixes a bad address calculation in a rocPRIM
unit test.
Change-Id: I271edfbb4c6344cb1a6a69a0fd3df58a6198d599
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67412
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
These are the "reset" and "po_reset" lines. It seems reasonable that
these are the normal reset and the power on reset signals, but that's
not spelled out in the fast model "lisa" file, nor does it explain
exactly what the difference is between them.
Change-Id: I686b4d973fc3cfff8a3ec05f8c95ee2cb6ff6698
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67575
Reviewed-by: Jui-min Lee <fcrh@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This CL enables forwarding stream ID from amba_pv to gem5 world.
The stream ID information is originally stored in master_id of
pv::TransactionAtrribute, then it will be stored to m_id of
amba_pv::amba_pv_extension.
This CL brings the information to stream ID field of
Gem5SystemC::ControlExtension. Then the information can be set to stream
ID of the gem5 packet's request.
After bringing the information to gem5, we can identify the packet's
stream ID from gem5 side. One example usage is PL330. In PL330_DMAC, each
transaction is associated with a stream ID. If we can identitfy the
stream ID, we can, for example, set attribute to specific DMAC channel.
Change-Id: I943ce49fde57b0bcfc18b58c7566eec61cc676f4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67591
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
The ISA parser now emits the code required to access matrix
registers. In the case where a register is both a source and a
destination, the ISA parser generates appropriate code to make sure
that the contents of the source is copied to the destination. This is
required for the O3 CPU which treats these as two different physical
registers, and hence data is lost if not explicitly preserved.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-1289
Change-Id: I8796bd1ea55b5edf5fb8ab92ef1a6060ccc58fa1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64338
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
We add the SME access checks and trapping, which roughly mirrors that
used by SVE.
SME adds a new mode called streaming mode. When a core is in streaming
mode the behaviour of the SVE instructions changes such that they
check the SME traps and enables as opposed to the SVE ones. We
therefore update the existing SVE trap/access checking code to check
the SME equivalents when a core is in streaming mode. Else, the
original behaviour is preserved.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-1289
Change-Id: I7eba70da9d41d2899b753fababbd6074ed732501
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64337
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
We add the following registers which are added by SME:
* ID_AA64SMFR0_EL1
* SVCR
* SMIDR_EL1
* SMPRI_EL1
* SMPRIMAP_EL2
* SMCR_EL3
* SMCR_EL2
* SMCR_EL12
* SMCR_EL1
* TPIDR2_EL0
* MPAMSM_EL1
In addition we extend some of the existing registers with SME support
(SCR_EL3, CPACR_EL1, CPTR_EL2, CPTR_EL3, etc). These regisers are
responsible for enabling SME itself, or for configuring the trapping
behaviour for the differernt ELs.
In addition we implement some dummy registers as they are officially
required by SME, but gem5 itself doesn't actually support the features
yet (FGT, HCX).
Jira Issue: https://gem5.atlassian.net/browse/GEM5-1289
Change-Id: I18ba65fb9ac2b7a4b4f361998564fb5d472d1789
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64335
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
We add support for the matrix registers to the Arm architecture. This
will be used to implement support for Arm's Scalable Matrix Extension
(SME) in subsequent commits.
We add an implementation of a matrix register for the Arm
architecture. These are akin to 2D vector registers in the sense that
they can be dynamically viewed as a variety of element sizes. As
widening the element size would reduce the matrix size by a factor of
element size, we instead layer multiple tiles of wider elements onto
the underlying matrix storage in order to retain square matrices.
We separate the storage of the matrix from the different views one can
have. The potential views are:
* Tiles: View the matrix as one or more tiles using a specified
element size. As the element size increases the number of indexable
tiles increases. When using the smallest granularity element size
(bytes) there is a single tile. As an example, using 32-bit elements
yields 4 tiles. Tiles are interleaved onto the underlaying matrix
modulo element size. A tile supports 2D indexing ([][]), with the
first index specifying the row index, and the second the column
(element index within the row).
* A Horizontal/Vertical slice (row or a column) of a tile: Take the
aforementioned tile, and extract a specified row or column slice
from it. A slice supports standard []-based indexing. A tile slice
must use the same underlying element type as is used for the tile.
* A Horizontal/Vertical slice (row or column) of the underlying matrix
storage: Treat the matrix register as an array of vectors (rows or
columns, rows preferred due to them being indepependent of the
element size being used).
On simulator start-up the matrix registers are initialised to a
maximum size. At run-time the used size can by dynamically
adjusted. However, please note that as the matrix register class
doesn't know if a smaller size is being used, the class itself doesn't
do any bounds checking itself. This is left to the user.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-1289
Change-Id: I6a6a05154846e4802e9822bbbac00ab2c39538ed
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64334
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This is essentially the same as how the reset signals were exported
from the CortexR52 which I used as an example, except here there is
only one reset. I passed through with the same name rather than calling
it "model_reset" as in the CortexR52 since the pass through is trivial,
and renaming the signal with no additional functionality seemed like it
would just create confusion. In the CortexR52 case it makes more sense
since there are multiple reset lines that need to be toggled to
actually cause a reset, and a level of abstraction is actually helpful.
Change-Id: I6b61fed6eb1566d131d4b0367fe4ae65031b25f8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67351
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Global instructions in Vega can either use a VGPR base address plus
instruction offset or SGPR base address plus VGPR offset plus
instruction offset. Currently the VGPR address/offset is always read as
two dwords. This causes problems if the VGPR number is the last VGPR
allocated to a wavefront since the second dword would be beyond the
allocation and trip an assert.
This changeset sets the operand size of the VGPR operand to one dword
when SGPR base is used and two dwords otherwise so initDynOperandInfo
does not assert. It also moves the read of the VGPR into the calcAddr
method so that the correct ConstVecOperandU## is used to prevent another
assertion failure when reading from the register file. These two changes
are made to all flat instructions, as global instructions are a
subsegement of flat instructions.
Change-Id: I79030771aa6deec05ffa5853ca2d8b68943ee0a0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67077
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
DPP processing has several issues which are fixed in this changeset:
1) Incorrect comment is updated
2) newLane calculation for shift/rotate instructions is corrected
3) A copy of original data is made so that a copy of a copy is not made
4) Reset all booleans (OOB, zeroSrc, laneDisabled) after each lane
iteration
The shift, rotate, and broadcast variants were tested by implementing
them in assembly and running on silicon.
Change-Id: If86fbb26c87eaca4ef0587fd846978115858b168
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66752
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
The bitfield extract instructions come in unsigned and signed variants.
The documentation on this is not correct, however the GCN3 documentation
gives some clues. The instruction should extract an N-bit integer where
N is defined in a source operand starting at some bit also defined by a
source operand. For signed variants of this instruction, the N-bit
integer should be sign extended but is currently not.
This changeset does sign extension using the runtime value of N by ORing
the upper bits with ones if the most significant bit is one. This was
verified by writing these instructions in assembly and running on a real
GPU. Changes are made to v_bfe_i32, s_bfe_i32, and s_bfe_i64.
Change-Id: Ia192f5940200c6de48867b02f709a7f1b2daa974
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66751
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
The ResetRequestPort and ResetResponsePort have a few problems:
1. A reset signal should happen during the time a reset is asserted,
or in other words the device should stay in reset and not doing
anything while reset is asserted. It should not immediately restart
execution while the reset is still held.
2. These names are misleading, since there is no response. These names
are inherited from other port types where there is an actual response.
There is a new generic SignalSourcePort and SignalSinkPort set of port
classes which are templated on the type of signal they propogate, and
which can be used in place of reset ports in c++. These ports can
still have a specialized role which will ensure that only reset ports
are connected to each other for a form of type checking, although
the underlying c++ instances are more interoperable than that.
Change-Id: Id98bef901ab61ac5b200dbbe49439bb2d2e6c57f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66675
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These are largely compatibility wrappers around the Signal*Port
classes. The python versions of these types enforce more specific
compatibility, but on the c++ side the Signal*Port<bool> classes can
be used directly instead.
Change-Id: I1325074d0ed1c8fc6dfece5ac1ee33872cc4f5e3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66673
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>