This implements flattening in the x86 integer and floating point
RegClass-es, as well as adding regName functions for each. These came
from the X86StaticInst::printReg function, and the flattening functions
in the X86ISA::ISA class.
Change-Id: If026e3b44aa64441222451d91e99778f6054d9f0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51228
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
This makes RegIds and the RegClass-es associated with them responsible
for their own flattening. If they don't need to be flattened (a common
case) then they just mark themselves as already flat and that step can
be skipped.
This will also make it possible to get rid of the (get|set)RegFlat APIs,
since if you want to use flattened registers, you'll either have or
create a flattened RegId and pass it into the same (get|set)Reg method.
By making flattening work on RegIds instead of RegIndexes, this will
also make it possible for registers to start out in one RegClass and
move into another one. This would be useful if, for instance, there were
multiple groups of integer registers which had different indexing
semantics, but which should all end up in the same pool for renaming.
For instance, on x86, there are three distinct classes of FP registers.
They are the MMX registers, the pairs of registers which back the XMM
registers, and the X87 registers. Only the last of these needs
flattening. These could all be treated as different RegClass-es
pre-flattening, and could converge on the underlying floating point
register file post-flattening.
Another example in x86 is that some registers can encode that they
should refer to either the first byte of one register, or the second
byte of another register. This only applies to some registers though,
and so only those would need to go through the flattening step.
Another major advantage is that this removes the need for flattening
functions on the ISA object. Having those, and treating the ISA object
as a TheISA::ISA instead of the more generic BaseISA, was done to make
the flattening functions inline, and to make them fold away in cases
where flattening is not necessary. This new scheme isn't *quite* as
streamline as that, since you'll actually need to check if something is
already flattened. You won't, however, need to check what type the
register is and then look up the right flattening function, so that will
likely compensate.
Change-Id: I3c648cc8c0776b0e1020504113445b7d033e665f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51227
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This commit adds a commit which applied Python Black to all Python
files, to .git-blame-ignore-revs.
.git-blame-ignore-revs contains commits that should be ignored from git
blame. It can be setup using:
```
git config --global blame.ignoreRevsFile .git-blame-ignore-revs
```
Change-Id: I02c309a40ea7d3b9ef3e6ecb05d98e1e333585a9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62052
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
The InstResult class is always used to store a register value, and also
only used to store a RegVal and not any more complex type like a
VecRegContainer. This is partially because the methods that *would*
store a complex result only have a pointer to work with, and don't have
a type to cast to to store the result in the InstResult.
This change reworks the InstResult class to hold the RegClass the
register goes with, and also either a standard RegVal, or a pointer to a
blob of memory holding the actual value if RegVal isn't appropriate. If
the InstResult has no RegClass, it is considered invalid.
To make working with InstResult easier, it also now has an "asString"
method which will just call into the RegClass's valString method with
the appropriate pointer.
By removing the ultimately unnecessary generality of the original class,
this change also simplifies InstResult significantly.
Change-Id: I71ace4da6c99b5dd82757e5365c493d795496fe5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50253
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
test_hello_se.py:
Added "take_params_progs" to store avaliable isa and the binary names.
Added two new parameters in function "verify_config" to take verifier
and input arguments for the binary.
simple_binary_run.py:
Added a new unrequired args called "arguments" to take input arguments
for the binary. Its default value is [ ] so the
"arguments = args.arguments" in the "set_se_workload" can run without
inputting any args.arguments.
Change-Id: Ib99dc92aa97060de5e1d34d9cac5800b82dab9e6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61771
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Most of the invalidation methods in the TLB class are
doing the same thing: looping over all entries, checking if
the entry matches a certain criteria, and invalidating it
in case it does.
The only specific bit is the matching function, therefore
we add a virtual TLBIOp::match method which allows us
to specialize different TLBIs and to provide a single
flush method in the TLB class
Change-Id: I0672ff958742ac7ebff8d30218f75127343f1a58
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61753
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
The method is no longer calling the lookup method which
had been complicated by the introduction of partial translations.
(which is now called during address translation only)
The lookup method is iterating over all TLB Entries until a non
partial translation is found. Using lookup in flushMva makes it
O(n^2). With this patch we iterate over the TLB entries only once
(making flushMva O(n))
Change-Id: I8f2ae56192812cee231baf6943068abea4d7ef91
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61752
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Fixing invalidation behaviour for the following stage 2 TLB maintainance
instructions
MISCREG_TLBI_IPAS2E1_Xt
MISCREG_TLBI_IPAS2LE1_X
MISCREG_TLBI_IPAS2E1_Xt
MISCREG_TLBI_IPAS2LE1_Xt
1) Do nothing if EL2 is not enabled in the current security state
2) If we are in secure state, the 63 bit of the Xt register selects
the security domain (s/ns) of the invalidated entries
Change-Id: I4573ed60ce619bcefd9cb05f00c5d3fcfa8d3199
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61751
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Replace the two constructors with one that takes the truly mandantory
parameters, and then a function to derive a new RegClass with some sort
of adjustment, currently by adding custom ops, or setting a non-standard
register size.
Because the constructor and the modifier function are constexpr, they
should fold away and not actually create extra temporary copies of the
RegClass in the modifier functions.
Change-Id: I8acb755eb28fc8474ec453c51ad205a52eed9a8e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50249
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The following AArch64 CMOs were flagged as warnNotFail even
if they are actually implemented and there is no reason
for them to fail:
MISCREG_DC_IVAC_Xt
MISCREG_DC_ZVA_Xt
MISCREG_DC_CVAC_Xt
MISCREG_DC_CVAU_Xt
MISCREG_DC_CIVAC_Xt
This is likely coming from AArch32 (those CMOs are unimplemented in
AArch32).
Please note: this patch is not changing anything behaviorally; the
warnOnFail flag is not considered in AArch64 unless the unimplemented
flag is also set (and this was not the case for those CMOs)
Change-Id: I40396016703b9eb48f69b0eb710d077f8c2b146b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61685
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
The LDS and scratch aperture base and limits are hardcoded to some
values that are useful for SE mode. In reality, these are chosen by the
driver so we need to honor whatever values the driver passes so that
when addresses are calculated they fall into the correct aperture to
route flat instructions to those apertures.
This overwrites the default hardcoded values for LDS and scratch base
and limit using the values providing by the driver in a MAP_PROCESS
packet.
Change-Id: I0e194a26631f697819d8aaecf1bf346a7b7c7026
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61656
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
These instructions are supposed to be read/writing special shader
hardware registers. Currently they are getting/setting to an SGPR. This
results in getting incorrect registers at best and clobbering an SGPR
being used by an application at worst. Furthermore, some registers need
to be set in the shader and the application will never (can never) set
them.
This patch overhauls the getreg/setreg instructions to use different
storage in the shader. The values will be updated either via setreg from
an application (e.g., mode register) or set by a PM4 MAP_PROCESS.
Change-Id: Ie5e5d552bd04dc47f5b35b5ee40a569ae345abac
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61655
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Here the mask should not be inverted. We also need to shift by the
offset to remove the padding as the consumer of the value expects the
offset to be removed.
This can be easily tested by running a GPU kernel with __shared__
variables. This will generate the following assembly:
s_getreg_b32 s6, hwreg(HW_REG_SH_MEM_BASES, 16, 16)
The current implementation returns the lower 16 bits (private memory
aperture) while the correct behavior is the uppter 16 bits (shared/LDS
memory aperture).
Change-Id: Iea8f0adceeadb24cdcf46ef4183fcaa8262ab9e7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61654
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
The driver uses the pasid to look up events that need to be set in
kfd_signal_event_interrupt (amdkfd/kfd_events.c). Currently this is
uninitialized which causes the function in the driver to return without
doing anything useful.
This changeset initializes the cookie PASID to 0x8000. 0x8000 is always
the first PASID assigned by the driver. This works since gem5 only
supports one GPU process in FS mode. This would have to be changed for
multi-process support, so a comment is added as a reminder.
Change-Id: I7074b581f2f2f346bd910eef15d5f9253ce17e2c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61653
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The environment variable HSA_ENABLE_INTERRUPT controls if Interrupt or
busy wait signals are used in the ROCm runtime. Interrupts are not being
sent in gem5 causing simulations to hang indefinitely in certain
situations. To fix this, always disable interrupts to fall back to busy
wait signals. Using interrupts is an old and simple optimization to not
waste CPU cycles, but from the perspective of simulation this is not
important. Disabling interrupt-based HSA signals therefore increases the
number of applications working within gem5.
Change-Id: I1ae21d7ee01548a4d00a8972642079b90278f9a2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61652
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
When GPU needs more scratch it requests from the runtime. In the
method to wait for response, a dmaReadVirt is called with the same
method as the callback with zero delay. This means that effectively
there is an infinite loop in the event queue if the scratch setup is not
successful on the first attempt. In the case of GPUFS, it is never
successfully instantly so a delay must be added. Without added delay,
the host CPU is never scheduled to make progress setting up more scratch
space.
The value 1e9 is choosen to match the KVM quantum and hopefully give KVM
a chance to schedule an event. For reference, the driver timeout is
200ms so this is still fairly aggressive checking of the signal response.
This value is also balanced around the GPUCommandProc DPRINTF to
prevent the print in this method from overwhelming debug output.
Change-Id: I0e0e1d75cd66f7c47815b13a4bfc3c0188e16220
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61651
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
This code is unnecessary as the read index is already correct.
Furthermore, it can cause hangs in some situations where the packet
SHOULD be marked as not complete. This causes a bug where the read index
is incremented by 1 multiple times, causing the packet processor to read
an invalid packet, followed by a hang after it does nothing.
Change-Id: Iceda3c9606e018f60f8902770a2d9762c1c14304
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61650
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
This instruction appears to be the only VOP1 instruction that has a
scalar destination using VDST as the destination register number.
However, since VDST is only 8 bits it cannot encode all possible
registers. Therefore, use the opcode to determine if the destination is
a scalar or vector destination.
This issue manifests as a VGPR dest being out of range for a kernel
where the number of SGPRs is more than the number of VGPRs and the
intended SGPR dest is larger than the count of VGPRs
Change-Id: I95a7de1ddb97f7171f48331fed36aef776fa0cb4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61649
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These VHE flags are not needed anymore.
They were used to trap EL2 access to VHE only registers (like CPACR_EL12)
when VHE was disabled (hcr.e2h = 0)
With the new faulting logic, we can just introduce VHE specific
callbacks checking for the hcr.e2h bitfield and returning an undefined
instruction if VHE is disabled.
In this way we don't have to add VHE only bits to every system register
Change-Id: I07bf9a9adc7a089bd45e718fb06d88488a2b7ed5
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61678
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>