This separates the idea of a SimpleCore and a BaseCPUCore. A SimpleCore
selects the correct BaseCPU subclass based on user-specified CPUTypes
and target ISA. The new BaseCPUCore type simply wraps any BaseCPU core
for usage in the stdlib.
Much of the code previously handled in SimpleCore has been moved to
BaseCPUCore.
The `cpu_simobject_factory` method has been moved from AbstractCore to
SimpleCore; a more logical location for this function.
Change-Id: I29ce9e381e7d5e8fe57e0db5deb04ad976b7dab9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62292
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
This constraint bound us in many ways. There are many cases where we
want a core in a component which does not correspond to a CPUType
enum value.
This refactoring makes it so only SimpleCore utilizes this.
Docstrings have been updated to reflect this change.
Change-Id: I918c73310fc530dd060691cf9be65163cacfffb4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62291
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
This patch allows a user to set the "GEM5_USE_PROXY" environment
variable, in the format of "<host>:<port>", to declare a socks5 proxy
server to use when obtaining gem5 resources and the resources.json
file.
Note, this requires the Python SOCKS client module, which can be
installed via `pip install PySocks`.
Change-Id: I13f50d71fb6e0713f6a280ec9d2f0b3049c27eb6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62391
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
The previous implementation used the physical memory view when reporting
memory back to GDB. This circumvents MMUs and caches, and leads to wrong
backtraces at the least.
Current architectures support EL3, EL2, and EL1/EL0, and the Iris
interface presents a Msn that corresponds to that (`0x10ff`), see
table "Canonical memory space numbers" in the Iris user guide.
As GDB expects the view of the processor when querying memory (e.g. for
backtraces), this will allow proper backtraces.
Not sure if there is an implicit way of expressing memory attributes
(like in Lauterbach with the access modifiers before address
specifications), or if there is a need to implement special monitor
commands. But for the common use, using `CurrentMsn` should be the
correct choice.
Change-Id: Ibd14c1f94163105539a7fb9132550fe713b5c295
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61951
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Current WFI implementation always put the CPU to sleep even if there's
an pending interrupt. This might cause issue because an interrupt might
happen just before the WFI is executed, and there might not be any
further interrupts to wake the CPU up, so the CPU sleeps indefinitely.
In this CL, we ensure the CPU sleeps only if there's no pending
interrupt at all, regardless whether the interrupt is masked or not. We
intentionally check for masked interrupt as well because a masked
interrupt is also able to wake the CPU up if it occurs after WFI. This
will make the behavior consistent no matter the interrupt comes before
or after WFI.
Change-Id: I74dbc01fed52652101675f1ae8e481b77c932088
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62251
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
This implements flattening in the x86 integer and floating point
RegClass-es, as well as adding regName functions for each. These came
from the X86StaticInst::printReg function, and the flattening functions
in the X86ISA::ISA class.
Change-Id: If026e3b44aa64441222451d91e99778f6054d9f0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51228
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
This makes RegIds and the RegClass-es associated with them responsible
for their own flattening. If they don't need to be flattened (a common
case) then they just mark themselves as already flat and that step can
be skipped.
This will also make it possible to get rid of the (get|set)RegFlat APIs,
since if you want to use flattened registers, you'll either have or
create a flattened RegId and pass it into the same (get|set)Reg method.
By making flattening work on RegIds instead of RegIndexes, this will
also make it possible for registers to start out in one RegClass and
move into another one. This would be useful if, for instance, there were
multiple groups of integer registers which had different indexing
semantics, but which should all end up in the same pool for renaming.
For instance, on x86, there are three distinct classes of FP registers.
They are the MMX registers, the pairs of registers which back the XMM
registers, and the X87 registers. Only the last of these needs
flattening. These could all be treated as different RegClass-es
pre-flattening, and could converge on the underlying floating point
register file post-flattening.
Another example in x86 is that some registers can encode that they
should refer to either the first byte of one register, or the second
byte of another register. This only applies to some registers though,
and so only those would need to go through the flattening step.
Another major advantage is that this removes the need for flattening
functions on the ISA object. Having those, and treating the ISA object
as a TheISA::ISA instead of the more generic BaseISA, was done to make
the flattening functions inline, and to make them fold away in cases
where flattening is not necessary. This new scheme isn't *quite* as
streamline as that, since you'll actually need to check if something is
already flattened. You won't, however, need to check what type the
register is and then look up the right flattening function, so that will
likely compensate.
Change-Id: I3c648cc8c0776b0e1020504113445b7d033e665f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51227
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The InstResult class is always used to store a register value, and also
only used to store a RegVal and not any more complex type like a
VecRegContainer. This is partially because the methods that *would*
store a complex result only have a pointer to work with, and don't have
a type to cast to to store the result in the InstResult.
This change reworks the InstResult class to hold the RegClass the
register goes with, and also either a standard RegVal, or a pointer to a
blob of memory holding the actual value if RegVal isn't appropriate. If
the InstResult has no RegClass, it is considered invalid.
To make working with InstResult easier, it also now has an "asString"
method which will just call into the RegClass's valString method with
the appropriate pointer.
By removing the ultimately unnecessary generality of the original class,
this change also simplifies InstResult significantly.
Change-Id: I71ace4da6c99b5dd82757e5365c493d795496fe5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50253
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Most of the invalidation methods in the TLB class are
doing the same thing: looping over all entries, checking if
the entry matches a certain criteria, and invalidating it
in case it does.
The only specific bit is the matching function, therefore
we add a virtual TLBIOp::match method which allows us
to specialize different TLBIs and to provide a single
flush method in the TLB class
Change-Id: I0672ff958742ac7ebff8d30218f75127343f1a58
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61753
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
The method is no longer calling the lookup method which
had been complicated by the introduction of partial translations.
(which is now called during address translation only)
The lookup method is iterating over all TLB Entries until a non
partial translation is found. Using lookup in flushMva makes it
O(n^2). With this patch we iterate over the TLB entries only once
(making flushMva O(n))
Change-Id: I8f2ae56192812cee231baf6943068abea4d7ef91
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61752
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Fixing invalidation behaviour for the following stage 2 TLB maintainance
instructions
MISCREG_TLBI_IPAS2E1_Xt
MISCREG_TLBI_IPAS2LE1_X
MISCREG_TLBI_IPAS2E1_Xt
MISCREG_TLBI_IPAS2LE1_Xt
1) Do nothing if EL2 is not enabled in the current security state
2) If we are in secure state, the 63 bit of the Xt register selects
the security domain (s/ns) of the invalidated entries
Change-Id: I4573ed60ce619bcefd9cb05f00c5d3fcfa8d3199
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61751
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Replace the two constructors with one that takes the truly mandantory
parameters, and then a function to derive a new RegClass with some sort
of adjustment, currently by adding custom ops, or setting a non-standard
register size.
Because the constructor and the modifier function are constexpr, they
should fold away and not actually create extra temporary copies of the
RegClass in the modifier functions.
Change-Id: I8acb755eb28fc8474ec453c51ad205a52eed9a8e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50249
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The following AArch64 CMOs were flagged as warnNotFail even
if they are actually implemented and there is no reason
for them to fail:
MISCREG_DC_IVAC_Xt
MISCREG_DC_ZVA_Xt
MISCREG_DC_CVAC_Xt
MISCREG_DC_CVAU_Xt
MISCREG_DC_CIVAC_Xt
This is likely coming from AArch32 (those CMOs are unimplemented in
AArch32).
Please note: this patch is not changing anything behaviorally; the
warnOnFail flag is not considered in AArch64 unless the unimplemented
flag is also set (and this was not the case for those CMOs)
Change-Id: I40396016703b9eb48f69b0eb710d077f8c2b146b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61685
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
The LDS and scratch aperture base and limits are hardcoded to some
values that are useful for SE mode. In reality, these are chosen by the
driver so we need to honor whatever values the driver passes so that
when addresses are calculated they fall into the correct aperture to
route flat instructions to those apertures.
This overwrites the default hardcoded values for LDS and scratch base
and limit using the values providing by the driver in a MAP_PROCESS
packet.
Change-Id: I0e194a26631f697819d8aaecf1bf346a7b7c7026
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61656
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
These instructions are supposed to be read/writing special shader
hardware registers. Currently they are getting/setting to an SGPR. This
results in getting incorrect registers at best and clobbering an SGPR
being used by an application at worst. Furthermore, some registers need
to be set in the shader and the application will never (can never) set
them.
This patch overhauls the getreg/setreg instructions to use different
storage in the shader. The values will be updated either via setreg from
an application (e.g., mode register) or set by a PM4 MAP_PROCESS.
Change-Id: Ie5e5d552bd04dc47f5b35b5ee40a569ae345abac
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61655
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Here the mask should not be inverted. We also need to shift by the
offset to remove the padding as the consumer of the value expects the
offset to be removed.
This can be easily tested by running a GPU kernel with __shared__
variables. This will generate the following assembly:
s_getreg_b32 s6, hwreg(HW_REG_SH_MEM_BASES, 16, 16)
The current implementation returns the lower 16 bits (private memory
aperture) while the correct behavior is the uppter 16 bits (shared/LDS
memory aperture).
Change-Id: Iea8f0adceeadb24cdcf46ef4183fcaa8262ab9e7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61654
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
The driver uses the pasid to look up events that need to be set in
kfd_signal_event_interrupt (amdkfd/kfd_events.c). Currently this is
uninitialized which causes the function in the driver to return without
doing anything useful.
This changeset initializes the cookie PASID to 0x8000. 0x8000 is always
the first PASID assigned by the driver. This works since gem5 only
supports one GPU process in FS mode. This would have to be changed for
multi-process support, so a comment is added as a reminder.
Change-Id: I7074b581f2f2f346bd910eef15d5f9253ce17e2c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61653
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When GPU needs more scratch it requests from the runtime. In the
method to wait for response, a dmaReadVirt is called with the same
method as the callback with zero delay. This means that effectively
there is an infinite loop in the event queue if the scratch setup is not
successful on the first attempt. In the case of GPUFS, it is never
successfully instantly so a delay must be added. Without added delay,
the host CPU is never scheduled to make progress setting up more scratch
space.
The value 1e9 is choosen to match the KVM quantum and hopefully give KVM
a chance to schedule an event. For reference, the driver timeout is
200ms so this is still fairly aggressive checking of the signal response.
This value is also balanced around the GPUCommandProc DPRINTF to
prevent the print in this method from overwhelming debug output.
Change-Id: I0e0e1d75cd66f7c47815b13a4bfc3c0188e16220
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61651
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
This code is unnecessary as the read index is already correct.
Furthermore, it can cause hangs in some situations where the packet
SHOULD be marked as not complete. This causes a bug where the read index
is incremented by 1 multiple times, causing the packet processor to read
an invalid packet, followed by a hang after it does nothing.
Change-Id: Iceda3c9606e018f60f8902770a2d9762c1c14304
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61650
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
This instruction appears to be the only VOP1 instruction that has a
scalar destination using VDST as the destination register number.
However, since VDST is only 8 bits it cannot encode all possible
registers. Therefore, use the opcode to determine if the destination is
a scalar or vector destination.
This issue manifests as a VGPR dest being out of range for a kernel
where the number of SGPRs is more than the number of VGPRs and the
intended SGPR dest is larger than the count of VGPRs
Change-Id: I95a7de1ddb97f7171f48331fed36aef776fa0cb4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61649
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These VHE flags are not needed anymore.
They were used to trap EL2 access to VHE only registers (like CPACR_EL12)
when VHE was disabled (hcr.e2h = 0)
With the new faulting logic, we can just introduce VHE specific
callbacks checking for the hcr.e2h bitfield and returning an undefined
instruction if VHE is disabled.
In this way we don't have to add VHE only bits to every system register
Change-Id: I07bf9a9adc7a089bd45e718fb06d88488a2b7ed5
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61678
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>