This generalized Workload SimObject is not geared towards FS or SE
simulations, although currently it's only used in FS. This gets rid
of the ARM specific highestELIs64 property (from the workload, not the
system) and replaces it with a generic getArch.
The old globally accessible kernel symtab has been replaced with a
symtab accessor which takes a ThreadContext *. The parameter isn't used
for anything for now, but in cases where there might be multiple
symbol tables to choose from (kernel vs. current user space?) the
method will now be able to distinguish which to use. This also makes
it possible for the workload to manage its symbol table with whatever
policy makes sense for it.
That method returns a const SymbolTable * since most of the time the
symbol table doesn't need to be modified. In the one case where an
external entity needs to modify the table, two pseudo instructions,
the table to modify isn't necessarily the one that's currently active.
For instance, the pseudo instruction will likely execute in user space,
but might be intended to add a symbol to the kernel in case something
like a module was loaded.
To support that usage, the workload has a generic "insertSymbol" method
which will insert the symbol in the table that "makes sense". There is
a lot of ambiguity what that means, but it's no less ambiguous than
today where we're only saved by the fact that there is generally only
one active symbol table to worry about.
This change also introduces a KernelWorkload SimObject class which
inherits from Workload and adds in kernel related members for cases
where the kernel is specified in the config and loaded by gem5 itself.
That's the common case, but the base Workload class would be used
directly when, for instance, doing a baremetal simulation or if the
kernel is loaded by software within the simulation as is the case for
SPARC FS.
Because a given architecture specific workload class needs to inherit
from either Workload or KernelWorkload, this change removes the
ability to boot ARM without a kernel. This ability should be restored
in the future.
To make having or not having a kernel more flexible, the kernel
specific members of the KernelWorkload should be factored out into
their own object which can then be attached to a workload through a
(potentially unused) property rather than inheritance.
Change-Id: Idf72615260266d7b4478d20d4035ed5a1e7aa241
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24283
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This is another fix for the AArch32-AArch64 interprocessing issue
introduced in
3d15150d cpu, arch, arch-arm: Wire unused VecElem code in the O3 model.
Register mapping between AArch32 and AArch64 is explicitly defined in
ARMv8 manual. This allows software to read registers right after a state
switch without writing them first, and it is indeed common for software
to save registers to memory first before using them.
In gem5's implementation of vector mode switching, however, vectors may
not be marked as ready right after a state switch. Software reads toward
vectors at this time will stall O3CPU forever. This patch fixes this by
marking all mapped vectors (or vector elements, depending on AArch32 or
AArch64) as ready right after switching vector mode.
Change-Id: I609552c543dad8da66939c0a3079d73d48e92163
Signed-off-by: Hsuan Hsu <hsuan.hsu@mediatek.com>
Signed-off-by: Howard Wang <Howard.Wang@mediatek.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26203
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Information about what kernel to load and how to load it was built
into the System object and its subclasses. That overloaded the System
object and made it responsible for too many things, and also was
somewhat awkward when working with SE mode which doesn't have a kernel.
This change extracts the kernel and information related to it from the
System object and puts into into a OsKernel or Workload object.
Currently the idea of a "Workload" to run and a kernel are a bit
muddled, an unfortunate carry-over from the original code. It's also an
implication of trying not to make too sweeping of a change, and to
minimize the number of times configs need to change, ie avoiding
creating a "kernel" parameter which would shortly thereafter be
renamed to "workload".
In future changes, the ideas of a kernel and a workload will be
disentangled, and workloads will be expanded to include emulated
operating systems which shephard and contain Process-es for syscall
emulation.
This change was originally split into pieces to make reviewing it
easier. Those reviews are here:
https: //gem5-review.googlesource.com/c/public/gem5/+/22243
https: //gem5-review.googlesource.com/c/public/gem5/+/24144
https: //gem5-review.googlesource.com/c/public/gem5/+/24145
https: //gem5-review.googlesource.com/c/public/gem5/+/24146
https: //gem5-review.googlesource.com/c/public/gem5/+/24147
https: //gem5-review.googlesource.com/c/public/gem5/+/24286
Change-Id: Ia3d863db276a023b6a2c7ee7a656d8142ff75589
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26466
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The new local access mechanism installs a callback in the request which
implements what the mmapped IPR was doing. That avoids having to have
stubs in ISAs that don't have mmapped IPRs, avoids having to encode
what to do to communicate from the TLB and the mmapped IPR functions,
and gets rid of another global ISA interface function and header files.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-187
Change-Id: I772c2ae2ca3830a4486919ce9804560c0f2d596a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23188
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch fixes the AArch32-AArch64 interprocessing issue introduced in
3d15150d cpu, arch, arch-arm: Wire unused VecElem code in the O3 model.
When O3CPU switches vector renaming mode, architectural-physical mapping
and physical free list are switched in the following way so that content
of vectors has no change from software view:
Case 1. Full mode -> Elem mode (AArch64 -> AArch32):
1.1. Split vector-vector mapping into element-element mapping.
1.2. Split vectors in free list into elements.
Case 2. Elem mode -> Full mode (AArch32 -> AArch64):
2.1. Move content of all N*M mapped physical elements to first N*M
physical elements in architectural order (N = number of
architectural vectors, M = number of elements per vector).
2.2. Map N architectural vectors to first N physical vectors (i.e.
initial mapping in full mode).
2.3. Place remaining physical vectors in free list (i.e. initial free
list in full mode).
Previous gem5 revision misses step 2.2 when AArch32->AArch64 switch.
The wrong mapping will lead to the situation in which a physical vector
is assigned twice to a same architectural vector without being freed.
Once this occurs, the physical vector will not be freed anymore, since
it is treated as a special register (e.g. zero or misc) by O3CPU's
renaming logic. Eventually O3CPU will either stall forever when all
physical vectors get stuck, or trigger the panic condition "The free
list has lost vector registers" when AArch64->AArch32 switch. This patch
adds the missing step and fixes the issue.
Change-Id: I32233635c28763260bcbb776b52ed198a9abace9
Signed-off-by: Hsuan Hsu <hsuan.hsu@mediatek.com>
Signed-off-by: Howard Wang <Howard.Wang@mediatek.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25743
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This is mostly only a superficial change since the isa parameter is
then dynamic cast to the ISA specific version inside the various
consumers, currently the SimpleThread, O3CPU and Decoder classes. If
those aren't being used, for instance in the fast model CPUs, then you
can use a different ISA implementation without any type clashes.
Change-Id: I2226ef60f9a471ae51b8bfce8683033f7854197a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25009
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
TheISA::initCPU is basically an ISA specific implementation of reset
logic on architectural state. As such, it only needs to be called if
we're not going to load a checkpoint, ie in initState.
Also, since the implementation was the same across all CPUs, this
change collapses all the individual implementations down into the base
CPU class.
Change-Id: Id68133fd7f31619c90bf7b3aad35ae20871acaa4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24189
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
The logic that determines which syscall to call was built into the
implementation of faults/exceptions or even into the instruction
decoder, but that logic can depend on what OS is being used, and
sometimes even what version, for example 32bit vs. 64bit.
This change pushes that logic up into the Process objects since those
already handle a lot of the aspects of emulating the guest OS. Instead,
the ISA or fault implementations just notify the rest of the system
that a nebulous syscall has happened, and that gets propogated upward
until the process does something with it. That's very analogous to how
a system call would work on a real machine.
When a system call happens, the low level component which detects that
should call tc->syscall(&fault), where tc is the relevant thread (or
execution) context, and fault is a Fault which can ultimately be set
by the system call implementation.
The TC implementor (probably a CPU) will then have a chance to do
whatever it needs to to handle a system call. Currently only O3 does
anything special here. That implementor will end up calling the
Process's syscall() method.
Once in Process::syscall, the process object will use it's contextual
knowledge to determine what system call is being requested. It then
calls Process::doSyscall with the right syscall number, where doSyscall
centralizes the common mechanism for actually retrieving and calling
into the system call implementation.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-187
Change-Id: I937ec1ef0576142c2a182ff33ca508d77ad0e7a1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23176
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
There is a check on a global flag denoting that the simulator
has been configured to run in fullsystem mode. The check is
conducted at runtime during calls to syscall methods.
The high-level models are checking the flag when the check
could be conducted further down the call chain (nearer to the
actual Process invocation). Moving the checks should result
in less copy-pasta as new models are developed. It might be
argued that the checks should stay in place since an error
would detected earlier; that may be true, but the error
would be the same and the simulation should fail in either
case. This arrangement requires fewer lines of code.
The changeset also changes the check into a fatal error
instead of a panic since usage (in fs mode) should result
in immediate corruption.
Change-Id: If387e27f166ac1374f3fe8b7befe3546e69adba7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23240
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This was useful when transitioning away from the CPU based
comInstEventQueue, but now that objects backing the ThreadContexts have
access to the underlying comInstEventQueue and can manipulate it
directly, they don't need to do so through a generic interface.
Getting rid of this function narrows and simplifies the interface.
Change-Id: I202d466d266551675ef6792d38c658d8a8f1cb8b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/22113
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This switches to letting the ThreadContexts use a thread based/local
comInstEventQueue instead of falling back to the CPU's array. Because
the implementation is no longer shared and it's not given where the
comInstEventQueue (or other implementation) should be accessed, the
default implementation has been removed.
Also, because nobody is using the CPU's array of event queues, those
have been removed.
Change-Id: I515e6e00a2174067a928c33ef832bc5c840bdf7f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/22110
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The TLBs now create the stage 2 MMUs as children, and since those are
specialized for instruction and data, the CPU needs to use ArmITB or
ArmDTB instead of ArmTLB which is the base class without an MMU. This
was changed for the BaseCPU and SimpleCPU checker already, but the TLBs
are added in the O3 checker CPU as well.
Change-Id: I498f247f376c8721fb70ce26c0f1b0815b12fe2d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/22039
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This was initially added in 2003 and only supported in the simple CPUs.
It's oddly specific since there are no other similar event queues for,
for instance, stores, branches, system calls, etc.
Given that this seems like a historical oddity which is only partially
supported and would be very hard to support on more diverse CPU types
like KVM or fast model which don't generally have hooks for counts of
specific instruction types.
Change-Id: I29209b7ffcf896cf424b71545c9c7546f439e2b9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21780
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This queue was set up to allow triggering events based on the total
number of instructions executed at the system level, and was added in
a change which added a number of things to support McPAT. No code
checked into gem5 actually schedules an event on that queue, and no
code in McPAT (which seems to have gone dormant) either downloadable
from github or found in ext modify gem5 in a way that makes it use
the instEventQueue.
Also, the KVM CPU does not interact with the instEventQueue correctly.
While it does check the per-thread instruction event queue when
deciding how long to run, it does not check the instEventQueue. It will
poke it to run events when it stops for other reasons, but it may (and
likely will) have run beyond the point where it was supposed to stop.
Since this queue doesn't seem to actually be used for anything, isn't
being used properly in all cases anyway, and adds overhead to all the
CPU models, this change eliminates it.
Change-Id: I0e126df14788c37a6d58ca9e1bb2686b70e60d88
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21783
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Tiago Mück <tiago.muck@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This change is based on modify the way we move the AtomicOpFunctor*
through gem5 in order to mantain proper ownership of the object and
ensuring its destruction when it is no longer used.
Doing that we fix at the same time a memory leak in Request.hh
where we were assigning a new AtomicOpFunctor* without destroying the
previous one.
This change creates a new type AtomicOpFunctor_ptr as a
std::unique_ptr<AtomicOpFunctor> and move its ownership as needed. Except
for its only usage when AtomicOpFunc() is called.
Change-Id: Ic516f9d8217cb1ae1f0a19500e5da0336da9fd4f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20919
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
No caller uses any of the MasterPort specific properties of these
function's return values, so we can instead return a reference to the
base Port class. This makes it possible for the data and inst ports
to be of any port type, not just gem5 style MasterPorts. This makes
life simpler for, for example, systemc based CPUs which might have TLM
ports.
It also makes it possible for any two CPUs which have compatible ports
to be switched between, as long as the ports they use support being
unbound. Unfortunately that does not include TLM or systemc ports which
are bound permanently.
Change-Id: I98fce5a16d2ef1af051238e929dd96d57a4ac838
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20240
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Gabe Black <gabeblack@google.com>
Fix problem with O3 and AMO instructions. At initial stages amo
instruction is considered a type of non-speculative store. After
the instruction has been commited and during the squash step,
acquire_release version of the AMO operation is considered speculative,
that differents results in an assert fault. This fix ensures that AMO
instructions are always considered non-speculative, during early stages
and during squas/removal of the instruction.
Change-Id: Ia0c5fbb9dc44a9991337b57eb759b1ed08e4149e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19815
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The assert() in the LSQ writeback() only allowed ReExec faults.
However, a SplitRequest which completed the translation in
PartialFault state (i.e. any but the very first cacheline
translation failed) may end up here. The assert() condition is
extended accordingly.
The patch also removes the superfluous/unused Complete/Squashed
states from the LSQ request. (The completion of the request is
recorded in the flags still.)
Change-Id: Ie575f4d3b4d5295585828ad8c7d3f4c7c1fe15d0
Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19174
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Reset the fault status always before translation is initiated in
pushRequest() in the LSQ. This avoids the problem when a strictly
ordered load needs to be re-executed multiple times. If the
translation is delayed at one of those attempts then the
internal panicFault (from the previous execution attempt) can get
fired at commit.
Change-Id: I0c22b2f7afd6e2cb00bc359a4a01042efd2d01d2
Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19388
Reviewed-by: Ciro Santilli <ciro.santilli@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch adds support for pinning registers for a certain number of
consecutive writes. This is only relevant for timing CPU models
(functional-only models are unaffected), and it is primarily needed to
provide a realistic execution model for micro-coded operations whose
microops can write to non-overlapping portions of a destination
register, e.g. vector gather loads. In those cases, this mechanism
can disable renaming for a sequence of consecutive writes, thus making
the resulting execution more efficient: allocating a new physical
register for each microop would introduce a read-modify-write chain of
dependencies, while with these modifications the microops can write
back in parallel.
Please note that this new feature is only leveraged by O3CPU for the
time being.
Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
Change-Id: I07eb5fdbd1fa0b748c9bdc1174d9f330fda34f81
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13520
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>