TheISA::initCPU is basically an ISA specific implementation of reset
logic on architectural state. As such, it only needs to be called if
we're not going to load a checkpoint, ie in initState.
Also, since the implementation was the same across all CPUs, this
change collapses all the individual implementations down into the base
CPU class.
Change-Id: Id68133fd7f31619c90bf7b3aad35ae20871acaa4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24189
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
The logic that determines which syscall to call was built into the
implementation of faults/exceptions or even into the instruction
decoder, but that logic can depend on what OS is being used, and
sometimes even what version, for example 32bit vs. 64bit.
This change pushes that logic up into the Process objects since those
already handle a lot of the aspects of emulating the guest OS. Instead,
the ISA or fault implementations just notify the rest of the system
that a nebulous syscall has happened, and that gets propogated upward
until the process does something with it. That's very analogous to how
a system call would work on a real machine.
When a system call happens, the low level component which detects that
should call tc->syscall(&fault), where tc is the relevant thread (or
execution) context, and fault is a Fault which can ultimately be set
by the system call implementation.
The TC implementor (probably a CPU) will then have a chance to do
whatever it needs to to handle a system call. Currently only O3 does
anything special here. That implementor will end up calling the
Process's syscall() method.
Once in Process::syscall, the process object will use it's contextual
knowledge to determine what system call is being requested. It then
calls Process::doSyscall with the right syscall number, where doSyscall
centralizes the common mechanism for actually retrieving and calling
into the system call implementation.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-187
Change-Id: I937ec1ef0576142c2a182ff33ca508d77ad0e7a1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23176
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
There is a check on a global flag denoting that the simulator
has been configured to run in fullsystem mode. The check is
conducted at runtime during calls to syscall methods.
The high-level models are checking the flag when the check
could be conducted further down the call chain (nearer to the
actual Process invocation). Moving the checks should result
in less copy-pasta as new models are developed. It might be
argued that the checks should stay in place since an error
would detected earlier; that may be true, but the error
would be the same and the simulation should fail in either
case. This arrangement requires fewer lines of code.
The changeset also changes the check into a fatal error
instead of a panic since usage (in fs mode) should result
in immediate corruption.
Change-Id: If387e27f166ac1374f3fe8b7befe3546e69adba7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23240
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This was useful when transitioning away from the CPU based
comInstEventQueue, but now that objects backing the ThreadContexts have
access to the underlying comInstEventQueue and can manipulate it
directly, they don't need to do so through a generic interface.
Getting rid of this function narrows and simplifies the interface.
Change-Id: I202d466d266551675ef6792d38c658d8a8f1cb8b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/22113
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The TLBs now create the stage 2 MMUs as children, and since those are
specialized for instruction and data, the CPU needs to use ArmITB or
ArmDTB instead of ArmTLB which is the base class without an MMU. This
was changed for the BaseCPU already, but the TLBs are added in the
checker CPU as well.
Change-Id: Ide8ce950622b40f69c37bbe2ea0d22295b76d7a6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21979
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This was initially added in 2003 and only supported in the simple CPUs.
It's oddly specific since there are no other similar event queues for,
for instance, stores, branches, system calls, etc.
Given that this seems like a historical oddity which is only partially
supported and would be very hard to support on more diverse CPU types
like KVM or fast model which don't generally have hooks for counts of
specific instruction types.
Change-Id: I29209b7ffcf896cf424b71545c9c7546f439e2b9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21780
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This queue was set up to allow triggering events based on the total
number of instructions executed at the system level, and was added in
a change which added a number of things to support McPAT. No code
checked into gem5 actually schedules an event on that queue, and no
code in McPAT (which seems to have gone dormant) either downloadable
from github or found in ext modify gem5 in a way that makes it use
the instEventQueue.
Also, the KVM CPU does not interact with the instEventQueue correctly.
While it does check the per-thread instruction event queue when
deciding how long to run, it does not check the instEventQueue. It will
poke it to run events when it stops for other reasons, but it may (and
likely will) have run beyond the point where it was supposed to stop.
Since this queue doesn't seem to actually be used for anything, isn't
being used properly in all cases anyway, and adds overhead to all the
CPU models, this change eliminates it.
Change-Id: I0e126df14788c37a6d58ca9e1bb2686b70e60d88
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21783
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Tiago Mück <tiago.muck@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This change is based on modify the way we move the AtomicOpFunctor*
through gem5 in order to mantain proper ownership of the object and
ensuring its destruction when it is no longer used.
Doing that we fix at the same time a memory leak in Request.hh
where we were assigning a new AtomicOpFunctor* without destroying the
previous one.
This change creates a new type AtomicOpFunctor_ptr as a
std::unique_ptr<AtomicOpFunctor> and move its ownership as needed. Except
for its only usage when AtomicOpFunc() is called.
Change-Id: Ic516f9d8217cb1ae1f0a19500e5da0336da9fd4f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20919
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
data_write_req byteEnable which is used in ARM SVE partial writes was not
being zeroed between writes.
As a result, non-SVE memory write instructions such as STP that followed
SVE memory write instructions could still have the write mask active.
This could lead to wrong simulation behaviour, and to an assertion failure:
src/mem/packet.hh:1211: void Packet::writeData(uint8_t*) const: Assertion
`req->getByteEnable().size() == getSize()' failed. '`
Change-Id: I74b5a82675e9923b0ffdf2c1dd9afb00c91cb204
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20448
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
No caller uses any of the MasterPort specific properties of these
function's return values, so we can instead return a reference to the
base Port class. This makes it possible for the data and inst ports
to be of any port type, not just gem5 style MasterPorts. This makes
life simpler for, for example, systemc based CPUs which might have TLM
ports.
It also makes it possible for any two CPUs which have compatible ports
to be switched between, as long as the ports they use support being
unbound. Unfortunately that does not include TLM or systemc ports which
are bound permanently.
Change-Id: I98fce5a16d2ef1af051238e929dd96d57a4ac838
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20240
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Gabe Black <gabeblack@google.com>
This changeset adds support for partial (or masked) loads/stores, i.e.
loads/stores that can disable accesses to individual bytes within the
target address range. In addition, this changeset extends the code to
crack memory accesses across most CPU models (TimingSimpleCPU still
TBD), so that arbitrarily wide memory accesses are supported. These
changes are required for supporting ISAs with wide vectors.
Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
- Tiago Muck <tiago.muck@arm.com>
Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Then cast to the ISA specific type when necessary. This removes
(mostly) an ISA specific aspect to some of the interfaces. The ISA
specific version of the kernel stats still needs to be constructed and
stored in a few places which means that kernel_stats.hh still needs to
be a switching arch header, for instance.
In the future, I'd like to make the kernel its own object like the
Process objects in SE mode, and then it would be able to instantiate
and maintain its own stats.
Change-Id: I8309d49019124f6bea1482aaea5b5b34e8c97433
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18429
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
MemObject doesn't provide anything beyond its base ClockedObject any
more, so this change removes it from most inheritance hierarchies.
Occasionally MemObject is replaced with SimObject when I was fairly
confident that the extra functionality of ClockedObject wasn't needed.
Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
The importer in Python 3 doesn't like the way we import SimObjects
from the global namespace. Convert the existing SimObject declarations
to import from m5.objects. As a side-effect, this makes these files
consistent with configuration files.
Change-Id: I11153502b430822130722839e1fa767b82a027aa
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15981
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.
Atomic memory instruction is treated as a special store instruction in
all CPU models.
In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.
In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.
In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.
This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.
This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.
Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Latest-gen. vector/SIMD extensions, including the Arm Scalable Vector
Extension (SVE), introduce the notion of a predicate register file.
This changeset adds this feature across architectures and CPU models.
Change-Id: Iebcadbad89c0a582ff8b1b70de353305db603946
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13715
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
This patch is:
* Adding a missing VecElemClass entry
* Fixing assertion in rename map which was checking the number of free
vector registers rather than free vector element registers
* Fixing assertion in read/setVecElemOperand APIs.
* Using the right register index in SimpleThread
* Using VecElem instead of VecReg on O3 readArchVecElem
Change-Id: I265320dcbe35eb47075991301dfc99333c5190c4
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15598
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
These values are all basic integers (specifically uint64_t now), and
so passing them by const & is actually less efficient since there's a
extra level of indirection and an extra value, and the same sized value
(a 64 bit pointer vs. a 64 bit int) is being passed around.
Change-Id: Ie9956b8dc4c225068ab1afaba233ec2b42b76da3
Reviewed-on: https://gem5-review.googlesource.com/c/13626
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
These types are IntReg, FloatReg, FloatRegBits, and MiscReg. There are
some remaining types, specifically the vector registers and the CCReg.
I'm less familiar with these new types of registers, and so will look
at getting rid of them at some later time.
Change-Id: Ide8f76b15c531286f61427330053b44074b8ac9b
Reviewed-on: https://gem5-review.googlesource.com/c/13624
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Summary: Usage of const DynInstPtr& when possible and introduction of
move operators to RefCountingPtr.
In many places, scoped references to dynamic instructions do a copy of
the DynInstPtr when a reference would do. This is detrimental to
performance. On top of that, in case there is a need for reference
tracking for debugging, the redundant copies make the process much more
painful than it already is.
Also, from the theoretical point of view, a function/method that
defines a convenience name to access an instruction should not be
considered an owner of the data, i.e., doing a copy and not a reference
is not justified.
On a related topic, C++11 introduces move semantics, and those are
useful when, for example, there is a class modelling a HW structure that
contains a list, and has a getHeadOfList function, to prevent doing a
copy to an internal variable -> update pointer, remove from the list ->
update pointer, return value making a copy to the assined variable ->
update pointer, destroy the returned value -> update pointer.
Change-Id: I3bb46c20ef23b6873b469fd22befb251ac44d2f6
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13105
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
The AtomicSimpleCPU used to be able to access memory directly to speed
up simulation if no caches are used. This is fine as long as no
switching between CPU models is required. In order to switch to a new
CPU model that requires caches, we currently need to checkpoint the
system and restore it into a new configuration. The new
'atomic_noncaching' memory mode provides a solution that avoids this
issue since caches are bypassed in this mode. This changeset removes
the old fastmem option from the AtomicSimpleCPU and introduces a new
CPU, NonCachingSimpleCPU, which derives from the AtomicSimpleCPU.
The NonCachingSimpleCPU uses the same mechanism as the AtomicSimpleCPU
used to use when accessing memory in when fastmem was enabled.
This changeset also introduces a new switcheroo test that tests
switching between a NonCachingSimpleCPU and a TimingSimpleCPU with
caches.
Change-Id: If01893f9b37528b14f530c11ce6f53c097582c21
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/12419
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
In TimingSimpleCPU model, when a CPU is suspended by a syscall (e.g.,
futex(FUTEX_WAIT)), the CPU waits for another CPU to wake it up
(e.g., FUTEX_WAKE operation). While staying Idle, the suspended CPU
should not try to fetch next instructions after the syscall.
This patch added a status check before a CPU schedule a fetch event
after a fault is handled.
Change-Id: I0cc953135686c9b35afe94942aa1d0b245ec60a2
Reviewed-on: https://gem5-review.googlesource.com/8181
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Brandon Potter <Brandon.Potter@amd.com>
This patch is changing the underlying type for RequestPtr from Request*
to shared_ptr<Request>. Having memory requests being managed by smart
pointers will simplify the code; it will also prevent memory leakage and
dangling pointers.
Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10996
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Starting with version 3, scons imposes using the print function instead
of the print statement in code it processes. To get things building
again, this change moves all python code within gem5 to use the
function version. Another change by another author separately made this
same change to the site_tools and site_init.py files.
Change-Id: I2de7dc3b1be756baad6f60574c47c8b7e80ea3b0
Reviewed-on: https://gem5-review.googlesource.com/8761
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Gabe Black <gabeblack@google.com>
Get rid of some remnants of a system which was intended to separate
address computation into its own instruction object.
Change-Id: I23f9ffd70fcb89a8ea5bbb934507fb00da9a0b7f
Reviewed-on: https://gem5-review.googlesource.com/7122
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Gabe Black <gabeblack@google.com>
CPUs have historically instantiated the architecture specific version
of the TLBs to avoid a virtual function call, making them a little bit
more dependent on what the current ISA is. Some simple performance
measurement, the x86 twolf regression on the atomic CPU, shows that
there isn't actually any performance benefit, and if anything the
simulator goes slightly faster (although still within margin of error)
when the TLB functions are virtual.
This change switches everything outside of the architectures themselves
to use the generic BaseTLB type, and then inside the ISA for them to
cast that to their architecture specific type to call into architecture
specific interfaces.
The ARM TLB needed the most adjustment since it was using non-standard
translation function signatures. Specifically, they all took an extra
"type" parameter which defaulted to normal, and translateTiming
returned a Fault. translateTiming actually doesn't need to return a
Fault because everywhere that consumed it just stored it into a
structure which it then deleted(?), and the fault is stored in the
Translation object when the translation is done.
A little more work is needed to fully obviate the arch/tlb.hh header,
so the TheISA::TLB type is still visible outside of the ISAs.
Specifically, the TlbEntry type is used in the generic PageTable which
lives in src/mem.
Change-Id: I51b68ee74411f9af778317eff222f9349d2ed575
Reviewed-on: https://gem5-review.googlesource.com/6921
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>