In most cases, the microcode ROM doesn't actually do anything. The
structural existence of a microcode ROM doesn't make sense in the
general case, and in architectures that know they have one and need to
interact with it, they can cast their decoder into an arch specific type
and access the ROM that way.
Change-Id: I25b67bfe65df1fdb84eb5bc894cfcb83da1ce64b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32898
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This code was at least a little Alpha specific, and now that Alpha is
gone it can no longer be compiled. We could either fix it up to work
with other/all ISAs or delete it, and the consensus was to delete it. It
could potentially be revived in the future by retrieving it from version
control.
Change-Id: Ied073f2b9b166951ecba3442cd762eb19bc690b3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32954
Reviewed-by: Steve Reinhardt <stever@gmail.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This parameter is associated with a periodic event which would take a
sample for a kernel profile in FS mode. Unfortunately the only ISA which
had working versions of the necessary classes was alpha, and that has
been deleted. That means that without additional work for any given ISA,
the profile parameter has no chance of working.
Ideally, this parameter should be moved to the Workload classes. There
it can intrinsically be tied to a particular kernel, rather than having
to assume a particular kernel and gate everything on whether you're in
FS mode.
Because this isn't (IMHO) where this parameter should live in the long
term, and because it's currently unusable without additional development
for each of the ISAs, I think it makes the most sense to remove the
front end for this mechanism from the CPU.
Since the sampling/profiling mechanism itself could be useful and could
be re-plumbed somewhere else, the back end and its classes are left alone.
Change-Id: I2a3319c1d5ad0ef8c99f5d35953b93c51b2a8a0b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32214
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch sets the faulting flag in atomic, timing, minor and o3 CPU
models.
It also fixes the minor/timing CPU models which were not respecting the
ExecFaulting flag. This is now checked before calling dump() on the
tracing object, to bring it in line with the other CPU models.
Change-Id: I9c7b64cc5605596eb7fcf25fdecaeac5c4b5e3d7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30135
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The ThreadContext can be used to access the cpu if needed, and is a
more representative interface to various pieces of state than the CPU
itself. Also convert some of the methods in Interupts to use the
locally stored ThreadContext pointer instead of taking one as an
argument. This makes calling those methods simpler and less error
prone.
Change-Id: I740bd99f92e54e052a618a4ae2927ea1c4ece193
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28988
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The version of vtophys which didn't take a ThreadContext had only been
implemented on Alpha which has since been removed, so this version of
the function was completely unimplemented and never used.
This change also gets rid of the dbg_vtophys which was sometimes
implemented but also never used, and takes the opportunity to fix up
some style problems in some of the vtophys arch files.
Change-Id: Ie10f881f8ce08c7188e71805357cf3264be4c81a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26224
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
TheISA::initCPU is basically an ISA specific implementation of reset
logic on architectural state. As such, it only needs to be called if
we're not going to load a checkpoint, ie in initState.
Also, since the implementation was the same across all CPUs, this
change collapses all the individual implementations down into the base
CPU class.
Change-Id: Id68133fd7f31619c90bf7b3aad35ae20871acaa4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24189
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
This was useful when transitioning away from the CPU based
comInstEventQueue, but now that objects backing the ThreadContexts have
access to the underlying comInstEventQueue and can manipulate it
directly, they don't need to do so through a generic interface.
Getting rid of this function narrows and simplifies the interface.
Change-Id: I202d466d266551675ef6792d38c658d8a8f1cb8b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/22113
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This was initially added in 2003 and only supported in the simple CPUs.
It's oddly specific since there are no other similar event queues for,
for instance, stores, branches, system calls, etc.
Given that this seems like a historical oddity which is only partially
supported and would be very hard to support on more diverse CPU types
like KVM or fast model which don't generally have hooks for counts of
specific instruction types.
Change-Id: I29209b7ffcf896cf424b71545c9c7546f439e2b9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21780
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This queue was set up to allow triggering events based on the total
number of instructions executed at the system level, and was added in
a change which added a number of things to support McPAT. No code
checked into gem5 actually schedules an event on that queue, and no
code in McPAT (which seems to have gone dormant) either downloadable
from github or found in ext modify gem5 in a way that makes it use
the instEventQueue.
Also, the KVM CPU does not interact with the instEventQueue correctly.
While it does check the per-thread instruction event queue when
deciding how long to run, it does not check the instEventQueue. It will
poke it to run events when it stops for other reasons, but it may (and
likely will) have run beyond the point where it was supposed to stop.
Since this queue doesn't seem to actually be used for anything, isn't
being used properly in all cases anyway, and adds overhead to all the
CPU models, this change eliminates it.
Change-Id: I0e126df14788c37a6d58ca9e1bb2686b70e60d88
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21783
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Tiago Mück <tiago.muck@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This changeset adds support for partial (or masked) loads/stores, i.e.
loads/stores that can disable accesses to individual bytes within the
target address range. In addition, this changeset extends the code to
crack memory accesses across most CPU models (TimingSimpleCPU still
TBD), so that arbitrarily wide memory accesses are supported. These
changes are required for supporting ISAs with wide vectors.
Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
- Tiago Muck <tiago.muck@arm.com>
Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Then cast to the ISA specific type when necessary. This removes
(mostly) an ISA specific aspect to some of the interfaces. The ISA
specific version of the kernel stats still needs to be constructed and
stored in a few places which means that kernel_stats.hh still needs to
be a switching arch header, for instance.
In the future, I'd like to make the kernel its own object like the
Process objects in SE mode, and then it would be able to instantiate
and maintain its own stats.
Change-Id: I8309d49019124f6bea1482aaea5b5b34e8c97433
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18429
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
MemObject doesn't provide anything beyond its base ClockedObject any
more, so this change removes it from most inheritance hierarchies.
Occasionally MemObject is replaced with SimObject when I was fairly
confident that the extra functionality of ClockedObject wasn't needed.
Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.
Atomic memory instruction is treated as a special store instruction in
all CPU models.
In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.
In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.
In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.
This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.
This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.
Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
This patch is changing the underlying type for RequestPtr from Request*
to shared_ptr<Request>. Having memory requests being managed by smart
pointers will simplify the code; it will also prevent memory leakage and
dangling pointers.
Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10996
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
CPUs have historically instantiated the architecture specific version
of the TLBs to avoid a virtual function call, making them a little bit
more dependent on what the current ISA is. Some simple performance
measurement, the x86 twolf regression on the atomic CPU, shows that
there isn't actually any performance benefit, and if anything the
simulator goes slightly faster (although still within margin of error)
when the TLB functions are virtual.
This change switches everything outside of the architectures themselves
to use the generic BaseTLB type, and then inside the ISA for them to
cast that to their architecture specific type to call into architecture
specific interfaces.
The ARM TLB needed the most adjustment since it was using non-standard
translation function signatures. Specifically, they all took an extra
"type" parameter which defaulted to normal, and translateTiming
returned a Fault. translateTiming actually doesn't need to return a
Fault because everywhere that consumed it just stored it into a
structure which it then deleted(?), and the fault is stored in the
Translation object when the translation is done.
A little more work is needed to fully obviate the arch/tlb.hh header,
so the TheISA::TLB type is still visible outside of the ISAs.
Specifically, the TlbEntry type is used in the generic PageTable which
lives in src/mem.
Change-Id: I51b68ee74411f9af778317eff222f9349d2ed575
Reviewed-on: https://gem5-review.googlesource.com/6921
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Print faulting instruction for unmapped address panic in faults.cc
and print extra info about corresponding fetched PC in base.cc.
Change-Id: Id9e15d3e88df2ad6b809fb3cf9f6ae97e9e97e0f
Reviewed-on: https://gem5-review.googlesource.com/6461
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
These files aren't a collection of miscellaneous stuff, they're the
definition of the Logger interface, and a few utility macros for
calling into that interface (panic, warn, etc.).
Change-Id: I84267ac3f45896a83c0ef027f8f19c5e9a5667d1
Reviewed-on: https://gem5-review.googlesource.com/6226
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Move the code responsible for performing the actual probe point notify
into BaseCPU. Use BaseCPU activateContext and suspendContext to keep
track of sleep cycles. Create a probe point (ppActiveCycles) that does
not count cycles where the processor was asleep. Rename ppCycles
to ppAllCycles to reflect its nature.
Change-Id: I1907ddd07d0ff9f2ef22cc9f61f5f46c630c9d66
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/5762
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reiley's update :) of the isa parser definitions. My addition of the
vector element operand concept for the ISA parser. Nathanael's modification
creating a hierarchy between vector registers and its constituencies to the
isa parser.
Some fixes/updates on top to consider instructions as vectors instead of
floating when they use the VectorRF. Some counters added to all the
models to keep faithful counts.
Change-Id: Id8f162a525240dfd7ba884c5a4d9fa69f4050101
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2706
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
This patch adds some more functionality to the cpu model and the arch to
interface with the vector register file.
This change consists mainly of augmenting ThreadContexts and ExecContexts
with calls to get/set full vectors, underlying microarchitectural elements
or lanes. Those are meant to interface with the vector register file. All
classes that implement this interface also get an appropriate implementation.
This requires implementing the vector register file for the different
models using the VecRegContainer class.
This change set also updates the Result abstraction to contemplate the
possibility of having a vector as result.
The changes also affect how the remote_gdb connection works.
There are some (nasty) side effects, such as the need to define dummy
numPhysVecRegs parameter values for architectures that do not implement
vector extensions.
Nathanael Premillieu's work with an increasing number of fixes and
improvements of mine.
Change-Id: Iee65f4e8b03abfe1e94e6940a51b68d0977fd5bb
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
[ Fix RISCV build issues and CC reg free list initialisation ]
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2705
Adds SMT support to the "simple" CPU models so that they can be
used with other SMT-supported CPUs. Example usage: this enables
the TimingSimpleCPU to be used to warmup caches before swapping to
detailed mode with the in-order or out-of-order based CPU models.
Objects that are can be serialized are supposed to inherit from the
Serializable class. This class is meant to provide a unified API for
such objects. However, so far it has mainly been used by SimObjects
due to some fundamental design limitations. This changeset redesigns
to the serialization interface to make it more generic and hide the
underlying checkpoint storage. Specifically:
* Add a set of APIs to serialize into a subsection of the current
object. Previously, objects that needed this functionality would
use ad-hoc solutions using nameOut() and section name
generation. In the new world, an object that implements the
interface has the methods serializeSection() and
unserializeSection() that serialize into a named /subsection/ of
the current object. Calling serialize() serializes an object into
the current section.
* Move the name() method from Serializable to SimObject as it is no
longer needed for serialization. The fully qualified section name
is generated by the main serialization code on the fly as objects
serialize sub-objects.
* Add a scoped ScopedCheckpointSection helper class. Some objects
need to serialize data structures, that are not deriving from
Serializable, into subsections. Previously, this was done using
nameOut() and manual section name generation. To simplify this,
this changeset introduces a ScopedCheckpointSection() helper
class. When this class is instantiated, it adds a new /subsection/
and subsequent serialization calls during the lifetime of this
helper class happen inside this section (or a subsection in case
of nested sections).
* The serialize() call is now const which prevents accidental state
manipulation during serialization. Objects that rely on modifying
state can use the serializeOld() call instead. The default
implementation simply calls serialize(). Note: The old-style calls
need to be explicitly called using the
serializeOld()/serializeSectionOld() style APIs. These are used by
default when serializing SimObjects.
* Both the input and output checkpoints now use their own named
types. This hides underlying checkpoint implementation from
objects that need checkpointing and makes it easier to change the
underlying checkpoint storage code.
Another churn to clean up undefined behaviour, mostly ARM, but some
parts also touching the generic part of the code base.
Most of the fixes are simply ensuring that proper intialisation. One
of the more subtle changes is the return type of the sign-extension,
which is changed to uint64_t. This is to avoid shifting negative
values (undefined behaviour) in the ISA code.
Mwait works as follows:
1. A cpu monitors an address of interest (monitor instruction)
2. A cpu calls mwait - this loads the cache line into that cpu's cache.
3. The cpu goes to sleep.
4. When another processor requests write permission for the line, it is
evicted from the sleeping cpu's cache. This eviction is forwarded to the
sleeping cpu, which then wakes up.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
This changeset adds probe points that can be used to implement PMU
counters for CPU stats. The following probes are supported:
* BaseCPU::ppCycles / Cycles
* BaseCPU::ppRetiredInsts / RetiredInsts
* BaseCPU::ppRetiredLoads / RetiredLoads
* BaseCPU::ppRetiredStores / RetiredStores
* BaseCPU::ppRetiredBranches RetiredBranches