Created stat group ExecuteCPUStats in BaseCPU and moved stats from the
simple and minor cpu models.
The stats moved from SimpleCPU are dcacheStallCycles,
icacheStallCycles, numCCRegReads, numCCRegWrites, numFpAluAccesses,
numFpRegReads, numFpRegWrites, numIntAluAccesses, numIntRegReads,
numIntRegWrites, numMemRefs, numMiscRegReads, numMiscRegWrites,
numVecAluAccesses, numVecPredRegReads, numVecPredRegWrites,
numVecRegReads, numVecRegWrites.
The stat moved from MinorCPU is numDiscardedOps.
Also, ccRegfileReads, ccRegfileWrites, fpRegfileReads, fpRegfileWrites,
intRegfileReads, intRegfileWrites, miscRegfileReads, miscRegfileWrites,
vecPredRegfileReads, vecPredRegfileWrites, vecRegfileReads,
and vecRegfileWrites are removed from cpu.hh and cpu.cc in O3CPU. The
corresponding stats in BaseCPU::ExecuteCPUStats are used instead.
Changed the getReg, getWritableReg, and setReg functions in the O3 CPU
object to take the thread ID as a parameter. This is because the stats
in base are stored in vectors that are indexed by thread ID.
Change-Id: I801c5ceb4c70b7b281127569f11c6ee98f614b27
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67390
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
This summarizes a series of changes to move general Simple, Minor,
O3 CPU stats to BaseCPU. This commit focuses on moving numBranches
from SimpleCPU to the FetchCPUStats in the BaseCPU, and
numFetchSuspends from MinorCPU into FetchCPUStats. More general
information about this relation chain is below
1. Summary:
Moved general CPU stats found across Simple, Minor, and O3 CPU models
into BaseCPU through new stat groups. The stat groups are
FetchCPUStats, ExecuteCPUStats, and CommitCPUStats. Implemented the
committedControl stat vector found in MinorCPU for Simple and O3 CPU.
Implemented the numStoreInsts stat found in SimpleCPU for O3CPU. IPC
and CPI stats are now tracked at the core and thread level in BaseCPU
and are made universal for simple, minor, o3, and kvm CPUs. Duplicate
stats across the models are merged into a single stat in BaseCPU under
the same stat name. This change does not implement every general level
stat moved to BaseCPU for every model.
2. Stat API Changes
a. SimpleCPU:
statExecutedInstType vector unified into committedInstType
numCondCtrlInsts unified into committedControl::isControl
b. O3CPU:
i. Fetch Stage
branches in fetch unified into with numBranches
rate renamed to fetchRate
insts unified into with numInsts
ii. Execute Stage
Regfile stats unified into base with use of Simple's stat naming
numRefs in IEW unified into numMemRefs
numRate from IEW renamed to instRate
iii. Commit Stage
committedInsts is renamed to numInstsNotNOP
committedOps is renamed to numOpsNotNOP
instsCommitted is unified into numInsts
opsCommitted is unified into numOps
branches is unified into committedControl::isControl
floating is unified into numFpInsts
integer is unified into numIntInsts
loads is unified into numLoadInsts
memRefs is renamed to numMemRefs
vectorInstructions is unified into numVecInsts
3. Details:
Created three stat groups in BaseCPU. FetchCPUStats track statistics
related to the fetch stage. ExecuteCPUStats track statistics related
to the execute stage. CommitCPUStats track statistics related to the
commit stage.
There are three vectors in Base that store unique pointers to per
thread instances of these stat groups. The stat group pointer for
thread i is accessible at index i of one of these vectors. For example,
stat numCCRegReads of the execute stage for thread 0 can be accessed
with executeStats[0]->numCCRegReads. The stats.txt output will print the
thread ID of the stat group. For example, numVecRegReads on thread 0
of a single core prints as
"board.processor.cores.core.executeStats0.numVecRegReads".
NOTE: Multithreading in gem5 is untested. Therefore per thread stats
output in stats.txt is not currently guaranteed to be correctly
formatted.
For FetchCPUStats, the stats moved from SimpleCPU are numBranches
and numInsts. From MinorCPU, the stat moved is numFetchSuspends. From
O3CPU, the stats moved are from the O3 fetch stage: Stat branches is
unified into numBranches, stat rate is renamed to fetchRate in Base,
stat insts is unified into numInsts, stat icacheStallCycles keeps the
same name in Base.
For ExecuteCPUStats, the stats moved from SimpleCPU are
dcacheStallCycles, numCCRegReads, numCCRegWrites,
numFpAluAccesses, numFpRegReads, numFpRegWrites, numIntAluAccesses,
numIntRegReads, numIntRegWrites, numMemRefs, numMiscRegReads,
numMiscRegWrites, numVecAluAccesses, numVecPredRegReads,
numVecPredRegWrites, numVecRegReads, numVecRegWrites. The stat moved
from MinorCPU is numDiscardedOps. From O3, the Regfile stats in CPU are
unified into the reg stats in Base and use the names found originally
in SimpleCPU. From O3 IEW stage, numInsts keeps the same name in
Base, numBranches is unified into numBranches in base, numNop keeps
the same name in Base, numRefs is unified into numMemRefs in Base,
numLoadInsts and numStoreInsts are moved into Base, numRate is renamed
to instRate in base.
For CommitCPUStats, the stats moved from SimpleCPU are
numCondCtrlInsts, numFpInsts, numIntInsts, numLoadInsts, numStoreInsts,
numVecInsts. The stats moved from MinorCPU are numInsts,
committedInstType, and committedControl. statExecutedInstType of
SimpleCPU is unified with committedInstType of MinorCPU. Implemented
committedControl stats from MinorCPU in Simple and O3 CPU. In MinorCPU,
this stat was a 2D vector, where the first dimension is the thread ID.
In base it is now a 1D vector that is tied to a thread ID via the
commitStats vector that the object is accessible through. From the O3
commit stage, committedInsts is renamed to numInstsNotNOP, committedOps
is renamed to numOpsNotNOP, instsCommitted is unified into numInsts,
opsCommitted is renamed to numOps, committedInstType is unified into
committedInstType from Minor, branches is removed because it duplicates
committedControl::IsControl, floating is unified into numFpInsts,
interger is unified into numIntInsts, loads is unified into
numLoadInsts, numStoreInsts is implemented for tracking in O3, memRefs
is renamed to numMemRefs, vectorInstructions is unified into
numVecInsts. Note that numCondCtrlInsts of Simple is unified into
committedControl::IsCondCtrl.
Implemented IPC and CPI tracking inside BaseCPU.
In BaseCPU::BaseCPUStats, numInsts and numOps track per CPU core
committed instructions and operations.
In BaseCPU::FetchCPUStats, numInsts and numOps track per thread
fetched instructions and operations.
In BaseCPU::CommitCPUStats, numInsts tracks per thread executed
instructions.
In BaseCPU::CommitCPUStats, numInsts and numOps track per thread
committed instructions and operations.
In BaseSimpleCPU, the countInst() function has been split into
countInst(), countFetchInst(), and countCommitInst(). The stat count
incrementation step of countInst() has been removed and delegated to the
other two functions. countFetchInst() increments numInsts and numOps
of the FetchCPUStats group for a thread. countCommitInst() increments
the numInsts and numOps of the CommitCPUStats group for a thread and
of the BaseCPUStats group for a CPU core. These functions are called
in the appropriate stage within timing.cc and atomic.cc. The call to
countInst() is left unchanged. countFetchInst() is called in
preExecute(). countCommitInst() is called in postExecute().
For MinorCPU, only the commit level numInsts and numOps stats have been
implemented.
IPC and CPI stats have been added to BaseCPUStats (core level) and
CommitCPUStats (thread level). The formulas for the IPC and CPI stats
in CommitCPUStats are set in the BaseCPU constructor, after the
CommitCPUStats stat group object has been created. These replace IPC,
CPI, totalIpc, and totalCpi stats in O3.
Replaced committedInsts stats of KVM CPU with commitStats.numInsts
of BaseCPU. This results in IPC and CPI printing in stats.txt for
KVM simulations.
This change does not implement most general stats found in one or two
model for all others.
Jira Ticket: https://gem5.atlassian.net/browse/GEM5-1304
Change-Id: I3c852f8dba3268c71b7a3415480fb63d8dc30cb7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66031
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
The way these were set up, there would be a conflict between SimObject
files with the same name set up for different ISAs.
This change creates a single file which tries to determine how many ISAs
are enabled, and if there is exactly one, it creates a backwards
compatible alias for the ISA specific CPU types.
Change-Id: Iab358c2880d49222e814a98354c81d0f306fe1fc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52493
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
The TARGET_ISA variable would let you select one ISA from a list of
possible ISAs. That has now been replaced with USE_ARM_ISA, USE_X86_ISA,
etc, variables which are boolean on or off. That will allow any number
of ISAs to be enabled or disabled individually. Enabling something other
than exactly one of these will probably prevent you from getting a
working gem5 binary, but those problems are being addressed in other,
parallel change series.
I decided to use the USE_ prefix since it was consistent with most other
on/off variables we have in gem5. One noteable exception is the
BUILD_GPU setting which, you could convincingly argue, is a better
prefix than USE_. Another option would be to use CONFIG_, in
anticipation of using a kconfig style config mechanism in gem5.
It seemed premature to start using a CONFIG_ prefix here, and if we
decide to switch to some other prefix like BUILD_, it should be a
purposeful choice and not something somebody just starts using.
Change-Id: I90fef2835aa4712782e6c1313fbf564d0ed45538
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52491
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This makes what are configuration and what are internal SCons variables
explicit and separate, and makes it unnecessary to call out what
variables to export to C++.
These variables will also be plumbed into and out of kconfiglib in later
changes.
Change-Id: Iaf5e098d7404af06285c421dbdf8ef4171b3f001
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56892
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Rather than constantly overwriting the "zero" register to return its
value to zero, just ignore writes to it.
We assume here that the "zero" register is a standard RegVal type
register (ie not bigger than 64 bits) and is accessed as such.
Change-Id: I06029b78103019c668647569c6037ca64a4d9c76
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49709
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The BaseCPU type had been specializing itself based on the value of
TARGET_ISA, which is not compatible with building more than one ISA at a
time.
This change refactors the CPU models so that the BaseCPU is more
general, and the ISA specific components are added to the CPU when the
CPU types are fully specialized. For instance, The AtomicSimpleCPU has a
version called X86AtomicSimpleCPU which installs the X86 specific
aspects of the CPU.
This specialization is done in three ways.
1. The mmu parameter is assigned an instance of the architecture
specific MMU type. This provides a reasonable default, but also avoids
having having to use the ISA specific type when the parameter is
created.
2. The ISA specific types are made available as class attributes, and
the utility functions (including __init__!) in the BaseCPU class can
refer to them to get the types they need to set up the CPU at run time.
Because SimObjects have strange, unhelpful semantics as far as assigning
to their attributes, these types need to be set up in a non-SimObject
class, which is then brought in as a base of the actual SimObject type.
Because the metaclass of this other type is just "type", things work
like you would expect. The SimObject doesn't do any special processing
of base classes if they aren't also SimObjects, so these attributes
survive and are accessible using normal lookup in the BaseCPU class.
3. There are some methods like addCheckerCPU and properties like
needsTSO which have ISA specific values or behaviors. These are set in
the ISA specific subclass, where they are inherently specific to an ISA
and don't need to check TARGET_ISA.
Also, the DummyChecker which was set up for the BaseSimpleCPU which
doesn't actually do anything in either C++ or python was not carried
forward. The CPU type still exists, but it isn't installed in the
simple CPUs.
To provide backward compatibility, each ISA implements a .py file which
matches the original .py for a CPU, and the original is renamed with a
Base prefix. The ISA specific version creates an alias with the old CPU
name which maps to the ISA specific type. This way, old scripts which
refer to, for example, AtomicSimpleCPU, will get the X86AtomicSimpleCPU
if the x86 version was compiled in, the ArmAtomicSimpleCPU on arm, etc.
Unfortunately, because of how tags on PySource and by extension SimObjects
are implemented right now, if you set the tags on two SimObjects or
PySources which have the same module path, the later will overwrite the
former whether or not they both would be included. There are some
changes in review which would revamp this and make it work like you
would expect, without this central bookkeeping which has the conflict.
Since I can't use that here, I fell back to checking TARGET_ISA to
decide whether to tell SCons about those files at all.
In the long term, this mechanism should be revamped so that these
compatibility types are only available if there is exactly one ISA
compiled into gem5. After the configs have been updated and no longer
assume they can use AtomicSimpleCPU in all cases, then these types can
be deleted.
Also, because ISAs can now either provide subclasses for a CPU or not,
the CPU_MODELS variable has been removed, meaning the non-ISA
specialized versions of those CPU models will always be included in
gem5, except when building the NULL ISA.
In the future, a more granular config mechanism will hopefully be
implemented for *all* of gem5 and not just the CPUs, and these can be
conditional again in case you only need certain models, and want to
reduce build time or binary size by excluding the others.
Change-Id: I02fc3f645c551678ede46268bbea9f66c3f6c74b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52490
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Currently, an access to an invalid address will cause GEM5 to exit with
a `!pkt.isError()` assertion failure. I was seeing this assertion while
running a baremetal RISC-V binary that faulted before the trap vector
had been configured and therefore tried to jump to address zero. With
this change we now print the invalid address and the type of access
(ifetch/load/store/amo) which makes debugging such a problem much easier.
For example, my faulting program now prints the following:
`panic: Instruction fetch ([0:0x4]) failed: BadAddressError [0:3] IF`
I also saw this assertion with a program that was dereferencing a NULL
pointer, which now prints a more helpful message:
`panic: Data fetch ([0x10:0x11]) failed: BadAddressError [10:10]`
Change-Id: Id983b74bf4688711f47308c6c7c15f49662ac495
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55203
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This was originally intended to make it more efficient to get the
microPC without making a copy of the entire PCState object to return.
Now that the PCState is returned through a pointer without a copy and
the microPC can be accessed with an inline accessor, we don't need to
create a special accessor for it.
Change-Id: I1d354dfca6be5d954e147f23dc9d27917b379bf2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52061
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabe.black@gmail.com>
This virtual method can trivially be shared among different CPUs, making
it unnecessary to cast from a BaseCPU pointer to some more specific CPU
class. The existing similar functions which implement this functionality
are only trivially different, and can be merged into overloads of this
common method.
Noteably this method is not implemented for the MinorCPU which uses the
SimpleThread class, typedef-ed to be MinorThread. If the previous
version of this method had been called on that CPU, it would have
crashed the simulator since a dynamic_cast would have failed. This
doesn't provide an implementation for the MinorCPU, but it also doesn't
make the problem worse, and provides a way to actually implement it some
day.
Change-Id: I23399ea6bbbbabd87e6c8bf7a66d48902745d2cf
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52084
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Turn the functions within it into virtual methods on the ISA classes.
Eliminate the implementation in MIPS, which was just copy pasted from
Alpha long ago. Fix some minor style issues in ARM. Remove templating.
Switch from using an "XC" type parameter to using the ThreadContext *
installed in all ISA classes.
The ARM version of these functions actually depend on the ExecContext
delaying writes to MiscRegs to work correctly. More insiduously than
that, they also depend on the conicidental ThreadContext like
availability of certain functions like contextId and getCpuPtr which
come from the class which happened to implement the type passed into XC.
To accomodate that, those functions need both a real ThreadContext, and
another object which is either an ExecContext or a ThreadContext
depending on how the method is called.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-1053
Change-Id: I68f95f7283f831776ba76bc5481bfffd18211bc4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50087
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The CpuModel class looks like it was originally intended to hold a lot
more information that it does now, which is just the name of a CPU, and
whether it should be enabled by default. All the built in configs in
build_opts already explicitly list a set of CPUs, and CPUs should not,
generally speaking, be enabled by default, since they almost always need
to be specifically supported by an ISA. This is mutual, since usually
ISAs definitions are not quite correct or complete, and this is often
exposed by plugging them into different, and/or more complex CPU models
which stress those short comings.
This change drops the idea of a "default" CPU, and since the only thing
being tracked at this point is a list of CPUs, it turns CpuModel() into
a simple function which adds a name to a set of supported CPUs.
Change-Id: Id7475d5dc8548802b5253f79839da8c76ef4d165
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48963
Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>