Latest-gen. vector/SIMD extensions, including the Arm Scalable Vector
Extension (SVE), introduce the notion of a predicate register file.
This changeset adds this feature across architectures and CPU models.
Change-Id: Iebcadbad89c0a582ff8b1b70de353305db603946
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13715
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
VecElem code had been introduced in order to simulate change of renaming
for vector registers. Most of the work is happening on the rename_map
switchRenameMode. Change of renaming can happen after a squash in the
pipeline.
This patch is also changing the interface to the ISA part so that
a PCState is used instead of ISA in order to check if rename mode
has changed.
Change-Id: I8af795d771b958e0a0d459abfeceff5f16b4b5d4
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15601
This patch is:
* Adding a missing VecElemClass entry
* Fixing assertion in rename map which was checking the number of free
vector registers rather than free vector element registers
* Fixing assertion in read/setVecElemOperand APIs.
* Using the right register index in SimpleThread
* Using VecElem instead of VecReg on O3 readArchVecElem
Change-Id: I265320dcbe35eb47075991301dfc99333c5190c4
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15598
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
This patch is:
* Increasing the number of bits in the Scoreboard so that
it is keeping track of VecElemClass dependencies.
* Fixing VecElemClass entry in the scoreboard table so that it
correctly uses flatIndex rather than index.
Change-Id: Ie4877e5fe410b1437447adebbe289602a443f7c0
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15597
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
This patch does a large modification of the LSQ in the O3 model. The
main goal of the patch is to remove the 'an operation can be served with
one or two memory requests' assumption that is present in the LSQ
and the instruction with the req, reqLow, reqHigh triplet, and
generalising it to operations that can be addressed with one request,
and operations that require many requests, embodied in the
SingleDataRequest and the SplitDataRequest.
This modification has been done mimicking the minor model to an extent,
shifting the responsibilities of dealing with VtoP translation and
tracking the status and resources from the DynInst to the LSQ via the
LSQRequest. The LSQRequest models the information concerning the
operation, handles the creation of fragments for translation and request
as well as assembling/splitting the data accordingly.
With this modifications, the implementation of vector ISAs, particularly
on the memory side, become more rich, as the new model permits a
dissociation of the ISA characteristics as vector length, from the
microarchitectural characteristics that govern how contiguous loads are
executing, allowing exploration of different LSQ to DL1 bus widths to
understand the tradeoffs in complexity and performance.
Part of the complexities introduced stem from the fact that gem5 keeps a
large amount of metadata regarding, in particular, memory operations,
thus, when an instruction is squashed while some operation as TLB lookup
or cache access is ongoing, when the relevant structure communicates to
the LSQ that the operation is over, it tries to access some pieces of
data that should have died when the instruction is squashed, leading to
asserts, panics, or memory corruption. To ensure the correct behaviour,
the LSQRequest rely on assesing who is their owner, and self-destroying
if they detect their owner is done with the request, and there will be
no subsequent action. For example, in the case of an instruction
squashed whal the TLB is doing a walk to serve the translation, when the
translation is served by the TLB, the LSQRequest detects that the
instruction was squashed, and as the translation is done, no one else
expect to access its information, and therefore, it self-destructs.
Having destroyed the LSQRequest earlier, would lead to wrong behaviour
as the TLB walk may access some fields of it.
Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
Change-Id: I9578a1a3f6b899c390cdd886856a24db68ff7d0c
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13516
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
These values are all basic integers (specifically uint64_t now), and
so passing them by const & is actually less efficient since there's a
extra level of indirection and an extra value, and the same sized value
(a 64 bit pointer vs. a 64 bit int) is being passed around.
Change-Id: Ie9956b8dc4c225068ab1afaba233ec2b42b76da3
Reviewed-on: https://gem5-review.googlesource.com/c/13626
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
The smtCommitPolicy is a parameter in the o3 cpu that can have 3
different values. Previously this setting was done through a string
and a parser function would turn it into a c++ enum value. This
changeset turns the string into a python Param.ScopedEnum.
Change-Id: I3625f2c08a1ae0c3b0dce7a641c6ae1ce3fd79a5
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15400
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
The smtROBPolicy is a parameter in the o3 cpu that can have 3
different values. Previously this setting was done through a string
and a parser function would turn it into a c++ enum value. This
changeset turns the string into a python Param.ScopedEnum.
Change-Id: Ie104d055dbbc6e44997ae0c1470de714239be5a3
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15399
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
The smtIQPolicy is a parameter in the o3 cpu that can have 3
different values. Previously this setting was done through a string
and a parser function would turn it into a c++ enum value. This
changeset turns the string into a python Param.ScopedEnum.
Change-Id: Ieecf0a19427dd250b0d5ae3d531ab46a37326ae5
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15398
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
The smtLSQPolicy is a parameter in the o3 cpu that can have 3
different values. Previously this setting was done through a string
and a parser function would turn it into a c++ enum value. This
changeset turns the string into a python Param.ScopedEnum.
Change-Id: I82041b88bd914c5dc660058d9e3998e3114e7c35
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15397
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
The smtFetchPolicy is a parameter in the o3 cpu that can have 5
different values. Previously this setting was done through a string
and a parser function would turn it into a c++ enum value. This
changeset turns the string into a python Param.ScopedEnum.
Change-Id: Iafb4b4b27587541185ea912e5ed581bce09695f5
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15396
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
These types are IntReg, FloatReg, FloatRegBits, and MiscReg. There are
some remaining types, specifically the vector registers and the CCReg.
I'm less familiar with these new types of registers, and so will look
at getting rid of them at some later time.
Change-Id: Ide8f76b15c531286f61427330053b44074b8ac9b
Reviewed-on: https://gem5-review.googlesource.com/c/13624
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Change 9af1214 added a new ctor to the LSQUnit, however
there is a typo/bug because it sizes the SQEntries
member variable to lqEntries + 1, as opposed to
sqEntries + 1. This change corrects the issue by
using sqEntries.
Change-Id: I19dfaa5c0e335bd7b84343a92034147d7c5d914e
Reviewed-on: https://gem5-review.googlesource.com/c/15015
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Derived classes with virtual functions need to define a virtual
destructor or a protected destructor otherwise calling the base class
destructor has undefined behavior. This change adds a virtual
distructor in the base class.
Change-Id: I1c855aa56dff6585ff99b9147bdb4eb9729a0a53
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/14815
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
This patch changes two members from being raw pointers to being STL
containers. The reason behind, other than cleanlyness and arguable OO
best practices is that containers have more intronspections capabilities
than naked pointers do, as the size is known.
Using STL containers adds little overhead and eases the automation of
process during debugging (gdb).
Change-Id: I4d9d3eedafa8b5e50ac512ea93b458a4200229f2
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13126
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
The value that is not initialized has a bogus value that manifests when
using some debug-flags what makes the usage of tracediff a bit more
challenging.
In addition, while debugging with other techniques, it introduces the
problem of understanding if the value of a field is 'intended' or just
an effect of the lack of initialisation.
Change-Id: Ied88caa77479c6f1d5166d80d1a1a057503cb106
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13125
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Neither assert(0) nor assert(false) give any hint as to why control
getting to them is bad, and their more descriptive versions,
assert(0 && "description") and assert(false && "description"), jury
rig assert to add an error message when the utility function panic()
already does that directly with better formatting options.
This change replaces that flavor of call to assert with panic, except
in the actual code which processes the formatting that panic uses (to
avoid infinitely recurring error handling), and in some *.sm files
since I don't know what rules those have to follow and don't want to
accidentaly break them.
Change-Id: I8addfbfaf77eaed94ec8191f2ae4efb477cefdd0
Reviewed-on: https://gem5-review.googlesource.com/c/14636
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
This includes TAGE tag sizes, TAGE table sizes, U counters reset period,
loop predictor associativity, path history size, the USE_ALT_ON_NA size
and the WITHLOOP size
Change-Id: I935823f0a5794f5d55b744263798897a813dc1bd
Signed-off-by: Pau Cabre <pau.cabre@metempsy.com>
Reviewed-on: https://gem5-review.googlesource.com/c/14417
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Increased to 2 bits of useful counter per TAGE entry as described in the
LTAGE paper (and made the size configurable)
Changed how the useful counters are incremented/decremented as described
in the LTAGE paper
Change-Id: I8c692cc7c180d29897cb77781681ff498a1d16c8
Signed-off-by: Pau Cabre <pau.cabre@metempsy.com>
Reviewed-on: https://gem5-review.googlesource.com/c/14215
Reviewed-by: Ilias Vougioukas <ilias.vougioukas@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Fixed the following fields of the loop predictor entries as described on
the LTAGE paper:
- Age counter (it was 3 bits and it should be 8 bits)
- Tag (it was 16 bits and it should be 14 bits). Also some times it used
int variables and some times uint16_t, leading to wrong behaviour
- Confidence counter (it was 2 bits ins some parts of the code and 3 bits
in some other parts. It should be 2 bits)
- Iteration counters (they were 16 bits and they should be 14 bits)
All the new sizes are now configurable
Change-Id: I8884c7454c1e510b65160eb4d5749d3259d34096
Signed-off-by: Pau Cabre <pau.cabre@metempsy.com>
Reviewed-on: https://gem5-review.googlesource.com/c/14216
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Summary: Usage of const DynInstPtr& when possible and introduction of
move operators to RefCountingPtr.
In many places, scoped references to dynamic instructions do a copy of
the DynInstPtr when a reference would do. This is detrimental to
performance. On top of that, in case there is a need for reference
tracking for debugging, the redundant copies make the process much more
painful than it already is.
Also, from the theoretical point of view, a function/method that
defines a convenience name to access an instruction should not be
considered an owner of the data, i.e., doing a copy and not a reference
is not justified.
On a related topic, C++11 introduces move semantics, and those are
useful when, for example, there is a class modelling a HW structure that
contains a list, and has a getHeadOfList function, to prevent doing a
copy to an internal variable -> update pointer, remove from the list ->
update pointer, return value making a copy to the assined variable ->
update pointer, destroy the returned value -> update pointer.
Change-Id: I3bb46c20ef23b6873b469fd22befb251ac44d2f6
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13105
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Sometimes it's easier to debug gem5 built with ASan enabled. This CL fixes
some build error when using --with-asan.
Bug: None
Test: ./scripts/build_gem5 --with-asan --with-ubsan build/ARM/gem5.debug
Change-Id: Iaaaaebc3f25749e11f97bf454ddd0153b3de56e7
Reviewed-on: https://gem5-review.googlesource.com/12511
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
The AtomicSimpleCPU used to be able to access memory directly to speed
up simulation if no caches are used. This is fine as long as no
switching between CPU models is required. In order to switch to a new
CPU model that requires caches, we currently need to checkpoint the
system and restore it into a new configuration. The new
'atomic_noncaching' memory mode provides a solution that avoids this
issue since caches are bypassed in this mode. This changeset removes
the old fastmem option from the AtomicSimpleCPU and introduces a new
CPU, NonCachingSimpleCPU, which derives from the AtomicSimpleCPU.
The NonCachingSimpleCPU uses the same mechanism as the AtomicSimpleCPU
used to use when accessing memory in when fastmem was enabled.
This changeset also introduces a new switcheroo test that tests
switching between a NonCachingSimpleCPU and a TimingSimpleCPU with
caches.
Change-Id: If01893f9b37528b14f530c11ce6f53c097582c21
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/12419
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
This patch is adding support for generating memory requests which set
the StreamID/SubstreamID field, so that is possible to emulate devices
attached to an external IOMMU/SMMU with a Traffic generator.
Change-Id: Iea068de581ae7125a9d49314124a08c045c75b49
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/12188
GCC 8 adds a number of new warnings to -Wall which generate errors.
- Fix memset to 0 for structs by adding casts.
- Fix cast with const when the const was ignored.
- Fix catch a polymorphic type by value
We now compile with GCC 8!
Change-Id: Iab70ce11190eee67608fc25c0bedff170152b153
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-on: https://gem5-review.googlesource.com/11949
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Do not generate garnet tester file or Ruby debug headers without a Ruby
protocol (i.e. PROTOCOL=None). It makes no sense to include these files
into the build when there will be no protocol to utilize them.
Change-Id: I8db4dd532f60008217a10c88a2e089f85df9d104
Reviewed-on: https://gem5-review.googlesource.com/8381
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Previously, reg_class_impl.hh was added in order to prevent a cyclic
dependency between it and the_isa.hh (See
http://reviews.gem5.org/r/3754). It was determined that this was not
necessary. The two files had almost entirely the same includes, and the
current test-suite including multiple gcc and clang compilers on both
MacOS and Linux successfully built the library with all functionality
moved into the reg_class.hh file.
Change-Id: I0319e187b9eb280726a003951bb1ce315ffe17f5
Signed-off-by: Bradley Wang <radwang@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/11869
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
This patch allows to instantiate a Traffic generator starting from a
generic SimObject, so that linking to a BaseTrafficGen only is no longer
mandatory. This permits SimObjects different than a BaseTrafficGen to
instantiate generators and to manually specify the MasterID they
will be using when generating memory requests.
Change-Id: Ic286cfa49fd9c9707e6f12a4ea19993dd3006b2b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/11789
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Free the squahsed instructions' heads of DepGraph in IQ squashing
In a system with large register file (ex.2048), the number of
DynInst hits the hardcoded limit (1500). This is caused by
missing freeing the heads of DepGraph in IQ. IQ only clears
out the heads when instructions reach writeback stage.
If a instruction is squashed before writeback stage, its head of
dependency graph, which holds the instruction's DynInstPtr,
would not be cleared out. This prevents freeing the DynInst of the
squahsed instruction even after it is committed.
Change-Id: I05b3db93cb6ad8960183d7ae765149c7f292e5b3
Reviewed-on: https://gem5-review.googlesource.com/7481
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
The current traffic generator relies on a configuration file that
describes a small machine to generate stimuli. This configuration file
is usually generated by the gem5 Python configuration. This creates an
unnecessary and fragile step.
This changeset introduces a Python-based trace module. When
instantiated, the module exposes a start method that takes an iterable
object as a parameter (e.g., a generator). The iterable object is
expected to represent a list of generators that will be run one after
the other. For example:
system.tgen = PyTrafficGen()
m5.instantiate()
def trace():
yield system.tgen.createIdle(1000)
yield system.tgen.createExit(0)
system.tgen.start(trace())
Change-Id: I58e60ca517e86c197859f4daaa67750066abdc1c
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/11518
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>