This patch adds support for pinning registers for a certain number of
consecutive writes. This is only relevant for timing CPU models
(functional-only models are unaffected), and it is primarily needed to
provide a realistic execution model for micro-coded operations whose
microops can write to non-overlapping portions of a destination
register, e.g. vector gather loads. In those cases, this mechanism
can disable renaming for a sequence of consecutive writes, thus making
the resulting execution more efficient: allocating a new physical
register for each microop would introduce a read-modify-write chain of
dependencies, while with these modifications the microops can write
back in parallel.
Please note that this new feature is only leveraged by O3CPU for the
time being.
Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
Change-Id: I07eb5fdbd1fa0b748c9bdc1174d9f330fda34f81
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13520
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This changeset adds support for partial (or masked) loads/stores, i.e.
loads/stores that can disable accesses to individual bytes within the
target address range. In addition, this changeset extends the code to
crack memory accesses across most CPU models (TimingSimpleCPU still
TBD), so that arbitrarily wide memory accesses are supported. These
changes are required for supporting ISAs with wide vectors.
Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
- Tiago Muck <tiago.muck@arm.com>
Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Then cast to the ISA specific type when necessary. This removes
(mostly) an ISA specific aspect to some of the interfaces. The ISA
specific version of the kernel stats still needs to be constructed and
stored in a few places which means that kernel_stats.hh still needs to
be a switching arch header, for instance.
In the future, I'd like to make the kernel its own object like the
Process objects in SE mode, and then it would be able to instantiate
and maintain its own stats.
Change-Id: I8309d49019124f6bea1482aaea5b5b34e8c97433
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18429
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
MemObject doesn't provide anything beyond its base ClockedObject any
more, so this change removes it from most inheritance hierarchies.
Occasionally MemObject is replaced with SimObject when I was fairly
confident that the extra functionality of ClockedObject wasn't needed.
Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
PerfKvmCounter::attach fails if the user doesn't have privileges to make
the perf_event_open syscall. This is the default privilege setting since
kernel 4.6. I've seen some users in the mailing list resort to running
as root; changing the perf_event_paranoid setting is an alternative.
Change-Id: I2bc6f76abb6e97bf34b408a611f64b1910f50a43
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17508
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
TLB:getMasterPort is used to obtain the PageWalkMasterPort if present and
hides the BaseTLB::getMasterPort().
The TLB::getMasterPort() is renamed according to the expected behavior.
Change-Id: If4f61189094a706d59805cd10f4f814e5830eda8
Reviewed-on: https://gem5-review.googlesource.com/c/16648
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
During the O3PipeView execution, a potential invalid iterator is used to
Update the instruction storeTick field.
If the store_idx iterator is the first() of the StoreQueue, the
corresponding instruction is removed from the queue, leaving the iterator
invalid and not usable in the TRACING_ON block.
This patch uses the store_inst variable to access (and update) the
instruction tick, instead of the (potential) invalid one.
Change-Id: I671052ef282b9048e5239da8629b89e8afa86bf0
Reviewed-on: https://gem5-review.googlesource.com/c/16322
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Many functions that used to return lists (e.g., dict.items()) now
return iterators and their iterator counterparts (e.g.,
dict.iteritems()) have been removed. Switch calls to the Python 2.7
iterator methods to use the Python 3 equivalent and add explicit list
conversions where necessary.
Change-Id: I0c18114955af8f4932d81fb689a0adb939dafaba
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15992
Reviewed-by: Juha Jäykkä <juha.jaykka@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
The importer in Python 3 doesn't like the way we import SimObjects
from the global namespace. Convert the existing SimObject declarations
to import from m5.objects. As a side-effect, this makes these files
consistent with configuration files.
Change-Id: I11153502b430822130722839e1fa767b82a027aa
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15981
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Now the indirect branch predictor handles its own GHR instead of getting
the one from the direction predictor.
Also, now the commit method of the indirect predictor is called for every
pending branch on an update, as the indirect predictors may want to update
their interal structures/histories with the information of each branch.
Change-Id: I7053fbea42a53960a3bc1ba32912cc99c160511e
Reviewed-on: https://gem5-review.googlesource.com/c/15318
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.
Atomic memory instruction is treated as a special store instruction in
all CPU models.
In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.
In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.
In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.
This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.
This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.
Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
When a thread is suspended, all instructions after the suspension need
to be discarded since the thread will take a different execution stream
when it wakes up.
To do that, in MinorCPU, whenever a thread gets suspended, we change the
current execution stream by updating the current branch with
BranchData::SuspendThread reason.
Change-Id: I7cdcda22c1cf6e8ac8db8800b7d9ec052433fdf3
Reviewed-on: https://gem5-review.googlesource.com/c/9626
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@gmail.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
When a thread calls exit_group, in addition to halting the thread
itself, it needs to halt all other threads in its group (i.e., threads
sharing the same thread group ID). This patch enables threads to do
that.
Change-Id: Ib2e158fb27cf98843f177a64a2d643b1bbc94d03
Reviewed-on: https://gem5-review.googlesource.com/c/9623
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>