The fence instruction originally had two flags, NonSpeculative and
MemBarrier. In the O3 model, MemBarrier instructions are inserted
into the instruction queue by InstructionQueue::insertBarrier (at
src/cpu/o3/iew_impl.hh:1083). Barrier instructions are implicitly
assumed to be non-speculative.
Adding the NonSpeculative flag to the fence instruction causes it to be
inserted into the instruction queue twice (at
src/cpu/o3/iew_impl.hh:1083 and :1111).
This can lead to a deadlock if both pointers to the instruction are not
cleared from the queue when the instruction retires.
This patch removes the NonSpeculative flag from the fence instruction.
Change-Id: I26573d12a0b52f43b73c0e51158286dc98d05ea4
Reviewed-on: https://gem5-review.googlesource.com/c/8183
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Alec Roelke <ar4jc@virginia.edu>
Maintainer: Alec Roelke <ar4jc@virginia.edu>
When one thread activates another through a clone system call, the
child thread's context is activated in the middle of the clone system
call, before the context is fully initialized.
Therefore, the child thread starts fetching from an uninitialized PC,
which could lead to a page fault.
This patch adds a pipeline wakeup event that is scheduled later in the
cycle when the thread is activated. This event ensures that the first
fetch only happens after the thread context is fully initialized
(e.g., in case of clone syscall, it is when the parent thread copies
its context over to the child thread).
When a thread first starts or wakes up, the input queue to the Fetch2 stage
needs to be drained since the execution flow is likely to change and
previously fetched instructions in the queue may no longer be in the
correct flow. This patch dumps/drains all inputs in the input queue
of a thread context in the Fetch2 stage when the associated thread wakes
up.
Change-Id: Iad970638e435858b7289cd471158cc0afdbbb0e5
Reviewed-on: https://gem5-review.googlesource.com/c/8182
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Brandon Potter <Brandon.Potter@amd.com>
Since f2bda876f7, the build system has added a length for generated
blobs, as in:
    const std::size_t variable_len = 123;
There are two types of blob files: those with a header and those
without.
The ones with a header also include that header in the .cc of the blob,
and the header contains the declaration:
    extern const std::size_t variable_len;
The ones without a header lack that extern declaration, so the const
variable has internal linkage (as if it were declared static) according
to the C++ standard.
clang then correctly flags the unused variable under
-Wunused-const-variable, while GCC does not warn about it.
This patch removes the length declaration from the blob files that don't
have the header. Those files currently don't use the length.
Change-Id: I3fc61b28f887fc1015288857328ead2f3b34c6e6
Reviewed-on: https://gem5-review.googlesource.com/c/15955
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
A recent patch added the use of the macro:
    CMSG_ALIGN
This macro is not portable across platforms and needs to be
defined according to the platform.
This patch defines the missing macro on macOS.
Change-Id: I582f69e652dc060b4532358141179ad6d37eafc7
Reviewed-on: https://gem5-review.googlesource.com/c/16102
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Brandon Potter <Brandon.Potter@amd.com>
This implementation is based on the description available in:
Jinchun Kim, Seth H. Pugsley, Paul V. Gratz, A. L. Narasimha Reddy,
Chris Wilkerson, and Zeshan Chishti. 2016.
Path confidence based lookahead prefetching.
In The 49th Annual IEEE/ACM International Symposium on Microarchitecture
(MICRO-49). IEEE Press, Piscataway, NJ, USA, Article 60, 12 pages.
Change-Id: I4b8b54efef48ced7044bd535de9a69bca68d47d9
Reviewed-on: https://gem5-review.googlesource.com/c/14819
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Python 2.7 requires a workaround when wrapping exit objects to
explicitly convert the return of getCode() to int to not confuse
sys.exit. This workaround isn't needed and doesn't work on Python 3
since it doesn't have a separate long integer type.
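As a rough illustration of the point above, the sketch below shows that on
Python 3 the value returned by an exit object can be passed to sys.exit
directly. FakeExitEvent and exit_with are hypothetical names used only for
this example; they are not the wrapper code touched by this change.

```python
import sys


class FakeExitEvent:
    """Hypothetical stand-in for a wrapped exit object."""

    def __init__(self, code):
        self._code = code

    def getCode(self):
        return self._code


def exit_with(event):
    """Exit with the event's code, converting only on Python 2."""
    code = event.getCode()
    if sys.version_info[0] < 3:
        # Python 2 only: normalise a possible long to a plain int.
        code = int(code)
    sys.exit(code)


def observed_exit_code(event):
    """Run exit_with and report the code carried by SystemExit."""
    try:
        exit_with(event)
    except SystemExit as exc:
        return exc.code
```

The version check keeps the conversion confined to Python 2, where int and
long are distinct types.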
Change-Id: I57bc3fd8f4699676c046ece8a52baa2796959ffd
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15978
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Some tests are really just a wrapper around a test script in
configs/. Add a helper method to wrap these scripts to make sure they
are executed in a consistent environment. This wrapper sets up a
global environment that is identical to that created by main() when it
executes the script. Unlike the old wrappers, it updates the module
search path to make relative imports work correctly in Python 3.
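A minimal sketch of such a helper is shown below, assuming the script
should see __file__ and __name__ globals and have its own directory at the
front of the module search path. The function name run_script and the
"__m5_main__" value are assumptions made for illustration; the helper added
by this change may differ in detail.

```python
import os
import sys


def run_script(script_path, argv=None):
    """Execute a script in a fresh global scope and return that scope."""
    script_path = os.path.abspath(script_path)
    scope = {
        "__file__": script_path,
        "__name__": "__m5_main__",  # assumed name, see lead-in
    }
    saved_path, saved_argv = list(sys.path), list(sys.argv)
    # Put the script's directory first so script-relative imports resolve.
    sys.path.insert(0, os.path.dirname(script_path))
    sys.argv = [script_path] + list(argv or [])
    try:
        with open(script_path) as src:
            code = compile(src.read(), script_path, "exec")
        exec(code, scope)
    finally:
        # Restore the interpreter state for the caller (a choice of this
        # sketch, not necessarily what the real helper does).
        sys.path[:] = saved_path
        sys.argv[:] = saved_argv
    return scope
```

A test wrapper would then call run_script on the relevant file under
configs/ instead of reimplementing the setup itself.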
Change-Id: Ie9f81ec4e2689aa8cf5ecb9fc8025d3534b5c9ca
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15976
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Most Ruby tests assume that the highest frequency in the system under
test is 1GHz and limit the global tick rate to this frequency. This
assumption is broken since the default Ruby configuration scripts
clock the CPU at 2GHz, which results in warnings and sometimes
incorrect behaviour.
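The arithmetic behind the problem can be sketched as follows: a clock
period is only exactly representable when it is a whole number of ticks.
The helper names are hypothetical, and the sketch does not reproduce how
the simulator itself rounds or warns.

```python
from fractions import Fraction


def clock_period_in_ticks(clock_hz, ticks_per_second):
    """Return the clock period as an exact number of ticks."""
    return Fraction(ticks_per_second, clock_hz)


def is_representable(clock_hz, ticks_per_second):
    """True if one clock period is a whole number of ticks."""
    return clock_period_in_ticks(clock_hz, ticks_per_second).denominator == 1


GHZ = 10**9

# A 1GHz tick rate cannot express the 0.5ns period of a 2GHz clock.
two_ghz_at_1ghz_ticks = is_representable(2 * GHZ, 1 * GHZ)

# A finer tick rate (here 1THz, chosen for illustration) can.
two_ghz_at_1thz_ticks = is_representable(2 * GHZ, 1000 * GHZ)
```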
Change-Id: I4b204660862ce3b0ea4a13df42caacd4398fef8c
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15975
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Latest-generation vector/SIMD extensions, including the Arm Scalable Vector
Extension (SVE), introduce the notion of a predicate register file.
This changeset adds this feature across architectures and CPU models.
Change-Id: Iebcadbad89c0a582ff8b1b70de353305db603946
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13715
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
In Ia49298304f658701ea0800bd79e08db404a655c3 we removed the default
kernel and DTB filenames from FSConfig.py.
However, the regression tests rely on those defaults to find those blobs.
This commit restores those default filenames just for the config of the
regression tests.
Change-Id: I9d7d869b0087ee8a3b63088693f753a703ead5d6
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15957
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Before this commit, there were default magic DTB and kernel filenames
for some platforms.
This was inelegant and error-prone, as it referred to out-of-tree files,
and set defaults which users almost always want to customize with
explicit command line options.
One result of this is that a wrong exception could be thrown if --kernel
was given but not --machine-type, since the default machine type
VExpress_EMM had a default kernel, and the code would always search for
the default filename even though --kernel was given:
IOError: Can't find file 'vmlinux.aarch32.ll_20131205.0-gem5' on path.
The defaults existed only for older machine types, and not for the
usually recommended VExpress_GEM5_V1, which suggests that this
deprecation should not affect many users.
Change-Id: Ia49298304f658701ea0800bd79e08db404a655c3
Reviewed-on: https://gem5-review.googlesource.com/c/15898
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
SIMD & FP Operations use FloatRegs in AArch32 mode and VecRegs in
AArch64 mode. The usage of two different register pools breaks
interprocessing between A32 and A64. This patch changes the definition
of Arm operands so that they are backed by VecElems in A32, which are
mapped to the same storage as A64 VecRegs.
Change-Id: I54e2ea0ef1ae61d29aca57ab09acb589d82c1217
Reviewed-on: https://gem5-review.googlesource.com/c/15603
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Fixes include:
* Change of reg_class: VecElemClass in lieu of the nonexistent
VectorElemClass.
* Removal of unused regId in operand constructor
* makeRead and makeWrite are using VecElem (which is a typedef
of uint32_t) as a source/destination type, regardless of the real
operand type (which is specified by ctype)
Change-Id: I4588e1120e1fc8fdb68b2b2f05d5e3692c55b2e8
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15602
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Gabe Black <gabeblack@google.com>
VecElem code was introduced in order to simulate a change of renaming
mode for vector registers. Most of the work happens in the rename map's
switchRenameMode. A change of renaming mode can happen after a squash
in the pipeline.
This patch also changes the interface to the ISA part so that
a PCState is used instead of the ISA in order to check whether the
rename mode has changed.
Change-Id: I8af795d771b958e0a0d459abfeceff5f16b4b5d4
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15601
This patch is:
* Adding a missing VecElemClass entry
* Fixing assertion in rename map which was checking the number of free
vector registers rather than free vector element registers
* Fixing assertion in read/setVecElemOperand APIs.
* Using the right register index in SimpleThread
* Using VecElem instead of VecReg on O3 readArchVecElem
Change-Id: I265320dcbe35eb47075991301dfc99333c5190c4
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15598
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
This patch is:
* Increasing the number of bits in the Scoreboard so that
it is keeping track of VecElemClass dependencies.
* Fixing VecElemClass entry in the scoreboard table so that it
correctly uses flatIndex rather than index.
Change-Id: Ie4877e5fe410b1437447adebbe289602a443f7c0
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15597
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
No line information is available when ISA code is executed inside the
isa_parser environment and an error is encountered. The build stops and
reports only the line of the let block containing the error.
This patch enhances the error reporting by printing the traceback of
the faulting ISA code.
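The general technique can be sketched as below: code run through exec
keeps its own line numbers if it is compiled with a file name, and the
traceback module can then report them. run_snippet and the "<let block>"
label are hypothetical; this is not the isa_parser code itself.

```python
import traceback


def run_snippet(source, name="<let block>"):
    """Execute source; return a formatted traceback on error, else None."""
    try:
        exec(compile(source, name, "exec"), {})
    except Exception:
        # format_exc() includes the file name and line number of the
        # failing statement inside the executed snippet.
        return traceback.format_exc()
    return None


# The second line of this snippet refers to an undefined name.
report = run_snippet("x = 1\ny = undefined_name\n")
```

Printing the report gives the user the failing line within the snippet
rather than only the location of the enclosing block.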
Change-Id: I3acd17f0d78b2feb8fe6e48808a094c5b81624e6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15595
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
params.py checks the validity of memory port-port connections before
they are instantiated in C++. This commit ensures that attempting to
connect two slave ports together will cause a TypeError.
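A minimal sketch of this kind of check is shown below. The Port class and
connect function are hypothetical stand-ins, not the classes in params.py;
they only illustrate rejecting a slave-to-slave connection before anything
is instantiated.

```python
class Port:
    """Hypothetical stand-in for a memory port with a role."""

    def __init__(self, name, role):
        if role not in ("master", "slave"):
            raise ValueError("unknown port role: %r" % (role,))
        self.name = name
        self.role = role
        self.peer = None


def connect(a, b):
    """Bind two ports, rejecting a slave-to-slave connection."""
    if a.role == "slave" and b.role == "slave":
        raise TypeError(
            "cannot connect slave port '%s' to slave port '%s'"
            % (a.name, b.name))
    a.peer, b.peer = b, a
```

Failing early in Python gives a clearer error than a failure at C++
instantiation time.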
Change-Id: Ia7d0a15df28b96c7bf5e568c4f4917d21a19b824
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15896
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
This patch substantially reworks the LSQ in the O3 model. The main goal
is to remove the assumption that an operation can be served with one or
two memory requests, which is present in the LSQ and in the instruction
through the req, reqLow, reqHigh triplet, and to generalise it to
operations that can be addressed with one request and operations that
require many requests, embodied in SingleDataRequest and
SplitDataRequest respectively.
This modification mimics the Minor model to an extent, shifting the
responsibility for VtoP translation and for tracking status and
resources from the DynInst to the LSQ via the LSQRequest. The
LSQRequest models the information concerning the operation and handles
the creation of fragments for translation and request, as well as
assembling/splitting the data accordingly.
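To illustrate what creating fragments can mean, the sketch below splits a
memory access into pieces that never cross a line boundary. It is a
simplified Python model under assumed parameters (a 64-byte line, a
hypothetical split_access helper), not the C++ LSQRequest implementation.

```python
def split_access(addr, size, line_bytes=64):
    """Split [addr, addr + size) into (addr, size) fragments per line."""
    fragments = []
    end = addr + size
    while addr < end:
        # First address of the next line after the current one.
        line_end = (addr // line_bytes + 1) * line_bytes
        chunk_end = min(end, line_end)
        fragments.append((addr, chunk_end - addr))
        addr = chunk_end
    return fragments


# An aligned 8-byte access needs a single request.
single = split_access(0, 8)

# An 8-byte access starting 4 bytes before a line boundary needs two.
crossing = split_access(60, 8)

# A 256-byte contiguous access spans four 64-byte lines.
wide = split_access(0, 256)
```

Changing line_bytes in this model is analogous to exploring a different
access width, although the real LSQ tracks far more state per fragment.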
With these modifications, the implementation of vector ISAs,
particularly on the memory side, becomes richer, as the new model
decouples ISA characteristics such as vector length from the
microarchitectural characteristics that govern how contiguous loads are
executed. This allows exploration of different LSQ to DL1 bus widths to
understand the tradeoffs in complexity and performance.
Part of the complexity introduced stems from the fact that gem5 keeps a
large amount of metadata, in particular for memory operations. When an
instruction is squashed while an operation such as a TLB lookup or a
cache access is ongoing, and the relevant structure later communicates
to the LSQ that the operation is over, it tries to access data that
should have died when the instruction was squashed, leading to asserts,
panics, or memory corruption. To ensure correct behaviour, each
LSQRequest assesses who its owner is and self-destructs once it detects
that its owner is done with the request and no subsequent action will
follow. For example, if an instruction is squashed while the TLB is
doing a walk to serve its translation, then once the TLB serves the
translation the LSQRequest detects that the instruction was squashed
and, since the translation is done and no one else expects to access
its information, it self-destructs. Destroying the LSQRequest earlier
would lead to wrong behaviour, as the TLB walk may access some of its
fields.
Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
Change-Id: I9578a1a3f6b899c390cdd886856a24db68ff7d0c
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13516
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
The SWP and SWPB instructions have been removed from AArch32. It was
previously (ARMv7) possible to enable them with the ID_ISAR0.Swap bits,
which are now hardcoded to 0b0000 (SWP and SWPB not implemented).
Change-Id: Ic32b534454a7e0f7494a6f0b5e11182c65b3fe24
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15815
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
There are a couple of things this CL fixes related to the TLM #includes.
1. Removes #includes of <systemc> and <tlm>. These bring in a header
file from boost which shouldn't be necessary but which some of the
tests (and likely some external code) depend on. We avoid including
those in files built into gem5 itself so that gem5 isn't dependent on
boost.
2. All includes in ext should be relative. That way those headers can
be removed from gem5 and still build, allowing them to be moved over
to or referenced from a foreign codebase which isn't part of gem5.
Change-Id: I76e267385b48cb4fe93aea89ec8319c76465a0a4
Reviewed-on: https://gem5-review.googlesource.com/c/15796
Reviewed-by: Ciro Santilli <ciro.santilli@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>