It is possible from the MMU to traverse the entire hierarchy of
TLBs, starting from the DTB and ITB (generally speaking from the
first level) up to the last level via the nextLevel pointer. So
in theory no extra data should be stored in the BaseMMU.
This design makes some operations a bit more complex. For example
if we have a unified (I+D) L2, it will be pointed by both ITB and
DTB. If we want to invalidate all TLB entries, we should be
careful to not invalidate L2 twice, but if we simply follow the
next level pointer, we might do so. This is not a problem from
a functional perspective but alters the TLB statistics (a single
invalidation is recorded twice)
We then provide a different view of the set of TLBs in the system.
At the init phase we traverse the TLB hierarchy and we add every
TLB to the appropriate set. This makes invalidation (and any
operation targeting a specific kind of TLBs) easier.
JIRA: https://gem5.atlassian.net/browse/GEM5-790
Change-Id: Ieb833c2328e9daeaf50a32b79b970f77f3e874f7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48146
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
This patch is moving most of the TLB code to the MMU.
In this way the TLB stops being the main translating agent and becomes a
simple "passive" translation cache.
All the logic behind virtual memory translation, like
* Checking permission/alignment
* Issuing page table walks
* etc
Is now embedded in the MMU model. This will allow us to stack multiple
TLBs and to compose arbitrary hierarchies as their sole purpose now is
to cache translations
JIRA: https://gem5.atlassian.net/browse/GEM5-790
Change-Id: I687c639a56263d5e3bb6633dd8c9666c85edba3a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48141
Tested-by: kokoro <noreply+kokoro@google.com>
Now that we're using c++17, the type_traits with a ::value member have
a _v alias which reduces verbosity. Or on other words
std::is_integral<T>::value
can be replaced with
std::is_integral_v<T>
Make this substitution throughout the code base. In places where gem5
introduced it's own similar templates, add a V alias, spelled
differently to match gem5's internal style.
gem5: :IsVarArgs<T>::value => gem5::IsVarArgsV<T>
Change-Id: I1d84ffc4a236ad699471569e7916ec17fe5f109a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48604
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Memory space is not always outside of the CPU. For example the tightly
coupled memory (TCM) is inside of the core. To make gdb access those
kind of memory, we should use Iris memory API to read and write memory.
If we access a memory address not inside the CPU with Iris memory API.
The CPU would fire a request via amba transport_dbg. So the change also
covers the original behavior.
Change-Id: Ie223ab12f9a746ebafa21026a8680222f6ebd593
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45581
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This implements the 96 and 128b ds_read/write instructions in a similar
fashion to the 3 and 4 dword flat_load/store instructions.
These instructions are treated as reads/writes of 3 or 4 dwords, instead
of as a single 96b/128b memory transaction, due to the limitations of
the VecOperand class used in the amdgpu code.
In order to handle treating the memory transaction as multiple dwords,
the patch also adds in new initMemRead/initMemWrite functions for ds
instructions. These are similar to the functions used in flat
instructions for the same purpose.
Change-Id: I0f2ba3cb7cf040abb876e6eae55a6d38149ee960
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48342
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Alex Dutu <alexandru.dutu@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
The proxies are not used on the critical path, and it's usually implicit
whether they should be the FS or SE version.
Ideally in the future we won't need to worry about which version we need
to use, but the differences haven't quite been abstracted away, and
occasionally we need to decide between the two.
Change-Id: Idb363d6ddc681f7c1ad5e7aba69865f40aa30dc8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45907
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
These were never used with conditions, so the condition check just added
overhead. Also, the not-taken path through the instruction didn't
actually set the destination to something, meaning that it would set it
to something arbitrary and not actually leave it unmodified.
Change-Id: I33fef088979b14ad74adf22b26419a1cacf386dd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45305
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
This patch reverts part of the changes made by the removal of
the Stage2MMU class [1]:
Prior to that patch the stage1 and stage2 walkers were sharing
the same port (which was instantiated in the Stage2MMU).
By removing the Stage2MMU we provided every table walker a
unique port.
With this patch we are reintroducing port sharing to temporarily fix
existing platforms using walker caches.
(The long term design goal will be to have a unique page table walker)
Those complain if we try to connect a single ported cache to 2 table
walker ports (stage1 and stage2)
[1]: https://gem5-review.googlesource.com/c/public/gem5/+/45780
Change-Id: Ib68ef97f1e9772a698771269c9a4ec4514f5d4d7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48200
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Certain DS insts are classfied as Loads, but don't actually go through
the memory pipeline. However, any instruction classified as a load
marks its destination registers as free in the memory pipeline.
Because these instructions didn't use the memory pipeline, they
never freed their destination registers, which led to a deadlock.
This patch explicitly calls the function used to free the destination
registers in the execute() method of those Load instructions that
don't use the memory pipeline.
Change-Id: Ic2ac2e232c8fbad63d0c62c1862f2bdaeaba4edf
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48019
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Certain instructions don't have implementations in instructions.cc,
and get decoded as a nullptr.
This adds a fatal when decoding a missing instruction, as we aren't
able to properly run a program if all its instructions aren't
implemented, and it allows us to figure out which instruction is
missing due to fatals printing the line they were called.
Change-Id: I7e3690f079b790dceee102063773d5fbbc8619f1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47522
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When the Vega ISA got committed, it lacked the request counter
tracking for memory requests that existed in the GCN3 code.
Instead of copying over the same lines from the GCN3 code to the Vega
code, this commit makes the various memory pipelines handle updating the
request counter information instead, as every memory instruction calls a
memory pipeline.
This commit also adds an issueRequest in scalar_memory_pipeline, as
previously, the gpuDynInsts were explicitly placed in the queue of
issuedRequests.
Change-Id: I5140d3b2f12be582f2ae9ff7c433167aeec5b68e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45347
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
vector_register_file uses the exec_mask of a memory instruction in
order to determine if it should mark a register as in-use or not.
Previously, the exec_mask of memory instructions was only set on
execution of that instruction, which occurs after the code in
vector_register_file. This led to the code reading potentially garbage
data, leading to a scenario where a register would be marked used when
it shouldn't be.
This fix sets the exec_mask of memory instructions in schedule_stage,
which works because the only time the wavefront execMask() is updated is
on a instruction executing, and we know the previous instruction will
have executed by the time schedule_stage executes, due to the order the
pipeline is executed in.
This also undoes part of a patch from last year (62ec973) which treated
the symptom of accidental register allocation, without preventing the
registers from being allocated in the first place.
This patch also removes now redundant code that sets the exec_mask in
instructions.cc for memory instructions
Change-Id: Idabd35020000764fb06133ac2458606c1aaf6f04
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45346
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Certain memory writes were reading their registers in
initiateAcc, which lead to scenarios where a subsequent instruction
would execute, clobbering the value in that register before the memory
writes' initiateAcc method was called, causing the memory write to read
wrong data.
This patch moves all register reads to execute, preventing the above
scenario from happening.
Change-Id: Iee107c19e4b82c2e172bf2d6cc95b79983a43d83
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45345
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Alex Dutu <alexandru.dutu@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
In C, to refer to a type without a struct or enum tag on the type, you
need to typedef it like this:
typedef struct
{
} Foo;
Foo foo;
In C++, this is unnecessary:
struct Foo
{
};
Foo foo;
Remove all of the first form in C++ files and replace them with the
second form.
Change-Id: I37cc0d63b2777466dc6cc51eb5a3201de2e2cf43
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46199
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The adoption of the gem5 namespace [1] broke aarch64 builds with
the following error:
build/ARM/arch/arm/kvm/base_cpu.cc:198:22: error: aggregate
'gem5::kvm_reg_list regs_probe' has incomplete type and cannot be
defined
In file included from build/ARM/arch/arm/kvm/base_cpu.cc:38:0:
build/ARM/arch/arm/kvm/base_cpu.hh:115:28: note: forward declaration of
'struct gem5::kvm_reg_list'
std::unique_ptr<struct kvm_reg_list> tryGetRegList(uint64_t nelem)
const;
Forward declaring the struct defined in linux/kvm.hh (included in source
file) in the global namespace, rather than the gem5 one fixes the
problem
[1]: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I96c7d31aa4810edcf98e23cefeaf4895620b6444
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47619
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
To properly implement load-store instructions for use with
the TimingSimpleCPU model, the initiateAcc() part of the
instruction should only be responsible for performing the
effective address computation and then initiating memory
access.
The completeAcc() part of the instruction should then be
responsible for setting the condition register flags or
updating the base register based on the outcome of the
memory access. This fixes the following instructions:
* Load Byte and Zero with Update (lbzu)
* Load Halfword and Zero with Update (lhzu)
* Load Halfword Algebraic with Update (lhau)
* Load Word and Zero with Update (lwzu)
* Load Doubleword with Update (ldu)
* Load Floating Single with Update (lfsu)
* Load Floating Double with Update (lfdu)
* Load Byte and Zero with Update Indexed (lbzux)
* Load Halfword and Zero with Update Indexed (lhzux)
* Load Halfword Algebraic with Update Indexed (lhaux)
* Load Word and Zero with Update Indexed (lwzux)
* Load Word Algebraic with Update Indexed (lwaux)
* Load Doubleword with Update Indexed (ldux)
* Load Floating Single with Update Indexed (lfsux)
* Load Floating Double with Update Indexed (lfdux)
* Load Byte And Reserve Indexed (lbarx)
* Load Halfword And Reserve Indexed (lharx)
* Load Word And Reserve Indexed (lwarx)
* Load Doubleword And Reserve Indexed (ldarx)
* Store Byte with Update (stbu)
* Store Halfword with Update (sthu)
* Store Word with Update (stwu)
* Store Doubleword with Update (stdu)
* Store Byte with Update Indexed (stbux)
* Store Halfword with Update Indexed (sthux)
* Store Word with Update Indexed (stwux)
* Store Doubleword with Update Indexed (stdux)
* Store Byte Conditional Indexed (stbcx.)
* Store Halfword Conditional Indexed (sthcx.)
* Store Word Conditional Indexed (stwcx.)
* Store Doubleword Conditional Indexed (stdcx.)
* Store Floating Single with Update (stfsu)
* Store Floating Double with Update (stdsu)
* Store Floating Single with Update Indexed (stfsux)
* Store Floating Double with Update Indexed (stfdux)
Change-Id: If5f720619ec3c40a90c1362a9dfc8cc204e57acf
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40947
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Maintainer: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This adds multi-mode support for remote debugging via GDB
with the addition of the XML target description files for
both 32-bit and 64-bit variants of the Power architecture.
Proper byte order conversions have also been added.
MSR has now been modeled to some extent but it is still
not exposed by getRegs() since its a privileged register
that cannot be modified from userspace. Similarly, the
target descriptions require FPSCR to also be part of the
payload and hence, it has been added too.
Change-Id: I156fdccb791f161959dbb2c3dd8ab1e510d9cd4b
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40946
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Maintainer: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>