When the Vega ISA got committed, it lacked the request counter
tracking for memory requests that existed in the GCN3 code.
Instead of copying over the same lines from the GCN3 code to the Vega
code, this commit makes the various memory pipelines handle updating the
request counter information instead, as every memory instruction calls a
memory pipeline.
This commit also adds an issueRequest in scalar_memory_pipeline, as
previously, the gpuDynInsts were explicitly placed in the queue of
issuedRequests.
Change-Id: I5140d3b2f12be582f2ae9ff7c433167aeec5b68e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45347
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
vector_register_file uses the exec_mask of a memory instruction in
order to determine if it should mark a register as in-use or not.
Previously, the exec_mask of memory instructions was only set on
execution of that instruction, which occurs after the code in
vector_register_file. This led to the code reading potentially garbage
data, leading to a scenario where a register would be marked used when
it shouldn't be.
This fix sets the exec_mask of memory instructions in schedule_stage,
which works because the only time the wavefront execMask() is updated is
on a instruction executing, and we know the previous instruction will
have executed by the time schedule_stage executes, due to the order the
pipeline is executed in.
This also undoes part of a patch from last year (62ec973) which treated
the symptom of accidental register allocation, without preventing the
registers from being allocated in the first place.
This patch also removes now redundant code that sets the exec_mask in
instructions.cc for memory instructions
Change-Id: Idabd35020000764fb06133ac2458606c1aaf6f04
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45346
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Certain memory writes were reading their registers in
initiateAcc, which lead to scenarios where a subsequent instruction
would execute, clobbering the value in that register before the memory
writes' initiateAcc method was called, causing the memory write to read
wrong data.
This patch moves all register reads to execute, preventing the above
scenario from happening.
Change-Id: Iee107c19e4b82c2e172bf2d6cc95b79983a43d83
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45345
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Alex Dutu <alexandru.dutu@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
In C, to refer to a type without a struct or enum tag on the type, you
need to typedef it like this:
typedef struct
{
} Foo;
Foo foo;
In C++, this is unnecessary:
struct Foo
{
};
Foo foo;
Remove all of the first form in C++ files and replace them with the
second form.
Change-Id: I37cc0d63b2777466dc6cc51eb5a3201de2e2cf43
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46199
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The adoption of the gem5 namespace [1] broke aarch64 builds with
the following error:
build/ARM/arch/arm/kvm/base_cpu.cc:198:22: error: aggregate
'gem5::kvm_reg_list regs_probe' has incomplete type and cannot be
defined
In file included from build/ARM/arch/arm/kvm/base_cpu.cc:38:0:
build/ARM/arch/arm/kvm/base_cpu.hh:115:28: note: forward declaration of
'struct gem5::kvm_reg_list'
std::unique_ptr<struct kvm_reg_list> tryGetRegList(uint64_t nelem)
const;
Forward declaring the struct defined in linux/kvm.hh (included in source
file) in the global namespace, rather than the gem5 one fixes the
problem
[1]: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I96c7d31aa4810edcf98e23cefeaf4895620b6444
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47619
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
To properly implement load-store instructions for use with
the TimingSimpleCPU model, the initiateAcc() part of the
instruction should only be responsible for performing the
effective address computation and then initiating memory
access.
The completeAcc() part of the instruction should then be
responsible for setting the condition register flags or
updating the base register based on the outcome of the
memory access. This fixes the following instructions:
* Load Byte and Zero with Update (lbzu)
* Load Halfword and Zero with Update (lhzu)
* Load Halfword Algebraic with Update (lhau)
* Load Word and Zero with Update (lwzu)
* Load Doubleword with Update (ldu)
* Load Floating Single with Update (lfsu)
* Load Floating Double with Update (lfdu)
* Load Byte and Zero with Update Indexed (lbzux)
* Load Halfword and Zero with Update Indexed (lhzux)
* Load Halfword Algebraic with Update Indexed (lhaux)
* Load Word and Zero with Update Indexed (lwzux)
* Load Word Algebraic with Update Indexed (lwaux)
* Load Doubleword with Update Indexed (ldux)
* Load Floating Single with Update Indexed (lfsux)
* Load Floating Double with Update Indexed (lfdux)
* Load Byte And Reserve Indexed (lbarx)
* Load Halfword And Reserve Indexed (lharx)
* Load Word And Reserve Indexed (lwarx)
* Load Doubleword And Reserve Indexed (ldarx)
* Store Byte with Update (stbu)
* Store Halfword with Update (sthu)
* Store Word with Update (stwu)
* Store Doubleword with Update (stdu)
* Store Byte with Update Indexed (stbux)
* Store Halfword with Update Indexed (sthux)
* Store Word with Update Indexed (stwux)
* Store Doubleword with Update Indexed (stdux)
* Store Byte Conditional Indexed (stbcx.)
* Store Halfword Conditional Indexed (sthcx.)
* Store Word Conditional Indexed (stwcx.)
* Store Doubleword Conditional Indexed (stdcx.)
* Store Floating Single with Update (stfsu)
* Store Floating Double with Update (stdsu)
* Store Floating Single with Update Indexed (stfsux)
* Store Floating Double with Update Indexed (stfdux)
Change-Id: If5f720619ec3c40a90c1362a9dfc8cc204e57acf
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40947
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Maintainer: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This adds multi-mode support for remote debugging via GDB
with the addition of the XML target description files for
both 32-bit and 64-bit variants of the Power architecture.
Proper byte order conversions have also been added.
MSR has now been modeled to some extent but it is still
not exposed by getRegs() since its a privileged register
that cannot be modified from userspace. Similarly, the
target descriptions require FPSCR to also be part of the
payload and hence, it has been added too.
Change-Id: I156fdccb791f161959dbb2c3dd8ab1e510d9cd4b
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40946
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Maintainer: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This adds multi-mode support and allows the simulator to
read, interpret and execute 32bit and 64-bit, big and
little endian binaries in syscall emulation mode.
During process initialization, a minimal set of hardware
capabilities are also advertised by the simulator to show
support for 64-bit mode and little endian byte order.
This also adds some fixups specific to 64-bit ELF ABI v1
that readjust the entry point and symbol table due to the
use of function descriptors.
Change-Id: I124339eff7b70dbd14e50ff970340c88c13bd0ad
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40944
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Maintainer: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This adds the definition of the Machine State Register
(MSR) in preparation for multi-mode support. The MSR
has bits that define the state of the processor. This
defines all the bitfields and sets the ones that are
typically used for userspace environments.
In preparation for multi-mode support, the SF and LE
bits are used by instructions to check if the simulation
is running in 64-bit mode and if memory accesses are to
be performed in little endian byte order respectively.
This introduces changes in areas such as target address
computation for branch instructions, carry and overflow
computation for arithmetic instructions, etc.
Change-Id: If9ac69415ca85b0c873bd8579e7d1bd2219eac62
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40943
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Maintainer: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Apply the gem5 namespace to the codebase.
Some anonymous namespaces could theoretically be removed,
but since this change's main goal was to keep conflicts
at a minimum, it was decided not to modify much the
general shape of the files.
A few missing comments of the form "// namespace X" that
occurred before the newly added "} // namespace gem5"
have been added for consistency.
std out should not be included in the gem5 namespace, so
they weren't.
ProtoMessage has not been included in the gem5 namespace,
since I'm not familiar with how proto works.
Regarding the SystemC files, although they belong to gem5,
they actually perform integration between gem5 and SystemC;
therefore, it deserved its own separate namespace.
Files that are automatically generated have been included
in the gem5 namespace.
The .isa files currently are limited to a single namespace.
This limitation should be later removed to make it easier
to accomodate a better API.
Regarding the files in util, gem5:: was prepended where
suitable. Notice that this patch was tested as much as
possible given that most of these were already not
previously compiling.
Change-Id: Ia53d404ec79c46edaa98f654e23bc3b0e179fe2d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This introduces the S field for X form instructions which
is used to specify signed versus unsigned comparison. The
Power ISA does not specify a formal name for the third
1-bit opcode field required for decoding XFX form move to
and from CR field instructions, the S field can be used
to achieve the same as it has the same span and position.
This fixes the following instructions.
* Move To Condition Register Fields (mtcrf)
* Move From Condition Register (mfcr)
Change-Id: I8d291f707cd063781f0497f7226bebfc47bd9e63
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40935
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Maintainer: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This introduces new classes and new formats for D and X
form instructions, the TO field that is used to encode
the trap conditions and adds the following instructions.
* Trap Word Immediate (twi)
* Trap Word (tw)
* Trap Doubleword Immediate (tdi)
* Trap Doubleword (td)
Change-Id: I029147ef643c2ee6794426e5e90af4d75f22e92e
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40934
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Maintainer: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>
sched_getaffinity is different from other syscalls in the raw syscall
return the size of the cpumask being used to represent the CPU bit mask.
Because of this, when a library (libnuma in this case) directly called
sched_getaffinity and got a return value of 0, it errored out, thinking
that there were no CPUs available.
This implementation assumes that all CPUs are available, so it sets
all simulated CPUs in the bitmask
Change-Id: Id95c919986cc98a411877056256604f57a29f0f9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46243
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
In GCN3, the v_add_u32, v_sub_u32, and v_subrev_u32 instructions write
the carry-out value to VCC. VEGA introduces explicit carry-out versions
of these instructions (v_add_co_u32, v_sub_co_u32, and v_subrev_co_u32),
and modifies the behavior of the baseline, non-carry-out versions to not
write to VCC. Previously both the carry-out and non-carry-out versions
shared a single implementation that wrote to VCC. This patch correctly
implements the non-carry-out versions to avoid the VCC write.
This patch also makes the following substitutions for GCN3 instructions
that no longer exist in VEGA (this renaming has no functional impact):
v_addc_u32 -> v_addc_co_u32
v_subb_u32 -> v_subb_co_u32
v_subbrev_u32 -> v_subbrev_co_u32
Change-Id: I002fa6e9316d38fd4cc3554daff047523cfc12c9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47240
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This introduces a new class and a new format for MD and
MDS form instructions where the shift amount, mask begin
and mask end are specified by two fields that must be
concatenated and adds the following instructions.
* Rotate Left Doubleword Immediate then Clear Left (rldicl[.])
* Rotate Left Doubleword Immediate then Clear Right (rldicr[.])
* Rotate Left Doubleword Immediate then Clear (rldic[.])
* Rotate Left Doubleword then Clear Left (rldcl[.])
* Rotate Left Doubleword then Clear Right (rldcr[.])
* Rotate Left Doubleword Immediate then Mask Insert (rldimi[.])
Change-Id: Id7f1f24032242ccfdfda2f1aefd6fe9f0331f610
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40933
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Maintainer: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Now that 64-bit registers are being used, the rotation
operation changes for words. Instead of just rotating
the lower word of the operand, the lower word is first
duplicated in the upper word and then rotated. This
fixes the following instructions.
* Rotate Left Word Immediate then And with Mask (rlwinm[.])
* Rotate Left Word then And with Mask (rlwnm[.])
* Rotate Left Word Immediate then Mask Insert (rlwimi[.])
Change-Id: Ic743bceb8bafff461276984ecc999dedc1f94e9f
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40930
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Maintainer: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This introduces a new class and a new format for XS form
instructions where the shift amount is specified by two
fields that must be concatenated and adds the following
instructions.
* Shift Left Doubleword (sld[.])
* Shift Right Doubleword (srd[.])
* Shift Right Algebraic Doubleword (srad[.])
* Shift Right Algebraic Doubleword Immediate (sradi[.])
* Extend-Sign Word and Shift Left Immediate (extswsli[.])
Change-Id: If51c676009ddafb40f855b66c00eeeffa5d8874c
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40928
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Maintainer: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>