Remove the line "For use for simulation and test purposes only" in files
were AMD is the only copyright holder listed in the header. This happens
to be the case for all files where this line exists, removing it
completely from gem5.
Change-Id: I623f266b002f564301b28774f49081099cfc60fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Global instructions are new in Vega and are essentially FLAT
instructions from GCN3 but guaranteed to go to global memory where as
flat can go to global or local memory.
This reworks the flat instruction classes so that the initiateAcc /
execute / completeAcc logic can be reused for flat, global, and later
scratch subtypes of flat instructions. The decoder creates a flat
instruction class which sets instruction flags based on the flat
instruction's SEG field. There are new initOperandInfo and
generateDissasmbly methods for flat and global. The number of operands
and operand index getters are modified to check the flags and return the
correct value for the subtype.
Change-Id: I1db4a3742aeec62424189e54c38c59d6b1a8d3c1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47106
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Kyle Roarty <kyleroarty1716@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add support for LDS accesses by allowing Flat instructions to dispatch
into the local memory pipeline if the requested address is in the group
aperture.
This requires implementing LDS accesses in the Flat initMemRead/Write
functions, in a similar fashion to the DS functions of the same name.
Because we now can potentially dispatch to the local memory pipeline,
this change also adds a check to regain any tokens we requested as a
flat instruction.
Change-Id: Id26191f7ee43291a5e5ca5f39af06af981ec23ab
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48343
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Apply the gem5 namespace to the codebase.
Some anonymous namespaces could theoretically be removed,
but since this change's main goal was to keep conflicts
at a minimum, it was decided not to modify much the
general shape of the files.
A few missing comments of the form "// namespace X" that
occurred before the newly added "} // namespace gem5"
have been added for consistency.
std out should not be included in the gem5 namespace, so
they weren't.
ProtoMessage has not been included in the gem5 namespace,
since I'm not familiar with how proto works.
Regarding the SystemC files, although they belong to gem5,
they actually perform integration between gem5 and SystemC;
therefore, it deserved its own separate namespace.
Files that are automatically generated have been included
in the gem5 namespace.
The .isa files currently are limited to a single namespace.
This limitation should be later removed to make it easier
to accomodate a better API.
Regarding the files in util, gem5:: was prepended where
suitable. Notice that this patch was tested as much as
possible given that most of these were already not
previously compiling.
Change-Id: Ia53d404ec79c46edaa98f654e23bc3b0e179fe2d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This commit removes functions that indexed into the
vectors that held the operands. Instead, for-each loops
are used, iterating through one of 6 vectors
(src, dst, srcScalar, srcVec, dstScalar, dstVec)
that all hold various (potentially overlapping)
combinations of the operands.
Change-Id: Ia3a857c8f6675be86c51ba2f77e3d85bfea9ffdb
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42212
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
This change removes the GPUDynInstPtr argument from
getRegisterIndex(). The dynamic inst was only needed
to get access to its parent WF's state so it could
determine the number of scalar registers the wave was
allocated. However, we can simply pass the number of
scalar registers directly. This cuts down on shared
pointer usage.
Change-Id: I29ab8d9a3de1f8b82b820ef421fc653284567c65
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42210
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
FLAT instructions used to decrement lgkm count on execute, while the
GCN3 ISA specifies that lgkm count should be decremented on data being
returned or data being written.
This patch changes it so that lgkm is decremented after initiateAcc (for
stores) and after completeAcc (for loads) to better reflect the ISA
definition.
This fixes a bug where waitcnts would be satisfied even though the
memory access wasn't completed, which lead to instructions using the
wrong data.
Change-Id: I596cb031af9cda8d47a1b5e146e4a4ffd793d36c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38696
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Previously, with HSAIL, we were guaranteed by the HSA specification
that the GPU will never issue unaligned accesses. However, now
that we are directly running GCN this is no longer true.
Accordingly, this commit adds support for unaligned accesses.
Moreover, to reduce the replication of nearly identical
code for the different request types, I also added new helper
functions that are called by all the different memory request
producing instruction types in op_encodings.hh.
Adding support for unaligned instructions requires changing
the statusBitVector used to track the status of the memory
requests for each lane from a bit per lane to an int per lane.
This is necessary because an unaligned access may span multiple
cache lines. In the worst case, each lane may span multiple
cache lines. There are corresponding changes in the files that
use the statusBitVector.
Change-Id: I319bf2f0f644083e98ca546d2bfe68cf87a5f967
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29920
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
for HSAIL an operand's indices into the register files may be calculated
trivially, because the operands are always read from a register file, or are
an immediate.
for machine ISA, however, an op selector may specify special registers, or
may specify special SGPRs with an alias op selector value. the location of
some of the special registers values are dependent on the size of the RF
in some cases. here we add a way for the underlying getRegisterIndex()
method to know about the size of the RFs, so that it may find the relative
positions of the special register values.
we are removing doGmReturn from the GM pipe, and adding completeAcc()
implementations for the HSAIL mem ops. the behavior in doGmReturn is
dependent on HSAIL and HSAIL mem ops, however the completion phase
of memory ops in machine ISA can be very different, even amongst individual
machine ISA mem ops. so we remove this functionality from the pipeline and
allow it to be implemented by the individual instructions.
this patch removes the GPUStaticInst enums that were defined in GPU.py.
instead, a simple set of attribute flags that can be set in the base
instruction class are used. this will help unify the attributes of HSAIL
and machine ISA instructions within the model itself.
because the static instrution now carries the attributes, a GPUDynInst
must carry a pointer to a valid GPUStaticInst so a new static kernel launch
instruction is added, which carries the attributes needed to perform a
the kernel launch.
Eliminate the VSZ constant that defined the Wavefront size (in numbers of work
items); replaced it with a parameter in the GPU.py configuration script.
Changed all data structures dependent on the Wavefront size to be dynamically
sized. Legal values of Wavefront size are 16, 32, 64 for now and checked at
initialization time.
the n_reg field in the GPUDynInst is not currently set in the constructor.
if it is not set externally, there are assertion failures that may occur
if the random value it gets is just right. here we set it to 0 by default.