Currently, when an instruction has an operand that reads a const
value, it goes thru the same readMiscReg() api call as other
misc registers (real HW registers, not constant values). There
is an issue, however, when casting from the const values (which are
32b) to higher precision values, like 64b.
This change creates a separate, templated function call to the GPU's
ISA state that will return the correct type.
Change-Id: I41965ebeeed20bb70e919fce5ad94d957b3af802
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29927
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Instructions that use the SDWA field need to use the extra SRC0
register associated with the SDWA instruction instead of the
"default" SRC0 register, since the default SRC0 register contains
the SDWA information when SDWA is being used. This commit fixes
15de044c to take this into account. Additionally, this commit
removes reads of the registers from the SDWA helper functions,
since they overwrite any changes made to the destination register.
Finally, this change modifies the instructions that use SDWA to
simplify the flow through the execute() functions.
Change-Id: I3bad83133808dfffc6a4c40bbd49c3d76599e669
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29922
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Previously, with HSAIL, we were guaranteed by the HSA specification
that the GPU will never issue unaligned accesses. However, now
that we are directly running GCN this is no longer true.
Accordingly, this commit adds support for unaligned accesses.
Moreover, to reduce the replication of nearly identical
code for the different request types, I also added new helper
functions that are called by all the different memory request
producing instruction types in op_encodings.hh.
Adding support for unaligned instructions requires changing
the statusBitVector used to track the status of the memory
requests for each lane from a bit per lane to an int per lane.
This is necessary because an unaligned access may span multiple
cache lines. In the worst case, each lane may span multiple
cache lines. There are corresponding changes in the files that
use the statusBitVector.
Change-Id: I319bf2f0f644083e98ca546d2bfe68cf87a5f967
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29920
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Kernel end release was turned on for VIPER protocol, which
is in fact write-through based and thus no need to have
release operation. This changeset splits the option
'impl_kern_boundary_sync' into 'impl_kern_launch_acq'
and 'impl_kern_end_rel', and turns off release on VIPER.
Change-Id: I5490019b6765a25bd801cc78fb7445b90eb02a3d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29917
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Xianwei Zhang <xianwei.zhang@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Decoder: gpu_decoder.hh and decoder.cc:
The decoder is defined in these files. The decoder
is implemented as a lookup table of function pointers
where each decode function will decode to a unique
ISA instruction, or do some sub-decoding to infer
the next decode function to call.
The format for each OP encoding is defined in the
header file.
Registers:
registers.[hh|cc] define the special registers and
operand selector values, which are used to map
operands to registers/special values. many
convenience functions are also provides to determine
the source/type of an operand, for example vector
vs. scalar, register operand vs. constant, etc.
GPU ISA:
Some special GPU ISA state is maintained in gpu_isa.hh
and isa.cc. This class is used to hold some special
registers and values that can be used as operands
by ISA instructions. Eventually more ISA-specific
state should be moved here, and out of the WF class.
Vector Operands:
The operands for GCN3 instructions are defined in
operand.hh. This file defines both scalar and
vector operands wth GCN3 specific semantics. The
vector operand class is desgned around the generic
vec_reg.hh that is already present in gem5.
Instructions:
The GCN3 instructions are defined and implemented
throughout gpu_static_inst.[hh|cc], instructions.[hh|cc],
op_encodings.[hh|cc], and inst_util.hh. GCN3 instructions
all fall under one of the OP encoding types; for example
scalar memory operands are of the type SMEM, vector
ALU instructions can be VOP3, VOP2, etc. The base code
common to all instructions of a certain OP encoding type
is implemented in the OP encodings files, which includes
operand information, disassembly methods, encoding type,
etc.
Each individual ISA isntruction is implemented as
a class object in instructions.[hh|cc] and are derived
from one of the OP encoding types. The instructions.cc
file is primarily for the execute() methods of each
individual instruction, and the header file provides
the class definition and a few instruction specific
API calls.
Note that these instruction classes were auto-generated
but not using the gem5 ISA description language. A
custom ISA description was used and that cannot be released
publicly, therefore we are providing them already in C++.
Change-Id: I14d2a02d6b87109f41341c8f50a69a2cca9f3d14
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28127
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>