Commit Graph

90 Commits

Author SHA1 Message Date
Gabe Black
2da4a2cdca cpu: Stop including arch/isa.hh.
This header file is no longer needed.

Also fix some places where the isa header file was being transitively
included.

Change-Id: Ib9a9d7db0c9808b29d7614bbd68e2052ea345e9f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51239
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-08-20 01:17:06 +00:00
Alexandru Dutu
c6b38909e1 gpu-compute: Adding support for LDS atomics
This changeset is adding support for LDS atomics
and implementing DS_OR_B32 instruction.

Change-Id: I84c5cf6ce0e9494726dc7299f360551cd2a485f5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61791
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-08-19 16:44:31 +00:00
Bobby R. Bruce
787204c92d python: Apply Black formatter to Python files
The command executed was `black src configs tests util`.

Change-Id: I8dfaa6ab04658fea37618127d6ac19270028d771
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47024
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-08-03 09:10:41 +00:00
Matthew Poremba
f65f5a8981 gpu-compute,arch-vega: Overhaul HWRegs, setreg, getreg
These instructions are supposed to be read/writing special shader
hardware registers. Currently they are getting/setting to an SGPR. This
results in getting incorrect registers at best and clobbering an SGPR
being used by an application at worst. Furthermore, some registers need
to be set in the shader and the application will never (can never) set
them.

This patch overhauls the getreg/setreg instructions to use different
storage in the shader. The values will be updated either via setreg from
an application (e.g., mode register) or set by a PM4 MAP_PROCESS.

Change-Id: Ie5e5d552bd04dc47f5b35b5ee40a569ae345abac
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61655
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-07-28 14:10:33 +00:00
Matthew Poremba
5c7514c81c arch-vega: Fix S_GETREG_B32 masking/shifting
Here the mask should not be inverted. We also need to shift by the
offset to remove the padding as the consumer of the value expects the
offset to be removed.

This can be easily tested by running a GPU kernel with __shared__
variables. This will generate the following assembly:

s_getreg_b32 s6, hwreg(HW_REG_SH_MEM_BASES, 16, 16)

The current implementation returns the lower 16 bits (private memory
aperture) while the correct behavior is the uppter 16 bits (shared/LDS
memory aperture).

Change-Id: Iea8f0adceeadb24cdcf46ef4183fcaa8262ab9e7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61654
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-07-28 14:10:33 +00:00
Matthew Poremba
618d16d6fc arch-vega: Fix V_READFIRSTLANE_B32
This instruction appears to be the only VOP1 instruction that has a
scalar destination using VDST as the destination register number.
However, since VDST is only 8 bits it cannot encode all possible
registers. Therefore, use the opcode to determine if the destination is
a scalar or vector destination.

This issue manifests as a VGPR dest being out of range for a kernel
where the number of SGPRs is more than the number of VGPRs and the
intended SGPR dest is larger than the count of VGPRs

Change-Id: I95a7de1ddb97f7171f48331fed36aef776fa0cb4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61649
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-07-28 14:10:33 +00:00
Matthew Poremba
af9ecf7920 arch-vega,arch-gcn3: Add support for VCC_HI as scalar source
Currently there is only support for VCC_LO as a scalar source. Add
support for VCC_HI as well. The op selector symbol is also changed to be
vcc_hi/vcc_lo as it is in disassembly from LLVM.

Change-Id: I19ea8e23873049c33ffe2eb4ec8504a18f371c0e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61329
Reviewed-by: Matt Sinclair <mattdsinclair.wisc@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-07-18 17:27:50 +00:00
Matthew Poremba
3a73aa3ac1 arch-vega: Implement new VOP2 using VOP3 insts
Vega adds three new VOP2 instructions that may use VOP3 encoding that
are not part of the GCN3 ISA: v_add_u32, v_sub_u32, v_subrev_u32. This
changeset implements those three new instructions to fix errors related
to "invalid encoding" when those instructions are seen.

Tested using srad from Rodinia 3.0 HIP port which compiles a v_add_u32
instruction with VOP3 encoding.

Change-Id: I409a9f72f5c37895c3a0ab7ceb14a4dd121874a4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61330
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-07-15 14:55:33 +00:00
Matthew Poremba
40077055cf arch-vega: Fix disassembly for two dword VOPC
Calling opSelectorToRegSym in the disassembly for VOPC when there is a
second dword (SDWA, DPP, or Literal) causes a panic as those registers
do not have a string symbol. This is fixed by checking for a second
dword before printing similar to how VOP1, VOP2, SOP1, etc. function.

Change-Id: I97b33e1e45abcf3ff1d0bc5754773b4eee961a98
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61269
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-07-12 19:05:32 +00:00
Matt Sinclair
590719a383 arch-vega: explain when op encoder ignores src reg
Previously b40b361bee added support for the Vega operand encoder.  As
part of this, it made sure to check for the S_GETPC_B64 instruction,
which appears to be the only instruction in the Vega ISA that does not
use the source register.  However, at the time the commit used magic
numbers without comment, which can be difficult for users to interpret.

To resolve this, this commit adds a comment to explain where the magic
numbers come from (Table 58 in the Vega ISA manual).

Change-Id: Ic5007b510e0175558d21ede8eb6db273113187b2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/60650
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
2022-06-22 17:14:35 +00:00
Matt Sinclair
00008b725c arch-vega: some Vega instructions don't use dest reg
Some of the Vega scalar instructions (S_SETPC_B64, S_RFE_B64,
S_CBRANCH_JOIN, and S_SET_GPR_IDX_IDX) do not use the SDST scalar
destination register.  However, Vega's operand encoding function for the
SOP1 instruction type's class assumed all instructions used the
destination register, which results in an assert failure for these
instructions.

To resolve this, this commit updates the Vega SOP1 operand encoder to
ignore the destination register for these specific instructions.

Change-Id: I2f0d830f6264fc7f47c0694a2fd5da5d33d2ea0b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/60649
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
2022-06-22 17:14:35 +00:00
Michael Boyer
81058189af arch-vega,arch-gcn3: Implement S_MEMTIME instruction
Change-Id: I3e286eb6ff8af4097ad03d4066be79f73d938cea
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53603
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-06-21 20:19:46 +00:00
Charles Jamieson
7170c365be arch-vega: implement S_GETREG_B32 instruction
This commit adds support for the Vega GPU ISA's S_GETREG_B32
instruction.

This work was done by Charles Jamieson but I am committing.

Change-Id: Ic2e24f667ed1aec7b8b1404a06e17e7ffb192fba
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/60589
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-06-17 02:24:05 +00:00
Gabe Black
84ae0afa59 arch-x86: Put misc reg indexes into a name space.
Also make them match the style guide.

Change-Id: I845f141f85d4499a5acf56c2161240764906a232
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49758
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
2022-06-08 07:09:05 +00:00
Gabe Black
88143c940b arch-x86: Convert segment indices to fit the style guide.
Capitalize only their first letter, and use a namespace to namespace
them instead of a SEGMENT_REG_ prefix.

Change-Id: I69778c8d052ad6cc0ffd9e74dd1c643e9d28048d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49756
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
2022-06-08 07:08:37 +00:00
Matthew Poremba
b64467025d arch-vega: Implement SOP2 S_MUL_HI instructions
Two new 32-bit signed and unsigned variants of S_MUL were added in
gfx900 which operate similar to S_MUL expect they shift the product by
32 bits after multiplication. Tested with Histogram HIP-Sample and
b+tree in rodinia 3.0 HIP port.

Change-Id: I1bed32b17ccda7aa47f3b59528eb3304912d3610
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58473
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-04-11 17:06:43 +00:00
Matthew Poremba
e3f65393fd dev-amdgpu,arch-vega: Implement TLB invalidation logic
Add logic to collect pointers to all GPU TLBs in full system. Implement
the invalid TLBs PM4 packet. The invalidate is done functionally since
there is really no benefit to simulate it with timing and there is no
support in the TLB to do so. This allow application with much larger
data sets which may reuse device memory pages to work in gem5 without
possibly crashing due to a stale translation being leftover in the TLB.

Change-Id: Ia30cce02154d482d8f75b2280409abb8f8375c24
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58470
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-04-08 17:12:32 +00:00
Gabe Black
e6c0ba97db scons: Put all config variables in an env['CONF'] sub-dict.
This makes what are configuration and what are internal SCons variables
explicit and separate, and makes it unnecessary to call out what
variables to export to C++.

These variables will also be plumbed into and out of kconfiglib in later
changes.

Change-Id: Iaf5e098d7404af06285c421dbdf8ef4171b3f001
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56892
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-28 20:31:21 +00:00
Matthew Poremba
dd90417211 arch-vega: Bypass Ruby for functional page walks
Currently if a Ruby functional access fails to find an address in the
caches, it gives up. For functional page table walks we need to be able
to go all the way to memory. This adds a pointer to the system object
which allows the walker to get a pointer to device memory which can be
used to do a functional access directly to memory bypassing Ruby.

Change-Id: I0ead6e5e130a0d53021c44ae9221b167c6316ab2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57529
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-25 19:51:29 +00:00
Matthew Poremba
9cbdf75295 dev-amdgpu: Add VM class for apertures, TranslationGens
Create a VM class to reduce clutter in the amdgpu_device.* files. This
new file is in charge of reading/writting MMIOs related to VM contexts
and apertures. It also provides ranges checks for various apertures and
breaks out the MMIO interface so that there are not overloaded macro
definitions in the device MMIO methods.

The new translation generator classes for the various apertures are also
added to this class.

Change-Id: Ic224c1aa485685685b1136a46eed50bcf99d2350
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53066
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-24 14:59:57 +00:00
Matthew Poremba
f64f05eff6 arch-vega: Mark global instructions executed as global
The executed_as field is currently not set for global memory
instructions. This results in the default of SC_NONE, causing the status
vector to be all zeros. The GM pipe sees this and completes the
instruction immediately rather than issuing memory requests. This is
fixed by marking the instruction as executed as SC_GLOBAL always. Flat
instructions use resolvedFlatSegment for this, however since global
instructions are known to be global we can set this field directly. This
results in the expected issuing of memory requests to GPU memory.

Change-Id: Ic23102853ccd49a41e2f083b7bb24f033dfed18a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57829
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-18 22:44:28 +00:00
Gabe Black
72d67e6426 arch-vega: Replace deprecated Stats namespace recently reintroduced.
The deprecated "Stats" namespace was recently reintroduced to the vega
TLB code. Replace it with the new statistics namespace.

Change-Id: Ie5daf288176ce7e8aadd27b84a70baf4cbc72dff
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57949
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-18 20:19:37 +00:00
Matthew Poremba
539a2e2bcd arch-vega: Add VEGA page tables and TLB
Add the page table walker, page table format, TLB, TLB coalescer, and
associated support in the AMDGPUDevice. This page table format used the
hardware format for dGPU and is very different from APU/GCN3 which use
the X86 page table format.

In order to support either format for the GPU model, a common
TranslationState called GpuTranslation state is created which holds the
combined fields of both the APU and Vega translation state. Similarly
the TlbEntry is cast at runtime by the corresponding arch files as they
are the only files which touch the internals of the TlbEntry. The GPU
model only checks if a TlbEntry is non-null and thus does not need to
cast to peek inside the data structure.

Change-Id: I4484c66239b48df5224d61caa6e968e56eea38a5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51848
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-17 00:11:14 +00:00
Kyle Roarty
5e721db9a2 arch-vega: Handle signed offsets in Global/Scratch instructions
The offset field in Flat-style instructions is treated differently
based on if the instruction is Flat or Global/Scratch.

In Flat insts, the offset is treated as a 12-bit unsigned number.

In Global/Scratch insts, the offset is treated as a 13-bit signed number.

This patch updates the calcAddr function for Flat-style instructions
to properly sign-extend the offset on Global/Scratch instructions

Change-Id: I57f10258c23d900da9bf6ded6717c6e8abd177b7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57209
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
2022-03-01 21:14:38 +00:00
Gabe Black
5df52e0dca arch-x86: Overhaul how address size is handled, particularly for stack.
The stack size is something that applies to addresses when performing
accesses as part of some instructions. This was handled inconsistently
or incompletely or simply incorrectly in a few ways.

First, when pushing or popping from the stack, the *address size* should
be set to the stack size. The data size is generally the operand size.
When the stack pointer is incremented/decremented, it should be changed
by the data size. When a stack pointer is manipulated, the data size
for those calculations should be the stack size. Importantly that does
not change the value of the increment/decrement, which is the operand
size still. This usage has been fixed throughout.

The TLB generally needs to know what the address size was in order to
figure out what segment offset was used so that it can do limit checks.
There is some inherent inaccuracy in doing things in reverse like this,
but that's how it works currently. To find that size, the TLB tried to
start from first principles to figure out what the default address size
was, and then whether there was an override was passed in through the
request flags.

This is *very* inaccurate for a few reasons. First, the override doesn't
always apply. Second, the address size used by a particular instruction
doesn't have to be based on any particular size, whether that is the
default or alternate address size, the stack size, etc.

Instead, the instructions now pass the actual size being used in as a 2
bit value (0 -> 1 byte, 1 -> 2 bytes, 2 -> 4 bytes, 3 -> 8 bytes),
avoiding most of the inaccuracy and approximation.

Because the CPU won't embed any size information into fetches, we'll
just assume those have no wrap around within the address size.

Finally, there were microops that had been added which overrode the
address size to be the stack size internally, and try to help the TLB
figure out what to do to figure out the address size. Because both of
those things are now handled in a different way, those microops are no
longer needed or used and have been deleted.

Change-Id: I2b1bdf1acf1540bf643fac6d49fe1a5a576ba5c1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55443
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
2022-02-26 01:58:23 +00:00
Matthew Poremba
faf3730559 arch-vega: Fix global 64-bit calcAddr with SGPR base
Global instruction address calculation when using an SGPR or SGPR pair
as a base address was being calculated incorrectly when 64-bit addresses
were to be generated.

From the ISA documentation, the SGPR should be read as 32-bit or 64-bit
depending on "ADDRESS_MODE." The VGPR-offset (computed from the lower
32-bits of vaddr) should always be 32-bits and the offset is 12 bits
from the instruction. This means the 32-bit mask should only be applied
to vaddr to get the VGPU-offset rather than the final sum.

The SGPR base format is being seen in more recent clang/ROCm versions to
avoid unnecessary copies of SGPRs into VGPRs to use VGPRs as the base
address.

Change-Id: I48910611fcfac5b62bc63496bbaabd6f6e53fe0d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55643
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-20 16:03:23 +00:00
Matthew Poremba
3ecd28a222 arch-vega: Update FLAT memory access helpers to support LDS
This patch ports the changes from a similar patch for arch-gcn3:
https://gem5-review.googlesource.com/c/public/gem5/+/48343. Vega already
has an helper function to send to the correct pipe depending on the
scope, however the initMem helpers currently always assume global scope.

In addition the MUBUF WBINVL1 instructions are updated similarly to the
GCN3 patch.

Change-Id: I612b9198cb56e226721a90e72bba64395c84ebcd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55465
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-18 15:20:10 +00:00
Matthew Poremba
ff17ecc177 arch-vega: Fix MUBUF out-of-bounds case 1
Ported from https://gem5-review.googlesource.com/c/public/gem5/+/51127:

This patch updates the out-of-bounds check to properly check
against the correct buffer_offset, which is different depending
on if the const_swizzle_enable is true or false.

Change-Id: I9757226e62c587b679cab2a42f3616a5dca97e60
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55464
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-18 15:20:10 +00:00
Matthew Poremba
0cb64ce9f0 arch-vega: Free dest registers in non-memory Load DS insts
Ported from https://gem5-review.googlesource.com/c/public/gem5/+/48019:

Certain DS insts are classfied as Loads, but don't actually go through
the memory pipeline. However, any instruction classified as a load
marks its destination registers as free in the memory pipeline.

Because these instructions didn't use the memory pipeline, they
never freed their destination registers, which led to a deadlock.

This patch explicitly calls the function used to free the destination
registers in the execute() method of those Load instructions that
don't use the memory pipeline.

Change-Id: I8231217a79661ca6acc837b2ab4931b946049a1a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55463
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-17 23:55:51 +00:00
Matthew Poremba
d6bd91a9fd arch-vega: Implement large ds_read/write instructions
Port large DS read/write instructions from
https://gem5-review.googlesource.com/c/public/gem5/+/48342.

This implements the 96 and 128b ds_read/write instructions in a similar
fashion to the 3 and 4 dword flat_load/store instructions.

These instructions are treated as reads/writes of 3 or 4 dwords, instead
of as a single 96b/128b memory transaction, due to the limitations of
the VecOperand class used in the amdgpu code.

In order to handle treating the memory transaction as multiple dwords,
the patch also adds in new initMemRead/initMemWrite functions for ds
instructions. These are similar to the functions used in flat
instructions for the same purpose.

Change-Id: Iee2de14eb7f32b6654799d53dc97d806288af98f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55344
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-11 16:58:09 +00:00
Matthew Poremba
5a94e73d00 arch-vega: Validate if scalar sources are scalar gprs
Port the fixes for scalar source checks from arch-gcn3 at
https://gem5-review.googlesource.com/c/public/gem5/+/48344.

Scalar sources can either be a general-purpose register or a constant
register that holds a single value.

If we don't check for if the register is a general-purpose register,
it's possible that we get a constant register, which then causes all of
the register mapping code to break, as the constant registers aren't
supposed to be mapped like the general-purpose registers are.

This fix adds an isScalarReg check to the instruction encodings that
were missing it.

Change-Id: I30dd2d082a5a1dcc3075843bcefd325113ed1df6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55343
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-11 16:58:09 +00:00
Kyle Roarty
f9deeea427 arch-gcn3,arch-vega: Select proper data on misaligned access
req1->getSize() returns the size in bytes, but because we're using it
in an array index, we need to scale it by the size of the data type.

This ensures we give the second request the proper data.

Change-Id: I578665406762d5d0c95f2ea8297c362e1cc0620b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/54503
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
2021-12-20 18:28:08 +00:00
Matthew Poremba
9313294efe misc: Remove AMD license addition
Remove the line "For use for simulation and test purposes only" in files
were AMD is the only copyright holder listed in the header. This happens
to be the case for all files where this line exists, removing it
completely from gem5.

Change-Id: I623f266b002f564301b28774f49081099cfc60fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 04:00:56 +00:00
Matthew Poremba
0ae1a9d109 arch-vega: Implement S_SLEEP
This is merely copied from arch-gcn3.

Change-Id: Ibd2bda37fe9adc083a35efab0f59617d386019b9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53883
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 03:59:19 +00:00
Matthew Poremba
23aec13f07 arch-vega: Implement V_AND_OR_B32
Change-Id: I8daeb8de2db5996e132cf7ed729f02c3c94a6862
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53868
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 03:59:19 +00:00
Matthew Poremba
f1e3fa7a3e arch-vega: Implement V_ADD3_U32
Change-Id: I4d01265f946e289cbff56090c2dd193ea66d5c70
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53867
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 03:59:19 +00:00
Matthew Poremba
2abc51e810 arch-vega: Impelemnt V_ADD_LSHL_U32
Change-Id: Ia4e465ef2534fe28dc846f728b2e1da3dfe4f7d6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53866
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 03:59:19 +00:00
Matthew Poremba
04d025806d arch-vega: Implement V_LSHL_ADD_U32
Change-Id: I986f82e8c6c02b0d62e55fbaed1c3f9e5b2b4a43
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53865
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 03:59:19 +00:00
Matthew Poremba
19788d3b56 arch-vega: Implement V_LSHL_OR_B32
Change-Id: I237410e05df9a96323a6ceb7d09ae2a2a8608f16
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53864
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 03:59:19 +00:00
Matthew Poremba
b5d8d9ddcc arch-vega: Implement V_OR3_B32
Change-Id: Id6c074033b08058b739e056f06b40ee5735f8f00
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53863
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 03:59:19 +00:00
Matthew Poremba
c028af111a arch-gcn3,gpu-compute: Move TLB to common folder in amdgpu
This TLB is more of an "APU" TLB than anything GCN3 specific. It can be
used with either GCN3 or Vega. With this change, VEGA_X86 builds and one
can run binaries with Vega ISA code using the same steps as GCN3 but
building the Vega ISA instead.

Change-Id: I0c92bcd0379a18628dc05cb5af070bdc7e692c7c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53803
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-09 17:26:15 +00:00
Gabe Black
1c233ee9d2 scons: Add sim_object and enums arguments to SimObject().
This will explicitly declare what SimObject and Enum types need to be set
up in C++, which will make importing all the SimObject modules during
the setup phase of SCons uneccessary.

Change-Id: Id2d7603daf33b236ceaa0789e2f089f589d34e62
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49406
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-08 08:01:23 +00:00
Gabe Black
f315461bb7 arch,cpu: Stop using and remove ThreadContext::instAddr.
Change-Id: I9cd8077fd72a9d7bff20f1bd7ba37e4e038b8fac
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52062
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-11-30 23:30:06 +00:00
Kyle Roarty
223cd52431 arch-gcn3,arch-vega: Don't write exec in v_cmp_f_i32
Per the GCN3 and VEGA ISAs, v_cmpx_* writes exec, while v_cmp_* doesn't.

This removes the erroneous exec write in the VOP3 implementation of
v_cmp_f_i32.

Change-Id: I048e35917163c45b879f38d31a88f3f3d56c0baf
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52445
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-11-05 19:52:40 +00:00
Matthew Poremba
3112a7f0d0 arch-gcn3,gpu-compute: Move GCN3 specific TLB to arch
Move GpuTLB and TLBCoalescer to GCN3 as the TLB format is specific to
GCN3 and SE mode / APU simulation. Vega will have its own TLB,
coalescer, and walker suitable for a dGPU. This also adds a using alias
for the TLB translation state to reduce the number of references to
TheISA and X86ISA. X86 specific includes are also removed.

Change-Id: I34448bb4e5ddb9980b34a55bc717bbcea0e03db5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49847
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-04 23:47:03 +00:00
Matthew Poremba
c15e472199 arch-vega: Rework flat instructions to support global
Global instructions are new in Vega and are essentially FLAT
instructions from GCN3 but guaranteed to go to global memory where as
flat can go to global or local memory.

This reworks the flat instruction classes so that the initiateAcc /
execute / completeAcc logic can be reused for flat, global, and later
scratch subtypes of flat instructions. The decoder creates a flat
instruction class which sets instruction flags based on the flat
instruction's SEG field. There are new initOperandInfo and
generateDissasmbly methods for flat and global. The number of operands
and operand index getters are modified to check the flags and return the
correct value for the subtype.

Change-Id: I1db4a3742aeec62424189e54c38c59d6b1a8d3c1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47106
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Kyle Roarty <kyleroarty1716@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-04 22:51:37 +00:00
Kyle Roarty
008659bee1 arch-gcn3: Fix MUBUF out-of-bounds case 1
This patch upates the out-of-bounds check to properly check
against the correct buffer_offset, which is different depending
on if the const_swizzle_enable is true or false.

Change-Id: I5c687c09ee7f8e446618084b8545b74a84211d4d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51127
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Alex Dutu <alexandru.dutu@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-09-30 20:22:35 +00:00
Matthew Poremba
5a884fdab5 arch-vega: Fix VEGA_X86 build issues
The registerManager was not being dereferenced properly. Also remove
non-existant include file.

Change-Id: I5dac692abedc327ed83ee904e4c6ac5dac811e4c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47105
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-09-27 22:30:30 +00:00
Matthew Poremba
e51e698b74 arch-vega: Update instruction stats
These stats were moved to a Stats::Group but the instructions were not
updated to use the stats struct.

Change-Id: I49348e30bc0988a2a873f51bd7079c1f315649b4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47104
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2021-09-27 22:30:30 +00:00
Matthew Poremba
16de253c15 arch-vega: Add missing functions referenced by insts
Some instructions were referencing pc() and isExecMaskRegister() which
were not defined.

Change-Id: Ic5b3fa9057950ff85603fcb87447a81b6c7f274b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47103
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2021-09-27 22:30:30 +00:00