Commit Graph

19 Commits

Author SHA1 Message Date
Matthew Poremba
91955ae879 arch-vega: Decodings for all MFMA/SMFMACs up to MI300
This adds the decodings for all of the matrix fused multiply add (MFMA)
and sparse matrix fused multiply accumulate (SMFMAC) instructions up to
and including MI300. This does not yet provide the implementation for
these instructions, however it is easier and less tedious to add them in
bulk rather that one at a time.

Change-Id: I5acd23ca8a26bdec843bead545d1f8820ad95b41
2024-05-20 09:27:12 -05:00
Matthew Poremba
2bb62a05e1 arch-vega: Implement v_cvt_pk_fp8_f32
This instruction serves as a test for the MXFP8 type.

Change-Id: I2ce30bf7f3a3ecc850a445aebdf971c37c39a79e
2024-05-20 09:27:12 -05:00
Marco Kurzynski
d5a734c252 arch-vega: Template MFMA instructions
templated
- v_mfma_f64_16x16x4f64

added support for
- v_mfma_f32_32x32x2f32
- v_mfma_f32_4x4x1_16b_f32
- v_mfma_f32_16x16x4f32

[formula for gprs needed](https://github.com/ROCm/amd_matrix_instruction_calculator)

[formulas for register layouts and lanes used in computation](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf)

Change-Id: I15d6c0a5865d58323ae8dbcb3f6dcb701a9ab3c7
2024-05-20 09:27:12 -05:00
Matthew Poremba
9ab004cccc arch-vega: Implement V_LSHL_ADD_U64
This is a new instruction in MI300 and operates similar to
V_LSHL_ADD_U32 but on 64-bit values.

Change-Id: Ia4ac65160bdad748fccdcb28286ba03157cc4046
2024-03-21 10:10:01 -05:00
Matthew Poremba
31e63b01ad arch-vega: Add vop3p DOT instructions
Implemented according to the ISA spec. Validated with silion. In
particular the sign extend is important for the signed variants and the
unsigned variants seem to overflow lanes (hence why there is no mask()
in the unsigned varints. FP16 -> FP32 continues using ARM's fplib.

Tested vs. an MI210. Clamp has not been verified.

Change-Id: Ifc09aecbc1ef2c92a5524a43ca529983018a6d59
2024-01-03 15:41:06 -06:00
Matthew Poremba
420cda1bef arch-vega: Implement FP32 packed math
Starting with MI200, packed math can operate on double dword inputs. In
this case, 64-bits of inputs (two VGPRs per lane) contain two FP32
values.

Add instructions to perform add, multiply, and FMA on packed FP32 types.

Change-Id: Ib838bff91a10e02e013cc7c33ec3d91ff08647b0
2024-01-03 15:41:06 -06:00
Matthew Poremba
7b0c47d52f arch-vega: Implement all global atomics up to gfx90a
This change adds all of the missing flat/global atomics up to including
the new atomics in gfx90a (MI200). Adds all decodings and instruction
implementations with the exception of __half2 which does not have a
corresponding data type in gem5. This refactors the execute() and
completeAcc() methods by creating helper functions similar to what
initiateAcc() uses. This reduces redundant code for global atomic
instruction implementations.

Validated all except PK_ADD_F16, ADD_F32, and ADD_F64 which will be done
shortly. Verified the source/dest register sizes in the header are
correct and the template parameters for the new execute()/completeAcc()
methods are correct.

Change-Id: I4b3351229af401a1a4cbfb97166801aac67b74e4
2024-01-03 15:41:06 -06:00
Matthew Poremba
472c697d88 arch-vega: Implement v_mfma_i32_16x16x16i8
Tested using AMD labs notes examples located on github:

https://github.com/amd/amd-lab-notes/blob/release/matrix-cores/
    src/mfma_i32_16x16x16i8.cpp

Change-Id: Ib0e50162288528012b6d3395e1f629ebf12e8e54
2024-01-03 15:41:06 -06:00
Matthew Poremba
bc69ab0a1f arch-vega: Add VOP3P encodings and packed 16b insts
This adds the VOP3P and VOP3P_MAI encodings from the MI200 spec. These
instructions are used for packed math and miSIMD instructions. The first
19 VOP3P opcodes are implemented and validated against hardware. This
includes all instructions which operate on one dword containing two
packed 16-bit values of fp16, int16_t, or uint16_t.

Implement one MFMA instruction for now which was also validated against
hardware.
2024-01-03 10:40:34 -06:00
Matthew Poremba
cd76f92c94 arch-vega: Add decodings for new MI100 VOP2 insts
VOP2 with opcodes 55-61 were added in MI100 and are not in Vega10. This
changeset adds the decodings for these instructions.

The changeset does not implement the instructions, however the fatal
message is much more helpful for debugging compared so a generic
decode_invalid handler.

Change-Id: Ibde0880c35ff915bf8e50772df9ce263e55ca893
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70042
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2023-04-28 00:48:35 +00:00
Matthew Poremba
3a73aa3ac1 arch-vega: Implement new VOP2 using VOP3 insts
Vega adds three new VOP2 instructions that may use VOP3 encoding that
are not part of the GCN3 ISA: v_add_u32, v_sub_u32, v_subrev_u32. This
changeset implements those three new instructions to fix errors related
to "invalid encoding" when those instructions are seen.

Tested using srad from Rodinia 3.0 HIP port which compiles a v_add_u32
instruction with VOP3 encoding.

Change-Id: I409a9f72f5c37895c3a0ab7ceb14a4dd121874a4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61330
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-07-15 14:55:33 +00:00
Matthew Poremba
9313294efe misc: Remove AMD license addition
Remove the line "For use for simulation and test purposes only" in files
were AMD is the only copyright holder listed in the header. This happens
to be the case for all files where this line exists, removing it
completely from gem5.

Change-Id: I623f266b002f564301b28774f49081099cfc60fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 04:00:56 +00:00
Daniel R. Carvalho
974a47dfb9 misc: Adopt the gem5 namespace
Apply the gem5 namespace to the codebase.

Some anonymous namespaces could theoretically be removed,
but since this change's main goal was to keep conflicts
at a minimum, it was decided not to modify much the
general shape of the files.

A few missing comments of the form "// namespace X" that
occurred before the newly added "} // namespace gem5"
have been added for consistency.

std out should not be included in the gem5 namespace, so
they weren't.

ProtoMessage has not been included in the gem5 namespace,
since I'm not familiar with how proto works.

Regarding the SystemC files, although they belong to gem5,
they actually perform integration between gem5 and SystemC;
therefore, it deserved its own separate namespace.

Files that are automatically generated have been included
in the gem5 namespace.

The .isa files currently are limited to a single namespace.
This limitation should be later removed to make it easier
to accomodate a better API.

Regarding the files in util, gem5:: was prepended where
suitable. Notice that this patch was tested as much as
possible given that most of these were already not
previously compiling.

Change-Id: Ia53d404ec79c46edaa98f654e23bc3b0e179fe2d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-07-01 19:08:24 +00:00
Michael Boyer
3f5120e01f arch-vega: Implement non-carry-out VEGA add, sub, and subrev
In GCN3, the v_add_u32, v_sub_u32, and v_subrev_u32 instructions write
the carry-out value to VCC. VEGA introduces explicit carry-out versions
of these instructions (v_add_co_u32, v_sub_co_u32, and v_subrev_co_u32),
and modifies the behavior of the baseline, non-carry-out versions to not
write to VCC. Previously both the carry-out and non-carry-out versions
shared a single implementation that wrote to VCC. This patch correctly
implements the non-carry-out versions to avoid the VCC write.

This patch also makes the following substitutions for GCN3 instructions
that no longer exist in VEGA (this renaming has no functional impact):
v_addc_u32 -> v_addc_co_u32
v_subb_u32 -> v_subb_co_u32
v_subbrev_u32 -> v_subbrev_co_u32

Change-Id: I002fa6e9316d38fd4cc3554daff047523cfc12c9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47240
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-06-29 17:28:26 +00:00
Kyle Roarty
9ddfe09649 arch-vega: Add Vega-specific opcodes
The opcodes aren't implemented yet, returning nullptr

Change-Id: I700c2158035aea84e6365a32d53304accab59d96
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42208
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2021-04-01 02:58:31 +00:00
Kyle Roarty
f85a861594 arch-vega: Order pointer functions by opcode
This makes it easier to add new ops

Change-Id: I2820005c42c87a1289aa87ddcdc5473ff0e57bd9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42207
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2021-04-01 02:58:31 +00:00
Kyle Roarty
16548557ef arch-vega: Add decodings for Flat, Global, Scratch
Does not implement the functions yet

Change-Id: I32feab747b13bd2eff98983e3281c0d82e756221
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42206
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2021-04-01 02:58:31 +00:00
Kyle Roarty
b30e9645d7 arch-vega: Update instruction encodings
This also renames VOP3 and VOP3_SDST_ENC to
VOP3A and VOP3B, matching the ISA.

Change-Id: I56f254433b1f3181d4ee6896f957a2256e3c7b29
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42205
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2021-04-01 02:58:31 +00:00
Kyle Roarty
f7d4ff6ef5 arch-vega: Add Vega ISA as a copy of GCN3
This changeset adds Vega support as a copy of GCN3.
Configs have been modified to include both ISAs.
Current implementation is not complete and needs
modifications to fully comply with the ISA manual:

https://developer.amd.com/wp-content/resources/
Vega_Shader_ISA_28July2017.pdf

Change-Id: I608aa6747a45594f8e1bd7802da1883cf612168b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42204
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2021-04-01 02:58:31 +00:00