arch-vega: Fix S_GETREG_B32 masking/shifting

S_GETREG_B32 must return the requested bit field, so the mask should
not be inverted. The result also needs to be shifted right by the
offset, because the consumer of the value expects the field to start
at bit 0.

This can be easily tested by running a GPU kernel with __shared__
variables, which generates the following assembly:

s_getreg_b32 s6, hwreg(HW_REG_SH_MEM_BASES, 16, 16)

The current implementation returns the lower 16 bits (private memory
aperture), while the correct behavior is to return the upper 16 bits
(shared/LDS memory aperture).
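
The difference can be illustrated with a small standalone sketch. It is
not gem5 code: the helper names getregOld and getregFixed and the raw
register value 0xABCD1234 are made up for illustration, and the sketch
assumes size is less than 32 so that the shift is well defined.

```cpp
#include <cstdint>

// Hypothetical stand-in for the old behavior: the mask is inverted, so
// the requested field is cleared and everything else is returned unshifted.
constexpr uint32_t getregOld(uint32_t raw, unsigned offset, unsigned size)
{
    const uint32_t mask = ((1U << size) - 1U) << offset;
    return raw & ~mask;
}

// Hypothetical stand-in for the fixed behavior: keep only the requested
// field, then shift it down so that it starts at bit 0.
constexpr uint32_t getregFixed(uint32_t raw, unsigned offset, unsigned size)
{
    const uint32_t mask = ((1U << size) - 1U) << offset;
    return (raw & mask) >> offset;
}

// Made-up register contents: upper half 0xABCD, lower half 0x1234.
// hwreg(..., 16, 16) requests offset 16, size 16.
static_assert(getregOld(0xABCD1234u, 16, 16) == 0x00001234u,
              "old code yields the lower 16 bits");
static_assert(getregFixed(0xABCD1234u, 16, 16) == 0x0000ABCDu,
              "fixed code yields the upper 16 bits at bit 0");
```

With offset 16 and size 16, the old expression yields the lower half of
the register, while the fixed expression yields the upper half shifted
down to bit 0, which matches the behavior described above.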

Change-Id: Iea8f0adceeadb24cdcf46ef4183fcaa8262ab9e7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61654
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Matthew Poremba
2022-07-24 09:27:26 -07:00
parent f2949f3d03
commit 5c7514c81c


@@ -2044,7 +2044,7 @@ namespace VegaISA
// Store value from hardware to part of the SDST.
ScalarRegU32 mask = (((1U << size) - 1U) << offset);
-sdst = (hwreg.rawData() & ~mask);
+sdst = (hwreg.rawData() & mask) >> offset;
sdst.write();
} // execute
// --- Inst_SOPK__S_SETREG_B32 class methods ---