gpu-compute: Fix private offset/size register indexes

According to the ABI documentation from LLVM, the *low* register of flat
scratch (maxSGPR - 4) is the offset and the high register (maxSGPR - 3)
is the size. These are currently swapped, which produces bogus
addresses and leads to page faults and/or incorrect data.

This commit fixes this by setting the order correctly.
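
As a minimal sketch of the intended layout (not gem5 code; the struct
and function names below are hypothetical and exist only for
illustration), the virtual SGPR indexes of the flat scratch pair can be
written as:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of gem5: virtual SGPR indexes of the
// flat scratch register pair, following the order described above.
struct FlatScratchIdx
{
    uint32_t offset; // low register of the pair (maxSgprs - 4)
    uint32_t size;   // high register of the pair (maxSgprs - 3)
};

inline FlatScratchIdx
flatScratchIdx(uint32_t maxSgprs)
{
    // Assumption: the wavefront has at least four SGPRs, so the
    // subtractions cannot wrap around.
    assert(maxSgprs >= 4);
    return {maxSgprs - 4, maxSgprs - 3};
}
```

In the real code these virtual indexes are still passed through
mapSgpr() to obtain physical register indexes before the scalar
register file is read.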

Change-Id: I0b1d077c49c0ee2a4e59b0f6d85cdb8f17f9be61
Matthew Poremba
2023-08-26 13:09:28 -05:00
parent e0379f4526
commit 4506188e00

@@ -901,12 +901,12 @@ GPUDynInst::resolveFlatSegment(const VectorMask &mask)
     uint32_t numSgprs = wavefront()->maxSgprs;
     uint32_t physSgprIdx =
         wavefront()->computeUnit->registerManager->mapSgpr(wavefront(),
-            numSgprs - 3);
+            numSgprs - 4);
     uint32_t offset =
         wavefront()->computeUnit->srf[simdId]->read(physSgprIdx);
     physSgprIdx =
         wavefront()->computeUnit->registerManager->mapSgpr(wavefront(),
-            numSgprs - 4);
+            numSgprs - 3);
     uint32_t size =
         wavefront()->computeUnit->srf[simdId]->read(physSgprIdx);
     for (int lane = 0; lane < wavefront()->computeUnit->wfSize(); ++lane) {