arch-vega: Fix multi-dword setElem in PackedReg (#1664)

There are two issues when setting an element in PackedReg where the
element spans multiple dwords. First, the mask is applied without being
inverted, so it clears every bit outside the element and clobbers both
dwords. Second, the value is shifted in its narrower input type, so the
bits that belong in the upper dword are shifted out.

Fix this by using the inverted mask to clear only the bits where the
value will be placed, and by promoting the value to a 64-bit type before
shifting it into place.
Matthew Poremba
2024-10-14 10:19:52 -07:00
committed by GitHub
parent 20965f571b
commit deb8f983a1

@@ -960,11 +960,14 @@ class PackedReg
 uint64_t elem_mask = (1ULL << ELEM_SIZE) - 1;
 value &= elem_mask;
 // Clear the bits where the value goes so that operator| can be used.
 elem_mask <<= qw_lbit;
-qword &= elem_mask;
+qword &= ~elem_mask;
-value <<= qw_lbit;
-qword |= value;
+// Promote to 64-bit to prevent shifting out of range
+uint64_t value64 = value;
+value64 <<= qw_lbit;
+qword |= value64;
 dwords[udw] = uint32_t(qword >> 32);
 dwords[ldw] = uint32_t(qword & mask(32));
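The two problems described in the commit message can be reproduced outside of gem5 with a small standalone sketch. Everything below is illustrative rather than the actual PackedReg code: `DwordPair`, `setElemBuggy`, `setElemFixed`, and the `elem_size` / `lbit` parameters are hypothetical names, and the 32-bit value type is an assumption standing in for the "narrower input type" the message mentions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two adjacent dwords an element straddles.
struct DwordPair {
    uint32_t lo;
    uint32_t hi;
};

// Pre-fix behaviour (sketch). Requires lbit < 32 so the 32-bit shift is
// well defined; elem_size must be < 64.
DwordPair setElemBuggy(DwordPair regs, uint32_t value,
                       unsigned elem_size, unsigned lbit)
{
    uint64_t qword = (uint64_t(regs.hi) << 32) | regs.lo;
    uint64_t elem_mask = (1ULL << elem_size) - 1;
    value &= uint32_t(elem_mask);
    elem_mask <<= lbit;
    qword &= elem_mask;  // bug 1: keeps ONLY the element bits
    value <<= lbit;      // bug 2: 32-bit shift drops the upper bits
    qword |= value;
    return { uint32_t(qword & 0xffffffffULL), uint32_t(qword >> 32) };
}

// Post-fix behaviour (sketch). Requires lbit + elem_size <= 64 and
// elem_size < 64.
DwordPair setElemFixed(DwordPair regs, uint32_t value,
                       unsigned elem_size, unsigned lbit)
{
    uint64_t qword = (uint64_t(regs.hi) << 32) | regs.lo;
    uint64_t elem_mask = (1ULL << elem_size) - 1;
    uint64_t value64 = value & elem_mask;  // widen before shifting
    elem_mask <<= lbit;
    qword &= ~elem_mask; // clear only the bits the element occupies
    qword |= value64 << lbit;
    return { uint32_t(qword & 0xffffffffULL), uint32_t(qword >> 32) };
}
```

For example, writing the 8-bit value 0xC3 at bit 28 of the pair {lo = 0xAAAAAAAA, hi = 0x55555555} with the fixed version leaves the neighbouring bits intact and yields {0x3AAAAAAA, 0x5555555C}. The buggy version wipes the neighbours and loses the upper nibble of the value, yielding {0xB0000000, 0x00000005}.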