arch-riscv: fix initialization for some vector reduction insts (#1340)

Vector reduce float (widening and non-widening) and integer (widening)
instructions initialize the reduce loop operation with the first element
of the destination register (i.e. `Vd[0]`).

Since all reductions per spec seem to be `Vd[0] = Vs1[0] + Vs2[*]`
(where `+` is an arbitrary binary op and `*` indicates all active
elements), gem5 will calculate this incorrectly if `Vd[0]` and/or
`Vs1[0]` are non-neutral for the operation (the latter case because
`Vs1[0]` is not taken into account at all).

To solve this we just have to initialize the reduction loop to `Vs1[0]`
(the non-widening integer reduction already does this).
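
A minimal standalone sketch of the difference, assuming a sum reduction
over plain `uint32_t` elements. The helper names (`reduce`,
`reduce_sum_old`, `reduce_sum_new`) are hypothetical and are not gem5
code; they only mirror the choice of initial value in `reduce_loop`:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for the reduce loop: folds the active elements
// of vs2 into an accumulator that starts at `init`.
template <typename T, typename Op>
T reduce(T init, const std::vector<T>& vs2, Op op)
{
    T acc = init;
    for (const T& e : vs2)
        acc = op(acc, e);
    return acc;
}

// Old behaviour: the accumulator is seeded from the stale destination
// element Vd[0]; Vs1[0] never contributes to the result.
uint32_t reduce_sum_old(uint32_t vd0, uint32_t vs1_0,
                        const std::vector<uint32_t>& vs2)
{
    (void)vs1_0; // ignored, which is the bug described above
    return reduce(vd0, vs2, std::plus<uint32_t>());
}

// New behaviour: the accumulator is seeded from Vs1[0], matching
// Vd[0] = Vs1[0] + Vs2[*].
uint32_t reduce_sum_new(uint32_t vs1_0, const std::vector<uint32_t>& vs2)
{
    return reduce(vs1_0, vs2, std::plus<uint32_t>());
}
```

With a stale `Vd[0] = 100`, `Vs1[0] = 10` and `Vs2 = {1, 2, 3, 4}`, the
old seeding yields 110 while the new one yields the expected 20. The two
agree only when both `Vd[0]` and `Vs1[0]` are neutral (0 for a sum).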
Saúl
2024-07-11 07:08:49 +02:00
committed by GitHub
parent 2b902b0aec
commit 8dde32d2dc

@@ -1826,7 +1826,7 @@ Fault
auto reduce_loop =
[&, this](const auto& f, const auto* _, const auto* vs2) {
-vu tmp_val = Vd[0];
+vu tmp_val = Vs1[0];
for (uint32_t i = 0; i < this->microVl; i++) {
uint32_t ei = i + vtype_VLMAX(vtype, vlen, true) *
this->microIdx;
@@ -1876,7 +1876,7 @@ Fault
auto reduce_loop =
[&, this](const auto& f, const auto* _, const auto* vs2) {
-vwu tmp_val = Vd[0];
+vwu tmp_val = Vs1[0];
for (uint32_t i = 0; i < this->microVl; i++) {
uint32_t ei = i + vtype_VLMAX(vtype, vlen, true) *
this->microIdx;
@@ -2230,7 +2230,7 @@ Fault
auto reduce_loop =
[&, this](const auto& f, const auto* _, const auto* vs2) {
-vwu tmp_val = Vd[0];
+vwu tmp_val = Vs1[0];
for (uint32_t i = 0; i < this->microVl; i++) {
uint32_t ei = i + vtype_VLMAX(vtype, vlen, true) *
this->microIdx;