dev-amdgpu: Writeback PM4 queue rptr when empty (#597)

The GPU device keeps a local copy of each ring buffer's read pointer
(rptr) to avoid constant DMAs to/from host memory. The host-side copy
therefore needs to be updated periodically: the driver uses it to
determine how much space is left in the queue and may hang if it
believes the queue is full. For user-mode queues, this already happens
when queues are unmapped. For kernel-mode queues (e.g., HIQ, KIQ), the
rptr is never updated, leading to a hang.
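
As a rough illustration of why a stale rptr can stall the driver, the
sketch below models the free-space computation a driver might perform on
a power-of-two ring. The RingView struct and freeDwords helper are
hypothetical names chosen for this example; they are not taken from gem5
or the amdgpu driver, and the real driver logic may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the driver's view of a ring buffer. All names
// here are illustrative assumptions, not gem5 or amdgpu identifiers.
struct RingView
{
    uint32_t sizeDwords;  // ring capacity in dwords (must be a power of two)
    uint32_t rptr;        // last read pointer reported back by the device
    uint32_t wptr;        // write pointer maintained by the driver
};

// Free dwords as seen by the driver. One slot is kept unused so that
// rptr == wptr always means "empty" rather than "full".
inline uint32_t
freeDwords(const RingView &ring)
{
    // Unsigned subtraction plus masking handles wptr wrapping past rptr.
    uint32_t used = (ring.wptr - ring.rptr) & (ring.sizeDwords - 1);
    return ring.sizeDwords - 1 - used;
}
```

With a 256-dword ring, a driver that has written 255 dwords and still
sees rptr == 0 computes zero free space, even if the device consumed
every packet long ago. Once the device reports rptr == wptr, the same
computation yields 255 free dwords.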

In this patch, the rptr for *all* queues is reported back to the kernel
whenever the queue reaches an empty state (rptr == wptr). Additionally,
to handle PM4 queue wrap-around, the queue processing function checks
that the queue is not empty (rptr != wptr) instead of checking
rptr < wptr. This is safe because the driver fills PM4 queues with NOP
packets on initialization and when wrap-around occurs.
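
The difference between the two checks can be sketched in isolation. The
QueuePtrs struct and the helper functions below are hypothetical
stand-ins written for this example, not the PM4Queue interface; the
byte-to-dword shift mirrors the conversion described in the code
comments of this change.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a PM4 queue's pointers, both held as byte
// offsets. These names are assumptions made for illustration only.
struct QueuePtrs
{
    uint64_t rptr;
    uint64_t wptr;
};

// Old check: reports no work once wptr has wrapped around below rptr.
inline bool
hasWorkOld(const QueuePtrs &q)
{
    return q.rptr < q.wptr;
}

// New check: any difference between the pointers means packets remain.
inline bool
hasWorkNew(const QueuePtrs &q)
{
    return q.rptr != q.wptr;
}

// Convert a byte offset to the dword offset the driver expects.
inline uint32_t
toDwordOffset(uint64_t byteOffset)
{
    return static_cast<uint32_t>(byteOffset >> 2);
}
```

For a queue whose wptr has wrapped to 0x100 while rptr is still at
0xF00, the old check sees nothing to process and the new check does. A
byte offset of 0x100 is written back as dword offset 0x40.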

Change-Id: Ie13a4354f82999208a75bb1eaec70513039ff30f
Matthew Poremba
2023-11-27 11:02:11 -08:00
committed by GitHub
parent 0f6eabe8c9
commit 9e6a87e67a

@@ -168,7 +168,7 @@ PM4PacketProcessor::decodeNext(PM4Queue *q)
DPRINTF(PM4PacketProcessor, "PM4 decode queue %d rptr %p, wptr %p\n",
q->id(), q->rptr(), q->wptr());
-    if (q->rptr() < q->wptr()) {
+    if (q->rptr() != q->wptr()) {
/* Additional braces here are needed due to a clang compilation bug
falsely throwing a "suggest braces around initialization of
subject" error. More info on this bug is available here:
@@ -181,11 +181,28 @@ PM4PacketProcessor::decodeNext(PM4Queue *q)
dmaReadVirt(getGARTAddr(q->rptr()), sizeof(uint32_t), cb,
&cb->dmaBuffer);
} else {
// Reached the end of processable data in the queue. Switch out of IB
// if this is an indirect buffer.
assert(q->rptr() == q->wptr());
q->processing(false);
if (q->ib()) {
q->ib(false);
decodeNext(q);
}
// Write back rptr when the queue is empty. For static queues which
// are not unmapped, this is how the driver knows there is enough
// space in the queue to continue writing packets to the ring buffer.
if (q->getMQD()->aqlRptr) {
Addr addr = getGARTAddr(q->getMQD()->aqlRptr);
uint32_t *data = new uint32_t;
// gem5 stores rptr as a byte offset while the driver expects
// a dword offset. Convert the offset to dword count.
*data = q->getRptr() >> 2;
auto cb = new DmaVirtCallback<uint32_t>(
[data](const uint32_t &) { delete data; });
dmaWriteVirt(addr, sizeof(uint32_t), cb, data);
}
}
}