mem-ruby: fix load deadlock with WB GPU L2 caches
By default, the GPU VIPER coherence protocol uses a write-through (WT)
L2 cache. However, it also supports write-back (WB) caches, although
this is not tested currently. Using a WB L2 cache for the GPU results
in deadlocks with loads. Specifically, when a load reaches the L2 and
the line is currently in the W state, that line must be written back
before the load can be performed. However, the existing transition for
this case in the L2 did not retry the load when the WB completed,
resulting in a deadlock. This deadlock can be replicated by running the
GPU Ruby random tester as is with a WB L2 cache instead of a WT L2
cache.

To fix this, this change modifies the transition in question to put the
load on the stalled requests buffer, which the WBAck checks when it
returns to the L2 (and thus performs the load).

This fix has been tested and verified with both the per-checkin and
nightly GPU Ruby random tester tests (with a WB L2 cache).

Change-Id: Ieec4f61a3070cf9976b8c3ef0cdbd0cc5a1443c6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68977
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
committed by Matt Sinclair
parent fb4eb86711
commit 92d920f994
@@ -718,10 +718,13 @@ machine(MachineType:TCC, "TCC Cache")
    p_popRequestQueue;
  }
  transition(W, RdBlk, WI) {TagArrayRead, DataArrayRead} {
    p_profileHit;
    t_allocateTBE;
    wb_writeBack;
    p_popRequestQueue;
    // need to try this request again after writing back the current entry -- to
    // do so, put it with other stalled requests in a buffer to reduce resource
    // contention since they won't try again every cycle and will instead only
    // try again once woken up
    st_stallAndWaitRequest;
  }

  transition(I, RdBlk, IV) {TagArrayRead} {
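
The stall-and-wake behaviour described in the commit message can be illustrated with a small toy model. This is only a sketch and not gem5 or SLICC code: the class `ToyL2`, its `retry_after_writeback` flag, and the `V` (valid) state are hypothetical names introduced for illustration, while `W`, `WI`, and `I` mirror the states named in the transitions above. It assumes a single cache line and that a write-back ack returns the line to `I`.

```python
from collections import defaultdict, deque


class ToyL2:
    """Toy model of an L2 with a per-address stalled-request buffer.

    Hypothetical illustration only; it does not reproduce gem5's TCC.
    """

    def __init__(self, retry_after_writeback):
        self.state = {}                    # addr -> "W", "WI", "I" or "V"
        self.stalled = defaultdict(deque)  # addr -> parked load ids
        self.completed = []                # load ids that finished
        self.retry_after_writeback = retry_after_writeback

    def load(self, addr, load_id):
        st = self.state.get(addr, "I")
        if st == "W":
            # Dirty line: start the write-back and enter the transient state.
            self.state[addr] = "WI"
            if self.retry_after_writeback:
                # Fixed behaviour: park the load until the ack wakes it up.
                self.stalled[addr].append(load_id)
            # Old behaviour: the request is dropped here and never retried.
        elif st == "WI":
            # Simplifying assumption: loads arriving mid-write-back also wait.
            self.stalled[addr].append(load_id)
        else:
            # Line is clean or absent: the load can be serviced.
            self.state[addr] = "V"
            self.completed.append(load_id)

    def wb_ack(self, addr):
        # Write-back finished: invalidate, then wake every parked load.
        self.state[addr] = "I"
        waiting = self.stalled.pop(addr, deque())
        while waiting:
            self.load(addr, waiting.popleft())


def run(retry_after_writeback):
    l2 = ToyL2(retry_after_writeback)
    l2.state[0x40] = "W"   # line starts out dirty
    l2.load(0x40, "ld0")   # load hits the dirty line
    l2.wb_ack(0x40)        # write-back ack returns to the L2
    return l2.completed


if __name__ == "__main__":
    print(run(False))  # load is never completed (the deadlock)
    print(run(True))   # load completes after the ack wakes it up
```

In the model, the only difference between the two runs is whether the load is parked before the write-back starts; without that, nothing is left for the ack to wake up.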