mem-ruby: fix races between data and DMA in MOESI_AMD_Base-dir

There are race conditions while running several benchmarks, where
the DMA engine and the CorePair simultaneously send requests for the
same block. This patch fixes two scenarios
(a) If the request from the DMA engine arrives before the one from the
CorePair, the directory controller records it as a pending request.
However, once the DMA request is serviced, the directory doesn't check
for pending requests. The CorePair, consequently, never sees a response
to its request and this results in a Deadlock.

Added call to wakeUpDependents in the transition from BDR_Pm to U
Added call to wakeUpDependents in the transition from BDW_P to U

(b) If the request from the CorePair is being serviced by the directory
and the DMA requests for the same block, this causes an invalid
transition because the current coherence doesn't take care of this
scenario.

Added transition state where the requests from DMA are added to the
stall buffer.

Updated B to U CoreUnblock transition to check all buffers, as the DMA
requests were being placed later in the stall buffer than was being checked

Change-Id: I5a76efef97723bc53cf239ea7e112f84fc874ef8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31996
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This commit is contained in:
Kyle Roarty
2020-07-29 17:58:45 -05:00
parent d542dc838e
commit 187c44fe44
2 changed files with 22 additions and 3 deletions

View File

@@ -180,6 +180,7 @@ machine(MachineType:Directory, "AMD Baseline protocol")
void set_tbe(TBE a);
void unset_tbe();
void wakeUpAllBuffers();
void wakeUpAllBuffers(Addr a);
void wakeUpBuffers(Addr a);
Cycles curCycle();
@@ -1069,6 +1070,10 @@ machine(MachineType:Directory, "AMD Baseline protocol")
stall_and_wait(requestNetwork_in, address);
}
action(sd_stallAndWaitRequest, "sd", desc="Stall and wait on the address") {
stall_and_wait(dmaRequestQueue_in, address);
}
action(wa_wakeUpDependents, "wa", desc="Wake up any requests waiting for this address") {
wakeUpBuffers(address);
}
@@ -1077,6 +1082,10 @@ machine(MachineType:Directory, "AMD Baseline protocol")
wakeUpAllBuffers();
}
action(wa_wakeUpAllDependentsAddr, "waaa", desc="Wake up any requests waiting for this address") {
wakeUpAllBuffers(address);
}
action(z_stall, "z", desc="...") {
}
@@ -1090,6 +1099,11 @@ machine(MachineType:Directory, "AMD Baseline protocol")
st_stallAndWaitRequest;
}
// The exit state is always going to be U, so wakeUpDependents logic should be covered in all the
// transitions which are flowing into U.
transition({BL, BS_M, BM_M, B_M, BP, BDW_P, BS_PM, BM_PM, B_PM, BS_Pm, BM_Pm, B_Pm, B}, {DmaRead,DmaWrite}){
sd_stallAndWaitRequest;
}
// transitions from U
transition(U, DmaRead, BDR_PM) {L3TagArrayRead} {
@@ -1193,7 +1207,7 @@ machine(MachineType:Directory, "AMD Baseline protocol")
}
transition({B}, CoreUnblock, U) {
wa_wakeUpDependents;
wa_wakeUpAllDependentsAddr;
pu_popUnblockQueue;
}
@@ -1323,12 +1337,18 @@ machine(MachineType:Directory, "AMD Baseline protocol")
}
transition(BDW_P, ProbeAcksComplete, U) {
// Check for pending requests from the core we put to sleep while waiting
// for a response
wa_wakeUpAllDependentsAddr;
dt_deallocateTBE;
pt_popTriggerQueue;
}
transition(BDR_Pm, ProbeAcksComplete, U) {
dd_sendResponseDmaData;
// Check for pending requests from the core we put to sleep while waiting
// for a response
wa_wakeUpDependents;
dt_deallocateTBE;
pt_popTriggerQueue;
}

View File

@@ -149,8 +149,7 @@ AbstractController::wakeUpAllBuffers(Addr addr)
{
if (m_waiting_buffers.count(addr) > 0) {
//
// Wake up all possible lower rank (i.e. lower priority) buffers that could
// be waiting on this message.
// Wake up all possible buffers that could be waiting on this message.
//
for (int in_port_rank = m_in_ports - 1;
in_port_rank >= 0;