66d4a158 added support for AMD's GPU cache bypassing flags (GLC
for bypassing L1 caches, SLC for bypassing all caches). However,
for applications that use the GLC flag but intermix GLC- and
non-GLC accesses to the same address, this previous commit
has a bug. This bug manifests when the address is currently
valid in the L1 (TCP). In this case, the previous commit chose
to evict the line before letting the bypassing access to proceed.
However, to do this the previous commit was using the inv_invDone
action as part of the process of evicting it. This action is only
intended to be called when load acquires are being performed
(i.e., when the entire L1 cache is being flash invalidated). Thus,
calling inv_invDone for a GLC (or SLC) bypassing request caused an
assert failure since the bypassing request was not performing a
load acquire.
This commit resolves this by changing the support in this case to
simply invalidate the entry in the cache.
Change-Id: Ibaa4976f8714ac93650020af1c0ce2b6732c95a2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67199
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Two W->WI transitions, on events RdBlk and Atomic in the GPU L2 cache
coherence protocol do not clear the request from the request queue upon
completing the transition. This action is not performed in the respone
path. This update adds the p_popRequestQueue action to each of these
transitions to remove the stale request from the queue.
Change-Id: Ia2679fe3dd702f4df2bc114f4607ba40c18d6ff1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67192
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
An earlier commit added support for GLC and SLC AMDGPU instruction
modifiers. These modifiers enable cache bypassing when set. The GLC/SLC
flag information was being threaded through all the way to memory and
back so that appropriate actions could be taken upon receiving a request
and corresponding response. This commit removes the threading and adds
the bypass flag information to TBE. Requests populate this
entry and responses access it to determine the correct set of actions to
execute.
Change-Id: I20ffa6682d109270adb921de078cfd47fb4e137c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67191
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
The GPU cache models do not support cache bypassing when the GLC or SLC
AMDGPU instruction modifiers are used in a load or store. This commit
adds cache bypass support by introducing new transitions in the
coherence protocol used by the GPU memory system. Now, instructions with
the GLC bit set will not cache in the L1 and instructions with SLC bit
set will not cache in L1 or L2.
Change-Id: Id29a47b0fa7e16a21a7718949db802f85e9897c3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66991
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
An example case,
```python
mem_side_port = RequestPort(
"This port sends requests and " "receives responses"
)
```
This is the residue of running the python formatter.
This is done by finding all tokens matching the regex `"\s"(?![.;"])`
and manually replacing them by empty strings.
Change-Id: Icf223bbe889e5fa5749a81ef77aa6e721f38b549
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66111
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
The current MESI_Two_Level protocol's L1 caches updates the MRU information twice per request on misses -- once when the request reaches Ruby and once when the miss is returned from another level of the memory hierarchy.
Although this approach does not cause any correctness bugs for replacement policies like LRU since this request is the LRU in both cases, it does not work correctly for other policies like SecondChance and LFU, where updating the information twice (for misses) causes them to devolve to LRU.
Note that this was not directly a problem with Ruby previously, because it only supported LRU-based policies that were unaffected by this. However, with the integration of 20879 Ruby now uses the same replacement policies as Classic (which has additional, non-LRU based replacement policies).
This patch resolves this problem by not updating the MRU information a second time for the misses. It has been tested and validated with the replacement policy tests.
Change-Id: I9e7e96a9d6c09f3d6b7daae7115ef091ac3bdc08
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64371
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Clang Version 14 throws a warning "use of bitwise '&/|' with boolean
operands" for cases where bitwise operations are used where boolean
operations are intended.
This occurred in "WriteMast.hh", "data.isa", and "decode.cc" where
boolean values were being compared using the bitwise operands. While
bitwise operations are equivalent, they have been changed to boolean
operations in this patch to avoid the clang-14 warning.
Change-Id: Ic7583e13a325661712c75c8e1b234c4878832352
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64172
Reviewed-by: Tom Rollet <tom.rollet@huawei.com>
Reviewed-by: Kunal Pai <kunpai@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
The `Addr line_addr` in "src/mem/snoop_filter.cc" variable was only
used in an assert, stripped when compiling gem5.fast.
Clang-13 throws a warning for this variable. This has been fixed by
merging the variable and associated logic into the assert statement.
The variables in inet.cc and Sequencer.cc were also causing an 'unused
variable' warning to be thrown due to variables that were only used in
assert statements. In these cases the logic could not be moved into the
assert statement and, as such, the `GEM5_VAR_USED` MACRO is used to
remove this warning.
Change-Id: I6511d0863608c38b79e4558c7dcf35a323fe8362
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64171
Reviewed-by: Kunal Pai <kunpai@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
AddrRangeMap::intersects doesn't support ranges with different
interleavings, thus the current implementation of the destination
seach won't work in cases when different machines map the same address
with different interleaving.
The fixed implementation uses a different AddrRangeMap for each mach
type.
Change-Id: Idd0184da343c46c92a4c86f142938902096c2b1f
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63671
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Finish_CopyBack_Stale is scheduled only when the requestor is the last
sharer. This prevents the cacahe evicting the line which was already
evicted while the stale WriteBack transaction was stalled.
Wrong condition check in Finish_CopyBack_Stale for eviction is also
removed.
Change-Id: Ib66acc1b9e4a6f7cea373e1fb37375427897d48d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63611
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The current MI_example protocol's L1 caches updates the MRU information twice per store requests that miss -- once when the request reaches Ruby and once when the store miss is returned from another level of the memory hierarchy.
Although this approach does not cause any correctness bugs for replacement policies like LRU since this request is the LRU in both cases, it does not work correctly for other policies like SecondChance and LFU, where updating the information twice (for misses) causes them to devolve to LRU.
Note that this was not directly a problem with Ruby previously, because it only supported LRU-based policies that were unaffected by this. However, with the integration of 20879 Ruby now uses the same replacement policies as Classic (which has additional, non-LRU based replacement policies).
This patch resolves this problem by not updating the MRU information a second time for the misses. It has been tested and validated with the replacement policy tests in 20880, and it modifies the store instead of the load in 62232.
Change-Id: I8436e3e537da0ee5841c59a94fa5e5c30105529f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63191
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
The current MI_example protocol's L1 caches updates the MRU information twice per request on misses -- once when the request reaches Ruby and once when the miss is returned from another level of the memory hierarchy.
Although this approach does not cause any correctness bugs for replacement policies like LRU since this request is the LRU in both cases, it does not work correctly for other policies like SecondChance and LFU, where updating the information twice (for misses) causes them to devolve to LRU.
Note that this was not directly a problem with Ruby previously, because it only supported LRU-based policies that were unaffected by this. However, with the integration of 20879 Ruby now uses the same replacement policies as Classic (which has additional, non-LRU based replacement policies).
This patch resolves this problem by not updating the MRU information a second time for the misses. It has been tested and validated with the replacement policy tests in 20880.
Change-Id: I82a57abf2a16d70820413ba8118378f2e91fd7fb
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62232
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
This commit adds `desc` descriptions to the new symbols introduced
with CHI DVM support. The generation of the SLICC HTML documentation
requires each symbol to have a description, so a build with
`SLICC_HTML=True` will fail without this change.
Change-Id: I06f3bdd33edd1ff6e4bec35b01a460b9359ed9f6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/60869
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Previously, the GPU SQC and TCP Ruby protocols always told the Sequencer
that the externalHit field was false. This impacts the statistics and
profiling, because the Sequencer uses this hit/miss information both for
profiling and the coalescer's statistics.
To resolve this, this commit updates the GPU SQC and TCP Ruby protocols
to pass the appropriate hit/miss information into the Sequencer's
readCallback and hitCallback functions.
Change-Id: Ib74af09b66fa8866eee72d3a9ab0e8a8f2196c03
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/60652
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
A deadlock occured where we got a RdBlk while in W,
which put us in WI while we wait for a writeback to complete.
This would cause the request to be stalled while the writeback
was occuring, but when the writeback completed (WBAck), we never
woke up the requests and thus never completed the RdBlk.
This commit adds a wakeup when we receive a WBAck while in WI.
Change-Id: I01edf1d7a47757b4f680baf9f33a1a6aa37e7e25
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59352
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When ReadOnce request hits upstream, set dataToBeInvalid to true
for R* states so that the line from the upstream is successfully dropped
at the end by Finalize_UpdateCacheFromTBE.
For UD_RU and UC_RU state, set dataValid to true to prevent it changing
to RU state when it doesn't get the snoop data response.
Change-Id: Ie83c511e8d158e18abc5c9c16bc6040ce73587bf
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58411
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Tiago Muck <tiago.muck@arm.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Assume core C1 with private L1/L2 and a shared exclusive L3.
C1 has a line in SC state, while the state in the L3 is
RUSC (L3 has exclusive accesses and upstream requester has line in SC).
When C1 evicts the line (Evict request), the L3 has to issue a
WriteEvictFull to the home node, however the L3 doesn't have a copy
of the line.
This fix handling Evict requests when the line state is RUSC. When
the last sharer issues an Evict request, the responder may issue
SnpOnce the obtain a copy the line if needed.
JIRA: https://gem5.atlassian.net/browse/GEM5-1195
Change-Id: Ic8f4e10b38d95cd6d84f8d65b87b0c94fcf52eea
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59991
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
When just forwarding a WU request, the controller waits until the WU is
acked from downstream before sending the ack upstream. This
prevents snoops clearing valid WU data.
JIRA: https://gem5.atlassian.net/browse/GEM5-1195
This was more likely to happen with shared exclusive caches, e.g:
assume core C1 and C2 with private L1/L2 and a shared exclusive L3.
C1 has as dirty copy of the line while C2 issues a WriteUnique request
to that line. The line state is RU in the L3, so the L3 will just
forward the request to the HNF, so:
- C2 issues WU to L3 cache
- L3 acks the WU, allowing C2 to send the data, while concurrently
forwarding the WU to the HNF.
- L3 receives data from C2
- HNF sends invalidating snoops upstream because line is RU
- The snoop hazards with the pending WU at the L3 and invalidates
the data previously received. This causes an assertion to fail when
we resume handling the WU.
Change-Id: I51e457e0bdb648c0fff3f702b7d2c95dcf431dc5
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59990
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Normally we don't check the TBE data if there are outstanding response
messages for the transaction because that means the latest valid data is
either in another cache or within an inflight message.
However this is not the case when we have either a pending CleanUnique
or we are handling CleanUnique. So bypass the pending message check in
this case.
Change-Id: I5f31039ca2a01a6a68fee8e0f3cf02c7e437b43e
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57395
Reviewed-by: Daecheol You <daecheol.you@samsung.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Initiate_MaitainCoherence would not trigger a writeback if
tbe.dataMaybeDirtyUpstream is set due to the assumption that
the upstream cache would writeback any dirty data. However this
is not the case if we use this action finalize a CleanUnique, e.g.:
- L1-A has data in SC
- L1-B has data in SD
- L2 has data in RUSD (L2 is an exclusive cache)
- L1-A sends CleanUnique to L2
- L2 invalidates L1-B and receives dirty data.
- L2 acks the CleanUnique; L1-A is now UC
- L2 has the dirty data but drops it because dataMaybeDirtyUpstream
- L1-A doesn't modify the data and eventually evicts it with WriteEvict
- Data from WriteEvicts are dropped at the HNF and we lose the line
This patch removes the tbe.dataMaybeDirtyUpstream check.
Instead it only skips the WriteBack if an upstream cache is in
SD state, when it's guaranteed it will writeback the dirty data.
JIRA: https://gem5.atlassian.net/browse/GEM5-1195
Change-Id: I6722bc25068b0c44afcf261abc8824f1d80c09f9
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57392
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Daecheol You <daecheol.you@samsung.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
WriteCleanFull can be requested for the cache line in SD state (e.g.
Local eviction of a cache line in SD_RSC state). In this case, the
requestor is the owner of the cache line,
but it doesn't have it with exclusive right.
Thus, 'ownerIsExcl == false' should be removed from the stale condition.
Change-Id: I4d34021ac31b2e8600c24689a03a3b8fa18aa1f7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58412
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Initiate_CopyBack_Stale removes the requestor from the sharer list.
However, if CBWrData_SC is the data response of stale WriteCleanFull,
the requestor should remain in the sharer list.
Thus, whether to send a Evict or not can be decided after the data
response arrives. For this, FinishCopyBack_Stale event was added as the
last event to handle Evict.
Change-Id: Ic3e3a1e4d74b24b9aa328b2ddfa817db44f24e4e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58413
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>