Commit Graph

152 Commits

Author SHA1 Message Date
Giacomo Travaglini
da740b1cdd mem-ruby: Add a DBID field to the CHIResponseMsg data type
This will hold the CHI Data Buffer Identifier (DBID) field.
The DBID allows a Completer of a transaction to provide its own
identifier for a transaction ID.
This new ID will be used as a TxnId field by a following
WriteData/CompData/CompAck response.

For now we only set it to the original txnId (identity mapping)

Change-Id: If30c5e1cafbe5a30073c7cd01d60bf41eb586cee
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-08 08:38:13 +01:00
Giacomo Travaglini
4359567180 mem-ruby: Generate TxnId field for an incoming CHI request
The TxnId field of a CHI request has so far been unused (other than for
DVM transactions). With this patch we always initialize the field when
we extract a ruby request from the sequencer port.
According to specs (IHI0050F):

A 12-bit field is defined for the TxnID with the number of outstanding
transactions being limited to 1024. A Requester is permitted to reuse a
TxnID value after it has received either:
* All responses associated with a previous transaction that have used
the same value.
* A RetryAck response for a previous transaction that used the same value

Change-Id: Ie48f0fee99966339799ac50932d36b2a927b1c7d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-08 08:38:13 +01:00
Giacomo Travaglini
f032eeae93 mem-ruby: Provide a fromSequencer helper function
Based on the CHIRequestType, it automatically tells if the
request has been originated from the sequencer
(CPU load/fetch/store)

Change-Id: I50fd116c8b1a995b1c37e948cd96db60c027fe66
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-08 08:38:13 +01:00
Matthew Poremba
2da54d5a4f mem-ruby: Reorder SLC atomic and response actions
Currently the MOESI_AMD_Base-directory transition for system level
atomics sends the response message before the atomic is performed. This
was likely done because atomics are supposed to return the value of the
data *before* the atomic is performed and by simply ordering the actions
this way that was taken care of.

With the new atomic log feature, the atomic values are pulled from the
log by the coalescer on the return path. Therefore, these actions can be
reordered. However, it is now necessary that the atomics be performed
before sending the response so that the log is populated and copied by
the response action. This should fix #253 .

Change-Id: Ie7e178f93990975367de2cc3e89e5ef9c9069241
2023-09-01 10:36:54 -05:00
Bobby R. Bruce
0e323bc409 mem: Atomic ops to same address (#200)
Augmenting the DataBlock class with a change log structure to record the
effects of atomic operations on a data block and service these changes
if the atomic operations require return values.

Although the operations are atomic, the coalescer need not send unique
memory requests for each operation. Atomic operations within a wavefront
to the same address are now coalesced into a single memory request. The
response of this request carries all the necessary information to
provide the requesting lanes unique values as a result of their
individual atomic operations. This helps reduce contention for request
and response queues in simulation.

Previously, only the final value of the datablock after all atomic ops
to the same address was visible to the requesting waves. This change
corrects this behavior by allowing each wave to see the effect of this
individual atomic op is a return value is necessary.
2023-08-30 23:53:35 -07:00
Bobby R. Bruce
68a48a2dfa mem-ruby: fix CHI sending the wrong snoop response (#219)
Do not respond with SnpRespData_I when the line is still present
upstream.
2023-08-28 16:21:25 -07:00
Bobby R. Bruce
737c611e72 mem-ruby: fix assert on CHI ReadUnique (#218)
DCT must be disabled when handling a ReadUnique where the copy need to
be upgraded.

Previously we were just asserting as it was assumed DCT is only enabled
for HNFs (which can "auto-upgrade"). However DCT may also be enabled for
intermediated levels of distributed shared caches above the HNFs.
2023-08-28 16:06:09 -07:00
Bobby R. Bruce
4bd3d2f864 mem-ruby: Improve Ruby/CHI stats for in/out trans (#220)
Currently we generate these stats for all defined Events in the
protocol, which may generate too many stats that are never used. Though
these don't appear in the stats.txt file, they unnecessarily increases
simulation startup time and memory footprint.

This patch limits those stats to events with the "in_trans" and/or
"out_trans" properties. SLICC compiler then checks which combinations of
event+state are possible when generating the stats.

Also the possible level of detail for inTransLatHist was reduced.
Only the number of transactions for each event+initial+final state
combinations is now accounted. Latency histograms are only defined per
event type (similarly to outTransLatHist). This significantly reduces
the final file size for generated stats.
2023-08-28 15:06:39 -07:00
Tiago Mück
9584d2efa9 mem-ruby: add in_trans/out_trans to CHI events
Marks which events signal the beginning of incoming and outgoing
transactions for generating inTransLatHist and outTransLatHist stats.

Change-Id: I90594a27fa01ef9cfface309971354b281308d22
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 17:25:50 -05:00
Tiago Mück
a5fd6edea1 mem-ruby: fix CHI sending the wrong snoop response
Do not respond with SnpRespData_I when the line is still present
upstream.

Change-Id: I2592e5c6637cfc0e83042169a245837648276e61
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 17:04:09 -05:00
Tiago Mück
49f5ec16d1 mem-ruby: fix assert on CHI ReadUnique
DCT must be disabled when handling a ReadUnique where the copy
need to be upgraded.

Previously we were just asserting as it was assumed DCT is only enabled
for HNFs (which can "auto-upgrade"). However DCT may also be enabled
for intermediated levels of distributed shared caches above the HNFs.

Change-Id: I9e29142a8d2f59ea61c1d90cda6b00c19435d6b7
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 16:58:25 -05:00
Reiley Jeyapaul
c9ff54677f mem-ruby: fix CHI Evict race condition
When an Evict request is received from upstream for a shared line
and the line is no longer cached locally (or on any other upstream
cache), we need to also send an Evict downstream. In this case we need
to wait until our outgoing Evict completes before completing the Evict
from upstream in order be able to resolve race conditions with incoming
snoops. E.g.: while our outgoing Evict is pending we may receive a
snoop requesting data, but we won't be able to complete this snoop if
we have already completed all upstream Evicts and we no longer have the
line.

Change-Id: I23ac4f0a9c4ddd81e2425376c8d1e1c7fb66d107
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 15:49:51 -05:00
Ranganath (Bujji) Selagamsetty
f6a453362f mem: Atomic ops to same address
Augmenting the DataBlock class with a change log structure to
record the effects of atomic operations on a data block and
service these changes if the atomic operations require return
values.

Although the operations are atomic, the coalescer need not
send unique memory requests for each operation. Atomic
operations within a wavefront to the same address are now
coalesced into a single memory request. The response of this
request carries all the necessary information to provide the
requesting lanes unique values as a result of their individual
atomic operations. This helps reduce contention for request
and response queues in simulation.

Previously, only the final value of the datablock after all
atomic ops to the same address was visible to the requesting
waves. This change corrects this behavior by allowing each wave
to see the effect of this individual atomic op is a return value
is necessary.

Change-Id: I639bea943afd317e45f8fa3bff7689f6b8df9395
2023-08-23 14:45:25 -05:00
Daniel Kouchekinia
984499329d mem-ruby,configs: Add GLC Atomic Latency VIPER Parameter (#110)
Added a GLC atomic latency parameter (glc-atomic-latency) used when
enqueueing response messages regarding atomics directly performed in
the TCC. This latency is added in addition to the L2 response latency
(TCC_latency). This represents the latency of performing an atomic
within the L2.

With this change, the TCC response queue will receive enqueues with
varying latencies as GLC atomic responses will have this added GLC
atomic latency while data responses will not. To accommodate this in
light of the queue having strict FIFO ordering (which would be violated
here), this change also adds an optional parameter bypassStrictFIFO to
the SLICC enqueue function which allows overriding strict FIFO
requirements for individual messages on a case-by-case basis. This
parameter is only being used in the TCC's atomic response enqueue call.

Change-Id: Iabd52cbd2c0cc385c1fb3fe7bcd0cc64bdb40aac
2023-07-23 15:57:06 -05:00
Daniel Kouchekinia
1705853b12 mem-ruby: Added support for non-system-scope atomics in VIPER (#101)
Added support for performing non-SLC-set atomics in the TCC.
Previously, all atomics were being passed on by the TCC to the
directory. With this change, atomics will only be passed on if the SLC
bit is set or if the line isn't present or available in the TCC.

If a non-SLC atomic is passed on to the directory because it is not
present in the TCC, the atomic will be performed on the return path on
the Data event. To accommodate the directory not performing the atomic
in this case, this change also passes the SLC bit on to the directory.

The previously-named "Atomic" action has been renamed to
"AtomicPassOn", with the new "Atomic" corresponding to an atomic
performed directly in the TCC.

Change-Id: Ibf92f71ddceb38bd1b0da70b0a786cc4c3cf2669
2023-07-20 11:48:08 -05:00
Daniel Kouchekinia
f8f5dd98bf mem-ruby: Added WIB State to VIPER TCC Cache (#67)
Added WIB (Waiting on Writethrough Ack; Will be Bypassed) state which
is transitioned to when a dirty line in the TCC is evicted in a
bypassed read. Previously, we were transitioning to invalid.

While a WI (Waiting on Writethrough Ack) state exists, transitions from
it on WBAck deallocates the TBE, which contains SLC bit information
needed to trigger the Bypass event when the read response from the
directory comes in.

Without this change, WB acknowledgements from the directory in read
bypass evicts (with the SLC bit set) were being treated as if they were
read responses, leading to an invalid transition panic.

Change-Id: I703c3fe8af0366856552bb677810cb1a8f2896de
2023-07-17 10:17:47 -07:00
Gabriel Busnot
159953080a mem-ruby: Fix of an address bug in MESI_Two_Level-dir.sm
Physical access address and line address were mixed up in
qw_queueMemoryWBRequest_partial

Change-Id: I0b238ffc59d2bb3de221d96905c75b7616eac964
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67661
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-07-07 10:17:54 +00:00
Gabriel Busnot
20dd444273 mem-ruby: Switch to dequeueMemRspQueue() in all Ruby protocols
Change-Id: I33bca345d985618e3fca62e9ddd5bcc3ad8226a3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67659
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-07-07 10:17:54 +00:00
Matt Sinclair
a030ff2745 mem-ruby: fix atomic deadlock with WB GPU L2 caches
By default the GPU VIPER coherence protocol uses a WT L2 cache.
However it has support for using WB caches (although this is not
tested currently).  When using a WB L2 cache for the GPU, this
results in deadlocks with atomics.

Specifically, when an atomic reaches the L2 and the line is
currently in M or W, the line must be written back before the atomic
can be performed.  However, the current support has two issues:

a) it never performs the atomic operation -- while VIPER current
assumes all atomics are system scope atomics and thus cannot be
performed at the L2 and this transition requires the dirty line be
written back before performing the atomic, the transition never
performs the atomic nor does the response path handle it.
b) putting the atomic action right after the write back is not
safe because we need to ensure the requests are ordered when they
reach memory -- thus we have to wait until the write back is
acknowledged before it's safe to send/perform the atomic.

To fix this, this change modifies the transition in question to
put the atomic on the stalled requests buffer, which the WBAck will
check when it returns to the L2 (and thus perform the atomic, which
will result in the atomic being sent on to the directory).

This fix has been tested and verified with both the per-checkin and
nightly GPU Ruby Random tester tests (with a WB L2 cache).

Change-Id: I9a43fd985dc71297521f4b05c47288d92c314ac7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68978
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-03-22 04:00:38 +00:00
Matt Sinclair
92d920f994 mem-ruby: fix load deadlock with WB GPU L2 caches
By default the GPU VIPER coherence protocol uses a WT L2 cache.
However it has support for using WB caches (although this is not
tested currently).  When using a WB L2 cache for the GPU, this
results in deadlocks with loads.

Specifically, when a load reaches the L2 and the line is currently
in the W state, that line must be written back before the load can
be performed.  However, the current transition for this in the L2
did not attempt to retry the load when the WB completes, resulting
in a deadlock.  This deadlock can be replicated by running the GPU
Ruby random tester as is with a WB L2 cache instead of a WT L2
cache.

To fix this, this change modifies the transition in question to
put the load on the stalled requests buffer, which the WBAck will
check when it returns to the L2 (and thus perform the load).

This fix has been tested and verified with both the per-checkin and
nightly GPU Ruby Random tester tests (with a WB L2 cache).

Change-Id: Ieec4f61a3070cf9976b8c3ef0cdbd0cc5a1443c6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68977
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-03-22 04:00:38 +00:00
Matt Sinclair
4e61a98336 mem-ruby: add GPU cache bypass I->I transition
66d4a158 added support for AMD's GPU cache bypassing flags (GLC
for bypassing L1 caches, SLC for bypassing all caches).  However,
it did not add a transition for the situation where the cache line
is currently I (Invalid).  This commit adds this support, which
resolves an assert failure in Pannotia workloads when this situation
arises.

Change-Id: I59a62ce70c01dd8b73aacb733fb3d1d0dab2624b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67201
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-01-08 20:24:11 +00:00
Matt Sinclair
1d467bed7f mem-ruby: fix TCP spacing/spelling
Change-Id: I3fd9009592c8716a3da19dcdccf68f16af6522ef
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67200
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-01-08 20:24:11 +00:00
Matt Sinclair
24e2ef0b78 mem-ruby, gpu-compute: fix TCP GLC cache bypassing
66d4a158 added support for AMD's GPU cache bypassing flags (GLC
for bypassing L1 caches, SLC for bypassing all caches).  However,
for applications that use the GLC flag but intermix GLC- and
non-GLC accesses to the same address, this previous commit
has a bug.  This bug manifests when the address is currently
valid in the L1 (TCP).  In this case, the previous commit chose
to evict the line before letting the bypassing access to proceed.
However, to do this the previous commit was using the inv_invDone
action as part of the process of evicting it.  This action is only
intended to be called when load acquires are being performed
(i.e., when the entire L1 cache is being flash invalidated).  Thus,
calling inv_invDone for a GLC (or SLC) bypassing request caused an
assert failure since the bypassing request was not performing a
load acquire.

This commit resolves this by changing the support in this case to
simply invalidate the entry in the cache.

Change-Id: Ibaa4976f8714ac93650020af1c0ce2b6732c95a2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67199
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-01-08 20:24:11 +00:00
Vishnu Ramadas
c23d7bb3ee gpu-compute, mem-ruby: Add p_popRequestQueue to some transitions
Two W->WI transitions, on events RdBlk and Atomic in the GPU L2 cache
coherence protocol do not clear  the request from the request queue upon
completing the transition. This action is not performed in the respone
path. This update adds the p_popRequestQueue action to each of these
transitions to remove the stale request from the queue.

Change-Id: Ia2679fe3dd702f4df2bc114f4607ba40c18d6ff1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67192
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-01-05 23:41:00 +00:00
Vishnu Ramadas
ddf43726ef gpu-compute, mem-ruby: Update GPU cache bypassing to use TBE
An earlier commit added support for GLC and SLC AMDGPU instruction
modifiers. These modifiers enable cache bypassing when set. The GLC/SLC
flag information was being threaded through all the way to memory and
back so that appropriate actions could be taken upon receiving a request
and corresponding response. This commit removes the threading and adds
the bypass flag information to TBE. Requests populate this
entry and responses access it to determine the correct set of actions to
execute.

Change-Id: I20ffa6682d109270adb921de078cfd47fb4e137c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67191
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2023-01-05 23:38:32 +00:00
Vishnu Ramadas
66d4a15820 gpu-compute,mem-ruby: Add support for GPU cache bypassing
The GPU cache models do not support cache bypassing when the GLC or SLC
AMDGPU instruction modifiers are used in a load or store. This commit
adds cache bypass support by introducing new transitions in the
coherence protocol used by the GPU memory system. Now, instructions with
the GLC bit set will not cache in the L1 and instructions with SLC bit
set will not cache in L1 or L2.

Change-Id: Id29a47b0fa7e16a21a7718949db802f85e9897c3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66991
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2023-01-03 21:19:24 +00:00
Jarvis Jia
a68e842332 mem-ruby: Fix replacement policy in MESI_Two_Level
The current MESI_Two_Level protocol's L1 caches updates the MRU information twice per request on misses -- once when the request reaches Ruby and once when the miss is returned from another level of the memory hierarchy.

Although this approach does not cause any correctness bugs for replacement policies like LRU since this request is the LRU in both cases, it does not work correctly for other policies like SecondChance and LFU, where updating the information twice (for misses) causes them to devolve to LRU.

Note that this was not directly a problem with Ruby previously, because it only supported LRU-based policies that were unaffected by this.  However, with the integration of 20879 Ruby now uses the same replacement policies as Classic (which has additional, non-LRU based replacement policies).

This patch resolves this problem by not updating the MRU information a second time for the misses. It has been tested and validated with the replacement policy tests.

Change-Id: I9e7e96a9d6c09f3d6b7daae7115ef091ac3bdc08
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64371
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-10-20 01:03:39 +00:00
Tiago Mück
027b508a38 mem-ruby: fix missing transition in CHI-mem
JIRA: https://gem5.atlassian.net/browse/GEM5-1195

Change-Id: I0aae4b9042cb6565c77cc8781b514a9e65ab161b
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63676
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-28 18:56:04 +00:00
Tiago Mück
c6a460eff4 mem-ruby: fix CHI memory controller
Break up the transition to READING_MEM into two separate steps so
contention at the requestToMemory queue won't block the TBE
initialization.

JIRA: https://gem5.atlassian.net/browse/GEM5-1195

Change-Id: Ifa0ee589bde67eb30e7c0b315ff41f22b61e8db7
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63675
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-28 18:56:04 +00:00
Daecheol You
e8ff8817e3 mem-ruby: bug fix for stale WriteBack
Finish_CopyBack_Stale is scheduled only when the requestor is the last
sharer. This prevents the cacahe evicting the line which was already
evicted while the stale WriteBack transaction was stalled.
Wrong condition check in Finish_CopyBack_Stale for eviction is also
removed.

Change-Id: Ib66acc1b9e4a6f7cea373e1fb37375427897d48d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63611
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-19 01:57:23 +00:00
Tiago Muck
f6b2793b91 Revert "mem-ruby: bug fix for Finish_CopyBack_Stale"
This reverts commit f7cf47bc31.

Reason for revert: introduces an issue when handling a stale WriteBack

Change-Id: I4bd370911cb003c0c99e5fd14866b8c98afa80e2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63412
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2022-09-12 14:52:38 +00:00
Daecheol You
f7cf47bc31 mem-ruby: bug fix for Finish_CopyBack_Stale
I made a mistake in the change below:
https://gem5-review.googlesource.com/c/public/gem5/+/58413

Checking the requestor in the sharer list for eviction
should be removed now. If the sharer count is zero, the requestor can't
be in the sharer list.

Change-Id: I304d2dd7df1aff4907801664a260c35c490a2136
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62991
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2022-09-09 20:38:20 +00:00
Jarvis Jia
b86088008a mem-ruby: Fix replacement policy updates with stores in MI_example
The current MI_example protocol's L1 caches updates the MRU information twice per store requests that miss -- once when the request reaches Ruby and once when the store miss is returned from another level of the memory hierarchy.

Although this approach does not cause any correctness bugs for replacement policies like LRU since this request is the LRU in both cases, it does not work correctly for other policies like SecondChance and LFU, where updating the information twice (for misses) causes them to devolve to LRU.

Note that this was not directly a problem with Ruby previously, because it only supported LRU-based policies that were unaffected by this.  However, with the integration of 20879 Ruby now uses the same replacement policies as Classic (which has additional, non-LRU based replacement policies).

This patch resolves this problem by not updating the MRU information a second time for the misses. It has been tested and validated with the replacement policy tests in 20880, and it modifies the store instead of the load in 62232.

Change-Id: I8436e3e537da0ee5841c59a94fa5e5c30105529f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63191
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 15:19:54 +00:00
Bobby R. Bruce
2bc5a8b71a misc: Run pre-commit run on all files in repo
The following command was run:

```
pre-commit run --all-files
```

This ensures all the files in the repository are formatted to pass our
checks.

Change-Id: Ia2fe3529a50ad925d1076a612d60a4280adc40de
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62572
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2022-08-24 21:47:07 +00:00
Jarvis Jia
2816598831 mem-ruby: Fix replacement policy updates in MI_example
The current MI_example protocol's L1 caches updates the MRU information twice per request on misses -- once when the request reaches Ruby and once when the miss is returned from another level of the memory hierarchy.

Although this approach does not cause any correctness bugs for replacement policies like LRU since this request is the LRU in both cases, it does not work correctly for other policies like SecondChance and LFU, where updating the information twice (for misses) causes them to devolve to LRU.

Note that this was not directly a problem with Ruby previously, because it only supported LRU-based policies that were unaffected by this.  However, with the integration of 20879 Ruby now uses the same replacement policies as Classic (which has additional, non-LRU based replacement policies).

This patch resolves this problem by not updating the MRU information a second time for the misses. It has been tested and validated with the replacement policy tests in 20880.

Change-Id: I82a57abf2a16d70820413ba8118378f2e91fd7fb
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62232
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-08-19 03:08:02 +00:00
Richard Cooper
b893344b7d mem-ruby: Add descriptions to the CHI DVM symbols.
This commit adds `desc` descriptions to the new symbols introduced
with CHI DVM support. The generation of the SLICC HTML documentation
requires each symbol to have a description, so a build with
`SLICC_HTML=True` will fail without this change.

Change-Id: I06f3bdd33edd1ff6e4bec35b01a460b9359ed9f6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/60869
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-07-06 17:09:46 +00:00
Matt Sinclair
9c1af09605 mem-ruby, gpu-compute: update TCP,SQC to pass hit/miss
Previously, the GPU SQC and TCP Ruby protocols always told the Sequencer
that the externalHit field was false.  This impacts the statistics and
profiling, because the Sequencer uses this hit/miss information both for
profiling and the coalescer's statistics.

To resolve this, this commit updates the GPU SQC and TCP Ruby protocols
to pass the appropriate hit/miss information into the Sequencer's
readCallback and hitCallback functions.

Change-Id: Ib74af09b66fa8866eee72d3a9ab0e8a8f2196c03
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/60652
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-06-21 22:59:05 +00:00
Matt Sinclair
669eb6a6fa mem-ruby, gpu-compute: add hit/miss profiling to SQC
This commit updates the Ruby SQC (GPU L1 I$) to perform hit and miss
profiling on each request that reaches it.

Change-Id: I736521b89b5d37d950265f32cf1a6d2ee5316dba
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/60651
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-06-21 22:58:42 +00:00
Kyle Roarty
f876e60bc2 mem-ruby: Fix deadlock in GPU VIPER TCC
A deadlock occured where we got a RdBlk while in W,
which put us in WI while we wait for a writeback to complete.

This would cause the request to be stalled while the writeback
was occuring, but when the writeback completed (WBAck), we never
woke up the requests and thus never completed the RdBlk.

This commit adds a wakeup when we receive a WBAck while in WI.

Change-Id: I01edf1d7a47757b4f680baf9f33a1a6aa37e7e25
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59352
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-06-06 18:28:52 +00:00
Daecheol You
9bfffe0f34 mem-ruby: modify the TBE data state for ReadOnce_HitUpstream
When ReadOnce request hits upstream, set dataToBeInvalid to true
for R* states so that the line from the upstream is successfully dropped
at the end by Finalize_UpdateCacheFromTBE.
For UD_RU and UC_RU state, set dataValid to true to prevent it changing
to RU state when it doesn't get the snoop data response.

Change-Id: Ie83c511e8d158e18abc5c9c16bc6040ce73587bf
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58411
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Tiago Muck <tiago.muck@arm.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-06-03 09:31:21 +00:00
Tiago Mück
e4274cabd9 mem-ruby: fix Evict request for CHI excl. caches
Assume core C1 with private L1/L2 and a shared exclusive L3.
C1 has a line in SC state, while the state in the L3 is
RUSC (L3 has exclusive accesses and upstream requester has line in SC).

When C1 evicts the line (Evict request), the L3 has to issue a
WriteEvictFull to the home node, however the L3 doesn't have a copy
of the line.

This fix handling Evict requests when the line state is RUSC. When
the last sharer issues an Evict request, the responder may issue
SnpOnce the obtain a copy the line if needed.

JIRA: https://gem5.atlassian.net/browse/GEM5-1195

Change-Id: Ic8f4e10b38d95cd6d84f8d65b87b0c94fcf52eea
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59991
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2022-06-01 15:23:47 +00:00
Tiago Mück
612f242359 mem-ruby: fix CHI snoops clearing WU data
When just forwarding a WU request, the controller waits until the WU is
acked from downstream before sending the ack upstream. This
prevents snoops clearing valid WU data.

JIRA: https://gem5.atlassian.net/browse/GEM5-1195

This was more likely to happen with shared exclusive caches, e.g:
assume core C1 and C2 with private L1/L2 and a shared exclusive L3.
C1 has as dirty copy of the line while C2 issues a WriteUnique request
to that line. The line state is RU in the L3, so the L3 will just
forward the request to the HNF, so:
- C2 issues WU to L3 cache
- L3 acks the WU, allowing C2 to send the data, while concurrently
  forwarding the WU to the HNF.
- L3 receives data from C2
- HNF sends invalidating snoops upstream because line is RU
- The snoop hazards with the pending WU at the L3 and invalidates
  the data previously received. This causes an assertion to fail when
  we resume handling the WU.

Change-Id: I51e457e0bdb648c0fff3f702b7d2c95dcf431dc5
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59990
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2022-06-01 15:23:47 +00:00
Tiago Mück
1dfd319d98 mem-ruby: fix data state for partial WU
When receiving data from a WriteUniquePtl we were wrongfully clearing
the data valid flag.

JIRA: https://gem5.atlassian.net/browse/GEM5-1195

Change-Id: I5c17433f1cfb706e443a0169a9f0e99ff5c1fcc0
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59989
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2022-06-01 15:23:47 +00:00
Tiago Mück
dc33a16993 mem-ruby: fix functionalRead on pending CU
Normally we don't check the TBE data if there are outstanding response
messages for the transaction because that means the latest valid data is
either in another cache or within an inflight message.
However this is not the case when we have either a pending CleanUnique
or we are handling CleanUnique. So bypass the pending message check in
this case.

Change-Id: I5f31039ca2a01a6a68fee8e0f3cf02c7e437b43e
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57395
Reviewed-by: Daecheol You <daecheol.you@samsung.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-06-01 15:23:47 +00:00
Tiago Mück
23888df8a9 mem-ruby: fix MaintainCoherence typo
Change-Id: Iee3319e1d470898c727747894287029e1b0ab102
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57394
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2022-06-01 15:23:47 +00:00
Tiago Mück
12641069de mem-ruby: reuse existing event on CleanUnique
Reuse the existing MaintainCoherence event to schedule
writebacks or cache fill after a CleanUnique.

JIRA: https://gem5.atlassian.net/browse/GEM5-1195

Change-Id: I127ebf78736b8312ccf2b18cf7c586eb5a77f373
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57393
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-06-01 15:23:47 +00:00
Tiago Mück
d1d6b4cb9e mem-ruby: fix inconsistent WBs for dirty data
Initiate_MaitainCoherence would not trigger a writeback if
tbe.dataMaybeDirtyUpstream is set due to the assumption that
the upstream cache would writeback any dirty data. However this
is not the case if we use this action finalize a CleanUnique, e.g.:

- L1-A has data in SC
- L1-B has data in SD
- L2 has data in RUSD (L2 is an exclusive cache)
- L1-A sends CleanUnique to L2
- L2 invalidates L1-B and receives dirty data.
- L2 acks the CleanUnique; L1-A is now UC
- L2 has the dirty data but drops it because dataMaybeDirtyUpstream
- L1-A doesn't modify the data and eventually evicts it with WriteEvict
- Data from WriteEvicts are dropped at the HNF and we lose the line

This patch removes the tbe.dataMaybeDirtyUpstream check.
Instead it only skips the WriteBack if an upstream cache is in
SD state, when it's guaranteed it will writeback the dirty data.

JIRA: https://gem5.atlassian.net/browse/GEM5-1195

Change-Id: I6722bc25068b0c44afcf261abc8824f1d80c09f9
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57392
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Daecheol You <daecheol.you@samsung.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2022-06-01 15:23:47 +00:00
Tiago Mück
183e8e2b61 mem-ruby: fix state updates on WriteCleanFull
- fix wrong variable check at UpdateDirState_FromReqDataResp
- even after a WriteClean, dataMaybeDirtyUpstream still applies if
  there is an exclusive owner upstream.

JIRA: https://gem5.atlassian.net/browse/GEM5-1195

Change-Id: If1fa3ee40e30226db3a66c34633316e751eb7c4d
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57391
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Daecheol You <daecheol.you@samsung.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2022-06-01 15:23:47 +00:00
Tiago Mück
5faa7aaffd mem-ruby: removed check for WriteCleanFull
Relaxed check on Send_WriteCleanFull. That data state may actually
happen if the writeback was triggered by a CleanUnique request.

JIRA: https://gem5.atlassian.net/browse/GEM5-1195

Change-Id: I33ec5693df09efe39345f403c5b6d3388f1a5056
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57390
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Daecheol You <daecheol.you@samsung.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-06-01 15:23:47 +00:00
Tiago Mück
ff5aafa1e9 mem-ruby: fix CHI wrong response to ReadShared
When an exclusive cache is responding to a ReadShared and the line is
unique, it send the data in unique state without checking if the line
already has other sharers in other upstream caches.

This patch fixes this issue and also cleans up Send_CompData.

JIRA: https://gem5.atlassian.net/browse/GEM5-1195

Change-Id: Ica7c2afafb55750681b39ae7de99a665689ecb8a
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57389
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2022-06-01 15:23:47 +00:00