The compressors are not able to process a whole line at once,
so they must divide it into multiple same-sized chunks. This
patch makes the base compressor responsible for this division,
so that the derived classes are mostly agnostic to this
translation.
This change has been coupled with a change of the signature
of the public compress() to avoid introducing a temporary
function rename. Previously, this function did not return
the compressed data, under the assumption that everything
related to the compressed data would be handled by the
compressor. However, sometimes the units using the compressor
could need to know or store the compressed data.
For example, when sharing dictionaries the compressed data
must be checked to determine if two blocks can co-allocate
(DISH, Panda et al. 2016).
Change-Id: Id8dbf68936b1457ca8292cc0a852b0f0a2eeeb51
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33379
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When the write buffer is full, it still has space to store an additional
number of entries (reserve) equal to the number of MSHRs so that if any
of them requires a writeback this can be handled. Even if the slave port
is blocked, a prefetcher can generate new MSHR entries that may lead to
additional writebacks and eventually saturate the reserve space. This is
solved by checking if the cache is blocked for accesses before
prefetching data.
Change-Id: Iaad04dd6786a09eab7afae4a53d1b1299c341f33
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29615
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When a writeback needs to be allocated the whenReady field of the
block is not set, and therefore its access latency calculation
uses the previously invalidated value (MaxTick), significantly
delaying execution.
This is fixed by assuming that the data write portion of a write
access is done regardless of previous writes, and that only the
tag latency is important for the critical path latency calculation.
Change-Id: I739132a2deab6eb4c46d084f4ee6dd65177873fd
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20068
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
This addresses the issue described in
64687ee mem-cache: Mark block as dirty after a SWPrefetchEXResp.
Previous patch misses cases when the prefetch response is ReadExResp or
UpgradeResp. Also, marking the block as dirty in serviceMSHRTargets
instead of in handleFill covers cases when the prefetch is coalesced with
other requests.
Change-Id: I2b377fdd240eb0f09e720b6bb284dee6545925ce
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19688
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This reverts commit bf0a722acd.
Reason for revert: This patch introduces a bug:
The problem here is that the insertion of block A may cause the
eviction of block B, which on the lower level may cause the
eviction of block A. Since A is not marked as present yet, A is
"safely" removed from the snoop filter
However, by reverting it, using atomic and a Tags sub-class that
can generate multiple evictions at once becomes broken when using
Atomic mode and shall be fixed in a future patch.
Change-Id: I5b27e54b54ae5b50255588835c1a2ebf3015f002
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19088
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Previously all atomic writebacks concerned a single block,
therefore, when a block was evicted, no other block would be
pending eviction. With sector tags (and compression),
however, a single replacement can generate many evictions.
This can cause problems, since a writeback that evicts a block
may evict blocks in the lower cache. If one of these conflict
with one of the blocks pending eviction in the higher level, the
snoop must inform it to the lower level. Since atomic mode does
not have a writebuffer, this kind of conflict wouldn't be noticed.
Therefore, instead of evicting multiple blocks at once, we
do it one by one.
Change-Id: I2fc2f9eb0f26248ddf91adbe987d158f5a2e592b
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18209
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
When a block in compressed form is overwriten, it may change
its size. If the new compressed size is bigger, and the total
size becomes bigger than the block size, one or more blocks
will have to be evicted. This is called data expansion, or
fat writes.
This change assumes that a first level cache cannot have a
compressor, since otherwise data expansion should have been
handled for atomic operations and writes. As such, data
expansions should only be seen on writebacks. As writebacks
are forwarded to the next level when failed, there should
be no data expansions when servicing misses either.
This patch adds the functionality to handle data expansions
by evicting the co-allocated blocks to make room for an
expanded block.
Change-Id: I0bd77bf6446bfae336889940b2f75d6f0c87e533
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/12087
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This is a workaround for a bug introduced from the change:
59e3585a8 arch-arm: We add PRFM PST instruction for arm
which can cause deadlocks in the memory system.
The design of the classic memory system in gem5 makes the folloing two
assumptions:
* A cache that fetches a block with an intention to modify it, becomes
the point of ordering and therefore commits to respond to any snoop
requests [1].
* A cache that fetches an exclusive copy of the block, does so with
the intention to modify it [2]. Immediately after it receives the
block, it will write to it and mark it as dirty. As the point of
ordering, it responds to any outstanding snoops.
The current implementation of prefetch exclusive request breaks the
second assumption. A cache can fetch an exclusive block without an
immediate intention to modify it. If the block is not modified, it
will not be marked as dirty. However, the cache has committed to
respond to outstanding snoops, and if the block is clean it
won't. This can result in deadlocks where a snoop gets stuck waiting
for responses.
One solution (implemented by this patch) is to unconditionally mark
the block dirty when filling due to a prefetch exclusive request.
This makes the PrefetchExReq behave like a WriteReq. However, as it
may mark as dirty a clean block, it creates the requirement for an
uncessary WritebackDirty in the future. In practice, this shouldn't be
a big problem unless the application is unnecessarily using prefetch
exclusive instructions.
Other solutions, would require deeper changes to the design of the
memory system to handle this properly.
[1]: When a cache commits to respond, it "informs" the xbar/PoC (point
of coherence) and the other caches of its intention to respond. As a
result the request will not be send to the main memory.
[2]: In fact the assumption is that in the needsWritable MSHR there is
at least one WriteReq before any snoops from other caches.
Change-Id: I378d3c0dadf25fc52e430b67102347b44d2f18ea
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17729
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
MemObject doesn't provide anything beyond its base ClockedObject any
more, so this change removes it from most inheritance hierarchies.
Occasionally MemObject is replaced with SimObject when I was fairly
confident that the extra functionality of ClockedObject wasn't needed.
Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Commit 7976b561de tried fixing
replacement update when a single location can be associated to
multiple blocks.
Although the comment of the correct action was added, the proper
validation check was forgotten. This change adds that check and
moves doing the eviction to when there is a valid block.
Change-Id: I31d8bb914ccfd1849e9d97464d70a58a62f59533
Signed-off-by: Daniel <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18210
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Replacements should be increased when there is any evicted
block, which does not necessarily have to be the victim.
For example, assume a superblock contains 4 blocks, and both
A and C are stored compressed (belonging to SB_1). Then F,
from SB2 needs to make room by replacing SB1. If F map to
location 2, the number of replacements should be increased,
even though 2 had no valid blocks:
Tag Data Tag Data
|SB_1|--|A|X|C|X| --> |SB_2| |X|F|X|X|
1 2 3 4 1 2 3 4
Change-Id: I7b3735d28a35faa8d8fa613a1555bb258da65859
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18208
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Having the caller decide the matching logic is error-prone, and
frequently ends up with the secure bit being forgotten. This
change adds matching functions to the QueueEntry to avoid this
problem.
As a side effect the signature of findPending has been changed.
Change-Id: I6e494a821c1e6e841ab103ec69632c0e1b269a08
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17530
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Some accesses only need to search for a tag in the tag array, with
no need to touch the data array. This is the case for CleanEvicts,
evicts that don't find a corresponding block entry (since a write
cannot be done in parallel with tag lookup), and maintenance
operations.
Change-Id: I7365a915500b5d7ab636d49a9acc627072a7f58e
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14878
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
When dealing with writebacks, as soon as the packet metadata arrives
there will be a tag lookup, done sequentially because a write can't
be done in parallel. While the tag lookup is being done, the payload
will arrive. When both the payload are present and the tag is correct
block entry is determined the fill happens.
Change-Id: If1a0085d742458b675bfc012b6d908d9d9a25e32
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14877
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Previously the bus delay was being ignored for the access latency
calculation, and then applied on top of the access latency. This
patch fixes the order, as first the packet must arrive before the
access starts.
Change-Id: I6d55299a911d54625c147814dd423bfc63ef1b65
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14876
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
The header and payload delay have already been accounted and
zeroed previous to calling this function. The probe is not
allowed to modify the packet, therefore no extra delays are
added, and it is safe to remove the todo note.
Change-Id: I8ddf7e189fbe609cdec34364f3c013427930daf7
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14875
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
A packet queue is typically used to hold on to packets that are
schedules to be sent in the future or when they need to queue behind
younger packets that have been sent out yet. Due to memory order
requirements, some MemObjects need to maintain the order for packet
(mostly responses) that reference the same cache block.
Prior to this patch the ordering requirements where determined when
the packet was scheduled to be sent. This patch moves the parameter to
the constructor.
Change-Id: Ieb4d94e86bc7514f5036b313ec23ea47dd653164
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15555
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Add a getter and a setter function to access CacheBlk::whenReady
to encapsulate the variable and allow error checking. This error
checking consists on verifying that writes to a block after it
has been inserted follow a chronological order.
As a side effect, tickInserted retain its value until updated,
that is, it is not reset in invalidate().
Change-Id: Idc3c5a99c3f002ee9acc2424f00e554877fd3a69
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/14715
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
In order to allow polymorphism of the block these two
functions have been added, and all direct status
assignments to these bits have been substituted.
We also assert that the block has been invalidated
before insertion. Then the block is validated in
the insertion.
Change-Id: Ie7be42408721ad4c2c9dc880f82a62cb594f8668
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/14362
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Access latency was not being calculated properly, as it was
always assuming that for hits reads take as long as writes,
and that parallel accesses would produce the same latency
for read and write misses.
By moving the calculation to the Cache we can use the write/
read information, reduce latency variables duplication and
remove Cache dependency from Tags.
The tag lookup latency is still calculated by the Tags.
Change-Id: I71bc68fb5c3515b372c3bf002d61b6f048a45540
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/13697
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Requests, for which a cache has already committed to respond do not
perform any lookups. Previously in atomic mode the packet would pay
the lookup latency while in timing it wouldn't. This patch aligns
recvAtomic with recvTimingReq and removes the lookup latency from the
the handling of such requests.
Change-Id: I50a0631f8058e5086d94d55af0e1788a60e2883f
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/14175
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Since the tag classes are subclasses of SimObject, they inherit an
init function which does generic initialization at simulation startup
and which doesn't take any parameters. A new function was added which
does take a parameter, and which is just for doing tag specific
initialization as triggered by the base cache. These two names clashed,
and clang complained that the tag local name was hiding the SimObject
name (which it was).
Change-Id: I399775aceaf8f1a8e2646d434facef22e6d3e7d0
Reviewed-on: https://gem5-review.googlesource.com/c/13875
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Classes were using memcpy instead of the Packet functions
created for writing to/from the packet. This allows these
writes to be better checked and tracked.
This also fixes a bug in MemCheckerMonitor, which was using
the incorrect type for the packet pointer.
Change-Id: I5bbc8a24e59464e8219bb6d54af8209e6d4ee1af
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/13695
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Block was being invalidated twice when not a tempBlock.
Make explicit that the else case is only to be applied
when handling the tempBlock, as otherwise the Tags
should be taking care of the invalidation.
Change-Id: Ie7603fdbe156c54e94bbdc83541b55e66f8d250f
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/13895
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Enable the cache to detect contiguous writes and hold on to the MSHR
long enough to allow the entire line to be written. If the whole line
is written, the MSHR will be sent out as an invalidation requests, as
it is part of a whole-line write, i.e. no-fetch-on-write.
The cache is also able to switch to a write-no-allocate policy on the
actual completion of the writes, and instead use the tempBlock and
turn the write operation into a writeback.
These policies are all well-known, and described in works such as
Jouppi, Cache Write Policies and Performance, vol 21, no 2, ACM, 1993.
Change-Id: I19792f2970b3c6798c9b2b493acdd156897284ae
Reviewed-on: https://gem5-review.googlesource.com/c/12907
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
This patch changes how we deal with whole-line writes their
responses. With these changes, we use the MSHR tracking to determine
if a whole-line is written, and on a fill we simply handle the
invalidation response, with the actual writes taking place as part of
satisfying the CPU-side hit.
Change-Id: I9a18e41a95db3c20b97f8bca7d95ff33d35a578b
Reviewed-on: https://gem5-review.googlesource.com/c/12905
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>