Allow all sub-compressors of BDI to be successful as long as
they are able to compress. Then, BDI's actual size threshold
acts as the cutting point.
This situation arises on any multi compressor; yet, generalizing
this assumption might be too bold.
Change-Id: Iec5057d16d4a7ba5fb573133a30ea10869bd67e0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33386
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The size can be zero in special occasions, which would
generate divisions by zero. This patch expands the
stats to support them. It also fixes the compression
factor calculation in the Multi compressor.
As a side effect, now that zero sizes are handled, allow
the Zero compressor to generate it.
Change-Id: I9f7dee76576b09fdc9bef3e1f3f89be3726dcbd9
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33383
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When compressing using a multi-compressor, one must be able to
identify which sub-compressor should be used to decompress data.
This can be achieved by either adding encoding bits to block's
tag or data entry.
It was previously assumed that these encoding bits would be added
to the tag, but now make it a parameter that defaults to the data
entry.
Change-Id: Id322425e7a6ad59cb2ec7a4167a43de4c55c482c
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33380
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The compressors are not able to process a whole line at once,
so they must divide it into multiple same-sized chunks. This
patch makes the base compressor responsible for this division,
so that the derived classes are mostly agnostic to this
translation.
This change has been coupled with a change of the signature
of the public compress() to avoid introducing a temporary
function rename. Previously, this function did not return
the compressed data, under the assumption that everything
related to the compressed data would be handled by the
compressor. However, sometimes the units using the compressor
could need to know or store the compressed data.
For example, when sharing dictionaries the compressed data
must be checked to determine if two blocks can co-allocate
(DISH, Panda et al. 2016).
Change-Id: Id8dbf68936b1457ca8292cc0a852b0f0a2eeeb51
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33379
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Previously either the compression data was the one declared within
DictionaryCompressor, or the derived class would have to override
the compress() to use a derived compression data.
With this change, the instantiation can be overridden, and thus
any derived class can choose the compression data pointer type
they need to use.
Change-Id: I387936265a3de6785a6096c7a6bd21774202b1c7
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33378
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When applying the bitwise not to a short integer the compiler
automatically promotes it to an integer. For example, if a 8-bit
mask=0xFF, and the compiler decides to promote the mask to 32-bit
to apply the bitwise not, ~mask=0xFFFFFF00, which will yield wrong
results for popcount(): expected=0, got=24.
Change-Id: I95efba5532c27ca004ff6947d4b51a8a14f09741
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33374
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
isa_traits.hh used to have much more in it, but now it only has
PageShift, PageBytes, and (for now) the guest endianness. These values
should only be retrieved from the System class generally speaking, so
only the system class should include arch/isa_traits.hh.
Some gpu compute related files need PageBytes or PageShift. Even though
those files don't advertise their ISA dependence, they are tied to x86.
In those files, they can include arch/x86/isa_traits.hh.
The only other file which legitimately needs arch/isa_traits.hh is the
decoder cache since it uses PageBytes to size an array.
Change-Id: I12686368715623e3140a68a7027c136bd52567b1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33203
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Creation of the Compressor namespace. It encapsulates all the cache
compressors, and other classes used by them.
The following classes have been renamed:
BaseCacheCompressor -> Base
PerfectCompressor - Perfect
RepeatedQwordsCompressor -> RepeatedQwords
ZeroCompressor -> Zero
BaseDictionaryCompressor and DictionaryCompressor were not renamed
because the there is a high probability that users may want to
create a Dictionary class that encompasses the dictionary contained
by these compressors.
To apply this patch one must force recompilation (e.g., by deleting
it) of build/<arch>/params/BaseCache.hh (and any other files that
were previously using these compressors).
Change-Id: I78cb3b6fb8e3e50a52a04268e0e08dd664d81230
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33294
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The `BasePrefetcher` python class had members `_events` and `_tlbs`
defined as lists, meaning that any call to `list.append` on them would
affect `_events` and `_tlbs` for all prefetchers, not just the calling
object. This change redefines them as instance members to fix the
problem.
Change-Id: I68feb1d6d78e2fa5e8775afba8c81c6dd0de6c60
Signed-off-by: Isaac Sánchez Barrera <isaac.sanchez@bsc.es>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32394
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
The System class has a few different arrays of values which each
correspond to a thread of execution based on their position. This
change collects them together into a single class to make managing them
easier and less error prone. It also collects methods for manipulating
those threads as an API for that class.
This class acts as a collection point for thread based state which the
System class can look into to get at all its state. It also acts as an
interface for interacting with threads for other classes. This forces
external consumers to use the API instead of accessing the individual
arrays which improves consistency.
Change-Id: Idc4575c5a0b56fe75f5c497809ad91c22bfe26cc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25144
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
When the write buffer is full, it still has space to store an additional
number of entries (reserve) equal to the number of MSHRs so that if any
of them requires a writeback this can be handled. Even if the slave port
is blocked, a prefetcher can generate new MSHR entries that may lead to
additional writebacks and eventually saturate the reserve space. This is
solved by checking if the cache is blocked for accesses before
prefetching data.
Change-Id: Iaad04dd6786a09eab7afae4a53d1b1299c341f33
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29615
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The temporal compactor was never initialized.
There were more possible indexes to the prec/succ vectors than
entries, so a block distance of zero would seg fault.
When checking for an address the wrong vector was being used.
From the original paper, "The prediction mechanism searches for
the PC of the accessed instruction in the index table"
Change-Id: I3c3aceac3c0adbbe8aef5c634c88cb35ba7487be
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28487
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch fixes the following bugs:
- Previously when deltaPointer was 0 or 1, getting the last or penultimate deltas
would be wrong for non-pow2 deltas.size(). For example, if the last added delta
was to position 0, the previous should be in position 19, if deltas.size() = 20.
However, 0-1=4294967295, and 4294967295%20=15.
- When searching for the previous late and penultimate, the oldest entry was being
skipped.
Change-Id: Id800b60b77531ac4c2920bb90c15cc8cebb137a9
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24538
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Javier Bueno Hedo <javier.bueno@metempsy.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The priority queue comparator orders such that false gives the
entry a higher priority. Therefore, if it is desired to make
the entry with lowest decompression latency have higher priority,
the comparison must be inverted.
Can be tested with:
MultiCompressor(compressors=[
PerfectCompressor(decompression_latency=1),
PerfectCompressor(decompression_latency=2)])
Where it is expected that compressor0 (the one with decomp lat
of 1) is always chosen.
Change-Id: I44acbf5f51c6e47efdd2a16fba9596935cf2eb69
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28367
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The new local access mechanism installs a callback in the request which
implements what the mmapped IPR was doing. That avoids having to have
stubs in ISAs that don't have mmapped IPRs, avoids having to encode
what to do to communicate from the TLB and the mmapped IPR functions,
and gets rid of another global ISA interface function and header files.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-187
Change-Id: I772c2ae2ca3830a4486919ce9804560c0f2d596a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23188
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add an invalidation function to the AssociativeSet, so that entries
can be properly invalidated by also invalidating their replacement
data.
Both setInvalid and reset have been merged into invalidate to
indicate users that they are using an incorrect approach by
generating compilation errors, and to match CacheBlk's naming
convention.
Change-Id: I568076a3b5adda8b1311d9498b086c0dab457a14
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24529
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When the MSHR is handling a request that will make the block dirty the
current cache commits respond. When that's not the case the cache
should forward any snoops. This CL fixes MSHR::handleSnoop() to
implement this behavior.
Change-Id: I207e3ca4968fd9528fd4cdbfb3eb95f470b4744d
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23668
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Members `tc` and `ongoingTranslation` were uninitialized in the constructor for
`QueuedPrefetcher::DeferredPacket`. If `ongoingTranslation` is not initialized to
`false` by default, some translation requests from queued prefetchers are not
properly handled and executions are nondeterministic.
Change-Id: Ia278f9e74847d6b847984d47f6a45643bae57794
Signed-off-by: Isaac Sánchez Barrera <isaac.sanchez@bsc.es>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/22844
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>