Add an accuracy and coverage stat for the prefetchers.
Accuracy is defined as the ratio of the number of prefetch
request that have been counted as useful over the number
of prefetch request issued.
Accuracy tells whether the prefetcher is producing useful
requests or not.
Coverage is defined as the ratio of of the number of prefetch
request that have been counted as useful over the number of
demand misses if there was no prefetch, which is counted as
the number of useful prefetch request plus the remaining
demand misses. Due to the way stats are defined in the cache,
I have to add a stat to count the number of remaining demand
misses directly in the prefetcher stat. Demand is defined
as being one of this request type: ReadReq, WriteReq,
WriteLineReq, ReadExReq, ReadCleanReq, ReadSharedReq.
Coverage tells what part of misses are covered by the prefetcher.
Change-Id: I3bb8838f87b42665fdd782889f6ba56ca2a802fc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47603
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
Count how many time a prefetch is useful, meaning
a hit has happened on a prefetched cache block.
Another stat (pfUsefulButMiss) has been added to count
the special case where there is a hit on prefetched block
but it is counted as a miss because the block is not in
the requested coherency state.
Change-Id: I253216b9ac96d5f21139b710c489d6eb3fce7136
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47602
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
Apply the gem5 namespace to the codebase.
Some anonymous namespaces could theoretically be removed,
but since this change's main goal was to keep conflicts
at a minimum, it was decided not to modify much the
general shape of the files.
A few missing comments of the form "// namespace X" that
occurred before the newly added "} // namespace gem5"
have been added for consistency.
std out should not be included in the gem5 namespace, so
they weren't.
ProtoMessage has not been included in the gem5 namespace,
since I'm not familiar with how proto works.
Regarding the SystemC files, although they belong to gem5,
they actually perform integration between gem5 and SystemC;
therefore, it deserved its own separate namespace.
Files that are automatically generated have been included
in the gem5 namespace.
The .isa files currently are limited to a single namespace.
This limitation should be later removed to make it easier
to accomodate a better API.
Regarding the files in util, gem5:: was prepended where
suitable. Notice that this patch was tested as much as
possible given that most of these were already not
previously compiling.
Change-Id: Ia53d404ec79c46edaa98f654e23bc3b0e179fe2d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
As part of recent decisions regarding namespace
naming conventions, all namespaces will be changed
to snake case.
::Stats became ::statistics.
"statistics" was chosen over "stats" to avoid generating
conflicts with the already existing variables (there are
way too many "stats" in the codebase), which would make
this patch even more disturbing for the users.
Change-Id: If877b12d7dac356f86e3b3d941bf7558a4fd8719
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45421
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
As part of recent decisions regarding namespace
naming conventions, all namespaces will be changed
to snake case.
::Prefetcher became ::prefetcher.
"prefetch" was chosen over "prefetcher" to avoid generating
conflicts with the already existing variables. "prefetcher"
is a name that is expected to be more common in user's code
than "prefetch".
Change-Id: I8f07217f278a0229e05545b7847f2620ed208c66
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45410
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
The systemc dir was not included in this fix.
First it was identified that there were only occurrences
at 0, 1, and 2 levels of indentation, using:
grep -nrE --exclude-dir=systemc \
"^ *class [A-Za-z].* {$" src/
Then the following commands were run to replace:
<indent level>class X ... {
by:
<indent level>class X ...
<indent level>{
Level 0:
grep -nrl --exclude-dir=systemc
"^class [A-Za-z].* {$" src/ | \
xargs sed -Ei \
's/^class ([A-Za-z].*) \{$/class \1\n\{/g'
Level 1:
grep -nrl --exclude-dir=systemc \
"^ class [A-Za-z].* {$" src/ | \
xargs sed -Ei \
's/^ class ([A-Za-z].*) \{$/ class \1\n \{/g'
and so on.
Change-Id: I17615ce16a333d69867b27c7bae0f4fdafd8b2eb
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39015
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The create() method on Params structs usually instantiate SimObjects
using a constructor which takes the Params struct as a parameter
somehow. There has been a lot of needless variation in how that was
done, making it annoying to pass Params down to base classes. Some of
the different forms were:
const Params &
Params &
Params *
const Params *
Params const*
This change goes through and fixes up every constructor and every
create() method to use the const Params & form. We use a reference
because the Params struct should never be null. We use const because
neither the create method nor the consuming object should modify the
record of the parameters as they came in from the config. That would
make consuming them not idempotent, and make it impossible to tell what
the actual simulation configuration was since it would change from any
user visible form (config script, config.ini, dot pdf output).
Change-Id: I77453cba52fdcfd5f4eec92dfb0bddb5a9945f31
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35938
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
isa_traits.hh used to have much more in it, but now it only has
PageShift, PageBytes, and (for now) the guest endianness. These values
should only be retrieved from the System class generally speaking, so
only the system class should include arch/isa_traits.hh.
Some gpu compute related files need PageBytes or PageShift. Even though
those files don't advertise their ISA dependence, they are tied to x86.
In those files, they can include arch/x86/isa_traits.hh.
The only other file which legitimately needs arch/isa_traits.hh is the
decoder cache since it uses PageBytes to size an array.
Change-Id: I12686368715623e3140a68a7027c136bd52567b1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33203
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch adds a meta-prefetcher that enables gem5's cache models to
connect to multiple prefetchers. Sub-prefetchers still use the
probes-based interface and training can be controlled
independently. However, when the cache requests a prefetch packet, the
adaptor traverses the priority list of prefetchers and uses the first
prefetcher that is able to generate a prefetch.
Kudos to Mitch Hayenga for the original version of this patch.
Change-Id: I25569a834997e5404c7183ec995d212912c5dcdf
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18868
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Added additional information to the PrefetchInfo data structure
- Whether the event is triggered by a cache miss
- Whether the event is a write or a read
- Size of the data accessed
- Data accessed by the request
Change-Id: I070f3ffe837ea960a357388e7f2b8a61d7b2196c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16583
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reference:
Multi-level hardware prefetching using low complexity delta correlating
prediction tables with partial matching.
Marius Grannaes, Magnus Jahre, and Lasse Natvig. 2010.
In Proceedings of the 5th international conference on High Performance
Embedded Architectures and Compilers (HiPEAC'10)
Change-Id: I7b5d7ede9284862a427cfd5693a47652a69ed49d
Reviewed-on: https://gem5-review.googlesource.com/c/16062
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
This implementation is based in the description available in:
Jinchun Kim, Seth H. Pugsley, Paul V. Gratz, A. L. Narasimha Reddy,
Chris Wilkerson, and Zeshan Chishti. 2016.
Path confidence based lookahead prefetching.
In The 49th Annual IEEE/ACM International Symposium on Microarchitecture
(MICRO-49). IEEE Press, Piscataway, NJ, USA, Article 60, 12 pages.
Change-Id: I4b8b54efef48ced7044bd535de9a69bca68d47d9
Reviewed-on: https://gem5-review.googlesource.com/c/14819
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Prefetchers can be configured to operate with virtual or physical addreses.
The option can be configured through the "use_virtual_addresses" parameter
of the Prefetcher object.
Change-Id: I4f8c3687988afecc8a91c3c5b2d44cc0580f72aa
Reviewed-on: https://gem5-review.googlesource.com/c/14416
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Some common functionality added to the base prefetcher, mainly dealing with
extracting the block address, page address, block index inside the page and
some other information that can be inferred from the block address. This is
used for some prefetching algorithms, and having the methods in the base,
as well as the block size and other information is the sensible way.
Re-organizes the prefetcher class structure. Previously the
BasePrefetcher forced multiple assumptions on the prefetchers that
inherited from it. This patch makes the BasePrefetcher class truly
representative of base functionality. For example, the base class no
longer enforces FIFO order. Instead, prefetchers with FIFO requests
(like the existing stride and tagged prefetchers) now inherit from a
new QueuedPrefetcher base class.
Finally, the stride-based prefetcher now assumes a custimizable lookup table
(sets/ways) rather than the previous fully associative structure.
This patch takes a step towards an ISA-agnostic memory
system by enabling the components to establish the page size after
instantiation. The swap operation in the memory is now also allowing
any granularity to avoid depending on the IntReg of the ISA.
Static analysis unearther a bunch of uninitialised variables and
members, and this patch addresses the problem. In all cases these
omissions seem benign in the end, but at least fixing them means less
false positives next time round.
This patch extends the classic prefetcher to work on non-block aligned
addresses. Because the existing prefetchers in gem5 mask off the lower
address bits of cache accesses, many predictable strides fail to be
detected. For example, if a load were to stride by 48 bytes, with 64 byte
cachelines, the current stride based prefetcher would see an access pattern
of 0, 64, 64, 128, 192.... Thus not detecting a constant stride pattern. This
patch fixes this, by training the prefetcher on access and not masking off the
lower address bits.
It also adds the following configuration options:
1) Training/prefetching only on cache misses,
2) Training/prefetching only on data acceses,
3) Optionally tagging prefetches with a PC address.
#3 allows prefetchers to train off of prefetch requests in systems with
multiple cache levels and PC-based prefetchers present at multiple levels.
It also effectively allows a pipelining of prefetch requests (like in POWER4)
across multiple levels of cache hierarchy.
Improves performance on my gem5 configuration by 4.3% for SPECINT and 4.7% for SPECFP (geomean).
This patch removes the time field from the packet as it was only used
by the preftecher. Similar to the packet queue, the prefetcher now
wraps the packet in a deferred packet, which also has a tick
representing the absolute time when the packet should be sent.
This patch changes the cache-related latencies from an absolute time
expressed in Ticks, to a number of cycles that can be scaled with the
clock period of the caches. Ultimately this patch serves to enable
future work that involves dynamic frequency scaling. As an immediate
benefit it also makes it more convenient to specify cache performance
without implicitly assuming a specific CPU core operating frequency.
The stat blocked_cycles that actually counter in ticks is now updated
to count in cycles.
As the timing is now rounded to the clock edges of the cache, there
are some regressions that change. Plenty of them have very minor
changes, whereas some regressions with a short run-time are perturbed
quite significantly. A follow-on patch updates all the statistics for
the regressions.
This change adds a master id to each request object which can be
used identify every device in the system that is capable of issuing a request.
This is part of the way to removing the numCpus+1 stats in the cache and
replacing them with the master ids. This is one of a series of changes
that make way for the stats output to be changed to python.
This prevents redundant prefetches from being issued, solving the
occasional 'needsExclusive && !blk->isWritable()' assertion failure
in cache_impl.hh that several people have run into.
Eliminates "prefetch_cache_check_push" flag, neither setting of
which really solved the problem.
Apparently we broke it with the cache rewrite and never noticed.
Thanks to Bao Yungang <baoyungang@gmail.com> for a significant part
of these changes (and for inspiring me to work on the rest).
Some other overdue cleanup on the prefetch code too.