The temporal compactor was never initialized.
There were more possible indexes to the prec/succ vectors than
entries, so a block distance of zero would seg fault.
When checking for an address the wrong vector was being used.
From the original paper, "The prediction mechanism searches for
the PC of the accessed instruction in the index table"
Change-Id: I3c3aceac3c0adbbe8aef5c634c88cb35ba7487be
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28487
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch fixes the MESI_Three_Level protocols so that it correctly
informers the Ruby sequencer when a line eviction occurs. Furthermore,
the patch allows the protocol to recognize the 'Store_Conditional'
RubyRequestType and shortcuts this operation if the monitored line
has been cleared from the address monitor. This prevents certain
livelock behaviour in which a line could ping-pong between competing
cores.
The patch establishes a new C/C++ preprocessor definition which allows
the Sequencer to send the 'Store_Conditional' RubyRequestType to
MESI_Three_Level instead of 'ST'. This is a temporary measure until
the other protocols explicitely recognize 'Store_Conditional'.
Change-Id: I27ae041ab0e015a4f54f20df666f9c4873c7583d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28328
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
The implementation for load-linked/store-conditional did not work
correctly for multi-core simulations. Since load-links were treated as
stores, it was not possible for a line to have multiple readers which
often resulted in livelock when using these instructions to implemented
mutexes. This improved implementation treats load-linked instructions
similarly to loads but locks the line after a copy has been fetched
locally. Writes to a monitored address ensure the 'linked' property is
blown away and any subsequent store-conditional will fail.
Change-Id: I19bd74459e26732c92c8b594901936e6439fb073
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27103
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
The MESI_Three_Level protocol includes a transition in its L1
definition to invalidate an SM state but this transition does
not notify the L0 cache. The unintended side effect of this
allows stale values to be read by the L0 cache. This can cause
incorrect behaviour when executing LL/SC based mutexes. This
patch ensures that all invalidates to SM states are exposed to
the L0 cache.
Change-Id: I7fefabdaa8027fdfa4c9c362abd7e467493196aa
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28047
Reviewed-by: John Alsop <johnathan.alsop@amd.com>
Reviewed-by: Pouya Fotouhi <pfotouhi@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch fixes the following bugs:
- Previously when deltaPointer was 0 or 1, getting the last or penultimate deltas
would be wrong for non-pow2 deltas.size(). For example, if the last added delta
was to position 0, the previous should be in position 19, if deltas.size() = 20.
However, 0-1=4294967295, and 4294967295%20=15.
- When searching for the previous late and penultimate, the oldest entry was being
skipped.
Change-Id: Id800b60b77531ac4c2920bb90c15cc8cebb137a9
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24538
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Javier Bueno Hedo <javier.bueno@metempsy.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The priority queue comparator orders such that false gives the
entry a higher priority. Therefore, if it is desired to make
the entry with lowest decompression latency have higher priority,
the comparison must be inverted.
Can be tested with:
MultiCompressor(compressors=[
PerfectCompressor(decompression_latency=1),
PerfectCompressor(decompression_latency=2)])
Where it is expected that compressor0 (the one with decomp lat
of 1) is always chosen.
Change-Id: I44acbf5f51c6e47efdd2a16fba9596935cf2eb69
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28367
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Adds a TokenPort which uses tokens for flow control rather than the
standard retry mechanism in gem5. The port is intended to be used
for flow control where speculatively sending packets is not possible.
For example, GPU instructions require this to send memory requests
to the cache coalescer.
Change-Id: Id0d55ab65b7c773e97752b8514a780cdf7d88707
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27428
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch addresses multiple cases:
- When a controller has read/write permissions while others have read
only permissions, the one with r/w permissions performs the read as
the others may have stale data
- When controllers only have lines with stale or busy access permissions,
a valid copy of the line may be in a message in transit in the network
or in a message buffer (not seen by the controller yet). In this case,
we forward the functional request accordingly.
- Sequencer messages should not accept functional reads
- Functional writes also update the packet data on the sequencer
outstanding request lists and the cpu-side response queue.
Change-Id: I6b0656f1a2b81d41bdcf6c783dfa522a77393981
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/22022
Tested-by: Gem5 Cloud Project GCB service account <345032938727@cloudbuild.gserviceaccount.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: John Alsop <johnathan.alsop@amd.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
The components in base/loader were moved into a namespace called
Loader. This will make it easier to add loader components with fairly
short natural names which don't invite name collisions.
gem5 should use namespaces more in general for that reason and to make
it easier to write independent components without having to worry about
name collisions being added in the future.
Unfortunately this namespace has the same name as a class used to load
an object file into a process object. These names can be disambiguated
because the Process loader is inside the Process scope and the Loader
namespace is at global scope, but it's still confusing to read.
Fortunately, this shouldn't last for very long since the responsibility
for loading Processes is going to move to a fake OS object which will
expect to load a particular type of Process, for instance, fake 64 bit
x86 linux will load either 32 or 64 bit x86 processes.
That means that the capability to feed any binary that matches the
current build into gem5 and have gem5 figure out what to do with it
will likely be going away in the future. That's likely for the best,
since it will force users to be more explicit about what they're trying
to do, ie what OS they want to try to load a given binary, and also
will prevent loading two or more Processes which are for different OSes
to the same system, something that's possible today as far as I know
since there are no consistency checks.
Change-Id: Iea0012e98f39f5e20a7c351b78cdff9401f5e326
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24783
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This is specialized per arch, and the Workload class is the only thing
actually using it. It doesn't make any sense to dispatch those calls
over to the System object, especially since that was, in most cases,
the only reason an ISA specific system class even still existed.
After this change, only ARM still has an architecture specific System
class.
Change-Id: I81b6c4db14b612bff8840157cfc56393370095e2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24287
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
A new Prefetcher namespace was added which holds the gem5 prefetchers
and means they don't all need a "Prefetcher" in their name. Unfortunately
that means that there is now both a Prefetcher namespace and a
Prefetcher class which conflict with each other.
This change tries to resolve the conflict with as little disruption as
possible by simply renaming the c++ ruby Pretcher class RubyPrefetcher,
leaving the python name alone so that configs aren't affected.
Issue-on: https://gem5.atlassian.net/browse/GEM5-447
Change-Id: I7afdf5dbc57dbf46d82552113c52f3a9207870f2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27949
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This change includes:
1) Verify available command bandwidth
2) Add support for multi-cycle commands
3) Add new timing parameters
4) Add ability to interleave bursts
5) Add LPDDR5 configurations
The DRAM controller historically does not verify contention on the
command bus and if there is adaquate command bandwidth to issue a
new command. As memory technologies evolve, multiple cycles are becoming
a requirement for some commands. Depending on the burst length, this
can stress the command bandwidth. A check was added to verify command
issue does not exceed a maximum value within a defined window. The
default window is a burst, with the maximum value defined based on the
burst length and media clocking characteristics. When the command bandwidth
is exceeded, commands will be shifted to subsequent burst windows.
Added support for multi-cycle commands, specifically Activate, which
requires a larger address width as capacities grow. Additionally,
added support for multi-cycle Read / Write bursts for low power
DRAM cases in which additional CLK synchronization may be required
to run at higher speeds.
To support emerging memories, added the following new timing parameters.
1) tPPD -- Precharge-to-Precharge delay
2) tAAD -- Max delay between Activate-1 and Activate-2 commands
I/O data rates are continuing to increase for DRAM but the core frequency
is still fairly stagnant for many technologies. As we increase the burst
length, either the core prefetch needs to increase (for a seamless burst)
or the burst will be transferred with gaps on the data bus. To support
the latter case, added the ability to interleave 2 bursts across bank
groups.
Using the changes above, added an initial set of LPDDR5 configurations.
Change-Id: I1b14fed221350e6e403f7cbf089fe6c7f033c181
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26236
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: Gem5 Cloud Project GCB service account <345032938727@cloudbuild.gserviceaccount.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Calls to queueMemoryRead and queueMemoryWrite do not consider the size
of the queue between ruby directories and DRAMCtrl which causes infinite
buffering in the queued port between the two. This adds a MessageBuffer
in between which uses enqueues in SLICC and is therefore size checked
before any SLICC transaction pushing to the buffer can occur, removing
the infinite buffering between the two.
Change-Id: Iedb9070844e4f6c8532a9c914d126105ec98d0bc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27427
Tested-by: Gem5 Cloud Project GCB service account <345032938727@cloudbuild.gserviceaccount.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
This proxy was only used by the ARM semihosting interface which can now
use a tweaked regular TranslatingPortProxy or SETranslatingPortProxy
instead of this special purpose class.
This sort of class would still be necessary if you wanted to use
physical addresses and not virtual addresses, but presently there is no
such use. This code can be retrieved from history if it's needed in the
future.
Change-Id: Ie47a8b4bb173cba1a06bd3ca60391081987936b8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26625
Tested-by: Gem5 Cloud Project GCB service account <345032938727@cloudbuild.gserviceaccount.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Fix MOESI_hammer checkpoint hanging.
The function markRemoved() should be called before hitCallback(),
not after it. The reason is that hitCallback() checks if draining is
complete based on the value of "m_outstanding_count". And since
markRemoved() is responsible for decrementing "m_outstanding_count",
hitCallback() does not see that there are no outstanding requests.
Reported by: Timothy Hayes
Jira: https://gem5.atlassian.net/browse/GEM5-331
Change-Id: I14c34be79843b172ae994ab1792fe4ce6cf5cf6e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25683
Reviewed-by: Timothy Hayes <timothy.hayes@arm.com>
Reviewed-by: John Alsop <johnathan.alsop@amd.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Switch over to the new MemState API by specifying memory regions for
stack in each ISA, changing brkFunc to use MemState for heap memory,
and calling the MemState fixup in fixupStackFault (renamed to just
fixupFault).
Change-Id: Ie3559a68ce476daedf1a3f28b168a8fbc7face5e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25366
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
There are a few problems with this check.
1. Many ISAs support multiple page sizes.
2. Memories (particularly small ROMs) may not actually be in multiples
of the page size.
3. In a heterogenous environment, there won't be a single page size even
if each ISA picks a canonical page size.
4. Other than catching some egregious configuration mistakes, there's
nothing functionally wrong/different about a memory that isn't evenly
coverable in pages, especially in systems or configurations that
don't even use paging.
Change-Id: I3cd241657318d2e3fd5a1226cb54fdebbf172788
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26423
Maintainer: Gabe Black <gabeblack@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: Gem5 Cloud Project GCB service account <345032938727@cloudbuild.gserviceaccount.com>
The only functional difference between them was that the SE one might
have optionally fixed up missing translations for demand paging.
This lets us get rid of some code recreating the proxy ports in
setProcessPtr since the SE translating port no longer keeps a copy of
the process object pointer.
Change-Id: Id97df1874f1de138ffd4f2dbb5846dda79d9e4ac
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26550
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>