The XBar uses the concept of Layers to model throughput and
instantiates one Layer per master. As it forwards a packet to and from
master, the corresponding Layer is marked as occupied for a number of
cycles. Requests/responses to/from a master are blocked while the
corresponding Layer is occupied. Previously the delay would be
calculated based on the formula 1 + size / width, which assumes that
the Layer is always occupied for 1 cycle while processing the packet
header. This changes makes the header latency a parameter which
defaults to 1.
Change-Id: I12752ab4415617a94fbd8379bcd2ae8982f91fd8
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30054
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
The System class has a few different arrays of values which each
correspond to a thread of execution based on their position. This
change collects them together into a single class to make managing them
easier and less error prone. It also collects methods for manipulating
those threads as an API for that class.
This class acts as a collection point for thread based state which the
System class can look into to get at all its state. It also acts as an
interface for interacting with threads for other classes. This forces
external consumers to use the API instead of accessing the individual
arrays which improves consistency.
Change-Id: Idc4575c5a0b56fe75f5c497809ad91c22bfe26cc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25144
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Instead of calling into object files after the fact and asking them to
put symbols into a target symbol table, this change makes object files
fill in a symbol table themselves at construction. Then, that table can
be retrieved and used to fill in aggregate tables, masked, moved,
and/or filtered to have only one type of symbol binding.
This simplifies the symbol management API of the object file types
significantly, and makes it easier to deal with symbol tables alongside
binaries in the FS workload classes.
Change-Id: Ic9006ca432033d72589867c93d9c5f8a1d87f73c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24787
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When the write buffer is full, it still has space to store an additional
number of entries (reserve) equal to the number of MSHRs so that if any
of them requires a writeback this can be handled. Even if the slave port
is blocked, a prefetcher can generate new MSHR entries that may lead to
additional writebacks and eventually saturate the reserve space. This is
solved by checking if the cache is blocked for accesses before
prefetching data.
Change-Id: Iaad04dd6786a09eab7afae4a53d1b1299c341f33
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29615
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
makeLineAddress function uses m_block_size_bits to create
masked addresses. m_block_size_bits is used to specify
cache, directory, and memory controller interleaving,
and it can be larger than the cache line size.
To generate addresses that can align with the cache line
rather than the interleaving granularity, a version of
makeLineAddress is created to specify bits that need to
be masked.
Change-Id: I06deec4949da7fa46f1d6f7575334f18ee61c786
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28135
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Onur Kayıran <onur.kayiran@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
This patch enables cache controllers to make response
messages in advance, store them in a per-address saved
map in an output message buffer and enqueue them altogether
in the future. This patch introduces new slicc statement
called defer_enqueueing. This patch would help simplify
the logic of state machines that deal with coalesing
multiple requests from different requestors.
Change-Id: I566d4004498b367764238bb251260483c5a1a5e5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28132
Reviewed-by: Tuan Ta <qtt2@cornell.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Without this fix `error: call to function 'operator<<' that is neither
visible in the template definition nor found by argument-dependent
loopup` is thrown when compiling HSAIL_X86 using a clang compiler (at
`base/cprintf_formats.hhi:139`).
This error is due to a "<<" operator in a template declared prior to its
definition in the code. The operator is used in
`base/cprintf_formats.hh`, included in `base/cprintf.hh`, and defined in
`mem/ruby/common/BoolVec.hh`. Therefore, for clang to compile without
error, `mem/ruby/common/BoolVec.hh` must be included before
`base/cprintf.hh` when generating the
`mem/ruby/protocol/RegionBuffer_Controller.cc` in
`mem/slicc/symbols/StateMachine.py`.
Due to the gem5 style-checker, an overly-verbose solution was required
to permit this patch to be committed to the codebase.
Change-Id: Ie0ae4053e4adc8c4e918e4a714035637925ca104
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29532
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Add support for multiple networks per RubySystem. This is done by
introducing local IDs to each network and translating from a global ID
passed around through Ruby and SLICC code. The local IDs represents the
NodeID of a MachineType in the network and are ordered the same way
that NodeIDs are ordered using MachineType_base_number. If there are
not multiple networks in a RubySystem the local and global IDs are the
same value.
This is useful in cases where multiple isolated networks are needed to
support devices with Ruby caches which do not interact with other
networks. For example, a dGPU device will have a cache hierarchy that
will not interact with the CPU cache hierachy.
Change-Id: I33a917b3a394eec84b16fbf001c3c2c44c047f66
JIRA: https://gem5.atlassian.net/browse/GEM5-445
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27927
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This singleton object is used thruoughout the simulator. There is
really no reason not to have it statically allocated, except that
whether it was allocated seems to sometimes be used as a signal that
something already put symbols in it, specifically in SE mode.
To keep that functionality for the moment, this change adds an "empty"
method to the SymbolTable class to make it easy to check if the symbol
table is empty, or if someone already populated it.
Change-Id: Ia93510082d3f9809fc504bc5803254d8c308d572
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24785
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
AbstractController sends requests using a QueuedMasterPort which has an
implicit buffer which is unbounded. Remove this by changing the port to
a MasterPort and implement a retry mechanism for AbstractController.
Although the request remains in the MessageBuffer if a retry is needed,
the additional retry logic optimizes serviceMemoryQueue slightly and
prevents the DRAMCtrl retry stats from being incorrect due to multiple
calls to sendTimingReq.
Change-Id: I8c592af92a1a499a418f34cfee16dd69d84803ad
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28387
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Remove the read/write tables and coalescing table and introduce a two
levels of tables for uncoalesced and coalesced packets. Tokens are
granted to GPU instructions to place in uncoalesced table. If tokens
are available, the operation always succeeds such that the 'Aliased'
status is never returned. Coalesced accesses are placed in the
coalesced table while requests are outstanding. Requests to the same
address are added as targets to the table similar to how MSHRs
operate.
Change-Id: I44983610307b638a97472db3576d0a30df2de600
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27429
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The recent commit dd6cd33 modified the behaviour of the the Ruby
sequencer to handle load linked requests as loads rather than
stores. This caused the regression test
realview-simple-timing-dual-ruby-ARM-x86_64-opt
to become stuck when booting Linux. This patch fixes the issue by
adding a missing forward_eviction_to_cpu action to the state
transition(OM, Fwd_GETX, IM).
Change-Id: I8f253c5709488b07ddc5143a15eda406e31f3cc6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28787
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Fixes a few resource allocation issues in the directory controller:
- Added TBE resource checks on allocation.
- Now also allocating a TBE when issuing read requests to the controller
to allow for a better response to backpressure. Without the TBE as a
limiting factor, the directory can have an unbounded amount of
outstanding memory requests.
- Also allocating a TBE for forwarded requests.
Change-Id: I17016668bd64a50a4354baad5d181e6d3802ac46
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21928
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Pouya Fotouhi <pfotouhi@ucdavis.edu>
This patch properly sets the access permissions in all controllers.
'Busy' was used for all transient states, which is incorrect in lots of
cases when we still hold a valid copy of the line and are able to handle
a functional read.
In the L2 controller these states were split to differentiate the access
permissions:
IFGXX -> IFGXX, IFGXXD
IGMO -> IGMO, IGMOU
IGMIOF -> IGMIOF, IGMIOFD
Same for the dir. controller:
IS -> IS, IS_M
MM -> MM, MM_M
The dir. controllers also has the states WBI/WBS for lines that have
been queued for a writeback. In these states we hold the data in the TBE
for replying to functional reads until the memory acks the write and we
move to I or S.
Other minor changes includes updated debug messages and asserts.
Change-Id: Ie4f6eac3b4d2641ec91ac6b168a0a017f61c0d6f
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21927
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Pouya Fotouhi <pfotouhi@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch fixes some issues in the directory controller regarding DMA
handling:
1) Junk data messages were being sent immediately in response to DMA reads
for a line in the S state (one or more sharers, clean). Now, data is
fetched from memory directly and forwarded to the device. Some existing
transitions for handling GETS requests are reused, since it's essentially
the same behavior (except we don't update the list of sharers for DMAs)
2) DMA writes for lines in the I or S states would always overwrite the
whole line. We now check if it's only a partial line write, in which case
we fetch the line from memory, update it, and writeback.
3) Fixed incorrect DMA msg size
Some existing functions were renamed for clarity.
Change-Id: I759344ea4136cd11c3a52f9eaab2e8ce678edd04
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21926
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Pouya Fotouhi <pfotouhi@ucdavis.edu>
The temporal compactor was never initialized.
There were more possible indexes to the prec/succ vectors than
entries, so a block distance of zero would seg fault.
When checking for an address the wrong vector was being used.
From the original paper, "The prediction mechanism searches for
the PC of the accessed instruction in the index table"
Change-Id: I3c3aceac3c0adbbe8aef5c634c88cb35ba7487be
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28487
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch fixes the MESI_Three_Level protocols so that it correctly
informers the Ruby sequencer when a line eviction occurs. Furthermore,
the patch allows the protocol to recognize the 'Store_Conditional'
RubyRequestType and shortcuts this operation if the monitored line
has been cleared from the address monitor. This prevents certain
livelock behaviour in which a line could ping-pong between competing
cores.
The patch establishes a new C/C++ preprocessor definition which allows
the Sequencer to send the 'Store_Conditional' RubyRequestType to
MESI_Three_Level instead of 'ST'. This is a temporary measure until
the other protocols explicitely recognize 'Store_Conditional'.
Change-Id: I27ae041ab0e015a4f54f20df666f9c4873c7583d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28328
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
The implementation for load-linked/store-conditional did not work
correctly for multi-core simulations. Since load-links were treated as
stores, it was not possible for a line to have multiple readers which
often resulted in livelock when using these instructions to implemented
mutexes. This improved implementation treats load-linked instructions
similarly to loads but locks the line after a copy has been fetched
locally. Writes to a monitored address ensure the 'linked' property is
blown away and any subsequent store-conditional will fail.
Change-Id: I19bd74459e26732c92c8b594901936e6439fb073
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27103
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
The MESI_Three_Level protocol includes a transition in its L1
definition to invalidate an SM state but this transition does
not notify the L0 cache. The unintended side effect of this
allows stale values to be read by the L0 cache. This can cause
incorrect behaviour when executing LL/SC based mutexes. This
patch ensures that all invalidates to SM states are exposed to
the L0 cache.
Change-Id: I7fefabdaa8027fdfa4c9c362abd7e467493196aa
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28047
Reviewed-by: John Alsop <johnathan.alsop@amd.com>
Reviewed-by: Pouya Fotouhi <pfotouhi@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>