The VIPER configuration uses the MOESI_AMD_Base protocol's directory.
This protocol does not wait for memory ACKs. As a result, this can lead
to read requests being pulled out of the MessageBuffer between the
directory and DRAMCtrl before a write request to the same address. This
leads to inconsistent data. To fix this, make the MessageBuffers
ordered. Since these MessageBuffers are essentially just an interface
between SLICC and DRAMCtrl, and DRAMCtrl can reorder requests properly,
this should not cause any large impact on performance due to the
constraint.
Also remove the duplicate instantiation of these MessageBuffers.
Change-Id: I59653717cc79884e733af3958adfc14941703958
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57411
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Remove the line "For use for simulation and test purposes only" in files
were AMD is the only copyright holder listed in the header. This happens
to be the case for all files where this line exists, removing it
completely from gem5.
Change-Id: I623f266b002f564301b28774f49081099cfc60fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In order to fix several regression failures [1] the master/slave
terminology in src/cpu/BaseCPU.py was reintroduced [2].
This patch is addressing the issue by providing 2 different
ways of connecting cpu ports:
*) connectBus: The method assumes an object with a bus interface is
passed as an argument, therefore it tries to bind cpu ports to the
bus.mem_side_ports and bus.cpu_side_ports
*) connectAllPorts: No assumption on the port owning device is made.
The method simply accepts ports as arguments which will be directly
connected to the peer cpu ports
This will be used for example by ruby Sequencers
[1]: https://gem5.atlassian.net/browse/GEM5-775
[2]: https://gem5-review.googlesource.com/c/public/gem5/+/34495
Change-Id: I715ab8471621d6e5eb36731d7eaefbedf9663a71
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52584
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
In order to have more fine grained control over which SLICC controllers
are part of which Ruby network in a disjoint configuration, the
create_system function in GPU_VIPER is broken up into multiple construct
calls for each SLICC machine type in the protocol. By default this does
not change anything functionally. A future config will use the construct
calls to explicitly set which network (CPU or GPU) the controller is in.
Change-Id: Ic038b300c5c3732e96992ef4bfe14e43fa0ea824
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51847
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Viper is checking for the dma's type before making the port assignment.
In FullSystem mode the IDE device is a PortRef and does not have an
attribute 'type.' This handles the various types a bit better and
ensures that IDE device, the protocol tester, and upcoming DMA devices
related to FullSystem can be added.
Change-Id: I6879b25c6aabbbc22b0ee8dc9cbfec6399f70daa
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/44806
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
GPUCoalescers in FullSystem mode should not be connected to the piobus
since they reside on a completely different RubyPort. There is also no
concept of IO requests from GPU so any request attempting to use the
default port (pio) should fatal. Further, coalescers do not implement
the connectIOPorts function.
This avoids coalescers by checking is_cpu_sequencer, which I believe is
the purpose of that boolean.
Change-Id: I482dd631292ca20e3bcd856489376f9b38457200
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/44805
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch is adding an extra parameter to the Ruby.create_system
function. The idea is to remove any assumption about cpu configuration
in the ruby scripts.
At the moment the scripts are assuming a flat list of cpu assigned
to the system object. Unfortunately this is not standardized, as
some systems might empoloy a different layout of cpus, like grouping
them in cluster objects.
With this patch we are allowing client scripts to provide the cpu list
as an extra argument
This has the extra benefit of removing the indexing hack
if len(system.cpu) == 1:
which was present in most scripts
Change-Id: Ibc06b920273cde4f7c394d61c0ca664a7143cd27
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/43287
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Changed format from yaml to plain python. The new py configuration
file, when provided, must specialize the CHI node types defined in
configs/ruby/CHI_config.py (moved from configs/ruby/CHI.py). This
is required in order to setup the node->router bindings when the
CustomMesh topology is used.
See configs/example/noc_config/2x4.py (replaces
configs/example/noc_config/2x4.yaml) for an example.
--noc-config was also renamed to --chi-config, since the CHI node types
can be fully specialized in the configuration file.
Change-Id: Ic0c5407dba3d2483d5c30634c115b5410a5228fd
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/43123
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch add a new Ruby cache coherence protocol based on Arm' AMBA5
CHI specification. The CHI protocol defines and implements two state
machine types:
- Cache_Controller: generic cache controller that can be configured as:
- Top-level L1 I/D cache
- A intermediate level (L2, L3, ...) private or shared cache
- A CHI home node (i.e. the point of coherence of the system and
has the global directory)
- A DMA requester
- Memory_Controller: implements a CHI slave node and interfaces with
gem5 memory controller. This controller has the functionality of a
Directory_Controller on the other Ruby protocols, except it doesn't
have a directory.
The Cache_Controller has multiple cache allocation/deallocation
parameters to control the clusivity with respect to upstream caches.
Allocation can be completely disabled to use Cache_Controller as a
DMA requester or as a home node without a shared LLC.
The standard configuration file configs/ruby/CHI.py provides a
'create_system' compatible with configs/example/fs.py and
configs/example/se.py and creates a system with private L1/L2 caches
per core and a shared LLC at the home nodes. Different cache topologies
can be defined by modifying 'create_system' or by creating custom
scripts using the structures defined in configs/ruby/CHI.py.
This patch also includes the 'CustomMesh' topology script to be used
with CHI. CustomMesh generates a 2D mesh topology with the placement
of components manually defined in a separate configuration file using
the --noc-config parameter.
The example in configs/example/noc_config/2x4.yaml creates a simple 2x4
mesh. For example, to run a SE mode simulation, with 4 cores,
4 mem ctnrls, and 4 home nodes (L3 caches):
build/ARM/gem5.opt configs/example/se.py \
--cmd 'tests/test-progs/hello/bin/arm/linux/hello' \
--ruby --num-cpus=4 --num-dirs=4 --num-l3caches=4 \
--topology=CustomMesh --noc-config=configs/example/noc_config/2x4.yaml
If one doesn't care about the component placement on the interconnect,
the 'Crossbar' and 'Pt2Pt' may be used and they do not require the
--noc-config option.
Additional authors:
Joshua Randall <joshua.randall@arm.com>
Pedro Benedicte <pedro.benedicteillescas@arm.com>
Tuan Ta <tuan.ta2@arm.com>
JIRA: https://gem5.atlassian.net/browse/GEM5-908
Change-Id: I856524b0afd30842194190f5bd69e7e6ded906b0
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42563
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch updates Ruby configuration scripts to use the functions
defined in the RubySequencer python object to connect to cpu ports.
Only the protocol-agnostic scripts were updated. Scripts that assume
a specific protocol (e.g. configs/example/apu_se.py, gpu tests, etc)
and scripts in which the obj connected to the RubySequencer is not a
BaseCPU (e.g. the tests scripts) were not changed as they require a
non-standard port wireup.
Change-Id: I1e931ff0fc93f393cb36fbb8769ea4b48e1a1e86
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31418
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Make the actual controller more generic
- Rename DRAMCtrl to MemCtrl
- Rename DRAMacket to MemPacket
- Rename dram_ctrl.cc to mem_ctrl.cc
- Rename dram_ctrl.hh to mem_ctrl.hh
- Create MemCtrl debug flag
Move the memory interface classes/functions to separate files
- mem_interface.cc
- mem_interface.hh
Change-Id: I1acba44c855776343e205e7733a7d8bbba92a82c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31654
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add NVM interface to memory controller.
This can be used with or instead of the existing
DRAM interface. Therefore, a single controller can interface
to either DRAM or NVM, or both.
Specifically, a memory channel can be configured as:
- Memory controller interfacing to DRAM only
- Memory controller interfacing to NVM only
- Memory controller interfacing to both DRAM and NVM
How data is placed or migrated between media types is outside
of the scope of this change.
The NVM interface incorporates new static delay parameters
for read and write completion. The interface defines a 2
stage read to manage non-deterministic read delays while
enabling deterministic data transfer, similar to NVDIMM-P.
The NVM interface also includes parameters to define
read and write buffers on the media side (on-DIMM). These are
utilized to quickly offload commands and write data, mitigating
the effects of lower latency and bandwidth media characteristics.
Change-Id: I6b22ddb495877f88d161f0bd74ade32cc8fdcbcc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29027
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Wendy Elsasser <wendy.elsasser@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Made DRAMCtrl a ClockedObject, with DRAMInterface
defined as an AbstractMemory. The address
ranges are now defined per interface. Currently
the model only includes a DRAMInterface but this
can be expanded for other media types.
The controller object includes a parameter to the
interface, which is setup when gem5 is configured.
Change-Id: I6a368b845d574a713c7196c5671188ca8c1dc5e8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28968
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch augments the MESI_Three_Level Ruby protocol with hardware
transactional memory support.
The HTM implementation relies on buffering of speculative memory updates.
The core notifies the L0 cache controller that a new transaction has
started and the controller in turn places itself in transactional state
(htmTransactionalState := true).
When operating in transactional state, the usual MESI protocol changes
slightly. Lines loaded or stored are marked as part of a transaction's
read and write set respectively. If there is an invalidation request to
cache line in the read/write set, the transaction is marked as failed.
Similarly, if there is a read request by another core to a speculatively
written cache line, i.e. in the write set, the transaction is marked as
failed. If failed, all subsequent loads and stores from the core are
made benign, i.e. made into NOPS at the cache controller, and responses
are marked to indicate that the transactional state has failed. When the
core receives these marked responses, it generates a HtmFailureFault
with the reason for the transaction failure. Servicing this fault does
two things--
(a) Restores the architectural checkpoint
(b) Sends an HTM abort signal to the cache controller
The restoration includes all registers in the checkpoint as well as the
program counter of the instruction before the transaction started.
The abort signal is sent to the L0 cache controller and resets the
failed transactional state. It resets the transactional read and write
sets and invalidates any speculatively written cache lines. It also
exits the transactional state so that the MESI protocol operates as
usual.
Alternatively, if the instructions within a transaction complete without
triggering a HtmFailureFault, the transaction can be committed. The core
is responsible for notifying the cache controller that the transaction
is complete and the cache controller makes all speculative writes
visible to the rest of the system and exits the transactional state.
Notifting the cache controller is done through HtmCmd Requests which are
a subtype of Load Requests.
KUDOS:
The code is based on a previous pull request by Pradip Vallathol who
developed HTM and TSX support in Gem5 as part of his master’s thesis:
http://reviews.gem5.org/r/2308/index.html
JIRA: https://gem5.atlassian.net/browse/GEM5-587
Change-Id: Icc328df93363486e923b8bd54f4d77741d8f5650
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30319
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
There was a missing option (--buffers-size) used to set the mandatory
queue size for the scalar controllers. This patch renames the option to
be more clear, and adds it to the argument parser.
Default of 128 taken from the implementation on the GCN staging branch
Change-Id: I58b6b57be07498cdf6e39c0bb85982674ec4caa6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32676
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This was originally from the GCN staging branch, which only had
GPU_VIPER.py, but the other GPU_VIPER configs had DirMem as well, so I
applied this change to all of them.
The patch replaces the Directory in DirCntrl from DirMem to
RubyDirectoryMemory. This fixes errors that DirMem caused relating to
setting class variables. It also generates and sets addr_ranges in
DirCntrl as RubyDirectoryMemory uses the parent object's addr_ranges
in its code
The style checker complained about a line length in GPU_VIPER_Region,
so the patch also fixes that
Change-Id: Icec96777a51d8a826b576fc752fae0f7f15427bc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32674
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This changeset adds the necessary changes for running
GCN3 ISA with VIPER in apu_se.py.
Changes to the VIPER protocol configs are made to add support
for DMA and scalar caches.
hsaTopology is added to help the pseudo FS create the files
needed by ROCm to understand the device on which the SW is
being run.
Change-Id: I0f47a6a36bb241a26972c0faafafcf332a7d7d1f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30274
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This changeset allows setting a variable for interleaving.
That value is used together with the number of directories to
calculate numa_high_bit, which is in turn used to set up
cache, directory, and memory controller interleaving.
A similar approach is used to set xor_low_bit, and calculate
xor_high_bit for address hashing.
Change-Id: Ia342c77c59ca2e3438db218b5c399c3373618320
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28134
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch properly sets the access permissions in all controllers.
'Busy' was used for all transient states, which is incorrect in lots of
cases when we still hold a valid copy of the line and are able to handle
a functional read.
In the L2 controller these states were split to differentiate the access
permissions:
IFGXX -> IFGXX, IFGXXD
IGMO -> IGMO, IGMOU
IGMIOF -> IGMIOF, IGMIOFD
Same for the dir. controller:
IS -> IS, IS_M
MM -> MM, MM_M
The dir. controllers also has the states WBI/WBS for lines that have
been queued for a writeback. In these states we hold the data in the TBE
for replying to functional reads until the memory acks the write and we
move to I or S.
Other minor changes includes updated debug messages and asserts.
Change-Id: Ie4f6eac3b4d2641ec91ac6b168a0a017f61c0d6f
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21927
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Pouya Fotouhi <pfotouhi@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Calls to queueMemoryRead and queueMemoryWrite do not consider the size
of the queue between ruby directories and DRAMCtrl which causes infinite
buffering in the queued port between the two. This adds a MessageBuffer
in between which uses enqueues in SLICC and is therefore size checked
before any SLICC transaction pushing to the buffer can occur, removing
the infinite buffering between the two.
Change-Id: Iedb9070844e4f6c8532a9c914d126105ec98d0bc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27427
Tested-by: Gem5 Cloud Project GCB service account <345032938727@cloudbuild.gserviceaccount.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>