This patch adds the GPU protocol tester that uses data-race-free
operation to discover bugs in GPU protocols including GPU_VIPER. For
more information please see the following paper and the README:
T. Ta, X. Zhang, A. Gutierrez and B. M. Beckmann, "Autonomous
Data-Race-Free GPU Testing," 2019 IEEE International Symposium on
Workload Characterization (IISWC), Orlando, FL, USA, 2019, pp. 81-92,
doi: 10.1109/IISWC47752.2019.9042019.
Change-Id: Ic9939d131a930d1e7014ed0290601140bdd1499f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32855
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This change is useful when using custom simulation scripts that do not
rely on configs/common/Options.py.
Without this change, the custom script always needed to provide some
value for cache sizes and HW prefetchers configuration; with this change
it is possible to provide no value and use what is defined in the core
configuration as default.
Change-Id: Id0e807c3fa224180d682f366c7307941bab8ce59
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36776
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When the TokenPort was moved from the GCN3 staging branch to develop the
TokenPort was changed from being the port connecting the ComputeUnit to
Ruby's vector memory port to a sideband port which inhibits requests to
Ruby's vector memory port. As such, it needs to be explicitly connected
as a new port. This changes the getPort method in ComputeUnit to be
aware of the port as well as modifying the example config to connect to
TCPs.
The iteration to connect in the config file was modified since it was
not properly connecting to TCPs each time and Ruby.py does not
explicitly return a list of each MachineType.
Change-Id: Ia70a6756b2af54d95e94d19bec5d8aadd3c2d5c0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35096
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The workload object is still optional for the sake of compatibility,
even though it probably shouldn't be in the long term. If a simulation
is just a collection of components with nothing in particular running on
it, for instance driven by a traffic generator, should it even have a
System object in the first place?
Change-Id: I8bcda72bdfa3730248226fb62f0bba9a83243d95
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33278
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This version of garnet includes HeteroGarnet which
supports heterogenous interconnect systems, flexible
router and link configurations, and better debugging
resources.
This patch changes the garnet directory structure
to not include the version number. The user will be
informed about the garnet version being used.
Change-Id: Id4763421528305193ae0cd10c159b385a9513553
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34259
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Make the actual controller more generic
- Rename DRAMCtrl to MemCtrl
- Rename DRAMacket to MemPacket
- Rename dram_ctrl.cc to mem_ctrl.cc
- Rename dram_ctrl.hh to mem_ctrl.hh
- Create MemCtrl debug flag
Move the memory interface classes/functions to separate files
- mem_interface.cc
- mem_interface.hh
Change-Id: I1acba44c855776343e205e7733a7d8bbba92a82c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31654
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add NVM interface to memory controller.
This can be used with or instead of the existing
DRAM interface. Therefore, a single controller can interface
to either DRAM or NVM, or both.
Specifically, a memory channel can be configured as:
- Memory controller interfacing to DRAM only
- Memory controller interfacing to NVM only
- Memory controller interfacing to both DRAM and NVM
How data is placed or migrated between media types is outside
of the scope of this change.
The NVM interface incorporates new static delay parameters
for read and write completion. The interface defines a 2
stage read to manage non-deterministic read delays while
enabling deterministic data transfer, similar to NVDIMM-P.
The NVM interface also includes parameters to define
read and write buffers on the media side (on-DIMM). These are
utilized to quickly offload commands and write data, mitigating
the effects of lower latency and bandwidth media characteristics.
Change-Id: I6b22ddb495877f88d161f0bd74ade32cc8fdcbcc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29027
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Wendy Elsasser <wendy.elsasser@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Made DRAMCtrl a ClockedObject, with DRAMInterface
defined as an AbstractMemory. The address
ranges are now defined per interface. Currently
the model only includes a DRAMInterface but this
can be expanded for other media types.
The controller object includes a parameter to the
interface, which is setup when gem5 is configured.
Change-Id: I6a368b845d574a713c7196c5671188ca8c1dc5e8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28968
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch augments the MESI_Three_Level Ruby protocol with hardware
transactional memory support.
The HTM implementation relies on buffering of speculative memory updates.
The core notifies the L0 cache controller that a new transaction has
started and the controller in turn places itself in transactional state
(htmTransactionalState := true).
When operating in transactional state, the usual MESI protocol changes
slightly. Lines loaded or stored are marked as part of a transaction's
read and write set respectively. If there is an invalidation request to
cache line in the read/write set, the transaction is marked as failed.
Similarly, if there is a read request by another core to a speculatively
written cache line, i.e. in the write set, the transaction is marked as
failed. If failed, all subsequent loads and stores from the core are
made benign, i.e. made into NOPS at the cache controller, and responses
are marked to indicate that the transactional state has failed. When the
core receives these marked responses, it generates a HtmFailureFault
with the reason for the transaction failure. Servicing this fault does
two things--
(a) Restores the architectural checkpoint
(b) Sends an HTM abort signal to the cache controller
The restoration includes all registers in the checkpoint as well as the
program counter of the instruction before the transaction started.
The abort signal is sent to the L0 cache controller and resets the
failed transactional state. It resets the transactional read and write
sets and invalidates any speculatively written cache lines. It also
exits the transactional state so that the MESI protocol operates as
usual.
Alternatively, if the instructions within a transaction complete without
triggering a HtmFailureFault, the transaction can be committed. The core
is responsible for notifying the cache controller that the transaction
is complete and the cache controller makes all speculative writes
visible to the rest of the system and exits the transactional state.
Notifting the cache controller is done through HtmCmd Requests which are
a subtype of Load Requests.
KUDOS:
The code is based on a previous pull request by Pradip Vallathol who
developed HTM and TSX support in Gem5 as part of his master’s thesis:
http://reviews.gem5.org/r/2308/index.html
JIRA: https://gem5.atlassian.net/browse/GEM5-587
Change-Id: Icc328df93363486e923b8bd54f4d77741d8f5650
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30319
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This upgrades the garnet model to support HeteroGarnet
1) Static and dynamic multi-freq domains in network
2) Support for CDC
3) Separate links for each message class
4) Separate linkwidth for each message class
5) Support for SerDes
Change-Id: I6d00e3b5cb3745e849d221066cb46b2138c47871
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32597
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
There was a missing option (--buffers-size) used to set the mandatory
queue size for the scalar controllers. This patch renames the option to
be more clear, and adds it to the argument parser.
Default of 128 taken from the implementation on the GCN staging branch
Change-Id: I58b6b57be07498cdf6e39c0bb85982674ec4caa6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32676
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This change sets the properties in hsaTopology to the proper values
specified by the user through command-line arguments. This ensures
that if the properties file is read by a program, it will return
the correct values for the simulated hardware.
This change also adds in a command-line argument for the lds size, as
it was the only other property used in hsaTopology that didn't have
a command-line argument. The default value (65536) is taken from
src/gpu-compute/LdsState.py
Change-Id: I17bb812491708f4221c39b738c906f1ad944614d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31995
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This was originally from the GCN staging branch, which only had
GPU_VIPER.py, but the other GPU_VIPER configs had DirMem as well, so I
applied this change to all of them.
The patch replaces the Directory in DirCntrl from DirMem to
RubyDirectoryMemory. This fixes errors that DirMem caused relating to
setting class variables. It also generates and sets addr_ranges in
DirCntrl as RubyDirectoryMemory uses the parent object's addr_ranges
in its code
The style checker complained about a line length in GPU_VIPER_Region,
so the patch also fixes that
Change-Id: Icec96777a51d8a826b576fc752fae0f7f15427bc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32674
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This changeset adds the necessary changes for running
GCN3 ISA with VIPER in apu_se.py.
Changes to the VIPER protocol configs are made to add support
for DMA and scalar caches.
hsaTopology is added to help the pseudo FS create the files
needed by ROCm to understand the device on which the SW is
being run.
Change-Id: I0f47a6a36bb241a26972c0faafafcf332a7d7d1f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30274
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
fs.py warns when an Arm platform is being created without a DTB file,
if the platform does not support the automatic creation of a DTB.
Updated the list of supported platforms with recent additions in order
to remove incorrect and potentially confusing warnings.
Change-Id: I549124a1afbc36e313f614dccab17973582bc3f7
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30575
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
This commit makes it possible to make invocations such as:
gem5.opt se.py --stats-root 'system.cpu[:].dtb' --stats-root 'system.membus'
When --stats-root is given, only stats that are under any of the root
SimObjects get dumped. E.g. the above invocation would dump stats such as:
system.cpu0.dtb.walker.pwrStateResidencyTicks::UNDEFINED
system.cpu1.dtb.walker.pwrStateResidencyTicks::UNDEFINED
system.membus.pwrStateResidencyTicks::UNDEFINED
system.membus.trans_dist::ReadReq
but not for example `system.clk_domain.clock`.
If the --stats-root is given, only new stats as defined at:
Idc8ff448b9f70a796427b4a5231e7371485130b4 get dumped, and old ones are
ignored. The commits following that one have done some initial conversion
work, but many stats are still in the old format.
Change-Id: Iadaef26edf9a678b39f774515600884fbaeec497
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28628
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This changeset allows setting a variable for interleaving.
That value is used together with the number of directories to
calculate numa_high_bit, which is in turn used to set up
cache, directory, and memory controller interleaving.
A similar approach is used to set xor_low_bit, and calculate
xor_high_bit for address hashing.
Change-Id: Ia342c77c59ca2e3438db218b5c399c3373618320
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28134
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Most of these "rcS" scripts are only useful for specific disk images
that have long been lost to the gem5 community. This commit deletes all
of these scripts. It keeps the generally useful hack_back_cktp script
and the bbench scripts that work with the android images that are still
available.
In the future, these remaning scripts should be moved to the gem5
resources repository.
Issue-on: https://gem5.atlassian.net/browse/GEM5-350
Change-Id: Iba99e70fde7f656e968b4ecd95663275bd38fd6e
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28507
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch properly sets the access permissions in all controllers.
'Busy' was used for all transient states, which is incorrect in lots of
cases when we still hold a valid copy of the line and are able to handle
a functional read.
In the L2 controller these states were split to differentiate the access
permissions:
IFGXX -> IFGXX, IFGXXD
IGMO -> IGMO, IGMOU
IGMIOF -> IGMIOF, IGMIOFD
Same for the dir. controller:
IS -> IS, IS_M
MM -> MM, MM_M
The dir. controllers also has the states WBI/WBS for lines that have
been queued for a writeback. In these states we hold the data in the TBE
for replying to functional reads until the memory acks the write and we
move to I or S.
Other minor changes includes updated debug messages and asserts.
Change-Id: Ie4f6eac3b4d2641ec91ac6b168a0a017f61c0d6f
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21927
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Pouya Fotouhi <pfotouhi@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>