The function had been introduced in the past when we needed to
instantiate either an ArmSystem or a LinuxArmSystem depending on the
workload. Now that the workload object has been introduced in gem5, we
always instantiate an ArmSystem in FS mode, hence we don't need a
function to generate the System object
Change-Id: I79ccf31087b84521cce32da71bc835ff202dc432
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/43285
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch add a new Ruby cache coherence protocol based on Arm' AMBA5
CHI specification. The CHI protocol defines and implements two state
machine types:
- Cache_Controller: generic cache controller that can be configured as:
- Top-level L1 I/D cache
- A intermediate level (L2, L3, ...) private or shared cache
- A CHI home node (i.e. the point of coherence of the system and
has the global directory)
- A DMA requester
- Memory_Controller: implements a CHI slave node and interfaces with
gem5 memory controller. This controller has the functionality of a
Directory_Controller on the other Ruby protocols, except it doesn't
have a directory.
The Cache_Controller has multiple cache allocation/deallocation
parameters to control the clusivity with respect to upstream caches.
Allocation can be completely disabled to use Cache_Controller as a
DMA requester or as a home node without a shared LLC.
The standard configuration file configs/ruby/CHI.py provides a
'create_system' compatible with configs/example/fs.py and
configs/example/se.py and creates a system with private L1/L2 caches
per core and a shared LLC at the home nodes. Different cache topologies
can be defined by modifying 'create_system' or by creating custom
scripts using the structures defined in configs/ruby/CHI.py.
This patch also includes the 'CustomMesh' topology script to be used
with CHI. CustomMesh generates a 2D mesh topology with the placement
of components manually defined in a separate configuration file using
the --noc-config parameter.
The example in configs/example/noc_config/2x4.yaml creates a simple 2x4
mesh. For example, to run a SE mode simulation, with 4 cores,
4 mem ctnrls, and 4 home nodes (L3 caches):
build/ARM/gem5.opt configs/example/se.py \
--cmd 'tests/test-progs/hello/bin/arm/linux/hello' \
--ruby --num-cpus=4 --num-dirs=4 --num-l3caches=4 \
--topology=CustomMesh --noc-config=configs/example/noc_config/2x4.yaml
If one doesn't care about the component placement on the interconnect,
the 'Crossbar' and 'Pt2Pt' may be used and they do not require the
--noc-config option.
Additional authors:
Joshua Randall <joshua.randall@arm.com>
Pedro Benedicte <pedro.benedicteillescas@arm.com>
Tuan Ta <tuan.ta2@arm.com>
JIRA: https://gem5.atlassian.net/browse/GEM5-908
Change-Id: I856524b0afd30842194190f5bd69e7e6ded906b0
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42563
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add a DMA thread tester to the Ruby GPU tester to test the DMA state
machine in the protocol. Currently creates a dummy DMA device to pass
through Ruby.py and scans for the DMA sequencers due to opaqueness of
Ruby.py.
DMA atomics not yet supported as there is no protocol that implements
atomic transitions in the DMA state machine file.
Example run command:
build/GCN3_X86/gem5.opt configs/example/ruby_gpu_random_test.py \
--test-length=1000
Change-Id: I63d83e00fd0dcbb1e34c6704d1c2d49ed4e77722
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39936
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Event creation and management support from emulated drivers is required
to support interruptible signals in HSA and this support was not
available. This changeset adds the event creation and management support
in the emulated driver. With this patch, each interruptible signal
created by the HSA runtime is associated with a signal event. The HSA
runtime can then put a thread waiting on a signal condition to sleep
asking the driver to monitor the event associated with that signal. If
the signal is modified by the GPU, the dispatcher notifies the driver
about signal value change. If the modifier is a CPU thread, the thread
will have to make HSA API calls to modify the signal and these API calls
will notify the driver about signal value change. Once the driver is
notified about a change in the signal value, the driver checks to see if
any thread is sleeping on that signal and wake up the sleeping thread
associated with that event. The driver has also implemented the time_out
wakeup that can wake up the thread after a certain time period has
expired. This is also true for barrier packets.
Each signal has an event address in a kernel managed and allocated
event page that can be used as a mailbox pointer to notify an event.
However, this feature used by non-CPU agents to communicate with the
driver is not implemented by this changeset because the non-CPU HSA
agents in our model can directly communicate with driver in our
implementation. Having said that, adding that feature should be trivial
because the event address and event pages are correctly setup by this
changeset and just adding the event page's virtual address to our PIO
doorbell interface in the page tables and registering that pio address
to the driver should be sufficient. Managing mailbox pointer for an
event is based on event ID and using this event ID as an index into
event page, this changeset already provides a unique mailbox pointer for
each event.
Change-Id: Ic62794076ddd47526b1f952fdb4c1bad632bdd2e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38335
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The baremetal platform is the platform we use for running
user supplied binaries on baremetal hardware.
(simply put, it runs provided binaries without adding
a gem5 bootloader)
Some layers of this software stack might not have a pci driver.
This might be the case for firmware images like edkII
which needs to use a block device to extract the bootloader
and/or the kernel image. Those can use the memory mapped
(in host domain) virtio block device which is already
part of the VExpress_GEM5 platforms
Change-Id: I9c6ba7e1b4566a3999fd9ba20a2bebe191dc3ef8
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39995
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
SimplePoolManager doesn't allow mapping of two WGs
simultaneously on the same Compute Unit (provided
the previous WG has been mapped to all the SIMDs)
even if there is sufficient VRF and SRF space
available.
DynPoolManager takes care of that by dynamically
allocating and deallocating register file space
to wavefronts
Change-Id: I2255c68d4b421615d7b231edc05d3ebb27cbd66c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32034
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>
Before the commit, the bootloader had a hardcoded entry point that it
would jump to.
However, the Linux kernel arm64 v5.8 forced us to change the kernel
entry point because the required memory alignment has changed at:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
commit/?h=v5.8&id=cfa7ede20f133cc81cef01dc3a516dda3a9721ee
Therefore the only way to have a single bootloader that boots both
pre-v5.8 and post-v5.8 kernels is to pass that information from gem5
to the bootloader, which we do in this patch via registers.
This approach was already used by the 32-bit bootloader, which passed
that value via r3, and we try to use the same register x3 in 64-bit.
Since we are now passing this information, the this patch also removes
the hardcoding of DTB and cpu-release-addr, and also passes those
values via registers.
We store the cpu-release-addr in x5 as that value appears to have a
function similar to flags_addr, which is used only in 32-bit arm and
gets stored in r5.
This commit renames atags_addr to dtb_addr, since both are mutually
exclusive, and serve a similar purpose, DTB being the newer recommended
approach.
Similarly, flags_addr is renamed to cpu_release_addr, and it is moved
from ArmSystem into ArmFsWorkload, since it is not an intrinsic system
property, and should be together with dtb_addr instead.
Before this commit, flags_addr was being set from FSConfig.py and
configs/example/arm/devices.py to self.realview.realview_io.pio_addr
+ 0x30. This commit moves that logic into RealView.py instead, and
sets the flags address 8 bytes before the start of the DTB address.
JIRA: https://gem5.atlassian.net/browse/GEM5-787
Change-Id: If70bea9690be04b84e6040e256a9b03e46710e10
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35076
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Since the beginning fs_bigLITTLE has been pointing to a default
default_rcs = 'bootscript.rcS'
as a System.readfile parameter. That script is not present in
the gem5 repo and all the other fs scripts (starter_fs.py, fs.py
through Options.py) are using an emptry string as default
readfile param value.
We are hence aligning to the other scripts by removing this
default value
Change-Id: I20dc7714deae890d61706459c8d13bd8f5aac7a0
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38815
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch updates Ruby configuration scripts to use the functions
defined in the RubySequencer python object to connect to cpu ports.
Only the protocol-agnostic scripts were updated. Scripts that assume
a specific protocol (e.g. configs/example/apu_se.py, gpu tests, etc)
and scripts in which the obj connected to the RubySequencer is not a
BaseCPU (e.g. the tests scripts) were not changed as they require a
non-standard port wireup.
Change-Id: I1e931ff0fc93f393cb36fbb8769ea4b48e1a1e86
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31418
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch adds the GPU protocol tester that uses data-race-free
operation to discover bugs in GPU protocols including GPU_VIPER. For
more information please see the following paper and the README:
T. Ta, X. Zhang, A. Gutierrez and B. M. Beckmann, "Autonomous
Data-Race-Free GPU Testing," 2019 IEEE International Symposium on
Workload Characterization (IISWC), Orlando, FL, USA, 2019, pp. 81-92,
doi: 10.1109/IISWC47752.2019.9042019.
Change-Id: Ic9939d131a930d1e7014ed0290601140bdd1499f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32855
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When the TokenPort was moved from the GCN3 staging branch to develop the
TokenPort was changed from being the port connecting the ComputeUnit to
Ruby's vector memory port to a sideband port which inhibits requests to
Ruby's vector memory port. As such, it needs to be explicitly connected
as a new port. This changes the getPort method in ComputeUnit to be
aware of the port as well as modifying the example config to connect to
TCPs.
The iteration to connect in the config file was modified since it was
not properly connecting to TCPs each time and Ruby.py does not
explicitly return a list of each MachineType.
Change-Id: Ia70a6756b2af54d95e94d19bec5d8aadd3c2d5c0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35096
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The workload object is still optional for the sake of compatibility,
even though it probably shouldn't be in the long term. If a simulation
is just a collection of components with nothing in particular running on
it, for instance driven by a traffic generator, should it even have a
System object in the first place?
Change-Id: I8bcda72bdfa3730248226fb62f0bba9a83243d95
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33278
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Make the actual controller more generic
- Rename DRAMCtrl to MemCtrl
- Rename DRAMacket to MemPacket
- Rename dram_ctrl.cc to mem_ctrl.cc
- Rename dram_ctrl.hh to mem_ctrl.hh
- Create MemCtrl debug flag
Move the memory interface classes/functions to separate files
- mem_interface.cc
- mem_interface.hh
Change-Id: I1acba44c855776343e205e7733a7d8bbba92a82c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31654
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Made DRAMCtrl a ClockedObject, with DRAMInterface
defined as an AbstractMemory. The address
ranges are now defined per interface. Currently
the model only includes a DRAMInterface but this
can be expanded for other media types.
The controller object includes a parameter to the
interface, which is setup when gem5 is configured.
Change-Id: I6a368b845d574a713c7196c5671188ca8c1dc5e8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28968
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This upgrades the garnet model to support HeteroGarnet
1) Static and dynamic multi-freq domains in network
2) Support for CDC
3) Separate links for each message class
4) Separate linkwidth for each message class
5) Support for SerDes
Change-Id: I6d00e3b5cb3745e849d221066cb46b2138c47871
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32597
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
This change sets the properties in hsaTopology to the proper values
specified by the user through command-line arguments. This ensures
that if the properties file is read by a program, it will return
the correct values for the simulated hardware.
This change also adds in a command-line argument for the lds size, as
it was the only other property used in hsaTopology that didn't have
a command-line argument. The default value (65536) is taken from
src/gpu-compute/LdsState.py
Change-Id: I17bb812491708f4221c39b738c906f1ad944614d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31995
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This changeset adds the necessary changes for running
GCN3 ISA with VIPER in apu_se.py.
Changes to the VIPER protocol configs are made to add support
for DMA and scalar caches.
hsaTopology is added to help the pseudo FS create the files
needed by ROCm to understand the device on which the SW is
being run.
Change-Id: I0f47a6a36bb241a26972c0faafafcf332a7d7d1f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30274
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
fs.py warns when an Arm platform is being created without a DTB file,
if the platform does not support the automatic creation of a DTB.
Updated the list of supported platforms with recent additions in order
to remove incorrect and potentially confusing warnings.
Change-Id: I549124a1afbc36e313f614dccab17973582bc3f7
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30575
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
This commit does not make any functional changes but just rearranges
the existing code with regard to the power states. Previously, all
code regarding power states was in the ClockedObjects. However, it
seems more logical and cleaner to move this code into a separate
class, called PowerState. The PowerState is a now SimObject. Every
ClockedObject has a PowerState but this patch also allows for objects
with PowerState which are not ClockedObjects.
Change-Id: Id2db86dc14f140dc9d0912a8a7de237b9df9120d
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28049
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
fs_power.py is an example script that demonstrates how power models
can be used with gem5. Previously, the formulas used to calculate the
dynamic and static power of the cores and the L2 cache were using
stats in equations as determined by their path relative to the
SimObject where the power model is attached to or full paths. This CL
changes these formulas to refer to the stats only by their full paths.
Change-Id: I91ea16c88c6a884fce90fd4cd2dfabcba4a1326c
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27893
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Before this change, running:
./build/NULL/gem5.opt configs/example/ruby_mem_test.py -m 20000000 \
--functional 10
would only print warning for memory errors such as:
warn: Read access failed at 0x107a00
and there was no way to make the simulation fail.
This commit makes those warnings into errors such as:
panic: Read access failed at 0x107a00
unless --suppress-func-errors is given.
This will be used to automate MemTest testing in later commits.
Change-Id: I1840c1ed1853f1a71ec73bd50cadaac095794f91
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26804
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>