Previously the L1 request and response latencies were not configurable
in the GPU config scripts. As a result, the simulations used the default
values from GPU.py. This commits adds support to change this value as an
input parameter. The parameters to use are "--mem-req-latency" followed
by the value and "--mem-resp-latency" followed by the value. The default
values are the same as those in GPU.py (which is 50). These new
parameters should be set instead of changing the mandatory queue latency when configuring the L1 cache.
Change-Id: I812d77758ea12530899953f308c91f4c8b05866d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63971
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Previously the LDS bus latency was not configurable in the GPU config
scripts. As a result, the simulations would use the default value from
GPU.py. This commit adds support to change this value as an input
option. The option to use is "--vrf_lm_bus_latency".
Change-Id: I8d8852e6d7b9d03ebec1fe8b392968f396dd3526
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63652
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair.wisc@gmail.com>
Support has been added for SDMA RLC queues which are used for host to
device and device to host "memcpy" calls. Previously the SDMA engine was
disabled which caused GPU BLIT kernels to be called. This removes the
environment variable disabling SDMAs which has two main benefits:
- It will be much easier to debug host/device transfer by using SDMA
debug flag.
- Simulation time is improved since we no longer need detailed GPU
simulation to copy data and instead are doing a simple large DMA
Change-Id: I7524245731d301b5c26394318f2156ed6b4c983a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62717
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
simpoints-se-checkpoint.py & simpoints-se-restore.py:
These are two example scripts to show how to use SimPoints functions with
the stdlib.
se_binary_workload.py:
Allow se_binary_workload to take in SimPoint Class item and schedule
SimPoint exit events.
exit_event.py:
Added SIMPOINT_BEGIN and MAX_INSTS exit events.
simulator.py:
Added SIMPOINT_BEGIN and MAX_INSTS exit event scheduling functions.
They can schedule exit events before or during the simulation.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-1259
Change-Id: Iaa07a83de9dddc293b9f1a230aba8e35d4f5af6c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63154
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
The TARGET_ISA variable would let you select one ISA from a list of
possible ISAs. That has now been replaced with USE_ARM_ISA, USE_X86_ISA,
etc, variables which are boolean on or off. That will allow any number
of ISAs to be enabled or disabled individually. Enabling something other
than exactly one of these will probably prevent you from getting a
working gem5 binary, but those problems are being addressed in other,
parallel change series.
I decided to use the USE_ prefix since it was consistent with most other
on/off variables we have in gem5. One noteable exception is the
BUILD_GPU setting which, you could convincingly argue, is a better
prefix than USE_. Another option would be to use CONFIG_, in
anticipation of using a kconfig style config mechanism in gem5.
It seemed premature to start using a CONFIG_ prefix here, and if we
decide to switch to some other prefix like BUILD_, it should be a
purposeful choice and not something somebody just starts using.
Change-Id: I90fef2835aa4712782e6c1313fbf564d0ed45538
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52491
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The environment variable HSA_ENABLE_INTERRUPT controls if Interrupt or
busy wait signals are used in the ROCm runtime. Interrupts are not being
sent in gem5 causing simulations to hang indefinitely in certain
situations. To fix this, always disable interrupts to fall back to busy
wait signals. Using interrupts is an old and simple optimization to not
waste CPU cycles, but from the perspective of simulation this is not
important. Disabling interrupt-based HSA signals therefore increases the
number of applications working within gem5.
Change-Id: I1ae21d7ee01548a4d00a8972642079b90278f9a2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61652
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
This change consists of two scripts,
- riscv-hello-save-checkpoint.py: runs the first million ticks of the
simulation and save a checkpoint.
- riscv-hello-load-checkpoint.py: loads the above checkpoint, and runs
the rest of the simulation.
This change also adds the two scripts as part of quick tests.
Change-Id: I7bd97ba953fab52f298cbbcf213f2ea5c185cc38
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58829
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
This changes adds a new board to the gem5 stdlib, which is capable
of simulating an ARM based full system. It also adds an example
config script to perform a boot-test using an Ubuntu 18.04 disk
image. A test has been added in the gem5-library-example for the
same.
Change-Id: Ic95ee56084a444c7f1cf21cdcbf40585dcf5274a
Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58910
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
The supported version of ROCm in GPUFS only supports the DKMS build of
the amdgpu driver. However, since a gem5 user can potentially pass in
any Linux kernel as a parameter, it is possible that the DKMS package
for that kernel was not installed on the disk image. This would result
in the simulation appearing to work when in reality it is just spinning
waiting for commands from the driver. This check exits gem5 early in the
simulation and outputs an error on the console to sanity check the
correct driver is being used.
Change-Id: I708912e5625e47eba15dcb2f722772a3b2928b98
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58129
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The main restriction with this design is it results in one ISA target
per board. The ISA is declared per core. To make the design simpler it's
assumed a Processor (a collection of cores) are all of the same ISA. As
each board has one processor, this also means a board is typically tied
to one ISA per simulation.
In order to remain backwards compatible and maintain the standard
library APIs, this patch adds a `--main-isa` parameter which will
determine what `gem5.runtime.get_runtime_isa` returns in cases where
mutliple ISAs are compiled in. When setting the ISA in a simulation (via
the Processor or Cores), the user may, as before, choose not to and, in
this case, the `gem5.runtime.get_runtime_isa` function is used.
The `gem5.runtime.get_runtime_isa` function is an intermediate step
which should be removed in future versions of gem5 (users should specify
precisely what ISA they want via configuration scripts). For this reason
it throws a warning when used and should not be heavily relied upon. It
is deprecated.
Change-Id: Ia76541bfa9a5a4b6b86401309281849b49dc724b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55423
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The amdgpu driver supports fetching instructions from pages which reside
in system memory rather than device memory. This changeset adds support
to do this by adding the system hub object added in a prior changeset to
the fetch unit and issues requests to the system hub if the system bit
in the memory page's PTE is set. Otherwise, the requestor ID is set to
be device memory and the request is routed through the Ruby network /
GPU caches to fetch the instructions.
Change-Id: Ib2fb47c589fdd5e544ab6493d7dbd8f2d9d7b0e8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57652
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add the constructors for the Vega TLB and TLB coalescers in the python
config. These need a pointer to the gpu device which is added as a
parameter. The last level TLB's page table walker is added as a dma
device to the system so that the port is connected to the GPU device
memory in the disjoint VIPER configuration file.
A portion of the the GPUFS system configuration file needs to be
shuffled around so that the shader CPU is created before the TLBs are
created so they can be connected to the shader's ports. This means the
real CPU init code needs to break once reaching the shader. The vendor
string must also be set after createThreads is called on real CPUs.
Change-Id: I36ed93db262b21427f3eaf4904a1c897a2894835
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57649
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The CorePairs in MOESI_AMD_Base round up the number of CPUs when
creating CPU sequencers. If the CPU count is an odd number, this was
causing the Disjoint_VIPER config to connect a sequencer that does not
exist. As a result the crossbar was waiting for a range change from the
sequencer but it never arrived, causing an assert.
This patch fixes this by conditionally connecting CPU sequencers to the
PIO port only if the ID is less than the number of CPUs.
Change-Id: I2280c0048492d43528429a947a726871f1c23ca7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57531
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Make the necessary changes to connect Vega pagetable walkers for
full-system mode. Previously the CP and HSA packet processor could only
read AQL packets from system/host memory using proxy port. This allows
for AQL to be read from device memory which is used for non-blit
kernels.
Change-Id: If28eb8be68173da03e15084765e77e92eda178e9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53077
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The disjoint VIPER configuration creates completely disconnected CPU and
GPU Ruby networks which can communicate only via the PCI bus. Either
garnet or simple network can be used. This copies most of the Ruby setup
from Ruby.py's create_system since creating disjoint networks is not
possible using Ruby.py.
Change-Id: Ibc23aa592f56554d088667d8e309ecdeb306da68
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53072
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Connect the --second-disk option in GPUFS. Typically this is used as a
benchmarks disk image. If the disk is unmounted at the time of
checkpoint, a new disk image can be mounted after restoring the
checkpoint for a simple way to add new benchmarks without recreating a
checkpoint.
Change-Id: I57b31bdf8ec628006d774feacff3fde6f533cd4b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53071
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
A valid CPU vendor string (i.e., not "M5 Simulator") needs to be passed
to CPUID in order for Linux to create the sysfs files needed for ROCm's
Thunk interface to initialize properly. If these are no created
hipDeviceProperties and other basic GPU code APIs will error out.
Change-Id: I6e3f459162e4673860a8f0a88473e38d5d7be237
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53070
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The PM4 packet processor is handling all non-HSA GPU packets such
as packets for (un)mapping HSA queues. This commit pulls many
Linux structs and defines out into their own files for clarity.
Finally, it implements the VMID related functions in AMDGPU device.
Change-Id: I5f0057209305404df58aff2c4cd07762d1a31690
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53068
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add the devices that have been added in previous changesets to the
config file. Forward MMIO writes to the appropriate device based
on the MMIO address. Connect doorbells and forward rings to the
appropriate device based on queue type.
Change-Id: I44110c9a24559936102a246c9658abb84a8ce07e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53065
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>