Commit Graph

15 Commits

Author SHA1 Message Date
Matthew Poremba
63cabf2848 dev-amdgpu: Handle GPU atomics on host memory addresses
It is possible to execute a GPU atomic instruction using a memory
address that is in the host memory space (e.g, HMM, __managed__,
hipHostMalloc'd address). Since these are in host memory they are passed
to the SystemHub DmaDevice. However, this currently executes as a write
packet without modifying data. This leads to hangs in applications that
use atomics for forward progress (e.g., HeteroSync).

It is not clear where these are handled on a real GPU, but they are
certianly not handled by the software stack nor driver, so they must be
handled in hardware and therefore implemented in gem5. Handling for
atomics in the SystemHub makes the most sense.

To make atomics work a few extra changes need to be made to the
SystemHub. (1) The atomic is implemented as a host memory read, followed
by calling the AtomicOpFunctor, followed by a write. This requires a
second event to handle read response, performing atomic, and issuing a
write. (2) Atomics must be serialized otherwise two atomics might return
the same value which is incorrect. This patch adds serialization logic
for all request types to the same address to handle this. (3) With the
added complexity of the SystemHub, a new debug flag explicitly for
SystemHub is added.

Testing done: The heterosync application with input "sleepMutex 10 16 4"
previously hung before this patch. It passes with the patch applied.
This application tests both (1) and (2) above, as it allocates locks
with hipHostMalloc and has multiple workgroups sending an atomic request
in the same Tick, verifying the serialization mechanism.

Change-Id: Ife84b30037d1447dd384340cfeb06fdfd472fff9
2023-09-20 13:52:25 -05:00
Matthew Poremba
316538bf8a dev-amdgpu: Enable more GPUs with device specific registers
Currently gem5 assumes the amdgpu device to be Vega10. In order to
support more devices we need to handle situations where different
registers and addresses have the same functionality but different
offsets on different devices.

This changeset adds an NBIO class to handle device discovery and driver
initialization related tasks, pulling them out of the AMDGPUDevice
class. The offsets used for MMIOs are reworked slightly to use offsets
rather than absolute addresses. This is because we cannot determine the
absolute address in the constructor since the BAR has not been assigned
by the OS yet.

Change-Id: I14b364374e086e185978334425a4e265cf2760d0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70041
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-04-28 00:48:35 +00:00
Matthew Poremba
a648be2338 dev-amdgpu: Add an SDMA data debug flag
This debug flag is used to print spammy SDMA DPRINTFs, such as an SDMA
copy printing the data of large transfers 8 bytes per line at a time. For
those prints, the SDMAEngine flag will now only print the first and last
qword of the transfer and the new SDMAData flag is needed for verbose
data printing. This makes the SDMAEngine flag still useful for verifying
copies in applications with predictable data such as square.

Additionally, the memory allocation/deallocation done solely for a print
statement is removed in favor of casting the data to the printed type.

Change-Id: I18c1918ef9085cca4570f79881ee63d510ccc32f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64452
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-10-13 20:17:00 +00:00
Gabe Black
e6c0ba97db scons: Put all config variables in an env['CONF'] sub-dict.
This makes what are configuration and what are internal SCons variables
explicit and separate, and makes it unnecessary to call out what
variables to export to C++.

These variables will also be plumbed into and out of kconfiglib in later
changes.

Change-Id: Iaf5e098d7404af06285c421dbdf8ef4171b3f001
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56892
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-28 20:31:21 +00:00
Matthew Poremba
1be246bbe3 dev-amdgpu: Add PM4PP, VMID, Linux definitions
The PM4 packet processor is handling all non-HSA GPU packets such
as packets for (un)mapping HSA queues. This commit pulls many
Linux structs and defines out into their own files for clarity.
Finally, it implements the VMID related functions in AMDGPU device.

Change-Id: I5f0057209305404df58aff2c4cd07762d1a31690
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53068
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-24 14:59:57 +00:00
Alexandru Dutu
f1772d3505 dev-amdgpu: Add SDMAEngine and GPU device methods
SDMAEngine handles copies to device memory. This commit
updates sdma_packets.hh style as well. Added several methods needed by
SDMAEngine to GPU device including GART table, various getters, and
aperture range checkers. Move the MMIO interface from GPUController to
SDMAEngine. Create an SDMA MMIO and commands header with only the macros
we use so that we don't need to check in multi-thousand line header
files from the linux kernel. Keep SOC15 IH client ID macros as that file
is small.

Change-Id: I986fede90cc1bc16ee56d4e8598cf9283bde034e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53064
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-24 14:59:57 +00:00
Matthew Poremba
9cbdf75295 dev-amdgpu: Add VM class for apertures, TranslationGens
Create a VM class to reduce clutter in the amdgpu_device.* files. This
new file is in charge of reading/writting MMIOs related to VM contexts
and apertures. It also provides ranges checks for various apertures and
breaks out the MMIO interface so that there are not overloaded macro
definitions in the device MMIO methods.

The new translation generator classes for the various apertures are also
added to this class.

Change-Id: Ic224c1aa485685685b1136a46eed50bcf99d2350
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53066
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-24 14:59:57 +00:00
Matthew Poremba
2390cd1143 dev-amdgpu: Add SystemHub for GPU load/store to host
In a dGPU configuration, vector and scalar loads/stores can either be
requests to device memory or host memory depending on if the system bit
is set in the PTE when the request's virtual address is translated. This
object is used to send/receive those requests to the host via DMA.

This object will be used in a later changeset by the compute unit and
fetch units to issue data and instruction loads from the GPU which
translate to physical addresses on the host/cpu memory.

Change-Id: I4537059f90ebc03f3b2e6b8b631b4c452841f83f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51851
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-22 17:35:59 +00:00
Matthew Poremba
42b56ceb7b dev-amdgpu: Add memory manager for GPU VRAM
The memory manager is responsible for reading and writes to VRAM memory
for direct requests that bypass GPU caches.

Change-Id: I4aa1e77737ce52f2f2c01929b58984126bdcb925
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51850
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-22 17:35:59 +00:00
Matthew Poremba
b7826f1329 dev-amdgpu: Add GPU interrupt handler object
Add device interrupt handler for amdgpu device. The interrupt handler is
primarily used to signal that fences in the kernel driver can be passed.

Change-Id: I574fbfdef6e3bae310ec7f86058811e1e4886df6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51849
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-22 17:35:59 +00:00
Matthew Poremba
9313294efe misc: Remove AMD license addition
Remove the line "For use for simulation and test purposes only" in files
were AMD is the only copyright holder listed in the header. This happens
to be the case for all files where this line exists, removing it
completely from gem5.

Change-Id: I623f266b002f564301b28774f49081099cfc60fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 04:00:56 +00:00
Gabe Black
1c233ee9d2 scons: Add sim_object and enums arguments to SimObject().
This will explicitly declare what SimObject and Enum types need to be set
up in C++, which will make importing all the SimObject modules during
the setup phase of SCons uneccessary.

Change-Id: Id2d7603daf33b236ceaa0789e2f089f589d34e62
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49406
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-08 08:01:23 +00:00
Gabe Black
73025695c7 scons: Use tags to gate ISA files and not env['TARGET_ISA'].
Change-Id: Ib81a4c570fbb050fa7d82919edacfed004c6800e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50336
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2021-10-19 20:41:03 +00:00
Matthew Poremba
7426a0da8e dev-amdgpu: Implement MMIO trace reader
Helper class to read Linux kernel MMIO trace from amdgpu modprobes. This
class is used rather than implementing MMIOs in code as it is easier to
update to newer kernel versions this way. It also helps with setting
values for registers which are not documented.

Based on https://gem5-review.googlesource.com/c/amd/gem5/+/23743

Change-Id: Ia9b85c269c98b6ae0d5bcfe89141a4c30ef2f914
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46160
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-06-11 17:10:32 +00:00
Matthew Poremba
0209e7dede dev-amdgpu: Add initial AMDGPU device
The initial device contains enough code for the gpufs configuration
scripts to register an amdgpu device that identifies as a Vega 10
(Frontier Edition) device when PCI devices are listed by Linux. It also
contains stubs necessary for adding the MMIO interface to handle driver
initialization.

Using the configuration Linux boots and the device is successfully seen
in lspci. The driver can also begin loading an successfully sends
initial MMIOs and attempts to read the ROM.

Change-Id: I7ad87026876f31f44668e700d5adb639c2c053c1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/44909
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-05-05 17:33:22 +00:00