Commit Graph

70 Commits

Author SHA1 Message Date
pre-commit-ci[bot]
54487d3bf6 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-10-09 14:04:56 +00:00
Matthew Poremba
f5858fe81f dev-amdgpu: Deprecate rom and mmio trace params (#1633)
The ROM field was originally intended as a future alternate way to load
VBIOS without the ROM being on the disk image. This code path is never
taken for the devices gem5 supports and there is no gem5 implementation.
Deprecate the rom_binary field for this reason.

Similarly, MMIO traces were only used for Vega10. Deprecate this as
Vega10 is now deprecated. The MMIO trace reader is kept as it may still
be useful in the future. It is still the primary way to handle devies
which have graphics capability. None of the devices supported by gem5
have graphics now that Vega10 is deprecated.
2024-10-07 07:12:07 -07:00
Matthew Poremba
24504c9a3e dev-amdgpu: Use GPU specific cache line size (#1621)
Invalidate requests align to system cache line size. This causes
problems if the GPU cache hierarchy's cache line size is different than
the system as the unlaigned requests never return, leading to deadlock
on deferred dispatch.

This commit uses the cache line size from the GPU memory manager and
makes the cache line size there non-optional.

Tested with multiple RubySystems where CPU side was 64B and GPU side was
128B cache lines.
2024-10-03 08:47:08 -07:00
Matthew Poremba
c8c75959ad configs: Deprecate Vega10 (#1619)
Vega10 is no longer officially supported by ROCm and ROCm is starting to
use some packet types not supported. These were originally kept to allow
users to use older disk images with newer gem5. Going forward the gem5
version and gem5-resources releases will be required to be the same to
prevent lingering old configs.

As a replacement for vega10*.py, mi300.py or mi200.py should be used.
HIP examples, cookbook, and rodinia configs can be replaced with the
standard flow of building / obtaining the GPU application and running
using mi300.py or mi200.py as they do not require any input options and
therefore do not require changes to the disk image.
2024-10-02 14:18:41 -07:00
Erin (Jianghua) Le
c10feed524 tests, configs, util, mem, python, systemc: Change base 10 units to base 2 (#1605)
This commit changes metric units (e.g. kB, MB, and GB) to binary units
(KiB, MiB, GiB) in various files. This PR covers files that were missed
by a previous PR that also made these changes.
2024-10-01 11:18:05 -07:00
Bobby R. Bruce
f2f86a3e42 stdlib, python: Add warning message and clarify binary vs metric units (#1479)
This PR changes memory and cache sizes in various parts of the gem5
codebase to use binary units (e.g. KiB) instead of metric units (e.g.
kB). This makes the codebase more consistent, as gem5 automatically
converts memory and cache sizes that are in metric units to binary
units.

This PR also adds a warning message to let users know when an
auto-conversion from base 10 to base 2 units occurs.

There were a few places in configs and in the comments of various files
where I didn't change the metric units, as I couldn't figure out where
the parameters with those units were being used.
2024-09-17 17:32:27 +00:00
Daniel Carvalho
51863d322f gpu-compute: Reuse RP list in GPU_VIPER (#1530)
It is safer to reuse the dynamic list than manually listing all possible
replacement policies.

---------

Signed-off-by: odanrc <odanrc@yahoo.com.br>
2024-09-09 09:18:01 -07:00
Marco Kurzynski
a8447b7fc0 arch-vega: Pass s_memtime through smem pipe (#1350)
The Vega ISA's s_memtime instruction is used to obtain a cycle value
from the GPU. Previously, this was implemented to obtain the cycle count
when the memtime instruction reached the execute stage of the GPU
pipeline. However, from microbenchmarking we have found that this under
reports the latency for memtime instructions relative to real hardware.
Thus, we changed its behavior to go through the scalar memory pipeline
and obtain a latency value from the the SQC (L1 I$). This mirrors the
suggestion of the AMD Vega ISA manual that s_memtime should be treated
like a s_load_dwordx2.

The default latency was set based on microbenchmarking.

Change-Id: I5e251dde28c06fe1c492aea4abf9f34f05784420
2024-08-26 19:47:04 -07:00
Erin Le
e1db67c4bd configs, dev, learning-gem5, python, tests: more clarification
This commit contains the rest of the base 2 vs base 10 cache/memory
size clarifications. It also changes the warning message to use
warn(). With these changes, the warning message should now no
longer show up during a fresh compilation of gem5.

Change-Id: Ia63f841bdf045b76473437f41548fab27dc19631
2024-08-23 18:02:42 -07:00
Matthew Poremba
ddc9a18536 configs: GPUFS: Disable KVM perf counters by default (#1391)
This is on by default in gem5 (see src/cpu/kvm/BaseKvmCPU.py), however
the perf counters only measure host instruction counters and GPUFS is
not concerned about accuracy of KVM CPU stats. There are also a larger
set of users who have access to KVM, but do not have the paranoid level
low enough to attach performance counters.

Therefore, make the performance counters OFF by default. They can still
be enabled, but this will allow for a larger set of users to follow the
upcoming GPUFS documentation without needing to read through a
troubleshooting section after seeing a gem5 error about the KVM paranoid
level.

Change-Id: I6b465559edf3ce17e7117ada049c60bd39aecd83
2024-07-29 12:26:10 -07:00
Matthew Poremba
b3d9dc42d4 configs: Add replacement policy options for GPUFS (#1230)
GPU_VIPER.py was modified to use these options but they did not exist,
breaking GPUFS. This commit adds them to fix the issue.

Change-Id: I0095f400ea606c4e8d91a41870ef208465cef803
2024-06-13 11:23:50 -07:00
Matthew Poremba
6164835230 configs: GPUFS: MI300X
Add a config capable of simulating MI300X ISA (gfx942). This is similar
to the mi200.py config and uses the same scripts followed by some
tuneable parameters. This config optionally lets the user call the
runMI300GPU function with gem5 resources. This allows for something like
the following before a VIPER stdlib python is available:

```
import mi300
from gem5.resources.resource import obtain_resource

disk = obtain_resource("x86-gpu-fs-img")
kernel = obtain_resource("x86-linux-kernel-5.4.0-105-generic")
app = obtain_resource("square-gpu-test")

mi300.runMI300GPUFS("X86KvmCPU", disk, kernel, app)
```

Tested cold boot config, checkpoint create and restore, and using gem5
resources.

Change-Id: I50a13d7a3d207786b779bf7fd47a5645256b1e6a
2024-05-16 09:23:03 -07:00
Matthew Poremba
8be5ce6fc9 dev-amdgpu,configs,gpu-compute: Add gfx942 version
This is the version for MI300. For the most part, it is the same as
MI200 with the exception of architected flat scratch (not yet
implemented in gem5) and therefore a new version enum is required.

Change-Id: Id18cd7b57c4eebd467c010a3f61e3117beb8d58a
2024-05-15 12:08:41 -07:00
Matthew Poremba
386fb3d1cc configs: Fix HSA packer processor address
The address has one too many zeros and is therefore placed in a memory
region usually used for system memory. As a result this causes failure
when trying to run a simulation with a huge amount of memory.

Change the address to be within the C000'0000h - FFFF'FFFFh X86 I/O hole
as was intended.

Change-Id: I5d03ac19ea3b2c01a8c431073c12fa1868b3df24
2024-05-03 14:29:30 -07:00
Matthew Poremba
c54039da5b configs: GPUFS: Turn off SSE4 and fancy XSAVEs (#1041)
A user reported a bug with the SSE4.1 version of memcmp in libc. When
enabled the simulated program crashes with SIGILL. After attempting all
fixes recommended by Intel SDM and still not working, turning the bit
off instead.

Similar, the default XSAVE functionality is not completely implemented
for AVX and newer ISA extensions. Therefore, there is not much point to
claiming to support the more advanced versions of XSAVE (XSAVEOPT,
XSAVEC, XSAVES, and XGETBV with ECX=1).

Note that none of these bits are enabled for non-GPU full system
simulations (see src/arch/x86/X86ISA.py). This only impacts GPUFS
simulations.

Change-Id: I8eb7bf0f2a0a29226095e7889fec9c1e8a65f88f
2024-04-20 11:04:59 -07:00
Matthew Poremba
823b5a6eb8 dev-amdgpu: Support multiple CPs and MMIO AddrRanges
Currently gem5 assumes that there is only one command processor (CP)
which contains the PM4 packet processor. Some GPU devices have multiple
CPs which the driver tests individually during POST if they are used or
not. Therefore, these additional CPs need to be supported.

This commit allows for multiple PM4 packet processors which represent
multiple CPs. Each of these processors will have its own independent
MMIO address range. To more easily support ranges, the MMIO addresses
now use AddrRange to index a PM4 packet processor instead of the
hard-coded constexpr MMIO start and size pairs.

By default only one PM4 packet processor is created, meaning the
functionality of the simulation is unchanged for devices currently
supported in gem5.

Change-Id: I977f4fd3a169ef4a78671a4fb58c8ea0e19bf52c
2024-03-21 10:13:55 -05:00
Matthew Poremba
6bbde8fbb8 dev-amdgpu: Rework handling of unknown registers
The top level AMDGPUDevice currently reads/writes all unknown registers
to/from a map containing the previously written value. This is intended
as a way to handle registers that are not part of the model but the
driver requires for functionality. Since this is at the top level, it
can mask changes to register values which do not go through the same
interface. For example, reading an MMIO, changing via PM4 queue, and
reading again returns the stale cached value.

This commit removes the usage of the regs map in AMDGPUDevice,
implements some important MMIOs that were previously handled by it, and
moves the unknown register handling to the NBIO aperture only. To reduce
the number of additional MMIOs to implement, the display manager in
vega10 is now disabled.

Change-Id: Iff0a599dd82d663c7e710b79c6ef6d0ad1fc44a2
2024-03-21 10:10:01 -05:00
Michael Boyer
acd9d3ff94 gpu-compute: Add support for skipping GPU kernels (#940)
gpu-compute: Add support for skipping GPU kernels

This commit adds two new command-line options:

--skip-until-gpu-kernel N
Skips (non-blit) GPU kernels until the target kernel is reached.
Execution continues normally from there. Blit kernels are not skipped
because they are responsible for copying the kernel code and metadata
for the non-blit kernels. Note that skipping kernels can impact
correctness; this feature is only useful if the kernel of interest has
no data-dependent behavior, or its data-dependent behavior is not based
on data generated by the skipped kernels.

--exit-after-gpu-kernel N
Ends the simulation after completing (non-blit) GPU kernel N.

This commit also renames two existing command-line options:
--debug-at-gpu-kernel -> --debug-at-gpu-task
--exit-at-gpu-kernel  -> --exit-at-gpu-task

These were renamed because they count GPU tasks, which include both
kernels launched by the application as well as blit kernels.

Change-Id: If250b3fd2db05c1222e369e9e3f779c4422074bc
2024-03-21 07:46:27 -07:00
Michael Boyer
ba2f5615ba gpu-compute: Support cache line sizes >64B in GPUFS (#939)
This change fixes two issues:

1) The --cacheline_size option was setting the system cache line size
but not the Ruby cache line size, and the mismatch was causing assertion
failures.

2) The submitDispatchPkt() function accesses the kernel object in
chunks, with the chunk size equal to the cache line size. For cache line
sizes >64B (e.g. 128B), the kernel object is not guaranteed to be
aligned to a cache line and it was possible for a chunk to be partially
contained in two separate device memories, causing the memory access to
fail.

Change-Id: I8e45146901943e9c2750d32162c0f35c851e09e1

Co-authored-by: Michael Boyer <Michael.Boyer@amd.com>
2024-03-20 11:09:25 -07:00
Vishnu Ramadas
7dae25e881 configs, gpu-compute: Add parameter in shader for CUs per SQC
Change-Id: If0ae0db1b6ccc08a92f169a271b137f69f410f7b
2024-02-09 12:17:24 -06:00
Matthew Poremba
6a9e80c54c gpu-compute: Support for MI200 GPU model (#733) 2024-01-15 08:18:34 -08:00
Matt Sinclair
dc85d1492c gpu-compute: Added register file cache support (#730)
The RFC is defaulted to a size of 0 which removes it completely. To use
the RFC set the --register-file-cache-size to a non-zero multiple of
two. In addition, rfc_pipe_length may be altered to increase or decrease
RFC latency benefit.
2024-01-05 12:57:06 -06:00
KaiBatley
359ac63280 gpu-compute: Added register file cache support
The RFC is defaulted to a size of 0 which removes it completely. To use
the RFC set the --register-file-cache-size to a non-zero multiple of
two. In addition, rfc_pipe_length may be altrered to increase or
decrease RFC latency benefit.

Change-Id: I6f5bf5b750eb64155fbc8c8343e9feadce5c9f79
2024-01-04 22:43:05 -06:00
Matthew Poremba
a40f8f0efa configs: Add MI200 script
This is the MI200 equivalent of configs/example/gpufs/vega10.py.

Change-Id: Ib9761caa4326abe6b90099e6a77111b2acce0f76
2024-01-03 15:41:06 -06:00
Bobby R. Bruce
d11c40dcac misc: Run pre-commit run --all-files
This ensures `isort` is applied to all files in the repo.

Change-Id: Ib7ced1c924ef1639542bf0d1a01c5737f6ba43e9
2023-11-29 22:06:41 -08:00
Matthew Poremba
f7ad8fe435 configs: GPUFS option to disable KVM perf counters (#433)
Add a --no-kvm-perf option to disable KVM perf counters for GPUFS
scripts. This is useful for users who have KVM enabled but configured
with more restrictive settings, which seems to be the default in newer
Linux distros.

Change-Id: I7508113d0f7c74deb21ea7b2770522885a0ec822
2023-10-11 14:20:27 -07:00
Bobby R. Bruce
298119e402 misc,python: Run pre-commit run --all-files
Applies the `pyupgrade` hook to all files in the repo.

Change-Id: I9879c634a65c5fcaa9567c63bc5977ff97d5d3bf
2023-10-10 21:47:07 -07:00
Bobby R. Bruce
ddf6cb88e4 misc: Run pre-commit run --all-files
This is reflect the updates made to black when running `pre-commit
autoupdate`.

Change-Id: Ifb7fea117f354c7f02f26926a5afdf7d67bc5919
2023-10-10 14:01:58 -07:00
Vishnu Ramadas
d3637a489d configs: Add option to disable AVX in GPUFS
GPUFS+KVM simulations automatically enable AVX. This commit adds a
command line option to disable AVX if its not needed for a GPUFS
simulation.

Change-Id: Ic22592767dbdca86f3718eca9c837a8e29b6b781
2023-10-03 12:10:42 -05:00
Matthew Poremba
addba01d29 configs,dev-amdgpu: Add PCI express capability info
The ROCm stack requires PCI express atomics. Currently the first PCI
CapabilityPtr does not point to anything, which signals to the OS
(Linux) that this is an early generation PCI device. As PCI express
atomics were introduced later, the CapabilityPtr needs to point to at
least a PCI express capability structure. This capability is defined as
0x10 in Linux. We additionally set the PCI atomic based bits and
implement device specific PCI configuration space reads and writes to
the amdgpu device.

With this commit, the output of simulation when loading the amdgpu
driver no longer outputs "PCIE atomics not supported". Further, an
application which uses PCIe atomics (PyTorch with a reduce_sum kernel)
now makes further progress.

Change-Id: I5e3866979659a2657f558941106ef65c2f4d9988
2023-08-24 09:10:35 -05:00
Matthew Poremba
f8490e4681 configs: Only require MMIO trace for Vega10
The MMIO trace contains register values for parts of the GPU that are
not modeled in gem5, such as registers related to the graphics core.
Since MI100 and MI200 do not have anything that is not modeled, the
MMIO trace is not needed, therefore it does not need to be used or
checked and the command line option goes away entirely for MI100/200.

Change-Id: I23839db32b1b072bd44c8c977899a99347fc9687
2023-07-30 13:17:05 -05:00
Matthew Poremba
9acfc5a751 configs: Enable AVX2 for GPUFS+KVM
AVX is a requirement for some ROCm libraries, such as rocBLAS, which are
themselves requirements for libraries higher up the stack like PyTorch.
This patch sets the necessary CPUID bits in the GPUFS config to enable
AVX, AVX2, and various SSE features so that applications using these
libraries do not cause an illegal instruction trap.

Change-Id: Id22f543fb2a06b268271725a54075ee6a9a1f041
2023-07-28 11:34:04 -05:00
Matthew Poremba
3756af8ed9 gpu-compute,configs: Make sim exits conditional
The unconditional exit event when a kernel completes that was added in
c644eae2dd is causing scripts that do not
ignore unknown exit events to end simulation prematurely. One such
script is the apu_se.py script used in SE mode GPU simulation. Make this
exit conditional to the parameter being set to a valid value to avoid
this problem.

Change-Id: I1d2c082291fdbcf27390913ffdffb963ec8080dd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72098
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-07-07 14:12:54 +00:00
Matthew Poremba
05ffa35426 configs: Create base GPUFS vega config and atomic config
Move the Vega KVM script code to a common base file and add scripts for
KVM and atomic. Since atomic is now possible in GPUFS this gives a way
to run it without editing the current scripts.

Change-Id: I094bc4d4df856563535c28c1f6d6cc045d6734cd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71939
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2023-06-30 19:55:18 +00:00
Matthew Poremba
ce715601ad configs: Add GPUFS --root-partition option
Different GPUFS disk images have different root partitions that Linux
needs to boot from. In particular, Ubuntu's new installer has a GRUB
partition that cannot seem to be removed. Adding this as an option
prevents needing to edit a config script to change one character each
time a different disk image is used.

Change-Id: Iac2996ea096047281891a70aa2901401ac9746fc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71918
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2023-06-29 23:30:16 +00:00
Matthew Poremba
90067c6ce4 configs: GPUFS: Only use parallel eventqs for KVM
This is turned on by default with multiple CPUs in the GPUFS configs,
which causes other CPU types (e.g., AtomicSimpleCPU) to assert. Only
enable parallel event queues for KVM CPUs to avoid this issue.

Change-Id: Ic8235437caf0150560e2b360a4544d82dfc26c36
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71419
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2023-06-08 22:04:02 +00:00
Matthew Poremba
c644eae2dd configs,gpu-compute: Kernel dispatch-based exit events
Add two kernel dispatch-based exit events that are useful for limiting
the simulation and enabling debug flags at specific GPU kernels. Since
the KVM CPU typically used with GPUFS is not deterministic, this help
with enabling debug flags when the Tick number may vary. The exit at GPU
kernel option can also limit simulation by only simulating a few hundred
kernels, for example, and exit at a determined point.

Change-Id: I81bae92a80c25fc38c41e999aa662e1417b7a20d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71418
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2023-06-08 22:03:47 +00:00
Matthew Poremba
6b4a1020be configs,dev-amdgpu: GPUFS MI200/gfx90a support
Add support for MI200-like device. This includes adding PCI IDs and new
MMIOs for the device, a different MAP_PROCESS packet, and a different
calculation for the number of VGPRs.

Change-Id: I0fb7b3ad928826beaa5386d52a94ba504369cb0d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70317
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-05-25 19:14:32 +00:00
Vishnu Ramadas
8659b9e1af dev-amdgpu: Update vega10_kvm.py to add checkpointing instruction
The vega10_kvm.py script configures a system to run in GPUFS mode. To
create a checkpoint, an m5 checkpoint instruction has to be added to the
script manually. This commit automatically adds the instruction if the
checkpoint-dir flag is set

Change-Id: I552fae6e98f6ec33a70a5b384242e87edb0e9526
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70078
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2023-04-28 03:19:20 +00:00
Matthew Poremba
8b91ac6f8d dev-amdgpu: Refactor MMIO interface for SDMA engines
Currently the amdgpu simulated device is assumed to be a Vega10. As a
result there are a few things that are hardcoded. One of those is the
number of SDMAs. In order to add a newer device, such as MI100+, we need
to enable a flexible number of SDMAs.

In order to support a variable number of SDMAs and with the MMIO offsets
of each device being potentially different, the MMIO interface for SDMAs
is changed to use an SDMA class method dispatch table with forwards a
32-bit value from the MMIO packet to the MMIO functions in SDMA of the
format `void method(uint32_t)`. Several changes are made to enable this:

 - Allow the SDMA to have a variable MMIO base and size. These are
   configured in python.
 - An SDMA class method dispatch table which contains the MMIO offset
   relative to the SDMA's MMIO base address.
 - An updated writeMMIO method to iterate over the SDMA MMIO address
   ranges and call the appropriate SDMA MMIO method which matches the
   MMIO offset.
 - Moved all SDMA related MMIO data bit twiddling, masking, etc. into
   the MMIO methods themselves instead of in the writeMMIO method in
   SDMAEngine.

Change-Id: Ifce626f84d52f9e27e4438ba4e685e30dbf06dbc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70040
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2023-04-28 00:48:35 +00:00
Matthew Poremba
9c3107c762 dev-amdgpu,configs: Add human readable names for different GPUs
Add a human readable string for GPU device names rather than using the
device ID in the code. This is intended to make code more readable.

Change-Id: Id3ea74ca37422b1f4a0f09e5a9522d37b5998c1a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70038
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2023-04-28 00:48:35 +00:00
Matthew Poremba
c2c5cd1048 configs: Allow other CPU types in GPUFS
Previously the CPU type and memory modes were hardcoded for KVM, because
there was a deadlock bug. After some recent testing, this deadlock bug
no longer exists with the simple CPU models. Thus, changing the configs
to allow for other CPU models as a first step toward lifting the KVM
requirement from GPUFS.

Change-Id: Ib616c3ef60f173871421b55a8bb73b25ce2990b5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69979
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2023-04-22 00:48:28 +00:00
Matthew Poremba
70ef9b219c configs: Add simple check for valid GPU MMIO trace
This file is a required input to the simulator for GPUFS. There seems to
be confusion from several users who are not providing this input. This
usually results in the amdgpu driver failing to load, leading to the
application under test exiting along with it.

This changeset adds a simple md5 hashsum check to compare against the
known good MMIO trace located in the gem5-resources repository.

Change-Id: I59819fc795a6bc4bc6badbd4d120db1246498987
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69978
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2023-04-22 00:48:28 +00:00
Matthew Poremba
2f3f73a098 configs: Use higher dmesg level for GPUFS
The dmesg level is currently set to 3 which will not display errors if
the amdgpu driver fails to load. Changing to level 8 will show errors in
the gem5 terminal and is not too spammy. This will help GPUFS developers
with bug reports since we would actually be able to observe an error.
Currently if the driver fails to load, there is no way to detect it and
applications will attempt to run, usually failing on getting device
properties.

Change-Id: I56b9581c1a12a8ce329066d18d6a072d006c096d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69977
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2023-04-22 00:48:28 +00:00
Giacomo Travaglini
e73655d038 misc: Use python f-strings for string formatting
This patch has been generated by applying flynt to the
gem5 repo (ext has been excluded)

JIRA: https://gem5.atlassian.net/browse/GEM5-831

Change-Id: I0935db6223d5426b99515959bde78e374cbadb04
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68957
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-03-16 09:05:29 +00:00
Hoa Nguyen
eac06ad681 python: Fix multiline quotes in a single line
An example case,
```python
mem_side_port = RequestPort(
    "This port sends requests and " "receives responses"
)
```

This is the residue of running the python formatter.
This is done by finding all tokens matching the regex `"\s"(?![.;"])`
and manually replacing them by empty strings.

Change-Id: Icf223bbe889e5fa5749a81ef77aa6e721f38b549
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66111
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-11-29 23:44:38 +00:00
Matthew Poremba
27da9b3576 configs: GPUFS: use multiple event queues for >1 CPU
The KVM CPU hangs if there are not multiple event queues when more than
one CPU is created. Since GPUFS primarily relies on the KVM CPU, support
for multiple event queues is needed. Some GPU libraries, such as AMD
Research's ATMI library, assume more than one CPU.

This changeset adds support for multiple CPUs and was tested for up to
four CPUs.

Change-Id: Ia354e02209d0fa18195f3ad44f4fb1d58e93b5ca
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65131
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-11-01 15:34:29 +00:00
Matthew Poremba
f20d12656a configs: Stop disabling SDMA in GPUFS config
Support has been added for SDMA RLC queues which are used for host to
device and device to host "memcpy" calls. Previously the SDMA engine was
disabled which caused GPU BLIT kernels to be called. This removes the
environment variable disabling SDMAs which has two main benefits:

 - It will be much easier to debug host/device transfer by using SDMA
   debug flag.
 - Simulation time is improved since we no longer need detailed GPU
   simulation to copy data and instead are doing a simple large DMA

Change-Id: I7524245731d301b5c26394318f2156ed6b4c983a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62717
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 04:13:49 +00:00
Bobby R. Bruce
787204c92d python: Apply Black formatter to Python files
The command executed was `black src configs tests util`.

Change-Id: I8dfaa6ab04658fea37618127d6ac19270028d771
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47024
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-08-03 09:10:41 +00:00
Matthew Poremba
923d6c4081 configs: Always use busy wait for GPUFS
The environment variable HSA_ENABLE_INTERRUPT controls if Interrupt or
busy wait signals are used in the ROCm runtime. Interrupts are not being
sent in gem5 causing simulations to hang indefinitely in certain
situations. To fix this, always disable interrupts to fall back to busy
wait signals. Using interrupts is an old and simple optimization to not
waste CPU cycles, but from the perspective of simulation this is not
important. Disabling interrupt-based HSA signals therefore increases the
number of applications working within gem5.

Change-Id: I1ae21d7ee01548a4d00a8972642079b90278f9a2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61652
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-07-28 14:10:33 +00:00