This `mkdir` is problematic as it doesn't create the directory
recursively. This casues errors if `dir` is `X/Y/Z` and both `Y` and `Z`
has not been created. An error will be returned (`No such file or
directory`).
This issue was fixed with: https://github.com/gem5/gem5/pull/263. The
checkpointing code already recursively creates directories as needed.
Ergo was can remove this `mkdir` statement.
This `mkdir` is problematic as it doesn't create the directory
recursively. This casues errors if `dir` is `X/Y/Z` and both `Y` and `Z`
has not been created. An error will be returned (`No such file or
directory`).
This issue was fixed with: https://github.com/gem5/gem5/pull/263. The
checkpointing code already recursively creates directories as needed.
Ergo was can remove this `mkdir` statement.
Change-Id: Ibae38267c8ee1eba76d7834367aa1c54013365bc
The new script will automatically use the newly
defined O3_ARM_v7a_3_Etrace CPU to run a simple SE simulation while
generating elastic trace files.
The script is based on starter_se.py, but contains the following
limitations:
1) No L2 cache as it might affect computational delay calculations
2) Supporting SimpleMemory only with minimal memory latency
There restrictions were imported by the existing elastic trace
generation logic in the common library (collected by grepping
elastic_trace_en) [1][2][3]
Example usage:
build/ARM/gem5.opt configs/example/arm/etrace_se.py \
--inst-trace-file [INSTRUCTION TRACE] \
--data-trace-file [DATA TRACE] \
[WORKLOAD]
[1]: https://github.com/gem5/gem5/blob/stable/\
configs/common/MemConfig.py#L191
[2]: https://github.com/gem5/gem5/blob/stable/\
configs/common/MemConfig.py#L232
[3]: https://github.com/gem5/gem5/blob/stable/\
configs/common/CacheConfig.py#L130
Change-Id: I021fc84fa101113c5c2f0737d50a930bb4750f76
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
We define a new parent (ClusterSystem) to model a system
with one or more cpu clusters within it.
The idea is to make this new base class reusable by SE
systems/scripts as well (like starter_se.py)
Change-Id: I1398d773813db565f6ad5ce62cb4c022cb12a55a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Added a new feature to CHI protocol (in collaboration with @tiagormk).
Here is the Jira Ticket
[https://gem5.atlassian.net/browse/GEM5-1326](https://gem5.atlassian.net/browse/GEM5-1326
). As described in CHI specs, far atomic transactions enable remote
execution of Atomic Memory Operations. This pull request incorporates
several changes:
* Fix Arm ISA definition of Swap instructions. These instructions should
return an operand, so their ISA definition should be Return Operation.
* Enable AMOs in Ruby Mem Test to verify that AMOs work
* Enable near and far AMO in the Cache Controler of CHI
Three configuration parameters have been used to tune this behavior:
* policy_type: sets the atomic policy to one of the described in [our
paper](https://dl.acm.org/doi/10.1145/3579371.3589065)
* atomic_op_latency: simulates the AMO ALU operation latency
* comp_anr: configures the Atomic No return transaction to split
CompDBIDResp into two different messages DBIDResp and Comp
This patch introduces a new category called "suite".
A suite is a collection of workloads.
Each workload in a SuiteResource has a tag that can be narrowed down
through the function with_input_group.
Also, the set of input groups can be seen through list_input_groups.
Added unit tests to test all functions of SuiteResource class.
Change-Id: Iddda5c898b32b7cd874987dbe694ac09aa231f08
Co-authored-by: Kunal Pai <kunpai@ucdavis.edu>
As we no longer inherit from the BaseCPU, we can't really use
CPU generation methods (like Simulation.setCPUClass) and
cache generation ones (like CacheConfig.config_cache).
This is good news as it allows us to simplify the etrace
script and to remove a dependency with the deprecated-to-be
common library.
Change-Id: Ic89ce2b9d713ee6f6e11bf20c5065426298b3da2
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The ROCm stack requires PCI express atomics. Currently the first PCI
CapabilityPtr does not point to anything, which signals to the OS
(Linux) that this is an early generation PCI device. As PCI express
atomics were introduced later, the CapabilityPtr needs to point to at
least a PCI express capability structure. This capability is defined as
0x10 in Linux. We additionally set the PCI atomic based bits and
implement device specific PCI configuration space reads and writes to
the amdgpu device.
With this commit, the output of simulation when loading the amdgpu
driver no longer outputs "PCIE atomics not supported". Further, an
application which uses PCIe atomics (PyTorch with a reduce_sum kernel)
now makes further progress.
Change-Id: I5e3866979659a2657f558941106ef65c2f4d9988
- Added deprecated warnings to Workload and Abstract workload.
- Added comments to the classes changed.
Change-Id: I671daacf5ef455ea65103bd96aa442486142a486
- Added WrokloadResource in resource.py.
- depricated Workload and CustomWorkload.
- changed iscvmatched-fs.py with obtain resource for workload to test.
Change-Id: I2267c44249b96ca37da3890bf630e0d15c7335ed
Note: change example files back to original
The MMIO trace contains register values for parts of the GPU that are
not modeled in gem5, such as registers related to the graphics core.
Since MI100 and MI200 do not have anything that is not modeled, the
MMIO trace is not needed, therefore it does not need to be used or
checked and the command line option goes away entirely for MI100/200.
Change-Id: I23839db32b1b072bd44c8c977899a99347fc9687
AVX is a requirement for some ROCm libraries, such as rocBLAS, which are
themselves requirements for libraries higher up the stack like PyTorch.
This patch sets the necessary CPUID bits in the GPUFS config to enable
AVX, AVX2, and various SSE features so that applications using these
libraries do not cause an illegal instruction trap.
Change-Id: Id22f543fb2a06b268271725a54075ee6a9a1f041
* stdlib,configs,tests: Remove `Resource` class use
This class is deprecated, but was still used in various example
configuration scriots and tests. This patch replaces it with the
`obtain_resource` function.
Change-Id: I0c89bf17783ccaaafc18072aaeefb5d1e207bc55
* configs: Remove `CustomDiskImageResource` use
The class is deprecated but was still used in the SPEC example scripts.
This patch replaces it with the `DiskImageResource` class.
Change-Id: Ie0697fe59a3d737b05eb45ff3bc964f42b0387e0
* configs,tests: Remove `CustomResource` use
This class is deprecated but was still used in example scripts and
mentioned, incorrectly, in comments in the pyunit tests. This patch
removes these.
Change-Id: Icb6d02f47a5b72cd58551e5dcd59cc72d6a91a01
* stdlib: Remove '\' in Workload docstring example
This example shows how to use the Workload. The backslash is not correct Python and would fail if used in this way.
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
---------
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
* cpu-kvm: Add a variable signifying whether we are using perf
Change-Id: Iaa081e364f85c863f781723b5524d267724ed0e4
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
* cpu-kvm: Making it clear the functionalities are specific to KVM
Change-Id: I982426f294d90655227dc15337bf73c42a260ded
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
* cpu-kvm: Make perf optional
Change-Id: I8973c2a96575383976cea7ca3fda478f83e95c3f
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
* configs: Add an example config of using KVM without perf
Change-Id: Ic69fa7dac4f1a2c8fe23712b0fa77b5b22c5f2df
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
* Apply suggestions from code review
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
* misc: Add an example to the panic
Change-Id: Ic1fdfb955e5d8b9ad1d4f0a2bf30fa8050deba70
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
* misc: Add warning of not using perf when using KVM CPU
Change-Id: I96c0832fb48c63a79773665ca6228da778ef0497
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
* misc: Fix stuff
Change-Id: Ib407ae7407955b695f0e0f2718324f41bb0d768f
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
* misc: style fix
Change-Id: I7275942e43f46140fdd52c975f76abb3c81b8b0a
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
---------
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
The unconditional exit event when a kernel completes that was added in
c644eae2dd is causing scripts that do not
ignore unknown exit events to end simulation prematurely. One such
script is the apu_se.py script used in SE mode GPU simulation. Make this
exit conditional to the parameter being set to a valid value to avoid
this problem.
Change-Id: I1d2c082291fdbcf27390913ffdffb963ec8080dd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72098
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Different GPUFS disk images have different root partitions that Linux
needs to boot from. In particular, Ubuntu's new installer has a GRUB
partition that cannot seem to be removed. Adding this as an option
prevents needing to edit a config script to change one character each
time a different disk image is used.
Change-Id: Iac2996ea096047281891a70aa2901401ac9746fc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71918
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Add two kernel dispatch-based exit events that are useful for limiting
the simulation and enabling debug flags at specific GPU kernels. Since
the KVM CPU typically used with GPUFS is not deterministic, this help
with enabling debug flags when the Tick number may vary. The exit at GPU
kernel option can also limit simulation by only simulating a few hundred
kernels, for example, and exit at a determined point.
Change-Id: I81bae92a80c25fc38c41e999aa662e1417b7a20d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71418
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Add `--pmu-dump-stats-on <event>` and `--pmu-reset-stats-on <event>`
options to the Arm `baremetal.py` config to optionally dump and/or
reset stats on various PMU events.
These options allow the user to specify which PMU events should cause
the dumping or resetting of gem5 stats. The available `<event>`s are
PMU `enable`, `disable`, `reset`, and `interrupt`. Both these CLI
options may be specified multiple times to enable more than one event
to cause a stats dump/reset if desired. Stats are dumped before they
are reset.
These options are useful for sampled simulation workloads (e.g.
SimPoints) which are controlled by the PMU.
Change-Id: Ie2ffe11c6aa1f3a57a58425ccec3681c780065c8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69959
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Add an option to add a PMU to the CPUs in `starter_fs.py` and
`baremetal.py`. By default PMUs will not be added.
Also adds an `--arm-ppi-number` option. Each PMU will be connected to
its core using the specified PPI.
Change-Id: I9cfb5781f211338919550f2320a7133d88801f6a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69957
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Currently the amdgpu simulated device is assumed to be a Vega10. As a
result there are a few things that are hardcoded. One of those is the
number of SDMAs. In order to add a newer device, such as MI100+, we need
to enable a flexible number of SDMAs.
In order to support a variable number of SDMAs and with the MMIO offsets
of each device being potentially different, the MMIO interface for SDMAs
is changed to use an SDMA class method dispatch table with forwards a
32-bit value from the MMIO packet to the MMIO functions in SDMA of the
format `void method(uint32_t)`. Several changes are made to enable this:
- Allow the SDMA to have a variable MMIO base and size. These are
configured in python.
- An SDMA class method dispatch table which contains the MMIO offset
relative to the SDMA's MMIO base address.
- An updated writeMMIO method to iterate over the SDMA MMIO address
ranges and call the appropriate SDMA MMIO method which matches the
MMIO offset.
- Moved all SDMA related MMIO data bit twiddling, masking, etc. into
the MMIO methods themselves instead of in the writeMMIO method in
SDMAEngine.
Change-Id: Ifce626f84d52f9e27e4438ba4e685e30dbf06dbc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70040
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Previously the CPU type and memory modes were hardcoded for KVM, because
there was a deadlock bug. After some recent testing, this deadlock bug
no longer exists with the simple CPU models. Thus, changing the configs
to allow for other CPU models as a first step toward lifting the KVM
requirement from GPUFS.
Change-Id: Ib616c3ef60f173871421b55a8bb73b25ce2990b5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69979
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
This file is a required input to the simulator for GPUFS. There seems to
be confusion from several users who are not providing this input. This
usually results in the amdgpu driver failing to load, leading to the
application under test exiting along with it.
This changeset adds a simple md5 hashsum check to compare against the
known good MMIO trace located in the gem5-resources repository.
Change-Id: I59819fc795a6bc4bc6badbd4d120db1246498987
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69978
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
The dmesg level is currently set to 3 which will not display errors if
the amdgpu driver fails to load. Changing to level 8 will show errors in
the gem5 terminal and is not too spammy. This will help GPUFS developers
with bug reports since we would actually be able to observe an error.
Currently if the driver fails to load, there is no way to detect it and
applications will attempt to run, usually failing on getting device
properties.
Change-Id: I56b9581c1a12a8ce329066d18d6a072d006c096d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69977
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Removed the calls to `sys.exit()` from the Arm simple configs. These
calls terminate gem5's embedded Python interpreter and gem5 at the end
of the config script, preventing gem5 from dropping into the
interactive IPython shell when the `--interactive` option has been
specified.
Change-Id: I0c350b0d107f297691255361d25c566c889f9469
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69687
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Only the GICv3 model has a `gicv4` parameter, causing the current
`baremetal.py` config to throw an exception when used with the
VExpress_GEM5_V1 platform containing a GICv2.
This patch checks for the existence of the `gicv4` parameter, allowing
all VExpress platforms to be used.
Change-Id: I72667a9caee64fa497bda516217cd424050eb242
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69685
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
gem5 supports Tarmac trace generation for Arm simulations, but there
are no examples of how to use this feature.
This patch adds a `--tarmac-gen` option to three of the simple Arm
configs. Tarmac generation is useful for out-of-the-box users, and
this patch also provides an example of how to use the Tarmac
generation feature.
Change-Id: I0d3c523b5c0bb6d94de93bc502e4451622fb635d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69684
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>