Connect the --second-disk option in GPUFS. Typically this is used as a
benchmarks disk image. If the disk is unmounted at the time of
checkpoint, a new disk image can be mounted after restoring the
checkpoint for a simple way to add new benchmarks without recreating a
checkpoint.
Change-Id: I57b31bdf8ec628006d774feacff3fde6f533cd4b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53071
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
A valid CPU vendor string (i.e., not "M5 Simulator") needs to be passed
to CPUID in order for Linux to create the sysfs files needed for ROCm's
Thunk interface to initialize properly. If these are no created
hipDeviceProperties and other basic GPU code APIs will error out.
Change-Id: I6e3f459162e4673860a8f0a88473e38d5d7be237
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53070
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The PM4 packet processor is handling all non-HSA GPU packets such
as packets for (un)mapping HSA queues. This commit pulls many
Linux structs and defines out into their own files for clarity.
Finally, it implements the VMID related functions in AMDGPU device.
Change-Id: I5f0057209305404df58aff2c4cd07762d1a31690
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53068
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add the devices that have been added in previous changesets to the
config file. Forward MMIO writes to the appropriate device based
on the MMIO address. Connect doorbells and forward rings to the
appropriate device based on queue type.
Change-Id: I44110c9a24559936102a246c9658abb84a8ce07e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53065
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
SDMAEngine handles copies to device memory. This commit
updates sdma_packets.hh style as well. Added several methods needed by
SDMAEngine to GPU device including GART table, various getters, and
aperture range checkers. Move the MMIO interface from GPUController to
SDMAEngine. Create an SDMA MMIO and commands header with only the macros
we use so that we don't need to check in multi-thousand line header
files from the linux kernel. Keep SOC15 IH client ID macros as that file
is small.
Change-Id: I986fede90cc1bc16ee56d4e8598cf9283bde034e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53064
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Create a VM class to reduce clutter in the amdgpu_device.* files. This
new file is in charge of reading/writting MMIOs related to VM contexts
and apertures. It also provides ranges checks for various apertures and
breaks out the MMIO interface so that there are not overloaded macro
definitions in the device MMIO methods.
The new translation generator classes for the various apertures are also
added to this class.
Change-Id: Ic224c1aa485685685b1136a46eed50bcf99d2350
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53066
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
* The value of build environment variable KVM_ISA is serialized into
the generated file `kvm_isa.hh'. This value should be a string, but on
hosts where the KVM headers are not available, the default `None` is
inserted. Changed the default value to the string `""` in this case.
* Added missing include for `std::array`.
Change-Id: I651122cc46fc9c0757f592b05f4b4cab285cb91f
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57889
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Current implementation prevents customers from performing zero
initialize on BitUnion class. Customers would get unexpected results
when writing `BitUnion{}`. Changing the default constructor to default
can solve this issue.
After changing the default constructor, the test failed with unused
variable. I also change one with zero initializer and make the other
with maybe_unused label.
```
tests/build/ARM/base/bitunion.test.cc:133:14: error: 'emptySixteen' defined but not used [-Werror=unused-variable]
133 | EmptySixteen emptySixteen;
| ^~~~~~~~~~~~
tests/build/ARM/base/bitunion.test.cc:132:16: error: 'emptyThirtyTwo' defined but not used [-Werror=unused-variable]
132 | EmptyThirtyTwo emptyThirtyTwo;
| ^~~~~~~~~~~~~~
```
Change-Id: Icbed36b3fa6751cbda63e84443eaab6d865d9bd6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57730
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
There have been failures on the weekly tests during the decoding of the
downloaded resources.json base64 file. These errors suggested an
incomplete download or some form of file corruption. These errors only
ever seem to occur when multiple threads of gem5 are running. It has
therefore been proposed that perhaps, in some cases, the cached
downloaded file was bring re-downloaded while also being read by
another thread. For this reason this patch adds a filelock so only one
instance of gem5, at any one time, can download and read the
resources.json file. Even if this is not the cause of the weekly test
errors, it still adds some additional safeguards.
Change-Id: I7c6e1c1786c1919e8519587e53b6a77f4aafa932
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57789
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In a dGPU configuration, vector and scalar loads/stores can either be
requests to device memory or host memory depending on if the system bit
is set in the PTE when the request's virtual address is translated. This
object is used to send/receive those requests to the host via DMA.
This object will be used in a later changeset by the compute unit and
fetch units to issue data and instruction loads from the GPU which
translate to physical addresses on the host/cpu memory.
Change-Id: I4537059f90ebc03f3b2e6b8b631b4c452841f83f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51851
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Sometimes a library is needed to support particular functionality in
gem5, and that functionality is only used (or even desirable) in
certain binaries SCons can build. We can currently filter sources to
include in a particular executable using tags, but libraries have been
added to the environment globally using the LIBS variable which applies
to all Executables.
This change adds a SourceLib() mechanism which is a new category of
source which represents libraries. This is independent from classes
which inherit from SourceFile which represent actual files, as opposed
to more abstract libraries.
When gem5 builds an executable, the filters it provides are used to
select both Source()-es, aka c/c++ files, and libraries. If something
like a unit test does not need all the libraries gem5 proper does,
then those won't be picked up by its filter, and it won't include them.
Change-Id: I003e029eb82f7800a7ecff698c260e2d18ea2900
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58069
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This was defined in the Micro_Container base class, and then again in
each subclass. The base definition was different and less complete than
the others, but the others were identical. Replace the base class
definition with the definition in the subclasses, and delete the ones in
the subclasses.
Change-Id: Ib2d1ce72958ec299115efb6efced2bd14c08467c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56330
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The current default GPU register allocator is the "simple" policy,
which only allows 1 wavefront to run at a time on each CU. This is
not very realistic and also means the tester (when not specifically
choosing the dynamic policy) is less rigorous in terms of validating
correctness.
To resolve this, this commit changes the default to the "dynamic"
register allocator, which runs as many waves per CU as there are
space in terms of registers and other resources -- thus it is more
realistic and does a better job of ensuring test coverage.
Change-Id: Ifca915130bb4f44da6a9ef896336138542b4e93e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57537
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
The executed_as field is currently not set for global memory
instructions. This results in the default of SC_NONE, causing the status
vector to be all zeros. The GM pipe sees this and completes the
instruction immediately rather than issuing memory requests. This is
fixed by marking the instruction as executed as SC_GLOBAL always. Flat
instructions use resolvedFlatSegment for this, however since global
instructions are known to be global we can set this field directly. This
results in the expected issuing of memory requests to GPU memory.
Change-Id: Ic23102853ccd49a41e2f083b7bb24f033dfed18a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57829
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The seldom used '--include-tags' and '--exclude-tags' flags allows a
testing user to remove and include tags from the search used by TestLib
to select tests. For example, by default the 'quick' tag is included as
part of the search of tests to run. The '--exclude-tags' flag could then
be used to remove the 'quick' tags from the search.
The TestLib framework was applying the regex these flags input before
the default flags. This meant if the user wished to remove a flag, it
was impossible. This is now applied after.
Change-Id: I569e0f8d6093ff5e5cdc76faff89c15e75ff297a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56830
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The first big change is gcc-6.4 is no longer supported in FastModel
11.17. We switch to gcc-7.3. Next, TARGET_MAXVIEW is
replaced by TARGET_SYSTEMC_MAXVIEW. The default value of
TARGET_SYSTEMC_MAXVIEW is zero. So we can simply remove TARGET_MAXVIEW.
Finally, I fixed an undefined exception in the build script.
Change-Id: I5ec70112056513c253e6127ed5f8abacf191431f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57549
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In FS mode offsets for HSA queues are determined by the driver and
cannot be linearly assigned as they are in SE mode. Add plumbing to pass
the offset of a queue to the HSA packet processor and then to HW
scheduler.
A mapping to/from queue ID <-> doorbell offset are also needed to be
able to unmap queues. ROCm 4.2 is fairly aggressive about context
switching queues, which results in queues being constantly mapped and
unmapped.
Another result of remapping queues is the read index is not preserved in
gem5. The PM4 packet processor will write the currenty read index value
to the MQD before the queue is unmapped. The MQD is written back to
memory on unmap and re-read on mapping to obtain the previous value.
Some helper functions are added to be able to restore the read index
from a non-zero value.
Change-Id: I0153ff53765daccc1de23ea3f4b69fd2fa5a275f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53076
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add a new option `auto_unlink_shared_backstore` to System so it will
remove the shared backstore used in physical memories when the System is
getting destructed. This will prevent unintended memory leak.
If the shared memory is designed to live through multiple round of
simulations, you may set the option to false to prevent the removal.
Test: Run a simulation with shared_backstore set, and see whether there
is anything left in /dev/shm/ after simulation ends.
Change-Id: I0267b643bd24e62cb7571674fe98f831c13a586d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57469
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
Add the page table walker, page table format, TLB, TLB coalescer, and
associated support in the AMDGPUDevice. This page table format used the
hardware format for dGPU and is very different from APU/GCN3 which use
the X86 page table format.
In order to support either format for the GPU model, a common
TranslationState called GpuTranslation state is created which holds the
combined fields of both the APU and Vega translation state. Similarly
the TlbEntry is cast at runtime by the corresponding arch files as they
are the only files which touch the internals of the TlbEntry. The GPU
model only checks if a TlbEntry is non-null and thus does not need to
cast to peek inside the data structure.
Change-Id: I4484c66239b48df5224d61caa6e968e56eea38a5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51848
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
dGPUs can translate a virtual address and will not know if the address
resides in system/host memory or device/dGPU memory until the
translation is complete. In order to mark requests as going to either
system memory or device memory we add a field to the Request class.
Change-Id: Ib1e80e8d03ecdfeb11c24d979ccc4b912ce07f91
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51852
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
The ESR.IL field (Instruction Lenght) is set to 0 if the exception
has been triggered by a 16-bit instruction (Thumb) and 1 otherwise.
Current implementation has been implemented more or less correctly
for AArch32 but not for AArch64; by doing:
if (to64) {
esr.il = 1;
} ... [AArch32]
We are directly setting ESR.IL to 1 in case the exception is taken in
AArch64 mode. This is not covering the case of a thumb instruction
faulting to AArch64.
We are fixing this by defining a virtual method returning the ESR.IL
bitfield depending on the exception cause/type. This is following
the Arm Architectural Reference Manual, which states ESR.IL bit should
be set to 1 for 32-bit instructions and for cases where the fault
doesn't really depend on the instruction:
* SError interrupt
* Instruction Abort exception
* PC alignment exception
* SP alignment exception
* Data Abort exception for which the value of the ISV bit is 0.
* Illegal Execution state exception.
* Debug exception except for Breakpoint instruction exceptions
* Exception reported using EC value 0b000000.
Change-Id: I79c9ba8397248c526490e2ed83088fe968029b0e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57570
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>