How to reset a model correctly is very different between models. Take
cpu models for instance, they have different reset pins for different
parts(typically one for each core, one for shared component, one for
debug interface). To make users more easily to reset the model, here we
want to introduce a special reset port. By implementing the port, users
can simply request a whole reset to the model. If users want to do
partial resets, users still can access the raw pins to achieve what they
want.
Change-Id: I746121d16441e021dc3392aeae1a6d9fa33d637a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58810
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add logic to collect pointers to all GPU TLBs in full system. Implement
the invalid TLBs PM4 packet. The invalidate is done functionally since
there is really no benefit to simulate it with timing and there is no
support in the TLB to do so. This allow application with much larger
data sets which may reuse device memory pages to work in gem5 without
possibly crashing due to a stale translation being leftover in the TLB.
Change-Id: Ia30cce02154d482d8f75b2280409abb8f8375c24
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58470
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The QCntxt is reused when a queue is unmapped and mapped again. This is
fairly common in GPU full system. If this is not done the readIndex on
the queue context is reset to 1, causing getCommandsFromHost to read
from the wrong slot which is typically an old dispatch packet or an
invalid packet. This causes simulation to stall as the incorrect
completion signal is eventually written.
Change-Id: I65541e559fe04f5eb44b936ca37e3f802262fe6a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57670
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This makes what are configuration and what are internal SCons variables
explicit and separate, and makes it unnecessary to call out what
variables to export to C++.
These variables will also be plumbed into and out of kconfiglib in later
changes.
Change-Id: Iaf5e098d7404af06285c421dbdf8ef4171b3f001
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56892
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reads to the frame buffer are currently handled by either the MMIO trace
or from the GART table if the address is in the GART aperture. In some
cases the MMIO trace will not contain the address or the data may have
been written previously and be different from the MMIO trace. To handle
this, return the data that was written previously by the driver. The
priority order from lowest to highest is: MMIO trace, device cache,
special framebuffer registers.
Change-Id: Ia45ae19555508fcd780926fedbd7a65c3d294727
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57589
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The driver will check this bit is set after initializing IPs. Currently
the MMIO trace will cause this bit to be set at the correct time,
however this is not portable access different ROCm versions. Therefore
we modify the value to always set the bit indicating interrupts are
enabled.
Change-Id: Iae0baf1936720fbe9835ae4acadbf1b3bdc52896
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57530
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Make the necessary changes to connect Vega pagetable walkers for
full-system mode. Previously the CP and HSA packet processor could only
read AQL packets from system/host memory using proxy port. This allows
for AQL to be read from device memory which is used for non-blit
kernels.
Change-Id: If28eb8be68173da03e15084765e77e92eda178e9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53077
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The Arm Architecture Reference Manual has moved from
"Armv7-oriented" names for generic timer interrupts to
names more consistent with Armv8 (Exception Levels based).
We are therefore renaming those interrupts as follows:
int_phys_s -> int_el3_phys
int_phys_ns -> int_el1_phys
int_virt -> int_el1_virt
int_hyp -> int_el2_ns_phys
Change-Id: Id6e34a0e4311953938b25bca168a34357e3c8643
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58109
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The PM4 packet processor is handling all non-HSA GPU packets such
as packets for (un)mapping HSA queues. This commit pulls many
Linux structs and defines out into their own files for clarity.
Finally, it implements the VMID related functions in AMDGPU device.
Change-Id: I5f0057209305404df58aff2c4cd07762d1a31690
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53068
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add the devices that have been added in previous changesets to the
config file. Forward MMIO writes to the appropriate device based
on the MMIO address. Connect doorbells and forward rings to the
appropriate device based on queue type.
Change-Id: I44110c9a24559936102a246c9658abb84a8ce07e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53065
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
SDMAEngine handles copies to device memory. This commit
updates sdma_packets.hh style as well. Added several methods needed by
SDMAEngine to GPU device including GART table, various getters, and
aperture range checkers. Move the MMIO interface from GPUController to
SDMAEngine. Create an SDMA MMIO and commands header with only the macros
we use so that we don't need to check in multi-thousand line header
files from the linux kernel. Keep SOC15 IH client ID macros as that file
is small.
Change-Id: I986fede90cc1bc16ee56d4e8598cf9283bde034e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53064
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Create a VM class to reduce clutter in the amdgpu_device.* files. This
new file is in charge of reading/writting MMIOs related to VM contexts
and apertures. It also provides ranges checks for various apertures and
breaks out the MMIO interface so that there are not overloaded macro
definitions in the device MMIO methods.
The new translation generator classes for the various apertures are also
added to this class.
Change-Id: Ic224c1aa485685685b1136a46eed50bcf99d2350
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53066
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In a dGPU configuration, vector and scalar loads/stores can either be
requests to device memory or host memory depending on if the system bit
is set in the PTE when the request's virtual address is translated. This
object is used to send/receive those requests to the host via DMA.
This object will be used in a later changeset by the compute unit and
fetch units to issue data and instruction loads from the GPU which
translate to physical addresses on the host/cpu memory.
Change-Id: I4537059f90ebc03f3b2e6b8b631b4c452841f83f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51851
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In FS mode offsets for HSA queues are determined by the driver and
cannot be linearly assigned as they are in SE mode. Add plumbing to pass
the offset of a queue to the HSA packet processor and then to HW
scheduler.
A mapping to/from queue ID <-> doorbell offset are also needed to be
able to unmap queues. ROCm 4.2 is fairly aggressive about context
switching queues, which results in queues being constantly mapped and
unmapped.
Another result of remapping queues is the read index is not preserved in
gem5. The PM4 packet processor will write the currenty read index value
to the MQD before the queue is unmapped. The MQD is written back to
memory on unmap and re-read on mapping to obtain the previous value.
Some helper functions are added to be able to restore the read index
from a non-zero value.
Change-Id: I0153ff53765daccc1de23ea3f4b69fd2fa5a275f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53076
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add the page table walker, page table format, TLB, TLB coalescer, and
associated support in the AMDGPUDevice. This page table format used the
hardware format for dGPU and is very different from APU/GCN3 which use
the X86 page table format.
In order to support either format for the GPU model, a common
TranslationState called GpuTranslation state is created which holds the
combined fields of both the APU and Vega translation state. Similarly
the TlbEntry is cast at runtime by the corresponding arch files as they
are the only files which touch the internals of the TlbEntry. The GPU
model only checks if a TlbEntry is non-null and thus does not need to
cast to peek inside the data structure.
Change-Id: I4484c66239b48df5224d61caa6e968e56eea38a5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51848
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When powered on, the "passed self test" bit should not be set. It should
only be set once the I8042 has actually been told to do a self test.
Also the mouse and keyboard should be disabled. With them disabled their
interrupts won't matter, but we might as well leave those disabled as
well.
Change-Id: Ief1ab30365a0a8ea0a116e52c16dcccf441515ec
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55805
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Now that the I8259's vector is reported using a special memory access,
the getVector method doesn't need to be accessible outside of the class.
It's still useful internally though, since it nicely encapsulates what
should happen when an INTA signal is received.
Change-Id: I7da7c1f18fac97ffc62c965978f53fb4c5430de3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55698
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In a real system, once a CPU receives an interrupt of type ExtInt, it
will send an INTA message out to the I8259 sytle interrupt controllers
to read the vector for that interrupt. In ye-olden-times, that would
literally mean the I8259 would be in charge of the bus and would write
the eight bit vector for the CPU to read. In more modern systems, the
vector is transported on the system interconnect using a special
message.
To better approximate a real system, and to make the interrupt
controllers more modular and agnostic (so the IO APIC doesn't have a
I8259 pointer within it, for instance), this change adds a new special
address which the I8259 can respond to on reads which will act as if it
received an INTA message, and the read data will be the interrupt
vector.
Only the master controller, or a single device, will respond to this
address, and because of its value and the fact that it's beyond the end
of the 16 bit IO port address space's effective range but still within
it, that address won't be generated by any other activity other than
possibly a bogus address.
Also by putting the special address in the IO port address space, that
will make it easier to ensure that it's within the range of addresses
which are routed towards the I8259 which operates off the IO port bus.
This address is not yet actually used by the IO APIC or local APIC but
will be shortly.
Change-Id: Ib73ab4ee08531028d3540570594c552f39053a40
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55694
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The x86 version is basically just a specialization of the base IO port
version of the QEMU firmware configuration device, with the port
addresses set for x86.
The E820 entry type is x86 specific, and is a way to pass an E820 memory
map to firmware which doesn't have another way to figure out where
memory is. This would be for firmware like SeaBIOS which is itself
responsible for publishing an E820 map, but it needs somewhere to get
that information from in the first place. This mechanism is one it
supports natively.
This entry type reuses the E820Entry SimObjects which were defined a
long time ago for passing to a Linux FS workload. It doesn't use their
ability to write themselves out to guest memory, and just uses them as a
transport for their address, size and type properties.
Change-Id: Ifff214f5fc10bd7d0a2a0acddad4fc00dd65f67d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55628
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
This artificial device is provided by QEMU inside their emulated
machines to feed extra configuration information to the system firmware,
or even to the operating system if it chose to use it. The behavior of
this device is explained in the docs/specs/fw_cfg.txt file in the QEMU
source.
This implementation currently supports the traditional interface, and
does not support the DMA based interface, although it probably wouldn't
be that hard to expand it to in the future.
The interface exposes individual entries which can optionally (and
usually) have paths associated with them that you can look up in a
directory type entry which has a fixed index. There are some entries
which are built into the device itself, which are the ID, signature and
directory entries, but the rest can be set up in the config scripts.
To make it easier to add new entries which are not from config scripts,
aka the ones that are hard coded in C++ and built into the device, the
actual entries themselves are not SimObjects, but it's easy to create a
SimObject wrapper which will spit them out for the device to consume.
Other items can be added to the device manually without generating them
with SimObjects.
Entries can have fixed or automatically generated indices. All entries
have a "path" in the sense that they have a name, but as a minor
deviation from what the QEMU documentation says, a "path" which begins
with a "." is not exported in the directory. This is purposefully
reminiscent of the unix style hidden file mechanism, where files or
directories who's names begin with "." are not normally shown by ls.
There are two different styles of this device, one which is IO port
based, and one which is MMIO based. Which to use depends on the
architecture, where x86 currently uses the IO scheme and ARM uses the
MMIO scheme. The documentation doesn't say what other ISAs use, if any
other ones support this interface, but I'd assume the MMIO version.
These are split out because the rules for how they work are subtly
different, but they share a lot of common machinery under the hood.
In most cases where somebody tries to talk to the device in an unusual
way, for instance accessing a register with an incomplete width or at an
offset, the device will just report all zeroes. The behavior in those
cases isn't specified, in many cases doesn't make sense based on the
design of the device, and doesn't seem to be depended on in the limited
use case I looked at.
Change-Id: Ib81ace406f877b298b9b98883d417e7d673916b5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55627
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
This is essentially the same as the normal one, except it sets its
ProgIF bits to show that it works in compatibility mode only, with fixed
IO ports and fixed IRQs that it operates with which are outside of the
scope of the normal PCI mechanisms.
Change-Id: I69d04f5c9444e7e227588b96b7dd4123b2850e23
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55586
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
This command is one of two that should be implemented by ATAPI devices.
An ATAPI device are essentially optical devices which use SCSI commands
which are transported over ATA using two special commands, a PACKET
command which actually sends the SCSI commands, and an IDENTIFY command
which is basically the same as the ATA IDENTIFY command but which is
only implemented on ATAPI devices. In order to determine if the device
connected to an IDE controller is an optical drive or a regular ATA hard
drive, software can send the ATAPI_IDENTIFY_DEVICE command and see if
gets an appropriate response.
In gem5, this command was originally not implemented by the IDE disk
model. If a driver attempted to send it, the gem5 model would panic and
kill the simulation. To fix that problem, that command was added to the
list of supported commands and just made a synonym for the ATA IDENTIFY
command since they have essentially the same response.
Unfortunately, this makes all ATA devices look like ATAPI devices, which
is not what we have implemented.
Instead, when we get this command, what we *should* do, as far as I can
tell by reading this:
http://users.utcluj.ro/~baruch/media/siee/labor/ATA-Interface.pdf
is to set the ERR bit in the status register, and then set the ABRT bit
in the error register to indicate that the command was not implemented.
I've attempted to implement that into the model with this change by
setting those bits as described, and then setting the "action" member to
be ACT_CMD_ERROR. I think that action is there primarily to support
cancelled transfers, but it seems like it has the desired effect(?).
Since the error bits are not really explicitly set or managed by the
model in most cases, this change also adds a little bit of code at the
top of startCommand which clears them to zero. These bits are supposed
to "contain the status of the last command executed", and if we're
starting a new command, the error bits no longer apply.
I'm confident that conceptually this is how the ATAPI_IDENTIFY_DEVICE
command should behave in our model, at least unless we decide to
implement real ATAPI models which actually accept SCSI commands, etc.
I'm less confident that I've modified the model to actually implement
that behavior, but as far as I can tell it doesn't seem to have broken
anything, and now SeaBIOS no longer things our disk model is a CDROM
drive.
Change-Id: I2c0840e279e9caa89c21a4e7cbdbcaf6bccd92ac
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55523
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
A (another) weird/terrible quirk of the architecture of the PC is that
the the highest order bit of the value which selects a register to read
from the CMOS NVRAM/RTC is stolen and repurposed to enable/disable NMIs
since the 286, if the internet is to be believed.
Fortunately We don't currently attempt to generate an NMI anywhere, and so
this bit won't do anything in gem5 currently.
Unfortunately if we treat this value as the real offset without masking
off this bit, if software attempts to disable NMIs with it, it will
trigger an out of bounds assert in the CMOS device.
This change makes the CMOS device slightly smarter and has it maintain
but ignore the NMI disabling bit.
Change-Id: I8d7d0f1b0aadcca2060bf6a27931035859d66ca7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55244
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>