This code is unnecessary as the read index is already correct.
Furthermore, it can cause hangs in some situations where the packet
SHOULD be marked as not complete. This causes a bug where the read index
is incremented by 1 multiple times, causing the packet processor to read
an invalid packet, followed by a hang after it does nothing.
Change-Id: Iceda3c9606e018f60f8902770a2d9762c1c14304
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61650
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
The QCntxt is reused when a queue is unmapped and mapped again. This is
fairly common in GPU full system. If this is not done the readIndex on
the queue context is reset to 1, causing getCommandsFromHost to read
from the wrong slot which is typically an old dispatch packet or an
invalid packet. This causes simulation to stall as the incorrect
completion signal is eventually written.
Change-Id: I65541e559fe04f5eb44b936ca37e3f802262fe6a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57670
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This makes what are configuration and what are internal SCons variables
explicit and separate, and makes it unnecessary to call out what
variables to export to C++.
These variables will also be plumbed into and out of kconfiglib in later
changes.
Change-Id: Iaf5e098d7404af06285c421dbdf8ef4171b3f001
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56892
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Make the necessary changes to connect Vega pagetable walkers for
full-system mode. Previously the CP and HSA packet processor could only
read AQL packets from system/host memory using proxy port. This allows
for AQL to be read from device memory which is used for non-blit
kernels.
Change-Id: If28eb8be68173da03e15084765e77e92eda178e9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53077
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In FS mode offsets for HSA queues are determined by the driver and
cannot be linearly assigned as they are in SE mode. Add plumbing to pass
the offset of a queue to the HSA packet processor and then to HW
scheduler.
A mapping to/from queue ID <-> doorbell offset are also needed to be
able to unmap queues. ROCm 4.2 is fairly aggressive about context
switching queues, which results in queues being constantly mapped and
unmapped.
Another result of remapping queues is the read index is not preserved in
gem5. The PM4 packet processor will write the currenty read index value
to the MQD before the queue is unmapped. The MQD is written back to
memory on unmap and re-read on mapping to obtain the previous value.
Some helper functions are added to be able to restore the read index
from a non-zero value.
Change-Id: I0153ff53765daccc1de23ea3f4b69fd2fa5a275f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53076
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Remove the line "For use for simulation and test purposes only" in files
were AMD is the only copyright holder listed in the header. This happens
to be the case for all files where this line exists, removing it
completely from gem5.
Change-Id: I623f266b002f564301b28774f49081099cfc60fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
GFX7 (not supported in gem5) and GFX8 have a bug with how virtual
addresses are calculated for their HSA queues. The ROCr component of
ROCm solves this problem by doubling the HSA queue size that is
requested, then mapping all virtual addresses in the second half of the
queue to the same virtual addresses as the first half of the queue.
This commit fixes gem5's support to mimic this behavior.
Note that this change does not affect Vega's HSA queue support, because
according to the ROCm documentation, Vega does not have the same problem
as GCN3.
Change-Id: I133cf1acc3a00a0baded0c4c3c2a25f39effdb51
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51371
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
I don't have this header on one of the machines I build on, so this is
breaking the build for me. Removing this include seems to make the
build succeed, implying that it's not actually necessary. I looked at
the file it's used in and didn't see anything that looked like it came
from this header file.
Change-Id: If4a29063d6d0d25904183cab78c9713ff1f8daa6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48603
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Remove the duplicate dmaVirt calls from HSA packet processor and GPU
command processor and move them into their own class. This removes some
duplicate code and allows a DmaVirtDevice to be created which will be
useful for upcoming full system GPU commits.
The DmaVirtDevice is an abstraction of the base DmaDevice but iterates
using ChunkGenerator over virtual addresses. Classes which inherit from
DmaVirtDevice must provide a translation function to translate from
virtual address to physical address. Once translated, the physical
address is passed to DmaDevice to do the work.
Change-Id: Idd59ccb4d9ba21c0b1150ee328ededf5a88d824e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47179
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In C, to refer to a type without a struct or enum tag on the type, you
need to typedef it like this:
typedef struct
{
} Foo;
Foo foo;
In C++, this is unnecessary:
struct Foo
{
};
Foo foo;
Remove all of the first form in C++ files and replace them with the
second form.
Change-Id: I37cc0d63b2777466dc6cc51eb5a3201de2e2cf43
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46199
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Apply the gem5 namespace to the codebase.
Some anonymous namespaces could theoretically be removed,
but since this change's main goal was to keep conflicts
at a minimum, it was decided not to modify much the
general shape of the files.
A few missing comments of the form "// namespace X" that
occurred before the newly added "} // namespace gem5"
have been added for consistency.
std out should not be included in the gem5 namespace, so
they weren't.
ProtoMessage has not been included in the gem5 namespace,
since I'm not familiar with how proto works.
Regarding the SystemC files, although they belong to gem5,
they actually perform integration between gem5 and SystemC;
therefore, it deserved its own separate namespace.
Files that are automatically generated have been included
in the gem5 namespace.
The .isa files currently are limited to a single namespace.
This limitation should be later removed to make it easier
to accomodate a better API.
Regarding the files in util, gem5:: was prepended where
suitable. Notice that this patch was tested as much as
possible given that most of these were already not
previously compiling.
Change-Id: Ia53d404ec79c46edaa98f654e23bc3b0e179fe2d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Commit id ef44dc9a removed mmap-based doorbell allocation since dGPUs
use ioctl's instead. However, APUs still need this to work correctly.
Add that logic back in as well as some new logic to distinguish doorbells
mmaps from other types. Also add some additional commentary regarding
Event page mmaps.
Change-Id: I8507ac85c8f07886d0fb4f95bde5e18a7790eab8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42218
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
New topology ripped from Fiji to support dGPU. A dGPU flag is added to
the config which is propogated to the driver. The emulated driver is
now able to properly deal with dGPU ioctls and mmaps. For now, dGPU
physical memory is allocated from the host, but this is easy to change
once we get a GPU memory controller up and running.
Change-Id: I594418482b12ec8fb2e4018d8d0371d56f4f51c8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42214
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These were not caught by the previous patches because
the grep used ignored:
- anonymous structures
(e.g., "struct {")
- opening braces without leading spaces
(e.g., "struct Name{"),
- weird chars in auto-generation files
(e.g., "struct $name {").
- extra characters after the opening brace.
(e.g., "struct Name { // Comment")
- typedefs (note that this is not caught by the verifier)
(e.g., "typedef struct Name {")
Most of this has been fixed be grepping structures
with the following regex:
grep -nrE --exclude-dir=systemc \
"^ *(typedef)* *(struct|class|enum|union) [^{]*{$" src/
The following makes sure that "struct{" is captured:
grep -nrE --exclude-dir=systemc \
"^ *(struct|class|enum|union){" src/
To find cases that contain a comment after the
opening brace:
grep -nrE --exclude-dir=systemc \
"^ *(struct|class|enum|union)[^{]*{\s*//" src/
Change-Id: I9f822bed628d13b1a09ccd6059373aff63a8d7bd
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/43505
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
In the HSAQueueDescriptor ptr function, we mod the index by numElts, but
numElts was previously just set to size, which was the raw size of the
queue. This lead to indexing past the queue. We fix this by dividing by
the size by the AQL packet size to get the actual number of elements the
queue can hold.
We also add an assert for indexing into the queue, as there is a
scenario where the queue reports a larger size than it actually is.
Change-Id: Ie5e699379f303255305c279e58a34dc783df86a0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42423
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The systemc dir was not included in this fix.
First it was identified that there were only occurrences
at 0, 1, and 2 levels of indentation (and 2 of 2 spaces,
1 of 3 spaces and 2 of 12 spaces), using:
grep -nrE --exclude-dir=systemc \
"^ *enum [A-Za-z].* {$" src/
Then the following commands were run to replace:
<indent level>enum X ... {
by:
<indent level>enum X ...
<indent level>{
Level 0:
grep -nrl --exclude-dir=systemc \
"^enum [A-Za-z].* {$" src/ | \
xargs sed -Ei \
's/^enum ([A-Za-z].*) \{$/enum \1\n\{/g'
Level 1:
grep -nrl --exclude-dir=systemc \
"^ enum [A-Za-z].* {$" src/ | \
xargs sed -Ei \
's/^ enum ([A-Za-z].*) \{$/ enum \1\n \{/g'
and so on.
Change-Id: Ib186cf379049098ceaec20dfe4d1edcedd5f940d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/43326
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The systemc dir was not included in this fix.
First it was identified that there were only occurrences
at 0, 1, 2 and 3 levels of indentation (and a single
occurrence of 2 and 3 spaces), using:
grep -nrE --exclude-dir=systemc \
"^ *struct [A-Za-z].* {$" src/
Then the following commands were run to replace:
<indent level>struct X ... {
by:
<indent level>struct X ...
<indent level>{
Level 0:
grep -nrl --exclude-dir=systemc
"^struct [A-Za-z].* {$" src/ | \
xargs sed -Ei \
's/^struct ([A-Za-z].*) \{$/struct \1\n\{/g'
Level 1:
grep -nrl --exclude-dir=systemc \
"^ struct [A-Za-z].* {$" src/ | \
xargs sed -Ei \
's/^ struct ([A-Za-z].*) \{$/ struct \1\n \{/g'
and so on.
Change-Id: I362ef58c86912dabdd272c7debb8d25d587cd455
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39017
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The systemc dir was not included in this fix.
First it was identified that there were only occurrences
at 0, 1, and 2 levels of indentation, using:
grep -nrE --exclude-dir=systemc \
"^ *class [A-Za-z].* {$" src/
Then the following commands were run to replace:
<indent level>class X ... {
by:
<indent level>class X ...
<indent level>{
Level 0:
grep -nrl --exclude-dir=systemc
"^class [A-Za-z].* {$" src/ | \
xargs sed -Ei \
's/^class ([A-Za-z].*) \{$/class \1\n\{/g'
Level 1:
grep -nrl --exclude-dir=systemc \
"^ class [A-Za-z].* {$" src/ | \
xargs sed -Ei \
's/^ class ([A-Za-z].*) \{$/ class \1\n \{/g'
and so on.
Change-Id: I17615ce16a333d69867b27c7bae0f4fdafd8b2eb
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39015
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Event creation and management support from emulated drivers is required
to support interruptible signals in HSA and this support was not
available. This changeset adds the event creation and management support
in the emulated driver. With this patch, each interruptible signal
created by the HSA runtime is associated with a signal event. The HSA
runtime can then put a thread waiting on a signal condition to sleep
asking the driver to monitor the event associated with that signal. If
the signal is modified by the GPU, the dispatcher notifies the driver
about signal value change. If the modifier is a CPU thread, the thread
will have to make HSA API calls to modify the signal and these API calls
will notify the driver about signal value change. Once the driver is
notified about a change in the signal value, the driver checks to see if
any thread is sleeping on that signal and wake up the sleeping thread
associated with that event. The driver has also implemented the time_out
wakeup that can wake up the thread after a certain time period has
expired. This is also true for barrier packets.
Each signal has an event address in a kernel managed and allocated
event page that can be used as a mailbox pointer to notify an event.
However, this feature used by non-CPU agents to communicate with the
driver is not implemented by this changeset because the non-CPU HSA
agents in our model can directly communicate with driver in our
implementation. Having said that, adding that feature should be trivial
because the event address and event pages are correctly setup by this
changeset and just adding the event page's virtual address to our PIO
doorbell interface in the page tables and registering that pio address
to the driver should be sufficient. Managing mailbox pointer for an
event is based on event ID and using this event ID as an index into
event page, this changeset already provides a unique mailbox pointer for
each event.
Change-Id: Ic62794076ddd47526b1f952fdb4c1bad632bdd2e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38335
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
HSA packet processor will now accept and process agent packets.
Type field in packet is command type.
For now:
AgentCmd::Nop = 0
AgentCmd::Steal = 1
Steal command steals the completion signal for a running kernel.
This enables a benchmark to use hsa primitives to send an agent
packet to steal the signal, then wait on that signal.
Minimal working example to be added in gem5-resources.
Change-Id: I37f8a4b7ea1780b471559aecbf4af1050353b0b1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/37015
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The create() method on Params structs usually instantiate SimObjects
using a constructor which takes the Params struct as a parameter
somehow. There has been a lot of needless variation in how that was
done, making it annoying to pass Params down to base classes. Some of
the different forms were:
const Params &
Params &
Params *
const Params *
Params const*
This change goes through and fixes up every constructor and every
create() method to use the const Params & form. We use a reference
because the Params struct should never be null. We use const because
neither the create method nor the consuming object should modify the
record of the parameters as they came in from the config. That would
make consuming them not idempotent, and make it impossible to tell what
the actual simulation configuration was since it would change from any
user visible form (config script, config.ini, dot pdf output).
Change-Id: I77453cba52fdcfd5f4eec92dfb0bddb5a9945f31
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35938
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This change replaces the __attribute__ syntax with the now standard [[]]
syntax. It also reorganizes compiler.hh so that all special macros have
some explanatory text saying what they do, and each attribute which has a
standard version can use that if available and what version of c++ it's
standard in is put in a comment.
Also, the requirements as far as where you put [[]] style attributes are
a little more strict than the old school __attribute__ style. The use of
the attribute macros was updated to fit these new, more strict
requirements.
Change-Id: Iace44306a534111f1c38b9856dc9e88cd9b49d2a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35219
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
db_offset used to be calculated through pointer arithmetic. Pointer
arithmetic increments the address by the size of the data type the
pointer is pointing at. In the previous db_offset calculation, that
was a uint32_t, which means the input was multiplied by 4, which is
sizeof(uint32_t)
This patch multiplies the input value by sizeof(uint32_t) before
assigning it to db_offset.
Change-Id: I9042560303ae6b8b1054b98e9a16a9da27843bb2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32678
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This changeset fixes several bugs in the HSA barrier bit implementation.
1. Forces AQL packet launch to wait for completion of all previous packets
2. Enforces barrier bit blocking only if there are packets pending completion
3. Barrier bit unblocking is correclty done by the last pending packet
4. Implementing barrier bit for all packets to conform to HSA spec
Change-Id: I62ce589dff57dcde4d64054a1b6ffd962acd5eb8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30354
Reviewed-by: Sooraj Puthoor <puthoorsooraj@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>