Commit Graph

1357 Commits

Author SHA1 Message Date
Matthew Poremba
a648be2338 dev-amdgpu: Add an SDMA data debug flag
This debug flag is used to print spammy SDMA DPRINTFs, such as an SDMA
copy printing the data of large transfers 8 bytes per line at a time. For
those prints, the SDMAEngine flag will now only print the first and last
qword of the transfer and the new SDMAData flag is needed for verbose
data printing. This makes the SDMAEngine flag still useful for verifying
copies in applications with predictable data such as square.

Additionally, the memory allocation/deallocation done solely for a print
statement is removed in favor of casting the data to the printed type.

Change-Id: I18c1918ef9085cca4570f79881ee63d510ccc32f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64452
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-10-13 20:17:00 +00:00
Earl Ou
317bfd62bd dev: fix device number check error in IDE controller
Fixed a typo between 3 and 4.

Change-Id: I1470e30c4d472587db0b9da5512b24ab92f1fd65
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64052
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
2022-10-04 01:21:37 +00:00
Giacomo Travaglini
7c0ab07ee2 dev-arm: Fix GICv3 GICD_ITARGETSR address range
According to the GICv3 manual, GICD_ITARGETSR address range goes from
0x0800 to 0x0c00 (as already implemented in the GICv2 model [1])

[1]: https://github.com/gem5/gem5/blob/v22.0.0.0/\
    src/dev/arm/gic_v2.cc#L64

Change-Id: I064e91d070d1a7b79f41a06ffd2197e4c07dae32
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64074
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-10-03 17:45:10 +00:00
Matthew Poremba
2f1d67f8fe dev-amdgpu: Remove cached copy of device memory
This map was originally used for fast access to the GART table. It is no
longer needed as the table has been moved to the AMDGPUVM class. Along
with commit 12ec5f9172 which reads
functionally from device memory, this table is no longer needed and is
essentially a duplicate copy of device memory for anything written over
the PCI BAR.

This changeset removes the map entirely which will reduce the memory
footprint of simulations and potentially avoid stale copies of data
when reading over the PCI BAR.

Change-Id: I312ae38f869c6a65e50577b1c33dd055078aaf32
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63951
Reviewed-by: Matt Sinclair <mattdsinclair.wisc@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-10-01 14:04:45 +00:00
Matthew Poremba
b623d26543 dev-amdgpu: Fix interrupt call for release mem
Both the client id and source id are incorrect for the release mem CP
packet. This changeset sets both to the correct value and adds asserts
that the value is declared in the client ID and source ID enums.

Change-Id: I4cc6c3a5f2a482e8f7dcd2a529c4a69bf71742c0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63177
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
6c935657fd dev-amdgpu: Implement SDMA atomic packet
SDMA atomic packets are used in conjunction with RLC queues in SDMA for
synchronization similar to how HSA signals are used with BLIT kernels
when SDMA is disabled. Implement a skeleton of the SDMA atomic packet
methods as well as the atomic add64 operation.

The atomic add operation appears to be the only operation used in ROCm,
so this implementation is fairly complete. See:

https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/
    rocm-4.2.x/src/core/runtime/amd_blit_sdma.cpp#L880

Change-Id: I62cc337f2ffe590bdb947b48053760ee8b3a6f32
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63174
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
9ea28bd782 dev-amdgpu: Implement SDMA RLC queue unmapping
The unmap queues packet specifies all non-static queues should be
unmapped which includes RLC queues in the SMDA. This functionality did
not exist before and is added in this changeset.

Fixes bug with rodinia_3.0/hip/bfs.

Change-Id: I80ca8cf8d89559625b5870745889b0a27916635e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63173
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
af4251f6ae dev-amdgpu: Rework SDMA RLC queue data structure
There can only ever be two RLC queues maximum. Use this information for
a simpler data structure to store doorbell information. The patch
changes the std::unordered_map previously used to std::array. This will
also be useful in avoiding erase-while-iterating issues needed to
unregister all queues at once.

Change-Id: I95600e40de51cb1a992a20bcebaf7580ea4d0be8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63172
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
12ec5f9172 dev-amdgpu: Rework framebuffer reads
Previously framebuffer reads would try reading from MMIO trace, special
addresses, and then anything previously written to a special address
range. This does not handle direct large BAR reads, causing incorrect
results in some applications that were doing this.

Rework the readFramebuffer method to do the following. Remove the MMIO
trace read altogether, as there were not any framebuffer reads from the
trace to begin with. Read special addresses first to avoid overwriting
by previous writes. Next read previous writes to special ranges. The
special range is the GART table. These are required for functional
translations. Lastly read from the device memory directly. This does a
functional read required by the PCIDevice read method which is
non-timing. Reading from device memory is preferred over the map type
used for GART to avoid duplication of a potentially huge amount of data.

With this changeset all but one of the HIP samples and HIP examples
applications now run and pass verification of results.

Change-Id: Id3b788bfc5eaf17cfa1897f25d26f3725d4db321
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63171
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
4b35693bd2 dev-amdgpu: Forward RLC queue doorbells
Forward user queue doorbells to the SDMA. This is the final step needed
to enable RLC (user) queues to replace BLIT kernels.

Change-Id: I0c2ef70bb5414b82785ef437dd65d6c57798d24f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63033
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
a5dfb0718d dev-amdgpu: Add user-mode TranslationGen to SDMA
RLC queue do translation using user mode addresses. To support this, add
the final aperture translation needed to the SDMA engine.

Change-Id: I25841e240e3b44f66d26d503ab52b54379daa49a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63032
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
9ed39afe62 dev-amdgpu: Place all user-mode translations in MMHUB
The memory management hub ("mmhub") is an aperture that aliases the GPU
device memory. MMHUB addresses functionally map to the same device
address, with the exception that it is guaranteed not to overlap with
host memory. This is useful in gem5 for APIs with Addr type as it
prevents sending e.g., DMAs to the wrong place.

Change-Id: Ia296809a8dc2c5fbdeba6d70cd53215f9ab36c93
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63031
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
e0e2806fc4 dev-amdgpu: Add SDMA device translation helper
Adding a helper function to remove duplicate code in the copy packet
methods. Adds more comments on that code to explain what it is doing.
This could in theory also be used in other packets in the future.

Change-Id: Id0ed50c87260a2f12f53cb14e927f8c49bb99072
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62718
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
58e072f8bf dev-amdgpu: Remove default callback in mem manager API
In almost all cases reading/writing using the GPU memory manager will
want to wait until that read or write is complete. Therefore, change the
API to not default to no callback so that the user must explicitly
specify nullptr indicating they do not want to wait for completion.

Updates a write call which cannot use a callback due to being atomic in
the base gpu device code.

Change-Id: Id19145d49c7cafc97e2e178819682cb97270a16a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62716
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
3465ff1e7d dev-amdgpu: Add callbacks for all SDMA GPUMemMgr reqs
SDMA write, copy, and ptePde use GPUMemMgr to write to device memory and
were dangerously not waiting for write completion which could result in
data not being completely written to memory, the data buffer being freed
and potentially reused in the simulator, or advancing to the next SDMA
packet before the previous one is complete.

This changeset adds callbacks for the corresponding "done" methods
similar to what the dmaVirt methods call when reading or writing to host
memory to fix this issue.

Change-Id: I44ce14c13f812ea2a7a76438e12a6ed7c6e0bff0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62715
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-03 16:05:58 +00:00
Matthew Poremba
404aa34855 dev-amdgpu: Track outstanding chunks in mem manager
Requests sent using the GPU memory manager are not guaranteed to be
ordered. As a result, the last chunk created by the chunk generator
could complete before all of the previous chunks are done. This will
trigger the final callback and may cause an SDMA/PM4/etc. packet that is
waiting for its completion to resume before the data is ready.

This is likely a fix for verification failures in many applications.
Currently this is tested on MatrixTranspose from the HIP cookbook which
now passes its verification step. It could also potentially fix other
race conditions between reads/writes from/to memory such as using a PTE
or PDE before it is written, etc.

Change-Id: Id6fb342d899db6bd0b86c80056ecf91eeb3026f5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62714
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-03 16:05:58 +00:00
Matthew Poremba
432329c853 dev-amdgpu: Allow device address source for SDMA COPY
Now that the memory manager can DMA read from device memory, allow the
linear copy SDMA packet to use device memory as a source. This is used
when copying memory from device to host when SDMA engines are enabled.
This improves simulation performance over using (simulated) BLIT kernels
with SDMA engines disabled.

Change-Id: I1f41b294022f0049d154a401c1dc885abb4f223b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62713
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-03 16:05:58 +00:00
Matthew Poremba
a531ff64c3 dev-amdgpu: Add memory manager readRequest method
This method reads arbitrary sized requests from *device* memory with the
ability to call a callback after the last chunk, similar to writeRequest
method.

Change-Id: I8fc22c45b650a632ea48dbed1e978ceeda34ffdd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62712
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-03 16:05:58 +00:00
Matthew Poremba
4211962f8c dev-amdgpu: Fix translation reading SDMA MQD ("RLC queue")
The RLC queue MQD address is a GART address, not a system address, so it
must be translated through the GART first.

Change-Id: Ie52b0e65ebf57141b8ba6f88a49989813750eeec
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62711
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-03 16:05:58 +00:00
Giacomo Travaglini
0dc2a87666 dev-arm: Fix PCI range in VExpress_GEM5_Foundation
When we added the PCI mem range in the VExpress_GEM5_Foundation [1], we
meant to add a 256GiB region starting at 0x40 0000 0000.

By mistake the end address was set to 0x8 0000 0000 rather than
0x80 0000 0000

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/44165

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I848b8fee11fb742939c9343aae4ee5205aa836e4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62511
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-01 08:07:02 +00:00
Bobby R. Bruce
2bc5a8b71a misc: Run pre-commit run on all files in repo
The following command was run:

```
pre-commit run --all-files
```

This ensures all the files in the repository are formatted to pass our
checks.

Change-Id: Ia2fe3529a50ad925d1076a612d60a4280adc40de
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62572
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2022-08-24 21:47:07 +00:00
Gabe Black
f4209bbdee misc: Remove lingering uses of TheISA::.
Change-Id: Ie55e0d79867fbc8f75a993fb456a58c84de5def4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62196
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
2022-08-20 07:30:16 +00:00
Gabe Black
a13e3debed misc: Stop excluding code when building the NULL ISA.
The BaseCPU needs a little extra hacking because it tries to create
default objects based on what the ISA is. If the ISA isn't recognized,
then the types will be set to None, and some extra checks have been
added as the type is set up.

Change-Id: Ia3cae313e1a96a953d2316d9192f41a8fd28c141
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62195
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-08-20 07:30:07 +00:00
Wei-Han Chen
b3781ce93d configs: Add ITS in fastmodel cluster
There's a gic-its domain in gem5_vexpress_v2 device tree, thus adding ITS
domain in fastmodel cluster config.

Change-Id: Ieb0221fec2e85710531cef1723c492a07f47290a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62212
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2022-08-10 06:40:35 +00:00
Bobby R. Bruce
787204c92d python: Apply Black formatter to Python files
The command executed was `black src configs tests util`.

Change-Id: I8dfaa6ab04658fea37618127d6ac19270028d771
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47024
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-08-03 09:10:41 +00:00
Giacomo Travaglini
ef2573bc95 arch-arm: Convert to the new faulting logic
This patch is moving trapping behaviour modelled in
MiscRegOp64::trap to the MiscRegLUTEntry fault callbacks.

Change-Id: Idfca428e9e6669b747de0255888fc8a85a1f5d07
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61683
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-07-29 06:51:11 +00:00
Matthew Poremba
68115460d8 gpu-compute: Set LDS and Scratch apertures in FS
The LDS and scratch aperture base and limits are hardcoded to some
values that are useful for SE mode. In reality, these are chosen by the
driver so we need to honor whatever values the driver passes so that
when addresses are calculated they fall into the correct aperture to
route flat instructions to those apertures.

This overwrites the default hardcoded values for LDS and scratch base
and limit using the values providing by the driver in a MAP_PROCESS
packet.

Change-Id: I0e194a26631f697819d8aaecf1bf346a7b7c7026
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61656
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-07-28 14:10:33 +00:00
Matthew Poremba
f65f5a8981 gpu-compute,arch-vega: Overhaul HWRegs, setreg, getreg
These instructions are supposed to be read/writing special shader
hardware registers. Currently they are getting/setting to an SGPR. This
results in getting incorrect registers at best and clobbering an SGPR
being used by an application at worst. Furthermore, some registers need
to be set in the shader and the application will never (can never) set
them.

This patch overhauls the getreg/setreg instructions to use different
storage in the shader. The values will be updated either via setreg from
an application (e.g., mode register) or set by a PM4 MAP_PROCESS.

Change-Id: Ie5e5d552bd04dc47f5b35b5ee40a569ae345abac
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61655
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-07-28 14:10:33 +00:00
Matthew Poremba
f2949f3d03 dev-amdgpu: Set PASID in interrupt cookie
The driver uses the pasid to look up events that need to be set in
kfd_signal_event_interrupt (amdkfd/kfd_events.c). Currently this is
uninitialized which causes the function in the driver to return without
doing anything useful.

This changeset initializes the cookie PASID to 0x8000. 0x8000 is always
the first PASID assigned by the driver. This works since gem5 only
supports one GPU process in FS mode. This would have to be changed for
multi-process support, so a comment is added as a reminder.

Change-Id: I7074b581f2f2f346bd910eef15d5f9253ce17e2c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61653
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-07-28 14:10:33 +00:00
Matthew Poremba
23fadb7260 dev-hsa: Don't set _aqlComplete in setRdIdx method
This code is unnecessary as the read index is already correct.
Furthermore, it can cause hangs in some situations where the packet
SHOULD be marked as not complete. This causes a bug where the read index
is incremented by 1 multiple times, causing the packet processor to read
an invalid packet, followed by a hang after it does nothing.

Change-Id: Iceda3c9606e018f60f8902770a2d9762c1c14304
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61650
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-07-28 14:10:33 +00:00
Joël Porquet-Lupine
1c6f57cd6d dev: update LupIO-IPI device to latest specs
The specs for the LupIO-IPI device were recently updated. Instead of
providing a single IPI value for each processor, the device now provides
32 individual IPI bits that can be masked and set.

Update device accordingly in gem5.

Change-Id: Ia47cd1c70e073686bc2009d546c80edb0ad58711
Signed-off-by: Joël Porquet-Lupine <joel@porquet.org>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61530
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-07-21 23:48:04 +00:00
Joël Porquet-Lupine
0800c060d8 dev: Fix cpu/reg decoding logic in multi-instance LupIO devices
The current decoding logic is flawed and complicated to understand.
Using simple division and modulo instead; the compiler is smart enough
to generate efficient code since the divisor is a power of 2.

Change-Id: I95cbb4969e37132343f557e772984a48749731f0
Signed-off-by: Joël Porquet-Lupine <joel@porquet.org>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61529
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2022-07-21 23:47:30 +00:00
Alexandru Dutu
1115f81233 gpu-compute: Fix for HSA queue remapping
When a queue is being remapped the write and dispatch
pointers are set to the read pointer. This assumes that
all packets up to the read pointer have been dispatched
and completed.

Change-Id: I4ed0c6c68f16f57c3fb5c3ecba182a43e74078e2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61429
Reviewed-by: Matt Sinclair <mattdsinclair.wisc@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-07-20 16:26:14 +00:00
Earl Ou
c0ca47b6ed dev: avoid intpin to reset value at binding stage
By design SimObject should initialize its state at init() stage.
However, the original intpin design will try to reset the sink side when
binding. This could cause unexpected issue as the other side does not
init() yet.

To align with the design, the call to upper()/lower() should be left to
the initiator in the init() function instead of constructor.

Change-Id: Iec8b228715d093381a33e747849119562bd634e1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/60751
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2022-06-29 12:42:36 +00:00
Bobby R. Bruce
7b9364b5c0 misc: Merge branch 'release-staging-v22-0' into stable 2022-06-17 20:21:37 -04:00
ntampouratzis
f3e9484969 arch-riscv,dev: Add PCI Host to RISCV Board
Add GenericRiscvPciHost to RISCV Board. In addition, we connect the IGbE_e1000
ethernet card to PCI in order to verify the correct functionality.

To be noticed that we build a new Linux kernel v5.10 (with Bootloader) according to these steps (
https://github.com/gem5/gem5-resources/tree/stable/src/riscv-fs) adding the the PCI and e1000 drivers:

CONFIG_PCI_SYSCALL=y
CONFIG_PCI_STUB=y
CONFIG_PCI_HOST_GENERIC=y
CONFIG_NET_VENDOR_INTEL=y
CONFIG_E1000=y
CONFIG_E1000E=y
CONFIG_IGB=y
CONFIG_NET_VENDOR_I825XX=y

Here you can find the kernel.config and our prebuild kernel to verify the correct behaviour:
https://www.dropbox.com/scl/fo/sz9s37vybpfecbfilxqzz/h?dl=0&rlkey=klkxh33anjqnzwj3sopucqqzx

You can verify it with the following command:
build/RISCV/gem5.fast configs/example/gem5_library/riscv-fs.py

Dear Jason Lowe-Power,

Thank you for your comments! We have addressed all of them.

Best regards,
Nikolaos Tampouratzis

Dear Jason,

I think that it is ok now! :)

Thanks!

Best regards,
Nikolaos Tampouratzis

Change-Id: Id27d84a5588648b82cbfd5c88471927157ae6759
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59969
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-06-06 18:42:12 +00:00
Matthew Poremba
1e6ff02c25 dev-amdgpu: Allow for atomics in SystemHub
It seems that applications can be compiled which issue atomics to host
memory, such as heterosync. Remove the arbitrary assert to disallow them
and issue atomics as a DMA write by default.

Change-Id: I7812a421a9312406b3faccdc05d6c7e9fc837da0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59669
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-05-16 15:08:09 +00:00
Giacomo Travaglini
5d45c50b48 misc: Add VExpress_GEM5_Foundation bootloader
The VExpress_GEM5_Foundation platform cannot use the VExpress_GEM5_V2
bootloader as the GIC has a different memory map

A new tarball has been uploaded to dist.gem5.org with the new bootloader

Change-Id: Ie0c16e623c3323b7be2a333cd6b0ffcf891b7b9b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59392
Tested-by: kokoro <noreply+kokoro@google.com>
2022-05-07 22:40:47 +00:00
Giacomo Travaglini
776321d2c2 dev-arm: GICD_PIDR2.ArchRev value depends on GIC version
The GIC architecture specification states the GICD_PIDR2.ArchRev
field is set to 3 for GICv3 and to 4 for GICv4. We bind this
value to the gicv4 parameter

Change-Id: I3ba34bc0b4538b4d5170915a4ee042e534f2590f
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59391
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2022-05-07 22:40:25 +00:00
Matthew Poremba
54d2438066 dev-amdgpu: Removed hardcoded AQL queue size
The AQL queue size is currently hardcoded to 64kB. For longer running
applications this causes the circular queue to wrap before reaching the
real end of the queue. Add the computation for queue size instead.

Previously longer applications (e.g., bc in pannotia) were hanging
around 4k kernels. With change the application launches 10k+ kernels.

Change-Id: I6c31677c1799a3c9ce28cf4e7e79efcb987e3b7f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59449
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-05-07 03:47:06 +00:00
Giacomo Travaglini
7a9e99400f dev-arm: Gicv3.gicv4 parameter set to False by default
GICv4 features are not currently implemented so it is more natural
to set it to false by default

VExpress_GEM5_V2 platform assumes a GICv4 memory map therefore
sets it to True

Change-Id: Ib4bd17acd56cd029aacf5578ab0259a6ea1bb30c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59390
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2022-05-06 16:29:22 +00:00
Matthew Poremba
c170994676 dev-amdgpu: Fix size issue in interrupt handler
The data allocated for the DMA request used to send an interrupt cookie
was too large. This was causing the memcpy to occasionally seg fault due
to reading past the bounds of the source parameter (the interrupt cookie
struct). Correct the size and add a compile time check to ensure it is
the correct number of bytes expected by the driver.

Change-Id: Ie9757cb52ce8f72354582c36cfd3a7e8a1525484
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58969
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2022-05-05 16:09:49 +00:00
Yu-hsin Wang
1f713320fe dev: Expose ResetRequestPort constructor
Port::Port is in protected scope. ResetRequestPort should expose the
constructor by itself.

Change-Id: I72ce701fca89379f90e212d7411f481ae1e1977a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59209
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
2022-04-29 02:42:54 +00:00
Yu-hsin Wang
8df2ebf43e dev: Add a special reset interface to consolidate reset logic
How to reset a model correctly is very different between models. Take
cpu models for instance, they have different reset pins for different
parts(typically one for each core, one for shared component, one for
debug interface). To make users more easily to reset the model, here we
want to introduce a special reset port. By implementing the port, users
can simply request a whole reset to the model. If users want to do
partial resets, users still can access the raw pins to achieve what they
want.

Change-Id: I746121d16441e021dc3392aeae1a6d9fa33d637a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58810
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-04-26 06:31:51 +00:00
Matthew Poremba
8a53add7f8 dev-amdgpu: Fix frame writes for <32-bit writes
In theory a packet between one and eight bytes can be written to frame
buffer memory from the driver. In gem5 pkt->getLE<utin32_t>() will
assert if the packet size is <32-bits. Change to pkt->getUintX(...) to
fix this issue.

Change-Id: If8554013e4ea7bac90985487991d0bf8bdc765ea
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58852
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-04-21 21:32:53 +00:00
Matthew Poremba
1562251243 dev-amdgpu: Update comments pointing to ROCK repo
It seems the tag name was changed which broke a few links in some
comments pointing to where definitions and struct come from. Update the
URLs and also use consistent version.

Change-Id: I7d6393f1f08d592989999a8a6f9c5bbdf1a9c992
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58471
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-04-08 17:12:32 +00:00
Matthew Poremba
e3f65393fd dev-amdgpu,arch-vega: Implement TLB invalidation logic
Add logic to collect pointers to all GPU TLBs in full system. Implement
the invalid TLBs PM4 packet. The invalidate is done functionally since
there is really no benefit to simulate it with timing and there is no
support in the TLB to do so. This allow application with much larger
data sets which may reuse device memory pages to work in gem5 without
possibly crashing due to a stale translation being leftover in the TLB.

Change-Id: Ia30cce02154d482d8f75b2280409abb8f8375c24
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58470
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-04-08 17:12:32 +00:00
Matthew Poremba
2af227c32a dev-hsa: Update QCntxt readIndex in HW scheduler write
The QCntxt is reused when a queue is unmapped and mapped again. This is
fairly common in GPU full system. If this is not done the readIndex on
the queue context is reset to 1, causing getCommandsFromHost to read
from the wrong slot which is typically an old dispatch packet or an
invalid packet. This causes simulation to stall as the incorrect
completion signal is eventually written.

Change-Id: I65541e559fe04f5eb44b936ca37e3f802262fe6a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57670
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-28 23:24:53 +00:00
Matthew Poremba
6883f12f09 dev-hsa: Properly mask HSA packet header bits
The HSA packet macros were not actually masking the header bits
properly. Add a mask call around the width (number of bits) of the field
being masked.

Change-Id: Ia5e5fb0451296e99a85fb12a5f73b27aea72fc2e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57669
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-28 23:24:53 +00:00
Gabe Black
e6c0ba97db scons: Put all config variables in an env['CONF'] sub-dict.
This makes what are configuration and what are internal SCons variables
explicit and separate, and makes it unnecessary to call out what
variables to export to C++.

These variables will also be plumbed into and out of kconfiglib in later
changes.

Change-Id: Iaf5e098d7404af06285c421dbdf8ef4171b3f001
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56892
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-03-28 20:31:21 +00:00