Commit Graph

1397 Commits

Author SHA1 Message Date
Gabe Black
2e1d24d048 base,dev: Simplify the ListenSocket::accept method.
Remove the nodelay option which is always set to the same thing, and
simplify the logic of the method itself.

Change-Id: I78cd91f99cbaec9abddedbc7dcddc563daedb81f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69158
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
2023-03-22 08:27:23 +00:00
Giacomo Travaglini
e73655d038 misc: Use python f-strings for string formatting
This patch has been generated by applying flynt to the
gem5 repo (ext has been excluded)

JIRA: https://gem5.atlassian.net/browse/GEM5-831

Change-Id: I0935db6223d5426b99515959bde78e374cbadb04
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68957
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-03-16 09:05:29 +00:00
Gabriel Busnot
ba19f967d7 sim: Use ref constructor of MemberEventWrapper everywhere
Change-Id: I77989aa7318142634c771c558293138e7b1e8e51
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67657
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
2023-03-13 06:47:09 +00:00
Gabriel Busnot
1bb8cd3d44 sim: Switch from EventWrapper to MemberEventWrapper before deprec
Change-Id: I25c81787d522a0dd063112b6727669da46e0f0e7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67655
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-03-13 06:47:09 +00:00
Roger Chang
e6604bf109 arch-riscv,dev: Add HiFive Base Platform
This is basic abstract platform and all of RISC-V system should
use platform inherit from HiFiveBase, HiFiveBase declared the common
way to handle interrupt.

Change-Id: I52122e1c82c200d7e6012433c2535c07d427f637
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68199
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-03-02 21:12:46 +00:00
Roger Chang
2209957256 arch-riscv,dev: Add PLIC abstract class to support multiple PLIC
implementation

We should create PLIC abstract and have common interface to let
HiFive platform send and clear interrupt to variable type of PLIC

Change-Id: Ic3a2ffc2a2a002540b400c70c85c3495fa838f2a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68197
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-03-02 21:12:46 +00:00
Roger Chang
9fb5ce5cd3 arch-riscv,dev: Fix behavior issues of PLIC
1. Fix reserved size between enable memory map and threshold memory
map. The number of enablePadding should be the number of context in
PLIC
2. writePriority to memory should update

Change-Id: Ib4b7e5ecd183863e140c4f3382a75057902d446d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68198
Reviewed-by: Ayaz Akram <yazakram@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-02-24 03:56:23 +00:00
hungweihsu
e10be09dcf dev: add method to set initial register value out of constructor.
The initial value of register is set in constructor but there is no
standard way to assign the initial value and default value at the same
time out of that. So we decided to add an extra method to set the
initialValue to current register value. The usecase would be:

reg.get().field1 = val1;
reg.get().field2 = val2;
reg.resetInitialValue();

Change-Id: Ibc5454e2945cc6aff943e6599043edd8ca442f5f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67917
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
2023-02-15 02:07:09 +00:00
Matthew Poremba
ea9239ae09 dev-amdgpu: Update deprecated ports
Change-Id: Icbc5636c33b437c7396ee27363eed1cf006f8882
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67837
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2023-02-14 18:57:33 +00:00
Matthew Poremba
39b5b5e511 dev-amdgpu: Fix address in POLL_REGMEM SDMA packet
The address for the POLL_REGMEM packet should not be shifted when the
mode is 1 (memory). Relevant driver code below is not shifting the
address. The shift is causing a page fault due to the incorrect address.

This changeset removes the shift so the correct address is translated.

https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/blob/
    roc-4.3.x/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c#L903

Change-Id: I7a0ec3245ca14376670df24c5d3773958c08d751
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67877
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-02-14 15:36:56 +00:00
Gabe Black
d1f76741c6 dev: Add a definition for VectorResetResponsePort.
This is just a simple extension of the regular ResetResponsePort, and
is useful if there is a collection of reset pins on a device.

Change-Id: I6ccb21e949d3a51bf8b788ffd23e4b2b02706da9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67576
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
2023-02-09 02:04:10 +00:00
Gabriel Busnot
7f4c92c910 mem,arch-arm,mem-ruby,cpu: Remove use of deprecated base port owner
Change-Id: I29214278c3dd4829c89a6f7c93214b8123912e74
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67452
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
2023-02-03 06:11:45 +00:00
Earl Ou
1b949e9759 dev: terminal: run pollevent in terminal eventq
Change-Id: Idefda0ca1cd71d3e790d470458fa1cd370393c4a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67532
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-02-03 03:40:35 +00:00
Matthew Poremba
0bce2e56d9 dev: Ignore MC146818 UIP bit / Fix x86 Linux 5.11+ boot
As of Linux 5.11, the MC146818 code was changed to avoid reading garbage
data that may occur if the is a read while the registers are being
updated:

github.com/torvalds/linux/commit/05a0302c35481e9b47fb90ba40922b0a4cae40d8

Previously toggling this bit was fine as Linux would check twice. It now
checks before and after reading time information, causing it to retry
infinitely until eventually Linux bootup fails due to watchdog timeout.

This changeset always sets update in progress to false. Since this is a
simulation, the updates probably will not be occurring at the same time
a read is occurring.

Change-Id: If0f440de9f9a6bc5a773fc935d1d5af5b98a9a4b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66731
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2023-01-17 16:43:10 +00:00
Daniel R. Carvalho
5f5aae8940 dev: Remove a couple of deprecated namespaces
These namespaces have gone through the deprecation period
and can now be removed: Sinic, SCMI, Ps2, Regs, Keyboard,
Mouse, TxdOp, iGbReg, CopyEngineReg.

Change-Id: Icfaf458bffca2658650318508c0bb376719cf911
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67370
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-01-17 09:16:20 +00:00
Gabe Black
626e445563 dev: Add a "resetter" callback to the typed register class.
When using the typed register template, most functionality of the class
can be controlled using callbacks. For instance, callbacks can be
installed to handle reads or writes to a register without having to
subclass the template and override those methods using inheritance.

The recently added reset() method did not follow this pattern though,
which has two problems. First, it's inconsistent with how the class is
normally used. Second, once you've defined a subclass, the reader,
writer, etc, callbacks still expect the type of the original class.
That means these have to either awkwardly use a type different from the
actual real type of the register, or use awkward, inefficient, and/or
dangerous casting to get back to the true type.

To address these problems, this change adds a resetter(...) method
which works like the reader(...) or writer(...) methods to optionally
install a callback to implement any special reset behavior.

Change-Id: Ia74b36616fd459c1dbed9304568903a76a4b55de
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67203
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
2023-01-12 05:52:39 +00:00
Giacomo Travaglini
5447d55e39 dev: Fix -Wunused-variable in structured binding
Change-Id: Ia244767dd9d1dd7b72c320fb78e48f206694f5a2
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66891
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2022-12-22 08:07:51 +00:00
Gabe Black
fbd0722de4 fastmodel,dev: Replace the reset port with a Signal*Port<bool>.
The ResetRequestPort and ResetResponsePort have a few problems:

1. A reset signal should happen during the time a reset is asserted,
or in other words the device should stay in reset and not doing
anything while reset is asserted. It should not immediately restart
execution while the reset is still held.

2. These names are misleading, since there is no response. These names
are inherited from other port types where there is an actual response.

There is a new generic SignalSourcePort and SignalSinkPort set of port
classes which are templated on the type of signal they propogate, and
which can be used in place of reset ports in c++. These ports can
still have a specialized role which will ensure that only reset ports
are connected to each other for a form of type checking, although
the underlying c++ instances are more interoperable than that.

Change-Id: Id98bef901ab61ac5b200dbbe49439bb2d2e6c57f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66675
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-12-16 07:19:05 +00:00
Gabe Black
89d5bfca7c fastmodel,dev: Rework the Int*Pin classes with Signal*Port.
These are largely compatibility wrappers around the Signal*Port
classes. The python versions of these types enforce more specific
compatibility, but on the c++ side the Signal*Port<bool> classes can
be used directly instead.

Change-Id: I1325074d0ed1c8fc6dfece5ac1ee33872cc4f5e3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66673
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-12-16 07:19:05 +00:00
Gabe Black
8b1688da34 dev: Introduce a reset() method on RegisterBank and Register classes.
This will make it much easier to implement reset behaviors on devices
which have RegisterBanks in them.

Change-Id: I73fe9874fcb69feed33611a320dcca85c0de2d0e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66671
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Reviewed-by: Jui-min Lee <fcrh@google.com>
2022-12-16 07:19:05 +00:00
Roger Chang
dd04e70445 arch-riscv: Implement rv32 zicsr extension
1. Add misc register mstatush, cycleh, timeh, instreth,
   hpmcounter03...hpmcounter31, pmpcfg1, pmpcfg3
2. Implement handling RV32 only registers
3. Implement methods of set time CSR

Change-Id: I5c55c18a0da91977d6e23da24ea3cbcba9f0509b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65733
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-12-13 05:21:27 +00:00
Gabe Black
9d1cc1bcc9 dev: Add an offset checking mechanism to RegisterBank.
When adding a long list of registers, it can be easy to miss one which
will offset all the registers after it. It can be hard to find those
sorts of problems, and tedious and error prone to fix them.

This change adds a mechanism to simply annotate what offset a register
should have. That should also make the register list more self
documenting, since you'll be able to easily see what offset a register
has from the source without having to count up everything in front of it.

Change-Id: Ia7e419ffb062a64a10106305f875cec6f9fe9a80
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66431
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-12-06 13:27:33 +00:00
Giacomo Travaglini
ed6cf2eced dev-arm: Allow GICv3 to be externally(publicly) updated
Change-Id: Ifa7b745ea11e74c17024c22ae993b6103eecb744
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66271
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-12-05 10:50:42 +00:00
Giacomo Travaglini
0df37a33f6 arch-arm: Setup TC/ISA at construction time 2nd attempt
This partly reverts commit ec75787aef
by fixing the original problem noted by Bobby (long regressions):

setupThreadContext has to be implemented otherswise the GICv3 cpu interface
will end up holding old references when switching TC/ISAs.

This new implementation is still setting up the cpu interface reference
in the ISA only when it is required, but it is storing the
TC/ISA reference within the interface every time the ISA::setupThreadContext
gets called.

Change-Id: I2f54f95761d63655162c253e887b872f3718c764
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65931
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
2022-12-04 20:02:10 +00:00
Matthew Poremba
eee42275ee dev-amdgpu: Writeback RLC queue MQD when unmapped
Currently when RLC queues (user mode queues) are mapped, the read/write
pointers of the ring buffer are set to zero. However, these queues could
be unmapped and then remapped later. In that situation the read/write
pointers should be the previous value before unmapping occurred. Since
the read pointer gets reset to zero, the queue begins reading from the
start of the ring, which usually contains older packets. There is a 99%
chance those packets contain addresses which are no longer in the page
tables which will cause a page fault.

To fix this we update the MQD with the current read/write pointer values
and then writeback the MQD to memory when the queue is unmapped. This
requires adding a pointer to the MQD and the host address of the MQD
where it should be written back to. The interface for registering RLC
queue is also simplified. Since we need to pass the MQD anyway, we can
get values from it as well.

Fixes b+tree and streamcluster from rodinia (when using RLC queues).

Change-Id: Ie5dad4d7d90ea240c3e9f0cddf3e844a3cd34c4f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65791
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-12-01 21:04:05 +00:00
Hoa Nguyen
eac06ad681 python: Fix multiline quotes in a single line
An example case,
```python
mem_side_port = RequestPort(
    "This port sends requests and " "receives responses"
)
```

This is the residue of running the python formatter.
This is done by finding all tokens matching the regex `"\s"(?![.;"])`
and manually replacing them by empty strings.

Change-Id: Icf223bbe889e5fa5749a81ef77aa6e721f38b549
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66111
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-11-29 23:44:38 +00:00
Bobby R. Bruce
ec75787aef arch-arm: Revert 'Setup TC/ISA at construction time..'
Reverts:

dd2f1fb2f8
https://gem5-review.googlesource.com/c/public/gem5/+/65174

and

47bd56ee71
https://gem5-review.googlesource.com/c/public/gem5/+/65291

The 47bd56ee change resulted in the
`SuiteUID:tests/gem5/fs/linux/arm/test.py:realview-switcheroo-noncaching-timing-ALL-x86_64-opt`
nightly test stalling. This behavior can be reproduced with:

```
./build/ALL/gem5.opt tests/gem5/fs/linux/arm/run.py tests/gem5/configs/realview-switcheroo-noncaching-timing.py tests/gem5/resources/arm “$(pwd)”
```

The subsequent change, dd2f1fb2, must be reverted for this change to be
reverted.

Change-Id: I6fed74f33d013f321b93cf1a73eee404cb87ce18
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65732
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-11-18 17:46:09 +00:00
Matthew Poremba
33a36d35de dev-amdgpu: Store SDMA queue type, use for ring ID
Currently the SDMA queue type is guessed in the trap method by looking
at which queue in the engine is processing packets. It is possible for
both queues to be processing (e.g., one queue sent a DMA and is waiting
then switch to another queue), triggering an assert.

Instead store the queue type in the queue itself and use that type in
trap to determine which ring ID to use for the interrupt packet.

Change-Id: If91c458e60a03f2013c0dc42bab0b1673e3dbd84
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65691
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-11-18 15:30:37 +00:00
Matthew Poremba
623e2d3dac dev-amdgpu: Handle ring buffer wrap for PM4 queue
Change-Id: I27bc274327838add709423b072d437c4e727a714
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65431
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-11-09 15:47:50 +00:00
Matthew Poremba
c8d687b05c dev-amdgpu: Fix SDMA ring buffer wrap around
The current SDMA wrap around handling only considers the ring buffer
location as seen by the GPU. Eventually when the end of the SDMA ring
buffer is reached, the driver waits until the rptr written back to the
host catches up to what the driver sees before wrapping around back to
the beginning of the buffer. This writeback currently does not happen at
all, causing hangs for applications with a lot of SDMA commands.

This changeset first fixes the sizes of the queues, especially RLC
queues, so that the wrap around occurs in the correct place. Second, we
now store the rptr writeback address and the absoluate (unwrapped) rptr
value in each SDMA queue. The absolulte rptr is what the driver sends to
the device and what it expects to be written back.

This was tested with an application which basically does a few hundred
thousand hipMemcpy() calls in a loop. It should also fix the issue with
pannotia BC in fullsystem mode.

Change-Id: I53ebdcc6b02fb4eb4da435c9a509544066a97069
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65351
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-11-09 04:11:35 +00:00
Giacomo Travaglini
47bd56ee71 dev-arm: Setup TC/ISA at construction time of Gicv3CPUInterface
We should initialize them as soon as possible to make sure
any Gicv3CPUInterface method uses a valid reference

Change-Id: I8fffebdab9136a9027c4f61bb9413e97031e1969
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65291
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2022-11-04 21:25:37 +00:00
Matthew Poremba
489074fbfd dev-amdgpu: Fix issues with PM4 queue map, fences
The PM4 release_mem packet is used as a DMA fence in the driver. It
specifies which queue the interrupt came from by encoding the me, pipe,
and queue fields from the map_queue packet into the interrupt ring ID.
Currently these fields are incorrect because (1) the order in the
bitfield is backwards, (2) the queue constructor assigns a pointer to
the PM4MapQueue packet containing this data to the dmaBuffer which gets
deleted in short order, and (3) the order of the encoding of ring ID is
incorrect.

This change fixes these issues by (1) placing the struct vales in
correct order, (2) creating a const copy of the dmaBuffer on
construction, and (3) using the ring ID encoding expected by the driver:
https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/blob/roc-4.3.x/
     drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c#L5989

Change-Id: I72c382980e57573f8a8a6879912c4139c7e2f505
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65095
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-11-01 15:34:17 +00:00
Matthew Poremba
c5feca8251 dev-amdgpu: Rework PM4 NOP packet
The PM4 NOP header is used to insert spaces in the PM4 ring and can
therefore be any size. This includes zero. A size of zero is denoted by
a value of 0x3fff in the NOP packet header. Currently we assume this
means the remainder of the PM4 queue up to the wptr is empty/NOPs. This
is not always true.

This changeset reworks the PM4 NOP packet to handle the value of 0x3fff
as a special value and advances the rptr by 0 bytes. This fixes issues
where there were additional packets in the queue which were being
skipped over by fast forwarding. Since those packets could be anything,
that leads to undefined behavior afterwards.

Change-Id: I3f5c3f4b7dd50f93ba503fea97454a9d41771e30
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65094
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-11-01 15:34:08 +00:00
Matthew Poremba
752b696883 dev-amdgpu: Fix SDMA trap ring ID, context
SDMA traps are used in the driver as a DMA fence. To pass a fence, the
SDMA sends the driver the interrupt context from a trap packet and the
ring ID which specifies which queue in the SDMA engine is passing a
fence. Currently the interrupt context is using the wrong value in the
packet and the ring ID is hard-coded to always be the gfx queue.

This changeset uses the correct interrupt context from the SDMA packet
and sets the ring ID to either 0 if the gfx queue is currently being
processed or 3 if the page queue is being processed.

The relevant interrupt service routine in the driver can be found at:
https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/blob/roc-4.3.x/
    drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c#L2129

Change-Id: Ie4a4a9d6ab1d3bf83bf76bb57a02a91100217b51
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65093
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-11-01 15:34:08 +00:00
Matthew Poremba
8899291db6 dev-amdgpu: Fix interrupt handler address assignment
The interrupt handler's base address is sent via MMIO and must be
shifted by 8 bits to convert to a byte address. The current code is
shifting the MMIO dword first then assigning, resulting in the top 8
bits being shifted out.

This changeset fixes the issue by assigning the dword to the 64-bit
address first then shifting after. Similarly, the upper dword is cast to
a 64-bit value first before shifting.

This fixes some "fence fallback timeout" errors in the m5term output.
These timeouts become a problem because the driver will reset after a
few hundred of them, killing any running GPU applications as part of the
process.

Change-Id: I0beec313f533765c94063bcf4de8c65aacf2986b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65092
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-11-01 15:34:08 +00:00
Matthew Poremba
144ce7f12c dev-amdgpu: Fix GART PTE size
The GART table is a legacy 1-level page table primarily used for
supervisor mode accesses to GPUs. The PTE size is 64-bits, not 32-bit.
This causes memory sizes >3GB (in X86) to fail loading amdgpu driver.

This changeset fixes the issue by setting the GART mappings to the
correct data type.

Change-Id: Ibfba2443675fe28316d26afa5f1a14885fdce40c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65091
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-10-31 14:40:30 +00:00
Matthew Poremba
7b16b17e61 dev-amdgpu: Chunkify SDMA copies that use device memory
The current implementation of SDMA copy calls the GPU memory manager's
read/write method one time passing a physical address as the
source/destination. This implicitly assumes the physical addresses are
contiguous which is generally not true for large allocations. This
results in reading from/writing to the wrong address.

This changeset fixes the problem by copying large copies in chunks of
the minimum possible page size on the GPU (4kB). Each page is translated
seperately to ensure the correct physical address. The final copy "done"
callback is only used for the last transfer. The transfers should
complete in order so the copy command will not complete until all chunks
have been copied. Tested and verified on an application with a large
allocation (~5GB).

Change-Id: I27018a963da7133f5e49dec13b0475c3637c8765
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64752
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-10-31 14:30:24 +00:00
Giacomo Travaglini
506bd9d9e7 dev-arm: Use ThreadContext instead if ISA in GICV3 cpu interface
Some CPU wrappers like the Fastmodel one do extend the
ThreadContext interface in order to retrieve system register
state... By bypassing the TC interface and by using the ISA
instead, we are basically forcing users to extend the ISA
as well to intercept these calls.

So with this patch we are making sure every system register is accessed
(like HCR_EL2 or SCR_EL3) through the thread context. This of course
does not apply to the CPU interface registers as we still use the ISA
storage for them.  In the future we should probably move that storage
from the ISA class to the Gicv3CPUInterface class itself

This is also simplifying Gicv3CPUInterface::isEL3OrMon:
currEL already covers the AArch32 case so no need to
differentiate between AArch32 and AArch64

Change-Id: I446a14a6e12b77e1a62040b3422f79ae52cc9eec
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64913
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2022-10-27 15:33:18 +00:00
Giacomo Travaglini
9a9de78811 dev-arm: Implement System Security Control registers
This block of system registers is part of the N1 SDP [1]

[1]: https://developer.arm.com/documentation/101489/0000/\
    Programmers-model/System-Security-Control-registers

Change-Id: I2ecf5cd247bd68eddcd359e91f3954070dbffaa8
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64951
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2022-10-27 10:33:38 +00:00
Giacomo Travaglini
4db981576e arch-arm: Setup ThreadContext in GICv3 cpu interface
Change-Id: If019b4b114031f880dff43e05658a162c201ea6a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64912
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-10-27 10:31:10 +00:00
Matthew Poremba
a648be2338 dev-amdgpu: Add an SDMA data debug flag
This debug flag is used to print spammy SDMA DPRINTFs, such as an SDMA
copy printing the data of large transfers 8 bytes per line at a time. For
those prints, the SDMAEngine flag will now only print the first and last
qword of the transfer and the new SDMAData flag is needed for verbose
data printing. This makes the SDMAEngine flag still useful for verifying
copies in applications with predictable data such as square.

Additionally, the memory allocation/deallocation done solely for a print
statement is removed in favor of casting the data to the printed type.

Change-Id: I18c1918ef9085cca4570f79881ee63d510ccc32f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64452
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-10-13 20:17:00 +00:00
Earl Ou
317bfd62bd dev: fix device number check error in IDE controller
Fixed a typo between 3 and 4.

Change-Id: I1470e30c4d472587db0b9da5512b24ab92f1fd65
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64052
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
2022-10-04 01:21:37 +00:00
Giacomo Travaglini
7c0ab07ee2 dev-arm: Fix GICv3 GICD_ITARGETSR address range
According to the GICv3 manual, GICD_ITARGETSR address range goes from
0x0800 to 0x0c00 (as already implemented in the GICv2 model [1])

[1]: https://github.com/gem5/gem5/blob/v22.0.0.0/\
    src/dev/arm/gic_v2.cc#L64

Change-Id: I064e91d070d1a7b79f41a06ffd2197e4c07dae32
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64074
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-10-03 17:45:10 +00:00
Matthew Poremba
2f1d67f8fe dev-amdgpu: Remove cached copy of device memory
This map was originally used for fast access to the GART table. It is no
longer needed as the table has been moved to the AMDGPUVM class. Along
with commit 12ec5f9172 which reads
functionally from device memory, this table is no longer needed and is
essentially a duplicate copy of device memory for anything written over
the PCI BAR.

This changeset removes the map entirely which will reduce the memory
footprint of simulations and potentially avoid stale copies of data
when reading over the PCI BAR.

Change-Id: I312ae38f869c6a65e50577b1c33dd055078aaf32
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63951
Reviewed-by: Matt Sinclair <mattdsinclair.wisc@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2022-10-01 14:04:45 +00:00
Matthew Poremba
b623d26543 dev-amdgpu: Fix interrupt call for release mem
Both the client id and source id are incorrect for the release mem CP
packet. This changeset sets both to the correct value and adds asserts
that the value is declared in the client ID and source ID enums.

Change-Id: I4cc6c3a5f2a482e8f7dcd2a529c4a69bf71742c0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63177
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
6c935657fd dev-amdgpu: Implement SDMA atomic packet
SDMA atomic packets are used in conjunction with RLC queues in SDMA for
synchronization similar to how HSA signals are used with BLIT kernels
when SDMA is disabled. Implement a skeleton of the SDMA atomic packet
methods as well as the atomic add64 operation.

The atomic add operation appears to be the only operation used in ROCm,
so this implementation is fairly complete. See:

https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/
    rocm-4.2.x/src/core/runtime/amd_blit_sdma.cpp#L880

Change-Id: I62cc337f2ffe590bdb947b48053760ee8b3a6f32
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63174
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
9ea28bd782 dev-amdgpu: Implement SDMA RLC queue unmapping
The unmap queues packet specifies all non-static queues should be
unmapped which includes RLC queues in the SMDA. This functionality did
not exist before and is added in this changeset.

Fixes bug with rodinia_3.0/hip/bfs.

Change-Id: I80ca8cf8d89559625b5870745889b0a27916635e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63173
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
af4251f6ae dev-amdgpu: Rework SDMA RLC queue data structure
There can only ever be two RLC queues maximum. Use this information for
a simpler data structure to store doorbell information. The patch
changes the std::unordered_map previously used to std::array. This will
also be useful in avoiding erase-while-iterating issues needed to
unregister all queues at once.

Change-Id: I95600e40de51cb1a992a20bcebaf7580ea4d0be8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63172
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
12ec5f9172 dev-amdgpu: Rework framebuffer reads
Previously framebuffer reads would try reading from MMIO trace, special
addresses, and then anything previously written to a special address
range. This does not handle direct large BAR reads, causing incorrect
results in some applications that were doing this.

Rework the readFramebuffer method to do the following. Remove the MMIO
trace read altogether, as there were not any framebuffer reads from the
trace to begin with. Read special addresses first to avoid overwriting
by previous writes. Next read previous writes to special ranges. The
special range is the GART table. These are required for functional
translations. Lastly read from the device memory directly. This does a
functional read required by the PCIDevice read method which is
non-timing. Reading from device memory is preferred over the map type
used for GART to avoid duplication of a potentially huge amount of data.

With this changeset all but one of the HIP samples and HIP examples
applications now run and pass verification of results.

Change-Id: Id3b788bfc5eaf17cfa1897f25d26f3725d4db321
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63171
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2022-09-09 04:13:49 +00:00
Matthew Poremba
4b35693bd2 dev-amdgpu: Forward RLC queue doorbells
Forward user queue doorbells to the SDMA. This is the final step needed
to enable RLC (user) queues to replace BLIT kernels.

Change-Id: I0c2ef70bb5414b82785ef437dd65d6c57798d24f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63033
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-09-09 04:13:49 +00:00