Global instructions in Vega can either use a VGPR base address plus
instruction offset or SGPR base address plus VGPR offset plus
instruction offset. Currently the VGPR address/offset is always read as
two dwords. This causes problems if the VGPR number is the last VGPR
allocated to a wavefront since the second dword would be beyond the
allocation and trip an assert.
This changeset sets the operand size of the VGPR operand to one dword
when SGPR base is used and two dwords otherwise so initDynOperandInfo
does not assert. It also moves the read of the VGPR into the calcAddr
method so that the correct ConstVecOperandU## is used to prevent another
assertion failure when reading from the register file. These two changes
are made to all flat instructions, as global instructions are a
subsegement of flat instructions.
Change-Id: I79030771aa6deec05ffa5853ca2d8b68943ee0a0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67077
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
DPP processing has several issues which are fixed in this changeset:
1) Incorrect comment is updated
2) newLane calculation for shift/rotate instructions is corrected
3) A copy of original data is made so that a copy of a copy is not made
4) Reset all booleans (OOB, zeroSrc, laneDisabled) after each lane
iteration
The shift, rotate, and broadcast variants were tested by implementing
them in assembly and running on silicon.
Change-Id: If86fbb26c87eaca4ef0587fd846978115858b168
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66752
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
The bitfield extract instructions come in unsigned and signed variants.
The documentation on this is not correct, however the GCN3 documentation
gives some clues. The instruction should extract an N-bit integer where
N is defined in a source operand starting at some bit also defined by a
source operand. For signed variants of this instruction, the N-bit
integer should be sign extended but is currently not.
This changeset does sign extension using the runtime value of N by ORing
the upper bits with ones if the most significant bit is one. This was
verified by writing these instructions in assembly and running on a real
GPU. Changes are made to v_bfe_i32, s_bfe_i32, and s_bfe_i64.
Change-Id: Ia192f5940200c6de48867b02f709a7f1b2daa974
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66751
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
The ResetRequestPort and ResetResponsePort have a few problems:
1. A reset signal should happen during the time a reset is asserted,
or in other words the device should stay in reset and not doing
anything while reset is asserted. It should not immediately restart
execution while the reset is still held.
2. These names are misleading, since there is no response. These names
are inherited from other port types where there is an actual response.
There is a new generic SignalSourcePort and SignalSinkPort set of port
classes which are templated on the type of signal they propogate, and
which can be used in place of reset ports in c++. These ports can
still have a specialized role which will ensure that only reset ports
are connected to each other for a form of type checking, although
the underlying c++ instances are more interoperable than that.
Change-Id: Id98bef901ab61ac5b200dbbe49439bb2d2e6c57f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66675
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These are largely compatibility wrappers around the Signal*Port
classes. The python versions of these types enforce more specific
compatibility, but on the c++ side the Signal*Port<bool> classes can
be used directly instead.
Change-Id: I1325074d0ed1c8fc6dfece5ac1ee33872cc4f5e3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66673
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This partly reverts commit ec75787aef
by fixing the original problem noted by Bobby (long regressions):
setupThreadContext has to be implemented otherswise the GICv3 cpu interface
will end up holding old references when switching TC/ISAs.
This new implementation is still setting up the cpu interface reference
in the ISA only when it is required, but it is storing the
TC/ISA reference within the interface every time the ISA::setupThreadContext
gets called.
Change-Id: I2f54f95761d63655162c253e887b872f3718c764
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65931
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Various changes to support rv32:
1. Add riscv_bits field into RiscvISA to switch rv_type
2. Add rv_type field into ExtMachInst
3. Split various constants into rv32/rv64 version
4. Fix mcause/mstatus/misa setting per rv_type
5. Split RiscvCPU into rv32/rv64
6. Fix how reset/branch create new pc so rv_type is preserved
7. Tag gdb-xml only for rv64
TODO:
Add rv32 gdb-xml
Add rv32 implementation into decoder
Currently there're three places where we store the rv_type information
(1) ISA (2) PCState (3) ExtMachInst. In theory, the ISA should be the
source of truth, and propagates information into PCState, then Inst.
However, there is an API on RiscvProcess that let users modify the
rv_type in PCState, so there's a chance to get inconsistent rv_type. We
should either modify the structure so such kind of usage is well
supported, or just prohibit people from setting a different rv_type.
Change-Id: If5685ae60f8d18f4f2e18137e235989e63156404
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63091
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
An example case,
```python
mem_side_port = RequestPort(
"This port sends requests and " "receives responses"
)
```
This is the residue of running the python formatter.
This is done by finding all tokens matching the regex `"\s"(?![.;"])`
and manually replacing them by empty strings.
Change-Id: Icf223bbe889e5fa5749a81ef77aa6e721f38b549
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66111
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Per RISC-V ISA Manual, vol II, section 3.1.6.6, page 26, the SD bit is
a read-only bit indicating whether any of FS, VS, and XS fields being
in the respective dirty state.
Per section 3.1.6, page 20, the SD bit is the most significant bit of
the mstatus register for both RV32 and RV64.
Per section 3.1.6.6, page 29, the explicit formula for updating the SD is,
SD = ((FS==DIRTY) | (XS==DIRTY) | (VS==DIRTY))
Previously in gem5, this bit is not updated anywhere in the gem5
implementation. This cause an issue of incorrectly saving the context
before entering the system call and consequently, incorecttly restoring
the context after a system call as described here [1].
Ideally, we want to update the SD after every relevant instruction;
however, lazily updating the Status register upon its read produces
the same effect.
[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272/
Change-Id: I1db0cc619d43bc5bacb1d03f6f214345d9d90e28
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65273
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Per RISC-V ISA Manual, vol II, section 3.1.6.6, page 25, the
FS field of the mstatus register encodes the status of the floating
point unit, including the floating point registers. Per page 27,
microarchitecture can choose to set the FS field to Dirty even if
the floating point unit has not been modified.
Per section 3.1.6, page 20, the FS field is located at bits 14..13
of the mstatus register.
Per section 3.1.6.6, page 27, the FS field is used for saving
context.
Upon a system call, the Linux kernel relies on mstatus for
choosing registers to save for switching to kernel code.
In particular, if the SD bit (updating this bit is also a bug
in gem5 and will be explained in the next commit) is not set
properly due to the FS field being incorrect, the process of saving
the context and restoring the context result in the floating
point registers being zeroed out. I.e., upon the saving context
function call, the floating point registers are not saved, while
in restore context function call, the floating point registers
are overwritten with zero bits.
Previously, in gem5 RISC-V ISA, the FS field is not updated upon
floating point instruction execution. This caused issue on context
saving described above.
This change conservatively updates the FS field to Dirty on
the execution of any floating point instruction.
Change-Id: I8b3b4922e8da483cff3a2210ee80c163cace182a
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65272
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Vega allows for any integer multiple of 4kB pages. However, the current
implementation is designed for 4kB page primarily. In order to support
variable page sizes, the physical address calculation needs to be
updated to add the virtual page offset to the base physical address
rather than bitwise-OR. Bitwise-OR assumes physical pages are at
aligned to the page size which is generally not the case for very
large pages (1GB+).
This changeset changes all of the physical address computations to add
the virtual offset to the physical page address. This fixes many GPUFS
applications which use larger pages. The support was tested by
hipMalloc'ing ~5GB to induce a large page being created. The test
application now passes verification with this change.
Change-Id: Ic8d1475e001def443f3e4ab609449bca0c40b638
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64751
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch is adding an outfile parameter to the TarmacTracer
This has 3 options:
1) stdoutput = dump to standard output (default behaviour)
2) stderror = dump to standard error
3) file = dump to a file. As there is one tracer per CPU,
this means every CPU will dump its trace to a different file,
named after the tracer name (e.g. cpu0.tracer, cpu1.tracer)
It is still possible to redirect to a file with option 1 and 2
thanks to common bash redirection. What the third option is
really buying us is the capability to dump CPU traces on
separate files, and to separate the trace output from the debug-flag
output
Change-Id: Icd2bcc721f8598d494c9efabdf5e092666ebdece
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63892
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
For SVE it is possible to override the run-time vector length (VL) per
exception level by setting the value in the appropriate ZCR_ELx
registers. In general instructions query the appropriate registers
during execute() to determine the actual vector length. The exception
to this rule are the SVE Macromem instructions which use the VL to
determine the number of micro-ops to crack into during
decode. However, as there is no available ExecContext during the
decode stage these instructions rely on a cached value stored in the
decoder.
Previously we were updating the VL in the decoder using potentially
stale values of ZCR_ELx by calling the update before actually setting
the registers themselves. We now set the registers before updating the
decoder.
Change-Id: I0167095699f7f950ee99fc42c7c8292fe8938d28
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64331
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
This change makes the default vendor string AuthenticAMD.
GLIBC now is much more strict about checking for the current system's
supported features. In Ubuntu 22.04, when trying to load a dynamically
linked file, the CPUID is checked for the required features. If they are
not there, an error saying ISA level too low is returned and the program
crashes.
The underlying issue is that GLIBC does not check and populate the
cpu_feature data structure if it does not detect a *known* CPU model.
The options are hardcoded. See the following file for the glibc code.
glibc/sysdeps/x86/cpu-features.c
Note that the cpu_features is not populated with the
COMMON_CPUID_INDEX_1 unless there is a known family, which is only set
if the vendor string matches a known vendor.
This change uses AuthenticAMD instead of the alternatives because the
checks in glibc are most simple (no special cases) for AuthenticAMD in
the init_cpu_features functions.
GLIBC has been unable to populate the cpu_features datastructure
correctly with gem5 for a long time. However, this has just now become a
problem for us because the library now is more strict on not allowing
code to execute unless the processor meets certain minimum requirements.
I believe the commit for GLIBC which caused this breakage is
ecce11aa0752735c4fd730da6e7c9e0b98e12fb8
See https://sourceware.org/pipermail/binutils/2020-October/113593.html
for more details on that commit.
Change-Id: I8eedb46f577361e749ad8d0adda4fd0753e99960
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64831
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
The conversion logic between gem5 register id and iris resouce id is
duplicated in read and write function. Some of them also doesn't handle
the invalid id correctly. This change wraps the logic, fixes them, and
improves the debug messages by printing the register names.
Change-Id: I093d05f5f06d804d5f01988c2a7ffa60244c5516
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64651
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabeblack@google.com>