This patch is adding the NS bit to CHI requests to make sure they are
properly tagged according to their security
Change-Id: I33d3610edefbb5a05a6090e9125c35d4fb8bca58
Reviewed-by: Tiago Muck <tiago.muck@arm.com>
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The extended control registers were not being updated in the KVM thread
context nor updated in the KVM state. This was causing issues when
checkpointing since the XCR0 value was reverting to the default value
rather than what it was previously before the checkpoint. THis was
causing multiple applications to crash due to executing instructions
which are now illegal instructions due to XCR0 being incorrect.
This commit adds the XCR0 as a misc register similar to the exiting x86
control registers and adds all of the helper functions to access and set
the register value. It also adds support for updating the KVM CPU's
state with the register value and updating the thread context's misc reg
value so that it is checkpointed along with the other misc regs.
Note that this does *not* add support for XSAVE of the AVX state (i.e.,
the upper 128 bits of YMM registers). It does however fix the immediate
problem in issue #958 .
Change-Id: I97456c8b57cbc7b381bd4be94944ce6567a43c76
Includes fixes for several bugs reported via email, self found, and
internal reports. Also includes runs through Valgrind and UBsan. See
individual commits for more details.
The scalar cache is not being invalidated which causes stale data to be
left in the scalar cache between GPU kernels. This commit sends
invalidates to the scalar cache when the SQC is invalidated. This is a
sufficient baseline for simulation.
Since the number of invalidates might be larger than the mandatory queue
can hold and no flash invalidate mechanism exists in the VIPER protocol,
the command line option for the mandatory queue size is removed, which
is the same behavior as the SQC.
Change-Id: I1723f224711b04caa4c88beccfa8fb73ccf56572
The implementation of SYS_FLEN was missing, which caused picolibc to
treat this file as not implemented. Additionally, there was a bug in the
SYS_READ call that was comparing the wrong variable against the passed
buffer length. It was comparing the current file position against the
buffer length instead of the number of written bytes. Finally, pos was
unititialized which could result in spurious errors.
Change-Id: I8b487a79df5970a5001d3fef08d5579bb4aa0dd0
This prints what values are read/written to LDS and the previous value
on write. This is useful for debugging problems with LDS instructions.
Change-Id: I30063327bec1a1a808914a018467d5d78d5d58b4
SDMA ptePde packets are generating a warning that a GART address is
missing, causing a wrong address to be clobbered by the operation.
This commit fixes this by converting the GART address when the queue is
running in privledged mode, which is the only mode allowed to use GART
addresses. This removes the warnings and writes to the correct memory
region.
Change-Id: I64acac308db2431c5996b876bf4cda704f51cf25
The VOP3 instruction encoding generally states that ABS/NEG modifiers in
the instruction encoding are only valid on floating point data types.
This is currently coded in gem5 to mean floating point *instructions*.
For untyped instructions like V_CNDMASK_B32, we don't actually know what
the data type is. We must trust that the compiler did not attempt to
apply these bits to non-FP data types.
This commit simply removes the asserts. The ABS/NEG modifiers are
therefore ignored which is consistent with the ISA documentation.
This is done on the lane manipulation instructions V_CNDMASK_B32,
V_READLINE_B32, and V_WRITELANE_B32 which are typically used to mask off
or move data between registers. Other bitwise instructions (e.g.,
V_OR_B32) keep the asserts as bitwise operations on FP types are
genernally illegal in languages like C++.
Change-Id: I478c5272ba96383a063b2828de21d60948b25c8f
Fixes several memory leaks, mostly of small and medium severity. Fixes
mismatched new/new[] and delete/delete[] calls.
Change-Id: Iedafc409389bd94e45f330bc587d6d72d1971219
The address has one too many zeros and is therefore placed in a memory
region usually used for system memory. As a result this causes failure
when trying to run a simulation with a huge amount of memory.
Change the address to be within the C000'0000h - FFFF'FFFFh X86 I/O hole
as was intended.
Change-Id: I5d03ac19ea3b2c01a8c431073c12fa1868b3df24
Three main fixes:
- Remove the initDynOperandInfo. UBSAN errors and exits due to things
not being captured properly. After a few failed attempts playing with
the capture list, just move the lambda to a new method.
- Invalid data type size for some thread mask instructions. This might
actually have caused silent bugs when the thread id was > 31.
- Alignment issues with the operands.
Change-Id: I0297e10df0f0ab9730b6f1bd132602cd36b5e7ac
From sepc
> Instructions that access a non-existent CSR are reserved. Attempts to
access a CSR without appropriate privilege level raise
illegal-instruction exceptions or, as described in Section 13.6.1,
virtual-instruction exceptions. Attempts to write a read-only register
raise illegal-instruction exceptions. A read/write register might also
contain some bits that are read-only, in which case writes to the
read-only bits are ignored.
Setting the bit not in the mask should be ignore rather than raise the
illegal exception. The unmask bits of xstatus CSR are `WPRI`, the
unmasks bits of xie are `RO`(above priv v1.12) or `WPRI`(priv v1.11 and
priv v1.10), the unmask bits of xip CSR are `RO`(above priv v1.12) or
`WPRI`(priv v1.11) or `WIRI` (priv v1.10).
Note: The workload of `riscv-ubuntu-20.04-boot` uses the priv v1.10.
More details please see the `RISC-V spec: Privileged Architecture`
v1.10:
https://github.com/riscv/riscv-isa-manual/releases/tag/riscv-priv-1.10
v1.11(20190608):
https://github.com/riscv/riscv-isa-manual/releases/tag/Ratified-IMFDQC-and-Priv-v1.11
v1.12(20211213):
https://github.com/riscv/riscv-isa-manual/releases/tag/Priv-v1.12
Change-Id: I5d6e964e99b30b71da3dc267cd1575665d922633
Currently, the filesRootDir is prepended for all paths that do not start
with '/'. However, we should not be doing this for the special files :tt
and :semihosting-features. Noticed this while testing semihosting with a
non-empty filesRootDir.
Change-Id: I156c8b680cb71cdc88788be3b0e93fc1d52e11e5
The implementation of SYS_FLEN was missing, which caused picolibc to
treat this file as not implemented. Additionally, there was a bug in
the SYS_READ call that was comparing the wrong variable against the
passed buffer length. It was comparing the current file position against
the buffer length instead of the number of written bytes.
Finally, pos was unititialized which could result in spurious errors.
Change-Id: I8b487a79df5970a5001d3fef08d5579bb4aa0dd0
This PR added RISC-V Integer Conditional Operations Extension, which is
in the RVA23U64 Profile Mandatory Base. And the performance of
conditional move instructions in micro-architecture is an interesting
point to explore.
Zicond instructions added: czero.eqz, czero.nez
Changes based on spec:
https://github.com/riscvarchive/riscv-zicond/releases/download/v1.0.1/riscv-zicond_1.0.1.pdf
As discussed in https://github.com/orgs/gem5/discussions/954:
In the refactor made by commit f65df9b959 conditional indirect
branches are no longer updated in the indirect predictor.
This kind of branches do not exist in x86 neither arm, but they are
present in PowerPC.
This patch, enables the indirect predictor to track this kind of
branches.
The extended control registers were not being updated in the KVM thread
context nor updated in the KVM state. This was causing issues when
checkpointing since the XCR0 value was reverting to the default value
rather than what it was previously before the checkpoint. THis was
causing multiple applications to crash due to executing instructions
which are now illegal instructions due to XCR0 being incorrect.
This commit adds the XCR0 as a misc register similar to the exiting x86
control registers and adds all of the helper functions to access and set
the register value. It also adds support for updating the KVM CPU's
state with the register value and updating the thread context's misc reg
value so that it is checkpointed along with the other misc regs.
Note that this does *not* add support for XSAVE of the AVX state (i.e.,
the upper 128 bits of YMM registers). It does however fix the immediate
problem in issue #958 .
A checkpoint upgrader is also provided to add the default value of XCR0
if the checkpoint tag is missing.
Change-Id: I97456c8b57cbc7b381bd4be94944ce6567a43c76
In the payload event queue in Gem5ToTlmBridgeBase, the phase is checked
twice for BEGIN_RESP. This commit removes the second if clause since it
is unnecessary.
Duplicate if clause in line 234 & line 256
dd2689905f/src/systemc/tlm_bridge/gem5_to_tlm.cc (L234-L267)
please correct me if I am missing something important
Fix#1055. Prioritize committing from exiting threads before we consider
other threads using the specified SMT commit policy. All instructions in
the ROB for exiting threads should already have been squashed. Thus,
this ensures that the ROB instruction queues for all exiting threads
will be empty at the end of the current cycle, avoiding the assertion
failure encountered in #1055.
Change-Id: Ib0178a1aa6e94bce2b6c49dd87750e82776639dc
Fix#1042. Clear the current fetch macro-op if the instruction
initiating the squash is the last micro-op in its macro-op.
Change-Id: I77f60334771277e47f19573d4067b3a7bc5488b2
Fix#1044. This patch adds checks for message types (PUTX_COPY, DATA,
DATA_EXCLUSIVE) that contain data blocks but were missing from the
original `functionalRead` method in MESI Three-Level messages.
Change-Id: I0cedc314166c9cc037bf20f5b7fef5552dd1253c
A usual system register read/write pattern is something like the
following
```
switch(el) {
case EL1:
tc->readMiscReg(REG_EL1);
case EL2:
tc->readMiscReg(REG_EL2);
case EL3:
tc->readMiscReg(REG_EL3);
}
```
To avoid repeating these switch statements all over gem5, we define
templated functions which have
an accessor struct as a template parameter. These accessor will help
populating the templated switch
construct. We provide the FAR register accessor as an example. The
accessor should define the following
fields: (type, el0, el1, el2, el3)
Example:
```
struct FarAccessor
{
using type = RegVal;
static const MiscRegIndex el0 = NUM_MISCREGS;
static const MiscRegIndex el1 = MISCREG_FAR_EL1;
static const MiscRegIndex el2 = MISCREG_FAR_EL2;
static const MiscRegIndex el3 = MISCREG_FAR_EL3;
};
```
Fix#1064 by adding support for hardware performance counters on hybrid
architectures like Intel Alder Lake.
Hybrid architectures have multiple types of cores, each of which require
the instantiation of a separate performance counter. The KVM CPU's
PerfKvmCounter class was not aware of this, any only instantiated a
single performance counter, implicitly bound to the P-core only. This
meant that if gem5 ever ran on an E-core, the various hardware
performance counters would not get updated properly, in some cases
always zero (e.g., for the number of instructions executed).
This patch adds support for hybrid host architectures as follows. First,
we convert PerfKvmCounter into an abstract class, which has two concrete
implementations: SimplePerfKvmCounter and HybridPerfKvmCounter. The
former is used for non-hybrid architectures or for non-hardware
performance counters and is functionally equivalent to the prior
implementation of PerfKvmCounter. The latter is used for instantiating
hardware performance counters (i.e., of type PERF_TYPE_HARDWARE) on
hybrid host architectures. It does so by internally instantiating two
SimplePerfKvmCounters, one for a P-core and one for an E-core. Upon
read, it sums the results of reading the two internal counters.
Change-Id: If64fcb0e2fcc1b3a6a37d77455c2b21e1fc81150
Use it accordingly in the faulting/exception logic
Change-Id: I2f6360d04698b6fb7188e776f1d6966e99ce19b1
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
This is adding two templated functions for reading/writing
system registers (MiscRegs). It is introducing them inside
a new misc_regs namespace.
Change-Id: I21233337c057673d46d1147971ebabbfc2c2bb6a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
This reverts commit c1de2b8762. We revert
the commit as Ruby does not use get_runtime_isa anymore after [1]
[1]: https://github.com/gem5/gem5/pull/241
Change-Id: Iaac8d64194bbd53a9b1a57a796ff92f763c75a87
This PR is refactoring the Arm PageTableWalker in the following way:
1) Simplifying the currState handling logic (mainly the tear down)
2) Amending the TlbTestInterface APIs to use a RequestPtr reference
3) Use finalizePhysical even when MMU is off, which means allowing
memory mapped m5ops to work also in that circumstance
Remove the requirement in TreePLRU's implementation that the number of
leaves (i.e., the number of cache ways) be a power of two. Firstly, on
some recent processors, this is not the case---for example, Intel Golden
Cove's L1D has 12 ways. Secondly, The implementation of TreePLRU appears
to work just fine as-is with a way count that's not a power of two.
Change-Id: If2a27dc5bbe7a8e96684f79ce791df5c0b582230
Now that the Request has been made an Extensible object, it
can carry within itself much more data. It makes sense
to pass it to the TlbTestInterface as more information about
the table walk can be extracted from it.
This is also aligning with the testTranslation utility which
is expecting a request reference as first argument.
Change-Id: I3dbc9a81d6b4bcc1801246ba7eb4136774d8f3c7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
They both make final checks to the VA->PA translation before
relinquishing control back to the translate client (usually
CPU code)
Change-Id: Ib0a9da25404248c22c6a240817d2f50f0913fdf7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
The finalizePhysical is just checking if the physical
address falls within the m5op region (if using mmapped
m5ops). There's not reason why we shouldn't enable it
with virtual memory off
Change-Id: I5ab80fd4e7886743abd4b7d85937b72253b578d3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
We also unify the fault handling logic; rather than cleaning
up the WalkerState in several places scattered throughout the
walking code, we handle faults in the top level method
Change-Id: Ia22fb6f27044ff445fffbab228777a48efa473cb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
It's more efficient to pass a reference of the tester to the
TableWalkers. In this way a table walk check is tested directly
from the walkers instead of going through the MMU every time.
Change-Id: I9820dbabb8b551981005a65efa54a76b1a027541
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
This is done in order to differentiate between EL0 (unprivileged) and
EL1. Effectively it won't change much as most of the decisions are
now taken according to the translation regime which will be the
same regardless (EL10)
Change-Id: I218037e9c19cf638aff05c51869e439204d9af69
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>