The `BasePrefetcher` Python class had members `_events` and `_tlbs`
defined as class-level lists, meaning that any call to `list.append` on
them would affect `_events` and `_tlbs` for all prefetchers, not just the
calling object. This change redefines them as instance members to fix
the problem.
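The pitfall can be reproduced outside gem5 with a minimal sketch. The
class names below are hypothetical stand-ins and not the actual
`BasePrefetcher` code:

```python
class SharedPrefetcher:
    # Class-level list: a single object shared by every instance.
    _events = []

    def add_event(self, event):
        self._events.append(event)


class FixedPrefetcher:
    def __init__(self):
        # Instance-level list: each object gets its own.
        self._events = []

    def add_event(self, event):
        self._events.append(event)


a, b = SharedPrefetcher(), SharedPrefetcher()
a.add_event("probe0")
print(b._events)  # ['probe0']: b sees the event registered on a

c, d = FixedPrefetcher(), FixedPrefetcher()
c.add_event("probe0")
print(d._events)  # []: d is unaffected
```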
Change-Id: I68feb1d6d78e2fa5e8775afba8c81c6dd0de6c60
Signed-off-by: Isaac Sánchez Barrera <isaac.sanchez@bsc.es>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32394
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
The NS field in PTE descriptors tags Secure/Non-secure physical
memory (pages). This field is relevant in Secure state only:
while software in Secure state can access both the Secure and
Non-secure physical address spaces, software in Non-secure state can
only access Non-secure memory; the NS bit is hence discarded/treated as
1.
This patch aligns VMSAv8-32 with VMSAv8-64, which tags the
pointed-to memory as Non-secure in case of a Non-secure lookup.
The old behaviour was probably not leading to incorrect execution:
once a translation completes, the security flag in the memory request
is checked against the security state of the CPU (rather than relying
only on the NS bit in the TLB entry):
    if (isSecure && !te->ns) {
        req->setFlags(Request::SECURE);
    }
so we were already forbidding Secure accesses from the Non-secure world
if NS = 0.
It is, however, misleading in the debug logs to see TLB entries with
NSTID = 1 and NS = 0.
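The argument can be checked with a small truth table. This is a Python
model of the `isSecure && !te->ns` condition quoted above, not gem5 code;
the function name is hypothetical:

```python
def request_is_secure(cpu_is_secure: bool, te_ns: bool) -> bool:
    """Model of `isSecure && !te->ns`: True when the SECURE flag is set."""
    return cpu_is_secure and not te_ns


# Enumerate every combination of CPU security state and TLB entry NS bit.
for cpu_is_secure in (True, False):
    for te_ns in (False, True):
        flag = request_is_secure(cpu_is_secure, te_ns)
        print(f"isSecure={cpu_is_secure} ns={te_ns} -> SECURE={flag}")
```

In this model a Non-secure lookup never produces a Secure request,
whatever value NS holds in the TLB entry.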
Change-Id: I1f964069f88c33fb14362dd4101cb22538907226
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32638
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
debugExceptionReturnSS is called on an ERET instruction to
check for software step. The method was not using
SPSR.width; it was relying instead on the more generic ELIs32 to
check the execution mode of the destination EL.
This is not only an efficiency problem: the helper might not work
when returning to EL0. In general, it is not possible to
tell whether EL0 is using AArch32 or AArch64 if the current
EL is not EL0 and EL1 is using AArch64.
This is instead visible by inspecting SPSR.width during the
execution of an ERET instruction.
Change-Id: Ibc5a43633d0020139f2c0e372959a3ab4880da6e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32634
Tested-by: kokoro <noreply+kokoro@google.com>
Flat instructions free some of their registers through their memory
requests, in particular a call to scheduleWriteOperandsFromLoad(),
which gets called from GlobalMemPipeline::exec.
When execMask is 0, the instruction doesn't issue a memory request.
This patch adds in a call to scheduleWriteOperandsFromLoad() when
execMask is 0 for Flat Load and AtomicReturn instructions, as those
are the instructions that call scheduleWriteOperandsFromLoad()
in the memory pipeline.
This patch also adds in a missing return statement when execMask is 0
in one of the Flat instructions.
Change-Id: I09296adb7401e7515d3cedceb780a5df4598b109
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32234
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
There are race conditions while running several benchmarks, where
the DMA engine and the CorePair simultaneously send requests for the
same block. This patch fixes two scenarios:
(a) If the request from the DMA engine arrives before the one from the
CorePair, the directory controller records it as a pending request.
However, once the DMA request is serviced, the directory doesn't check
for pending requests. The CorePair, consequently, never sees a response
to its request and this results in a deadlock.
Added a call to wakeUpDependents in the transition from BDR_Pm to U.
Added a call to wakeUpDependents in the transition from BDW_P to U.
(b) If the request from the CorePair is being serviced by the directory
and the DMA engine requests the same block, this causes an invalid
transition because the current coherence protocol doesn't handle this
scenario.
Added a transition state where the requests from DMA are added to the
stall buffer.
Updated the B to U CoreUnblock transition to check all buffers, as the DMA
requests were being placed later in the stall buffer than was being checked.
Change-Id: I5a76efef97723bc53cf239ea7e112f84fc874ef8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31996
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This changeset fixes several bugs in the HSA barrier bit implementation.
1. Forces AQL packet launch to wait for completion of all previous packets
2. Enforces barrier bit blocking only if there are packets pending completion
3. Ensures barrier bit unblocking is correctly done by the last pending packet
4. Implements the barrier bit for all packets to conform to the HSA spec
Change-Id: I62ce589dff57dcde4d64054a1b6ffd962acd5eb8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30354
Reviewed-by: Sooraj Puthoor <puthoorsooraj@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
According to the Arm ARM:
"When the value of the ENABLE bit is 1, ISTATUS indicates whether the
timer condition is met. ISTATUS takes no account of the value of the
IMASK bit. If the value of ISTATUS is 1 and the value of IMASK is 0 then
the timer interrupt is asserted."
Since ISTATUS is simply flagging that timer conditions are met, an
interrupt mask (via the <timer>_CTL_EL<x>.IMASK) shouldn't reset the
field to 0.
Clearing the ISTATUS bit leads to problems such as the following:
1) The virtual timer (EL1) issues a physical interrupt to the GIC
2) The hypervisor handles the physical interrupt, setting
CNTV_CTL_EL0.IMASK to 1 before issuing the virtual interrupt
to the VM
3) The VM receives the virtual interrupt but gets confused
since CNTV_CTL_EL0.ISTATUS is 0 (due to point 2)
What happens when we disable the timer?
"When the value of the ENABLE bit is 0, the ISTATUS field is UNKNOWN."
So we are allowed to not clear the ISTATUS bit if the timer gets
disabled.
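The quoted rules for an enabled timer can be written down as a small
model. This is only a sketch of the two quoted sentences, assuming
ENABLE is 1; the function and parameter names are hypothetical and do
not come from the gem5 GenericTimer code:

```python
def timer_outputs(condition_met: bool, imask: bool):
    """Return (istatus, irq_asserted) for a timer with ENABLE == 1."""
    # ISTATUS only reflects the timer condition; IMASK plays no part.
    istatus = condition_met
    # The interrupt is asserted only when ISTATUS is 1 and IMASK is 0.
    irq_asserted = istatus and not imask
    return istatus, irq_asserted


print(timer_outputs(condition_met=True, imask=False))  # (True, True)
print(timer_outputs(condition_met=True, imask=True))   # (True, False)
```

Masking the interrupt changes only the second value, so a guest reading
ISTATUS after the hypervisor has set IMASK still sees the condition as met.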
Change-Id: I8eb32459a3ef6829c1910cf63815e102e2705566
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Adrian Herrera <adrian.herrera@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31775
Reviewed-by: Hsuan Hsu <kugwa2000@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Before this commit:
* SEV events were waking neither WFE (wrong) nor futex WAIT (correct)
* locked memory events (LLSC) due to LDXR and STXR were waking up both
WFE (correct) and futex WAIT (wrong)
This commit fixes all wrong behaviours mentioned above.
The fact that LLSC events were waking up futexes led to deadlocks,
as shown in the test case described at:
https://gem5.atlassian.net/browse/GEM5-537
because threads woken up by SEV are not removed from the waiter list
for the futex address they are sleeping on.
A previous fix attempt was made at:
1531b56d605d47252dc0620bb3e755b7cf84df97
in which only sleeping threads are woken up. But that is not sufficient,
because the futex sleeping thread that was being wrongly woken up on SEV
can start to sleep on a second futex.
As an example, consider the case where 4 threads are fighting over two
critical sections protected by futex1 and futex2 addresses. In this case,
one thread wakes up the other thread after it is done with the section.
Suppose the following sequence of events:
* thread1 is awake and all others are suspended on futex1
* thread1 SEV wakes thread2 from the futex1 while in the critical region 1.
This is the wrong behaviour that this patch prevents, because
now thread2 is still in the sleeper list for futex1
* thread1 then futex wakes thread3, then proceeds to critical region 2.
* thread3 wakes up, but because thread2 holds the critical region, it
sleeps again.
* thread2 finishes its work, futex wakes thread3, and then proceeds to
futex2
When it reaches futex2, thread1 is still working there, so it sleeps on
futex2.
* thread3 futex wakes thread2, because it is still wrongly on the sleeper
list of futex1. But thread2 is in futex2 now.
If it weren't for this mistake, it would have awakened the final thread4
instead.
Outcome: thread4 sleeps forever, no other thread ever wakes it, because all
other threads have woken from futex1 and awoken another thread.
The problem is fixed by adding the waitingTcs unordered_set to FutexMap,
which is basically an inverse map to FutexMap, which tracks (addr,
tgid) -> ThreadContext. This allows us to quickly check
if a given ThreadContext is waiting on a futex at any address.
Then the SEV wakeup code path
now checks if the thread is k
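The bookkeeping idea can be sketched in Python. This is not the gem5
C++ code: the class and method names are hypothetical, and only the
map plus inverse set described above are modelled. How the SEV wakeup
path consumes the check is not shown.

```python
from collections import defaultdict, deque


class FutexMapSketch:
    def __init__(self):
        # (addr, tgid) -> queue of thread contexts sleeping on that futex.
        self.waiters = defaultdict(deque)
        # Inverse view: every thread context asleep on any futex.
        self.waiting_tcs = set()

    def suspend(self, addr, tgid, tc):
        self.waiters[(addr, tgid)].append(tc)
        self.waiting_tcs.add(tc)

    def wakeup(self, addr, tgid, count=1):
        """Wake up to `count` waiters and drop them from both structures."""
        queue = self.waiters.get((addr, tgid), deque())
        woken = []
        while queue and len(woken) < count:
            tc = queue.popleft()
            self.waiting_tcs.discard(tc)
            woken.append(tc)
        if not queue:
            self.waiters.pop((addr, tgid), None)
        return woken

    def is_waiting(self, tc):
        # O(1) membership test, independent of the futex address.
        return tc in self.waiting_tcs


fm = FutexMapSketch()
fm.suspend(0x1000, 1, "thread2")
print(fm.is_waiting("thread2"))  # True while asleep on the futex
fm.wakeup(0x1000, 1)
print(fm.is_waiting("thread2"))  # False after a futex wake
```

Keeping the set next to the map avoids scanning every futex queue when
the only question is whether a given thread is a futex waiter at all.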
Change-Id: Icec5e30b041f53e5aa3b6e0d291e77bc0e865984
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29777
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Brandon Potter <Brandon.Potter@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Both methods do basically the same thing, especially since they don't
handle the timeout, which is basically the only difference between the two
modes of the syscall (one uses absolute and the other relative time).
Remove the WaiterState::WaiterState(ThreadContext* _tc) constructor,
since the only calls were from FutexMap::suspend, which does not use it
anymore. Instead, pass the magic 0xffffffff constant as a parameter to
suspend_bitset.
Change-Id: I69d86bad31d63604657a3c71cf07e5623f0ea639
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29776
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Brandon Potter <Brandon.Potter@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add checkpoint parameters (together with the corresponding serialization
and unserialization) for the VMA list of class MemState in a separate
section named 'vmalist'.
Without these VMA list parameters, a page table fault will occur when
running with --restore-simpoint-checkpoint, because of an empty VMA
list. For example:
$ ./build/RISCV/gem5.debug --debug-flags=Exec configs/example/se.py \
-c tests/test-progs/hello/bin/riscv/linux/hello \
--cpu-type=NonCachingSimpleCPU --restore-simpoint-checkpoint \
--checkpoint-dir m5out/ -r 2
...
2404000: system.switch_cpus: T0 : @_int_malloc+3392 : sd a5, 8(a0) \
: MemWrite : D=0x000000000001ed21 A=0x862e8
panic: Page table fault when accessing virtual address 0x862e8
...
Example checkpoint output:
[system.cpu.workload.vmalist]
size=3
[system.cpu.workload.vmalist.Vma0]
name=stack
addrRangeStart=...
addrRangeEnd=...
[system.cpu.workload.vmalist.Vma1]
name=heap
addrRangeStart=...
addrRangeEnd=...
[system.cpu.workload.vmalist.Vma2]
...
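A section layout like this can be read back with Python's standard
`configparser`, which is one way to inspect a checkpoint by hand. The
sketch below is not part of gem5; the address values are made up for
illustration and their real format in a checkpoint may differ:

```python
import configparser

# Hypothetical sample following the layout above; values are invented.
SAMPLE = """
[system.cpu.workload.vmalist]
size=2

[system.cpu.workload.vmalist.Vma0]
name=stack
addrRangeStart=1000
addrRangeEnd=2000

[system.cpu.workload.vmalist.Vma1]
name=heap
addrRangeStart=3000
addrRangeEnd=4000
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # keep key case (addrRangeStart, not lowercased)
parser.read_string(SAMPLE)

base = "system.cpu.workload.vmalist"
vmas = []
# The 'size' key says how many VmaN subsections follow.
for i in range(parser.getint(base, "size")):
    section = parser[f"{base}.Vma{i}"]
    vmas.append((section["name"],
                 int(section["addrRangeStart"]),
                 int(section["addrRangeEnd"])))
print(vmas)
```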
Change-Id: Ib2fa7ad2c34fe667ce95bc4b10a1affcf60d9c1f
Signed-off-by: Ian Jiang <ianjiang.ict@gmail.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31875
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This parameter is associated with a periodic event which would take a
sample for a kernel profile in FS mode. Unfortunately the only ISA which
had working versions of the necessary classes was alpha, and that has
been deleted. That means that without additional work for any given ISA,
the profile parameter has no chance of working.
Ideally, this parameter should be moved to the Workload classes. There
it can intrinsically be tied to a particular kernel, rather than having
to assume a particular kernel and gate everything on whether you're in
FS mode.
Because this isn't (IMHO) where this parameter should live in the long
term, and because it's currently unusable without additional development
for each of the ISAs, I think it makes the most sense to remove the
front end for this mechanism from the CPU.
Since the sampling/profiling mechanism itself could be useful and could
be re-plumbed somewhere else, the back end and its classes are left alone.
Change-Id: I2a3319c1d5ad0ef8c99f5d35953b93c51b2a8a0b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32214
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Each instance of simgen uses a license. If there are only so many to
go around, running many instances at once could exhaust the pool of
licenses and break the build.
The number of licenses may be less than the number of regular build
steps we want to do in parallel, but may be greater than zero. To
limit them to at most n in parallel where n might be less than j
and/or more than 1, we create a group of license slots, assign simgen
invocations to a slot, and then use scons's side effect mechanism to
ensure no two invocations in the same slot run at the same time.
This may be a suboptimal packing if the commands take significantly
different amounts of time to run since the slots are preallocated and
not demand allocated, but the difference shouldn't normally matter in
practice, and scons doesn't provide a better mechanism for partially
serializing certain build steps.
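The preallocation can be sketched in plain Python. The round-robin
policy and all names here are assumptions made for illustration; in an
SConscript the slot would be attached to each invocation with scons's
SideEffect so that invocations sharing a slot are not run concurrently.

```python
from itertools import cycle


def assign_slots(commands, num_licenses):
    """Preallocate each command to one of num_licenses slots, round robin."""
    slots = cycle(range(num_licenses))
    return {cmd: next(slots) for cmd in commands}


assignment = assign_slots(["simgen_a", "simgen_b", "simgen_c"], 2)
print(assignment)  # {'simgen_a': 0, 'simgen_b': 1, 'simgen_c': 0}
```

Here simgen_a and simgen_c share slot 0, so they would be serialized even
if slot 1 became free first, which is the suboptimal packing mentioned
above.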
Change-Id: Ifae58b48ae1b989c1915444bf7564f352f042305
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32124
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When disassembling floating-point register instructions, gem5 always
prints two source registers, rs1 and rs2. However, this is not correct for
Mul-Add instructions, which have three (rs1, rs2, and rs3), or for Move and
Convert instructions, which have only rs1.
For example (gem5 output vs. expected):
- fmadd.d fa0,fa0,fa4 vs fmadd.d fa0,fa0,fa4,fa5
- fcvt.d.l fa4,a6,zero vs fcvt.d.l fa4,a6
This patch fixes the problem.
Change-Id: I02d840eab602ac4a9782911b3cdff2935dfe5e68
Signed-off-by: Ian Jiang <ianjiang.ict@gmail.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32054
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The current compareVersions() fails in this case:
compareVersions("10", "10.0") returns -1 while it should return 0.
This is causing at least one SystemC compilation issue.
The problem is caused by the comparison algorithm. The algorithm
turns the versions into two lists and compares the corresponding
elements of the two lists up to the last element of the shorter
list. If all elements are equal, the longer list is
determined to be the more recent version. Hence, this algorithm
determines "10.0" to be more recent than "10".
This commit addresses this issue by making the version lists
have the same length by appending zeros to the shorter list.
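A minimal sketch of the padded comparison follows. It is not the gem5
implementation: the function name is hypothetical, only dotted numeric
strings are handled, and the return convention (1, 0, -1) is inferred
from the example above.

```python
from itertools import zip_longest


def compare_versions(v1, v2):
    """Return 1, 0, or -1 if v1 is newer than, equal to, or older than v2."""
    parts1 = [int(p) for p in v1.split(".")]
    parts2 = [int(p) for p in v2.split(".")]
    # zip_longest pads the shorter list with 0, so "10" behaves like "10.0".
    for a, b in zip_longest(parts1, parts2, fillvalue=0):
        if a != b:
            return 1 if a > b else -1
    return 0


print(compare_versions("10", "10.0"))  # 0
print(compare_versions("10.1", "10"))  # 1
```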
JIRA: https://gem5.atlassian.net/browse/GEM5-715
Change-Id: I859679185ac67e1b4d327d8803699cc5e399fa8c
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32014
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch adds the Secure EL2 feature. This allows stage 1
EL2/EL2&0 and stage 2 secure translation.
The changes are organized as follows:
+ insts/static_inst.cc: Modify checks for illegalInstruction on eret
+ isa.cc/hh: Enabling control bits
+ isa/insts/misc.hh/64.hh: Smc fault trigger.
+ miscregs.cc/hh: Declaration and initialization of new registers
+ self_debug.cc/hh: Add secureEL2 types for breakpoints
+ stage2_lookup.cc/hh: Allow stage2 in secure state.
+ tlb.cc/table_walker.cc: Allow secure state for stage2 and stage 1 EL2&0
translation regime
+ utility.cc/hh: New function InSecure and refactor of other helpers
to enable secure state
JIRA: https://gem5.atlassian.net/browse/GEM5-686
Change-Id: Ie59438b1828508e944334420da1d8f4745649056
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31394
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
At Iff9ad68d64e67b3df51682b7e4e272e5f355bcd6 a check was added to prevent
segfaults when unserializing the GenericTimer in case the new number of
thread contexts was smaller than the old one pre-checkpoint.
However, GenericTimer objects are only created dynamically as needed after
timer miscreg accesses. Therefore, if we take the checkpoint before
touching those registers, e.g. from a simple baremetal example, then the
checkpoint saves zero timers, and upon restore the assert would fail
because we have one thread context and not zero:
> fatal: The simulated system has been initialized with 1 CPUs, but the
Generic Timer checkpoint expects 0 CPUs. Consider restoring the checkpoint
specifying 0 CPUs.
This commit solves that by ensuring only that the new thread context count
is at least as large as, but not necessarily equal to, the number of cores
recorded in the checkpoint.
Change-Id: I8bcb05a6faecd4b4845f7fd4d71df95041bf6c99
JIRA: https://gem5.atlassian.net/browse/GEM5-703
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31894
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In checkpoint output files, the page table parameters, including
size and entries, are not organized very clearly. For example:
[system.cpu.workload]
...
ptable.size=...
[system.cpu.workload.Entry0]
vaddr=...
paddr=...
flags=...
[system.cpu.workload.Entry1]
...
This commit moves these parameters into a separate section named
'ptable'. For example:
[system.cpu.workload.ptable]
size=...
[system.cpu.workload.ptable.Entry0]
vaddr=...
paddr=...
flags=...
[system.cpu.workload.ptable.Entry1]
...
Change-Id: Iaa4129b3f4f090e8c3651bde90524abba0999c7f
Signed-off-by: Ian Jiang <ianjiang.ict@gmail.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31874
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>