This was originally from the GCN staging branch, which only had
GPU_VIPER.py, but the other GPU_VIPER configs had DirMem as well, so I
applied this change to all of them.
The patch replaces the Directory in DirCntrl from DirMem to
RubyDirectoryMemory. This fixes errors that DirMem caused relating to
setting class variables. It also generates and sets addr_ranges in
DirCntrl as RubyDirectoryMemory uses the parent object's addr_ranges
in its code
The style checker complained about a line length in GPU_VIPER_Region,
so the patch also fixes that
Change-Id: Icec96777a51d8a826b576fc752fae0f7f15427bc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32674
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The patch is aiming at speeding up gem5 execution. The TLB::translateFs
is in the critical path of the simulator: every fetch + ld/st will make
use of it.
Checking all the time for a breakpoint during fetch is rather expensive;
it is better to make use of the cached booleans in SelfDebug to do an
early check to see if any of
Watchpoint/Breakpopint/VectorCatch/SoftwareStep is enabled.
Most workloads won't use them so there's no point on calling the
testDebug method
Change-Id: I0189b84e0dc2e081acce04ff44787b9f1014477c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32776
Tested-by: kokoro <noreply+kokoro@google.com>
* enableFlag -> mde
The "enableFlag" variable, enabling the Breakpoint, Watchpoint, Vector
Catch exceptions is actually the cached version of MDSCR_EL1.MDE. The
"enableFlag" name looks too general as it's not covering the Software
Step exception case.
* bKDE -> kde
* bSDD -> sdd
The b prefix was likely referring to "breakpoint". However these bitfields
are actually used by watchpoints as well.
Change-Id: I48b762b32b2d763f4c4ceb7dcc28968cfb470fc1
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32775
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
The `BasePrefetcher` python class had members `_events` and `_tlbs`
defined as lists, meaning that any call to `list.append` on them would
affect `_events` and `_tlbs` for all prefetchers, not just the calling
object. This change redefines them as instance members to fix the
problem.
Change-Id: I68feb1d6d78e2fa5e8775afba8c81c6dd0de6c60
Signed-off-by: Isaac Sánchez Barrera <isaac.sanchez@bsc.es>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32394
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
A new parameter as added to the initText method in March of this year,
but this example code was not updated which prevents it from compiling.
This change adds the parameter to the call and sets it to what the
documenting comments say is the default, true.
Change-Id: Ic8da46dba03f01f338c38a7bc02ba232a90ae349
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32641
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Some class names within gem5 changed in March of last year, and this
code was not updated to match. Change ExternalMaster::Port to
ExternalMaster::ExternalPort, and ExternalSlave::Port to
ExternalSlave::ExternalPort.
Change-Id: I04c0970c4107de3449473c24c7c6f99ada72bbb3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32640
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The list is rather old and it contains some entries which are likely
unneeded. Since we are also now able to select specific FS binaries
to be compiled individually, there is not point of requiring all
components to be installed.
Instead, if is better to rely on the error message of building process
and let the users figure out which packages they need to install
Change-Id: I16c74861cb1f2b09c3e91e408ace01a9bd7a234d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32556
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The NS field in PTEs descriptors is tagging Secure/Non-secure physical
memory (pages). This field is relevant in Secure state only:
While in Secure state, software can access both the Secure and
Non-secure physical address spaces, software in Non-secure state can
only access Non-secure memory; the NS bit is hence discarded/treated as
1.
This patch is aligning VMSAv8-32 with VMSAv8-64, which is tagging the
pointed memory as Non-secure in case of a Non-secure lookup.
The old behaviour was probably not leading to incorrect execution:
once a translation completes, the security flag in the memory request
is chcked against the security state of the cpu (and not only relying
on the NS bit in the TLB entry)
if (isSecure && !te->ns) {
req->setFlags(Request::SECURE);
}
so we were already forbidding secure accesses from non secure world
if NS = 0.
It is however misleading in the debug logs to see tlb entries with
NSTID = 1 and NS = 0.
Change-Id: I1f964069f88c33fb14362dd4101cb22538907226
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32638
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
debugExceptionReturnSS is called on an ERET instruction to
check for software step. The method was not using the
SPSR.width and it was relying on the more generic ELIs32 to
check the execution mode of the destination EL.
This is not only an efficiency problem: the helper might not work
when returning to EL0. In general it is not possible to
understand if EL0 is using AArch32 or AArch64 if the current
EL is not EL0 and EL1 is using AArch64.
This is instead visible by inspecting the spsr.width during the
execution of an ERET instruction
Change-Id: Ibc5a43633d0020139f2c0e372959a3ab4880da6e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32634
Tested-by: kokoro <noreply+kokoro@google.com>
Flat instructions free some of their registers through their memory
requests, in particuar a call to scheduleWriteOperandsFromLoad(),
which gets called from GlobalMemPipeline::exec.
When execMask is 0, the instruction doesn't issue a memory request.
This patch adds in a call to scheduleWriteOperandsFromLoad() when
execMask is 0 for Flat Load and AtomicReturn instructions, as those
are the instructions that call scheduleWriteOperandsFromLoad()
in the memory pipeline.
This patch also adds in a missing return statement when execMask is 0
in one of the Flat instructions.
Change-Id: I09296adb7401e7515d3cedceb780a5df4598b109
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32234
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
There are race conditions while running several benchmarks, where
the DMA engine and the CorePair simultaneously send requests for the
same block. This patch fixes two scenarios
(a) If the request from the DMA engine arrives before the one from the
CorePair, the directory controller records it as a pending request.
However, once the DMA request is serviced, the directory doesn't check
for pending requests. The CorePair, consequently, never sees a response
to its request and this results in a Deadlock.
Added call to wakeUpDependents in the transition from BDR_Pm to U
Added call to wakeUpDependents in the transition from BDW_P to U
(b) If the request from the CorePair is being serviced by the directory
and the DMA requests for the same block, this causes an invalid
transition because the current coherence doesn't take care of this
scenario.
Added transition state where the requests from DMA are added to the
stall buffer.
Updated B to U CoreUnblock transition to check all buffers, as the DMA
requests were being placed later in the stall buffer than was being checked
Change-Id: I5a76efef97723bc53cf239ea7e112f84fc874ef8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31996
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This changeset fixes several bugs in the HSA barrier bit implementation.
1. Forces AQL packet launch to wait for completion of all previous packets
2. Enforces barrier bit blocking only if there are packets pending completion
3. Barrier bit unblocking is correclty done by the last pending packet
4. Implementing barrier bit for all packets to conform to HSA spec
Change-Id: I62ce589dff57dcde4d64054a1b6ffd962acd5eb8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30354
Reviewed-by: Sooraj Puthoor <puthoorsooraj@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
According to the ArmArm:
"When the value of the ENABLE bit is 1, ISTATUS indicates whether the
timer condition is met. ISTATUS takes no account of the value of the
IMASK bit. If the value of ISTATUS is 1 and the value of IMASK is 0 then
the timer interrupt is asserted."
Since ISTATUS is simply flagging that timer conditions are met, an
interrupt mask (via the <timer>_CTL_EL<x>.IMASK) shouldn't reset the
field to 0.
Clearing the ISTATUS bit leads to the following problem
as an example:
1) virtual timer (EL1) issuing a physical interrupt to the GIC
2) hypervisor handling the physical interrupt; setting the
CNTV_CTL_EL0.IMASK to 1 before issuing the virtual interrupt
to the VM
3) The VM receives the virtual interrupt but it gets confused
since CNTV_CTL_EL0.ISTATUS is 0 (due to point 2)
What happens when we disable the timer?
"When the value of the ENABLE bit is 0, the ISTATUS field is UNKNOWN."
So we are allowed to not clear the ISTATUS bit if the timer gets
disabled
Change-Id: I8eb32459a3ef6829c1910cf63815e102e2705566
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Adrian Herrera <adrian.herrera@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31775
Reviewed-by: Hsuan Hsu <kugwa2000@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Before this commit:
* SEV events were not waking neither WFE (wrong) nor futex WAIT (correct)
* locked memory events (LLSC) due to LDXR and STXR were waking up both
WFE (correct) and futex WAIT (wrong)
This commit fixes all wrong behaviours mentioned above.
The fact that LLSC events were waking up futexes leads to deadlocks,
as shown in the test case described at:
https://gem5.atlassian.net/browse/GEM5-537
because threads woken up by SVE are not removed from the waiter list
for the futex address they are sleeping on.
A previous fix atttempt was done at:
1531b56d605d47252dc0620bb3e755b7cf84df97
in which only sleeping threads are woken up. But that is not sufficient,
because the futex sleeping thread that was being wrongly woken up on SEV
can start to sleep on a second futex.
As an example, consider the case where 4 threads are fighting over two
critical sections protected by futex1 and futex2 addresses. In this case,
one thread wakes up the other thread after it is done with the section.
Suppose the following sequence of events:
* thread1 is awake and all others are suspended on futex1
* thread1 SEV wakes thread2 from the futex1 while in the critical region 1.
This is the wrong behaviour that this patch prevents, because
now thread2 is still in the sleeper list for futex1
* thread1 then futex wakes tread3, then proceeds to critical region 2.
* thread3 wakes up, but because thread2 has critical region, it sleeps
again.
* thread2 finishes its work, futex wakes thread3, and then proceeds to
futex2
When it reaches futex2, thread1 is still working there, so it sleeps on
futex2.
* thread3 futex wakes thread2, because it is still wrongly on the sleeper
list of futex1. But thread2 is in futex2 now.
If it weren't for this mistake, it should have awaken the final thread4
instead.
Outcome: thread4 sleeps forever, no other thread ever wakes it, because all
other threads have woken from futex1 and awoken another thread.
The problem is fixed by adding the waitingTcs unordered_set FutexMap,
which is basically an inverse map to FutexMap, which tracks (addr,
tgid) -> ThreadContext. This allows us allow to quickly check
if a given ThreadContext is waiting on a futex in any address.
Then the SEV wakeup code path
now checks if the thread is k
Change-Id: Icec5e30b041f53e5aa3b6e0d291e77bc0e865984
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29777
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Brandon Potter <Brandon.Potter@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Both methods do basically the same, especially since they don't handle the
timeout which is basically the only difference between both modes of the
syscall (one uses absolute and the other relative time).
Remove the WaiterState::WaiterState(ThreadContext* _tc) constructor,
since the only calls were from FutexMap::suspend which does not use them
anymore. Instead, set the magic 0xffffffff constant as a parameter to
suspend_bitset.
Change-Id: I69d86bad31d63604657a3c71cf07e5623f0ea639
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29776
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Brandon Potter <Brandon.Potter@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add checkpoint parameters (together with corresponding serialization
and unserialization) for VMA list of class MemState into a separate
section named 'vmalist'.
Without these VMA list parameters, a page table fault will occur when
running with --restore-simpoint-checkpoint, because of an empty VMA
list. For example:
$ ./build/RISCV/gem5.debug --debug-flags=Exec configs/example/se.py \
-c tests/test-progs/hello/bin/riscv/linux/hello \
--cpu-type=NonCachingSimpleCPU --restore-simpoint-checkpoint \
--checkpoint-dir m5out/ -r 2
...
2404000: system.switch_cpus: T0 : @_int_malloc+3392 : sd a5, 8(a0) \
: MemWrite : D=0x000000000001ed21 A=0x862e8
panic: Page table fault when accessing virtual address 0x862e8
...
Example checkpoint output:
[system.cpu.workload.vmalist]
size=3
[system.cpu.workload.vmalist.Vma0]
name=stack
addrRangeStart=...
addrRangeEnd=...
[system.cpu.workload.vmalist.Vma1]
name=heap
addrRangeStart=...
addrRangeEnd=...
[system.cpu.workload.vmalist.Vma2]
...
Change-Id: Ib2fa7ad2c34fe667ce95bc4b10a1affcf60d9c1f
Signed-off-by: Ian Jiang <ianjiang.ict@gmail.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31875
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This parameter is associated with a periodic event which would take a
sample for a kernel profile in FS mode. Unfortunately the only ISA which
had working versions of the necessary classes was alpha, and that has
been deleted. That means that without additional work for any given ISA,
the profile parameter has no chance of working.
Ideally, this parameter should be moved to the Workload classes. There
it can intrinsically be tied to a particular kernel, rather than having
to assume a particular kernel and gate everything on whether you're in
FS mode.
Because this isn't (IMHO) where this parameter should live in the long
term, and because it's currently unusable without additional development
for each of the ISAs, I think it makes the most sense to remove the
front end for this mechanism from the CPU.
Since the sampling/profiling mechanism itself could be useful and could
be re-plumbed somewhere else, the back end and its classes are left alone.
Change-Id: I2a3319c1d5ad0ef8c99f5d35953b93c51b2a8a0b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32214
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Each instance of simgen uses a license. If there are only so many to
go around, running many instances at once could exhaust the pool of
licenses and break the build.
The number of licenses may be less than the number of regular build
steps we want to do in parallel, but may be greater than zero. To
limit them to at most n in parallel where n might be less than j
and/or more than 1, we create a group of license slots, assign simgen
invocations to a slot, and then use scons's side effect mechanism to
ensure no two invocations in the same slot run at the same time.
This may be a suboptimal packing if the commands take significantly
different amounts of time to run since the slots are preallocated and
not demand allocated, but the difference shouldn't normally matter in
practice, and scons doesn't provide a better mechanism for partially
serializing certain build steps.
Change-Id: Ifae58b48ae1b989c1915444bf7564f352f042305
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32124
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>