When the new thread context ctc is created, it should have a copy of
all the state in the original tc, including the original PC. This code
used to specially handle the KVM case by explicitly making this new
context return from the system call immediately by jumping right to
RCX which (assuming a particular instruction was used) is where user
mode should resume.
The first problem with this approach as far as I can tell is that the
CPU will still be in CPL0, ie supervisor mode, and will not have been
forced back into CPL3, ie user mode. This may not have any immediately
visible effect, but may down the line.
Second, this seems unnecessary. The non-special case code will advance
the PC beyond the instruction which triggered the system call. Then
once the new thread starts executing again, it will execute sysret and
return to rcx naturally, just like the original thread will.
The only observed difference is that when executing a gem5 instruction,
the IP is set to the currently executing instruction, and so to avoid
the new context from re-executing the system call, the PC needs to be
advanced. When calling in from KVM, the instruction has already been
"completed", and so the IP should *not* be advanced.
Also note that when reading the PCState object in KVM, it doesn't
figure out where the next instruction is and so NPC is just one
ExtMachInst sized blob later on. Advancing the PC will just move to
an address 8 bytes later, which is very unlikely to be what you want.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-187
Change-Id: I0d97f66e64ce39b13d6700dcf3d7da88d6fe0048
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23199
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Information about what kernel to load and how to load it was built
into the System object and its subclasses. That overloaded the System
object and made it responsible for too many things, and also was
somewhat awkward when working with SE mode which doesn't have a kernel.
This change extracts the kernel and information related to it from the
System object and puts into into a OsKernel or Workload object.
Currently the idea of a "Workload" to run and a kernel are a bit
muddled, an unfortunate carry-over from the original code. It's also an
implication of trying not to make too sweeping of a change, and to
minimize the number of times configs need to change, ie avoiding
creating a "kernel" parameter which would shortly thereafter be
renamed to "workload".
In future changes, the ideas of a kernel and a workload will be
disentangled, and workloads will be expanded to include emulated
operating systems which shephard and contain Process-es for syscall
emulation.
This change was originally split into pieces to make reviewing it
easier. Those reviews are here:
https: //gem5-review.googlesource.com/c/public/gem5/+/22243
https: //gem5-review.googlesource.com/c/public/gem5/+/24144
https: //gem5-review.googlesource.com/c/public/gem5/+/24145
https: //gem5-review.googlesource.com/c/public/gem5/+/24146
https: //gem5-review.googlesource.com/c/public/gem5/+/24147
https: //gem5-review.googlesource.com/c/public/gem5/+/24286
Change-Id: Ia3d863db276a023b6a2c7ee7a656d8142ff75589
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26466
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The GenericTimer specification includes a global component for
a universal view of time: the System Counter.
If both per-PE architected and memory-mapped timers are instantiated
in a system, they must both share the same counter. SystemCounter is
promoted to be an independent SimObject, which is now shared by
implementations.
The SystemCounter may be controlled/accessed through the memory-mapped
counter module in the system level implementation. This provides
control (CNTControlBase) and status (CNTReadBase) register frames. The
counter module is now implemented as part of GenericTimerMem.
Frequency changes occur through writes to an active CNTFID or to
CNTCR.EN as per the architecture. Low-high and high-low transitions are
delayed until suitable thresholds, where the counter value is a divisor
of the increment given the new frequency.
Due to changes in frequency, timers need to be notifies to be
rescheduled their counter limit events based on CompareValue/TimerValue.
A new SystemCounterListener interface is provided to achieve
correctness.
CNTFRQ is no longer able to modify the global frequency. PEs may
use this to modify their register view of the former, but they should
not affect the global value. These two should be consistent.
With frequency changes, counter value needs to be stored to track
contributions from different frequency epochs. This is now handled
on epoch change, counter disable and register access.
References to all GenericTimer model components are now provided as
part of the documentation.
VExpress_GEM5_Base is updated with the new model configuration.
Change-Id: I9a991836cacd84a5bc09e5d5275191fcae9ed84b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25306
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
getArmSystem is the building block for a lot of ArmSystem getters
a client can use to check for a specific feature.
This method is called very often during simulation and it is basically
casting a System pointer into an ArmSystem pointer.
To do so, it is using dynamic casting to check if the system is really
an ArmSystem. This is very expensive and usually not needed.
The only chance arm code would use a non ArmSystem is when in SE mode.
But if that's the case, we can just replace the assertion with a
assert(FullSystem).
Testing Linux boot with this patch provides a speedup of nearly 2x!
(atomic mode).
This is partially related to:
JIRA: https://gem5.atlassian.net/browse/GEM5-337
Since the PAuth patch changed the purifyTagged helper (on the critical
path of simulation) to rely more heavilly on getArmSystem (via
ArmSystem:: static methods)
Change-Id: Idbf079548ffe03513b4fc58c76f0d69613952a50
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Ciro Santilli <ciro.santilli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25964
Tested-by: kokoro <noreply+kokoro@google.com>
Python 2.x and Python 3 use different meta class syntax. Fix this by
implementing meta classes using the add_metaclass decorator in the six
Python library.
Due to the way meta classes are implemented in six,
MetaParamValue.__new__ seems to be called twice for some classes. This
triggers an assertion which when param that checks that Param types
have only been registered once. I have turned this assertion into a
warning.
The assertion was triggered in params.CheckedInt and params.Enum. It
seems like the cause of the issue is that these classes have their own
meta classes (CheckedIntType and MetaEnum) that inherit from
MetaParamValue and a base class (ParamValue) that also inherits from
MetaParamValue.
Change-Id: I5dea08bf0558cfca57897a124cb131c78114e59e
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26083
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
The version of vtophys which didn't take a ThreadContext had only been
implemented on Alpha which has since been removed, so this version of
the function was completely unimplemented and never used.
This change also gets rid of the dbg_vtophys which was sometimes
implemented but also never used, and takes the opportunity to fix up
some style problems in some of the vtophys arch files.
Change-Id: Ie10f881f8ce08c7188e71805357cf3264be4c81a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26224
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This is a slightly munged version of vtophys, but which returns faults
like the normal translate functions if the address is malformed. It
attempts to return an approximately correct fault if the translation
isn't found, but since SPARC doesn't have hardware managed TLBs that
has to be an approximation.
translateFunctional also ignores permissions type checks (unless
they're built into the "lookup" method?) in line with vtophys type
semantics. The idea is that translateFunctional is used in conjunction
with functional accesses, and those are intended to reach beyond
normal barriers/boundaries to give unfettered access to the system for
debugging or setup purposes.
Change-Id: I000d9c31877b82043489792de037e7d664914fa9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26404
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The only difference was whether the the atomic op functor was accepted
as an argument. If it wasn't, setVirt would be called without an op
functor argument where it will default to nullptr.
This change deletes the constructor which doesn't take an atomic op
functor and in the other defaults the functor to nullptr. Functionally
nothing changes, but the code is now simpler.
Change-Id: Iff06543b1046594df297344e16961ee9d0f0a373
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26231
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
As described in the Jira issue, this replaces the implementation of
isPowerOf2() and power(). It also revamps floorLog2 so that there only
needs to be one implementation and no assumptions about how big certain
types are.
The way power() used to work was to raise a number n to an exponent e
by multiplying n times itself e times. As a warning in this function
explains, this can be quite slow for large e. A much more efficient
way to raise a number to an exponent is to square n over and over, and
to multiply in the current square if that bit of e is set.
n ^ 15 = (n^1) * (n^2) * (n^4) * (n^8)
n^8 = (n^4)^2
n^4 = (n^2)^2
n^2 = n^2
n^1 = n
So that takes 6 multiplications, n^2, (n^2)^2, (n^4)^2, and then each
multipy to compute the final result, instead of 14.
The difference is more pronounced for larger exponents, although you'd
quickly start to overflow a uint64_t.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-140
Change-Id: I0ae05aeba1b5882d2a616613b1679e6206b4cbfe
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26164
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These are only used in these two files, one each, and pass one dummy
argument with a default value and one extra argument with an actual
value compared to the more common constructors.
Instead, switch to constructors without those two arguments and set the
one extra value explicitly after construction.
The constructor will likely be inlined, and merged with this additional
assignment.
Change-Id: I75ca539d5ca95b57b4f4322ffa050af2031544dd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26229
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
The new local access mechanism installs a callback in the request which
implements what the mmapped IPR was doing. That avoids having to have
stubs in ISAs that don't have mmapped IPRs, avoids having to encode
what to do to communicate from the TLB and the mmapped IPR functions,
and gets rid of another global ISA interface function and header files.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-187
Change-Id: I772c2ae2ca3830a4486919ce9804560c0f2d596a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23188
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
pybind internally uses a construct which initializes an array of bools
as a way to run a function on each member of a parameter pack. It then
discards the array since it was just trying to run the function. This
triggers a warning in clang 11 called unused-value which breaks the
build.
This change adds some pragmas to the pybind11.h header which disable
that warning while in pybind11 which is less intrusive than trying to
fix the false positive warning, and better than disabling the warning
universally. Since g++ and clang++ will complain if they see this
pragma guarded by the other's name, these pragmas are also surrounded
by ifdefs which should make them only visible to clang.
Change-Id: Ie9b5c65e8cadc8b96fbc1bd7971bed4a61c4340d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25228
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>