This function had a comment claiming that returning an arbitrary request
from the call was necessary for page table walker statistics, but
looking at the actual code, the return type was never used. Also
returning whatever the last request happens to be seems arbitrary, and a
bad boundary for modularization. The page table walker should not depend
on the internal implementation of the DMA port.
Change-Id: I00281fbaf6aeb85b15baf54f3d4a23ca1ac72b8b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38716
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
ARM reaches in and pads out the source register index list behind the
parser's back to force dest regs to also be sources in case an
instruction fails predication and needs to forward the original register
values. It shouldn't be hacking up these values in that way, but since
it is, this will let it continue to do so while still fitting in the new
system where each instruction allocates its src/dest reg index arrays to
size.
Change-Id: Ia296be9f63123f18f6cdc0d3bb1314d33e759b3a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38380
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This class had been trying to keep all indices within the modulus of the
queue size, and to use all elements in the underlying storage by making
the empty and full conditions alias, differentiated by a bool. To keep
track of the difference between a storage location on one trip around
the queue vs other times around, ie an alias once the indices had
wrapped, it also keep track of a "round" value in both the queue itself,
and any iterators it created.
All this bookkeeping significantly complicated the data structure.
Instead, this change modifies it to keep track of a monotonically
increasing index which is wrapped at the time it's used. Only the head
index and current size need to be tracked in the queue itself, and only
a pointer to the queue and an index need to be tracked in the iterators.
Theoretically, it's possible that this value could overflow eventually
since it increases forever, unlike before where the index wrapped and
was never larger than the queue's capacity. In practice, the type of the
index was changed from a uint32_t to a size_t, probably a 64 bit value
in modern systems, which will hold much larger values. Also, the round
counter and the index values together acted like a smaller than 64 bit
value anyway, since the round counter would overflow after 2^32 times
around a less than 2^32 entry queue.
One minor interface difference is that the head() and tail() values
returned by the queue are no longer pre-wrapped to be modulo the queue's
capacity. As long as consumers don't try to be overly clever and feed in
fixed values, do their own bounds checking, etc., something that would
be cumbersome considering the wrapping nature of the structure, this
shouldn't be an issue.
Also, since external consumers no longer need to worry about wrapping,
since only one of them was used in only one place, and because they
weren't even marked as part of the interface, the modulo helper
functions have been eliminated from the queue. If other code wants to
perform modulo arithmetic for some reason (which the queue no longer
requires) they can accomplish basically the same thing in basically the
same amount of code using normal math.
Also, rather than inherit from std::vector, this change makes the vector
internal to the queue. That prevents methods of the vector that aren't
aware of the circular nature of the structure from leaking out if
they're not overridden or otherwise proactively blocked.
On top of simplifying the implementation, this also makes it perform
*slightly* better. To measure that, I ran the following command:
$ time build/ARM/base/circular_queue.test.opt --gtest_repeat=100000 > /dev/null
and found a few percent improvement in total run time. While this
difference was small and not measured against realistic usage of the
data structure, it was still measurable, and at minimum doesn't seem to
have hurt performance.
Change-Id: Ic2baa28de135be7086fa94579bbec451d69b3b15
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38478
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This commit makes move stats from several classes in mem/ruby
to corresponding Stats::Group's.
For ruby's Profiler, additional changes are made: there are stats that
are profiled for each of RequestType, for each of MachineType, and for
each of combinations of RequestType and MachineType. The current naming
scheme is ...<stat_name>.<request_type_name>.<machine_type_name>. To make
it easier for stats parser to know whether the stat is of RequestType, or
is of MachineType, or is of (RequestType, MachineType), a prefix is added
as follows,
...<meta>.<stat_name>.<request_type_name>.<machine_type_name>
where <meta> is one of {RequestType, MachineType, RequestTypeMachineType}.
Another point of using this naming scheme is that the parser doesn't
need to know all of RequestType and MachineType.
Change-Id: I8b8bdd771c7798954f984d416f521e8eb42d01ed
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36478
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
SimObject outputs a warning when its parent is specified more than
once. The cause is most likely that there is unexpected param
specified in the constructor called in the Python interface.
This commit adds a note about this probable cause of this potential
error to the warning message.
Change-Id: I9b6bf5d5fb0c77bfdad5fde42e88f814e8a4b72b
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38359
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
To limit the number of license slots used by SCons when building fast
model components, the fastmodel SConscript set up a group of nodes
which are attached to each simgen run using the SCons SideEffect
method using one of the library files it generates.
To create each unique node, the SCons Value() method was used, passing
it the counter for the loop. In at least version 4 of SCons, what this
ended up doing was setting that library file as a source for each of
the Value() nodes it corresponds to.
That doesn't *seem* like a problem, but then when creating config
include files, files which expose SCons configuration values to C++,
they also create Value() nodes using the value of the config variable.
In cases where that variable is boolean, the value might be 0 or 1.
The result was that the config header depended on Value(0) (for
instance), and then Value(0) depended on a collection of static library
files.
When scons tried to determine whether the config file was up to date,
it tried to check if if its sources had changed. It would check
Value(0), and then Value(0) would try to compute a checksum for its own
source. To do that, it seems to assume that the value can be
interpreted as a string and tries to decode it as utf8. Since the
library is a binary file, that would fail and break the build with a
cryptic message from within the guts of SCons.
To address this, this change replaces the loop index with a call to
object(). Each instance created in that way will be different from
every other, and there will be no way (purposefully or otherwise) to
create a collision with it when creating Value() nodes for some other
purpose.
Change-Id: I56bc842ae66b8cb36d3dcbc25796b708254d6982
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38617
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Reviewed-by: Ahbong Chang <cwahbong@google.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These values were (seemingly) arbitrarily changed from the original,
non-KVM settings, and no longer matched the comments which were also
copied over. These two bits enable alignment checking on memory accesses
(not normally used on x86), and whether kernel code can write to read
only pages.
Change-Id: I48e560e448e4849607f12e9336d1ab0458ad9407
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38536
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Implementation of a generic frequency-based sampling
compressor. The compressor goes through a sampling stage,
where no compression is done, and the values are simply
sampled for their frequencies. Then, after enough samples
have been taken, the compressor starts generating
compressed data.
Compression works by comparing chunks to the table of
most frequent values. In theory, a chunk that is present
in the frequency table has its value replaced by the
index of its respective entry in the table. In practice,
the value itself is stored because there is no straight-
forward way to acquire an index from an entry.
Finally, the index can be encoded so that the values
with highest frequency have smaller codeword representation.
Its Huffman coupling can be used similar to the approach
taken in "SC 2 : A Statistical Compression Cache Scheme".
Change-Id: Iae0ebda08e8c08f3b62930fd0fb7e818fd0d141f
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/37335
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
For some segments, there are two base registers. One is the
architecturally visible base, and the other is the effective base used
when actually referencing memory relative to that segment. The process
initialization code was setting the architecturally visible base,
presumably because that's the value used by KVM, but was setting the
effective base to zero.
Change-Id: I06e079f24fa63f0051268437bf00c14578f62612
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38488
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When a gem5 op is triggered using a KVM MMIO exit event, the PC has
already been advanced beyond the offending instruction. Normally when
a system call or gem5 op is triggered, the PC has not advanced because
the instruction hasn't actually finished executing. This means that if
a gem5 op, and by extension a system call in SE mode, want to advance
the PC to the instruction after the gem5 op, they have to check whether
they were triggered from KVM.
To avoid having to special case these sorts of situations (currently
only in the clone system call), we can have the code which dispatches to
gem5 ops from KVM adjust the next PC so that it points to what the
current PC is. That way the PC can be advanced unconditionally, and will
point to the instruction after the one that triggered the call.
To be fully consistent, we would also need to adjust the current PC.
That would be non-trivial since we'd have to figure out where the
current instruction started, and that may not even be possible to
unambiguously determine given x86's instruction structure. Then we would
also need to restore the original PC to avoid confusing KVM.
Change-Id: I9ef90b2df8e27334dedc25c59eb45757f7220eea
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38486
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These pseudo insts are less useful outside of full system, but they
should all still work. Removing this check makes it possible to, for
instance, test them in syscall emulation mode, and removes another
difference between the two styles of simulation.
Change-Id: Ia7d29bfc6f7c5c236045d151930fc171a6966799
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38485
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Input ports can specify a custom handler that is called
on resource stalls. The handler should return 'true' to
indicate the stall was handled and new messages from that
queue can be processed on that cycle. When it returns
'false' or no handler is defined, a resource stall is
generated.
Handlers are defined using the 'rsc_stall_handler' (for
resource stalls) and the 'prot_stall_handler' (for
protocol stalls) parameters. For example:
in_port(mandatory_in, RubyRequest, mandatoryQueue,
rsc_stall_handler=mandatory_in_stall_handler) {
...
}
bool mandatory_in_stall_handler() {
// Do something here to handle the stall !
return true;
// or return false if we don't want to do anything
}
Note: this patch required a change to the generate()
functions interface in the SLICC compiler, so we
could propagate a reference to the in_port to the
appropriate generate() functions. The updated interface
allows passing and forwarding of keyword arguments.
Change-Id: I3481d130d5eb411e6760a54d098d3da5de511c86
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31265
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Added functions for connecting the sequencer and cpu ports.
Using these functions instead of wiring up the ports directly allow
protocols to provide specialized sequencer implementations. For
instance, connecting the cpu icache_port and dcache_port to
different sequencer ports or to different sequencers.
A follow-up patch will update the configurations to use these
functions.
Change-Id: I2d8db8bbfb05c731c0e549f482a9ab93f341474b
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31417
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When the result is returned to the caller from the pseudoInst dispatch
function, the default behavior is to not store that value using the
guestABI mechanism. In the x86 definition, I accidentally used this
version but then didn't store the result manually. The fix should simply
be to not return the result to the instruction definition and to let the
guestABI mechanism handle everything normally.
Change-Id: Ib69f266ad6314032622e5d8d69e9ff114c62657a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38195
Reviewed-by: Daniel Gerzhoy <daniel.gerzhoy@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>