If the memory system can provide a back door to memory, store that, and
use it for subsequent accesses to the range it covers. For now, this
covers only fetch. That's because fetch will generally happen more than
loads and stores, and because it's relatively simple to implement since
we can ignore atomic operations, etc.
Some limitted benchmarking suggests that this speeds up x86 linux boot
by about 20%, although my modifications to the config to remove caching
(which blocks the back door mechanism) also made gem5 crash, so it's
hard to say for sure if that's a valid result. The crash happened in the
same way before and after, so it's probably at least relatively
representative.
While this gives a pretty substantial performance boost, it will prevent
statistics from being collected at the memory, or on intermediate objects
in the interconnect like the bus. That is to be expected with this
memory mode, however.
Change-Id: I73f73017e454300fd4d61f58462eb4ec719b8d85
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36979
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The methods `AddrRange::removeIntlvBits(Addr)` and
`AddrRange::addIntlvBits(Addr)` should be the inverse of one another,
but the latter did not insert the blanks for filling the removed bits in
the correct positions. Since the masks are ordered increasingly by the
position of the least significant bit of each mask, the lowest bit that
has to be inserted at each iteration is always `intlv_bit`, not needing
to be shifted to the left or right. The bits that need to be copied
from the input address are `intlv_bit-1..0` at each iteration.
The test `AddrRangeTest.AddRemoveInterleavBitsAcrossRange` has been
updated have masks below bit 12, making the old code not pass the test.
A new `AddrRangeTest.AddRemoveInterleavBitsAcrossContiguousRange` test
has been added to include a case in which the previous code fails. The
corrected code passes both tests.
This function is not used anywhere other than the tests and the class
`ChannelAddr`. However, it is needed to efficiently implement
interleaved caches in the classic mode.
Change-Id: I7d626a1f6ecf09a230fc18810d2dad2104d1a865
Signed-off-by: Isaac Sánchez Barrera <isaac.sanchez@bsc.es>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/37175
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
The variable 'm_allow_zero_latency' was only used in an assert message in
`src/mem/ruby/network/MessageBuffer.cc`. This assert is stripped when
compiling to gem5.fast, resulting in the compilation failing with an
unused variable error.
This assert is better as a panic_if, which will not be stripped out
during the .fast compilation.
Change-Id: I5de74982fa42b3291899ddcf73f7140079e1ec3f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36697
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Previously the SimpleMem depended on the fact that it inherited from the
AbstractMem in order to access and export it's back door. Now, the
AbstractMem has a method which will set a back door pointer if
appropriate, which the SimpleMem can use, or anything else which uses an
AbstractMem as its backing store.
Also, make the AbstractMem invalidate any existing back doors and refuse
to give out any new ones while some bit of memory is locked. That's
because if the storage is accessed directly, the AbstractMem will have
no change to manage its bookkeeping, and locking won't work properly.
Change-Id: If8c2a63e0827bb88b583f27ab4151d6b761e116e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36977
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
There were accessors for reading these indexes, but they were not
consistently used. This change makes them private to StaticInst, and
changes places that were accessing them directly to instead use the
accessors. New accessors are added for code generated by the ISA parser
and some ARM code to set the indexes without accessing them directly.
By forcing these values to be behind accessors, it will be much simpler
to change how those values are stored and retrieved.
Change-Id: Icca80023d7f89e29504fac6b194881f88aedeec2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36875
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch adds the GPU protocol tester that uses data-race-free
operation to discover bugs in GPU protocols including GPU_VIPER. For
more information please see the following paper and the README:
T. Ta, X. Zhang, A. Gutierrez and B. M. Beckmann, "Autonomous
Data-Race-Free GPU Testing," 2019 IEEE International Symposium on
Workload Characterization (IISWC), Orlando, FL, USA, 2019, pp. 81-92,
doi: 10.1109/IISWC47752.2019.9042019.
Change-Id: Ic9939d131a930d1e7014ed0290601140bdd1499f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32855
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Several TLB invalidation instructions rely on VMID matching. This is
only applicable is EL2 is implemented and enabled in the current state.
The code prior to this patch was making the now invalid assumption that
we shouldn't consider the VMID if we are doing a secure lookup. This is
because in the past if we were in secure mode we were sure EL2 was not
enabled.
This is fishy and not valid anymore anyway after the introduction of
secure EL2.
Change-Id: I9a1368f8ed19279ed3c4b2da9fcdf0db799bc514
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35242
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This gets rid of the requirement to only modify one byte register at a
time, and builds some structure around individual DMA channels.
The one small feature of the i8237 that was implemented is still
implemented, but now with a method of the i8237.
Change-Id: Ibc2b2d75f2a3b860da3f28ae649c6f1a099bdf7d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36815
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Modern libraries such as ROCm, MPI, and libnuma use files in Linux'
sysfs to determine the system topology such as number of CPUs, cache
size, cache associativity, etc. If Linux does not recognize the vendor
string returned by CPUID in x86 it will do a generic initialization
which does not include creating these files. In the case of ROCm
(specifically ROCt) this causes failures when getting device properties.
This can be solved by setting the vendor string to, for example,
AuthenticAMD (as qemu does) so that Linux will create the relevant sysfs
files. Unfortunately, simply changing the string in cpuid.cc to
AuthenticAMD causes simulation slowdown and may not be desirable to all
users. This change creates a parameter, defaulting to M5 Simulator as it
currently is, which can be set in python configuration files to change
the vendor string. Example of how to configure this is:
for i in range(len(self.cpus)):
for j in range(len(self.cpus[i].isa)):
self.cpus[i].isa[j].vendor_string = "AuthenticAMD"
Change-Id: I8de26d5a145867fa23518718a799dd96b5b9bffa
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36156
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
A memory willing to autogenerate child nodes can do that directly in
the generateDeviceTree method. However sometimes portions of memory
(child nodes) are tagged for specific applications. Hardcoding the
child node in the parent memory class is not flexible, so we delegate
this to the application model, which is registering the generator
helper via the ParentMem interface
JIRA: https://gem5.atlassian.net/browse/GEM5-768
Change-Id: I5fa5bac0decf5399dbaa3804569998dc5e6d7bc0
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34376
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
The event in KVM x86 SE mode plays double duty, triggering a system call
or a page fault depending on where it's called from (the system call
handler vs page fault handler).
This means we can eliminate the page fault gem5 op and the
pseudo_inst.hh switching header file.
This change touches a lot of things, but there wasn't really a good
place to split it up which still made sense and was consistent and
functional.
Change-Id: Ic414829917bcbd421893aa6c89d78273e4926b78
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34165
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The *vast* majority of SimObjects use the standard boilerplate version
of their Params::create() method which just returns new
ClassName(*this); Rather than force every class to define this method,
or annoy and frustrate users who forget and then get linker errors, this
change automates the default while leaving the possibility of defining a
custom create() method for non-default cases.
The situations this mechanism handles can be first broken down by
whether the SimObject class has a constructor of the normal form, ie one
that takes a const Params reference as its only parameter.
If no, then no default create() implementation is defined, and one
*must* be defined by the user.
If yes, then a default create() implementation is defined as a weak
symbol. If the user still wants to define their own create method for
some reason, perhaps to add debugging info, to keep track of instances
in c++, etc., then they can and it will override the weak symbol and
take precedence.
The way this is implemented is not straightforward. A set of classes are
defined which use SFINAE which either map in the real Params type or a
dummy based on whether the normal constructor exists in the SimObject
class. Then those classes are used to define *a* create method.
Depending on how the SFINAE works out, that will either be *the* create
method on the real Params struct, or a create method on a dummy class
set up to just absorb the definition and then go away. In either case the
create() method is a weak symbol, but in the dummy case it
doesn't/shouldn't matter.
Annoyingly the compiler insists that the weak symbol be visible. While
that makes total sense normally, we don't actually care what happens to
the weak symbol if it's attached to the dummy class. Unfortunately that
means we need to make the dummy class globally visible, although we put
it in a namespace to keep it from colliding with anything useful.
Change-Id: I3767a8dc8dc03665a72d5e8c294550d96466f741
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35942
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This is a way to send a very generic poke to the workload so it can do
something. It's up to the workload to know what information to look for
to interpret an event, such as what PC it came from, what register
values are, or the context of the workload itself (is this SE mode? which
OS is running?).
Change-Id: Ifa4bdf3b5c5a934338c50600747d0b65f4b5eb2b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34162
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These tables take up a lot of space and obscure what's going on in the
file around them. This change moves them into their own files (one for
32 bit and one for 64 bit). It also moves the x86 local definitions of
some system calls into their own file, and creates a SConscript file for
the linux subdirectory.
Change-Id: Ib0978005783b41789ea59695ad95b0336f6353eb
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34160
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>