If we create abstract memories with a sub-page size on a system with
shared backstore, the offset of next mmap might become non-page-align
and cause an invalid argument error.
In this CL, we always upscale the range size to multiple of page before
updating the offset, so the offset is always on page boundary.
Change-Id: I3a6adf312f2cb5a09ee6a24a87adc62b630eac66
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58289
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add an utility class that provides a service for another process
query and get the fd of the corresponding region in gem5's physmem.
Basically, the service works in this way:
1. client connect to the unix socket created by a SharedMemoryServer
2. client send a request {start, end} to gem5
3. the server locates the corresponding shared memory
4. gem5 response {offset} and pass {fd} in ancillary data
mmap fd at offset will provide the client the view into the physical
memory of the request range.
Change-Id: I9d42fd8a41fc28dcfebb45dec10bc9ebb8e21d11
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57729
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Boris Shingarov <shingarov@labware.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add a new option `auto_unlink_shared_backstore` to System so it will
remove the shared backstore used in physical memories when the System is
getting destructed. This will prevent unintended memory leak.
If the shared memory is designed to live through multiple round of
simulations, you may set the option to false to prevent the removal.
Test: Run a simulation with shared_backstore set, and see whether there
is anything left in /dev/shm/ after simulation ends.
Change-Id: I0267b643bd24e62cb7571674fe98f831c13a586d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57469
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
Previously, all abstract memory backed by the same physical memory will
use the exact same chunk of shared memory if sharedBackstore is set. It
means that all abstract memories, despite setting to a different range,
will still be map to the same chunk of memory.
As a result, setting the sharedBackstore not only allows our host system
to share gem5 memory, it also enforces multiple gem5 memories to share
the same content. Which will significantly affect the simulation result.
Furthermore, the actual size of the shared memory will be determined by
the last backingStore created. If the last one is unfortunately smaller
than any previous backingStore, this may invalid previous mapped region
and cause a SIGBUS upon access (on linux).
In this CL, we put all backingStores of those abstract memories side by
side instead of stacking them all together. So the behavior of abstract
memories will be kept consistent whether the sharedBackstore is set or
not, yet presist the ability to access those memories from host.
Change-Id: Ic4ec25c99fe72744afaa2dfbb48cd0d65230e9a8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57369
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Apply the gem5 namespace to the codebase.
Some anonymous namespaces could theoretically be removed,
but since this change's main goal was to keep conflicts
at a minimum, it was decided not to modify much the
general shape of the files.
A few missing comments of the form "// namespace X" that
occurred before the newly added "} // namespace gem5"
have been added for consistency.
std out should not be included in the gem5 namespace, so
they weren't.
ProtoMessage has not been included in the gem5 namespace,
since I'm not familiar with how proto works.
Regarding the SystemC files, although they belong to gem5,
they actually perform integration between gem5 and SystemC;
therefore, it deserved its own separate namespace.
Files that are automatically generated have been included
in the gem5 namespace.
The .isa files currently are limited to a single namespace.
This limitation should be later removed to make it easier
to accomodate a better API.
Regarding the files in util, gem5:: was prepended where
suitable. Notice that this patch was tested as much as
possible given that most of these were already not
previously compiling.
Change-Id: Ia53d404ec79c46edaa98f654e23bc3b0e179fe2d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch adds the ability for a host-OS process external to gem5
to access the backing store via POSIX shared memory.
The new param shared_backstore of the System object is the filename
of the shared memory (i.e., the first argument to shm_open()).
Change-Id: I98c948a32a15049a4515e6c02a14595fb5fe379f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30994
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Previously, we used the start address to determine the right physical
memory while servicing memory requests. This change uses the full
address range to correctly determine the right physical memory and
expose bugs where requests might not fully map to a single physical
memory.
Change-Id: I183d7552918106000f917a62ceb877511ff0ff71
Reviewed-on: https://gem5-review.googlesource.com/11118
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
We need to determined whether an address range is fully contained or
it overlaps with an address range in the address range in the mmap. As
an example, we use address range maps to associate ports to address
ranges and we determine which port we will forward the request based
on which address range contains the addresses accessed by the
request. We also need to make sure that when we add a new port to the
address range map, its address range does not overlap with any of the
existing ports.
This patch splits the function find() into two functions contains()
and intersects() to implement this distinct functionality. It also
changes the xbar and the physical memory to use the right function.
Change-Id: If3fd3f774a16b27db2df76dc04f1d61824938008
Reviewed-on: https://gem5-review.googlesource.com/11115
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Only map memories into the KVM guest address space that are
marked as usable by KVM. Create BackingStoreEntry class
containing flags for is_conf_reported, in_addr_map, and
kvm_map.
Objects that are can be serialized are supposed to inherit from the
Serializable class. This class is meant to provide a unified API for
such objects. However, so far it has mainly been used by SimObjects
due to some fundamental design limitations. This changeset redesigns
to the serialization interface to make it more generic and hide the
underlying checkpoint storage. Specifically:
* Add a set of APIs to serialize into a subsection of the current
object. Previously, objects that needed this functionality would
use ad-hoc solutions using nameOut() and section name
generation. In the new world, an object that implements the
interface has the methods serializeSection() and
unserializeSection() that serialize into a named /subsection/ of
the current object. Calling serialize() serializes an object into
the current section.
* Move the name() method from Serializable to SimObject as it is no
longer needed for serialization. The fully qualified section name
is generated by the main serialization code on the fly as objects
serialize sub-objects.
* Add a scoped ScopedCheckpointSection helper class. Some objects
need to serialize data structures, that are not deriving from
Serializable, into subsections. Previously, this was done using
nameOut() and manual section name generation. To simplify this,
this changeset introduces a ScopedCheckpointSection() helper
class. When this class is instantiated, it adds a new /subsection/
and subsequent serialization calls during the lifetime of this
helper class happen inside this section (or a subsection in case
of nested sections).
* The serialize() call is now const which prevents accidental state
manipulation during serialization. Objects that rely on modifying
state can use the serializeOld() call instead. The default
implementation simply calls serialize(). Note: The old-style calls
need to be explicitly called using the
serializeOld()/serializeSectionOld() style APIs. These are used by
default when serializing SimObjects.
* Both the input and output checkpoints now use their own named
types. This hides underlying checkpoint implementation from
objects that need checkpointing and makes it easier to change the
underlying checkpoint storage code.
This patch ensures we can run simulations with very large simulated
memories (at least 64 TB based on some quick runs on a Linux
workstation). In essence this allows us to efficiently deal with
sparse address maps without having to implement a redirection layer in
the backing store.
This opens up for run-time errors if we eventually exhausts the hosts
memory and swap space, but this should hopefully never happen.
This patch changes the range cache used in the global physical memory
to be an iterator so that we can use it not only as part of isMemAddr,
but also access and functionalAccess. This matches use-cases where a
core is using the atomic non-caching memory mode, and repeatedly calls
isMemAddr and access.
Linux boot on aarch32, with a single atomic CPU, is now more than 30%
faster when using "--fastmem" compared to not using the direct memory
access.
This patch changes the name of the Bus classes to XBar to better
reflect the actual timing behaviour. The actual instances in the
config scripts are not renamed, and remain as e.g. iobus or membus.
As part of this renaming, the code has also been clean up slightly,
making use of range-based for loops and tidying up some comments. The
only changes outside the bus/crossbar code is due to the delay
variables in the packet.
--HG--
rename : src/mem/Bus.py => src/mem/XBar.py
rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc
rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh
rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc
rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh
rename : src/mem/bus.cc => src/mem/xbar.cc
rename : src/mem/bus.hh => src/mem/xbar.hh
This patch fixes a bug in how physical memory used to be mapped and
unmapped. Previously we unmapped and re-mapped if restoring from a
checkpoint. However, we never checked that the new mapping was
actually the same, it was just magically working as the OS seems to
fairly reliably give us the same chunk back. This patch fixes this
issue by relying entirely on the mmap call in the constructor.
This patch removes the explicit memset as it is redundant and causes
the simulator to touch the entire space, forcing the host system to
allocate the pages.
Anonymous pages are mapped on the first access, and the page-fault
handler is responsible for zeroing them. Thus, the pages are still
zeroed, but we avoid touching the entire allocated space which enables
us to use much larger memory sizes as long as not all the memory is
actually used.
This patch adds merging of interleaved ranges before creating the
backing stores. The backing stores are always a contigous chunk of the
address space, and with this patch it is possible to have interleaved
memories in the system.
This patch adds basic merging of address ranges when determining which
address ranges should be reported in the configuration table. By
performing this merging it is possible to distribute an address range
across many memory channels (controllers). This is essential to enable
address interleaving.
This patch adds support for interleaving bits for the address
ranges. What was previously just a start and end address, now has an
additional three fields, for the high bit, and number of bits to use
for interleaving, and a match value to compare against. If the number
of interleaving bits is set to zero it is effectively disabled.
A number of convenience functions are added to the range to enquire
about the interleaving, its granularity and the number of stripes it
is part of.
This patch cleans up the AddrRangeMap in preparation for the addition
of interleaving by removing unused code. The non-const editions of
find are never used, and hence the duplication is not needed.
This patch makes the start and end address private in a move to
prevent direct manipulation and matching of ranges based on these
fields. This is done so that a transition to ranges with interleaving
support is possible.
As a result of hiding the start and end, a number of member functions
are needed to perform the comparisons and manipulations that
previously took place directly on the members. An accessor function is
provided for the start address, and a function is added to test if an
address is within a range. As a result of the latter the != and ==
operator is also removed in favour of the member function. A member
function that returns a string representation is also created to allow
debug printing.
In general, this patch does not add any functionality, but it does
take us closer to a situation where interleaving (and more cleverness)
can be added under the bonnet without exposing it to the user. More on
that in a later patch.
This patch temporarily removes the joining of ranges when creating the
backing store, to reserve this functionality for the interleaved
ranges that are about to be introduced.
When creating the mmaps for the backing store, there is no point in
creating larger contigous chunks that what is necessary. The larger
chunks will only make life more difficult for the host.
Merging will be re-added later, but then only for interleaved ranges.
This patch fixes a bug that caused multiple systems to overwrite each
other physical memory. The system name is now included in the filename
such that this is avoided.
This patch adds a _curTick variable to an eventq. This variable is updated
whenever an event is serviced in function serviceOne(), or all events upto
a particular time are processed in function serviceEvents(). This change
helps when there are eventqs that do not make use of curTick for scheduling
events.
This patch moves all the memory backing store operations from the
independent memory controllers to the global physical memory. The main
reason for this patch is to allow address striping in a future set of
patches, but at this point it already provides some useful
functionality in that it is now possible to change the number of
memory controllers and their address mapping in combination with
checkpointing. Thus, the host and guest view of the memory backing
store are now completely separate.
With this patch, the individual memory controllers are far simpler as
all responsibility for serializing/unserializing is moved to the
physical memory. Currently, the functionality is more or less moved
from AbstractMemory to PhysicalMemory without any major
changes. However, in a future patch the physical memory will also
resolve any ranges that are interleaved and properly assign the
backing store to the memory controllers, and keep the host memory as a
single contigous chunk per address range.
Functionality for future extensions which involve CPU virtualization
also enable the host to get pointers to the backing store.
This patch takes the final plunge and transitions from the templated
Range class to the more specific AddrRange. In doing so it changes the
obvious Range<Addr> to AddrRange, and also bumps the range_map to be
AddrRangeMap.
In addition to the obvious changes, including the removal of redundant
includes, this patch also does some house keeping in preparing for the
introduction of address interleaving support in the ranges. The Range
class is also stripped of all the functionality that is never used.
--HG--
rename : src/base/range.hh => src/base/addr_range.hh
rename : src/base/range_map.hh => src/base/addr_range_map.hh
This patch removes the assumption on having on single instance of
PhysicalMemory, and enables a distributed memory where the individual
memories in the system are each responsible for a single contiguous
address range.
All memories inherit from an AbstractMemory that encompasses the basic
behaviuor of a random access memory, and provides untimed access
methods. What was previously called PhysicalMemory is now
SimpleMemory, and a subclass of AbstractMemory. All future types of
memory controllers should inherit from AbstractMemory.
To enable e.g. the atomic CPU and RubyPort to access the now
distributed memory, the system has a wrapper class, called
PhysicalMemory that is aware of all the memories in the system and
their associated address ranges. This class thus acts as an
infinitely-fast bus and performs address decoding for these "shortcut"
accesses. Each memory can specify that it should not be part of the
global address map (used e.g. by the functional memories by some
testers). Moreover, each memory can be configured to be reported to
the OS configuration table, useful for populating ATAG structures, and
any potential ACPI tables.
Checkpointing support currently assumes that all memories have the
same size and organisation when creating and resuming from the
checkpoint. A future patch will enable a more flexible
re-organisation.
--HG--
rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py
rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py
rename : src/mem/physical.cc => src/mem/abstract_mem.cc
rename : src/mem/physical.hh => src/mem/abstract_mem.hh
rename : src/mem/physical.cc => src/mem/simple_mem.cc
rename : src/mem/physical.hh => src/mem/simple_mem.hh
This patch removes the physMemPort from the RubySequencer and instead
uses the system pointer to access the physmem. The system already
keeps track of the physmem and the valid memory address ranges, and
with this patch we merely make use of that existing functionality. The
memory is modified so that it is possible to call the access functions
(atomic and functional) without going through the port, and the memory
is allowed to be unconnected, i.e. have no ports (since Ruby does not
attach it like the conventional memory system).
This patch introduces the notion of a master and slave port in the C++
code, thus bringing the previous classification from the Python
classes into the corresponding simulation objects and memory objects.
The patch enables us to classify behaviours into the two bins and add
assumptions and enfore compliance, also simplifying the two
interfaces. As a starting point, isSnooping is confined to a master
port, and getAddrRanges to slave ports. More of these specilisations
are to come in later patches.
The getPort function is not getMasterPort and getSlavePort, and
returns a port reference rather than a pointer as NULL would never be
a valid return value. The default implementation of these two
functions is placed in MemObject, and calls fatal.
The one drawback with this specific patch is that it requires some
code duplication, e.g. QueuedPort becomes QueuedMasterPort and
QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort
(avoiding multiple inheritance). With the later introduction of the
port interfaces, moving the functionality outside the port itself, a
lot of the duplicated code will disappear again.
This patch decouples the queueing and the port interactions to
simplify the introduction of the master and slave ports. By separating
the queueing functionality from the port itself, it becomes much
easier to distinguish between master and slave ports, and still retain
the queueing ability for both (without code duplication).
As part of the split into a PacketQueue and a port, there is now also
a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The
QueuedPort is useful for ports that want to leave the packet
transmission of outgoing packets to the queue and is used by both
master and slave ports. The SimpleTimingPort inherits from the
QueuedPort and adds the implemention of recvTiming and recvFunctional
through recvAtomic.
The PioPort and MessagePort are cleaned up as part of the changes.
--HG--
rename : src/mem/tport.cc => src/mem/packet_queue.cc
rename : src/mem/tport.hh => src/mem/packet_queue.hh
This patch moves all port creation from the getPort method to be
consistently done in the MemObject's constructor. This is possible
thanks to the Swig interface passing the length of the vector ports.
Previously there was a mix of: 1) creating the ports as members (at
object construction time) and using getPort for the name resolution,
or 2) dynamically creating the ports in the getPort call. This is now
uniform. Furthermore, objects that would not be complete without a
port have these ports as members rather than having pointers to
dynamically allocated ports.
This patch also enables an elaboration-time enumeration of all the
ports in the system which can be used to determine the masterId.
The functional ports are no longer used and this patch cleans up the
legacy that is still present in buses, memories, CPUs etc. Note that
this does not refer to the class FunctionalPort (already removed), but
rather ports with the name (and use) functional.
This patch simplifies the address-range determination mechanism and
also unifies the naming across ports and devices. It further splits
the queries for determining if a port is snooping and what address
ranges it responds to (aiming towards a separation of
cache-maintenance ports and pure memory-mapped ports). Default
behaviours are such that most ports do not have to define isSnooping,
and master ports need not implement getAddrRanges.