QoSMemSinkInterface is a special case of memory interface type, similar
to SimpleMemory. It requires a QoSMemSinkCtrl where most model parameters
are exposed. By adding support in "config_mem", we allow configurations
with multiple QoSMemSinkCtrls to be centrally configured, particularly
interleaving parameters.
Change-Id: I46462b55d587acd2c861963bc0279bce92d5f450
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35797
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Make the actual controller more generic
- Rename DRAMCtrl to MemCtrl
- Rename DRAMacket to MemPacket
- Rename dram_ctrl.cc to mem_ctrl.cc
- Rename dram_ctrl.hh to mem_ctrl.hh
- Create MemCtrl debug flag
Move the memory interface classes/functions to separate files
- mem_interface.cc
- mem_interface.hh
Change-Id: I1acba44c855776343e205e7733a7d8bbba92a82c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31654
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add NVM interface to memory controller.
This can be used with or instead of the existing
DRAM interface. Therefore, a single controller can interface
to either DRAM or NVM, or both.
Specifically, a memory channel can be configured as:
- Memory controller interfacing to DRAM only
- Memory controller interfacing to NVM only
- Memory controller interfacing to both DRAM and NVM
How data is placed or migrated between media types is outside
of the scope of this change.
The NVM interface incorporates new static delay parameters
for read and write completion. The interface defines a 2
stage read to manage non-deterministic read delays while
enabling deterministic data transfer, similar to NVDIMM-P.
The NVM interface also includes parameters to define
read and write buffers on the media side (on-DIMM). These are
utilized to quickly offload commands and write data, mitigating
the effects of lower latency and bandwidth media characteristics.
Change-Id: I6b22ddb495877f88d161f0bd74ade32cc8fdcbcc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29027
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Wendy Elsasser <wendy.elsasser@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Made DRAMCtrl a ClockedObject, with DRAMInterface
defined as an AbstractMemory. The address
ranges are now defined per interface. Currently
the model only includes a DRAMInterface but this
can be expanded for other media types.
The controller object includes a parameter to the
interface, which is setup when gem5 is configured.
Change-Id: I6a368b845d574a713c7196c5671188ca8c1dc5e8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28968
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This changeset allows setting a variable for interleaving.
That value is used together with the number of directories to
calculate numa_high_bit, which is in turn used to set up
cache, directory, and memory controller interleaving.
A similar approach is used to set xor_low_bit, and calculate
xor_high_bit for address hashing.
Change-Id: Ia342c77c59ca2e3438db218b5c399c3373618320
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/28134
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Information about what kernel to load and how to load it was built
into the System object and its subclasses. That overloaded the System
object and made it responsible for too many things, and also was
somewhat awkward when working with SE mode which doesn't have a kernel.
This change extracts the kernel and information related to it from the
System object and puts into into a OsKernel or Workload object.
Currently the idea of a "Workload" to run and a kernel are a bit
muddled, an unfortunate carry-over from the original code. It's also an
implication of trying not to make too sweeping of a change, and to
minimize the number of times configs need to change, ie avoiding
creating a "kernel" parameter which would shortly thereafter be
renamed to "workload".
In future changes, the ideas of a kernel and a workload will be
disentangled, and workloads will be expanded to include emulated
operating systems which shephard and contain Process-es for syscall
emulation.
This change was originally split into pieces to make reviewing it
easier. Those reviews are here:
https: //gem5-review.googlesource.com/c/public/gem5/+/22243
https: //gem5-review.googlesource.com/c/public/gem5/+/24144
https: //gem5-review.googlesource.com/c/public/gem5/+/24145
https: //gem5-review.googlesource.com/c/public/gem5/+/24146
https: //gem5-review.googlesource.com/c/public/gem5/+/24147
https: //gem5-review.googlesource.com/c/public/gem5/+/24286
Change-Id: Ia3d863db276a023b6a2c7ee7a656d8142ff75589
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26466
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Typically, a memory controller is assigned an address range of the
form [start, end). This address range might be interleaved and
therefore only a non-continuous subset of the addresses in the address
range is handed by this controller.
Prior to this patch, the DRAM controller was unaware of the
interleaving and as a result the address range could affect the
mapping of addresses to DRAM ranks, rows and columns. This patch
changes the DRAM controller, to transform the input address to a
continuous range of the form [0, size). As a result the DRAM
controller always operates on a dense and continuous address range
regardlesss of the system configuration.
Change-Id: I7d273a630928421d1854658c9bb0ab34e9360851
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19328
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Wendy Elsasser <wendy.elsasser@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Adding an option to enable DRAM low-power states. The low power
states can have a significant impact on application performance
(sim_ticks) on the order of 2-3x, especially for compute-gpu apps.
The options allows for it to easily be enabled/disabled to compare
performance numbers. The option is disabled by default.
Change-Id: Ib9bddbb792a1a6a4afb5339003472ff8f00a5859
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18548
Reviewed-by: Wendy Elsasser <wendy.elsasser@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Python 2.7 used to return lists for operations such as map and range,
this has changed in Python 3. To make the configs Python 3 compliant,
add explicit conversions from iterators to lists where needed, replace
xrange with range, and fix changes to exec syntax.
This change doesn't fix import paths since that might require us to
restructure the configs slightly.
Change-Id: Idcea8482b286779fc98b4e144ca8f54069c08024
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/16002
Reviewed-by: Gabe Black <gabeblack@google.com>
MemConfig currently assumes that all callers include the its full set
of options in the command line parser. This is unnecessary and
sometimes confusing. Make most of the options optional to avoid having
to add all of them to example scripts.
Change-Id: I2d73be2454427b00db16716edcfd96a47133c888
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/3940
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Names of DRAM configurations were updated to reflect both
the channel and device data width.
Previous naming format was:
<DEVICE_TYPE>_<DATA_RATE>_<CHANNEL_WIDTH>
The following nomenclature is now used:
<DEVICE_TYPE>_<DATA_RATE>_<n>x<w>
where n = The number of devices per rank on the channel
x = Device width
Total channel width can be calculated by n*w
Example:
A 64-bit DDR4, 2400 channel consisting of 4-bit devices:
n = 16
w = 4
The resulting configuration name is:
DDR4_2400_16x4
Updated scripts to match new naming convention.
Added unique configurations for DDR4 for:
1) 16x4
2) 8x8
3) 4x16
Change-Id: Ibd7f763b7248835c624309143cb9fc29d56a69d1
Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
The current TLM bridge only provides a Slave Port that allows the gem5
world to send request to the SystemC world. This patch series refractors
and cleans up the existing code, and adds a Master Port that allows the
SystemC world to send requests to the gem5 world.
This patch:
* Restructure the existing sources in preparation of the addition of the
* new
Master Port.
* Refractor names to allow for distinction of the slave and master port.
* Replace the Makefile by a SConstruct.
Testing Done: The examples provided in util/tlm (now
util/tlm/examples/slave_port) still compile and run error free.
Reviewed at http://reviews.gem5.org/r/3527/
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
In this new hmc configuration we have used the existing components in gem5
mainly [SerialLink] [NoncoherentXbar]& [DRAMCtrl] to define 3 different
architecture for HMC.
Highlights
1- It explores 3 different HMC architectures
2- It creates 4-HMC crossbars and attaches 16 vault controllers with it.
This will connect vaults to serial links
3- From the previous version, HMCController with round robin funtionality
is being removed and all the serial links are being accessible directly
from user ports
4- Latency incorporated by HMCController (in previous version) is being
added to SerialLink
Committed by Jason Lowe-Power <jason@lowepower.com>
This patch adds changes to the configuration scripts to support elastic
tracing and replay.
The patch adds a command line option to enable elastic tracing in SE mode
and FS mode. When enabled the Elastic Trace cpu probe is attached to O3CPU
and a few O3 CPU parameters are tuned. The Elastic Trace probe writes out
both instruction fetch and data dependency traces. The patch also enables
configuring the TraceCPU to replay traces using the SE and FS script.
The replay run is designed to resume from checkpoint using atomic cpu to
restore state keeping it consistent with FS run flow. It then switches to
TraceCPU to replay the input traces.
This patch enables modeling a complete Hybrid Memory Cube (HMC) device. It
highly reuses the existing components in gem5's general memory system with some
small modifications. This changeset requires additional patches to model a
complete HMC device.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Transaction Level Modeling (TLM2.0) is widely used in industry for creating
virtual platforms (IEEE 1666 SystemC). This patch contains a standard compliant
implementation of an external gem5 port, that enables the usage of gem5 as a
TLM initiator component in SystemC based virtual platforms. Both TLM coding
paradigms loosely timed (b_transport) and aproximately timed (nb_transport) are
supported.
Compared to the original patch a TLM memory manager was added. Furthermore, the
transaction object was removed and for each TLM payload a PacketPointer that
points to the original gem5 packet is added as an TLM extension. For event
handling single events are now created.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
This patch adds an example configuration in ext/sst/tests/ that allows
an SST/gem5 instance to simulate a 4-core AArch64 system with SST's
memHierarchy components providing all the caches and memories.
This patch uses the recently added XOR hashing capabilities for the
DRAM channel interleaving. This avoids channel biasing due to strided
access patterns.
This patch changes the DRAM channel interleaving default behaviour to
be more representative. The default address mapping (RoRaBaCoCh) moves
the channel bits towards the least significant bits, and uses 128 byte
as the default channel interleaving granularity.
These defaults can be overridden if desired, but should serve as a
sensible starting point for most use-cases.
This patch gives the user direct influence over the number of DRAM
ranks to make it easier to tune the memory density without affecting
the bandwidth (previously the only means of scaling the device count
was through the number of channels).
The patch also adds some basic sanity checks to ensure that the number
of ranks is a power of two (since we rely on bit slices in the address
decoding).
This patch is the final in the series. The whole series and this patch in
particular were written with the aim of interfacing ruby's directory controller
with the memory controller in the classic memory system. This is being done
since ruby's memory controller has not being kept up to date with the changes
going on in DRAMs. Classic's memory controller is more up to date and
supports multiple different types of DRAM. This also brings classic and
ruby ever more close. The patch also changes ruby's memory controller to
expose the same interface.
This patch moves code for instantiating a single memory controller from
the function config_mem() to a separate function. This is being done
so that memory controllers can be instantiated without assuming that
they will be attached to the system in a particular fashion.
This patch renames the not-so-simple SimpleDRAM to a more suitable
DRAMCtrl. The name change is intended to ensure that we do not send
the wrong message (although the "simple" in SimpleDRAM was originally
intended as in cleverly simple, or elegant).
As the DRAM controller modelling work is being presented at ISPASS'14
our hope is that a broader audience will use the model in the future.
--HG--
rename : src/mem/SimpleDRAM.py => src/mem/DRAMCtrl.py
rename : src/mem/simple_dram.cc => src/mem/dram_ctrl.cc
rename : src/mem/simple_dram.hh => src/mem/dram_ctrl.hh
This patch adds the row bits to the name of the address mapping
schemes to make it more clear that all the current schemes places the
row bits as the most significant bits.
This patch adds DRAMSim2 as a memory controller by wrapping the
external library and creating a sublass of AbstractMemory that bridges
between the semantics of gem5 and the DRAMSim2 interface.
The DRAMSim2 wrapper extracts the clock period from the config
file. There is no way of extracting this information from DRAMSim2
itself, so we simply read the same config file and get it from there.
To properly model the response queue, the wrapper keeps track of how
many transactions are in the actual controller, and how many are
stacking up waiting to be sent back as responses (in the wrapper). The
latter requires us to move away from the queued port and manage the
packets ourselves. This is due to DRAMSim2 not having any flow control
on the response path.
DRAMSim2 assumes that the transactions it is given are matching the
burst size of the choosen memory. The wrapper checks to ensure the
cache line size of the system matches the burst size of DRAMSim2 as
there are currently no provisions to split the system requests. In
theory we could allow a cache line size smaller than the burst size,
but that would lead to inefficient use of the DRAM, so for not we
fatal also in this case.
This patch adds support for specifying multi-channel memory
configurations on the command line, e.g. 'se/fs.py
--mem-type=ddr3_1600_x64 --mem-channels=4'. To enable this, it
enhances the functionality of MemConfig and moves the existing
makeMultiChannel class method from SimpleDRAM to the support scripts.
The se/fs.py example scripts are updated to make use of the new
feature.
This patch changes the class names of the variuos DRAM configurations
to better reflect what memory they are based on. The speed and
interface width is now part of the name, and also the alias that is
used to select them on the command line.
Some minor changes are done to the actual parameters, to better
reflect the named configurations. As a result of these changes the
regressions change slightly and the stats will be bumped in a separate
patch.
This patch adds a typical (leaning towards fast) LPDDR3 configuration
based on publically available data. As expected, it looks very similar
to the LPDDR2-S4 configuration, only with a slightly lower burst time.
This patch enables selection of the memory controller class through a
mem-type command-line option. Behind the scenes, this option is
treated much like the cpu-type, and a similar framework is used to
resolve the valid options, and translate the short-hand description to
a valid class.
The regression scripts are updated with a hardcoded memory class for
the moment. The best solution going forward is probably to get the
memory out of the makeSystem functions, but Ruby complicates things as
it does not connect the memory controller to the membus.
--HG--
rename : configs/common/CpuConfig.py => configs/common/MemConfig.py