Rather than adding the options to *every* config that might be using
GPU_VIPER.py, just change the Ruby config to check if the option is
available before trying to use it. Otherwise, reverts to what was the
default on stable.
Change-Id: Ia6f1d0827d489ee2a35c598b644461cbff59e247
Adding RP_choose function to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies for TCC, TCP and SQC caches for GPU
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies for TCC, TCP and SQC caches for GPU
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies for TCC, TCP and SQC caches for GPU
The scalar cache is not being invalidated which causes stale data to be
left in the scalar cache between GPU kernels. This commit sends
invalidates to the scalar cache when the SQC is invalidated. This is a
sufficient baseline for simulation.
Since the number of invalidates might be larger than the mandatory queue
can hold and no flash invalidate mechanism exists in the VIPER protocol,
the command line option for the mandatory queue size is removed, which
is the same behavior as the SQC.
Change-Id: I1723f224711b04caa4c88beccfa8fb73ccf56572
After removing `get_runtime_isa`, the `send_evicts` function in the ruby
configs assumes that there is an ISA built. This change short-circuits
that logic if the current build is the NULL (none) ISA.
Change-Id: I75fefe3d70649b636b983c4d2145c63a9e1342f7
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
configs/ruby/Ruby.py fails when `DerivO3CPU` is not compiled into the
gem5 binary. The `isinstance` check fails. This fix addds a guard.
Change-Id: I1e5503ab18ec94683056c6eb28cebeda6632ae8e
1. Fix the wrong ISA detect of get_isa function
2. Fix the typo ObjectLIst.cpu_list
3. Fix missing PageTableWalkerCache
4. Fix the invalid default cpu_type paramter
Change-Id: I217ea8da8a6d8e712743a5b32c4c0669216ce6c4
`get_runtime_isa()` has been deprecated for some time. It is a leftover
piece of code from when gem5 was compiled to a single ISA and that ISA
used to configure the simulated system to use that ISA. Since multi-ISA
compilations are possible, `get_runtime_isa()` should not be used.
Unless the gem5 binary is compiled to a single ISA, a failure will
occur.
The new proceedure for specify which ISA to use is by the setting of the
correct `BaseCPU` implementation. E.g., `X86SimpleTimingCPU` of
`ArmO3CPU`.
This patch removes the remaining `get_runtime_isa()` instances and
removes the function itself. The `SimpleCore` class has been updated to
allow for it's CPU factory to return a class, needed by scripts in
"configs/common".
The deprecated functionality in the standard library, which allowed for
the specifying of an ISA when setting up a processor and/or core has
also been removed. Setting an ISA is now manditory.
Fixes#216.
This patch adds supports for using the "classic" prefetchers with ruby
cache controllers.
This pull request includes a few commits making the changes in this
order:
- Refactor decouples the classic cache and prefetchers interfaces
- Extras probes for later integration with ruby
- General ruby-side support
- Adds support for the CHI protocol
Commit [mem-ruby: support prefetcher in CHI
protocol](2bdb65653b)
may be used as example on how to add support for other protocols.
JIRA issues that may be related to this pull request:
https://gem5.atlassian.net/browse/GEM5-457https://gem5.atlassian.net/browse/GEM5-1112
Added a resource constraint, AtomicALUOperation, to GLC atomics
performed in the TCC.
The resource constraint uses a new class, ALUFreeList array. The class
assumes the following:
- There are a fixed number of atomic ALU pipelines
- While a new cache line can be processed in each pipeline each cycle,
if a cache line is currently going through a pipeline, it can't be
processed again until it's finished
Two configuration parameters have been used to tune this behavior:
- tcc-num-atomic-alus corresponds to the number of atomic ALU pipelines
- atomic-alu-latency corresponds to the latency of atomic ALU pipelines
Change-Id: I25bdde7dafc3877590bb6536efdf57b8c540a939
PR #367 adds an option to configs/ruby/GPU_VIPER.py that was not added
to the corresponding dGPU equal for GPUFS and thus all GPUFS runs are
failing. Fixed in this patch.
Added checks to ensure that atomics are not performed in the TCC when it
is configured as a write-through cache. Also added SLC bit overwrite to
ensure directory preforms atomics when there is a write-through TCC.
Change-Id: I4514e6c8022aeb7785f2c59871cd9acec8161ed8
Added a new feature to CHI protocol (in collaboration with @tiagormk).
Here is the Jira Ticket
[https://gem5.atlassian.net/browse/GEM5-1326](https://gem5.atlassian.net/browse/GEM5-1326
). As described in CHI specs, far atomic transactions enable remote
execution of Atomic Memory Operations. This pull request incorporates
several changes:
* Fix Arm ISA definition of Swap instructions. These instructions should
return an operand, so their ISA definition should be Return Operation.
* Enable AMOs in Ruby Mem Test to verify that AMOs work
* Enable near and far AMO in the Cache Controler of CHI
Three configuration parameters have been used to tune this behavior:
* policy_type: sets the atomic policy to one of the described in [our
paper](https://dl.acm.org/doi/10.1145/3579371.3589065)
* atomic_op_latency: simulates the AMO ALU operation latency
* comp_anr: configures the Atomic No return transaction to split
CompDBIDResp into two different messages DBIDResp and Comp
Introduce far atomic operations in CHI protocol.
Three configuration parameters have been used to tune this behavior:
policy_type: sets the atomic policy to one of the described in our paper
atomic_op_latency: simulates the AMO ALU operation latency
comp_anr: configures the Atomic No return transaction to split
CompDBIDResp into two different messages DBIDResp and Comp
Change-Id: I087afad9ad9fcb9df42d72893c9e32ad5a5eb478
Previously, the L1, L2 number of banks and L2 latencies were not
configurable through command line arguments. This commit adds support to
configure them through the arguments '--tcp-num-banks' for number of
banks in L1, '--tcc-num-banks' for number of banks in L2, and
'--tcc-tag-access-latency', and '--tcc-data-access-latency'
Change-Id: Ie3b713ead16865fd7120e2d809ebfa56b69bc4a1
Added a GLC atomic latency parameter (glc-atomic-latency) used when
enqueueing response messages regarding atomics directly performed in
the TCC. This latency is added in addition to the L2 response latency
(TCC_latency). This represents the latency of performing an atomic
within the L2.
With this change, the TCC response queue will receive enqueues with
varying latencies as GLC atomic responses will have this added GLC
atomic latency while data responses will not. To accommodate this in
light of the queue having strict FIFO ordering (which would be violated
here), this change also adds an optional parameter bypassStrictFIFO to
the SLICC enqueue function which allows overriding strict FIFO
requirements for individual messages on a case-by-case basis. This
parameter is only being used in the TCC's atomic response enqueue call.
Change-Id: Iabd52cbd2c0cc385c1fb3fe7bcd0cc64bdb40aac
The TARGET_ISA variable would let you select one ISA from a list of
possible ISAs. That has now been replaced with USE_ARM_ISA, USE_X86_ISA,
etc, variables which are boolean on or off. That will allow any number
of ISAs to be enabled or disabled individually. Enabling something other
than exactly one of these will probably prevent you from getting a
working gem5 binary, but those problems are being addressed in other,
parallel change series.
I decided to use the USE_ prefix since it was consistent with most other
on/off variables we have in gem5. One noteable exception is the
BUILD_GPU setting which, you could convincingly argue, is a better
prefix than USE_. Another option would be to use CONFIG_, in
anticipation of using a kconfig style config mechanism in gem5.
It seemed premature to start using a CONFIG_ prefix here, and if we
decide to switch to some other prefix like BUILD_, it should be a
purposeful choice and not something somebody just starts using.
Change-Id: I90fef2835aa4712782e6c1313fbf564d0ed45538
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52491
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Update the TCP_latency input arg to reflect what it does -- in
combination with the number of banks, it determines the number of
accesses that can happen in the L1 (TCP) in a given cycle. It does
not directly affect the L1 latency as the name implies. Instead,
the mandatory_queue_latency does this.
Change-Id: Ib6cbc8367ce2b1f30005d137384f53650a403b49
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61309
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
The VIPER configuration uses the MOESI_AMD_Base protocol's directory.
This protocol does not wait for memory ACKs. As a result, this can lead
to read requests being pulled out of the MessageBuffer between the
directory and DRAMCtrl before a write request to the same address. This
leads to inconsistent data. To fix this, make the MessageBuffers
ordered. Since these MessageBuffers are essentially just an interface
between SLICC and DRAMCtrl, and DRAMCtrl can reorder requests properly,
this should not cause any large impact on performance due to the
constraint.
Also remove the duplicate instantiation of these MessageBuffers.
Change-Id: I59653717cc79884e733af3958adfc14941703958
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57411
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Remove the line "For use for simulation and test purposes only" in files
were AMD is the only copyright holder listed in the header. This happens
to be the case for all files where this line exists, removing it
completely from gem5.
Change-Id: I623f266b002f564301b28774f49081099cfc60fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In order to fix several regression failures [1] the master/slave
terminology in src/cpu/BaseCPU.py was reintroduced [2].
This patch is addressing the issue by providing 2 different
ways of connecting cpu ports:
*) connectBus: The method assumes an object with a bus interface is
passed as an argument, therefore it tries to bind cpu ports to the
bus.mem_side_ports and bus.cpu_side_ports
*) connectAllPorts: No assumption on the port owning device is made.
The method simply accepts ports as arguments which will be directly
connected to the peer cpu ports
This will be used for example by ruby Sequencers
[1]: https://gem5.atlassian.net/browse/GEM5-775
[2]: https://gem5-review.googlesource.com/c/public/gem5/+/34495
Change-Id: I715ab8471621d6e5eb36731d7eaefbedf9663a71
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52584
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
In order to have more fine grained control over which SLICC controllers
are part of which Ruby network in a disjoint configuration, the
create_system function in GPU_VIPER is broken up into multiple construct
calls for each SLICC machine type in the protocol. By default this does
not change anything functionally. A future config will use the construct
calls to explicitly set which network (CPU or GPU) the controller is in.
Change-Id: Ic038b300c5c3732e96992ef4bfe14e43fa0ea824
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51847
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>