mem-ruby: HTM mem implementation
This patch augments the MESI_Three_Level Ruby protocol with hardware transactional memory (HTM) support. The HTM implementation relies on buffering speculative memory updates.

The core notifies the L0 cache controller that a new transaction has started, and the controller in turn places itself in transactional state (htmTransactionalState := true). While in transactional state, the usual MESI protocol changes slightly: lines that are loaded or stored are marked as part of the transaction's read and write set, respectively. If an invalidation request arrives for a cache line in the read or write set, the transaction is marked as failed. Similarly, if another core issues a read request for a speculatively written cache line, i.e. one in the write set, the transaction is marked as failed.

Once failed, all subsequent loads and stores from the core are made benign, i.e. turned into NOPs at the cache controller, and responses are marked to indicate that the transaction has failed. When the core receives these marked responses, it generates an HtmFailureFault with the reason for the transaction failure. Servicing this fault does two things:

(a) restores the architectural checkpoint, and
(b) sends an HTM abort signal to the cache controller.

The restoration includes all registers in the checkpoint as well as the program counter of the instruction before the transaction started. The abort signal is sent to the L0 cache controller and resets the failed transactional state: it clears the transactional read and write sets, invalidates any speculatively written cache lines, and exits transactional state so that the MESI protocol operates as usual.

Alternatively, if the instructions within a transaction complete without triggering an HtmFailureFault, the transaction can be committed.
The core is responsible for notifying the cache controller that the transaction is complete, at which point the cache controller makes all speculative writes visible to the rest of the system and exits transactional state. Notifying the cache controller is done through HtmCmd requests, which are a subtype of Load requests.

KUDOS: The code is based on a previous pull request by Pradip Vallathol, who developed HTM and TSX support in gem5 as part of his master's thesis: http://reviews.gem5.org/r/2308/index.html

JIRA: https://gem5.atlassian.net/browse/GEM5-587
Change-Id: Icc328df93363486e923b8bd54f4d77741d8f5650
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30319
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
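The transactional-state behaviour described above can be summarised as a small state machine. The following is a minimal Python sketch of that logic, not gem5 code: all class and method names here are illustrative stand-ins for the L0 controller's transactional state, read/write sets, and speculative write buffer, and `TransactionFailed` stands in for gem5's HtmFailureFault.

```python
class TransactionFailed(Exception):
    """Stand-in for gem5's HtmFailureFault in this sketch."""

class L0ControllerModel:
    """Toy model of the transactional L0 state described above."""
    def __init__(self):
        self.transactional = False   # htmTransactionalState
        self.failed = False
        self.read_set = set()
        self.write_set = set()
        self.spec_lines = {}         # speculative data: line -> value
        self.lines = {}              # committed data: line -> value

    def begin(self):
        # Core notified us that a new transaction has started.
        self.transactional = True
        self.failed = False
        self.read_set.clear()
        self.write_set.clear()
        self.spec_lines.clear()

    def load(self, line):
        if self.transactional:
            if self.failed:
                return None          # benign NOP; response marked as failed
            self.read_set.add(line)  # line joins the read set
            if line in self.spec_lines:
                return self.spec_lines[line]
        return self.lines.get(line)

    def store(self, line, value):
        if self.transactional:
            if self.failed:
                return               # benign NOP
            self.write_set.add(line)
            self.spec_lines[line] = value   # buffered, not globally visible
        else:
            self.lines[line] = value

    def remote_invalidation(self, line):
        # Invalidation hitting the read or write set fails the transaction.
        if self.transactional and (line in self.read_set or
                                   line in self.write_set):
            self.failed = True

    def remote_read(self, line):
        # A remote read of a speculatively written line also fails it.
        if self.transactional and line in self.write_set:
            self.failed = True

    def abort(self):
        # HTM abort signal: drop speculative lines, clear the read/write
        # sets, and leave transactional state.
        self.spec_lines.clear()
        self.read_set.clear()
        self.write_set.clear()
        self.failed = False
        self.transactional = False

    def commit(self):
        if self.failed:
            raise TransactionFailed()
        # Make speculative writes visible to the rest of the system.
        self.lines.update(self.spec_lines)
        self.abort()   # reuse the cleanup; the transaction is over
```

Note this sketch omits everything timing- and coherence-related (the actual work happens in the SLICC state machines below); it only illustrates the ordering of begin/fail/abort/commit events.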
Commit 0a8a787de3 (parent 1c61dae99b), committed by Giacomo Travaglini
build_opts/ARM_MESI_Three_Level_HTM (new file, 6 lines)
@@ -0,0 +1,6 @@
# Copyright (c) 2019 ARM Limited
# All rights reserved.

TARGET_ISA = 'arm'
CPU_MODELS = 'TimingSimpleCPU,O3CPU'
PROTOCOL = 'MESI_Three_Level_HTM'
configs/ruby/MESI_Three_Level_HTM.py (new file, 337 lines)
@@ -0,0 +1,337 @@
# Copyright (c) 2006-2007 The Regents of The University of Michigan
# Copyright (c) 2009,2015 Advanced Micro Devices, Inc.
# Copyright (c) 2013 Mark D. Hill and David A. Wood
# Copyright (c) 2020 ARM Limited
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

import math
import m5
from m5.objects import *
from m5.defines import buildEnv
from .Ruby import create_topology, create_directories
from .Ruby import send_evicts
from common import FileSystemConfig

#
# Declare caches used by the protocol
#
class L0Cache(RubyCache): pass
class L1Cache(RubyCache): pass
class L2Cache(RubyCache): pass

def define_options(parser):
    parser.add_option("--num-clusters", type = "int", default = 1,
                      help = "number of clusters in a design in which there "
                             "are shared caches private to clusters")
    parser.add_option("--l0i_size", type="string", default="4096B")
    parser.add_option("--l0d_size", type="string", default="4096B")
    parser.add_option("--l0i_assoc", type="int", default=1)
    parser.add_option("--l0d_assoc", type="int", default=1)
    parser.add_option("--l0_transitions_per_cycle", type="int", default=32)
    parser.add_option("--l1_transitions_per_cycle", type="int", default=32)
    parser.add_option("--l2_transitions_per_cycle", type="int", default=4)
    parser.add_option("--enable-prefetch", action="store_true", default=False,
                      help="Enable Ruby hardware prefetcher")
    return

def create_system(options, full_system, system, dma_ports, bootmem,
                  ruby_system):

    if buildEnv['PROTOCOL'] != 'MESI_Three_Level_HTM':
        fatal("This script requires the MESI_Three_Level_HTM protocol to be "
              "built.")

    cpu_sequencers = []

    #
    # The ruby network creation expects the list of nodes in the system to be
    # consistent with the NetDest list. Therefore the l1 controller nodes
    # must be listed before the directory nodes and directory nodes before
    # dma nodes, etc.
    #
    l0_cntrl_nodes = []
    l1_cntrl_nodes = []
    l2_cntrl_nodes = []
    dma_cntrl_nodes = []

    assert (options.num_cpus % options.num_clusters == 0)
    num_cpus_per_cluster = options.num_cpus // options.num_clusters

    assert (options.num_l2caches % options.num_clusters == 0)
    num_l2caches_per_cluster = options.num_l2caches // options.num_clusters

    l2_bits = int(math.log(num_l2caches_per_cluster, 2))
    block_size_bits = int(math.log(options.cacheline_size, 2))
    l2_index_start = block_size_bits + l2_bits

    #
    # Must create the individual controllers before the network to ensure the
    # controller constructors are called before the network constructor
    #
    for i in range(options.num_clusters):
        for j in range(num_cpus_per_cluster):
            #
            # First create the Ruby objects associated with this cpu
            #
            l0i_cache = L0Cache(size = options.l0i_size,
                                assoc = options.l0i_assoc,
                                is_icache = True,
                                start_index_bit = block_size_bits,
                                replacement_policy = LRURP())

            l0d_cache = L0Cache(size = options.l0d_size,
                                assoc = options.l0d_assoc,
                                is_icache = False,
                                start_index_bit = block_size_bits,
                                replacement_policy = LRURP())

            # the ruby random tester reuses num_cpus to specify the
            # number of cpu ports connected to the tester object, which
            # is stored in system.cpu. because there is only ever one
            # tester object, num_cpus is not necessarily equal to the
            # size of system.cpu; therefore if len(system.cpu) == 1
            # we use system.cpu[0] to set the clk_domain, thereby ensuring
            # we don't index off the end of the cpu list.
            if len(system.cpu) == 1:
                clk_domain = system.cpu[0].clk_domain
            else:
                clk_domain = system.cpu[i].clk_domain

            # Ruby prefetcher
            prefetcher = RubyPrefetcher(
                num_streams=16,
                unit_filter = 256,
                nonunit_filter = 256,
                train_misses = 5,
                num_startup_pfs = 4,
                cross_page = True
            )

            l0_cntrl = L0Cache_Controller(
                version = i * num_cpus_per_cluster + j,
                Icache = l0i_cache, Dcache = l0d_cache,
                transitions_per_cycle = options.l0_transitions_per_cycle,
                prefetcher = prefetcher,
                enable_prefetch = options.enable_prefetch,
                send_evictions = send_evicts(options),
                clk_domain = clk_domain,
                ruby_system = ruby_system)

            cpu_seq = RubyHTMSequencer(version = i * num_cpus_per_cluster + j,
                                       icache = l0i_cache,
                                       clk_domain = clk_domain,
                                       dcache = l0d_cache,
                                       ruby_system = ruby_system)

            l0_cntrl.sequencer = cpu_seq

            l1_cache = L1Cache(size = options.l1d_size,
                               assoc = options.l1d_assoc,
                               start_index_bit = block_size_bits,
                               is_icache = False)

            l1_cntrl = L1Cache_Controller(
                version = i * num_cpus_per_cluster + j,
                cache = l1_cache, l2_select_num_bits = l2_bits,
                cluster_id = i,
                transitions_per_cycle = options.l1_transitions_per_cycle,
                ruby_system = ruby_system)

            exec("ruby_system.l0_cntrl%d = l0_cntrl"
                 % (i * num_cpus_per_cluster + j))
            exec("ruby_system.l1_cntrl%d = l1_cntrl"
                 % (i * num_cpus_per_cluster + j))

            #
            # Add controllers and sequencers to the appropriate lists
            #
            cpu_sequencers.append(cpu_seq)
            l0_cntrl_nodes.append(l0_cntrl)
            l1_cntrl_nodes.append(l1_cntrl)

            # Connect the L0 and L1 controllers
            l0_cntrl.prefetchQueue = MessageBuffer()
            l0_cntrl.mandatoryQueue = MessageBuffer()
            l0_cntrl.bufferToL1 = MessageBuffer(ordered = True)
            l1_cntrl.bufferFromL0 = l0_cntrl.bufferToL1
            l0_cntrl.bufferFromL1 = MessageBuffer(ordered = True)
            l1_cntrl.bufferToL0 = l0_cntrl.bufferFromL1

            # Connect the L1 controllers and the network
            l1_cntrl.requestToL2 = MessageBuffer()
            l1_cntrl.requestToL2.master = ruby_system.network.slave
            l1_cntrl.responseToL2 = MessageBuffer()
            l1_cntrl.responseToL2.master = ruby_system.network.slave
            l1_cntrl.unblockToL2 = MessageBuffer()
            l1_cntrl.unblockToL2.master = ruby_system.network.slave

            l1_cntrl.requestFromL2 = MessageBuffer()
            l1_cntrl.requestFromL2.slave = ruby_system.network.master
            l1_cntrl.responseFromL2 = MessageBuffer()
            l1_cntrl.responseFromL2.slave = ruby_system.network.master

        for j in range(num_l2caches_per_cluster):
            l2_cache = L2Cache(size = options.l2_size,
                               assoc = options.l2_assoc,
                               start_index_bit = l2_index_start)

            l2_cntrl = L2Cache_Controller(
                version = i * num_l2caches_per_cluster + j,
                L2cache = l2_cache, cluster_id = i,
                transitions_per_cycle = options.l2_transitions_per_cycle,
                ruby_system = ruby_system)

            exec("ruby_system.l2_cntrl%d = l2_cntrl"
                 % (i * num_l2caches_per_cluster + j))
            l2_cntrl_nodes.append(l2_cntrl)

            # Connect the L2 controllers and the network
            l2_cntrl.DirRequestFromL2Cache = MessageBuffer()
            l2_cntrl.DirRequestFromL2Cache.master = ruby_system.network.slave
            l2_cntrl.L1RequestFromL2Cache = MessageBuffer()
            l2_cntrl.L1RequestFromL2Cache.master = ruby_system.network.slave
            l2_cntrl.responseFromL2Cache = MessageBuffer()
            l2_cntrl.responseFromL2Cache.master = ruby_system.network.slave

            l2_cntrl.unblockToL2Cache = MessageBuffer()
            l2_cntrl.unblockToL2Cache.slave = ruby_system.network.master
            l2_cntrl.L1RequestToL2Cache = MessageBuffer()
            l2_cntrl.L1RequestToL2Cache.slave = ruby_system.network.master
            l2_cntrl.responseToL2Cache = MessageBuffer()
            l2_cntrl.responseToL2Cache.slave = ruby_system.network.master

    # Run each of the ruby memory controllers at a ratio of the frequency of
    # the ruby system
    # clk_divider value is a fix to pass regression.
    ruby_system.memctrl_clk_domain = DerivedClockDomain(
        clk_domain = ruby_system.clk_domain, clk_divider = 3)

    mem_dir_cntrl_nodes, rom_dir_cntrl_node = create_directories(
        options, bootmem, ruby_system, system)
    dir_cntrl_nodes = mem_dir_cntrl_nodes[:]
    if rom_dir_cntrl_node is not None:
        dir_cntrl_nodes.append(rom_dir_cntrl_node)
    for dir_cntrl in dir_cntrl_nodes:
        # Connect the directory controllers and the network
        dir_cntrl.requestToDir = MessageBuffer()
        dir_cntrl.requestToDir.slave = ruby_system.network.master
        dir_cntrl.responseToDir = MessageBuffer()
        dir_cntrl.responseToDir.slave = ruby_system.network.master
        dir_cntrl.responseFromDir = MessageBuffer()
        dir_cntrl.responseFromDir.master = ruby_system.network.slave
        dir_cntrl.requestToMemory = MessageBuffer()
        dir_cntrl.responseFromMemory = MessageBuffer()

    for i, dma_port in enumerate(dma_ports):
        #
        # Create the Ruby objects associated with the dma controller
        #
        dma_seq = DMASequencer(version = i, ruby_system = ruby_system)

        dma_cntrl = DMA_Controller(version = i,
                                   dma_sequencer = dma_seq,
                                   transitions_per_cycle = options.ports,
                                   ruby_system = ruby_system)

        exec("ruby_system.dma_cntrl%d = dma_cntrl" % i)
        exec("ruby_system.dma_cntrl%d.dma_sequencer.slave = dma_port" % i)
        dma_cntrl_nodes.append(dma_cntrl)

        # Connect the dma controller to the network
        dma_cntrl.mandatoryQueue = MessageBuffer()
        dma_cntrl.responseFromDir = MessageBuffer(ordered = True)
        dma_cntrl.responseFromDir.slave = ruby_system.network.master
        dma_cntrl.requestToDir = MessageBuffer()
        dma_cntrl.requestToDir.master = ruby_system.network.slave

    all_cntrls = l0_cntrl_nodes + \
                 l1_cntrl_nodes + \
                 l2_cntrl_nodes + \
                 dir_cntrl_nodes + \
                 dma_cntrl_nodes

    # Create the io controller and the sequencer
    if full_system:
        io_seq = DMASequencer(version=len(dma_ports), ruby_system=ruby_system)
        ruby_system._io_port = io_seq
        io_controller = DMA_Controller(version = len(dma_ports),
                                       dma_sequencer = io_seq,
                                       ruby_system = ruby_system)
        ruby_system.io_controller = io_controller

        # Connect the dma controller to the network
        io_controller.mandatoryQueue = MessageBuffer()
        io_controller.responseFromDir = MessageBuffer(ordered = True)
        io_controller.responseFromDir.slave = ruby_system.network.master
        io_controller.requestToDir = MessageBuffer()
        io_controller.requestToDir.master = ruby_system.network.slave

        all_cntrls = all_cntrls + [io_controller]
    # Register configuration with filesystem
    else:
        for i in range(options.num_clusters):
            for j in range(num_cpus_per_cluster):
                FileSystemConfig.register_cpu(physical_package_id = 0,
                    core_siblings = range(options.num_cpus),
                    core_id = i*num_cpus_per_cluster+j,
                    thread_siblings = [])

                FileSystemConfig.register_cache(level = 0,
                    idu_type = 'Instruction',
                    size = options.l0i_size,
                    line_size = options.cacheline_size,
                    assoc = 1,
                    cpus = [i*num_cpus_per_cluster+j])
                FileSystemConfig.register_cache(level = 0,
                    idu_type = 'Data',
                    size = options.l0d_size,
                    line_size = options.cacheline_size,
                    assoc = 1,
                    cpus = [i*num_cpus_per_cluster+j])

                FileSystemConfig.register_cache(level = 1,
                    idu_type = 'Unified',
                    size = options.l1d_size,
                    line_size = options.cacheline_size,
                    assoc = options.l1d_assoc,
                    cpus = [i*num_cpus_per_cluster+j])

                FileSystemConfig.register_cache(level = 2,
                    idu_type = 'Unified',
                    size = str(MemorySize(options.l2_size) *
                               num_l2caches_per_cluster) + 'B',
                    line_size = options.cacheline_size,
                    assoc = options.l2_assoc,
                    cpus = [n for n in range(i*num_cpus_per_cluster,
                                             (i+1)*num_cpus_per_cluster)])

    ruby_system.network.number_of_virtual_networks = 3
    topology = create_topology(all_cntrls, options)
    return (cpu_sequencers, mem_dir_cntrl_nodes, topology)
@@ -138,4 +138,5 @@ MakeInclude('system/Sequencer.hh')
 # <# include "mem/ruby/protocol/header.hh"> in any file
 # generated_dir = Dir('protocol')
 MakeInclude('system/GPUCoalescer.hh')
+MakeInclude('system/HTMSequencer.hh')
 MakeInclude('system/VIPERCoalescer.hh')
@@ -130,6 +130,14 @@ machine(MachineType:L1Cache, "MESI Directory L1 Cache CMP")
     Ack_all, desc="Last ack for processor";

     WB_Ack, desc="Ack for replacement";

+    // hardware transactional memory
+    L0_DataCopy, desc="Data Block from L0. Should remain in M state.";
+
+    // L0 cache received the invalidation message and has
+    // sent a NAK (because of htm abort) saying that the data
+    // in L1 is the latest value.
+    L0_DataNak, desc="L0 received INV message, specifies its data is also stale";
   }

   // TYPES
@@ -361,6 +369,10 @@ machine(MachineType:L1Cache, "MESI Directory L1 Cache CMP")

         if(in_msg.Class == CoherenceClass:INV_DATA) {
           trigger(Event:L0_DataAck, in_msg.addr, cache_entry, tbe);
+        } else if (in_msg.Class == CoherenceClass:NAK) {
+          trigger(Event:L0_DataNak, in_msg.addr, cache_entry, tbe);
+        } else if (in_msg.Class == CoherenceClass:PUTX_COPY) {
+          trigger(Event:L0_DataCopy, in_msg.addr, cache_entry, tbe);
         } else if (in_msg.Class == CoherenceClass:INV_ACK) {
           trigger(Event:L0_Ack, in_msg.addr, cache_entry, tbe);
         } else {
@@ -808,18 +820,6 @@ machine(MachineType:L1Cache, "MESI Directory L1 Cache CMP")
     k_popL0RequestQueue;
   }

-  transition(EE, Load, E) {
-    hh_xdata_to_l0;
-    uu_profileHit;
-    k_popL0RequestQueue;
-  }
-
-  transition(MM, Load, M) {
-    hh_xdata_to_l0;
-    uu_profileHit;
-    k_popL0RequestQueue;
-  }
-
   transition({S,SS}, Store, SM) {
     i_allocateTBE;
     c_issueUPGRADE;
@@ -1034,7 +1034,7 @@ machine(MachineType:L1Cache, "MESI Directory L1 Cache CMP")
     kd_wakeUpDependents;
   }

-  transition(SM, L0_Invalidate_Else, SM_IL0) {
+  transition(SM, {Inv,L0_Invalidate_Else}, SM_IL0) {
     forward_eviction_to_L0_else;
   }

@@ -1093,4 +1093,55 @@ machine(MachineType:L1Cache, "MESI Directory L1 Cache CMP")
   transition({S_IL0, M_IL0, E_IL0, MM_IL0}, {Inv, Fwd_GETX, Fwd_GETS}) {
     z2_stallAndWaitL2Queue;
   }
+
+  // hardware transactional memory
+
+  // If a transaction has aborted, the L0 could re-request
+  // data which is in E or EE state in L1.
+  transition({EE,E}, Load, E) {
+    hh_xdata_to_l0;
+    uu_profileHit;
+    k_popL0RequestQueue;
+  }
+
+  // If a transaction has aborted, the L0 could re-request
+  // data which is in M or MM state in L1.
+  transition({MM,M}, Load, M) {
+    hh_xdata_to_l0;
+    uu_profileHit;
+    k_popL0RequestQueue;
+  }
+
+  // If a transaction has aborted, the L0 could re-request
+  // data which is in M state in L1.
+  transition({E,M}, Store, M) {
+    hh_xdata_to_l0;
+    uu_profileHit;
+    k_popL0RequestQueue;
+  }
+
+  // A transaction may have tried to modify a cache block in M state with
+  // non-speculative (pre-transactional) data. This needs to be copied
+  // to the L1 before any further modifications occur at the L0.
+  transition({M,E}, L0_DataCopy, M) {
+    u_writeDataFromL0Request;
+    k_popL0RequestQueue;
+  }
+
+  transition({M_IL0, E_IL0}, L0_DataCopy, M_IL0) {
+    u_writeDataFromL0Request;
+    k_popL0RequestQueue;
+  }
+
+  // A NAK from the L0 means that the L0 invalidated its
+  // modified line (due to an abort) so it is therefore necessary
+  // to use the L1's correct version instead
+  transition({M_IL0, E_IL0}, L0_DataNak, MM) {
+    k_popL0RequestQueue;
+    kd_wakeUpDependents;
+  }
+
+  transition(I, L1_Replacement) {
+    ff_deallocateCacheBlock;
+  }
 }
@@ -48,6 +48,7 @@ enumeration(CoherenceClass, desc="...") {
   INV_OWN, desc="Invalidate (own)";
   INV_ELSE, desc="Invalidate (else)";
   PUTX, desc="Replacement message";
+  PUTX_COPY, desc="Data block to be copied in L1. L0 will still be in M state";

   WB_ACK, desc="Writeback ack";

@@ -59,6 +60,7 @@ enumeration(CoherenceClass, desc="...") {
   DATA, desc="Data block for L1 cache in S state";
   DATA_EXCLUSIVE, desc="Data block for L1 cache in M/E state";
   ACK, desc="Generic invalidate ack";
+  NAK, desc="Used by L0 to tell L1 that it cannot provide the latest value";

   // This is a special case in which the L1 cache lost permissions to the
   // shared block before it got the data. So the L0 cache can use the data
src/mem/ruby/protocol/MESI_Three_Level_HTM-L0cache.sm (new file, 1606 lines)
File diff suppressed because it is too large
src/mem/ruby/protocol/MESI_Three_Level_HTM.slicc (new file, 9 lines)
@@ -0,0 +1,9 @@
protocol "MESI_Three_Level_HTM";
include "RubySlicc_interfaces.slicc";
include "MESI_Two_Level-msg.sm";
include "MESI_Three_Level-msg.sm";
include "MESI_Three_Level_HTM-L0cache.sm";
include "MESI_Three_Level-L1cache.sm";
include "MESI_Two_Level-L2cache.sm";
include "MESI_Two_Level-dir.sm";
include "MESI_Two_Level-dma.sm";
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2019 ARM Limited
+ * Copyright (c) 2020 ARM Limited
  * All rights reserved.
  *
  * The license below extends only to copyright in the software and shall
@@ -167,6 +167,31 @@ enumeration(RubyRequestType, desc="...", default="RubyRequestType_NULL") {
   Release, desc="Release operation";
   Acquire, desc="Acquire operation";
   AcquireRelease, desc="Acquire and Release operation";
+  HTM_Start, desc="hardware memory transaction: begin";
+  HTM_Commit, desc="hardware memory transaction: commit";
+  HTM_Cancel, desc="hardware memory transaction: cancel";
+  HTM_Abort, desc="hardware memory transaction: abort";
 }

 bool isWriteRequest(RubyRequestType type);
 bool isDataReadRequest(RubyRequestType type);
 bool isReadRequest(RubyRequestType type);
+bool isHtmCmdRequest(RubyRequestType type);
+
+// hardware transactional memory
+RubyRequestType htmCmdToRubyRequestType(Packet *pkt);
+
+enumeration(HtmCallbackMode, desc="...", default="HtmCallbackMode_NULL") {
+  HTM_CMD, desc="htm command";
+  LD_FAIL, desc="htm transaction failed - inform via read";
+  ST_FAIL, desc="htm transaction failed - inform via write";
+}
+
+enumeration(HtmFailedInCacheReason, desc="...", default="HtmFailedInCacheReason_NO_FAIL") {
+  NO_FAIL, desc="no failure in cache";
+  FAIL_SELF, desc="failed due to local cache's replacement policy";
+  FAIL_REMOTE, desc="failed due to remote invalidation";
+  FAIL_OTHER, desc="failed due to other circumstances";
+}

 enumeration(SequencerRequestType, desc="...", default="SequencerRequestType_NULL") {
@@ -132,12 +132,18 @@ structure (Sequencer, external = "yes") {
   // ll/sc support
   void writeCallbackScFail(Addr, DataBlock);
   bool llscCheckMonitor(Addr);
   void llscClearLocalMonitor();

   void evictionCallback(Addr);
   void recordRequestType(SequencerRequestType);
   bool checkResourceAvailable(CacheResourceType, Addr);
 }

+structure (HTMSequencer, interface="Sequencer", external = "yes") {
+  // hardware transactional memory
+  void htmCallback(Addr, HtmCallbackMode, HtmFailedInCacheReason);
+}
+
 structure(RubyRequest, desc="...", interface="Message", external="yes") {
   Addr LineAddress, desc="Line address for this request";
   Addr PhysicalAddress, desc="Physical address for this request";
@@ -152,6 +158,8 @@ structure(RubyRequest, desc="...", interface="Message", external="yes") {
   int wfid, desc="Writethrough wavefront";
   uint64_t instSeqNum, desc="Instruction sequence number";
   PacketPtr pkt, desc="Packet associated with this request";
+  bool htmFromTransaction, desc="Memory request originates within a HTM transaction";
+  int htmTransactionUid, desc="Used to identify the unique HTM transaction that produced this request";
 }

 structure(AbstractCacheEntry, primitive="yes", external = "yes") {
@@ -185,6 +193,10 @@ structure (CacheMemory, external = "yes") {
   void recordRequestType(CacheRequestType, Addr);
   bool checkResourceAvailable(CacheResourceType, Addr);

+  // hardware transactional memory
+  void htmCommitTransaction();
+  void htmAbortTransaction();
+
   int getCacheSize();
   int getNumBlocks();
   Addr getAddressAtIdx(int);
@@ -38,6 +38,7 @@ all_protocols.extend([
     'MOESI_AMD_Base',
     'MESI_Two_Level',
     'MESI_Three_Level',
+    'MESI_Three_Level_HTM',
     'MI_example',
     'MOESI_CMP_directory',
     'MOESI_CMP_token',
@@ -1,4 +1,16 @@
 /*
+ * Copyright (c) 2020 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder. You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
  * Copyright (c) 1999-2008 Mark D. Hill and David A. Wood
  * All rights reserved.
  *
@@ -37,6 +49,8 @@ AbstractCacheEntry::AbstractCacheEntry() : ReplaceableEntry()
     m_Address = 0;
     m_locked = -1;
     m_last_touch_tick = 0;
+    m_htmInReadSet = false;
+    m_htmInWriteSet = false;
 }

 AbstractCacheEntry::~AbstractCacheEntry()
@@ -81,3 +95,27 @@ AbstractCacheEntry::isLocked(int context) const
             m_Address, m_locked, context);
     return m_locked == context;
 }
+
+void
+AbstractCacheEntry::setInHtmReadSet(bool val)
+{
+    m_htmInReadSet = val;
+}
+
+void
+AbstractCacheEntry::setInHtmWriteSet(bool val)
+{
+    m_htmInWriteSet = val;
+}
+
+bool
+AbstractCacheEntry::getInHtmReadSet() const
+{
+    return m_htmInReadSet;
+}
+
+bool
+AbstractCacheEntry::getInHtmWriteSet() const
+{
+    return m_htmInWriteSet;
+}
@@ -1,4 +1,16 @@
 /*
+ * Copyright (c) 2020 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder. You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
  * Copyright (c) 1999-2008 Mark D. Hill and David A. Wood
  * All rights reserved.
  *
@@ -90,6 +102,18 @@ class AbstractCacheEntry : public ReplaceableEntry

     // Set the last access Tick.
     void setLastAccess(Tick tick) { m_last_touch_tick = tick; }

+    // hardware transactional memory
+    void setInHtmReadSet(bool val);
+    void setInHtmWriteSet(bool val);
+    bool getInHtmReadSet() const;
+    bool getInHtmWriteSet() const;
     virtual void invalidateEntry() {}

   private:
+    // hardware transactional memory
+    bool m_htmInReadSet;
+    bool m_htmInWriteSet;
 };

 inline std::ostream&
@@ -1,4 +1,16 @@
 /*
+ * Copyright (c) 2020 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder. You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
  * Copyright (c) 2009 Mark D. Hill and David A. Wood
  * All rights reserved.
  *
@@ -57,6 +69,8 @@ class RubyRequest : public Message
     DataBlock m_WTData;
     int m_wfid;
     uint64_t m_instSeqNum;
+    bool m_htmFromTransaction;
+    uint64_t m_htmTransactionUid;

     RubyRequest(Tick curTime, uint64_t _paddr, uint8_t* _data, int _len,
         uint64_t _pc, RubyRequestType _type, RubyAccessMode _access_mode,
@@ -71,7 +85,9 @@ class RubyRequest : public Message
           m_Prefetch(_pb),
           data(_data),
           m_pkt(_pkt),
-          m_contextId(_core_id)
+          m_contextId(_core_id),
+          m_htmFromTransaction(false),
+          m_htmTransactionUid(0)
     {
         m_LineAddress = makeLineAddress(m_PhysicalAddress);
     }
@@ -96,7 +112,9 @@ class RubyRequest : public Message
           m_writeMask(_wm_size,_wm_mask),
          m_WTData(_Data),
           m_wfid(_proc_id),
-          m_instSeqNum(_instSeqNum)
+          m_instSeqNum(_instSeqNum),
+          m_htmFromTransaction(false),
+          m_htmTransactionUid(0)
     {
         m_LineAddress = makeLineAddress(m_PhysicalAddress);
     }
@@ -122,7 +140,9 @@ class RubyRequest : public Message
|
||||
m_writeMask(_wm_size,_wm_mask,_atomicOps),
|
||||
m_WTData(_Data),
|
||||
m_wfid(_proc_id),
|
||||
m_instSeqNum(_instSeqNum)
|
||||
m_instSeqNum(_instSeqNum),
|
||||
m_htmFromTransaction(false),
|
||||
m_htmTransactionUid(0)
|
||||
{
|
||||
m_LineAddress = makeLineAddress(m_PhysicalAddress);
|
||||
}
|
||||
|
||||
@@ -1,4 +1,16 @@
|
||||
/*
|
||||
* Copyright (c) 2020 ARM Limited
|
||||
* All rights reserved
|
||||
*
|
||||
* The license below extends only to copyright in the software and shall
|
||||
* not be construed as granting a license to any other intellectual
|
||||
* property including but not limited to intellectual property relating
|
||||
* to a hardware implementation of the functionality of the software
|
||||
* licensed hereunder. You may use the software subject to the license
|
||||
* terms below provided that you ensure that this notice is replicated
|
||||
* unmodified and in its entirety in all distributions of the software,
|
||||
* modified or unmodified, in source code or in binary form.
|
||||
*
|
||||
* Copyright (c) 1999-2008 Mark D. Hill and David A. Wood
|
||||
* Copyright (c) 2013 Advanced Micro Devices, Inc.
|
||||
* All rights reserved.
|
||||
@@ -85,6 +97,75 @@ inline int max_tokens()
|
||||
return 1024;
|
||||
}
|
||||
|
||||
inline bool
|
||||
isWriteRequest(RubyRequestType type)
|
||||
{
|
||||
if ((type == RubyRequestType_ST) ||
|
||||
(type == RubyRequestType_ATOMIC) ||
|
||||
(type == RubyRequestType_RMW_Read) ||
|
||||
(type == RubyRequestType_RMW_Write) ||
|
||||
(type == RubyRequestType_Store_Conditional) ||
|
||||
(type == RubyRequestType_Locked_RMW_Read) ||
|
||||
(type == RubyRequestType_Locked_RMW_Write) ||
|
||||
(type == RubyRequestType_FLUSH)) {
|
||||
return true;
|
||||
} else {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
inline bool
|
||||
isDataReadRequest(RubyRequestType type)
|
||||
{
|
||||
if ((type == RubyRequestType_LD) ||
|
||||
(type == RubyRequestType_Load_Linked)) {
|
||||
return true;
|
||||
} else {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
inline bool
|
||||
isReadRequest(RubyRequestType type)
|
||||
{
|
||||
if (isDataReadRequest(type) ||
|
||||
(type == RubyRequestType_IFETCH)) {
|
||||
return true;
|
||||
} else {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
inline bool
|
||||
isHtmCmdRequest(RubyRequestType type)
|
||||
{
|
||||
if ((type == RubyRequestType_HTM_Start) ||
|
||||
(type == RubyRequestType_HTM_Commit) ||
|
||||
(type == RubyRequestType_HTM_Cancel) ||
|
||||
(type == RubyRequestType_HTM_Abort)) {
|
||||
return true;
|
||||
} else {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
inline RubyRequestType
|
||||
htmCmdToRubyRequestType(const Packet *pkt)
|
||||
{
|
||||
if (pkt->req->isHTMStart()) {
|
||||
return RubyRequestType_HTM_Start;
|
||||
} else if (pkt->req->isHTMCommit()) {
|
||||
return RubyRequestType_HTM_Commit;
|
||||
} else if (pkt->req->isHTMCancel()) {
|
||||
return RubyRequestType_HTM_Cancel;
|
||||
} else if (pkt->req->isHTMAbort()) {
|
||||
return RubyRequestType_HTM_Abort;
|
||||
}
|
||||
else {
|
||||
panic("invalid ruby packet type\n");
|
||||
}
|
||||
}

/**
* This function accepts an address, a data block and a packet. If the address
* range for the data block contains the address which the packet needs to

@@ -1,4 +1,16 @@
/*
* Copyright (c) 2020 ARM Limited
* All rights reserved
*
* The license below extends only to copyright in the software and shall
* not be construed as granting a license to any other intellectual
* property including but not limited to intellectual property relating
* to a hardware implementation of the functionality of the software
* licensed hereunder. You may use the software subject to the license
* terms below provided that you ensure that this notice is replicated
* unmodified and in its entirety in all distributions of the software,
* modified or unmodified, in source code or in binary form.
*
* Copyright (c) 1999-2012 Mark D. Hill and David A. Wood
* Copyright (c) 2013 Advanced Micro Devices, Inc.
* All rights reserved.
@@ -31,6 +43,7 @@

#include "base/intmath.hh"
#include "base/logging.hh"
#include "debug/HtmMem.hh"
#include "debug/RubyCache.hh"
#include "debug/RubyCacheTrace.hh"
#include "debug/RubyResourceStalls.hh"
@@ -479,6 +492,23 @@ CacheMemory::clearLocked(Addr address)
entry->clearLocked();
}

void
CacheMemory::clearLockedAll(int context)
{
// iterate through every set and way to get a cache line
for (auto i = m_cache.begin(); i != m_cache.end(); ++i) {
std::vector<AbstractCacheEntry*> set = *i;
for (auto j = set.begin(); j != set.end(); ++j) {
AbstractCacheEntry *line = *j;
if (line && line->isLocked(context)) {
DPRINTF(RubyCache, "Clear Lock for addr: %#x\n",
line->m_Address);
line->clearLocked();
}
}
}
}

bool
CacheMemory::isLocked(Addr address, int context)
{
@@ -578,6 +608,34 @@ CacheMemory::regStats()
.desc("number of stalls caused by data array")
.flags(Stats::nozero)
;

htmTransCommitReadSet
.init(8)
.name(name() + ".htm_transaction_committed_read_set")
.desc("read set size of a committed transaction")
.flags(Stats::pdf | Stats::dist | Stats::nozero | Stats::nonan)
;

htmTransCommitWriteSet
.init(8)
.name(name() + ".htm_transaction_committed_write_set")
.desc("write set size of a committed transaction")
.flags(Stats::pdf | Stats::dist | Stats::nozero | Stats::nonan)
;

htmTransAbortReadSet
.init(8)
.name(name() + ".htm_transaction_aborted_read_set")
.desc("read set size of an aborted transaction")
.flags(Stats::pdf | Stats::dist | Stats::nozero | Stats::nonan)
;

htmTransAbortWriteSet
.init(8)
.name(name() + ".htm_transaction_aborted_write_set")
.desc("write set size of an aborted transaction")
.flags(Stats::pdf | Stats::dist | Stats::nozero | Stats::nonan)
;
}

// assumption: SLICC generated files will only call this function
@@ -655,3 +713,69 @@ CacheMemory::isBlockNotBusy(int64_t cache_set, int64_t loc)
{
return (m_cache[cache_set][loc]->m_Permission != AccessPermission_Busy);
}

/* hardware transactional memory */

void
CacheMemory::htmAbortTransaction()
{
uint64_t htmReadSetSize = 0;
uint64_t htmWriteSetSize = 0;

// iterate through every set and way to get a cache line
for (auto i = m_cache.begin(); i != m_cache.end(); ++i)
{
std::vector<AbstractCacheEntry*> set = *i;

for (auto j = set.begin(); j != set.end(); ++j)
{
AbstractCacheEntry *line = *j;

if (line != nullptr) {
htmReadSetSize += (line->getInHtmReadSet() ? 1 : 0);
htmWriteSetSize += (line->getInHtmWriteSet() ? 1 : 0);
if (line->getInHtmWriteSet()) {
line->invalidateEntry();
}
line->setInHtmWriteSet(false);
line->setInHtmReadSet(false);
line->clearLocked();
}
}
}

htmTransAbortReadSet.sample(htmReadSetSize);
htmTransAbortWriteSet.sample(htmWriteSetSize);
DPRINTF(HtmMem, "htmAbortTransaction: read set=%u write set=%u\n",
htmReadSetSize, htmWriteSetSize);
}

void
CacheMemory::htmCommitTransaction()
{
uint64_t htmReadSetSize = 0;
uint64_t htmWriteSetSize = 0;

// iterate through every set and way to get a cache line
for (auto i = m_cache.begin(); i != m_cache.end(); ++i)
{
std::vector<AbstractCacheEntry*> set = *i;

for (auto j = set.begin(); j != set.end(); ++j)
{
AbstractCacheEntry *line = *j;
if (line != nullptr) {
htmReadSetSize += (line->getInHtmReadSet() ? 1 : 0);
htmWriteSetSize += (line->getInHtmWriteSet() ? 1 : 0);
line->setInHtmWriteSet(false);
line->setInHtmReadSet(false);
line->clearLocked();
}
}
}

htmTransCommitReadSet.sample(htmReadSetSize);
htmTransCommitWriteSet.sample(htmWriteSetSize);
DPRINTF(HtmMem, "htmCommitTransaction: read set=%u write set=%u\n",
htmReadSetSize, htmWriteSetSize);
}

@@ -1,4 +1,16 @@
/*
* Copyright (c) 2020 ARM Limited
* All rights reserved
*
* The license below extends only to copyright in the software and shall
* not be construed as granting a license to any other intellectual
* property including but not limited to intellectual property relating
* to a hardware implementation of the functionality of the software
* licensed hereunder. You may use the software subject to the license
* terms below provided that you ensure that this notice is replicated
* unmodified and in its entirety in all distributions of the software,
* modified or unmodified, in source code or in binary form.
*
* Copyright (c) 1999-2012 Mark D. Hill and David A. Wood
* Copyright (c) 2013 Advanced Micro Devices, Inc.
* All rights reserved.
@@ -121,6 +133,7 @@ class CacheMemory : public SimObject
// provided by the AbstractCacheEntry class.
void setLocked (Addr addr, int context);
void clearLocked (Addr addr);
void clearLockedAll (int context);
bool isLocked (Addr addr, int context);

// Print cache contents
@@ -131,6 +144,10 @@ class CacheMemory : public SimObject
bool checkResourceAvailable(CacheResourceType res, Addr addr);
void recordRequestType(CacheRequestType requestType, Addr addr);

// hardware transactional memory
void htmAbortTransaction();
void htmCommitTransaction();

public:
Stats::Scalar m_demand_hits;
Stats::Scalar m_demand_misses;
@@ -150,6 +167,12 @@ class CacheMemory : public SimObject
Stats::Scalar numTagArrayStalls;
Stats::Scalar numDataArrayStalls;

// hardware transactional memory
Stats::Histogram htmTransCommitReadSet;
Stats::Histogram htmTransCommitWriteSet;
Stats::Histogram htmTransAbortReadSet;
Stats::Histogram htmTransAbortWriteSet;

int getCacheSize() const { return m_cache_size; }
int getCacheAssoc() const { return m_cache_assoc; }
int getNumBlocks() const { return m_cache_num_sets * m_cache_assoc; }

src/mem/ruby/system/HTMSequencer.cc (new file, 337 lines)
@@ -0,0 +1,337 @@
/*
* Copyright (c) 2020 ARM Limited
* All rights reserved
*
* The license below extends only to copyright in the software and shall
* not be construed as granting a license to any other intellectual
* property including but not limited to intellectual property relating
* to a hardware implementation of the functionality of the software
* licensed hereunder. You may use the software subject to the license
* terms below provided that you ensure that this notice is replicated
* unmodified and in its entirety in all distributions of the software,
* modified or unmodified, in source code or in binary form.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are
* met: redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer;
* redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution;
* neither the name of the copyright holders nor the names of its
* contributors may be used to endorse or promote products derived from
* this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

#include "mem/ruby/system/HTMSequencer.hh"

#include "debug/HtmMem.hh"
#include "debug/RubyPort.hh"
#include "mem/ruby/slicc_interface/RubySlicc_Util.hh"
#include "sim/system.hh"

using namespace std;

HtmCacheFailure
HTMSequencer::htmRetCodeConversion(
const HtmFailedInCacheReason ruby_ret_code)
{
switch (ruby_ret_code) {
case HtmFailedInCacheReason_NO_FAIL:
return HtmCacheFailure::NO_FAIL;
case HtmFailedInCacheReason_FAIL_SELF:
return HtmCacheFailure::FAIL_SELF;
case HtmFailedInCacheReason_FAIL_REMOTE:
return HtmCacheFailure::FAIL_REMOTE;
case HtmFailedInCacheReason_FAIL_OTHER:
return HtmCacheFailure::FAIL_OTHER;
default:
panic("Invalid htm return code\n");
}
}

HTMSequencer *
RubyHTMSequencerParams::create()
{
return new HTMSequencer(this);
}

HTMSequencer::HTMSequencer(const RubyHTMSequencerParams *p)
: Sequencer(p)
{
m_htmstart_tick = 0;
m_htmstart_instruction = 0;
}

HTMSequencer::~HTMSequencer()
{
}

void
HTMSequencer::htmCallback(Addr address,
const HtmCallbackMode mode,
const HtmFailedInCacheReason htm_return_code)
{
// mode=0: HTM command
// mode=1: transaction failed - inform via LD
// mode=2: transaction failed - inform via ST

if (mode == HtmCallbackMode_HTM_CMD) {
SequencerRequest* request = nullptr;

assert(m_htmCmdRequestTable.size() > 0);

request = m_htmCmdRequestTable.front();
m_htmCmdRequestTable.pop_front();

assert(isHtmCmdRequest(request->m_type));

PacketPtr pkt = request->pkt;
delete request;

// valid responses have zero as the payload
uint8_t* dataptr = pkt->getPtr<uint8_t>();
memset(dataptr, 0, pkt->getSize());
*dataptr = (uint8_t) htm_return_code;

// record stats
if (htm_return_code == HtmFailedInCacheReason_NO_FAIL) {
if (pkt->req->isHTMStart()) {
m_htmstart_tick = pkt->req->time();
m_htmstart_instruction = pkt->req->getInstCount();
DPRINTF(HtmMem, "htmStart - htmUid=%u\n",
pkt->getHtmTransactionUid());
} else if (pkt->req->isHTMCommit()) {
Tick transaction_ticks = pkt->req->time() - m_htmstart_tick;
Cycles transaction_cycles = ticksToCycles(transaction_ticks);
m_htm_transaction_cycles.sample(transaction_cycles);
m_htmstart_tick = 0;
Counter transaction_instructions =
pkt->req->getInstCount() - m_htmstart_instruction;
m_htm_transaction_instructions.sample(
transaction_instructions);
m_htmstart_instruction = 0;
DPRINTF(HtmMem, "htmCommit - htmUid=%u\n",
pkt->getHtmTransactionUid());
} else if (pkt->req->isHTMAbort()) {
HtmFailureFaultCause cause = pkt->req->getHtmAbortCause();
assert(cause != HtmFailureFaultCause::INVALID);
auto cause_idx = static_cast<int>(cause);
m_htm_transaction_abort_cause[cause_idx]++;
DPRINTF(HtmMem, "htmAbort - reason=%s - htmUid=%u\n",
htmFailureToStr(cause),
pkt->getHtmTransactionUid());
}
} else {
DPRINTF(HtmMem, "HTM_CMD: fail - htmUid=%u\n",
pkt->getHtmTransactionUid());
}

rubyHtmCallback(pkt, htm_return_code);
testDrainComplete();
} else if (mode == HtmCallbackMode_LD_FAIL ||
mode == HtmCallbackMode_ST_FAIL) {
// transaction failed
assert(address == makeLineAddress(address));
assert(m_RequestTable.find(address) != m_RequestTable.end());

auto &seq_req_list = m_RequestTable[address];
while (!seq_req_list.empty()) {
SequencerRequest &request = seq_req_list.front();

PacketPtr pkt = request.pkt;
markRemoved();

// TODO - atomics

// store conditionals should indicate failure
if (request.m_type == RubyRequestType_Store_Conditional) {
pkt->req->setExtraData(0);
}

DPRINTF(HtmMem, "%s_FAIL: size=%d - "
"addr=0x%lx - htmUid=%d\n",
(mode == HtmCallbackMode_LD_FAIL) ? "LD" : "ST",
pkt->getSize(),
address, pkt->getHtmTransactionUid());

rubyHtmCallback(pkt, htm_return_code);
testDrainComplete();
pkt = nullptr;
seq_req_list.pop_front();
}
// free all outstanding requests corresponding to this address
if (seq_req_list.empty()) {
m_RequestTable.erase(address);
}
} else {
panic("unrecognised HTM callback mode\n");
}
}

void
HTMSequencer::regStats()
{
Sequencer::regStats();

// hardware transactional memory
m_htm_transaction_cycles
.init(10)
.name(name() + ".htm_transaction_cycles")
.desc("number of cycles spent in an outer transaction")
.flags(Stats::pdf | Stats::dist | Stats::nozero | Stats::nonan)
;
m_htm_transaction_instructions
.init(10)
.name(name() + ".htm_transaction_instructions")
.desc("number of instructions spent in an outer transaction")
.flags(Stats::pdf | Stats::dist | Stats::nozero | Stats::nonan)
;
auto num_causes = static_cast<int>(HtmFailureFaultCause::NUM_CAUSES);
m_htm_transaction_abort_cause
.init(num_causes)
.name(name() + ".htm_transaction_abort_cause")
.desc("cause of htm transaction abort")
.flags(Stats::total | Stats::pdf | Stats::dist | Stats::nozero)
;

for (unsigned cause_idx = 0; cause_idx < num_causes; ++cause_idx) {
m_htm_transaction_abort_cause.subname(
cause_idx,
htmFailureToStr(HtmFailureFaultCause(cause_idx)));
}
}

void
HTMSequencer::rubyHtmCallback(PacketPtr pkt,
const HtmFailedInCacheReason htm_return_code)
{
// The packet was destined for memory and has not yet been turned
// into a response
assert(system->isMemAddr(pkt->getAddr()) || system->isDeviceMemAddr(pkt));
assert(pkt->isRequest());

// First retrieve the request port from the sender State
RubyPort::SenderState *senderState =
safe_cast<RubyPort::SenderState *>(pkt->popSenderState());

MemSlavePort *port = safe_cast<MemSlavePort*>(senderState->port);
assert(port != nullptr);
delete senderState;

//port->htmCallback(pkt, htm_return_code);
DPRINTF(HtmMem, "HTM callback: start=%d, commit=%d, "
"cancel=%d, rc=%d\n",
pkt->req->isHTMStart(), pkt->req->isHTMCommit(),
pkt->req->isHTMCancel(), htm_return_code);

// turn packet around to go back to requester if response expected
if (pkt->needsResponse()) {
DPRINTF(RubyPort, "Sending packet back over port\n");
pkt->makeHtmTransactionalReqResponse(
htmRetCodeConversion(htm_return_code));
port->schedTimingResp(pkt, curTick());
} else {
delete pkt;
}

trySendRetries();
}

void
HTMSequencer::wakeup()
{
Sequencer::wakeup();

// Check for deadlock of any of the requests
Cycles current_time = curCycle();

// hardware transactional memory commands
std::deque<SequencerRequest*>::iterator htm =
m_htmCmdRequestTable.begin();
std::deque<SequencerRequest*>::iterator htm_end =
m_htmCmdRequestTable.end();

for (; htm != htm_end; ++htm) {
SequencerRequest* request = *htm;
if (current_time - request->issue_time < m_deadlock_threshold)
continue;

panic("Possible Deadlock detected. Aborting!\n"
"version: %d m_htmCmdRequestTable: %d "
"current time: %u issue_time: %d difference: %d\n",
m_version, m_htmCmdRequestTable.size(),
current_time * clockPeriod(),
request->issue_time * clockPeriod(),
(current_time * clockPeriod()) -
(request->issue_time * clockPeriod()));
}
}

bool
HTMSequencer::empty() const
{
return Sequencer::empty() && m_htmCmdRequestTable.empty();
}

template <class VALUE>
std::ostream &
operator<<(ostream &out, const std::deque<VALUE> &queue)
{
auto i = queue.begin();
auto end = queue.end();

out << "[";
for (; i != end; ++i)
out << " " << *i;
out << " ]";

return out;
}

void
HTMSequencer::print(ostream& out) const
{
Sequencer::print(out);

out << "+ [HTMSequencer: " << m_version
<< ", htm cmd request table: " << m_htmCmdRequestTable
<< "]";
}

// Insert the request in the request table. Return RequestStatus_Aliased
// if the entry was already present.
RequestStatus
HTMSequencer::insertRequest(PacketPtr pkt, RubyRequestType primary_type,
RubyRequestType secondary_type)
{
if (isHtmCmdRequest(primary_type)) {
// for the moment, allow just one HTM cmd into the cache controller.
// Later this can be adjusted for optimization, e.g.
// back-to-back HTM_Starts.
if ((m_htmCmdRequestTable.size() > 0) && !pkt->req->isHTMAbort())
return RequestStatus_BufferFull;

// insert request into HtmCmd queue
SequencerRequest* htmReq =
new SequencerRequest(pkt, primary_type, secondary_type,
curCycle());
assert(htmReq);
m_htmCmdRequestTable.push_back(htmReq);
return RequestStatus_Ready;
} else {
return Sequencer::insertRequest(pkt, primary_type, secondary_type);
}
}
src/mem/ruby/system/HTMSequencer.hh (new file, 113 lines)
@@ -0,0 +1,113 @@
/*
* Copyright (c) 2020 ARM Limited
* All rights reserved
*
* The license below extends only to copyright in the software and shall
* not be construed as granting a license to any other intellectual
* property including but not limited to intellectual property relating
* to a hardware implementation of the functionality of the software
* licensed hereunder. You may use the software subject to the license
* terms below provided that you ensure that this notice is replicated
* unmodified and in its entirety in all distributions of the software,
* modified or unmodified, in source code or in binary form.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are
* met: redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer;
* redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution;
* neither the name of the copyright holders nor the names of its
* contributors may be used to endorse or promote products derived from
* this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

#ifndef __MEM_RUBY_SYSTEM_HTMSEQUENCER_HH__
#define __MEM_RUBY_SYSTEM_HTMSEQUENCER_HH__

#include <cassert>
#include <iostream>

#include "mem/htm.hh"
#include "mem/ruby/protocol/HtmCallbackMode.hh"
#include "mem/ruby/protocol/HtmFailedInCacheReason.hh"
#include "mem/ruby/system/RubyPort.hh"
#include "mem/ruby/system/Sequencer.hh"
#include "params/RubyHTMSequencer.hh"

class HTMSequencer : public Sequencer
{
public:
HTMSequencer(const RubyHTMSequencerParams *p);
~HTMSequencer();

// callback to acknowledge HTM requests and
// notify cpu core when htm transaction fails in cache
void htmCallback(Addr,
const HtmCallbackMode,
const HtmFailedInCacheReason);

bool empty() const override;
void print(std::ostream& out) const override;
void regStats() override;
void wakeup() override;

private:
/**
* Htm return code conversion
*
* This helper is a hack meant to convert the autogenerated ruby
* enum (HtmFailedInCacheReason) to the manually defined one
* (HtmCacheFailure). This is needed since the cpu code would
* otherwise have to include the ruby generated headers in order
* to handle the htm return code.
*/
HtmCacheFailure htmRetCodeConversion(const HtmFailedInCacheReason rc);

void rubyHtmCallback(PacketPtr pkt, const HtmFailedInCacheReason fail_r);

RequestStatus insertRequest(PacketPtr pkt,
RubyRequestType primary_type,
RubyRequestType secondary_type) override;

// Private copy constructor and assignment operator
HTMSequencer(const HTMSequencer& obj);
HTMSequencer& operator=(const HTMSequencer& obj);

// table/queue for hardware transactional memory commands
// these do not have an address so a deque/queue is used instead.
std::deque<SequencerRequest*> m_htmCmdRequestTable;

Tick m_htmstart_tick;
Counter m_htmstart_instruction;

//! Histogram of cycle latencies of HTM transactions
Stats::Histogram m_htm_transaction_cycles;
//! Histogram of instruction lengths of HTM transactions
Stats::Histogram m_htm_transaction_instructions;
//! Causes for HTM transaction aborts
Stats::Vector m_htm_transaction_abort_cause;
};

inline std::ostream&
operator<<(std::ostream& out, const HTMSequencer& obj)
{
obj.print(out);
out << std::flush;
return out;
}

#endif // __MEM_RUBY_SYSTEM_HTMSEQUENCER_HH__
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2012-2013,2019 ARM Limited
* Copyright (c) 2012-2013,2020 ARM Limited
* All rights reserved.
*
* The license below extends only to copyright in the software and shall
@@ -169,6 +169,7 @@ bool RubyPort::MemMasterPort::recvTimingResp(PacketPtr pkt)
{
// got a response from a device
assert(pkt->isResponse());
assert(!pkt->htmTransactionFailedInCache());

// First we must retrieve the request port from the sender State
RubyPort::SenderState *senderState =
@@ -253,6 +254,7 @@ RubyPort::MemSlavePort::recvTimingReq(PacketPtr pkt)
// pio port.
if (pkt->cmd != MemCmd::MemSyncReq) {
if (!isPhysMemAddress(pkt)) {
assert(!pkt->req->isHTMCmd());
assert(ruby_port->memMasterPort.isConnected());
DPRINTF(RubyPort, "Request address %#x assumed to be a "
"pio address\n", pkt->getAddr());
@@ -638,7 +640,6 @@ RubyPort::PioMasterPort::recvRangeChange()
}
}


int
RubyPort::functionalWrite(Packet *func_pkt)
{

@@ -56,6 +56,7 @@ Source('CacheRecorder.cc')
Source('DMASequencer.cc')
if env['BUILD_GPU']:
Source('GPUCoalescer.cc')
Source('HTMSequencer.cc')
Source('RubyPort.cc')
Source('RubyPortProxy.cc')
Source('RubySystem.cc')

@@ -55,6 +55,7 @@
#include "mem/ruby/protocol/PrefetchBit.hh"
#include "mem/ruby/protocol/RubyAccessMode.hh"
#include "mem/ruby/slicc_interface/RubyRequest.hh"
#include "mem/ruby/slicc_interface/RubySlicc_Util.hh"
#include "mem/ruby/system/RubySystem.hh"
#include "sim/system.hh"

@@ -148,6 +149,12 @@ Sequencer::llscCheckMonitor(const Addr address)
}
}

void
Sequencer::llscClearLocalMonitor()
{
m_dataCache_ptr->clearLockedAll(m_version);
}

void
Sequencer::wakeup()
{
@@ -243,7 +250,8 @@ Sequencer::insertRequest(PacketPtr pkt, RubyRequestType primary_type,
// Check if there is any outstanding request for the same cache line.
auto &seq_req_list = m_RequestTable[line_addr];
// Create a default entry
seq_req_list.emplace_back(pkt, primary_type, secondary_type, curCycle());
seq_req_list.emplace_back(pkt, primary_type,
secondary_type, curCycle());
m_outstanding_count++;

if (seq_req_list.size() > 1) {
@@ -569,7 +577,10 @@ Sequencer::empty() const
RequestStatus
Sequencer::makeRequest(PacketPtr pkt)
{
if (m_outstanding_count >= m_max_outstanding_requests) {
// HTM abort signals must be allowed to reach the Sequencer
// the same cycle they are issued. They cannot be retried.
if ((m_outstanding_count >= m_max_outstanding_requests) &&
!pkt->req->isHTMAbort()) {
return RequestStatus_BufferFull;
}

@@ -590,7 +601,7 @@ Sequencer::makeRequest(PacketPtr pkt)
if (pkt->isWrite()) {
DPRINTF(RubySequencer, "Issuing SC\n");
primary_type = RubyRequestType_Store_Conditional;
#ifdef PROTOCOL_MESI_Three_Level
#if defined (PROTOCOL_MESI_Three_Level) || defined (PROTOCOL_MESI_Three_Level_HTM)
secondary_type = RubyRequestType_Store_Conditional;
#else
secondary_type = RubyRequestType_ST;
@@ -629,7 +640,10 @@ Sequencer::makeRequest(PacketPtr pkt)
//
primary_type = secondary_type = RubyRequestType_ST;
} else if (pkt->isRead()) {
if (pkt->req->isInstFetch()) {
// hardware transactional memory commands
if (pkt->req->isHTMCmd()) {
primary_type = secondary_type = htmCmdToRubyRequestType(pkt);
} else if (pkt->req->isInstFetch()) {
primary_type = secondary_type = RubyRequestType_IFETCH;
} else {
bool storeCheck = false;
@@ -706,6 +720,14 @@ Sequencer::issueRequest(PacketPtr pkt, RubyRequestType secondary_type)
printAddress(msg->getPhysicalAddress()),
RubyRequestType_to_string(secondary_type));

// hardware transactional memory
// If the request originates in a transaction,
// then mark the Ruby message as such.
if (pkt->isHtmTransactional()) {
msg->m_htmFromTransaction = true;
msg->m_htmTransactionUid = pkt->getHtmTransactionUid();
}
|
||||
|
||||
Tick latency = cyclesToTicks(
|
||||
m_controller->mandatoryQueueLatency(secondary_type));
|
||||
assert(latency > 0);
|
||||
|
||||
@@ -92,7 +92,7 @@ class Sequencer : public RubyPort
                       DataBlock& data);
 
     // Public Methods
-    void wakeup(); // Used only for deadlock detection
+    virtual void wakeup(); // Used only for deadlock detection
     void resetStats() override;
     void collateStats();
     void regStats() override;
@@ -114,7 +114,7 @@ class Sequencer : public RubyPort
                       const Cycles firstResponseTime = Cycles(0));
 
     RequestStatus makeRequest(PacketPtr pkt) override;
-    bool empty() const;
+    virtual bool empty() const;
     int outstandingCount() const override { return m_outstanding_count; }
 
     bool isDeadlockEventScheduled() const override
@@ -123,7 +123,7 @@ class Sequencer : public RubyPort
     void descheduleDeadlockEvent() override
     { deschedule(deadlockCheckEvent); }
 
-    void print(std::ostream& out) const;
+    virtual void print(std::ostream& out) const;
 
     void markRemoved();
     void evictionCallback(Addr address);
@@ -194,16 +194,22 @@ class Sequencer : public RubyPort
                       Cycles forwardRequestTime,
                       Cycles firstResponseTime);
 
-    RequestStatus insertRequest(PacketPtr pkt, RubyRequestType primary_type,
-                                RubyRequestType secondary_type);
-
     // Private copy constructor and assignment operator
     Sequencer(const Sequencer& obj);
     Sequencer& operator=(const Sequencer& obj);
 
+  protected:
+    // RequestTable contains both read and write requests, handles aliasing
+    std::unordered_map<Addr, std::list<SequencerRequest>> m_RequestTable;
+
+    Cycles m_deadlock_threshold;
+
+    virtual RequestStatus insertRequest(PacketPtr pkt,
+                                        RubyRequestType primary_type,
+                                        RubyRequestType secondary_type);
+
   private:
     int m_max_outstanding_requests;
-    Cycles m_deadlock_threshold;
 
     CacheMemory* m_dataCache_ptr;
     CacheMemory* m_instCache_ptr;
@@ -215,9 +221,6 @@ class Sequencer : public RubyPort
     Cycles m_data_cache_hit_latency;
     Cycles m_inst_cache_hit_latency;
 
-    // RequestTable contains both read and write requests, handles aliasing
-    std::unordered_map<Addr, std::list<SequencerRequest>> m_RequestTable;
-
     // Global outstanding request count, across all request tables
     int m_outstanding_count;
     bool m_deadlock_check_scheduled;
@@ -294,6 +297,13 @@ class Sequencer : public RubyPort
      * @return a boolean indicating if the line address was found.
      */
     bool llscCheckMonitor(const Addr);
+
+    /**
+     * Removes all addresses from the local monitor.
+     * This is independent of this Sequencer object's version id.
+     */
+    void llscClearLocalMonitor();
+
 };
 
 inline std::ostream&
@@ -1,4 +1,5 @@
 # Copyright (c) 2009 Advanced Micro Devices, Inc.
+# Copyright (c) 2020 ARM Limited
 # All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
@@ -70,6 +71,11 @@ class RubySequencer(RubyPort):
     # 99 is the dummy default value
     coreid = Param.Int(99, "CorePair core id")
 
+class RubyHTMSequencer(RubySequencer):
+    type = 'RubyHTMSequencer'
+    cxx_class = 'HTMSequencer'
+    cxx_header = "mem/ruby/system/HTMSequencer.hh"
+
 class DMASequencer(RubyPort):
     type = 'DMASequencer'
     cxx_header = "mem/ruby/system/DMASequencer.hh"
@@ -1,4 +1,4 @@
-# Copyright (c) 2019 ARM Limited
+# Copyright (c) 2019-2020 ARM Limited
 # All rights reserved.
 #
 # The license below extends only to copyright in the software and shall
@@ -54,6 +54,7 @@ python_class_map = {
     "CacheMemory": "RubyCache",
     "WireBuffer": "RubyWireBuffer",
     "Sequencer": "RubySequencer",
+    "HTMSequencer": "RubyHTMSequencer",
     "GPUCoalescer" : "RubyGPUCoalescer",
     "VIPERCoalescer" : "VIPERCoalescer",
     "DirectoryMemory": "RubyDirectoryMemory",