mem-ruby: Remove static methods from RubySystem (#1453)

There are several parts to this PR that work towards #1349.

(1) Make RubySystem::getBlockSizeBytes non-static by providing ways to
access the block size or passing the block size explicitly to classes.

The main changes are:
 - DataBlocks must be explicitly allocated. A default ctor still exists
   to avoid needing to heavily modify SLICC. The size can be set using a
   realloc function, operator=, or copy ctor. This is handled completely
   transparently, meaning no protocol or config changes are required.
 - WriteMask now requires block size to be set. This is also handled
   transparently by modifying the SLICC parser to identify WriteMask
   types and call setBlockSize().
 - AbstractCacheEntry and TBE classes now require block size to be set.
   This is handled transparently by modifying the SLICC parser to
   identify these classes and call initBlockSize() which calls
   setBlockSize() for any DataBlock or WriteMask.
 - All AbstractControllers now have a pointer to RubySystem. This is
   assigned in SLICC generated code and requires no changes to protocol
   or configs.
 - The Ruby Message class now requires block size in all constructors.
   This is added to the argument list automatically by the SLICC parser.
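
The new DataBlock allocation rules can be sketched with a simplified stand-in (a toy class illustrating the intended semantics, not the actual gem5 DataBlock):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Toy stand-in for DataBlock: default-constructed blocks stay
// unallocated until a size arrives via realloc(), the copy ctor,
// or (in the real class) operator= / setBlockSize().
class ToyDataBlock
{
  public:
    ToyDataBlock() = default;                 // unallocated, size 0
    explicit ToyDataBlock(int blk_size) { realloc(blk_size); }
    ToyDataBlock(const ToyDataBlock &cp)
    {
        assert(cp.isAlloc());
        realloc(cp.m_block_size);             // adopt the source's size
        std::memcpy(m_data, cp.m_data, m_block_size);
    }
    ~ToyDataBlock() { delete [] m_data; }

    void realloc(int blk_size)
    {
        assert(blk_size > 0);
        delete [] m_data;                     // safe on nullptr
        m_block_size = blk_size;
        m_data = new uint8_t[m_block_size](); // zero-initialized
        m_alloc = true;
    }

    bool isAlloc() const { return m_alloc; }
    int getBlockSize() const { return m_block_size; }

  private:
    uint8_t *m_data = nullptr;
    bool m_alloc = false;
    int m_block_size = 0;
};
```

Because the default constructor is kept, SLICC-generated state machines can declare DataBlocks exactly as before; the parser-inserted setBlockSize() call performs the actual allocation.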
   
(2) Relax dependence on the common functions in
src/mem/ruby/common/Address.hh
so that RubySystem::getBlockSizeBits is no longer static. Many classes
already have a way to get the block size from the previous commit, so
they simply take log2 of the byte size to get the number of offset
bits. To handle SLICC and reduce the number of changes, define
makeLineAddress, getOffset, etc. in RubyPort and AbstractController.
The only protocol changes required are to replace any
"RubySystem::foo()" calls with "m_ruby_system->foo()".
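
The de-static-ed address helpers reduce to plain bit arithmetic over an explicit line size. A minimal re-implementation of the idea (assuming a power-of-two line size; not the exact gem5 code):

```cpp
#include <cassert>
#include <cstdint>

using Addr = uint64_t;

// Offset of addr within its line, given log2(line size in bytes).
Addr getOffset(Addr addr, int cacheLineBits)
{
    assert(cacheLineBits > 0 && cacheLineBits < 64);
    return addr & ((Addr(1) << cacheLineBits) - 1);
}

// Align addr down to the start of its line.
Addr makeLineAddress(Addr addr, int cacheLineBits)
{
    assert(cacheLineBits > 0 && cacheLineBits < 64);
    return addr & ~((Addr(1) << cacheLineBits) - 1);
}

// Step 'stride' whole lines from addr's line address.
Addr makeNextStrideAddress(Addr addr, int stride, int cacheLineBytes)
{
    // floor(log2(cacheLineBytes)), assuming a power-of-two line size
    int bits = 0;
    while ((1 << (bits + 1)) <= cacheLineBytes) {
        ++bits;
    }
    return makeLineAddress(addr, bits) + Addr(cacheLineBytes) * stride;
}
```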

For classes which do not have a way to access the block size but
still used makeLineAddress, getOffset, etc., the block size must be
passed to that class. This requires some changes to the SimObject
interface for two commonly used classes, DirectoryMemory and
RubyPrefetcher, resulting in user-facing API changes.

User-facing API changes:
 - DirectoryMemory and RubyPrefetcher now require the cache line size as
   a non-optional argument.
 - RubySequencer SimObjects now require RubySystem as a non-optional
   argument.
 - TesterThread in the GPU ruby tester now requires the cache line size
   as a non-optional argument.

(3) Remove the static member variables in RubySystem which control
randomization, cooldown, and warmup. These are mostly used by the Ruby
Network. The network classes are modified to take these former static
variables as parameters which are passed to the corresponding method
(e.g., enqueue, delayHead, etc.) rather than needing a RubySystem object
at all.
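
In sketch form, the change in (3) turns a static lookup into an ordinary argument (ToyMessageBuffer and its members are invented for illustration):

```cpp
#include <cassert>
#include <deque>

// Before, network code read e.g. a randomization flag through a
// RubySystem static. After, the caller threads the former statics in
// as plain parameters, so the network needs no RubySystem at all.
struct ToyMessageBuffer
{
    std::deque<int> messages;
    int last_latency = 0;

    void enqueue(int msg, int latency, bool randomization)
    {
        // With randomization enabled the real code perturbs the
        // enqueue latency; this sketch just adds a fixed extra cycle.
        last_latency = latency + (randomization ? 1 : 0);
        messages.push_back(msg);
    }
};
```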

Change-Id: Ia63c2ad5cf0bf9d1cbdffba5d3a679bb4d3b1220

(4) Remove the SLICC-generated static methods around controller counts.
SLICC generates a static getNumControllers() on each cache controller,
which returns the number of controllers created by the configs at run
time, along with two functions which access it: MachineType_base_count
and MachineType_base_number. These must be removed to support multiple
RubySystem objects; otherwise NetDest, version values, and other
objects are incorrect.
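
A toy illustration of why the generated statics block multiple RubySystem instances: a static count is per-process, while the fix makes it per-system (the names here are invented for illustration):

```cpp
#include <cassert>

// A static count is shared by every instance in the process, so a
// second RubySystem would see controllers from the first.
struct StaticCounted
{
    StaticCounted() { ++count; }
    static int count;
};
int StaticCounted::count = 0;

// Per-system bookkeeping instead: each system owns its own count,
// which is what moving MachineType_base_count into RubySystem achieves.
struct ToySystem
{
    void registerController() { ++num_controllers; }
    int num_controllers = 0;
};
```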

To remove the static requirement, MachineType_base_count and
MachineType_base_number are moved to RubySystem. Any class which needs
to call these methods must now have a pointer to a RubySystem. To enable
that, several changes are made:
 - RubyRequest and Message now require a RubySystem pointer in the
   constructor. The pointer is passed to fields in the Message class
   which require a RubySystem pointer (e.g., NetDest). SLICC is modified
   to do this automatically.
 - SLICC structures may now optionally take an "implicit constructor"
   which can be used to call a non-default constructor for locally
   defined variables (e.g., temporary variables within SLICC actions). A
   statement such as "NetDest bcast_dest;" in SLICC will implicitly
   append a call to the NetDest constructor taking RubySystem, for
   example.
 - RubySystem gets passed to Ruby network objects (Network, Topology).
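
In spirit, the implicit-constructor rewrite turns a plain SLICC declaration into a call of a RubySystem-taking constructor (hypothetical simplified classes, not the real NetDest):

```cpp
#include <cassert>

struct ToyRubySystem
{
    // stand-in for the per-system MachineType_base_level table
    int num_machine_levels = 4;
};

// NetDest-like set that sizes itself from the RubySystem it belongs
// to instead of from process-wide statics.
class ToyNetDest
{
  public:
    explicit ToyNetDest(ToyRubySystem *rs) : m_ruby_system(rs)
    {
        assert(m_ruby_system != nullptr);
        m_size = m_ruby_system->num_machine_levels;  // resize()
    }
    int size() const { return m_size; }

  private:
    ToyRubySystem *m_ruby_system = nullptr;
    int m_size = 0;
};

// SLICC's "implicit constructor" turns an action-local declaration
//     NetDest bcast_dest;
// into, effectively:
//     NetDest bcast_dest(m_ruby_system);
```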
Author: Matthew Poremba
Date: 2024-10-08 08:14:50 -07:00
Committed by: GitHub
Parent: 4a3e2633d2
Commit: 4f7b3ed827
123 changed files with 1066 additions and 399 deletions

View File

@@ -371,6 +371,7 @@ for dma_idx in range(n_DMAs):
num_lanes=1,
clk_domain=thread_clock,
deadlock_threshold=tester_deadlock_threshold,
cache_line_size=system.cache_line_size,
)
)
g_thread_idx += 1
@@ -393,6 +394,7 @@ for cu_idx in range(n_CUs):
num_lanes=args.wf_size,
clk_domain=thread_clock,
deadlock_threshold=tester_deadlock_threshold,
cache_line_size=system.cache_line_size,
)
)
g_thread_idx += 1

View File

@@ -84,6 +84,7 @@ class MyCacheSystem(RubySystem):
# I/D cache is combined and grab from ctrl
dcache=self.controllers[i].cacheMemory,
clk_domain=self.controllers[i].clk_domain,
ruby_system=self,
)
for i in range(len(cpus))
]
@@ -191,7 +192,9 @@ class DirController(Directory_Controller):
self.version = self.versionCount()
self.addr_ranges = ranges
self.ruby_system = ruby_system
self.directory = RubyDirectoryMemory()
self.directory = RubyDirectoryMemory(
block_size=ruby_system.block_size_bytes
)
# Connect this directory to the memory side.
self.memory = mem_ctrls[0].port
self.connectQueues(ruby_system)

View File

@@ -84,6 +84,7 @@ class MyCacheSystem(RubySystem):
# I/D cache is combined and grab from ctrl
dcache=self.controllers[i].cacheMemory,
clk_domain=self.controllers[i].clk_domain,
ruby_system=self,
)
for i in range(len(cpus))
]
@@ -180,7 +181,9 @@ class DirController(Directory_Controller):
self.version = self.versionCount()
self.addr_ranges = ranges
self.ruby_system = ruby_system
self.directory = RubyDirectoryMemory()
self.directory = RubyDirectoryMemory(
block_size=ruby_system.block_size_bytes
)
# Connect this directory to the memory side.
self.memory = mem_ctrls[0].port
self.connectQueues(ruby_system)

View File

@@ -79,6 +79,7 @@ class TestCacheSystem(RubySystem):
# I/D cache is combined and grab from ctrl
dcache=self.controllers[i].cacheMemory,
clk_domain=self.clk_domain,
ruby_system=self,
)
for i in range(num_testers)
]

View File

@@ -84,14 +84,14 @@ class CPCntrl(AMD_Base_Controller, CntrlBase):
self.L2cache = L2Cache()
self.L2cache.create(options.l2_size, options.l2_assoc, options)
self.sequencer = RubySequencer()
self.sequencer = RubySequencer(ruby_system=ruby_system)
self.sequencer.version = self.seqCount()
self.sequencer.dcache = self.L1D0cache
self.sequencer.ruby_system = ruby_system
self.sequencer.coreid = 0
self.sequencer.is_cpu_sequencer = True
self.sequencer1 = RubySequencer()
self.sequencer1 = RubySequencer(ruby_system=ruby_system)
self.sequencer1.version = self.seqCount()
self.sequencer1.dcache = self.L1D1cache
self.sequencer1.ruby_system = ruby_system

View File

@@ -114,14 +114,14 @@ class CPCntrl(CorePair_Controller, CntrlBase):
self.L2cache = L2Cache()
self.L2cache.create(options.l2_size, options.l2_assoc, options)
self.sequencer = RubySequencer()
self.sequencer = RubySequencer(ruby_system=ruby_system)
self.sequencer.version = self.seqCount()
self.sequencer.dcache = self.L1D0cache
self.sequencer.ruby_system = ruby_system
self.sequencer.coreid = 0
self.sequencer.is_cpu_sequencer = True
self.sequencer1 = RubySequencer()
self.sequencer1 = RubySequencer(ruby_system=ruby_system)
self.sequencer1.version = self.seqCount()
self.sequencer1.dcache = self.L1D1cache
self.sequencer1.ruby_system = ruby_system
@@ -169,7 +169,7 @@ class TCPCntrl(TCP_Controller, CntrlBase):
# TCP_Controller inherits this from RubyController
self.mandatory_queue_latency = options.mandatory_queue_latency
self.coalescer = VIPERCoalescer()
self.coalescer = VIPERCoalescer(ruby_system=ruby_system)
self.coalescer.version = self.seqCount()
self.coalescer.icache = self.L1cache
self.coalescer.dcache = self.L1cache
@@ -182,7 +182,7 @@ class TCPCntrl(TCP_Controller, CntrlBase):
options.max_coalesces_per_cycle
)
self.sequencer = RubySequencer()
self.sequencer = RubySequencer(ruby_system=ruby_system)
self.sequencer.version = self.seqCount()
self.sequencer.dcache = self.L1cache
self.sequencer.ruby_system = ruby_system
@@ -211,7 +211,7 @@ class TCPCntrl(TCP_Controller, CntrlBase):
self.L1cache.create(options)
self.issue_latency = 1
self.coalescer = VIPERCoalescer()
self.coalescer = VIPERCoalescer(ruby_system=ruby_system)
self.coalescer.version = self.seqCount()
self.coalescer.icache = self.L1cache
self.coalescer.dcache = self.L1cache
@@ -219,7 +219,7 @@ class TCPCntrl(TCP_Controller, CntrlBase):
self.coalescer.support_inst_reqs = False
self.coalescer.is_cpu_sequencer = False
self.sequencer = RubySequencer()
self.sequencer = RubySequencer(ruby_system=ruby_system)
self.sequencer.version = self.seqCount()
self.sequencer.dcache = self.L1cache
self.sequencer.ruby_system = ruby_system
@@ -387,7 +387,9 @@ class DirCntrl(Directory_Controller, CntrlBase):
self.response_latency = 30
self.addr_ranges = dir_ranges
self.directory = RubyDirectoryMemory()
self.directory = RubyDirectoryMemory(
block_size=ruby_system.block_size_bytes
)
self.L3CacheMemory = L3Cache()
self.L3CacheMemory.create(options, ruby_system, system)
@@ -686,7 +688,7 @@ def construct_gpudirs(options, system, ruby_system, network):
dir_cntrl.addr_ranges = dram_intf.range
# Append
exec("system.ruby.gpu_dir_cntrl%d = dir_cntrl" % i)
exec("ruby_system.gpu_dir_cntrl%d = dir_cntrl" % i)
dir_cntrl_nodes.append(dir_cntrl)
mem_ctrls.append(mem_ctrl)

View File

@@ -148,6 +148,7 @@ def create_system(
train_misses=5,
num_startup_pfs=4,
cross_page=True,
block_size=options.cacheline_size,
)
l0_cntrl = L0Cache_Controller(

View File

@@ -148,6 +148,7 @@ def create_system(
train_misses=5,
num_startup_pfs=4,
cross_page=True,
block_size=options.cacheline_size,
)
l0_cntrl = L0Cache_Controller(

View File

@@ -94,7 +94,7 @@ def create_system(
is_icache=False,
)
prefetcher = RubyPrefetcher()
prefetcher = RubyPrefetcher(block_size=options.cacheline_size)
clk_domain = cpus[i].clk_domain

View File

@@ -112,14 +112,14 @@ class CPCntrl(CorePair_Controller, CntrlBase):
self.L2cache = L2Cache()
self.L2cache.create(options)
self.sequencer = RubySequencer()
self.sequencer = RubySequencer(ruby_system=ruby_system)
self.sequencer.version = self.seqCount()
self.sequencer.dcache = self.L1D0cache
self.sequencer.ruby_system = ruby_system
self.sequencer.coreid = 0
self.sequencer.is_cpu_sequencer = True
self.sequencer1 = RubySequencer()
self.sequencer1 = RubySequencer(ruby_system=ruby_system)
self.sequencer1.version = self.seqCount()
self.sequencer1.dcache = self.L1D1cache
self.sequencer1.ruby_system = ruby_system
@@ -194,7 +194,9 @@ class DirCntrl(Directory_Controller, CntrlBase):
self.response_latency = 30
self.addr_ranges = dir_ranges
self.directory = RubyDirectoryMemory()
self.directory = RubyDirectoryMemory(
block_size=ruby_system.block_size_bytes
)
self.L3CacheMemory = L3Cache()
self.L3CacheMemory.create(options, ruby_system, system)

View File

@@ -308,7 +308,9 @@ def create_directories(options, bootmem, ruby_system, system):
for i in range(options.num_dirs):
dir_cntrl = Directory_Controller()
dir_cntrl.version = i
dir_cntrl.directory = RubyDirectoryMemory()
dir_cntrl.directory = RubyDirectoryMemory(
block_size=ruby_system.block_size_bytes
)
dir_cntrl.ruby_system = ruby_system
exec("ruby_system.dir_cntrl%d = dir_cntrl" % i)
@@ -316,7 +318,9 @@ def create_directories(options, bootmem, ruby_system, system):
if bootmem is not None:
rom_dir_cntrl = Directory_Controller()
rom_dir_cntrl.directory = RubyDirectoryMemory()
rom_dir_cntrl.directory = RubyDirectoryMemory(
block_size=ruby_system.block_size_bytes
)
rom_dir_cntrl.ruby_system = ruby_system
rom_dir_cntrl.version = i + 1
rom_dir_cntrl.memory = bootmem.port

View File

@@ -41,3 +41,4 @@ class TesterThread(ClockedObject):
thread_id = Param.Int("Unique TesterThread ID")
num_lanes = Param.Int("Number of lanes this thread has")
deadlock_threshold = Param.Cycles(1000000000, "Deadlock threshold")
cache_line_size = Param.UInt32("Size of cache line in cache")

View File

@@ -64,7 +64,9 @@ AddressManager::AddressManager(int n_atomic_locs, int n_normal_locs_per_atomic)
std::shuffle(
randAddressMap.begin(),
randAddressMap.end(),
std::default_random_engine(random_mt.random<unsigned>(0,UINT_MAX))
// TODO: This is a bug unrelated to this draft PR but the GPU tester is
// useful for testing this PR.
std::default_random_engine(random_mt.random<unsigned>(0,UINT_MAX-1))
);
// initialize atomic locations

View File

@@ -70,7 +70,7 @@ DmaThread::issueLoadOps()
Addr address = addrManager->getAddress(location);
DPRINTF(ProtocolTest, "%s Episode %d: Issuing Load - Addr %s\n",
this->getName(), curEpisode->getEpisodeId(),
ruby::printAddress(address));
printAddress(address));
int load_size = sizeof(Value);
@@ -127,7 +127,7 @@ DmaThread::issueStoreOps()
DPRINTF(ProtocolTest, "%s Episode %d: Issuing Store - Addr %s - "
"Value %d\n", this->getName(),
curEpisode->getEpisodeId(), ruby::printAddress(address),
curEpisode->getEpisodeId(), printAddress(address),
new_value);
auto req = std::make_shared<Request>(address, sizeof(Value),
@@ -211,7 +211,7 @@ DmaThread::hitCallback(PacketPtr pkt)
DPRINTF(ProtocolTest, "%s Episode %d: hitCallback - Command %s -"
" Addr %s\n", this->getName(), curEpisode->getEpisodeId(),
resp_cmd.toString(), ruby::printAddress(addr));
resp_cmd.toString(), printAddress(addr));
if (resp_cmd == MemCmd::SwapResp) {
// response to a pending atomic

View File

@@ -67,7 +67,7 @@ GpuWavefront::issueLoadOps()
Addr address = addrManager->getAddress(location);
DPRINTF(ProtocolTest, "%s Episode %d: Issuing Load - Addr %s\n",
this->getName(), curEpisode->getEpisodeId(),
ruby::printAddress(address));
printAddress(address));
int load_size = sizeof(Value);
@@ -124,7 +124,7 @@ GpuWavefront::issueStoreOps()
DPRINTF(ProtocolTest, "%s Episode %d: Issuing Store - Addr %s - "
"Value %d\n", this->getName(),
curEpisode->getEpisodeId(), ruby::printAddress(address),
curEpisode->getEpisodeId(), printAddress(address),
new_value);
auto req = std::make_shared<Request>(address, sizeof(Value),
@@ -178,7 +178,7 @@ GpuWavefront::issueAtomicOps()
DPRINTF(ProtocolTest, "%s Episode %d: Issuing Atomic_Inc - Addr %s\n",
this->getName(), curEpisode->getEpisodeId(),
ruby::printAddress(address));
printAddress(address));
// must be aligned with store size
assert(address % sizeof(Value) == 0);
@@ -268,7 +268,7 @@ GpuWavefront::hitCallback(PacketPtr pkt)
DPRINTF(ProtocolTest, "%s Episode %d: hitCallback - Command %s - "
"Addr %s\n", this->getName(),
curEpisode->getEpisodeId(), resp_cmd.toString(),
ruby::printAddress(addr));
printAddress(addr));
// whether the transaction is done after this hitCallback
bool isTransactionDone = true;

View File

@@ -43,6 +43,7 @@ TesterThread::TesterThread(const Params &p)
: ClockedObject(p),
threadEvent(this, "TesterThread tick"),
deadlockCheckEvent(this),
cacheLineSize(p.cache_line_size),
threadId(p.thread_id),
numLanes(p.num_lanes),
tester(nullptr), addrManager(nullptr), port(nullptr),
@@ -383,7 +384,7 @@ TesterThread::validateAtomicResp(Location loc, int lane, Value ret_val)
ss << threadName << ": Atomic Op returned unexpected value\n"
<< "\tEpisode " << curEpisode->getEpisodeId() << "\n"
<< "\tLane ID " << lane << "\n"
<< "\tAddress " << ruby::printAddress(addr) << "\n"
<< "\tAddress " << printAddress(addr) << "\n"
<< "\tAtomic Op's return value " << ret_val << "\n";
// print out basic info
@@ -409,7 +410,7 @@ TesterThread::validateLoadResp(Location loc, int lane, Value ret_val)
<< "\tTesterThread " << threadId << "\n"
<< "\tEpisode " << curEpisode->getEpisodeId() << "\n"
<< "\tLane ID " << lane << "\n"
<< "\tAddress " << ruby::printAddress(addr) << "\n"
<< "\tAddress " << printAddress(addr) << "\n"
<< "\tLoaded value " << ret_val << "\n"
<< "\tLast writer " << addrManager->printLastWriter(loc) << "\n";
@@ -467,7 +468,7 @@ TesterThread::printOutstandingReqs(const OutstandingReqTable& table,
for (const auto& m : table) {
for (const auto& req : m.second) {
ss << "\t\t\tAddr " << ruby::printAddress(m.first)
ss << "\t\t\tAddr " << printAddress(m.first)
<< ": delta (curCycle - issueCycle) = "
<< (cur_cycle - req.issueCycle) << std::endl;
}
@@ -488,4 +489,10 @@ TesterThread::printAllOutstandingReqs(std::stringstream& ss) const
<< pendingFenceCount << std::endl;
}
std::string
TesterThread::printAddress(Addr addr) const
{
return ruby::printAddress(addr, floorLog2(cacheLineSize));
}
} // namespace gem5

View File

@@ -132,6 +132,7 @@ class TesterThread : public ClockedObject
{}
};
int cacheLineSize;
// the unique global id of this thread
int threadId;
// width of this thread (1 for cpu thread & wf size for gpu wavefront)
@@ -204,6 +205,7 @@ class TesterThread : public ClockedObject
void printOutstandingReqs(const OutstandingReqTable& table,
std::stringstream& ss) const;
std::string printAddress(Addr addr) const;
};
} // namespace gem5

View File

@@ -124,7 +124,8 @@ Check::initiatePrefetch()
// push the subblock onto the sender state. The sequencer will
// update the subblock on the return
pkt->senderState = new SenderState(m_address, req->getSize());
pkt->senderState = new SenderState(m_address, req->getSize(),
CACHE_LINE_BITS);
if (port->sendTimingReq(pkt)) {
DPRINTF(RubyTest, "successfully initiated prefetch.\n");
@@ -161,7 +162,8 @@ Check::initiateFlush()
// push the subblock onto the sender state. The sequencer will
// update the subblock on the return
pkt->senderState = new SenderState(m_address, req->getSize());
pkt->senderState = new SenderState(m_address, req->getSize(),
CACHE_LINE_BITS);
if (port->sendTimingReq(pkt)) {
DPRINTF(RubyTest, "initiating Flush - successful\n");
@@ -207,7 +209,8 @@ Check::initiateAction()
// push the subblock onto the sender state. The sequencer will
// update the subblock on the return
pkt->senderState = new SenderState(writeAddr, req->getSize());
pkt->senderState = new SenderState(writeAddr, req->getSize(),
CACHE_LINE_BITS);
if (port->sendTimingReq(pkt)) {
DPRINTF(RubyTest, "initiating action - successful\n");
@@ -261,7 +264,8 @@ Check::initiateCheck()
// push the subblock onto the sender state. The sequencer will
// update the subblock on the return
pkt->senderState = new SenderState(m_address, req->getSize());
pkt->senderState = new SenderState(m_address, req->getSize(),
CACHE_LINE_BITS);
if (port->sendTimingReq(pkt)) {
DPRINTF(RubyTest, "initiating check - successful\n");
@@ -291,7 +295,9 @@ Check::performCallback(ruby::NodeID proc, ruby::SubBlock* data, Cycles curTime)
// This isn't exactly right since we now have multi-byte checks
// assert(getAddress() == address);
assert(ruby::makeLineAddress(m_address) == ruby::makeLineAddress(address));
int block_size_bits = CACHE_LINE_BITS;
assert(ruby::makeLineAddress(m_address, block_size_bits) ==
ruby::makeLineAddress(address, block_size_bits));
assert(data != NULL);
DPRINTF(RubyTest, "RubyTester Callback\n");
@@ -342,7 +348,7 @@ Check::performCallback(ruby::NodeID proc, ruby::SubBlock* data, Cycles curTime)
}
DPRINTF(RubyTest, "proc: %d, Address: 0x%x\n", proc,
ruby::makeLineAddress(m_address));
ruby::makeLineAddress(m_address, block_size_bits));
DPRINTF(RubyTest, "Callback done\n");
debugPrint();
}

View File

@@ -47,6 +47,7 @@ class SubBlock;
const int CHECK_SIZE_BITS = 2;
const int CHECK_SIZE = (1 << CHECK_SIZE_BITS);
const int CACHE_LINE_BITS = 6;
class Check
{

View File

@@ -90,7 +90,9 @@ class RubyTester : public ClockedObject
{
ruby::SubBlock subBlock;
SenderState(Addr addr, int size) : subBlock(addr, size) {}
SenderState(Addr addr, int size, int cl_size)
: subBlock(addr, size, cl_size)
{}
};

View File

@@ -51,37 +51,33 @@ maskLowOrderBits(Addr addr, unsigned int number)
}
Addr
getOffset(Addr addr)
getOffset(Addr addr, int cacheLineBits)
{
return bitSelect(addr, 0, RubySystem::getBlockSizeBits() - 1);
}
Addr
makeLineAddress(Addr addr)
{
return mbits<Addr>(addr, 63, RubySystem::getBlockSizeBits());
assert(cacheLineBits < 64);
return bitSelect(addr, 0, cacheLineBits - 1);
}
Addr
makeLineAddress(Addr addr, int cacheLineBits)
{
assert(cacheLineBits < 64);
return maskLowOrderBits(addr, cacheLineBits);
}
// returns the next stride address based on line address
Addr
makeNextStrideAddress(Addr addr, int stride)
makeNextStrideAddress(Addr addr, int stride, int cacheLineBytes)
{
return makeLineAddress(addr) +
static_cast<int>(RubySystem::getBlockSizeBytes()) * stride;
return makeLineAddress(addr, floorLog2(cacheLineBytes))
+ cacheLineBytes * stride;
}
std::string
printAddress(Addr addr)
printAddress(Addr addr, int cacheLineBits)
{
std::stringstream out;
out << "[" << std::hex << "0x" << addr << "," << " line 0x"
<< makeLineAddress(addr) << std::dec << "]";
<< makeLineAddress(addr, cacheLineBits) << std::dec << "]";
return out.str();
}

View File

@@ -33,6 +33,7 @@
#include <iomanip>
#include <iostream>
#include "base/intmath.hh"
#include "base/types.hh"
namespace gem5
@@ -44,11 +45,10 @@ namespace ruby
// selects bits inclusive
Addr bitSelect(Addr addr, unsigned int small, unsigned int big);
Addr maskLowOrderBits(Addr addr, unsigned int number);
Addr getOffset(Addr addr);
Addr makeLineAddress(Addr addr);
Addr getOffset(Addr addr, int cacheLineBits);
Addr makeLineAddress(Addr addr, int cacheLineBits);
Addr makeNextStrideAddress(Addr addr, int stride);
std::string printAddress(Addr addr);
Addr makeNextStrideAddress(Addr addr, int stride, int cacheLineBytes);
std::string printAddress(Addr addr, int cacheLineBits);
} // namespace ruby
} // namespace gem5

View File

@@ -40,8 +40,8 @@
#include "mem/ruby/common/DataBlock.hh"
#include "mem/ruby/common/Address.hh"
#include "mem/ruby/common/WriteMask.hh"
#include "mem/ruby/system/RubySystem.hh"
namespace gem5
{
@@ -51,17 +51,22 @@ namespace ruby
DataBlock::DataBlock(const DataBlock &cp)
{
assert(cp.isAlloc());
assert(cp.getBlockSize() > 0);
assert(!m_alloc);
uint8_t *block_update;
size_t block_bytes = RubySystem::getBlockSizeBytes();
m_data = new uint8_t[block_bytes];
memcpy(m_data, cp.m_data, block_bytes);
m_block_size = cp.getBlockSize();
m_data = new uint8_t[m_block_size];
memcpy(m_data, cp.m_data, m_block_size);
m_alloc = true;
// If this data block is involved in an atomic operation, the effect
// of applying the atomic operations on the data block are recorded in
// m_atomicLog. If so, we must copy over every entry in the change log
for (size_t i = 0; i < cp.m_atomicLog.size(); i++) {
block_update = new uint8_t[block_bytes];
memcpy(block_update, cp.m_atomicLog[i], block_bytes);
block_update = new uint8_t[m_block_size];
memcpy(block_update, cp.m_atomicLog[i], m_block_size);
m_atomicLog.push_back(block_update);
}
}
@@ -69,21 +74,44 @@ DataBlock::DataBlock(const DataBlock &cp)
void
DataBlock::alloc()
{
m_data = new uint8_t[RubySystem::getBlockSizeBytes()];
assert(!m_alloc);
if (!m_block_size) {
return;
}
m_data = new uint8_t[m_block_size];
m_alloc = true;
clear();
}
void
DataBlock::realloc(int blk_size)
{
m_block_size = blk_size;
assert(m_block_size > 0);
if (m_alloc) {
delete [] m_data;
m_alloc = false;
}
alloc();
}
void
DataBlock::clear()
{
memset(m_data, 0, RubySystem::getBlockSizeBytes());
assert(m_alloc);
assert(m_block_size > 0);
memset(m_data, 0, m_block_size);
}
bool
DataBlock::equal(const DataBlock& obj) const
{
size_t block_bytes = RubySystem::getBlockSizeBytes();
assert(m_alloc);
assert(m_block_size > 0);
size_t block_bytes = m_block_size;
// Check that the block contents match
if (memcmp(m_data, obj.m_data, block_bytes)) {
return false;
@@ -102,7 +130,9 @@ DataBlock::equal(const DataBlock& obj) const
void
DataBlock::copyPartial(const DataBlock &dblk, const WriteMask &mask)
{
for (int i = 0; i < RubySystem::getBlockSizeBytes(); i++) {
assert(m_alloc);
assert(m_block_size > 0);
for (int i = 0; i < m_block_size; i++) {
if (mask.getMask(i, 1)) {
m_data[i] = dblk.m_data[i];
}
@@ -113,7 +143,9 @@ void
DataBlock::atomicPartial(const DataBlock &dblk, const WriteMask &mask,
bool isAtomicNoReturn)
{
for (int i = 0; i < RubySystem::getBlockSizeBytes(); i++) {
assert(m_alloc);
assert(m_block_size > 0);
for (int i = 0; i < m_block_size; i++) {
m_data[i] = dblk.m_data[i];
}
mask.performAtomic(m_data, m_atomicLog, isAtomicNoReturn);
@@ -122,7 +154,9 @@ DataBlock::atomicPartial(const DataBlock &dblk, const WriteMask &mask,
void
DataBlock::print(std::ostream& out) const
{
int size = RubySystem::getBlockSizeBytes();
assert(m_alloc);
assert(m_block_size > 0);
int size = m_block_size;
out << "[ ";
for (int i = 0; i < size; i++) {
out << std::setw(2) << std::setfill('0') << std::hex
@@ -147,6 +181,7 @@ DataBlock::popAtomicLogEntryFront()
void
DataBlock::clearAtomicLogEntries()
{
assert(m_alloc);
for (auto log : m_atomicLog) {
delete [] log;
}
@@ -156,35 +191,59 @@ DataBlock::clearAtomicLogEntries()
const uint8_t*
DataBlock::getData(int offset, int len) const
{
assert(offset + len <= RubySystem::getBlockSizeBytes());
assert(m_alloc);
assert(m_block_size > 0);
assert(offset + len <= m_block_size);
return &m_data[offset];
}
uint8_t*
DataBlock::getDataMod(int offset)
{
assert(m_alloc);
return &m_data[offset];
}
void
DataBlock::setData(const uint8_t *data, int offset, int len)
{
assert(m_alloc);
memcpy(&m_data[offset], data, len);
}
void
DataBlock::setData(PacketPtr pkt)
{
int offset = getOffset(pkt->getAddr());
assert(offset + pkt->getSize() <= RubySystem::getBlockSizeBytes());
assert(m_alloc);
assert(m_block_size > 0);
int offset = getOffset(pkt->getAddr(), floorLog2(m_block_size));
assert(offset + pkt->getSize() <= m_block_size);
pkt->writeData(&m_data[offset]);
}
DataBlock &
DataBlock::operator=(const DataBlock & obj)
{
// Reallocate if needed
if (m_alloc && m_block_size != obj.getBlockSize()) {
delete [] m_data;
m_block_size = obj.getBlockSize();
alloc();
} else if (!m_alloc) {
m_block_size = obj.getBlockSize();
alloc();
// Assume this will be realloc'd later if zero.
if (m_block_size == 0) {
return *this;
}
} else {
assert(m_alloc && m_block_size == obj.getBlockSize());
}
assert(m_block_size > 0);
uint8_t *block_update;
size_t block_bytes = RubySystem::getBlockSizeBytes();
size_t block_bytes = m_block_size;
// Copy entire block contents from obj to current block
memcpy(m_data, obj.m_data, block_bytes);
// If this data block is involved in an atomic operation, the effect

View File

@@ -61,8 +61,14 @@ class WriteMask;
class DataBlock
{
public:
DataBlock()
// Ideally this should not be called. We allow the default so that protocols
// do not need to be changed.
DataBlock() = default;
DataBlock(int blk_size)
{
assert(!m_alloc);
m_block_size = blk_size;
alloc();
}
@@ -101,10 +107,16 @@ class DataBlock
bool equal(const DataBlock& obj) const;
void print(std::ostream& out) const;
int getBlockSize() const { return m_block_size; }
void setBlockSize(int block_size) { realloc(block_size); }
bool isAlloc() const { return m_alloc; }
void realloc(int blk_size);
private:
void alloc();
uint8_t *m_data;
bool m_alloc;
uint8_t *m_data = nullptr;
bool m_alloc = false;
int m_block_size = 0;
// Tracks block changes when atomic ops are applied
std::deque<uint8_t*> m_atomicLog;
@@ -124,18 +136,21 @@ DataBlock::assign(uint8_t *data)
inline uint8_t
DataBlock::getByte(int whichByte) const
{
assert(m_alloc);
return m_data[whichByte];
}
inline void
DataBlock::setByte(int whichByte, uint8_t data)
{
assert(m_alloc);
m_data[whichByte] = data;
}
inline void
DataBlock::copyPartial(const DataBlock & dblk, int offset, int len)
{
assert(m_alloc);
setData(&dblk.m_data[offset], offset, len);
}

View File

@@ -30,6 +30,8 @@
#include <algorithm>
#include "mem/ruby/system/RubySystem.hh"
namespace gem5
{
@@ -38,12 +40,18 @@ namespace ruby
NetDest::NetDest()
{
resize();
}
NetDest::NetDest(RubySystem *ruby_system)
: m_ruby_system(ruby_system)
{
resize();
}
void
NetDest::add(MachineID newElement)
{
assert(m_bits.size() > 0);
assert(bitIndex(newElement.num) < m_bits[vecIndex(newElement)].getSize());
m_bits[vecIndex(newElement)].add(bitIndex(newElement.num));
}
@@ -51,6 +59,7 @@ NetDest::add(MachineID newElement)
void
NetDest::addNetDest(const NetDest& netDest)
{
assert(m_bits.size() > 0);
assert(m_bits.size() == netDest.getSize());
for (int i = 0; i < m_bits.size(); i++) {
m_bits[i].addSet(netDest.m_bits[i]);
@@ -60,6 +69,8 @@ NetDest::addNetDest(const NetDest& netDest)
void
NetDest::setNetDest(MachineType machine, const Set& set)
{
assert(m_ruby_system != nullptr);
// assure that there is only one set of destinations for this machine
assert(MachineType_base_level((MachineType)(machine + 1)) -
MachineType_base_level(machine) == 1);
@@ -69,12 +80,14 @@ NetDest::setNetDest(MachineType machine, const Set& set)
void
NetDest::remove(MachineID oldElement)
{
assert(m_bits.size() > 0);
m_bits[vecIndex(oldElement)].remove(bitIndex(oldElement.num));
}
void
NetDest::removeNetDest(const NetDest& netDest)
{
assert(m_bits.size() > 0);
assert(m_bits.size() == netDest.getSize());
for (int i = 0; i < m_bits.size(); i++) {
m_bits[i].removeSet(netDest.m_bits[i]);
@@ -84,6 +97,7 @@ NetDest::removeNetDest(const NetDest& netDest)
void
NetDest::clear()
{
assert(m_bits.size() > 0);
for (int i = 0; i < m_bits.size(); i++) {
m_bits[i].clear();
}
@@ -101,6 +115,8 @@ NetDest::broadcast()
void
NetDest::broadcast(MachineType machineType)
{
assert(m_ruby_system != nullptr);
for (NodeID i = 0; i < MachineType_base_count(machineType); i++) {
MachineID mach = {machineType, i};
add(mach);
@@ -111,6 +127,9 @@ NetDest::broadcast(MachineType machineType)
std::vector<NodeID>
NetDest::getAllDest()
{
assert(m_ruby_system != nullptr);
assert(m_bits.size() > 0);
std::vector<NodeID> dest;
dest.clear();
for (int i = 0; i < m_bits.size(); i++) {
@@ -127,6 +146,8 @@ NetDest::getAllDest()
int
NetDest::count() const
{
assert(m_bits.size() > 0);
int counter = 0;
for (int i = 0; i < m_bits.size(); i++) {
counter += m_bits[i].count();
@@ -137,12 +158,14 @@ NetDest::count() const
NodeID
NetDest::elementAt(MachineID index)
{
assert(m_bits.size() > 0);
return m_bits[vecIndex(index)].elementAt(bitIndex(index.num));
}
MachineID
NetDest::smallestElement() const
{
assert(m_bits.size() > 0);
assert(count() > 0);
for (int i = 0; i < m_bits.size(); i++) {
for (NodeID j = 0; j < m_bits[i].getSize(); j++) {
@@ -158,6 +181,9 @@ NetDest::smallestElement() const
MachineID
NetDest::smallestElement(MachineType machine) const
{
assert(m_bits.size() > 0);
assert(m_ruby_system != nullptr);
int size = m_bits[MachineType_base_level(machine)].getSize();
for (NodeID j = 0; j < size; j++) {
if (m_bits[MachineType_base_level(machine)].isElement(j)) {
@@ -173,6 +199,7 @@ NetDest::smallestElement(MachineType machine) const
bool
NetDest::isBroadcast() const
{
assert(m_bits.size() > 0);
for (int i = 0; i < m_bits.size(); i++) {
if (!m_bits[i].isBroadcast()) {
return false;
@@ -185,6 +212,7 @@ NetDest::isBroadcast() const
bool
NetDest::isEmpty() const
{
assert(m_bits.size() > 0);
for (int i = 0; i < m_bits.size(); i++) {
if (!m_bits[i].isEmpty()) {
return false;
@@ -197,8 +225,9 @@ NetDest::isEmpty() const
NetDest
NetDest::OR(const NetDest& orNetDest) const
{
assert(m_bits.size() > 0);
assert(m_bits.size() == orNetDest.getSize());
NetDest result;
NetDest result(m_ruby_system);
for (int i = 0; i < m_bits.size(); i++) {
result.m_bits[i] = m_bits[i].OR(orNetDest.m_bits[i]);
}
@@ -209,8 +238,9 @@ NetDest::OR(const NetDest& orNetDest) const
NetDest
NetDest::AND(const NetDest& andNetDest) const
{
assert(m_bits.size() > 0);
assert(m_bits.size() == andNetDest.getSize());
NetDest result;
NetDest result(m_ruby_system);
for (int i = 0; i < m_bits.size(); i++) {
result.m_bits[i] = m_bits[i].AND(andNetDest.m_bits[i]);
}
@@ -221,6 +251,7 @@ NetDest::AND(const NetDest& andNetDest) const
bool
NetDest::intersectionIsNotEmpty(const NetDest& other_netDest) const
{
assert(m_bits.size() > 0);
assert(m_bits.size() == other_netDest.getSize());
for (int i = 0; i < m_bits.size(); i++) {
if (!m_bits[i].intersectionIsEmpty(other_netDest.m_bits[i])) {
@@ -233,6 +264,7 @@ NetDest::intersectionIsNotEmpty(const NetDest& other_netDest) const
bool
NetDest::isSuperset(const NetDest& test) const
{
assert(m_bits.size() > 0);
assert(m_bits.size() == test.getSize());
for (int i = 0; i < m_bits.size(); i++) {
@@ -246,12 +278,15 @@ NetDest::isSuperset(const NetDest& test) const
bool
NetDest::isElement(MachineID element) const
{
assert(m_bits.size() > 0);
return ((m_bits[vecIndex(element)])).isElement(bitIndex(element.num));
}
void
NetDest::resize()
{
assert(m_ruby_system != nullptr);
m_bits.resize(MachineType_base_level(MachineType_NUM));
assert(m_bits.size() == MachineType_NUM);
@@ -263,6 +298,7 @@ NetDest::resize()
void
NetDest::print(std::ostream& out) const
{
assert(m_bits.size() > 0);
out << "[NetDest (" << m_bits.size() << ") ";
for (int i = 0; i < m_bits.size(); i++) {
@@ -277,6 +313,7 @@ NetDest::print(std::ostream& out) const
bool
NetDest::isEqual(const NetDest& n) const
{
assert(m_bits.size() > 0);
assert(m_bits.size() == n.m_bits.size());
for (unsigned int i = 0; i < m_bits.size(); ++i) {
if (!m_bits[i].isEqual(n.m_bits[i]))
@@ -285,5 +322,19 @@ NetDest::isEqual(const NetDest& n) const
return true;
}
int
NetDest::MachineType_base_count(const MachineType& obj)
{
assert(m_ruby_system != nullptr);
return m_ruby_system->MachineType_base_count(obj);
}
int
NetDest::MachineType_base_number(const MachineType& obj)
{
assert(m_ruby_system != nullptr);
return m_ruby_system->MachineType_base_number(obj);
}
} // namespace ruby
} // namespace gem5

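The NetDest changes above follow a deferred-sizing pattern: the bit vectors cannot be sized until a RubySystem pointer is available (it supplies MachineType_base_level), so setRubySystem() stores the pointer and then calls resize(). A minimal standalone sketch of that pattern, with simplified stand-ins for the gem5 types (RubySystemSketch, NetDestSketch, and base_level_num are illustrative names, not the real API):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct RubySystemSketch
{
    // Stand-in for MachineType_base_level(MachineType_NUM).
    int base_level_num() const { return 3; }
};

class NetDestSketch
{
  public:
    NetDestSketch() = default;  // unsized; setRubySystem() must be called
    explicit NetDestSketch(RubySystemSketch *rs) { setRubySystem(rs); }

    void setRubySystem(RubySystemSketch *rs)
    {
        m_ruby_system = rs;
        resize();
    }

    std::size_t size() const { return m_bits.size(); }

  private:
    void resize()
    {
        // Sizing requires the system pointer, hence the nullptr asserts
        // sprinkled through the real NetDest methods.
        assert(m_ruby_system != nullptr);
        m_bits.resize(m_ruby_system->base_level_num());
    }

    std::vector<unsigned> m_bits;  // stands in for std::vector<Set>
    RubySystemSketch *m_ruby_system = nullptr;
};
```

This mirrors why the diff adds both a `NetDest(RubySystem *ruby_system)` constructor and a `setRubySystem()` method: objects default-constructed by SLICC-generated code get the pointer injected later.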
View File

@@ -41,6 +41,8 @@ namespace gem5
namespace ruby
{
class RubySystem;
// NetDest specifies the network destination of a Message
class NetDest
{
@@ -48,6 +50,7 @@ class NetDest
// Constructors
// creates an empty set
NetDest();
NetDest(RubySystem *ruby_system);
explicit NetDest(int bit_size);
NetDest& operator=(const Set& obj);
@@ -98,6 +101,8 @@ class NetDest
void print(std::ostream& out) const;
void setRubySystem(RubySystem *rs) { m_ruby_system = rs; resize(); }
private:
// returns a value >= MachineType_base_level("this machine")
// and < MachineType_base_level("next highest machine")
@@ -112,6 +117,12 @@ class NetDest
NodeID bitIndex(NodeID index) const { return index; }
std::vector<Set> m_bits; // a vector of bit vectors - i.e. Sets
// Needed to call MachineType_base_count/level
RubySystem *m_ruby_system = nullptr;
int MachineType_base_count(const MachineType& obj);
int MachineType_base_number(const MachineType& obj);
};
inline std::ostream&

View File

@@ -38,13 +38,14 @@ namespace ruby
using stl_helpers::operator<<;
SubBlock::SubBlock(Addr addr, int size)
SubBlock::SubBlock(Addr addr, int size, int cl_bits)
{
m_address = addr;
resize(size);
for (int i = 0; i < size; i++) {
setByte(i, 0);
}
m_cache_line_bits = cl_bits;
}
void
@@ -52,7 +53,7 @@ SubBlock::internalMergeFrom(const DataBlock& data)
{
int size = getSize();
assert(size > 0);
int offset = getOffset(m_address);
int offset = getOffset(m_address, m_cache_line_bits);
for (int i = 0; i < size; i++) {
this->setByte(i, data.getByte(offset + i));
}
@@ -63,7 +64,7 @@ SubBlock::internalMergeTo(DataBlock& data) const
{
int size = getSize();
assert(size > 0);
int offset = getOffset(m_address);
int offset = getOffset(m_address, m_cache_line_bits);
for (int i = 0; i < size; i++) {
// This will detect crossing a cache line boundary
data.setByte(offset + i, this->getByte(i));

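SubBlock now carries `m_cache_line_bits` so that getOffset() no longer needs the static RubySystem block size. A sketch of the address arithmetic involved, assuming the usual convention that `cl_bits` is log2 of the block size in bytes (these free-function forms are illustrative; the real helpers live in src/mem/ruby/common/Address.hh):

```cpp
#include <cassert>
#include <cstdint>

using Addr = uint64_t;

// Clear the offset bits to get the start of the cache line.
Addr makeLineAddress(Addr addr, int cl_bits)
{
    return addr & ~((Addr(1) << cl_bits) - 1);
}

// Keep only the offset bits within the cache line.
int getOffset(Addr addr, int cl_bits)
{
    return static_cast<int>(addr & ((Addr(1) << cl_bits) - 1));
}
```

For example, with 64-byte lines (`cl_bits == 6`), address 0x12345 has line address 0x12340 and offset 5, which is exactly the index internalMergeFrom/internalMergeTo use into the DataBlock.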
View File

@@ -45,7 +45,7 @@ class SubBlock
{
public:
SubBlock() { }
SubBlock(Addr addr, int size);
SubBlock(Addr addr, int size, int cl_bits);
~SubBlock() { }
Addr getAddress() const { return m_address; }
@@ -74,6 +74,7 @@ class SubBlock
// Data Members (m_ prefix)
Addr m_address;
std::vector<uint8_t> m_data;
int m_cache_line_bits;
};
inline std::ostream&

View File

@@ -39,13 +39,13 @@ namespace ruby
{
WriteMask::WriteMask()
: mSize(RubySystem::getBlockSizeBytes()), mMask(mSize, false),
mAtomic(false)
: mSize(0), mMask(mSize, false), mAtomic(false)
{}
void
WriteMask::print(std::ostream& out) const
{
assert(mSize > 0);
std::string str(mSize,'0');
for (int i = 0; i < mSize; i++) {
str[i] = mMask[i] ? ('1') : ('0');
@@ -59,6 +59,7 @@ void
WriteMask::performAtomic(uint8_t * p,
std::deque<uint8_t*>& log, bool isAtomicNoReturn) const
{
assert(mSize > 0);
int offset;
uint8_t *block_update;
// Here, operations occur in FIFO order from the mAtomicOp

View File

@@ -78,6 +78,17 @@ class WriteMask
~WriteMask()
{}
int getBlockSize() const { return mSize; }
void
setBlockSize(int size)
{
// This should only be used once if the default ctor was used. Probably
// by src/mem/ruby/protocol/RubySlicc_MemControl.sm.
assert(mSize == 0);
assert(size > 0);
mSize = size;
}
void
clear()
{
@@ -87,6 +98,7 @@ class WriteMask
bool
test(int offset) const
{
assert(mSize > 0);
assert(offset < mSize);
return mMask[offset];
}
@@ -94,6 +106,7 @@ class WriteMask
void
setMask(int offset, int len, bool val = true)
{
assert(mSize > 0);
assert(mSize >= (offset + len));
for (int i = 0; i < len; i++) {
mMask[offset + i] = val;
@@ -102,6 +115,7 @@ class WriteMask
void
fillMask()
{
assert(mSize > 0);
for (int i = 0; i < mSize; i++) {
mMask[i] = true;
}
@@ -111,6 +125,7 @@ class WriteMask
getMask(int offset, int len) const
{
bool tmp = true;
assert(mSize > 0);
assert(mSize >= (offset + len));
for (int i = 0; i < len; i++) {
tmp = tmp & mMask.at(offset + i);
@@ -122,6 +137,7 @@ class WriteMask
isOverlap(const WriteMask &readMask) const
{
bool tmp = false;
assert(mSize > 0);
assert(mSize == readMask.mSize);
for (int i = 0; i < mSize; i++) {
if (readMask.mMask.at(i)) {
@@ -135,6 +151,7 @@ class WriteMask
containsMask(const WriteMask &readMask) const
{
bool tmp = true;
assert(mSize > 0);
assert(mSize == readMask.mSize);
for (int i = 0; i < mSize; i++) {
if (readMask.mMask.at(i)) {
@@ -146,6 +163,7 @@ class WriteMask
bool isEmpty() const
{
assert(mSize > 0);
for (int i = 0; i < mSize; i++) {
if (mMask.at(i)) {
return false;
@@ -157,6 +175,7 @@ class WriteMask
bool
isFull() const
{
assert(mSize > 0);
for (int i = 0; i < mSize; i++) {
if (!mMask.at(i)) {
return false;
@@ -168,6 +187,7 @@ class WriteMask
void
andMask(const WriteMask & writeMask)
{
assert(mSize > 0);
assert(mSize == writeMask.mSize);
for (int i = 0; i < mSize; i++) {
mMask[i] = (mMask.at(i)) && (writeMask.mMask.at(i));
@@ -182,6 +202,7 @@ class WriteMask
void
orMask(const WriteMask & writeMask)
{
assert(mSize > 0);
assert(mSize == writeMask.mSize);
for (int i = 0; i < mSize; i++) {
mMask[i] = (mMask.at(i)) || (writeMask.mMask.at(i));
@@ -196,6 +217,7 @@ class WriteMask
void
setInvertedMask(const WriteMask & writeMask)
{
assert(mSize > 0);
assert(mSize == writeMask.mSize);
for (int i = 0; i < mSize; i++) {
mMask[i] = !writeMask.mMask.at(i);
@@ -205,6 +227,7 @@ class WriteMask
int
firstBitSet(bool val, int offset = 0) const
{
assert(mSize > 0);
for (int i = offset; i < mSize; ++i)
if (mMask[i] == val)
return i;
@@ -214,6 +237,7 @@ class WriteMask
int
count(int offset = 0) const
{
assert(mSize > 0);
int count = 0;
for (int i = offset; i < mSize; ++i)
count += mMask[i];

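The WriteMask changes above replace the static block size with a one-shot setBlockSize() call, guarded so it can only be used on a default-constructed (unsized) mask, and every accessor asserts `mSize > 0`. A compact standalone sketch of that sizing protocol (names mirror the diff, but this is an illustration, not the gem5 class):

```cpp
#include <cassert>
#include <vector>

class WriteMaskSketch
{
  public:
    WriteMaskSketch() : mSize(0), mMask(mSize, false) {}

    int getBlockSize() const { return mSize; }

    void setBlockSize(int size)
    {
        assert(mSize == 0);  // only legal on a default-constructed mask
        assert(size > 0);
        mSize = size;
        mMask.assign(mSize, false);
    }

    void setMask(int offset, int len, bool val = true)
    {
        assert(mSize > 0);   // every accessor checks the mask was sized
        assert(mSize >= offset + len);
        for (int i = 0; i < len; i++)
            mMask[offset + i] = val;
    }

    bool isEmpty() const
    {
        assert(mSize > 0);
        for (int i = 0; i < mSize; i++)
            if (mMask[i])
                return false;
        return true;
    }

  private:
    int mSize;
    std::vector<bool> mMask;
};
```

The assert in setBlockSize() is what lets the SLICC parser call it transparently: a mask constructed by generated code is sized exactly once, and any accidental second call or use-before-sizing fails loudly in debug builds.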
View File

@@ -47,7 +47,6 @@
#include "base/random.hh"
#include "base/stl_helpers.hh"
#include "debug/RubyQueue.hh"
#include "mem/ruby/system/RubySystem.hh"
namespace gem5
{
@@ -216,6 +215,7 @@ random_time()
void
MessageBuffer::enqueue(MsgPtr message, Tick current_time, Tick delta,
bool ruby_is_random, bool ruby_warmup,
bool bypassStrictFIFO)
{
// record current time in case we have a pop that also adjusts my size
@@ -237,7 +237,7 @@ MessageBuffer::enqueue(MsgPtr message, Tick current_time, Tick delta,
// is turned on and this buffer allows it
if ((m_randomization == MessageRandomization::disabled) ||
((m_randomization == MessageRandomization::ruby_system) &&
!RubySystem::getRandomization())) {
!ruby_is_random)) {
// No randomization
arrival_time = current_time + delta;
} else {
@@ -265,7 +265,7 @@ MessageBuffer::enqueue(MsgPtr message, Tick current_time, Tick delta,
}
// If running a cache trace, don't worry about the last arrival checks
if (!RubySystem::getWarmupEnabled()) {
if (!ruby_warmup) {
m_last_arrival_time = arrival_time;
}
@@ -447,7 +447,6 @@ MessageBuffer::stallMessage(Addr addr, Tick current_time)
{
DPRINTF(RubyQueue, "Stalling due to %#x\n", addr);
assert(isReady(current_time));
assert(getOffset(addr) == 0);
MsgPtr message = m_prio_heap.front();
// Since the message will just be moved to stall map, indicate that the
@@ -479,7 +478,8 @@ MessageBuffer::deferEnqueueingMessage(Addr addr, MsgPtr message)
}
void
MessageBuffer::enqueueDeferredMessages(Addr addr, Tick curTime, Tick delay)
MessageBuffer::enqueueDeferredMessages(Addr addr, Tick curTime, Tick delay,
bool ruby_is_random, bool ruby_warmup)
{
assert(!isDeferredMsgMapEmpty(addr));
std::vector<MsgPtr>& msg_vec = m_deferred_msg_map[addr];
@@ -487,7 +487,7 @@ MessageBuffer::enqueueDeferredMessages(Addr addr, Tick curTime, Tick delay)
// enqueue all deferred messages associated with this address
for (MsgPtr m : msg_vec) {
enqueue(m, curTime, delay);
enqueue(m, curTime, delay, ruby_is_random, ruby_warmup);
}
msg_vec.clear();

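The enqueue() change above threads `ruby_is_random` through as an argument instead of consulting the static `RubySystem::getRandomization()`. The arrival-time decision it feeds can be sketched as a pure function (the `randomized_delta` parameter stands in for the buffer's random_time() draw; this is illustrative, not gem5 code):

```cpp
#include <cassert>
#include <cstdint>

using Tick = uint64_t;

enum class MessageRandomization { disabled, ruby_system, enabled };

Tick
arrivalTime(Tick current_time, Tick delta, MessageRandomization policy,
            bool ruby_is_random, Tick randomized_delta)
{
    // No randomization if the buffer disables it, or if the buffer defers
    // to the RubySystem setting and that setting (now passed in by the
    // caller) is off.
    if (policy == MessageRandomization::disabled ||
        (policy == MessageRandomization::ruby_system && !ruby_is_random)) {
        return current_time + delta;
    }
    return current_time + randomized_delta;
}
```

Passing the flag explicitly keeps MessageBuffer free of any RubySystem include, which is why the diff can drop `#include "mem/ruby/system/RubySystem.hh"` from MessageBuffer.cc.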
View File

@@ -90,13 +90,14 @@ class MessageBuffer : public SimObject
Tick readyTime() const;
void
delayHead(Tick current_time, Tick delta)
delayHead(Tick current_time, Tick delta, bool ruby_is_random,
bool ruby_warmup)
{
MsgPtr m = m_prio_heap.front();
std::pop_heap(m_prio_heap.begin(), m_prio_heap.end(),
std::greater<MsgPtr>());
m_prio_heap.pop_back();
enqueue(m, current_time, delta);
enqueue(m, current_time, delta, ruby_is_random, ruby_warmup);
}
bool areNSlotsAvailable(unsigned int n, Tick curTime);
@@ -124,6 +125,7 @@ class MessageBuffer : public SimObject
const MsgPtr &peekMsgPtr() const { return m_prio_heap.front(); }
void enqueue(MsgPtr message, Tick curTime, Tick delta,
bool ruby_is_random, bool ruby_warmup,
bool bypassStrictFIFO = false);
// Defer enqueueing a message to a later cycle by putting it aside and not
@@ -135,7 +137,8 @@ class MessageBuffer : public SimObject
// enqueue all previously deferred messages that are associated with the
// input address
void enqueueDeferredMessages(Addr addr, Tick curTime, Tick delay);
void enqueueDeferredMessages(Addr addr, Tick curTime, Tick delay,
bool ruby_is_random, bool ruby_warmup);
bool isDeferredMsgMapEmpty(Addr addr) const;
//! Updates the delay cycles of the message at the head of the queue,

View File

@@ -65,7 +65,8 @@ Network::Network(const Params &p)
"%s: data message size > cache line size", name());
m_data_msg_size = p.data_msg_size + m_control_msg_size;
params().ruby_system->registerNetwork(this);
m_ruby_system = p.ruby_system;
m_ruby_system->registerNetwork(this);
// Populate localNodeVersions with the version of each MachineType in
// this network. This will be used to compute a global to local ID.
@@ -102,7 +103,8 @@ Network::Network(const Params &p)
m_topology_ptr = new Topology(m_nodes, p.routers.size(),
m_virtual_networks,
p.ext_links, p.int_links);
p.ext_links, p.int_links,
m_ruby_system);
// Allocate to and from queues
// Queues that are getting messages from protocol
@@ -246,7 +248,7 @@ Network::addressToNodeID(Addr addr, MachineType mtype)
}
}
}
return MachineType_base_count(mtype);
return m_ruby_system->MachineType_base_count(mtype);
}
NodeID
@@ -256,5 +258,23 @@ Network::getLocalNodeID(NodeID global_id) const
return globalToLocalMap.at(global_id);
}
bool
Network::getRandomization() const
{
return m_ruby_system->getRandomization();
}
bool
Network::getWarmupEnabled() const
{
return m_ruby_system->getWarmupEnabled();
}
int
Network::MachineType_base_number(const MachineType& obj)
{
return m_ruby_system->MachineType_base_number(obj);
}
} // namespace ruby
} // namespace gem5

View File

@@ -78,6 +78,7 @@ namespace ruby
class NetDest;
class MessageBuffer;
class RubySystem;
class Network : public ClockedObject
{
@@ -147,6 +148,10 @@ class Network : public ClockedObject
NodeID getLocalNodeID(NodeID global_id) const;
bool getRandomization() const;
bool getWarmupEnabled() const;
RubySystem *getRubySystem() const { return m_ruby_system; }
protected:
// Private copy constructor and assignment operator
Network(const Network& obj);
@@ -176,6 +181,12 @@ class Network : public ClockedObject
// Global NodeID to local node map. If there are not multiple networks in
// the same RubySystem, this is a one-to-one mapping of global to local.
std::unordered_map<NodeID, NodeID> globalToLocalMap;
// For checking whether randomization/warmup are enabled. We cannot store
// those values in the constructor in case we are constructed first.
RubySystem *m_ruby_system = nullptr;
int MachineType_base_number(const MachineType& obj);
};
inline std::ostream&

View File

@@ -37,6 +37,7 @@
#include "mem/ruby/network/BasicLink.hh"
#include "mem/ruby/network/Network.hh"
#include "mem/ruby/slicc_interface/AbstractController.hh"
#include "mem/ruby/system/RubySystem.hh"
namespace gem5
{
@@ -56,10 +57,12 @@ const int INFINITE_LATENCY = 10000; // Yes, this is a big hack
Topology::Topology(uint32_t num_nodes, uint32_t num_routers,
uint32_t num_vnets,
const std::vector<BasicExtLink *> &ext_links,
const std::vector<BasicIntLink *> &int_links)
: m_nodes(MachineType_base_number(MachineType_NUM)),
const std::vector<BasicIntLink *> &int_links,
RubySystem *ruby_system)
: m_nodes(ruby_system->MachineType_base_number(MachineType_NUM)),
m_number_of_switches(num_routers), m_vnets(num_vnets),
m_ext_link_vector(ext_links), m_int_link_vector(int_links)
m_ext_link_vector(ext_links), m_int_link_vector(int_links),
m_ruby_system(ruby_system)
{
// Total nodes/controllers in network
assert(m_nodes > 1);
@@ -78,7 +81,8 @@ Topology::Topology(uint32_t num_nodes, uint32_t num_routers,
AbstractController *abs_cntrl = ext_link->params().ext_node;
BasicRouter *router = ext_link->params().int_node;
int machine_base_idx = MachineType_base_number(abs_cntrl->getType());
int machine_base_idx =
ruby_system->MachineType_base_number(abs_cntrl->getType());
int ext_idx1 = machine_base_idx + abs_cntrl->getVersion();
int ext_idx2 = ext_idx1 + m_nodes;
int int_idx = router->params().router_id + 2*m_nodes;
@@ -189,7 +193,7 @@ Topology::createLinks(Network *net)
for (int i = 0; i < topology_weights[0].size(); i++) {
for (int j = 0; j < topology_weights[0][i].size(); j++) {
std::vector<NetDest> routingMap;
routingMap.resize(m_vnets);
routingMap.resize(m_vnets, m_ruby_system);
// Not all sources and destinations are connected
// by direct links. We only construct the links
@@ -264,7 +268,7 @@ Topology::makeLink(Network *net, SwitchID src, SwitchID dest,
for (int l = 0; l < links.size(); l++) {
link_entry = links[l];
std::vector<NetDest> linkRoute;
linkRoute.resize(m_vnets);
linkRoute.resize(m_vnets, m_ruby_system);
BasicLink *link = link_entry.link;
if (link->mVnets.size() == 0) {
net->makeExtInLink(src, dest - (2 * m_nodes), link,
@@ -287,7 +291,7 @@ Topology::makeLink(Network *net, SwitchID src, SwitchID dest,
for (int l = 0; l < links.size(); l++) {
link_entry = links[l];
std::vector<NetDest> linkRoute;
linkRoute.resize(m_vnets);
linkRoute.resize(m_vnets, m_ruby_system);
BasicLink *link = link_entry.link;
if (link->mVnets.size() == 0) {
net->makeExtOutLink(src - (2 * m_nodes), node, link,
@@ -309,7 +313,7 @@ Topology::makeLink(Network *net, SwitchID src, SwitchID dest,
for (int l = 0; l < links.size(); l++) {
link_entry = links[l];
std::vector<NetDest> linkRoute;
linkRoute.resize(m_vnets);
linkRoute.resize(m_vnets, m_ruby_system);
BasicLink *link = link_entry.link;
if (link->mVnets.size() == 0) {
net->makeInternalLink(src - (2 * m_nodes),
@@ -413,16 +417,17 @@ Topology::shortest_path_to_node(SwitchID src, SwitchID next,
const Matrix &weights, const Matrix &dist,
int vnet)
{
NetDest result;
NetDest result(m_ruby_system);
int d = 0;
int machines;
int max_machines;
machines = MachineType_NUM;
max_machines = MachineType_base_number(MachineType_NUM);
max_machines = m_ruby_system->MachineType_base_number(MachineType_NUM);
for (int m = 0; m < machines; m++) {
for (NodeID i = 0; i < MachineType_base_count((MachineType)m); i++) {
for (NodeID i = 0;
i < m_ruby_system->MachineType_base_count((MachineType)m); i++) {
// we use "d+max_machines" below since the "destination"
// switches for the machines are numbered
// [MachineType_base_number(MachineType_NUM)...

View File

@@ -80,7 +80,8 @@ class Topology
public:
Topology(uint32_t num_nodes, uint32_t num_routers, uint32_t num_vnets,
const std::vector<BasicExtLink *> &ext_links,
const std::vector<BasicIntLink *> &int_links);
const std::vector<BasicIntLink *> &int_links,
RubySystem *ruby_system);
uint32_t numSwitches() const { return m_number_of_switches; }
void createLinks(Network *net);
@@ -108,7 +109,7 @@ class Topology
const Matrix &weights, const Matrix &dist,
int vnet);
const uint32_t m_nodes;
uint32_t m_nodes;
const uint32_t m_number_of_switches;
int m_vnets;
@@ -116,6 +117,8 @@ class Topology
std::vector<BasicIntLink*> m_int_link_vector;
LinkMap m_link_map;
RubySystem *m_ruby_system = nullptr;
};
inline std::ostream&

View File

@@ -41,6 +41,7 @@
#include "mem/ruby/network/garnet/Credit.hh"
#include "mem/ruby/network/garnet/flitBuffer.hh"
#include "mem/ruby/slicc_interface/Message.hh"
#include "mem/ruby/system/RubySystem.hh"
namespace gem5
{
@@ -244,7 +245,9 @@ NetworkInterface::wakeup()
outNode_ptr[vnet]->areNSlotsAvailable(1, curTime)) {
// Space is available. Enqueue to protocol buffer.
outNode_ptr[vnet]->enqueue(t_flit->get_msg_ptr(), curTime,
cyclesToTicks(Cycles(1)));
cyclesToTicks(Cycles(1)),
m_net_ptr->getRandomization(),
m_net_ptr->getWarmupEnabled());
// Simply send a credit back since we are not buffering
// this flit in the NI
@@ -332,7 +335,9 @@ NetworkInterface::checkStallQueue()
if (outNode_ptr[vnet]->areNSlotsAvailable(1,
curTime)) {
outNode_ptr[vnet]->enqueue(stallFlit->get_msg_ptr(),
curTime, cyclesToTicks(Cycles(1)));
curTime, cyclesToTicks(Cycles(1)),
m_net_ptr->getRandomization(),
m_net_ptr->getWarmupEnabled());
// Send back a credit with free signal now that the
// VC is no longer stalled.
@@ -699,6 +704,12 @@ NetworkInterface::functionalWrite(Packet *pkt)
return num_functional_writes;
}
int
NetworkInterface::MachineType_base_number(const MachineType& obj)
{
return m_net_ptr->getRubySystem()->MachineType_base_number(obj);
}
} // namespace garnet
} // namespace ruby
} // namespace gem5

View File

@@ -306,6 +306,8 @@ class NetworkInterface : public ClockedObject, public Consumer
InputPort *getInportForVnet(int vnet);
OutputPort *getOutportForVnet(int vnet);
int MachineType_base_number(const MachineType& obj);
};
} // namespace garnet

View File

@@ -268,7 +268,8 @@ PerfectSwitch::operateMessageBuffer(MessageBuffer *buffer, int vnet)
buffer->getIncomingLink(), vnet, outgoing, vnet);
out_port.buffers[vnet]->enqueue(msg_ptr, current_time,
out_port.latency);
out_port.latency, m_switch->getNetPtr()->getRandomization(),
m_switch->getNetPtr()->getWarmupEnabled());
}
}
}

View File

@@ -104,6 +104,7 @@ class Switch : public BasicRouter
void print(std::ostream& out) const;
void init_net_ptr(SimpleNetwork* net_ptr) { m_network_ptr = net_ptr; }
SimpleNetwork* getNetPtr() const { return m_network_ptr; }
bool functionalRead(Packet *);
bool functionalRead(Packet *, WriteMask&);

View File

@@ -199,7 +199,9 @@ Throttle::operateVnet(int vnet, int channel, int &total_bw_remaining,
// Move the message
in->dequeue(current_time);
out->enqueue(msg_ptr, current_time,
m_switch->cyclesToTicks(m_link_latency));
m_switch->cyclesToTicks(m_link_latency),
m_ruby_system->getRandomization(),
m_ruby_system->getWarmupEnabled());
// Count the message
(*(throttleStats.

View File

@@ -34,6 +34,7 @@
#include "base/stl_helpers.hh"
#include "mem/ruby/profiler/Profiler.hh"
#include "mem/ruby/protocol/RubyRequest.hh"
#include "mem/ruby/system/RubySystem.hh"
namespace gem5
{
@@ -307,7 +308,8 @@ AddressProfiler::addTraceSample(Addr data_addr, Addr pc_addr,
}
// record data address trace info
data_addr = makeLineAddress(data_addr);
int block_size_bits = m_profiler->m_ruby_system->getBlockSizeBits();
data_addr = makeLineAddress(data_addr, block_size_bits);
lookupTraceForAddress(data_addr, m_dataAccessTrace).
update(type, access_mode, id, sharing_miss);

View File

@@ -95,7 +95,7 @@ machine(MachineType:SQC, "GPU SQC (L1 I Cache)")
}
TBETable TBEs, template="<SQC_TBE>", constructor="m_number_of_TBEs";
int TCC_select_low_bit, default="RubySystem::getBlockSizeBits()";
int TCC_select_low_bit, default="m_ruby_system->getBlockSizeBits()";
void set_cache_entry(AbstractCacheEntry b);
void unset_cache_entry();

View File

@@ -121,7 +121,7 @@ machine(MachineType:TCP, "GPU TCP (L1 Data Cache)")
}
TBETable TBEs, template="<TCP_TBE>", constructor="m_number_of_TBEs";
int TCC_select_low_bit, default="RubySystem::getBlockSizeBits()";
int TCC_select_low_bit, default="m_ruby_system->getBlockSizeBits()";
int WTcnt, default="0";
int Fcnt, default="0";
bool inFlush, default="false";

View File

@@ -167,7 +167,7 @@ machine(MachineType:L1Cache, "MESI Directory L1 Cache CMP")
TBETable TBEs, template="<L1Cache_TBE>", constructor="m_number_of_TBEs";
int l2_select_low_bit, default="RubySystem::getBlockSizeBits()";
int l2_select_low_bit, default="m_ruby_system->getBlockSizeBits()";
Tick clockEdge();
Cycles ticksToCycles(Tick t);

View File

@@ -167,7 +167,7 @@ machine(MachineType:L1Cache, "MESI Directory L1 Cache CMP")
TBETable TBEs, template="<L1Cache_TBE>", constructor="m_number_of_TBEs";
int l2_select_low_bit, default="RubySystem::getBlockSizeBits()";
int l2_select_low_bit, default="m_ruby_system->getBlockSizeBits()";
Tick clockEdge();
Cycles ticksToCycles(Tick t);

View File

@@ -181,7 +181,7 @@ machine(MachineType:RegionBuffer, "Region Buffer for AMD_Base-like protocol")
// Stores only region addresses
TBETable TBEs, template="<RegionBuffer_TBE>", constructor="m_number_of_TBEs";
int TCC_select_low_bit, default="RubySystem::getBlockSizeBits()";
int TCC_select_low_bit, default="m_ruby_system->getBlockSizeBits()";
Tick clockEdge();
Tick cyclesToTicks(Cycles c);
@@ -195,8 +195,8 @@ machine(MachineType:RegionBuffer, "Region Buffer for AMD_Base-like protocol")
Cycles curCycle();
MachineID mapAddressToMachine(Addr addr, MachineType mtype);
int blockBits, default="RubySystem::getBlockSizeBits()";
int blockBytes, default="RubySystem::getBlockSizeBytes()";
int blockBits, default="m_ruby_system->getBlockSizeBits()";
int blockBytes, default="m_ruby_system->getBlockSizeBytes()";
int regionBits, default="log2(m_blocksPerRegion)";
// Functions

View File

@@ -155,7 +155,7 @@ machine(MachineType:RegionDir, "Region Directory for AMD_Base-like protocol")
// Stores only region addresses
TBETable TBEs, template="<RegionDir_TBE>", constructor="m_number_of_TBEs";
int TCC_select_low_bit, default="RubySystem::getBlockSizeBits()";
int TCC_select_low_bit, default="m_ruby_system->getBlockSizeBits()";
Tick clockEdge();
Tick cyclesToTicks(Cycles c);
@@ -169,8 +169,8 @@ machine(MachineType:RegionDir, "Region Directory for AMD_Base-like protocol")
Cycles curCycle();
MachineID mapAddressToMachine(Addr addr, MachineType mtype);
int blockBits, default="RubySystem::getBlockSizeBits()";
int blockBytes, default="RubySystem::getBlockSizeBytes()";
int blockBits, default="m_ruby_system->getBlockSizeBits()";
int blockBytes, default="m_ruby_system->getBlockSizeBytes()";
int regionBits, default="log2(m_blocksPerRegion)";
// Functions

View File

@@ -183,7 +183,7 @@ machine(MachineType:Directory, "AMD Baseline protocol")
TBETable TBEs, template="<Directory_TBE>", constructor="m_number_of_TBEs";
int TCC_select_low_bit, default="RubySystem::getBlockSizeBits()";
int TCC_select_low_bit, default="m_ruby_system->getBlockSizeBits()";
Tick clockEdge();
Tick cyclesToTicks(Cycles c);

View File

@@ -192,7 +192,7 @@ machine(MachineType:Directory, "AMD Baseline protocol")
TBETable TBEs, template="<Directory_TBE>", constructor="m_number_of_TBEs";
int TCC_select_low_bit, default="RubySystem::getBlockSizeBits()";
int TCC_select_low_bit, default="m_ruby_system->getBlockSizeBits()";
Tick clockEdge();
Tick cyclesToTicks(Cycles c);

View File

@@ -143,7 +143,7 @@ machine(MachineType:Directory, "Directory protocol")
bool isPresent(Addr);
}
int blockSize, default="RubySystem::getBlockSizeBytes()";
int blockSize, default="m_ruby_system->getBlockSizeBytes()";
// ** OBJECTS **
TBETable TBEs, template="<Directory_TBE>", constructor="m_number_of_TBEs";

View File

@@ -198,7 +198,7 @@ machine(MachineType:L1Cache, "Token protocol")
TBETable L1_TBEs, template="<L1Cache_TBE>", constructor="m_number_of_TBEs";
bool starving, default="false";
int l2_select_low_bit, default="RubySystem::getBlockSizeBits()";
int l2_select_low_bit, default="m_ruby_system->getBlockSizeBits()";
PersistentTable persistentTable;
TimerTable useTimerTable;

View File

@@ -171,7 +171,7 @@ machine(MachineType:Directory, "Token protocol")
TBETable TBEs, template="<Directory_TBE>", constructor="m_number_of_TBEs";
bool starving, default="false";
int l2_select_low_bit, default="RubySystem::getBlockSizeBits()";
int l2_select_low_bit, default="m_ruby_system->getBlockSizeBits()";
Tick clockEdge();
Tick clockEdge(Cycles c);

View File

@@ -72,6 +72,8 @@ structure(WriteMask, external="yes", desc="...") {
int count();
int count(int);
bool test(int);
int getBlockSize();
void setBlockSize(int);
}
structure(DataBlock, external = "yes", desc="..."){

View File

@@ -89,7 +89,9 @@ structure(MemoryMsg, desc="...", interface="Message") {
if ((MessageSize == MessageSizeType:Response_Data) ||
(MessageSize == MessageSizeType:Writeback_Data)) {
WriteMask read_mask;
read_mask.setMask(addressOffset(addr, makeLineAddress(addr)), Len, true);
read_mask.setBlockSize(mask.getBlockSize());
read_mask.setMask(addressOffset(addr,
makeLineAddress(addr, mask.getBlockSize())), Len, true);
if (MessageSize != MessageSizeType:Writeback_Data) {
read_mask.setInvertedMask(mask);
}

View File

@@ -94,7 +94,7 @@ structure (Set, external = "yes", non_obj="yes") {
NodeID smallestElement();
}
structure (NetDest, external = "yes", non_obj="yes") {
structure (NetDest, external = "yes", non_obj="yes", implicit_ctor="m_ruby_system") {
void setSize(int);
void setSize(int, int);
void add(NodeID);

View File

@@ -52,6 +52,7 @@ Addr intToAddress(int addr);
int addressOffset(Addr addr, Addr base);
int max_tokens();
Addr makeLineAddress(Addr addr);
Addr makeLineAddress(Addr addr, int cacheLineBits);
int getOffset(Addr addr);
int mod(int val, int mod);
Addr bitSelect(Addr addr, int small, int big);

View File

@@ -574,7 +574,7 @@ machine(MachineType:Cache, "Cache coherency protocol") :
////////////////////////////////////////////////////////////////////////////
// Cache block size
int blockSize, default="RubySystem::getBlockSizeBytes()";
int blockSize, default="m_ruby_system->getBlockSizeBytes()";
// CacheEntry
structure(CacheEntry, interface="AbstractCacheEntry") {

View File

@@ -192,7 +192,7 @@ machine(MachineType:MiscNode, "CHI Misc Node for handling and distributing DVM op
////////////////////////////////////////////////////////////////////////////
// Cache block size
int blockSize, default="RubySystem::getBlockSizeBytes()";
int blockSize, default="m_ruby_system->getBlockSizeBytes()";
// Helper class for tracking expected response and data messages
structure(ExpectedMap, external ="yes") {

View File

@@ -157,7 +157,7 @@ machine(MachineType:Memory, "Memory controller interface") :
////////////////////////////////////////////////////////////////////////////
// Cache block size
int blockSize, default="RubySystem::getBlockSizeBytes()";
int blockSize, default="m_ruby_system->getBlockSizeBytes()";
// TBE fields
structure(TBE, desc="...") {

View File

@@ -59,6 +59,8 @@ namespace gem5
namespace ruby
{
class RubySystem;
class AbstractCacheEntry : public ReplaceableEntry
{
private:
@@ -78,16 +80,15 @@ class AbstractCacheEntry : public ReplaceableEntry
// The methods below are called by the Ruby runtime; add methods here only
// when absolutely necessary, and make them all virtual.
virtual DataBlock&
[[noreturn]] virtual DataBlock&
getDataBlk()
{
panic("getDataBlk() not implemented!");
// Dummy return to appease the compiler
static DataBlock b;
return b;
}
virtual void initBlockSize(int block_size) { };
virtual void setRubySystem(RubySystem *rs) { };
int validBlocks;
virtual int& getNumValidBlocks()
{

View File

@@ -89,6 +89,9 @@ AbstractController::init()
getMemReqQueue()->setConsumer(this);
}
downstreamDestinations.setRubySystem(m_ruby_system);
upstreamDestinations.setRubySystem(m_ruby_system);
// Initialize the addr->downstream machine mappings. Multiple machines
// in downstream_destinations can have the same address range if they have
// different types. If this is the case, mapAddressToDownstreamMachine
@@ -268,7 +271,7 @@ AbstractController::serviceMemoryQueue()
}
const MemoryMsg *mem_msg = (const MemoryMsg*)mem_queue->peek();
unsigned int req_size = RubySystem::getBlockSizeBytes();
unsigned int req_size = m_ruby_system->getBlockSizeBytes();
if (mem_msg->m_Len > 0) {
req_size = mem_msg->m_Len;
}
@@ -294,7 +297,7 @@ AbstractController::serviceMemoryQueue()
SenderState *s = new SenderState(mem_msg->m_Sender);
pkt->pushSenderState(s);
if (RubySystem::getWarmupEnabled()) {
if (m_ruby_system->getWarmupEnabled()) {
// Use functional rather than timing accesses during warmup
mem_queue->dequeue(clockEdge());
memoryPort.sendFunctional(pkt);
@@ -382,7 +385,10 @@ AbstractController::recvTimingResp(PacketPtr pkt)
return false;
}
std::shared_ptr<MemoryMsg> msg = std::make_shared<MemoryMsg>(clockEdge());
int blk_size = m_ruby_system->getBlockSizeBytes();
std::shared_ptr<MemoryMsg> msg =
std::make_shared<MemoryMsg>(clockEdge(), blk_size, m_ruby_system);
(*msg).m_addr = pkt->getAddr();
(*msg).m_Sender = m_machineID;
@@ -396,7 +402,7 @@ AbstractController::recvTimingResp(PacketPtr pkt)
// Copy data from the packet
(*msg).m_DataBlk.setData(pkt->getPtr<uint8_t>(), 0,
RubySystem::getBlockSizeBytes());
m_ruby_system->getBlockSizeBytes());
} else if (pkt->isWrite()) {
(*msg).m_Type = MemoryRequestType_MEMORY_WB;
(*msg).m_MessageSize = MessageSizeType_Writeback_Control;
@@ -404,7 +410,8 @@ AbstractController::recvTimingResp(PacketPtr pkt)
panic("Incorrect packet type received from memory controller!");
}
memRspQueue->enqueue(msg, clockEdge(), cyclesToTicks(Cycles(1)));
memRspQueue->enqueue(msg, clockEdge(), cyclesToTicks(Cycles(1)),
m_ruby_system->getRandomization(), m_ruby_system->getWarmupEnabled());
delete pkt;
return true;
}
@@ -471,6 +478,45 @@ AbstractController::sendRetryRespToMem() {
}
}
Addr
AbstractController::getOffset(Addr addr) const
{
return ruby::getOffset(addr, m_ruby_system->getBlockSizeBits());
}
Addr
AbstractController::makeLineAddress(Addr addr) const
{
return ruby::makeLineAddress(addr, m_ruby_system->getBlockSizeBits());
}
std::string
AbstractController::printAddress(Addr addr) const
{
return ruby::printAddress(addr, m_ruby_system->getBlockSizeBits());
}
NetDest
AbstractController::broadcast(MachineType type)
{
assert(m_ruby_system != nullptr);
NodeID type_count = m_ruby_system->MachineType_base_count(type);
NetDest dest;
for (NodeID i = 0; i < type_count; i++) {
MachineID mach = {type, i};
dest.add(mach);
}
return dest;
}
int
AbstractController::machineCount(MachineType machType)
{
assert(m_ruby_system != nullptr);
return m_ruby_system->MachineType_base_count(machType);
}
bool
AbstractController::MemoryPort::recvTimingResp(PacketPtr pkt)
{

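The broadcast() and machineCount() helpers added to AbstractController above replace the global `MachineType_base_count()` lookups formerly in RubySlicc_ComponentMapping.hh: the controller asks its own RubySystem for the per-type machine count. A standalone sketch of that delegation, using simplified stand-ins for the gem5 types (RubySystemSketch and the two-machine enum are hypothetical):

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>

using NodeID = uint32_t;
enum class MachineType { L1Cache, Directory };

struct RubySystemSketch
{
    // Stand-in for the per-RubySystem MachineType_base_count().
    NodeID MachineType_base_count(MachineType t) const
    {
        return t == MachineType::L1Cache ? 4 : 1;
    }
};

using MachineID = std::pair<MachineType, NodeID>;

// Build a destination set with one entry per machine of the given type,
// mirroring AbstractController::broadcast() in the diff.
std::set<MachineID>
broadcast(const RubySystemSketch *rs, MachineType type)
{
    assert(rs != nullptr);
    std::set<MachineID> dest;
    for (NodeID i = 0; i < rs->MachineType_base_count(type); i++)
        dest.insert({type, i});
    return dest;
}
```

Because the count now comes through the instance pointer, two RubySystems in one simulation can disagree about how many machines of a type exist, which is part of what #1349 requires.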
View File

@@ -72,6 +72,7 @@ namespace ruby
class Network;
class GPUCoalescer;
class DMASequencer;
class RubySystem;
// used to communicate that an in_port peeked the wrong message type
class RejectException: public std::exception
@@ -229,6 +230,11 @@ class AbstractController : public ClockedObject, public Consumer
/** List of upstream destinations (towards the CPU) */
const NetDest& allUpstreamDest() const { return upstreamDestinations; }
// Helper methods for commonly used functions called in common/address.hh
Addr getOffset(Addr addr) const;
Addr makeLineAddress(Addr addr) const;
std::string printAddress(Addr addr) const;
protected:
//! Profiles original cache requests including PUTs
void profileRequest(const std::string &request);
@@ -452,6 +458,13 @@ class AbstractController : public ClockedObject, public Consumer
{}
};
RubySystem *m_ruby_system = nullptr;
// Formerly in RubySlicc_ComponentMapping.hh. Moved here to access
// RubySystem pointer.
NetDest broadcast(MachineType type);
int machineCount(MachineType machType);
private:
/** The address range to which the controller responds on the CPU side. */
const AddrRangeList addrRanges;

View File

@@ -62,10 +62,12 @@ typedef std::shared_ptr<Message> MsgPtr;
class Message
{
public:
Message(Tick curTime)
: m_time(curTime),
Message(Tick curTime, int block_size, const RubySystem *rs)
: m_block_size(block_size),
m_time(curTime),
m_LastEnqueueTime(curTime),
m_DelayedTicks(0), m_msg_counter(0)
m_DelayedTicks(0), m_msg_counter(0),
p_ruby_system(rs)
{ }
Message(const Message &other) = default;
@@ -121,6 +123,9 @@ class Message
int getVnet() const { return vnet; }
void setVnet(int net) { vnet = net; }
protected:
int m_block_size = 0;
private:
Tick m_time;
Tick m_LastEnqueueTime; // my last enqueue time
@@ -130,6 +135,9 @@ class Message
// Variables for required network traversal
int incoming_link;
int vnet;
// Needed to call MachineType_base_count/level
const RubySystem *p_ruby_system = nullptr;
};
inline bool

View File

@@ -86,11 +86,12 @@ class RubyRequest : public Message
bool m_isSLCSet;
bool m_isSecure;
RubyRequest(Tick curTime, uint64_t _paddr, int _len,
RubyRequest(Tick curTime, int block_size, RubySystem *rs,
uint64_t _paddr, int _len,
uint64_t _pc, RubyRequestType _type, RubyAccessMode _access_mode,
PacketPtr _pkt, PrefetchBit _pb = PrefetchBit_No,
ContextID _proc_id = 100, ContextID _core_id = 99)
: Message(curTime),
: Message(curTime, block_size, rs),
m_PhysicalAddress(_paddr),
m_Type(_type),
m_ProgramCounter(_pc),
@@ -99,13 +100,16 @@ class RubyRequest : public Message
m_Prefetch(_pb),
m_pkt(_pkt),
m_contextId(_core_id),
m_writeMask(block_size),
m_WTData(block_size),
m_htmFromTransaction(false),
m_htmTransactionUid(0),
m_isTlbi(false),
m_tlbiTransactionUid(0),
m_isSecure(m_pkt ? m_pkt->req->isSecure() : false)
{
m_LineAddress = makeLineAddress(m_PhysicalAddress);
int block_size_bits = floorLog2(block_size);
m_LineAddress = makeLineAddress(m_PhysicalAddress, block_size_bits);
if (_pkt) {
m_isGLCSet = m_pkt->req->isGLCSet();
m_isSLCSet = m_pkt->req->isSLCSet();
@@ -116,10 +120,10 @@ class RubyRequest : public Message
}
/** RubyRequest for memory management commands */
RubyRequest(Tick curTime,
RubyRequest(Tick curTime, int block_size, RubySystem *rs,
uint64_t _pc, RubyRequestType _type, RubyAccessMode _access_mode,
PacketPtr _pkt, ContextID _proc_id, ContextID _core_id)
: Message(curTime),
: Message(curTime, block_size, rs),
m_PhysicalAddress(0),
m_Type(_type),
m_ProgramCounter(_pc),
@@ -128,6 +132,8 @@ class RubyRequest : public Message
m_Prefetch(PrefetchBit_No),
m_pkt(_pkt),
m_contextId(_core_id),
m_writeMask(block_size),
m_WTData(block_size),
m_htmFromTransaction(false),
m_htmTransactionUid(0),
m_isTlbi(false),
@@ -144,14 +150,14 @@ class RubyRequest : public Message
}
}
RubyRequest(Tick curTime, uint64_t _paddr, int _len,
uint64_t _pc, RubyRequestType _type,
RubyRequest(Tick curTime, int block_size, RubySystem *rs,
uint64_t _paddr, int _len, uint64_t _pc, RubyRequestType _type,
RubyAccessMode _access_mode, PacketPtr _pkt, PrefetchBit _pb,
unsigned _proc_id, unsigned _core_id,
int _wm_size, std::vector<bool> & _wm_mask,
DataBlock & _Data,
uint64_t _instSeqNum = 0)
: Message(curTime),
: Message(curTime, block_size, rs),
m_PhysicalAddress(_paddr),
m_Type(_type),
m_ProgramCounter(_pc),
@@ -170,7 +176,8 @@ class RubyRequest : public Message
m_tlbiTransactionUid(0),
m_isSecure(m_pkt->req->isSecure())
{
m_LineAddress = makeLineAddress(m_PhysicalAddress);
int block_size_bits = floorLog2(block_size);
m_LineAddress = makeLineAddress(m_PhysicalAddress, block_size_bits);
if (_pkt) {
m_isGLCSet = m_pkt->req->isGLCSet();
m_isSLCSet = m_pkt->req->isSLCSet();
@@ -180,15 +187,15 @@ class RubyRequest : public Message
}
}
RubyRequest(Tick curTime, uint64_t _paddr, int _len,
uint64_t _pc, RubyRequestType _type,
RubyRequest(Tick curTime, int block_size, RubySystem *rs,
uint64_t _paddr, int _len, uint64_t _pc, RubyRequestType _type,
RubyAccessMode _access_mode, PacketPtr _pkt, PrefetchBit _pb,
unsigned _proc_id, unsigned _core_id,
int _wm_size, std::vector<bool> & _wm_mask,
DataBlock & _Data,
std::vector< std::pair<int,AtomicOpFunctor*> > _atomicOps,
uint64_t _instSeqNum = 0)
: Message(curTime),
: Message(curTime, block_size, rs),
m_PhysicalAddress(_paddr),
m_Type(_type),
m_ProgramCounter(_pc),
@@ -207,7 +214,8 @@ class RubyRequest : public Message
m_tlbiTransactionUid(0),
m_isSecure(m_pkt->req->isSecure())
{
m_LineAddress = makeLineAddress(m_PhysicalAddress);
int block_size_bits = floorLog2(block_size);
m_LineAddress = makeLineAddress(m_PhysicalAddress, block_size_bits);
if (_pkt) {
m_isGLCSet = m_pkt->req->isGLCSet();
m_isSLCSet = m_pkt->req->isSLCSet();
@@ -218,7 +226,12 @@ class RubyRequest : public Message
}
}
RubyRequest(Tick curTime) : Message(curTime) {}
RubyRequest(Tick curTime, int block_size, RubySystem *rs)
: Message(curTime, block_size, rs),
m_writeMask(block_size),
m_WTData(block_size)
{
}
MsgPtr clone() const
{ return std::shared_ptr<Message>(new RubyRequest(*this)); }

View File
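With `getBlockSizeBits` no longer static, `makeLineAddress`/`getOffset` take the bit count explicitly, as the `RubyRequest` constructors above do via `floorLog2(block_size)`. The arithmetic is just masking; a self-contained sketch (`lowMask` is a local stand-in for gem5's `mask()` helper):

```cpp
#include <cassert>
#include <cstdint>

using Addr = uint64_t;

// Mask covering the low `bits` bits (local stand-in for gem5's mask()).
inline Addr lowMask(int bits) { return (Addr{1} << bits) - 1; }

// Line address: clear the block-offset bits.
inline Addr makeLineAddress(Addr addr, int block_size_bits)
{
    return addr & ~lowMask(block_size_bits);
}

// Offset within the block: keep only the block-offset bits.
inline Addr getOffset(Addr addr, int block_size_bits)
{
    return addr & lowMask(block_size_bits);
}
```

For a 64-byte block (6 offset bits), `makeLineAddress(0x1234, 6)` yields `0x1200` and `getOffset(0x1234, 6)` yields `0x34`; the two always recombine to the original address.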

@@ -41,17 +41,6 @@ namespace gem5
namespace ruby
{
inline NetDest
broadcast(MachineType type)
{
NetDest dest;
for (NodeID i = 0; i < MachineType_base_count(type); i++) {
MachineID mach = {type, i};
dest.add(mach);
}
return dest;
}
inline MachineID
mapAddressToRange(Addr addr, MachineType type, int low_bit,
int num_bits, int cluster_id = 0)
@@ -77,12 +66,6 @@ machineIDToMachineType(MachineID machID)
return machID.type;
}
inline int
machineCount(MachineType machType)
{
return MachineType_base_count(machType);
}
inline MachineID
createMachineID(MachineType type, NodeID id)
{

View File

@@ -233,8 +233,9 @@ addressOffset(Addr addr, Addr base)
inline bool
testAndRead(Addr addr, DataBlock& blk, Packet *pkt)
{
Addr pktLineAddr = makeLineAddress(pkt->getAddr());
Addr lineAddr = makeLineAddress(addr);
int block_size_bits = floorLog2(blk.getBlockSize());
Addr pktLineAddr = makeLineAddress(pkt->getAddr(), block_size_bits);
Addr lineAddr = makeLineAddress(addr, block_size_bits);
if (pktLineAddr == lineAddr) {
uint8_t *data = pkt->getPtr<uint8_t>();
@@ -259,8 +260,10 @@ testAndRead(Addr addr, DataBlock& blk, Packet *pkt)
inline bool
testAndReadMask(Addr addr, DataBlock& blk, WriteMask& mask, Packet *pkt)
{
Addr pktLineAddr = makeLineAddress(pkt->getAddr());
Addr lineAddr = makeLineAddress(addr);
assert(blk.getBlockSize() == mask.getBlockSize());
int block_size_bits = floorLog2(blk.getBlockSize());
Addr pktLineAddr = makeLineAddress(pkt->getAddr(), block_size_bits);
Addr lineAddr = makeLineAddress(addr, block_size_bits);
if (pktLineAddr == lineAddr) {
uint8_t *data = pkt->getPtr<uint8_t>();
@@ -288,8 +291,9 @@ testAndReadMask(Addr addr, DataBlock& blk, WriteMask& mask, Packet *pkt)
inline bool
testAndWrite(Addr addr, DataBlock& blk, Packet *pkt)
{
Addr pktLineAddr = makeLineAddress(pkt->getAddr());
Addr lineAddr = makeLineAddress(addr);
int block_size_bits = floorLog2(blk.getBlockSize());
Addr pktLineAddr = makeLineAddress(pkt->getAddr(), block_size_bits);
Addr lineAddr = makeLineAddress(addr, block_size_bits);
if (pktLineAddr == lineAddr) {
const uint8_t *data = pkt->getConstPtr<uint8_t>();

View File
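`testAndRead` and friends now derive the mask width from the `DataBlock` itself rather than from `RubySystem`. The core check-and-copy logic, sketched with a 64-byte block assumed and a `std::vector` standing in for `DataBlock` (`testAndReadSketch` and the demo helpers are hypothetical names):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Addr = uint64_t;

// If the packet and block refer to the same cache line, copy the
// requested bytes out of the block; otherwise report a miss.
inline bool testAndReadSketch(Addr blk_addr, const std::vector<uint8_t> &blk,
                              Addr pkt_addr, uint8_t *data, int len)
{
    const int block_size_bits = 6;              // assume a 64 B block
    const Addr mask = (Addr{1} << block_size_bits) - 1;
    if ((blk_addr & ~mask) != (pkt_addr & ~mask))
        return false;                           // different lines: no data
    const Addr offset = pkt_addr & mask;
    for (int i = 0; i < len; i++)
        data[i] = blk[offset + i];
    return true;
}

// Demo: read one byte back from offset 4 of a matching line.
inline uint8_t readBackDemo()
{
    std::vector<uint8_t> blk(64, 0);
    blk[4] = 42;
    uint8_t byte = 0;
    const bool hit = testAndReadSketch(0x100, blk, 0x104, &byte, 1);
    return hit ? byte : 0;
}

// Demo: a packet on a different line never touches the block.
inline bool missDemo()
{
    std::vector<uint8_t> blk(64, 0);
    uint8_t byte = 0;
    return testAndReadSketch(0x100, blk, 0x240, &byte, 1);
}
```

The real `testAndReadMask` additionally asserts the `DataBlock` and `WriteMask` agree on block size before comparing line addresses, which is why both classes now carry it.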

@@ -57,10 +57,10 @@ namespace ruby
* - The same line has been accessed in the past accessLatency ticks
*/
ALUFreeListArray::ALUFreeListArray(unsigned int num_ALUs, Tick access_latency)
ALUFreeListArray::ALUFreeListArray(unsigned int num_ALUs, Cycles access_clocks)
{
this->numALUs = num_ALUs;
this->accessLatency = access_latency;
this->accessClocks = access_clocks;
}
bool ALUFreeListArray::tryAccess(Addr addr)
@@ -85,7 +85,7 @@ bool ALUFreeListArray::tryAccess(Addr addr)
}
// Block access if the line is already being used
if (record.lineAddr == makeLineAddress(addr)) {
if (record.lineAddr == makeLineAddress(addr, m_block_size_bits)) {
return false;
}
}
@@ -99,7 +99,9 @@ void ALUFreeListArray::reserve(Addr addr)
// the access is valid
// Add record to queue
accessQueue.push_front(AccessRecord(makeLineAddress(addr), curTick()));
accessQueue.push_front(
AccessRecord(makeLineAddress(addr, m_block_size_bits), curTick())
);
}
} // namespace ruby

View File

@@ -32,6 +32,7 @@
#include <deque>
#include "base/intmath.hh"
#include "mem/ruby/common/TypeDefines.hh"
#include "sim/cur_tick.hh"
@@ -45,7 +46,8 @@ class ALUFreeListArray
{
private:
unsigned int numALUs;
Tick accessLatency;
Cycles accessClocks;
Tick accessLatency = 0;
class AccessRecord
{
@@ -62,14 +64,33 @@ class ALUFreeListArray
// Queue of accesses from past accessLatency cycles
std::deque<AccessRecord> accessQueue;
int m_block_size_bits = 0;
public:
ALUFreeListArray(unsigned int num_ALUs, Tick access_latency);
ALUFreeListArray(unsigned int num_ALUs, Cycles access_clocks);
bool tryAccess(Addr addr);
void reserve(Addr addr);
Tick getLatency() const { return accessLatency; }
Tick
getLatency() const
{
assert(accessLatency > 0);
return accessLatency;
}
void
setClockPeriod(Tick clockPeriod)
{
accessLatency = accessClocks * clockPeriod;
}
void
setBlockSize(int block_size)
{
m_block_size_bits = floorLog2(block_size);
}
};
} // namespace ruby

View File
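Because `ALUFreeListArray` no longer reaches into a static `RubySystem`, it stores the configured cycle count at construction and only converts it to ticks once `setClockPeriod()` is called with the owning system's period. A sketch of that deferred conversion (`LatencySketch` and `demoLatency` are made-up names):

```cpp
#include <cassert>
#include <cstdint>

using Tick = uint64_t;
using Cycles = uint64_t;

// Latency is held in cycles until the clock period is known, mirroring
// the accessClocks -> accessLatency split in ALUFreeListArray.
class LatencySketch
{
    Cycles accessClocks;
    Tick accessLatency = 0;
  public:
    explicit LatencySketch(Cycles clocks) : accessClocks(clocks) {}
    void setClockPeriod(Tick period) { accessLatency = accessClocks * period; }
    Tick
    getLatency() const
    {
        assert(accessLatency > 0);   // must be wired up before first use
        return accessLatency;
    }
};

// Demo of the two-step wiring.
inline Tick demoLatency(Cycles clocks, Tick period)
{
    LatencySketch s(clocks);
    s.setClockPeriod(period);
    return s.getLatency();
}
```

The `assert(accessLatency > 0)` in `getLatency()` catches the case where a structure is used before `CacheMemory::setRubySystem()` has propagated the clock period.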

@@ -42,8 +42,7 @@ namespace ruby
{
BankedArray::BankedArray(unsigned int banks, Cycles accessLatency,
unsigned int startIndexBit, RubySystem *rs)
: m_ruby_system(rs)
unsigned int startIndexBit)
{
this->banks = banks;
this->accessLatency = accessLatency;
@@ -78,6 +77,8 @@ BankedArray::reserve(int64_t idx)
if (accessLatency == 0)
return;
assert(clockPeriod > 0);
unsigned int bank = mapIndexToBank(idx);
assert(bank < banks);
@@ -95,7 +96,7 @@ BankedArray::reserve(int64_t idx)
busyBanks[bank].idx = idx;
busyBanks[bank].startAccess = curTick();
busyBanks[bank].endAccess = curTick() +
(accessLatency-1) * m_ruby_system->clockPeriod();
(accessLatency-1) * clockPeriod;
}
unsigned int

View File

@@ -48,6 +48,7 @@ class BankedArray
private:
unsigned int banks;
Cycles accessLatency;
Tick clockPeriod = 0;
unsigned int bankBits;
unsigned int startIndexBit;
RubySystem *m_ruby_system;
@@ -69,7 +70,7 @@ class BankedArray
public:
BankedArray(unsigned int banks, Cycles accessLatency,
unsigned int startIndexBit, RubySystem *rs);
unsigned int startIndexBit);
// Note: We try the access based on the cache index, not the address
// This is so we don't get aliasing on blocks being replaced
@@ -78,6 +79,8 @@ class BankedArray
void reserve(int64_t idx);
Cycles getLatency() const { return accessLatency; }
void setClockPeriod(Tick _clockPeriod) { clockPeriod = _clockPeriod; }
};
} // namespace ruby

View File

@@ -69,12 +69,9 @@ operator<<(std::ostream& out, const CacheMemory& obj)
CacheMemory::CacheMemory(const Params &p)
: SimObject(p),
dataArray(p.dataArrayBanks, p.dataAccessLatency,
p.start_index_bit, p.ruby_system),
tagArray(p.tagArrayBanks, p.tagAccessLatency,
p.start_index_bit, p.ruby_system),
atomicALUArray(p.atomicALUs, p.atomicLatency *
p.ruby_system->clockPeriod()),
dataArray(p.dataArrayBanks, p.dataAccessLatency, p.start_index_bit),
tagArray(p.tagArrayBanks, p.tagAccessLatency, p.start_index_bit),
atomicALUArray(p.atomicALUs, p.atomicLatency),
cacheMemoryStats(this)
{
m_cache_size = p.size;
@@ -88,12 +85,25 @@ CacheMemory::CacheMemory(const Params &p)
m_replacementPolicy_ptr) ? true : false;
}
void
CacheMemory::setRubySystem(RubySystem* rs)
{
dataArray.setClockPeriod(rs->clockPeriod());
tagArray.setClockPeriod(rs->clockPeriod());
atomicALUArray.setClockPeriod(rs->clockPeriod());
atomicALUArray.setBlockSize(rs->getBlockSizeBytes());
if (m_block_size == 0) {
m_block_size = rs->getBlockSizeBytes();
}
m_ruby_system = rs;
}
void
CacheMemory::init()
{
if (m_block_size == 0) {
m_block_size = RubySystem::getBlockSizeBytes();
}
assert(m_block_size != 0);
m_cache_num_sets = (m_cache_size / m_cache_assoc) / m_block_size;
assert(m_cache_num_sets > 1);
m_cache_num_set_bits = floorLog2(m_cache_num_sets);
@@ -286,6 +296,9 @@ CacheMemory::allocate(Addr address, AbstractCacheEntry *entry)
assert(cacheAvail(address));
DPRINTF(RubyCache, "allocating address: %#x\n", address);
entry->initBlockSize(m_block_size);
entry->setRubySystem(m_ruby_system);
// Find the first open slot
int64_t cacheSet = addressToCacheSet(address);
std::vector<AbstractCacheEntry*> &set = m_cache[cacheSet];

View File

@@ -154,6 +154,8 @@ class CacheMemory : public SimObject
void htmAbortTransaction();
void htmCommitTransaction();
void setRubySystem(RubySystem* rs);
public:
int getCacheSize() const { return m_cache_size; }
int getCacheAssoc() const { return m_cache_assoc; }
@@ -213,6 +215,14 @@ class CacheMemory : public SimObject
*/
bool m_use_occupancy;
RubySystem *m_ruby_system = nullptr;
Addr
makeLineAddress(Addr addr) const
{
return ruby::makeLineAddress(addr, floorLog2(m_block_size));
}
private:
struct CacheMemoryStats : public statistics::Group
{

View File

@@ -64,12 +64,14 @@ DirectoryMemory::DirectoryMemory(const Params &p)
}
m_size_bits = floorLog2(m_size_bytes);
m_num_entries = 0;
m_block_size = p.block_size;
m_ruby_system = p.ruby_system;
}
void
DirectoryMemory::init()
{
m_num_entries = m_size_bytes / RubySystem::getBlockSizeBytes();
m_num_entries = m_size_bytes / m_block_size;
m_entries = new AbstractCacheEntry*[m_num_entries];
for (int i = 0; i < m_num_entries; i++)
m_entries[i] = NULL;
@@ -108,7 +110,7 @@ DirectoryMemory::mapAddressToLocalIdx(Addr address)
}
ret += r.size();
}
return ret >> RubySystem::getBlockSizeBits();
return ret >> (floorLog2(m_block_size));
}
AbstractCacheEntry*
@@ -133,6 +135,8 @@ DirectoryMemory::allocate(Addr address, AbstractCacheEntry *entry)
assert(idx < m_num_entries);
assert(m_entries[idx] == NULL);
entry->changePermission(AccessPermission_Read_Only);
entry->initBlockSize(m_block_size);
entry->setRubySystem(m_ruby_system);
m_entries[idx] = entry;
return entry;

View File
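`DirectoryMemory` now sizes itself from its own `block_size` parameter: the entry count is total bytes divided by block size, and the local index shifts out the offset bits. Sketched below (`floorLog2Sketch` is a local stand-in for gem5's `floorLog2`):

```cpp
#include <cassert>
#include <cstdint>

using Addr = uint64_t;

// Position of the highest set bit (stand-in for gem5's floorLog2).
inline int floorLog2Sketch(uint64_t x)
{
    int n = -1;
    while (x) { x >>= 1; n++; }
    return n;
}

// m_num_entries = m_size_bytes / m_block_size
inline uint64_t numDirEntries(uint64_t size_bytes, uint32_t block_size)
{
    return size_bytes / block_size;
}

// return ret >> floorLog2(m_block_size)
inline uint64_t dirIndex(Addr byte_offset, uint32_t block_size)
{
    return byte_offset >> floorLog2Sketch(block_size);
}
```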

@@ -104,6 +104,9 @@ class DirectoryMemory : public SimObject
uint64_t m_size_bytes;
uint64_t m_size_bits;
uint64_t m_num_entries;
uint32_t m_block_size;
RubySystem *m_ruby_system = nullptr;
/**
* The address range for which the directory responds. Normally

View File

@@ -49,3 +49,7 @@ class RubyDirectoryMemory(SimObject):
addr_ranges = VectorParam.AddrRange(
Parent.addr_ranges, "Address range this directory responds to"
)
block_size = Param.UInt32(
"Size of a block in bytes. Usually the same as the cache line size."
)
ruby_system = Param.RubySystem(Parent.any, "")

View File

@@ -74,6 +74,8 @@ class PerfectCacheMemory
public:
PerfectCacheMemory();
void setBlockSize(const int block_size) { m_block_size = block_size; }
// tests to see if an address is present in the cache
bool isTagPresent(Addr address) const;
@@ -108,6 +110,8 @@ class PerfectCacheMemory
// Data Members (m_prefix)
std::unordered_map<Addr, PerfectCacheLineState<ENTRY> > m_map;
int m_block_size = 0;
};
template<class ENTRY>
@@ -130,7 +134,7 @@ template<class ENTRY>
inline bool
PerfectCacheMemory<ENTRY>::isTagPresent(Addr address) const
{
return m_map.count(makeLineAddress(address)) > 0;
return m_map.count(makeLineAddress(address, floorLog2(m_block_size))) > 0;
}
template<class ENTRY>
@@ -149,7 +153,8 @@ PerfectCacheMemory<ENTRY>::allocate(Addr address)
PerfectCacheLineState<ENTRY> line_state;
line_state.m_permission = AccessPermission_Invalid;
line_state.m_entry = ENTRY();
m_map[makeLineAddress(address)] = line_state;
Addr line_addr = makeLineAddress(address, floorLog2(m_block_size));
m_map.emplace(line_addr, line_state);
}
// deallocate entry
@@ -157,7 +162,8 @@ template<class ENTRY>
inline void
PerfectCacheMemory<ENTRY>::deallocate(Addr address)
{
[[maybe_unused]] auto num_erased = m_map.erase(makeLineAddress(address));
Addr line_addr = makeLineAddress(address, floorLog2(m_block_size));
[[maybe_unused]] auto num_erased = m_map.erase(line_addr);
assert(num_erased == 1);
}
@@ -175,7 +181,8 @@ template<class ENTRY>
inline ENTRY*
PerfectCacheMemory<ENTRY>::lookup(Addr address)
{
return &m_map[makeLineAddress(address)].m_entry;
Addr line_addr = makeLineAddress(address, floorLog2(m_block_size));
return &m_map[line_addr].m_entry;
}
// looks an address up in the cache
@@ -183,14 +190,16 @@ template<class ENTRY>
inline const ENTRY*
PerfectCacheMemory<ENTRY>::lookup(Addr address) const
{
return &m_map[makeLineAddress(address)].m_entry;
Addr line_addr = makeLineAddress(address, floorLog2(m_block_size));
return &m_map[line_addr].m_entry;
}
template<class ENTRY>
inline AccessPermission
PerfectCacheMemory<ENTRY>::getPermission(Addr address) const
{
return m_map[makeLineAddress(address)].m_permission;
Addr line_addr = makeLineAddress(address, floorLog2(m_block_size));
return m_map[line_addr].m_permission;
}
template<class ENTRY>
@@ -198,8 +207,8 @@ inline void
PerfectCacheMemory<ENTRY>::changePermission(Addr address,
AccessPermission new_perm)
{
Addr line_address = makeLineAddress(address);
PerfectCacheLineState<ENTRY>& line_state = m_map[line_address];
Addr line_addr = makeLineAddress(address, floorLog2(m_block_size));
PerfectCacheLineState<ENTRY>& line_state = m_map[line_addr];
line_state.m_permission = new_perm;
}

View File

@@ -63,6 +63,12 @@ class PersistentTable
// Destructor
~PersistentTable();
void
setBlockSize(int block_size)
{
m_block_size_bits = floorLog2(block_size);
}
// Public Methods
void persistentRequestLock(Addr address, MachineID locker,
AccessType type);
@@ -82,9 +88,17 @@ class PersistentTable
PersistentTable(const PersistentTable& obj);
PersistentTable& operator=(const PersistentTable& obj);
int m_block_size_bits = 0;
// Data Members (m_prefix)
typedef std::unordered_map<Addr, PersistentTableEntry> AddressMap;
AddressMap m_map;
Addr
makeLineAddress(Addr addr) const
{
return ruby::makeLineAddress(addr, m_block_size_bits);
}
};
inline std::ostream&

View File

@@ -54,4 +54,3 @@ class RubyCache(SimObject):
dataAccessLatency = Param.Cycles(1, "cycles for a data array access")
tagAccessLatency = Param.Cycles(1, "cycles for a tag array access")
resourceStalls = Param.Bool(False, "stall if there is a resource failure")
ruby_system = Param.RubySystem(Parent.any, "")

View File

@@ -56,13 +56,15 @@ namespace ruby
RubyPrefetcher::RubyPrefetcher(const Params &p)
: SimObject(p), m_num_streams(p.num_streams),
m_array(p.num_streams), m_train_misses(p.train_misses),
m_array(p.num_streams, p.block_size), m_train_misses(p.train_misses),
m_num_startup_pfs(p.num_startup_pfs),
unitFilter(p.unit_filter),
negativeFilter(p.unit_filter),
nonUnitFilter(p.nonunit_filter),
m_prefetch_cross_pages(p.cross_page),
pageShift(p.page_shift),
m_block_size_bits(floorLog2(p.block_size)),
m_block_size_bytes(p.block_size),
rubyPrefetcherStats(this)
{
assert(m_num_streams > 0);
@@ -90,7 +92,7 @@ void
RubyPrefetcher::observeMiss(Addr address, const RubyRequestType& type)
{
DPRINTF(RubyPrefetcher, "Observed miss for %#x\n", address);
Addr line_addr = makeLineAddress(address);
Addr line_addr = makeLineAddress(address, m_block_size_bits);
rubyPrefetcherStats.numMissObserved++;
// check to see if we have already issued a prefetch for this block
@@ -214,7 +216,7 @@ RubyPrefetcher::initializeStream(Addr address, int stride,
// initialize the stream prefetcher
PrefetchEntry *mystream = &(m_array[index]);
mystream->m_address = makeLineAddress(address);
mystream->m_address = makeLineAddress(address, m_block_size_bits);
mystream->m_stride = stride;
mystream->m_use_time = m_controller->curCycle();
mystream->m_is_valid = true;
@@ -222,7 +224,7 @@ RubyPrefetcher::initializeStream(Addr address, int stride,
// create a number of initial prefetches for this stream
Addr page_addr = pageAddress(mystream->m_address);
Addr line_addr = makeLineAddress(mystream->m_address);
Addr line_addr = makeLineAddress(mystream->m_address, m_block_size_bits);
// insert a number of prefetches into the prefetch table
for (int k = 0; k < m_num_startup_pfs; k++) {
@@ -312,8 +314,7 @@ RubyPrefetcher::accessNonunitFilter(Addr line_addr,
// This stride HAS to be the multiplicative constant of
// dataBlockBytes (bc makeNextStrideAddress is
// calculated based on this multiplicative constant!)
const int stride = entry.stride /
RubySystem::getBlockSizeBytes();
const int stride = entry.stride / m_block_size_bytes;
// clear this filter entry
entry.clear();

View File

@@ -68,10 +68,10 @@ class PrefetchEntry
{
public:
/// constructor
PrefetchEntry()
PrefetchEntry(int block_size)
{
// default: 1 cache-line stride
m_stride = (1 << RubySystem::getBlockSizeBits());
m_stride = (1 << floorLog2(block_size));
m_use_time = Cycles(0);
m_is_valid = false;
}
@@ -239,6 +239,16 @@ class RubyPrefetcher : public SimObject
const unsigned pageShift;
int m_block_size_bits = 0;
int m_block_size_bytes = 0;
Addr
makeNextStrideAddress(Addr addr, int stride) const
{
return ruby::makeNextStrideAddress(addr, stride,
m_block_size_bytes);
}
struct RubyPrefetcherStats : public statistics::Group
{
RubyPrefetcherStats(statistics::Group *parent);

View File

@@ -62,6 +62,9 @@ class RubyPrefetcher(SimObject):
page_shift = Param.UInt32(
12, "Number of bits to mask to get a page number"
)
block_size = Param.UInt32(
"Size of a block to prefetch, usually the cache line size"
)
class Prefetcher(RubyPrefetcher):

View File

@@ -66,7 +66,7 @@ RubyPrefetcherProxy::RubyPrefetcherProxy(AbstractController* _parent,
prefetcher->setParentInfo(
cacheCntrl->params().system,
cacheCntrl->getProbeManager(),
RubySystem::getBlockSizeBytes());
cacheCntrl->m_ruby_system->getBlockSizeBytes());
}
}
@@ -112,7 +112,7 @@ RubyPrefetcherProxy::issuePrefetch()
if (pkt) {
DPRINTF(HWPrefetch, "Next prefetch ready %s\n", pkt->print());
unsigned blk_size = RubySystem::getBlockSizeBytes();
unsigned blk_size = cacheCntrl->m_ruby_system->getBlockSizeBytes();
Addr line_addr = pkt->getBlockAddr(blk_size);
if (issuedPfPkts.count(line_addr) == 0) {
@@ -126,6 +126,8 @@ RubyPrefetcherProxy::issuePrefetch()
std::shared_ptr<RubyRequest> msg =
std::make_shared<RubyRequest>(cacheCntrl->clockEdge(),
blk_size,
cacheCntrl->m_ruby_system,
pkt->getAddr(),
blk_size,
0, // pc
@@ -136,7 +138,10 @@ RubyPrefetcherProxy::issuePrefetch()
// enqueue request into prefetch queue to the cache
pfQueue->enqueue(msg, cacheCntrl->clockEdge(),
cacheCntrl->cyclesToTicks(Cycles(1)));
cacheCntrl->cyclesToTicks(Cycles(1)),
cacheCntrl->m_ruby_system->getRandomization(),
cacheCntrl->m_ruby_system->getWarmupEnabled()
);
// track all pending PF requests
issuedPfPkts[line_addr] = pkt;
@@ -230,5 +235,19 @@ RubyPrefetcherProxy::regProbePoints()
cacheCntrl->getProbeManager(), "Data Update");
}
Addr
RubyPrefetcherProxy::makeLineAddress(Addr addr) const
{
return ruby::makeLineAddress(addr,
cacheCntrl->m_ruby_system->getBlockSizeBits());
}
Addr
RubyPrefetcherProxy::getOffset(Addr addr) const
{
return ruby::getOffset(addr,
cacheCntrl->m_ruby_system->getBlockSizeBits());
}
} // namespace ruby
} // namespace gem5

View File

@@ -142,6 +142,9 @@ class RubyPrefetcherProxy : public CacheAccessor, public Named
*/
ProbePointArg<CacheDataUpdateProbeArg> *ppDataUpdate;
Addr makeLineAddress(Addr addr) const;
Addr getOffset(Addr addr) const;
public:
/** Accessor functions */

View File

@@ -70,6 +70,8 @@ class TBETable
return (m_number_of_TBEs - m_map.size()) >= n;
}
void setBlockSize(const int block_size) { m_block_size = block_size; }
ENTRY *getNullEntry();
ENTRY *lookup(Addr address);
@@ -85,7 +87,8 @@ class TBETable
std::unordered_map<Addr, ENTRY> m_map;
private:
int m_number_of_TBEs;
int m_number_of_TBEs = 0;
int m_block_size = 0;
};
template<class ENTRY>
@@ -101,7 +104,7 @@ template<class ENTRY>
inline bool
TBETable<ENTRY>::isPresent(Addr address) const
{
assert(address == makeLineAddress(address));
assert(address == makeLineAddress(address, floorLog2(m_block_size)));
assert(m_map.size() <= m_number_of_TBEs);
return !!m_map.count(address);
}
@@ -112,7 +115,8 @@ TBETable<ENTRY>::allocate(Addr address)
{
assert(!isPresent(address));
assert(m_map.size() < m_number_of_TBEs);
m_map[address] = ENTRY();
assert(m_block_size > 0);
m_map.emplace(address, ENTRY(m_block_size));
}
template<class ENTRY>

View File

@@ -70,7 +70,7 @@ TimerTable::nextAddress() const
void
TimerTable::set(Addr address, Tick ready_time)
{
assert(address == makeLineAddress(address));
assert(address == makeLineAddress(address, m_block_size_bits));
assert(!m_map.count(address));
m_map[address] = ready_time;
@@ -87,7 +87,7 @@ TimerTable::set(Addr address, Tick ready_time)
void
TimerTable::unset(Addr address)
{
assert(address == makeLineAddress(address));
assert(address == makeLineAddress(address, m_block_size_bits));
assert(m_map.count(address));
m_map.erase(address);

View File

@@ -48,6 +48,12 @@ class TimerTable
public:
TimerTable();
void
setBlockSize(int block_size)
{
m_block_size_bits = floorLog2(block_size);
}
void
setConsumer(Consumer* consumer_ptr)
{
@@ -88,6 +94,8 @@ class TimerTable
//! Consumer to signal a wakeup()
Consumer* m_consumer_ptr;
int m_block_size_bits = 0;
std::string m_name;
};

View File

@@ -36,7 +36,6 @@
#include "base/cprintf.hh"
#include "base/stl_helpers.hh"
#include "mem/ruby/system/RubySystem.hh"
namespace gem5
{
@@ -74,7 +73,8 @@ WireBuffer::~WireBuffer()
}
void
WireBuffer::enqueue(MsgPtr message, Tick current_time, Tick delta)
WireBuffer::enqueue(MsgPtr message, Tick current_time, Tick delta,
bool /*ruby_is_random*/, bool /*ruby_warmup*/)
{
m_msg_counter++;
Tick arrival_time = current_time + delta;

View File

@@ -78,7 +78,10 @@ class WireBuffer : public SimObject
void setDescription(const std::string& name) { m_description = name; };
std::string getDescription() { return m_description; };
void enqueue(MsgPtr message, Tick current_time, Tick delta);
// ruby_is_random and ruby_warmup are not used, but this method signature
// must match that of MessageBuffer.
void enqueue(MsgPtr message, Tick current_time, Tick delta,
bool ruby_is_random = false, bool ruby_warmup = false);
void dequeue(Tick current_time);
const Message* peek();
void recycle(Tick current_time, Tick recycle_latency);

View File

@@ -35,5 +35,3 @@ class RubyWireBuffer(SimObject):
type = "RubyWireBuffer"
cxx_class = "gem5::ruby::WireBuffer"
cxx_header = "mem/ruby/structures/WireBuffer.hh"
ruby_system = Param.RubySystem(Parent.any, "")

View File

@@ -49,31 +49,25 @@ TraceRecord::print(std::ostream& out) const
<< m_type << ", Time: " << m_time << "]";
}
CacheRecorder::CacheRecorder()
: m_uncompressed_trace(NULL),
m_uncompressed_trace_size(0),
m_block_size_bytes(RubySystem::getBlockSizeBytes())
{
}
CacheRecorder::CacheRecorder(uint8_t* uncompressed_trace,
uint64_t uncompressed_trace_size,
std::vector<RubyPort*>& ruby_port_map,
uint64_t block_size_bytes)
uint64_t trace_block_size_bytes,
uint64_t system_block_size_bytes)
: m_uncompressed_trace(uncompressed_trace),
m_uncompressed_trace_size(uncompressed_trace_size),
m_ruby_port_map(ruby_port_map), m_bytes_read(0),
m_records_read(0), m_records_flushed(0),
m_block_size_bytes(block_size_bytes)
m_block_size_bytes(trace_block_size_bytes)
{
if (m_uncompressed_trace != NULL) {
if (m_block_size_bytes < RubySystem::getBlockSizeBytes()) {
if (m_block_size_bytes < system_block_size_bytes) {
// Block sizes larger than when the trace was recorded are not
// supported, as we cannot reliably turn accesses to smaller blocks
// into larger ones.
panic("Recorded cache block size (%d) < current block size (%d) !!",
m_block_size_bytes, RubySystem::getBlockSizeBytes());
m_block_size_bytes, system_block_size_bytes);
}
}
}
@@ -125,7 +119,7 @@ CacheRecorder::enqueueNextFetchRequest()
DPRINTF(RubyCacheTrace, "Issuing %s\n", *traceRecord);
for (int rec_bytes_read = 0; rec_bytes_read < m_block_size_bytes;
rec_bytes_read += RubySystem::getBlockSizeBytes()) {
rec_bytes_read += m_block_size_bytes) {
RequestPtr req;
MemCmd::Command requestType;
@@ -133,19 +127,19 @@ CacheRecorder::enqueueNextFetchRequest()
requestType = MemCmd::ReadReq;
req = std::make_shared<Request>(
traceRecord->m_data_address + rec_bytes_read,
RubySystem::getBlockSizeBytes(), 0,
m_block_size_bytes, 0,
Request::funcRequestorId);
} else if (traceRecord->m_type == RubyRequestType_IFETCH) {
requestType = MemCmd::ReadReq;
req = std::make_shared<Request>(
traceRecord->m_data_address + rec_bytes_read,
RubySystem::getBlockSizeBytes(),
m_block_size_bytes,
Request::INST_FETCH, Request::funcRequestorId);
} else {
requestType = MemCmd::WriteReq;
req = std::make_shared<Request>(
traceRecord->m_data_address + rec_bytes_read,
RubySystem::getBlockSizeBytes(), 0,
m_block_size_bytes, 0,
Request::funcRequestorId);
}

View File
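The recorder's compatibility rule from the hunks above: a trace recorded with a block size smaller than the current system's cannot be replayed, because accesses to smaller blocks cannot reliably be merged into larger ones. As a predicate (a `bool` stands in for the `panic()` in the real constructor):

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the CacheRecorder check: replay is allowed only when the
// trace's block size is at least the current system's block size.
inline bool traceBlockSizeCompatible(uint64_t trace_block_size,
                                     uint64_t system_block_size)
{
    return trace_block_size >= system_block_size;
}
```

This is why the constructor now takes both `trace_block_size_bytes` and `system_block_size_bytes` explicitly instead of asking the static `RubySystem::getBlockSizeBytes()`.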

@@ -73,13 +73,15 @@ class TraceRecord
class CacheRecorder
{
public:
CacheRecorder();
~CacheRecorder();
// Construction requires block size.
CacheRecorder() = delete;
CacheRecorder(uint8_t* uncompressed_trace,
uint64_t uncompressed_trace_size,
std::vector<RubyPort*>& ruby_port_map,
uint64_t block_size_bytes);
uint64_t trace_block_size_bytes,
uint64_t system_block_size_bytes);
~CacheRecorder();
void addRecord(int cntrl, Addr data_addr, Addr pc_addr,
RubyRequestType type, Tick time, DataBlock& data);

View File

@@ -73,7 +73,7 @@ void
DMASequencer::init()
{
RubyPort::init();
m_data_block_mask = mask(RubySystem::getBlockSizeBits());
m_data_block_mask = mask(m_ruby_system->getBlockSizeBits());
}
RequestStatus
@@ -110,8 +110,10 @@ DMASequencer::makeRequest(PacketPtr pkt)
DPRINTF(RubyDma, "DMA req created: addr %p, len %d\n", line_addr, len);
int blk_size = m_ruby_system->getBlockSizeBytes();
std::shared_ptr<SequencerMsg> msg =
std::make_shared<SequencerMsg>(clockEdge());
std::make_shared<SequencerMsg>(clockEdge(), blk_size, m_ruby_system);
msg->getPhysicalAddress() = paddr;
msg->getLineAddress() = line_addr;
@@ -145,8 +147,8 @@ DMASequencer::makeRequest(PacketPtr pkt)
int offset = paddr & m_data_block_mask;
msg->getLen() = (offset + len) <= RubySystem::getBlockSizeBytes() ?
len : RubySystem::getBlockSizeBytes() - offset;
msg->getLen() = (offset + len) <= m_ruby_system->getBlockSizeBytes() ?
len : m_ruby_system->getBlockSizeBytes() - offset;
if (write && (data != NULL)) {
if (active_request.data != NULL) {
@@ -157,7 +159,8 @@ DMASequencer::makeRequest(PacketPtr pkt)
m_outstanding_count++;
assert(m_mandatory_q_ptr != NULL);
m_mandatory_q_ptr->enqueue(msg, clockEdge(), cyclesToTicks(Cycles(1)));
m_mandatory_q_ptr->enqueue(msg, clockEdge(), cyclesToTicks(Cycles(1)),
m_ruby_system->getRandomization(), m_ruby_system->getWarmupEnabled());
active_request.bytes_issued += msg->getLen();
return RequestStatus_Issued;
@@ -183,8 +186,10 @@ DMASequencer::issueNext(const Addr& address)
return;
}
int blk_size = m_ruby_system->getBlockSizeBytes();
std::shared_ptr<SequencerMsg> msg =
std::make_shared<SequencerMsg>(clockEdge());
std::make_shared<SequencerMsg>(clockEdge(), blk_size, m_ruby_system);
msg->getPhysicalAddress() = active_request.start_paddr +
active_request.bytes_completed;
@@ -196,9 +201,9 @@ DMASequencer::issueNext(const Addr& address)
msg->getLen() =
(active_request.len -
active_request.bytes_completed < RubySystem::getBlockSizeBytes() ?
active_request.bytes_completed < m_ruby_system->getBlockSizeBytes() ?
active_request.len - active_request.bytes_completed :
RubySystem::getBlockSizeBytes());
m_ruby_system->getBlockSizeBytes());
if (active_request.write) {
msg->getDataBlk().
@@ -207,7 +212,8 @@ DMASequencer::issueNext(const Addr& address)
}
assert(m_mandatory_q_ptr != NULL);
m_mandatory_q_ptr->enqueue(msg, clockEdge(), cyclesToTicks(Cycles(1)));
m_mandatory_q_ptr->enqueue(msg, clockEdge(), cyclesToTicks(Cycles(1)),
m_ruby_system->getRandomization(), m_ruby_system->getWarmupEnabled());
active_request.bytes_issued += msg->getLen();
DPRINTF(RubyDma,
"DMA request bytes issued %d, bytes completed %d, total len %d\n",

View File
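The `DMASequencer` length computation above clamps each chunk so a request never crosses a cache-line boundary: the first chunk gets whatever fits in the remainder of its line. A sketch of that first-chunk calculation (`dmaChunkLen` is a hypothetical name):

```cpp
#include <cassert>
#include <cstdint>

using Addr = uint64_t;

// msg->getLen() = (offset + len) <= block_size ? len : block_size - offset
inline int dmaChunkLen(Addr paddr, int len, int block_size)
{
    const int offset = static_cast<int>(paddr & Addr(block_size - 1));
    return (offset + len) <= block_size ? len : block_size - offset;
}
```

For example, a 64-byte request starting 48 bytes into a 64-byte line is clamped to 16 bytes; `issueNext()` then issues the remaining bytes line by line.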

@@ -142,8 +142,8 @@ UncoalescedTable::updateResources()
// are accessed directly using the makeRequest() command
// instead of accessing through the port. This makes
// sending tokens through the port unnecessary
-if (!RubySystem::getWarmupEnabled()
-&& !RubySystem::getCooldownEnabled()) {
+if (!coalescer->getRubySystem()->getWarmupEnabled() &&
+!coalescer->getRubySystem()->getCooldownEnabled()) {
if (reqTypeMap[seq_num] != RubyRequestType_FLUSH) {
DPRINTF(GPUCoalescer,
"Returning token seqNum %d\n", seq_num);
@@ -177,7 +177,7 @@ UncoalescedTable::printRequestTable(std::stringstream& ss)
ss << "Listing pending packets from " << instMap.size() << " instructions";
for (auto& inst : instMap) {
-ss << "\tAddr: " << printAddress(inst.first) << " with "
+ss << "\tAddr: " << coalescer->printAddress(inst.first) << " with "
<< inst.second.size() << " pending packets" << std::endl;
}
}
@@ -590,7 +590,7 @@ GPUCoalescer::hitCallback(CoalescedRequest* crequest,
// When the Ruby system is cooldown phase, the requests come from
// the cache recorder. These requests do not get coalesced and
// do not return valid data.
-if (RubySystem::getCooldownEnabled())
+if (m_ruby_system->getCooldownEnabled())
continue;
if (pkt->getPtr<uint8_t>()) {
@@ -700,8 +700,8 @@ GPUCoalescer::makeRequest(PacketPtr pkt)
// When Ruby is in warmup or cooldown phase, the requests come from
// the cache recorder. There is no dynamic instruction associated
// with these requests either
-if (!RubySystem::getWarmupEnabled()
-&& !RubySystem::getCooldownEnabled()) {
+if (!m_ruby_system->getWarmupEnabled()
+&& !m_ruby_system->getCooldownEnabled()) {
if (!m_usingRubyTester) {
num_packets = 0;
for (int i = 0; i < TheGpuISA::NumVecElemPerVecReg; i++) {
@@ -985,8 +985,8 @@ GPUCoalescer::completeHitCallback(std::vector<PacketPtr> & mylist)
// When Ruby is in warmup or cooldown phase, the requests come
// from the cache recorder. They do not track which port to use
// and do not need to send the response back
-if (!RubySystem::getWarmupEnabled()
-&& !RubySystem::getCooldownEnabled()) {
+if (!m_ruby_system->getWarmupEnabled()
+&& !m_ruby_system->getCooldownEnabled()) {
RubyPort::SenderState *ss =
safe_cast<RubyPort::SenderState *>(pkt->senderState);
MemResponsePort *port = ss->port;
@@ -1015,9 +1015,9 @@ GPUCoalescer::completeHitCallback(std::vector<PacketPtr> & mylist)
}
RubySystem *rs = m_ruby_system;
-if (RubySystem::getWarmupEnabled()) {
+if (m_ruby_system->getWarmupEnabled()) {
rs->m_cache_recorder->enqueueNextFetchRequest();
-} else if (RubySystem::getCooldownEnabled()) {
+} else if (m_ruby_system->getCooldownEnabled()) {
rs->m_cache_recorder->enqueueNextFlushRequest();
} else {
testDrainComplete();

@@ -341,6 +341,8 @@ class GPUCoalescer : public RubyPort
void insertKernel(int wavefront_id, PacketPtr pkt);
+RubySystem *getRubySystem() { return m_ruby_system; }
GMTokenPort& getGMTokenPort() { return gmTokenPort; }
statistics::Histogram& getOutstandReqHist() { return m_outstandReqHist; }

@@ -326,6 +326,8 @@ RubyPort::MemResponsePort::recvAtomic(PacketPtr pkt)
panic("Ruby supports atomic accesses only in noncaching mode\n");
}
+RubySystem *rs = owner.m_ruby_system;
// Check for pio requests and directly send them to the dedicated
// pio port.
if (pkt->cmd != MemCmd::MemSyncReq) {
@@ -343,12 +345,11 @@ RubyPort::MemResponsePort::recvAtomic(PacketPtr pkt)
return owner.ticksToCycles(req_ticks);
}
-assert(getOffset(pkt->getAddr()) + pkt->getSize() <=
-RubySystem::getBlockSizeBytes());
+assert(owner.getOffset(pkt->getAddr()) + pkt->getSize() <=
+rs->getBlockSizeBytes());
}
// Find the machine type of memory controller interface
-RubySystem *rs = owner.m_ruby_system;
static int mem_interface_type = -1;
if (mem_interface_type == -1) {
if (rs->m_abstract_controls[MachineType_Directory].size() != 0) {
@@ -404,7 +405,7 @@ RubyPort::MemResponsePort::recvFunctional(PacketPtr pkt)
}
assert(pkt->getAddr() + pkt->getSize() <=
-makeLineAddress(pkt->getAddr()) + RubySystem::getBlockSizeBytes());
+owner.makeLineAddress(pkt->getAddr()) + rs->getBlockSizeBytes());
if (access_backing_store) {
// The attached physmem contains the official version of data.
@@ -501,7 +502,7 @@ RubyPort::ruby_stale_translation_callback(Addr txnId)
// assumed they will not be modified or deleted by receivers.
// TODO: should this really be using funcRequestorId?
auto request = std::make_shared<Request>(
-0, RubySystem::getBlockSizeBytes(), Request::TLBI_EXT_SYNC,
+0, m_ruby_system->getBlockSizeBytes(), Request::TLBI_EXT_SYNC,
Request::funcRequestorId);
// Store the txnId in extraData instead of the address
request->setExtraData(txnId);
@@ -701,7 +702,7 @@ RubyPort::ruby_eviction_callback(Addr address)
// assumed they will not be modified or deleted by receivers.
// TODO: should this really be using funcRequestorId?
auto request = std::make_shared<Request>(
-address, RubySystem::getBlockSizeBytes(), 0,
+address, m_ruby_system->getBlockSizeBytes(), 0,
Request::funcRequestorId);
// Use a single packet to signal all snooping ports of the invalidation.
@@ -739,5 +740,23 @@ RubyPort::functionalWrite(Packet *func_pkt)
return num_written;
}
+Addr
+RubyPort::getOffset(Addr addr) const
+{
+return ruby::getOffset(addr, m_ruby_system->getBlockSizeBits());
+}
+Addr
+RubyPort::makeLineAddress(Addr addr) const
+{
+return ruby::makeLineAddress(addr, m_ruby_system->getBlockSizeBits());
+}
+std::string
+RubyPort::printAddress(Addr addr) const
+{
+return ruby::printAddress(addr, m_ruby_system->getBlockSizeBits());
+}
} // namespace ruby
} // namespace gem5

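The new `RubyPort` wrappers above delegate to the free functions in `src/mem/ruby/common/Address.hh`, which now take the block-size bits explicitly instead of reading the static `RubySystem::getBlockSizeBits()`. A minimal sketch of that address arithmetic, assuming the conventional mask-based implementation (the bodies here are illustrative, not copied from the patch):

```cpp
#include <cstdint>

// The block offset occupies the low block_bits bits of an address.
// makeLineAddress clears them to get the cache-line base address;
// getOffset keeps only them.
uint64_t makeLineAddress(uint64_t addr, int block_bits)
{
    return addr & ~((uint64_t(1) << block_bits) - 1);  // clear offset bits
}

uint64_t getOffset(uint64_t addr, int block_bits)
{
    return addr & ((uint64_t(1) << block_bits) - 1);   // keep offset bits
}
```

For a 64-byte block (`block_bits = 6`), address 0x12345 maps to line 0x12340 with offset 0x5.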
@@ -181,6 +181,11 @@ class RubyPort : public ClockedObject
virtual int functionalWrite(Packet *func_pkt);
+// Helper methods for commonly used functions called in common/address.hh
+Addr getOffset(Addr addr) const;
+Addr makeLineAddress(Addr addr) const;
+std::string printAddress(Addr addr) const;
protected:
void trySendRetries();
void ruby_hit_callback(PacketPtr pkt);

@@ -66,15 +66,8 @@ namespace gem5
namespace ruby
{
-bool RubySystem::m_randomization;
-uint32_t RubySystem::m_block_size_bytes;
-uint32_t RubySystem::m_block_size_bits;
-uint32_t RubySystem::m_memory_size_bits;
-bool RubySystem::m_warmup_enabled = false;
-// To look forward to allowing multiple RubySystem instances, track the number
-// of RubySystems that need to be warmed up on checkpoint restore.
-unsigned RubySystem::m_systems_to_warmup = 0;
-bool RubySystem::m_cooldown_enabled = false;
RubySystem::RubySystem(const Params &p)
: ClockedObject(p), m_access_backing_store(p.access_backing_store),
@@ -212,8 +205,8 @@ RubySystem::makeCacheRecorder(uint8_t *uncompressed_trace,
// Create the CacheRecorder and record the cache trace
m_cache_recorder = new CacheRecorder(uncompressed_trace, cache_trace_size,
-ruby_port_map,
-block_size_bytes);
+ruby_port_map, block_size_bytes,
+m_block_size_bytes);
}
void
@@ -331,7 +324,7 @@ RubySystem::serialize(CheckpointOut &cp) const
// Store the cache-block size, so we are able to restore on systems
// with a different cache-block size. CacheRecorder depends on the
// correct cache-block size upon unserializing.
-uint64_t block_size_bytes = getBlockSizeBytes();
+uint64_t block_size_bytes = m_block_size_bytes;
SERIALIZE_SCALAR(block_size_bytes);
// Check that there's a valid trace to use. If not, then memory won't
@@ -416,7 +409,6 @@ RubySystem::unserialize(CheckpointIn &cp)
readCompressedTrace(cache_trace_file, uncompressed_trace,
cache_trace_size);
m_warmup_enabled = true;
-m_systems_to_warmup++;
// Create the cache recorder that will hang around until startup.
makeCacheRecorder(uncompressed_trace, cache_trace_size, block_size_bytes);
@@ -467,10 +459,7 @@ RubySystem::startup()
delete m_cache_recorder;
m_cache_recorder = NULL;
-m_systems_to_warmup--;
-if (m_systems_to_warmup == 0) {
-m_warmup_enabled = false;
-}
+m_warmup_enabled = false;
// Restore eventq head
eventq->replaceHead(eventq_head);
@@ -509,7 +498,7 @@ bool
RubySystem::functionalRead(PacketPtr pkt)
{
Addr address(pkt->getAddr());
-Addr line_address = makeLineAddress(address);
+Addr line_address = makeLineAddress(address, m_block_size_bits);
AccessPermission access_perm = AccessPermission_NotPresent;
@@ -625,7 +614,7 @@ bool
RubySystem::functionalRead(PacketPtr pkt)
{
Addr address(pkt->getAddr());
-Addr line_address = makeLineAddress(address);
+Addr line_address = makeLineAddress(address, m_block_size_bits);
DPRINTF(RubySystem, "Functional Read request for %#x\n", address);
@@ -726,7 +715,7 @@ bool
RubySystem::functionalWrite(PacketPtr pkt)
{
Addr addr(pkt->getAddr());
-Addr line_addr = makeLineAddress(addr);
+Addr line_addr = makeLineAddress(addr, m_block_size_bits);
AccessPermission access_perm = AccessPermission_NotPresent;
DPRINTF(RubySystem, "Functional Write request for %#x\n", addr);

@@ -68,12 +68,12 @@ class RubySystem : public ClockedObject
~RubySystem();
// config accessors
-static int getRandomization() { return m_randomization; }
-static uint32_t getBlockSizeBytes() { return m_block_size_bytes; }
-static uint32_t getBlockSizeBits() { return m_block_size_bits; }
-static uint32_t getMemorySizeBits() { return m_memory_size_bits; }
-static bool getWarmupEnabled() { return m_warmup_enabled; }
-static bool getCooldownEnabled() { return m_cooldown_enabled; }
+int getRandomization() { return m_randomization; }
+uint32_t getBlockSizeBytes() { return m_block_size_bytes; }
+uint32_t getBlockSizeBits() { return m_block_size_bits; }
+uint32_t getMemorySizeBits() { return m_memory_size_bits; }
+bool getWarmupEnabled() { return m_warmup_enabled; }
+bool getCooldownEnabled() { return m_cooldown_enabled; }
memory::SimpleMemory *getPhysMem() { return m_phys_mem; }
Cycles getStartCycle() { return m_start_cycle; }
@@ -134,14 +134,13 @@ class RubySystem : public ClockedObject
void processRubyEvent();
private:
// configuration parameters
-static bool m_randomization;
-static uint32_t m_block_size_bytes;
-static uint32_t m_block_size_bits;
-static uint32_t m_memory_size_bits;
+bool m_randomization;
+uint32_t m_block_size_bytes;
+uint32_t m_block_size_bits;
+uint32_t m_memory_size_bits;
-static bool m_warmup_enabled;
-static unsigned m_systems_to_warmup;
-static bool m_cooldown_enabled;
+bool m_warmup_enabled = false;
+bool m_cooldown_enabled = false;
memory::SimpleMemory *m_phys_mem;
const bool m_access_backing_store;
@@ -158,6 +157,11 @@ class RubySystem : public ClockedObject
Profiler* m_profiler;
CacheRecorder* m_cache_recorder;
std::vector<std::map<uint32_t, AbstractController *> > m_abstract_controls;
+std::map<MachineType, uint32_t> m_num_controllers;
+// These are auto-generated by SLICC based on the built protocol.
+int MachineType_base_count(const MachineType& obj);
+int MachineType_base_number(const MachineType& obj);
};
} // namespace ruby

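The `RubySystem.hh` changes above are the core of the PR: process-wide static configuration becomes per-instance state, so two `RubySystem` objects can coexist with different cache-block sizes. An illustrative before/after of that pattern (class and helper names are invented for this sketch; gem5's real class reads its values from `Params`):

```cpp
#include <cstdint>

// Per-instance configuration replacing static members. Each instance
// derives its own block-size-in-bits from its block-size-in-bytes,
// rather than sharing one process-wide value.
class RubySystemLike
{
  public:
    explicit RubySystemLike(uint32_t block_size_bytes)
        : m_block_size_bytes(block_size_bytes),
          m_block_size_bits(floorLog2(block_size_bytes)) {}

    uint32_t getBlockSizeBytes() const { return m_block_size_bytes; }
    uint32_t getBlockSizeBits() const { return m_block_size_bits; }

  private:
    static uint32_t floorLog2(uint32_t x)
    {
        uint32_t bits = 0;
        while (x >>= 1) ++bits;  // position of the highest set bit
        return bits;
    }

    uint32_t m_block_size_bytes;  // was: static uint32_t
    uint32_t m_block_size_bits;   // was: static uint32_t
};
```

This is also why `m_systems_to_warmup` disappears from `RubySystem.cc`: once warmup state is per-instance, there is no shared counter to coordinate across systems.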
@@ -73,6 +73,8 @@ Sequencer::Sequencer(const Params &p)
{
m_outstanding_count = 0;
+m_ruby_system = p.ruby_system;
m_dataCache_ptr = p.dcache;
m_max_outstanding_requests = p.max_outstanding_requests;
m_deadlock_threshold = p.deadlock_threshold;
@@ -726,7 +728,7 @@ Sequencer::hitCallback(SequencerRequest* srequest, DataBlock& data,
printAddress(request_address));
// update the data unless it is a non-data-carrying flush
-if (RubySystem::getWarmupEnabled()) {
+if (m_ruby_system->getWarmupEnabled()) {
data.setData(pkt);
} else if (!pkt->isFlush()) {
if ((type == RubyRequestType_LD) ||
@@ -782,11 +784,11 @@ Sequencer::hitCallback(SequencerRequest* srequest, DataBlock& data,
}
RubySystem *rs = m_ruby_system;
-if (RubySystem::getWarmupEnabled()) {
+if (m_ruby_system->getWarmupEnabled()) {
assert(pkt->req);
delete pkt;
rs->m_cache_recorder->enqueueNextFetchRequest();
-} else if (RubySystem::getCooldownEnabled()) {
+} else if (m_ruby_system->getCooldownEnabled()) {
delete pkt;
rs->m_cache_recorder->enqueueNextFlushRequest();
} else {
@@ -852,8 +854,8 @@ Sequencer::completeHitCallback(std::vector<PacketPtr> & mylist)
// When Ruby is in warmup or cooldown phase, the requests come
// from the cache recorder. They do not track which port to use
// and do not need to send the response back
-if (!RubySystem::getWarmupEnabled()
-&& !RubySystem::getCooldownEnabled()) {
+if (!m_ruby_system->getWarmupEnabled()
+&& !m_ruby_system->getCooldownEnabled()) {
RubyPort::SenderState *ss =
safe_cast<RubyPort::SenderState *>(pkt->senderState);
MemResponsePort *port = ss->port;
@@ -873,9 +875,9 @@ Sequencer::completeHitCallback(std::vector<PacketPtr> & mylist)
}
RubySystem *rs = m_ruby_system;
-if (RubySystem::getWarmupEnabled()) {
+if (m_ruby_system->getWarmupEnabled()) {
rs->m_cache_recorder->enqueueNextFetchRequest();
-} else if (RubySystem::getCooldownEnabled()) {
+} else if (m_ruby_system->getCooldownEnabled()) {
rs->m_cache_recorder->enqueueNextFlushRequest();
} else {
testDrainComplete();
@@ -910,14 +912,16 @@ Sequencer::invL1()
// Evict Read-only data
RubyRequestType request_type = RubyRequestType_REPLACEMENT;
std::shared_ptr<RubyRequest> msg = std::make_shared<RubyRequest>(
-clockEdge(), addr, 0, 0,
-request_type, RubyAccessMode_Supervisor,
+clockEdge(), m_ruby_system->getBlockSizeBytes(), m_ruby_system,
+addr, 0, 0, request_type, RubyAccessMode_Supervisor,
nullptr);
DPRINTF(RubySequencer, "Evicting addr 0x%x\n", addr);
assert(m_mandatory_q_ptr != NULL);
Tick latency = cyclesToTicks(
m_controller->mandatoryQueueLatency(request_type));
-m_mandatory_q_ptr->enqueue(msg, clockEdge(), latency);
+m_mandatory_q_ptr->enqueue(msg, clockEdge(), latency,
+m_ruby_system->getRandomization(),
+m_ruby_system->getWarmupEnabled());
m_num_pending_invs++;
}
DPRINTF(RubySequencer,
@@ -1080,11 +1084,14 @@ Sequencer::issueRequest(PacketPtr pkt, RubyRequestType secondary_type)
pc = pkt->req->getPC();
}
+int blk_size = m_ruby_system->getBlockSizeBytes();
// check if the packet has data as for example prefetch and flush
// requests do not
std::shared_ptr<RubyRequest> msg;
if (pkt->req->isMemMgmt()) {
-msg = std::make_shared<RubyRequest>(clockEdge(),
+msg = std::make_shared<RubyRequest>(clockEdge(), blk_size,
+m_ruby_system,
pc, secondary_type,
RubyAccessMode_Supervisor, pkt,
proc_id, core_id);
@@ -1111,8 +1118,10 @@ Sequencer::issueRequest(PacketPtr pkt, RubyRequestType secondary_type)
msg->m_tlbiTransactionUid);
}
} else {
-msg = std::make_shared<RubyRequest>(clockEdge(), pkt->getAddr(),
-pkt->getSize(), pc, secondary_type,
+msg = std::make_shared<RubyRequest>(clockEdge(), blk_size,
+m_ruby_system,
+pkt->getAddr(), pkt->getSize(),
+pc, secondary_type,
RubyAccessMode_Supervisor, pkt,
PrefetchBit_No, proc_id, core_id);
@@ -1147,7 +1156,9 @@ Sequencer::issueRequest(PacketPtr pkt, RubyRequestType secondary_type)
assert(latency > 0);
assert(m_mandatory_q_ptr != NULL);
-m_mandatory_q_ptr->enqueue(msg, clockEdge(), latency);
+m_mandatory_q_ptr->enqueue(msg, clockEdge(), latency,
+m_ruby_system->getRandomization(),
+m_ruby_system->getWarmupEnabled());
}
template <class KEY, class VALUE>
@@ -1194,7 +1205,7 @@ Sequencer::incrementUnaddressedTransactionCnt()
// Limit m_unaddressedTransactionCnt to 32 bits,
// top 32 bits should always be zeroed out
uint64_t aligned_txid = \
-m_unaddressedTransactionCnt << RubySystem::getBlockSizeBits();
+m_unaddressedTransactionCnt << m_ruby_system->getBlockSizeBits();
if (aligned_txid > 0xFFFFFFFFull) {
m_unaddressedTransactionCnt = 0;
@@ -1206,7 +1217,7 @@ Sequencer::getCurrentUnaddressedTransactionID() const
{
return (
uint64_t(m_version & 0xFFFFFFFF) << 32) |
-(m_unaddressedTransactionCnt << RubySystem::getBlockSizeBits()
+(m_unaddressedTransactionCnt << m_ruby_system->getBlockSizeBits()
);
}

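The last two Sequencer hunks touch the unaddressed-transaction ID scheme: the ID packs the sequencer version into the high 32 bits and the counter, shifted past the block-offset bits, into the low 32 bits, with the counter wrapping once its shifted value would exceed 32 bits. A hypothetical standalone sketch of that packing (the function name is invented; the layout follows `getCurrentUnaddressedTransactionID` and `incrementUnaddressedTransactionCnt` above):

```cpp
#include <cstdint>

// Build a 64-bit unaddressed-transaction ID: high 32 bits = version,
// low 32 bits = counter shifted past the block-offset bits. Then
// advance the counter, wrapping when the shifted value would no
// longer fit in 32 bits.
uint64_t makeTransactionID(uint32_t version, uint64_t &cnt, int block_bits)
{
    uint64_t id = (uint64_t(version) << 32) | (cnt << block_bits);
    cnt++;
    if ((cnt << block_bits) > 0xFFFFFFFFull)
        cnt = 0;  // keep the top 32 bits zeroed, as the comment above notes
    return id;
}
```

Keeping the offset bits zero means these IDs can never collide with real line addresses in the same map.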