Files
gem5/src/cpu/base_dyn_inst_impl.hh
Rekai Gonzalez-Alberquilla 51becd2475 cpu-o3: O3 LSQ Generalisation
This patch does a large modification of the LSQ in the O3 model. The
main goal of the patch is to remove the 'an operation can be served with
one or two memory requests' assumption that is present in the LSQ
and the instruction with the req, reqLow, reqHigh triplet, and
generalising it to operations that can be addressed with one request,
and operations that require many requests, embodied in the
SingleDataRequest and the SplitDataRequest.

This modification has been done mimicking the minor model to an extent,
shifting the responsibilities of dealing with VtoP translation and
tracking the status and resources from the DynInst to the LSQ via the
LSQRequest. The LSQRequest models the information concerning the
operation, handles the creation of fragments for translation and request
as well as assembling/splitting the data accordingly.

With this modifications, the implementation of vector ISAs, particularly
on the memory side, become more rich, as the new model permits a
dissociation of the ISA characteristics as vector length, from the
microarchitectural characteristics that govern how contiguous loads are
executing, allowing exploration of different LSQ to DL1 bus widths to
understand the tradeoffs in complexity and performance.

Part of the complexities introduced stem from the fact that gem5 keeps a
large amount of metadata regarding, in particular, memory operations,
thus, when an instruction is squashed while some operation as TLB lookup
or cache access is ongoing, when the relevant structure communicates to
the LSQ that the operation is over, it tries to access some pieces of
data that should have died when the instruction is squashed, leading to
asserts, panics, or memory corruption. To ensure the correct behaviour,
the LSQRequest rely on assesing who is their owner, and self-destroying
if they detect their owner is done with the request, and there will be
no subsequent action. For example, in the case of an instruction
squashed whal the TLB is doing a walk to serve the translation, when the
translation is served by the TLB, the LSQRequest detects that the
instruction was squashed, and as the translation is done, no one else
expect to access its information, and therefore, it self-destructs.
Having destroyed the LSQRequest earlier, would lead to wrong behaviour
as the TLB walk may access some fields of it.

Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>

Change-Id: I9578a1a3f6b899c390cdd886856a24db68ff7d0c
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13516
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
2019-01-24 09:46:34 +00:00

240 lines
6.2 KiB
C++

/*
* Copyright (c) 2011 ARM Limited
* All rights reserved.
*
* The license below extends only to copyright in the software and shall
* not be construed as granting a license to any other intellectual
* property including but not limited to intellectual property relating
* to a hardware implementation of the functionality of the software
* licensed hereunder. You may use the software subject to the license
* terms below provided that you ensure that this notice is replicated
* unmodified and in its entirety in all distributions of the software,
* modified or unmodified, in source code or in binary form.
*
* Copyright (c) 2004-2006 The Regents of The University of Michigan
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are
* met: redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer;
* redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution;
* neither the name of the copyright holders nor the names of its
* contributors may be used to endorse or promote products derived from
* this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
* Authors: Kevin Lim
*/
#ifndef __CPU_BASE_DYN_INST_IMPL_HH__
#define __CPU_BASE_DYN_INST_IMPL_HH__
#include <iostream>
#include <set>
#include <sstream>
#include <string>
#include "base/cprintf.hh"
#include "base/trace.hh"
#include "config/the_isa.hh"
#include "cpu/base_dyn_inst.hh"
#include "cpu/exetrace.hh"
#include "debug/DynInst.hh"
#include "debug/IQ.hh"
#include "mem/request.hh"
#include "sim/faults.hh"
template <class Impl>
BaseDynInst<Impl>::BaseDynInst(const StaticInstPtr &_staticInst,
const StaticInstPtr &_macroop,
TheISA::PCState _pc, TheISA::PCState _predPC,
InstSeqNum seq_num, ImplCPU *cpu)
: staticInst(_staticInst), cpu(cpu),
thread(nullptr),
traceData(nullptr),
macroop(_macroop),
memData(nullptr),
savedReq(nullptr),
reqToVerify(nullptr)
{
seqNum = seq_num;
pc = _pc;
predPC = _predPC;
initVars();
}
template <class Impl>
BaseDynInst<Impl>::BaseDynInst(const StaticInstPtr &_staticInst,
const StaticInstPtr &_macroop)
: staticInst(_staticInst), traceData(NULL), macroop(_macroop)
{
seqNum = 0;
initVars();
}
template <class Impl>
void
BaseDynInst<Impl>::initVars()
{
memData = NULL;
effAddr = 0;
physEffAddr = 0;
readyRegs = 0;
memReqFlags = 0;
status.reset();
instFlags.reset();
instFlags[RecordResult] = true;
instFlags[Predicate] = true;
lqIdx = -1;
sqIdx = -1;
// Eventually make this a parameter.
threadNumber = 0;
// Also make this a parameter, or perhaps get it from xc or cpu.
asid = 0;
// Initialize the fault to be NoFault.
fault = NoFault;
#ifndef NDEBUG
++cpu->instcount;
if (cpu->instcount > 1500) {
#ifdef DEBUG
cpu->dumpInsts();
dumpSNList();
#endif
assert(cpu->instcount <= 1500);
}
DPRINTF(DynInst,
"DynInst: [sn:%lli] Instruction created. Instcount for %s = %i\n",
seqNum, cpu->name(), cpu->instcount);
#endif
#ifdef DEBUG
cpu->snList.insert(seqNum);
#endif
}
template <class Impl>
BaseDynInst<Impl>::~BaseDynInst()
{
if (memData) {
delete [] memData;
}
if (traceData) {
delete traceData;
}
fault = NoFault;
#ifndef NDEBUG
--cpu->instcount;
DPRINTF(DynInst,
"DynInst: [sn:%lli] Instruction destroyed. Instcount for %s = %i\n",
seqNum, cpu->name(), cpu->instcount);
#endif
#ifdef DEBUG
cpu->snList.erase(seqNum);
#endif
}
#ifdef DEBUG
template <class Impl>
void
BaseDynInst<Impl>::dumpSNList()
{
std::set<InstSeqNum>::iterator sn_it = cpu->snList.begin();
int count = 0;
while (sn_it != cpu->snList.end()) {
cprintf("%i: [sn:%lli] not destroyed\n", count, (*sn_it));
count++;
sn_it++;
}
}
#endif
template <class Impl>
void
BaseDynInst<Impl>::dump()
{
cprintf("T%d : %#08d `", threadNumber, pc.instAddr());
std::cout << staticInst->disassemble(pc.instAddr());
cprintf("'\n");
}
template <class Impl>
void
BaseDynInst<Impl>::dump(std::string &outstring)
{
std::ostringstream s;
s << "T" << threadNumber << " : 0x" << pc.instAddr() << " "
<< staticInst->disassemble(pc.instAddr());
outstring = s.str();
}
template <class Impl>
void
BaseDynInst<Impl>::markSrcRegReady()
{
DPRINTF(IQ, "[sn:%lli] has %d ready out of %d sources. RTI %d)\n",
seqNum, readyRegs+1, numSrcRegs(), readyToIssue());
if (++readyRegs == numSrcRegs()) {
setCanIssue();
}
}
template <class Impl>
void
BaseDynInst<Impl>::markSrcRegReady(RegIndex src_idx)
{
_readySrcRegIdx[src_idx] = true;
markSrcRegReady();
}
template <class Impl>
bool
BaseDynInst<Impl>::eaSrcsReady() const
{
// For now I am assuming that src registers 1..n-1 are the ones that the
// EA calc depends on. (i.e. src reg 0 is the source of the data to be
// stored)
for (int i = 1; i < numSrcRegs(); ++i) {
if (!_readySrcRegIdx[i])
return false;
}
return true;
}
#endif//__CPU_BASE_DYN_INST_IMPL_HH__