There is a flow of packets as so:
WriteResp -> WriteReq -> WriteCompleteResp
These packets share some variables, in particular senderState and a
status vector.
One issue was the WriteResp packet decremented the status vector, which
was used by the WriteCompleteResp packets to determine when to handle
the global memory response. This could lead to multiple
WriteCompleteResp packets attempting to handle the global memory
response.
Because of that, the WriteCompleteResp packets needed to handle the
status vector. this patch moves WriteCompleteResp packet handling back
into ComputeUnit::DataPort::processMemRespEvent from
ComputeUnit::DataPort::recvTimingResp. This helps remove some redundant
code.
This patch has the WriteResp packet return without doing any status
vector handling, and without deleting the senderState, which had
previously caused a segfault.
Another issue was WriteCompleteResp packets weren't being issued for
each active lane, as the coalesced request was being issued too early.
In order to fix that, we have to ensure every active lane puts their
request into their applicable coalesced request before issuing the
coalesced request. Because of that change, we change the issuing of
CoalescedRequests from GPUCoalescer::coalescePacket to
GPUCoalescer::completeIssue.
That change involves adding a new variable to store the
CoalescedRequests that are created in the calls to coalescePacket. This
variable is a map from instruction sequence number to coalesced
requests.
Additionally, the WriteCompleteResp packet was attempting to access
physical memory in hitCallback while not having any data, which
caused a crash. This can be resolved either by not allowing
WriteCompleteResp packets to access memory, or by copying the data
from the WriteReq packet. This patch denies WriteCompleteResp packets
memory access in hitCallback.
Finally, in VIPERCoalescer::writeCompleteCallback there was a map
that held the WriteComplete packets, but no packets were ever being
removed. This patch removes packets that match the address that was
passed in to the function.
Change-Id: I9a064a0def2bf6c513f5295596c56b1b652b0ca4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33656
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The create() method on Params structs usually instantiate SimObjects
using a constructor which takes the Params struct as a parameter
somehow. There has been a lot of needless variation in how that was
done, making it annoying to pass Params down to base classes. Some of
the different forms were:
const Params &
Params &
Params *
const Params *
Params const*
This change goes through and fixes up every constructor and every
create() method to use the const Params & form. We use a reference
because the Params struct should never be null. We use const because
neither the create method nor the consuming object should modify the
record of the parameters as they came in from the config. That would
make consuming them not idempotent, and make it impossible to tell what
the actual simulation configuration was since it would change from any
user visible form (config script, config.ini, dot pdf output).
Change-Id: I77453cba52fdcfd5f4eec92dfb0bddb5a9945f31
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35938
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
All parameters in functions defined within SLICC are const& by default
(except for the implicit types, e.g. TBE). This allow us to specify
if we want to pass parameters as & or const&. Default behavior is
maintained.
A use case is to allow refactoring of common code in actions that
enqueue messages. Messages can be passed as a non-const ref. to
to functions with common initialization. E.g.:
void initRequestMsg(RequestMsg & out_msg) {
// Common msg init code
}
action(sendRequest1, ...) {
enqueue(...) {
initRequestMsg(out_msg);
// Request1 specific code
}
}
action(sendRequest2, ...) {
enqueue(...) {
initRequestMsg(out_msg);
// Request2 specific code
}
}
Change-Id: Ic6a18169a661b3e36710b2a9f8a0e6bc5fce40f8
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31259
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Added mapAddressToDownstreamMachine that may be used by the protocols
to map an address to different target donwstream controller of the same
type.
These functions do not use the global mapping provided by the network
and map addresses to one of the controllers specified in the
downstream_destinations parameter.
This change facilitates reusing the same cache state-machine/controllers
to model different levels of the cache hierarchy.
Change-Id: I9a202e9461e0d2f16ed232ff8b60bbde2d15570d
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31415
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Returns the offset of an address with respect to a base address.
Looks unnecessary, but SLICC doesn't support casting and the '-'
operator for Addr types, so the alternative to this would be to add
more some helpers like 'addrToUint64' and 'uint64ToInt'.
Change-Id: I90480cec4c8b2e6bb9706f8b94ed33abe3c93e78
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31270
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add incomingTransactionStart/End and outgoingTransactionStart/End
functions that can be called from the protocol to profile events
that initiate a transaction locally (e.g. an incoming request) and
remotely (e.g. outgoing requests). The generated stats will include
histograms of the latency for completing each type of transaction.
This assumes assumes the protocol uses different trigger events for
initiating incoming and outgoing transactions.
Change-Id: Ib528641b9676c68907b5989b6a09bfe91373f9c9
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31421
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: John Alsop <johnathan.alsop@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
There are cases in which we need to prevent randomization for a
specific buffer when enabled at the RubySystem level (e.g. a internal
trigger queue that requires zero latency enqueue, while other buffers
can be randomized).
This changes the randomization parameter to support enabling and
disabling randomization regardless of the RubySystem setting.
Change-Id: If7520153cc5864897fa42e8911a6f8acbcf01db5
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31419
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Templated types can now be used within structures defined in SLICC.
Usage is similar to the TBETable: the templated type must have all
possible methods in it's SLICC definition. Eg.:
structure(Map, desc="Template map definition") {
MachineID lookup(Addr);
MachineID lookup(int);
}
structure(SomeType, desc="Some other struct definition") {
MachineID addrMap, template="<Addr,MachineID>";
MachineID intMap, template="<int,MachineID>";
}
Change-Id: I02a621cea5e4a89302762334651c6534c6574e9d
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31264
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Bradford Beckmann <bradford.beckmann@gmail.com>
Maintainer: Bradford Beckmann <bradford.beckmann@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Ruby prefetcher's non-unit filter is a circular queue, so use the class
created for this functionality.
This changes the behavior, since previously iterating through the
filter was completely arbitrary, and now it iterates from the
beginning of the queue to the end when accessing and updating
the filter's contents.
Change-Id: I3148efcbef00da0c8f6cf2dee7fb86f6c2ddb27d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24533
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Ruby prefetcher's unit filter is a circular queue, so use the class
created for this functionality.
This changes the behavior, since previously iterating through the
filter was completely arbitrary, and now it iterates from the
beginning of the queue to the end when accessing and updating
the filter's contents.
Change-Id: I834be88a33580d5857c38e9bae8b289c5a6250b9
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/24532
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Messages may be enqueued and be ready in the same cycle.
Using this feature may introduce nondeterminism in the protocol and
should be used in specific cases. A case study is to avoid needing an
additional cycle for internal protocol triggers (e.g. the All_Acks
event in src/mem/ruby/protocol/MOESI_CMP_directory-L2cache.sm).
To mitigate modeling mistakes, the 'allow_zero_latency' parameter must
be set for a MessageBuffer where this behavior is acceptable.
This changes also updates the Consumer to schedule events according to
this new behavior. The original implementation would not schedule a new
wakeup event if the wakeup for the Consumer had already been executed
in that cycle.
Additional authors:
- Tuan Ta <tuan.ta2@arm.com>
Change-Id: Ib194e7b4b4ee4b06da1baea17c0eb743f650dfdd
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31255
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Encapsulate this variable to facilitate polymorphism.
- The status enum was renamed to CoherenceBits, since it
lists the coherence bits supported by the CacheBlk.
- status was made protected and renamed to coherence since
it contains the coherence bits.
- Functions to set, clear and get the coherence bits were
created.
- To set a status bit, the block must be validated first.
This guarantees a constant flow and helps catching bugs.
As a side effect, some of the modified files contained long
lines, which had to be split.
Change-Id: I558cc51ac685d30b6bf298c78f86a6e24ff06973
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34960
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The TaggedEntry class inherits from the ReplaceableEntry
class. Its purpose is to define a replaceable entry with
tagging attributes.
It has been created as a separate class because both the
replacement policies and the AbstractCacheEntry use
ReplaceableEntry, and do not need the tagging information
to perform their operations.
Change-Id: I24e87c865fc21c79dea7e488507a8cafc5223b39
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35698
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Encapsulate this variable to facilitate polymorphism.
- tickInserted was renamed to _tickInserted and was privatized.
- The insertion tick should always be set to the current tick,
and only on insertion; thus, its setter is not public and
does not take arguments.
- An additional function was created to get the age since of
the block relative to its insertion tick.
- There is no current need for a getter.
Change-Id: I81d4009abec5e9633e10f1e851e3a524553c98a4
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34958
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Encapsulate this variable to facilitate polymorphism.
- refCount was renamed to _refCount and was privatized.
- The reference count should only be reset at invalidation;
thus, its setter is not public.
- An additional function was created to increment the number
of references by 1.
Change-Id: Ibc8799a8dcb7c53c651de3eb1c9b86118a297b9d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34957
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Make the tag variable private, so that every access to it must pass
through a getter and a setter. This protects it from being incorrectly
updated when there is no direct match between a tag and a data entry,
as is the case of sector, compressed, decoupled, and many other table
layouts.
As a side effect, a block matching function has been created, which
also depends directly on the mapping between tag and data entries.
Change-Id: I848a154404feb5cbcea8d0fd2509bf49e1d73bd0
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34955
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The is a bug in the GPUCoalescer which occurs in the following
situation:
1) An instruction crosses a page boundary causing multiple TLB requests
to be sent.
2) The TLB responses arrive at different times, causing the vector
memory requests to be sent at different times.
3) The first vector memory request completes before the second vector
memory request arrives at the coalescer.
This caused the coalescer to consider the instruction sequence number
done and return its token. Then the second request would arrive and
complete sending back another token. Eventually this increases the token
count beyond the maximum tripping an assert.
This change keeps track of the number of per-lane requests which are
expected to be sent in the vector memory request by looking at the exec
mask of the instruction. The token is not returned until the expected
number of per-lane requests have been coalesced. This fixes "#7" in the
list of issues in JIRA-300. There are also style fixes for local
variables in code nearby the changes in this CL.
Change-Id: I152fd9397920ad82ba6079112908387e71ff3cce
JIRA: https://gem5.atlassian.net/browse/GEM5-300
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35176
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Kyle Roarty <kyleroarty1716@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The byteEnable variable is used for masking bytes in a memory request.
The default behaviour is to provide from the ExecContext to the CPU
(and then to the LSQ) an empty vector, which is the same as providing
a vector where every element is true.
Such vectors basically mean: do not mask any byte in the memory request.
This behaviour adds more complexity to the downstream LSQs, which now
have to distinguish between an empty and non-empty byteEnable.
This patch is simplifying things by transforming an empty vector into
a all true one, making sure the CPUs are always receiving a non empty
byteEnable.
JIRA: https://gem5.atlassian.net/browse/GEM5-196
Change-Id: I1d1cecd86ed64c53a314ed700f28810d76c195c3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23285
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
This change replaces the __attribute__ syntax with the now standard [[]]
syntax. It also reorganizes compiler.hh so that all special macros have
some explanatory text saying what they do, and each attribute which has a
standard version can use that if available and what version of c++ it's
standard in is put in a comment.
Also, the requirements as far as where you put [[]] style attributes are
a little more strict than the old school __attribute__ style. The use of
the attribute macros was updated to fit these new, more strict
requirements.
Change-Id: Iace44306a534111f1c38b9856dc9e88cd9b49d2a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35219
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Some code was added fairly recently which would load a memory image
into a memory directly in order to make it easier to set up ROMs.
Unfortunately, that code accidentally used the image size instead of
the cache line size when setting up the port proxy which would actually
write the data. This happens to work when the image size is a power of
two since that's all the proxy checks for, but there's no guarantee
that every image will be sized that way.
This change instead looks into the system object, retrieves the cache
line size from it, and uses that to set up the port proxy.
Change-Id: I227ac475b855d9516e1feb881769e12ec4e7d598
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35155
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These were including instruction class definitions from x86 for some
reason. There was no code in those .cc files which actually used
anything from them, as evidenced by the fact that the GCN3_X86 build
still works. No other code in the file was conditionally compiled as of
today.
Change-Id: I3cef8348fb601dd7af67665cf64bbf514c91c3db
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34577
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>