Certain instructions (some atomics and buffer_wbinvl1_vol) deadlock
in the coalescer, where sendTimingReq fails, fails a retry, and then
never retries again.
This fix sets m_cache_inv_pkt to null before calling
completeHitCallback(), as that allows the failed packets to be retried
again.
Change-Id: I4a51c741360f385f8b4c3f2a31a9410f18e095d9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/37477
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The function was defined in a .cc file but marked as inline. gcc seems
to often figure out what it should do, but in clang it doesn't export
the function (since it's marked as inline), and during linking external
references, which don't have a local copy since it's not defined in the
.hh file, will fail.
This failure looks particularly odd because the funciton is virtual,
and so the failure is reported as being unable to compose the vtable
in places where the object is constructed, relatively obscure code
which is generated by the build system and obscured by templates from
an external code base (pybind11).
Change-Id: Ib51aefbf9005e4ca8dfebef32c5def472175f115
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/37436
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The compression factor of a block is measured according to the maximum
achievable compression ratio.
For example, if up to 4 blocks can co-allocate in a superblock, and
a cache line has 512 bits, the possible compression factors are 1
(uncompressed, <=512 bits), 2 (compressed, <=256 bits), 4 (compressed,
<=128 bits).
This is an approach similar to the one described in "Yet Another
Compressed Cache", by Sardashti et al.
Change-Id: I52ef36989f3eeef6fc8890132a57f995ef9c5258
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36581
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When setting the size of a compressed block, its compressibility
needs to be recalculated based on that, so move such functionality
to be done after the block has been inserted, within setSizeBits.
As a side effect, insertBlock does not need to be overridden
anymore.
Change-Id: I608f876cd2110ac5e394ffad5b29941ba458ba91
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36580
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Data contractions happen when a block passes from a less compressed
(e.g., uncompressed) to a more compressed (e.g., compressed) state.
Some compaction methods enforce that a block can only be allocated
in a location matches an exact compression factor, thus on data
contractions such blocks must be moved to another location, or
they must be padded to fake a bigger size.
For compaction methods that do not have that limitation, performance
can be improved if the contracted block is moved to co-allocate with
another existing entry, since it frees up an entry.
Change-Id: I302bc561b897f9d3ce1426331fe4b5c2df76f4b5
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36578
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When searching for victims of a data expansion a simple approach to
make room for the expanded block is to evict every co-allocatable
block. This, however, ignores replacement policies and tends to be
inefficient. Besides, some cache compaction policies do not allow
blocks that changed their compression ratio to be allocated in the
same location (e.g., Skewed Compressed Caches), so they must be
moved elsewhere.
The replacement policy approach asks the replacement policy which
block(s) would be the best to evict in order to make room for the
expanded block. The other approach, on the other hand, simply evicts
all co-allocated entries. In the case the replacement policy selects
the superblock of the block being expanded, we must make sure the
latter is not evicted/moved by mistake.
This patch also allows the user to select which approach they would
like to use.
Change-Id: Iae57cf26dac7218c51ff0169a5cfcf3d6f8ea28a
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36577
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Some cache techniques may need to move a block's metadata information
into another block. This must have some limitations to avoid mistakes:
- The destination entry must be invalid, otherwise the replacement
policy steps would be skipped.
- The source entry must be valid, otherwise there would be no point
in moving their metadata contents.
- The entries locations (set, way, offset...) must not be moved, since
they are fixed. The same principle is applied to the location specific
variables, such as the replacement pointer
Why it would be used:
For example, when using compression, and a block goes from uncompressed
to compressed state due to an overwrite, after the tag lookup
(sequential access) it can be decided whether to store the new data in
the old location, or, since we might have already found the block's co-
allocatable blocks, move it to co-allocate.
Other examples of techniques that could use this functionality are
Skewed Compressed Caches, and ZCaches.
Change-Id: I96e4f8cc8c992c4b01f315251d1a75d51c28692c
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36575
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The variable 'm_allow_zero_latency' was only used in an assert message in
`src/mem/ruby/network/MessageBuffer.cc`. This assert is stripped when
compiling to gem5.fast, resulting in the compilation failing with an
unused variable error.
This assert is better as a panic_if, which will not be stripped out
during the .fast compilation.
Change-Id: I5de74982fa42b3291899ddcf73f7140079e1ec3f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36697
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Previously the SimpleMem depended on the fact that it inherited from the
AbstractMem in order to access and export it's back door. Now, the
AbstractMem has a method which will set a back door pointer if
appropriate, which the SimpleMem can use, or anything else which uses an
AbstractMem as its backing store.
Also, make the AbstractMem invalidate any existing back doors and refuse
to give out any new ones while some bit of memory is locked. That's
because if the storage is accessed directly, the AbstractMem will have
no change to manage its bookkeeping, and locking won't work properly.
Change-Id: If8c2a63e0827bb88b583f27ab4151d6b761e116e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36977
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
There is a flow of packets as so:
WriteResp -> WriteReq -> WriteCompleteResp
These packets share some variables, in particular senderState and a
status vector.
One issue was the WriteResp packet decremented the status vector, which
was used by the WriteCompleteResp packets to determine when to handle
the global memory response. This could lead to multiple
WriteCompleteResp packets attempting to handle the global memory
response.
Because of that, the WriteCompleteResp packets needed to handle the
status vector. this patch moves WriteCompleteResp packet handling back
into ComputeUnit::DataPort::processMemRespEvent from
ComputeUnit::DataPort::recvTimingResp. This helps remove some redundant
code.
This patch has the WriteResp packet return without doing any status
vector handling, and without deleting the senderState, which had
previously caused a segfault.
Another issue was WriteCompleteResp packets weren't being issued for
each active lane, as the coalesced request was being issued too early.
In order to fix that, we have to ensure every active lane puts their
request into their applicable coalesced request before issuing the
coalesced request. Because of that change, we change the issuing of
CoalescedRequests from GPUCoalescer::coalescePacket to
GPUCoalescer::completeIssue.
That change involves adding a new variable to store the
CoalescedRequests that are created in the calls to coalescePacket. This
variable is a map from instruction sequence number to coalesced
requests.
Additionally, the WriteCompleteResp packet was attempting to access
physical memory in hitCallback while not having any data, which
caused a crash. This can be resolved either by not allowing
WriteCompleteResp packets to access memory, or by copying the data
from the WriteReq packet. This patch denies WriteCompleteResp packets
memory access in hitCallback.
Finally, in VIPERCoalescer::writeCompleteCallback there was a map
that held the WriteComplete packets, but no packets were ever being
removed. This patch removes packets that match the address that was
passed in to the function.
Change-Id: I9a064a0def2bf6c513f5295596c56b1b652b0ca4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33656
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The create() method on Params structs usually instantiate SimObjects
using a constructor which takes the Params struct as a parameter
somehow. There has been a lot of needless variation in how that was
done, making it annoying to pass Params down to base classes. Some of
the different forms were:
const Params &
Params &
Params *
const Params *
Params const*
This change goes through and fixes up every constructor and every
create() method to use the const Params & form. We use a reference
because the Params struct should never be null. We use const because
neither the create method nor the consuming object should modify the
record of the parameters as they came in from the config. That would
make consuming them not idempotent, and make it impossible to tell what
the actual simulation configuration was since it would change from any
user visible form (config script, config.ini, dot pdf output).
Change-Id: I77453cba52fdcfd5f4eec92dfb0bddb5a9945f31
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35938
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
All parameters in functions defined within SLICC are const& by default
(except for the implicit types, e.g. TBE). This allow us to specify
if we want to pass parameters as & or const&. Default behavior is
maintained.
A use case is to allow refactoring of common code in actions that
enqueue messages. Messages can be passed as a non-const ref. to
to functions with common initialization. E.g.:
void initRequestMsg(RequestMsg & out_msg) {
// Common msg init code
}
action(sendRequest1, ...) {
enqueue(...) {
initRequestMsg(out_msg);
// Request1 specific code
}
}
action(sendRequest2, ...) {
enqueue(...) {
initRequestMsg(out_msg);
// Request2 specific code
}
}
Change-Id: Ic6a18169a661b3e36710b2a9f8a0e6bc5fce40f8
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31259
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Added mapAddressToDownstreamMachine that may be used by the protocols
to map an address to different target donwstream controller of the same
type.
These functions do not use the global mapping provided by the network
and map addresses to one of the controllers specified in the
downstream_destinations parameter.
This change facilitates reusing the same cache state-machine/controllers
to model different levels of the cache hierarchy.
Change-Id: I9a202e9461e0d2f16ed232ff8b60bbde2d15570d
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31415
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Returns the offset of an address with respect to a base address.
Looks unnecessary, but SLICC doesn't support casting and the '-'
operator for Addr types, so the alternative to this would be to add
more some helpers like 'addrToUint64' and 'uint64ToInt'.
Change-Id: I90480cec4c8b2e6bb9706f8b94ed33abe3c93e78
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31270
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add incomingTransactionStart/End and outgoingTransactionStart/End
functions that can be called from the protocol to profile events
that initiate a transaction locally (e.g. an incoming request) and
remotely (e.g. outgoing requests). The generated stats will include
histograms of the latency for completing each type of transaction.
This assumes assumes the protocol uses different trigger events for
initiating incoming and outgoing transactions.
Change-Id: Ib528641b9676c68907b5989b6a09bfe91373f9c9
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31421
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: John Alsop <johnathan.alsop@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
There are cases in which we need to prevent randomization for a
specific buffer when enabled at the RubySystem level (e.g. a internal
trigger queue that requires zero latency enqueue, while other buffers
can be randomized).
This changes the randomization parameter to support enabling and
disabling randomization regardless of the RubySystem setting.
Change-Id: If7520153cc5864897fa42e8911a6f8acbcf01db5
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31419
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Templated types can now be used within structures defined in SLICC.
Usage is similar to the TBETable: the templated type must have all
possible methods in it's SLICC definition. Eg.:
structure(Map, desc="Template map definition") {
MachineID lookup(Addr);
MachineID lookup(int);
}
structure(SomeType, desc="Some other struct definition") {
MachineID addrMap, template="<Addr,MachineID>";
MachineID intMap, template="<int,MachineID>";
}
Change-Id: I02a621cea5e4a89302762334651c6534c6574e9d
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31264
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Bradford Beckmann <bradford.beckmann@gmail.com>
Maintainer: Bradford Beckmann <bradford.beckmann@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>