WriteCleanFull can be requested for the cache line in SD state (e.g.
Local eviction of a cache line in SD_RSC state). In this case, the
requestor is the owner of the cache line,
but it doesn't have it with exclusive right.
Thus, 'ownerIsExcl == false' should be removed from the stale condition.
Change-Id: I4d34021ac31b2e8600c24689a03a3b8fa18aa1f7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58412
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Initiate_CopyBack_Stale removes the requestor from the sharer list.
However, if CBWrData_SC is the data response of stale WriteCleanFull,
the requestor should remain in the sharer list.
Thus, whether to send a Evict or not can be decided after the data
response arrives. For this, FinishCopyBack_Stale event was added as the
last event to handle Evict.
Change-Id: Ic3e3a1e4d74b24b9aa328b2ddfa817db44f24e4e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58413
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
This makes what are configuration and what are internal SCons variables
explicit and separate, and makes it unnecessary to call out what
variables to export to C++.
These variables will also be plumbed into and out of kconfiglib in later
changes.
Change-Id: Iaf5e098d7404af06285c421dbdf8ef4171b3f001
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56892
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Presumably, these are fixed for whatever protocol that gets selected. We
don't need to accumulate includes, we need to set includes to something
in particular. If there is a common include which always needs to be
used, we can handle that in the SConscript separately from
SLICC_INCLUDES.
Change-Id: I996d08566944e38e388dc287f644c40366ebba0d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56754
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
This enhances MOESI_AMD_Base-dir DmaWrite to enable partial writes. This
is currently done by assuming a full cache line, invalidating caches,
and transitioning back to unblocked state. The enhanced write supports
partial writes (i.e., smaller than cache line size) by first reading
memory, merging the modified data, and then writing back to memory.
Implementation of this mirrors that of DmaRead in terms of state. This
means for each DmaRead state (BDR_PM, BDR_Pm, and BDR_M) there is a
write analogue (BDW_PM, BDW_Pm, and BDR_M) and the BDR_P state is
removed. Furthermore, this enhanced DmaWrite ... actually writes data to
memory instead of relying on DirectoryEntry / backing store for correct
data.
There are two possible state transitions for DmaWrite now. (1) Memory
data arrives before probe response and (2) probe response arrives before
memory data. In case (1), probe data overwrites memory data and merges
the partial write using the TBE write mask then updates write mask to
'filled' state. In case (2), probe data is merged with the partial data
using the TBE write mask then updates write mask to 'filled' state. The
memory data will then be clobbered by copying the TBE data over the
response since the write mask is now full.
Change-Id: I1eebb882b464c4c5ee5fd60932fd38d271ada4d7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57410
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This protocol is using an old style where read/writes to memory were
being done by writing to a DataBlock in a DirectoryMemory entry. This
results in having multiple copies of memory, leads to stale copies in at
least one memory (usually DRAM), and require --access-backing-store in
most cases to work properly. This changeset removes all references to
getDirectoryEntry(...).DataBlk and instead forwards those reads and
writes to DRAM always.
Change-Id: If2e52151789ad82c7b55c8fa2b41c1f4e5b65994
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57409
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This protocol is using an old style where read/writes to memory were
being done by writing to a DataBlock in a DirectoryMemory entry. This
results in having multiple copies of memory, leads to stale copies in at
least one memory (usually DRAM), and require --access-backing-store in
most cases to work properly. This changeset removes all references to
getDirectoryEntry(...).DataBlk and instead forwards those reads and
writes to DRAM always.
This results in new transient states BL_WM, BDW_WM, and B_WM which are
blocked states waiting on memory acks indicating a write request is
complete. The appropriate transitions are updates to move to these new
states and stall states are updated to include them. DMA write ACK is
also moved to when the request is sent to memory, rather than when the
request is received.
Change-Id: Ic5bd6a8a8881d7df782e0f7eed8be9d873610e04
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56446
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The directory has an assert that this is at least one destination for a
probe when sending an invalidation or shared probe to coherence end
points in the protocol (TCC, LLC). This is not necessarily request and
for certain configurations there will be no probes required and none
will be sent. One such configuration is the GPU protocol tester which
would not require a probe to the CPU if it does not exist.
To fix this we first collect the probe destinations. Then we check if
any destinations exist. If so, we send the probe message. Otherwise we
immediately enqueue a probe complete message to the trigger queue. This
reorganization prevents messages with no destinations from being
enqueued, meeting the criteria for the assertion.
Change-Id: If016f457cb8c9e0277a910ac2c3f315c25b50ce8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55543
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
JIRA: https://gem5.atlassian.net/browse/GEM5-1185
Fixed an issue in which a CleanUnique responder would incorrectly
deallocate the cache block when handling an stale CU when the state
is UD_RU or UC_RU (thus incorrectly transitioning to RU).
The fix is to handle stale CUs similarly to stale WBs where we
override the dataValid TBE field to prevent the wrong state
transition.
This patch moves the stale code path to a separate transition
(similarly to stale WBs/Evicts) and moves the dataValid override to
Initiate_Request_Stale so it applies to all stale request types.
Notice now the stale field is also set on stale Comp_UC responses.
Additional minor change: CheckUpgrade_FromRU is the same as
CheckUpgrade_FromStore so it was removed.
Change-Id: I0a2cedcfde1dc30d67aa2c16d71b7470369c2b6e
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56810
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
In SimpleNetwork, switches were assigned an index depending on their
position in params().routers. But switches are also referenced by their
router_id parameter in other locations of the ruby network system (e.g.,
src and dst node parameter in links). If the router_id does not match the
position in SimpleNetwork::m_switches, the network initialization might
fail or implement a different topology from what the user intended. This
patch fixes this issue by storing switches in a map instead of a vector.
Change-Id: I398f950ad404efbf9516ea9bbced598970a2bc24
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55723
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Setting the physical_vnets_channels parameter enables the emulation of
the bandwidth impact of having multiple physical channels for each
virtual network. This is implemented by computing bandwidth in a
per-vnet/channel basis within Throttle objects. The size of the
message buffers are also scaled according to this setting (when buffer
are not unlimited).
The physical_vnets_bandwidth can be used to override the channel width
set for each link and assign different widths for each virtual network.
The --simple-physical-channels option can be used with the generic
configuration scripts to automatically assign a single physical channel
to each virtual network defined in the protocol.
JIRA: https://gem5.atlassian.net/browse/GEM5-920
Change-Id: Ia8c9ec8651405eac8710d3f4d67f637a8054a76b
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41854
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
The 'max_dequeue_rate' parameter limits the rate at which messages can
be dequeued in a single cycle. When set, 'isReady' returns false if
after max_dequeue_rate is reached.
This can be used to fine tune the performance of cache controllers.
For the record, other ways of achieving a similar effect could be:
1) Modifying the SLICC compiler to limit message consumption in the
generated wakeup() function
2) Set the buffer size to max_dequeue_rate. This can potentially cut the
the expected throughput in half. For instance if a producer can
enqueue every cycle, and a consumer can dequeue every cycle, a
message can only be actually enqueued every two (assuming
buffer_size=1) since the buffer entries available after dequeue
are only visible in the next cycle (even if the consumer executes
before the producer).
JIRA: https://gem5.atlassian.net/browse/GEM5-920
Change-Id: I3a446c7276b80a0e3f409b4fbab0ab65ff5c1f81
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41862
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Three changes below:
1. The m_stall_time was declared as statistics::Average, but
statistics::Average uses AvgStor as storage and this works as per-tick
average stat. In the case of m_stall_time, Scalar should be used to get
the calculation right.
2. The function used to get an enqueue time was changed since the
getTime() returns the time when the message was created.
3. Record the stall time only when the message is really dequeued
from the buffer (stall time is not evaluated when the message is moved
to stall map).
Change-Id: I090d19828b5c43f0843a8b735d3f00f312c436e9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/54363
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Remove the line "For use for simulation and test purposes only" in files
were AMD is the only copyright holder listed in the header. This happens
to be the case for all files where this line exists, removing it
completely from gem5.
Change-Id: I623f266b002f564301b28774f49081099cfc60fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In order to fix several regression failures [1] the master/slave
terminology in src/cpu/BaseCPU.py was reintroduced [2].
This patch is addressing the issue by providing 2 different
ways of connecting cpu ports:
*) connectBus: The method assumes an object with a bus interface is
passed as an argument, therefore it tries to bind cpu ports to the
bus.mem_side_ports and bus.cpu_side_ports
*) connectAllPorts: No assumption on the port owning device is made.
The method simply accepts ports as arguments which will be directly
connected to the peer cpu ports
This will be used for example by ruby Sequencers
[1]: https://gem5.atlassian.net/browse/GEM5-775
[2]: https://gem5-review.googlesource.com/c/public/gem5/+/34495
Change-Id: I715ab8471621d6e5eb36731d7eaefbedf9663a71
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52584
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
This was conditioned on the TARGET_ISA being x86 because the code it
replaced was, and that was because the x86 interrupts object had an
extra port that didn't appear for other ISAs. This inconsistency is not
present on either side of this connection, and so we don't need it to be
conditional.
We do, however, need to ensure that the port sends a range change even
if it doesn't have any ranges to send, to satisfy the bookkeeping of the
bus on the other side of the connection. We do that in init, like leaf
devices do.
Change-Id: Idec6f6c5e2cf78b113fb238d0edd2c63d6cd2c23
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52109
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Some aspects of the formatting in this file were questionable, like
aligning =s between adjacent lines, although not technically against the
style rules as far as I know.
More strangely though, the whole file used three space indents instead
of the typical four.
Change-Id: I7b60f1978c5b2c60a15296b10d09d5701cf7fa5c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52108
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Currently MOESI_AMD_Base used in VIPER has a CPUonly parameter which
indicates that messages should not try to add GPU SLICC controllers as
destinations. This adds the analogue GPUonly parameter which indicates
that requests should not try to add CPU SLICC controllers.
Also adds an assert to ensure the outgoing message has at least one
destination. This assert would indicate a misconfiguration.
Change-Id: Ibb0affd4606084fca021f0e7c117d4ff8c06d429
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51928
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
The CPUonly variable in MOESI_AMD_Base's Directory indicates that probes
should not be sent to any GPU SLICC controllers as they are not part of
CPU. There is one CPUonly check missing which causes problems in
GPU-only Ruby networks as there is no route to any controllers with that
MachineType. Add a condition to check CPUonly and do nothing in that
case.
Change-Id: I41b6c04feec473e34b04402adfb5978e75b847b6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51927
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
RISC-V atomics carry a atomic functor that needs to be executed in the
cache hierarchy. To implement this in Ruby, we execute the functor in
the hitCallback function. Note that these functions are slightly
different than the atomic functions used in the GPU model and the GPU
coalescer even though they have similar semantics.
This change was tested with RISC-V Linux boot which has a few atomics
and linux boot finishes successfully. Previously, the boot got stuck
after the incorrect atomic operation.
Change-Id: I47a69c05ad9f4267d0220023289116e62b5231be
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51447
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>