Commit Graph

1167 Commits

Author SHA1 Message Date
Matthew Poremba
6a9dfcef52 mem-ruby: Revert 7018c2b34
This reverts commit 7018c2b34e. This
commit needs more work which will take a while. Meanwhile the nightly
tests are broken because of this.

Change-Id: I11d01d50ab3a2d8fd649f1a825911e14815b1ca6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57109
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-02-26 15:19:51 +00:00
Matthew Poremba
1bc23ca966 mem-ruby: Add protocol prints to MOESI_AMD_BASE-dma
Change-Id: I59ed7311a8dc2a06ce1df0027891ba8e24e8a89e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56447
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-02-17 17:03:19 +00:00
Matthew Poremba
7018c2b34e mem-ruby: Remove DirectoryMemory storage in MOESI_AMD_BASE-dir
This protocol is using an old style where read/writes to memory were
being done by writing to a DataBlock in a DirectoryMemory entry. This
results in having multiple copies of memory, leads to stale copies in at
least one memory (usually DRAM), and require --access-backing-store in
most cases to work properly. This changeset removes all references to
getDirectoryEntry(...).DataBlk and instead forwards those reads and
writes to DRAM always.

This results in new transient states BL_WM, BDW_WM, and B_WM which are
blocked states waiting on memory acks indicating a write request is
complete. The appropriate transitions are updates to move to these new
states and stall states are updated to include them. DMA write ACK is
also moved to when the request is sent to memory, rather than when the
request is received.

Change-Id: Ic5bd6a8a8881d7df782e0f7eed8be9d873610e04
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56446
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-02-17 17:03:19 +00:00
Matthew Poremba
54fc137945 mem-ruby: Ensure MOESI_AMD_Base-dir has probe destinations
The directory has an assert that this is at least one destination for a
probe when sending an invalidation or shared probe to coherence end
points in the protocol (TCC, LLC). This is not necessarily request and
for certain configurations there will be no probes required and none
will be sent. One such configuration is the GPU protocol tester which
would not require a probe to the CPU if it does not exist.

To fix this we first collect the probe destinations. Then we check if
any destinations exist. If so, we send the probe message. Otherwise we
immediately enqueue a probe complete message to the trigger queue. This
reorganization prevents messages with no destinations from being
enqueued, meeting the criteria for the assertion.

Change-Id: If016f457cb8c9e0277a910ac2c3f315c25b50ce8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55543
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-02-17 17:03:19 +00:00
Tiago Mück
b354e1a252 mem-ruby: Fix handling of stale CleanUnique
JIRA: https://gem5.atlassian.net/browse/GEM5-1185

Fixed an issue in which a CleanUnique responder would incorrectly
deallocate the cache block when handling an stale CU when the state
is UD_RU or UC_RU (thus incorrectly transitioning to RU).

The fix is to handle stale CUs similarly to stale WBs where we
override the dataValid TBE field to prevent the wrong state
transition.

This patch moves the stale code path to a separate transition
(similarly to stale WBs/Evicts) and moves the dataValid override to
Initiate_Request_Stale so it applies to all stale request types.
Notice now the stale field is also set on stale Comp_UC responses.

Additional minor change: CheckUpgrade_FromRU is the same as
CheckUpgrade_FromStore so it was removed.

Change-Id: I0a2cedcfde1dc30d67aa2c16d71b7470369c2b6e
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56810
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
2022-02-17 15:21:45 +00:00
Giacomo Travaglini
7129e2559e mem-ruby: Fix -Werror=unused-variable from recent ruby patch
One of the recent ruby patches [1] adopted iteration over an
unordered_map via structured binding.  As of now it is not possible to
ignore one of the unpacked variables, and, if unused, a warning might be
triggered by some compilers.

With this patch we are fixing the building error by using range-based
for loops without structured binding

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/55723

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I882158cc2aeccc58d30318f29470505c53baf3e2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56104
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
2022-01-28 09:05:22 +00:00
Gabriel Busnot
8a7fcd340f mem-ruby: Add missing CHI transition SD_RSC + *_Stale->BUSY_BLKD
Related JIRA: https://gem5.atlassian.net/browse/GEM5-1180

Change-Id: Ife83bebcaa48345633fce0a0de08394e30c1a796
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56083
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Tiago Muck <tiago.muck@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-28 07:01:14 +00:00
Gabriel Busnot
748b613c94 mem-ruby: Fix switch storage in SimpleNetwork
In SimpleNetwork, switches were assigned an index depending on their
position in params().routers. But switches are also referenced by their
router_id parameter in other locations of the ruby network system (e.g.,
src and dst node parameter in links). If the router_id does not match the
position in SimpleNetwork::m_switches, the network initialization might
fail or implement a different topology from what the user intended. This
patch fixes this issue by storing switches in a map instead of a vector.

Change-Id: I398f950ad404efbf9516ea9bbced598970a2bc24
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55723
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-26 06:43:27 +00:00
Tiago Muck
85a1d43c10 mem-ruby: additional SimpleNetwork stats
Additional stats allow more detailed monitoring of switch bandwidth
and stalls.

Also cleaned up previous Throttle stats to match new stat API.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I56604f315024f19df5f89c6f6ea1e3aa0ea185ea
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41865
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-25 16:37:46 +00:00
Tiago Mück
9c8f79310f mem-ruby: add priorities in SimpleNetwork routing
Configurations can specify a routing priority for message buffers.
This priority is used by SimpleNetwork when checking for messages
in the routers' input ports. Higher priority ports are always checked
first.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I7e2b35e2cae63086a76def1145f9b4b56220a2ba
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41864
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-25 16:37:46 +00:00
Tiago Mück
b476d7c1d3 mem-ruby: fine tunning SimpleNetwork buffers
If physical_vnets_channels is set we adjust the link buffer sizes and
the max_dequeue_rate in order to achieve the expected maximum throughput
assuming a fully pipelined link, i.e., throughput of 1 msg per cycle
per channel (assuming the channels width matches the protocol
logical message size, otherwise maximum throughput may be smaller).

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: Id99ab745ed54686d8ffcc630d622fb07ac0fc352
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41863
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-25 16:37:46 +00:00
Tiago Mück
986e7b90d3 mem-ruby: int/ext SimpleNetwork routing latency
One now may specify separate routing latencies for internal and
external links using the router's int_routing_latency and
ext_routing_latency, respectively.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I5532668bf23fc61d02b978bfd9479023a6ce2b16
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41861
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-25 16:37:46 +00:00
Tiago Mück
ac278e44f9 mem-ruby: fix SimpleNetwork WeightBased routing
Individual link weights are propagated to the routing algorithms and
WeightBased routing now uses this information to select the output
link when multiple routing options exist.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I86a4deb610a1b94abf745e9ef249961fb52e9800
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41860
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-25 16:37:46 +00:00
Tiago Mück
f748fbe7e1 mem-ruby: refactor SimpleNetwork buffers
This removes the int_link_buffers param from SimpleNetwork. Internal
link buffers are now created as children of SimpleIntLink objects.
This results in a cleaner configuration and simplifies some code in
SimpleNetwork.cc.

setup_buffers is also split between Switch.setup_buffers and
SimpleIntLink.setup_buffers for clarity.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I68ad36ec0e682b8d5600c2950bcb56debe186af3
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41859
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-25 16:37:46 +00:00
Tiago Mück
c3880c2c46 mem-ruby: refactored SimpleNetwork routing
The routing algorithm is encapsulated in a separate SimObject to allow
user to implement different routing strategies. The default
implementation (WeightBased) maintains the original behavior.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I5c8927f358b8b04b2da55e59679c2f629c7cd2f9
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41858
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-24 19:09:26 +00:00
Tiago Mück
286c23da52 mem-ruby: fixed SimpleNetwork starvation
The round-robing scheduling seed is shared across all ports and vnets
in the router and it's possible that, under certain heavy traffic
scenarios, the same port will always fill the input buffers before any
other port is checked.

This patch removes the round-robin scheduling. The port to be checked
first is always the one with the oldest message.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I918694d46faa0abd00ce9180bc98c58a9b5af0b5
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41857
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
2022-01-20 15:26:58 +00:00
Tiago Muck
72185e51b2 mem-ruby: SimpleNetwork router latencies
SimpleNetwork takes into account the network router latency parameter.
The latency may be set to zero. PerfectSwitch and Throttle events were
assigned different priorities to ensure they always execute in the same
order for zero-latency forwarding.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I6cae6a0fc22b25078c27a1e2f71744c08efd7753
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41856
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-20 15:26:58 +00:00
Tiago Muck
43232cdb9f mem-ruby: Optionally set Consumer ev. priority
JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I62dc6656bbed4e7f4d575a6a82ac254382294ed1
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41855
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-20 15:26:58 +00:00
Tiago Muck
bab3ce1661 configs,mem-ruby: SimpleNetwork physical channels
Setting the physical_vnets_channels parameter enables the emulation of
the bandwidth impact of having multiple physical channels for each
virtual network. This is implemented by computing bandwidth in a
per-vnet/channel basis within Throttle objects. The size of the
message buffers are also scaled according to this setting (when buffer
are not unlimited).

The physical_vnets_bandwidth can be used to override the channel width
set for each link and assign different widths for each virtual network.

The --simple-physical-channels option can be used with the generic
configuration scripts to automatically assign a single physical channel
to each virtual network defined in the protocol.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: Ia8c9ec8651405eac8710d3f4d67f637a8054a76b
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41854
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-20 15:26:58 +00:00
Tiago Mück
87cdf354be mem-ruby: dequeue rate limit for message buffers
The 'max_dequeue_rate' parameter limits the rate at which messages can
be dequeued in a single cycle. When set, 'isReady' returns false if
after max_dequeue_rate is reached.

This can be used to fine tune the performance of cache controllers.

For the record, other ways of achieving a similar effect could be:
1) Modifying the SLICC compiler to limit message consumption in the
   generated wakeup() function
2) Set the buffer size to max_dequeue_rate. This can potentially cut the
   the expected throughput in half. For instance if a producer can
   enqueue every cycle, and a consumer can dequeue every cycle, a
   message can only be actually enqueued every two (assuming
   buffer_size=1) since the buffer entries available after dequeue
   are only visible in the next cycle (even if the consumer executes
   before the producer).

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I3a446c7276b80a0e3f409b4fbab0ab65ff5c1f81
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41862
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-20 15:26:58 +00:00
Daecheol You
ec58d9d7f3 mem-ruby: Fix message stall time calculation
Three changes below:
1. The m_stall_time was declared as statistics::Average, but
statistics::Average uses AvgStor as storage and this works as per-tick
average stat. In the case of m_stall_time, Scalar should be used to get
the calculation right.

2. The function used to get an enqueue time was changed since the
getTime() returns the time when the message was created.

3. Record the stall time only when the message is really dequeued
from the buffer (stall time is not evaluated when the message is moved
to stall map).

Change-Id: I090d19828b5c43f0843a8b735d3f00f312c436e9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/54363
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-03 02:18:41 +00:00
Matthew Poremba
9313294efe misc: Remove AMD license addition
Remove the line "For use for simulation and test purposes only" in files
were AMD is the only copyright holder listed in the header. This happens
to be the case for all files where this line exists, removing it
completely from gem5.

Change-Id: I623f266b002f564301b28774f49081099cfc60fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 04:00:56 +00:00
Gabe Black
1c233ee9d2 scons: Add sim_object and enums arguments to SimObject().
This will explicitly declare what SimObject and Enum types need to be set
up in C++, which will make importing all the SimObject modules during
the setup phase of SCons uneccessary.

Change-Id: Id2d7603daf33b236ceaa0789e2f089f589d34e62
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49406
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-08 08:01:23 +00:00
Matthew Poremba
e0d62e510d configs,mem-ruby: Remove reference to old GPU ptls
GPU_VIPER_Baseline, GPU_VIPER_Region, and GPU_RfO were removed some time
ago.

Change-Id: If873b0cfe8cc2b3096cbe97d4e13a8e02d2ec567
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53703
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-07 20:26:17 +00:00
Giacomo Travaglini
de7337a32a misc: Replace master/slave terminology from BaseCPU.py
In order to fix several regression failures [1] the master/slave
terminology in src/cpu/BaseCPU.py was reintroduced [2].

This patch is addressing the issue by providing 2 different
ways of connecting cpu ports:

*) connectBus: The method assumes an object with a bus interface is
passed as an argument, therefore it tries to bind cpu ports to the
bus.mem_side_ports and bus.cpu_side_ports

*) connectAllPorts: No assumption on the port owning device is made.
The method simply accepts ports as arguments which will be directly
connected to the peer cpu ports
This will be used for example by ruby Sequencers

[1]: https://gem5.atlassian.net/browse/GEM5-775
[2]: https://gem5-review.googlesource.com/c/public/gem5/+/34495

Change-Id: I715ab8471621d6e5eb36731d7eaefbedf9663a71
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52584
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2021-11-16 18:17:47 +00:00
Gabe Black
c02abad641 mem-ruby: Don't conditionalize setting RubySequencer's pio_response_port
This was conditioned on the TARGET_ISA being x86 because the code it
replaced was, and that was because the x86 interrupts object had an
extra port that didn't appear for other ISAs. This inconsistency is not
present on either side of this connection, and so we don't need it to be
conditional.

We do, however, need to ensure that the port sends a range change even
if it doesn't have any ranges to send, to satisfy the bookkeeping of the
bus on the other side of the connection. We do that in init, like leaf
devices do.

Change-Id: Idec6f6c5e2cf78b113fb238d0edd2c63d6cd2c23
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52109
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-29 02:20:36 +00:00
Gabe Black
9309863322 mem: Fix whitespace in mem/ruby/system/Sequencer.py.
Some aspects of the formatting in this file were questionable, like
aligning =s between adjacent lines, although not technically against the
style rules as far as I know.

More strangely though, the whole file used three space indents instead
of the typical four.

Change-Id: I7b60f1978c5b2c60a15296b10d09d5701cf7fa5c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52108
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-27 23:22:13 +00:00
Matthew Poremba
c5ba40cfe1 mem-ruby: Add GPUonly parameter for VIPER
Currently MOESI_AMD_Base used in VIPER has a CPUonly parameter which
indicates that messages should not try to add GPU SLICC controllers as
destinations. This adds the analogue GPUonly parameter which indicates
that requests should not try to add CPU SLICC controllers.

Also adds an assert to ensure the outgoing message has at least one
destination. This assert would indicate a misconfiguration.

Change-Id: Ibb0affd4606084fca021f0e7c117d4ff8c06d429
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51928
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2021-10-26 15:52:11 +00:00
Matthew Poremba
55fdf4be52 mem-ruby: Add missing CPUonly check for VIPER
The CPUonly variable in MOESI_AMD_Base's Directory indicates that probes
should not be sent to any GPU SLICC controllers as they are not part of
CPU. There is one CPUonly check missing which causes problems in
GPU-only Ruby networks as there is no route to any controllers with that
MachineType. Add a condition to check CPUonly and do nothing in that
case.

Change-Id: I41b6c04feec473e34b04402adfb5978e75b847b6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51927
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-26 15:52:11 +00:00
Jason Lowe-Power
3e32fd3b33 mem-ruby: Add RISC-V atomic support to Ruby
RISC-V atomics carry a atomic functor that needs to be executed in the
cache hierarchy. To implement this in Ruby, we execute the functor in
the hitCallback function. Note that these functions are slightly
different than the atomic functions used in the GPU model and the GPU
coalescer even though they have similar semantics.

This change was tested with RISC-V Linux boot which has a few atomics
and linux boot finishes successfully. Previously, the boot got stuck
after the incorrect atomic operation.

Change-Id: I47a69c05ad9f4267d0220023289116e62b5231be
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51447
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2021-10-21 01:33:34 +00:00
Matt Sinclair
118677218d mem-ruby: fix typo in GPU VIPER TCC comment
72ee6d1a fixed a deadlock in the GPU VIPER TCC.  However, it
inadvertently added a typo to the comments explaining the change.  This
commit fixes that.

Change-Id: Ibba835aa907be33fc3dd8e576ad2901d5f8f509c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51687
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-17 04:07:49 +00:00
Giacomo Travaglini
7260394d4b mem: Make ruby AbstractController compatible with XBar
At the moment the ruby AbstractController is trying to re-send the same
memory request every clock cycle until it finally succeeds [1]
(in other words it is not waiting for a recvReqRetry from the peer
port)

This polling behaviour is not compatible with the gem5 XBar, which is
panicking if it receives two consecutive requests to the same BUSY
layer [2]

This patch is fixing the incompatibility by inhibiting the
AbstractController retry until it gets a notification from the peer
response port

[1]: https://github.com/gem5/gem5/blob/v21.1.0.1/\
    src/mem/ruby/slicc_interface/AbstractController.cc#L303
[2]: https://github.com/gem5/gem5/blob/v21.1.0.1/src/mem/xbar.cc#L196

Change-Id: I0ac38ce286051fb714844de569c2ebf85e71a523
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50367
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-13 08:45:25 +00:00
Giacomo Travaglini
4fdf61493b mem-ruby: HTMSequencer stats initialized twice
HTMSequencer stats are already initialized in the constructor

This is a bug from:

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/36478

Change-Id: Id7d9b11f45035a46af32584ed86470c65d2a80b6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51407
Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-12 17:58:19 +00:00
Matt Sinclair
1120931105 mem-ruby: Move VIPER TCC decrements to action from in_port
Currently, the GPU VIPER TCC protocol handles races between atomics in
the triggerQueue_in.  This in_port does not check for resource
availability, which can cause the trigger queue to execute multiple
times.  Although this is the expected behavior, the code for handling
atomic races decrements the atomicDoneCnt flag in the trigger queue,
which is not safe since resource contention may cause it to execute
multiple times.

To resolve this issue, this commit moves the decrementing of this
counter to a new action that is called in an event that happens only
when the race between atomics is detected.

Change-Id: I552fd4f34fdd9ebeec99fb7aeb4eeb7b150f577f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51368
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-08 22:03:13 +00:00
Matt Sinclair
72ee6d1aad mem-ruby: Update GPU VIPER TCC protocol to resolve deadlock
In the GPU VIPER TCC, programs with mixes of atomics and data
accesses to the same address, in the same kernel, can experience
deadlock when large applications (e.g., Pannotia's graph analytics
algorithms) are running on very small GPUs (e.g., the default 4 CU GPU
configuration).  In this situation, deadlocks occur due to resource
stalls interacting with the behavior of the current implementation for
handling races between atomic accesses.  The specific order of events
causing this deadlock are:

1. TCC is waiting on an atomic to return from directory

2. In the meantime it receives another atomic to the same address -- when
this happens, the TCC increments number of atomics to this address
(numAtomics = 2) that are pending in TBE, and does a write through of the
atomic to the directory.

3. When the first atomic returns from the Directory, it decrements the
numAtomics counter.  numAtomics was at 2 though, because of step #2.  So
it doesn't deallocate the TBE entry and calls Event:AtomicNotDone.

4. Another request (a LD) to the same address comes along for the same
address.  The LD does z_stall since the second atomic is pending –- so the
LD retries every cycle until the deadlock counter times out (or until the
second atomic comes back).

5.  The second atomic returns to the TCC.  However, because there are so
many LD's pending in the cache, all doing z_stall's and retrying every cycle,
there are a lot of resource stalls.  So, when the second atomic returns, it is
forced to retry its operation multiple times -- and each time it decrements
the atomicDoneCnt flag (which was added to catch a race between atomics
arriving and leaving the TCC in 7246f70bfb) repeatedly.  As a result
atomicDoneCnt becomes negative.

6.  Since this atomicDoneCnt flag is used to determine when Event:AtomicDone
happens, and since the resource stalls caused the atomicDoneCnt flag to become
negative, we never complete the atomic.  Which means the pending LD can never
access the line, because it's stuck waiting for the atomic to complete.

7.  Eventually the deadlock threshold is reached.

To fix this issue, this commit changes the VIPER TCC protocol from using
z_stall to using the stall_and_wait buffer method that the
Directory-level of the SLICC already uses.  This change effectively
prevents resource stalls from dominating the TCC level, by putting
pending requests for a given address in a per-address stall buffer.
These requests are then woken up when the pending request returns.

As part of this change, this change also makes two small changes to the
Directory-level protocol (MOESI_AMD_BASE-dir):

1.  Updated the names of the wakeup actions to match the TCC wakeup actions,
to avoid confusion.

2.  Changed transition(B, UnblockWriteThrough, U) to check all stall buffers,
as some requests were being placed later in the stall buffer than was
being checked.  This mirrors the changes in 187c44fe44 to other Directory
transitions to resolve races between GPU and DMA requests, but for
transitions prior workloads did not stress.

Change-Id: I60ac9830a87c125e9ac49515a7fc7731a65723c2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51367
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-08 22:03:13 +00:00
Gabe Black
13725927a0 mem-ruby: Replace the sys param with a page_shift param.
This parameter defaults to a shift which corresponds to a 4K page.

Change-Id: I259081a75cd6e7286d65f1e7dcdc657404397426
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50351
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2021-09-30 00:31:18 +00:00
Gabe Black
00187b7bc3 x86,mem: Replace the x86 StoreCheck flag with READ_MODIFY_WRITE.
X86 had a private/arch specific request flag called StoreCheck which it
used to signal to the TLB that it should fault on a load if it would
have faulted had it been a store. That way, you can detect whether a
read-modify-write type of operation is going to fail due to a
translation problem during the read, and don't have to worry about not
doing anything architecturally visible until the store had succeeded,
while also making sure not to do the store part if the modify part
could fail.

It seems that Ruby had hijacked that flag and had an architecture
specific check which was looking for a load which was going to be
followed by a store. The x86 flag was never intended to communicate that
beyond the TLB, and this nominally architecture agnostic component
shouldn't be reaching into the ISA specific flags to try to get that
information.

Instead, this change introduces a new Request flag called
READ_MODIFY_WRITE which is used for the same purpose in x86, but in
general means that a load will be followed by a write in the near
future.

With this new globally applicable flag, the ruby Sequencer class no
longer needs to check what the arch is, nor does it need to access ISA
private data in the request flags. Always doing this check should be no
less efficient than before, because checking the arch involved calling
into the system object, while checking the flag only requires masking a
bit on the flags which the compiler probably already has floating around
for other logic in this function.

Change-Id: Ied5b744d31e7aa8bf25e399b6b321f9d2020a92f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48710
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-09-05 05:29:27 +00:00
Daecheol You
82db312550 mem-ruby: Add (RUSC, LocalHN_Eviction) transition
During full system simulation on CHI, LocalHN_Eviction event on the RUSC
state occured occasionally. Thus, the change adds RUSC state to the transition.

Change-Id: Ibff382c38a092895bc03a4a64cf072ae752decf3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49263
Reviewed-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-08-24 00:17:32 +00:00
Daecheol You
8e00f8e582 mem-ruby: Atomic transaction support for CHI protocol
Ruby assumes protocols use directory controllers as memory interface.
Thus, recvAtomic() uses the machine type of directory when it calls
mapAddressToMachine(). However, it doesn't work for CHI since
CHI does not use directory controllers as memory controller interface.
Therefore, the code was modified to check which controller type is used
for memory interface between MachineType_Directory and
MachineType_Memory, which is used for CHI.

Change-Id: If35a06a8a3772ce5e5b994df05c9d94c7770c90d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48403
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-08-05 00:29:34 +00:00
Gabe Black
d52db719cd scons: Delete the unused do_embed_text function.
Change-Id: I2ad37c9965e7a58e288711f0fa5bb1858f121c05
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48968
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-08-03 07:27:40 +00:00
Gabe Black
00876fff20 misc: Replace the GEM5_VAR_USED macro with [[maybe_unused]].
The [[maybe_unused]] attribute is now standard, so we can use that
directly without hiding it behind a macro.

Change-Id: If24ffd7e50bdb503cb3e6ea61f226ea794e84b8f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48511
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-07-29 10:17:51 +00:00
Kyle Roarty
1415308d10 mem-ruby: Account for misaligned accesses in GPUCoalescer
Previously, we assumed that the maximum number of requests that would be
issued by an instruction was equal to the number of threads that were
active for that instruction.

However, if a thread has an access that crosses a cache line, that
thread has a misaligned access, and needs to request both cache lines.

This patch takes that into account by checking the status vector for
each thread in that instruction to determine the number of requests.

Change-Id: I1994962c46d504b48654dbd22bcd786c9f382fd9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48341
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2021-07-24 17:27:02 +00:00
Daniel R. Carvalho
79bab1dc5d mem: Adopt a memory namespace for memories
Encapsulate every class inheriting from Abstract or Physical
memories, and the memory controller in a memory namespace.

Change-Id: I228f7e55efc395089e3616ae0a0a6325867bd782
Issued-on: https://gem5.atlassian.net/browse/GEM5-983
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47309
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2021-07-09 11:24:10 +00:00
Daniel R. Carvalho
60e4ad955d mem-ruby: Add a ruby namespace
Encapsulate all ruby-related files in a ruby namespace.

Change-Id: If642c9751ecefc35b45c5dd69d85e67813cc5224
Issued-on: https://gem5.atlassian.net/browse/GEM5-984
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47307
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-07-07 23:18:59 +00:00
Daniel R. Carvalho
4b2118ed4b misc: Remove sim/cur_tick dependency from sim/core.hh
Remove this unnecessary dependency. Fixed all incorrect
includes of sim/core.hh.

Change-Id: I3ae282dbaeb45fbf4630237a3ab9b1a593ffbe0c
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/43592
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-07-06 09:59:11 +00:00
Daniel R. Carvalho
00cd307b13 mem-garnet: Add a garnet namespace
Add a namespace encapsulating all garnet files.

GarnetSyntheticTraffic, from
cpu/testers/garnet_synthetic_traffic/GarnetSyntheticTraffic.hh
has not been added to this namespace.

Change-Id: I5304ad3130100ba325e35e20883ee9286f51a75a
Issued-on: https://gem5.atlassian.net/browse/GEM5-987
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47306
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Srikant Bharadwaj <srikant.bharadwaj@amd.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Srikant Bharadwaj <srikant.bharadwaj@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2021-07-01 19:08:24 +00:00
Daniel R. Carvalho
974a47dfb9 misc: Adopt the gem5 namespace
Apply the gem5 namespace to the codebase.

Some anonymous namespaces could theoretically be removed,
but since this change's main goal was to keep conflicts
at a minimum, it was decided not to modify much the
general shape of the files.

A few missing comments of the form "// namespace X" that
occurred before the newly added "} // namespace gem5"
have been added for consistency.

std out should not be included in the gem5 namespace, so
they weren't.

ProtoMessage has not been included in the gem5 namespace,
since I'm not familiar with how proto works.

Regarding the SystemC files, although they belong to gem5,
they actually perform integration between gem5 and SystemC;
therefore, it deserved its own separate namespace.

Files that are automatically generated have been included
in the gem5 namespace.

The .isa files currently are limited to a single namespace.
This limitation should be later removed to make it easier
to accomodate a better API.

Regarding the files in util, gem5:: was prepended where
suitable. Notice that this patch was tested as much as
possible given that most of these were already not
previously compiling.

Change-Id: Ia53d404ec79c46edaa98f654e23bc3b0e179fe2d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-07-01 19:08:24 +00:00
Carlos Falquez
6d07200693 mem-ruby: Add (BUSY_BLKD,SnpOnceFwd) transition
Add (BUSY_BLKD,SnpOnceFwd) cache transition
to the Ruby CHI protocol.

Change-Id: I150880b26dee869b48cfd16fb661b9487527a8cd
Signed-off-by: Carlos Falquez <c.falquez@fz-juelich.de>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46901
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Tiago Mück <tiago.muck@arm.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-06-29 02:13:54 +00:00
Matthew Poremba
c493d2c4ad sim,mem-ruby: Handle interleaved device memory
Device memories are used for PCI devices which have their own pools of
backing store memory such as amdgpu device. The check for an address
being in device memory previously did not handle multiple interleaved
memory devices with the same address range. Therefore, the device memory
check would fail if the interleaving masks did not match. This updates
the method to iterate through all device memories that handle the
RequestorID and returns true if any of the device memories contain the
packet address.

Change-Id: I9339d39c1cb54a5b9075c4a122c118fe61dc6fdb
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46381
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-06-14 15:48:51 +00:00
Matthew Poremba
ca12a8997d mem-ruby,sim: Add support for VGA ROM memory region
Checks if the address is in a shadowed region, and sends the request
to pio to be serviced by the device backing up that range.

Based on: https://gem5-review.googlesource.com/c/amd/gem5/+/23484

Change-Id: I4d5b46cccd6203523008b2e9545d55eb62130964
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46159
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-06-11 17:10:32 +00:00