derek/gem5

Author	SHA1	Message	Date
Matthew Poremba	6a9dfcef52	mem-ruby: Revert `7018c2b34` This reverts commit `7018c2b34e`. This commit needs more work which will take a while. Meanwhile the nightly tests are broken because of this. Change-Id: I11d01d50ab3a2d8fd649f1a825911e14815b1ca6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57109 Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2022-02-26 15:19:51 +00:00
Matthew Poremba	1bc23ca966	mem-ruby: Add protocol prints to MOESI_AMD_BASE-dma Change-Id: I59ed7311a8dc2a06ce1df0027891ba8e24e8a89e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56447 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-02-17 17:03:19 +00:00
Matthew Poremba	7018c2b34e	mem-ruby: Remove DirectoryMemory storage in MOESI_AMD_BASE-dir This protocol is using an old style where reads/writes to memory were being done by writing to a DataBlock in a DirectoryMemory entry. This results in having multiple copies of memory, leads to stale copies in at least one memory (usually DRAM), and requires --access-backing-store in most cases to work properly. This changeset removes all references to getDirectoryEntry(...).DataBlk and instead forwards those reads and writes to DRAM always. This results in new transient states BL_WM, BDW_WM, and B_WM which are blocked states waiting on memory acks indicating a write request is complete. The appropriate transitions are updated to move to these new states and stall states are updated to include them. DMA write ACK is also moved to when the request is sent to memory, rather than when the request is received. Change-Id: Ic5bd6a8a8881d7df782e0f7eed8be9d873610e04 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56446 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-02-17 17:03:19 +00:00
Matthew Poremba	54fc137945	mem-ruby: Ensure MOESI_AMD_Base-dir has probe destinations The directory has an assert that there is at least one destination for a probe when sending an invalidation or shared probe to coherence end points in the protocol (TCC, LLC). This is not necessarily required and for certain configurations there will be no probes required and none will be sent. One such configuration is the GPU protocol tester which would not require a probe to the CPU if it does not exist. To fix this we first collect the probe destinations. Then we check if any destinations exist. If so, we send the probe message. Otherwise we immediately enqueue a probe complete message to the trigger queue. This reorganization prevents messages with no destinations from being enqueued, meeting the criteria for the assertion. Change-Id: If016f457cb8c9e0277a910ac2c3f315c25b50ce8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55543 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-02-17 17:03:19 +00:00
Tiago Mück	b354e1a252	mem-ruby: Fix handling of stale CleanUnique JIRA: https://gem5.atlassian.net/browse/GEM5-1185 Fixed an issue in which a CleanUnique responder would incorrectly deallocate the cache block when handling a stale CU when the state is UD_RU or UC_RU (thus incorrectly transitioning to RU). The fix is to handle stale CUs similarly to stale WBs where we override the dataValid TBE field to prevent the wrong state transition. This patch moves the stale code path to a separate transition (similarly to stale WBs/Evicts) and moves the dataValid override to Initiate_Request_Stale so it applies to all stale request types. Notice now the stale field is also set on stale Comp_UC responses. Additional minor change: CheckUpgrade_FromRU is the same as CheckUpgrade_FromStore so it was removed. Change-Id: I0a2cedcfde1dc30d67aa2c16d71b7470369c2b6e Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56810 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>	2022-02-17 15:21:45 +00:00
Giacomo Travaglini	7129e2559e	mem-ruby: Fix -Werror=unused-variable from recent ruby patch One of the recent ruby patches [1] adopted iteration over an unordered_map via structured binding. As of now it is not possible to ignore one of the unpacked variables, and, if unused, a warning might be triggered by some compilers. With this patch we are fixing the building error by using range-based for loops without structured binding [1]: https://gem5-review.googlesource.com/c/public/gem5/+/55723 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Change-Id: I882158cc2aeccc58d30318f29470505c53baf3e2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56104 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>	2022-01-28 09:05:22 +00:00
Gabriel Busnot	8a7fcd340f	mem-ruby: Add missing CHI transition SD_RSC + *_Stale->BUSY_BLKD Related JIRA: https://gem5.atlassian.net/browse/GEM5-1180 Change-Id: Ife83bebcaa48345633fce0a0de08394e30c1a796 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56083 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Tiago Muck <tiago.muck@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-28 07:01:14 +00:00
Gabriel Busnot	748b613c94	mem-ruby: Fix switch storage in SimpleNetwork In SimpleNetwork, switches were assigned an index depending on their position in params().routers. But switches are also referenced by their router_id parameter in other locations of the ruby network system (e.g., src and dst node parameter in links). If the router_id does not match the position in SimpleNetwork::m_switches, the network initialization might fail or implement a different topology from what the user intended. This patch fixes this issue by storing switches in a map instead of a vector. Change-Id: I398f950ad404efbf9516ea9bbced598970a2bc24 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55723 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-26 06:43:27 +00:00
Tiago Muck	85a1d43c10	mem-ruby: additional SimpleNetwork stats Additional stats allow more detailed monitoring of switch bandwidth and stalls. Also cleaned up previous Throttle stats to match new stat API. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I56604f315024f19df5f89c6f6ea1e3aa0ea185ea Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41865 Reviewed-by: Meatboy 106 <garbage2collector@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-25 16:37:46 +00:00
Tiago Mück	9c8f79310f	mem-ruby: add priorities in SimpleNetwork routing Configurations can specify a routing priority for message buffers. This priority is used by SimpleNetwork when checking for messages in the routers' input ports. Higher priority ports are always checked first. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I7e2b35e2cae63086a76def1145f9b4b56220a2ba Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41864 Reviewed-by: Meatboy 106 <garbage2collector@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-25 16:37:46 +00:00
Tiago Mück	b476d7c1d3	mem-ruby: fine tunning SimpleNetwork buffers If physical_vnets_channels is set we adjust the link buffer sizes and the max_dequeue_rate in order to achieve the expected maximum throughput assuming a fully pipelined link, i.e., throughput of 1 msg per cycle per channel (assuming the channels width matches the protocol logical message size, otherwise maximum throughput may be smaller). JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: Id99ab745ed54686d8ffcc630d622fb07ac0fc352 Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41863 Reviewed-by: Meatboy 106 <garbage2collector@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-25 16:37:46 +00:00
Tiago Mück	986e7b90d3	mem-ruby: int/ext SimpleNetwork routing latency One now may specify separate routing latencies for internal and external links using the router's int_routing_latency and ext_routing_latency, respectively. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I5532668bf23fc61d02b978bfd9479023a6ce2b16 Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41861 Reviewed-by: Meatboy 106 <garbage2collector@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-25 16:37:46 +00:00
Tiago Mück	ac278e44f9	mem-ruby: fix SimpleNetwork WeightBased routing Individual link weights are propagated to the routing algorithms and WeightBased routing now uses this information to select the output link when multiple routing options exist. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I86a4deb610a1b94abf745e9ef249961fb52e9800 Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41860 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-25 16:37:46 +00:00
Tiago Mück	f748fbe7e1	mem-ruby: refactor SimpleNetwork buffers This removes the int_link_buffers param from SimpleNetwork. Internal link buffers are now created as children of SimpleIntLink objects. This results in a cleaner configuration and simplifies some code in SimpleNetwork.cc. setup_buffers is also split between Switch.setup_buffers and SimpleIntLink.setup_buffers for clarity. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I68ad36ec0e682b8d5600c2950bcb56debe186af3 Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41859 Reviewed-by: Meatboy 106 <garbage2collector@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-25 16:37:46 +00:00
Tiago Mück	c3880c2c46	mem-ruby: refactored SimpleNetwork routing The routing algorithm is encapsulated in a separate SimObject to allow user to implement different routing strategies. The default implementation (WeightBased) maintains the original behavior. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I5c8927f358b8b04b2da55e59679c2f629c7cd2f9 Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41858 Reviewed-by: Meatboy 106 <garbage2collector@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-24 19:09:26 +00:00
Tiago Mück	286c23da52	mem-ruby: fixed SimpleNetwork starvation The round-robin scheduling seed is shared across all ports and vnets in the router and it's possible that, under certain heavy traffic scenarios, the same port will always fill the input buffers before any other port is checked. This patch removes the round-robin scheduling. The port to be checked first is always the one with the oldest message. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I918694d46faa0abd00ce9180bc98c58a9b5af0b5 Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41857 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>	2022-01-20 15:26:58 +00:00
Tiago Muck	72185e51b2	mem-ruby: SimpleNetwork router latencies SimpleNetwork takes into account the network router latency parameter. The latency may be set to zero. PerfectSwitch and Throttle events were assigned different priorities to ensure they always execute in the same order for zero-latency forwarding. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I6cae6a0fc22b25078c27a1e2f71744c08efd7753 Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41856 Reviewed-by: Meatboy 106 <garbage2collector@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-20 15:26:58 +00:00
Tiago Muck	43232cdb9f	mem-ruby: Optionally set Consumer ev. priority JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I62dc6656bbed4e7f4d575a6a82ac254382294ed1 Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41855 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Meatboy 106 <garbage2collector@gmail.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-20 15:26:58 +00:00
Tiago Muck	bab3ce1661	configs,mem-ruby: SimpleNetwork physical channels Setting the physical_vnets_channels parameter enables the emulation of the bandwidth impact of having multiple physical channels for each virtual network. This is implemented by computing bandwidth in a per-vnet/channel basis within Throttle objects. The size of the message buffers are also scaled according to this setting (when buffer are not unlimited). The physical_vnets_bandwidth can be used to override the channel width set for each link and assign different widths for each virtual network. The --simple-physical-channels option can be used with the generic configuration scripts to automatically assign a single physical channel to each virtual network defined in the protocol. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: Ia8c9ec8651405eac8710d3f4d67f637a8054a76b Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41854 Reviewed-by: Meatboy 106 <garbage2collector@gmail.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-20 15:26:58 +00:00
Tiago Mück	87cdf354be	mem-ruby: dequeue rate limit for message buffers The 'max_dequeue_rate' parameter limits the number of messages that can be dequeued in a single cycle. When set, 'isReady' returns false after max_dequeue_rate is reached. This can be used to fine tune the performance of cache controllers. For the record, other ways of achieving a similar effect could be: 1) Modifying the SLICC compiler to limit message consumption in the generated wakeup() function 2) Set the buffer size to max_dequeue_rate. This can potentially cut the expected throughput in half. For instance if a producer can enqueue every cycle, and a consumer can dequeue every cycle, a message can only be actually enqueued every two cycles (assuming buffer_size=1) since the buffer entries available after dequeue are only visible in the next cycle (even if the consumer executes before the producer). JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I3a446c7276b80a0e3f409b4fbab0ab65ff5c1f81 Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41862 Reviewed-by: Meatboy 106 <garbage2collector@gmail.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-20 15:26:58 +00:00
Daecheol You	ec58d9d7f3	mem-ruby: Fix message stall time calculation Three changes below: 1. The m_stall_time was declared as statistics::Average, but statistics::Average uses AvgStor as storage and this works as per-tick average stat. In the case of m_stall_time, Scalar should be used to get the calculation right. 2. The function used to get an enqueue time was changed since the getTime() returns the time when the message was created. 3. Record the stall time only when the message is really dequeued from the buffer (stall time is not evaluated when the message is moved to stall map). Change-Id: I090d19828b5c43f0843a8b735d3f00f312c436e9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/54363 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-01-03 02:18:41 +00:00
Matthew Poremba	9313294efe	misc: Remove AMD license addition Remove the line "For use for simulation and test purposes only" in files where AMD is the only copyright holder listed in the header. This happens to be the case for all files where this line exists, removing it completely from gem5. Change-Id: I623f266b002f564301b28774f49081099cfc60fd Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-12-11 04:00:56 +00:00
Gabe Black	1c233ee9d2	scons: Add sim_object and enums arguments to SimObject(). This will explicitly declare what SimObject and Enum types need to be set up in C++, which will make importing all the SimObject modules during the setup phase of SCons unnecessary. Change-Id: Id2d7603daf33b236ceaa0789e2f089f589d34e62 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49406 Reviewed-by: Gabe Black <gabe.black@gmail.com> Maintainer: Gabe Black <gabe.black@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-12-08 08:01:23 +00:00
Matthew Poremba	e0d62e510d	configs,mem-ruby: Remove reference to old GPU ptls GPU_VIPER_Baseline, GPU_VIPER_Region, and GPU_RfO were removed some time ago. Change-Id: If873b0cfe8cc2b3096cbe97d4e13a8e02d2ec567 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53703 Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-12-07 20:26:17 +00:00
Giacomo Travaglini	de7337a32a	misc: Replace master/slave terminology from BaseCPU.py In order to fix several regression failures [1] the master/slave terminology in src/cpu/BaseCPU.py was reintroduced [2]. This patch is addressing the issue by providing 2 different ways of connecting cpu ports: ) connectBus: The method assumes an object with a bus interface is passed as an argument, therefore it tries to bind cpu ports to the bus.mem_side_ports and bus.cpu_side_ports ) connectAllPorts: No assumption on the port owning device is made. The method simply accepts ports as arguments which will be directly connected to the peer cpu ports This will be used for example by ruby Sequencers [1]: https://gem5.atlassian.net/browse/GEM5-775 [2]: https://gem5-review.googlesource.com/c/public/gem5/+/34495 Change-Id: I715ab8471621d6e5eb36731d7eaefbedf9663a71 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52584 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2021-11-16 18:17:47 +00:00
Gabe Black	c02abad641	mem-ruby: Don't conditionalize setting RubySequencer's pio_response_port This was conditioned on the TARGET_ISA being x86 because the code it replaced was, and that was because the x86 interrupts object had an extra port that didn't appear for other ISAs. This inconsistency is not present on either side of this connection, and so we don't need it to be conditional. We do, however, need to ensure that the port sends a range change even if it doesn't have any ranges to send, to satisfy the bookkeeping of the bus on the other side of the connection. We do that in init, like leaf devices do. Change-Id: Idec6f6c5e2cf78b113fb238d0edd2c63d6cd2c23 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52109 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-10-29 02:20:36 +00:00
Gabe Black	9309863322	mem: Fix whitespace in mem/ruby/system/Sequencer.py. Some aspects of the formatting in this file were questionable, like aligning =s between adjacent lines, although not technically against the style rules as far as I know. More strangely though, the whole file used three space indents instead of the typical four. Change-Id: I7b60f1978c5b2c60a15296b10d09d5701cf7fa5c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52108 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-10-27 23:22:13 +00:00
Matthew Poremba	c5ba40cfe1	mem-ruby: Add GPUonly parameter for VIPER Currently MOESI_AMD_Base used in VIPER has a CPUonly parameter which indicates that messages should not try to add GPU SLICC controllers as destinations. This adds the analogue GPUonly parameter which indicates that requests should not try to add CPU SLICC controllers. Also adds an assert to ensure the outgoing message has at least one destination. This assert would indicate a misconfiguration. Change-Id: Ibb0affd4606084fca021f0e7c117d4ff8c06d429 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51928 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2021-10-26 15:52:11 +00:00
Matthew Poremba	55fdf4be52	mem-ruby: Add missing CPUonly check for VIPER The CPUonly variable in MOESI_AMD_Base's Directory indicates that probes should not be sent to any GPU SLICC controllers as they are not part of CPU. There is one CPUonly check missing which causes problems in GPU-only Ruby networks as there is no route to any controllers with that MachineType. Add a condition to check CPUonly and do nothing in that case. Change-Id: I41b6c04feec473e34b04402adfb5978e75b847b6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51927 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-10-26 15:52:11 +00:00
Jason Lowe-Power	3e32fd3b33	mem-ruby: Add RISC-V atomic support to Ruby RISC-V atomics carry an atomic functor that needs to be executed in the cache hierarchy. To implement this in Ruby, we execute the functor in the hitCallback function. Note that these functions are slightly different than the atomic functions used in the GPU model and the GPU coalescer even though they have similar semantics. This change was tested with RISC-V Linux boot which has a few atomics and Linux boot finishes successfully. Previously, the boot got stuck after the incorrect atomic operation. Change-Id: I47a69c05ad9f4267d0220023289116e62b5231be Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51447 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2021-10-21 01:33:34 +00:00
Matt Sinclair	118677218d	mem-ruby: fix typo in GPU VIPER TCC comment `72ee6d1a` fixed a deadlock in the GPU VIPER TCC. However, it inadvertently added a typo to the comments explaining the change. This commit fixes that. Change-Id: Ibba835aa907be33fc3dd8e576ad2901d5f8f509c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51687 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-10-17 04:07:49 +00:00
Giacomo Travaglini	7260394d4b	mem: Make ruby AbstractController compatible with XBar At the moment the ruby AbstractController is trying to re-send the same memory request every clock cycle until it finally succeeds [1] (in other words it is not waiting for a recvReqRetry from the peer port). This polling behaviour is not compatible with the gem5 XBar, which is panicking if it receives two consecutive requests to the same BUSY layer [2]. This patch is fixing the incompatibility by inhibiting the AbstractController retry until it gets a notification from the peer response port [1]: https://github.com/gem5/gem5/blob/v21.1.0.1/src/mem/ruby/slicc_interface/AbstractController.cc#L303 [2]: https://github.com/gem5/gem5/blob/v21.1.0.1/src/mem/xbar.cc#L196 Change-Id: I0ac38ce286051fb714844de569c2ebf85e71a523 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50367 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-10-13 08:45:25 +00:00
Giacomo Travaglini	4fdf61493b	mem-ruby: HTMSequencer stats initialized twice HTMSequencer stats are already initialized in the constructor This is a bug from: [1]: https://gem5-review.googlesource.com/c/public/gem5/+/36478 Change-Id: Id7d9b11f45035a46af32584ed86470c65d2a80b6 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51407 Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-10-12 17:58:19 +00:00
Matt Sinclair	1120931105	mem-ruby: Move VIPER TCC decrements to action from in_port Currently, the GPU VIPER TCC protocol handles races between atomics in the triggerQueue_in. This in_port does not check for resource availability, which can cause the trigger queue to execute multiple times. Although this is the expected behavior, the code for handling atomic races decrements the atomicDoneCnt flag in the trigger queue, which is not safe since resource contention may cause it to execute multiple times. To resolve this issue, this commit moves the decrementing of this counter to a new action that is called in an event that happens only when the race between atomics is detected. Change-Id: I552fd4f34fdd9ebeec99fb7aeb4eeb7b150f577f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51368 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-10-08 22:03:13 +00:00
Matt Sinclair	72ee6d1aad	mem-ruby: Update GPU VIPER TCC protocol to resolve deadlock In the GPU VIPER TCC, programs with mixes of atomics and data accesses to the same address, in the same kernel, can experience deadlock when large applications (e.g., Pannotia's graph analytics algorithms) are running on very small GPUs (e.g., the default 4 CU GPU configuration). In this situation, deadlocks occur due to resource stalls interacting with the behavior of the current implementation for handling races between atomic accesses. The specific order of events causing this deadlock is: 1. TCC is waiting on an atomic to return from directory 2. In the meantime it receives another atomic to the same address -- when this happens, the TCC increments number of atomics to this address (numAtomics = 2) that are pending in TBE, and does a write through of the atomic to the directory. 3. When the first atomic returns from the Directory, it decrements the numAtomics counter. numAtomics was at 2 though, because of step #2. So it doesn't deallocate the TBE entry and calls Event:AtomicNotDone. 4. Another request (a LD) comes along for the same address. The LD does z_stall since the second atomic is pending -- so the LD retries every cycle until the deadlock counter times out (or until the second atomic comes back). 5. The second atomic returns to the TCC. However, because there are so many LD's pending in the cache, all doing z_stall's and retrying every cycle, there are a lot of resource stalls. So, when the second atomic returns, it is forced to retry its operation multiple times -- and each time it decrements the atomicDoneCnt flag (which was added to catch a race between atomics arriving and leaving the TCC in `7246f70bfb`) repeatedly. As a result atomicDoneCnt becomes negative. 6. Since this atomicDoneCnt flag is used to determine when Event:AtomicDone happens, and since the resource stalls caused the atomicDoneCnt flag to become negative, we never complete the atomic. Which means the pending LD can never access the line, because it's stuck waiting for the atomic to complete. 7. Eventually the deadlock threshold is reached. To fix this issue, this commit changes the VIPER TCC protocol from using z_stall to using the stall_and_wait buffer method that the Directory-level of the SLICC already uses. This change effectively prevents resource stalls from dominating the TCC level, by putting pending requests for a given address in a per-address stall buffer. These requests are then woken up when the pending request returns. As part of this change, this commit also makes two small changes to the Directory-level protocol (MOESI_AMD_BASE-dir): 1. Updated the names of the wakeup actions to match the TCC wakeup actions, to avoid confusion. 2. Changed transition(B, UnblockWriteThrough, U) to check all stall buffers, as some requests were being placed later in the stall buffer than was being checked. This mirrors the changes in `187c44fe44` to other Directory transitions to resolve races between GPU and DMA requests, but for transitions prior workloads did not stress. Change-Id: I60ac9830a87c125e9ac49515a7fc7731a65723c2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51367 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-10-08 22:03:13 +00:00
Gabe Black	13725927a0	mem-ruby: Replace the sys param with a page_shift param. This parameter defaults to a shift which corresponds to a 4K page. Change-Id: I259081a75cd6e7286d65f1e7dcdc657404397426 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50351 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2021-09-30 00:31:18 +00:00
Gabe Black	00187b7bc3	x86,mem: Replace the x86 StoreCheck flag with READ_MODIFY_WRITE. X86 had a private/arch specific request flag called StoreCheck which it used to signal to the TLB that it should fault on a load if it would have faulted had it been a store. That way, you can detect whether a read-modify-write type of operation is going to fail due to a translation problem during the read, and don't have to worry about not doing anything architecturally visible until the store had succeeded, while also making sure not to do the store part if the modify part could fail. It seems that Ruby had hijacked that flag and had an architecture specific check which was looking for a load which was going to be followed by a store. The x86 flag was never intended to communicate that beyond the TLB, and this nominally architecture agnostic component shouldn't be reaching into the ISA specific flags to try to get that information. Instead, this change introduces a new Request flag called READ_MODIFY_WRITE which is used for the same purpose in x86, but in general means that a load will be followed by a write in the near future. With this new globally applicable flag, the ruby Sequencer class no longer needs to check what the arch is, nor does it need to access ISA private data in the request flags. Always doing this check should be no less efficient than before, because checking the arch involved calling into the system object, while checking the flag only requires masking a bit on the flags which the compiler probably already has floating around for other logic in this function. Change-Id: Ied5b744d31e7aa8bf25e399b6b321f9d2020a92f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48710 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu> Maintainer: Gabe Black <gabe.black@gmail.com>	2021-09-05 05:29:27 +00:00
Daecheol You	82db312550	mem-ruby: Add (RUSC, LocalHN_Eviction) transition During full-system simulation on CHI, the LocalHN_Eviction event occasionally occurred in the RUSC state. This change therefore adds the RUSC state to the transition. Change-Id: Ibff382c38a092895bc03a4a64cf072ae752decf3 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49263 Reviewed-by: Tiago Mück <tiago.muck@arm.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-08-24 00:17:32 +00:00
Daecheol You	8e00f8e582	mem-ruby: Atomic transaction support for CHI protocol Ruby assumes protocols use directory controllers as memory interface. Thus, recvAtomic() uses the machine type of directory when it calls mapAddressToMachine(). However, it doesn't work for CHI since CHI does not use directory controllers as memory controller interface. Therefore, the code was modified to check which controller type is used for memory interface between MachineType_Directory and MachineType_Memory, which is used for CHI. Change-Id: If35a06a8a3772ce5e5b994df05c9d94c7770c90d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48403 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-08-05 00:29:34 +00:00
Gabe Black	d52db719cd	scons: Delete the unused do_embed_text function. Change-Id: I2ad37c9965e7a58e288711f0fa5bb1858f121c05 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48968 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu> Maintainer: Gabe Black <gabe.black@gmail.com>	2021-08-03 07:27:40 +00:00
Gabe Black	00876fff20	misc: Replace the GEM5_VAR_USED macro with [[maybe_unused]]. The [[maybe_unused]] attribute is now standard, so we can use that directly without hiding it behind a macro. Change-Id: If24ffd7e50bdb503cb3e6ea61f226ea794e84b8f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48511 Reviewed-by: Gabe Black <gabe.black@gmail.com> Maintainer: Gabe Black <gabe.black@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-07-29 10:17:51 +00:00
Kyle Roarty	1415308d10	mem-ruby: Account for misaligned accesses in GPUCoalescer Previously, we assumed that the maximum number of requests that would be issued by an instruction was equal to the number of threads that were active for that instruction. However, if a thread has an access that crosses a cache line, that thread has a misaligned access, and needs to request both cache lines. This patch takes that into account by checking the status vector for each thread in that instruction to determine the number of requests. Change-Id: I1994962c46d504b48654dbd22bcd786c9f382fd9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48341 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2021-07-24 17:27:02 +00:00
Daniel R. Carvalho	79bab1dc5d	mem: Adopt a memory namespace for memories Encapsulate every class inheriting from Abstract or Physical memories, and the memory controller in a memory namespace. Change-Id: I228f7e55efc395089e3616ae0a0a6325867bd782 Issued-on: https://gem5.atlassian.net/browse/GEM5-983 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47309 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2021-07-09 11:24:10 +00:00
Daniel R. Carvalho	60e4ad955d	mem-ruby: Add a ruby namespace Encapsulate all ruby-related files in a ruby namespace. Change-Id: If642c9751ecefc35b45c5dd69d85e67813cc5224 Issued-on: https://gem5.atlassian.net/browse/GEM5-984 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47307 Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu> Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2021-07-07 23:18:59 +00:00
Daniel R. Carvalho	4b2118ed4b	misc: Remove sim/cur_tick dependency from sim/core.hh Remove this unnecessary dependency. Fixed all incorrect includes of sim/core.hh. Change-Id: I3ae282dbaeb45fbf4630237a3ab9b1a593ffbe0c Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/43592 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-07-06 09:59:11 +00:00
Daniel R. Carvalho	00cd307b13	mem-garnet: Add a garnet namespace Add a namespace encapsulating all garnet files. GarnetSyntheticTraffic, from cpu/testers/garnet_synthetic_traffic/GarnetSyntheticTraffic.hh has not been added to this namespace. Change-Id: I5304ad3130100ba325e35e20883ee9286f51a75a Issued-on: https://gem5.atlassian.net/browse/GEM5-987 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47306 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Srikant Bharadwaj <srikant.bharadwaj@amd.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Srikant Bharadwaj <srikant.bharadwaj@amd.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2021-07-01 19:08:24 +00:00
Daniel R. Carvalho	974a47dfb9	misc: Adopt the gem5 namespace Apply the gem5 namespace to the codebase. Some anonymous namespaces could theoretically be removed, but since this change's main goal was to keep conflicts at a minimum, it was decided not to modify the general shape of the files much. A few missing comments of the form "// namespace X" that occurred before the newly added "} // namespace gem5" have been added for consistency. std out should not be included in the gem5 namespace, so they weren't. ProtoMessage has not been included in the gem5 namespace, since I'm not familiar with how proto works. Regarding the SystemC files, although they belong to gem5, they actually perform integration between gem5 and SystemC; therefore, they deserved their own separate namespace. Files that are automatically generated have been included in the gem5 namespace. The .isa files currently are limited to a single namespace. This limitation should be removed later to make it easier to accommodate a better API. Regarding the files in util, gem5:: was prepended where suitable. Notice that this patch was tested as much as possible given that most of these were already not compiling. Change-Id: Ia53d404ec79c46edaa98f654e23bc3b0e179fe2d Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46323 Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu> Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-07-01 19:08:24 +00:00
Carlos Falquez	6d07200693	mem-ruby: Add (BUSY_BLKD,SnpOnceFwd) transition Add (BUSY_BLKD,SnpOnceFwd) cache transition to the Ruby CHI protocol. Change-Id: I150880b26dee869b48cfd16fb661b9487527a8cd Signed-off-by: Carlos Falquez <c.falquez@fz-juelich.de> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46901 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Tiago Mück <tiago.muck@arm.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-06-29 02:13:54 +00:00
Matthew Poremba	c493d2c4ad	sim,mem-ruby: Handle interleaved device memory Device memories are used for PCI devices which have their own pools of backing store memory such as amdgpu device. The check for an address being in device memory previously did not handle multiple interleaved memory devices with the same address range. Therefore, the device memory check would fail if the interleaving masks did not match. This updates the method to iterate through all device memories that handle the RequestorID and returns true if any of the device memories contain the packet address. Change-Id: I9339d39c1cb54a5b9075c4a122c118fe61dc6fdb Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46381 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-06-14 15:48:51 +00:00
Matthew Poremba	ca12a8997d	mem-ruby,sim: Add support for VGA ROM memory region Checks if the address is in a shadowed region, and sends the request to pio to be serviced by the device backing up that range. Based on: https://gem5-review.googlesource.com/c/amd/gem5/+/23484 Change-Id: I4d5b46cccd6203523008b2e9545d55eb62130964 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46159 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-06-11 17:10:32 +00:00
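
The GPUCoalescer commit listed above (1415308d10) notes that a thread whose access crosses a cache line needs a request for each line it touches, so the request count for an instruction can exceed the number of active threads. The following is a minimal sketch of that counting idea only; it is not the gem5 implementation. `LaneAccess`, `linesTouched`, and `countRequests` are hypothetical names introduced for illustration, and the real patch reads a per-thread status vector rather than recomputing addresses.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one thread's (lane's) memory access.
// Not a gem5 type; introduced only for this sketch.
struct LaneAccess
{
    bool active;          // whether the thread participates in the instruction
    std::uint64_t addr;   // starting byte address of the access
    unsigned size;        // access size in bytes
};

// Number of cache lines covered by the byte range [addr, addr + size).
inline unsigned
linesTouched(std::uint64_t addr, unsigned size, unsigned lineBytes)
{
    if (size == 0)
        return 0;
    const std::uint64_t first = addr / lineBytes;
    const std::uint64_t last = (addr + size - 1) / lineBytes;
    return static_cast<unsigned>(last - first + 1);
}

// Total requests for one instruction: an aligned access contributes one
// request, a line-crossing (misaligned) access contributes one per line.
inline unsigned
countRequests(const std::vector<LaneAccess> &lanes, unsigned lineBytes)
{
    unsigned total = 0;
    for (const auto &lane : lanes) {
        if (lane.active)
            total += linesTouched(lane.addr, lane.size, lineBytes);
    }
    return total;
}
```

With 64-byte lines, a 4-byte access at address 62 spans two lines, so two active threads can yield three requests, which is the case the older "one request per active thread" assumption missed.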

1167 Commits