Commit Graph

3310 Commits

Author SHA1 Message Date
Tiago Muck
72185e51b2 mem-ruby: SimpleNetwork router latencies
SimpleNetwork takes into account the network router latency parameter.
The latency may be set to zero. PerfectSwitch and Throttle events were
assigned different priorities to ensure they always execute in the same
order for zero-latency forwarding.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I6cae6a0fc22b25078c27a1e2f71744c08efd7753
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41856
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-20 15:26:58 +00:00
Tiago Muck
43232cdb9f mem-ruby: Optionally set Consumer ev. priority
JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I62dc6656bbed4e7f4d575a6a82ac254382294ed1
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41855
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-20 15:26:58 +00:00
Tiago Muck
bab3ce1661 configs,mem-ruby: SimpleNetwork physical channels
Setting the physical_vnets_channels parameter enables the emulation of
the bandwidth impact of having multiple physical channels for each
virtual network. This is implemented by computing bandwidth in a
per-vnet/channel basis within Throttle objects. The size of the
message buffers are also scaled according to this setting (when buffer
are not unlimited).

The physical_vnets_bandwidth can be used to override the channel width
set for each link and assign different widths for each virtual network.

The --simple-physical-channels option can be used with the generic
configuration scripts to automatically assign a single physical channel
to each virtual network defined in the protocol.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: Ia8c9ec8651405eac8710d3f4d67f637a8054a76b
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41854
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-20 15:26:58 +00:00
Tiago Mück
87cdf354be mem-ruby: dequeue rate limit for message buffers
The 'max_dequeue_rate' parameter limits the rate at which messages can
be dequeued in a single cycle. When set, 'isReady' returns false if
after max_dequeue_rate is reached.

This can be used to fine tune the performance of cache controllers.

For the record, other ways of achieving a similar effect could be:
1) Modifying the SLICC compiler to limit message consumption in the
   generated wakeup() function
2) Set the buffer size to max_dequeue_rate. This can potentially cut the
   the expected throughput in half. For instance if a producer can
   enqueue every cycle, and a consumer can dequeue every cycle, a
   message can only be actually enqueued every two (assuming
   buffer_size=1) since the buffer entries available after dequeue
   are only visible in the next cycle (even if the consumer executes
   before the producer).

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I3a446c7276b80a0e3f409b4fbab0ab65ff5c1f81
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41862
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-20 15:26:58 +00:00
Yu-hsin Wang
52661838a4 ext: upgrade to googletest 1.11.x
Upgrade googletest to 1.11.x
upstream commit: 8306020a3e9eceafec65508868d7ab5c63bb41f7

sha1sum df8cdd26ee7cdf2a3d9c05a92d3630a96f406422 generated by command:
find . -type f ! -name SConscript ! -path "./.*" -print0 \
| sort -z | xargs -0 sha1sum | sha1sum

This upgrade is mainly for providing ConditionalMatcher support.

Change-Id: I27d971c02c59a3ad42c3002f1b4e1a8b18269c56
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55384
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
2022-01-20 01:16:02 +00:00
Austin Harris
41ee8ec7d8 mem: implement x86 locked accesses in timing-mode classic cache
Add LockedRMW(Read|Write)(Req|Resp) commands.  In timing mode,
use a combination of clearing permission bits and leaving
an MSHR in place to prevent accesses & snoops from touching
a locked block between the read and write parts of an locked
RMW sequence.

Based on an old patch by Steve Reinhardt:
http://reviews.gem5.org/r/2691/index.html

Jira Issue: https://gem5.atlassian.net/browse/GEM5-1105

Change-Id: Ieadda4deb17667ca4a6282f87f6da2af3b011f66
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52303
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-17 15:15:24 +00:00
Alex Richardson
5ac54aab7e misc: Generate StateMachine debug includes in deterministic order
Since 3454a4a36e the order of the debug/
includes is non-deterministic which can result in unnecessary rebuilds.

Change-Id: I583d2caf70632e08fa59ac85073786270991edbc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/54983
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-05 10:15:55 +00:00
Jiasen Huang
ec32f6c6f0 mem-cache: Add switch on/off duplicate entries into RMOB
Change-Id: I394d7c852a439be5315c4755b091c8741e671ea3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55083
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-05 01:18:14 +00:00
Jiasen Huang
66e6e59666 mem-cache: pstAddress should be inserted into PST in STeMS
Change-Id: Ib2c4c5fb0fec32e63947d3ee8dcb5c3d7e2555ab
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55084
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-05 01:14:48 +00:00
Daecheol You
ec58d9d7f3 mem-ruby: Fix message stall time calculation
Three changes below:
1. The m_stall_time was declared as statistics::Average, but
statistics::Average uses AvgStor as storage and this works as per-tick
average stat. In the case of m_stall_time, Scalar should be used to get
the calculation right.

2. The function used to get an enqueue time was changed since the
getTime() returns the time when the message was created.

3. Record the stall time only when the message is really dequeued
from the buffer (stall time is not evaluated when the message is moved
to stall map).

Change-Id: I090d19828b5c43f0843a8b735d3f00f312c436e9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/54363
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-03 02:18:41 +00:00
Bobby R. Bruce
6a44472b1c mem: Add 'controller()' function to NVMInterface.py
As noted here: https://gem5.atlassian.net/browse/GEM5-1133,
NVMInterface.py does not have a `controller()` function, which is used
by `configs/common/MemConfig.py` to obtain a memory controller for a
specific memory type selected. This patch adds a `controller()`
function to `NVNInterface.py` to avoid the reported error.

It should be noted that we do not enforce a rule that a memory type
must include a `controller()` function. `se.py`, and other scrips
that use `configs/common/MemConfigs.py`, should not rely on this
false assumption.

Change-Id: Ieba62f803d3b9f9c5c3c863d5a8c4ca16c5e5e82
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/54923
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-30 03:03:15 +00:00
Jiasen Huang
398d4f9d07 mem-cache: Init lastTriggerCounter for STeMS Prefetcher
Change-Id: I43a0093e6d35a39799d724e7dee15c95dbc26343
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/54643
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-23 01:40:09 +00:00
Jiasen Huang
7910926631 mem-cache: Set prefetch bit if the blk comes from Prefetch only.
Original logic setting prefetch inside serviceMSHRTargets
did not exclude the blks that both came from CORE and Prefetch

Change-Id: Iab56b9266eb64baf972b160774aca0823faea458
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/54364
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-22 01:02:22 +00:00
Jiasen Huang
735f768cc8 mem-cache: Added stats filtering both useful and spanPage prefetch.
Change-Id: I2570ee47f064ac999f2dcc813c9e39174a2ad8af
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/54163
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-17 01:04:21 +00:00
Matthew Poremba
9313294efe misc: Remove AMD license addition
Remove the line "For use for simulation and test purposes only" in files
were AMD is the only copyright holder listed in the header. This happens
to be the case for all files where this line exists, removing it
completely from gem5.

Change-Id: I623f266b002f564301b28774f49081099cfc60fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-11 04:00:56 +00:00
Gabe Black
1c233ee9d2 scons: Add sim_object and enums arguments to SimObject().
This will explicitly declare what SimObject and Enum types need to be set
up in C++, which will make importing all the SimObject modules during
the setup phase of SCons uneccessary.

Change-Id: Id2d7603daf33b236ceaa0789e2f089f589d34e62
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49406
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-08 08:01:23 +00:00
Matthew Poremba
e0d62e510d configs,mem-ruby: Remove reference to old GPU ptls
GPU_VIPER_Baseline, GPU_VIPER_Region, and GPU_RfO were removed some time
ago.

Change-Id: If873b0cfe8cc2b3096cbe97d4e13a8e02d2ec567
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53703
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-12-07 20:26:17 +00:00
huangjs
ea6b0a03d3 mem: Setup stats for Cache hitLatency
Change-Id: I18de75d3b6c4ce1784b90653e2d132ffecf1b1af
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53103
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2021-11-30 02:59:05 +00:00
huangjs
cdbeb07ffd mem: Targets per MSHR allocated should be always <= tgts_per_mshr
This modification indicates when tgts_per_mshr = 0 for particular level of Cache

Change-Id: Icc1ecffcd598a473bd26bc5115a5e0a7998fb527
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52783
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-11-22 01:31:02 +00:00
Giacomo Travaglini
de7337a32a misc: Replace master/slave terminology from BaseCPU.py
In order to fix several regression failures [1] the master/slave
terminology in src/cpu/BaseCPU.py was reintroduced [2].

This patch is addressing the issue by providing 2 different
ways of connecting cpu ports:

*) connectBus: The method assumes an object with a bus interface is
passed as an argument, therefore it tries to bind cpu ports to the
bus.mem_side_ports and bus.cpu_side_ports

*) connectAllPorts: No assumption on the port owning device is made.
The method simply accepts ports as arguments which will be directly
connected to the peer cpu ports
This will be used for example by ruby Sequencers

[1]: https://gem5.atlassian.net/browse/GEM5-775
[2]: https://gem5-review.googlesource.com/c/public/gem5/+/34495

Change-Id: I715ab8471621d6e5eb36731d7eaefbedf9663a71
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52584
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2021-11-16 18:17:47 +00:00
Gabe Black
ba5f68db3d misc: Use python 3's argumentless super().
When calling a method in a superclass, you can/should use the super()
method to get a reference to that class. The python 2 version of that
method takes two parameters, the current class name, and the "self"
instance. The python 3 version takes no arguments. This is better for a
at least three reasons.

First, this version is less verbose because you don't have to specify
any arguments.

Second, you don't have to remember which argument goes where (I always
have to look it up), and you can't accidentally use the wrong class
name, or forget to update it if you copy code from a different class.

Third, this version will work correctly if you use a class decorator.
I don't know exactly how the mechanics of this work, but it is referred
to in a comment on this stackoverflow question:

https://stackoverflow.com/questions/681953/how-to-decorate-a-class

Change-Id: I427737c8f767e80da86cd245642e3b057121bc3b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52224
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-11-09 13:04:44 +00:00
Jason Lowe-Power
6f49a1fe29 mem: Initialize all stats in MemInterface
Change-Id: I1ee9ca14127abb7311ee8282b3fef1051277592c
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52503
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
2021-11-06 18:25:43 +00:00
Gabe Black
e34fa5d86a mem-cache: Ensure all fields of the CacheBlk class are initialized.
The constructor only initialized two fields, data and _tickInserted. The
print() method at least accesses the coherence status bits, which
valgrind determined were being accessed without being initialized.

This change adds a default initializer to all fields to prevent any
value from flapping around uninitialized.

Change-Id: Ie4c839504d49f9a131d8e3c3e8be02ff22f453a6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52404
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-11-04 22:53:51 +00:00
Gabe Black
6d60e76a60 mem-cache: Don't generate debug output unless you're going to use it.
The BaseCache::handleFill function would generate an "old_state" string
unconditionally, just in case it would need to print it out later on in
the function if the Cache debug variable was set. This is very wasteful.
We should only generate that string if we are actually going to use it
later on.

Change-Id: I4a570d1cd2814e5a089eac1233dedd1801d68975
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52405
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-11-04 19:48:08 +00:00
Gabe Black
c02abad641 mem-ruby: Don't conditionalize setting RubySequencer's pio_response_port
This was conditioned on the TARGET_ISA being x86 because the code it
replaced was, and that was because the x86 interrupts object had an
extra port that didn't appear for other ISAs. This inconsistency is not
present on either side of this connection, and so we don't need it to be
conditional.

We do, however, need to ensure that the port sends a range change even
if it doesn't have any ranges to send, to satisfy the bookkeeping of the
bus on the other side of the connection. We do that in init, like leaf
devices do.

Change-Id: Idec6f6c5e2cf78b113fb238d0edd2c63d6cd2c23
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52109
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-29 02:20:36 +00:00
Mahyar Samani
b22c0183cf mem: Adding PortTerminator
This change adds the source code for the PortTerminator SimObject.
It could be used to connect request/response ports in the system
that can not be connected to any other ports. This will prevent
errors caused by orphan ports in the system. As an example if
you have set up a cache hierarchy and do not want to test its
performance in full system mode and want to use PyTrafficGen
instead, your system will end up with an icache or walker ports
that are not connected to anything. In this case, you can use a
PortTerminator to connect the orphan ports in your system.

Change-Id: I5e19cdd3ce064638ffabf29d29225eda77ffc146
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51609
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2021-10-28 20:13:41 +00:00
Gabe Black
9309863322 mem: Fix whitespace in mem/ruby/system/Sequencer.py.
Some aspects of the formatting in this file were questionable, like
aligning =s between adjacent lines, although not technically against the
style rules as far as I know.

More strangely though, the whole file used three space indents instead
of the typical four.

Change-Id: I7b60f1978c5b2c60a15296b10d09d5701cf7fa5c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52108
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-27 23:22:13 +00:00
Matthew Poremba
c5ba40cfe1 mem-ruby: Add GPUonly parameter for VIPER
Currently MOESI_AMD_Base used in VIPER has a CPUonly parameter which
indicates that messages should not try to add GPU SLICC controllers as
destinations. This adds the analogue GPUonly parameter which indicates
that requests should not try to add CPU SLICC controllers.

Also adds an assert to ensure the outgoing message has at least one
destination. This assert would indicate a misconfiguration.

Change-Id: Ibb0affd4606084fca021f0e7c117d4ff8c06d429
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51928
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2021-10-26 15:52:11 +00:00
Matthew Poremba
55fdf4be52 mem-ruby: Add missing CPUonly check for VIPER
The CPUonly variable in MOESI_AMD_Base's Directory indicates that probes
should not be sent to any GPU SLICC controllers as they are not part of
CPU. There is one CPUonly check missing which causes problems in
GPU-only Ruby networks as there is no route to any controllers with that
MachineType. Add a condition to check CPUonly and do nothing in that
case.

Change-Id: I41b6c04feec473e34b04402adfb5978e75b847b6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51927
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-26 15:52:11 +00:00
Gabe Black
74c246d15b mem: Add a translation generator function to EmulationPageTable.
This lets the caller iterate over translated address ranges over the
requested total virtual address region.

Change-Id: I50bd59bdbb12c055fa9ace9b1d5ff972e382cb85
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50762
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-10-22 21:43:02 +00:00
Gabe Black
7155b8ba1e mem: Use the MMU's translation generator in translating proxies.
Use the more flexible MMU translation generator which does not need to
be told what page size to use, and which will be able to do flexible
things like translate across varying page sizes.

Change-Id: Ibfefc39d833f37bc35d703c505b193ea68988ab0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50760
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-10-22 21:43:02 +00:00
Jason Lowe-Power
3e32fd3b33 mem-ruby: Add RISC-V atomic support to Ruby
RISC-V atomics carry a atomic functor that needs to be executed in the
cache hierarchy. To implement this in Ruby, we execute the functor in
the hitCallback function. Note that these functions are slightly
different than the atomic functions used in the GPU model and the GPU
coalescer even though they have similar semantics.

This change was tested with RISC-V Linux boot which has a few atomics
and linux boot finishes successfully. Previously, the boot got stuck
after the incorrect atomic operation.

Change-Id: I47a69c05ad9f4267d0220023289116e62b5231be
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51447
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2021-10-21 01:33:34 +00:00
Matt Sinclair
118677218d mem-ruby: fix typo in GPU VIPER TCC comment
72ee6d1a fixed a deadlock in the GPU VIPER TCC.  However, it
inadvertently added a typo to the comments explaining the change.  This
commit fixes that.

Change-Id: Ibba835aa907be33fc3dd8e576ad2901d5f8f509c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51687
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-17 04:07:49 +00:00
Gabe Black
a2f1400e06 mem: Add a translation gen helper class.
This class helps translate a region of memory one chunk at a time. The
generator returns an iterator through its begin and end methods which
can be used as part of a regular for loop, or as part of a range based
for loop. The iterator points to a Range object which holds the virtual
and physical address of the translation, the size of the region included
in the translation, and a Fault if the translation of that chunk
faulted.

When incrementing the iterator, if there was no fault it simply moves
ahead to the next region and attempts to translate it using a virtual
method implemented by subclasses. It's up to the subclass to determine
if there is now a fault, how many bytes have been translated if, for
instance, the page size is variable, and what the translated physical
address is.

If there was a fault, the iterator does not increment, it just clears
the fault and tries the previous translation again. This gives consumers
of the translation generator a chance to fix up faulting addresses
without having to abort the whole process and try again. This might be
useful if, for instance, you've reached the end of the stack and a new
page needs to be demand-paged in.

Change-Id: I8c4023845d989fe3781b1b73ab12f7c8855c9171
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50758
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-14 20:49:23 +00:00
Gabe Black
7290bf52f3 mem: Replace SatCounter with SatCounter8 in the SHiP replacement policy.
Change-Id: Ibbc8e78df7119cdff62ad08b5c68f4237ca25cfe
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51530
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-13 23:13:57 +00:00
Gabe Black
4fe9af8d17 mem: Stop using SlavePort as a base class.
There are other classes like "ExternalSlave" which still have the word
"Slave" in them, but at least this will make the build quit complaining
about the deprecated SlavePort.

Change-Id: I917c2880574cb77ea37c69dc2727ac5e84b83cd5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51529
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-13 20:28:05 +00:00
Giacomo Travaglini
7260394d4b mem: Make ruby AbstractController compatible with XBar
At the moment the ruby AbstractController is trying to re-send the same
memory request every clock cycle until it finally succeeds [1]
(in other words it is not waiting for a recvReqRetry from the peer
port)

This polling behaviour is not compatible with the gem5 XBar, which is
panicking if it receives two consecutive requests to the same BUSY
layer [2]

This patch is fixing the incompatibility by inhibiting the
AbstractController retry until it gets a notification from the peer
response port

[1]: https://github.com/gem5/gem5/blob/v21.1.0.1/\
    src/mem/ruby/slicc_interface/AbstractController.cc#L303
[2]: https://github.com/gem5/gem5/blob/v21.1.0.1/src/mem/xbar.cc#L196

Change-Id: I0ac38ce286051fb714844de569c2ebf85e71a523
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50367
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-13 08:45:25 +00:00
Giacomo Travaglini
4fdf61493b mem-ruby: HTMSequencer stats initialized twice
HTMSequencer stats are already initialized in the constructor

This is a bug from:

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/36478

Change-Id: Id7d9b11f45035a46af32584ed86470c65d2a80b6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51407
Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-12 17:58:19 +00:00
Davide Basilio Bartolini
3d025b517f misc: Fix hdf5 stats + test
HDF5 stats file creation was not completing correctly due to name
clashes.

Change-Id: Ifc2d52f4bbc62b0c6798ce92f4d027b0ec69a373
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51061
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-12 06:03:21 +00:00
Matt Sinclair
1120931105 mem-ruby: Move VIPER TCC decrements to action from in_port
Currently, the GPU VIPER TCC protocol handles races between atomics in
the triggerQueue_in.  This in_port does not check for resource
availability, which can cause the trigger queue to execute multiple
times.  Although this is the expected behavior, the code for handling
atomic races decrements the atomicDoneCnt flag in the trigger queue,
which is not safe since resource contention may cause it to execute
multiple times.

To resolve this issue, this commit moves the decrementing of this
counter to a new action that is called in an event that happens only
when the race between atomics is detected.

Change-Id: I552fd4f34fdd9ebeec99fb7aeb4eeb7b150f577f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51368
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-08 22:03:13 +00:00
Matt Sinclair
72ee6d1aad mem-ruby: Update GPU VIPER TCC protocol to resolve deadlock
In the GPU VIPER TCC, programs with mixes of atomics and data
accesses to the same address, in the same kernel, can experience
deadlock when large applications (e.g., Pannotia's graph analytics
algorithms) are running on very small GPUs (e.g., the default 4 CU GPU
configuration).  In this situation, deadlocks occur due to resource
stalls interacting with the behavior of the current implementation for
handling races between atomic accesses.  The specific order of events
causing this deadlock are:

1. TCC is waiting on an atomic to return from directory

2. In the meantime it receives another atomic to the same address -- when
this happens, the TCC increments number of atomics to this address
(numAtomics = 2) that are pending in TBE, and does a write through of the
atomic to the directory.

3. When the first atomic returns from the Directory, it decrements the
numAtomics counter.  numAtomics was at 2 though, because of step #2.  So
it doesn't deallocate the TBE entry and calls Event:AtomicNotDone.

4. Another request (a LD) to the same address comes along for the same
address.  The LD does z_stall since the second atomic is pending –- so the
LD retries every cycle until the deadlock counter times out (or until the
second atomic comes back).

5.  The second atomic returns to the TCC.  However, because there are so
many LD's pending in the cache, all doing z_stall's and retrying every cycle,
there are a lot of resource stalls.  So, when the second atomic returns, it is
forced to retry its operation multiple times -- and each time it decrements
the atomicDoneCnt flag (which was added to catch a race between atomics
arriving and leaving the TCC in 7246f70bfb) repeatedly.  As a result
atomicDoneCnt becomes negative.

6.  Since this atomicDoneCnt flag is used to determine when Event:AtomicDone
happens, and since the resource stalls caused the atomicDoneCnt flag to become
negative, we never complete the atomic.  Which means the pending LD can never
access the line, because it's stuck waiting for the atomic to complete.

7.  Eventually the deadlock threshold is reached.

To fix this issue, this commit changes the VIPER TCC protocol from using
z_stall to using the stall_and_wait buffer method that the
Directory-level of the SLICC already uses.  This change effectively
prevents resource stalls from dominating the TCC level, by putting
pending requests for a given address in a per-address stall buffer.
These requests are then woken up when the pending request returns.

As part of this change, this change also makes two small changes to the
Directory-level protocol (MOESI_AMD_BASE-dir):

1.  Updated the names of the wakeup actions to match the TCC wakeup actions,
to avoid confusion.

2.  Changed transition(B, UnblockWriteThrough, U) to check all stall buffers,
as some requests were being placed later in the stall buffer than was
being checked.  This mirrors the changes in 187c44fe44 to other Directory
transitions to resolve races between GPU and DMA requests, but for
transitions prior workloads did not stress.

Change-Id: I60ac9830a87c125e9ac49515a7fc7731a65723c2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51367
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-08 22:03:13 +00:00
Gabe Black
d6974ef636 mem: Add a page_bytes parameter to the classic prefetcher.
This parameter is used to figure out if two addresses are on the same or
different pages, and could be used to find what page they were on and
the page offset, although it doesn't look like the later two are
actually used.

This value could possibly come from the TLB parameter attached to the
prefetcher, but making it explicit makes these more symmetric with the
Ruby prefetcher, and reduces the complexity of the TLB implementation.

Change-Id: I6921943c49af19971b84225ecfd1127304363426
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50352
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2021-09-30 00:31:29 +00:00
Gabe Black
13725927a0 mem-ruby: Replace the sys param with a page_shift param.
This parameter defaults to a shift which corresponds to a 4K page.

Change-Id: I259081a75cd6e7286d65f1e7dcdc657404397426
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50351
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2021-09-30 00:31:18 +00:00
Gabe Black
ede1ad4b8c arch,cpu,mem,sim: Fold arch/locked_mem.hh into the BaseISA class.
Turn the functions within it into virtual methods on the ISA classes.
Eliminate the implementation in MIPS, which was just copy pasted from
Alpha long ago. Fix some minor style issues in ARM. Remove templating.
Switch from using an "XC" type parameter to using the ThreadContext *
installed in all ISA classes.

The ARM version of these functions actually depend on the ExecContext
delaying writes to MiscRegs to work correctly. More insiduously than
that, they also depend on the conicidental ThreadContext like
availability of certain functions like contextId and getCpuPtr which
come from the class which happened to implement the type passed into XC.

To accomodate that, those functions need both a real ThreadContext, and
another object which is either an ExecContext or a ThreadContext
depending on how the method is called.

Jira Issue: https://gem5.atlassian.net/browse/GEM5-1053

Change-Id: I68f95f7283f831776ba76bc5481bfffd18211bc4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50087
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-09-28 19:56:01 +00:00
Gabe Black
750a809169 python,scons: Break slicc's dependence on m5.util.
The only dependence remaining was a small utility function makeDir which
was only used by slicc. This change moves it to where it's used, and
cleans up the additions to sys.path a little.

Change-Id: I7415b53ea2e9c378b6dbf342b8b3a966f48e117c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49397
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2021-09-24 21:23:41 +00:00
Gabe Black
cc75a47b84 python,scons: Move grammar.py and code_formatter.py into build_tools.
These are only used in a build, and so don't need to be built into gem5.
grammar.py is used by slicc and the fast model project file parser, and
code_formatter.py is only used by SConscripts.

Change-Id: Id43e62459d69f07fdb2ed125548a83e38bbb7590
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49396
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-09-24 21:23:27 +00:00
Gabe Black
bec16fbc31 misc: Move MemPool based calls to the SEWorkload.
These currently proxy to the System object, but this is one step towards
moving the MemPool-s out of the System and into the SEWorkload where
they really should have been from the start.

Change-Id: Id27e7b874c283abf07bd892c8467a9cc52e2fdff
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50342
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-09-21 02:05:32 +00:00
Quentin Forcioli
32fd8cfa80 mem: Fix for CFI memory
Subtile modification of the CFI memory to bring back u-boot compatibility :
- Ignoring AMD_RESET_CMD (0xf0)
- Increasing CFIQueryTable size to have 4 Erase Block Region Information (3 are
just empty)

Change-Id: I49e7a78a89a46b1298f04132559debafdeddb8ef
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49570
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-09-20 08:00:53 +00:00
Gabe Black
00187b7bc3 x86,mem: Replace the x86 StoreCheck flag with READ_MODIFY_WRITE.
X86 had a private/arch specific request flag called StoreCheck which it
used to signal to the TLB that it should fault on a load if it would
have faulted had it been a store. That way, you can detect whether a
read-modify-write type of operation is going to fail due to a
translation problem during the read, and don't have to worry about not
doing anything architecturally visible until the store had succeeded,
while also making sure not to do the store part if the modify part
could fail.

It seems that Ruby had hijacked that flag and had an architecture
specific check which was looking for a load which was going to be
followed by a store. The x86 flag was never intended to communicate that
beyond the TLB, and this nominally architecture agnostic component
shouldn't be reaching into the ISA specific flags to try to get that
information.

Instead, this change introduces a new Request flag called
READ_MODIFY_WRITE which is used for the same purpose in x86, but in
general means that a load will be followed by a write in the near
future.

With this new globally applicable flag, the ruby Sequencer class no
longer needs to check what the arch is, nor does it need to access ISA
private data in the request flags. Always doing this check should be no
less efficient than before, because checking the arch involved calling
into the system object, while checking the flag only requires masking a
bit on the flags which the compiler probably already has floating around
for other logic in this function.

Change-Id: Ied5b744d31e7aa8bf25e399b6b321f9d2020a92f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48710
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-09-05 05:29:27 +00:00
Bobby R. Bruce
1853d57dc3 misc: Revert "arch,cpu,mem,sim: Fold arch/locked_mem.hh..."
This reverts commit a3f85217ab,
https://gem5-review.googlesource.com/c/public/gem5/+/48384

The reason for reverting this commit is it causes the Nightly build to
timeout: https://www.mail-archive.com/gem5-dev@gem5.org/msg40344.html

The exact cause of this failure was a stalling with the O3 processor on
ARM. The simulation reaches the following error and repeats until
timeout:

```
build/ARM/arch/arm/isa.cc:2634: warn: context 0: 2136500000 consecutive store conditional failures
```

The "realview-o3-ARM-x86_64-opt" test can replicate this:

```
./main.py run -j8 --uid
SuiteUID:tests/gem5/fs/linux/arm/test.py:realview-o3-ARM-x86_64-opt
```

Change-Id: I9e9a20753c2a25c143e6a73f58716feb41861cde
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/49927
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-09-04 04:37:49 +00:00