Make this part of the Functional protocol, since it should always
return immediately, can be shared by the atomic and timing protocols,
and thematically fits with that protocol.
The default implementation on the receiving end just ignores the
request and leaves the back door pointer set to null, effectively
making back doors default "off" which matches their behavior in the
atomic protocol.
This mechamism helps fix a bug in the TLM gem5 bridges which need to
translate to/from the DMI and back door mechanisms, where there can be
an explicit request for a back door which does not have a transaction
associated with it. It is also necessary for bridging DMI requests in
timing mode, since the DMI requests must be instant, and the timing
protocol does not send/receive packets instantly.
Change-Id: I905f13b9bc83c3fa7877b05ce932e17c308125e2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65752
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Gabe Black <gabeblack@google.com>
Apply the gem5 namespace to the codebase.
Some anonymous namespaces could theoretically be removed,
but since this change's main goal was to keep conflicts
at a minimum, it was decided not to modify much the
general shape of the files.
A few missing comments of the form "// namespace X" that
occurred before the newly added "} // namespace gem5"
have been added for consistency.
std out should not be included in the gem5 namespace, so
they weren't.
ProtoMessage has not been included in the gem5 namespace,
since I'm not familiar with how proto works.
Regarding the SystemC files, although they belong to gem5,
they actually perform integration between gem5 and SystemC;
therefore, it deserved its own separate namespace.
Files that are automatically generated have been included
in the gem5 namespace.
The .isa files currently are limited to a single namespace.
This limitation should be later removed to make it easier
to accomodate a better API.
Regarding the files in util, gem5:: was prepended where
suitable. Notice that this patch was tested as much as
possible given that most of these were already not
previously compiling.
Change-Id: Ia53d404ec79c46edaa98f654e23bc3b0e179fe2d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
L1 controller selects the L2 to message based on the assigned address
ranges instead of explicitly interleaving bits in the L1 controller. This
simplifies the L1 controller implementation a bit and allows for more
flexibility when changing the address->controller mapping.
Change-Id: Ie67999bb977566939432a5045f65dbd2da81816a
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18410
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
When a message triggers a transition that has actions which allocate
TBEs, the generated code automatically includes a check for the TBETable
size before executing any action. If the table is full, the transition
returns TransitionResult_ResourceStall and no more messages from the
buffer are handled (until the next cycle).
This behavior may lead to deadlocks in the MOESI_CMP_directory protocol
since events triggered by the response queue may allocate TBEs (e.g.
L2 replacements triggered by the response queue). If the table is full,
the queue is stalled preventing other responses from freeing TBEs.
This patch fixes this by handling WRITEBACK_DIRTY_DATA/CLEAN_DATA messages
as requests and WB_ACK/WB_NACK as responses. All controllers are changed
to work with the new types. With this fix, responses are always
handled first in all controllers, and no response triggers TBE
allocations.
Change-Id: I377c0ec4f06d528e9f0541daf3dcc621184f2524
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18408
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: John Alsop <johnathan.alsop@amd.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Using recycle in the L2 controllers to put messages back into the buffer
may lead to starvation when there are many L1 requests for the same line.
This can easily trigger the deadlock detection mechanism in configurations
with many cores (16+). Replacing recycle by stall_and_wait for L1
requests avoids this issue. wakeUpBuffers calls were added to all
transitions from transient to stable states.
Change-Id: I28b8aeacc48919ccf38e69653cd9205a4153514b
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17568
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Adding back some changes done in patch 676ae57827.
Transient state IS_I, STALE_DATA, Data_Stale event are necessary.
Issue: (cacheline A, initial state for P0 and P1 is I)
| P0 | P1 |
|GETX (A)| |
| |GETS (A)|
|Inv_All | |
P1 never sends the ACK - deadlock
It should ACK, later upon data use it as stale data, and got to I.
Solution:
P1(A):
GETS: I->IS
Inv_All: IS->IS_I, Send ACK
Data: IS_I->I, STALE_DATA to L0
Signed-off-by: Pouya Fotouhi <pfotouhi@ucdavis.edu>
Change-Id: I1e7b2c05439d08579c68d8eb444e0f332e75e07f
Reviewed-on: https://gem5-review.googlesource.com/c/15715
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
To avoid deadlocks ruby objects typically prioritize the handling of
responses to all other events. The order in which in_port statements
are written determine the order in which they are handled. This patch
fixes the order of in_order statements for the L2 cache in the
MOESI_CMP_directory.
Change-Id: I62248b0480a88ac2cd945425155f0961a1cf6cb1
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13595
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Since a3177645, the MESI_Three_Level protocol does not build. This
changeset addresses the problem by adding the L0Cache machine type
to the static machine type declaration in Ruby's export file.
Change-Id: I6327547fcb34595619caeb73932c0032f5f65c9f
Reviewed-on: https://gem5-review.googlesource.com/8383
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Only a small quantity of prefetches were issued, as the positive
feedback mechanism was not implemented. This commit adds a new
action po_observeHit, which notifies the RubyPrefetcher of
successful prefetches and resets the prefetch flag.
When a cache line was replaced by a prefetch, the wrong queue could
be stalled. This commit adds a new event PF_L1_Replacement, which
stalls the correct queue.
The behavior when receiving a prefetch or instruction fetch while
in PF_IS_I (prefetch caused GETs, but got invalidated before the
response was received) was undefined. This was changed to drop the
prefetch request or change the state to non-prefetch, respectively.
This behavior is analogous to IS_I (non-prefetch caused GETs, but
got invalidated before the response was received) and the data case,
respectively.
In my local branch a major (20+%) performance increase can be
observed in SPEC2006 gobmk and leslie3d when enabling the
prefetcher. Some other benchmarks like bwaves, GemsFDTD, sphinx and
wrf show smaller (~10%) performance increases. Unfortunately, the
performance in most other SPEC benchmarks is still poor, most likely
as the prefetcher does not detect strides fast/often enough. In
order to push the change timely (most benchmarks have runtimes in
the order of days on my machine even with the smallest parameters)
after checkout, I have only run gobmk with the base repository
+ this commit. The results match those of my local branch.
Change-Id: I9903a2fcd02060ea5e619b409f31f7d6fac47ae8
Reviewed-on: https://gem5-review.googlesource.com/8801
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Swapnil Haria <swapnilster@gmail.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
This changeset fixes a bug that was affecting the MOESI_CMP_token
protocol where setting the next timeout required an absolute tick in
the future.
Change-Id: Ibfdb59354e13c7e552cb3389e71bda010f333249
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/7163
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
The function map_Address_to_DMA was used to route responses to the
first (and assumed to be the only) DMA engine in the system. This
function is now unused as protocols handle responses and route them to
the right DMA engine.
Change-Id: I2fba913cf2f12321d1a1e38e7ee85bdf26b8a47a
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/7162
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Previously the MESI_Two_Level protocol supported systems with a single
DMA engine and responses from the directory to DMA requests were
routed back to the only DMA engine. This changeset adds support for
multiple DMA engines in the system by routing the response to the DMA
engine that originally sent the request.
Change-Id: I10ceda682ea29746636862ec8ef2a9c4220ca045
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/7161
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Previously the directory covered a flat address range that always
started from address 0. This change adds a vector of address ranges
with interleaving and hashing that each directory keeps track of and
the necessary flexibility to support systems with non continuous
memory ranges.
Change-Id: I6ea1c629bdf4c5137b7d9c89dbaf6c826adfd977
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2903
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Multiple outstanding DMA requests introduced new DMA states that didn't
be considered into slicc code. This patch implements the missed DMA state
changes on MOESI_CMP_directory protocol.
Change-Id: I700d441d76556b7e77e0d507904af6ec6ba59cc2
Signed-off-by: Michael LeBeane <michael.lebeane@amd.com>
Reviewed-on: https://gem5-review.googlesource.com/2380
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Michael LeBeane <Michael.Lebeane@amd.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
SequencerMsg is autogenerated by slicc scripts and the MessageSizeType is
initialized to the max enume value by default. The DMASequencer pushes this
message to the mandatory queue and since the MessageSizeType is unitialized,
string_to_MessageSizeType() function used by traces to print the message fails
with a panic. This patch avoids this problem by initializing MessageSizeType
of SequencerMsg to Request_Control.
DMA sequencers and protocols can currently only issue one DMA access at
a time. This patch implements the necessary functionality to support
multiple outstanding DMA requests in Ruby.
Over the past 6 years, we realized that the protocol is essentially used
to run the garnet network in a standalone manner, and feed standard synthetic
traffic patterns through it.
Allow usage of packet class in ruby for convenience purposes. This may be
used to access members of the packet/request class (e.g., via helper
functions) and/or push protocol specific information to the packets
SenderState without needing to modify SLICC types and protocols in multiple
locations.
mem: support for gpu-style RMWs in ruby
This patch adds support for GPU-style read-modify-write (RMW) operations in
ruby. Such atomic operations are traditionally executed at the memory controller
(instead of through an L1 cache using cache-line locking).
Currently, this patch works by propogating operation functors through the memory
system.
This patch is imported from reviewboard patch 2551 by Nilay.
This patch moves from a dynamically defined MachineType to a statically
defined one. The need for this patch was felt since a dynamically defined
type prevents us from having types for which no machine definition may
exist.
The following changes have been made:
i. each machine definition now uses a type from the MachineType enumeration
instead of any random identifier. This required changing the grammar and the
*.sm files.
ii. MachineType enumeration defined statically in RubySlicc_Exports.sm.
* * *
normal protocol fixes for nilay's parser machine type fix
This patch is imported from reviewboard patch 2550 by Nilay.
It was possible to specify multiple machine types with a single state machine.
This seems unnecessary and is being removed.
Changeset 4872dbdea907 replaced Address by Addr, but did not make changes to
print statements. So the addresses which were being printed in hex earlier
along with their line address, were now being printed in decimals. This patch
adds a function printAddress(Addr) that can be used to print the address in hex
along with the lines address. This function has been put to use in some of the
places. At other places, change has been made to print just the address in
hex.
Some blocks in MOESI hammer were not getting deallocated when they were set to
an idle state (e.g. by invalidate or other_getx/s messages). While
functionally correct, this caused some bad effects on performance, such as
blocks in I in the L1s getting sent to the L2 upon eviction, in turn evicting
valid blocks. Also, if a valid block was in LRU, that block could be evicted
rather than a block in I. This patch adds in the missing deallocations.
Committed by: Nilay Vaish<nilay@cs.wisc.edu>