Commit Graph

3338 Commits

Author SHA1 Message Date
NSurawar
efbfdeabd7 mem-ruby: Reduce handshaking between CorePair and dir (#1117)
Currently, when MOESI_AMD_Base-CorePair downgrades data (e.g. due
to a replacement), a 4-way handshake between the CorePair and the dir
is required. Specifically, the CorePair sends a message telling the dir
it would like to downgrade, the dir sends an ACK back, the CorePair
writes the data back, and finally the dir ACKs the writeback.
This is inefficient and not representative of how modern protocols
downgrade a request. Accordingly, this commit updates the downgrade
support such that the CorePair writes back the data immediately and
the dir then ACKs it.
Thus, this approach requires only a 2-way handshake.

Change-Id: I7ebc85bb03e8ce46a8847e3240fc170120e9fcd6

Co-authored-by: Neeraj Surawar <neerajs@hyrule.cs.wisc.edu>
2024-05-30 09:36:29 -07:00
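To illustrate the difference described in the commit above, here is a minimal sketch that lists the two message sequences. The message labels are hypothetical placeholders and are not the message types used by the actual SLICC protocol files.

```python
# Each tuple is (sender, receiver, message). Labels are illustrative only.
FOUR_WAY_DOWNGRADE = [
    ("CorePair", "dir", "downgrade request"),
    ("dir", "CorePair", "ACK"),
    ("CorePair", "dir", "data writeback"),
    ("dir", "CorePair", "writeback ACK"),
]

# After the change: the CorePair writes back immediately and the dir ACKs it.
TWO_WAY_DOWNGRADE = [
    ("CorePair", "dir", "data writeback"),
    ("dir", "CorePair", "writeback ACK"),
]


def message_count(handshake):
    """Return how many messages cross the CorePair/dir link."""
    return len(handshake)
```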
ylldummy
7fa0342a7c mem-cache: Fix maybe-uninitialized warning (#1179)
When the compiler tries to inline a vector construction that uses a
default-constructed ReplaceableEntry as the default value, it can
complain about the uninitialized member.

Let's provide basic initialization to the members.

Example codepath:
 SignaturePathV2 constructor
 -> GlobalHistoryEntry() as init_value to AssociativeSet
 -> AssociativeSet initialize vector<Entry> with init_value
2024-05-29 10:41:35 -07:00
Matthew Poremba
e82cf20150 mem-ruby: Remove VIPER StoreThrough temp cache storage (#1156)
StoreThrough in VIPER when the TCP is disabled, GLC bit is set, or SLC
bit is set will bypass the TCP, but will temporarily allocate a cache
entry seemingly to handle write coalescing with valid blocks. It does
not attempt to evict a block if the set is full and the address is
invalid. This causes a panic if the set is full as there is no spare
cache entry to use temporarily for DataBlk manipulation. However,
a cache block is not required for this.

This commit removes using a cache block for StoreThrough with invalid
blocks as there is no existing data to coalesce with. It creates
no-allocate variants of the actions needed in StoreThrough and pulls the
DataBlk information from the in_msg instead. Non-invalid blocks do not
have this panic as they have a cache entry already.

Fixes issues with StoreThroughs on more aggressive architectures like
MI300.

Change-Id: Id8687eccb991e967bb5292068cbe7686e0930d7d
2024-05-28 11:02:00 -07:00
Ivana Mitrovic
233135da81 mem-ruby: Fix NullPointerException in RubyRequest (#1118)
This PR includes a check for `m_pkt` being null and appropriately
handles that case. This issue was causing the Daily tests to fail.

Change-Id: I87142ca14ca4ab3d8306153a1cf34c2629a119ba
2024-05-09 08:46:13 -07:00
Giacomo Travaglini
0df5635bdf mem-ruby: Implement NS bit for CHI transactions (#1100)
This patch adds the NS bit to CHI requests to make sure they are
properly tagged according to their security state.

Change-Id: I33d3610edefbb5a05a6090e9125c35d4fb8bca58
Reviewed-by: Tiago Muck <tiago.muck@arm.com>

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-08 07:46:50 +02:00
Giacomo Travaglini
36c1ea9c61 mem-ruby: Implement MakeReadUnique in CHI (#1101)
Change-Id: I64cd3c62804cca184d68287fc099534e9205f2b8
Reviewed-by: Tiago Muck <tiago.muck@arm.com>

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-06 08:30:59 +02:00
Ivana Mitrovic
939d8e28df mem-cache: Fix TreePLRU num leaves error (#1075)
This PR fixes the error noted in #1073.

Change-Id: I5d31c259ac5ee93f46f28b20eda4f58460ba8523
2024-04-26 20:22:20 -07:00
Nicholas Mosier
66decb2e93 mem-ruby: Fix functional reads for MESI Three-Level messages (#1045)
Fix #1044. This patch adds checks for message types (PUTX_COPY, DATA,
DATA_EXCLUSIVE) that contain data blocks but were missing from the
original `functionalRead` method in MESI Three-Level messages.

Change-Id: I0cedc314166c9cc037bf20f5b7fef5552dd1253c
2024-04-25 11:14:37 -07:00
Nicholas Mosier
ed8a09303a mem-cache: Remove power-of-2 requirement for TreePLRU num leaves (#1061)
Remove the requirement in TreePLRU's implementation that the number of
leaves (i.e., the number of cache ways) be a power of two. Firstly, on
some recent processors, this is not the case---for example, Intel Golden
Cove's L1D has 12 ways. Secondly, the implementation of TreePLRU appears
to work just fine as-is with a way count that's not a power of two.

Change-Id: If2a27dc5bbe7a8e96684f79ce791df5c0b582230
2024-04-24 20:59:06 -07:00
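As a rough illustration of why a tree-based pseudo-LRU can work with a way count that is not a power of two, the sketch below stores the tree in a heap-style array with `num_leaves - 1` internal nodes. This is an independent sketch under that assumed layout, not the gem5 TreePLRU source; the class and method names are made up for the example.

```python
class TreePLRU:
    """Tree pseudo-LRU over an arbitrary number of leaves (cache ways)."""

    def __init__(self, num_leaves):
        if num_leaves < 1:
            raise ValueError("num_leaves must be at least 1")
        self.num_leaves = num_leaves
        # One bit per internal node: False means the next victim is in the
        # left subtree, True means it is in the right subtree.
        self.bits = [False] * (num_leaves - 1)

    def touch(self, way):
        """Mark a way as recently used by pointing its ancestors away from it."""
        node = way + self.num_leaves - 1  # heap index of the leaf
        while node != 0:
            parent = (node - 1) // 2
            came_from_right = node % 2 == 0  # right children have even indices
            self.bits[parent] = not came_from_right
            node = parent

    def victim(self):
        """Follow the bits from the root until a leaf is reached."""
        node = 0
        while node < self.num_leaves - 1:  # still an internal node
            node = 2 * node + (2 if self.bits[node] else 1)
        return node - (self.num_leaves - 1)
```

With `n` leaves, every internal node `i <= n - 2` has both children inside the array of `2n - 1` nodes, so the walk in `victim` always ends on a valid leaf even when leaves sit at two different depths.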
Ivana Mitrovic
42ffa52907 mem-ruby: Implement no_alloc Far Atomics in CHI (#994)
This PR introduces a missing piece of the far atomic implementation. It
incorporates several changes:

- Enable 2-level and 4-level (and N-level) cache hierarchies, removing
Atomic_NoWait transactions
- Fix Unique Near policy implementation that raised abort
- Add support for alloc_on_atomic == False. Enables Far Atomics on
systems where the HNF does not allocate evicted lines at LLC (Like in
WriteUpdate).
2024-04-18 11:35:47 -07:00
Matthew Poremba
7e2d8dee42 mem,gpu-compute: Implement GPU TCC directed invalidate (#1011)
The GPU device currently supports large BAR which means that the driver
can write directly to GPU memory over the PCI bus without using SDMA or
PM4 packets. The gem5 PCI interface only provides an atomic interface
for BAR reads/writes, which means the values cannot go through timing
mode Ruby caches. This causes bugs as the TCC cache is allowed to keep
clean data between kernels for performance reasons. If there is a BAR
write directly to memory bypassing the cache, the value in the cache is
stale and must be invalidated.

In this commit a TCC invalidate is generated for all writes over PCI
that go directly to GPU memory. This will also invalidate TCP along the
way if necessary. This currently relies on the driver synchronization
which only allows BAR writes in between kernels. Therefore, the cache
should only be in I or V state.

To handle a race condition between invalidates and launching the next
kernel, the invalidates return a response and the GPU command processor
will wait for all TCC invalidates to be complete before launching the
next kernel.

This fixes issues with stale data in nanoGPT and possibly PENNANT.
2024-04-15 13:18:01 -07:00
Pranith Kumar
769f750eb9 mem-cache: Implement AssociativeSet from AssociativeCache
AssociativeSet can reuse most of the generic cache library code with the
addition of a secure bit. This reduces duplicated code.

Change-Id: I008ef79b0dd5f95418a3fb79396aeb0a6c601784
2024-04-10 16:17:57 -04:00
Pranith Kumar
f3bc10c168 mem-cache: Derive tagged entry from cache entry
The tagged entry can be derived from the generic cache entry and add the secure
flag that it needs. This reduces code duplication.

Change-Id: I7ff0bddc40604a8a789036a6300eabda40339a0f
2024-04-10 16:17:57 -04:00
Pranith Kumar
8fb3611614 mem-cache: prefetch: Implement DCPT tables using cache library
The DCPT table is better built using the generic cache library since we do not
need the secure bit.

Change-Id: I8a4a8d3dab7fbc3bbc816107492978ac7f3f5934
2024-04-10 16:17:57 -04:00
Pranith Kumar
2c7d4bed66 mem-cache: Implement VFT tables using cache library
The frequency table is better built using the generic cache library instead of the
AssociativeSet since the secure bit is not needed for this structure.

Change-Id: Ie3b6442235daec7b350c608ad1380bed58f5ccf4
2024-04-10 16:17:57 -04:00
Matthew Poremba
1d64669473 mem,gpu-compute: Implement GPU TCC directed invalidate
The GPU device currently supports large BAR which means that the driver
can write directly to GPU memory over the PCI bus without using SDMA or
PM4 packets. The gem5 PCI interface only provides an atomic interface
for BAR reads/writes, which means the values cannot go through timing
mode Ruby caches. This causes bugs as the TCC cache is allowed to keep
clean data between kernels for performance reasons. If there is a BAR
write directly to memory bypassing the cache, the value in the cache is
stale and must be invalidated.

In this commit a TCC invalidate is generated for all writes over PCI
that go directly to GPU memory. This will also invalidate TCP along the
way if necessary. This currently relies on the driver synchronization
which only allows BAR writes in between kernels. Therefore, the cache
should only be in I or V state.

To handle a race condition between invalidates and launching the next
kernel, the invalidates return a response and the GPU command processor
will wait for all TCC invalidates to be complete before launching the
next kernel.

This fixes issues with stale data in nanoGPT and possibly PENNANT.

Change-Id: I8e1290f842122682c271e5508a48037055bfbcdf
2024-04-10 11:35:25 -07:00
Matthew Poremba
833392e7b2 mem-ruby,gpu-compute: Allow memory reqs without inst
The GPUDynInst for sending memory requests through the CU's data port
is required but only used for DPRINTFs. Relax this constraint so that
the methods can be reused for requests such as probes generated by the
GPU device.

Change-Id: I16094e400968225596370b684d6471580888d98a
2024-04-10 11:35:24 -07:00
Bobby R. Bruce
3af15a535e mem-cache, configs, arch-arm: Handle partitioning policies through a PartitionManager (#966)
This PR offloads some of the partitioning logic to the partitioning
manager, effectively changing the partitioning interface. Rather than
always relying on the PartitionFieldExtention data structure to convey
partition IDs, we make it implementation defined by introducing the
partitioning manager abstraction. We want users to be able to extract
the partitionId more flexibly, and this requires using a SimObject.

Users can extend the PartitionManager, overriding
readPacketPartitionId, thereby providing their own means of
injecting/extracting partitioning data from a packet.
2024-04-08 16:05:17 -07:00
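The idea of making the partition ID lookup overridable can be sketched as follows. This is a plain Python illustration, not gem5 code: `Packet`, `AddressRangePartitionManager`, the snake_case method name, and the `boundary` parameter are all hypothetical, and a real manager would be a SimObject.

```python
class Packet:
    """Hypothetical stand-in for a memory packet."""

    def __init__(self, addr):
        self.addr = addr


class PartitionManager:
    """Base manager: subclasses decide how a partition ID is obtained."""

    def read_packet_partition_id(self, pkt):
        # Assumed default for this sketch: everything is in partition 0.
        return 0


class AddressRangePartitionManager(PartitionManager):
    """Example override that derives the partition ID from the address."""

    def __init__(self, boundary):
        self.boundary = boundary

    def read_packet_partition_id(self, pkt):
        return 0 if pkt.addr < self.boundary else 1
```

A cache that only calls `read_packet_partition_id` does not need to know whether the ID came from a packet extension, the address, or some other source.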
Giacomo Travaglini
bdb08a5b6c arch-arm, dev-arm: Fix typo in PartitionFieldExtention name
Rename PartitionFieldExtention into PartitionFieldExtension

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I8072adf78d81b94c5b8bc61a317c0238cf0a9fd9
2024-04-07 11:45:57 +01:00
Giacomo Travaglini
dd45e1c319 misc: Make PartitionFieldExtention private to Arm
The new ISA-agnostic interface is the PartitionManager.
We therefore make the PartitionFieldExtention private to the
Arm implementation of memory partitioning (FEAT_MPAM)

Any other partitioning implementation should override the
PartitionManager::readPacketPartitionID to provide a means
for extracting partitioning data (partition_id) from the
incoming Packet.

With this commit we also define an MPAM MSC which is
supposed to be the partitioning manager for the
Memory System Component

Change-Id: I6959ace0c0cbca549dcc1aacd53dff223b5fe328
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-04-07 11:45:57 +01:00
Víctor Soria Pardos
98358da968 mem-ruby: Implement Atomic No Alloc Policy
Add alternative implementation to far atomics when the flag alloc_on_commit
is false. The implementation fetches the data, performs the atomic and
writes back the cache line to main memory.

Co-authored-by: Fabian Schätzle <f.schaetzle@fz-juelich.de>
Change-Id: I8797fbc68448e1866a292f4afeedd3613113dddd
2024-04-06 18:51:11 +02:00
Víctor Soria Pardos
5a6a3be6da mem-ruby: Fix policy_type condition in CHI
Fix if-else condition in CHI-cache-actions to correctly
support policy_type Present Near (2)

Change-Id: Ib776d847a908a8ac7693c2d10405bc0c4a9d767d
2024-04-04 10:55:56 +02:00
Víctor Soria Pardos
7ee574b309 mem-ruby: Remove AtomicReturn_NoWait from CHI
To make Atomic transaction recursive and enable 2-level config,
remove AtomicReturn_NoWait and other level-dependent code

GitHub Issue: https://github.com/gem5/gem5/issues/882

Change-Id: Iac468cdb8a3b5914c8f05c5cedde866ce85f359a
2024-04-04 10:54:42 +02:00
Minje Jun
ffd0680a2c mem-ruby: Copyback UD_RU line when evicted in CHI protocol (#945)
This is a follow-up fix to #791 (mem-ruby: Fix possible dirty line loss
in CHI when ReadShared hit on UD line).
A UD_RU line may have stale data since the upstream could have updated the
line, so its local cache line data is treated as invalid
(dataValid=false). But when the line is evicted, it must be written back
to downstream because the upstream may have the line in clean state
(UC). This change fixes it by copying back the UD_RU line while
keeping its dataValid as false.

Example error case:
- L3 was in UD_RSC and being evicted without back-invalidation. LLC (HN)
was in RU state.
- Because there's still an upstream sharer, L3 sends WriteClean.
- Because the data state was unique and dirty, L3 sends CBWrData_UD_PD.
- LLC becomes UD_RU.
- When the line is evicted from LLC (LocalHN_Eviction), the line is just
dropped, causing the loss of the dirty copy

Co-authored-by: Minje Jun <minje.jun@samsung.com>
2024-04-03 08:33:22 -07:00
Giacomo Travaglini
9ab97c8930 mem-cache: Move partitioningPolicies to the PartitionManager
Change-Id: I13b41e918ed3864e1a52940786b3eec063253e1d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-26 12:12:24 +00:00
Giacomo Travaglini
d0539fe7cb mem-cache: Define a PartitionManager to handle partitioning
This is a first step towards offloading some of the partitioning
logic to the partitioning manager. We start with this patch
by replacing the static readPacketPartitionId into a virtual
method owned by the manager.

The issue with readPacketPartitionId as of now is that it relies
on the fixed PartitionFieldExtention.
We want users to be able to extract the partitionId more flexibly
and this requires using a SimObject

Change-Id: I3bd2e81e2a97c55fc83548956fc59f422c8049a6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-26 12:12:15 +00:00
Yan Lee
84da503d37 mem: Fix callback of functional access in port wrapper (#938)
In the previous implementation of port_wrapper, recvFunctional() called
the timing request callback. This appears to be a typo, and this change
fixes it.
2024-03-18 08:21:43 -07:00
Tiago Mück
942979162a READ_MODIFY_WRITE flag fix (#922)
Change bit for Request::READ_MODIFY_WRITE, which was the same as
Request::ACQUIRE.

Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2024-03-11 08:32:11 -07:00
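The effect of two flags sharing one bit can be shown with a small sketch. The bit positions below are hypothetical and are not the values used in gem5's `Request` class.

```python
# Hypothetical bit positions, chosen only for illustration.
ACQUIRE = 1 << 4
BUGGY_READ_MODIFY_WRITE = 1 << 4  # collides with ACQUIRE
FIXED_READ_MODIFY_WRITE = 1 << 5  # distinct bit after the fix


def has_flag(flags, flag):
    """Return True if every bit of `flag` is set in `flags`."""
    return (flags & flag) == flag
```

With the colliding definition, a request marked only as read-modify-write would also test positive for acquire, which is the kind of aliasing the commit removes.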
Giacomo Travaglini
c57a6b0d59 mem-cache: Add support for partitioning caches (#765)
* Add Cache partitioning policies to manage and enforce cache
partitioning:
    * Add Way partition policy 
    * Add MaxCapacity partition policy
* Add PartitionFieldsExtension Extension class for Packets to store
Partition IDs for cache partitioning and monitoring
* Modify Cache SimObjects to store partition policies
* Modify Cache block eviction logic to use new partitioning policies

Co-authored-by: Adrian Herrera <adrian.herrera@arm.com>

Change-Id: Ib35153a8b46803c22a433926270d82e5e19ce544
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-04 09:44:01 +00:00
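A way-based partitioning policy can be pictured as a filter over eviction candidates. The sketch below is an assumption-laden illustration, not the gem5 implementation: the function name, the dictionary layout, and the choice to leave unknown partitions unrestricted are all made up for the example.

```python
def filter_ways_for_partition(candidate_ways, partition_id, allowed_ways):
    """Keep only the ways that the given partition is allowed to use.

    allowed_ways maps a partition ID to a set of way indices. Partitions
    without an entry are left unrestricted (an assumption of this sketch).
    """
    allowed = allowed_ways.get(partition_id)
    if allowed is None:
        return list(candidate_ways)
    return [way for way in candidate_ways if way in allowed]
```

The replacement policy would then pick its victim from the filtered list, so one partition cannot evict blocks held in ways reserved for another.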
Giacomo Travaglini
c1d5ffe7c7 mem-cache: Prefetchers Improvements (#872)
This pull request contains a set of small patches which fix some bugs in
the gem5 prefetchers, and align out-of-the-box prefetcher performance
more closely with that which a typical user would expect.

The performance patches have been tested with an out-of-the-box
(untuned) Stride prefetcher configuration against a set of SPEC 2017
SimPoints, and show a small-to-modest IPC uplift across about half
the benchmarks, with no significant IPC degradation.

The new defaults were identified as part of work on gem5 prefetchers
undertaken by Nikolaos Kyparissas while on internship at Arm.

This PR is an updated version of PR #564, which was reverted due to Bug
#580. Bug #580 was fixed in PR #871. This PR updates #564 to the latest
state of the develop branch, and should be applied after PR #871.
2024-03-04 09:09:47 +00:00
Hristo Belchev
27c8355565 mem-cache: Add support for partitioning caches
* Add Cache partitioning policies to manage and enforce cache partitioning:
    * Add Way partition policy
    * Add MaxCapacity partition policy
* Add PartitionFieldsExtension Extension class for Packets to store
  Partition IDs for cache partitioning and monitoring
* Modify Cache Tags SimObjects to store partition policies
* Modify Cache Tags block eviction logic to use new partitioning policies
* Add example system and TrafficGen configurations for testing Cache
  Partitioning Policies

Change-Id: Ic3fb0f35cf060783fbb9289380721a07e18fad49
Co-authored-by: Adrian Herrera <adrian.herrera@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-01 15:26:38 +00:00
amatabsc
0d79b5098b Increased packets sanity check limit to 1024 (#797)
For some simulations with large VLEN values (e.g. 8k and 16k), more
packets were created on the fly and, as a consequence, the simulations
failed. The sanity check limit has been increased to handle these
high-VLEN cases.

Supervised by [@aarmejach](https://github.com/aarmejach)
Change-Id: I137b0f3113687b3fc9c4154d19ca5e8017e6e992

Co-authored-by: Adrià Armejach <adria.armejach@bsc.es>
2024-02-29 08:12:59 -08:00
Matt Sinclair
777ac91bb0 mem-ruby: Add categorization of bypassed atomics in TCC (#899)
Adds categorization of bypassed atomics in TCC to the TBE as either
return or no-return, which gets consumed in pa_performAtomic to
determine if atomic logs should be stored.

Reestablishes TCC bypassed atomics after #546.

Change-Id: Ibc1fa2b795ef1c47c3893a0b1911fa7993522d38
2024-02-28 14:26:09 -06:00
Daniel Kouchekinia
de615836f0 mem-ruby: Add categorization of bypassed atomics in TCC
Adds categorization of bypassed atomics in TCC to the TBE as either return
or no-return, which gets consumed in pa_performAtomic to determine if
atomic logs should be stored.

Reestablishes TCC bypassed atomics after #546.

Change-Id: Ibc1fa2b795ef1c47c3893a0b1911fa7993522d38
2024-02-27 23:12:45 -06:00
Daniel Kouchekinia
0fd73f4e05 Merge branch 'develop' into missing-tcc-transition
2024-02-27 16:46:30 -06:00
Hristo Belchev
e78a6b71fe Merge branch 'develop' into qos-qpolicy-assertions-fix
2024-02-27 09:38:34 +00:00
Daniel Kouchekinia
6374697a20 mem-ruby: Add missing transition for SLC writes to VIPER TCC
Bypassed write-through requests on invalid lines in the TCC should be
written through to the directory. This transition was previously
missing.

Change-Id: I16b117c4e085ce6be0ed5297aa0129d52cd35a51
2024-02-26 13:13:06 -06:00
Ivana Mitrovic
61ee36eee6 mem-ruby: Fix possible dirty line loss in CHI when ReadShared hit on UD line (#791)
When a ReadShared hits on a UD line and there are no sharers, this change
makes the downstream pass Dirty to the requestor whenever possible,
even though it doesn't deallocate the line. This moves the requestor
to SD and the downstream to UD_RSD.
In the previous implementation, a loosely exclusive intermediate cache
could cause loss of dirty data. An example error condition is given below.
   
Configurations
L2 cache: Roughly inclusive to L1 without back-invalidation
- dealloc_on_* = false
- dealloc_backinv_* = false
L3 cache: Roughly exclusive to L2 without back-invalidation
- alloc_on_readshared = true
- alloc_on_readunique = false
- dealloc_on_shared = false
- dealloc_on_unique = true
- dealloc_backinv_* = false
- is_HN = false
LLC: Same clusivity as L3 except is_HN = true
For all caches, allow_SD = true and fwd_unique_on_readshared = false
    
Example problem sequence:
1. L1 sends ReadUnique then becomes UD. L2 is UC_RU. L3 and LLC are RU.
2. L1 evicts the line to L2 by WriteBackFull (UD_PD). L2 becomes UD.
3. L2 evicts the line to L3 using WriteBackFull (UD_PD). L3 becomes UD.
4. L1 reads the line with ReadShared which misses on L2.
5. L2 reads the line with ReadShared which hits on L3. L3 becomes UD_RSC
because it doesn't deallocate the line (dataToBeInvalid=false)
6. L3 evicts the line to LLC by WriteCleanFull (UD_PD) because L3
doesn't back-invalidate and still has a sharer. The local cache line is
invalidated by Deallocate_CacheBlock. L3 becomes RUSC and LLC becomes
UD_RU.
7. When UD_RU is evicted at LLC, the UD_RU line is dropped expecting the
upstream to writeback, causing loss of dirty data
2024-02-26 10:06:17 -08:00
Giacomo Travaglini
1d5be8d9e5 mem-cache: Optimize strided prefetcher address generation
This commit optimizes the address generation logic in the strided
prefetcher by introducing the following changes

(d is the degree of the prefetcher)

* Evaluate the fixed prefetch_stride only once (and not d-times)
* Replace 2d multiplications (d * prefetch_stride and distance *
prefetch_stride) with additions by updating the new base prefetch
address while looping

Change-Id: I3ec0c642bc9ec7635b0d38308797e99b645304bb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-26 10:40:45 +00:00
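The optimization described in the commit above can be sketched by comparing a per-iteration multiplication with a running sum. This is an illustration only: the function names and the exact indexing (first prefetch one stride past the skipped distance) are assumptions of the sketch and may differ from the gem5 prefetcher.

```python
def prefetch_addrs_multiply(base, stride, degree, distance):
    """Recompute each address from scratch with multiplications."""
    return [base + distance * stride + d * stride
            for d in range(1, degree + 1)]


def prefetch_addrs_accumulate(base, stride, degree, distance):
    """Compute the skipped start once, then add the stride per iteration."""
    addr = base + distance * stride  # evaluated once, outside the loop
    addrs = []
    for _ in range(degree):
        addr += stride  # running base address replaces d * stride
        addrs.append(addr)
    return addrs
```

Both functions produce the same addresses; the second one does a single multiplication regardless of the degree.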
Nikolaos Kyparissas
a5fece3b91 mem: added distance parameter to stride prefetcher
The Stride Prefetcher will skip this number of strides ahead of the
first identified prefetch, then generate `degree` prefetches at
`stride` intervals. A value of zero indicates no skip (i.e. start
prefetching from the next identified prefetch address).

This parameter can be used to increase the timeliness of prefetches by
starting to prefetch far enough ahead of the demand stream to cover
the memory system latency.

[Richard Cooper <richard.cooper@arm.com>:
- Added detail to commit comment and `distance` Param documentation.
- Changed `distance` Param from `Param.Int` to `Param.Unsigned`.
]

Change-Id: I4ce79c72d74445b12acf68e0a54e13966e30041c
Co-authored-by: Richard Cooper <richard.cooper@arm.com>
Signed-off-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-26 10:40:45 +00:00
Nikolaos Kyparissas
1ccdf407cb mem-cache: Added clean eviction check for prefetchers.
pkt->req->isCacheMaintenance() would not include a check
for clean eviction before notifying the prefetcher,
causing gem5 to crash.

Change-Id: I4a56c7384818c63d6e2263f26645e87cef1243cb
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-26 10:40:45 +00:00
Richard Cooper
9fe998a8c0 mem-cache: Update default prefetch options.
Update the default prefetch options to achieve out-of-the-box
prefetcher performance closer to that which a typical user would
expect. Configurations that set these parameters explicitly will be
unaffected.

The new defaults were identified as part of work on gem5 prefetchers
undertaken by Nikolaos Kyparissas while on internship at Arm.

Change-Id: Ia6c1803c86e42feef01de40c34d928de50fe0bed
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-26 10:40:45 +00:00
Richard Cooper
05f33fbef5 mem-cache: Squash prefetch queue entries by block address.
Prefetch queue entries were being squashed by comparing the address
of each queued prefetch against the block address of the demand
access. Only prefetches that happen to fall on a cache-line block
boundary would be squashed.

This patch converts the prefetch addresses to block addresses before
comparison.

Change-Id: I3a80a1e3d752f925595e33edebf5359d2cc67182
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-26 10:40:45 +00:00
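The squashing bug fixed by the commit above comes down to comparing an unaligned address with an aligned one. The sketch below assumes a power-of-two block size and uses hypothetical function names; it models the prefetch queue as a plain list of addresses.

```python
def block_address(addr, block_size):
    """Align an address down to its cache-block boundary.

    Assumes block_size is a power of two.
    """
    return addr & ~(block_size - 1)


def squash_old(queue, demand_addr, block_size):
    """Old behaviour: raw prefetch address vs. demand block address."""
    demand_block = block_address(demand_addr, block_size)
    return [pf for pf in queue if pf != demand_block]


def squash_new(queue, demand_addr, block_size):
    """New behaviour: both sides are converted to block addresses first."""
    demand_block = block_address(demand_addr, block_size)
    return [pf for pf in queue
            if block_address(pf, block_size) != demand_block]
```

With 64-byte blocks, a queued prefetch to `0x1010` and a demand access to `0x1008` fall in the same block, but only the new comparison removes the prefetch.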
Hristo Belchev
2138a4ec92 mem: Fix LIFO q_policy and add assertions
* Fix selectPacket() in LIFO Queue Policy to correctly return the end of
  the `deque` backing store for its packet queue
* Move selectPacket() implementations for FIFO and LIFO queues into
  `q_policy.cc` file

Change-Id: I8c35e5fc83dc380b19f52be14c18b1f414f9e141
2024-02-22 21:57:08 +00:00
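The difference between the two queue policies can be sketched with Python's `collections.deque`. This assumes new packets are appended at the back of the queue; the function names are hypothetical and do not mirror the C++ `selectPacket()` signature.

```python
from collections import deque


def select_fifo(queue):
    """FIFO: pick the oldest packet, at the front of the deque."""
    return queue[0]


def select_lifo(queue):
    """LIFO: pick the newest packet, the last element of the deque."""
    return queue[-1]
```

For a queue filled in the order `a`, `b`, `c`, the FIFO policy selects `a` and the LIFO policy selects `c`.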
Hristo Belchev
f20ac07dde mem: Fix assertions in LRG Q policy
Fix assertions in LRG Queue Policy to correctly assert requestor and
list validity

Change-Id: I84e3f5b8936b74e7ac675faf7a3e6b9999026781
2024-02-22 14:16:20 +00:00
Richard Cooper
308fef6b46 mem-cache: Fix possible crash in base prefetcher (#871)
When processing memory Packets for prefetch, the `PrefetchInfo` class
constructor will attempt to copy the `Packet` data. In cases where the
`Packet` under consideration does not contain data, an assertion will be
triggered in the Packet's `getConstPtr` method, causing the simulation
to crash.

This problem was first exposed by Bug #580 when processing an
`UpgradeReq` memory packet.

This patch addresses the problem by suppressing the copying of the
`Packet` data during construction of a `PrefetchInfo` object in cases
where the `Packet` has no data.

This patch addresses Bug #580 [1], which was exposed by PR #564 [2],
subsequently reverted by PR #581 [3]

[1] https://github.com/gem5/gem5/issues/580
[2] https://github.com/gem5/gem5/pull/564
[3] https://github.com/gem5/gem5/pull/581

Change-Id: Ic1e828c0887f4003441b61647440c8e912bf0fbc
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-17 14:14:57 -08:00
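The guard described in the commit above can be sketched as follows. The `Packet` and `PrefetchInfo` classes here are simplified Python stand-ins with hypothetical method names, not the gem5 classes; the assertion in `get_const_data` models the one that caused the crash.

```python
class Packet:
    """Hypothetical packet that may or may not carry data."""

    def __init__(self, addr, data=None):
        self.addr = addr
        self._data = data

    def has_data(self):
        return self._data is not None

    def get_const_data(self):
        # Mirrors the failure mode: reading data from a data-less packet
        # trips an assertion.
        assert self._data is not None, "packet has no data"
        return self._data


class PrefetchInfo:
    """Copies the packet data only when the packet actually has some."""

    def __init__(self, pkt):
        self.addr = pkt.addr
        self.data = bytes(pkt.get_const_data()) if pkt.has_data() else None
```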
Vishnu Ramadas
690b2b9462 gpu-compute, mem-ruby: Add comments and reformat code
Change-Id: Id2b3886dce347fdcfcad22009a42b92febc00a6c
2024-02-09 12:17:24 -06:00
Vishnu Ramadas
0e93e6142a arch-vega, gpu-compute, mem-ruby: Remove extra empty lines
Change-Id: I18770ec7e38c4a992a0ae6de95b0be49ab4426c2
2024-02-09 12:17:24 -06:00
Vishnu Ramadas
23dc98ea72 mem-ruby: Add SQC cache invalidation support to GPU VIPER
This commit adds support for cache invalidation in GPU VIPER protocol's
SQC cache. To support this, the commit also adds L1 cache invalidation
framework in the Sequencer such that the Sequencer sends out an
invalidation request for each line in the cache and declares completion
once all lines are evicted.

Change-Id: I2f52eacabb2412b16f467f994e985c378230f841
2024-02-09 12:14:57 -06:00
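The "one request per line, done when all lines are evicted" scheme described above can be sketched with a set of pending lines. The class and method names are hypothetical; the real Sequencer logic is written in C++ and interacts with the Ruby protocol.

```python
class InvalidationTracker:
    """Tracks an in-flight cache invalidation, one request per line."""

    def __init__(self, cached_lines):
        self.pending = set(cached_lines)

    def requests_to_send(self):
        """One invalidation request for each line currently in the cache."""
        return sorted(self.pending)

    def on_line_evicted(self, line_addr):
        """Record an eviction; return True once every line is gone."""
        self.pending.discard(line_addr)
        return not self.pending
```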
Hristo Belchev
fd3aac1518 mem-cache: Fix circular dependency in QoS mem (#857)
This PR removes a circular dependency between `QoSMemSinkCtrl` and
`QoSMemSinkInterface` that prevented the `controller()` function of
`QoSMemSinkInterface` from being used by removing the default value for
`QoSMemSinkCtrl.interface`.

Change-Id: I4ecc39b974e239be1a2e9285e1f6f8ea873c018d
2024-02-09 11:32:16 +00:00