derek/gem5

Author	SHA1	Message	Date
Matt Sinclair	ec633b3d68	dev-amdgpu,mem-ruby: Add support to checkpoint and restore between kernels in GPUFS (#377 ) Earlier, GPU checkpointing was working only if a checkpoint was created before the first kernel execution. This pull request adds support to checkpoint in-between any two kernel calls. It does so by doing the following. - Adds flush support in the GPU_VIPER protocol - Adds flush support in the GPUCoalescer - Updates cache recorder to use the GPUCoalescer during simulation cooldown and cache warmup times.	2023-10-10 09:41:21 -05:00
Giacomo Travaglini	00748c7901	mem-ruby: Fix CHI fromSequencer helper function This has been broken by #177 Change-Id: I52feff4b5ab2faf0aa91edd6572e3e767c88e257 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-10-06 14:51:11 +01:00
Vishnu Ramadas	a19667427a	mem-ruby: Add BUILD_GPU guard to ruby cooldown and warmup phases Ruby was recently updated to support flushes and warmup for GPUs. Since this support uses the GPUCoalescer, non-GPU builds face a compile time issue. This is because GPU code is not built for non-GPU builds. This commit adds "#if BUILD_GPU" guards around the GPU-related code in common files like AbstractController.hh, CacheRecorder.*, RubySystem.cc, GPUCoalescer.hh, and VIPERCoalescer.hh. This support allows GPU builds to use flushing while non-GPU builds compile without problems. Change-Id: If8ee4ff881fe154553289e8c00881ee1b6e3f113	2023-10-05 18:59:54 -05:00
Víctor Soria	6411b2255c	mem-ruby,configs: Add CHI far atomics support Introduce far atomic operations in CHI protocol. Three configuration parameters have been used to tune this behavior: policy_type: sets the atomic policy to one of those described in our paper; atomic_op_latency: simulates the AMO ALU operation latency; comp_anr: configures the Atomic No return transaction to split CompDBIDResp into two different messages, DBIDResp and Comp. Change-Id: I087afad9ad9fcb9df42d72893c9e32ad5a5eb478	2023-10-04 19:19:08 +02:00
Vishnu Ramadas	ae5a51994c	mem-ruby: Update cache recorder to use GPUCoalescer port for GPUs Previously, the cache recorder used the Sequencer to issue flush requests and cache warmup requests. The GPU however uses GPUCoalescer to access the cache, and not the Sequencer. This commit adds a GPUCoalescer map to the cache recorder and uses it to send flushes and cache warmup requests to any GPU caches in the system Change-Id: I10490cf5e561c8559a98d4eb0550c62eefe769c9	2023-10-02 19:05:10 -05:00
Vishnu Ramadas	085789d00c	mem-ruby: Add flush support to GPU_VIPER protocol This commit adds flush support to the GPU VIPER coherence protocol. The L1 cache will now initiate a flush request if the packet it receives is of type RubyRequestType_FLUSH. During the flush process, the L1 cache will send a request to L2 if it is in either the V or I state. L2 will issue a flush request to the directory if its cache line is in the valid state before invalidating its copy. The directory, on receiving this request, writes data to memory and sends an ack back to the L2. L2 forwards this ack back to the L1, which then ends the flush by calling the write callback. Change-Id: I9dfc0c7b71a1e9f6d5e9e6ed4977c1e6a3b5ba46	2023-10-02 19:05:10 -05:00
Vishnu Ramadas	61e39d5b26	mem-ruby: Add cache cooldown and warmup support to GPUCoalescer The GPU Coalescer does not contain cache cooldown and warmup support. This commit updates the coalescer to support cache cooldown during flush and warmup during checkpoint restore. Change-Id: I5459471dec20ff304fd5954af1079a7486ee860a	2023-10-02 19:05:04 -05:00
Vishnu Ramadas	a50ead5907	mem-ruby: Add Flush as a supported memory type in VIPERCoalescer This commit adds flush as a recognized memory type in VIPERCoalescer. Change-Id: I0f1b6f4518548e8e893ef681955b12a49293d8b4	2023-10-02 19:02:55 -05:00
Giacomo Travaglini	f5968da41c	mem-ruby: start using txnid and DBID identifiers in CHI transactions (#288 ) With this PR our CHI implementation starts making use of the txnid and DBID identifiers. Note: we were already making use of the txnId for DVM messages to convey the DVM address. This is still the case. In the future we should realign the DVM logic so that the txnId is solely used as a transaction identifier.	2023-09-26 09:51:47 +01:00
Giacomo Travaglini	aec1d081c8	mem-ruby: Populate missing txnId field to CompDBID_Stale response Change-Id: I6861d27063b13cd710e09c153d15062640c887fe Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-09-18 15:23:21 +01:00
Giacomo Travaglini	320454b75f	mem-ruby: Populate missing txnId field to CompI response Change-Id: I02030f61dd4e64a29b16e47d49bcde8c723260b5 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-09-15 12:13:00 +01:00
Gautham Pathak	178db9e270	mem-ruby: patch fixes a protocol error in MOESI_CMP_Directory when there is a race between FwdGetX and PUTX on the owner. The owner in this case hands off ownership to the GetX requestor and the PUTX still goes through. But since the owner has changed, the state should go back to M and the PUTX is essentially trashed. An Unblock to the Directory in this case will give an undefined transition. I have added transitions which indicate that when an Unblock is served to the Directory, it means that some kind of ownership transfer has happened while a PUTX/PUTO was in progress. Change-Id: I37439b5a363417096030a0875a51c605bd34c127	2023-09-13 19:09:13 -04:00
Gautham Pathak	87db6df8f6	mem-ruby: This commit patches an error in AbstractController.cc After calling m5_dump_reset_stats(0,0) in a test program, some statistics like l1_controllers.L1Dcache.m_demand_hits, l1_controllers.L1Dcache.m_demand_misses, l1_controllers.L1Dcache.m_demand_accesses were not getting reset in the newer stat dumps. This one line patch fixes that. Changes were tested with calling two m5_dump_reset_stats(0,0) in a row for a system with 1 core, tested on both SE and FS. Credits to Gabriel Busnot for finding the fix. Change-Id: I19d75996fa53d31ef20f7b206024fd38dbeac643	2023-09-13 14:07:16 -04:00
Giacomo Travaglini	da740b1cdd	mem-ruby: Add a DBID field to the CHIResponseMsg data type This will hold the CHI Data Buffer Identifier (DBID) field. The DBID allows a Completer of a transaction to provide its own identifier for a transaction ID. This new ID will be used as a TxnId field by a following WriteData/CompData/CompAck response. For now we only set it to the original txnId (identity mapping) Change-Id: If30c5e1cafbe5a30073c7cd01d60bf41eb586cee Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-09-08 08:38:13 +01:00
Giacomo Travaglini	4359567180	mem-ruby: Generate TxnId field for an incoming CHI request The TxnId field of a CHI request has so far been unused (other than for DVM transactions). With this patch we always initialize the field when we extract a ruby request from the sequencer port. According to specs (IHI0050F): A 12-bit field is defined for the TxnID with the number of outstanding transactions being limited to 1024. A Requester is permitted to reuse a TxnID value after it has received either: * All responses associated with a previous transaction that have used the same value. * A RetryAck response for a previous transaction that used the same value Change-Id: Ie48f0fee99966339799ac50932d36b2a927b1c7d Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-09-08 08:38:13 +01:00
Giacomo Travaglini	f032eeae93	mem-ruby: Provide a fromSequencer helper function Based on the CHIRequestType, it automatically tells if the request has been originated from the sequencer (CPU load/fetch/store) Change-Id: I50fd116c8b1a995b1c37e948cd96db60c027fe66 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-09-08 08:38:13 +01:00
Matthew Poremba	2da54d5a4f	mem-ruby: Reorder SLC atomic and response actions Currently the MOESI_AMD_Base-directory transition for system level atomics sends the response message before the atomic is performed. This was likely done because atomics are supposed to return the value of the data before the atomic is performed and by simply ordering the actions this way that was taken care of. With the new atomic log feature, the atomic values are pulled from the log by the coalescer on the return path. Therefore, these actions can be reordered. However, it is now necessary that the atomics be performed before sending the response so that the log is populated and copied by the response action. This should fix #253 . Change-Id: Ie7e178f93990975367de2cc3e89e5ef9c9069241	2023-09-01 10:36:54 -05:00
Bobby R. Bruce	0e323bc409	mem: Atomic ops to same address (#200) Augmenting the DataBlock class with a change log structure to record the effects of atomic operations on a data block and service these changes if the atomic operations require return values. Although the operations are atomic, the coalescer need not send unique memory requests for each operation. Atomic operations within a wavefront to the same address are now coalesced into a single memory request. The response of this request carries all the necessary information to provide the requesting lanes unique values as a result of their individual atomic operations. This helps reduce contention for request and response queues in simulation. Previously, only the final value of the datablock after all atomic ops to the same address was visible to the requesting waves. This change corrects this behavior by allowing each wave to see the effect of its individual atomic op if a return value is necessary.	2023-08-30 23:53:35 -07:00
Bobby R. Bruce	68a48a2dfa	mem-ruby: fix CHI sending the wrong snoop response (#219 ) Do not respond with SnpRespData_I when the line is still present upstream.	2023-08-28 16:21:25 -07:00
Bobby R. Bruce	737c611e72	mem-ruby: fix assert on CHI ReadUnique (#218) DCT must be disabled when handling a ReadUnique where the copy needs to be upgraded. Previously we were just asserting as it was assumed DCT is only enabled for HNFs (which can "auto-upgrade"). However DCT may also be enabled for intermediate levels of distributed shared caches above the HNFs.	2023-08-28 16:06:09 -07:00
Bobby R. Bruce	4bd3d2f864	mem-ruby: Improve Ruby/CHI stats for in/out trans (#220) Currently we generate these stats for all defined Events in the protocol, which may generate too many stats that are never used. Though these don't appear in the stats.txt file, they unnecessarily increase simulation startup time and memory footprint. This patch limits those stats to events with the "in_trans" and/or "out_trans" properties. SLICC compiler then checks which combinations of event+state are possible when generating the stats. Also the possible level of detail for inTransLatHist was reduced. Only the number of transactions for each event+initial+final state combination is now accounted. Latency histograms are only defined per event type (similarly to outTransLatHist). This significantly reduces the final file size for generated stats.	2023-08-28 15:06:39 -07:00
Tiago Mück	9584d2efa9	mem-ruby: add in_trans/out_trans to CHI events Marks which events signal the beginning of incoming and outgoing transactions for generating inTransLatHist and outTransLatHist stats. Change-Id: I90594a27fa01ef9cfface309971354b281308d22 Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-08-23 17:25:50 -05:00
Tiago Mück	3360a87d5a	mem-ruby: optimize in/outTransLatHist stats Generating these stats for all defined Events may generate too many stats that are never used, which unnecessarily increases simulation startup time and memory consumption. This patch limits those stats to events with the "in_trans" and/or "out_trans" properties. SLICC compiler then checks which combinations of event+state are possible when generating the stats. Also the possible level of detail for inTransLatHist was reduced. Only the number of transactions for each event+initial+final state combinations is now accounted. Latency histograms are only defined per event type (similarly to outTransLatHist). This significantly reduces the final file size for generated stats. Change-Id: I29aaeb771436cc3f0ce7547a223d58e71d9cedcc Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-08-23 17:25:38 -05:00
Tiago Mück	a5fd6edea1	mem-ruby: fix CHI sending the wrong snoop response Do not respond with SnpRespData_I when the line is still present upstream. Change-Id: I2592e5c6637cfc0e83042169a245837648276e61 Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-08-23 17:04:09 -05:00
Tiago Mück	49f5ec16d1	mem-ruby: fix assert on CHI ReadUnique DCT must be disabled when handling a ReadUnique where the copy needs to be upgraded. Previously we were just asserting as it was assumed DCT is only enabled for HNFs (which can "auto-upgrade"). However DCT may also be enabled for intermediate levels of distributed shared caches above the HNFs. Change-Id: I9e29142a8d2f59ea61c1d90cda6b00c19435d6b7 Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-08-23 16:58:25 -05:00
Reiley Jeyapaul	c9ff54677f	mem-ruby: fix CHI Evict race condition When an Evict request is received from upstream for a shared line and the line is no longer cached locally (or on any other upstream cache), we need to also send an Evict downstream. In this case we need to wait until our outgoing Evict completes before completing the Evict from upstream in order to be able to resolve race conditions with incoming snoops. E.g.: while our outgoing Evict is pending we may receive a snoop requesting data, but we won't be able to complete this snoop if we have already completed all upstream Evicts and we no longer have the line. Change-Id: I23ac4f0a9c4ddd81e2425376c8d1e1c7fb66d107 Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-08-23 15:49:51 -05:00
Ranganath (Bujji) Selagamsetty	f6a453362f	mem: Atomic ops to same address Augmenting the DataBlock class with a change log structure to record the effects of atomic operations on a data block and service these changes if the atomic operations require return values. Although the operations are atomic, the coalescer need not send unique memory requests for each operation. Atomic operations within a wavefront to the same address are now coalesced into a single memory request. The response of this request carries all the necessary information to provide the requesting lanes unique values as a result of their individual atomic operations. This helps reduce contention for request and response queues in simulation. Previously, only the final value of the datablock after all atomic ops to the same address was visible to the requesting waves. This change corrects this behavior by allowing each wave to see the effect of its individual atomic op if a return value is necessary. Change-Id: I639bea943afd317e45f8fa3bff7689f6b8df9395	2023-08-23 14:45:25 -05:00
Daniel Kouchekinia	984499329d	mem-ruby,configs: Add GLC Atomic Latency VIPER Parameter (#110 ) Added a GLC atomic latency parameter (glc-atomic-latency) used when enqueueing response messages regarding atomics directly performed in the TCC. This latency is added in addition to the L2 response latency (TCC_latency). This represents the latency of performing an atomic within the L2. With this change, the TCC response queue will receive enqueues with varying latencies as GLC atomic responses will have this added GLC atomic latency while data responses will not. To accommodate this in light of the queue having strict FIFO ordering (which would be violated here), this change also adds an optional parameter bypassStrictFIFO to the SLICC enqueue function which allows overriding strict FIFO requirements for individual messages on a case-by-case basis. This parameter is only being used in the TCC's atomic response enqueue call. Change-Id: Iabd52cbd2c0cc385c1fb3fe7bcd0cc64bdb40aac	2023-07-23 15:57:06 -05:00
Adwaith R Krishna	427b4d596e	mem-garnet: Fix packet_id val in flit (#72 ) Change-Id: I163b5a32972783bf2e99f3383b9f86776577b727 Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>	2023-07-20 13:56:31 -07:00
Daniel Kouchekinia	1705853b12	mem-ruby: Added support for non-system-scope atomics in VIPER (#101 ) Added support for performing non-SLC-set atomics in the TCC. Previously, all atomics were being passed on by the TCC to the directory. With this change, atomics will only be passed on if the SLC bit is set or if the line isn't present or available in the TCC. If a non-SLC atomic is passed on to the directory because it is not present in the TCC, the atomic will be performed on the return path on the Data event. To accommodate the directory not performing the atomic in this case, this change also passes the SLC bit on to the directory. The previously-named "Atomic" action has been renamed to "AtomicPassOn", with the new "Atomic" corresponding to an atomic performed directly in the TCC. Change-Id: Ibf92f71ddceb38bd1b0da70b0a786cc4c3cf2669	2023-07-20 11:48:08 -05:00
Jason Lowe-Power	442923c414	Add feature to output citations automatically based on configuration (#90 ) This change adds a new file to m5out which is citations.bib. This file will contain the citations to the papers which describe the aspects of the gem5 simulator that the simulation uses. In other words, each simulation configuration could generate a different bib file referencing different works. Each SimObject can now have a set of citations associated with it. After the system is built (in `instantiate`), the citations.bib file is created by parsing all SimObjects that have been instantiated and taking the union of their associated citations. This commit is not meant to add all citations, but to act as an example for others to add more citations to gem5. Change-Id: Icd5c46fd9ee44adbeec1fea162657f5716f7e5ef Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2023-07-17 10:41:51 -07:00
Daniel Kouchekinia	f8f5dd98bf	mem-ruby: Added WIB State to VIPER TCC Cache (#67 ) Added WIB (Waiting on Writethrough Ack; Will be Bypassed) state which is transitioned to when a dirty line in the TCC is evicted in a bypassed read. Previously, we were transitioning to invalid. While a WI (Waiting on Writethrough Ack) state exists, transitions from it on WBAck deallocates the TBE, which contains SLC bit information needed to trigger the Bypass event when the read response from the directory comes in. Without this change, WB acknowledgements from the directory in read bypass evicts (with the SLC bit set) were being treated as if they were read responses, leading to an invalid transition panic. Change-Id: I703c3fe8af0366856552bb677810cb1a8f2896de	2023-07-17 10:17:47 -07:00
Gabriel Busnot	159953080a	mem-ruby: Fix of an address bug in MESI_Two_Level-dir.sm Physical access address and line address were mixed up in qw_queueMemoryWBRequest_partial Change-Id: I0b238ffc59d2bb3de221d96905c75b7616eac964 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67661 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-07-07 10:17:54 +00:00
Gabriel Busnot	20dd444273	mem-ruby: Switch to dequeueMemRspQueue() in all Ruby protocols Change-Id: I33bca345d985618e3fca62e9ddd5bcc3ad8226a3 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67659 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>	2023-07-07 10:17:54 +00:00
Gabriel Busnot	833afc3451	mem-ruby: AbstractController can send retry req to mem controller Prior to this patch, when a memory controller failed to send a response to AbstractController, it would not wake up until the next request. This patch gives Ruby models the opportunity to notify AbstractController of a memory response buffer dequeue so that it can send a retry request if necessary. A dequeueMemRspQueue function has been added to AbstractController to automate the dequeue+notify operation. Note that models that don't notify AbstractController will continue working as before. Change-Id: I261bb4593c126208c98825e54f538638d818d16b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67658 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Bobby Bruce <bbruce@ucdavis.edu>	2023-07-07 10:17:54 +00:00
Bobby R. Bruce	6dd60a6c1a	base,arch,mem: Remove {GE}M5_VAR_USED instances `[[maybe_unused]]` is to be used to specify that a variable may be unused. Change-Id: Ife2ac96111b3af13e182baba1f3456e48c3a9f9b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70397 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>	2023-05-08 22:54:06 +00:00
Bobby R. Bruce	fcb36458e2	misc: Fix 'unused variable' clang errors with gem5.fast Change-Id: I2bb8ac10e8db69fa82abe41577cd8e5db575e93d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70297 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>	2023-05-08 22:54:06 +00:00
Hoa Nguyen	09023d4158	mem-ruby: Not flushing data to memory when there's no dirty block Currently, taking a checkpoint with a ruby cache involves moving all the dirty data in cache to memory. This is done by continuing to simulate only the cache until all dirty data are flushed to the memory before taking the checkpoint. However, when the cache does not have dirty data, it is a problem if we keep simulating the cache. E.g., calling checkpoint caused the gem5 "empty event queue" assertion fault when running the ruby cache in atomic_noncaching mode. Since the mode bypasses the cache, all blocks are invalid and do not contain dirty data. Consequently, no event is placed on the event queue when we continue to simulate only the cache before taking the checkpoint. This patch fixes this problem by checking if there is any actionable item when trying to move dirty data to memory. If no block contains dirty data, we simply choose not to continue simulating the cache before taking the checkpoint. Change-Id: Idfa09be51274c7fc8a340e9e33167f5b32d1b866 Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69897 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu>	2023-04-17 21:51:43 +00:00
Matt Sinclair	ea623eb6e5	mem-ruby: fix whitespacing errors in RubySystem These errors cause other commits to fail pre-commit Change-Id: I379d2d7c73f88d0bb35de5aaa7d8cb70a83ee1dd Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69397 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2023-04-05 04:19:50 +00:00
Matt Sinclair	a030ff2745	mem-ruby: fix atomic deadlock with WB GPU L2 caches By default the GPU VIPER coherence protocol uses a WT L2 cache. However it has support for using WB caches (although this is not tested currently). When using a WB L2 cache for the GPU, this results in deadlocks with atomics. Specifically, when an atomic reaches the L2 and the line is currently in M or W, the line must be written back before the atomic can be performed. However, the current support has two issues: a) it never performs the atomic operation -- while VIPER currently assumes all atomics are system scope atomics and thus cannot be performed at the L2 and this transition requires the dirty line be written back before performing the atomic, the transition never performs the atomic nor does the response path handle it. b) putting the atomic action right after the write back is not safe because we need to ensure the requests are ordered when they reach memory -- thus we have to wait until the write back is acknowledged before it's safe to send/perform the atomic. To fix this, this change modifies the transition in question to put the atomic on the stalled requests buffer, which the WBAck will check when it returns to the L2 (and thus perform the atomic, which will result in the atomic being sent on to the directory). This fix has been tested and verified with both the per-checkin and nightly GPU Ruby Random tester tests (with a WB L2 cache). Change-Id: I9a43fd985dc71297521f4b05c47288d92c314ac7 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68978 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-03-22 04:00:38 +00:00
Matt Sinclair	92d920f994	mem-ruby: fix load deadlock with WB GPU L2 caches By default the GPU VIPER coherence protocol uses a WT L2 cache. However it has support for using WB caches (although this is not tested currently). When using a WB L2 cache for the GPU, this results in deadlocks with loads. Specifically, when a load reaches the L2 and the line is currently in the W state, that line must be written back before the load can be performed. However, the current transition for this in the L2 did not attempt to retry the load when the WB completes, resulting in a deadlock. This deadlock can be replicated by running the GPU Ruby random tester as is with a WB L2 cache instead of a WT L2 cache. To fix this, this change modifies the transition in question to put the load on the stalled requests buffer, which the WBAck will check when it returns to the L2 (and thus perform the load). This fix has been tested and verified with both the per-checkin and nightly GPU Ruby Random tester tests (with a WB L2 cache). Change-Id: Ieec4f61a3070cf9976b8c3ef0cdbd0cc5a1443c6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68977 Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2023-03-22 04:00:38 +00:00
Melissa Jost	6884aeb86a	base: Fix gcc-13 build error This change adds relevant errors that allow building with gcc-13. Change-Id: Ib97a90ef647a9cd9ec1bf1f2bde61daca85de427 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68497 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>	2023-03-04 02:08:41 +00:00
Gabriel Busnot	8a774e07b2	dev-amdgpu: Patch forgotten port after mem port owner deprecation Change-Id: I82f88b8962d9f04521e549ca1383c42f2b5b3ffc Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67631 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>	2023-02-07 13:29:55 +00:00
Gabriel Busnot	7f4c92c910	mem,arch-arm,mem-ruby,cpu: Remove use of deprecated base port owner Change-Id: I29214278c3dd4829c89a6f7c93214b8123912e74 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67452 Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>	2023-02-03 06:11:45 +00:00
Matt Sinclair	4e61a98336	mem-ruby: add GPU cache bypass I->I transition `66d4a158` added support for AMD's GPU cache bypassing flags (GLC for bypassing L1 caches, SLC for bypassing all caches). However, it did not add a transition for the situation where the cache line is currently I (Invalid). This commit adds this support, which resolves an assert failure in Pannotia workloads when this situation arises. Change-Id: I59a62ce70c01dd8b73aacb733fb3d1d0dab2624b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67201 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-01-08 20:24:11 +00:00
Matt Sinclair	1d467bed7f	mem-ruby: fix TCP spacing/spelling Change-Id: I3fd9009592c8716a3da19dcdccf68f16af6522ef Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67200 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-01-08 20:24:11 +00:00
Matt Sinclair	24e2ef0b78	mem-ruby, gpu-compute: fix TCP GLC cache bypassing `66d4a158` added support for AMD's GPU cache bypassing flags (GLC for bypassing L1 caches, SLC for bypassing all caches). However, for applications that use the GLC flag but intermix GLC- and non-GLC accesses to the same address, this previous commit has a bug. This bug manifests when the address is currently valid in the L1 (TCP). In this case, the previous commit chose to evict the line before letting the bypassing access proceed. However, to do this the previous commit was using the inv_invDone action as part of the process of evicting it. This action is only intended to be called when load acquires are being performed (i.e., when the entire L1 cache is being flash invalidated). Thus, calling inv_invDone for a GLC (or SLC) bypassing request caused an assert failure since the bypassing request was not performing a load acquire. This commit resolves this by changing the support in this case to simply invalidate the entry in the cache. Change-Id: Ibaa4976f8714ac93650020af1c0ce2b6732c95a2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67199 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-01-08 20:24:11 +00:00
Vishnu Ramadas	c23d7bb3ee	gpu-compute, mem-ruby: Add p_popRequestQueue to some transitions Two W->WI transitions, on events RdBlk and Atomic in the GPU L2 cache coherence protocol, do not clear the request from the request queue upon completing the transition. This action is not performed in the response path. This update adds the p_popRequestQueue action to each of these transitions to remove the stale request from the queue. Change-Id: Ia2679fe3dd702f4df2bc114f4607ba40c18d6ff1 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67192 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-01-05 23:41:00 +00:00
Vishnu Ramadas	ddf43726ef	gpu-compute, mem-ruby: Update GPU cache bypassing to use TBE An earlier commit added support for GLC and SLC AMDGPU instruction modifiers. These modifiers enable cache bypassing when set. The GLC/SLC flag information was being threaded through all the way to memory and back so that appropriate actions could be taken upon receiving a request and corresponding response. This commit removes the threading and adds the bypass flag information to TBE. Requests populate this entry and responses access it to determine the correct set of actions to execute. Change-Id: I20ffa6682d109270adb921de078cfd47fb4e137c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67191 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2023-01-05 23:38:32 +00:00
Vishnu Ramadas	66d4a15820	gpu-compute,mem-ruby: Add support for GPU cache bypassing The GPU cache models do not support cache bypassing when the GLC or SLC AMDGPU instruction modifiers are used in a load or store. This commit adds cache bypass support by introducing new transitions in the coherence protocol used by the GPU memory system. Now, instructions with the GLC bit set will not cache in the L1 and instructions with SLC bit set will not cache in L1 or L2. Change-Id: Id29a47b0fa7e16a21a7718949db802f85e9897c3 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66991 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2023-01-03 21:19:24 +00:00
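
Illustrative note on the "mem: Atomic ops to same address" commits (f6a453362f, 0e323bc409): the change-log idea they describe can be sketched roughly as below. This is a minimal Python model, not gem5 code. The names `DataBlockSketch`, `atomic_log`, and `coalesced_atomic_add` are hypothetical, and the sketch assumes, as the 2da54d5a4f message states, that an atomic returns the value the data held before the operation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DataBlockSketch:
    """Hypothetical stand-in for one word of a data block plus its change log."""
    value: int = 0
    atomic_log: List[int] = field(default_factory=list)

    def apply_atomic(self, op: Callable[[int], int]) -> None:
        # Log the value seen before this atomic, then update the word.
        self.atomic_log.append(self.value)
        self.value = op(self.value)


def coalesced_atomic_add(block: DataBlockSketch, operands: List[int]) -> List[int]:
    """Apply one atomic add per lane as a single coalesced request.

    Returns the per-lane pre-operation values, read back from the log.
    """
    block.atomic_log.clear()
    for operand in operands:
        block.apply_atomic(lambda old, inc=operand: old + inc)
    return list(block.atomic_log)


block = DataBlockSketch(value=10)
returned = coalesced_atomic_add(block, [1, 2, 3])
print(returned, block.value)  # [10, 11, 13] 16
```

Each lane receives its own return value from the log instead of only the final value of the block, which is the behavior the commit messages say was previously missing.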

1119 Commits