Remove the prefetch_on_access and prefetch_on_pf_hit params from
BaseCache. BasePrefetcher no longer expects these params to exist in
the parent. Configurations that set these parameters using the cache
object have been fixed.
Change-Id: I9ab6a545eaf930ee41ebda74e2b6b8bad0ca35a7
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
This patch decouples the prefetchers from the cache implementation
as a first step towards allowing the classic prefetchers to be used
with Ruby caches. Prefetchers that need to do cache lookups can do so
using the accessor object provided when the probes are notified. This
may also facilitate connecting the same prefetcher to multiple caches.
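As a hedged sketch of the idea (the accessor interface and all names
below are illustrative assumptions, not the exact gem5 API), the
prefetcher queries the cache through an accessor delivered with the
probe notification instead of holding a BaseCache pointer:

    // Illustrative accessor-based lookup; names are hypothetical.
    #include <cstdint>

    using Addr = uint64_t;

    struct CacheAccessor
    {
        // Implemented by whichever cache (classic or Ruby) owns the
        // probe point.
        virtual bool inCache(Addr addr, bool is_secure) const = 0;
        virtual bool inMissQueue(Addr addr, bool is_secure) const = 0;
        virtual ~CacheAccessor() = default;
    };

    struct Prefetcher
    {
        // The accessor arrives with each notification, so the same
        // prefetcher can be notified by, and look up into, more than
        // one cache.
        void
        notify(Addr addr, bool is_secure, const CacheAccessor &cache)
        {
            if (!cache.inCache(addr, is_secure) &&
                !cache.inMissQueue(addr, is_secure)) {
                // queue a prefetch for the predicted address
            }
        }
    };

    // Stub accessor so the sketch builds and runs standalone.
    struct NullAccessor : CacheAccessor
    {
        bool inCache(Addr, bool) const override { return false; }
        bool inMissQueue(Addr, bool) const override { return false; }
    };

    int main()
    {
        Prefetcher pf;
        NullAccessor cache;
        pf.notify(0x1000, false, cache);
    }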
Related JIRA:
https://gem5.atlassian.net/browse/GEM5-457
https://gem5.atlassian.net/browse/GEM5-1112
Change-Id: I4fee1a3613ae009fabf45d7b747e4582cad315ef
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
While it is plausible to define cache_line_size as a 32-bit unsigned
int, the variable is used well beyond its original scope:
cache_line_size is used to produce an address mask that masks out the
offset bits of an address, for example in [1], [2], [3], and [4].
However, since cache_line_size is an "unsigned int", the value is not
guaranteed to be 64 bits wide. Consequently, the bit-twiddling hacks
in [1], [2], [3], and [4] produce a 32-bit mask, i.e.,
0x00000000FFFFFFC0.
This behavior caused at least one problem, in the RISC-V LLSC
implementation [5], where the load reservation (LR) relies on the mask
to produce the cache block address. Two distinct 64-bit addresses can
be mapped to the same cache block using the above mask.
This patch explicitly defines cache_line_size as a 64-bit unsigned int
so the cache block mask is produced correctly for 64-bit addresses.
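A standalone demo of the width bug (illustrative values, not gem5
code):

    #include <cstdint>
    #include <cstdio>

    int main()
    {
        unsigned int line32 = 64;  // old: 32-bit cache_line_size
        uint64_t     line64 = 64;  // new: 64-bit cache_line_size

        uint64_t addr = 0x123456789ULL;

        // With the 32-bit operand, ~ produces a 32-bit mask that is
        // zero-extended, wiping the upper half of the address.
        uint64_t bad  = addr & ~(line32 - 1);  // mask 0x00000000FFFFFFC0
        uint64_t good = addr & ~(line64 - 1);  // mask 0xFFFFFFFFFFFFFFC0

        std::printf("bad:  0x%llx\n", (unsigned long long)bad);
        // prints 0x23456780 -- the high 32 bits are lost
        std::printf("good: 0x%llx\n", (unsigned long long)good);
        // prints 0x123456780
    }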
[1] 3bdcfd6f7a/src/cpu/simple/atomic.hh (L147)
[2] 3bdcfd6f7a/src/cpu/simple/timing.hh (L224)
[3] 3bdcfd6f7a/src/cpu/o3/lsq_unit.cc (L241)
[4] 3bdcfd6f7a/src/cpu/minor/lsq.cc (L1425)
[5] 3bdcfd6f7a/src/arch/riscv/isa.cc (L787)
Added checks to ensure that atomics are not performed in the TCC when
it is configured as a write-through cache. Also added an SLC bit
overwrite to ensure the directory performs atomics when there is a
write-through TCC.
Change-Id: I4514e6c8022aeb7785f2c59871cd9acec8161ed8
A previous commit added BUILD_GPU guards to the GPU coalescer models
since a related cache recorder commit added GPU support. This is no
longer needed since the cache recorder moved to using a vector of
RubyPorts instead of Sequencer/GPUCoalescer pointers. This commit
removes the BUILD_GPU guards from the Ruby coalescer models.
Change-Id: I23a7957d82524d6cd3483d22edfb35ac51796eca
Previously, the cache recorder used a vector of Sequencer pointers to
access Ruby objects. A recent commit updated the cache recorder to also
maintain a vector of GPUCoalescer pointers in order for GPUs to support
flushing. This added redundant code to the cache recorder. This commit
replaces the Sequencer and GPUCoalescer vectors with a single vector of
RubyPort pointers so that the code does not contain redundant lines.
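The shape of the refactor, as a hedged sketch (hypothetical types and
an illustrative issueFlush interface, not the actual gem5 classes):

    // Before: two parallel vectors and duplicated loops.
    //   std::vector<Sequencer *> seqs;
    //   std::vector<GPUCoalescer *> coalescers;
    // After: one vector of the common base type.
    #include <vector>

    struct RubyPort
    {
        virtual void issueFlush() = 0;  // illustrative interface
        virtual ~RubyPort() = default;
    };

    struct Sequencer : RubyPort { void issueFlush() override {} };
    struct GPUCoalescer : RubyPort { void issueFlush() override {} };

    void
    flushAll(const std::vector<RubyPort *> &ports)
    {
        // One loop handles CPU sequencers and GPU coalescers alike.
        for (auto *p : ports)
            p->issueFlush();
    }

    int main()
    {
        Sequencer s;
        GPUCoalescer c;
        std::vector<RubyPort *> ports{&s, &c};
        flushAll(ports);
    }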
Change-Id: Id5da33fb870f17bb9daef816cc43c0bcd70a8706
Earlier, GPU checkpointing worked only if the checkpoint was created
before the first kernel execution. This pull request adds support for
checkpointing between any two kernel calls. It does so by doing the
following:
- Adds flush support in the GPU_VIPER protocol
- Adds flush support in the GPUCoalescer
- Updates the cache recorder to use the GPUCoalescer during simulation
cooldown and cache warmup.
Ruby was recently updated to support flushes and warmup for GPUs. Since
this support uses the GPUCoalescer, non-GPU builds hit a compile-time
error, because GPU code is not built for non-GPU builds. This commit
adds "#if BUILD_GPU" guards around the GPU-related code in common files
like AbstractController.hh, CacheRecorder.*, RubySystem.cc,
GPUCoalescer.hh, and VIPERCoalescer.hh. This allows GPU builds to use
flushing while non-GPU builds compile without problems.
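A hedged sketch of the guard pattern (illustrative members, not the
exact hunks; BUILD_GPU is assumed to be 1 only in GPU builds):

    // Stand-in so this sketch builds standalone; gem5 defines the
    // flag per build target.
    #ifndef BUILD_GPU
    #define BUILD_GPU 0
    #endif

    #if BUILD_GPU
    class GPUCoalescer;  // GPU-only forward declaration
    #endif

    class CacheRecorder
    {
      public:
    #if BUILD_GPU
        // GPU-only member, compiled out of non-GPU builds.
        void setCoalescer(GPUCoalescer *c) { m_coalescer = c; }

      private:
        GPUCoalescer *m_coalescer = nullptr;
    #endif
    };

    int main()
    {
        CacheRecorder recorder;
        (void)recorder;  // builds with or without BUILD_GPU
    }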
Change-Id: If8ee4ff881fe154553289e8c00881ee1b6e3f113
Introduce far atomic operations in the CHI protocol.
Three configuration parameters are used to tune this behavior:
- policy_type: sets the atomic policy to one of those described in our
paper
- atomic_op_latency: simulates the AMO ALU operation latency
- comp_anr: configures the Atomic No return transaction to split
CompDBIDResp into two separate messages, DBIDResp and Comp
Change-Id: I087afad9ad9fcb9df42d72893c9e32ad5a5eb478
Previously, the cache recorder used the Sequencer to issue flush
requests and cache warmup requests. The GPU, however, uses the
GPUCoalescer to access the cache, not the Sequencer. This commit adds a
GPUCoalescer map to the cache recorder and uses it to send flush and
cache warmup requests to any GPU caches in the system.
Change-Id: I10490cf5e561c8559a98d4eb0550c62eefe769c9
This commit adds flush support to the GPU VIPER coherence protocol. The
L1 cache will now initiate a flush request if the packet it receives is
of type RubyRequestType_FLUSH. During the flush process, the L1 cache
will send a request to the L2 if it is in either the V or I state. The
L2 will issue a flush request to the directory if its cache line is in
the valid state before invalidating its copy. The directory, on
receiving this request, writes data to memory and sends an ack back to
the L2. The L2 forwards this ack back to the L1, which then ends the
flush by calling the write callback.
Change-Id: I9dfc0c7b71a1e9f6d5e9e6ed4977c1e6a3b5ba46
The GPU Coalescer does not contain cache cooldown and warmup support.
This commit updates the coalescer to support cache cooldown during
flushes and warmup during checkpoint restore.
Change-Id: I5459471dec20ff304fd5954af1079a7486ee860a
There are two overloaded-virtual issues reported by g++-13.
1. The copy assignment and move assignment overloads are hidden in the
derived class.
[ CXX] src/mem/cache/replacement_policies/weighted_lru_rp.cc -> ALL/mem/cache/replacement_policies/weighted_lru_rp.o
In file included from src/mem/cache/base.hh:61,
from src/mem/cache/base.cc:46:
src/mem/cache/cache_blk.hh:172:5: error: ‘virtual gem5::CacheBlk& gem5::CacheBlk::operator=(gem5::CacheBlk&&)’ was hidden [-Werror=overloaded-virtual=]
172 | operator=(CacheBlk&& other)
| ^~~~~~~~
src/mem/cache/cache_blk.hh:518:19: note: by ‘gem5::TempCacheBlk& gem5::TempCacheBlk::operator=(const gem5::TempCacheBlk&)’
518 | TempCacheBlk& operator=(const TempCacheBlk&) = delete;
| ^~~~~~~~
In this case, we can explicitly pull in the parent's operator= with a
using-declaration to keep the function overloads visible.
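A minimal reproduction of the hidden overload and the fix
(illustrative classes, not the actual CacheBlk/TempCacheBlk):

    struct Blk
    {
        virtual Blk &operator=(Blk &&other) { return *this; }
        virtual ~Blk() = default;
    };

    struct TempBlk : Blk
    {
        // Without this using-declaration, the deleted copy assignment
        // below hides Blk::operator=(Blk&&) and g++-13 emits
        // -Woverloaded-virtual.
        using Blk::operator=;
        TempBlk &operator=(const TempBlk &) = delete;
    };

    int main()
    {
        TempBlk a, b;
        a = static_cast<Blk &&>(b);  // reachable thanks to `using`
    }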
2. An intended overload hidden in SystemC is reported as an error.
In file included from src/systemc/ext/tlm_utils/simple_initiator_socket.h:24,
from src/systemc/tlm_bridge/gem5_to_tlm.hh:72,
from build/ALL/python/_m5/param_Gem5ToTlmBridge256.cc:17:
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh: In instantiation of ‘class tlm::tlm_base_initiator_socket<256, tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1, sc_core::SC_ONE_OR_MORE_BOUND>’:
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:185:7: required from ‘class tlm::tlm_initiator_socket<256, tlm::tlm_base_protocol_types, 1, sc_core::SC_ONE_OR_MORE_BOUND>’
src/systemc/ext/tlm_utils/simple_initiator_socket.h:37:7: required from ‘class tlm_utils::simple_initiator_socket_b<sc_gem5::Gem5ToTlmBridge<256>, 256, tlm::tlm_base_protocol_types, sc_core::SC_ONE_OR_MORE_BOUND>’
src/systemc/ext/tlm_utils/simple_initiator_socket.h:156:7: required from ‘class tlm_utils::simple_initiator_socket<sc_gem5::Gem5ToTlmBridge<256>, 256, tlm::tlm_base_protocol_types>’
src/systemc/tlm_bridge/gem5_to_tlm.hh:147:46: required from ‘class sc_gem5::Gem5ToTlmBridge<256>’
/usr/include/c++/13/type_traits:1411:38: required from ‘struct std::is_base_of<sc_gem5::Gem5ToTlmBridgeBase, sc_gem5::Gem5ToTlmBridge<256> >’
ext/pybind11/include/pybind11/detail/../detail/common.h:880:59: required from ‘struct pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >::is_valid_class_option<sc_gem5::Gem5ToTlmBridgeBase>’
ext/pybind11/include/pybind11/detail/../detail/common.h:719:35: required by substitution of ‘template<class ... Ts> using pybind11::detail::all_of = pybind11::detail::bool_constant<(Ts::value && ...)> [with Ts = {pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >::is_valid_class_option<sc_gem5::Gem5ToTlmBridgeBase>, pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >::is_valid_class_option<std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >}]’
ext/pybind11/include/pybind11/pybind11.h:1506:70: required from ‘class pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >’
build/ALL/python/_m5/param_Gem5ToTlmBridge256.cc:34:179: required from here
src/systemc/ext/tlm_utils/../core/sc_port.hh:125:18: error: ‘void sc_core::sc_port_b<IF>::bind(sc_core::sc_port_b<IF>&) [with IF = tlm::tlm_fw_transport_if<>]’ was hidden [-Werror=overloaded-virtual=]
125 | virtual void bind(sc_port_b<IF> &p) { sc_port_base::bind(p); }
| ^~~~
In file included from src/systemc/ext/tlm_utils/simple_initiator_socket.h:27:
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:133:18: note: by ‘tlm::tlm_base_initiator_socket<256, tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1, sc_core::SC_ONE_OR_MORE_BOUND>::bind’
133 | virtual void bind(bw_interface_type &ifs) { (get_base_export())(ifs); }
| ^~~~
src/systemc/ext/tlm_utils/../core/sc_port.hh:124:18: error: ‘void sc_core::sc_port_b<IF>::bind(IF&) [with IF = tlm::tlm_fw_transport_if<>]’ was hidden [-Werror=overloaded-virtual=]
124 | virtual void bind(IF &i) { sc_port_base::bind(i); }
| ^~~~
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:133:18: note: by ‘tlm::tlm_base_initiator_socket<256, tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1, sc_core::SC_ONE_OR_MORE_BOUND>::bind’
133 | virtual void bind(bw_interface_type &ifs) { (get_base_export())(ifs); }
| ^~~~
Per the code comment, this hiding is intentional in the SystemC
headers:
// The overloaded virtual is intended in SystemC, so we'll disable the warning.
// Please check section 9.3 of SystemC 2.3.1 release note for more details.
The issue is that the warning suppression should be moved to the base
class.
Change-Id: I6683919e594ffe1fb3b87ccca1602bffdb788e7d
With this PR our CHI implementation starts making use of the txnId and
DBID identifiers.
Note: we were already making use of the txnId for DVM messages to convey
the DVM address. This is still the case.
In the future we should realign the DVM logic so that the txnId is
solely used as a transaction identifier.
When there is a race between a FwdGetX and a PUTX on the owner, the
owner hands off ownership to the GetX requestor and the PUTX still
goes through. But since the owner has changed, the state should go
back to M and the PUTX is essentially trashed.
An Unblock to the Directory in this case will hit an undefined
transition. I have added transitions which indicate that when an
Unblock is served to the Directory, some kind of ownership transfer
has happened while a PUTX/PUTO was in progress.
Change-Id: I37439b5a363417096030a0875a51c605bd34c127
After calling m5_dump_reset_stats(0,0) in a test program,
some statistics like
l1_controllers.L1Dcache.m_demand_hits,
l1_controllers.L1Dcache.m_demand_misses,
l1_controllers.L1Dcache.m_demand_accesses
were not getting reset in the newer stat dumps.
This one-line patch fixes that. The change was tested by calling two
m5_dump_reset_stats(0,0) in a row for a system with 1 core, on both SE
and FS.
Credits to Gabriel Busnot for finding the fix.
Change-Id: I19d75996fa53d31ef20f7b206024fd38dbeac643
This change, https://github.com/gem5/gem5/pull/205, mistakenly
allocates a write buffer entry for the clflush instruction when there
is a cache miss. However, clflush in gem5 is not a write instruction.
Thus, the cache should allocate a miss buffer entry in this case.
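A hedged sketch of the allocation decision (stand-in types, not the
actual BaseCache logic):

    #include <cassert>

    enum class MissQueueKind { MissBuffer, WriteBuffer };

    struct Pkt
    {
        bool isWrite;
        bool isCleanInvalidate;  // e.g. produced by x86 clflush
    };

    MissQueueKind
    chooseQueueOnMiss(const Pkt &pkt)
    {
        // clflush is a clean-invalidate, not a write, so on a miss it
        // must allocate in the miss buffer, not the write buffer.
        if (pkt.isWrite && !pkt.isCleanInvalidate)
            return MissQueueKind::WriteBuffer;
        return MissQueueKind::MissBuffer;
    }

    int main()
    {
        Pkt clflush{/*isWrite=*/false, /*isCleanInvalidate=*/true};
        assert(chooseQueueOnMiss(clflush) == MissQueueKind::MissBuffer);
    }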
Change-Id: I9c1c9b841159c4420567e9c929e71e4aa27d5c28
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
This will hold the CHI Data Buffer Identifier (DBID) field.
The DBID allows the Completer of a transaction to provide its own
identifier for the transaction.
This new ID will be used as the TxnId field by a following
WriteData/CompData/CompAck response.
For now we only set it to the original txnId (identity mapping).
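A sketch of the identity mapping in plain C++ (hypothetical helper and
message type, not the actual SLICC code):

    #include <cstdint>

    struct DBIDRespMsg
    {
        uint16_t txnId;  // architecturally a 12-bit field
        uint16_t dbid;   // Data Buffer ID chosen by the Completer
    };

    DBIDRespMsg
    makeDBIDResp(uint16_t req_txn_id)
    {
        DBIDRespMsg resp;
        resp.txnId = req_txn_id;
        // For now the Completer just echoes the requester's TxnID.
        resp.dbid = req_txn_id;
        return resp;
    }

    int main()
    {
        DBIDRespMsg m = makeDBIDResp(0x2a);
        return m.dbid == m.txnId ? 0 : 1;
    }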
Change-Id: If30c5e1cafbe5a30073c7cd01d60bf41eb586cee
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The TxnId field of a CHI request has so far been unused (other than for
DVM transactions). With this patch we always initialize the field when
we extract a ruby request from the sequencer port.
According to the spec (IHI0050F):
A 12-bit field is defined for the TxnID with the number of outstanding
transactions being limited to 1024. A Requester is permitted to reuse a
TxnID value after it has received either:
* All responses associated with a previous transaction that have used
the same value.
* A RetryAck response for a previous transaction that used the same value
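As a sketch of that reuse rule (an assumption-laden illustration, not
gem5 code), a requester-side TxnID pool could look like:

    #include <cassert>
    #include <cstdint>
    #include <queue>

    class TxnIdPool
    {
      public:
        explicit TxnIdPool(uint16_t num_ids = 1024)
        {
            for (uint16_t i = 0; i < num_ids; ++i)
                free_.push(i);
        }

        uint16_t
        alloc()
        {
            // Outstanding transactions are limited by the pool size.
            assert(!free_.empty());
            uint16_t id = free_.front();
            free_.pop();
            return id;
        }

        // Call once all responses (or a RetryAck) for `id` were seen.
        void release(uint16_t id) { free_.push(id); }

      private:
        std::queue<uint16_t> free_;
    };

    int main()
    {
        TxnIdPool pool;
        uint16_t id = pool.alloc();
        pool.release(id);  // id may now be reused
    }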
Change-Id: Ie48f0fee99966339799ac50932d36b2a927b1c7d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Based on the CHIRequestType, this automatically tells whether the
request originated from the sequencer (CPU load/fetch/store).
Change-Id: I50fd116c8b1a995b1c37e948cd96db60c027fe66
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
At the moment an address value can only be used in SLICC code to do
TBE lookups; there is no way to add/subtract/divide/multiply two
addresses, nor an address and an integer value.
This hinders the development of protocol-specific code and forces
developers to place such code in shared C++ structures.
Change-Id: Ia184e793b6cd38f951f475a7cdf284f529972ccb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
At the moment it is possible to static_cast by pointer/reference only:
static_cast(type, "pointer", val) -> static_cast<type*>(val);
static_cast(type, "reference", val) -> static_cast<type&>(val);
With this patch it will also be possible to do something like
static_cast(type, "value", val) -> static_cast<type>(val);
which is important when wishing to convert integer types into custom
ones and vice versa.
This patch also defers the static_cast type check to C++.
At the moment it is difficult to use the static_cast utility in SLICC
as it tries to handle type checking in the language itself. This would
force us to explicitly define compatible types (like an Addr and an
int, for example). Rather than pushing the burden onto ourselves, we
should always allow a developer to use a static_cast in SLICC and let
the C++ compiler complain if the generated code is not compatible.
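What the new "value" form boils down to in the generated C++
(hypothetical types, for illustration only):

    #include <cstdint>

    // A custom integer-like type, as SLICC protocols often define.
    enum class MachineID_t : uint32_t { L1Cache, L2Cache, Directory };

    int main()
    {
        uint32_t raw = 2;
        // "value" cast: integer -> custom type...
        MachineID_t m = static_cast<MachineID_t>(raw);
        // ...and vice versa. Neither direction works as a
        // pointer/reference cast.
        uint32_t back = static_cast<uint32_t>(m);
        return back == raw ? 0 : 1;
    }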
Change-Id: I0586b9224b1e41751a07d15e2d48a435061c2582
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Currently the MOESI_AMD_Base-directory transition for system-level
atomics sends the response message before the atomic is performed. This
was likely done because atomics are supposed to return the value of the
data *before* the atomic is performed, and simply ordering the actions
this way took care of that.
With the new atomic log feature, the atomic values are pulled from the
log by the coalescer on the return path. Therefore, these actions can
be reordered. However, it is now necessary that the atomics be
performed before sending the response so that the log is populated and
copied by the response action. This should fix #253.
Change-Id: Ie7e178f93990975367de2cc3e89e5ef9c9069241
Marks which events signal the beginning of incoming and outgoing
transactions for generating inTransLatHist and outTransLatHist stats.
Change-Id: I90594a27fa01ef9cfface309971354b281308d22
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Generating these stats for all defined Events may generate too many
stats that are never used. Though these do not appear in the stats.txt
file, they unnecessarily increase simulation startup time and memory
consumption.
This patch limits those stats to events with the "in_trans" and/or
"out_trans" properties. SLICC compiler then checks which combinations
of event+state are possible when generating the stats.
The possible level of detail for inTransLatHist was also reduced.
Only the number of transactions for each event+initial+final state
combination is now accounted for. Latency histograms are only defined
per event type (similarly to outTransLatHist). This significantly
reduces the final file size for generated stats.
Change-Id: I29aaeb771436cc3f0ce7547a223d58e71d9cedcc
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Do not respond with SnpRespData_I when the line is still present
upstream.
Change-Id: I2592e5c6637cfc0e83042169a245837648276e61
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
DCT must be disabled when handling a ReadUnique where the copy needs
to be upgraded.
Previously we were just asserting, as it was assumed DCT is only
enabled for HNFs (which can "auto-upgrade"). However, DCT may also be
enabled for intermediate levels of distributed shared caches above the
HNFs.
Change-Id: I9e29142a8d2f59ea61c1d90cda6b00c19435d6b7
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
When an Evict request is received from upstream for a shared line and
the line is no longer cached locally (or in any other upstream cache),
we need to also send an Evict downstream. In this case we need to wait
until our outgoing Evict completes before completing the Evict from
upstream, in order to be able to resolve race conditions with incoming
snoops. E.g.: while our outgoing Evict is pending we may receive a
snoop requesting data, but we won't be able to complete this snoop if
we have already completed all upstream Evicts and no longer have the
line.
Change-Id: I23ac4f0a9c4ddd81e2425376c8d1e1c7fb66d107
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Augmenting the DataBlock class with a change log structure to
record the effects of atomic operations on a data block and
service these changes if the atomic operations require return
values.
Although the operations are atomic, the coalescer need not
send unique memory requests for each operation. Atomic
operations within a wavefront to the same address are now
coalesced into a single memory request. The response of this
request carries all the necessary information to provide the
requesting lanes unique values as a result of their individual
atomic operations. This helps reduce contention for request
and response queues in simulation.
Previously, only the final value of the datablock after all atomic
ops to the same address was visible to the requesting waves. This
change corrects that behavior by allowing each wave to see the effect
of its individual atomic op if a return value is necessary.
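An illustrative sketch of the change-log idea (hypothetical class, not
gem5's DataBlock):

    #include <cstdint>
    #include <functional>
    #include <vector>

    class LoggedBlock
    {
      public:
        using AtomicOp = std::function<uint32_t(uint32_t)>;

        explicit LoggedBlock(uint32_t v) : value_(v) {}

        // Apply one lane's atomic op, logging the pre-op value that
        // the lane should receive as its return value.
        void
        applyAtomic(const AtomicOp &op)
        {
            log_.push_back(value_);
            value_ = op(value_);
        }

        // Return value owed to the i-th coalesced lane.
        uint32_t returnValueFor(size_t lane) const { return log_[lane]; }

      private:
        uint32_t value_;
        std::vector<uint32_t> log_;
    };

    int main()
    {
        LoggedBlock blk(10);
        // Three lanes of one wavefront each fetch-add 1 to the same
        // address, coalesced into a single request.
        for (int i = 0; i < 3; ++i)
            blk.applyAtomic([](uint32_t v) { return v + 1; });
        // The lanes see 10, 11, 12 instead of all seeing the final 13.
        return blk.returnValueFor(2) == 12 ? 0 : 1;
    }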
Change-Id: I639bea943afd317e45f8fa3bff7689f6b8df9395
When the Linux kernel changes a page property, it flushes the related
cache lines. The kernel might change the page property before flushing
the cache lines, so the clflush might occur in an uncacheable region.
Currently, an uncacheable request must be a read or a write; however,
a clflush request is neither.
This change aims to allow clflush requests to work on uncacheable
regions. Since there is no straightforward way to check whether a
packet came from a clflush instruction, this change permits all Clean
Invalidate requests, which is the type of request produced by clflush,
to work on uncacheable regions.
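A hedged sketch of the relaxed check (stand-in types, not the actual
gem5 code):

    struct Pkt
    {
        bool isRead;
        bool isWrite;
        bool isCleanInvalidate;  // request type produced by clflush
    };

    bool
    allowedUncacheable(const Pkt &pkt)
    {
        // Previously only reads and writes were accepted; clean
        // invalidates are now permitted as well.
        return pkt.isRead || pkt.isWrite || pkt.isCleanInvalidate;
    }

    int main()
    {
        Pkt clflush{false, false, true};
        return allowedUncacheable(clflush) ? 0 : 1;
    }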
Change-Id: Ib3ec01d9281d3dfe565a0ced773ed912edb32b8f
Signed-off-by: Hoa Nguyen <hn@hnpl.org>