derek/gem5

Author	SHA1	Message	Date
Jason Lowe-Power	b3e7af9d79	Support for classic prefetchers in Ruby (#502 ) This patch adds support for using the "classic" prefetchers with ruby cache controllers. This pull request includes a few commits making the changes in this order: - Refactor decouples the classic cache and prefetchers interfaces - Extra probes for later integration with ruby - General ruby-side support - Adds support for the CHI protocol Commit [mem-ruby: support prefetcher in CHI protocol](`2bdb65653b`) may be used as an example of how to add support for other protocols. JIRA issues that may be related to this pull request: https://gem5.atlassian.net/browse/GEM5-457 https://gem5.atlassian.net/browse/GEM5-1112	2023-11-30 10:24:29 -08:00
Bobby R. Bruce	d11c40dcac	misc: Run `pre-commit run --all-files` This ensures `isort` is applied to all files in the repo. Change-Id: Ib7ced1c924ef1639542bf0d1a01c5737f6ba43e9	2023-11-29 22:06:41 -08:00
Tiago Mück	91cf58871e	mem-ruby: support prefetcher in CHI protocol Use RubyPrefetcherProxy to support prefetchers in the CHI cache controller. L1I/L1D/L2 prefetchers can now be added by specifying a non-null prefetcher type when configuring a CHI_RNF. Related JIRA: https://gem5.atlassian.net/browse/GEM5-457 https://gem5.atlassian.net/browse/GEM5-1112 Additional authors: Tuan Ta <tuan.ta2@arm.com> Change-Id: I41dc637969acaab058b22a8c9c3931fa137eeace Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-11-28 18:30:50 -06:00
Tiago Mück	becba00d95	mem-cache,configs: remove extra prefetch_* params Remove the prefetch_on_access and prefetch_on_pf_hit from BaseCache. BasePrefetch no longer expects these params to exist in the parent. Configurations that set these parameters using the cache object were fixed. Change-Id: I9ab6a545eaf930ee41ebda74e2b6b8bad0ca35a7 Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-11-28 18:30:49 -06:00
Derek Christ	e95cab429f	configs,ext,stdlib: Update DRAMSys integration (#525 ) Recent breaking changes in the DRAMSys API require user code to be updated. These updates have been applied to the gem5 integration. Furthermore, as DRAMSys started to use CMake dependency management, it is no longer sensible to maintain two separate build systems for DRAMSys. The use of the DRAMSys integration in gem5 will therefore from now on require that CMake is installed on the target machine. Additionally, support for snapshots has been implemented in DRAMSys and coupled with gem5's checkpointing API.	2023-11-14 08:05:11 -08:00
Daniel Kouchekinia	be5c03ea9f	mem-ruby,configs: Add GPU GLC Atomic Resource Constraints (#120 ) Added a resource constraint, AtomicALUOperation, to GLC atomics performed in the TCC. The resource constraint uses a new class, ALUFreeList array. The class assumes the following: - There are a fixed number of atomic ALU pipelines - While a new cache line can be processed in each pipeline each cycle, if a cache line is currently going through a pipeline, it can't be processed again until it's finished Two configuration parameters have been used to tune this behavior: - tcc-num-atomic-alus corresponds to the number of atomic ALU pipelines - atomic-alu-latency corresponds to the latency of atomic ALU pipelines Change-Id: I25bdde7dafc3877590bb6536efdf57b8c540a939	2023-11-14 07:48:48 -08:00
Kaustav Goswami	2c229aa2ff	configs,ext: gem5 SST bridge calls m5.instantiate() in gem5 This change updates the gem5 SST bridge to call m5.instantiate() in the gem5 config script instead of in the SST component. This allows more flexibility for the gem5-SST setup, as we can now write traffic generators using the bridge. Change-Id: I510a8c15f8fb00bdbdd60dafa2d9f5ad011e48f2 Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>	2023-11-06 11:54:35 -08:00
Bobby R. Bruce	531067fffa	mem,tests: Set Ruby Mem Test atomic percent to 0 (#489 ) Fixes https://github.com/gem5/gem5/issues/450 (https://github.com/gem5/gem5/pull/477 fixes non-ruby memtests, so only a partial fix).	2023-10-19 15:38:38 -07:00
Andreas Sandberg	42d1c8b3c3	cpu: Restructure RAS (#428 )	2023-10-17 19:14:13 +01:00
David Schall	5387e67114	cpu: Restructure RAS The return address stack (RAS) is restructured to be a separate SimObject. This enables disabling the RAS and better separation of the functionality. Furthermore, easier statistics and debugging. Change-Id: I8aacf7d4c8e308165d0e7e15bc5a5d0df77f8192 Signed-off-by: David Schall <david.schall@ed.ac.uk>	2023-10-17 15:30:56 +00:00
Matthew Poremba	ca2592d3ba	configs: Fix missing param exchange for GPUFS (#457 ) PR #367 adds an option to configs/ruby/GPU_VIPER.py that was not added to the corresponding dGPU equal for GPUFS and thus all GPUFS runs are failing. Fixed in this patch.	2023-10-14 20:07:39 -07:00
Daniel Kouchekinia	4931fb0010	mem-ruby: Always pass on GPU atomics to dir in write-through TCC (#367 ) Added checks to ensure that atomics are not performed in the TCC when it is configured as a write-through cache. Also added SLC bit overwrite to ensure directory performs atomics when there is a write-through TCC. Change-Id: I4514e6c8022aeb7785f2c59871cd9acec8161ed8	2023-10-14 06:39:50 -07:00
Matthew Poremba	f7ad8fe435	configs: GPUFS option to disable KVM perf counters (#433 ) Add a --no-kvm-perf option to disable KVM perf counters for GPUFS scripts. This is useful for users who have KVM enabled but configured with more restrictive settings, which seems to be the default in newer Linux distros. Change-Id: I7508113d0f7c74deb21ea7b2770522885a0ec822	2023-10-11 14:20:27 -07:00
Bobby R. Bruce	c855dbf7c5	configs,ext: Updated the gem5 SST Bridge to use SST 13.0.0 (#396 ) This change updates the gem5 SST Bridge to use SST 13.0.0. Changes are made to replace SimpleMem class to StandardMem class as SimpleMem will be deprecated in SST 14 and above. In addition, the translator.hh is updated to translate more types of gem5 packets. A new parameter `ports` was added on SST's side when invoking the gem5 component which does not require recompiling the gem5 component whenever a new outgoing bridge is added in a gem5 config.	2023-10-11 13:34:48 -07:00
Bobby R. Bruce	298119e402	misc,python: Run `pre-commit run --all-files` Applies the `pyupgrade` hook to all files in the repo. Change-Id: I9879c634a65c5fcaa9567c63bc5977ff97d5d3bf	2023-10-10 21:47:07 -07:00
Bobby R. Bruce	3f5d7d647a	misc: Run `pre-commit autoupdate` (#419 ) 1. Runs `pre-commit autoupdate`. 2. Runs `pre-commit run --all-files`. 3. Adds (2.) to ".git-blame-ignore-rev".	2023-10-10 21:41:33 -07:00
Kaustav Goswami	937b829e8f	configs,ext: Updated the gem5 SST Bridge to use SST 13.0.0 This change updates the gem5 SST Bridge to use SST 13.0.0. Changes are made to replace SimpleMem class to StandardMem class as SimpleMem will be deprecated in SST 14 and above. In addition, the translator.hh is updated to translate more types of gem5 packets. A new parameter `ports` was added on SST's side when invoking the gem5 component which does not require recompiling the gem5 component whenever a new outgoing bridge is added in a gem5 config. Change-Id: I45f0013bc35d088df0aa5a71951422cabab4d7f7 Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>	2023-10-10 14:16:29 -07:00
Bobby R. Bruce	ddf6cb88e4	misc: Run `pre-commit run --all-files` This is to reflect the updates made to black when running `pre-commit autoupdate`. Change-Id: Ifb7fea117f354c7f02f26926a5afdf7d67bc5919	2023-10-10 14:01:58 -07:00
ivanaamit	486763b671	learning-gem5: use f-string for print Change-Id: If27af6524af4e4a6a59e914e9e40ba10de24adf4	2023-10-10 13:54:07 -07:00
Matt Sinclair	ec633b3d68	dev-amdgpu,mem-ruby: Add support to checkpoint and restore between kernels in GPUFS (#377 ) Earlier, GPU checkpointing was working only if a checkpoint was created before the first kernel execution. This pull request adds support to checkpoint in-between any two kernel calls. It does so by doing the following. - Adds flush support in the GPU_VIPER protocol - Adds flush support in the GPUCoalescer - Updates cache recorder to use the GPUCoalescer during simulation cooldown and cache warmup times.	2023-10-10 09:41:21 -05:00
Bobby R. Bruce	486916b5d4	configs,tests: Remove `mkdir` in simpoint-se-checkpoint.py (#425 ) This `mkdir` is problematic as it doesn't create the directory recursively. This causes errors if `dir` is `X/Y/Z` and both `Y` and `Z` have not been created. An error will be returned (`No such file or directory`). This issue was fixed with: https://github.com/gem5/gem5/pull/263. The checkpointing code already recursively creates directories as needed. Ergo we can remove this `mkdir` statement.	2023-10-09 22:34:19 -07:00
Bobby R. Bruce	21c5d77000	configs: Add an example elastic trace generation script (#415 ) Current [TraceCPU documentation](https://www.gem5.org/documentation/general_docs/cpu_models/TraceCPU) still references the deprecated se.py/fs.py scripts for elastic trace generation (script paths are also outdated). With this PR we provide a simpler Arm based elastic trace generation script that can be used out of the box by a user or that can be extended as needed.	2023-10-09 14:11:33 -07:00
Bobby R. Bruce	1fe0056d3b	configs,tests: Remove `mkdir` in simpoint-se-checkpoint.py This `mkdir` is problematic as it doesn't create the directory recursively. This causes errors if `dir` is `X/Y/Z` and both `Y` and `Z` have not been created. An error will be returned (`No such file or directory`). This issue was fixed with: https://github.com/gem5/gem5/pull/263. The checkpointing code already recursively creates directories as needed. Ergo we can remove this `mkdir` statement. Change-Id: Ibae38267c8ee1eba76d7834367aa1c54013365bc	2023-10-09 14:00:21 -07:00
Giacomo Travaglini	4c4615523f	configs: Add an example elastic-trace-generating script The new script will automatically use the newly defined O3_ARM_v7a_3_Etrace CPU to run a simple SE simulation while generating elastic trace files. The script is based on starter_se.py, but contains the following limitations: 1) No L2 cache as it might affect computational delay calculations 2) Supporting SimpleMemory only with minimal memory latency These restrictions were imported by the existing elastic trace generation logic in the common library (collected by grepping elastic_trace_en) [1][2][3] Example usage: build/ARM/gem5.opt configs/example/arm/etrace_se.py \ --inst-trace-file [INSTRUCTION TRACE] \ --data-trace-file [DATA TRACE] \ [WORKLOAD] [1]: https://github.com/gem5/gem5/blob/stable/\ configs/common/MemConfig.py#L191 [2]: https://github.com/gem5/gem5/blob/stable/\ configs/common/MemConfig.py#L232 [3]: https://github.com/gem5/gem5/blob/stable/\ configs/common/CacheConfig.py#L130 Change-Id: I021fc84fa101113c5c2f0737d50a930bb4750f76 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com>	2023-10-09 16:45:00 +01:00
Giacomo Travaglini	1a5dee0f0f	configs: Add an elastic-trace-generating CPU According to the original paper [1] the elastic trace generation process requires a cpu with a big number of entries in the ROB, LQ and SQ, so that there are no stalls due to resource limitation. At the moment these numbers are copy pasted from the CpuConfig.config_etrace method [2]. [1]: https://ieeexplore.ieee.org/document/7818336 [2]: https://github.com/gem5/gem5/blob/stable/\ configs/common/CpuConfig.py#L40 Change-Id: I00fde49e5420e420a4eddb7b49de4b74360348c9 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com>	2023-10-09 16:45:00 +01:00
Giacomo Travaglini	e35e2966c0	configs: Use devices.SimpleSeSystem in starter_se.py Change-Id: I742e280e7a2a4047ac4bb3d783a28ee97f461480 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com>	2023-10-09 16:45:00 +01:00
Giacomo Travaglini	7395b94c40	configs: Add a SimpleSeSystem class to devices.py Change-Id: I9d120fbaf0c61c5a053163ec1e5f4f93c583df52 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com>	2023-10-09 16:45:00 +01:00
Giacomo Travaglini	3b8c974456	configs: Refactor BaseSimpleSystem in devices.py We define a new parent (ClusterSystem) to model a system with one or more cpu clusters within it. The idea is to make this new base class reusable by SE systems/scripts as well (like starter_se.py) Change-Id: I1398d773813db565f6ad5ce62cb4c022cb12a55a Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com>	2023-10-09 16:45:00 +01:00
David Schall	edf9092fee	cpu: Restructure BTB - A new abstract BTB class is created to enable different BTB implementations. The new BTB class gets its own parameter and stats. - An enum is added to differentiate branch instruction types. This enum is used to enhance statistics and BPU management. - The existing BTB is moved into `simple_btb` as default. - An additional function is added to store the static instruction in the BTB. This function is used for the decoupled front-end. - Update configs to match new BTB parameters. Change-Id: I99b29a19a1b57e59ea2b188ed7d62a8b79426529 Signed-off-by: David Schall <david.schall@ed.ac.uk>	2023-10-09 14:37:47 +00:00
Giacomo Travaglini	ae104cc431	mem-ruby: Add new feature far atomics in CHI (#177 ) Added a new feature to CHI protocol (in collaboration with @tiagormk). Here is the Jira Ticket [https://gem5.atlassian.net/browse/GEM5-1326](https://gem5.atlassian.net/browse/GEM5-1326 ). As described in CHI specs, far atomic transactions enable remote execution of Atomic Memory Operations. This pull request incorporates several changes: * Fix Arm ISA definition of Swap instructions. These instructions should return an operand, so their ISA definition should be Return Operation. * Enable AMOs in Ruby Mem Test to verify that AMOs work * Enable near and far AMO in the Cache Controller of CHI Three configuration parameters have been used to tune this behavior: * policy_type: sets the atomic policy to one of those described in [our paper](https://dl.acm.org/doi/10.1145/3579371.3589065) * atomic_op_latency: simulates the AMO ALU operation latency * comp_anr: configures the Atomic No return transaction to split CompDBIDResp into two different messages DBIDResp and Comp	2023-10-06 10:09:58 +01:00
Matt Sinclair	85340973bf	configs: Add configurable GPU L1,L2 num banks and L2 latencies (#389 ) Previously, the L1, L2 number of banks and L2 latencies were not configurable through command line arguments. This commit adds support to configure them through the arguments '--tcp-num-banks' for number of banks in L1, '--tcc-num-banks' for number of banks in L2, and '--tcc-tag-access-latency', and '--tcc-data-access-latency' Change-Id: Ie3b713ead16865fd7120e2d809ebfa56b69bc4a1	2023-10-05 15:54:24 -05:00
Víctor Soria	6411b2255c	mem-ruby,configs: Add CHI far atomics support Introduce far atomic operations in CHI protocol. Three configuration parameters have been used to tune this behavior: policy_type: sets the atomic policy to one of those described in our paper atomic_op_latency: simulates the AMO ALU operation latency comp_anr: configures the Atomic No return transaction to split CompDBIDResp into two different messages DBIDResp and Comp Change-Id: I087afad9ad9fcb9df42d72893c9e32ad5a5eb478	2023-10-04 19:19:08 +02:00
Víctor Soria	4fd9d66c53	tests,mem-ruby: Enhance ruby false sharing test with Atomics New ruby mem test includes a percentage of AMOs that will be executed randomly in ruby mem test Change-Id: Ie95ed78e59ea773ce6b59060eaece3701fe4478c	2023-10-04 19:11:01 +02:00
Vishnu Ramadas	d3637a489d	configs: Add option to disable AVX in GPUFS GPUFS+KVM simulations automatically enable AVX. This commit adds a command line option to disable AVX if it's not needed for a GPUFS simulation. Change-Id: Ic22592767dbdca86f3718eca9c837a8e29b6b781	2023-10-03 12:10:42 -05:00
Vishnu Ramadas	53627cc39c	configs: Add configurable GPU L1,L2 num banks and L2 latencies Previously, the L1, L2 number of banks and L2 latencies were not configurable through command line arguments. This commit adds support to configure them through the arguments '--tcp-num-banks' for number of banks in L1, '--tcc-num-banks' for number of banks in L2, and '--tcc-tag-access-latency', and '--tcc-data-access-latency' Change-Id: Ie3b713ead16865fd7120e2d809ebfa56b69bc4a1	2023-10-03 11:51:28 -05:00
Harshil Patel	3af3c1121b	stdlib, resources: Addressed requested changes Change-Id: I22abdc3bdcdde52301ed10cb3113e8925159c245 Co-authored-by: Kunal Pai <kunpai@users.noreply.github.com>	2023-10-02 23:27:32 -07:00
Harshil Patel	8182f8084b	stdlib, resources, tests: Introduce Suite of Workloads This patch introduces a new category called "suite". A suite is a collection of workloads. Each workload in a SuiteResource has a tag that can be narrowed down through the function with_input_group. Also, the set of input groups can be seen through list_input_groups. Added unit tests to test all functions of SuiteResource class. Change-Id: Iddda5c898b32b7cd874987dbe694ac09aa231f08 Co-authored-by: Kunal Pai <kunpai@ucdavis.edu>	2023-09-29 10:50:09 -07:00
Marco Kurzynski	516dcf3bcd	configs: Fixed Typo Fixed a typo importing obtain_resource Change-Id: I5792ca161187c6576e2501e5aaea610d8b8ee5ea	2023-09-20 21:42:56 +00:00
Bobby R. Bruce	e42d71e802	configs: 'memoy' -> 'memory' spelling mistake fix Fixes https://github.com/gem5/gem5/issues/309 Change-Id: I41ac7c5559d49353d01b3676b5bdf7b91e4efbda	2023-09-13 14:30:22 -07:00
Giacomo Travaglini	785eba6ce1	configs: Reflect TraceCPU changes in the etrace_replay script As we no longer inherit from the BaseCPU, we can't really use CPU generation methods (like Simulation.setCPUClass) and cache generation ones (like CacheConfig.config_cache). This is good news as it allows us to simplify the etrace script and to remove a dependency with the deprecated-to-be common library. Change-Id: Ic89ce2b9d713ee6f6e11bf20c5065426298b3da2 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-09-12 15:49:39 +01:00
Bobby R. Bruce	eb5ae35341	resources,stdlib: Add workload to resource specialization and deprecate workload.py (#212 )	2023-09-07 12:45:45 -07:00
Harshil Patel	bbe96d6485	stdlib: Changed use of Workload to obtain_resource - Changed files calling Workload class to call obtain_resource instead. Change-Id: I41f5f0c3ccc7c08b39e7049eabef9609d6d68788	2023-09-06 10:06:16 -07:00
Matthew Poremba	addba01d29	configs,dev-amdgpu: Add PCI express capability info The ROCm stack requires PCI express atomics. Currently the first PCI CapabilityPtr does not point to anything, which signals to the OS (Linux) that this is an early generation PCI device. As PCI express atomics were introduced later, the CapabilityPtr needs to point to at least a PCI express capability structure. This capability is defined as 0x10 in Linux. We additionally set the PCI atomic based bits and implement device specific PCI configuration space reads and writes to the amdgpu device. With this commit, the output of simulation when loading the amdgpu driver no longer outputs "PCIE atomics not supported". Further, an application which uses PCIe atomics (PyTorch with a reduce_sum kernel) now makes further progress. Change-Id: I5e3866979659a2657f558941106ef65c2f4d9988	2023-08-24 09:10:35 -05:00
Harshil Patel	328d140c70	stdlib, resources: Added warn msgs and comments. - Added deprecated warnings to Workload and Abstract workload. - Added comments to the classes changed. Change-Id: I671daacf5ef455ea65103bd96aa442486142a486	2023-08-23 13:50:08 -07:00
Harshil Patel	a18b4b17ed	stdlib, resources: deprecated workload - Added WorkloadResource in resource.py. - deprecated Workload and CustomWorkload. - changed riscvmatched-fs.py with obtain resource for workload to test. Change-Id: I2267c44249b96ca37da3890bf630e0d15c7335ed Note: change example files back to original	2023-08-18 13:56:12 -07:00
Adrià Armejach	ae651f4de1	configs: update riscv restore checkpoint test Change-Id: I019fc6394a03196711ab52533ad8062b22c89daf	2023-08-02 14:46:36 +02:00
Matthew Poremba	f8490e4681	configs: Only require MMIO trace for Vega10 The MMIO trace contains register values for parts of the GPU that are not modeled in gem5, such as registers related to the graphics core. Since MI100 and MI200 do not have anything that is not modeled, the MMIO trace is not needed, therefore it does not need to be used or checked and the command line option goes away entirely for MI100/200. Change-Id: I23839db32b1b072bd44c8c977899a99347fc9687	2023-07-30 13:17:05 -05:00
Matthew Poremba	9acfc5a751	configs: Enable AVX2 for GPUFS+KVM AVX is a requirement for some ROCm libraries, such as rocBLAS, which are themselves requirements for libraries higher up the stack like PyTorch. This patch sets the necessary CPUID bits in the GPUFS config to enable AVX, AVX2, and various SSE features so that applications using these libraries do not cause an illegal instruction trap. Change-Id: Id22f543fb2a06b268271725a54075ee6a9a1f041	2023-07-28 11:34:04 -05:00
Daniel Kouchekinia	984499329d	mem-ruby,configs: Add GLC Atomic Latency VIPER Parameter (#110 ) Added a GLC atomic latency parameter (glc-atomic-latency) used when enqueueing response messages regarding atomics directly performed in the TCC. This latency is added in addition to the L2 response latency (TCC_latency). This represents the latency of performing an atomic within the L2. With this change, the TCC response queue will receive enqueues with varying latencies as GLC atomic responses will have this added GLC atomic latency while data responses will not. To accommodate this in light of the queue having strict FIFO ordering (which would be violated here), this change also adds an optional parameter bypassStrictFIFO to the SLICC enqueue function which allows overriding strict FIFO requirements for individual messages on a case-by-case basis. This parameter is only being used in the TCC's atomic response enqueue call. Change-Id: Iabd52cbd2c0cc385c1fb3fe7bcd0cc64bdb40aac	2023-07-23 15:57:06 -05:00
Bobby R. Bruce	01623fac68	stdlib,configs,tests: Remove deprecated Resource classes usage (#102 ) * stdlib,configs,tests: Remove `Resource` class use This class is deprecated, but was still used in various example configuration scripts and tests. This patch replaces it with the `obtain_resource` function. Change-Id: I0c89bf17783ccaaafc18072aaeefb5d1e207bc55 * configs: Remove `CustomDiskImageResource` use The class is deprecated but was still used in the SPEC example scripts. This patch replaces it with the `DiskImageResource` class. Change-Id: Ie0697fe59a3d737b05eb45ff3bc964f42b0387e0 * configs,tests: Remove `CustomResource` use This class is deprecated but was still used in example scripts and mentioned, incorrectly, in comments in the pyunit tests. This patch removes these. Change-Id: Icb6d02f47a5b72cd58551e5dcd59cc72d6a91a01 * stdlib: Remove '\' in Workload docstring example This example shows how to use the Workload. The backslash is not correct Python and would fail if used in this way. Co-authored-by: Jason Lowe-Power <jason@lowepower.com> --------- Co-authored-by: Jason Lowe-Power <jason@lowepower.com>	2023-07-20 23:08:39 -07:00
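
Note on the `mkdir` removal (commits 486916b5d4 and 1fe0056d3b): the failure described there can be reproduced in plain Python. The sketch below is not taken from simpoint-se-checkpoint.py; it assumes the removed call behaved like `os.mkdir`, and the helper name `try_mkdir` and the `X/Y/Z` layout under a temporary directory are illustrative only.

```python
import os
import tempfile


def try_mkdir(path):
    """Return True if a non-recursive mkdir succeeds, False if a parent is missing."""
    try:
        os.mkdir(path)
        return True
    except FileNotFoundError:
        # Raised as "No such file or directory" when a parent does not exist.
        return False


# Hypothetical layout: a fresh temporary base with no X or Y beneath it.
base = tempfile.mkdtemp()
nested = os.path.join(base, "X", "Y", "Z")

plain_ok = try_mkdir(nested)        # parents X and Y do not exist yet
os.makedirs(nested, exist_ok=True)  # creates X, Y and Z as needed
recursive_ok = os.path.isdir(nested)

print(plain_ok, recursive_ok)
```

A recursive call such as `os.makedirs(..., exist_ok=True)` is also safe to repeat, which is why a separate non-recursive `mkdir` before it adds nothing.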

1374 Commits