derek/gem5

Author	SHA1	Message	Date
Matthew Poremba	6164835230	configs: GPUFS: MI300X Add a config capable of simulating MI300X ISA (gfx942). This is similar to the mi200.py config and uses the same scripts followed by some tuneable parameters. This config optionally lets the user call the runMI300GPU function with gem5 resources. This allows for something like the following before a VIPER stdlib python is available: ``` import mi300 from gem5.resources.resource import obtain_resource disk = obtain_resource("x86-gpu-fs-img") kernel = obtain_resource("x86-linux-kernel-5.4.0-105-generic") app = obtain_resource("square-gpu-test") mi300.runMI300GPUFS("X86KvmCPU", disk, kernel, app) ``` Tested cold boot config, checkpoint create and restore, and using gem5 resources. Change-Id: I50a13d7a3d207786b779bf7fd47a5645256b1e6a	2024-05-16 09:23:03 -07:00
Matthew Poremba	8be5ce6fc9	dev-amdgpu,configs,gpu-compute: Add gfx942 version This is the version for MI300. For the most part, it is the same as MI200 with the exception of architected flat scratch (not yet implemented in gem5) and therefore a new version enum is required. Change-Id: Id18cd7b57c4eebd467c010a3f61e3117beb8d58a	2024-05-15 12:08:41 -07:00
Lukas Zenick	b279e40cb7	configs: nvm sweep fix (#1114 ) These changes to sweep and sweep_hybrid for NVM allow them to run. I'm not an expert on this, so I'm not sure if these are technically correct, but they no longer fail when running `build/X86/gem5.opt configs/nvm/sweep.py` and `build/X86/gem5.opt configs/nvm/sweep_hybrid.py` GitHub Issue: #669	2024-05-13 14:51:39 -07:00
Harshil Patel	5c82447653	misc: Add resource versions to examples (#1110 ) - Explicitly defining resource version in obtain resource calls in examples. Change-Id: I74ab5d2f5e9bc73a0145585a0fe75f2ec905472f	2024-05-09 10:16:27 -07:00
Matthew Poremba	6ed446e546	arch-x86: Add XCR0 register and add to X86KvmCPU (#1040 ) The extended control registers were not being updated in the KVM thread context nor updated in the KVM state. This was causing issues when checkpointing since the XCR0 value was reverting to the default value rather than what it was previously before the checkpoint. This was causing multiple applications to crash due to executing instructions which are now illegal instructions due to XCR0 being incorrect. This commit adds the XCR0 as a misc register similar to the existing x86 control registers and adds all of the helper functions to access and set the register value. It also adds support for updating the KVM CPU's state with the register value and updating the thread context's misc reg value so that it is checkpointed along with the other misc regs. Note that this does not add support for XSAVE of the AVX state (i.e., the upper 128 bits of YMM registers). It does however fix the immediate problem in issue #958 . Change-Id: I97456c8b57cbc7b381bd4be94944ce6567a43c76	2024-05-06 09:58:07 -07:00
Matthew Poremba	cb47755e15	gpu: Consolidated fixes for v24.0 (#1103 ) Includes fixes for several bugs reported via email, self found, and internal reports. Also includes runs through Valgrind and UBsan. See individual commits for more details.	2024-05-06 07:35:57 -07:00
Matthew Poremba	0d3d456894	gpu-compute: Invalidate Scalar cache when SQC invalidates (#1093 ) The scalar cache is not being invalidated which causes stale data to be left in the scalar cache between GPU kernels. This commit sends invalidates to the scalar cache when the SQC is invalidated. This is a sufficient baseline for simulation. Since the number of invalidates might be larger than the mandatory queue can hold and no flash invalidate mechanism exists in the VIPER protocol, the command line option for the mandatory queue size is removed, which is the same behavior as the SQC. Change-Id: I1723f224711b04caa4c88beccfa8fb73ccf56572	2024-05-06 07:35:38 -07:00
Matthew Poremba	386fb3d1cc	configs: Fix HSA packet processor address The address has one too many zeros and is therefore placed in a memory region usually used for system memory. As a result this causes failure when trying to run a simulation with a huge amount of memory. Change the address to be within the C000'0000h - FFFF'FFFFh X86 I/O hole as was intended. Change-Id: I5d03ac19ea3b2c01a8c431073c12fa1868b3df24	2024-05-03 14:29:30 -07:00
Harshil Patel	1164f9b81e	tests: update resource to use new checkpoint - Updated the id of the simpoint-se-checkpoint resource. Change-Id: Iab0b10da87b9790c24407e0edce7a18c38e0f48a	2024-05-03 10:55:04 -07:00
Alexander Richardson	1bb5d3b99e	arch-riscv: Add support for RISC-V semihosting (#681 ) See https://github.com/riscv-software-src/riscv-semihosting for the current specification. Almost all code is shared with the Arm implementation. Tested by running some binaries built with [picolibc](https://github.com/picolibc/picolibc).	2024-04-27 05:12:32 -07:00
Matthew Poremba	c54039da5b	configs: GPUFS: Turn off SSE4 and fancy XSAVEs (#1041 ) A user reported a bug with the SSE4.1 version of memcmp in libc. When enabled the simulated program crashes with SIGILL. After attempting all fixes recommended by Intel SDM and still not working, turning the bit off instead. Similarly, the default XSAVE functionality is not completely implemented for AVX and newer ISA extensions. Therefore, there is not much point to claiming to support the more advanced versions of XSAVE (XSAVEOPT, XSAVEC, XSAVES, and XGETBV with ECX=1). Note that none of these bits are enabled for non-GPU full system simulations (see src/arch/x86/X86ISA.py). This only impacts GPUFS simulations. Change-Id: I8eb7bf0f2a0a29226095e7889fec9c1e8a65f88f	2024-04-20 11:04:59 -07:00
Bobby R. Bruce	3af15a535e	mem-cache, configs, arch-arm: Handle partitioning policies through a PartitionManager (#966 ) This PR is offloading some of the partitioning logic to the partitioning manager, effectively changing the partitioning interface. Rather than always relying on the PartitionFieldExtention data structure to convey partition IDs, we make it implementation defined by introducing the partitioning manager abstraction. We want users to be able to extract the partitionId more flexibly and this requires using a SimObject. Users can extend the PartitioningManager, overriding the readPacketPartitionId, therefore providing their own means of injecting/extracting partitioning data from a packet	2024-04-08 16:05:17 -07:00
Giacomo Travaglini	82a82c8793	configs: Change cache_partitioning.py to use PartitionManager Change-Id: I891cc4967dc5483313bcb1179d19b37123a37ba0 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-04-05 10:09:46 +01:00
Kaustav Goswami	28b081b348	arch-arm,stdlib: ARM release for_kvm is moved to configs (#986 ) This change sets the `release` of the ARM board at the config file instead of overriding the release on the ArmBoard. This change partially solves issue 932 as the system taking and restoring the checkpoint is consistent across KVM and timing CPUs respectively. Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>	2024-04-03 11:48:24 +01:00
Matthew Poremba	823b5a6eb8	dev-amdgpu: Support multiple CPs and MMIO AddrRanges Currently gem5 assumes that there is only one command processor (CP) which contains the PM4 packet processor. Some GPU devices have multiple CPs which the driver tests individually during POST if they are used or not. Therefore, these additional CPs need to be supported. This commit allows for multiple PM4 packet processors which represent multiple CPs. Each of these processors will have its own independent MMIO address range. To more easily support ranges, the MMIO addresses now use AddrRange to index a PM4 packet processor instead of the hard-coded constexpr MMIO start and size pairs. By default only one PM4 packet processor is created, meaning the functionality of the simulation is unchanged for devices currently supported in gem5. Change-Id: I977f4fd3a169ef4a78671a4fb58c8ea0e19bf52c	2024-03-21 10:13:55 -05:00
Matthew Poremba	6bbde8fbb8	dev-amdgpu: Rework handling of unknown registers The top level AMDGPUDevice currently reads/writes all unknown registers to/from a map containing the previously written value. This is intended as a way to handle registers that are not part of the model but the driver requires for functionality. Since this is at the top level, it can mask changes to register values which do not go through the same interface. For example, reading an MMIO, changing via PM4 queue, and reading again returns the stale cached value. This commit removes the usage of the regs map in AMDGPUDevice, implements some important MMIOs that were previously handled by it, and moves the unknown register handling to the NBIO aperture only. To reduce the number of additional MMIOs to implement, the display manager in vega10 is now disabled. Change-Id: Iff0a599dd82d663c7e710b79c6ef6d0ad1fc44a2	2024-03-21 10:10:01 -05:00
Michael Boyer	acd9d3ff94	gpu-compute: Add support for skipping GPU kernels (#940 ) gpu-compute: Add support for skipping GPU kernels This commit adds two new command-line options: --skip-until-gpu-kernel N Skips (non-blit) GPU kernels until the target kernel is reached. Execution continues normally from there. Blit kernels are not skipped because they are responsible for copying the kernel code and metadata for the non-blit kernels. Note that skipping kernels can impact correctness; this feature is only useful if the kernel of interest has no data-dependent behavior, or its data-dependent behavior is not based on data generated by the skipped kernels. --exit-after-gpu-kernel N Ends the simulation after completing (non-blit) GPU kernel N. This commit also renames two existing command-line options: --debug-at-gpu-kernel -> --debug-at-gpu-task --exit-at-gpu-kernel -> --exit-at-gpu-task These were renamed because they count GPU tasks, which include both kernels launched by the application as well as blit kernels. Change-Id: If250b3fd2db05c1222e369e9e3f779c4422074bc	2024-03-21 07:46:27 -07:00
Michael Boyer	ba2f5615ba	gpu-compute: Support cache line sizes >64B in GPUFS (#939 ) This change fixes two issues: 1) The --cacheline_size option was setting the system cache line size but not the Ruby cache line size, and the mismatch was causing assertion failures. 2) The submitDispatchPkt() function accesses the kernel object in chunks, with the chunk size equal to the cache line size. For cache line sizes >64B (e.g. 128B), the kernel object is not guaranteed to be aligned to a cache line and it was possible for a chunk to be partially contained in two separate device memories, causing the memory access to fail. Change-Id: I8e45146901943e9c2750d32162c0f35c851e09e1 Co-authored-by: Michael Boyer <Michael.Boyer@amd.com>	2024-03-20 11:09:25 -07:00
Giacomo Travaglini	058dd7e195	configs, tests: Amend stdlib configs to use WalkCache hierarchy As X86 and RISCV are relying on a Table Walker cache, we change their stdlib configs to use the newly defined PrivateL1PrivateL2WalkCacheHierarchy Change-Id: I63c3f70a9daa3b2c7a8306e51af8065bf1bea92b Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-03-18 09:42:05 +00:00
Giacomo Travaglini	c57a6b0d59	mem-cache: Add support for partitioning caches (#765 ) * Add Cache partitioning policies to manage and enforce cache partitioning: * Add Way partition policy * Add MaxCapacity partition policy * Add PartitionFieldsExtension Extension class for Packets to store Partition IDs for cache partitioning and monitoring * Modify Cache SimObjects to store partition policies * Modify Cache block eviction logic to use new partitioning policies Co-authored-by: Adrian Herrera <adrian.herrera@arm.com> Change-Id: Ib35153a8b46803c22a433926270d82e5e19ce544 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-03-04 09:44:01 +00:00
Hristo Belchev	27c8355565	mem-cache: Add support for partitioning caches * Add Cache partitioning policies to manage and enforce cache partitioning: * Add Way partition policy * Add MaxCapacity partition policy * Add PartitionFieldsExtension Extension class for Packets to store Partition IDs for cache partitioning and monitoring * Modify Cache Tags SimObjects to store partition policies * Modify Cache Tags block eviction logic to use new partitioning policies * Add example system and TrafficGen configurations for testing Cache Partitioning Policies Change-Id: Ic3fb0f35cf060783fbb9289380721a07e18fad49 Co-authored-by: Adrian Herrera <adrian.herrera@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-03-01 15:26:38 +00:00
Nicholas Mosier	1990186170	configs: Ensure m5ops base doesn't overlap physical mem in KVM (#875 ) Fix #874, in which running se.py with 4GB or more memory (via option --mem-size=4GB) causes all KVM programs to crash or hang. This occurred because the m5ops address range (set to 0xFFFF0000-0x100000000) overlapped with physical memory under such a configuration. This patch fixes the bug by moving the m5ops address range if physical memory is >=4GB. Change-Id: Ic8a004517bc2be2c27860ed314460be749a11dc1	2024-02-26 10:33:48 -08:00
Harshil Patel	0f79b15b2f	tests: Update checkpoint tests to new checkpoints (#888 ) Change-Id: I1bf6d47017bcf77a4f93341c73de355372e1dea7	2024-02-21 16:37:28 -08:00
Vishnu Ramadas	7dae25e881	configs, gpu-compute: Add parameter in shader for CUs per SQC Change-Id: If0ae0db1b6ccc08a92f169a271b137f69f410f7b	2024-02-09 12:17:24 -06:00
Kaustav Goswami	b5d18b84a8	arm,stdlib: added kvm support to the ARM board (#725 ) This change adds support to use KVM cores on the ARM board. The board simulates gic to enable KVM, similar to the gem5 ARM FS configs. The limitation is that it only supports VExpress_GEM5_V1. Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>	2024-01-31 10:17:58 -08:00
Matthew Poremba	63caa780c2	misc: Remove all references to GCN3 Replace instances of "GCN3" with Vega. Remove gfx801 and gfx803. Rename FIJI to Vega and Carrizo to Raven. Using misc since there is not enough room to fit all the tags. Change-Id: Ibafc939d49a69be9068107a906e878408c7a5891	2024-01-17 11:11:06 -06:00
Matthew Poremba	6a9e80c54c	gpu-compute: Support for MI200 GPU model (#733 )	2024-01-15 08:18:34 -08:00
Giacomo Travaglini	7487c13181	configs: Add o3 --cpu choice to the starter_se.py script (#764 ) This is matching what we are already doing in the starter_fs.py script Change-Id: I50239050be9bd151a607ec892f8dd9322b24040b Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-01-12 07:47:51 -08:00
Matt Sinclair	dc85d1492c	gpu-compute: Added register file cache support (#730 ) The RFC is defaulted to a size of 0 which removes it completely. To use the RFC set the --register-file-cache-size to a non-zero multiple of two. In addition, rfc_pipe_length may be altered to increase or decrease RFC latency benefit.	2024-01-05 12:57:06 -06:00
KaiBatley	359ac63280	gpu-compute: Added register file cache support The RFC is defaulted to a size of 0 which removes it completely. To use the RFC set the --register-file-cache-size to a non-zero multiple of two. In addition, rfc_pipe_length may be altered to increase or decrease RFC latency benefit. Change-Id: I6f5bf5b750eb64155fbc8c8343e9feadce5c9f79	2024-01-04 22:43:05 -06:00
Matthew Poremba	a40f8f0efa	configs: Add MI200 script This is the MI200 equivalent of configs/example/gpufs/vega10.py. Change-Id: Ib9761caa4326abe6b90099e6a77111b2acce0f76	2024-01-03 15:41:06 -06:00
Bobby R. Bruce	025ccadc68	configs: Fix SMT cpu type checking (#698 ) The args.cpu_type is not a type but a string so the isinstance checking will always fail and an assertion will always be thrown A cherry-pick of #684 to develop Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Co-authored-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-12-22 11:30:45 -08:00
Jason Lowe-Power	7adaaa6f2a	mem-ruby,configs: Enable Ruby with NULL build After removing `get_runtime_isa`, the `send_evicts` function in the ruby configs assumes that there is an ISA built. This change short-circuits that logic if the current build is the NULL (none) ISA. Change-Id: I75fefe3d70649b636b983c4d2145c63a9e1342f7 Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2023-12-20 15:26:03 -08:00
Bobby R. Bruce	5d09ff4525	configs: Add `hasattr` guard to ensure DerivO3CPU compiled configs/ruby/Ruby.py fails when `DerivO3CPU` is not compiled into the gem5 binary. The `isinstance` check fails. This fix adds a guard. Change-Id: I1e5503ab18ec94683056c6eb28cebeda6632ae8e	2023-12-18 14:37:51 -08:00
Yu-Cheng Chang	5a6901c405	configs: Make riscv/fs_linux work in build/ALL/gem5.opt (#655 ) Change-Id: If9add7dc5e9c5600f769d27817da41466158942b	2023-12-12 08:23:28 -08:00
Yu-Cheng Chang	9bd61f217f	configs: Fix issues after get_runtime_isa() #241 removed (#652 ) 1. Fix the wrong ISA detect of get_isa function 2. Fix the typo ObjectLIst.cpu_list 3. Fix missing PageTableWalkerCache 4. Fix the invalid default cpu_type parameter Change-Id: I217ea8da8a6d8e712743a5b32c4c0669216ce6c4	2023-12-06 10:57:18 -08:00
Matthew Poremba	f00d7f70a4	configs: Fix apu_se.py CPU type checks (#651 ) The current checks do not work. Correct the CPU type names Change-Id: I81778873df0567c4a8dabbbe659c4c7a39326f98	2023-12-04 19:14:46 -08:00
Bobby R. Bruce	569e21f798	configs,stdlib,tests: Remove get_runtime_isa() (#241 ) `get_runtime_isa()` has been deprecated for some time. It is a leftover piece of code from when gem5 was compiled to a single ISA and that ISA was used to configure the simulated system. Since multi-ISA compilations are possible, `get_runtime_isa()` should not be used. Unless the gem5 binary is compiled to a single ISA, a failure will occur. The new procedure for specifying which ISA to use is to set the correct `BaseCPU` implementation, e.g., `X86SimpleTimingCPU` or `ArmO3CPU`. This patch removes the remaining `get_runtime_isa()` instances and removes the function itself. The `SimpleCore` class has been updated to allow for its CPU factory to return a class, needed by scripts in "configs/common". The deprecated functionality in the standard library, which allowed for the specifying of an ISA when setting up a processor and/or core has also been removed. Setting an ISA is now mandatory. Fixes #216.	2023-12-04 09:53:35 -08:00
Harshil Patel	bad569a3f8	misc: update x86-npb-benchmarks.py to use suites (#587 ) - updated the x86-npb-benchmarks.py to use npb workloads and suites. The suites and workloads are not yet in the database and are awaiting feedback. I am attaching the JSON file here. [npb_workloads_suite.json](https://github.com/gem5/gem5/files/13431116/npb_workloads_suite.json) To run the x86-npb-benchmarks.py script use the GEM5_RESOURCE_JSON_APPEND env variable. The full command is: ``` GEM5_RESOURCE_JSON_APPEND=[path to npb_workloads_suite.json] ./build/X86/gem5.opt configs/example/gem5_library/x86-npb-benchmarks.py --benchmark [benchmark] ``` Change-Id: I248e6452ea4122e9260e34e4368847660edae577	2023-12-03 13:23:46 -08:00
Harshil Patel	88c57e22de	misc: update gapbs example to use suites (#607 )	2023-12-03 13:21:37 -08:00
anoop	fc0a043950	mem-ruby: Unused L3CacheCntrl freed (#598 ) Seems like the MOESI_AMD_Base-L3Cache.sm file is unused in the VIPER protocol. It's confusing to have it in the GPU_VIPER.slicc file.	2023-12-01 13:01:19 -08:00
Jason Lowe-Power	b3e7af9d79	Support for classic prefetchers in Ruby (#502 ) This patch adds support for using the "classic" prefetchers with ruby cache controllers. This pull request includes a few commits making the changes in this order: - Refactor decouples the classic cache and prefetchers interfaces - Extra probes for later integration with ruby - General ruby-side support - Adds support for the CHI protocol Commit [mem-ruby: support prefetcher in CHI protocol](`2bdb65653b`) may be used as example on how to add support for other protocols. JIRA issues that may be related to this pull request: https://gem5.atlassian.net/browse/GEM5-457 https://gem5.atlassian.net/browse/GEM5-1112	2023-11-30 10:24:29 -08:00
Bobby R. Bruce	d11c40dcac	misc: Run `pre-commit run --all-files` This ensures `isort` is applied to all files in the repo. Change-Id: Ib7ced1c924ef1639542bf0d1a01c5737f6ba43e9	2023-11-29 22:06:41 -08:00
Tiago Mück	91cf58871e	mem-ruby: support prefetcher in CHI protocol Use RubyPrefetcherProxy to support prefetchers in the CHI cache controller. L1I/L1D/L2 prefetchers can now be added by specifying a non-null prefetcher type when configuring a CHI_RNF. Related JIRA: https://gem5.atlassian.net/browse/GEM5-457 https://gem5.atlassian.net/browse/GEM5-1112 Additional authors: Tuan Ta <tuan.ta2@arm.com> Change-Id: I41dc637969acaab058b22a8c9c3931fa137eeace Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-11-28 18:30:50 -06:00
Tiago Mück	becba00d95	mem-cache,configs: remove extra prefetch_* params Remove the prefetch_on_access and prefetch_on_pf_hit from BaseCache. BasePrefetch no longer expects these params to exist in the parent. Configurations that set these parameters using the cache object were fixed. Change-Id: I9ab6a545eaf930ee41ebda74e2b6b8bad0ca35a7 Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-11-28 18:30:49 -06:00
Derek Christ	e95cab429f	configs,ext,stdlib: Update DRAMSys integration (#525 ) Recent breaking changes in the DRAMSys API require user code to be updated. These updates have been applied to the gem5 integration. Furthermore, as DRAMSys started to use CMake dependency management, it is no longer sensible to maintain two separate build systems for DRAMSys. The use of the DRAMSys integration in gem5 will therefore from now on require that CMake is installed on the target machine. Additionally, support for snapshots has been implemented in DRAMSys and coupled with gem5's checkpointing API.	2023-11-14 08:05:11 -08:00
Daniel Kouchekinia	be5c03ea9f	mem-ruby,configs: Add GPU GLC Atomic Resource Constraints (#120 ) Added a resource constraint, AtomicALUOperation, to GLC atomics performed in the TCC. The resource constraint uses a new class, ALUFreeList array. The class assumes the following: - There are a fixed number of atomic ALU pipelines - While a new cache line can be processed in each pipeline each cycle, if a cache line is currently going through a pipeline, it can't be processed again until it's finished Two configuration parameters have been used to tune this behavior: - tcc-num-atomic-alus corresponds to the number of atomic ALU pipelines - atomic-alu-latency corresponds to the latency of atomic ALU pipelines Change-Id: I25bdde7dafc3877590bb6536efdf57b8c540a939	2023-11-14 07:48:48 -08:00
Kaustav Goswami	2c229aa2ff	configs,ext: gem5 SST bridge calls m5.instantiate() in gem5 This change updates the gem5 SST bridge to call m5.instantiate() in the gem5 config script instead of in the SST component. This allows more flexibility for the gem5-SST setup, as we can now write traffic generators using the bridge. Change-Id: I510a8c15f8fb00bdbdd60dafa2d9f5ad011e48f2 Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>	2023-11-06 11:54:35 -08:00
Bobby R. Bruce	531067fffa	mem,tests: Set Ruby Mem Test atomic percent to 0 (#489 ) Fixes https://github.com/gem5/gem5/issues/450 (https://github.com/gem5/gem5/pull/477 fixes non-ruby memtests, so only a partial fix).	2023-10-19 15:38:38 -07:00
Andreas Sandberg	42d1c8b3c3	cpu: Restructure RAS (#428 )	2023-10-17 19:14:13 +01:00

1465 Commits