derek/gem5

Author	SHA1	Message	Date
Erin Le	e1db67c4bd	configs, dev, learning-gem5, python, tests: more clarification This commit contains the rest of the base 2 vs base 10 cache/memory size clarifications. It also changes the warning message to use warn(). With these changes, the warning message should now no longer show up during a fresh compilation of gem5. Change-Id: Ia63f841bdf045b76473437f41548fab27dc19631	2024-08-23 18:02:42 -07:00
Matthew Poremba	ddc9a18536	configs: GPUFS: Disable KVM perf counters by default (#1391) This is on by default in gem5 (see src/cpu/kvm/BaseKvmCPU.py), however the perf counters only measure host instruction counters and GPUFS is not concerned about accuracy of KVM CPU stats. There is also a larger set of users who have access to KVM, but do not have the paranoid level low enough to attach performance counters. Therefore, make the performance counters OFF by default. They can still be enabled, but this will allow for a larger set of users to follow the upcoming GPUFS documentation without needing to read through a troubleshooting section after seeing a gem5 error about the KVM paranoid level. Change-Id: I6b465559edf3ce17e7117ada049c60bd39aecd83	2024-07-29 12:26:10 -07:00
Matthew Poremba	8be5ce6fc9	dev-amdgpu,configs,gpu-compute: Add gfx942 version This is the version for MI300. For the most part, it is the same as MI200 with the exception of architected flat scratch (not yet implemented in gem5) and therefore a new version enum is required. Change-Id: Id18cd7b57c4eebd467c010a3f61e3117beb8d58a	2024-05-15 12:08:41 -07:00
Matthew Poremba	386fb3d1cc	configs: Fix HSA packet processor address The address has one too many zeros and is therefore placed in a memory region usually used for system memory. As a result this causes failure when trying to run a simulation with a huge amount of memory. Change the address to be within the C000'0000h - FFFF'FFFFh X86 I/O hole as was intended. Change-Id: I5d03ac19ea3b2c01a8c431073c12fa1868b3df24	2024-05-03 14:29:30 -07:00
Matthew Poremba	c54039da5b	configs: GPUFS: Turn off SSE4 and fancy XSAVEs (#1041) A user reported a bug with the SSE4.1 version of memcmp in libc. When enabled the simulated program crashes with SIGILL. After attempting all fixes recommended by Intel SDM and still not working, turning the bit off instead. Similarly, the default XSAVE functionality is not completely implemented for AVX and newer ISA extensions. Therefore, there is not much point to claiming to support the more advanced versions of XSAVE (XSAVEOPT, XSAVEC, XSAVES, and XGETBV with ECX=1). Note that none of these bits are enabled for non-GPU full system simulations (see src/arch/x86/X86ISA.py). This only impacts GPUFS simulations. Change-Id: I8eb7bf0f2a0a29226095e7889fec9c1e8a65f88f	2024-04-20 11:04:59 -07:00
Matthew Poremba	823b5a6eb8	dev-amdgpu: Support multiple CPs and MMIO AddrRanges Currently gem5 assumes that there is only one command processor (CP) which contains the PM4 packet processor. Some GPU devices have multiple CPs which the driver tests individually during POST if they are used or not. Therefore, these additional CPs need to be supported. This commit allows for multiple PM4 packet processors which represent multiple CPs. Each of these processors will have its own independent MMIO address range. To more easily support ranges, the MMIO addresses now use AddrRange to index a PM4 packet processor instead of the hard-coded constexpr MMIO start and size pairs. By default only one PM4 packet processor is created, meaning the functionality of the simulation is unchanged for devices currently supported in gem5. Change-Id: I977f4fd3a169ef4a78671a4fb58c8ea0e19bf52c	2024-03-21 10:13:55 -05:00
Michael Boyer	acd9d3ff94	gpu-compute: Add support for skipping GPU kernels (#940) This commit adds two new command-line options: --skip-until-gpu-kernel N Skips (non-blit) GPU kernels until the target kernel is reached. Execution continues normally from there. Blit kernels are not skipped because they are responsible for copying the kernel code and metadata for the non-blit kernels. Note that skipping kernels can impact correctness; this feature is only useful if the kernel of interest has no data-dependent behavior, or its data-dependent behavior is not based on data generated by the skipped kernels. --exit-after-gpu-kernel N Ends the simulation after completing (non-blit) GPU kernel N. This commit also renames two existing command-line options: --debug-at-gpu-kernel -> --debug-at-gpu-task --exit-at-gpu-kernel -> --exit-at-gpu-task These were renamed because they count GPU tasks, which include both kernels launched by the application as well as blit kernels. Change-Id: If250b3fd2db05c1222e369e9e3f779c4422074bc	2024-03-21 07:46:27 -07:00
Bobby R. Bruce	d11c40dcac	misc: Run `pre-commit run --all-files` This ensures `isort` is applied to all files in the repo. Change-Id: Ib7ced1c924ef1639542bf0d1a01c5737f6ba43e9	2023-11-29 22:06:41 -08:00
Matthew Poremba	f7ad8fe435	configs: GPUFS option to disable KVM perf counters (#433) Add a --no-kvm-perf option to disable KVM perf counters for GPUFS scripts. This is useful for users who have KVM enabled but configured with more restrictive settings, which seems to be the default in newer Linux distros. Change-Id: I7508113d0f7c74deb21ea7b2770522885a0ec822	2023-10-11 14:20:27 -07:00
Bobby R. Bruce	ddf6cb88e4	misc: Run `pre-commit run --all-files` This is to reflect the updates made to black when running `pre-commit autoupdate`. Change-Id: Ifb7fea117f354c7f02f26926a5afdf7d67bc5919	2023-10-10 14:01:58 -07:00
Vishnu Ramadas	d3637a489d	configs: Add option to disable AVX in GPUFS GPUFS+KVM simulations automatically enable AVX. This commit adds a command line option to disable AVX if it's not needed for a GPUFS simulation. Change-Id: Ic22592767dbdca86f3718eca9c837a8e29b6b781	2023-10-03 12:10:42 -05:00
Matthew Poremba	9acfc5a751	configs: Enable AVX2 for GPUFS+KVM AVX is a requirement for some ROCm libraries, such as rocBLAS, which are themselves requirements for libraries higher up the stack like PyTorch. This patch sets the necessary CPUID bits in the GPUFS config to enable AVX, AVX2, and various SSE features so that applications using these libraries do not cause an illegal instruction trap. Change-Id: Id22f543fb2a06b268271725a54075ee6a9a1f041	2023-07-28 11:34:04 -05:00
Matthew Poremba	3756af8ed9	gpu-compute,configs: Make sim exits conditional The unconditional exit event when a kernel completes that was added in `c644eae2dd` is causing scripts that do not ignore unknown exit events to end simulation prematurely. One such script is the apu_se.py script used in SE mode GPU simulation. Make this exit conditional to the parameter being set to a valid value to avoid this problem. Change-Id: I1d2c082291fdbcf27390913ffdffb963ec8080dd Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72098 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-07-07 14:12:54 +00:00
Matthew Poremba	ce715601ad	configs: Add GPUFS --root-partition option Different GPUFS disk images have different root partitions that Linux needs to boot from. In particular, Ubuntu's new installer has a GRUB partition that cannot seem to be removed. Adding this as an option prevents needing to edit a config script to change one character each time a different disk image is used. Change-Id: Iac2996ea096047281891a70aa2901401ac9746fc Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71918 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2023-06-29 23:30:16 +00:00
Matthew Poremba	6b4a1020be	configs,dev-amdgpu: GPUFS MI200/gfx90a support Add support for MI200-like device. This includes adding PCI IDs and new MMIOs for the device, a different MAP_PROCESS packet, and a different calculation for the number of VGPRs. Change-Id: I0fb7b3ad928826beaa5386d52a94ba504369cb0d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70317 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-25 19:14:32 +00:00
Matthew Poremba	8b91ac6f8d	dev-amdgpu: Refactor MMIO interface for SDMA engines Currently the amdgpu simulated device is assumed to be a Vega10. As a result there are a few things that are hardcoded. One of those is the number of SDMAs. In order to add a newer device, such as MI100+, we need to enable a flexible number of SDMAs. In order to support a variable number of SDMAs and with the MMIO offsets of each device being potentially different, the MMIO interface for SDMAs is changed to use an SDMA class method dispatch table which forwards a 32-bit value from the MMIO packet to the MMIO functions in SDMA of the format `void method(uint32_t)`. Several changes are made to enable this: - Allow the SDMA to have a variable MMIO base and size. These are configured in python. - An SDMA class method dispatch table which contains the MMIO offset relative to the SDMA's MMIO base address. - An updated writeMMIO method to iterate over the SDMA MMIO address ranges and call the appropriate SDMA MMIO method which matches the MMIO offset. - Moved all SDMA related MMIO data bit twiddling, masking, etc. into the MMIO methods themselves instead of in the writeMMIO method in SDMAEngine. Change-Id: Ifce626f84d52f9e27e4438ba4e685e30dbf06dbc Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70040 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2023-04-28 00:48:35 +00:00
Matthew Poremba	c2c5cd1048	configs: Allow other CPU types in GPUFS Previously the CPU type and memory modes were hardcoded for KVM, because there was a deadlock bug. After some recent testing, this deadlock bug no longer exists with the simple CPU models. Thus, changing the configs to allow for other CPU models as a first step toward lifting the KVM requirement from GPUFS. Change-Id: Ib616c3ef60f173871421b55a8bb73b25ce2990b5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69979 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2023-04-22 00:48:28 +00:00
Matthew Poremba	27da9b3576	configs: GPUFS: use multiple event queues for >1 CPU The KVM CPU hangs if there are not multiple event queues when more than one CPU is created. Since GPUFS primarily relies on the KVM CPU, support for multiple event queues is needed. Some GPU libraries, such as AMD Research's ATMI library, assume more than one CPU. This changeset adds support for multiple CPUs and was tested for up to four CPUs. Change-Id: Ia354e02209d0fa18195f3ad44f4fb1d58e93b5ca Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65131 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-11-01 15:34:29 +00:00
Bobby R. Bruce	787204c92d	python: Apply Black formatter to Python files The command executed was `black src configs tests util`. Change-Id: I8dfaa6ab04658fea37618127d6ac19270028d771 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47024 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-08-03 09:10:41 +00:00
Matthew Poremba	91e8bbe299	configs,gpu-compute: Support fetch from system pages The amdgpu driver supports fetching instructions from pages which reside in system memory rather than device memory. This changeset adds support to do this by adding the system hub object added in a prior changeset to the fetch unit and issues requests to the system hub if the system bit in the memory page's PTE is set. Otherwise, the requestor ID is set to be device memory and the request is routed through the Ruby network / GPU caches to fetch the instructions. Change-Id: Ib2fb47c589fdd5e544ab6493d7dbd8f2d9d7b0e8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57652 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-28 23:24:53 +00:00
Matthew Poremba	8b30b6520a	configs: Add GPU TLBs for GPU full system Add the constructors for the Vega TLB and TLB coalescers in the python config. These need a pointer to the gpu device which is added as a parameter. The last level TLB's page table walker is added as a dma device to the system so that the port is connected to the GPU device memory in the disjoint VIPER configuration file. A portion of the GPUFS system configuration file needs to be shuffled around so that the shader CPU is created before the TLBs are created so they can be connected to the shader's ports. This means the real CPU init code needs to break once reaching the shader. The vendor string must also be set after createThreads is called on real CPUs. Change-Id: I36ed93db262b21427f3eaf4904a1c897a2894835 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57649 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-25 19:51:29 +00:00
Matthew Poremba	1dea025fcc	configs: Force GPUFS config to use KVM Change-Id: Ibca219df75bb2f2315297505a21b347e9dd26853 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57532 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-25 19:51:29 +00:00
Matthew Poremba	581e451723	gpu-compute,dev-hsa: Update CP and HSAPP for full-system Make the necessary changes to connect Vega pagetable walkers for full-system mode. Previously the CP and HSA packet processor could only read AQL packets from system/host memory using proxy port. This allows for AQL to be read from device memory which is used for non-blit kernels. Change-Id: If28eb8be68173da03e15084765e77e92eda178e9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53077 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-25 19:51:29 +00:00
Matthew Poremba	66dd94a0ee	configs: Add disjoint VIPER configuration The disjoint VIPER configuration creates completely disconnected CPU and GPU Ruby networks which can communicate only via the PCI bus. Either garnet or simple network can be used. This copies most of the Ruby setup from Ruby.py's create_system since creating disjoint networks is not possible using Ruby.py. Change-Id: Ibc23aa592f56554d088667d8e309ecdeb306da68 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53072 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-25 14:12:51 +00:00
Matthew Poremba	0aedbb82fe	configs: Allow for second disk in GPUFS Connect the --second-disk option in GPUFS. Typically this is used as a benchmarks disk image. If the disk is unmounted at the time of checkpoint, a new disk image can be mounted after restoring the checkpoint for a simple way to add new benchmarks without recreating a checkpoint. Change-Id: I57b31bdf8ec628006d774feacff3fde6f533cd4b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53071 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-24 14:59:57 +00:00
Matthew Poremba	110b9a1bbd	configs: Set CPU vendor for GPUFS config A valid CPU vendor string (i.e., not "M5 Simulator") needs to be passed to CPUID in order for Linux to create the sysfs files needed for ROCm's Thunk interface to initialize properly. If these are not created, hipDeviceProperties and other basic GPU code APIs will error out. Change-Id: I6e3f459162e4673860a8f0a88473e38d5d7be237 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53070 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-24 14:59:57 +00:00
Matthew Poremba	1be246bbe3	dev-amdgpu: Add PM4PP, VMID, Linux definitions The PM4 packet processor is handling all non-HSA GPU packets such as packets for (un)mapping HSA queues. This commit pulls many Linux structs and defines out into their own files for clarity. Finally, it implements the VMID related functions in AMDGPU device. Change-Id: I5f0057209305404df58aff2c4cd07762d1a31690 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53068 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-24 14:59:57 +00:00
Alexandru Dutu	e67e02d657	configs: Connect SDMA, IH, and memory manager in GPUFS Add the devices that have been added in previous changesets to the config file. Forward MMIO writes to the appropriate device based on the MMIO address. Connect doorbells and forward rings to the appropriate device based on queue type. Change-Id: I44110c9a24559936102a246c9658abb84a8ce07e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53065 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-24 14:59:57 +00:00
Matthew Poremba	9313294efe	misc: Remove AMD license addition Remove the line "For use for simulation and test purposes only" in files where AMD is the only copyright holder listed in the header. This happens to be the case for all files where this line exists, removing it completely from gem5. Change-Id: I623f266b002f564301b28774f49081099cfc60fd Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-12-11 04:00:56 +00:00
Matthew Poremba	ca12a8997d	mem-ruby,sim: Add support for VGA ROM memory region Checks if the address is in a shadowed region, and sends the request to pio to be serviced by the device backing up that range. Based on: https://gem5-review.googlesource.com/c/amd/gem5/+/23484 Change-Id: I4d5b46cccd6203523008b2e9545d55eb62130964 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46159 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-06-11 17:10:32 +00:00
Matthew Poremba	a9f2e21e08	configs: Initial configuration for full-system GPU This is an initial configuration capable of booting Linux and registering a PCI device which registers as an AMD Vega 10 (Frontier Edition) GPU. It is loosely based on the example/fs.py and gem5 book full system example scripts. The top-level file is meant to be modular such that convenience scripts can be created to set arguments automatically and then call the main run function. This will evolve over time as more full-system GPU components are added and the network topology needed for disjoint address spaces is created for the VIPER protocol. Change-Id: I7002213ca8de5eb73919e49fb11840a688744012 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/44907 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-04-29 17:13:12 +00:00

31 Commits