derek/gem5

Author	SHA1	Message	Date
Bobby R. Bruce	ddf6cb88e4	misc: Run `pre-commit run --all-files` This is to reflect the updates made to black when running `pre-commit autoupdate`. Change-Id: Ifb7fea117f354c7f02f26926a5afdf7d67bc5919	2023-10-10 14:01:58 -07:00
Vishnu Ramadas	d3637a489d	configs: Add option to disable AVX in GPUFS GPUFS+KVM simulations automatically enable AVX. This commit adds a command line option to disable AVX if it's not needed for a GPUFS simulation. Change-Id: Ic22592767dbdca86f3718eca9c837a8e29b6b781	2023-10-03 12:10:42 -05:00
Matthew Poremba	addba01d29	configs,dev-amdgpu: Add PCI express capability info The ROCm stack requires PCI express atomics. Currently the first PCI CapabilityPtr does not point to anything, which signals to the OS (Linux) that this is an early generation PCI device. As PCI express atomics were introduced later, the CapabilityPtr needs to point to at least a PCI express capability structure. This capability is defined as 0x10 in Linux. We additionally set the PCI atomic based bits and implement device specific PCI configuration space reads and writes to the amdgpu device. With this commit, the output of simulation when loading the amdgpu driver no longer outputs "PCIE atomics not supported". Further, an application which uses PCIe atomics (PyTorch with a reduce_sum kernel) now makes further progress. Change-Id: I5e3866979659a2657f558941106ef65c2f4d9988	2023-08-24 09:10:35 -05:00
Matthew Poremba	9acfc5a751	configs: Enable AVX2 for GPUFS+KVM AVX is a requirement for some ROCm libraries, such as rocBLAS, which are themselves requirements for libraries higher up the stack like PyTorch. This patch sets the necessary CPUID bits in the GPUFS config to enable AVX, AVX2, and various SSE features so that applications using these libraries do not cause an illegal instruction trap. Change-Id: Id22f543fb2a06b268271725a54075ee6a9a1f041	2023-07-28 11:34:04 -05:00
Matthew Poremba	3756af8ed9	gpu-compute,configs: Make sim exits conditional The unconditional exit event when a kernel completes that was added in `c644eae2dd` is causing scripts that do not ignore unknown exit events to end simulation prematurely. One such script is the apu_se.py script used in SE mode GPU simulation. Make this exit conditional to the parameter being set to a valid value to avoid this problem. Change-Id: I1d2c082291fdbcf27390913ffdffb963ec8080dd Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72098 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-07-07 14:12:54 +00:00
Matthew Poremba	ce715601ad	configs: Add GPUFS --root-partition option Different GPUFS disk images have different root partitions that Linux needs to boot from. In particular, Ubuntu's new installer has a GRUB partition that cannot seem to be removed. Adding this as an option prevents needing to edit a config script to change one character each time a different disk image is used. Change-Id: Iac2996ea096047281891a70aa2901401ac9746fc Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71918 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2023-06-29 23:30:16 +00:00
Matthew Poremba	6b4a1020be	configs,dev-amdgpu: GPUFS MI200/gfx90a support Add support for MI200-like device. This includes adding PCI IDs and new MMIOs for the device, a different MAP_PROCESS packet, and a different calculation for the number of VGPRs. Change-Id: I0fb7b3ad928826beaa5386d52a94ba504369cb0d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70317 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-25 19:14:32 +00:00
Matthew Poremba	8b91ac6f8d	dev-amdgpu: Refactor MMIO interface for SDMA engines Currently the amdgpu simulated device is assumed to be a Vega10. As a result there are a few things that are hardcoded. One of those is the number of SDMAs. In order to add a newer device, such as MI100+, we need to enable a flexible number of SDMAs. In order to support a variable number of SDMAs and with the MMIO offsets of each device being potentially different, the MMIO interface for SDMAs is changed to use an SDMA class method dispatch table which forwards a 32-bit value from the MMIO packet to the MMIO functions in SDMA of the format `void method(uint32_t)`. Several changes are made to enable this: - Allow the SDMA to have a variable MMIO base and size. These are configured in python. - An SDMA class method dispatch table which contains the MMIO offset relative to the SDMA's MMIO base address. - An updated writeMMIO method to iterate over the SDMA MMIO address ranges and call the appropriate SDMA MMIO method which matches the MMIO offset. - Moved all SDMA related MMIO data bit twiddling, masking, etc. into the MMIO methods themselves instead of in the writeMMIO method in SDMAEngine. Change-Id: Ifce626f84d52f9e27e4438ba4e685e30dbf06dbc Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70040 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2023-04-28 00:48:35 +00:00
Matthew Poremba	9c3107c762	dev-amdgpu,configs: Add human readable names for different GPUs Add a human readable string for GPU device names rather than using the device ID in the code. This is intended to make code more readable. Change-Id: Id3ea74ca37422b1f4a0f09e5a9522d37b5998c1a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70038 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2023-04-28 00:48:35 +00:00
Matthew Poremba	c2c5cd1048	configs: Allow other CPU types in GPUFS Previously the CPU type and memory modes were hardcoded for KVM, because there was a deadlock bug. After some recent testing, this deadlock bug no longer exists with the simple CPU models. Thus, changing the configs to allow for other CPU models as a first step toward lifting the KVM requirement from GPUFS. Change-Id: Ib616c3ef60f173871421b55a8bb73b25ce2990b5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69979 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2023-04-22 00:48:28 +00:00
Matthew Poremba	27da9b3576	configs: GPUFS: use multiple event queues for >1 CPU The KVM CPU hangs if there are not multiple event queues when more than one CPU is created. Since GPUFS primarily relies on the KVM CPU, support for multiple event queues is needed. Some GPU libraries, such as AMD Research's ATMI library, assume more than one CPU. This changeset adds support for multiple CPUs and was tested for up to four CPUs. Change-Id: Ia354e02209d0fa18195f3ad44f4fb1d58e93b5ca Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65131 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-11-01 15:34:29 +00:00
Bobby R. Bruce	787204c92d	python: Apply Black formatter to Python files The command executed was `black src configs tests util`. Change-Id: I8dfaa6ab04658fea37618127d6ac19270028d771 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47024 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-08-03 09:10:41 +00:00
Matthew Poremba	91e8bbe299	configs,gpu-compute: Support fetch from system pages The amdgpu driver supports fetching instructions from pages which reside in system memory rather than device memory. This changeset adds support to do this by adding the system hub object added in a prior changeset to the fetch unit and issues requests to the system hub if the system bit in the memory page's PTE is set. Otherwise, the requestor ID is set to be device memory and the request is routed through the Ruby network / GPU caches to fetch the instructions. Change-Id: Ib2fb47c589fdd5e544ab6493d7dbd8f2d9d7b0e8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57652 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-28 23:24:53 +00:00
Matthew Poremba	8b30b6520a	configs: Add GPU TLBs for GPU full system Add the constructors for the Vega TLB and TLB coalescers in the python config. These need a pointer to the gpu device which is added as a parameter. The last level TLB's page table walker is added as a dma device to the system so that the port is connected to the GPU device memory in the disjoint VIPER configuration file. A portion of the GPUFS system configuration file needs to be shuffled around so that the shader CPU is created before the TLBs are created so they can be connected to the shader's ports. This means the real CPU init code needs to break once reaching the shader. The vendor string must also be set after createThreads is called on real CPUs. Change-Id: I36ed93db262b21427f3eaf4904a1c897a2894835 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57649 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-25 19:51:29 +00:00
Matthew Poremba	1dea025fcc	configs: Force GPUFS config to use KVM Change-Id: Ibca219df75bb2f2315297505a21b347e9dd26853 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57532 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-25 19:51:29 +00:00
Matthew Poremba	581e451723	gpu-compute,dev-hsa: Update CP and HSAPP for full-system Make the necessary changes to connect Vega pagetable walkers for full-system mode. Previously the CP and HSA packet processor could only read AQL packets from system/host memory using proxy port. This allows for AQL to be read from device memory which is used for non-blit kernels. Change-Id: If28eb8be68173da03e15084765e77e92eda178e9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53077 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-25 19:51:29 +00:00
Matthew Poremba	66dd94a0ee	configs: Add disjoint VIPER configuration The disjoint VIPER configuration creates completely disconnected CPU and GPU Ruby networks which can communicate only via the PCI bus. Either garnet or simple network can be used. This copies most of the Ruby setup from Ruby.py's create_system since creating disjoint networks is not possible using Ruby.py. Change-Id: Ibc23aa592f56554d088667d8e309ecdeb306da68 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53072 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-25 14:12:51 +00:00
Matthew Poremba	0aedbb82fe	configs: Allow for second disk in GPUFS Connect the --second-disk option in GPUFS. Typically this is used as a benchmarks disk image. If the disk is unmounted at the time of checkpoint, a new disk image can be mounted after restoring the checkpoint for a simple way to add new benchmarks without recreating a checkpoint. Change-Id: I57b31bdf8ec628006d774feacff3fde6f533cd4b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53071 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-24 14:59:57 +00:00
Matthew Poremba	110b9a1bbd	configs: Set CPU vendor for GPUFS config A valid CPU vendor string (i.e., not "M5 Simulator") needs to be passed to CPUID in order for Linux to create the sysfs files needed for ROCm's Thunk interface to initialize properly. If these are not created, hipDeviceProperties and other basic GPU code APIs will error out. Change-Id: I6e3f459162e4673860a8f0a88473e38d5d7be237 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53070 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-24 14:59:57 +00:00
Matthew Poremba	1be246bbe3	dev-amdgpu: Add PM4PP, VMID, Linux definitions The PM4 packet processor is handling all non-HSA GPU packets such as packets for (un)mapping HSA queues. This commit pulls many Linux structs and defines out into their own files for clarity. Finally, it implements the VMID related functions in AMDGPU device. Change-Id: I5f0057209305404df58aff2c4cd07762d1a31690 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53068 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-24 14:59:57 +00:00
Alexandru Dutu	e67e02d657	configs: Connect SDMA, IH, and memory manager in GPUFS Add the devices that have been added in previous changesets to the config file. Forward MMIO writes to the appropriate device based on the MMIO address. Connect doorbells and forward rings to the appropriate device based on queue type. Change-Id: I44110c9a24559936102a246c9658abb84a8ce07e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53065 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-24 14:59:57 +00:00
Matthew Poremba	9313294efe	misc: Remove AMD license addition Remove the line "For use for simulation and test purposes only" in files where AMD is the only copyright holder listed in the header. This happens to be the case for all files where this line exists, removing it completely from gem5. Change-Id: I623f266b002f564301b28774f49081099cfc60fd Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-12-11 04:00:56 +00:00
Matthew Poremba	e9bac9df87	dev-amdgpu,configs: checkpoint before MMIOs The flow for Full System amdgpu is to use KVM to boot Linux and begin loading the driver module. However, the amdgpu module requires reading the VGA ROM located at 0xc0000 in X86. KVM does not support having a small 128KiB hole at this location, therefore we take a checkpoint and switch to a timing CPU to continue loading the drivers before the VGA ROM is read. This creates a checkpoint just before the first MMIOs. This is indicated by three interrupts being sent to the PCI device. After three interrupts in a row are counted, a checkpoint exit event occurs. The interrupt counter is reset if a non-interrupt PCI read is seen. Change-Id: I23b320abe81ff6e766cb3f604eca2979339938e5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46161 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-06-11 17:10:32 +00:00
Matthew Poremba	ca12a8997d	mem-ruby,sim: Add support for VGA ROM memory region Checks if the address is in a shadowed region, and sends the request to pio to be serviced by the device backing up that range. Based on: https://gem5-review.googlesource.com/c/amd/gem5/+/23484 Change-Id: I4d5b46cccd6203523008b2e9545d55eb62130964 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46159 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-06-11 17:10:32 +00:00
Matthew Poremba	a9f2e21e08	configs: Initial configuration for full-system GPU This is an initial configuration capable of booting Linux and registering a PCI device which registers as an AMD Vega 10 (Frontier Edition) GPU. It is loosely based on the example/fs.py and gem5 book full system example scripts. The top-level file is meant to be modular such that convenience scripts can be created to set arguments automatically and then call the main run function. This will evolve over time as more full-system GPU components are added and the network topology needed for disjoint address spaces is created for the VIPER protocol. Change-Id: I7002213ca8de5eb73919e49fb11840a688744012 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/44907 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-04-29 17:13:12 +00:00

25 Commits