derek/gem5

Author	SHA1	Message	Date
Matthew Poremba	c5feca8251	dev-amdgpu: Rework PM4 NOP packet The PM4 NOP header is used to insert spaces in the PM4 ring and can therefore be any size. This includes zero. A size of zero is denoted by a value of 0x3fff in the NOP packet header. Currently we assume this means the remainder of the PM4 queue up to the wptr is empty/NOPs. This is not always true. This changeset reworks the PM4 NOP packet to handle the value of 0x3fff as a special value and advances the rptr by 0 bytes. This fixes issues where there were additional packets in the queue which were being skipped over by fast forwarding. Since those packets could be anything, that leads to undefined behavior afterwards. Change-Id: I3f5c3f4b7dd50f93ba503fea97454a9d41771e30 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65094 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-11-01 15:34:08 +00:00
Matthew Poremba	b623d26543	dev-amdgpu: Fix interrupt call for release mem Both the client id and source id are incorrect for the release mem CP packet. This changeset sets both to the correct value and adds asserts that the value is declared in the client ID and source ID enums. Change-Id: I4cc6c3a5f2a482e8f7dcd2a529c4a69bf71742c0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63177 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	4211962f8c	dev-amdgpu: Fix translation reading SDMA MQD ("RLC queue") The RLC queue MQD address is a GART address, not a system address, so it must be translated through the GART first. Change-Id: Ie52b0e65ebf57141b8ba6f88a49989813750eeec Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62711 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-03 16:05:58 +00:00
Matthew Poremba	68115460d8	gpu-compute: Set LDS and Scratch apertures in FS The LDS and scratch aperture base and limits are hardcoded to some values that are useful for SE mode. In reality, these are chosen by the driver, so we need to honor whatever values the driver passes so that calculated addresses fall into the correct aperture and flat instructions are routed to it. This overwrites the default hardcoded values for LDS and scratch base and limit using the values provided by the driver in a MAP_PROCESS packet. Change-Id: I0e194a26631f697819d8aaecf1bf346a7b7c7026 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61656 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-07-28 14:10:33 +00:00
Matthew Poremba	f65f5a8981	gpu-compute,arch-vega: Overhaul HWRegs, setreg, getreg These instructions are supposed to be reading/writing special shader hardware registers. Currently they are getting/setting an SGPR. This results in getting incorrect registers at best and clobbering an SGPR being used by an application at worst. Furthermore, some registers need to be set in the shader and the application will never (can never) set them. This patch overhauls the getreg/setreg instructions to use different storage in the shader. The values will be updated either via setreg from an application (e.g., mode register) or set by a PM4 MAP_PROCESS. Change-Id: Ie5e5d552bd04dc47f5b35b5ee40a569ae345abac Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/61655 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-07-28 14:10:33 +00:00
Matthew Poremba	54d2438066	dev-amdgpu: Removed hardcoded AQL queue size The AQL queue size is currently hardcoded to 64kB. For longer running applications this causes the circular queue to wrap before reaching the real end of the queue. Add the computation for queue size instead. Previously longer applications (e.g., bc in pannotia) were hanging around 4k kernels. With this change, the application launches 10k+ kernels. Change-Id: I6c31677c1799a3c9ce28cf4e7e79efcb987e3b7f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/59449 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-05-07 03:47:06 +00:00
Matthew Poremba	e3f65393fd	dev-amdgpu,arch-vega: Implement TLB invalidation logic Add logic to collect pointers to all GPU TLBs in full system. Implement the invalidate TLBs PM4 packet. The invalidate is done functionally since there is really no benefit to simulating it with timing and there is no support in the TLB to do so. This allows applications with much larger data sets, which may reuse device memory pages, to work in gem5 without possibly crashing due to a stale translation being left over in the TLB. Change-Id: Ia30cce02154d482d8f75b2280409abb8f8375c24 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58470 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-04-08 17:12:32 +00:00
Bobby R. Bruce	ea9b7ef6a2	dev-amdgpu: Add braces to stop clang compilation braces error Additional braces are needed due to a clang compilation bug that falsely throws a "suggest braces around initialization of subobject" error. More info on this bug is available here: https://stackoverflow.com/questions/31555584 Change-Id: Ide5cdd260716ba06f6da4663732e39d18e00af97 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58150 Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Matthew Poremba <matthew.poremba@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-25 13:40:04 +00:00
Matthew Poremba	1be246bbe3	dev-amdgpu: Add PM4PP, VMID, Linux definitions The PM4 packet processor is handling all non-HSA GPU packets such as packets for (un)mapping HSA queues. This commit pulls many Linux structs and defines out into their own files for clarity. Finally, it implements the VMID related functions in AMDGPU device. Change-Id: I5f0057209305404df58aff2c4cd07762d1a31690 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53068 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-03-24 14:59:57 +00:00
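
The NOP handling described in c5feca8251 can be illustrated with a minimal sketch. This is not the gem5 implementation: the names `headerCount`, `nopBodyBytes`, `Ring`, and `skipNopBody` are hypothetical. The sketch assumes the count sits in bits [29:16] of the header, that a regular count of N means N + 1 body dwords, and that the header dword itself has already been consumed. Only the treatment of 0x3fff as a zero-size body comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Special count value that the commit message says denotes a zero-size NOP.
constexpr uint32_t kNopZeroSize = 0x3fff;
constexpr uint32_t kDwordBytes = sizeof(uint32_t);

// Assumption: the 14-bit count field lives in header bits [29:16].
constexpr uint32_t headerCount(uint32_t header)
{
    return (header >> 16) & 0x3fff;
}

// Bytes of NOP body to skip. Assumption: a regular count of N means
// N + 1 dwords follow the header; 0x3fff means no body at all.
constexpr uint32_t nopBodyBytes(uint32_t count)
{
    return count == kNopZeroSize ? 0 : (count + 1) * kDwordBytes;
}

// Hypothetical queue state using linear byte offsets (no wrap-around),
// to keep the sketch short.
struct Ring
{
    uint64_t rptr;
    uint64_t wptr;
};

// Advance the read pointer past the NOP body only. It never jumps
// straight to wptr, so any packets after the NOP are still decoded.
inline void skipNopBody(Ring &q, uint32_t header)
{
    q.rptr += nopBodyBytes(headerCount(header));
    assert(q.rptr <= q.wptr);
}
```

With these assumptions, a header whose count field is 0x3fff leaves `rptr` unchanged, while a count of 2 advances it by 12 bytes.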

9 Commits
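
The brace issue mentioned in ea9b7ef6a2 is commonly seen with `std::array`, which is an aggregate wrapping a built-in array, so a fully braced initializer needs two levels of braces. The sketch below shows that general pattern only. The commit's actual diff is not shown on this page, and `RegPair`, `makeIds`, and `makePairs` are hypothetical names.

```cpp
#include <array>
#include <cstdint>

// Hypothetical aggregate used only for illustration.
struct RegPair
{
    uint32_t lo;
    uint32_t hi;
};

// Outer braces initialize the std::array object, inner braces initialize
// the built-in array it wraps. Spelling out both levels avoids the
// "suggest braces around initialization of subobject" diagnostic.
inline std::array<uint32_t, 3> makeIds()
{
    return {{1, 2, 3}};
}

// With an array of aggregates, each element gets its own braces inside
// the two array-level braces.
inline std::array<RegPair, 2> makePairs()
{
    return {{ {0x10, 0x11}, {0x20, 0x21} }};
}
```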