derek/gem5

Author	SHA1	Message	Date
Gabriel Busnot	ba19f967d7	sim: Use ref constructor of MemberEventWrapper everywhere Change-Id: I77989aa7318142634c771c558293138e7b1e8e51 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67657 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>	2023-03-13 06:47:09 +00:00
Gabriel Busnot	1bb8cd3d44	sim: Switch from EventWrapper to MemberEventWrapper before deprec Change-Id: I25c81787d522a0dd063112b6727669da46e0f0e7 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67655 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-03-13 06:47:09 +00:00
Roger Chang	e6604bf109	arch-riscv,dev: Add HiFive Base Platform This is basic abstract platform and all of RISC-V system should use platform inherit from HiFiveBase, HiFiveBase declared the common way to handle interrupt. Change-Id: I52122e1c82c200d7e6012433c2535c07d427f637 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68199 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-03-02 21:12:46 +00:00
Roger Chang	2209957256	arch-riscv,dev: Add PLIC abstract class to support multiple PLIC implementation We should create PLIC abstract and have common interface to let HiFive platform send and clear interrupt to variable type of PLIC Change-Id: Ic3a2ffc2a2a002540b400c70c85c3495fa838f2a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68197 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-03-02 21:12:46 +00:00
Roger Chang	9fb5ce5cd3	arch-riscv,dev: Fix behavior issues of PLIC 1. Fix reserved size between enable memory map and threshold memory map. The number of enablePadding should be the number of context in PLIC 2. writePriority to memory should update Change-Id: Ib4b7e5ecd183863e140c4f3382a75057902d446d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68198 Reviewed-by: Ayaz Akram <yazakram@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-02-24 03:56:23 +00:00
hungweihsu	e10be09dcf	dev: add method to set initial register value out of constructor. The initial value of register is set in constructor but there is no standard way to assign the initial value and default value at the same time out of that. So we decided to add an extra method to set the initialValue to current register value. The usecase would be: reg.get().field1 = val1; reg.get().field2 = val2; reg.resetInitialValue(); Change-Id: Ibc5454e2945cc6aff943e6599043edd8ca442f5f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67917 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Gabe Black <gabe.black@gmail.com> Maintainer: Gabe Black <gabe.black@gmail.com>	2023-02-15 02:07:09 +00:00
Matthew Poremba	ea9239ae09	dev-amdgpu: Update deprecated ports Change-Id: Icbc5636c33b437c7396ee27363eed1cf006f8882 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67837 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2023-02-14 18:57:33 +00:00
Matthew Poremba	39b5b5e511	dev-amdgpu: Fix address in POLL_REGMEM SDMA packet The address for the POLL_REGMEM packet should not be shifted when the mode is 1 (memory). Relevant driver code below is not shifting the address. The shift is causing a page fault due to the incorrect address. This changeset removes the shift so the correct address is translated. https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/blob/roc-4.3.x/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c#L903 Change-Id: I7a0ec3245ca14376670df24c5d3773958c08d751 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67877 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-02-14 15:36:56 +00:00
Gabe Black	d1f76741c6	dev: Add a definition for VectorResetResponsePort. This is just a simple extension of the regular ResetResponsePort, and is useful if there is a collection of reset pins on a device. Change-Id: I6ccb21e949d3a51bf8b788ffd23e4b2b02706da9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67576 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Maintainer: Gabe Black <gabeblack@google.com>	2023-02-09 02:04:10 +00:00
Gabriel Busnot	7f4c92c910	mem,arch-arm,mem-ruby,cpu: Remove use of deprecated base port owner Change-Id: I29214278c3dd4829c89a6f7c93214b8123912e74 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67452 Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>	2023-02-03 06:11:45 +00:00
Earl Ou	1b949e9759	dev: terminal: run pollevent in terminal eventq Change-Id: Idefda0ca1cd71d3e790d470458fa1cd370393c4a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67532 Reviewed-by: Gabe Black <gabe.black@gmail.com> Maintainer: Gabe Black <gabe.black@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-02-03 03:40:35 +00:00
Matthew Poremba	0bce2e56d9	dev: Ignore MC146818 UIP bit / Fix x86 Linux 5.11+ boot As of Linux 5.11, the MC146818 code was changed to avoid reading garbage data that may occur if there is a read while the registers are being updated: github.com/torvalds/linux/commit/05a0302c35481e9b47fb90ba40922b0a4cae40d8 Previously toggling this bit was fine as Linux would check twice. It now checks before and after reading time information, causing it to retry infinitely until eventually Linux bootup fails due to watchdog timeout. This changeset always sets update in progress to false. Since this is a simulation, the updates probably will not be occurring at the same time a read is occurring. Change-Id: If0f440de9f9a6bc5a773fc935d1d5af5b98a9a4b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66731 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2023-01-17 16:43:10 +00:00
Daniel R. Carvalho	5f5aae8940	dev: Remove a couple of deprecated namespaces These namespaces have gone through the deprecation period and can now be removed: Sinic, SCMI, Ps2, Regs, Keyboard, Mouse, TxdOp, iGbReg, CopyEngineReg. Change-Id: Icfaf458bffca2658650318508c0bb376719cf911 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67370 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-01-17 09:16:20 +00:00
Gabe Black	626e445563	dev: Add a "resetter" callback to the typed register class. When using the typed register template, most functionality of the class can be controlled using callbacks. For instance, callbacks can be installed to handle reads or writes to a register without having to subclass the template and override those methods using inheritance. The recently added reset() method did not follow this pattern though, which has two problems. First, it's inconsistent with how the class is normally used. Second, once you've defined a subclass, the reader, writer, etc, callbacks still expect the type of the original class. That means these have to either awkwardly use a type different from the actual real type of the register, or use awkward, inefficient, and/or dangerous casting to get back to the true type. To address these problems, this change adds a resetter(...) method which works like the reader(...) or writer(...) methods to optionally install a callback to implement any special reset behavior. Change-Id: Ia74b36616fd459c1dbed9304568903a76a4b55de Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67203 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Maintainer: Gabe Black <gabeblack@google.com>	2023-01-12 05:52:39 +00:00
Giacomo Travaglini	5447d55e39	dev: Fix -Wunused-variable in structured binding Change-Id: Ia244767dd9d1dd7b72c320fb78e48f206694f5a2 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66891 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2022-12-22 08:07:51 +00:00
Gabe Black	fbd0722de4	fastmodel,dev: Replace the reset port with a Signal*Port<bool>. The ResetRequestPort and ResetResponsePort have a few problems: 1. A reset signal should happen during the time a reset is asserted, or in other words the device should stay in reset and not do anything while reset is asserted. It should not immediately restart execution while the reset is still held. 2. These names are misleading, since there is no response. These names are inherited from other port types where there is an actual response. There is a new generic SignalSourcePort and SignalSinkPort set of port classes which are templated on the type of signal they propagate, and which can be used in place of reset ports in c++. These ports can still have a specialized role which will ensure that only reset ports are connected to each other for a form of type checking, although the underlying c++ instances are more interoperable than that. Change-Id: Id98bef901ab61ac5b200dbbe49439bb2d2e6c57f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66675 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-12-16 07:19:05 +00:00
Gabe Black	89d5bfca7c	fastmodel,dev: Rework the IntPin classes with SignalPort. These are largely compatibility wrappers around the SignalPort classes. The python versions of these types enforce more specific compatibility, but on the c++ side the SignalPort<bool> classes can be used directly instead. Change-Id: I1325074d0ed1c8fc6dfece5ac1ee33872cc4f5e3 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66673 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-12-16 07:19:05 +00:00
Gabe Black	8b1688da34	dev: Introduce a reset() method on RegisterBank and Register classes. This will make it much easier to implement reset behaviors on devices which have RegisterBanks in them. Change-Id: I73fe9874fcb69feed33611a320dcca85c0de2d0e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66671 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Reviewed-by: Jui-min Lee <fcrh@google.com>	2022-12-16 07:19:05 +00:00
Roger Chang	dd04e70445	arch-riscv: Implement rv32 zicsr extension 1. Add misc register mstatush, cycleh, timeh, instreth, hpmcounter03...hpmcounter31, pmpcfg1, pmpcfg3 2. Implement handling RV32 only registers 3. Implement methods of set time CSR Change-Id: I5c55c18a0da91977d6e23da24ea3cbcba9f0509b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65733 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-12-13 05:21:27 +00:00
Gabe Black	9d1cc1bcc9	dev: Add an offset checking mechanism to RegisterBank. When adding a long list of registers, it can be easy to miss one which will offset all the registers after it. It can be hard to find those sorts of problems, and tedious and error prone to fix them. This change adds a mechanism to simply annotate what offset a register should have. That should also make the register list more self documenting, since you'll be able to easily see what offset a register has from the source without having to count up everything in front of it. Change-Id: Ia7e419ffb062a64a10106305f875cec6f9fe9a80 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66431 Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Maintainer: Gabe Black <gabe.black@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-12-06 13:27:33 +00:00
Giacomo Travaglini	ed6cf2eced	dev-arm: Allow GICv3 to be externally(publicly) updated Change-Id: Ifa7b745ea11e74c17024c22ae993b6103eecb744 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66271 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-12-05 10:50:42 +00:00
Giacomo Travaglini	0df37a33f6	arch-arm: Setup TC/ISA at construction time 2nd attempt This partly reverts commit `ec75787aef` by fixing the original problem noted by Bobby (long regressions): setupThreadContext has to be implemented otherwise the GICv3 cpu interface will end up holding old references when switching TC/ISAs. This new implementation is still setting up the cpu interface reference in the ISA only when it is required, but it is storing the TC/ISA reference within the interface every time the ISA::setupThreadContext gets called. Change-Id: I2f54f95761d63655162c253e887b872f3718c764 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65931 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu>	2022-12-04 20:02:10 +00:00
Matthew Poremba	eee42275ee	dev-amdgpu: Writeback RLC queue MQD when unmapped Currently when RLC queues (user mode queues) are mapped, the read/write pointers of the ring buffer are set to zero. However, these queues could be unmapped and then remapped later. In that situation the read/write pointers should be the previous value before unmapping occurred. Since the read pointer gets reset to zero, the queue begins reading from the start of the ring, which usually contains older packets. There is a 99% chance those packets contain addresses which are no longer in the page tables which will cause a page fault. To fix this we update the MQD with the current read/write pointer values and then writeback the MQD to memory when the queue is unmapped. This requires adding a pointer to the MQD and the host address of the MQD where it should be written back to. The interface for registering RLC queue is also simplified. Since we need to pass the MQD anyway, we can get values from it as well. Fixes b+tree and streamcluster from rodinia (when using RLC queues). Change-Id: Ie5dad4d7d90ea240c3e9f0cddf3e844a3cd34c4f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65791 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-12-01 21:04:05 +00:00
Hoa Nguyen	eac06ad681	python: Fix multiline quotes in a single line An example case, ```python mem_side_port = RequestPort( "This port sends requests and " "receives responses" ) ``` This is the residue of running the python formatter. This is done by finding all tokens matching the regex `"\s"(?![.;"])` and manually replacing them by empty strings. Change-Id: Icf223bbe889e5fa5749a81ef77aa6e721f38b549 Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66111 Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2022-11-29 23:44:38 +00:00
Bobby R. Bruce	ec75787aef	arch-arm: Revert 'Setup TC/ISA at construction time..' Reverts: `dd2f1fb2f8` https://gem5-review.googlesource.com/c/public/gem5/+/65174 and `47bd56ee71` https://gem5-review.googlesource.com/c/public/gem5/+/65291 The `47bd56ee` change resulted in the `SuiteUID:tests/gem5/fs/linux/arm/test.py:realview-switcheroo-noncaching-timing-ALL-x86_64-opt` nightly test stalling. This behavior can be reproduced with: ``` ./build/ALL/gem5.opt tests/gem5/fs/linux/arm/run.py tests/gem5/configs/realview-switcheroo-noncaching-timing.py tests/gem5/resources/arm "$(pwd)" ``` The subsequent change, `dd2f1fb2`, must be reverted for this change to be reverted. Change-Id: I6fed74f33d013f321b93cf1a73eee404cb87ce18 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65732 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2022-11-18 17:46:09 +00:00
Matthew Poremba	33a36d35de	dev-amdgpu: Store SDMA queue type, use for ring ID Currently the SDMA queue type is guessed in the trap method by looking at which queue in the engine is processing packets. It is possible for both queues to be processing (e.g., one queue sent a DMA and is waiting then switch to another queue), triggering an assert. Instead store the queue type in the queue itself and use that type in trap to determine which ring ID to use for the interrupt packet. Change-Id: If91c458e60a03f2013c0dc42bab0b1673e3dbd84 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65691 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-11-18 15:30:37 +00:00
Matthew Poremba	623e2d3dac	dev-amdgpu: Handle ring buffer wrap for PM4 queue Change-Id: I27bc274327838add709423b072d437c4e727a714 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65431 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-11-09 15:47:50 +00:00
Matthew Poremba	c8d687b05c	dev-amdgpu: Fix SDMA ring buffer wrap around The current SDMA wrap around handling only considers the ring buffer location as seen by the GPU. Eventually when the end of the SDMA ring buffer is reached, the driver waits until the rptr written back to the host catches up to what the driver sees before wrapping around back to the beginning of the buffer. This writeback currently does not happen at all, causing hangs for applications with a lot of SDMA commands. This changeset first fixes the sizes of the queues, especially RLC queues, so that the wrap around occurs in the correct place. Second, we now store the rptr writeback address and the absolute (unwrapped) rptr value in each SDMA queue. The absolute rptr is what the driver sends to the device and what it expects to be written back. This was tested with an application which basically does a few hundred thousand hipMemcpy() calls in a loop. It should also fix the issue with pannotia BC in fullsystem mode. Change-Id: I53ebdcc6b02fb4eb4da435c9a509544066a97069 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65351 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-11-09 04:11:35 +00:00
Giacomo Travaglini	47bd56ee71	dev-arm: Setup TC/ISA at construction time of Gicv3CPUInterface We should initialize them as soon as possible to make sure any Gicv3CPUInterface method uses a valid reference Change-Id: I8fffebdab9136a9027c4f61bb9413e97031e1969 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65291 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>	2022-11-04 21:25:37 +00:00
Matthew Poremba	489074fbfd	dev-amdgpu: Fix issues with PM4 queue map, fences The PM4 release_mem packet is used as a DMA fence in the driver. It specifies which queue the interrupt came from by encoding the me, pipe, and queue fields from the map_queue packet into the interrupt ring ID. Currently these fields are incorrect because (1) the order in the bitfield is backwards, (2) the queue constructor assigns a pointer to the PM4MapQueue packet containing this data to the dmaBuffer which gets deleted in short order, and (3) the order of the encoding of ring ID is incorrect. This change fixes these issues by (1) placing the struct values in correct order, (2) creating a const copy of the dmaBuffer on construction, and (3) using the ring ID encoding expected by the driver: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/blob/roc-4.3.x/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c#L5989 Change-Id: I72c382980e57573f8a8a6879912c4139c7e2f505 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65095 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-11-01 15:34:17 +00:00
Matthew Poremba	c5feca8251	dev-amdgpu: Rework PM4 NOP packet The PM4 NOP header is used to insert spaces in the PM4 ring and can therefore be any size. This includes zero. A size of zero is denoted by a value of 0x3fff in the NOP packet header. Currently we assume this means the remainder of the PM4 queue up to the wptr is empty/NOPs. This is not always true. This changeset reworks the PM4 NOP packet to handle the value of 0x3fff as a special value and advances the rptr by 0 bytes. This fixes issues where there were additional packets in the queue which were being skipped over by fast forwarding. Since those packets could be anything, that leads to undefined behavior afterwards. Change-Id: I3f5c3f4b7dd50f93ba503fea97454a9d41771e30 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65094 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-11-01 15:34:08 +00:00
Matthew Poremba	752b696883	dev-amdgpu: Fix SDMA trap ring ID, context SDMA traps are used in the driver as a DMA fence. To pass a fence, the SDMA sends the driver the interrupt context from a trap packet and the ring ID which specifies which queue in the SDMA engine is passing a fence. Currently the interrupt context is using the wrong value in the packet and the ring ID is hard-coded to always be the gfx queue. This changeset uses the correct interrupt context from the SDMA packet and sets the ring ID to either 0 if the gfx queue is currently being processed or 3 if the page queue is being processed. The relevant interrupt service routine in the driver can be found at: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/blob/roc-4.3.x/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c#L2129 Change-Id: Ie4a4a9d6ab1d3bf83bf76bb57a02a91100217b51 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65093 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-11-01 15:34:08 +00:00
Matthew Poremba	8899291db6	dev-amdgpu: Fix interrupt handler address assignment The interrupt handler's base address is sent via MMIO and must be shifted by 8 bits to convert to a byte address. The current code is shifting the MMIO dword first then assigning, resulting in the top 8 bits being shifted out. This changeset fixes the issue by assigning the dword to the 64-bit address first then shifting after. Similarly, the upper dword is cast to a 64-bit value first before shifting. This fixes some "fence fallback timeout" errors in the m5term output. These timeouts become a problem because the driver will reset after a few hundred of them, killing any running GPU applications as part of the process. Change-Id: I0beec313f533765c94063bcf4de8c65aacf2986b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65092 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-11-01 15:34:08 +00:00
Matthew Poremba	144ce7f12c	dev-amdgpu: Fix GART PTE size The GART table is a legacy 1-level page table primarily used for supervisor mode accesses to GPUs. The PTE size is 64-bits, not 32-bit. This causes memory sizes >3GB (in X86) to fail loading amdgpu driver. This changeset fixes the issue by setting the GART mappings to the correct data type. Change-Id: Ibfba2443675fe28316d26afa5f1a14885fdce40c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65091 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-10-31 14:40:30 +00:00
Matthew Poremba	7b16b17e61	dev-amdgpu: Chunkify SDMA copies that use device memory The current implementation of SDMA copy calls the GPU memory manager's read/write method one time passing a physical address as the source/destination. This implicitly assumes the physical addresses are contiguous which is generally not true for large allocations. This results in reading from/writing to the wrong address. This changeset fixes the problem by copying large copies in chunks of the minimum possible page size on the GPU (4kB). Each page is translated separately to ensure the correct physical address. The final copy "done" callback is only used for the last transfer. The transfers should complete in order so the copy command will not complete until all chunks have been copied. Tested and verified on an application with a large allocation (~5GB). Change-Id: I27018a963da7133f5e49dec13b0475c3637c8765 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64752 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-10-31 14:30:24 +00:00
Giacomo Travaglini	506bd9d9e7	dev-arm: Use ThreadContext instead of ISA in GICV3 cpu interface Some CPU wrappers like the Fastmodel one do extend the ThreadContext interface in order to retrieve system register state... By bypassing the TC interface and by using the ISA instead, we are basically forcing users to extend the ISA as well to intercept these calls. So with this patch we are making sure every system register is accessed (like HCR_EL2 or SCR_EL3) through the thread context. This of course does not apply to the CPU interface registers as we still use the ISA storage for them. In the future we should probably move that storage from the ISA class to the Gicv3CPUInterface class itself. This is also simplifying Gicv3CPUInterface::isEL3OrMon: currEL already covers the AArch32 case so no need to differentiate between AArch32 and AArch64 Change-Id: I446a14a6e12b77e1a62040b3422f79ae52cc9eec Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64913 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>	2022-10-27 15:33:18 +00:00
Giacomo Travaglini	9a9de78811	dev-arm: Implement System Security Control registers This block of system registers is part of the N1 SDP [1] [1]: https://developer.arm.com/documentation/101489/0000/Programmers-model/System-Security-Control-registers Change-Id: I2ecf5cd247bd68eddcd359e91f3954070dbffaa8 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64951 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2022-10-27 10:33:38 +00:00
Giacomo Travaglini	4db981576e	arch-arm: Setup ThreadContext in GICv3 cpu interface Change-Id: If019b4b114031f880dff43e05658a162c201ea6a Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64912 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-10-27 10:31:10 +00:00
Matthew Poremba	a648be2338	dev-amdgpu: Add an SDMA data debug flag This debug flag is used to print spammy SDMA DPRINTFs, such as an SDMA copy printing the data of large transfers 8 bytes per line at a time. For those prints, the SDMAEngine flag will now only print the first and last qword of the transfer and the new SDMAData flag is needed for verbose data printing. This makes the SDMAEngine flag still useful for verifying copies in applications with predictable data such as square. Additionally, the memory allocation/deallocation done solely for a print statement is removed in favor of casting the data to the printed type. Change-Id: I18c1918ef9085cca4570f79881ee63d510ccc32f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64452 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-10-13 20:17:00 +00:00
Earl Ou	317bfd62bd	dev: fix device number check error in IDE controller Fixed a typo between 3 and 4. Change-Id: I1470e30c4d472587db0b9da5512b24ab92f1fd65 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64052 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Gabe Black <gabe.black@gmail.com> Maintainer: Gabe Black <gabe.black@gmail.com>	2022-10-04 01:21:37 +00:00
Giacomo Travaglini	7c0ab07ee2	dev-arm: Fix GICv3 GICD_ITARGETSR address range According to the GICv3 manual, GICD_ITARGETSR address range goes from 0x0800 to 0x0c00 (as already implemented in the GICv2 model [1]) [1]: https://github.com/gem5/gem5/blob/v22.0.0.0/src/dev/arm/gic_v2.cc#L64 Change-Id: I064e91d070d1a7b79f41a06ffd2197e4c07dae32 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/64074 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-10-03 17:45:10 +00:00
Matthew Poremba	2f1d67f8fe	dev-amdgpu: Remove cached copy of device memory This map was originally used for fast access to the GART table. It is no longer needed as the table has been moved to the AMDGPUVM class. Along with commit `12ec5f9172` which reads functionally from device memory, this table is no longer needed and is essentially a duplicate copy of device memory for anything written over the PCI BAR. This changeset removes the map entirely which will reduce the memory footprint of simulations and potentially avoid stale copies of data when reading over the PCI BAR. Change-Id: I312ae38f869c6a65e50577b1c33dd055078aaf32 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63951 Reviewed-by: Matt Sinclair <mattdsinclair.wisc@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-10-01 14:04:45 +00:00
Matthew Poremba	b623d26543	dev-amdgpu: Fix interrupt call for release mem Both the client id and source id are incorrect for the release mem CP packet. This changeset sets both to the correct value and adds asserts that the value is declared in the client ID and source ID enums. Change-Id: I4cc6c3a5f2a482e8f7dcd2a529c4a69bf71742c0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63177 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	6c935657fd	dev-amdgpu: Implement SDMA atomic packet SDMA atomic packets are used in conjunction with RLC queues in SDMA for synchronization similar to how HSA signals are used with BLIT kernels when SDMA is disabled. Implement a skeleton of the SDMA atomic packet methods as well as the atomic add64 operation. The atomic add operation appears to be the only operation used in ROCm, so this implementation is fairly complete. See: https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/rocm-4.2.x/src/core/runtime/amd_blit_sdma.cpp#L880 Change-Id: I62cc337f2ffe590bdb947b48053760ee8b3a6f32 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63174 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	9ea28bd782	dev-amdgpu: Implement SDMA RLC queue unmapping The unmap queues packet specifies all non-static queues should be unmapped which includes RLC queues in the SDMA. This functionality did not exist before and is added in this changeset. Fixes bug with rodinia_3.0/hip/bfs. Change-Id: I80ca8cf8d89559625b5870745889b0a27916635e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63173 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	af4251f6ae	dev-amdgpu: Rework SDMA RLC queue data structure There can only ever be two RLC queues maximum. Use this information for a simpler data structure to store doorbell information. The patch changes the std::unordered_map previously used to std::array. This will also be useful in avoiding erase-while-iterating issues needed to unregister all queues at once. Change-Id: I95600e40de51cb1a992a20bcebaf7580ea4d0be8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63172 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	12ec5f9172	dev-amdgpu: Rework framebuffer reads Previously framebuffer reads would try reading from MMIO trace, special addresses, and then anything previously written to a special address range. This does not handle direct large BAR reads, causing incorrect results in some applications that were doing this. Rework the readFramebuffer method to do the following. Remove the MMIO trace read altogether, as there were not any framebuffer reads from the trace to begin with. Read special addresses first to avoid overwriting by previous writes. Next read previous writes to special ranges. The special range is the GART table. These are required for functional translations. Lastly read from the device memory directly. This does a functional read required by the PCIDevice read method which is non-timing. Reading from device memory is preferred over the map type used for GART to avoid duplication of a potentially huge amount of data. With this changeset all but one of the HIP samples and HIP examples applications now run and pass verification of results. Change-Id: Id3b788bfc5eaf17cfa1897f25d26f3725d4db321 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63171 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	4b35693bd2	dev-amdgpu: Forward RLC queue doorbells Forward user queue doorbells to the SDMA. This is the final step needed to enable RLC (user) queues to replace BLIT kernels. Change-Id: I0c2ef70bb5414b82785ef437dd65d6c57798d24f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63033 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	a5dfb0718d	dev-amdgpu: Add user-mode TranslationGen to SDMA RLC queues do translation using user-mode addresses. To support this, add the final aperture translation needed to the SDMA engine. Change-Id: I25841e240e3b44f66d26d503ab52b54379daa49a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63032 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	9ed39afe62	dev-amdgpu: Place all user-mode translations in MMHUB The memory management hub ("mmhub") is an aperture that aliases the GPU device memory. MMHUB addresses functionally map to the same device address, with the exception that it is guaranteed not to overlap with host memory. This is useful in gem5 for APIs with Addr type as it prevents sending e.g., DMAs to the wrong place. Change-Id: Ia296809a8dc2c5fbdeba6d70cd53215f9ab36c93 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63031 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-09 04:13:49 +00:00

1395 Commits