derek/gem5

Author	SHA1	Message	Date
Kyle Roarty	b40b361bee	arch-vega, gpu-compute: Add vectors to hold op info This removes the need for redundant functions like isScalarRegister/isVectorRegister, as well as isSrcOperand/isDstOperand. Also, the op info is only generated once this way instead of every time it's needed. Change-Id: I8af5080502ed08ed9107a441e2728828f86496f4 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42211 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2021-04-01 02:58:31 +00:00
Tony Gutierrez	0e2564a629	arch-gcn3, gpu-compute: Update getRegisterIndex() API This change removes the GPUDynInstPtr argument from getRegisterIndex(). The dynamic inst was only needed to get access to its parent WF's state so it could determine the number of scalar registers the wave was allocated. However, we can simply pass the number of scalar registers directly. This cuts down on shared pointer usage. Change-Id: I29ab8d9a3de1f8b82b820ef421fc653284567c65 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42210 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2021-04-01 02:58:31 +00:00
Tony Gutierrez	236b4a502f	gpu-compute: Add operand info class to GPUDynInst This change adds a class that stores operand register info for the GPUDynInst. The operand info is calculated when the instruction object is created and stored for easy access by the RF, etc. Change-Id: I3cf267942e54fe60fcb4224d3b88da08a1a0226e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42209 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2021-04-01 02:58:31 +00:00
Gabe Black	3f67faec83	arch,dev,gpu-compute,sim: Rename isa_traits.hh page_size.hh. The only thing left in isa_traits.hh are two constants, one for the number of bytes in a page, and one for how far to shift an address to get the page number. To make it clear that this is the only thing isa_traits.hh should be used for from this point forward (until it is entirely eliminated), this change renames it to the much less generic page_size.hh. Also, because isa_traits.hh used to have much more stuff in it, it was included in a lot of places it didn't need to be. This change also clears out all these legacy includes while updating the actually needed ones to the new name. Change-Id: I939b01b117c53d620b6b0a98982f6f21dc2ada72 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40179 Reviewed-by: Gabe Black <gabe.black@gmail.com> Maintainer: Gabe Black <gabe.black@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-03-30 10:17:48 +00:00
Kyle Roarty	c9415dc389	gpu-compute: Remove unused functions These functions were probably used for some stat collection, but they're no longer used, so they're being removed Change-Id: Ic99f22391c0d5ffb0e9963670efb35e503f9957d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42202 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2021-03-25 17:21:16 +00:00
Michael LeBeane	25e8a14a6b	gpu-compute: Support dynamic scratch allocations dGPUs in all versions of ROCm and APUs starting with ROCm 2.2 can under-allocate scratch resources. This patch adds support for the CP to trigger a recoverable error so that the host can attempt to re-allocate scratch to satisfy the currently stalled kernel. Note that this patch does not include a mechanism to handle dynamic scratch allocation for queues with in-flight kernels, as these queues would first need to be drained and descheduled, which would require some additional effort in the hsaPP and HW queue scheduler. If the CP encounters this scenario it will assert. I suspect this is not a particularly common occurrence in most of our applications so it is left as a TODO. This patch also fixes a few memory leaks and updates the old DMA callback object interface to use a much cleaner C++11 lambda interface. Change-Id: Ica8a5fc88888283415507544d6cc49fa748fe84d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42201 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2021-03-25 17:21:08 +00:00
Daniel R. Carvalho	7f1de4e686	misc: Fix coding style for enum's opening braces The systemc dir was not included in this fix. First it was identified that there were only occurrences at 0, 1, and 2 levels of indentation (and 2 of 2 spaces, 1 of 3 spaces and 2 of 12 spaces), using: grep -nrE --exclude-dir=systemc \ "^ *enum [A-Za-z].* {$" src/ Then the following commands were run to replace: <indent level>enum X ... { by: <indent level>enum X ... <indent level>{ Level 0: grep -nrl --exclude-dir=systemc \ "^enum [A-Za-z].* {$" src/ | \ xargs sed -Ei \ 's/^enum ([A-Za-z].*) \{$/enum \1\n\{/g' Level 1: grep -nrl --exclude-dir=systemc \ "^ enum [A-Za-z].* {$" src/ | \ xargs sed -Ei \ 's/^ enum ([A-Za-z].*) \{$/ enum \1\n \{/g' and so on. Change-Id: Ib186cf379049098ceaec20dfe4d1edcedd5f940d Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/43326 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Gabe Black <gabe.black@gmail.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-03-23 16:26:04 +00:00
Kyle Roarty	f5383a5733	gpu-compute: Fix accidental execution when stopped at barrier Due to the compute unit pipeline being executed in reverse order, there exists a scenario where a compute unit will execute an extra instruction when it's supposed to be stopped at a barrier. It occurs as follows: * The ScheduleStage sets a barrier instruction ready to execute. * The ScoreboardCheckStage adds another instruction to the readyList. This is where the barrier is checked, but because the barrier isn't executing yet, the instruction can be passed along to ScheduleStage * The barrier executes, and stalls * The ScheduleStage sees that there's a new instruction and schedules it to be executed. * Only now will the ScoreboardCheckStage realize a barrier is active and stall accordingly * The subsequent instruction executes This patch sets the wavefront status to be S_BARRIER in ScheduleStage instead of in the barrier instruction execution in order to have ScoreboardCheckStage realize that we're going to execute a barrier, preventing it from marking another instruction as ready. Change-Id: Ib683e2c68f361d7ee60a3beaf53b4b6c888c9f8d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41573 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-03-04 17:37:19 +00:00
Kyle Roarty	a9e0a1ccf1	gpu-compute: Explicitly set driver to nullptr in constructor We have a fail_if in attachDriver to prevent driver from being overwritten. However, the fail_if only checks for if the driver is not nullptr. Previously, in some cases driver was set to garbage, which made the fail_if trip the first time we were assigning the driver. This patch explicitly sets driver to nullptr in the constructor, thus ensuring that it will be nullptr the first time we call attachDriver Change-Id: I325f6033e785025a912e3af3888c66cee0332f40 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41973 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-03-01 18:10:11 +00:00
Giacomo Travaglini	41928dac80	misc: Remove unused params() definitions Lots of times the params() helper has been defined but not used Change-Id: Id71829aca71341d46964d8f071099342b946b62f Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41613 Tested-by: kokoro <noreply+kokoro@google.com>	2021-02-19 23:27:34 +00:00
Alexander Klimov	92ba3ba843	misc: Use PARAMS The patch is using the newly defined PARAMS macro to replace custom params() getters in derived class. The patch is also removing redundant _params: Instead of creating yet another _params field, SimObject descendants should use params() to expose the real type of SimObject::_params they already have. Change-Id: I43394cebb9661fe747bdbb332236f0f0181b3dba Signed-off-by: Alexander Klimov <Alexander.Klimov@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39900 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-02-19 23:27:34 +00:00
Gabe Black	9a0b79459d	misc: Fix mismatched struct/class "tags" and reenable that warning. The mismatches were from places where Params structs had been declared as classes instead of structs, and ruby's MachineID struct. A comment describing why the warning had been disabled said that it was because of libstdc++ version 4.8. As far as I can tell, that version is old enough to be outside the window we support, and so that should no longer be a problem. It looks like the oldest version of gcc we support, 5.0, corresponds with approximately libstdc++ version 6.0.21. https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html#abi.versioning Change-Id: I75ad92f3723a1883bd47e3919c5572a353344047 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40953 Reviewed-by: Gabe Black <gabe.black@gmail.com> Maintainer: Gabe Black <gabe.black@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-02-19 08:29:00 +00:00
Alexandru Dutu	14d6e8fac4	arch-gcn3: Implementation of s_sleep This changeset implements the s_sleep instruction in a similar way to s_waitcnt. Change-Id: I4811c318ac2c76c485e2bfd9d93baa1205ecf183 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39115 Maintainer: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-02-04 00:07:10 +00:00
Bobby R. Bruce	a4345ff324	gpu-compute,misc: Remove unused private variable Clang 9 fails to compile GCN3 due to the unused private variable, `_nxtFreeIdx`, in `src/gpu-compute/dyn_pool_manager.hh`. This variable has therefore been removed. Change-Id: I33f2e9634bbf8d5cea7a42ae2ac9f3ea8298d406 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40397 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-02-03 19:08:02 +00:00
Bobby R. Bruce	ae33daa8d7	gpu-compute,misc: Fix Clang missing override errors Clang fails to compile GCN3 due to missing overrides in `src/gpu-compute/gpu_command_processor.hh`. This commit fixes this error. Change-Id: I6da9fce7c3eb86a5418a931ee4f225cceda488a5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40396 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-02-03 19:08:02 +00:00
Gabe Black	fc4caa6ad0	misc: Re-remove Authors lines from source files. These were universally removed a while ago, but a bunch have crept back in. Remove them. Change-Id: I3cb5b9f40c9c19aafb5e39a51d1baeae60a591c0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40335 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Gabe Black <gabe.black@gmail.com>	2021-02-03 12:55:17 +00:00
Sooraj Puthoor	965ad12b9a	dev-hsa: enable interruptible hsa signal support Event creation and management support from emulated drivers is required to support interruptible signals in HSA and this support was not available. This changeset adds the event creation and management support in the emulated driver. With this patch, each interruptible signal created by the HSA runtime is associated with a signal event. The HSA runtime can then put a thread waiting on a signal condition to sleep asking the driver to monitor the event associated with that signal. If the signal is modified by the GPU, the dispatcher notifies the driver about signal value change. If the modifier is a CPU thread, the thread will have to make HSA API calls to modify the signal and these API calls will notify the driver about signal value change. Once the driver is notified about a change in the signal value, the driver checks to see if any thread is sleeping on that signal and wakes up the sleeping thread associated with that event. The driver has also implemented the time_out wakeup that can wake up the thread after a certain time period has expired. This is also true for barrier packets. Each signal has an event address in a kernel managed and allocated event page that can be used as a mailbox pointer to notify an event. However, this feature used by non-CPU agents to communicate with the driver is not implemented by this changeset because the non-CPU HSA agents in our model can directly communicate with driver in our implementation. Having said that, adding that feature should be trivial because the event address and event pages are correctly setup by this changeset and just adding the event page's virtual address to our PIO doorbell interface in the page tables and registering that pio address to the driver should be sufficient. Managing mailbox pointer for an event is based on event ID and using this event ID as an index into event page, this changeset already provides a unique mailbox pointer for each event. Change-Id: Ic62794076ddd47526b1f952fdb4c1bad632bdd2e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38335 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-01-31 03:25:05 +00:00
Kyle Roarty	68fe6fa6cb	gpu-compute: Simplify LGKM decrementing for Flat instructions This commit makes it so LGKM count is decremented in a single place (after completeAcc), which fixes a couple of potential bugs 1. Data is only written by completeAcc, not after initiateAcc. LGKM count is supposed to be decremented after data is written. 2. LGKM count is now properly decremented for atomics without return Change-Id: Ic791af3b42e04f7baaa0ce50cb2a2c6286c54f5a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39396 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-01-21 21:41:05 +00:00
Matthew Poremba	5323cccfdd	arch-gcn3,gpu-compute: Update stats style for GPU Convert all gpu-compute stats to Stats::Group style. Change-Id: I29116f1de53ae379210c6cfb5bed3fc74f50cca5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39135 Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Matthew Poremba <matthew.poremba@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-01-18 17:58:05 +00:00
Kyle Roarty	a4657f1b84	gpu-compute: Fix LGKM decrementing for flat atomic insts A prior commit (`f6ec145fc0`) fixed early LGKM decrementing for flat loads and stores, but failed to address flat atomics. Per the GCN3 ISA, LGKM count is decremented on flat atomics with return when the data has been returned. This patch checks if the flat instruction is an atomic with return, and decrements LGKM count if so. Change-Id: I5c0c2c205a8b21327d4c42ba71c59842c15bd63b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39155 Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matthew Poremba <matthew.poremba@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-01-15 06:00:06 +00:00
gauravjain14	c29523665e	gpu-compute: Support for dynamic register alloc SimplePoolManager doesn't allow mapping of two WGs simultaneously on the same Compute Unit (provided the previous WG has been mapped to all the SIMDs) even if there is sufficient VRF and SRF space available. DynPoolManager takes care of that by dynamically allocating and deallocating register file space to wavefronts Change-Id: I2255c68d4b421615d7b231edc05d3ebb27cbd66c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32034 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>	2021-01-14 17:04:27 +00:00
Gabe Black	c6933a27da	misc: Fix missing includes. Change-Id: I545ff03041e8fe66dc489c6aa95c009e54df0970 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38995 Reviewed-by: Gabe Black <gabe.black@gmail.com> Maintainer: Gabe Black <gabe.black@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-01-13 08:55:59 +00:00
Kyle Roarty	f6ec145fc0	gpu-compute: Fix FLAT insts decrementing lgkm count early FLAT instructions used to decrement lgkm count on execute, while the GCN3 ISA specifies that lgkm count should be decremented on data being returned or data being written. This patch changes it so that lgkm is decremented after initiateAcc (for stores) and after completeAcc (for loads) to better reflect the ISA definition. This fixes a bug where waitcnts would be satisfied even though the memory access wasn't completed, which led to instructions using the wrong data. Change-Id: I596cb031af9cda8d47a1b5e146e4a4ffd793d36c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38696 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-01-07 17:12:31 +00:00
Kyle Roarty	e49a072c7d	gpu-compute: Use dict.get syntax for accessing buildEnv keys 37775 removed SmartDict, which is the type buildEnv used to be. Because of that change, doing buildEnv[key] with a key not in the dict raises KeyError instead of returning False. By using buildEnv.get(key, False), we are able to return False when the key isn't in the dict. Change-Id: I4aae29b95b082efb2b021f21d608f9cd1c196379 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38135 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matthew Poremba <matthew.poremba@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-12-01 19:19:52 +00:00
Kyle Roarty	0bb385941b	gpu-compute: Add exp_cnt tracking for buffer store instructions exp_cnt (expInstsIssued in the code) is used in the waitcnt instruction to track that data has been read out of VGPRs in previous global memory instructions, making it safe to overwrite the VGPRs used in said global memory instructions. Previously, exp_cnt wasn't being tracked at all, which led to the waitcnt finishing immediately, leading to the memory instruction's VGPRs getting overwritten by subsequent instructions, causing errors. This patch makes it so waitcnts waiting on exp_cnt will wait for MUBUF buffer store instructions to read their VGPRs before completing Change-Id: Idd2b59511bc086cf316217da27b7a228272b0b0f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/37555 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-11-30 20:59:31 +00:00
Daniel Gerzhoy	9a01d3e927	dev-hsa,gpu-compute: Agent Packet handler implemented. HSA packet processor will now accept and process agent packets. Type field in packet is command type. For now: AgentCmd::Nop = 0 AgentCmd::Steal = 1 Steal command steals the completion signal for a running kernel. This enables a benchmark to use hsa primitives to send an agent packet to steal the signal, then wait on that signal. Minimal working example to be added in gem5-resources. Change-Id: I37f8a4b7ea1780b471559aecbf4af1050353b0b1 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/37015 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-11-16 16:12:48 +00:00
Tuan Ta	173c1c6eb0	gpu-compute,mem-ruby: Replace ACQUIRE and RELEASE request flags This patch replaces ACQUIRE and RELEASE flags which are HSA-specific. ACQUIRE flag becomes INV_L1 in VIPER protocol. RELEASE flag is removed. Future protocols may support extra cache coherence flags like INV_L2 and WB_L2. Change-Id: I3d60c9d3625c898f4110a12d81742b6822728533 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32859 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-11-04 21:09:26 +00:00
Gabe Black	d05a0a4ea1	misc: Delete the now unnecessary create methods. Most create() methods are no longer necessary. This change deletes them, and occasionally moves some code from them into the constructors they call. Change-Id: Icbab29ba280144b892f9b12fac9e29a0839477e5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36536 Reviewed-by: Gabe Black <gabe.black@gmail.com> Maintainer: Gabe Black <gabe.black@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-10-30 04:00:20 +00:00
Gabe Black	3a49ed0156	gpu: Use X86ISA instead of TheISA in src/gpu-compute. These files are nominally not tied to the X86ISA, but in reality they are because they reach into the GPU TLB, which is defined unchangeably in the X86ISA namespaces, and uses data structures within it. Rather than try to pretend that these structures are generic, we'll instead just use X86ISA instead of TheISA. If this really does become generic in the future, a base class with the ISA agnostic essentials defined in it can be used instead, and the ISA specific TLBs can define their own derived class which has whatever else they need. Really the compute unit shouldn't be communicating with the TLB using sender state since those are supposed to be little notes for the sender to keep with a transaction, not for communicating between entities across a port. Change-Id: Ie6573396f6c77a9a02194f5f4595eefa45d6d66b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34174 Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu> Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2020-10-26 20:32:43 +00:00
Gabe Black	463cb28ca5	misc: Use compiler.hh macros when available. Some places were hand coding __attribute__s when macros in compiler.hh were available to do that job. Using the macros helps abstract away compiler specific details and should be used when possible. Change-Id: I94befebcfde2d673e874e9959588f69781bd9021 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35975 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-10-19 05:52:40 +00:00
Kyle Roarty	b20cc7e6d8	gpu-compute,mem-ruby: Properly create/handle WriteCompletePkts There is a flow of packets as so: WriteResp -> WriteReq -> WriteCompleteResp These packets share some variables, in particular senderState and a status vector. One issue was the WriteResp packet decremented the status vector, which was used by the WriteCompleteResp packets to determine when to handle the global memory response. This could lead to multiple WriteCompleteResp packets attempting to handle the global memory response. Because of that, the WriteCompleteResp packets needed to handle the status vector. This patch moves WriteCompleteResp packet handling back into ComputeUnit::DataPort::processMemRespEvent from ComputeUnit::DataPort::recvTimingResp. This helps remove some redundant code. This patch has the WriteResp packet return without doing any status vector handling, and without deleting the senderState, which had previously caused a segfault. Another issue was WriteCompleteResp packets weren't being issued for each active lane, as the coalesced request was being issued too early. In order to fix that, we have to ensure every active lane puts their request into their applicable coalesced request before issuing the coalesced request. Because of that change, we change the issuing of CoalescedRequests from GPUCoalescer::coalescePacket to GPUCoalescer::completeIssue. That change involves adding a new variable to store the CoalescedRequests that are created in the calls to coalescePacket. This variable is a map from instruction sequence number to coalesced requests. Additionally, the WriteCompleteResp packet was attempting to access physical memory in hitCallback while not having any data, which caused a crash. This can be resolved either by not allowing WriteCompleteResp packets to access memory, or by copying the data from the WriteReq packet. This patch denies WriteCompleteResp packets memory access in hitCallback. Finally, in VIPERCoalescer::writeCompleteCallback there was a map that held the WriteComplete packets, but no packets were ever being removed. This patch removes packets that match the address that was passed in to the function. Change-Id: I9a064a0def2bf6c513f5295596c56b1b652b0ca4 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33656 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-10-15 17:52:51 +00:00
Gabe Black	91d83cc8a1	misc: Standardize the way create() constructs SimObjects. The create() method on Params structs usually instantiate SimObjects using a constructor which takes the Params struct as a parameter somehow. There has been a lot of needless variation in how that was done, making it annoying to pass Params down to base classes. Some of the different forms were: const Params & Params & Params * const Params * Params const* This change goes through and fixes up every constructor and every create() method to use the const Params & form. We use a reference because the Params struct should never be null. We use const because neither the create method nor the consuming object should modify the record of the parameters as they came in from the config. That would make consuming them not idempotent, and make it impossible to tell what the actual simulation configuration was since it would change from any user visible form (config script, config.ini, dot pdf output). Change-Id: I77453cba52fdcfd5f4eec92dfb0bddb5a9945f31 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35938 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-10-14 12:06:44 +00:00
Matthew Poremba	53807c8276	configs,gpu-compute: Fixes to connect gmTokenPort When the TokenPort was moved from the GCN3 staging branch to develop the TokenPort was changed from being the port connecting the ComputeUnit to Ruby's vector memory port to a sideband port which inhibits requests to Ruby's vector memory port. As such, it needs to be explicitly connected as a new port. This changes the getPort method in ComputeUnit to be aware of the port as well as modifying the example config to connect to TCPs. The iteration to connect in the config file was modified since it was not properly connecting to TCPs each time and Ruby.py does not explicitly return a list of each MachineType. Change-Id: Ia70a6756b2af54d95e94d19bec5d8aadd3c2d5c0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35096 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-09-30 20:19:21 +00:00
Gabe Black	b877efa6d4	misc: Update attribute syntax, and reorganize compiler.hh. This change replaces the __attribute__ syntax with the now standard [[]] syntax. It also reorganizes compiler.hh so that all special macros have some explanatory text saying what they do, and each attribute which has a standard version can use that if available and what version of c++ it's standard in is put in a comment. Also, the requirements as far as where you put [[]] style attributes are a little more strict than the old school __attribute__ style. The use of the attribute macros was updated to fit these new, more strict requirements. Change-Id: Iace44306a534111f1c38b9856dc9e88cd9b49d2a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35219 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-09-28 21:52:59 +00:00
Gabe Black	50a0b85367	arm,base,gpu: Use std::make_unique instead of m5::make_unique. Now that we're using c++14, we can just assume that std::make_unique exists. We no longer have to conditionally inject our own version. Change-Id: I5d851afb02dd05c7af93864ffec3b3184f3d4ec8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35215 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-09-28 05:41:08 +00:00
Kyle Roarty	347d7644eb	gpu-compute: replace uint32_t* casts with bits API calls The uint32_t* casting was challenging to fully understand what was being done at a glance. Replaced with calls to various bits functions as it's functionally equivalent and much more clear. This also fixes a segfault in GPUInitAbi DPRINTFs from a mis-typed uint32_t* cast. Change-Id: Id5d1863942848dd7a9e5e17e8180c33adbc72f15 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34677 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-09-24 14:53:16 +00:00
Gabe Black	24e87cb1c5	gpu: Stop using TheISA in the GPU TLB. This class is defined inside the X86ISA namespace, so there's no point in pretending it's generic. Remove TheISA and let the code access what it needs from X86ISA naturally since it's there already. Change-Id: I21b5d2d2b9af6aa0c10ddbb5b3ddca1692188dcc Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34173 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2020-09-18 13:48:45 +00:00
Kyle Roarty	be3bcd1629	gpu-compute: Fix deadlock in fetch_unit after branch instruction The following deadlock was occurring in fetch_unit w/timingSim: 1. exec() is called, a wave is ready to fetch, so it sets pendingFetch 2. A packet is sent to ITLB to fetch for that wave 3. The wave executes a branch, causing the fetch buffer to be cleared 4. The packet is handled, and fetch() is called. However, because the fetch buffer was cleared, it returns doing nothing. 5. exec() gets called again, but the wave will never be scheduled to fetch, as pendingFetch is still set to true. This patch clears pendingFetch (and dropFetch) before returning in fetch() when the fetch buffer has been cleared. dropFetch needed to be cleared otherwise gem5 would crash. Change-Id: Iccbac7defc4849c19e8b17aa2492da641defb772 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34555 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-09-17 21:24:19 +00:00
Gabe Black	49a41da964	gpu: Fix a syntax error in X86GPUTLB.py. The recent changes which removed master/slave terminology also accidentally deleted an "=", making the syntax in that file illegal. Change-Id: I50aa945f0f66765db36775380b98a88caff23c13 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34576 Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu> Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2020-09-16 06:08:14 +00:00
Shivani Parekh	392c1ced53	misc: Replaced master/slave terminology Change-Id: I4df2557c71e38cc4e3a485b0e590e85eb45de8b6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33553 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2020-09-10 23:02:28 +00:00
Kyle Roarty	b00b986353	misc: Use VPtr in hsa_driver.cc This change updates HSADriver::allocateQueue to take in a ThreadContext pointer as opposed to a PortProxy ref. This allows the TypedBufferArg to be replaced with VPtr. This also fixes building GCN3_X86 Change-Id: I1fea26b10c7344daf54a0cb05337e961f834a5fd Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33655 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-08-31 17:44:11 +00:00
Gabe Black	1d755b4ba1	misc: Clean up usage of arch/isa_traits.hh. isa_traits.hh used to have much more in it, but now it only has PageShift, PageBytes, and (for now) the guest endianness. These values should only be retrieved from the System class generally speaking, so only the system class should include arch/isa_traits.hh. Some gpu compute related files need PageBytes or PageShift. Even though those files don't advertise their ISA dependence, they are tied to x86. In those files, they can include arch/x86/isa_traits.hh. The only other file which legitimately needs arch/isa_traits.hh is the decoder cache since it uses PageBytes to size an array. Change-Id: I12686368715623e3140a68a7027c136bd52567b1 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33203 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-08-28 07:20:58 +00:00
Tony Gutierrez	94000aefe6	gpu-compute: Create CU's ports in the standard way The CU would initialize its ports in getMasterPort(), which is not desirable as getMasterPort() may be called several times for the same port. This can lead to a fatal if the CU expects to only create a single port of a given type, and may lead to other issues where stat names are duplicated. This change instantiates and initializes the CU's ports in the CU constructor using the CU params. The index field is also removed from the CU's ports because the base class already has an ID field, which will be set to the default value in the base class's constructor for scalar ports. It doesn't make sense for scalar ports to take an index because they are scalar, so we let the base class initialize the ID to the invalid port ID. Change-Id: Id18386f5f53800a6447d968380676d8fd9bac9df Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32836 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-08-27 16:31:46 +00:00
Emily Brickey	6333e914d3	gpu-compute: update port terminology Change-Id: I3121c4afb1e137aebe09c1d694e9484844d02b9b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32313 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Poremba <chesp3@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-08-26 16:48:13 +00:00
Kyle Roarty	b872f02ab1	configs,gpu-compute,mem-ruby: connect gmTokenPorts in apu_se This patch adds gmTokenPorts to the ComputeUnit and RubyGPUCoalescer python classes so the gmTokenPorts can be connected in apu_se. Change-Id: Icf3cb05c757754d6935b46f14e4b1b1d5072c4ca Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32677 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-08-18 23:47:16 +00:00
Gabe Black	40e8cac306	misc: Make registerExitCallback use CallbackQueue2. Issue-on: https://gem5.atlassian.net/browse/GEM5-698 Change-Id: I526d4a19ca4e54a6469a4ee26693c1c0400fcc70 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32644 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-08-18 11:49:06 +00:00
Matthew Poremba	9b95f32b12	arch-gcn3,gpu-compute: Fix GCN3 related compiler errors Fix all errors that were revealed using the util/compiler-test.sh script. Change-Id: Ie0d35568624e5e1405143593f0677bbd0b066b61 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31154 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-20 14:53:13 +00:00
Tony Gutierrez	4d737462c2	gpu-compute, arch-gcn3: Change how waitcnts are implemented Use single counters per memory operation type and increment them upon issue, not execute. Change-Id: I6afc0b66b21882538ef90a14a57a3ab3cc7bd6f3 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29973 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-17 16:36:23 +00:00
Tony Gutierrez	63c76448eb	gpu-compute: Add pipeline stage interface classes This change separates the pipeline stage interfaces for the GPU's compute unit into their own classes with a well-defined interface. This helps to create a cleaner interface for users to extend the CU pipeline's capabilities and also helps consolidate all the pipeline communication code in one place in the source. Change-Id: I569d52bce84dc1b9fbf8f0f96d53a81a2b6773c6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29972 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-17 16:36:09 +00:00
Alexandru Dutu	7d50d5d972	gpu-compute: No RF scheduling in case of SKIP or EMPTY In case of flat memory instructions the status for the LM pipe execution unit is set to SKIP or EMPTY, as the bus between the VRF and the GM and LM pipe is shared. The destination operands should not be scheduled for the LM pipe, even if the wave is in the dispatch list. This can lead to deadlock in the destination cache as DCEs are reused and the slotsAvailableForBank count gets artificially incremented. Change-Id: I2230c53e3bc1032d2cccbe00fab62c99ab8de6cd Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29970 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-17 16:34:59 +00:00

151 Commits