derek/gem5 - gem5 - Gitea: Git with a cup of tea

derek/gem5

Author	SHA1	Message	Date
Matthew Poremba	5323cccfdd	arch-gcn3,gpu-compute: Update stats style for GPU Convert all gpu-compute stats to Stats::Group style. Change-Id: I29116f1de53ae379210c6cfb5bed3fc74f50cca5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39135 Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Matthew Poremba <matthew.poremba@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-01-18 17:58:05 +00:00
gauravjain14	c29523665e	gpu-compute: Support for dynamic register alloc SimplePoolManager doesn't allow mapping of two WGs simultaneously on the same Compute Unit (provided the previous WG has been mapped to all the SIMDs) even if there is sufficient VRF and SRF space available. DynPoolManager takes care of that by dynamically allocating and deallocating register file space to wavefronts Change-Id: I2255c68d4b421615d7b231edc05d3ebb27cbd66c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32034 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>	2021-01-14 17:04:27 +00:00
Gabe Black	463cb28ca5	misc: Use compiler.hh macros when available. Some places were hand coding __attribute__s when macros in compiler.hh were available to do that job. Using the macros helps abstract away compiler specific details and should be used when possible. Change-Id: I94befebcfde2d673e874e9959588f69781bd9021 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35975 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-10-19 05:52:40 +00:00
Gabe Black	91d83cc8a1	misc: Standardize the way create() constructs SimObjects. The create() method on Params structs usually instantiate SimObjects using a constructor which takes the Params struct as a parameter somehow. There has been a lot of needless variation in how that was done, making it annoying to pass Params down to base classes. Some of the different forms were: const Params & Params & Params * const Params * Params const* This change goes through and fixes up every constructor and every create() method to use the const Params & form. We use a reference because the Params struct should never be null. We use const because neither the create method nor the consuming object should modify the record of the parameters as they came in from the config. That would make consuming them not idempotent, and make it impossible to tell what the actual simulation configuration was since it would change from any user visible form (config script, config.ini, dot pdf output). Change-Id: I77453cba52fdcfd5f4eec92dfb0bddb5a9945f31 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35938 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-10-14 12:06:44 +00:00
Matthew Poremba	53807c8276	configs,gpu-compute: Fixes to connect gmTokenPort When the TokenPort was moved from the GCN3 staging branch to develop the TokenPort was changed from being the port connecting the ComputeUnit to Ruby's vector memory port to a sideband port which inhibits requests to Ruby's vector memory port. As such, it needs to be explicitly connected as a new port. This changes the getPort method in ComputeUnit to be aware of the port as well as modifying the example config to connect to TCPs. The iteration to connect in the config file was modified since it was not properly connecting to TCPs each time and Ruby.py does not explicitly return a list of each MachineType. Change-Id: Ia70a6756b2af54d95e94d19bec5d8aadd3c2d5c0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35096 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-09-30 20:19:21 +00:00
Shivani Parekh	392c1ced53	misc: Replaced master/slave terminology Change-Id: I4df2557c71e38cc4e3a485b0e590e85eb45de8b6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33553 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2020-09-10 23:02:28 +00:00
Tony Gutierrez	94000aefe6	gpu-compute: Create CU's ports in the standard way The CU would initialize its ports in getMasterPort(), which is not desirable as getMasterPort() may be called several times for the same port. This can lead to a fatal if the CU expects to only create a single port of a given type, and may lead to other issues where stat names are duplicated. This change instantiates and initializes the CU's ports in the CU constructor using the CU params. The index field is also removed from the CU's ports because the base class already has an ID field, which will be set to the default value in the base class's constructor for scalar ports. It doesn't make sense for scalar port's to take an index because they are scalar, so we let the base class initialize the ID to the invalid port ID. Change-Id: Id18386f5f53800a6447d968380676d8fd9bac9df Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32836 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-08-27 16:31:46 +00:00
Emily Brickey	6333e914d3	gpu-compute: update port terminology Change-Id: I3121c4afb1e137aebe09c1d694e9484844d02b9b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32313 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Poremba <chesp3@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-08-26 16:48:13 +00:00
Gabe Black	40e8cac306	misc: Make registerExitCallback use CallbackQueue2. Issue-on: https://gem5.atlassian.net/browse/GEM5-698 Change-Id: I526d4a19ca4e54a6469a4ee26693c1c0400fcc70 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32644 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-08-18 11:49:06 +00:00
Tony Gutierrez	63c76448eb	gpu-compute: Add pipeline stage interface classes This change separates the pipeline stage interfaces for the GPU's compute unit into their own classes with a well-defined interface. This helps to create a cleaner interface for users to extend the CU pipeline's capabilities and also helps consolidate all the pipeline communication code in one place in the source. Change-Id: I569d52bce84dc1b9fbf8f0f96d53a81a2b6773c6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29972 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-17 16:36:09 +00:00
Tony Gutierrez	af621cd6e6	gpu-compute, arch-gcn3: refactor barriers Barriers were not modeled properly. Firstly, barriers were allocated to each WG that was launched, which is not correct, and the CU would provide an infinite number of barrier slots. There are a limited number of barrier slots per CU in reality. In addition, the CU will not allocate barrier slots to WGs with a single WF (nothing to sync if only one WF). Beyond modeling problems, there also the issue of deadlock. The barrier could deadlock because not all WFs are freed from the barrier once it has been satisfied. Instead, we relied on the scoreboard stage to release them lazily, one-by-one. Under this implementation the scoreboard may not fully release all WFs participating in a barrier; this happens because the first WF to be freed from the barrier could reach an s_barrier instruction again, forever causing the barrier counts across WFs to be out-of-sync. This change refactors the barrier logic to: 1) Create a proper barrier slot implementation 2) Enforce (via a parameter) the number of barrier slots on the CU. 3) Simplify the logic and cleanup the code (i.e., we no longer iterate through the entire WF list each time we check if a barrier is satisfied). 4) Fix deadlock issues. Change-Id: If53955b54931886baaae322640a7b9da7a1595e0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29943 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-16 20:37:22 +00:00
Tony Gutierrez	bbab876c32	gpu-compute: Make headTailMap a std::unordered_map There is no reason that the headTailMap needs to be sorted, so let's use a std::unordered_map. Change-Id: I18641b893352c18ec86e3775c8947a05a6c6547d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29930 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-06-22 16:14:35 +00:00
Tony Gutierrez	b8da9abba7	gpu-compute, mem-ruby, configs: Add GCN3 ISA support to GPU model Change-Id: Ibe46970f3ba25d62ca2ade5cbc2054ad746b2254 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29912 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-06-15 22:45:17 +00:00
Matthew Poremba	3d57eaf9f5	gpu-compute,mem-ruby: Refactor GPU coalescer Remove the read/write tables and coalescing table and introduce a two levels of tables for uncoalesced and coalesced packets. Tokens are granted to GPU instructions to place in uncoalesced table. If tokens are available, the operation always succeeds such that the 'Aliased' status is never returned. Coalesced accesses are placed in the coalesced table while requests are outstanding. Requests to the same address are added as targets to the table similar to how MSHRs operate. Change-Id: I44983610307b638a97472db3576d0a30df2de600 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27429 Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Bradford Beckmann <brad.beckmann@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-05-11 21:25:19 +00:00
Gabe Black	71a868224c	gpu-compute: Delete authors lists from gpu-compute files. Change-Id: I72318eb885f9517de325ea9a9af263f36613bf6e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25414 Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>	2020-02-17 10:05:52 +00:00
Jing Qu	d379a08b27	gpu-compute: Fix overriden errors When building Gem5 with GPU protocols, overriden errors were thrown from files in gpu-compute. After adding override to the files, the errors were resolved and Gem5 builds successfully. Change-Id: Iab3a0768caf82c226e8bbee5690a834bf92d1e03 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20939 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>	2019-09-20 22:42:31 +00:00
Gabe Black	cdcc55a6a8	mem: Minimize the use of MemObject. MemObject doesn't provide anything beyond its base ClockedObject any more, so this change removes it from most inheritance hierarchies. Occasionally MemObject is replaced with SimObject when I was fairly confident that the extra functionality of ClockedObject wasn't needed. Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Gabe Black <gabeblack@google.com>	2019-04-28 01:19:40 +00:00
Gabe Black	d3d24835bc	arch, cpu, dev, gpu, mem, sim, python: start using getPort. Replace the getMasterPort, getSlavePort, and getEthPort functions with getPort, and remove extraneous mechanisms that are no longer necessary. Change-Id: Iab7e3c02d2f3a0cf33e7e824e18c28646b5bc318 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17040 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>	2019-03-19 10:22:50 +00:00
Tony Gutierrez	abb21ba99f	style: fix amd license and style issues Change-Id: I26136fb49f743c4a597f8021cfd27f78897267b5 Reviewed-on: https://gem5-review.googlesource.com/10463 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>	2018-05-16 15:32:01 +00:00
Sean Wilson	741261f10b	gpu-compute: Refactor some Event subclasses to lambdas Change-Id: Ic1332b8e8ba0afacbe591c80f4d06afbf5f04bd9 Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3922 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>	2017-07-12 20:07:05 +00:00
Tony Gutierrez	aa7364276f	gpu-compute: use System cache line size in the GPU	2016-10-26 22:47:47 -04:00
Tony Gutierrez	98d8a7051d	gpu-compute: add instruction mix stats for the gpu	2016-10-26 22:47:30 -04:00
Tony Gutierrez	7ac38849ab	gpu-compute: remove inst enums and use bit flag for attributes this patch removes the GPUStaticInst enums that were defined in GPU.py. instead, a simple set of attribute flags that can be set in the base instruction class are used. this will help unify the attributes of HSAIL and machine ISA instructions within the model itself. because the static instrution now carries the attributes, a GPUDynInst must carry a pointer to a valid GPUStaticInst so a new static kernel launch instruction is added, which carries the attributes needed to perform a the kernel launch.	2016-10-26 22:47:11 -04:00
Alexandru Dutu	c8cf71f1a0	gpu-compute: Added method to compute the actual workgroup size This patch adds a method to the Wavefront class to compute the actual workgroup size. This can be different from the maximum workgroup size specified when launching the kernel through the NDRange object. Current solution is still not optimal, as we are computing these for each wavefront and the dispatcher also needs to have this information and can't actually call Wavefront::computeActuallWgSz before the wavefronts are being created. A long term solution would be to have a Workgroup class that deals with all these details.	2016-10-04 13:03:52 -04:00
Alexandru Dutu	e9fe1b838b	gpu-compute: Remove WFContext WFContext struct is currently unused and it has been rendered not useful in saving and restoring the context of a Wavefront. Wavefront class should be sufficient for that purpose and the runtime can figure out the memory size it will need to allocate for a Wavefront through an IOCTL.	2016-09-16 12:26:03 -04:00
jkalamat	3724fb15fa	gpu-compute: parametrize Wavefront size Eliminate the VSZ constant that defined the Wavefront size (in numbers of work items); replaced it with a parameter in the GPU.py configuration script. Changed all data structures dependent on the Wavefront size to be dynamically sized. Legal values of Wavefront size are 16, 32, 64 for now and checked at initialization time.	2016-06-09 11:24:55 -04:00
Tony Gutierrez	1a7d3f9fcb	gpu-compute: AMD's baseline GPU model	2016-01-19 14:28:22 -05:00

27 Commits