AMD GCN3 and Vega GPUs assume a maximum of 16 WGs per CU. Any WG with
more than 1 WF requires a hardware barrier so the WFs in the WG can
synchronize locally. However, the default gem5 GPU configuration
currently assumes only 4 barriers per CU, which artificially prevents
applications that could run more than 4 WGs per CU simultaneously from
doing so.
This change resolves the issue by updating the default number of hardware
barriers per CU to 16, mimicking the support described in slide 39 here:
https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf
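A minimal config sketch of the new default, assuming the ComputeUnit
parameter is named num_barrier_slots (an assumption for illustration):
```python
from m5.objects import ComputeUnit

# Hypothetical parameter name; the real name may differ.
cu = ComputeUnit()
cu.num_barrier_slots = 16  # one barrier slot per potential WG, 16 WG/CU max
```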
Change-Id: Ib7636a13359d998e676c1790f436a83ce88cbfc0
This change adds a new file, citations.bib, to m5out.
This file contains citations to the papers that describe the aspects of
the gem5 simulator the simulation uses. In other words, each simulation
configuration could generate a different bib file referencing different
works.
Each SimObject can now have a set of citations associated with it. After
the system is built (in `instantiate`), the citations.bib file is
created by parsing all SimObjects that have been instantiated and taking
the union of their associated citations.
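A minimal sketch of the collection step, assuming a hypothetical
`citations` attribute on each SimObject (the real attribute and API may
differ):
```python
# Walk every instantiated SimObject and write the union of citations.
def write_citations(root, outdir="m5out"):
    bib_entries = set()
    for obj in root.descendants():  # all SimObjects under the root
        # 'citations' is a hypothetical per-object attribute.
        bib_entries |= set(getattr(obj, "citations", []))
    with open(outdir + "/citations.bib", "w") as f:
        f.write("\n\n".join(sorted(bib_entries)))
```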
This commit is not meant to add all citations, but to act as an example
for others to add more citations to gem5.
Change-Id: Icd5c46fd9ee44adbeec1fea162657f5716f7e5ef
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
The unconditional exit event when a kernel completes, added in
c644eae2dd, causes scripts that do not ignore unknown exit events to end
simulation prematurely. One such script is the apu_se.py script used in
SE-mode GPU simulation. Make this exit conditional on the parameter
being set to a valid value to avoid this problem.
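For reference, a hedged sketch of a run loop that tolerates unknown exit
events, the kind of handling apu_se.py does not have; the set of expected
cause strings is illustrative:
```python
import m5

exit_event = m5.simulate()
# Resume on informational events (e.g., kernel-completion notices) and
# stop only on causes the script actually expects.
while exit_event.getCause() not in ("m5_exit instruction encountered",
                                    "simulate() limit reached"):
    exit_event = m5.simulate()
```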
Change-Id: I1d2c082291fdbcf27390913ffdffb963ec8080dd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72098
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
An example case:
```python
mem_side_port = RequestPort(
"This port sends requests and " "receives responses"
)
```
This is the residue of running the Python formatter.
The cleanup is done by finding all tokens matching the regex `"\s"(?![.;"])`
and manually replacing them with empty strings.
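After the replacement, the adjacent string literals collapse into one:
```python
mem_side_port = RequestPort(
    "This port sends requests and receives responses"
)
```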
Change-Id: Icf223bbe889e5fa5749a81ef77aa6e721f38b549
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/66111
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Previously the scalar cache path used the same latency parameters as the
vector cache path for memory requests. This commit adds new parameters
for the scalar cache path latencies and modifies the model to use them
to set the memory request latency in the scalar cache. The new
parameters are '--scalar-mem-req-latency' and '--scalar-mem-resp-latency',
with default values of 50 and 0 respectively.
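A hedged usage sketch; the underlying SimObject parameter names are
assumed to mirror the command-line options:
```python
from m5.objects import ComputeUnit

cu = ComputeUnit()
cu.scalar_mem_req_latency = 50   # matches --scalar-mem-req-latency default
cu.scalar_mem_resp_latency = 0   # matches --scalar-mem-resp-latency default
```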
Change-Id: I7483f780f2fc0cfbc320ed1fd0c2ee3e2dfc7af2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65511
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
The amdgpu driver supports fetching instructions from pages which reside
in system memory rather than device memory. This changeset adds support
for this by connecting the system hub object, added in a prior
changeset, to the fetch unit and issuing requests to the system hub when
the system bit in the memory page's PTE is set. Otherwise, the requestor
ID is set to device memory and the request is routed through the Ruby
network / GPU caches to fetch the instructions.
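In pseudocode, the routing decision looks roughly like the following
sketch; all names are illustrative, not the actual FetchUnit code:
```python
DEVICE_MEM_ID = 1  # placeholder requestor ID for device memory

def route_fetch(pte, req, system_hub, ruby_port):
    if pte.system:                   # system bit set: page is in host DRAM
        system_hub.send(req)         # bypass the GPU caches
    else:
        req.requestor_id = DEVICE_MEM_ID
        ruby_port.send(req)          # fetch through Ruby / GPU caches
```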
Change-Id: Ib2fb47c589fdd5e544ab6493d7dbd8f2d9d7b0e8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/57652
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Make the necessary changes to connect Vega pagetable walkers for
full-system mode. Previously the CP and HSA packet processor could only
read AQL packets from system/host memory using a proxy port. This change
allows AQL packets to be read from device memory, which is used for
non-blit kernels.
Change-Id: If28eb8be68173da03e15084765e77e92eda178e9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53077
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Remove the line "For use for simulation and test purposes only" in files
where AMD is the only copyright holder listed in the header. This happens
to be the case for all files where this line exists, removing it
completely from gem5.
Change-Id: I623f266b002f564301b28774f49081099cfc60fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/53943
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch adds support for a gfx902 Vega APU, ripping the
appropriate values for device_id from the ROCm Thunk
(src/topology.c).
Note: gfx902 isn't officially supported by ROCm. This
means that it may not work for all programs. In particular,
rocBLAS is incompatible with gfx902, so anything that uses
rocBLAS won't be able to run with gfx902.
Change-Id: I48893e7cc9c7e52275fdfd22314f371a9db8e90a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47530
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Apply the gem5 namespace to the codebase.
Some anonymous namespaces could theoretically be removed,
but since this change's main goal was to keep conflicts
to a minimum, it was decided not to significantly modify
the general shape of the files.
A few missing comments of the form "// namespace X" that
occurred before the newly added "} // namespace gem5"
have been added for consistency.
The std output operators should not be included in the gem5
namespace, so they were not.
ProtoMessage has not been included in the gem5 namespace,
since I'm not familiar with how proto works.
Regarding the SystemC files, although they belong to gem5,
they actually perform integration between gem5 and SystemC;
therefore, they deserved their own separate namespace.
Files that are automatically generated have been included
in the gem5 namespace.
The .isa files are currently limited to a single namespace.
This limitation should later be removed to make it easier
to accommodate a better API.
Regarding the files in util, gem5:: was prepended where
suitable. Notice that this patch was tested as much as
possible, given that most of these files were already
failing to compile.
Change-Id: Ia53d404ec79c46edaa98f654e23bc3b0e179fe2d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
GPU MTYPE is currently set using a global config passed to the
PACoalescer. This patch enables MTYPE to be set by the shader on a
per-request basis. In real hardware, the MTYPE is extracted from a
GPUVM PTE during address translation. However, our current simulator
only models x86 page tables, which do not have the appropriate bits for
GPU MTYPEs. Rather than hacking non-x86 bits into our x86 page table
models, this patch instead keeps an interval tree, in the driver itself,
of all pages that request custom MTYPEs. This is currently only used to
map host pages to the GPU as uncacheable, but it is easily extensible to
other MTYPEs.
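A minimal stand-in for the driver-side bookkeeping, using a sorted list
rather than a true interval tree; names and structure are illustrative:
```python
import bisect

class MtypeRanges:
    """Maps non-overlapping page ranges to a requested MTYPE."""
    def __init__(self):
        self.starts, self.ends, self.mtypes = [], [], []

    def set_mtype(self, start, end, mtype):
        i = bisect.bisect(self.starts, start)
        self.starts.insert(i, start)
        self.ends.insert(i, end)
        self.mtypes.insert(i, mtype)

    def lookup(self, addr):
        i = bisect.bisect(self.starts, addr) - 1
        if i >= 0 and addr < self.ends[i]:
            return self.mtypes[i]   # e.g., uncacheable for host pages
        return None                 # fall back to the default MTYPE
```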
Change-Id: I7daab0ffae42084b9131a67c85cd0aa4bbbfc8d6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42216
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
New topology ripped from Fiji to support dGPU. A dGPU flag is added to
the config and propagated to the driver. The emulated driver is now able
to properly handle dGPU ioctls and mmaps. For now, dGPU physical memory
is allocated from the host, but this is easy to change once we get a GPU
memory controller up and running.
Change-Id: I594418482b12ec8fb2e4018d8d0371d56f4f51c8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42214
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
SimplePoolManager doesn't allow two WGs to be mapped
simultaneously on the same Compute Unit (once the
previous WG has been mapped to all the SIMDs),
even if there is sufficient VRF and SRF space
available.
DynPoolManager takes care of that by dynamically
allocating and deallocating register file space
to wavefronts.
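A minimal sketch of the dynamic scheme, modeled as simple free counters;
this is illustrative, not the actual DynPoolManager code:
```python
class DynRegisterPool:
    """Tracks free VRF/SRF space so WGs can map whenever registers fit."""
    def __init__(self, vrf_size, srf_size):
        self.free_vrf, self.free_srf = vrf_size, srf_size

    def can_allocate(self, vregs, sregs):
        return vregs <= self.free_vrf and sregs <= self.free_srf

    def allocate(self, vregs, sregs):
        assert self.can_allocate(vregs, sregs)
        self.free_vrf -= vregs
        self.free_srf -= sregs

    def deallocate(self, vregs, sregs):  # called when a wavefront retires
        self.free_vrf += vregs
        self.free_srf += sregs
```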
Change-Id: I2255c68d4b421615d7b231edc05d3ebb27cbd66c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32034
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>
Barriers were not modeled properly. First, barriers were
allocated to each WG that was launched, which is not
correct: the CU would provide an infinite number of
barrier slots, whereas in reality there are a limited
number of barrier slots per CU. In addition, the CU will
not allocate barrier slots to WGs with a single WF (there
is nothing to synchronize with only one WF).
Beyond the modeling problems, there was also the issue of
deadlock. The barrier could deadlock because not all WFs were
freed from the barrier once it had been satisfied. Instead, we
relied on the scoreboard stage to release them lazily,
one by one.
Under that implementation the scoreboard may not fully release
all WFs participating in a barrier: the first WF freed from the
barrier could reach an s_barrier instruction again, leaving the
barrier counts across WFs permanently out of sync.
This change refactors the barrier logic to:
1) Create a proper barrier slot implementation.
2) Enforce (via a parameter) the number of barrier
slots on the CU.
3) Simplify the logic and clean up the code (i.e., we
no longer iterate through the entire WF list each
time we check whether a barrier is satisfied).
4) Fix the deadlock issues.
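A hedged sketch of the slot semantics: a slot counts arrivals and
releases all participating WFs at once, and slots are a finite,
parameterized resource. Illustrative only, not the actual CU code:
```python
class BarrierSlot:
    def __init__(self, num_wfs):
        self.num_wfs = num_wfs  # WFs in the WG sharing this barrier
        self.arrived = 0

    def arrive(self):
        """Returns True when satisfied; all waiting WFs release at once."""
        self.arrived += 1
        if self.arrived == self.num_wfs:
            self.arrived = 0    # reset so the WG can reuse the slot
            return True
        return False

class BarrierPool:
    def __init__(self, num_slots):  # enforced via a CU parameter
        self.free = list(range(num_slots))

    def allocate(self, num_wfs):
        # WGs with a single WF never consume a slot.
        if num_wfs <= 1 or not self.free:
            return None
        return self.free.pop(), BarrierSlot(num_wfs)
```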
Change-Id: If53955b54931886baaae322640a7b9da7a1595e0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29943
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Kernel-end release was turned on for the VIPER protocol, which
is in fact write-through based and thus has no need for a
release operation. This changeset splits the option
'impl_kern_boundary_sync' into 'impl_kern_launch_acq'
and 'impl_kern_end_rel', and turns off release on VIPER.
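A hedged config sketch, assuming the Shader parameters mirror the new
option names:
```python
from m5.objects import Shader

shader = Shader()
shader.impl_kern_launch_acq = True  # acquire (invalidate) at kernel launch
shader.impl_kern_end_rel = False    # no release: VIPER is write-through
```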
Change-Id: I5490019b6765a25bd801cc78fb7445b90eb02a3d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29917
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Xianwei Zhang <xianwei.zhang@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Remove the read/write tables and the coalescing table and introduce two
levels of tables for uncoalesced and coalesced packets. Tokens are
granted to GPU instructions to place requests in the uncoalesced table.
If tokens are available, the operation always succeeds, so the 'Aliased'
status is never returned. Coalesced accesses are placed in the
coalesced table while requests are outstanding. Requests to the same
address are added as targets to the table, similar to how MSHRs
operate.
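A minimal sketch of the two-level scheme; names are illustrative, not the
actual coalescer code:
```python
from collections import defaultdict

class TwoLevelCoalescer:
    def __init__(self, tokens):
        self.tokens = tokens                 # admission tokens
        self.uncoalesced = []                # per-instruction requests
        self.coalesced = defaultdict(list)   # addr -> outstanding targets

    def issue(self, inst, addr):
        # A token is granted before issue, so insertion always succeeds
        # and 'Aliased' never needs to be returned.
        assert self.tokens > 0
        self.tokens -= 1
        self.uncoalesced.append((inst, addr))

    def coalesce(self):
        # Same-address requests become targets of one entry, like an MSHR.
        for inst, addr in self.uncoalesced:
            self.coalesced[addr].append(inst)
        self.uncoalesced.clear()
```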
Change-Id: I44983610307b638a97472db3576d0a30df2de600
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27429
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Bradford Beckmann <brad.beckmann@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
MemObject doesn't provide anything beyond its base ClockedObject any
more, so this change removes it from most inheritance hierarchies.
Occasionally MemObject was replaced with SimObject where I was fairly
confident that the extra functionality of ClockedObject wasn't needed.
Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
The importer in Python 3 doesn't like the way we import SimObjects
from the global namespace. Convert the existing SimObject declarations
to import from m5.objects. As a side-effect, this makes these files
consistent with configuration files.
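For example, the conversion looks like this (the class name is
illustrative):
```python
# Before: implicit import from the global namespace, which the Python 3
# importer rejects.
# from ClockedObject import ClockedObject

# After: explicit import, consistent with configuration files.
from m5.objects import ClockedObject
```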
Change-Id: I11153502b430822130722839e1fa767b82a027aa
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15981
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This patch adds an ordered response buffer to the GM pipeline
to ensure in-order data delivery. The buffer is implemented as
an STL ordered map, which sorts the requests in program order
using their sequence IDs. When requests return to the GM pipeline
they are marked as done. Only the oldest request may be serviced
from the ordered buffer, and only if it is marked as done.
The FIFO response buffers are kept and used in OoO delivery mode.
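A minimal sketch of the ordered buffer, using a dict keyed by sequence ID
as a stand-in for the ordered map; illustrative only:
```python
class OrderedResponseBuffer:
    def __init__(self):
        self.entries = {}                  # seq_id -> [request, done?]

    def insert(self, seq_id, req):
        self.entries[seq_id] = [req, False]

    def mark_done(self, seq_id):
        self.entries[seq_id][1] = True

    def pop_oldest_if_done(self):
        if not self.entries:
            return None
        oldest = min(self.entries)         # program order via sequence ID
        if self.entries[oldest][1]:        # only the oldest, only if done
            return self.entries.pop(oldest)[0]
        return None
```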
This patch removes the GPUStaticInst enums that were defined in GPU.py.
Instead, a simple set of attribute flags that can be set in the base
instruction class is used. This will help unify the attributes of HSAIL
and machine ISA instructions within the model itself.
Because the static instruction now carries the attributes, a GPUDynInst
must carry a pointer to a valid GPUStaticInst, so a new static kernel
launch instruction is added, which carries the attributes needed to
perform the kernel launch.
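A hedged sketch of flag-based attributes on a base instruction class;
class and flag names are illustrative:
```python
class GPUStaticInstSketch:
    def __init__(self):
        self.flags = set()

    def set_flag(self, name):
        self.flags.add(name)

    def is_flag_set(self, name):
        return name in self.flags

# The new static kernel-launch instruction carries its own attributes.
launch = GPUStaticInstSketch()
launch.set_flag("KernelLaunch")
```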
Eliminate the VSZ constant that defined the wavefront size (in number of
work items) and replace it with a parameter in the GPU.py configuration
script. All data structures dependent on the wavefront size are changed
to be dynamically sized. Legal values of the wavefront size are 16, 32,
and 64 for now, and are checked at initialization time.
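A hedged sketch of the parameterized sizing and the legality check:
```python
LEGAL_WF_SIZES = (16, 32, 64)

def make_wavefront(wf_size=64):
    # Checked at initialization time, as described above.
    assert wf_size in LEGAL_WF_SIZES, "wavefront size must be 16, 32, or 64"
    # Data structures are sized dynamically from the parameter.
    return {"exec_mask": [True] * wf_size, "lanes": [0] * wf_size}
```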