These were never used with conditions, so the condition check just added
overhead. Also, the not-taken path through the instruction didn't
actually set the destination to something, meaning that it would set it
to something arbitrary and not actually leave it unmodified.
Change-Id: I33fef088979b14ad74adf22b26419a1cacf386dd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45305
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
This patch reverts part of the changes made by the removal of
the Stage2MMU class [1]:
Prior to that patch the stage1 and stage2 walkers were sharing
the same port (which was instantiated in the Stage2MMU).
By removing the Stage2MMU we provided every table walker a
unique port.
With this patch we are reintroducing port sharing to temporarily fix
existing platforms using walker caches.
(The long term design goal will be to have a unique page table walker)
Those complain if we try to connect a single ported cache to 2 table
walker ports (stage1 and stage2)
[1]: https://gem5-review.googlesource.com/c/public/gem5/+/45780
Change-Id: Ib68ef97f1e9772a698771269c9a4ec4514f5d4d7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48200
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Currently, compiling CheckerCPU uses the dyn_inst.hh header from
O3CPU. However, including this header is not required and it
causes gem5 failed to build when O3CPU is not part of CPU_MODELS.
This change also involves moving the the dependency on
src/cpu/o3/dyn_inst.hh to src/cpu/o3/cpu.cc and src/cpu/lsq_unit.cc,
which previously includes src/cpu/o3/dyn_inst.hh implicitly through
src/cpu/checker/cpu.hh.
JIRA: https://gem5.atlassian.net/browse/GEM5-1025
Change-Id: I7664cd4b9591bf0a4635338fff576cb5f5cbfa10
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48079
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Certain DS insts are classfied as Loads, but don't actually go through
the memory pipeline. However, any instruction classified as a load
marks its destination registers as free in the memory pipeline.
Because these instructions didn't use the memory pipeline, they
never freed their destination registers, which led to a deadlock.
This patch explicitly calls the function used to free the destination
registers in the execute() method of those Load instructions that
don't use the memory pipeline.
Change-Id: Ic2ac2e232c8fbad63d0c62c1862f2bdaeaba4edf
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/48019
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Certain instructions don't have implementations in instructions.cc,
and get decoded as a nullptr.
This adds a fatal when decoding a missing instruction, as we aren't
able to properly run a program if all its instructions aren't
implemented, and it allows us to figure out which instruction is
missing due to fatals printing the line they were called.
Change-Id: I7e3690f079b790dceee102063773d5fbbc8619f1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47522
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
There are cases where we need a random number generator engine. The
Random class has such an engine but its interface currently only
allows for generating random numbers. To make sure we can reuse the
same random number generator in as many places as possible this patch
makes the engine in the Random class public.
Change-Id: I80153dd39f5b0d12537e4c0cf54773e7725b2a94
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47859
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch adds support for a gfx902 Vega APU, ripping the
appropriate values for device_id from the ROCm Thunk
(src/topology.c).
Note: gfx902 isn't officially supported by ROCm. This
means that it may not work for all programs. In particular,
rocBLAS is incompatible with gfx902, so anything that uses
rocBLAS won't be able to run with gfx902.
Change-Id: I48893e7cc9c7e52275fdfd22314f371a9db8e90a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47530
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Remove the duplicate dmaVirt calls from HSA packet processor and GPU
command processor and move them into their own class. This removes some
duplicate code and allows a DmaVirtDevice to be created which will be
useful for upcoming full system GPU commits.
The DmaVirtDevice is an abstraction of the base DmaDevice but iterates
using ChunkGenerator over virtual addresses. Classes which inherit from
DmaVirtDevice must provide a translation function to translate from
virtual address to physical address. Once translated, the physical
address is passed to DmaDevice to do the work.
Change-Id: Idd59ccb4d9ba21c0b1150ee328ededf5a88d824e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47179
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add an accuracy and coverage stat for the prefetchers.
Accuracy is defined as the ratio of the number of prefetch
request that have been counted as useful over the number
of prefetch request issued.
Accuracy tells whether the prefetcher is producing useful
requests or not.
Coverage is defined as the ratio of of the number of prefetch
request that have been counted as useful over the number of
demand misses if there was no prefetch, which is counted as
the number of useful prefetch request plus the remaining
demand misses. Due to the way stats are defined in the cache,
I have to add a stat to count the number of remaining demand
misses directly in the prefetcher stat. Demand is defined
as being one of this request type: ReadReq, WriteReq,
WriteLineReq, ReadExReq, ReadCleanReq, ReadSharedReq.
Coverage tells what part of misses are covered by the prefetcher.
Change-Id: I3bb8838f87b42665fdd782889f6ba56ca2a802fc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47603
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
Count how many time a prefetch is useful, meaning
a hit has happened on a prefetched cache block.
Another stat (pfUsefulButMiss) has been added to count
the special case where there is a hit on prefetched block
but it is counted as a miss because the block is not in
the requested coherency state.
Change-Id: I253216b9ac96d5f21139b710c489d6eb3fce7136
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47602
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
The apertures for non-gfx801 GPUs are set differently.
If the apertures aren't set properly, ROCm will error out.
This change sets the apertures appropriately based on the
gfx version of the simulated GPU. It also adds in new
functions to set the scratch and lds apertures in GFX9 to mimic
the linux kernel.
Change-Id: I1fa6f60bc20c7b6eb3896057841d96846460a9f8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/47529
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add a unit test for stats/info.
One test has been disabled due to not knowing the
expected behavior.
It is important to notice that Stats::Info can have
duplicate names using the new style. Stats::Group is
responsible for not allowing duplicate names in this
case.
Change-Id: I8b169d34c1309b37ba79fa9cf6895547b7e97fc0
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/43009
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
When the Vega ISA got committed, it lacked the request counter
tracking for memory requests that existed in the GCN3 code.
Instead of copying over the same lines from the GCN3 code to the Vega
code, this commit makes the various memory pipelines handle updating the
request counter information instead, as every memory instruction calls a
memory pipeline.
This commit also adds an issueRequest in scalar_memory_pipeline, as
previously, the gpuDynInsts were explicitly placed in the queue of
issuedRequests.
Change-Id: I5140d3b2f12be582f2ae9ff7c433167aeec5b68e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45347
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
vector_register_file uses the exec_mask of a memory instruction in
order to determine if it should mark a register as in-use or not.
Previously, the exec_mask of memory instructions was only set on
execution of that instruction, which occurs after the code in
vector_register_file. This led to the code reading potentially garbage
data, leading to a scenario where a register would be marked used when
it shouldn't be.
This fix sets the exec_mask of memory instructions in schedule_stage,
which works because the only time the wavefront execMask() is updated is
on a instruction executing, and we know the previous instruction will
have executed by the time schedule_stage executes, due to the order the
pipeline is executed in.
This also undoes part of a patch from last year (62ec973) which treated
the symptom of accidental register allocation, without preventing the
registers from being allocated in the first place.
This patch also removes now redundant code that sets the exec_mask in
instructions.cc for memory instructions
Change-Id: Idabd35020000764fb06133ac2458606c1aaf6f04
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/45346
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>