This change sets gpuDynInst->exec_mask for permute and bpermute
instructions, fixing a bug where they would never write their data.
permute and bpermute instructions are load instructions that write
to a VGPR. Because of that, they use gpuDynInst->exec_mask when
checking what lanes should write to the VGPR.
gpuDynInst->exec_mask gets set to wf->execMask() as that is what other
load instructions that write to VGPRs do.
Change-Id: Ie443283488cbd2ab9c17fc255e7cc44418353419
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35036
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
System calls should now be requested from the workload directly and not
routed through ExecContext or ThreadContext interfaces. That removes a
major special case for SE mode from those interfaces.
For now, when the SE workload gets a request for a system call, it
dispatches it to the appropriate Process object. In the future, the
ISA specific Workload subclasses will be responsible for handling system
calls and not the Process classes.
For simplicity, the Workload syscall() method is defined in the base
class but will panic everywhere except when SEWorkload overrides it. In
the future, this mechanism will turn into a way to request generic
services from the workload which are not necessarily system calls. For
instance, it could be a way to request handling of a page fault without
having to have another PseudoInst just for that purpose.
Change-Id: I18d36d64c54adf4f4f17a62e7e006ff2fc0b22f1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33282
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
A comment at the top of StaticInstFlags.py says that if IsMemRef is set,
exactly one of IsStore or IsLoad will be set. That's not strictly true
since IsAtomic may be set as well, in which case neither IsStore or
IsLoad will be set (in one example I found).
The isMemRef accessor still exists, and now just ors the IsStore,
IsLoad, and IsAtomic flags.
Change-Id: Ic5ff104da68978273977a6eff2abab5dd0ae7fda
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33744
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
There were three different StaticInst flags for memory barriers,
IsMemBarrier, IsReadBarrier, and IsWriteBarrier. IsReadBarrier was never
used, and IsMemBarrier was for both loads and stores, so a composite of
IsReadBarrier and IsWriteBarrier.
This change gets rid of IsMemBarrier and replaces by setting
IsReadBarrier and IsWriteBarrier at the same time. An isMemBarrier
accessor is left, but is now implemented by checking if both of the
other flags are set, and renamed to isFullMemBarrier to make it clear
that it's checking both for both types of barrier, not one or the other.
Change-Id: I702633a047f4777be4b180b42d62438ca69f52ea
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33743
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These were being initialized with BigRegVect brv = {0}, which made the
compiler complain because there is internal structure. The first element
of the union is actually an array, and this was telling it to initialize
that array to scalar 0. It was warning about this which was breaking the
build.
Instead, use zero initlization like BigRegVect brv = {}. This
initializes the first element of the union to all zeroes, with all
padding bits initialized to zero as well.
This satisfies the compiler and avoids a build error.
Change-Id: I31e7a8730c538637ff2e0c7fb00a4e12ed05e074
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34575
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
This was set by MIPS in two places, I think largely just because it was
available. This flag refers to IPRs which are an Alpha concept. In the
O3 CPU, IsIprAccess was used as a possible indicator to determine if an
instruction IsSerializeBefore, but we've already got a flag for that. In
the minor CPU, which hasn't been made to work with MIPS as far as I
know, it was used in a condition but not mentioned in the comment
alongside the condition. I think there it was added for the sake of
Alpha.
This change eliminates that flag and removes it from the O3 and minor
CPUs. In the MIPS ISA description, the instructions that were marked as
IsIprAccess have now been marked as IsSerializeBefore since, if there
was a real reason for them to be marked as IsIprAccess, it would have
been to get it them to work in O3, and there IsSerializeBefore gets
equivalent behavior.
Change-Id: Ia874cde12fa70b998d3e638458f13d69798d40b7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33739
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
This has been in this file since it was created in 2009. No global "using
namespace ${NAMESPACE}" should ever appear in a .hh file since then that
namespace is "used" in all files that include the .hh, even if they
aren't aware of it or even actively don't want to.
Change-Id: Idb7d7c5b959077eb4905fbb2044aa55959b8f37f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34155
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When isa_traits.hh hopefully goes away in the not too distant future,
this constant will need somewhere to live so ARM components can find it.
There are valid arguments that this should not be a constant in the
first place, but that's outside the scope of this change.
Change-Id: Ic5bd046dc1cc196b3cf6b6c36878fdbf5eb4c0bf
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34170
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When the system call happens during the execution of the system call
instruction, it can be ambiguous what state takes precedence, the state
update from the instruction or the system call. These may be tracked
differently and found in an unpredictable order in, for example, the O3
CPU. An instruction can avoid updating any state explicitly, but
implicitly updated state (specifically the PC) will always update,
whether the instruction wants it to or not.
If the system call can be deferred by using a Fault object, then it's no
longer ambiguous. The PC update will be discarded, and the system call
can set the PC however it likes. Because there is no implicit PC update,
the PC needs to be walked forward, either to what it would have been
anyway, or to what the system call set in NPC.
In addition, because of the existing semantics around handling Faults,
the instruction no longer needs to be marked as serializing,
non-speculative, etc.
The "normal", aka architectural, aka FS version of the system call
instructions don't return a Fault artificially.
Change-Id: I72011a16a89332b1dcfb01c79f2f0d75c55ab773
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33281
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Gem5 Hardware Transactional Memory (HTM)
Here we provide a brief note describing HTM support in Gem5 at
a high level.
HTM is an architectural feature that enables speculative concurrency in
a shared-memory system; groups of instructions known as transactions are
executed as an atomic unit. The system allows that transactions be
executed concurrently but intervenes if a transaction's
atomicity/isolation is jeapordised and takes corrective action. In this
implementation, corrective active explicitely means rolling back a
thread's architectural state and reverting any memory updates to a point
just before the transaction began.
This HTM implementation relies on--
(1) A checkpointing mechanism for architectural register state.
(2) Buffering speculative memory updates.
This patch is focusing on the definition of the HTM checkpoint (1)
The checkpointing mechanism is architecture dependent. Each ISA
leveraging HTM support can define a class HTMCheckpoint inhereting from
the generic one (GenericISA::HTMCheckpoint).
Those will need to save/restore the architectural state by overriding
the virtual HTMCheckpoint::save (when starting a transaction) and
HTMCheckpoint::restore (when aborting a transaction).
Instances of this class live in O3's ThreadState and Atomic's
SimpleThread. It is up to the ISA to populate this instance when
executing an instruction that begins a new transaction.
JIRA: https://gem5.atlassian.net/browse/GEM5-587
Change-Id: Icd8d1913d23652d78fe89e930ab1e302eb52363d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30314
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
isa_traits.hh used to have much more in it, but now it only has
PageShift, PageBytes, and (for now) the guest endianness. These values
should only be retrieved from the System class generally speaking, so
only the system class should include arch/isa_traits.hh.
Some gpu compute related files need PageBytes or PageShift. Even though
those files don't advertise their ISA dependence, they are tied to x86.
In those files, they can include arch/x86/isa_traits.hh.
The only other file which legitimately needs arch/isa_traits.hh is the
decoder cache since it uses PageBytes to size an array.
Change-Id: I12686368715623e3140a68a7027c136bd52567b1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33203
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The 'jalr' instruction of 'format Jump' should have an immediate as
offset, and the Rd register could not be always omitted. This patch
fixes the problem.
Example output:
jalr ra, -168(ra)
jalr zero, 0(ra)
jalr ra, 0(a5)
Note that this does not apply to the other two instructions of the
same format: 'c.jr' and 'c.jalr'.
Change-Id: Ia656c2e8bfafd243bfec221ac291190a84684929
Signed-off-by: Ian Jiang <ianjiang.ict@gmail.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33155
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The VAddr class wasn't used and was just a copy (with style fixes) of
the Alpha version.
Delete unused constants in isa_traits.hh, and remove unnecessary
includes. Replace MachineBytes with sizeof(uint32_t) in
arch/power/process.cc.
Change-Id: Ia4862448c43b2dd07078b1ebbbbfda4636343730
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33199
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
In flat instructions, wrLmReqsInPipe/rdLmReqsInPipe are decremented
in the calcAddr() function. However, the calcAddr() function is only
called when execMask != 0.
This patch adds in statements to decrement wrLmReqsInPipe and
rdLmReqsInPipe in all implemented atomic flats when execMask is 0.
This fixes a scenario where vector local memory and flat instructions
are unable to execute due to LocalMemPipeline::isLMReqFIFOWrRdy
always returning false in ScheduleStage::dispatchReady after too many
atomic flats execute with execMask = 0
Change-Id: I081cfd3faf74bbfcf0728445e7160fa2a76a6a7e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32614
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>