Sometimes ELF files have segments in them which are marked as loadable,
but which actually have zero size in memory. When setting up a memory
image we should drop those to avoid confusing other code which tries
to find the footprint of a memory image. No part of these segments,
including their starting address or ending address, need to actually
land on top of memory since they don't actually contain any data.
Change-Id: If8b61d10db139e0f688b6ceabcb8e6a898557469
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35156
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This change sets gpuDynInst->exec_mask for permute and bpermute
instructions, fixing a bug where they would never write their data.
permute and bpermute instructions are load instructions that write
to a VGPR. Because of that, they use gpuDynInst->exec_mask when
checking what lanes should write to the VGPR.
gpuDynInst->exec_mask gets set to wf->execMask() as that is what other
load instructions that write to VGPRs do.
Change-Id: Ie443283488cbd2ab9c17fc255e7cc44418353419
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35036
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
(1) ThreadContexts are registered into System in BaseCPU::init.
(2) FVPBasePwrCtrl state is resized based on registered ThreadContexts
in FVPBasePwrCtrl::init.
FVPBasePwrCtrl::init may be called before BaseCPU::init based on the
model names alphabetical order, leading to segmentation faults.
To fix this, (2) is now carried out in FVPBasePwrCtrl::startup.
Change-Id: Ica6c5b7448da556d61aee53f8777a709fcad2212
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35075
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
System calls should now be requested from the workload directly and not
routed through ExecContext or ThreadContext interfaces. That removes a
major special case for SE mode from those interfaces.
For now, when the SE workload gets a request for a system call, it
dispatches it to the appropriate Process object. In the future, the
ISA specific Workload subclasses will be responsible for handling system
calls and not the Process classes.
For simplicity, the Workload syscall() method is defined in the base
class but will panic everywhere except when SEWorkload overrides it. In
the future, this mechanism will turn into a way to request generic
services from the workload which are not necessarily system calls. For
instance, it could be a way to request handling of a page fault without
having to have another PseudoInst just for that purpose.
Change-Id: I18d36d64c54adf4f4f17a62e7e006ff2fc0b22f1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33282
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The workload object is still optional for the sake of compatibility,
even though it probably shouldn't be in the long term. If a simulation
is just a collection of components with nothing in particular running on
it, for instance driven by a traffic generator, should it even have a
System object in the first place?
Change-Id: I8bcda72bdfa3730248226fb62f0bba9a83243d95
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33278
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This size was used to break up DMA transactions so that a single
transaction would not cross a page boundary. This was because on Alpha,
there was an actual page table which translated between PCI and DMA
address spaces. On all currently implemented systems, the mapping is
simply to add a scalar offset, so it's not possible for a legal region
of memory to be contiguous in one space but not in the other.
Additionally, if it *was* possible for there to be a mismatch, it was
only coincidence that Alpha used a page table which had the same sized
pages as it normally used. There is no requirement that there even would
be fixed sized pages in the first place.
To avoid this artificial dependency between the IDE controller and the
ISA, this change simply changes the chunk size for DMA accesses to 4K.
That's the page size at least on x86 and probably other architectures,
and will be a pretty close approximation of the previous behavior.
It's possible that even having this chunking in the first place is
unnecessary and functionally useless, but there are some checks which
happen between chunks, and changing how big they are would change the
frequency of those checks. For instance, the controller/disk may not
notice in the same amount of time if a DMA was cancelled somehow.
Change-Id: I1ec840d1f158c3faa31ba0184458b69bf654c252
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34178
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Most DPRINTFs will be skipped over most of the time, and when they
aren't they'll already have overhead from string handling, output to the
console and/or a file, etc, which will drown out the behavior of a
branch.
Change-Id: I5475d7b5add63b44f60c0a1d46b4b14e6bf30fd3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34818
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The following deadlock was occuring in fetch_unit w/timingSim:
1. exec() is called, a wave is ready to fetch, so it sets pendingFetch
2. A packet is sent to ITLB to fetch for that wave
3. The wave executes a branch, causing the fetch buffer to be cleared
4. The packet is handled, and fetch() is called. However, because the
fetch buffer was cleared, it returns doing nothing.
5. exec() gets called again, but the wave will never be scheduled to
fetch, as pendingFetch is still set to true.
This patch clears pendingFetch (and dropFetch) before returning in fetch()
when the fetch buffer has been cleared.
dropFetch needed to be cleared otherwise gem5 would crash.
Change-Id: Iccbac7defc4849c19e8b17aa2492da641defb772
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34555
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The queue in systemC scheduler is implemented as a std::map. This provides
the best big-O solution. However, most of simulation usecases has very
small number of pending events. This is expected as we usually only trigger a
few new events after some events are processed. In such scenario, we
should optimize for insert/erase instead of search. This change use
std::list instead of std::map.
As a proof, we can find that gem5's original event_queue is also
implemented as a list instead of tree.
We see 5% speed improvement with the example provided by Matthias Jung:
https://gist.github.com/myzinsky/557200aa04556de44a317e0a10f51840
Change-Id: I75c30df9134e94df42fd778115cf923488ff5886
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34515
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
A comment at the top of StaticInstFlags.py says that if IsMemRef is set,
exactly one of IsStore or IsLoad will be set. That's not strictly true
since IsAtomic may be set as well, in which case neither IsStore or
IsLoad will be set (in one example I found).
The isMemRef accessor still exists, and now just ors the IsStore,
IsLoad, and IsAtomic flags.
Change-Id: Ic5ff104da68978273977a6eff2abab5dd0ae7fda
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33744
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
There were three different StaticInst flags for memory barriers,
IsMemBarrier, IsReadBarrier, and IsWriteBarrier. IsReadBarrier was never
used, and IsMemBarrier was for both loads and stores, so a composite of
IsReadBarrier and IsWriteBarrier.
This change gets rid of IsMemBarrier and replaces by setting
IsReadBarrier and IsWriteBarrier at the same time. An isMemBarrier
accessor is left, but is now implemented by checking if both of the
other flags are set, and renamed to isFullMemBarrier to make it clear
that it's checking both for both types of barrier, not one or the other.
Change-Id: I702633a047f4777be4b180b42d62438ca69f52ea
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33743
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These were including instruction class definitions from x86 for some
reason. There was no code in those .cc files which actually used
anything from them, as evidenced by the fact that the GCN3_X86 build
still works. No other code in the file was conditionally compiled as of
today.
Change-Id: I3cef8348fb601dd7af67665cf64bbf514c91c3db
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34577
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
These were being initialized with BigRegVect brv = {0}, which made the
compiler complain because there is internal structure. The first element
of the union is actually an array, and this was telling it to initialize
that array to scalar 0. It was warning about this which was breaking the
build.
Instead, use zero initlization like BigRegVect brv = {}. This
initializes the first element of the union to all zeroes, with all
padding bits initialized to zero as well.
This satisfies the compiler and avoids a build error.
Change-Id: I31e7a8730c538637ff2e0c7fb00a4e12ed05e074
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34575
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
This was set by MIPS in two places, I think largely just because it was
available. This flag refers to IPRs which are an Alpha concept. In the
O3 CPU, IsIprAccess was used as a possible indicator to determine if an
instruction IsSerializeBefore, but we've already got a flag for that. In
the minor CPU, which hasn't been made to work with MIPS as far as I
know, it was used in a condition but not mentioned in the comment
alongside the condition. I think there it was added for the sake of
Alpha.
This change eliminates that flag and removes it from the O3 and minor
CPUs. In the MIPS ISA description, the instructions that were marked as
IsIprAccess have now been marked as IsSerializeBefore since, if there
was a real reason for them to be marked as IsIprAccess, it would have
been to get it them to work in O3, and there IsSerializeBefore gets
equivalent behavior.
Change-Id: Ia874cde12fa70b998d3e638458f13d69798d40b7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33739
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>