The log_call helper is now accepting a time parameter (dictionary). If
the param is not None, the function will fill the timing indications
(user and system time) for the TestCase.
There are some TestCases whose user time is not of our interest; for
example we don't really care about the cpu time of a stdout diff
(MatchStdout tests). In those cases the resulting cpu time in the
generated JUnit file (results.xml) will be 0.
JIRA: https://gem5.atlassian.net/browse/GEM5-548
Change-Id: I53c1b59f8ad93900aeac06197e39189c00a9053c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32653
Tested-by: kokoro <noreply+kokoro@google.com>
This change replaces the __attribute__ syntax with the now standard [[]]
syntax. It also reorganizes compiler.hh so that all special macros have
some explanatory text saying what they do, and each attribute which has a
standard version can use that if available and what version of c++ it's
standard in is put in a comment.
Also, the requirements as far as where you put [[]] style attributes are
a little more strict than the old school __attribute__ style. The use of
the attribute macros was updated to fit these new, more strict
requirements.
Change-Id: Iace44306a534111f1c38b9856dc9e88cd9b49d2a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35219
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Before this commit, ExecSymbol would show only the symbol and no address:
0: system.cpu: A0 T0 : @_kernel_flags_le_lo32+6 : mrs x0, currentel
After this commit, it shows the symbol in addition to the address:
0: system.cpu: A0 T0 : 0x10 @_kernel_flags_le_lo32+6 : mrs x0, currentel
Change-Id: I665802f50ce9aeac6bb9e174b5dd06196e757c60
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35077
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This macro probably would have been defined to "return" in some cases,
to be put after a call to a function that doesn't return so that the
compiler wouldn't think control would reach the end of a non-void
function. It was only ever defined to expand to nothing, and now that
[[noreturn]] is a standard attribute, it should never be needed going
forward.
Change-Id: I37625eab72deeaede77f9347116b9fddd75febf7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35217
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Sometimes ELF files have segments in them which are marked as loadable,
but which actually have zero size in memory. When setting up a memory
image we should drop those to avoid confusing other code which tries
to find the footprint of a memory image. No part of these segments,
including their starting address or ending address, need to actually
land on top of memory since they don't actually contain any data.
Change-Id: If8b61d10db139e0f688b6ceabcb8e6a898557469
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35156
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Added utility class `TimedWaitPID` which monkey-patches os.waitpid()
with a functor that has the same signature, but calls os.wait4()
instead. This allows the process's user and system CPU time to be
obtained from the OS when using APIs (such as subprocess) which use
os.waitpid() internally.
The process CPU time is stored within the functor and can be read back
later by calling TimedWaitPID.get_time_for_pid().
JIRA: https://gem5.atlassian.net/browse/GEM5-548
Change-Id: I9ebe9ca1241a4f28c90ad31f672f32ac52786664
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32652
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
This change sets gpuDynInst->exec_mask for permute and bpermute
instructions, fixing a bug where they would never write their data.
permute and bpermute instructions are load instructions that write
to a VGPR. Because of that, they use gpuDynInst->exec_mask when
checking what lanes should write to the VGPR.
gpuDynInst->exec_mask gets set to wf->execMask() as that is what other
load instructions that write to VGPRs do.
Change-Id: Ie443283488cbd2ab9c17fc255e7cc44418353419
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35036
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
(1) ThreadContexts are registered into System in BaseCPU::init.
(2) FVPBasePwrCtrl state is resized based on registered ThreadContexts
in FVPBasePwrCtrl::init.
FVPBasePwrCtrl::init may be called before BaseCPU::init based on the
model names alphabetical order, leading to segmentation faults.
To fix this, (2) is now carried out in FVPBasePwrCtrl::startup.
Change-Id: Ica6c5b7448da556d61aee53f8777a709fcad2212
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35075
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Without this, HDF5 is not built, e.g. a run such as
http://jenkins.gem5.org/job/Nightly/68/console contains:
Checking for hdf5-serial using pkg-config... pkg-config not found
Checking for hdf5 using pkg-config... pkg-config not found
Checking for H5Fcreate("", 0, 0, 0) in C library hdf5... (cached) no
Warning: Couldn't find any HDF5 C++ libraries. Disabling
HDF5 support.
This is done to increase coverage a bit, and serve as dependency
documentation to users.
Change-Id: Ibf820a3aa76c29eeee1201646924ee181615a162
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34777
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
System calls should now be requested from the workload directly and not
routed through ExecContext or ThreadContext interfaces. That removes a
major special case for SE mode from those interfaces.
For now, when the SE workload gets a request for a system call, it
dispatches it to the appropriate Process object. In the future, the
ISA specific Workload subclasses will be responsible for handling system
calls and not the Process classes.
For simplicity, the Workload syscall() method is defined in the base
class but will panic everywhere except when SEWorkload overrides it. In
the future, this mechanism will turn into a way to request generic
services from the workload which are not necessarily system calls. For
instance, it could be a way to request handling of a page fault without
having to have another PseudoInst just for that purpose.
Change-Id: I18d36d64c54adf4f4f17a62e7e006ff2fc0b22f1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33282
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The workload object is still optional for the sake of compatibility,
even though it probably shouldn't be in the long term. If a simulation
is just a collection of components with nothing in particular running on
it, for instance driven by a traffic generator, should it even have a
System object in the first place?
Change-Id: I8bcda72bdfa3730248226fb62f0bba9a83243d95
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33278
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This size was used to break up DMA transactions so that a single
transaction would not cross a page boundary. This was because on Alpha,
there was an actual page table which translated between PCI and DMA
address spaces. On all currently implemented systems, the mapping is
simply to add a scalar offset, so it's not possible for a legal region
of memory to be contiguous in one space but not in the other.
Additionally, if it *was* possible for there to be a mismatch, it was
only coincidence that Alpha used a page table which had the same sized
pages as it normally used. There is no requirement that there even would
be fixed sized pages in the first place.
To avoid this artificial dependency between the IDE controller and the
ISA, this change simply changes the chunk size for DMA accesses to 4K.
That's the page size at least on x86 and probably other architectures,
and will be a pretty close approximation of the previous behavior.
It's possible that even having this chunking in the first place is
unnecessary and functionally useless, but there are some checks which
happen between chunks, and changing how big they are would change the
frequency of those checks. For instance, the controller/disk may not
notice in the same amount of time if a DMA was cancelled somehow.
Change-Id: I1ec840d1f158c3faa31ba0184458b69bf654c252
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34178
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This matches what's documented elsewhere. We *need* version 3.4 to
support c++14, but we support only as far back as 3.9. Also, the
argument to set c++14 as the standard is different in 3.4 and earlier
(-std=c++1y), so it makes life slightly easier to move past it to 3.9.
Change-Id: I66fa578dd3222c62907496a888f8068ed0918c7b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34819
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Most DPRINTFs will be skipped over most of the time, and when they
aren't they'll already have overhead from string handling, output to the
console and/or a file, etc, which will drown out the behavior of a
branch.
Change-Id: I5475d7b5add63b44f60c0a1d46b4b14e6bf30fd3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34818
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The following deadlock was occuring in fetch_unit w/timingSim:
1. exec() is called, a wave is ready to fetch, so it sets pendingFetch
2. A packet is sent to ITLB to fetch for that wave
3. The wave executes a branch, causing the fetch buffer to be cleared
4. The packet is handled, and fetch() is called. However, because the
fetch buffer was cleared, it returns doing nothing.
5. exec() gets called again, but the wave will never be scheduled to
fetch, as pendingFetch is still set to true.
This patch clears pendingFetch (and dropFetch) before returning in fetch()
when the fetch buffer has been cleared.
dropFetch needed to be cleared otherwise gem5 would crash.
Change-Id: Iccbac7defc4849c19e8b17aa2492da641defb772
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34555
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This cleans up the mmap-ing. This is primarily used for testing since
the tests may end up mmap-ing the backing file many times, and we don't
want all those earlier mappings lying around.
This change also makes the original mmap-ing function close the file it
opens, since the man page for mmap explicitly says you can do that and
not lose the mapping. That means we don't have to keep track of the file
descriptor which corresponds to the mmap-ed file when we do the
unmapping, and it's slightly cleaner in general.
Change-Id: I90e3e755cebf3d03e2bf644adf8ef3e157236172
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27750
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Pouya Fotouhi <pfotouhi@ucdavis.edu>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The queue in systemC scheduler is implemented as a std::map. This provides
the best big-O solution. However, most of simulation usecases has very
small number of pending events. This is expected as we usually only trigger a
few new events after some events are processed. In such scenario, we
should optimize for insert/erase instead of search. This change use
std::list instead of std::map.
As a proof, we can find that gem5's original event_queue is also
implemented as a list instead of tree.
We see 5% speed improvement with the example provided by Matthias Jung:
https://gist.github.com/myzinsky/557200aa04556de44a317e0a10f51840
Change-Id: I75c30df9134e94df42fd778115cf923488ff5886
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34515
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
A comment at the top of StaticInstFlags.py says that if IsMemRef is set,
exactly one of IsStore or IsLoad will be set. That's not strictly true
since IsAtomic may be set as well, in which case neither IsStore or
IsLoad will be set (in one example I found).
The isMemRef accessor still exists, and now just ors the IsStore,
IsLoad, and IsAtomic flags.
Change-Id: Ic5ff104da68978273977a6eff2abab5dd0ae7fda
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33744
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>