The gem5 standard library hardcoded some parameters of the workload.
E.g., the kernel filename must be `object_file`.
Change-Id: I5eeb7359be399138693eaba0738eaf524c59408f
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
This patch adds supports for using the "classic" prefetchers with ruby
cache controllers.
This pull request includes a few commits making the changes in this
order:
- Refactor decouples the classic cache and prefetchers interfaces
- Extras probes for later integration with ruby
- General ruby-side support
- Adds support for the CHI protocol
Commit [mem-ruby: support prefetcher in CHI
protocol](2bdb65653b)
may be used as example on how to add support for other protocols.
JIRA issues that may be related to this pull request:
https://gem5.atlassian.net/browse/GEM5-457https://gem5.atlassian.net/browse/GEM5-1112
The change will only add include and library path if the fastmodel is
required to build. The change will benefit for most of gem5 build.
Change-Id: I98c20bd1470b7227940036199e02bc001e307eac
Currently, if you try to use ctrl-c while python code is running nothing
happens. This is not ideal. This change enables users to use ctrl-c
while python is running (e.g., when a large disk image is downloading).
To do this, we moved the `initSignals` function in gem5 from `main` to
the simulate loop. Thus, every time the simulate loop starts (i.e., is
called from python) gem5 will install its signal handlers. Also, when
the control is returned to python, we put python's default SIGINT
handler back.
Change-Id: I14490e48d931eb316e8c641217bf8d8ddaa340ed
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
- The resources field in workload now changed to a dict of id and
version from a string with just the id.
There was an if statement added to support both versions in develop.
Removing the if statement so that 23.1 supports the new changes only.
Change-Id: Id8dc3f932f53a156e4fb609a215db7d85bd81a44
Converts CLFLUSHOPT/WB/FLUSH operations from Write to Read operations
during address translation so that they don't trigger a page fault when
done on write-protected pages.
Solves #226
Some variables hava narrow datatypes that overflow on large VLEN values.
For example, the maximum number of microops for LMUL=8 SEW=8 and
VLEN=64K is 2^16.
Change-Id: I5cce759f040884e09ce83bee7e54a62c4b42c5aa
Co-authored-by: Adrià Armejach <adria.armejach@bsc.es>
With the use of large RVV vectors (i.e., 8K or 16K bits) and a limited
number of cacheLoadPorts, some loads take multiple cycles to execute.
This triggered certain conditions when store-to-load forwarding happens
in the middle of the execution of a load that already has outstanding
packets.
First, after store-to-load forwarding the request is marked as discarded
and the load is immediately writtenback, which triggers a writebackDone
that tries to delete the request, triggering an assert as it still has
outstanding packets. This patch avoid deleting the request leaving it
self owned, it will be deleted when the last packet arrives in
packetReplied.
Second, this patch avoid checking snoops on discarded requests by
checking if the request exists.
Change-Id: Icea0add0327929d3a6af7e6dd0af9945cb0d0970
Co-authored-by: Adrià Armejach <adria.armejach@bsc.es>
This patch adds RubyPrefetcherProxy, which provides means to inject
requests generated by the "classic" prefetchers into a SLICC prefetch
queue. It defines defines notifyPf* functions to be used by protocols
to notify a prefetcher. It also includes the probes required to
interface with the classic implementation.
AbstractController defines the accessor needed to snoop the caches.
A followup patch will add support for RubyPrefetcherProxy in the
CHI protocol.
Related JIRA:
https://gem5.atlassian.net/browse/GEM5-457https://gem5.atlassian.net/browse/GEM5-1112
Additional authors:
Tuan Ta <tuan.ta2@arm.com>
Change-Id: Ie908150b510f951cdd6fd0fd9c95d9760ff70fb0
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
hasBeenPrefetched can now take a requestor id and returns true only if
the block was prefetched by a prefetcher with the same id. This may be
necessary to properly train multiple prefetchers attached to the same
cache. If returns true if the block was prefetched by any prefetcher
when the id is not provided.
Related JIRA:
https://gem5.atlassian.net/browse/GEM5-457https://gem5.atlassian.net/browse/GEM5-1112
Change-Id: I205e000fd5ff100e5a5d24d88bca7c6a46689ab2
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Remove the prefetch_on_access and prefetch_on_pf_hit from BaseCache.
BasePrefetch no longer expects this params to exist in the parent.
Configurations that set these parameter using the cache object were
fixed.
Change-Id: I9ab6a545eaf930ee41ebda74e2b6b8bad0ca35a7
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
This patches decouples the prefetchers from the cache implementation
as the first step to allow using the classic prefetchers with ruby
caches. The prefetchers that need do cache lookups can do so using
the accessor object provided when the probes are notified. This may
also facilitate connecting the same prefetcher to multiple caches.
Related JIRA:
https://gem5.atlassian.net/browse/GEM5-457https://gem5.atlassian.net/browse/GEM5-1112
Change-Id: I4fee1a3613ae009fabf45d7b747e4582cad315ef
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
MOV instructions 8C and 8E can be prefixed with a REX prefix to extend
the source/destination register.
However, the R bit in REX will be applied to the segment register.
The decoder file checks for valid segment registers, checking the
MODRM_REG only, however, later this will be extended with the REX_R when
adding the register to the sources/destinations of the instruction.
This will trigger an assert.
Additionally, MOV instructions of various miscelaneous registers are
also not check for being valid when taking into account the REX_R bit.
This patch checks that the REX_R is not set, otherwise, UD2 will be
generated.
In a high performance CPU there is no other way than a BTB hit
to know about a branch instruction and its type. For low-end CPU's
pre-decoding might sit in from of the BPU to provide this information.
Currently, the BPU models only low-end behavior and updates the
RAS and the indirect branch prediction even without a BTB hit.
This patch adds three things to model the correct behavior for high-end
CPUs.
1. A check before the RAS and indirect predictor wheather there was
a BTB hit or not. Only for BTB hits the BPU will consolidate RAS, and
indirect predictor.
2. Since, this check requires a BTB hit for indirect branches they must
also be installed into the BTB. For returns this was already done.
3. Finally, the BTB update previously happened at squash (decode
or commit). Since this can be out-of-order that means branches from
the false path can get installed without ever been retired.
The memory translation require supervisor mode implement. If the
supervisor mode is not implemented, the satp CSR is not exists and
should not do address translation
Change-Id: Ie6c8a1a130d0aab0647b35e0f731f6b930834176
Converts CLFLUSHOPT/WB/FLUSH operations from Write to Read operations
during address translation so that they don't trigger a page fault
when done on write-protected pages.
Change-Id: I20e89cc0cb2b288b36ba1f0ba39a2e1bf0f728af
The PR contains the following changes:
- Move all of the config options(`env["CONF"]`) from SConsopt to Kconfig
files
- Update `build_opts` files to Kconfig option formats
- The Ruby Protocol files are only built if `RUBY=y`
- Remove the default-default build target
- Kconfig commands are included in the PR:
- defconfig
- setconfig
- meunconfig
- guiconfig
- listnewconfig
- savedefconfig
- oldconfig
- olddefconfig
- Add the `python3-tk` package dependencies
Jira issue: https://gem5.atlassian.net/browse/GEM5-1211
The GPU device keeps a local copy of each ring buffers read pointer
(rptr) to avoid constant DMAs to/from host memory. This means it needs
to be periodically updated on the host side as the driver uses this to
determine how much space is left in the queue and may hang if it believe
the queue is full. For user-mode queues, this already happens when
queues are unmapped. For kernel mode queues (e.g., HIQ, KIQ) the rptr is
never updated leading to a hang.
In this patch the rptr for *all* queues is reported back to the kernel
whenever the queue reaches an empty state (rptr == wptr). Additionally
to handle PM4 queue wrap-around, the queue processing function checks if
the queue is not empty instead of rptr < wptr. This is state because the
driver fills PM4 queues with NOP packets on initialization and when wrap
around occurs.
Change-Id: Ie13a4354f82999208a75bb1eaec70513039ff30f
The V_PERM_B32 instruction is selecting the correct byte, but is
shifting into place moving by bits instead of bytes. The V_OR3_B32
instruction is calling the wrong instruction implementation in the
decoder.
This patch fixes both issues plus a bonus fix for GCN3's V_PERM_B32.
(GCN3 does not have V_OR3_B32).
Change-Id: Ied66c43981bc4236f680db42a9868f760becc284
This PR is fixing remaining issues in the ArmISA::Interrupt class; more
specifically it is enabling
virtual interrupts in secure mode (when FEAT_SEL2 is present). Previous
version was assuming no
virtual interrupt was possible in secure mode. We fix this assumption by
replacing the security check
with the EL2Enabled helper which closely matches the Arm pseudocode
MOV instructions 8C and 8E can be prefixed with a REX prefix to extend
the source/destination register. However, the R bit in REX will be
applied to the segment register. The decoder file checks for valid
segment registers, checking the MODRM_REG only, however, later this
will be extended with the REX_R when adding the register to the
sources/destinations of the instruction. This will trigger an assert.
This patch checks that the REX_R is not set, otherwise, UD2 will be
generated.
Change-Id: I78a93c35116232fe37e5ec50025e721b8c633c5f
The config option HAVE_CAPSTONE is added in the previous [1] and
the Kconfig options should be sync with it.
[1] https://github.com/gem5/gem5/pull/494
Change-Id: Id83718bc825f53d87d37d6ac930b96371209bdb3
The CL updates the Kconfig:
1. Replace the USE_NULL_ISA with BUILD_ISA
2. The USE_XXX_ISAs are depends on BUILD_ISA
3. If the BUILD_ISA is set, at least one of USE_XXX_ISAs must be set
4. Refactor the USE_KVM option
Change-Id: I2a600dea9fb671263b0191c46c5790ebbe91a7b8
These are not yet consumed by anything, but convert all the settings
from SCons variables to Kconfig variables.
If you have existing SConsopts files which need to be converted, you
should take a look at KCONFIG.md to learn about how kconfig is used in
gem5. You should decide if any variables need to be available to C++ or
kconfig itself, and whether those are options which should be detected
automatically, or should be up to the user. Options which should be
measured automatically should still be in SConsopts files, while user
facing options should be added to new or existing Kconfig files.
Generally, make sure you're storing c++/kconfig visible options in
env['CONF'][...]. Also remove references to sticky_vars since persistent
options should now be handled with kconfig, and export_vars since
everything in env['CONF'] is now exported automatically.
Switch SCons/gem5 to use Kconfig for configuration, except EXTRAS which
is still a sticky SCons variable. This is necessary because EXTRAS also
controls what config options exist. If it came from Kconfig itself, then
there would be a circular dependency. This dependency could
theoretically be handled by reparsing the Kconfig when EXTRAS
directories were added or removed, but that would be complicated, and
isn't supported by kconfiglib. It wouldn't be worth the significant
effort it would take to add it, just to use Kconfig more purely.
Change-Id: I29ab1940b2d7b0e6635a490452d05befe5b4a2c9
The AtomicWait event was not being woken up properly due to the
numPending count in the TBE not being decremented. This patch decrements
the count when Data is returned. Since that moves to a base state, the
TBE should no longer be needed.
Additionally added a transition which stalls and wait when an AtomicWait
occurs while in WI state so that it retries.
Change-Id: Ic8bfc700f9df3f95bea0799121898926a23d8163
When restoring checkpoints for certain applications, gem5 tries to
create new doorbells with a pre-existing queue ID and simulation crashes
shortly after. This commit adds existing IDs to the GPU device's used
VMID map so that new doorbells are aware of existing queue IDs and use a
new ID. This ensures that queue IDs are unique after checkpoint
restoration
The CPU should not sleep with a pending virtual interrupt
if secure mode EL2 is supported (FEAT_SEL2)
Change-Id: Ib71c4a09d76a790331cf6750da45f83694946aee
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Similarly to the physical version [1], we rewrite the
masking logic to account for FEAT_SEL2.
The interrupt table is taken from the Arm architecture reference
manual (version DDI 0487H.a, section D1.3.6, table R_BKHXL)
[1]: https://github.com/gem5/gem5/pull/430
Change-Id: Icb6eb1944d8241293b3ef3c349b20f3981bcc558
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
There is no need to call the methods for every kind
of interrupt. A pending one should short-circuit the
remaining checks
Change-Id: I2c9eb680a7baa4644745b8cbe48183ff6f8e3102
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
With this patch we align virtual interrupts with respect to
the physical ones by introducing a matching takeVirtualInt
method.
Change-Id: Ib7835a21b85e4330ba9f051bc8fed691d6e1382e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>