TracingExtension contains a stack that records the names of the ports
a Packet has passed through. The target receiving the Packet can dump
the Packet's whole path for debugging purposes.
This mechanism can be enabled with the debug flag PortTrace.
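
As a rough illustration of the idea (not the actual C++ extension), the
sketch below models a packet that accumulates port names as it is
forwarded. The class, function, and port names are hypothetical and
chosen only for this example.

```python
class Packet:
    """Toy packet that carries a trace of the ports it has crossed."""

    def __init__(self, addr):
        self.addr = addr
        self.port_trace = []  # stack of port names, oldest entry first


def forward(pkt, port_name, tracing_enabled=True):
    """Record the port name when tracing is on, then pass the packet along."""
    if tracing_enabled:
        pkt.port_trace.append(port_name)
    return pkt


def dump_path(pkt):
    """Render the whole path, as a receiving target might print it."""
    return " -> ".join(pkt.port_trace)


pkt = Packet(0x1000)
# Hypothetical port names, for illustration only.
for name in ("cpu.dcache_port", "l1d.mem_side", "membus.mem_side_ports[0]"):
    forward(pkt, name)
print(dump_path(pkt))
```

A real implementation would presumably also pop entries on the response
path; that part is omitted here.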
Change-Id: Ic11e708b35fdddc4f4b786d91b35fd4def08948c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71538
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
This patch includes several changes to the gem5 tools interface to the
gem5-resources infrastructure. These are:
* The old download and JSON query functions have been removed from the
downloader module. These functions were used for directly downloading
and inspecting the resource JSON file, hosted at
https://resources.gem5.org/resources. This information is now obtained
via `gem5.client`. If a resources JSON file is specified as a client,
it should conform to the new schema:
https://resources.gem5.org/gem5-resources-schema.json. The old schema
(pre-v23.0) is no longer valid. Tests have been updated to reflect
this change. Those which tested these old functions have been removed.
* Unused imports have been removed.
* For the resource query functions, and those tasked with obtaining the
resources, the parameter `gem5_version` has been added. In all cases
it does the same thing:
* It will filter results based on compatibility to the
`gem5_version` specified. If no resources are compatible the
latest version of that resource is chosen (though a warning is
thrown).
* By default it is set to the current gem5 version.
* It is optional. If `None`, this filtering functionality is not
carried out.
* Tests have been updated to fix the version to “develop” so that
they do not break between versions.
* The `gem5_version` parameter filters using logic that bases
compatibility on the specificity of the gem5-version specified in
a resource’s data. If a resource has a compatible gem5-version of
“v18.4”, it will be compatible with any minor/hotfix version within the
v18.4 release (this can be seen as matching on “v18.4.*.*”). Likewise,
if a resource has a compatible gem5-version of “v18.4.1”, then it is
only compatible with the v18.4.1 release and any of its hotfix
releases (“v18.4.1.*”).
* The ‘list_resources’ function has been updated to use the
“gem5.client” APIs to get resource information from the clients
(MongoDB or a JSON file). This has been designed to remain backwards
compatible as much as possible, though, due to schema changes,
the function does search across all versions of gem5.
* A `get_resources` function was added to the `AbstractClient`. This is a
more general function than `get_resource_by_id`. It was
primarily created to handle the `list_resources` update but is a
useful update to the API. The `get_resource_by_id` function has been
altered to function as a wrapper around the `get_resources` function.
* The “GEM5_RESOURCE_JSON” code has been removed. This is no longer
used.
* Tests have been cleaned up a little bit to be easier to read.
* Some docstrings have been updated.
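
The `gem5_version` filtering described above can be sketched as a prefix
match on version components. This is a minimal illustration and not the
code in this patch: the function names are hypothetical, and the
dictionary keys `resource_version` and `gem5_versions` are assumed to
follow the resource schema.

```python
import warnings


def is_compatible(resource_gem5_version, gem5_version):
    """Prefix match: "v18.4" accepts "v18.4.*.*"; "v18.4.1" accepts "v18.4.1.*"."""
    want = resource_gem5_version.lstrip("v").split(".")
    have = gem5_version.lstrip("v").split(".")
    return have[: len(want)] == want


def select_resource(candidates, gem5_version=None):
    """Pick the latest resource version, filtered by gem5_version if given."""

    def version_key(entry):
        return tuple(int(part) for part in entry["resource_version"].split("."))

    pool = candidates
    if gem5_version is not None:
        pool = [
            entry
            for entry in candidates
            if any(is_compatible(v, gem5_version) for v in entry["gem5_versions"])
        ]
        if not pool:
            # No compatible entry: fall back to the latest version and warn.
            warnings.warn(f"No resource compatible with gem5 {gem5_version}")
            pool = candidates
    return max(pool, key=version_key)


candidates = [
    {"resource_version": "1.0.0", "gem5_versions": ["v18.4"]},
    {"resource_version": "2.0.0", "gem5_versions": ["v19.0"]},
]
print(select_resource(candidates, "v18.4.1")["resource_version"])
```

Passing `gem5_version=None` skips the filter entirely, which mirrors the
optional behaviour listed above.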
Things that are left TODO with this code:
* The client_wrapper/client/abstract_client abstractions are rather
pointless. In particular the client_wrapper and client classes could
be merged.
* The downloader module no longer does much and should have its
functions merged into other modules.
* With the addition of the `get_resources` function, much of the code in
the `AbstractClient` could be simplified.
Change-Id: I0ce48e88b93a2b9db53d4749861fa0b5f9472053
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71506
Reviewed-by: Kunal Pai <kunpai@ucdavis.edu>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
(cherry picked from commit 82587ce71b)
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71739
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
* According to the manual, load reservations must be cleared on a
failed or a successful SC attempt.
* A load reservation can be arbitrarily large. The current
implementation was reserving something different than cacheBlockSize
which could lead to problems if snoop addresses are cache block
aligned. This patch assumes a cache block granularity.
* Load reservations should also be cleared on faults.
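
The three rules above can be modelled with a small reservation monitor.
This is only a behavioural sketch under the assumption of a 64-byte
cache block; the class and method names are hypothetical and do not
correspond to gem5 code.

```python
class ReservationMonitor:
    """Toy LR/SC monitor that reserves at cache block granularity."""

    def __init__(self, cache_block_size=64):
        # Mask that aligns an address down to its cache block.
        self.block_mask = ~(cache_block_size - 1)
        self.reserved_block = None

    def _block(self, addr):
        return addr & self.block_mask

    def load_reserved(self, addr):
        self.reserved_block = self._block(addr)

    def store_conditional(self, addr):
        success = self.reserved_block == self._block(addr)
        # The reservation is cleared on both success and failure.
        self.reserved_block = None
        return success

    def snoop(self, addr):
        # Snoop addresses may be block aligned, so compare whole blocks.
        if self.reserved_block == self._block(addr):
            self.reserved_block = None

    def fault(self):
        # A fault also drops any outstanding reservation.
        self.reserved_block = None


monitor = ReservationMonitor()
monitor.load_reserved(0x1008)
print(monitor.store_conditional(0x1008))  # reservation still held
print(monitor.store_conditional(0x1008))  # already cleared by the first SC
```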
Change-Id: I64513534710b5f269260fcb204f717801913e2f5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71558
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Roger Chang <rogerycchang@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The cache is modeled after an AMD EPYC cache, but does not match it
exactly.
- K cores per core complex (CCD); each core has one private split L1
and one private L2.
- The K cores in the same CCD share 1 slice of L3 cache, which is not
a victim cache.
- There can be multiple CCDs, which communicate with each other via
the Cross-CCD router. The Cross-CCD router is also connected to the
directory controllers and DMA controllers.
- All link latencies are set to 1.
Change-Id: Ib64248bed9155b8e48e5158ffdeebf1f2d770754
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71598
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Many of the outstanding issues with the GPU model are related to
instructions that lack SDWA/DPP implementations and execute by
ignoring the special registers, leading to incorrect execution.
Adding SDWA/DPP is currently very cumbersome as there is a lot of
boilerplate code.
This changeset adds helper methods for VOP2 with one instruction
changed as an example. This review is intended to get feedback
before applying this change to all VOP2 instructions that support
SDWA/DPP.
Change-Id: I1edbc3f3bb166d34f151545aa9f47a94150e1406
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70738
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
uint_fast16_t is an integer type of at least 16 bits; it can be
32 bits, 64 bits, or more. Most simulations run on an x86-64 Linux
host, where the size of uint_fast16_t is 64 bits. Therefore,
there is no problem for double-precision float operations and the
FloatMM test passes. However, on macOS, the size of uint_fast16_t
is 16 bits, so the upper bits are lost when converting float
register bits to freg_t, which generates unexpected results for the
FloatMM test.
The change guarantees that the size of data in freg_t is at least
64 bits, so no data is lost when converting floating-point values to
freg_t.
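
The truncation can be illustrated by masking the bit pattern of a double
to the width of the storage field. This is a Python model of the effect,
not the C++ change itself; the helper names are made up for this example.

```python
import struct


def double_to_bits(value):
    """Return the 64-bit IEEE 754 pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def bits_to_double(bits):
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def store_in_field(bits, width):
    """Model storing a value in an unsigned integer field `width` bits wide."""
    return bits & ((1 << width) - 1)


bits = double_to_bits(1.5)
wide = store_in_field(bits, 64)    # field large enough for a double
narrow = store_in_field(bits, 16)  # 16-bit field drops the upper 48 bits

print(hex(bits), bits_to_double(wide), bits_to_double(narrow))
```

With a 16-bit field, the sign, the exponent, and most of the mantissa
are gone, which is consistent with the wrong FloatMM results described
above.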
Reference:
https://developer.apple.com/documentation/kernel/uint_fast16_t
https://codebrowser.dev/glibc/glibc/stdlib/stdint.h.html
Change-Id: I3df6610f0903cdee0f56584d6cbdb51ac26c86c8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71578
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Currently the fmax and fmin instructions convert source float registers such as
Fs1_bits to float64_t (or float32_t and float16_t) many times within a single
instruction. This makes future maintenance of these instructions harder.
The change adds non-register float_t intermediate variables fs1 and fs2 to
keep the converted results so that the conversion is not repeated. It also
adds an intermediate variable fd of the specific float type so that the upper
bits of the packed float register can be assumed to be all ones.
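
A minimal sketch of the pattern, using a single-precision maximum as the
example: each source is unpacked once into an intermediate, the result is
computed in fd, and the upper 32 bits of the 64-bit register are set to
all ones when packing. The helper names are hypothetical, and details
such as signed zeros and exception flags are deliberately left out.

```python
import math
import struct

CANONICAL_NAN32 = 0x7FC00000
BOX_MASK32 = 0xFFFFFFFF00000000


def f32_to_bits(value):
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_f32(bits):
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def unbox32(reg_bits):
    """Extract a float32 pattern; an improperly boxed value reads as NaN."""
    if (reg_bits & BOX_MASK32) != BOX_MASK32:
        return CANONICAL_NAN32
    return reg_bits & 0xFFFFFFFF


def box32(bits32):
    """Pack a float32 pattern with the upper register bits all ones."""
    return BOX_MASK32 | (bits32 & 0xFFFFFFFF)


def fmax_s(fs1_bits, fs2_bits):
    # Convert each source exactly once into an intermediate variable.
    fs1 = bits_to_f32(unbox32(fs1_bits))
    fs2 = bits_to_f32(unbox32(fs2_bits))
    if math.isnan(fs1) and math.isnan(fs2):
        fd_bits = CANONICAL_NAN32
    elif math.isnan(fs1):
        fd_bits = f32_to_bits(fs2)
    elif math.isnan(fs2):
        fd_bits = f32_to_bits(fs1)
    else:
        fd_bits = f32_to_bits(max(fs1, fs2))
    return box32(fd_bits)


result = fmax_s(box32(f32_to_bits(1.0)), box32(f32_to_bits(2.0)))
print(hex(result))
```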
Change-Id: Ic508d5255db6c4b38ca4df6dd805df440c043fff
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71479
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add two kernel dispatch-based exit events that are useful for limiting
the simulation and enabling debug flags at specific GPU kernels. Since
the KVM CPU typically used with GPUFS is not deterministic, this helps
with enabling debug flags when the Tick number may vary. The exit at GPU
kernel option can also limit simulation by only simulating a few hundred
kernels, for example, and exiting at a determined point.
Change-Id: I81bae92a80c25fc38c41e999aa662e1417b7a20d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71418
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
This patch makes changes to the stdlib based on the gem5 Vision project.
First, a MongoDB database is now supported.
Support for a JSON database continues.
The JSON can either be a local path or a raw GitHub link.
The data for these databases is stored in src/python
under "gem5-config.json".
This will be used by default.
However, the configuration can be overridden:
- by providing a path using the GEM5_CONFIG env variable.
- by placing a gem5-config.json file in the current working directory.
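
The lookup described above could look roughly like the sketch below.
The function name is hypothetical, and the relative priority of the
GEM5_CONFIG variable over a file in the working directory is an
assumption; the text does not state which override wins.

```python
import os
import tempfile
from pathlib import Path


def resolve_config_path(default_path, cwd=None, environ=None):
    """Return the config file to use, checking overrides before the default."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)

    # Assumed priority 1: explicit path in the GEM5_CONFIG env variable.
    env_path = environ.get("GEM5_CONFIG")
    if env_path:
        return Path(env_path)

    # Assumed priority 2: gem5-config.json in the current working directory.
    local = cwd / "gem5-config.json"
    if local.is_file():
        return local

    # Otherwise fall back to the default shipped with gem5.
    return Path(default_path)


with tempfile.TemporaryDirectory() as tmp:
    default = Path(tmp) / "default" / "gem5-config.json"
    from_default = resolve_config_path(default, cwd=tmp, environ={})
    (Path(tmp) / "gem5-config.json").write_text("{}")
    from_cwd = resolve_config_path(default, cwd=tmp, environ={})
    from_env = resolve_config_path(
        default, cwd=tmp, environ={"GEM5_CONFIG": "/some/other.json"}
    )
```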
An AbstractClient is an abstract class that implements
searching and sorting relevant to the databases.
`clients` is an optional list that can be passed
when defining any Resource class or calling obtain_resource.
These databases can be defined in the config JSON.
Resources now have versions. This allows for a
single resource, e.g., 'x86-ubuntu-boot', to have
multiple versions. As such, the key of a resource is
its ID and Version (e.g., 'x86-ubuntu-boot/v2.1.0').
Different versions of a resource might be compatible
with different versions of gem5.
By default, the latest version compatible with the user's gem5 version
is picked.
A gem5 resource schema now has additional fields.
These are:
- source_url: Stores URL of GitHub Source of the resource.
- license: License information of the resource.
- tags: Words to identify a resource better, like hello for hello-world
- example_usage: How to use the resource in a simulation.
- gem5_versions: List of gem5 versions that resource is compatible with.
- resource_version: The version of the resource itself.
- size: The download size of the resource, if it exists.
- code_examples: List of objects.
These objects contain the path to where a resource is
used in gem5 example config scripts,
and if the resource itself is used in tests or not.
- category: Category of the resource, as defined by classes in
src/python/gem5/resources/resource.py.
Some fields have been renamed:
- "name" is changed to "id"
- "documentation" is changed to "description"
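
As an illustration of the two renames, an entry in the old layout could
be converted as sketched below. The helper name and the sample entry are
hypothetical; only the "name" to "id" and "documentation" to
"description" mappings come from this patch.

```python
RENAMED_FIELDS = {"name": "id", "documentation": "description"}


def migrate_entry(old_entry):
    """Return a copy of the entry with renamed keys; other keys are kept."""
    return {RENAMED_FIELDS.get(key, key): value for key, value in old_entry.items()}


# Made-up entry in the old layout, for illustration only.
old = {
    "name": "x86-ubuntu-boot",
    "documentation": "An example resource description.",
    "architecture": "X86",
}
new = migrate_entry(old)
print(sorted(new))
```

The new fields listed earlier (for example resource_version or
gem5_versions) would still need to be filled in separately.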
Besides these, the schema also supports resource specialization.
It adds fields relevant to a specific resource as specified in
src/python/gem5/resources/resource.py
These changes have been made to better present
information on the new gem5 Resources website.
However, they do not affect the way resources are used by a gem5 user.
This patch is also backwards compatible.
Existing code doesn't break with this new infrastructure.
Also, refs in the tests have been changed to match this new schema.
Tests have been changed to work with the two clients.
Change-Id: Ia9bf47f7900763827fd5e873bcd663cc3ecdba40
Co-authored-by: Kunal Pai <kunpai@ucdavis.edu>
Co-authored-by: Parth Shah <helloparthshah@gmail.com>
Co-authored-by: Harshil Patel <harshilp2107@gmail.com>
Co-authored-by: aarsli <arsli@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70858
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
According to the GIC specification (IHI0069), reserved addresses in the
GIC memory map are treated as RES0. We allow this behaviour to be
disabled so that gem5 panics instead (reserved_res0 = False, which is
what we have been doing so far) to catch development bugs (in gem5 and
in the guest SW).
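
The two behaviours can be sketched with a toy register file. Only the
`reserved_res0` name comes from this patch; the class, the exception
standing in for a panic, and the register offsets are hypothetical.

```python
class ReservedAccessError(RuntimeError):
    """Stands in for a simulator panic in this sketch."""


class ToyGicRegisters:
    def __init__(self, implemented_offsets, reserved_res0=True):
        self.regs = dict.fromkeys(implemented_offsets, 0)
        self.reserved_res0 = reserved_res0

    def _reserved(self, offset, kind):
        if not self.reserved_res0:
            raise ReservedAccessError(f"{kind} of reserved offset {offset:#x}")
        return 0  # RES0: reads return zero, writes are ignored

    def read(self, offset):
        if offset in self.regs:
            return self.regs[offset]
        return self._reserved(offset, "read")

    def write(self, offset, value):
        if offset in self.regs:
            self.regs[offset] = value
        else:
            self._reserved(offset, "write")


lenient = ToyGicRegisters({0x0, 0x4}, reserved_res0=True)
lenient.write(0x100, 0xFFFF)  # silently ignored
print(lenient.read(0x100))    # reads as zero

strict = ToyGicRegisters({0x0, 0x4}, reserved_res0=False)
try:
    strict.read(0x100)
    strict_raised = False
except ReservedAccessError:
    strict_raised = True
```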
Change-Id: I23f98519c2f256c092a52425735b8792bae7a2c7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71138
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
There are three bugs fixed in this patch:
1. The `dram_3_dir` was missing the "dramsim3" directory.
2. Missing `not` when checking if configs is a directory.
3. Missing `not` when checking if input file is a file.
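
The corrected checks have roughly the shape sketched below. The function
name and the "DRAMsim3"/"configs" path components are assumptions made
for this example; only the "dramsim3" directory and the two negated
checks are taken from the list above.

```python
import os
import tempfile


def find_dramsim3_input(ext_dir, config_name):
    # Fix 1: the path must include the "dramsim3" directory.
    dram_3_dir = os.path.join(ext_dir, "dramsim3", "DRAMsim3")

    # Fix 2: fail when configs is NOT a directory.
    configs = os.path.join(dram_3_dir, "configs")
    if not os.path.isdir(configs):
        raise NotADirectoryError(f"No configs directory at {configs}")

    # Fix 3: fail when the input file is NOT a file.
    input_file = os.path.join(configs, config_name)
    if not os.path.isfile(input_file):
        raise FileNotFoundError(f"No input file at {input_file}")
    return input_file


with tempfile.TemporaryDirectory() as ext_dir:
    configs = os.path.join(ext_dir, "dramsim3", "DRAMsim3", "configs")
    os.makedirs(configs)
    open(os.path.join(configs, "example.ini"), "w").close()

    found = find_dramsim3_input(ext_dir, "example.ini")
    try:
        find_dramsim3_input(ext_dir, "missing.ini")
        missing_detected = False
    except FileNotFoundError:
        missing_detected = True
```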
Change-Id: I185f4832c1c2f1ecc4e138c148ad7969ef9b6fd4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71038
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
We have some customized protocols in the gem5 repository and they require
the include path from the src directory. As a result, users of those
protocols need to handle the include path correctly by themselves. This
is tedious and fragile. We should add the default include path to the
SIMGEN command line to prevent issues.
Change-Id: I2a3748646567635d131a8fb4099e02e332691e97
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71118
Reviewed-by: Wei-Han Chen <weihanchen@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
This change updates the HBMCtrl such that both pseudo channels
can be in separate states (read or write) at the same time. In
addition, the controller queues are now always split in two
halves for both pseudo channels.
Change-Id: Ifb599e611ad99f6c511baaf245bad2b5c9210a86
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/65491
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This change adds a single DDR5 memory interface.
A DDR5 DIMM contains two physical channels. Therefore,
two instances of this interface should be used to model
a DDR5 DIMM. The configuration includes 3 different speed
bin models. The configuration is tested with different
types of memory traffic using the traffic generator and shows
performance similar to what is observed in existing
literature [1]. One of the key features of DDR5,
"same bank refresh", is not yet supported in gem5, but is
expected to improve the performance of the DDR5 model.
[1] Exploration of DDR5 with the Open-Source Simulator DRAMSys.
Change-Id: I5856a10c8dcd92dbecc7fd4dcea0f674b2412dd7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/68257
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch changes the RISCVMatched Cache Hierarchy to
private L1 shared L2.
It also changes the RISCVMatched Core's parameters to
better match hardware performance.
Also, sizes are changed to MiB or KiB instead of MB
or KB, to match the datasheet.
All the changes that deviate from the datasheet and the
ARM HPI CPU (reference for pipeline parameters)
are documented.
The core parameters that are changed are:
- threadPolicy:
This is initialized to "SingleThreaded".
- decodeToExecuteForwardDelay:
This is changed from 1 to 2 to avoid a PMC address fault.
- fetch1ToFetch2BackwardDelay:
This is changed from 1 to 0 to better match hardware performance.
- fetch2InputBufferSize:
This is changed from 2 to 1 to better match hardware performance.
- decodeInputBufferSize:
This is changed from 3 to 2 to better match hardware performance.
- decodeToExecuteForwardDelay:
This is changed from 2 to 1 to better match hardware performance.
- executeInputBufferSize:
This is changed from 7 to 4 to better match hardware performance.
- executeMaxAccessesInMemory:
This is changed from 2 to 1 to better match hardware performance.
- executeLSQStoreBufferSize:
This is changed from 5 to 3 to better match hardware performance.
- executeBranchDelay:
This is changed from 1 to 2 to better match hardware performance.
- enableIdling:
This is changed to False to better match hardware performance.
- MemReadFU: changed to 2 cycles from 3 cycles.
The changes in the branch predictor are:
- BTBEntries:
This is changed from 16 entries to 32 entries.
- RASSize:
This is changed from 6 entries to 12 entries.
- IndirectSets:
This is changed from 8 sets to 16 sets.
- localPredictorSize:
This is changed from 8192 to 16384.
- globalPredictorSize:
This is changed from 8192 to 16384.
- choicePredictorSize:
This is changed from 8192 to 16384.
- localCtrBits:
This is changed from 2 to 4.
- globalCtrBits:
This is changed from 2 to 4.
- choiceCtrBits:
This is changed from 2 to 4.
Change-Id: I4235140f33be6a3b529a819ae6a7223cb88bb7ab
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70798
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Add support for the Arm SVE Integer Matrix Multiply-Accumulate
(SMMLA, USMMLA, UMMLA) instructions. Because the associated SUDOT and
USDOT instructions have not yet been implemented, the SVE Feature ID
register 0 (ID_AA64ZFR0_EL1) has not yet been updated to indicate
support for SVE Int8 matrix multiplication instructions at this time.
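
For readers unfamiliar with these instructions, the per-segment
arithmetic of SMMLA can be sketched as follows: within each 128-bit
segment, a 2x8 matrix of signed 8-bit values is multiplied by the
transpose of a second 2x8 matrix and accumulated into a 2x2 matrix of
32-bit values. This is a reference model written for illustration, not
the gem5 implementation, and the function names are hypothetical.

```python
def wrap_int32(value):
    """Wrap an integer to signed 32-bit two's complement."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def smmla_segment(acc, a, b):
    """One 128-bit segment of a signed integer matrix multiply-accumulate.

    acc: 4 int32 values (2x2, row-major)
    a:   16 int8 values (2x8, row-major)
    b:   16 int8 values (2x8, row-major, used transposed)
    """
    result = list(acc)
    for i in range(2):
        for j in range(2):
            total = acc[2 * i + j]
            for k in range(8):
                total += a[8 * i + k] * b[8 * j + k]
            result[2 * i + j] = wrap_int32(total)
    return result


a = [1] * 8 + [2] * 8
b = [1] * 8 + [-1] * 8
print(smmla_segment([0, 10, 0, 0], a, b))
```

In this model, UMMLA and USMMLA would differ only in whether the 8-bit
inputs are interpreted as unsigned or signed.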
For more information please refer to the "ARM Architecture Reference
Manual Supplement - The Scalable Vector Extension (SVE), for ARMv8-A"
(https://developer.arm.com/architectures/cpu-architecture/a-profile/
docs/arm-architecture-reference-manual-supplement-armv8-a)
Additional Contributors: Giacomo Travaglini
Change-Id: Ia50e28fae03634cbe04b42a9900bab65a604817f
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70730
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Sets the appropriate bit in the ID_AA64ZFR0_EL1 sysreg that declares
support for ARMv8.2-F64MM.
This indicates that all pre-requisites for Armv8.2 SVE FP64
double-precision floating-point matrix multiplication instructions
have been met.
The FMMLA and LD1RO* instructions have been implemented, as well as the
128-bit element variants of TRN1, TRN2, UZP1, UZP2, ZIP1, and ZIP2.
For more information please refer to the "ARM Architecture Reference
Manual Supplement - The Scalable Vector Extension (SVE), for ARMv8-A"
(https://developer.arm.com/architectures/cpu-architecture/a-profile/
docs/arm-architecture-reference-manual-supplement-armv8-a)
Additional Contributors: Giacomo Travaglini
Change-Id: Idac3a3ca590e6eb2beb217a40a8c10af1e917440
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70729
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Add support for the Arm SVE Load-Broadcast Octaword (LD1RO{B,H,W,D})
instructions. These are similar to the Load-Broadcast
Quadword (LD1RQ{B,H,W,D}) instructions, but work on a 32-byte memory
segment rather than a 16-byte memory segment. Consequently, the LD1ROx
implementations build on the code for the LD1RQx implementations.
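
The relationship between the two instruction families can be sketched
with one shared helper that differs only in segment size. This is a
simplified model: predication is ignored and the vector length is
assumed to be a whole multiple of the segment size. All names here are
hypothetical.

```python
def load_broadcast(memory, addr, segment_bytes, vl_bytes):
    """Load one segment from memory and replicate it to fill the vector."""
    if vl_bytes % segment_bytes != 0:
        raise ValueError("sketch assumes VL is a multiple of the segment size")
    segment = bytes(memory[addr : addr + segment_bytes])
    return segment * (vl_bytes // segment_bytes)


def ld1rq(memory, addr, vl_bytes):
    """Quadword variant: 16-byte segment."""
    return load_broadcast(memory, addr, 16, vl_bytes)


def ld1ro(memory, addr, vl_bytes):
    """Octaword variant: same logic with a 32-byte segment."""
    return load_broadcast(memory, addr, 32, vl_bytes)


memory = bytes(range(64))
vector = ld1ro(memory, 0, 64)  # 512-bit vector holds two copies
print(vector[:4], vector[32:36])
```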
For more information please refer to the "ARM Architecture Reference
Manual Supplement - The Scalable Vector Extension (SVE), for ARMv8-A"
(https://developer.arm.com/architectures/cpu-architecture/a-profile/
docs/arm-architecture-reference-manual-supplement-armv8-a)
Change-Id: I98ee4f56c8099bf40c9034baa488d318ae57d3aa
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70727
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add support for the Arm SVE Floating Point Matrix Multiply-Accumulate
(FMMLA) instruction. Both 32-bit element (single precision) and 64-bit
element (double precision) encodings are implemented, but because the
associated required instructions (LD1RO*, etc) have not yet been
implemented, the SVE Feature ID register 0 (ID_AA64ZFR0_EL1) has only
been updated to indicate 32-bit element support at this time.
For more information please refer to the "ARM Architecture Reference
Manual Supplement - The Scalable Vector Extension (SVE), for ARMv8-A"
(https://developer.arm.com/architectures/cpu-architecture/a-profile/
docs/arm-architecture-reference-manual-supplement-armv8-a)
Additional Contributors: Giacomo Travaglini
Change-Id: If3547378ffa48527fe540767399bcc37a5dab524
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70726
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>