The PCI read/write functions are atomic functions in gem5, meaning they
expect a response with a latency value on the same simulation Tick. For
reads to a PCI device, the response must also include a data value read
from the device.
The AMDGPU device has a PCI BAR which mirrors the frame buffer memory.
Currently reads are done atomically, but writes are sent to a DMA device
without waiting for a write completion ACK. As a result, it is possible
that writes can be queued in the DMA device long enough that another
read for a queued address arrives. This happens very deterministically
with the AtomicSimpleCPU and causes GPUFS to break with that CPU.
This change makes writes to the frame BAR atomic the same as reads. This
avoids that problem and as a result the AtomicSimpleCPU can now load the
driver for GPUFS simulations.
Change-Id: I9a8e8b172712c78b667ebcec81a0c5d0060234db
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71898
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
(cherry picked from commit 079fc47dc2)
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72079
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
The unconditional exit event when a kernel completes that was added in
c644eae2dd is causing scripts that do not
ignore unknown exit events to end simulation prematurely. One such
script is the apu_se.py script used in SE mode GPU simulation. Make this
exit conditional to the parameter being set to a valid value to avoid
this problem.
Change-Id: I1d2c082291fdbcf27390913ffdffb963ec8080dd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72098
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
These types include std::pair, std::tuple, all iterable types and any
composition of these. Convenience hash factory and computation
functions are also provided.
These functions are in the stl_helpers namespace and must not move to
::std which could cause undefined behaviour. This is because
specialization of std templates for std or native types (or
composition of these) is undefined behaviour. This inconvenience can't
be circumvented for generic code. Users are free to bring these hash
implementations to namespace std after specialization for their own
non-std and non-native types.
Change-Id: Ifd0f0b64e5421d5d44890eb25428cc9c53484eb3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67663
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
Prior to this patch, when a memory controller was failing at sending a
response to AbstractController, it would not wakeup until the next
request. This patch gives the opportunity to Ruby models to notify
memory response buffer dequeue so that AbstractController can send a
retry request if necessary.
A dequeueMemRspQueue function has been added AbstractController to
automate the dequeue+notify operation.
Note that models that don't notify AbstractController will continue
working as before.
Change-Id: I261bb4593c126208c98825e54f538638d818d16b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67658
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
When you build Python from scratch, the modules would be separated
shared libraries. They would be dlopen when doing module import. To make
the separated shared libraries can share the symbol in the binary, we
should add -rdynamic when compliing.
Change-Id: I26bf9fd7ea5068fd2d08c8f059b37ff34073e8c2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72040
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Vega adds multiple new D16 instructions which load a byte or short into
the lower or upper 16 bits of a register for packed math. The decoder
table has subDecode tables for FLAT instructions which represents 32
opcodes in each subDecode table. The subDecode table for opcodes 32-63
is missing so it is added here.
The opcode for V_SWAP_B32 is also off by one- In the ISA manual this
instruction is opcode 81, the instruction before is 79, and there is no
opcode 80, so the decoder entry is swapped with the invalid decoding
below it.
Change-Id: I278fea574ea684ccc6302d5b4d0f5dd8813a88ad
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71899
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The PCI read/write functions are atomic functions in gem5, meaning they
expect a response with a latency value on the same simulation Tick. For
reads to a PCI device, the response must also include a data value read
from the device.
The AMDGPU device has a PCI BAR which mirrors the frame buffer memory.
Currently reads are done atomically, but writes are sent to a DMA device
without waiting for a write completion ACK. As a result, it is possible
that writes can be queued in the DMA device long enough that another
read for a queued address arrives. This happens very deterministically
with the AtomicSimpleCPU and causes GPUFS to break with that CPU.
This change makes writes to the frame BAR atomic the same as reads. This
avoids that problem and as a result the AtomicSimpleCPU can now load the
driver for GPUFS simulations.
Change-Id: I9a8e8b172712c78b667ebcec81a0c5d0060234db
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71898
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
TracingExtension contains a stack recording the port names
passed through of the Packet. The target receiving the Packet
can dump out the whole path of this Packet for the debug purpose.
This mechanism can be enabled with the debug flag PortTrace.
Change-Id: Ic11e708b35fdddc4f4b786d91b35fd4def08948c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71538
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
* According to the manual, load reservations must be cleared on a
failed or a successful SC attempt.
* A load reservation can be arbitrarily large. The current
implementation was reserving something different than cacheBlockSize
which could lead to problems if snoop addresses are cache block
aligned. This patch implementation assumes a cacheBlock granularity.
* Load reservations should also be cleared on faults
Change-Id: I64513534710b5f269260fcb204f717801913e2f5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71520
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
This patch includes several changes to the gem5 tools interface to the
gem5-resources infrastructure. These are:
* The old download and JSON query functions have been removed from the
downloader module. These functions were used for directly downloading
and inspecting the resource JSON file, hosted at
https://resources.gem5.org/resources. This information is now obtained
via `gem5.client`. If a resources JSON file is specified as a client,
it should conform to the new schema:
https//resources.gem5.org/gem5-resources-schema.json. The old schema
(pre-v23.0) is no longer valid. Tests have been updated to reflect
this change. Those which tested these old functions have been removed.
* Unused imports have been removed.
* For the resource query functions, and those tasked with obtaining the
resources, the parameter `gem5_version` has been added. In all cases
it does the same thing:
* It will filter results based on compatibility to the
`gem5_version` specified. If no resources are compatible the
latest version of that resource is chosen (though a warning is
thrown).
* By default it is set to the current gem5 version.
* It is optional. If `None`, this filtering functionality is not
carried out.
* Tests have been updated to fix the version to “develop” so the
they do not break between versions.
* The `gem5_version` parameters will filter using a logic which will
base compatibility on the specificity of the gem5-version specified in
a resource’s data. If a resource has a compatible gem5-version of
“v18.4” it will be compatible with any minor/hotfix version within the
v18.4 release (this can be seen as matching on “v18.4.*.*”.) Likewise,
if a resource has a compatible gem5-version of “v18.4.1” then it’s
only compatible with the v18.4.1 release but any of it’s hot fix
releases (“v18.4.1.*”).
* The ‘list_resources’ function has been updated to use the
“gem5.client” APIs to get resource information from the clients
(MongoDB or a JSON file). This has been designed to remain backwards
compatible to as much as is possible, though, due to schema changes,
the function does search across all versions of gem5.
* `get_resources` function was added to the `AbstractClient`. This is a
more general function than `get_resource_by_id`. It was
primarily created to handle the `list_resources` update but is a
useful update to the API. The `get_resource_by_id` function has been
altered to function as a wrapped to the `get_resources` function.
* Removed “GEM5_RESOURCE_JSON” code has been removed. This is no longer
used.
* Tests have been cleaned up a little bit to be easier to read.
* Some docstrings have been updated.
Things that are left TODO with this code:
* The client_wrapper/client/abstract_client abstractions are rather
pointless. In particular the client_wrapper and client classes could
be merged.
* The downloader module no longer does much and should have its
functions merged into other modules.
* With the addition of the `get_resources` function, much of the code in
the `AbstractClient` could be simplified.
Change-Id: I0ce48e88b93a2b9db53d4749861fa0b5f9472053
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71506
Reviewed-by: Kunal Pai <kunpai@ucdavis.edu>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
(cherry picked from commit 82587ce71b)
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71739
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
* According to the manual, load reservations must be cleared on a
failed or a successful SC attempt.
* A load reservation can be arbitrarily large. The current
implementation was reserving something different than cacheBlockSize
which could lead to problems if snoop addresses are cache block
aligned. This patch implementation assumes a cacheBlock granularity.
* Load reservations should also be cleared on faults
Change-Id: I64513534710b5f269260fcb204f717801913e2f5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71558
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Roger Chang <rogerycchang@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The cache is modeled after an AMD EPYC cache, but not exactly
like AMD EPYC cache.
- K cores per core complex (CCD), each core has one private split L1,
and one private L2.
- K cores in the same CCD share 1 slice of L3 cache, which is not
a victim cache.
- There can be multiple CCDs, which communicate with each other via
Cross-CCD router. The Cross-CCD rounter is also connected to
directory controllers and dma controllers.
- All links latency are set to 1.
Change-Id: Ib64248bed9155b8e48e5158ffdeebf1f2d770754
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71598
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Many of the outstanding issues with the GPU model are related to
instructions not having SDWA/DPP implementations and executing by
ignoring the special registers leading to incorrect executiong.
Adding SDWA/DPP is current very cumbersome as there is a lot of
boilerplate code.
This changeset adds helper methods for VOP2 with one instruction
changed as an example. This review is intended to get feedback
before applying this change to all VOP2 instructions that support
SDWA/DPP.
Change-Id: I1edbc3f3bb166d34f151545aa9f47a94150e1406
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70738
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch includes several changes to the gem5 tools interface to the
gem5-resources infrastructure. These are:
* The old download and JSON query functions have been removed from the
downloader module. These functions were used for directly downloading
and inspecting the resource JSON file, hosted at
https://resources.gem5.org/resources. This information is now obtained
via `gem5.client`. If a resources JSON file is specified as a client,
it should conform to the new schema:
https//resources.gem5.org/gem5-resources-schema.json. The old schema
(pre-v23.0) is no longer valid. Tests have been updated to reflect
this change. Those which tested these old functions have been removed.
* Unused imports have been removed.
* For the resource query functions, and those tasked with obtaining the
resources, the parameter `gem5_version` has been added. In all cases
it does the same thing:
* It will filter results based on compatibility to the
`gem5_version` specified. If no resources are compatible the
latest version of that resource is chosen (though a warning is
thrown).
* By default it is set to the current gem5 version.
* It is optional. If `None`, this filtering functionality is not
carried out.
* Tests have been updated to fix the version to “develop” so the
they do not break between versions.
* The `gem5_version` parameters will filter using a logic which will
base compatibility on the specificity of the gem5-version specified in
a resource’s data. If a resource has a compatible gem5-version of
“v18.4” it will be compatible with any minor/hotfix version within the
v18.4 release (this can be seen as matching on “v18.4.*.*”.) Likewise,
if a resource has a compatible gem5-version of “v18.4.1” then it’s
only compatible with the v18.4.1 release but any of it’s hot fix
releases (“v18.4.1.*”).
* The ‘list_resources’ function has been updated to use the
“gem5.client” APIs to get resource information from the clients
(MongoDB or a JSON file). This has been designed to remain backwards
compatible to as much as is possible, though, due to schema changes,
the function does search across all versions of gem5.
* `get_resources` function was added to the `AbstractClient`. This is a
more general function than `get_resource_by_id`. It was
primarily created to handle the `list_resources` update but is a
useful update to the API. The `get_resource_by_id` function has been
altered to function as a wrapped to the `get_resources` function.
* Removed “GEM5_RESOURCE_JSON” code has been removed. This is no longer
used.
* Tests have been cleaned up a little bit to be easier to read.
* Some docstrings have been updated.
Things that are left TODO with this code:
* The client_wrapper/client/abstract_client abstractions are rather
pointless. In particular the client_wrapper and client classes could
be merged.
* The downloader module no longer does much and should have its
functions merged into other modules.
* With the addition of the `get_resources` function, much of the code in
the `AbstractClient` could be simplified.
Change-Id: I0ce48e88b93a2b9db53d4749861fa0b5f9472053
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71506
Reviewed-by: Kunal Pai <kunpai@ucdavis.edu>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The uint_fast16_t is the integer at least 16 bits size, it can be
32, 64 bits and more. Usually most of the simulations are in the
x86-64 linux host, the size of uint_fast16_t is 64 bits. Therefore,
there is no problem for double precision float operations and it can
pass FloatMM test. However, in the Mac OS, the size of uint_fast16_t
is 16 bits, it will lose the upper bits when converting float
register bits to freg_t and it will generate unexpected results for
FloatMM test.
The change can guarantee that the size of data in freg_t is at least
64 bits and it will not lose any data from floating point to freg_t.
Reference:
https://developer.apple.com/documentation/kernel/uint_fast16_thttps://codebrowser.dev/glibc/glibc/stdlib/stdint.h.html
Change-Id: I3df6610f0903cdee0f56584d6cbdb51ac26c86c8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71519
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
The uint_fast16_t is the integer at least 16 bits size, it can be
32, 64 bits and more. Usually most of the simulations are in the
x86-64 linux host, the size of uint_fast16_t is 64 bits. Therefore,
there is no problem for double precision float operations and it can
pass FloatMM test. However, in the Mac OS, the size of uint_fast16_t
is 16 bits, it will lose the upper bits when converting float
register bits to freg_t and it will generate unexpected results for
FloatMM test.
The change can guarantee that the size of data in freg_t is at least
64 bits and it will not lose any data from floating point to freg_t.
Reference:
https://developer.apple.com/documentation/kernel/uint_fast16_thttps://codebrowser.dev/glibc/glibc/stdlib/stdint.h.html
Change-Id: I3df6610f0903cdee0f56584d6cbdb51ac26c86c8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71578
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Currently fmax and fmin instructions convert source float registers such as
Fs1_bits to float64_t(or float32_t and float16_t) many times in the single
instruction. It is not efficient for the future maintenance of these
instructions.
The change adds non-register float_t intermediate variables fs1 and fs2 to
keep converted results so that we don’t need to do it repeatedly. It also
added an intermediate variable fd for specific float type to assume the upper
bits of the packed float register are all one.
Change-Id: Ic508d5255db6c4b38ca4df6dd805df440c043fff
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71479
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add two kernel dispatch-based exit events that are useful for limiting
the simulation and enabling debug flags at specific GPU kernels. Since
the KVM CPU typically used with GPUFS is not deterministic, this help
with enabling debug flags when the Tick number may vary. The exit at GPU
kernel option can also limit simulation by only simulating a few hundred
kernels, for example, and exit at a determined point.
Change-Id: I81bae92a80c25fc38c41e999aa662e1417b7a20d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71418
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
This patch makes changes to the stdlib based on the gem5 Vision project.
Firstly, a MongoDB database is supported.
A JSON database's support is continued.
The JSON can either be a local path or a raw GitHub link.
The data for these databases is stored in src/python
under "gem5-config.json".
This will be used by default.
However, the configuration can be overridden:
- by providing a path using the GEM5_CONFIG env variable.
- by placing a gem5-config.json file in the current working directory.
An AbstractClient is an abstract class that implements
searching and sorting relevant to the databases.
Clients is an optional list that can be passed
while defining any Resource class and obtain_resource.
These databases can be defined in the config JSON.
Resources now have versions. This allows for a
single version, e.g., 'x86-ubuntu-boot', to have
multiple versions. As such, the key of a resource is
its ID and Version (e.g., 'x86-ubuntu-boot/v2.1.0').
Different versions of a resource might be compatible
with different versions of gem5.
By default, it picks the latest version compatible with the gem5 Version
of the user.
A gem5 resource schema now has additional fields.
These are:
- source_url: Stores URL of GitHub Source of the resource.
- license: License information of the resource.
- tags: Words to identify a resource better, like hello for hello-world
- example_usage: How to use the resource in a simulation.
- gem5_versions: List of gem5 versions that resource is compatible with.
- resource_version: The version of the resource itself.
- size: The download size of the resource, if it exists.
- code_examples: List of objects.
These objects contain the path to where a resource is
used in gem5 example config scripts,
and if the resource itself is used in tests or not.
- category: Category of the resource, as defined by classes in
src/python/gem5/resources/resource.py.
Some fields have been renamed:
- "name" is changed to "id"
- "documentation" is changed to "description"
Besides these, the schema also supports resource specialization.
It adds fields relevant to a specific resource as specified in
src/python/gem5/resources/resource.py
These changes have been made to better present
information on the new gem5 Resources website.
But, they do not affect the way resources are used by a gem5 user.
This patch is also backwards compatible.
Existing code doesn't break with this new infrastructure.
Also, refs in the tests have been changed to match this new schema.
Tests have been changed to work with the two clients.
Change-Id: Ia9bf47f7900763827fd5e873bcd663cc3ecdba40
Co-authored-by: Kunal Pai <kunpai@ucdavis.edu>
Co-authored-by: Parth Shah <helloparthshah@gmail.com>
Co-authored-by: Harshil Patel <harshilp2107@gmail.com>
Co-authored-by: aarsli <arsli@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71278
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>