Different GPUFS disk images have different root partitions that Linux
needs to boot from. In particular, Ubuntu's new installer has a GRUB
partition that cannot seem to be removed. Adding this as an option
prevents needing to edit a config script to change one character each
time a different disk image is used.
Change-Id: Iac2996ea096047281891a70aa2901401ac9746fc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71918
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Vega adds multiple new D16 instructions which load a byte or short into
the lower or upper 16 bits of a register for packed math. The decoder
table has subDecode tables for FLAT instructions which represents 32
opcodes in each subDecode table. The subDecode table for opcodes 32-63
is missing so it is added here.
The opcode for V_SWAP_B32 is also off by one- In the ISA manual this
instruction is opcode 81, the instruction before is 79, and there is no
opcode 80, so the decoder entry is swapped with the invalid decoding
below it.
Change-Id: I278fea574ea684ccc6302d5b4d0f5dd8813a88ad
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71899
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The PCI read/write functions are atomic functions in gem5, meaning they
expect a response with a latency value on the same simulation Tick. For
reads to a PCI device, the response must also include a data value read
from the device.
The AMDGPU device has a PCI BAR which mirrors the frame buffer memory.
Currently reads are done atomically, but writes are sent to a DMA device
without waiting for a write completion ACK. As a result, it is possible
that writes can be queued in the DMA device long enough that another
read for a queued address arrives. This happens very deterministically
with the AtomicSimpleCPU and causes GPUFS to break with that CPU.
This change makes writes to the frame BAR atomic the same as reads. This
avoids that problem and as a result the AtomicSimpleCPU can now load the
driver for GPUFS simulations.
Change-Id: I9a8e8b172712c78b667ebcec81a0c5d0060234db
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71898
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
This adds a job level timeout for the compiler tests, allows
them to run weekly instead of daily, adds a workflow
dispatch option, and changes the
'latests-compilers-all-gem5-builds' jobs to run only the .opt
variant. It also adds a ready for review option
to the CI tests to run when someone converts a draft
pull request.
Change-Id: Id32b5f7da90745d18de2e550bd48d32ba45fb4d9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71725
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
TracingExtension contains a stack recording the port names
passed through of the Packet. The target receiving the Packet
can dump out the whole path of this Packet for the debug purpose.
This mechanism can be enabled with the debug flag PortTrace.
Change-Id: Ic11e708b35fdddc4f4b786d91b35fd4def08948c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71538
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
* According to the manual, load reservations must be cleared on a
failed or a successful SC attempt.
* A load reservation can be arbitrarily large. The current
implementation was reserving something different than cacheBlockSize
which could lead to problems if snoop addresses are cache block
aligned. This patch implementation assumes a cacheBlock granularity.
* Load reservations should also be cleared on faults
Change-Id: I64513534710b5f269260fcb204f717801913e2f5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71520
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
This patch includes several changes to the gem5 tools interface to the
gem5-resources infrastructure. These are:
* The old download and JSON query functions have been removed from the
downloader module. These functions were used for directly downloading
and inspecting the resource JSON file, hosted at
https://resources.gem5.org/resources. This information is now obtained
via `gem5.client`. If a resources JSON file is specified as a client,
it should conform to the new schema:
https//resources.gem5.org/gem5-resources-schema.json. The old schema
(pre-v23.0) is no longer valid. Tests have been updated to reflect
this change. Those which tested these old functions have been removed.
* Unused imports have been removed.
* For the resource query functions, and those tasked with obtaining the
resources, the parameter `gem5_version` has been added. In all cases
it does the same thing:
* It will filter results based on compatibility to the
`gem5_version` specified. If no resources are compatible the
latest version of that resource is chosen (though a warning is
thrown).
* By default it is set to the current gem5 version.
* It is optional. If `None`, this filtering functionality is not
carried out.
* Tests have been updated to fix the version to “develop” so the
they do not break between versions.
* The `gem5_version` parameters will filter using a logic which will
base compatibility on the specificity of the gem5-version specified in
a resource’s data. If a resource has a compatible gem5-version of
“v18.4” it will be compatible with any minor/hotfix version within the
v18.4 release (this can be seen as matching on “v18.4.*.*”.) Likewise,
if a resource has a compatible gem5-version of “v18.4.1” then it’s
only compatible with the v18.4.1 release but any of it’s hot fix
releases (“v18.4.1.*”).
* The ‘list_resources’ function has been updated to use the
“gem5.client” APIs to get resource information from the clients
(MongoDB or a JSON file). This has been designed to remain backwards
compatible to as much as is possible, though, due to schema changes,
the function does search across all versions of gem5.
* `get_resources` function was added to the `AbstractClient`. This is a
more general function than `get_resource_by_id`. It was
primarily created to handle the `list_resources` update but is a
useful update to the API. The `get_resource_by_id` function has been
altered to function as a wrapped to the `get_resources` function.
* Removed “GEM5_RESOURCE_JSON” code has been removed. This is no longer
used.
* Tests have been cleaned up a little bit to be easier to read.
* Some docstrings have been updated.
Things that are left TODO with this code:
* The client_wrapper/client/abstract_client abstractions are rather
pointless. In particular the client_wrapper and client classes could
be merged.
* The downloader module no longer does much and should have its
functions merged into other modules.
* With the addition of the `get_resources` function, much of the code in
the `AbstractClient` could be simplified.
Change-Id: I0ce48e88b93a2b9db53d4749861fa0b5f9472053
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71506
Reviewed-by: Kunal Pai <kunpai@ucdavis.edu>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
(cherry picked from commit 82587ce71b)
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71739
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
* According to the manual, load reservations must be cleared on a
failed or a successful SC attempt.
* A load reservation can be arbitrarily large. The current
implementation was reserving something different than cacheBlockSize
which could lead to problems if snoop addresses are cache block
aligned. This patch implementation assumes a cacheBlock granularity.
* Load reservations should also be cleared on faults
Change-Id: I64513534710b5f269260fcb204f717801913e2f5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71558
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Roger Chang <rogerycchang@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The cache is modeled after an AMD EPYC cache, but not exactly
like AMD EPYC cache.
- K cores per core complex (CCD), each core has one private split L1,
and one private L2.
- K cores in the same CCD share 1 slice of L3 cache, which is not
a victim cache.
- There can be multiple CCDs, which communicate with each other via
Cross-CCD router. The Cross-CCD rounter is also connected to
directory controllers and dma controllers.
- All links latency are set to 1.
Change-Id: Ib64248bed9155b8e48e5158ffdeebf1f2d770754
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71598
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Many of the outstanding issues with the GPU model are related to
instructions not having SDWA/DPP implementations and executing by
ignoring the special registers leading to incorrect executiong.
Adding SDWA/DPP is current very cumbersome as there is a lot of
boilerplate code.
This changeset adds helper methods for VOP2 with one instruction
changed as an example. This review is intended to get feedback
before applying this change to all VOP2 instructions that support
SDWA/DPP.
Change-Id: I1edbc3f3bb166d34f151545aa9f47a94150e1406
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70738
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This patch includes several changes to the gem5 tools interface to the
gem5-resources infrastructure. These are:
* The old download and JSON query functions have been removed from the
downloader module. These functions were used for directly downloading
and inspecting the resource JSON file, hosted at
https://resources.gem5.org/resources. This information is now obtained
via `gem5.client`. If a resources JSON file is specified as a client,
it should conform to the new schema:
https//resources.gem5.org/gem5-resources-schema.json. The old schema
(pre-v23.0) is no longer valid. Tests have been updated to reflect
this change. Those which tested these old functions have been removed.
* Unused imports have been removed.
* For the resource query functions, and those tasked with obtaining the
resources, the parameter `gem5_version` has been added. In all cases
it does the same thing:
* It will filter results based on compatibility to the
`gem5_version` specified. If no resources are compatible the
latest version of that resource is chosen (though a warning is
thrown).
* By default it is set to the current gem5 version.
* It is optional. If `None`, this filtering functionality is not
carried out.
* Tests have been updated to fix the version to “develop” so the
they do not break between versions.
* The `gem5_version` parameters will filter using a logic which will
base compatibility on the specificity of the gem5-version specified in
a resource’s data. If a resource has a compatible gem5-version of
“v18.4” it will be compatible with any minor/hotfix version within the
v18.4 release (this can be seen as matching on “v18.4.*.*”.) Likewise,
if a resource has a compatible gem5-version of “v18.4.1” then it’s
only compatible with the v18.4.1 release but any of it’s hot fix
releases (“v18.4.1.*”).
* The ‘list_resources’ function has been updated to use the
“gem5.client” APIs to get resource information from the clients
(MongoDB or a JSON file). This has been designed to remain backwards
compatible to as much as is possible, though, due to schema changes,
the function does search across all versions of gem5.
* `get_resources` function was added to the `AbstractClient`. This is a
more general function than `get_resource_by_id`. It was
primarily created to handle the `list_resources` update but is a
useful update to the API. The `get_resource_by_id` function has been
altered to function as a wrapped to the `get_resources` function.
* Removed “GEM5_RESOURCE_JSON” code has been removed. This is no longer
used.
* Tests have been cleaned up a little bit to be easier to read.
* Some docstrings have been updated.
Things that are left TODO with this code:
* The client_wrapper/client/abstract_client abstractions are rather
pointless. In particular the client_wrapper and client classes could
be merged.
* The downloader module no longer does much and should have its
functions merged into other modules.
* With the addition of the `get_resources` function, much of the code in
the `AbstractClient` could be simplified.
Change-Id: I0ce48e88b93a2b9db53d4749861fa0b5f9472053
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71506
Reviewed-by: Kunal Pai <kunpai@ucdavis.edu>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The uint_fast16_t is the integer at least 16 bits size, it can be
32, 64 bits and more. Usually most of the simulations are in the
x86-64 linux host, the size of uint_fast16_t is 64 bits. Therefore,
there is no problem for double precision float operations and it can
pass FloatMM test. However, in the Mac OS, the size of uint_fast16_t
is 16 bits, it will lose the upper bits when converting float
register bits to freg_t and it will generate unexpected results for
FloatMM test.
The change can guarantee that the size of data in freg_t is at least
64 bits and it will not lose any data from floating point to freg_t.
Reference:
https://developer.apple.com/documentation/kernel/uint_fast16_thttps://codebrowser.dev/glibc/glibc/stdlib/stdint.h.html
Change-Id: I3df6610f0903cdee0f56584d6cbdb51ac26c86c8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71519
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
The uint_fast16_t is the integer at least 16 bits size, it can be
32, 64 bits and more. Usually most of the simulations are in the
x86-64 linux host, the size of uint_fast16_t is 64 bits. Therefore,
there is no problem for double precision float operations and it can
pass FloatMM test. However, in the Mac OS, the size of uint_fast16_t
is 16 bits, it will lose the upper bits when converting float
register bits to freg_t and it will generate unexpected results for
FloatMM test.
The change can guarantee that the size of data in freg_t is at least
64 bits and it will not lose any data from floating point to freg_t.
Reference:
https://developer.apple.com/documentation/kernel/uint_fast16_thttps://codebrowser.dev/glibc/glibc/stdlib/stdint.h.html
Change-Id: I3df6610f0903cdee0f56584d6cbdb51ac26c86c8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71578
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Currently fmax and fmin instructions convert source float registers such as
Fs1_bits to float64_t(or float32_t and float16_t) many times in the single
instruction. It is not efficient for the future maintenance of these
instructions.
The change adds non-register float_t intermediate variables fs1 and fs2 to
keep converted results so that we don’t need to do it repeatedly. It also
added an intermediate variable fd for specific float type to assume the upper
bits of the packed float register are all one.
Change-Id: Ic508d5255db6c4b38ca4df6dd805df440c043fff
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71479
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Add two kernel dispatch-based exit events that are useful for limiting
the simulation and enabling debug flags at specific GPU kernels. Since
the KVM CPU typically used with GPUFS is not deterministic, this help
with enabling debug flags when the Tick number may vary. The exit at GPU
kernel option can also limit simulation by only simulating a few hundred
kernels, for example, and exit at a determined point.
Change-Id: I81bae92a80c25fc38c41e999aa662e1417b7a20d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71418
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>