The Vega ISA's s_memtime instruction is used to obtain a cycle value
from the GPU. Previously, this was implemented to obtain the cycle count
when the memtime instruction reached the execute stage of the GPU
pipeline. However, microbenchmarking showed that this underreports the
latency of memtime instructions relative to real hardware. Thus, we
changed its behavior to go through the scalar memory pipeline and obtain
a latency value from the SQC (L1 I$). This mirrors the suggestion of the
AMD Vega ISA manual that s_memtime should be treated like an
s_load_dwordx2.
The default latency was set based on microbenchmarking.
Change-Id: I5e251dde28c06fe1c492aea4abf9f34f05784420
Explicitly convert to float/double to fix compiler warnings that I have
turned on locally. It might make sense to make use of fplib functions to
be portable across different host float formats but something as simple
as comparison against zero should be safe.
Change-Id: I96c6ee7c5497fece11be07234ff80ff86e7555e2
Programming an event ID while counters are disabled is perfectly fine,
so we should just log this using DPRINTF instead of printing a warn()
every time it happens.
Change-Id: Ib9499857271033ef941f74a7f012d8694328eaf3
I'm not entirely sure what the mandated behaviour is according to the
ARM ARM, but I was very confused by the counters continuing to increment
with the old event even when programmed to an event ID that is not
currently supported by gem5. Disconnecting the counter if the event is
not supported is less surprising behaviour IMO.
Change-Id: I927d9339c138dafa1484db1515c2aa09b0a9a0a9
This matches the Arm manual and the output produced by capstone. Also
avoid unnecessary spaces in vsel* instruction printing.
Change-Id: I071dd834b7104f10f6358a6b2e2895bdab64df82
Add declaration of HAFGRTR_EL2 registers and read/write as GPR.
Change-Id: I87570d1e87d479f4530cf2c6e05931cdc26ee361
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This PR implements #1429. It mainly achieves this with the following
changes:
1) The IndexingPolicy is now a templated SimObject so that its APIs work
with different data types.
As an example, getPossibleEntries currently requires an Addr object,
whereas we want to be able to call the method with different keys
depending on the tag.
2) The AssociativeCache extracts type information from the Entry
template parameter.
This means any AssociativeCache entry will have to define the following
types:
KeyType = This is the data type used for lookups (in its simplest case,
it is Addr)
IndexingPolicy = This is the base indexing policy SimObject
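The two changes above can be sketched in a few lines. This is only an
illustration under simplifying assumptions: `ToyIndexingPolicy`, `ToyEntry`
and `ToyCache` are hypothetical names, the cache is direct-mapped, and none
of this is the actual gem5 interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

using Addr = std::uint64_t; // stand-in for gem5's Addr

// Hypothetical policy: maps a key to a set index. Templated on the key
// type so that it is not tied to Addr.
template <typename Key>
struct ToyIndexingPolicy
{
    std::size_t numSets;

    std::size_t
    extractSet(const Key &key) const
    {
        return static_cast<std::size_t>(key) % numSets;
    }
};

// Hypothetical entry: declares the two types the cache needs.
struct ToyEntry
{
    using KeyType = Addr;
    using IndexingPolicy = ToyIndexingPolicy<KeyType>;

    KeyType tag = 0;
    bool valid = false;
};

// The cache never names Addr; it pulls both types out of Entry.
template <typename Entry>
class ToyCache
{
  public:
    using KeyType = typename Entry::KeyType;
    using IndexingPolicy = typename Entry::IndexingPolicy;

    explicit ToyCache(std::size_t num_sets)
        : policy{num_sets}, entries(num_sets)
    {}

    void
    insert(const KeyType &key)
    {
        Entry &e = entries[policy.extractSet(key)];
        e.tag = key;
        e.valid = true;
    }

    bool
    contains(const KeyType &key) const
    {
        const Entry &e = entries[policy.extractSet(key)];
        return e.valid && e.tag == key;
    }

  private:
    IndexingPolicy policy;
    std::vector<Entry> entries;
};

static_assert(std::is_same<ToyCache<ToyEntry>::KeyType, Addr>::value,
              "KeyType comes from the entry");
```

Swapping in an entry with a different KeyType (and a matching policy)
would change the lookup signature without touching `ToyCache` itself.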
As an example, the PR also reworks the TaggedEntry to be
AssociativeCache compliant. This ultimately allows us to remove the weird
overloading of cache querying methods with the secure flag, and to remove
the AssociativeSet, which was providing such a weird interface.
As mentioned in the [base, mem-cache: Rewrite TaggedEntry
code](7ee9790464) commit, further cleanup is needed. TaggedEntry is really
a misleading name, as its sole difference from the CacheEntry (which is
also tagged) is the presence of the secure bit. A better name should be
chosen.
We don't store a pointer to the indexing policy anymore.
Instead, we register a tag extractor callback when we
construct the TaggedEntry
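A minimal sketch of the callback idea, assuming a `std::function` is an
acceptable callback type. `TagExtractor`, `ToyCallbackEntry` and the
shift-by-6 extractor used in the usage note are hypothetical and are not
taken from the gem5 sources.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

using Addr = std::uint64_t; // stand-in for gem5's Addr

// Hypothetical callback type: turns a full address into a tag.
using TagExtractor = std::function<Addr(Addr)>;

// Hypothetical entry that owns a callback instead of a pointer to the
// indexing policy.
class ToyCallbackEntry
{
  public:
    explicit ToyCallbackEntry(TagExtractor ext)
        : extractTag(std::move(ext))
    {}

    void
    insert(Addr addr)
    {
        tag = extractTag(addr);
        valid = true;
    }

    bool
    match(Addr addr) const
    {
        return valid && tag == extractTag(addr);
    }

  private:
    TagExtractor extractTag;
    Addr tag = 0;
    bool valid = false;
};
```

An entry built as `ToyCallbackEntry e([](Addr a) { return a >> 6; });`
treats every address within the same 64-byte block as the same tag, without
the entry knowing anything about the policy that supplied the lambda.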
Change-Id: I79dbc1bc5c5ce90d350e83451f513c05da9f0d61
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
We don't store a pointer to the indexing policy anymore.
Instead, we register a tag extractor callback when we
construct the CacheEntry
Change-Id: I06dc58e2f67e01f3f9bcd9f0c641505d3aec82ff
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
As detailed by a previous commit, AssociativeSet is not needed anymore.
The class is effectively the same as AssociativeCache
Change-Id: I24bfb98fbf0826c0a2ea6ede585576286f093318
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The only difference between the TaggedEntry and the newly defined
CacheEntry is the presence of the secure flag in the first case. The
need to tag a cache entry according to the security bit required the
overloading of the matching methods in the TaggedEntry class to take
security into account (see matchTag [1]), and the persistence after
PR #745 of the AssociativeSet class, which is basically identical
to its AssociativeCache superclass, except that it overrides a virtual
method to match the tag according to the secure bit as well.
The introduction of the KeyType parameter in the previous commit
will smooth out the differences and help unify the interface.
Rather than overloading and overriding to account for a different
signature, we embody the difference in the KeyType class. A
CacheEntry will match with KeyType = Addr,
whereas a TaggedEntry will use the following lookup type proposed in this
patch:
    struct KeyType {
        Addr address;
        bool secure;
    };
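
To illustrate how the difference ends up in KeyType rather than in
overloads, here is a small sketch. `ToyPlainEntry`, `ToySecureEntry` and
`lookup` are hypothetical names used only for this example; the real
classes carry much more state.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t; // stand-in for gem5's Addr

// Plain entry: the lookup key is just the address.
struct ToyPlainEntry
{
    using KeyType = Addr;

    Addr tag = 0;
    bool valid = false;

    bool
    match(const KeyType &key) const
    {
        return valid && tag == key;
    }
};

// Secure-aware entry: the lookup key bundles address and secure bit.
struct ToySecureEntry
{
    struct KeyType
    {
        Addr address;
        bool secure;
    };

    Addr tag = 0;
    bool secure = false;
    bool valid = false;

    bool
    match(const KeyType &key) const
    {
        return valid && tag == key.address && secure == key.secure;
    }
};

// One lookup routine serves both entry kinds; only KeyType differs.
template <typename Entry>
bool
lookup(const Entry &entry, const typename Entry::KeyType &key)
{
    return entry.match(key);
}
```

With this shape, `lookup(plain, 0x80)` and
`lookup(secure, ToySecureEntry::KeyType{0x80, true})` go through the same
template, and no method needs a secure-flag overload.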
This patch is partly reverting the changes in #745 which were
reimplementing TaggedEntry on top of the CacheEntry. Instead
we keep them separate as the plan is to allow different
entry types with templatization rather than polymorphism.
As a final note, I believe a separate commit will have to
change the naming of our entries; the CacheEntry should
probably be renamed to TaggedEntry and the current TaggedEntry
to something that reflects the presence of the security bit
alongside the traditional address tag.
[1]: https://github.com/gem5/gem5/blob/stable/\
src/mem/cache/tags/tagged_entry.hh#L81
Change-Id: Ifc104c8d0c1d64509f612d87b80d442e0764f7ca
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
As long as the AssociativeCache Entry parameter satisfies the
interface it should be fine. We enforce the bare minimum of having
a replaceable entry.
Doing otherwise will restrict our capability to have a generic cache
with generic tags
Change-Id: I23e32b7540fea6b6e5894aca3d91538e81214932
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The KeyType data type is the type of the lookup and the cache extracts
it from the Entry template parameter
Change-Id: I147d7c2503abc11becfeebe6336e7f90989ad4e8
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This commit is making the AssociativeCache indexing policy
a type extracted from the Entry template parameter
Change-Id: Ic9fb6ccb1b3549aaa250901e91ae3c300b92103e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Exposing the tag of a cache entry through the associative
cache APIs makes it hard to generalize the cache for
structured tags. Ultimately the tag should be a property
of the cache entry and any tag extraction logic (if needed)
should reside there. In this way we can reuse the associative
cache for different Entry params, each one bearing a different
representation of a tag.
Change-Id: I51b4526be64683614e01d763b1656e5be23a611b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Some compilers (gcc version 12.3.0) start complaining when perfectly
forwarding the StrideEntry argument constructed with an extra parameter
(see later patches).
Using a pointer seems to work around the gcc bug.
The commit also changes the signature of findTable and allocateContext
so that a reference rather than a pointer is returned. In this way we
don't deal with the hack of returning a raw ptr from a unique_ptr.
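The signature change can be sketched as follows. `ToyTable` and
`ToyPrefetcher` are hypothetical stand-ins; only the method names
findTable and allocateContext come from the message above, and their
real signatures may differ.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-in for a per-context table.
using ToyTable = std::vector<int>;

class ToyPrefetcher
{
  public:
    // Returns a reference: the caller never sees the owning pointer, so
    // nothing has to call .get() on a unique_ptr.
    ToyTable &
    findTable(int context)
    {
        auto it = tables.find(context);
        if (it != tables.end())
            return *it->second;
        return allocateContext(context);
    }

  private:
    ToyTable &
    allocateContext(int context)
    {
        auto ins = tables.emplace(context, std::make_unique<ToyTable>());
        return *ins.first->second;
    }

    // Ownership stays inside the prefetcher.
    std::map<int, std::unique_ptr<ToyTable>> tables;
};
```

Returning a reference also documents that the table always exists after
the call, which a nullable raw pointer would not.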
Change-Id: Idd451208aae80bbfae76110c859e93084bcb2635
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The code is already assuming a fully associative cache. Rather than
calling getPossibleEntries with a random value and therefore needlessly
passing a vector of pointers, we use the AssociativeCache iterator to
loop over the cache entries
Change-Id: Ic99cbd39ee9f12eef9091d9d62ca24d0c3e61300
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
- Updated the search query so that resources that are not compatible
with the gem5 version are still downloaded and used, with a warning
issued instead of an error.
`get_default_kernel_root_val()` now prioritizes the explicit
disk_device passed by the user over the default implemented by the
board.
Also adjusts syntax for selecting this value in
`set_kernel_disk_workload()` for consistency.
It seems that the common use case for setting `disk_device` is that
there is a mismatch between where the disk image is mounted and where
the board expects it by default. In this case, it also seems common that
the root partition will be on this explicit device as well.
In cases where this is not true, explicit kernel arguments can be used
to define the distinct disk device apart from the root. However, this
seems less common than the above so, in that case, it would be easier to
tie these together.
This PR fixes the issues with the implementation of the Best Offset
Prefetcher described in issue #1402
---------
Co-authored-by: Setu Gupta <setu.gupta.2020@gamil.com>
Co-authored-by: Abhishek Shailendra Singh <abs218@leigh.edu>
Co-authored-by: Setu Gupta <setu.gupta@partner.samsung.com>
Without specifying the "gem5/gpu" directory, this test attempted to run
the entire test suite. This caused the daily and weekly tests to fail.
This change fixes this.
After merging the old personal gem5 repository with the stable version
v24, I tried to run the project inside the `.devcontainer` environment.
During the image build process, I encountered the following error:
```sh
[7683 ms] Start: Run in container: /bin/sh -c ./.devcontainer/on-create.sh
fatal: detected dubious ownership in repository at '/workspaces/gem5'
To add an exception for this directory, call:
git config --global --add safe.directory /workspaces/gem5
[7724 ms] onCreateCommand failed with exit code 128. Skipping any further user-provided commands.
```
This error occurred due to an ownership permission problem, which I
resolved by adding the following line.
* Removes the "docker-compose.yaml" in favor of "docker-bake.hcl". This
uses the `docker buildx` tool which has the advantage of enabling
multi-platform builds where desired. By default all images are built
targeting `linux/arm64`, `linux/amd64` and `linux/riscv64`, with the
exception of the GPU images where only `linux/amd64` makes sense.
* Remove unused/older Docker build targets (these can easily be re-added
but they were not regularly built or have any current usage).
* Update "README.md" to better describe these Dockerfiles and how they
are built.
* Simplify GCC and Clang compiler images. Each uses the Ubuntu 24.04 All
Deps image as a base then specializes the compiler on top.
* To simplify things, all compiler versions are built from 24.04. This
means **narrowing the supported versions from GCC v10 to v14 and Clang
v14 to v18**.
* Fix some bugs in the "docker-bake.hcl" thus ensuring all targets may
be built from it.
* Clean up the systemc and sst images: reducing their size and building
them off the common 24.04 ubuntu base image.
A new host tag `gcn_gpu` has been added. This allows for selection of
those GPU tests which depend upon the gcn-gpu docker image to run.
In addition to this, the square GPU tests have been moved to the CI
tests. This ensures some GPU code is compiled and run on every PR.
Since PR #1316, we use sign-extend for all address generation, including
PC, to match the ISA specification for modifiable XLEN. However, when we
set a breakpoint using remote GDB, our address is not sign-extended.
This causes the breakpoint to be set at the wrong address, as specified
in Issue #1463. This PR fixes the issue by sign-extending the address
when setting a breakpoint. This also matches the RISC-V ISA
Specification's requirement that implementations "must sign-extend
results to fill the entire widest supported XLEN in the destination
register."
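
As an illustration of why the missing sign extension moves the breakpoint,
consider a 32-bit address stored in a 64-bit field. The helper below is a
hypothetical sketch, not the function used in the fix.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: sign-extend the low `bits` bits of `value` to
// 64 bits. Assumes 1 <= bits <= 64.
inline std::uint64_t
signExtend(std::uint64_t value, unsigned bits)
{
    if (bits >= 64)
        return value;
    const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
    const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
    value &= mask;
    // Flip the sign bit, then subtract it back: values with the sign bit
    // set wrap around and come out with all upper bits set.
    return (value ^ sign) - sign;
}
```

With XLEN = 32, a breakpoint requested at 0x80000000 has to be recorded as
0xFFFFFFFF80000000 to compare equal to a sign-extended PC; recorded as
0x0000000080000000 it would never match.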
Change-Id: I9b493bf8ad5b1ef45a9728bb40fc5e38250fe9c3
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
`get_default_kernel_root_val()` now prioritizes
the explicit disk_device passed by the user over the
default implemented by the board.
Also adjusts syntax for selecting this value in
`set_kernel_disk_workload()` for consistency.
Change-Id: Icddcf438f5b96c2288c3cc608782f191df2c394e
After a lot of debugging and comparing traces I noticed that vrintp was
giving different results from QEMU. An input of 0x3f800000 (1.0) was
being passed to the fplib helpers as (uint32_t)1 which has a completely
different floating-point interpretation and the result was therefore
completely wrong.
I've fixed this as well as all remaining implicit float-to-int
conversions in the ARM instruction execution. There are more
-W(implicit-)float-conversion warnings in the other executors, but for
now this fixes the issue I was seeing.
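
The difference between the two conversions can be shown in isolation. This
sketch assumes the host `float` is a 32-bit IEEE 754 value; `asValue` and
`asBits` are hypothetical names, not fplib functions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value conversion: 1.0f becomes the integer 1. This is what an implicit
// float-to-int conversion does.
inline std::uint32_t
asValue(float f)
{
    return static_cast<std::uint32_t>(f);
}

// Bit reinterpretation: 1.0f becomes its encoding, 0x3f800000. This is
// what a helper operating on raw floating-point bits expects.
inline std::uint32_t
asBits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}
```

Read back as a float encoding, the integer 1 is a tiny denormal rather than
1.0, which is consistent with the completely wrong results described above.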
Change-Id: Ifdeee745ca155d7f4504ac4c54235ac431acdeb9
This PR fixes the issues mentioned in #1448.
**Note that this contribution is the result of a joint collaboration
with @AbhishekUoR**
This PR introduces the following 4 changes:
1. It changes the addresses which are used to compute the stride to
cache line aligned addresses (the current version uses word aligned
addresses)
2. It correctly returns if the stride does not match (as opposed to
issuing prefetches using the new stride incorrectly)
3. It returns if the new stride is 0, indicating multiple reads from the
same cache line.
4. It removes code which is no longer necessary after the addition of
changes number 1 and 3.
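
Changes 1 to 3 can be sketched as below. This is a simplified model under
stated assumptions: `ToyStrideEntry` and `observe` are hypothetical, the
block size must be a power of two, confidence counters are omitted, and
retraining on a mismatch is an assumption of the sketch rather than a
description of the actual patch.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t; // stand-in for gem5's Addr

struct ToyStrideEntry
{
    Addr lastLineAddr = 0;
    std::int64_t stride = 0;
};

// Returns true if a prefetch should be issued and writes its address to
// pf_addr; otherwise leaves pf_addr untouched.
inline bool
observe(ToyStrideEntry &entry, Addr addr, Addr blk_size, Addr &pf_addr)
{
    // Change 1: compute strides on cache-line-aligned addresses.
    const Addr line_addr = addr & ~(blk_size - 1);
    const std::int64_t new_stride =
        static_cast<std::int64_t>(line_addr - entry.lastLineAddr);

    // Change 3: a zero stride means another access to the same line.
    if (new_stride == 0)
        return false;

    entry.lastLineAddr = line_addr;

    // Change 2: on a mismatch, record the new stride and return without
    // prefetching.
    if (new_stride != entry.stride) {
        entry.stride = new_stride;
        return false;
    }

    pf_addr = line_addr + static_cast<Addr>(new_stride);
    return true;
}
```

For 64-byte lines, the access sequence 0x1000, 0x1008, 0x1040, 0x1084
issues nothing for the first three accesses and a prefetch of 0x10c0 on
the fourth.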
Change-Id: Ic346d0e15df6d07e2b93289c8d6b89b4c2f45a34
---------
Co-authored-by: Abhishek Shailendra Singh <abs218@leigh.edu>
1. Builds on top of the Ubuntu 24.04 all-deps image.
2. Unify the download, build, install, and cleanup steps.
Change-Id: I4c2bf8e571dfd228f7df8372cda0f428de59af51