This guarantees all changes put on the staging branch and, for whatever
reason, put on stable are on develop. This syncs the branches.
Change-Id: Ib3513f49977bb4ed3046c2d9d6cf162953b15887
The acquire-release flavor of the ldadd instruction should read ldaddalx
(e.g., ldaddalb/ldaddalh) according to the specification. However, it is
currently written as ldadd"la"x (e.g., ldaddlab/ldaddlah).
Issue: https://github.com/gem5/gem5/issues/1224
Change-Id: Ib932fa0e572207729c923c27f24c34cc21dff0e5
Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
- gem5 was querying the full gem5 version, `24.0.0.0`, while searching
for resources.
This caused resource lookups to fail on the staging branch.
This change trims the gem5 version to just the major.minor version.
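A minimal sketch of the trimming described above, assuming the version is
a dot-separated string. `trim_to_major_minor` is a hypothetical helper
name used for illustration, not the function changed in gem5:

```py
def trim_to_major_minor(full_version: str) -> str:
    """Keep only the first two dot-separated fields of a version string."""
    # "24.0.0.0" -> ["24", "0", "0", "0"] -> "24.0"
    return ".".join(full_version.split(".")[:2])


print(trim_to_major_minor("24.0.0.0"))  # prints "24.0"
```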
Change-Id: I30c3a1b38c631981f797ef0fd2b616e6a66ca18e
This change adds a new utility function for processing Spatter traces
into SpatterKernels under parse_kernels.
Additionally, it adds documentation for all the utility functions in
spatter_kernel.py.
Lastly, it adds an example script for running one spatter trace using
SpatterGenerator to the examples.
This change extends the AbstractMemory class with a getter method that
allows other components to get the memory's range without interleaving.
This method is useful if other components in the system need to
interleave the memory range differently from the way the memory has
interleaved it.
Removing the -Werror flag on the stable branch ensures that as new
compilers are released (likely with stricter warnings) gem5 remains
compilable.
Change-Id: I0267c895414b630c1d7cd9b28236249790b3006f
Previously, all of the TLB lookup/insert functions were using the full
virtual addresses even though the variables in the functions said "vpn."
This change explicitly converts the virtual address to the VPN without
any least significant zeros for the offset. I.e., vpn >> page_size.
The main bug solved in this changeset is that the ASID was |'d with the
upper bits of the virtual address, but sometimes those bits were all 1's.
Therefore, you could get a TLB hit even if the ASID was different.
Interestingly, the page that seemed to cause these issues was a 1 GiB
page.
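The collision can be illustrated with a small sketch. The bit positions
below (`PAGE_SHIFT`, `VPN_BITS`, `ASID_SHIFT`) and both helper functions
are assumptions made for illustration and do not reproduce the actual
gem5 key layout:

```py
PAGE_SHIFT = 12  # assumed 4 KiB base page
VPN_BITS = 27    # assumed Sv39: 39-bit virtual address minus 12 offset bits
ASID_SHIFT = 48  # assumed position where the old key OR'd in the ASID


def buggy_key(vaddr: int, asid: int) -> int:
    # OR the ASID into the upper bits of the *full* virtual address.
    return (asid << ASID_SHIFT) | vaddr


def fixed_key(vaddr: int, asid: int) -> int:
    # Drop the page offset and the sign-extension bits, then place the
    # ASID above the VPN so the two fields can never overlap.
    vpn = (vaddr >> PAGE_SHIFT) & ((1 << VPN_BITS) - 1)
    return (asid << VPN_BITS) | vpn


# A sign-extended address has all 1's in its upper bits.
vaddr = 0xFFFF_FFC0_0000_0000
print(buggy_key(vaddr, 1) == buggy_key(vaddr, 2))  # True: ASIDs collide
print(fixed_key(vaddr, 1) == fixed_key(vaddr, 2))  # False: ASIDs differ
```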
This change also starts refactoring some of the page table details to
support sv46 and sv57 page table formats.
In my testing, the Linux kernel boot uses large pages (even OpenSBI uses
large pages), so it seems that large pages also work. However, this
seems like magic to me, so I'm not sure if it's correct.
This change also updates some asserts and debug statements with more
useful debugging information.
Partially fixes #1235. More testing needs to be done to be confident.
Introduced in #1234, this caused compilation to fail on Apple Silicon
systems. This bug is the same as #582, where a more detailed explanation
is provided.
This change fixes the way indices are generated in a multi generator
setup.
It changes it from all cores generating the same trace of indices for
accessing the index array to each core generating an interleaved subset
of indices.
For example, here are the traces (indices into the index array) in a
2-core setup.
Before:
core_0: 0, 1, 2, 3, 4, 5, 6, 7, ...
core_1: 0, 1, 2, 3, 4, 5, 6, 7, ...
After:
core_0: 0, 1, 2, 3, 8, 9, 10, 11, ...
core_1: 4, 5, 6, 7, 12, 13, 14, 15, ...
Additionally, this change fixes the SpatterKernel class in the standard
library to comply with the change in the SpatterGen source code.
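The "After" pattern above can be reproduced with a short sketch. The
block size of 4 is inferred from the example traces, and
`interleaved_indices` is a hypothetical name, not the SpatterGen
implementation:

```py
from itertools import count, islice
from typing import Iterator


def interleaved_indices(
    core_id: int, num_cores: int, block: int = 4
) -> Iterator[int]:
    """Yield one core's indices, `block` consecutive indices at a time."""
    for chunk in count():
        # Each round hands out one block per core, in core order.
        base = (chunk * num_cores + core_id) * block
        yield from range(base, base + block)


for core in range(2):
    print(f"core_{core}:", list(islice(interleaved_indices(core, 2), 8)))
```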
Rather than adding the options to *every* config that might be using
GPU_VIPER.py, just change the Ruby config to check if the option is
available before trying to use it. Otherwise, it reverts to what was the
default on stable.
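One common way to make such a check is `getattr` with a default. This is
a sketch only: `option_or_default` and `some_gpu_option` are made-up
names, and the real Ruby config may perform the check differently:

```py
from argparse import Namespace


def option_or_default(options: Namespace, name: str, default):
    """Return options.<name> if the config defined it, else a default."""
    return getattr(options, name, default)


# `some_gpu_option` is a made-up option name used only for illustration.
with_option = Namespace(some_gpu_option=8)
without_option = Namespace()
print(option_or_default(with_option, "some_gpu_option", 4))     # 8
print(option_or_default(without_option, "some_gpu_option", 4))  # 4
```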
Change-Id: Ia6f1d0827d489ee2a35c598b644461cbff59e247
I noticed while using the stable branch that there were a few typos of
the word 'cache' and so I've corrected a few files where I found such
typos.
Change-Id: I7c7f64812039f34fe39d0c45c4f5ce921cba06d0
Often, you want to add another argument to the default kernel arguments.
This function allows you to do that on the `kernel_disk_workload` board
mixin.
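A rough sketch of the idea, using a stand-in class rather than the gem5
mixin. The class name, the method names, and the default arguments below
are all assumptions for illustration:

```py
from typing import List


class KernelArgsSketch:
    """Stand-in for a board mixin; not the gem5 class."""

    def __init__(self) -> None:
        # Placeholder defaults, chosen only for this example.
        self._kernel_args: List[str] = ["console=ttyS0", "root=/dev/vda1"]

    def append_kernel_arg(self, arg: str) -> None:
        # Hypothetical method name: add one argument after the defaults.
        self._kernel_args.append(arg)

    def kernel_command_line(self) -> str:
        return " ".join(self._kernel_args)


board = KernelArgsSketch()
board.append_kernel_arg("earlyprintk=ttyS0")
print(board.kernel_command_line())
```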
This allows for multiple gem5 simulations to be spawned from a single
parent gem5 process, as defined in a single gem5 configuration. In this
design _all_ the `Simulator`s are defined in the simulation script and
then added to the multisim module. For example:
```py
from gem5.simulate.Simulator import Simulator
import gem5.utils.multisim as multisim
# Construct the board[0] and board[1] as you wish here...
simulator1 = Simulator(board=board[0], id="board-1")
simulator2 = Simulator(board=board[1], id="board-2")
multisim.add_simulator(simulator1)
multisim.add_simulator(simulator2)
```
This specifies that two simulations are to be run in parallel in
separate threads: one specified by `simulator1` and another by
`simulator2`. They are then added to MultiSim via the
`multisim.add_simulator` function. The user can specify an id via the
Simulator constructor. This is used to give each process a unique id and
output directory name. Given this, the id should be a helpful name
describing the simulation being specified. If not specified, one is
automatically given.
To run these simulators we use `<gem5 binary> -m gem5.utils.multisim
<script> -p <num_processes>`. Note: multisim is an executable module in
gem5. This is the same module we import into our scripts to add the
simulators. This is an intentionally modular, encapsulated design. When
the module processes a script it will schedule multiple gem5 jobs and,
dependent on the number of processes specified, will create child gem5
processes to process these jobs (jobs are just gem5 simulations in
this case). The `--processes` (`-p`) argument is optional and, if not
specified, the max number of processes which can be run concurrently will
be the number of available threads on the host system.
The id for each process is used to create a subdirectory of that name
inside the output directory (`m5out`). E.g., in the example above the
IDs are `board-1` and `board-2`. Therefore the `m5out` directory will
look as follows:
```sh
- m5out
    - board-1
        - stats.txt
        - config.ini
        - config.json
        - terminal.out
    - board-2
        - stats.txt
        - config.ini
        - config.json
        - terminal.out
```
Each simulation's output is encapsulated inside the subdirectory named
after its id.
If the multisim configuration script is passed directly to gem5 (like a
traditional gem5 configuration script, i.e.: `<gem5 binary> <script>`),
the user may run a single simulation specified in that script by passing
its id as an argument. E.g., `<gem5 binary> <script> board-1` will run
the `board-1` simulation specified in `script`. If no argument is passed,
an Exception is raised asking the user to either specify an id or use the
MultiSim module if multiprocessing is needed.
If the user desires a list of ids of the simulations specified in a
given MultiSim script, they can get one by passing the `--list` (`-l`)
parameter to the config script. I.e., `<gem5 binary> <script> --list`
will list all the IDs for all the simulations specified in `script`.
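The argument handling described above could look roughly like the sketch
below. It only mimics the described behavior with `argparse`;
`select_simulation` and `SIMULATOR_IDS` are hypothetical names, and this
is not the multisim implementation:

```py
import argparse
from typing import List, Optional

# Ids taken from the example earlier in this message.
SIMULATOR_IDS = ["board-1", "board-2"]


def select_simulation(argv: List[str]) -> Optional[str]:
    """Return the id to run, or None if the ids were only listed."""
    parser = argparse.ArgumentParser()
    parser.add_argument("id", nargs="?", help="id of the simulation to run")
    parser.add_argument(
        "-l", "--list", action="store_true", help="list all simulation ids"
    )
    args = parser.parse_args(argv)

    if args.list:
        print("\n".join(SIMULATOR_IDS))
        return None
    if args.id is None:
        # No id given: ask for one, or for the MultiSim module to be used.
        raise Exception(
            "Specify a simulation id, or use the MultiSim module."
        )
    if args.id not in SIMULATOR_IDS:
        raise Exception(f"Unknown simulation id: {args.id}")
    return args.id


print(select_simulation(["board-1"]))
```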
This change comes with two new example scripts found in
'configs/example/gem5_library/multisim' to demonstrate multisim in both
an SE and FS mode simulation. Tests have been added which run these
scripts as part of gem5's Daily suite of tests.
Notes
=====
* **Bug fixed**: The `NoCache` classic cache hierarchy has been modified
so the Xbar is no longer set with a `__func__` call. This interfered
with MultiProcessing as this structure is not serializable via Pickle.
This was quite a bad design anyway, so it was worth changing.
* **Change**: The `readfile_contents` parameter previously wrote its
value to a file called "readfile" in the output directory. This has been
changed to write to a file called "readfile_{hash}" with "{hash}" being
a hash of the `readfile_contents`. This ensures that, during multisim
running, this file is not overwritten by other processes.
* **Removal note**: This implementation supersedes the functionality
outlined in 'src/python/gem5/utils/multiprocessing'. As such, this code
has been removed.
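For the `readfile_contents` change noted above, the naming scheme can be
sketched as follows. SHA-256 is an assumption (the note only says "a
hash"), and `readfile_path` is a hypothetical helper name:

```py
import hashlib
from pathlib import Path


def readfile_path(outdir: Path, readfile_contents: str) -> Path:
    """Build a per-contents file name so processes do not clash."""
    # SHA-256 is an assumption; the note above only says "a hash".
    digest = hashlib.sha256(readfile_contents.encode("utf-8")).hexdigest()
    return outdir / f"readfile_{digest}"


first = readfile_path(Path("m5out"), "echo one")
second = readfile_path(Path("m5out"), "echo two")
print(first != second)  # different contents map to different files
```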
Limitations/Things to Fix/Improve
=================================
* Though each Simulator process has its own output directory (a
subdirectory within m5out, with an ID set by the user unique to that
Simulator), the stdout and stderr are still output to the terminal, not
the output directory. This results in: 1. stdout and stderr data lost
and not recorded for these runs. 2. An incredibly noisy terminal output.
* Each process uses the same cached resources. While there are locks on
resources when downloading, each process will hash the resources it
requires to ensure they are valid. This is very inefficient in cases
where resources are common between processes (e.g., you may have 10
processes each using the same disk image, with each process hashing the
disk image independently and getting the same result to validate the
resource).
Change-Id: Ief5a3b765070c622d1f0de53ebd545c85a3f0eee
---------
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
This PR adds source code for a C++ implementation of SpatterGen as well
as SpatterKernel. SpatterGen uses a PyBindMethod to add kernels to the
backend code. This way, the processing of JSON files can be offloaded to
Python. In addition, it adds standard library components for
SpatterGenCore and SpatterGen. These two components follow the same
structure as AbstractCore and AbstractProcessor. In addition,
spatter_kernel.py adds a definition for SpatterKernel in Python to make
adding kernels to C++ easier. It also adds utility functions for parsing
dictionaries read from JSON as well as partitioning traces for multicore
setups.
Currently, gem5's inst tracer prints the whole vector register container
by default. The size of vector register containers in gem5 is the
maximum size allowed by the ISA. For vector-length agnostic (VLA) vector
registers, this means the ARM SVE vector container is 2048 bits long, and
the RISC-V vector container is 65535 bits long. Note that the VLA
implementation in gem5 allows the vector length to be varied within the
limit specified by the ISAs.
However, in most use cases of gem5, the vector length is much less than
65535 bits. This causes two issues: (1) the vector container requires
allocating and moving around a large amount of unused data while only a
fraction of it is used, and (2) printing the execution trace of a vector
register results in a wall of text with a small amount of useful data.
This change addresses problem (2) by providing a mechanism to limit
the amount of data printed by the instruction tracer. This is done by
adding a function printing the first X bits of a vector register
container, where X is the vector length determined at runtime, as
opposed to the vector container size, which is determined at compilation
time.
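The idea can be sketched in a few lines. This is an illustration only,
assuming the container is a byte string and the vector length is a whole
number of bytes; `format_vector_register` is a hypothetical name, not the
function added to the tracer:

```py
def format_vector_register(container: bytes, vlen_bits: int) -> str:
    """Format only the first `vlen_bits` bits of a register container."""
    if vlen_bits % 8 != 0 or vlen_bits > len(container) * 8:
        raise ValueError("vlen_bits must be a byte multiple in the container")
    # Slice off the unused tail before converting to hex.
    return container[: vlen_bits // 8].hex()


# A 2048-bit container of which only the first 128 bits are in use.
container = bytes(range(16)) + bytes(256 - 16)
print(format_vector_register(container, 128))
```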
Change-Id: I815fa5aa738373510afcfb0d544a5b19c40dc0c7
---------
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
This is a follow-up on the discussion here [1].
The IsInvalid flag was previously defined as an instruction that does
not appear in the ISA. However, a micro-architecture can choose not to
recognize an instruction and raise an illegal instruction fault even if
the instruction is in the ISA.
This change modifies the definition of an invalid instruction such that,
if a StaticInst instruction is marked as IsInvalid, it means the
instruction is not recognized by the decoder. This means that any
instruction recognized by the decoder is not invalid, even if the
instruction is not in the official ISA spec; e.g., m5
pseudo-instructions.
Note that instructions that are recognized by the decoder but are chosen
to act as a nop are not invalid. This applies to WarnUnimplemented
instructions, e.g. hint instructions.
[1] https://github.com/gem5/gem5/pull/1071
Change-Id: I1371b222d8b06793d47f434d0f148c5571672068
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
- Fix address calculation issue with scratch_* instructions when SVE bit
is 0.
- Fix ds_swizzle_b32 not mapping to execution unit.
- Implement VOP3 V_FMAC_B32.
- Fix architected scratch address register being clobbered.
Tested with MNIST from PyTorch quickstart tutorial and nanoGPT on
mi300.py.
Currently this is written to the SRF, which is incorrect, as the physical
register number can be clobbered by another wavefront if its registers
get renamed to the same physical register number.
Fix this by actually architecting the register, i.e., there is a
dedicated "hardware" register in the wavefront class.
Change-Id: I94e9e463eed348b2928cae884c1c20566c00984d