This updates the pre-commit utility from v4.3.0 to v4.5.0 and updates
black from 22.6.0 to 23.9.1.
Change-Id: I7ebb551f30e617059ce49f89a30207f739b1cb14
#371 Updates the runners. This PR updates the tests to:
1. Drop the 'run' and 'build' labels (all runners are now of the same
type).
2. Utilize the threading where possible (runners now have 4 cores
minimum).
The RISC-V vector instructions still work without setRegOperand.
We should fix the register statistic issue by
https://github.com/gem5/gem5/pull/360 to avoid duplicate statistic
register write count
Change-Id: Ib6a52935e00c3e557b366abfcf60450dca05614d
Earlier, GPU checkpointing was working only if a checkpoint was created
before the first kernel execution. This pull request adds support to
checkpoint in-between any two kernel calls. It does so by doing the
following.
- Adds flush support in the GPU_VIPER protocol
- Adds flush support in the GPUCoalescer
- Updates cache recorder to use the GPUCoalescer during simulation
cooldown and cache warmup times.
The new implementation matches the table in the ARM Architecture
Reference Manual (version DDI 0487J.a, section D1.3.6, table R_SXLWJ)
It takes into consideration features like FEAT_SEL2 (scr.eel2 bit) and
FEAT_VHE (hcr.e2h bit) which affect the masking of interrupts under
certain circumstances
The new implementation matches the table in the ARM Architecture
Reference Manual (version DDI 0487J.a, section D1.3.6, table R_SXLWJ)
It takes into consideration features like FEAT_SEL2 (scr.eel2 bit) and
FEAT_VHE (hcr.e2h bit) which affect the masking of interrupts under
certain circumstances
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I07ebd8d859651475bd32fd201eea0f4e64a7dd5f
We pay a small duplication cost but we make the code
more readable and we enable further modifications to the
AArch64 code without forcing the same code on the AArch32
method
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I1efa33cf19f91094fd33bd48b6a0a57d8df8f89f
This `mkdir` is problematic as it doesn't create the directory
recursively. This casues errors if `dir` is `X/Y/Z` and both `Y` and `Z`
has not been created. An error will be returned (`No such file or
directory`).
This issue was fixed with: https://github.com/gem5/gem5/pull/263. The
checkpointing code already recursively creates directories as needed.
Ergo was can remove this `mkdir` statement.
This PR adds two commit to handle timestamps in the ROCm runtime. ROCr
uses a mix of GPU timestamp reads and HSA packet timestamps to output
profiling information for a task dispatch.
The first patch added timestamps to the HSA completion signal indicating
when the task started and ended and require changing the flow of
completion signal DMAs to ensure the DMA of the timestamp values
completed before writing the completion signal value.
Second commit adds MMIOs for reading the GPU's timestamp counter. This
MMIO resides in the GFX MMIO space so a new class is added to handle
MMIOs in that address range.
Current [TraceCPU
documentation](https://www.gem5.org/documentation/general_docs/cpu_models/TraceCPU)
still references the deprecated **se.py/fs.py** scripts for elastic
trace generation (script paths are also outdated).
With this PR we provide a simpler Arm based elastic trace generation
script that can
be used out of the box by a user or that can be extended as needed.
This `mkdir` is problematic as it doesn't create the directory
recursively. This casues errors if `dir` is `X/Y/Z` and both `Y` and `Z`
has not been created. An error will be returned (`No such file or
directory`).
This issue was fixed with: https://github.com/gem5/gem5/pull/263. The
checkpointing code already recursively creates directories as needed.
Ergo was can remove this `mkdir` statement.
Change-Id: Ibae38267c8ee1eba76d7834367aa1c54013365bc
This automatically formats .yaml files. By deault has the following
parameters:
* `mapping`: 4 spaces.
* `sequence`: 6 spaces.
* `offset`: 4 spaces.
* `colons`: do not align top-level colons.
* `width`: None.
Change-Id: Iee5194cd57b7b162fd7e33aa6852b64c5578a3d2
Exposed in our failing compiler tests:
https://github.com/gem5/gem5/actions/runs/6348223508, this PR:
* Adds missing overrides to `PCState`'s `set` function.
* Removes `std::binary_function` from DramPower (it was deprecated in
CPP-11 and officially removed in CPP-17).
Added a parameter (_disk_device) to kernel_disk_workload which allows
users to change the disk device location. get_disk_device() now chooses
between the parameter and, if no parameter was passed, it calls a new
function _get_default_disk_device() which is implemented by each board
and has a default disk device according to each board, eg /dev/hda in
the x86_board. The previous way of setting a disk device still exists as
a default, however, with the new function users can now override this
default
This comment was left in the codebase in error. The
`set_se_binary_workload` function works fine with multi-threaded
applications. This hasn't been a restriction for some time.
This is the first PR in a series of enhancements to the BPU proposed in
#358.
However, I think putting everything into one PR is not nice to review
and prone to oversee I might did.
This PR restructures the BTB:
- A new abstract BTB class is created to enable different BTB
implementations. The new BTB class gets its own parameter and stats.
- An enum is added to differentiate branch instruction types. This enum
is used to enhance statistics and BPU management.
- The existing BTB is moved into `simple_btb` as default.
- An additional function is added to store the static instruction in the
BTB. This function is used for the decoupled front-end.
- Update configs to match new BTB parameters.
These were previously only running on single-threaded machines. Now
they'll be running on 4-core VMs so may as well run tests in parallel.
Change-Id: I7ee86512dc72851cea307dfd800dcf9a02f2f738
The new script will automatically use the newly
defined O3_ARM_v7a_3_Etrace CPU to run a simple SE simulation while
generating elastic trace files.
The script is based on starter_se.py, but contains the following
limitations:
1) No L2 cache as it might affect computational delay calculations
2) Supporting SimpleMemory only with minimal memory latency
There restrictions were imported by the existing elastic trace
generation logic in the common library (collected by grepping
elastic_trace_en) [1][2][3]
Example usage:
build/ARM/gem5.opt configs/example/arm/etrace_se.py \
--inst-trace-file [INSTRUCTION TRACE] \
--data-trace-file [DATA TRACE] \
[WORKLOAD]
[1]: https://github.com/gem5/gem5/blob/stable/\
configs/common/MemConfig.py#L191
[2]: https://github.com/gem5/gem5/blob/stable/\
configs/common/MemConfig.py#L232
[3]: https://github.com/gem5/gem5/blob/stable/\
configs/common/CacheConfig.py#L130
Change-Id: I021fc84fa101113c5c2f0737d50a930bb4750f76
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
According to the original paper [1] the elastic trace generation process
requires a cpu with a big number of entries in the ROB, LQ and SQ, so
that there are no stalls due to resource limitation.
At the moment these numbers are copy pasted from the
CpuConfig.config_etrace method [2].
[1]: https://ieeexplore.ieee.org/document/7818336
[2]: https://github.com/gem5/gem5/blob/stable/\
configs/common/CpuConfig.py#L40
Change-Id: I00fde49e5420e420a4eddb7b49de4b74360348c9
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
We define a new parent (ClusterSystem) to model a system
with one or more cpu clusters within it.
The idea is to make this new base class reusable by SE
systems/scripts as well (like starter_se.py)
Change-Id: I1398d773813db565f6ad5ce62cb4c022cb12a55a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Modified the x86 KVM-in-SE syscall handler to flush the TLB following
each syscall, in case the page table has been modified. This is done by
reloading the value in %cr3. Doing this requires an intermediate GPR,
which we store in a new scratch buffer following the syscall code at
address `syscallDataBuf`.
GitHub issue: https://github.com/gem5/gem5/issues/409
- A new abstract BTB class is created to enable different BTB
implementations. The new BTB class gets its own parameter
and stats.
- An enum is added to differentiate branch instruction types.
This enum is used to enhance statistics and BPU management.
- The existing BTB is moved into `simple_btb` as default.
- An additional function is added to store the static instruction in
the BTB. This function is used for the decoupled front-end.
- Update configs to match new BTB parameters.
Change-Id: I99b29a19a1b57e59ea2b188ed7d62a8b79426529
Signed-off-by: David Schall <david.schall@ed.ac.uk>
This is still trying to completely remove any artifact
which implies virtualization is only supported in
non-secure mode (NS=1)
Change-Id: I83fed1c33cc745ecdf3c5ad60f4f356f3c58aad5
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This info can be used during TLB invalidation
Change-Id: I81247e40b11745f0207178b52c47845ca1b92870
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The syscall emulation of brk() incorrectly did not ensure that newly
allocated memory was zero-initialized, which Linux guarantees and which
seems to be the expectation of glibc's malloc() and free()
implementation. This patch fixes the incorrect behavior by zero-
initalizing all memory allocations via brk().
GitHub issue: https://github.com/gem5/gem5/issues/342
Change-Id: I53cf29d6f3f83285c8e813e18c06c2e9a69d7cc2
The PR is fixing the CHI fromSequencer helper function which is making
use of the undefined tbe entry.
This has been broken by #177
Change-Id: I52feff4b5ab2faf0aa91edd6572e3e767c88e257
1. All VMs are deployable from a single Vagrantfile (per host machine).
2. Runners within VMs are now ephemeral. They cease to exist after a job
is complete. After the VM cleans the workspace and creates a new runner.
This will reduce old data, scripts, and images causing space issues on
our VMs
3. No more 'vm_manager.sh' script. The standard `vagrant` command to
manage the VMs will work.
4. Adds Copyright notices where missing.
This patch removed the bespoke "vm_manager.sh" script in favor of a
Multi-Machine Vagrantfile.
With this the users needs to only change the variables in Vagrantfile
then use the standard `vagrant` commands to launch the VMs/Runners.
Change-Id: Ida5d2701319fd844c6a5b6fa7baf0c48b67db975
Prior to this change we were limited to a root partition with only 60GB
of space which caused issues when running larger simulations (see:
https://github.com/gem5/gem5/issues/165).
There are two factors in this issue which this patch resolves:
1. The root partition in the VM was capped at 60GB despite the virtual
machines size being capped at 128GB. This resulted in libvirt giving the
VM free space it couldn't use. To fix this `lvextend` was added to the
"provision_root.sh" script to resize the root partition to fill the
available space.
2. The virtual machine size can be set via the `machine_virtual_size`
parameter. The minimum and default value is 128GB. This wasn't exposed
previously. Now, if we required, we can increase the size of the VM/Root
partition if we require (though I believe 128GB is more than sufficient
for now).
Fixes: https://github.com/gem5/gem5/issues/165
Change-Id: I82dd500d8807ee0164f92d91515729d5fbd598e3
The "action-run.sh" action replaces inline scripting in the Vagrantfile.
The major improvement is this script runs an infinite loop and
configures the runners to be ephemeral. This means they cease to exist
after a job is complete. The script then cleans the VM workspace and the
loop restarts by configuring and setting up another runner. This means
our VMs no longer accumulate files that eventually lead to the VM
running out of space.
Change-Id: Iba6dc9a480f5805042602f120fc84bdc47a96d55
There are two places self-hosted runners can exist on GitHub:
1. At the level of the repository: In this case the runners can only be
used by that repository and runners can only be distinguished from one
another by labels.
2. At the level of the organization: In this case the runners can be
used by any repository in the organization, thus increasing their
versatility. In addition to labels, runners in the level of the
organization can be organized into groups.
While we do not use our self-hosted runners on other repositories, there
may be future use for this, so we might as well enable it now.
Change-Id: Id5e113194314336221dcdc8c2858b352afcbaf6e