Commit Graph

20685 Commits

Author SHA1 Message Date
Bobby R. Bruce
c53529783b misc,python: Add requirements-txt-fixer to pre-commit
This sorts entries in requirements.txt files.

Change-Id: I7ee6e31f3cbe5078f24d13471a6aa9edc482cecd
2023-10-09 13:09:21 -07:00
Bobby R. Bruce
bbe05b0cba tests,misc: Fix compilation tests failures (#400)
Exposed in our failing compiler tests:
https://github.com/gem5/gem5/actions/runs/6348223508, this PR:

* Adds missing overrides to `PCState`'s `set` function.
* Removes `std::binary_function` from DramPower (it was deprecated in
CPP-11 and officially removed in CPP-17).
2023-10-09 11:20:52 -07:00
Harshil Patel
452a600c49 New function to kernel_disk_workload to allow new disk device location (#151)
Added a parameter (_disk_device) to kernel_disk_workload which allows
users to change the disk device location. get_disk_device() now chooses
between the parameter and, if no parameter was passed, it calls a new
function _get_default_disk_device() which is implemented by each board
and has a default disk device according to each board, eg /dev/hda in
the x86_board. The previous way of setting a disk device still exists as
a default, however, with the new function users can now override this
default
2023-10-09 10:33:45 -07:00
Harshil Patel
79f40ffdab stdlib: Del comment stating SE mode limited to single thread (#402)
This comment was left in the codebase in error. The
`set_se_binary_workload` function works fine with multi-threaded
applications. This hasn't been a restriction for some time.
2023-10-09 10:30:32 -07:00
Harshil Patel
d8fc0180a5 cpu: Restructure BTB (#412)
This is the first PR in a series of enhancements to the BPU proposed in
#358.
However, I think putting everything into one PR is not nice to review
and prone to oversee I might did.

This PR restructures the BTB:
- A new abstract BTB class is created to enable different BTB
implementations. The new BTB class gets its own parameter and stats.
- An enum is added to differentiate branch instruction types. This enum
is used to enhance statistics and BPU management.
- The existing BTB is moved into `simple_btb` as default.
- An additional function is added to store the static instruction in the
BTB. This function is used for the decoupled front-end.
- Update configs to match new BTB parameters.
2023-10-09 10:13:00 -07:00
Andreas Sandberg
ec7921305b arch-arm: Implement FEAT_TLBIRANGE extension (#414) 2023-10-09 17:09:31 +01:00
Jason Lowe-Power
d4be9c76c5 cpu-kvm, arch-x86: flush TLB after syscalls (#411)
Modified the x86 KVM-in-SE syscall handler to flush the TLB following
each syscall, in case the page table has been modified. This is done by
reloading the value in %cr3. Doing this requires an intermediate GPR,
which we store in a new scratch buffer following the syscall code at
address `syscallDataBuf`.

GitHub issue: https://github.com/gem5/gem5/issues/409
2023-10-09 08:16:06 -07:00
David Schall
edf9092fee cpu: Restructure BTB
- A new abstract BTB class is created to enable different BTB
  implementations. The new BTB class gets its own parameter
  and stats.
- An enum is added to differentiate branch instruction types.
  This enum is used to enhance statistics and BPU management.
- The existing BTB is moved into `simple_btb` as default.
- An additional function is added to store the static instruction in
  the BTB. This function is used for the decoupled front-end.
- Update configs to match new BTB parameters.

Change-Id: I99b29a19a1b57e59ea2b188ed7d62a8b79426529
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-10-09 14:37:47 +00:00
Giacomo Travaglini
39fdfaea5a arch-arm: Implement FEAT_TLBIRANGE
Change-Id: I7eb020573420e49a8a54e1fc7a89eb6e2236dacb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 13:59:47 +01:00
Giacomo Travaglini
6b698630a2 arch-arm: Check VMID in secure mode as well (NS=0)
This is still trying to completely remove any artifact
which implies virtualization is only supported in
non-secure mode (NS=1)

Change-Id: I83fed1c33cc745ecdf3c5ad60f4f356f3c58aad5
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 13:56:57 +01:00
Giacomo Travaglini
a8efded644 arch-arm: Include Granule Size in a TLB entry
This info can be used during TLB invalidation

Change-Id: I81247e40b11745f0207178b52c47845ca1b92870
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 13:56:57 +01:00
Giacomo Travaglini
5cd70bf9bf sim-se: zero out memory allocated via brk() (#343)
The syscall emulation of brk() incorrectly did not ensure that newly
allocated memory was zero-initialized, which Linux guarantees and which
seems to be the expectation of glibc's malloc() and free()
implementation. This patch fixes the incorrect behavior by zero-
initalizing all memory allocations via brk().

GitHub issue: https://github.com/gem5/gem5/issues/342

Change-Id: I53cf29d6f3f83285c8e813e18c06c2e9a69d7cc2
2023-10-09 13:48:53 +01:00
Giacomo Travaglini
226052ed5a mem-ruby: Far atomics fix (#407)
The PR is fixing the CHI fromSequencer helper function which is making
use of the undefined tbe entry.

This has been broken by #177

Change-Id: I52feff4b5ab2faf0aa91edd6572e3e767c88e257
2023-10-09 08:08:50 +01:00
Bobby R. Bruce
b0e1efb555 util: Update the GitHub Self-Hosted Runners (#371)
1. All VMs are deployable from a single Vagrantfile (per host machine).
2. Runners within VMs are now ephemeral. They cease to exist after a job
is complete. After the VM cleans the workspace and creates a new runner.
This will reduce old data, scripts, and images causing space issues on
our VMs
3. No more 'vm_manager.sh' script. The standard `vagrant` command to
manage the VMs will work.
4. Adds Copyright notices where missing.
2023-10-08 21:52:33 -07:00
Bobby R. Bruce
df3bcaf143 util: Make all runs "build" and "run"
Change-Id: If9ecf467efa5c7118d34166953630e6c436c55a4
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
53219bf827 util: Add Troubleshooting for "Vagrant failed..."
Change-Id: I01e637f09084acb6c5fbd7800b3e578a43487849
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
6571a54a65 util: Use a Multi-Machine Vagrantfile
This patch removed the bespoke "vm_manager.sh" script in favor of a
Multi-Machine Vagrantfile.

With this the users needs to only change the variables in Vagrantfile
then use the standard `vagrant` commands to launch the VMs/Runners.

Change-Id: Ida5d2701319fd844c6a5b6fa7baf0c48b67db975
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
0e5c6d9f50 util: Resize VM root partition max size to ~128GB
Prior to this change we were limited to a root partition with only 60GB
of space which caused issues when running larger simulations (see:
https://github.com/gem5/gem5/issues/165).

There are two factors in this issue which this patch resolves:

1. The root partition in the VM was capped at 60GB despite the virtual
machines size being capped at 128GB. This resulted in libvirt giving the
VM free space it couldn't use. To fix this `lvextend` was added to the
"provision_root.sh" script to resize the root partition to fill the
available space.
2. The virtual machine size can be set via the `machine_virtual_size`
parameter. The minimum and default value is 128GB. This wasn't exposed
previously. Now, if we required, we can increase the size of the VM/Root
partition if we require (though I believe 128GB is more than sufficient
for now).

Fixes: https://github.com/gem5/gem5/issues/165
Change-Id: I82dd500d8807ee0164f92d91515729d5fbd598e3
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
f36449be98 util: Add missing copyright notices
Change-Id: I243046c17264eb5c522285096ecf9c7e5e968322
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
8c2d414223 util: Cleanup the provision_root.sh
Change-Id: I58215dddc34476695c7aedc77b55d338e0304198
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
c0cb16ba89 util: Create HOSTNAME variable
Change-Id: Ia68f1bef2bb9e4e5e18476b6100be80f8cf1c799
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
529423f47a util: Remove note about ssh use
This is confusing and setting the ssh username and password is normal.

Change-Id: Ic925e92ade47f455c86a461a267b8cad7aa6d7ba
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
a924fa3bdc util: Add action-run.sh to run Action Runners
The "action-run.sh" action replaces inline scripting in the Vagrantfile.

The major improvement is this script runs an infinite loop and
configures the runners to be ephemeral. This means they cease to exist
after a job is complete. The script then cleans the VM workspace and the
loop restarts by configuring and setting up another runner. This means
our VMs no longer accumulate files that eventually lead to the VM
running out of space.

Change-Id: Iba6dc9a480f5805042602f120fc84bdc47a96d55
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
3e1c0b0714 util: Move runners from gem5 repository to gem5 org
There are two places self-hosted runners can exist on GitHub:

1. At the level of the repository: In this case the runners can only be
used by that repository and runners can only be distinguished from one
another by labels.
2. At the level of the organization: In this case the runners can be
used by any repository in the organization, thus increasing their
versatility. In addition to labels, runners in the level of the
organization can be organized into groups.

While we do not use our self-hosted runners on other repositories, there
may be future use for this, so we might as well enable it now.

Change-Id: Id5e113194314336221dcdc8c2858b352afcbaf6e
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
d1f9f98747 util: Make all runners the same type
Having two types of GitHub Action Runners has not yielded much benefit
and caused confusion and inefficiencies. This change simplifies things
to having just one runner with 8-cores and 16GB of memory. It is
sufficient to build gem5 and run most simulations.

Change-Id: Ic49ae5e98b02086f153f4ae2a4eedd8a535786c8
2023-10-06 15:53:57 -07:00
Nicholas Mosier
7a0e84d853 cpu-kvm, arch-x86: flush TLB after syscalls
Modified the x86 KVM-in-SE syscall handler to flush the TLB following
each syscall, in case the page table has been modified. This is done
by reloading the value in %cr3. Doing this requires an intermediate
GPR, which we store in a new scratch buffer following the syscall code
at address `syscallDataBuf`.

GitHub issue: https://github.com/gem5/gem5/issues/409

Change-Id: Ibc20018c97ebb1794fa31a0c71e0857d661c7c9d
2023-10-06 20:41:59 +00:00
Nicholas Mosier
0dcf0fb829 sim-se: unmap reclaimed heap pages in brk syscall emulation
gem5::MemState::updateBrkRegion(), which is called during the syscall
emulation of brk, did not unmap deallocated heap pages when the brk
region is receding. Instead, it kept it mapped for simplicity. This
introduced a bug where subequent expansions of the brk region reused
prior heap page mappings that were not zero-filled. This violates
the assumptions of glibc malloc, resulting in heap corruption and
crashes.

This patch fixes the bug by always unmapping pages that are deallocated
during a call to brk() that reduces the heap size. This makes the
gem5::MemState::_endBrkPoint field obsolete, so this patch removes it.

GitHub issue: https://github.com/gem5/gem5/issues/342

Change-Id: Ib2244e1aa4d2a26666ad60d231fdde2c22d2df35
2023-10-06 20:39:57 +00:00
Giacomo Travaglini
00748c7901 mem-ruby: Fix CHI fromSequencer helper function
This has been broken by #177

Change-Id: I52feff4b5ab2faf0aa91edd6572e3e767c88e257
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-06 14:51:11 +01:00
Giacomo Travaglini
ae104cc431 mem-ruby: Add new feature far atomics in CHI (#177)
Added a new feature to CHI protocol (in collaboration with @tiagormk).
Here is the Jira Ticket
[https://gem5.atlassian.net/browse/GEM5-1326](https://gem5.atlassian.net/browse/GEM5-1326
). As described in CHI specs, far atomic transactions enable remote
execution of Atomic Memory Operations. This pull request incorporates
several changes:

* Fix Arm ISA definition of Swap instructions. These instructions should
return an operand, so their ISA definition should be Return Operation.
* Enable AMOs in Ruby Mem Test to verify that AMOs work
* Enable near and far AMO in the Cache Controler of CHI

Three configuration parameters have been used to tune this behavior:
* policy_type: sets the atomic policy to one of the described in [our
paper](https://dl.acm.org/doi/10.1145/3579371.3589065)
* atomic_op_latency: simulates the AMO ALU operation latency
* comp_anr: configures the Atomic No return transaction to split
CompDBIDResp into two different messages DBIDResp and Comp
2023-10-06 10:09:58 +01:00
Matt Sinclair
85340973bf configs: Add configurable GPU L1,L2 num banks and L2 latencies (#389)
Previously, the L1, L2 number of banks and L2 latencies were not
configurable through command line arguments. This commit adds support to
configure them through the arguments '--tcp-num-banks' for number of
banks in L1, '--tcc-num-banks' for number of banks in L2, and
'--tcc-tag-access-latency', and '--tcc-data-access-latency'

Change-Id: Ie3b713ead16865fd7120e2d809ebfa56b69bc4a1
2023-10-05 15:54:24 -05:00
Bobby R. Bruce
4db748a507 resources, stdlib: Adding 'suite' category to gem5 (#191) 2023-10-05 13:26:58 -07:00
Bobby R. Bruce
761f6b73a0 arch-arm: Implement FEAT_FGT (#334)
This PR implements FEAT_FGT (Fine Grain Traps)
2023-10-05 10:44:26 -07:00
Bobby R. Bruce
f5c7ea01ef gpu-compute: Fix dynamic scratch size test (#391)
ROCm supports dynamically allocating scratch space, which resides in
framebuffer memory, to reduce the amount of memory allocated for kernels
that have not yet launched. The size of the scratch space allocated is
located in task->amdQueue.compute_tmpring_size_wavesize. This size is in
kilobytes. The AQL task contains the number of bytes requested *per work
item*, however we currently check if there is enough tmpring space by
comparing a single work item. This should instead check the size *per
wavefront*.

This causes problems in applications where multiple kernels use dynamic
scratch allocation and a later kernel requires more space than the
earlier kernel. The only application being tested that does this is
LULESH. This was resulting in the scratch space being too small,
resulting in workgroups clobbering each other's private memory leading
to some nasty bugs. It is fixed by this patch as task->amdQueue will be
re-read from the host and will contain the updated tmpring size. After
this there is enough scratch space and LULESH makes forward progress.
2023-10-05 10:38:13 -07:00
Bobby R. Bruce
ee8c569513 arch-riscv: Implement Zcb instructions (#399)
Added the following instructions:
c.lbu
c.lh
c.lhu
c.sb
c.sh
c.zext.b
c.sext.b
c.zext.h
c.sext.h
c.zext.w
c.not
c.mul

Reference: https://github.com/riscv/riscv-code-size-reduction
2023-10-05 10:36:02 -07:00
Bobby R. Bruce
f75c0fca8a stdlib: Del comment stating SE mode limited to single thread
This comment was left in the codebase in error. The
`set_se_binary_workload` function works fine with multi-threaded
applications. This hasn't been a restriction for some time.

Change-Id: I1b1d27c86f8d9284659f62ae27d752bf5325e31b
2023-10-05 10:20:55 -07:00
Bobby R. Bruce
06bbc43b46 ext: Remove std::binary_function from DramPower
`std::binary_function` was deprecated in C++11 and officially
removed in CPP-17.

This caused a compilation error on some systems. Fortunately it can be
safely removed. It was unecessary. The commandItemSorter was compliant
witih the `sort` regardless.

Change-Id: I0d910e50c51cce2545dd89f618c99aef0fe8ab79
2023-10-05 10:18:19 -07:00
Bobby R. Bruce
39c7e7d1ed arch: Adding missing override to PCState.set
As highlighed in this failing compiler test:
https://github.com/gem5/gem5/actions/runs/6348223508/job/17389057995

Clang was failing when compiling "build/ALL/gem5.opt" due missing
overrides in `PCState`'s "set" function.

This was observed in Clang-14 and, stangely, Clang-8.

Change-Id: I240c1087e8875fd07630e467e7452c62a5d14d5b
2023-10-05 10:18:19 -07:00
Roger Chang
ea3ee880aa arch-riscv: Implement Zcb instructions
Added the following instructions:
c.lbu
c.lh
c.lhu
c.sb
c.sh
c.zext.b
c.sext.b
c.zext.h
c.sext.h
c.zext.w
c.not
c.mul

Reference: https://github.com/riscv/riscv-code-size-reduction
Change-Id: Ib04820bf5591b365a3bfbbd8b90655a8a1d844cf
2023-10-05 18:46:35 +08:00
Leo Redivo
98a6cd6ee2 misc: changed call get_default_disk_device to get_disk_device
Change-Id: I240da78a658208211ede6648547dfa4c971074a1
2023-10-04 13:32:35 -07:00
Víctor Soria
6411b2255c mem-ruby,configs: Add CHI far atomics support
Introduce far atomic operations in CHI protocol.
Three configuration parameters have been used to tune this behavior:

  policy_type:       sets the atomic policy to one of the described in our paper
  atomic_op_latency: simulates the AMO ALU operation latency
  comp_anr:          configures the Atomic No return transaction to split
                     CompDBIDResp into two different messages DBIDResp and Comp

Change-Id: I087afad9ad9fcb9df42d72893c9e32ad5a5eb478
2023-10-04 19:19:08 +02:00
Víctor Soria
12dada2dc5 arch-arm: Correct return operand in swap instructions
Swap instructions are configured as non returning AMO operations. This is wrong because they
return the previous value stored in the target memory position

Change-Id: I84d75a571a8eaeaee0dbfac344f7b34c72b47d53
2023-10-04 19:11:01 +02:00
Víctor Soria
4fd9d66c53 tests,mem-ruby: Enhance ruby false sharing test with Atomics
New ruby mem test includes a percentages of AMOs that will be executed randomly in ruby mem test

Change-Id: Ie95ed78e59ea773ce6b59060eaece3701fe4478c
2023-10-04 19:11:01 +02:00
Jason Lowe-Power
6f5d877b1a misc: Update gem5 to use clang-15 and clang-16 (#365)
This introduces the changes necessary for clang-15 and clang-16 to run
within gem5, and adds them to the compiler tests.

This also updates the dockerfiles for ubuntu 22.04 to include the steps
necessary to compile clang-15 and clang-16.
2023-10-04 09:51:12 -07:00
Matthew Poremba
2b97f17fe1 gpu-compute: Fix dynamic scratch size test
ROCm supports dynamically allocating scratch space, which resides in
framebuffer memory, to reduce the amount of memory allocated for kernels
that have not yet launched. The size of the scratch space allocated is
located in task->amdQueue.compute_tmpring_size_wavesize. This size is in
kilobytes. The AQL task contains the number of bytes requested *per work
item*, however we currently check if there is enough tmpring space by
comparing a single work item. This should instead check the size *per
wavefront*.

This causes problems in applications where multiple kernels use dynamic
scratch allocation and a later kernel requires more space than the
earlier kernel. The only application being tested that does this is
LULESH. This was resulting in the scratch space being too small,
resulting in workgroups clobbering each other's private memory leading
to some nasty bugs. It is fixed by this patch as task->amdQueue will be
re-read from the host and will contain the updated tmpring size. After
this there is enough scratch space and LULESH makes forward progress.

Change-Id: Ie9e0f92bb98fd3c3d6c2da3db9ee65352f9ae070
2023-10-04 09:38:31 -05:00
Andreas Sandberg
7806eaad51 arch: Add instruction size and PC set methods (#357)
Add the instruction size of a static instruction. x86 and arm decoders
add now the instruction size to the macro instruction. However, microops
are still handled by the fetch stage which is not nice.
Furthermore, we add a set method to the PC state. It allows setting a PC
state to acertain address.
Both methods are required for the decoupled front-end.

Change-Id: I311fe3f637e867c42dee7781f5373ea2e69e2072
2023-10-04 10:49:30 +01:00
Bobby R. Bruce
57e0c7d006 arch-riscv: FS bits -> DIRTY for more floating point loads (#381)
The affected instructions are,
- c.flw
- c.flwsp
- flh
- flw

This change is related to [1] [2], which also aim to change the FS bits
to DIRTY when the state of any floating point register might change.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/pull/370
2023-10-03 11:51:47 -07:00
Vishnu Ramadas
53627cc39c configs: Add configurable GPU L1,L2 num banks and L2 latencies
Previously, the L1, L2 number of banks and L2 latencies were not
configurable through command line arguments. This commit adds support to
configure them through the arguments '--tcp-num-banks' for number of
banks in L1, '--tcc-num-banks' for number of banks in L2, and
'--tcc-tag-access-latency', and '--tcc-data-access-latency'

Change-Id: Ie3b713ead16865fd7120e2d809ebfa56b69bc4a1
2023-10-03 11:51:28 -05:00
Harshil Patel
3af3c1121b stdlib, resources: Addressed requested changes
Change-Id: I22abdc3bdcdde52301ed10cb3113e8925159c245
Co-authored-by: Kunal Pai <kunpai@users.noreply.github.com>
2023-10-02 23:27:32 -07:00
Harshil Patel
7301d4bd19 python: Add importer to standalone gem5py_m5 (#369)
I believe the point of this binary was to allow people to use the m5
objects without the entire gem5 binary. However, without adding the
importer call, this did not work. Unfortunately, with the importer call
there is a circular dependence on the original gem5py.cc file.
Therefore, this change creates a new file that has the importer call.

Now, with the `gem5py_m5` binary you can run python code that references
modules in `src/python`. Note that `_m5` is not available, so anything
that depends on the gem5 SimObjects' implementation will not work.
However, this can still be useful for things like getting Resources,
processing stats, etc.
2023-10-02 14:28:45 -07:00
David Schall
7d2e1ee789 arch: Add instruction size and PC set methods
Adds the instruction size to all static instruction. x86, arm
and RISC-V decoders add the instruction size to every decoded
macro instruction. As microops should reflect the size of the
their parent macroop the set method is overwritten to pass the
size to all microops.
Furthermore, we add a set method to the PC state. It allows
setting a PC state to a certain address.
Both methods are required for the decoupled front-end.

Change-Id: I311fe3f637e867c42dee7781f5373ea2e69e2072
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-10-02 20:10:57 +00:00