Compare commits

2125 Commits

Author SHA1 Message Date
fb29eaab11 Fix missing include 2025-03-29 14:54:42 +01:00
b2f5575f9c Fix DRAMPower linkage issue with gem5 2025-03-25 13:25:53 +00:00
a5ba2bf60d Fix nlohmann_json include 2025-03-25 14:01:37 +01:00
705f1295c7 Use fetch content in DRAMSys 2025-03-25 12:45:34 +00:00
bebf9d05f2 tests: Update DRAMSys test to use new version 2025-03-25 13:04:14 +01:00
e89a9e22f5 ext,stdlib: Update integration of DRAMSys
The latest version of DRAMSys required several API changes which were
applied in this commit.

Also, the README for the usage of DRAMSys has been updated.

The updated version fixes a bug in DRAMSys that caused some full-system
simulations to loop endlessly.

GitHub Issue: https://github.com/gem5/gem5/issues/1452
2025-03-25 13:03:38 +01:00
43fbdd853f Wallclock time plots 2025-03-21 18:17:12 +01:00
353488837c Prepare dataframe format for Latex plots 2025-03-21 18:17:12 +01:00
3d9533c10c First plotting scripts 2025-03-21 18:17:12 +01:00
da29aa865b First plot script 2025-03-21 18:17:12 +01:00
2212a03ae4 Add simulation script 2025-03-21 18:17:12 +01:00
51de880666 Update configuration 2025-03-21 18:17:12 +01:00
438d997ddb Increase HBM2 memory size to 2 GiB 2025-03-21 18:16:43 +01:00
e423da5256 Add support for shared pim units 2025-03-21 18:15:44 +01:00
e1c6318edb Integrate additional pim-vm library in DRAMSys linking 2025-03-21 18:15:44 +01:00
f28b51fce0 Enable m5ops and change cache line size to 32 2025-03-21 18:15:44 +01:00
7c183df27b First PIM modifications 2025-03-21 18:15:44 +01:00
Bobby R. Bruce
186a913a48 misc: Hotfix v24.1.0.2 (#1964)
This adds #1930 as a hotfix to gem5 v24.1.0
2025-02-12 12:09:26 -08:00
Bobby R. Bruce
7d6d253f6b misc: Update release notes for v24.1.0.2 2025-02-02 01:02:59 -08:00
Bobby R. Bruce
8f2ccca3a3 misc: Update version to v24.1.0.2 2025-02-02 00:56:30 -08:00
Adrià Armejach
dc448c9530 mem-ruby: set RubySystem pointer during TBE alloc (#1930)
Currently the RubySystem pointer is set when set_tbe is performed, which
effectively clears the NetDest objects from the TBE (if any). This is
fine if the TBE has been just allocated before set_tbe is called (no
NetDest info in the TBE). However, the CHI protocol has an action
(RestoreFromHazard) that performs a set_tbe over a TBE that had already
been set, i.e., it already has valid NetDest data.

This patch sets the RubySystem pointer when the TBE is allocated, which
is more natural and follows the style already adopted in the
PerfectCacheMemory class (#1864).

Co-authored-by: Adrià Armejach <adria.armejach@bsc.es>
2025-02-01 23:34:55 -08:00
Bobby R. Bruce
c9625ce9cc v24.1.0.1 Hotfix Release (#1875) 2024-12-19 18:28:44 -08:00
Bobby R. Bruce
ea28fcee5b misc: Update RELEASE-NOTES.md for v24.1.0.1 2024-12-19 18:27:30 -08:00
Bobby R. Bruce
0e4c8487dd misc: Update the gem5 version to v24.1.0.1 2024-12-19 18:24:22 -08:00
Tommaso Marinelli
b5e27f5ed8 configs: Generalize class types in CHI RNF/MN generators (#1851)
Classes CHI_RNF and CHI_MN can be specialized to override base
class/subclass attributes, as happens in CustomMesh with
router_list (see configs/example/noc_config/2x4.py). To avoid missing
these attributes, the class types must be generalized when
instantiating the objects in the recently added generators.
2024-12-18 21:16:26 -08:00
Melissa Jost
e146f1b2bc misc: Add sphinx stdlib documentation (#335)
This PR adds documentation to the standard library using Sphinx. For
details on how the documentation was generated, refer to
https://gem5.atlassian.net/browse/GEM5-1314. Currently, some modules
like `dramsys` and `mesi_three_level` appear as blank pages. To view the
current state of the documentation locally, run: `cd docs/_build/html;
python3 -m http.server 8000`


---------
Co-authored-by: ivanaamit <ivanamit91@gmail.com>
2024-12-18 21:14:10 -08:00
Marleson Graf
b6c941c9ca mem-ruby: Fix missing RubySystem in PerfectCacheMemory's entries (#1864)
MOESI_CMP_directory protocol crashes with one of the several assertions
in NetDest.cc. It happens because the entry type used to instantiate a
PerfectCacheMemory object in MOESI_CMP_directory-L2cache.sm contains a
NetDest object, so it requires a RubySystem object to be manually set
for it.

Instead of just receiving the block size, change PerfectCacheMemory to
receive a RubySystem object and use it to set the block size and call
ENTRY::setRubySystem if the entries require it.
2024-12-18 21:13:32 -08:00
Marleson Graf
0fe31664f3 mem-ruby: Add missing option in ProtocolInfo (#1865)
After the support for multiple ruby protocols was added, the macros
PROTOCOL_MESI_Two_Level and PROTOCOL_MESI_Three_Level were removed.
These macros are still being used to determine if Load_Linked requests
are sent to the protocol, information required by the fix that
addresses LL/SC livelock.
Replace the macros with a new option: useSecondaryLoadLinked.
2024-12-18 21:12:57 -08:00
Harshil Patel
63d25922a2 tests: Update pyunit tests references to include 24.1 (#1843) 2024-12-07 00:02:57 -08:00
Vishnu Ramadas
8877516e5b mem-ruby: Fix GPU_VIPER-TCP.sm atomic transitions in TCC WB mode
The transition that happens when TCC acknowledges TCP of an atomic
operation completion does not move the cacheline state from A to I. This
commit fixes the transition and moves the state to I
2024-12-06 23:17:46 -08:00
Vishnu Ramadas
6aa9db28f1 mem-ruby: Fix segfault in pa_performAtomics in GPU_VIPER-TCC.sm
When the cache is performing an atomic and receives data, it performs
pa_performAtomic. This action peeks into the coreRequest queue to check
the message type. This queue, however, is already dequeued in the
transition that precedes the one that contains pa_performAtomic. When
pa_performAtomic is called, the simulation crashes. This commit fixes
the crash by using the TBE entry information instead of peeking when a
TBE entry exists, and peeking when it doesn't.
2024-12-06 23:17:10 -08:00
Jason Lowe-Power
93b58fbf64 misc: Add GPU info to release notes (#1844)
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Co-authored-by: Matt Sinclair <mattdsinclair.wisc@gmail.com>
2024-12-06 21:59:24 -08:00
Bobby R. Bruce
ae60062a9e mem-ruby,misc: Fix RNG range (#1842)
This upper range must be `UINT_MAX - 1`. This was previously fixed but
reverted back. Without this the RNG crashes.
2024-12-06 21:58:52 -08:00
Bobby R. Bruce
26ba6dad80 scons: Remove warn as error for v24.1 2024-12-06 19:45:58 -08:00
Erin Le
4559bafaa6 arch-riscv: Remove warning message in senvcfg setMiscReg
Previously, a warning would be printed when the WPRI bits
in the senvcfg register were written to. The other registers
do not print warnings for this, so the warning is being
removed.
2024-12-06 19:38:39 -08:00
Erin Le
e6b931213f arch-riscv: Implement behavior for senvcfg register
This commit adds behavior for writes to the senvcfg register.
It also implements the CBIE, CBCFE, and CBZE bitfields of
senvcfg.
2024-12-06 19:38:24 -08:00
Erin Le
3b62f1f8e4 arch-riscv: Add senvcfg CSR
This commit adds the senvcfg CSR, which fixes the 6.11.3 kernel
crash documented in issue 1674. I have not added a bitfield and
its implementation in isa.cc only uses setMiscRegNoEffect, so
this implementation is likely missing some critical components.
2024-12-06 19:38:14 -08:00
Erin (Jianghua) Le
8f37677c9b misc: v24.1 release notes update (#1840) 2024-12-06 16:13:43 -08:00
Clement Dieperink
2b645ed38c arch-riscv: fix tlb stats in timming mode (#1832)
The previous #484 issue reported a bug where the TLB stats on RISC-V
were incremented twice on misses by calling the `lookup` function twice
with hidden argument set to `false`. The fix is only applied on atomic
mode as the `translation` argument of `doTranslate` will not be
`nullptr` in timing mode.

In that case, if the TLB lookup misses, the `doTranslate` function will
start the walker and then return without doing anything more. Later,
when the pagetable walker finds the corresponding PTE, it will
insert it and call `translateWithTLB`. This function then calls `lookup`
again, which will hit in any case (and crash if not, due to the following
assert), but the hit count is incremented here too.

This commit fixes this by setting the `hidden` argument of `lookup` to true.
2024-12-06 11:27:52 -08:00
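The double counting described in 2b645ed38c can be reproduced with a toy model. This is a minimal Python sketch, not gem5's C++ TLB: `ToyTlb`, `timing_translate`, and the counters are hypothetical names chosen for illustration; only the `lookup` function and its `hidden` argument are taken from the commit message.

```python
class ToyTlb:
    """Hypothetical TLB model with hit/miss statistics."""

    def __init__(self):
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn, hidden=False):
        # A hidden lookup returns the entry without touching the stats.
        entry = self.entries.get(vpn)
        if not hidden:
            if entry is None:
                self.misses += 1
            else:
                self.hits += 1
        return entry

    def insert(self, vpn, ppn):
        self.entries[vpn] = ppn


def timing_translate(tlb, vpn, ppn_from_walk, hide_second_lookup):
    """Model a timing-mode translation: lookup, walk on miss, lookup again."""
    entry = tlb.lookup(vpn)  # first lookup, always counted
    if entry is not None:
        return entry
    tlb.insert(vpn, ppn_from_walk)  # stands in for the page table walker
    # Second lookup after the walk; it always hits, so counting it
    # records one miss and one hit for a single access.
    return tlb.lookup(vpn, hidden=hide_second_lookup)


before_fix = ToyTlb()
timing_translate(before_fix, vpn=1, ppn_from_walk=42, hide_second_lookup=False)

after_fix = ToyTlb()
timing_translate(after_fix, vpn=1, ppn_from_walk=42, hide_second_lookup=True)
```

With the second lookup counted, the single access shows up as both a miss and a hit; with it hidden, only the miss is recorded.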
Bobby R. Bruce
3711bf8a7a base,arch-arm: Add GEM5_NO_OPTIMIZE; use in ARM's vfp.hh (#1834)
GCC and Clang have different annotations for declaring that code should
not be optimized. Adding GEM5_NO_OPTIMIZE provides gem5 developers a
macro that works in both cases.

This change replaces the GCC pragmas in vfp.hh with GEM5_NO_OPTIMIZE
as this solution didn't work with clang.
2024-12-04 21:36:18 -08:00
Jason Lowe-Power
5672d63ae4 mem-ruby: Fix functional access in MI_example (#1838)
In MI_example, when in MI state the block is "Maybe_Stale", as this
controller may have the most up-to-date value or it could be in the
network. For MII it is guaranteed that this controller has the most
up-to-date value because it received a PUTX_NACK.

This fixes one of the daily test failures.
2024-12-04 21:35:46 -08:00
Harshil Patel
a8db1fc683 scons: get protocol info from slicc instead of file parsing 2024-12-04 21:35:14 -08:00
Harshil Patel
02a5ddaeac mem-ruby, scons: Add ProtocolInfo.hh files in build targets
- In the new MultiRuby system, the generated ProtocolInfo header files were not being correctly added to the build targets in SCons.

- As a result, when building gem5 with the --duplicate-sources option, these files were mistakenly deleted by SCons.
This happened because SCons treated them as source files instead of generated build targets.

- This commit ensures that the ProtocolInfo header files are explicitly included in the build targets, preventing their unintended removal and fixing the build issue.
2024-12-04 21:34:54 -08:00
Bobby R. Bruce
dee42f1867 arch-riscv: Remove CPU_SET use for non-linux host (#1835)
For non-Linux systems, we use cpu_set_cpu. CPU_SET is a macro that is
not available for non-Linux systems.

Fixes #1720
2024-12-04 15:48:49 -08:00
2channelkrt
f799d91309 ruby-chi: fix wrong ruby-CHI base class name (#1817)
fix ruby-CHI base class name so it actually runs

previously was combined with PR #1797
2024-12-04 15:47:44 -08:00
Giacomo Travaglini
8a9f61c546 misc: Add CHI section to the RELEASE-NOTES.md (#1833)
Change-Id: I2f01dd9c7a45c5f6baf57e4aad0f171417a6efb1

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-12-03 00:47:08 -08:00
Bobby R. Bruce
59ca5600ec misc: Update version info for v24.1 2024-12-02 11:10:28 -08:00
Giacomo Travaglini
c64a807f94 misc: Add ArmISA section to the RELEASE-NOTES.md file (#1822) 2024-12-02 09:38:02 -08:00
Junshi Wang
0a22e63467 arch-arm: Fix bug in VQRSHL.
If shiftAmt is 0, `bits` raises an assert, causing a core dump.

Change-Id: Ic4285f51a866ffc017645655e98674ca69a15a40
2024-12-02 08:46:57 -08:00
Erin (Jianghua) Le
1e5021c2e3 tests: modify gem5/learning-gem5 ref file to fix failure (#1795)
The test `ruby_test_test-ALL-x86_64-opt-MatchStdout` is currently
failing because the reference file doesn't match the actual output. This
PR changes the reference file to match.
2024-12-02 08:46:10 -08:00
Nicholas Mosier
25523e73a4 arch-x86, sim-se: move mmap end downward in case of large stacks (#1810)
Fix #1809. Shift the mmap end to a lower address in case the process has
a large max stack size, to avoid overlapping the stack with the mmap
memory range.

Change-Id: Idae343dbbe851a7510463ff141c03f1847e36328
2024-12-02 08:44:54 -08:00
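The adjustment described in 25523e73a4 amounts to clamping the mmap end below the lowest address the stack may reach. The following is a minimal Python sketch of that arithmetic only; `place_mmap_end` and all of its parameters and addresses are hypothetical and do not reflect gem5's actual process setup code.

```python
def place_mmap_end(stack_base, max_stack_size, default_mmap_end):
    """Return an mmap end address that cannot overlap the stack.

    All arguments are hypothetical byte addresses/sizes; the stack is
    assumed to grow downward from stack_base.
    """
    # Lowest address the stack may reach if it grows to its maximum size.
    stack_limit = stack_base - max_stack_size
    # Keep the default when it is already below the stack; otherwise
    # shift the mmap end down to the stack limit.
    return min(default_mmap_end, stack_limit)


small_stack = place_mmap_end(0x7FFF0000, 0x1000, 0x70000000)
large_stack = place_mmap_end(0x7FFF0000, 0x20000000, 0x70000000)
```

With the small stack the default mmap end is kept; with the large one it moves down to `0x5FFF0000`.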
Giacomo Travaglini
1b16697029 mem-ruby: Fix conflict between 117 and 1084
This is fixing the conflict between the multi-ruby [1] and the CHI-TLM
[2] PRs

[1]: https://github.com/gem5/gem5/pull/117
[2]: https://github.com/gem5/gem5/pull/1084

Change-Id: Ie9c6381c361ac344e22984d8a53ed03c387b0b43
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-12-02 08:43:58 -08:00
Harshil Patel
e51bc00dc7 misc: revert riscvmatched-fs.py due to a bug
- link to issue https://github.com/gem5/gem5/issues/1554

Change-Id: Ic9cf6e5166eeee2226b6022e6f7c971d4e7caaeb
2024-12-02 08:41:58 -08:00
Erin (Jianghua) Le
e221a70355 Add ExitEvent import to arm-ubuntu-run.py 2024-12-02 08:41:58 -08:00
Harshil Patel
630173a845 misc: update fs examples to use ubuntu 24.04 boot workloads
Change-Id: I7e16f69eff3a7ff0ab16c18e6d35e846d07ac829
2024-12-02 08:41:55 -08:00
Roger Chang
40ccb8b171 arch-riscv: Use getValidAddr to get zero-extend address in RV32 mode
The previous PR #1758 implements the generic getValidAddr to get the
pure vaddr without any tags or sign-extension bits.

In the RISC-V implementation, getValidAddr zero-extends the
address in RV32 mode and uses it for TLB translation. Using
getValidAddr to get the zero-extended vaddr reduces repeated
zero-extension code.

Change-Id: I2273ce48bccb873790103ba0fcdb0b48de9ced4c
2024-12-02 08:33:15 -08:00
studyztp
3a2cfb2dee cpu: fix looppoint anaylsis param python string spacing
Change-Id: I98fe434f1066f12b975425e49baca6e6a6087dab
2024-12-02 08:33:14 -08:00
studyztp
0f0a6a7851 cpu: fix pc count pair helper function return type
Change the helper function's return type from int to uint64_t

Change-Id: I34b6b563a6333bbf8516a16d2ad4b76b7c16bfe4
2024-12-02 08:33:14 -08:00
studyztp
4ce0f20436 cpu: make PcCountPair use 64 bit unsigned int for count
In PcCountPair param, change the type for "count" from 32 bit int to
64 bit unsigned int.

Change-Id: I2dc1bb2692914f06eaaae9bd5bbfb061bcbbfb8b
2024-12-02 08:33:14 -08:00
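The commit message for 4ce0f20436 does not state why the wider type was needed. One plausible reason, offered here as an assumption, is that a 32-bit signed counter wraps once a count exceeds 2^31 - 1. The sketch below uses Python's `ctypes` fixed-width integers to show the difference; the helper names are hypothetical.

```python
import ctypes


def as_int32(count):
    """Value of count when stored in a 32-bit signed integer."""
    return ctypes.c_int32(count).value


def as_uint64(count):
    """Value of count when stored in a 64-bit unsigned integer."""
    return ctypes.c_uint64(count).value


big_count = 2**31  # one past the largest 32-bit signed value

wrapped = as_int32(big_count)     # wraps to a negative number
preserved = as_uint64(big_count)  # stored unchanged
```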
studyztp
6a9db637ae cpu: add function to get inst map of each basic block
Change-Id: I147d8c90cdfc7bf795d1c6a6daf96e11fa1c0858
2024-12-02 08:33:14 -08:00
studyztp
7ffa3646bd cpu: fix the incorrect debug message
Change-Id: I062e359e8c9205a9a993a33865434922c1f540b8
2024-12-02 08:33:14 -08:00
studyztp
1410c29147 cpu: modified after review feedback
src/cpu/simple/probes/LooppointAnalysis.py:
- remove default values for bb_valid_addr_range and
marker_valid_addr_range
- add more comments to explain parameter behaviors
- add citation to the LoopPoint paper

src/cpu/simple/probes/looppoint_analysis.cc:
- fix the incorrect styles
- remove updateBackwardBranch() function call
- match the style of checking if listeners vector is empty
- change the way of stopListening() to remove the listeners through the
manager instead of through the ProbeListener object's destructor.

src/cpu/simple/probes/looppoint_analysis.hh:
- removed backwardBranchPC and use the backwardBranchCounter to replace
its functionality. Therefore, also removed the updateBackwardBranch
function.

Change-Id: Id2430e2f04e61f72d5c4f1aad5cfd4d24a0fbc45
2024-12-02 08:33:14 -08:00
studyztp
89717eca3c cpu: add more debug flags
Change-Id: I4edd8f383294f76d3e76895d3a631cba21a45f90
2024-12-02 08:33:14 -08:00
studyztp
753d9971d2 cpu: add more comments to looppoint_analysis.cc
Change-Id: I027db66ffed0cd5957bae2a9a36286ca1c73c313
2024-12-02 08:33:14 -08:00
studyztp
a1072357c1 cpu: fix a issue
Change-Id: Iab621e294c84c7f5c704882b0c681f950ad08f9c
2024-12-02 08:33:13 -08:00
studyztp
abc8a4a483 cpu: fix a wrong file path
Change-Id: I93343f4053c7a6d1bd4b6972a1e7c3dbc073c979
2024-12-02 08:33:13 -08:00
studyztp
cd29b199ce cpu: add the python class
Add the python classes for the LooppointAnalysis and the
LooppointAnalysis Manager.

Change-Id: I0a882bc1a9ef03b7b482e871a7160e7c33f9ac08
2024-12-02 08:33:13 -08:00
studyztp
e10fff4876 cpu: add looppoint_analysis.cc content
Add LooppointAnalysis and LooppointAnalysisManager function definitions

Change-Id: I1c05072ebf1b744ee102a82f8de2b93bab4a056f
2024-12-02 08:33:13 -08:00
studyztp
fff6c895fe cpu: add comments and improve naming in looppoint_analysis.hh
Add comments to most variables and functions.
Change the naming of some variables and functions to improve
clarity.

Change-Id: Idb557ec84698b4344ed4683f5de87b1a3c2fd66d
2024-12-02 08:33:13 -08:00
studyztp
3c7c7b8b54 cpu: add looppoint_analysis.hh content and licenses
In looppoint_analysis.hh, added LooppointAnalysis and
LooppointAnalysisManager classes.
Added all functions and variables for the classes.
Comments needed.

Change-Id: Ia7425b672ef092a68c99b702136850bfa1fcf0a2
2024-12-02 08:33:13 -08:00
studyztp
157d89e255 cpu: add basic files for LoopPoint analysis
Because the LoopPoint analysis will be done with the ATOMIC CPU, all
files related to the LoopPoint analysis object will be under
/src/cpu/simple/probes.

Change-Id: Icbdb0742b712a23dc8f6a19f4c1c827a1f5bf288
2024-12-02 08:33:13 -08:00
Matthew Poremba
9fe8c7cd74 stdlib: Updates to VIPER board after all protocols PR 2024-12-02 08:33:13 -08:00
Jason Lowe-Power
6cf5a46f68 stdlib: Update names for GPU children
This change updates the names for the GPU children in a better way than
overriding the parent. Now it looks something like

```text
board.gpus.shader.CUs00
board.gpus.gpu_caches.ruby_gpu.controllers02
board.gpus.memory.mem_ctrl0
```

Note that it is "gpus" with an "s" because the board accepts more than 1
GPU, optionally.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-12-02 08:33:12 -08:00
Jason Lowe-Power
c75c267e34 stdlib: Remove debug prints
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-12-02 08:33:12 -08:00
Jason Lowe-Power
e93f498aac stdlib: Add get_devices to abstract board
This function returns the GPUs (for now, possibly other devices in the
future). It needs to be in the abstract board so the GPU-specific cache
hierarchies can be used with non-GPU boards.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-12-02 08:33:12 -08:00
Jason Lowe-Power
bec9ae77e6 stdlib: Override the readfile contents in GPU board
This prepends loading the GPU drivers to anything passed in via the
readfile_contents. Note that if the user sets a specific readfile via a
file they will be responsible for loading the driver

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-12-02 08:33:12 -08:00
Mahyar Samani
2fca39cec7 dev-amdgpu: Separating gpu_memory from gpu_cache.
This change separates the instantiation of gpu memory from
instantiating the gpu cache. Prior to this change, the gpu
cache instantiated the memories for the gpu by receiving number
of channels as a parameter. With this change, the gpu memory
should be constructed outside the gpu, without being added as a
child to any other object, and passed to the constructor of
the gpu.
2024-12-02 08:33:12 -08:00
Mahyar Samani
1948155fb2 stdlib: AbstractMemorySystem.get_mem_interfaces.
This change adds a new method to AbstractMemorySystem to allow
getting its objects of the class MemInterface. This is useful
when certain other classes require a list of MemInterface objects
to create physical memory. In addition, ChanneledMemory and
HighBandwidthMemory implement this function.
2024-12-02 08:33:12 -08:00
Maryam Babaie
c0c0955178 dev-amdgpu: Adding support for avs extended states and features. 2024-12-02 08:33:12 -08:00
Matthew Poremba
2105dc47a9 stdlib: Add viper board, viper cache, and gpu components
Adds GPU_VIPER protocol related caches to stdlib components: CorePair
cache, TCP, SQC, TCC, Directory, and DMA controllers. Adds GPU related
components in a new components/devices/gpus/ directory. Adds prebuilt
GPU and CPU cache hierarchies, GPU and CPU network classes, and a board
overriding the X86Board to provide helper methods for disk image root,
the complex kernel parameter list, and method to provide functionality
to the current GPUFS scripts to load in applications and handle loading
the GPU driver.

The new GPU components can be used as follows:
 - Create a GPU device *before* the CPU cache hierarchy is created.
 - Add the GPU's CPU-side DMA controllers to the list of CPU cache
   controllers.
 - Use GPU device method to connect to an AbstractBoard.

Each GPU component has its own RubySystem, PCI device ID, and address
ranges for VBIOS and legacy PCI BARs. Therefore, in theory, multiple
GPUs can be created. This requires PR #1453.

An example of using this board is added to configs/example/gem5_library
under x86-mi300x-gpu.py. It is designed to work with the disk image,
kernel, and applications provided in the gem5-resources repository.

Change-Id: Ie65ffcfee5e311d9492de935d6d0631260645cd3
2024-12-02 08:33:12 -08:00
Giacomo Travaglini
44b8f5f422 tests: Write unit-tests for ruby using the CHI-TLM library
This commit is adding two python files:

* ruby_mem_test.py is the canonical gem5 configuration script,
and it is an adaptation of the existing ruby_mem_test.py [1].
The main difference is the use of the TlmController as a
cache controller, and the use of TlmGenerator instead of
the MemTest memory tester. The config is minimally setting up
the system. The extent of the testing is specified in the second
python file:

* read_shared_unit.py: "unit-test" for the CHI ReadShared request
The file should be passed to the ruby_mem_test.py as cmdline
argument:

build/ARM/gem5.opt <>/ruby_mem_test.py <>/read_shared_unit.py

This is a simple testing file. We should ideally generate separate
test files for separate transactions/scenarios.
The test file can contain anything; it only needs to comply
with the minimal interface required by ruby_mem_test.py, which is
to define the following function:

def test_all(generator):

[1]: https://github.com/gem5/gem5/blob/stable/\
    configs/example/ruby_mem_test.py

Change-Id: I767ede9b8572f3eafe677c84da45fd904d77e319
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-12-02 08:33:11 -08:00
Giacomo Travaglini
706cb4195f mem-ruby: Add a CHI-TLM transaction Generator for testing
This commit is building over the CHI-TLM wrapping introduced
by the previous commit and it is adding a CHI traffic generator
as a SimObject.
This will get the python objects as input and it will forward
them to the TlmController to convert them into ruby CHI
messages

Change-Id: Ia67094c9bb880e37b24184313df546ecbaa3289f
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-12-02 08:33:11 -08:00
Giacomo Travaglini
786e539fa4 mem-ruby: Wrap the CHI-TLM library with pybind11
This commit is wrapping the external AMBA CHI-TLM with pybind11
so that it will be possible to use its data structures/functions
from python.

More specifically, we will be able to instantiate an ARM::CHI::Payload
and ARM::CHI::Phase from a gem5 config, with the end goal of being
able to configure a CHI transaction from python.

Change-Id: I9587b445c21df44161fa3d9e09fc2651541b38bd
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-12-02 08:33:11 -08:00
Giacomo Travaglini
b795d28ee8 mem-ruby: Add a CHI-TLM CacheController
This commit is extending the previously defined CHIGenericController
to implement a CacheController which acts as a bridge between the
AMBA TLM 2.0 implementation of CHI [1][2] with the gem5 (ruby) one.

In other words, it translates AMBA CHI transactions into ruby
messages (which are then forwarded to the MessageQueues)
and vice versa.

ARM::CHI::Payload,         CHIRequestMsg
                     <-->  CHIDataMsg
ARM::CHI::Phase            CHIResponseMsg
                           CHIDataMsg

[1]: https://developer.arm.com/documentation/101459/latest
[2]: https://developer.arm.com/Architectures/AMBA#Downloads

Change-Id: I6f35e7b4ade4d0de1b5e5d2dbf73ce796a9f9fb6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-12-02 08:33:11 -08:00
Giacomo Travaglini
76541929c9 configs: Instantiate RNFs and MN via callbacks
This commit allows top level configs making use of the Ruby module
to define node generation callbacks.
The config_ruby function will check the system object for two
factory methods

1) _rnf_gen, if defined, will be called to generate RNFs
2) _mn_gen, if defined, will be called to generate MNs

Change-Id: I9daeece646e7cdb2d3bfefa761a9650562f8eb4b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-12-02 08:33:11 -08:00
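The callback lookup described in 76541929c9 can be sketched as an optional-attribute check on the system object. This is an illustrative Python sketch, not the code in config_ruby: only the attribute names `_rnf_gen` and `_mn_gen` come from the commit message, while `generate_nodes`, the default generators, and the zero-argument callback signature are assumptions.

```python
from types import SimpleNamespace


def default_rnfs():
    """Hypothetical fallback used when no _rnf_gen callback is defined."""
    return ["default-rnf"]


def default_mns():
    """Hypothetical fallback used when no _mn_gen callback is defined."""
    return ["default-mn"]


def generate_nodes(system):
    """Use the system's factory methods when present, else the defaults."""
    rnf_gen = getattr(system, "_rnf_gen", None)
    mn_gen = getattr(system, "_mn_gen", None)

    rnfs = rnf_gen() if callable(rnf_gen) else default_rnfs()
    mns = mn_gen() if callable(mn_gen) else default_mns()
    return rnfs, mns


# A top-level config that customizes only the RNF generation.
custom_system = SimpleNamespace(_rnf_gen=lambda: ["custom-rnf-0", "custom-rnf-1"])
plain_system = SimpleNamespace()

custom_nodes = generate_nodes(custom_system)
plain_nodes = generate_nodes(plain_system)
```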
Tiago Mück
390c2b67e4 mem-ruby: Implement a CHI generic controller
Component implementing a generic controller that allows classic caches
to interact with Ruby/CHI.
The CHIGenericController provides an interface to send/receive CHI
messages to/from the interconnect. This is implemented in C++ rather than
SLICC. This controller is seen as a MachineType:Cache by the CHI
implementation in SLICC.

Change-Id: I3afc4363f4290095c2f7428c8487bccd932e0300
2024-12-02 08:33:11 -08:00
Tiago Mück
488c6fc246 mem-ruby: add CHI missing valid SnpRespData type
Change-Id: I49c24e8b99932f8ae88511bb7a08a94f59ce7d29
2024-12-02 08:33:11 -08:00
Tiago Mück
bc52d886a8 mem-ruby: add CHI SnpRespData_SD_Fwded_SC message
This snoop response is not generated internally by the SLICC
implementation, but is required for compatibility with classic caches
which may remain in SD state while returning SC data upon receiving
a converted SnpShared.

Change-Id: I5270b29c8863c7afd8abc39b3c7978b95330c183
2024-12-02 08:33:10 -08:00
Tiago Mück
f37dfc090d mem-ruby: sequencer prints panic pkts
Change-Id: I9cd780597c4680513d9cbeb8dda2e13f2a1faf56
2024-12-02 08:33:09 -08:00
Bobby R. Bruce
85c00a2ebc misc: Fix types in RELEASE_NOTES.md 2024-11-19 15:28:40 -08:00
Bobby R. Bruce
2d8a2eab70 misc: Revert bad merge
Merge 78db0e2 was bad and caused problems. This commit reverts it.
2024-11-19 15:02:02 -08:00
Saúl
c54132bdd9 arch-riscv: fix reg dep autoref on vslide with vcpy micro (#1782)
Vector slide instructions can have the same register group as source and
destination.
Because we are pinning the destination this will provoke an
auto-reference in the dependency graph.

The solution is to use the `vcpy` micro. This way we use the `vtmp`
register group as source and pin the destination without issues.
2024-11-19 11:18:45 -08:00
Erin (Jianghua) Le
75c4003a7e python: modify comment for ExitEvent.WORKEND (#1790)
This PR modifies the documentation for ExitEvent.WORKEND in simulator.py
so it is more consistent.
2024-11-19 11:17:59 -08:00
Bobby R. Bruce
5f01a03bde arch-arm,misc: Fix build errors (#1789)
1. Add missing override to `print` function.
2. Change `TlbEntry` to struct in `ArmISA` class.

This was found attempting to compile gem5 on MacOS (Apple Silicon) with
clang v19.
2024-11-19 11:14:54 -08:00
Bobby R. Bruce
9b5f2db157 misc: Include build_opts updates in RELEASE_NOTES 2024-11-19 11:04:03 -08:00
Erin Le
82c8642954 tests: remove protocol=None, add print statement back in
This commit removes the `protocol=None` argument in various
gem5_verify_config()s because protocol is set to None by
default. Also, two print statements that were taken out in
previous commits were put back in with different function calls.
2024-11-19 11:02:15 -08:00
Erin Le
bcfa988a67 tests, scons: Fix Testlib test failures
This commit changes the fs/linux/arm and learning_gem5 tests as
they were previously failing with the Ruby change. The
fs/linux/arm long tests require the addition of a new gem5 build,
ARM_X86, which builds the ARM and X86 ISAs with the
MESI_Two_Level cache hierarchy.
2024-11-19 11:00:37 -08:00
Erin Le
f29c2c2dcf scons: modify ALL build to have all Ruby protocols 2024-11-19 11:00:37 -08:00
Erin Le
cffc2e6144 tests: modify tests to use ALL build
This commit modifies a number of Testlib tests to use the ALL
build instead of a more specific build.
2024-11-19 11:00:37 -08:00
Erin Le
2ee40f1c11 mem-ruby: changes to MESIThreeLevel, MIExample, OctopiCache
This commit changes MESIThreeLevel, MIExample, and OctopiCache
so they work with this PR. It also adds MESIThreeLevel and
OctopiCache to the testlib tests.
2024-11-19 11:00:37 -08:00
Erin Le
8fe1228f3e scons, ext-testlib: rename NULL_All_Ruby to use all caps
This commit renames NULL_All_Ruby to NULL_ALL_RUBY and adds a
tag for this build for testlib. These changes were made so the
NULL_All_Ruby build could be used with testlib.
2024-11-19 11:00:37 -08:00
Erin Le
3535fd0449 tests: additional trusted_stats files
This commit adds trusted_stats.json files for CHIL1,
MESI_Three_Level, MIExample, and OctopiCache. These jsons are
used to verify the output of tests from test_memory_traffic_gen.py.
2024-11-19 11:00:37 -08:00
Jason Lowe-Power
6903420310 tests: Add more hierarchies to traffic gen tests
This adds all of the hierarchies supported in the standard library. We
can soon move to using a different build target and run all hierarchies!

Change-Id: Ic065a679ea34c3bb2f71b3b133806d240039fbb5
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 11:00:37 -08:00
Jason Lowe-Power
97542c1a4c mem-ruby,scons: Add scons option for multiple protocols
This change does many things, but they must all be atomically done.

**USER FACING CHANGE**: The Ruby protocols in Kconfig have changed names
(they are now the same case as the SLICC file names). So, after this
commit, your build configurations need to be updated. You can do so by
running `scons menuconfig <build dir>` and selecting the right ruby
options. Alternatively, if you're using a `build_opts` file, you can run
`scons defconfig build/<ISA> build_opts/<ISA>` which should update your
config correctly.

Detailed changes are described below.

Kconfig changes:

- Kconfig files in ruby now must all be declared in the ruby/Kconfig
  file
- All of the protocol names are changed to match their slicc file names
  including the case
- A new option is available called "Use multiple protocols" which should
  be selected if multiple protocols are selected. This is only used to
  set the PROTOCOL variable to "MULTIPLE" when in multiple mode.
- The PROTOCOL variable can now be "MULTIPLE" which means it will be
  ignored. If it's not "MULTIPLE" then it holds the "main" protocol,
  which is necessary for backwards compatibility with the Ruby.py files.

Ruby config changes:

To make this change backwards compatible with Ruby.py, this change adds
a new "protocol" config called MULTIPLE.py which is used to allow the
user to set a "--protocol" option on the command line. This is only
needed if you are using a gem5 binary with multiple protocols but need
to use Ruby.py.

stdlib changes:

- Make the coherence protocol file behave like the ISA file
- Add a function to get the coherence protocol from the `CacheHierarchy`
  like we do with the ISA in the `Processor`.
  - Use this function where `get_runtime_coherence_protocol` was used
- Update the requires code to work with the new CoherenceProtocol
- Fix a typo in the AMD Hammer name and also add the missing MSI
  protocol

Scons changes:

- In Ruby we now gather up all of the protocols and build them all if
  there are multiple protocols
- There's some bending over backwards to tell the user if they are using
  an out of date gem5.build/config file and how to update it
- Note that multiple ruby protocols adds a significant amount of time to
  the build since we have to run slicc twice for each file.

build_opts:

- Update all files with new names
- Add a new NULL_All_Ruby that will be used for testing

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 11:00:34 -08:00
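The role of the PROTOCOL variable described in 97542c1a4c can be sketched as a small resolution rule: a concrete value names the "main" protocol, while "MULTIPLE" defers to a `--protocol` option. This Python sketch is an assumption-laden illustration, not the contents of MULTIPLE.py or Ruby.py; `resolve_protocol` and its error handling are hypothetical, and the case where both values are given is resolved here in favor of the build-time protocol purely for simplicity.

```python
def resolve_protocol(build_protocol, cli_protocol=None):
    """Pick the Ruby protocol to configure.

    build_protocol: value of the build's PROTOCOL variable.
    cli_protocol:   value of a "--protocol" command-line option, if any.
    """
    if build_protocol != "MULTIPLE":
        # Single-protocol build: PROTOCOL holds the "main" protocol.
        return build_protocol
    if cli_protocol is None:
        # Multi-protocol build: the sketch requires an explicit choice.
        raise ValueError("PROTOCOL is MULTIPLE; pass --protocol to choose one")
    return cli_protocol


single = resolve_protocol("MESI_Two_Level")
chosen = resolve_protocol("MULTIPLE", cli_protocol="MI_example")
```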
Jason Lowe-Power
9a904478eb mem-ruby: Use runtime protocol instead of #defines
This removes two #defines: PARTIAL_FUNC_READS and PROTOCOL_<protocol>.
Instead, update the code to use the runtime information about which
protocol we are using.

Change-Id: Icb6f10fc2d3fd59128c62f9f6e37b52ef2581b61
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:59 -08:00
Jason Lowe-Power
b7ce3040de mem-ruby: Add ProtocolInfo class
Add a ProtocolInfo class that is specialized (through inheritance) for
each protocol. This class currently has the protocol's name and any
protocol-specific options (partial_func_reads is the only one so far).
Note that the SLICC language has been updated so that you can specify
the options in the `protocol` statement in the `.slicc` file.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:59 -08:00
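The structure described in b7ce3040de, a base class specialized per protocol and queried at runtime, can be sketched as follows. The real class is C++ generated from SLICC; this Python sketch only mirrors the idea. The subclass names `ToyProtocolAInfo` and `ToyProtocolBInfo` and the `functional_read_mode` helper are hypothetical, and only the protocol name and the `partial_func_reads` option come from the commit message.

```python
class ProtocolInfo:
    """Base class holding per-protocol information."""

    def __init__(self, name, partial_func_reads=False):
        self.name = name
        self.partial_func_reads = partial_func_reads


class ToyProtocolAInfo(ProtocolInfo):
    """Hypothetical protocol that enables partial functional reads."""

    def __init__(self):
        super().__init__("ToyProtocolA", partial_func_reads=True)


class ToyProtocolBInfo(ProtocolInfo):
    """Hypothetical protocol that keeps the default options."""

    def __init__(self):
        super().__init__("ToyProtocolB")


def functional_read_mode(info):
    # Runtime check on the protocol object, in place of a compile-time
    # preprocessor definition.
    return "partial" if info.partial_func_reads else "full"


mode_a = functional_read_mode(ToyProtocolAInfo())
mode_b = functional_read_mode(ToyProtocolBInfo())
```

Because the option lives on an object rather than in a preprocessor definition, one binary can hold several protocols and branch on whichever is in use.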
Jason Lowe-Power
4f53451073 mem-ruby,configs: Update AMD protos with new names
Update the MOESI_AMD_Base and GPU_VIPER configuration files with the new
full protocol-specific names for the controllers instead of the
deprecated names.

Note: If you have any files which use the `CntrlBase` base, you will
likely need to update the class names that you are inheriting from.

Change-Id: I623fea7dd4cd151f7b15fe7cb43f8a4c45492d89
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:59 -08:00
Jason Lowe-Power
d1ed308af8 stdlib,mem-ruby: Use protocol-spec. names
Update the standard library Ruby protocols to use the protocol-specific
class names instead of the deprecated general names.

Unfortunately, some code became duplicated between similar controllers.
I tried multiple inheritance, but it didn't work out for me. I think the
correct solution is to move some of the shared code down into the
generated python. That's out of the scope for these changes.

Change-Id: I3444bee3c2917dcbe92b600b85e60244129aad35
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:59 -08:00
Jason Lowe-Power
42fe5accea configs,mem-ruby: Protocol-spec. names in CHI
Use the protocol-specific controller names in CHI.

**Important**: This could change some scripts. As long as people use
CHI_config (likely), this shouldn't be a problem, but if you have a
different version of CHI_config.py locally, you will need to make the
following updates:

`Cache_Controller` -> `CHI_Cache_Controller`
`Memory_Controller` -> `CHI_Memory_Controller`

Website updates coming soon!

Change-Id: I7afdcede884ac5f9a9a76cc3d3dd35941e4e2faa
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:59 -08:00
Jason Lowe-Power
d56d561102 configs,mem-ruby: Protocol-spec. in learning gem5
Use protocol-specific names in Learning gem5 configs. Now, we should no
longer use the generic names for the controllers (it's deprecated). This
updates Learning gem5.

Website changes coming soon. (Hopefully before I push this...)

Change-Id: I18fc5b8bb0fef7c3b8b5cea8de4f73fc0f66a1b3
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:58 -08:00
Jason Lowe-Power
3ba16adeff scons: Change scons for multiple protocols in SLICC
This change is a step toward multiple protocols building at the same
time in scons. Add functions and use lists instead of single protocol.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:58 -08:00
Jason Lowe-Power
b925a6e57c mem-ruby: Update MachineType autogen file with all types
This change makes it so that the MachineType.cc/hh files are not unique
for each protocol. All of the machine types are now tracked.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:58 -08:00
Jason Lowe-Power
18401758aa mem-ruby: Rename SLICC SimObjs with compatibility
Rename all SLICC generated SimObjects to have the protocol in their
name. This will allow for two different protocols to have the same
machine names (e.g., L1Cache). For compatibility, we check to see if the
current or main protocol that is built matches the SimObject's protocol
and export the backwards-compatible name.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:58 -08:00
Jason Lowe-Power
1a713e8c65 mem-ruby: Update HTML output to include protocol
Move the html output to be in a subdirectory with the protocol name.

Change-Id: I1510d2d5a531cc6db74d10a0478c23bc8a836a26
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:58 -08:00
Jason Lowe-Power
feb45c9cb9 mem-ruby: Move protocol files to subdir
Move all generated protocol-specific files to a subdirectory with the
protocol's name.

This change also updates SLICC to have separate variables for the
filename, c identifier and python identifier instead of just using
variations of the c identifier.

Change-Id: I62f69a4606b030ee23cb2d96493f3257a6923748
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:58 -08:00
Jason Lowe-Power
3a4465d908 mem-ruby: Use namespaces for protocol types
Wrap all protocol-specific types in `namespace <protocol>`. This will
facilitate compiling multiple protocols into one binary.

There is a one-time hack to the generated `MachineType.cc` file to use
the namespace for the protocol until we generalize the machine types.

Change-Id: I5947e8ac69afe6f7ed257d7c5980ad65e9338acf
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:58 -08:00
Jason Lowe-Power
1b84fbbeae mem-ruby: Use shared and per-protocol SLICC files
This change extends SLICC to understand two different kinds of slicc
files: files that are protocol-specific and files that are shared or
included between different protocols.

Each declaration in SLICC can now be shared or not. If it is shared,
then we can take a different action in the code generation (e.g., wrap
in a namespace).

*Developer facing change*
Removes the RubySlicc_interfaces.slicc file from the SLICC includes of
every protocol.

Changes required: If you have a custom protocol, you will need to remove
the line `include "RubySlicc_interfaces.slicc"` from your .slicc file.

Change-Id: Ia6c2dafe2b8fe86749a13d17daa885bddd166855
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:53:47 -08:00
Jason Lowe-Power
c0f67f7388 python: Expand Enum param type to be more Enum-like
This extends gem5's version of python enums to support an equal operator
and the hash operator so we can compare two instances of enums and add
these to sets/dicts/etc.
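
As a rough illustration of why both operators are needed, here is a minimal sketch in plain Python. The `EnumValue` class and its `allowed` list are hypothetical stand-ins, not gem5's actual Enum param implementation. It shows that once `__eq__` is defined, `__hash__` has to be defined too before instances can go into sets or dicts.

```python
class EnumValue:
    """Hypothetical stand-in for an enum-typed parameter value."""

    # Illustrative values only; a real enum would declare its own list.
    allowed = ("MESI_Two_Level", "MI_example")

    def __init__(self, value):
        if value not in self.allowed:
            raise ValueError(f"invalid enum value: {value!r}")
        self.value = value

    def __eq__(self, other):
        # Allow comparison against another instance or a plain string.
        if isinstance(other, EnumValue):
            return self.value == other.value
        if isinstance(other, str):
            return self.value == other
        return NotImplemented

    def __hash__(self):
        # Defining __eq__ disables the inherited __hash__, so provide one
        # that agrees with equality.
        return hash(self.value)


a = EnumValue("MESI_Two_Level")
b = EnumValue("MESI_Two_Level")
protocols = {a, b}  # equal instances collapse to one set element
```

Hashing the underlying string keeps the hash consistent with both forms of equality used above.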

Change-Id: I4a785bf9570a54254ada1db684379ee77e67b192
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-19 10:48:53 -08:00
studyztp
0d16c92341 cpu: add comments and change input type to list
Change the inst_threshold param to inst_thresholds, which now
expects a list of thresholds instead of a single threshold.

Add more getter and setter functions:

addThreshold:
it is for adding new thresholds

getCounter:
it is for getting the current counter

getThresholds:
it returns the list of targeted thresholds

resetThresholds:
it clears all the targeted thresholds

Change-Id: I48d022effe7b315112ac150e6a4eaf5aab41c514
2024-11-18 11:24:26 -08:00
studyztp
627734e830 cpu: clear listeners list
Change-Id: Ie9d664df1b29a0ba62174046a7ab1fda6753bef4
2024-11-18 11:24:19 -08:00
studyztp
0b0a8431dc cpu: delete listener ptrs after removal
The listener pointer does not get deleted with the removeListener()
function call, so we need to make sure it is deleted in the
ProbeListenerObject.

Change-Id: I370f34651b889c8c00a378743e9c1c09fa1d775e
2024-11-18 11:24:11 -08:00
studyztp
9fede07f44 cpu: modified with review feedback
x86-global-inst-tracker.py:
- change the incorrect use of comment style
- add more comments about the usage of the script and the purpose of
the script

src/cpu/probes/inst_tracker.cc:
- change the way of stopListening to use the manager function to remove
listeners. If in the future, the ProbeListener object does not call the
manager to remove itself in the destruction, then we should call it
here.
- fix styling

src/cpu/probes/inst_tracker.hh:
- fix styling

Change-Id: I6f3d745e15883a8a702593f72f984e0d4cc4c526
2024-11-18 11:24:04 -08:00
studyztp
6e39b737c8 cpu: add an example config script
Change-Id: Id24f60d43d61766526bd45086f9aeda02fe24822
2024-11-18 11:23:58 -08:00
studyztp
ddb29819ee cpu: reorder functions, add more debug flags and comments
Change-Id: I94bd4771130441a8e2e449a7527e87ba5c355236
2024-11-18 11:23:50 -08:00
studyztp
66d3f7c038 cpu: add GlobalInstTracker and LocalInstTracker
The GlobalInstTracker manages the global instruction counter and is
responsible for triggering an exit event when the global instruction
counter reaches the defined threshold.

The LocalInstTracker listens to one core's retiredInsts probe point
and updates the GlobalInstTracker every time there is an instruction
committed.

The purpose of this instruction tracker is to raise an instruction
executed exit event with multi-core simulation.
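
The split between the two trackers can be sketched in plain Python as follows. This is only a model of the idea described above, not the C++ SimObjects: the class names `GlobalTracker` and `LocalTracker`, the `notify_retired` method, and the `exit_raised` flag are all hypothetical.

```python
class GlobalTracker:
    """Hypothetical model of the shared instruction counter."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counter = 0
        self.exit_raised = False

    def on_commit(self):
        self.counter += 1
        if not self.exit_raised and self.counter >= self.threshold:
            # Stands in for scheduling a simulator exit event.
            self.exit_raised = True


class LocalTracker:
    """Hypothetical per-core listener that forwards each commit."""

    def __init__(self, global_tracker):
        self.global_tracker = global_tracker

    def notify_retired(self):
        # Called once per committed instruction on this core.
        self.global_tracker.on_commit()


shared = GlobalTracker(threshold=5)
cores = [LocalTracker(shared) for _ in range(2)]
for i in range(4):
    cores[i % 2].notify_retired()  # commits interleave across cores
```

Because every core reports to the same counter, the threshold is reached on the fifth commit regardless of which core retires it.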

Related discussion can be found:
https://github.com/gem5/gem5/issues/1087

Change-Id: Iab6fec57f14f28e590b035506282130ba8662706
2024-11-18 11:23:34 -08:00
aperais
b82ab5ac89 misc: Do not share the random number generator across components (#1534)
Components that require randomness should not share their randomness
source with other components to avoid simulation noise. For instance,
the branch predictor of one core should not impact the random
cache replacement policy of the cache of another core. This currently
happens as all components share a single random number generator.
    
This PR provides their own generators to relevant components, although
a couple components still use rand().
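
The effect can be illustrated with a small sketch in plain Python, assuming each component simply owns a separately seeded generator. The `Component` class, the seeds, and `victim_sequence` are hypothetical; the commit does not describe how the C++ generators are seeded.

```python
import random


class Component:
    """Hypothetical component that owns its own randomness source."""

    def __init__(self, seed):
        self.rng = random.Random(seed)

    def draw(self):
        return self.rng.randrange(1 << 16)


def victim_sequence(extra_predictor_draws):
    """Return the cache's random choices after the predictor draws."""
    cache = Component(seed=1)
    predictor = Component(seed=2)
    for _ in range(extra_predictor_draws):
        predictor.draw()  # activity in an unrelated component
    return [cache.draw() for _ in range(4)]
```

With a single shared generator, the second argument would change the returned sequence; with per-component generators it does not.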
    
Change-Id: I3fb7226111c9194ee457af0f0f2b83f8c7b69d1e

Co-authored-by: Arthur Perais <arthur.perais@univ-grenoble-alpes.fr>
2024-11-18 01:37:12 -08:00
Jason Lowe-Power
5ae26c0f09 stdlib: Add interface to set binary in fs mode (#1743) 2024-11-18 00:23:59 -08:00
Marleson Graf
c31bc284a8 mem-ruby,sim-se: Fix functional reads for MESI protocols
This commit fixes three issues in MESI_Three_Level and MESI_Two_Level
implementations (MESI_Three_Level_HTM might still have issues).

1) Define functional read priorities for the cache controllers which
have states with Maybe_Stale access permission (L1 > L2 > Directory).

2) Fix incorrect access permissions in MESI_Three_Level-L1cache:
* S_IL0 is Read_Only, it is waiting for L0 to acknowledge the
  invalidation request before moving to SS, also a Read_Only state.
* E_IL0 is Maybe_Stale, its contents might be valid, since there is a
  transition (E_IL0, L0_Ack, EE) with no writeback data.
* M_IL0 is Maybe_Stale, its contents might be valid, since there is a
  transition (M_IL0, L0_Ack, MM) with no writeback data.

3) Add missing message types carrying valid data in functional reads:
* INV_DATA is a writeback from L0 to L1.
* DATA is a response to GET_S, but there are scenarios where it might
  be the only place with valid data (e.g. during L2 replacement).

Change-Id: Ie44fa317027f9ede272967e7461d337e14355eec
2024-11-18 00:22:45 -08:00
Marleson Graf
63d110fb7a mem-ruby,sim-se: Support Maybe_Stale in functional reads
Functional reads can be satisfied by one of the following, in order:
1. Main memory (when the data is not present in the cache hierarchy);
2. Valid data block in cache;
3. Valid data block in coherence message;
4. Valid data block marked as Maybe_Stale;

Number 4 is not handled by the current implementation. A Maybe_Stale
block can be either truly stale or actually valid. When it is stale,
the memory read will be satisfied by either number 2 or number 3. When
it is valid, there will be no coherence message with valid data inside,
and the Maybe_Stale block will transition to a valid state after
receiving some kind of acknowledgement.

The main challenge in handling number 4 is knowing which
Maybe_Stale block the data should be read from. For instance, in a two
level cache hierarchy, we might have a block marked as Maybe_Stale in
both L1 and L2. In this case, we should prioritize the cache controller
that is closest to the CPU. To define this priority, a new virtual
function 'functionalReadPriority' was added to the AbstractController
class.
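
The selection order described above can be sketched in plain Python. This is a simplified model, not the Ruby implementation: the `Block` type, the three permission strings, and the convention that a smaller `priority` value means "closer to the CPU" are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    data: int
    permission: str    # "valid", "Maybe_Stale", or "invalid" (simplified)
    priority: int = 0  # assumption: smaller means closer to the CPU


def functional_read(memory_data, cache_blocks, message_data: Optional[int] = None):
    # A valid block in any cache wins.
    valid = [b for b in cache_blocks if b.permission == "valid"]
    if valid:
        return valid[0].data
    # Otherwise use valid data carried by an in-flight coherence message.
    if message_data is not None:
        return message_data
    # Otherwise fall back to the Maybe_Stale block closest to the CPU.
    maybe_stale = [b for b in cache_blocks if b.permission == "Maybe_Stale"]
    if maybe_stale:
        return min(maybe_stale, key=lambda b: b.priority).data
    # Nothing in the hierarchy: read main memory.
    return memory_data
```

The `min` over `priority` plays the role that the commit assigns to `functionalReadPriority`: it breaks the tie when several controllers hold a Maybe_Stale copy.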

Change-Id: I4774cd01aab7bb9ca53694cd9dc4f9416a8e4025
2024-11-18 00:22:36 -08:00
Bobby R. Bruce
78db0e26b2 misc: Merge branch v24.0 stable into v24.1 release staging
For reasons I do not fully understand the prefetch code was out-of-sync
between develop and stable.
2024-11-11 13:51:39 -08:00
Simon Lammer
665d32cba2 misc: Fix typo in README.md (#1763) 2024-11-11 10:03:54 -08:00
Yu-Cheng Chang
8b1075b792 arch, cpu: Add generic getValidAddr to correct exetrace symbol table (#1758)
getValidAddr returns the virtual address restricted to its valid bits. It
is useful for looking up the correct symbol table with a valid virtual address.

For ARM, we have `purifyTaggedAddr` to get the right virtual address.
For RISC-V, we only get lower 32 bits in RV32 mode to get the right
symbol table.
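
For the RV32 case, the idea amounts to masking the address before the symbol lookup. A minimal sketch in plain Python, where the `valid_addr` helper, the `symbols` dict, and the example address are hypothetical:

```python
def valid_addr(vaddr, rv32):
    """Hypothetical helper: keep only the meaningful address bits."""
    # In RV32 mode only the lower 32 bits are kept.
    return vaddr & 0xFFFF_FFFF if rv32 else vaddr


# Illustrative symbol table keyed by address.
symbols = {0x8000_1000: "main"}

raw = 0xFFFF_FFFF_8000_1000  # e.g. a 32-bit address with upper bits set
name = symbols.get(valid_addr(raw, rv32=True))
```

Without the mask, the lookup with `raw` would miss the `main` entry.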

Change-Id: I33ad7bec6e7ea4ec82cb1b3a7f521432c6d735b6
2024-11-08 13:33:53 +00:00
Noah Krim
ad8bd6b5c7 arch-arm,util-m5: Change arm64's default m5 call type to addr (#1583)
As of PR #977, gem5 has a defined M5OPS_ADDR for arm64, even if it is
constrained to certain conditions. With that change and the arm64 board
KVM support (#725), it seems like interaction with the m5 utility under
arm64 will most commonly occur in KVM -- where instruction-mode does not
work -- and thus address-mode becomes more desirable as the default.

This also makes m5's behavior in arm64 consistent with x86, the only
other architecture that supports address-mode operations.
2024-11-07 10:08:10 -08:00
dependabot[bot]
ca07a06893 misc: bump mypy from 1.11.2 to 1.13.0 (#1748)
Bumps [mypy](https://github.com/python/mypy) from 1.11.2 to 1.13.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-06 10:41:03 -08:00
Erin (Jianghua) Le
f2892fd5bc misc: update RELEASE-NOTES.md for simInsts and simOps (#1750)
This PR updates RELEASE-NOTES.md to say that simInsts and simOps have
been modified such that they now reset to 0 when m5.stats.reset() is
called.
2024-11-06 10:40:07 -08:00
dependabot[bot]
ecde7d9fa9 misc: bump pre-commit from 3.8.0 to 4.0.1 (#1749)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.8.0
to 4.0.1.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-05 12:16:42 -08:00
Yu-Cheng Chang
70c211236a arch-riscv: sign-extend the PC when enter/leave trap handler (#1756)
The PR https://github.com/gem5/gem5/pull/1316 changes the sign-extend
address generation. We also need to sign-extend the address when setting
the PC in enter/leave trap handler

Change-Id: I62d58a26dba0b0c64125fea8ac9376ebf55c4952
2024-11-05 12:14:55 -08:00
Saúl
63ea52de56 arch-riscv: fix vrgather pin count (#1759)
The number of register pins for the vector gather instructions was not
calculated correctly because the micro vl was not right. This caused
some micros to rename a new register instead of using a pinned one.
2024-11-05 12:13:51 -08:00
Tommaso Marinelli
7f50372979 configs: Update legacy RISC-V FS Linux script (#1753)
This PR improves the legacy RISC-V FS Linux script in the following
ways:

- Adds an argument to specify the bootloader, to (optionally) use the
  `RiscvBootloaderKernelWorkload` class.
- Updates the DTB generation function adding the Chosen node. This
  fixes the execution with recent Linux kernels.
- Checks if the `--kernel` required argument is set.
2024-11-05 10:57:57 -08:00
Matthew Poremba
6881534bd2 misc: Add v24.1 release notes for RubySystem changes (#1735) 2024-11-05 10:47:37 -08:00
Vishnu Ramadas
d463868f28 dev-amdgpu, gpu-compute, mem-ruby: Add support for writeback L2 in GPU (#1692)
Previously, GPU L2 caches could be configured in either writeback or
writethrough mode when used in an APU. However, in a CPU+dGPU system,
only writethrough worked. This is mainly because in CPU+dGPU system, the
CPU sends either PCI or SDMA requests to transfer data from the GPU
memory to CPU. When L2 cache is configured to be writeback, the dirty
data resides in L2 when CPU transfers data from GPU memory. This leads
to the wrong version being transferred. A similar issue also crops up
when the GPU command processor reads kernel information before kernel
dispatch, only to read incorrect data. This PR contains a set of commits that
fix both these issues.
2024-11-05 10:45:46 -08:00
ylldummy
940f49b63b base: Make BaseGdbRegCache::data() non constant (#1734)
The method is defined as const but the caller will actually modify the
content of the structure directly with the pointer in
BaseRemoteGDB::cmdRegW. Member accesses in a const method are
treated as const and will cause an error if we use
reinterpret_cast instead.

Remove the const tag to align the expectation of the virtual method.
2024-11-05 10:43:41 -08:00
Giacomo Travaglini
3e628dd1c0 arch-arm: Cache a pointer to previously matched TLB entry (#1752)
One of the perks of the previous TLB storage implementation [1] is that
its custom implementation of LRU exploited temporal locality to speed up
simulation performance

            TlbEntry tmp_entry = *entry;
            for (int i = idx; i > 0; i--)
                table[i] = table[i - 1];
            table[0] = tmp_entry;
            return &table[0];

In other words the matching entry was placed as the first entry of the
TLB table (table[0], top of LRU stack). In this way a following lookup
would encounter it as the first entry while looping over the TLB table,
therefore massively reducing simulation time when temporal locality is
present (most TLB table loops would find a match in the first iteration).

    int x = 0;
    while (x < size) {
        if (table[x].match(lookup_data)) {

With the new implementation we decouple TLB storage from the replacement
policy. The result is a more flexible implementation but with the
drawback of a slower lookup/search. We therefore need to find another
way to exploit temporal locality. This patch addresses it by caching a
previously matched entry in the TLB table
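
The caching idea can be sketched in plain Python as a one-entry shortcut in front of the full search. The `Tlb` class, its `(vpn, ppn)` tuples, and the `searches` counter are hypothetical. Clearing the cached entry on flush is an assumption about what any such scheme needs, not a description of the patch.

```python
class Tlb:
    """Hypothetical TLB model: full search plus a one-entry shortcut."""

    def __init__(self, entries):
        self.entries = list(entries)  # (vpn, ppn) pairs
        self.prev_match = None        # most recently matched entry
        self.searches = 0             # number of full-table searches

    def lookup(self, vpn):
        # Fast path: the previous hit is checked before any search.
        if self.prev_match is not None and self.prev_match[0] == vpn:
            return self.prev_match[1]
        self.searches += 1
        for entry in self.entries:
            if entry[0] == vpn:
                self.prev_match = entry
                return entry[1]
        return None

    def flush(self):
        self.entries.clear()
        # Assumption: a stale cached entry must never outlive the table.
        self.prev_match = None
```

Repeated lookups of the same page then skip the search entirely, which is the temporal-locality benefit the old move-to-front LRU provided.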

[1]: https://github.com/gem5/gem5/blob/v24.0.0.0/src/arch/arm/tlb.cc

Change-Id: Id7dedf5411ea6f6724d1e4bdb51635417a6d5363

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-11-05 08:49:17 +00:00
Leon
2e998c9fc0 arch-riscv: Add support for Zicbop extension (#1710)
This PR adds support for RISC-V
[Zicbop](https://github.com/riscv/riscv-CMOs/blob/master/cmobase/Zicbop.adoc)
extension.

Change-Id: I13b044cf84608fb09b760348366ffad659a00427

Co-authored-by: Zhibo Hong <hongzhibo@bytedance.com>
2024-11-04 17:08:38 -08:00
dependabot[bot]
dba9a9e564 misc: bump tqdm from 4.66.5 to 4.66.6 (#1747)
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.5 to 4.66.6.
2024-11-04 11:11:40 -08:00
Giacomo Travaglini
4f74c3a949 arch-arm: Use the cached release object instead of HaveExt (#1751)
The MMU already stores a pointer to the release object, so it can query
it directly to check for PAN instead of relying on the slower HaveExt
helper

Change-Id: Ie3a186aa1d65955cff4a40871bde1ee78aa36ec0

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-11-03 11:18:10 +00:00
Matthew Poremba
2ed724b670 mem-ruby: Fix two NetDest locals using default constructor (#1746)
Two NetDest locally declared variables are using default constructor
instead of constructor with RubySystem pointer. This will cause asserts
when (1) garnet is used or (2) a protocol that uses `broadcast()` is
built.

Fix these two by passing the appropriate RubySystem pointers.
2024-11-02 08:37:04 -07:00
handsomeliu-google
956b164a43 Add Python interface to get port actual name (#1744)
In our use case, we'd like to intercept some gadgets in some gem5 ports,
and register them to a Python-level collection. The registered name is
the string from the C++ constructor argument (portName), and it would be
great if we could access that from the Python level as well. This commit
enables this by exporting a Python-bound method to access the portName.

Change-Id: I93398697536f27a52d3a1dd0e658fcb321b9e293
2024-11-02 08:59:50 -05:00
Giacomo Travaglini
d376360255 arch-arm: Rewrite the ArmTLB storage to use an AssociativeCache (#1661)
With this PR we replace the TlbEntry storage within the TLB, previously
an array of entries with a custom hardcoded FA indexing policy and LRU
replacement policy, with the flexible SetAssociative cache.
2024-11-02 10:18:44 +00:00
Ivana Mitrovic
cc4f466e1e util: Bumps werkzeug in gem5-resources-manager (#1723)
Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.3 to
3.0.6.
2024-11-01 10:51:34 -07:00
Giacomo Travaglini
a2476373c9 arch-arm: Do not compute purifyTaggedAddr in checkPermissions (#1739)
purifyTaggedAddr is known to be an expensive computation regardless of
the memoization we do, as it sits in the critical path from a host
performance point of view (instruction fetch).
In checkPermissions64 we compute it without really needing the tag
purification. The only place where it is used is to check for
PCAlignment, but the alignment checks the 3LSBs whereas a potential tag
would be stored in the most significant ones

Change-Id: I9f39db658c3575dcbacb5351813ff9bb3775046d

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-11-01 16:18:57 +00:00
Jason Lowe-Power
df6a318a86 arch-x86: Update MTRR defType register (#1732)
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-01 08:59:33 -07:00
Daniel Carvalho
ad17fa040a base: Remove DPRINTF_UNCONDITIONAL (#1724)
This macro has been marked as deprecated since 2021. Wrap its
deprecation process up.

Signed-off-by: odanrc <odanrc@yahoo.com.br>
2024-10-31 18:40:38 +00:00
Bobby R. Bruce
b5a73b59ef sim: Add include guards in simulate.hh (#1737) 2024-10-31 00:34:39 -07:00
Yu-Cheng Chang
757b272a25 arch-riscv: Fix Zcmp implement typos (#1727)
Fix some typos from previous PR: https://github.com/gem5/gem5/pull/1432

Change-Id: I7126d0a20b3294c7f15d90f2d50842d20ddb5e40
2024-10-30 09:47:30 -07:00
Bobby R. Bruce
24b672ab01 tests: update timeout on pannotia fw gpu test (#1736) 2024-10-30 09:47:15 -07:00
Harshil Patel
429580ee77 tests: update timeout on pannotia fw gpu test 2024-10-30 16:42:23 +00:00
Bobby R. Bruce
2c6de97ea1 Add SE mode to X86Board and RiscvBoard (#1702) 2024-10-29 20:17:47 -07:00
Bobby R. Bruce
d5d7880840 util-docker: Add qemu-riscv-env Dockerfile (#1731) 2024-10-29 17:19:43 -07:00
Bobby R. Bruce
d8e7c91127 mem-ruby: Remove unused variables/mark [maybe unused] (#1650)
PR gem5#1453 left some unused variables in the ruby code that triggered
"unused variable" warnings when compiling ALL/gem5.opt to use the CHI
protocol. These have been removed.
2024-10-29 14:31:20 -07:00
Matthew Poremba
1442a4dccd mem-ruby: Re-enable assign with implicit_ctor structures (#1694)
In #1453, an `implicit_ctor` option was added for SLICC structures. This
was done to allow statements such as `NetDest tmp;` which now require a
non-default constructor without modifying every protocol. The new
`implicit_ctor` option converts the statement `NetDest tmp;` in SLICC to
`NetDest tmp(<implicit_ctor>);` in C++. This is problematic when doing
something like `NetDest tmp := getMachines(...);` which gets converted
to `NetDest tmp(<implicit_ctor>) = getMachines(...);` as the constructor
doesn't return an object. Before #1453 NetDest had a default constructor
so there was no difference between a local variable definition and local
variable assignment.

This commit fixes this issue by checking in the LocalVariableAST if the
local variable is part of an assignment or not. If it is not part of an
assignment, the implicit_ctor is used. Otherwise, the assignment is
printed to the generated code.
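
The decision described above can be sketched as a small code-generation helper in plain Python. The `emit_local` function and the `ctor_arg` placeholder are hypothetical. This only mirrors the rule stated in the message and is not the actual `LocalVariableAST` code.

```python
def emit_local(type_name, var_name, implicit_ctor=None, assigned_expr=None):
    """Hypothetical generator for a C++ local variable declaration."""
    if assigned_expr is not None:
        # Part of an assignment: emit the assignment, skip implicit_ctor.
        return f"{type_name} {var_name} = {assigned_expr};"
    if implicit_ctor is not None:
        # Plain definition: pass the implicit constructor argument.
        return f"{type_name} {var_name}({implicit_ctor});"
    return f"{type_name} {var_name};"


definition = emit_local("NetDest", "tmp", implicit_ctor="ctor_arg")
assignment = emit_local(
    "NetDest", "tmp", implicit_ctor="ctor_arg", assigned_expr="getMachines(...)"
)
```

Checking for the assignment first is what avoids the invalid `NetDest tmp(...) = ...;` form.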

Note that this is not done anywhere in the public code but should be
allowed for folks writing their own Ruby protocols who might otherwise
be confused why a simple assignment presents a compile error.
2024-10-29 08:53:14 -07:00
Matt Sinclair
853f2ea012 configs,scons: Update scripts and build_opts to make GPU-FS simulations more configurable (#1693)
This PR adds support for command line arguments in GPU-FS runs to allow
the user to configure several parts of the GPU. It also increases the
bits per set in the build_opts/VEGA_X86 file to enable GPU-FS
simulations to use 64 directories or more.
2024-10-28 17:19:18 -05:00
Erin Le
11dd2c6c09 stdlib: address requested changes to X86, Riscv boards
This commit addresses the requested changes. An additional
comment is added for clarification, the exception type is
changed, and a few of the error messages have been
modified.
2024-10-28 15:00:19 -07:00
Marleson Graf
7bddc764cc mem-ruby: Prevent LL/SC livelock in MESI protocols (#1384) (#1399)
Fix #1384.

MESI_Two_Level and MESI_Three_Level protocols are susceptible to LL/SC
livelocks when simulating boards with high core count.

This fix is based on MOESI_CMP_directory's implementation of locked
states, but tailors the solution to only apply it when a Load-Linked is
initiated.

There are two new states to act as locked states and stall any messages
leading to eviction:
* LLSC_E: equivalent to E state, go to E after timeout.
* LLSC_M: equivalent to M state, go to M after timeout.

The main new event is Load_Linked, which is very similar (in behavior)
to a Store, reusing several transient states. When a controller receives
the exclusive data, it differentiates a Load_Linked from a Store by
checking a new field added to the TBE: 'isLoadLinked'. It triggers a
different event when it is a Load_Linked, which in turn causes the
transition to one of the locked states.

The entire mechanism can be turned off by setting 'use_llsc_lock' to
false, and the amount of time to keep locked is defined by
'llsc_lock_timeout_latency'.
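
A much-simplified model of the locked states, written in plain Python, may help picture the mechanism. Only the state names and the two option names come from the message above. The `L1Line` class, the `dirty` flag used to pick between E and M, the tick arithmetic, and the default timeout of 16 are assumptions for illustration.

```python
class L1Line:
    """Hypothetical single-line model of the LL/SC lock behaviour."""

    def __init__(self, use_llsc_lock=True, llsc_lock_timeout_latency=16):
        self.state = "I"
        self.use_llsc_lock = use_llsc_lock
        self.timeout = llsc_lock_timeout_latency
        self.unlock_tick = None

    def exclusive_data(self, now, is_load_linked, dirty=False):
        base = "M" if dirty else "E"
        if is_load_linked and self.use_llsc_lock:
            # Load-Linked: enter the locked variant of the state.
            self.state = "LLSC_" + base
            self.unlock_tick = now + self.timeout
        else:
            self.state = base

    def tick(self, now):
        # After the timeout, fall back to the equivalent unlocked state.
        if self.state.startswith("LLSC_") and now >= self.unlock_tick:
            self.state = self.state[len("LLSC_"):]
            self.unlock_tick = None

    def invalidate(self, now):
        """Return True if the invalidation is applied, False if stalled."""
        self.tick(now)
        if self.state.startswith("LLSC_"):
            return False  # stalled while the line is locked
        self.state = "I"
        return True
```

Stalling the invalidation for a bounded time gives the matching Store-Conditional a window to succeed, while the timeout keeps other cores from waiting indefinitely.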

Change-Id: I13f415b6b7890d51d01f23001047d2363467a814
2024-10-28 09:57:10 -07:00
Bobby R. Bruce
dde1c7d3a1 util-docker: Add RISCV to Ubuntu all-deps Docker platforms (#1716)
I have re-implemented building this image to target RISC-V in addition
to X86 and ARM. I have found it makes for quite a good cross compilation
tool.
2024-10-26 21:17:40 -07:00
Giacomo Travaglini
c9f94f4e06 arch-arm: Replace translateAtomic with translateFunctional in AT (#1713)
A previous PR mistakenly [1] replaced translateFunctional with
translateAtomic. This commit is reverting that

[1]: https://github.com/gem5/gem5/pull/1697

Change-Id: I945c3fe59cea36732d9f30109b950d4114aa8fad

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-10-25 09:15:52 -07:00
Harshil Patel
c91af552d4 tests: move weekly gpu tests to have separate jobs (#1698) 2024-10-24 04:02:23 -07:00
Bobby R. Bruce
709f2c7695 mem-ruby,tests: Add CHI with ISA tests (#1651) 2024-10-23 15:12:37 -07:00
Bobby R. Bruce
35db93ada4 arch-riscv: Fix the bug of vsetivli frequently flushing the pipeline (#1526)
This PR fixes the bug of vsetivli frequently flushing the pipeline.

Here are two pictures of the pipeline that illustrate this phenomenon.


![20240830-200208](https://github.com/user-attachments/assets/532a1a8e-8acd-483f-b9a0-c25dadbe76b4)

![20240830-200213](https://github.com/user-attachments/assets/9354a6ad-4024-4afb-be6f-01f08dc9610c)

The vsetivli(0x00013334.0) instruction in the first picture flushes the
pipeline every time it is executed. This is due to vsetivli being
incorrectly flagged as a 'DirectControl' instruction. The branch
predictor cannot predict it correctly.

The second picture is the pipeline after fixing the bug.

Change-Id: I5bede47919c06cea86fa23a81624b502fbdc1159
2024-10-23 08:32:56 -07:00
Zhibo Hong
089d780c76 arch-riscv: Fix the bug of vsetivli frequently flushing the pipeline
Change-Id: I5bede47919c06cea86fa23a81624b502fbdc1159
2024-10-23 17:24:43 +08:00
Erin Le
7b7f5ef34a stdlib: add SE mode to RiscvBoard
This commit adds SE mode to RiscvBoard. RiscvDemoBoard has also
been modified as adding SE mode to RiscvBoard made the
overridden functions in RiscvDemoBoard obsolete.
2024-10-22 16:31:01 -07:00
Erin Le
b9a19625ce stdlib: add SE mode to X86Board
This commit adds SE mode to X86Board. X86DemoBoard was also modified,
as functions that were previously needed to add SE mode to
X86DemoBoard were removed.
2024-10-22 15:01:27 -07:00
Erin (Jianghua) Le
f01d68bf96 stdlib, configs: Add RiscvDemoBoard (#1490)
This PR adds a RiscvDemoBoard that can be used with both SE and FS
mode. This was tested using the workloads riscv-matrix-multiply-run for
SE and riscv-ubuntu-20.04-boot for FS. Two example config scripts have
also been added.
2024-10-22 10:13:22 -07:00
Giacomo Travaglini
3a14a73982 arch-arm: Add support of AArch32 VRINTN/X/A/Z/M/P instructions. (#1655)
Add decoder and function of AArch32 VRINTN, VRINTX, VRINTA, VRINTZ,
VRINTM, and VRINTP (Advanced SIMD) instructions. Support both 16-bit and
32-bit variants.

Add vfpFPRint in vfp.hh to perform the behavior of round-to-integer.

Only support A32 encoding.

Change-Id: Icb9b6f71edf16ea14a439e15c480351cd8e1eb88
2024-10-22 18:37:30 +02:00
Nicholas Mosier
faf764e668 arch-x86: break 32/64-bit LEA's input dependency on prior dest value (#1683)
Fix #1682. Treat LEA as a BigLdStOp. BigLdStOps (as well as other Big*
x86 uops) do not have input dependencies on 32-/64-bit destinations. LEA
will still have input dependencies on 16-bit destinations. (LEA cannot
have an 8-bit destination.)

Change-Id: I5d0678e6bd79bfd6064941a89c6fe290750543c9
2024-10-22 09:34:30 -07:00
Giacomo Travaglini
0f75c39d30 arch-arm: Implement AT as standalone instructions (#1697)
Moving the address translation logic outside of the ISA::setMiscReg will
allow it to return and potentially invoke a fault
upon execution of the AT instruction. This change affects AArch64 mode
only
2024-10-22 17:25:16 +02:00
Harry Chiang
fce42880b9 dev: move dprint of reg name before register read/write (#1684)
Originally, the debug print naming the register being read or written
happened after reg.read() and reg.write(). However, reg.read() and
reg.write() may emit their own debug prints or warnings, and the log is
confusing when the register name only appears after them.

Creating this commit to change the order.
2024-10-22 10:12:38 +01:00
Matthew Poremba
16217f843f mem-ruby: Fix issues in protocols due to multi-RubySystem (#1690)
Starting with https://github.com/gem5/gem5/pull/1453, some Ruby
structures require a block size to be set and others require a pointer
to the Ruby system. This fixes some cases
which were not covered by the per-checkin tests but seen in daily+
tests. In particular:

 - WriteMasks and PerfectCacheMemory must explicitly set a block size.
 - NetDest and RubyProxyPort require RubySystem pointer.
 - Classes inheriting Message now have a setRubySystem collecting all
   objects that need a RubySystem pointer and this should be called in
   the constructor of the Message.

This commit makes sure all of these happen. This should fix daily
arm_boot_tests and daily learning_gem5 tests.
2024-10-21 12:30:03 -07:00
Bobby R. Bruce
2c679bfa04 tests: Fix replacement_policies tests' refs (#1695)
At some point 'system' was renamed to 'board' in the stdlib code used by
the replacement policy tests. As a result the output is slightly
different, meaning the refs need to be updated.

This was causing the Daily Tests to fail.
2024-10-21 12:28:29 -07:00
Junshi Wang
abf939f880 arch-arm: Improve implementation of AT instructions
Move AT instructions out of setMiscReg.

Modification includes:

- Add template for AT instructions in misc64.isa.
- Add decoder and execution of AT instruction in aarch64.isa and
data64.isa.
- Add AtOp64 and AtOp64Hub to perform the behavior of AT instructions.

Change-Id: I7e8b802421f7335203edb9f8d748ad8669954b8c
2024-10-21 17:32:15 +01:00
Junshi Wang
91c5218f91 arch-arm: Add WnR into the AnnotationIDs.
This allows forcing WnR to 1 for cache maintenance and address
translation instructions.

Change-Id: Id8608f655eacb5e3c2eba36da0a31e883c55a641
2024-10-21 17:32:15 +01:00
Bobby R. Bruce
b705629b83 learning-gem5: Add ruby_system param set to RubyPortProxy (#1686)
This missing parameter was causing the Learning gem5 tests to fail.

**Note:** We need to update the website's learning gem5 examples to
reflect this change.
2024-10-20 13:04:47 -07:00
Bobby R. Bruce
db47d20371 mem-ruby,misc: Remove redundant assignment (#1685)
This caused a warning to be thrown in Clang 19.
2024-10-20 13:02:53 -07:00
Nagendra-KJ
d0a9945d47 scons: Changed bits per set for VEGA_X86 to 128
Change-Id: I03fbb3000a13cf11fb751367677a7f1735f64ec9
2024-10-20 11:53:31 -05:00
Nagendra-KJ
a443b5cbb8 configs: Added command line arguments to gpufs config scripts
This commit adds command line arguments to the scripts that GPU-FS mode
uses.

Change-Id: I5514e77e699b9144461bbd2be6e267e7d44a6fb2
2024-10-20 11:53:21 -05:00
Bobby R. Bruce
644ad3cdb0 misc,tests: Fix incorrect date assignment in Actions 2024-10-18 14:59:16 -07:00
Mahesh Madhav
3e83f3ce4f scons,misc: Portable debug flag generation (#1666)
Modifies union construction in the debug directory so output is more
amenable to alternative compilers. Verified that this change produces
code that builds with clang, gcc, msvc, nvhpc, aocc, icc, openxl, and
cray hpc.

These were the kinds of errors seen in MSVC, which this patch fixes.
```
debug/Decoder.hh(24): error C2461: 'gem5::debug::unions::Decoder': constructor syntax missing formal parameters
debug/Decoder.hh(31): error C7624: Type name 'gem5::debug::unions::Decoder' cannot appear on the right side of a class member access expression
```
2024-10-18 14:39:09 -07:00
Bobby R. Bruce
b836a3f239 tests: update input sizes for pannotia tests (#1631)
This PR addresses comments from #1584 

- removed tests using the same binary multiple times. Each binary is
tested once with one graph
- Updated the input sizes as per the comments in the above mentioned PR
2024-10-18 13:42:30 -07:00
Bobby R. Bruce
ddaf70b64f Merge branch 'develop' into update-pannotia-tests 2024-10-18 13:40:59 -07:00
Giacomo Travaglini
2e271459d0 mem-cache: Implementation of SMS prefetcher (#1454)
This PR adds the SMS prefetcher described in
[this paper](https://web.eecs.umich.edu/~twenisch/papers/isca06.pdf).
This work was done in collaboration with @Setu-Gupta, and @xmlizhao

On branch sms
Changes to be committed:
modified: src/mem/cache/prefetch/Prefetcher.py
modified: src/mem/cache/prefetch/SConscript
new file: src/mem/cache/prefetch/sms.cc
new file: src/mem/cache/prefetch/sms.hh

Change-Id: I68d3bb6cf07385177d0f776fb958f652cfc41489
2024-10-18 19:15:57 +02:00
Harshil Patel
ae56a31b21 tests: Download only the resources used in pannotia tests 2024-10-18 17:12:43 +00:00
Giacomo Travaglini
c974bca123 arch-arm: Implement the L2 TLB as a 5-way set associative
Change-Id: I65d7a384f6d54989cec3c431090c35285011849f
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:15 +01:00
Giacomo Travaglini
7f826ffbaa arch-arm: Use the AssociativeCache in the ArmTLB
With this commit we replace the TlbEntry storage within the TLB from an
array of entries with a custom hardcoded FA indexing policy and LRU
replacement, into the flexible SetAssociative cache.

Change-Id: Ia74ff6962ac8195802b51dcc0caa516965f0ce37
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
ab6354a9cc arch-arm: Rename TlbEntry::Lookup into TlbEntry::KeyType
KeyType definition is required if we want to store the TlbEntry
within an AssociativeCache. We could add an alias and keep the
Lookup name but this will just create extra confusion

Change-Id: Ib0b7c9529498f0f6f15ddd0e7cf3cec52966e8df
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
c83321b843 arch-arm: Define a SetAssociative indexing policy for the TLB
Change-Id: I8149ddc4ecf7ac3b8b7e8e1cf7eb4932fd99c34a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
376530ef72 arch-arm: Add isValid method to the TlbEntry
Change-Id: I93b183ad0768e8afc94bb3f21387c21cdc9cc78b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
3f18cada53 arch-arm: Add insert method to the TlbEntry
Change-Id: I664b03b61e4540025c6cebaa4a7298297565c76b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
c6cca14b74 arch-arm: Add invalidate method to the TlbEntry
This is needed for compliance with the AssociativeCache
container. It will call the invalidate method when
invalidating the TLB entry

Change-Id: Idb1bc40b5aea8c475146700c81ab79d9980f745d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
ce8a98d657 arch-arm: Generate Lookup from TlbEntry
Change-Id: I355d190acfeb3cd829647b962548c82dd0013f8d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
fda8eeace4 arch-arm: Keep track of observed page sizes in the TLB
With this commit we record the page sizes of all valid
TLB entries in a TLB.

We update the set conservatively, therefore allowing
false positives but not false negatives.

This information will be used when doing a page size based
lookup. At the moment we don't strictly need it as we
iterate over all TLB entries (the TLB implements a fully
associative cache) and if we find multiple matches, it means
we have stored some partial translations.

The existing logic is prioritizing complete translations
over partial translations and among the latter, late stage
translations over early stage (with the idea to minimize
the number of walks).

The "iterate once over the entire TLB and record all matches"
won't work well when we shift from a fully associative
TLB into a set associative. With the introduction of the
aforementioned set, we can do page size based lookups,
so we can explicitly lookup the TLB for a specific page size
therefore looking into the appropriate set for a match
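
The page-size-based lookup described above can be sketched roughly as follows. This is an illustration only and not the ArmTLB code: `TlbSketch`, `TlbEntrySketch`, the modulo set index and the trivial victim choice are all assumptions made for the example, and the prioritization of complete over partial translations is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical, simplified TLB entry (not gem5's TlbEntry).
struct TlbEntrySketch
{
    bool valid;
    uint64_t vpn;        // virtual page number for this entry's page size
    unsigned pageShift;  // log2 of the page size
    uint64_t pa;         // physical base address
};

class TlbSketch
{
  public:
    TlbSketch(std::size_t num_sets, std::size_t ways)
        : sets_(num_sets, std::vector<TlbEntrySketch>(ways))
    {
        assert(num_sets > 0 && ways > 0);
    }

    void insert(uint64_t va, unsigned page_shift, uint64_t pa)
    {
        const uint64_t vpn = va >> page_shift;
        std::vector<TlbEntrySketch> &row = sets_[vpn % sets_.size()];
        // Trivial replacement for the sketch: first invalid way, else way 0.
        TlbEntrySketch *victim = &row[0];
        for (TlbEntrySketch &e : row) {
            if (!e.valid) { victim = &e; break; }
        }
        *victim = TlbEntrySketch{true, vpn, page_shift, pa};
        // Conservative update: the set only grows, so it may contain stale
        // page sizes (false positives) but never misses one in use.
        observedShifts_.insert(page_shift);
    }

    bool lookup(uint64_t va, uint64_t &pa_out) const
    {
        // One probe per observed page size, each into the set that this
        // page size maps the address to.
        for (unsigned shift : observedShifts_) {
            const uint64_t vpn = va >> shift;
            for (const TlbEntrySketch &e : sets_[vpn % sets_.size()]) {
                if (e.valid && e.pageShift == shift && e.vpn == vpn) {
                    pa_out = e.pa;
                    return true;
                }
            }
        }
        return false;
    }

  private:
    std::vector<std::vector<TlbEntrySketch>> sets_;
    std::set<unsigned> observedShifts_;
};
```

A stale page size in the set only costs an extra probe that finds nothing, which is why false positives are acceptable while false negatives are not.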

Change-Id: If77853373792d6a5ec84cf1909ee5eb567f3d0e4
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
da3919a6f4 arch-arm: Add pagesize field to the Lookup data structure
Change-Id: Ibc2c80cbf3cfd98f24440e8e6ddf4dbb7e4e26d6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
1eed6e9769 arch-arm: Make TlbEntry a ReplaceableEntry
Change-Id: I3b8169bfb620ea36f6bbe63c38b71184285b55c2
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
94ffe5f233 arch-arm: Replace TLB,TLBVerbose usage in ArmMMU
Some ISAs (like Arm) have moved most of the translation logic into
the MMU and use the TLB simply as translation storage. It makes
sense to use the MMU debug flag for that logic and reduce the
scope of the TLB flag to TLB insertion/hits/misses

Change-Id: I2a164545c711d83d3e87075b0cb5c279eed274c9
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
58bc790a09 arch-arm: Do not include tlb.hh in mmu.hh
This commit moves some MMU method definitions from the header
to the source file to avoid including tlb.hh

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I8fb1aeccd9c38c48b09583b4dc5d152acd09c3e6
2024-10-18 14:30:14 +01:00
Giacomo Travaglini
08c66a0b6a arch-arm: Avoid unnecessary include of faults.hh
Remove unused include from self_debug.hh

Change-Id: Ic675a277ebb2ff4a319e9a7cfe2bea4af850609e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-10-18 14:25:39 +01:00
Giacomo Travaglini
0f1436ba5f arch-arm: DomainType is not specific to the TlbEntry
Change-Id: I626c79973fcd60b1be36a965923999a1c9a9bc54
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-10-18 14:25:39 +01:00
Giacomo Travaglini
d3cdd2dc17 arch-arm: TranMethod is not specific to the ArmFault
It is a simple enum to distinguish between short and big
descriptors. By moving it away from the ArmFault we can
avoid including fault.hh from mmu.hh

Change-Id: Ib556b577c62f5ea3e4c8c9e0d4560a3e99c96778
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-10-18 14:25:39 +01:00
handsomeliu-google
3fc6cc7763 sim: Make SignalSinkPort::set virtual (#1679)
We are implementing derived classes of SignalSinkPort that do some
additional logic after the port is triggered (set() invoked by the
SignalSourcePort peer) and before executing the callback that a device
provides (in onChange_). Examples of such logic are additional logging or
debugging features. However, set() itself directly calls the onChange_
callback.

Making the set() virtual could provide the flexibility to achieve this
feature.
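
As a rough illustration of the use case, the sketch below uses simplified stand-in classes rather than gem5's real SignalSinkPort. The names `SinkPortSketch`, `LoggingSinkPortSketch` and `seen` are hypothetical; only `set()` and `onChange_` are taken from the description above.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Simplified stand-in for a signal sink port (NOT the gem5 class).
template <typename State>
class SinkPortSketch
{
  public:
    using OnChangeFunc = std::function<void(const State &)>;

    explicit SinkPortSketch(OnChangeFunc cb) : onChange_(std::move(cb))
    {
        assert(onChange_ && "this sketch requires a device callback");
    }
    virtual ~SinkPortSketch() = default;

    // Virtual, so a subclass can run extra logic around the callback.
    virtual void set(const State &new_state)
    {
        state_ = new_state;
        onChange_(new_state);
    }

    const State &state() const { return state_; }

  private:
    State state_{};
    OnChangeFunc onChange_;
};

// Hypothetical subclass: records every transition before the device
// callback runs.
template <typename State>
class LoggingSinkPortSketch : public SinkPortSketch<State>
{
  public:
    using SinkPortSketch<State>::SinkPortSketch;

    void set(const State &new_state) override
    {
        seen.push_back(new_state);               // extra logic first
        SinkPortSketch<State>::set(new_state);   // then the normal path
    }

    std::vector<State> seen;
};
```

Because the source side only holds a reference to the base type, the override is reached only if `set()` is virtual, which is the flexibility this change is after.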
2024-10-18 05:41:05 -07:00
Pranith
ae0cee66ed systemc: Disable 'overloaded-virtual' warn for clang (#1662)
We need to disable the warning for the clang compiler as well.

Fixes #1658
2024-10-18 05:40:10 -07:00
Harshil Patel
946bf83b75 arch-arm: Add arm demo board (#1478)
This demo board is a preset arm board, that can be used to run example
gem5 simulations. This board doesn't simulate any known hardware.

The board will be used to run benchmarks such as gapbs and npb to
collect stats. The plan is to show these stats on the gem5 resources
website to provide more details about the resources.
2024-10-18 05:36:31 -07:00
Bobby R. Bruce
cb5d14f753 arch-riscv: Implement Zcmp instructions (#1432)
1. Implement Zcmp(cm.push, cm.pop, cm.popret, cm.popretz, cm.mva01s,
cm.mvsa01) instructions

2. The Zcd instructions overlap with the Zcmp and Zcmt instructions. The
`enable_Zcd` option is used to enable/disable the Zcd extension, which
implicitly controls the Zcmp/Zcmt extensions. If Zcd is enabled, Zcmp and
Zcmt are disabled. Otherwise, Zcmp and Zcmt are enabled.

Spec: https://github.com/riscv/riscv-isa-manual/blob/main/src/zc.adoc
2024-10-18 05:33:55 -07:00
Harshil Patel
7591f2a843 tests: Fix compiler tests (#1678)
- This change updates syntax of constructors of Template Classes from
`class<T>()` to `class()`

- Initializes coherence to 0 in `src/mem/cache_blk.hh`

The above changes are made to solve the errors when compiling gem5 in
gcc 14
2024-10-17 11:19:46 -07:00
Bobby R. Bruce
d454e421d2 stdlib,arch-x86: Update X86Demoboard (#1618)
This commit modifies X86DemoBoard so it has numbers more similar to that
of RiscvDemoBoard and ArmDemoBoard. It also adds SE mode to
X86DemoBoard. Note that the changes here depend on the changes in PR
1579.

**Note**: This PR was created so @BobbyRBruce could add his commits to
#1600

---------

Co-authored-by: Erin Le <ejle@ucdavis.edu>
2024-10-17 10:29:17 -07:00
Bobby R. Bruce
0341c5a502 SE script and tests for risc-v's vector extension (#1542)
These two commits add the SE config and test script, respectively, to run
the rvv tests mentioned in #1246.
2024-10-17 10:26:30 -07:00
Jason Lowe-Power
f55a4ce989 arch-x86,arch-arm: Remove static variables in decoders (#1643)
There were a number of variables in the arm and x86 decoders that are
static (e.g., the decode cache). It's a bit interesting that this
doesn't cause problems with multiple cores since each core has its own
decoder.

However, this causes segfaults if you run different cores on different
*host* threads. We are experimenting with running gem5 with multiple
host threads (i.e., in parallel), and removing these static variables
resolves the segfault.

This change also adds const to any other static variables to ensure that
they cannot be modified.
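
A minimal sketch of the pattern, under the assumption that the problematic state is a mutable cache shared through a `static` variable. `DecoderSketch`, `slowDecode` and `OpcodeMask` are made-up names for illustration and do not correspond to the real decoder classes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical decoder: the decode cache is a per-instance member, so two
// decoders driven from different host threads never share mutable state.
class DecoderSketch
{
  public:
    const std::string &decode(uint32_t mach_inst)
    {
        auto it = cache_.find(mach_inst);
        if (it == cache_.end()) {
            // Slow path: decode once, then remember the result.
            it = cache_.emplace(mach_inst, slowDecode(mach_inst)).first;
        }
        return it->second;
    }

    std::size_t cachedEntries() const { return cache_.size(); }

  private:
    // Shared data may stay static as long as it is const (read-only).
    static constexpr uint32_t OpcodeMask = 0x7F;

    static std::string slowDecode(uint32_t mach_inst)
    {
        return "op" + std::to_string(mach_inst & OpcodeMask);
    }

    // Was conceptually "static" before: now one cache per decoder.
    std::unordered_map<uint32_t, std::string> cache_;
};
```

The cost is that each core warms up its own cache; in exchange no locking is needed, since nothing mutable is shared between decoders.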

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-10-17 08:17:34 -07:00
Abhishek Shailendra Singh
cf3427f87b mem-cache: refactored the code 2024-10-17 17:13:37 +02:00
pre-commit-ci[bot]
bd939821c8 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-10-17 17:13:37 +02:00
Abhishek Shailendra Singh
3eabd02801 mem-cache: This commit adds sms prefetcher
Change-Id: I68d3bb6cf07385177d0f776fb958f652cfc41489
2024-10-17 17:13:37 +02:00
Roger Chang
a6421e4404 arch-riscv: Add IsDelayedCommit for each zcmp micro instructions 2024-10-17 13:29:38 +08:00
Roger Chang
28b112e2a6 arch-riscv: Implement Zcmp
Implement instructions:
cm.push
cm.pop
cm.popret
cm.popretz
cm.mva01s
cm.mvsa01

Spec: https://github.com/riscv/riscv-isa-manual/blob/main/src/zc.adoc#zcmp

Change-Id: I2921c4bdb0c654858a237386056ebb2aed643a5a
2024-10-17 13:29:38 +08:00
Roger Chang
aa782cffee arch-riscv: Add enable_Zcd options to RiscvISA
The Zcd instructions overlap with the Zcmp and Zcmt instructions.

This option is used to enable/disable the Zcd extension, which implicitly
controls the Zcmp/Zcmt extensions. If Zcd is enabled, Zcmp and Zcmt are
disabled. Otherwise, Zcmp and Zcmt are enabled.

Spec: https://github.com/riscv/riscv-isa-manual/blob/main/src/zc.adoc#zc-overview

Change-Id: I3788eb6539e13a210c9946efc43ca1fef4639560
2024-10-17 13:29:38 +08:00
Matthew Poremba
deb8f983a1 arch-vega: Fix multi-dword setElem in PackedReg (#1664)
There are two issues related to setting an element in PackedReg where
the element spans multiple dwords. First, the mask value is wrong and is
clobbering both dwords. Second, a portion of the value is shifted out of
the narrower input type.

Fix this by using the correct mask to clear the bits where the value
will be placed and use a larger data type to shift the value into place.
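
The two fixes can be illustrated with a small stand-alone sketch. The layout (elements packed back to back into 32-bit dwords) and all names (`PackedRegSketch`, `elemMask`) are assumptions for the example, not the actual PackedReg implementation.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical packed register: NumElems elements of ElemBits bits each,
// stored back to back in an array of 32-bit dwords.
template <int ElemBits, int NumElems>
struct PackedRegSketch
{
    static_assert(ElemBits > 0 && ElemBits <= 32, "sketch limit");
    static constexpr int NumDwords = (ElemBits * NumElems + 31) / 32;
    std::array<uint32_t, NumDwords> dwords{};

    static constexpr uint64_t elemMask()
    {
        return (uint64_t(1) << ElemBits) - 1;
    }

    void setElem(int idx, uint32_t value)
    {
        assert(idx >= 0 && idx < NumElems);
        const int bit = idx * ElemBits;
        const int dword = bit / 32;
        const int shift = bit % 32;

        // Shift in 64 bits so the upper part of the value is not lost.
        const uint64_t mask = elemMask() << shift;
        const uint64_t bits = (uint64_t(value) & elemMask()) << shift;

        // Clear only the element's bits in the low dword, then place them.
        dwords[dword] = (dwords[dword] & ~uint32_t(mask)) | uint32_t(bits);
        if (shift + ElemBits > 32) {
            // The element spills over: same treatment for the next dword.
            dwords[dword + 1] = (dwords[dword + 1] & ~uint32_t(mask >> 32))
                                | uint32_t(bits >> 32);
        }
    }

    uint32_t getElem(int idx) const
    {
        assert(idx >= 0 && idx < NumElems);
        const int bit = idx * ElemBits;
        const int dword = bit / 32;
        const int shift = bit % 32;
        uint64_t window = dwords[dword];
        if (shift + ElemBits > 32)
            window |= uint64_t(dwords[dword + 1]) << 32;
        return uint32_t((window >> shift) & elemMask());
    }
};
```

With 6-bit elements, for example, element 5 occupies bits 30 to 35 and therefore straddles the first two dwords; writing it must leave elements 4 and 6 untouched.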
2024-10-14 10:19:52 -07:00
Ivana Mitrovic
20965f571b stdlib: Extend AbstractBoard pre_instantiation functionality (#1497)
* Deprecates the setting of FS/SE mode via the `Simulator` module.
* Moved the creation of the `Root` object from the `Simulator` to the
board.
* Moved the setting of `sim_quantum` from the `Simulator` to the
processor.
* Allows for easier development of boards which support both SE and FS
mode simulation by moving board setup function calls to occur after the
set_workload function is called, which sets the board's `is_fs` status.
2024-10-14 10:12:41 -07:00
Leon
652a72d122 arch-riscv: Add support for riscv hardware probing syscall (#1525)
This PR adds the support for riscv hardware probing syscall described in
[this](https://docs.kernel.org/arch/riscv/hwprobe.html). The
implementation logic refers to [linux
kernel](https://github.com/torvalds/linux/blob/master/arch/riscv/kernel/sys_hwprobe.c)
and
[qemu](https://github.com/qemu/qemu/blob/master/linux-user/syscall.c).
It passes the [RISC-V hwprobe
example](https://github.com/cyyself/hwprobe) test.

Hope to be merged. Thanks.

Change-Id: Iab714974f0551fc451e0d6846c75a7153809a308

Co-authored-by: Zhibo Hong <hongzhibo@bytedance.com>
2024-10-14 10:00:48 -07:00
Matthew Poremba
1edeeda881 dev: Make unknown PCI device writes a warning (#1657)
This pops up in kernel 6.8.0. The device it is trying to write to is
currently unknown, but ignoring the device does not cause problems.
Therefore change the panic to a warning and respond to the request
with the default PCI latency.

Change-Id: I4c1229753a75a94a255d8cfd411ac7311283366b
2024-10-14 08:51:05 -07:00
Saúl Adserias
f4ffe5f815 tests: add rvv-intrinsic-tests script and config
Change-Id: Ia3fa67bb2a2603dd5cbf665504f85a8b969c2a5e
2024-10-11 17:42:51 +02:00
Saúl Adserias
a35f146ba2 configs: add example RVV SE parametrized config
Change-Id: I0776c5751da8b80340166ab518593686d141a4dd
2024-10-11 17:32:09 +02:00
Bobby R. Bruce
a8f88abfb1 misc: Add 'ext' & 'tests' to vscode python extraPaths (#1652)
'ext' is set as a Python source path for gem5, like 'src/python'. It
helps vscode users to have vscode aware of this, as it improves analytics
and reduces warnings (most commonly "unable to resolve import").

'tests' isn't in the Python source path when compiling gem5 but it is
when running `tests/main.py`. Though somewhat unideal, as it lets vscode
think files in 'src' can import from files in 'tests', adding this helps
vscode Python analytics parse the test files, which reduces warnings and
aids in better navigation of the testing code. This is particularly
helpful given the complexity of the testlib testing infrastructure.
2024-10-10 10:18:14 -07:00
Bobby R. Bruce
65ba2dcae5 tests: Refactor downloading of pannotia tests (#1653)
With this patch the pannotia tests now:

1. Download the resources to 'gpu-pannotia' in the
'tests/gem5/resources' directory. This is where other test resources are
stored.
2. Download the USA-road-d.NY.gr dataset from Google cloud bucket in a
decompressed state.
3. Avoid re-downloading the resources if they are already present on the
host machine.
2024-10-10 10:17:32 -07:00
Erin (Jianghua) Le
6195b33960 util-docker,tests: Add compiler tests & Dockerfiles for GCC 14 (#1646)
This commit adds gcc 14 to the compiler tests and Dockerfiles.
2024-10-10 10:17:03 -07:00
Bobby R. Bruce
c1c5147e53 tests,misc: Remove edited from PR Action trigger list (#1654)
`edited` is what forces a re-run of our tests when the PR title or
other minor metadata is updated. I believe all changes to the
code are covered by the remaining triggers. `synchronize` means the Action
is triggered when the branch this PR is from (in this case on my forked
gem5 repo) is synced with the PR branch here. This covers the vast majority
of cases we care about. `opened` covers the case where the PR is
created and `ready_for_review` the case where a PR moves out of draft.
2024-10-10 10:13:56 -07:00
Jason Lowe-Power
3f42ab4ca9 stdlib,ruby: Enable resetting version numbers (#1649)
Ruby requires each machine type to have a continuous set of version
numbers starting at 0. We were hiding this from users/developers by
using a Python class variable in the stdlib. Unfortunately, with
multiple ruby systems this doesn't work anymore.

As a stop-gap this change adds "resetting" these versions to the
beginning of `incorporate_caches`. It would be better to fix this in the
C++ code (and assign these numbers in C++ probably via the RubySystem),
but that's a bigger change than is needed right now.

---------

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-10 09:53:40 -07:00
Pranith
50f652a2ee Implement BTB using the cache library (#1537)
This enables the BTB to be associative and use various replacement
policies.
2024-10-10 17:05:22 +01:00
Junshi Wang
7df35187a0 arch-arm: Add support of AArch32 VRINTN/X/A/Z/M/P instructions.
Add decoder and function of AArch32 VRINTN, VRINTX, VRINTA, VRINTZ,
VRINTM, and VRINTP (Advanced SIMD) instructions. Support both 16-bit and
32-bit variants.

Add vfpFPRint in vfp.hh to perform the behavior of round-to-integer.

Only support A32 encoding.

Change-Id: Icb9b6f71edf16ea14a439e15c480351cd8e1eb88
2024-10-10 12:08:15 +01:00
Giacomo Travaglini
1c8ab47a54 arch-arm: Add support of AArch32 VCVTA/P/N/M instructions. (#1533)
Add decoder and function of AArch32 VCVTA, VCVTP, VCVTN and VCVTM
instructions. Support both 16-bit and 32-bit variants.

Only support A32 encoding.

Change-Id: I6ece0e1b779f9a7cc9d709894a49a7fdcda28373
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-10-10 11:58:37 +02:00
Erin (Jianghua) Le
feeb3b2d67 cpu: fix simInsts and simOps not resetting (#1615)
This PR fixes the bug where simInsts and simOps don't reset when
m5.stats.reset() is called. The stats hostInstRate and hostOpRate are
affected by this change as well, as they depend on simInsts and simOps
respectively.

This is related to issue 1443 linked
[here](https://github.com/gem5/gem5/issues/1443).
2024-10-09 19:49:43 -07:00
Bobby R. Bruce
3443788013 misc: Add "src/python" to vscode Python Analysis Paths (#1647)
This allows vscode to resolve Python imported from "src/python".
Warnings regarding these imports are numerous and the issue stops users
of vscode from utilizing features like navigating the codebase through "Go
to Definition" queries on imported classes/functions.
2024-10-09 14:46:54 -07:00
Bobby R. Bruce
965da9ea79 misc: pre-commit autoupdate (#1642)
<!--pre-commit.ci start-->
updates:
- [github.com/pre-commit/pre-commit-hooks: v4.5.0 →
v5.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v4.5.0...v5.0.0)
- [github.com/PyCQA/isort: 5.11.5 →
5.13.2](https://github.com/PyCQA/isort/compare/5.11.5...5.13.2)
- [github.com/psf/black: 23.9.1 →
24.10.0](https://github.com/psf/black/compare/23.9.1...24.10.0)
- [github.com/asottile/pyupgrade: v3.14.0 →
v3.17.0](https://github.com/asottile/pyupgrade/compare/v3.14.0...v3.17.0)
<!--pre-commit.ci end-->
2024-10-09 14:46:20 -07:00
Jason Lowe-Power
f03dddb458 Use board get_mem_ports consistently (#1509)
Previously, whether the board object or the memory_system returned
the memory ports was not consistent in the cache_hierarchies

This commit makes it consistently use the board. Note: the board
is a better place so it can customize the ports (e.g., add I/O
components or other things).

This commit also makes the arm board consistent with the other
boards and removes the specialized `get_mem_ports` that was not
used.
2024-10-09 13:21:28 -07:00
pre-commit-ci[bot]
54487d3bf6 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-10-09 14:04:56 +00:00
pre-commit-ci[bot]
7661116b00 misc: [pre-commit.ci] pre-commit autoupdate
updates:
- [github.com/pre-commit/pre-commit-hooks: v4.5.0 → v5.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v4.5.0...v5.0.0)
- [github.com/PyCQA/isort: 5.11.5 → 5.13.2](https://github.com/PyCQA/isort/compare/5.11.5...5.13.2)
- [github.com/psf/black: 23.9.1 → 24.10.0](https://github.com/psf/black/compare/23.9.1...24.10.0)
- [github.com/asottile/pyupgrade: v3.14.0 → v3.17.0](https://github.com/asottile/pyupgrade/compare/v3.14.0...v3.17.0)
2024-10-09 07:03:42 -07:00
Bobby R. Bruce
11fa0ac9a5 stdlib: Mv setup_board/setup_mem_ranges calls to set_fs
This change allows for the `_setup_memory_range` and `_setup_board`
functions to know if the board is to run a FS or SE workload, thus
allowing for a board to handle both cases considerably easier than
before. With this change all functions are called after FS or SE
is declared via the `_set_fullsystem` function and thus all can
accommodate SE and FS workloads.
2024-10-09 06:32:41 -07:00
wmin0
ee91356632 systemc: Disable 'overloaded-virtual' warn for systemc bind funcs (#1637)
For GCC >=v13 systemc was breaking due to the overloaded virtual warning
check.

Issue: gem5#1121

Change-Id: I68872f58d0bbe5430976163ba7316bbd2e403ec8
2024-10-09 06:28:43 -07:00
Bobby R. Bruce
cc0eb12e9a misc,tests: Add cache of ALL/gem5.opt to ci-test.yaml (#1595)
Where appropriate utilize caching of ALL/gem5.opt or VEGA_X86/gem5.opt.
The cache key is just the date returned by the runner. This is unlikely
to be the most efficient solution but it is simple and difficulties were
encountered when attempting to create a hash of  This solution will do
for now.
2024-10-09 06:24:57 -07:00
Yu-Cheng Chang
402a030ce1 cpu,arch,arch-riscv: Check wake up signal when post interrupt (#1641)
The RISC-V specification does not define how to handle wake-up from an
interrupt signal. In the SiFive U74 core, the hart will wake up if there
is any enabled pending interrupt.

[1] Section 14.3.1
https://sifive.cdn.prismic.io/sifive/ad5577a0-9a00-45c9-a5d0-424a3d586060_u74_core_complex_manual_21G3.pdf
2024-10-08 08:51:38 -07:00
Yu-Cheng Chang
67edf64326 arch-riscv: Fix CLINT mtime reset handling (#1638)
The previous https://github.com/gem5/gem5/pull/1617 introduced the CLINT
reset feature. On reset, we change mtime to 0 and keep mtimecmp
unchanged by default. We also need to check the mtime & mtimecmp registers
to update the MTI signal. However, the mtime register will be incremented
to 1 by `raiseInterruptPin`.

In this PR, we introduce the interrupt ID for CLINT, so that mtime will be
incremented only when the RTC signal is received.

---------

Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
2024-10-08 08:51:20 -07:00
Matthew Poremba
4f7b3ed827 mem-ruby: Remove static methods from RubySystem (#1453)
There are several parts to this PR to work towards #1349.

(1) Make RubySystem::getBlockSizeBytes non-static by providing ways to
access the block size or passing the block size explicitly to classes.

The main changes are:
 - DataBlocks must be explicitly allocated. A default ctor still exists
   to avoid needing to heavily modify SLICC. The size can be set using a
   realloc function, operator=, or copy ctor. This is handled completely
   transparently meaning no protocol or config changes are required.
 - WriteMask now requires block size to be set. This is also handled
   transparently by modifying the SLICC parser to identify WriteMask
   types and call setBlockSize().
 - AbstractCacheEntry and TBE classes now require block size to be set.
   This is handled transparently by modifying the SLICC parser to
   identify these classes and call initBlockSize() which calls
   setBlockSize() for any DataBlock or WriteMask.
 - All AbstractControllers now have a pointer to RubySystem. This is
   assigned in SLICC generated code and requires no changes to protocol
   or configs.
 - The Ruby Message class now requires block size in all constructors.
   This is added to the argument list automatically by the SLICC parser.
   
(2) Relax dependence on common functions in
src/mem/ruby/common/Address.hh so that RubySystem::getBlockSizeBits is no
longer static. Many classes already have a way to get block size from the
previous commit, so they simply multiply by 8 to get the number of bits.
For handling SLICC and reducing the number of changes, define
makeCacheLine, getOffset, etc. in RubyPort and AbstractController. The only
protocol changes required are to replace any "RubySystem::foo()" calls with
"m_ruby_system->foo()".

For classes which do not have a way to get access to block size but
still use makeLineAddress, getOffset, etc., the block size must be
passed to that class. This requires some changes to the SimObject
interface for two commonly used classes: DirectoryMemory and
RubyPrefetcher, resulting in user-facing API changes.

User-facing API changes:
 - DirectoryMemory and RubyPrefetcher now require the cache line size as
   a non-optional argument.
 - RubySequencer SimObjects now require RubySystem as a non-optional
   argument.
 - TesterThread in the GPU ruby tester now requires the cache line size
   as a non-optional argument.

(3) Removes static member variables in RubySystem which control
randomization, cooldown, and warmup. These are mostly used by the Ruby
Network. The network classes are modified to take these former static
variables as parameters which are passed to the corresponding method
(e.g., enqueue, delayHead, etc.) rather than needing a RubySystem object
at all.

Change-Id: Ia63c2ad5cf0bf9d1cbdffba5d3a679bb4d3b1220

(4) There are two major SLICC generated static methods:
getNumControllers()
on each cache controller which returns the number of controllers created
by the configs at run time and the functions which access this method,
which are MachineType_base_count and MachineType_base_number. These need
to be removed to create multiple RubySystem objects otherwise NetDest,
version value, and other objects are incorrect.

To remove the static requirement, MachineType_base_count and
MachineType_base_number are moved to RubySystem. Any class which needs
to call these methods must now have a pointer to a RubySystem. To enable
that, several changes are made:
 - RubyRequest and Message now require a RubySystem pointer in the
   constructor. The pointer is passed to fields in the Message class
   which require a RubySystem pointer (e.g., NetDest). SLICC is modified
   to do this automatically.
 - SLICC structures may now optionally take an "implicit constructor"
   which can be used to call a non-default constructor for locally
   defined variables (e.g., temporary variables within SLICC actions). A
   statement such as "NetDest bcast_dest;" in SLICC will implicitly
   append a call to the NetDest constructor taking RubySystem, for
   example.
 - RubySystem gets passed to Ruby network objects (Network, Topology).
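
To illustrate why the counts have to live in the RubySystem rather than in static storage, here is a small sketch. Everything in it is hypothetical (`RubySystemSketch`, `MachineTypeSketch`, the method names), and it assumes that a machine type's base number is the number of controllers of all preceding machine types.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical machine types; the real list is generated by SLICC.
enum class MachineTypeSketch : std::size_t
{
    L1Cache = 0,
    L2Cache,
    Directory,
    Count
};

class RubySystemSketch
{
  public:
    // Returns the version number handed to the new controller.
    int registerController(MachineTypeSketch type)
    {
        return counts_[index(type)]++;
    }

    // Per-system replacement for a static "base count" function.
    int baseCount(MachineTypeSketch type) const
    {
        return counts_[index(type)];
    }

    // Assumed meaning: controllers of all preceding machine types.
    int baseNumber(MachineTypeSketch type) const
    {
        int base = 0;
        for (std::size_t i = 0; i < index(type); ++i)
            base += counts_[i];
        return base;
    }

  private:
    static constexpr std::size_t NumTypes =
        static_cast<std::size_t>(MachineTypeSketch::Count);

    static std::size_t index(MachineTypeSketch type)
    {
        const std::size_t i = static_cast<std::size_t>(type);
        assert(i < NumTypes);
        return i;
    }

    std::array<int, NumTypes> counts_{};
};
```

With static counters, a second system would continue numbering where the first one stopped; with per-instance counters, every system starts again at version 0, at the price of each caller needing a pointer to its RubySystem.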
2024-10-08 08:14:50 -07:00
Giacomo Travaglini
4a3e2633d2 cpu-o3: Add Matrix OpDesc to the O3 Default FU (#1640)
There was a bug exposed by a recent PR [1] where until recently the O3
CPU was executing an instruction even if it did not have the required
functional unit in the FU pool.

We are adding the matrix descriptors to the Default FU pool in the O3
cpu so that no panic is encountered upon execution of a matrix
instruction

[1]: https://github.com/gem5/gem5/pull/1516

Change-Id: I04250255a2cbb2ee6f3ef204b62bc2c1ee2d4d2c

Reviewed-by: Richard Cooper <richard.cooper@arm.com>

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-10-08 10:23:14 +01:00
Giacomo Travaglini
440999e447 cpu-o3: Add Crypto OpDesc to the O3 Default FU (#1639)
There was a bug exposed by a recent PR [1] where until recently the O3
CPU was executing an instruction even if it did not have the required
functional unit in the FU pool.

We are adding the crypto descriptors to the Default FU pool in the O3
cpu so that no panic is encountered upon execution of a crypto
instruction

[1]: https://github.com/gem5/gem5/pull/1516

Change-Id: Ifaf2f8e4780dfb8ba825a99a02dd587f011dbd23

Reviewed-by: Richard Cooper <richard.cooper@arm.com>

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-10-08 10:22:25 +01:00
Bobby R. Bruce
3fc21da13c learning-gem5,tests: Update learning-gem5 Ruby Test ref (#1635)
The Daily tests have been failing as the learning-gem5 Ruby test now
exits at tick 9831 instead of tick 9981.

**Note**: The cause of this change is currently unknown. I'm not sure if
this is symptomatic of something bigger but for now I only observe this
bug failure and this patch at least silences the error.
2024-10-07 14:40:45 -07:00
Jason Lowe-Power
6ff3821c9d arch-riscv: Enable clone3 syscall in riscv64 (#1620)
The clone3 syscall, implemented in commit 87e774c, is currently only
handled for x86-64 in gem5 SE mode. Clone3 is employed by modern glibc
versions instead of clone for process/thread creation (e.g. issue
#1204). This commit enables the clone3 syscall in riscv64 by adding the
corresponding handler call, as well as its arguments struct.
2024-10-07 13:45:34 -07:00
Erin (Jianghua) Le
1ee924a067 python: clarify SimObject error message (#1625)
This adds more detail to the error message that is thrown when an orphan
node is instantiated.
2024-10-07 13:45:03 -07:00
Matthew Poremba
f5858fe81f dev-amdgpu: Deprecate rom and mmio trace params (#1633)
The ROM field was originally intended as a future alternate way to load
VBIOS without the ROM being on the disk image. This code path is never
taken for the devices gem5 supports and there is no gem5 implementation.
Deprecate the rom_binary field for this reason.

Similarly, MMIO traces were only used for Vega10. Deprecate this as
Vega10 is now deprecated. The MMIO trace reader is kept as it may still
be useful in the future. It is still the primary way to handle devices
which have graphics capability. None of the devices supported by gem5
have graphics now that Vega10 is deprecated.
2024-10-07 07:12:07 -07:00
Bobby R. Bruce
5db68114df misc,tests: Change Github Action caches to just be date-based
Hashing the `src` directory is too costly, with some runners reaching
timeout. Also, as we only have 10GB of cache it makes sense to have
more coarse-grained caching.
2024-10-07 00:53:08 -07:00
Bobby R. Bruce
7c83e3379b stdlib: Add _pre_instantiate funcs for caches and memory
Note: At present this is not used but these functions can be filled
or overridden in subclasses as required.
2024-10-04 14:03:46 -07:00
Harshil Patel
a12bef131b tests: update input sizes for pannotia tests 2024-10-04 11:51:58 -07:00
Bobby R. Bruce
b358471eb9 stdlib: Move 'sim_quantum' set from Simulator to Processor
It makes considerably more sense for the `sim_quantum` parameter to be
set in the Processor. Through the `_pre_instantiate` functions this
is now possible.
2024-10-04 11:40:18 -07:00
Bobby R. Bruce
4bdcb040d0 stdlib: Move Root obj creation from Simulator to Board
It makes much more sense for the Root Object to be created within the
board and passed where required. Creating it in the Simulator class is
not required.

For this to work the signature of the `_pre_instantiate` function in
`AbstractBoard` has been updated to return the Root object.
2024-10-04 11:40:13 -07:00
Bobby R. Bruce
4b3ba1daa6 stdlib: Deprecate Simulator 'full_system' param
This is deprecated in favor of the board determining whether the
simulation is FS or SE. Usually this will be contingent on which
`set_workload` function has been called. Regardless, it is the board's
responsibility. The user should not need to explicitly declare this any
longer.
2024-10-04 11:33:23 -07:00
Bobby R. Bruce
6a24b69a97 misc,tests: Increase Weekly and Daily GPU test timeout (#1628)
The Weekly GPU tests are failing due to a timeout, but I found the
testing timeout was set to 5 hours, and we have been frequently close to
reaching this but have recently changed the test enough to consistently
go over.

 The main two things that appear to have caused this are:

~~1. Moving the X86_VEGA compilation into the same step as the running
of the tests.~~ (I take this back, the timeout is per-job, it shouldn't
matter how stuff is divided among steps in the job. However, keeping it
separate does no harm and merging the two steps did coincide with
failures occurring. I'll play it safe for now.)
2. Reducing the number of threads per GitHub Actions runner, thus
slowing job execution.

In addition, we've added more tests to this weekly GPU suite, though I
don't believe we have got to running these tests yet. The timeout
appears to always have been triggered before this.

This PR increases the timeout to 3 days and moves the compilation into a
separate step.

**Update:** The same changes were made for the Daily tests too, as it
appears to be the same problem.
2024-10-04 07:41:17 -07:00
Bobby R. Bruce
d49d0272ff misc,tests: Create Daily GPU Test timeout 2024-10-04 07:36:46 -07:00
Bobby R. Bruce
866b51a1cc misc,tests: Increase Weekly GPU test timeout
The Weekly GPU tests are failing due to a timeout, but I found the testing
timeout was set to 5 hours, and we have been frequently close to reaching this
but have recently changed the test enough to consistently go over.

 The main two things that appear to have caused this are:

1. Moving the X86_VEGA compilation into the same step as the running of
   the tests.
2. Reducing the number of threads per GitHub Actions runner, thus slowing
   job execution.

In addition, we've added more tests to this weekly GPU suite, though I don't
believe we have got to running these tests yet. The timeout appears to
always have been triggered before this.

This PR increases the timeout to 3 days and moves the compilation into a
separate step.
2024-10-04 06:12:13 -07:00
Bobby R. Bruce
7117b1399b util-docker: Fix gpu docker images (#1627)
Two faults:

1. You can't give a description in the docker-bake file for single-platform
builds. It must be in the Dockerfile.
2. The gpu docker image def in docker-bake.hcl was not overriding the
"common" settings as previously thought. This was causing builds to
sometimes build the wrong platform, along with various other weird bugs.
This has been fixed in this patch.
2024-10-04 02:37:16 -07:00
Yu-Cheng Chang
5b5f7afc1b arch-riscv: Implement CLINT reset feature (#1617)
On reset, the registers change as follows:
msip: cleared to zero
mtimecmp: unknown state, could keep the original values or change to any values
mtime: cleared to zero

Spec:
https://github.com/riscv/riscv-aclint/blob/main/riscv-aclint.adoc
https://sifive.cdn.prismic.io/sifive/1a82e600-1f93-4f41-b2d8-86ed8b16acba_fu740-c000-manual-v1p6.pdf

Change-Id: I3c50b41eb765ad9cd7a8a03c427bd0011195de5c
2024-10-03 13:22:27 -07:00
Matthew Poremba
24504c9a3e dev-amdgpu: Use GPU specific cache line size (#1621)
Invalidate requests align to system cache line size. This causes
problems if the GPU cache hierarchy's cache line size is different from
the system's, as the unaligned requests never return, leading to deadlock
on deferred dispatch.

This commit uses the cache line size from the GPU memory manager and
makes the cache line size there non-optional.

Tested with multiple RubySystems where CPU side was 64B and GPU side was
128B cache lines.
2024-10-03 08:47:08 -07:00
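The alignment mismatch described in 24504c9a3e can be illustrated with a few lines of arithmetic. This is only a sketch: `align_down` is a hypothetical helper (not a gem5 function), and the address is made up. It shows that an address aligned to a 64B system cache line is not necessarily aligned to a 128B GPU cache line.

```python
def align_down(addr: int, line_size: int) -> int:
    """Round addr down to a multiple of line_size (must be a power of two)."""
    if line_size <= 0 or line_size & (line_size - 1):
        raise ValueError("line_size must be a positive power of two")
    return addr & ~(line_size - 1)


# Hypothetical address, chosen so that the two alignments differ.
addr = 0x10C0
cpu_aligned = align_down(addr, 64)    # system (CPU side) cache line size
gpu_aligned = align_down(addr, 128)   # GPU side cache line size

print(hex(cpu_aligned))  # 0x10c0
print(hex(gpu_aligned))  # 0x1080
```

A request aligned with the 64B size therefore targets `0x10c0`, which a 128B-line hierarchy would treat as unaligned.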
Tommaso Marinelli
242c0e9693 arch-riscv: Add more syscall placeholders 2024-10-03 03:25:39 +02:00
Matthew Poremba
c8c75959ad configs: Deprecate Vega10 (#1619)
Vega10 is no longer officially supported by ROCm and ROCm is starting to
use some packet types not supported. These were originally kept to allow
users to use older disk images with newer gem5. Going forward the gem5
version and gem5-resources releases will be required to be the same to
prevent lingering old configs.

As a replacement for vega10*.py, mi300.py or mi200.py should be used.
HIP examples, cookbook, and rodinia configs can be replaced with the
standard flow of building / obtaining the GPU application and running
using mi300.py or mi200.py as they do not require any input options and
therefore do not require changes to the disk image.
2024-10-02 14:18:41 -07:00
Tommaso Marinelli (imec)
be49bf89c0 arch-riscv: Enable clone3 syscall in riscv64
The clone3 syscall, implemented in commit 87e774c, is currently only
handled for x86-64 in gem5. Clone3 is employed by modern glibc versions
instead of clone for processes/threads generation (e.g. issue #1204).
This commit enables the clone3 syscall in riscv64 by adding the
corresponding handler call, as well as its arguments struct.
2024-10-02 18:23:27 +02:00
Giacomo Travaglini
bdd10069b1 arch-arm: Add recursive reduce in Neon instruction. (#1616)
FMAXV, FMINV, FMAXNMV, FMINNMV and ADDV instructions perform recursive
reduction. Different reduction methods lead to different results when
handling NaN values.

Reuse the template of `twoRegAcrossInstX`. Add one more option
`recursive` for recursive reduction.

Change-Id: I69e690ce7668baee818542d3ea463f7a5f269a69
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-10-02 12:41:53 +02:00
Bobby R. Bruce
c9408828a1 misc,tests: Revert "Test docker runners vs self-hosted"
This reverts commit 0792d94b6f.
2024-10-01 16:03:57 -07:00
LYC
93313b3daa arch-riscv: fix viota (#1559)
This commit fixes a bug in the viota instruction.

The two different instructions can refer to the same
StaticInstPtr because the decoder behaves as shown in [this section of
the code](https://github.com/gem5/gem5/blob/stable/src/arch/riscv/decoder.cc#L98-L100).

So every first micro-op should reset the cnt variable in the macro-op.

Change-Id: Id311a05cfed41b01e16fd7256d9baa166aee49da

Co-authored-by: Jack Yung-Chen Lin <jack622@andestech.com>
2024-10-01 11:23:27 -07:00
Erin (Jianghua) Le
d5dfe03eb1 stdlib: Add warning message for set_workload being called twice (#1571)
This commit adds a warning message for when set_workload is called
twice, as users typically do not mean to do this.
2024-10-01 11:22:07 -07:00
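The behaviour described in d5dfe03eb1 can be sketched as follows. This is not the gem5 stdlib code: `BoardSketch`, its attributes, and the warning text are all hypothetical, and the sketch assumes the second call simply replaces the stored workload.

```python
import warnings


class BoardSketch:
    """Hypothetical stand-in for a board that accepts a workload."""

    def __init__(self):
        self._workload_set = False
        self.workload = None

    def set_workload(self, workload):
        # Warn rather than fail: calling this twice is usually a mistake,
        # but the sketch still honours the latest call.
        if self._workload_set:
            warnings.warn("Workload has been set more than once.")
        self._workload_set = True
        self.workload = workload


board = BoardSketch()
board.set_workload("first-binary")
board.set_workload("second-binary")  # emits a UserWarning
```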
Erin (Jianghua) Le
c10feed524 tests, configs, util, mem, python, systemc: Change base 10 units to base 2 (#1605)
This commit changes metric units (e.g. kB, MB, and GB) to binary units
(KiB, MiB, GiB) in various files. This PR covers files that were missed
by a previous PR that also made these changes.
2024-10-01 11:18:05 -07:00
Kaustav Goswami
d57208c615 arch-x86,stdlib: added MADT entries on the X86Board (#1574)
This change adds MADT entries to the X86Board. Previously, the kernel in
full-system mode was complaining about a `ACPI BIOS Error (bug): Invalid
table length 0x24 in RSDT/XSDT (20190816/tbutils-291)`. This patch fixes
the invalid length and initializes all the tables correctly.

Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>
2024-10-01 11:14:09 -07:00
Bobby R. Bruce
0792d94b6f misc,tests: Test docker runners vs self-hosted 2024-10-01 08:53:31 -07:00
Bobby R. Bruce
34f6bc4501 misc,tests: Fix caching in daily tests 2024-10-01 08:18:55 -07:00
Junshi Wang
a25d9a126f arch-arm: Add recursive reduce in Neon instruction.
FMAXV, FMINV, FMAXNMV, FMINNMV and ADDV instructions perform recursive
reduction. Different reduction methods lead to different results when
handling NaN values.

Reuse the template of `twoRegAcrossInstX`. Add one more option
`recursive` for recursive reduction.

Change-Id: I69e690ce7668baee818542d3ea463f7a5f269a69
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-30 16:31:35 +01:00
Giacomo Travaglini
8381e1c5d3 mem-cache: Helper functions to allow dynamic configuration of partitioning policies (#1609)
This PR is a simple refactoring of some partitioning policies. It
moves existing functionality into PP methods so that they can be called
multiple times throughout the simulation, thereby allowing dynamic
adjustment of the partitioning scheme.
Bobby R. Bruce
277b5be4dd arch-arm: Add a method to determine External Abort (#1610)
- Add `isExternalAbort()` in `AbortFault<T>` to determine external
abort.
- Add `virtual isExternalAbort()` in `ArmFault` so the method can be
used in base class.
- Set iss.ea by `isExternalAbort()`
2024-09-26 14:41:33 -07:00
Erin (Jianghua) Le
e987c60a4c tests: Add Pannotia GPU Tests (#1584)
This PR adds the Pannotia GPU tests.
2024-09-26 14:39:39 -07:00
Bobby R. Bruce
054790ad47 ext: Fix GCC v13+ comp of systemc due to problematic overloaded-virtual warn (#1576)
Fixes #1121 in line with the following suggestion:

https://github.com/gem5/gem5/issues/1121#issuecomment-2352743409
2024-09-26 14:32:20 -07:00
Bobby R. Bruce
a240ff8d32 misc,tests: Fix caching in daily tests 2024-09-26 11:12:52 -07:00
Bobby R. Bruce
e3fd7dcaec misc,tests: Remove cache store from dramsys test 2024-09-26 11:10:32 -07:00
Ivana Mitrovic
6bb1c9638c util: Update gem5-resources-manager (#1604)
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.4
to 43.0.1.
2024-09-26 10:09:38 -07:00
Junshi Wang
a4bacb9823 arch-arm: Add a method to determine External Abort.
- Add `isExternalAbort()` in `AbortFault<T>` to determine external abort.
- Add `virtual isExternalAbort()` in `ArmFault` so the method can be
used in base class.
- Set iss.ea by `isExternalAbort()`.

Change-Id: I01c22dc46958ab424b389af96d3c3b6243cbc671
2024-09-26 14:05:09 +01:00
Junshi Wang
9a7a661c66 arch-arm: Set tranMethod for external Data Abort.
The External Data Abort may not set TranMethod, and it leads to assert
error.

- Make `ArmFault::update` virtual.
- Implement override `update` in `AbortFault<T>` to set TranMethod.

Change-Id: I49e18799df8420b214b6059ffa756a13edf343d5
2024-09-26 14:04:17 +01:00
Giacomo Travaglini
b232204b49 mem-cache: Allow dynamic configuration of the Way pp
Change-Id: I1ba9266b24ebc9563f9380fcf155cdc436b2e376
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-26 10:21:46 +01:00
Giacomo Travaglini
fdcfc28cf4 mem-cache: Allow dynamic configuration of MaxCapacity pp
This will allow gem5 to configure the maximum capacity of a
partition dynamically during simulation, rather than
having it statically defined at construction time

Change-Id: Ib55c9990a6bc2930abaf2438c13337acc643520f
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-26 10:21:21 +01:00
Giacomo Travaglini
3100418fb1 mem-cache: Store totalBlockCount directly in MaxCapacity pp
In this way we actually need to store one unsigned integer instead of
two. We also won't need to recompute the total number of cache blocks
whenever we will adapt this policy to be dynamically modified

Change-Id: Ia8cf906539d1891b6cdb821f2a74628127dc68c6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-26 10:20:57 +01:00
Junshi Wang
bf61bd127f arch-arm: Add support of AArch32 VCVTA/P/N/M instructions.
Add decoder and function of AArch32 VCVTA, VCVTP, VCVTN and VCVTM
instructions. Support both 16-bit and 32-bit variants.

Only support A32 encoding.

Change-Id: I6ece0e1b779f9a7cc9d709894a49a7fdcda28373
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-26 10:02:23 +02:00
Ronchi1997
e17875b7c7 misc: Correctly display build information (#1603)
See: #1591

Co-authored-by: Ronchi <ronchi@qq.com>
2024-09-25 14:23:51 -07:00
aperais
36264938db misc: Make random gen portable across compilers. (#1580)
Replace std::uniform_*_distribution by custom code
to make random number generation in gem5 portable across
compilers.

Of note, FP random number generation was not uniformly
distributed, and this PR does not fix that issue.

Thanks to Chandana S. Deshpande (deshpande.s.chandana@gmail.com)
for uncovering the issue.

Co-authored-by: Arthur Perais <arthur.perais@univ-grenoble-alpes.fr>
2024-09-25 07:31:00 -07:00
Saúl
d1ce4fb6c7 arch-riscv: add VLEN/ELEN as class attributes for all vec insts (#1538)
This refactor attempts to homogenize all riscv's vector (macro/micro)
instruction classes so that ELEN and VLEN are guaranteed to be a class
attribute. Since both are constant, all instructions will get it on the
decoding process passed through to their vector base class.

This allows the removal of VLEN in the PC state and also in some
constructor default parameters (solves issue #1207).

Change-Id: I6f0471004335f49b00b015c37e95dc7f9569e303
2024-09-24 14:32:37 -07:00
Yu-Cheng Chang
e9ea18000d arch-riscv: Move static GDB methods to RemoteGDB virtual methods (#1590)
Moving the getRvType & getPrivilegeModeSet static methods into
RiscvISA::RemoteGDB virtual methods allows a derived
RiscvISA::RemoteGDB to override them without changing a lot of methods
in the base class

Change-Id: I3cbb9cf1fdee4a298e903bb4a0a5683c042b749d
2024-09-24 07:46:56 -07:00
Bobby R. Bruce
2fc44a50f8 gpu-compute: Fix '64kB' to '64KiB' in gpu-compute (#1594)
64kB, in these cases, will cast to 64KiB regardless. To improve
readability and understanding of these objects, this patch changes their
unit prefix (kB -> KiB).
2024-09-23 15:25:43 -07:00
Bobby R. Bruce
d74d550af4 misc,tests: Improve daily cache handling. 2024-09-23 14:20:29 -07:00
Bobby R. Bruce
6af68bcf81 tests,misc: Update weekly/daily caches 2024-09-23 13:16:01 -07:00
Bobby R. Bruce
5214c8b0cb misc, tests: Add missing build/ALL cache in daily-tests.yaml 2024-09-23 12:05:42 -07:00
Bobby R. Bruce
162ea1fa74 tests,misc: Add caching to daily and weekly test workflows 2024-09-23 12:01:57 -07:00
Bobby R. Bruce
1a637e6d94 tests: test_requires.py moved to very-long and drop riscv
This test required a lot of compilation for what it does. It is now moved
to very-long/weekly and riscv has been dropped as arm and x86 are
sufficient.
2024-09-23 11:34:14 -07:00
Bobby R. Bruce
87daf94c0e tests: 'NULL_MI' -> 'NULL' in test_replacement_policies.py
NULL already compiled to include the MI protocol. This explicit
declaration causes compilation of another binary which is not required.
2024-09-23 11:34:14 -07:00
Bobby R. Bruce
91fb4acd29 tests: Remove 'multi_isa' tests (redundant)
These tests are almost identical to the 'stdlib/test_requires.py' tests.
They use all the same functions and test the same functionality.
2024-09-23 11:34:14 -07:00
Giacomo Travaglini
c3d356b43d arch-arm: Move generateTrap from MiscRegOp to ArmStaticInst (#1560)
System(Misc) register accesses are not the only trappable instructions.
We move the exception generation logic (generateTrap) from the
MiscRegOp64 to the base ArmStaticInst
2024-09-23 19:37:29 +02:00
Bobby R. Bruce
e85592da14 scons: Fix scons 'readCommand' non-zero exits (#1587)
There appears to have been an assumption here that `Popen` would raise
an exception if the command run returned non-zero. This is not the case.
This commit fixes this by obtaining the return code and throwing an
exception if it is non-zero.

This bug caused some minor issues as Exception handling code to handle
the non-zero case elsewhere in Scons was never executed.
2024-09-23 10:09:23 -07:00
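The point made in e85592da14 is that `subprocess.Popen` does not raise when the child exits with a non-zero status; the return code has to be checked explicitly. A minimal sketch of that pattern follows. `read_command` is a hypothetical helper written for illustration, not the SCons-side `readCommand` itself, and the choice of `CalledProcessError` as the exception is an assumption.

```python
import subprocess
import sys


def read_command(cmd, **kwargs):
    """Run cmd, return its output, and raise if the exit status is non-zero."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        **kwargs,
    )
    output, _ = proc.communicate()
    # Popen never raises on a failing command, so check explicitly.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd, output=output)
    return output


# A command that succeeds returns its output as usual.
print(read_command([sys.executable, "-c", "print('ok')"]).strip())
```

With this check in place, callers that wrap the call in `try`/`except` actually see the failure case instead of silently receiving the output of a failed command.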
Bobby R. Bruce
41b02c5020 misc: Revert "Revert Dramsys Ubuntu to 22.04 to ..."
This reverts commit 52fbc8ebcf.

This commit used Ubuntu 22.04 instead of the typical 24.04 as 24.04
has GCC v13 installed by default. GCC v13 (and newer compilers) introduce an
'overloaded-virtual' check that is triggered in systemc. Systemc
developers suggest this fix to proceed.
2024-09-23 05:20:30 -07:00
Bobby R. Bruce
0bd2cfaf53 systemc: Disable 'overloaded-virtual' warn for systemc bind funcs
For GCC >=v13 systemc was breaking due to the overloaded virtual
warning check.

Issue: #1121
2024-09-23 05:20:29 -07:00
Bobby R. Bruce
9b83fc8736 misc: Add caching to weekly tests 2024-09-23 05:02:38 -07:00
Bobby R. Bruce
8aa58714c4 misc: Update docker-build.yaml to target default 2024-09-23 02:40:37 -07:00
Bobby R. Bruce
688268d22d util-docker: Minor housekeeping to Dockerfiles (#1592)
1. Moved description label to docker-bake.hcl. Image descriptions must
be specified here. See:
https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry#adding-a-description-to-multi-arch-images
2. Moved specifying the 'Dockerfile' to 'common'.
3. Changed it so the gpu-fs and gcn-fpu images only built to
linux/amd64. arm64 doesn't work.
2024-09-23 02:36:09 -07:00
Bobby R. Bruce
ba02266260 misc: Fix 'target' field in docker-build.yaml 2024-09-21 08:05:09 -07:00
Bobby R. Bruce
7a5b8d9a9c misc: Fix 'username' field in docker-build.yaml 2024-09-21 08:01:22 -07:00
Bobby R. Bruce
b47dc0d5e6 misc: Fix 'needs' field in docker-build.yaml 2024-09-21 07:41:58 -07:00
Bobby R. Bruce
c01aaf83f7 misc: Add matrix to docker-build.yaml 2024-09-21 07:39:06 -07:00
Bobby R. Bruce
c88f0d0097 misc: docker-build.yaml test 2024-09-21 07:27:42 -07:00
Bobby R. Bruce
50aac87c71 misc: docker-build.yaml fix 2024-09-21 07:17:32 -07:00
Bobby R. Bruce
cae4852606 misc: Fix docker-build.yaml (#1588)
This is an attempt to get the docker build workflow working
2024-09-21 07:08:12 -07:00
Kaustav Goswami
51b5279671 ext,util-docker: updated SST to v.14.0.0 (#1575)
This change updates SST from v.13.0.0 to v.14.0.0. It also adds an
updated docker file to test the new version.
2024-09-21 06:18:12 -07:00
Bobby R. Bruce
473a37be04 util-docker: Minor docker improvements/fixes (#1586)
1. Added `sudo` to Ubuntu 24.04 all dependency Dockerfile

Without this an admin user entering a container mirroring host user
permissions can't run `sudo` within the container as it doesn't exist.
They also can't install it as `apt install` requires `sudo`.

As 24.04_all-deps serves as the base images for other images, this
change will be reflected in most other gem5 Docker images.

2. Fix multiplatform builds by removing `BUILDPLATFORM` platform fix.

This actually breaks multi-platform builds when using docker buildx via
the docker-bake.hcl file. Removing this fixes and permits the
multi-platform builds to be built.

3. Remove 'latex/riscv64' as Docker build target

It is unlikely anyone will be running these images on a RISC-V system
anytime soon. They are costly in terms of space and also require RISC-V
emulation to build which is very slow. This change has it so our
multi-platform builds just target ARM and X86.
2024-09-21 04:55:47 -07:00
Bobby R. Bruce
6186fc72a0 util-docker: Add 'sudo' to Ubuntu 24.04_all-deps
Without this an admin user entering a container mirroring host user
permissions can't run `sudo` within the container as it doesn't exist.
They also can't install it as `apt install` requires `sudo`.

As 24.04_all-deps serves as the base images for other images, this
change will be reflected in most other gem5 Docker images.
2024-09-21 04:52:33 -07:00
Bobby R. Bruce
827bca0cdb util-docker: Remove 'latex/riscv64' as Docker build target
It is unlikely anyone will be running these images on a RISC-V system
anytime soon. They are costly in terms of space and also require
RISC-V emulation to build which is very slow. This change has it so our
multi-platform builds just target ARM and X86.
2024-09-21 04:49:28 -07:00
Bobby R. Bruce
8fc2c4c9b4 util-docker: Remove 'BUILDPLATFORM' set
This actually breaks multi-platform builds when using docker buildx via
the docker-bake.hcl file. Removing this fixes and permits the
multi-platform builds to be built.
2024-09-21 04:47:47 -07:00
Jason Lowe-Power
fee603fd84 mem-cache: Do not require p.size and p.entry_size in IP template (#1557)
This PR is adjusting the constructor to relax template
requirements. In this way child classes are free to provide
their own way of calculating the number of entries and the
shifting required to extract the set

Why do we need this?
Up to this patch we have been configuring the indexing policy
by setting up the cache/table size (in bytes) and the entry size.
Those parameters make a lot of sense in caching structures
where:

a) We want to configure the caching structure using
the amount of storage (in bytes) provided (e.g. 4kB of Cache)
b) the content of a single entry is addressable therefore
we need the entry size to know how many bits in the indexing
process we need to shift to extract the set

In those cases the number of cache entries is derived from the formula

num_entries = size / entry_size

The adoption of the IndexingPolicy for different kinds
of caching structures (e.g. prefetcher tables) makes this
way of configuring the IP a bit quirky.

For some tables directly setting the number of entries is a far more
intuitive way of configuring the IP, instead of allocating the desired
number of entries by working things out with the formula above
2024-09-19 07:48:46 -07:00
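The size-based configuration described in fee603fd84 can be worked through numerically. The sketch below is illustrative only: the function names, the `assoc` parameter, and the power-of-two assumption are not taken from the commit, which only states `num_entries = size / entry_size` and that the entry size determines the shift used to extract the set.

```python
def floor_log2(x: int) -> int:
    """Integer log2 for positive x."""
    if x <= 0:
        raise ValueError("x must be positive")
    return x.bit_length() - 1


def derive_geometry(size: int, entry_size: int, assoc: int):
    """Derive (num_entries, num_sets, set_shift) from byte sizes.

    Assumes size, entry_size and assoc are powers of two.
    """
    num_entries = size // entry_size      # formula quoted in the commit
    num_sets = num_entries // assoc       # assoc is an assumed extra parameter
    set_shift = floor_log2(entry_size)    # skip the offset within an entry
    return num_entries, num_sets, set_shift


def extract_set(addr: int, set_shift: int, num_sets: int) -> int:
    """Pick the set index bits out of an address."""
    return (addr >> set_shift) & (num_sets - 1)


# 4096 bytes of storage, 64-byte entries, 4 ways (all values hypothetical).
num_entries, num_sets, set_shift = derive_geometry(4096, 64, 4)
print(num_entries, num_sets, set_shift)                # 64 16 6
print(extract_set(0x1234, set_shift, num_sets))        # 8
```

For a prefetcher table whose entries are not byte-addressable, the `size` and `entry_size` inputs have no natural meaning, which is why setting the number of entries directly is the more intuitive option.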
Giacomo Travaglini
e564561d41 misc: Remove Serialize-related code in Random (#1567)
The Random ser/des support has been non-existent since 2014.
Removing it will enable the Random class to be unit tested
without having a dependency on the src/sim code.
2024-09-19 14:13:10 +02:00
Arthur perais
85210cf51d misc: Remove unnecessary include in random.hh 2024-09-18 13:43:37 +02:00
Giacomo Travaglini
77dff262a1 arch-arm: Fix DC IVAC for Secure EL2 (#1569)
According to the Arm architecture reference manual:

"When the value of HCR_EL2.VM is 1, data cache invalidate instructions
executed at EL1 perform a data cache clean and invalidate"

This behaviour should be extended to secure mode now that Secure EL2 is
supported

Change-Id: I8b4733e6336a0fd5577f4ef35c0bae5408f91194

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-18 11:07:10 +01:00
Bobby R. Bruce
f2f86a3e42 stdlib, python: Add warning message and clarify binary vs metric units (#1479)
This PR changes memory and cache sizes in various parts of the gem5
codebase to use binary units (e.g. KiB) instead of metric units (e.g.
kB). This makes the codebase more consistent, as gem5 automatically
converts memory and cache sizes that are in metric units to binary
units.

This PR also adds a warning message to let users know when an
auto-conversion from base 10 to base 2 units occurs.

There were a few places in configs and in the comments of various files
where I didn't change the metric units, as I couldn't figure out where
the parameters with those units were being used.
2024-09-17 17:32:27 +00:00
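The auto-conversion and warning described in f2f86a3e42 can be sketched like this. It is an assumption-laden illustration, not the code in gem5's `convert.py`: `to_binary_units`, the suffix table, and the warning wording are all hypothetical.

```python
import re
import warnings

# Metric suffixes and the binary suffixes they are treated as.
_METRIC_TO_BINARY = {"kB": "KiB", "MB": "MiB", "GB": "GiB", "TB": "TiB"}


def to_binary_units(value: str) -> str:
    """Rewrite a size such as '64kB' as '64KiB', warning when it changes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", value)
    if match is None:
        raise ValueError(f"cannot parse size: {value!r}")
    number, suffix = match.groups()
    if suffix in _METRIC_TO_BINARY:
        binary = _METRIC_TO_BINARY[suffix]
        warnings.warn(
            f"Base 10 size '{value}' is treated as base 2 size "
            f"'{number}{binary}'."
        )
        return f"{number}{binary}"
    # Already binary (or an unrelated suffix): leave it untouched.
    return f"{number}{suffix}"


print(to_binary_units("2GiB"))  # 2GiB, no warning
print(to_binary_units("64kB"))  # 64KiB, with a UserWarning
```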
Matt Sinclair
6d49130b0b mem-ruby: Fix replacement policy in GPU_VIPER (#1564)
The current GPU_VIPER protocol's TCC cache updates the MRU information
twice by calling a_allocateBlock and ut_updateTag, which affects the
LIP and RRIP replacement policies. Removing ut_updateTag fixes the LIP and
RRIP replacement policies.

Change-Id: I79ad9392593e00425a7fe8828048465b2c2c2e1f
2024-09-17 12:16:09 -05:00
Bobby R. Bruce
3feeb5724f stdlib: Issue warn if func is a gen for exit_event (#1499)
Addresses Issue #1492
2024-09-17 09:34:24 -07:00
Arthur perais
4de65bbd57 misc: Remove Serialize-related code in Random
The Random ser/des support has been non-existent since 2014.
Removing it will enable the Random class to be unit tested
without having a dependency on the src/sim code.
2024-09-16 11:17:32 +02:00
Jarvis Jia
c1fcc0c54a Merge branch 'update_gpu_tcc' of https://github.com/yuxiaojia/gem5 into update_gpu_tcc
Change-Id: I7f04a5490193d9802351be6cd4e7d6baf3c79cb8
2024-09-14 23:22:51 -05:00
Jarvis Jia
9dfd66aca4 mem-ruby: Fix replacement policy in GPU_VIPER
The current GPU_VIPER protocol's TCC cache updates the MRU information
twice by calling a_allocateBlock and ut_updateTag, which affects the
LIP and RRIP replacement policies. Removing ut_updateTag fixes the LIP and
RRIP replacement policies.

Change-Id: I79ad9392593e00425a7fe8828048465b2c2c2e1f
2024-09-14 23:22:22 -05:00
Jarvis Jia
d8954745cf mem-ruby: Fix replacement policy in GPU_VIPER
The current GPU_VIPER protocol's TCC cache updates the MRU information
twice by calling a_allocateBlock and ut_updateTag, which affects the
LIP and RRIP replacement policies. Removing ut_updateTag fixes the LIP and
RRIP replacement policies.
2024-09-14 20:30:48 -05:00
Erin (Jianghua) Le
5aa7b1ce3e python: Redirect into correct subdirectory when using -re with multisim (#1551)
Previously, when passing the -re option while using multisim, the files
simerr.txt and simout.txt would be redirected into the m5out directory
instead of the correct subdirectory. They would also have a name of the
format
Spawn_gem5PoolWorker-some-integer_(simout|simerr).txt, which doesn't
indicate which simulation the files correspond to.

This commit fixes these issues by redirecting simerr.txt and simout.txt
into the correct subdirectory.

Change-Id: I0a25a9fd8dc672949f5f85fc5ca6452529301a73
2024-09-14 01:17:48 -07:00
Bobby R. Bruce
ad481167fa misc: Fix lone header bug (#1563) 2024-09-14 00:11:32 -07:00
Bobby R. Bruce
a1105cf234 misc,github,tests: Remove gerrit change ID requirement (#1486) 2024-09-13 20:22:04 -07:00
Bobby R. Bruce
4126035f88 util-docker: Move LABEL to after image import (#1548)
A Dockerfile must start with the importation of a docker base image. It
is only after this point that `LABEL` be provided. Having `LABEL` at the
top of the Dockerfiles resulted in the Docker images failing to build.
2024-09-13 20:21:35 -07:00
Yu-Cheng Chang
f94cac6f65 arch-riscv: Change the packed data of GdbRegCache to protected (#1552)
Change it to protected to enable access to the packed data from the
derived RiscvGdbRegCache class

Change-Id: Ib33732642914ad367773c3fa45adaf6dfdeb248d
2024-09-12 09:52:03 -07:00
Giacomo Travaglini
5eec041e2d arch-arm: Use generateTrap for SME/SVE/SIMD/WFE/WFI trapping
This avoids repeating the same switch construct

Change-Id: Ie16c52519b1e1f984284f2f1344a3903a0010d36
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-12 16:04:03 +01:00
Giacomo Travaglini
a4c9600200 arch-arm: Move generateTrap from MiscRegOp to ArmStaticInst
System(Misc) register accesses are not the only trappable instructions.
We move the exception generation logic (generateTrap) from the
MiscRegOp64 to the base ArmStaticInst

Change-Id: Ie2ba0c39790f50e3e8d504d153025d402283ec95
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-12 14:29:21 +01:00
aperais
e970acb9d2 cpu-o3: Replace integral constants by named constants in FU pool (#1556)
This replaces hardcoded integral values with more explicit constant
names in the code allocating functional units to instructions.

This commit follows ba5886aee7 which
should have read:

"If an instruction requires a functional unit that is not present in the
model (e.g., because it is not present in the configuration), O3CPU
treats it as a 1-cycle operation.

This commit changes the behavior to make the cpu panic when this
happens. The cpu panics only if the instruction reaches the head of the
ROB, meaning it is ok to have unsupported instructions on the wrong
path.

Thanks to Chandana S. Deshpande (deshpande.s.chandana@gmail.com) for
finding the issue."

Change-Id: I5e0a37e5fb8404cb5496bd2cb0a9a5baeae3b895

Co-authored-by: Arthur perais <arthur.perais@univ-grenoble-alpes.fr>
2024-09-12 14:04:34 +01:00
Giacomo Travaglini
e73c442ad8 mem-cache: Move size/entry_size params away from the template
Change-Id: Iec7a79cd9f2fa60d97f4a430e047e286f50338c8
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-12 10:10:58 +01:00
Giacomo Travaglini
3bd54db68d mem-cache: Do not require p.size and p.entry_size in IP template
This commit is adjusting the constructor to relax template
requirements. In this way child classes are free to provide
their own way of calculating the number of entries and the
shifting required to extract the set

Why do we need this?
Up to this patch we have been configuring the indexing policy
by setting up the cache/table size (in bytes) and the entry size.
Those parameters make a lot of sense in caching structures
where:

a) We want to configure the caching structure using
the amount of storage (in bytes) provided (e.g. 4kB of Cache)
b) the content of a single entry is addressable therefore
we need the entry size to know how many bits in the indexing
process we need to shift to extract the set

In those cases the number of cache entries is derived from the formula

num_entries = size / entry_size

The adoption of the IndexingPolicy for different kinds
of caching structures (e.g. prefetcher tables) makes this
way of configuring the IP a bit quirky.

For some tables directly setting the number of entries is a far more
intuitive way of configuring the IP, instead of allocating the desired
number of entries by working things out with the formula above

Change-Id: Ic7994c129196d6ba83dc99ce397ad43393d35252
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-12 10:10:58 +01:00
Erin Le
52c2ecd033 python: remove outdated comment in convert.py
Change-Id: I0cdeb709e5ae1a3100662172d96a5f6328be1a3d
2024-09-11 11:57:22 -07:00
Erin Le
39ea74c4ee tests: add test for checking conversion from base 10 to base 2
This commit adds a test that checks that strings representing
base 10 memory sizes or base 10 memory bandwidths are correctly
converted to strings representing base 2 values.

Change-Id: Ie8cac15f06b4ceb1786484fea4e8ba2111f4e8d3
2024-09-11 11:35:17 -07:00
Erin Le
3a8bbc41b8 python: refactor base 10 to 2 error message
This commit refactors the base 10 to base 2 error message such
that it uses the preexisting _split_suffix function instead
of a new function based off of _split_suffix. This commit also
removes the new helper function used previously.

Change-Id: I44d9ac3d8b98bcff33d6bfea7ffbdb5009272ede
2024-09-11 11:28:55 -07:00
aperais
ba5886aee7 cpu-o3: Panic if no FU exists for an instruction needing to issue (#1516)
At present, if an instruction requires a functional unit that is not
present in the O3CPU config, O3CPU treats it as a 1-cycle operation that
does not consume an FU. This seems like a silent failure: if I forgot
to add a FU for a new operation type I added, then I don't want it to
silently work "for free".

The problem is that the code treats the FU allocator returning
`NoCapableFU` for a given DynInst as equivalent to the case where the
DynInst obtained an FU, with default latency of 1. This is because there
is a single if statement that checks whether the FU allocator returned
`NoFreeFU` or not, and `NoCapableFU` happens to be different. The change
is to introduce `NoNeedFU` and to panic if the FU allocator returns
`NoCapableFU`

An improvement would be to use a strongly typed enum rather than integer
constants. Thoughts?
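
The three-way distinction described above can be sketched with a typed enum. This is an illustration only: the real code is C++ in the O3 CPU, and `FUStatus`, `issue_decision`, and the returned strings are hypothetical names. The sketch also raises immediately, whereas the actual change panics at a later point in the pipeline.

```python
from enum import Enum, auto


class FUStatus(Enum):
    """Possible non-index results of asking the pool for a functional unit."""
    NO_CAPABLE_FU = auto()  # no unit in the configuration can run this op
    NO_FREE_FU = auto()     # a capable unit exists but is busy this cycle
    NO_NEED_FU = auto()     # the instruction does not need a unit at all


def issue_decision(result):
    """Map an allocator result (FUStatus or an integer FU index) to an action."""
    if result is FUStatus.NO_CAPABLE_FU:
        # Previously this case fell through and issued "for free".
        raise RuntimeError("Processor cannot execute this op class")
    if result is FUStatus.NO_FREE_FU:
        return "retry"
    if result is FUStatus.NO_NEED_FU:
        return "issue-without-fu"
    return "issue"  # result is the index of the allocated unit


print(issue_decision(3))                    # issue
print(issue_decision(FUStatus.NO_FREE_FU))  # retry
```

Handling each status in its own branch avoids the original problem, where a single "is it `NoFreeFU`?" test lumped `NoCapableFU` in with successful allocation.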

In addition to unit tests, I have tested this with `main.py run` and get
panics if I remove support for `IntMul` type in `O3CPU.py` in:

```
./SuiteUID-asm-riscv-rv32um-ps-mul-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mul-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv32um-ps-mulh-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mulh-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv32um-ps-mulhsu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mulhsu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv32um-ps-mulhu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mulhu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv64um-ps-mul-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mul-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv64um-ps-mulh-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulh-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv64um-ps-mulhsu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulhsu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv64um-ps-mulhu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulhu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv64um-ps-mulw-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulw-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-BaseCPUProcessor-arm-hello-ALL-x86_64-opt/TestUID-BaseCPUProcessor-arm-hello-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-cpu_test_ArmDerivO3CPU_Bubblesort-ALL-x86_64-opt/TestUID-cpu_test_ArmDerivO3CPU_Bubblesort-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-cpu_test_ArmDerivO3CPU_FloatMM-ALL-x86_64-opt/TestUID-cpu_test_ArmDerivO3CPU_FloatMM-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-cpu_test_RiscvDerivO3CPU_Bubblesort-ALL-x86_64-opt/TestUID-cpu_test_RiscvDerivO3CPU_Bubblesort-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-cpu_test_RiscvDerivO3CPU_FloatMM-ALL-x86_64-opt/TestUID-cpu_test_RiscvDerivO3CPU_FloatMM-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_arm_boot_test_to-tick-ALL-x86_64-opt/TestUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_arm_boot_test_to-tick-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_riscv-boot-test_to-tick-ALL-x86_64-opt/TestUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_riscv-boot-test_to-tick-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-test-arm-hello32-static-o3-ALL-x86_64-opt/TestUID-test-arm-hello32-static-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-test-arm-hello64-static-o3-ALL-x86_64-opt/TestUID-test-arm-hello64-static-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-test-mips-hello-o3-ALL-x86_64-opt/TestUID-test-mips-hello-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-test-riscv-hello-o3-ALL-x86_64-opt/TestUID-test-riscv-hello-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-test-riscv-print-this-o3-ALL-x86_64-opt/TestUID-test-riscv-print-this-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
```

Co-authored-by: Arthur perais <arthur.perais@univ-grenoble-alpes.fr>
2024-09-11 16:43:31 +01:00
Bobby R. Bruce
f327559ca4 tests,stdlib,python: Add tests for base 10 to 2 SI unit check
**Note**: Erin needs to complete the commit by expanding this test to
properly test the behavior of this change.

To run the pyunit tests:

```sh
scons build/ALL/gem5.opt -j`nproc`
./build/ALL/gem5.opt tests/run_pyunit.py
```

Change-Id: I8cea0fe8b088e03e84072a000444953768bc3151
2024-09-10 15:17:53 -07:00
handsomeliu-google
0da65b31c2 python: Ignore *args and **kwargs when generating cxxMethod pybinding script (#1535)
According to the pybind documentation, "When combining *args or **kwargs
with Keyword arguments you should not include py::arg tags for the
py::args and py::kwargs arguments."

In the current implementation of gem5, if you use the cxxMethod
decorator on a function that has *args or **kwargs, gem5 will
incorrectly add these variables to the pybind generated declaration.

I.e., def f(arg1, arg2,  *args, **kwargs): -> .def("f", &f,
py::arg("arg1"), py::arg("arg2"), py::arg("*args"), py::arg("**kwargs"))
which is incorrect pybind code.

To fix this problem, we should ignore variables in the generator if they
are *args or **kwargs. This change skips these variables when creating
the pybind declaration.
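
A minimal sketch of the filtering idea, using the standard `inspect`
module. This is not gem5's generator code; `pybind_def` is a hypothetical
helper that only builds the declaration string for illustration.

```python
import inspect


def pybind_def(func) -> str:
    """Build a pybind-style .def(...) string, skipping *args and **kwargs."""
    skipped = (
        inspect.Parameter.VAR_POSITIONAL,  # *args
        inspect.Parameter.VAR_KEYWORD,     # **kwargs
    )
    args = [
        f'py::arg("{name}")'
        for name, param in inspect.signature(func).parameters.items()
        if param.kind not in skipped
    ]
    head = [f'.def("{func.__name__}"', f"&{func.__name__}"]
    return ", ".join(head + args) + ")"


def f(arg1, arg2, *args, **kwargs):
    pass


print(pybind_def(f))
```

Checking `param.kind` avoids string matching on the parameter names, so
the variadic parameters are skipped whatever they are called.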

Change-Id: I44a1e0eb0b5fc5c1e1d423ba145d456bff92c6b8
2024-09-09 10:23:26 -07:00
Ivana Mitrovic
da6ce1d9c2 ext,tests,misc: Suppress incorrect GCC 12 error in Pybind (#1501)
There is a compiler error with GCC 12 discussed here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115824

This Pybind code triggers the bug and was causing our compiler tests to
fail.

To fix gem5 compilation for gcc 12 these warnings/errors have been
suppressed for this code.
2024-09-09 10:21:54 -07:00
Daniel Carvalho
51863d322f gpu-compute: Reuse RP list in GPU_VIPER (#1530)
It is safer to reuse the dynamic list than manually listing all possible
replacement policies.

---------

Signed-off-by: odanrc <odanrc@yahoo.com.br>
2024-09-09 09:18:01 -07:00
Bobby R. Bruce
5207b3be6d ext,tests,misc: Suppress incorrect GCC 12 error in Pybind
There is a compiler error with GCC 12 discussed here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115824

This Pybind code triggers the bug and was causing our compiler tests to
fail.

To fix gem5 compilation for gcc 12 these warnings/errors have been
suppressed for this code.

This is a copy and paste of:
https://github.com/pybind/pybind11/pull/5355

Change-Id: I9344951ef00d121ea0b609f4faa13dfe09aabb3b
2024-09-08 00:38:02 -07:00
Erin Le
00f927a4e2 mem, python: refactor error message formatting
This commit refactors the error message added to convert.py.
A mapping between the base 10 and base 2 suffix magnitudes
(e.g. k: ki, M: Mi, etc.) and a new function that extracts the
magnitude and numerical value have been added. Also, a warning
message has been added to the toMemoryBandwidth function in
addition to the one in toMemorySize.
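
As a rough illustration of the approach (not the actual contents of
convert.py), the prefix mapping and the extraction helper could look like
the sketch below. `METRIC_TO_BINARY`, `split_size` and `to_bytes` are
hypothetical names, and the exact prefix spellings are assumptions.

```python
import re
import warnings

# Metric prefix -> binary prefix (assumed spellings, for illustration only).
METRIC_TO_BINARY = {"k": "Ki", "K": "Ki", "M": "Mi", "G": "Gi", "T": "Ti"}
BINARY_FACTORS = {"": 1, "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

_SIZE_RE = re.compile(r"\s*([0-9.]+)\s*(Ki|Mi|Gi|Ti|k|K|M|G|T)?B\s*")


def split_size(text: str):
    """Return (numeric value, prefix) for strings such as '512kB'."""
    match = _SIZE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    number, prefix = match.groups()
    return float(number), prefix or ""


def to_bytes(text: str) -> int:
    """Convert a size string to bytes, warning on metric prefixes."""
    number, prefix = split_size(text)
    if prefix in METRIC_TO_BINARY:
        binary = METRIC_TO_BINARY[prefix]
        warnings.warn(
            f"Base 10 unit '{prefix}B' will be treated as '{binary}B'."
        )
        prefix = binary
    return int(number * BINARY_FACTORS[prefix])
```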

Change-Id: I3ae157d13c7089d38a34a6e4c35a2b58978106d0
2024-09-05 18:00:41 -07:00
dependabot[bot]
4d6e968b04 misc: bump tqdm from 4.66.4 to 4.66.5 (#1532)
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.4 to 4.66.5.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 05:54:37 -07:00
dependabot[bot]
f014092fc2 misc: bump mypy from 1.11.1 to 1.11.2 (#1531)
Bumps [mypy](https://github.com/python/mypy) from 1.11.1 to 1.11.2.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-05 05:53:52 -07:00
Giacomo Travaglini
57d82fdbb4 sim-se, arch: Fix syscall parameter sizes for 32-bit OSs (#1482)
A bug was uncovered in that for various syscalls that used 64-bit
parameters, the ABI for 32-bit operating systems was passing the wrong
values to the syscalls, due to discrepancies between the target and
guest OS. This commit fixes that by replacing 64-bit types, or types
that are platform specific in size, with the exact correspondent for the
guest OS, thus producing the correct signature for the respective
syscalls. On top of this, the --param argument is added to the
starter_se script, in order to support attachment of remote debuggers.
2024-09-03 09:49:59 +01:00
Erin (Jianghua) Le
a5a3810ac9 util-docker: Add labels to Dockerfiles (#1528)
This PR adds labels to Dockerfiles. The labels are the source
(https://github.com/gem5/gem5), a description, and the license.

Change-Id: I47ce432257641b394efef4958f1474eefe2a11c1

Co-authored-by: Harshil Patel <harshilp2107@gmail.com>
2024-09-01 09:52:13 -07:00
Matt Sinclair
403622f376 dev-amdgpu: Implement UNMAP_QUEUES queue_sel==2 (#1481)
Unmap queues with queue_sel of 2 unmaps all queues while queue_sel of 3
unmaps all non-static queues. The implementation of 3 was actually
correct for 2. Static queues are queues which were mapped using a map
queues packet with a queue_type of 1 or 2.

This commit adds the ability to mark a queue as static. When unmap queues
with queue_sel of 2 is sent, the existing code is now executed. With a
value of 3, we now check if the queue was marked static and do not unmap
it if marked.
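
The resulting behaviour can be summarised with a small model. This is a
sketch only; the device model itself is C++, and `Queue` and
`unmap_queues` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Queue:
    queue_id: int
    static: bool = False  # set when mapped with queue_type 1 or 2


def unmap_queues(queues: Dict[int, Queue], queue_sel: int) -> Dict[int, Queue]:
    """Return the queues that stay mapped after an UNMAP_QUEUES packet."""
    if queue_sel == 2:
        # Unmap everything, static or not.
        return {}
    if queue_sel == 3:
        # Unmap only non-static queues; static ones survive.
        return {qid: q for qid, q in queues.items() if q.static}
    raise NotImplementedError(f"queue_sel {queue_sel} is not modelled here")
```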

Change-Id: I87d7cf78a0600c7baa516c01f42c294d3c4e90c5
2024-08-31 22:52:08 -05:00
Matthew Poremba
21f1e54ecd dev-amdgpu: Implement UNMAP_QUEUES queue_sel==2
Unmap queues with queue_sel of 2 unmaps all queues while queue_sel of 3
unmaps all non-static queues. The implementation of 3 was actually
correct for 2. Static queues are queues which were mapped using a map
queues packet with a queue_type of 1 or 2.

This commit adds the ability to mark a queue as static. When unmap queues
with queue_sel of 2 is sent, the existing code is now executed. With a
value of 3, we now check if the queue was marked static and do not
unmap it if marked.

Change-Id: I87d7cf78a0600c7baa516c01f42c294d3c4e90c5
2024-08-31 17:41:47 -07:00
Giacomo Travaglini
29d6b46f1f arch-arm: Fix Execution Permission in Stage2 Direct Permission. (#1502)
In Stage 2 under AArch64, execution permission does not need read
permission.

Change-Id: I45887e8f4d50ed5edc4afaed9a2dd8a74db9d0d4
2024-08-29 23:57:58 +01:00
Matthew Poremba
bb9539ad4d arch-vega: Revert incorrect SOPC compare (#1521)
LT <= was previously correct while < is not. This can lead to incorrect
program execution. Related to #1366 and #1520.

Change-Id: I00b7838e920eee7c8adb508e869fdf53a9373e1f
2024-08-29 09:20:25 -07:00
Giacomo Travaglini
d78a571660 base: Allow DPRINTF debugging of AssociativeCache (#1514)
The AssociativeCache is used by different caching agents.
This PR allows passing the appropriate flag to the cache so that we
can meaningfully debug its internals. For instance, when used to model
a prefetcher table, we will pass the HWPrefetch flag; when used to model
a TLB, we will pass the TLB flag.
2024-08-27 16:48:24 +01:00
Marco Kurzynski
a8447b7fc0 arch-vega: Pass s_memtime through smem pipe (#1350)
The Vega ISA's s_memtime instruction is used to obtain a cycle value
from the GPU. Previously, this was implemented to obtain the cycle count
when the memtime instruction reached the execute stage of the GPU
pipeline. However, from microbenchmarking we have found that this
underreports the latency for memtime instructions relative to real hardware.
Thus, we changed its behavior to go through the scalar memory pipeline
and obtain a latency value from the SQC (L1 I$). This mirrors the
suggestion of the AMD Vega ISA manual that s_memtime should be treated
like a s_load_dwordx2.

The default latency was set based on microbenchmarking.

Change-Id: I5e251dde28c06fe1c492aea4abf9f34f05784420
2024-08-26 19:47:04 -07:00
Ivana Mitrovic
9bd79bc160 tests: Fix gpu-tests (#1515)
This PR resolves the issue with the failing daily tests.

Change-Id: I984b09a6b69701a7a57b36e3346e55245f2fa04a
2024-08-26 09:40:28 -07:00
Alexander Richardson
b9eafdb190 arch-arm: Fix implicit int-to-float conversion in VCMP (#1326)
Explicitly convert to float/double to fix compiler warnings that I have
turned on locally. It might make sense to make use of fplib functions to
be portable across different host float formats but something as simple
as comparison against zero should be safe.

Change-Id: I96c6ee7c5497fece11be07234ff80ff86e7555e2
2024-08-26 10:22:07 +01:00
Alexander Richardson
3e288305c1 arch-arm: downgrade a warning to a DPRINTF (#1438)
Programming an event ID while counters are disabled is perfectly fine,
so we should just log this using DPRINTF instead of printing a warn()
every time it happens.

Change-Id: Ib9499857271033ef941f74a7f012d8694328eaf3
2024-08-26 10:20:49 +01:00
Alexander Richardson
ff12822606 arch-arm: when programming an invalid PMU ID detach the counter (#1510)
I'm not entirely sure what the mandated behaviour is according to the
ARM ARM, but I was very confused by the counters continuing to increment
with the old event even when programmed to an event ID that is not
currently supported by gem5. Disconnecting the counter if the event is
not supported is less surprising behaviour IMO.

Change-Id: I927d9339c138dafa1484db1515c2aa09b0a9a0a9
2024-08-26 10:17:51 +01:00
Alexander Richardson
a679b9e8a3 arch-arm: Use .f32/.f64 suffixes for vfp mnemonics (#1512)
This matches the Arm manual and the output produced by capstone. Also
avoid unnecessary spaces in vsel* instruction printing.

Change-Id: I071dd834b7104f10f6358a6b2e2895bdab64df82
2024-08-25 14:12:18 +01:00
Giacomo Travaglini
399f85223d base: Add print when inserting/evicting an AssociativeCache entry
We are adding two debug prints in the AssociativeCache:

1) Inserting print
2) Evicting print

Among those, the evicting one is probably the most important.
This is because while the DPRINTF can be added in the
Entry::insert implementation (called during insertion),
the AssociativeCache does not reference any evict method.
Instead, the findVictim is transparently invalidating the
victim, which makes it impossible for the client code
to understand whether the victim was a valid
entry or not.

Change-Id: I4fee59cc63c6b0e14c5b02bcf3ba5f58aa21ef9f
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-24 08:16:29 +01:00
Giacomo Travaglini
cda34c68a8 base: Store a pointer to a debug flag in the AssociativeCache
This allows printing debug information for the associative
cache depending on its usage.

Change-Id: Ia64e1cd7cb31fcbac27d031c38cd448ea64c5b4d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-24 08:16:29 +01:00
Giacomo Travaglini
fc391cb9e8 arch-arm: Add place holder of registers. (#1495)
Add declaration of HAFGRTR_EL2 registers and read/write as GPR.

Change-Id: I87570d1e87d479f4530cf2c6e05931cdc26ee361
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-24 07:18:23 +01:00
Erin Le
e1db67c4bd configs, dev, learning-gem5, python, tests: more clarification
This commit contains the rest of the base 2 vs base 10 cache/memory
size clarifications. It also changes the warning message to use
warn(). With these changes, the warning message should now no
longer show up during a fresh compilation of gem5.

Change-Id: Ia63f841bdf045b76473437f41548fab27dc19631
2024-08-23 18:02:42 -07:00
Erin Le
28453a0e3e python: add warning message for conversion from base 10 to base 2
This commit adds a warning message for when cache or memory sizes
will be automatically converted from metric units (e.g. kB) to
binary units (e.g. KiB).

Change-Id: I4ddf199ff2f00c78bbcb147e04bd88c496fa16ae
2024-08-23 18:02:42 -07:00
Erin Le
70c8081107 stdlib: Clarify power of 2 vs power of 10
This commit changes files in src/python/gem5 so that memory and
cache sizes use "KiB", "MiB", and "GiB" instead of "KB", etc. This
makes the codebase more consistent, as gem5 will automatically
convert memory and cache sizes that are in metric units (KB) to
binary units (KiB).

Change-Id: If5b1e908ddcff7182b71e789229a3ba1fa6ad1f1
2024-08-23 18:02:42 -07:00
Giacomo Travaglini
fec28e466e base, mem-cache: Make the AssociativeCache more generic (#1446)
This PR implements #1429. It mainly achieves this with the following
changes:

1) The IndexingPolicy is now a templated SimObject to make its APIs work
with different data types.
As an example, look at the getPossibleEntries, which is requiring an
Addr object whereas we want to be able to call the method with different
keys depending on the Tag
2) The AssociativeCache extracts type information from the Entry
template parameter.
This means any AssociativeCache entry will have to define the following
types:

KeyType = This is the data type used for lookups (in its simplest case,
it is Addr)
IndexingPolicy = This is the base indexing policy SimObject

As an example, the PR is also reworking the TaggedEntry to be
AssociativeCache compliant. This
ultimately allows us to remove the weird overloading of cache querying
methods with the secure flag, and to
remove the AssociativeSet which was providing such weird interface.
As mentioned in the [base, mem-cache: Rewrite TaggedEntry
code](7ee9790464)
commit, further cleanup is needed. TaggedEntry
is really a misleading name as its sole difference with the CacheEntry
(which is also tagged) is the presence of
the secure bit. A better name should be chosen.
2024-08-23 21:07:43 +01:00
Giacomo Travaglini
c52f77a1d7 mem-cache: Remove IP dependency from TaggedEntry
We don't store a pointer to the indexing policy anymore.
Instead, we register a tag extractor callback when we
construct the TaggedEntry

Change-Id: I79dbc1bc5c5ce90d350e83451f513c05da9f0d61
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-23 12:15:10 +01:00
Giacomo Travaglini
b6d34db216 base, mem-cache: Remove IP dependency from the CacheEntry
We don't store a pointer to the indexing policy anymore.
Instead, we register a tag extractor callback when we
construct the CacheEntry

Change-Id: I06dc58e2f67e01f3f9bcd9f0c641505d3aec82ff
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-23 12:15:10 +01:00
Giacomo Travaglini
4030e39c9a mem-cache: Remove AssociativeSet data type
As detailed by a previous commit, AssociativeSet is not needed anymore.
The class is effectively the same as AssociativeCache

Change-Id: I24bfb98fbf0826c0a2ea6ede585576286f093318
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-23 12:15:10 +01:00
Giacomo Travaglini
ee9814499d base, mem-cache: Rewrite TaggedEntry code
The only difference between the TaggedEntry and the newly defined
CacheEntry is the presence of the secure flag in the first case.  The
need to tag a cache entry according to the security bit required the
overloading of the matching methods in the TaggedEntry class to take
security into account (See matchTag [1]), and the persistence after
PR #745 of the AssociativeSet class which is basically identical
to its AssociativeCache superclass, only it overrides its virtual
method to match the tag according to the secure bit as well.

The introduction of the KeyType parameter in the previous commit
will smooth out the differences and help unify the interface.

Rather than overloading and overriding to account for a different
signature, we embody the difference in the KeyType class. A
CacheEntry will match with KeyType = Addr,
whereas a TaggedEntry will use the following lookup type proposed in this
patch:

struct KeyType {
    Addr address;
    bool secure;
};

This patch is partly reverting the changes in #745 which were
reimplementing TaggedEntry on top of the CacheEntry. Instead
we keep them separate as the plan is to allow different
entry types with templatization rather than polymorphism.
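
To illustrate how a key type can absorb the difference between the two
entry kinds, here is a small model. The real code uses C++ templates;
this Python sketch uses duck typing instead, and `SecureKey`, `AddrEntry`,
`SecureEntry` and `find_entry` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SecureKey:
    """Lookup key mirroring the KeyType struct above."""
    address: int
    secure: bool


class AddrEntry:
    """Entry keyed by a plain address (the CacheEntry case)."""

    def __init__(self) -> None:
        self.key: Optional[int] = None

    def insert(self, key: int) -> None:
        self.key = key

    def match(self, key: int) -> bool:
        return self.key is not None and self.key == key


class SecureEntry:
    """Entry keyed by address plus secure bit (the TaggedEntry case)."""

    def __init__(self) -> None:
        self.key: Optional[SecureKey] = None

    def insert(self, key: SecureKey) -> None:
        self.key = key

    def match(self, key: SecureKey) -> bool:
        return self.key is not None and self.key == key


def find_entry(entries: Iterable, key):
    """One lookup routine for any entry type that implements match(key)."""
    for entry in entries:
        if entry.match(key):
            return entry
    return None
```

The lookup routine never inspects the key, so no secure-flag overloads
are needed on the cache side.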

As a final note, I believe a separate commit will have to
change the naming of our entries; the CacheEntry should
probably be renamed into TaggedEntry and the current TaggedEntry
into something that reflects the presence of the security bit
alongside the traditional address tag.

[1]: https://github.com/gem5/gem5/blob/stable/\
    src/mem/cache/tags/tagged_entry.hh#L81

Change-Id: Ifc104c8d0c1d64509f612d87b80d442e0764f7ca
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-23 12:15:10 +01:00
Giacomo Travaglini
1c57195d7f base: Do not require an AssociativeCache to store a CacheEntry
As long as the AssociativeCache Entry parameter satisfies the
interface it should be fine. We enforce the bare minimum of having
a replaceable entry.
Doing otherwise will restrict our capability to have a generic cache
with generic tags

Change-Id: I23e32b7540fea6b6e5894aca3d91538e81214932
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-23 12:15:10 +01:00
Giacomo Travaglini
8c81479193 base: Extract KeyType type in the AssociativeCache from Entry
The KeyType data type is the type of the lookup and the cache extracts
it from the Entry template parameter

Change-Id: I147d7c2503abc11becfeebe6336e7f90989ad4e8
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-23 12:15:10 +01:00
Giacomo Travaglini
4a2e015ff8 base: Extract IP type in the AssociativeCache from Entry
This commit is making the AssociativeCache indexing policy
a type extracted from the Entry template parameter

Change-Id: Ic9fb6ccb1b3549aaa250901e91ae3c300b92103e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-23 12:15:10 +01:00
Giacomo Travaglini
4814fedef0 base, mem-cache: Do not expose tags to the AssociativeCache
Exposing the tag of a cache entry through the associative
cache APIs makes it hard to generalize the cache for
structured tags. Ultimately the tag should be a property
of the cache entry and any tag extraction logic (if needed)
should reside there. In this way we can reuse the associative
cache for different Entry params, each one bearing a different
representation of a tag.

Change-Id: I51b4526be64683614e01d763b1656e5be23a611b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-23 12:15:10 +01:00
Giacomo Travaglini
31d967b453 base, mem-cache: Templatize the BaseIndexingPolicy
Change-Id: I4a7a0effd0100371afbd31c51d8ac643049dbdb1
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-23 12:15:10 +01:00
Giacomo Travaglini
9661ca7708 mem-cache: Simplify generation of stride prefetcher table
Some compilers (gcc version 12.3.0) will start complaining when perfect
forwarding the StrideEntry argument constructed with an extra parameter
(see later patches).
Using a pointer seems to fix the gcc bug.

The commit is also changing the signature of findTable and allocateContext
so that a reference rather than a pointer is returned. In this way we don't
deal with the hack of returning a raw ptr from a unique_ptr.

Change-Id: Idd451208aae80bbfae76110c859e93084bcb2635
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-23 12:15:10 +01:00
Giacomo Travaglini
58aa0cfbe5 mem-cache: Rewrite explicit fully associative lookup
The code is already assuming a fully associative cache.  Rather than
calling getPossibleEntries with a random value and therefore needlessly
passing a vector of pointers, we use the AssociativeCache iterator to
loop over the cache entries

Change-Id: Ic99cbd39ee9f12eef9091d9d62ca24d0c3e61300
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-23 12:15:10 +01:00
Harshil Patel
1773001dd6 resources: update filtering of resources by gem5 versions (#1475)
- Updated search query so that resources that are not compatible with
the gem5 version are still downloaded and used but a warning is thrown
instead of returning an error.
2024-08-22 09:42:34 -07:00
Junshi Wang
7205652476 arch-arm: Fix Execution Permission in Stage2 Direct Permission.
In Stage 2 under AArch64, execution permission does not need read
permission.

Change-Id: I45887e8f4d50ed5edc4afaed9a2dd8a74db9d0d4
2024-08-22 15:56:06 +01:00
Bobby R. Bruce
30866376d3 tests,gpu-compute: Fix gpu tests (#1496) 2024-08-22 05:49:26 -07:00
Bobby R. Bruce
6057de452b tests,gpu-compute: Fix incorrect options handling
Change-Id: Ica845ad7c4a49fe2636df3bf184220a33557bc5e
2024-08-22 05:49:07 -07:00
Bobby R. Bruce
0a188850fe tests,gpu-compute: Fix artifact upload for GPU tests
actions/upload-artifact@v4 does not understand periods in artifact
names.

Change-Id: Ia272f9dcf9cb2213fb78b1814007921232395914
2024-08-22 05:49:07 -07:00
Bobby R. Bruce
28a6ca201b misc,tests: Remove Gerrit ID check from CI Workflow
Change-Id: I86933f3b315f3233e135de2e32498c1641f7443e
2024-08-22 04:24:56 -07:00
Bobby R. Bruce
868e287e71 stdlib: Give user's disk_device priority when setting root val (#1467)
`get_default_kernel_root_val()` now prioritizes the explicit
disk_device passed from the user over the default implemented by the
board.

Also adjusts syntax for selecting this value in
`set_kernel_disk_workload()` for consistency.

It seems that the common use case for setting `disk_device` is that
there is a mismatch between where the disk image is mounted and where
the board expects it by default. In this case, it also seems common that
the root partition will be on this explicit device as well.

In cases where this is not true, explicit kernel arguments can be used
to define the distinct disk device apart from the root. However, this
seems less common than the above so, in that case, it would be easier to
tie these together.
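
The selection rule amounts to the following sketch. `pick_root_device`
and its parameters are hypothetical names for illustration; they are not
the stdlib's actual signature.

```python
from typing import Optional


def pick_root_device(user_disk_device: Optional[str], board_default: str) -> str:
    """Prefer the device the user passed explicitly, else the board default."""
    if user_disk_device is not None:
        return user_disk_device
    return board_default
```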
2024-08-22 03:38:56 -07:00
Junshi Wang
387caa0075 arch-arm: Add place holder of registers.
Add declaration of HAFGRTR_EL2 registers and read/write as GPR.

Change-Id: I87570d1e87d479f4530cf2c6e05931cdc26ee361
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-22 08:21:28 +01:00
Setu
f6010439fe mem: Fixed implementation of Best Offset Prefetcher (#1403)
This PR fixes the issues with the implementation of the Best Offset
Prefetcher described in issue #1402


---------

Co-authored-by: Setu Gupta <setu.gupta.2020@gamil.com>
Co-authored-by: Abhishek Shailendra Singh <abs218@leigh.edu>
Co-authored-by: Setu Gupta <setu.gupta@partner.samsung.com>
2024-08-21 09:54:20 -07:00
Bobby R. Bruce
3429da7787 python,tests,misc: Remove Gerrit-ID insertion from pre-commit
Change-Id: I4db06415c9d0bbba7a6db56d7e9febf6491003bf
2024-08-20 15:40:55 -07:00
Noah Krim
dfa4bbd7a4 Merge branch 'develop' into fix-kernel-workload-root-val 2024-08-20 15:12:10 -07:00
Bobby R. Bruce
e7442036a5 tests,gpu-compute: Fix Daily/Weekly GPU tests failures (#1485)
Without specifying the "gem5/gpu" directory, this test attempted to run
the entire test suite. This caused the daily and weekly tests to fail.
This change fixes this.
2024-08-20 14:18:51 -07:00
Ali Nezhadi Khelejani
1512eddd43 misc: Update on-create.sh (#1477)
After merging the old personal gem5 repository with the stable version
v24, I tried to run the project inside the `.devcontainer` environment.
During the image build process, I encountered the following error:

```sh
[7683 ms] Start: Run in container: /bin/sh -c ./.devcontainer/on-create.sh
fatal: detected dubious ownership in repository at '/workspaces/gem5'
To add an exception for this directory, call:

        git config --global --add safe.directory /workspaces/gem5
[7724 ms] onCreateCommand failed with exit code 128. Skipping any further user-provided commands.
```
This error occurred due to an ownership permission problem, which I
resolved by adding the following line.
2024-08-20 11:15:33 -07:00
Bobby R. Bruce
0857442e44 util-docker: Cleanup, refactor, better document Dockerfiles (#1292)
* Removes the "docker-compose.yaml" in favor of "docker-bake.hcl". This
uses the `docker buildx` tool which has the advantage of enabling
multi-platform builds where desired. By default all images are built
targeting `linux/arm64`, `linux/amd64` and `linux/riscv64` as targets
with the exception of the GPU images where only `linux/amd64` makes
sense.
* Remove unused/older Docker build targets (these can easily be re-added
but they were not regularly built and have no current usage).
* Update "README.md" to better describe these Dockerfiles and how they
are built.
* Simplify GCC and Clang compiler images. Each uses the Ubuntu 24.04 All
Deps image as a base then specializes the compiler on top.
* To simplify things, all compiler versions are built from 24.04. This
means **narrowing the supported versions from GCC v10 to v14 and Clang
v14 to v18**.
* Fix some bugs in the "docker-bake.hcl" thus ensuring all targets may
be built from it.
* Cleanup the systemc and sst images: reducing their size and building
them off the common 24.04 ubuntu base image.
2024-08-20 09:45:47 -07:00
Tiberiu Bucur
88de81f167 arch-arm, sim-se: Fix VPtr bug
Some syscalls were incorrectly using 64 bit
integers instead of VPtr's guest pointers,
causing parameter value corruption. This
commit addresses this issue.

Change-Id: If9e27a7c776b802dda18979d1a83a76c23557359
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-20 16:18:24 +01:00
Tiberiu Bucur
107e8f3d17 arch, sim-se: Fix size_t size mismatch bug
Same as with the off_t, some syscalls were using
incorrect size parameters in place of a guest-defined
size_t. This commit changes the signature of said
syscalls and adds the size_t typedef to the
arch-dependent Linux OSs.

Change-Id: Iece43814971a8e6275d25f6789e41528d241d1f4
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-20 16:18:24 +01:00
Tiberiu Bucur
f74260c552 arch, sim-se: Fix off_t size mismatch bug
Some system calls were using incorrect sizing for
offset parameters, which was causing the ABI to pass
wrong values due to size mismatches. One such syscall
is lseek, which in the Arm syscall table was
incorrectly marked as llseek, which does not exist
in aarch64 Linux. In addition, the off_t alias for
general Linux was changed from an unsigned to a
signed type, to accurately reflect the behaviour
in the real-life Linux operating system.
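
A simplified model of why a wrongly sized parameter corrupts the values
that follow it. It assumes arguments arrive in 32-bit registers and that
a 64-bit argument consumes two of them; register-pair alignment rules of
real ABIs are ignored, and `decode_args` is a hypothetical helper.

```python
def decode_args(regs, widths):
    """Decode syscall arguments from 32-bit registers.

    A 64-bit argument consumes two consecutive registers (low word first).
    Register-pair alignment rules of real ABIs are deliberately ignored.
    """
    args, i = [], 0
    for width in widths:
        if width == 32:
            args.append(regs[i])
            i += 1
        else:
            args.append(regs[i] | (regs[i + 1] << 32))
            i += 2
    return args


# Guest registers for lseek(fd=3, offset=16, whence=2); the last is unused.
regs = [3, 16, 2, 0]
correct = decode_args(regs, [32, 32, 32])  # guest-sized off_t
wrong = decode_args(regs, [32, 64, 32])    # off_t read as a 64-bit value
```

In the `wrong` case the offset swallows the `whence` register, so both
the offset and the argument after it come out corrupted.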

Change-Id: Iada4b66a8933466c162ba9ec901dbdae73c73a18
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-20 16:18:24 +01:00
Tiberiu Bucur
9b9b9ffbff arch-arm: Ignore/implement several syscalls
This commit either adds the implementation or the ignoreFunc
to the corresponding entry in the syscall table for
some Arm syscalls that were required in order to test
the fix for the incorrect parameter size bug in se mode.

Change-Id: Ifc6d87e2decf1bf96ecd81de6690f92927377bf8
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-20 16:18:24 +01:00
Tiberiu Bucur
fe6ef662d1 configs: Add --param to starter_se
This commit adds the --param option to the starter_se
configuration script for the Arm ISA. This is in order
to support attaching remote debugger sessions.

Change-Id: I2d8cc9f677f731948872003cca6066d1072ad570
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-20 16:18:24 +01:00
Noah Krim
cb79596036 Merge branch 'develop' into fix-kernel-workload-root-val 2024-08-19 12:40:03 -07:00
Harshil Patel
ce4c2c6495 dev,arch-x86: Added softstrobe mode to intel8254 timer (#1447)
This PR should fix #1195.
2024-08-19 12:21:31 -07:00
Bobby R. Bruce
7413d3217c docs,misc: RELEASE-NOTES.md updates for v24.1 (#1460) 2024-08-19 10:58:29 -07:00
Bobby R. Bruce
f600db4a98 gpu-compute,tests: Move GPU tests to testlib (#1270)
A new host tag `gcn_gpu` has been added. This allows for selection of
those GPU tests which depend upon the gcn-gpu docker image to run.

In addition to this, the square GPU test has been moved to the CI
tests. This ensures some GPU code is compiled and run on every PR.
2024-08-19 10:58:06 -07:00
Yangyu Chen
b0d81ec8a2 arch-riscv: fix GDB breakpoint issue for RV32 (#1470)
Since PR #1316, we use sign-extend for all address generation, including
PC, to match the ISA specification for modifiable XLEN. However, when we
set a breakpoint using remote GDB, our address is not sign-extended.
This causes the breakpoint to be set at the wrong address, as specified
in Issue #1463. This PR fixes the issue by sign-extending the address
when setting a breakpoint. This also matches the RISC-V ISA
Specification that "must sign-extend results to fill the entire widest
supported XLEN in the destination register."
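
For reference, sign-extending a 32-bit address to 64 bits can be sketched
as below. This is an illustration of the arithmetic only, not the gem5
implementation; `sign_extend` is a hypothetical helper.

```python
def sign_extend(value: int, bits: int = 32, width: int = 64) -> int:
    """Sign-extend the low `bits` of value to a `width`-bit unsigned integer."""
    value &= (1 << bits) - 1           # keep only the low `bits` bits
    if value & (1 << (bits - 1)):      # sign bit set?
        value -= 1 << bits             # interpret as negative
    return value & ((1 << width) - 1)  # wrap into `width` bits
```

An address such as 0x80000000 becomes 0xFFFFFFFF80000000, which is the
form a sign-extended PC is compared against, while addresses below
0x80000000 are unchanged.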

Change-Id: I9b493bf8ad5b1ef45a9728bb40fc5e38250fe9c3

Signed-off-by: Yangyu Chen <cyy@cyyself.name>
2024-08-19 10:25:39 -07:00
Bobby R. Bruce
cad4307951 util-docker: Re-add env variables to SST
Change-Id: I653baeb69f8be1501766b57337f6643e00d7dd60
2024-08-19 10:08:20 -07:00
Yu-Cheng Chang
aa4fe362a5 arch-riscv: Sign-extend the address in newPCState (#1471)
Following #1316, creating the new PCState should sign-extend the address to
avoid wrong-address issues.

Change-Id: I884b4e3708f5f1cc49cfd44d51bec5a2b63cc47a
2024-08-19 08:21:42 -07:00
Giacomo Travaglini
280871245b arch-arm: Redirect VHE for ZCR_EL1 (#1472)
Change-Id: Iff83d25257065503dc02728461823bc9985dbab3

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-16 22:49:49 +01:00
Noah Krim
eca0059654 stdlib: Give user's disk_device priority when setting root val
`get_default_kernel_root_val()` now prioritizes
the explicit disk_device passed by the user over the
default implemented by the board.

Also adjusts syntax for selecting this value in
`set_kernel_disk_workload()` for consistency.

Change-Id: Icddcf438f5b96c2288c3cc608782f191df2c394e
2024-08-15 13:34:03 -07:00
Bobby R. Bruce
646df63e56 misc: Fix typos in util/dockerfiles/README.md
Change-Id: I5488301543bfff21279b6c0b1aae841574efee95

Co-authored-by: Harshil Patel <harshilp2107@gmail.com>
2024-08-15 10:45:53 -07:00
Alexander Richardson
646f994efb arch-arm: Fix incorrect operation of VRINT* instructions (#1325)
After a lot of debugging and comparing traces I noticed that vrintp was
giving different results from QEMU. An input of 0x3f800000 (1.0) was
being passed to the fplib helpers as (uint32_t)1 which has a completely
different floating-point interpretation and the result was therefore
completely wrong.

I've fixed this as well as all remaining implicit float-to-int
conversions in the ARM instruction execution. There are more
-W(implicit-)float-conversion warnings in the other executors, but for
now this fixes the issue I was seeing.

Change-Id: Ifdeee745ca155d7f4504ac4c54235ac431acdeb9
2024-08-15 11:01:48 +01:00
Bobby R. Bruce
0c26ee5f71 util-docker: Replace gem5 v24.0 clone with wget
This is more efficient.

Change-Id: Idd57343183a8667425dbc036ad0c7c18581898f5
2024-08-14 14:08:44 -07:00
Setu
629bf84e10 mem: Stride Prefetcher Fix (#1449)
This PR fixes the issues mentioned in #1448.

**Note that this contribution is the result of a joint collaboration
with @AbhishekUoR**

This PR introduces the following 4 changes:
1. It changes the addresses which are used to compute the stride to
cache line aligned addresses (the current version uses word aligned
addresses)
2. It correctly returns if the stride does not match (as opposed to
issuing prefetches using the new stride incorrectly)
3. It returns if the new stride is 0, indicating multiple reads from the
same cache line.
4. It removes code which is no longer necessary after the addition of
changes number 1 and 3.
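
The first three changes can be sketched roughly as follows. This is a
simplified illustration, not the gem5 implementation: it omits the confidence
counters a real stride prefetcher uses, and `StrideEntry`,
`prefetch_candidates`, `degree`, and the 64-byte block size are all
assumptions made for the example.

```python
BLOCK_SIZE = 64  # assumed cache line size in bytes


class StrideEntry:
    """Per-stream state: last cache-line address seen and the learned stride."""

    def __init__(self, last_addr: int):
        self.last_addr = last_addr
        self.stride = 0


def prefetch_candidates(entry, addr, degree=2, block_size=BLOCK_SIZE):
    # Change 1: compute strides on cache-line-aligned addresses.
    line_addr = addr - (addr % block_size)
    new_stride = line_addr - entry.last_addr

    # Change 3: a zero stride means another read from the same cache line.
    if new_stride == 0:
        return []

    stride_match = new_stride == entry.stride
    entry.last_addr = line_addr

    # Change 2: on a mismatch, learn the new stride but do not prefetch yet.
    if not stride_match:
        entry.stride = new_stride
        return []

    return [line_addr + new_stride * i for i in range(1, degree + 1)]
```

In this sketch, two accesses to the same line never disturb the learned
stride, and prefetches are only issued once the same non-zero stride has been
observed twice in a row.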

Change-Id: Ic346d0e15df6d07e2b93289c8d6b89b4c2f45a34

---------

Co-authored-by: Abhishek Shailendra Singh <abs218@leigh.edu>
2024-08-14 07:16:10 -07:00
Bobby R. Bruce
dcb04a72fc util-docker,tests: Remove Ubuntu 20.04 Docker
Change-Id: I1d4bbebaa4b6f064b5f40a95d066bbf092cf103f
2024-08-13 16:15:49 -07:00
Bobby R. Bruce
9f93c8ac9c util-docker: Revert docker image tag to 'latest'
Change-Id: Iafe92716725e6b3cecfeba57098c3a7efaf73d97
2024-08-13 16:13:33 -07:00
Bobby R. Bruce
59455daa85 util-docker: Fix correct common platform comment
Change-Id: Ifc703b47b1e59522ba01f4c2b59a4863779eefb1
2024-08-13 16:12:45 -07:00
Bobby R. Bruce
8b61490df1 util-docker: Update dockerfiles README
Change-Id: I39bca04b3770bd51203944d69d0fbecff85055f8
2024-08-13 16:09:08 -07:00
Bobby R. Bruce
bef452ce72 misc,tests: Update supported GCC and Clang compilers
- GCC: v10 to v14
- Clang: v14 to v18

Change-Id: I6cd1686ffff0f08686a231b6b4936da343d53831
2024-08-13 16:09:06 -07:00
Bobby R. Bruce
b68c2ef37f util-docker: Add vim to 24.04-all-deps Ubuntu Docker
Change-Id: I898a0fddcdcf8a876fcbbe11795e858395ad9740
2024-08-13 16:08:05 -07:00
Bobby R. Bruce
3875dcdfd7 util-docker: Update the sst Dockerfile
1. Builds on top of the Ubuntu 24.04 all-deps image.
2. Unify the download, build, install, and cleanup steps.

Change-Id: I4c2bf8e571dfd228f7df8372cda0f428de59af51
2024-08-13 16:08:05 -07:00
Bobby R. Bruce
2c0c933a3a util-docker: Cleanup the systemc docker
1. Uses the ubuntu-24.04_all-deps as the base image.
2. Unifies the build and cleanup into a single step, thus reducing the
   size of the image.

Change-Id: I63b5dad2af0e8b1f6be8ad1f28321c743f36b2dc
2024-08-13 16:08:05 -07:00
Bobby R. Bruce
58aad68329 util-docker: Order targets in docker-bake
This improves readability. The target order matches that in the default
group.

Change-Id: I1102aeb48bc256df9b58032a327ec663e5733a98
2024-08-13 16:08:03 -07:00
Bobby R. Bruce
9978b4ea4c util-docker: Add 'devcontainer' to default bake group
Change-Id: I4b245cabd6e384cab780bd22b0f8b40d9819b92b
2024-08-13 16:07:22 -07:00
Bobby R. Bruce
4956c475f4 util-docker: Set the GPU Docker images to build only to x86
These images won't work on, and make no sense compiling for, any platform
other than x86. These are used in SE mode simulations where the host
platform matters.

Change-Id: I47405e930bf511fabcbc93d0b08ee2fb2c556869
2024-08-13 16:07:22 -07:00
Bobby R. Bruce
c1a562083d util-docker: Set 'pull' to 'always'
This ensures a `docker pull` command is always run before building.

Change-Id: If1a66b9b426d5843459e0308a64f13a11c0c6ed2
2024-08-13 16:07:21 -07:00
Bobby R. Bruce
9d635dea55 util-docker: Improve Docker gcc and clang builder
1. Uses the all-dependencies image as the base image.
2. Has all compilers use Ubuntu 24.04.

Note: This change implicitly changes our supported compilers to GCC v10
to v13 and Clang v14 to v18. This will be fully incorporated into the
project later.

Change-Id: Id8e2141ea64a34c7e3532605f6ecb7d9ccb76951
2024-08-13 16:07:15 -07:00
Bobby R. Bruce
a1eefb6ed8 util-docker: Remove old/unused/unnecessary Dockerfiles
* Unsupported compilers.
* Unused cross compilers.
* The gem5-all-min-deps image.

Change-Id: Iaab64e5e6685b0a538c38b2979fae86f01bc53e8
2024-08-13 16:05:47 -07:00
Bobby R. Bruce
e03a20bdb4 util-docker: Remove version from systemc Docker context
This simplifies things slightly.

Change-Id: I1263e385f7adeb2b83cdc09f7f6903be9193c467
2024-08-13 16:05:47 -07:00
Bobby R. Bruce
c291678881 util-docker: Fix docker-bake.hcl sst context
The context for sst is the "sst" directory.

Change-Id: Ic120cca13a9e4df02b98d101ad8e16c296807c2d
2024-08-13 16:05:47 -07:00
Bobby R. Bruce
e82e824f08 util-docker: Breakup long (>79 char) lines in docker-bake
Change-Id: I5488301543bfff21279b6c0b1aae841574efee95
2024-08-13 16:05:46 -07:00
Bobby R. Bruce
3640559a12 misc,tests: Fix compiler tests (add missing ,) (#1459) 2024-08-13 06:54:12 -07:00
Alexander Richardson
f6f547fb62 arch-arm: Fix incorrect behaviour of VFNMS and VFNMA (#1420)
This was found while comparing a diverging execution against QEMU traces
and checking for the first mismatched program counter. Fortunately this
was caused by a branch shortly after this incorrect computation but still
took a long time to track down.

There are two issues here: the decoder had inverted the cases for *S and
*A, and the sign bit was wrong for VFN*.
2024-08-13 09:05:52 +01:00
Matthew Poremba
c359b53a19 arch-vega: Update microscaling format scaling and denorm handling (#1451)
This PR has 3 commits:
- Update scaling methods to scale by multiplication or division when
upcasting or downcasting respectively.
- Preserve the sign when a microscaling conversion results in NaN or
infinity to match hardware.
- Rework rounding to handle cases where conversion results in a denormal
number in the output type so that the value is correct.
2024-08-12 07:00:26 -07:00
Matthew Poremba
7d46c50663 arch-vega: Swizzle multi-dword scratch requests (#1445)
Scratch memory requests that are larger than one dword are using a
different memory layout than global instructions. Rather than being
placed contiguously, each dword is interleaved 64 lanes * 4 bytes away
as described in Section 9.1.5.2. "Swizzled Buffer Addressing" in the
MI300 specification. This was verified by comparing MI300 output (which
uses scratch_ instructions) with MI200 (which uses buffer instructions).
MI300 FashionMNIST bs=1 now matches CPU reference.

This requires several changes to the instruction implementations:
- For stores, data in the GPUDynInst can be swizzled before the data is
written to memory. This is easy to do using a helper method. This is
done in the template<int N> variant of initMemWrite. To use this, x2
stores are changed to use template<int N> rather than loading a U64. The
swizzle function is renamed to swizzleAddr to avoid confusion with
swizzleData.
- For loads, data is unswizzled in completeAcc when writing register
values. This is not as easy to implement as a helper and is thus
implemented for the three load instructions that load more than one
dword.
- Accessing swizzled data requires at least one packet per dword. A new
GPU memory helper is added to create these packets for scratch requests
specifically. This is called in the template<int N> variant of
initMemRead / initMemWrite. Loads and stores of x2 are changed to use
this variant instead of accessing a U64.

The GPUDynInst status vector restrictions are increased to allow for
swizzled x4 accesses. For simplicity this does not currently support
misaligned swizzled accesses and will panic upon seeing such a case.
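
A minimal sketch of the layout difference described above, assuming 64 lanes
and 4-byte dwords as stated in the message. The function names are
hypothetical and do not correspond to gem5's swizzleAddr or swizzleData
helpers; `first_dword_addr` stands for the already-computed address of a
lane's first dword.

```python
LANES = 64        # lanes per wavefront, per the commit message
DWORD_BYTES = 4   # bytes per dword


def contiguous_dword_addrs(first_dword_addr: int, num_dwords: int) -> list:
    """Global-style layout: a lane's dwords sit back to back."""
    return [first_dword_addr + i * DWORD_BYTES for i in range(num_dwords)]


def swizzled_dword_addrs(first_dword_addr: int, num_dwords: int) -> list:
    """Scratch-style layout: each next dword is 64 lanes * 4 bytes away."""
    stride = LANES * DWORD_BYTES
    return [first_dword_addr + i * stride for i in range(num_dwords)]


# An x4 access needs one address (and hence at least one packet) per dword.
print([hex(a) for a in swizzled_dword_addrs(0x1000, 4)])
```

Because the dwords of one lane are no longer adjacent, a single wide packet
cannot cover them, which is why the message calls for at least one packet per
dword.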

Change-Id: Ic686c51e28e0af029a043d5a5b3d4069f2cb94f9
2024-08-12 06:58:48 -07:00
Matthew Poremba
62a2c09d4b arch-vega: Rework rounding for microscaling conversions
The current implementation does not correctly convert subnormal numbers
(numbers that fill the underflow gap around zero in floating-point
arithmetic). This commit reworks the rounding code to get correct
results.

First, the min_exp is set to 0 which allows for numbers to become
subnormal when rounding. Second, the rounding code now uses something
closer to "GRS" rounding (guard, round, sticky), where the bits represent the
first bit removed when rounding to a smaller type, the second bit
removed, and whether any of the other removed bits are one. More details
can be found in the code comments.
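
For illustration only, guard/round/sticky rounding of an integer mantissa can
be sketched as below. The tie-breaking rule (round to nearest, ties to even)
is an assumption, since the message does not state it, and `round_grs` is a
hypothetical helper rather than the function used in gem5.

```python
def round_grs(mantissa: int, drop: int) -> int:
    """Drop the low `drop` bits of `mantissa`, rounding to nearest even."""
    if drop <= 0:
        return mantissa
    kept = mantissa >> drop
    # Guard: the first (most significant) bit removed.
    guard = (mantissa >> (drop - 1)) & 1
    # Round: the second bit removed, if there is one.
    round_bit = (mantissa >> (drop - 2)) & 1 if drop >= 2 else 0
    # Sticky: set if any of the remaining removed bits is one.
    sticky_mask = (1 << (drop - 2)) - 1 if drop >= 2 else 0
    sticky = 1 if (mantissa & sticky_mask) != 0 else 0
    # Above half: round up. Exactly half: round up only if `kept` is odd.
    if guard and (round_bit or sticky or (kept & 1)):
        kept += 1
    return kept
```

The caller still has to handle the case where the increment carries out of
the mantissa width and the exponent must be adjusted.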

Change-Id: Idcd2f1e4383e4012fc3abf73b1f73c847d44f67b
2024-08-10 10:23:07 -07:00
Matthew Poremba
bdba981753 arch-vega: Preserve sign of NaN/Inf for microscaling types
The implementation of microscaling formats uses the Open Compute Project
specification which includes a sign bit for NaN and infinity. This
should be preserved when a conversion results in NaN or infinity.

Change-Id: Id9e99324c6486e256c699016aff301d5f06814d5
2024-08-10 10:23:07 -07:00
Matthew Poremba
c1251f51c1 arch-vega: Introduce two scaling methods for microscaling types
Currently there is only a scale() method which multiplies a microscaling
type by an int8 value. This should only be applied when upcasting to
a larger type after conversion to match hardware. When downcasting to a
smaller type, the scaling method should divide by the int8 value before
conversion.

This commit adds both scaling methods.

Change-Id: Ibafa8caa389cde4df609e536cd53bd2289959420
2024-08-10 10:23:07 -07:00
Robert Hauser
e980780efd arch-riscv: Extend wfi behavior (#1364)
At the moment, a hart does not halt if there are pending interrupts.
However, an implementation can also consider the enable status of the
individual interrupts, i.e., a halted hart would only resume if there
are locally enabled pending interrupts. This commit introduces this
behavior. The wfi behavior is controlled by the new configuration
variable wfi_pending_resume of RiscvISA.

Change-Id: I316239f9732c6e73e6ad692491bca08d773dd995

---------

Signed-off-by: Robert Hauser <robert.hauser@uni-rostock.de>
2024-08-09 11:28:15 -07:00
Marleson Graf
b8001a861b mem-ruby,sim-se: Clear LL/SC locks after functional writes (#1404)
Functional writes atomically update all copies of a data block, so they
should invalidate any pending LL/SC locks, just like a conventional
write would.
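
A toy model of the intended behaviour, assuming a per-address lock table; the
class and method names are hypothetical and unrelated to the Ruby classes
touched by this change.

```python
class ToyMemory:
    """Tiny LL/SC model: any write to an address clears its pending locks."""

    def __init__(self):
        self.data = {}
        self.locks = {}  # address -> set of CPU ids holding a link

    def load_linked(self, cpu: int, addr: int) -> int:
        self.locks.setdefault(addr, set()).add(cpu)
        return self.data.get(addr, 0)

    def store_conditional(self, cpu: int, addr: int, value: int) -> bool:
        if cpu not in self.locks.get(addr, set()):
            return False  # link was lost, so the store fails
        self._write(addr, value)
        return True

    def functional_write(self, addr: int, value: int) -> None:
        # Treated like a conventional write: it also invalidates the locks.
        self._write(addr, value)

    def _write(self, addr: int, value: int) -> None:
        self.data[addr] = value
        self.locks.pop(addr, None)
```

In this model, a store-conditional that follows a functional write to the
same address fails instead of silently overwriting the new data.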

Change-Id: Ic79d2d8d24901f1b6a2ce81dc0e2decc84c0ebbc
2024-08-09 09:30:37 -07:00
Bobby R. Bruce
b1a44b89c7 misc: v24.0.0.1 Hotfix release (#1425) 2024-08-08 14:15:18 -07:00
Bobby R. Bruce
8593f69f0a util: Fix MongoDB script requirements.txt (#1426)
Dependency Bot appears to have had difficulty with this file:
https://github.com/gem5/gem5/security/dependabot/29

This PR:

1. Removes the weird "```" which could not be parsed.
2. Ups PyMongo to a more secure version.
2024-08-08 13:01:29 -07:00
MMysore2
33e3bc4ff1 Updating Traffic Generators (#1416)
Added documentation for `strided_generator.py` and
`strided_generator_core.py`.

Updated clarity of documentation for `linear_generator.py`,
`linear_generator_core.py`, `random_generator.py`, and
`random_generator_core.py`.

Made `max_addr` exclusive instead of inclusive for strided and linear
traffic generation in `strided_gen.cc` and `linear_gen.cc`.
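
A rough sketch of what an exclusive `max_addr` means for a linear generator.
The wrap-around rule used here (return to `min_addr` once the next block
would no longer fit below `max_addr`) is an assumption for illustration and
may differ from the logic in `linear_gen.cc`; `next_linear_addr` is a
hypothetical helper.

```python
def next_linear_addr(addr: int, min_addr: int, max_addr: int,
                     block_size: int) -> int:
    """Return the next block address, treating `max_addr` as exclusive."""
    nxt = addr + block_size
    # The block [nxt, nxt + block_size) must end at or before max_addr.
    if nxt + block_size > max_addr:
        return min_addr
    return nxt


addr = 0
seen = []
for _ in range(5):
    seen.append(addr)
    addr = next_linear_addr(addr, 0, 256, 64)
print(seen)
```

With `min_addr = 0`, `max_addr = 256`, and 64-byte blocks, address 256 itself
is never generated under this rule.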
2024-08-08 12:46:10 -07:00
Matthew Poremba
85c48a36ec dev-amdgpu: Fix issues found by address sanitizer (#1430)
These commits primarily fix the SDMA engine which was (1) using pointer
arithmetic on a variable returned by new and then attempting to free the
modified pointer and (2) using a buffer after it was freed due to the
DMA device calling completion event before Ruby actually completed.

Some minor fixes are included: stop using an uninitialized value as packet
context and stop using the same request pointer for two separate packets for GPU
invalidations.
2024-08-08 11:14:50 -07:00
Ivana Mitrovic
ba0c3cc29a misc: Update GitHub badge links (#1428)
Change-Id: Iaead9f6146a90c9b2a671b9b78a318869ca739e6
2024-08-08 08:44:26 -07:00
Yangyu Chen
ce07203c5f arch-riscv: use sign-extend for all address generation (#1316)
In gem5, we use the same code base for RISC-V 32 and 64.

However, if we want to allow modifiable XLEN control via CSR.mstatus in
the future, we should follow the RISC-V ISA manual and sign-extend all
register results, including the PC and GPRs. Once this feature is implemented,
the simulator will need to handle user mode in RV32 while CSR.SATP is set to
Sv39. In this case, 0x80000000 and 0xffffffff80000000 are different
addresses from the 64-bit S-mode perspective, but the same address from the
32-bit U-mode perspective. We should rule out this incorrect behavior
before implementing the feature.

Thus, we need to sign-extend all addresses, including
the PC and memory addresses, which are currently zero-extended. As
specified in the RISC-V ISA manual, we zero-extend in narrow XLEN
mode for the physical address implemented in the TLB.

Changes based on spec:
1. Sign-extend narrow XLEN:
https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-b7a445a-2024-07-02/src/machine.adoc?plain=1#L567
2. Zero-extend physical address:
https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-b7a445a-2024-07-02/src/supervisor.adoc?plain=1#L1670

Signed-off-by: Yangyu Chen <cyy@cyyself.name>
2024-08-08 08:41:35 -07:00
Matt Sinclair
86f7fae86b gpu-compute: fix GPU TLB outstandingReqs vs. associativity (#1431)
The GPU TLB maxOutstandingReqs field gets limited by the associativity.
In the current setup, this means that the max outstanding requests is 32
even though the setup is for 64 entries. Update the associativity to be
64 entries.

Change-Id: I2104e4647d97bf4d1cf5ac447e38ad6ac6a1a0d8
2024-08-07 21:37:36 -05:00
Matthew Poremba
84fedecafe gpu-compute: Update Requests for invalidations
The SQC and TCC invalidations share a Request pointer which they both
modify. This can cause some problems, so use a different request pointer
for each invalidate. The setContext call is also removed as the value
being assigned to it is uninitialized.

Change-Id: I82ea7aa44a4f4515c1560993caa26cc6a89355af
2024-08-07 14:37:49 -07:00
Matthew Poremba
db0d5f19cf dev-amdgpu: Add cleanup events for SDMA
SDMA packets which use dmaVirtWrites call their completion event before
the write takes place in the Ruby protocol. This causes a use-after-free
issue that corrupts random memory locations, leading to random errors. This
commit adds a cleanup event for each packet that uses DMA and sets the
cleanup latency as 10000 ticks. In atomic mode, the writes complete
exactly 2000 ticks after the completion event is called and therefore a
fixed latency can be used. This is not tested with timing mode, which
does not work with GPUFS at the moment, so a warning is added to give an
idea where to look in case the same issue occurs once timing mode is
supported.

Change-Id: I9ee2689f2becc46bb7794b18b31205f1606109d8
2024-08-07 14:37:49 -07:00
Matt Sinclair
03ddd0b75f gpu-compute: fix GPU TLB outstandingReqs vs. associativity
The GPU TLB maxOutstandingReqs field gets limited by the associativity.
In the current setup, this means that the max outstanding requests is
32 even though the setup is for 64 entries.  Update the associativity
to all 64 entries.

Change-Id: I2104e4647d97bf4d1cf5ac447e38ad6ac6a1a0d8
2024-08-07 16:16:01 -05:00
Matthew Poremba
0d0b68266c dev-amdgpu: Fix bad free in SDMA
The SDMA engine copies data in chunks. It currently uses the pointer
returned from new[] and manipulates it using pointer arithmetic. This
modified pointer is then passed to the completion function which deletes
the pointer. Since it is not the original pointer allocated by new[]
this triggers issues in ASAN.

Change-Id: I03ccf026633285e75005509445c62fcbda8eb978
2024-08-07 12:54:45 -07:00
Saili Karkare
bd228af5cf Updating hex addr printing (#1385)
This change updates the addresses that are printed when TrafficGen
DebugFlag is enabled. Previously, hex strings were printed without a
preceding 0x. This change fixes that to distinguish between decimal and
hex.
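
The difference can be illustrated with Python's format specifiers. The actual
change is in the simulator's debug prints, so the two helper names below are
purely hypothetical.

```python
def format_addr_old(addr: int) -> str:
    """Hex digits only; '1000' is ambiguous with decimal 1000."""
    return f"{addr:x}"


def format_addr_new(addr: int) -> str:
    """Hex digits with a leading 0x, so the base is explicit."""
    return f"{addr:#x}"


print(format_addr_old(4096), format_addr_new(4096))
```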
2024-08-07 02:31:21 -07:00
Bobby R. Bruce
811e8c0fb4 util-docker,tests: Up clang support: >=v10 (#1415)
The compiler tests are failing due to a compile bug in Clang 7:
https://github.com/gem5/gem5/actions/runs/10170081794

Ubuntu 20.04, the oldest LTS Ubuntu version, installs v10 by default
through APT (i.e., with `apt install clang`). It therefore seems
sensible to drop support for older (<v10) versions of clang.
2024-08-07 00:38:08 -07:00
Bobby R. Bruce
eabb625870 util-docker,tests: Up clang support: >=v10
The compiler tests are failing due to a compile bug in Clang 7:
https://github.com/gem5/gem5/actions/runs/10170081794

Ubuntu 20.04, the oldest LTS Ubuntu version, installs v10 by default
through APT (i.e., with `apt install clang`). It therefore seems
sensible to drop support for older (<v10) versions of clang.

Change-Id: I4c48223b80306422beac1464c09f03397c156ba1
2024-08-07 00:35:34 -07:00
Bobby R. Bruce
69ca952724 misc: Release v24.0.0.1 hotfix
- Updates release notes.
- Increases the version from v24.0.0.0 to v24.0.0.1.

Change-Id: I2f3f9eed06ecf74a5c6d86bb4dab25f1ff23b10d
2024-08-06 21:23:33 -07:00
Erin Le
2955d89ed3 mem: Add constexprs to spatio_temporal_memory_streaming.cc
Change-Id: I6fa3d9f9a9d89d59d9ec1fc97c152bea3059f87d
2024-08-06 21:09:20 -07:00
Erin Le
be6fadca52 mem: remove stray comment from signature_path_v2.cc
Change-Id: I5ddd2ddd6a9cb4fb032b48870c5ef6b0dc9533c0
2024-08-06 21:09:13 -07:00
Erin Le
8e80ede3f1 mem: Comment removal and adding constexpr to is_secure bools
This commit removes some comments and adds constexpr in front
of "bool is_secure..." in pif.cc, signature_path.cc, and
signature_path_v2.cc

Change-Id: Icafe1d7c97d1d3fbf6abc12ba87ebb596255b96f
2024-08-06 21:09:04 -07:00
Erin Le
6c4621c665 mem: use is_secure instead of hardcoded false in prefetcher crash
This modifies the crash fix so that the function calls that were
modified use a local variable called `is_secure` instead of a
hardcoded `false`. Some of these existed previously so it made
more sense to use them, while others were newly added in to mark
where the code might need to be changed later.

Change-Id: I0c0d14b74f0ccf70ee5fe7c8b01ed0266353b3c1
2024-08-06 21:08:57 -07:00
Erin Le
dc2162dbff mem: Fix "Need is_secure arg" prefetcher crash
This commit fixes the "Need is_secure arg" crash that occurs when
using the IndirectMemoryPrefetcher, SignaturePathPrefetcher,
SignaturePathPrefetcherV2, STeMSPrefetcher, and PIFPrefetcher. This
was done by changing some variables to be AssociativeSet<...>
instead of AssociativeCache<...> and changing the affected function
calls.

Change-Id: I61808c877514efeb73ad041de273ae386711acae
2024-08-06 21:08:45 -07:00
Bobby R. Bruce
bbc49aa914 misc: Stable merge to dev (#1424) 2024-08-06 21:02:35 -07:00
Bobby R. Bruce
8885d60399 misc: Merge branch stable branch into develop
Change-Id: Ie391ea7eeb86a6e862e910e7d150edde0059cc54
2024-08-06 21:02:06 -07:00
Bobby R. Bruce
bb290aaff5 misc: Change devcontainer for isca tutorial and bootcamp (#1282) 2024-08-06 19:45:48 -07:00
Robert Hauser
ba704a01b2 misc: Fix typo in multisim code snippet (#1417) 2024-08-06 13:54:16 -07:00
Bobby R. Bruce
bd53bad5cf mem: Fix "Need is_secure arg" prefetcher crash (#1374)
This PR fixes the "Need is_secure arg" crash that occurs when using
IndirectMemoryPrefetcher, SignaturePathPrefetcher,
SignaturePathPrefetcherV2, STeMSPrefetcher, and PIFPrefetcher. This was
done by changing some variables to have the type AssociativeSet<...>
instead of AssociativeCache<...> and adding in "false" or an existing
value for the value of the secure bit in some function calls. Further
changes may be needed to move away from hard-coding values.
2024-08-06 13:01:40 -07:00
Erin Le
6dbe2bca7b mem: Add constexprs to spatio_temporal_memory_streaming.cc
Change-Id: I6fa3d9f9a9d89d59d9ec1fc97c152bea3059f87d
2024-08-06 00:06:38 +00:00
Erin Le
f325949ba5 mem: remove stray comment from signature_path_v2.cc
Change-Id: I5ddd2ddd6a9cb4fb032b48870c5ef6b0dc9533c0
2024-08-05 23:10:10 +00:00
Erin Le
2db021b27b mem: Comment removal and adding constexpr to is_secure bools
This commit removes some comments and adds constexpr in front
of "bool is_secure..." in pif.cc, signature_path.cc, and
signature_path_v2.cc

Change-Id: Icafe1d7c97d1d3fbf6abc12ba87ebb596255b96f
2024-08-05 15:43:40 -07:00
Erin Le
9adf44ed1f mem: use is_secure instead of hardcoded false in prefetcher crash
This modifies the crash fix so that the function calls that were
modified use a local variable called `is_secure` instead of a
hardcoded `false`. Some of these existed previously so it made
more sense to use them, while others were newly added in to mark
where the code might need to be changed later.

Change-Id: I0c0d14b74f0ccf70ee5fe7c8b01ed0266353b3c1
2024-08-05 15:43:40 -07:00
Erin Le
b0756bedba mem: Fix "Need is_secure arg" prefetcher crash
This commit fixes the "Need is_secure arg" crash that occurs when
using the IndirectMemoryPrefetcher, SignaturePathPrefetcher,
SignaturePathPrefetcherV2, STeMSPrefetcher, and PIFPrefetcher. This
was done by changing some variables to be AssociativeSet<...>
instead of AssociativeCache<...> and changing the affected function
calls.

Change-Id: I61808c877514efeb73ad041de273ae386711acae
2024-08-05 15:43:40 -07:00
Yu-Cheng Chang
5df08fdb08 arch-riscv: Move pmpReset implementation to MMU::reset() (#1406)
The PMP is part of the RISC-V MMU subsystem, so its reset should be done in
RiscvISA::MMU::reset()
2024-08-05 14:21:48 -07:00
Matt Sinclair
edd73bd330 gpu-compute: fix typo in GPUMem debug print (#1412)
The GPUMem print for when a memstatus request completes accidentally put
a newline before the word "complete", causing "complete" to be printed on a
new line and leading to confusion. This commit resolves that.
2024-08-05 12:44:13 -07:00
Matt Sinclair
ba455e2025 gpu-compute: update GPUKernelInfo print to print WG number (#1413)
Whenever a GPU kernel is launching a new WG, the GPUKernelInfo debug
flag will print that the kernel is being launched, without the context
of which WG from that kernel is being launched. This has caused some
confusion to users, who think the entire kernel is being launched
repeatedly. To resolve this confusion, update this print to make it
clear which WG is being launched when this print is enabled.
2024-08-05 12:43:41 -07:00
dependabot[bot]
7b1948c18c misc: bump mypy from 1.10.1 to 1.11.1 (#1407)
Bumps [mypy](https://github.com/python/mypy) from 1.10.1 to 1.11.1.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ivana Mitrovic <imitrovic@ucdavis.edu>
2024-08-05 12:18:38 -07:00
Matt Sinclair
a4cb466457 misc: update GPU maintainers (#1411)
Add Matt Sinclair to GPU maintainers for various GPU-related maintainer
labels.

Change-Id: Iaadeaff1318ff24664be9ff0bb14f9ce4da93086
2024-08-05 09:30:27 -07:00
Giacomo Travaglini
d2c8754ab3 mem: Fix name() helper for DRAM rank (#1410)
At the moment the method simply returns the rank number. This is not
particularly useful when enabling debug flags as the beginning of the
line prints something like:

1: <debug_message>

whereas it should really be:

system.dram.rank1: <debug_message>

Change-Id: I0136dc3d182afa4ae2e5a719cb366d8d0f444667

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-03 22:49:59 +01:00
Bobby R. Bruce
ace61f4022 util-docker: Update devcontainer for bootcamp24
Change-Id: Ia19840a3858f2f39ef9d9bdc60d0ba6b9231948e
2024-08-02 09:21:25 -07:00
Jason Lowe-Power
5961b0ba76 misc: Change devcontainer for isca tutorial
Change-Id: I4e946bf32b5ad362ff34de5d354fe29cd26fc0cb
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-08-02 09:20:53 -07:00
dependabot[bot]
217def7bf9 misc: bump pre-commit from 3.7.1 to 3.8.0 (#1408)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.7.1
to 3.8.0.


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-02 07:25:30 -07:00
Alexander Richardson
267817eaa1 arch-riscv: Fix implicit int-to-float conversion in .isa files (#1319)
Explicitly convert to float/double to fix compiler warnings that I have
turned on locally.
2024-07-31 04:24:54 -07:00
Erin (Jianghua) Le
2bfafa726f sim: Add error message for kernel exceeding memory size (#1329)
This commit adds an error message to src/sim/kernel_workload.cc to tell
the user when the end address of the kernel is greater than the size of
memory. The error message also specifies the minimum memory size needed
to fit the kernel.

Change-Id: I7d8f50889ed8172f64b84f98301a35e5f2f352d3
2024-07-30 19:39:41 -07:00
Matthew Poremba
7d4febcce3 misc: Remove GCN3 from maintainers (#1395)
These files no longer exist. This was missed when GCN3 support was
removed in the last release.

Change-Id: I5d2aab1952c37e64da5c362a6201aa6750531c1b
2024-07-30 10:08:08 -07:00
Yu-Cheng Chang
c13f895af0 arch,cpu: Implement generic reset method for MMU (#1342)
Implementing a generic reset method for the MMU allows each ISA to implement
its own reset behavior. The default MMU reset method flushes all TLB
entries. For example, RISC-V needs to reset the PMP when it receives the
reset signal, but its TLBs do not need to be flushed.

Change-Id: I158261570fb6e5216ec105fbdc53460f83f88d15
2024-07-30 09:47:55 +01:00
Alexander Richardson
b64aa0b9b3 arch: Dump semihosting write buffer in debug output (#1389)
This makes it easier to debug unexpected semihosting outputs (in my case
a wrong buffer argument was being passed).

Change-Id: I342610a92fb8efe121d030f7b9ea3307efc4fec3
2024-07-30 09:39:05 +01:00
Matthew Poremba
ddc9a18536 configs: GPUFS: Disable KVM perf counters by default (#1391)
This is on by default in gem5 (see src/cpu/kvm/BaseKvmCPU.py), however
the perf counters only measure host instruction counters and GPUFS is
not concerned about accuracy of KVM CPU stats. There is also a large
set of users who have access to KVM but do not have the paranoid level
set low enough to attach performance counters.

Therefore, make the performance counters OFF by default. They can still
be enabled, but this will allow for a larger set of users to follow the
upcoming GPUFS documentation without needing to read through a
troubleshooting section after seeing a gem5 error about the KVM paranoid
level.

Change-Id: I6b465559edf3ce17e7117ada049c60bd39aecd83
2024-07-29 12:26:10 -07:00
Alexander Richardson
b23a4c7806 arch-arm: Add support for AArch32 PMEVCNTR*/PMEVTYPER*/PMCCFILTR (#1388)
These registers were only handled in AArch64 mode but are also
accessible as c14 registers in AArch32.

Change-Id: I62fe54427e96265df0589308afa1b5d665dbf210
2024-07-29 18:22:00 +01:00
Alexander Richardson
b51927e7a8 arch-arm: return 64-bit cycle counter for MISCREG_PMCCNTR (#1390)
In AArch32 mode it is possible to read a 64-bit counter using mrrc.
Instead of truncating in the PMU code, just allow the instruction
implementation to truncate to 32 bits if accessed using mrc.
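
The split described above can be sketched as follows. This is an illustrative Python model, not gem5 code; the function names are hypothetical. The counter is kept at 64 bits, and only the 32-bit access path truncates:

```python
MASK32 = 0xFFFFFFFF


def read_via_mrc(counter: int) -> int:
    """32-bit access: the instruction keeps only the low 32 bits."""
    return counter & MASK32


def read_via_mrrc(counter: int) -> tuple[int, int]:
    """64-bit access: returns the (low, high) 32-bit halves of the counter."""
    return counter & MASK32, (counter >> 32) & MASK32
```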

Change-Id: I77620f6d1852a7d9e79c1ecee50f4297b4103b1c
2024-07-29 16:57:48 +01:00
Harshil Patel
679000f91d tests: remove dependent job (#1386) 2024-07-26 16:36:35 -07:00
Bobby R. Bruce
b11718536e misc,tests: Rm gem5 binary pre-build from dailys (#1383) 2024-07-26 09:35:08 -07:00
Matthew Poremba
37ca94450a arch-vega: Improve SDWA, SDWAB, and DPP (#1378)
This PR has four components:

- Implement a helper method for SDWAB similar to SDWA helpers. SDWAB is
used for VOPC instructions only (vector compares).
- Update two instructions commonly using SDWAB to use helper
(v_cmp_ne_u16 and v_cmp_eq_u16).
- Add panics to *all* VOP1 and VOP2 instructions which do not implement
SDWA or DPP if they use an SDWA or DPP register.
- Add panics to *all* VOPC instructions which do not implement SDWAB or
DPP if they use an SDWA or DPP register.

Only VOP1, VOP2, and VOPC may use SDWA, SDWAB, or DPP. The panics should
therefore cover all instructions which have missing implementations for
these modes. The intent is to exit gem5 instead of continuing simulation
with data that is likely incorrect. Continuing simulation only makes
debugging gem5 more difficult.
2024-07-26 07:12:10 -07:00
Ivana Mitrovic
b3b289ae81 Revert daily test changes (#1382) 2024-07-25 23:09:16 -07:00
Matthew Poremba
21f6e166b7 arch-vega: Panic on SDWAB / DPP VOPC unimplemented
If SDWAB or DPP are used on a VOPC instruction and those are not
implemented, it is highly likely to be a problem for the application.
Rather than continue to execute and cause undefined behavior, exit the
simulation with a panic showing the line of the instruction causing the
issue.

Change-Id: Ib3f94df7445d068b26907470c1f733be16cd2fc2
2024-07-25 16:18:14 -07:00
Matthew Poremba
b75fe56da5 arch-vega: Panic unimplemented SDWA/DPP for VOP1/VOP2
Add a panic if SDWA or DPP is used for an instruction which does not
implement support for it. If an application uses SDWA or DPP it likely
does not operate in the same way as the base instruction and therefore
gem5 should panic rather than continue. It is likely data is incorrect
which will make it more difficult to debug an application.

Change-Id: I68ac448b0d62941761ef4efa0169f95796270f48
2024-07-25 16:18:14 -07:00
Matthew Poremba
6558821e2d arch-vega: Add SDWAB for v_cmp_{eq,ne}_u16
This shows an example of how to use the previous commit which adds an
SDWAB helper. The execute() method of both are the same with the
exception of the lambda function passed to the helper method.

Change-Id: I5ffe361440b4020b9f7669c0ed946aa6b3bbec25
2024-07-25 16:18:14 -07:00
Matthew Poremba
69338703e7 arch-vega: Implement SDWAB helper
Implement a SDWAB helper which accepts a dynamic instruction and a
lambda function defining a comparison function taking two values and
returning a comparison result of 0 or 1 for false or true.
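
A rough sketch of the helper idea, in Python rather than gem5's C++. All names here are hypothetical, and the model ignores sign extension and the other SDWA modifiers; it only shows how one helper can serve several compare instructions by taking the comparison as a lambda:

```python
# Hypothetical sub-dword selectors: (shift, mask) applied to a 32-bit lane value.
SEL = {
    "BYTE_0": (0, 0xFF), "BYTE_1": (8, 0xFF),
    "BYTE_2": (16, 0xFF), "BYTE_3": (24, 0xFF),
    "WORD_0": (0, 0xFFFF), "WORD_1": (16, 0xFFFF),
    "DWORD": (0, 0xFFFFFFFF),
}


def select(value, sel):
    """Extract the selected byte/word/dword from a 32-bit value."""
    shift, mask = SEL[sel]
    return (value >> shift) & mask


def sdwab_compare(src0, src1, src0_sel, src1_sel, exec_mask, cmp):
    """Apply cmp to the selected parts of each active lane; return a lane bitmask."""
    result = 0
    for lane, (a, b) in enumerate(zip(src0, src1)):
        active = (exec_mask >> lane) & 1
        if active and cmp(select(a, src0_sel), select(b, src1_sel)):
            result |= 1 << lane
    return result
```

With this shape, an "equal" and a "not equal" compare differ only in the lambda passed in, e.g. `lambda a, b: a == b` versus `lambda a, b: a != b`.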

Current instructions which implement SDWA do so on a per-instruction
basis which adds a lot of redundant code. This allows for generic SDWAB
implementations for VOPC instructions.

All modifiers are implemented assuming that SDWAB VOPC instruction
comparison types may be U32, I32, F32, U16, I16, F16 (which exist), but
the helper is extensible to I8, U8, or F8.

Change-Id: Idab58a327c29dd19a1a5457237f3799a04f2031b
2024-07-25 16:18:13 -07:00
Harshil Patel
99aa8307b6 misc,tests: Revert "Attempt fix daily downloads (#1369)"
This reverts commit 97f6f3c4da.

Change-Id: I406c0c3d5429266da5ca037999247e21f1859ce5
2024-07-25 11:03:04 -07:00
Harshil Patel
bb2fd84111 misc,tests: Revert "Second attempt at fixing Daily test (#1373)"
This reverts commit e7d1c90aeb.

Change-Id: I5aed0c59d55c20b1774abcaa6396f6dcad11699b
2024-07-25 11:02:57 -07:00
Harshil Patel
22ebbd17dc misc,test: Revert "Third attempt at fixing Daily test"
This reverts commit 7722f84d1e.
2024-07-25 10:51:29 -07:00
Matthew Poremba
a7bc4ca19a arch-vega: Fix unconditional clamps in VOP3 (#1379)
Some instructions are clamping floating point outputs unconditionally,
leading to incorrect results. This commit finds instructions with this
issue and checks the clamp bit before applying clamp.
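
The fix amounts to gating the clamp on the instruction's clamp bit. A minimal Python sketch, assuming the floating-point clamp range is [0.0, 1.0] and using a hypothetical function name:

```python
def vop3_float_result(value: float, clamp_bit: bool) -> float:
    """Clamp the result only when the instruction's clamp bit is set.

    Assumption: the floating-point clamp range is [0.0, 1.0].
    """
    if clamp_bit:
        return min(max(value, 0.0), 1.0)
    return value
```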

Change-Id: Ibc6de3813d81fd4f9d2c98dd497d19dd34cf6bde
2024-07-25 08:06:00 -07:00
Bobby R. Bruce
7722f84d1e misc,tests: Third attempt at fixing Daily test (#1375) 2024-07-24 05:41:22 -07:00
Matthew Poremba
7dae1a1d25 arch-vega: Multiple SOPC fixes (#1366)
Make S_CMP_LT_U32 use < instead of <=. Change types of EQ / LG for U64
to be U64.

Change-Id: Ib0b3b7a46ba1aff16a6d439302ca087d988d6417
2024-07-23 12:45:52 -07:00
Bobby R. Bruce
e7d1c90aeb misc,tests: Second attempt at fixing Daily test (#1373)
This fix works by only downloading the gem5 binaries needed for each
test, instead of overwhelming the downloader by fetching them all at
once.
2024-07-23 10:21:53 -07:00
Bobby R. Bruce
97f6f3c4da misc,tests: Attempt fix daily downloads (#1369)
This change attempts to infer which tests to download per job in the
matrix, thereby significantly reducing the download times for each job.

Change-Id: I61b4f4b6410aa86de7437caf213499d805861e0c
2024-07-22 11:44:39 -07:00
Ivana Mitrovic
82c91e8edb arch-riscv: Improve widening/narrowing vectors overlap check (#1331)
This PR improves the vector register groups overlap check in
widening/narrowing instructions.

- Fix wrong illegal overlap condition between VS2 and VD vector register
groups.
- Also check VS1 vector register group for overlap with VD in
vector-vector instructions.
- Parametrize widening/narrowing factors in overlap check function to
potentially handle more cases.

Fixes issue #442.
2024-07-22 10:54:02 -07:00
Erin (Jianghua) Le
b6f8ecb1be python: move cache coherence protocol check above imports (#1360)
This commit moves the requires() call that checks the cache coherence
protocol above the imports. This change was made for the chi private l1,
ruby mesi three level, mesi two level, and mi example cache hierarchies.
This ensures that a clear error message about having the wrong coherence
protocol is printed, rather than a less useful message.

Change-Id: I3bac1ffcb1f8a9d94e486237f880cf248e442ba8
2024-07-22 09:34:04 -07:00
Bobby R. Bruce
b88f814e63 tests,misc: Sync .github dir develop -> stable (#1361) 2024-07-18 12:11:42 -07:00
Alexander Richardson
fc59109429 arch,arch-arm: Fix remaining implicit float conversion warnings in .isa (#1327)
This fixes the remaining implicit int/float conversions and enables the
float conversion warnings for clang when building the Arm instruction
execution logic. This depends on the previous fixes.

Change-Id: I51aac94644a483175842c36da2d49d308aaceb49
2024-07-18 10:43:12 -07:00
Erin (Jianghua) Le
aaa6566548 mem: Change long in src/mem/physical.cc to int64_t (#1275)
This changes `long`s in src/mem/physical.cc, which are 32 bits or more,
to `uint64_t`s, which are exactly 64 bits.

Change-Id: I64e089a2ac087bcf58b9c3c918c59dc5ff75d010
2024-07-18 10:12:24 -07:00
Robert Hauser
9b8c84cb5d arch-riscv: Overwrite getEMI() for timing expr (#1346)
TimingExpression enables runtime calculation of the commit latency in
MinorCPU. For this, machInst is obtained by getEMI() to match it with a
given instruction. By default, getEMI() always returns 0, so it is
overridden here to enable timing expressions for RISC-V. This was
already done for ARM (see src/arch/arm/insts/static_inst.hh).

Change-Id: I03d669b3439fd24e00cbf893f5db9951dfe56b1f

Signed-off-by: Robert Hauser <robert.hauser@uni-rostock.de>
2024-07-12 20:52:24 -07:00
Robert Hauser
5e5e8fb9c6 arch-riscv: Update local interrupts citation (#1347)
Updated the bib information of the local RISC-V interrupts.

Change-Id: I666c3df4529e159bd1946ca1a9623e47f84d5d9e

Signed-off-by: Robert Hauser <robert.hauser@uni-rostock.de>
2024-07-12 20:51:49 -07:00
Tommaso Marinelli
e3b41291da arch-riscv: Check VS1 group for overlap when widening/narrowing
Currently, only the VS2 register group is checked for overlap with VD
when executing a widening/narrowing instruction. This commit extends
the check to VS1, when applicable (i.e. vector-vector operations).

Change-Id: I892b7717c01e25546fb41e05afbd08fc40c60c59
2024-07-12 01:17:14 +00:00
Tommaso Marinelli
a8b7e9727d arch-riscv: Generalize widening/narrowing vectors overlap check
As of now, the widening/narrowing vector register groups overlap check
always assumes a SEW multiplication factor equal to 2 (for either VD or
VS2). This commit aims at making this check more generic.

Change-Id: I4311fc3624cd324ccfdf2a1920a19efc85357120
2024-07-12 01:17:14 +00:00
Tommaso Marinelli
5b693fd8b6 arch-riscv: Remove duplicate line
Change-Id: I32200aad5a59c9fd85f6ed783a4cebb841bf6ff1
2024-07-12 01:17:14 +00:00
Tommaso Marinelli
fbe6985365 arch-riscv: Fix widening instructions vectors overlap check
This commit fixes the overlap check between VS2 and VD register groups
in vector widening instructions. While the narrowing instructions check
is correct, the widening one has to differentiate between two cases
(Vs2 EEW = 2*SEW and Vs2 EEW = SEW). In the first case, overlap is
allowed, as the EEW is the same as Vd. In the second case, the overlap
legality check has to be adapted to use the Vs2 EMUL to calculate the
boundaries. The rule has been derived again from Section 5.2 of RISC-V
"V" Vector Extension specifications, version 1.0.

The patch also includes some small code refactoring, e.g. using
already defined vlmul and constants for vector operands.
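
For reference, the overlap rule from Section 5.2 can be sketched as below. This is an illustrative Python model, not the gem5 implementation; the function names are hypothetical, register groups are assumed to be aligned to their EMUL, and the rule as summarized in the comments is worth checking against the specification text:

```python
def reg_group(base: int, emul: float) -> range:
    """Registers occupied by a group; a fractional EMUL still uses one register."""
    return range(base, base + max(1, int(emul)))


def overlap_is_legal(vd: int, vd_eew: int, vd_emul: float,
                     vs: int, vs_eew: int, vs_emul: float) -> bool:
    dst, src = reg_group(vd, vd_emul), reg_group(vs, vs_emul)
    if dst.stop <= src.start or src.stop <= dst.start:
        return True  # the groups do not overlap at all
    if vd_eew == vs_eew:
        return True  # same EEW: overlap is allowed
    if vd_eew < vs_eew:
        # Narrowing: the overlap must be in the lowest-numbered part of
        # the source register group.
        return dst.start == src.start
    # Widening: the source EMUL must be at least 1 and the overlap must be
    # in the highest-numbered part of the destination register group.
    return vs_emul >= 1 and src.stop == dst.stop
```

For example, with LMUL=8 a `vzext.vf4 v0, v6` has destination group v0-v7 and source group v6-v7, which this sketch accepts, while a source of v4 is rejected.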

Fixes issue #442.

Change-Id: Ic87095fb9079e6c8f53b9a0d79fbf531a85dc71d
2024-07-12 01:17:14 +00:00
Ivana Mitrovic
ebfb8999cb util: Update gem5-resources-manager (#1343)
Bumps [zipp](https://github.com/jaraco/zipp) from 3.15.0 to 3.19.1.

Bumps [certifi](https://github.com/certifi/python-certifi) from
2023.7.22 to 2024.7.4.

Change-Id: I457b952d86412776d9be9b8bce0b1b2d2550f3a6
2024-07-11 07:19:11 -07:00
Saúl
8dde32d2dc arch-riscv: fix initialization for some vector reduction insts (#1340)
Vector reduce float (widening and non-widening) and integer (widening)
instructions initialize the reduce loop operation with the first element
of the destination register (i.e. `Vd[0]`).

Since all reductions per spec seem to be `Vd[0] = Vs1[0] + Vs2[*]`
(where `+` is an arbitrary binary op and `*` indicates all active
elements) gem5 will calculate this incorrectly if `Vd[0]` and/or
`Vs1[0]` are non-neutral for the operation (the latter case being because
it's not taken into account at all).

To solve this we just have to initialize the reduction loop to `Vs1[0]`
(the non-widening integer reduction already does this).
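
The corrected behaviour can be sketched in Python as follows. This is an illustrative model only (the function name is hypothetical); it seeds the reduction with `Vs1[0]` and folds in the active elements of `Vs2`:

```python
from functools import reduce
import operator


def vector_reduce(op, vs1, vs2, mask):
    """Compute Vd[0] = Vs1[0] op Vs2[*] over the active (unmasked) elements."""
    active = (elem for elem, bit in zip(vs2, mask) if bit)
    # Seeding with vs1[0] (not the old Vd[0]) is the point of the fix.
    return reduce(op, active, vs1[0])
```

Seeding with the old `Vd[0]` instead would only give the same answer when both `Vd[0]` and `Vs1[0]` happen to be the neutral element of the operation.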
2024-07-10 22:08:49 -07:00
Yangyu Chen
2b902b0aec arch-riscv: add rv32 option to FS Linux config file (#1312)
Since RISC-V 32 is now supported, add this option so that a RISC-V 32
full system can be run easily.

Signed-off-by: Yangyu Chen <cyy@cyyself.name>
2024-07-10 11:41:48 -07:00
Yu-Cheng Chang
ce8db85867 cpu: Add cpuIdlePins to indicate the threadContext of CPU is idle (#1285)
If a threadContext of the CPU enters suspend mode, the cpu_idle_pins pin
for that thread ID is raised, sending a high signal to the target. If
the threadContext is activated, the pin for that thread ID is lowered,
sending a low signal to the target.
2024-07-10 10:36:37 +01:00
Yu-Cheng Chang
d54dcac393 arch-riscv: Fix setRegs from GDB failed after #1099 (#1291)
gem5 crashed when a user tried to update a register value from GDB
because PR [1] changed the index of CSR_XSTATUS to MISCREG_XSTATUS, which
is outside NUM_PHYS_MISCREGS.

CSR_XSTATUS should be updated with setRegWithMask instead.

[1] : https://github.com/gem5/gem5/pull/1099

gem5 issue: https://github.com/gem5/gem5/issues/1299

Change-Id: Iefc0d1f5adfb98ecfda0e74907964b47d1864b6d
2024-07-09 15:55:35 -07:00
Jason Lowe-Power
d20512c291 arch-riscv: add agnostic option to vector tail/mask policy for mem and arith instructions (#1135)
These two commits add agnostic capability for both tail/mask policies,
for vector memory and arithmetic instructions respectively. The common
policy for instructions is to act as undisturbed if either policy (tail
or mask) is undisturbed, or to write all 1s if neither is.
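
One reading of that common policy, sketched in Python for illustration. The function name and the 8-bit element width are assumptions, and this is not the gem5 implementation:

```python
ALL_ONES = 0xFF  # assumes 8-bit elements for this sketch


def write_back(old_vd, result, mask, vl, vta, vma):
    """Write active elements; handle tail and masked-off elements per policy.

    vta / vma are True when the tail / mask policy is agnostic. If either
    policy is undisturbed, inactive elements keep their old value; only when
    both are agnostic are inactive elements overwritten with all 1s.
    """
    undisturbed = not (vta and vma)
    out = []
    for i, old in enumerate(old_vd):
        active = i < vl and bool(mask[i])
        if active:
            out.append(result[i])
        elif undisturbed:
            out.append(old)
        else:
            out.append(ALL_ONES)
    return out
```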

For those instructions in which multiple micro instructions are
instantiated to write to the same register (`VlStride` and `VlIndex` for
memory, and `VectorGather`, `VectorSlideUp` and `VectorSlideDown` for
arithmetic), a (new) micro instruction named `VPinVdCpyVsMicroInst` has
been used to pin the destination register so that there's no need to
copy the partial results between them. This idea is similar to what is
done in ARM's SVE code. This micro instruction also implements the
tail/mask policy for these cases.

Finally, it's worth noting that while now using an agnostic policy for
both tail/mask should remove all dependencies with old destination
registers, there's an exception with `VectorSlideUp`. The
`vslideup_{vx,vi}` instructions need the elements in the offset to be
unchanged. The current implementation overrides the current vta/vma and
makes them act as undisturbed, since they require the old destination
register anyway. There's a minor issue with this though: the
`v{,f}slide1up` variants do not need this, but since they share the same
constructor, they behave the same way.

Related issue #997.
2024-07-08 11:47:11 -07:00
Robert Hauser
77528d1928 systemc: Use headerDelay in timing annotation (#1328)
1. Responder (downstream components):

    When sending a BEGIN_REQ, the timing annotation marks the time when
    a transaction is visible to the target (see [1] on page 465).

    When writing the data, the downstream component calculates the
    transfer time and would send END_REQ after this time (see [1] on
    page 540). Therefore, not the payloadDelay, but the headerDelay
    should be used, as already written as a comment in the source files.
    When reading data, payloadDelay will be 0 anyway.

2. Requester (upstream component):

    For a data read, the beginning of the transfer is marked by BEGIN_RESP
    and the upstream component would delay END_RESP to model the
    data transfer (see [1] on page 540). Therefore, BEGIN_RESP should be
    delayed by the headerDelay, not the payloadDelay.

[1] "IEEE Standard for Standard SystemC® Language Reference Manual," in
IEEE Std 1666-2023 (Revision of IEEE Std 1666-2011), vol., no.,
pp.1-618, 8 Sept. 2023, doi: 10.1109/IEEESTD.2023.10246125.

Change-Id: I3b5e8ad6bc37cbb309b124efdc8764fca3728b7a

Signed-off-by: Robert Hauser <robert.hauser@uni-rostock.de>
2024-07-05 09:05:24 -07:00
Giacomo Travaglini
d825103df2 arch-arm: Implement FEAT_TTST (#1323)
Implement the small translation table extension.
This feature relaxes the lower limit on the size of the translation
tables by increasing the maximum permitted values of the T1SZ and T0SZ
fields in TCR_EL1, TCR_EL2, TCR_EL3, VTCR_EL2 and VSTCR_EL2.
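
To see why a larger TxSZ means a smaller translation region: the region covered is 2^(64 - TxSZ) bytes. A minimal Python sketch (the function name is hypothetical, and the example values 39 and 48 are assumptions about the limits without and with the feature for a 4KB granule, to be checked against the Arm Architecture Reference Manual):

```python
def region_size_bytes(txsz: int) -> int:
    """Size of the address region covered by a translation table base register."""
    return 1 << (64 - txsz)


# Assumed example limits: TxSZ = 39 without FEAT_TTST, TxSZ = 48 with it.
smallest_without_ttst = region_size_bytes(39)  # 32 MiB
smallest_with_ttst = region_size_bytes(48)     # 64 KiB
```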

Change-Id: I4c2187815b2d7f14407edb38095c6bcc2004b62a

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-07-04 09:37:41 +01:00
Giacomo Travaglini
c9d9108978 arch-arm: MISCREG_AT_S1E2R/W are executable from S state (#1322)
Change-Id: Ieaebdf0d62b5115f8085f478b2da105633b6a26a

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-07-04 09:37:17 +01:00
dependabot[bot]
baf2a9b917 misc: bump mypy from 1.10.0 to 1.10.1 (#1309)
Bumps [mypy](https://github.com/python/mypy) from 1.10.0 to 1.10.1.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-03 09:59:19 -07:00
Giacomo Travaglini
6ebc6dd998 arch-arm: Properly implement IPASpace in the MMU (#1313)
This PR introduces the concept of IPA space in gem5, which is necessary
after the implementation of FEAT_SEL2: we can now have Secure and
Non-Secure intermediate physical address spaces when the PE is executing
in Secure state.
2024-07-03 08:20:53 +01:00
Giacomo Travaglini
f3e3c60805 arch-arm: Proper support for NonSecure IPA space in Secure state
Change-Id: Ie2e2278ecdc5213db74999e3561b2918937c2c2e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-07-02 13:16:13 +01:00
Giacomo Travaglini
eb400e773b arch-arm: Remove makeStage2 from TLBIOp
Change-Id: I25276e4b5b7c491e69208044ceb193c67ddfd91c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-07-02 13:15:49 +01:00
Giacomo Travaglini
49ca08b01a arch-arm: Add isStage2 qualifier to the LongDescriptor
We are currently using the LongDescriptor for both stage1
and stage2 translations. There are several cases where
the bitfield meaning changes depending on the translation
stage.

Change-Id: Ic33d9ef225a57fd79ce2b4bf47896aeb6bdd8d9c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-07-02 13:15:31 +01:00
Giacomo Travaglini
9cce68ca71 arch-arm: Replace isSecure boolean with SecurityState enum
Change-Id: If01b8b2811b2c028e669ea3700174c7945b07a06
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-07-02 12:45:24 +01:00
Alexander Richardson
d5c0383887 arch-arm: support 64-bit PMCCNTR from AArch32 (#1304)
For ARMv8 CPUs this register allows reading a 64-bit cycle counter
from the 32-bit execution state.

Change-Id: I7cd9e2711ada5156920440cc3c89e7a74ca54a49
2024-07-02 08:59:44 +01:00
Giacomo Travaglini
b28659d4f9 arch-arm: Implement FEAT_XS (#1303)
This patch adds a functional implementation of FEAT_XS. Unless we
operate with DVM enabled, TLBI broadcasting is accomplished in 0 time,
so there is no timing benefit introduced by enabling FEAT_XS other than
the way it affects TLB management (invalidation).

Change-Id: I067cb8b7702c59c40c9bbb8da536a0b7f3337b5d

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-07-02 08:52:59 +01:00
Bobby R. Bruce
093e3afc81 misc,github,tests: Attempt Fixes for flakey Daily Tests (#1310) 2024-07-01 15:26:32 -07:00
Bobby R. Bruce
8d88a37ec9 misc: v4.0.0 -> v4 for actions/download-artifact
This was fixed to v4.0.0 under the assumption that the flaky nature of
the daily-tests.yaml workflow was due to a later, minor v4 version
causing issues. This did not work, so this patch reverts to using the
latest v4 version.

Change-Id: I72b8811022268f34309de193445987dbe0085951
2024-07-01 15:08:19 -07:00
Bobby R. Bruce
801b86e860 misc: Remove artifact merge from daily-tests.yaml
This workflow is flaky, and the flakiness appears to be mostly around
the merging of all the gem5 builds into a single artifact. In an attempt
to stabilize the workflow this merge step has been removed. All jobs now
download all gem5 binaries.

Change-Id: Ib1e9d82514c3d5e5af9de974a477e213f8af2aaa
2024-07-01 15:05:08 -07:00
Bobby R. Bruce
bb418d41eb misc: Add scheduler.yaml (#1308) 2024-07-01 12:40:16 -07:00
Bobby R. Bruce
3142464ff7 misc: Add 'scheduler.yaml' workflow (#1307)
This is made to run on the 'stable' branch to schedule workflow runs on
the `develop` branch. This solves the problem of GitHub Workflows being
scheduled to run only on the 'stable' branch, thus ignoring changes made
to them on 'develop'.

With this scheduler we no longer need to force a checkout of 'develop' in
the workflows. As such these have been removed.

The scheduled workflows are now triggered via "workflow_dispatch" by
the "scheduler.yaml" workflow.
2024-07-01 12:36:56 -07:00
Bobby R. Bruce
a7645cdf20 misc, tests: Fix missing 's' in GPU tests (#1306)
This caused the weekly tests to fail. It's 'tests' not 'test'.

Copy of #1305 .
2024-07-01 07:53:52 -07:00
Bobby R. Bruce
e5414a80a3 misc, tests: Fix missing 's' in GPU tests (#1305)
This caused the weekly tests to fail. It's 'tests' not 'test'.
2024-07-01 07:37:32 -07:00
Matt Sinclair
04a3fd5b5d gpu-compute,mem-ruby: Add RubyHitMiss flag for TCP and TCC cache (#1260)
Add hit and miss print for TCP and TCC cache with RubyHitMiss debug flag

Change-Id: I40ae3449020b917f39ac91d29fa4e1dd7c791e7b
2024-06-30 13:32:01 -05:00
Bobby R. Bruce
ca4897897c misc: Merge stable into develop (v24.0 release) (#1295)
This guarantees all changes put on the staging branch and, for whatever
reason, put on stable are on develop.

In addition, this PR reverts specific release procedures (e.g.,
reverting the removal of the -Werror compilation flag, and changing the
versioning back to "DEVELOP").
2024-06-27 23:45:25 -07:00
Bobby R. Bruce
beff732ecf util-docker: Set dev container to ":latest"
Change-Id: I73bb569e05830d35f0aa63eb75026a83377ae3a5
2024-06-27 23:42:25 -07:00
Bobby R. Bruce
4a28b367d7 scons: Readd -Werror for the develop branch
This reverts commit 6e4c1c5db7.
2024-06-27 23:38:36 -07:00
Bobby R. Bruce
b3f23830c9 misc: Update versioning for develop branch
Develop for v24.1

Change-Id: I4ef34c4a4ef67d171505ff9380746ae193655305
2024-06-27 23:36:07 -07:00
Bobby R. Bruce
6fcc13cf55 misc: Merge branch stable into develop
This guarantees all changes put on the staging branch and, for whatever
reason, put on stable are on develop. This syncs the branches.

Change-Id: Ib3513f49977bb4ed3046c2d9d6cf162953b15887
2024-06-27 23:27:21 -07:00
Bobby R. Bruce
43769abaf0 misc: Merge v24.0 release staging branch to stable (#1274)
This merge officially marks the release of gem5 v24.0.
2024-06-27 23:22:40 -07:00
Harshil Patel
3acb6e59cf resources: Update elfie.py to work with obtain_resources (#1289)
Change-Id: I08c5e50a150c8434c6c2ca36af81fb6ec3915af8
2024-06-27 20:02:57 -07:00
Jarvis Jia
f56571fed9 Merge branch 'develop' into rubyhitmiss 2024-06-27 21:45:08 +08:00
Bobby R. Bruce
b471d5f382 stdlib,tests: Update resources to v24.0 in Pyunit test (#1290)
This needs a better fix. I don't like having to update these files for
every release. For now, though, this means the tests pass in v24.0.
2024-06-27 05:48:48 -07:00
Jason Lowe-Power
c1825a9c0a misc: Update release notes
Change-Id: Ia8bd55ab46dca7f0eef533c0c3b7da1fe4c84cc9
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-06-26 15:09:27 -07:00
Bobby R. Bruce
e4218a8c43 Merge branch 'stable' into release-staging-v24.0.0.0 2024-06-26 13:45:57 -07:00
Bobby R. Bruce
9d05b35884 misc: Improve gem5 Release notes for v24.0
Change-Id: I5d59f41a84919a9eba1cc00b116e2477ad0beb6e
2024-06-26 13:44:35 -07:00
Rajesh Shashi Kumar
3ce5e0584a arch-arm: This commit fixes a typo in the ARM ldaddalx instruction (#1279)
The acquire-release flavor of the ldadd instruction should read ldaddalx
(eg. ldaddalb/ldaddalh) according to specification. However, this is
currently noted as ldadd"la"x (eg. ldaddlab/ldaddlah).

Issue: https://github.com/gem5/gem5/issues/1224
Change-Id: Ib932fa0e572207729c923c27f24c34cc21dff0e5

Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
2024-06-26 09:03:50 -07:00
Harshil Patel
e0d03fbc2f resources: fix check for additional_params for workloads
Change-Id: I0a4b5f0eef6e2f9faf35cea8130572a066aab6cd
2024-06-26 07:13:04 -07:00
Jason Lowe-Power
f68f4dd390 resources: fix check for additional_params for workloads (#1287)
Change-Id: I0a4b5f0eef6e2f9faf35cea8130572a066aab6cd
2024-06-26 09:08:45 -05:00
Harshil Patel
144a2071fe resources: fix check for additional_params for workloads
Change-Id: I0a4b5f0eef6e2f9faf35cea8130572a066aab6cd
2024-06-25 16:30:07 -07:00
Matthew Poremba
faa18576f2 misc: Add high level GPU model release notes
Change-Id: I73dfba5eeeffe1b812bc41a80b9d0901822e8062
2024-06-25 11:21:20 -07:00
Harshil Patel
241b8a09df resources: Update client_query to trim gem5 version (#1284)
gem5 was querying with its full version (`24.0.0.0`) while searching
for resources.
This caused an error when finding resources on the staging branch.
This change trims the gem5 version to be just the major.minor version.
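
The trimming step can be sketched as below. This is an illustrative Python snippet with a hypothetical function name, not the code from the client query module:

```python
def trim_to_major_minor(version: str) -> str:
    """Keep only the first two dot-separated components of a version string."""
    return ".".join(version.split(".")[:2])
```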

Change-Id: I30c3a1b38c631981f797ef0fd2b616e6a66ca18e
2024-06-25 09:04:13 -07:00
Harshil Patel
52fde944a5 resources: Update client_query to trim gem5 version (#1284)
gem5 was querying with its full version (`24.0.0.0`) while searching
for resources.
This caused an error when finding resources on the staging branch.
This change trims the gem5 version to be just the major.minor version.

Change-Id: I30c3a1b38c631981f797ef0fd2b616e6a66ca18e
2024-06-25 09:01:36 -07:00
Jarvis Jia
341c72839b Fix hit issue
Change-Id: I28745489de693591d5ad8453b035a8c782adaf1f
2024-06-24 11:19:51 -07:00
Jarvis Jia
21b69975a6 Fix compilation error
Change-Id: I8273472b8d0cff8c02f2d1e1a9d66599af7c4866
2024-06-24 11:19:51 -07:00
Jarvis Jia
e957a882ed gpu-compute,mem-ruby: Add RubyHitMiss flag for TCP and TCC cache
Add hit and miss print for TCP and TCC cache with RubyHitMiss debug flag

Change-Id: I40ae3449020b917f39ac91d29fa4e1dd7c791e7b
2024-06-24 11:19:51 -07:00
Saúl Adserias
99f58d37da arch-riscv: add agnostic opt to vector tail/mask for arith insts
Change-Id: I693b5f3a6cc8a8f320be26b214fd9b359e541f14
2024-06-24 10:03:52 -07:00
Saúl Adserias
73c364519a arch-riscv: add agnostic opt to vector tail/mask for mem insts
Change-Id: I567a110806b77d5576810706bd3e30185b0e0b75
2024-06-24 10:03:52 -07:00
Bobby R. Bruce
84c3b0c111 misc: Update dummy jobs for workflows
These give us clear indications if a workflow has passed or failed.

Change-Id: If61b9ac5dc4d2da54b4ad68e427b149bbcb4a30b
2024-06-22 12:59:42 -07:00
Bobby R. Bruce
09781fd78f misc: Update dummy jobs for workflows
These give us clear indications if a workflow has passed or failed.

Change-Id: If61b9ac5dc4d2da54b4ad68e427b149bbcb4a30b
2024-06-22 12:58:35 -07:00
Bobby R. Bruce
79a0a2a15f Merge branch 'stable' into release-staging-v24.0.0.0 2024-06-22 12:31:45 -07:00
Giacomo Travaglini
7f3afc211c misc: Add MPAM entry in the v24 release notes
Change-Id: If11d470003e21e51dd5a3b1831d50ed8a54e1919
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-06-21 11:05:22 +01:00
Mahyar Samani
21bd1c28ab Adding an example for Spatter (#1272)
This change adds a new utility function for processing Spatter traces
into SpatterKernels under parse_kernels.
Additionally, it adds documentation for all the utility functions in
spatter_kernel.py.
Lastly, it adds an example script for running one spatter trace using
SpatterGenerator to the examples.
2024-06-21 02:26:58 -07:00
Mahyar Samani
30bfdc8e52 stdlib: Getter method to get monolith range. (#1273)
This change extends the AbstractMemory class with a getter method that
allows other components to get the memory's range without interleaving.
This method is useful if other components in the system need to
interleave the memory range differently from the way the memory has
interleaved it.
2024-06-21 02:26:50 -07:00
Mahyar Samani
18bc5227f6 stdlib: Getter method to get monolith range. (#1273)
This change extends the AbstractMemory class with a getter method that
allows other components to get the memory's range without interleaving.
This method is useful if other components in the system need to
interleave the memory range differently from the way the memory has
interleaved it.
2024-06-21 02:23:58 -07:00
Mahyar Samani
590bb1fbbb Adding an example for Spatter (#1272)
This change adds a new utility function for processing Spatter traces
into SpatterKernels under parse_kernels.
Additionally, it adds documentation for all the utility functions in
spatter_kernel.py.
Lastly, it adds an example script for running one spatter trace using
SpatterGenerator to the examples.
2024-06-21 02:23:41 -07:00
Bobby R. Bruce
6f95d4f3c3 misc: Add content for v24.0 RELEASE_NOTES.md
First draft. Not yet complete.

Change-Id: I80fa4fa4e82e0367cb347df53e9cc9cb52670bfc
2024-06-20 15:37:13 -07:00
Bobby R. Bruce
6e4c1c5db7 scons: Remove -Werror for the gem5 v24.0 release
Removing the -Werror flag on the stable branch ensures that as new
compilers are released (likely with stricter warnings) gem5 remains
compilable.

Change-Id: I0267c895414b630c1d7cd9b28236249790b3006f
2024-06-20 14:47:09 -07:00
Bobby R. Bruce
ec120e0c58 util-docker: Update devcontainer Dockerfile for v24.0
Change-Id: Id21fb1b12d8ad58338233d4f32be5b57e025f18b
2024-06-20 14:31:12 -07:00
Bobby R. Bruce
d9d7d7646a misc: Update Doxygen version to v24.0.0.0
Change-Id: Ibaa04b09813a1d497727ed9d2a903ee2b3049ffd
2024-06-20 13:53:20 -07:00
Bobby R. Bruce
888bf0d693 base: Update src/base/version.cc for v24.0
Change-Id: Iac980772a42853f9bfbdadb65d5efc3c5fdb6aed
2024-06-20 13:53:07 -07:00
Jason Lowe-Power
013f773d31 arch-riscv: Fix TLB lookup with vaddrs (#1264)
Previously, all of the TLB lookup/insert functions were using the full
virtual addresses even though the variables in the functions said "vpn."
This change explicitly converts the virtual address to the VPN without
any least significant zeros for the offset. I.e., vpn >> page_size.

The main bug solved in this changeset is that the ASID was |'d with the
upper bits of the virtual address, but sometimes those bits were all 1's.
Therefore, you could get a TLB hit even if the ASID was different.
Interestingly, the page that seemed to cause these issues was a 1 GiB
page.
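
The aliasing can be illustrated with a small Python model. The key layouts, shift amounts, and function names below are assumptions made for the sketch (Sv39-style 39-bit virtual addresses, 4 KiB pages), not the exact gem5 code:

```python
PAGE_SHIFT = 12   # 4 KiB pages
VA_BITS = 39      # assumed Sv39 virtual address width
ASID_SHIFT = 48   # assumed position of the ASID in the old lookup key


def buggy_key(vaddr: int, asid: int) -> int:
    """Old behaviour: OR the ASID into the upper bits of the full address."""
    return (asid << ASID_SHIFT) | vaddr


def fixed_key(vaddr: int, asid: int) -> int:
    """New behaviour: build the key from the VPN, without the page offset."""
    vpn = (vaddr & ((1 << VA_BITS) - 1)) >> PAGE_SHIFT
    return (asid << (VA_BITS - PAGE_SHIFT)) | vpn


# A sign-extended (kernel-style) address: its upper bits are all 1's.
vaddr = 0xFFFF_FFC0_0000_0000
collides = buggy_key(vaddr, 1) == buggy_key(vaddr, 2)
distinct = fixed_key(vaddr, 1) != fixed_key(vaddr, 2)
```

Because the upper address bits are already all 1's, OR-ing any ASID into them leaves the key unchanged, so two address spaces map to the same entry.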

This change also starts refactoring some of the page table details to
support sv46 and sv57 page table formats.

In my testing, the Linux kernel boot uses large pages (even OpenSBI uses
large pages), so it seems that large pages also work. However, this
seems like magic to me, so I'm not sure if it's correct.

This change also updates some asserts, and debug statements with more
useful debugging information.

Partially fixes #1235. More testing needs to be done to be confident.
2024-06-20 13:24:50 -07:00
Bobby R. Bruce
7137b73ca0 cpu: Fix std::min type mismatch in reg_class.hh (#1266)
Introduced in #1234, this caused compilation to fail on Apple Silicon
systems. This bug is the same as #582 where a more detailed explanation
is provided.
2024-06-20 13:02:08 -07:00
Mahyar Samani
7ff1e381c9 cpu,stdlib: Fix Access Trace for Accessing Indices in SpatterGen (#1258)
This change fixes the way indices are generated in a multi generator
setup.
It changes it from all cores generating the same trace of indices for
accessing the index array to each core generating an interleaved subset
of indices.
As an example, the traces below show the indices into the index array
in a 2 core setup.

Before:
core_0: 0, 1, 2, 3, 4, 5, 6, 7, ...
core_1: 0, 1, 2, 3, 4, 5, 6, 7, ...
After:
core_0: 0, 1, 2, 3, 8, 9, 10, 11, ...
core_1: 4, 5, 6, 7, 12, 13, 14, 15, ...
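
The "After" pattern can be reproduced with a short Python sketch. The function name and the block size of 4 are inferred from the example traces and are assumptions, not names from the SpatterGen source:

```python
def core_indices(core_id: int, num_cores: int, block: int, count: int) -> list[int]:
    """Return the first `count` index-array indices generated by one core.

    Cores take turns owning consecutive blocks of `block` indices.
    """
    out: list[int] = []
    round_no = 0
    while len(out) < count:
        base = (round_no * num_cores + core_id) * block
        out.extend(range(base, base + block))
        round_no += 1
    return out[:count]
```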

Additionally, this change fixes the SpatterKernel class in the standard
library to comply with the change in the SpatterGen source code.
2024-06-20 11:24:44 -07:00
Matthew Poremba
ed860dfe54 configs: Check before use replacement policy options (#1261)
Rather than adding the options to *every* config that might be using
GPU_VIPER.py, just change the Ruby config to check if the option is
available before trying to use it. Otherwise, reverts to what was the
default on stable.

Change-Id: Ia6f1d0827d489ee2a35c598b644461cbff59e247
2024-06-20 09:50:29 -07:00
TiredTumblrina
9fb0b18863 gpu-compute,mem,systemc: This commit corrects typos of 'cache' (#1263)
I noticed while using the stable branch that there were a few typos of
the word 'cache' and so I've corrected a few files where I found such
typos.

Change-Id: I7c7f64812039f34fe39d0c45c4f5ce921cba06d0
2024-06-20 09:45:13 -07:00
Jason Lowe-Power
943daeb603 stdlib: Add function to append kernel args (#1262)
Often, you want to add another argument to the default kernel arguments.
This function allows you to do that on the `kernel_disk_workload` board
mixin.
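
A rough sketch of the idea, not the stdlib implementation. The class and
method names and the default arguments are hypothetical stand-ins:

```python
class KernelWorkloadSketch:
    """Hypothetical stand-in for a board mixin that owns the kernel args."""

    def __init__(self, default_args):
        # Copy so the caller's list of defaults is never mutated.
        self._kernel_args = list(default_args)

    def append_kernel_arg(self, arg):
        """Add one more argument after the defaults."""
        self._kernel_args.append(arg)

    def kernel_command_line(self):
        """Join all arguments into the string handed to the kernel."""
        return " ".join(self._kernel_args)


workload = KernelWorkloadSketch(["console=ttyS0", "root=/dev/sda1"])
workload.append_kernel_arg("quiet")
```

The point of an append function is that the defaults stay in place and the
user does not have to restate them to add one argument.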
2024-06-20 09:14:55 -07:00
Bobby R. Bruce
25d614e4ce tests: Fix x86_boot_exit_run.py 'set_max_ticks' typo (#1267) 2024-06-20 00:31:23 -07:00
Ivana Mitrovic
e88f0944e3 util: Bump urllib3 in gem5-resource-manager (#1257)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.0.7 to 2.2.2.

Change-Id: I218236ff9ebe99839e417b67e740e6f98c0ee473
2024-06-18 11:05:13 -07:00
Bobby R. Bruce
9fe2bc9edc util-docker: Update devcontainer to use Ubuntu 24.04 (#1256)
Change-Id: I0e0dbaca2194c7f0ff5de54a49888da1c938c2de
2024-06-18 09:35:18 -07:00
Bobby R. Bruce
1a00ecfaf9 stdlib,configs,tests: Add gem5 MultiSim (MultiProcessing for gem5) (#1167)
This allows for multiple gem5 simulations to be spawned from a single
parent gem5 process, as defined in a single gem5 configuration. In this
design _all_ the `Simulator`s are defined in the simulation script and
then added to the multisim module. For example:

```py
from gem5.simulate.Simulator import Simulator
import gem5.utils.multisim as multisim

# Construct the board[0] and board[1] as you wish here...

simulator1 = Simulator(board=board[0], id="board-1")
simulator2 = Simulator(board=board[1], id="board-2")

multisim.add_simulator(simulator1)
multisim.add_simulator(simulator2)
```

This specifies that two simulations are to be run in parallel in
separate threads: one specified by `simulator1` and another by
`simulator2`. They are then added to MultiSim via the
`multisim.add_simulator` function. The user can specify an id via the
Simulator constructor. This is used to give each process a unique id and
output directory name. Given this, the id should be a helpful name
describing the simulation being specified. If not specified one is
automatically given.

To run these simulators we use `<gem5 binary> -m gem5.utils.multisim
<script> -p <num_processes>`. Note: multisim is an executable module in
gem5. This is the same module we import into our scripts to add the
simulators. This is an intentionally modular encapsulated design. When
the module processes a script it will schedule multiple gem5 jobs and,
dependent on the number of processes specified, will create child gem5
processes to process these jobs (jobs are just gem5 simulations in
this case). The `--processes` (`-p`) argument is optional and if not
specified the max number of processes which can be run concurrently will
be the number of available threads on the host system.

The id for each process is used to create a subdirectory inside the
`outputdir` (`m5out`) of that id name. E.g., in the example above the
IDs are `board-1` and `board-2`. Therefore the m5 out directory will
look as follows:

```sh
- m5out
    - board-1
        - stats.txt
        - config.ini
        - config.json
        - terminal.out
    - board-2
        - stats.txt
        - config.ini
        - config.json
        - terminal.out
```

Each simulation's output is encapsulated inside the subdirectory of the
id name.
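
The layout above can be sketched as a mapping from simulator id to output
directory. The helper name is hypothetical and this is only an illustration
of the directory scheme, not the multisim code:

```python
from pathlib import Path


def output_dirs(outdir, simulator_ids):
    """Map each simulator id to its own subdirectory of the output dir."""
    return {sim_id: Path(outdir) / sim_id for sim_id in simulator_ids}


dirs = output_dirs("m5out", ["board-1", "board-2"])
stats_file = dirs["board-1"] / "stats.txt"
```

Since ids double as directory names, two simulators with the same id would
write to the same place, which is why each id needs to be unique.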

If the multisim configuration script is passed directly to gem5 (like a
traditional gem5 configuration script, i.e.: `<gem5 binary> <script>`),
the user may run a single simulation specified in that script by passing
its id as an argument. E.g. `<gem5 binary> <script> board-1` will run
the `board-1` simulation specified in `script`. If no argument is passed
an Exception is raised asking the user to either specify or use the
MultiSim module if multiprocessing is needed.

If the user desires a list of ids of the simulations specified in a
given MultiSim script, they can do so by passing the `--list` (`-l`)
parameter to the config script. I.e., `<gem5 binary> <script> --list`
will list all the IDs for all the simulations specified in `script`.
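
The three cases described above (run one id, list all ids, or fail when
nothing is given) could be dispatched roughly as follows. This is a sketch
with a hypothetical function name and return convention; only the `--list`
(`-l`) flag and the positional id come from the description:

```python
import argparse


def choose_action(simulator_ids, argv):
    """Decide what a directly-run multisim script should do.

    Returns ("list", ids) for --list, ("run", sim_id) for a single id,
    and raises if neither is given.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("sim_id", nargs="?", default=None)
    parser.add_argument("-l", "--list", action="store_true")
    args = parser.parse_args(argv)

    if args.list:
        return ("list", list(simulator_ids))
    if args.sim_id is None:
        raise Exception(
            "No simulation id given: pass an id, or use the MultiSim "
            "module to run the simulations in parallel."
        )
    if args.sim_id not in simulator_ids:
        raise ValueError(f"Unknown simulation id: {args.sim_id}")
    return ("run", args.sim_id)


ids = ["board-1", "board-2"]
```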

This change comes with two new example scripts found in
'configs/example/gem5_library/multisim' to demonstrate multisim in both
an SE and FS mode simulation. Tests have been added which run these
scripts as part of gem5's Daily suite of tests.

Notes
=====

* **Bug fixed**: The `NoCache` classic cache hierarchy has been modified
so the Xbar is no longer set with a `__func__` call. This interfered
with MultiProcessing as this structure is not serializable via Pickle.
This was quite a bad design anyway, so it should have been changed.

* **Change**: `readfile_contents` parameter previously wrote its value
to a file called "readfile" in the output directory. This has been
changed to write to a file called "readfile_{hash}" with "{hash}" being
a hash of the `readfile_contents`. This ensures that, during multisim
running, this file is not overwritten by other processes.

* **Removal note**: This implementation supersedes the functionality
outlined in 'src/python/gem5/utils/multiprocessing'. As such, this code
has been removed.
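
The `readfile_{hash}` naming from the **Change** note can be sketched as
follows. The choice of SHA-256 and the helper name are assumptions; the
message does not say which hash function is used:

```python
import hashlib


def readfile_name(readfile_contents):
    """Derive a per-contents file name so parallel runs do not collide.

    Assumption: SHA-256 of the UTF-8 encoded contents; the real
    implementation may hash differently.
    """
    digest = hashlib.sha256(readfile_contents.encode("utf-8")).hexdigest()
    return f"readfile_{digest}"


name_a = readfile_name("echo 'workload A'")
name_b = readfile_name("echo 'workload B'")
```

Processes with identical contents map to the same file name and write the
same bytes, while different contents never overwrite each other.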

Limitations/Things to Fix/Improve
=================================

* Though each Simulator process has its own output directory (a
subdirectory within m5out, with an ID set by the user unique to that
Simulator), the stdout and stderr are still output to the terminal, not
the output directory. This results in: 1. stdout and stderr data lost
and not recorded for these runs. 2. An incredibly noisy terminal output.
* Each process uses the same cached resources. While there are locks on
resources when downloading, each process will hash the resources it
requires to ensure they are valid. This is very inefficient in cases
where resources are common between processes (e.g., you may have 10
processes each using the same disk image, with each process hashing the
disk image independently and arriving at the same result to validate the
resource).

Change-Id: Ief5a3b765070c622d1f0de53ebd545c85a3f0eee

---------

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
2024-06-18 09:34:39 -07:00
Bobby R. Bruce
3138c8a8b1 gpu-compute,mem-ruby: Revert "Add RubyHitMiss flag for TCP and TCC cache" (#1254)
Reverts gem5/gem5#1226
2024-06-18 07:58:54 -07:00
Bobby R. Bruce
36f73f671d cpu,stdlib: Adding Spatter (#1136)
This PR adds source code for C++ implementation of SpatterGen as well as
SpatterKernel. SpatterGen uses a PyBindMethod to add kernels to the
backend code. This way the process of processing json files could be
offloaded to python. In addition it adds standard library components for
SpatterGenCore and SpatterGen. These two components follow the same
structure as AbstractCore and AbstractProcessor. In addition
spatter_kernel.py adds a definition for SpatterKernel in python to make
adding kernels to C++ easier. Also it adds utility functions for parsing
dictionaries read from json as well as partitioning traces for multicore
setups.
2024-06-17 15:28:45 -07:00
Hoa Nguyen
15e0236a8b arch,cpu,sim: Add mechanism to partially print vector regs (#1234)
Currently, gem5's inst tracer prints the whole vector register container
by default. The size of vector register containers in gem5 is the
maximum size allowed by the ISA. For vector-length agnostic (VLA) vector
registers, this means ARM SVE vector container is 2048 bits long, and
RISC-V vector container is 65535 bits long. Note that VLA implementation
in gem5 allows the vector length to be varied within the limit specified
by the ISAs.

However, in most use cases of gem5, the vector length is much less than
65535 bits. This causes two issues: (1) the vector container requires
allocating and moving around a large amount of unused data while only a
fraction of it is used, and (2) printing the execution trace of a vector
register results in a wall of text with a small amount of useful data.

This change addresses problem (2) by providing a mechanism to limit
the amount of data printed by the instruction tracer. This is done by
adding a function printing the first X bits of a vector register
container, where X is the vector length determined at runtime, as
opposed to the vector container size, which is determined at compilation
time.
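
As an illustration only, printing the first X bits of a larger container
might look like the sketch below. The function name, the `bytes`
representation, and the byte order are assumptions and do not reflect the
tracer's actual output format:

```python
def format_vector_reg(container, vlen_bits):
    """Hex-format only the first `vlen_bits` bits of a register container.

    `container` holds the full, compile-time-sized register; `vlen_bits`
    is the vector length chosen at runtime.
    """
    if vlen_bits % 8 != 0:
        raise ValueError("vlen_bits must be a whole number of bytes")
    num_bytes = vlen_bits // 8
    if num_bytes > len(container):
        raise ValueError("vector length exceeds the container size")
    return container[:num_bytes].hex()


# A 2048-bit container of which only the first 128 bits are in use.
container = bytes([0xAB] * 16 + [0x00] * 240)
short_form = format_vector_reg(container, 128)
full_form = format_vector_reg(container, 2048)
```

Here `short_form` is 32 hex digits while `full_form` is 512, most of them
zeros, which is the "wall of text" the change avoids.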

Change-Id: I815fa5aa738373510afcfb0d544a5b19c40dc0c7

---------

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-06-17 14:05:47 -07:00
hahaxxz
fef6a97f93 mem-ruby: This commit fixes MI_example protocol (#1236)
fix two bugs in MI_example-dir.sm:
1. Directory cannot handle DMA_READ & DMA_WRITE events in M_DRDI state.
2. Directory cannot handle PUTX_NotOwner events in {M_DWR, M_DRD,
M_DRDI, M_DWRI} state.

Github Issue: https://github.com/gem5/gem5/issues/1210

Change-Id: I52a9d674ce0688dcfbbcc2b583f17de95afdeb87
2024-06-17 12:45:11 -07:00
Hoa Nguyen
500da4306b arch: Mark FailUnimplemented instructions as Invalid instructions (#1247)
This is a follow-up on the discussion here [1].

The IsInvalid flag was previously defined as an instruction that does
not appear in the ISA. However, a micro-architecture can choose to not
recognize an instruction and raise an illegal instruction fault even if
the instruction is in the ISA.

This change modifies the definition of an Invalid instruction such that,
if a StaticInst instruction is marked as IsInvalid, it means the
instruction is not recognized by the decoder. This means that any
instruction recognized by the decoder is not invalid, even if the
instruction is not in the official ISA spec; e.g., m5
pseudo-instructions.

Note that instructions that are recognized by the decoder but are chosen
to act as a nop are not invalid. This applies to WarnUnimplemented
instructions, e.g. hint instructions.

[1] https://github.com/gem5/gem5/pull/1071

Change-Id: I1371b222d8b06793d47f434d0f148c5571672068

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-06-17 12:44:05 -07:00
Giacomo Travaglini
2804311f7b cpu-o3: Revert "Do not set Executed on load instruction to be replayed" (#1251)
Reverts gem5/gem5#1182

This is breaking O3 execution. Investigating the matter
2024-06-17 12:24:43 -07:00
Matt Sinclair
6776bebbf6 gpu-compute,mem-ruby: Add RubyHitMiss flag for TCP and TCC cache (#1226)
Add hit and miss print for TCP and TCC cache with RubyHitMiss debug flag

Change-Id: I4430532b901811e03d9b077b61e2eca4557b34e1
2024-06-17 12:47:47 -05:00
Matthew Poremba
50e4209a4a arch-vega: Various MI300 fixes for PyTorch tests (#1249)
- Fix address calculation issue with scratch_* instructions when SVE bit
is 0.
- Fix ds_swizzle_b32 not mapping to execution unit.
- Implement VOP3 V_FMAC_F32.
- Fix architected scratch address register being clobbered.

Tested with MNIST from PyTorch quickstart tutorial and nanoGPT on
mi300.py.
2024-06-17 07:59:47 -07:00
Jarvis Jia
3a2bf47d57 Add default value and change Ruby address format specifier
Change-Id: I8fbaf34745e90589e610d3b9bd423937e7ebdc3d
2024-06-17 03:27:25 -05:00
Jarvis Jia
edb2e76077 Merge branch 'develop' into rubyhitmiss 2024-06-17 15:57:50 +08:00
Matthew Poremba
2b0ca93517 gpu-compute: Fix architected flat scratch
Currently writing to SRF which is incorrect, as the physical register
number can be clobbered by another wavefront if registers get renamed to
the physical register number.

Fix this by actually architecting the register, i.e., there is a
dedicated "hardware" register in the wavefront class.

Change-Id: I94e9e463eed348b2928cae884c1c20566c00984d
2024-06-15 15:46:33 -07:00
Matthew Poremba
2f5842d253 arch-vega: Add valid flag to ds_swizzle_b32
Currently the flag is just Load and there is a long comment explaining
why. This does not meet any of the scoreboard check requirements:

https://github.com/gem5/gem5/blob/develop/src/gpu-compute/scoreboard_check_stage.cc#L230-L241

Add a generic ALU flag as well so the instruction executes instead of
panicking.

Change-Id: I54b2d20d47fad5e8f05f927328433aab7db7d862
2024-06-15 14:28:59 -07:00
Matthew Poremba
42369eab2c arch-vega: Implement MI300 FLAT SVE bit
For scratch instructions only, this bit specifies if an offset in a VGPR
should be used for address calculation. This is new in MI300 and was
previously the LDS bit. The LDS bit is rarely used and in fact gem5 does
not even check this bit.

This fixes a bug when SADDR == 0x7f (i.e., no SGPR should be used) where
a VGPR was being added to the address when it should have been ignored.

Change-Id: I9864379692df6795b25b58b98825da05d18fc5db
2024-06-15 14:28:59 -07:00
Matthew Poremba
1dab4be002 arch-vega: Implement VOP3 V_FMAC_F32
A version of V_FMAC_F32 with extra modifiers from VOP3 format.

Change-Id: Ib6b41b0a3ceb91269b91a0287dfc94bc73e4d217
2024-06-15 14:28:58 -07:00
Matthew Poremba
f91d14fe46 gpu-compute: Add MFMA stats (#1248)
Add dynamic instruction counts for MFMAs.

Change-Id: I976b01344577cf011aeb3dd648a8c0017281c4e3
2024-06-15 13:04:00 -07:00
Mahyar Samani
d661023de4 stdlib: Adding SpatterGenCore and SpatterGen
This change adds code for SpatterGenCore and SpatterGen as well
as SpatterKernel to the standard library. SpatterGenCore and
SpatterGen follow the same structure as AbstractCore and
AbstractProcessor. spatter_kernel.py adds utility functions
to parse dictionaries as well as partition a list into
multiple lists through interleaving to be used when setting up
a multicore SpatterGen.

Change-Id: I003553e97f901c0724f5feac0bb6e21a020bd6ad
2024-06-14 13:44:34 -07:00
Mahyar Samani
6695e5ef70 cpu: Adding SpatterGen
This change adds source code for SpatterGen ClockedObject.
The set of source code pushed includes code for SpatterKernel
that tracks whether information is being gathered or scattered
as well as the list of indices to be accessed. This model
has PyBindMethod to add SpatterKernels from python.
This way all the preparations for kernels can be done in python.
SpatterGen has a few parameters that model limits on a few of
hardware resources in the backend of a processor, e.g. number
of functional units to calculate effective address, the latency
of calculating effective address, number of integer registers.

Change-Id: I451ffb385180a914e884cab220928c5f1944b2e3
2024-06-14 10:45:09 -07:00
Minje Jun
b8e21a2d32 cpu-o3: Do not set Executed on load instruction to be replayed (#1182)
A load instruction can be replayed when
1) it's strictly ordered or
2) it falls into load-store forwarding mismatch.

Case 1 was considered in executeLoad function but the case 2 wasn't. It
causes the case-2 replayed load instruction to violate the assertion
condition "assert(!load_inst->isExecuted())" in LSQUnit::read. This
commit fixes the problem by adding consideration of the case 2 in
LSQUnit::executeLoad.

Co-authored-by: Minje Jun <minje.jun@samsung.com>
2024-06-14 10:12:26 -07:00
Matthew Poremba
3cf638e217 gpu-compute, util-m5: add GPU kernel exit events (#1217)
The GPUFS scripts include support for dumping and resetting
stats at kernel boundaries by identifying specific GPU kernel 
exit events. This commit extends that support to work with 
GPU SE-mode support.

Change-Id: I662233ae71e2987d90af1fd0100e29036b2ef1c6
2024-06-14 08:13:27 -07:00
Jason Lowe-Power
21ffd91529 cpu,arch: Add IsInvalid flag to Unknown insts (#1071)
The IsInvalid flag indicates that the static instruction is not part of
the executing ISA and not part of m5's pseudo-instructions. This flag
provides a way to recognize an illegal instruction at the decode stage.
2024-06-13 16:26:35 -07:00
Matthew Poremba
b3d9dc42d4 configs: Add replacement policy options for GPUFS (#1230)
GPU_VIPER.py was modified to use these options but they did not exist,
breaking GPUFS. This commit adds them to fix the issue.

Change-Id: I0095f400ea606c4e8d91a41870ef208465cef803
2024-06-13 11:23:50 -07:00
Jarvis Jia
87c0d7732c Merge branch 'develop' into rubyhitmiss 2024-06-12 17:30:35 -04:00
Jarvis Jia
edfc139c40 Change black format
Change-Id: I3733b31baf187e0d3d38d971d9423a1b1afe2296

gpu-compute: add GPU RubyHitMiss for TCP and TCC

Change-Id: I4430532b901811e03d9b077b61e2eca4557b34e1

gpu-compute: Add RubyHitMiss flag for TCP and TCC cache

Change-Id: I4e5d1127c84b9eb1060ec9ba0b6638267449eda5

gpu-compute: Add RubyHitMiss flag for TCP and TCC cache

Change-Id: I4e5d1127c84b9eb1060ec9ba0b6638267449eda5

Remove space

Change-Id: I401f528c6f128ba0956bdbc232e8f2ae37bf648c
2024-06-12 16:04:36 -05:00
Jarvis Jia
b6b2e8c6c5 Black format
Change-Id: If224c106262bae25127675160ea78386eedace3b
2024-06-12 15:57:04 -05:00
Jarvis Jia
0ebcddea95 Update apu_se.py to remove part not needed
Change-Id: I06df4e0a67ccd2b7a45296ff65bf26c2b465a934
2024-06-12 15:54:13 -05:00
Matthew Poremba
be0a7937c1 mem-ruby: Fix deadlock in GPU_VIPER when issuing atomic requests (#1216)
When a compute unit issues several requests to the same line,
the requests wait in the L2 if it is a writeback cache. If the line is
invalid initially and the first request is atomic in nature, the L2
cache issues a request to main memory. On data return, the cache line
transitions to M but doesn't wake up the other requests, resulting in
a deadlock. This commit adds a wakeup call on data return for atomics
and fixes potential deadlocks.
2024-06-12 10:10:32 -07:00
Harshil Patel
74afea471d cpu: Revert "Don't change to suspend if the thread status is halted" (#1225)
Reverts gem5/gem5#1039
2024-06-12 00:20:06 -07:00
Bobby R. Bruce
f9abf6bb08 stdlib: Improve gem5 PyStats (#996)
This PR incorporates numerous improvements and fixes to the gem5
PyStats. This includes:

* PyStats now support SimObject Vectors. The PyStats representing them
are subscriptable and can therefore be accessed by index, e.g.,
`simobjectvec[0]`. (This replaces the `Vector` group PyStat)
* Adds the `SparseHist` PyStats.
* Adds the `Vector2d` to PyStats.
* The `Distribution` PyStats is fixed to be a vector of Scalars.
* Tests added for the PyStat's Vector and bugs fixed.
2024-06-12 00:19:08 -07:00
Bobby R. Bruce
e03a5f78d1 misc,tests: Revert merge version to 'v4' from 'v4.0.0'
'v4.0.0' wasn't working. The following error occurred:

```
Can't find 'action.yml', 'action.yaml' or 'Dockerfile' for action 'actions/upload-artifact/merge@v4.0.0'.
```

Change-Id: I658b0fe292df029501fbc1286acb06f4014ae4e1
2024-06-12 00:15:06 -07:00
Bobby R. Bruce
261490f23c misc,tests: Revert merge version to 'v4' from 'v4.0.0'
'v4.0.0' wasn't working. The following error occurred:

```
Can't find 'action.yml', 'action.yaml' or 'Dockerfile' for action 'actions/upload-artifact/merge@v4.0.0'.
```

Change-Id: I658b0fe292df029501fbc1286acb06f4014ae4e1
2024-06-12 00:14:27 -07:00
Vishnu Ramadas
42b9a9666e mem-ruby: Add instSeqNum to atomic responses from GPU L2 caches
This commit adds instSeqNum to the atomic responses in
GPU_VIPER-TCC.sm. This will be useful when debugging issues related to
GPU atomic transactions

Change-Id: Ic05c8e1a1cb230abfca2759b51e5603304aadaa3
2024-06-11 20:35:43 -05:00
Vishnu Ramadas
943d1f1453 mem-ruby: Fix deadlock in GPU_VIPER when issuing atomic requests
When a compute unit issues several requests to the same line,
the requests wait in the L2 if it is a writeback cache. If the line is
invalid initially and the first request is atomic in nature, the L2
cache issues a request to main memory. On data return, the cache line
transitions to M but doesn't wake up the other requests, resulting in
a deadlock. This commit adds a wakeup call on data return for atomics
and fixes potential deadlocks.

Change-Id: I8200ce6e77da7c8b4db285c0cc8b8ca0dfa7d720
2024-06-11 20:33:46 -05:00
Bobby R. Bruce
7e45ec0ff0 stdlib: Fix m5.ext.pystats __init__.py
Addresses Jason's complaint that wildcard imports should be avoided, in
accordance with PEP 8:
https://github.com/gem5/gem5/pull/996#discussion_r1621051601.

Change-Id: I72266df43d3ec4ede3f45c3e34e2e05e1990bd6b
2024-06-11 16:26:24 -07:00
Bobby R. Bruce
26a1d2ff0b misc,tests: Update daily test artifact actions to v4.0.0
Change-Id: I711fa36639e925ce958e0484a31ee6a4dde87dbe
2024-06-11 15:44:07 -07:00
Bobby R. Bruce
8fc4d3f793 misc,tests: Update daily test artifact actions to v4.0.0
Change-Id: I711fa36639e925ce958e0484a31ee6a4dde87dbe
2024-06-11 15:43:40 -07:00
Matt Sinclair
8a44e97a10 gpu-compute: Added functions to choose replacement policies for GPU (#1213)
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance and ShiPMem
replacement policies for TCC, TCP and SQC caches for GPU
2024-06-11 15:08:42 -05:00
Hoa Nguyen
d528a6bd2d arch: Flag all ISAs Unknown instruction as IsInvalid
Change-Id: I096138a157c4e2063c5f4f4324c21c1463dddb65
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-06-11 18:48:29 +00:00
Hoa Nguyen
369029d2be cpu: Add IsInvalid flag to StaticInstFlags
The IsInvalid flag indicates that the static instruction is not part
of the executing ISA and not part of m5's pseudo-instructions. This
flag provides a way to recognize an illegal instruction at the decode
stage.

Change-Id: I2779c6edcd8c5e6a77ea11cad3ff73bacb79d800
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-06-11 18:48:29 +00:00
Harry Chiang
d198380489 base: Fix uninitialized variable warning in symtab.test.cc (#1221)
This warning appeared when I added warning-related flags to LINKFLAGS
and turned on LTO to build unit tests.
2024-06-11 10:53:00 -07:00
Jarvis Jia
4fea51b598 Black format change
Change-Id: I95cbf5b97601ef3b6ca26bc1a1835305929ffcab
2024-06-10 22:52:56 -05:00
Jarvis Jia
8e268d42e2 gpu-compute: Provided m5ops support for gpu
Adding m5 stat dump and reset into python script through different exit
event

Change-Id: I662233ae71e2987d90af1fd0100e29036b2ef1c6
2024-06-10 20:56:08 -05:00
Jarvis Jia
cf5e316a92 Change black format
Change-Id: I3733b31baf187e0d3d38d971d9423a1b1afe2296
2024-06-10 16:33:18 -05:00
Jarvis Jia
3404369e68 gpu-compute: Added functions to choose replacement policies for GPU
Adding RP_choose function to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies  for TCC, TCP and SQC caches for GPU

Change-Id: I86cc41cca19f8e0d24d8cf015e2e034a1fc4bc43
2024-06-10 16:24:20 -05:00
Jarvis Jia
ccdfe00998 gpu-compute: Added functions to choose replacement policies for GPU
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies for TCC, TCP and SQC caches for GPU

Change-Id: If84a13babf1006ad41a557747c45d48ce2ce22a9
2024-06-10 16:22:41 -05:00
Jarvis Jia
3c8c783bc3 gpu-compute: Added functions to choose replacement policies for GPU
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies for TCC, TCP and SQC caches for GPU
2024-06-10 15:13:21 -05:00
Jarvis Jia
c158ce22bf gpu-compute: Added functions to choose replacement policies for GPU
Adding RP_choose function to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies  for TCC, TCP and SQC caches for GPU
2024-06-10 15:11:17 -05:00
Jarvis Jia
7c410797d1 Adding functions to choose replacement policies for GPU
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies  for TCC, TCP and SQC caches for GPU
2024-06-10 14:09:09 -05:00
Jarvis Jia
5b44eca64e Adding functions to choose replacement policies for GPU
Adding RP_choose functions to change replacement policies among
TreePLRU, LRU, FIFO, LFU, LIP, MRU, NRU, RRIP, SecondChance AND ShiPMem replacement
policies  for TCC, TCP and SQC caches for GPU
2024-06-10 13:58:24 -05:00
Alexander Richardson
3cfc550fc0 arch-arm,mem: Don't hardcode secure mode accesses for semihosting (#1200)
When accessing memory using functionalAccess(), the MMU could tell us to
use a nonsecure access even though the CPU is operating in secure mode.
I noticed this while trying to run a simple semihosting hello world with
the MMU+caches enabled and the semihosting calls ended up reading from
memory instead of the caches due to an S/NS mismatch.

See also https://github.com/gem5/gem5/pull/1198 which happens to also
mask the issue I saw, but I believe both changes are needed.

Change-Id: I9e6b9839b194fbd41938e2225449c74701ea7fee
2024-06-09 14:08:54 -07:00
Saúl
5cfad84a98 arch-riscv: correctly set dynamic VLEN for all arith instructions (#1187)
Some arithmetic instructions of the riscv vector extension were still
using the default VLEN=256 instead of the dynamic one through the
inherited `vlen` attribute. Most of them only use this to calculate the
effective index for the mask element like so:

```
uint32_t ei = i + vtype_VLMAX(vtype, vlen, true) * this->microIdx;
if (this->vm || elem_mask(v0, ei)) {
...
```

This means that instructions will wrongly compute the mask index in the
second and subsequent micro instructions (`microIdx` > 0). This commit
fixes this by adding the corresponding `set_vlen` snippet to the
affected instruction formats.
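
To see why a stale VLEN only goes wrong for `microIdx` > 0, here is a small
sketch of the index arithmetic. It assumes that `vtype_VLMAX(vtype, vlen,
true)` yields the number of elements per vector register, i.e. VLEN / SEW;
the helper names and the concrete VLEN and SEW values are illustrative:

```python
def elems_per_register(vlen_bits, sew_bits):
    """Elements that fit in one vector register (assumed VLEN / SEW)."""
    return vlen_bits // sew_bits


def mask_index(i, micro_idx, vlen_bits, sew_bits):
    """Effective mask element index: ei = i + VLMAX_per_reg * microIdx."""
    return i + elems_per_register(vlen_bits, sew_bits) * micro_idx


# Hardware configured with VLEN=512, code still assuming the default 256.
SEW = 32
first_uop_default = mask_index(3, 0, 256, SEW)
first_uop_dynamic = mask_index(3, 0, 512, SEW)
second_uop_default = mask_index(3, 1, 256, SEW)
second_uop_dynamic = mask_index(3, 1, 512, SEW)
```

For the first micro instruction both versions agree, because the VLMAX term
is multiplied by zero. For the second one the default-VLEN version points at
mask element 11 instead of 19.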

Change-Id: Ib041de972d6938490741a9fb4c214a6a5172c34e
2024-06-07 22:33:56 -07:00
Alexander Richardson
ec5881ec4e arch-arm: avoid using an uninitialized variable in MMU walks (#1198)
While running a simple Arm32 binary, I noticed that all memory
transactions were being marked as NS instead of S once I turn on the MMU
(even though the page tables have the NS bit set to zero). The result of
this was that semihosting calls were failing since they were using
functional accesses with the SECURE flag set, but the caches only
contained NS tagged entries so these accesses always read stale values
from DRAM.

Digging through the Arm MMU code it appears that the NS bit lookup was
being keyed off the `secureLookup` flag which is only used for long
descriptors. I believe 0c28712f51 should
have used isSecure instead of secureLookup. To avoid using these
uninitialized values in the future I wrapped the LPAE state in a
std::optional to ensure that it is only accessed once initialized.

Change-Id: Ibc406ed3f4cfa768f470e34a5eca3c1a2bf45cd8
2024-06-07 08:59:28 +01:00
Alexander Richardson
8e5fbcbbbb arch-generic: flush streams after semihosting write calls (#1202)
The SYS_WRITEC and SYS_WRITE0 calls are specified as writing to the
debug channel, so it is a reasonable expectation for these messages to
be visible immediately after the semihosting call.

Change-Id: I8e6e9a7aab593a59e82ecb9cf4603c18c7a8acbe
2024-06-06 09:57:36 +01:00
Alexander Richardson
abbb94af8b dev-arm: Fix -Wdeprecated-copy warning (#1197)
Clang warns as follows: `warning: definition of implicit copy
constructor for 'TranslResult' is deprecated because it has a
user-declared copy assignment operator`

Change-Id: Ic701d8522aac75d569f4f513f54de91f76a17e48
2024-06-05 12:36:38 +01:00
Ivana Mitrovic
a764b9be1c Revert "arch-x86: Fix TLB Assertion Error on CFLUSH" (#1196)
Reverts gem5/gem5#1080 as it is not a good fix.
2024-06-04 10:26:53 -07:00
Hoa Nguyen
40ef8f3afb dev: Remove an extra file in virtio (#1191)
`src/dev/virtio/VirtIORng 2.py` is identical to
`src/dev/virtio/VirtIORng.py`, and the former does not appear in any
build script.

Change-Id: I9c5f1b1a3809d1c7028b630c32310e540613e232

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-06-04 08:40:41 -07:00
dependabot[bot]
500bdc5302 misc: bump tqdm from 4.66.3 to 4.66.4 (#1192)
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.3 to 4.66.4.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 06:35:35 -07:00
dependabot[bot]
8c98dcb7cf misc: bump pre-commit from 3.7.0 to 3.7.1 (#1193)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.7.0
to 3.7.1.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 06:34:53 -07:00
Lukas Zenick
dad5c7b6f7 arch-x86: Fix TLB Assertion Error on CFLUSH (#1080)
Fixed the assertion statement in the cpu's translation.hh file so that
it doesn't fail the assertion if the cache is clean.

I compile this C code to `test`
```c
#include <stdio.h>

static inline void clflush(volatile void *p) {
    __asm__ volatile ("clflush (%0)" : : "r"(p) : "memory");
}

int main() {
    int data = 42;  // Example variable

    printf("Value before clflush: %d\n", data);

    clflush(&data);

    printf("Value after clflush: %d\n", data);

    return 0;
}
```
And run it with this script
`./build/X86/gem5.opt configs/learning_gem5/part1/two_level.py ./test`
In order to verify that it no longer fails the assertion check.

GitHub Issue: #862 
Change-Id: I6004662e7c99f637ba0ddb07d205d1657708e99f
2024-06-03 10:17:10 -07:00
Yu-Cheng Chang
5d3f1c3316 arch-riscv: Add rvZext to BranchTarget (#1173)
Ensure the upper xlen bits are all zeros

Change-Id: Id81330eced907d21320bc1af85ad38fb6e95f6b1
2024-06-03 10:03:51 -07:00
Ivana Mitrovic
fe8daa85d6 arch-vega: More scratch, accvgpr instructions (#1190)
- Implements the remaining scratch instructions which have corresponding
flat implementations
- Implements the remaining v_accvgpr instructions.
2024-06-03 08:56:32 -07:00
Matthew Poremba
00dcd5b0bc arch-vega: Implement literals for 64b dest operands
This feature has been available since Vega10 but was never implemented.
MI300 adds a few new instructions that make use of this more often
(e.g., v_mov_b64).

Change-Id: Ieeb7834462b76d77c0030f49622d0de09f90c9e4
2024-05-31 13:41:46 -07:00
Matthew Poremba
6c8caf83c6 arch-vega: Implement V_ACCVGPR_MOV_B32 instruction
This instruction is a simple move from accumulation register to
accumulation register. It is essentially a move with the accumulation
offset added to the register index.

Change-Id: Ic93ae72599b75c91213f56ebafe5bbd7b2867089
2024-05-31 09:32:35 -07:00
Matthew Poremba
7cdb69bf21 arch-vega: Fill in scratch insts to match flat/global
Flat, scratch, and global share the same instruction implementation with
different address calculations essentially. These instructions were
already implemented but not added to the decoder. This commit adds the
remaining scratch instructions which have a shared instruction
implementation.

Change-Id: I8f2e9ceb221294dce1b81c45745b642f0592d985
2024-05-31 09:32:34 -07:00
Bobby R. Bruce
fd0e6acc94 misc: Fix daily-tests
1. Typo in container.
2. Add compression level to minimize size of generated artifact.

Change-Id: I854e814162fb434ad50a64e3070b943905e4134b
2024-05-30 10:19:17 -07:00
Bobby R. Bruce
3b7307182f misc: Fix daily-tests
1. Typo in container.
2. Add compression level to minimize size of generated artifact.

Change-Id: I854e814162fb434ad50a64e3070b943905e4134b
2024-05-30 10:15:35 -07:00
Bobby R. Bruce
a0de33110b arch-vega: Fix clang comp error due to constant exp (#1183)
The lines `constexpr int B_I = std::ceil(64.0f / (N * M / H));` caused
the following compilation error in clang Version 16:

```
error: constexpr variable 'B_I' must be initialized by a constant
expression
```

`std::ceil` is not a const expression. Therefore instances of this
expression in instructions.hh have been replaced with a constant
expression friendly alternative.
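
The message does not say which alternative was used. One common way to take
a ceiling without calling a floating-point function is integer ceiling
division, sketched here in Python purely to show the arithmetic (the same
formula is usable in a C++ constant expression):

```python
import math


def ceil_div(numerator, denominator):
    """Ceiling of numerator / denominator using only integer arithmetic.

    Valid for a non-negative numerator and a positive denominator.
    """
    return (numerator + denominator - 1) // denominator


# Cross-check against the floating-point ceiling for every divisor of
# interest when the numerator is 64, as in the B_I expression.
matches = all(ceil_div(64, d) == math.ceil(64 / d) for d in range(1, 65))
```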

This is causing our compiler tests to fail:
https://github.com/gem5/gem5/actions/runs/9288296434/job/25559409142

Change-Id: I74da1dab08b335c59bdddef6581746a94107f370
2024-05-30 09:44:34 -07:00
NSurawar
efbfdeabd7 mem-ruby: Reduce handshaking between CorePair and dir (#1117)
Currently when data is downgraded by MOESI_AMD_Base-CorePair (e.g. due
to a replacement) this requires a 4-way handshake between the CorePair
and the dir. Specifically, the CorePair sends a message telling the dir
it'd like to downgrade, then the dir sends an ACK back, then the
CorePair writes the data back, and finally the dir ACKs the writeback.
This is very inefficient and not representative of how modern protocols
downgrade a request. Accordingly, this commit updates the downgrade
support such that the CorePair writes back the data immediately and then
the dir ACKs it.
Thus, this approach requires only a 2-way handshake.

Change-Id: I7ebc85bb03e8ce46a8847e3240fc170120e9fcd6

Co-authored-by: Neeraj Surawar <neerajs@hyrule.cs.wisc.edu>
2024-05-30 09:36:29 -07:00
Bobby R. Bruce
ef2a9110b7 misc: Merge .github dir develop -> stable (#1189) 2024-05-30 07:48:47 -07:00
Bobby R. Bruce
7c1207d5c4 misc: Another attempt to fix the merge-upload in for daily (#1188)
Change-Id: I6a6064ec3b5be4ac1f7d6cd3c2f6c0ca62d2cfcd
2024-05-30 07:45:35 -07:00
Bobby R. Bruce
bbdaae540c misc: Sync .github dir to stable (#1185) 2024-05-30 04:29:55 -07:00
Bobby R. Bruce
65b86cfac9 misc: Fix daily tests merge-artifacts (#1184) 2024-05-30 04:27:40 -07:00
Bobby R. Bruce
c0a64c4862 stdlib: Move SimStat specific variable sets out of loop
Change-Id: I6e1f4c01a52ae904e9a6c6582b5b413f94c1cb05
2024-05-30 03:03:29 -07:00
Bobby R. Bruce
7f0290985f stdlib,tests: Add Pyunit tests to check Pyunit nav, fix bugs
Bugs fixed of note:

1. The 'find' method has been fixed to work. This involved making
   'children' a class implemented per-subclass as required.
2. The 'get_all_stats_of_name' method has been removed. This was not
   working at all correctly and is largely doing what 'find' does.
3. The functionality to get an element in a vector via an attribute call
   (i.e., self.vector1 == self.vector[1]) has been implemented, thus
   maintaining backwards compatibility with the regular Python stats.
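
The attribute-style vector access in the last item could be sketched roughly as follows. This is a minimal illustration, not the PyStats code: the `Group` class here is a hypothetical stand-in, and the regex-based name splitting is an assumption.

```python
import re


class Group:
    """Hypothetical stand-in for a PyStats group holding named children."""

    def __init__(self, **children):
        self.__dict__.update(children)

    def __getattr__(self, name):
        # Only reached when normal lookup fails: map "vector1" to vector[1].
        match = re.fullmatch(r"([A-Za-z_]+)(\d+)", name)
        if match:
            base, index = match.group(1), int(match.group(2))
            if base in self.__dict__:
                return self.__dict__[base][index]
        raise AttributeError(name)


group = Group(vector=[10, 20, 30])
print(group.vector1 == group.vector[1])
```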

Change-Id: I31a4ccc723937018a3038dcdf491c82629ddbbb2
2024-05-30 03:02:06 -07:00
Bobby R. Bruce
2d4a213046 stdlib: Make PyStat SimStat inherit from Group
The SimStat Object is nothing more than a group of other SimStats and is
therefore logically a group. With this, functionality can be shared more
easily.

Change-Id: I5dce23a02d5871e640b422654ca063e590b1429a
2024-05-30 02:56:13 -07:00
ylldummy
7fa0342a7c mem-cache: Fix maybe-uninitialized warning (#1179)
When the compiler tries to inline a vector construction that uses a
default-constructed ReplaceableEntry as the default value, it can
complain about the uninitialized member.

Let's provide basic initialization to the members.

Example codepath:
 SignaturePathV2 constructor
 -> GlobalHistoryEntry() as init_value to AssociativeSet
 -> AssociativeSet initialize vector<Entry> with init_value
2024-05-29 10:41:35 -07:00
Bobby R. Bruce
6d174c43e4 stdlib: Expand and simplify PyStats __init__.py
1. Adds newly added PyStat classes to "__init__.py", ensuring they can
   all be accessed via a `m5.ext.pystats` import.
2. Simplifies the layout of "__init__.py" to just import all classes
   from all files.

Change-Id: I43bfc5e7ff1aec837e661905304c6fb10b00c90e
2024-05-29 08:22:49 -07:00
Bobby R. Bruce
62c1b9f9de tests: move Pystat pyunit tests to their own dir
Change-Id: Ifd3d88deebd4e72bdb8792405966d2e158e6366d
2024-05-29 08:16:38 -07:00
Bobby R. Bruce
b161172f65 arch-arm: Fix memory attributes of table walks (#1180)
This PR is doing the following:

1) Fixing memory attributes of partial translation entries (table walks)
2) Properly setting the cacheability of table walks
2024-05-29 08:07:44 -07:00
Nicholas Mosier
9027d5c3e2 arch-x86: set AF=0 when logical instructions execute (#1171)
Fix #1168. Prevent logical instructions like AND, OR, and TEST from
having input dependencies on the previous value of the Zaps register
(ZF+AF+PF+SF) by having them set AF=0, rather than not modifying AF.
2024-05-29 08:04:44 -07:00
shinezyy
7d339ee79b util: allow to override ARCH in cxx config's Makefile (#1165)
allow to override ARCH in cxx config's Makefile

gem5 issue: #1164
2024-05-29 07:55:48 -07:00
Bobby R. Bruce
ce0bb4655c util-docker,gpu,gpu-compute: Improve GCN-GPU Dockerfile (#1170)
* The GCC used in the GCN-GPU images was increased from version 8 to
  version 10. This was necessary due to PR #1145 which made gem5 require
  GCC >=10. This patch was previously part of #1161 but has been merged
  into this PR.
* A patch has been applied to ROCm-OpenCL-Runtime to fix a linking error
  in which there were multiple definitions of `ret_val`. This issue is
  highlighted here:
  https://github.com/ROCm/ROCm-OpenCL-Runtime/issues/113.
  This was previously part of #1161 but has been moved into this PR.
* The Dockerfile's `RUN` commands (built to layers in the Docker image)
  have been refactored so sources and built objects are deleted in the
  same RUN command as where they were built and installed. This reduces
  the size of the image substantially: from 16.3GB down to 6.6GB.
* The `apt upgrade` has been removed. This step (previously at the start
  of the file) did nothing of importance. Removing it saves time when
  building the image and reduces the size of the image by a small amount.
* `--depth=1` is used when cloning repositories so the entire commit
  tree is not pulled each time. This saves some time when building the
  image.
* `apt -y update` has been added where `apt -y install` is used so
  CACHED image layers do not become an issue in the future if the image
  were to be rebuilt.
2024-05-29 07:54:28 -07:00
Nicholas Mosier
a54d3198a8 arch-x86: break 32/64-bit mov's input dependency on prior dest value (#1172)
Fix #1169. Break the input dependency of 32-bit and 64-bit 'mov'
micro-ops on the prior value in the destination register. Such a
dependency is required for 8-bit and 16-bit moves, as they do not
completely overwrite the value in the destination register. However, it
is unnecessary for 32-bit moves (which implicitly zero the upper 32
bits) and 64-bit moves.
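
The operand-size behavior described above can be illustrated with a small model of an x86-64 register write. This is only a sketch under simplifying assumptions (high-byte registers such as AH are ignored) and the function name `write_reg` is hypothetical.

```python
MASK64 = (1 << 64) - 1


def write_reg(old: int, value: int, size_bytes: int) -> int:
    """Model an x86-64 register write of the given operand size."""
    if size_bytes in (1, 2):
        # Narrow writes merge into the old value, so the old value is an input.
        mask = (1 << (8 * size_bytes)) - 1
        return (old & ~mask & MASK64) | (value & mask)
    if size_bytes == 4:
        # 32-bit writes zero the upper 32 bits: the old value is irrelevant.
        return value & 0xFFFFFFFF
    if size_bytes == 8:
        # 64-bit writes overwrite the whole register.
        return value & MASK64
    raise ValueError("unsupported operand size")


old = 0x1122334455667788
print(hex(write_reg(old, 0xAB, 1)), hex(write_reg(old, 0xDEADBEEF, 4)))
```

Only the 1-byte and 2-byte branches read `old`, which is why the dependency can be dropped for the wider sizes.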

This patch implements the fix by adding a new code template field inside
the generated constructors of X86StaticInst's, called `invalidate_srcs`,
which instruction implementations like `mov` can use to conditionally
invalidate particular source registers as needed. In `mov`'s case, this
is when the data size is 32 or 64 bits.

Change-Id: Ib2aef6be6da08752640ea3414b90efb7965be924
2024-05-29 07:54:03 -07:00
Bobby R. Bruce
8404ae276b misc: Sync .github develop -> stable (#1181) 2024-05-29 07:48:13 -07:00
Matthew Poremba
07f6b7c59c dev-amdgpu: Fix pending PCI RLC doorbell (#1157)
SDMA RLC queues do not currently remove their doorbell mapping. This can
cause issues re-registering the queue and prevents the pending doorbells
feature from working. In addition the data value of the doorbell (the
ring buffer rptr) is not saved, leading to UB when this workaround is
used.

This commit removes the doorbell mapping from the gpu device when the
SDMA engine unmaps an RLC queue and copies the next doorbell value to
the pending packet as was originally intended.

Change-Id: Ifd551450f439c065579afcf916f8ff192e7598ab
2024-05-29 07:15:46 -07:00
Giacomo Travaglini
c4ed23a10b arch-arm: Implement HCR_EL2 force broadcast for EL1&0 TLBIs (#1175)
According to the Arm architecture reference manual, it is possible to
force the broadcast of the following TLBIs:

AArch64: TLBI VMALLE1, TLBI VAE1, TLBI ASIDE1, TLBI VAAE1, TLBI VALE1,
TLBI VAALE1, IC IALLU, TLBI RVAE1, TLBI RVAAE1, TLBI RVALE1, and TLBI
RVAALE1.

AArch32: BPIALL, TLBIALL, TLBIMVA, TLBIASID, DTLBIALL, DTLBIMVA,
DTLBIASID, ITLBIALL, ITLBIMVA, ITLBIASID, TLBIMVAA, ICIALLU, TLBIMVAL,
and TLBIMVAAL.

Via the HCR_EL2.FB bit

Change-Id: Ib11aa05cd202fadfbd9221db7a2043051196ecbd

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-29 11:54:24 +01:00
Giacomo Travaglini
e9dcb906b4 arch-arm: Set memory attributes for partial table entries
Change-Id: I80adcead410f226c323e4d781adb1ff17a386986
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-29 09:30:58 +01:00
Giacomo Travaglini
09f0c20be2 arch-arm: Use HCR_EL2.CD for stage2 table walks
When determining the cacheability of table walks,
SCTLR.C should only be used in stage1 EL1&0 translations.
Stage2 translations should rely on HCR_EL2.CD instead

Change-Id: I1b0830bc3fb5086f68d7a7a1560c7fed5d126d28
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-29 09:30:58 +01:00
Giacomo Travaglini
854662f48f arch-arm: Check OSH domain as well for cacheability attribute
Make table walks uncacheable if marked as uncacheable
in either inner or outer shareable domain

Change-Id: I5898a3b91b5b919e0beda6c6fe896394e3ab94df
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-29 09:30:58 +01:00
Bobby R. Bruce
4acc20dac1 misc,tests: Download all gem5 bins via one artifact (#1178)
The Daily Tests are failing when downloading artifacts as part of the
`testlib-long-tests` matrix:
https://github.com/gem5/gem5/actions/runs/9250821764/job/25448583827.

It _could_ be that since upgrading to `actions/download@v4`, we're
hitting a limit as the `testlib-long-tests` are downloading every gem5
binary compiled in the `build-gem5` step, each with its own
`actions/download` step, for every test.

This change adds a small job after `build-gem5` which creates a merged
artifact containing all the gem5 binaries then uses this to lessen the
number of times this action is called in such a short period of time.

Even if the bug still persists, this solution is neater than what was
there previously.
2024-05-28 12:55:30 -07:00
Matthew Poremba
e82cf20150 mem-ruby: Remove VIPER StoreThrough temp cache storage (#1156)
StoreThrough in VIPER when the TCP is disabled, GLC bit is set, or SLC
bit is set will bypass the TCP, but will temporarily allocate a cache
entry seemingly to handle write coalescing with valid blocks. It does
not attempt to evict a block if the set is full and the address is
invalid. This causes a panic if the set is full as there is no spare
cache entry to use temporarily for DataBlk manipulation. However,
a cache block is not required for this.

This commit removes using a cache block for StoreThrough with invalid
blocks as there is no existing data to coalesce with. It creates
no-allocate variants of the actions needed in StoreThrough and pulls the
DataBlk information from the in_msg instead. Non-invalid blocks do not
have this panic as they have a cache entry already.

Fixes issues with StoreThroughs on more aggressive architectures like
MI300.

Change-Id: Id8687eccb991e967bb5292068cbe7686e0930d7d
2024-05-28 11:02:00 -07:00
Ivana Mitrovic
5ec1acaf5f arch-arm: TLBIs targeting EL2 regime are executable from S state (#1176)
Those AArch64 instructions/registers were labelled as executable
from EL3 only if SCR_EL3.NS == 1. This is not valid anymore
after the introduction of FEAT_SEL2
2024-05-28 10:54:18 -07:00
Matthew Poremba
1dfaa224ff arch-vega: Fix GCC 13 build errors (#1162)
The new static analysis in GCC 13 finds issues with operand.hh. This
commit fixes the error so that gem5 compiles when BUILD_GPU is true.

Change-Id: I6f4b0d350f0cabb6e356de20a46e1ca65fd0da55
2024-05-28 07:58:28 -07:00
Giacomo Travaglini
27c7647fee arch-arm: Use monWrite a shorter version
Change-Id: I8da8a39238eb100315d3df496f55a6bf3da948c6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-28 11:20:52 +01:00
Giacomo Travaglini
6995a99d77 arch-arm: TLBIs targeting EL2 regime are executable from S state
Those AArch64 instructions/registers were labelled as executable
from EL3 only if SCR_EL3.NS == 1. This is not valid anymore
after the introduction of FEAT_SEL2

Change-Id: Ie7b56f3fe779c3a99d4f0ef937c7c8ec0530b00e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-28 11:20:32 +01:00
Giacomo Travaglini
10dbfb8bb7 arch-arm: Rewrite performTlbi to use map instead of switch (#1166)
This makes it easier for TLBI instructions to share code. Common
code (in the form of tlbi* functions) closely matches the
instruction description in the Arm pseudocode
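
The switch-to-map idea can be sketched as a dispatch table in which several encodings point at the same helper. This is a language-neutral illustration in Python, not the gem5 C++ code: the helper names and returned strings are placeholders, and real TLBI variants differ in details this sketch ignores.

```python
from typing import Callable, Dict


def tlbi_all(value: int) -> str:
    """Placeholder helper for invalidate-all style operations."""
    return "invalidate all entries"


def tlbi_va(value: int) -> str:
    """Placeholder helper for invalidate-by-address style operations."""
    return f"invalidate VA {value:#x}"


# Several operations can share one helper via the table.
TLBI_HANDLERS: Dict[str, Callable[[int], str]] = {
    "TLBI VMALLE1": tlbi_all,
    "TLBI VAE1": tlbi_va,
    "TLBI VALE1": tlbi_va,
}


def perform_tlbi(op: str, value: int) -> str:
    handler = TLBI_HANDLERS.get(op)
    if handler is None:
        raise KeyError(f"unimplemented TLBI: {op}")
    return handler(value)


print(perform_tlbi("TLBI VAE1", 0x1000))
```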

Change-Id: If10c22fb4a7df2bcd0335e9761286ad3c458722b

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-28 11:03:07 +01:00
Bobby R. Bruce
8f0ed46061 stdlib: Move _m5.stats.processDumpQueue to call-once
This commit addresses Jason's comment
(https://github.com/gem5/gem5/pull/996#discussion_r1613870880) which
highlighted that putting the `_m5.stats.processDumpQueue` call in the
iteration through the `root` object in `get_simstat` caused this
function to be potentially called many times when it only needs to be
called once. This change moves this call to just before this iteration,
so it will only be called once (if required) per `get_simstat`
execution.

Change-Id: I16908b6dee063a0df7877a19e215883963bfb081
2024-05-27 08:35:21 -07:00
Yu-Cheng Chang
4f6fdbf8bf arch-riscv: Fix c.jalr and c.jr instruction (#1163)
Bit 0 of the register value should be cleared to 0 to form the jump
address. Mishandling the jump address may cause an infinite run or a
segmentation fault.
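
As a minimal sketch of the rule above (the function name `jump_target` is hypothetical, and XLEN is assumed to be 64 by default), clearing bit 0 amounts to:

```python
def jump_target(rs1_value: int, xlen: int = 64) -> int:
    """Clear bit 0 of the register value to form the jump target."""
    mask = (1 << xlen) - 1
    return rs1_value & ~1 & mask


print(hex(jump_target(0x80000001)))
```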

gem5 issue: https://github.com/gem5/gem5/issues/981
2024-05-25 20:18:42 -07:00
Lukas Zenick
96fbc2068a util, ext: Fix building TLM (#1105)
Fixed the issue that did not allow building TLM.

Build commands:
```bash
scons build/ARM/gem5.opt
scons setconfig build/ARM USE_SYSTEMC=n
scons --with-cxx-config --without-python --without-tcmalloc build/ARM/libgem5_opt.so
cd util/tlm
scons
```
Following this README, I tested it successfully with the simple examples:
https://gem5.googlesource.com/public/gem5/+/master/util/tlm/README

GitHub Issue: #591 
Change-Id: If07fae2eb20ad62627e733573f61bc42d594f970

---------

Co-authored-by: Ivana Mitrovic <ivanamit91@gmail.com>
2024-05-24 13:29:58 -07:00
Bobby R. Bruce
0f6bd24c95 stdlib: Fix get_simstat to accept lists of SimObjects
Change-Id: Iae12a0ac88f9646acb00e73d70f83b1e2ff94ac9
2024-05-23 14:54:59 -07:00
Bobby R. Bruce
c509615ec9 tests: Pretty print Dict when comparing for PyStats
Change-Id: I1d93453072d12aa2dd40066f364723de1225b4e0
2024-05-23 14:54:59 -07:00
Bobby R. Bruce
45b26ce465 stdlib: Specialize scalar tests; use 'pystat', not 'simstat'
1. Tests here for the Scalar tasks are named appropriately. Not just
   generic "SimStats tests".
2. We remove 'simstat' terminology. The correct word is "Pystats".

Change-Id: Idebc4e750f4be7f140ad6bff9c6772f580a24861
2024-05-23 14:54:59 -07:00
Bobby R. Bruce
c0a1fa33fe stdlib: Improve PyStat support for SimObject Vectors
Change-Id: Iba0c93ffa5c4b18acf75af82965c63a8881df189
2024-05-23 14:54:59 -07:00
Bobby R. Bruce
178679cbfd stdlib: Add SparseHist to PyStats
This is inclusive of tests to ensure they have been implemented correctly.

Change-Id: I5c84d5ffdb7b914936cfd86ca012a7b141eeaf42
2024-05-23 14:54:59 -07:00
Bobby R. Bruce
b5e8804cd4 stdlib: Remove 'Vector' group subclass
This was not used and easily confused with the other 'Vector' in
PyStats.

Change-Id: I9294bb0ae04db0537c87a5f50ce023fc83d587b8
2024-05-23 14:54:59 -07:00
Bobby R. Bruce
6ae3692057 stdlib: Add Vector2d to PyStats
Change-Id: Icb2f691abf88ef4bac8d277e421329edb000209b
2024-05-23 14:54:59 -07:00
Bobby R. Bruce
a3af819d82 stdlib: Remove PyStats Accumulator
This appears to have no equivalent type in the CPP stats and was never
utilized in PyStats.

Change-Id: Ia9afc83b4159eb1ab2c6f44ec0ad86cd73f2a4f8
2024-05-23 14:54:59 -07:00
Bobby R. Bruce
940e1d2063 stdlib: Fix PyStats Distribution to be vector of Scalars
As Distribution inherits from Vector, it should be constructed with
a Dictionary of scalars (in our implementation, a dictionary mapping the
vector position's unique id for each bin and the value of that bin).

Change-Id: Ie603c248e5db4b6dd7f71cc453eebd78793f69a3
2024-05-23 14:54:59 -07:00
Bobby R. Bruce
252dbe9c72 stdlib: Add tests for PyStats's Vector and fix bugs
The big thing missing from the Vector stats was that each position in
the vector could have its own unique id (a str, float, or int) and each
position in the vector can have its own description. Therefore, to add
this the Vector is represented as a dictionary mapping the unique ID to
a PyStat Scalar (which can have its own unique description).
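
The dictionary-based representation could look roughly like the sketch below. The class and field names are hypothetical simplifications for illustration and are not the actual PyStats definitions.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass
class Scalar:
    value: float
    description: Optional[str] = None


@dataclass
class Vector:
    # Maps each position's unique id (str, float, or int) to its Scalar.
    value: Dict[Union[str, float, int], Scalar]

    def size(self) -> int:
        return len(self.value)


vec = Vector({0: Scalar(1.0, "first bin"), "overflow": Scalar(4.0, "overflow bin")})
print(vec.value["overflow"].description)
```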

Change-Id: I3a8634f43298f6491300cf5a4f9d25dee8101808
2024-05-23 14:54:59 -07:00
Bobby R. Bruce
3c86175d08 stdlib: Rename BaseScalarVector -> Vector
This isn't a true Base class, it's just a Vector. In gem5 all Vectors
are Scalar Vectors. This change simplifies the naming.

Change-Id: Ib8881d854ab18de6acbf0fb200db2de6a43621a7
2024-05-23 14:54:58 -07:00
Matthew Poremba
1616d34003 arch-vega: Template MFMA instructions (#1128)
templated
- v_mfma_f64_16x16x4f64

added support for
- v_mfma_f32_32x32x2f32
- v_mfma_f32_4x4x1_16b_f32
- v_mfma_f32_16x16x4f32

[formula for gprs needed](https://github.com/ROCm/amd_matrix_instruction_calculator)

[formulas for register layouts and lanes used in computation](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf)

Change-Id: I15d6c0a5865d58323ae8dbcb3f6dcb701a9ab3c7
2024-05-22 08:53:25 -07:00
Ivana Mitrovic
1a68d71f07 util: Update gem5-resource-manager requirements (#1154)
Bumps [requests](https://github.com/psf/requests) from 2.31.0 to 2.32.0.

Change-Id: I34df01fdd32cb300c4efc8cf072c0aa1137371bc
2024-05-22 07:32:52 -07:00
Bobby R. Bruce
0b2243bb0a misc: Sync stable .github dir with develop (#1155) 2024-05-21 11:56:17 -07:00
Bobby R. Bruce
52fbc8ebcf misc: Revert Dramsys Ubuntu to 22.04 to compile in gcc <13 (#1146)
Until https://github.com/gem5/gem5/issues/1121 is fixed, this change
will ensure our Weekly tests pass.
2024-05-21 10:57:16 -07:00
Bobby R. Bruce
6adb7a8637 misc: Remove gcc 8 support, gem5 support GCC >= v10 (#1145)
note: Due to #556 / #555, we don't support GCC 9. This PR removes gcc-8
which means gem5 would support GCC >= version 10.

The reason for removing gcc-8:

1. We already dropped support for gcc-9. I don't see any good reason to
support anything <9 as a result.
2. GCC 8 is relatively old, and we're probably supporting a bit too many
compiler versions anyway. In Ubuntu 22.04, gcc-11 is downloaded by
default with `apt`. It doesn't seem many systems are still using gcc-8.
3. There is a weird compiler bug in gcc-8 which causes failure when
compiling gem5 since the inclusion of #1123. The error received is as
follows:

```sh
In file included from src/arch/riscv/tlb.hh:42,
                 from src/arch/riscv/mmu.hh:45,
                 from build/ALL/arch/riscv/generated/exec-g.cc.inc:14,
                 from build/ALL/arch/riscv/generated/generic_cpu_exec.cc:5:
src/arch/riscv/utility.hh: In instantiation of ‘FloatType gem5::RiscvISA::ftype(IntType) [with FloatType = float8_t; IntType = unsigned char]’:
build/ALL/arch/riscv/generated/exec-ns.cc.inc:38839:42:   required from ‘gem5::Fault gem5::RiscvISAInst::Vfwcvt_xu_f_vMicro<ElemType>::execute(gem5::ExecContext*, gem5::trace::InstRecord*) const [with ElemType = float8_t; gem5::Fault = std::shared_ptr<gem5::FaultBase>]’
build/ALL/arch/riscv/generated/exec-ns.cc.inc:38856:16:   required from here
src/arch/riscv/utility.hh:327:15: error: parameter ‘a’ set but not used [-Werror=unused-but-set-parameter]
 ftype(IntType a) -> FloatType
       ~~~~~~~~^
src/arch/riscv/utility.hh: In instantiation of ‘IntType gem5::RiscvISA::f_to_wui(FloatType, uint_fast8_t) [with FloatType = float8_t; IntType = short unsigned int; uint_fast8_t = unsigned char]’:
build/ALL/arch/riscv/generated/exec-ns.cc.inc:38838:49:   required from ‘gem5::Fault gem5::RiscvISAInst::Vfwcvt_xu_f_vMicro<ElemType>::execute(gem5::ExecContext*, gem5::trace::InstRecord*) const [with ElemType = float8_t; gem5::Fault = std::shared_ptr<gem5::FaultBase>]’
build/ALL/arch/riscv/generated/exec-ns.cc.inc:38856:16:   required from here
src/arch/riscv/utility.hh:570:20: error: parameter ‘a’ set but not used [-Werror=unused-but-set-parameter]
 f_to_wui(FloatType a, uint_fast8_t mode)
```

Note: This is currently causing our SST Daily tests to fail, and our
compiler tests to fail.
2024-05-21 10:56:41 -07:00
Harshil Patel
33cebe9376 dev: add reset wrap mode to mouse.cc (#1149)
This change fixes #1148 

I have only added an acknowledged return, as we don't have remote and
wrap mode so it can only be in stream mode.

Change-Id: I1882042d873ff0e9465c9491238554c8fbb9aa76
2024-05-21 10:55:03 -07:00
Robert Hauser
688f8fb03b arch-riscv: add exception code to DPRINTFS msg (#1153)
Change-Id: Ib5d1dc991f18256ec634c604c776629ea31317a9
2024-05-21 09:59:25 -07:00
Yu-Cheng Chang
5e20438c1c arch-riscv: Fix GDB connection failed after #1099 (#1152)
GDB connection failed after the PR[1] changed the index of CSR_FCSR to
MISCREG_FCSR itself. This causes an out-of-bounds error.

[1]: https://github.com/gem5/gem5/pull/1099

gem5 issue: https://github.com/gem5/gem5/issues/1151
Change-Id: I402febe5a3a9addf3d4821ad716ade14e227d5d7
2024-05-21 09:58:15 -07:00
Harshil Patel
0824d7f2cd Revert "cpu-kvm: Support perf counters on hybrid host architectures" (#1127)
Reverts gem5/gem5#1065

Reverting this change because this PR breaks X86 kvm as mentioned in the
issue #1126.
2024-05-21 08:14:10 -07:00
Giacomo Travaglini
6f4ba0b422 arch-arm: Add missing outer-shareable TLBIs to the list (#1147)
Those were not part of the performTlbi switch and simulation was
therefore panicking when they were encountered

Change-Id: Ifbe0b89e45539df4abc147ac5970b0caf0d9dfdc

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-20 19:24:45 -07:00
Chong-Teng Wang
13924336b1 arch-riscv: Fix viota instruction (#1137)
This commit fixes and refactors the implementation of viota. It also
overrides the generateDisassembly function in viota's macro/micro to
correctly print out the instruction when tracing/debugging.

For example, it changes from:
viota_m vd, vd, vs2, v0.t
to:
viota_m vd, vs2, v0.t
2024-05-20 12:19:22 -07:00
Matthew Poremba
82318e85af arch-x86: Improve KVM set XCR (#1138)
This adds two failsafes for conditions which may cause a panic on some machines. First,
check the host machine has the KVM XCR capability before calling getXCRs
or setXCRs. Second, ensure the x87 bit, which must always be one, will
always return at least one by modifying the return value in readMiscReg.

Change-Id: I5e778acc926a47443ef6cef29fabd84eb69bb9ba
2024-05-20 10:22:48 -07:00
Matthew Poremba
b91c9be102 arch-vega: Load/stores commonly used with 16b MFMA
This implements some missing loads and stores that are commonly used in
applications with MFMA instructions to load 16-bit data types into
specific register locations: DS_READ_U16_D16, DS_READ_U16_D16_HI,
BUFFER_LOAD_SHORT_D16, BUFFER_LOAD_SHORT_D16_HI.

Change-Id: Ie22d81ef010328f4541553a9a674764dc16a9f4d
2024-05-20 09:29:46 -05:00
Matthew Poremba
a4f0d9e6be arch-vega: Implement v_mfma_f32_32x32x8_bf16
Implement a bfloat16 MFMA. This was tested with PyTorch using
dtype=torch.bfloat16.

Change-Id: I35b4e60e71477553a93020ef0ee31d1bcae9ca5d
2024-05-20 09:28:58 -05:00
Matthew Poremba
10f8fdcd14 arch-vega: Unit test for MXFP types
Add a unit test for the MXFP types (bf16, fp16, fp8, bf8). These types
are not currently operated on directly. Instead they are cast to float
values and then arithmetic is performed. As a result, the unit test
simply checks that when we convert a value from MXFP type to float and
back that the values of the MXFP type match. Exact values are used to
avoid discrepancies with rounding.
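
The round-trip idea can be illustrated for bf16 alone. This is a sketch, not the gem5 unit test: the helper names are hypothetical, and narrowing by truncation is an assumption that only holds up here because the chosen values are exactly representable.

```python
import struct


def bf16_bits_to_float(bits: int) -> float:
    """Widen a 16-bit bf16 pattern to fp32 by appending 16 zero bits."""
    (value,) = struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))
    return value


def float_to_bf16_bits(value: float) -> int:
    """Narrow fp32 to bf16 by truncation (assumed; real rounding may differ)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    return bits >> 16


# Bit patterns for the exactly representable values 0.0, 1.0, -2.0, 96.0.
patterns = [0x0000, 0x3F80, 0xC000, 0x42C0]
round_trips = [float_to_bf16_bits(bf16_bits_to_float(p)) == p for p in patterns]
print(all(round_trips))
```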

Can be run using scons build/VEGA_X86/unittests.opt .

Change-Id: I596e9368eb929d239dd2d917e3abd7927b15b71e
2024-05-20 09:28:58 -05:00
Matthew Poremba
de11daec5f arch-vega: Implement F32 <-> F16 conversions
These instructions are used in some of the F16 MFMA example applications
to convert to/from floating point types.

Change-Id: I7426ea663ce11a39fe8c60c8006d8cca11cfaf07
2024-05-20 09:28:58 -05:00
Matthew Poremba
a062229ac3 arch-vega: Implement v_mov_b64
This instruction is new in MI300 and is used in some of the example
applications used to test MFMAs.

Change-Id: I739f8ab2be6a93ee3b6bdc4120d0117724edb0d4
2024-05-20 09:27:12 -05:00
Matthew Poremba
91955ae879 arch-vega: Decodings for all MFMA/SMFMACs up to MI300
This adds the decodings for all of the matrix fused multiply add (MFMA)
and sparse matrix fused multiply accumulate (SMFMAC) instructions up to
and including MI300. This does not yet provide the implementation for
these instructions, however it is easier and less tedious to add them in
bulk rather than one at a time.

Change-Id: I5acd23ca8a26bdec843bead545d1f8820ad95b41
2024-05-20 09:27:12 -05:00
Matthew Poremba
ce578c8831 arch-vega: MFMA templates for MXFP and INT8 types
The microscaling formats (MXFP) and INT8 types require additional size
checks which are not needed for the current MFMA template. The size
check is done using a constexpr method exclusive to the MXFP type,
therefore create a special class for MXFP types. This is preferable to
attempting to shoehorn into the existing template as it helps with
readability. Similarly, INT8 requires a size check to determine number of
elements per VGPR, but it is not an MXFP type. Create a special template
for that as well.

This additionally implements all of the MFMA types which have test cases
in the amd-lab-notes repository (https://github.com/amd/amd-lab-notes/).
The implementations were tested using the applications in the
matrix-cores subfolder and achieve L2 norms equivalent or better than
MI200 hardware.

Change-Id: Ia5ae89387149928905e7bcd25302ed3d1df6af38
2024-05-20 09:27:12 -05:00
Matthew Poremba
994c5ad1cc arch-vega: Add PackedReg helper class
This class can be used to load multiple operand dwords into an array and
then select bits from the span of that array. It handles cases where the
bits span two dwords (e.g., you have four dwords for a 128-bit value and
want to select bits 35:30) and cases where multiple values < 32-bits are
packed into a single dword (e.g., two bf16 values).
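
The two cases above can be sketched with a small bit-selection helper. This is an illustration only: `get_bits` is a hypothetical name, dword 0 is assumed to hold the least significant bits, and the real PackedReg interface may differ.

```python
from typing import List


def get_bits(dwords: List[int], hi: int, lo: int) -> int:
    """Select bits hi:lo (inclusive) from packed 32-bit dwords."""
    packed = 0
    for index, dword in enumerate(dwords):
        # Assumption: dword 0 holds bits 31:0, dword 1 holds bits 63:32, etc.
        packed |= (dword & 0xFFFFFFFF) << (32 * index)
    width = hi - lo + 1
    return (packed >> lo) & ((1 << width) - 1)


# Bits 35:30 straddle dword 0 (bits 31:30) and dword 1 (bits 35:32).
print(hex(get_bits([0xC0000000, 0x0000000F, 0, 0], 35, 30)))
```

The same helper covers two 16-bit values packed in one dword by selecting bits 15:0 and 31:16.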

This is most useful for packed arrays and instructions which have more
than two dwords. Beyond two dwords, the operator[] overload of
VectorOperand is not available requiring additional logic to select from
an operand. This helper class handles that additional logic itself.

Change-Id: I74856d0f312f7549b3b6c405ab71eb2b174c70ac
2024-05-20 09:27:12 -05:00
Matthew Poremba
2bb62a05e1 arch-vega: Implement v_cvt_pk_fp8_f32
This instruction serves as a test for the MXFP8 type.

Change-Id: I2ce30bf7f3a3ecc850a445aebdf971c37c39a79e
2024-05-20 09:27:12 -05:00
Matthew Poremba
d420a0a1e7 arch-vega: Add OCP microscaling formats
The open compute project (OCP) microscaling formats (MX) are used in the
GPU model. The specification is available at [1]. This implements a C++
version of MXFP formats with many constraints that conform to the
specification.

Actual arithmetic is not performed directly on the MXFP types. They
are rather converted to fp32 and the computation is performed. For most
of these types this is acceptable for the GPU model as there are no
instructions which directly perform arithmetic on them. For example, the
DOT/MFMA instructions may first convert to FP32 and then
perform arithmetic.

Change-Id: I7235722627f7f66c291792b5dbf9e3ea2f67883e
2024-05-20 09:27:12 -05:00
Marco Kurzynski
d5a734c252 arch-vega: Template MFMA instructions
templated
- v_mfma_f64_16x16x4f64

added support for
- v_mfma_f32_32x32x2f32
- v_mfma_f32_4x4x1_16b_f32
- v_mfma_f32_16x16x4f32

[formula for gprs needed](https://github.com/ROCm/amd_matrix_instruction_calculator)

[formulas for register layouts and lanes used in computation](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf)

Change-Id: I15d6c0a5865d58323ae8dbcb3f6dcb701a9ab3c7
2024-05-20 09:27:12 -05:00
Bobby R. Bruce
8b30d848e9 scons: Setup scons for gem5 only supporting gcc >=10
Change-Id: I66f83498a38def3d00d1c9e981aa90706ee20bbb
2024-05-20 07:05:08 -07:00
Bobby R. Bruce
ba1c22f143 misc,tests: Remove gcc-8 from compiler tests
GCC Version 8 is no longer supported by the gem5 project.

Change-Id: If657654299c1a018764d5f92e814ed5cd18c50f0
2024-05-20 06:27:45 -07:00
Bobby R. Bruce
d011fe47a9 util-docker: Upgrade sst-env docker image to use GCC 10
Previously was GCC 9 which is no longer supported by gem5.

Change-Id: Ife715446e3f1179d19db544953fbd6ded25f5b4d
2024-05-20 06:24:14 -07:00
Bobby R. Bruce
321c34d0bd util-docker: Remove GCC-8 from docker-compose.yaml
Change-Id: Ia1aba03412b138b05b569b08a146a2123f7142e4
2024-05-20 06:23:28 -07:00
Matthew Poremba
2b3beb92ff dev-amdgpu,gpu-compute,configs: MI300X (#1141)
Release of MI300X simulation capability:

- Implements the required MI300X features over MI200 (currently only
architecture flat scratch).
- Make the gpu-compute model use MI200 features when MI300X / gfx942 is
configured.
- Fix up the scratch_ instructions which seem to be preferred in
debug hipcc builds over buffer_.
- Add mi300.py config similar to mi200.py. This config can optionally
use resources instead of command line args.
2024-05-17 09:26:04 -07:00
Alexander Richardson
716fe6d31d arch-arm: Fix 32-bit semihosting ABI (#1142)
It appears we have been trying to read 64-bit arguments for ARM32 since
695583709b. I noticed that SYS_OPEN was
trying to read a really long string as the pathname argument and it
turned out it was reading from the wrong stack offset. With this change
I can successfully run some of the semihosting tests for ARM32.

Change-Id: Ie154052dac4211993fb6c4c99d93990123c2eacf
2024-05-16 10:28:45 -07:00
Alexander Richardson
6b34765d5d arch-generic: Avoid out-of-memory errors for bad semihosting calls (#1143)
In BaseSemihosting::readString() we were using the len argument to
allocate a std::vector without checking whether the value makes any
sense. This resulted in a std::bad_alloc exception being raised prior to
https://github.com/gem5/gem5/pull/1142 for my semihosting tests. This
commit prevents semihosting from reading more than 64K for string
arguments which should be more than sufficient for any valid code.
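
The length check can be sketched as below. This is not the gem5 implementation: `read_string`, the byte-string memory model, and raising an exception on oversized requests are all assumptions made for illustration.

```python
MAX_STRING_LEN = 64 * 1024  # the 64K limit described above


def read_string(memory: bytes, addr: int, length: int) -> str:
    """Hypothetical stand-in for a semihosting string read with a length check."""
    if length > MAX_STRING_LEN:
        # Reject the request before allocating anything sized by `length`.
        raise ValueError(f"semihosting string too long: {length} bytes")
    return memory[addr:addr + length].decode("ascii", errors="replace")


print(read_string(b"hello.txt\0", 0, 9))
```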

Change-Id: I059669016ee2c5721fedb914595d0494f6cfd4cd
2024-05-16 10:28:10 -07:00
Chong-Teng Wang
adb177dab6 arch-riscv: Fix vrgather instruction (#1134)
This commit fixes the implementation of vrgather instruction based on
rvv 1.0.

In section 16.4. Vector Register Gather Instructions,

> Vector-scalar and vector-immediate forms of the register gather are
also provided. These read one element from the source vector at the
given index, and write this value to the active elements of the
destination vector register. The index value in the scalar register and
the immediate, zero-extended to XLEN bits, are treated as unsigned
integers. If XLEN > SEW, the index value is not truncated to SEW bits.

The fix zero-extends the index value in the scalar register and the
immediate.
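
A small model of the vector-scalar form shows why the zero extension matters. It is a sketch under stated assumptions: vl equals VLMAX, inactive elements keep their old value, and the function name `vrgather_vx` and list-based registers are illustrative only.

```python
from typing import List


def vrgather_vx(vs2: List[int], rs1_value: int, mask: List[bool],
                vd: List[int], xlen: int = 64) -> List[int]:
    """Write vs2[index] (or 0 if out of range) to the active elements of vd."""
    # Treat the scalar as an unsigned XLEN-bit integer (zero extension).
    index = rs1_value & ((1 << xlen) - 1)
    vlmax = len(vs2)
    element = vs2[index] if index < vlmax else 0
    return [element if active else old for active, old in zip(mask, vd)]


print(vrgather_vx([10, 20, 30, 40], 2, [True, False, True, True], [0, 1, 2, 3]))
```

With the unsigned interpretation, a register holding all ones is a very large index that falls outside the vector and yields 0, rather than being read as -1.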
2024-05-16 10:12:35 -07:00
Hossam ElAtali
97a87a7c84 util: Fixed gem5img.py script (#990)
Made the script more robust to different names.

Co-authored-by: Hossam ElAtali <hossam.elatali@uwaterloo.ca>
2024-05-16 10:09:27 -07:00
Yu-Cheng Chang
321bd07163 cpu: Don't change to suspend if the thread status is halted (#1039)
In our gem5 model, there are four statuses that represent a thread
context: Active, Suspend, Halting and Halted


5641c5e464/src/cpu/thread_context.hh (L99-L117)

When initializing the gem5 instance, all of the thread contexts are set
to Halted. The status of a thread context will not be Active until the
Workload initialization starts it up, except with the StubWorkload. So if
the user uses the StubWorkload and the CPU is connected to the
model_reset port, the thread context of the CPU can possibly be activated.

The following are the steps that activate the thread context of the CPU
without Workload[1] initialization or lowering the model_reset port[2].

1. Raise the model_reset port (Change the state from Halted to Suspend)
5641c5e464/src/cpu/base.cc (L671-L673)

2. Post the interrupt to CPU (Change the state from Suspend to Active)
5641c5e464/src/cpu/base.cc (L231-L239)

Implementation of wakeup

SimpleCPU:

5641c5e464/src/cpu/simple/base.cc (L251-L259)

MinorCPU:

5641c5e464/src/cpu/minor/cpu.cc (L143-L151)

O3CPU:

5641c5e464/src/cpu/o3/cpu.cc (L1337-L1346)

This CL fixes the issue that occurs when the model_reset port is raised
on a CPU (to let the CPU sleep) that has not been activated by the
workload. If the CPU status is Halted, it should not change to Suspend,
so that the CPU cannot be woken up.
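
The state transitions described above can be sketched as a small Python model. The enum and the two helper functions are hypothetical names chosen for this example; the real logic lives in the C++ files referenced in this message.

```python
from enum import Enum, auto


class Status(Enum):
    ACTIVE = auto()
    SUSPEND = auto()
    HALTING = auto()
    HALTED = auto()


def on_model_reset_raised(status):
    """Status after the model_reset port is raised (hypothetical helper)."""
    # With the fix, a Halted context stays Halted so that a later
    # interrupt cannot wake it up.
    if status is Status.HALTED:
        return status
    return Status.SUSPEND


def on_interrupt_posted(status):
    """Status after an interrupt is posted (hypothetical helper)."""
    # Only a suspended context is woken up by an interrupt.
    return Status.ACTIVE if status is Status.SUSPEND else status
```

In this model a Halted context that sees a reset followed by an interrupt remains Halted, which is the behavior this CL aims for.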

Reference

The model_reset is introduced in the CL:
https://gem5-review.googlesource.com/c/public/gem5/+/67574/4

[1] Activate by workload (ARM example):

5641c5e464/src/arch/arm/fs_workload.cc (L101-L114)

[2] Lower the model_reset:

5641c5e464/src/cpu/base.cc (L191-L192)
5641c5e464/src/cpu/base.cc (L674-L685)

Change-Id: I5bfc0b7491d14369fff77b98b71c0ac763fb7c42
2024-05-16 10:02:53 -07:00
Matthew Poremba
6164835230 configs: GPUFS: MI300X
Add a config capable of simulating MI300X ISA (gfx942). This is similar
to the mi200.py config and uses the same scripts followed by some
tuneable parameters. This config optionally lets the user call the
runMI300GPU function with gem5 resources. This allows for something like
the following before a VIPER stdlib python is available:

```
import mi300
from gem5.resources.resource import obtain_resource

disk = obtain_resource("x86-gpu-fs-img")
kernel = obtain_resource("x86-linux-kernel-5.4.0-105-generic")
app = obtain_resource("square-gpu-test")

mi300.runMI300GPUFS("X86KvmCPU", disk, kernel, app)
```

Tested cold boot config, checkpoint create and restore, and using gem5
resources.

Change-Id: I50a13d7a3d207786b779bf7fd47a5645256b1e6a
2024-05-16 09:23:03 -07:00
Matthew Poremba
c1803eafac arch-vega: Architected flat scratch and scratch insts
Architected flat scratch is added in MI300, which stores the scratch base
address in dedicated registers rather than in SGPRs. These registers are
used by scratch_ instructions. These are flat instructions which
explicitly target the private memory aperture. These instructions have a
different address calculation than global_ instructions.

This change implements architected flat scratch support, fixes the
address calculation of scratch_ instructions, and implements decodings
for some scratch_ instructions. Previous flat_ instructions which happen
to access the private memory aperture have no change in address
calculation. Since scratch_ instructions are identical to flat_
instructions except for address calculation, the decodings simply reuse
existing flat_ instruction definitions.

Change-Id: I1e1d15a2fbcc7a4a678157c35608f4f22b359e21
2024-05-16 09:23:03 -07:00
Chong-Teng Wang
d48191d608 arch-riscv: Add RVV FP16 support (Zvfh & Zvfhmin) (#1123)
Add support for the following two extensions:

[Zvfh](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#185-zvfh-vector-extension-for-half-precision-floating-point):
Vector Extension for Half-Precision Floating-Point

[Zvfhmin](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#184-zvfhmin-vector-extension-for-minimal-half-precision-floating-point):
Vector Extension for Minimal Half-Precision Floating-Point

For instructions (`vfncvt[.rtz].x[u].f.w`) and (`vfwcvt.f.x[u].v`) which
will become defined when `SEW = 8`, a new template
`VectorFloatWideningAndNarrowingCvtDecodeBlock` is added and 8-bit
floating point type (`float8_t`) is defined.

The data type `float8_t` is introduced in the newer `3e` version of the
SoftFloat Package, however, the current version in use is `3d` which
does not include this definition. Despite this, `float8_t` is utilized
solely for constructing the `vfncvt[.rtz].x[u].f.w` and
`vfwcvt.f.x[u].v` instructions when `SEW = 8`. There are no operations
that directly manipulate data of the `float8_t` type.
2024-05-16 08:37:00 -07:00
Matthew Poremba
8be5ce6fc9 dev-amdgpu,configs,gpu-compute: Add gfx942 version
This is the version for MI300. For the most part, it is the same as
MI200 with the exception of architected flat scratch (not yet
implemented in gem5) and therefore a new version enum is required.

Change-Id: Id18cd7b57c4eebd467c010a3f61e3117beb8d58a
2024-05-15 12:08:41 -07:00
Harshil Patel
65976e4c6d util: Add GNU non executable line to x86 m5 (#1116)
- Add this line because not specifying a GNU non-executable stack was
throwing warnings when building m5 for Ubuntu 24.04

Change-Id: I620c508be4090804698391cff671ba5091b053d7
2024-05-14 11:06:13 -07:00
Lukas Zenick
b279e40cb7 configs: nvm sweep fix (#1114)
These changes to sweep and sweep_hybrid for NVM allow them to run. I'm
not an expert on this, so I'm not sure if these are technically correct,
but they no longer fail when running
`build/X86/gem5.opt configs/nvm/sweep.py` and `build/X86/gem5.opt
configs/nvm/sweep_hybrid.py`

GitHub Issue: #669
2024-05-13 14:51:39 -07:00
Zhantong Qiu
6b427a84f7 stdlib: change default exit event for SIMPOINT_BEGIN (#1085)
The SIMPOINT_BEGIN should do nothing by default since it might be used
in various cases.

On the
[mailing list](https://www.mail-archive.com/gem5-users@gem5.org/msg22383.html),
a user discovered a bug with the current
`simpoints-se-restore.py` example.
The bug is caused by the default behavior of the SIMPOINT_BEGIN exit
event.
When taking a checkpoint with `simpoints-se-checkpoint.py`, it stores
the future exit event scheduled at the beginning of the simulation. I
did not notice this when I wrote and tested the example script due to
the long print out log and my custom handler of the SIMPOINT_BEGIN exit
event.
During the restore, the SIMPOINT_BEGIN exit event was triggered right
before the region end, so it reset the stats before the final stats
dump. Therefore, the simulation time is 0, as the user discovered.
This patch fixes the bug.

Change-Id: I800dfbd28d7b2c842864a1ab7d84b8f8e17b9b3c
2024-05-13 14:11:00 -07:00
Ivana Mitrovic
10b24dc9a4 arch-arm: Implement FEAT_MPAM in CPU (#1082)
This PR implements FEAT_MPAM on the CPU side. We define the MPAM system
registers and a mechanism for tagging memory requests with the MPAM
information bundle as specified in existing documentation [1].

What this PR does *not* cover is the MPAM implementation in an MSC
(Memory System Component). This means that at the moment it is only
possible to have static partitioning schemes (via the
PartitioningPolicies already part of gem5), and there is currently no
way to dynamically program partitions at runtime.

[1]: https://developer.arm.com/documentation/ddi0487/latest/
2024-05-13 08:56:23 -07:00
Ivana Mitrovic
53245fa0e8 arch-riscv: Fix CSR instruction behavior 2nd attempts (#1099)
Quote from change[1]

> The RISC-V spec clarifies the CSR instruction operation, some of them
shall not read or write CSR by the hints of RD/RS1/uimm, but the
original version use the 'data != oldData' condition to determine
whether write or not, and always read CSR first.
See CSR instruction in spec:
Section 9.1 Page 56 of
https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMAFDQC/riscv-spec-20191213.pdf

|||Register operand|||
|--- |--- |--- |--- |--- |
|Instruction|rd is x0|rs1 is x0|Reads CSR|Writes CSR|
|CSRRW|Yes|-|No|Yes|
|CSRRW|No|-|Yes|Yes|
|CSRRS/CSRRC|-|Yes|Yes|No|
|CSRRS/CSRRC|-|No|Yes|Yes|
|||Immediate operand|||
|Instruction|rd is x0|uimm = 0|Reads CSR|Writes CSR|
|CSRRWI|Yes|-|No|Yes|
|CSRRWI|No|-|Yes|Yes|
|CSRRSI/CSRRCI|-|Yes|Yes|No|
|CSRRSI/CSRRCI|-|No|Yes|Yes|
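
The table above can be condensed into a small decision function. This is a minimal Python sketch for illustration; the function name `csr_access` and its argument layout are assumptions, not part of the gem5 change.

```python
def csr_access(op, rd, src):
    """Return (reads_csr, writes_csr) for a CSR instruction.

    op  -- mnemonic, e.g. "CSRRW" or "CSRRSI"
    rd  -- destination register number (0 means x0)
    src -- rs1 register number for register forms, uimm for immediate forms
    """
    op = op.upper()
    if op in ("CSRRW", "CSRRWI"):
        # Write forms always write; the read is skipped when rd is x0.
        return (rd != 0, True)
    if op in ("CSRRS", "CSRRC", "CSRRSI", "CSRRCI"):
        # Set/clear forms always read; the write is skipped when rs1 is x0
        # (register forms) or uimm is 0 (immediate forms).
        return (True, src != 0)
    raise ValueError(f"not a CSR instruction: {op}")
```

Note that the decision depends only on the register numbers and the immediate, never on a comparison such as 'data != oldData'.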

The issue caused Ubuntu to hang because the same status CSR is shared by
`mstatus`, `sstatus` and `ustatus`, and the same interrupt CSR is shared
by mip, sip and uip. We may need to read the original CSR value without
the effect of the mask, to avoid overwriting the bits of the other CSRs.
Ubuntu works again with this patch merged.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/67717
2024-05-10 10:21:48 -07:00
Matthew Poremba
e3c2a322a1 arch-vega: Fix SDWA dst select (#1120)
The destination select should take a value of the selection size (dword,
word, or byte) starting at bit 0, move that to the selected destination,
and then apply the unused constraint (DST_U) to the remaining word or
bytes. Currently the code is selecting the word/byte currently being
iterated over, rather than the least significant word/byte. As a result,
any selection that is not word 0 or byte 0 will be replaced with the
original destination value at those bits. This results in the wrong
value.

This commit changes the orig bits to be the original dest value at the
lowest word / byte location. Tested with the mfma_i32_16x16x16i8 example
which uses an SDWA V_OR_B32 to pack i8 values into VGPRs for the MFMA.
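
The destination select described above can be sketched as follows. This is a simplified Python model under stated assumptions: the function name, the string values for the unused-bits policy, and the explicit bit width/offset parameters are all chosen for this example and do not mirror the gem5 code.

```python
def sdwa_dst_select(result, orig_dst, sel_bits, sel_offset, unused="preserve"):
    """Place the low `sel_bits` of `result` at bit `sel_offset` of a dword.

    unused -- how the remaining bits are filled: "pad" (zeros), "sext"
              (sign-extend upper bits, zero lower bits), or "preserve"
              (keep the original destination bits).
    """
    field_mask = (1 << sel_bits) - 1
    # Always take the value from bit 0 of the result, not from the
    # word/byte position that is being written.
    field = result & field_mask
    placed = field << sel_offset
    keep_mask = 0xFFFFFFFF & ~(field_mask << sel_offset)

    if unused == "preserve":
        rest = orig_dst & keep_mask
    elif unused == "sext":
        sign = (field >> (sel_bits - 1)) & 1
        # Bits above the field follow the sign bit; bits below are zero.
        upper_mask = keep_mask & ~((1 << sel_offset) - 1)
        rest = upper_mask if sign else 0
    elif unused == "pad":
        rest = 0
    else:
        raise ValueError(unused)
    return (placed | rest) & 0xFFFFFFFF
```

Packing an i8 value into byte 2 of a register while preserving the other bytes then looks like `sdwa_dst_select(0xAB, 0x11223344, 8, 16)`.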

Change-Id: I54ed819479a25fa9276d29a8f14f0fea7fd71afe
2024-05-10 08:49:13 -07:00
Chong-Teng Wang
8c4d5f8e27 arch-riscv: Fix narrowing/widening type-convert instructions (#1079)
Correct ei calculation under VectorFloatWideningCvtFormat and
VectorFloatNarrowingCvtFormat.

Change-Id: I08699ffe3b9f8a7d4543023437626cc054344053
2024-05-09 10:17:15 -07:00
Harshil Patel
5c82447653 misc: Add resource versions to examples (#1110)
- Explicitly define the resource version in obtain_resource calls in
examples.

Change-Id: I74ab5d2f5e9bc73a0145585a0fe75f2ec905472f
2024-05-09 10:16:27 -07:00
Matthew Poremba
e4ebe29f43 util: Bump gpu-fs docker to ROCm 6.1 (#1097)
This version matches the disk image on gem5-resources.

Change-Id: I69a45ef290f0fdf2167ead4d67d4d789d30e0e91
2024-05-09 10:11:54 -07:00
Ivana Mitrovic
233135da81 mem-ruby: Fix NullPointerException in RubyRequest (#1118)
This PR includes a check for `m_pkt` being null and appropriately
handles that case. This issue was causing the Daily tests to fail.

Change-Id: I87142ca14ca4ab3d8306153a1cf34c2629a119ba
2024-05-09 08:46:13 -07:00
Giacomo Travaglini
0df5635bdf mem-ruby: Implement NS bit for CHI transactions (#1100)
This patch is adding the NS bit to CHI requests to make sure they are
properly tagged according to their security state


Change-Id: I33d3610edefbb5a05a6090e9125c35d4fb8bca58
Reviewed-by: Tiago Muck <tiago.muck@arm.com>

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-08 07:46:50 +02:00
Ivana Mitrovic
bc0f388316 util: Update gem5-resource-manager requirements (#1115)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.4.

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.3.8 to
3.0.3.

Change-Id: I88e97c3c546c8dcfaa8c310a537def850177f0b9
2024-05-07 17:33:51 -07:00
Ivana Mitrovic
06ab3f9b18 misc: Update version in optional-requirements (#1109)
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.2 to 4.66.3.
2024-05-07 17:33:30 -07:00
Roger Chang
c1713a0b18 arch-riscv: Fix CSR instruction behavior 2nd attempts
Change-Id: Id0a9a374281445c7821863f0f74564857d3d8fa2
2024-05-07 20:32:56 +08:00
Roger Chang
1a81144985 arch-riscv: Move FCSR implementation to isa.cc
Change-Id: I132edfe2c0ae4caecaa9e6209249662895b5c608
2024-05-07 20:32:56 +08:00
Matthew Poremba
6ed446e546 arch-x86: Add XCR0 register and add to X86KvmCPU (#1040)
The extended control registers were not being updated in the KVM thread
context nor updated in the KVM state. This was causing issues when
checkpointing since the XCR0 value was reverting to the default value
rather than what it was previously before the checkpoint. This was
causing multiple applications to crash due to executing instructions
which are now illegal instructions due to XCR0 being incorrect.

This commit adds the XCR0 as a misc register similar to the existing x86
control registers and adds all of the helper functions to access and set
the register value. It also adds support for updating the KVM CPU's
state with the register value and updating the thread context's misc reg
value so that it is checkpointed along with the other misc regs.

Note that this does *not* add support for XSAVE of the AVX state (i.e.,
the upper 128 bits of YMM registers). It does however fix the immediate
problem in issue #958 .

Change-Id: I97456c8b57cbc7b381bd4be94944ce6567a43c76
2024-05-06 09:58:07 -07:00
Matthew Poremba
cb47755e15 gpu: Consolidated fixes for v24.0 (#1103)
Includes fixes for several bugs reported via email, self found, and
internal reports. Also includes runs through Valgrind and UBsan. See
individual commits for more details.
2024-05-06 07:35:57 -07:00
Matthew Poremba
0d3d456894 gpu-compute: Invalidate Scalar cache when SQC invalidates (#1093)
The scalar cache is not being invalidated which causes stale data to be
left in the scalar cache between GPU kernels. This commit sends
invalidates to the scalar cache when the SQC is invalidated. This is a
sufficient baseline for simulation.

Since the number of invalidates might be larger than the mandatory queue
can hold and no flash invalidate mechanism exists in the VIPER protocol,
the command line option for the mandatory queue size is removed, which
is the same behavior as the SQC.

Change-Id: I1723f224711b04caa4c88beccfa8fb73ccf56572
2024-05-06 07:35:38 -07:00
Giacomo Travaglini
36c1ea9c61 mem-ruby: Implement MakeReadUnique in CHI (#1101)
Change-Id: I64cd3c62804cca184d68287fc099534e9205f2b8
Reviewed-by: Tiago Muck <tiago.muck@arm.com>

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-05-06 08:30:59 +02:00
Giacomo Travaglini
7c9925bafa arch-generic: Fix reading from special :semihosting-features file (#1089)
The implementation of SYS_FLEN was missing, which caused picolibc to
treat this file as not implemented. Additionally, there was a bug in the
SYS_READ call that was comparing the wrong variable against the passed
buffer length. It was comparing the current file position against the
buffer length instead of the number of written bytes. Finally, pos was
uninitialized, which could result in spurious errors.

Change-Id: I8b487a79df5970a5001d3fef08d5579bb4aa0dd0
2024-05-06 07:30:13 +01:00
dependabot[bot]
d834e8bf4e misc: bump mypy from 1.9.0 to 1.10.0 (#1092)
Bumps [mypy](https://github.com/python/mypy) from 1.9.0 to 1.10.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-03 21:00:51 -07:00
Matthew Poremba
3490d5bf18 gpu-compute: Add DebugFlag for LDS
This prints what values are read/written to LDS and the previous value
on write. This is useful for debugging problems with LDS instructions.

Change-Id: I30063327bec1a1a808914a018467d5d78d5d58b4
2024-05-03 14:31:17 -07:00
Matthew Poremba
29f63f630b dev-amdgpu: Correct missing GART warning
SDMA ptePde packets are generating a warning that a GART address is
missing, causing a wrong address to be clobbered by the operation.

This commit fixes this by converting the GART address when the queue is
running in privileged mode, which is the only mode allowed to use GART
addresses. This removes the warnings and writes to the correct memory
region.

Change-Id: I64acac308db2431c5996b876bf4cda704f51cf25
2024-05-03 14:31:17 -07:00
Matthew Poremba
8249d6d1cd arch-vega: Remove FP asserts in VOP3 lane manip insts
The VOP3 instruction encoding generally states that ABS/NEG modifiers in
the instruction encoding are only valid on floating point data types.
This is currently coded in gem5 to mean floating point *instructions*.
For untyped instructions like V_CNDMASK_B32, we don't actually know what
the data type is. We must trust that the compiler did not attempt to
apply these bits to non-FP data types.

This commit simply removes the asserts. The ABS/NEG modifiers are
therefore ignored which is consistent with the ISA documentation.
This is done on the lane manipulation instructions V_CNDMASK_B32,
V_READLANE_B32, and V_WRITELANE_B32 which are typically used to mask off
or move data between registers. Other bitwise instructions (e.g.,
V_OR_B32) keep the asserts as bitwise operations on FP types are
generally illegal in languages like C++.

Change-Id: I478c5272ba96383a063b2828de21d60948b25c8f
2024-05-03 14:31:17 -07:00
Matthew Poremba
2703fb5699 gpu-compute: Fix valgrind memleak complaints
Fixes several memory leaks, mostly of small and medium severity. Fixes
mismatched new/new[] and delete/delete[] calls.

Change-Id: Iedafc409389bd94e45f330bc587d6d72d1971219
2024-05-03 14:29:31 -07:00
Matthew Poremba
386fb3d1cc configs: Fix HSA packet processor address
The address has one too many zeros and is therefore placed in a memory
region usually used for system memory. As a result this causes failure
when trying to run a simulation with a huge amount of memory.

Change the address to be within the C000'0000h - FFFF'FFFFh X86 I/O hole
as was intended.
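
A range check of this kind can catch a misplaced device address early. The sketch below is illustrative only: the constant and function names are assumptions, and the addresses used in the usage note are made up rather than taken from the config.

```python
IO_HOLE_START = 0xC000_0000
IO_HOLE_END = 0xFFFF_FFFF  # inclusive upper bound of the X86 I/O hole


def in_x86_io_hole(addr, size=1):
    """True if the range [addr, addr + size) lies inside the I/O hole."""
    return IO_HOLE_START <= addr and addr + size - 1 <= IO_HOLE_END
```

For example, `in_x86_io_hole(0xD000_0000)` holds, while the same value with one extra zero, `0xD_0000_0000`, falls outside the hole and into the range usually used for system memory.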

Change-Id: I5d03ac19ea3b2c01a8c431073c12fa1868b3df24
2024-05-03 14:29:30 -07:00
Matthew Poremba
0faa9510f9 arch-vega,gpu-compute: Fix misc ubsan runtime errors
Three main fixes:
 - Remove the initDynOperandInfo. UBSAN errors and exits due to things
   not being captured properly. After a few failed attempts playing with
   the capture list, just move the lambda to a new method.
 - Invalid data type size for some thread mask instructions. This might
   actually have caused silent bugs when the thread id was > 31.
 - Alignment issues with the operands.

Change-Id: I0297e10df0f0ab9730b6f1bd132602cd36b5e7ac
2024-05-03 14:26:46 -07:00
Harshil Patel
1164f9b81e tests: update resource to use new checkpoint
- Updated the id of the simpoint-se-checkpoint resource.

Change-Id: Iab0b10da87b9790c24407e0edce7a18c38e0f48a
2024-05-03 10:55:04 -07:00
Yu-Cheng Chang
3a2a917a53 arch-riscv: Fix VCSR read behavior (#1076)
The VCSR should read the value with VXSAT and VXRM

<table class="tableblock frame-all grid-all fit-content center">
<caption class="title">Table 40. vcsr layout</caption>
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead>
<tr>
<th class="tableblock halign-right valign-top">Bits</th>
<th class="tableblock halign-left valign-top">Name</th>
<th class="tableblock halign-left valign-top">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-right valign-top"><p
class="tableblock">XLEN-1:3</p></td>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><p
class="tableblock">Reserved</p></td>
</tr>
<tr>
<td class="tableblock halign-right valign-top"><p
class="tableblock">2:1</p></td>
<td class="tableblock halign-left valign-top"><p
class="tableblock">vxrm[1:0]</p></td>
<td class="tableblock halign-left valign-top"><p
class="tableblock">Fixed-point rounding mode</p></td>
</tr>
<tr>
<td class="tableblock halign-right valign-top"><p
class="tableblock">0</p></td>
<td class="tableblock halign-left valign-top"><p
class="tableblock">vxsat</p></td>
<td class="tableblock halign-left valign-top"><p
class="tableblock">Fixed-point accrued saturation flag</p></td>
</tr>
</tbody>
</table>
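
Following the layout in the table, a vcsr read can be composed from the two fields. This is a minimal Python sketch; the helper names are assumptions for illustration and the reserved upper bits are simply left at zero.

```python
VXSAT_MASK = 0x1  # one bit, at bit 0
VXRM_MASK = 0x3   # two bits, placed at bits 2:1


def read_vcsr(vxrm, vxsat):
    """Compose the vcsr value from vxrm[1:0] and vxsat."""
    return ((vxrm & VXRM_MASK) << 1) | (vxsat & VXSAT_MASK)


def split_vcsr(vcsr):
    """Return (vxrm, vxsat) extracted from a vcsr value."""
    return ((vcsr >> 1) & VXRM_MASK, vcsr & VXSAT_MASK)
```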

Change-Id: I1227b920da78026951dfa548e41c8cc56da6caac
2024-05-03 09:53:43 -07:00
Yu-Cheng Chang
8b885222b1 arch-riscv: Fix interrupt and status CSR behavior (#1091)
From the spec:

> Instructions that access a non-existent CSR are reserved. Attempts to
access a CSR without appropriate privilege level raise
illegal-instruction exceptions or, as described in Section 13.6.1,
virtual-instruction exceptions. Attempts to write a read-only register
raise illegal-instruction exceptions. A read/write register might also
contain some bits that are read-only, in which case writes to the
read-only bits are ignored.

Setting a bit that is not in the mask should be ignored rather than
raising an illegal-instruction exception. The bits outside the mask are
`WPRI` for the xstatus CSRs; `RO` (priv v1.12 and later) or `WPRI` (priv
v1.11 and priv v1.10) for the xie CSRs; and `RO` (priv v1.12 and later),
`WPRI` (priv v1.11) or `WIRI` (priv v1.10) for the xip CSRs.
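
The intended write behavior amounts to a masked update. The following Python sketch shows the idea; the function name and parameters are hypothetical and only model the bit arithmetic, not the gem5 CSR code.

```python
def write_masked_csr(old_value, new_value, write_mask):
    """Return the CSR value after a write that honours `write_mask`."""
    # Bits outside the mask keep their old value; the write itself is
    # accepted rather than rejected with an exception.
    return (old_value & ~write_mask) | (new_value & write_mask)
```

For instance, `write_masked_csr(0b1010, 0b0101, 0b0011)` yields `0b1001`: the two low bits take the new value and the two high bits are left untouched.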

Note: The workload of `riscv-ubuntu-20.04-boot` uses the priv v1.10.

For more details, see the `RISC-V spec: Privileged Architecture`:
v1.10:
https://github.com/riscv/riscv-isa-manual/releases/tag/riscv-priv-1.10
v1.11(20190608):
https://github.com/riscv/riscv-isa-manual/releases/tag/Ratified-IMFDQC-and-Priv-v1.11
v1.12(20211213):
https://github.com/riscv/riscv-isa-manual/releases/tag/Priv-v1.12

Change-Id: I5d6e964e99b30b71da3dc267cd1575665d922633
2024-05-02 09:07:30 -07:00
Giacomo Travaglini
a6b20eae80 Merge branch 'develop' into semihosting-features-fix 2024-05-02 10:12:27 +01:00
Alexander Richardson
aa2fade12e Drop unrelated change 2024-05-01 18:00:09 +01:00
Alexander Richardson
e7566448fa arch-generic: More reliable special file name handling in semihosting (#1090)
Currently, the filesRootDir is prepended for all paths that do not start
with '/'. However, we should not be doing this for the special files :tt
and :semihosting-features. Noticed this while testing semihosting with a
non-empty filesRootDir.
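
The path handling described above can be sketched in Python as follows. The function name `resolve_path` and the use of `posixpath.join` are assumptions for this example; the actual change is in the C++ semihosting code.

```python
import posixpath

# Special semihosting file names that must never be prefixed.
SPECIAL_FILES = (":tt", ":semihosting-features")


def resolve_path(name, files_root_dir):
    """Return the host path for a semihosting file name."""
    # Special files and absolute paths are used as-is.
    if name in SPECIAL_FILES or name.startswith("/"):
        return name
    # Everything else is resolved relative to the configured root.
    return posixpath.join(files_root_dir, name)
```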

Change-Id: I156c8b680cb71cdc88788be3b0e93fc1d52e11e5
2024-05-01 17:41:55 +01:00
Alex Richardson
bb4c13143c arch-generic: Fix reading from special :semihosting-features file
The implementation of SYS_FLEN was missing, which caused picolibc to
treat this file as not implemented. Additionally, there was a bug in
the SYS_READ call that was comparing the wrong variable against the
passed buffer length. It was comparing the current file position against
the buffer length instead of the number of written bytes.
Finally, pos was uninitialized, which could result in spurious errors.

Change-Id: I8b487a79df5970a5001d3fef08d5579bb4aa0dd0
2024-04-30 16:28:06 -07:00
Yangyu Chen
666d1dd9a2 arch-riscv: Add Integer Conditional operations extension (Zicond) instructions (#1078)
This PR adds the RISC-V Integer Conditional Operations extension, which
is part of the RVA23U64 profile's mandatory base. The performance of
conditional move instructions in the micro-architecture is also an
interesting point to explore.

Zicond instructions added: czero.eqz, czero.nez
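
For reference, the semantics of the two instructions can be modeled in a few lines of Python. The `select` helper is an assumption added for illustration; it shows how a branchless conditional select can be built from the pair.

```python
def czero_eqz(rs1, rs2):
    """czero.eqz: result is 0 if rs2 == 0, otherwise rs1."""
    return 0 if rs2 == 0 else rs1


def czero_nez(rs1, rs2):
    """czero.nez: result is 0 if rs2 != 0, otherwise rs1."""
    return 0 if rs2 != 0 else rs1


def select(cond, a, b):
    """Branchless `cond ? a : b` built from the two czero results and an OR."""
    return czero_eqz(a, cond) | czero_nez(b, cond)
```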

Changes based on spec:

https://github.com/riscvarchive/riscv-zicond/releases/download/v1.0.1/riscv-zicond_1.0.1.pdf
2024-04-30 05:44:45 -07:00
Matthew Poremba
c495ff84ec util: Make x86-add-xcr0 work for testlib checkpoints
Change-Id: I7b93d7afc7710bd43412a77a204ce8838d0bfb4e
2024-04-29 11:45:55 -07:00
OdnetninI (Eduardo José Gómez Hernández)
17cbbd84ae cpu: Indirect predictor track conditional indirect (#1077)
As discussed in https://github.com/orgs/gem5/discussions/954: 

In the refactor made by commit f65df9b959, conditional indirect
branches are no longer updated in the indirect predictor.
This kind of branch does not exist in x86 or Arm, but it is
present in PowerPC.

This patch enables the indirect predictor to track this kind of
branch.
2024-04-29 11:38:22 +01:00
Giacomo Travaglini
b8e414f14d arch-arm: Implement FEAT_MPAM
With this patch we are adding the specific logic required to tag memory
requests from the PE with MPAM data. This is happening within the MMU
before a translation is completed

Change-Id: Ifecb285244dd469a639150d69a7e884fe3c441be
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-29 11:32:39 +01:00
Adrián Herrera
c988642ca8 arch-arm: Define system registers for FEAT_MPAM
This patch is adding FEAT_MPAM register definition/decoding.

Co-authored-by: Hristo Belchev <hristo.belchev@arm.com>
Co-authored-by: Giacomo Travaglini <giacomo.travaglini@arm.com>

Change-Id: I70483fcc758419365f4b3762479684c6c52f4d62
Signed-off-by: Adrián Herrera <adrian.herrera@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-29 11:32:39 +01:00
Giacomo Travaglini
65cf6b0a1c arch-arm: Cache the highestEL in the ISA object
This is for fast retrieval of the highest implemented
exception level

Change-Id: Id631c2b999d46a8b79570e4043ae04bc2b2e7531
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-29 10:38:08 +01:00
Alexander Richardson
1bb5d3b99e arch-riscv: Add support for RISC-V semihosting (#681)
See https://github.com/riscv-software-src/riscv-semihosting for the
current specification. Almost all code is shared with the Arm
implementation.

Tested by running some binaries built with
[picolibc](https://github.com/picolibc/picolibc).
2024-04-27 05:12:32 -07:00
Ivana Mitrovic
939d8e28df mem-cache: Fix TreePLRU num leaves error (#1075)
This PR fixes the error noted here #1073. 

Change-Id: I5d31c259ac5ee93f46f28b20eda4f58460ba8523
2024-04-26 20:22:20 -07:00
Harshil Patel
a6138777e2 tests: update versions for new checkpoints
Change-Id: I075110b68a7aa762fb060fcae7bb74ee8ec581b0
2024-04-26 09:54:43 -07:00
Matthew Poremba
a6f2c8afdb arch-x86: Add XCR0 register and add to X86KvmCPU
The extended control registers were not being updated in the KVM thread
context nor updated in the KVM state. This was causing issues when
checkpointing since the XCR0 value was reverting to the default value
rather than what it was previously before the checkpoint. This was
causing multiple applications to crash due to executing instructions
which are now illegal instructions due to XCR0 being incorrect.

This commit adds the XCR0 as a misc register similar to the existing x86
control registers and adds all of the helper functions to access and set
the register value. It also adds support for updating the KVM CPU's
state with the register value and updating the thread context's misc reg
value so that it is checkpointed along with the other misc regs.

Note that this does *not* add support for XSAVE of the AVX state (i.e.,
the upper 128 bits of YMM registers). It does however fix the immediate
problem in issue #958 .

A checkpoint upgrader is also provided to add the default value of XCR0
if the checkpoint tag is missing.

Change-Id: I97456c8b57cbc7b381bd4be94944ce6567a43c76
2024-04-25 11:24:53 -07:00
Robert Hauser
1b323a9571 systemc: remove if clause in Gem5ToTlmBridgeBase (#1059)
In the payload event queue in Gem5ToTlmBridgeBase, the phase is checked
twice for BEGIN_RESP. This commit removes the second if clause since it
is unnecessary.

Duplicate if clause in line 234 & line 256


dd2689905f/src/systemc/tlm_bridge/gem5_to_tlm.cc (L234-L267)

please correct me if I am missing something important
2024-04-25 11:15:30 -07:00
Nicholas Mosier
c679c9c127 cpu-o3: prioritize exiting threads when committing (#1056)
Fix #1055. Prioritize committing from exiting threads before we consider
other threads using the specified SMT commit policy. All instructions in
the ROB for exiting threads should already have been squashed. Thus,
this ensures that the ROB instruction queues for all exiting threads
will be empty at the end of the current cycle, avoiding the assertion
failure encountered in #1055.

Change-Id: Ib0178a1aa6e94bce2b6c49dd87750e82776639dc
2024-04-25 11:15:14 -07:00
Nicholas Mosier
51d546cb06 cpu-o3: Clear current macro-op in fetch if squashing after last micro-op (#1047)
Fix #1042. Clear the current fetch macro-op if the instruction
initiating the squash is the last micro-op in its macro-op.

Change-Id: I77f60334771277e47f19573d4067b3a7bc5488b2
2024-04-25 11:14:58 -07:00
Nicholas Mosier
66decb2e93 mem-ruby: Fix functional reads for MESI Three-Level messages (#1045)
Fix #1044. This patch adds checks for message types (PUTX_COPY, DATA,
DATA_EXCLUSIVE) that contain data blocks but were missing from the
original `functionalRead` method in MESI Three-Level messages.

Change-Id: I0cedc314166c9cc037bf20f5b7fef5552dd1253c
2024-04-25 11:14:37 -07:00
Harshil Patel
d75afeabb1 tests: fix persistence issue in pyunit tests (#1070)
- Fixed patching/mocking of functions and global variables to reset for
each test.
- Uncommented tests as they should pass now.
2024-04-25 10:03:10 -07:00
Giacomo Travaglini
83e55743e1 arch-arm: Add misc_accessor templated functions to read/write regs at different ELs (#1072)
A usual system register read/write pattern is something like the
following

```
switch(el) {
    case EL1:
        return tc->readMiscReg(REG_EL1);
    case EL2:
        return tc->readMiscReg(REG_EL2);
    case EL3:
        return tc->readMiscReg(REG_EL3);
}
```

To avoid repeating these switch statements all over gem5, we define
templated functions which take an accessor struct as a template
parameter. These accessors help populate the templated switch
construct. We provide the FAR register accessor as an example. An
accessor should define the following fields: (type, el0, el1, el2, el3)

Example:

```
struct FarAccessor
{
    using type = RegVal;
    static const MiscRegIndex el0 = NUM_MISCREGS;
    static const MiscRegIndex el1 = MISCREG_FAR_EL1;
    static const MiscRegIndex el2 = MISCREG_FAR_EL2;
    static const MiscRegIndex el3 = MISCREG_FAR_EL3;
};
```
2024-04-25 14:57:10 +01:00
Andreas Sandberg
85d21b5718 cpu-kvm: Support perf counters on hybrid host architectures (#1065)
Fix #1064 by adding support for hardware performance counters on hybrid
architectures like Intel Alder Lake.

Hybrid architectures have multiple types of cores, each of which requires
the instantiation of a separate performance counter. The KVM CPU's
PerfKvmCounter class was not aware of this, and only instantiated a
single performance counter, implicitly bound to the P-core only. This
meant that if gem5 ever ran on an E-core, the various hardware
performance counters would not get updated properly, in some cases
always reading zero (e.g., for the number of instructions executed).

This patch adds support for hybrid host architectures as follows. First,
we convert PerfKvmCounter into an abstract class, which has two concrete
implementations: SimplePerfKvmCounter and HybridPerfKvmCounter. The
former is used for non-hybrid architectures or for non-hardware
performance counters and is functionally equivalent to the prior
implementation of PerfKvmCounter. The latter is used for instantiating
hardware performance counters (i.e., of type PERF_TYPE_HARDWARE) on
hybrid host architectures. It does so by internally instantiating two
SimplePerfKvmCounters, one for a P-core and one for an E-core. Upon
read, it sums the results of reading the two internal counters.
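
The class structure described above can be modeled in Python as a rough sketch. The class names follow this message, but the `add` method and the `p_core`/`e_core` attribute names are assumptions; the real classes are C++ and wrap perf events rather than plain integers.

```python
from abc import ABC, abstractmethod


class PerfKvmCounter(ABC):
    """Abstract counter interface."""

    @abstractmethod
    def read(self):
        """Return the current counter value."""


class SimplePerfKvmCounter(PerfKvmCounter):
    """A single counter, as used on non-hybrid hosts."""

    def __init__(self):
        self.count = 0

    def add(self, events):
        # Hypothetical stand-in for the hardware counting events.
        self.count += events

    def read(self):
        return self.count


class HybridPerfKvmCounter(PerfKvmCounter):
    """Wraps one counter per core type and sums them on read."""

    def __init__(self):
        self.p_core = SimplePerfKvmCounter()
        self.e_core = SimplePerfKvmCounter()

    def read(self):
        return self.p_core.read() + self.e_core.read()
```

Because the sum is taken on read, events are counted regardless of which core type the thread happened to run on.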

Change-Id: If64fcb0e2fcc1b3a6a37d77455c2b21e1fc81150
2024-04-25 10:45:47 +01:00
Giacomo Travaglini
a3d030d161 arch-arm: Add the FAR_EL* register accessor
Use it accordingly in the faulting/exception logic

Change-Id: I2f6360d04698b6fb7188e776f1d6966e99ce19b1
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-25 09:45:54 +01:00
Giacomo Travaglini
19628e746d arch-arm: Add readRegister/writeRegister templates
This is adding two templated functions for reading/writing
system registers (MiscRegs). It is introducing them inside
a new misc_regs namespace.

Change-Id: I21233337c057673d46d1147971ebabbfc2c2bb6a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-25 09:45:00 +01:00
Giacomo Travaglini
01602cdf13 tests: Revert "tests: Move the arm+ruby tests to not use ALL" (#1069)
This reverts commit c1de2b8762. We revert
the commit as Ruby does not use get_runtime_isa anymore after [1]

[1]: https://github.com/gem5/gem5/pull/241

Change-Id: Iaac8d64194bbd53a9b1a57a796ff92f763c75a87
2024-04-24 21:01:53 -07:00
Bobby R. Bruce
b83a53e521 tests: Fix gem5 testlib compilation (#1063)
Prior to this patch the usage of KConfig was creating an empty config in
the case where a protocol was not specified.
2024-04-24 21:01:30 -07:00
Ivana Mitrovic
cc3655cdad arch-arm: Refactor PTW (#1060)
This PR is refactoring the Arm PageTableWalker in the following way:

1) Simplifying the currState handling logic (mainly the tear down)
2) Amending the TlbTestInterface APIs to use a RequestPtr reference
3) Use finalizePhysical even when MMU is off, which means allowing
memory mapped m5ops to work also in that circumstance
2024-04-24 21:00:42 -07:00
Nicholas Mosier
ed8a09303a mem-cache: Remove power-of-2 requirement for TreePLRU num leaves (#1061)
Remove the requirement in TreePLRU's implementation that the number of
leaves (i.e., the number of cache ways) be a power of two. Firstly, on
some recent processors, this is not the case---for example, Intel Golden
Cove's L1D has 12 ways. Secondly, the implementation of TreePLRU appears
to work just fine as-is with a way count that's not a power of two.

Change-Id: If2a27dc5bbe7a8e96684f79ce791df5c0b582230
2024-04-24 20:59:06 -07:00
Giacomo Travaglini
bf78579fa5 arch-arm: Change the TlbTestInterface to accept a RequestPtr
Now that the Request has been made an Extensible object, it
can carry within itself much more data. It makes sense
to pass it to the TlbTestInterface as more information about
the table walk can be extracted from it.

This is also aligning with the testTranslation utility which
is expecting a request reference as first argument.

Change-Id: I3dbc9a81d6b4bcc1801246ba7eb4136774d8f3c7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-24 18:12:36 +01:00
Giacomo Travaglini
89323c5112 arch-arm: Group testTranslation and finalizeTranslation together
They both make final checks to the VA->PA translation before
relinquishing control back to the translate client (usually
CPU code)

Change-Id: Ib0a9da25404248c22c6a240817d2f50f0913fdf7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-24 18:12:36 +01:00
Giacomo Travaglini
0c20eb3ec7 arch-arm: Call finalizePhysical even when MMU is off
The finalizePhysical is just checking if the physical
address falls within the m5op region (if using mmapped
m5ops). There's no reason why we shouldn't enable it
with virtual memory off

Change-Id: I5ab80fd4e7886743abd4b7d85937b72253b578d3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-24 18:12:36 +01:00
Giacomo Travaglini
a299d2db0c arch-arm: Move testWalk check within the fetchDescriptor
We also unify the fault handling logic; rather than cleaning
up the WalkerState in several places scattered throughout the
walking code, we handle faults in the top level method

Change-Id: Ia22fb6f27044ff445fffbab228777a48efa473cb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-24 18:12:36 +01:00
Giacomo Travaglini
6d0cb6eaa3 arch-arm: Pull out Request generation from the TableWalker::Port
Change-Id: Ie8c309bb79b4ce7c656428660c9e2effd58a89f0
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-24 18:12:36 +01:00
Giacomo Travaglini
e450cfef16 arch-arm: Move testWalk functionality to the TableWalker class
It's more efficient to pass a reference of the tester to the
TableWalkers. In this way a table walk check is tested directly
from the walkers instead of going through the MMU every time.

Change-Id: I9820dbabb8b551981005a65efa54a76b1a027541
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-24 18:12:36 +01:00
Giacomo Travaglini
bbe5bf2644 arch-arm: Simplify TableWalker::walk method
Change-Id: Ib823b3b577a70f6ec14de854cb9c250faa04e932
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-04-24 18:12:36 +01:00
Giacomo Travaglini
9d9b7848bb arch-arm: Properly compute EL even in stage2 walks
This is done in order to differentiate between EL0 (unprivileged) and
EL1. Effectively it won't change much as most of the decisions are
now taken according to the translation regime which will be the
same regardless (EL10)

Change-Id: I218037e9c19cf638aff05c51869e439204d9af69
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-04-24 18:12:36 +01:00
Nicholas Mosier
cf5ec880c9 cpu-kvm: Support overflows when migrating across hybrid cores
Add support for event overflows when the host thread migrates across
different types of cores on a hybrid host architecture. This patch
achieves this by simply halving the sample period for each performance
counter. Since there are two types of cores, this guarantees that an
overflow event will trigger before N events occur, where N is the
requested period (e.g., number of instructions to simulate). This
may result in many early triggers (up to log2(N)) before the requested
period is reached. However, gem5's existing bookkeeping logic already
handles this case properly: if fewer events than requested occurred,
it will set a new period (N - observed) and resume execution. This loop
will exit once N events have actually occurred.
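
The bookkeeping loop described above can be simulated as follows. This is
only a sketch: the function name, the "P"/"E" labels and the schedule
argument are hypothetical stand-ins, and no real perf counters are
involved.

```python
from itertools import cycle

def run_for(n_events, core_schedule):
    """Simulate n_events, where core_schedule yields "P" or "E" per event."""
    observed = 0
    triggers = 0
    cores = iter(core_schedule)
    while observed < n_events:
        remaining = n_events - observed
        # Halve the period so that one of the two per-core-type counters
        # overflows before `remaining` events have occurred in total.
        period = max(1, remaining // 2)
        counts = {"P": 0, "E": 0}
        while counts["P"] < period and counts["E"] < period:
            counts[next(cores)] += 1
        # Overflow: account for what actually ran, then re-arm if needed.
        observed += counts["P"] + counts["E"]
        triggers += 1
    return observed, triggers
```

Each round runs at most `2 * period - 1` events, which never exceeds the
remaining budget, so the loop stops at exactly `n_events` regardless of
how the thread migrates.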

Change-Id: Iff85237da1ae1aa25bc2045fbf9091726291fe36
2024-04-24 09:47:46 -07:00
Nicholas Mosier
30ea15009f cpu-kvm: Support perf counters on hybrid host architectures
Fix #1064 by adding support for hardware performance counters on hybrid
architectures like Intel Alder Lake.

Hybrid architectures have multiple types of cores, each of which requires
the instantiation of a separate performance counter. The KVM CPU's
PerfKvmCounter class was not aware of this, and only instantiated a
single performance counter, implicitly bound to the P-core only. This
meant that if gem5 ever ran on an E-core, the various hardware
performance counters would not get updated properly, in some cases
always zero (e.g., for the number of instructions executed).

This patch adds support for hybrid host architectures as follows. First,
we convert PerfKvmCounter into an abstract class, which has two
concrete implementations: SimplePerfKvmCounter and HybridPerfKvmCounter.
The former is used for non-hybrid architectures or for non-hardware
performance counters and is functionally equivalent to the prior
implementation of PerfKvmCounter. The latter is used for instantiating
hardware performance counters (i.e., of type PERF_TYPE_HARDWARE) on
hybrid host architectures. It does so by internally instantiating two
SimplePerfKvmCounters, one for a P-core and one for an E-core. Upon
read, it sums the results of reading the two internal counters.
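
A rough Python sketch of the class split described above. The three class
names come from this message; the `read` and `add` methods and the
attribute names are assumptions for illustration, and the real classes are
C++ wrappers around host perf counters.

```python
from abc import ABC, abstractmethod

class PerfKvmCounter(ABC):
    """Abstract interface shared by both counter flavours."""

    @abstractmethod
    def read(self) -> int:
        ...

class SimplePerfKvmCounter(PerfKvmCounter):
    """Single counter; stands in for one host perf event."""

    def __init__(self):
        self.count = 0

    def add(self, n):
        # Hypothetical helper standing in for the hardware counting events.
        self.count += n

    def read(self) -> int:
        return self.count

class HybridPerfKvmCounter(PerfKvmCounter):
    """One internal counter per core type; reads return the sum."""

    def __init__(self):
        self.p_core = SimplePerfKvmCounter()
        self.e_core = SimplePerfKvmCounter()

    def read(self) -> int:
        return self.p_core.read() + self.e_core.read()
```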

Change-Id: If64fcb0e2fcc1b3a6a37d77455c2b21e1fc81150
2024-04-24 09:47:46 -07:00
Bobby R. Bruce
9f5c97c7fd stdlib: Tests Fix/Disable pyunit tests (#1067) 2024-04-23 22:05:19 -07:00
Harshil Patel
5658eec958 tests: update mocking for tests
- After removal of the ClientWrapper class, the client mocking needs
  to target the _create_client function instead.
- Commented out failing tests due to persistence issues.
  The older clients persist across tests, so the newly mocked
  clients are not used.

Change-Id: Ie342c9fc8103504dd12f49ae30b3bf62d189ce1d
2024-04-23 16:26:44 -07:00
Harshil Patel
d548f2c5c4 tests: fix tests that use JSON client
- There was a bug in JSONClient when searching
  for resources. The id was not checked and
  the booleans were not set to true when
  optional search queries like resource_version
  and gem5_version are not passed.

Change-Id: I4aa7c5388035144ec6864d57130ad09e6709692e
2024-04-23 16:24:09 -07:00
Harshil Patel
97a0530452 stdlib: Enable bundled resource requests from the databases (#779) 2024-04-22 11:53:23 -07:00
Bobby R. Bruce
40fdf368d8 util: Enable m5term Apple Mac OS Compilation (#1046)
The "linux/limits.h" equivalent on Apple systems is "sys/syslimits.h".
By adding an include guard to include the correct header dependent on
the host system, we can compile m5term on Mac OS systems.
2024-04-22 11:31:16 -07:00
Bobby R. Bruce
115322319c misc: Sync stable .github with develop (#1051) 2024-04-21 09:27:59 -07:00
Bobby R. Bruce
dd2689905f misc,tests: Remove zip step from Workflows (#1048)
This is not needed: with upload-artifact v4, directories are archived and
compressed by default.

This zip step was also causing Daily/Weekly test failures due to not
running `apt update` before the `apt install` for the zip utility. Ergo
this patch fixes these errors.
2024-04-21 09:15:20 -07:00
Matthew Poremba
c54039da5b configs: GPUFS: Turn off SSE4 and fancy XSAVEs (#1041)
A user reported a bug with the SSE4.1 version of memcmp in libc. When
enabled, the simulated program crashes with SIGILL. None of the fixes
recommended by the Intel SDM worked, so this turns the bit off
instead.

Similarly, the default XSAVE functionality is not completely implemented
for AVX and newer ISA extensions. Therefore, there is not much point to
claiming to support the more advanced versions of XSAVE (XSAVEOPT,
XSAVEC, XSAVES, and XGETBV with ECX=1).

Note that none of these bits are enabled for non-GPU full system
simulations (see src/arch/x86/X86ISA.py). This only impacts GPUFS
simulations.

Change-Id: I8eb7bf0f2a0a29226095e7889fec9c1e8a65f88f
2024-04-20 11:04:59 -07:00
Bobby R. Bruce
0b2fa9900b misc: Merge .github develop dir to stable (#1043) 2024-04-19 19:41:30 -07:00
Bobby R. Bruce
e578f83739 github,tests: Add Pyunit tests to CI GitHub Action Workflow (#1026)
Due to an oversight, the PyUnit tests were not being run as part of the
gem5 CI tests. This was because they are located in "tests/pyunit"
instead of "tests/gem5", where the CI GitHub Action workflow searched
for tests to run and where all other tests reside.

This adds the Pyunit tests as a separate job in the CI GitHub Action's
workflow.
2024-04-19 15:22:04 -07:00
Bobby R. Bruce
13f85b989f stdlib: Fix obtaining of Simpoint Resources
Change-Id: Ic73547c8c4acbe5d8a30a24dd8709cb2e9f6eb5e
2024-04-19 01:54:42 -07:00
Bobby R. Bruce
52a7218bd8 stdlib,tests: Fix test resources entry for new schema
Change-Id: I77c263315d3e7f15df6f7fd83ab4ad9280faf777
2024-04-18 17:33:30 -07:00
Bobby R. Bruce
b80a04e146 stdlib,tests: Fix mocked_resquest_post - add kwargs
Change-Id: I1c080d42b6f238d2f716c500913dc7576dc13ed6
2024-04-18 17:33:30 -07:00
Bobby R. Bruce
e4ff5df35a tests,stdlib: Fix pyunit tests - Workload -> ShadowResource
Change-Id: I307439334c93851ebe3a78d3a80d048374a0900a
2024-04-18 17:33:30 -07:00
Bobby R. Bruce
29d56d3d65 misc,tests: Add Pyunit tests to CI GitHub Action Workflow
Due to an oversight, the PyUnit tests were not being run as part of the
gem5 CI tests. This was because they are located in "tests/pyunit"
instead of "tests/gem5", where the CI GitHub Action workflow searched
for tests to run and where all other tests reside.

This adds the Pyunit tests as a separate job in the CI GitHub Action's
workflow.

Change-Id: I63d93571fde11c19bf3d281c034eddf4b455ae4e
2024-04-18 17:33:30 -07:00
Bobby R. Bruce
cbf0334762 misc: Fix jq install for testlib-quick-matrix (#1038) 2024-04-18 17:30:53 -07:00
Ivana Mitrovic
42ffa52907 mem-ruby: Implement no_alloc Far Atomics in CHI (#994)
This PR introduces a missing piece of the far atomic implementation. This
pull request incorporates several changes:

- Enable 2-level and 4-level (and N-level) cache hierarchies, removing
Atomic_NoWait transactions
- Fix Unique Near policy implementation that raised abort
- Add support for alloc_on_atomic == False. Enables Far Atomics on
systems where the HNF does not allocate evicted lines at LLC (Like in
WriteUpdate).
2024-04-18 11:35:47 -07:00
Ivana Mitrovic
c44b8635ab arch-x86: Movfp account for dataSize=4 (#1024)
Movfp instruction did not account for only copying the lower half of src
register if dataSize is 4.
GitHub Issue: #893 
I used the test code in issue #893 to verify the fix is working.
2024-04-18 10:36:00 -07:00
Bartek Gąsiorzewski
84cba2a8a8 dev: Fix interrupt logic in uart8250 (#1009)
Hi, we've noticed some issues with the Uart8250 device when using it as
the Linux console. Sometimes the Uart interrupt would remain constantly
posted, so Linux would continue to try and handle it, effectively
resulting in an infinite loop. With this patch, I'm no longer seeing any
issues, but my testing has been limited to configurations and workloads
we're interested in at Imagination, so please let me know if there's
some other tests I should run or if you notice any other issues.

This patch fixes several issues with interrupt posting and clearing in
the uart8250 device.

The "status" member variable and the console interrupt should be kept in
sync. However, in one code path in readIir, the interrupt bit was being
cleared in the status variable but not in the platform controller.

Additionally, in some code paths, the interrupts would be cleared in the
status variable and in the interrupt controller, but a future interrupt
would remain scheduled, causing a spurious interrupt and setting a bit
in status to 1.

These issues can confuse the kernel and result in an infinite interrupt
handling loop.

Another issue is related to the fact that there are two interrupt causes
(TX and RX) and both of them can be valid at the same time. When one of
them becomes no longer valid, we should check the status of the other
one before clearing the interrupt.

This patch addresses the issues listed above and refactors the interrupt
clearing logic to reduce repetition.
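
The rule about checking the other cause before clearing can be sketched as
below. This is an illustrative model, not the uart8250 code: the bit
values, class name and method names are all hypothetical.

```python
RX_INT = 0x1  # hypothetical bit for "receive data available"
TX_INT = 0x2  # hypothetical bit for "transmit holding register empty"

class UartIrqModel:
    """Keeps the status bits and the interrupt line in sync."""

    def __init__(self):
        self.status = 0
        self.line_asserted = False

    def post(self, cause):
        self.status |= cause
        self.line_asserted = True

    def clear(self, cause):
        # Drop only the requested cause; deassert the line only when no
        # other cause is still pending.
        self.status &= ~cause
        if self.status == 0:
            self.line_asserted = False
```

Routing every clear through one helper like this is also what removes the
repetition the message mentions.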
2024-04-17 11:27:39 -07:00
Jason Lowe-Power
c13aa7727d cpu: Fix Ruby/x86 pio port connections (#1035)
Fixes #1033

In the BaseCPU object _uncached_interrupt_response_ports is a class
variable, not an instance variable. #1004 changed the explicit
self._uncached_interrupt_response_ports to use extend. This caused the
list of ports to be extended *for all cores*, which caused problems when
using a system with more than 1 core.

This reverts the `extend` part of the change, but keeps the rest.
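
The pitfall can be reproduced in a few lines of plain Python. The class
and attribute names below are illustrative only and are not the BaseCPU
code.

```python
class SharedPorts:
    ports = ["pio"]  # class variable: one list shared by every instance

    def connect(self, extra):
        # extend() mutates the shared list in place.
        self.ports.extend(extra)

class PerInstancePorts:
    ports = ["pio"]  # same class-level default

    def connect(self, extra):
        # Assignment creates a new list bound to this instance only.
        self.ports = self.ports + extra

a, b = SharedPorts(), SharedPorts()
a.connect(["irq_a"])  # b.ports now also contains "irq_a"

c, d = PerInstancePorts(), PerInstancePorts()
c.connect(["irq_c"])  # d.ports is left untouched
```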

Change-Id: I6dc7d6da6763048d82960229d34933a3a2ac36e0

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-04-17 08:20:04 -07:00
Yu-Cheng Chang
6b4dbdcedb tests,arch-riscv: update bitmanip asmtest binaries (#931)
Gem5 resource update: https://github.com/gem5/gem5-resources/pull/25
Gem5 issue: https://github.com/gem5/gem5/issues/883

Change-Id: I1892d7591d6fa49d0563623fd90292e0d38d9ba3
2024-04-16 09:51:32 -07:00
Lukas Zenick
01a5edc86e arch-x86: Use mbits function for clarity
Change-Id: I577ee55752f917e561e4c741ba7a19f0229318b5
2024-04-15 22:49:41 -05:00
Matthew Poremba
9b463dbdfd util-docker: Bump gpu-fs build docker to ROCm 6.0.2 (#1025)
This bumps the docker image used to build GPU applications for input to
GPUFS simulations from ROCm 5.4.2 to ROCm 6.0.2 and Ubuntu from 20.04 to
22.04. This matches the versions in gem5-resources#29 .

Several notes were added to the Dockerfile to describe where the RUN
commands come from. A README.md is also added to clarify that this is
not a disk image for GPUFS and is only used to build applications.

Change-Id: I9ada99e2ed1854cb7adb76f2a1fa662bab398f86
2024-04-15 13:36:06 -07:00
Bobby R. Bruce
1aa0bf8ec6 tests,github: Update CI Tests' GitHub Actions versions (#1021) 2024-04-15 13:35:33 -07:00
Bobby R. Bruce
56a2346b8d tests,util-docker,github: Add Ubuntu 24.04 Docker image & updated tests/actions to use it (#1018)
This ensures gem5 compiles and runs in 24.04 environments. A necessary
PR for ensuring gem5 supports Ubuntu 24.04 (related issue: #909)
2024-04-15 13:34:22 -07:00
Matthew Poremba
a03319bef7 arch-vega: Fix output warnings, gem5.fast (#1023)
Fix gem5.fast build not building when using gpu model.

Removes very spammy stat distribution bucket size prints when running
gpu model.
2024-04-15 13:18:27 -07:00
Matthew Poremba
7e2d8dee42 mem,gpu-compute: Implement GPU TCC directed invalidate (#1011)
The GPU device currently supports large BAR which means that the driver
can write directly to GPU memory over the PCI bus without using SDMA or
PM4 packets. The gem5 PCI interface only provides an atomic interface
for BAR reads/writes, which means the values cannot go through timing
mode Ruby caches. This causes bugs as the TCC cache is allowed to keep
clean data between kernels for performance reasons. If there is a BAR
write directly to memory bypassing the cache, the value in the cache is
stale and must be invalidated.

In this commit a TCC invalidate is generated for all writes over PCI
that go directly to GPU memory. This will also invalidate TCP along the
way if necessary. This currently relies on the driver synchronization
which only allows BAR writes in between kernels. Therefore, the cache
should only be in I or V state.

To handle a race condition between invalidates and launching the next
kernel, the invalidates return a response and the GPU command processor
will wait for all TCC invalidates to be complete before launching the
next kernel.
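
A small model of the ordering rule described above, assuming a simple
pending-invalidate counter. The class and method names are hypothetical
and do not correspond to the GPU command processor's real interface.

```python
class CommandProcessorModel:
    """Defers kernel launches until all outstanding invalidates respond."""

    def __init__(self):
        self.pending_invalidates = 0
        self.queued_kernels = []
        self.launched = []

    def bar_write(self, addr):
        # Every direct write to GPU memory issues one TCC invalidate.
        self.pending_invalidates += 1

    def invalidate_response(self):
        self.pending_invalidates -= 1
        self._try_launch()

    def submit_kernel(self, name):
        self.queued_kernels.append(name)
        self._try_launch()

    def _try_launch(self):
        # Launch only when no invalidate is still in flight.
        if self.pending_invalidates == 0:
            self.launched.extend(self.queued_kernels)
            self.queued_kernels.clear()
```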

This fixes issues with stale data in nanoGPT and possibly PENNANT.
2024-04-15 13:18:01 -07:00
Bobby R. Bruce
630f3822b8 github: Update 'ubuntu-22.04' to 'ubuntu-latest' (#1022)
There was some inconsistency in the GitHub Workflow files on using
'ubuntu-latest' (which gets the latest Ubuntu version) or
'ubuntu-22.04'. To keep things consistent 'ubuntu-latest' is now used in
all cases. This also saves us updating workloads upon release of a new
Ubuntu version.
2024-04-15 09:55:56 -07:00
Giacomo Travaglini
bdcffdd0f0 dev-arm: Do not mark the MpamMSC as abstract (#1030)
Marking it as abstract prevents its instantiation


Change-Id: I775a64904a01cf36e4cc1e0cd45765f03325c5ca

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-04-15 09:40:22 -07:00
Bobby R. Bruce
a7330ac4fb misc: bump dnspython in /util/gem5-resource-manager (#1027)
Bumps [dnspython](https://github.com/rthalley/dnspython) from 2.3.0 to
2.6.1.
- [Release notes](https://github.com/rthalley/dnspython/releases)
-
[Changelog](https://github.com/rthalley/dnspython/blob/main/doc/whatsnew.rst)
-
[Commits](https://github.com/rthalley/dnspython/compare/v2.3.0...v2.6.1)

Change-Id: Iaa0ed0fa68220fd8b52eb62c0089831b253e17d0

---
updated-dependencies:
- dependency-name: dnspython dependency-type: direct:production ...

Change-Id: I6e3ed8287f5fd60e7bd1c0a3e565db94ef8627a9

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-15 08:41:14 -07:00
Ivana Mitrovic
dbb71948ce util: Update resource manager dependencies (#1015)
This PR combines the changes from these dependabot PRs: #1008 and #1012.
2024-04-15 08:35:47 -07:00
Lukas Zenick
d67a7797d2 arch-x86: Movfp account for dataSize=4
Change-Id: I97e7a6f2738a57cad9907ddfe5c8030a26c147e8
2024-04-14 15:59:24 -05:00
Matthew Poremba
3db6e86fea arch-vega: Fix string check warnings on fast build
gem5.fast does not currently build if the GPU model is built. This fixes
the array-bounds warnings allowing gem5.fast to build again.

Change-Id: I463c2847c3ecfd2257a70418fa247090b0493f9b
2024-04-14 12:22:57 -07:00
Bobby R. Bruce
b986c542ca tests,misc: Set pre-commit/action to v3.0.1
v3.0.0 of pre-commit/action caused a deprecation warning in actions.
v3.0.1 was released to deal with this.

Change-Id: Ib5654e465565ad4356754ac097983aec4166b98f
2024-04-13 20:30:34 -07:00
Bobby R. Bruce
3f45a2d08d misc,tests: Up actions/setup-python version to v5
This was causing a deprecation warning in GitHub Actions.

Change-Id: I9ab147acf12e3763ab731769468ce5b1dc5e4dea
2024-04-13 20:20:26 -07:00
Matthew Poremba
01f2df4b8a gpu-compute: Fix stat bucket sizes
Change-Id: If30505515867a866c631cb117d3d22e19814a2f2
2024-04-13 15:51:41 -07:00
Bobby R. Bruce
ccd9beb661 util-docker: Remove 22.04 min-dep Dockerfile
We only test the latest LTS Ubuntu release with min-deps. With 24.04, we
no longer require the 22.04 min dependencies image.

Change-Id: I4b3d668c1f9d10c2b6071848e6daada6c763b5e7
2024-04-13 14:16:41 -07:00
Bobby R. Bruce
05bc85aa9b misc: Update GitHub Actions to use 24.04 over 22.04
This change ensures all our tests run on our most recent supported LTS
release of Ubuntu.

In the case of compiler tests we still test 22.04 all-dep but test 24.04
all-dep and min-dep (i.e., we drop 22.04 min-dep as it's somewhat
redundant).

Change-Id: I63666d1017594b496523a48e5112a8994f57885f
2024-04-13 14:13:35 -07:00
Bobby R. Bruce
d091c64db1 util-docker: Add Ubuntu 24.04 min-dep Docker
Change-Id: Ia5cb4f2fd54ce53494ab95705b4f6006648d7eba
2024-04-13 14:08:13 -07:00
Bobby R. Bruce
3962fca2e3 util-docker: Add ubuntu-24.04_all-deps Docker
Change-Id: I5917c446cacc25d1a333b5cf8147ee78b112aeb3
2024-04-13 14:08:13 -07:00
Bobby R. Bruce
bdaeb082c3 util-docker: Update docker-compose URLs to 'ghcr.io/gem5' (#1017)
'gcr.io/test-gem5' was the registry we used when hosting them on Google
Cloud services. We now use the GitHub container registries.
2024-04-13 14:05:34 -07:00
Bobby R. Bruce
392a2b4ffa misc: Add a DevContainer specification to the gem5 repo (#911)
Specifying a DevContainer in gem5 allows for users to quickly create an
environment in which they can develop, build, and run gem5. The
".devcontainer/devcontainer.json" file specifies the properties of the
container. In this commit they are as follows:

1. The Docker image ghcr.io/gem5/devcontainer. This is built from
"util/dockerfiles/devcontainer". This Dockerfile provides all
dependencies and a pre-built gem5 binary from the current main branch
(added to "/usr/local/bin"). In order to support this Docker container
on different platforms we use the Docker multi-platform feature. As
such, this must be built using `docker buildx bake devcontainer --push`
which reads the `docker-bake.hcl` file for the specification of the
multi-platform image.
2. Visual Studio extensions. This is a list of Visual Studio Code
extensions useful when developing gem5. They are automatically added to the
Visual Studio dev container.
3. Features. Features are enhancements that can be added to a
DevContainer. Normally they are libraries and other commonly used tools
to be included in the Container. As we have our dependencies specified
in the Dockerfile here we select one to enable Docker inside the
container, one to enable the Github CLI, one to improve Linting, and
finally one to enable the vscode CLI.
4. The On Create Command : This command allows us to specify commands to
be run after the DevContainer is created. In this case we execute
".devcontainer/on-create.sh" which, right now, refreshes the git index
and installs the pre-commit checks.
2024-04-12 10:37:17 -07:00
Yu-Cheng Chang
ebb70dea99 cpu: Fix KVM false negative warning after Kconfig transition (#1013)
When we start to build gem5, we read and process all of the SConsopts
files, and run the after_sconsopts_callbacks once all of the SConsopts
files have been read.

The KVM_ISA env variable can be set in different files. Take x86 and
arm as examples:

KVM_ISA default value:

bc39283451/src/cpu/kvm/SConsopts

x86 KVM_ISA:

bc39283451/src/arch/x86/kvm/SConsopts (L39-L45)

arm KVM_ISA:

bc39283451/src/arch/arm/kvm/SConsopts (L35-L36)

We should emit the KVM warning only after all of the SConsopts files
have been read.
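
The ordering problem can be sketched in plain Python. Only the
`after_sconsopts_callbacks` name comes from this message; the functions
below are hypothetical stand-ins for the SConsopts files, and the env is
modelled as a dict.

```python
after_sconsopts_callbacks = []
env = {}
warnings = []

def kvm_sconsopts():
    # Stand-in for src/cpu/kvm/SConsopts: sets the default and defers the check.
    env["KVM_ISA"] = None
    after_sconsopts_callbacks.append(warn_if_no_kvm)

def x86_kvm_sconsopts():
    # Stand-in for src/arch/x86/kvm/SConsopts: overrides the default later.
    env["KVM_ISA"] = "x86"

def warn_if_no_kvm():
    if env["KVM_ISA"] is None:
        warnings.append("KVM is not available")

# Read every file first, then run the deferred callbacks.
for read_file in (kvm_sconsopts, x86_kvm_sconsopts):
    read_file()
for callback in after_sconsopts_callbacks:
    callback()
```

Had the check run inside `kvm_sconsopts` itself, it would have seen the
default value and warned even though a later file enables KVM.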

issue: https://github.com/gem5/gem5/issues/686

Change-Id: I096c6bebaaec18f9b2af93191d0dd23c65084eda
2024-04-12 09:23:56 -07:00
Nicholas Mosier
bc39283451 cpu-o3, arch-x86: initialize interrupts for all SMT threads (#1007)
Fix issue #1004. When enabling SMT with the O3 cpu, only the first
interrupts object was getting initialized properly. This patch
initializes all interrupts objects, one per SMT thread.

Change-Id: I300782b645bd8ea3ef2497278fb73125ab4bf495
2024-04-11 11:17:24 -07:00
Ivana Mitrovic
db1c336237 cpu,arch-arm,arch-riscv: adding new instruction types to RISC-V (#589)
This commit adds more detailed instruction types for RISC-V Vector.
Concretely, it substitutes VectorIntegerArith, VectorFloatArith,
VectorIntegerReduce and VectorFloatReduce with more specific types
related to the operation that each instruction performs (e.g.,
VectorIntegerAdd or VectorIntegerMult).

Additionally, fixes two RISC-V instruction types (VectorXXX) that were
used in ARM SVE, placing the proper SimdXXX ones.

Change-Id: I31774fa6a7cd249abfffec68d11d3d77f08ad70b

CC @adriaarmejach
2024-04-11 10:15:56 -07:00
Giacomo Travaglini
3b5ae7b4d1 Add a generic cache template library (#745)
Add a generic cache template to construct internal storage structures.
Also add some example use cases by converting the prefetcher tables to
use this new library.
2024-04-11 08:00:34 +01:00
Pranith Kumar
769f750eb9 mem-cache: Implement AssociativeSet from AssociativeCache
AssociativeSet can reuse most of the generic cache library code with the
addition of a secure bit. This reduces duplicated code.

Change-Id: I008ef79b0dd5f95418a3fb79396aeb0a6c601784
2024-04-10 16:17:57 -04:00
Pranith Kumar
f3bc10c168 mem-cache: Derive tagged entry from cache entry
The tagged entry can be derived from the generic cache entry and add the secure
flag that it needs. This reduces code duplication.
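
The relationship can be sketched as follows. The real classes are C++;
the Python method names and fields here are assumptions chosen for
illustration.

```python
class CacheEntry:
    """Generic entry: validity plus a tag."""

    def __init__(self):
        self.valid = False
        self.tag = None

    def insert(self, tag):
        self.tag = tag
        self.valid = True

    def matches(self, tag):
        return self.valid and self.tag == tag

class TaggedEntry(CacheEntry):
    """Adds the secure flag on top of the generic entry."""

    def __init__(self):
        super().__init__()
        self.secure = False

    def insert(self, tag, secure=False):
        super().insert(tag)
        self.secure = secure

    def matches(self, tag, secure=False):
        # Reuse the generic tag check and add the secure-bit comparison.
        return super().matches(tag) and self.secure == secure
```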

Change-Id: I7ff0bddc40604a8a789036a6300eabda40339a0f
2024-04-10 16:17:57 -04:00
Pranith Kumar
8fb3611614 mem-cache: prefetch: Implement DCPT tables using cache library
The DCPT table is better built using the generic cache library since we do not
need the secure bit.

Change-Id: I8a4a8d3dab7fbc3bbc816107492978ac7f3f5934
2024-04-10 16:17:57 -04:00
Pranith Kumar
2c7d4bed66 mem-cache: Implement VFT tables using cache library
The frequency table is better built using the generic cache library instead of the
AssociativeSet since the secure bit is not needed for this structure.

Change-Id: Ie3b6442235daec7b350c608ad1380bed58f5ccf4
2024-04-10 16:17:57 -04:00
Pranith Kumar
2cc2ad5097 misc: Add a generic cache library
Add a generic cache library modeled after AssociativeSet that can be used for
constructing internal caching structures.

Change-Id: I1767309ed01f52672b32810636a09142ff23242f
2024-04-10 16:17:57 -04:00
Giacomo Travaglini
4b98551aaf Update src/python/gem5/components/cachehierarchies/abstract_cache_hierarchy.py
Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
2024-04-10 16:17:56 -04:00
Giacomo Travaglini
efe397ca92 stdlib: Add DTB generation capabilities to AbstractCacheHierarchy
Now that we are able to provide a view of the cache hierarchy from
the python world, we can start generating DTB entries for caches
and more specifically to properly fill the next-level-cache and
cache-level properties

Change-Id: Iba9ea08fe605f77a353c9e64d62b04b80478b4e2
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-04-10 16:17:56 -04:00
Giacomo Travaglini
e6637fc852 stdlib: Use newly defined tree for PrivateL1PrivateL2 hierarchy
Change-Id: I803c6118c4df62484018f9e4d995026adb1bbc2c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-04-10 16:17:56 -04:00
Giacomo Travaglini
d67672facc stdlib: Add tree structure to the AbstractCacheHierarchy
One of the things we miss in gem5 is the capability to neatly compose the
cache hierarchy of CPUs and clusters of CPUs.  The BaseCPU
addPrivateSplitL1Caches and addTwoLevelCacheHierarchy APIs have
historically been used to bind cache levels together.

These APIs have been superseded by the introduction of the Cache
hierarchy abstraction in the standard library. The standard library
makes it cleaner for a user to quickly instantiate a hierarchy of caches
with few lines of code.  While this removes a lot of complexity for a
user, the Hierarchy objects still have little information about their
internal topology.

To address this problem, this patch adds a tree data structure to the
AbstractCacheHierarchy class, where every node of the tree represents
a cache in the hierarchy. In this way we will expose APIs for traversing
and querying the tree.

For example a 2 CPUs system with private L1, private L2 and shared L3
will contain the following tree:

         [root]
           |
          [L3]
           /\
          /  \
        [L2] [L2]
         |    |
        [L1] [L1]

Change-Id: I78fe6ad094f0938ff9bed191fb10b9e841418692
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-04-10 16:17:56 -04:00
Matthew Poremba
1d64669473 mem,gpu-compute: Implement GPU TCC directed invalidate
The GPU device currently supports large BAR which means that the driver
can write directly to GPU memory over the PCI bus without using SDMA or
PM4 packets. The gem5 PCI interface only provides an atomic interface
for BAR reads/writes, which means the values cannot go through timing
mode Ruby caches. This causes bugs as the TCC cache is allowed to keep
clean data between kernels for performance reasons. If there is a BAR
write directly to memory bypassing the cache, the value in the cache is
stale and must be invalidated.

In this commit a TCC invalidate is generated for all writes over PCI
that go directly to GPU memory. This will also invalidate TCP along the
way if necessary. This currently relies on the driver synchronization
which only allows BAR writes in between kernels. Therefore, the cache
should only be in I or V state.

To handle a race condition between invalidates and launching the next
kernel, the invalidates return a response and the GPU command processor
will wait for all TCC invalidates to be complete before launching the
next kernel.

This fixes issues with stale data in nanoGPT and possibly PENNANT.

Change-Id: I8e1290f842122682c271e5508a48037055bfbcdf
2024-04-10 11:35:25 -07:00
Matthew Poremba
833392e7b2 mem-ruby,gpu-compute: Allow memory reqs without inst
The GPUDynInst for sending memory requests through the CUs data port
is required but only used for DPRINTFs. Relax this constraint so that
the methods can be reused for requests such as probes generated by the
GPU device.

Change-Id: I16094e400968225596370b684d6471580888d98a
2024-04-10 11:35:24 -07:00
Yu-Cheng Chang
116c483a42 arch-riscv: Make c.flwsp destination register more maintainable (#1006)
RISC-V C.FLWSP format:


![image](https://github.com/gem5/gem5/assets/32214817/f4c8d114-cd6b-4946-afff-fa752b31e337)
The fields FC1 and FD cover the same bits; switch to FC1 for maintainability


ee6f1377d7/src/arch/riscv/isa/bitfields.isa (L110)


ee6f1377d7/src/arch/riscv/isa/operands.isa (L84)


ee6f1377d7/src/arch/riscv/isa/bitfields.isa (L85)


ee6f1377d7/src/arch/riscv/isa/operands.isa (L76)
2024-04-10 08:11:51 -07:00
Hoa Nguyen
bc3627d682 arch-riscv: Remove a tab character (#1010)
Change-Id: Id54ae8ba37faba11cf4055ddaf7e6b99cfd51999

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-04-10 08:08:57 -07:00
Giacomo Travaglini
5641c5e464 stdlib: Add tree structure to the AbstractCacheHierarchy (#967)
One of the things we miss in gem5 is the capability to neatly compose the
cache hierarchy of CPUs and clusters of CPUs.  The BaseCPU
addPrivateSplitL1Caches and addTwoLevelCacheHierarchy APIs have
historically been used to bind cache levels together.

These APIs have been superseded by the introduction of the Cache
hierarchy abstraction in the standard library. The standard library
makes it cleaner for a user to quickly instantiate a hierarchy of caches
with few lines of code.  While this removes a lot of complexity for a
user, the Hierarchy objects still have little information about their
internal topology.

To address this problem, this patch adds a tree data structure to the
AbstractCacheHierarchy class, where every node of the tree represents
a cache in the hierarchy. In this way we will expose APIs for traversing
and querying the tree.

For example a 2 CPUs system with private L1, private L2 and shared L3
will contain the following tree:

         [root]
           |
          [L3]
           /\
          /  \
        [L2] [L2]
         |    |
        [L1] [L1]
2024-04-09 09:16:37 +01:00
Bobby R. Bruce
3af15a535e mem-cache, configs, arch-arm: Handle partitioning policies through a PartitionManager (#966)
This PR is offloading some of the partitioning logic to the partitioning
manager, effectively changing
the partitioning interface. Rather than always relying on the
PartitionFieldExtention data structure to
convey partition IDs, we make it implementation defined by introducing
the partitioning manager abstraction.
We want users to be able to extract the partitionId more flexibly and
this requires using a SimObject.

Users can extend the PartitioningManager, overriding the
readPacketPartitionId, therefore providing their
own means of injecting/extracting partitioning data from a packet.
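
The override pattern might look roughly like this. The real manager is a
C++ SimObject; the Python class names, the snake_case method name, the
dict standing in for a packet and the address threshold are all
assumptions for illustration.

```python
class PartitionManagerSketch:
    """Default policy: read the ID carried by the packet, if any."""

    def read_packet_partition_id(self, pkt):
        return pkt.get("partition_id", 0)

class AddressBasedPartitionManager(PartitionManagerSketch):
    """Custom policy: derive the ID from the address instead."""

    def read_packet_partition_id(self, pkt):
        # Hypothetical split: upper half of a 32-bit space is partition 1.
        return 1 if pkt["addr"] >= 0x8000_0000 else 0
```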
2024-04-08 16:05:17 -07:00
Ivana Mitrovic
a8d778516d arch-riscv,sim: m5ops argument / return fix for 32 bit RISC-V (#900)
M5Ops C / C++ functions partially use 64 bit arguments and return value.
In general, 64 bit arguments and return values are possible for 32 bit
RISC-V systems as well, since the arguments and the return value are
split across two registers. However, at the moment, this does not work for
32 bit RISC-V systems on the simulator side, since there is a one to one
mapping between argument registers and m5op function parameters.

To solve this problem, the get() function of the RISC-V reg_abi is
updated. It now will merge two registers if there is a 64 bit argument.
For this, the function code has to be passed to the get() function. The
default value of this function code is set to 0xF00, since 0x00 is
already used for M5_ARM. The parameter list of other get() functions for
argument return is also extended by this function code parameter with
the keyword [[maybe_unused]].

To enable a return value of size 64 bit, a0 is assigned with the lower
32 bit and a1 with the higher 32 bit.
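
The register pairing described above can be illustrated with a small sketch. It is not the gem5 implementation: the helper names `merge_regs` and `split_ret` are hypothetical, and the assumption that the first register of an argument pair carries the low half is ours. Only the a0 = low, a1 = high split of the return value comes from the commit message.

```python
MASK32 = 0xFFFFFFFF


def merge_regs(lo: int, hi: int) -> int:
    """Combine two 32-bit register values into one 64-bit argument.

    Assumes `lo` holds the lower 32 bits and `hi` the upper 32 bits.
    """
    return ((hi & MASK32) << 32) | (lo & MASK32)


def split_ret(value: int):
    """Split a 64-bit return value into (a0, a1) = (low half, high half)."""
    return value & MASK32, (value >> 32) & MASK32


# Round trip: a 64-bit value survives being split and merged again.
a0, a1 = split_ret(0x1122334455667788)
restored = merge_regs(a0, a1)
```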

Related Issue: https://github.com/gem5/gem5/issues/881
2024-04-08 10:09:17 -07:00
Robert Hauser
841b821261 arch-riscv: fix c.fswsp source register (#998)
RISC-V C.FSWSP format (RISC-V Unprivileged ISA V20191213, page 102):
 
|15-13|12-7|6-2|1-0|
|-------|----|----|----|
|funct3|imm|rs2|op|

Source register is bit 2-6, not bit 20-24
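
As a rough illustration of why the bit range matters, the sketch below extracts fields from a 16-bit compressed encoding laid out as in the table above. The helpers `bits` and `encode_c_fswsp_like` are hypothetical and are not taken from the gem5 ISA description; the funct3 and op values are illustrative placeholders.

```python
def bits(value: int, hi: int, lo: int) -> int:
    """Return the bit field value[hi:lo], both ends inclusive."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def encode_c_fswsp_like(imm: int, rs2: int) -> int:
    """Pack fields as funct3[15:13] | imm[12:7] | rs2[6:2] | op[1:0]."""
    funct3 = 0b111  # illustrative value
    op = 0b10       # illustrative value
    return (funct3 << 13) | ((imm & 0x3F) << 7) | ((rs2 & 0x1F) << 2) | op


inst = encode_c_fswsp_like(imm=0, rs2=10)
rs2_correct = bits(inst, 6, 2)    # reads the source register field
rs2_wrong = bits(inst, 24, 20)    # outside a 16-bit encoding, always 0
```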


ee6f1377d7/src/arch/riscv/isa/bitfields.isa (L111)


ee6f1377d7/src/arch/riscv/isa/operands.isa (L86)


ee6f1377d7/src/arch/riscv/isa/bitfields.isa (L87)


ee6f1377d7/src/arch/riscv/isa/operands.isa (L80)
2024-04-08 08:41:11 -07:00
Yu-Cheng Chang
71b0b1f2b6 arch-riscv: Fix c.fsw source register (#1005)
RISC-V C.FSW format:


![image](https://github.com/gem5/gem5/assets/32214817/31f46525-23e1-4b36-91ee-968f18b9d32a)
Source register is bit 2-4, not bit 20-24
 

ee6f1377d7/src/arch/riscv/isa/bitfields.isa (L112)


ee6f1377d7/src/arch/riscv/isa/operands.isa (L88)


ee6f1377d7/src/arch/riscv/isa/bitfields.isa (L87)


ee6f1377d7/src/arch/riscv/isa/operands.isa (L80)
2024-04-08 08:30:54 -07:00
Giacomo Travaglini
bdb08a5b6c arch-arm, dev-arm: Fix typo in PartitionFieldExtention name
Rename PartitionFieldExtention into PartitionFieldExtension

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I8072adf78d81b94c5b8bc61a317c0238cf0a9fd9
2024-04-07 11:45:57 +01:00
Giacomo Travaglini
dd45e1c319 misc: Make PartitionFieldExtention private to Arm
The new ISA-agnostic interface is the PartitionManager.
We therefore make the PartitionFieldExtention private to the
Arm implementation of memory partitioning (FEAT_MPAM)

Any other partitioning implementation should override the
PartitionManager::readPacketPartitionID to provide a means
for extracting partitioning data (partition_id) from the
incoming Packet.

With this commit we also define an MPAM MSC which is
supposed to be the partitioning manager for the
Memory System Component

Change-Id: I6959ace0c0cbca549dcc1aacd53dff223b5fe328
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-04-07 11:45:57 +01:00
Víctor Soria Pardos
98358da968 mem-ruby: Implement Atomic No Alloc Policy
Add alternative implementation to far atomics when the flag alloc_on_commit
is false. The implementation fetches the data, performs the atomic and
writes back the cache line to main memory.

Co-authored-by: Fabian Schätzle <f.schaetzle@fz-juelich.de>
Change-Id: I8797fbc68448e1866a292f4afeedd3613113dddd
2024-04-06 18:51:11 +02:00
Giacomo Travaglini
82a82c8793 configs: Change cache_partitioning.py to use PartitionManager
Change-Id: I891cc4967dc5483313bcb1179d19b37123a37ba0
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-04-05 10:09:46 +01:00
Giacomo Travaglini
6c2ac8e641 Update src/python/gem5/components/cachehierarchies/abstract_cache_hierarchy.py
Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
2024-04-05 08:53:07 +01:00
Bobby R. Bruce
c65071282d stdlib,tests: Add StatTester SimObject and Scalar tests (#973)
This SimObject can be used to quickly test the statistics are
functioning correctly. The SimObject schedules a single event which sets
the statistics to values dependent on the SimObject params.

With this commit the "Scalar" stats have a StatTester subclass that can
be used for testing. More can be added as required.

Tests are included to check our Scalar SimStat functionality.
2024-04-04 19:12:44 -07:00
Bobby R. Bruce
504005da87 scons,tests: Add 'USE_TEST_OBJECTS' kconfig
This has the SimObjects defined in "src/test_objects" only be compiled
into the gem5 binary if the Kconfig 'USE_TEST_OBJECTS" == 'y'. This
happens in two cases:

1. When 'ALL/gem5' is compiled via "build_opts".
2. When tests are run via "./tests/main.py".

Change-Id: I2330008fd7c7900de5f4de142b8ac89ef4e351ce
2024-04-04 13:04:21 -07:00
Bobby R. Bruce
2771061207 stdlib,tests: Add StatTester SimObject and Scalar tests
This SimObject can be used to quickly test the statistics are
functioning correctly. The SimObject schedules a single event which sets
the statistics to values dependent on the SimObject params.

With this commit the "Scalar" stats have a StatTester subclass that can
be used for testing. More can be added as required.

Tests are included to check our Scalar SimStat functionality.

Change-Id: I78fa5d9a0c3fc7115bd6c6d3410a5436aaa47f55
2024-04-04 13:04:20 -07:00
Bobby R. Bruce
be00691cd3 scons: Disable Address Sanitizer for GCC (#951)
This removes the '--with-asan' option, as it is not compatible with GCC. The
`--with-asan` incompatibility with GCC is discussed in Issue #916.
2024-04-04 11:48:34 -07:00
dependabot[bot]
9b143930b6 misc: bump mypy from 1.8.0 to 1.9.0 (#983)
Bumps [mypy](https://github.com/python/mypy) from 1.8.0 to 1.9.0.
2024-04-04 11:21:59 -07:00
Bobby R. Bruce
8d7e3fb16b stdlib: Move SimStat 'unit' and 'datatype' field to Scalar (#970)
These are not general statistic properties and better put as a property
of a Scalar value.
2024-04-04 10:02:22 -07:00
Bobby R. Bruce
213b418391 stdlib: Specify typing for SimStat Scalar value (#971) 2024-04-04 08:34:20 -07:00
Bobby R. Bruce
4ff34a75bb stdlib: Fix 'nozero' for Scalar SimStats (#972)
When the `statistics::nozero` flag is set gem5 does not output that stat
if its value is zero. This was not the case for SimStats which output in
this case. This patch corrects this behavior.
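
The intended behavior can be sketched as a simple filter. This is only a model of the rule stated above, not the SimStat code; `visible_stats` and its arguments are hypothetical names.

```python
def visible_stats(stats: dict, nozero: set) -> dict:
    """Drop a stat only if it is flagged 'nozero' and its value is zero."""
    return {
        name: value
        for name, value in stats.items()
        if not (name in nozero and value == 0)
    }


shown = visible_stats({"hits": 0, "misses": 0, "cycles": 3}, nozero={"hits"})
```

Here `misses` is still reported although it is zero, because only `hits` carries the flag.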
2024-04-04 08:33:48 -07:00
Víctor Soria Pardos
5a6a3be6da mem-ruby: Fix policy_type condition in CHI
Fix if-else condition in CHI-cache-actions to correctly
support policy_type Present Near (2)

Change-Id: Ib776d847a908a8ac7693c2d10405bc0c4a9d767d
2024-04-04 10:55:56 +02:00
Víctor Soria Pardos
7ee574b309 mem-ruby: Remove AtomicReturn_NoWait from CHI
To make Atomic transaction recursive and enable 2-level config,
remove AtomicReturn_NoWait and other level-dependent code

GitHub Issue: https://github.com/gem5/gem5/issues/882

Change-Id: Iac468cdb8a3b5914c8f05c5cedde866ce85f359a
2024-04-04 10:54:42 +02:00
Giacomo Travaglini
0c6543d781 python: Add is_subset to the AddrRange param class (#993)
This will just call the _m5.range.isSubset method
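
For readers unfamiliar with the check, a subset test on plain, non-interleaved ranges can be sketched as follows. `SimpleRange` is a hypothetical stand-in and not gem5's AddrRange, which also supports interleaving; half-open bounds are assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleRange:
    start: int  # inclusive
    end: int    # exclusive

    def is_subset(self, other: "SimpleRange") -> bool:
        """True if every address of this range also lies in `other`."""
        return other.start <= self.start and self.end <= other.end


inner = SimpleRange(0x1000, 0x2000)
outer = SimpleRange(0x0, 0x4000)
```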

Change-Id: If747819a008a8ed20796b4efd42a42e5c3a8d7d9

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-04-04 08:12:30 +01:00
Minje Jun
ffd0680a2c mem-ruby: Copyback UD_RU line when evicted in CHI protocol (#945)
This is a follow-up fix to #791 mem-ruby: Fix possible dirty line loss
in CHI when ReadShared hit on UD line.
UD_RU line may have stale data since the upstream could have updated the
line, so its local cache line data is treated as invalid
(dataValid=false). But when the line is evicted, it must be written back
to downstream because the upstream may have the line in clean state
(UC). This change fixes it by copying back the UD_RU line while
keeping its dataValid as false.

Example error case:
- L3 was in UD_RSC and being evicted without back-invalidation. LLC (HN)
was in RU state.
- Because there's still an upstream sharer, L3 sends WriteClean.
- Because the data state was unique and dirty, L3 sends CBWrData_UD_PD.
- LLC becomes UD_RU.
- When the line is evicted from LLC (LocalHN_Eviction), the line is just
dropped, causing the loss of the dirty copy

Co-authored-by: Minje Jun <minje.jun@samsung.com>
2024-04-03 08:33:22 -07:00
Yu-Cheng Chang
1fa25a60c8 arch-riscv: Fix the RiscvBareMetal parameter reset_vect (#964)
The `reset_vect` parameter has existed for a long time, but it has no
effect when the user wants to use a customized reset_vect. This CL adds
`auto_reset_vect` to let the config determine the `reset_vect` either from
the workload entry point or from the user-specified value.

Ref: https://gem5-review.googlesource.com/c/public/gem5/+/42053

Change-Id: I928c0dc42aaa85ceabf8d75f9654486496e0ffee
2024-04-03 08:31:57 -07:00
dependabot[bot]
514b759d63 misc: bump pre-commit from 3.6.2 to 3.7.0 (#984)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.6.2
to 3.7.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-03 08:23:30 -07:00
Kaustav Goswami
28b081b348 arch-arm,stdlib: ARM release for_kvm is moved to configs (#986)
This change sets the `release` of the ARM board at the config file
instead of overriding the release on the ArmBoard. This change partially
solves issue 932 as the system taking and restoring the checkpoint is
consistent across KVM and timing CPUs respectively.

Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>
2024-04-03 11:48:24 +01:00
Nicholas Mosier
32ee09df4a sim-se: Fix copyOutStatxBuf compile error (#989)
Fix #988. Rewrite statxFunc and copyOutStatxBuf to use platform-agnostic
stat system call, not Linux-specific statx system call.

Change-Id: I3d17e14684e9cd77cdbfd0141b93c3bcbd27dbeb
2024-04-02 14:59:24 -07:00
Bobby R. Bruce
c238b7a3e0 base: Fix 'doGzipLoad' str manipulation (#959)
When running `scons build/ALL/gem5.opt --with-ubsan`, with GCC, the
following error was returned:

```
[     CXX] src/base/loader/image_file_data.cc -> ALL/base/loader/image_file_data.o
In file included from /usr/include/string.h:535,
                 from /usr/include/c++/11/cstring:42,
                 from src/base/cprintf_formats.hh:33,
                 from src/base/cprintf.hh:38,
                 from src/base/logging.hh:49,
                 from src/base/loader/image_file_data.cc:40:
In function ‘char* strcpy(char*, const char*)’,
    inlined from ‘int gem5::loader::doGzipLoad(int)’ at src/base/loader/image_file_data.cc:70:11,
    inlined from ‘gem5::loader::ImageFileData::ImageFileData(const string&)’ atsrc/base/loader/image_file_data.cc:116:24:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:79:33: error: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ offset [0, 19] is out of the bounds [0, 0] [-Werror=array-bounds]
   79 |   return __builtin___strcpy_chk (__dest, __src, __glibc_objsize (__dest));
      |          ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
scons: *** [build/ALL/base/loader/image_file_data.o] Error 1
scons: building terminated because of errors.
```
2024-04-02 10:37:42 -07:00
Bobby R. Bruce
ee6f1377d7 misc: Sync develop .github to stable (#987) 2024-04-02 10:20:25 -07:00
Bobby R. Bruce
dea8fc0ee8 misc,github: Upgrade checkout and upload/download-artifact Actions to latest version (#979)
As can be seen from this Daily test log:
https://github.com/gem5/gem5/actions/runs/8478384881, checkout@v2 and
{upload/download}-artifact@v3 was causing warnings to be thrown. This
fix upgrades all instances of these actions to the latest version (in
both cases, v4).
2024-04-02 10:14:12 -07:00
Hoa Nguyen
628826896f arch-riscv: Use TeX's escape seq in Python instead of Unicode (#985)
Currently, the citation string has a Unicode character. This works well
in gem5, but it breaks the gem5+SST simulation [1]. This change modifies
the letter "u" with umlaut to use TeX's escape sequence for this letter
instead of using the UTF-8 character.

[1] https://github.com/gem5/gem5/issues/982

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-04-02 08:42:21 -07:00
Robert Hauser
f9a9e50007 sim: adding constructor to GuestAddr
A constructor is added to GuestAddr as suggested in the pull request
feedback. This allows a cast conversion from uint64_t to GuestAddr. Hence,
the casting from uint64_t to GuestAddr by reinterpret_cast is removed
(was added in a previous commit).

using namespace pseudo_inst is also removed as requested.

Comments are added to GuestAddr.

Change-Id: Ib76de2bff285f4e53ad03361969c27f7bb2dfe9e
2024-04-01 18:05:56 +00:00
Jason Lowe-Power
ed5ffee49c util-m5: Add default M5OP_ADDR to arm64 (#977)
As pointed out here [1], the expected M5OP_ADDR for arm64 arch is
0x10010000. This change reflects that.

[1] https://github.com/gem5/gem5/pull/725
2024-04-01 08:51:54 -07:00
Matthew Poremba
78cf39bf63 arch-vega: Operand selectors for accumulation registers (#955)
AMD's MI100 introduced a new register file called accumulation registers
for the matrix cores. In MI200 these were recombined into the same
register file according to the documentation. The accumulation register
file is the same size as the architectural register file, hence the size
is doubled.

The ISA spec does not explicitly state the register selector values,
however it does say that the accumulation offset from the kernel
dispatch packet should be added to the architecture register file
selector number when an instruction sets the ACC bit. Therefore we can
infer that the value must simply be an extension beyond the
architectural VGPRs.

This fixes errors of the form "invalid register selector: 512" (or
higher value). This was tested with the Learn the Basics tutorial
example on pytorch.org

Change-Id: I48ced1532fc166d2f5032fe21fbeba70ac77f258
2024-04-01 08:45:37 -07:00
Nicholas Mosier
00d4b6825c sim-se: Implement statx system call for Linux x86-64 (#887)
Implement the statx Linux-specific system call for x86-64. statx is used
by LLVM's libc.

Change-Id: Ic000a36a5e5c1399996f520fa357b9354c73c864
2024-04-01 08:23:39 -07:00
Bobby R. Bruce
b310ddf79a misc: Upgrade {download/upload}-artifact to v4
v3 was causing a 'Node.js 16 actions are deprecated' error.

Note: download-artifact@v4 must be used with upload-artifact@v4 and
vice-versa.

Change-Id: Icb8ab6d27aed4557be95ce31dd89d4655010968e
2024-03-30 01:22:28 -07:00
Bobby R. Bruce
21a00be6eb misc: Fix 'checkout@v2' to 'checkout@v4'
This caused a 'Node.js 16 actions are deprecated;' error.

With this commit all our checkout actions are set to '@v4'.

Change-Id: I0f931bf7967f49ee44b7bf1d6a56e19f017fb948
2024-03-30 01:14:57 -07:00
Hoa Nguyen
136c0eff3b util-m5: Add a warning when m5op_addr is 0x0
This address, 0x0, is most likely a wrong address to call m5 ops.
The warning will catch the problem where m5op_addr is not initialized
properly.

Change-Id: I442b4806191ae6f5c137bc947f2a269684c599dd
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-03-30 02:15:24 +00:00
Debjyoti Bhattacharjee
ec690de0da arch-riscv: This commit fixes bug in vfmv.f.s impl. in riscv (#863)
The existing implementation of vfmv instruction did not type cast the
first element of the source vector, which caused the "freg" to interpret
the result as a NaN.

With the type cast to f32, the value is correctly recognized as float
and sign extended to be stored in the fd register.

Git issue: https://github.com/gem5/gem5/issues/827

Change-Id: Ibe9873910827594c0ec11cb51ac0438428c3b54e

---------

Co-authored-by: Debjyoti B <bhatta53@imec.be>
Co-authored-by: Tommaso Marinelli <tommarin@ucm.es>
2024-03-29 08:23:14 -07:00
Harshil Patel
9207458fd7 stdlib: add socks proxy to atlas client (#864) 2024-03-28 14:30:02 -07:00
Bobby R. Bruce
55c58da504 base: Convert doGzipLoad to use std::string instead of *char
Change-Id: I28c9bf7853267686402b43be00f857914770f7a7
2024-03-28 14:23:13 -07:00
Hoa Nguyen
294dd6dd01 util-m5: Add default M5OP_ADDR to arm64
As pointed out here [1], the expected M5OP_ADDR for arm64 arch
is 0x10010000. This change reflects that.

[1] https://github.com/gem5/gem5/pull/725

Change-Id: I7e72f5ea20d4aacf3115a485ba7cd664d33d037e
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-03-28 21:05:58 +00:00
Giacomo Travaglini
63706f04b5 dev: Remove duplicate virtio files (#976)
Remove the following files:
* src/dev/virtio/rng 2.cc
* src/dev/virtio/rng 2.hh

Which were a copy of rng.hh and rng.cc. Probably added to the repository
by accident. They were not compiled by scons


Change-Id: I9d1da19cc243c513ab7af887b1b6260d8e361b57

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-28 14:32:11 +00:00
Yu-Cheng Chang
896c32cd0d arch: Add getIsaName in BaseISA (#975)
Change-Id: I81bfcd691d570430f7011f0d5023e5ea613e0dd9
2024-03-28 13:27:32 +00:00
Giacomo Travaglini
42fb1d657c stdlib: Add DTB generation capabilities to AbstractCacheHierarchy
Now that we are able to provide a view of the cache hierarchy from
the python world, we can start generating DTB entries for caches
and more specifically to properly fill the next-level-cache and
cache-level properties

Change-Id: Iba9ea08fe605f77a353c9e64d62b04b80478b4e2
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-26 13:14:35 +00:00
Giacomo Travaglini
be1cac6c21 stdlib: Use newly defined tree for PrivateL1PrivateL2 hierarchy
Change-Id: I803c6118c4df62484018f9e4d995026adb1bbc2c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-26 13:09:55 +00:00
Giacomo Travaglini
1664625c91 stdlib: Add tree structure to the AbstractCacheHierarchy
One of the things we miss in gem5 is the capability to neatly compose the
cache hierarchy of CPUs and clusters of CPUs.  The BaseCPU
addPrivateSplitL1Caches and addTwoLevelCacheHierarchy APIs have
historically been used to bind cache levels together.

These APIs have been superseded by the introduction of the Cache
hierarchy abstraction in the standard library. The standard library
makes it cleaner for a user to quickly instantiate a hierarchy of caches
with few lines of code.  While this removes a lot of complexity for a
user, the Hierarchy objects still have little information about their
internal topology.

To address this problem, this patch adds a tree data structure to the
AbstractCacheHierarchy class, where every node of the tree represents
a cache in the hierarchy. In this way we will expose APIs for traversing
and querying the tree.

For example, a 2-CPU system with private L1, private L2 and shared L3
will contain the following tree:

         [root]
           |
          [L3]
           /\
          /  \
        [L2] [L2]
         |    |
        [L1] [L1]
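
A minimal sketch of such a tree is shown below, assuming a plain parent/children representation. `CacheNode`, `walk`, `next_level` and `level` are hypothetical names chosen for illustration and are not the API added to AbstractCacheHierarchy.

```python
class CacheNode:
    """One node of the hierarchy tree; the root is a placeholder."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def next_level(self):
        """The cache one step closer to memory (the parent node)."""
        return self.parent

    def level(self):
        """Depth counted from the leaves: a leaf (an L1 here) is level 1."""
        if not self.children:
            return 1
        return 1 + max(child.level() for child in self.children)

    def walk(self):
        """Pre-order traversal of this node and everything below it."""
        yield self
        for child in self.children:
            yield from child.walk()


# Build the 2-CPU example: private L1, private L2, shared L3.
root = CacheNode("root")
l3 = CacheNode("L3", root)
l2_a, l2_b = CacheNode("L2", l3), CacheNode("L2", l3)
l1_a, l1_b = CacheNode("L1", l2_a), CacheNode("L1", l2_b)

names = [node.name for node in root.walk()]
```

Queries such as "which cache sits below this L1" or "what level is this cache" then reduce to following `parent` links or measuring depth.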

Change-Id: I78fe6ad094f0938ff9bed191fb10b9e841418692
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-26 13:09:47 +00:00
Giacomo Travaglini
9ab97c8930 mem-cache: Move partitioningPolicies to the PartitionManager
Change-Id: I13b41e918ed3864e1a52940786b3eec063253e1d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-26 12:12:24 +00:00
Giacomo Travaglini
d0539fe7cb mem-cache: Define a PartitionManager to handle partitioning
This is a first step towards offloading some of the partitioning
logic to the partitioning manager. We start with this patch
by replacing the static readPacketPartitionId into a virtual
method owned by the manager.

The issue with readPacketPartitionId as of now is that it relies
on the fixed PartitionFieldExtention.
We want users to be able to extract the partitionId more flexibly
and this requires using a SimObject

Change-Id: I3bd2e81e2a97c55fc83548956fc59f422c8049a6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-26 12:12:15 +00:00
Ivan Fernandez
c91d1253de cpu: This commit updates cpu FUs according to new Simd types
This commit updates cpu by removing VectorXXX types and updates
    FUs according to the newer SimdXXX ones. This is part of the
    homogenization of RISCV Vector instruction types, which moved
    from VectorXXX to SimdXXX.

Change-Id: I84baccd099b73a11cf26dd714487a9f272671d3d
2024-03-25 19:01:47 +01:00
Ivan Fernandez
aa24c9010f arch-riscv: This commit adds new instruction types to RISC-V
This commit adds some more detailed instruction types for RISC-V
    Vector. Concretely, it substitutes VectorIntegerArith,
    VectorFloatArith, VectorIntegerReduce and VectorFloatReduce with
    more specific types related to the operation that each instruction
    performs, being consistent with SimdXXX ones.

Change-Id: Iaffa74871ccc56d8c3627e1f1e111b9bc9e864af
2024-03-25 19:01:06 +01:00
Ivan Fernandez
274795c6ee arch-arm: This commit fixes two RISC-V inst types used in SVE
This commit fixes two RISC-V instruction types (VectorXXX) that
    were used in ARM SVE to the proper SimdXXX ones.

Change-Id: Id632926a89ae2395234f3cf34adeab63844bdd57
2024-03-25 15:25:11 +01:00
Carson Molder
dd5a30d41e sim-se,cpu-kvm: Fix SE workload setup on KVM CPUs (#956)
This PR fixes #948 in which running KVM CPUs through the updated gem5
interface in SE mode causes an immediate crash.

To fix this, I added a check to set_se_binary_workload that checks if
any of the cores are KVM, and if so, sets a couple of knobs for the
board and process that are required to make KVM work. The deprecated
se.py script, which sets these knobs, is able to run KVM in SE mode just
fine, so doing the same here fixed the bug.
2024-03-23 15:15:11 -07:00
Bobby R. Bruce
8249fa8dee base: Fix 'doGzipLoad' str manipulation
When running `scons build/ALL/gem5.opt --with-ubsan`, with GCC, the
following error was returned:

```
[     CXX] src/base/loader/image_file_data.cc -> ALL/base/loader/image_file_data.o
In file included from /usr/include/string.h:535,
                 from /usr/include/c++/11/cstring:42,
                 from src/base/cprintf_formats.hh:33,
                 from src/base/cprintf.hh:38,
                 from src/base/logging.hh:49,
                 from src/base/loader/image_file_data.cc:40:
In function ‘char* strcpy(char*, const char*)’,
    inlined from ‘int gem5::loader::doGzipLoad(int)’ at src/base/loader/image_file_data.cc:70:11,
    inlined from ‘gem5::loader::ImageFileData::ImageFileData(const string&)’ atsrc/base/loader/image_file_data.cc:116:24:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:79:33: error: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ offset [0, 19] is out of the bounds [0, 0] [-Werror=array-bounds]
   79 |   return __builtin___strcpy_chk (__dest, __src, __glibc_objsize (__dest));
      |          ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
scons: *** [build/ALL/base/loader/image_file_data.o] Error 1
scons: building terminated because of errors.
```

I do not know the exact issue but using strcpy in this way (i.e.
`strcpy(char_pointer + offset, string)`) appears to trigger this error
with the undefined behavior sanitizer. The fix in this patch replaces
this with `strcat`.

Change-Id: I1a0c50c9022adc841e175aad0fe2247bfcb29d71
2024-03-23 15:07:26 -07:00
Ivan Fernandez
1e743fd85a arch-riscv: adding vector unit-stride segment stores to RISC-V (#913)
This commit adds support for vector unit-stride segment store operations
for RISC-V (vssegXeXX). This implementation is based on two types of
microops:
- VsSegIntrlv microops that properly interleave source registers into
structs.
- VsSeg microops that store data in memory as contiguous structs of
several fields.

Change-Id: Id80dd4e781743a60eb76c18b6a28061f8e9f723d

Gem5 issue: https://github.com/gem5/gem5/issues/382
2024-03-22 15:45:58 -07:00
Matthew Poremba
7d62da6d10 dev-amdgpu: Support for ROCm 6.0 (#926)
Implement several features new in ROCm 6.0 and features required for
future devices. Includes the following:

- Support for multiple command processors
- Improve handling of unknown register addresses
- Use AddrRange for MMIO address regions
- Handle GART writes through SDMA copy
- Implement PCIe indirect reads and writes
- Improve PM4 write to check dword count
- Implement common MI300X instruction
2024-03-21 21:12:09 -07:00
Matthew Poremba
dca040983b arch-vega: Various vega fixes to enable nanogpt (#950)
This PR fixes some issues observed that were needed to get nanogpt
working.
2024-03-21 21:11:44 -07:00
Michael Boyer
803dbbfdac arch-vega: Implement flat_load_sbyte instruction (#953)
Change-Id: I642a71c504e2d3afecd5d2dfd9db016945aed21b
2024-03-21 21:11:10 -07:00
Harshil Patel
76965c6431 tests: Update tests to use specific resource versions (#901)
This update modifies the test configuration to specify the versions of
resources used, rather than automatically using the latest versions.
Previously, if a resource was updated for a change, it could potentially
cause tests to fail if those tests were incompatible with the new
version of the resource.
Now, with this change, tests are tied to specific versions of resources,
ensuring that any updates to resources will require corresponding
updates to the tests to maintain compatibility.

Change-Id: I9633b1749f6c6c82af6aa6697b7e7656020f62fa
2024-03-21 09:03:46 -07:00
Bobby R. Bruce
4c33397592 misc: Add ".DS_Store" to .gitignore (#952)
These Apple MacOS files define custom characteristics of a directory.
They have nothing to do with the source code and should therefore be
ignored.
2024-03-21 08:40:44 -07:00
Matthew Poremba
823b5a6eb8 dev-amdgpu: Support multiple CPs and MMIO AddrRanges
Currently gem5 assumes that there is only one command processor (CP)
which contains the PM4 packet processor. Some GPU devices have multiple
CPs which the driver tests individually during POST if they are used or
not. Therefore, these additional CPs need to be supported.

This commit allows for multiple PM4 packet processors which represent
multiple CPs. Each of these processors will have its own independent
MMIO address range. To more easily support ranges, the MMIO addresses
now use AddrRange to index a PM4 packet processor instead of the
hard-coded constexpr MMIO start and size pairs.

By default only one PM4 packet processor is created, meaning the
functionality of the simulation is unchanged for devices currently
supported in gem5.

Change-Id: I977f4fd3a169ef4a78671a4fb58c8ea0e19bf52c
2024-03-21 10:13:55 -05:00
Matthew Poremba
39153cd234 dev-amdgpu: Implement PCIe indirect read/write
PCIe can read/write to any 32-bit address using the PCI index/index2
registers as an address and then reading/writing the corresponding
data/data2 register.
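
The index/data mechanism can be modeled with a toy device. This is a sketch of the general pattern only; `IndirectRegs` and its methods are hypothetical and do not mirror the AMDGPUDevice code.

```python
class IndirectRegs:
    """Toy model of an index/data register pair over a 32-bit space."""

    def __init__(self):
        self.space = {}  # backing store: address -> 32-bit value
        self.index = 0   # address latched by the last index write

    def write_index(self, addr: int) -> None:
        self.index = addr & 0xFFFFFFFF

    def write_data(self, value: int) -> None:
        # A data write lands at whatever address the index register holds.
        self.space[self.index] = value & 0xFFFFFFFF

    def read_data(self) -> int:
        # Unwritten locations read as zero in this model.
        return self.space.get(self.index, 0)


dev = IndirectRegs()
dev.write_index(0x1234)
dev.write_data(0xDEADBEEF)
dev.write_index(0x10)
untouched = dev.read_data()
dev.write_index(0x1234)
stored = dev.read_data()
```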

This commit adds this functionality and removes one magic value being
written to support GPU POST. This feature is disabled for Vega10 which
relies on an MMIO trace for too many values to implement in the MMIO
interface.

Change-Id: Iacfdd1294a7652fc3e60304b57df536d318c847b
2024-03-21 10:13:55 -05:00
Matthew Poremba
047c194780 dev-amdgpu: Implement SRBM write
The SRBM write packets were previously not required. This commit
implements SRBM writes to set a register by using the new setRegVal
interface. SRBM writes seem to be used for SRIOV enabled devices.

Change-Id: I202653d339e882e8de59d69a995f65332b2dfb8c
2024-03-21 10:10:01 -05:00
Matthew Poremba
6bbde8fbb8 dev-amdgpu: Rework handling of unknown registers
The top level AMDGPUDevice currently reads/writes all unknown registers
to/from a map containing the previously written value. This is intended
as a way to handle registers that are not part of the model but the
driver requires for functionality. Since this is at the top level, it
can mask changes to register values which do not go through the same
interface. For example, reading an MMIO, changing via PM4 queue, and
reading again returns the stale cached value.

This commit removes the usage of the regs map in AMDGPUDevice,
implements some important MMIOs that were previously handled by it, and
moves the unknown register handling to the NBIO aperture only. To reduce
the number of additional MMIOs to implement, the display manager in
vega10 is now disabled.

Change-Id: Iff0a599dd82d663c7e710b79c6ef6d0ad1fc44a2
2024-03-21 10:10:01 -05:00
Matthew Poremba
009cec56e0 dev-amdgpu: Check for SDMA copies to GART range
The SDMA engine can potentially be used to write to the GART address
range. Since gem5 has a shadow copy of the GART table to avoid sending
functional reads to device memory, the GART table must be updated when
copying to the GART range.

This changeset adds a check in the VM for GART range and implements the
SDMA copy packet writing to the GART range. A fatal is added to write
and ptePde, which are the only other two ways to write to memory, as
using these packets to update the GART table has not been observed.

Change-Id: I1e62dfd9179cc9e987659e68414209fd77bba2bd
2024-03-21 10:10:01 -05:00
Matthew Poremba
998709d4fc dev-amdgpu: Improve PM4 write data packet
The write data packet can write multiple dwords but currently always
assumes there is one dword, which can cause some write data to be
missed. This case is not common, but the number of dwords is implicitly
defined in the PM4 header.

This changeset passes the PM4 header to write data so that the correct
number of dwords can be determined. For now we assume no page crossing
when writing multiple dwords as the driver should be checking for that.

Change-Id: I0e8c3cbc28873779f468c2a11fdcf177210a22b7
2024-03-21 10:10:01 -05:00
Matthew Poremba
c045c68540 dev-amdgpu: Add node_id to interrupt handler
The ROCm 6.0 driver adds a node_id field to interrupts which must match
before passing on the interrupt to be cleared by the cookie from gem5's
interrupt handler implementation. Add this field and enable for gfx942.

The usage of the field can be seen in event_interrupt_isr_v9_4_3 at
https://github.com/ROCm/ROCK-Kernel-Driver/blob/roc-6.0.x/drivers/
    gpu/drm/amd/amdkfd/kfd_int_process_v9.c#L449

Change-Id: Iae8b8f0386a5ad2852b4a3c69f2c161d965c4922
2024-03-21 10:10:01 -05:00
Matthew Poremba
9ab004cccc arch-vega: Implement V_LSHL_ADD_U64
This is a new instruction in MI300 and operates similar to
V_LSHL_ADD_U32 but on 64-bit values.

Change-Id: Ia4ac65160bdad748fccdcb28286ba03157cc4046
2024-03-21 10:10:01 -05:00
Matthew Poremba
f36be791aa arch-vega: Expand FLAT subDecode range in main decoder
The main decoder for GPU instructions looks at the first 9 bits of a
dword to determine either the instruction or a subDecode table with more
information for specific instructions types. For flat instructions the
first 9 bits currently consist of 6 fixed encoding bits, a reserved bit,
and the first two bits of the opcode. Hence to support all opcodes there
are four indirections to the flat subDecode table. In MI300 the reserved
bit is part of a field to determine memory scope and therefore may be
non-zero.

This commit adds four additional calls to the subDecode table for the
cases where the scope bit is 1. See page 468 (PDF page 478) below:

https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/
    instruction-set-architectures/
    amd-instinct-mi300-cdna3-instruction-set-architecture.pdf
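
The count of decoder entries follows from the bit layout described above, and can be enumerated as in the sketch below. The 6-bit encoding value and the MSB-first ordering of the three fields are assumptions made for illustration and should be checked against the ISA document; `flat_prefixes` is a hypothetical helper.

```python
FLAT_ENCODING = 0b110111  # assumed 6-bit FLAT encoding, verify against the ISA


def flat_prefixes(scope_bits=(0, 1)):
    """List the 9-bit prefixes: encoding(6) | scope bit(1) | opcode top bits(2)."""
    return [
        (FLAT_ENCODING << 3) | (scope << 2) | op_hi
        for scope in scope_bits
        for op_hi in range(4)
    ]


before = flat_prefixes(scope_bits=(0,))  # reserved bit fixed at 0
after = flat_prefixes()                  # scope bit may be 0 or 1
```

With the bit fixed at zero there are four prefixes; allowing it to be one doubles that to eight, which matches the four extra indirections mentioned in the commit.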

Change-Id: Ic3c786f0ca00a758cbe87f42c5e3470576f73a32
2024-03-21 10:10:01 -05:00
Michael Boyer
acd9d3ff94 gpu-compute: Add support for skipping GPU kernels (#940)
gpu-compute: Add support for skipping GPU kernels

This commit adds two new command-line options:

--skip-until-gpu-kernel N
Skips (non-blit) GPU kernels until the target kernel is reached.
Execution continues normally from there. Blit kernels are not skipped
because they are responsible for copying the kernel code and metadata
for the non-blit kernels. Note that skipping kernels can impact
correctness; this feature is only useful if the kernel of interest has
no data-dependent behavior, or its data-dependent behavior is not based
on data generated by the skipped kernels.

--exit-after-gpu-kernel N
Ends the simulation after completing (non-blit) GPU kernel N.

This commit also renames two existing command-line options:
--debug-at-gpu-kernel -> --debug-at-gpu-task
--exit-at-gpu-kernel  -> --exit-at-gpu-task

These were renamed because they count GPU tasks, which include both
kernels launched by the application as well as blit kernels.

Change-Id: If250b3fd2db05c1222e369e9e3f779c4422074bc
2024-03-21 07:46:27 -07:00
Matthew Poremba
e02f329d5d arch-vega: Fix VOP3 decode table off-by-one
There is no VOP3 opcode 667. Mark that invalid and move the opcodes
after down by one.

Change-Id: Ia4ccda91f6f501c1ce7c5898d7d0e924604a459a
2024-03-20 16:41:31 -05:00
Matthew Poremba
457d97ea52 arch-vega: Implement V_XNOR_B32
Change-Id: Id23a8d984f227ca23a92adb6c7fde3b4627af054
2024-03-20 16:37:37 -05:00
Matthew Poremba
1b15b2cc4b arch-vega: Support negative modifiers for packed F32 math
MI200 adds support for four FP32 packed math instructions. These are
VOP3P instructions which have a negative input modifier field. The
description left it unclear whether these modifiers are used for F32
packed math. However, the assembly of some Tensile kernels uses them, so
this commit adds support for them. Tested with the PyTorch nn.Dropout
kernel, which uses negative modifiers.

Change-Id: I568a18c084f93dd2a88439d8f451cf28a51dfe79
2024-03-20 16:37:23 -05:00
Matthew Poremba
3f8d0e1ef8 arch-vega: Fix V_FMAC_F32 data type
The datatype is U32 but should be F32. This is causing an implicit cast
leading to incorrect results. This fixes nn.Dropout in PyTorch.

Change-Id: I546aa917fde1fd6bc832d9d0fa9ffe66505e87dd
2024-03-20 16:37:23 -05:00
Michael Boyer
ba2f5615ba gpu-compute: Support cache line sizes >64B in GPUFS (#939)
This change fixes two issues:

1) The --cacheline_size option was setting the system cache line size
but not the Ruby cache line size, and the mismatch was causing assertion
failures.

2) The submitDispatchPkt() function accesses the kernel object in
chunks, with the chunk size equal to the cache line size. For cache line
sizes >64B (e.g. 128B), the kernel object is not guaranteed to be
aligned to a cache line and it was possible for a chunk to be partially
contained in two separate device memories, causing the memory access to
fail.
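
The chunking problem in (2) can be illustrated with a small sketch. This is not the gem5 code: `chunks`, `split_at_boundary`, and the boundary value are hypothetical names and numbers, and splitting at the boundary is only one possible way to keep each access inside a single memory.

```python
def chunks(base, size, chunk_size):
    """Yield (addr, length) pieces of [base, base + size) in chunk_size steps."""
    addr = base
    end = base + size
    while addr < end:
        length = min(chunk_size, end - addr)
        yield addr, length
        addr += length


def split_at_boundary(addr, length, boundary):
    """Split one access so that no piece crosses a multiple of `boundary`."""
    pieces = []
    while length > 0:
        room = boundary - (addr % boundary)  # bytes left before the boundary
        step = min(length, room)
        pieces.append((addr, step))
        addr += step
        length -= step
    return pieces
```

With a 128B chunk size and an object that starts at 0x40, the second chunk covers 0xC0 to 0x140 and would straddle a hypothetical memory boundary at 0x100. `split_at_boundary(0xC0, 128, 0x100)` breaks it into two accesses, each contained in one memory.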

Change-Id: I8e45146901943e9c2750d32162c0f35c851e09e1

Co-authored-by: Michael Boyer <Michael.Boyer@amd.com>
2024-03-20 11:09:25 -07:00
Giacomo Travaglini
2b67d0eba6 stdlib, tests, configs: Add a new PrivateL1PrivateL2WalkCache hierarchy (#935)
From [1] The PrivateL1PrivateL2Cache hierarchy has been amended with an
MMUCache, which is basically a small cache in front of the page table
walker.

Not every ISA makes use of it.
Arm for example already implements caching of page table walks, via the
partial_levels parameter in the ArmTLB.
With this patch we define a new module which explicitly makes use of the
WalkCache. Configurations that do not require
another cache in the first level of the memsys (for the ptw) can use the
PrivateL1PrivateL2CacheHierarchy
    
[1]: https://gem5-review.googlesource.com/c/public/gem5/+/49364
2024-03-19 09:04:32 +00:00
Yu-Cheng Chang
dbae09e4d9 arch-riscv: Move alignment check to Physical Memory Attribute(PMA) (#914)
In the RISC-V unprivileged spec[1], misaligned load/store support
depends on the EEI.

In the RISC-V privileged spec ver1.12[2], the PMA specifies whether
misaligned access is supported for each data width and memory region.

According to the `mcause` spec [3], we could directly raise a misaligned
exception if no memory region supports misaligned access. If part of the
memory supports misaligned access, we need to translate the `vaddr` to
a `paddr` first and then check the `paddr`. The page fault or access
fault is raised before the misaligned fault.

The benefit of moving the check_alignment option from an ISA option to a
PMA option is that we can specify which regions of memory support
misaligned load/store.

The MMU checks alignment with the virtual address if no misaligned
memory region is specified. If some memory regions support misaligned
access, it translates the address first and checks alignment last.
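
The ordering described above can be sketched as follows. This is a minimal illustration, not the gem5 MMU: `access`, the `translate` callback, the two exception classes, and the region list are all hypothetical names introduced here.

```python
class PageFault(Exception):
    """Stands in for a page fault or access fault raised during translation."""


class MisalignedFault(Exception):
    """Stands in for a misaligned-address exception."""


def access(vaddr, size, translate, misaligned_regions=()):
    """Return the physical address, or raise the fault that takes priority."""
    if not misaligned_regions:
        # No region tolerates misalignment: check the virtual address early.
        if vaddr % size:
            raise MisalignedFault(hex(vaddr))
        return translate(vaddr)
    # Some region tolerates misalignment: translate first, so a page or
    # access fault is reported before any misaligned fault.
    paddr = translate(vaddr)
    if paddr % size:
        allowed = any(lo <= paddr and paddr + size <= hi
                      for lo, hi in misaligned_regions)
        if not allowed:
            raise MisalignedFault(hex(paddr))
    return paddr
```

The regions are treated as half-open `(lo, hi)` physical ranges, which is an assumption of the sketch rather than something the commit message states.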
    
[1]
https://github.com/riscv/riscv-isa-manual/blob/main/src/rv32.adoc#base-instruction-formats
[2]
https://github.com/riscv/riscv-isa-manual/blob/main/src/machine.adoc#physical-memory-attributes
[3]
https://github.com/riscv/riscv-isa-manual/blob/main/src/machine.adoc#machine-cause-register-mcause
2024-03-18 12:59:13 -07:00
Yan Lee
84da503d37 mem: Fix callback of functional access in port wrapper (#938)
In the previous implementation of port_wrapper, recvFunctional() called
the timing request callback. This was a typo, and this change fixes
it.
2024-03-18 08:21:43 -07:00
Giacomo Travaglini
058dd7e195 configs, tests: Amend stdlib configs to use WalkCache hierarchy
As X86 and RISCV are relying on a Table Walker cache, we
change their stdlib configs to use the newly defined

PrivateL1PrivateL2WalkCacheHierarchy

Change-Id: I63c3f70a9daa3b2c7a8306e51af8065bf1bea92b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-18 09:42:05 +00:00
Giacomo Travaglini
d32a438913 stdlib: Add a new private_l1_private_l2_walk_cache_hierarchy.py module
From [1] The PrivateL1PrivateL2Cache hierarchy has been amended
with an MMUCache, which is basically a small cache in front
of the page table walker. Not every ISA makes use of it.

Arm for example already implements caching of page table
walks, via the partial_levels parameter in the ArmTLB.

With this patch we define a new module which explicitly makes
use of the WalkCache. Configurations that do not require
another cache in the first level of the memsys (for the ptw)
can use the PrivateL1PrivateL2CacheHierarchy

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/49364

Change-Id: I17f7e68940ee947ca5b30e6ab3a01dafeed0f338
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-18 09:42:05 +00:00
Robert Hauser
0b45be7720 arch-riscv: define size_t and off_t for 32 bit
size_t is defined as 32 bit unsigned integer and off_t as 64 bit signed
integer for 32 bit Linux.

Change-Id: Icaa26dfc75600df2450d7df45b6ba4e3c1a1546f
2024-03-16 09:09:57 +00:00
Robert Hauser
0fc08acf92 sim: add whitespace for correct coding style
This commit adds two additional whitespaces in the definition of
GuestAddr as well as in the operator << overload.

Change-Id: Ifb371a09b378fcf4862a768f113b5963b45bd167
2024-03-16 09:07:38 +00:00
Robert Hauser
f7da70bd10 arch-riscv,sim: simplify templates for GuestAddr
Simplify templates in argument handling for ABI=RiscvISA::RegABI32 and
Arg=GuestAddr.

Change-Id: I6af2e6fe1b77b1367136a8e8621053069bf3c3f0
2024-03-16 09:05:30 +00:00
Robert Hauser
e3fd3d7775 arch-arm,sim: fix argument handling for GuestAddr
Change-Id: If7bc759ee752333b717b61a6c577cf2d5846f4db
2024-03-16 09:04:51 +00:00
Robert Hauser
bf63ec953a arch-riscv: revert SyscallABI32 changes
Change-Id: I07c3e4aee06a6f5576d4a3488a29673fdb0a09bf
2024-03-16 09:04:36 +00:00
Giacomo Travaglini
0ec8cf8d05 dev-arm: Fix SMMUv3 DTB autogen (#934)
Replacing FdtPropertyWords (expecting an integer) with FdtPropertyStrings

Change-Id: Icd1cf00704e253c88ac9b1d69c3cf946d2a8ca70

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-14 15:42:57 +00:00
dependabot[bot]
6f90feca56 build(deps): bump cryptography from 42.0.0 to 42.0.4 in /util/gem5-resources-manager (#929)
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.0
to 42.0.4.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-11 20:50:29 -07:00
Bobby R. Bruce
e8bc4fc137 misc: Sync stable .github dir with develop (#928) 2024-03-11 12:35:35 -07:00
Ivana Mitrovic
85a20773c7 misc: Fix weekly tests (#924)
There were two errors causing the Weekly tests to fail. Each has a patch
in this PR:

1. Fixed incorrect version for the `artifact-downloader` (v4 instead of
v3).
2. Fixed incorrect use of `working-directory`, which caused use of
`build/VEGA_X86/gem5.opt` to fail (not accessible in the set
`working-directory`). The default `${github.workspace}` is sufficient.
2024-03-11 11:08:30 -07:00
Robert Hauser
3d2d960f10 arch-riscv: fix return value of pseudo instruction
Only the lower 32 bits of the return values of pseudo instructions are
stored (in a0). Therefore, the upper 32 bits are stored in a1 to
enable a correct return value.
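
The split across a0 and a1 amounts to the following arithmetic. This is a sketch only; the function names are hypothetical and a0/a1 are modelled as plain integers.

```python
MASK32 = 0xFFFFFFFF


def split_return_value(value):
    """Split a 64-bit value into (a0, a1): the low and high 32 bits."""
    value &= 0xFFFFFFFFFFFFFFFF
    return value & MASK32, (value >> 32) & MASK32


def join_return_value(a0, a1):
    """Rebuild the 64-bit value from the two 32-bit register values."""
    return ((a1 & MASK32) << 32) | (a0 & MASK32)
```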

Change-Id: Idf33c325033281fc191a9285eb5d34fd4965cde9
2024-03-11 15:32:15 +00:00
Tiago Mück
942979162a READ_MODIFY_WRITE flag fix (#922)
Change bit for Request::READ_MODIFY_WRITE, which was the same as
Request::ACQUIRE.
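
A collision of this kind can be detected mechanically. The sketch below is not taken from gem5: `find_shared_bits` is a hypothetical helper, and the bit positions in the usage note are invented purely to show the check.

```python
def find_shared_bits(flags):
    """Return pairs of flag names whose bit masks overlap."""
    names = list(flags)
    clashes = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if flags[first] & flags[second]:
                clashes.append((first, second))
    return clashes
```

For example, with made-up values `{"ACQUIRE": 1 << 4, "READ_MODIFY_WRITE": 1 << 4}` the helper reports the pair, and it reports nothing once one of the two is moved to an unused bit.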

Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2024-03-11 08:32:11 -07:00
Robert Hauser
d358813a7a arch-riscv: fix argument handling of syscalls in SE mode
With the previously introduced struct wrapper GuestAddr, the asm
tests fail. This patch implements SyscallABI32 similar
to RegABI32, i.e., as a struct based on GenericSyscallABI32.
Furthermore, a get function for arguments is implemented for wide
arguments. It returns the lower 32 bits of a register.

Change-Id: I233a67a5d5c15ab0d019a63bc57f1225288e33cc
2024-03-11 15:28:23 +00:00
Robert Hauser
de52f3614c sim: enable pseudo instructions with varying pointer size
In this patch, Addr is substituted by a struct wrapper (uint64_t) in the
pseudo instruction functions. This enables a correct argument handling
in systems where pointer size != 64 bit.

Change-Id: Ie84b43b4ab8e6c0d38c7b6b16e19fc043110681b
2024-03-11 15:27:58 +00:00
Bobby R. Bruce
1f0075bdbd misc: Remove incorrect 'working-directory' in Weekly tests
"build/VEGA_X86/gem5.opt" is not available in directory "hip". The
default `${github.workspace}` is where it should be run from. This patch
fixes this.

Change-Id: I99875270c77dde92d3ec2ae0a07760905eaf903e
2024-03-11 06:09:47 -07:00
Bobby R. Bruce
2bec61e69a misc: Revert download-artifact v4 to v3
This was accidentally added in a previous commit. It breaks downloading.

Change-Id: I790b184e852ff732ea1106cb2cde79a83828bdcf
2024-03-11 05:59:37 -07:00
Giacomo Travaglini
bbde68c08c dev-arm: Handle translation aborts and add IRQ support to the SMMU (#920)
At the moment the SMMU does not handle translation errors gracefully in
the way described by the SMMUv3 spec: whenever a translation fault
arises, simulation aborts as a whole. With this PR we add minimal
support for translation fault handling, which means:

1) Not aborting simulation, but rather:
2) Writing an event entry to the SMMU_EVENTQ (event queue)
3) Signaling to the PE that an error arose and there is an event entry
to be consumed. The signaling is achieved with the addition of the
eventq SPI. Using an MSI is also possible, though it is currently
disabled by SMMU_IDR0.MSI being set to zero.

The PR is addressing issues reported by
https://github.com/orgs/gem5/discussions/898
2024-03-08 18:27:35 +00:00
Giacomo Travaglini
5161195db5 dev-arm: Remove the SMMUv3 irq_interface_enable parameter
The SMMU_IRQ_CTRL had been made optionally writeable by a
prior patch [1] even if interrupts were not supported in
the SMMUv3 model.
As we are partially enabling IRQ support, we remove this option
and we make the SMMU_IRQ_CTRL always writeable

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/38555

Change-Id: Ie1f9458d583a5d8bcbe450c3e88bda6b3c53cf10
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-08 13:53:44 +00:00
Giacomo Travaglini
d63282a9da dev-arm: Implement wired interrupt for SMMU event queue
See https://github.com/orgs/gem5/discussions/898

The SMMUv3 Event Queue is basically unused at the moment.  Whenever a
transaction fails we actually abort simulation.  The sendEvent method
could be used to actually report the failure to the driver but it is
lacking interrupt support to notify the PE there is an event to handle.
The SMMUv3 spec allows both wired and MSI interrupts to be used.

We add the eventq_irq SPI param to the SMMU object and we draft an
initial sendInterrupt utility that makes use of it whenever it is
needed.

Change-Id: I6d103919ca8bf53794ae4bc922cbdc7156adf37a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-08 13:53:21 +00:00
Giacomo Travaglini
63c815b5fc dev-arm: Do not panic in the SMMUv3 for faulting transactions
Rely on the architected solution instead of aborting simulation.
This means handling writes to the Event queue to signal managing
software there was a fault in the SMMU

Change-Id: I7b69ca77021732c6059bd6b837ae722da71350ff
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-08 11:29:22 +00:00
Giacomo Travaglini
7d5d1cd9c8 dev-arm: Rewrite SMMUEvent
The struct fields of the SMMUEvent were not matching the SMMUv3 specs.
This was "not an issue" as events have been implicitly disabled until
now (every translation error was aborting simulation)

With generateEvent we automatically construct a SMMU event from
a translation result.

Change-Id: Iba6a08d551c0a99bb58c4118992f1d2b683f62cf
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-08 11:29:22 +00:00
Giacomo Travaglini
ef10db5a3e dev-arm: Record additional information in the TranslResult
A faulting translation should return additional information
(other than the fault type). This will be used by future
patches to properly populate the SMMU event record of the
event queue

As we currently support two faults only:

1) F_TRANSLATION
2) F_PERMISSION

We add to TranslResult the relevant fault information only:
type, class, stage and ipa

Change-Id: I0a81d5fc202e1b6135cecdcd6dfd2239c2f1ba7e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-08 11:29:22 +00:00
Giacomo Travaglini
3d1f68f205 dev-arm: Return translation fault in doReadCD
Reading the Context Descriptor (CD) might require a stage2
translation. At the moment doReadCD does not check for the
return value of the translateStage2.
This means that any stage2 fault will be silently discarded
and an invalid address will be used/returned.

By returning a translation result we make sure any error
happening in the second stage of translation will be properly
flagged

Change-Id: I2ecd43f7e23080bf8222bc3addfabbd027ee8feb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-08 11:29:22 +00:00
Giacomo Travaglini
4a4b775985 dev-arm: Provide encapsulation by adding TranslResult::isFaulting
We don't check the fault type directly. This will improve
readability once the TranslResult class will be augmented
with extra fields

Change-Id: I5acafaabf098d6ee79e1f0c384499cc043a75a9d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-08 11:29:22 +00:00
dependabot[bot]
f70dc88c8a build(deps): bump cryptography from 39.0.2 to 42.0.0 in /util/gem5-resources-manager (#853)
Bumps [cryptography](https://github.com/pyca/cryptography) from 39.0.2
to 42.0.0.

Co-authored-by: Harshil Patel <hpppatel@ucdavis.edu>
2024-03-07 08:14:14 -08:00
dependabot[bot]
f35815cd48 misc: bump pre-commit from 3.6.0 to 3.6.2 (#905)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.6.0
to 3.6.2.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-06 14:20:42 -08:00
dependabot[bot]
ceee8fed29 misc: bump tqdm from 4.66.1 to 4.66.2 (#906)
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.1 to 4.66.2.

Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
2024-03-06 14:20:03 -08:00
Ivan Fernandez
f6c61836b3 arch-riscv: adding vector unit-stride segment loads to RISC-V (#851)
This commit adds support for vector unit-stride segment load operations
for RISC-V (vlseg<NF>e<X>). This implementation is based in two types of
microops:
- VlSeg microops that load data as it is organized in memory in structs
of several fields.
- VectorDeIntrlv microops that properly deinterleave structs into
destination registers.
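
The deinterleaving step can be pictured with a short sketch. It models memory as a flat list of elements and registers as Python lists, and `deinterleave` is a hypothetical name. It does not reflect how the VectorDeIntrlv microops are actually implemented.

```python
def deinterleave(elements, nf):
    """Split a flat sequence of nf-field structs into nf per-field lists."""
    if nf <= 0 or len(elements) % nf:
        raise ValueError("element count must be a multiple of nf")
    # Field f of every struct sits at positions f, f + nf, f + 2 * nf, ...
    return [elements[f::nf] for f in range(nf)]
```

For three-field structs laid out as `r0 g0 b0 r1 g1 b1`, the result is one list per field: `[r0, r1]`, `[g0, g1]`, and `[b0, b1]`.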

Gem5 issue: https://github.com/gem5/gem5/issues/382
2024-03-06 11:27:06 -08:00
Giacomo Travaglini
b930c57d54 misc: Tag checkpoints with the ISA of the CPUs (#908)
With the introduction of multi-ISA gem5, we don't store the TARGET_ISA
anymore as a string in the root section of the checkpoint [1].  There is
therefore no way at the moment to assess the ISA of a CPU/ThreadContext.
This is a problem when it comes to checkpoint updates which are ISA
specific.

By explicitly serializing the ISA as a string under the cpu.isa section
we avoid this problem and we let cpt_upgraders be aware of the ISA in
use.

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/48884
2024-03-05 10:04:06 +00:00
Bobby R. Bruce
650b92124b misc: Copy the develop .github dir to stable (#912) 2024-03-05 08:35:11 +00:00
Giacomo Travaglini
5bce5673b0 util: Fix recent cpt_upgraders not checking for ISA
A set of cpt_upgraders was patching old checkpoints regardless
of the ISA in use. Thanks to the previous patch, we can now
retrieve the ISA of the CPU from the isa section.

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: Ia110068c06453796cefac028ee13f21667e5371a
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-03-04 17:51:40 +00:00
Giacomo Travaglini
3d2052bc03 misc: Serialize the ISA as a string in the checkpoint
With the introduction of multi-ISA gem5, we don't store the TARGET_ISA
anymore as a string in the root section of the checkpoint [1].  There is
therefore no way at the moment to assess the ISA of a CPU/ThreadContext.
This is a problem when it comes to checkpoint updates which are ISA
specific.

By explicitly serializing the ISA as a string under the cpu.isa section
we avoid this problem and we let cpt_upgraders be aware of the ISA in
use.

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/48884

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I1e75230cbc370cab84f4a54141b1e425af2dbfac
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-03-04 17:51:40 +00:00
Nitish Arya
676d571009 arch-riscv: adding stats to show completed page walks (#869)
This commit adds statistics showing completed page walks for 4KB and 2MB
pages. This will add to stats.txt the variables num_4kb_walks,
num_2mb_walks and the corresponding values. This is done based on the
level of page table walk traversed specific to Sv39 Virtual Memory
System.
2024-03-04 08:38:28 -08:00
Giacomo Travaglini
c57a6b0d59 mem-cache: Add support for partitioning caches (#765)
* Add Cache partitioning policies to manage and enforce cache
partitioning:
    * Add Way partition policy 
    * Add MaxCapacity partition policy
* Add PartitionFieldsExtension Extension class for Packets to store
Partition IDs for cache partitioning and monitoring
* Modify Cache SimObjects to store partition policies
* Modify Cache block eviction logic to use new partitioning policies

Co-authored-by: Adrian Herrera <adrian.herrera@arm.com>

Change-Id: Ib35153a8b46803c22a433926270d82e5e19ce544
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-04 09:44:01 +00:00
Giacomo Travaglini
c1d5ffe7c7 mem-cache: Prefetchers Improvements (#872)
This pull request contains a set of small patches which fix some bugs in
the gem5 prefetchers, and align out-of-the-box prefetcher performance
more closely with that which a typical user would expect.

The performance patches have been tested with an out-of-the-box
(untuned) Stride prefetcher configuration against a set of SPEC 2017
SimPoints, and show a small-to-modest IPC uplift across about half
the benchmarks, with no significant IPC degradation.

The new defaults were identified as part of work on gem5 prefetchers
undertaken by Nikolaos Kyparissas while on internship at Arm.

This PR is an updated version of PR #564, which was reverted due to Bug
#580. Bug #580 was fixed in PR #871. This PR updates #564 to the latest
state of the develop branch, and should be applied after PR #871.
2024-03-04 09:09:47 +00:00
Ivana Mitrovic
fae5f5e00b sim-se: Catch None value if binary is not compatible with gem5 (#903)
Adding an error message in case the binary is not compatible with gem5.

This PR is addressing the comments in issue #807.

Change-Id: I66466ed6f657276c13d237fde3b1ec12c20cfe91
2024-03-01 16:41:18 -08:00
Ivana Mitrovic
61adfa38b2 stdlib: Fix initialization for self.pic.hart_config in lupv_board (#904)
Previously merged PR #886 created pic.hart_config, but it was not
initialized properly in lupv_board.py. This issue is causing daily tests
to fail.

Change-Id: I193ff4a3e5ef787eefcf066404e762f024fa6603

---------

Co-authored-by: Yu-Cheng Chang <aucixw45876@gmail.com>
2024-03-01 11:25:00 -08:00
Giacomo Travaglini
c0e5d58a96 dev: RegisterBank addRegistersAt for fragmented reg banks (#902)
One of the limitations of the RegBank class is that it does not allow
you to pass a non-contiguous set of registers. Its simplest form will
just accept an initializer list of registers and it will store them in
sequence.

A more refined version [1] will optionally accept an offset value to be
passed alongside the register reference. This is not meant to be used by
the register bank to store the register at the provided offset.

It is rather used by the bank to sanity check the register sits exactly
at the provided range.

The way to work around this for a fragmented register space is to
explicitly allocate RAZ/RAO blocks as registers and to pass them to
addRegisters together with the others. (See the SysSecCtrl [2] as an
example)

This makes it a bit tedious to model a register bank with gaps between
its registers. First, the exact number and position of the gaps needs to
be extracted from a spec. These sometimes report only implemented
registers and their offset, and omit to document gaps/reserved space. So
a developer needs to manually add register offset and size to check if
all registers are contiguous. Second, these reserved register blocks
need to be instantiated in the bank, adding boilerplate code and
affecting readability.

For these reasons we add a new registration method, called
addRegisters*At*. It reuses the RegisterAdder class but this time the
offset field is really used to instruct the bank where the register
should be mapped. The method is templated and the template parameter
tells the bank which register type should be used to fill the remaining
space. We make the RegBank the owner of this filler space (registers are
generated internally within addRegistersAt).
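
The gap-filling idea can be sketched independently of the C++ API. The function below is a hypothetical illustration, not the `addRegistersAt` implementation: it takes `(offset, size, name)` tuples and returns the full layout with filler entries inserted wherever there is a gap.

```python
def layout_with_fillers(registers, filler_name="filler"):
    """Return a contiguous layout, inserting fillers between registers.

    `registers` is an iterable of (offset, size, name) tuples.
    """
    layout = []
    cursor = 0  # first byte not yet covered by the layout
    for offset, size, name in sorted(registers):
        if offset < cursor:
            raise ValueError(f"{name} overlaps the previous register")
        if offset > cursor:
            # Cover the gap so the bank stays contiguous.
            layout.append((cursor, offset - cursor, filler_name))
        layout.append((offset, size, name))
        cursor = offset + size
    return layout
```

The caller only lists the implemented registers and their offsets; the reserved space is derived rather than declared by hand, which is the boilerplate the commit message describes removing.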

[1]: https://github.com/gem5/gem5/blob/stable/src/dev/reg_bank.hh#L106
[2]: https://github.com/gem5/gem5/blob/stable/src/dev/arm/ssc.cc#L48

Change-Id: I614ae6e9eeb40b365ac9b6dd8b75abbfdb9cb687

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-01 15:32:40 +00:00
Hristo Belchev
27c8355565 mem-cache: Add support for partitioning caches
* Add Cache partitioning policies to manage and enforce cache partitioning:
    * Add Way partition policy
    * Add MaxCapacity partition policy
* Add PartitionFieldsExtension Extension class for Packets to store
  Partition IDs for cache partitioning and monitoring
* Modify Cache Tags SimObjects to store partition policies
* Modify Cache Tags block eviction logic to use new partitioning policies
* Add example system and TrafficGen configurations for testing Cache
  Partitioning Policies

Change-Id: Ic3fb0f35cf060783fbb9289380721a07e18fad49
Co-authored-by: Adrian Herrera <adrian.herrera@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-03-01 15:26:38 +00:00
Mahyar Samani
9bd71bff0c python: Adding fatal statement to notify user mistakes. (#826)
This change adds a fatal statement to check that all params for all
SimObjects have been unproxied before the C++ objects are created.
The fatal statement notifies the user of a mistake that could
lead to a SimObject not having its params unproxied.
This mistake could be made by adding a child SimObject with a
name that starts with an underscore.
2024-02-29 10:47:26 -08:00
Matthew Poremba
db42aeb630 arch-vega: Implement accumulation offset (#895)
This PR implements a few changes related to the accumulation offset
which is new in MI200. Previously MI100 contained two vector register
files: the architectural and accumulation register files. These have now
been unified and the architectural register file is twice the size. As a
result, the dispatch packet sets an offset into the unified vector
register file for where the former accumulation registers would go. The
changes are:

- Calculate the accumulation offset from dispatch packet and store in
HSA task.
- Update the accumulation move instructions (v_accvgpr_read/write) to
use it.
- Update the current MFMA instructions to use it.
- Make the MFMA examples more clean.
2024-02-29 09:05:39 -08:00
Nicholas Mosier
69762e272e sim-se, arch-x86: initialize max stack size from parameter (#892)
Initialize x86 process' max stack size to the value given in the process
params, rather than hard-coding it to 8 MB, which made it impossible to
run x86 programs requiring more than 8 MB of stack.

Change-Id: I0b17fe60b016b1e4a82d704ef7ad367974ea6a08
2024-02-29 08:15:43 -08:00
amatabsc
0d79b5098b Increased packets sanity check limit to 1024 (#797)
For some simulations with large values of VLEN (e.g. 8k and 16k), more
packets were created on the fly than the sanity check allowed, causing
the simulations to fail. The sanity check limit has been increased to
handle these high-VLEN cases.

Supervised by [@aarmejach](https://github.com/aarmejach)
Change-Id: I137b0f3113687b3fc9c4154d19ca5e8017e6e992

Co-authored-by: Adrià Armejach <adria.armejach@bsc.es>
2024-02-29 08:12:59 -08:00
Matt Sinclair
777ac91bb0 mem-ruby: Add categorization of bypassed atomics in TCC (#899)
Adds categorization of bypassed atomics in TCC to the TBE as either
return or no-return, which gets consumed in pa_performAtomic to
determine if atomic logs should be stored.

Reestablishes TCC bypassed atomics after #546.

Change-Id: Ibc1fa2b795ef1c47c3893a0b1911fa7993522d38
2024-02-28 14:26:09 -06:00
Matt Sinclair
8a28ca8ffb mem-ruby: Add missing transition for SLC writes to VIPER TCC (#894)
Bypassed write-through requests on invalid lines in the TCC should be
written through to the directory. This transition was previously missing.

Change-Id: I16b117c4e085ce6be0ed5297aa0129d52cd35a51
2024-02-28 00:13:07 -06:00
Daniel Kouchekinia
de615836f0 mem-ruby: Add categorization of bypassed atomics in TCC
Adds categorization of bypassed atomics in TCC to the TBE as either return
or no-return, which gets consumed in pa_performAtomic to determine if
atomic logs should be stored.

Reestablishes TCC bypassed atomics after #546.

Change-Id: Ibc1fa2b795ef1c47c3893a0b1911fa7993522d38
2024-02-27 23:12:45 -06:00
Daniel Kouchekinia
0fd73f4e05 Merge branch 'develop' into missing-tcc-transition 2024-02-27 16:46:30 -06:00
Richard Cooper
4e12f2486b util: update list_changes.py to support multiple Change-Ids (#861)
The original version of `list_changes.py` assumed no more than one
`Change-Id` tag per commit. However, since transitioning to GitHub, the
repository now contains some merge commits containing multiple
`Change-Id`s.

This patch updates `list_changes.py` to support commits with any number
of `Change-Id` tags.
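
Collecting every `Change-Id` trailer from a commit message can be done with a multi-line regular expression. This is a sketch under the assumption that an ID is the letter `I` followed by 40 hexadecimal digits, as in the IDs shown in this log. It is not the code from `list_changes.py`, and `change_ids` is a hypothetical name.

```python
import re

# One Change-Id trailer per line; the ID is "I" plus 40 hex digits.
CHANGE_ID_RE = re.compile(
    r"^[ \t]*Change-Id:[ \t]*(I[0-9a-f]{40})[ \t]*$", re.MULTILINE
)


def change_ids(message):
    """Return all Change-Id values found in a commit message, in order."""
    return CHANGE_ID_RE.findall(message)
```

Returning a list rather than a single value means a commit with no trailer yields `[]` and a merge commit with several trailers yields all of them.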
2024-02-27 11:10:31 -08:00
Giacomo Travaglini
e5eea7efcc mem: QoS q_policy assertions fix (#889)
Fix QoS Memory Queue Policies

* Fix assertions in LRG policy to correctly assert requestor and list
validity
* Fix `selectPacket()` in LIFO Queue Policy to correctly return the end
of the `deque` backing store for its packet queue
2024-02-27 13:32:19 +00:00
Hristo Belchev
e78a6b71fe Merge branch 'develop' into qos-qpolicy-assertions-fix 2024-02-27 09:38:34 +00:00
Harshil Patel
920497c19f tests: Add compiler test for gcc 13 (#858)
Change-Id: I41bdf3ab7ffff21c4148ef17fc5229b5597ec953
2024-02-26 18:03:14 -05:00
Matthew Poremba
2ca7f48828 arch-vega: Accumulation offset for existing MFMA insts
This commit updates the two existing MFMA instructions to support the
accumulation offset for A, B, and C/D matrix. Additionally uses array
indexed C/D matrix registers to reduce duplicate code. Future MFMA
instructions have up to 16 registers for C/D and this reduces the amount
of code being written.

Change-Id: Ibdc3b6255234a3bab99f115c79e8a0248c800400
2024-02-26 14:30:50 -06:00
Daniel Kouchekinia
6374697a20 mem-ruby: Add missing transition for SLC writes to VIPER TCC
Bypassed write-through requests on invalid lines in the TCC should be
written through to the directory. This transition was previously
missing.

Change-Id: I16b117c4e085ce6be0ed5297aa0129d52cd35a51
2024-02-26 13:13:06 -06:00
Matthew Poremba
e0e65221b4 arch-vega: Use accum offset for v_accvgpr_read/write
The accum offset is used as an index into the unified VGPR register file
in MI200 and is not the same as a move if accum_offset in the dispatch
packet is non-zero.

Change these instructions to use the stored accum_offset value.

Change-Id: Ib661804f8f5b8392e4c586082c423645f539e641
2024-02-26 12:57:09 -06:00
Matthew Poremba
8722aef2e2 gpu-compute: Store accum_offset from code object in WF
The accumulation offset is needed for some instructions. In order to
access this value we need to place it somewhere instruction definitions
can access. The most logical place is in the wavefront.

This commit simply copies the value from the HSA task to the wavefront
object.

Change-Id: I44ef62ef32d2421953f096c431dd758e882245b4
2024-02-26 12:54:37 -06:00
Nicholas Mosier
1990186170 configs: Ensure m5ops base doesn't overlap physical mem in KVM (#875)
Fix #874, in which running se.py with 4GB or more memory (via option
--mem-size=4GB) causes all KVM programs to crash or hang. This occurred
because the m5ops address range (set to 0xFFFF0000-0x100000000)
overlapped with physical memory under such a configuration.

This patch fixes the bug by moving the m5ops address range if physical
memory is >=4GB.
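
The overlap can be checked with simple range arithmetic. The sketch below uses the default range quoted above. The relocation rule in `pick_m5ops_base` (place the range at the top of physical memory when that is higher than the default base) is an assumption made for illustration and may differ from what the patch does. Both function names are hypothetical.

```python
DEFAULT_M5OPS_BASE = 0xFFFF0000
M5OPS_SIZE = 0x10000  # the default range ends at 0x100000000


def ranges_overlap(a_start, a_end, b_start, b_end):
    """True if the half-open ranges [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def pick_m5ops_base(mem_size):
    """Pick a base above physical memory, assumed to occupy [0, mem_size)."""
    return max(DEFAULT_M5OPS_BASE, mem_size)
```

With 4 GiB of memory the default range overlaps `[0, 0x100000000)`, and the sketch moves the base to `0x100000000`. With 2 GiB the default base is kept.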

Change-Id: Ic8a004517bc2be2c27860ed314460be749a11dc1
2024-02-26 10:33:48 -08:00
Yu-Cheng Chang
bcf455755e arch-riscv,dev: Update the PLIC implementation (#886)
Update the PLIC based on the
[riscv-plic-spec](https://github.com/riscv/riscv-plic-spec) in the PR:
- Support customized PLIC hartID and privilege mode configuration
- Backward compatible with the n_contexts parameter, will generate the
config like {0,M}, {0,S}, {1,M} ...

Change-Id: Ibff736827edb7c97921e01fa27f503574a27a562
2024-02-26 10:32:53 -08:00
Yu-Cheng Chang
521a7c1de0 tests: Exit riscv_asmtest script with simulator status code (#891)
This makes it possible to check whether the instructions simulated correctly.

Change-Id: I5faa435fad79601682126ee7978d8444093df900
2024-02-26 10:31:18 -08:00
Ivana Mitrovic
61ee36eee6 mem-ruby: Fix possible dirty line loss in CHI when ReadShared hit on UD line (#791)
In case ReadShared hits on a UD line and there are no sharers, this
change makes the downstream pass Dirty to the requestor whenever
possible even though it doesn't deallocate the line. This moves the
requestor to SD and the downstream to UD_RSD.
In the previous implementation, a loosely exclusive intermediate cache
could cause loss of dirty data. An example error condition is below.
   
Configurations
L2 cache: Roughly inclusive to L1 without back-invalidation
- dealloc_on_* = false
- dealloc_backinv_* = false
L3 cache: Roughly exclusive to L2 without back-invalidation
- alloc_on_readshared = true
- alloc_on_readunique = false
- dealloc_on_shared = false
- dealloc_on_unique = true
- dealloc_backinv_* = false
- is_HN = false
LLC: Same clusivity as L3 except is_HN = true
For all caches, allow_SD = true and fwd_unique_on_readshared = false
    
Example problem sequence:
1. L1 sends ReadUnique then becomes UD. L2 is UC_RU. L3 and LLC are RU.
2. L1 evicts the line to L2 by WriteBackFull (UD_PD). L2 becomes UD.
3. L2 evicts the line to L3 using WriteBackFull (UD_PD). L3 becomes UD.
4. L1 reads the line with ReadShared which misses on L2.
5. L2 reads the line with ReadShared which hits on L3. L3 becomes UD_RSC
because it doesn't deallocate the line (dataToBeInvalid=false)
6. L3 evicts the line to LLC by WriteCleanFull (UD_PD) because L3
doesn't back-invalidate and still has sharer. The local cache line is
invalidated by Deallocate_CacheBlock. L3 becomes RUSC and LLC becomes
UD_RU.
7. When UD_RU is evicted at LLC, the UD_RU line is dropped expecting the
upstream to writeback, causing loss of dirty data
2024-02-26 10:06:17 -08:00
wmin0
00ed1d30cf python,util: Fix SimObjectParams default constructor and destructor (#880)
The empty constructor prevents zero-initialization from working
correctly. In this change we fix the issue by removing the unwanted
empty constructor. We also change the default destructor specification
to C++11 style.

Change-Id: I869a93ca5283f811c2aa58406f1478459e0d7022
2024-02-26 06:42:27 -08:00
Giacomo Travaglini
1d5be8d9e5 mem-cache: Optimize strided prefetcher address generation
This commit optimizes the address generation logic in the strided
prefetcher by introducing the following changes:

(d is the degree of the prefetcher)

* Evaluate the fixed prefetch_stride only once (and not d-times)
* Replace 2d multiplications (d * prefetch_stride and distance *
prefetch_stride) with additions by updating the new base prefetch
address while looping
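
A minimal sketch of the strength reduction described above, written in
Python rather than gem5's C++. The function names and the exact address
formula are assumptions for illustration, not the prefetcher's code.

```python
def addresses_with_multiplications(pf_addr, stride, degree, distance):
    """Naive form: two multiplications for every generated address."""
    return [pf_addr + distance * stride + d * stride
            for d in range(1, degree + 1)]


def addresses_with_additions(pf_addr, stride, degree, distance):
    """Optimized form: one multiplication up front, then additions only."""
    addrs = []
    # Apply the distance skip once, outside the loop.
    new_addr = pf_addr + distance * stride
    for _ in range(degree):
        new_addr += stride  # advance the running base prefetch address
        addrs.append(new_addr)
    return addrs
```

Both functions are expected to return the same list; only the amount of
arithmetic per generated address differs.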

Change-Id: I3ec0c642bc9ec7635b0d38308797e99b645304bb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-26 10:40:45 +00:00
Nikolaos Kyparissas
a5fece3b91 mem: added distance parameter to stride prefetcher
The Stride Prefetcher will skip this number of strides ahead of the
first identified prefetch, then generate `degree` prefetches at
`stride` intervals. A value of zero indicates no skip (i.e. start
prefetching from the next identified prefetch address).

This parameter can be used to increase the timeliness of prefetches by
starting to prefetch far enough ahead of the demand stream to cover
the memory system latency.
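
A small sketch of how `distance` might shift the generated addresses,
assuming the first identified prefetch is one stride past the demand
address. `stride_prefetches` is a hypothetical helper, not a gem5
function.

```python
def stride_prefetches(demand_addr, stride, degree, distance=0):
    """Return `degree` prefetch addresses, skipping `distance` strides."""
    # With distance == 0 this is the next identified prefetch address.
    first = demand_addr + stride * (distance + 1)
    return [first + i * stride for i in range(degree)]
```

For example, with a stride of 64 and a degree of 2, a distance of 0
yields offsets 64 and 128, while a distance of 3 yields 256 and 320.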

[Richard Cooper <richard.cooper@arm.com>:
- Added detail to commit comment and `distance` Param documentation.
- Changed `distance` Param from `Param.Int` to `Param.Unsigned`.
]

Change-Id: I4ce79c72d74445b12acf68e0a54e13966e30041c
Co-authored-by: Richard Cooper <richard.cooper@arm.com>
Signed-off-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-26 10:40:45 +00:00
Nikolaos Kyparissas
1ccdf407cb mem-cache: Added clean eviction check for prefetchers.
pkt->req->isCacheMaintenance() would not include a check
for clean eviction before notifying the prefetcher,
causing gem5 to crash.

Change-Id: I4a56c7384818c63d6e2263f26645e87cef1243cb
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-26 10:40:45 +00:00
Richard Cooper
9fe998a8c0 mem-cache: Update default prefetch options.
Update the default prefetch options to achieve out-of-the box
prefetcher performance closer to that which a typical user would
expect. Configurations that set these parameters explicitly will be
unaffected.

The new defaults were identified as part of work on gem5 prefetchers
undertaken by Nikolaos Kyparissas while on internship at Arm.

Change-Id: Ia6c1803c86e42feef01de40c34d928de50fe0bed
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-26 10:40:45 +00:00
Richard Cooper
05f33fbef5 mem-cache: Squash prefetch queue entries by block address.
Prefetch queue entries were being squashed by comparing the address
of each queued prefetch against the block address of the demand
access. Only prefetches that happen to fall on a cache-line block
boundary would be squashed.

This patch converts the prefetch addresses to block addresses before
comparison.
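
The difference can be sketched as follows. This is an illustration in
Python with an assumed 64-byte block size; `block_address`,
`squash_old` and `squash_new` are hypothetical names.

```python
BLK_SIZE = 64  # assumed cache-line size in bytes (power of two)


def block_address(addr, blk_size=BLK_SIZE):
    """Round an address down to the start of its cache block."""
    return addr & ~(blk_size - 1)


def squash_old(queue, demand_addr):
    """Old behaviour: raw prefetch address vs. demand block address."""
    demand_blk = block_address(demand_addr)
    return [a for a in queue if a != demand_blk]


def squash_new(queue, demand_addr):
    """New behaviour: compare block addresses on both sides."""
    demand_blk = block_address(demand_addr)
    return [a for a in queue if block_address(a) != demand_blk]
```

With queued prefetches at 0x1004, 0x1040 and 0x103C and a demand access
to 0x1010, the old comparison squashes nothing, while the new one
removes both entries that fall in block 0x1000.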

Change-Id: I3a80a1e3d752f925595e33edebf5359d2cc67182
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-26 10:40:45 +00:00
Yu-Cheng Chang
47f3ad45d3 stdlib: Add get_last_exit_event_code to get m5 exit status code (#890)
Change-Id: I7319437dff24e31f343e71b6b8993f833b62147c
2024-02-23 09:09:28 -08:00
Hristo Belchev
2138a4ec92 mem: Fix LIFO q_policy and add assertions
* Fix selectPacket() in LIFO Queue Policy to correctly return the end of
  the `deque` backing store for its packet queue
* Move selectPacket() implementations for FIFO and LIFO queues into
  `q_policy.cc` file
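
As an illustration only, the two policies can be sketched with a Python
`deque`, assuming new packets are appended at the back. The function
names are hypothetical and do not mirror the C++ signatures.

```python
from collections import deque


def select_fifo(queue):
    """FIFO: pick the oldest packet, at the front of the deque."""
    return queue[0]


def select_lifo(queue):
    """LIFO: pick the newest packet, at the back of the deque."""
    return queue[-1]
```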

Change-Id: I8c35e5fc83dc380b19f52be14c18b1f414f9e141
2024-02-22 21:57:08 +00:00
Yu-Cheng Chang
816ef46c78 arch-riscv: Fix fflags behavior of float inst. in O3 CPU (#868)
According to the RISC-V spec [1], floating-point instructions
accumulate into the FFLAGS register rather than overwrite it, which
reflects the CSR behavior.

The previous implementation read FFLAGS, set the exception flags, and
wrote the result back to FFLAGS. This works in the gem5 simple and
minor CPU models because the value is actually written to `regFile`
after the instruction executes. In the gem5 O3 CPU model, however, a
write to `miscReg` in the execution stage is held in the `destMiscReg`
buffer until the commit stage. The next instruction therefore reads
the old FFLAGS and produces an incorrect result.

This CL introduces `MISCREG_FFLAGS_EXE` and keeps the size of
`miscRegFile` unchanged because `MISCREG_FFLAGS_EXE` and
`MISCREG_FFLAGS` share the same space. When executing a floating-point
instruction, any exception flags should be updated via
`MISCREG_FFLAGS_EXE` so that the `setMiscReg` method accumulates them
into FFLAGS. `MISCREG_FFLAGS` itself should only be written by the
CSROp.
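
A simplified model of the problem, in Python. It assumes two
back-to-back floating-point instructions that both execute before
either commits; the function names are hypothetical and the real O3
pipeline is considerably more involved.

```python
# RISC-V fflags bit positions.
NX, UF, OF, DZ, NV = 0x01, 0x02, 0x04, 0x08, 0x10


def commit_overwrite(fflags, raised_per_inst):
    """Old scheme: each instruction writes back (stale read | own flags)."""
    stale = fflags  # every instruction read FFLAGS before any commit
    for raised in raised_per_inst:
        fflags = stale | raised  # a later write clobbers earlier flags
    return fflags


def commit_accumulate(fflags, raised_per_inst):
    """New scheme: each instruction ORs only its own flags into FFLAGS."""
    for raised in raised_per_inst:
        fflags |= raised
    return fflags
```

If the first instruction raises NX and the second raises DZ, the
overwrite model ends with only DZ set, while the accumulate model keeps
both flags.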

[1] Syntactic Dependencies: Appendix A

c80ecada1c/src/mm-eplan.adoc (syntactic-dependencies-rules-9-11)

gem5 issue: https://github.com/gem5/gem5/issues/755

Change-Id: Ib7f13d95b8a921c37766a54a217a5a4b1ef17c6f
2024-02-22 08:33:34 -08:00
Hristo Belchev
f20ac07dde mem: Fix assertions in LRG Q policy
Fix assertions in LRG Queue Policy to correctly assert requestor and
list validity

Change-Id: I84e3f5b8936b74e7ac675faf7a3e6b9999026781
2024-02-22 14:16:20 +00:00
Harshil Patel
0f79b15b2f tests: Update checkpoint tests to new checkpoints (#888)
Change-Id: I1bf6d47017bcf77a4f93341c73de355372e1dea7
2024-02-21 16:37:28 -08:00
Jason Lowe-Power
c719ea960a arch-arm: Add FEAT_FGT trapping for debug registers (#873)
FEAT_FGT was already implemented, but trapping of debug register
accesses was missing.
2024-02-21 11:27:43 -08:00
Nicholas Mosier
7ac9733199 arch-x86, cpu-kvm: initialize x87 FCW (#877)
Fix #876. The x87 floating-point control word (FCW) was not initialized
at process startup in syscall emulation mode. This resulted in floating
point exceptions in KVM mode when executing x87 floating-point
instructions.

This patch fixes the bug by initializing FCW to its reset value, 0x37F.
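
For reference, the reset value can be decoded as follows. The field
layout is that of the x87 control word; `decode_fcw` is a hypothetical
helper for illustration.

```python
FCW_RESET = 0x37F


def decode_fcw(fcw):
    """Split an x87 control word into its main fields."""
    return {
        # Bits 0-5: exception masks (IM, DM, ZM, OM, UM, PM).
        "exception_masks": fcw & 0x3F,
        # Bits 8-9: precision control (0b11 = double extended).
        "precision_control": (fcw >> 8) & 0x3,
        # Bits 10-11: rounding control (0b00 = round to nearest).
        "rounding_control": (fcw >> 10) & 0x3,
    }
```

With 0x37F all six exception masks are set, so x87 exceptions are
masked instead of being raised, which is consistent with the crash
described above when FCW was left uninitialized.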

Change-Id: Idd1573c6951524ef59466cc5c9f1e640ea7658ae
2024-02-20 07:46:44 -08:00
wmin0
4e75e35a33 dev-arm: Remove the dependency of Platform for ArmSigInterruptPin (#878)
ArmSigInterruptPin doesn't send the interrupt to the GIC. Instead, it
sends the interrupt to the irq specified in Param. When using
ArmSigInterruptPin, we shouldn't ask users to provide "Platform" since
it isn't needed. To reduce confusion, this change removes the
dependency on Platform for ArmSigInterruptPin.

Change-Id: I0ee507ed1c08b4fa6d3e384e28732f3acb4f6892
2024-02-20 08:50:27 +00:00
Giacomo Travaglini
8759131df3 cpu-o3, arch: Fix SMT bug arising from v23.0 and make gem5 more robust with SMT (#828)
This PR is fixing https://github.com/gem5/gem5/issues/668. It fixes it
for all ISAs other than Arm with the first commit, which sets the
number of architectural Matrix registers to 0 for those ISAs which are
not using them.

It then partly fixes it for Arm as well with the 2nd commit: by removing
RenameMap::numFreeEntries we don't stall renaming unless a matrix
instruction is encountered... This means most binaries will run with SMT
as long as they don't use FEAT_SME instructions. Please note: this is
not simply a SMT fix, it will generally address a shortcoming in the way
we were renaming instructions.

If an Arm binary wants to use SMT with FEAT_SME, the 4th commit will
make sure the lack of physical registers is notified explicitly at the
beginning of simulation, rather than silently blocking renaming
2024-02-19 08:52:31 +00:00
Richard Cooper
308fef6b46 mem-cache: Fix possible crash in base prefetcher (#871)
When processing memory Packets for prefetch, the `PrefetchInfo` class
constructor will attempt to copy the `Packet` data. In cases where the
`Packet` under consideration does not contain data, an assertion will be
triggered in the Packet's `getConstPtr` method, causing the simulation
to crash.

This problem was first exposed by Bug #580 when processing an
`UpgradeReq` memory packet.

This patch addresses the problem by suppressing the copying of the
`Packet` data during construction of a `PrefetchInfo` object in cases
where the `Packet` has no data.

This patch addresses Bug #580 [1], which was exposed by PR #564 [2],
subsequently reverted by PR #581 [3]

[1] https://github.com/gem5/gem5/issues/580
[2] https://github.com/gem5/gem5/pull/564
[3] https://github.com/gem5/gem5/pull/581

Change-Id: Ic1e828c0887f4003441b61647440c8e912bf0fbc
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-17 14:14:57 -08:00
Giacomo Travaglini
2c0cc0040b arch-arm: Implement FEAT_FGT Debug trapping
Change-Id: I30af2b49ee604bcaa43fd419f6bc69e9ee6d9350
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-02-15 15:58:34 +00:00
Giacomo Travaglini
683007c6ca arch-arm: Add FEAT_FGT Debug Read/Write registers
Those are supposed to control trapping for accesses to debug registers

Change-Id: I4a25a379e718ea6d5ea8ae22ac7edbeb452d1836
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-02-15 15:58:34 +00:00
Harshil Patel
47c4dad869 arch-riscv: Remove unnecessary assert (#866)
`assert(interruptID >=0)` is always true as `interruptID` is an unsigned
int.

This was causing compilation tests failures in GCC-8 with the following
error:

```sh
src/arch/riscv/interrupts.cc:47:32: error: comparison is always true due to limited range of data type [-Werror=type-limits]
             assert(interruptID >= 0);
```

Change-Id: I356be78d7f75ea5d20d34768fb8ece0f746be2fc
2024-02-13 08:30:18 -08:00
Arnabjyoti Kalita
b826d96f40 cpu-o3: add PerThreadUnifiedThreadMap to O3 CPU (#842)
Github issue: https://github.com/gem5/gem5/issues/373

Change-Id: I1c8aba9bc5ea4e45faa6c174780904b8bd618604
2024-02-12 09:26:31 -08:00
Matt Sinclair
a840dda23a arch-vega,gpu-compute,mem-ruby: SQC Invalidation Support (#852)
This PR adds support for SQC (GPU I-cache) invalidation to the GPU
model. It does this by updating the GPU-VIPER-SQC protocol to support
flushes, the sequencer model to send out invalidates and the gpu compute
model to send invalidates and handle responses. It also adds support for
S_ICACHE_INV, a VEGA ISA instruction that invalidates the entire GPU
I-cache. Additionally, the PR modifies the kernel start behavior to
invalidate the I-cache too. It previously invalidated only the L1
D-cache.
2024-02-09 17:29:56 -06:00
Vishnu Ramadas
8054459df6 arch-vega: Add support for S_ICACHE_INV instruction
Previously, the S_ICACHE_INV instruction was unimplemented and
simulation panicked if it was encountered. This commit adds support for
executing the instruction by injecting a memory barrier in the scalar
pipeline and invalidating the ICACHE (or SQC)

Change-Id: I0fbd4e53f630a267971a23cea6f17d4fef403d15
2024-02-09 12:19:08 -06:00
Vishnu Ramadas
85680ea58e gpu-compute: Remove unused and redundant functions
In ComputeUnit, a previous commit added a  SystemHubEvent event class to
the SQCPort. This was found to be unnecessary during the review process
and is removed in this commit. Similarly, invBuf() which was added in
FetchUnit as part of an earlier commit was found to be redundant. This
commit removes it

Change-Id: I6ee8d344d29e7bfade49fb9549654b71e3c4b96f
2024-02-09 12:17:24 -06:00
Vishnu Ramadas
690b2b9462 gpu-compute, mem-ruby: Add comments and reformat code
Change-Id: Id2b3886dce347fdcfcad22009a42b92febc00a6c
2024-02-09 12:17:24 -06:00
Vishnu Ramadas
7dae25e881 configs, gpu-compute: Add parameter in shader for CUs per SQC
Change-Id: If0ae0db1b6ccc08a92f169a271b137f69f410f7b
2024-02-09 12:17:24 -06:00
Vishnu Ramadas
0e93e6142a arch-vega, gpu-compute, mem-ruby: Remove extra empty lines
Change-Id: I18770ec7e38c4a992a0ae6de95b0be49ab4426c2
2024-02-09 12:17:24 -06:00
Vishnu Ramadas
440409d807 gpu-compute: Add Icache invalidation at kernel start
Previously, the data caches were invalidated at the start of each
kernel. This commit adds support for invalidating instruction cache at
kernel launch time

Change-Id: I32e50f63fa1442c2514d4dd8f9d7689759f503d3
2024-02-09 12:16:41 -06:00
Vishnu Ramadas
03838afce0 gpu-compute: Add support for injecting scalar memory barrier
This commit adds support for injecting a scalar memory barrier in the
GPU. The barrier will primarily be used to invalidate the entire SQC
cache. The commit also invalidates all buffers and decrements related
counters upon completion of the invalidation request

Change-Id: Ib8e270bbeb8229a4470d606c96876ba5c87335bf
2024-02-09 12:14:57 -06:00
Vishnu Ramadas
23dc98ea72 mem-ruby: Add SQC cache invalidation support to GPU VIPER
This commit adds support for cache invalidation in GPU VIPER protocol's
SQC cache. To support this, the commit also adds L1 cache invalidation
framework in the Sequencer such that the Sequencer sends out an
invalidation request for each line in the cache and declares completion
once all lines are evicted.

Change-Id: I2f52eacabb2412b16f467f994e985c378230f841
2024-02-09 12:14:57 -06:00
Hristo Belchev
fd3aac1518 mem-cache: Fix circular dependency in QoS mem (#857)
This PR removes a circular dependency between `QoSMemSinkCtrl` and
`QoSMemSinkInterface` that prevented the `controller()` function of
`QoSMemSinkInterface` from being used by removing the default value for
`QoSMemSinkCtrl.interface`.

Change-Id: I4ecc39b974e239be1a2e9285e1f6f8ea873c018d
2024-02-09 11:32:16 +00:00
Saúl
7d80658a39 arch-riscv: fix vl in mask load/store (i.e vlm.v/vsm.v) (#830)
The vlm.v and vsm.v unit-stride mask load/store instructions are
constructed with an incorrect VL when the current one is larger than
VLEN/EEW (i.e. when LMUL > 1). This commit fixes the issue for both
instructions.
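
As background, the RVV spec gives mask loads and stores an EEW of 8 and
an effective vector length of ceil(vl/8) bytes. A minimal sketch of
that calculation follows; `mask_evl` is a hypothetical name and this is
not the gem5 implementation.

```python
def mask_evl(vl):
    """Effective length in bytes for vlm.v/vsm.v: ceil(vl / 8)."""
    return (vl + 7) // 8
```

For instance, with VLEN=128, SEW=8 and LMUL=8, vl can reach 128, yet
the mask itself only occupies 16 bytes.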
2024-02-08 14:06:49 -08:00
Bobby R. Bruce
7fe1588546 arch-riscv: Fix load and store to use EEW instead of SEW (#859)
Vector unit-stride instructions have an EEW encoded directly in the
instruction. We should use that instead of SEW in vtype.

Ref:

https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#73-vector-loadstore-width-encoding
2024-02-08 12:14:11 -08:00
Bobby R. Bruce
b2d13ee63a util: Remove action runner add-apt-repo git-core/ppa (#856)
We were having some difficulty on a server running this
`add-apt-repository` command due to suspected firewall issues. On
further inspection it appears to be superfluous, as git can be obtained
easily through `apt-get` without adding this repository.
2024-02-08 12:13:12 -08:00
Saúl
804f137325 arch-riscv: add unit-stride fault-only-first loads (i.e. vle*ff) (#794)
This patch provides unit-stride fault-only-first loads (i.e. vle*ff) for
the RISC-V architecture.

They are implemented within the regular unit-stride load (i.e. vle*). A
snippet named `fault_code` is inserted with templating to change their
behaviour to fault-only-first.

Apart from this, a new micro based on the vset\*vl\* instructions
(VlFFTrimVlMicroOp) is inserted as the last micro in the macro
constructor to trim the VL to its corresponding length based on the
faulting index.

This trimming micro waits for the load micros to finish (via data
dependency) and has a reference to the other micros to check whether
they faulted or not. The new VL is calculated with the VL of each micro,
stopping on the first faulting one (if there's such a fault).
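
The VL calculation can be sketched as below, assuming each load micro
reports how many elements it covers and, if it faulted, the index of
the faulting element within that micro. The data layout and the name
`trimmed_vl` are assumptions; faults on element 0, which trap normally,
are not modelled.

```python
def trimmed_vl(micros):
    """Compute the new VL from per-micro (vl, fault_elem) pairs.

    fault_elem is None when the micro did not fault.
    """
    new_vl = 0
    for vl, fault_elem in micros:
        if fault_elem is not None:
            # Stop at the first fault: keep only elements before it.
            return new_vl + fault_elem
        new_vl += vl
    return new_vl
```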

I've tested this with VLEN=128,256,...,16384 and all the corresponding
SEW+LMUL configurations.


Change-Id: I7b937f6bcb396725461bba4912d2667f3b22f955
2024-02-08 09:15:58 -08:00
Minje Jun
db5c71a919 mem-ruby: Pass UD on ReadShared hit only if SD is not allowed
This commit allows CompData_SD to be sent when ReadShared hits on a UD line and
the local cache keeps the line, unless the request doesn't allow SD.

Change-Id: I337f24c871cc4c19c5b5fb11f9b35c0a8eb7911c
2024-02-08 18:47:44 +09:00
Minje Jun
628be390a0 mem-ruby: Fix ReadShared hit handling on UD line
When ReadShared hits on a UD line and there are no sharers, this change
makes the downstream respond with Unique even though it doesn't deallocate
the line. This moves the requestor to UD and the downstream to UD_RU.
In the previous implementation, a loosely exclusive intermediate cache
could cause loss of dirty data. An example sequence is shown below.

Configurations
L2 cache: Roughly inclusive to L1 without back-invalidation
- dealloc_on_* = false
- dealloc_backinv_* = false
L3 cache: Roughly exclusive to L2 without back-invalidation
- alloc_on_readshared = true
- alloc_on_readunique = false
- dealloc_on_shared = false
- dealloc_on_unique = true
- dealloc_backinv_* = false
- is_HN = false
LLC: Same clusivity as L3 except is_HN = true
For all caches, allow_SD = true and fwd_unique_on_readshared = false

Example problem sequence:
1. L1 sends ReadUnique then becomes UD. L2 is UC_RU. L3 and LLC are RU.
2. L1 evicts the line to L2 by WriteBackFull (UD_PD). L2 becomes UD.
3. L2 evicts the line to L3 using WriteBackFull (UD_PD). L3 becomes UD.
4. L1 reads the line with ReadShared which misses on L2.
5. L2 reads the line with ReadShared which hits on L3. L3 becomes UD_RSC
   because it doesn't deallocate the line (dataToBeInvalid=false)
6. L3 evicts the line to LLC by WriteCleanFull (UD_PD) because L3 doesn't
   back-invalidate and still has a sharer. The local cache line is
   invalidated by Deallocate_CacheBlock.
   L3 becomes RUSC and LLC becomes UD_RU.
7. When UD_RU is evicted at LLC, the UD_RU line is dropped expecting the
   upstream to writeback, causing loss of dirty data.

Change-Id: Ic9bee27f2ec8906dd5df8bd3be60e5a9a76c782f
2024-02-08 18:47:44 +09:00
Minje Jun
1b5d92ee9c mem-ruby: Revert Writeback CHI UD_RU line at local evict
This reverts commit d613d814a431525e122552a667eed653a057f2be.

Change-Id: I50e218b7debf3a2836ce12515d8fcb6c0b38df53
2024-02-08 18:47:44 +09:00
Minje Jun
e141d9e4d0 mem-ruby: Writeback CHI UD_RU line at local evict
In Ruby CHI protocol UD_RU state means the line is in UD state in
the local cache and the upstream may have it in UD or UC state.
In the previous implementation UD_RU line was just dropped without
WriteBack which can cause loss of dirty data when the upstream has it
in UC state.
This commit fixes it by performing WriteBack when evicting UD_RU line.

Change-Id: I1db9b4f95cc576e71dcef38b01de24775df514ba
2024-02-08 18:47:44 +09:00
QQeg
e685c072d1 arch-riscv: Remove micro_elems in VleMicro template
Change-Id: I91267de8b1142075aa2873bfcedfd8b15c6863d4
2024-02-08 07:24:55 +00:00
QQeg
7eeac98b8d arch-riscv: Fix load and store to use EEW instead of SEW
Vector unit-stride instructions have an EEW encoded directly in the instruction.
We should use that instead of SEW in vtype.

Change-Id: I282041ce8ed57fbcca899f7497ef6c6fb2dfcf85
2024-02-07 21:11:28 +00:00
Jason Lowe-Power
4aecf9d35c stdlib: fix typo in error message (#855)
Change-Id: I28f1881d207caa36c6101eef221ef4cdd229da57

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-02-06 09:50:01 -08:00
Robert Hauser
f289f9e8b5 arch-riscv: adding support for local interrupts (#813)
Besides the standard RISC-V software, timer, and external interrupts,
the RISC-V specification also offers the possibility to implement
local interrupts. With this patch, we contribute an extension of
RiscvInterrupts that enables connecting interrupt sources to the
local interrupt controller. We assigned the local interrupts to
machine-level and gave them the highest priority. If two local
interrupts are pending, their exception code will be the tie-breaker
(higher ID > lower ID). 32-bit systems only recognize the local
interrupts 16 to 31, 64-bit systems 16 to 63.
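
The selection rule can be sketched as follows. This is an illustrative
model, assuming pending interrupts are given as a bit mask;
`pending_local_interrupt` is a hypothetical name.

```python
def pending_local_interrupt(pending_mask, xlen=64):
    """Return the local interrupt ID to take, or None if none is pending.

    Local interrupts occupy IDs 16 .. xlen-1; the higher ID wins.
    """
    for irq in range(xlen - 1, 15, -1):
        if pending_mask & (1 << irq):
            return irq
    return None
```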

Change-Id: Iff8d34e740b925dce351c0c6f54f4bd37a647e0c

---------

Co-authored-by: Robert Hauser <robert.hauser@uni-rostock.de>
2024-02-06 09:38:50 -08:00
Harshil Patel
de0342128c tests: move to obtain-resources from wget (#845) 2024-02-06 09:34:03 -08:00
Bobby R. Bruce
c7426f9427 misc: Add 'workflow_dispatch' to daily tests (#850)
This allows us to manually trigger daily test runs rather than wait for
the scheduled time. This can be useful in cases where a fix for a broken
test is pushed and we wish to verify it works as intended ASAP.
2024-02-06 09:32:31 -08:00
Suraj Shirvankar
44aaebc49a tests: Allow pyunit tests to run on specific directories (#847)
This change allows pyunit tests to be run on specific directories
instead of the default `pyunit` directory.
You can pass in the directory as follows. I have built gem5.opt for
RISCV however it should work the same with other builds
```
./build/RISCV/gem5.opt tests/run_pyunit.py --directory tests/pyunit/gem5/
```
The default path still works as it does currently:
```
./build/RISCV/gem5.opt tests/run_pyunit.py
```

Change-Id: Id9cc17498fa01b489de0bc96a9c80fc6b639a43f

Signed-off-by: Suraj Shirvankar <surajshirvankar@gmail.com>
2024-02-06 09:32:12 -08:00
Yu-Cheng Chang
ba6c569b8d arch-riscv: Add BasePMAChecker to support customized PMA (#846)
The RISC-V privileged spec doesn't specify the implementation of PMA
(physical memory attributes), which was addressed in the previous
CL [1].

This CL creates the BasePMAChecker to support customized PMA so that
users can focus only on the features wanted in a study. The CL also
leaves the common methods `check` and `takeOverFrom` to make it easy
for the MMU to interact with PMA.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/40596

Change-Id: I9725e3a8f7f9276e41f0d06988259456149d2a77
2024-02-06 05:38:34 -08:00
Giacomo Travaglini
a60d6960c7 arch-arm: Remove unused/unimplemented TLB methods (#849)
Change-Id: I3a76a914df1ba65ec5200f11111cf20f3e1eb924

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-06 09:18:06 +00:00
Mahyar Samani
8efe6dc1bc sim: Updating Process::Map (#835)
Changing size from int to int64_t to allow for mapping regions bigger
than 2GB.
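
To illustrate why a 32-bit signed size is insufficient, the snippet
below wraps a 2 GiB value into 32 and 64 bits using `ctypes`. It
assumes `int` is 32 bits wide on the host, which is the common case;
the helper names are hypothetical.

```python
import ctypes

TWO_GIB = 2 * 1024**3


def as_int32(value):
    """Value as seen after truncation to a signed 32-bit integer."""
    return ctypes.c_int32(value).value


def as_int64(value):
    """Value as seen after truncation to a signed 64-bit integer."""
    return ctypes.c_int64(value).value
```

A size of exactly 2 GiB already wraps to a negative number as a signed
32-bit integer, while it is preserved as a 64-bit one.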
2024-02-05 12:17:05 -08:00
Giacomo Travaglini
05f93175a7 arch-arm: Crypto instruction execution requires SIMD to be enabled (#848)
Crypto instructions will cause an undefined instruction when executed
with SIMD disabled. The PR also refactors their implementation by
checking the release object instead of the ID register field, which
improves readability.
2024-02-05 19:22:04 +00:00
wmin0
e4e359135e systemc: Reduce unnecessary backdoor request in atomic transaction (#795)
The backdoor request in b_transport is only used for hinting the DMI
capability. Since most traffic patterns are continuous, we can cache
the previous backdoor request result to avoid the backdoor inspection
for the next request.
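
The caching idea can be sketched as below. This is a generic
illustration, not the SystemC bridge code: `BackdoorCache` and the
`lookup` callable, which is assumed to return a `(start, end)` range or
`None`, are hypothetical.

```python
class BackdoorCache:
    """Remember the last backdoor range to skip repeated lookups."""

    def __init__(self, lookup):
        self._lookup = lookup  # expensive backdoor request (assumed)
        self._last = None      # last (start, end) range returned
        self.lookups = 0       # how many real requests were issued

    def covers(self, addr):
        """Return True if a backdoor range is known to cover addr."""
        if self._last is not None:
            start, end = self._last
            if start <= addr < end:
                return True  # hit in the cached range, no new request
        self.lookups += 1
        self._last = self._lookup(addr)
        return self._last is not None
```

With sequential accesses inside one range, only the first access
issues a real backdoor request.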

Change-Id: I53c47226f949dd0be19d52cad0650fcfd62eebbc
2024-02-05 11:08:20 -08:00
dependabot[bot]
61516e863f misc: bump tqdm from 4.64.1 to 4.66.1 (#833)
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.64.1 to 4.66.1.
2024-02-05 10:19:32 -08:00
Bobby R. Bruce
6f1d9b47e9 misc: Update actions/checkout from v3 to v4 (#836)
The `checkout` action now has a v4. v3 utilizes Node.js 16 which is now
deprecated by GitHub actions. Migrating to v4 is therefore encouraged.
2024-02-05 08:54:32 -08:00
Bobby R. Bruce
df83efe129 misc: bump mypy from 1.5.1 to 1.8.0 (#837)
See PR #834. This was accidentally closed. The dependabot update was correct.
2024-02-05 08:53:02 -08:00
Chong-Teng Wang
40ecdf5fb4 arch-riscv: Fix RVV instructions vmv.s.x/vfmv.s.f (#843)
This commit fixes the implementation of vmv.s.x and vfmv.s.f. 
When vl = 0, no elements are updated in the destination vector register
group, regardless of vstart.

Change-Id: Ib21b3125da8009325743ec70ca0874704328356c

Reference:
[Integer Scalar Move
Instructions](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#161-integer-scalar-move-instructions)
[Floating-Point Scalar Move
Instructions](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#162-floating-point-scalar-move-instructions)
2024-02-05 08:51:42 -08:00
Chong-Teng Wang
85059a369e arch-riscv: Fix control flow in VectorFloatMaskMacroConstructor (#844)
This commit adjusts the logic in VectorFloatMaskMacroConstructor to
ensure the %(copy_old_vd)s section is not skipped when vl = 0, ensuring
correct values in destination vector register.

Change-Id: I2478722d6f003a0f2e4b3cd0ba3e845bed938ee6

This is the same problem as #715.
2024-02-05 06:29:05 -08:00
Giacomo Travaglini
16e06bad0c arch-arm: Exec Crypto instructions only if SIMD&FP enabled
We not only check for the presence of the relevant FEAT_*,
we also check if AdvSIMD is enabled; we throw an undefined
instruction otherwise.

Change-Id: I1fd0cdc8057c5a7901774802dc076817f06c8e66
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-05 12:56:48 +00:00
Giacomo Travaglini
ebef2fc4b1 arch-arm: Crypto instructions checking release object
Check directly if extension is enabled instead of looking
for ID register field value. This makes the code more readable

Change-Id: If0b882ac3464c3587731b72a7edb3b8b65ea86c7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-05 12:56:48 +00:00
Giacomo Travaglini
4eb0cd44fc cpu-o3: Restrict constraint on number of physical registers
Having the number of physical registers matching exactly the number of
architectural ones does not guarantee a proper execution as it means the
freeList would have 0 registers available for renaming. In this case the
worst would happen: renaming would silently stall execution
indefinitely.  With this change we report the issue to the user and fail
execution
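
A sketch of the stricter check, in Python. The function name, the
per-thread multiplication and the error type are assumptions made for
illustration; the actual check lives in the O3 CPU C++ code.

```python
def check_phys_regs(num_phys, num_arch, num_threads=1):
    """Return the free-list size, or fail if renaming could never proceed."""
    needed = num_arch * num_threads
    # An unused register class (num_arch == 0) imposes no constraint.
    if needed and num_phys <= needed:
        raise ValueError(
            f"{num_phys} physical registers leave no free register "
            f"for {needed} architectural ones")
    return num_phys - needed
```

With two physical registers and one architectural register, a single
thread leaves one register free, whereas two threads leave none and the
check fails early instead of stalling rename silently.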

Change-Id: I1eb968802f1a1a5115012f44b541542a682f887d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-02 21:18:32 +00:00
Bobby R. Bruce
f0ee1db19f Merge branch 'develop' into mypy-1.8.0 2024-02-02 10:40:50 -08:00
Bobby R. Bruce
ea3face87b misc: bump pre-commit from 2.20.0 to 3.6.0 (#832)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 2.20.0
to 3.6.0.
2024-02-02 10:39:56 -08:00
Bobby R. Bruce
c890e6b113 misc: Merge .github dir from develop to stable (#841)
This is to incorporate Daily Test fix #840.
2024-02-02 10:38:47 -08:00
Harshil Patel
858acacb20 tests: fix wget link for gpu tests (#840) 2024-02-02 10:34:41 -08:00
Giacomo Travaglini
1fb7c1ad7e cpu-o3: Rename numFreeEntries into minFreeEntries
Change-Id: I89faeb001ebdcbc90ea88508f8d231ec6e7fe197
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-02 18:11:47 +00:00
Giacomo Travaglini
86158de220 cpu-o3: Stop using RenameMap::numFreeEntries
The method extracts the minimum number of non-zero free
registers/entries across all register classes [1].  This means that if
we have saturated all register storage for a particular class, renaming
will stop as a whole.

I believe it does make sense to keep renaming and only block renaming in
case an instruction requiring the particular register type is
encountered. This would happen with the Rename::renameInsts method [2].
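
The two policies can be contrasted with a simplified model. The
dictionaries and function names below are hypothetical and ignore the
details of the real rename stage.

```python
def min_free_entries(free):
    """Old gate: the scarcest register class limits all renaming."""
    return min(free.values())


def can_rename(free, needed):
    """New gate: only the classes an instruction uses are checked."""
    return all(free[cls] >= count for cls, count in needed.items())
```

With `{"int": 10, "float": 8, "mat": 0}` free registers, the old gate
reports zero and stalls everything, while the new gate still lets an
integer instruction through and only blocks one that needs a matrix
register.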

[1]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/rename_map.hh#L269
[2]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/rename.cc#L662

Change-Id: I932826a77a5c0b2e05d8fdcab0e6ca13cf0e3d23
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-02 18:11:47 +00:00
Giacomo Travaglini
d031244ca7 misc: When unused, set #MatRegClass registers to 0
This is working around an existing SMT issue [1].

The BaseO3CPU uses two physical matrix registers [2]. This is
enough for a single threaded CPU which as of now uses
1 architectural matrix only.

The problem arises when SMT is enabled.  As 2 architectural matrices
need to be supported by a single CPU, the O3CPU won't have any available
register in the freeList for renaming.  This causes the SMT O3CPU to
indefinitely stall renaming [3]

If the architectural number of registers is set to 0, the regclass won't
be taken into consideration when evaluating if we can rename
instructions.

This issue has been implicitly fixed for RISCV by a preceding PR [4]

[1]: https://github.com/gem5/gem5/issues/668
[2]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/BaseO3CPU.py#L170
[3]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/rename.cc#L1228
[4]: https://github.com/gem5/gem5/pull/83

Change-Id: I99bfdefff11a246b1f191251dc67689e95b3f0db
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-02 18:11:47 +00:00
Giacomo Travaglini
33e62b8e8a arch-arm: Adopt new TranslationRegime data type in MMU translations (#829)
This is more compliant with the VMSAv8-64, which uses Translation
Regimes instead of the historical (Armv7) isHyp tagging and the
ExceptionLevel managing the translation. This greatly simplifies the
translation code, especially with FEAT_VHE, where the managing EL (EL2)
could handle two different translation regimes (EL2 and EL2&0).
2024-02-02 11:54:38 +00:00
dependabot[bot]
234d63db6f misc: bump pre-commit from 2.20.0 to 3.6.0
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 2.20.0 to 3.6.0.
- [Release notes](https://github.com/pre-commit/pre-commit/releases)
- [Changelog](https://github.com/pre-commit/pre-commit/blob/main/CHANGELOG.md)
- [Commits](https://github.com/pre-commit/pre-commit/compare/v2.20.0...v3.6.0)

Change-Id: I421f6d08fa370562a4310b2010d3d5071498bd6e

---
updated-dependencies:
- dependency-name: pre-commit
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Change-Id: Ifcf6ecdfdbdd465c1e1cd58506c21445dbe747f0
Signed-off-by: dependabot[bot] <support@github.com>
2024-02-01 15:51:24 -08:00
Bobby R. Bruce
80a7dfc300 misc: bump mypy from 1.5.1 to 1.8.0
See PR #834. This was accidentally closed. The dependabot update was correct.

Change-Id: I63a337b6f3cc4ae06bdfb28976605a9682fc236a
2024-02-01 15:44:37 -08:00
kroarty-lanl
197be3a0dd dev: Fix off-by-one in IDE controller PCI register allocation (#824)
The PCI configuration space is 256 bytes, yet because the
PCI_CONFIG_SIZE macro is 0xff, the final register allocation in the IDE
controller only allocated up to byte 255.
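
The off-by-one can be illustrated with plain arithmetic. The helper
below is hypothetical and only models a register that fills the space
from a start offset up to an exclusive end offset.

```python
PCI_CONFIG_SIZE = 0xFF   # macro value quoted above
PCI_CONFIG_BYTES = 256   # actual size of the configuration space


def covered_offsets(start, end):
    """Byte offsets covered by a register spanning [start, end)."""
    return range(start, end)
```

Using 0xFF as the end covers only 255 bytes and leaves offset 0xFF
unallocated; an end of 256 covers the whole space.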

Change-Id: I1aef2cad9df366ee8425edb410037061eb29ae33
2024-02-01 10:14:28 -08:00
Bobby R. Bruce
5b2766829b misc: Merge develop .github dir into stable (#831) 2024-02-01 09:18:15 -08:00
Mahyar Samani
b79fe82e5c cpu,stdlib: Updating strided generator (#762)
This change improves the functionality of the strided generator to
create traces with more flexibility.
It allows the user to manually set offset and stride size instead of
calculating it based on a "gen_id".
This way different patterns could be created with the same SimObject.
In addition, this change adds stdlib components for strided generator.
2024-02-01 09:08:42 -08:00
Harshil Patel
b5fae2f620 tests: Switch to vega_x86 from gcn3_x86 in daily tests (#817)
Change-Id: Ic2ed8cc4488ddd361b5773b91100d806b94f1b8a
2024-02-01 09:06:04 -08:00
Giacomo Travaglini
3a2c8feca8 arch-arm: MMU aarch64EL is not used in AArch64 only anymore
We therefore rename it to exceptionLevel

Change-Id: I2a3aabaefa315d95bd034b13d95d5a5b0b8e9319
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:45:06 +00:00
Giacomo Travaglini
3737e8b6df arch-arm: Use MAIR_EL2 mem attribute register when in EL0 host
With the old code, the MAIR_EL1 register was checked when inserting
an EL2&0 TLB entry

Change-Id: I064032fb2946777c2f4c50c06a124f828245e18a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:44:16 +00:00
Giacomo Travaglini
d42ef792bf arch-arm: Check ELIs64 for EL2 when in EL2&0 regime
The problem with:

ELIs64(tc, aarch64EL == EL0 ? EL1 : aarch64EL);

Is that when we are executing at EL0 in host (EL2&0 translation
regime), the execution mode (AArch32 vs AArch64) is dictated
by EL2 and not by EL1 (which is the guest)

Change-Id: I463a2a9461c94d0886990ae3d0a6e22aeb4b9ea3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:43:59 +00:00
Giacomo Travaglini
458c98082c arch-arm: Replace EL based translation with regimes
This is the final step in the transformation process.
We limit the use of the "managing Exception Level" for
a translation in favour of the more standard "Translation
Regime"
This greatly simplifies our code, especially with VHE
where the managing EL (EL2) could handle two different
translation regimes (EL2 and EL2&0).

We can therefore remove the isHost flag wherever it got
used. That case is automatically handled by the proper
regime value (EL2&0)

Change-Id: Iafd1d2ce4757cfa6598656759694e5e7b05267ad
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:43:47 +00:00
Giacomo Travaglini
e333a77c12 arch-arm: Remove _Xt postfix from TLBI instructions
The Xt is not part of the architectural name of the register
and it was likely added with the introduction of extended
register (Xt) TLBIs in Armv8 to differentiate them from
the old Armv7 ones.

The use of _Xt was not consistent anyway: newer TLBIs were
already omitting it.

Change-Id: Ic805340ffa7b5770e3b75a71bfb76e055e651f8b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:43:26 +00:00
Giacomo Travaglini
594428f010 arch-arm: Remove redundant isHyp as a TLB entry field
We should stop using isHyp. A hypervisor entry is flagged
already by the EL of the entry (el == EL2)

Change-Id: I20c3d06fa2b04e0b938a380ca917d0b596eddcf2
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:43:00 +00:00
Giacomo Travaglini
a6ca81906a arch-arm: Simplify setting of isHyp for mem translations
The isHyp descriptor is an old artifact of Armv7 and it flags PL2
(AArch32) or EL2 and EL2&0 (AArch64) translations.
It is commonly set according to the EL/mode [1] but it may differ from
the execution state in case of explicit translation requests (via
the AT instruction as an example [2]).

There is really no need to complicate the masking of isHyp. We should
just make use of the tranType method (in charge of setting aarch64EL)
to properly set aarch64EL, and make isHyp coincide with the case of
aarch64EL == EL2.

This is a step towards the removal of the isHyp flag.

More specifically the patch does the following:

* HypMode translation type moved in the EL2 case
The translation is used by

ATS1HR/ATS1HW:
Performs stage 1 address translation as defined for PL2 and the
Non-secure state

* S1S2NsTran translation type moved in the EL1 case
The translation is used by

ATS12NSOPR/ATS12NSOPW:
Performs stage 1 and 2 address translations as defined for PL1 and the
Non-secure state

* S1CTran translation type can be at either EL1 or EL3
The translation is used by

ATS1CPR/ATS1CPW
Performs stage 1 address translation as defined for PL1 and the current
Security state

[1]: https://github.com/gem5/gem5/blob/stable/src/arch/arm/mmu.cc#L1281
[2]: https://github.com/gem5/gem5/blob/stable/src/arch/arm/mmu.cc#L1282

Change-Id: Ie653170f6053c5d8141a2de9f50febf5bf53ab9c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:42:40 +00:00
Matthew Poremba
2ff57b09d8 util: Update gcn-gpu to remove GCN3 add gfx902 (#804)
This removes the gfx803 and gfx801 targets for building applications as
GCN3 will be removed from gem5. It also removes the copy/paste bug from
the HACC docker which is clobbering the HCC targets and removing gfx902.

Change-Id: I9a0d7fda437e797baf0f743a0a450948b9260b07

Co-authored-by: Harshil Patel <hpppatel@ucdavis.edu>
2024-01-31 16:02:07 -08:00
Harshil Patel
c92ddf90e6 tests: update binaries for gpu tests
Change-Id: I057f76e472bc0f9fdeacd59238a05980389c92c8
2024-01-31 13:37:48 -08:00
Kaustav Goswami
b5d18b84a8 arm,stdlib: added kvm support to the ARM board (#725)
This change adds support to use KVM cores on the ARM board. The board
simulates gic to enable KVM, similar to the gem5 ARM FS configs. The
limitation is that it only supports VExpress_GEM5_V1.

Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>
2024-01-31 10:17:58 -08:00
Harshil Patel
76c3c02acb tests: remove GCN3_X86 from compiler tests (#819)
Change-Id: Ibb75e08abb9051b70e474d721fbafd71957db701
2024-01-30 15:55:15 -08:00
Jason Lowe-Power
b3870ee7b0 arch-riscv: Fix fence.i instruction in O3 CPU (#816)
arch-riscv: Fix fence.i instruction in O3 CPU
2024-01-30 15:39:32 -08:00
Harshil Patel
47369e786a tests: Switch to vega_x86 from gcn3_x86 in daily tests
Change-Id: Ic2ed8cc4488ddd361b5773b91100d806b94f1b8a
2024-01-29 10:58:07 -08:00
Harshil Patel
d1fca18eb3 tests: Added tests for suites (#676)
Change-Id: I69db8e82e9373d659d125d3bd48a69de12b32390
2024-01-29 10:52:33 -08:00
Bobby R. Bruce
c0100b18cc util: add scripts that help maintain mongoDB (#653) 2024-01-29 10:42:08 -08:00
Harshil Patel
5a7d61d990 misc: move dependabot.yml to .github (#812)
Change-Id: I5c882afd1e15420b8fcdcc14895a77b275aedc4e
2024-01-29 10:07:32 -08:00
Jason Lowe-Power
bb5d55510f arch-riscv: Fix RVV instructions vmsbf/vmsif/vmsof (#814)
This pull request has two commits. One fixes the segmentation fault:

> arch-riscv: Fix segmentation fault in vmsbf/vmsof/vmsif
>
> This commit simplifies the conditional logic in vmsbf/vmsof/vmsif
> by removing an unnecessary variable and condition.
> The updated logic checks 'this->vm' or the result of 'elem_mask(v0, i)'
> directly, which prevents a segmentation fault regardless of
> whether 'vm' is set or not.

The other fixes the incorrect output:

> arch-riscv: Add template Vector1Vs1VdMaskDeclare
>
> This commit adds a new template, Vector1Vs1VdMaskDeclare, to replace
> the use of Vector1Vs1RdMaskDeclare in Vector1Vs1VdMaskFormat.
>
> The change addresses the issue with the number of indices in
> srcRegIdxArr.
> Only two indices are available in Vector1Vs1RdMaskDeclare, but
> instructions that use Vector1Vs1VdMaskFormat, like 'vmsbf', require
> three indices (for vs1, vs2(old_vd), and vm) to function correctly.

Demonstration of incorrect output compared with spike:
[vmsbf](https://github.com/QQeg/rvv_intrinsic_testcases/tree/master/vmsbf)
```
**** REAL SIMULATION ****
src/sim/simulate.cc:199: info: Entering event queue @ 0.  Starting simulation...
Vs1 = 0 0 0 0 0 0 0 0   1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1   
Vd  = 1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1   
Exiting @ tick 23504000 because exiting with last active thread context

 ----SPIKE----
bbl loader
Vs1 = 0 0 0 0 0 0 0 0   1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1   1 1 1 1 1 1 1 1   
Vd  = 1 1 1 1 1 1 1 1   0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0
```
2024-01-29 08:28:16 -08:00
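The Spike output quoted in bb5d55510f is what the set-before-first semantics of 'vmsbf' predict. The following is a minimal Python sketch of the unmasked case only, assuming a plain list of mask bits; the function name `vmsbf` and the list representation are illustrative and not gem5 code.

```python
def vmsbf(vs1_bits):
    """Set-before-first: 1 for every element before the first set bit
    of the source mask, 0 from that element onwards (unmasked case)."""
    out = []
    seen_set_bit = False
    for bit in vs1_bits:
        if bit:
            seen_set_bit = True
        out.append(0 if seen_set_bit else 1)
    return out


# Same input pattern as the quoted log: eight zeros followed by ones.
vs1 = [0] * 8 + [1] * 24
vd = vmsbf(vs1)
```

With this input the sketch yields eight ones followed by zeros, which is the Spike row rather than the all-ones row.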
Roger Chang
d94ef08a36 arch-riscv: Fix fence.i instruction in O3 CPU
We should clean the instruction buffer after the fence.i is executed
to avoid executing old instructions in self-modifying code

Change-Id: Iece0ee0d10631fcd9bd17ee67cf0c92f72acdbd8
2024-01-29 11:43:27 +08:00
QQeg
08ed87bc9d arch-riscv: Add template Vector1Vs1VdMaskDeclare
This commit adds a new template, Vector1Vs1VdMaskDeclare, to replace
the use of Vector1Vs1RdMaskDeclare in Vector1Vs1VdMaskFormat.

The change addresses the issue with the number of indices in srcRegIdxArr.
Only two indices are available in Vector1Vs1RdMaskDeclare, but instructions
that use Vector1Vs1VdMaskFormat, like 'vmsbf', require three indices
(for vs1, vs2(old_vd), and vm) to function correctly.

Change-Id: I0c966e11289ce07efcc3b0cc56948311289530ad
2024-01-28 09:38:11 +00:00
QQeg
31ffc11c57 arch-riscv: Fix segmentation fault in vmsbf/vmsof/vmsif
This commit simplifies the conditional logic in vmsbf/vmsof/vmsif
by removing an unnecessary variable and condition.
The updated logic checks 'this->vm' or the result of 'elem_mask(v0, i)'
directly, which prevents a segmentation fault regardless of
whether 'vm' is set or not.

Change-Id: I799fa7b684ff98959a64f9694ef9c854f3a1f43a
2024-01-28 09:38:11 +00:00
Giacomo Travaglini
ce32d7c523 arch-arm: Replace CRYPTO extension with canonical names (#810)
These are:

FEAT_AES,
FEAT_PMULL,
FEAT_SHA256,
FEAT_SHA1,
FEAT_CRC32

With this patch we are also enabling them by default by adding them to
the Armv8 release object. Some of them are mandatory anyway since
Armv8.1

Change-Id: I221ae8646d91151fdfaf97a4815168a4fe3d8c5a

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-01-26 19:39:35 +00:00
Ivana Mitrovic
8a6804231c misc: Added dependabot config file (#767)
- Added a yaml file to make dependabot target develop instead of stable.
2024-01-25 19:25:51 -08:00
Matthew Poremba
7f71477f15 dev-amdgpu: Limit SDMA NOP count to wptr boundary (#806)
If the NOP count of an SDMA NOP packet goes beyond the wptr address, the
queue decode method will loop infinitely. If a packet comes in with a
bad count this causes gem5 to hang. This change advances the rptr one
dword at a time until either reaching the NOP count or when rptr == wptr
to prevent this issue.

Change-Id: Ib2c0f74a477bff27890c9c064bb4190e76e513bd
2024-01-25 15:35:35 -08:00
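The bounded NOP skip described in 7f71477f15 can be sketched roughly as below. This is a simplified model, not the gem5 SDMA engine: the name `skip_nop`, the 4-byte dword size constant and the absence of ring-buffer wrap-around are all assumptions made for illustration.

```python
DWORD_BYTES = 4  # assumed dword size for this sketch


def skip_nop(rptr, wptr, nop_count):
    """Advance rptr one dword at a time, stopping at the NOP count
    or as soon as rptr reaches wptr, whichever comes first."""
    skipped = 0
    while skipped < nop_count and rptr != wptr:
        rptr += DWORD_BYTES
        skipped += 1
    return rptr
```

A packet with a bad count, such as `skip_nop(0, 8, 1000)`, now stops at the write pointer instead of looping past it.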
Ivana Mitrovic
235f6bd43f misc: Update .mailmap file (#739)
The .mailmap file is designed to maintain a record of unique
contributors, aiming for a single identifier for each person. What is
included in this file does not impact or alter commits; rather, it just
merges the counts for all commits by one person under a single name.
2024-01-25 12:00:13 -08:00
Ivana Mitrovic
1c0127ae7c base: Fix Integer overflow in AddrRange (#786)
This PR fixes the bug mentioned in #240.
2024-01-25 10:18:29 -08:00
Ivana Mitrovic
24e0d71034 arch-gcn3: Remove gcn3 (#781)
Related to issue #703, this PR removes GCN3 related files and updates
source code, documentation, and tests to switch over to Vega if that was
not done already. Highlights are:

- Remove all src/arch/amdgpu/gcn3 files and update Kconfigs.
- Remove references to GCN3 and replace with Vega where applicable.
- Update the build targets in the gcn-gpu Docker. This will need to be
  rebuilt but not urgently.
- Remove the GCN3 tag in testlib. Most tests seem to be using Vega
  already, so that commit is small.
Harshil Patel
7cf5c8c840 misc: Added dependabot config file
- Added a yaml file to make dependabot target develop instead of stable.

Change-Id: I5b28c06960c5a346b40e2af8f9284b11d9cc07cd
2024-01-25 08:57:32 -08:00
QQeg
7a96709b11 arch-riscv: Fix vsadd_vi and vsaddu_vi to match v-spec (#805)
This commit fixes the implementation of two instructions, vsadd_vi and
vsaddu_vi, in the OPIVI category
to match the RISC-V vector specification.

According to
[riscv-v-spec](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#101-vector-arithmetic-instruction-encoding),
the immediate field of these two instructions should be sign extended.

> For integer operations, the scalar can be a 5-bit immediate, imm[4:0],
encoded in the rs1 field. The value is sign-extended to SEW bits, unless
otherwise specified.

There is an example in both
[vsadd](https://github.com/QQeg/rvv_intrinsic_testcases/tree/master/vsadd_vi)
and
[vsaddu](https://github.com/QQeg/rvv_intrinsic_testcases/tree/master/vsaddu_vi).

Change-Id: Ib877627ba01c0868b2103d41613651df488fca13
2024-01-24 17:21:26 -08:00
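The sign extension quoted from the vector spec in 7a96709b11 can be illustrated with a short Python sketch. The helper names `sign_extend` and `imm5_to_sew` are hypothetical and only model the arithmetic; they are not taken from the gem5 decoder.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def imm5_to_sew(imm5, sew):
    """Sign-extend a 5-bit immediate and return its SEW-bit pattern."""
    return sign_extend(imm5, 5) & ((1 << sew) - 1)
```

For example, the encoding 0b11111 stands for -1, so at SEW=8 the operand pattern is 0xFF rather than 0x1F.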
Yu-Cheng Chang
6dd936e5b5 arch-riscv: Simplify implementation of vector multiply and divide instructions (#793)
Align the implementation of scalar multiply and divide instructions

Change-Id: I53297d4c841c41593baaae0ea140bfbbd874a1d9
2024-01-24 13:20:15 -08:00
Matthew Poremba
44c78d843c arch-vega: Implement memory aperture operands (#803)
Vega (gfx900) introduced new memory aperture registers to get the base
address and limit for LDS and private (scratch) memory. These have not
commonly been used by the compiler until ROCm 6. Now that the compiler
is generating reads from these special registers, implement the support
for them.

Tested with LULESH which is using the SHARED_BASE register (LDS) with
ROCm 6.0. This assembly seems to replace S_GETREG_B32 emitted by the
ROCm 5 compiler.

Change-Id: Id2bd26ce8ef687c84a647fa2ac2da54d657913e5
2024-01-24 11:19:43 -08:00
Matthew Poremba
0ac110ac95 dev-amdgpu: Check privilege bit for SDMA RLC queues (#792)
By default all SDMA queues are privileged queues, meaning the addresses
in SDMA packets use the privileged translation tables. RLC queues
(sometimes called user queues) are not necessarily privileged and might
use user translation tables. RLC queues are used more often in ROCm 6.0
exposing an issue with invalid translations with RLC queues.

This changeset checks the priv bit in the SDMA MQD when an RLC queue is
mapped. Each packet type which uses an address then checks the bit
before performing translation. Tested with daily/weekly tests with a
ROCm 6.0 disk image and tests are passing.

Change-Id: I6122fbc194e8d6f5d38e81f1b0e11646d90e0ea0
2024-01-24 07:25:43 -08:00
Matthew Poremba
dfafc5792a arch-vega: Remove deleted instruction.cc from build (#801)
Change-Id: I03073d35a0d36788dfe8309e6ed466d0a496e31e
2024-01-23 18:47:01 -08:00
Harshil Patel
78613e2307 base: Add a check for edge case
- Now check for the condition where the bigger address range wraps but smaller does not.

Change-Id: Icc7a549afaf82a277dc2845255aa1702a1d662e0
2024-01-23 11:35:54 -08:00
Harshil Patel
fea4106414 util: updated resource manager dependencies (#737)
Change-Id: Ia07eed6c2f2e55f1a2cb8da30e75f0b3a2fb3bc3

Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
2024-01-23 11:09:15 -08:00
Matthew Poremba
4fe6489038 arch-vega: Reorganize inst and misc files (#789)
This PR reorganizes the instructions.cc into multiple files and renames
some files which do not match their corresponding header file names. The
intention is to make iterating on development of these files faster.
2024-01-23 10:06:40 -08:00
Harshil Patel
7372097376 base: fix Integer overflow in AddrRange bug
This fixes the issue raised in #240: if an address range ends
at the last byte of a 64-bit address space, it is
considered a subset of any other address range that starts
at the first byte of the range.

Change-Id: I517f4717052eda2504de971be0eb59ee9a623dd3
2024-01-22 15:43:11 -08:00
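One way to picture the overflow behind 7372097376 is a range stored as a start and an exclusive end, both truncated to 64 bits. The Python sketch below is a model under that assumption and does not reproduce gem5's AddrRange code; `make_range`, `naive_is_subset` and `wrap_safe_is_subset` are illustrative names, and ranges are assumed to be non-empty.

```python
MASK64 = (1 << 64) - 1


def make_range(start, size):
    """Return (start, exclusive_end), with the end truncated to 64 bits."""
    return (start & MASK64, (start + size) & MASK64)


def naive_is_subset(inner, outer):
    """Compare exclusive ends directly; breaks when an end wraps to 0."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def last_byte(rng):
    """Address of the last byte covered by a non-empty range."""
    return (rng[1] - 1) & MASK64


def wrap_safe_is_subset(inner, outer):
    """Compare last bytes instead, which never wrap for non-empty ranges."""
    return outer[0] <= inner[0] and last_byte(inner) <= last_byte(outer)


# A range that runs to the very last byte of the 64-bit address space.
to_top = make_range(0x1000, (1 << 64) - 0x1000)
small = make_range(0x1000, 0x100)
```

Here `to_top` has a wrapped end of 0, so the naive comparison reports it as a subset of `small`, while the last-byte comparison does not.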
Ivana Mitrovic
f2916e1b2b misc: Merge Weekly GPU tests into Weekly Tests (#647)
This separation was only for convenience while GPU tests were under
development and rapidly changing. This change merges the GPU tests into
the weekly tests where they belong.
2024-01-22 10:53:28 -08:00
Matthew Poremba
a5757e7e01 arch-vega: Rename mismatched source/header files
The files registers.cc, isa.cc, and decoder.cc do not match the header
name. This is a minor cleanup to make development more straightforward.

Change-Id: Ibab18dfe315b0ce84359939b490f8227ea43cac0
2024-01-19 13:32:24 -06:00
Matthew Poremba
cd91c6321f arch-vega: Reorganize instructions to multiple files
The Vega instructions.cc file is 47k lines long which results in both
large compilation times whenever it is modified and long style check
times. This makes iterating over more complex instruction
implementations very time consuming.

This commit moves the instruction definitions to multiple files based on
the instruction encoding (SOP2, VOP2, FLAT, DS, etc.). The resulting
files are much smaller (max is 8k lines) and compilation and style check
times are much more reasonable. Other than moving code around, there are
no functional changes in this commit.

Change-Id: Id4ac8e98ef11a58de5fd328f8a0cd7ce60a11819
2024-01-19 13:32:24 -06:00
Jason Lowe-Power
a555449c12 arch-arm: Fix compile error in kvm (#784)
The addition of std::optional in #732 caused a compile error. This
change fixes the error by checking to see if the value is present and
panicking otherwise.

Change-Id: I46c3fb76eb0e14ba7bede7c336293fbe9add8c84

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-01-19 07:59:59 -08:00
Bobby R. Bruce
5f767d7836 misc: Fixing comment indentation in weekly-tests.yaml
Change-Id: I047ef921703e635b37bacb54cd5b091c2a41b1d3
2024-01-18 15:55:25 -08:00
Yu-Cheng Chang
f56459470a arch-riscv: Refactor the RISC-V multiplication utility (#780)
1. Add the new double width for int64_t and uint64_t
2. Use the wider type to get the upper result of multiplication

Change-Id: Id6cfa6f274c65592b2b3e2b70c00f82954b41f1a
2024-01-18 12:40:11 -08:00
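The "wider type" approach in f56459470a amounts to computing the full 128-bit product and keeping its upper half. A minimal Python sketch follows, assuming 64-bit operands passed as unsigned bit patterns; `mulhu`, `mulh` and `to_signed64` are illustrative names modelled on the RISC-V mnemonics, not the gem5 utility functions.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value):
    """Reinterpret a 64-bit pattern as a signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def mulhu(a, b):
    """Upper 64 bits of the unsigned 128-bit product."""
    return ((a & MASK64) * (b & MASK64)) >> 64


def mulh(a, b):
    """Upper 64 bits of the signed 128-bit product, as a 64-bit pattern."""
    return ((to_signed64(a) * to_signed64(b)) >> 64) & MASK64
```

Python integers are arbitrary precision, so the product here plays the role of the double-width type that the commit adds for int64_t and uint64_t.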
Matthew Poremba
9b89149142 tests,ext: Remove GCN3 tags, update tests to Vega
Change-Id: I782b6e61cd43b51cfbe80161d4dc1cee125f7f64
2024-01-17 11:13:50 -06:00
Matthew Poremba
0f45ae424c util: Remove GCN3 references and target from gcn-gpu docker
Change-Id: I622470588a7e02088a1b9bb3dcfaa677e835e87c
2024-01-17 11:12:36 -06:00
Matthew Poremba
63caa780c2 misc: Remove all references to GCN3
Replace instances of "GCN3" with Vega. Remove gfx801 and gfx803. Rename
FIJI to Vega and Carrizo to Raven.

Using misc since there is not enough room to fit all the tags.

Change-Id: Ibafc939d49a69be9068107a906e878408c7a5891
2024-01-17 11:11:06 -06:00
QQeg
511729ab76 arch-riscv: Fix issue when vl=0 in VectorIntMaskMacroConstructor (#715)
I’ve been working on a fix for the issue #759 where ‘vd’ incorrectly
stores all zeros when ‘vl’ is set to 0 in VectorIntMaskMacroConstructor.
My solution seems to work, but it behaves differently from other macros
when ‘vl’ = 0. Instead of pushing a ‘nop’ to ‘microops’, it pushes a
micro operation that remains ineffective due to ‘vl’ being 0.
2024-01-17 08:45:08 -08:00
Matthew Poremba
57fb083f43 arch-gcn3: Remove all GCN3 files
Change-Id: Ib7d9e8676a31e51a330e68d81099580e2509a90a
2024-01-17 10:44:44 -06:00
Nitish Arya
c2a22b03b4 mem-ruby: fix ruby startup() to reset exit event correctly (#773)
When restoring, the simulate_limit_event pointer is not
restored after the dry simulation run, which ends in
"Panic: event not found!"
This commit fixes the issue by correctly restoring
the pointer value along with the event queue head

Change-Id: Id5ad4d2a270a6cd34eec1dc5c9b170b2b84610d4

---------

Co-authored-by: narya <nitish.arya@bsc.es>
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
2024-01-17 08:41:10 -08:00
Matthew Poremba
70376d43a3 arch-vega: Fix upsize cast error in newer compilers (#774)
Newer compilers error on -Warray-length in the recent MI200 patches due
to casting from a 32-bit data type to a 64-bit type. Change it to cast
the 32-bit integer first and the 64-bit integer later to remove the
warning.

Rerun of validation tests on the three instructions passed.

Change-Id: I0309e5f7b5b8cc8ce1651660ddddb120fa6e7666
2024-01-16 09:41:23 -08:00
Matthew Poremba
6a9e80c54c gpu-compute: Support for MI200 GPU model (#733) 2024-01-15 08:18:34 -08:00
Hoa Nguyen
85eb99388a arch-riscv: Remove the check of bit 63 of the physical address (#756)
Currently, the TLB enforces that bit 63 of a physical address is
zero. This check stems from the riscv-tests, which check bit 63
of a physical address [1]. This is because the ISA
implicitly says that the physical address must be zero-extended on the
most significant bits that are not translated [2]. More details on this
issue are here [3].

The check for bit 63 of a physical address in the TLB is rather too
specific, and I believe the check for invalid physical addresses is already
implemented in PMA. Thus, this change proposes to remove this check from
the RISC-V TLB.

[1]
bd0a19c136/isa/rv64mi/access.S (L18)
[2] https://groups.google.com/a/groups.riscv.org/g/isa-dev/c/8kO7X0y4ubo
[3] https://github.com/gem5/gem5/issues/238

Change-Id: I247e4d4c75c1ef49a16882c431095f6e83f30383

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2024-01-12 15:17:49 -08:00
Arteen Abrishami
e5bdc760e3 mem-ruby: allow comparison of int and Addr in SLICC (#701)
allow easy isolation of specific addresses in coherence protocols.
useful for debugging.

Change-Id: I93e07956b8e29837219d328dacfbd5c6067c1a62
2024-01-12 10:02:29 -08:00
Giacomo Travaglini
7487c13181 configs: Add o3 --cpu choice to the starter_se.py script (#764)
This is matching what we are already doing in the starter_fs.py script

Change-Id: I50239050be9bd151a607ec892f8dd9322b24040b

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-01-12 07:47:51 -08:00
Harshil Patel
77d6442c1a util: Addressed requested changes
Change-Id: I202bb591960b76f74c3fbb95867905b968c3517d
2024-01-10 21:59:21 -08:00
Yu-Cheng Chang
2f24ee570e arch-riscv: Move PMAChecker and PMP to RiscvISA namespace (#691)
The PMAChecker and PMP are only used in the RiscvISA and should be in
the RiscvISA namespace to simplify the implementation

Change-Id: I4968e2de4c028cb2dceed977f2173fc8b1efd175
2024-01-10 16:58:13 -08:00
Yu-Cheng Chang
74dd0bb9bb fastmodel: Fix the Fastmodel RemoteGDB initial (#735)
Change-Id: Iec9ef145ccac353b8a41f501dd76bf53288dd478
2024-01-10 16:55:54 -08:00
Matt Sinclair
ab9e61ea03 gpu-compute: WAX dependency detection (#731)
WAX Dependencies would be missed if a RAW Dependency also existed.
2024-01-05 12:57:24 -06:00
Matt Sinclair
dc85d1492c gpu-compute: Added register file cache support (#730)
The RFC is defaulted to a size of 0 which removes it completely. To use
the RFC set the --register-file-cache-size to a non-zero multiple of
two. In addition, rfc_pipe_length may be altered to increase or decrease
RFC latency benefit.
2024-01-05 12:57:06 -06:00
KaiBatley
359ac63280 gpu-compute: Added register file cache support
The RFC is defaulted to a size of 0 which removes it completely. To use
the RFC set the --register-file-cache-size to a non-zero multiple of
two. In addition, rfc_pipe_length may be altered to increase or
decrease RFC latency benefit.

Change-Id: I6f5bf5b750eb64155fbc8c8343e9feadce5c9f79
2024-01-04 22:43:05 -06:00
Tiago Mück
b652ab8558 mem-ruby: fix missing txnId for prefetch requests (#734)
Internal prefetch message generation at AllocateTBE_PfRequest was
missing the expected txnId value.

Change-Id: I7d1ead24db947a15133f6ec45b27a47c70096682

Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2024-01-04 07:55:11 -08:00
Giacomo Travaglini
5e2e748f3a arch-arm: Handle invalid case for encodeAArch64SysReg (#732)
This patch is amending encodeAArch64SysReg so that it covers the case
where there are no arch numbers available for the misc index passed as
an argument.

This could happen if the register ID is a gem5 pseudo register which is
not associated with any architected op1/op2/crn/crm tuple.

Rather than panicking we return a nullopt.

Change-Id: I7ab70467105ef93c0c78ac4e999c7dc8e5e09925

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-01-04 10:04:40 +00:00
KaiBatley
55fce58c19 gpu-compute: WAX dependency detection
WAX Dependencies would be missed if a RAW Dependency also existed.

Change-Id: I2a9e50b9d0540a30de9c1bf6bb544c7b9654cb29
2024-01-03 22:02:02 -06:00
Matthew Poremba
31e63b01ad arch-vega: Add vop3p DOT instructions
Implemented according to the ISA spec. Validated with silicon. In
particular the sign extend is important for the signed variants and the
unsigned variants seem to overflow lanes (hence why there is no mask()
in the unsigned variants). FP16 -> FP32 continues using ARM's fplib.

Tested vs. an MI210. Clamp has not been verified.

Change-Id: Ifc09aecbc1ef2c92a5524a43ca529983018a6d59
2024-01-03 15:41:06 -06:00
Matthew Poremba
a40f8f0efa configs: Add MI200 script
This is the MI200 equivalent of configs/example/gpufs/vega10.py.

Change-Id: Ib9761caa4326abe6b90099e6a77111b2acce0f76
2024-01-03 15:41:06 -06:00
Matthew Poremba
420cda1bef arch-vega: Implement FP32 packed math
Starting with MI200, packed math can operate on double dword inputs. In
this case, 64-bits of inputs (two VGPRs per lane) contain two FP32
values.

Add instructions to perform add, multiply, and FMA on packed FP32 types.

Change-Id: Ib838bff91a10e02e013cc7c33ec3d91ff08647b0
2024-01-03 15:41:06 -06:00
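The packed FP32 layout described in 420cda1bef (64 bits holding two FP32 values) can be modelled in Python with the standard `struct` module. This is only a sketch: the helper names are hypothetical, and placing the first value in the low dword is an assumption of the example rather than something stated in the commit.

```python
import struct


def join_fp32(lo, hi):
    """Pack two FP32 values into one 64-bit integer (lo in the low dword)."""
    return struct.unpack("<Q", struct.pack("<ff", lo, hi))[0]


def split_fp32(packed):
    """Unpack a 64-bit integer into its two FP32 halves (lo, hi)."""
    return struct.unpack("<ff", struct.pack("<Q", packed))


def pk_add_f32(a, b):
    """Element-wise add of two packed FP32 pairs."""
    a_lo, a_hi = split_fp32(a)
    b_lo, b_hi = split_fp32(b)
    return join_fp32(a_lo + b_lo, a_hi + b_hi)
```

Multiply and FMA follow the same split, operate, join pattern.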
Matthew Poremba
7b0c47d52f arch-vega: Implement all global atomics up to gfx90a
This change adds all of the missing flat/global atomics up to and including
the new atomics in gfx90a (MI200). Adds all decodings and instruction
implementations with the exception of __half2 which does not have a
corresponding data type in gem5. This refactors the execute() and
completeAcc() methods by creating helper functions similar to what
initiateAcc() uses. This reduces redundant code for global atomic
instruction implementations.

Validated all except PK_ADD_F16, ADD_F32, and ADD_F64 which will be done
shortly. Verified the source/dest register sizes in the header are
correct and the template parameters for the new execute()/completeAcc()
methods are correct.

Change-Id: I4b3351229af401a1a4cbfb97166801aac67b74e4
2024-01-03 15:41:06 -06:00
Matthew Poremba
472c697d88 arch-vega: Implement v_mfma_i32_16x16x16i8
Tested using AMD labs notes examples located on github:

https://github.com/amd/amd-lab-notes/blob/release/matrix-cores/
    src/mfma_i32_16x16x16i8.cpp

Change-Id: Ib0e50162288528012b6d3395e1f629ebf12e8e54
2024-01-03 15:41:06 -06:00
Matthew Poremba
cc75281802 gpu-compute: Update code object to latest LLVM
The AMDKernelCode struct is very outdated. Most of the fields are no
longer used and have been replaced with new fields that are used.
Therefore in order to support the new fields the code object needs to be
updated. The new structure is based on the table located at
https://llvm.org/docs/AMDGPUUsage.html#code-object-v3-kernel-descriptor

Most notably this adds the new compute_pgm_rsrc3 and kernarg preload
fields which are new features in gfx90a (MI200). The accum_offset in
compute_pgm_rsrc3 and kernarg preload values are necessary to run
applications which enable those features and therefore a way to check
their values is needed.

Also notable is the removal of enable_sgpr_workgroup_id_{X,Y,Z}. These
seem to be unused in all versions of ROCm that gem5 supports and
therefore these fields can be removed. They are replaced with a reserved
field in the new code object.

Change-Id: I5542442e1e5961b05e17affad0adb5186d6d9d1a
2024-01-03 15:41:06 -06:00
Matthew Poremba
7e1b27969f arch-vega: Improve FLAT disassembly
Use the opSelectorToRegSym which will print the full range of VGPRs
(e.g., will now print v[2:3] instead of v2 when the source / dest is
64-bits). Fixes atomic disassembly prints. Now shows "glc" if GLC bit is
enabled. Fixes some VGPR fields being printed as an SGPR in places where
the 9-bit register index bit is implied (e.g., VDST).

This makes it easier to use a GPUExec trace to match with LLVM
disassembly when debugging.

Change-Id: Ia163774850f0054243907aca8fc8d0361e37fdd5
2024-01-03 10:40:34 -06:00
Matthew Poremba
bc69ab0a1f arch-vega: Add VOP3P encodings and packed 16b insts
This adds the VOP3P and VOP3P_MAI encodings from the MI200 spec. These
instructions are used for packed math and miSIMD instructions. The first
19 VOP3P opcodes are implemented and validated against hardware. This
includes all instructions which operate on one dword containing two
packed 16-bit values of fp16, int16_t, or uint16_t.

Implement one MFMA instruction for now which was also validated against
hardware.
2024-01-03 10:40:34 -06:00
Matthew Poremba
4903fe2db1 arch-arm: Allow fplib to be used outside of ARM build
This is useful in other ISAs to implement FP16 computation. For example,
it can be used in the GPU model. The ARM specific misc register is
ignored in that case.

Change-Id: I339ac0ccd9be4371b0f220ad99068e5e12b3d263
2024-01-03 10:40:34 -06:00
Matthew Poremba
8c016ebbbc gpu-compute: Implement packed workitem ABI init
This initialization method is used in gfx90a (MI200). Rather than using
three VGPRs for X,Y,Z dimensions of the kernel, pack them into one
register with 10 bits for each dimension.

Change-Id: I8e5b681c8287779ff9f80451d6028e862322294a
2024-01-03 10:40:34 -06:00
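The packing described in 8c016ebbbc can be sketched as three 10-bit fields in one 32-bit register. The bit positions used below (X in bits 9:0, Y in bits 19:10, Z in bits 29:20) are an assumption of this example, as are the function names; the commit message only states that each dimension gets 10 bits.

```python
FIELD_BITS = 10
FIELD_MASK = (1 << FIELD_BITS) - 1


def pack_workitem_id(x, y, z):
    """Pack three work-item IDs into one 32-bit value, 10 bits each."""
    for value in (x, y, z):
        if not 0 <= value <= FIELD_MASK:
            raise ValueError("work-item ID does not fit in 10 bits")
    return x | (y << FIELD_BITS) | (z << (2 * FIELD_BITS))


def unpack_workitem_id(packed):
    """Recover (x, y, z) from the packed register value."""
    return (
        packed & FIELD_MASK,
        (packed >> FIELD_BITS) & FIELD_MASK,
        (packed >> (2 * FIELD_BITS)) & FIELD_MASK,
    )
```

This uses one VGPR where the older initialization used three.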
Matthew Poremba
5e45233484 gpu-compute: Add gfx version to HSA task entry
The version is necessary for determining the correct ABI init process.
Add it to the task queue so it is accessible when doing ABI init.

Change-Id: If77434b0f93614057b5c40fcf612d59b54e05dbb
2024-01-03 10:40:34 -06:00
Bobby R. Bruce
bae3487678 misc: Merge release-staging-v23-1 into stable (#711) 2023-12-28 12:50:12 -08:00
Alexander Richardson
e7d7199ea4 scons: Add option to use libc++ (#680)
This adds an option --with-libcxx, that adds the -stdlib=libc++ flag to
link against libc++ instead of libstdc++ on Linux. Currently this is
only possible with clang and may not work with all build configurations
(e.g. protobuf linked against libstdc++), so this needs to be opt-in
rather than being on by default for clang whenever libc++ is detected.

Change-Id: Ib4022a58bb2dbd32417c58f01c7443a02ff710fe
2023-12-28 12:49:44 -08:00
Bobby R. Bruce
88ea70886b misc: Merge v23.1 staging branch into develop (#716)
This is just to triple make sure everything on staging is in the develop
branch.
2023-12-27 20:16:30 -08:00
Bobby R. Bruce
e0706e9270 misc: Merge stable into release-staging-v23-1 (#717) 2023-12-27 12:47:27 -08:00
Bobby R. Bruce
0615ba4748 misc: Merge branch 'release-staging-v23-1' into develop
Change-Id: I091b7788d67f1803ddb8f9c4f5661f1f24c3b594
2023-12-27 12:42:51 -08:00
Bobby R. Bruce
b90cc7f3bc misc: Merge branch stable into release-staging-v23-1
Change-Id: I3903331ec4c9d7ba83656bbf579ac3c1cac8518f
2023-12-27 12:39:52 -08:00
Bobby R. Bruce
2fe738911e misc: Change version information to develop for v24.0
Change-Id: I5a29cd574256f8a0f8963567ead0af45c1fce9f2
2023-12-27 12:29:52 -08:00
Bobby R. Bruce
7c882e17db scons: Re-enable warnings-as-errors for develop branch
Change-Id: I9c9898395858bad7365b36779b71d07f847bed70
2023-12-27 11:48:16 -08:00
Bobby R. Bruce
4ea676471a misc: Merge branch 'release-staging-v23-1' into develop
This is just a sanity check to ensure all changes are in the `develop`
branch.

Change-Id: I7f6e6709338ae9a386bc7527d5cf4daf10d768c2
2023-12-27 11:42:56 -08:00
Bobby R. Bruce
012c4a3fbd misc: Merge branch stable into release-staging-v23-1
Change-Id: I3903331ec4c9d7ba83656bbf579ac3c1cac8518f
2023-12-27 11:41:50 -08:00
Yu-Cheng Chang
5c4e41ad23 misc: Fix kconfig section format of RELEASE-NOTE.md (#714)
Change-Id: Iff6edca7db3c46d2ff1c3d4b19cc0907d4f2922d
2023-12-27 11:20:56 -08:00
Harshil Patel
cafc5e685d misc: Add release notes for version 23.1 (#447) 2023-12-23 18:23:06 -08:00
Bobby R. Bruce
025ccadc68 configs: Fix SMT cpu type checking (#698)
The args.cpu_type is not a type but a string so the isinstance checking
will always fail and an assertion will always be thrown

A cherry-pick of #684 to develop

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Co-authored-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-12-22 11:30:45 -08:00
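The failure mode fixed by 025ccadc68 is easy to reproduce in isolation: `isinstance` on a string naming a class is never true for that class. In the Python sketch below, `DerivO3CPU` is an empty stand-in class and `CPU_CLASSES` is a hypothetical lookup table; neither is the real gem5 config code.

```python
class DerivO3CPU:
    """Empty stand-in for a CPU class; not the gem5 SimObject."""


# Hypothetical name-to-class table used only for this sketch.
CPU_CLASSES = {"DerivO3CPU": DerivO3CPU}


def broken_is_o3(cpu_type):
    """cpu_type is a string, so this isinstance check is always False."""
    return isinstance(cpu_type, DerivO3CPU)


def fixed_is_o3(cpu_type):
    """Resolve the string to a class first, then compare classes."""
    cls = CPU_CLASSES.get(cpu_type)
    return cls is not None and issubclass(cls, DerivO3CPU)
```

An assertion built on `broken_is_o3` would fire for every command-line value, which matches the behaviour the commit describes.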
Bobby R. Bruce
4c02ae214f scons: Remove warnings-as-errors comp feature for v23.1 (#708) 2023-12-21 19:45:12 -08:00
Bobby R. Bruce
2134ec5163 scons: Remove warnings-as-errors comp feature for v23.1
Change-Id: Iae1981e06cd98aa12e1e2362487e36388602dc7b
2023-12-21 19:15:51 -08:00
Bobby R. Bruce
d48ed780d2 misc: Cherry-pick commits from develop to v23.1 staging (#707)
This PR includes commits from:

* #696
* #706 
* #705 
* #695 
* #704
2023-12-21 18:50:09 -08:00
Jason Lowe-Power
a28a2be4e6 Revert "tests: Fix garnet and memcheck tests to use X86"
This reverts commit b22ca02a65.
2023-12-21 18:46:45 -08:00
Jason Lowe-Power
8eded999f4 mem-ruby,configs: Enable Ruby with NULL build
After removing `get_runtime_isa`, the `send_evicts` function in the ruby
configs assumes that there is an ISA built. This change short-circuits
that logic if the current build is the NULL (none) ISA.

Change-Id: I75fefe3d70649b636b983c4d2145c63a9e1342f7
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-12-21 18:46:37 -08:00
Bobby R. Bruce
646b1f4882 cpu: 'suppressFuncErrors' -> 'pkt->suppressFuncError()' fix
Change-Id: If4aa71e9f6332df2a3daa51b69eaad97f6603f6b
2023-12-21 18:46:25 -08:00
Bobby R. Bruce
585ce62703 tests: Fix garnet and memcheck tests to use X86
These tests previously used "build/NULL" but due to changes in the
"Ruby" and "garnet_synth_traffic.py" scripts, "NULL" fails as the script
expects "X86TimingSimple" with MESI_Two_Level.

This change fixes the tests by compiling and using the correct
compilation of gem5. It shouldn't affect the tests in any negative way.
As far as I'm aware it does not matter what ISA is used for these tests.

Change-Id: I8ae84b49f65968e97bef4904268de5a455f06f5c
2023-12-21 18:46:16 -08:00
Bobby R. Bruce
aadd59b1f0 configs: Add hasattr guard to ensure DerivO3CPU compiled
configs/ruby/Ruby.py fails when `DerivO3CPU` is not compiled into the
gem5 binary. The `isinstance` check fails. This fix adds a guard.

Change-Id: I1e5503ab18ec94683056c6eb28cebeda6632ae8e
2023-12-21 18:46:09 -08:00
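The guard described in the `configs: Add hasattr guard to ensure DerivO3CPU compiled` commit above can be sketched as follows. This is a minimal illustration, not the code from configs/ruby/Ruby.py: the `is_o3_cpu` helper and the two namespaces standing in for `m5.objects` are assumptions made so the snippet runs without gem5.

```python
import types

# Hypothetical stand-ins for `m5.objects`: one build without the O3 model,
# one build with it. Neither is the real gem5 module.
without_o3 = types.SimpleNamespace(
    TimingSimpleCPU=type("TimingSimpleCPU", (), {}),
)
with_o3 = types.SimpleNamespace(
    TimingSimpleCPU=type("TimingSimpleCPU", (), {}),
    DerivO3CPU=type("DerivO3CPU", (), {}),
)


def is_o3_cpu(cpu, objects_module):
    """Return True only if DerivO3CPU exists in this build and cpu is one.

    The hasattr test runs first, so the attribute lookup inside isinstance
    is never reached when the class was not compiled in.
    """
    return hasattr(objects_module, "DerivO3CPU") and isinstance(
        cpu, objects_module.DerivO3CPU
    )
```

Without the `hasattr` test, `objects_module.DerivO3CPU` would raise `AttributeError` on a build that lacks the class.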
Harshil Patel
70aeaaa0e9 mem: Updated bytesRead and bytesWritten stat (#705)
- The bytesRead and bytesWritten stat had duplicate names. Updated
bytesRead and bytesWritten for dram_interface and nvm_interface

Change-Id: I7658e8a0d12ef6b95819bcafa52a85424f01ac76
2023-12-21 18:46:02 -08:00
Bobby R. Bruce
c4146d8813 misc: Fix 'maybe-uninitialized' warn turn off (#706)
https://github.com/gem5/gem5/pull/696 was implemented incorrectly and
causes an error when running with GCC 12.1. This patch fixes the error.
2023-12-21 18:45:56 -08:00
Bobby R. Bruce
e95389920a misc: Turn off 'maybe-uninitialized' warn for regex include (#696)
https://github.com/gem5/gem5/pull/636 triggered a bug with the GCC
compiler and its interaction with the CPP stdlib regex library, outlined
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105562.

This was causing the gem5 Compiler tests to fail for GCC-12:
https://github.com/gem5/gem5/actions/runs/7219055796

This fix turns off the 'maybe-uninitialized' warning when we include the
regex headers in "src/kern/linux/helpers.cc".
2023-12-21 18:45:47 -08:00
Bobby R. Bruce
d6b798431f mem-ruby,configs: Enable Ruby with NULL build (#704)
After removing `get_runtime_isa`, the `send_evicts` function in the ruby
configs assumes that there is an ISA built. This change short-circuits
that logic if the current build is the NULL (none) ISA.
2023-12-21 10:22:42 -08:00
Harshil Patel
5288dbbf90 mem: Updated bytesRead and bytesWritten stat (#705)
- The bytesRead and bytesWritten stat had duplicate names. Updated
bytesRead and bytesWritten for dram_interface and nvm_interface

Change-Id: I7658e8a0d12ef6b95819bcafa52a85424f01ac76
2023-12-21 10:21:40 -08:00
Bobby R. Bruce
25e0e96741 misc: Fix 'maybe-uninitialized' warn turn off (#706)
https://github.com/gem5/gem5/pull/696 was implemented incorrectly and
causes an error when running with GCC 12.1. This patch fixes the error.
2023-12-21 10:21:20 -08:00
Jason Lowe-Power
bab8af44fb Revert "tests: Fix garnet and memcheck tests to use X86"
This reverts commit b22ca02a65.
2023-12-20 15:32:57 -08:00
Jason Lowe-Power
7adaaa6f2a mem-ruby,configs: Enable Ruby with NULL build
After removing `get_runtime_isa`, the `send_evicts` function in the ruby
configs assumes that there is an ISA built. This change short-circuits
that logic if the current build is the NULL (none) ISA.

Change-Id: I75fefe3d70649b636b983c4d2145c63a9e1342f7
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-12-20 15:26:03 -08:00
Bobby R. Bruce
82b5c332b7 tests: Fix Daily memory tests (#695)
Fixes a series of issues in the Daily memory tests causing test failure.
Discussed in #697.
2023-12-20 13:11:25 -08:00
Bobby R. Bruce
2f58f1c87b misc: Turn off 'maybe-uninitialized' warn for regex include (#696)
https://github.com/gem5/gem5/pull/636 triggered a bug with the GCC
compiler and its interaction with the CPP stdlib regex library, outlined
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105562.

This was causing the gem5 Compiler tests to fail for GCC-12:
https://github.com/gem5/gem5/actions/runs/7219055796

This fix turns off the 'maybe-uninitialized' warning when we include the
regex headers in "src/kern/linux/helpers.cc".
2023-12-20 13:10:56 -08:00
Bobby R. Bruce
213d0b0bfe cpu: 'suppressFuncErrors' -> 'pkt->suppressFuncError()' fix
Change-Id: If4aa71e9f6332df2a3daa51b69eaad97f6603f6b
2023-12-20 09:15:15 -08:00
Giacomo Travaglini
4f5d4b9baf mem-ruby: Implement WriteUniqueZero CHI transaction (#692)
The WriteUniqueZero is an immediate write to a Snoopable address region
that does not require any data transfer (cacheline is zeroed)

Change-Id: Ia8c9b40e08a3b7d613f0b62ce0ac4b0547860871

Reviewed-by: Tiago Muck <tiago.muck@arm.com>

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-12-19 11:12:50 +00:00
Bobby R. Bruce
211d00f48f misc: Cherry pick changes from develop to the v23.1 staging branch (#699)
This PR includes:

* #689 
* #510
2023-12-18 17:29:46 -08:00
Harshil Patel
d76a01973a util: Added script to copy resources from mongodb (#510)
- This script copies all resources from a mongodb database locally. The
script creates a resources.json and downloads all the resources. It also
updates the resources.json to point to these local downloads instead of
the cloud bucket.

Change-Id: I15480c4ba82bbf245425205c9c1ab7c0f3501cc3
2023-12-18 17:28:19 -08:00
Tiberiu Bucur
27d89379d2 sim: Remove trailing / from proc/meminfo special path (#689)
Note: A bug was identified in that one of the special file paths,
namely /proc/meminfo, contained an extra trailing /, implicitly making
the incorrect assumption that meminfo was a directory, when it is, in
fact, a (pseudo-)file. This was causing applications in SE mode to fail
opening the meminfo pseudo-file with errno 13. This commit fixes this
issue.

Change-Id: I93fa81cab49645d70775088f1e634f067b300698
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2023-12-18 17:28:10 -08:00
Alexander Richardson
2700f392cb tests: Silence Clang 16 warnings (#679)
I was trying to build with clang 16 and ran into these -Werror warnings

Change-Id: I9207990fcfe9c1a5485945294969f21d1d812a7c
2023-12-18 14:57:11 -08:00
Bobby R. Bruce
b22ca02a65 tests: Fix garnet and memcheck tests to use X86
These tests previously used "build/NULL" but due to changes in the
"Ruby" and "garnet_synth_traffic.py" scripts, "NULL" fails as the script
expects "X86TimingSimple" with MESI_Two_Level.

This change fixes the tests by compiling and using the correct
compilation of gem5. It shouldn't affect the tests in any negative way.
As far as I'm aware it does not matter what ISA is used for these tests.

Change-Id: I8ae84b49f65968e97bef4904268de5a455f06f5c
2023-12-18 14:39:28 -08:00
Bobby R. Bruce
5d09ff4525 configs: Add hasattr guard to ensure DerivO3CPU compiled
configs/ruby/Ruby.py fails when `DerivO3CPU` is not compiled into the
gem5 binary. The `isinstance` check fails. This fix adds a guard.

Change-Id: I1e5503ab18ec94683056c6eb28cebeda6632ae8e
2023-12-18 14:37:51 -08:00
Harshil Patel
b42d9fabf7 util: Added script to copy resources from mongodb (#510)
- This script copies all resources from a mongodb database locally. The
script creates a resources.json and downloads all the resources. It also
updates the resources.json to point to these local downloads instead of
the cloud bucket.

Change-Id: I15480c4ba82bbf245425205c9c1ab7c0f3501cc3
2023-12-18 12:41:52 -08:00
Giacomo Travaglini
ce6fd7f084 configs: Fix SMT cpu type checking (#684)
The args.cpu_type is not a type but a string so the isinstance checking
will always fail and an assertion will always be thrown

Change-Id: I6a88d1a514bb323c517949632f4e76be40e87e8c

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-12-17 22:08:49 -08:00
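The failure mode described in the `configs: Fix SMT cpu type checking` commit above can be reproduced in a few lines. This is only a sketch: `DerivO3CPU` here is a hypothetical stand-in class, and comparing names is one possible remedy, not necessarily the one the commit applies.

```python
class DerivO3CPU:
    """Hypothetical stand-in for a CPU model class."""


# A command-line option such as args.cpu_type yields a plain string.
cpu_type = "DerivO3CPU"


def broken_check(cpu_type):
    # A str is never an instance of the CPU class, so this is always False
    # and any assertion built on it always fires.
    return isinstance(cpu_type, DerivO3CPU)


def name_check(cpu_type):
    # Compare the string against the class name instead (an assumption).
    return cpu_type == DerivO3CPU.__name__
```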
Tiberiu Bucur
9b0bf33f79 sim: Remove trailing / from proc/meminfo special path (#689)
Note: A bug was identified in that one of the special file paths,
namely /proc/meminfo, contained an extra trailing /, implicitly making
the incorrect assumption that meminfo was a directory, when it is, in
fact, a (pseudo-)file. This was causing applications in SE mode to fail
opening the meminfo pseudo-file with errno 13. This commit fixes this
issue.

Change-Id: I93fa81cab49645d70775088f1e634f067b300698
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2023-12-17 22:07:39 -08:00
Giacomo Travaglini
a008cd2611 mem-ruby: Implement a dummy StashOnceShared/Unique (#688)
Stash requests will simply be discarded by the Home Node. This will
return a CompI response to the RNF.

Change-Id: I9c2ce5d4d42f380d1a554933d381cf8a8590ba22

Reviewed-by: Tiago Muck <tiago.muck@arm.com>

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-12-16 14:43:45 -08:00
Bobby R. Bruce
9064249fab misc: Cherry-pick PR #666 from develop to release-staging-v23-1 (#683)
This was supposed to be part of the #682 but got missed. Adding now as a
separate PR.
2023-12-14 01:51:40 -08:00
Bobby R. Bruce
29b77260f3 arch-x86: Fix two_byte_opcodes.isa 0x6 -> 0x0 (#666)
This bug was introduced by https://github.com/gem5/gem5/pull/593 and
caused Issue https://github.com/gem5/gem5/issues/664.

Change-Id: Ia55de364ee8260e1fe315e37e1cffbc71ab229fb
2023-12-14 00:41:09 -08:00
Bobby R. Bruce
a84cfd2f0d misc: Cherry-pick from develop to release-staging-v23-1 [Nov 13th] (#682)
This PR includes all the commits from the following PRs which appear on
the `develop` branch but are required in the v23.1 release and are
therefore being cherry-picked to the `release-staging-v23-1` branch.

* https://github.com/gem5/gem5/pull/645
* https://github.com/gem5/gem5/pull/671
* https://github.com/gem5/gem5/pull/675
* https://github.com/gem5/gem5/pull/677
* https://github.com/gem5/gem5/pull/674
2023-12-14 00:10:59 -08:00
Roger Chang
654e7c6019 arch: Fix inst flag of RISC-V vector store macro instructions
Correct the instruction flags of RISC-V vector store instructions, such
as `vse64_v` and `vse32_v`. The `vse64_v` code in `decoder.isa` is
`Mem_vc.as<uint64_t>()[i] = Vs3_ud[i];`, which generates the code
`Mem.as<uint64_t>()[i] = Vs3[i];`. The current `assignRE` regex marks
the operand `Mem` as `dest` only for formats like `Mem = Rd` or
`Mem[i] = Rd`, because the code ` = Rd` or `[i] = Rd` matches
`assignRE`. For the expression `Mem.as<uint64_t>()[i]`, the operand
`Mem` is falsely marked as `src` because the code `.as<uint64_t>()[i]`
does not match `assignRE`.

The PR ensures the operand `Mem` is marked `dest` for formats like
`Mem.as<xxx>()[i] = yyy`.

Change-Id: I9c57986a64f1efb81eb9c7ade90712b118e0788d
2023-12-14 00:05:26 -08:00
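The matching problem described in the `arch: Fix inst flag of RISC-V vector store macro instructions` commit above can be illustrated with two simplified patterns. These are assumptions written for this sketch, not gem5's actual `assignRE`, and `is_dest` is a hypothetical helper that checks only the text immediately after the operand name.

```python
import re

# Simplified stand-in: an optional [index], then '=' that is not part of '=='.
ASSIGN_SIMPLE = re.compile(r"(\[[^\[\]]+\])?\s*=(?!=)")

# Extended stand-in: additionally allow a leading '.name<...>()' accessor.
ASSIGN_EXTENDED = re.compile(
    r"(\.\w+(<[^>]*>)?\(\))?(\[[^\[\]]+\])?\s*=(?!=)"
)


def is_dest(code, operand, assign_re):
    """Return True if the text right after `operand` looks like an assignment."""
    pos = code.find(operand)
    if pos < 0:
        return False
    # Pattern.match with a start position anchors the match at that offset.
    return assign_re.match(code, pos + len(operand)) is not None
```

With the simple pattern, `Mem = Rd;` and `Mem[i] = Rd;` are classified as destinations, but `Mem.as<uint64_t>()[i] = Vs3[i];` is not; the extended pattern accepts all three.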
Roger Chang
a9f8db7044 arch-riscv: Fix the vector store indexed instructions declaration
Change-Id: I6f8701ef0819c22eda8cb20d09c40101f2d001a0
2023-12-14 00:05:17 -08:00
Harshil Patel
9aab380775 arch-riscv: fix riscv matched board for se mode (#677) 2023-12-14 00:05:02 -08:00
Bobby R. Bruce
d8cc530597 stdlib: Add get_local_path() call to Looppoint resources
Due to a change introduced in https://github.com/gem5/gem5/pull/625, a
gem5 resource will not download any external files until
`get_local_path()` is called. In the construction of the Looppoint
Resources this function was not called; the `local_path` variable was
accessed directly. As such, an error occurred.

The downside of this fix is the Looppoint resources external files are
downloaded when `obtain_resource` is called, thus the bandwidth savings
introduced with https://github.com/gem5/gem5/pull/625 will not occur for
Looppoint resources. However, https://github.com/gem5/gem5/issues/644
proposes a fix which would supersede the
https://github.com/gem5/gem5/pull/625 solution.

Change-Id: I52181382a03e492ec1cb58b01e71bc4820af9ccc
2023-12-14 00:04:10 -08:00
Bobby R. Bruce
301fb3f509 stdlib: Remove 'additional_params' value type assert
The value of a `WorkloadResource`'s additional parameter may not always
be a string. It can be any JSON value (integer, a list, a dict, etc.).
For Looppoint resources we have additional parameters such as a List of
region start points.

The assert inside workloads checking the type of the value breaks
certain use cases and is therefore removed in this commit.

Change-Id: Iecb1518082c28ab3872a8de888c76f0800261640
2023-12-14 00:04:00 -08:00
Harshil Patel
34f784f59c tests: fix gapbs and npb tests (#671)
Change-Id: I6090bde7903e302e501319b545fb4b06ef3e3df9
2023-12-14 00:01:39 -08:00
Harshil Patel
5ac9598133 Arch-riscv: Add chosen node
Change-Id: I458665caec08856cd8e61d2cd7a5b0dc5c35d469
2023-12-14 00:00:41 -08:00
Harshil Patel
7ce69b56be arch-riscv: Update riscv matched board
- Update riscv matched board to work with new
RiscvBootloaderKernelWorkload

Change-Id: Ic20b964f33e73b76775bfe18798bd667f36253f6
2023-12-14 00:00:30 -08:00
Yu-Cheng Chang
6b80a2e81c configs: Make riscv/fs_linux work in build/ALL/gem5.opt (#655)
Change-Id: If9add7dc5e9c5600f769d27817da41466158942b
2023-12-13 23:59:53 -08:00
Yu-Cheng Chang
db286903ee stdlib: Fix the chi protocol of arm boot tests (#658)
Change-Id: I63f17a73b2e16bc26d9b41babc63439a6040791f
2023-12-13 23:59:00 -08:00
Harshil Patel
c66862f6e3 arch-riscv: fix riscv matched board for se mode (#677) 2023-12-13 13:16:08 -08:00
Bobby R. Bruce
695c350f31 stdlib,resources: Fix obtaining gem5 Looppoint resources (#675)
There were two small bugs preventing gem5 from obtaining Looppoint
resources.

1. When obtained via a `WorkloadResource` there was an assert which
assumed the values in the resource's DB entry's `additional_parameter`
field were of type string. This is not the case. For Looppoint resources
there are additional parameters which are arrays.
2. Due to changes introduced in https://github.com/gem5/gem5/pull/625,
the Looppoint CSV and JSON files were not being downloaded when needed.
This was fixed by replacing access to the `local_path` variable with a
call to `get_local_path()`.
2023-12-13 12:49:57 -08:00
Bobby R. Bruce
da3e3b806d arch-riscv: squash walks with tlb hits in startWalkWrapper (#672)
Because each vector load is fragmented into 64 byte cache-aligned
chunks, and one page-table walk is issued per fragment on tlb miss,
walks start to accumulate on a pending queue, which is processed in a
blocking way (no pending walks can be issued while one is being
processed). This adds noticeable latency on vector loads when VLEN is
sufficiently large.

This commit fixes the issue by allowing walks to be squashed if a TLB
lookup hits just before starting the walk on `startWalkWrapper`. This
idea was taken from the ARM walker.
2023-12-13 12:45:40 -08:00
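The squashing idea from the `arch-riscv: squash walks with tlb hits in startWalkWrapper` commit above can be modelled with a toy queue. This is a sketch under stated assumptions, not the RISC-V walker: the TLB is a plain dict, `walk` is a hypothetical callback, and `drain_walks` is an invented name.

```python
from collections import deque


def drain_walks(pending, tlb, walk):
    """Process queued page-table walks one at a time.

    Before each walk starts, the TLB is looked up again; if an earlier walk
    already filled the entry, the queued walk is squashed instead of issued.
    Returns the number of walks actually performed.
    """
    performed = 0
    while pending:
        vpn = pending.popleft()
        if vpn in tlb:
            continue  # TLB hit just before starting: squash this walk
        tlb[vpn] = walk(vpn)
        performed += 1
    return performed
```

For a vector load split into several fragments on the same page, only the first queued walk is performed in this model; the rest hit in the TLB and are dropped.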
Saúl Adserias
78f23ad2df arch-riscv: squash walks with tlb hits in startWalkWrapper
Change-Id: I1bdfd7b2ee02ddee5a2d4c13bafc8c472f555f61
2023-12-13 16:40:46 +01:00
Giacomo Travaglini
8d09e95420 arch-arm: Partial SVE2 Implementation (#657)
Instructions added:

BGRP, RAX1, EOR3, BCAX,
XAR & TBX, PMUL, PMULLB/T, SMULLB/T and UMULLB/T

Move from gerrit [1]

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/70277

Change-Id: Ia135ba9300eae312b24342bcbda835fef6867113
2023-12-13 10:27:19 +00:00
Bobby R. Bruce
4eb81296b1 stdlib: Add get_local_path() call to Looppoint resources
Due to a change introduced in https://github.com/gem5/gem5/pull/625, a
gem5 resource will not download any external files until
`get_local_path()` is called. In the construction of the Looppoint
Resources this function was not called; the `local_path` variable was
accessed directly. As such, an error occurred.

The downside of this fix is the Looppoint resources external files are
downloaded when `obtain_resource` is called, thus the bandwidth savings
introduced with https://github.com/gem5/gem5/pull/625 will not occur for
Looppoint resources. However, https://github.com/gem5/gem5/issues/644
proposes a fix which would supersede the
https://github.com/gem5/gem5/pull/625 solution.

Change-Id: I52181382a03e492ec1cb58b01e71bc4820af9ccc
2023-12-12 14:28:11 -08:00
Bobby R. Bruce
4adeb24a4f stdlib: Remove 'additional_params' value type assert
The value of a `WorkloadResource`'s additional parameter may not always
be a string. It can be any JSON value (integer, a list, a dict, etc.).
For Looppoint resources we have additional parameters such as a List of
region start points.

The assert inside workloads checking the type of the value breaks
certain use cases and is therefore removed in this commit.

Change-Id: Iecb1518082c28ab3872a8de888c76f0800261640
2023-12-12 14:23:04 -08:00
Bobby R. Bruce
eff08ba113 mem: Add a flag on AbstractMemory to control statistics collection (#656)
The stats initialization in the AbstractMemory allocates the space
according to the max requestors of the System. This may cause issues in
multiple-system simulations.
Suppose there are two systems, A and B. A has one requestor and a memory,
while B has two requestors. When the requestor with requestor id 2
sends requests to the memory in A, the simulator crashes because
requestor id 2 is outside the allocated space.

The current solution is to add a SysBridge between A and B which
rewrites the requestor id to a valid one. This solution works, but
the bridge must be placed at the correct boundary, which may not be
easy. In addition, the stats would record mapped data, which may not be
accurate.

To reduce the complexity, we add a flag to AbstractMemory to control
the stats. Users who do not need the statistics and want a simple fix
for the cross-system issue can disable statistics collection.
The flag defaults to True so that current users are not affected.
2023-12-12 13:13:30 -08:00
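The crash scenario and the flag described in the `mem: Add a flag on AbstractMemory to control statistics collection` commit above can be modelled with a toy class. Everything here is hypothetical: `ToyMemory`, `collect_stats`, and `bytes_read` are illustrative names, not the gem5 parameter or stat names.

```python
class ToyMemory:
    """Toy memory whose per-requestor stats are sized at construction."""

    def __init__(self, max_requestors, collect_stats=True):
        self.collect_stats = collect_stats
        # One counter per requestor id known to the owning system.
        self.bytes_read = [0] * max_requestors if collect_stats else None

    def read(self, requestor_id, size):
        if self.collect_stats:
            # Raises IndexError when the id comes from another system
            # and lies outside the allocated range.
            self.bytes_read[requestor_id] += size
        return size
```

With stats enabled, an access from requestor id 2 to a memory sized for two requestors fails; with `collect_stats=False` the same access succeeds because no per-requestor counter is touched.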
Bobby R. Bruce
c8cc193db8 arch,arch-riscv: Fix inst flag of RISC-V vector store macro instructions (#674)
Correct the instruction flags of RISC-V vector store instructions, such
as `vse64_v` and `vse32_v`. The `vse64_v` code in `decoder.isa` is
`Mem_vc.as<uint64_t>()[i] = Vs3_ud[i];`, which generates the code
`Mem.as<uint64_t>()[i] = Vs3[i];`. The current `assignRE` regex marks
the operand `Mem` as `dest` only for formats like `Mem = Rd` or
`Mem[i] = Rd`, because the code ` = Rd` or `[i] = Rd` matches
`assignRE`. For the expression `Mem.as<uint64_t>()[i]`, the operand
`Mem` is falsely marked as `src` because the code `.as<uint64_t>()[i]`
does not match `assignRE`.

The PR ensures the operand `Mem` is marked `dest` for formats like
`Mem.as<xxx>()[i] = yyy`.
2023-12-12 13:07:50 -08:00
Harshil Patel
bc12e7269d tests: fix gapbs and npb tests (#671)
Change-Id: I6090bde7903e302e501319b545fb4b06ef3e3df9
2023-12-12 12:33:22 -08:00
Yu-Cheng Chang
5a6901c405 configs: Make riscv/fs_linux work in build/ALL/gem5.opt (#655)
Change-Id: If9add7dc5e9c5600f769d27817da41466158942b
2023-12-12 08:23:28 -08:00
Bobby R. Bruce
37e4173351 arch-x86: Fix two_byte_opcodes.isa 0x6 -> 0x0 (#666)
This bug was introduced by https://github.com/gem5/gem5/pull/593 and
caused Issue https://github.com/gem5/gem5/issues/664.

Change-Id: Ia55de364ee8260e1fe315e37e1cffbc71ab229fb
2023-12-12 08:21:27 -08:00
Roger Chang
bedc3c597c arch: Fix inst flag of RISC-V vector store macro instructions
Correct the instruction flags of RISC-V vector store instructions, such
as `vse64_v` and `vse32_v`. The `vse64_v` code in `decoder.isa` is
`Mem_vc.as<uint64_t>()[i] = Vs3_ud[i];`, which generates the code
`Mem.as<uint64_t>()[i] = Vs3[i];`. The current `assignRE` regex marks
the operand `Mem` as `dest` only for formats like `Mem = Rd` or
`Mem[i] = Rd`, because the code ` = Rd` or `[i] = Rd` matches
`assignRE`. For the expression `Mem.as<uint64_t>()[i]`, the operand
`Mem` is falsely marked as `src` because the code `.as<uint64_t>()[i]`
does not match `assignRE`.

The PR ensures the operand `Mem` is marked `dest` for formats like
`Mem.as<xxx>()[i] = yyy`.

Change-Id: I9c57986a64f1efb81eb9c7ade90712b118e0788d
2023-12-12 17:04:31 +08:00
Roger Chang
10d344a942 arch-riscv: Fix the vector store indexed instructions declaration
Change-Id: I6f8701ef0819c22eda8cb20d09c40101f2d001a0
2023-12-12 16:36:49 +08:00
Harshil Patel
3f2a72641b util: add scripts that help maintain mongoDB
Change-Id: Ie421176782070462bb2a57351a04ba6ae004a9d4
2023-12-11 15:20:37 -08:00
Bobby R. Bruce
ea1226119c arch-riscv: Update riscv matched board (#654)
- Update riscv matched board to work with new
RiscvBootloaderKernelWorkload

Change-Id: Ic20b964f33e73b76775bfe18798bd667f36253f6
2023-12-08 13:33:09 -08:00
Yu-Cheng Chang
10a0c950da stdlib: Fix the chi protocol of arm boot tests (#658)
Change-Id: I63f17a73b2e16bc26d9b41babc63439a6040791f
2023-12-07 16:10:45 -08:00
Giacomo Travaglini
81d3c6307d arch-arm: add Sve mla and mls indexed (#596)
This contains the implementation of the indexed versions of the MLA
and MLS instructions from the ARM SVE2 ISA specification.
2023-12-07 21:47:35 +00:00
Harshil Patel
0f0317ad16 Arch-riscv: Add chosen node
Change-Id: I458665caec08856cd8e61d2cd7a5b0dc5c35d469
2023-12-06 20:10:56 -08:00
Yu-hsin Wang
5de50cc9dd mem: update flag description and the if-block style
Change-Id: Iac727d38b88f1818aeeccf0d8de639fe18759074
2023-12-07 10:21:28 +08:00
Bobby R. Bruce
d006f866c0 misc: Update version to v23.1.0.0 (#662) 2023-12-06 14:09:06 -08:00
Bobby R. Bruce
75544b2abf arch-riscv: Add PCEvent for RISCV FS Workload kernel panic/oops (#573)
Inspired by the similar feature in ARM's full system workload, this
change adds an option to halt gem5 simulation if the guest system
encounters a kernel panic or kernel oops.

On RiscvISA::BootloaderKernelWorkload, by default, the simulation
will exit upon kernel panic, while a kernel oops will not halt the
simulation. This is because the system will essentially do nop after a
kernel panic, while the system might still be functional after a kernel
oops.

Dumping the kernel's dmesg is useful for diagnosing the cause of a
kernel panic, so ideally, we want to dump the guest's dmesg to the host.
However, due to a bug described in [1], kernel v5.18+ dmesg might not be
dumped properly. Hence, the dmesg will not be dumped to the host.

On RiscvISA::FsLinux, this feature is turned off by default as the
symbols from the official RISC-V kernel resource are stripped from the
binary. However, if this feature is enabled, the dmesg will be dumped to
the host system.

[1] https://github.com/gem5/gem5/issues/550

Change-Id: I8f52257727a3a789ebf99fdd4dffe5b3d89f1ebf
2023-12-06 10:58:23 -08:00
Yu-Cheng Chang
9bd61f217f configs: Fix issues after get_runtime_isa() #241 removed (#652)
1. Fix the wrong ISA detection in the get_isa function
2. Fix the typo ObjectLIst.cpu_list
3. Fix missing PageTableWalkerCache
4. Fix the invalid default cpu_type parameter

Change-Id: I217ea8da8a6d8e712743a5b32c4c0669216ce6c4
2023-12-06 10:57:18 -08:00
Nitesh Narayana
d962d2588d arch-arm: This commit cleans .isa files
This commit cleans extra new lines from .isa files from this branch

Change-Id: I4087ed230aa041747038b49360c2aba3f82c0790
2023-12-06 16:03:21 +01:00
Matthias Boettcher
e4dccbea8a arch-arm: Partial SVE2 Implementation
Instructions added:

BGRP, RAX1, EOR3, BCAX,
XAR & TBX, PMUL, PMULLB/T, SMULLB/T and UMULLB/T

Change-Id: Ia135ba9300eae312b24342bcbda835fef6867113
2023-12-06 14:26:31 +00:00
Yu-hsin Wang
e2b3f0b8e4 mem: Add a flag on AbstractMemory to control statistics collection
The stats initialization in the AbstractMemory allocates the space
according to the max requestors of the System. This may cause issues in
multiple-system simulations.
Suppose there are two systems, A and B. A has one requestor and a memory,
while B has two requestors. When the requestor with requestor id 2
sends requests to the memory in A, the simulator crashes because
requestor id 2 is outside the allocated space.

The current solution is to add a SysBridge between A and B which
rewrites the requestor id to a valid one. This solution works, but
the bridge must be placed at the correct boundary, which may not be
easy. In addition, the stats would record mapped data, which may not be
accurate.

To reduce the complexity, we add a flag to AbstractMemory to control
the stats. Users who do not need the statistics and want a simple fix
for the cross-system issue can disable statistics collection.
The flag defaults to True so that current users are not affected.

Change-Id: Ibb46a63d216d4f310b3e920815a295073496ea6e
2023-12-06 13:41:37 +08:00
Harshil Patel
ee4c6a9bac arch-riscv: Update riscv matched board
- Update riscv matched board to work with new
RiscvBootloaderKernelWorkload

Change-Id: Ic20b964f33e73b76775bfe18798bd667f36253f6
2023-12-05 14:54:12 -08:00
Nitesh Narayana
db8e1652e8 arch-arm: This commit uses existing template code for mla/s index
This includes mla/s index version implementation using the existing template code
to avoid code repetition.

Change-Id: If1de84e01dec638e206c979ca832308ebc904212
2023-12-05 23:40:06 +01:00
Matthew Poremba
f00d7f70a4 configs: Fix apu_se.py CPU type checks (#651)
The current checks do not work. Correct the CPU type names

Change-Id: I81778873df0567c4a8dabbbe659c4c7a39326f98
2023-12-04 19:14:46 -08:00
Hoa Nguyen
4a77d532b0 stdlib: Add Kernel Panic/Oops exit event to stdlib
RISCV full system workloads have the capability to exit the simulation loop
upon the guest's kernel panic/oops. This change adds more stdlib exit event types
to accommodate the corresponding gem5 exits upon the guest's kernel panic and
kernel oops.

Change-Id: I3a4f313711793a473c6f138ff831b948034d0bb6
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-12-04 16:52:14 -08:00
Hoa Nguyen
cf087d4d11 arch-riscv: Add PCEvent for RISCV FS Workload kernel panic/oops
Inspired by a similar feature in ARM's full system workload, this change adds
an option to halt gem5 simulation if the guest system encounters a kernel panic
or kernel oops.

On RiscvISA::BootloaderKernelWorkload, by default, the simulation
will exit upon kernel panic, while a kernel oops will not halt the simulation.
This is because the system will essentially do nop after a kernel panic, while the
system might still be functional after a kernel oops.

Dumping the kernel's dmesg is useful for diagnosing the cause of a kernel panic, so
ideally, we want to dump the guest's dmesg to the host. However, due to a bug
described in [1], kernel v5.18+ dmesg might not be dumped properly. Hence, the
dmesg will not be dumped to the host.

On RiscvISA::FsLinux, this feature is turned off by default as the symbols from the
official RISC-V kernel resource are stripped from the binary. However, if this feature
is enabled, the dmesg will be dumped to the host system.

[1] https://github.com/gem5/gem5/issues/550

Change-Id: I8f52257727a3a789ebf99fdd4dffe5b3d89f1ebf
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
2023-12-04 14:59:26 -08:00
Bobby R. Bruce
569e21f798 configs,stdlib,tests: Remove get_runtime_isa() (#241)
`get_runtime_isa()` has been deprecated for some time. It is a leftover
piece of code from when gem5 was compiled to a single ISA and that ISA
was used to configure the simulated system. Since multi-ISA
compilations are possible, `get_runtime_isa()` should not be used.
Unless the gem5 binary is compiled to a single ISA, a failure will
occur.

The new procedure for specifying which ISA to use is to set the
correct `BaseCPU` implementation, e.g., `X86SimpleTimingCPU` or
`ArmO3CPU`.

This patch removes the remaining `get_runtime_isa()` instances and
removes the function itself. The `SimpleCore` class has been updated to
allow for its CPU factory to return a class, needed by scripts in
"configs/common".

The deprecated functionality in the standard library, which allowed for
the specifying of an ISA when setting up a processor and/or core, has
also been removed. Setting an ISA is now mandatory.

Fixes #216.
2023-12-04 09:53:35 -08:00
Nitish Arya
7b98641953 arch-riscv: correctly pass arguments to kernel with new bootloader+kernel (#635)
The [PR](https://github.com/gem5/gem5/pull/390) adds support for new
bootloader and linux kernel. However after applying the changes the
arguments are not passed correctly to the kernel resulting in kernel
panic during simulations. This commit fixes the issue.
2023-12-04 09:02:50 -08:00
Bobby R. Bruce
e0c5f95110 misc: Merge Weekly GPU tests into Weekly Tests
This separation was only for convenience while GPU tests were under
development and rapidly changing. This change merges the GPU tests into
the weekly tests where they belong.

Change-Id: I0e7118e863dba51334de89b3bbc3592374ef63ec
2023-12-03 13:46:55 -08:00
Jason Lowe-Power
895944fa27 mem-ruby: Fix compile error in chi-dvm-funcs (#646)
clang correctly found that the functions `inCache`, `hasBeenPrefetched`
and `inMissQueue` had the wrong signatures in the DVM funcs files. These
functions are unused, so this change just updates their signatures.

Change-Id: Id669ff661e1c6c46eaf04ea1f17cd9866a9e49ed
2023-12-03 13:39:26 -08:00
Bobby R. Bruce
c718e94753 stdlib: Add comment to ShadowResource (#645)
This comment explains that this solution is a hack and that the solution
proposed in https://github.com/gem5/gem5/issues/644 should eventually replace it.
2023-12-03 13:38:59 -08:00
Harshil Patel
bad569a3f8 misc: update x86-npb-benchmarks.py to use suites (#587)
- updated the x86-npb-benchmarks.py to use npb workloads and suites.

The suites and workloads are not yet in the database and are awaiting
feedback. I am attaching the JSON file here.

[npb_workloads_suite.json](https://github.com/gem5/gem5/files/13431116/npb_workloads_suite.json)

To run the x86-npb-benchmarks.py script use the
GEM5_RESOURCE_JSON_APPEND env variable. The full command is:
```
GEM5_RESOURCE_JSON_APPEND=[path to npb_workloads_suite.json] ./build/X86/gem5.opt configs/example/gem5_library/x86-npb-benchmarks.py --benchmark [benchmark]
```
Change-Id: I248e6452ea4122e9260e34e4368847660edae577
2023-12-03 13:23:46 -08:00
Harshil Patel
5eba3941f4 arch-riscv: fix o3 cpu stuck in spinlock bug (#641) 2023-12-03 13:22:46 -08:00
Hoa Nguyen
7a5052b3a0 arch-arm: Only build ArmCapstoneDisassembler when ISA is arm (#553)
Currently, if the Capstone header file is found in the host system,
scons will try to build the ArmCapstoneDisassembler regardless of the
gem5 target ISA. This causes problems when the host has Capstone but
the gem5 target ISA is not arm. Compiling gem5 in this case will cause
errors, e.g., ArmISA and ArmSystem are not found.

This change aims to prevent building the ArmCapstoneDisassembler when
the gem5 target ISA is not arm.

Ref:
[1] The Arm Capstone PR https://github.com/gem5/gem5/pull/494

Change-Id: I1e714d34aec8fe2a2af8cd351536951053a4d8a5
2023-12-03 13:22:11 -08:00
Harshil Patel
88c57e22de misc: update gapbs example to use suites (#607) 2023-12-03 13:21:37 -08:00
Bobby R. Bruce
21919addca Fix for gem5 Issue #550 (#636)
This Pull-Request addresses gem5 Issue #550. The code that dumps the
Dmesg buffer is now templated on the two variants of the `Metadata`
structure, and the correct one is chosen based on the detected Kernel
version.

To support this functionality, the pull request also adds Symbol Size
data to the loader Symbol Table, and adds a method to query the Kernel
Version from the image in guest memory. The new attributes in the Symbol
class are de-serialized speculatively, so no checkpoint upgrader is
required to support this change.
2023-12-01 18:06:20 -08:00
Richard Cooper
d9c870f641 sim: Rework the Linux Kernel exit events (#639)
This patch reworks the Linux Kernel panic and oops events. The code has
been re-factored to provide re-usable events that can be applied to all
ISAs from the base `KernelWorkload` `SimObject`. At the moment they are
installed for the Arm workloads.

This update also provides more configuration options that can be
specified using the new `KernelPanicOopsBehaviour` enum. The options are
applied to the Kernel Workload parameters `on_panic` and `on_oops` which
are available to all subclasses of `KernelWorkload`.

The main rationale for this reworking is to add the option to cleanly
exit the simulation after dumping the Dmesg buffer. Without this option,
the simulation would continue running after a Kernel panic. If system
components (e.g. a system timer) keep the event queue alive, this causes
the simulation to run slowly to the maximum allowed tick.
2023-12-01 17:33:59 -08:00
Bobby R. Bruce
461af51575 misc: Update .github dir in stable from develop (#643)
Change-Id: I4609e260ab594f1ca8b5ed62ea4c5707f7669411
2023-12-01 17:30:03 -08:00
Jason Lowe-Power
ecb72b74f8 misc: Add gem5_build/ to .gitignore (#642)
The website's documentation on building gem5 now references
<gem5_base>/gem5_build in addition to <gem5_base>/build. So, we should
ignore files in that directory as well as the build directory.

Change-Id: Ie226545e04b885ce81b3c17e18b5052ed64af328
2023-12-01 17:05:41 -08:00
Bobby R. Bruce
500a4221a0 stdlib: Mv resource download to get_local_path and add ShadowResource (#625)
This change decouples the downloading of a resource from its data.
With this change the `obtain_resource` function returns the
`AbstractResource` implementation which contains the data. The resource
itself (e.g., the actual disk image, binary, file, etc.) is only
downloaded to the host system, if not already present, upon the
`get_local_path` call.

`get_local_path` is the function used by gem5 to ultimately load the
resource into a simulation, therefore this change ensures we only
download resources when they are loaded into a simulation.

This change is not ideal and comes with the following caveats:

1. The `downloader` function is created in `obtain_workload` and passed
to the `AbstractResource` implementation for later use. This function
comes with the following requirements:
    * The function will download the resource to `local_path`.
    * The function will not re-download the resources if already present
as this function is called _every time_ `get_local_path` is called.
2. The directories needed to store `local_path` are created in
`obtain_workload` regardless. Ergo even if the resource is not used and
`get_local_path` is never called these directories are still created.
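
The deferred download described above can be sketched roughly as follows. This is a minimal illustration, not the stdlib code: `LazyResource`, `make_downloader`, and `demo` are hypothetical names, and the "download" is simulated by writing a local file.

```python
import os
import tempfile
from typing import Callable, List, Tuple


class LazyResource:
    """Hypothetical stand-in for an `AbstractResource` implementation."""

    def __init__(self, local_path: str, downloader: Callable[[], None]):
        self._local_path = local_path
        self._downloader = downloader

    def get_local_path(self) -> str:
        # The downloader runs on every call, so it must be idempotent.
        self._downloader()
        return self._local_path


def make_downloader(local_path: str, payload: bytes,
                    log: List[str]) -> Callable[[], None]:
    """Build a downloader that fetches `payload` to `local_path` once."""

    def downloader() -> None:
        if os.path.exists(local_path):
            return  # already present: do not download again
        log.append(local_path)  # record each real "download"
        with open(local_path, "wb") as handle:
            handle.write(payload)

    return downloader


def demo() -> Tuple[bool, int]:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "disk.img")
        log: List[str] = []
        resource = LazyResource(path, make_downloader(path, b"data", log))
        present_before = os.path.exists(path)  # nothing fetched yet
        resource.get_local_path()
        resource.get_local_path()
        return present_before, len(log)
```

In this sketch, constructing the resource touches nothing on disk, and calling `get_local_path` twice triggers only one simulated download.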


In keeping with this efficiency `ShadowResource` is introduced to allow
the storing of just the resource ID and Version of a resource with
additional information only obtained when requested.
2023-12-01 17:04:21 -08:00
Bobby R. Bruce
48f3cd1c0e stdlib: Integrate BootloaderKernelWorkload (#630)
This change does the following:

- Rename several Python parameters of the
RiscvBootloaderKernelWorkload to conform to the stdlib's expectations,
e.g., the kernel path must be `object_file`, and the
boot parameter must be `command_line`.
- Use RiscvBootloaderKernelWorkload by default for all full system
RISC-V simulations. RiscvBootloaderKernelWorkload is a superset of
RiscvFsWorkload.
2023-12-01 17:04:02 -08:00
Robert Hauser
84efeb976a systemc: Bugfix in TlmToGem5Bridge (#615)
In handleBeginReq, a timing request is sent. If the receiver rejects the
request, the bridge will save the pointers of the original transaction
object and the generated gem5 packet. After a recvReqRetry-signal and a
successful timing request, the variable for transaction object pointer,
but not for the gem5 packet, is set to nullptr. When a new transaction
with the phase BEGIN_REQ arrives, the assertion in handleBeginReq that
there is no pending gem5 packet fails.
Therefore, the variable pendingPacket has to be set to nullptr in
recvReqRetry after a successful timing request too.

Change-Id: I876f8f88e1893e8fdfa3441ed2ae5ddc39cef2ce

Co-authored-by: Robert Hauser <robert.hauser@uni-rostock.de>
2023-12-01 16:18:51 -08:00
Harshil Patel
9d108826b0 tests: fix artifact reference in HACC tests (#638)
Change-Id: I181f17a598885b59c186ff7e810d5b8b3b304e05
2023-12-01 15:19:17 -08:00
Matt Sinclair
bd2838d18e mem-ruby: update CacheMemory RubyCache debug prints (#637)
Update the RubyCache debug flag prints in CacheMemory to be more
descriptive and make clearer what is happening in a given function.
This makes it easier to determine what is happening when looking at the
RubyCache debug flags prints.

Change-Id: Ieee172b6df0d100f4b1e8fe4bba872fc9cf65854
2023-12-01 16:11:52 -06:00
Hoa Nguyen
39fd61d7dd misc: Fix precommit install (#634)
Previously, the `subprocess` module was used to execute the shell command
that installs the pre-commit hook. However, after #431 [1], the import of the
`subprocess` module was overridden by `asyncio.subprocess`, which has a
different API for executing shell commands. This change removes the
`asyncio.subprocess` import.
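
The name clash can be illustrated with a small check. It assumes the offending line had the form `from asyncio import subprocess`, which the message does not state explicitly; `describe_shadowing` is a hypothetical helper.

```python
import asyncio.subprocess
import subprocess as stdlib_subprocess


def describe_shadowing() -> dict:
    # `from asyncio import subprocess` would bind the name `subprocess`
    # to this module instead of the standard-library one.
    shadowing = asyncio.subprocess
    return {
        "stdlib_has_run": hasattr(stdlib_subprocess, "run"),
        "asyncio_has_run": hasattr(shadowing, "run"),
        "same_module": shadowing is stdlib_subprocess,
    }
```

Under that assumption, any later `subprocess.run(...)` call would raise `AttributeError`, since `asyncio.subprocess` only offers the coroutine-based `create_subprocess_exec` and `create_subprocess_shell`.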

[1] https://github.com/gem5/gem5/pull/431

Change-Id: I9a7d51f85518089d258ab57c5d849a36dcf128e9

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-12-01 14:03:59 -08:00
Richard Cooper
ccb8b30967 misc: Update Dmesg dump for changes to printk in Linux v5.18+.
Linux v5.18+ changed the format of one of the data structures used to
implement the printk ring buffer. This caused the gem5 feature to dump
the printk messages on Kernel panic to fail for Kernels 5.18.0 and
later.

This patch updates the printk messages dump feature to support the new
printk ring buffer format when Kernels 5.18 and later are detected.

This patch addresses gem5 Issue #550:
https://github.com/gem5/gem5/issues/550

Change-Id: I533d539eafb83c44eeb4b9fbbcdd9c172fd398e6
Reported-by: Hoa Nguyen <hn@hnpl.org>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2023-12-01 22:00:36 +00:00
Richard Cooper
f21df19fd7 misc: Add function to extract the Linux Kernel Version
This function, `extract_kernel_version`, attempts to find the Kernel
version by searching the `uts_namespace` struct at the exported symbol
`init_uts_ns`. This structure contains the Kernel version as a string,
and is referenced by `procfs` to return the Kernel version in a
running system.

Different versions of the Kernel use different layouts for this
structure, and the top level structure is marked as
`__randomize_layout`, so the exact layout in memory cannot be relied
on. Because of this, `extract_kernel_version` takes the approach of
searching the memory holding the structure for printable strings, and
parsing the version from those strings.
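
The scan-for-strings approach can be sketched as below. This is an illustration only: the real `extract_kernel_version` is part of gem5's C++ sources, and the function name, the minimum run length of 3, and the `major.minor.patch` pattern here are assumptions.

```python
import re
from typing import Optional, Tuple

# Runs of at least three printable ASCII bytes (assumed threshold).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{3,}")
# A string that begins with "major.minor.patch".
VERSION = re.compile(rb"^(\d+)\.(\d+)\.(\d+)")


def find_kernel_version(blob: bytes) -> Optional[Tuple[int, ...]]:
    """Return the first version-like string found in `blob`, if any."""
    for run in PRINTABLE_RUN.finditer(blob):
        match = VERSION.match(run.group())
        if match:
            return tuple(int(part) for part in match.groups())
    return None
```

Because the search never depends on field offsets, it is unaffected by how the structure's members are ordered in memory.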

Change-Id: If8b2e81a041af891fd6e56a87346a341df3c9728
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2023-12-01 22:00:36 +00:00
Richard Cooper
7ecff99c25 base: Add a size field to the Symbol object
Add a size field to the Symbol objects in the Symbol Table. This
allows client code to read the data associated with a symbol in cases
where the data type/size is not known beforehand (e.g. if an object's
size might be different between different versions of a
workload/kernel).

Access is mediated via the `sizeOrDefault()` method, which requires
client code to specify a fallback size. Since correct size data may
not be available (for example in legacy checkpoints), this forces the
client code to consider the 'missing size data' case.
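
The accessor pattern might look like this in outline. The sketch uses Python for brevity although the real class is C++; the field names and `size_or_default` are illustrative stand-ins for the `sizeOrDefault()` method mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Symbol:
    name: str
    address: int
    size: Optional[int] = None  # None models "no size data available"

    def size_or_default(self, default: int) -> int:
        # The caller must always supply a fallback, so the missing-size
        # case cannot be forgotten.
        return self.size if self.size is not None else default
```

A symbol restored from a legacy checkpoint would then fall back to whatever size the caller passes in.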

Change-Id: If1a47463790b25beadf94f84382e3b7392ab2f04
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2023-12-01 22:00:35 +00:00
Richard Cooper
2fbbdad618 base: Add encapsulation to the loader::Symbol class
This commit converts `gem5::loader::Symbol` to a full class with
private members, enforcing encapsulation. Until now client code has
been able to (and does) access members directly.

This change will enable class invariants to be enforced via accessor
methods.

Change-Id: Ia0b5b080d4f656637a211808e13dce1ddca74541
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2023-12-01 22:00:26 +00:00
Bobby R. Bruce
88601d3ac5 stdlib: Add ShadowResource
A `ShadowResource` is a resource which only contains the
ID and Version information, not any additional information about the
resource, thus avoiding the `obtain_resource` call.

When attributes of the `ShadowResource` are accessed which can only be
obtained via `obtain_resource`, the `ShadowResource` calls the function and
returns what is required.

This is useful for `Suite` resources which contain several workloads
and resources which may not all be needed when the `Suite` object is
first instantiated.
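
A rough sketch of this lazy-proxy idea follows. It is not the stdlib class: the constructor signature, the `obtain` callback, and `demo` are hypothetical, and the real implementation may resolve attributes differently.

```python
from types import SimpleNamespace
from typing import Any, Callable, List, Tuple


class ShadowResource:
    """Holds only an ID and version until more information is needed."""

    def __init__(self, resource_id: str, resource_version: str,
                 obtain: Callable[[str, str], Any]):
        self.resource_id = resource_id
        self.resource_version = resource_version
        self._obtain = obtain
        self._resolved = None

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes not stored on the shadow itself.
        if name.startswith("_"):
            raise AttributeError(name)
        if self._resolved is None:
            self._resolved = self._obtain(self.resource_id,
                                          self.resource_version)
        return getattr(self._resolved, name)


def demo() -> Tuple[int, int, str]:
    calls: List[Tuple[str, str]] = []

    def fake_obtain(resource_id: str, resource_version: str) -> Any:
        calls.append((resource_id, resource_version))
        return SimpleNamespace(description="a disk image")

    shadow = ShadowResource("example-img", "1.0.0", fake_obtain)
    _ = shadow.resource_id          # no lookup needed
    calls_before = len(calls)
    description = shadow.description  # triggers the one real lookup
    _ = shadow.description            # served from the cached result
    return calls_before, len(calls), description
```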

Change-Id: Icc56261b2c4d74e4079ee66486ddae677bb35cfa
2023-12-01 13:38:08 -08:00
anoop
fc0a043950 mem-ruby: Unused L3CacheCntrl freed (#598)
Seems like the MOESI_AMD_Base-L3Cache.sm file is unused in the VIPER
protocol. It's confusing to have it in the GPU_VIPER.slicc file.
2023-12-01 13:01:19 -08:00
Ivana Mitrovic
d96b6cdae7 misc, stdlib: Update documentation to adhere to RST formatting. (#631)
This PR updates files in `src/python` to adhere to reStructuredText
formatting.
2023-12-01 11:43:49 -08:00
Matt Sinclair
0a2f9d4b18 mem-ruby: update CacheMemory RubyCache debug prints
Update the RubyCache debug flag prints in CacheMemory to be more
descriptive and make clearer what is happening in a given function.
This makes it easier to determine what is happening when looking at the
RubyCache debug flags prints.

Change-Id: Ieee172b6df0d100f4b1e8fe4bba872fc9cf65854
2023-12-01 12:31:52 -06:00
Hoa Nguyen
be3163a072 stdlib: Integrate BootloaderKernelWorkload
Change-Id: Ifeaa98059d5667c3335eaccd57a5295f44f88e43
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-12-01 07:28:30 +00:00
Hoa Nguyen
bbe5216d88 arch-riscv: Rename BootloaderKernelWorkload parameters
The gem5 standard library hardcoded some parameters of the workload.
E.g., the kernel filename must be `object_file`.

Change-Id: I5eeb7359be399138693eaba0738eaf524c59408f
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-12-01 07:28:30 +00:00
Bobby R. Bruce
bfd25f5352 util-docker: Enforce cmake version >=3.24 for DRAMSys (#627)
DRAMSys requires cmake 3.24 or greater. By default neither Ubuntu 22.04
nor 20.04 delivers this via APT.

In both cases wget is required. In 20.04 OpenSSL is required.

The ext/dramsys README file has been updated to state this requirement.
2023-11-30 21:23:53 -08:00
Bobby R. Bruce
743b2aada6 stdlib: Move resource download to get_local_path
This change decouples the downloading of a resource from its data.
With this change the `obtain_resource` function returns the
`AbstractResource` implementation which contains the data. The resource
itself (e.g., the actual disk image, binary, file, etc.) is only
downloaded to the host system, if not already present, upon the
`get_local_path` call.

`get_local_path` is the function used by gem5 to ultimately load the
resource into a simulation, therefore this change ensures we only
download resources when they are loaded into a simulation.

This change is not ideal and comes with the following caveats:

1. The `downloader` function is created in `obtain_workload` and passed
to the `AbstractResource` implementation for later use. This function
comes with the following requirements:
    * The function will download the resource to `local_path`.
    * The function will not re-download the resources if already present
as this function is called _every time_ `get_local_path` is called.
2. The directories needed to store `local_path` are created in
`obtain_workload` regardless. Ergo even if the resource is not used and
`get_local_path` is never called these directories are still created.

Change-Id: I3f0e9a0099cba946630d719c3d17b7da0bccf74a
2023-11-30 15:27:44 -08:00
Jason Lowe-Power
62a2b6eed2 ext: Update readme for DRAMSys
Specify the cmake version

Change-Id: I8bbdb128667df37724c38caef5572d8fb1641ef5
2023-11-30 15:09:28 -08:00
Jason Lowe-Power
b3e7af9d79 Support for classic prefetchers in Ruby (#502)
This patch adds support for using the "classic" prefetchers with Ruby
cache controllers.

This pull request includes a few commits making the changes in this
order:
- Refactor that decouples the classic cache and prefetcher interfaces
- Extra probes for later integration with Ruby
- General ruby-side support
- Adds support for the CHI protocol

Commit [mem-ruby: support prefetcher in CHI protocol](2bdb65653b)
may be used as an example of how to add support for other protocols.

JIRA issues that may be related to this pull request:
    https://gem5.atlassian.net/browse/GEM5-457
    https://gem5.atlassian.net/browse/GEM5-1112
2023-11-30 10:24:29 -08:00
Yu-Cheng Chang
a16fd8a592 scons: Limit adding fastmodel files and libpath (#629)
This change adds the include and library paths only when the fastmodel
is required for the build. This benefits most gem5 builds.

Change-Id: I98c20bd1470b7227940036199e02bc001e307eac
2023-11-30 07:36:26 -08:00
Jason Lowe-Power
9afe9932bc sim,python: Restore sigint handler in python (#531)
Currently, if you try to use ctrl-c while python code is running nothing
happens. This is not ideal. This change enables users to use ctrl-c
while python is running (e.g., when a large disk image is downloading).
To do this, we moved the `initSignals` function in gem5 from `main` to
the simulate loop. Thus, every time the simulate loop starts (i.e., is
called from python) gem5 will install its signal handlers. Also, when
the control is returned to python, we put python's default SIGINT
handler back.

Change-Id: I14490e48d931eb316e8c641217bf8d8ddaa340ed
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-11-30 07:27:52 -08:00
Andreas Sandberg
dcdebec0f6 misc,python: Add isort hook to pre-commit (#431) 2023-11-30 09:54:12 +00:00
Bobby R. Bruce
d11c40dcac misc: Run pre-commit run --all-files
This ensures `isort` is applied to all files in the repo.

Change-Id: Ib7ced1c924ef1639542bf0d1a01c5737f6ba43e9
2023-11-29 22:06:41 -08:00
Bobby R. Bruce
7d67109ca2 python,misc: Add isort to pre-commit
Change-Id: I391a8948f0817bd5c6a9fe8a4c3e4fed07a98c49
2023-11-29 22:06:05 -08:00
Bobby R. Bruce
f256064b4a util-docker: Enforce cmake version >=3.24 for DRAMSys
DRAMSys requires cmake 3.24 or greater. By default neither Ubuntu 22.04
nor 20.04 delivers this via APT.

In both cases wget is required. In 20.04 OpenSSL is required.

Change-Id: I51a7f8a8a46e8cf1908a120adb9289aa3907ccda
2023-11-29 21:49:11 -08:00
Bobby R. Bruce
b99af93183 misc: Merge .github directory from develop to stable (#626) 2023-11-29 19:04:13 -08:00
Bobby R. Bruce
403bf38a0e tests: switch lulesh/hacc to use vega_x86 (#620) 2023-11-29 18:55:53 -08:00
Harshil Patel
b59a398312 tests: change HACC tests to VEGA_X86
Change-Id: I846229db1ab1480d79471c717b714698c3132df9
2023-11-29 16:02:30 -08:00
Harshil Patel
392086b43d stdlib, resources: removed deprecated if statement in obtain_resource for workload resources (#611)
- The resources field in a workload has changed from a string containing
just the ID to a dict of ID and version.
An if statement was added in develop to support both formats.
This change removes that if statement so that 23.1 supports only the new format.

Change-Id: Id8dc3f932f53a156e4fb609a215db7d85bd81a44
2023-11-29 14:27:23 -08:00
Bobby R. Bruce
fcbcd1ce72 arch-x86: Fixes page fault for CLFLUSH on write-protected pages (#592)
Converts CLFLUSHOPT/WB/FLUSH operations from Write to Read operations
during address translation so that they don't trigger a page fault when
done on write-protected pages.

Solves #226
2023-11-29 14:25:21 -08:00
Hoa Nguyen
1d1cba297b scons: Add an option to reduce memory usage of ld (#601)
Linking the gem5 binary consists of linking numerous object files. By
default, ld optimizes for speed over memory consumption [1]. This
leads to the huge memory consumption of gem5 at the linking stage.

This patch adds an option to add the `--no-keep-memory` flag to ld.
According to the documentation [1], this flag optimizes the memory usage
of ld.

[1]
https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_chapter/ld_2.html#IDX133

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-11-29 14:24:48 -08:00
Yu-Cheng Chang
57ba3fccb7 scons: Move CPPPATH systemc_home to "src/systemc" folder (#617)
Files under src/systemc require the include path of systemc_home

Change-Id: Ibcbac2762259a0b997ac444b2c63a218c27af9ee
2023-11-29 13:56:23 -08:00
Bobby R. Bruce
a2e7bd4698 arch-riscv: Support combination of privilege modes configuration (#522)
The user can select which privilege modes are included in the system,
rather than always enabling the user and supervisor privilege modes.
2023-11-29 10:12:57 -08:00
Adrià Armejach
b0cefac9b2 arch-riscv: Fix narrow datatypes in RVV isa files (#606)
Some variables have narrow datatypes that overflow on large VLEN values.
For example, the maximum number of microops for LMUL=8 SEW=8 and
VLEN=64K is 2^16.
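
The arithmetic behind that figure can be checked quickly. The sketch assumes the count in question equals the number of SEW-sized elements in an LMUL-sized register group (VLEN * LMUL / SEW) and that the narrow type was a 16-bit unsigned integer; neither detail is spelled out in the message.

```python
UINT16_MAX = (1 << 16) - 1


def element_count(vlen_bits: int, lmul: int, sew_bits: int) -> int:
    """Number of SEW-bit elements in a group of LMUL vector registers."""
    return (vlen_bits * lmul) // sew_bits


def fits_in_uint16(value: int) -> bool:
    return 0 <= value <= UINT16_MAX


count = element_count(vlen_bits=64 * 1024, lmul=8, sew_bits=8)
# Prints: 65536 False 0  (the value wraps to 0 in a 16-bit counter)
print(count, fits_in_uint16(count), count & UINT16_MAX)
```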

Change-Id: I5cce759f040884e09ce83bee7e54a62c4b42c5aa

Co-authored-by: Adrià Armejach <adria.armejach@bsc.es>
2023-11-29 10:11:06 -08:00
Adrià Armejach
eb13b32314 cpu-o3: Fix discarded requests str-ld forwarding (#614)
With the use of large RVV vectors (i.e., 8K or 16K bits) and a limited
number of cacheLoadPorts, some loads take multiple cycles to execute.
This triggered certain conditions when store-to-load forwarding happens
in the middle of the execution of a load that already has outstanding
packets.

First, after store-to-load forwarding the request is marked as discarded
and the load is immediately written back, which triggers a writebackDone
that tries to delete the request, triggering an assert as it still has
outstanding packets. This patch avoids deleting the request, leaving it
self-owned; it will be deleted when the last packet arrives in
packetReplied.

Second, this patch avoids checking snoops on discarded requests by
checking if the request exists.

Change-Id: Icea0add0327929d3a6af7e6dd0af9945cb0d0970

Co-authored-by: Adrià Armejach <adria.armejach@bsc.es>
2023-11-29 08:45:03 -08:00
Harshil Patel
089b82b2e9 arch-riscv: fix tlb bug (#610)
- one tlb miss was getting counted twice by the lookup function.

Change-Id: I5fee08bd6e936896704e7dbbd242720b8d23b547
2023-11-29 08:39:02 -08:00
Harshil Patel
23cadf0886 tests: switch lulesh to use vega_x86
Change-Id: Ifbf0fdfd7d8c2bbaad0b6094090acecd1cb8055c
2023-11-29 07:41:51 -08:00
Tiago Mück
0f8c60bce5 mem-ruby: add missing state for CHI Prefetch event
RUSC state is missing for the Prefetch event.

Change-Id: If440ac0052100dba295708471a75a24cd234c011
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-11-28 18:30:50 -06:00
Tiago Mück
91cf58871e mem-ruby: support prefetcher in CHI protocol
Use RubyPrefetcherProxy to support prefetchers in the
CHI cache controller

L1I/L1D/L2 prefetchers can now be added by specifying a non-null
prefetcher type when configuring a CHI_RNF.

Related JIRA:
https://gem5.atlassian.net/browse/GEM5-457
https://gem5.atlassian.net/browse/GEM5-1112

Additional authors:
    Tuan Ta <tuan.ta2@arm.com>

Change-Id: I41dc637969acaab058b22a8c9c3931fa137eeace
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-11-28 18:30:50 -06:00
Tiago Mück
94d5cc17a2 mem-ruby,mem-cache: ruby supports classic pfs
This patch adds RubyPrefetcherProxy, which provides means to inject
requests generated by the "classic" prefetchers into a SLICC prefetch
queue. It defines notifyPf* functions to be used by protocols
to notify a prefetcher. It also includes the probes required to
interface with the classic implementation.
AbstractController defines the accessor needed to snoop the caches.

A followup patch will add support for RubyPrefetcherProxy in the
CHI protocol.

Related JIRA:
https://gem5.atlassian.net/browse/GEM5-457
https://gem5.atlassian.net/browse/GEM5-1112

Additional authors:
    Tuan Ta <tuan.ta2@arm.com>

Change-Id: Ie908150b510f951cdd6fd0fd9c95d9760ff70fb0
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-11-28 18:30:50 -06:00
Tiago Mück
3a7192d682 mem-cache: change hasBeenPrefetched
hasBeenPrefetched can now take a requestor id and returns true only if
the block was prefetched by a prefetcher with the same id. This may be
necessary to properly train multiple prefetchers attached to the same
cache. It returns true if the block was prefetched by any prefetcher
when the id is not provided.
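
The described semantics can be sketched as follows. This is a Python illustration of behaviour implemented in C++; `CacheBlock`, `mark_prefetched`, and the use of `None` for "id not provided" are assumptions.

```python
from typing import Optional


class CacheBlock:
    def __init__(self) -> None:
        self._prefetched_by: Optional[int] = None

    def mark_prefetched(self, requestor_id: int) -> None:
        self._prefetched_by = requestor_id

    def has_been_prefetched(self,
                            requestor_id: Optional[int] = None) -> bool:
        if self._prefetched_by is None:
            return False  # block was never prefetched
        if requestor_id is None:
            return True   # any prefetcher counts
        return self._prefetched_by == requestor_id
```

With two prefetchers on one cache, each can then train only on blocks it brought in itself.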

Related JIRA:
https://gem5.atlassian.net/browse/GEM5-457
https://gem5.atlassian.net/browse/GEM5-1112

Change-Id: I205e000fd5ff100e5a5d24d88bca7c6a46689ab2
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-11-28 18:30:49 -06:00
Tiago Mück
a63ff3c442 mem-cache: add prefetcher listener for evictions
Listener to data update probe notifies prefetcher of evictions.
Prefetchers need to implement notifyEvict to make use of this
information.

Related JIRA:
https://gem5.atlassian.net/browse/GEM5-457
https://gem5.atlassian.net/browse/GEM5-1112

Change-Id: I052cfdeba1e40ede077554ada104522f6a0cb2c7
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-11-28 18:30:49 -06:00
Tiago Mück
d8a04f902e mem-cache: add prefetch info to update probe
CacheDataUpdateProbeArg has additional info to tell listeners if the
block was prefetched and evicted without being used, as well as which
object prefetched the block.

Related JIRA:
https://gem5.atlassian.net/browse/GEM5-457
https://gem5.atlassian.net/browse/GEM5-1112

Change-Id: Id8ac9099ddbce6e94ee775655da23de5df25cf0f
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-11-28 18:30:49 -06:00
Tiago Mück
becba00d95 mem-cache,configs: remove extra prefetch_* params
Remove the prefetch_on_access and prefetch_on_pf_hit params from BaseCache.
BasePrefetch no longer expects these params to exist in the parent.

Configurations that set these parameters using the cache object were
fixed.

Change-Id: I9ab6a545eaf930ee41ebda74e2b6b8bad0ca35a7
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-11-28 18:30:49 -06:00
Tiago Mück
af2ee0db30 mem-cache: decoupled prefetchers from cache
This patch decouples the prefetchers from the cache implementation
as the first step to allow using the classic prefetchers with ruby
caches. The prefetchers that need to do cache lookups can do so using
the accessor object provided when the probes are notified. This may
also facilitate connecting the same prefetcher to multiple caches.

Related JIRA:
https://gem5.atlassian.net/browse/GEM5-457
https://gem5.atlassian.net/browse/GEM5-1112

Change-Id: I4fee1a3613ae009fabf45d7b747e4582cad315ef
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-11-28 18:30:49 -06:00
Jason Lowe-Power
3fe5e58f28 arch-x86: Fix misc registers in mov instructions (#593)
MOV instructions 8C and 8E can be prefixed with a REX prefix to extend
the source/destination register.
However, the R bit in REX will be applied to the segment register.  
The decoder file checks for valid segment registers, checking the
MODRM_REG only, however, later this will be extended with the REX_R when
adding the register to the sources/destinations of the instruction.
This will trigger an assert.

Additionally, MOV instructions of various miscellaneous registers are
also not checked for validity when taking into account the REX_R bit.

This patch checks that the REX_R is not set, otherwise, UD2 will be
generated.
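
The decode rule can be sketched like this. It is a simplified model rather than the gem5 decoder: the function names are hypothetical, and other architectural restrictions on these opcodes (such as loading CS) are ignored.

```python
# Standard x86 segment register encodings 0..5.
SEGMENT_REGS = ("ES", "CS", "SS", "DS", "FS", "GS")


def extended_reg_index(modrm_reg: int, rex_r: int) -> int:
    """Index produced when REX.R extends the 3-bit ModRM.reg field."""
    return (rex_r << 3) | modrm_reg


def decode_segment_reg(modrm_reg: int, rex_r: int) -> str:
    # Checking ModRM.reg alone is not enough: with REX.R set the
    # extended index falls outside the segment register range.
    if rex_r or modrm_reg >= len(SEGMENT_REGS):
        return "UD2"
    return SEGMENT_REGS[modrm_reg]
```

For example, ModRM.reg = 3 names DS on its own, but with REX.R set the extended index becomes 11, which is not a segment register.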
2023-11-28 11:14:53 -08:00
Andreas Sandberg
0c30353c59 cpu: Require BTB hit to detect branches. (#493)
In a high performance CPU there is no other way than a BTB hit
to know about a branch instruction and its type. For low-end CPUs,
pre-decoding might sit in front of the BPU to provide this information.
Currently, the BPU models only low-end behavior and updates the
RAS and the indirect branch predictor even without a BTB hit.
This patch adds three things to model the correct behavior for high-end
CPUs.
1. A check before the RAS and indirect predictor whether there was
a BTB hit or not. Only on BTB hits will the BPU consult the RAS and
indirect predictor.
2. Since this check requires a BTB hit for indirect branches, they must
also be installed into the BTB. For returns this was already done.
3. Finally, the BTB update previously happened at squash (decode
or commit). Since this can be out-of-order, branches from
the false path can get installed without ever being retired.
2023-11-28 09:39:14 +00:00
Roger Chang
9a0c671cce arch-riscv: Handle the exception following the privilege mode set
Change-Id: I4867941ec286fe485e01db848b8c7357488f6cf4
2023-11-28 09:26:27 +08:00
Roger Chang
d56801c240 arch-riscv: Add misa rvs check for memory translation
Memory translation requires supervisor mode to be implemented. If
supervisor mode is not implemented, the satp CSR does not exist and
address translation should not be performed.

Change-Id: Ie6c8a1a130d0aab0647b35e0f731f6b930834176
2023-11-28 09:26:27 +08:00
Roger Chang
6fd4feb797 arch-riscv: fatal_if the process runs without SU modes
Change-Id: Ifce7eec6cea10881964c29d206a92f3d10271de6
2023-11-28 09:26:27 +08:00
Roger Chang
9e738a65ea arch-riscv: Add isaExts field for CSR registers
Change-Id: Idd94af57f3a721d455ea7fb9d335fab7b16a0f7e
2023-11-28 09:26:27 +08:00
Roger Chang
0e4f82a119 arch-riscv: define the CSR masks for each privilege modes
Change-Id: I9936d9bc816921a827b94550847d4898b3aa3292
2023-11-28 09:26:27 +08:00
Roger Chang
f745e8cf89 arch-riscv: Initial the privilege modes configuration
1. Declare the new enum type PrivilegeModes
2. Disallow setting the MISA register RVU and RVS.

Change-Id: I932d714bc70c9720a706353c557a5be76c950f81
2023-11-28 09:26:27 +08:00
aditya
c00209a0c0 Merge branch 'x86-clflush-fault-fix' of github.com:AKKamath/gem5 into x86-clflush-fault-fix
Change-Id: Ia9a694b0f7ef99331430cdb796c252ce8bb51f3b
2023-11-28 00:42:27 +00:00
Aditya K Kamath
9a0566e295 arch-x86: Fixes page fault for CLFLUSH on write-protected pages
Converts CLFLUSHOPT/WB/FLUSH operations from Write to Read operations
during address translation so that they don't trigger a page fault
when done on write-protected pages.

Change-Id: I20e89cc0cb2b288b36ba1f0ba39a2e1bf0f728af
2023-11-28 00:42:17 +00:00
Bobby R. Bruce
3bf0b1d22a misc: Merge develop .github dir to stable (#608) 2023-11-27 14:32:51 -08:00
Bobby R. Bruce
d94d6017b0 scons: Change to Kconfig build system (#69)
The PR contains the following changes:
- Move all of the config options (`env["CONF"]`) from SConsopt to Kconfig
files
- Update `build_opts` files to Kconfig option formats
- The Ruby Protocol files are only built if `RUBY=y`
- Remove the default-default build target
- Kconfig commands are included in the PR:
    - defconfig
    - setconfig
    - menuconfig
    - guiconfig
    - listnewconfig
    - savedefconfig
    - oldconfig
    - olddefconfig
- Add the `python3-tk` package dependency
 
Jira issue: https://gem5.atlassian.net/browse/GEM5-1211
2023-11-27 13:59:18 -08:00
Matthew Poremba
9e6a87e67a dev-amdgpu: Writeback PM4 queue rptr when empty (#597)
The GPU device keeps a local copy of each ring buffer's read pointer
(rptr) to avoid constant DMAs to/from host memory. This means it needs
to be periodically updated on the host side as the driver uses this to
determine how much space is left in the queue and may hang if it believes
the queue is full. For user-mode queues, this already happens when
queues are unmapped. For kernel mode queues (e.g., HIQ, KIQ) the rptr is
never updated leading to a hang.

In this patch the rptr for *all* queues is reported back to the kernel
whenever the queue reaches an empty state (rptr == wptr). Additionally,
to handle PM4 queue wrap-around, the queue processing function checks if
the queue is not empty instead of rptr < wptr. This is safe because the
driver fills PM4 queues with NOP packets on initialization and when wrap
around occurs.
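
The effect of the changed loop condition can be seen in a toy ring buffer. The sketch assumes one slot is consumed per step and a fixed ring size; the function names are hypothetical and the real code processes PM4 packets of varying length.

```python
from typing import Callable, Tuple


def rptr_below_wptr(rptr: int, wptr: int) -> bool:
    return rptr < wptr   # old condition: breaks once wptr has wrapped


def queue_not_empty(rptr: int, wptr: int) -> bool:
    return rptr != wptr  # new condition: empty only when pointers meet


def drain(rptr: int, wptr: int, size: int,
          has_work: Callable[[int, int], bool]) -> Tuple[int, int]:
    """Consume slots until `has_work` is false; return (rptr, consumed)."""
    consumed = 0
    while has_work(rptr, wptr):
        rptr = (rptr + 1) % size  # advance and wrap around the ring
        consumed += 1
    return rptr, consumed
```

With rptr = 6 and wptr = 2 in an 8-slot ring, the old condition consumes nothing, while the new one drains four slots and ends with rptr == wptr, the point at which the read pointer is reported back.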

Change-Id: Ie13a4354f82999208a75bb1eaec70513039ff30f
2023-11-27 11:02:11 -08:00
Bobby R. Bruce
d4b7c8a26d Merge branch 'develop' into develop-kconfig 2023-11-27 09:39:08 -08:00
Bobby R. Bruce
0f6eabe8c9 ext,github,tests: Update DRAMSys tests to v5.0 and handle new dependencies (#577)
#525 updated DRAMSys to v5.0. This PR further improves the v5.0
incorporation into gem5 by better managing its new dependencies and
updating the DRAMSys tests to use v5.0.

This PR:

1. Adds a check which throws a warning if DRAMSys cannot be built due to a
missing `cmake` instead of failing with a build error. `cmake` is not a
hard gem5 requirement. It is only needed to build DRAMSys when DRAMSys
is wanted. It is therefore prudent not to fail a build when
`cmake` is not present on the host system.
2. Updates the "all-dependency" Docker images to include the optional
dependencies `git-lfs` (needed to clone the DRAMSys repo when running
the command outlined in ext/dramsys/README -- introduced in #525) and
`cmake` (needed to build DRAMSys).
3. Updates the Weekly workflow's `dramsys-tests`' `Checkout DRAMSys` job
to clone DRAMSys in the same manner as outlined in ext/dramsys/README.
This ensures the `dramsys-tests` test the instructions we give users.
4. `.gitignore` is added to ext/dramsys to ignore the
ext/dramsys/DRAMSys directory when cloned for building and integration
into gem5.

(2.) Should fix our failing weekly tests:
https://github.com/gem5/gem5/actions/runs/6912511984/job/18808339821 and
(3.) will ensure the changes introduced in #525 are tested.
2023-11-27 09:37:11 -08:00
Harshil Patel
1de992bc75 tests: fix lulesh (#600)
- fixed the broken command that was causing lulesh to fail the run
Change-Id: I4e8a310f153d86deb8829f41b5ddd0c317df23cb
2023-11-27 07:42:59 -08:00
Matthew Poremba
cc9f81b08a arch-vega,arch-gcn3: Bugfix V_PERM_B32 and V_OR3_B32 (#599)
The V_PERM_B32 instruction selects the correct byte, but shifts it
into place by bits instead of bytes. The V_OR3_B32
instruction is calling the wrong instruction implementation in the
decoder.

This patch fixes both issues plus a bonus fix for GCN3's V_PERM_B32.
(GCN3 does not have V_OR3_B32).
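
The bits-versus-bytes mistake can be illustrated with a simplified byte permute. This is a sketch, not the gem5 implementation: it assumes the 64-bit source is {S0, S1} with S1 in the low half, handles only selector values 0 to 7 (the special selector values are omitted), and uses a `buggy` flag purely to reproduce the kind of shift error described.

```python
def select_byte(combined: int, sel: int) -> int:
    """Pick byte `sel` (0..7) from a 64-bit value."""
    if not 0 <= sel <= 7:
        raise ValueError("special selector values are not modelled here")
    return (combined >> (8 * sel)) & 0xFF


def perm_b32(s0: int, s1: int, selector: int, buggy: bool = False) -> int:
    combined = (s0 << 32) | s1  # assumed layout: S0 high, S1 low
    result = 0
    for i in range(4):
        sel = (selector >> (8 * i)) & 0xFF
        byte = select_byte(combined, sel)
        # Correct: byte i lands at bit offset 8 * i.
        # Buggy variant: shifting by i bits piles the bytes on top of
        # each other in the low bits.
        shift = i if buggy else 8 * i
        result |= byte << shift
    return result & 0xFFFFFFFF
```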

Change-Id: Ied66c43981bc4236f680db42a9868f760becc284
2023-11-26 23:22:01 -08:00
Bobby R. Bruce
0b2c56ef66 mem-cache: Revert "Prefetchers Improvements" (#581)
Reverts gem5/gem5#564 to fix #580.

Discussion in #581 showed there may be a fix to this but reverting for now until 
a better solution is found.
2023-11-26 18:43:21 -08:00
Bobby R. Bruce
ab1d5dc3a0 arch-arm: Fix Virtual Interrupt logic in secure mode (#584)
This PR fixes remaining issues in the ArmISA::Interrupt class; more
specifically, it enables virtual interrupts in secure mode (when
FEAT_SEL2 is present). The previous version assumed no virtual
interrupt was possible in secure mode. We fix this assumption by
replacing the security check with the EL2Enabled helper, which closely
matches the Arm pseudocode.
Bobby R. Bruce
36e83943b5 tests,misc: Update DRAMSys test clone command
This clone command is updated to reflect the new advice given in
ext/dramsys/README that was introduced in PR
https://github.com/gem5/gem5/pull/525 to upgrade DRAMSys to v5.0.

Change-Id: I868619ecc1a44298dd3885e5719979bdaa24e9c2
2023-11-26 17:10:40 -08:00
Bobby R. Bruce
8f9a328652 util-docker: Add 'cmake' to all-deps
'cmake' is required to build DRAMSys.

This is an optional dependency, needed only for compiling DRAMSys. It is
therefore not required. It is included in the "all-dependencies" Docker
images as it may be needed if DRAMSys is desired.

Change-Id: I1a3e1a6fa2da4d0116d423e9267d4d3095000d4e
2023-11-26 17:10:40 -08:00
Bobby R. Bruce
575114b63b ext: Add .gitignore to ext/dramsys
Change-Id: Ifc1a3c77b56cbe5777d041a88b2c0d5cb77eaf89
2023-11-26 17:10:40 -08:00
Bobby R. Bruce
cb61d01ede ext: Add 'cmake' dep check to DRAMSys install
CMake is not required to build gem5. It is only required to build
and link the optional DRAMSys library. Therefore, if the DRAMSys repo
has been cloned but CMake is not present this patch ensures no attempt
at building or linking DRAMSys is made. A warning is thrown to
inform the user of the missing CMake.
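
A guard of this kind could look roughly like the following. It is a sketch under assumptions, not the actual SCons script: `should_build_dramsys` and its parameters are hypothetical, and the lookup is injectable only so the logic can be exercised without depending on the host.

```python
import shutil
import warnings
from typing import Callable, Optional


def should_build_dramsys(
    repo_cloned: bool,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> bool:
    """Decide whether to attempt building and linking DRAMSys."""
    if not repo_cloned:
        return False  # DRAMSys not requested at all
    if which("cmake") is None:
        # Warn and skip instead of failing the whole gem5 build.
        warnings.warn("DRAMSys was cloned but CMake was not found; "
                      "skipping the DRAMSys build.")
        return False
    return True
```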

Change-Id: I4d22e3a16655fd90f6b109b4e75859628f7d532d
2023-11-26 17:10:40 -08:00
Nitesh Narayana
35ccd7f907 arch-arm: This commit adds the mla/s indexed versions
This includes the isa and instruction implementations
of mla and mls indexed versions from ARM SVE2 ISA spec.

Change-Id: I4fbd0382f23d8611e46411f74dc991f5a211a313
2023-11-24 15:20:30 +01:00
Eduardo José Gómez Hernández
670bf6a488 arch-x86: Check REX_R for MOV misc registers
Change-Id: I08ea37ffe695df500ea84cbddd94be246f916caf
2023-11-24 13:41:24 +01:00
Eduardo José Gómez Hernández
cea169f5e7 arch-x86: Fix segment registers in instructions 8C and 8E
MOV instructions 8C and 8E can be prefixed with a REX prefix to extend
the source/destination register. However, the R bit in REX will be
applied to the segment register.  The decoder file checks for valid
segment registers, checking the MODRM_REG only, however, later this
will be extended with the REX_R when adding the register to the
sources/destinations of the instruction.  This will trigger an assert.

This patch checks that the REX_R is not set, otherwise, UD2 will be
generated.

Change-Id: I78a93c35116232fe37e5ec50025e721b8c633c5f
2023-11-23 10:18:17 +01:00
Roger Chang
92670e9745 fastmodel: Simplify the logic of USE_ARM_FASTMODEL setting
Change-Id: Ib00cf83ca881727987050a987a2adb1e9f9d31ef
2023-11-23 14:15:28 +08:00
Roger Chang
412cf3e644 util: Update the gem5_within_systemc README
Change-Id: Ife34fe5ccd00fa2c6a83f34af49333d49017dfed
2023-11-23 08:46:27 +08:00
Roger Chang
23e4525e29 util: Update the tlm README
Change-Id: I4006257bf55d7065136347788783796fd39ab725
2023-11-23 08:41:37 +08:00
Roger Chang
4d632cb73f scons: Add new config option HAVE_CAPSTONE to Kconfig
The config option HAVE_CAPSTONE was added in a previous change [1] and
the Kconfig options should be kept in sync with it.

[1] https://github.com/gem5/gem5/pull/494

Change-Id: Id83718bc825f53d87d37d6ac930b96371209bdb3
2023-11-23 08:26:11 +08:00
Roger Chang
5828b1eb32 misc: Update daily-test.yaml to match Kconfig build system configuration
Change-Id: I20b04d006c67374e3226a91c03550f2731ed7fe7
2023-11-23 08:26:11 +08:00
Roger Chang
5b21233491 tests: Update Gem5Fixture to be compatible with the Kconfig system
This is needed now that the build system has changed to Kconfig.

Change-Id: I699b36f09691dc821da8ee80fe5b60f30fe2179c
2023-11-23 08:26:11 +08:00
Roger Chang
758f9d2ea1 util: Add python3-tk package to dockerfile
guiconfig requires the python3-tk package to run.

Change-Id: I1d126021c2c57448b1ceefb9fff256e2a6bbbf33
2023-11-23 08:26:11 +08:00
Roger Chang
7b35765217 scons: Refactor the USE_SYSTEMC option
Change-Id: I2f51081e0db932b83eea9dd395551afe13d54a34
2023-11-23 08:26:11 +08:00
Roger Chang
3b06925408 scons: Update Kconfig description
Change-Id: I69206fb9881bc0d53660bbd1cf8fc225ead9fea3
2023-11-23 08:26:11 +08:00
Roger Chang
d758df4b5c scons: Update the Kconfig build options
The CL updates the Kconfig:
1. Replace the USE_NULL_ISA with BUILD_ISA
2. The USE_XXX_ISA options depend on BUILD_ISA
3. If BUILD_ISA is set, at least one of the USE_XXX_ISA options must be set
4. Refactor the USE_KVM option

Change-Id: I2a600dea9fb671263b0191c46c5790ebbe91a7b8
2023-11-23 08:26:11 +08:00
Gabe Black
d37673be9f scons: Remove the default-default build target.
In gem5, there are many equally valid and equally useful top level
targets which the user might want. It no longer makes sense to
arbitrarily pick one to be the default target. It makes sense to force
the user to actually specify what they want, instead of assuming it
must be the ARM debug binary.

There is currently an M5_DEFAULT_BINARY environment variable which
will change what the default binary is, if set. This change leaves
that in place, but removes the default-default, or in other words the
default that is used if M5_DEFAULT_BINARY is not set.

This way if the user knows what default they want, they can specify it
locally in their environment and avoid having to type it over and over
again, but we're not making an arbitrary choice at a more global level
without the context to know what actually makes sense.

Change-Id: I886adb1289b9879d53387250f950909a4809ed8b
2023-11-23 08:26:11 +08:00
Gabe Black
63919f6a68 scons: Hook up oldconfig and olddefconfig.
These two utilities help update an old config to add settings for new
config options. The difference between them is that oldconfig asks what
new settings you want to use, while olddefconfig automatically picks the
defaults.
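
The difference can be sketched with a small Python model. This is an
illustration under simplifying assumptions (a config is a flat dict and
every option has a default), not kconfiglib code; `update_config` and
its `ask` callback are hypothetical names.

```python
def update_config(old_config, defaults, ask=None):
    """Add settings for options that are missing from old_config.

    With ask=None this mimics olddefconfig (take the default); with an
    ask callback it mimics oldconfig (let the user choose).
    """
    new_config = dict(old_config)  # existing settings are left untouched
    for name, default in defaults.items():
        if name not in new_config:
            new_config[name] = ask(name, default) if ask else default
    return new_config
```

In both modes, options that are already set in the old config keep their
values; only the handling of new options differs.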

Change-Id: Icd3e57f834684e620705beb884faa5b6e2cc7baa
2023-11-23 08:26:11 +08:00
Gabe Black
ec76214f68 scons: Hook up the savedefconfig kconfig helper.
This helper utility lets you save the defconfig which would give rise to
a given config. For instance, you could use menuconfig to set up a
config how you want it with the options you cared about configured, and
then use savedefconfig to save a defconfig of that somewhere to the
side, in the gem5 defconfig directory, etc. Then later, you could use
that defconfig to set up a new build directory with that same config,
even if the kconfig options have changed a little bit since then.

A saved defconfig like that can also be a good way to visually see what
options have been set to something interesting, and an easier way to
pass a config to someone else to use, to put in bug reports, etc.

Change-Id: Ifd344278638c59b48c261b36058832034c009c78
2023-11-23 08:26:11 +08:00
Gabe Black
51b8cfcede scons: Hook up the kconfig guiconfig program.
Change-Id: I0563a2fb2d79cea5974aeaf65a400be5ee51dc63
2023-11-23 08:26:11 +08:00
Gabe Black
91b3da016b scons: Hook in the listnewconfig kconfig helper.
This helper lists config options which are new in the Kconfig and which
are not currently set in the config file.

Change-Id: I0c426d85c0cf0d2bdbac599845669165285a82a0
2023-11-23 08:26:11 +08:00
Gabe Black
083bca1e23 scons: Hook in the kconfig setconfig utility.
This little utility lets you set particular values in an existing config
without having to open up the whole menuconfig interface.

Also reorganize things in kconfig.py a little to help share code between
wrappers.

Change-Id: I7cba0c0ef8d318d6c39e49c779ebb2bbdc3d94c8
2023-11-23 08:26:11 +08:00
Gabe Black
1ae2dfcc56 scons: Add a mechanism to manually defconfig a build dir.
This will let you specify *any* defconfig file, instead of implicitly
selecting one from the defconfig directory based on the variant name.

Change-Id: I74c981b206849f08e60c2df702c06534c670cc7c
2023-11-23 08:26:11 +08:00
Gabe Black
1e84d9f941 scons: Add a mechanism to run menuconfig to set up a build dir.
If you call scons with the first argument set to menuconfig, that means
to run menuconfig on the path following it. Or in other words, if
you ran this command:

scons menuconfig build/foo/bar

That would tell SCons to set up a build directory at the path
build/foo/bar, and then invoke menuconfig so you can set up its
configuration.

In addition to using this mechanism to set up a new build directory, you
can also use it to reconfigure an existing directory.

This supplements and does not replace the existing mechanism of using
"build/${VARIANT}" to select a config with defconfig.

Change-Id: Ief8e8c2ee6477799455c2004bef06c64be5cc1db
2023-11-23 08:26:11 +08:00
Gabe Black
f4c578f458 scons: Flesh out the help text for "magic" targets.
These targets are not necessarily obvious, and tell SCons to do useful
things, like build a particular version of the gem5 binary with a
particular configuration, or run the unit tests.

Add descriptions of these targets to the help so that they are much
more discoverable.

Change-Id: If84399be1a7155ff5f66f511efe1f1c241089c84
2023-11-23 08:26:10 +08:00
Gabe Black
1cdccd7ac0 scons: Add a build script for generating a root Kconfig file.
This root Kconfig file "source"s (includes) the base gem5 src/Kconfig
file, and also any optional Kconfig files found in the base of EXTRAS
directories. These will be called out in the menuconfig interface and
config files with the name of the EXTRAS directory they came from, and a
blank section will be present either if the Kconfig didn't exist, or it
did exist but had no options in it.

Change-Id: I54060d613f0e0ab9372bed37a2fe5849bf5bbcdb
2023-11-23 08:26:10 +08:00
Gabe Black
db3a6e8e84 scons: Use Kconfig to configure gem5.
These are not yet consumed by anything, but convert all the settings
from SCons variables to Kconfig variables.

If you have existing SConsopts files which need to be converted, you
should take a look at KCONFIG.md to learn about how kconfig is used in
gem5. You should decide if any variables need to be available to C++ or
kconfig itself, and whether those are options which should be detected
automatically, or should be up to the user. Options which should be
measured automatically should still be in SConsopts files, while user
facing options should be added to new or existing Kconfig files.

Generally, make sure you're storing c++/kconfig visible options in
env['CONF'][...]. Also remove references to sticky_vars since persistent
options should now be handled with kconfig, and export_vars since
everything in env['CONF'] is now exported automatically.

Switch SCons/gem5 to use Kconfig for configuration, except EXTRAS which
is still a sticky SCons variable. This is necessary because EXTRAS also
controls what config options exist. If it came from Kconfig itself, then
there would be a circular dependency. This dependency could
theoretically be handled by reparsing the Kconfig when EXTRAS
directories were added or removed, but that would be complicated, and
isn't supported by kconfiglib. It wouldn't be worth the significant
effort it would take to add it, just to use Kconfig more purely.

Change-Id: I29ab1940b2d7b0e6635a490452d05befe5b4a2c9
2023-11-23 08:26:10 +08:00
Gabe Black
5f73a9bbf0 scons: Use either the "build" or "gem5.build" as build anchor.
If gem5.build already exists within a directory, then that build
directory can be used without having to worry about variants.

If it doesn't exist and we find a build/${VARIANT} style path, then we
use that as the anchor.

In either case, the variant name is the final component of the build
path. The parse_build_path function had been separating that out, but it
was just put back onto the path again anyway by the only caller, and
then split out again when that path was consumed. We save a step by not
splitting it out in parse_build_path.
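
The anchor selection described above could look roughly like the
following Python sketch. It is not the actual parse_build_path
implementation: `find_build_anchor` and the `has_gem5_build` callback
(standing in for a filesystem check for gem5.build) are hypothetical.

```python
from pathlib import PurePosixPath


def find_build_anchor(target, has_gem5_build):
    """Return the build directory for target, or None if there is none."""
    path = PurePosixPath(target)
    # Case 1: some directory on the path already contains gem5.build.
    for candidate in [path, *path.parents]:
        if has_gem5_build(candidate):
            return candidate
    # Case 2: fall back to a build/${VARIANT} style path. The variant
    # name stays attached as the final component of the anchor.
    parts = path.parts
    if "build" in parts:
        i = parts.index("build")
        if i + 1 < len(parts):
            return PurePosixPath(*parts[: i + 2])
    return None
```

For example, with no gem5.build present, "build/ARM/gem5.opt" would
resolve to the anchor "build/ARM".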

Change-Id: I8705b3dbb7664748f5525869cb188df70319d403
2023-11-23 08:26:10 +08:00
Matthew Poremba
6e433ed885 mem-ruby: Fixes for new AtomicWait event in VIPER TCC (#585)
The AtomicWait event was not being woken up properly due to the
numPending count in the TBE not being decremented. This patch decrements
the count when Data is returned. Since that moves to a base state, the
TBE should no longer be needed.

Additionally, added a transition which stalls and waits when an AtomicWait
occurs while in the WI state so that it retries.

Change-Id: Ic8bfc700f9df3f95bea0799121898926a23d8163
2023-11-22 14:05:43 -08:00
Aditya K Kamath
368fcdde75 arch-x86: Fixes page fault for CLFLUSH on write-protected pages
Converts CLFLUSHOPT/WB/FLUSH operations from Write to Read operations
during address translation so that they don't trigger a page fault
when done on write-protected pages.
2023-11-22 20:09:45 +00:00
Bobby R. Bruce
23a22ed95c dev-amdgpu: Add VMID map to checkpoint (#570)
When restoring checkpoints for certain applications, gem5 tries to
create new doorbells with a pre-existing queue ID and simulation crashes
shortly after. This commit adds existing IDs to the GPU device's used
VMID map so that new doorbells are aware of existing queue IDs and use a
new ID. This ensures that queue IDs are unique after checkpoint
restoration
2023-11-22 10:05:21 -08:00
Giacomo Travaglini
098feb4042 arch-arm: Fix WFI sleeping in secure mode
The CPU should not sleep with a pending virtual interrupt
if secure mode EL2 is supported (FEAT_SEL2)

Change-Id: Ib71c4a09d76a790331cf6750da45f83694946aee
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-11-21 13:39:41 +00:00
Giacomo Travaglini
b8fabc15d9 arch-arm: Revamp takeVirtualInt to take FEAT_SEL2 into account
Similarly to the physical version [1], we rewrite the
masking logic to account for FEAT_SEL2.

The interrupt table is taken from the Arm architecture reference
manual (version DDI 0487H.a, section D1.3.6, table R_BKHXL)

[1]: https://github.com/gem5/gem5/pull/430

Change-Id: Icb6eb1944d8241293b3ef3c349b20f3981bcc558
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-11-21 13:39:41 +00:00
Giacomo Travaglini
49d07578de arch-arm: Call take(Virtual)Int only when needed
There is no need to call the methods for every kind
of interrupt. A pending one should short-circuit the
remaining checks

Change-Id: I2c9eb680a7baa4644745b8cbe48183ff6f8e3102
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-11-21 13:39:41 +00:00
Giacomo Travaglini
bb323923f2 arch-arm: Simplify get/checkInterrupts with takeVirtualInt
With this patch we align virtual interrupts with respect to
the physical ones by introducing a matching takeVirtualInt
method.

Change-Id: Ib7835a21b85e4330ba9f051bc8fed691d6e1382e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-11-21 13:39:41 +00:00
Giacomo Travaglini
3d41339366 arch-arm: Fix ISR_EL1 register read in secure mode
Virtual interrupts are enabled in secure mode as well
after the introduction of FEAT_SEL2. This replaces the
secure mode check with the EL2Enabled one

Change-Id: Id685a05d5adfa87b2a366f6be42bf344168927d4
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-11-21 13:39:41 +00:00
Giacomo Travaglini
90b711e879 arch-arm: Define an ISR type register
Change-Id: I358050a507fb76654e87165720dfb3b2ea6ca838
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-11-21 13:39:41 +00:00
Hoa Nguyen
3009e0fb57 mem-ruby: Fix typo in CHI's Send_CompI (#579)
The destination for the response is set twice.
2023-11-20 21:38:13 -08:00
Bobby R. Bruce
d772f3967b dev: Fix std::min type mismatch in reg_bank.hh (#582)
https://github.com/gem5/gem5/pull/386 included two cases in
"src/dev/reg_bank.hh" where `std::min` was used to compare an integer
of type `size_t` and another of type `Addr`. This causes an error on my
Apple Silicon Mac as the comparison between an "unsigned long" and an
"unsigned long long" is not permitted. To fix this issue this patch
changes `reg_size` from `size_t` to `Addr`, as well as the types of
the values it was derived from and the variable used to hold the return
from the `std::min` calls. While not completely correct typing from a
labelling perspective (`reg_bytes` is not an address), functions in
"src/dev/reg_bank.hh" already abuse `Addr` in this way frequently (for
example, `bytes` in the `write` function).
2023-11-20 21:37:45 -08:00
Bobby R. Bruce
f26867a075 mem-cache: Revert "Prefetchers Improvements"
Reverts PR https://github.com/gem5/gem5/pull/564

Reverts commits:

* 047a494c2b
* 2abd65c270
* 38045d7a25
* 6416304e07
* 8598764a03

Change-Id: Id523acc1778c3f827637302a6465f5a9e539d6b5
2023-11-20 19:49:04 -08:00
Vishnu Ramadas
06161ded8c dev-amdgpu: Add VMID map to checkpoint
When restoring checkpoints for certain applications, gem5 tries to
create new doorbells with a pre-existing queue ID and simulation crashes
shortly after. This commit checkpoints the existing VMID map so that any
new doorbells after restoration use a unique queue ID

Change-Id: I9bf89a2769db26ceab4441634ff2da936eea6d6f
2023-11-20 21:19:17 -06:00
Bobby R. Bruce
08c0d1f27a dev: Fix std::min type mismatch in reg_bank.hh
https://github.com/gem5/gem5/pull/386 included two cases in
"src/dev/reg_bank.hh" where `std::min` was used to compare an integer
of type `size_t` and another of type `Addr`. This causes an error on my
Apple Silicon Mac as this is a comparison between an "unsigned long"
and an "unsigned long long" which (at least on my setup) was not
permitted. To fix this issue the `reg_size` was changed from `size_t` to
`Addr`, as well as the types of the values it was derived from and
the variable used to hold the return from the `std::min` calls.

Change-Id: I31e9c04a8e0327d4f6f5390bc5a743c629db4746
2023-11-20 17:33:44 -08:00
Matthew Poremba
3896673ddc util: Bump GPUFS build docker to 5.4.2 (#571)
This dockerfile is used to *build* applications (e.g., from
gem5-resources) which can be run using full system mode in a GPU build.
The next release's disk image will use ROCm 5.4.2, therefore bump the
version from 4.2 to that version.

Again this is used to *build* input applications only and is not needed
to run or compile gem5 with GPUFS. For example:

$ docker build -t rocm54-build .
/some/gem5-resources/src/gpu/lulesh$ docker run --rm -u $UID:$GID -v \
    ${PWD}:${PWD} -w ${PWD} rocm54-build make

Change-Id: If169c8d433afb3044f9b88e883ff3bb2f4bc70d2
2023-11-18 18:13:06 -08:00
Vishnu Ramadas
d19d6fc31e dev-amdgpu: Add PM4 queue ID to GPU used VMID map
When restoring checkpoints for certain applications, gem5 tries to
create new doorbells with a pre-existing queue ID and simulation crashes
shortly after. This commit adds existing IDs to the GPU device's used
VMID map so that new doorbells are aware of existing queue IDs and use a
new ID. This ensures that queue IDs are unique after checkpoint
restoration

Change-Id: I9bf89a2769db26ceab4441634ff2da936eea6d6f
2023-11-16 17:30:00 -06:00
Jason Lowe-Power
db6a869786 mem-cache: Prefetchers Improvements (#564)
This pull request contains a set of small patches which fix some bugs in
the gem5 prefetchers, and aligns out-of-the box prefetcher performance
more closely with that which a typical user would expect.

The performance patches have been tested with an out-of-the-box
(untuned) Stride prefetcher configuration against a set of SPEC 2017
SimPoints, and show a modest IPC uplift across the board, with no IPC
degradation.

The new defaults were identified as part of work on gem5 prefetchers
undertaken by Nikolaos Kyparissas while on internship at Arm.
2023-11-16 15:22:26 -08:00
Giacomo Travaglini
4ca2efac16 mem-ruby: AtomicNoReturn should check comp_anr instead of comp_wu (#545)
The comp_anr parameter is currently unused. Both parameters (comp_wu and
comp_anr) are set to false by default

Change-Id: If09567504540dbee082191d46fcd53f1363d819f

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-11-16 15:20:51 -08:00
Matthew Poremba
4965367724 mem-ruby, gpu-compute: fix SQC/TCP requests to same line (#540)
Currently, the GPU SQC (L1I$) and TCP (L1D$) have a performance bug
where they do not behave correctly when multiple requests to the same
cache line overlap one another.  The intended behavior is that if the
first request that arrives at the Ruby code for the SQC/TCP misses, it
should send a request to the GPU TCC (L2$).  If any requests to the
same cache line occur while this first request is pending, they should
wait locally at the L1 in the MSHRs (TBEs) until the first request has
returned.  At that point they can be serviced, and assuming the line
has not been evicted, they should hit.

For example, in the following test (on 1 GPU thread, in 1 WG):

load Arr[0]
load Arr[1]
load Arr[2]

The expected behavior (confirmed via profiling on real GPUs) is that
we should get 1 miss (Arr[0]) and 2 hits (Arr[1], Arr[2]) for such a
program.

However, the current support in the VIPER SQC/TCP code does not model
this correctly.  Instead it lets all 3 concurrent requests go straight
through to the TCC instead of stopping the Arr[1] and Arr[2] requests
locally while Arr[0] is serviced.  This causes all 3 requests to be
classified as misses.

To resolve this, this patch adds support into the SQC/TCP code to
prevent subsequent, concurrent requests to a pending cache line from
being sent in parallel with the original one.  To do this, we add an
additional transient state (IV) to indicate that a load is pending to
this cache line.  If a subsequent request of any kind to the same cache
line occurs while this load is pending, the requests are put on the
local wait buffer and woken up when the first request returns to the
SQC/TCP.  Likewise, when the first load is returned to the SQC/TCP, it
transitions from IV --> V.
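
A toy Python model of this behavior is sketched below. It is not the
VIPER SLICC code: the class, its method names, and the 64-byte line size
are assumptions for illustration, and evictions and bypassing requests
are ignored.

```python
from collections import defaultdict

LINE_BYTES = 64  # assumed line size, for illustration only


class TinyL1:
    """Toy L1 with states I (absent), IV (load pending) and V (valid)."""

    def __init__(self):
        self.state = {}                   # line -> "IV" or "V"
        self.waiting = defaultdict(list)  # line -> addresses parked in IV
        self.l2_requests = 0
        self.hits = 0
        self.misses = 0

    def load(self, addr):
        line = addr // LINE_BYTES
        st = self.state.get(line, "I")
        if st == "V":
            self.hits += 1
        elif st == "IV":
            # A load to this line is already pending: wait locally
            # instead of sending another request to the L2.
            self.waiting[line].append(addr)
        else:
            self.misses += 1
            self.l2_requests += 1
            self.state[line] = "IV"

    def fill(self, line):
        """Data returns from the L2: IV --> V, then wake the waiters."""
        self.state[line] = "V"
        for addr in self.waiting.pop(line, []):
            self.load(addr)  # replayed requests now hit


l1 = TinyL1()
for i in range(3):      # load Arr[0], Arr[1], Arr[2] (4-byte elements)
    l1.load(i * 4)
l1.fill(0)
print(l1.misses, l1.hits, l1.l2_requests)  # 1 2 1
```

In this model the three overlapping loads produce one miss, two hits,
and a single request to the L2, matching the expected behavior above.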

As part of this support, additional transitions were also added to
account for corner cases such as what happens when the line is evicted
by another request that maps to the same set index while the first load
is pending (the line is immediately given to the new request, and when
the load returns it completes, wakes up any pending requests to the same
line, but does not attempt to change the state of the line) and how GPU
bypassing loads and stores should interact with the pending requests
(they are forced to wait if they reach the L1 after the pending,
non-bypassing load; but if they reach the L1 before the non-bypassing
load then they make sure not to change the state of the line from IV if
they return before the non-bypassing load).

As part of this change, we also move the MSHR behavior from internally
in the GPUCoalescer for loads to the Ruby code (like all other
requests).  This is important to get correct hits and misses in stats
and other prints, since the GPUCoalescer MSHR behavior assumes all
requests serviced out of its MSHR also miss if the original request to
that line missed.

Although the SQC does not support stores, the TCP does.  Thus,
we could have applied a similar change to the GPU stores at the TCP.
However, since the TCP support assumes write-through caches and does not
attempt to allocate space in the TCP, we elected not to add this support
since it seems to run contrary to the intended behavior (i.e., the
intended behavior seems to be that writes just bypass the TCP and thus
should not need to wait for another write to the same cache line to
complete).

Additionally, making these changes introduced issues with deadlocks at
the TCC.  Specifically, some Pannotia applications have accesses to the
same cache line where some of the accesses are GLC (i.e., they bypass
the GPU L1 cache) and others are non-GLC (i.e., they want to be cached
in the GPU L1 cache). We have support already per CU in the above code.
However, the problem here is that these requests are coming from
different CUs and happening concurrently (seemingly because different
WGs are at different points in the kernel around the same time).
This causes a problem because our support at the TCC for the TBEs
overwrites the information about the GPU bypassing bits (SLC, GLC) every
time. The problem is when the second (non-GLC) load reaches the TCC, it
overwrites the SLC/GLC information for the first (GLC) load. Thus, when
the first load returns from the directory/memory, it no longer has
the GLC bit set, which causes an assert failure at the TCP.

After talking with other developers, it was decided the best way to handle
this and attempt to model real hardware more closely was to move the
point at which requests are put to sleep on the wakeup buffer from the
TCC to the directory. Accordingly, this patch includes support for that
-- now when multiple loads (bypassing or non-bypassing) from different
CUs reach the directory, all but the first one will be forced to wait
there until the first one completes, then will be woken up and
performed.  This required updating the WTRequestor information at the
TCC to pass the information about what CU performed the original request
for loads as well (otherwise since the TBE can be updated by multiple
pending loads, we can't tell where to send the final result to).  Thus,
I changed the field to be named CURequestor instead of WTRequestor since
it is now used for more than stores.  Moreover, I also updated the
directory to take this new field and the GLC information from incoming
TCC requests and then pass that information back to the TCC on the
response -- without doing this, because the TBE can be updated by
multiple pending, concurrent requests we cannot determine if this memory
request was a bypassing or non-bypassing request.  Finally, these
changes introduced a lot of additional contention and protocol stalls at
the directory, so this patch converted all directory uses of z_stall to
instead put requests on the wakeup buffer (and wake them up when the
current request completes). Without this, protocol stalls cause
many applications to deadlock at the directory.

However, this exposed another issue at the TCC: other applications
(e.g., HACC) have a mix of atomics and non-atomics to the same cache
line in the same kernel.  The TCC transitions to the A state when
an atomic arrives. This can happen, for example, after the first pending
load returns to the TCC from the directory, which causes the TCC state
to become V, while there are still other pending loads at the TCC. This
causes invalid
transition errors at the TCC when those pending loads return, because
the A state thinks they are atomics and decrements the pending atomic
count (plus the loads are never sent to the TCP as returning loads).
This patch fixes this by changing the TCC TBEs to model the number of
pending requests, and not allowing atomics to be issued from the TCC
until all prior, pending non-atomic requests have returned.

Change-Id: I37f8bda9f8277f2355bca5ef3610f6b63ce93563
2023-11-16 14:24:00 -08:00
Bobby R. Bruce
bfe899e48e stdlib, resources: Update JSON data in workload (#532)
- resources field in workload now supports a dict with resources id and
version.

- Older workload JSON files are still supported, but a deprecation warning was added
2023-11-16 10:11:13 -08:00
David Schall
94879c2410 cpu: Require BTB hit to detect branches.
In a high performance CPU there is no other way than a BTB hit
to know about a branch instruction and its type. For low-end CPUs,
pre-decoding might sit in front of the BPU to provide this information.
Currently, the BPU models only low-end behavior and updates the
RAS and the indirect branch prediction even without a BTB hit.
This patch adds two things to model the correct behavior for high-end
CPUs.
1. A check before the RAS and indirect predictor whether there was
a BTB hit or not. Only on a BTB hit will the BPU consult the RAS and
the indirect predictor.
2. Since this check requires a BTB hit for indirect branches, they must
also be installed into the BTB. For returns this was already done.
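
The gating can be illustrated with a minimal Python model. This is a
sketch under simplifying assumptions (a dict as the BTB, a list as the
RAS, no indirect predictor state), not the gem5 BPU; the class and
method names are hypothetical.

```python
class TinyBPU:
    """Toy predictor where a BTB hit is required to see a branch."""

    def __init__(self):
        self.btb = {}  # pc -> (kind, target)
        self.ras = []  # return address stack

    def install(self, pc, kind, target):
        # Indirect branches and returns must be installed too, otherwise
        # they could never be detected again.
        self.btb[pc] = (kind, target)

    def predict(self, pc, fallthrough):
        entry = self.btb.get(pc)
        if entry is None:
            # No BTB hit: the instruction is not known to be a branch,
            # so the RAS is left untouched.
            return None
        kind, target = entry
        if kind == "call":
            self.ras.append(fallthrough)
            return target
        if kind == "return":
            return self.ras.pop() if self.ras else target
        return target
```

A call at an address that is not yet in the BTB leaves the RAS empty;
once installed, the same call pushes its fall-through address.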

Change-Id: Ibef9aa890f180efe547c82f41fc71f457c988a89
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-11-16 12:35:10 +00:00
Giacomo Travaglini
047a494c2b mem-cache: Optimize strided prefetcher address generation
This commit optimizes the address generation logic in the strided
prefetcher by introducing the following changes

(d is the degree of the prefetcher)

* Evaluate the fixed prefetch_stride only once (and not d-times)
* Replace 2d multiplications (d * prefetch_stride and distance *
prefetch_stride) with additions by updating the new base prefetch
address while looping

Change-Id: I49c52333fc4c7071ac3d73443f2ae07bfcd5b8e4
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Reviewed-by: Tiberiu Bucur <tiberiu.bucur@arm.com>
2023-11-16 09:48:15 +00:00
Nikolaos Kyparissas
2abd65c270 mem: added distance parameter to stride prefetcher
The Stride Prefetcher will skip this number of strides ahead of the
first identified prefetch, then generate `degree` prefetches at
`stride` intervals. A value of zero indicates no skip (i.e. start
prefetching from the next identified prefetch address).

This parameter can be used to increase the timeliness of prefetches by
starting to prefetch far enough ahead of the demand stream to cover
the memory system latency.

[Richard Cooper <richard.cooper@arm.com>:
- Added detail to commit comment and `distance` Param documentation.
- Changed `distance` Param from `Param.Int` to `Param.Unsigned`.
]

Change-Id: I6c4e744079b53a7b804d8eab93b0f07b566f0c08
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Signed-off-by: Richard Cooper <richard.cooper@arm.com>
2023-11-16 09:48:09 +00:00
Yu-Cheng Chang
ceabe86b31 arch-riscv: Add overrides to RISC-V Interrupts class (#568) 2023-11-15 18:36:15 -08:00
Matt Sinclair
c3326c78e6 mem-ruby, gpu-compute: fix SQC/TCP requests to same line
Currently, the GPU SQC (L1I$) and TCP (L1D$) have a performance bug
where they do not behave correctly when multiple requests to the same
cache line overlap one another.  The intended behavior is that if the
first request that arrives at the Ruby code for the SQC/TCP misses, it
should send a request to the GPU TCC (L2$).  If any requests to the
same cache line occur while this first request is pending, they should
wait locally at the L1 in the MSHRs (TBEs) until the first request has
returned.  At that point they can be serviced, and assuming the line
has not been evicted, they should hit.

For example, in the following test (on 1 GPU thread, in 1 WG):

load Arr[0]
load Arr[1]
load Arr[2]

The expected behavior (confirmed via profiling on real GPUs) is that
we should get 1 miss (Arr[0]) and 2 hits (Arr[1], Arr[2]) for such a
program.

However, the current support in the VIPER SQC/TCP code does not model
this correctly.  Instead it lets all 3 concurrent requests go straight
through to the TCC instead of stopping the Arr[1] and Arr[2] requests
locally while Arr[0] is serviced.  This causes all 3 requests to be
classified as misses.

To resolve this, this patch adds support into the SQC/TCP code to
prevent subsequent, concurrent requests to a pending cache line from being
sent in parallel with the original one.  To do this, we add an
additional transient state (IV) to indicate that a load is pending to
this cache line.  If a subsequent request of any kind to the same cache
line occurs while this load is pending, the requests are put on the
local wait buffer and woken up when the first request returns to the
SQC/TCP.  Likewise, when the first load is returned to the SQC/TCP, it
transitions from IV --> V.

As part of this support, additional transitions were also added to
account for corner cases such as what happens when the line is evicted
by another request that maps to the same set index while the first load
is pending (the line is immediately given to the new request, and when
the load returns it completes, wakes up any pending requests to the same
line, but does not attempt to change the state of the line) and how GPU
bypassing loads and stores should interact with the pending requests
(they are forced to wait if they reach the L1 after the pending,
non-bypassing load; but if they reach the L1 before the non-bypassing
load then they make sure not to change the state of the line from IV if
they return before the non-bypassing load).

As part of this change, we also move the MSHR behavior from internally
in the GPUCoalescer for loads to the Ruby code (like all other
requests).  This is important to get correct hits and misses in stats
and other prints, since the GPUCoalescer MSHR behavior assumes all
requests serviced out of its MSHR also miss if the original request to
that line missed.

Although the SQC does not support stores, the TCP does.  Thus,
we could have applied a similar change to the GPU stores at the TCP.
However, since the TCP support assumes write-through caches and does not
attempt to allocate space in the TCP, we elected not to add this support
since it seems to run contrary to the intended behavior (i.e., the
intended behavior seems to be that writes just bypass the TCP and thus
should not need to wait for another write to the same cache line to
complete).

Additionally, making these changes introduced issues with deadlocks at
the TCC.  Specifically, some Pannotia applications have accesses to the
same cache line where some of the accesses are GLC (i.e., they bypass
the GPU L1 cache) and others are non-GLC (i.e., they want to be cached
in the GPU L1 cache). We have support already per CU in the above code.
However, the problem here is that these requests are coming from
different CUs and happening concurrently (seemingly because different
WGs are at different points in the kernel around the same time).
This causes a problem because our support at the TCC for the TBEs
overwrites the information about the GPU bypassing bits (SLC, GLC) every
time. The problem is when the second (non-GLC) load reaches the TCC, it
overwrites the SLC/GLC information for the first (GLC) load. Thus, when
the first load returns from the directory/memory, it no longer has
the GLC bit set, which causes an assert failure at the TCP.

After talking with other developers, it was decided the best way to handle
this and attempt to model real hardware more closely was to move the
point at which requests are put to sleep on the wakeup buffer from the
TCC to the directory. Accordingly, this patch includes support for that
-- now when multiple loads (bypassing or non-bypassing) from different
CUs reach the directory, all but the first one will be forced to wait
there until the first one completes, then will be woken up and
performed.  This required updating the WTRequestor information at the
TCC to pass the information about what CU performed the original request
for loads as well (otherwise since the TBE can be updated by multiple
pending loads, we can't tell where to send the final result to).  Thus,
I changed the field to be named CURequestor instead of WTRequestor since
it is now used for more than stores.  Moreover, I also updated the
directory to take this new field and the GLC information from incoming
TCC requests and then pass that information back to the TCC on the
response -- without doing this, because the TBE can be updated by
multiple pending, concurrent requests we cannot determine if this memory
request was a bypassing or non-bypassing request.  Finally, these
changes introduced a lot of additional contention and protocol stalls at
the directory, so this patch converted all directory uses of z_stall to
put requests on the wakeup buffer (and wake them up when the
current request completes) instead. Without this, protocol stalls cause
many applications to deadlock at the directory.

However, this exposed another issue at the TCC: other applications
(e.g., HACC) have a mix of atomics and non-atomics to the same cache
line in the same kernel.  The TCC transitions to the A state when an
atomic arrives, which can happen, for example, after the first pending
load returns to the TCC from the directory (causing the TCC state to
become V) while there are still other pending loads at the TCC. This
causes invalid
transition errors at the TCC when those pending loads return, because
the A state thinks they are atomics and decrements the pending atomic
count (plus the loads are never sent to the TCP as returning loads).
This patch fixes this by changing the TCC TBEs to model the number of
pending requests, and not allowing atomics to be issued from the TCC
until all prior, pending non-atomic requests have returned.

Change-Id: I37f8bda9f8277f2355bca5ef3610f6b63ce93563
2023-11-15 19:23:51 -06:00
Matt Sinclair
065ddf759f mem-ruby, gpu-compute: fix bug with GPU bypassing loads
The current GPU TCP (L1D$) Ruby SLICC code had a bug where a GPU
load that wants to bypass the L1D$ (e.g., GLC or SLC bit was set)
but the line is in Invalid when that request arrives, results in
a non-bypassing load being sent to the GPU TCC (L2$) instead of
a bypassing load.

This issue was not caught by the current nightly or weekly tests,
because the tests do not test for correctness in terms of hits
and misses in the caches.  However, tests for these corner cases
expose this issue.

To fix this, this patch removes the check that the entry is valid
when deciding what to do with a bypassing GPU load -- since the
TCP Ruby code has transitions for bypassing loads in both I and V,
we can simply call the LoadBypassEvict event in both cases and the
appropriate transition will handle the bypassing load given the
cache line's current state in the TCP.

Change-Id: Ia224cefdf56b4318b2bcbd0bed995fc8d3b62a14
2023-11-15 19:23:51 -06:00
hungweihsuG
83f1fe3fec dev: add debug flag in register bank. (#386)
Print extra logs for the full/partial read/write access to the registers
through the register bank. The debug flag is empty by default and would
not print anything.

Test: run unittest of dev/reg_bank.test.xml to check the behavior would
not affect the original functionality.
run gem5 with debug flags and use m5term to poke on registers.
2023-11-15 10:04:46 -08:00
wmin0
a8440f367d arch-riscv: Move fault handler addr logic to ISA (#554)
mtvec.mode is extended in new RISC-V proposals, such as fast interrupts.
This change moves that logic from the Fault class to the ISA class for
extensibility.

Ref: https://github.com/riscv/riscv-fast-interrupt
2023-11-15 10:04:01 -08:00
BujSet
4a5ec70e08 gpu-compute: Minor edits for atomic no returns and stores (#565)
Since returned data is not needed for AtomicNoReturn and Store memory
requests, the coalescer need not spend time writing in dummy data for
packets of these types.

Change-Id: Ie669e8c2a3bf44b5b0c290f62c49c5d4876a9a6a
2023-11-15 07:20:07 -08:00
Bobby R. Bruce
d0d3c74ce0 misc: Merge develop .github dir to stable (#566) 2023-11-14 13:49:37 -08:00
Bobby R. Bruce
30787b59d4 tests: Remove multiple suites per job for Weekly tests (#562)
I believe the weekly test failures (example:
https://github.com/gem5/gem5/actions/runs/6832805510/job/18592876184)
are due to a container running out of memory when running the very-long
x86 boot tests. I found that the `-t $(nproc)` flag meant, on our
runners, 4 x86 full system gem5 simulations were being spawned. Locally I
found these gem5 x86 boot sims can reach 4GB in size so I suspect they
eventually grew big enough to exceed the 16GB memory of the VM.

I have removed `-t $(nproc)`, meaning each execution runs single-threaded,
to see if this fixes the issue (we may want to use `-t 2` later if the
Weeklies take too long running single-threaded).
2023-11-14 11:00:07 -08:00
Bobby R. Bruce
8859592893 tests,gpu-compute: Fix Lulesh 'Obtain LULESH' step (#563)
The `working-directory: ${{ github.workspace }}` line was included by
mistake and resulted in this step failing as the command was being
executed in the wrong directory.

Example failure:
https://github.com/gem5/gem5/actions/runs/6832831307/job/18593080567
2023-11-14 08:43:00 -08:00
Derek Christ
e95cab429f configs,ext,stdlib: Update DRAMSys integration (#525)
Recent breaking changes in the DRAMSys API require user code to be
updated. These updates have been applied to the gem5 integration.

Furthermore, as DRAMSys started to use CMake dependency management,
it is no longer sensible to maintain two separate build systems for
DRAMSys. The use of the DRAMSys integration in gem5 will therefore
from now on require that CMake is installed on the target machine.

Additionally, support for snapshots has been implemented in DRAMSys
and coupled with gem5's checkpointing API.
2023-11-14 08:05:11 -08:00
Derek Christ
99553fdbee systemc: Fix two bugs in gem5-to-tlm bridge (#542)
This commit fixes a violation of the TLM2.0 protocol as well as a
bug regarding back-pressure:
- In the BEGIN_REQ phase, transaction objects are required to set
  their response status to TLM_INCOMPLETE_RESPONSE. This was not
  the case in the packet2payload function that converts gem5 packets
  to TLM2.0 generic payloads.
- When the target applies back-pressure to the initiator, an assert
  condition was triggered as soon as the response is retried. The
  cause of this was an unintentional nullptr-access into a map.
2023-11-14 08:02:58 -08:00
BujSet
65b44e6516 mem-ruby: Fix for not creating log entries on atomic no return requests (#546)
Augmenting Datablock and WriteMask to support optional arg to
distinguish between return and no return. In the case of atomic no
return requests, log entries should not be created when performing the
atomic.

Change-Id: Ic3112834742f4058a7aa155d25ccc4c014b60199a
2023-11-14 07:54:42 -08:00
Daniel Kouchekinia
be5c03ea9f mem-ruby,configs: Add GPU GLC Atomic Resource Constraints (#120)
Added a resource constraint, AtomicALUOperation, to GLC atomics
performed in the TCC.

The resource constraint uses a new class, ALUFreeList array. The class
assumes the following:
- There are a fixed number of atomic ALU pipelines
- While a new cache line can be processed in each pipeline each cycle,
  if a cache line is currently going through a pipeline, it can't be
  processed again until it's finished

Two configuration parameters have been used to tune this behavior:
- tcc-num-atomic-alus corresponds to the number of atomic ALU pipelines
- atomic-alu-latency corresponds to the latency of atomic ALU pipelines
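
The two assumptions above can be sketched as a small model. This is a
hedged illustration in Python, not the gem5 class itself: the name
AtomicALUFreeList, the try_issue method and the cycle bookkeeping are
hypothetical and only mirror the behaviour described in this message.

```python
class AtomicALUFreeList:
    """Toy model: a fixed number of atomic ALU pipelines with a fixed latency."""

    def __init__(self, num_alus: int, latency: int) -> None:
        self.num_alus = num_alus  # plays the role of tcc-num-atomic-alus
        self.latency = latency    # plays the role of atomic-alu-latency
        self.busy_until = {}      # cache-line address -> cycle its atomic finishes
        self.cycle = -1           # last cycle in which an issue was attempted
        self.started = 0          # atomics started during self.cycle

    def try_issue(self, line_addr: int, now: int) -> bool:
        """Return True if an atomic on line_addr may enter a pipeline at cycle now."""
        if now != self.cycle:
            # A new cycle: every pipeline can accept one new cache line again.
            self.cycle = now
            self.started = 0
        done = self.busy_until.get(line_addr)
        if done is not None and now < done:
            return False  # this line is still going through a pipeline
        if self.started >= self.num_alus:
            return False  # all pipelines already accepted a line this cycle
        self.busy_until[line_addr] = now + self.latency
        self.started += 1
        return True


if __name__ == "__main__":
    alus = AtomicALUFreeList(num_alus=2, latency=3)
    print(alus.try_issue(0xA0, now=0))  # first atomic on line 0xA0
    print(alus.try_issue(0xA0, now=0))  # same line again while in flight
```

A request that is refused would simply be retried on a later cycle in
this model; how the real TCC stalls or recycles such requests is not
shown here.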

Change-Id: I25bdde7dafc3877590bb6536efdf57b8c540a939
2023-11-14 07:48:48 -08:00
Nikolaos Kyparissas
38045d7a25 mem-cache: Added clean eviction check for prefetchers.
pkt->req->isCacheMaintenance() would not include a check
for clean eviction before notifying the prefetcher,
causing gem5 to crash.

Change-Id: I1d082a87a3908b1ed46c5d632d45d8b09950b382
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-11-14 15:20:52 +00:00
Richard Cooper
6416304e07 mem-cache: Update default prefetch options.
Update the default prefetch options to achieve out-of-the box
prefetcher performance closer to that which a typical user would
expect. Configurations that set these parameters explicitly will be
unaffected.

The new defaults were identified as part of work on gem5 prefetchers
undertaken by Nikolaos Kyparissas while on internship at Arm.

Change-Id: Id63868c7c8f00ee15a0b09a6550780a45ae67e55
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-11-14 15:20:52 +00:00
Richard Cooper
8598764a03 mem-cache: Squash prefetch queue entries by block address.
Prefetch queue entries were being squashed by comparing the address
of each queued prefetch against the block address of the demand
access. Only prefetches that happen to fall on a cache-line block
boundary would be squashed.

This patch converts the prefetch addresses to block addresses before
comparison.
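
The difference between the two comparisons can be sketched as follows.
This is an illustrative Python model rather than the gem5 C++ code; the
64-byte block size and the function names are assumptions.

```python
BLK_SIZE = 64  # assumed cache-line size in bytes (must be a power of two)


def block_addr(addr: int, blk_size: int = BLK_SIZE) -> int:
    """Align an address down to the start of its cache block."""
    return addr & ~(blk_size - 1)


def squash_by_raw_addr(queue, demand_addr):
    """Old behaviour: only prefetches sitting exactly on the block boundary match."""
    target = block_addr(demand_addr)
    return [pf for pf in queue if pf != target]


def squash_by_block_addr(queue, demand_addr):
    """New behaviour: any prefetch that falls in the demanded block is squashed."""
    target = block_addr(demand_addr)
    return [pf for pf in queue if block_addr(pf) != target]


if __name__ == "__main__":
    queue = [0x1000, 0x1010, 0x1040]
    # 0x1010 lies in the same block as the demand access at 0x1020.
    print([hex(a) for a in squash_by_raw_addr(queue, 0x1020)])
    print([hex(a) for a in squash_by_block_addr(queue, 0x1020)])
```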

Change-Id: I55ecb4919e94ad314b91c7795bba257c550b1528
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-11-14 15:20:52 +00:00
Yu-Cheng Chang
f11227b4a0 systemc: Fix gcc13 systemC compilation error (#520)
issue: https://github.com/gem5/gem5/issues/472
2023-11-14 03:54:35 -08:00
Bobby R. Bruce
6ac6d0c340 tests,misc: Add "build/ALL/gem5.fast" Clang compilation to CI (#432)
While we do run compiler tests weekly, 9 times out of 10 the issue is a
strict check in clang we did not run before incorporating code into the
codebase.

Therefore, running a clang compilation as part of our CI would help us
catch errors quicker.
2023-11-14 03:53:28 -08:00
Daniel Kouchekinia
dde3d10aea cpu: Remove SLC bit restraint for GPU tester (#552)
This reverts gem5#133, the temporary work-around for gem5#131, allowing
both SLC and GLC atomic requests to be made in the GPU tester.

The underlying issues behind gem5#131 have been resolved by gem5#367 and
gem5#397.
2023-11-14 03:47:34 -08:00
Rajarshi Das
f71450d26d python,util: Fix magic number check in decode_inst_dep_trace.py (#560)
The decode_inst_dep_trace.py opens the trace file in read mode, and
subsequently reads the magic number from the trace file. Once the number
is read, it is compared against the string 'gem5' without decoding it
first. This causes the comparison to fail.
The fix addresses this by calling the decode() routine on the output of
the read() call. Please find the details in the associated issue #543
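
The failure mode can be reproduced in isolation. This is a minimal
sketch, assuming the trace is read as bytes; the helper name
has_gem5_magic is hypothetical and is not the script's actual code.

```python
import io

MAGIC = "gem5"  # magic string named in this commit message


def has_gem5_magic(stream) -> bool:
    """Return True if a binary stream starts with the magic number."""
    raw = stream.read(len(MAGIC))  # bytes, because the stream is binary
    try:
        return raw.decode("ascii") == MAGIC
    except UnicodeDecodeError:
        return False  # not ASCII at all, so it cannot be the magic number


if __name__ == "__main__":
    # In Python 3 a bytes object never compares equal to a str, which is
    # why the undecoded comparison always failed.
    print(b"gem5" == "gem5")
    print(has_gem5_magic(io.BytesIO(b"gem5" + b"\x00\x01")))
```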
2023-11-14 03:47:04 -08:00
Bobby R. Bruce
1c7934c9d6 tests,util-docker: Remove gcc 9 support (#556)
When compiling gem5 with GCC-9, the gem5 object files are nearly double
the size of those produced by other GCC versions. This increase in size
means we need >16GB of memory available when linking. As we do not want to
mandate >16GB systems for building gem5, we are going to drop GCC-9. The
exact cause of this bug is unknown. This is highlighted in Issue #555.
2023-11-14 03:45:51 -08:00
Matt Sinclair
48fde5a9c6 mem-ruby, gpu-compute: fix formatting of TCC (#536)
mem-ruby, gpu-compute: fix formatting of TCC

Fix several not properly indented prints and extraneous extra lines in
the SLICC code for the GPU TCC (L2 cache).
2023-11-13 15:01:30 -08:00
Matt Sinclair
7d0a1fb284 mem-ruby, gpu-compute: fix typo in GPU coalescer deadlock print (#535)
mem-ruby, gpu-compute: fix typo in GPU coalescer deadlock print 
 
The GPU Coalescer's deadlock print did not previously print a newline at
the end of each deadlock, which caused confusion when there were
multiple deadlocks as each deadlock print would appear to go with the
address after it. This patch fixes this issue.
2023-11-13 15:01:01 -08:00
Matt Sinclair
75ca2c4282 gpu-compute: Fix typo with GPUTLB print (#529)
gpu-compute: Fix typo with GPUTLB print

Print was not properly ending in a newline, which caused confusion when
looking at a trace with GPUTLB enabled. This fixes that.
2023-11-13 14:40:27 -08:00
Matt Sinclair
f312804364 mem-ruby: fix hex print in CacheMemory (#561)
Update print in CacheMemory about clearing the lock to properly print in
hex.
2023-11-13 14:34:33 -08:00
Matt Sinclair
3642bc4892 mem-ruby, gpu-compute: fix GPU SQC/TCP Ruby formatting (#538)
mem-ruby, gpu-compute: fix GPU SQC/TCP Ruby formatting

Fix several not properly indented prints and extraneous extra lines in
the SLICC code for the GPU SQC (L1I$) and TCP (L1D$).
2023-11-13 14:20:54 -08:00
Harshil Patel
50c9cbf613 stdlib, resources: Fixed deprecation warning
Change-Id: I61865d9a2c08e344824a735ee5e85fb54cd489da
2023-11-13 14:09:13 -08:00
Bobby R. Bruce
b62308dfa3 base,sim: Add the SymbolType field to the Symbol object (#512)
Symbol type is part of the info provided by an ELF object's symtab.
It indicates whether a symbol is a file symbol, or a function symbol,
etc.

This chain of commits introduces a way to only load function symbols
to the gem5's symbol table. The RISC-V BootloaderKernelWorkload now
loads only function symbols from the bootloader and the kernel binaries
by default.
2023-11-13 08:14:05 -08:00
Bobby R. Bruce
52354662aa arch-riscv: Fixing CMO instructions and allowing using CMO instructions in FS mode (#517)
arch-riscv: Fix implementation of CMO extension instructions

This change introduces a template for store instruction's mem access.
The new template is called CacheBlockBasedStore.

The reasons for not reusing the current Store's mem access template
are as follows,
- The CMO extension instructions operate on cache block size
granularity,
while regular load/store instructions operate on data of size 64 bits or
fewer.
- The writeMemAtomicLE/writeMemTimingLE interfaces do not allow passing
nullptr as data. However, CPUs in gem5 rely on (data == NULL) to detect
CACHE_BLOCK_ZERO instructions. Setting `Mem = 0;` to `uint64_t Mem;`
does not solve the problem as the reference is allocated and thus,
it's always true that `&Mem != NULL`. This change uses the
writeMemAtomic/writeMemTiming interfaces instead.
- Per CMO v1.0.1, the instructions in the spec do not generate
address misaligned faults.
- The CMO extension instructions do not use IMM.

---

arch-riscv: Fix generateDisassembly for Store with 1 source reg

Currently, store instructions are assumed to have two source registers.
However, the RISC-V CMO instructions we now support, which are Store
instructions in gem5, have only one source register.
This change allows printing disassembly of Store instructions with
one source register.

---

arch-riscv: Make Zicbom/Zicboz extensions optional in FS mode

Currently, we enable Zicbom/Zicboz by default. Since those
extensions might be buggy as they are not well-tested, making
those extensions optional allows running simulations where
the performance implications of the instructions do not matter.

Effectively, by turning off the extensions, we simply remove
those extensions from the device tree, so the OS would not
use them. It doesn't prevent userspace applications from
using those instructions, however.

---

arch-riscv: Add all supporting Z extensions to RISC-V isa string
2023-11-13 03:38:49 -08:00
Bobby R. Bruce
cb62b08989 util-docker: Update Ubuntu 20.04 to use GCC-10
GCC-9 is no longer supported.

Change-Id: I09bf8f744546908b1c06615b458b31b9b814b61a
2023-11-13 01:36:52 -08:00
Bobby R. Bruce
c40c4450f5 util-docker: Remove GCC Version 9 from Dockerfiles
As we are no longer testing for GCC Version 9, we no longer need to
compile these Docker images.

Change-Id: Ia8fc712043ce211ff46da47fdce691a67ecdbb54
2023-11-13 01:29:56 -08:00
Bobby R. Bruce
eaec1a7146 tests: Remove GCC-9 compiler test
When compiling gem5 with GCC-9, the gem5 object files are nearly double
the size of those produced by other GCC versions. This increase in size
means we need >16GB of memory available when linking. As we do not want to
mandate >16GB systems for building gem5, we are going to drop GCC-9. The
exact cause of this bug is unknown.

Change-Id: I43744d421b88b79ccb21a76badd6b525e894e973
2023-11-13 01:03:43 -08:00
Matt Sinclair
f61d709321 mem-ruby: update RubyRequest print to include GPU fields (#537)
mem-ruby: update RubyRequest print to include GPU fields

The print function used for RubyRequests did not include the GPU
specific fields (for the GLC and SLC bits, which are cache modifiers
that specify what level of the memory hierarchy a request should be
performed at). This causes confusion when the GPU Ruby SLICC code prints
out RubyRequest messages, since important fields are missing.

Thus this commit adds that support. Since these fields are already part
of the RubyRequest class, and are always 0 for non-GPU requests, it
should not affect other components beyond slightly longer prints.

Change-Id: I31c9122b82dfa2c6415ce25d225ea82cb35c7333
2023-11-13 01:12:25 -06:00
Daniel Kouchekinia
1204267fd8 mem-ruby: SLICC Fixes to GLC Atomics in WB L2 (#397)
Made the following changes to fix the behavior of GLC atomics in a WB
L2:
- Stored atomic write mask in TBE for GLC atomics on an invalid line
that bypass to the directory, but have their atomics performed on the
return path.
- Replaced !presentOrAvail() check for bypassing atomics to directory
(which will then be performed on return path), with check for invalid
line state.
- Replaced wdb_writeDirtyBytes action used when performing atomics with
owm_orWriteMask action that doesn't write from invalid atomic request
data block
   - Fixed atomic return path actions

Change-Id: I6a406c313d2f9c88cd75bfe39187ef94ce84098f
2023-11-09 13:15:10 -08:00
Bobby R. Bruce
0442c9a88c configs,ext: gem5 SST bridge calls m5.instantiate() in gem5 (#507)
This change updates the gem5 SST bridge to call m5.instantiate() in the
gem5 config script instead of in the SST component. This allows more
flexibility for the gem5-SST setup, as we can now write traffic
generators using the bridge.
2023-11-08 10:14:29 -08:00
Matt Sinclair
86131d4323 mem-ruby, gpu-compute: update GPU L1I$ MRU info (#530)
Previously the GPU L1 I$ (SQC) was not updating the MRU information on
hits in the SQC. This commit resolves that by adding support to the
appropriate Ruby transition.
2023-11-08 10:13:15 -08:00
Giacomo Travaglini
1f1e15e48f arch-arm,kvm: Fix copy-paste error (#541)
This was probably a copy paste error introduced by [1]. Luckily armv7
KVM mode has been superseded by the armv8 one.

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/52059

Change-Id: I260229c94077d856510976bda58383f0564fc15b

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-11-08 08:35:02 +00:00
Zixian Cai
f97adbaac7 python: Handle unicode characters in config files (#521)
Previously, opening a config file (such as
`configs/example/hmc_hello.py`) containing non-ASCII characters causes
UnicodeDecodeError.
Also switch to a more idiomatic context manager for handling
files.
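
A minimal sketch of the pattern, assuming the config source is read as
UTF-8 text; the helper name read_config_source is hypothetical and the
actual change in gem5 may differ.

```python
import os
import tempfile


def read_config_source(path: str) -> str:
    """Read a config script as UTF-8, closing the file even if reading fails."""
    # An explicit encoding avoids depending on the locale's default codec,
    # which is what can raise UnicodeDecodeError on non-ASCII input.
    with open(path, encoding="utf-8") as config_file:
        return config_file.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "example_config.py")
        with open(path, "w", encoding="utf-8") as out:
            out.write("# comment with non-ASCII text: caf\u00e9\nvalue = 1\n")
        print(read_config_source(path))
```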

Change-Id: Ia39cbe2c420e9c94f3a84af459b7e5f4d9718d14
2023-11-07 08:59:42 -08:00
Daniel Carvalho
10374f2f05 Fix calculation of compressed size in bytes (#534)
An integer division in compression::Base::getSize() was being done,
which led to rounding down instead of up.
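
The rounding issue can be shown with a short sketch. It assumes the size
is converted from bits to bytes with 8 bits per byte; the function names
are illustrative and are not the gem5 C++ code.

```python
BITS_PER_BYTE = 8


def size_in_bytes_floor(size_bits: int) -> int:
    """Plain integer division: a partially used byte is dropped."""
    return size_bits // BITS_PER_BYTE


def size_in_bytes_ceil(size_bits: int) -> int:
    """Round up, so a partially used byte still counts as a whole byte."""
    return (size_bits + BITS_PER_BYTE - 1) // BITS_PER_BYTE


if __name__ == "__main__":
    # 9 bits of compressed data need 2 bytes of storage, not 1.
    print(size_in_bytes_floor(9), size_in_bytes_ceil(9))
```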

Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
2023-11-07 08:58:32 -08:00
Matt Sinclair
76279fef59 mem-ruby: update RubyRequest print to include GPU fields
The print function used for RubyRequests did not include the GPU
specific fields (for the GLC and SLC bits, which are cache modifiers
that specify what level of the memory hierarchy a request should be
performed at).  This causes confusion when the GPU Ruby SLICC code
prints out RubyRequest messages, since important fields are missing.

Thus this commit adds that support.  Since these fields are already
part of the RubyRequest class, and are always 0 for non-GPU requests,
it should not affect other components beyond slightly longer prints.

Change-Id: I31c9122b82dfa2c6415ce25d225ea82cb35c7333
2023-11-07 00:52:37 -06:00
Kaustav Goswami
e109076357 ext: removed SST deprecation notice from SimpleMem
SST SimpleMem will be deprecated in SST 14. PR 396 updated the
bridge to use StandardMem, which is the new memory interface in
SST. This change removes all references to SimpleMem.

Change-Id: I6e4d645317d95ebb610e3dfc93a30d53b91b6b5d
Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>
2023-11-06 11:54:35 -08:00
Kaustav Goswami
2c229aa2ff configs,ext: gem5 SST bridge calls m5.instantiate() in gem5
This change updates the gem5 SST bridge to call m5.instantiate()
in the gem5 config script instead of in the SST component. This
allows more flexibility for the gem5-SST setup, as we can now write
traffic generators using the bridge.

Change-Id: I510a8c15f8fb00bdbdd60dafa2d9f5ad011e48f2
Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>
2023-11-06 11:54:35 -08:00
Jason Lowe-Power
71973b386e gpu-compute,dev-hsa: ROCm 5.5+ support (#498)
ROCm 5.5 support including:
- Vendor packet completion signals
- Queue remapping race condition fix
- Backwards compatible GPR allocation
- Fix transient readBlob fatal reading kernel descriptor
2023-11-06 10:51:37 -08:00
Yu-Cheng Chang
e4cdd73a59 arch-riscv: Fix line length of CSRData declaration (#519)
The length of the CSRData declaration must be less than 79 characters

Change-Id: I3767b069664690d7b4498a73536880cfa491c6e5
2023-11-06 10:26:08 -08:00
Harshil Patel
42fd7ff894 stdlib, resources: Update JSON data in workload
- resources field in workload now supports a dict with resources id
and version.

- Older workload JSON is still supported, but a deprecation warning was added

Change-Id: I137dbb99799a5294e84ce7d5d914f05e4cfe9e00
2023-11-03 13:54:30 -07:00
Matthew Poremba
e362310f3d gpu-compute: Update GPR allocation counts
GPR allocation is using fields in the AMD kernel code structure which
are not backwards compatible and are not populated in more recent
compiler versions. Use the granulated fields instead, which are enforced
to be backwards compatible.

Change-Id: I718716226f5dbeb08369d5365d5e85b029027932
2023-11-01 14:52:39 -05:00
Matthew Poremba
f07e0e7f5d gpu-compute: Read dispatch packet with timing DMA
This fixes occasional readBlob fatals caused by the functional read of
system memory, seen often with the KVM CPU.

Change-Id: Ifccee666f62faa5b2fcf0a64a9d77c8cf95b3add
2023-11-01 14:52:39 -05:00
Matthew Poremba
37da1c45f3 dev-amdgpu: Better handling for queue remapping
The amdgpu driver can, at *any* time, tell the device to unmap a queue
to force the queue descriptor to be written back to main memory in the
form of a memory queue descriptor (MQD). It will then immediately remap
the queue and continue writing the doorbell to the queue. It is possible
that the doorbell write occurs after the queue is unmapped but before it
is remapped. In this situation, we need to check the updated value of
the doorbell for the queue and write that to the queue after it is
mapped.

To handle this, a pending doorbell packet map is created to hold a
packet to replay when the queue is mapped. Because PCI in gem5
implements only the atomic protocol port, we cannot use the original
packet as it must respond in the same Tick. This patch fixes issues with
the doorbell maps not being cleared on unmapping, to ensure the doorbell
is not found in writeDoorbell and is placed in the pending doorbell map.
This includes fixing the doorbell offset value in the doorbell to VMID
map, which is now multiplied by four as it is a dword address.

This was tested using tensorflow 2.0's MNIST example which was seeing
this issue consistently. With this patch it now makes progress and does
issue pending doorbell writes.

Change-Id: Ic6b401d3fe7fc46b7bcbf19a769cdea6814e7d1e
2023-11-01 14:52:39 -05:00
Matthew Poremba
d05433b3f6 gpu-compute,dev-hsa: Send vendor packet completion signal
gem5 does not currently implement any vendor-specific HSA packets.
Starting in ROCm 5.5, vendor packets appear to end with a completion
signal. Not sending this completion causes gem5 to hang. Since these
packets are not documented anywhere and need to be reverse engineered we
send the completion signal, if non-zero, and finish the packet as is the
current behavior.

Testing: HIP examples working on most recent ROCm release (5.7.1).

Change-Id: Id0841407bec564c84f590c943f0609b17e01e14c
2023-11-01 14:52:39 -05:00
Hoa Nguyen
68287604ee arch-riscv: Make Zicbom/Zicboz extensions optional in FS mode
Currently, we enable Zicbom/Zicboz by default. Since those
extensions might be buggy as they are not well-tested, making
those extensions optional allows running simulations where
the performance implications of the instructions do not matter.

Effectively, by turning off the extensions, we simply remove
those extensions from the device tree, so the OS would not
use them. It doesn't prevent userspace applications from
using those instructions, however.

Change-Id: Ib30e98c4c39f741dec5f7d31bd7b832391686840
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-30 21:45:13 +00:00
Hoa Nguyen
7c6fcb3838 arch-riscv: Add all supporting Z extensions to RISC-V isa string
Change-Id: I809744fc546bc5c0e27380f9b75bdf99f8520583
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-30 21:45:10 +00:00
Hoa Nguyen
f615ee4cd4 arch-riscv: Fix generateDisassembly for Store with 1 source reg
Currently, store instructions are assumed to have two source registers.
However, the RISC-V CMO instructions we now support, which are Store
instructions in gem5, have only one source register.
This change allows printing disassembly of Store instructions with
one source register.

Change-Id: I4dd7818c9ac8a89d5e10e77db72248942a25e938
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-30 21:44:18 +00:00
Hoa Nguyen
2521ba0664 arch-riscv: Fix implementation of CMO extension instructions
This change introduces a template for store instruction's mem access.
The new template is called CacheBlockBasedStore.

The reasons for not reusing the current Store's mem access template
are as follows,
- The CMO extension instructions operate on cache block size granularity,
while regular load/store instructions operate on data of size 64 bits or
fewer.
- The writeMemAtomicLE/writeMemTimingLE interfaces do not allow passing
nullptr as data. However, CPUs in gem5 rely on (data == NULL) to detect
CACHE_BLOCK_ZERO instructions. Setting `Mem = 0;` to `uint64_t Mem;`
does not solve the problem as the reference is allocated and thus,
it's always true that `&Mem != NULL`. This change uses the
writeMemAtomic/writeMemTiming interfaces instead.
- Per CMO v1.0.1, the instructions in the spec do not generate
address misaligned faults.
- The CMO extension instructions do not use IMM.

Change-Id: I323615639a4ba882fe40a55ed32c7632e0251421
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-30 21:44:18 +00:00
Hoa Nguyen
4fdfb96cad arch-riscv: Load function symbols for BootloaderKernelWorkload
Change-Id: Iade91b2cdf6701ed3fe6f5583127c8c3d669d695
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-30 21:36:34 +00:00
Hoa Nguyen
6eca83d0fb base: Add ability to generate SymbolTable by filtering SymbolType
This allows filtering out non function symbols.

Change-Id: I518c2842a6f04c4475240126ad64070a6de09df9
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-30 21:35:01 +00:00
Hoa Nguyen
697cab0544 base,sim: Add the SymbolType field to the Symbol object
Symbol type is part of the info provided by an ELF object's symtab.
It indicates whether a symbol is a file symbol, or a function symbol, etc.

Change-Id: I827e79f8439c47ac9e889734aaf354c653aff530
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-30 21:35:01 +00:00
Jason Lowe-Power
d0113185c6 arch-riscv: Dynamically add V extension to device tree (#464)
Currently, we are hardcoding the ISA string in the device tree
generator. The ISA string from the device tree affects which
ISA extensions will be used by the bootloader/kernel.

This function allows generating the ISA string from the gem5's
ISA object rather than using hardcoded values.

This series of changes also corrects a couple of hardcoded
RISC-V ISA strings in the standard library, and disables
RVV instructions for the U74 core model.

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-30 10:29:25 -07:00
Jason Lowe-Power
3d93584900 mem-ruby, stdlib: Far atomics fix (#514)
This PR is fixing https://github.com/gem5/gem5/issues/449 by applying
the following changes

1) Setting up alloc_on_atomic=False in the stdlib
This is directly related to the error message reported by the Issue #449

2) Disabling far atomics in stdlib with policy type = 0
There is an invalid transaction error, likely caused by the fact the
current implementation is expecting a 2 level cache hierarchy whereas
the stdlib example only allocates one level of caches (L1). This needs
further investigation

3) Explicitly clearing the atomic log
Even by disabling far atomics, the execution of atomicPartial was
populating the atomic log queue without ever clearing it. This caused
the OOM killer in Linux to detect the leak and to kill it when the
physical resources of the machine no longer sufficed. IMHO the atomic
log interface should be revamped as atomic users should be allocating
the atomic log only if explicitly needed
2023-10-30 09:59:49 -07:00
Hoa Nguyen
0218103162 arch-riscv: Correct BootloaderKernelWorkload symbol table (#511)
Currently, the kernel's symbols are shifted by `kernel_paddr_offset`,
which is where the kernel is located in the physical address space.
However, the symbols are mapped to virtual addresses, which stay the
same even though the physical address space is shifted.

This patch removes the offset for the kernel's symbols virtual
addresses.

Change-Id: I7c35f925777220f56bd8c69bba14c267d2048ade

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-30 09:56:10 -07:00
Giacomo Travaglini
d131ff488e arch-arm: Set UNCACHEABLE flag in Request in SE mode (#515)
As pointed out by [1], Arm doesn't seem to respect the cacheability
attribute when mapping uncacheable memory. This is because the request
is not tagged as uncacheable during SE translation. With this patch we
are checking for the cacheability attribute before finalizing
translation.

[1]: https://github.com/gem5/gem5/issues/509

Change-Id: I42df0e119af61763971d5766ae764a540055781b

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-30 10:43:22 +00:00
Giacomo Travaglini
1087041698 stdlib: Use near atomics in the CHI component nodes
This is a temporary solution to fix daily tests. We could revert
to the default (policy_type = 1) once the problem is properly
fixed

Change-Id: Ia80af9a7d84d5c777ddeb441110a91a1680c1030
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-29 09:26:26 +00:00
Giacomo Travaglini
1b05c0050b mem-ruby: Clear the atomic log from the DataBlock in CHI
The new far atomics implementation [1] didn't take into consideration
it was supposed to manually clear the atomic log. This caused a
memory leak where the log queue was getting bigger and bigger
as no cleaning was happening

[1]: https://github.com/gem5/gem5/pull/177

Change-Id: I4a74fbf15d21e35caec69c29117e2d98cc86d5ff
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-29 09:26:09 +00:00
Giacomo Travaglini
e496d29171 stdlib: Explicitly set alloc_on_atomic for the CHI example
gem5 will otherwise fatal with the error message:

fatal: ... alloc_on_atomic without default or user set value

See github issue [1] for further details

[1]: https://github.com/gem5/gem5/issues/449

Change-Id: I0bb8fccf0ac6d60fc6c1229436a35e91b2fb45cd
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-29 09:25:49 +00:00
Jason Lowe-Power
06bf783a85 arch-riscv: Move RVV implementation from header to source (#500)
Moving the implementation of the RVV template class definitions from
header to source can speed up the process of building gem5
2023-10-26 17:38:18 -07:00
Ivana Mitrovic
ecc248c3c1 misc: Fix spelling error in MAINTAINERS.yaml (#475) 2023-10-26 08:27:32 -07:00
Roger Chang
e561f3b6f1 arch-riscv: Move insts/vector from header to source
Move the implementation of the following classes from header to source
- VMaskMergeMicroInst
- VxsatMicroInst

Change-Id: I42ec45681064a0f599c3b2313c2125da7cfc849b
2023-10-26 18:04:58 +08:00
Roger Chang
62af678d5c arch-riscv: Move VArith implementations from header to source
Move VArith implementations from header_output to decoder_output
and exec_output respectively

Change-Id: I406eedbd9dd625aa939ec0e20aa29ef4f18ba79c
2023-10-26 18:04:58 +08:00
Roger Chang
605ec6899e arch-riscv: Move VMem implementation from header to source
Move the VMem implementation from header_output to
decoder_output and exec_output respectively.

Change-Id: I699e197f37f22a59ecb9f92a64b5e296d2e9f5fa
2023-10-26 18:04:58 +08:00
Andreas Sandberg
60290c7c2f cpu: Branch Predictor Refactoring (#455)
Major refactoring of the branch predictor unit.
- Clearer control flow of the main branch predictor
- Remove `uncondBranch` and `btbUpdate` functions in favor
  of a common `historyUpdate` function. There is now only
  one lookup function for conditional branches and the new
  `historyUpdate` for speculative history update.
- Added a new *target provider* class.
- More expressive statistics depending on the different branch
  types.
- Cleanup the branch history management
2023-10-26 09:15:11 +01:00
Hoa Nguyen
50196863a4 stdlib,dev: Fix several hardcoded RISC-V ISA strings
The "s" and "u" letters are not recognized by the Linux kernel as
RISC-V extensions [1].

[1] https://elixir.bootlin.com/linux/v6.5.7/source/arch/riscv/kernel/cpufeature.c#L170

Change-Id: I2a99557482cde6e6d6160626b3995275c41b1577
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-25 20:12:57 +00:00
Hoa Nguyen
dce8d07703 stdlib: Turn off RVV for U74 core
The U74 core doesn't support vector instructions.

Change-Id: Iadfb6b43ef8c62dcad23391e468a43b908e4a22c
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-25 20:12:57 +00:00
Hoa Nguyen
4f72f6172a stdlib: Use the ISA string generator in the RiscvBoard
Current hardcoded value does not support vector instructions.
The new ISA string generator function allows the flexibility
of using or not using the vector extension.

Change-Id: Ic78c4b6629ad3813fc172f700d77ea956552e613
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-25 20:12:57 +00:00
Hoa Nguyen
a47ca9dadc arch-riscv: Add a function generating the ISA string
Currently, we are hardcoding the ISA string in the device tree
generator. The ISA string from the device tree affects which
ISA extensions will be used by the bootloader/kernel.

This function allows generating the ISA string from the gem5's
ISA object rather than using hardcoded values.

Change-Id: I2f3720fb6da24347f38f26d9a49939484b11d3bb
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-25 20:12:55 +00:00
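
The ISA-string commits above (a47ca9dadc, 4f72f6172a, 50196863a4) describe building the device-tree ISA string from configuration rather than hardcoding it. The following is only a minimal sketch of that idea, not gem5's actual generator: the function name `make_isa_string`, its parameters, and the fixed "imafdc" extension set are assumptions made for illustration.

```python
def make_isa_string(xlen: int = 64, enable_rvv: bool = False) -> str:
    """Build a device-tree style RISC-V ISA string (hypothetical helper)."""
    if xlen not in (32, 64):
        raise ValueError("xlen must be 32 or 64")
    # Single-letter extensions assumed for this sketch.
    extensions = ["i", "m", "a", "f", "d", "c"]
    if enable_rvv:
        # "v" follows "c" in the canonical single-letter ordering.
        extensions.append("v")
    # Privilege-mode letters such as "s" and "u" are deliberately omitted,
    # since the commit notes the Linux kernel does not treat them as
    # extensions.
    return f"rv{xlen}" + "".join(extensions)


print(make_isa_string(64, enable_rvv=True))
print(make_isa_string(64, enable_rvv=False))
```

Turning `enable_rvv` off, as the U74 commit does for a core without vector support, simply drops the trailing "v".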
Bobby R. Bruce
b6ce2d0db8 misc: Add GitHub Runner API rate limiting (#497)
This stops the 'action-run.sh' from exhausting the GitHub API by adding
sleeps.
2023-10-24 14:51:31 -07:00
Bobby R. Bruce
b670ed9fba util: Add 'sudo' to rm WORK_DIR command (#496)
Unfortunately Actions uses docker containers to create files on the
system with root permissions. The 'vagrant' user, which we log in as to
run the Actions Runner, can't remove these files. However, 'vagrant' is
part of the sudo group and can therefore use sudo to remove these files.

I don't like this, but it works.
2023-10-24 14:51:19 -07:00
David Schall
ccbb85c67f cpu: Branch Predictor Refactoring
Major refactoring of the branch predictor unit.
- Clearer control flow of the main branch predictor
- Remove `uncondBranch` and `btbUpdate` functions in favour of a
  common `historyUpdate` function. There is now only one lookup
  function for conditional branches and the new `historyUpdate` for
  speculative history update.
- Added a new *target provider* class.
- More expressive statistics depending on the different branch types.
- Cleanup the branch history management

Change-Id: I21fa555b5663e4abad7c836fc1d41a9c8b205263
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-10-24 18:53:20 +00:00
Giacomo Travaglini
6ddf8c94ee arch-arm: Fix KVM Failed to set register (0x603000000013808c) (#486)
Some debug registers were incorrectly tagged
(e.g. as being writeable). This was causing a bug in some gem5-KVM runs
where gem5 was trying to initialize the state of those registers
(OSLSR_EL1) [1] but KVM was returning an error (as the registers were
RO).

[1]: https://github.com/gem5/gem5/blob/stable/\
    src/arch/arm/kvm/armv8_cpu.cc#L408
2023-10-20 11:30:19 -07:00
Boris Shingarov
8b78e87f1b misc: Integrate a Capstone Disassembler in gem5 (#494)
Capstone is an open source disassembler [1] already used by
other projects (like QEMU).

gem5 is already capable of disassembling instructions.  Every StaticInst
is supposed to define a generateDisassembly method which returns the
instruction mnemonic (opcode + operand list) as a string.

This "distributed" implementation of a disassembler relies
on the developer to properly populate the metadata fields
of the base instruction class.
The growing complexity of the ISA code and the massive reuse
of base classes beyond their intended use has led to a
disassembling logic which contains several bugs.

By allowing a tracer to rely on a third party disassembler, we fill the
instruction trace with a more trustworthy instruction stream.

This will make any trace parsing tool work better and it will
also allow us to spot/fix our own bugs by comparing instruction
traces with native vs custom disassembler.

[1]: http://www.capstone-engine.org/
2023-10-20 13:47:23 -04:00
Bobby R. Bruce
cb56c67a8b misc: Fix weekly-tests.yaml container uris (#488) 2023-10-20 09:39:12 -07:00
Giacomo Travaglini
b13102fcc4 scons: Explicit some config options HAVE_* to boolean type (#490)
The HAVE_* config options are used in conditional code and should be
of boolean type.
2023-10-20 11:39:48 +01:00
Giacomo Travaglini
8233aa8a9b arch-arm: Implement a CapstoneDisassembler for Arm
Change-Id: Id3135bda065efa9b4f3ab36972957fd00c05a53c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:51 +01:00
Giacomo Travaglini
82675648c8 cpu: Implement a CapstoneDisassembler
Capstone is an open source disassembler [1] already used by
other projects (like QEMU).

gem5 is already capable of disassembling instructions.  Every StaticInst
is supposed to define a generateDisassembly method which returns the
instruction mnemonic (opcode + operand list) as a string.

This "distributed" implementation of a disassembler relies
on the developer to properly populate the metadata fields
of the base instruction class.
The growing complexity of the ISA code and the massive reuse
of base classes beyond their intended use has led to a
disassembling logic which contains several bugs.

By allowing a tracer to rely on a third party disassembler, we fill the
instruction trace with a more trustworthy instruction stream.

This will make any trace parsing tool work better and it will
also allow us to spot/fix our own bugs by comparing instruction
traces with native vs custom disassembler.

[1]: http://www.capstone-engine.org/

Change-Id: I3c4db5072c03d2731265d0398d3863c101dcb180
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:51 +01:00
Giacomo Travaglini
34336208b7 arch-arm: Disassemble through InstDisassembler in TarmacTracer
Change-Id: I5407338501084c016522749be697dd688ca51735
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:51 +01:00
Giacomo Travaglini
27ce721ad3 arch-arm: Pass a reference of the parent tracer to TarmacContext
Change-Id: I7ab0442353a8b5854bb6b50bd54dac89f83ecc1d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:51 +01:00
Giacomo Travaglini
81b6e296dd arch-arm: disassemble member variable not used by TarmacParser
We move it to the child class which is what the TarmacTracer
actually uses.

Change-Id: Ia30892723d2e1f7306dae87c6c9c1d69d00ad73d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:50 +01:00
Giacomo Travaglini
237bbf0e42 cpu: Disassemble through the InstDisassembler in the ExeTracer
Change-Id: I4a0c585b9b8824a0694066bef0ee004f68407111
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:50 +01:00
Giacomo Travaglini
952c4f5eea cpu: Pass a reference of the parent tracer to the ExeTracerRecord
Change-Id: I3576df2b7bee1289db60bb6072bd9c90038ca8ce
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:50 +01:00
Giacomo Travaglini
2d85707a75 sim: Define an InstructionDisassembler SimObject
We want to be able to configure from python the disassembler
used by an instruction tracer. The default/base version will
reuse existing instruction logic and it will simply
call the StaticInst::disassemble method.

Change-Id: Ieb16f059a436757c5892dcc82882f6d42090927f
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:50 +01:00
Roger Chang
069baed971 scons: Explicit the config option HAVE_PROTOBUF type boolean
Ensure the type of HAVE_PROTOBUF is boolean

Change-Id: I9cf18c52ac290000168f5228b7f4ba3621225a85
2023-10-20 11:46:02 +08:00
Roger Chang
1a7014c653 scons: Explicit the config option HAVE_PKG_CONFIG type boolean
The scons function Detect returns the program name if the program
exists on the system. However, HAVE_PKG_CONFIG is used to check
whether the pkg-config program exists, so it should be of boolean
type.

Change-Id: I18c4813d36eea68b8851a41db41777bdb2a80b7b
2023-10-20 11:42:20 +08:00
Roger Chang
fe20f4ada6 scons: Explicit the config option HAVE_DEPRECATED_NAMESPACE type bool
Currently HAVE_DEPRECATED_NAMESPACE is used to detect whether the
compiler supports the gnu::deprecated feature. The return type
of conf.TryCompile is int, but HAVE_DEPRECATED_NAMESPACE is used
as a boolean. This CL adds a bool cast to ensure that its type
is boolean.

Change-Id: Ife7d9716e485a8be8722d58776f064e7c2268a30
2023-10-20 11:41:53 +08:00
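
The three scons commits above all normalise a probe result to a real boolean. A minimal sketch of why that matters follows; the literal values stand in for what the commits say `Detect` and `conf.TryCompile` return (a program name and an int respectively), the "not found" value of `None` is an assumption, and `as_bool_option` is a hypothetical helper, not part of SCons or gem5.

```python
def as_bool_option(raw_result) -> bool:
    """Normalise a probe result (str, int or None) to a strict bool."""
    return bool(raw_result)


# Stand-ins for probe results; no SCons call is made here.
detect_found = "pkg-config"   # program name when the tool is found
detect_missing = None         # assumed value when the tool is absent
try_compile_ok = 1            # int result of a successful compile test
try_compile_failed = 0

for raw in (detect_found, detect_missing, try_compile_ok, try_compile_failed):
    print(repr(raw), "->", as_bool_option(raw))

# Without the cast, an identity or equality check against True misbehaves:
print(detect_found is True)            # False, although the tool exists
print(as_bool_option(detect_found) is True)
```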
Bobby R. Bruce
531067fffa mem,tests: Set Ruby Mem Test atomic percent to 0 (#489)
Fixes https://github.com/gem5/gem5/issues/450
(https://github.com/gem5/gem5/pull/477 fixes non-ruby memtests, so only
a partial fix).
2023-10-19 15:38:38 -07:00
Jason Lowe-Power
73c48a4828 arch-riscv: Add dynamic VLEN and ELEN configuration support to RVV path (#171)
At this moment, VLEN and ELEN RVV parameters are set as constants that
need to be modified at compile time if you want to experiment with
different values. With this patch, I want to set a first point to
discuss how to configure these parameters dynamically.

Also, I have modified some data types that were provoking wrong
behaviour in particular instructions when using a large enough VLEN
value in the considered range inside the specification.
2023-10-19 07:41:39 -07:00
Melissa Jost
34314b3f92 misc: Add LULESH GPU tests (#256)
Adds the LULESH GPU Tests to our GitHub Actions infrastructure

Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Co-authored-by: Harshil Patel <harshilp2107@gmail.com>
2023-10-18 22:14:39 -07:00
Bobby R. Bruce
62e5198796 docker-images: Use GitHub Container Registry (#418)
This PR aims to enhance our Docker image build and registry management
by implementing multi-platform support and migrating from the Google
Docker registry to the GitHub Container Registry. Issue:
[#336](https://github.com/gem5/gem5/issues/336)
2023-10-18 22:08:01 -07:00
Alvaro Moreno
edf1d69257 arch-riscv: Define vlwhole/vswhole mem acceses using vlen.
This patch fixes the size of the memory accesses in vswhole and
vlwhole instructions to the maximum vector length.

Change-Id: Ib86b5356d9f1dfa277cb4b367893e3b08242f93e
2023-10-19 00:27:58 +02:00
Adrià Armejach
bfb295ac3f util: cpt_upgrader fix vregs size for #PR171
* Make cpt_upgrader set vregs of size MaxVecLenInBytes

Change-Id: Ie7e00d9bf42b705a0fb30c9d203933fc2e9bdcd9
2023-10-19 00:27:58 +02:00
Alvaro Moreno
52219e5e6f arch-riscv: Add elen configuration to vector config instructions
This patch adds elen as a member of vector configuration instructions so it can be used with speculative execution.

Change-Id: Iaf79015717a006374c5198aaa36e050edde40cee
2023-10-19 00:27:58 +02:00
Alvaro Moreno
2c9fca7b60 arch-riscv: Add vlen configuration to vector instructions
First, vlen is added as a member of Vector Macro Instructions
where it is needed to split the instruction into Micro Instructions.

Then, new PCState methods are used to get dynamic vlen and vlenb
values at execution.

Finally, vector length data types are fixed to 32 bits so every vlen value
is considered.

Change-Id: I5b8ceb0d291f456a30a4b0ae2f58601231d33a7a
2023-10-19 00:27:58 +02:00
Alvaro Moreno
8a20f20f79 arch-riscv: Add vlen component to decoder state
This patch adds the vlen definition to the riscv decoder so it can be used in Vector Instruction Constructors.

Change-Id: I52292bc261c43562b690062b16d2b323675c2fe0
2023-10-19 00:27:58 +02:00
Alvaro Moreno
5d97cb8b0b arch-riscv: Define VLEN and ELEN through the ISA object
This commit defines the VLEN and ELEN values as parameters of the RiscvISA class.

Change-Id: Ic5b80397d316522d729e4db4f906aa189f27a491
2023-10-19 00:27:58 +02:00
Alvaro Moreno
57e0ba7765 arch-riscv: Define VecRegContainer with maximum expected length
This patch redefines VecRegContainer for RISCV so it can hold every possible VLEN + ELEN configuration used at execution time.

Change-Id: Ie6abd01a1c4ebe9aae3d93f4e835fcfdc4a82dcd
2023-10-19 00:27:58 +02:00
Bobby R. Bruce
be89758f0e misc: Add additional pre-commit hook checks (#420)
Adds the following hooks:

1. `check-ast`: Verifies all Python files have an AST indicating they are
valid Python.
2. `check-merge-conflict`: Checks to see if files have merge conflict
strings and blocks commits if so.
3. `check-symlinks`: Checks that symlinks in the repo still point to a
valid location.
4. `destroyed-symlinks`: Checks if a symlink is replaced with a file and
if that file is identical to the file it was previously pointing to.

None of these commits change any code. They are all checks to ensure bad
code is not committed.
2023-10-18 12:21:22 -07:00
Hoa Nguyen
c3acfdc9b8 arch-riscv: Copy Misc Regs when swiching cpus (#479)
Misc Regs might contain rather important information about the state of
a core, e.g., information in CSR registers.

This patch enforces copying the CSR registers when switching cpus. The
bug and the proposed fix are reported here [1].

[1] https://github.com/gem5/gem5/issues/451

Change-Id: I611782e6e3bcd5530ddac346342a9e0e44b0f757

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-18 10:51:37 -07:00
Harshil Patel
7bd0b99635 tests: Changed percent atomics to 0 in memtest to fix daily test (#477) 2023-10-18 10:09:45 -07:00
Bobby R. Bruce
334df18dce arch-riscv: Add bootloader+kernel workload (#390)
Aims to boot OpenSBI + Linux kernel.
2023-10-18 09:17:12 -07:00
Bobby R. Bruce
e9fe9cb001 util: Improve GitHub Action runners: Enable KVM; Better Cleanup; Better Tooling (#470)
This PR adds the following to the GitHub Actions runners:

1. Enables KVM to be run within docker containers within the VMs, if
permitted. Now, any Docker containers wanting to use KVM must create
containers with the `--device /dev/kvm` argument. This may make it hard
or impossible to utilize with GitHub Actions. Nonetheless it is enabled.
2. Improves the docker prune step (a cleanup step carried out after each
job) so it now removes all the Docker images in the VM.
3. Adds the "halt-helper.sh" script which automatically, and safely,
halts (shuts down) all the VMs so maintenance tasks can be undertaken.
2023-10-17 11:38:40 -07:00
Andreas Sandberg
42d1c8b3c3 cpu: Restructure RAS (#428) 2023-10-17 19:14:13 +01:00
David Schall
5387e67114 cpu: Restructure RAS
The return address stack (RAS) is restructured to be a separate SimObject.
This enables disabling the RAS and better separation of the functionality.
It also makes statistics and debugging easier.

Change-Id: I8aacf7d4c8e308165d0e7e15bc5a5d0df77f8192
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-10-17 15:30:56 +00:00
Bobby R. Bruce
3783afff5d util: Enable KVM on VMs and ensure working in Runners
This patch:

1. Adds setup scripting to "provision_root.sh" to setup and enable KVM,
   for the 'vagrant' user, for VMs which are capable of this.
2. Runs a check on each VM to see if KVM can be run successfully within a
   docker container. If so, the GitHub Actions runner is given a 'kvm'
   label. It is unknown at this time if GitHub Runners can utilize KVM
   but it is open to their processes.

Change-Id: Idfcbb7bfa3e5b7cc47d29aea50fb1ebcafdb7acc
2023-10-16 21:21:31 -07:00
Bobby R. Bruce
d18087af96 util: Add halt-helper.sh
This script helps us safely halt vagrant VMs.

Change-Id: I2f2f36b93f82e07756d069334db178604a9915b3
2023-10-16 21:14:09 -07:00
Bobby R. Bruce
adb5470996 arch-arm: Fix (other) line-length errors (#468)
https://github.com/gem5/gem5/pull/459 missed a couple.
This commit should complete the task.
2023-10-16 17:47:46 -07:00
Bobby R. Bruce
4b9c4e1e17 misc: Add --all to Runner docker system prune
Without `--all`, `docker system prune --force --volumes` will remove
everything except non-dangling images. For an image to be considered dangling it
must be untagged and/or not used by a container at that time. As most of
the images we download are tagged (e.g., `:latest`) then most of our
images are never removed without the inclusion of `--all` which will
remove any image not currently used by a container.

Images were starting to accumulate on runners. This will ensure they do
not and are cleaned after each job run.

Change-Id: I6d8441a11d22fdcf827e9c44422dbcf02cf600e0
2023-10-16 13:33:30 -07:00
Bobby R. Bruce
cfef2ac23b util-docker: Fix end-of-line error in docker-bake.hcl
Change-Id: I2c792f35d8c74e29cf0dc0bc1287b6b5f3e4d6c8
2023-10-16 12:06:01 -07:00
Ivana Mitrovic
cb078f14c6 docker-bake: Changed compilers names to be more descriptive 2023-10-16 11:53:45 -07:00
Ivana Mitrovic
45df1dbb55 docker-images: Changed path from Google Registry to GitHub
Replaced all instances of the Google Docker registry
(gcr.io/gem5-test/) with the GitHub Docker registry (ghcr.io/gem5/).
2023-10-16 11:53:27 -07:00
Ivana Mitrovic
5b721b033f docker-bake: modified .hcl file
Migrated all the image build definitions from docker-compose.yaml to the bake file.
2023-10-16 11:47:49 -07:00
Ivana Mitrovic
df471092d9 dockerfiles: multi-platform setup (#336)
Updated Dockerfiles to work with multi-platform setups
2023-10-16 11:47:49 -07:00
Bobby R. Bruce
aaefda3b08 arch-arm: Fix line-length error in branch64.is
Change-Id: I62c5d5fd47927a838e6731a464fc7e6d8afab768
2023-10-16 10:57:03 -07:00
Hoa Nguyen
d048ad34d6 arch-riscv: Change to VS bits to DIRTY for rvv insts changing vregs (#376)
This is similar to [1] and [2].

Essentially, the VS bits of STATUS CSR keep track of the state of
the vector registers. (VS bits == DIRTY) means the content of vector
registers have been updated since the last time the VS bits were
updated.

This chain of changes is supposed to change the VS bits to DIRTY if any
vector register is potentially updated.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/pull/370

Change-Id: I0427890dadc63b74a470d7405807dcfcad18005b
2023-10-16 10:07:40 -07:00
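
The VS-bits commit above (d048ad34d6) marks the vector state as DIRTY whenever a vector register may have been written. A minimal sketch of that bit manipulation follows. It assumes the field layout from the RISC-V privileged specification as I understand it (VS in bits 10:9 of the status CSR, with Off=0, Initial=1, Clean=2, Dirty=3); the helper names are hypothetical and this is not gem5's code.

```python
VS_SHIFT = 9                      # assumed position of the VS field
VS_MASK = 0b11 << VS_SHIFT        # two-bit field
VS_OFF, VS_INITIAL, VS_CLEAN, VS_DIRTY = 0, 1, 2, 3


def get_vs(status: int) -> int:
    """Extract the VS field from a status CSR value."""
    return (status & VS_MASK) >> VS_SHIFT


def set_vs(status: int, state: int) -> int:
    """Return status with the VS field replaced by state."""
    return (status & ~VS_MASK) | ((state & 0b11) << VS_SHIFT)


def after_vector_write(status: int) -> int:
    """Model an instruction that may update a vector register."""
    return set_vs(status, VS_DIRTY)


status = set_vs(0, VS_CLEAN)
status = after_vector_write(status)
print(get_vs(status) == VS_DIRTY)
```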
Yu-Cheng Chang
2825bc1d55 misc: Add missing RISCV valid ISA option to README.md (#462)
The list of valid ISA options should be same as the website:
https://www.gem5.org/documentation/general_docs/building

Change-Id: Id5ace5b0356ec35634caec5b11159551801c0615
2023-10-16 09:45:28 -07:00
Hoa Nguyen
9b2b6cd8d2 arch-riscv: Mark vector configuration insts as vector insts (#463) 2023-10-16 09:40:09 -07:00
Bobby R. Bruce
a9464a41f5 stdlib,resources: Generalize exception for request retry (#466)
In commit bbc301f2f0 the generalized
`Exception` was changed back to the more specific `HTTPError`.

In this case we do not desire specific error handling. If the connection
to the database fails I want the exception handled in the way outlined:
i.e., I want the connection to be retried 4 times before giving up. With
`HTTPError`, only `HTTPError`s warrant a retry.

Changing this to `HTTPError` caused tests to fail due to a failure to
retry downloading of a resource. Here is an example:
https://github.com/gem5/gem5/actions/runs/6521543885/job/17710779784

In this case `request.urlopen` raised a `URLError`. I suspect this was
some issue to do with reaching the DNS servers. It likely would've
succeeded if it had just tried again.
2023-10-16 09:39:44 -07:00
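
The retry commit above (a9464a41f5) argues for catching the general `Exception` so that a `URLError` is retried as well as an `HTTPError`. The sketch below illustrates that behaviour only; `fetch_with_retry`, `FlakyFetcher`, and the `max_attempts` and `delay_seconds` parameters are hypothetical and do not reproduce the stdlib code in gem5.

```python
import time
from urllib.error import HTTPError, URLError


def fetch_with_retry(fetch, max_attempts=4, delay_seconds=0.0):
    """Call fetch() until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception as err:  # deliberately broad: retry on any failure
            last_error = err
            if attempt + 1 < max_attempts:
                time.sleep(delay_seconds)
    raise last_error


class FlakyFetcher:
    """Test double that raises URLError a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise URLError("temporary name resolution failure")
        return "payload"


flaky = FlakyFetcher(failures=2)
print(fetch_with_retry(flaky), flaky.calls)
# HTTPError derives from URLError, not the other way round, so an
# "except HTTPError" clause would not have caught the error above.
print(issubclass(HTTPError, URLError), issubclass(URLError, HTTPError))
```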
Bobby R. Bruce
322b105b9d arch-arm: Fix (another) line-length error in misc.cc
https://github.com/gem5/gem5/pull/459 missed one.
This commit should complete the task.

Change-Id: I0aeba79d6f13ddc45effe141945f5636b75daecc
2023-10-16 09:37:51 -07:00
Bobby R. Bruce
5240c07d3c util: Fix runners to extent to max disk size (#460)
The `lvextend` command extends the logical volume. However, the
`resize2fs` command is needed to extend the filesystem to fill the
logical volume.

Prior to this patch the filesystem ran out of space despite there being
enough room in the volume. This was just wasted free space.
2023-10-16 09:20:13 -07:00
Bobby R. Bruce
97f4b44dd3 arch-arm: Fix line-length error in misc.cc (#459) 2023-10-16 08:35:54 -07:00
Giacomo Travaglini
f9cf8bf8a2 cpu, arch-arm: Add IsPseudo tag for gem5 pseudo instructions (#465)
This only applies to pseudo instructions with their own encoding (m5
ops). In other words, memory-mapped m5 operations are not supported.
This makes sense as they should rather be treated as device accesses,
though it is something to take into consideration when relying on the
flag.
2023-10-16 16:15:05 +01:00
Bobby R. Bruce
d42eeb6b68 cpu: Explicitly define cache_line_size -> 64-bit unsigned int (#329)
While it's plausible to define the cache_line_size as a 32-bit unsigned
int, the use of cache_line_size is way out of its original scope.

cache_line_size has been used to produce an address mask, which masks
out the offset bits from an address. For example, [1], [2], [3], and
[4]. However, since the cache_line_size is an "unsigned int", the type
of the value is not guaranteed to be 64-bit long. Subsequently, the bit
twiddling hacks in [1], [2], [3], and [4] produce a 32-bit mask, i.e.,
0x00000000FFFFFFC0.

This behavior at least caused a problem in LLSC in RISC-V [5], where the
load reservation (LR) relies on the mask to produce the cache block
address. Two distinct 64-bit addresses can be mapped to the same cache
block using the above mask.

This patch explicitly defines cache_line_size as a 64-bit unsigned int
so the cache block mask can be produced correctly for 64-bit addresses.

[1]
3bdcfd6f7a/src/cpu/simple/atomic.hh (L147)
[2]
3bdcfd6f7a/src/cpu/simple/timing.hh (L224)
[3]
3bdcfd6f7a/src/cpu/o3/lsq_unit.cc (L241)
[4]
3bdcfd6f7a/src/cpu/minor/lsq.cc (L1425)
[5]
3bdcfd6f7a/src/arch/riscv/isa.cc (L787)
2023-10-16 07:50:35 -07:00
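
The cache_line_size commit above (d42eeb6b68) describes how a mask computed in 32 bits aliases distinct 64-bit addresses. A minimal sketch of the effect follows, emulating fixed-width unsigned arithmetic with explicit masks; the helper `block_mask` and the two example addresses are chosen purely for illustration.

```python
MASK_32 = 0xFFFFFFFF
MASK_64 = 0xFFFFFFFFFFFFFFFF


def block_mask(cache_line_size: int, width_mask: int) -> int:
    """Compute ~(cache_line_size - 1) truncated to the given width."""
    return ~(cache_line_size - 1) & width_mask


# A 32-bit result is zero-extended when applied to a 64-bit address.
narrow = block_mask(64, MASK_32)
wide = block_mask(64, MASK_64)
print(f"{narrow:#018x}")
print(f"{wide:#018x}")

# Two distinct 64-bit addresses that differ only above bit 31.
addr_a = 0x1_0000_0040
addr_b = 0x2_0000_0040

# With the narrow mask both collapse onto the same cache block address.
print((addr_a & narrow) == (addr_b & narrow))
# With the 64-bit mask they remain distinct blocks.
print((addr_a & wide) == (addr_b & wide))
```

This is the aliasing the commit reports for the RISC-V load reservation: the upper address bits are discarded by the narrow mask before the block addresses are compared.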
Jason Lowe-Power
d702d3b90a misc: fix clang13 overloaded-virtual warning (#454)
Like #363 clang is also unhappy about the overloaded virtual. However,
clang needs to have the diagnostic in a different place

Fixes #437
2023-10-16 07:23:08 -07:00
Giacomo Travaglini
3f925c4084 arch-arm: Mark gem5 pseudo-ops with IsPseudo flag
Change-Id: I9c8a146d73596597f28cdeca22ad7b7b01b381a7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-16 13:42:23 +01:00
Giacomo Travaglini
a3b1bfdbf0 cpu: Add a IsPseudo StaticInstFlag for gem5 pseudo-ops
Being able to recognise pseudo ops from the static instruction
pointer is actually quite useful in several circumstances

Change-Id: Ib39badf9aabba15ab3ebe7a8e9717583412731e4
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-16 13:41:04 +01:00
Giacomo Travaglini
2e85c95f4b arch-arm: Remove Jazelle state + ThumbEE support (#364)
This PR removes Jazelle state (while still keeping a "Trivial Jazelle
implementation", see Arm Architecture Reference Manual) and ThumbEE
support.
2023-10-16 09:41:44 +01:00
Jason Lowe-Power
20f5555f30 python: Enable -m switch on gem5 binary (#453)
With -m, you can now run a module from the command line that is embedded
in the gem5 binary.
This will allow us to put some common "scripts" in the stdlib instead of
in the "configs" directory.
2023-10-14 20:08:06 -07:00
Matthew Poremba
ca2592d3ba configs: Fix missing param exchange for GPUFS (#457)
PR #367 adds an option to configs/ruby/GPU_VIPER.py that was not added
to the corresponding dGPU equivalent for GPUFS, and thus all GPUFS runs
are failing. Fixed in this patch.
2023-10-14 20:07:39 -07:00
Daniel Kouchekinia
4931fb0010 mem-ruby: Always pass on GPU atomics to dir in write-through TCC (#367)
Added checks to ensure that atomics are not performed in the TCC when it
is configured as a write-through cache. Also added SLC bit overwrite to
ensure the directory performs atomics when there is a write-through TCC.

Change-Id: I4514e6c8022aeb7785f2c59871cd9acec8161ed8
2023-10-14 06:39:50 -07:00
Yu-Cheng Chang
a3c51ca38c arch-riscv: Fix write back register issue of vmask_mv_micro (#443)
After the removal of setRegOperand in VecRegOperand
(https://github.com/gem5/gem5/pull/341), vmask_mv_micro no longer writes
back to the register because tmp_d0 is not a reference type. This PR
makes tmp_d0 a reference to regFile.

Change-Id: I2a934ad28045ac63950d4e2ed3eecc4a7d137919
2023-10-13 15:20:42 -07:00
Matthew Poremba
7706e958e5 mem-ruby: Update cache recorder to use RubyPort and remove BUILD_GPU guards (#448)
This PR updates cache recorder to use a vector of RubyPorts for cache
cooldown and warmup instead of Sequencer or GPUCoalescer vectors (refer
to issue #403 for more details). It also removes the extra guards that
were added in #377 to prevent compile-time failures in non-GPU builds.
2023-10-13 14:36:45 -07:00
Kaustav Goswami
68af3f45c9 tests: updated the nightly tests to use SST 13.0.0 (#441)
PR https://github.com/gem5/gem5/pull/396 updates the gem5 SST bridge to
use StandardMem in SST. This change updates the nightly tests to use SST
13.0.0 instead of SST 11.1.0. It also updates the dockerfile.

Change-Id: I5c109c40379d2f09601a1c9f19c51dd716c6582e

---------

Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>
Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
2023-10-13 14:31:35 -07:00
Andreas Sandberg
59f96deb0f cpu: Refactor indirect predictor (#429) 2023-10-13 11:35:02 +01:00
Giacomo Travaglini
1c45cdcc41 arch-arm: Remove legacy ThumbEE references
ThumbEE had already been removed but there were still some
references to it dangling around. We were also signaling
ThumbEE as being available through HWCAPS in SE which
was not correct. This patch is fixing it

Change-Id: I8b196f5bd27822cd4dd8b3ab3ad9f12a6f54b047
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-13 09:25:48 +01:00
Giacomo Travaglini
a33f3d3967 arch-arm: Remove Jazelle state support
Jazelle state has been officially removed in Armv8.
Every AArch32 implementation must still support the
"Trivial Jazelle implementation", which means that while
the instruction set has been removed, it is still possible
for privileged software to access some Jazelle registers
like JIDR, JMCR, and JOSCR, which are just treated as RAZ.

Change-Id: Ie403c4f004968eb4cb45fa51067178a550726c87
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-13 09:25:48 +01:00
Vishnu Ramadas
8d54a5cbab mem-ruby: Remove BUILD_GPU guards from ruby coalescer models
A previous commit added BUILD_GPU guards to gpu coalescer models since
a related cache recorder commit added GPU support. This is no longer
needed since the cache recorder moved to using a vector of RubyPorts
instead of Sequencer/GPUCoalescer pointers. This commit removes
BUILD_GPU guards from the Ruby coalescer models

Change-Id: I23a7957d82524d6cd3483d22edfb35ac51796eca
2023-10-12 14:53:29 -05:00
Vishnu Ramadas
08c1af1b16 mem-ruby: Use RubyPort vector to access Ruby in cache recorder
Previously, the cache recorder used a vector of sequencer pointers to
access Ruby objects. A recent commit updated the cache recorder to also
maintain a vector of GPUCoalescer pointers in order for GPUs to support
flushing. This added redundant code to the cache recorder. This commit
replaces the sequencer and GPUCoalescer vectors with a vector of
RubyPort pointers so that the code does not contain redundant lines.

Change-Id: Id5da33fb870f17bb9daef816cc43c0bcd70a8706
2023-10-12 14:49:06 -05:00
Bobby R. Bruce
3455d9e68d misc,tests: Add dummy jobs to workflows for status checks (#444)
With this change we can block PRs if the compiler, daily, and/or weekly
tests are failing by setting these dummy jobs as required status checks.
2023-10-12 11:14:08 -07:00
Bobby R. Bruce
bf1c10d4b2 tests,misc: Update CI Tests 'testlib-quick' runs-on
Here it's more sensible to use a GitHub hosted runner. This job is
miniscule and is used to check the other tests have completed
successfully. It makes sense for this not to be on our own self-hosted
runner.

Change-Id: I5377e025334d43eaedd0fc61e5c708ba61255d28
2023-10-12 07:37:11 -07:00
Bobby R. Bruce
3816ea5633 misc,tests: Add dummy jobs to workflows for status checks
Change-Id: I52e42b6f93cfbb1a8e4800a3f6e264d49bebb06c
2023-10-12 07:37:04 -07:00
Matthew Poremba
4d336c0636 arch-vega: Implement buffer_atomic_cmpswap (#439)
This is a standard compare and swap but implemented on vector memory
buffer instructions (i.e., it is the same as FLAT_ATOMIC_CMPSWAP with
MUBUF's special address calculation).

This was tested using a Tensile kernel, a backend for rocBLAS, which is
used by PyTorch and Tensorflow. Prior to this patch both ML frameworks
crashed. With this patch they both make forward progress.

Change-Id: Ie76447a72d210f81624e01e1fa374e41c2c21e06
2023-10-12 07:33:40 -07:00
Matthew Poremba
7bae5464dc arch-vega: Ignore s_setprio instruction instead of panic (#438)
This instruction is used by ML frameworks to prioritize certain
wavefronts. Since gem5 does not have any support for wavefront
scheduling based on priority (besides wavefront age), we ignore this
instruction and warn_once rather than calling panic. Since hardware can
override this priority anyways, we can be sure that ignoring the value
will not inhibit forward progress resulting in application hangs.

Change-Id: Ic5eef14f9685dd2b316c5cf76078bb78d5bfe3cc
2023-10-12 07:32:58 -07:00
Matthew Poremba
f7ad8fe435 configs: GPUFS option to disable KVM perf counters (#433)
Add a --no-kvm-perf option to disable KVM perf counters for GPUFS
scripts. This is useful for users who have KVM enabled but configured
with more restrictive settings, which seems to be the default in newer
Linux distros.

Change-Id: I7508113d0f7c74deb21ea7b2770522885a0ec822
2023-10-11 14:20:27 -07:00
Matthew Poremba
4b7f25fcb6 arch-vega: Ignore s_setprio instruction instead of panic
This instruction is used by ML frameworks to prioritize certain
wavefronts. Since gem5 does not have any support for wavefront
scheduling based on priority (besides wavefront age), we ignore this
instruction and warn_once rather than calling panic. Since hardware can
override this priority anyways, we can be sure that ignoring the value
will not inhibit forward progress resulting in application hangs.

Change-Id: Ic5eef14f9685dd2b316c5cf76078bb78d5bfe3cc
2023-10-11 15:55:16 -05:00
Matthew Poremba
4b85a1710e arch-vega: Implement buffer_atomic_cmpswap
This is a standard compare and swap but implemented on vector memory
buffer instructions (i.e., it is the same as FLAT_ATOMIC_CMPSWAP with
MUBUF's special address calculation).

This was tested using a Tensile kernel, a backend for rocBLAS, which is
used by PyTorch and Tensorflow. Prior to this patch both ML frameworks
crashed. With this patch they both make forward progress.

Change-Id: Ie76447a72d210f81624e01e1fa374e41c2c21e06
2023-10-11 15:42:50 -05:00
Bobby R. Bruce
c855dbf7c5 configs,ext: Updated the gem5 SST Bridge to use SST 13.0.0 (#396)
This change updates the gem5 SST Bridge to use SST 13.0.0. Changes are
made to replace SimpleMem class to StandardMem class as SimpleMem will
be deprecated in SST 14 and above. In addition, the translator.hh is
updated to translate more types of gem5 packets. A new parameter `ports`
was added on SST's side when invoking the gem5 component which does not
require recompiling the gem5 component whenever a new outgoing bridge is
added in a gem5 config.
2023-10-11 13:34:48 -07:00
Bobby R. Bruce
70b6b53e54 misc,python: Add pyupgrade to pre-commit (#424)
This adds the [pyupgrade](https://github.com/asottile/pyupgrade) hook to
pre-commit.

This hook automatically upgrades the syntax to the recommended standards
for the newer version of the language.
2023-10-11 09:07:09 -07:00
Matthew Poremba
da11427ba6 gpu-compute: Update tokens for flat global/scratch (#408)
Memory instructions acquire coalescer tokens in the schedule stage.
Currently this is only done for buffer and flat instructions, but not
flat global or flat scratch. This change now acquires tokens for flat
global and flat scratch instructions. This provides back-pressure to the
CUs and helps to avoid deadlocks in Ruby.

The change also handles returning tokens for buffer, flat global, and
flat scratch instructions. This was previously only being done for
normal flat instructions leading to deadlocks in some applications when
the tokens were exhausted.

To simplify the logic, this change adds a needsToken() method to GPUDynInst which
returns whether the instruction is a buffer or any flat segment instruction.

The waitcnts were also incorrect for flat global and flat scratch. We
should always decrement vmem and exp count for stores and only normal
flat instructions should decrement lgkm. Currently vmem/exp are not
decremented for flat global and flat scratch which can lead to deadlock.
This change set fixes this by always decrementing vmem/exp and lgkm only
for normal flat instructions.

Change-Id: I673f4ac6121e4b5a5e8491bc9130c6d825d95fc5
2023-10-11 09:00:10 -07:00
Andreas Sandberg
891250192d arch-arm: Implement FEAT_TCR2 and FEAT_SCTLR2 (#416)
This is simply adding the new Armv8.9 registers defined in the related
features:

- FEAT_TCR2
- FEAT_SCTLR2
2023-10-11 10:14:31 +01:00
David Schall
f65df9b959 cpu: Refactor indirect predictor
Simplify the indirect predictor interface. Several of the existing
functions were merged together into four clear ones. Those
four are similar to the main direction predictor interface:
'lookup', 'update', 'squash' and 'commit'. This makes the
interface much clearer, allows better functionality isolation
and makes it simpler to develop new predictor models.

A new parameter is added to allow additional buffer space for
speculative path history.

Change-Id: I6d6b43965b2986ef959953a64c428e50bc68d38e
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-10-11 07:50:32 +00:00
Bobby R. Bruce
c4156b06fb python: Fix base logic in MetaSimObject
This ensures `class Foo` is considered equivalent to `class
Foo(object)`.
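
The equivalence this fix relies on can be checked in plain Python. This is only an illustration of the language rule, not gem5's MetaSimObject code; the class names `Implicit` and `Explicit` are made up for the example.

```python
# Plain Python illustration; this is not gem5's MetaSimObject code.
class Implicit:
    pass


class Explicit(object):
    pass


# In Python 3 both spellings yield the same base-class tuple.
print(Implicit.__bases__ == Explicit.__bases__ == (object,))
```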

Change-Id: I65a8aec27280a0806308bbc9d32281dfa6a8f84e
2023-10-10 21:47:08 -07:00
Bobby R. Bruce
298119e402 misc,python: Run pre-commit run --all-files
Applies the `pyupgrade` hook to all files in the repo.

Change-Id: I9879c634a65c5fcaa9567c63bc5977ff97d5d3bf
2023-10-10 21:47:07 -07:00
Bobby R. Bruce
83af4525ce misc,python: Add pyupgrade hook to pre-commit
This hook automatically upgrades the syntax to recommended standards for
new versions of the language.

These are numerous and are outlined here:
https://github.com/asottile/pyupgrade

Change-Id: I73fc58a08160ed9a21cfa3b3e023c259a84592ba
2023-10-10 21:43:39 -07:00
Bobby R. Bruce
3f5d7d647a misc: Run pre-commit autoupdate (#419)
1. Runs `pre-commit autoupdate`.
2. Runs `pre-commit run --all-files`.
3. Adds (2.) to ".git-blame-ignore-rev".
2023-10-10 21:41:33 -07:00
Bobby R. Bruce
592fbae2f5 python,misc: Add destroyed-symlinks hook to pre-commit
This hook detects which symlinks are changed to regular files with the
content of a path which that symlink was pointing to.

Change-Id: Ic925f02debc65c7c04e6d4cc3a25415b30858977
2023-10-10 21:40:28 -07:00
Bobby R. Bruce
768e488a6b python,misc: Add check-symlinks hook to pre-commit
This hook checks that symlinks in the repo still point to a valid
location.

Change-Id: I350760800406d9c003e81236af8248c6fc0a7359
2023-10-10 21:39:32 -07:00
Bobby R. Bruce
5b5c5d09dd python,misc: Add check-merge-conflict hook to pre-commit
This hook checks whether files contain merge conflict strings and
blocks the commit if so.

Change-Id: I8687e0a8367d3c43133890001023e0352954d90d
2023-10-10 21:38:48 -07:00
Bobby R. Bruce
132ec10818 python,misc: Add check-ast hook to pre-commit
This verifies all Python files have an AST indicating they are valid Python.
No file in the repo fails this test, so it triggered no changes.

Change-Id: Ifd7998268df6be766d92c19cfc7f1cfdf8ed103e
2023-10-10 21:37:45 -07:00
Bobby R. Bruce
d559c24ac2 stdlib: Improve handling of errors in Atlas request failures (#404)
Now:

* The Atlas Client will attempt a connection 4 times, using an
exponential backoff approach between attempts.
* When a failure does arise a rich output is given so problems can be
easily diagnosed.

Addresses: #340
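
A minimal sketch of the retry pattern described above, assuming a doubling delay between attempts. The function name `call_with_backoff`, its parameters, and the one-second base delay are hypothetical and are not taken from the gem5 Atlas client.

```python
import time


def call_with_backoff(request, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call request() up to `attempts` times, doubling the wait each time."""
    errors = []
    for attempt in range(attempts):
        try:
            return request()
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"attempt {attempt + 1}: {exc!r}")
            if attempt < attempts - 1:
                # Wait 1x, 2x, 4x ... the base delay before the next attempt.
                sleep(base_delay * 2**attempt)
    # Report every failure so the problem can be diagnosed from one message.
    raise RuntimeError("request failed after retries:\n" + "\n".join(errors))
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.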
2023-10-10 21:34:24 -07:00
Bobby R. Bruce
ad2fe42686 Learning-gem5: fix formatting (#401)
Using f-strings instead of % for formatting.
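
For reference, the two formatting styles produce identical strings. The message text and variable names below are invented for illustration and do not come from the learning-gem5 scripts.

```python
name, ticks = "hello", 1000

# Old style: %-formatting with a tuple of arguments.
old_style = "Exiting %s @ tick %d" % (name, ticks)

# New style: f-string with the expressions inline.
new_style = f"Exiting {name} @ tick {ticks}"

print(old_style == new_style)
```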
2023-10-10 16:47:37 -07:00
Harshil Patel
bbc301f2f0 stdlib, tests: Fixed bugs and tests
- Fixed bugs related to retrying on request failure.
- Updated the pyunit tests.

Change-Id: Ia484690267bf27018488324f3408f7e47c59bef3
2023-10-10 15:54:20 -07:00
Bobby R. Bruce
25b2786db8 misc,python: Add requirements-txt-fixer to pre-commit (#422) 2023-10-10 14:30:39 -07:00
Kaustav Goswami
937b829e8f configs,ext: Updated the gem5 SST Bridge to use SST 13.0.0
This change updates the gem5 SST Bridge to use SST 13.0.0. Changes
are made to replace the SimpleMem class with the StandardMem class, as
SimpleMem will be deprecated in SST 14 and above. In addition, the
translator.hh is updated to translate more types of gem5 packets.
A new parameter `ports` was added on SST's side when invoking the
gem5 component which does not require recompiling the gem5
component whenever a new outgoing bridge is added in a gem5 config.

Change-Id: I45f0013bc35d088df0aa5a71951422cabab4d7f7
Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>
2023-10-10 14:16:29 -07:00
Bobby R. Bruce
1502f7c09f misc: Add black update change to .git-blame-ignore-rev
Change-Id: Ief04aec128bc48e66b79fc2f5c474948dd5eb9eb
2023-10-10 14:02:37 -07:00
Bobby R. Bruce
ddf6cb88e4 misc: Run pre-commit run --all-files
This is to reflect the updates made to black when running `pre-commit
autoupdate`.

Change-Id: Ifb7fea117f354c7f02f26926a5afdf7d67bc5919
2023-10-10 14:01:58 -07:00
Bobby R. Bruce
317d2fb5b8 misc: Run pre-commit autoupdate
This updates the pre-commit utility from v4.3.0 to v4.5.0 and updates
black from 22.6.0 to 23.9.1.

Change-Id: I7ebb551f30e617059ce49f89a30207f739b1cb14
2023-10-10 14:00:57 -07:00
ivanaamit
486763b671 learning-gem5: use f-string for print
Change-Id: If27af6524af4e4a6a59e914e9e40ba10de24adf4
2023-10-10 13:54:07 -07:00
Bobby R. Bruce
58140bba1f tests: Update test workflows for new runners (#417)
#371 Updates the runners. This PR updates the tests to:

1. Drop the 'run' and 'build' labels (all runners are now of the same
type).
2. Utilize the threading where possible (runners now have 4 cores
minimum).
2023-10-10 12:03:00 -07:00
Bobby R. Bruce
0ec1fb167b stdlib: Fix use internal _hashlib in md5_utils.py (#427)
Removes the use of the internal _hashlib, which is an internal Python
API.
This is a fix for issue #383.
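
A sketch of hashing a file through the public `hashlib` module only. The helper name `md5_of_file` and the chunk size are assumptions made for this example; this is not the code in md5_utils.py.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()  # public API; no need to touch _hashlib
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```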
2023-10-10 08:32:45 -07:00
Yu-Cheng Chang
141b06d335 arch,arch-riscv: Remove setRegOperand in VecRegOperand (#341)
The RISC-V vector instructions still work without setRegOperand.
The register statistic issue should be fixed by
https://github.com/gem5/gem5/pull/360 to avoid counting register
writes twice in the statistics.



Change-Id: Ib6a52935e00c3e557b366abfcf60450dca05614d
2023-10-10 08:00:10 -07:00
Matthew Poremba
9f4d334644 gpu-compute: Update tokens for flat global/scratch
Memory instructions acquire coalescer tokens in the schedule stage.
Currently this is only done for buffer and flat instructions, but not
flat global or flat scratch. This change now acquires tokens for flat
global and flat scratch instructions. This provides back-pressure to the
CUs and helps to avoid deadlocks in Ruby.

The change also handles returning tokens for buffer, flat global, and
flat scratch instructions. This was previously only being done for
normal flat instructions leading to deadlocks in some applications when
the tokens were exhausted.

To simplify the logic, this change adds a needsToken() method to GPUDynInst which
returns whether the instruction is a buffer or any flat segment instruction.

The waitcnts were also incorrect for flat global and flat scratch. We
should always decrement vmem and exp count for stores and only normal
flat instructions should decrement lgkm. Currently vmem/exp are not
decremented for flat global and flat scratch which can lead to deadlock.
This change set fixes this by always decrementing vmem/exp and lgkm only
for normal flat instructions.

Change-Id: I673f4ac6121e4b5a5e8491bc9130c6d825d95fc5
2023-10-10 09:48:16 -05:00
Matt Sinclair
ec633b3d68 dev-amdgpu,mem-ruby: Add support to checkpoint and restore between kernels in GPUFS (#377)
Earlier, GPU checkpointing was working only if a checkpoint was created
before the first kernel execution. This pull request adds support to
checkpoint in-between any two kernel calls. It does so by doing the
following.

- Adds flush support in the GPU_VIPER protocol
- Adds flush support in the GPUCoalescer
- Updates cache recorder to use the GPUCoalescer during simulation
cooldown and cache warmup times.
2023-10-10 09:41:21 -05:00
Giacomo Travaglini
d9fe0cfe1c arch-arm: Make interrupt masking handle VHE/SEL2 cases (#430)
The new implementation matches the table in the ARM Architecture
Reference Manual (version DDI 0487J.a, section D1.3.6, table R_SXLWJ)

It takes into consideration features like FEAT_SEL2 (scr.eel2 bit) and
FEAT_VHE (hcr.e2h bit) which affect the masking of interrupts under
certain circumstances
2023-10-10 15:22:34 +01:00
Giacomo Travaglini
8acf49b6fa arch-arm: Revamp takeInt to take VHE/SEL2 into account
The new implementation matches the table in the ARM Architecture
Reference Manual (version DDI 0487J.a, section D1.3.6, table R_SXLWJ)

It takes into consideration features like FEAT_SEL2 (scr.eel2 bit) and
FEAT_VHE (hcr.e2h bit) which affect the masking of interrupts under
certain circumstances

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I07ebd8d859651475bd32fd201eea0f4e64a7dd5f
2023-10-10 09:46:47 +01:00
Giacomo Travaglini
e412ddddbd arch-arm: Split takeInt into AArch64/32 versions
We pay a small duplication cost but we make the code
more readable and we enable further modifications to the
AArch64 code without forcing the same code on the AArch32
method

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I1efa33cf19f91094fd33bd48b6a0a57d8df8f89f
2023-10-10 09:45:59 +01:00
root
05ebbd2184 stdlib: Fix use internal _hashlib in md5_utils.py
Removes the use of the internal _hashlib, which is an
internal Python API

Change-Id: Id4541a143adb767ca7d942c0fd8a1cf1a08a04ab
2023-10-10 06:18:59 +00:00
Bobby R. Bruce
dc38a801b7 Merge branch 'develop' into workflows-for-new-runners 2023-10-09 23:10:18 -07:00
Bobby R. Bruce
da212e04b5 Merge branch 'develop' into requirements-fixer-hook 2023-10-09 22:37:25 -07:00
Bobby R. Bruce
486916b5d4 configs,tests: Remove mkdir in simpoint-se-checkpoint.py (#425)
This `mkdir` is problematic as it doesn't create the directory
recursively. This causes errors if `dir` is `X/Y/Z` and both `Y` and `Z`
have not been created. An error will be returned (`No such file or
directory`).

This issue was fixed with: https://github.com/gem5/gem5/pull/263. The
checkpointing code already recursively creates directories as needed.
Ergo we can remove this `mkdir` statement.
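
The failure mode can be reproduced with the Python standard library. The helper `try_both` is a made-up name for this sketch, and the code is not taken from simpoint-se-checkpoint.py.

```python
import os
import tempfile


def try_both(root):
    """Try a non-recursive mkdir on root/X/Y/Z, then a recursive one."""
    target = os.path.join(root, "X", "Y", "Z")
    try:
        os.mkdir(target)  # fails: the parent directories do not exist yet
        mkdir_failed = False
    except FileNotFoundError:
        mkdir_failed = True
    os.makedirs(target, exist_ok=True)  # creates X, Y and Z as needed
    return mkdir_failed, os.path.isdir(target)


with tempfile.TemporaryDirectory() as root:
    print(try_both(root))
```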
2023-10-09 22:34:19 -07:00
Bobby R. Bruce
c5f06265bb misc,python: Add yaml formatter to pre commit (#423) 2023-10-09 17:55:25 -07:00
Bobby R. Bruce
51c881d0f1 stdlib: Improve handling of errors in Atlas request failures
Now:

* The Atlas Client will attempt a connection 4 times, using an
  exponential backoff approach between attempts.
* When a failure does arise a rich output is given so problems can be
  easily diagnosed.

Change-Id: I3df332277c33a040c0ed734b9f3e28f38606af44
2023-10-09 16:30:02 -07:00
Bobby R. Bruce
93704a81f1 dev-amdgpu,gpu-compute: Implement GPU and HSA timestamps (#410)
This PR adds two commits to handle timestamps in the ROCm runtime. ROCr
uses a mix of GPU timestamp reads and HSA packet timestamps to output
profiling information for a task dispatch.

The first commit adds timestamps to the HSA completion signal indicating
when the task started and ended. This requires changing the flow of
completion signal DMAs to ensure the DMA of the timestamp values
completes before writing the completion signal value.

The second commit adds MMIOs for reading the GPU's timestamp counter. This
MMIO resides in the GFX MMIO space so a new class is added to handle
MMIOs in that address range.
2023-10-09 14:11:52 -07:00
Bobby R. Bruce
21c5d77000 configs: Add an example elastic trace generation script (#415)
Current [TraceCPU
documentation](https://www.gem5.org/documentation/general_docs/cpu_models/TraceCPU)
still references the deprecated **se.py/fs.py** scripts for elastic
trace generation (script paths are also outdated).

With this PR we provide a simpler Arm based elastic trace generation
script that can
be used out of the box by a user or that can be extended as needed.
2023-10-09 14:11:33 -07:00
Bobby R. Bruce
1fe0056d3b configs,tests: Remove mkdir in simpoint-se-checkpoint.py
This `mkdir` is problematic as it doesn't create the directory
recursively. This causes errors if `dir` is `X/Y/Z` and both `Y` and `Z`
have not been created. An error will be returned (`No such file or
directory`).

This issue was fixed with: https://github.com/gem5/gem5/pull/263. The
checkpointing code already recursively creates directories as needed.
Ergo we can remove this `mkdir` statement.

Change-Id: Ibae38267c8ee1eba76d7834367aa1c54013365bc
2023-10-09 14:00:21 -07:00
Bobby R. Bruce
fa8c9414b2 misc,python: Run pre-commit run --all-files
This applies the automatic formatting to the .yaml files.

Change-Id: I10d067ba65722aca8aaf64a62b42ae57de468e75
2023-10-09 13:20:25 -07:00
Bobby R. Bruce
5b09777011 misc,python: Add pre-commit-hook-yamlfmt to pre-commit
This automatically formats .yaml files. By default it has the following
parameters:

* `mapping`: 4 spaces.
* `sequence`: 6 spaces.
* `offset`: 4 spaces.
* `colons`: do not align top-level colons.
* `width`: None.

Change-Id: Iee5194cd57b7b162fd7e33aa6852b64c5578a3d2
2023-10-09 13:16:52 -07:00
Bobby R. Bruce
402ec3a57c misc,python: Run pre-commit run --all-files
This applies the `requirement-txt-fixer` to the repo.

Change-Id: I23b1d26ad8deb49ec0019095efc6d253ac1c817c
2023-10-09 13:10:23 -07:00
Bobby R. Bruce
c53529783b misc,python: Add requirements-txt-fixer to pre-commit
This sorts entries in requirements.txt files.

Change-Id: I7ee6e31f3cbe5078f24d13471a6aa9edc482cecd
2023-10-09 13:09:21 -07:00
Bobby R. Bruce
bbe05b0cba tests,misc: Fix compilation tests failures (#400)
Exposed in our failing compiler tests:
https://github.com/gem5/gem5/actions/runs/6348223508, this PR:

* Adds missing overrides to `PCState`'s `set` function.
* Removes `std::binary_function` from DramPower (it was deprecated in
C++11 and officially removed in C++17).
2023-10-09 11:20:52 -07:00
Harshil Patel
452a600c49 New function to kernel_disk_workload to allow new disk device location (#151)
Added a parameter (_disk_device) to kernel_disk_workload which allows
users to change the disk device location. get_disk_device() now chooses
between the parameter and, if no parameter was passed, it calls a new
function _get_default_disk_device() which is implemented by each board
and returns that board's default disk device, e.g., /dev/hda in
the x86_board. The previous way of setting a disk device still exists as
a default; however, with the new function users can now override this
default.
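
The override-with-default lookup described above could look roughly like this. The class names `KernelDiskWorkloadSketch` and `X86BoardSketch` and the constructor argument are hypothetical; only `_disk_device`, `get_disk_device()`, `_get_default_disk_device()` and `/dev/hda` come from the description, and the real gem5 classes may be shaped differently.

```python
from abc import ABC, abstractmethod
from typing import Optional


class KernelDiskWorkloadSketch(ABC):
    """Hypothetical stand-in for the workload mix-in, for illustration only."""

    def __init__(self, disk_device: Optional[str] = None):
        # None means "no override was passed by the user".
        self._disk_device = disk_device

    @abstractmethod
    def _get_default_disk_device(self) -> str:
        """Each board supplies its own default device."""

    def get_disk_device(self) -> str:
        # Prefer the user-supplied device, else fall back to the board default.
        if self._disk_device is not None:
            return self._disk_device
        return self._get_default_disk_device()


class X86BoardSketch(KernelDiskWorkloadSketch):
    def _get_default_disk_device(self) -> str:
        return "/dev/hda"
```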
2023-10-09 10:33:45 -07:00
Harshil Patel
79f40ffdab stdlib: Del comment stating SE mode limited to single thread (#402)
This comment was left in the codebase in error. The
`set_se_binary_workload` function works fine with multi-threaded
applications. This hasn't been a restriction for some time.
2023-10-09 10:30:32 -07:00
Harshil Patel
d8fc0180a5 cpu: Restructure BTB (#412)
This is the first PR in a series of enhancements to the BPU proposed in
#358.
However, I think putting everything into one PR would be hard to review
and would make it easy to overlook mistakes I might have made.

This PR restructures the BTB:
- A new abstract BTB class is created to enable different BTB
implementations. The new BTB class gets its own parameter and stats.
- An enum is added to differentiate branch instruction types. This enum
is used to enhance statistics and BPU management.
- The existing BTB is moved into `simple_btb` as default.
- An additional function is added to store the static instruction in the
BTB. This function is used for the decoupled front-end.
- Update configs to match new BTB parameters.
2023-10-09 10:13:00 -07:00
Bobby R. Bruce
d5e454138a util: Remove 'run' and 'build' tags from runners
Change-Id: Ib7b2eba5f08a1d8a311dc20cb55f540a5cd7dc7b
2023-10-09 09:56:32 -07:00
Bobby R. Bruce
243a261491 tests: Update Testlib CI tests to use multithreading
These were previously only running on single-threaded machines. Now
they'll be running on 4-core VMs so may as well run tests in parallel.

Change-Id: I7ee86512dc72851cea307dfd800dcf9a02f2f738
2023-10-09 09:56:32 -07:00
Bobby R. Bruce
70f8c49e8b tests,misc: Remove 'run' and 'build' labels
All runners are now equal, these labels are pointless.

Change-Id: I9d5fb31e20e95d30e9726d4bf0353dc87af614d7
2023-10-09 09:56:25 -07:00
Giacomo Travaglini
eac5a8b215 arch-arm: Implement FEAT_TCR2
Change-Id: I0396f5938c09b68fcc3303a6fdda1e4dde290869
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 17:19:57 +01:00
Giacomo Travaglini
49cbb24351 arch-arm: Implement FEAT_SCTLR2
Change-Id: Ifb8c8dc1729cc21007842b950273fe38129d9539
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 17:12:53 +01:00
Giacomo Travaglini
c4c5d2e172 arch-arm: Implement ID_AA64MMFR3_EL1 register
Change-Id: If8c37bdccf35a070870900c06dc4640348f0f063
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 17:12:53 +01:00
Andreas Sandberg
ec7921305b arch-arm: Implement FEAT_TLBIRANGE extension (#414) 2023-10-09 17:09:31 +01:00
Giacomo Travaglini
4c4615523f configs: Add an example elastic-trace-generating script
The new script will automatically use the newly
defined O3_ARM_v7a_3_Etrace CPU to run a simple SE simulation while
generating elastic trace files.

The script is based on starter_se.py, but contains the following
limitations:

1) No L2 cache as it might affect computational delay calculations
2) Supporting SimpleMemory only with minimal memory latency

These restrictions were imposed by the existing elastic trace
generation logic in the common library (collected by grepping
elastic_trace_en) [1][2][3]

Example usage:

build/ARM/gem5.opt configs/example/arm/etrace_se.py \
    --inst-trace-file [INSTRUCTION TRACE] \
    --data-trace-file [DATA TRACE] \
    [WORKLOAD]

[1]: https://github.com/gem5/gem5/blob/stable/\
    configs/common/MemConfig.py#L191
[2]: https://github.com/gem5/gem5/blob/stable/\
    configs/common/MemConfig.py#L232
[3]: https://github.com/gem5/gem5/blob/stable/\
    configs/common/CacheConfig.py#L130

Change-Id: I021fc84fa101113c5c2f0737d50a930bb4750f76
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2023-10-09 16:45:00 +01:00
Giacomo Travaglini
1a5dee0f0f configs: Add an elastic-trace-generating CPU
According to the original paper [1] the elastic trace generation process
requires a cpu with a big number of entries in the ROB, LQ and SQ, so
that there are no stalls due to resource limitation.

At the moment these numbers are copy pasted from the
CpuConfig.config_etrace method [2].

[1]: https://ieeexplore.ieee.org/document/7818336
[2]: https://github.com/gem5/gem5/blob/stable/\
    configs/common/CpuConfig.py#L40

Change-Id: I00fde49e5420e420a4eddb7b49de4b74360348c9
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2023-10-09 16:45:00 +01:00
Giacomo Travaglini
e35e2966c0 configs: Use devices.SimpleSeSystem in starter_se.py
Change-Id: I742e280e7a2a4047ac4bb3d783a28ee97f461480
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2023-10-09 16:45:00 +01:00
Giacomo Travaglini
7395b94c40 configs: Add a SimpleSeSystem class to devices.py
Change-Id: I9d120fbaf0c61c5a053163ec1e5f4f93c583df52
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2023-10-09 16:45:00 +01:00
Giacomo Travaglini
3b8c974456 configs: Refactor BaseSimpleSystem in devices.py
We define a new parent (ClusterSystem) to model a system
with one or more cpu clusters within it.
The idea is to make this new base class reusable by SE
systems/scripts as well (like starter_se.py)

Change-Id: I1398d773813db565f6ad5ce62cb4c022cb12a55a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2023-10-09 16:45:00 +01:00
Jason Lowe-Power
d4be9c76c5 cpu-kvm, arch-x86: flush TLB after syscalls (#411)
Modified the x86 KVM-in-SE syscall handler to flush the TLB following
each syscall, in case the page table has been modified. This is done by
reloading the value in %cr3. Doing this requires an intermediate GPR,
which we store in a new scratch buffer following the syscall code at
address `syscallDataBuf`.

GitHub issue: https://github.com/gem5/gem5/issues/409
2023-10-09 08:16:06 -07:00
David Schall
edf9092fee cpu: Restructure BTB
- A new abstract BTB class is created to enable different BTB
  implementations. The new BTB class gets its own parameter
  and stats.
- An enum is added to differentiate branch instruction types.
  This enum is used to enhance statistics and BPU management.
- The existing BTB is moved into `simple_btb` as default.
- An additional function is added to store the static instruction in
  the BTB. This function is used for the decoupled front-end.
- Update configs to match new BTB parameters.

Change-Id: I99b29a19a1b57e59ea2b188ed7d62a8b79426529
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-10-09 14:37:47 +00:00
Giacomo Travaglini
39fdfaea5a arch-arm: Implement FEAT_TLBIRANGE
Change-Id: I7eb020573420e49a8a54e1fc7a89eb6e2236dacb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 13:59:47 +01:00
Giacomo Travaglini
6b698630a2 arch-arm: Check VMID in secure mode as well (NS=0)
This is still trying to completely remove any artifact
which implies virtualization is only supported in
non-secure mode (NS=1)

Change-Id: I83fed1c33cc745ecdf3c5ad60f4f356f3c58aad5
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 13:56:57 +01:00
Giacomo Travaglini
a8efded644 arch-arm: Include Granule Size in a TLB entry
This info can be used during TLB invalidation

Change-Id: I81247e40b11745f0207178b52c47845ca1b92870
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 13:56:57 +01:00
Giacomo Travaglini
5cd70bf9bf sim-se: zero out memory allocated via brk() (#343)
The syscall emulation of brk() incorrectly did not ensure that newly
allocated memory was zero-initialized, which Linux guarantees and which
seems to be the expectation of glibc's malloc() and free()
implementation. This patch fixes the incorrect behavior by
zero-initializing all memory allocations via brk().

GitHub issue: https://github.com/gem5/gem5/issues/342

Change-Id: I53cf29d6f3f83285c8e813e18c06c2e9a69d7cc2
2023-10-09 13:48:53 +01:00
Giacomo Travaglini
226052ed5a mem-ruby: Far atomics fix (#407)
The PR is fixing the CHI fromSequencer helper function which is making
use of the undefined tbe entry.

This has been broken by #177

Change-Id: I52feff4b5ab2faf0aa91edd6572e3e767c88e257
2023-10-09 08:08:50 +01:00
Bobby R. Bruce
b0e1efb555 util: Update the GitHub Self-Hosted Runners (#371)
1. All VMs are deployable from a single Vagrantfile (per host machine).
2. Runners within VMs are now ephemeral. They cease to exist after a job
is complete. Afterwards the VM cleans the workspace and creates a new runner.
This will reduce old data, scripts, and images causing space issues on
our VMs.
3. No more 'vm_manager.sh' script. The standard `vagrant` command to
manage the VMs will work.
4. Adds Copyright notices where missing.
2023-10-08 21:52:33 -07:00
Bobby R. Bruce
df3bcaf143 util: Make all runs "build" and "run"
Change-Id: If9ecf467efa5c7118d34166953630e6c436c55a4
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
53219bf827 util: Add Troubleshooting for "Vagrant failed..."
Change-Id: I01e637f09084acb6c5fbd7800b3e578a43487849
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
6571a54a65 util: Use a Multi-Machine Vagrantfile
This patch removed the bespoke "vm_manager.sh" script in favor of a
Multi-Machine Vagrantfile.

With this, users only need to change the variables in the Vagrantfile
and then use the standard `vagrant` commands to launch the VMs/Runners.

Change-Id: Ida5d2701319fd844c6a5b6fa7baf0c48b67db975
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
0e5c6d9f50 util: Resize VM root partition max size to ~128GB
Prior to this change we were limited to a root partition with only 60GB
of space which caused issues when running larger simulations (see:
https://github.com/gem5/gem5/issues/165).

There are two factors in this issue which this patch resolves:

1. The root partition in the VM was capped at 60GB despite the virtual
machine's size being capped at 128GB. This resulted in libvirt giving the
VM free space it couldn't use. To fix this `lvextend` was added to the
"provision_root.sh" script to resize the root partition to fill the
available space.
2. The virtual machine size can be set via the `machine_virtual_size`
parameter. The minimum and default value is 128GB. This wasn't exposed
previously. Now, if required, we can increase the size of the VM/Root
partition (though I believe 128GB is more than sufficient
for now).

Fixes: https://github.com/gem5/gem5/issues/165
Change-Id: I82dd500d8807ee0164f92d91515729d5fbd598e3
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
f36449be98 util: Add missing copyright notices
Change-Id: I243046c17264eb5c522285096ecf9c7e5e968322
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
8c2d414223 util: Cleanup the provision_root.sh
Change-Id: I58215dddc34476695c7aedc77b55d338e0304198
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
c0cb16ba89 util: Create HOSTNAME variable
Change-Id: Ia68f1bef2bb9e4e5e18476b6100be80f8cf1c799
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
529423f47a util: Remove note about ssh use
This is confusing and setting the ssh username and password is normal.

Change-Id: Ic925e92ade47f455c86a461a267b8cad7aa6d7ba
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
a924fa3bdc util: Add action-run.sh to run Action Runners
The "action-run.sh" action replaces inline scripting in the Vagrantfile.

The major improvement is this script runs an infinite loop and
configures the runners to be ephemeral. This means they cease to exist
after a job is complete. The script then cleans the VM workspace and the
loop restarts by configuring and setting up another runner. This means
our VMs no longer accumulate files that eventually lead to the VM
running out of space.

Change-Id: Iba6dc9a480f5805042602f120fc84bdc47a96d55
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
3e1c0b0714 util: Move runners from gem5 repository to gem5 org
There are two places self-hosted runners can exist on GitHub:

1. At the level of the repository: In this case the runners can only be
used by that repository and runners can only be distinguished from one
another by labels.
2. At the level of the organization: In this case the runners can be
used by any repository in the organization, thus increasing their
versatility. In addition to labels, runners in the level of the
organization can be organized into groups.

While we do not use our self-hosted runners on other repositories, there
may be future use for this, so we might as well enable it now.

Change-Id: Id5e113194314336221dcdc8c2858b352afcbaf6e
2023-10-06 15:53:57 -07:00
Bobby R. Bruce
d1f9f98747 util: Make all runners the same type
Having two types of GitHub Action Runners has not yielded much benefit
and caused confusion and inefficiencies. This change simplifies things
to having just one runner type with 8 cores and 16GB of memory. It is
sufficient to build gem5 and run most simulations.

Change-Id: Ic49ae5e98b02086f153f4ae2a4eedd8a535786c8
2023-10-06 15:53:57 -07:00
Nicholas Mosier
7a0e84d853 cpu-kvm, arch-x86: flush TLB after syscalls
Modified the x86 KVM-in-SE syscall handler to flush the TLB following
each syscall, in case the page table has been modified. This is done
by reloading the value in %cr3. Doing this requires an intermediate
GPR, which we store in a new scratch buffer following the syscall code
at address `syscallDataBuf`.

GitHub issue: https://github.com/gem5/gem5/issues/409

Change-Id: Ibc20018c97ebb1794fa31a0c71e0857d661c7c9d
2023-10-06 20:41:59 +00:00
Nicholas Mosier
0dcf0fb829 sim-se: unmap reclaimed heap pages in brk syscall emulation
gem5::MemState::updateBrkRegion(), which is called during the syscall
emulation of brk, did not unmap deallocated heap pages when the brk
region is receding. Instead, it kept it mapped for simplicity. This
introduced a bug where subsequent expansions of the brk region reused
prior heap page mappings that were not zero-filled. This violates
the assumptions of glibc malloc, resulting in heap corruption and
crashes.

This patch fixes the bug by always unmapping pages that are deallocated
during a call to brk() that reduces the heap size. This makes the
gem5::MemState::_endBrkPoint field obsolete, so this patch removes it.
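
The behavior can be pictured with a toy page-table model. Everything here (the `ToyHeap` class, the 4096-byte page size, the dictionary of pages) is an assumption made for illustration and does not mirror gem5::MemState's actual data structures.

```python
PAGE = 4096  # assumed page size for this toy model


class ToyHeap:
    """Maps page number -> bytearray; only pages below the break stay mapped."""

    def __init__(self, start):
        self.start = start  # assumed to be page aligned
        self.brk_point = start
        self.pages = {}

    def brk(self, new_brk):
        if new_brk < self.start:
            return self.brk_point  # refuse to move below the heap start
        end_page = -(-new_brk // PAGE)  # ceiling division
        # Shrinking: unmap every page at or beyond the new end.
        for page in [p for p in self.pages if p >= end_page]:
            del self.pages[page]
        # Growing: any page mapped now is freshly zero-filled.
        for page in range(self.start // PAGE, end_page):
            self.pages.setdefault(page, bytearray(PAGE))
        self.brk_point = new_brk
        return new_brk
```

Because a shrink drops the page entirely, a later grow cannot hand back stale contents.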

GitHub issue: https://github.com/gem5/gem5/issues/342

Change-Id: Ib2244e1aa4d2a26666ad60d231fdde2c22d2df35
2023-10-06 20:39:57 +00:00
Matthew Poremba
75a7f30dfb dev-amdgpu: Implement GPU clock MMIOs
The ROCr runtime uses a combination of HSA signal timestamps and
hardware MMIOs to calculate profiling times. At the beginning of an
application a timestamp is read from the GPU using MMIOs. The clock
MMIOs reside in the GFX MMIO region, so a new AMDGPUGfx class is added
to handle these MMIOs.

The timestamp value is expected to be in nanoseconds, so we simply use
the gem5 tick converted to ns.
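
The conversion amounts to a unit change. The sketch below assumes a tick frequency of 10^12 ticks per second (one tick per picosecond); the function name `ticks_to_ns` and its parameter are hypothetical and not part of the AMDGPUGfx class.

```python
def ticks_to_ns(ticks, ticks_per_second=10**12):
    """Convert simulator ticks to whole nanoseconds (assumed tick frequency)."""
    # Multiply first so the integer division does not lose precision early.
    return ticks * 10**9 // ticks_per_second


print(ticks_to_ns(1_000_000))
```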

Change-Id: I7d1cba40d5042a7f7a81fd4d132402dc11b71bd4
2023-10-06 13:21:40 -05:00
Matthew Poremba
6a4b2bb096 dev-hsa,gpu-compute: Add timestamps to AMD HSA signals
The AMD specific HSA signal contains start/end timestamps for dispatch
packet completion signals. These are currently always zero. These
timestamp values are used for profiling in the ROCr runtime.
Unfortunately, the GpuAgent::TranslateTime method in ROCr does not check
for zero values before dividing, causing applications that use profiling
to crash with SIGFPE. Profiling is used via hipEvents in the HACC
application, so these should be supported in gem5.

In order to handle writing the timestamp values, we need to DMA the
values to memory before writing the completion signal. This changes the
flow of the async completion signal write to be (1) read mailbox pointer
(2) if valid, write the mailbox data, otherwise skip to 4 (3) write mailbox
data if pointer is valid (4) write timestamp values (5) write completion
signal. The application will process the timestamp data as soon as the
completion signal is received, so we need this ordering to ensure the DMA
for timestamps has completed.

HACC now runs to completion on GPUFS and has the same output as
hardware.

Change-Id: I09877cdff901d1402140f2c3bafea7605fa6554e
2023-10-06 13:21:40 -05:00
Giacomo Travaglini
00748c7901 mem-ruby: Fix CHI fromSequencer helper function
This has been broken by #177

Change-Id: I52feff4b5ab2faf0aa91edd6572e3e767c88e257
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-06 14:51:11 +01:00
Giacomo Travaglini
ae104cc431 mem-ruby: Add new feature far atomics in CHI (#177)
Added a new feature to CHI protocol (in collaboration with @tiagormk).
Here is the Jira Ticket
[https://gem5.atlassian.net/browse/GEM5-1326](https://gem5.atlassian.net/browse/GEM5-1326).
As described in CHI specs, far atomic transactions enable remote
execution of Atomic Memory Operations. This pull request incorporates
several changes:

* Fix Arm ISA definition of Swap instructions. These instructions should
return an operand, so their ISA definition should be Return Operation.
* Enable AMOs in Ruby Mem Test to verify that AMOs work
* Enable near and far AMO in the Cache Controller of CHI

Three configuration parameters have been used to tune this behavior:
* policy_type: sets the atomic policy to one of those described in [our
paper](https://dl.acm.org/doi/10.1145/3579371.3589065)
* atomic_op_latency: simulates the AMO ALU operation latency
* comp_anr: configures the Atomic No return transaction to split
CompDBIDResp into two different messages DBIDResp and Comp
2023-10-06 10:09:58 +01:00
Hoa Nguyen
6f8b74ece8 dev,arch-riscv: Mark gem5's 8250 UART as 16550a compatible
The 8250 UART is supposed to be compatible with the 16550a UART.

This enables OpenSBI to print things to UART as OpenSBI only
prints if the UART is 16550a compatible [1].

There is a similar change from gem5 gerrit [2] pointing out
that this also enables bbl to print things to UART. This is
confirmed :)

[1] https://github.com/riscv-software-src/opensbi/blob/v1.3.1/lib/utils/serial/fdt_serial_uart8250.c#L29
[2] https://gem5-review.googlesource.com/c/public/gem5/+/68481

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-06 00:48:12 -07:00
Hoa Nguyen
3fc6b67974 arch-riscv: Add several inform() to RiscvISA::BootloaderKernelWorkload
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-06 00:45:21 -07:00
Hoa Nguyen
e8fd8303fb stdlib: Add chosen node to the device tree of RISC-V board
This enables two things,
- /chosen/stdout-path now defaults to uart@10000000, meaning
the linux kernel's boot console will be redirected to uart.
- /chosen/bootargs now contains the boot arguments obtained from
gem5's library. This allows passing the boot arguments to the
linux kernel via the device tree.
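
As a rough illustration, the resulting `/chosen` node could be rendered like this. The helper name, the exact `stdout-path` string, and the sample boot arguments are assumptions; the patch itself builds the node through gem5's device tree support rather than by string formatting:

```python
def chosen_node(stdout_path: str, bootargs: str) -> str:
    """Render a /chosen node in device tree source syntax (no escaping)."""
    return (
        "chosen {\n"
        f'    stdout-path = "{stdout_path}";\n'
        f'    bootargs = "{bootargs}";\n'
        "};"
    )


# Hypothetical values, for illustration only.
print(chosen_node("/uart@10000000", "console=ttyS0 root=/dev/vda"))
```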

Change-Id: I53821d85f619e6276da85f41c972c041eaaf3280
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-06 00:33:45 -07:00
Hoa Nguyen
46a9d85215 arch-riscv: Add bootloader+kernel workload
Aims to boot OpenSBI + Linux kernel.

Change-Id: I9ee93cc367e8c06bdd0c7ddf43335d32965be14d
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-06 00:33:05 -07:00
Vishnu Ramadas
a19667427a mem-ruby: Add BUILD_GPU guard to ruby cooldown and warmup phases
Ruby was recently updated to support flushes and warmup for GPUs. Since
this support uses the GPUCoalescer, non-GPU builds face a compile time
issue. This is because GPU code is not built for non-GPU builds. This
commit adds "#if BUILD_GPU" guards around the GPU-related code in
common files like AbstractController.hh, CacheRecorder.*, RubySystem.cc,
GPUCoalescer.hh, and VIPERCoalescer.hh. This support allows GPU builds
to use flushing while non-GPU builds compile without problems

Change-Id: If8ee4ff881fe154553289e8c00881ee1b6e3f113
2023-10-05 18:59:54 -05:00
Matt Sinclair
85340973bf configs: Add configurable GPU L1,L2 num banks and L2 latencies (#389)
Previously, the L1, L2 number of banks and L2 latencies were not
configurable through command line arguments. This commit adds support to
configure them through the arguments '--tcp-num-banks' for number of
banks in L1, '--tcc-num-banks' for number of banks in L2, and
'--tcc-tag-access-latency', and '--tcc-data-access-latency'
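
A sketch of how these four options could be declared with `argparse`. Only the flag names come from this message; the `int` type, the `None` defaults, and the help strings are assumptions:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Types and defaults below are assumed, not taken from the commit.
    parser.add_argument("--tcp-num-banks", type=int, default=None,
                        help="number of banks in the GPU L1 (TCP)")
    parser.add_argument("--tcc-num-banks", type=int, default=None,
                        help="number of banks in the GPU L2 (TCC)")
    parser.add_argument("--tcc-tag-access-latency", type=int, default=None,
                        help="L2 tag access latency")
    parser.add_argument("--tcc-data-access-latency", type=int, default=None,
                        help="L2 data access latency")
    return parser


args = build_parser().parse_args(["--tcp-num-banks", "16", "--tcc-num-banks", "8"])
```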

Change-Id: Ie3b713ead16865fd7120e2d809ebfa56b69bc4a1
2023-10-05 15:54:24 -05:00
Bobby R. Bruce
4db748a507 resources, stdlib: Adding 'suite' category to gem5 (#191) 2023-10-05 13:26:58 -07:00
Bobby R. Bruce
761f6b73a0 arch-arm: Implement FEAT_FGT (#334)
This PR implements FEAT_FGT (Fine Grain Traps)
2023-10-05 10:44:26 -07:00
Bobby R. Bruce
f5c7ea01ef gpu-compute: Fix dynamic scratch size test (#391)
ROCm supports dynamically allocating scratch space, which resides in
framebuffer memory, to reduce the amount of memory allocated for kernels
that have not yet launched. The size of the scratch space allocated is
located in task->amdQueue.compute_tmpring_size_wavesize. This size is in
kilobytes. The AQL task contains the number of bytes requested *per work
item*, however we currently check if there is enough tmpring space by
comparing a single work item. This should instead check the size *per
wavefront*.

This causes problems in applications where multiple kernels use dynamic
scratch allocation and a later kernel requires more space than the
earlier kernel. The only application being tested that does this is
LULESH. This was resulting in the scratch space being too small,
resulting in workgroups clobbering each other's private memory leading
to some nasty bugs. It is fixed by this patch as task->amdQueue will be
re-read from the host and will contain the updated tmpring size. After
this there is enough scratch space and LULESH makes forward progress.
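
The difference between the two checks can be sketched as follows. The function names, the 64-lane wavefront, and the 1 KiB = 1024 bytes conversion are assumptions made for illustration; this is not the gem5 code:

```python
def per_work_item_check(tmpring_wavesize_kb: int, bytes_per_work_item: int) -> bool:
    """Old behaviour as described: compares against a single work item."""
    return tmpring_wavesize_kb * 1024 >= bytes_per_work_item


def per_wavefront_check(tmpring_wavesize_kb: int, bytes_per_work_item: int,
                        wavefront_size: int = 64) -> bool:
    """Fixed behaviour as described: compares against a whole wavefront."""
    return tmpring_wavesize_kb * 1024 >= bytes_per_work_item * wavefront_size
```

With 1 KiB of tmpring space and 32 bytes per work item, the first check passes while the second correctly reports that 2048 bytes are needed.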
2023-10-05 10:38:13 -07:00
Bobby R. Bruce
ee8c569513 arch-riscv: Implement Zcb instructions (#399)
Added the following instructions:
c.lbu
c.lh
c.lhu
c.sb
c.sh
c.zext.b
c.sext.b
c.zext.h
c.sext.h
c.zext.w
c.not
c.mul

Reference: https://github.com/riscv/riscv-code-size-reduction
2023-10-05 10:36:02 -07:00
Bobby R. Bruce
f75c0fca8a stdlib: Del comment stating SE mode limited to single thread
This comment was left in the codebase in error. The
`set_se_binary_workload` function works fine with multi-threaded
applications. This hasn't been a restriction for some time.

Change-Id: I1b1d27c86f8d9284659f62ae27d752bf5325e31b
2023-10-05 10:20:55 -07:00
Bobby R. Bruce
06bbc43b46 ext: Remove std::binary_function from DramPower
`std::binary_function` was deprecated in C++11 and officially
removed in C++17.

This caused a compilation error on some systems. Fortunately it can be
safely removed. It was unnecessary. The commandItemSorter was compliant
with `sort` regardless.

Change-Id: I0d910e50c51cce2545dd89f618c99aef0fe8ab79
2023-10-05 10:18:19 -07:00
Bobby R. Bruce
39c7e7d1ed arch: Adding missing override to PCState.set
As highlighted in this failing compiler test:
https://github.com/gem5/gem5/actions/runs/6348223508/job/17389057995

Clang was failing when compiling "build/ALL/gem5.opt" due to missing
overrides in `PCState`'s "set" function.

This was observed in Clang-14 and, strangely, Clang-8.

Change-Id: I240c1087e8875fd07630e467e7452c62a5d14d5b
2023-10-05 10:18:19 -07:00
Roger Chang
ea3ee880aa arch-riscv: Implement Zcb instructions
Added the following instructions:
c.lbu
c.lh
c.lhu
c.sb
c.sh
c.zext.b
c.sext.b
c.zext.h
c.sext.h
c.zext.w
c.not
c.mul

Reference: https://github.com/riscv/riscv-code-size-reduction
Change-Id: Ib04820bf5591b365a3bfbbd8b90655a8a1d844cf
2023-10-05 18:46:35 +08:00
Leo Redivo
98a6cd6ee2 misc: changed call get_default_disk_device to get_disk_device
Change-Id: I240da78a658208211ede6648547dfa4c971074a1
2023-10-04 13:32:35 -07:00
Víctor Soria
6411b2255c mem-ruby,configs: Add CHI far atomics support
Introduce far atomic operations in CHI protocol.
Three configuration parameters have been used to tune this behavior:

  policy_type:       sets the atomic policy to one of the described in our paper
  atomic_op_latency: simulates the AMO ALU operation latency
  comp_anr:          configures the Atomic No return transaction to split
                     CompDBIDResp into two different messages DBIDResp and Comp

Change-Id: I087afad9ad9fcb9df42d72893c9e32ad5a5eb478
2023-10-04 19:19:08 +02:00
Víctor Soria
12dada2dc5 arch-arm: Correct return operand in swap instructions
Swap instructions are configured as non returning AMO operations. This is wrong because they
return the previous value stored in the target memory position
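
The semantics at issue can be shown with a tiny model of a swap on a dictionary standing in for memory. The function name is hypothetical:

```python
def atomic_swap(memory: dict, addr: int, new_value: int) -> int:
    """Store new_value at addr and return the value that was there before."""
    old_value = memory[addr]
    memory[addr] = new_value
    return old_value
```

A non-returning AMO would perform the store and discard `old_value`, which is why modelling swap that way loses its result.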

Change-Id: I84d75a571a8eaeaee0dbfac344f7b34c72b47d53
2023-10-04 19:11:01 +02:00
Víctor Soria
4fd9d66c53 tests,mem-ruby: Enhance ruby false sharing test with Atomics
New ruby mem test includes a percentage of AMOs that will be executed randomly in ruby mem test

Change-Id: Ie95ed78e59ea773ce6b59060eaece3701fe4478c
2023-10-04 19:11:01 +02:00
Jason Lowe-Power
6f5d877b1a misc: Update gem5 to use clang-15 and clang-16 (#365)
This introduces the changes necessary for clang-15 and clang-16 to run
within gem5, and adds them to the compiler tests.

This also updates the dockerfiles for ubuntu 22.04 to include the steps
necessary to compile clang-15 and clang-16.
2023-10-04 09:51:12 -07:00
Matthew Poremba
2b97f17fe1 gpu-compute: Fix dynamic scratch size test
ROCm supports dynamically allocating scratch space, which resides in
framebuffer memory, to reduce the amount of memory allocated for kernels
that have not yet launched. The size of the scratch space allocated is
located in task->amdQueue.compute_tmpring_size_wavesize. This size is in
kilobytes. The AQL task contains the number of bytes requested *per work
item*, however we currently check if there is enough tmpring space by
comparing a single work item. This should instead check the size *per
wavefront*.

This causes problems in applications where multiple kernels use dynamic
scratch allocation and a later kernel requires more space than the
earlier kernel. The only application being tested that does this is
LULESH. This was resulting in the scratch space being too small,
resulting in workgroups clobbering each other's private memory leading
to some nasty bugs. It is fixed by this patch as task->amdQueue will be
re-read from the host and will contain the updated tmpring size. After
this there is enough scratch space and LULESH makes forward progress.

Change-Id: Ie9e0f92bb98fd3c3d6c2da3db9ee65352f9ae070
2023-10-04 09:38:31 -05:00
Andreas Sandberg
7806eaad51 arch: Add instruction size and PC set methods (#357)
Add the instruction size of a static instruction. The x86 and Arm decoders
now add the instruction size to the macro instruction. However, microops
are still handled by the fetch stage, which is not ideal.
Furthermore, we add a set method to the PC state. It allows setting a PC
state to a certain address.
Both methods are required for the decoupled front-end.

Change-Id: I311fe3f637e867c42dee7781f5373ea2e69e2072
2023-10-04 10:49:30 +01:00
Bobby R. Bruce
57e0c7d006 arch-riscv: FS bits -> DIRTY for more floating point loads (#381)
The affected instructions are,
- c.flw
- c.flwsp
- flh
- flw

This change is related to [1] [2], which also aim to change the FS bits
to DIRTY when the state of any floating point register might change.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/pull/370
2023-10-03 11:51:47 -07:00
Vishnu Ramadas
d3637a489d configs: Add option to disable AVX in GPUFS
GPUFS+KVM simulations automatically enable AVX. This commit adds a
command line option to disable AVX if it's not needed for a GPUFS
simulation.

Change-Id: Ic22592767dbdca86f3718eca9c837a8e29b6b781
2023-10-03 12:10:42 -05:00
Vishnu Ramadas
53627cc39c configs: Add configurable GPU L1,L2 num banks and L2 latencies
Previously, the L1, L2 number of banks and L2 latencies were not
configurable through command line arguments. This commit adds support to
configure them through the arguments '--tcp-num-banks' for number of
banks in L1, '--tcc-num-banks' for number of banks in L2, and
'--tcc-tag-access-latency', and '--tcc-data-access-latency'

Change-Id: Ie3b713ead16865fd7120e2d809ebfa56b69bc4a1
2023-10-03 11:51:28 -05:00
Harshil Patel
3af3c1121b stdlib, resources: Addressed requested changes
Change-Id: I22abdc3bdcdde52301ed10cb3113e8925159c245
Co-authored-by: Kunal Pai <kunpai@users.noreply.github.com>
2023-10-02 23:27:32 -07:00
Vishnu Ramadas
f69191a31d dev-amdgpu: Remove duplicate writes to PM4 queue pointers
During checkpoint restoration, the unserialize() function writes rptr,
wptr, and indirect buffer rptr, wptr to PM4 queue's rptr, wptr fields.
This commit updates this to write only the relevant pointers to the
queue structure. If indirect buffers are used, then it writes only the
indirect buffer pointers to the queue. If they are not used, then it
writes rptr, wptr values to the queue.

Change-Id: Iedb25a726112e1af99cc1e7bc012de51c4ebfd45
2023-10-02 19:37:46 -05:00
Vishnu Ramadas
ae5a51994c mem-ruby: Update cache recorder to use GPUCoalescer port for GPUs
Previously, the cache recorder used the Sequencer to issue flush
requests and cache warmup requests. The GPU however uses GPUCoalescer to
access the cache, and not the Sequencer. This commit adds a GPUCoalescer
map to the cache recorder and uses it to send flushes and cache warmup
requests to any GPU caches in the system

Change-Id: I10490cf5e561c8559a98d4eb0550c62eefe769c9
2023-10-02 19:05:10 -05:00
Vishnu Ramadas
085789d00c mem-ruby: Add flush support to GPU_VIPER protocol
This commit adds flush support to the GPU VIPER coherence protocol. The
L1 cache will now initiate a flush request if the packet it receives
is of type RubyRequestType_FLUSH. During the flush process, the L1 cache
will send a request to L2 if it is in either the V or I state. L2 will issue a
flush request to the directory if its cache line is in the valid
state before invalidating its copy. The directory, on receiving this
request, writes data to memory and sends an ack back to the L2. L2
forwards this ack back to the L1, which then ends the flush by calling
the write callback

Change-Id: I9dfc0c7b71a1e9f6d5e9e6ed4977c1e6a3b5ba46
2023-10-02 19:05:10 -05:00
Vishnu Ramadas
61e39d5b26 mem-ruby: Add cache cooldown and warmup support to GPUCoalescer
The GPU Coalescer does not contain cache cooldown and warmup support.
This commit updates the coalescer to support cache cooldown during flush
and warmup during checkpoint restore.

Change-Id: I5459471dec20ff304fd5954af1079a7486ee860a
2023-10-02 19:05:04 -05:00
Vishnu Ramadas
a50ead5907 mem-ruby: Add Flush as a supported memory type in VIPERCoalescer
This commit adds flush as a recognized memory type in VIPERCoalescer.

Change-Id: I0f1b6f4518548e8e893ef681955b12a49293d8b4
2023-10-02 19:02:55 -05:00
Vishnu Ramadas
107e05266d dev-amdgpu: Add aql, hsa queue information to checkpoint-restore
GPUFS uses aql information from PM4 queues to initialize doorbells. This
commit adds aql information to the checkpoint so that it can be used
during restoration to correctly initialize all doorbells. Additionally,
this commit also sets the hsa queue correctly during checkpoint-restoration

Change-Id: Ief3ef6dc973f70f27255234872a12c396df05d89
2023-10-02 19:02:50 -05:00
Harshil Patel
7301d4bd19 python: Add importer to standalone gem5py_m5 (#369)
I believe the point of this binary was to allow people to use the m5
objects without the entire gem5 binary. However, without adding the
importer call, this did not work. Unfortunately, with the importer call
there is a circular dependence on the original gem5py.cc file.
Therefore, this change creates a new file that has the importer call.

Now, with the `gem5py_m5` binary you can run python code that references
modules in `src/python`. Note that `_m5` is not available, so anything
that depends on the gem5 SimObjects' implementation will not work.
However, this can still be useful for things like getting Resources,
processing stats, etc.
2023-10-02 14:28:45 -07:00
David Schall
7d2e1ee789 arch: Add instruction size and PC set methods
Adds the instruction size to all static instructions. x86, arm
and RISC-V decoders add the instruction size to every decoded
macro instruction. As microops should reflect the size of
their parent macroop the set method is overwritten to pass the
size to all microops.
Furthermore, we add a set method to the PC state. It allows
setting a PC state to a certain address.
Both methods are required for the decoupled front-end.

Change-Id: I311fe3f637e867c42dee7781f5373ea2e69e2072
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-10-02 20:10:57 +00:00
Hoa Nguyen
da72590c19 arch-riscv: FS bits -> DIRTY for more floating point loads
The affected instructions are,
- c.flw
- c.flwsp
- flh
- flw

This change is related to [1] [2], which also aim to change the
FS bits to DIRTY when the state of any floating point register
might change.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/pull/370

Change-Id: I098e1b1812fb352bd5d3614ff5d3547e58903b65
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-01 23:12:25 -07:00
Bobby R. Bruce
e211674625 util-docker: Fix/Improve ubuntu-22.04_clang-16
* Removes `+` symbol accidentally left in (this broke building).
* Removes `ARG` thus making the Docker exclusively for clang-16.
* Adds "llvm.sh" to the repo. This stops us being dependent on the url
  download. The script is under the apache license therefore compatible.
* Merges several `apt install` commands into one.

Change-Id: Iaf411656aac83f67f5395b20efd96ecc1eabb263
2023-09-29 13:12:19 -07:00
Harshil Patel
f9781af6e5 mem: fix bug in 3-level cache (#265)
The L3 cache did not work due to argument type mismatch in the call to
the constructor `DMAController`. The second argument is expecting a
`RubySystem` type but the code passes in a `cache_line_size` variable.
After I change the second argument to `self.ruby_system` everything
works.
2023-09-29 10:59:18 -07:00
Bobby R. Bruce
2b791ff556 misc: fix g++13 overloaded-virtual warning (#363)
There are two overloaded-virtual issues reported by g++13.

1. Copy assignment and move assignment overload is hidden in the derived
class

[ CXX] src/mem/cache/replacement_policies/weighted_lru_rp.cc ->
ALL/mem/cache/replacement_policies/weighted_lru_rp.o
In file included from src/mem/cache/base.hh:61,
                 from src/mem/cache/base.cc:46:
src/mem/cache/cache_blk.hh:172:5: error: ‘virtual gem5::CacheBlk&
gem5::CacheBlk::operator=(gem5::CacheBlk&&)’ was hidden
[-Werror=overloaded-virtual=]
  172 |     operator=(CacheBlk&& other)
      |     ^~~~~~~~
src/mem/cache/cache_blk.hh:518:19: note: by ‘gem5::TempCacheBlk&
gem5::TempCacheBlk::operator=(const gem5::TempCacheBlk&)’
  518 |     TempCacheBlk& operator=(const TempCacheBlk&) = delete;
      |                   ^~~~~~~~

In this case, we can explicitly use the parent operator= to keep the
function overload.

2. Intended overload hidden in SystemC is reported as error.

In file included from
src/systemc/ext/tlm_utils/simple_initiator_socket.h:24,
                 from src/systemc/tlm_bridge/gem5_to_tlm.hh:72,
from build/ALL/python/_m5/param_Gem5ToTlmBridge256.cc:17:
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh: In
instantiation of ‘class tlm::tlm_base_initiator_socket<256,
tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1,
sc_core::SC_ONE_OR_MORE_BOUND>’:

src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:185:7:
required from ‘class tlm::tlm_initiator_socket<256,
tlm::tlm_base_protocol_types, 1, sc_core::SC_ONE_OR_MORE_BOUND>’
src/systemc/ext/tlm_utils/simple_initiator_socket.h:37:7: required from
‘class
tlm_utils::simple_initiator_socket_b<sc_gem5::Gem5ToTlmBridge<256>, 256,
tlm::tlm_base_protocol_types, sc_core::SC_ONE_OR_MORE_BOUND>’
src/systemc/ext/tlm_utils/simple_initiator_socket.h:156:7: required from
‘class tlm_utils::simple_initiator_socket<sc_gem5::Gem5ToTlmBridge<256>,
256, tlm::tlm_base_protocol_types>’
src/systemc/tlm_bridge/gem5_to_tlm.hh:147:46: required from ‘class
sc_gem5::Gem5ToTlmBridge<256>’
/usr/include/c++/13/type_traits:1411:38: required from ‘struct
std::is_base_of<sc_gem5::Gem5ToTlmBridgeBase,
sc_gem5::Gem5ToTlmBridge<256> >’
ext/pybind11/include/pybind11/detail/../detail/common.h:880:59: required
from ‘struct pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>,
sc_gem5::Gem5ToTlmBridgeBase,
std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete>
>::is_valid_class_option<sc_gem5::Gem5ToTlmBridgeBase>’
ext/pybind11/include/pybind11/detail/../detail/common.h:719:35: required
by substitution of ‘template<class ... Ts> using
pybind11::detail::all_of = pybind11::detail::bool_constant<(Ts::value &&
...)> [with Ts = {pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>,
sc_gem5::Gem5ToTlmBridgeBase,
std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete>
>::is_valid_class_option<sc_gem5::Gem5ToTlmBridgeBase>,
pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>,
sc_gem5::Gem5ToTlmBridgeBase,
std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete>
>::is_valid_class_option<std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>,
pybind11::nodelete> >}]’
ext/pybind11/include/pybind11/pybind11.h:1506:70: required from ‘class
pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>,
sc_gem5::Gem5ToTlmBridgeBase,
std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >’
build/ALL/python/_m5/param_Gem5ToTlmBridge256.cc:34:179: required from
here
src/systemc/ext/tlm_utils/../core/sc_port.hh:125:18: error: ‘void
sc_core::sc_port_b<IF>::bind(sc_core::sc_port_b<IF>&) [with IF =
tlm::tlm_fw_transport_if<>]’ was hidden [-Werror=overloaded-virtual=]
125 | virtual void bind(sc_port_b<IF> &p) { sc_port_base::bind(p); }
      |                  ^~~~
In file included from
src/systemc/ext/tlm_utils/simple_initiator_socket.h:27:

src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:133:18:
note: by ‘tlm::tlm_base_initiator_socket<256,
tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1,
sc_core::SC_ONE_OR_MORE_BOUND>::bind’
133 | virtual void bind(bw_interface_type &ifs) {
(get_base_export())(ifs); }
      |                  ^~~~
src/systemc/ext/tlm_utils/../core/sc_port.hh:124:18: error: ‘void
sc_core::sc_port_b<IF>::bind(IF&) [with IF =
tlm::tlm_fw_transport_if<>]’ was hidden [-Werror=overloaded-virtual=]
  124 |     virtual void bind(IF &i) { sc_port_base::bind(i); }
      |                  ^~~~

src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:133:18:
note: by ‘tlm::tlm_base_initiator_socket<256,
tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1,
sc_core::SC_ONE_OR_MORE_BOUND>::bind’
133 | virtual void bind(bw_interface_type &ifs) {
(get_base_export())(ifs); }
      |                  ^~~~

From the code comment, it's intended in SystemC header.

// The overloaded virtual is intended in SystemC, so we'll disable the warning.
// Please check section 9.3 of SystemC 2.3.1 release note for more details.

The issue is we should move the skip to the base class.
2023-09-29 10:53:52 -07:00
Harshil Patel
8182f8084b stdlib, resources, tests: Introduce Suite of Workloads
This patch introduces a new category called "suite".
A suite is a collection of workloads.
Each workload in a SuiteResource has a tag that can be narrowed down
through the function with_input_group.
Also, the set of input groups can be seen through list_input_groups.
Added unit tests to test all functions of SuiteResource class.
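
A toy model of the idea, assuming each workload carries a set of input-group tags. The class below is hypothetical and much simpler than the real `SuiteResource`; only the method names `with_input_group` and `list_input_groups` come from this message:

```python
class ToySuite:
    """Hypothetical stand-in: maps workload names to sets of input groups."""

    def __init__(self, workloads):
        self._workloads = dict(workloads)

    def list_input_groups(self):
        """Return every input group used by any workload in the suite."""
        groups = set()
        for tags in self._workloads.values():
            groups |= set(tags)
        return groups

    def with_input_group(self, group):
        """Return a narrower suite with only workloads tagged with group."""
        return ToySuite(
            {name: tags for name, tags in self._workloads.items() if group in tags}
        )

    def names(self):
        return sorted(self._workloads)


suite = ToySuite({"a-small": {"small"}, "a-large": {"large"}, "b-small": {"small"}})
```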

Change-Id: Iddda5c898b32b7cd874987dbe694ac09aa231f08

Co-authored-by: Kunal Pai <kunpai@ucdavis.edu>
2023-09-29 10:50:09 -07:00
Bobby R. Bruce
3a35bdf57a arch-riscv: Update FS bits when doing floating point loads (#370)
This problem is similar to the problem described in [1]. This problem
produces symptoms as described in [2].

In short, the Linux kernel relies on the CSR_STATUS's FS bits to decide
whether to save the floating point registers. If the FS bits are set to
DIRTY, the floating point registers will be saved during context
switching / task switching.

Currently, with the patch in [1], we only change the FS bits upon every
floating arithmetic instruction. However, since floating load
instructions also mutate the state of floating point registers, the FS
bits should be updated to DIRTY.

The problem in [2] arose when the program populates the content of one
floating register to an array by repeatedly using `fld fa5, EA`. A
context switch occurred upon a page fault, and while handling that page
fault, the kernel might have to handle an interrupt. This caused the
kernel to task switch between handling page fault and handling
interrupt. This caused __switch_to() to be called, which will save the
floating point registers only if the SD (indirectly set by FS) bits are
set to DIRTY, while restoring the floating point registers to the
switch-to task [3]. This caused the floating point registers to be
zeroed out when it was restored as it was never saved before.
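
The failure mode can be reproduced in a small model. FS is the two-bit field at bits 14:13 of the status register, with 3 meaning Dirty; the helper names and the simplified save-on-switch logic are assumptions for illustration and are neither gem5 nor Linux code:

```python
FS_SHIFT = 13                      # FS occupies status bits 14:13
FS_MASK = 0b11 << FS_SHIFT
FS_OFF, FS_INITIAL, FS_CLEAN, FS_DIRTY = 0, 1, 2, 3


def get_fs(status: int) -> int:
    return (status & FS_MASK) >> FS_SHIFT


def set_fs(status: int, fs: int) -> int:
    return (status & ~FS_MASK) | (fs << FS_SHIFT)


def fp_load(status, fregs, reg, value, mark_dirty=True):
    """Write an FP register; mark FS as DIRTY unless modelling the bug."""
    fregs[reg] = value
    return set_fs(status, FS_DIRTY) if mark_dirty else status


def switch_out(status, fregs, saved):
    """Save the FP registers only when FS is DIRTY (simplified)."""
    if get_fs(status) == FS_DIRTY:
        return set_fs(status, FS_CLEAN), list(fregs)
    return status, saved
```

If `fp_load` is called with `mark_dirty=False`, `switch_out` keeps the stale saved copy, so a later restore would lose the loaded value.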

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/issues/349
[3]
https://github.com/torvalds/linux/blob/v6.5/arch/riscv/include/asm/switch_to.h#L56
2023-09-29 10:47:05 -07:00
Melissa Jost
a79dc3f23c util: Add steps to compile clang-15 and clang-16
This updates the dockerfiles for ubuntu 22.04 to include the
steps necessary to compile clang-15 and clang-16.

Change-Id: I2bba6393ab93a6ce05a2c3ce31f3bbc71bcdca7c
2023-09-29 08:32:01 -07:00
Hoa Nguyen
6640447c1e arch-riscv: Update FS bits when doing floating point loads
This problem is similar to the problem described in [1].
This problem produces symptoms as described in [2].

In short, the Linux kernel relies on the CSR_STATUS's FS bits
to decide whether to save the floating point registers. If
the FS bits are set to DIRTY, the floating point registers will
be saved during context switching / task switching.

Currently, with the patch in [1], we only change the FS bits
upon every floating arithmetic instruction. However, since
floating load instructions also mutate the state of floating
point registers, the FS bits should be updated to DIRTY.

The problem in [2] arose when the program populates the content
of one floating register to an array by repeatedly using
`fld fa5, EA`. A context switch occurred upon a page fault, and
while handling that page fault, the kernel might have to handle
an interrupt. This caused the kernel to task switch between
handling page fault and handling interrupt. This caused
__switch_to() to be called, which will save the floating point
registers only if the SD (indirectly set by FS) bits are set to
DIRTY, while restoring the floating point registers to the
switch-to task [3]. This caused the floating point registers to
be zeroed out when it was restored as it was never saved before.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/issues/349
[3] https://github.com/torvalds/linux/blob/v6.5/arch/riscv/include/asm/switch_to.h#L56

Change-Id: Ia5656da5a589a8e29fb699d2ee12885b8f3fa2d2
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-28 19:14:29 -07:00
Jason Lowe-Power
aaad79cf51 python: Add importer to standalone gem5py_m5
I believe the point of this binary was to allow people to use the m5
objects without the entire gem5 binary. However, without adding the
importer call, this did not work. Unfortunately, with the importer call
there is a circular dependence on the original gem5py.cc file.
Therefore, this change creates a new file that has the importer call.

Now, with the `gem5py_m5` binary you can run python code that references
modules in `src/python`. Note that `_m5` is not available, so anything
that depends on the gem5 SimObjects' implementation will not work.
However, this can still be useful for things like getting Resources,
processing stats, etc.

Change-Id: I5c0e5d1a669fe5ce491458df916f2049c81292eb
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-09-28 11:22:33 -07:00
Bobby R. Bruce
62d34ef374 misc: 'sim{out/err}' -> 'sim{out/err}.txt' (#250)
By default, the `--stderr-file` and `--stdout-file` arguments were
directing the simulator to output files named "simerr" and "simout"
respectively if an output redirect was requested.

A small annoyance is these files lack an extension meaning programs
refuse to open them, or don't do so without additional effort. On many
systems they are assumed to be scripts.

This patch adds the `.txt` extension to both, thus clearly indicating to
other programs these are text files and can be opened and read as such.
2023-09-27 17:36:03 -07:00
Bobby R. Bruce
5d254ffb02 stdlib, resources: Added pretty printing resource (#323)
- Implemented a __str__ for AbstractResource. __str__ prints the resource
category, id and version.
A link to the resources website is also printed.
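
A sketch of what such a `__str__` might look like. The class, its attribute names, and the output layout are hypothetical; only the fields printed (category, id, version, website link) follow this message:

```python
RESOURCES_URL = "https://resources.gem5.org"  # link text layout is assumed


class ToyResource:
    """Hypothetical stand-in for a resource with pretty printing."""

    def __init__(self, category: str, resource_id: str, version: str):
        self.category = category
        self.resource_id = resource_id
        self.version = version

    def __str__(self) -> str:
        return (
            f"{self.category}({self.resource_id}, {self.version})\n"
            f"For more information, see {RESOURCES_URL}"
        )
```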
2023-09-27 17:32:35 -07:00
Bobby R. Bruce
14b928f77c base: Add a warning when failing to insert a whole symbol table (#361)
Currently we drop the insertion of a whole symbol table if the name of
one symbol already exists in the base table. Having similar symbols
across different binaries is common.

This change adds a warning and recommends a fix instead of silently
dropping the table.
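
The behaviour can be sketched with dictionaries standing in for symbol tables. The function name and warning text are hypothetical; the point is only that a name clash now produces a warning instead of a silent drop:

```python
import warnings


def insert_table(base: dict, other: dict) -> bool:
    """Insert all of other into base, or warn and insert nothing on a clash."""
    clashes = sorted(set(base) & set(other))
    if clashes:
        warnings.warn(
            f"symbol table not inserted: {len(clashes)} name(s) already "
            f"exist, e.g. {clashes[0]!r}"
        )
        return False
    base.update(other)
    return True
```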
2023-09-27 17:26:03 -07:00
Bobby R. Bruce
074fa4c604 misc,ext,tests: Automatically split CI TestLib tests across GitHub Action jobs (#263)
This PR utilizes GitHub Actions matrices to automatically distribute
the CI testlib gem5 build and test jobs across available GitHub Action
Runners.

The CI tests (the `quick` testlib tests, i.e. those run with `./main.py
run`) are distributed across the runners on a per directory basis ---
all directories under "tests/gem5" are run as their own jobs.
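
One way to derive such a per-directory job list is to enumerate the subdirectories and emit them as JSON for a workflow matrix. This is a sketch only; the function name and the `test-dir` key are assumptions, and the PR may generate its matrix differently:

```python
import json
from pathlib import Path


def directory_matrix(tests_root: str) -> str:
    """Return a JSON matrix with one entry per subdirectory of tests_root."""
    dirs = sorted(p.name for p in Path(tests_root).iterdir() if p.is_dir())
    return json.dumps({"test-dir": dirs})
```

Called on "tests/gem5", this would list each test directory once, so each can run as its own job.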

The necessary gem5 builds for each workflow are now automatically
inferred via the introduction of `./main.py list`'s `--build-targets`
flag which returns the gem5 build target for a given test or collection
of tests. E.g., `./main.py list --build-targets` will return the build
targets for all the `quick` testlib tests and `./main.py list
--build-targets --uid=<id>` will return the build targets the test suite
`<id>` requires.

Moving from monolithic jobs to fine-grained ones will make the location
of test failures more obvious. Each job has its own artifact containing
"test/testing-results" for the tests run in that job. In addition,
maintenance of these files should become less burdensome due to less
hardcoding.
2023-09-27 14:32:16 -07:00
Bobby R. Bruce
3a0f4598b9 cpu-o3: Mark getWritableRegOperand() in O3CPU as a regwrite (#360)
As discussed here, [1], O3CPU counts getWritableRegOperand() as a reg
read, while SimpleCPU variants count getWritableRegOperand() as a reg
write.

This patch fixes this inconsistency. Here, I assume that if
getWritableRegOperand() is used, setReg() will not be used again to
write to the destination register.

[1] https://github.com/gem5/gem5/pull/341
2023-09-27 14:31:38 -07:00
Bobby R. Bruce
49a1d48264 arch-x86: properly initialize the auxv platform string (#347)
The auxv platform string was not copied to the same location that was
pointed to by the value of AT_PLATFORM; instead, it was copied over the
auxv random buffer. This patch fixes this by copying the auxv platform
string to the right offset in the initial program stack.

GitHub issue: https://github.com/gem5/gem5/issues/346
2023-09-27 14:31:19 -07:00
Bobby R. Bruce
4638434b97 arch-x86: make popx87 micro-op actually pop st(0) (#345)
The popx87 micro-op did not in fact pop the st(0) floating-point
register off the stack; it acted as a no-op. This patch fixes the bug by
passing the spm=1 argument to PopX87's superclass to indicate the
floating-point stack pointer should be incremented.
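
The effect of the stack pointer modifier can be modelled on the 3-bit x87 TOP field, where a pop increments TOP modulo 8. The function below is an illustrative model, not the micro-op implementation:

```python
def pop_x87(top: int, spm: int = 1) -> int:
    """Return the new x87 TOP after applying the stack pointer modifier."""
    return (top + spm) & 0x7  # TOP is a 3-bit field and wraps around
```

With `spm=0` the value of TOP is unchanged, which matches the no-op behaviour described before the fix.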

GitHub issue: https://github.com/gem5/gem5/issues/344
2023-09-27 14:31:00 -07:00
Harshil Patel
633bdc08f2 stdlib: Addressed requested changes
- Added multiline string for print message

- Added get_category_name method instead of having category as variable

Change-Id: I51e0e14a70e802453c21070711b200bc47994ba3
2023-09-27 11:36:51 -07:00
Melissa Jost
34c3676105 misc: Update gem5 to use clang-15 and clang-16
This introduces the changes necessary for clang-15 and clang-16
to run within gem5, and adds them to the compiler tests.

Change-Id: If809eae1bd8c366b4d62476891feff0625bdf210
2023-09-27 09:35:18 -07:00
Yu-hsin Wang
9ca2672cab misc: fix g++13 overloaded-virtual warning
There are two overloaded-virtual issues reported by g++13.

1. Copy assignment and move assignment overload is hidden in the derived
   class

 [     CXX] src/mem/cache/replacement_policies/weighted_lru_rp.cc -> ALL/mem/cache/replacement_policies/weighted_lru_rp.o
In file included from src/mem/cache/base.hh:61,
                 from src/mem/cache/base.cc:46:
src/mem/cache/cache_blk.hh:172:5: error: ‘virtual gem5::CacheBlk& gem5::CacheBlk::operator=(gem5::CacheBlk&&)’ was hidden [-Werror=overloaded-virtual=]
  172 |     operator=(CacheBlk&& other)
      |     ^~~~~~~~
src/mem/cache/cache_blk.hh:518:19: note:   by ‘gem5::TempCacheBlk& gem5::TempCacheBlk::operator=(const gem5::TempCacheBlk&)’
  518 |     TempCacheBlk& operator=(const TempCacheBlk&) = delete;
      |                   ^~~~~~~~

In this case, we can add an explicit using-declaration for the parent's
operator= to keep the function overloads visible.
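
A minimal sketch of this pattern, using hypothetical `Base` and `Derived`
classes rather than the real CacheBlk and TempCacheBlk (all names below are
assumptions made for illustration):

```cpp
#include <cassert>
#include <utility>

// Hypothetical base class with virtual copy and move assignment.
struct Base
{
    int value = 0;
    virtual ~Base() = default;

    virtual Base&
    operator=(const Base& other)
    {
        value = other.value;
        return *this;
    }

    virtual Base&
    operator=(Base&& other)
    {
        value = other.value;
        other.value = 0;
        return *this;
    }
};

struct Derived : Base
{
    // Re-expose the Base overloads. Without this line, the declaration
    // below hides them, which is what -Woverloaded-virtual reports.
    using Base::operator=;

    Derived& operator=(const Derived&) = delete;
};

// Usage: assigning from a Base still resolves to the Base overloads.
inline void
usingDeclarationExample()
{
    Derived d;
    Base b;
    b.value = 7;
    d = b;             // Base::operator=(const Base&)
    assert(d.value == 7);
    d = std::move(b);  // Base::operator=(Base&&)
    assert(d.value == 7 && b.value == 0);
}
```

Assigning one `Derived` to another remains a compile error, because the
deleted overload is still the best match for that call.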

2. Intended overload hidden in SystemC is reported as error.

In file included from src/systemc/ext/tlm_utils/simple_initiator_socket.h:24,
                 from src/systemc/tlm_bridge/gem5_to_tlm.hh:72,
                 from build/ALL/python/_m5/param_Gem5ToTlmBridge256.cc:17:
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh: In instantiation of ‘class tlm::tlm_base_initiator_socket<256, tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1, sc_core::SC_ONE_OR_MORE_BOUND>’:
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:185:7:   required from ‘class tlm::tlm_initiator_socket<256, tlm::tlm_base_protocol_types, 1, sc_core::SC_ONE_OR_MORE_BOUND>’
src/systemc/ext/tlm_utils/simple_initiator_socket.h:37:7:   required from ‘class tlm_utils::simple_initiator_socket_b<sc_gem5::Gem5ToTlmBridge<256>, 256, tlm::tlm_base_protocol_types, sc_core::SC_ONE_OR_MORE_BOUND>’
src/systemc/ext/tlm_utils/simple_initiator_socket.h:156:7:   required from ‘class tlm_utils::simple_initiator_socket<sc_gem5::Gem5ToTlmBridge<256>, 256, tlm::tlm_base_protocol_types>’
src/systemc/tlm_bridge/gem5_to_tlm.hh:147:46:   required from ‘class sc_gem5::Gem5ToTlmBridge<256>’
/usr/include/c++/13/type_traits:1411:38:   required from ‘struct std::is_base_of<sc_gem5::Gem5ToTlmBridgeBase, sc_gem5::Gem5ToTlmBridge<256> >’
ext/pybind11/include/pybind11/detail/../detail/common.h:880:59:   required from ‘struct pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >::is_valid_class_option<sc_gem5::Gem5ToTlmBridgeBase>’
ext/pybind11/include/pybind11/detail/../detail/common.h:719:35:   required by substitution of ‘template<class ... Ts> using pybind11::detail::all_of = pybind11::detail::bool_constant<(Ts::value  && ...)> [with Ts = {pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >::is_valid_class_option<sc_gem5::Gem5ToTlmBridgeBase>, pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >::is_valid_class_option<std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >}]’
ext/pybind11/include/pybind11/pybind11.h:1506:70:   required from ‘class pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >’
build/ALL/python/_m5/param_Gem5ToTlmBridge256.cc:34:179:   required from here
src/systemc/ext/tlm_utils/../core/sc_port.hh:125:18: error: ‘void sc_core::sc_port_b<IF>::bind(sc_core::sc_port_b<IF>&) [with IF = tlm::tlm_fw_transport_if<>]’ was hidden [-Werror=overloaded-virtual=]
  125 |     virtual void bind(sc_port_b<IF> &p) { sc_port_base::bind(p); }
      |                  ^~~~
In file included from src/systemc/ext/tlm_utils/simple_initiator_socket.h:27:
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:133:18: note:   by ‘tlm::tlm_base_initiator_socket<256, tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1, sc_core::SC_ONE_OR_MORE_BOUND>::bind’
  133 |     virtual void bind(bw_interface_type &ifs) { (get_base_export())(ifs); }
      |                  ^~~~
src/systemc/ext/tlm_utils/../core/sc_port.hh:124:18: error: ‘void sc_core::sc_port_b<IF>::bind(IF&) [with IF = tlm::tlm_fw_transport_if<>]’ was hidden [-Werror=overloaded-virtual=]
  124 |     virtual void bind(IF &i) { sc_port_base::bind(i); }
      |                  ^~~~
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:133:18: note:   by ‘tlm::tlm_base_initiator_socket<256, tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1, sc_core::SC_ONE_OR_MORE_BOUND>::bind’
  133 |     virtual void bind(bw_interface_type &ifs) { (get_base_export())(ifs); }
      |                  ^~~~

From the code comment, it's intended in SystemC header.

// The overloaded virtual is intended in SystemC, so we'll disable the warning.
// Please check section 9.3 of SystemC 2.3.1 release note for more details.

The fix is to move the warning suppression to the base class.

Change-Id: I6683919e594ffe1fb3b87ccca1602bffdb788e7d
2023-09-27 13:43:28 +08:00
Andreas Sandberg
cfa13f9feb sim: Probe listener template with lambda (#356)
Adds a new probe listener template that is instantiated with a lambda
function, which is called by notify(). It is similar to ProbeListenerArg
with a class but provides more flexibility: the lambda can target an
object other than the one instantiating it, which allows listening to any
object. Furthermore, additional parameters can be passed in easily.
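
The idea can be sketched as follows. This is not gem5's probe API: the
`LambdaListener`, `ProbePoint` and `Counter` names are hypothetical, and the
sketch only illustrates a listener whose notify() forwards to a stored
lambda that captures an unrelated object plus an extra parameter.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical listener: notify() forwards to the stored lambda.
template <typename Arg>
class LambdaListener
{
  public:
    explicit LambdaListener(std::function<void(const Arg&)> fn)
        : callback(std::move(fn))
    {}

    void notify(const Arg& arg) { callback(arg); }

  private:
    std::function<void(const Arg&)> callback;
};

// Hypothetical probe point that fans a value out to its listeners.
template <typename Arg>
class ProbePoint
{
  public:
    void addListener(LambdaListener<Arg>* l) { listeners.push_back(l); }

    void
    notify(const Arg& arg)
    {
        for (auto* l : listeners)
            l->notify(arg);
    }

  private:
    std::vector<LambdaListener<Arg>*> listeners;
};

// An object unrelated to the code that creates the listener.
struct Counter
{
    int total = 0;
};

inline int
lambdaListenerExample()
{
    Counter counter;
    const int scale = 2;  // extra parameter captured by the lambda
    LambdaListener<int> listener(
        [&counter, scale](const int& v) { counter.total += v * scale; });

    ProbePoint<int> point;
    point.addListener(&listener);
    point.notify(3);
    point.notify(4);
    assert(counter.total == 14);
    return counter.total;
}
```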

Change-Id: Iba451357182caf25097b9ae201cd5c647aff3a4f
2023-09-26 10:08:24 +01:00
Giacomo Travaglini
f5968da41c mem-ruby: start using txnid and DBID identifiers in CHI transactions (#288)
With this PR our CHI implementation starts making use of the txnid and
DBID identifiers.
Note: we were already making use of the txnId for DVM messages to convey
the DVM address. This is still the case.
In the future we should realign the DVM logic so that the txnId is
solely used as a transaction identifier.
2023-09-26 09:51:47 +01:00
Hoa Nguyen
91e55d9c60 base: Add warning when failing to insert a whole symbol table
Currently we drop the insertion of a whole symbol table if the name
of one symbol already exists in the base table. Having similar
symbols across different binaries is very common.

This change adds a warning and recommends a fix instead of silently
dropping the table. This is useful for debugging when two or more
workloads, e.g. bootloader + kernel, are added separately.
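
A rough sketch of the all-or-nothing insertion with a warning, using a
plain std::map as a stand-in for the symbol table (the `SymbolMap` alias,
`insertTable` function and warning text are assumptions, not gem5's code):

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <map>
#include <string>

using SymbolMap = std::map<std::string, std::uint64_t>;

// Insert all of `other` into `base`, or none of it if any name clashes.
// Returns false and prints a warning when the table is dropped.
inline bool
insertTable(SymbolMap& base, const SymbolMap& other)
{
    for (const auto& entry : other) {
        if (base.count(entry.first)) {
            std::cerr << "warn: symbol '" << entry.first
                      << "' already exists; dropping the whole table\n";
            return false;
        }
    }
    base.insert(other.begin(), other.end());
    return true;
}

inline void
insertTableExample()
{
    SymbolMap base{{"main", 0x1000}};
    assert(insertTable(base, {{"start", 0x2000}}));
    // "main" clashes, so "helper" is not inserted either.
    assert(!insertTable(base, {{"main", 0x3000}, {"helper", 0x3100}}));
    assert(base.size() == 2);
}
```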

Change-Id: I9e4cf06037cd70926fb5cee3c4dab464daf0912e
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-25 16:27:21 -07:00
Hoa Nguyen
b759f22cc9 cpu-o3: Mark getWritableRegOperand() in O3CPU as a regwrite
As discussed here, [1], O3CPU counts getWritableRegOperand() as a
reg read, while SimpleCPU variants count getWritableRegOperand()
as a reg write.

This patch fixes this inconsistency. Here, I assume that if
getWritableRegOperand() is used, setReg() will not be used again
to write to the destination register.

[1] https://github.com/gem5/gem5/pull/341

Change-Id: If00049eb598f6722285e9e09419aef98ceed759f
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-25 12:29:28 -07:00
Jason Lowe-Power
010ac43369 arch-riscv: Make RISC-V decodeInst overridable (#350)
The change allows developers to implement and decode their own
non-standard instructions for the CPU models.
2023-09-25 06:43:56 -07:00
David Schall
7cb308db90 sim: Probe listener template with lambda
Adds a new probe listener template that is instantiated with
a lambda function, which is called by notify(). It is similar
to ProbeListenerArg with a class but provides more flexibility:
the lambda can target an object other than the one
instantiating it, which allows listening to any object.
Furthermore, additional parameters can be passed in easily.

Change-Id: Iba451357182caf25097b9ae201cd5c647aff3a4f
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-09-25 08:51:00 +00:00
Giacomo Travaglini
9d63a1492a cpu: Add override to TraceCPU init function (#348)
This fixes an issue that caused the clang compiler tests to fail here:
https://github.com/gem5/gem5/actions/runs/6195015407

Change-Id: I48c61539f497976c038c6e8e379d00285e1c39c7
2023-09-25 09:10:33 +01:00
Giacomo Travaglini
83224e2c85 arch: Enable customized decoder class name (#351)
Developers can make their own ISADesc action in the SConscript with their
decoder class name.

Change-Id: I011cf059642e178913e1f62df4e5c02401cc132e
2023-09-25 09:10:06 +01:00
Giacomo Travaglini
df60b0f5c9 arch-arm: Implement FEAT_FGT
Change-Id: I89391f17f353ab6ce555d65783977c1f30f64fc5
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-22 16:33:58 +01:00
Giacomo Travaglini
37b6824c4c arch-arm: Fix disassembly for NZCV read/writes
At the moment the instruction is disassembled as an integer
operation:

msrNZCV   x547, x0

Instead of

msr nzcv x0

Change-Id: I3f6576dccbe86db401c73747750ca3cfdf4055d5
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-22 16:33:58 +01:00
Roger Chang
d55f8f2716 arch: Enable customized decoder class name
Developers can make their own ISADesc action in the SConscript with
their decoder class name.

Change-Id: I011cf059642e178913e1f62df4e5c02401cc132e
2023-09-22 15:45:56 +08:00
Roger Chang
5b41112e03 arch-riscv: Make RISC-V decodeInst overridable
The change allows developers to implement and decode their own
non-standard instructions for the CPU models.

Bug: 289467440
Test: None
Change-Id: I67f4abc71596f819c1265e325784f51c8e9bb359
2023-09-22 11:38:22 +08:00
Bobby R. Bruce
391f62b213 misc: 'sim{out/err}' -> 'sim{out/err}.txt'
By default, the --stderr-file and --stdout-file arguments were
directing the simulator output to files named "simerr" and
"simout" respectively if an output redirect was requested.

A small annoyance is that these files lack an extension, meaning programs
refuse to open them, or do so only with some additional effort. On
many systems they are assumed to be scripts.

This patch adds the .txt extension to both, thus clearly indicating
to other programs these are text files and can be opened to be read
as such.

Change-Id: Iff5af4a9e6966b4467d005a029dbf401099fbd35
2023-09-21 12:57:43 -07:00
Bobby R. Bruce
f5a255c68d configs: Fixed Typo (#337)
Fixed a typo importing obtain_resource
2023-09-21 11:58:49 -07:00
Bobby R. Bruce
3f9afe96c6 python,util: Add Python MyPy Stubgen to enable Pylance IntelliSense (#307)
This allows us to generate stubs for the modules in gem5. The output
will be a "typings" directory which can be used by Pylance (Python
IntelliSense) to infer typings in Visual Studio Code.

Note: A "typings" directory in the root of the workspace is the default
location for Pylance to look for typings. This can be changed via
`python.analysis.stubPath` in "settings.json".

Usage
=====

```
pip3 install -r requirements.txt
scons build/ALL/gem5.opt -j$(nproc)
./build/ALL/gem5.opt util/gem5-stubgen.py
```
2023-09-21 11:52:16 -07:00
Melissa Jost
d297da3654 cpu: Add override to TraceCPU init function
This fixes an issue that caused the clang compiler tests to
fail here: https://github.com/gem5/gem5/actions/runs/6195015407

Change-Id: I48c61539f497976c038c6e8e379d00285e1c39c7
2023-09-21 10:11:39 -07:00
Nicholas Mosier
7298ebd49b arch-x86: properly initialize the auxv platform string
The auxv platform string was not copied to the same location that was
pointed to by the value of AT_PLATFORM; instead, it was copied over
the auxv random buffer. This patch fixes this by copying the auxv
platform string to the right offset in the initial program stack.

GitHub issue: https://github.com/gem5/gem5/issues/346

Change-Id: Ied4b660d5fc444a94acb97b799be0a3722438b5e
2023-09-21 05:16:17 +00:00
Nicholas Mosier
5697bf26a8 arch-x86: make popx87 micro-op actually pop st(0)
The popx87 micro-op did not in fact pop the st(0) floating-point
register off the stack; it acted as a no-op. This patch fixes the bug
by passing the spm=1 argument to PopX87's superclass to indicate the
floating-point stack pointer should be incremented.

GitHub issue: https://github.com/gem5/gem5/issues/344

Change-Id: I6e731882b6bcf8f0e06ebd2f66f673bf9da80717
2023-09-21 04:29:05 +00:00
Bobby R. Bruce
958eda6961 arch-riscv: Fix inst flags for jal and jalr (#325)
The jal and jalr share the same instruction format JumpConstructor,
which sets the IsCall and IsReturn flags by the register ID. However, it
may cause wrong instruction flags to be set for jal because the section
"handle the 'Jalr' instruction" is missing the opcode check. This PR fixes
the issue to ensure that IsReturn can only be set for Jalr.
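
A simplified sketch of why the opcode check matters. The names `JumpOp`,
`JumpFlags`, `isLinkReg` and `jumpFlags` are hypothetical, and the rule
below is a reduced form of the RISC-V call/return convention, not the exact
logic in gem5's JumpConstructor. In the jal encoding, the bits where jalr
holds rs1 are part of the immediate, so treating them as a register can set
IsReturn by accident.

```cpp
#include <cassert>

enum class JumpOp { Jal, Jalr };

struct JumpFlags
{
    bool isCall = false;
    bool isReturn = false;
};

// x1 (ra) and x5 (t0) are the link registers in the RISC-V convention.
inline bool isLinkReg(unsigned reg) { return reg == 1 || reg == 5; }

// `rs1Bits` is the raw rs1 field. For jal those bits belong to the
// immediate, so they must not be interpreted as a register.
inline JumpFlags
jumpFlags(JumpOp op, unsigned rd, unsigned rs1Bits)
{
    JumpFlags flags;
    if (isLinkReg(rd)) {
        flags.isCall = true;
    } else if (op == JumpOp::Jalr && isLinkReg(rs1Bits)) {
        // The opcode check is the important part: only jalr can return.
        flags.isReturn = true;
    }
    return flags;
}

inline void
jumpFlagsExample()
{
    // jal x0, <imm>: immediate bits happen to look like rs1 == x1.
    assert(!jumpFlags(JumpOp::Jal, 0, 1).isReturn);
    // jalr x0, 0(x1): the usual function return.
    assert(jumpFlags(JumpOp::Jalr, 0, 1).isReturn);
}
```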
2023-09-20 16:25:21 -07:00
Bobby R. Bruce
aa0702c6eb dev-amdgpu: Handle GPU atomics on host memory addresses (#328)
It is possible to execute a GPU atomic instruction using a memory
address that is in the host memory space (e.g., HMM, __managed__,
hipHostMalloc'd address). Since these are in host memory they are passed
to the SystemHub DmaDevice. However, this currently executes as a write
packet without modifying data. This leads to hangs in applications that
use atomics for forward progress (e.g., HeteroSync).

It is not clear where these are handled on a real GPU, but they are
certainly not handled by the software stack nor driver, so they must be
handled in hardware and therefore implemented in gem5. Handling for
atomics in the SystemHub makes the most sense.

To make atomics work a few extra changes need to be made to the
SystemHub. (1) The atomic is implemented as a host memory read, followed
by calling the AtomicOpFunctor, followed by a write. This requires a
second event to handle read response, performing atomic, and issuing a
write. (2) Atomics must be serialized otherwise two atomics might return
the same value which is incorrect. This patch adds serialization logic
for all request types to the same address to handle this. (3) With the
added complexity of the SystemHub, a new debug flag explicitly for
SystemHub is added.
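
Points (1) and (2) can be sketched as follows. This is a single-threaded
toy model with hypothetical names (`AtomicSerializer`, `enqueue`, `drain`),
not the SystemHub implementation: each atomic is a read, an in-place
modify, and a write, and requests to the same address are processed
strictly one after another.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using Addr = std::uint64_t;
// Stand-in for an atomic operation functor: modifies the value in place.
using AtomicOp = std::function<void(std::uint32_t&)>;

// Hypothetical model; names and structure are illustrative only.
class AtomicSerializer
{
  public:
    // Queue an atomic; requests to one address keep their arrival order.
    void
    enqueue(Addr addr, AtomicOp op)
    {
        pending[addr].push_back(std::move(op));
    }

    // Process the queued atomics for one address, one at a time.
    // Returns the value each atomic observed before modifying memory.
    std::vector<std::uint32_t>
    drain(Addr addr)
    {
        std::vector<std::uint32_t> observed;
        auto& queue = pending[addr];
        while (!queue.empty()) {
            std::uint32_t data = memory[addr];  // (1) read host memory
            observed.push_back(data);
            queue.front()(data);                // (2) apply the atomic op
            memory[addr] = data;                // (3) write the result back
            queue.pop_front();
        }
        return observed;
    }

    std::uint32_t read(Addr addr) { return memory[addr]; }

  private:
    std::unordered_map<Addr, std::uint32_t> memory;
    std::unordered_map<Addr, std::deque<AtomicOp>> pending;
};

inline void
atomicSerializerExample()
{
    AtomicSerializer hub;
    auto increment = [](std::uint32_t& v) { v += 1; };
    hub.enqueue(0x1000, increment);
    hub.enqueue(0x1000, increment);
    // Serialized atomics never observe the same old value.
    assert((hub.drain(0x1000) == std::vector<std::uint32_t>{0, 1}));
    assert(hub.read(0x1000) == 2);
}
```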

Testing done: The heterosync application with input "sleepMutex 10 16 4"
previously hung before this patch. It passes with the patch applied.
This application tests both (1) and (2) above, as it allocates locks
with hipHostMalloc and has multiple workgroups sending an atomic request
in the same Tick, verifying the serialization mechanism.
2023-09-20 16:24:56 -07:00
Bobby R. Bruce
4526a314a9 arch-x86: fix negative overflow check bug in PACK micro-op (#332)
The implementation of the x86 PACK micro-op had a logical bug that
caused the `PACKSSWB` and `PACKSSDW` instructions to produce incorrect
results. Specifically, due to a signedness error, the overflow check for
negative integers being packed always evaluated to true, resulting in
all negative integers being packed as -1 in the output.

This patch fixes the signedness error that causes the bug.
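
For reference, the signed saturation that `PACKSSWB` is expected to apply
to each 16-bit element can be sketched like this. The function name is
hypothetical and this is not the micro-op's code; the point is that both
range checks are done on signed values.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Saturate a signed 16-bit value to the signed 8-bit range [-128, 127].
inline std::int8_t
saturateToInt8(std::int16_t v)
{
    constexpr std::int16_t lo = std::numeric_limits<std::int8_t>::min();
    constexpr std::int16_t hi = std::numeric_limits<std::int8_t>::max();
    if (v > hi)
        return static_cast<std::int8_t>(hi);
    if (v < lo)  // signed comparison: only true for values below -128
        return static_cast<std::int8_t>(lo);
    return static_cast<std::int8_t>(v);
}

inline void
saturateExample()
{
    assert(saturateToInt8(-5) == -5);      // in range, kept as is
    assert(saturateToInt8(-300) == -128);  // negative overflow
    assert(saturateToInt8(300) == 127);    // positive overflow
}
```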

GitHub issue: https://github.com/gem5/gem5/issues/331
2023-09-20 16:18:16 -07:00
Leo Redivo
83374bdf99 misc: changed name get_default_disk_device to get_disk_device
Change-Id: Ida9673445a4426ddedc8221010204bd2b71103a5
2023-09-20 15:28:49 -07:00
Marco Kurzynski
516dcf3bcd configs: Fixed Typo
Fixed a typo importing obtain_resource

Change-Id: I5792ca161187c6576e2501e5aaea610d8b8ee5ea
2023-09-20 21:42:56 +00:00
Hoa Nguyen
1fc89bc8ae cpu,mem,dev: Use Addr for cacheLineSize
Change-Id: I2f056571dbf35081d58afda09726c600141d5a05
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-20 14:16:46 -07:00
Hoa Nguyen
ac5280fedc mem,sim: Change the type of cache_line_size to Addr
Change-Id: Id39e8249fef89c0d59bb39f8104650257ff00245
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-20 14:00:45 -07:00
Pu (Luke) Yi
3c38d4952a mem: fix bug in 3-level cache
Change-Id: I5b875908ac8f81180d781e609869e2f6fe1a8dc4
2023-09-20 12:15:33 -07:00
Matthew Poremba
63cabf2848 dev-amdgpu: Handle GPU atomics on host memory addresses
It is possible to execute a GPU atomic instruction using a memory
address that is in the host memory space (e.g., HMM, __managed__,
hipHostMalloc'd address). Since these are in host memory they are passed
to the SystemHub DmaDevice. However, this currently executes as a write
packet without modifying data. This leads to hangs in applications that
use atomics for forward progress (e.g., HeteroSync).

It is not clear where these are handled on a real GPU, but they are
certainly not handled by the software stack nor driver, so they must be
handled in hardware and therefore implemented in gem5. Handling for
atomics in the SystemHub makes the most sense.

To make atomics work a few extra changes need to be made to the
SystemHub. (1) The atomic is implemented as a host memory read, followed
by calling the AtomicOpFunctor, followed by a write. This requires a
second event to handle read response, performing atomic, and issuing a
write. (2) Atomics must be serialized otherwise two atomics might return
the same value which is incorrect. This patch adds serialization logic
for all request types to the same address to handle this. (3) With the
added complexity of the SystemHub, a new debug flag explicitly for
SystemHub is added.

Testing done: The heterosync application with input "sleepMutex 10 16 4"
previously hung before this patch. It passes with the patch applied.
This application tests both (1) and (2) above, as it allocates locks
with hipHostMalloc and has multiple workgroups sending an atomic request
in the same Tick, verifying the serialization mechanism.

Change-Id: Ife84b30037d1447dd384340cfeb06fdfd472fff9
2023-09-20 13:52:25 -05:00
Bobby R. Bruce
6eb7c10eb9 misc: Add HACC GPU tests (#258)
This adds the HACC GPU tests to be run weekly
2023-09-20 11:26:54 -07:00
Roger Chang
70c1d762c7 arch-riscv: Fix inst flags for jal and jalr
The jal and jalr share the same instruction format JumpConstructor,
which sets the IsCall and IsReturn flags by the register ID.
However, it may cause wrong instruction flags to be set for jal
because the section "handle the 'Jalr' instruction" is missing the
opcode check. This PR fixes the issue to ensure that IsReturn can
only be set for Jalr.

Change-Id: I9ad867a389256f9253988552e6567d2b505a6901
2023-09-20 14:27:23 +08:00
Nicholas Mosier
741a901d8d arch-x86: fix negative overflow check bug in PACK micro-op
The implementation of the x86 PACK micro-op had a logical bug that
caused the `PACKSSWB` and `PACKSSDW` instructions to produce
incorrect results. Specifically, due to a signedness error, the
overflow check for negative integers being packed always evaluated
to true, resulting in all negative integers being packed as -1 in
the output.

This patch fixes the signedness error that causes the bug.

GitHub issue: https://github.com/gem5/gem5/issues/331

Change-Id: I44b7328a8ce31742a3c0dfaebd747f81751e8851
2023-09-20 05:09:32 +00:00
Bobby R. Bruce
561f3bd75b misc,tests: Split testlib CI Tests to one dir-per-job
This splits the CI Tests to one job per sub-directory in "tests/gem5"
via a matrix.

Advantages:
* We can utilize more runners to run the quick tests. This should mean
  tests run quicker.
* This approach does not require editing of the workflow as more tests
  are added or taken away.
* There is now an output artifact for each directory in "tests/gem5"
  instead of one for the entirety of every quick test in "tests".

In addition:
* The artifact retention for the test outputs has been increased to 30 days.
* The output test artifacts have been renamed to be more descriptive of
  the job, run, attempt, directory run, and the status.
* The 'tar' step has been removed. GitHub's 'action/artifact' can handle
  directories.

Change-Id: I5b3132b424e3769d81d9cd75db2a8c59dbe4a7e5
2023-09-19 19:35:58 -07:00
Bobby R. Bruce
6921d94373 python: Recursively create checkpoint dir
While there was code present in "serialize.cc" to create the checkpoint
directory, it did not do so recursively. This patch ensures all the
directories are created in a path to the checkpoint directory.
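
As an illustration only, recursive creation can be expressed in C++17 with
std::filesystem::create_directories, which creates every missing parent
directory. This is a swapped-in technique and not necessarily how the patch
does it; the `ensureDir` helper is hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

// Create `path` and any missing parent directories.
// Returns true if the directory exists afterwards.
inline bool
ensureDir(const std::string& path)
{
    std::error_code ec;
    // Returns false without an error if the directory already exists,
    // so success is judged by the error code, not the return value.
    std::filesystem::create_directories(path, ec);
    return !ec && std::filesystem::is_directory(path);
}

inline void
ensureDirExample()
{
    namespace fs = std::filesystem;
    const fs::path root = fs::temp_directory_path() / "ckpt_dir_example";
    fs::remove_all(root);
    assert(ensureDir((root / "run0" / "cpt.1000").string()));
    fs::remove_all(root);
}
```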

Change-Id: Ibcf7f800358fd89946f550b8cfb0cef8b51fceac
2023-09-19 15:48:11 -07:00
Bobby R. Bruce
efd58f9b72 tests: Remove ":" from testing results output dir name
Colons in path names are not advisable.

Change-Id: I7748a36cabafde69759f7a9892f7b8910470b85e
2023-09-19 15:48:11 -07:00
Bobby R. Bruce
0337613afc ext,tests: Add --build-targets option to ./main.py list
This allows for build target information (i.e., the gem5 binary to be
built for the tests) to be returned.

Change-Id: I6638b54cbb1822555f58e74938d36043c11108ba
2023-09-19 15:48:10 -07:00
Bobby R. Bruce
13b77b3e41 ext,tests: Allow passing of --uid to ./main.py list
This is useful for listing the fixtures of a Suite.

Change-Id: Id2f1294cc7dea03a6b26e8abc5083886fe0299d9
2023-09-19 15:48:10 -07:00
Bobby R. Bruce
43226004a1 ext,tests: Fix --fixtures flag when using ./main.py list
Now the "tests/main.py" script will accept the `--fixtures` flag when
using the `list` command. This will only list the fixtures needed.

To support this, `__str__` has been implemented for the `Fixture`
class.

Change-Id: I4bba26e923c8b0001163726637f2e48c801e92b1
2023-09-19 15:48:10 -07:00
Bobby R. Bruce
c36a4d12aa tests: Replace print with testlib.log for PARSEC warn
Using just a print was causing this warning to print even when the `-q`
flag was passed. The `-q` flag sets the output to machine readable,
which the warning statement is not.

Change-Id: I139e2565dbc53aaee9027c0e003d34ba800a7ef4
2023-09-19 15:48:10 -07:00
Hoa Nguyen
9057eeabec cpu: Explicitly define cache_line_size -> 64-bit unsigned int
While it makes sense to define the cache_line_size as a 32-bit unsigned int,
the use of cache_line_size is way out of its original scope.

cache_line_size has been used to produce an address mask, which masks out
the offset bits from an address. For example, see [1], [2], [3], and [4].
However, since the cache_line_size is an "unsigned int", the type of the
value is not guaranteed to be 64 bits long. Consequently, the
bit twiddling hacks in [1], [2], [3], and [4] produce a 32-bit mask,
i.e., 0x00000000FFFFFFC0.

This behavior at least caused a problem in LLSC in RISC-V [5], where the
load reservation (LR) relies on the mask to produce the cache block address.
Two distinct 64-bit addresses can be mapped to the same cache block using
the above mask.

This patch explicitly defines cache_line_size as a 64-bit unsigned int so
the cache block mask can be produced correctly for 64-bit addresses.

[1] 3bdcfd6f7a/src/cpu/simple/atomic.hh (L147)
[2] 3bdcfd6f7a/src/cpu/simple/timing.hh (L224)
[3] 3bdcfd6f7a/src/cpu/o3/lsq_unit.cc (L241)
[4] 3bdcfd6f7a/src/cpu/minor/lsq.cc (L1425)
[5] 3bdcfd6f7a/src/arch/riscv/isa.cc (L787)
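
A small sketch of the effect, assuming a platform where "unsigned int" is
32 bits wide. The function names are hypothetical; only the mask
expression mirrors the pattern described above.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Line size held in a 32-bit unsigned int: ~(64u - 1) is 0xFFFFFFC0,
// which zero-extends to 0x00000000FFFFFFC0 when ANDed with a 64-bit
// address, so the upper 32 address bits are lost.
inline Addr
blockAddrNarrow(Addr addr, unsigned int cacheLineSize)
{
    return addr & ~(cacheLineSize - 1);
}

// Line size held in a 64-bit type: the mask is 0xFFFFFFFFFFFFFFC0.
inline Addr
blockAddrWide(Addr addr, Addr cacheLineSize)
{
    return addr & ~(cacheLineSize - 1);
}

inline void
blockAddrExample()
{
    const Addr a = 0x100000047ULL;
    const Addr b = 0x200000047ULL;
    // Two distinct addresses collapse onto the same block address.
    assert(blockAddrNarrow(a, 64) == blockAddrNarrow(b, 64));
    // With a 64-bit line size they stay distinct.
    assert(blockAddrWide(a, 64) == 0x100000040ULL);
    assert(blockAddrWide(b, 64) == 0x200000040ULL);
}
```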

Change-Id: I29abc7aaab266a37326846bbf7a82219071c4ffe
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-19 13:32:09 -07:00
Giacomo Travaglini
aec1d081c8 mem-ruby: Populate missing txnId field to CompDBID_Stale response
Change-Id: I6861d27063b13cd710e09c153d15062640c887fe
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-18 15:23:21 +01:00
Bobby R. Bruce
3bdcfd6f7a mem-ruby: patch fixes a protocol error in MOESI_CMP_Directory (#316)
When there is a race between FwdGetX and PUTX on the owner, the owner
hands off ownership to the GetX requestor and the PUTX still goes
through. But since the owner has changed, the state should go back to M
and the PUTX is essentially trashed.
An Unblock to the Directory in this case will give an undefined
transition. I have added transitions which indicate that when an Unblock
is served to the Directory, it means that some kind of ownership
transfer has happened while a PUTX/PUTO was in progress.
2023-09-15 13:25:51 -07:00
Harshil Patel
7225da4ac6 stdlib, resources: Removed unused import
Change-Id: Iee54cc695c7c8ce146719ef583be424b792e2232
2023-09-15 10:41:45 -07:00
Harshil Patel
f3ce343a26 stdlib, resources: Added pretty printing resource
- Implemented a __str__  for AbstractResource
__str__ prints resource category, id and version.
link to resources website is also printed.

Change-Id: Iad5825ff7d8d505ceb236e00dc49bb56055fc8f0
2023-09-15 10:21:27 -07:00
Giacomo Travaglini
320454b75f mem-ruby: Populate missing txnId field to CompI response
Change-Id: I02030f61dd4e64a29b16e47d49bcde8c723260b5
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-15 12:13:00 +01:00
Bobby R. Bruce
23442727f7 util,resources,stdlib: Add 'obtain-resource.py' utility to easily obtain resources from the CLI (#317)
This allows users to obtain resources via the CLI instead of having to
write a python script to do so. It is essentially a nice CLI wrapper for
"gem5.resources.resource.obtain_resource"

## Usage

```sh
> scons build/ALL/gem5.opt -j `nproc`
> ./build/ALL/gem5.opt util/obtain-resource.py --help

usage: obtain-resource.py [-h] [-p PATH] [-q] id

positional arguments:
  id                    The resource id to download.

options:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  The path the resource is to be downloaded to. If not specified, the resource will be downloaded to the default
                        location in the gem5 local cache of resources
  -q, --quiet           Suppress output.
```

E.g.:

```sh
./build/ALL/gem5.opt util/obtain-resource.py arm-hello64-static -p arm-hello
```

Will download the resource with ID `arm-hello64-static` to `arm-hello`
in the CWD.
2023-09-14 21:04:30 -07:00
Bobby R. Bruce
600ea81031 util: Add 'obtain-resource.py' utility
This can be used to obtain a resource from gem5-resources.

Change-Id: I922d78ae0450bf011f18893ffc05cb1ad6c97572
2023-09-14 15:33:17 -07:00
Bobby R. Bruce
a101b1aba3 stdlib: Add 'to_path' arg to obtain_resource
This allows for a user to specify the exact path they want a resource to
be downloaded to. This differs from 'resource_directory' in that a user
may specify the file/directory name of the resource (using just the
'resource_directory' will name the resource after its ID in that directory).

Change-Id: I887be6216c7607c22e49cf38226a5e4600f39057
2023-09-14 15:33:17 -07:00
Bobby R. Bruce
b12f28af96 stdlib: Add 'quiet' option to obtain_resource func
Change-Id: I15d3be959ba7ab8af328fc6ec2912a8151941a1e
2023-09-14 15:33:17 -07:00
Bobby R. Bruce
46be2d2339 misc,tests: Use GitHub Docker registry for 22.04 all-deps (#321)
Via this workflow we now can build and push our docker images to the
GitHub Docker container registry:

26a1ee4e61/.github/workflows/docker-build.yaml

GitHub does not charge for downloads to runners (hosted or self-hosted).
This can therefore save the project money if we download from GitHub's
Docker registry over Google Cloud's.

This is a test to ensure this works as intended.
2023-09-14 15:10:58 -07:00
Bobby R. Bruce
017fb51fad misc,tests: Remove duplicate running of daily gem5_library_tests (#318)
The long/daily tests in "tests/gem5/gem5_library_tests" were running in
both the "testlib-long-tests" and the
"testlib-long-gem5_library_example_tests" job in the Daily tests
Workflow. The running in "testlib-long-tests" is removed in this PR.
2023-09-14 15:10:02 -07:00
Bobby R. Bruce
1c5870d775 misc: Update docker-build.yaml artifact actions to v3 (#322)
v2 uses some deprecated dependencies.
2023-09-14 15:09:47 -07:00
Leo Redivo
020bc05928 misc: moved logic of get_disk_device to workload.command_line
Change-Id: I5313bb381d5d8983b050047849fae61ea7dfc63b
2023-09-14 11:47:19 -07:00
Melissa Jost
29fa894e19 misc: Add HACC GPU tests
This adds the HACC GPU tests to be run weekly

Change-Id: I77d58ee9a3d067a749bae83826266bf89bb5020f
2023-09-14 10:35:10 -07:00
Bobby R. Bruce
210ab04bca misc: Update docker-build.yaml artifact actions to v3
Change-Id: I4dea25fcfb786758942e6245133d32949b921774
2023-09-14 01:28:10 -07:00
Bobby R. Bruce
59a96c8c2f mem-cache: Fix bug in classic cache while clflush (#274)
This change, https://github.com/gem5/gem5/pull/205, mistakenly allocates
a write buffer for the clflush instruction when there's a cache miss.
However, clflush in gem5 is not a write instruction. Thus, the cache
should allocate a miss buffer in this case.
2023-09-14 01:14:39 -07:00
Bobby R. Bruce
040f4d5ae0 misc,tests: Use GitHub Docker registry for 22.04 all-deps
Via this workflow we now can build and push our docker images to
the GitHub Docker container registry:
26a1ee4e61/.github/workflows/docker-build.yaml

GitHub does not charge for downloads to runners (hosted or self-hosted).
This can therefore save the project money if we download from GitHub's
Docker registry over Google Cloud's.

This is a test to ensure this works as intended.

Change-Id: Iccdb1b7a912f1e0a0d82b7f888694958099315b3
2023-09-14 01:04:05 -07:00
Bobby R. Bruce
26a1ee4e61 configs: 'memoy' -> 'memory' spelling mistake fix (#314)
Fixes https://github.com/gem5/gem5/issues/309
2023-09-13 22:59:48 -07:00
Bobby R. Bruce
7a17c780bd misc: Use 'workdir' for docker-build.yaml (#320) 2023-09-13 22:54:01 -07:00
Bobby R. Bruce
772a316dab misc: Use 'workdir' for docker-build.yaml
Change-Id: If8b30a31e1a8c3fdba84d69da4bb28e09179cb96
2023-09-13 22:52:26 -07:00
Bobby R. Bruce
61339b6471 misc: Fix docker build workflow (#319) 2023-09-13 22:48:20 -07:00
Bobby R. Bruce
dc02862c56 misc: Fix docker build workflow
Change-Id: Ib66cc124a4c3ce1354faee092f14543e699dca40
2023-09-13 22:47:08 -07:00
Bobby R. Bruce
1d160e6ab0 scons: Revert "Add an option specifying the path to mold linker binary" (#313)
Reverts https://github.com/gem5/gem5/pull/244

Fixes https://github.com/gem5/gem5/issues/312
2023-09-13 22:02:30 -07:00
Bobby R. Bruce
5102072950 misc,tests: Rm duplicate running of daily gem5_library_tests
The long/daily tests in "tests/gem5/gem5_library_tests" were running in
both the "testlib-long-tests" and the
"testlib-long-gem5_library_example_tests" job in the Daily tests
Workflow. The running in "testlib-long-tests" is removed in this patch.

Change-Id: I1c665529e3dcb594ffb7f6e2224077ae366772d6
2023-09-13 17:50:56 -07:00
Gautham Pathak
178db9e270 mem-ruby: patch fixes a protocol error in MOESI_CMP_Directory
When there is a race between FwdGetX and PUTX on the owner, the
owner hands off ownership to the GetX requestor and the PUTX
still goes through. But since the owner has changed, the state
should go back to M and the PUTX is essentially trashed.
An Unblock to the Directory in this case will give an undefined
transition. I have added transitions which indicate that when
an Unblock is served to the Directory, it means that some kind
of ownership transfer has happened while a PUTX/PUTO was in
progress.

Change-Id: I37439b5a363417096030a0875a51c605bd34c127
2023-09-13 19:09:13 -04:00
Bobby R. Bruce
b53a311363 misc,util-docker: Fix docker-build.yaml (#285)
The https://github.com/gem5/gem5/actions/runs/6114221855 failure was due
to running the actions inside our 22.04-all-dependencies container. This
container does not contain docker. We must therefore run this action
outside of the container. However, due to our policy of checking out the
code within this container, we must split this into two jobs and use the
artifact upload and download to get the resources we want.
2023-09-13 15:54:15 -07:00
Bobby R. Bruce
d38c029195 mem-ruby: This commit patches an error in AbstractController.cc (#294)
Links to #293 

After calling m5_dump_reset_stats(0,0) in a test program, some
statistics like
l1_controllers.L1Dcache.m_demand_hits,
l1_controllers.L1Dcache.m_demand_misses,
l1_controllers.L1Dcache.m_demand_accesses
were not getting reset in the newer stat dumps.
This one line patch fixes that. Changes were tested with calling two
m5_dump_reset_stats(0,0) in a row for a system with 1 core, tested on
both SE and FS.
Credits: @MeatBoy106
2023-09-13 15:48:46 -07:00
Bobby R. Bruce
673d4b2ac2 arch-x86: initialize and correct bitwidth for FPU tag word (#304)
The x87 FPU tag word (FTW) was not explicitly initialized in
{X86_64,i386}Process::initState(), so it held an initial value
of zero, resulting in an invalid x87 FPU state. This commit initializes
FTW to 0xFFFF, indicating the FPU is empty at program start during
syscall emulation.

The 16-bit FTW register was also incorrectly masked down to 8-bits in
X86ISA::ISA::setMiscRegNoEffect(), leading to an invalid X87 FPU state
that later caused crashes in the X86KvmCPU. This commit corrects the
bitwidth of the mask to 16.
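
The effect of the mask width can be sketched as below. `FtwEmpty` and
`maskToWidth` are hypothetical names; the sketch only shows that an 8-bit
mask truncates the 0xFFFF "all empty" tag word, while a 16-bit mask
preserves it.

```cpp
#include <cassert>
#include <cstdint>

// All-ones tag word: every x87 register is tagged as empty.
constexpr std::uint16_t FtwEmpty = 0xFFFF;

// Keep only the low `width` bits of `val` (width must be below 64).
inline std::uint64_t
maskToWidth(std::uint64_t val, unsigned width)
{
    assert(width < 64);
    return val & ((std::uint64_t{1} << width) - 1);
}

inline void
ftwMaskExample()
{
    // An 8-bit mask silently drops the upper half of the tag word.
    assert(maskToWidth(FtwEmpty, 8) == 0x00FF);
    // A 16-bit mask keeps the full register value.
    assert(maskToWidth(FtwEmpty, 16) == 0xFFFF);
}
```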

GitHub issue: https://github.com/gem5/gem5/issues/303
2023-09-13 15:47:50 -07:00
Bobby R. Bruce
23c1014677 util: Fix TLM configs making use of TraceCPU replayer (#310)
A recent PR [1] moved the TraceCPU away from the BaseCPU hierarchy.
While the common etrace_replayer.py has been amended, I missed these
hybrid TLM + TraceCPU example scripts.

[1]: https://github.com/gem5/gem5/pull/302
2023-09-13 15:47:05 -07:00
Bobby R. Bruce
e42d71e802 configs: 'memoy' -> 'memory' spelling mistake fix
Fixes https://github.com/gem5/gem5/issues/309

Change-Id: I41ac7c5559d49353d01b3676b5bdf7b91e4efbda
2023-09-13 14:30:22 -07:00
Bobby R. Bruce
d463f73a43 scons: Revert "Add an option specifying the..."
Change-Id: I2bd952d3cfd6c3c671b5ab3458e44c53f93bf649
2023-09-13 14:28:05 -07:00
Gautham Pathak
87db6df8f6 mem-ruby: This commit patches an error in AbstractController.cc
After calling m5_dump_reset_stats(0,0) in a test program,
some statistics like
l1_controllers.L1Dcache.m_demand_hits,
l1_controllers.L1Dcache.m_demand_misses,
l1_controllers.L1Dcache.m_demand_accesses
were not getting reset in the newer stat dumps.
This one line patch fixes that. Changes were tested with
calling two m5_dump_reset_stats(0,0) in a row for a system
with 1 core, tested on both SE and FS.
Credits to Gabriel Busnot for finding the fix.

Change-Id: I19d75996fa53d31ef20f7b206024fd38dbeac643
2023-09-13 14:07:16 -04:00
Giacomo Travaglini
f95e1505b8 util: Fix TLM configs making use of TraceCPU replayer
A recent PR [1] moved the TraceCPU away from the BaseCPU hierarchy.
While the common etrace_replayer.py has been amended, I missed these
hybrid TLM + TraceCPU example scripts.

[1]: https://github.com/gem5/gem5/pull/302

Change-Id: I7e9bc9a612d2721d72f5881ddb2fb4d9ee011587
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-13 13:36:33 +01:00
Bobby R. Bruce
5fd901ffbb cpu, configs: Fix TraceCPU after multi-ISA addition (#302)
This PR fixes #301
2023-09-12 17:26:27 -07:00
Bobby R. Bruce
133e4ed636 misc: Add "typings" directory to .gitignore
This is used by Pylance IntelliSense to infer gem5 typing.

See "util/gem5-stubgen.py" for generating this directory.

Change-Id: Ie39762c718e5392f6194ff7c8238bd0cd677f486
2023-09-12 15:20:06 -07:00
Bobby R. Bruce
bceac5d951 util: Allow MyPy stubgen to aid Pylance IntelliSense
Change-Id: I42fe177e5ae428afd0f23ea482b6af5b7d3ecaf9
2023-09-12 15:19:49 -07:00
Bobby R. Bruce
39f0bcd9af python: Mimic Python 3's -P flag in gem5
Python 3's `-P` flag, when set, means `sys.path` is not prepended with
potentially unsafe paths:
https://docs.python.org/3/using/cmdline.html#cmdoption-P

This patch allows gem5 to mimic this. This is necessary when using
`mypy.stubgen` as it expects the Python Interpreter to have the `-P`
flag.

Change-Id: I456c8001d3ee1e806190dc37142566d50d54cc90
2023-09-12 14:51:59 -07:00
Nicholas Mosier
2178e26bf2 arch-x86: initialize and correct bitwidth for FPU tag word
The x87 FPU tag word (FTW) was not explicitly initialized in
{X86_64,i386}Process::initState(), so it held an initial
value of zero, resulting in an invalid x87 FPU state. This commit
initializes FTW to 0xFFFF, indicating the FPU is empty at program
start during syscall emulation.

The 16-bit FTW register was also incorrectly masked down to 8-bits
in X86ISA::ISA::setMiscRegNoEffect(), leading to an invalid X87 FPU
state that later caused crashes in the X86KvmCPU. This commit
corrects the bitwidth of the mask to 16.

GitHub issue: https://github.com/gem5/gem5/issues/303

Change-Id: I97892d707998a87c1ff8546e08c15fede7eed66f
2023-09-12 15:39:29 +00:00
Bobby R. Bruce
1bebf6a3cc sim-se: Use tgt_stat64 instead of tgt_stat in newfstatatFunc (#283)
The syscall emulation of newfstatat incorrectly treated the output stat
buffer to be of type `OS::tgt_stat`, not `OS::tgt_stat64`, causing the
output stat buffer in the application to hold invalid data.

This patch fixes the bug by simply substituting the type `OS::tgt_stat`
with `OS::tgt_stat64` in `newfstatatFunc()`.

GitHub issue: https://github.com/gem5/gem5/issues/281
2023-09-12 08:33:42 -07:00
Bobby R. Bruce
94e5a0cccf sim-se: Fix tgkill logic bug in handling signal argument (#286)
The syscall emulation of tgkill contained a simple logic bug (a `||`
instead of a `&&`), causing the signal argument to always be considered
invalid. This patch fixes the bug by simply changing the `||` to a `&&`.
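
Why the `||` form rejects every signal can be sketched in a few lines of
Python. This assumes the check accepts exactly two signal values; 0 and
SIGABRT are illustrative stand-ins, since the commit message does not name
them, and the function names are hypothetical.

```python
SIG_NONE = 0  # assumed: "no signal" value
SIG_ABRT = 6  # assumed: the other accepted value (SIGABRT on Linux)


def rejects_buggy(sig: int) -> bool:
    # No value equals both constants, so one inequality always holds.
    return sig != SIG_NONE or sig != SIG_ABRT


def rejects_fixed(sig: int) -> bool:
    # Rejected only when the signal matches neither accepted value.
    return sig != SIG_NONE and sig != SIG_ABRT


for sig in (SIG_NONE, SIG_ABRT, 9):
    print(sig, rejects_buggy(sig), rejects_fixed(sig))
```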

GitHub issue: https://github.com/gem5/gem5/issues/284
2023-09-12 08:32:56 -07:00
Bobby R. Bruce
d67a6603c1 cpu-kvm: properly set x86 xsave header on gem5->KVM transition (#298)
If the XSAVE KVM capability is available (KVM_CAP_XSAVE), the X86KvmCPU
will try to set the x87 FPU + SSE state using KVM_SET_XSAVE, which
expects a buffer (struct kvm_xsave) in XSAVE area format (Vol. 1, Sec.
13.4 of Intel x86 SDM). The original implementation of
`X86KvmCPU::updateKvmStateFPUXSave()`, however, improperly sets the
xsave header, which contains a bitmap of state components present in the
xsave area.

This patch defines `XSaveHeader` structure to model the xsave header,
which is expected directly following the legacy FPU region (defined in
the `FXSave` structure) in the xsave area. It then sets two bits in the
xsave header to indicate the presence of x86 FPU and SSE state
components.
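
The layout described above can be sketched in Python as follows. The offsets
and bit positions follow the XSAVE area format in the Intel SDM (512-byte
legacy region, 64-byte header, XSTATE_BV in the first 8 header bytes); the
constant and function names are hypothetical and do not mirror gem5's
`XSaveHeader` definition.

```python
import struct

LEGACY_REGION_SIZE = 512  # legacy FXSAVE region at the start of the area
XSAVE_HEADER_SIZE = 64    # header directly follows the legacy region
XSTATE_X87 = 1 << 0       # XSTATE_BV bit 0: x87 FPU state present
XSTATE_SSE = 1 << 1       # XSTATE_BV bit 1: SSE state present


def build_xsave_area() -> bytearray:
    """Return a zeroed XSAVE area whose header marks x87 and SSE as present."""
    area = bytearray(LEGACY_REGION_SIZE + XSAVE_HEADER_SIZE)
    # XSTATE_BV is a little-endian 64-bit bitmap at the start of the header.
    struct.pack_into("<Q", area, LEGACY_REGION_SIZE, XSTATE_X87 | XSTATE_SSE)
    return area


area = build_xsave_area()
print(len(area), area[LEGACY_REGION_SIZE])
```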

GitHub issue: https://github.com/gem5/gem5/issues/296
2023-09-12 08:32:20 -07:00
Bobby R. Bruce
5fefbe2933 arch-riscv: Enable RVV run in Minor and O3 CPU (#228)
Changes in the PR:

1. Change the vset\*vl\* instructions to jump/branch family, and
implement the branchTarget.
2. Move the Vl and Vtype from decoder to PCState
3. Get VL, VTYPE and VLENB value from PCState
4. Remove vtype checking in construction so that the minor and o3 cpu
can decode the instructions after the vset\*vl\*
2023-09-12 08:31:36 -07:00
Giacomo Travaglini
a0a799f474 cpu: Disable CPU switching functionality with TraceCPU
Now that the TraceCPU is no longer a BaseCPU we disable CPU switching
functionality. AFAICS from the code, it seems like using m5.switchCpus
was never really working.
The takeOverFrom was described as being used when checkpointing
(which is not really the case). Moreover the icache/dcache
event loops were not checking if the CPU was switched out
so the trace was always being consumed regardless of the BaseCPU
state.

Note: IMHO the only case where you might want to switch between
an execution-driven CPU to the TraceCPU is when you want to
warm your caches before the ROI.
All other cases don't really make sense as with the TraceCPU
there is no architectural state being maintained/updated.

Change-Id: I0611359d2b833e1bc0762be72642df24a7c92b1e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-12 15:50:05 +01:00
Giacomo Travaglini
785eba6ce1 configs: Reflect TraceCPU changes in the etrace_replay script
As we no longer inherit from the BaseCPU, we can't really use
CPU generation methods (like Simulation.setCPUClass) and
cache generation ones (like CacheConfig.config_cache).

This is good news as it allows us to simplify the etrace
script and to remove a dependency with the deprecated-to-be
common library.

Change-Id: Ic89ce2b9d713ee6f6e11bf20c5065426298b3da2
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-12 15:49:39 +01:00
Giacomo Travaglini
9a5d900770 cpu: Stop treating TraceCPU as a BaseCPU
This is fixing a recently reported issue [1] where it is
not possible to use the TraceCPU to replay elastic traces.

It requires some architectural data structures (like ArchMMU,
ArchDecoder...) which are no longer defined in the BaseCPU class at
compilation time.  Which Arch version should be used for a class
(TraceCPU) that is supposed to be ISA agnostic ? Does it really make
sense to define them for the TraceCPU? Those classes are not used anyway
during trace replay and their sole purpose would just be to comply to
the BaseCPU interface.

As there is no elegant way to make things work, this patch stops
treating the TraceCPU as a BaseCPU.

While it philosophically makes sense to treat the TraceCPU as a common
CPU (it sort of replays pre-executed instructions), a case can be made
for considering it more like a traffic generator.

[1]: https://github.com/gem5/gem5/issues/301

Change-Id: I7438169e8cc7fb6272731efb336ed2cf271c0844
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-12 15:49:29 +01:00
Roger Chang
e41184fafc util: Update the RISC-V PCState checkpoint
Change-Id: I64b6a3e1706173a001f5f8fb06756bd50d65f5bd
2023-09-12 17:54:35 +08:00
Roger Chang
def89745bc arch-riscv: Allow Minor and O3 CPU execute RVV
Change-Id: I4780b42c25d349806254b5053fb0da3b6993ca2f
2023-09-12 13:56:22 +08:00
Roger Chang
0f54cb0593 arch-riscv: Remove check vconf done implementation
Change-Id: If633cef209390d0500c4c2c5741d56158ef26c00
2023-09-12 13:56:22 +08:00
Roger Chang
31b95987da arch-riscv: Change the instruction family to jump like
The method that get the vl, vtype from PCState in the next changes

Change-Id: I022b47b7a96572f6434eed30dd9f7caa79854c31
2023-09-12 13:56:22 +08:00
Roger Chang
282765234b arch-riscv: Implement the branchTarget for vset*vl*
Change-Id: I10bf6be736ce2b99323ace410bff1d8e1e2a4123
2023-09-12 13:56:22 +08:00
Roger Chang
a3aaad2ecd arch-riscv: Refactor the execution part of vset*vl*
Change-Id: Ie0d9671242481a85bb0fe5728748b16c3ef62592
2023-09-12 13:56:21 +08:00
Roger Chang
1bde42760f arch-riscv: Get vl, vtype and vlenb from PCState
Change-Id: I0ded57a3dc2db6fcc7121f147bcaf6d8a8873f6a
2023-09-12 13:56:21 +08:00
Roger Chang
8918302239 arch-riscv: Change the implementation of vset*vl*
The changes include:

1. Add VL, Vtype and VlenbBits operands
2. Change R/W methods of VL, Vtype and VlenbBits from PCState

Change-Id: I0531ddc14344f2cca94d0e750a3b4291e0227d54
2023-09-12 13:56:21 +08:00
Roger Chang
7b5d8b4e5b arch-riscv: Add vlenb, vtype and vl in PCState
Change-Id: I7c2aed7dda34a1a449253671d7b86aa615c28464
2023-09-12 13:56:21 +08:00
Roger Chang
f94658098d arch-riscv: Remove checked_type in StaticInst Constructor
We should not try to check vtype when decoding the instruction.
It should be checked in vset{i}vl{i} since the register can be
modified via vset{i}vl{i}

Change-Id: I403e5c4579bc5b8e6af10f93eac20c14662e4d2d
2023-09-12 13:56:21 +08:00
Roger Chang
3f0475321a arch-riscv: Change VTYPE to BitUnion64
Change-Id: I7620ad1ef3ee0cc045bcd02b3c9a2d83f93bf3fe
2023-09-12 13:56:21 +08:00
Roger Chang
dfc725838e arch-riscv: Refactor PCState class
Change-Id: I1d25350ba2a3c7c366f42340c20b4488c33cde6f
2023-09-12 13:56:21 +08:00
Bobby R. Bruce
a217c218e0 util: Update gcn-gpu Dockerfile (#290)
This adds the setting up of environment variables to the gcn-gpu
Dockerfile from the halofinder Dockerfile in gem5-resources so that we
don't need to use a separate Docker image in the gpu tests.
2023-09-11 11:19:49 -07:00
Jason Lowe-Power
a89aeb3906 util: Revert "Add docker prune cron to GitHub..." (#299)
This reverts commit 0249d47acc,
https://github.com/gem5/gem5/pull/271.

This solution doesn't work. GitHub runners pull the images they need at
the start of job (i.e., all the images they may need for each step).
They then create the containers later, at the step they are needed. This
solution therefore breaks in the case a cleanup happens during the
running of a job. I.e., a `docker system prune` happens after setup,
therefore deleting all the images, then the job tries to use one of the
images during a step.

This crontab solution may work if we can only do it when the runner is
in an idle state. Whether this is possible is unknown.
2023-09-11 09:33:28 -07:00
Bobby R. Bruce
7091a8b7a0 util: Revert "Add docker prune cron to GitHub..."
This reverts commit 0249d47acc,
https://github.com/gem5/gem5/pull/271.

This solution doesn't work. GitHub runners pull the images they need at
the start of job (i.e., all the images they may need for each step).
They then create the containers later, at the step they are needed.
This solution therefore breaks in the case a cleanup happens during the
running of a job. I.e., a `docker system prune` happens after setup,
therefore deleting all the images, then the job tries to use one of the
images during a step.

This crontab solution may work if we can only do it when the runner is
in an idle state. Whether this is possible is unknown.

Change-Id: I7cb5b2d98d596e9380ae1525c7d66ad97af1b59b
2023-09-10 20:32:49 -07:00
Nicholas Mosier
2b9d558cef cpu-kvm: properly set x86 xsave header on gem5->KVM transition
If the XSAVE KVM capability is available (KVM_CAP_XSAVE), the X86KvmCPU
will try to set the x87 FPU + SSE state using KVM_SET_XSAVE, which
expects a buffer (struct kvm_xsave) in XSAVE area format (Vol. 1,
Sec. 13.4 of Intel x86 SDM). The original implementation of
`X86KvmCPU::updateKvmStateFPUXSave()`, however, improperly sets the
xsave header, which contains a bitmap of state components present
in the xsave area.

This patch defines `XSaveHeader` structure to model the xsave header,
which is expected directly following the legacy FPU region (defined in
the `FXSave` structure) in the xsave area. It then sets two bits in
the xsave header to indicate the presence of x86 FPU and SSE state
components.

GitHub issue: https://github.com/gem5/gem5/issues/296

Change-Id: I5c5c7925fa7f78a7b5e2adc209187deff53ac039
2023-09-10 15:16:50 +00:00
Nicholas Mosier
8740385f9e sim-se: Fix tgkill logic bug in handling signal argument
The syscall emulation of tgkill contained a simple logic bug
(a `||` instead of a `&&`), causing the signal argument to always
be considered invalid. This patch fixes the bug by simply changing
the `||` to a `&&`.

GitHub issue: https://github.com/gem5/gem5/issues/284

Change-Id: I3b02c618c369ef56d32a0b04e0b13eacc9fb4977
2023-09-09 08:51:41 -07:00
Jason Lowe-Power
ebde1133c0 redirect_path patch for restoring cpt (#221)
Modify the FDArray::unserialize function to perform a checkPathRedirect
if a Process pointer is passed in.
Currently when restoring a checkpoint, it doesn't perform
checkPathRedirect for files that were opened during checkpointing. This
patch adds a checkPathRedirect in the FDArray::unserialize to redirect
app path for restoring checkpoints.
2023-09-08 15:30:53 -07:00
Nicholas Mosier
259a5d6272 sim-se: Use tgt_stat64 instead of tgt_stat in newfstatatFunc
The syscall emulation of newfstatat incorrectly treated the output
stat buffer to be of type `OS::tgt_stat`, not `OS::tgt_stat64`, causing
the output stat buffer in the application to hold invalid
data.

This patch fixes the bug by simply substituting the type `OS::tgt_stat`
with `OS::tgt_stat64` in `newfstatatFunc()`.

GitHub issue: https://github.com/gem5/gem5/issues/281

Change-Id: Ice97c1fc4cccbfb6824e313ebecde00f134ebf9c
2023-09-08 11:28:54 -07:00
Bobby R. Bruce
d5f5211b91 scons: Add an option specifying the path to mold linker binary (#244)
To use mold linker with gcc of version older than 12.1.0, the user has
to pass the -B option to specify where the linker is. [1]

Currently in gem5, scons only looks for the mold binary at conventional
places, such as /usr/libexec/mold and /usr/local/libexec/mold. There's
no option to manually specify the path to the linker.

gcc-12 and mold are not widely available on older systems. Having an
option to manually input the linker path allows user to build and use
mold without sudo permission.

[1] https://github.com/rui314/mold#how-to-use
2023-09-08 11:21:29 -07:00
Hoa Nguyen
91d1a5deb5 mem-cache: Fix bug in classic cache while clflush
This change, https://github.com/gem5/gem5/pull/205, mistakenly
allocates write buffer for clflush instruction when there's a
cache miss. However, clflush in gem5 is not a write instruction.
Thus, the cache should allocate miss buffer in this case.

Change-Id: I9c1c9b841159c4420567e9c929e71e4aa27d5c28
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-08 18:16:10 +00:00
Melissa Jost
16e8c95091 util: Update gcn-gpu Dockerfile
This adds the setting up of environment variables to the gcn-gpu
Dockerfile from the halofinder Dockerfile in gem5-resources
so that we don't need to use a separate Docker image in the
gpu tests.

Change-Id: Ifcc7a4c6bbcd5289ce9561923366e9ed193f170c
2023-09-08 10:25:09 -07:00
Bobby R. Bruce
ce27f5c07a sim-se: Fix crash in chdirFunc() on nonexistent directory (#277)
This commit fixes a crash in the syscall emulation of the chdir(2)
syscall, implemented by chdirFunc() in src/sim/syscall_emul.cc, when
passed a nonexistent directory. The buggy code did not check the return
value of realpath().

This patch adds code to check the return value of realpath(), and if it
is NULL (i.e., there was an error with the requested directory to change
to), propagates the error in `errno` to the application.
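
The check-then-propagate pattern can be sketched in Python. This is an
analogue only, not the C++ in src/sim/syscall_emul.cc: Python's
`os.path.realpath` does not signal failure the way C `realpath()` returns
NULL, so `os.stat` stands in for the failing call, and
`resolve_chdir_target` is a hypothetical name.

```python
import errno
import os
import stat


def resolve_chdir_target(path: str):
    """Return the resolved directory, or a negative errno value on failure."""
    try:
        info = os.stat(path)  # stand-in for the call whose result must be checked
    except OSError as err:
        return -err.errno     # propagate the error code instead of crashing
    if not stat.S_ISDIR(info.st_mode):
        return -errno.ENOTDIR
    return os.path.realpath(path)


print(resolve_chdir_target(os.getcwd()))
```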

GitHub issue: https://github.com/gem5/gem5/issues/276
2023-09-08 10:10:30 -07:00
Giacomo Travaglini
da740b1cdd mem-ruby: Add a DBID field to the CHIResponseMsg data type
This will hold the CHI Data Buffer Identifier (DBID) field.
The DBID allows a Completer of a transaction to provide its own
identifier for a transaction ID.
This new ID will be used as a TxnId field by a following
WriteData/CompData/CompAck response.

For now we only set it to the original txnId (identity mapping)

Change-Id: If30c5e1cafbe5a30073c7cd01d60bf41eb586cee
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-08 08:38:13 +01:00
Giacomo Travaglini
4359567180 mem-ruby: Generate TxnId field for an incoming CHI request
The TxnId field of a CHI request has so far been unused (other than for
DVM transactions). With this patch we always initialize the field when
we extract a ruby request from the sequencer port.
According to specs (IHI0050F):

A 12-bit field is defined for the TxnID with the number of outstanding
transactions being limited to 1024. A Requester is permitted to reuse a
TxnID value after it has received either:
* All responses associated with a previous transaction that have used
the same value.
* A RetryAck response for a previous transaction that used the same value

Change-Id: Ie48f0fee99966339799ac50932d36b2a927b1c7d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-08 08:38:13 +01:00
Giacomo Travaglini
f032eeae93 mem-ruby: Provide a fromSequencer helper function
Based on the CHIRequestType, it automatically tells if the
request has been originated from the sequencer
(CPU load/fetch/store)

Change-Id: I50fd116c8b1a995b1c37e948cd96db60c027fe66
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-08 08:38:13 +01:00
Giacomo Travaglini
5dbc48432f mem-ruby: Allow Addr as a controller member type
Change-Id: I63127ed06b4f871b74faad6c2c6436aebd118334
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-08 08:38:13 +01:00
Giacomo Travaglini
f7d6dadc10 mem-ruby: Allow trivial integer operations with Addr type
At the moment an address value can only be used in the slicc code
to do TBE lookups but there is no way to add/subtract/divide/multiply
two addresses nor an address and an integer value.

This hinders the development of protocol specific code and
forces developers to place such code in shared
C++ structures

Change-Id: Ia184e793b6cd38f951f475a7cdf284f529972ccb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-08 08:38:13 +01:00
Giacomo Travaglini
ddb6749b62 mem-ruby: Add static_cast by value in SLICC
At the moment it is possible to static_cast by pointer/reference only:

static_cast(type, "pointer", val) -> static_cast<type*>(val);
static_cast(type, "reference", val) -> static_cast<type&>(val);

With this patch it will also be possible to do something like

static_cast(type, "value", val) -> static_cast<type>(val);

Which is important when wishing to convert integer types into
custom ones and vice versa.
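
The three mappings listed above can be summarized with a small Python sketch.
It only reproduces the textual translation shown in this message;
`emit_static_cast` is a hypothetical helper and not the actual SLICC
generator code.

```python
_SUFFIX = {"pointer": "*", "reference": "&", "value": ""}


def emit_static_cast(type_name: str, mode: str, expr: str) -> str:
    """Return the C++ text for static_cast(type, mode, expr) as listed above."""
    if mode not in _SUFFIX:
        raise ValueError(f"unsupported static_cast mode: {mode!r}")
    return f"static_cast<{type_name}{_SUFFIX[mode]}>({expr});"


for mode in ("pointer", "reference", "value"):
    print(emit_static_cast("type", mode, "val"))
```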

This patch is also deferring static_cast type check to C++

At the moment it is difficult to use the static_cast utility in slicc as
it tries to handle type checking in the language itself. This would
force us to explicitly define compatible types (like an Addr and an int
as an example). Rather than pushing the burden on us, we should always
allow a developer to use a static_cast in slicc and let the C++ compiler
complain if the generated code is not compatible

Change-Id: I0586b9224b1e41751a07d15e2d48a435061c2582
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-08 08:38:13 +01:00
Nicholas Mosier
e0498cb490 Merge branch 'develop' into bugfix-chdir 2023-09-07 14:13:00 -07:00
Bobby R. Bruce
75a33ec377 misc,util-docker: Fix docker-build.yaml
https://github.com/gem5/gem5/actions/runs/6114221855 failure was due
to running the actions inside our 22.04-all-dependencies container.
This container does not contain docker. We must therefore run this action
outside of the container. However, due to our policy of checking out the
code within this container, we must split this into two jobs and use the
artifact upload and download to get the resources we want.

Change-Id: I6a5f9c3a4c287a56a3d5abe3b84dd560fa2e9ff1
2023-09-07 14:07:36 -07:00
Bobby R. Bruce
ce54ff061a scons: Fix 'recompiling' -> 'recompile'
Change-Id: Ifb2366a0c2342bf4e7207df8db6196e14184a9d5
2023-09-07 12:52:31 -07:00
Bobby R. Bruce
aca67fe3a3 misc: Add test status badges to README.md (#233)
These allow visitors to the repository to quickly see the status of our
tests run on the develop branch.
2023-09-07 12:47:00 -07:00
Bobby R. Bruce
eb5ae35341 resources,stdlib: Add workload to resource specialization and deprecate workload.py (#212) 2023-09-07 12:45:45 -07:00
Bobby R. Bruce
9cdd6093bd util: Add docker prune cron to GitHub runners (#271)
Issue-on: https://github.com/gem5/gem5/issues/254

This has been implemented in the runners. The regular pruning of the
docker images should fix the issue.
2023-09-07 12:06:41 -07:00
Bobby R. Bruce
84e0224e85 util-docker: Proof-of-concept using Docker buildx (#273)
Introduced in https://github.com/gem5/gem5/pull/236 the
"docker-build.yaml" file will allow us to build and push docker images
to the GitHub Container Registry. This allows for both automation of
docker image building and allows us to utilize Github's zero-cost
pulling policy for downloads to GitHub Actions runners.

In this PR https://github.com/gem5/gem5/pull/236 has been altered to use
Docker `buildx` which allows for multi-platform Docker Image builds. A
multi-platform Docker image pull automatically pulls the correct image
for your platform from a single URL. In this prototype the images are
built for both `linux/arm64` and `linux/amd64`.

Docker `buildx` has its own file format for specifying image builds
called `bake`. "util/dockerfiles/docker-bake.hcl" has been added with
the goal of replacing "util/dockerfiles/docker-compose.yaml".

This proof-of-concept doesn't build all our docker images, just
enough to ensure it works inside our actions as intended.
2023-09-07 11:37:24 -07:00
Nicholas Mosier
62e81930d6 Merge branch 'develop' into bugfix-chdir 2023-09-07 09:54:35 -07:00
Giacomo Travaglini
1fa1575f58 sim: add bypass_on_change to the set() of a signal (#279)
When resetting a port, we don't want to trigger onChange(). Offer an
option to bypass it and update state only.
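
The intended behavior can be sketched in Python. The `Signal` class here is
hypothetical and not gem5's C++ signal port; in particular, skipping the
callback when the value is unchanged is an assumption.

```python
class Signal:
    """Holds a boolean state and notifies a callback when it changes."""

    def __init__(self, on_change, initial=False):
        self._state = initial
        self._on_change = on_change

    def set(self, value, bypass_on_change=False):
        if value == self._state:
            return  # assumed: nothing to do when the state is unchanged
        self._state = value
        if not bypass_on_change:
            self._on_change(value)  # normal path: notify the listener

    def get(self):
        return self._state


seen = []
sig = Signal(seen.append)
sig.set(True)                           # callback fires
sig.set(False, bypass_on_change=True)   # reset: state updated silently
print(seen, sig.get())
```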

Change-Id: Ia53b7a76d2a320ea67101096cdbfe2eafaf440d2
2023-09-07 18:42:08 +02:00
studyztp
e206b16f73 sim:fixed some style issues
Change-Id: I0832a8b68e802e9671b755d3a71fd9c8f17e1648
2023-09-07 08:52:24 -07:00
studyztp
377c875733 sim: check redirect path when unserialize for cpt
sim/fd_array.hh:
Add "class Process;" to forward declare Process for unserialize
function to pass in a Process object pointer.
Fix the styling issue with include files.

sim/fd_array.cc:
Add comments.

Change-Id: Ifb21eb1c7bad119028b8fd8e610a125100fde696
2023-09-07 08:52:24 -07:00
studyztp
2a4f3f206b sim: modified the type of path
Change-Id: I56be3b62b1804371b9b9e0f84ee1ec49cbedf553
2023-09-07 08:52:24 -07:00
studyztp
0dab27f24a sim: check redirect path when unserialize for cpt
Change-Id: I55b8ce1770b0580d52b8dfa782572d492c1bf727
2023-09-07 08:52:24 -07:00
Johnny
105839ae2b sim: add bypass_on_change to the set() of a signal
When resetting a port, we don't want to trigger onChange().
Offer an option to bypass it and update state only.

Change-Id: Ia53b7a76d2a320ea67101096cdbfe2eafaf440d2
2023-09-07 11:54:56 +08:00
Nicholas Mosier
6cdaa2c16a sim-se: Fix crash in chdirFunc() on nonexistent directory
This commit fixes a crash in the syscall emulation of the chdir(2)
syscall, implemented by chdirFunc() in src/sim/syscall_emul.cc,
when passed a nonexistent directory. The buggy code did not check
the return value of realpath().

This patch adds code to check the return value of realpath(), and
if it is NULL (i.e., there was an error with the requested directory
to change to), propagates the error in `errno` to the application.

GitHub issue: https://github.com/gem5/gem5/issues/276

Change-Id: I8a576f60fe3687f320d0cfc28e9d3a6b477d7054
2023-09-07 03:18:58 +00:00
Bobby R. Bruce
0249d47acc util: Add docker prune cron to GitHub runners
Change-Id: Ic90ebc650b6a89606eaf9e8feafddfe15c44e578
Issue-on: https://github.com/gem5/gem5/issues/254
2023-09-06 15:09:32 -07:00
Bobby R. Bruce
e80cde0713 ext: Stop excluding 'ext/testlib' from pre-commit and format (#267)
Though in "ext" this directory is regularly modified. `pre-commit`
should run on these files.

This PR includes running `pre-commit run --files ext/testlib` to
reformat the files in "ext/testlib" using Python Black.
2023-09-06 11:18:19 -07:00
Bobby R. Bruce
cc757cfe7a misc: Fix buggy special path comparisons (#270)
This patch fixes the buggy special path comparisons in
src/kern/linux/linux.cc Linux::openSpecialFile(), which only checked for
equality of path prefixes, but not equality of the paths themselves.
This patch replaces those buggy comparisons with regular
std::string::operator== string equality comparisons.
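
The difference between the two comparisons can be sketched in Python. The
path "/proc/meminfo" is an illustrative choice (the commit message does not
list the special paths), and both function names are hypothetical.

```python
SPECIAL_PATH = "/proc/meminfo"  # illustrative special file


def matches_by_prefix(path: str) -> bool:
    # Buggy style: compares only the first len(SPECIAL_PATH) characters.
    return path[: len(SPECIAL_PATH)] == SPECIAL_PATH


def matches_exactly(path: str) -> bool:
    # Fixed style: full string equality.
    return path == SPECIAL_PATH


for candidate in ("/proc/meminfo", "/proc/meminfo_backup"):
    print(candidate, matches_by_prefix(candidate), matches_exactly(candidate))
```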

GitHub issue: https://github.com/gem5/gem5/issues/269
2023-09-06 11:10:19 -07:00
Bobby R. Bruce
5d98d18fb6 misc: Fix CI GitHub Action to stop if Workflow re-triggered (#275)
This ensures that if the CI tests are running for a PR, and a new
workflow is triggered (typically by pushing/rebasing the PR) then the
older workflow is cancelled.
2023-09-06 11:09:52 -07:00
Harshil Patel
bbe96d6485 stdlib: Changed use of Workload to obtain_resource
- Changed files calling Workload class to call obtain_resource instead.

Change-Id: I41f5f0c3ccc7c08b39e7049eabef9609d6d68788
2023-09-06 10:06:16 -07:00
Bobby R. Bruce
12c6742607 misc: Fix CI GitHub Action to stop if Workflow re-triggered
This ensures that if the CI tests are running for a PR, and a new
workflow is triggered (typically by pushing/rebasing the PR) then the
older workflow is cancelled.

Change-Id: Ifa172bdbdac09c5a91abb41a0162c597445e4e2e
2023-09-05 20:49:28 -07:00
Bobby R. Bruce
d10d752d7e misc: Improve ".github/ISSUE_TEMPLATE/bug_report.md" (#268)
The bug report template used escape characters. This is not necessary as
the bug report is not rendered when creating a bug report. It is
displayed to the user in plain text for them to edit.

In addition languages have been added to the code-blocks and newlines
have been added and removed where appropriate to cleanup the document.
2023-09-05 18:00:28 -07:00
Bobby R. Bruce
1b0bb678ab util-docker: Proof-of-concept using Docker buildx
Introduced in https://github.com/gem5/gem5/pull/236 the
"docker-build.yaml" file will allow us to build and push docker images
to the GitHub Container Registry. This allows for both automation of
docker image building and allows us to utilize Github's zero-cost
pulling policy for downloads to GitHub Actions runners.

In this PR https://github.com/gem5/gem5/pull/236 has been altered to
use Docker `buildx` which allows for multi-platform Docker Image builds.
A multi-platform Docker image pull automatically pulls the correct image
for your platform from a single URL. In this prototype the images are
built for both `linux/arm64` and `linux/amd64`.

Docker `buildx` has its own file format for specifying image builds
called `bake`. "util/dockerfiles/docker-bake.hcl" has been added with
the goal of replacing "util/dockerfiles/docker-compose.yaml".

This proof-of-concept doesn't build all our docker images, just
enough to ensure it works inside our actions as intended.

Change-Id: Id0debed216c91ec514aa4fce3bc2ff4fc2ea669b
2023-09-05 17:59:06 -07:00
Nicholas Mosier
3dfdd48211 misc: Fix buggy special path comparisons
This patch fixes the buggy special path comparisons in
src/kern/linux/linux.cc Linux::openSpecialFile(), which only checked
for equality of path prefixes, but not equality of the paths
themselves. This patch replaces those buggy comparisons with
regular std::string::operator== string equality comparisons.

GitHub issue: https://github.com/gem5/gem5/issues/269

Change-Id: I216ff8019b9a6a3e87e364c2e197d9b991959ec1
2023-09-05 13:44:10 -07:00
Harshil Patel
bf06d61c35 stdlib, tests, resources: Updated tests
- Updated workload tests to use WorkloadResource and obtain_resource

Change-Id: I39194e7fe764566a528e5141c29f30efa14e0cde
2023-09-05 12:39:38 -07:00
Bobby R. Bruce
96144f90ed misc: Add pre-commit run to .git-blame-ignore-revs
Change-Id: Iaae9d735e2972e41f1f9225ea5bed9acf22ff991
2023-09-05 00:01:35 -07:00
Bobby R. Bruce
9e1afdecef ext: Run pre-commit run --files ext/testlib
Change-Id: Ic581132f6136dddb127e2a1c5a1ecc19876488c3
2023-09-05 00:00:25 -07:00
Bobby R. Bruce
ff75e5b30e misc,ext: Update pre-commit hook to run on ext/testlib
Though in ext, we regularly modify these files to add features and
extend our testlib testing infrastructure. Ergo, the pre-commit checks
should be run.

Change-Id: I921a263f25f850b03e5535a8a1f509921c124763
2023-09-05 00:00:25 -07:00
Bobby R. Bruce
2eeecc532a mem-ruby: Reorder SLC atomic and response actions (#255)
Currently the MOESI_AMD_Base-directory transition for system level
atomics sends the response message before the atomic is performed. This
was likely done because atomics are supposed to return the value of the
data *before* the atomic is performed and by simply ordering the actions
this way that was taken care of.

With the new atomic log feature, the atomic values are pulled from the
log by the coalescer on the return path. Therefore, these actions can be
reordered. In fact, it is now necessary that the atomics be performed
before sending the response so that the log is populated and copied by
the response action. This should fix #253 .
2023-09-02 04:48:45 -07:00
Bobby R. Bruce
1ec58d589a misc: Fix broken code example in bug_report.md
Change-Id: I9bc1b42d488a415d2ea165385d83fea3d4ac288d
2023-09-02 04:46:04 -07:00
Bobby R. Bruce
188d29fe05 misc: Add language specification to code-blocks
Change-Id: I875aeee7eb0f9970711a97448d3bcb7acddbe7b1
2023-09-02 04:44:07 -07:00
Bobby R. Bruce
fcb586cfed misc: Add/Remove new lines from bug_report.md
There were some weird newline characters in this file, or lack of lines.
This patch adds/removes them.

Change-Id: I6cc918788c07bbc4be5c68401ad3987be00fffc4
2023-09-02 04:42:53 -07:00
Bobby R. Bruce
0ac2f67437 misc: Remove escape characters from bug_report.md
The bug_report.md is rendered as plain text, not markdown, when creating
a bug report. As such the escape characters are removed in this commit.

Change-Id: I524c66ae61d00b7ed59153ba9f4b2297ff50ee18
2023-09-02 04:41:08 -07:00
Matthew Poremba
2da54d5a4f mem-ruby: Reorder SLC atomic and response actions
Currently the MOESI_AMD_Base-directory transition for system level
atomics sends the response message before the atomic is performed. This
was likely done because atomics are supposed to return the value of the
data *before* the atomic is performed and by simply ordering the actions
this way that was taken care of.

With the new atomic log feature, the atomic values are pulled from the
log by the coalescer on the return path. Therefore, these actions can be
reordered. However, it is now necessary that the atomics be performed
before sending the response so that the log is populated and copied by
the response action. This should fix #253 .

Change-Id: Ie7e178f93990975367de2cc3e89e5ef9c9069241
2023-09-01 10:36:54 -05:00
Hoa Nguyen
73b0e88d6e scons: Add an option specifying the path to mold linker binary
To use mold linker with gcc of version older than 12.1.0, the user
has to pass the -B option to specify where the linker is. [1]

Currently in gem5, scons only looks for the mold binary at
conventional places, such as /usr/libexec/mold and
/usr/local/libexec/mold. There's no option to manually specify
the path to the linker.

gcc-12 and mold are not widely available on older systems. Having
an option to manually input the mold linker path allows users to use
built mold instances anywhere on the system, not just the default
locations in /usr where they may not have permission to install mold
(i.e., no sudo permissions).

[1] https://github.com/rui314/mold#how-to-use

Change-Id: Ifb2366a0c2342bf4e7207df8db6196e14184a9d4
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-01 00:31:28 -07:00
Bobby R. Bruce
c0db065c26 util: Add gdb to gcn-gpu Dockerfile (#248)
gdb was originally part of the ROCm 1.6 Dockerfile a few years ago. It
got removed when updating to ROCm 4.0. This adds it back as being able
to debug things is quite useful.
2023-09-01 00:29:44 -07:00
Bobby R. Bruce
8d47cda8b6 arch-x86: Fix wrong x86 assembly (#251)
The RM field of ModRM was printed as Reg field for several instructions.

For reference, this change fixes typos introduced by [1].

[1] https://gem5-review.googlesource.com/c/public/gem5/+/40339
2023-09-01 00:26:00 -07:00
Bobby R. Bruce
4de4e22553 misc: Remove 'run-name' from compiler-tests.yaml (#245)
This isn't necessary. Without 'run-name' the action's default name is
'run-name'. Displaying the actor who launched the action is pointless
for scheduled tests.
2023-08-31 17:38:38 -07:00
Bobby R. Bruce
ddd1bc1e48 gpu-compute: Set LDS/scratch aperture base register (#247)
Starting with gfx900 (Vega) the LDS and scratch apertures can be queried
using a new s_getreg_b32 instruction. If the instruction is called with
the SH_MEM_BASES argument it returns the upper 16 bits of a 64 bit
address for the LDS and scratch apertures. The current addresses cannot
be encoded in this register, so the addresses are changed to have the
lower 48 bits be all zeros in addition to writing the bases register.
2023-08-31 17:38:08 -07:00
Hoa Nguyen
4ff1f160ec arch-x86: Fix wrong x86 assembly
The RM field of ModRM was printed as Reg field for several instructions.

For reference, this change fixes typos introduced by [1].

[1] https://gem5-review.googlesource.com/c/public/gem5/+/40339

Change-Id: I41eb58e6a70845c4ddd6774ccba81b8069888be5
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-01 00:26:51 +00:00
Harshil Patel
7042d74ac2 stdlib, resources: Moved parsing params.
- moved parsing from WorkloadResource __init__ to obtain_resource

Change-Id: I9ed2aebb23af7f359bc1e5fff8ebe616a0da1374
2023-08-31 09:19:24 -07:00
Matthew Poremba
3520c83673 util: Add gdb to gcn-gpu Dockerfile
gdb was originally part of the ROCm 1.6 Dockerfile a few years ago. It
got removed when updating to ROCm 4.0. This adds it back as being able
to debug things is quite useful.

Change-Id: I3f8148cde79e6cc5233fa3c8c830b64817f01d3a
2023-08-31 11:08:30 -05:00
Matthew Poremba
cfa833a97d gpu-compute: Set LDS/scratch aperture base register
Starting with gfx900 (Vega) the LDS and scratch apertures can be queried
using a new s_getreg_b32 instruction. If the instruction is called with
the SH_MEM_BASES argument it returns the upper 16 bits of a 64 bit
address for the LDS and scratch apertures. The current addresses cannot
be encoded in this register, so the addresses are changed to have the
lower 48 bits be all zeros in addition to writing the bases register.
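
The encoding constraint described above can be sketched as follows. This is a minimal illustration, not the gem5 implementation: the helper names are hypothetical, and which half of the 32-bit value holds which aperture is an assumption made only for the example.

```cpp
#include <cstdint>

// Mask covering the lower 48 bits of a 64-bit address.
constexpr uint64_t kLow48Mask = (uint64_t{1} << 48) - 1;

// True when a base can be represented by its upper 16 bits alone,
// i.e. its lower 48 bits are all zeros.
constexpr bool isEncodable(uint64_t base)
{
    return (base & kLow48Mask) == 0;
}

// Upper 16 bits of a 64-bit aperture base.
constexpr uint16_t baseHigh16(uint64_t base)
{
    return static_cast<uint16_t>(base >> 48);
}

// Hypothetical packing of both bases into one 32-bit value. The choice
// of half for each aperture is an assumption for this illustration.
constexpr uint32_t packBases(uint64_t lds_base, uint64_t scratch_base)
{
    return (static_cast<uint32_t>(baseHigh16(lds_base)) << 16) |
           static_cast<uint32_t>(baseHigh16(scratch_base));
}

static_assert(isEncodable(0x0001000000000000ULL),
              "a base with zeroed lower 48 bits is encodable");
static_assert(!isEncodable(0x0001000000001000ULL),
              "a base with low bits set cannot be encoded");
```

A base with any of its lower 48 bits set would lose information when reduced to 16 bits, which is why the addresses themselves had to move.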

Change-Id: If20f262b2685d248afe31aa3ebb274e4f0fc0772
2023-08-31 11:01:32 -05:00
Bobby R. Bruce
7db2aac943 misc: Remove 'run-name' from compiler-tests.yaml
This isn't necessary. Without 'run-name' the action's default name is
'run-name'. Displaying the actor who launched the action is pointless
for scheduled tests.

Change-Id: I15d52959389881381ef7685efb57152c5162c89d
2023-08-31 02:12:20 -07:00
Bobby R. Bruce
0e323bc409 mem: Atomic ops to same address (#200)
Augmenting the DataBlock class with a change log structure to record the
effects of atomic operations on a data block and service these changes
if the atomic operations require return values.

Although the operations are atomic, the coalescer need not send unique
memory requests for each operation. Atomic operations within a wavefront
to the same address are now coalesced into a single memory request. The
response of this request carries all the necessary information to
provide the requesting lanes unique values as a result of their
individual atomic operations. This helps reduce contention for request
and response queues in simulation.

Previously, only the final value of the datablock after all atomic ops
to the same address was visible to the requesting waves. This change
corrects this behavior by allowing each wave to see the effect of its
individual atomic op if a return value is necessary.
2023-08-30 23:53:35 -07:00
Bobby R. Bruce
fceb7e05a3 util-docker: Add GitHub Action to create Docker Images (#236)
This is built to test the following assumptions:

1. We can trigger a GitHub action event on the changing of a
file/directory.
2. We can use GitHub actions to build a docker image.
3. We can use GitHub actions to push a docker image to a container
registry.
4. We can use GitHub's container registry.

Right now this will only build and push ubuntu-20.04_all-dependencies, as
a test.
2023-08-30 12:15:33 -07:00
Bobby R. Bruce
c156df620d resources, stdlib: Add support for local files in obtain_resource (#204)
This patch allows a local JSON file to specify a local path in the JSON
object of a Resource, through the "url" field.

Local paths can be entered with the prefix "file:" in the "url" field.

If the local path exists, then the Resource from there is copied into
the resource directory defined in the
function earlier.

This behavior is the same as using specific Resource classes (ex.
BinaryResource) and passing a local_path into the function.

But, the above class does not allow simultaneous creation of local
Resources and Workloads of those local Resources.

With this patch, someone can use a local JSON, specify the location of
local Resources and create a Workload from those Resources and test both
together.
2023-08-29 20:35:40 -07:00
KUNAL PAI
d52c7ce87f resources, stdlib: Add support for local files in obtain_resource
This patch allows a local JSON file to specify a local path
in the JSON object of a Resource, through the "url" field.

Local paths can be entered with the prefix "file:" in "url".
All File URI scheme formats are supported.

This behavior is the same as using specific Resource classes
(ex. BinaryResource) and passing a local_path into the function.

But, the above infrastructure does not allow simultaneous
creation of Resources and Workloads of those Resources.

With this patch, someone can use a local JSON, specify the location
of local Resources and create a Workload from those Resources and
test both together.

Also, this patch adds pyunit tests to check the functionality
of the function used to convert the "url" field into a path.

Change-Id: I1fa3ce33a9870528efd7751d7ca24c27baf36ad4
2023-08-29 09:47:03 -07:00
Harshil Patel
7da0aeee7d stdlib, resources: Updated warn messages
- Updated warn messages to be more clear.

Change-Id: I2a8922e96b5b544f2f229b01b3c51fc5e79995b4
2023-08-29 09:08:44 -07:00
Giacomo Travaglini
815d5b1cba util: Update & fix bug in m5stats2streamline.py (#211)
There is a conversion error in the ./util/streamline/m5stats2streamline.py
script, which converts gem5 stats.txt,sys, system.tasks.txt to the apc
folder required by DS-5 streamline. With this fix the conversion to the
apc folder completes without error. The zipped apc folder can then be
imported into the older DS-5 v5.24 for visualization (it didn't work with
DS-5 v5.29).

Changes:
1) In the writeBinary function, binary_list can contain either strings or
ints, and it needs to be properly converted to bytes
2) In the packed32(x) function, x can be an int or a float. In case of a
float it needs to be converted to int

The bug was reported and solved primarily in the issue
https://github.com/gem5/gem5/issues/145

Change-Id: I6a52aa59e1582dd6bb06b2d1c49ddaf8fe61c997
2023-08-29 11:05:51 +02:00
Bobby R. Bruce
7cdce3a975 util-docker: Add GitHub Action to create Docker Images
This is built to test the following assumptions:

1. We can trigger a GitHub action event on the changing of a
   file/directory.
2. We can use GitHub actions to build a docker image.
3. We can use GitHub actions to push a docker image to a container
   registry.
4. We can use GitHub's container registry.

Right now this will only build and push ubuntu-20.04_all-dependencies, as
a test.

Change-Id: Ie1a55c97c6eef26281456c908e1200b27da4d961
2023-08-29 00:30:51 -07:00
Bobby R. Bruce
fee465c97c misc: Add test status badges to README.md
These allow visitors to the repository to quickly see the status of our
tests run on the develop branch.

Change-Id: I3658c0e0d9dea66feebd69588c8a29d369a0b43d
2023-08-28 23:01:56 -07:00
Bobby R. Bruce
68a48a2dfa mem-ruby: fix CHI sending the wrong snoop response (#219)
Do not respond with SnpRespData_I when the line is still present
upstream.
2023-08-28 16:21:25 -07:00
Bobby R. Bruce
737c611e72 mem-ruby: fix assert on CHI ReadUnique (#218)
DCT must be disabled when handling a ReadUnique where the copy needs to
be upgraded.

Previously we were just asserting as it was assumed DCT is only enabled
for HNFs (which can "auto-upgrade"). However DCT may also be enabled for
intermediate levels of distributed shared caches above the HNFs.
2023-08-28 16:06:09 -07:00
Bobby R. Bruce
9d2e860d74 misc: Update CI tests to not run on draft PRs (#229)
This updates all the jobs for our CI tests to make sure they don't run
tests on draft pull requests, and only trigger when ready for review
2023-08-28 15:19:49 -07:00
Bobby R. Bruce
4bd3d2f864 mem-ruby: Improve Ruby/CHI stats for in/out trans (#220)
Currently we generate these stats for all defined Events in the
protocol, which may generate too many stats that are never used. Though
these don't appear in the stats.txt file, they unnecessarily increase
simulation startup time and memory footprint.

This patch limits those stats to events with the "in_trans" and/or
"out_trans" properties. SLICC compiler then checks which combinations of
event+state are possible when generating the stats.

Also the possible level of detail for inTransLatHist was reduced.
Only the number of transactions for each event+initial+final state
combinations is now accounted. Latency histograms are only defined per
event type (similarly to outTransLatHist). This significantly reduces
the final file size for generated stats.
2023-08-28 15:06:39 -07:00
atrah22
99fc5de3fb util: Update & fix bug in m5stats2streamline.py
1) In the writeBinary function, binary_list can contain either strings or ints, and it needs to be properly converted to bytes
2) In the packed32(x) function, x can be an int or a float. In case of a float it needs to be converted to int
3) Decode lines to strings using .decode(), or else a TypeError will be raised at run time

Change-Id: I678169f191901f02a80187418a17adbc1240c7d3
2023-08-27 19:07:45 -07:00
atrah22
fab458daa2 util: Update & fix bug in m5stats2streamline.py
1) In the writeBinary function, binary_list can contain either strings or ints, and it needs to be properly converted to bytes
2) In the packed32(x) function, x can be an int or a float. In case of a float it needs to be converted to int

Change-Id: I6a52aa59e1582dd6bb06b2d1c49ddaf8fe61c997
2023-08-27 19:07:29 -07:00
Matthew Poremba
82ffc16e6e gpu-compute: Flat scratch implementation and bug fixes (#231)
Add commits fixing private segment counters, flat scratch address
calculation, and implementation of flat scratch instructions.

These commits were tested using a modified version of 'square':

template <typename T>
__global__ void
scratch_square(T *C_d, T *A_d, size_t N)
{
    size_t offset = (blockIdx.x * blockDim.x + threadIdx.x);
    size_t stride = blockDim.x * gridDim.x ;

    volatile int foo; // Volatile ensures scratch / unoptimized code

    for (size_t i=offset; i<N; i+=stride) {
        foo = A_d[i];
        C_d[i] = foo * foo;
    }
}
2023-08-27 07:40:24 -07:00
Matthew Poremba
60f071d09a gpu-compute,arch-vega: Implement flat scratch insts
Flat scratch instructions (aka private) are the 3rd and final segment of
flat instructions in gfx9 (Vega) and beyond. These are used for things
like spills/fills and thread local storage. This commit enables two
forms of flat scratch instructions: (1) flat_load/flat_store
instructions where the memory address resolves to private memory and (2)
the new scratch_load/scratch_store instructions in Vega. The first are
similar to older generation ISAs where the aperture is unknown until
address translation. The second are instructions guaranteed to go to
private memory.

Since these are very similar to flat global instructions there are
minimal changes needed:

- Ensure a flat instruction is either regular flat, global, XOR scratch
- Rename the global op_encoding methods to GlobalScratch to indicate
  they are for both and are intentionally used.
- Flat instructions in segment 1 output scratch_ in the disassembly
- Flat instruction executed as private use similar mem helpers as global
- Flat scratch cannot be an atomic

This was tested using a modified version of the 'square' application:

template <typename T>
__global__ void
scratch_square(T *C_d, T *A_d, size_t N)
{
    size_t offset = (blockIdx.x * blockDim.x + threadIdx.x);
    size_t stride = blockDim.x * gridDim.x ;

    volatile int foo; // Volatile ensures scratch / unoptimized code

    for (size_t i=offset; i<N; i+=stride) {
        foo = A_d[i];
        C_d[i] = foo * foo;
    }
}

Change-Id: Icc91a7f67836fa3e759fefe7c1c3f6851528ae7d
2023-08-26 13:40:12 -05:00
Matthew Poremba
4506188e00 gpu-compute: Fix private offset/size register indexes
According to the ABI documentation from LLVM, the *low* register of flat
scratch (maxSGPR - 4) is the offset and the high register (maxSGPR - 3)
is size. These are currently backwards, resulting in some gnarly
addresses being generated leading to page fault and/or incorrect data.

This commit fixes this by setting the order correctly.

Change-Id: I0b1d077c49c0ee2a4e59b0f6d85cdb8f17f9be61
2023-08-26 13:40:12 -05:00
Matthew Poremba
e0379f4526 gpu-compute: Fix flat scratch resource counters
Flat instructions may access memory locations in LDS (scratchpad) and
global (VRAM/framebuffer) and therefore increment both counters when
dispatched. Once the aperture is known, we decrement the counters of the
aperture that was *not* used. This is done incorrectly for scratch /
private flat instruction. Private memory is global and therefore local
memory counters should be decremented.

This commit fixes the counters by changing the global decrements to
local decrements.
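
The counter bookkeeping described above can be sketched as follows. This is a simplified model under stated assumptions, not gem5's wavefront code: `FlatCountersSketch`, `Aperture`, and `dispatchPrivateFlat` are hypothetical names introduced for illustration.

```cpp
// Which aperture a flat instruction turned out to access.
enum class Aperture { Local, Global };

// Simplified stand-in for the per-wavefront resource counters.
struct FlatCountersSketch
{
    int local = 0;   // LDS (scratchpad) accesses in flight
    int global = 0;  // global (VRAM/framebuffer) accesses in flight

    // At dispatch the aperture is unknown, so both counters go up.
    void dispatch() { ++local; ++global; }

    // Once the aperture is known, release the counter that was NOT used.
    void resolve(Aperture used)
    {
        if (used == Aperture::Global)
            --local;
        else
            --global;
    }
};

// Private (scratch) memory is global memory, so a private flat
// instruction resolves as Global and releases the local counter.
inline FlatCountersSketch dispatchPrivateFlat()
{
    FlatCountersSketch counters;
    counters.dispatch();
    counters.resolve(Aperture::Global);
    return counters;
}
```

In this model, resolving a private access as `Aperture::Local` would instead leave a stale local count behind, which corresponds to the bug the commit describes.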

Change-Id: I25890446908df72e5469e9dbaba6c984955196cf
2023-08-26 13:40:12 -05:00
Matthew Poremba
a9b32cdb3a gpu-compute: Use timing DMAs for GPUFS HSA signals (#230)
The functional HSA signal read was a hack left in the gpu-compute code.
In full system, this functional read is causing problems occasionally
with the translation not yet being in the page table. The error message
output by gem5 was a fatal message on the readBlob method in port proxy.
Changing this to a timing DMA fixes this problem.

This commit adds the various timing DMA functions to send and receive
response and clean up. A helper method "sendCompletionSignal" is added
to the GPUCommandProcessor because the indentation level was getting too
deep. This change applies only to FS mode. Code for SE mode is
equivalent to what it was before this commit.

Change-Id: I1bfcaa0a52731cdf9532a7fd0eb06ab2f0e09d48
2023-08-26 11:38:37 -07:00
Matthew Poremba
57b3d2897c gpu-compute: Use timing DMAs for GPUFS HSA signals
The functional HSA signal read was a hack left in the gpu-compute code.
In full system, this functional read is causing problems occasionally
with the translation not yet being in the page table. The error message
output by gem5 was a fatal message on the readBlob method in port proxy.
Changing this to a timing DMA fixes this problem.

This commit adds the various timing DMA functions to send and receive
response and clean up. A helper method "sendCompletionSignal" is added
to the GPUCommandProcessor because the indentation level was getting too
deep. This change applies only to FS mode. Code for SE mode is
equivalent to what it was before this commit.

Change-Id: I1bfcaa0a52731cdf9532a7fd0eb06ab2f0e09d48
2023-08-25 13:10:51 -05:00
Melissa Jost
d640c17f75 misc: Update CI tests to not run on draft PRs
This updates all the jobs for our CI tests to make sure they
don't run tests on draft pull requests, and only trigger when
ready for review

Change-Id: I3fe7ae373c39fc6ef594c0c71c6f10e7319553d8
2023-08-25 10:27:08 -07:00
Bobby R. Bruce
5cb604559a misc: Move compiler tests to run on 'build' runners (#222)
This is an experiment. The runners were sometimes running out of memory
building gem5. The builders have more memory so should be able to
handle this. The runners have 4-cores so compilation should be faster
(note the inclusion of the `-j$(nproc)`).
2023-08-25 03:24:03 -07:00
Matthew Poremba
fcbed2bd8a dev-amdgpu: Tell OS about PCIe atomic support (#224)
configs,dev-amdgpu: Add PCI express capability info

The ROCm stack requires PCI express atomics. Currently the first PCI
CapabilityPtr does not point to anything, which signals to the OS
(Linux) that this is an early generation PCI device. As PCI express
atomics were introduced later, the CapabilityPtr needs to point to at
least a PCI express capability structure. This capability is defined as
0x10 in Linux. We additionally set the PCI atomic based bits and
implement device specific PCI configuration space reads and writes to
the amdgpu device.

With the second commit, the output of simulation when loading the amdgpu
driver no longer includes "PCIE atomics not supported". Further, an
application which uses PCIe atomics (PyTorch with a reduce_sum kernel)
now makes further progress.

First commit is a minor typo fix changing PCI capability struct to
union.
2023-08-24 11:19:30 -07:00
Bobby R. Bruce
cf997c93a5 tests, gpu-compute: Updating weekly.sh to use mmapped version of FW (#186) 2023-08-24 10:16:25 -07:00
Bobby R. Bruce
7aa896fe8f cpu-minor: Separate the reg_index of VecClassReg and VecElemReg (#225)
In the RISC-V system, we need VecClassReg to run RISC-V vector
instructions, and VecElemReg is not applicable because the element length
of a vector can be resized via the vset*vl* instructions.

The change will separate the reg_index for VecReg and VecElemReg to
ensure that there is space for VecReg when VecElemReg is not applicable.
2023-08-24 10:13:21 -07:00
Giacomo Travaglini
56a8ab3f3c sim: provide a signal constructor with an init_state (#210)
The current SignalSinkPort and SignalSourcePort have no ways to assign
the init value of the state. Add a new constructor for them with the
param init_state

Bug: 293410800
Test: boot to linux
Change-Id: Idde0a12aa0ddd0c9c599ef47059674fb12aa5d68
Reviewed-on:
https://soc-sim-external-review.googlesource.com/c/gem5/gem5/+/13159
Gem5-Virtual-Platform-Presubmit-Ready: Johnny Ko <johnnyko@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Perf-Presubmit-Ready: Johnny Ko <johnnyko@google.com>
Gem5-Virtual-Platform-Verified: kokoro <noreply+kokoro@google.com>
Perf-Verified: kokoro <noreply+kokoro@google.com>
2023-08-24 18:06:21 +01:00
Bobby R. Bruce
e77666d9e8 mem-ruby: fix CHI Evict race condition (#217)
When an Evict request is received from upstream for a shared line and
the line is no longer cached locally (or on any other upstream cache),
we need to also send an Evict downstream. In this case we need to wait
until our outgoing Evict completes before completing the Evict from
upstream in order to be able to resolve race conditions with incoming
snoops. E.g.: while our outgoing Evict is pending we may receive a snoop
requesting data, but we won't be able to complete this snoop if we have
already completed all upstream Evicts and we no longer have the line.
2023-08-24 10:04:28 -07:00
Matthew Poremba
9fd846f48d gpu-compute,arch-vega: Fix ALU-only LDS counters (#223)
There are a few LDS instructions that perform local ALU operations and
writeback which are marked as loads. These are marked as loads because
they fit in the pipeline logic better, according to a several year old
comment. In the VEGA ISA these instructions (swizzle, permute, bpermute)
are not decrementing the LDS load counter. As a result, the counter will
gradually increase over time. Since wavefront slots are persistent, this
can cause applications with a few thousand kernels to eventually hang
thinking there are not enough resources.

This changeset fixes this by decrementing the LDS load counter for these
instructions. This fix was already integrated in the GCN3 ISA in the
exact same way. This changeset moves it near a similar comment about
scheduling register file writes.

Change-Id: Ife5237a2cae7213948c32ef266f4f8f22917351c
2023-08-24 07:12:56 -07:00
Matthew Poremba
addba01d29 configs,dev-amdgpu: Add PCI express capability info
The ROCm stack requires PCI express atomics. Currently the first PCI
CapabilityPtr does not point to anything, which signals to the OS
(Linux) that this is an early generation PCI device. As PCI express
atomics were introduced later, the CapabilityPtr needs to point to at
least a PCI express capability structure. This capability is defined as
0x10 in Linux. We additionally set the PCI atomic based bits and
implement device specific PCI configuration space reads and writes to
the amdgpu device.

With this commit, the output of simulation when loading the amdgpu
driver no longer outputs "PCIE atomics not supported". Further, an
application which uses PCIe atomics (PyTorch with a reduce_sum kernel)
now makes further progress.

Change-Id: I5e3866979659a2657f558941106ef65c2f4d9988
2023-08-24 09:10:35 -05:00
Bobby R. Bruce
2d9ad02ae7 ext: Specialize GDBSignal MACRO to gem5 (#209)
The goal is to fix this issue, which appears to affect some Apple
users: https://github.com/gem5/gem5/issues/94.

By specializing the `EXC_*` to gem5 we avoid the name conflicts plaguing
some users.
2023-08-24 02:44:56 -07:00
Roger Chang
5c28113a06 cpu-minor: Separate the reg_index of VecClassReg and VecElemReg
In the RISC-V system, we need VecClassReg to run RISC-V vector
instructions, and VecElemReg is not applicable because the element
length of a vector can be resized via the vset*vl* instructions.

The change will separate the reg_index for VecReg and VecElemReg to
ensure that there is space for VecReg when VecElemReg is not
applicable.

Change-Id: I99a82dec273baeee31df89a0ee0f5e87f3ff187c
2023-08-24 13:27:27 +08:00
Matthew Poremba
8b4c38302f dev: PCI: Fix PCI express capability union
The capability for PCI express is a struct, instead of a union, like
the other capability unions. A union is used here to provide access to
the ordinal data values when reading/writing an offset while
simultaneously providing human readable field values that can be set
when writing the code.

This commit changes it to a union, which is likely what it should be.
Nothing appears to be using this union yet so it is likely an oversight.
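
The layout difference between the two declarations can be sketched as follows. The type and field names are hypothetical and the field list is shortened; this is not the actual gem5 definition. The sketch only checks sizes and offsets and does not read one union member after writing another.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, shortened field layout; names are illustrative only.
struct PxcapFieldsSketch
{
    uint8_t cap_id;      // capability ID
    uint8_t next_ptr;    // pointer to the next capability
    uint16_t cap_flags;  // capability flags
};

// As a union, the raw bytes and the named fields overlay the same storage,
// so an access by offset and an access by field name refer to the same data.
union PxcapUnionSketch
{
    uint8_t data[sizeof(PxcapFieldsSketch)];
    PxcapFieldsSketch fields;
};

// As a struct, the two members are laid out one after the other, so the
// raw bytes and the named fields are separate copies.
struct PxcapStructSketch
{
    uint8_t data[sizeof(PxcapFieldsSketch)];
    PxcapFieldsSketch fields;
};

static_assert(sizeof(PxcapUnionSketch) == sizeof(PxcapFieldsSketch),
              "union members share storage");
static_assert(sizeof(PxcapStructSketch) == 2 * sizeof(PxcapFieldsSketch),
              "struct members do not share storage");
static_assert(offsetof(PxcapFieldsSketch, cap_flags) == 2,
              "cap_flags sits at byte offset 2 in this sketch");
```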

Change-Id: I85fe7cc62914525c70fd7a5946d725ed308f8775
2023-08-23 19:32:38 -05:00
Matthew Poremba
90a518e885 gpu-compute,arch-vega: Fix ALU-only LDS counters
There are a few LDS instructions that perform local ALU operations and
writeback which are marked as loads. These are marked as loads because
they fit in the pipeline logic better, according to a several year old
comment. In the VEGA ISA these instructions (swizzle, permute, bpermute)
are not decrementing the LDS load counter. As a result, the counter will
gradually increase over time. Since wavefront slots are persistent, this
can cause applications with a few thousand kernels to eventually hang
thinking there are not enough resources.

This changeset fixes this by decrementing the LDS load counter for these
instructions. This fix was already integrated in the GCN3 ISA in the
exact same way. This changeset moves it near a similar comment about
scheduling register file writes.

Change-Id: Ife5237a2cae7213948c32ef266f4f8f22917351c
2023-08-23 19:30:24 -05:00
Bobby R. Bruce
b2d40edc62 misc: Move compiler tests to run on 'build' runners
This is an experiment. The runners were sometimes running out of memory
building gem5. The builders have more memory to handle this. The runners
have 4-cores so compilation should be faster (note the inclusion of the
`-j$(nproc)`).

Change-Id: I964c5a778938b449502d92dec3431f8b788397e4
2023-08-23 17:17:28 -07:00
Tiago Mück
9584d2efa9 mem-ruby: add in_trans/out_trans to CHI events
Marks which events signal the beginning of incoming and outgoing
transactions for generating inTransLatHist and outTransLatHist stats.

Change-Id: I90594a27fa01ef9cfface309971354b281308d22
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 17:25:50 -05:00
Tiago Mück
3360a87d5a mem-ruby: optimize in/outTransLatHist stats
Generating these stats for all defined Events may generate too many
stats that are never used, which unnecessarily increases simulation
startup time and memory consumption.

This patch limits those stats to events with the "in_trans" and/or
"out_trans" properties. SLICC compiler then checks which combinations
of event+state are possible when generating the stats.

Also the possible level of detail for inTransLatHist was reduced.
Only the number of transactions for each event+initial+final state
combinations is now accounted. Latency histograms are only defined
per event type (similarly to outTransLatHist). This significantly
reduces the final file size for generated stats.

Change-Id: I29aaeb771436cc3f0ce7547a223d58e71d9cedcc
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 17:25:38 -05:00
Tiago Mück
a5fd6edea1 mem-ruby: fix CHI sending the wrong snoop response
Do not respond with SnpRespData_I when the line is still present
upstream.

Change-Id: I2592e5c6637cfc0e83042169a245837648276e61
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 17:04:09 -05:00
Tiago Mück
49f5ec16d1 mem-ruby: fix assert on CHI ReadUnique
DCT must be disabled when handling a ReadUnique where the copy
needs to be upgraded.

Previously we were just asserting as it was assumed DCT is only enabled
for HNFs (which can "auto-upgrade"). However DCT may also be enabled
for intermediate levels of distributed shared caches above the HNFs.

Change-Id: I9e29142a8d2f59ea61c1d90cda6b00c19435d6b7
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 16:58:25 -05:00
Harshil Patel
328d140c70 stdlib, resources: Added warn msgs and comments.
- Added deprecated warnings to Workload and Abstract workload.

- Added comments to the classes changed.

Change-Id: I671daacf5ef455ea65103bd96aa442486142a486
2023-08-23 13:50:08 -07:00
Reiley Jeyapaul
c9ff54677f mem-ruby: fix CHI Evict race condition
When an Evict request is received from upstream for a shared line
and the line is no longer cached locally (or on any other upstream
cache), we need to also send an Evict downstream. In this case we need
to wait until our outgoing Evict completes before completing the Evict
from upstream in order to be able to resolve race conditions with incoming
snoops. E.g.: while our outgoing Evict is pending we may receive a
snoop requesting data, but we won't be able to complete this snoop if
we have already completed all upstream Evicts and we no longer have the
line.

Change-Id: I23ac4f0a9c4ddd81e2425376c8d1e1c7fb66d107
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 15:49:51 -05:00
Ranganath (Bujji) Selagamsetty
f6a453362f mem: Atomic ops to same address
Augmenting the DataBlock class with a change log structure to
record the effects of atomic operations on a data block and
service these changes if the atomic operations require return
values.

Although the operations are atomic, the coalescer need not
send unique memory requests for each operation. Atomic
operations within a wavefront to the same address are now
coalesced into a single memory request. The response of this
request carries all the necessary information to provide the
requesting lanes unique values as a result of their individual
atomic operations. This helps reduce contention for request
and response queues in simulation.

Previously, only the final value of the datablock after all
atomic ops to the same address was visible to the requesting
waves. This change corrects this behavior by allowing each wave
to see the effect of its individual atomic op if a return value
is necessary.
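
The change-log idea can be sketched as follows for atomic adds on a single word. This is a simplified model under stated assumptions, not the DataBlock implementation: `LoggedWordSketch` and `coalescedAtomicAdd` are hypothetical names, and the real log covers a whole data block rather than one word.

```cpp
#include <cstdint>
#include <vector>

// One word of a data block plus a log of the values it held
// immediately before each atomic operation was applied.
struct LoggedWordSketch
{
    uint32_t value = 0;
    std::vector<uint32_t> log;

    // Record the pre-operation value, then apply the atomic add.
    void atomicAdd(uint32_t operand)
    {
        log.push_back(value);
        value += operand;
    }
};

// Apply the atomic adds of several lanes that target the same address
// as one coalesced request. The returned vector holds, per lane, the
// value that lane's atomic should return (the value before its own op).
inline std::vector<uint32_t>
coalescedAtomicAdd(LoggedWordSketch &word,
                   const std::vector<uint32_t> &lane_operands)
{
    word.log.clear();
    for (uint32_t operand : lane_operands)
        word.atomicAdd(operand);
    return word.log;
}
```

With a starting value of 10 and lane operands 1, 2, and 3, the lanes receive 10, 11, and 13, while the word ends at 16. Without the log, only the final value would be available to every lane.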

Change-Id: I639bea943afd317e45f8fa3bff7689f6b8df9395
2023-08-23 14:45:25 -05:00
Johnny
76fe71ebd0 sim: provide a signal constructor with an init_state
Add more description to the code

Change-Id: Iff8fb20762baa0c9d0b7e5f24fb8769d7e198b5c
2023-08-23 10:49:15 +08:00
Johnny
6acb687975 sim: provide a signal constructor with an init_state
1. The current SignalSinkPort and SignalSourcePort have no
   ways to assign the init value of the state. Add a new constructor
   for them with the param init_state
2. After the source and sink are bound, the state on both sides should
   be the same. Set the state of the sink to the state of the source in the
   bind() function.
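
The two points above can be sketched as follows. The class names `SignalSinkSketch` and `SignalSourceSketch` are hypothetical stand-ins and the interface is reduced to the minimum needed for illustration; the real SignalSinkPort and SignalSourcePort classes differ.

```cpp
// Receiving end of a signal; holds the last state it was given.
template <class State>
class SignalSinkSketch
{
  public:
    // Constructor taking the initial state (point 1 above).
    explicit SignalSinkSketch(State init_state = State{})
        : _state(init_state)
    {}

    void set(const State &new_state) { _state = new_state; }
    const State &state() const { return _state; }

  private:
    State _state;
};

// Driving end of a signal.
template <class State>
class SignalSourceSketch
{
  public:
    explicit SignalSourceSketch(State init_state = State{})
        : _state(init_state)
    {}

    // On bind, copy the source state to the sink so that both
    // sides agree from the start (point 2 above).
    void bind(SignalSinkSketch<State> &sink)
    {
        _sink = &sink;
        _sink->set(_state);
    }

    // Later changes are forwarded to the bound sink, if any.
    void set(const State &new_state)
    {
        _state = new_state;
        if (_sink)
            _sink->set(new_state);
    }

    const State &state() const { return _state; }

  private:
    State _state;
    SignalSinkSketch<State> *_sink = nullptr;
};
```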

Change-Id: Idde0a12aa0ddd0c9c599ef47059674fb12aa5d68
2023-08-23 10:12:41 +08:00
Bobby R. Bruce
c218104f52 tests: Update asmtest script and add more test binaries (#206)
Update the config script to make it only for riscv asmtest and replace
Resource with obtain_resource.

Also adds more test binaries.
2023-08-22 13:59:56 -07:00
Jason Lowe-Power
e3414c7098 base: Make 'findLsbSetFallback' constexpr to fix gcc-8 comp (#203)
Compilation bug found on:
https://github.com/gem5/gem5/actions/runs/5899831222/job/16002984553

In gcc Version 8 and below the following error is received:

```
src/base/bitfield.hh: In function ‘constexpr int gem5::findLsbSet(uint64_t)’:
src/base/bitfield.hh:365:34: error: call to non-‘constexpr’ function ‘int gem5::{anonymous}::findLsbSetFallback(uint64_t)’
         return findLsbSetFallback(val);
                ~~~~~~~~~~~~~~~~~~^~~~~
scons: *** [build/ALL/kern/linux/events.o] Error 1
```

`findLsbSet` cannot be `constexpr` as it calls the non-constexpr function
`findLsbSetFallback`. The problematic function is the `count` on the
std::bitset.

This patch changes this to a constexpr.
2023-08-22 11:23:59 -07:00
Roger Chang
f41172f9e4 tests: Add RV32 test binaries 2023-08-22 16:00:16 +08:00
Roger Chang
61488e1e17 tests: Add more tests for RV64 2023-08-22 16:00:16 +08:00
Roger Chang
fee1c3fc7a tests: Update asmtest script
Update the config script to make it only for riscv asmtest and replace
Resource with obtain_resource

Change-Id: I0bab96ea352b7ce1c6838203bfa13eee795f41f9
2023-08-22 16:00:16 +08:00
Bobby R. Bruce
f9a4a794b7 misc: Add DRAMSys tests to our weekly tests (#198)
This adds the DRAMSys tests to our weekly-tests.yaml file
2023-08-21 17:31:36 -07:00
Bobby R. Bruce
6f7fc51a18 ext: Specialize GDBSignal MACRO to gem5
The goal is to fix this issue, which appears to affect some Apple
users: https://github.com/gem5/gem5/issues/94.

By specializing the `EXC_*` to gem5 we avoid the name conflicts plaguing
some users.

Change-Id: I031f7110b4b4ae82677b6586903cd57b22ca2137
2023-08-21 17:23:09 -07:00
Bobby R. Bruce
709f632730 base: Make 'findLsbSetFallback' constexpr to fix gcc-8 comp
Compilation bug found on:
https://github.com/gem5/gem5/actions/runs/5899831222/job/16002984553

In gcc Version 8 and below the following error is received:

```
src/base/bitfield.hh: In function ‘constexpr int gem5::findLsbSet(uint64_t)’:
src/base/bitfield.hh:365:34: error: call to non-‘constexpr’ function ‘int gem5::{anonymous}::findLsbSetFallback(uint64_t)’
         return findLsbSetFallback(val);
                ~~~~~~~~~~~~~~~~~~^~~~~
scons: *** [build/ALL/kern/linux/events.o] Error 1
```

`findLsbSet` cannot be `constexpr` as it calls the non-constexpr function
`findLsbSetFallback`. The problematic function is the `count` on the
std::bitset.

This patch changes this to a constexpr.

Change-Id: I48bd15d03e4615148be6c4d926a3c9c2f777dc3c
2023-08-21 14:04:36 -07:00
Melissa Jost
e611cc66b1 misc: ADD DRAMSys tests to our weekly tests
This adds the DRAMSys tests to our weekly-tests.yaml file

Change-Id: Ieb7903a3a7ffae6359b3de5f66e1dd65eb51fc80
2023-08-21 11:53:08 -07:00
Bobby R. Bruce
63b91b51a2 mem-cache: Allow clflush's uncacheable requests on classic cache (#205)
When a Linux kernel changes a page property, it flushes the related
cache lines. The kernel might change the page property before flushing
the cache lines. As a result, the clflush might occur in an
uncacheable region.

Currently, an uncacheable request must be a read or a write. However,
a clflush request is neither of them.

This change aims to allow clflush requests to work on uncacheable
regions. Since there is no straightforward way to check if a packet is
from a clflush instruction, this change permits all Clean Invalidate
Requests, which is the type of request produced by clflush, to work on
uncacheable regions.
2023-08-21 10:42:10 -07:00
Bobby R. Bruce
f98cd15ec7 arch-riscv,systemc: Update cxx_config_cc.py to use port.is_source (#196)
Fix for issue #181. Update the port description generation to use the
port.is_source attribute.
2023-08-20 21:00:33 -07:00
Bobby R. Bruce
e5fcc116ec ext: Update DRAMSys README (#202)
This fixes:

1. Most importantly: The submodule recursive update was incorrect. This
adds the recursive obtaining of submodules as a separate, explicit step.
2. Changes the `git clone` to use https.
2023-08-20 20:13:17 -07:00
Thilo Vörtler
73b6e98f51 arch-riscv,systemc: Fix cxx_config_cc.py to use is_source
Update the cxx_config_cc.py port description generation to use the
port.is_source attribute.

Github Issue: https://github.com/gem5/gem5/issues/181

Change-Id: I3fa12c2fbb06083379118e57aedb8be414c0d929
2023-08-20 14:06:37 +00:00
Hoa Nguyen
9e007e5bd7 mem-cache: fix wrong function call
Change-Id: I924ede89f373ec21557faf25c96b36f4bc8430dd
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-08-19 22:56:55 +00:00
Hoa Nguyen
f442846d9d mem-cache: Fix another typo
Change-Id: Ib2051f9bda6e6d9002d3be1dbf0b890299098201
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-08-19 22:50:53 +00:00
Hoa Nguyen
7b897a30fa mem-cache: Fix syntax error
Change-Id: I1360879c13d377661e9eeeddf345b785c01efeb6
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-08-19 21:27:53 +00:00
Hoa Nguyen
98daec7d99 mem-cache: Allow clflush's uncacheable requests on classic cache
When a Linux kernel changes a page property, it flushes the related cache
lines. The kernel might change the page property before flushing the
cache lines. As a result, the clflush might occur in an uncacheable region.

Currently, an uncacheable request must be a read or a write. However,
a clflush request is neither of them.

This change aims to allow clflush requests to work on uncacheable regions.
Since there is no straightforward way to check if a packet is from a clflush
instruction, this change permits all Clean Invalidate Requests, which is
the type of request produced by clflush, to work on uncacheable regions.

Change-Id: Ib3ec01d9281d3dfe565a0ced773ed912edb32b8f
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-08-19 18:20:16 +00:00
Bobby R. Bruce
16752b7ca2 ext: Update DRAMSys README
This fixes:

1. Most importantly: The submodule recursive update was incorrect. This
adds the recursive obtaining of submodules as a separate, explicit step.
2. Changes the `git clone` to use https.

Change-Id: Iad69e44b927a5aa982b49dffa6929c52fcc7ee72
2023-08-18 15:43:14 -07:00
Bobby R. Bruce
d7d441becb tests: Add checkpoint tests for all ISAs (#167)
Added save and restore checkpoint tests for arm-hello, x86-hello,
x86-fs, power-hello

Added mips and sparc tests, but mips does not support checkpoints and there
is a bug in sparc.

Added test file to run the tests.
2023-08-18 15:01:39 -07:00
Harshil Patel
a18b4b17ed stdlib, resources: deprecated workload
- Added WorkloadResource in resource.py.

- Deprecated Workload and CustomWorkload.

- Changed riscvmatched-fs.py to use obtain resource for the workload to test.

Change-Id: I2267c44249b96ca37da3890bf630e0d15c7335ed
Note: change example files back to original
2023-08-18 13:56:12 -07:00
Bobby R. Bruce
ac88871017 misc: Update matrix runs in scheduled tests (#194)
This changes continue-on-error to be fail-fast instead, as
continue-on-error will mark failed matrix runs as
successful, whereas fail-fast makes sure everything in the matrix runs,
but gets marked as failed if part of it fails.
2023-08-18 10:56:26 -07:00
Bobby R. Bruce
30ab2c19b1 stdlib: Allow passing of func as Exit Event generator (#195)
In this case the function is turned into a generator, with the "yield" of
the generator being the return value of the function's execution.

Translation of this stale Gerrit Change:
https://gem5-review.googlesource.com/c/public/gem5/+/62872
2023-08-18 10:55:50 -07:00
Harshil Patel
9d86a559ed tests: removed mips tests and added issue link.
- Removed MIPS tests.
- Added link to the GitHub issue for the sparc test bug.

Change-Id: Ib3c69dca578371ecf0ac2d7694f46f24834a7e5f
2023-08-18 09:51:40 -07:00
Harshil Patel
ea5951a467 stdlib, resources: skeleton for workload resource
Change-Id: I5017ac479ad61c767ede36fae195105e0519304f
2023-08-18 08:57:31 -07:00
Bobby R. Bruce
c0216dbe48 stdlib: Allow passing of func as Exit Event generator
In this case the function is turned into a generator, with the
"yield" of the generator being the return value of the function's execution.

Change-Id: I4b06d64c5479638712a11e3c1a2f7bd30f60d188
2023-08-17 16:48:33 -07:00
Jason Lowe-Power
22c52f4fba Fix reporting traps (faults) to GDB in SE mode (#166)
This addresses #123
2023-08-17 16:08:49 -07:00
Melissa Jost
fa49de5b98 misc: Update matrix runs in scheduled tests
This changes continue-on-error to be fail-fast instead, as
continue-on-error will mark failed matrix runs as
successful, whereas fail-fast makes sure everything in the matrix
runs, but gets marked as failed if part of it fails.

Change-Id: Ie20652c229b6cce9f1c0a45958b088391e7aae97
2023-08-17 15:56:02 -07:00
Bobby R. Bruce
fe43e4a3e3 arch-riscv: Check CSR before executing VMem instructions (#187)
Any instruction that requires vector registers should check whether vector
is enabled. Any instruction that needs the vtype CSR to execute should check
the vill bit beforehand.
2023-08-17 11:20:21 -07:00
Jan Vrany
3564348eec arch-riscv: Report traps to GDB in SE mode
This commit adds code to report illegal instruction and breakpoint traps
to GDB (if connected). This merely follows what POWER does.
2023-08-17 15:55:04 +01:00
Jan Vrany
546b3eac7d arch-riscv: Do not advance PC when handling faults in SE mode
On RISC-V, when a trap occurs, the PC register contains the address of the
instruction that caused the trap (as opposed to the address of the
instruction following it in the instruction stream). Therefore this commit
does not advance the PC before reporting the trap in SE mode.

Change-Id: I83f3766cff276312cefcf1b4ac6e78a6569846b9
2023-08-17 15:55:04 +01:00
Jan Vrany
fde58a4365 arch-power: Fix reporting traps to GDB
Due to inverted logic in POWER fault handlers, unimplemented opcode and
trap faults did not report trap to GDB (if connected). This commit fixes
the problem.

While at it, I opted to use `if (! ...) { panic(...) }` rather than
`panic_if(...)`. I find it easier to understand in this case.

Change-Id: I6cd5dfd5f6546b8541d685e877afef21540d6824
2023-08-17 15:55:04 +01:00
Roger Chang
fe142f485a arch-riscv: Add missing vector required check for vmem instructions
The mem instructions are usually executed from initiateAcc. We also need
to perform the vector-required check in those instructions.

Change-Id: I97b4fec7fada432abb55ca58050615e12e00d1ca
2023-08-17 09:53:30 +08:00
Roger Chang
35a6fe6f3d arch-riscv: Check vill in vector mem instructions
Any vector instruction using vtype should check whether the vill flag is set

Change-Id: Ia9a2695f3005a176422da78e6f413cc789116faa
2023-08-17 09:53:30 +08:00
Bobby R. Bruce
3ff6fe0e90 arch-x86,cpu-kvm: Fix gem5.fast due to unused variable (#189)
Detected via this failing workload:
https://github.com/gem5/gem5/actions/runs/5861958237

It caused the following compilation error to be thrown:

```
src/arch/x86/kvm/x86_cpu.cc:1462:22: error: unused variable ‘rv’ [-Werror=unused-variable]
 1462 |                 bool rv = isa->cpuid->doCpuid(tc, function, idx, cpuid);
      |                      ^~
```

`rv` is unused in the .fast compilation as it's only used in the
`assert` statement immediately after.

To fix this, the `[[maybe_unused]]` annotation is used.
2023-08-16 12:52:44 -07:00
Bobby R. Bruce
f6b116d8a0 util-docker: Fix clang-version-8 docker container (#190)
clang v8, when installed in this manner via Docker, did not install the
libstdc++. This caused compilation errors. This patch adds the
libstdc++-10-dev package to this Dockerfile.
2023-08-16 12:52:25 -07:00
Bobby R. Bruce
c835c9faa3 arch-x86,cpu-kvm: Fix gem5.fast due to unused variable
Detected via this failing workload:
https://github.com/gem5/gem5/actions/runs/5861958237

It caused the following compilation error to be thrown:

```
src/arch/x86/kvm/x86_cpu.cc:1462:22: error: unused variable ‘rv’ [-Werror=unused-variable]
 1462 |                 bool rv = isa->cpuid->doCpuid(tc, function, idx, cpuid);
      |                      ^~
```

`rv` is unused in the .fast compilation as it's only used in the
`assert` statement immediately after.

To fix this, the `[[maybe_unused]]` annotation is used.

Change-Id: Ib98dd859c62f171c8eeefae93502f92a8f133776
2023-08-16 10:06:39 -07:00
Bobby R. Bruce
74f6fa34af util-docker: Fix clang-version-8 docker container
clang v8, when installed in this manner via Docker, did not install the
libstdc++. This caused compilation errors. This patch adds the
libstdc++-10-dev package to this Dockerfile.

Change-Id: Ia0f41e82b3df2d4bf32b418b0cb78111a35e0b9f
2023-08-16 10:00:45 -07:00
Matthew Poremba
bc9bbc10f0 gpu-compute: Change kernel-based exit location (#184)
The previous exit event occurs when the dispatcher sends a completion
signal for a kernel, but gem5 does some kernel-based stats updates after
the signal is sent. Therefore, if these exit events are used as a way to
dump per-kernel stats, some of the stats for the kernel that just ended
will be in the next kernel's stat dump which is misleading.

This patch moves the exit event to where the stats are updated and only
exits if the dispatcher has requested a stat dump to prevent situations
where stats are updated mid-kernel.

Change-Id: I74dc1cad5fc90382a2a80564764b3e7c9fb65521
2023-08-16 07:38:12 -07:00
Andreas Sandberg
f6d44ac7b3 fastmodel: Add option to retry licence server connection. (#183)
We're seeing some occasional connection timeouts in CI, possibly when we
aggressively hit the license server, so let's add a parameter to retry
the connection a few times.

Also, print the time required to connect to the server to help debug
issues.
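
The retry-and-time behaviour described above can be sketched as follows.
This is an illustration only: the function name `connect_with_retries`, its
`retries` parameter, and the stand-in `connect` callable are assumptions and
do not reflect the fastmodel code or its parameter names.

```python
import time
from typing import Callable


def connect_with_retries(connect: Callable[[], bool], retries: int) -> bool:
    """Hypothetical sketch: try once, then up to `retries` more times,
    printing how long each attempt took."""
    for attempt in range(retries + 1):
        start = time.monotonic()
        ok = connect()
        elapsed = time.monotonic() - start
        status = "ok" if ok else "failed"
        print(f"attempt {attempt + 1}: {status} after {elapsed:.3f}s")
        if ok:
            return True
    return False


# Stand-in for a flaky server: fails twice, then succeeds.
_outcomes = iter([False, False, True])
result = connect_with_retries(lambda: next(_outcomes), retries=3)

# A server that never answers exhausts all attempts.
never = connect_with_retries(lambda: False, retries=2)
```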
2023-08-16 10:08:59 +01:00
Bobby R. Bruce
9ee400ff92 mem: Port trace in xbar when address error (#180)
When the xbar encounters an address error, print out the port trace in the
packet, if port tracing is enabled, to help the user debug.

To obtain the packet for the access, the parameter of the findPort()
function is changed from AddrRange to PacketPtr.

When running gem5 with "--debug-flags=PortTrace", we can see the full
path of the unexpected access when xbar cannot find the destination of
the address.
2023-08-15 23:27:17 -07:00
Bobby R. Bruce
954328fa28 mem: Fixing memory size type issue in port proxy. (#185)
This patch changes the data type used for image size from int to
uint64_t. The current version only allows initializing AbstractMemory types with
a maximum binary size of 2GiB. This will be limiting in many studies.
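
To illustrate why a signed 32-bit size caps out around 2GiB, the sketch below
stores a 2GiB byte count in fixed-width integers via `ctypes`. It assumes the
original `int` was 32 bits wide, which is typical but platform-dependent; the
variable names are illustrative.

```python
import ctypes

GIB = 1024 ** 3
INT32_MAX = 2 ** 31 - 1

size = 2 * GIB  # 2147483648 bytes

# A signed 32-bit integer cannot represent this value.
fits_in_int32 = size <= INT32_MAX

# ctypes performs no overflow checking, so the value wraps around.
as_int32 = ctypes.c_int32(size).value

# An unsigned 64-bit integer holds the size unchanged.
as_uint64 = ctypes.c_uint64(size).value
```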
2023-08-15 21:43:42 -07:00
Mahyar Samani
d869018226 mem: Fixing memory size type issue in port proxy.
This patch changes the data type used for image size from int
to uint64_t. The current version only allows initializing AbstractMemory
types with a maximum binary size of 2GiB. This will be limiting
in many studies.

Change-Id: Iea3bbd525d4a67aa7cf606f6311aef66c9b4a52c
2023-08-15 12:40:45 -07:00
Matthew Poremba
df4739929d gpu-compute: Change kernel-based exit location
The previous exit event occurs when the dispatcher sends a completion
signal for a kernel, but gem5 does some kernel-based stats updates after
the signal is sent. Therefore, if these exit events are used as a way to
dump per-kernel stats, some of the stats for the kernel that just ended
will be in the next kernel's stat dump which is misleading.

This patch moves the exit event to where the stats are updated and only
exits if the dispatcher has requested a stat dump to prevent situations
where stats are updated mid-kernel.

Change-Id: I74dc1cad5fc90382a2a80564764b3e7c9fb65521
2023-08-15 11:06:26 -05:00
Nicolas Boichat
3ea7a792b0 fastmodel: Add option to retry licence server connection.
We're seeing some occasional connection timeouts in CI, possibly
when we aggressively hit the license server, so let's add a
parameter to retry the connection a few times.

Also, print the time required to connect to the server to help
debug issues.

Change-Id: I804af28f79f893fcdca615d7bf82dd9b8686a74c
2023-08-15 10:47:32 +00:00
Yan Lee
96d80a41d2 mem: dump out port trace when address decode error
1. Add findPort(PacketPtr pkt) for getting the port trace from the Packet.
   Keep the findPort(AddrRange addr_range) for recvMemBackdoorReq(...)
2. With the debug flag `PortTrace` enabled, the user can see the full path of
   the packet with the corresponding address when an address error occurs in
   the xbar.

Change-Id: Iaf43ee2d7f8c46b9b84b2bc421a6bc3b02e01b3e
2023-08-15 00:41:42 -07:00
Yan Lee
b01590fdf4 mem: port: add getTraceInString() method
Return the whole port trace of the packet as a string.

Change-Id: I7b1b1fef628a47a6ce147cb5fb75af81948c1d89
2023-08-15 00:40:29 -07:00
Yan Lee
5edb760414 mem: port: add address value in the port trace
Add the address value from the packet with the request port name.

Change-Id: I3d4c75f48ca6fbdbd5656e594d5f85f9e5626be8
2023-08-15 00:38:29 -07:00
Leo Redivo
576e8c1897 misc: Move inform to get_default_kernel_args() and fix formatting
Change-Id: I788b630d811f8268da0e87923741cf9afdef0a3e
2023-08-11 15:07:41 -07:00
Harshil Patel
b19d4beeb8 tests: Removed mips checkpoint tests
Change-Id: I03ad0025ec982245721fd7faad8d75cdbb99cf81
2023-08-11 09:00:11 -07:00
Bobby R. Bruce
41dcd3c5d5 misc: Add continue-on-error to matrix runs (#175)
This sets continue-on-error to true on any scheduled test that uses a
matrix so we can have all sets of tests run regardless if one of them
fails or not.
2023-08-11 07:39:25 -07:00
Bobby R. Bruce
fa918f61d1 tests: Move replacement policy and simulator config files (#173)
Moving these files should address the failures in the daily tests.
2023-08-11 07:39:06 -07:00
Bobby R. Bruce
cfea9afae3 misc: Update MAINTAINERS.yaml (#155)
- Updates the list of current gem5 Maintainers.
- Updates the MAINTAINERS.yaml comments to reflect gem5's contribution
policy and maintainer responsibilities since moving to GitHub.
- Updates the subsystem maintainers, based on maintainer's stated
preferences.
- Adds the optional 'expert' field to subsystems.
2023-08-11 02:05:26 -07:00
Melissa Jost
b08bc5ff56 misc: Add continue-on-error to matrix runs
This sets continue-on-error to true on any scheduled test that
uses a matrix so we can have all sets of tests run regardless
if one of them fails or not.

Change-Id: I8f6137ebdf62a5cecd582387316c330c8a1401ca
2023-08-10 16:58:22 -07:00
Melissa Jost
912b7c06dd tests: Move replacement policy and simulator config files
Moving these files should address the failures in the daily
tests.

Change-Id: I438adba1a45bdf6083651b6b3f610c8bbe4ebdf0
2023-08-10 16:38:20 -07:00
Bobby R. Bruce
6ca9435961 misc: Add Matt P. as maintainer to requested tags
Change-Id: Ifffb34acbf9ebf313e5bc09ebbfd2a44a017359a
2023-08-10 15:52:00 -07:00
Bobby R. Bruce
a41c6f8d84 misc: Remove PMC/Maintainers list from MAINTAINERS.yaml
Change-Id: I772fa31d0aeea5534355731d841cf2d118fa0df4
2023-08-10 15:48:13 -07:00
Bobby R. Bruce
557d532bc3 misc: Add Jason Lowe-Power as website expert
Change-Id: I52ebce434732bb921efc040397a9aa9538a6d1d9
2023-08-10 15:46:02 -07:00
Bobby R. Bruce
77e63b6a6c cpu-o3: bugfix of rename squash when SMT (#172)
In an SMT CPU, upon a squash, the mis-predicted(squashing) instructions
can still be executing at IEW and own phys registers. If these registers
are added back to the rename freelist on this Tick, the registers may be
renamed to be used by other SMT thread(s). This causes register
ownership hazards, which may eventually freeze the CPU. This problem
seems to date back to 2014
(https://www.mail-archive.com/gem5-users@gem5.org/msg10180.html).

This patch delays the freelist update to avoid the hazard.

I tested that this patch does not cause any performance impact for my
set of benchmarks on default non-SMT O3CPU.
2023-08-10 15:29:51 -07:00
Jason Lowe-Power
b88e60ff28 cpu: Fix segment fault when using debug flags Branch (#169) 2023-08-10 08:33:48 -07:00
He, Wenjian
03c2b4692c cpu-o3: bugfix of rename squash when SMT
In an SMT CPU, upon a squash, the phys regs used by
mispredicted instructions can still be owned by executing
instructions in IEW. If the regs are added back to freelist
on this tick, the reg may be renamed to be used by another
SMT thread. This causes reg ownership hazard, which may
eventually freeze the CPU.

This patch delays the freelist update to avoid the hazard.

Change-Id: I993b3c7d357269f01146db61fc8a7b83a989ea45
2023-08-10 21:43:09 +08:00
Roger Chang
f54777419d cpu: Fix ?: error due to different type
Change-Id: I35c50fbba047fe05cc0cc29c631002a9b68795fd
2023-08-10 14:36:26 +08:00
rogerchang23424
81e3bfcdc3 cpu: Update src/cpu/pred/bpred_unit.cc
Change-Id: I0cf177676d0f9fb9db4b127d5507ba66904739c4
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
2023-08-10 14:36:12 +08:00
James Braun
da87f65e4a Updating weekly.sh to use mmapped version of FW
Change-Id: Id0059d9b3e9e4a4db3ba59793c41ae71269666ae
2023-08-09 23:05:58 -05:00
Roger Chang
97e55fc173 cpu: Fix segment fault when using debug flags Branch
Change-Id: I36624b93f53aa101a57d51f3b917696cb2809136
2023-08-10 07:36:50 +08:00
Bobby R. Bruce
4cac85cb80 arch-riscv: Add checking CSR condition for RVV instructions (#170)
If status.vs == OFF, the RVV instruction should raise an Illegal
Instruction exception according to the RISC-V V spec. If RVV is not
implemented, all RVV instructions need to raise an exception.
2023-08-09 13:47:56 -07:00
Roger Chang
42c2ed6c2d arch-riscv: Add condition for setting misa and mstatus CSR
Change-Id: I7e03b60d0de32fe8169dd79ded485d560aca64aa
2023-08-09 19:32:04 +08:00
Roger Chang
43adc5309a arch-riscv: Add Illegal Instruction Fault Condition for RVV Config
Check the status.vs and misa.rvv CSR registers before executing
RVV instructions

Change-Id: I0355b94ea8ee4018be11a75aab8c19b10cb36126
2023-08-09 19:31:58 +08:00
Roger Chang
85549842c7 arch-riscv: Add Illegal Instruction Fault Condition for Mem RVV
Check the status.vs and misa.rvv CSR registers before executing
RVV instructions

Change-Id: If1f6a440713612b9a044de4f320997e99722c06c
2023-08-09 19:22:32 +08:00
Roger Chang
c18e43a0ab arch-riscv: Add Illegal Instruction Fault Condition for Arith RVV
Check the status.vs and misa.rvv CSR registers before executing
RVV instructions

Change-Id: Idc143e1ba90320254926de9fa7a7b343bb96ba88
2023-08-09 19:20:53 +08:00
Harshil Patel
a880ff1e15 tests: Updated directory structure
- Changed copyright message to reflect correct year.
- Updated directory structure.
- Changed directory name to snake case.
- Added a README.md for checkpoint tests.

Change-Id: Id350addb9cce6740a20a5a45171f80306b711efa
2023-08-08 13:38:56 -07:00
Bobby R. Bruce
572c6bc1bb misc: Update where runners are cleaned in workflow files (#163)
This moves the clean runner step in our yaml files to be at the
beginning of a job, so that if a runner goes down and is unable to clean
at the end, we can ensure that
subsequent jobs still run as expected.

Change-Id: Iba52694aefe03c550ad0bfdb5b5f938305273988
2023-08-08 12:42:24 -07:00
Harshil Patel
a903ff43f2 tests: Add checkpoint tests for all ISAs
Added save and restore checkpoint tests for arm-hello, x86-hello, x86-fs, power-hello
Added mips and sparc tests, but mips does not support checkpoints and there is a bug in sparc.
Added test file to run the tests.

Change-Id: I2d3b96f95ee08aae921de9a885ac5be77d49f326
2023-08-08 09:42:41 -07:00
Bobby R. Bruce
faed0d3f6d tests: Temporarily cease using PARSEC disk image in tests (#164)
Due to a 60GB limit on the VMs the gem5 project's GitHub Actions
self-hosted Runners execute within, we cannot run tests which need to
download the gem5 Resource's PARSEC disk image (v1.0.0,
http://resources.gem5.org/resources/x86-parsec?version=1.0.0). This
image, 33GB, is too big and causes our runners to run out of disk space
and fail.

These changes can be reverted when we are able to increase the size of
our VMs.
2023-08-08 00:24:25 -07:00
Bobby R. Bruce
dc31883a2d tests: Update resource downloading test to skip x86-parsec
The x86-parsec gem5 Resource (v1.0.0,
http://resources.gem5.org/resources/x86-parsec?version=1.0.0) is 33GB.
The gem5 GitHub Actions self-hosted runners do not have enough Disk
Space in the VMs they run in to download this. Ergo we skip it.

Change-Id: I290fe265f03ceca65b2bed87e9f4a4ad601e0fc1
2023-08-07 15:30:09 -07:00
Bobby R. Bruce
b86bc7b1ed tests: Add '--skip' arg to "download_check.py"
This argument allows the passing of IDs of resources which should be
skipped for this check.

Note: A current limitation here is you cannot specify the version of a
resource. Passing the ID of a resource to this will skip the downloading
for all versions of that resource.
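
A hedged sketch of how such an argument might behave. Only the `--skip` flag
name and the "skip by ID, regardless of version" behaviour come from the
message above; the space-separated syntax, the helper names, and the
`arm-hello` ID are assumptions for illustration.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Illustrative resource checker"
    )
    parser.add_argument(
        "--skip",
        nargs="*",
        default=[],
        help="IDs of resources to skip (applies to every version of the ID)",
    )
    return parser.parse_args(argv)


def ids_to_check(all_ids: List[str], skip: List[str]) -> List[str]:
    """Drop every resource whose ID was passed to --skip."""
    skipped = set(skip)
    return [rid for rid in all_ids if rid not in skipped]


args = parse_args(["--skip", "x86-parsec"])
remaining = ids_to_check(["x86-parsec", "arm-hello"], args.skip)
```

Because the filter matches on ID alone, there is no way to skip one version
of a resource while still checking another, which mirrors the limitation
noted above.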

Change-Id: Ifdb7c2b71553126fd52a3d286897ed5dd8e98f7c
2023-08-07 15:30:09 -07:00
Bobby R. Bruce
f21c5d0d78 tests: Disable PARSEC benchmark tests
These tests are disabled due to our GitHub Actions self-hosted Runners
having only 60GB of disk space. The PARSEC Disk Image Resource (v1.0.0,
http://resources.gem5.org/resources/x86-parsec?version=1.0.0) is 33GB
and is simply too big to download and unzip for these tests.

These tests can be reenabled when this issue is resolved.

Change-Id: I9a63aa1903cea3ce7942bdc85bcd0b24761d2f29
2023-08-07 15:30:08 -07:00
Melissa Jost
8b6912f331 misc: Update where runners are cleaned in workflow files
This moves the clean runner step in our yaml files to be at the
beginning of a job, so that if a runner goes down and is
unable to clean at the end, we can ensure that
subsequent jobs still run as expected.

Change-Id: Iba52694aefe03c550ad0bfdb5b5f938305273988
2023-08-07 15:05:32 -07:00
Bobby R. Bruce
5200d9ca3d misc: Refactor weekly-tests.yaml (#160)
This adds a matrix to the weekly tests in order to make the file
cleaner.
2023-08-07 14:39:21 -07:00
Melissa Jost
bc2cabbeb5 Merge branch 'develop' into clean-weeklies 2023-08-07 11:32:18 -07:00
Melissa Jost
eb541da32c misc: Refactor weekly-tests.yaml
This adds a matrix to the weekly tests in order to make the file
cleaner.

Change-Id: I830a0bf8b7d0406e9c377fedf2a7edfa5beabf40
2023-08-07 11:30:44 -07:00
Bobby R. Bruce
4114114bed tests: Refactor test configs (#156)
The yaml file changes made here will need to be copied to develop for
our scheduled tests to run properly. In addition, the original comments
on this can be seen here:
https://gem5-review.googlesource.com/c/public/gem5/+/70340/2
2023-08-07 11:18:18 -07:00
Melissa Jost
effca10cb4 tests: Add switcheroo.py to fs configs directory
There was another missing file in the configs for the fs tests,
so this should allow the switcheroo tests to pass.

Change-Id: Ic4e26cceeb9209f176158b80eaaba88b47968c39
2023-08-04 17:57:51 -07:00
Melissa Jost
bd6a1f5b4b Merge branch 'develop' into refactor-test-configs 2023-08-04 14:54:30 -07:00
Bobby R. Bruce
3d39bc160c misc: Fix daily tests (#158)
The dailies timed out as they were running the entire directory of tests
due to a wrong variable name being used. In addition, the names of tests
were adjusted to include the matrix type so the artifacts won't
overwrite each other
2023-08-04 14:38:12 -07:00
Melissa Jost
4376e5fa9f tests: Add checkpoint.py to fs tests directory
The configs directory for the fs tests was missing the
checkpoint.py file, causing some of the CI tests to fail.

Change-Id: Ifbd775ad658f96d06bea7bee554fe3bedcf5a5b5
2023-08-04 14:19:49 -07:00
Leo Redivo
cf1678f43f misc: Fix pre-commit formatting issues
Change-Id: I50e71cfc21d43c2c17da52cf2f40591599907548
2023-08-04 13:15:55 -07:00
Jason Lowe-Power
7a9f7f51ae arch-riscv: Implemented zicbom/zicboz extensions for RISC V (#137) 2023-08-04 11:39:34 -07:00
Jason Lowe-Power
ed44df5d02 util: fix cpt upgrader for rvv changes in PR #83 (#115)
Solves issue #106 by updating the cpts with the necessary vector
registers.
2023-08-04 11:35:28 -07:00
Adrià Armejach
f777cc143c util: fix cpt upgrader for rvv changes in PR #83
* Solves issue #106 by updating the cpts with the necessary vector
    registers.

Change-Id: Ifeda90e96097f0b0a65338c6b22a8258c932c585

util: clear vector_element field

Change-Id: I6c9ec4e71f66722b26de030fa139cd626bdb24dc
2023-08-04 13:59:23 +02:00
zmckevitt
14c25a383c arch-riscv: Implemented zicbom/zicboz extensions for RISC V
Change-Id: I79d0e6059a2dbb5a0057c4f7489b999f9e803684
2023-08-04 10:05:15 +08:00
Bobby R. Bruce
6e39f2097d tests: download_check.py to rm each resource after check (#152)
"tests/gem5/configs/download_check.py" is used by the
"test-resource-downloading" test (defined in
"tests/gem5/gem5-resources/test_download_resources.py" and ran as part
of the "very-long" suite).

Prior to this change "download_check.py" would download each resource,
check its md5, then at the end of the script remove all the downloaded
resources. This is inefficient on disk space and was causing our
"very-long" suite of tests to require a machine with a lot of disk
space to run.

This change alters "download_check.py" to remove each resource after the
md5 check. Thus, only one resource is ever downloaded and present at any
given time during the running of this script.
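
The download, check, remove loop described above can be sketched as follows.
This is not the contents of "download_check.py": the `check_resources`
function, the `fetch` callable, and the in-memory stand-in resources are all
hypothetical.

```python
import hashlib
import os
import tempfile
from typing import Callable, Dict, List


def check_resources(
    resources: Dict[str, str],
    fetch: Callable[[str, str], None],
    workdir: str,
) -> List[str]:
    """`resources` maps a resource ID to its expected md5 and
    `fetch(resource_id, path)` writes that resource to `path`.
    Returns the IDs whose md5 did not match."""
    mismatched = []
    for resource_id, expected_md5 in resources.items():
        path = os.path.join(workdir, resource_id)
        fetch(resource_id, path)
        try:
            with open(path, "rb") as f:
                actual_md5 = hashlib.md5(f.read()).hexdigest()
            if actual_md5 != expected_md5:
                mismatched.append(resource_id)
        finally:
            # Remove the file immediately so that at most one resource
            # is on disk at any time.
            os.remove(path)
    return mismatched


# Demo with in-memory stand-ins instead of real downloads.
_contents = {"res-a": b"alpha", "res-b": b"beta"}
_expected = {
    rid: hashlib.md5(data).hexdigest() for rid, data in _contents.items()
}
_expected["res-b"] = "0" * 32  # deliberately wrong checksum
files_seen = []

with tempfile.TemporaryDirectory() as workdir:

    def fake_fetch(resource_id: str, path: str) -> None:
        with open(path, "wb") as f:
            f.write(_contents[resource_id])
        files_seen.append(len(os.listdir(workdir)))

    mismatched = check_resources(_expected, fake_fetch, workdir)
    leftover = os.listdir(workdir)
```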
2023-08-03 17:28:58 -07:00
Melissa Jost
e7c8a12349 misc: Fix daily tests
The dailies timed out as they were running the entire directory
of tests due to a wrong variable name being used.  In addition,
the names of tests were adjusted to include the matrix type so
the artifacts won't overwrite each other

Change-Id: Iaa1be8e0cfcbf9d64f1a674590bfe2bf1f0dae90
2023-08-03 17:00:07 -07:00
Jason Lowe-Power
0ff485f7d0 stdlib, resources: fixed style issue in isa.hh (#149)
Changed "rv_type" to "rvType".

Change-Id: I7432a87d7a37324777385707854aefba2475b98c
2023-08-03 16:52:52 -07:00
Bobby R. Bruce
2bef8efb94 stdlib, resources: Fixed keyerror: 'is_zipped' bug (#153)
Change-Id: I68fffd880983ebc225ec6fc8c7f8d509759b581d
2023-08-03 16:01:07 -07:00
Melissa Jost
298b1fafb4 misc: Update test names in daily and weekly yaml files
Updates the directories in which tests are run in accordance
with the refactoring of the testing directory

Change-Id: I93f5c5b0236c5180da04deb425ec2ed6804fa003
2023-08-03 15:57:24 -07:00
Bobby R. Bruce
23f78159ec misc: Add 'experts' field to MAINTAINERS.yaml
This field was added to give gem5 community members a chance to register
that they have expertise in a particular subsystem but do not wish to
assign themselves the responsibilities of a subsystem maintainer.

Those who have registered interest in being a subsystem expert have
been added.

Change-Id: I8f532e381e8e42257b2a68ac48204131479d8cd0
2023-08-03 14:58:56 -07:00
Bobby R. Bruce
3f1518a1c2 misc: Update subsystem maintainers in MAINTAINERS.yaml
This change incorporates changes to the set of maintainers and the
maintainers assigned to each subsystem based on individual maintainers'
preferences.

Change-Id: Ic2c39907763282e89936fa0d90e3c1a105a0d917
2023-08-03 14:58:49 -07:00
Bobby R. Bruce
5e6095fecc misc: Update MAINTAINERS.yaml documentation comment
This comment is updated to reflect new gem5 policy and its move to
GitHub and a Pull Request contribution model.

Change-Id: Iec909ffa0cca254fdbe56ce3165cb948cdd0cbce
2023-08-03 14:58:40 -07:00
Harshil Patel
23f5535ef5 Merge branch 'develop' into riscv-fix-style 2023-08-03 13:32:53 -07:00
Harshil Patel
5cfac2cc94 stdlib: Fixed style issue in pcstate.hh
- Changed _rv_type to _rvType.
- Changed rv_type to rvType.

Change-Id: I27bdf342b038f5ebae78b104a29892684265584a
2023-08-03 13:04:17 -07:00
Harshil Patel
a25ca04851 stdlib, resources: Fixed keyerror: 'is_zipped' bug
Change-Id: I68fffd880983ebc225ec6fc8c7f8d509759b581d
2023-08-03 10:59:11 -07:00
Bobby R. Bruce
0855c58538 tests: download_check.py to rm each resource after check
"tests/gem5/configs/download_check.py" is used by the
"test-resource-downloading" test (defined in
"tests/gem5/gem5-resources/test_download_resources.py" and ran as part
of the "very-long" suite).

Prior to this change "download_check.py" would download each resource,
check its md5, then at the end of the script remove all the downloaded
resources. This is inefficient on disk space and was causing our
"very-long" suite of tests to require a machine with a lot of disk
space to run.

This change alters "download_check.py" to remove each resource after the
md5 check. Thus, only one resource is ever downloaded and present at any
given time during the running of this script.

Change-Id: I38fce100ab09f66c256ccddbcb6f29763839ac40
2023-08-03 10:48:44 -07:00
Jason Lowe-Power
5eda9fe2ca arch-riscv: Relation chain on RVV support (#83)
This merges initial support for RVV. Currently, only the simple CPUs are supported.
The decoder stalls for every vsetvl instruction.

In the future, we will implement vsetvl as a control instruction as described in #144
2023-08-03 07:31:08 -07:00
Bobby R. Bruce
fbcf50befd stdlib,resources: Enable loading of local Resources data via JSON file path (#150) 2023-08-02 15:49:47 -07:00
Melissa Jost
3bf92d0e0b tests: Update layout of testing directory
This changeset reorganizes the testing directory within gem5,
removing the bigger config folders, then replacing them with
smaller configs folders within each directory containing only
the scripts necessary for that set of tests. It also changes
the locations of the config scripts used in each set of tests,
and updates the tests accordingly.

Change-Id: I38297d4496f72bd5cf7200471acd5c4d93002b27
2023-08-02 14:59:13 -07:00
Melissa Jost
7ff67459b6 tests: Add READMEs to the testing directory
This change adds READMEs to each directory within tests/gem5,
with a short description of the test, as well as how to run it.

Change-Id: I574ebcdc837848b52f21e8c0f8856ff09463284b
2023-08-02 14:58:39 -07:00
Melissa Jost
57fff0221b tests: Unify testing directory names
This updates the testing directory so everything uses underscores
and is more uniform.

Change-Id: I7ea45c9e0fc1892605387cd2453ce8656ddccd49
2023-08-02 14:58:20 -07:00
Harshil Patel
51d492487e stdlib: style fix rv_type to _rvType in isa.hh and isa.cc
Change-Id: I68e2b1be9150e6528693e68fb73470d158838885
2023-08-02 14:06:30 -07:00
Adrià Armejach
884d62b33a arch-riscv: Make vset*vl* instructions serialize
In the current implementation, vset*vl* instructions serialize the pipeline
and are non-speculative.

Change-Id: Ibf93b60133fb3340690b126db12827e36e2c202d
2023-08-02 14:46:36 +02:00
Jason Lowe-Power
98d68a7307 arch-riscv: Improve style
Minor style fixes in vector code

Change-Id: If0de45a2dbfb5d5aaa65ed3b5d91d9bee9bcc960
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-08-02 14:46:36 +02:00
Jason Lowe-Power
af1b2ec2d5 arch-riscv: Add fatal if RVV used with o3 or minor
Since the O3 and Minor CPU models do not support RVV right now as the
implementation stalls the decode until vsetvl instructions are executed,
this change calls `fatal` if RVV is not explicitly enabled.

It is possible to override this if you explicitly enable RVV in the
config file.

Change-Id: Ia801911141bb2fb2bedcff3e139bf41ba8936085
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-08-02 14:46:36 +02:00
Adrià Armejach
ae651f4de1 configs: update riscv restore checkpoint test
Change-Id: I019fc6394a03196711ab52533ad8062b22c89daf
2023-08-02 14:46:36 +02:00
Xuan Hu
a9f9c4d6d3 arch-riscv: Add risc-v vector ext v1.0 arith insts support
TODOs:
  + vcompress.vm

Change-Id: I86eceae66e90380416fd3be2c10ad616512b5eba
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>
Co-authored-by: Jerin Joy <joy@rivosinc.com>

arch-riscv: Add LICENCE to template files

Change-Id: I825e72bffb84cce559d2e4c1fc2246c3b05a1243
2023-08-02 14:46:36 +02:00
Xuan Hu
91b1d50f59 arch-riscv: Add risc-v vector ext v1.0 mem insts support
* TODOs:
  + Vector Segment Load/Store
  + Vector Fault-only-first Load

Change-Id: I2815c76404e62babab7e9466e4ea33ea87e66e75
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>
Co-authored-by: Jerin Joy <joy@rivosinc.com>
2023-08-02 14:46:35 +02:00
Xuan Hu
e14e066fde arch-riscv: Add risc-v vector ext v1.0 vset insts support
Change-Id: I84363164ca327151101e8a1c3d8441a66338c909
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>

arch-riscv: Add a todo to fix vsetvl stall on decode

Change-Id: Iafb129648fba89009345f0c0ad3710f773379bf6
2023-08-02 14:46:35 +02:00
Xuan Hu
73892c9b47 arch-riscv: Add risc-v vector regs and configs
This commit adds regs and configs for the vector extension.

* Add 32 vector arch regs as spec defined and 8 internal regs for
  uop-based vector implementation.
* Add default vector configs (VLEN = 256, ELEN = 64). These cannot
  be changed yet, since the vector implementation has only been tested
  with such configs.
* Add disassembly register names v0~v31 and vtmp0~vtmp7.
* Add CSR registers defined in RISCV Vector Spec v1.0.
* Add vector bitfields.
* Add vector operand_types and operands.

Change-Id: I7bbab1ee9e0aa804d6f15ef7b77fac22d4f7212a
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>
Co-authored-by: Jerin Joy <joy@rivosinc.com>

arch-riscv: enable rvv flags only for RV64

Change-Id: I6586e322dfd562b598f63a18964d17326c14d4cf
2023-08-02 14:46:35 +02:00
Harshil Patel
32b7ffc454 stdlib: fixed warning message
Change-Id: I04ef23529d7afc5d46fbba7558279ec08acd629a
Co-authored-by: paikunal <kunpai@ucdavis.edu>
2023-08-01 17:22:35 -07:00
leoredivo
052c870058 New function to kernel_disk_workload to allow new disk device location
Added a parameter to kernel_disk_workload which allows users to change the disk device location. The previous way of setting a disk device is maintained as the default, but a function has been added to let users override it.
2023-08-01 16:51:36 -07:00
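The default-plus-override pattern described in commit 052c870058 can be sketched roughly as follows. This is a minimal illustration only: the class name, method names, and the "/dev/vda" placeholder are assumptions made for the example and are not taken from the gem5 source.

```python
from typing import Optional


class KernelDiskWorkloadSketch:
    """Hypothetical stand-in for illustration, not the real gem5 class."""

    def __init__(self) -> None:
        # No override is set by default.
        self._disk_device: Optional[str] = None

    def get_default_disk_device(self) -> str:
        # Placeholder default; a real board would choose its own device.
        return "/dev/vda"

    def set_disk_device(self, disk_device: Optional[str]) -> None:
        # Record a user-supplied device location (None clears the override).
        self._disk_device = disk_device

    def get_disk_device(self) -> str:
        # Prefer the override, otherwise fall back to the default.
        if self._disk_device is not None:
            return self._disk_device
        return self.get_default_disk_device()
```

Keeping the default in its own method means existing callers see no change unless they explicitly set an override.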
Harshil Patel
d96df40253 stdlib: Added support for JSON via env variables.
Change-Id: I5791e6d51b3b9f68eb212a46c4cd0add23668340
Co-authored-by: Kunal Pai <kunpai@ucdavis.edu>
2023-08-01 16:22:44 -07:00
Bobby R. Bruce
dceabe5fda dev-amdgpu: Support for ROCm 5.4+ and MI200 (#141) 2023-07-31 10:24:46 -07:00
Jason Lowe-Power
4ee6dbc330 mem: Minor typo fix in packet.hh (#143)
Change-Id: I07c31b7a62d83fe3250b48141951aec3c2f280df
2023-07-31 10:01:50 -07:00
Matthew Poremba
f8490e4681 configs: Only require MMIO trace for Vega10
The MMIO trace contains register values for parts of the GPU that are
not modeled in gem5, such as registers related to the graphics core.
Since MI100 and MI200 do not have anything that is not modeled, the
MMIO trace is not needed, therefore it does not need to be used or
checked and the command line option goes away entirely for MI100/200.

Change-Id: I23839db32b1b072bd44c8c977899a99347fc9687
2023-07-30 13:17:05 -05:00
Matthew Poremba
3589a4c11f arch-vega: Implement translate further
Starting with ROCm 5.4+, MI100 and MI200 make use of the translate
further bit in the page table. This bit enables mixing 4kiB and 2MiB
pages and is functionally equivalent to mixing page sizes using the
PDE.P bit for which gem5 currently has support.

With PDE.P bit set, we stop walking and the page size is equal to the
level in the page table we stopped at. For example, stopping at level
2 would be a 1GiB page, stopping at level 3 would be a 2MiB page.
This assumes most pages are 4kiB.

When the F bit is used, it is assumed most pages are 2MiB and we will
stop walking at the 3rd level of the page table unless the F bit is set.
When the F bit is set, the 2nd level PDE contains a block fragment size
representing the page size of the next PDE in the form of 2^(12+size).
If the next page has the F bit set we continue walking to the 4th level.
The block fragment size is hardcoded to 9 in the driver therefore we
assert that the block fragment size must be 0 or 9.

This enables MI200 with ROCm 5.4+ in gem5. This functionality was
determined by examining the driver source code in Linux; there is no
public documentation about this feature or why the change was made in or
around ROCm 5.4.

Change-Id: I603c0208cd9e821f7ad6eeb1d94ae15eaa146fb9
2023-07-30 13:17:05 -05:00
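The page-size formula from commit 3589a4c11f, 2^(12+size), can be checked with a short calculation. The sketch below is not gem5 code; the function name is hypothetical, and the restriction to 0 or 9 simply mirrors the assertion described in the commit message.

```python
PAGE_SHIFT = 12  # log2 of the base 4 KiB page


def fragment_page_size(block_fragment_size: int) -> int:
    """Return the page size in bytes implied by a block fragment size.

    Hypothetical helper: the commit message states the driver hardcodes
    the fragment size to 9, so only 0 or 9 is accepted here.
    """
    if block_fragment_size not in (0, 9):
        raise ValueError("block fragment size must be 0 or 9")
    return 1 << (PAGE_SHIFT + block_fragment_size)


print(fragment_page_size(0))  # 4096 bytes (4 KiB)
print(fragment_page_size(9))  # 2097152 bytes (2 MiB)
```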
Matthew Poremba
3b35e73eb8 dev-amdgpu: Implement SDMA constant fill
This SDMA packet is much more common starting around ROCm 5.4.
Previously this was mostly used to clear page tables after an
application ended and was therefore left unimplemented. It is
now used for basic operation like device memsets.

This patch implements constant fill as it is now necessary.

Change-Id: I9b2cf076ec17f5ed07c20bb820e7db0c082bbfbc
2023-07-30 13:17:05 -05:00
Matthew Poremba
618b2a60de arch-vega, dev-amdgpu: Fix for memory leaks (#129)
When using the new operator, delete should be called
on any allocated memory after its use is complete.

Change-Id: Id5fcfb264b6ddc252c0a9dcafc2d3b020f7b5019
2023-07-30 10:48:17 -07:00
Matthew Poremba
b35c2ba8c5 arch-vega: Fix vop2Helper scalar support (#142)
A previous change added a vop2Helper to remove 100s of lines of common
code from VOP2 instructions related to processing SDWA and DPP support.
That change inadvertently changed the type of operand source 0 from
const to non-const. The vector container operator[] does not allow
reading a scalar value such as a constant, a dword literal, etc. The
error shows up in the form of: assert(!scalar) in operand.hh.

Since the SDWA and DPP cases need to modify the source vector and
non-SDWA/DPP cases might require const, we make a non-const copy of the
const source 0 vector and place it in a temporary non-const vector. This
non-const vector is passed to the lambda function implementation of the
instruction. This prevents needing a const and non-const version of the
lambda and avoids needing to propagate the template parameters through
the various SDWA/DPP helper methods which seems like it will not work
anyways as they need to modify the vector.

As a result of this, as more VOP2 instructions are implemented using
this helper, they will need to specify the const and non-const template
parameters of the vector container needed for the instruction.

Change-Id: Ia0b3c550d7de32b830040007a110f4821e3385aa
2023-07-30 10:47:36 -07:00
Ranganath (Bujji) Selagamsetty
ede4d89a83 arch-vega, dev-amdgpu: Fix for memory leaks
When using the new operator, delete should be called
on any allocated memory after its use is complete.

Change-Id: Id5fcfb264b6ddc252c0a9dcafc2d3b020f7b5019
2023-07-28 19:14:46 -05:00
Jason Lowe-Power
81cc57b828 gpu-compute: "<random>" -> "base/random.hh" in testers/gpu... (#140)
In "src/cpu/testers/gpu_ruby_test" a random number generator was used.
This was using the C++ "<random>" library. This patch changes it to the
gem5 random class (the one declared in "base/random.hh").

In addition to this, nondeterministic behavior has been removed. Via
"protocol_tester.cc" the RNG is either seeded with a seed specified by
the user, or goes with the gem5 default seed. This ensures reproducible
runs. Prior to this patch the RNG was seeded with `time(NULL)`. This
made finding faults difficult.

This, at least partially, addresses Issue #138

Change-Id: Ia8e9f7b87e91323f828e0b7f6c3906c0c5793b2c
2023-07-28 16:54:24 -07:00
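The seeding policy described in commit 81cc57b828 (a user-specified seed, else a fixed default, never the wall clock) can be illustrated with Python's standard library. This is a sketch of the idea rather than the gem5 implementation; `DEFAULT_SEED` and both function names are assumptions for the example, and the actual gem5 default seed is not stated in the log.

```python
import random
from typing import List, Optional

DEFAULT_SEED = 42  # hypothetical fixed default, not gem5's actual value


def make_rng(user_seed: Optional[int] = None) -> random.Random:
    # Use the user's seed when given, otherwise a fixed default.
    # Nothing here depends on time, so runs are repeatable.
    seed = DEFAULT_SEED if user_seed is None else user_seed
    return random.Random(seed)


def draw(rng: random.Random, count: int = 5) -> List[int]:
    # Draw a short sequence so two generators can be compared.
    return [rng.randrange(100) for _ in range(count)]
```

Two generators built with the same seed yield the same sequence, which is what makes a failing tester run replayable.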
Ranganath (Bujji) Selagamsetty
3f2899a7a8 mem: Minor typo fix in packet.hh
Change-Id: I07c31b7a62d83fe3250b48141951aec3c2f280df
2023-07-28 17:28:10 -05:00
Matthew Poremba
6b020ed033 arch-x86: Move CPUID values to python (#113)
arch-x86: Move CPUID values to python

CPUID values for X86 are currently hard-coded in the C++ source file.
This makes it difficult to configure the bits if needed. Move these to
python instead. This will provide a few benefits:

1. We can enable features for certain configurations, for example AVX
can be enabled when the KVM CPU is used, but otherwise should not be
enabled as gem5 does not have full AVX support.
2. We can more accurately communicate things like cache/TLB sizes based
on the actual gem5 configuration. The CPUID values can be used by
some libraries, e.g., MPI, to query system topology.
3. Enabling some bits breaks things in certain configurations and this
can be prevented by configuring in python. For example, enabling AVX
seems to currently be breaking SMP, meaning gem5 can only boot one CPU
in that configuration.
2023-07-28 14:52:13 -07:00
Bobby R. Bruce
08a3762a14 gpu-compute: Add warn for random_seed == 0 case
Addresses:
https://github.com/gem5/gem5/pull/140#pullrequestreview-1552383650

Change-Id: Ia09a2bc74f35d3d6cb066efaf9d113db6caf4557
2023-07-28 12:55:18 -07:00
Bobby R. Bruce
48ac1ea38d gpu-compute: "<random>" -> "base/random.hh" in testers/gpu...
In "src/cpu/testers/gpu_ruby_test" a random number generator was used.
This was using the C++ "<random>" library. This patch changes it to the
gem5 random class (the one declared in "base/random.hh").

In addition to this, nondeterministic behavior has been removed. Via
"protocol_tester.cc" the RNG is either seeded with a seed specified by
the user, or goes with the gem5 default seed. This ensures reproducible
runs. Prior to this patch the RNG was seeded with `time(NULL)`. This
made finding faults difficult.

Change-Id: Ia8e9f7b87e91323f828e0b7f6c3906c0c5793b2c
2023-07-28 12:55:03 -07:00
Matthew Poremba
c722b0c73d arch-vega: Fix vop2Helper scalar support
A previous change added a vop2Helper to remove 100s of lines of common
code from VOP2 instructions related to processing SDWA and DPP support.
That change inadvertently changed the type of operand source 0 from
const to non-const. The vector container operator[] does not allow
reading a scalar value such as a constant, a dword literal, etc. The
error shows up in the form of: assert(!scalar) in operand.hh.

Since the SDWA and DPP cases need to modify the source vector and
non-SDWA/DPP cases might require const, we make a non-const copy of the
const source 0 vector and place it in a temporary non-const vector. This
non-const vector is passed to the lambda function implementation of the
instruction. This prevents needing a const and non-const version of the
lambda and avoids needing to propagate the template parameters through
the various SDWA/DPP helper methods which seems like it will not work
anyways as they need to modify the vector.

As a result of this, as more VOP2 instructions are implemented using
this helper, they will need to specify the const and non-const template
parameters of the vector container needed for the instruction.

Change-Id: Ia0b3c550d7de32b830040007a110f4821e3385aa
2023-07-28 13:47:55 -05:00
Bobby R. Bruce
31230025e9 misc: Sync CONTRIBUTING.md with website (#130)
This change syncs the repo's contributing documentation with that of the
website's contributing documentation:
https://www.gem5.org/contributing

From now on we'll attempt to keep the repo's CONTRIBUTING.md
documentation in sync with that on the website.

Change-Id: I2c91e6dd5cd7a9b642377878b007d7da3f0ee2ad
2023-07-28 09:42:28 -07:00
Matthew Poremba
9acfc5a751 configs: Enable AVX2 for GPUFS+KVM
AVX is a requirement for some ROCm libraries, such as rocBLAS, which are
themselves requirements for libraries higher up the stack like PyTorch.
This patch sets the necessary CPUID bits in the GPUFS config to enable
AVX, AVX2, and various SSE features so that applications using these
libraries do not cause an illegal instruction trap.

Change-Id: Id22f543fb2a06b268271725a54075ee6a9a1f041
2023-07-28 11:34:04 -05:00
Matthew Poremba
7c3c2b05f3 arch-x86: Add extended state CPUID function
The extended state CPUID function is used to set the values of the XCR0
register as well as specify the size of the storage used for context
switching x87 and AVX+ state. This function is iterative and therefore
requires (1) marking it as such in the hasSignificantIndex function and
(2) setting multiple sets of 4-tuples for the default CPUID values where
the last 4-tuple ends with all zeros.

Change-Id: Ib6a43925afb1cae75f61d8acff52a3cc26ce17c8
2023-07-28 11:34:04 -05:00
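One way to picture the indexed, iterative CPUID function from commits 3946f7ba2c and 7c3c2b05f3 is a table keyed by function number whose entries are selected by the ECX index and end with an all-zero 4-tuple. The sketch below is an illustrative model only: the table layout, the function names, and every register value are placeholders, not gem5's defaults. Only the leaf number 0x0D (extended state enumeration) is a documented x86 fact.

```python
from typing import Dict, List, Tuple

Regs = Tuple[int, int, int, int]  # (EAX, EBX, ECX, EDX)
ZERO: Regs = (0, 0, 0, 0)

# Hypothetical table; the register values are placeholders.
CPUID_TABLE: Dict[int, List[Regs]] = {
    0x0D: [
        (0x7, 0x340, 0x340, 0x0),  # index 0
        (0x1, 0x0, 0x0, 0x0),      # index 1
        ZERO,                      # all-zero terminator
    ],
}


def lookup(function: int, index: int) -> Regs:
    # The result depends on both the function and the ECX index.
    entries = CPUID_TABLE.get(function)
    if entries is None or index >= len(entries):
        return ZERO
    return entries[index]


def walk(function: int) -> List[Regs]:
    # Iterate index = 0, 1, ... until the all-zero tuple is reached.
    found: List[Regs] = []
    index = 0
    while True:
        regs = lookup(function, index)
        if regs == ZERO:
            return found
        found.append(regs)
        index += 1
```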
Matthew Poremba
3584c3126c arch-x86: Expose CR4.osxsave bit
Related to the recent changes with moving CPUID values to python, this
value is needed to enable AVX and needs a way to be exposed to python as
well in order to set the bit and the corresponding CPUID values at the
same time.

Change-Id: I3cadb0fe61ff4ebf6de903018a8d8a411bfdb4e0
2023-07-28 11:34:04 -05:00
Matthew Poremba
3946f7ba2c arch-x86: Support CPUID functions with indexes
Various CPUID functions will return different values depending on the
value of ECX when executing the CPUID instruction. Add support for this
in the X86 KVM CPU. A subsequent patch will add a CPUID function which
requires iterating through multiple ECX values.

Change-Id: Ib44a52be52ea632d5e2cee3fb2ca390b60a7202a
2023-07-28 11:34:04 -05:00
Matthew Poremba
63d98018ea arch-x86: Move CPUID values to python
CPUID values for X86 are currently hard-coded in the C++ source file.
This makes it difficult to configure the bits if needed. Move these to
python instead. This will provide a few benefits:

1. We can enable features for certain configurations, for example AVX
can be enabled when the KVM CPU is used, but otherwise should not be
enabled as gem5 does not have full AVX support.
2. We can more accurately communicate things like cache/TLB sizes based
on the actual gem5 configuration. The CPUID values can be used by
some libraries, e.g., MPI, to query system topology.
3. Enabling some bits breaks things in certain configurations and this
can be prevented by configuring in python. For example, enabling AVX
seems to currently be breaking SMP, meaning gem5 can only boot one CPU
in that configuration.

Change-Id: Ib3866f39c86d61374b9451e60b119a3155575884
2023-07-28 11:34:04 -05:00
Bobby R. Bruce
dcf3c4ba98 misc: Sync CONTRIBUTING.md with website
This change syncs the repo's contributing documentation with that of the
website's contributing documentation:
https://www.gem5.org/contributing

From now on we'll attempt to keep the repo's CONTRIBUTING.md
documentation in sync with that on the website.

Change-Id: I2c91e6dd5cd7a9b642377878b007d7da3f0ee2ad
2023-07-27 10:11:46 -07:00
Bobby R. Bruce
65b99fffc9 util: Ignore line length check for #include pragma in C/C++ files (#134)
The path in an #include pragma can be more than 79 characters long,
and its length can be outside of the user's control.
2023-07-27 09:39:18 -07:00
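The rule from commit 65b99fffc9 amounts to exempting #include lines from the 79-character limit. A minimal sketch of such a check follows; it is not the actual gem5 style checker, and the function name and regular expression are assumptions chosen for illustration.

```python
import re

MAX_LINE_LENGTH = 79  # limit quoted in the commit message

# Matches lines such as '#include "a/b.hh"' or '  # include <x>'.
INCLUDE_RE = re.compile(r"^\s*#\s*include\b")


def line_too_long(line: str) -> bool:
    """Return True if the line breaks the length limit.

    Hypothetical helper: #include lines are never reported, since the
    included path may be outside the author's control.
    """
    stripped = line.rstrip("\n")
    if INCLUDE_RE.match(stripped):
        return False
    return len(stripped) > MAX_LINE_LENGTH
```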
Bobby R. Bruce
42b65cad68 misc: Add missing dependency to daily tests (#136)
The refactoring to the daily tests was missing the dependency on the
'name-artifacts' job, which is necessary for downloading all the gem5
artifacts. This adds it in so the tests run as expected.

Change-Id: I0d71ab147395f41c881f2b24597bc07006e1f9c0
2023-07-27 09:38:23 -07:00
Bobby R. Bruce
5aa955212f learning-gem5: Add a missing override (#135) 2023-07-27 09:37:52 -07:00
Jason Lowe-Power
ea18c2f417 cpu: Set SLC bit for GPU tester (#133)
This fixes issue #131 by reverting to the old behavior of performing all
atomics at the system level. To do this the SLC bit needs to be set for
all atomic requests.

Change-Id: I63f4e449be1b02c933832d09700237f8c8026f4c
2023-07-27 07:37:52 -07:00
Melissa Jost
415a6eb9d4 misc: Add missing dependency to daily tests
The refactoring to the daily tests was missing the dependency
on the 'name-artifacts' job, which is necessary for downloading
all the gem5 artifacts.  This adds it in so the tests run as
expected.

Change-Id: I0d71ab147395f41c881f2b24597bc07006e1f9c0
2023-07-26 23:49:39 -07:00
Hoa Nguyen
f19945e9cb ext: Remove the test
Change-Id: I5c174ad388f63e7846dab5d9497ab2faa73ca6f7
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-07-26 21:29:00 -07:00
Bobby R. Bruce
5888ea68a3 misc: Split up tests in daily-tests.yaml (#105)
This splits up the gem5 library example tests by Suite UID, as right now
running them together uses the runner for a long period of time. It is
important to note that doing this means additional tests from this
directory will need to be manually added, such as the kvm tests.

Change-Id: Ib2a0aca08f9b51b60e9dd0528324372cf2d98c05
2023-07-26 21:04:18 -07:00
Hoa Nguyen
bd82e6f1a7 learning-gem5: Add a missing override
Change-Id: I9acebe6f3096b499fa2c69b6d757373431f63c71
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-07-26 20:01:37 -07:00
Hoa Nguyen
9ec7a1c14a util: Ignore line length check for #include pragma in C/C++ files
The path in an #include pragma can be more than
79 characters long.

Change-Id: Id72250c166370c7f456bd1f7d05589a49c14c33d
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-07-26 19:41:04 -07:00
Matthew Poremba
ff7e67ee93 cpu: Set SLC bit for GPU tester
This fixes #131 by reverting to the old behavior of performing all
atomics at the system level. To do this the SLC bit needs to be set for
all atomic requests.

Change-Id: I63f4e449be1b02c933832d09700237f8c8026f4c
2023-07-26 21:18:26 -05:00
Jason Lowe-Power
21b4ad609f mem: Make functional request a response when satisfied by queue (#124)
In the memory controller, MemCtrl::MemoryPort::recvFunctional, when the
functional request is satisfied by the ctrl-response queue, correctly
make the packet a response.

This change mirrors AbstractMemory::functionalAccess, which uses
Packet::makeResponse() after satisfying the request.

Note:
bool trySatisfyFunctional(..) functions return true or false based on
whether the request was satisfied.
void recvFunctional(..) functions modify the packet to indicate
successful request satisfaction.
2023-07-26 17:03:16 -07:00
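The division of labour noted in commit 21b4ad609f (a bool-returning `trySatisfyFunctional` and a void `recvFunctional` that marks the packet as a response) can be modelled with a toy example. The `Packet` class and the function bodies below are hypothetical simplifications, not gem5's C++ types.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    """Toy packet for illustration, not gem5's Packet class."""
    addr: int
    data: Optional[int] = None
    is_response: bool = False

    def make_response(self) -> None:
        self.is_response = True


def try_satisfy_functional(queue: List[Packet], pkt: Packet) -> bool:
    # Report whether a queued packet can supply the requested data.
    for queued in queue:
        if queued.addr == pkt.addr:
            pkt.data = queued.data
            return True
    return False


def recv_functional(queue: List[Packet], pkt: Packet) -> None:
    # Returns nothing, so success is signalled on the packet itself
    # by turning it into a response.
    if try_satisfy_functional(queue, pkt):
        pkt.make_response()
```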
Bobby R. Bruce
6a503d52cd misc: Updating TESTING.md (#121)
This updates the TESTING.md to reflect the current state of the tests in
the gem5 repository and how they interact with the GitHub Actions
infrastructure.
2023-07-26 16:24:56 -07:00
Bobby R. Bruce
c056ef07a5 tests: Deprecate Gerrit/Jenkins testing scripts (#126)
These testing scripts are no longer used since moving to GitHub. The
Nightly (now referred to as "Daily" tests), the Weekly Tests, Compiler
Tests and the CI (Kokoro, pre-commit) tests are run via the GitHub
Actions infrastructure. Their setup is described via Workflow files in
".github/workflows". To run tests locally please consult the
"TESTING.md" file.

These scripts may still be useful to reference and are therefore being
moved into a deprecated state.

Change-Id: Ie75c2f4f1179eb73d0f45ba0b259e8d79aa02ace
2023-07-26 16:24:41 -07:00
Melissa Jost
7371dd51b9 misc: Move gem5 library example tests into a matrix
This moves the gem5 library example tests into a separate matrix,
so they can run on separate runners

Change-Id: Ie9f51b5bae9e7e424d1c98b545b4cf92b481a2fb
2023-07-26 16:19:12 -07:00
Melissa Jost
62df5ae35f misc: Refactor daily-tests.yaml
This changes the daily tests to use a matrix in order to run
tests. It also forces the cleaning step to run
regardless of success or failure.  With this refactoring, now
all builds of gem5 must finish before any tests run, and all
tests download all artifacts from all the build runs.

Change-Id: I16e1bc9acaf619feb85fba53eb6129e7df3fe409
2023-07-26 16:19:11 -07:00
Bobby R. Bruce
ce8e1b6aaa tests: Deprecate Gerrit/Jenkins testing scripts
These testing scripts are no longer used since moving to GitHub. The
Nightly (now referred to as "Daily" tests), the Weekly Tests, Compiler
Tests and the CI (Kokoro, pre-commit) tests are run via the GitHub
Actions infrastructure. Their setup is described via Workflow files
in ".github/workflows". To run tests locally please consult the
"TESTING.md" file.

These scripts may still be useful to reference and are therefore being
moved into a deprecated state.

Change-Id: Ie75c2f4f1179eb73d0f45ba0b259e8d79aa02ace
2023-07-26 10:54:57 -07:00
Atri Bhattacharyya
256729a40c mem: Make functional request a response when satisfied by queue
In the memory controller, MemCtrl::MemoryPort::recvFunctional,
when the functional request is satisfied by the ctrl-response queue,
correctly make the packet a response.

This change mirrors AbstractMemory::functionalAccess, which uses
Packet::makeResponse() after satisfying the request.

Change-Id: I47917062d3270915a97eed2c9fade66ba17019eb
2023-07-26 17:34:24 +02:00
Bobby R. Bruce
949119b5bb misc: Add Pyunit Test info to TESTING.md
Change-Id: Ibff77963653600ac7c9d706edca882d95e5c47df
2023-07-25 20:42:02 -07:00
Bobby R. Bruce
56a9bec602 misc: Update GitHub Actions text in TESTING.md
This change simplifies the explanation of how GitHub actions works.

Change-Id: Ia1540008463b8584f172c40ca7b4826cbbf95eb7
2023-07-25 20:22:00 -07:00
Bobby R. Bruce
cb98715514 misc: Add 'testing resources' sec to TESTING.md
Change-Id: Ie8a9c9200461d4f9e272dea75de1755b1b18aceb
2023-07-25 20:04:43 -07:00
Bobby R. Bruce
2846df946a misc: Remove test binary sections from TESTING.md
These sections are very out-of-date and confusing.

Change-Id: I61aae0686f38671e46412e27ea516a5e06f4e6f2
2023-07-25 19:50:17 -07:00
Bobby R. Bruce
00b2846109 misc: Update TESTING.md for subset selection
This change:

1. Removes the 'Specifying a subset of tests to run' section. This
   section is no longer useful since tests are no longer divided up so
   neatly by tags as they once were.
2. Adds a section outlining the 'quick', 'long' and 'very-long' tests
   and how they may be selected and run.

Change-Id: I61370dd80cc925a15d1a22755faa7d62e810862f
2023-07-25 19:43:24 -07:00
Bobby R. Bruce
7601fcfba6 cpu-minor: Check pc valid before printing (#107)
In https://gem5-review.googlesource.com/c/public/gem5/+/52047 inst.pc
was changed from an object to a pointer. It is possible that this
pointer is null (e.g., if there is an interrupt and there is a bubble).
Make sure to check that it's not null before printing.

I believe that other places this pointer is dereferenced without an
explicit null check are safe, but I'm not certain.

Should fix #97 

Change-Id: Idbe246cfdb62d4d75416d41b451fb3c076233bbc
2023-07-25 17:14:38 -07:00
Melissa Jost
556c9154dd base: Add maybe_unused to findLsbSetFallback (#109)
When compiling with clang-14 I received the following error:

```
src/base/bitfield.hh:328:1: error: function 'findLsbSetFallback' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
```

This function was introduced in PR #76.
This fixes this compiler warning/error by using `[[maybe_unused]]`.

Change-Id: I0b99eab0a9e42ee1687e7a0594a5a7bf9588b422
2023-07-25 10:41:59 -07:00
Giacomo Travaglini
189d514f2f arch-arm: Hook TLBIOS instructions to the TlbiShareable obj (#114)
FEAT_TLBIOS was introduced by a recent patch [1], which however did not
include the outer shareable case in the Msr disambiguation switch. As a
result, the TLBIOS instructions were decoded as normal MSR instructions,
with no effect whatsoever on the TLBs.

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/70567

Change-Id: I41665a4634fbe0ee8cc30dbc5d88d63103082ae9
2023-07-25 11:57:50 +02:00
Bobby R. Bruce
78849ac4fb Merge branch 'develop' into fix-bitfield-unused 2023-07-24 11:05:05 -07:00
Bobby R. Bruce
9f56bbd7dd misc: Update ci-tests.yaml to always clean runner (#111)
Adds line to make sure the runners are always cleaned whether or not the
previous tests pass

Change-Id: I980c0232305999fb3548464ea1b6eaeca7bcdbd6
2023-07-24 11:02:18 -07:00
Giacomo Travaglini
7dba30209a arch-arm: Hook TLBIOS instructions to the TlbiShareable obj
FEAT_TLBIOS was introduced by a recent patch [1], which
however did not include the outer shareable case in the
Msr disambiguation switch. As a result, the TLBIOS instructions
were decoded as normal MSR instructions, with no effect whatsoever
on the TLBs.

[1]: https://gem5-review.googlesource.com/c/public/gem5/+/70567

Change-Id: I41665a4634fbe0ee8cc30dbc5d88d63103082ae9
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-07-24 09:05:01 +01:00
Daniel Kouchekinia
984499329d mem-ruby,configs: Add GLC Atomic Latency VIPER Parameter (#110)
Added a GLC atomic latency parameter (glc-atomic-latency) used when
enqueueing response messages regarding atomics directly performed in
the TCC. This latency is added in addition to the L2 response latency
(TCC_latency). This represents the latency of performing an atomic
within the L2.

With this change, the TCC response queue will receive enqueues with
varying latencies as GLC atomic responses will have this added GLC
atomic latency while data responses will not. To accommodate this in
light of the queue having strict FIFO ordering (which would be violated
here), this change also adds an optional parameter bypassStrictFIFO to
the SLICC enqueue function which allows overriding strict FIFO
requirements for individual messages on a case-by-case basis. This
parameter is only being used in the TCC's atomic response enqueue call.

Change-Id: Iabd52cbd2c0cc385c1fb3fe7bcd0cc64bdb40aac
2023-07-23 15:57:06 -05:00
Melissa Jost
6a360bd1bb misc: Update ci-tests.yaml to always clean runner
Adds line to make sure the runners are always cleaned whether
or not the previous tests pass

Change-Id: I980c0232305999fb3548464ea1b6eaeca7bcdbd6
2023-07-21 15:53:12 -07:00
Jason Lowe-Power
0dd4334622 misc: Add workflow to close stale issues (#96)
Create a new workflow file that will hold jobs that are for managing the
repository, issues, PRs, etc. This changeset then adds a job to close
issues that have been open for 30 days without a response after someone
marks the issue as "needs details."

Change-Id: I23b9b6aa5fa67f205e116c88d5449cb69f53b6f9

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-07-21 12:36:24 -07:00
Gabriel Busnot
75b6fa5ad1 base: Ostream helpers (iterable, tuple, pair, enum, pointers, optional) (#77)
* base: Enable stl_helpers::operator<< in _formatString

The string format (%s) eventually relies on bare operator<< to
display any type T. This gives the opportunity to use the helpers in
stl_helpers. This patch enables printing enums, pairs, tuples,
vectors, maps and others in a PRINTF debug macro without any extra
manual operation.

Change-Id: I8ac85133ebadcb95354598c1cfe687d8fffb89e2

* base: Add Printer util class to force use of operator<< helpers

Wrapping any value in a Printer instance before using operator<< will
force the use of stl_helpers::operator<<.

Change-Id: I7b505194eeabc3e0721effd9b5ce98f9e151b807

* base: Fix typo in ostream_helpers.hh

Change-Id: I283a5414f3add4f18649b77153dcbcc8661bc81e

* base: Disambiguate null optional representation in ostream helper

Change-Id: I5b093555688566cc405248d3a448a8f3efa67888

* base: Add unit test for std::optional ostream helper

Change-Id: I6fb9ced5e6461de5685638a162b5534e10710e20

* base: Ostream helpers Printer unit test

Change-Id: I11db89e85fd40c12bceecb41cadee78b8e871d7b

* base: Unit test for ostream helpers for pointers and smart ptr

Change-Id: Ifa87e8b69fdd9a4869250ab40311f352e8f54ed9

* base: Coding style fix in ostream_helpers.test.cc

Change-Id: I095c7048fad35e63f979aa601bfc8cde65c9077b

* base: Test shared_ptr in ostream_helpers.test.cc

Change-Id: I553df0614f1dd6eef2061c4dc1794af8c543b78f

---------

Co-authored-by: Gabriel Busnot <gabriel.busnot@arteris.com>
2023-07-21 11:11:09 -07:00
Jason Lowe-Power
27cfe4cc1b Merge branch 'develop' into fix-invalid-pc-on-bubble 2023-07-21 08:10:03 -07:00
Bobby R. Bruce
01623fac68 stdlib,configs,tests: Remove deprecated Resource classes usage (#102)
* stdlib,configs,tests: Remove `Resource` class use

This class is deprecated, but was still used in various example
configuration scripts and tests. This patch replaces it with the
`obtain_resource` function.

Change-Id: I0c89bf17783ccaaafc18072aaeefb5d1e207bc55

* configs: Remove `CustomDiskImageResource` use

The class is deprecated but was still used in the SPEC example scripts.
This patch replaces it with the `DiskImageResource` class.

Change-Id: Ie0697fe59a3d737b05eb45ff3bc964f42b0387e0

* configs,tests: Remove `CustomResource` use

This class is deprecated but was still used in example scripts and
mentioned, incorrectly, in comments in the pyunit tests. This patch
removes these.

Change-Id: Icb6d02f47a5b72cd58551e5dcd59cc72d6a91a01

* stdlib: Remove '\' in Workload docstring example

This example shows how to use the Workload. The backslash is not correct Python and would fail if used in this way.

Co-authored-by: Jason Lowe-Power <jason@lowepower.com>

---------

Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
2023-07-20 23:08:39 -07:00
Bobby R. Bruce
573573b5ba base: Add maybe_unused to findLsbSetFallback
When compiling with clang-14 I received the following error:

```
src/base/bitfield.hh:328:1: error: function 'findLsbSetFallback' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
```

This function was introduced in PR #76.
This fixes this compiler warning/error by using `[[maybe_unused]]`.

Change-Id: I0b99eab0a9e42ee1687e7a0594a5a7bf9588b422
2023-07-20 15:04:06 -07:00
Adwaith R Krishna
427b4d596e mem-garnet: Fix packet_id val in flit (#72)
Change-Id: I163b5a32972783bf2e99f3383b9f86776577b727

Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
2023-07-20 13:56:31 -07:00
Jason Lowe-Power
29832849f7 cpu-minor: Check pc valid before printing
In https://gem5-review.googlesource.com/c/public/gem5/+/52047 inst.pc
was changed from an object to a pointer. It is possible that this
pointer is null (e.g., if there is an interrupt and there is a bubble).
Make sure to check that it's not null before printing.

I believe that other places this pointer is dereferenced without an
explicit null check are safe, but I'm not certain.

Change-Id: Idbe246cfdb62d4d75416d41b451fb3c076233bbc
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-07-20 13:23:16 -07:00
Kunal Pai
3c6563d6f7 stdlib: Change resource compatibility warning (#91)
* stdlib: Change resource compatibility warning

If the gem5 version is "develop", the warning will not
be thrown.

Change-Id: Id2be1c4323c6ca06c5503c2885c1608f8d119420

* stdlib: Change resource compatibility warning

If the gem5 version is "develop", the warning will not
be thrown.

Change-Id: Id2be1c4323c6ca06c5503c2885c1608f8d119420

* tests: Edit obtain_resources warning test

Since we are editing the warning message for
the develop branch, the test removes the
warning message as well.

Change-Id: I90882340188360bb3435344cdc14b324412c6c0e

---------

Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
2023-07-20 12:00:49 -07:00
Melissa Jost
2a6d39aa88 misc: Split up tests in daily-tests.yaml
This splits up the gem5 library example tests by Suite UID, as
right now running them together uses the runner for a long
period of time.  It is important to note that doing this means
additional tests from this directory will need to be
manually added, such as the kvm tests.

Change-Id: Ib2a0aca08f9b51b60e9dd0528324372cf2d98c05
2023-07-20 10:43:32 -07:00
Hoa Nguyen
f7da973f34 cpu-kvm: Make using perf when using KVM CPU optional (#95)
* cpu-kvm: Add a variable signifying whether we are using perf

Change-Id: Iaa081e364f85c863f781723b5524d267724ed0e4
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>

* cpu-kvm: Making it clear the functionalities are specific to KVM

Change-Id: I982426f294d90655227dc15337bf73c42a260ded
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>

* cpu-kvm: Make perf optional

Change-Id: I8973c2a96575383976cea7ca3fda478f83e95c3f
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>

* configs: Add an example config of using KVM without perf

Change-Id: Ic69fa7dac4f1a2c8fe23712b0fa77b5b22c5f2df
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>

* Apply suggestions from code review

Co-authored-by: Jason Lowe-Power <jason@lowepower.com>

* misc: Add an example to the panic

Change-Id: Ic1fdfb955e5d8b9ad1d4f0a2bf30fa8050deba70
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>

* misc: Add warning of not using perf when using KVM CPU

Change-Id: I96c0832fb48c63a79773665ca6228da778ef0497
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>

* misc: Fix stuff

Change-Id: Ib407ae7407955b695f0e0f2718324f41bb0d768f
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>

* misc: style fix

Change-Id: I7275942e43f46140fdd52c975f76abb3c81b8b0a
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>

---------

Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
2023-07-20 10:34:44 -07:00
Daniel Kouchekinia
1705853b12 mem-ruby: Added support for non-system-scope atomics in VIPER (#101)
Added support for performing non-SLC-set atomics in the TCC.
Previously, all atomics were being passed on by the TCC to the
directory. With this change, atomics will only be passed on if the SLC
bit is set or if the line isn't present or available in the TCC.

If a non-SLC atomic is passed on to the directory because it is not
present in the TCC, the atomic will be performed on the return path on
the Data event. To accommodate the directory not performing the atomic
in this case, this change also passes the SLC bit on to the directory.

The previously-named "Atomic" action has been renamed to
"AtomicPassOn", with the new "Atomic" corresponding to an atomic
performed directly in the TCC.

Change-Id: Ibf92f71ddceb38bd1b0da70b0a786cc4c3cf2669
2023-07-20 11:48:08 -05:00
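The routing rule described in 1705853b12 can be summarised as a small predicate. This is only an illustrative sketch in Python, not the SLICC code of the VIPER protocol; the function name and boolean parameters are hypothetical, while the two action names come from the commit message.

```python
def tcc_atomic_action(slc_bit_set: bool, line_available_in_tcc: bool) -> str:
    """Pick the action for an atomic arriving at the TCC (hypothetical helper).

    Per the commit message, the atomic is passed on to the directory only if
    the SLC bit is set or the line is not present/available in the TCC.
    """
    if slc_bit_set or not line_available_in_tcc:
        return "AtomicPassOn"  # forwarded to the directory
    return "Atomic"            # performed directly in the TCC


for slc in (False, True):
    for available in (False, True):
        print(slc, available, tcc_atomic_action(slc, available))
```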
rogerchang23424
566308dad9 scons: Add extra parent dir to CPPPATH if --no-duplicate-sources (#104)
In previous versions of gem5, the source files of extra directories
were copied to the build directory for compilation. This was not a
problem when an extra directory included *.h (*.hh) files from other
extra directories.

Since the change at
https://gem5-review.googlesource.com/c/public/gem5/+/68758, the
source files of extra directories are no longer copied to the build
directory unless the user compiles gem5 with "--duplicate-sources".
This causes a compilation error if the code includes a header file
from another repository.

For example, assume we want to compile gem5 with the "foo/bar1" and
"foo/bar2" repositories, which are independent of gem5. They contain
the header files "foo/bar1/a.h", "foo/bar1/b.h" and "foo/bar2/d.h". If
"foo/bar1/sample.c" needs "foo/bar2/d.h", it usually declares
"#include bar2/d.h". This works if --duplicate-sources is specified,
because the directories are copied to <builddir>/bar1 and
<builddir>/bar2 respectively, and -I<builddir> is specified by default
whether or not sources are duplicated. Without --duplicate-sources,
the build fails with a compilation error.

This change makes that situation work without --duplicate-sources by
adding the parent directories of the extra directories to CPPPATH,
ahead of the extra directories themselves. If --duplicate-sources is
specified, the parent directories are not added, to avoid repeated
include paths.

Change-Id: I461e1dcb8266d785f1f38eeff77f9d515d47c03d
2023-07-20 09:35:45 -07:00
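The include-path ordering described in 566308dad9 can be sketched as follows. This is a hedged illustration, not the actual SConstruct logic: `include_paths` is a hypothetical helper, and returning only the extra directories in the duplicate-sources case is an assumption made to keep the example small.

```python
import posixpath


def include_paths(extra_dirs, duplicate_sources):
    """Hypothetical helper: order include paths for extra directories.

    Without duplicated sources, each extra directory's parent is placed
    ahead of the extra directories so that '#include bar2/d.h' resolves.
    """
    extras = [posixpath.normpath(d) for d in extra_dirs]
    if duplicate_sources:
        # Assumption: no parent directories are added in this mode.
        return extras
    parents = []
    for d in extras:
        parent = posixpath.dirname(d)
        if parent and parent not in parents:
            parents.append(parent)  # keep each parent once, in first-seen order
    return parents + extras


print(include_paths(["foo/bar1", "foo/bar2"], duplicate_sources=False))
```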
rogerchang23424
5d2edca1e3 arch-riscv: Set default check alignment True (#98)
Raise a misaligned trap by default if the effective address is not aligned.

Change-Id: I634aa7ddbf5282fc583316fc77ab1e37bfe415e3
2023-07-19 11:14:27 -07:00
Gabriel Busnot
4c4419296b base: Unit tests miscellaneous patches (#73)
* base: Fix Memoizer constructor parameter type

* base: switch from new to mk_unq in amo.test.cc

* base: Fix memory management in IniFile

* base: Fix memory management in Trie

* sim: Fix out-of-bounds access in CheckpointIn::setDir

Change-Id: Iac50bbf01b6d7acc458c786da8ac371582a4ce09

---------

Co-authored-by: Gabriel Busnot <gabriel.busnot@arteris.com>
2023-07-19 08:45:29 -07:00
Daniel Kouchekinia
4d9bd7dedf base: Added missing backup dummy __has_builtin definition (#99)
Added dummy definition of __has_builtin to bitfield.hh's hasBuiltinCtz,
which is already being done in popCount.

Change-Id: I4a1760a142209462bb807c6df4bc868284b6f5f3
2023-07-19 02:07:39 -07:00
Lingkang
573523c07a python: fix fatal in main.py (github #78) (#93)
* python: fix fatal in main.py (github #78)

Issue-On: https://github.com/gem5/gem5/issues/78
Change-Id: I80855b05168a067ddd7706ad9fd7e71e75bfd3b1

---------

Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
2023-07-18 14:00:57 -07:00
Bobby R. Bruce
65fc9a6bfa misc: Drop older compilers and Ubuntu 18.04 (#80)
* tests,util-docker,misc: Drop compiler support for GCC 7

Change-Id: I8b17b77c92b88e78a8cb6d38cd5f045dbe80a643

* tests,util-docker,misc: Drop compiler support for clang 6.0

Change-Id: Ie3b6bfe889ad1d119cee0c9ffb04c5996517922e

* util-docker,tests,misc: Remove Ubuntu 18.04 support

18.04 is no longer supported. This patch removes specific 18.04 compiler
tests and removes our 18.04 dockerfiles. Images will no longer be
produced for specific 18.04 tasks.

Compiler images for GCC and Clang, which used 18.04 have been updated to
use 20.04.

Change-Id: I6338ab47af3287a25a557dbbeaeebcfccfdec9fc
2023-07-18 10:28:09 -07:00
Bobby R. Bruce
8450b93f8e misc: Add bug report template (#85)
* misc: Add Bug Report Issue Template

Change-Id: I3acf7a1991f889462c0f2604d251dead563846c2

* misc: Cleanup bug_report.md

* Inform the reader to use codeblocks where appropriate.
* Inform the reader they should include the Python configuration
  script and state parameters passed.

Change-Id: Ib0b8e9a6d3ed199c435917acfdf958073d4faa04
2023-07-18 10:27:48 -07:00
Melissa Jost
424350f446 misc: Update CI test workflow (#88)
* misc: Update CI test workflow

This updates our CI tests to clean the runners after every
workflow, to make sure no hanging files cause problems for
future tests

Change-Id: Iff6a702bbc2e86a31e4c18ef9764a3cfd3af2f7d

* misc: Update scheduled workflows to clean runners

This updates our scheduled tests to clean up any remaining
files after running tests to avoid anything hanging for
future runs.

Change-Id: Icfdd5a0559337ad0e62d108a47f4e5a12e0db677

* misc: Fix spacing in workflow files

Some commands were incorrectly spaced

Change-Id: Id340dc77bfb5c5d579b5f1e5b3ddeabea4a35ea8
2023-07-18 10:27:32 -07:00
Gabriel Busnot
6fb72d84e1 base: Find lsb set generalization and optimization (#76)
* base: Generalize findLsbSet to std::bitset<N>

* base: Split builtin and fallback implementations of findLsbSet

* base: Add more unit testing for findLsbSet

Change-Id: Id75dfb7d306c9a8228fa893798b1b867137465a9

---------

Co-authored-by: Gabriel Busnot <gabriel.busnot@arteris.com>
2023-07-17 15:32:04 -07:00
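The split between a builtin-backed and a fallback implementation mentioned in 6fb72d84e1 can be illustrated with two equivalent routines. This is a sketch in Python rather than the C++ in bitfield.hh; the function names are hypothetical, and raising on a zero input is an assumption of this example, not a statement about gem5's behaviour.

```python
def find_lsb_set_fallback(value: int) -> int:
    """Portable loop: shift right until the lowest set bit reaches bit 0."""
    if value <= 0:
        raise ValueError("value must be a positive integer")
    index = 0
    while value & 1 == 0:
        value >>= 1
        index += 1
    return index


def find_lsb_set_fast(value: int) -> int:
    """Bit trick standing in for a compiler builtin such as count-trailing-zeros."""
    if value <= 0:
        raise ValueError("value must be a positive integer")
    # value & -value isolates the lowest set bit; bit_length gives its position + 1.
    return (value & -value).bit_length() - 1


print(find_lsb_set_fallback(0b101000), find_lsb_set_fast(0b101000))
```

Checking both versions against each other over a range of inputs is the kind of unit test the commit alludes to.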
Bobby R. Bruce
f80015ea18 misc: Update README/README.md (#71)
* misc: Update README to README.md

This change converts the text-based README to markdown. This works
better with modern source-control systems, most notably, GitHub.

The README.md has been broken down into sections to better organize the
document. It now includes expanded information on reporting bugs and
requesting features.

Due to renaming 'README' to 'README.md', this code was generating the
following for "info.py":

```
README.md = "<FILE CONTENTS HERE>"
```

This is invalid, as '.' is used to access member variables/methods in
Python. To fix this, "infopy.py" now replaces "." with "_". As such, the
generated code in "info.py" is now:

```
README_MD = "<FILE CONTENTS HERE>"
```

This puts GitHub Discussions and GitHub Issues towards the top of the
list. This is to incentivize their usage.

Change-Id: I18018ba23493f43861544497f23ec59f1e8debe1

---------

Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
2023-07-17 15:30:35 -07:00
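The renaming described in f80015ea18 amounts to turning a file name into a valid Python identifier. The sketch below is an assumption-laden illustration, not the build script itself: `variable_name_for` is a hypothetical helper, and the upper-casing is assumed only because the commit message shows `README_MD` as the result.

```python
def variable_name_for(filename: str) -> str:
    """Hypothetical helper: map a file name to a Python variable name."""
    # '.' would be parsed as attribute access, so it is replaced with '_'.
    # Upper-casing is an assumption made to match the 'README_MD' example.
    return filename.replace(".", "_").upper()


name = variable_name_for("README.md")
print(f'{name} = "<FILE CONTENTS HERE>"')
assert name.isidentifier()
```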
wmin0
162f2e2dba scons: Use pkgconfig to get correct Protobuf dependency (#68)
Latest protobuf library depends on abseil libraries. We should rely on
pkgconfig to give us correct dependency. We still keep the old check as
fallback.

Change-Id: I529ea1f61e5bbc16b2520ab1badff3d8264f1c33
2023-07-17 15:29:05 -07:00
KaiBatley
efa1d87add configs: fix GPU's default number of HW barrier/CU (#92)
AMD GCN3 and Vega GPUs assume a max of 16 WG/CU.  Any GPU WG with more
than 1 WF requires a hardware barrier to allow WFs in the WG to
synchronize locally.  However, currently the default gem5 GPU
configuration assumes only 4 barriers per CU, which artificially
prevents applications with > 4 WG/CU that could run simultaneously
from running simultaneously.

This fix resolves this by updating the default number of hardware barriers
per CU to 16, which mimics the support described in slide 39 here:
https://www.olcf.ornl.gov/wp-content/uploads/2019/10/
ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf

Change-Id: Ib7636a13359d998e676c1790f436a83ce88cbfc0
2023-07-17 10:42:40 -07:00
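The effect of the barrier count in efa1d87add can be shown with a little arithmetic. This is a simplified model for illustration only: it ignores every other per-CU resource limit (registers, LDS, wavefront slots), and the function and parameter names are hypothetical.

```python
def max_concurrent_wgs(max_wg_per_cu: int, hw_barriers_per_cu: int, wfs_per_wg: int) -> int:
    """Upper bound on workgroups resident on one CU (simplified model)."""
    if wfs_per_wg > 1:
        # Each multi-wavefront workgroup needs its own hardware barrier.
        return min(max_wg_per_cu, hw_barriers_per_cu)
    return max_wg_per_cu


print(max_concurrent_wgs(16, 4, wfs_per_wg=2))   # old default: capped by 4 barriers
print(max_concurrent_wgs(16, 16, wfs_per_wg=2))  # new default: all 16 WGs can be resident
```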
Bobby R. Bruce
6062214d87 util: Add "Improving stability" sec to github-vagrant-runner (#87)
Change-Id: I9812a21523b5b29bd7f570df4f1e90dbeabea085
2023-07-17 10:42:28 -07:00
Jason Lowe-Power
442923c414 Add feature to output citations automatically based on configuration (#90)
This change adds a new file to m5out which is citations.bib.
This file will contain the citations to the papers which describe the
aspects of the gem5 simulator that the simulation uses. In other words,
each simulation configuration could generate a different bib file
referencing different works.

Each SimObject can now have a set of citations associated with it. After
the system is built (in `instantiate`), the citations.bib file is
created by parsing all SimObjects that have been instantiated and taking
the union of their associated citations.

This commit is not meant to add all citations, but to act as an example
for others to add more citations to gem5.

Change-Id: Icd5c46fd9ee44adbeec1fea162657f5716f7e5ef
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-07-17 10:41:51 -07:00
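The union step described in 442923c414 can be sketched in a few lines. The class and function names below are hypothetical stand-ins, not gem5's SimObject API; the sketch only shows how per-object citation sets could be merged into one de-duplicated bib file body.

```python
from dataclasses import dataclass, field


@dataclass
class SimObjectStub:
    """Hypothetical stand-in for an instantiated SimObject."""
    name: str
    citations: set = field(default_factory=set)


def collect_citations(sim_objects):
    """Return the union of all citations, sorted for a stable file layout."""
    union = set()
    for obj in sim_objects:
        union |= obj.citations
    return sorted(union)


def render_bib(entries):
    """Join BibTeX entries into the text of a citations.bib file."""
    return "\n\n".join(entries) + "\n"


gem5_entry = "@article{gem5, title={Example gem5 entry}}"
ruby_entry = "@article{ruby, title={Example Ruby entry}}"
objects = [
    SimObjectStub("cpu", {gem5_entry}),
    SimObjectStub("cache", {gem5_entry, ruby_entry}),
]
print(render_bib(collect_citations(objects)))
```

Because the result depends on which objects are present, two different configurations would produce different bib files, as the commit message notes.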
Daniel Kouchekinia
f8f5dd98bf mem-ruby: Added WIB State to VIPER TCC Cache (#67)
Added WIB (Waiting on Writethrough Ack; Will be Bypassed) state which
is transitioned to when a dirty line in the TCC is evicted in a
bypassed read. Previously, we were transitioning to invalid.

While a WI (Waiting on Writethrough Ack) state exists, transitions from
it on WBAck deallocate the TBE, which contains SLC bit information
needed to trigger the Bypass event when the read response from the
directory comes in.

Without this change, WB acknowledgements from the directory in read
bypass evicts (with the SLC bit set) were being treated as if they were
read responses, leading to an invalid transition panic.

Change-Id: I703c3fe8af0366856552bb677810cb1a8f2896de
2023-07-17 10:17:47 -07:00
rogerchang23424
52d9259396 arch-riscv: Fix clearLoadReservation merge (#81)
The previous change
(https://gem5-review.googlesource.com/c/public/gem5/+/71818) makes
the clearLoadReservation be RISC-V only.

Change-Id: I5df1a7fa688489d57fff8da937e3c8addfe4c299
2023-07-14 08:48:43 -07:00
Mahyar Samani
b2fcc558d8 stdlib: Dividing range for linear multicore. (#63)
This patch changes the way memory ranges are divided when using
multiple cores for linear traffic. The current state assigns the
same range to multiple linear generators so all the cores start
generating the same trace. This patch divides the overall range
assigned to the generator ([min_addr:max_addr]) between the cores.

Change-Id: I49f69b3d61b590899f8d54ee3be997ad22d7fa9b

Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
Co-authored-by: mkjost0 <50555529+mkjost0@users.noreply.github.com>
Co-authored-by: Bobby R. Bruce <bbruce@ucdavis.edu>
2023-07-14 07:33:21 -07:00
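The range division described in b2fcc558d8 can be sketched as below. This is not the stdlib implementation: `divide_range` is a hypothetical helper, and the even split with the last core absorbing any remainder is an assumption of this example.

```python
def divide_range(min_addr: int, max_addr: int, num_cores: int):
    """Split [min_addr, max_addr) into one contiguous sub-range per core."""
    if num_cores <= 0 or max_addr <= min_addr:
        raise ValueError("need at least one core and a non-empty range")
    chunk = (max_addr - min_addr) // num_cores
    ranges = []
    for core in range(num_cores):
        start = min_addr + core * chunk
        # Assumption: the last core takes whatever remainder is left over.
        end = max_addr if core == num_cores - 1 else start + chunk
        ranges.append((start, end))
    return ranges


print(divide_range(0, 1024, 4))
```

With distinct sub-ranges, each linear generator walks its own addresses instead of all cores replaying the same trace.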
Bobby R. Bruce
bc99a6e346 tests: Improve Pyunit tests gem5 Resources' downloads (#79)
* tests: Remove large files from resource specialization tests

These tests were downloading resources (but not actually using them) to
ensure the `obtain_resources` function returned the correct
specialization and was parsing the data correctly. As these resources
were never used, this patch removes the downloading of large files in
this case, replacing them with smaller binaries.

Change-Id: I7b33aa6be8ec65b296b470cd50b128c084f2b71f

* tests: Rename 'looppoint-json...' example specialization

Appending '-example' to the end avoids any name clashes.
'looppoint-json-restore-resources-region-1' shares this ID with a real
resources in gem5 Resources.

Change-Id: I9853e97cb71e768c46ad173b5a497609f4acc3b2

* tests: Remove disk image download from Workload Checks

This download is big and unnecessary (the workload is never run as part
of the test). This patch changes this test to instead download a small
binary in its place (again, this does not matter as this is never
actually run as a disk image).

Change-Id: I74034ebcf5f2501917847c258570e88a8f653a5d

* tests: Update IDs in Pyunit Workload checks

Some of these IDs clash with real workloads/resources in gem5 Resources.
To avoid any possible clashes or confusion, all the mock
resources/workloads in this suite of tests have been renamed with
'-example' appended to the end of the ID.

Change-Id: Ifd907b2321416bf05e8c4e646024d179da2ca487
2023-07-14 06:31:34 -07:00
Giacomo Travaglini
18470b4747 arch-arm: Fix assert fail when UQRSHL shiftAmt==0 (#75)
When shiftAmt is 0 for a UQRSHL instruction, the code called bits() with
incorrect arguments. This fixes a left-shift of 0 to be a NOP/mov, as
required.

Change-Id: Ic86ca40ac42bfb767a09e8c65a53cec56382a008

Co-authored-by: Marton Erdos <marton.erdos@arm.com>
2023-07-13 10:57:51 -07:00
Bobby R. Bruce
552ae9a1a2 misc: Merge v23.0.0.1 Hotfix into develop (#65)
* gpu-compute: Remove use of 'std::random_shuffle'

This was deprecated in C++14 and removed in C++17. This has been
replaced with std::random. This has been implemented to ensure
reproducible results despite (pseudo)random behavior.

Change-Id: Idd52bc997547c7f8c1be88f6130adff8a37b4116

* dev-amdgpu: Add missing 'overrides'

This causes warnings/errors in some compilers.

Change-Id: I36a3548943c030d2578c2f581c8985c12eaeb0ae

* dev: Fix Linux specific includes to be portable

This allows for compilation in non-linux systems (e.g., Mac OS).

Change-Id: Ib6c9406baf42db8caaad335ebc670c1905584ea2

* gpu-compute: Add missing include in dispatcher.cc

Due to some cherry-picking onto the release-staging branch, there was a
missing "sim/sim_exit.hh" include in "src/gpu-compute/dispatcher.cc".
This was causing compilation errors.

This is being added to the v23.0.0 release as a hotfix.

Change-Id: I1043ecf5c41ad6afc0e91311b196f4801646002f
Issue-on: https://gem5.atlassian.net/browse/GEM5-1332

* misc: Update version to v23.0.0.1

Change-Id: I3bbcfd4dd9798149b37d4a2824fe63652e29786c

* misc: Update RELEASE-NOTES.md for v23.0.0.1 hotfix

Change-Id: Ieced7f693a8cbef586324dfe7ce826da16d9a3c3
2023-07-13 10:26:02 -07:00
Gabriel Busnot
2a880053bb Sanitizer libraries static linking (#70)
* scons: Fix sanitizer lib link for clang

Change-Id: I2441466c5c9343afd938185b8ec5047d4e95ac70

* scons: Statically link libubsan when using sanitizers with gcc

Change-Id: I362a1fb87771454ad94e439847a85d19108f375a

---------

Co-authored-by: Gabriel Busnot <gabriel.busnot@arteris.com>
2023-07-12 11:24:18 -07:00
Bobby R. Bruce
753933d471 gpu-compute, tests: Fix GPU_X86 compilation, add compiler tests (#64)
* gpu-compute: Remove use of 'std::random_shuffle'

This was deprecated in C++14 and removed in C++17. This has been
replaced with std::random. This has been implemented to ensure
reproducible results despite (pseudo)random behavior.

Change-Id: Idd52bc997547c7f8c1be88f6130adff8a37b4116

* dev-amdgpu: Add missing 'overrides'

This causes warnings/errors in some compilers.

Change-Id: I36a3548943c030d2578c2f581c8985c12eaeb0ae

* dev: Fix Linux specific includes to be portable

This allows for compilation in non-linux systems (e.g., Mac OS).

Change-Id: Ib6c9406baf42db8caaad335ebc670c1905584ea2

* tests: Add 'VEGA_X86' build target to compiler-tests.sh

Change-Id: Icbf1d60a096b1791a4718a7edf17466f854b6ae5

* tests: Add 'GCN3_X86' build target to compiler-tests.sh

Change-Id: Ie7c9c20bb090f8688e48c8619667312196a7c123
2023-07-11 14:35:03 -07:00
Gabriel Busnot
73afee1e0d base: Provide stl_helpers::operator<< for more types
This operator can be safely brought in scope when needed with "using
stl_helpers::operator<<".

In order to provide a specialization for operator<< with
stl_helpers-enabled types without losing the ability to use it with
other types, a dual-dispatch mechanism is used. The only entry point
in the system is through a primary dispatch function that won't
resolve for non-helped types. Then, recursive calls go through the
secondary dispatch interface that sorts between helped and non-helped
types. Helped types will enter the system back through the primary
dispatch interface while other types will look for operator<< through
regular lookup, especially ADL.

Change-Id: I1609dd6e85e25764f393458d736ec228e025da32
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67666
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
2023-07-10 23:00:45 +00:00
Luming Wang
c634b23305 sim,python: follow the new CPython startup sequence
Currently, gem5 suffers from several bugs related
to Python interpreter's locale encoding issues.
gem5 will crash when the working directory contains
Non-ASCII characters.

The reason is that Python 3.8+ introduces a new
interpreter startup sequence [1]. The startup
sequence consists of three phases:

1. Python core runtime preinitialization
2. Python core runtime initialization
3. Main interpreter configuration

Stage 1 determines the encodings used for system
interfaces.

However, gem5 doesn't preinitialize the Python
interpreter. Thus, the locale settings do not take
effect. This patch preinitializes the Python
interpreter for Python 3.8+.

Also, this patch avoids the use of `Py_SetProgramName`,
which is deprecated since Python 3.11[3].

[1] https://peps.python.org/pep-0432/
[2] https://peps.python.org/pep-0587/
[3] https://docs.python.org/3/c-api/init.html#c.Py_SetProgramName

Change-Id: I08a2ec6ab2b39a95ab194909932c8fc578c745ce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70898
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Roger Chang <rogerycchang@google.com>
2023-07-10 23:00:31 +00:00
Yang Liu
35763bdfb2 arch: Add setRegOperand in VecRegOperand
VecRegOperand also needs a setRegOperand method to write back the
execution result.

Change-Id: Ie50606014827c14a7219558dd003eb4747231649
Co-authored-by: Xuan Hu <huxuan@bosc.ac.cn>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67292
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-07-10 22:59:12 +00:00
Melissa Jost
eb07d3fcf4 misc: Update documentation and links for GitHub
This changes mentions of googlesource and Gerrit to instead
link to the gem5 GitHub repository, and updates the documentation
to reflect the GitHub review process.

Change-Id: I5dc1d9fcf6b96f9e5116802f938b7e3bb5b09567
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71878
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-07-10 22:57:28 +00:00
mbabaie
037e6fe33c misc: update gem5 links
This change updates all of the gerrit links to use github.

Change-Id: I2a020dafac0bd2ba99b26c6a9cd4f0c585e253f8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71719
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-07-10 22:57:28 +00:00
Bobby R. Bruce
c9ff8ef57d stdlib,tests: Add Simulator Exit Event handler tests
Change-Id: Ib5f119730299c0f201a14a9e9bb933d23d65ac62
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62751
Reviewed-by: Melissa Jost <mkjost@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2023-07-10 22:56:54 +00:00
Bobby R. Bruce
a43f8e6904 stdlib: Allow passing of func list as exit_event generator
Allows for a passing of functions to specify actions to execute on an
exit event via the Simulator module in the stdlib.

The functions in the list must have no mandatory arguments and return True
if the Simulation is to exit upon the function call's completion.

Issue-on: https://gem5.atlassian.net/browse/GEM5-1126
Change-Id: Ia88caf2975227e78243763627acab9e9f89e2a7d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62691
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-07-10 22:56:54 +00:00
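The idea in a43f8e6904 of passing a list of functions where a generator is expected can be sketched as follows. This is not the Simulator module's code: both function names are hypothetical, and the driver loop is a toy stand-in for a simulator reacting to exit events.

```python
def functions_to_generator(funcs):
    """Wrap a list of zero-argument functions as a generator of their results."""
    for func in funcs:
        yield func()  # each function is only called when its event occurs


def handle_exit_events(handlers, max_events):
    """Toy driver: consume one handler per exit event, stop when one returns True."""
    generator = functions_to_generator(handlers)
    handled = 0
    for _ in range(max_events):
        try:
            should_exit = next(generator)
        except StopIteration:
            break  # no handlers left
        handled += 1
        if should_exit:
            break
    return handled


calls = []
handlers = [
    lambda: calls.append("checkpoint") or False,  # continue simulating
    lambda: calls.append("dump stats") or True,   # request exit
    lambda: calls.append("never reached") or True,
]
print(handle_exit_events(handlers, max_events=10), calls)
```

Because the generator is lazy, the third function is never called once the second one asks for the simulation to exit.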
Bobby R. Bruce
63bdde4f63 arch-riscv: Remove clearLoadReservation
This was added due to a bad merge from stable to develop.

Change-Id: I7adf9604ee4d6f1cf11c404af5e8e1c071461a4a
2023-07-10 15:28:41 -07:00
Melissa Jost
edc4ff3382 misc: Include body in check for Change-Id
Updates ci-tests.yaml to check the entire commit message for
the Change-Id, not just the subject.

Change-Id: Ia76c77d096617a6fe76ffea7f2bd8a4295ca14f7
2023-07-10 15:02:07 -07:00
Ayaz Akram
a2d34a52fc configs: Fix SPEC benchmarks example scripts
This small change fixes the gem5_library example
scripts for SPEC benchmarks to make them compatible
with the latest version of the std library.

Change-Id: I3da9745f0ee6b253871e32082e135e0fa4040108
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71738
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2023-07-10 21:47:55 +00:00
Melissa Jost
ab0f43d290 misc: Update CI tests to only require 1 Change-Id
Since commits will be squashed and merged in GitHub, we only
require one of the commits in a pull request to contain a
Change-Id.

Change-Id: I0fbb1c0e79009097456193fbe3c6fa20746e4805
2023-07-10 13:29:55 -07:00
Melissa Jost
05e7a68487 misc: Add runs-on line to CI tests
Adds missing line to CI tests

Change-Id: I34019d76648dc6025ac89cbec4605f17d2a5e3f7
2023-07-10 12:42:47 -07:00
Bobby R. Bruce
4f45f84aa5 misc: Revert remove -Werror
This reverts commit 6c997b633f.

This is so the develop branch gives warnings as errors.

Change-Id: I6c8c81d478dc11fb561748dd2f392103b2beff68
2023-07-10 12:31:31 -07:00
Bobby R. Bruce
160681cabf misc: Update version info for develop branch
Change-Id: Iecee9e230c1c80f5675ec14bbeba9f7d9e2b8664
2023-07-10 12:28:44 -07:00
Melissa Jost
3105f59544 misc: Update Change-Id Check
This updates the change-id code to refer to commit messages in
pull requests instead of on pushes.

Change-Id: I308f02b4616804b386140d5875a79878eccd721e
2023-07-10 12:25:38 -07:00
Bobby R. Bruce
54501c3e2b misc: Merge branch 'stable' into 'develop'
This ensures all commits in v23.0 are now in the develop branch.

Change-Id: I791346115dd123f3541a3c8060482e00cf4dbfb5
2023-07-10 12:24:27 -07:00
KUNAL PAI
9b9dc09f6e resources: Add the gem5 Resources Manager
A GUI web-based tool to manage gem5 Resources.

It can manage resources in two kinds of data source:
a MongoDB database or a JSON file.

The JSON file can be either local or remote.

JSON files are written to a temporary file before
writing to the local file.

The Manager supports the following functions
on a high-level:
- searching for a resource by ID
- navigating to a resource version
- adding a new resource
- adding a new version to a resource
- editing any information within a searched resource
(while enforcing the gem5 Resources schema
found at: https://resources.gem5.org/gem5-resources-schema.json)
- deleting a resource version
- undo and redo up to the last 10 operations

The Manager also allows a user to save a session
through localStorage and re-access it through a password securely.

This patch also provides a
Command Line Interface tool mainly for
MongoDB-related functions.

This CLI tool can currently:
- backup a MongoDB collection to a JSON file
- restore a JSON file to a MongoDB collection
- search for a resource through its ID and
view its JSON object
- make a JSON file that is compliant with the
gem5 Resources Schema

Co-authored-by: Parth Shah <helloparthshah@gmail.com>
Co-authored-by: Harshil2107 <harshilp2107@gmail.com>
Co-authored-by: aarsli <arsli@ucdavis.edu>
Change-Id: I8107f609c869300b5323d4942971a7ce7c28d6b5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71218
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
2023-07-08 02:01:02 +00:00
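The note in 9b9dc09f6e that JSON is written to a temporary file before the local file suggests a write-then-replace pattern. The sketch below shows one common way to do that in Python; it is not the Resources Manager's code, the function name is hypothetical, and the use of `os.replace` for the final step is an assumption.

```python
import json
import os
import tempfile


def write_json_via_temp(path, data):
    """Write JSON to a temporary file, then move it over the target file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the final move stays
    # on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, indent=4)
        os.replace(tmp_path, path)  # assumed final step: swap in the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)  # do not leave a half-written file behind
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "resources.json")
    write_json_via_temp(target, [{"id": "example-resource", "resource_version": "1.0.0"}])
    with open(target, encoding="utf-8") as handle:
        print(json.load(handle))
```

If the write fails part-way, the original file is left untouched, which is the usual motivation for this pattern.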
Bobby R. Bruce
c5f9daa86c stdlib,tests: Fix download_check.py
This was causing the Weekly tests to fail. The removing of the download
directory should only happen at the end. Prior to this patch it was
deleted and then referenced, which caused problems.

Change-Id: I134782e89a13f5c3cd5c1912ad53a701d0413d16
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72019
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2023-07-07 15:52:52 +00:00
Bobby R. Bruce
4912e90978 stdlib: Change default gem5-resources DB collection
This was set to "test_collection", which was used during development.
Changing to "resources".

Change-Id: I52c83c6b73f3a227fbb05dc321a4bc38210ad71c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72018
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-07-07 15:52:52 +00:00
Matthew Poremba
3756af8ed9 gpu-compute,configs: Make sim exits conditional
The unconditional exit event when a kernel completes that was added in
c644eae2dd is causing scripts that do not
ignore unknown exit events to end simulation prematurely. One such
script is the apu_se.py script used in SE mode GPU simulation. Make this
exit conditional on the parameter being set to a valid value to avoid
this problem.

Change-Id: I1d2c082291fdbcf27390913ffdffb963ec8080dd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72098
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-07-07 14:12:54 +00:00
Gabriel Busnot
223a2ce234 misc: Add the magic_enum.hh external header (v0.9.2)
Also increase maximum enum supported value to 0x100

This library is courtesy of Daniil Goncharov and hosted at
https://github.com/Neargye/magic_enum/tree/master. It enables
compile-time and runtime string to/from enum value conversion as well
as other fancy enum introspection-like things.

The MIT licence should not conflict with gem5 in any way.

Copying the file for now but a submodule might be more suitable if
gem5 uses them in the future.

Change-Id: Ib0c8f943b79c703f1247c11c7291fb4fb1548b0f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67665
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-07-07 10:17:54 +00:00
Gabriel Busnot
b8af5f6a6c base: stl_hlp::unordered_{map,set} with stl_hlp::hash by default
Change-Id: Iad01d7fa6ff6293a2d931ba796666ad3550c6e44
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67664
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
2023-07-07 10:17:54 +00:00
Gabriel Busnot
eb241e8a99 base: Provide several hash implementations for common types
These types include std::pair, std::tuple, all iterable types and any
composition of these. Convenience hash factory and computation
functions are also provided.

These functions are in the stl_helpers namespace and must not move to
::std which could cause undefined behaviour. This is because
specialization of std templates for std or native types (or
composition of these) is undefined behaviour. This inconvenience can't
be circumvented for generic code. Users are free to bring these hash
implementations to namespace std after specialization for their own
non-std and non-native types.

Change-Id: Ifd0f0b64e5421d5d44890eb25428cc9c53484eb3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67663
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-07-07 10:17:54 +00:00
Gabriel Busnot
5282fac317 base: define is_std_hash_enabled type trait
Change-Id: I7ffb7f80a90006d6b8cd42bdf3d63e34c6dbda01
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71839
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
2023-07-07 10:17:54 +00:00
Gabriel Busnot
2f327fa2b8 base: define is_iterable type trait
Change-Id: I38bb0ddcbb95645797f1d20724b78aff3bef4580
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71838
Maintainer: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
2023-07-07 10:17:54 +00:00
Gabriel Busnot
91b4540477 python: Fix namespaced enums params code generation
The wrapper_name parameter was not properly handled. Enums were always
generated in the enums namespace even if required differently by
wrapper_name.

Change-Id: I366846ce39dfe10effc2cc145e7772a3fd171b92
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67662
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
2023-07-07 10:17:54 +00:00
Gabriel Busnot
159953080a mem-ruby: Fix of an address bug in MESI_Two_Level-dir.sm
Physical access address and line address were mixed up in
qw_queueMemoryWBRequest_partial

Change-Id: I0b238ffc59d2bb3de221d96905c75b7616eac964
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67661
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-07-07 10:17:54 +00:00
Gabriel Busnot
d79941df7a configs: Fix default CustomMesh for use with Garnet
Garnet routers do not support 0 latency switches. Use 1 instead if the
network is garnet.

Change-Id: I09841a01eaf413bee0a1629307ecff0ae2bda948
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67660
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-07-07 10:17:54 +00:00
Gabriel Busnot
20dd444273 mem-ruby: Switch to dequeueMemRspQueue() in all Ruby protocols
Change-Id: I33bca345d985618e3fca62e9ddd5bcc3ad8226a3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67659
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-07-07 10:17:54 +00:00
Gabriel Busnot
833afc3451 mem-ruby: AbstractController can send retry req to mem controller
Prior to this patch, when a memory controller failed to send a
response to AbstractController, it would not wake up until the next
request. This patch lets Ruby models notify AbstractController of
memory response buffer dequeues so that it can send a retry request
if necessary.

A dequeueMemRspQueue function has been added to AbstractController to
automate the dequeue+notify operation.

Note that models that don't notify AbstractController will continue
working as before.

Change-Id: I261bb4593c126208c98825e54f538638d818d16b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67658
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
2023-07-07 10:17:54 +00:00
Wei-Han Chen
b4687aa7d9 dev: Warn when resp packet is error in dma port
This CL adds a warning when the response packet is an error.

Change-Id: I8e94dc2b85cd1753a4d6265cfda3cd5d6325f425
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71778
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-07-07 07:12:54 +00:00
Yu-hsin Wang
42c52b1add scons: Update default environment comments
Change-Id: Ib6dcf1a6390010682365f393241c1e022aeeb813
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72058
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-07-07 07:11:03 +00:00
Bobby R. Bruce
84b457c6aa util: Add 'swapspace' daemon to runner VM.
As these VMs, particularly the runners, don't have much memory, the
'swapspace' daemon allows for dynamic swap spaces to be created for when
more memory is required.

Change-Id: Ie8e734a8fde54e122df33dda187c6c4aafdcd006
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71680
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-07-07 02:31:50 +00:00
Bobby R. Bruce
c391fdfbb4 util: Add 'shutdown' argument option to vm_manager.sh
This allows for the VMs to be shut down rather than destroyed. They can
be rebooted with `./vm_manager.sh` after shutdown.

Change-Id: I58329ec835af664bfb970b029e09ad16c5472015
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71500
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-07-07 02:31:50 +00:00
Bobby R. Bruce
1fd392e7bf util: '-eq' -> '-ge' for if in vm_manager.sh
A small nit-pick change that ensures that passing more than one
argument does not cause the argument checking to be skipped (NOTE:
arguments after the first are never processed and are ignored).

Change-Id: If7e9c16c2c3581ea95ed888586736618d1ae5f5f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71499
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-07-07 02:31:50 +00:00
Bobby R. Bruce
34d984d147 util: Update GitHub Runners Vagrant to overcommit memory
SE mode tests were failing in some cases where the VM did not have
enough memory to satisfy the constraints of the simulated system. This
change ensures the VM allows overcommitting of memory.

Change-Id: I1800288e16146bdae612a401c2ff282d8664892d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71498
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-07-07 02:31:50 +00:00
Yu-hsin Wang
fa905fd512 scons: Add -rdynamic when building python embed binary
When Python is built from scratch, its modules are separate shared
libraries that are loaded with dlopen on module import. To let these
shared libraries use the symbols in the binary, we should add
-rdynamic when compiling.

Change-Id: I26bf9fd7ea5068fd2d08c8f059b37ff34073e8c2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72040
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-07-06 02:27:44 +00:00
Yu-hsin Wang
7cc4f820d7 scons: Pass the CPATH environment variable through to SCons.
In a sandbox environment, the default include paths may be overridden
by CPATH. For SCons to work in such an environment, we need to pass
CPATH into SCons.

Change-Id: I1015f20a553a2e18595c8d2a89b209ca665879fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/72038
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-07-06 02:27:07 +00:00
Matthew Poremba
05ffa35426 configs: Create base GPUFS vega config and atomic config
Move the Vega KVM script code to a common base file and add scripts for
KVM and atomic. Since atomic is now possible in GPUFS this gives a way
to run it without editing the current scripts.

Change-Id: I094bc4d4df856563535c28c1f6d6cc045d6734cd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71939
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2023-06-30 19:55:18 +00:00
Adrià Armejach
fe7b18c2d7 arch-riscv: Make virtual method RISC-V private
* Prior commit defined a shared virtual method that is only used in
  RISC-V. This patch makes the method only visible to the RISC-V ISA.

Change-Id: Ie31e1e1e5933d7c3b9f5af0c20822d3a6a382eee
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71818
Reviewed-by: Roger Chang <rogerycchang@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-30 07:39:40 +00:00
Gabe Black
1aa8d6c004 scons: Pass the DISPLAY environment variable through to SCons.
This lets gui programs run correctly within SCons, specifically the
kconfig "guiconfig" helper utility.

Change-Id: Iec51df3db89ac7e7411e6c08fe8201afb69dc63e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56952
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-30 01:50:23 +00:00
Matthew Poremba
ce715601ad configs: Add GPUFS --root-partition option
Different GPUFS disk images have different root partitions that Linux
needs to boot from. In particular, Ubuntu's new installer has a GRUB
partition that cannot seem to be removed. Adding this as an option
prevents needing to edit a config script to change one character each
time a different disk image is used.

Change-Id: Iac2996ea096047281891a70aa2901401ac9746fc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71918
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
2023-06-29 23:30:16 +00:00
Matthew Poremba
841e6fe978 arch-vega: Add Vega D16 decodings and fix V_SWAP_B32
Vega adds multiple new D16 instructions which load a byte or short into
the lower or upper 16 bits of a register for packed math. The decoder
table has subDecode tables for FLAT instructions which represents 32
opcodes in each subDecode table. The subDecode table for opcodes 32-63
is missing so it is added here.

The opcode for V_SWAP_B32 is also off by one. In the ISA manual this
instruction is opcode 81, the instruction before is 79, and there is no
opcode 80, so the decoder entry is swapped with the invalid decoding
below it.

Change-Id: I278fea574ea684ccc6302d5b4d0f5dd8813a88ad
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71899
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-29 19:56:56 +00:00
Matthew Poremba
079fc47dc2 dev-amdgpu: Perform frame writes atomically
The PCI read/write functions are atomic functions in gem5, meaning they
expect a response with a latency value on the same simulation Tick. For
reads to a PCI device, the response must also include a data value read
from the device.

The AMDGPU device has a PCI BAR which mirrors the frame buffer memory.
Currently reads are done atomically, but writes are sent to a DMA device
without waiting for a write completion ACK. As a result, it is possible
that writes can be queued in the DMA device long enough that another
read for a queued address arrives. This happens very deterministically
with the AtomicSimpleCPU and causes GPUFS to break with that CPU.

This change makes writes to the frame BAR atomic the same as reads. This
avoids that problem and as a result the AtomicSimpleCPU can now load the
driver for GPUFS simulations.

Change-Id: I9a8e8b172712c78b667ebcec81a0c5d0060234db
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71898
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
2023-06-29 19:56:49 +00:00
handsomeliu
f54b3e6e75 mem: Support backdoor request in AddrMapper
Change-Id: Iedbe8eb75006ce1b81e85910af848fb8c4cba646
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/69057
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-06-29 15:11:19 +00:00
Melissa Jost
051801a7bf misc: Add workflow files to develop
This copies our .github folder from stable into the develop
branch, which allows the GitHub Actions workflows to run
on both branches

Change-Id: I864939f86f0fbd6d73676f137df2670d3eac1d1a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71860
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
2023-06-28 16:19:59 +00:00
Melissa Jost
6d776eb468 resources: Output error message in downloader.py
This allows for the actual error message to be output in addition
to the output gem5 has on ValueErrors and ImportErrors.

Change-Id: Ic52f5646aa41dbf7c217ab27d142c0a18fa24c55
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71859
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-06-28 16:19:59 +00:00
Melissa Jost
3c0c3fb623 resources: Catch ConnectionResourceError in downloading resources
This handles an error we see within GitHub Actions that
occasionally occurs when downloading resources.  We retry in the
same way we do when handling HTTPErrors.
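
The retry behaviour described above can be sketched roughly as follows.
This is a minimal illustration, not the code in downloader.py: the
helper name `with_retries`, its parameters, and the choice of
`ConnectionResetError` as the caught exception are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    download: Callable[[], T], max_attempts: int = 4, base_delay: float = 0.0
) -> T:
    """Call download() until it succeeds or max_attempts is reached.

    Hypothetical helper: retries only on ConnectionResetError and waits
    a little longer after each failed attempt.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return download()
        except ConnectionResetError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the original error.
            time.sleep(base_delay * attempt)
    raise ValueError("max_attempts must be at least 1")
```

A caller would wrap the actual download call, for example
`with_retries(lambda: fetch(url))`, where `fetch` stands in for
whatever function performs the transfer.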

Change-Id: I4dce5d607ccc41ad53b51e39082c486e644d815c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71858
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-06-28 16:19:59 +00:00
Yan Lee
d5bdf6cf79 mem: port: add TracingExtension for debug purpose
TracingExtension contains a stack recording the names of the ports
the Packet has passed through. The target receiving the Packet
can dump out the whole path of this Packet for debugging purposes.
This mechanism can be enabled with the debug flag PortTrace.
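
As a rough illustration of the idea (written in Python for brevity,
whereas the actual extension is part of the C++ memory system), a packet
can carry a stack of port names that each hop appends to. The class and
function names below are hypothetical.

```python
from typing import List


class TracedPacket:
    """Hypothetical packet carrying a stack of traversed port names."""

    def __init__(self) -> None:
        self.port_trace: List[str] = []


def pass_through(pkt: TracedPacket, port_name: str) -> None:
    # Each port the packet crosses pushes its own name onto the stack.
    pkt.port_trace.append(port_name)


def dump_path(pkt: TracedPacket) -> str:
    # The receiving target can render the full path for debugging.
    return " -> ".join(pkt.port_trace)
```

The port names used with such a sketch (for example "cpu.dcache_port")
would be whatever names the simulated system assigns to its ports.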

Change-Id: Ic11e708b35fdddc4f4b786d91b35fd4def08948c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71538
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
2023-06-21 02:37:42 +00:00
Bobby R. Bruce
aff1e7b64c stdlib: Refactor gem5 Vision/gem5-resources code
This patch includes several changes to the gem5 tools interface to the
gem5-resources infrastructure. These are:

* The old download and JSON query functions have been removed from the
  downloader module. These functions were used for directly downloading
  and inspecting the resource JSON file, hosted at
  https://resources.gem5.org/resources. This information is now obtained
  via `gem5.client`. If a resources JSON file is specified as a client,
  it should conform to the new schema:
  https://resources.gem5.org/gem5-resources-schema.json. The old schema
  (pre-v23.0) is no longer valid. Tests have been updated to reflect
  this change. Those which tested these old functions have been removed.
* Unused imports have been removed.
* For the resource query functions, and those tasked with obtaining the
  resources, the parameter `gem5_version` has been added. In all cases
  it does the same thing:
    * It will filter results based on compatibility to the
      `gem5_version` specified. If no resources are compatible the
      latest version of that resource is chosen (though a warning is
      thrown).
    * By default it is set to the current gem5 version.
    * It is optional. If `None`, this filtering functionality is not
      carried out.
    * Tests have been updated to fix the version to “develop” so that
      they do not break between versions.
* The `gem5_version` parameter filters on compatibility according to the
  specificity of the gem5-version given in a resource’s data. If a
  resource has a compatible gem5-version of “v18.4”, it is compatible
  with any minor/hotfix version within the v18.4 release (this can be
  seen as matching on “v18.4.*.*”). Likewise, if a resource has a
  compatible gem5-version of “v18.4.1”, it is only compatible with the
  v18.4.1 release and any of its hotfix releases (“v18.4.1.*”).
* The ‘list_resources’ function has been updated to use the
  “gem5.client” APIs to get resource information from the clients
  (MongoDB or a JSON file). This has been designed to remain backwards
  compatible as far as possible, though, due to schema changes,
  the function does search across all versions of gem5.
* A `get_resources` function was added to the `AbstractClient`. This is
  a more general function than `get_resource_by_id`. It was
  primarily created to handle the `list_resources` update but is a
  useful update to the API. The `get_resource_by_id` function has been
  altered to function as a wrapper around the `get_resources` function.
* The “GEM5_RESOURCE_JSON” code has been removed. This is no longer
  used.
* Tests have been cleaned up a little bit to be easier to read.
* Some docstrings have been updated.
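
The version-specificity rule from the `gem5_version` item in the list
above can be sketched as a component-wise prefix match. This is only an
illustration under stated assumptions: the helper name `is_compatible`
is hypothetical and is not claimed to match the function used in the
stdlib.

```python
def is_compatible(resource_gem5_version: str, gem5_version: str) -> bool:
    """Return True if gem5_version falls under resource_gem5_version.

    Hypothetical helper: "v18.4" is treated as "v18.4.*.*" and
    "v18.4.1" as "v18.4.1.*" by comparing dot-separated components.
    """
    wanted = resource_gem5_version.lstrip("v").split(".")
    actual = gem5_version.lstrip("v").split(".")
    # Comparing whole components avoids "18.4" matching "18.40".
    return actual[: len(wanted)] == wanted
```

Comparing whole components rather than raw string prefixes is a
deliberate choice here, since a plain `startswith` check would wrongly
treat “v18.40” as part of the “v18.4” release.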

Things that are left TODO with this code:

* The client_wrapper/client/abstract_client abstractions are rather
  pointless. In particular the client_wrapper and client classes could
  be merged.
* The downloader module no longer does much and should have its
  functions merged into other modules.
* With the addition of the `get_resources` function, much of the code in
  the `AbstractClient` could be simplified.

Change-Id: I0ce48e88b93a2b9db53d4749861fa0b5f9472053
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71506
Reviewed-by: Kunal Pai <kunpai@ucdavis.edu>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
(cherry picked from commit 82587ce71b)
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71739
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
2023-06-16 16:43:58 +00:00
Bobby R. Bruce
b22afde3be python: Remove Python 'pipes' module
This is scheduled for removal from Python in 3.13:
https://docs.python.org/3/library/pipes.html.

The 'shlex.quote' function can replace the 'pipes.quote' function used
in "main.py". A special wrapper has been made to account for the Windows
case which 'shlex.quote' doesn't handle.
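
A wrapper of this kind might look roughly like the sketch below. The
function name `quote_arg` and the Windows fallback (wrapping in double
quotes and escaping embedded ones) are assumptions made for
illustration, not the code in "main.py".

```python
import shlex
import sys


def quote_arg(arg: str) -> str:
    """Quote a single command-line argument for display.

    Hypothetical wrapper: shlex.quote targets POSIX shells, so a simple
    double-quote fallback is assumed for Windows.
    """
    if sys.platform == "win32":
        return '"' + arg.replace('"', '\\"') + '"'
    return shlex.quote(arg)
```

On POSIX hosts this defers entirely to `shlex.quote`, which leaves
simple words unquoted and single-quotes anything containing shell
metacharacters or whitespace.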

Change-Id: I9c84605f0ccd8468b9cab6cece6248ef8c2107f0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71678
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
(cherry picked from commit a63d376ecd)
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71740
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-06-16 16:43:40 +00:00
Bobby R. Bruce
29055a0a6a scons,stdlib: Remove deprecated 'distutils' module
The Python module 'distutils' will be removed in Python 3.12:
https://docs.python.org/3/library/distutils.html

This patch removed usage of 'distutils' in the gem5 code base.

Change-Id: I1e3a944446149f3cd6cbf4211a1565b5f74c85a0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71679
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
(cherry picked from commit b182b15f93)
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71741
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-06-16 16:43:26 +00:00
Adrià Armejach
7398e1e401 arch-riscv: fix load reserved store conditional
* According to the manual, load reservations must be cleared on a
    failed or a successful SC attempt.
  * A load reservation can be arbitrarily large. The current
    implementation was reserving something different than cacheBlockSize
    which could lead to problems if snoop addresses are cache block
    aligned. This patch's implementation assumes cacheBlock granularity.
  * Load reservations should also be cleared on faults
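
The three rules above can be illustrated with a small model. This is a
hedged sketch in Python rather than the gem5 C++ code: the class name
`ReservationTracker`, its methods, and the 64-byte default block size
are assumptions.

```python
from typing import Optional


class ReservationTracker:
    """Hypothetical model of a cache-block-granular load reservation."""

    def __init__(self, cache_block_size: int = 64) -> None:
        # Mask that rounds an address down to its cache block.
        self.block_mask = ~(cache_block_size - 1)
        self.reserved_block: Optional[int] = None

    def load_reserved(self, addr: int) -> None:
        self.reserved_block = addr & self.block_mask

    def store_conditional(self, addr: int) -> bool:
        success = self.reserved_block == (addr & self.block_mask)
        # The reservation is cleared whether the SC succeeds or fails.
        self.reserved_block = None
        return success

    def fault(self) -> None:
        # Faults also clear any outstanding reservation.
        self.reserved_block = None
```

Because the reservation is stored block-aligned, a snoop on a
block-aligned address can be compared against it directly.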

Change-Id: I64513534710b5f269260fcb204f717801913e2f5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71558
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Roger Chang <rogerycchang@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-16 06:49:25 +00:00
Mahyar Samani
8a6ee4cab3 tests: Reducing json stat dump size.
This change reduces the number of stats dumped as json in
traffic_gen tests.

Change-Id: I94becb2e6d5da6096271cf7893ff2b380314da06
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71402
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
(cherry picked from commit f78471fb81)
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71438
2023-06-16 00:48:59 +00:00
Hoa Nguyen
307ec86f05 configs: Add example configuration for OctopiCache
Change-Id: Ia78dd63e63808ebad40052d2a7cdb67cc7179e44
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71618
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-06-15 23:07:47 +00:00
Hoa Nguyen
13e55e8491 stdlib: Add a prebuilt MESI_Three_Level cache
The cache is modeled after an AMD EPYC cache, but is not an exact
replica of it.
- K cores per core complex (CCD), each core has one private split L1,
and one private L2.
- K cores in the same CCD share 1 slice of L3 cache, which is not
a victim cache.
- There can be multiple CCDs, which communicate with each other via a
Cross-CCD router. The Cross-CCD router is also connected to the
directory controllers and DMA controllers.
- All link latencies are set to 1.

Change-Id: Ib64248bed9155b8e48e5158ffdeebf1f2d770754
Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71598
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2023-06-15 23:07:47 +00:00
Matthew Poremba
db903f4fd4 arch-vega: Helper methods for SDWA/DPP for VOP2
Many of the outstanding issues with the GPU model are related to
instructions not having SDWA/DPP implementations and executing by
ignoring the special registers, leading to incorrect execution.
Adding SDWA/DPP is currently very cumbersome as there is a lot of
boilerplate code.

This changeset adds helper methods for VOP2 with one instruction
changed as an example. This review is intended to get feedback
before applying this change to all VOP2 instructions that support
SDWA/DPP.

Change-Id: I1edbc3f3bb166d34f151545aa9f47a94150e1406
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70738
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-15 23:02:39 +00:00
Yu-hsin Wang
694673f1d7 arch: set multiline re as default in isa_parser
Python 3.11 requires a global flag specifier to be the first token of
a regex. However, this is not possible when using the ply library.
Instead, we make the rules multiline regexes by default and modify
the single-line rules accordingly.
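
The general technique can be shown with the standard `re` module alone.
The sketch below is not the isa_parser code; the rule names and patterns
are made up for illustration. It passes `re.MULTILINE` as a compile flag
so that no inline "(?m)" specifier has to appear in the middle of a
combined pattern.

```python
import re

# Hypothetical token rule, written without an inline "(?m)" specifier.
COMMENT_RULE = r"^#.*$"

# Supplying re.MULTILINE as a flag keeps the individual rule free of
# global inline flags, so it remains valid when embedded in a larger
# alternation of named groups.
combined = re.compile(
    r"(?P<number>\d+)|(?P<comment>" + COMMENT_RULE + r")",
    re.MULTILINE,
)

text = "1\n# note\n2"
tokens = [(m.lastgroup, m.group()) for m in combined.finditer(text)]
```

With the flag applied globally, any rule that relied on "^" or "$"
matching only at the ends of the whole input has to be rewritten, which
is the adjustment the message refers to.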

Ref: https://github.com/dabeaz/ply/issues/282

Change-Id: I7bdbfeb97a9dd74f45c1890a76f8cc16100e5a42
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71019
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2023-06-15 10:04:00 +00:00
Yu-hsin Wang
23a88d0400 fastmodel: only support single line literal when parsing project file
Python 3.11 requires a global flag specifier to be the first token of
a regex. However, this is not possible when using the ply library. In
the fastmodel case, we don't actually need to support multiline string
literals. We fix this issue by making the string literal single
line.

Ref: https://github.com/dabeaz/ply/issues/282

Change-Id: I746b628db7ad4c1d7834f1a1b2c1243cef68aa01
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71018
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-06-15 10:03:47 +00:00
Roger Chang
9a27322c7b arch-riscv: Fix unexpected behavior of float operations in Mac OS
uint_fast16_t is an integer type of at least 16 bits; it can be
32 bits, 64 bits, or more. Most simulations run on an x86-64 Linux
host, where the size of uint_fast16_t is 64 bits. Therefore,
there is no problem for double precision float operations and the
FloatMM test passes. However, on Mac OS, the size of uint_fast16_t
is 16 bits, so the upper bits are lost when converting float
register bits to freg_t, which generates unexpected results for the
FloatMM test.

This change guarantees that the size of data in freg_t is at least
64 bits, so no data is lost when converting from floating point to freg_t.
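
The effect of a too-narrow storage type can be reproduced with plain
integer masking. This sketch only models the truncation in Python; the
helper name `store_in_register` is hypothetical and it does not touch
the actual freg_t definition.

```python
import struct


def store_in_register(bits: int, width: int) -> int:
    """Keep only the low `width` bits, as a fixed-width integer would."""
    return bits & ((1 << width) - 1)


# Raw IEEE 754 bit pattern of the double-precision value 1.5.
double_bits = struct.unpack("<Q", struct.pack("<d", 1.5))[0]

kept_64 = store_in_register(double_bits, 64)  # Pattern survives intact.
kept_16 = store_in_register(double_bits, 16)  # Sign/exponent bits are lost.
```

For 1.5, all the set bits live in the upper part of the 64-bit pattern,
so a 16-bit container keeps none of them.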

Reference:
https://developer.apple.com/documentation/kernel/uint_fast16_t

https://codebrowser.dev/glibc/glibc/stdlib/stdint.h.html

Change-Id: I3df6610f0903cdee0f56584d6cbdb51ac26c86c8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71578
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-13 23:20:47 +00:00
Jason Lowe-Power
28777e1114 python: Ignore -s as gem5 option
This enables more compatibility with the normal python binary. This is
needed to get multiprocessing to work on some systems.

Change-Id: Ibb946136d153979bf54a773060010a0ae479a9d1
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71502
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
2023-06-13 21:56:31 +00:00
Roger Chang
0fef2300c0 arch-riscv: Refactor fmax and fmin instructions
Currently the fmax and fmin instructions convert source float registers such as
Fs1_bits to float64_t (or float32_t and float16_t) many times within a single
instruction. This makes future maintenance of these instructions harder.

The change adds non-register float_t intermediate variables fs1 and fs2 to
keep converted results so that we don’t need to do it repeatedly. It also
adds an intermediate variable fd for the specific float type to assume the upper
bits of the packed float register are all ones.

Change-Id: Ic508d5255db6c4b38ca4df6dd805df440c043fff
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71479
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-13 00:09:34 +00:00
Melissa Jost
f467cf79fd util: Add util for GitHub runner configuration
This adds files that can be used to configure Vagrant machines
that will be used to test running gem5 alongside Github Actions.

Change-Id: I52b0f39b6e6044c22481f02163d5fc01eab76788
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71098
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-09 00:33:32 +00:00
Matthew Poremba
90067c6ce4 configs: GPUFS: Only use parallel eventqs for KVM
This is turned on by default with multiple CPUs in the GPUFS configs,
which causes other CPU types (e.g., AtomicSimpleCPU) to assert. Only
enable parallel event queues for KVM CPUs to avoid this issue.

Change-Id: Ic8235437caf0150560e2b360a4544d82dfc26c36
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71419
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2023-06-08 22:04:02 +00:00
Matthew Poremba
c644eae2dd configs,gpu-compute: Kernel dispatch-based exit events
Add two kernel dispatch-based exit events that are useful for limiting
the simulation and enabling debug flags at specific GPU kernels. Since
the KVM CPU typically used with GPUFS is not deterministic, this helps
with enabling debug flags when the Tick number may vary. The exit at GPU
kernel option can also limit simulation by only simulating a few hundred
kernels, for example, and exit at a determined point.

Change-Id: I81bae92a80c25fc38c41e999aa662e1417b7a20d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71418
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2023-06-08 22:03:47 +00:00
KUNAL PAI
25d7badcc1 tests: Fix bugs related to gem5 Vision
This patch fixes refs under tests/pyunit/stdlib/resources.

Removes instances of {url_base} in refs.

Also, renames two refs: mongo_mock and mongo_dup_mock
to mongo-mock and mongo-dup-mock to follow naming
convention of other refs.

Change-Id: If115114bc7a89764e7c546b77a93d36d6a3b5f8a
Co-authored-by: Parth Shah <helloparthshah@gmail.com>
Co-authored-by: Harshil2107 <harshilp2107@gmail.com>
Co-authored-by: aarsli <arsli@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71360
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
2023-06-08 02:56:33 +00:00
Daniel R. Carvalho
77ac6eacd9 mem-cache: De-virtualize forEachBlk() in tags
Avoid code duplication by using the anyBlk function
with a lambda that always returns false, which forces
all blocks to be visited.
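
The pattern can be shown in a few lines. The sketch below is a Python
rendering of the idea only (the real change is in the C++ tag classes),
and the class and method names are illustrative.

```python
from typing import Callable, Iterable, List


class Tags:
    """Hypothetical tag store holding a flat list of block ids."""

    def __init__(self, blocks: Iterable[int]) -> None:
        self.blocks: List[int] = list(blocks)

    def any_blk(self, visitor: Callable[[int], bool]) -> bool:
        # Stops at the first block for which the visitor returns True.
        for blk in self.blocks:
            if visitor(blk):
                return True
        return False

    def for_each_blk(self, visitor: Callable[[int], None]) -> None:
        def always_false(blk: int) -> bool:
            visitor(blk)
            return False  # Never short-circuit, so every block is seen.

        self.any_blk(always_false)
```

Only `any_blk` needs to know how blocks are stored, so subclasses with a
different layout would override that one method and get `for_each_blk`
for free.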

Change-Id: I25527602535c719f46699677a7f70f3e31157f26
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70998
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-08 00:26:02 +00:00
Roger Chang
685a5cd017 scons: Fix grpc protobuf actions
This change fixes the proto import issue and the build issue with the
--no-duplicate-sources option. For more details, see:
https://gem5-review.googlesource.com/c/public/gem5/+/64491.

Change-Id: I259413f7739f89598dcd42c3f2e1e865cec3de43
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71318
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-06-08 00:17:23 +00:00
Kunal Pai
ffbf73db1d stdlib, tests, configs: Introduce gem5 Vision to resources
This patch makes changes to the stdlib based on the gem5 Vision project.
Firstly, a MongoDB database is supported.
A JSON database's support is continued.
The JSON can either be a local path or a raw GitHub link.

The data for these databases is stored in src/python
under "gem5-config.json".
This will be used by default.
However, the configuration can be overridden:
- by providing a path using the GEM5_CONFIG env variable.
- by placing a gem5-config.json file in the current working directory.
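
The lookup described above might be sketched as follows. The function
name `find_config` is hypothetical, and the precedence shown
(environment variable first, then the working directory, then the
default) is an assumption, since the message does not state which
override wins.

```python
from pathlib import Path
from typing import Mapping


def find_config(
    default_path: Path, cwd: Path, environ: Mapping[str, str]
) -> Path:
    """Pick the gem5-config.json to use (assumed precedence order)."""
    # Assumption: an explicit GEM5_CONFIG path takes priority.
    if "GEM5_CONFIG" in environ:
        return Path(environ["GEM5_CONFIG"])
    # Otherwise, use a gem5-config.json in the current working directory.
    local = cwd / "gem5-config.json"
    if local.is_file():
        return local
    # Fall back to the default shipped under src/python.
    return default_path
```

In practice this would be called with `Path.cwd()` and `os.environ`;
taking them as parameters simply makes the sketch easy to exercise.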

An AbstractClient is an abstract class that implements
searching and sorting relevant to the databases.

Clients is an optional list that can be passed
while defining any Resource class and obtain_resource.
These databases can be defined in the config JSON.

Resources now have versions. This allows for a
single resource, e.g., 'x86-ubuntu-boot', to have
multiple versions. As such, the key of a resource is
its ID and Version (e.g., 'x86-ubuntu-boot/v2.1.0').
Different versions of a resource might be compatible
with different versions of gem5.

By default, it picks the latest version compatible with the gem5 Version
of the user.

A gem5 resource schema now has additional fields.
These are:
- source_url: Stores URL of GitHub Source of the resource.
- license: License information of the resource.
- tags: Words to identify a resource better, like hello for hello-world
- example_usage: How to use the resource in a simulation.
- gem5_versions: List of gem5 versions that resource is compatible with.
- resource_version: The version of the resource itself.
- size: The download size of the resource, if it exists.
- code_examples: List of objects.
These objects contain the path to where a resource is
used in gem5 example config scripts,
and if the resource itself is used in tests or not.
- category: Category of the resource, as defined by classes in
src/python/gem5/resources/resource.py.

Some fields have been renamed:
- "name" is changed to "id"
- "documentation" is changed to "description"

Besides these, the schema also supports resource specialization.
It adds fields relevant to a specific resource as specified in
src/python/gem5/resources/resource.py
These changes have been made to better present
information on the new gem5 Resources website.

But, they do not affect the way resources are used by a gem5 user.
This patch is also backwards compatible.
Existing code doesn't break with this new infrastructure.

Also, refs in the tests have been changed to match this new schema.
Tests have been changed to work with the two clients.

Change-Id: Ia9bf47f7900763827fd5e873bcd663cc3ecdba40
Co-authored-by: Kunal Pai <kunpai@ucdavis.edu>
Co-authored-by: Parth Shah <helloparthshah@gmail.com>
Co-authored-by: Harshil Patel <harshilp2107@gmail.com>
Co-authored-by: aarsli <arsli@ucdavis.edu>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70858
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
2023-06-06 03:30:50 +00:00
Giacomo Travaglini
b355baac93 dev-arm: Treat GICv3 reserved addresses as RES0
According to the GIC specification (IHI0069), reserved addresses in the
GIC memory map are treated as RES0. This behaviour can be disabled so
that gem5 panics instead (reserved_res0 = False, which is what we have
been doing so far) to catch development bugs (in gem5 and in the guest SW).
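
The effect of the reserved_res0 switch can be sketched as follows. This is a simplified Python model, not the gem5 C++ code: the function and exception names are hypothetical, and an exception stands in for gem5's panic.

```python
# Simplified model of RES0 handling for reserved addresses.
# Hypothetical names; an exception stands in for a simulator panic.


class ReservedAccessError(Exception):
    """Raised in place of a panic when RES0 handling is disabled."""


def read_reserved(addr, reserved_res0=True):
    """RES0: reads of a reserved address return zero."""
    if reserved_res0:
        return 0
    raise ReservedAccessError(f"read of reserved address {addr:#x}")


def write_reserved(addr, value, reserved_res0=True):
    """RES0: writes to a reserved address are ignored."""
    if reserved_res0:
        return
    raise ReservedAccessError(f"write of reserved address {addr:#x}")
```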

Change-Id: I23f98519c2f256c092a52425735b8792bae7a2c7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71138
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-05 15:01:59 +00:00
Roger Chang
942a9ea503 stdlib: Add U74VecFU to U74CPU
This change eliminates the warning message from U74CPU.

Change-Id: I7a5d0cd0b2955e54ed14fc1ac6f7127bd7f0604b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71238
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-06-04 23:50:36 +00:00
Xuan Hu
14f919a67e arch-riscv,cpu-minor: Add MinorDefaultVecFU for risc-v v-ext
Change-Id: Id5c5ae5fa1901154cadeb0a4958703f3f15d491f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67295
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-04 23:50:18 +00:00
Roger Chang
5e5e81d1c5 arch-riscv: Check FPU status for c.flwsp c.fldsp c.fswsp c.fsdsp
The change adds the missing FPU checking for these instructions.

Change-Id: I7f2ef89786af0d528f2029f1097cfeac6c7d65f2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71198
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-06-03 00:58:25 +00:00
Bobby R. Bruce
917ced812c stdlib: Fix incorrect path and checks for DRAMsim3
There are three bugs fixed in this patch:

1. The `dram_3_dir` was missing the "dramsim3" directory.
2. Missing `not` when checking if configs is a directory.
3. Missing `not` when checking if input file is a file.
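
The three fixes listed above might look roughly like the sketch below. It is an assumption-laden illustration, not the patched gem5 code: the function name, the ext_dir parameter, and the directory layout used in the demo are hypothetical.

```python
# Hypothetical sketch of the three corrected checks; not the gem5 source.
import os
import tempfile


def resolve_dramsim3_input(ext_dir, input_file):
    """Return the path to input_file inside the DRAMsim3 configs directory."""
    # Fix 1: the path must include the "dramsim3" directory.
    dram_3_dir = os.path.join(ext_dir, "dramsim3")
    configs = os.path.join(dram_3_dir, "configs")
    # Fix 2: fail when configs is NOT a directory.
    if not os.path.isdir(configs):
        raise NotADirectoryError(f"'{configs}' is not a directory")
    input_path = os.path.join(configs, input_file)
    # Fix 3: fail when the input file is NOT a file.
    if not os.path.isfile(input_path):
        raise FileNotFoundError(f"'{input_path}' is not a file")
    return input_path


# Demo on a throwaway directory tree (assumed layout).
with tempfile.TemporaryDirectory() as ext_dir:
    configs_dir = os.path.join(ext_dir, "dramsim3", "configs")
    os.makedirs(configs_dir)
    open(os.path.join(configs_dir, "example.ini"), "w").close()
    found_name = os.path.basename(
        resolve_dramsim3_input(ext_dir, "example.ini"))
    try:
        resolve_dramsim3_input(ext_dir, "missing.ini")
        missing_rejected = False
    except FileNotFoundError:
        missing_rejected = True
```

Without the `not`, each check would reject valid paths and accept invalid ones, which is the inverted behaviour the patch describes.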

Change-Id: I185f4832c1c2f1ecc4e138c148ad7969ef9b6fd4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71038
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2023-06-01 22:45:44 +00:00
Matthew Poremba
ebd5b3e4ae gpu-compute: Gfx version check for FS and SE mode
There is no GPU device in SE mode to get version from and no GPU driver
in FS mode to get version from, so a conditional needs to be added
depending on the mode to get the gfx version.

Change-Id: I33fdafb60d351ebc5148e2248244537fb5bebd31
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71078
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2023-06-01 00:15:02 +00:00
Yu-hsin Wang
bd1d72f61e fastmodel: add src include path by default
We have some customized protocols in the gem5 repository, and they require
the include path from the src directory. As a result, users of those
protocols must handle the include path correctly by themselves, which is
tedious and fragile. We should add the default include path to the
SIMGEN command line to prevent issues.

Change-Id: I2a3748646567635d131a8fb4099e02e332691e97
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71118
Reviewed-by: Wei-Han Chen <weihanchen@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2023-05-31 23:47:30 +00:00
2389 changed files with 208433 additions and 174046 deletions

@@ -0,0 +1,37 @@
{
"name": "gem5 Development Container",
"image": "ghcr.io/gem5/devcontainer:latest",
"hostRequirements": {
"cpus": 8,
"memory": "16gb",
"storage": "32gb"
},
"customizations": {
"vscode": {
"extensions": [
"eamodio.gitlens",
"GitHub.copilot",
"GitHub.copilot-chat",
"GitHub.vscode-pull-request-github",
"ms-python.debugpy",
"ms-python.isort",
"ms-python.python",
"ms-python.vscode-pylance",
"ms-vscode.cpptools",
"ms-vscode.cpptools-extension-pack",
"ms-vscode.cpptools-themes",
"ms-vscode.makefile-tools",
"ms-vscode-remote.remote-containers",
"Tsinghua-Hexin-Joint-Institute.gem5-slicc",
"VisualStudioExptTeam.vscodeintellicode"
]
}
},
"features": {
"ghcr.io/devcontainers/features/docker-in-docker:2": {},
"ghcr.io/devcontainers/features/github-cli:1": {},
"ghcr.io/devcontainers-contrib/features/actionlint:1": {},
"ghcr.io/devcontainers-contrib/features/vscode-cli:1": {}
},
"onCreateCommand": "./.devcontainer/on-create.sh"
}

@@ -1,4 +1,6 @@
# Copyright (c) 2022 The Regents of the University of California
#!/bin/bash
# Copyright (c) 2024 The Regents of the University of California
# All Rights Reserved.
#
# Redistribution and use in source and binary forms, with or without
@@ -24,13 +26,16 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
FROM gcr.io/gem5-test/ubuntu-22.04_min-dependencies:latest as source
RUN apt -y update && apt -y install git
RUN git clone -b develop https://github.com/gem5/gem5/ /gem5
WORKDIR /gem5
RUN scons -j`nproc` build/ALL/gem5.fast
# This script is run when the Docker container specified in devcontainer.json
# is created.
FROM gcr.io/gem5-test/ubuntu-22.04_min-dependencies:latest
COPY --from=source /gem5/build/ALL/gem5.fast /usr/local/bin/gem5
set -e
ENTRYPOINT [ "/usr/local/bin/gem5" ]
# Making the downloaded repository safe as the owner might differ for .devcontainer env.
git config --global --add safe.directory /workspaces/gem5
# Refresh the git index.
git update-index
# Install the pre-commit checks.
./util/pre-commit-install.sh

@@ -29,3 +29,9 @@ c3bd8eb1214cbebbc92c7958b80aa06913bce3ba
# A commit which ran flynt all Python files.
e73655d038cdfa68964109044e33c9a6e7d85ac9
# A commit which ran pre-commit on ext/testlib
9e1afdecefaf910fa6e266f29dc480a32b0fa83e
# Updated black from 22.6.0 to 23.9.1
ddf6cb88e48df4ac7de4a9e4b612daf2e7e635c8

.github/dependabot.yml

@@ -0,0 +1,17 @@
---
version: 2
updates:
- package-ecosystem: pip
directory: /
schedule:
interval: monthly
assignees:
- Harshil2107
commit-message:
prefix: 'misc: '
# Raise pull requests for version updates
# to pip against the `develop` branch
target-branch: develop
# Labels on pull requests for version updates only
labels:
- misc

@@ -5,7 +5,7 @@ name: CI Tests
on:
pull_request:
types: [opened, edited, synchronize, ready_for_review]
types: [opened, synchronize, ready_for_review]
concurrency:
group: ${{ github.workflow }}-${{ github.ref || github.run_id }}
@@ -14,50 +14,55 @@ concurrency:
jobs:
pre-commit:
# runs on github hosted runner
runs-on: ubuntu-22.04
runs-on: ubuntu-latest
if: github.event.pull_request.draft == false
steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v3
- uses: pre-commit/action@v3.0.0
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
- uses: pre-commit/action@v3.0.1
# ensures we have a change-id in every commit, needed for gerrit
check-for-change-id:
# runs on github hosted runner
runs-on: ubuntu-22.04
if: github.event.pull_request.draft == false
get-date:
# We use the date to label caches. A cache is a "hit" if the requested
# binary and date are the same as what is stored in the cache.
# This essentially means the first job to run on a given day for a given
# binary will always be a "miss" and will have to build the binary then
# upload it as that day's binary. While this isn't the most
# efficient way to do this, the alternative was to take a hash of the
# `src` directory contents and use it as the key. We found there to be bugs
# with the hash function where this task would time out. This approach is
# simple, works, and still provides some level of caching.
runs-on: ubuntu-latest
outputs:
date: ${{ steps.date.outputs.date }}
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Check for Change-Id
run: |
# loop through all the commits in the pull request
for commit in $(git rev-list ${{ github.event.pull_request.base.sha }}..${{ github.event.pull_request.head.sha }}); do
git checkout $commit
if (git log -1 --pretty=format:"%B" | grep -q "Change-Id: ")
then
# passes as long as at least one change-id exists in the pull request
exit 0
fi
done
# if we reach this part, none of the commits had a change-id
echo "None of the commits in this pull request contains a Change-ID, which we require for any changes made to gem5. "\
"To automatically insert one, run the following:\n f=`git rev-parse --git-dir`/hooks/commit-msg ; mkdir -p $(dirname $f) ; "\
"curl -Lo $f https://gerrit-review.googlesource.com/tools/hooks/commit-msg ; chmod +x $f\n Then amend the commit with git commit --amend --no-edit, and update your pull request."
exit 1
- name: Get the current date
id: date
run: echo "date=$(date +'%Y-%m-%d')" >> $GITHUB_OUTPUT
unittests-all-opt:
runs-on: [self-hosted, linux, x64]
if: github.event.pull_request.draft == false
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
needs: [pre-commit, check-for-change-id] # only runs if pre-commit and change-id passes
container: ghcr.io/gem5/ubuntu-24.04_all-dependencies:latest
needs: [pre-commit, get-date] # only runs if pre-commit passes.
timeout-minutes: 60
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
# Restore the cache if available. As this just builds the unittests
# we only obtain the cache and do not provide it if it is not
# available.
- name: Cache build/ALL
uses: actions/cache/restore@v4
with:
path: build/ALL
key: testlib-build-all-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-all
- name: CI Unittests
working-directory: ${{ github.workspace }}
run: scons build/ALL/unittests.opt -j $(nproc)
run: scons --no-compress-debug build/ALL/unittests.opt -j $(nproc)
- run: echo "This job's status is ${{ job.status }}."
testlib-quick-matrix:
@@ -65,15 +70,15 @@ jobs:
if: github.event.pull_request.draft == false
# In order to make sure the environment is exactly the same, we run in
# the same container we use to build gem5 and run the testlib tests. This
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
needs: [pre-commit, check-for-change-id]
container: ghcr.io/gem5/ubuntu-24.04_all-dependencies:latest
needs: [pre-commit]
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
# Unfortunately the 'ubunutu-latest' image doesn't have jq installed.
# We therefore need to install it as a step here.
# Unfortunately the image doesn't have jq installed.
# We therefore need to install it as a step here.
- name: Install jq
run: apt install -y jq
run: apt update && apt install -y jq
- name: Get directories for testlib-quick
working-directory: ${{ github.workspace }}/tests
@@ -89,18 +94,44 @@ jobs:
build-matrix: ${{ steps.build-matrix.outputs.build-matrix }}
test-dirs-matrix: ${{ steps.dir-matrix.outputs.test-dirs-matrix }}
clang-fast-compilation:
# gem5 binaries built in `quick-gem5-builds` always use GCC.
# Clang is stricter than GCC. This job checks that gem5 compiles
# with Clang. It compiles build/ALL/gem5.fast to maximize the chance
# of compilation errors being exposed.
runs-on: [self-hosted, linux, x64]
if: github.event.pull_request.draft == false
container: ghcr.io/gem5/clang-version-18:latest
needs: [pre-commit]
timeout-minutes: 90
steps:
- uses: actions/checkout@v4
- name: Clang Compilation
working-directory: ${{ github.workspace }}
run: scons build/ALL/gem5.fast -j $(nproc)
testlib-quick-gem5-builds:
runs-on: [self-hosted, linux, x64]
if: github.event.pull_request.draft == false
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
needs: [pre-commit, check-for-change-id, testlib-quick-matrix]
container: ghcr.io/gem5/ubuntu-24.04_all-dependencies:latest
needs: [pre-commit, testlib-quick-matrix, get-date]
strategy:
matrix:
build-target: ${{ fromJson(needs.testlib-quick-matrix.outputs.build-matrix) }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Cache build/ALL
uses: actions/cache@v4
if: ${{ endsWith(matrix.build-target, 'build/ALL/gem5.opt') }}
with:
path: build/ALL
key: testlib-build-all-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-all
- name: Build gem5
run: scons ${{ matrix.build-target }} -j $(nproc)
run: scons --no-compress-debug ${{ matrix.build-target }} -j $(nproc)
# Upload the gem5 binary as an artifact.
# Note: the "anchor.txt" file is a hack to make sure the paths are
@@ -111,7 +142,7 @@ jobs:
# stripping the "build" directory. By adding the "anchor.txt" file, we
# ensure the "build" directory is preserved.
- run: echo "anchor" > anchor.txt
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4
with:
name: ci-tests-${{ github.run_number }}-testlib-quick-all-gem5-builds
path: |
@@ -122,8 +153,8 @@ jobs:
testlib-quick-execution:
runs-on: [self-hosted, linux, x64]
if: github.event.pull_request.draft == false
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
needs: [pre-commit, check-for-change-id, testlib-quick-matrix, testlib-quick-gem5-builds]
container: ghcr.io/gem5/ubuntu-24.04_all-dependencies:latest
needs: [pre-commit, testlib-quick-matrix, testlib-quick-gem5-builds]
timeout-minutes: 360 # 6 hours
strategy:
fail-fast: false
@@ -134,8 +165,8 @@ jobs:
run: rm -rf ./* || true rm -rf ./.??* || true rm -rf ~/.cache || true
# Checkout the repository then download the gem5.opt artifact.
- uses: actions/checkout@v3
- uses: actions/download-artifact@v3
- uses: actions/checkout@v4
- uses: actions/download-artifact@v4
with:
name: ci-tests-${{ github.run_number }}-testlib-quick-all-gem5-builds
@@ -159,21 +190,98 @@ jobs:
run: echo "sanatized-test-dir=$(echo '${{ matrix.test-dir }}' | sed 's/\//-/g')" >> $GITHUB_OUTPUT
# Upload the tests/testing-results directory as an artifact.
- name: Upload test results
- name: upload results
if: success() || failure()
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
with:
name: ci-tests-run-${{ github.run_number }}-attempt-${{ github.run_attempt }}-testlib-quick-${{ steps.sanitize-test-dir.outputs.sanatized-test-dir
}}-status-${{ steps.run-tests.outcome }}-output
path: tests/testing-results
retention-days: 30
testlib-quick:
pyunit:
runs-on: [self-hosted, linux, x64]
if: github.event.pull_request.draft == false
container: ghcr.io/gem5/ubuntu-24.04_all-dependencies:latest
needs: [pre-commit, testlib-quick-gem5-builds]
timeout-minutes: 30
steps:
# Checkout the repository then download the builds.
- uses: actions/checkout@v4
- uses: actions/download-artifact@v4
with:
name: ci-tests-${{ github.run_number }}-testlib-quick-all-gem5-builds
# Check that the gem5 binaries exist and are executable.
- name: Chmod gem5.{opt,debug,fast} to be executable
run: |
find . -name "gem5.opt" -exec chmod u+x {} \;
find . -name "gem5.debug" -exec chmod u+x {} \;
find . -name "gem5.fast" -exec chmod u+x {} \;
# Run the pyunit tests.
# Note: these are all quick tests.
- name: Run The pyunit tests
id: run-tests
working-directory: ${{ github.workspace }}/tests
run: ./main.py run --skip-build -vv -j$(nproc) pyunit
# Upload the tests/testing-results directory as an artifact.
- name: Upload pyunit test results
if: success() || failure()
uses: actions/upload-artifact@v4
with:
name: ci-tests-run-${{ github.run_number }}-attempt-${{ github.run_attempt }}-pyunit-status-${{ steps.run-tests.outcome }}-output
path: tests/testing-results
retention-days: 30
gpu-tests:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/gcn-gpu:latest
timeout-minutes: 180
needs: [pre-commit, get-date]
steps:
- uses: actions/checkout@v4
# Obtain the cache if available. If not available this will upload
# this job's instance of the cache.
- name: Cache build/VEGA_X86
uses: actions/cache@v4
with:
path: build/VEGA_X86
key: testlib-build-vega-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-vega
# Build the VEGA_X86/gem5.opt binary.
- name: Build VEGA_X86/gem5.opt
run: scons --no-compress-debug build/VEGA_X86/gem5.opt -j`nproc`
# Run the GPU tests.
- name: Run Testlib GPU Tests
working-directory: ${{ github.workspace }}/tests
run: ./main.py run --skip-build -vvv -t $(nproc) --host gcn_gpu gem5/gpu
# Upload the tests/testing-results directory as an artifact.
- name: Upload results
if: success() || failure()
uses: actions/upload-artifact@v4
with:
name: ci-tests-run-${{ github.run_number }}-attempt-${{ github.run_attempt }}-gpu-status-${{ steps.run-tests.outcome }}-output
path: tests/testing-results
retention-days: 30
ci-tests:
# It is 'testlib-quick' which needs to pass for the pull request to be
# merged. The 'testlib-quick-execution' is a matrix job which runs all the
# the testlib quick tests. This job is therefore a stub which will pass if
# all the testlib-quick-execution jobs pass.
runs-on: ubuntu-22.04
needs: testlib-quick-execution
# merged. This job is a dummy job that depends on all the other jobs.
runs-on: ubuntu-latest
needs:
- testlib-quick-execution
- pyunit
- clang-fast-compilation
- unittests-all-opt
- pre-commit
- gpu-tests
steps:
- run: echo "This job's status is ${{ job.status }}."

@@ -4,10 +4,7 @@
name: Compiler Tests
on:
# Runs every Friday from 7AM UTC
schedule:
- cron: 00 7 * * 5
# Allows us to manually start workflow for testing
# This is triggered weekly via the 'scheduler.yaml' workflow.
workflow_dispatch:
jobs:
@@ -16,19 +13,14 @@ jobs:
strategy:
fail-fast: false
matrix:
image: [gcc-version-12, gcc-version-11, gcc-version-10, gcc-version-9, gcc-version-8, clang-version-16, clang-version-15, clang-version-14,
clang-version-13, clang-version-12, clang-version-11, clang-version-10, clang-version-9, clang-version-8, clang-version-7, ubuntu-20.04_all-dependencies,
ubuntu-22.04_all-dependencies, ubuntu-22.04_min-dependencies]
image: [gcc-version-14, gcc-version-13, gcc-version-12, gcc-version-11, gcc-version-10, clang-version-18, clang-version-17, clang-version-16,
clang-version-15, clang-version-14, ubuntu-22.04_all-dependencies, ubuntu-24.04_all-dependencies, ubuntu-24.04_min-dependencies]
opts: [.opt, .fast]
runs-on: [self-hosted, linux, x64]
timeout-minutes: 2880 # 48 hours
container: ghcr.io/gem5/${{ matrix.image }}:latest
steps:
- uses: actions/checkout@v3
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- uses: actions/checkout@v4
- name: Compile build/ALL/gem5${{ matrix.opts }} with ${{ matrix.image }}
run: /usr/bin/env python3 /usr/bin/scons --ignore-style build/ALL/gem5${{ matrix.opts }} -j$(nproc)
timeout-minutes: 600 # 10 hours
@@ -38,20 +30,15 @@ jobs:
strategy:
fail-fast: false
matrix:
gem5-compilation: [ARM, ARM_MESI_Three_Level, ARM_MESI_Three_Level_HTM, ARM_MOESI_hammer, Garnet_standalone, GCN3_X86, MIPS, 'NULL', NULL_MESI_Two_Level,
NULL_MOESI_CMP_directory, NULL_MOESI_CMP_token, NULL_MOESI_hammer, POWER, RISCV, SPARC, X86, X86_MI_example, X86_MOESI_AMD_Base, VEGA_X86,
GCN3_X86]
image: [gcc-version-12, clang-version-16]
gem5-compilation: [ARM, ARM_MESI_Three_Level, ARM_MESI_Three_Level_HTM, ARM_MOESI_hammer, Garnet_standalone, MIPS, 'NULL', NULL_MESI_Two_Level,
NULL_MOESI_CMP_directory, NULL_MOESI_CMP_token, NULL_MOESI_hammer, POWER, RISCV, SPARC, X86, X86_MI_example, X86_MOESI_AMD_Base, VEGA_X86]
image: [gcc-version-14, clang-version-18]
opts: [.opt]
runs-on: [self-hosted, linux, x64]
timeout-minutes: 2880 # 48 hours
container: ghcr.io/gem5/${{ matrix.image }}:latest
steps:
- uses: actions/checkout@v3
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- uses: actions/checkout@v4
- name: Compile build/${{ matrix.gem5-compilation }}/gem5${{ matrix.opts }} with ${{ matrix.image }}
run: /usr/bin/env python3 /usr/bin/scons --ignore-style build/${{ matrix.gem5-compilation }}/gem5${{ matrix.opts }} -j$(nproc)
timeout-minutes: 600 # 10 hours
@@ -62,7 +49,7 @@ jobs:
# I.e., if we want to stop pull requests from being merged if the
# compiler tests are failing, we can add this job as a required status
# check.
runs-on: ubuntu-22.04
runs-on: ubuntu-latest
needs:
- latest-compilers-all-gem5-builds
- all-compilers

@@ -4,56 +4,19 @@
name: Daily Tests
on:
# Runs every day from 7AM UTC
schedule:
- cron: 0 7 * * *
# This is triggered weekly via the 'scheduler.yaml' workflow.
workflow_dispatch:
jobs:
name-artifacts:
get-date:
runs-on: ubuntu-latest
outputs:
build-name: ${{ steps.artifact-name.outputs.name }}
date: ${{ steps.date.outputs.date }}
steps:
- uses: actions/checkout@v2
- id: artifact-name
run: echo "name=$(date +"%Y-%m-%d_%H.%M.%S-")" >> $GITHUB_OUTPUT
build-gem5:
strategy:
fail-fast: false
matrix:
# NULL is in quotes since it is considered a keyword in yaml files
image: [ALL, ALL_CHI, ARM, ALL_MSI, ALL_MESI_Two_Level, 'NULL', NULL_MI_example, RISCV, VEGA_X86]
# this allows us to pass additional command line parameters
# the default is to add -j $(nproc), but some images
# require more specifications when built
include:
- command-line: -j $(nproc)
- image: ALL_CHI
command-line: --default=ALL PROTOCOL=CHI -j $(nproc)
- image: ALL_MSI
command-line: --default=ALL PROTOCOL=MSI -j $(nproc)
- image: ALL_MESI_Two_Level
command-line: --default=ALL PROTOCOL=MESI_Two_Level -j $(nproc)
- image: NULL_MI_example
command-line: --default=NULL PROTOCOL=MI_example -j $(nproc)
runs-on: [self-hosted, linux, x64]
needs: name-artifacts
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
steps:
- uses: actions/checkout@v3
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- name: Build gem5
run: scons build/${{ matrix.image }}/gem5.opt ${{ matrix.command-line }}
- uses: actions/upload-artifact@v3
with:
name: ${{ needs.name-artifacts.outputs.build-name }}${{ matrix.image }}
path: build/${{ matrix.image }}/gem5.opt
retention-days: 5
- run: echo "This job's status is ${{ job.status }}."
- name: Get the current date
id: date
run: echo "date=$(date +'%Y-%m-%d')" >> $GITHUB_OUTPUT
# this builds both unittests.fast and unittests.debug
unittests-fast-debug:
@@ -61,14 +24,18 @@ jobs:
matrix:
type: [fast, debug]
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
container: ghcr.io/gem5/ubuntu-24.04_all-dependencies:latest
timeout-minutes: 60
needs: get-date
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Cache build/ALL
uses: actions/cache/restore@v4
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
path: build/ALL
key: testlib-build-all-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-all
- name: ALL/unittests.${{ matrix.type }} UnitTests
run: scons build/ALL/unittests.${{ matrix.type }} -j $(nproc)
@@ -77,87 +44,42 @@ jobs:
strategy:
fail-fast: false
matrix:
test-type: [arm_boot_tests, fs, gpu, insttest_se, learning_gem5, m5threads_test_atomic, memory, multi_isa, replacement_policies, riscv_boot_tests,
test-type: [arm_boot_tests, fs, gpu, insttest_se, learning_gem5, m5threads_test_atomic, memory, replacement_policies, riscv_boot_tests,
stdlib, x86_boot_tests]
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
needs: [name-artifacts, build-gem5]
container: ghcr.io/gem5/ubuntu-24.04_all-dependencies:latest
timeout-minutes: 1440 # 24 hours for entire matrix to run
needs: get-date
steps:
- name: Clean runner
run: rm -rf ./* || true rm -rf ./.??* || true rm -rf ~/.cache || true
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Restore build/NULL cache
uses: actions/cache@v4
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
# download all artifacts for each test
# since long tests can't start until the build matrix completes,
# we download all artifacts from the build for each test
# in this matrix
- uses: actions/download-artifact@v3
with:
name: ${{needs.name-artifacts.outputs.build-name}}ALL
path: build/ALL
- run: chmod u+x build/ALL/gem5.opt
- uses: actions/download-artifact@v3
with:
name: ${{needs.name-artifacts.outputs.build-name}}ALL_CHI
path: build/ALL_CHI
- run: chmod u+x build/ALL_CHI/gem5.opt
- uses: actions/download-artifact@v3
with:
name: ${{needs.name-artifacts.outputs.build-name}}ARM
path: build/ARM
- run: chmod u+x build/ARM/gem5.opt
- uses: actions/download-artifact@v3
with:
name: ${{needs.name-artifacts.outputs.build-name}}ALL_MSI
path: build/ALL_MSI
- run: chmod u+x build/ALL_MSI/gem5.opt
- uses: actions/download-artifact@v3
with:
name: ${{needs.name-artifacts.outputs.build-name}}ALL_MESI_Two_Level
path: build/ALL_MESI_Two_Level
- run: chmod u+x build/ALL_MESI_Two_Level/gem5.opt
- uses: actions/download-artifact@v3
with:
name: ${{needs.name-artifacts.outputs.build-name}}NULL
path: build/NULL
- run: chmod u+x build/NULL/gem5.opt
- uses: actions/download-artifact@v3
key: testlib-build-null-${{ needs.get-date.outputs.date }}
- name: Restore build/ALL cache
uses: actions/cache@v4
with:
name: ${{needs.name-artifacts.outputs.build-name}}NULL_MI_example
path: build/NULL_MI_example
- run: chmod u+x build/NULL_MI_example/gem5.opt
- uses: actions/download-artifact@v3
with:
name: ${{needs.name-artifacts.outputs.build-name}}RISCV
path: build/RISCV
- run: chmod u+x build/RISCV/gem5.opt
- uses: actions/download-artifact@v3
with:
name: ${{needs.name-artifacts.outputs.build-name}}VEGA_X86
path: build/VEGA_X86
- run: chmod u+x build/VEGA_X86/gem5.opt
# run test
path: build/ALL
key: testlib-build-all-${{ needs.get-date.outputs.date }}
- name: long ${{ matrix.test-type }} tests
working-directory: ${{ github.workspace }}/tests
run: ./main.py run gem5/${{ matrix.test-type }} --length=long --skip-build -vv -t $(nproc)
- name: create zip of results
run: ./main.py run gem5/${{ matrix.test-type }} -j$(nproc) --length=long -vv -t $(nproc)
- name: upload results
if: success() || failure()
run: |
apt-get -y install zip
zip -r output.zip tests/testing-results
- name: upload zip
if: success() || failure()
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
env:
MY_STEP_VAR: ${{ matrix.test-type }}_COMMIT.${{github.sha}}_RUN.${{github.run_id}}_ATTEMPT.${{github.run_attempt}}
with:
name: ${{ env.MY_STEP_VAR }}
path: output.zip
path: tests/testing-results
retention-days: 7
- run: echo "This job's status is ${{ job.status }}."
# split library example tests into runs based on Suite UID
@@ -169,42 +91,72 @@ jobs:
matrix:
test-type: [gem5-library-example-x86-ubuntu-run-ALL-x86_64-opt, gem5-library-example-riscv-ubuntu-run-ALL-x86_64-opt, lupv-example-ALL-x86_64-opt,
gem5-library-example-arm-ubuntu-run-test-ALL-x86_64-opt, gem5-library-example-riscvmatched-hello-ALL-x86_64-opt]
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
needs: [name-artifacts, build-gem5]
container: ghcr.io/gem5/ubuntu-24.04_all-dependencies:latest
timeout-minutes: 1440 # 24 hours
needs: get-date
steps:
- name: Clean runner
run: rm -rf ./* || true rm -rf ./.??* || true rm -rf ~/.cache || true
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Cache build/ALL
uses: actions/cache@v4
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- uses: actions/download-artifact@v3
with:
name: ${{needs.name-artifacts.outputs.build-name}}ALL
path: build/ALL
- run: chmod u+x build/ALL/gem5.opt
key: testlib-build-all-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-all
- name: long ${{ matrix.test-type }} gem5_library_example_tests
working-directory: ${{ github.workspace }}/tests
run: ./main.py run --uid SuiteUID:tests/gem5/gem5_library_example_tests/test_gem5_library_examples.py:test-${{ matrix.test-type }} --length=long
--skip-build -vv
- name: create zip of results
run: ./main.py run --uid SuiteUID:tests/gem5/gem5_library_example_tests/test_gem5_library_examples.py:test-${{ matrix.test-type }} -j $(nproc)
--length=long -vv
- name: upload results
if: success() || failure()
run: |
apt-get -y install zip
zip -r output.zip tests/testing-results
- name: upload zip
if: success() || failure()
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
env:
MY_STEP_VAR: ${{ matrix.test-type }}_COMMIT.${{github.sha}}_RUN.${{github.run_id}}_ATTEMPT.${{github.run_attempt}}
with:
name: ${{ env.MY_STEP_VAR }}
path: output.zip
path: tests/testing-results
retention-days: 7
- run: echo "This job's status is ${{ job.status }}."
gpu-tests:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/gcn-gpu:latest
timeout-minutes: 720 # 12 hours
needs: get-date
steps:
- uses: actions/checkout@v4
with:
ref: develop
- name: Cache build/VEGA_X86
uses: actions/cache@v4
with:
path: build/VEGA_X86
key: testlib-build-vega-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-vega
- name: Build VEGA_X86/gem5.opt
working-directory: ${{ github.workspace }}
run: scons build/VEGA_X86/gem5.opt -j $(nproc)
- name: Run Testlib GPU Tests
working-directory: ${{ github.workspace }}/tests
run: ./main.py run --length=long -vvv --skip-build -t $(nproc) --host gcn_gpu gem5/gpu
- name: Upload results
if: success() || failure()
uses: actions/upload-artifact@v4
with:
name: gpu_tests_${{github.sha}}_RUN_${{github.run_id}}_ATTEMPT_${{github.run_attempt}}
path: tests/testing-results
retention-days: 7
# This runs the SST-gem5 integration compilation and tests it with
# ext/sst/sst/example.py.
sst-test:
@@ -213,11 +165,7 @@ jobs:
timeout-minutes: 180
steps:
- uses: actions/checkout@v3
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- uses: actions/checkout@v4
- name: Build RISCV/libgem5_opt.so with SST
run: scons build/RISCV/libgem5_opt.so --without-tcmalloc --duplicate-sources --ignore-style -j $(nproc)
- name: Makefile ext/sst
@@ -238,15 +186,13 @@ jobs:
timeout-minutes: 180
steps:
- uses: actions/checkout@v3
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- uses: actions/checkout@v4
- name: Build ARM/gem5.opt
run: scons build/ARM/gem5.opt --ignore-style --duplicate-sources -j$(nproc)
- name: disable systemc
run: scons setconfig build/ARM --ignore-style USE_SYSTEMC=n
- name: Build ARM/libgem5_opt.so
run: scons build/ARM/libgem5_opt.so --with-cxx-config --without-python --without-tcmalloc USE_SYSTEMC=0 -j$(nproc) --duplicate-sources
run: scons build/ARM/libgem5_opt.so --with-cxx-config --without-python --without-tcmalloc -j$(nproc) --duplicate-sources
- name: Compile gem5 within SystemC
working-directory: ${{ github.workspace }}/util/systemc/gem5_within_systemc
run: make
@@ -255,45 +201,13 @@ jobs:
- name: Continue gem5 within SystemC test
run: LD_LIBRARY_PATH=build/ARM/:/opt/systemc/lib-linux64/ ./util/systemc/gem5_within_systemc/gem5.opt.sc m5out/config.ini
# Runs the gem5 Nighyly GPU tests.
gpu-tests:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/gcn-gpu:latest
timeout-minutes: 720 # 12 hours
steps:
- uses: actions/checkout@v3
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- name: Compile build/GCN3_X86/gem5.opt
run: scons build/GCN3_X86/gem5.opt -j $(nproc)
- name: Get Square test-prog from gem5-resources
uses: wei/wget@v1
with:
args: -q http://dist.gem5.org/dist/develop/test-progs/square/square # Removed -N bc it wasn't available within actions, should be okay bc workspace is clean every time: https://github.com/coder/sshcode/issues/102
- name: Run Square test with GCN3_X86/gem5.opt (SE mode)
run: |
mkdir -p tests/testing-results
./build/GCN3_X86/gem5.opt configs/example/apu_se.py --reg-alloc-policy=dynamic -n3 -c square
- name: Get allSyncPrims-1kernel from gem5-resources
uses: wei/wget@v1
with:
args: -q http://dist.gem5.org/dist/develop/test-progs/heterosync/gcn3/allSyncPrims-1kernel # Removed -N bc it wasn't available within actions, should be okay bc workspace is clean every time
- name: Run allSyncPrims-1kernel sleepMutex test with GCN3_X86/gem5.opt (SE mode)
run: ./build/GCN3_X86/gem5.opt configs/example/apu_se.py --reg-alloc-policy=dynamic -n3 -c allSyncPrims-1kernel --options="sleepMutex 10 16
4"
- name: Run allSyncPrims-1kernel lfTreeBarrUsing test with GCN3_X86/gem5.opt (SE mode)
run: ./build/GCN3_X86/gem5.opt configs/example/apu_se.py --reg-alloc-policy=dynamic -n3 -c allSyncPrims-1kernel --options="lfTreeBarrUniq
10 16 4"
daily-tests:
# The dummy job is used to indicate whether the daily tests have
# passed or not. This can be used as status check for pull requests.
# I.e., if we want to stop pull requests from being merged if the
# daily tests are failing we can add this job as a required status
# check.
runs-on: ubuntu-22.04
runs-on: ubuntu-latest
needs:
- unittests-fast-debug
- testlib-long-tests


@@ -1,54 +1,65 @@
---
name: Docker images build and push
#on:
# push:
# branches:
# - 'develop'
# paths:
# - util/dockerfiles/**
on:
workflow_dispatch:
jobs:
obtain-dockerfiles:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
obtain-targets:
runs-on: ubuntu-latest
outputs:
targets: ${{ steps.generate.outputs.targets }}
steps:
- uses: actions/checkout@v3
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- uses: actions/upload-artifact@v3
with:
name: dockerfiles
path: util/dockerfiles
- name: Checkout
uses: actions/checkout@v4
# This builds and pushes the docker image.
build-and-push:
- name: List targets
id: generate
uses: docker/bake-action/subaction/list-targets@v4
with:
target: default
workdir: util/dockerfiles
docker-buildx-bake:
runs-on: [self-hosted, linux, x64]
needs: obtain-dockerfiles
needs:
- obtain-targets
strategy:
fail-fast: false
matrix:
target: ${{ fromJson(needs.obtain-targets.outputs.targets) }}
permissions:
packages: write
contents: read
steps:
- uses: actions/download-artifact@v3
with:
name: dockerfiles
path: dockerfiles-docker-build
- name: Checkout
uses: actions/checkout@v4
- uses: docker/setup-qemu-action@v2
name: Setup QEMU
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- uses: docker/setup-buildx-action@v2
name: Set up Docker Buildx
- uses: docker/login-action@v2
name: Login to the GitHub Container Registry
- name: Login to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push with bake
uses: docker/bake-action@v4
- name: Build and push
uses: docker/bake-action@v5
with:
workdir: ./dockerfiles-docker-build
files: docker-bake.hcl
targets: ${{ matrix.target }}
workdir: util/dockerfiles
push: true


@@ -1,96 +0,0 @@
---
# This workflow runs all the Weekly GPU Tests.
# For now this file is kept separate as we are still developing and testing
# this workflow. It will eventually be merged with "weekly-tests.yaml"
name: Weekly Tests (GPU)
on:
# Runs every Sunday from 7AM UTC
schedule:
- cron: 00 7 * * 6
# Allows us to manually start workflow for testing
workflow_dispatch:
jobs:
build-gem5:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/gcn-gpu:latest
steps:
- uses: actions/checkout@v3
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- name: Build gem5
run: scons build/GCN3_X86/gem5.opt -j $(nproc) --ignore-style
- uses: actions/upload-artifact@v3
with:
name: weekly-test-${{ github.run_number }}-attempt-${{ github.run_attempt }}-gem5-build-gcn3
path: build/GCN3_X86/gem5.opt
retention-days: 5
- run: echo "This job's status is ${{ job.status }}."
LULESH-tests:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/gcn-gpu:latest
needs: build-gem5
timeout-minutes: 480 # 8 hours
steps:
- uses: actions/checkout@v3
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- name: Download build/GCN3_X86/gem5.opt
uses: actions/download-artifact@v3
with:
name: weekly-test-${{ github.run_number }}-attempt-${{ github.run_attempt }}-gem5-build-gcn3
path: build/GCN3_X86
# `download-artifact` does not preserve permissions so we need to set
# them again.
- run: chmod u+x build/GCN3_X86/gem5.opt
- name: Obtain LULESH
working-directory: ${{ github.workspace }}/lulesh
# Obtains the latest LULESH compatible with this version of gem5 via
# gem5 Resources.
run: build/GCN3_X86/gem5.opt util/obtain-resource.py lulesh -p lulesh
- name: Run LULESH tests
working-directory: ${{ github.workspace }}
run: |
build/GCN3_X86/gem5.opt configs/example/apu_se.py -n3 --mem-size=8GB --reg-alloc-policy=dynamic --benchmark-root="lulesh" -c \
lulesh 0.01 2
HACC-tests:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/gcn-gpu:latest
needs: build-gem5
timeout-minutes: 120 # 2 hours
steps:
- uses: actions/checkout@v3
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- uses: actions/download-artifact@v3
with:
name: weekly-test-${{ github.run_number }}-attempt-${{ github.run_attempt }}-gem5-build-gcn3
path: build/GCN3_X86
- run: chmod u+x build/GCN3_X86/gem5.opt
- name: make hip directory
run: mkdir hip
- name: Compile m5ops and x86
working-directory: ${{ github.workspace }}/util/m5
run: |
export TERM=xterm-256color
scons build/x86/out/m5
- name: Download tests
working-directory: ${{ github.workspace }}/hip
run: wget http://dist.gem5.org/dist/v22-1/test-progs/halo-finder/ForceTreeTest
- name: Run HACC tests
working-directory: ${{ github.workspace }}
run: |
build/GCN3_X86/gem5.opt configs/example/apu_se.py -n3 --reg-alloc-policy=dynamic --benchmark-root=hip -c ForceTreeTest --options="0.5 0.1 64 0.1 1 N 12 rcb"

91
.github/workflows/scheduler.yaml vendored Normal file

@@ -0,0 +1,91 @@
---
name: Workflow Scheduler
# GitHub scheduled workflows run on the default branch ('stable' in the case of
# gem5). This means for changes in a workflow to take effect, the default
# branch must be updated. This is not ideal as it requires regular commits into
# the stable branch. Ideally we just want to update the workflow on develop and
# have it run on the develop branch.
#
# This workflow is designed to run on the stable branch and will trigger other
# workflows on the develop branch.
#
# To do so we simply schedule this workflow to run every hour and use some
# simple bash logic to determine if the current time is when we want to run the
# other workflows.
on:
schedule:
# Runs every hour, 30 minutes past the hour.
- cron: 30 * * * *
env:
# This is the token used to authenticate with GitHub.
# It is required to run the `gh` CLI.
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
jobs:
schedule-workflows:
runs-on: ubuntu-latest
steps:
# This step is necessary to allow the `gh` CLI to be used in the
# following steps. The `gh` CLI is used to trigger the workflows
# and needs to be used inside the same repository where the
# workflows are defined.
- name: Checkout Repository
uses: actions/checkout@v4
- name: Record day and time
id: timedate-recorder
run: |
# `date +%H` returns the current hour as a number from
# `00` to `23`.
echo "HOUR=$(date +%H)" >> $GITHUB_OUTPUT
# `date +%u` returns the day of the week as a number from
# `1` to `7`.
# `1` is Monday and `7` is Sunday.
echo "DAY=$(date +%u)" >> $GITHUB_OUTPUT
- name: Daily Tests
env:
HOUR: ${{ steps.timedate-recorder.outputs.HOUR }}
run: |
# If current time is 7pm then run the workflow.
if [[ $HOUR == '19' ]]
then
gh workflow run daily-tests.yaml --ref develop >/dev/null
echo "Daily test scheduled to run on develop branch."
else
echo "Daily tests not scheduled."
fi
- name: Weekly Tests
env:
DAY: ${{ steps.timedate-recorder.outputs.DAY }}
HOUR: ${{ steps.timedate-recorder.outputs.HOUR }}
run: |
# If the current day is Friday and the time is 7pm then run
# the workflow.
if [[ $DAY == '5' ]] && [[ $HOUR == '19' ]]
then
gh workflow run weekly-tests.yaml --ref develop >/dev/null
echo "Weekly test scheduled to run on develop branch."
else
echo "Weekly tests not scheduled."
fi
- name: Compiler Tests
env:
DAY: ${{ steps.timedate-recorder.outputs.DAY }}
HOUR: ${{ steps.timedate-recorder.outputs.HOUR }}
run: |
# If the current day is Tuesday and the time is 9pm then run
# the workflow.
if [[ $DAY == '2' ]] && [[ $HOUR == '21' ]]
then
gh workflow run compiler-tests.yaml --ref develop >/dev/null
echo "Compiler tests scheduled to run on the develop branch."
else
echo "Compiler tests not scheduled."
fi
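
The three gating steps above share one pattern: compare the recorded hour and weekday against fixed values and dispatch a workflow on a match. A minimal sketch of that decision logic as a single function that can be checked locally without calling `gh`. The function name `workflows_due` is hypothetical and not part of the workflow; the hour and weekday values are the ones used in the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only: prints the workflow files the
# scheduler would dispatch for a given hour (00-23, as from `date +%H`) and
# weekday (1-7, Monday = 1, as from `date +%u`).
workflows_due() {
    local hour="$1" day="$2"

    # Daily tests: every day at 19:00.
    if [[ $hour == '19' ]]; then
        echo "daily-tests.yaml"
    fi

    # Weekly tests: Friday at 19:00.
    if [[ $day == '5' ]] && [[ $hour == '19' ]]; then
        echo "weekly-tests.yaml"
    fi

    # Compiler tests: Tuesday at 21:00.
    if [[ $day == '2' ]] && [[ $hour == '21' ]]; then
        echo "compiler-tests.yaml"
    fi
}

# Example: Friday at 19:00 matches both the daily and the weekly condition.
workflows_due 19 5
```

In the real workflow each printed name would correspond to one `gh workflow run <file> --ref develop` call; the sketch only prints the names so the schedule can be verified offline.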


@@ -4,98 +4,255 @@
name: Weekly Tests
on:
# Runs every Sunday from 7AM UTC
schedule:
- cron: 00 7 * * 6
# Allows us to manually start workflow for testing
# This is triggered weekly via the 'scheduler.yaml' workflow.
workflow_dispatch:
jobs:
build-gem5:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
outputs:
build-name: ${{ steps.artifact-name.outputs.name }}
steps:
- uses: actions/checkout@v3
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- id: artifact-name
run: echo "name=$(date +"%Y-%m-%d_%H.%M.%S")-ALL" >> $GITHUB_OUTPUT
- name: Build gem5
run: |
scons build/ALL/gem5.opt -j $(nproc)
- uses: actions/upload-artifact@v3
with:
name: ${{ steps.artifact-name.outputs.name }}
path: build/ALL/gem5.opt
retention-days: 5
- run: echo "This job's status is ${{ job.status }}."
# start running the very-long tests
get-date:
runs-on: ubuntu-latest
outputs:
date: ${{ steps.date.outputs.date }}
steps:
- name: Get the current date
id: date
run: echo "date=$(date +'%Y-%m-%d')" >> $GITHUB_ENV
# start running the very-long tests
testlib-very-long-tests:
strategy:
fail-fast: false
matrix:
test-type: [gem5_library_example_tests, gem5_resources, parsec_benchmarks, x86_boot_tests]
test-type: [gem5_library_example_tests, gem5_resources, stdlib, parsec_benchmarks, x86_boot_tests]
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
needs: [build-gem5]
container: ghcr.io/gem5/ubuntu-24.04_all-dependencies:latest
timeout-minutes: 4320 # 3 days
needs: get-date
steps:
- name: Clean runner
run: rm -rf ./* || true rm -rf ./.??* || true rm -rf ~/.cache || true
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Cache build/ALL
uses: actions/cache@v4
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- uses: actions/download-artifact@v3
with:
name: ${{needs.build-gem5.outputs.build-name}}
path: build/ALL
- run: chmod u+x build/ALL/gem5.opt
key: testlib-build-all-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-all
- name: very-long ${{ matrix.test-type }}
working-directory: ${{ github.workspace }}/tests
run: ./main.py run gem5/${{ matrix.test-type }} --length very-long --skip-build -vv -t $(nproc)
- name: create zip of results
run: ./main.py run gem5/${{ matrix.test-type }} --length very-long -j$(nproc) -vv
- name: upload results
if: success() || failure()
run: |
apt-get -y install zip
zip -r output.zip tests/testing-results
- name: upload zip
if: success() || failure()
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
env:
MY_STEP_VAR: ${{ matrix.test-type }}_COMMIT.${{github.sha}}_RUN.${{github.run_id}}_ATTEMPT.${{github.run_attempt}}
with:
name: ${{ env.MY_STEP_VAR }}
path: output.zip
path: tests/testing-results
retention-days: 7
- run: echo "This job's status is ${{ job.status }}."
dramsys-tests:
# The GPU tests are run in different jobs because they take a long time to run. This way we can run them in parallel on different runners.
gpu-test-hacc:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/ubuntu-22.04_all-dependencies:latest
timeout-minutes: 4320 # 3 days
container: ghcr.io/gem5/gcn-gpu:latest
timeout-minutes: 720 # 12 hours
needs: get-date
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
# Scheduled workflows run on the default branch by default. We
# therefore need to explicitly checkout the develop branch.
ref: develop
- name: Cache build/VEGA_X86
uses: actions/cache@v4
with:
path: build/VEGA_X86
key: testlib-build-vega-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-vega
- name: Run Testlib GPU Tests
working-directory: ${{ github.workspace }}/tests
run: ./main.py run --length=very-long -vvv -j $(nproc) --host gcn_gpu --uid SuiteUID:tests/gem5/gpu/test_gpu_apu_se.py:gpu-apu-se-hacc-VEGA_X86-gcn_gpu-opt
- name: Upload results
if: success() || failure()
uses: actions/upload-artifact@v4.0.0
with:
name: gpu_tests_hacc_${{github.sha}}_RUN_${{github.run_id}}_ATTEMPT_${{github.run_attempt}}
path: tests/testing-results
retention-days: 7
gpu-test-lulesh:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/gcn-gpu:latest
timeout-minutes: 720 # 12 hours
needs: get-date
steps:
- uses: actions/checkout@v4
with:
ref: develop
- name: Cache build/VEGA_X86
uses: actions/cache@v4
with:
path: build/VEGA_X86
key: testlib-build-vega-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-vega
- name: Run Testlib GPU Tests
working-directory: ${{ github.workspace }}/tests
run: ./main.py run --length=very-long -vvv -j $(nproc) --host gcn_gpu --uid SuiteUID:tests/gem5/gpu/test_gpu_apu_se.py:gpu-apu-se-lulesh-VEGA_X86-gcn_gpu-opt
- name: Upload results
if: success() || failure()
uses: actions/upload-artifact@v4.0.0
with:
name: gpu_tests_lulesh_${{github.sha}}_RUN_${{github.run_id}}_ATTEMPT_${{github.run_attempt}}
path: tests/testing-results
retention-days: 7
gpu-test-pannotia-bc:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/gcn-gpu:latest
timeout-minutes: 720 # 12 hours
needs: get-date
steps:
- uses: actions/checkout@v4
with:
ref: develop
- name: Cache build/VEGA_X86
uses: actions/cache@v4
with:
path: build/VEGA_X86
key: testlib-build-vega-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-vega
- name: Run Testlib GPU Tests
working-directory: ${{ github.workspace }}/tests
run: ./main.py run --length=very-long -vvv -j $(nproc) --host gcn_gpu --uid SuiteUID:tests/gem5/gpu/test_gpu_pannotia.py:gpu-apu-se-pannotia-bc-1k-128k-VEGA_X86-gcn_gpu-opt
- name: Upload results
if: success() || failure()
uses: actions/upload-artifact@v4.0.0
with:
name: gpu_tests_pannotia_bc_${{github.sha}}_RUN_${{github.run_id}}_ATTEMPT_${{github.run_attempt}}
path: tests/testing-results
retention-days: 7
gpu-test-pannotia-color-maxmin:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/gcn-gpu:latest
timeout-minutes: 720 # 12 hours
needs: get-date
steps:
- uses: actions/checkout@v4
with:
ref: develop
- name: Cache build/VEGA_X86
uses: actions/cache@v4
with:
path: build/VEGA_X86
key: testlib-build-vega-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-vega
- name: Run Testlib GPU Tests
working-directory: ${{ github.workspace }}/tests
run: ./main.py run --length=very-long -vvv -j $(nproc) --host gcn_gpu --uid SuiteUID:tests/gem5/gpu/test_gpu_pannotia.py:gpu-apu-se-pannotia-color-maxmin-1k-128k-VEGA_X86-gcn_gpu-opt
- name: Upload results
if: success() || failure()
uses: actions/upload-artifact@v4.0.0
with:
name: gpu_tests_pannotia_color_maxmin_${{github.sha}}_RUN_${{github.run_id}}_ATTEMPT_${{github.run_attempt}}
path: tests/testing-results
retention-days: 7
gpu-test-pannotia-color-max:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/gcn-gpu:latest
timeout-minutes: 720 # 12 hours
needs: get-date
steps:
- uses: actions/checkout@v4
with:
ref: develop
- name: Cache build/VEGA_X86
uses: actions/cache@v4
with:
path: build/VEGA_X86
key: testlib-build-vega-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-vega
- name: Run Testlib GPU Tests
working-directory: ${{ github.workspace }}/tests
run: ./main.py run --length=very-long -vvv -j $(nproc) --host gcn_gpu --uid SuiteUID:tests/gem5/gpu/test_gpu_pannotia.py:gpu-apu-se-pannotia-color-max-1k-128k-VEGA_X86-gcn_gpu-opt
- name: Upload results
if: success() || failure()
uses: actions/upload-artifact@v4.0.0
with:
name: gpu_tests_pannotia_color_max_${{github.sha}}_RUN_${{github.run_id}}_ATTEMPT_${{github.run_attempt}}
path: tests/testing-results
retention-days: 7
gpu-test-pannotia-fw-hip:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/gcn-gpu:latest
timeout-minutes: 2160 # 36 hours
needs: get-date
steps:
- uses: actions/checkout@v4
with:
ref: develop
- name: Cache build/VEGA_X86
uses: actions/cache@v4
with:
path: build/VEGA_X86
key: testlib-build-vega-${{ needs.get-date.outputs.date }}
restore-keys: |
testlib-build-vega
- name: Run Testlib GPU Tests
working-directory: ${{ github.workspace }}/tests
run: ./main.py run --length=very-long -vvv -j $(nproc) --host gcn_gpu --uid SuiteUID:tests/gem5/gpu/test_gpu_pannotia.py:gpu-apu-se-pannotia-fw-hip-1k-128k-VEGA_X86-gcn_gpu-opt
- name: Upload results
if: success() || failure()
uses: actions/upload-artifact@v4.0.0
with:
name: gpu_tests_pannotia_fw_hip_${{github.sha}}_RUN_${{github.run_id}}_ATTEMPT_${{github.run_attempt}}
path: tests/testing-results
retention-days: 7
dramsys-tests:
runs-on: [self-hosted, linux, x64]
container: ghcr.io/gem5/ubuntu-24.04_all-dependencies:latest
timeout-minutes: 4320 # 3 days
steps:
- uses: actions/checkout@v4
- name: Checkout DRAMSys
working-directory: ${{ github.workspace }}/ext/dramsys
run: |
git clone https://github.com/tukl-msd/DRAMSys DRAMSys
cd DRAMSys
git checkout -b gem5 09f6dcbb91351e6ee7cadfc7bc8b29d97625db8f
git submodule update --init --recursive
run: git clone https://github.com/tukl-msd/DRAMSys --branch v5.1 --depth 1 DRAMSys
# gem5 is built separately because it depends on the DRAMSys library
# gem5 is built separately because it depends on the DRAMSys library
- name: Build gem5
working-directory: ${{ github.workspace }}
run: scons build/ALL/gem5.opt -j $(nproc)
@@ -112,9 +269,15 @@ jobs:
# I.e., if we want to stop pull requests from being merged if the
# weekly tests are failing we can add this job as a required status
# check.
runs-on: ubuntu-22.04
runs-on: ubuntu-latest
needs:
- testlib-very-long-tests
- dramsys-tests
- gpu-test-hacc
- gpu-test-lulesh
- gpu-test-pannotia-bc
- gpu-test-pannotia-color-maxmin
- gpu-test-pannotia-color-max
- gpu-test-pannotia-fw-hip
steps:
- run: echo "The weekly tests have passed."
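
The cache keys in this workflow interpolate `needs.get-date.outputs.date`, which the `get-date` job maps from `steps.date.outputs.date`. As far as the GitHub Actions documentation describes it, step outputs are read from the file named by `$GITHUB_OUTPUT` (the form the scheduler workflow uses for `HOUR` and `DAY`), while `$GITHUB_ENV` sets environment variables for later steps of the same job. A minimal local sketch of writing the date so that it would surface as a step output. The temporary files stand in for the runner-provided ones and the read-back step is an assumption made purely for illustration.

```shell
#!/usr/bin/env bash
# Local stand-ins for the files the Actions runner normally provides
# (assumption for illustration; on a runner these variables are already set).
GITHUB_OUTPUT="$(mktemp)"
GITHUB_ENV="$(mktemp)"

# Write the date as a step output, using the same key=value form the
# scheduler workflow uses for HOUR and DAY.
echo "date=$(date +'%Y-%m-%d')" >> "$GITHUB_OUTPUT"

# Read the value back the way a consumer of the output would see it and
# build a cache key in the same shape as the workflow's key.
date_output="$(grep '^date=' "$GITHUB_OUTPUT" | cut -d= -f2)"
echo "cache key: testlib-build-all-${date_output}"
```

If the value were appended to `$GITHUB_ENV` instead, `$GITHUB_OUTPUT` would stay empty in this simulation and the key would end in a bare hyphen.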

4
.gitignore vendored

@@ -1,4 +1,5 @@
build
gem5_build
parser.out
parsetab.py
cscope.files
@@ -9,6 +10,7 @@ cscope.out
.*.swo
m5out
/src/doxygen/html
/docs/_build
/ext/dramsim2/DRAMSim2
/ext/mcpat/regression/*/*.out
/util/m5/*.o
@@ -32,3 +34,5 @@ configs/example/memcheck.cfg
configs/dram/lowp_sweep.cfg
.pyenv
.vscode
typings
.DS_Store

132
.mailmap

@@ -1,8 +1,11 @@
Abdul Mutaal Ahmad <abdul.mutaal@gmail.com>
adarshpatil <adarshpatil123@gmail.com>
Aditya K Kamath <a_kamath@hotmail.com> aditya <a_kamath@hotmail.com>
Adrià Armejach <adria.armejach@bsc.es> Adrià Armejach <adria.armejach@gmail.com>
Adrià Armejach <adria.armejach@bsc.es> Adrià Armejach <66964292+aarmejach@users.noreply.github.com>
Adrian Herrera <adrian.herrera@arm.com>
Adrien Pesle <adrien.pesle@arm.com>
Adwaith R Krishna <adwaithrk19@gmail.com>
Akash Bagdia <akash.bagdia@ARM.com> Akash Bagdia <akash.bagdia@arm.com>
Alec Roelke <alec.roelke@gmail.com> Alec Roelke <ar4jc@virginia.edu>
Alexander Klimov <Alexander.Klimov@arm.com>
@@ -10,21 +13,19 @@ Alexandru Dutu <alexandru.dutu@amd.com> Alexandru <alexandru.dutu@amd.com>
Alex Richardson <alexrichardson@google.com>
Ali Jafri <ali.jafri@arm.com>
Ali Saidi <Ali.Saidi@arm.com> Ali Saidi <ali.saidi@arm.com>
Ali Saidi <Ali.Saidi@arm.com> Ali Saidi <Ali.Saidi@ARM.com>
Ali Saidi <Ali.Saidi@arm.com> Ali Saidi <saidi@eecs.umich.edu>
Alistair Delva <adelva@google.com>
Alvaro Moreno <alvaro.moreno@bsc.es>
Amin Farmahini <aminfar@gmail.com>
Anders Handler <s052838@student.dtu.dk>
Andrea Mondelli <andrea.mondelli@huawei.com> Andrea Mondelli <andrea.mondelli@ucf.edu>
Andrea Mondelli <andrea.mondelli@huawei.com> Andrea Mondelli <Andrea.Mondelli@ucf.edu>
Andrea Pellegrini <andrea.pellegrini@gmail.com>
Andreas Hansson <andreas.hanson@arm.com> Andreas Hansson <andreas.hansson>
Andreas Hansson <andreas.hanson@arm.com> Andreas Hansson <andreas.hansson@arm.com>
Andreas Hansson <andreas.hanson@arm.com> Andreas Hansson <Andreas.Hansson@ARM.com>
Andreas Hansson <andreas.hanson@arm.com> Andreas Hansson <andreas.hansson@armm.com>
Andreas Sandberg <Andreas.Sandberg@arm.com> Andreas Sandberg <andreas.sandberg@arm.com>
Andreas Sandberg <Andreas.Sandberg@arm.com> Andreas Sandberg <Andreas.Sandberg@ARM.com>
Andreas Sandberg <Andreas.Sandberg@arm.com> Andreas Sandberg <andreas@sandberg.pp.se>
Andreas Sandberg <Andreas.Sandberg@arm.com> Andreas Sandberg <andreas@sandberg.uk>
Andrew Bardsley <Andrew.Bardsley@arm.com> Andrew Bardsley <Andreas.Bardsley@arm.com>
Andrew Lukefahr <lukefahr@umich.edu>
Andrew Schultz <alschult@umich.edu>
@@ -32,11 +33,14 @@ Andriani Mappoura <andriani.mappoura@arm.com>
Angie Lee <peiyinglee@google.com>
Anis Peysieux <anis.peysieux@inria.fr>
Ani Udipi <ani.udipi@arm.com>
anoop <mysanoop@gmail.com>
Anouk Van Laer <anouk.vanlaer@arm.com>
ARM gem5 Developers <none@none>
Arthur Perais <Arthur.Perais@univ-grenoble-alpes.fr> Arthur Perais <arthur.perais@inria.fr>
Arun Rodrigues <afrodri@gmail.com>
Ashkan Tousi <ashkan.tousimojarad@arm.com>
atrah22 <atul.rahman@outlook.com>
Atri Bhattacharyya <atri.bhattacharyya@epfl.ch>
Austin Harris <austinharris@utexas.edu> Austin Harris <mail@austin-harris.com>
Avishai Tvila <avishai.tvila@gmail.com>
Ayaz Akram <yazakram@ucdavis.edu>
@@ -48,6 +52,7 @@ Bjoern A. Zeeb <baz21@cam.ac.uk>
Blake Hechtman <bah13@duke.edu> Blake Hechtman <blake.hechtman@amd.com>
Blake Hechtman <bah13@duke.edu> Blake Hechtman ext:(%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E) <bah13@duke.edu>
Bobby R. Bruce <bbruce@ucdavis.edu> Bobby Bruce <bbruce@amarillo.cs.ucdavis.edu>
Bobby R. Bruce <bbruce@ucdavis.edu> Bobby Bruce <bbruce@ucdavis.edu>
Boris Shingarov <shingarov@gmail.com> Boris Shingarov <shingarov@labware.com>
Brad Beckmann <brad.beckmann@amd.com> Brad Beckmann <Brad.Beckmann@amd.com>
Brad Beckmann <brad.beckmann@amd.com> Brad Beckmann ext:(%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E) <Brad.Beckmann@amd.com>
@@ -60,15 +65,13 @@ Brian Grayson <b.grayson@samsung.com>
Cagdas Dirik <cdirik@micron.com> cdirik <cdirik@micron.com>
Carlos Falquez <c.falquez@fz-juelich.de>
Chander Sudanthi <chander.sudanthi@arm.com> Chander Sudanthi <Chander.Sudanthi@arm.com>
Chander Sudanthi <chander.sudanthi@arm.com> Chander Sudanthi <Chander.Sudanthi@ARM.com>
Charles Jamieson <cjamieson2@wisc.edu>
CHEN Meng <tundriolaxy@gmail.com>
Chen Meng <tundriolaxy@gmail.com>
Chen Zou <chenzou@uchicago.edu>
Chia-You Chen <hortune@google.com>
Chow, Marcus <marcus.chow@amd.com>
Marcus Chow <marcus.chow@amd.com>
Chris Adeniyi-Jones <Chris.Adeniyi-Jones@arm.com>
Chris Emmons <chris.emmons@arm.com> Chris Emmons <Chris.Emmons@arm.com>
Chris Emmons <chris.emmons@arm.com> Chris Emmons <Chris.Emmons@ARM.com>
Chris January <chris.january@arm.com>
Christian Menard <christian.menard@tu-dresden.de> Christian Menard <Christian.Menard@tu-dresden.de>
Christopher Torng <clt67@cornell.edu>
@@ -83,17 +86,19 @@ Daecheol You <daecheol.you@samsung.com>
Dam Sunwoo <dam.sunwoo@arm.com>
Dan Gibson <gibson@cs.wisc.edu>
Daniel Carvalho <odanrc@yahoo.com.br> Daniel <odanrc@yahoo.com.br>
Daniel Carvalho <odanrc@yahoo.com.br> Daniel Carvalho <odanrc@users.noreply.github.com>
Daniel Carvalho <odanrc@yahoo.com.br> Daniel R. Carvalho <odanrc@yahoo.com.br>
Daniel Gerzhoy <daniel.gerzhoy@gmail.com>
Daniel Johnson <daniel.johnson@arm.com>
Daniel Kouchekinia <DanKouch@users.noreply.github.com>
Daniel Sanchez <sanchezd@stanford.edu>
Davide Basilio Bartolini <davide.basilio.bartolini@huawei.com>
David Guillen-Fandos <david.guillen@arm.com> David Guillen <david.guillen@arm.com>
David Guillen-Fandos <david.guillen@arm.com> David Guillen Fandos <david.guillen@arm.com>
David Hashe <david.hashe@amd.com> David Hashe <david.j.hashe@gmail.com>
David Oehmke <doehmke@umich.edu>
David Schall <david.schall2@arm.com>
Derek Christ <dchrist@rhrk.uni-kl.de>
David Schall <david.schall@ed.ac.uk> David Schall <david.schall2@arm.com>
Derek Christ <dchrist@rhrk.uni-kl.de> Derek Christ <44267643+derchr@users.noreply.github.com>
Derek Hower <drh5@cs.wisc.edu>
Deyaun Guo <guodeyuan@tsinghua.org.cn> Deyuan Guo ext:(%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E) <guodeyuan@tsinghua.org.cn>
Deyaun Guo <guodeyuan@tsinghua.org.cn> Deyuan Guo <guodeyuan@tsinghua.org.cn>
@@ -107,11 +112,12 @@ Earl Ou <shunhsingou@google.com>
eavivi <eavivi@ucdavis.edu>
Éder F. Zulian <zulian@eit.uni-kl.de>
Edmund Grimley Evans <Edmund.Grimley-Evans@arm.com>
Eduardo José Gómez Hernández <eduardojose.gomez@um.es>
Eduardo José Gómez Hernández <eduardojose.gomez@um.es> Eduardo José Gómez Hernández <git@edujgh.net>
Eliot Moss <moss@cs.umass.edu>
Emilio Castillo <castilloe@unican.es> Emilio Castillo <ecastill@bsc.es>
Emilio Castillo <castilloe@unican.es> Emilio Castillo ext:(%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E) <castilloe@unican.es>
Emily Brickey <esbrickey@ucdavis.edu>
Emin Gadzhiev <e.gadzhiev.mhk@gmail.com>
Erfan Azarkhish <erfan.azarkhish@unibo.it>
Erhu <fengerhu.ipads@gmail.com>
Eric Van Hensbergen <eric.vanhensbergen@arm.com> Eric Van Hensbergen <Eric.VanHensbergen@ARM.com>
@@ -125,11 +131,12 @@ Gabe Black <gabe.black@gmail.com> Gabe Black <gabeblack@google.com>
Gabe Black <gabe.black@gmail.com> Gabe Black <gblack@eecs.umich.edu>
Gabe Loh <gabriel.loh@amd.com> gloh <none@none>
Gabor Dozsa <gabor.dozsa@arm.com>
Gabriel Busnot <gabriel.busnot@arteris.com>
Gabriel Busnot <gabriel.busnot@arteris.com> Gabriel Busnot <gabriel.busnot@cea.fr>
Gabriel Busnot <gabriel.busnot@arteris.com> Gabriel Busnot <gabibusnot@gmail.com>
gauravjain14 <gjain6@wisc.edu>
Gautham Pathak <gspathak@gitlab.uwaterloo.ca>
Gedare Bloom <gedare@rtems.org> Gedare Bloom <gedare@gwmail.gwu.edu>
Gene Wu <gene.wu@arm.com> Gene WU <gene.wu@arm.com>
Gene WU <gene.wu@arm.com> Gene Wu <Gene.Wu@arm.com>
Geoffrey Blake <geoffrey.blake@arm.com> Geoffrey Blake <blakeg@umich.edu>
Geoffrey Blake <geoffrey.blake@arm.com> Geoffrey Blake <Geoffrey.Blake@arm.com>
Georg Kotheimer <georg.kotheimer@mailbox.tu-dresden.de>
@@ -140,10 +147,14 @@ GWDx <gwdx@mail.ustc.edu.cn>
Hamid Reza Khaleghzadeh <khaleghzadeh@gmail.com> Hamid Reza Khaleghzadeh ext:(%2C%20Lluc%20Alvarez%20%3Clluc.alvarez%40bsc.es%3E%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E) <khaleghzadeh@gmail.com>
handsomeliu <handsomeliu@google.com>
Hanhwi Jang <jang.hanhwi@gmail.com>
Hoa Nguyen <hoanguyen@ucdavis.edu>
Harshil Patel <hpppatel@ucdavis.edu> Harshil Patel <harshilp2107@gmail.com>
Harshil Patel <hpppatel@ucdavis.edu> Harshil Patel <91860903+Harshil2107@users.noreply.github.com>
Wenjian He <wheac@connect.ust.hk>
HJikram <humzajahangirikram@gmail.com>
Hoa Nguyen <hn@hnpl.org> Hoa Nguyen <hoanguyen@ucdavis.edu>
Hongil Yoon <ongal@cs.wisc.edu>
Hsuan Hsu <hsuan.hsu@mediatek.com>
huangjs <jiasen.hjs@alibaba-inc.com>
hungweihsu <hungweihsu@google.com> hungweihsuG <145444687+hungweihsuG@users.noreply.github.com>
Hussein Elnawawy <hussein.elnawawy@gmail.com>
Ian Jiang <ianjiang.ict@gmail.com>
IanJiangICT <ianjiang.ict@gmail.com>
@@ -152,9 +163,13 @@ Iru Cai <mytbk920423@gmail.com>
Isaac Richter <isaac.richter@rochester.edu>
Isaac Sánchez Barrera <isaac.sanchez@bsc.es>
Ivan Pizarro <ivan.pizarro@metempsy.com>
Jack Whitham <jack-m5ml2@cs.york.ac.uk> Jack Whitman <jack-m5ml2@cs.york.ac.uk>
Ivan Turasov <turasov.ivan@gmail.com>
Ivana Mitrovic <imitrovic@ucdavis.edu> Ivana Mitrovic <ivanamit91@gmail.com>
Ivana Mitrovic <imitrovic@ucdavis.edu> ivanaamit <ivanamit91@gmail.com>
Jack Whitham <jack-m5ml2@cs.york.ac.uk>
Jairo Balart <jairo.balart@metempsy.com>
Jakub Jermar <jakub@jermar.eu>
James Braun <jebraun3@wisc.edu>
James Clarkson <james.clarkson@arm.com>
Jan-Peter Larsson <jan-peter.larsson@arm.com>
Jan Vrany <jan.vrany@labware.com>
@@ -174,8 +189,8 @@ Jayneel Gandhi <jayneel@cs.wisc.edu>
Jennifer Treichler <jtreichl@umich.edu>
Jerin Joy <joy@rivosinc.com>
Jiajie Chen <c@jia.je>
Jiasen Huang <jiasen.hjs@alibaba-inc.com>
Jiasen <jiasen.hjs@alibaba-inc.com>
Jiasen Huang <jiasen.hjs@alibaba-inc.com> Jiasen <jiasen.hjs@alibaba-inc.com>
Jiasen Huang <jiasen.hjs@alibaba-inc.com> huangjs <jiasen.hjs@alibaba-inc.com>
Jiayi Huang <jyhuang91@gmail.com>
jiegec <noc@jiegec.ac.cn>
Jieming Yin <jieming.yin@amd.com> jiemingyin <bjm419@gmail.com>
@@ -188,14 +203,17 @@ Joel Hestness <jthestness@gmail.com> Joel Hestness <hestness@cs.wisc.edu>
Joël Porquet-Lupine <joel@porquet.org>
John Alsop <johnathan.alsop@amd.com>
John Kalamatianos <john.kalamatianos@amd.com> jkalamat <john.kalamatianos@amd.com>
Johnny <johnnyko@google.com>
Jordi Vaquero <jordi.vaquero@metempsy.com>
Jose Marinho <jose.marinho@arm.com>
Juan M. Cebrian <jm.cebriangonzalez@gmail.com>
Jui-min Lee <fcrh@google.com>
kai.ren <kai.ren@streamcomputing.com> Kai Ren <binarystar2006@outlook.com>
Kai Ren <kai.ren@streamcomputing.com> kai.ren <kai.ren@streamcomputing.com>
Kai Ren <kai.ren@streamcomputing.com> Kai Ren <binarystar2006@outlook.com>
KaiBatley <68886332+KaiBatley@users.noreply.github.com>
Kanishk Sugand <kanishk.sugand@arm.com>
Karthik Sangaiah <karthik.sangaiah@arm.com>
Kaustav Goswami <kggoswami@ucdavis.edu>
Kaustav Goswami <kggoswami@ucdavis.edu> Kaustav Goswami <39310478+kaustav-goswami@users.noreply.github.com>
Kelly Nguyen <klynguyen@ucdavis.edu>
Ke Meng <mengke97@hotmail.com>
Kevin Brodsky <kevin.brodsky@arm.com>
@@ -206,11 +224,16 @@ Koan-Sin Tan <koansin.tan@gmail.com>
Korey Sewell <ksewell@umich.edu>
Krishnendra Nathella <Krishnendra.Nathella@arm.com> Krishnendra Nathella <krinat01@arm.com>
ksco <numbksco@gmail.com>
kunpai <kunpai@ucdavis.edu>
Kunal Pai <kunpai@ucdavis.edu> Kunal Pai <62979320+kunpai@users.noreply.github.com>
Kunal Pai <kunpai@ucdavis.edu> kunpai <kunpai@ucdavis.edu>
Kunal Pai <kunpai@ucdavis.edu> paikunal <kunpai@ucdavis.edu>
Kunal Pai <kunpai@ucdavis.edu> KUNAL PAI <kunpai@ucdavis.edu>
Kyle Roarty <kyleroarty1716@gmail.com> Kyle Roarty <Kyle.Roarty@amd.com>
Laura Hinman <llhinman@ucdavis.edu>
Lena Olson <leolson@google.com> Lena Olson <lena@cs.wisc,edu>
Lena Olson <leolson@google.com> Lena Olson <lena@cs.wisc.edu>
Leo Redivo <lredivo@ucdavis.edu> leoredivo <94771718+leoredivo@users.noreply.github.com>
Lingkang <karlzhu12@gmail.com>
Lisa Hsu <Lisa.Hsu@amd.com> Lisa Hsu <hsul@eecs.umich.edu>
Lluc Alvarez <lluc.alvarez@bsc.es>
Lluís Vilanova <vilanova@ac.upc.edu> Lluis Vilanova <vilanova@ac.upc.edu>
@@ -221,9 +244,11 @@ Mahyar Samani <msamani@ucdavis.edu>
Majid Jalili <majid0jalili@gmail.com>
Malek Musleh <malek.musleh@gmail.com> Nilay Vaish ext:(%2C%20Malek%20Musleh%20%3Cmalek.musleh%40gmail.com%3E) <nilay@cs.wisc.edu>
Marc Mari Barcelo <marc.maribarcelo@arm.com>
Marco Balboni <Marco.Balboni@ARM.com>
Marco Elver <Marco.Elver@ARM.com> Marco Elver <marco.elver@ed.ac.uk>
Marc Orr <marc.orr@gmail.com> Marc Orr <morr@cs.wisc.edu>
Marco Balboni <Marco.Balboni@ARM.com>
Marco Chen <mc@soc.pub>
Marco Elver <Marco.Elver@ARM.com> Marco Elver <marco.elver@ed.ac.uk>
Marco Kurzynski <marcokurzynski@icloud.com>
Marjan Fariborz <mfariborz@ucdavis.edu> marjanfariborz <mfariborz@ucdavis.edu>
Mark Hildebrand <mhildebrand@ucdavis.edu>
Marton Erdos <marton.erdos@arm.com>
@@ -233,20 +258,18 @@ Matteo Andreozzi <matteo.andreozzi@arm.com> Matteo Andreozzi <Matteo.Andreozzi@a
Matteo M. Fusi <matteo.fusi@bsc.es>
Matt Evans <matt.evans@arm.com> Matt Evans <Matt.Evans@arm.com>
Matthew Poremba <matthew.poremba@amd.com> Matthew Poremba <Matthew.Poremba@amd.com>
Matthias Boettcher <matthias.boettcher@arm.com>
Matthias Hille <matthiashille8@gmail.com>
Matthias Jung <jungma@eit.uni-kl.de>
Matthias Jung <matthias.jung@iese.fraunhofer.de>
Matt Horsnell <matt.horsnell@arm.com> Matt Horsnell <matt.horsnell@ARM.com>
Matthias Jung <matthias.jung@iese.fraunhofer.de> Matthias Jung <jungma@eit.uni-kl.de>
Matt Horsnell <matt.horsnell@arm.com> Matt Horsnell <Matt.Horsnell@arm.com>
Matt Horsnell <matt.horsnell@arm.com>Matt Horsnell <Matt.Horsnell@ARM.com>
Matt Poremba <matthew.poremba@amd.com> Matt Poremba <Matthew.Poremba@amd.com>
Matt Sinclair <mattdsinclair@gmail.com> Matthew Sinclair <matthew.sinclair@amd.com>
Matt Sinclair <mattdsinclair.wisc@gmail.com> Matt Sinclair <Matthew.Sinclair@amd.com>
Matt Sinclair <mattdsinclair.wisc@gmail.com> Matt Sinclair <mattdsinclair@gmail.com>
Matt Sinclair <mattdsinclair.wisc@gmail.com> Matthew Sinclair <matthew.sinclair@amd.com>
Maurice Becker <madnaurice@googlemail.com>
Maxime Martinasso <maxime.cscs@gmail.com>
Maximilian Stein <maximilian.stein@tu-dresden.de>Maximilian Stein <m@steiny.biz>
Maximilien Breughe <maximilien.breughe@elis.ugent.be> Maximilien Breughe <Maximilien.Breughe@elis.ugent.be>
Melissa Jost <melissakjost@gmail.com>
Melissa Jost <melissakjost@gmail.com> Melissa Jost <50555529+mkjost0@users.noreply.github.com>
Michael Adler <Michael.Adler@intel.com>
Michael Boyer <Michael.Boyer@amd.com>
Michael LeBeane <michael.lebeane@amd.com> Michael LeBeane <Michael.Lebeane@amd.com>
@@ -262,7 +285,6 @@ Min Kyu Jeong <minkyu.jeong@arm.com> Min Kyu Jeong <MinKyu.Jeong@arm.com>
Mitch Hayenga <mitch.hayenga@arm.com> Mitchell Hayenga <Mitchell.Hayenga@ARM.com>
Mitch Hayenga <mitch.hayenga@arm.com> Mitch Hayenga ext:(%2C%20Amin%20Farmahini%20%3Caminfar%40gmail.com%3E) <mitch.hayenga+gem5@gmail.com>
Mitch Hayenga <mitch.hayenga@arm.com> Mitch Hayenga <Mitch.Hayenga@arm.com>
Mitch Hayenga <mitch.hayenga@arm.com> Mitch Hayenga <Mitch.Hayenga@ARM.com>
Mitch Hayenga <mitch.hayenga@arm.com> Mitch Hayenga <mitch.hayenga+gem5@gmail.com>
Mohammad Alian <m.alian1369@gmail.com>
Monir Mozumder <monir.mozumder@amd.com>
@@ -279,13 +301,17 @@ Nathan Binkert <nate@binkert.org> Nathan Binkert <binkertn@umich.edu>
Nayan Deshmukh <nayan26deshmukh@gmail.com>
Neha Agarwal <neha.agarwal@arm.com>
Neil Natekar <nanatekar@ucdavis.edu>
Nicholas Lindsay <nicholas.lindsay@arm.com>
Nicholas Lindsay <nicholas.lindsay@arm.com> Nicholas Lindsay <Nicholas.Lindsey@arm.com>
Nicholas Mosier <nmosier@stanford.edu> Nicholas Mosier <nh.mosier@gmail.com>
Nicolas Boichat <drinkcat@google.com>
Nicolas Derumigny <nderumigny@gmail.com>
Nicolas Zea <nicolas.zea@gmail.com>
Nikolaos Kyparissas <nikolaos.kyparissas@arm.com>
Nikos Nikoleris <nikos.nikoleris@arm.com> Nikos Nikoleris <nikos.nikoleris@gmail.com>
Nilay Vaish ext:(%2C%20Timothy%20Jones%20%3Ctimothy.jones%40cl.cam.ac.uk%3E) <nilay@cs.wisc.edu>
Nils Asmussen <nils.asmussen@barkhauseninstitut.org> Nils Asmussen <nilsasmussen7@gmail.com>
Nitesh Narayana <nitesh.dps@gmail.com>
Nitish Arya <42148385+aryanitish@users.noreply.github.com>
Noah Katz <nkatz@rivosinc.com>
ntampouratzis <ntampouratzis@isc.tuc.gr>
Nuwan Jayasena <Nuwan.Jayasena@amd.com>
@@ -293,7 +319,6 @@ Ola Jeppsson <ola.jeppsson@gmail.com>
Omar Naji <Omar.Naji@arm.com>
Onur Kayiran <onur.kayiran@amd.com>
Pablo Prieto <pablo.prieto@unican.es>
paikunal <kunpai@ucdavis.edu>
Palle Lyckegaard <palle@lyckegaard.dk>
Pau Cabre <pau.cabre@metempsy.com>
Paul Rosenfeld <prosenfeld@micron.com> Paul Rosenfeld <dramninjas@gmail.com>
@@ -308,29 +333,39 @@ Po-Hao Su <supohaosu@gmail.com>
Polina Dudnik <pdudnik@cs.wisc.edu> Polina Dudnik <pdudnik@gmail.com>
Polydoros Petrakis <ppetrak@ics.forth.gr>
Pouya Fotouhi <pfotouhi@ucdavis.edu> Pouya Fotouhi <Pouya.Fotouhi@amd.com>
Prajwal Hegde <prhegde@wisc.edu>
Prakash Ramrakhyani <prakash.ramrakhyani@arm.com> Prakash Ramrakhani <Prakash.Ramrakhani@arm.com>
Prakash Ramrakhyani <prakash.ramrakhyani@arm.com> Prakash Ramrakhyani <Prakash.Ramrakhyani@arm.com>
Pritha Ghoshal <pritha9987@tamu.edu>
Pu (Luke) Yi <lukeyi@stanford.edu>
Quentin Forcioli <quentin.forcioli@telecom-paris.fr>
Radhika Jagtap <radhika.jagtap@arm.com> Radhika Jagtap <radhika.jagtap@ARM.com>
Rahul Thakur <rjthakur@google.com>
Reiley Jeapaul <Reiley.Jeyapaul@arm.com>
Rajarshi Das <drajarsh@gmail.com>
Ranganath (Bujji) Selagamsetty <bujji.selagamsetty@amd.com> BujSet <ranganath1000@gmail.com>
Razeza <borisov.dn@phystech.edu>
Reiley Jeapaul <reiley.jeyapaul@arm.com> Reiley Jeapaul <Reiley.Jeyapaul@arm.com>
Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> Rekai Gonzalez Alberquilla <rekai.gonzalezalberquilla@arm.com>
Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com>
Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> Rekai Gonzalez-Alberquilla <Rekai.GonzalezAlberquilla@arm.com>
Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> Rekai <Rekai.GonzalezAlberquilla@arm.com>
Rene de Jong <rene.dejong@arm.com>
Ricardo Alves <ricardo.alves@arm.com>
Richard Cooper <richard.cooper@arm.com>
Richard D. Strong <r.d.strong@gmail.com>
Richard Strong <rstrong@hp.com> Richard D. Strong <r.d.strong@gmail.com>
Richard Strong <rstrong@hp.com> Richard Strong <r.d.strong@gmail.com>
Richard Strong <rstrong@hp.com> Richard Strong <rstrong@cs.ucsd.edu>
Richard Strong <rstrong@hp.com> Rick Strong <rstrong@cs.ucsd.edu>
Rico Amslinger <rico.amslinger@informatik.uni-augsburg.de>
Riken Gohil <Riken.Gohil@arm.com>
Rizwana Begum <rb639@drexel.edu>
Robert Hauser <85344819+robhau@users.noreply.github.com>
Robert Kovacsics <rmk35@cl.cam.ac.uk>
Robert Scheffel <robert.scheffel1@tu-dresden.de> Robert <robert.scheffel1@tu-dresden.de>
Rocky Tatiefo <rtatiefo@google.com>
Roger Chang <rogerycchang@google.com> rogerchang23424 <rogerycchang@google.com>
Roger Chang <rogerycchang@google.com> rogerchang23424 <32214817+rogerchang23424@users.noreply.github.com>
Roger Chang <rogerycchang@google.com> rogerchang23424 <aucixw45876@gmail.com>
Roger Chang <rogerycchang@google.com> Yu-Cheng Chang <rogerycchang@google.com>
Rohit Kurup <rohit.kurup@arm.com>
Ron Dreslinski <rdreslin@umich.edu> Ronald Dreslinski <rdreslin@umich.edu>
Ruben Ayrapetyan <ruben.ayrapetyan@arm.com>
@@ -342,23 +377,21 @@ sacak32 <byrakocalan99@gmail.com>
Sampad Mohapatra <sampad.mohapatra@gmail.com>
Samuel Grayson <sam@samgrayson.me>
Samuel Stark <samuel.stark2@arm.com>
Sandipan Das <31861871+sandip4n@users.noreply.github.com>
Sandipan Das <sandipan@linux.ibm.com> Sandipan Das <31861871+sandip4n@users.noreply.github.com>
Santi Galan <santi.galan@metempsy.com>
Sascha Bischoff <sascha.bischoff@arm.com> Sascha Bischoff <sascha.bischoff@ARM.com>
Sascha Bischoff <sascha.bischoff@arm.com> Sascha Bischoff <Sascha.Bischoff@ARM.com>
Saúl Adserias <33020671+saul44203@users.noreply.github.com>
Sean McGoogan <Sean.McGoogan@arm.com>
Sean Wilson <spwilson2@wisc.edu>
Sergei Trofimov <sergei.trofimov@arm.com>
Severin Wischmann <wiseveri@student.ethz.ch> Severin Wischmann ext:(%2C%20Ioannis%20Ilkos%20%3Cioannis.ilkos09%40imperial.ac.uk%3E) <wiseveri@student.ethz.ch>
Shawn Rosti <shawn.rosti@gmail.com>
Sherif Elhabbal <elhabbalsherif@gmail.com>
Shivani Parekh <shparekh@ucdavis.edu>
Shivani <shparekh@ucdavis.edu>
Shivani Parekh <shparekh@ucdavis.edu> Shivani <shparekh@ucdavis.edu>
Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>
Simon Park <seminpark@google.com>
Somayeh Sardashti <somayeh@cs.wisc.edu>
Sooraj Puthoor <puthoorsooraj@gmail.com>
Sooraj Puthoor <Sooraj.Puthoor@amd.com>
Sooraj Puthoor <puthoorsooraj@gmail.com> Sooraj Puthoor <Sooraj.Puthoor@amd.com>
Sophiane Senni <sophiane.senni@gmail.com>
Soumyaroop Roy <sroy@cse.usf.edu>
Srikant Bharadwaj <srikant.bharadwaj@amd.com>
@@ -370,7 +403,6 @@ Steve Raasch <sraasch@umich.edu>
Steve Reinhardt <stever@gmail.com> Steve Reinhardt ext:(%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E%2C%20Ali%20Saidi%20%3CAli.Saidi%40ARM.com%3E) <stever@gmail.com>
Steve Reinhardt <stever@gmail.com> Steve Reinhardt <stever@eecs.umich.edu>
Steve Reinhardt <stever@gmail.com> Steve Reinhardt <steve.reinhardt@amd.com>
Steve Reinhardt <stever@gmail.com> Steve Reinhardt <Steve.Reinhardt@amd.com>
Stian Hvatum <stian@dream-web.no>
Sudhanshu Jha <sudhanshu.jha@arm.com>
Sujay Phadke <electronicsguy123@gmail.com>
@@ -378,16 +410,18 @@ Sungkeun Kim <ksungkeun84@tamu.edu>
Swapnil Haria <swapnilster@gmail.com> Swapnil Haria <swapnilh@cs.wisc.edu>
Taeho Kgil <tkgil@umich.edu>
Tao Zhang <tao.zhang.0924@gmail.com>
Thilo Vörtler <thilo.voertler@coseda-tech.com> root <thilo.voertler@coseda-tech.com>
Thomas Grass <Thomas.Grass@ARM.com>
Tiago Mück <tiago.muck@arm.com> Tiago Muck <tiago.muck@arm.com>
Tiberiu Bucur <36485854+TiberiuBucur@users.noreply.github.com>
Tim Harris <tharris@microsoft.com>
Timothy Hayes <timothy.hayes@arm.com>
Timothy M. Jones <timothy.jones@arm.com> Timothy Jones <timothy.jones@cl.cam.ac.uk>
Timothy M. Jones <timothy.jones@arm.com> Timothy M. Jones <timothy.jones@cl.cam.ac.uk>
Timothy M. Jones <timothy.jones@arm.com> Timothy M. Jones <tjones1@inf.ed.ac.uk>
Tom Jablin <tjablin@gmail.com>
Tommaso Marinelli <tommarin@ucm.es>
Tom Rollet <tom.rollet@huawei.com>
Tommaso Marinelli <tommarin@ucm.es>
Tong Shen <endlessroad@google.com>
Tony Gutierrez <anthony.gutierrez@amd.com> Anthony Gutierrez <atgutier@umich.edu>
Travis Boraten <travis.boraten@amd.com>
@@ -401,6 +435,7 @@ Victor Garcia <victor.garcia@arm.com>
Vilas Sridharan <vilas.sridharan@gmail.com>
Vincentius Robby <acolyte@umich.edu>
Vince Weaver <vince@csl.cornell.edu>
Vishnu Ramadas <vramadas@outlook.com>
vramadas95 <vramadas@wisc.edu>
vsoria <victor.soria@bsc.es>
Wade Walker <wade.walker@arm.com>
@@ -409,14 +444,16 @@ Weiping Liao <weipingliao@google.com>
Wende Tan <twd2@163.com>
Wendy Elsasser <wendy.elsasser@arm.com>
William Wang <william.wang@arm.com> William Wang <William.Wang@arm.com>
William Wang <william.wang@arm.com> William Wang <William.Wang@ARM.com>
Willy Wolff <willy.mh.wolff.ml@gmail.com>
Wing Li <wingers@google.com>
wmin0 <wmin0@hotmail.com>
Xiangyu Dong <rioshering@gmail.com>
Xianwei Zhang <xianwei.zhang.@amd.com> Xianwei Zhang <xianwei.zhang@amd.com>
Xiaoyu Ma <xiaoyuma@google.com>
Xin Ouyang <xin.ouyang@streamcomputing.com>
Xiongfei <xiongfei.liao@gmail.com>
Xuan Hu <huxuan@bosc.ac.cn>
Yan Lee <yanlee@google.com>
Yasuko Eckert <yasuko.eckert@amd.com>
Yen-lin Lai <yenlinlai@google.com>
Yifei Liu <liu.ad2039@gmail.com>
@@ -426,7 +463,10 @@ Yuan Yao <yuanyao@seas.harvard.edu>
Yuetsu Kodama <yuetsu.kodama@riken.jp> yuetsu.kodama <yuetsu.kodama@riken.jp>
Yu-hsin Wang <yuhsingw@google.com>
Zhang Zheng <perise@gmail.com>
Zhantong Qiu <ztqiu@ucdavis.edu>
Zhantong Qiu <ztqiu@ucdavis.edu> studyztp <studyztp@gmail.com>
Zhengrong Wang <seanzw@ucla.edu> seanzw <seanyukigeek@gmail.com>
Zhengrong Wang <seanzw@ucla.edu> Zhengrong Wang <seanyukigeek@gmail.com>
zhongchengyong <zhongcy93@gmail.com>
Zicong Wang <wangzicong@nudt.edu.cn>
Zixian Cai <2891235+caizixian@users.noreply.github.com>
zmckevitt <zack.mckevitt@gmail.com>


@@ -1,3 +1,4 @@
---
# Copyright (c) 2022 Arm Limited
# All rights reserved.
#
@@ -33,57 +34,71 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
minimum_pre_commit_version: "2.18"
minimum_pre_commit_version: '2.18'
default_language_version:
python: python3
python: python3
exclude: |
(?x)^(
ext/.*|
build/.*|
src/systemc/ext/.*|
src/systemc/tests/.*/.*|
src/python/m5/ext/pyfdt/.*|
tests/.*/ref/.*
)$
(?x)^(
ext/(?!testlib/).*|
build/.*|
src/systemc/ext/.*|
src/systemc/tests/.*/.*|
src/python/m5/ext/pyfdt/.*|
tests/.*/ref/.*
)$
default_stages: [commit]
default_stages: [pre-commit]
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.3.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-json
- id: check-yaml
- id: check-added-large-files
- id: mixed-line-ending
args: [--fix=lf]
- id: check-case-conflict
- repo: https://github.com/psf/black
rev: 22.6.0
hooks:
- id: black
- repo: local
hooks:
- id: gem5-style-checker
name: gem5 style checker
entry: util/git-pre-commit.py
always_run: true
exclude: ".*"
language: system
description: 'The gem5 style checker hook.'
- id: gem5-commit-msg-checker
name: gem5 commit msg checker
entry: ext/git-commit-msg
language: system
stages: [commit-msg]
description: 'The gem5 commit message checker hook.'
- id: gerrit-commit-msg-job
name: gerrit commit message job
entry: util/gerrit-commit-msg-hook
language: system
stages: [commit-msg]
description: 'Adds Change-ID to the commit message. Needed by Gerrit.'
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-json
- id: check-yaml
- id: check-added-large-files
- id: mixed-line-ending
args: [--fix=lf]
- id: check-ast
- id: check-case-conflict
- id: check-merge-conflict
- id: check-symlinks
- id: destroyed-symlinks
- id: requirements-txt-fixer
- repo: https://github.com/PyCQA/isort
rev: 5.13.2
hooks:
- id: isort
- repo: https://github.com/jumanjihouse/pre-commit-hook-yamlfmt
rev: 0.2.3
hooks:
- id: yamlfmt
- repo: https://github.com/psf/black
rev: 24.10.0
hooks:
- id: black
- repo: https://github.com/asottile/pyupgrade
rev: v3.17.0
hooks:
- id: pyupgrade
# Python 3.8 is the earliest version supported.
# We therefore conform to the standards compatible with 3.8+.
args: [--py38-plus]
- repo: local
hooks:
- id: gem5-style-checker
name: gem5 style checker
entry: util/git-pre-commit.py
always_run: true
exclude: .*
language: system
description: The gem5 style checker hook.
- id: gem5-commit-msg-checker
name: gem5 commit msg checker
entry: util/git-commit-msg.py
language: system
stages: [commit-msg]
description: The gem5 commit message checker hook.
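
The updated `exclude` pattern above swaps `ext/.*` for `ext/(?!testlib/).*`, so files under `ext/testlib/` are now checked while the rest of `ext/` is still skipped. A minimal Python sketch of that behaviour follows. It assumes the pattern is applied to repository-relative paths with `re.search`; the helper name `is_excluded` and the sample paths are hypothetical and only illustrate the negative lookahead.

```python
import re

# The new exclude pattern from the config, in verbose mode ((?x)),
# so the whitespace and line breaks inside the pattern are ignored.
EXCLUDE = re.compile(r"""(?x)^(
    ext/(?!testlib/).*|
    build/.*|
    src/systemc/ext/.*|
    src/systemc/tests/.*/.*|
    src/python/m5/ext/pyfdt/.*|
    tests/.*/ref/.*
)$""")


def is_excluded(path: str) -> bool:
    """Return True if the hooks would skip this path (hypothetical helper)."""
    return EXCLUDE.search(path) is not None


# Illustrative paths only; they are not taken from the diff.
for sample in ("ext/dramsys/README", "ext/testlib/runner.py", "src/mem/packet.cc"):
    print(sample, is_excluded(sample))
```

The negative lookahead `(?!testlib/)` rejects a match only when `ext/` is immediately followed by `testlib/`, which is why the other `ext/` subdirectories stay excluded.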

.vscode/settings.json

@@ -0,0 +1,7 @@
{
"python.analysis.extraPaths": [
"src/python",
"ext",
"tests"
]
}


@@ -1,23 +1,23 @@
---
# See CONTRIBUTING.md for details of gem5's contribution process.
#
# This file contains a list of gem5's subsystems and their
# maintainers. The key used to identifity a subsystem should be used
# as a tag in commit messages targetting that subsystem. At least one
# (not all) of these maintainers must review the patch before it can
# be pushed. These people will automatically be emailed when you
# upload the patch to Gerrit (https://gem5-review.googlesource.com).
# These subsystem keys mostly follow the directory structure.
# maintainers. The key used to identify a subsystem should be used
# as a tag in commit messages targeting that subsystem. Via our GitHub
# Pull Request system (https://github.com/gem5/gem5/pulls) a maintainer
# of the subsystem impacted by a pull request contribution will be added
# as an assignee to that pull request. Their role is to referee the
# contribution (add a review, assign reviewers, suggest changes, etc.), then
# merge the contribution into the gem5 develop branch when they are satisfied
# with the change.
#
# Maintainers have the following responsibilities:
# 1. That at least one maintainer of each subsystem reviews all
# changes to that subsystem (they will be automatically tagged and
# emailed on each new change).
# 2. They will complete your reviews in a timely manner (within a few
# business days).
# 3. They pledge to uphold gem5's community standards and its code of
# conduct by being polite and professional in their code
# reviews. See CODE-OF-CONDUCT.md.
# Maintainers assigned to a pull request are expected to acknowledge their
# assignment in 2 business days and to fully begin refereeing the contribution
# within a business week.
#
# Maintainers pledge to uphold gem5's community standards and its code of
# conduct by being polite and professional in their interactions with
# contributors. See CODE-OF-CONDUCT.md.
#
# Entries in this file have the following format:
# key:
@@ -27,310 +27,257 @@
# maintainers:
# - John Doe <john.doe@gem5.org>
# - Jane Doe <jane.doe@gem5.org>
#
# experts:
# - Jack Doe <jack.doe@gem5.org>
# - Jill Doe <jill.doe@gem5.org>
#
# The status field should have one of the following values:
# - maintained: The component has an active maintainer.
# - orphaned: The component is looking for a new owner.
pmc:
desc: >-
PMC Members (general maintainers):
status: maintained
maintainers:
- Andreas Sandberg <andreas.sandberg@arm.com>
- Brad Beckmann <bradford.beckmann@gmail.com>
- David Wood <david@cs.wisc.edu>
- Gabe Black <gabe.black@gmail.com>
- Giacomo Travaglini <giacomo.travaglini@arm.com>
- Jason Lowe-Power <jason@lowepower.com> (chair)
- Matt Sinclair <sinclair@cs.wisc.edu>
- Tony Gutierrez <anthony.gutierrez@amd.com>
- Steve Reinhardt <stever@gmail.com>
#
# The experts field is optional and used to identify people who are
# knowledgeable about the subsystem but are not responsible for it. Those
# listed as an expert are typically good to add as a reviewer for pull requests
# targeting that subsystem.
arch:
desc: >-
General architecture-specific components
status: maintained
maintainers:
- Gabe Black <gabe.black@gmail.com>
desc: >-
General architecture-specific components
status: orphaned
arch-arm:
status: maintained
maintainers:
- Andreas Sandberg <andreas.sandberg@arm.com>
- Giacomo Travaglini <giacomo.travaglini@arm.com>
arch-gcn3:
status: maintained
maintainers:
- Matt Poremba <matthew.poremba@amd.com>
- Matt Sinclair <sinclair@cs.wisc.edu>
status: maintained
maintainers:
- Giacomo Travaglini <giacomo.travaglini@arm.com>
- Andreas Sandberg <andreas.sandberg@arm.com>
arch-vega:
status: maintained
maintainers:
- Matt Poremba <matthew.poremba@amd.com>
- Matt Sinclair <sinclair@cs.wisc.edu>
status: maintained
maintainers:
- Matt Sinclair <sinclair@cs.wisc.edu>
- Matt Poremba <matthew.poremba@amd.com>
arch-mips:
status: orphaned
status: orphaned
arch-power:
status: maintained
maintainers:
- Boris Shingarov <shingarov@labware.com>
status: orphaned
arch-riscv:
status: orphaned
status: orphaned
arch-sparc:
status: maintained
maintainers:
- Gabe Black <gabe.black@gmail.com>
status: orphaned
arch-x86:
status: maintained
maintainers:
- Gabe Black <gabe.black@gmail.com>
status: orphaned
base:
status: maintained
maintainers:
- Bobby Bruce <bbruce@ucdavis.edu>
- Daniel Carvalho <odanrc@yahoo.com.br>
status: orphaned
base-stats:
status: orphaned
status: orphaned
configs:
status: maintained
maintainers:
- Jason Lowe-Power <jason@lowepower.com>
status: orphaned
experts:
- Jason Lowe-Power <jason@lowepower.com>
cpu:
desc: >-
General changes to all CPU models (e.g., BaseCPU)
status: maintained
maintainers:
- Gabe Black <gabe.black@gmail.com>
- Jason Lowe-Power <jason@lowepower.com>
desc: >-
General changes to all CPU models (e.g., BaseCPU)
status: orphaned
experts:
- Jason Lowe-Power <jason@lowepower.com>
cpu-kvm:
status: maintained
maintainers:
- Andreas Sandberg <andreas.sandberg@arm.com>
status: maintained
maintainers:
- Andreas Sandberg <andreas.sandberg@arm.com>
cpu-minor:
status: maintained
maintainers:
- Zhengrong Wang <seanyukigeek@gmail.com>
status: orphaned
cpu-o3:
status: orphaned
status: orphaned
cpu-simple:
status: maintained
maintainers:
- Jason Lowe-Power <jason@lowepower.com>
- Gabe Black <gabe.black@gmail.com>
status: orphaned
experts:
- Jason Lowe-Power <jason@lowepower.com>
dev:
status: maintained
maintainers:
- Gabe Black <gabe.black@gmail.com>
status: orphaned
dev-hsa:
status: maintained
maintainers:
- Matt Poremba <matthew.poremba@amd.com>
status: maintained
maintainers:
- Matt Poremba <matthew.poremba@amd.com>
- Matt Sinclair <sinclair@cs.wisc.edu>
dev-amdgpu:
status: maintained
maintainers:
- Matt Poremba <matthew.poremba@amd.com>
status: maintained
maintainers:
- Matt Poremba <matthew.poremba@amd.com>
- Matt Sinclair <sinclair@cs.wisc.edu>
dev-virtio:
status: maintained
maintainers:
- Andreas Sandberg <andreas.sandberg@arm.com>
status: maintained
maintainers:
- Andreas Sandberg <andreas.sandberg@arm.com>
dev-arm:
status: maintained
maintainers:
- Andreas Sandberg <andreas.sandberg@arm.com>
- Giacomo Travaglini <giacomo.travaglini@arm.com>
status: maintained
maintainers:
- Giacomo Travaglini <giacomo.travaglini@arm.com>
- Andreas Sandberg <andreas.sandberg@arm.com>
doc:
status: maintained
maintainers:
- Bobby Bruce <bbruce@ucdavis.edu>
status: orphaned
ext:
desc: >-
Components external to gem5
status: maintained
maintainers:
- Bobby Bruce <bbruce@ucdavis.edu>
- Jason Lowe-Power <jason@lowepower.com>
desc: >-
Components external to gem5
status: orphaned
experts:
- Jason Lowe-Power <jason@lowepower.com>
ext-testlib:
status: maintained
maintainers:
- Bobby Bruce <bbruce@ucdavis.edu>
- Hoa Nguyen <hoanguyen@ucdavis.edu>
status: orphaned
experts:
- Bobby R. Bruce <bbruce@ucdavis.edu>
fastmodel:
desc: >-
Changes relating to ARM Fast Models
status: maintained
maintainers:
- Gabe Black <gabe.black@gmail.com>
desc: >-
Changes relating to ARM Fast Models
status: orphaned
gpu-compute:
status: maintained
maintainers:
- Matt Poremba <matthew.poremba@amd.com>
- Matt Sinclair <sinclair@cs.wisc.edu>
status: maintained
maintainers:
- Matt Poremba <matthew.poremba@amd.com>
- Matt Sinclair <sinclair@cs.wisc.edu>
learning-gem5:
desc: >-
The code and configs for the Learning gem5 book
status: maintained
maintainers:
- Jason Lowe-Power <jason@lowepower.com>
desc: >-
The code and configs for the Learning gem5 book
status: orphaned
experts:
- Jason Lowe-Power <jason@lowepower.com>
- Bobby R. Bruce <bbruce@ucdavis.edu>
stdlib:
desc: >-
The gem5 standard library found under `src/python/gem5`
status: maintained
maintainers:
- Bobby R. Bruce <bbruce@ucdavis.edu>
desc: >-
The gem5 standard library found under `src/python/gem5`
status: maintained
maintainers:
- Bobby R. Bruce <bbruce@ucdavis.edu>
mem:
desc: >-
General memory system (e.g., XBar, Packet)
status: maintained
maintainers:
- Nikos Nikoleris <nikos.nikoleris@arm.com>
desc: >-
General memory system (e.g., XBar, Packet)
status: orphaned
mem-cache:
desc: >-
Classic caches and coherence
status: maintained
maintainers:
- Nikos Nikoleris <nikos.nikoleris@arm.com>
- Daniel Carvalho <odanrc@yahoo.com.br>
desc: >-
Classic caches and coherence
status: orphaned
mem-dram:
status: maintained
maintainers:
- Nikos Nikoleris <nikos.nikoleris@arm.com>
status: orphaned
mem-garnet:
desc: >-
Garnet subcomponent of Ruby
status: maintained
maintainers:
- Srikant Bharadwaj <srikant.bharadwaj@amd.com>
desc: >-
Garnet subcomponent of Ruby
status: orphaned
mem-ruby:
desc: >-
Ruby structures and protocols
status: maintained
maintainers:
- Jason Lowe-Power <jason@lowepower.com>
- Matt Sinclair <sinclair@cs.wisc.edu>
desc: >-
Ruby structures and protocols
status: maintained
maintainers:
- Matt Sinclair <sinclair@cs.wisc.edu>
experts:
- Jason Lowe-Power <jason@lowepower.com>
misc:
desc: >-
Anything outside of the other categories
status: maintained
maintainers:
- Bobby Bruce <bbruce@ucdavis.edu>
- Jason Lowe-Power <jason@lowepower.com>
desc: >-
Anything outside of the other categories
status: orphaned
experts:
- Jason Lowe-Power <jason@lowepower.com>
python:
desc: >-
Python SimObject wrapping and infrastructure
status: maintained
maintainers:
- Andreas Sandberg <andreas.sandberg@arm.com>
- Jason Lowe-Power <jason@lowepower.com>
desc: >-
Python SimObject wrapping and infrastructure
status: orphaned
experts:
- Jason Lowe-Power <jason@lowepower.com>
- Andreas Sandberg <andreas.sandberg@arm.com>
resources:
desc: >-
The gem5-resources repo with auxiliary resources for simulation
status: maintained
maintainers:
- Bobby Bruce <bbruce@ucdavis.edu>
- Jason Lowe-Power <jason@lowepower.com>
desc: >-
The gem5-resources repo with auxiliary resources for simulation
status: maintained
maintainers:
- Bobby R. Bruce <bbruce@ucdavis.edu>
experts:
- Jason Lowe-Power <jason@lowepower.com>
scons:
desc: >-
Build system
status: maintained
maintainers:
- Gabe Black <gabe.black@gmail.com>
desc: >-
Build system
status: orphaned
sim:
desc: >-
General simulation components
status: maintained
maintainers:
- Jason Lowe-Power <jason@lowepower.com>
desc: >-
General simulation components
status: orphaned
experts:
- Jason Lowe-Power <jason@lowepower.com>
sim-se:
desc: >-
Syscall emulation
status: orphaned
desc: >-
Syscall emulation
status: orphaned
system-arm:
status: maintained
maintainers:
- Andreas Sandberg <andreas.sandberg@arm.com>
- Giacomo Travaglini <giacomo.travaglini@arm.com>
status: maintained
maintainers:
- Giacomo Travaglini <giacomo.travaglini@arm.com>
- Andreas Sandberg <andreas.sandberg@arm.com>
systemc:
desc: >-
Code for the gem5 SystemC implementation and interface
status: maintained
maintainers:
- Gabe Black <gabe.black@gmail.com>
desc: >-
Code for the gem5 SystemC implementation and interface
status: orphaned
tests:
desc: >-
testing changes
status: maintained
maintainers:
- Bobby Bruce <bbruce@ucdavis.edu>
desc: >-
testing changes
status: maintained
maintainers:
- Bobby R. Bruce <bbruce@ucdavis.edu>
util:
status: maintained
maintainers:
- Gabe Black <gabe.black@gmail.com>
status: orphaned
util-docker:
status: maintained
maintainers:
- Bobby Bruce <bbruce@ucdavis.edu>
status: maintained
maintainers:
- Bobby R. Bruce <bbruce@ucdavis.edu>
util-m5:
status: maintained
maintainers:
- Gabe Black <gabe.black@gmail.com>
status: orphaned
util-gem5art:
status: maintained
maintainers:
- Bobby Bruce <bbruce@ucdavis.edu>
- Jason Lowe-Power <jason@lowepower.com>
status: orphaned
website:
desc: >-
The gem5-website repo which contains the gem5.org site
status: maintained
maintainers:
- Bobby Bruce <bbruce@ucdavis.edu>
- Hoa Nguyen <hoanguyen@ucdavis.edu>
desc: >-
The gem5-website repo which contains the gem5.org site
status: maintained
maintainers:
- Bobby R. Bruce <bbruce@ucdavis.edu>
experts:
- Jason Lowe-Power <jason@lowepower.com>


@@ -10,6 +10,15 @@ system software changes, and compile-time and run-time system optimizations.
The main website can be found at <http://www.gem5.org>.
## Testing status
**Note**: These badges show the status of tests run on the develop branch of gem5:
<https://github.com/gem5/gem5/tree/develop>.
[![Daily Tests](https://github.com/gem5/gem5/actions/workflows/daily-tests.yaml/badge.svg?branch=develop)](https://github.com/gem5/gem5/actions/workflows/daily-tests.yaml)
[![Weekly Tests](https://github.com/gem5/gem5/actions/workflows/weekly-tests.yaml/badge.svg?branch=develop)](https://github.com/gem5/gem5/actions/workflows/weekly-tests.yaml)
[![Compiler Tests](https://github.com/gem5/gem5/actions/workflows/compiler-tests.yaml/badge.svg?branch=develop)](https://github.com/gem5/gem5/actions/workflows/compiler-tests.yaml)
## Getting started
A good starting point is <http://www.gem5.org/about>, and for
@@ -29,8 +38,8 @@ Once you have all dependencies resolved, execute
`scons build/ALL/gem5.opt` to build an optimized version of the gem5 binary
(`gem5.opt`) containing all gem5 ISAs. If you only wish to compile gem5 to
include a single ISA, you can replace `ALL` with the name of the ISA. Valid
options include `ARM`, `NULL`, `MIPS`, `POWER`, `SPARC`, and `X86` The complete
list of options can be found in the build_opts directory.
options include `ARM`, `NULL`, `MIPS`, `POWER`, `RISCV`, `SPARC`, and `X86`
The complete list of options can be found in the build_opts directory.
See https://www.gem5.org/documentation/general_docs/building for more
information on building gem5.
@@ -85,6 +94,6 @@ or start discussions. To join the mailing list please visit
## Contributing to gem5
We hope you enjoy using gem5. When appropriate we advise charing your
We hope you enjoy using gem5. When appropriate we advise sharing your
contributions to the project. <https://www.gem5.org/contributing> can help you
get started. Additional information can be found in the CONTRIBUTING.md file.


@@ -1,3 +1,519 @@
# Version 24.1.0.2
**[HOTFIX]** Adds PR <https://github.com/gem5/gem5/pull/1930> as a hotfix to v24.1.0.
This fixes a bug which was causing the CHI coherence protocol to fail in multi-core simulations.
The fix sets the `RubySystem` pointer when the TBE is allocated, instead of when `set_tbe` is performed, thus ensuring that the `RubySystem` pointer is set before the TBE is used.
# Version 24.1.0.1
**[HOTFIX]** This hotfix release applies the following:
* Generalization of the class types in CHI RNF/MN generators thus fixing an issue with missing attributes when using the CHI protocol.
PR: <https://github.com/gem5/gem5/pull/1851>.
* Add Sphinx documentation for the gem5 standard library.
This is largely generated from Python docstrings.
See "docs/README" for more information on building and deploying Sphinx documentation.
PR: <https://github.com/gem5/gem5/pull/335>.
* Add missing `RubySystem` member and related methods in `PerfectCacheMemory`'s entries.
This was causing assertions to trigger in "src/mem/ruby/common/NetDest.cc".
PR: <https://github.com/gem5/gem5/pull/1864>.
* Add `useSecondaryLoadLinked` function to "src/mem/ruby/slicc_interface/ProtocolInfo.hh".
This fixes a bug which was introduced after the removal of the `PROTOCOL_MESI_Two_Level` and `PROTOCOL_MESI_Three_Level` MACROs in v24.1.0.0.
These MACROs were being used to infer if `Load_Linked` requests are sent to the Ruby protocol or not.
The `useSecondaryLoadLinked` function has been introduced to specify this directly where needed.
PR: <https://github.com/gem5/gem5/pull/1865>.
# Version 24.1
## User facing changes
* The [behavior of the statistics `simInsts` and `simOps` has been changed](https://github.com/gem5/gem5/pull/1615).
* They now reset to zero when m5.stats.reset() is called.
* Previously, they incorrectly did not reset and would increase monotonically throughout the simulation.
* The statistics `hostInstRate` and `hostOpRate` are also affected by this change, as they are calculated using simInsts and simOps respectively.
* Instances of kB, MB, and GB have been changed to KiB, MiB, and GiB for memory and cache sizes #1479
* A warning has also been added for usages of kB, MB, and GB.
* Please use KiB, MiB, and GiB in the future.
* Random number generator is no longer shared across components. This may modify simulation results. #1534
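
As a rough illustration of the unit change above, the sketch below rewrites size strings that use `kB`, `MB`, or `GB` into their `KiB`, `MiB`, and `GiB` forms. The helper name `to_binary_units` is hypothetical and not part of gem5; it is only meant as a starting point for updating size strings in your own configuration scripts.

```python
import re

# Mapping from the old suffixes to the suffixes the release notes ask for.
_SUFFIXES = {"kB": "KiB", "MB": "MiB", "GB": "GiB"}
_SIZE = re.compile(r"^(\d+(?:\.\d+)?)\s*(kB|MB|GB)$")


def to_binary_units(size: str) -> str:
    """Rewrite e.g. '64kB' as '64KiB'; leave any other string untouched."""
    match = _SIZE.match(size.strip())
    if match is None:
        return size
    number, suffix = match.groups()
    return f"{number}{_SUFFIXES[suffix]}"


print(to_binary_units("64kB"))
print(to_binary_units("2GB"))
print(to_binary_units("512MiB"))
```

Strings that already use the binary suffixes pass through unchanged, so the helper can be applied repeatedly without harm.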
### gem5 Standard Library
* SE mode has been added to X86Board, X86DemoBoard, and RiscvBoard #1702
* ArmDemoBoard and RiscvDemoBoard have been added to the standard library #1478 #1490
* The values in the X86DemoBoard have been modified to make it more similar to the other DemoBoards #1618
### Prefetchers
* The [behavior of the `StridePrefetcher` has been altered](https://github.com/gem5/gem5/pull/1449) as follows:
* The addresses used to compute the stride have been changed from word aligned addresses to cache line aligned addresses.
* It returns if the stride does not match, as opposed to issuing prefetches using the new stride --- the previous, incorrect behavior.
* Returns if the new stride is 0, indicating multiple reads from the same cache line.
* Fix implementation of Best Offset Prefetcher #1403
* Add SMS Prefetcher
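
The three `StridePrefetcher` rules listed above can be sketched as a small stand-alone model. This is a simplified illustration and not gem5's implementation: it omits confidence counters and table replacement, and the names `StrideEntry` and `observe`, the 64-byte line size, and the `degree` parameter are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StrideEntry:
    last_line_addr: int  # cache-line-aligned address of the previous access
    stride: int = 0      # last observed stride, in bytes


def observe(table, pc, addr, line_size=64, degree=1):
    """Record one access and return the addresses to prefetch (may be empty)."""
    line_addr = addr - (addr % line_size)  # align to the cache line
    entry = table.get(pc)
    if entry is None:
        table[pc] = StrideEntry(line_addr)
        return []
    new_stride = line_addr - entry.last_line_addr
    if new_stride == 0:
        return []  # repeated reads from the same cache line
    entry.last_line_addr = line_addr
    if new_stride != entry.stride:
        entry.stride = new_stride  # remember it, but do not prefetch yet
        return []
    return [line_addr + new_stride * i for i in range(1, degree + 1)]


table = {}
for address in (0x1000, 0x1008, 0x1040, 0x1080):
    print(hex(address), [hex(a) for a in observe(table, 0x400, address)])
```

In this model only the last access produces a prefetch: the second access falls in the same cache line, and the third establishes the stride without acting on it.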
### Configuration scripts
* Update the full system gem5 Standard Library example scripts to use Ubuntu 24.04 disk images #1491
* Add RV32 option to configs/example/riscv/fs_linux.py #1312
* Other updates to configs/example/riscv/fs_linux.py #1753
### Multisim
* simerr.txt and simout.txt now output into the correct sub-directory when -re is passed #1551
### Compiler and OS support
As of this release, gem5 supports Clang versions 14 through 18 and GCC versions 10 through 14.
Other versions may work, but they are not regularly tested.
### Multiple Ruby Protocols in a Single Build
There are many developer-facing / API changes to enable multiple Ruby protocols in a single build.
The most notable changes are:
* Removes the RubySlicc_interfaces.slicc file from the SLICC includes of every protocol.
* Changes required: If you have a custom protocol, you will need to remove the line `include "RubySlicc_interfaces.slicc"` from your .slicc file.
* Updates the build configurations variables
* **USER FACING CHANGE**: The Ruby protocols in Kconfig have changed names (they now use the same case as the SLICC file names). As a result, your build configurations need to be updated. You can do so by running `scons menuconfig <build dir>` and selecting the right Ruby options. Alternatively, if you're using a `build_opts` file, you can run `scons defconfig build/<ISA> build_opts/<ISA>`, which should update your config correctly.
* **USER FACING CHANGE**: The "build_opts/ALL" build spec has been updated to include all Ruby protocols. As such, gem5 compilations of the "ALL" compilation target will include all gem5 Ruby protocols (previously just MESI_Two_Level).
* A "build_opts/NULL_ALL_RUBY" build spec has been added to include all Ruby protocols for a "NULL ISA" build. This is useful for testing Ruby protocols without the overhead of a full ISA and is used in gem5's traffic generator tests.
* A "build_opts/ARM_X86" build spec has been added due to a unique restriction in the "tests/gem5/fs/linux/arm" tests which requires a compilation of gem5 with both ARM and X86 and solely the MESI_Two_Level protocol.
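For custom protocols, dropping the `RubySlicc_interfaces.slicc` include is a one-line text change. The following is a minimal sketch of that edit, assuming the include sits on its own line (with or without a trailing semicolon); `strip_interfaces_include` is a hypothetical helper, not a gem5 tool.

```python
import re

# Matches a whole line that includes RubySlicc_interfaces.slicc,
# with an optional trailing semicolon and line ending.
INCLUDE_RE = re.compile(
    r'^[ \t]*include[ \t]+"RubySlicc_interfaces\.slicc"[ \t]*;?[ \t]*\r?\n?',
    re.MULTILINE,
)


def strip_interfaces_include(slicc_text: str) -> str:
    """Return the .slicc file contents without the interfaces include."""
    return INCLUDE_RE.sub("", slicc_text)
```

Applying it to the text of a protocol's `.slicc` file leaves every other `include` line in place.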
### Multiple RubySystem objects in a simulation
Simulation configurations can now create multiple `RubySystem`s in the same simulation.
Previously this was not possible due to `RubySystem` sharing variables across all `RubySystems` (e.g., cache line size).
Allowing this feature requires developer facing changes for custom Ruby protocols.
The most common changes will be:
* Modify your custom protocol SLICC files, replace any instances of `RubySystem::foo()` with `m_ruby_system->foo()`, and recompile. `m_ruby_system` is automatically set by SLICC generated code.
* If your custom protocol contains local `WriteMask` declarations (e.g., `WriteMask tmp_mask;`), modify the protocol so that `tmp_mask.setBlockSize(...)` is called. Use the block size of the `RubySystem` here (e.g., you can use `other_mask.getBlockSize()` or get block size from another object).
* Modify your python configurations to assign the parameter `ruby_system` for the python classes `RubySequencer`, `RubyDirectoryMemory`, and `RubyPortProxy` or any derived classes. You will receive an error at the start of gem5 if this is not done.
* If your python configuration uses a `RubyPrefetcher`, modify the configuration to assign the `block_size` parameter to the cache line size of the `RubySystem` the prefetcher is part of.
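The first of the common changes above, replacing `RubySystem::foo()` with `m_ruby_system->foo()`, is mechanical enough to script. The sketch below is one possible way to do it with a regular expression; `migrate_static_calls` is a hypothetical helper, and its output should still be reviewed by hand, since it does not address the `WriteMask` or Python configuration changes.

```python
import re

# Rewrites RubySystem::foo( as m_ruby_system->foo(
STATIC_CALL_RE = re.compile(r"\bRubySystem::(\w+)\s*\(")


def migrate_static_calls(source: str) -> str:
    """Replace static RubySystem calls with calls through m_ruby_system."""
    return STATIC_CALL_RE.sub(r"m_ruby_system->\1(", source)


print(migrate_static_calls("int n := RubySystem::getBlockSizeBytes();"))
# int n := m_ruby_system->getBlockSizeBytes();
```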
The complete list of changes is:
* `AbstractCacheEntry`, `ALUFreeListArray`, `DataBlock`, `Message`, `PerfectCacheMemory`, `PersistentTable`, `TBETable`, `TimerTable`, and `WriteMask` classes now require the cache line size to be explicitly set. This is handled automatically by the SLICC parser but must be done explicitly in C++ code by calling `setBlockSize()`.
* `RubyPrefetcher` now requires `block_size` be assigned in python configurations.
* `CacheMemory` now requires a pointer to the `RubySystem` to be set. This is handled automatically by the SLICC parser but must be done explicitly in C++ code by calling `setRubySystem()`.
* `RubyDirectoryMemory`, `RubyPortProxy`, and `RubySequencer` now require a pointer to the `RubySystem` to be set by python configurations. If you have custom protocols using `DirectoryMemory` or derived classes from it, the `ruby_system` parameter must be set in the python configuration.
* `ALUFreeListArray` and `BankedArray` now require a clock period to be set in C++ using `setClockPeriod()` and no longer require a pointer to the `RubySystem`.
* You may no longer call `RubySystem::getBlockSizeBytes()`, `RubySystem::getBlockSizeBits()`, etc. You must have a pointer to the `RubySystem` you are a part of and call, for example, `ruby_system->getBlockSizeBytes()`.
* `MessageBuffer::enqueue()` has two new parameters indicating if the `RubySystem` has randomization and warmup enabled. You must explicitly specify these values now.
## ArmISA changes/improvements
### Architectural extensions
Architectural support for the following extensions:
* FEAT_TTST
* FEAT_XS
### Bugfixes
* Add support for AArch32 VRINTN/X/A/Z/M/P instructions
* Add support for AArch32 VCVTA/P/N/M instructions
* The following syscalls have been added in SE mode
* readv
* poll
* pread64
* pwrite64
* truncate64
* The following syscalls have been fixed in SE mode when running on a 32-bit host:
* getcwd
* lseek
### CPU changes
Before this release, the Arm TLBs used a hardcoded fully associative model with an LRU replacement policy.
The associativity and replacement policy of the Arm TLBs are now configurable with the IndexingPolicy and ReplacementPolicy classes by setting the `indexing_policy` and `replacement_policy` params.
```python
indexing_policy = Param.TLBIndexingPolicy(
    TLBSetAssociative(assoc=Parent.assoc, num_entries=Parent.size),
    "Indexing policy of the TLB",
)
replacement_policy = Param.BaseReplacementPolicy(
    LRURP(), "Replacement policy of the TLB"
)
```
While the default behaviour is still LRU + FA, the L2 TLB in the ArmMMU (`l2_shared`) has been converted from a fully associative structure into a 5-way set-associative one.
The default ArmMMU is therefore:
```python
# L2 TLBs
l2_shared = ArmTLB(
    entry_type="unified", size=1280, assoc=5, partial_levels=["L2"]
)
# L1 TLBs
itb = ArmTLB(entry_type="instruction", next_level=Parent.l2_shared)
dtb = ArmTLB(entry_type="data", next_level=Parent.l2_shared)
```
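With `size=1280` and `assoc=5`, the shared L2 TLB works out to 256 sets under the usual set-associative arithmetic (entries divided by ways). The sketch below only illustrates that arithmetic; `tlb_num_sets` is a hypothetical helper, and gem5's `TLBSetAssociative` policy may organize its index differently.

```python
def tlb_num_sets(num_entries: int, assoc: int) -> int:
    """Number of sets in a set-associative structure (entries / ways)."""
    if assoc <= 0 or num_entries % assoc != 0:
        raise ValueError("entries must divide evenly into ways")
    return num_entries // assoc


# Shared L2 TLB from the default ArmMMU above.
print(tlb_num_sets(1280, 5))  # 256

# A fully associative TLB is the special case of a single set.
print(tlb_num_sets(64, 64))   # 1
```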
## AMBA CHI changes/improvements
PR [1084](https://github.com/gem5/gem5/pull/1084) introduced two new CHI-related classes.
* The first one is the CHIGenericController. This is a purely C++-based, abstract interface of a coherence controller for Ruby.
It is meant to bypass SLICC and remove the limitation of using the gem5 Sequencer and associated data structures.
* The second one is the CHI-TLM controller, which extends the aforementioned CHIGenericController. This is a bridge between the AMBA TLM 2.0 implementation of CHI [1](https://developer.arm.com/documentation/101459/latest) [2](https://developer.arm.com/Architectures/AMBA#Downloads) and the gem5 (Ruby) one.
In other words, it translates AMBA CHI transactions into Ruby messages (which are then forwarded to the MessageQueues) and vice versa.
```text
ARM::CHI::Payload, CHIRequestMsg
<--> CHIDataMsg
ARM::CHI::Phase CHIResponseMsg
CHIDataMsg
```
In this way it will be possible to connect external RNF models to the Ruby interconnect via the CHI-TLM library.
## RISC-V ISA improvements
* Use sign extend for all address generation #1316
* Fix implicit int-to-float conversion in .isa files #1319
* Implement Zcmp instructions #1432
* Add support for riscv hardware probing syscall #1525
* Add support for Zicbop extension #1710
* Fix vector instruction assertion caused by speculative execution #1711
## GPU model improvements
The GPUFS model is now available in the standard library!
There is a new `ViperBoard` in `gem5.prebuilt.viper`.
This board is an initial implementation and will be improved in the next versions of gem5.
There is an example script in `configs/example/gem5_library/x86-mi300x-gpu.py` that shows how to use the `ViperBoard`.
See #1636.
### Other GPU changes
* Vega10 has been deprecated #1619
* Replacement policy has been improved #1564
* Swizzle multi-dword scratch requests now supported #1445
* Many improvements to Vega implementation including memtime, SDWA, SDWAB, and DPP instructions #1350, #1378
* Matrix Core Engines (AMD's equivalent to NVIDIA's TensorCores) now supported! #1248, #1700
* Pannotia tests integrated into weekly tests #1584
## Other Miscellaneous Changes
### Other Ruby Related Changes
* RubyHitMiss debug flag #1260
* Prevent LL/SC livelock in MESI protocols #1399
* Added files for [generating Sphinx documentation](https://github.com/gem5/gem5/pull/335) for the gem5 standard library.
### Other
* Looppoint analysis object #1419
* Add global and local instruction trackers for raising instruction executed exit events with multi-core simulation #1433
### Development
* Removal of Gerrit Change-ID requirement #1486
# Version 24.0.0.1
**[HOTFIX]** Fixes a bug affecting the use of the `IndirectMemoryPrefetcher`, `SignaturePathPrefetcher`, `SignaturePathPrefetcherV2`, `STeMSPrefetcher`, and `PIFPrefetcher` SimObjects.
Use of these resulted in a gem5 crash with the error message "Need is_secure arg".
The fix for this was introduced to the gem5 develop branch in Pull Request <https://github.com/gem5/gem5/pull/1374>.
The commits in this PR were cherry-picked on the gem5 stable branch to create the v24.0.0.1 hotfix release.
# Version 24.0
gem5 Version 24.0 is the first major release of 2024.
During this time there have been 298 pull requests merged, comprising over 600 commits, from 56 unique contributors.
## API and user-facing changes
* The GCN3 GPU model has been removed in favor of the newer VEGA_X86 GPU model.
* gem5 now supports building, running, and simulating Ubuntu 24.04.
### Compiler and OS support
As of this release, gem5 supports Clang versions 6 to 16 and GCC versions 10 to 13.
While other compilers and versions may work, they are not regularly tested.
gem5 now supports building, running, and simulating on Ubuntu 24.04.
We continue to support 22.04 with 20.04 being deprecated in the coming year.
The majority of our testing is done on Ubuntu LTS systems though Apple Silicon machines and other Linux distributions have also been used regularly during development.
Improvements have been made to ensure a wider support of operating systems.
## New features
### gem5 MultiSim: Multiprocessing for gem5
The gem5 "MultiSim" module allows for multiple simulations to be run from a single gem5 execution via a single gem5 configuration script.
This allows for multiple simulations to be run in parallel in a structured manner.
To use MultiSim first create multiple simulators and add them to the MultiSim with the `add_simulator` function.
If needed, limit the maximum number of parallel processes with the `set_num_processes` function.
Then run the simulations in parallel with the `gem5` binary using `-m gem5.utils.multisim`.
Here is an example of how to use MultiSim:
```python
import gem5.utils.multisim as multisim
from gem5.components.boards.x86_board import X86Board
from gem5.simulate.simulator import Simulator

# Set the maximum number of processes to run in parallel
multisim.set_num_processes(4)

# Create multiple simulators.
# In this case, one for each workload in the benchmark suite.
# `benchmark_suite` is assumed to be defined earlier in the script.
for workload in benchmark_suite:
    board = X86Board(
        # ...
    )
    board.set_workload(workload)

    # Useful to set the ID here. This is used to create unique output
    # directories for each gem5 process and can be used to identify and
    # run gem5 processes individually.
    simulator = Simulator(board, id=f"{workload.get_id()}")
    multisim.add_simulator(simulator)
```
Then to run the simulations in parallel:
```sh
<gem5 binary> -m gem5.utils.multisim <config script>
```
The output directory ("m5out" by default) will contain sub-directories for each simulation run.
The sub-directory will be named after the simulator ID set in the configuration script.
We therefore recommend setting the simulator ID to something meaningful to help identify the output directories (i.e., the workload run or something identifying the meaningful characteristics of the simulated system in comparison to others).
If only one of the simulations specified in the config needs to be run, you can do so with:
```sh
<gem5 binary> <config script> --list # Lists the simulations by ID
<gem5 binary> <config script> <ID> # Run the simulation with the specified ID.
```
Example scripts of using MultiSim can be found in "configs/example/gem5_library/multisim".
### RISC-V Vector Extension Support
There have been significant improvements to the RVV support in gem5, including:
* Fixed viota (#1137)
* Fixed vrgather (#1134)
* Added RVV FP16 support (#1123)
* Fixed widening and narrowing instructions (#1079)
* Fixed bug in vfmv.f.s (#863)
* Add unit stride segment loads and stores (#851) (#913)
* Fix vl in masked load/store (#830)
* Add unit-stride loads (#794)
* Fix many RVV instructions (#814) (#805) (#715)
### General RISC-V bugfixes
* Fixed problem in TLB lookup (#1264)
* Fixed sign-extended branch target (#1173)
* Fixed compressed jump instructions (#1163)
* Fixed GDB connection (#1152)
* Fixed CSR behavior (#1099)
* Add Integer conditional operations Zicond (#1078)
* Add RISC-V Semihosting support (#681)
* Added more detailed instruction types (#589)
* Fixed 32-bit m5op arguments (#900)
* Fixed c.fswsp and c.fsw (#998) (#1005)
* Update PLIC implementation (#886)
* Fix fflags behavior in O3 (#868)
* Add support for local interrupts (#813)
* Remove bit 63 of physical address (#756)
## Improvements
* Added a new generator which can generate requests based on [spatter](https://github.com/hpcgarage/spatter) patterns.
* KVM is now supported in the gem5 Standard Library ARM Board.
* Generic Cache template added to the Standard Library (#745)
* Support added for partitioning caches.
* The Standard Library `obtain_resources` function can request multiple resources at once thus reducing delay associated with multiple requests.
* An official gem5 DevContainer has been added to the gem5 repository.
This can be used to build and run gem5 in a consistent environment and enables GitHub Codespaces support.
### gem5 Python Statistics
The gem5 Python statistics API has been improved.
The gem5 Project's general intent with this improvement is to make it easier and more desirable to obtain and interact with gem5 simulation statistics via Python.
For example, the following code snippet demonstrates how to obtain statistics from a gem5 simulation:
```python
from m5.stats.gem5stats import get_simstat
# Set up and run the configuration ...
simstat = get_simstat(board)
# Print the number of cycles the CPU at index 0 has executed.
print(simstat.cpu[0].numCycles)
# Strings can also be used to access statistics.
print(simstat['cpu'][0]['numCycles'])
# Print the total number of cycles executed by all CPUs.
print(sum(simstat.cpu[i].numCycles for i in range(len(simstat.cpu))))
```
We hope the gem5 Python statistics API will be more intuitive and easier to use while allowing better processing of statistical data.
### GPU Model
* Support for MI300X and MI200 GPU models including their features and most instructions.
* ROCm 6.1 disk image and compile docker files have been added. ROCm 5.4.2 and 4.2 resources are removed.
* The deprecated GCN3 ISA has been removed. Use VEGA instead.
## Bug Fixes
* An integer overflow error known to affect the `AddrRange` class has been fixed.
* Fix fflags behavior of floating point instruction in RISC-V for Out-of-Order CPUs.
### Arm FEAT_MPAM Support
An initial implementation of FEAT_MPAM has been introduced in gem5 with the capability to statically partition
classic caches. Guidance on how to use this is available in an Arm community [blog post](https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/gem5-cache-partitioning).
# Version 23.1
gem5 Version 23.1 is our first release where the development has been on GitHub.
During this release, there have been 362 pull requests merged which comprise 416 commits with 51 unique contributors.
## Significant API and user-facing changes
### The gem5 build is now configured with `kconfig`
* Most gem5 builds without customized options (excluding double dash options) (e.g., build/X86/gem5.opt) are backwards compatible and require no changes to your current workflows.
* All of the default builds in `build_opts` are unchanged and still available.
* However, if you want to specialize your build (for example, to use a customized Ruby protocol), the command `scons PROTOCOL=<PROTOCOL_NAME> build/ALL/gem5.opt` will no longer work. You now have to use `scons <kconfig command>` to update, for example, the Ruby protocol. The double dash options (`--without-tcmalloc`, `--with-asan`, and so on) continue to work as normal.
* For more details refer to the documentation here: [kconfig documentation](https://www.gem5.org/documentation/general_docs/kconfig_build_system/)
### Standard library improvements
#### `WorkloadResource` added to resource specialization
* The `Workload` and `CustomWorkload` classes are now deprecated. They have been transformed into wrappers for the `obtain_resource` and `WorkloadResource` classes in `resource.py`, respectively.
* Code utilizing the older API will continue to function as expected but will trigger a warning message. To update code using the `Workload` class, change the call from `Workload(id='resource_id', resource_version='1.0.0')` to `obtain_resource(id='resource_id', resource_version='1.0.0')`. Similarly, to update code using the `CustomWorkload` class, change the call from `CustomWorkload(function=func, parameters=params)` to `WorkloadResource(function=func, parameters=params)`.
* Workload resources in gem5 can now be directly acquired using the `obtain_resource` function, just like other resources.
#### Introducing Suites
Suites is a new category of resource being introduced in gem5. Documentation of suites can be found here: [suite documentation](https://www.gem5.org/documentation/gem5-stdlib/suites).
#### Other API changes
* All resource objects now have their own `id` and `category`. Each resource class has its own `__str__()` function which returns its information in the form of **category(id, version)** like **BinaryResource(id='riscv-hello', resource_version='1.0.0')**.
* Users can use the GEM5_RESOURCE_JSON and GEM5_RESOURCE_JSON_APPEND env variables to overwrite all the data sources with the provided JSON and to append a JSON file to all the data sources, respectively. More information can be found [here](https://www.gem5.org/documentation/gem5-stdlib/using-local-resources).
### Other user-facing changes
* Added support for clang 15 and clang 16
* gem5 no longer supports building on Ubuntu 18.04
* GCC 7, GCC 9, and clang 6 are no longer supported
* Two `DRAMInterface` stats, `bytesRead` and `bytesWritten`, have changed names (for instance, `board.memory.mem_ctrl.dram.bytesRead` and `board.memory.mem_ctrl.dram.bytesWritten`). They are now `dramBytesRead` and `dramBytesWritten` so they don't collide with the stats of the same name in `AbstractMemory`.
* The stats for `NVMInterface` (`bytesRead` and `bytesWritten`) have likewise been changed to `nvmBytesRead` and `nvmBytesWritten`.
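Scripts that post-process stats by name will need the new names. A minimal sketch of such a translation follows; `migrate_stat_name` is a hypothetical helper, and it assumes the interface object appears in the stat path as `dram` (as in the example above) or `nvm` (an assumed name), which may differ in your configuration.

```python
# Renames taken from the notes above; keys are (parent object, old stat name).
# The parent names "dram" and "nvm" are assumptions about the stat path.
RENAMES = {
    ("dram", "bytesRead"): "dramBytesRead",
    ("dram", "bytesWritten"): "dramBytesWritten",
    ("nvm", "bytesRead"): "nvmBytesRead",
    ("nvm", "bytesWritten"): "nvmBytesWritten",
}


def migrate_stat_name(path: str) -> str:
    """Map an old dotted stat path to its renamed form, if it was renamed."""
    parts = path.split(".")
    if len(parts) >= 2:
        new_leaf = RENAMES.get((parts[-2], parts[-1]))
        if new_leaf is not None:
            parts[-1] = new_leaf
    return ".".join(parts)


print(migrate_stat_name("board.memory.mem_ctrl.dram.bytesRead"))
# board.memory.mem_ctrl.dram.dramBytesRead
```

Paths that do not end in one of the renamed stats, such as the `AbstractMemory` stats of the same name, pass through unchanged.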
## Full-system GPU model improvements
* Support for ROCm up to the latest version, 5.7.1.
* Various changes to enable PyTorch/TensorFlow simulations.
* New packer disk image script containing ROCm 5.4.2, PyTorch 2.0.1, and Tensorflow 2.11.
* GPU instructions can now perform atomics on host addresses.
* The provided configs scripts can now run KVM on more restrictive setups.
* Add support to checkpoint and restore between kernels in GPUFS, including adding various AQL, HSA Queue, VMID map, MQD attributes, GART translations, and PM4Queues to GPU checkpoints
* Move GPU cache recorder code to RubyPort instead of Sequencer/GPUCoalescer to allow checkpointing to occur
* Add support for flushing GPU caches, as well as cache cooldown/warmup support, for checkpoints
* Update vega10_kvm.py to add checkpointing instructions
## SE mode GPU model improvements
* Started adding support for mmap'ing inputs for GPUSE tests, which reduces their runtime by 8-15% per run
## GPU model improvements
* Update GPU VIPER and Coalescer support to ensure correct replacement policy behavior when multiple requests from the same CU are concurrently accessing the same line
* Fix bug with GPU VIPER to resolve a race conflict for loads that bypass the TCP (L1D$)
* Fix bug with MRU replacement policy updates in GPU SQC (I$)
* Update GPU and Ruby debug prints to resolve various small errors
* Add configurable GPU L1,L2 num banks and L2 latencies
* Add decodings for new MI100 VOP2 insts
* Add GPU GLC Atomic Resource Constraints to better model how atomic resources are shared at GPU TCC (L2$)
* Update GPU tester to work with both requests that bypass all caches (SLC) and requests that bypass only the TCP (L1D$)
* Fixes for how write mask works for GPU WB L2 caches
* Added support for WB and WT GPU atomics
* Added configurable support to better model the latency of GPU atomic requests
* Fix GPU's default number of HW barriers per CU to better model the amount of concurrency GPU CUs should have
## RISC-V RVV 1.0 implemented
This was a huge undertaking by a large number of people!
Some of these people include Adrià Armejach, who pushed it over the finish line; Xuan Hu, who pushed the most recent version to Gerrit that Adrià picked up;
Jerin Joy, who did much of the initial work; and many others who contributed to the implementation, including Roger Chang and Hoa Nguyen, who put significant effort into testing and reviewing the code.
* Most of the instructions in the 1.0 spec implemented
* Works with both FS and SE mode
* Compatible with Simple CPUs, the O3, and the minor CPU models
* User can specify the width of the vector units
* Future improvements
* Widening/narrowing instructions are *not* implemented
* The model for executing memory instructions is not very high performance
* The statistics are not correct for counting vector instruction execution
## ArmISA changes/improvements
* Architectural support for the following extensions:
* FEAT_TLBIRANGE
* FEAT_FGT
* FEAT_TCR2
* FEAT_SCTLR2
* Arm support for SVE instructions improved
* Fixed some FEAT_SEL2 related issues:
* [Fix virtual interrupt logic in secure mode](https://github.com/gem5/gem5/pull/584)
* [Make interrupt masking handle VHE/SEL2 cases](https://github.com/gem5/gem5/pull/430)
* Removed support for Arm Jazelle and ThumbEE
* Implementation of an Arm Capstone Disassembler
## Other notable changes/improvements
* Improvements to the CHI coherence protocol implementation
* Far atomics implemented in CHI
* Ruby now supports using the prefetchers from the classic caches, if the protocol supports it. CHI has been extended to support the classic prefetchers.
* Fixed a bug in the RISC-V TLB so that misses and hits are counted correctly
* Added new [RISC-V Zcb instructions](https://github.com/gem5/gem5/pull/399)
* RISC-V can now use a separate binary for the bootloader and kernel in FS mode
* DRAMSys integration updated to latest DRAMSys version (5.0)
* Improved support for RISC-V privilege modes
* Fixed bug in switching CPUs with RISC-V
* CPU branch predictor refactoring to prepare for decoupled front end support
* Perf is now optional when using the KVM CPU model
* Improvements to the gem5-SST bridge including updating to SST 13.0
* Improved formatting of documentation in stdlib
* By default use isort for python imports in style
* Many, many testing improvements during the migration to GitHub actions
* Fixed the elastic trace replaying logic (TraceCPU)
## Known Bugs/Issues
* [RISC-V RVV Bad execution of riscv rvv vss instruction](https://github.com/gem5/gem5/issues/594)
* [RISC-V Vector Extension float32_t bugs/unsupported widening instructions](https://github.com/gem5/gem5/issues/442)
* [Implement AVX xsave/xstor to avoid workaround when checkpointing](https://github.com/gem5/gem5/issues/434)
* [Adding Vector Segmented Loads/Stores to RISC-V V 1.0 implementation](https://github.com/gem5/gem5/issues/382)
* [Integer overflow in AddrRange subset check](https://github.com/gem5/gem5/issues/240)
* [RISCV64 TLB refuses to access upper half of physical address space](https://github.com/gem5/gem5/issues/238)
* [Bug when trying to restore checkpoints in SPARC: “panic: panic condition !pte occurred: Tried to execute unmapped address 0.”](https://github.com/gem5/gem5/issues/197)
* [BaseCache::recvTimingResp can trigger an assertion error from getTarget() due to MSHR in senderState having no targets](https://github.com/gem5/gem5/issues/100)
# Version 23.0.1.0
This minor release incorporates documentation updates, bug fixes, and some minor improvements.
@@ -70,10 +586,10 @@ Scons no longer defines the `DEBUG` guard in debug builds, so code making using
Also, this release:
- Removes deprecated namespaces. Namespace names were updated a couple of releases ago. This release removes the old names.
- Uses `MemberEventWrapper` in favor of `EventWrapper` for instance member functions.
- Adds an extension mechanism to `Packet` and `Request`.
- Sets x86 CPU vendor string to "HygoneGenuine" to better support GLIBC.
* Removes deprecated namespaces. Namespace names were updated a couple of releases ago. This release removes the old names.
* Uses `MemberEventWrapper` in favor of `EventWrapper` for instance member functions.
* Adds an extension mechanism to `Packet` and `Request`.
* Sets x86 CPU vendor string to "HygoneGenuine" to better support GLIBC.
## New features and improvements
@@ -105,6 +621,7 @@ Architectural support for Armv9 [Scalable Matrix extension](https://developer.ar
The implementation employs a simple renaming scheme for the Za array register in the O3 CPU, so that writes to difference tiles in the register are considered a dependency and are therefore serialized.
The following SVE and SIMD & FP extensions have also been implemented:
* FEAT_F64MM
* FEAT_F32MM
* FEAT_DOTPROD
@@ -127,32 +644,31 @@ gem5 can now use DRAMSys <https://github.com/tukl-msd/DRAMSys> as a DRAM backend
This release:
- Fully implements RISC-V scalar cryptography extensions.
- Fully implement RISC-V rv32.
- Implements PMP lock features.
- Adds general RISC-V improvements to provide better stability.
* Fully implements RISC-V scalar cryptography extensions.
* Fully implement RISC-V rv32.
* Implements PMP lock features.
* Adds general RISC-V improvements to provide better stability.
### Standard library improvements and new components
This release:
- Adds MESI_Three_Level component.
- Supports ELFies and LoopPoint analysis output from Sniper.
- Supports DRAMSys in the stdlib.
* Adds MESI_Three_Level component.
* Supports ELFies and LoopPoint analysis output from Sniper.
* Supports DRAMSys in the stdlib.
## Bugfixes and other small improvements
This release also:
- Removes deprecated python libraries.
- Adds a DDR5 model.
- Adds AMD GPU MI200/gfx90a support.
- Changes building so it no longer "duplicates sources" in build/ which improves support for some IDEs and code analysis. If you still need to duplicate sources you can use the `--duplicate-sources` option to `scons`.
- Enables `--debug-activate=<object name>` to use debug trace for only a single SimObject (the opposite of `--debug-ignore`). See `--debug-help` for more information.
- Adds support to exit the simulation loop based on Arm-PMU events.
- Supports Python 3.11.
- Adds the idea of a CpuCluster to gem5.
* Removes deprecated python libraries.
* Adds a DDR5 model.
* Adds AMD GPU MI200/gfx90a support.
* Changes building so it no longer "duplicates sources" in build/ which improves support for some IDEs and code analysis. If you still need to duplicate sources you can use the `--duplicate-sources` option to `scons`.
* Enables `--debug-activate=<object name>` to use debug trace for only a single SimObject (the opposite of `--debug-ignore`). See `--debug-help` for more information.
* Adds support to exit the simulation loop based on Arm-PMU events.
* Supports Python 3.11.
* Adds the idea of a CpuCluster to gem5.
# Version 22.1.0.0
@@ -163,23 +679,23 @@ See below for more details!
## New features and improvements
- The gem5 binary can now be compiled to include multiple ISA targets.
* The gem5 binary can now be compiled to include multiple ISA targets.
A compilation of gem5 which includes all gem5 ISAs can be created using: `scons build/ALL/gem5.opt`.
This will use the Ruby `MESI_Two_Level` cache coherence protocol by default, to use other protocols: `scons build/ALL/gem5.opt PROTOCOL=<other protocol>`.
The classic cache system may continue to be used regardless as to which Ruby cache coherence protocol is compiled.
- The `m5` Python module now includes functions to set exit events are particular simululation ticks:
- *setMaxTick(tick)* : Used to to specify the maximum simulation tick.
- *getMaxTick()* : Used to obtain the maximum simulation tick value.
- *getTicksUntilMax()*: Used to get the number of ticks remaining until the maximum tick is reached.
- *scheduleTickExitFromCurrent(tick)* : Used to schedule an exit exit event a specified number of ticks in the future.
- *scheduleTickExitAbsolute(tick)* : Used to schedule an exit event as a specified tick.
- We now include the `RiscvMatched` board as part of the gem5 stdlib.
* The `m5` Python module now includes functions to set exit events are particular simululation ticks:
* *setMaxTick(tick)* : Used to to specify the maximum simulation tick.
* *getMaxTick()* : Used to obtain the maximum simulation tick value.
* *getTicksUntilMax()*: Used to get the number of ticks remaining until the maximum tick is reached.
* *scheduleTickExitFromCurrent(tick)* : Used to schedule an exit exit event a specified number of ticks in the future.
* *scheduleTickExitAbsolute(tick)* : Used to schedule an exit event as a specified tick.
* We now include the `RiscvMatched` board as part of the gem5 stdlib.
This board is modeled after the [HiFive Unmatched board](https://www.sifive.com/boards/hifive-unmatched) and may be used to emulate its behavior.
See "configs/example/gem5_library/riscv-matched-fs.py" and "configs/example/gem5_library/riscv-matched-hello.py" for examples using this board.
- An API for [SimPoints](https://doi.org/10.1145/885651.781076) has been added.
* An API for [SimPoints](https://doi.org/10.1145/885651.781076) has been added.
SimPoints can substantially improve gem5 Simulation time by only simulating representative parts of a simulation then extrapolating statistical data accordingly.
Examples of using SimPoints with gem5 can be found in "configs/example/gem5_library/checkpoints/simpoints-se-checkpoint.py" and "configs/example/gem5_library/checkpoints/simpoints-se-restore.py".
- "Workloads" have been introduced to gem5.
* "Workloads" have been introduced to gem5.
Workloads have been incorporated into the gem5 Standard library.
They can be used specify the software to be run on a simulated system that come complete with input parameters and any other dependencies necessary to run a simuation on the target hardware.
At the level of the gem5 configuration script a user may specify a workload via a board's `set_workload` function.
@@ -187,105 +703,104 @@ For example, `set_workload(Workload("x86-ubuntu-18.04-boot"))` sets the board to
This workload specifies a boot consisting of the Linux 5.4.49 kernel then booting an Ubunutu 18.04 disk image, to exit upon booting.
Workloads are agnostic to underlying gem5 design and, via the gem5-resources infrastructure, will automatically retrieve all necessary kernels, disk-images, etc., necessary to execute.
Examples of using gem5 Workloads can be found in "configs/example/gem5_library/x86-ubuntu-ruby.py" and "configs/example/gem5_library/riscv-ubuntu-run.py".
- To aid gem5 developers, we have incorporated [pre-commit](https://pre-commit.com) checks into gem5.
* To aid gem5 developers, we have incorporated [pre-commit](https://pre-commit.com) checks into gem5.
These checks automatically enforce the gem5 style guide on Python files and a subset of other requirements (such as line length) on altered code prior to a `git commit`.
Users may install pre-commit by running `./util/pre-commit-install.sh`.
Passing these checks is a requirement to submit code to gem5 so installation is strongly advised.
- A multiprocessing module has been added.
* A multiprocessing module has been added.
This allows for multiple simulations to be run from a single gem5 execution via a single gem5 configuration script.
An example of usage can be found [in this commit message](https://gem5-review.googlesource.com/c/public/gem5/+/63432).
**Note: This feature is still in development.
While functional, it'll be subject to substantial changes in future releases of gem5**.
- The stdlib's `ArmBoard` now supports Ruby caches.
- Due to numerous fixes and improvements, Ubuntu 22.04 can be booted as a gem5 workload, both in FS and SE mode.
- Substantial improvements have been made to gem5's GDB capabilities.
- The `HBM2Stack` has been added to the gem5 stdlib as a memory component.
- The `MinorCPU` has been fully incorporated into the gem5 Standard Library.
- We now allow for full-system simulation of GPU applications.
* The stdlib's `ArmBoard` now supports Ruby caches.
* Due to numerous fixes and improvements, Ubuntu 22.04 can be booted as a gem5 workload, both in FS and SE mode.
* Substantial improvements have been made to gem5's GDB capabilities.
* The `HBM2Stack` has been added to the gem5 stdlib as a memory component.
* The `MinorCPU` has been fully incorporated into the gem5 Standard Library.
* We now allow for full-system simulation of GPU applications.
The introduction of GPU FS mode allows for the same use-cases as SE mode but reduces the requirement of specific host environments or usage of a Docker container.
The GPU FS mode also has improved simulated speed by functionally simulating memory copies, and provides an easier update path for gem5 developers.
An X86 host and KVM are required to run GPU FS mode.
## API (user facing) changes
- The default CPU Vendor String has been updated to `HygonGenuine`.
* The default CPU Vendor String has been updated to `HygonGenuine`.
This is due to newer versions of GLIBC being more strict about checking the current system's supported features.
The previous value, `M5 Simulator`, is not recognized as a valid vendor string and therefore GLIBC returns an error.
- [The stdlib's `_connect_things` function call has been moved from the `AbstractBoard`'s constructor to be run as a board pre-instantiation process](https://gem5-review.googlesource.com/c/public/gem5/+/65051).
* [The stdlib's `_connect_things` function call has been moved from the `AbstractBoard`'s constructor to be run as a board pre-instantiation process](https://gem5-review.googlesource.com/c/public/gem5/+/65051).
This is to overcome instances where stdlib components (memory, processor, and cache hierarchy) require Board information known only after its construction.
**This change breaks cases where a user utilizes the stdlib `AbstractBoard` but does not use the stdlib `Simulator` module. This can be fixed by adding the `_pre_instantiate` function before `m5.instantiate`**.
An exception has been added which explains this fix, if this error occurs.
- The setting of checkpoints has been moved from the stdlib's "set_workload" functions to the `Simulator` module.
* The setting of checkpoints has been moved from the stdlib's "set_workload" functions to the `Simulator` module.
Setting of checkpoints via the stdlib's "set_workload" functions is now deprecated and will be removed in future releases of gem5.
- The gem5 namespace `Trace` has been renamed `trace` to conform to the gem5 style guide.
- Due to the allowing of multiple ISAs per gem5 build, the `TARGET_ISA` variable has been replaced with `USE_$(ISA)` variables.
* The gem5 namespace `Trace` has been renamed `trace` to conform to the gem5 style guide.
* Due to the allowing of multiple ISAs per gem5 build, the `TARGET_ISA` variable has been replaced with `USE_$(ISA)` variables.
For example, if a build contains both the X86 and ARM ISAs the `USE_X86` and `USE_ARM` variables will be set.
## Bug Fixes
- Several compounding bugs were causing errors in floating point operations within gem5 simulations.
* Several compounding bugs were causing errors in floating point operations within gem5 simulations.
These have been fixed.
- Certain emulated syscalls were behaving incorrectly when using RISC-V due to incorrect `open(2)` flag values.
* Certain emulated syscalls were behaving incorrectly when using RISC-V due to incorrect `open(2)` flag values.
These values have been fixed.
- The GICv3 List register mapping has been fixed.
- Access permissions for GICv3 cpu registers have been fixed.
- In previous releases of gem5 the `sim_quantum` value was set for all cores when using the Standard Library.
* The GICv3 List register mapping has been fixed.
* Access permissions for GICv3 cpu registers have been fixed.
* In previous releases of gem5 the `sim_quantum` value was set for all cores when using the Standard Library.
This caused issues when setting exit events at a particular tick as it resulted in the exit being off by `sim_quantum`.
As such, the `sim_quantum` value is now only set when using KVM cores.
- PCI ranges in `VExpress_GEM5_Foundation` fixed.
- The `SwitchableProcessor` processor has been fixed to allow switching to a KVM core.
* PCI ranges in `VExpress_GEM5_Foundation` fixed.
* The `SwitchableProcessor` processor has been fixed to allow switching to a KVM core.
Previously the `SwitchableProcessor` only allowed a user to switch from a KVM core to a non-KVM core.
- The Standard Library has been fixed to permit multicore simulations in SE mode.
- [A bug was fixed in the rcr X86 instruction](https://gem5.atlassian.net/browse/GEM5-1265).
* The Standard Library has been fixed to permit multicore simulations in SE mode.
* [A bug was fixed in the rcr X86 instruction](https://gem5.atlassian.net/browse/GEM5-1265).
## Build related changes
- gem5 can now be compiled with the SCons 4 build system.
- gem5 can now be compiled with Clang version 14 (minimum Clang version 6).
- gem5 can now be compiled with GCC Version 12 (minimum GCC version 7).
* gem5 can now be compiled with the SCons 4 build system.
* gem5 can now be compiled with Clang version 14 (minimum Clang version 6).
* gem5 can now be compiled with GCC Version 12 (minimum GCC version 7).
## Other minor updates
- The gem5 stdlib examples in "configs/example/gem5_library" have been updated to, where appropriate, use the stdlib's Simulator module.
* The gem5 stdlib examples in "configs/example/gem5_library" have been updated to, where appropriate, use the stdlib's Simulator module.
These example configurations can be used as a reference for how the `Simulator` module may be utilized in gem5.
- Granulated SGPR computation has been added for gfx9 gpu-compute.
- The stdlib statistics have been improved:
- A `get_simstats` function has been added to access statistics from the `Simulator` module.
- Statistics can be printed: `print(simstats.board.core.some_integer)`.
- GDB ports are now specified for each workload, as opposed to per-simulation run.
- The `m5` utility has been expanded to include "workbegin" and "workend" annotations.
* Granulated SGPR computation has been added for gfx9 gpu-compute.
* The stdlib statistics have been improved:
* A `get_simstats` function has been added to access statistics from the `Simulator` module.
* Statistics can be printed: `print(simstats.board.core.some_integer)`.
* GDB ports are now specified for each workload, as opposed to per-simulation run.
* The `m5` utility has been expanded to include "workbegin" and "workend" annotations.
This can be added with `m5 workbegin` and `m5 workend`.
- A `PrivateL1SharedL2CacheHierarchy` has been added to the Standard Library.
- A `GEM5_USE_PROXY` environment variable has been added.
* A `PrivateL1SharedL2CacheHierarchy` has been added to the Standard Library.
* A `GEM5_USE_PROXY` environment variable has been added.
This allows users to specify a socks5 proxy server to use when obtaining gem5 resources and the resources.json file.
It uses the format `<host>:<port>`.
- The fastmodel support has been improved to function with Linux Kernel 5.x.
- The `set_se_binary_workload` function now allows for the passing of input parameters to a binary workload.
- A functional CHI cache hierarchy has been added to the gem5 Standard Library: "src/python/gem5/components/cachehierarchies/chi/private_l1_cache_hierarchy.py".
- The RISC-V K extension has been added.
* The fastmodel support has been improved to function with Linux Kernel 5.x.
* The `set_se_binary_workload` function now allows for the passing of input parameters to a binary workload.
* A functional CHI cache hierarchy has been added to the gem5 Standard Library: "src/python/gem5/components/cachehierarchies/chi/private_l1_cache_hierarchy.py".
* The RISC-V K extension has been added.
It includes the following instructions:
- Zbkx: xperm8, xperm4
- Zknd: aes64ds, aes64dsm, aes64im, aes64ks1i, aes64ks2
- Zkne: aes64es, aes64esm, aes64ks1i, aes64ks2
- Zknh: sha256sig0, sha256sig1, sha256sum0, sha256sum1, sha512sig0, sha512sig1, sha512sum0, sha512sum1
- Zksed: sm4ed, sm4ks
- Zksh: sm3p0, sm3p1
* Zbkx: xperm8, xperm4
* Zknd: aes64ds, aes64dsm, aes64im, aes64ks1i, aes64ks2
* Zkne: aes64es, aes64esm, aes64ks1i, aes64ks2
* Zknh: sha256sig0, sha256sig1, sha256sum0, sha256sum1, sha512sig0, sha512sig1, sha512sum0, sha512sum1
* Zksed: sm4ed, sm4ks
* Zksh: sm3p0, sm3p1
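
The `<host>:<port>` format noted above for `GEM5_USE_PROXY` can be illustrated with a small, hedged sketch. The helper `parse_proxy` and the example value `localhost:1080` are hypothetical and only show how such a string splits into its two parts; this is not gem5's own parsing code.

```python
def parse_proxy(value):
    """Split a '<host>:<port>' string into a (host, port) tuple."""
    # rpartition splits on the last ':' so the port is always the final field.
    host, sep, port = value.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected '<host>:<port>', got {value!r}")
    return host, int(port)


# Example value only; in practice the string would come from the
# GEM5_USE_PROXY environment variable.
host, port = parse_proxy("localhost:1080")
print(f"SOCKS5 proxy host={host} port={port}")
```

Rejecting malformed values early, as this sketch does, is one possible design choice; how gem5 itself reacts to a malformed value is not stated in these notes.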
# Version 22.0.0.2
**[HOTFIX]** This hotfix contains a set of critical fixes to be applied to gem5 v22.0.
This hotfix:
- Fixes the ARM booting of Linux kernels making use of FEAT_PAuth.
- Removes incorrect `requires` functions in AbstractProcessor and AbstractGeneratorCore.
* Fixes the ARM booting of Linux kernels making use of FEAT_PAuth.
* Removes incorrect `requires` functions in AbstractProcessor and AbstractGeneratorCore.
These `requires` were causing errors when running generators with any ISA other than NULL.
- Fixes the standard library's `set_se_binary_workload` function to exit on Exit Events (work items) by default.
- Connects a previously unconnected PCI port in the example SST RISC-V config to the membus.
- Updates the SST-gem5 README with the correct download links.
- Adds a `getAddrRanges` function to the `HBMCtrl`.
* Fixes the standard library's `set_se_binary_workload` function to exit on Exit Events (work items) by default.
* Connects a previously unconnected PCI port in the example SST RISC-V config to the membus.
* Updates the SST-gem5 README with the correct download links.
* Adds a `getAddrRanges` function to the `HBMCtrl`.
This ensures the XBar connected to the controller can see the address ranges covered by both pseudo channels.
- Fixes test_download_resources.py so the correct parameter is passed to the download test script.
* Fixes test_download_resources.py so the correct parameter is passed to the download test script.
# Version 22.0.0.1
@@ -306,14 +821,14 @@ See below for more details!
## New features
- [Arm now models DVM messages for TLBIs and DSBs accurately](https://gem5.atlassian.net/browse/GEM5-1097). This is implemented in the CHI protocol.
- EL2/EL3 support on by default in ArmSystem
- HBM controller which supports pseudo channels
- [Improved Ruby's SimpleNetwork routing](https://gem5.atlassian.net/browse/GEM5-920)
- Added x86 bare metal workload and better real mode support
- [Added round-robin arbitration when using multiple prefetchers](https://gem5.atlassian.net/browse/GEM5-1169)
- [KVM Emulation added for ARM GICv3](https://gem5.atlassian.net/browse/GEM5-1138)
- Many improvements to the CHI protocol
* [Arm now models DVM messages for TLBIs and DSBs accurately](https://gem5.atlassian.net/browse/GEM5-1097). This is implemented in the CHI protocol.
* EL2/EL3 support on by default in ArmSystem
* HBM controller which supports pseudo channels
* [Improved Ruby's SimpleNetwork routing](https://gem5.atlassian.net/browse/GEM5-920)
* Added x86 bare metal workload and better real mode support
* [Added round-robin arbitration when using multiple prefetchers](https://gem5.atlassian.net/browse/GEM5-1169)
* [KVM Emulation added for ARM GICv3](https://gem5.atlassian.net/browse/GEM5-1138)
* Many improvements to the CHI protocol
## Many RISC-V instructions added
@@ -362,7 +877,7 @@ An example gem5 configuration script using this board can be found in `configs/e
When the system is configured for NUMA, it has multiple memory ranges, and each memory range is mapped to a corresponding NUMA node. For this, the change enables `createAddrRanges` to map address ranges to only the given HNFs.
Jira ticket here: https://gem5.atlassian.net/browse/GEM5-1187.
Jira ticket [here](https://gem5.atlassian.net/browse/GEM5-1187).
## API (user-facing) changes
@@ -370,7 +885,7 @@ Jira ticker here: https://gem5.atlassian.net/browse/GEM5-1187.
For instance, the `O3CPU` is now the `X86O3CPU` and `ArmO3CPU`, etc.
This requires a number of changes if you have your own CPU models.
See https://gem5-review.googlesource.com/c/public/gem5/+/52490 for details.
See [here](https://gem5-review.googlesource.com/c/public/gem5/+/52490) for details.
Additionally, this requires changes in any configuration script which inherits from the old CPU types.
@@ -384,31 +899,31 @@ Now, if you want to compile a CPU model for a particular ISA you will have to ad
If you have any specialized CPU models or any ISAs which are not in the mainline, expect many changes when rebasing on this release.
- No longer use read/setIntReg (e.g., see https://gem5-review.googlesource.com/c/public/gem5/+/49766)
- InvalidRegClass has changed (e.g., see https://gem5-review.googlesource.com/c/public/gem5/+/49745)
- All of the register classes have changed (e.g., see https://gem5-review.googlesource.com/c/public/gem5/+/49764/)
- `initiateSpecialMemCmd` renamed to `initiateMemMgmtCmd` to generalize to other commands beyond HTM (e.g., DVM/TLBI)
- `OperandDesc` class added (e.g., see https://gem5-review.googlesource.com/c/public/gem5/+/49731)
- Many cases of `TheISA` have been removed
* No longer use read/setIntReg (e.g., see [link](https://gem5-review.googlesource.com/c/public/gem5/+/49766))
* InvalidRegClass has changed (e.g., see [link](https://gem5-review.googlesource.com/c/public/gem5/+/49745))
* All of the register classes have changed (e.g., see [link](https://gem5-review.googlesource.com/c/public/gem5/+/49764/))
* `initiateSpecialMemCmd` renamed to `initiateMemMgmtCmd` to generalize to other commands beyond HTM (e.g., DVM/TLBI)
* `OperandDesc` class added (e.g., see [link](https://gem5-review.googlesource.com/c/public/gem5/+/49731))
* Many cases of `TheISA` have been removed
## Bug Fixes
- [Fixed RISC-V call/ret instruction decoding](https://gem5-review.googlesource.com/c/public/gem5/+/58209). The fix adds `IsReturn` and `IsCall` flags for RISC-V jump instructions by defining a new `JumpConstructor` in "standard.isa". Jira Ticket here: https://gem5.atlassian.net/browse/GEM5-1139.
- [Fixed x86 Read-Modify-Write behavior in multiple timing cores with classic caches](https://gem5-review.googlesource.com/c/public/gem5/+/55744). Jira Ticket here: https://gem5.atlassian.net/browse/GEM5-1105.
- [The circular buffer for the O3 LSQ has been fixed](https://gem5-review.googlesource.com/c/public/gem5/+/58649). This issue affected running the O3 CPU with large workloads. Jira Ticket here: https://gem5.atlassian.net/browse/GEM5-1203.
- [Removed "memory-leak"-like error in RISC-V lr/sc implementation](https://gem5-review.googlesource.com/c/public/gem5/+/55663). Jira issue here: https://gem5.atlassian.net/browse/GEM5-1170.
- [Resolved issues with Ruby's memtest](https://gem5-review.googlesource.com/c/public/gem5/+/56811). In gem5 v21.2, if the size of the address range was smaller than the maximum number of outstanding requests allowed downstream, the tester would get stuck trying to find a unique address. This has been resolved.
* [Fixed RISC-V call/ret instruction decoding](https://gem5-review.googlesource.com/c/public/gem5/+/58209). The fix adds `IsReturn` and `IsCall` flags for RISC-V jump instructions by defining a new `JumpConstructor` in "standard.isa". Jira Ticket [here](https://gem5.atlassian.net/browse/GEM5-1139).
* [Fixed x86 Read-Modify-Write behavior in multiple timing cores with classic caches](https://gem5-review.googlesource.com/c/public/gem5/+/55744). Jira Ticket [here](https://gem5.atlassian.net/browse/GEM5-1105).
* [The circular buffer for the O3 LSQ has been fixed](https://gem5-review.googlesource.com/c/public/gem5/+/58649). This issue affected running the O3 CPU with large workloads. Jira Ticket [here](https://gem5.atlassian.net/browse/GEM5-1203).
* [Removed "memory-leak"-like error in RISC-V lr/sc implementation](https://gem5-review.googlesource.com/c/public/gem5/+/55663). Jira issue [here](https://gem5.atlassian.net/browse/GEM5-1170).
* [Resolved issues with Ruby's memtest](https://gem5-review.googlesource.com/c/public/gem5/+/56811). In gem5 v21.2, if the size of the address range was smaller than the maximum number of outstanding requests allowed downstream, the tester would get stuck trying to find a unique address. This has been resolved.
## Build-related changes
- Variables in `env` in the SConscript files must now be accessed through `env['CONF']`. Anywhere that `env['<VARIABLE>']` appeared should now be `env['CONF']['<VARIABLE>']`
- Internal build files are now in a per-target `gem5.build` directory
- All build variables are per-target and there are no longer any shared variables.
* Variables in `env` in the SConscript files must now be accessed through `env['CONF']`. Anywhere that `env['<VARIABLE>']` appeared should now be `env['CONF']['<VARIABLE>']`
* Internal build files are now in a per-target `gem5.build` directory
* All build variables are per-target and there are no longer any shared variables.
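
The `env['CONF']` access pattern described in the list above can be sketched as follows. This is only an illustration under stated assumptions: a plain dictionary stands in for the SCons environment, and both the helper `get_build_var` and the variable name `SOME_VARIABLE` are hypothetical placeholders.

```python
# A plain dict stands in for the SCons environment object (assumption);
# build variables live under the nested 'CONF' mapping.
env = {"CONF": {"SOME_VARIABLE": True}}


def get_build_var(env, name):
    """Look up a build variable via env['CONF'] rather than env directly."""
    return env["CONF"][name]


# Old style would have been env['SOME_VARIABLE']; the new style is nested.
print(get_build_var(env, "SOME_VARIABLE"))
```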
## Other changes
- New bootloader is required for Arm VExpress_GEM5_Foundation platform. See https://gem5.atlassian.net/browse/GEM5-1222 for details.
- The MemCtrl interface has been updated to use more inheritance to make extending it to other memory types (e.g., HBM pseudo channels) easier.
* New bootloader is required for Arm VExpress_GEM5_Foundation platform. See [here](https://gem5.atlassian.net/browse/GEM5-1222) for details.
* The MemCtrl interface has been updated to use more inheritance to make extending it to other memory types (e.g., HBM pseudo channels) easier.
# Version 21.2.1.1
@@ -446,28 +961,28 @@ This has now been wrapped in a larger "standard library".
The *gem5 standard library* is a Python package which contains the following:
- **Components:** A set of Python classes which wrap gem5's models. Some of the components are preconfigured to match real hardware (e.g., `SingleChannelDDR3_1600`) and others are parameterized. Components can be combined together onto *boards* which can be simulated.
- **Resources:** A set of utilities to interact with the gem5-resources repository/website. Using this module allows you to *automatically* download and use many of gem5's prebuilt resources (e.g., kernels, disk images, etc.).
- **Simulate:** *THIS MODULE IS IN BETA!* A simpler interface to gem5's simulation/run capabilities. Expect API changes to this module in future releases. Feedback is appreciated.
- **Prebuilt**: These are fully functioning prebuilt systems. These systems are built from the components in `components`. This release has a "demo" board to show an example of how to use the prebuilt systems.
* **Components:** A set of Python classes which wrap gem5's models. Some of the components are preconfigured to match real hardware (e.g., `SingleChannelDDR3_1600`) and others are parameterized. Components can be combined together onto *boards* which can be simulated.
* **Resources:** A set of utilities to interact with the gem5-resources repository/website. Using this module allows you to *automatically* download and use many of gem5's prebuilt resources (e.g., kernels, disk images, etc.).
* **Simulate:** *THIS MODULE IS IN BETA!* A simpler interface to gem5's simulation/run capabilities. Expect API changes to this module in future releases. Feedback is appreciated.
* **Prebuilt**: These are fully functioning prebuilt systems. These systems are built from the components in `components`. This release has a "demo" board to show an example of how to use the prebuilt systems.
Examples of using the gem5 standard library can be found in `configs/example/gem5_library/`.
The source code is found under `src/python/gem5`.
## Many Arm improvements
- [Improved configurability for Arm architectural extensions](https://gem5.atlassian.net/browse/GEM5-1132): we have improved how to enable/disable architectural extensions for an Arm system. Rather than working with independent boolean values, we now use a unified ArmRelease object modelling the architectural features supported by a FS/SE Arm simulation
- [Arm TLB can store partial entries](https://gem5.atlassian.net/browse/GEM5-1108): It is now possible to configure an ArmTLB as a walk cache: storing intermediate PAs obtained during a translation table walk.
- [Implemented a multilevel TLB hierarchy](https://gem5.atlassian.net/browse/GEM5-790): enabling users to compose/model a customizable multilevel TLB hierarchy in gem5. The default Arm MMU has now an Instruction L1 TLB, a Data L1 TLB and a Unified (Instruction + Data) L2 TLB.
- Provided an Arm example script for the gem5-SST integration (<https://gem5.atlassian.net/browse/GEM5-1121>).
* [Improved configurability for Arm architectural extensions](https://gem5.atlassian.net/browse/GEM5-1132): we have improved how to enable/disable architectural extensions for an Arm system. Rather than working with independent boolean values, we now use a unified ArmRelease object modelling the architectural features supported by a FS/SE Arm simulation
* [Arm TLB can store partial entries](https://gem5.atlassian.net/browse/GEM5-1108): It is now possible to configure an ArmTLB as a walk cache: storing intermediate PAs obtained during a translation table walk.
* [Implemented a multilevel TLB hierarchy](https://gem5.atlassian.net/browse/GEM5-790): enabling users to compose/model a customizable multilevel TLB hierarchy in gem5. The default Arm MMU has now an Instruction L1 TLB, a Data L1 TLB and a Unified (Instruction + Data) L2 TLB.
* Provided an Arm example script for the gem5-SST integration (<https://gem5.atlassian.net/browse/GEM5-1121>).
## GPU improvements
- Vega support: gfx900 (Vega) discrete GPUs are now both supported and tested with [gem5-resources applications](https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/gpu/).
- Improvements to the VIPER coherence protocol to fix bugs and improve performance: this improves scalability for large applications running on relatively small GPU configurations, which caused deadlocks in VIPER's L2. Instead of continually replaying these requests, the updated protocol instead wakes up the pending requests once the prior request to this cache line has completed.
- Additional GPU applications: The [Pannotia graph analytics benchmark suite](https://github.com/pannotia/pannotia) has been added to gem5-resources, including Makefiles, READMEs, and sample commands on how to run each application in gem5.
- Regression Testing: Several GPU applications are now tested as part of the nightly and weekly regressions, which improves test coverage and avoids introducing inadvertent bugs.
- Minor updates to the architecture model: We also added several small changes/fixes to the HSA queue size (to allow larger GPU applications with many kernels to run), the TLB (to create GCN3- and Vega-specific TLBs), adding new instructions that were previously unimplemented in GCN3 and Vega, and fixing corner cases for some instructions that were leading to incorrect behavior.
* Vega support: gfx900 (Vega) discrete GPUs are now both supported and tested with [gem5-resources applications](https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/gpu/).
* Improvements to the VIPER coherence protocol to fix bugs and improve performance: this improves scalability for large applications running on relatively small GPU configurations, which caused deadlocks in VIPER's L2. Instead of continually replaying these requests, the updated protocol instead wakes up the pending requests once the prior request to this cache line has completed.
* Additional GPU applications: The [Pannotia graph analytics benchmark suite](https://github.com/pannotia/pannotia) has been added to gem5-resources, including Makefiles, READMEs, and sample commands on how to run each application in gem5.
* Regression Testing: Several GPU applications are now tested as part of the nightly and weekly regressions, which improves test coverage and avoids introducing inadvertent bugs.
* Minor updates to the architecture model: We also added several small changes/fixes to the HSA queue size (to allow larger GPU applications with many kernels to run), the TLB (to create GCN3- and Vega-specific TLBs), adding new instructions that were previously unimplemented in GCN3 and Vega, and fixing corner cases for some instructions that were leading to incorrect behavior.
## gem5-SST bridges revived
@@ -488,13 +1003,13 @@ However, they should be simple to extend to other ISAs through small source chan
## Other improvements
- Removed master/slave terminology: this was a closed ticket which was marked as done even though there were multiple references of master/slave in the config scripts which we fixed.
- Armv8.2-A FEAT_UAO implementation.
- Implemented 'at' variants of file syscall in SE mode (<https://gem5.atlassian.net/browse/GEM5-1098>).
- Improved modularity in SConscripts.
- Arm atomic support in the CHI protocol
- Many testing improvements.
- New "tester" CPU which mimics GUPS.
* Removed master/slave terminology: this was a closed ticket which was marked as done even though there were multiple references of master/slave in the config scripts which we fixed.
* Armv8.2-A FEAT_UAO implementation.
* Implemented 'at' variants of file syscall in SE mode (<https://gem5.atlassian.net/browse/GEM5-1098>).
* Improved modularity in SConscripts.
* Arm atomic support in the CHI protocol
* Many testing improvements.
* New "tester" CPU which mimics GUPS.
# Version 21.1.0.2
@@ -509,7 +1024,7 @@ This hotfix initializes using loops which fixes the broken statistics.
# Version 21.1.0.0
Since v21.0 we have received 780 commits with 48 unique contributors, closing 64 issues on our [Jira Issue Tracker](https://gem5.atlassian.net/).
In addition to our [first gem5 minor release](#version-21.0.1.0), we have included a range of new features, and API changes which we outline below.
In addition to our first gem5 minor release, we have included a range of new features, and API changes which we outline below.
## Added the Components Library [Alpha Release]
@@ -568,7 +1083,7 @@ Classes that handle set dueling have been created ([Dueler and DuelingMonitor](h
They can be used in conjunction with different cache policies.
A [replacement policy that uses it](https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.1.0.0/src/mem/cache/replacement_policies/dueling_rp.hh) has been added for guidance.
## RISC-V is now supported as a host machine.
## RISC-V is now supported as a host machine
gem5 is now compilable and runnable on a RISC-V host system.
@@ -578,6 +1093,7 @@ Deprecation MACROS have been added for deprecating namespaces (`GEM5_DEPRECATED_
**Note:**
For technical reasons, using old macros won't produce any deprecation warnings.
## Refactoring of the gem5 Namespaces
Snake case has been adopted as the new convention for name spaces.
@@ -656,9 +1172,9 @@ Version 21.0.1 is a minor gem5 release consisting of bug fixes. The 21.0.1 relea
* Fixes the [GCN-GPU Dockerfile](https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.0.1.0/util/dockerfiles/gcn-gpu/Dockerfile) to pull from the v21-0 bucket.
* Fixes the tests to download from the v21-0 bucket instead of the develop bucket.
* Fixes the Temperature class:
* Fixes [fs_power.py](https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.0.1.0/configs/example/arm/fs_power.py), which was producing a ["Temperature is not JSON serializable" error](https://gem5.atlassian.net/browse/GEM5-951).
* Fixes temperature printing in `config.ini`.
* Fixes the pybind export for the `from_kelvin` function.
* Fixes [fs_power.py](https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.0.1.0/configs/example/arm/fs_power.py), which was producing a ["Temperature is not JSON serializable" error](https://gem5.atlassian.net/browse/GEM5-951).
* Fixes temperature printing in `config.ini`.
* Fixes the pybind export for the `from_kelvin` function.
* Eliminates a duplicated name warning in [ClockTick](https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.0.1.0/src/systemc/channel/sc_clock.cc).
* Fixes the [Ubuntu 18.04 Dockerfile](https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.0.1.0/util/dockerfiles/ubuntu-20.04_all-dependencies/Dockerfile) to use Python3 instead of Python2.
* Makes [verify.py](https://gem5.googlesource.com/public/gem5/+/refs/tags/v21.0.1.0/src/systemc/tests/verify.py) compatible with Python3.
@@ -769,7 +1285,7 @@ This bug affected the overall behavior of the Garnet Network Model.
# Version 20.1.0.0
Thank you to everyone that made this release possible!
This has been a very productive release with [150 issues](https://gem5.atlassian.net/), over 650 commits (a 25% increase from the 20.0 release), and 58 unique contributors (a 100% increase!).
This has been a very productive release with [150 issues](https://gem5.atlassian.net/), over 650 commits (a 25% increase from the 20.0 release), and 58 unique contributors (a 100% increase!).
## Process changes
@@ -853,7 +1369,7 @@ See <http://www.gem5.org/documentation/general_docs/building> for gem5's current
* There may be some bugs introduced with this change as there were many places in the Python configurations which relied on "duck typing".
* This change is mostly backwards compatible and warnings will be issued until at least gem5-20.2.
```
```txt
MasterPort -> RequestorPort
SlavePort -> ResponsePort
@@ -927,7 +1443,7 @@ Below are some of the highlights, though I'm sure I've missed some important cha
* All full system config/run scripts must be updated (e.g., anything that used the `LinuxX86System` or similar SimObject).
* Many of the parameters of `System` are now parameters of the `Workload` (see `src/sim/Workload.py`).
* For instance, many parameters of `LinuxX86System` are now part of `X86FsLinux` which is now the `workload` parameter of the `System` SimObject.
* See https://gem5-review.googlesource.com/c/public/gem5/+/24283/ and https://gem5-review.googlesource.com/c/public/gem5/+/26466 for more details.
* See [here](https://gem5-review.googlesource.com/c/public/gem5/+/24283/) and [here](https://gem5-review.googlesource.com/c/public/gem5/+/26466) for more details.
* Sv39 paging has been added to the RISC-V ISA, bringing gem5 close to running Linux on RISC-V.
* (Some) Baremetal OSes are now supported.
* Improvements to DRAM model:


@@ -44,15 +44,6 @@
#
# SCons top-level build description (SConstruct) file.
#
# While in this directory ('gem5'), just type 'scons' to build the default
# configuration (see below), or type 'scons build/<CONFIG>/<binary>'
# to build some other configuration (e.g., 'build/X86/gem5.opt' for
# the optimized X86 version).
#
# You can build gem5 in a different directory as long as there is a
# 'build/<CONFIG>' somewhere along the target path. The build system
# expects that all configs under the same build directory are being
# built for the same host system.
#
# Examples:
#
@@ -77,10 +68,11 @@
# Global Python imports
import atexit
import itertools
import os
import sys
from os import mkdir, remove, environ
from os import mkdir, remove, environ, listdir
from os.path import abspath, dirname, expanduser
from os.path import isdir, isfile
from os.path import join, split
@@ -115,8 +107,6 @@ AddOption('--no-colors', dest='use_colors', action='store_false',
help="Don't add color to abbreviated scons output")
AddOption('--with-cxx-config', action='store_true',
help="Build with support for C++-based configuration")
AddOption('--default',
help='Override which build_opts file to use for defaults')
AddOption('--ignore-style', action='store_true',
help='Disable style checking hooks')
AddOption('--linker', action='store', default=None, choices=linker_options,
@@ -127,6 +117,8 @@ AddOption('--no-compress-debug', action='store_true',
help="Don't compress debug info in build files")
AddOption('--with-lto', action='store_true',
help='Enable Link-Time Optimization')
AddOption('--with-libcxx', action='store_true',
help='Use libc++ as the C++ standard library (requires Clang)')
AddOption('--verbose', action='store_true',
help='Print full tool command lines')
AddOption('--without-python', action='store_true',
@@ -141,6 +133,8 @@ AddOption('--with-systemc-tests', action='store_true',
help='Build systemc tests')
AddOption('--install-hooks', action='store_true',
help='Install revision control hooks non-interactively')
AddOption('--limit-ld-memory-usage', action='store_true',
help='Tell ld, the linker, to reduce memory usage.')
AddOption('--gprof', action='store_true',
help='Enable support for the gprof profiler')
AddOption('--pprof', action='store_true',
@@ -162,6 +156,7 @@ sys.path[1:1] = [ Dir('#build_tools').abspath ]
# declared above.
from gem5_scons import error, warning, summarize_warnings, parse_build_path
from gem5_scons import TempFileSpawn, EnvDefaults, MakeAction, MakeActionTool
from gem5_scons import kconfig
import gem5_scons
from gem5_scons.builders import ConfigFile, AddLocalRPATH, SwitchingHeaders
from gem5_scons.builders import Blob
@@ -205,7 +200,71 @@ if not ('CC' in main and 'CXX' in main):
error("No C++ compiler installed (package g++ on Ubuntu and RedHat)")
# Find default configuration & binary.
Default(environ.get('M5_DEFAULT_BINARY', 'build/ARM/gem5.debug'))
default_target = environ.get('M5_DEFAULT_BINARY', None)
if default_target:
Default(default_target)
# If no target is set, even a default, print help instead.
if not BUILD_TARGETS:
warning("No target specified, and no default.")
SetOption('help', True)
buildopts_dir = Dir('#build_opts')
buildopts = list([f for f in os.listdir(buildopts_dir.abspath) if
isfile(os.path.join(buildopts_dir.abspath, f))])
buildopts.sort()
buildopt_list = '\n'.join(' ' * 10 + buildopt for buildopt in buildopts)
Help(f"""
Targets:
To build gem5 using a predefined configuration, use a target with
a directory called "build" in the path, followed by a directory named
after a predefined configuration in "build_opts" directory, and then
the actual target, likely a gem5 binary. For example:
scons build/ALL/gem5.opt
The "build" component tells SCons that the next part names an initial
configuration, and the part after that is the actual target.
The predefined targets currently available are:
{buildopt_list}
The extension on the gem5 binary specifies what type of binary to
build. Options are:
debug: A debug binary with optimizations turned off and debug info
turned on.
opt: An optimized binary with debugging still turned on.
fast: An optimized binary with debugging, asserts, and tracing
disabled.
gem5 can also be built as a static or dynamic library. In that case,
the extension is determined by the operating system, so the binary type
is part of the target file name. For example:
scons build/ARM/libgem5_opt.so
On macOS, the extension should change to "dylib" like this:
scons build/ARM/libgem5_opt.dylib
To build unit tests, you can use a target like this:
scons build/RISCV/unittests.debug
The unittests.debug part of the target is actually a directory which
holds the results for all the unit tests built with the "debug"
settings. When that's used as the target, SCons will build all the
files under that directory, which will run all the tests.
To build and run an individual test, you can build its binary
specifically and then run it manually:
scons build/SPARC/base/bitunion.test.opt
build/SPARC/base/bitunion.test.opt
""", append=True)
########################################################################
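Reviewer note: the help text above describes targets as a "build" directory, then a configuration name from build_opts, then the actual target. A minimal sketch of that decomposition follows. `describe_target` is a hypothetical helper written for illustration; it is not gem5's `parse_build_path`, and picking the last "build" component is an assumption.

```python
from pathlib import PurePosixPath


def describe_target(target):
    """Split a target such as 'build/ALL/gem5.opt' into
    (build root, configuration name, remaining target path).

    Hypothetical helper for illustration only.
    """
    parts = PurePosixPath(target).parts
    if "build" not in parts:
        raise ValueError(f"no 'build' component in {target!r}")
    # Assumption: use the last 'build' component if several appear.
    idx = len(parts) - 1 - parts[::-1].index("build")
    if idx + 1 >= len(parts):
        raise ValueError(f"no configuration after 'build' in {target!r}")
    build_root = str(PurePosixPath(*parts[: idx + 1]))
    config = parts[idx + 1]
    remainder = "/".join(parts[idx + 2:])
    return build_root, config, remainder


if __name__ == "__main__":
    print(describe_target("build/ALL/gem5.opt"))
    print(describe_target("foo/bar/build/X86/arch/x86/blah.do"))
```

With this split, the configuration name is what would be looked up in build_opts, and the remainder names the binary or file to build.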
@@ -215,52 +274,134 @@ Default(environ.get('M5_DEFAULT_BINARY', 'build/ARM/gem5.debug'))
#
########################################################################
# helper function: find last occurrence of element in list
def rfind(l, elt, offs = -1):
for i in range(len(l)+offs, 0, -1):
if l[i] == elt:
return i
raise ValueError("element not found")
kconfig_actions = (
'defconfig',
'guiconfig',
'listnewconfig',
'menuconfig',
'oldconfig',
'olddefconfig',
'savedefconfig',
'setconfig',
)
Help("""
Kconfig:
In addition to the default configs, you can also create your own
configs, or edit one that already exists. To use one of the kconfig
tools with a particular directory, use a target which is the directory
to configure, and then the name of the tool. For example, to run
menuconfig on directory build_foo/bar, run:
scons menuconfig build_foo/bar
This will set up a build directory in build_foo/bar if one doesn't already
exist, and open the menuconfig editor to view/set configuration
values.
Kconfig tools:
defconfig:
Set up a config using values specified in a defconfig file, or if no
value is given, use the default. The second argument specifies the
defconfig file. A defconfig file in the build_opts directory can be
implicitly specified in the build path via `build/<defconfig file>/`
scons defconfig build_foo/bar build_opts/MIPS
guiconfig:
Opens the guiconfig editor which will let you view and edit config
values, and view help text. guiconfig runs as a graphical application.
scons guiconfig build_foo/bar
listnewconfig:
Lists config options which are new in the Kconfig and which are not
currently set in the existing config file.
scons listnewconfig build_foo/bar
menuconfig:
Opens the menuconfig editor which will let you view and edit config
values, and view help text. menuconfig runs in text mode.
scons menuconfig build_foo/bar
oldconfig:
Update an existing config by adding settings for new options. This is
the same as the olddefconfig tool, except it asks what values you want
for the new settings.
scons oldconfig build_foo/bar
olddefconfig:
Update an existing config by adding settings for new options. This is
the same as the oldconfig tool, except it uses the default for any new
setting.
scons olddefconfig build_foo/bar
savedefconfig:
Save a defconfig file which would give rise to the current config.
For instance, you could use menuconfig to set up a config how you want
it with the options you cared about, and then use savedefconfig to save
a minimal config file. These files would be suitable to use in the
defconfig directory. The second argument specifies the filename for
the new defconfig file.
scons savedefconfig build_foo/bar new_def_config
setconfig:
Set values in an existing config directory as specified on the command
line. For example, to enable gem5's built in systemc kernel:
scons setconfig build_foo/bar USE_SYSTEMC=y
""", append=True)
# Take a list of paths (or SCons Nodes) and return a list with all
# paths made absolute and ~-expanded. Paths will be interpreted
# relative to the launch directory unless a different root is provided
def makePathAbsolute(path, root=GetLaunchDir()):
return abspath(os.path.join(root, expanduser(str(path))))
def makePathListAbsolute(path_list, root=GetLaunchDir()):
return [abspath(os.path.join(root, expanduser(str(p))))
for p in path_list]
return [makePathAbsolute(p, root) for p in path_list]
# Each target must have 'build' in the interior of the path; the
# directory below this will determine the build parameters. For
# example, for target 'foo/bar/build/X86/arch/x86/blah.do' we
# recognize that X86 specifies the configuration because it
# follows 'build' in the build path.
if BUILD_TARGETS and BUILD_TARGETS[0] in kconfig_actions:
# The build targets are really arguments for the kconfig action.
kconfig_args = BUILD_TARGETS[:]
BUILD_TARGETS[:] = []
# The funky assignment to "[:]" is needed to replace the list contents
# in place rather than reassign the symbol to a new list, which
# doesn't work (obviously!).
BUILD_TARGETS[:] = makePathListAbsolute(BUILD_TARGETS)
kconfig_action = kconfig_args[0]
if len(kconfig_args) < 2:
error(f'Missing arguments for kconfig action {kconfig_action}')
dir_to_configure = makePathAbsolute(kconfig_args[1])
# Generate a list of the unique build roots and configs that the
# collected targets reference.
variant_paths = set()
build_root = None
for t in BUILD_TARGETS:
this_build_root, variant = parse_build_path(t)
kconfig_args = kconfig_args[2:]
# Make sure all targets use the same build root.
if not build_root:
build_root = this_build_root
elif this_build_root != build_root:
error("build targets not under same build root\n %s\n %s" %
(build_root, this_build_root))
variant_paths = {dir_to_configure}
else:
# Each target must have 'build' in the interior of the path; the
# directory below this will determine the build parameters. For
# example, for target 'foo/bar/build/X86/arch/x86/blah.do' we
# recognize that X86 specifies the configuration because it
# follows 'build' in the build path.
# Collect all the variants into a set.
variant_paths.add(os.path.join('/', build_root, variant))
# The funky assignment to "[:]" is needed to replace the list contents
# in place rather than reassign the symbol to a new list, which
# doesn't work (obviously!).
BUILD_TARGETS[:] = makePathListAbsolute(BUILD_TARGETS)
# Make sure build_root exists (might not if this is the first build there)
if not isdir(build_root):
mkdir(build_root)
main['BUILDROOT'] = build_root
# Generate a list of the unique build directories that the collected
# targets reference.
variant_paths = set(map(parse_build_path, BUILD_TARGETS))
kconfig_action = None
########################################################################
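Reviewer note: the new branch in this hunk treats the command-line targets as arguments when the first one names a kconfig tool. The sketch below restates that split outside of SCons, assuming a plain list of strings in place of `BUILD_TARGETS`. `split_kconfig_targets` is a hypothetical name, and it raises `ValueError` where the real script calls `error()`.

```python
KCONFIG_ACTIONS = (
    "defconfig",
    "guiconfig",
    "listnewconfig",
    "menuconfig",
    "oldconfig",
    "olddefconfig",
    "savedefconfig",
    "setconfig",
)


def split_kconfig_targets(targets):
    """Return (action, dir_to_configure, extra_args).

    If the first target is not a kconfig action, the targets are
    ordinary build targets and come back unchanged as extra_args.
    Hypothetical helper for illustration only.
    """
    if not targets or targets[0] not in KCONFIG_ACTIONS:
        return None, None, list(targets)
    action = targets[0]
    if len(targets) < 2:
        raise ValueError(f"Missing arguments for kconfig action {action}")
    return action, targets[1], list(targets[2:])


if __name__ == "__main__":
    print(split_kconfig_targets(["defconfig", "build_foo/bar", "build_opts/MIPS"]))
    print(split_kconfig_targets(["build/ALL/gem5.opt"]))
```

Note that in the diff the directory to configure is also made absolute and becomes the only variant path, which this sketch leaves out.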
@@ -395,10 +536,14 @@ for variant_path in variant_paths:
env = main.Clone()
env['BUILDDIR'] = variant_path
gem5_build = os.path.join(build_root, variant_path, 'gem5.build')
gem5_build = os.path.join(variant_path, 'gem5.build')
env['GEM5BUILD'] = gem5_build
Execute(Mkdir(gem5_build))
config_file = Dir(gem5_build).File('config')
kconfig_file = Dir(gem5_build).File('Kconfig')
gem5_kconfig_file = Dir('#src').File('Kconfig')
env.SConsignFile(os.path.join(gem5_build, 'sconsign'))
# Set up default C++ compiler flags
@@ -424,6 +569,16 @@ for variant_path in variant_paths:
with gem5_scons.Configure(env) as conf:
conf.CheckLinkFlag('-Wl,--as-needed')
want_libcxx = GetOption('with_libcxx')
if want_libcxx:
with gem5_scons.Configure(env) as conf:
# Try using libc++ if it supports the <filesystem> library.
code = '#include <filesystem>\nint main() { return 0; }'
if (not conf.CheckCxxFlag('-stdlib=libc++') or
not conf.CheckLinkFlag('-stdlib=libc++', code=code)
):
error('Requested libc++ but it is not usable')
linker = GetOption('linker')
if linker:
with gem5_scons.Configure(env) as conf:
@@ -447,7 +602,13 @@ for variant_path in variant_paths:
conf.CheckLinkFlag(
'-Wl,--thread-count=%d' % GetOption('num_jobs'))
with gem5_scons.Configure(env) as conf:
ld_optimize_memory_usage = GetOption('limit_ld_memory_usage')
if ld_optimize_memory_usage:
if conf.CheckLinkFlag('-Wl,--no-keep-memory'):
env.Append(LINKFLAGS=['-Wl,--no-keep-memory'])
else:
error("Unable to use --no-keep-memory with the linker")
else:
error('\n'.join((
"Don't know what compiler options to use for your compiler.",
@@ -463,8 +624,8 @@ for variant_path in variant_paths:
"src/SConscript to support that compiler.")))
if env['GCC']:
if compareVersions(env['CXXVERSION'], "7") < 0:
error('gcc version 7 or newer required.\n'
if compareVersions(env['CXXVERSION'], "10") < 0:
error('gcc version 10 or newer required.\n'
'Installed version:', env['CXXVERSION'])
# Add the appropriate Link-Time Optimization (LTO) flags if
@@ -488,17 +649,6 @@ for variant_path in variant_paths:
'-fno-builtin-malloc', '-fno-builtin-calloc',
'-fno-builtin-realloc', '-fno-builtin-free'])
if compareVersions(env['CXXVERSION'], "9") < 0:
# `libstdc++fs` must be explicitly linked for `std::filesystem`
# in GCC version 8. As of GCC version 9, this is not required.
#
# In GCC 7 the `libstdc++fs`` library explicit linkage is also
# required but the `std::filesystem` is under the `experimental`
# namespace(`std::experimental::filesystem`).
#
# Note: gem5 does not support GCC versions < 7.
env.Append(LIBS=['stdc++fs'])
elif env['CLANG']:
if compareVersions(env['CXXVERSION'], "6") < 0:
error('clang version 6 or newer required.\n'
@@ -516,7 +666,7 @@ for variant_path in variant_paths:
env.Append(TCMALLOC_CCFLAGS=['-fno-builtin'])
if compareVersions(env['CXXVERSION'], "11") < 0:
if not want_libcxx and compareVersions(env['CXXVERSION'], "11") < 0:
# `libstdc++fs` must be explicitly linked for `std::filesystem`
# in clang versions 6 through 10.
#
@@ -530,7 +680,7 @@ for variant_path in variant_paths:
# On Mac OS X/Darwin we need to also use libc++ (part of XCode) as
# opposed to libstdc++, as the latter is dated.
if sys.platform == "darwin":
if not want_libcxx and sys.platform == "darwin":
env.Append(CXXFLAGS=['-stdlib=libc++'])
env.Append(LIBS=['c++'])
@@ -539,27 +689,37 @@ for variant_path in variant_paths:
if GetOption('with_ubsan'):
sanitizers.append('undefined')
if GetOption('with_asan'):
# Available for gcc >= 5 or llvm >= 3.1 both a requirement
# by the build system
sanitizers.append('address')
suppressions_file = Dir('util').File('lsan-suppressions').get_abspath()
suppressions_opt = 'suppressions=%s' % suppressions_file
suppressions_opts = ':'.join([suppressions_opt,
'print_suppressions=0'])
env['ENV']['LSAN_OPTIONS'] = suppressions_opts
print()
warning('To suppress false positive leaks, set the LSAN_OPTIONS '
'environment variable to "%s" when running gem5' %
suppressions_opts)
warning('LSAN_OPTIONS=%s' % suppressions_opts)
print()
if env['GCC']:
# Address sanitizer is not supported with GCC. Please see Github
# Issue https://github.com/gem5/gem5/issues/916 for more details.
warning("Address Sanitizer is not supported with GCC. "
"This option will be ignored.")
else:
# Available for llvm >= 3.1. A requirement by the build system.
sanitizers.append('address')
suppressions_file = Dir('util').File('lsan-suppressions')\
.get_abspath()
suppressions_opt = 'suppressions=%s' % suppressions_file
suppressions_opts = ':'.join([suppressions_opt,
'print_suppressions=0'])
env['ENV']['LSAN_OPTIONS'] = suppressions_opts
print()
warning('To suppress false positive leaks, set the LSAN_OPTIONS '
'environment variable to "%s" when running gem5' %
suppressions_opts)
warning('LSAN_OPTIONS=%s' % suppressions_opts)
print()
if sanitizers:
sanitizers = ','.join(sanitizers)
if env['GCC'] or env['CLANG']:
libsan = (
['-static-libubsan', '-static-libasan']
if env['GCC']
else ['-static-libsan']
)
env.Append(CCFLAGS=['-fsanitize=%s' % sanitizers,
'-fno-omit-frame-pointer'],
LINKFLAGS=['-fsanitize=%s' % sanitizers,
'-static-libasan'])
LINKFLAGS=['-fsanitize=%s' % sanitizers] + libsan)
if main["BIN_TARGET_ARCH"] == "x86_64":
# Sanitizers can enlarge binary size dramatically, north of
@@ -626,7 +786,7 @@ for variant_path in variant_paths:
LINKFLAGS=['-Wl,--no-as-needed', '-lprofiler',
'-Wl,--as-needed'])
env['HAVE_PKG_CONFIG'] = env.Detect('pkg-config')
env['HAVE_PKG_CONFIG'] = env.Detect('pkg-config') == 'pkg-config'
with gem5_scons.Configure(env) as conf:
# On Solaris you need to use libsocket for socket ops
@@ -670,59 +830,13 @@ for variant_path in variant_paths:
after_sconsopts_callbacks.append(cb)
Export('AfterSConsopts')
# Sticky variables get saved in the variables file so they persist from
# one invocation to the next (unless overridden, in which case the new
# value becomes sticky).
sticky_vars = Variables(args=ARGUMENTS)
Export('sticky_vars')
extras_file = os.path.join(gem5_build, 'extras')
extras_var = Variables(extras_file, args=ARGUMENTS)
# EXTRAS is special since it affects what SConsopts need to be read.
sticky_vars.Add(('EXTRAS', 'Add extra directories to the compilation', ''))
# Set env variables according to the build directory config.
sticky_vars.files = []
# Variables for $BUILD_ROOT/$VARIANT_DIR are stored in
# $BUILD_ROOT/$VARIANT_DIR/gem5.build/variables
gem5_build_vars = os.path.join(gem5_build, 'variables')
build_root_vars = os.path.join(build_root, 'variables', variant_dir)
current_vars_files = [gem5_build_vars, build_root_vars]
existing_vars_files = list(filter(isfile, current_vars_files))
if existing_vars_files:
sticky_vars.files.extend(existing_vars_files)
if not GetOption('silent'):
print('Using saved variables file(s) %s' %
', '.join(existing_vars_files))
else:
# Variant specific variables file doesn't exist.
# Get default build variables from source tree. Variables are
# normally determined by name of $VARIANT_DIR, but can be
# overridden by '--default=' arg on command line.
default = GetOption('default')
opts_dir = Dir('#build_opts').abspath
if default:
default_vars_files = [
gem5_build_vars,
build_root_vars,
os.path.join(opts_dir, default)
]
else:
default_vars_files = [os.path.join(opts_dir, variant_dir)]
existing_default_files = list(filter(isfile, default_vars_files))
if existing_default_files:
default_vars_file = existing_default_files[0]
sticky_vars.files.append(default_vars_file)
print("Variables file(s) %s not found,\n using defaults in %s" %
(' or '.join(current_vars_files), default_vars_file))
else:
error("Cannot find variables file(s) %s or default file(s) %s" %
(' or '.join(current_vars_files),
' or '.join(default_vars_files)))
Exit(1)
extras_var.Add(('EXTRAS', 'Add extra directories to the compilation', ''))
# Apply current settings for EXTRAS to env.
sticky_vars.Update(env)
extras_var.Update(env)
# Parse EXTRAS variable to build list of all directories where we'll
# look for sources etc. This list is exported as extras_dir_list.
@@ -733,6 +847,17 @@ for variant_path in variant_paths:
Export('extras_dir_list')
# Generate a Kconfig that will source the main gem5 one, plus any Kconfig
# found in the EXTRAS directories.
kconfig_base_py = Dir('#build_tools').File('kconfig_base.py')
kconfig_base_cmd_parts = [f'"{kconfig_base_py}" "{kconfig_file.abspath}"',
f'"{gem5_kconfig_file.abspath}"']
for ed in extras_dir_list:
kconfig_base_cmd_parts.append(f'"{ed}"')
kconfig_base_cmd = ' '.join(kconfig_base_cmd_parts)
if env.Execute(kconfig_base_cmd) != 0:
error("Failed to build base Kconfig file")
# Variables which were determined with Configure.
env['CONF'] = {}
@@ -760,24 +885,48 @@ for variant_path in variant_paths:
for cb in after_sconsopts_callbacks:
cb()
# Update env for new variables added by the SConsopts.
sticky_vars.Update(env)
# Handle any requested kconfig action, then exit.
if kconfig_action:
if kconfig_action == 'defconfig':
if len(kconfig_args) != 1:
error('Usage: scons defconfig <build dir> <defconfig file>')
defconfig_path = makePathAbsolute(kconfig_args[0])
kconfig.defconfig(env, kconfig_file.abspath,
defconfig_path, config_file.abspath)
elif kconfig_action == 'guiconfig':
kconfig.guiconfig(env, kconfig_file.abspath, config_file.abspath,
variant_path)
elif kconfig_action == 'listnewconfig':
kconfig.listnewconfig(env, kconfig_file.abspath,
config_file.abspath)
elif kconfig_action == 'menuconfig':
kconfig.menuconfig(env, kconfig_file.abspath, config_file.abspath,
variant_path)
elif kconfig_action == 'oldconfig':
kconfig.oldconfig(env, kconfig_file.abspath, config_file.abspath)
elif kconfig_action == 'olddefconfig':
kconfig.olddefconfig(env, kconfig_file.abspath,
config_file.abspath)
elif kconfig_action == 'savedefconfig':
if len(kconfig_args) != 1:
error('Usage: scons savedefconfig <build dir> <defconfig file>')
defconfig_path = makePathAbsolute(kconfig_args[0])
kconfig.savedefconfig(env, kconfig_file.abspath,
config_file.abspath, defconfig_path)
elif kconfig_action == 'setconfig':
kconfig.setconfig(env, kconfig_file.abspath, config_file.abspath,
ARGUMENTS)
Exit(0)
Help('''
Build variables for {dir}:
{help}
'''.format(dir=variant_dir, help=sticky_vars.GenerateHelpText(env)),
append=True)
# If no config exists yet, see if we know how to make one?
if not isfile(config_file.abspath):
buildopts_file = Dir('#build_opts').File(variant_dir)
if not isfile(buildopts_file.abspath):
error('No config found, and no implicit config recognized')
kconfig.defconfig(env, kconfig_file.abspath, buildopts_file.abspath,
config_file.abspath)
# If the old vars file exists, delete it to avoid confusion/stale values.
if isfile(build_root_vars):
warning(f'Deleting old variant variables file "{build_root_vars}"')
remove(build_root_vars)
# Save sticky variables back to the gem5.build variant variables file.
sticky_vars.Save(gem5_build_vars, env)
# Pull all the sticky variables into the CONF dict.
env['CONF'].update({key: env[key] for key in sticky_vars.keys()})
kconfig.update_env(env, kconfig_file.abspath, config_file.abspath)
# Do this after we save setting back, or else we'll tack on an
# extra 'qdo' every time we run scons.

View File

@@ -1,7 +1,22 @@
USE_ARM_ISA = True
USE_MIPS_ISA = True
USE_POWER_ISA = True
USE_RISCV_ISA = True
USE_SPARC_ISA = True
USE_X86_ISA = True
PROTOCOL = 'MESI_Two_Level'
RUBY=y
USE_MULTIPLE_PROTOCOLS=y
PROTOCOL="MULTIPLE"
RUBY_PROTOCOL_MOESI_AMD_Base=y
RUBY_PROTOCOL_MESI_Two_Level=y
RUBY_PROTOCOL_MESI_Three_Level=y
RUBY_PROTOCOL_MESI_Three_Level_HTM=y
RUBY_PROTOCOL_MI_example=y
RUBY_PROTOCOL_MOESI_CMP_directory=y
RUBY_PROTOCOL_MOESI_CMP_token=y
RUBY_PROTOCOL_MOESI_hammer=y
RUBY_PROTOCOL_Garnet_standalone=y
RUBY_PROTOCOL_CHI=y
RUBY_PROTOCOL_MSI=y
BUILD_ISA=y
USE_ARM_ISA=y
USE_MIPS_ISA=y
USE_POWER_ISA=y
USE_RISCV_ISA=y
USE_SPARC_ISA=y
USE_X86_ISA=y
USE_TEST_OBJECTS=y

View File

@@ -1,2 +1,5 @@
USE_ARM_ISA = True
PROTOCOL = 'CHI'
BUILD_ISA=y
USE_ARM_ISA=y
RUBY=y
PROTOCOL="CHI"
RUBY_PROTOCOL_CHI=y

View File

@@ -1,5 +1,5 @@
# Copyright (c) 2019 ARM Limited
# All rights reserved.
USE_ARM_ISA = True
PROTOCOL = 'MESI_Three_Level'
BUILD_ISA=y
USE_ARM_ISA=y
RUBY=y
PROTOCOL="MESI_Three_Level"
RUBY_PROTOCOL_MESI_Three_Level=y

View File

@@ -1,5 +1,5 @@
# Copyright (c) 2019 ARM Limited
# All rights reserved.
USE_ARM_ISA = True
PROTOCOL = 'MESI_Three_Level_HTM'
BUILD_ISA=y
USE_ARM_ISA=y
RUBY=y
PROTOCOL="MESI_Three_Level_HTM"
RUBY_PROTOCOL_MESI_Three_Level_HTM=y

View File

@@ -1,5 +1,5 @@
# Copyright (c) 2019 ARM Limited
# All rights reserved.
USE_ARM_ISA = True
PROTOCOL = 'MOESI_hammer'
BUILD_ISA=y
USE_ARM_ISA=y
RUBY=y
PROTOCOL="MOESI_hammer"
RUBY_PROTOCOL_MOESI_hammer=y

6
build_opts/ARM_X86 Normal file
View File

@@ -0,0 +1,6 @@
BUILD_ISA=y
USE_ARM_ISA=y
USE_X86_ISA=y
RUBY=y
PROTOCOL="MESI_Two_Level"
RUBY_PROTOCOL_MESI_Two_Level=y
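Reviewer note: the build_opts files move from Python-style assignments (`USE_ARM_ISA = True`) to Kconfig defconfig syntax (`USE_ARM_ISA=y`, `PROTOCOL="CHI"`, `NUMBER_BITS_PER_SET=128`). The sketch below reads that syntax into a dictionary to make the value types explicit. `parse_defconfig` is a hypothetical helper that handles only the three value forms visible in this diff; it is not how gem5 or Kconfiglib loads these files.

```python
def parse_defconfig(text):
    """Parse KEY=value lines into a dict.

    Handles only: y/n booleans, double-quoted strings, and integers.
    Hypothetical helper for illustration only.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=value line: {raw!r}")
        key, value = key.strip(), value.strip()
        if value == "y":
            values[key] = True
        elif value == "n":
            values[key] = False
        elif len(value) >= 2 and value[0] == value[-1] == '"':
            values[key] = value[1:-1]
        else:
            values[key] = int(value, 0)
    return values


if __name__ == "__main__":
    sample = """\
BUILD_ISA=y
USE_ARM_ISA=y
USE_X86_ISA=y
RUBY=y
PROTOCOL="MESI_Two_Level"
RUBY_PROTOCOL_MESI_Two_Level=y
"""
    print(parse_defconfig(sample))
```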

View File

@@ -1,4 +0,0 @@
PROTOCOL = 'GPU_VIPER'
USE_X86_ISA = True
TARGET_GPU_ISA = 'gcn3'
BUILD_GPU = True

View File

@@ -1,2 +1,3 @@
USE_NULL_ISA = True
PROTOCOL = 'Garnet_standalone'
RUBY=y
PROTOCOL="Garnet_standalone"
RUBY_PROTOCOL_Garnet_standalone=y

View File

@@ -1,2 +1,5 @@
USE_MIPS_ISA = True
PROTOCOL = 'MI_example'
RUBY=y
PROTOCOL="MI_example"
RUBY_PROTOCOL_MI_example=y
BUILD_ISA=y
USE_MIPS_ISA=y

View File

@@ -1,2 +1,3 @@
USE_NULL_ISA = True
PROTOCOL='MI_example'
RUBY=y
PROTOCOL="MI_example"
RUBY_PROTOCOL_MI_example=y

14
build_opts/NULL_ALL_RUBY Normal file
View File

@@ -0,0 +1,14 @@
RUBY=y
USE_MULTIPLE_PROTOCOLS=y
PROTOCOL="MULTIPLE"
RUBY_PROTOCOL_MOESI_AMD_Base=y
RUBY_PROTOCOL_MESI_Two_Level=y
RUBY_PROTOCOL_MESI_Three_Level=y
RUBY_PROTOCOL_MESI_Three_Level_HTM=y
RUBY_PROTOCOL_MI_example=y
RUBY_PROTOCOL_MOESI_CMP_directory=y
RUBY_PROTOCOL_MOESI_CMP_token=y
RUBY_PROTOCOL_MOESI_hammer=y
RUBY_PROTOCOL_Garnet_standalone=y
RUBY_PROTOCOL_CHI=y
RUBY_PROTOCOL_MSI=y

View File

@@ -1,2 +1,3 @@
USE_NULL_ISA = True
PROTOCOL = 'MESI_Two_Level'
RUBY=y
PROTOCOL="MESI_Two_Level"
RUBY_PROTOCOL_MESI_Two_Level=y

View File

@@ -1,2 +1,3 @@
USE_NULL_ISA = True
PROTOCOL='MOESI_CMP_directory'
RUBY=y
PROTOCOL="MOESI_CMP_directory"
RUBY_PROTOCOL_MOESI_CMP_directory=y

View File

@@ -1,2 +1,3 @@
USE_NULL_ISA = True
PROTOCOL='MOESI_CMP_token'
RUBY=y
PROTOCOL="MOESI_CMP_token"
RUBY_PROTOCOL_MOESI_CMP_token=y

View File

@@ -1,2 +1,3 @@
USE_NULL_ISA = True
PROTOCOL='MOESI_hammer'
RUBY=y
PROTOCOL="MOESI_hammer"
RUBY_PROTOCOL_MOESI_hammer=y

View File

@@ -1,2 +1,5 @@
USE_POWER_ISA = True
PROTOCOL = 'MI_example'
RUBY=y
PROTOCOL="MI_example"
RUBY_PROTOCOL_MI_example=y
BUILD_ISA=y
USE_POWER_ISA=y

View File

@@ -1,2 +1,5 @@
USE_RISCV_ISA = True
PROTOCOL = 'MI_example'
RUBY=y
PROTOCOL="MI_example"
RUBY_PROTOCOL_MI_example=y
BUILD_ISA=y
USE_RISCV_ISA=y

View File

@@ -1,2 +1,5 @@
USE_SPARC_ISA = True
PROTOCOL = 'MI_example'
RUBY=y
PROTOCOL="MI_example"
RUBY_PROTOCOL_MI_example=y
BUILD_ISA=y
USE_SPARC_ISA=y

View File

@@ -1,4 +1,8 @@
PROTOCOL = 'GPU_VIPER'
USE_X86_ISA = True
TARGET_GPU_ISA = 'vega'
BUILD_GPU = True
RUBY=y
NUMBER_BITS_PER_SET=128
PROTOCOL="GPU_VIPER"
RUBY_PROTOCOL_GPU_VIPER=y
BUILD_ISA=y
USE_X86_ISA=y
VEGA_GPU_ISA=y
BUILD_GPU=y

View File

@@ -1,3 +1,6 @@
USE_X86_ISA = True
PROTOCOL = 'MESI_Two_Level'
NUMBER_BITS_PER_SET = '128'
RUBY=y
NUMBER_BITS_PER_SET=128
PROTOCOL="MESI_Two_Level"
RUBY_PROTOCOL_MESI_Two_Level=y
BUILD_ISA=y
USE_X86_ISA=y

View File

@@ -1,3 +1,6 @@
USE_X86_ISA = True
PROTOCOL = 'MESI_Two_Level'
NUMBER_BITS_PER_SET = '128'
RUBY=y
NUMBER_BITS_PER_SET=128
PROTOCOL="MESI_Two_Level"
RUBY_PROTOCOL_MESI_Two_Level=y
BUILD_ISA=y
USE_X86_ISA=y

View File

@@ -1,2 +1,5 @@
USE_X86_ISA = True
PROTOCOL = 'MI_example'
RUBY=y
PROTOCOL="MI_example"
RUBY_PROTOCOL_MI_example=y
BUILD_ISA=y
USE_X86_ISA=y

View File

@@ -1,2 +1,5 @@
PROTOCOL = 'MOESI_AMD_Base'
USE_X86_ISA = True
RUBY=y
PROTOCOL="MOESI_AMD_Base"
RUBY_PROTOCOL_MOESI_AMD_Base=y
BUILD_ISA=y
USE_X86_ISA=y

View File

@@ -46,7 +46,7 @@ import os
import re
class lookup(object):
class lookup:
def __init__(self, formatter, frame, *args, **kwargs):
self.frame = frame
self.formatter = formatter
@@ -106,7 +106,7 @@ class code_formatter_meta(type):
"""
def __init__(cls, name, bases, dct):
super(code_formatter_meta, cls).__init__(name, bases, dct)
super().__init__(name, bases, dct)
if "pattern" in dct:
pat = cls.pattern
else:
@@ -125,7 +125,7 @@ class code_formatter_meta(type):
cls.pattern = re.compile(pat, re.VERBOSE | re.DOTALL | re.MULTILINE)
class code_formatter(object, metaclass=code_formatter_meta):
class code_formatter(metaclass=code_formatter_meta):
delim = r"$"
ident = r"[_A-z]\w*"
pos = r"[0-9]+"
@@ -272,7 +272,7 @@ class code_formatter(object, metaclass=code_formatter_meta):
# check for a lone identifier
if ident:
indent = match.group("indent") # must be spaces
lone = "%s" % (l[ident],)
lone = f"{l[ident]}"
def indent_lines(gen):
for line in gen:
@@ -284,7 +284,7 @@ class code_formatter(object, metaclass=code_formatter_meta):
# check for an identifier, braced or not
ident = match.group("ident") or match.group("b_ident")
if ident is not None:
return "%s" % (l[ident],)
return f"{l[ident]}"
# check for a positional parameter, braced or not
pos = match.group("pos") or match.group("b_pos")
@@ -295,13 +295,13 @@ class code_formatter(object, metaclass=code_formatter_meta):
"Positional parameter #%d not found in pattern" % pos,
code_formatter.pattern,
)
return "%s" % (args[int(pos)],)
return f"{args[int(pos)]}"
# check for a double braced expression
eval_expr = match.group("eval")
if eval_expr is not None:
result = eval(eval_expr, {}, l)
return "%s" % (result,)
return f"{result}"
# check for an escaped delimiter
if match.group("escaped") is not None:

View File

@@ -3,6 +3,7 @@
# Copyright 2013 Mark D. Hill and David A. Wood
# Copyright 2017-2020 ARM Limited
# Copyright 2021 Google, Inc.
# Copyright 2023 COSEDA Technologies GmbH
#
# The license below extends only to copyright in the software and shall
# not be construed as granting a license to any other intellectual
@@ -42,7 +43,6 @@ import os.path
import sys
import importer
from code_formatter import code_formatter
parser = argparse.ArgumentParser()
@@ -58,8 +58,8 @@ importer.install()
module = importlib.import_module(args.modpath)
sim_object = getattr(module, sim_object_name)
from m5.params import isSimObjectClass
import m5.params
from m5.params import isSimObjectClass
code = code_formatter()
@@ -104,7 +104,7 @@ for param in sim_object._params.values():
for port in sim_object._ports.values():
is_vector = isinstance(port, m5.params.VectorPort)
is_requestor = port.role == "GEM5 REQUESTOR"
is_requestor = port.is_source
code(
'ports["%s"] = new PortDesc("%s", %s, %s);'

View File

@@ -42,7 +42,6 @@ import os.path
import sys
import importer
from code_formatter import code_formatter
parser = argparse.ArgumentParser()

View File

@@ -100,13 +100,16 @@ if components:
inline union ${{args.name}}
{
~${{args.name}}() {}
CompoundFlag ${{args.name}} = {
"${{args.name}}", "${{args.desc}}", {
CompoundFlag flag${{args.name}};
${{args.name}}() : flag${{args.name}}("${{args.name}}", "${{args.desc}}",
{
${{",\\n ".join(
f"(Flag *)&::gem5::debug::{flag}" for flag in components)}}
}
};
} ${{args.name}};
}) {}
} instance${{args.name}};
"""
)
else:
@@ -115,10 +118,11 @@ else:
inline union ${{args.name}}
{
~${{args.name}}() {}
SimpleFlag ${{args.name}} = {
"${{args.name}}", "${{args.desc}}", ${{"true" if fmt else "false"}}
};
} ${{args.name}};
SimpleFlag flag${{args.name}};
${{args.name}}() : flag${{args.name}}("${{args.name}}", "${{args.desc}}", ${{"true" if fmt else "false"}}) {}
} instance${{args.name}};
"""
)
@@ -127,7 +131,7 @@ code(
} // namespace unions
inline constexpr const auto& ${{args.name}} =
::gem5::debug::unions::${{args.name}}.${{args.name}};
::gem5::debug::unions::instance${{args.name}}.flag${{args.name}};
} // namespace debug
} // namespace gem5

View File

@@ -42,7 +42,6 @@ import os.path
import sys
import importer
from code_formatter import code_formatter
parser = argparse.ArgumentParser()
@@ -118,7 +117,6 @@ code("} // namespace gem5")
if use_python:
name = enum.__name__
enum_name = enum.__name__ if enum.enum_name is None else enum.enum_name
wrapper_name = enum_name if enum.is_class else enum.wrapper_name

View File

@@ -42,7 +42,6 @@ import os.path
import sys
import importer
from code_formatter import code_formatter
parser = argparse.ArgumentParser()
@@ -66,7 +65,7 @@ code = code_formatter()
wrapper_name = enum.wrapper_name
wrapper = "struct" if enum.wrapper_is_struct else "namespace"
name = enum.__name__ if enum.enum_name is None else enum.enum_name
idem_macro = "__ENUM__%s__%s__" % (wrapper_name, name)
idem_macro = f"__ENUM__{wrapper_name}__{name}__"
code(
"""\

View File

@@ -36,7 +36,7 @@ class ParseError(Exception):
self.token = token
class Grammar(object):
class Grammar:
def setupLexerFactory(self, **kwargs):
if "module" in kwargs:
raise AttributeError("module is an illegal attribute")
@@ -92,7 +92,7 @@ class Grammar(object):
return self.current_lexer.lineno
raise AttributeError(
"'%s' object has no attribute '%s'" % (type(self), attr)
f"'{type(self)}' object has no attribute '{attr}'"
)
def parse_string(self, data, source="<string>", debug=None, tracking=0):
@@ -118,7 +118,7 @@ class Grammar(object):
def parse_file(self, f, **kwargs):
if isinstance(f, str):
source = f
f = open(f, "r")
f = open(f)
elif isinstance(f, file):
source = f.name
else:
@@ -137,7 +137,7 @@ class Grammar(object):
t.value,
)
else:
msg = "Syntax error at end of %s" % (self.current_source,)
msg = f"Syntax error at end of {self.current_source}"
raise ParseError(msg, t)
def t_error(self, t):

View File

@@ -56,7 +56,7 @@ for source in args.files:
# `README.md = "..."` which is not valid as `md` is not a property of
# `README`.
src = os.path.basename(source).replace(".", "_")
with open(source, "r") as f:
with open(source) as f:
data = "".join(f)
code("${src} = ${{repr(data)}}")

55
build_tools/kconfig_base.py Executable file
View File

@@ -0,0 +1,55 @@
#! /usr/bin/env python3
#
# Copyright 2022 Google LLC
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import argparse
from code_formatter import code_formatter
parser = argparse.ArgumentParser()
parser.add_argument("output", help="path of generated base Kconfig file")
parser.add_argument("main", help="relative path to the main gem5 Kconfig file")
parser.add_argument("extras_dirs", nargs="*", help="EXTRAS paths")
args = parser.parse_args()
code = code_formatter()
code(
f"""# Automatically generated base Kconfig file, DO NOT EDIT!
source "{args.main}"
"""
)
for extras_dir in args.extras_dirs:
code(
f"""
osource "{extras_dir}/Kconfig"
"""
)
code.write(args.output)
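
The new script above writes a base Kconfig that sources the main file and then optionally sources one `Kconfig` per EXTRAS directory. A minimal stand-alone sketch of the same text assembly follows. It uses plain string joining instead of `code_formatter`, so exact whitespace may differ from the real output, and `render_base_kconfig` is a hypothetical helper name that does not appear in the commit.

```python
from typing import Iterable


def render_base_kconfig(main: str, extras_dirs: Iterable[str]) -> str:
    """Return the text of a base Kconfig file (hypothetical helper)."""
    lines = [
        "# Automatically generated base Kconfig file, DO NOT EDIT!",
        f'source "{main}"',
    ]
    for extras_dir in extras_dirs:
        # One optional-source line per EXTRAS directory, as in the loop above.
        lines.append("")
        lines.append(f'osource "{extras_dir}/Kconfig"')
    return "\n".join(lines) + "\n"


text = render_base_kconfig("src/Kconfig", ["/opt/extras_a", "/opt/extras_b"])
```

The paths passed in the usage line are made up for illustration.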


@@ -74,7 +74,7 @@ if "LC_CTYPE" in os.environ:
_, cpp, python, modpath, abspath = sys.argv
with open(python, "r") as f:
with open(python) as f:
src = f.read()
compiled = compile(src, python, "exec")


@@ -42,7 +42,6 @@ import os.path
import sys
import importer
from code_formatter import code_formatter
parser = argparse.ArgumentParser()
@@ -88,7 +87,6 @@ ports = sim_object._ports.local
# only include pybind if python is enabled in the build
if use_python:
code(
"""#include "pybind11/pybind11.h"
#include "pybind11/stl.h"


@@ -42,7 +42,6 @@ import os.path
import sys
import importer
from code_formatter import code_formatter
parser = argparse.ArgumentParser()
@@ -81,7 +80,7 @@ except:
warned_about_nested_templates = False
class CxxClass(object):
class CxxClass:
def __init__(self, sig, template_params=[]):
# Split the signature into its constituent parts. This could
# potentially be done with regular expressions, but
@@ -212,8 +211,7 @@ code.indent()
if sim_object == SimObject:
code(
"""
SimObjectParams() {}
virtual ~SimObjectParams() {}
virtual ~SimObjectParams() = default;
std::string name;
"""


@@ -24,8 +24,14 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
from common.SysPaths import script, disk, binary
from os import environ as env
from common.SysPaths import (
binary,
disk,
script,
)
from m5.defines import buildEnv
@@ -49,7 +55,7 @@ class SysConfig:
if self.memsize:
return self.memsize
else:
return "128MB"
return "128MiB"
def disks(self):
if self.disknames:
@@ -71,8 +77,8 @@ class SysConfig:
# The first defined machine is the test system, the others are driving systems
Benchmarks = {
"PovrayBench": [SysConfig("povray-bench.rcS", "512MB", ["povray.img"])],
"PovrayAutumn": [SysConfig("povray-autumn.rcS", "512MB", ["povray.img"])],
"PovrayBench": [SysConfig("povray-bench.rcS", "512MiB", ["povray.img"])],
"PovrayAutumn": [SysConfig("povray-autumn.rcS", "512MiB", ["povray.img"])],
"NetperfStream": [
SysConfig("netperf-stream-client.rcS"),
SysConfig("netperf-server.rcS"),
@@ -91,55 +97,55 @@ Benchmarks = {
SysConfig("netperf-server.rcS"),
],
"SurgeStandard": [
SysConfig("surge-server.rcS", "512MB"),
SysConfig("surge-client.rcS", "256MB"),
SysConfig("surge-server.rcS", "512MiB"),
SysConfig("surge-client.rcS", "256MiB"),
],
"SurgeSpecweb": [
SysConfig("spec-surge-server.rcS", "512MB"),
SysConfig("spec-surge-client.rcS", "256MB"),
SysConfig("spec-surge-server.rcS", "512MiB"),
SysConfig("spec-surge-client.rcS", "256MiB"),
],
"Nhfsstone": [
SysConfig("nfs-server-nhfsstone.rcS", "512MB"),
SysConfig("nfs-server-nhfsstone.rcS", "512MiB"),
SysConfig("nfs-client-nhfsstone.rcS"),
],
"Nfs": [
SysConfig("nfs-server.rcS", "900MB"),
SysConfig("nfs-server.rcS", "900MiB"),
SysConfig("nfs-client-dbench.rcS"),
],
"NfsTcp": [
SysConfig("nfs-server.rcS", "900MB"),
SysConfig("nfs-server.rcS", "900MiB"),
SysConfig("nfs-client-tcp.rcS"),
],
"IScsiInitiator": [
SysConfig("iscsi-client.rcS", "512MB"),
SysConfig("iscsi-server.rcS", "512MB"),
SysConfig("iscsi-client.rcS", "512MiB"),
SysConfig("iscsi-server.rcS", "512MiB"),
],
"IScsiTarget": [
SysConfig("iscsi-server.rcS", "512MB"),
SysConfig("iscsi-client.rcS", "512MB"),
SysConfig("iscsi-server.rcS", "512MiB"),
SysConfig("iscsi-client.rcS", "512MiB"),
],
"Validation": [
SysConfig("iscsi-server.rcS", "512MB"),
SysConfig("iscsi-client.rcS", "512MB"),
SysConfig("iscsi-server.rcS", "512MiB"),
SysConfig("iscsi-client.rcS", "512MiB"),
],
"Ping": [SysConfig("ping-server.rcS"), SysConfig("ping-client.rcS")],
"ValAccDelay": [SysConfig("devtime.rcS", "512MB")],
"ValAccDelay2": [SysConfig("devtimewmr.rcS", "512MB")],
"ValMemLat": [SysConfig("micro_memlat.rcS", "512MB")],
"ValMemLat2MB": [SysConfig("micro_memlat2mb.rcS", "512MB")],
"ValMemLat8MB": [SysConfig("micro_memlat8mb.rcS", "512MB")],
"ValMemLat": [SysConfig("micro_memlat8.rcS", "512MB")],
"ValTlbLat": [SysConfig("micro_tlblat.rcS", "512MB")],
"ValSysLat": [SysConfig("micro_syscall.rcS", "512MB")],
"ValCtxLat": [SysConfig("micro_ctx.rcS", "512MB")],
"ValStream": [SysConfig("micro_stream.rcS", "512MB")],
"ValStreamScale": [SysConfig("micro_streamscale.rcS", "512MB")],
"ValStreamCopy": [SysConfig("micro_streamcopy.rcS", "512MB")],
"MutexTest": [SysConfig("mutex-test.rcS", "128MB")],
"ValAccDelay": [SysConfig("devtime.rcS", "512MiB")],
"ValAccDelay2": [SysConfig("devtimewmr.rcS", "512MiB")],
"ValMemLat": [SysConfig("micro_memlat.rcS", "512MiB")],
"ValMemLat2MB": [SysConfig("micro_memlat2mb.rcS", "512MiB")],
"ValMemLat8MB": [SysConfig("micro_memlat8mb.rcS", "512MiB")],
"ValMemLat": [SysConfig("micro_memlat8.rcS", "512MiB")],
"ValTlbLat": [SysConfig("micro_tlblat.rcS", "512MiB")],
"ValSysLat": [SysConfig("micro_syscall.rcS", "512MiB")],
"ValCtxLat": [SysConfig("micro_ctx.rcS", "512MiB")],
"ValStream": [SysConfig("micro_stream.rcS", "512MiB")],
"ValStreamScale": [SysConfig("micro_streamscale.rcS", "512MiB")],
"ValStreamCopy": [SysConfig("micro_streamcopy.rcS", "512MiB")],
"MutexTest": [SysConfig("mutex-test.rcS", "128MiB")],
"ArmAndroid-GB": [
SysConfig(
"null.rcS",
"256MB",
"256MiB",
["ARMv7a-Gingerbread-Android.SMP.mouse.nolock.clean.img"],
None,
"android-gingerbread",
@@ -148,7 +154,7 @@ Benchmarks = {
"bbench-gb": [
SysConfig(
"bbench-gb.rcS",
"256MB",
"256MiB",
["ARMv7a-Gingerbread-Android.SMP.mouse.nolock.img"],
None,
"android-gingerbread",
@@ -157,7 +163,7 @@ Benchmarks = {
"ArmAndroid-ICS": [
SysConfig(
"null.rcS",
"256MB",
"256MiB",
["ARMv7a-ICS-Android.SMP.nolock.clean.img"],
None,
"android-ics",
@@ -166,7 +172,7 @@ Benchmarks = {
"bbench-ics": [
SysConfig(
"bbench-ics.rcS",
"256MB",
"256MiB",
["ARMv7a-ICS-Android.SMP.nolock.img"],
None,
"android-ics",
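
The hunks above replace SI-style suffixes (`MB`, `kB`) with IEC binary suffixes (`MiB`, `KiB`). The sketch below shows how far the two readings drift apart for the same number. `to_bytes` is a hypothetical helper written for this note. It is not gem5's `m5.util.convert`, and it makes no claim about how gem5 itself parses the old suffixes.

```python
# Hypothetical, stand-alone helper; not gem5's m5.util.convert.
BINARY = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30}
DECIMAL = {"kB": 10**3, "MB": 10**6, "GB": 10**9}


def to_bytes(text: str, units: dict) -> int:
    """Convert a string such as '512MiB' to bytes using the given unit table."""
    for suffix, scale in units.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * scale
    raise ValueError(f"unrecognised size: {text!r}")


binary_size = to_bytes("512MiB", BINARY)    # 536870912 bytes
decimal_size = to_bytes("512MB", DECIMAL)   # 512000000 bytes
difference = binary_size - decimal_size     # 24870912 bytes
```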


@@ -40,13 +40,13 @@
# Configure the M5 cache hierarchy config in one place
#
from common import ObjectList
from common.Caches import *
import m5
from m5.objects import *
from gem5.isas import ISA
from gem5.runtime import get_runtime_isa
from common.Caches import *
from common import ObjectList
from gem5.isas import ISA
def _get_hwp(hwp_option):
@@ -117,9 +117,6 @@ def config_cache(options, system):
None,
)
if get_runtime_isa() in [ISA.X86, ISA.RISCV]:
walk_cache_class = PageTableWalkerCache
# Set the cache line size of the system
system.cache_line_size = options.cacheline_size
@@ -150,11 +147,13 @@ def config_cache(options, system):
icache = icache_class(**_get_cache_opts("l1i", options))
dcache = dcache_class(**_get_cache_opts("l1d", options))
# If we have a walker cache specified, instantiate two
# instances here
if walk_cache_class:
iwalkcache = walk_cache_class()
dwalkcache = walk_cache_class()
# If we are using ISA.X86 or ISA.RISCV, we set walker caches.
if ObjectList.cpu_list.get_isa(options.cpu_type) in [
ISA.RISCV,
ISA.X86,
]:
iwalkcache = PageTableWalkerCache()
dwalkcache = PageTableWalkerCache()
else:
iwalkcache = None
dwalkcache = None
@@ -192,7 +191,11 @@ def config_cache(options, system):
# on these names. For simplicity, we would advise configuring
# it to use this naming scheme; if this isn't possible, change
# the names below.
if get_runtime_isa() in [ISA.X86, ISA.ARM, ISA.RISCV]:
if ObjectList.cpu_list.get_isa(options.cpu_type) in [
ISA.X86,
ISA.ARM,
ISA.RISCV,
]:
system.cpu[i].addPrivateSplitL1Caches(
ExternalCache("cpu%d.icache" % i),
ExternalCache("cpu%d.dcache" % i),


@@ -39,8 +39,8 @@
from m5.defines import buildEnv
from m5.objects import *
from gem5.isas import ISA
from gem5.runtime import get_runtime_isa
# Base implementations of L1, L2, IO and TLB-walker caches. These are
# used in the regressions and also as base components in the
@@ -84,7 +84,7 @@ class IOCache(Cache):
data_latency = 50
response_latency = 50
mshrs = 20
size = "1kB"
size = "1KiB"
tgts_per_mshr = 12
@@ -94,13 +94,6 @@ class PageTableWalkerCache(Cache):
data_latency = 2
response_latency = 2
mshrs = 10
size = "1kB"
size = "1KiB"
tgts_per_mshr = 12
# the x86 table walker actually writes to the table-walker cache
if get_runtime_isa() in [ISA.X86, ISA.RISCV]:
is_read_only = False
else:
is_read_only = True
# Writeback clean lines as well
writeback_clean = True
is_read_only = False


@@ -33,8 +33,19 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
from m5 import fatal
import m5.objects
from m5 import fatal
from gem5.isas import ISA
isa_string_map = {
ISA.X86: "X86",
ISA.ARM: "Arm",
ISA.RISCV: "Riscv",
ISA.SPARC: "Sparc",
ISA.POWER: "Power",
ISA.MIPS: "Mips",
}
def config_etrace(cpu_cls, cpu_list, options):


@@ -38,12 +38,13 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
from common import ObjectList
from common.Benchmarks import *
import m5
import m5.defines
from m5.objects import *
from m5.util import *
from common.Benchmarks import *
from common import ObjectList
# Populate to reflect supported os types per target ISA
os_types = set()
@@ -136,8 +137,8 @@ def makeSparcSystem(mem_mode, mdesc=None, cmdline=None):
self.t1000.attachOnChipIO(self.membus)
self.t1000.attachIO(self.iobus)
self.mem_ranges = [
AddrRange(Addr("1MB"), size="64MB"),
AddrRange(Addr("2GB"), size="256MB"),
AddrRange(Addr("1MiB"), size="64MiB"),
AddrRange(Addr("2GiB"), size="256MiB"),
]
self.bridge.mem_side_port = self.iobus.cpu_side_ports
self.bridge.cpu_side_port = self.membus.mem_side_ports
@@ -173,21 +174,21 @@ def makeSparcSystem(mem_mode, mdesc=None, cmdline=None):
# ROM for OBP/Reset/Hypervisor
self.rom = SimpleMemory(
image_file=binary("t1000_rom.bin"),
range=AddrRange(0xFFF0000000, size="8MB"),
range=AddrRange(0xFFF0000000, size="8MiB"),
)
# nvram
self.nvram = SimpleMemory(
image_file=binary("nvram1"), range=AddrRange(0x1F11000000, size="8kB")
image_file=binary("nvram1"), range=AddrRange(0x1F11000000, size="8KiB")
)
# hypervisor description
self.hypervisor_desc = SimpleMemory(
image_file=binary("1up-hv.bin"),
range=AddrRange(0x1F12080000, size="8kB"),
range=AddrRange(0x1F12080000, size="8KiB"),
)
# partition description
self.partition_desc = SimpleMemory(
image_file=binary("1up-md.bin"),
range=AddrRange(0x1F12000000, size="8kB"),
range=AddrRange(0x1F12000000, size="8KiB"),
)
self.rom.port = self.membus.mem_side_ports
@@ -422,7 +423,7 @@ def makeLinuxMipsSystem(mem_mode, mdesc=None, cmdline=None):
self.iobus = IOXBar()
self.membus = MemBus()
self.bridge = Bridge(delay="50ns")
self.mem_ranges = [AddrRange("1GB")]
self.mem_ranges = [AddrRange("1GiB")]
self.bridge.mem_side_port = self.iobus.cpu_side_ports
self.bridge.cpu_side_port = self.membus.mem_side_ports
self.disks = makeCowDisks(mdesc.disks())
@@ -468,7 +469,7 @@ def connectX86ClassicSystem(x86_sys, numCPUs):
x86_sys.bridge.cpu_side_port = x86_sys.membus.mem_side_ports
# Allow the bridge to pass through:
# 1) kernel configured PCI device memory map address: address range
# [0xC0000000, 0xFFFF0000). (The upper 64kB are reserved for m5ops.)
# [0xC0000000, 0xFFFF0000). (The upper 64KiB are reserved for m5ops.)
# 2) the bridge to pass through the IO APIC (two pages, already contained in 1),
# 3) everything in the IO address range up to the local APIC, and
# 4) then the entire PCI address space and beyond.
@@ -525,22 +526,22 @@ def makeX86System(mem_mode, numCPUs=1, mdesc=None, workload=None, Ruby=False):
# Physical memory
# On the PC platform, the memory region 0xC0000000-0xFFFFFFFF is reserved
# for various devices. Hence, if the physical memory size is greater than
# 3GB, we need to split it into two parts.
# 3GiB, we need to split it into two parts.
excess_mem_size = convert.toMemorySize(mdesc.mem()) - convert.toMemorySize(
"3GB"
"3GiB"
)
if excess_mem_size <= 0:
self.mem_ranges = [AddrRange(mdesc.mem())]
else:
warn(
"Physical memory size specified is %s which is greater than "
"3GB. Twice the number of memory controllers would be "
"3GiB. Twice the number of memory controllers would be "
"created." % (mdesc.mem())
)
self.mem_ranges = [
AddrRange("3GB"),
AddrRange(Addr("4GB"), size=excess_mem_size),
AddrRange("3GiB"),
AddrRange(Addr("4GiB"), size=excess_mem_size),
]
# Platform
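
The split described in this hunk can be sketched without gem5 objects. The helper below returns plain `(start, size)` pairs rather than `AddrRange` instances; `split_x86_memory` is a hypothetical name used only for illustration.

```python
GiB = 2**30


def split_x86_memory(mem_size: int):
    """Return (start, size) pairs for physical memory; hypothetical helper."""
    excess = mem_size - 3 * GiB
    if excess <= 0:
        # Everything fits below the device hole.
        return [(0, mem_size)]
    # 0xC0000000-0xFFFFFFFF is reserved for devices, so the remainder
    # resumes at 4 GiB.
    return [(0, 3 * GiB), (4 * GiB, excess)]


small = split_x86_memory(2 * GiB)
large = split_x86_memory(8 * GiB)
```

With 8 GiB requested, the second range starts at 4 GiB and is 5 GiB long, which is why the warning above mentions twice as many memory controllers.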
@@ -662,16 +663,16 @@ def makeLinuxX86System(
# Build up the x86 system and then specialize it for Linux
self = makeX86System(mem_mode, numCPUs, mdesc, X86FsLinux(), Ruby)
# We assume below that there's at least 1MB of memory. We'll require 2
# We assume below that there's at least 1MiB of memory. We'll require 2
# just to avoid corner cases.
phys_mem_size = sum([r.size() for r in self.mem_ranges])
assert phys_mem_size >= 0x200000
assert len(self.mem_ranges) <= 2
entries = [
# Mark the first megabyte of memory as reserved
X86E820Entry(addr=0, size="639kB", range_type=1),
X86E820Entry(addr=0x9FC00, size="385kB", range_type=2),
# Mark the first mebibyte of memory as reserved
X86E820Entry(addr=0, size="639KiB", range_type=1),
X86E820Entry(addr=0x9FC00, size="385KiB", range_type=2),
# Mark the rest of physical memory as available
X86E820Entry(
addr=0x100000,
@@ -680,7 +681,7 @@ def makeLinuxX86System(
),
]
# Mark [mem_size, 3GB) as reserved if memory less than 3GB, which force
# Mark [mem_size, 3GiB) as reserved if memory less than 3GiB, which force
# IO devices to be mapped to [0xC0000000, 0xFFFF0000). Requests to this
# specific range can pass though bridge to iobus.
if len(self.mem_ranges) == 1:
@@ -692,10 +693,10 @@ def makeLinuxX86System(
)
)
# Reserve the last 16kB of the 32-bit address space for the m5op interface
entries.append(X86E820Entry(addr=0xFFFF0000, size="64kB", range_type=2))
# Reserve the last 64KiB of the 32-bit address space for the m5op interface
entries.append(X86E820Entry(addr=0xFFFF0000, size="64KiB", range_type=2))
# In case the physical memory is greater than 3GB, we split it into two
# In case the physical memory is greater than 3GiB, we split it into two
# parts and add a separate e820 entry for the second part. This entry
# starts at 0x100000000, which is the first address after the space
# reserved for devices.
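
As a quick check on the e820 entries in this file, the arithmetic below confirms that the sizes and start addresses line up. Only constants quoted in the hunks are used, and the variable names are chosen for this note.

```python
KiB = 1024

low_usable_size = 639 * KiB       # first entry, starts at address 0
low_reserved_start = 0x9FC00      # second entry start address
low_reserved_size = 385 * KiB     # second entry size
m5op_start = 0xFFFF0000           # m5op window start address
m5op_size = 64 * KiB              # the size argument passed for that entry

# The usable low region ends exactly where the reserved region begins.
assert low_usable_size == low_reserved_start
# Together the two entries cover the first 1 MiB, where the next entry starts.
assert low_reserved_start + low_reserved_size == 0x100000
# The m5op window runs to the very top of the 32-bit address space.
assert m5op_start + m5op_size == 2**32
```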


@@ -36,18 +36,31 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import getpass
import operator
import os
import platform
from functools import reduce
from os import (
access,
getpid,
listdir,
makedirs,
mkdir,
stat,
)
from os.path import isdir
from os.path import join as joinpath
from pwd import getpwuid
from shutil import (
copyfile,
rmtree,
)
import m5
from m5.objects import *
from m5.util.convert import *
from functools import reduce
import operator, os, platform, getpass
from os import mkdir, makedirs, getpid, listdir, stat, access
from pwd import getpwuid
from os.path import join as joinpath
from os.path import isdir
from shutil import rmtree, copyfile
def hex_mask(terms):
dec_mask = reduce(operator.or_, [2**i for i in terms], 0)
@@ -189,7 +202,7 @@ def register_node(cpu_list, mem, node_number):
file_append((nodedir, "cpumap"), hex_mask(cpu_list))
file_append(
(nodedir, "meminfo"),
"Node %d MemTotal: %dkB"
"Node %d MemTotal: %dKiB"
% (node_number, toMemorySize(str(mem)) / kibi),
)
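
The `hex_mask` context line in this file builds a CPU bitmask by OR-ing one bit per listed index. A minimal sketch of that step alone follows. The rest of the function is not shown in the diff, so the final hex formatting here is an assumption, and `bit_mask` is a hypothetical name.

```python
import operator
from functools import reduce


def bit_mask(terms):
    """OR together one bit per index in terms (hypothetical helper)."""
    return reduce(operator.or_, [2**i for i in terms], 0)


mask = bit_mask([0, 1, 3])        # bits 0, 1 and 3 set
as_hex = format(mask, "x")        # assumed presentation, not taken from the source
```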


@@ -36,7 +36,6 @@ from m5.objects import *
def TLB_constructor(options, level, gpu_ctrl=None, full_system=False):
if full_system:
constructor_call = (
"VegaGPUTLB(\
@@ -71,7 +70,6 @@ def TLB_constructor(options, level, gpu_ctrl=None, full_system=False):
def Coalescer_constructor(options, level, full_system):
if full_system:
constructor_call = (
"VegaTLBCoalescer(probesPerCycle = \


@@ -29,7 +29,6 @@
def tlb_options(parser):
# ===================================================================
# TLB Configuration
# ===================================================================
@@ -45,8 +44,8 @@ def tlb_options(parser):
# L1 TLB Options (D-TLB, I-TLB, Dispatcher-TLB)
# ===================================================================
parser.add_argument("--L1TLBentries", type=int, default="32")
parser.add_argument("--L1TLBassoc", type=int, default="32")
parser.add_argument("--L1TLBentries", type=int, default="64")
parser.add_argument("--L1TLBassoc", type=int, default="64")
parser.add_argument(
"--L1AccessLatency",
type=int,
@@ -69,7 +68,7 @@ def tlb_options(parser):
# ===================================================================
parser.add_argument("--L2TLBentries", type=int, default="4096")
parser.add_argument("--L2TLBassoc", type=int, default="32")
parser.add_argument("--L2TLBassoc", type=int, default="64")
parser.add_argument(
"--L2AccessLatency",
type=int,
@@ -91,7 +90,7 @@ def tlb_options(parser):
# ===================================================================
parser.add_argument("--L3TLBentries", type=int, default="8192")
parser.add_argument("--L3TLBassoc", type=int, default="32")
parser.add_argument("--L3TLBassoc", type=int, default="64")
parser.add_argument(
"--L3AccessLatency",
type=int,


@@ -300,10 +300,10 @@ def add_options(parser):
# address range for each of the serial links
parser.add_argument(
"--serial-link-addr-range",
default="1GB",
default="1GiB",
type=str,
help="memory range for each of the serial links.\
Default: 1GB",
Default: 1GiB",
)
# *****************************PERFORMANCE MONITORING*********************
@@ -390,10 +390,10 @@ def add_options(parser):
# HMC device - vault capacity or size
parser.add_argument(
"--hmc-dev-vault-size",
default="256MB",
default="256MiB",
type=str,
help="vault storage capacity in bytes. Default:\
256MB",
256MiB",
)
parser.add_argument(
"--mem-type",
@@ -430,7 +430,6 @@ def add_options(parser):
# configure HMC host controller
def config_hmc_host_ctrl(opt, system):
# create HMC host controller
system.hmc_host = SubSystem()
@@ -533,7 +532,6 @@ def config_hmc_host_ctrl(opt, system):
# Create an HMC device
def config_hmc_dev(opt, system, hmc_host):
# create HMC device
system.hmc_dev = SubSystem()
@@ -570,9 +568,9 @@ def config_hmc_dev(opt, system, hmc_host):
# Attach 4 serial link to 4 crossbar/s
for i in range(opt.num_serial_links):
if opt.enable_link_monitor:
system.hmc_host.seriallink[
i
].mem_side_port = system.hmc_dev.lmonitor[i].cpu_side_port
system.hmc_host.seriallink[i].mem_side_port = (
system.hmc_dev.lmonitor[i].cpu_side_port
)
system.hmc_dev.lmonitor[i].mem_side_port = system.hmc_dev.xbar[
i
].cpu_side_ports
@@ -615,14 +613,12 @@ def config_hmc_dev(opt, system, hmc_host):
]
# Connect the bridge between crossbars
system.hmc_dev.xbar[
i
].mem_side_ports = system.hmc_dev.buffers[
index
].cpu_side_port
system.hmc_dev.buffers[
index
].mem_side_port = system.hmc_dev.xbar[j].cpu_side_ports
system.hmc_dev.xbar[i].mem_side_ports = (
system.hmc_dev.buffers[index].cpu_side_port
)
system.hmc_dev.buffers[index].mem_side_port = (
system.hmc_dev.xbar[j].cpu_side_ports
)
else:
# Don't connect the xbar to itself
pass
@@ -631,49 +627,49 @@ def config_hmc_dev(opt, system, hmc_host):
# can only direct traffic to its local vaults
if opt.arch == "mixed":
system.hmc_dev.buffer30 = Bridge(ranges=system.mem_ranges[0:4])
system.hmc_dev.xbar[
3
].mem_side_ports = system.hmc_dev.buffer30.cpu_side_port
system.hmc_dev.xbar[3].mem_side_ports = (
system.hmc_dev.buffer30.cpu_side_port
)
system.hmc_dev.buffer30.mem_side_port = system.hmc_dev.xbar[
0
].cpu_side_ports
system.hmc_dev.buffer31 = Bridge(ranges=system.mem_ranges[4:8])
system.hmc_dev.xbar[
3
].mem_side_ports = system.hmc_dev.buffer31.cpu_side_port
system.hmc_dev.xbar[3].mem_side_ports = (
system.hmc_dev.buffer31.cpu_side_port
)
system.hmc_dev.buffer31.mem_side_port = system.hmc_dev.xbar[
1
].cpu_side_ports
system.hmc_dev.buffer32 = Bridge(ranges=system.mem_ranges[8:12])
system.hmc_dev.xbar[
3
].mem_side_ports = system.hmc_dev.buffer32.cpu_side_port
system.hmc_dev.xbar[3].mem_side_ports = (
system.hmc_dev.buffer32.cpu_side_port
)
system.hmc_dev.buffer32.mem_side_port = system.hmc_dev.xbar[
2
].cpu_side_ports
system.hmc_dev.buffer20 = Bridge(ranges=system.mem_ranges[0:4])
system.hmc_dev.xbar[
2
].mem_side_ports = system.hmc_dev.buffer20.cpu_side_port
system.hmc_dev.xbar[2].mem_side_ports = (
system.hmc_dev.buffer20.cpu_side_port
)
system.hmc_dev.buffer20.mem_side_port = system.hmc_dev.xbar[
0
].cpu_side_ports
system.hmc_dev.buffer21 = Bridge(ranges=system.mem_ranges[4:8])
system.hmc_dev.xbar[
2
].mem_side_ports = system.hmc_dev.buffer21.cpu_side_port
system.hmc_dev.xbar[2].mem_side_ports = (
system.hmc_dev.buffer21.cpu_side_port
)
system.hmc_dev.buffer21.mem_side_port = system.hmc_dev.xbar[
1
].cpu_side_ports
system.hmc_dev.buffer23 = Bridge(ranges=system.mem_ranges[12:16])
system.hmc_dev.xbar[
2
].mem_side_ports = system.hmc_dev.buffer23.cpu_side_port
system.hmc_dev.xbar[2].mem_side_ports = (
system.hmc_dev.buffer23.cpu_side_port
)
system.hmc_dev.buffer23.mem_side_port = system.hmc_dev.xbar[
3
].cpu_side_ports


@@ -33,14 +33,17 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
from common import (
HMC,
ObjectList,
)
import m5.objects
from common import ObjectList
from common import HMC
def create_mem_intf(intf, r, i, intlv_bits, intlv_size, xor_low_bit):
"""
Helper function for creating a single memoy controller from the given
Helper function for creating a single memory controller from the given
options. This function is invoked multiple times in config_mem function
to create an array of controllers.
"""
@@ -174,6 +177,7 @@ def config_mem(options, system):
nbr_mem_ctrls = opt_mem_channels
import math
from m5.util import fatal
intlv_bits = int(math.log(nbr_mem_ctrls, 2))
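
The context line above derives the number of interleave bits from the controller count using a floating-point logarithm. An integer-only alternative is sketched below. It is not what the source does, and the power-of-two check is an assumption about what the surrounding code requires; `interleave_bits` is a hypothetical name.

```python
def interleave_bits(nbr_mem_ctrls: int) -> int:
    """Return log2 of the controller count using integer arithmetic only."""
    # Assumed requirement: the count must be a positive power of two.
    if nbr_mem_ctrls < 1 or nbr_mem_ctrls & (nbr_mem_ctrls - 1):
        raise ValueError("number of memory controllers must be a power of two")
    return nbr_mem_ctrls.bit_length() - 1


bits_for_eight = interleave_bits(8)
```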


@@ -34,15 +34,18 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
from gem5.runtime import get_supported_isas
import m5.objects
import m5.internal.params
import inspect
import sys
from textwrap import TextWrapper
import m5.internal.params
import m5.objects
class ObjectList(object):
from gem5.isas import ISA
from gem5.runtime import get_supported_isas
class ObjectList:
"""Creates a list of objects that are sub-classes of a given class."""
def _is_obj_class(self, cls):
@@ -86,7 +89,7 @@ class ObjectList(object):
print(line)
if self._aliases:
print("\Aliases:")
print(r"\Aliases:")
for alias, target in list(self._aliases.items()):
print(f"\t{alias} => {target}")
@@ -127,14 +130,14 @@ class CPUList(ObjectList):
# We can't use the normal inspect.isclass because the ParamFactory
# and ProxyFactory classes have a tendency to confuse it.
try:
return super(CPUList, self)._is_obj_class(cls) and not issubclass(
return super()._is_obj_class(cls) and not issubclass(
cls, m5.objects.CheckerCPU
)
except (TypeError, AttributeError):
return False
def _add_objects(self):
super(CPUList, self)._add_objects()
super()._add_objects()
from importlib import import_module
@@ -157,6 +160,27 @@ class CPUList(ObjectList):
):
self._sub_classes[name] = cls
def get_isa(self, name: str) -> ISA:
"""For a given CPU (string representation) determine the ISA of the
CPU."""
cls = self.get(name)
if hasattr(m5.objects, "X86CPU") and issubclass(
cls, m5.objects.X86CPU
):
return ISA.X86
elif hasattr(m5.objects, "ArmCPU") and issubclass(
cls, m5.objects.ArmCPU
):
return ISA.ARM
elif hasattr(m5.objects, "RiscvCPU") and issubclass(
cls, m5.objects.RiscvCPU
):
return ISA.RISCV
else:
raise ValueError("Unable to determine CPU ISA.")
class EnumList(ObjectList):
"""Creates a list of possible values for a given enum class."""
@@ -164,7 +188,7 @@ class EnumList(ObjectList):
def _add_objects(self):
"""Add all enum values to the ObjectList"""
self._sub_classes = {}
for (key, value) in list(self.base_cls.__members__.items()):
for key, value in list(self.base_cls.__members__.items()):
# All Enums have a value Num_NAME at the end which we
# do not want to include
if not key.startswith("Num_"):
@@ -204,3 +228,4 @@ def _subclass_tester(name):
is_kvm_cpu = _subclass_tester("BaseKvmCPU")
is_noncaching_cpu = _subclass_tester("NonCachingSimpleCPU")
is_o3_cpu = _subclass_tester("BaseO3CPU")
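
The `get_isa` method added in this file infers the ISA from the CPU's class hierarchy instead of asking the runtime. The sketch below mimics that lookup with stand-in classes so it runs without gem5. `ISA`, `X86CPU`, `ArmCPU` and the two CPU subclasses are placeholders defined here, not the real `m5.objects` classes.

```python
from enum import Enum


class ISA(Enum):  # stand-in for gem5.isas.ISA
    X86 = "x86"
    ARM = "arm"
    RISCV = "riscv"


class X86CPU:  # stand-in base classes; a real build may lack some of them
    pass


class ArmCPU:
    pass


class X86AtomicSimpleCPU(X86CPU):
    pass


class ArmO3CPU(ArmCPU):
    pass


# Only the bases present in this "build" are listed, which plays the role
# of the hasattr() guards in the method above.
_ISA_BASES = [(X86CPU, ISA.X86), (ArmCPU, ISA.ARM)]


def get_isa(cls) -> ISA:
    """Return the ISA whose base class cls derives from."""
    for base, isa in _ISA_BASES:
        if issubclass(cls, base):
            return isa
    raise ValueError("Unable to determine CPU ISA.")


x86_isa = get_isa(X86AtomicSimpleCPU)
arm_isa = get_isa(ArmO3CPU)
```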


@@ -37,13 +37,20 @@
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import argparse
from typing import Optional
from common import (
CpuConfig,
ObjectList,
)
from common.Benchmarks import *
import m5
from m5.defines import buildEnv
from m5.objects import *
from common.Benchmarks import *
from common import ObjectList
from gem5.isas import ISA
from gem5.runtime import get_supported_isas
vio_9p_help = """\
Enable the Virtio 9P device and set the path to share. The default 9p path is
@@ -148,7 +155,7 @@ def addNoISAOptions(parser):
"--mem-size",
action="store",
type=str,
default="512MB",
default="512MiB",
help="Specify the physical memory size (single memory)",
)
parser.add_argument(
@@ -181,10 +188,10 @@ def addNoISAOptions(parser):
parser.add_argument("--num-dirs", type=int, default=1)
parser.add_argument("--num-l2caches", type=int, default=1)
parser.add_argument("--num-l3caches", type=int, default=1)
parser.add_argument("--l1d_size", type=str, default="64kB")
parser.add_argument("--l1i_size", type=str, default="32kB")
parser.add_argument("--l2_size", type=str, default="2MB")
parser.add_argument("--l3_size", type=str, default="16MB")
parser.add_argument("--l1d_size", type=str, default="64KiB")
parser.add_argument("--l1i_size", type=str, default="32KiB")
parser.add_argument("--l2_size", type=str, default="2MiB")
parser.add_argument("--l3_size", type=str, default="16MiB")
parser.add_argument("--l1d_assoc", type=int, default=2)
parser.add_argument("--l1i_assoc", type=int, default=2)
parser.add_argument("--l2_assoc", type=int, default=8)
@@ -237,9 +244,13 @@ def addNoISAOptions(parser):
# Add common options that assume a non-NULL ISA.
def addCommonOptions(parser):
def addCommonOptions(parser, default_isa: Optional[ISA] = None):
# start by adding the base options that do not assume an ISA
addNoISAOptions(parser)
if default_isa is None:
isa = list(get_supported_isas())[0]
else:
isa = default_isa
# system options
parser.add_argument(
@@ -250,7 +261,7 @@ def addCommonOptions(parser):
)
parser.add_argument(
"--cpu-type",
default="AtomicSimpleCPU",
default=CpuConfig.isa_string_map[isa] + "AtomicSimpleCPU",
choices=ObjectList.cpu_list.get_names(),
help="type of cpu to run with",
)
@@ -581,7 +592,7 @@ def addCommonOptions(parser):
parser.add_argument(
"--restore-with-cpu",
action="store",
default="AtomicSimpleCPU",
default=CpuConfig.isa_string_map[isa] + "AtomicSimpleCPU",
choices=ObjectList.cpu_list.get_names(),
help="cpu type for restoring from a checkpoint",
)
@@ -784,12 +795,20 @@ def addFSOptions(parser):
"files in the gem5 output directory",
)
if buildEnv["USE_ARM_ISA"]:
if buildEnv["USE_ARM_ISA"] or buildEnv["USE_RISCV_ISA"]:
parser.add_argument(
"--bare-metal",
action="store_true",
help="Provide the raw system without the linux specific bits",
)
parser.add_argument(
"--dtb-filename",
action="store",
type=str,
help="Specifies device tree blob file to use with device-tree-"
"enabled kernels",
)
if buildEnv["USE_ARM_ISA"]:
parser.add_argument(
"--list-machine-types",
action=ListPlatform,
@@ -802,13 +821,6 @@ def addFSOptions(parser):
choices=ObjectList.platform_list.get_names(),
default="VExpress_GEM5_V1",
)
parser.add_argument(
"--dtb-filename",
action="store",
type=str,
help="Specifies device tree blob file to use with device-tree-"
"enabled kernels",
)
parser.add_argument(
"--enable-context-switch-stats-dump",
action="store_true",
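
The new `--cpu-type` and `--restore-with-cpu` defaults in this file are built by prefixing `AtomicSimpleCPU` with an ISA name. A stand-alone sketch of that composition follows. ISA keys are simplified to strings, and `default_cpu_type` is a hypothetical helper. If the supported-ISA collection is unordered, the "first" element is arbitrary on multi-ISA builds, so passing `default_isa` explicitly seems the safer route there.

```python
# Simplified copy of the mapping idea; keys are strings rather than ISA enums.
isa_string_map = {"x86": "X86", "arm": "Arm", "riscv": "Riscv"}


def default_cpu_type(supported_isas, default_isa=None) -> str:
    """Compose the default CPU class name (hypothetical helper)."""
    if default_isa is None:
        isa = list(supported_isas)[0]
    else:
        isa = default_isa
    return isa_string_map[isa] + "AtomicSimpleCPU"


single_isa_default = default_cpu_type(["riscv"])
explicit_default = default_cpu_type(["x86", "arm"], default_isa="arm")
```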


@@ -1,4 +1,3 @@
# -*- coding: utf-8 -*-
# Copyright (c) 2015 Jason Power
# All rights reserved.
#
@@ -35,12 +34,12 @@ from each class instead of only from the configuration script.
# Module-level variable to track if we've called the parse_args function yet
called_parse_args = False
# For fatal
import m5
# import the argument parser
from argparse import ArgumentParser
# For fatal
import m5
# add the args we want to be able to control from the command line
parser = ArgumentParser()


@@ -41,8 +41,10 @@ import sys
from os import getcwd
from os.path import join as joinpath
from common import CpuConfig
from common import ObjectList
from common import (
CpuConfig,
ObjectList,
)
import m5
from m5.defines import buildEnv
@@ -79,7 +81,10 @@ def setCPUClass(options):
TmpClass, test_mem_mode = getCPUClass(options.restore_with_cpu)
elif options.fast_forward:
CPUClass = TmpClass
TmpClass = AtomicSimpleCPU
CPUISA = ObjectList.cpu_list.get_isa(options.cpu_type)
TmpClass = getCPUClass(
CpuConfig.isa_string_map[CPUISA] + "AtomicSimpleCPU"
)
test_mem_mode = "atomic"
# Ruby only supports atomic accesses in noncaching mode
@@ -128,9 +133,12 @@ def findCptDir(options, cptdir, testsys):
the appropriate directory.
"""
from os.path import isdir, exists
from os import listdir
import re
from os import listdir
from os.path import (
exists,
isdir,
)
if not isdir(cptdir):
fatal("checkpoint dir %s does not exist!", cptdir)
@@ -153,8 +161,8 @@ def findCptDir(options, cptdir, testsys):
# Assumes that the checkpoint dir names are formatted as follows:
dirs = listdir(cptdir)
expr = re.compile(
"cpt\.simpoint_(\d+)_inst_(\d+)"
+ "_weight_([\d\.e\-]+)_interval_(\d+)_warmup_(\d+)"
r"cpt\.simpoint_(\d+)_inst_(\d+)"
+ r"_weight_([\d\.e\-]+)_interval_(\d+)_warmup_(\d+)"
)
cpts = []
for dir in dirs:
@@ -190,7 +198,7 @@ def findCptDir(options, cptdir, testsys):
else:
dirs = listdir(cptdir)
expr = re.compile("cpt\.([0-9]+)")
expr = re.compile(r"cpt\.([0-9]+)")
cpts = []
for dir in dirs:
match = expr.match(dir)
@@ -325,7 +333,7 @@ def parseSimpointAnalysisFile(options, testsys):
line = simpoint_file.readline()
if not line:
break
m = re.match("(\d+)\s+(\d+)", line)
m = re.match(r"(\d+)\s+(\d+)", line)
if m:
interval = int(m.group(1))
else:
@@ -334,7 +342,7 @@ def parseSimpointAnalysisFile(options, testsys):
line = weight_file.readline()
if not line:
fatal("not enough lines in simpoint weight file!")
m = re.match("([0-9\.e\-]+)\s+(\d+)", line)
m = re.match(r"([0-9\.e\-]+)\s+(\d+)", line)
if m:
weight = float(m.group(1))
else:
@@ -533,9 +541,9 @@ def run(options, root, testsys, cpu_class):
IndirectBPClass = ObjectList.indirect_bp_list.get(
options.indirect_bp_type
)
switch_cpus[
i
].branchPred.indirectBranchPred = IndirectBPClass()
switch_cpus[i].branchPred.indirectBranchPred = (
IndirectBPClass()
)
switch_cpus[i].createThreads()
# If elastic tracing is enabled attach the elastic trace probe
@@ -771,7 +779,6 @@ def run(options, root, testsys, cpu_class):
if (
options.take_checkpoints or options.take_simpoint_checkpoints
) and options.checkpoint_restore:
if m5.options.outdir:
cptdir = m5.options.outdir
else:
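The raw-string prefixes added to the patterns in findCptDir and parseSimpointAnalysisFile should not change what is matched. They keep sequences such as `\d` from being treated as (invalid) string escapes, which recent Python versions warn about. A minimal sketch, assuming a made-up directory name and a hypothetical helper `parse_simpoint_dir` that is not part of the script:

```python
import re

# Same pattern as in the hunk above, written with raw strings.
SIMPOINT_DIR = re.compile(
    r"cpt\.simpoint_(\d+)_inst_(\d+)"
    + r"_weight_([\d\.e\-]+)_interval_(\d+)_warmup_(\d+)"
)


def parse_simpoint_dir(name):
    """Hypothetical helper: return the five fields as numbers, or None."""
    match = SIMPOINT_DIR.match(name)
    if match is None:
        return None
    index, inst, weight, interval, warmup = match.groups()
    return int(index), int(inst), float(weight), int(interval), int(warmup)


# Illustrative name only, not taken from a real checkpoint directory.
example = (
    "cpt.simpoint_02_inst_3000000_weight_0.25"
    "_interval_1000000_warmup_500000"
)
print(parse_simpoint_dir(example))
```

A name that does not follow the simpoint layout, such as `cpt.12345`, yields `None` here and would instead be handled by the simpler `r"cpt\.([0-9]+)"` pattern.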

@@ -24,13 +24,14 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import os, sys
import os
import sys
config_path = os.path.dirname(os.path.abspath(__file__))
config_root = os.path.dirname(config_path)
class PathSearchFunc(object):
class PathSearchFunc:
_sys_paths = None
environment_variable = "M5_PATH"
@@ -58,7 +59,7 @@ class PathSearchFunc(object):
paths = list(filter(os.path.isdir, paths))
if not paths:
raise IOError(
raise OSError(
"Can't find system files directory, "
"check your {} environment variable".format(
self.environment_variable
@@ -72,7 +73,7 @@ class PathSearchFunc(object):
try:
return next(p for p in paths if os.path.exists(p))
except StopIteration:
raise IOError(
raise OSError(
f"Can't find file '{filepath}' on {self.environment_variable}."
)
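Replacing IOError with OSError is a spelling change rather than a behavioural one, because in Python 3 the former is only an alias of the latter. The sketch below illustrates this with a hypothetical stand-in for the path lookup (`first_existing` and the example path are assumptions, not code from this file):

```python
import os

# In Python 3, IOError is kept only as an alias of OSError.
print(IOError is OSError)


def first_existing(paths):
    """Hypothetical stand-in for the lookup above: first path that exists."""
    try:
        return next(p for p in paths if os.path.exists(p))
    except StopIteration:
        raise OSError(f"Can't find any of {paths!r}")


try:
    first_existing(["/nonexistent/gem5-example-path"])
except IOError as err:
    # Still caught: both names refer to the same exception class.
    caught = err
```

Callers that already catch IOError therefore keep working after the rename.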

@@ -44,6 +44,7 @@ at: http://www.arm.com/ResearchEnablement/SystemModeling
from m5.objects import *
# Simple function to allow a string of [01x_] to be converted into a
# mask and value for use with MinorFUTiming
def make_implicant(implicant_string):
@@ -1679,7 +1680,23 @@ class HPI_MMU(ArmMMU):
dtb = ArmTLB(entry_type="data", size=256)
class HPI_BTB(SimpleBTB):
numEntries = 128
tagBits = 18
associativity = 1
instShiftAmt = 2
btbReplPolicy = LRURP()
btbIndexingPolicy = BTBSetAssociative(
num_entries=Parent.numEntries,
set_shift=Parent.instShiftAmt,
assoc=Parent.associativity,
tag_bits=Parent.tagBits,
)
class HPI_BP(TournamentBP):
btb = HPI_BTB()
ras = ReturnAddrStack(numEntries=8)
localPredictorSize = 64
localCtrBits = 2
localHistoryTableSize = 64
@@ -1687,9 +1704,6 @@ class HPI_BP(TournamentBP):
globalCtrBits = 2
choicePredictorSize = 1024
choiceCtrBits = 2
BTBEntries = 128
BTBTagSize = 18
RASSize = 8
instShiftAmt = 2
@@ -1699,7 +1713,7 @@ class HPI_ICache(Cache):
response_latency = 1
mshrs = 2
tgts_per_mshr = 8
size = "32kB"
size = "32KiB"
assoc = 2
# No prefetcher, this is handled by the core
@@ -1710,7 +1724,7 @@ class HPI_DCache(Cache):
response_latency = 1
mshrs = 4
tgts_per_mshr = 8
size = "32kB"
size = "32KiB"
assoc = 4
write_buffers = 4
prefetcher = StridePrefetcher(queue_size=4, degree=4)
@@ -1722,7 +1736,7 @@ class HPI_L2(Cache):
response_latency = 5
mshrs = 4
tgts_per_mshr = 8
size = "1024kB"
size = "1024KiB"
assoc = 16
write_buffers = 16
# prefetcher FIXME
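The cache sizes in these hunks move from SI-looking suffixes ("32kB", "1024kB") to explicit IEC ones ("32KiB", "1024KiB"). How gem5's own parser treated the old spellings is not shown in this diff. The sketch below only illustrates why the two notations are not interchangeable under a strict reading; `to_bytes` is a hypothetical helper, not gem5's conversion code.

```python
# Hypothetical helper, NOT gem5's m5.util.convert: it only contrasts
# decimal (SI) and binary (IEC) byte prefixes.
_MULTIPLIERS = {
    "kB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "KiB": 2**10,
    "MiB": 2**20,
    "GiB": 2**30,
}


def to_bytes(size):
    """Convert strings such as '32KiB' to a byte count."""
    for suffix, factor in _MULTIPLIERS.items():
        if size.endswith(suffix):
            return int(size[: -len(suffix)]) * factor
    raise ValueError(f"unknown size suffix in {size!r}")


print(to_bytes("32kB"), to_bytes("32KiB"))
```

With the IEC suffixes, "1024KiB" and "1MiB" denote exactly the same number of bytes, which is the reading a cache geometry needs.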

@@ -0,0 +1,60 @@
# Copyright (c) 2012, 2017-2018, 2023 Arm Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
# not be construed as granting a license to any other intellectual
# property including but not limited to intellectual property relating
# to a hardware implementation of the functionality of the software
# licensed hereunder. You may use the software subject to the license
# terms below provided that you ensure that this notice is replicated
# unmodified and in its entirety in all distributions of the software,
# modified or unmodified, in source code or in binary form.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
from m5.objects import *
from .O3_ARM_v7a import O3_ARM_v7a_3
# O3_ARM_v7a_3 adapted to generate elastic traces
class O3_ARM_v7a_3_Etrace(O3_ARM_v7a_3):
# Make the number of entries in the ROB, LQ and SQ very
# large so that there are no stalls due to resource
# limitation as such stalls will get captured in the trace
# as compute delay. For replay, ROB, LQ and SQ sizes are
# modelled in the Trace CPU.
numROBEntries = 512
LQEntries = 128
SQEntries = 128
def attach_probe_listener(self, inst_trace_file, data_trace_file):
# Attach the elastic trace probe listener. Set the protobuf trace
# file names. Set the dependency window size equal to the cpu it
# is attached to.
self.traceListener = m5.objects.ElasticTrace(
instFetchTraceFile=inst_trace_file,
dataDepTraceFile=data_trace_file,
depWindowSize=3 * self.numROBEntries,
)

@@ -26,6 +26,7 @@
from m5.objects import *
# Simple ALU Instructions have a latency of 1
class O3_ARM_v7a_Simple_Int(FUDesc):
opList = [OpDesc(opClass="IntAlu", opLat=1)]
@@ -107,15 +108,28 @@ class O3_ARM_v7a_FUP(FUPool):
]
class O3_ARM_v7a_BTB(SimpleBTB):
numEntries = 2048
tagBits = 18
associativity = 1
instShiftAmt = 2
btbReplPolicy = LRURP()
btbIndexingPolicy = BTBSetAssociative(
num_entries=Parent.numEntries,
set_shift=Parent.instShiftAmt,
assoc=Parent.associativity,
tag_bits=Parent.tagBits,
)
# Bi-Mode Branch Predictor
class O3_ARM_v7a_BP(BiModeBP):
btb = O3_ARM_v7a_BTB()
ras = ReturnAddrStack(numEntries=16)
globalPredictorSize = 8192
globalCtrBits = 2
choicePredictorSize = 8192
choiceCtrBits = 2
BTBEntries = 2048
BTBTagSize = 18
RASSize = 16
instShiftAmt = 2
@@ -171,7 +185,7 @@ class O3_ARM_v7a_ICache(Cache):
response_latency = 1
mshrs = 2
tgts_per_mshr = 8
size = "32kB"
size = "32KiB"
assoc = 2
is_read_only = True
# Writeback clean lines as well
@@ -185,7 +199,7 @@ class O3_ARM_v7a_DCache(Cache):
response_latency = 2
mshrs = 6
tgts_per_mshr = 8
size = "32kB"
size = "32KiB"
assoc = 2
write_buffers = 16
# Consider the L2 a victim cache also for clean lines
@@ -199,12 +213,11 @@ class O3_ARM_v7aL2(Cache):
response_latency = 12
mshrs = 16
tgts_per_mshr = 8
size = "1MB"
size = "1MiB"
assoc = 16
write_buffers = 8
prefetch_on_access = True
clusivity = "mostly_excl"
# Simple stride prefetcher
prefetcher = StridePrefetcher(degree=8, latency=1)
prefetcher = StridePrefetcher(degree=8, latency=1, prefetch_on_access=True)
tags = BaseSetAssoc()
replacement_policy = RandomRP()

@@ -33,8 +33,8 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
from pkgutil import iter_modules
from importlib import import_module
from pkgutil import iter_modules
_cpu_modules = [name for _, name, ispkg in iter_modules(__path__) if not ispkg]

@@ -31,6 +31,7 @@ from m5.objects import *
# ex5 LITTLE core (based on the ARM Cortex-A7)
# -----------------------------------------------------------------------
# Simple ALU Instructions have a latency of 3
class ex5_LITTLE_Simple_Int(MinorDefaultIntFU):
opList = [OpDesc(opClass="IntAlu", opLat=4)]
@@ -123,7 +124,7 @@ class L1Cache(Cache):
class L1I(L1Cache):
mshrs = 2
size = "32kB"
size = "32KiB"
assoc = 2
is_read_only = True
tgts_per_mshr = 20
@@ -131,7 +132,7 @@ class L1I(L1Cache):
class L1D(L1Cache):
mshrs = 4
size = "32kB"
size = "32KiB"
assoc = 4
write_buffers = 4
@@ -143,12 +144,11 @@ class L2(Cache):
response_latency = 9
mshrs = 8
tgts_per_mshr = 12
size = "512kB"
size = "512KiB"
assoc = 8
write_buffers = 16
prefetch_on_access = True
clusivity = "mostly_excl"
# Simple stride prefetcher
prefetcher = StridePrefetcher(degree=1, latency=1)
prefetcher = StridePrefetcher(degree=1, latency=1, prefetch_on_access=True)
tags = BaseSetAssoc()
replacement_policy = RandomRP()

@@ -31,6 +31,7 @@ from m5.objects import *
# ex5 big core (based on the ARM Cortex-A15)
# -----------------------------------------------------------------------
# Simple ALU Instructions have a latency of 1
class ex5_big_Simple_Int(FUDesc):
opList = [OpDesc(opClass="IntAlu", opLat=1)]
@@ -104,15 +105,28 @@ class ex5_big_FUP(FUPool):
]
class ex5_big_BTB(SimpleBTB):
numEntries = 4096
tagBits = 18
associativity = 1
instShiftAmt = 2
btbReplPolicy = LRURP()
btbIndexingPolicy = BTBSetAssociative(
num_entries=Parent.numEntries,
set_shift=Parent.instShiftAmt,
assoc=Parent.associativity,
tag_bits=Parent.tagBits,
)
# Bi-Mode Branch Predictor
class ex5_big_BP(BiModeBP):
btb = ex5_big_BTB()
ras = ReturnAddrStack(numEntries=48)
globalPredictorSize = 4096
globalCtrBits = 2
choicePredictorSize = 1024
choiceCtrBits = 3
BTBEntries = 4096
BTBTagSize = 18
RASSize = 48
instShiftAmt = 2
@@ -172,7 +186,7 @@ class L1Cache(Cache):
# Instruction Cache
class L1I(L1Cache):
mshrs = 2
size = "32kB"
size = "32KiB"
assoc = 2
is_read_only = True
@@ -180,7 +194,7 @@ class L1I(L1Cache):
# Data Cache
class L1D(L1Cache):
mshrs = 6
size = "32kB"
size = "32KiB"
assoc = 2
write_buffers = 16
@@ -192,12 +206,11 @@ class L2(Cache):
response_latency = 15
mshrs = 16
tgts_per_mshr = 8
size = "2MB"
size = "2MiB"
assoc = 16
write_buffers = 8
prefetch_on_access = True
clusivity = "mostly_excl"
# Simple stride prefetcher
prefetcher = StridePrefetcher(degree=8, latency=1)
prefetcher = StridePrefetcher(degree=8, latency=1, prefetch_on_access=True)
tags = BaseSetAssoc()
replacement_policy = RandomRP()

@@ -26,8 +26,15 @@
import os
import sys
from os.path import basename, exists, join as joinpath, normpath
from os.path import isdir, isfile, islink
from os.path import (
basename,
exists,
isdir,
isfile,
islink,
)
from os.path import join as joinpath
from os.path import normpath
spec_dist = os.environ.get("M5_CPU2000", "/dist/m5/cpu2000")
@@ -71,7 +78,7 @@ def copyfiles(srcdir, dstdir):
os.symlink(".", outlink)
class Benchmark(object):
class Benchmark:
def __init__(self, isa, os, input_set):
if not hasattr(self.__class__, "name"):
self.name = self.__class__.__name__
@@ -877,7 +884,7 @@ class vortex(Benchmark):
else:
raise AttributeError(f"unknown ISA {isa}")
super(vortex, self).__init__(isa, os, input_set)
super().__init__(isa, os, input_set)
def test(self, isa, os):
self.args = [f"{self.endian}.raw"]
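The two modernisations in this file (dropping the explicit `object` base and using the zero-argument `super()`) are equivalent to the old spellings in Python 3. A minimal sketch with stripped-down stand-ins for the classes (the bodies here are assumptions for illustration, not the real benchmark definitions):

```python
class Benchmark:  # same class hierarchy as "class Benchmark(object):"
    def __init__(self, isa, os, input_set):
        self.isa = isa
        self.os = os
        self.input_set = input_set


class vortex(Benchmark):
    def __init__(self, isa, os, input_set):
        # Equivalent to super(vortex, self).__init__(isa, os, input_set)
        super().__init__(isa, os, input_set)


workload = vortex("arm", "linux", "ref")
print(workload.isa, Benchmark.__mro__)
```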

@@ -45,25 +45,30 @@ import sys
import m5
from m5.defines import buildEnv
from m5.objects import *
from m5.util import addToPath, fatal, warn
from m5.util import (
addToPath,
fatal,
warn,
)
from m5.util.fdthelper import *
from gem5.isas import ISA
from gem5.runtime import get_runtime_isa
addToPath("../../")
from ruby import Ruby
from common import (
CacheConfig,
CpuConfig,
MemConfig,
ObjectList,
Options,
Simulation,
)
from common.Benchmarks import *
from common.Caches import *
from common.FSConfig import *
from common.SysPaths import *
from common.Benchmarks import *
from common import Simulation
from common import CacheConfig
from common import CpuConfig
from common import MemConfig
from common import ObjectList
from common.Caches import *
from common import Options
from ruby import Ruby
def cmd_line_template():
@@ -80,9 +85,8 @@ def cmd_line_template():
return None
def build_test_system(np):
def build_test_system(np, isa: ISA):
cmdline = cmd_line_template()
isa = get_runtime_isa()
if isa == ISA.MIPS:
test_sys = makeLinuxMipsSystem(test_mem_mode, bm[0], cmdline=cmdline)
elif isa == ISA.SPARC:
@@ -164,7 +168,7 @@ def build_test_system(np):
# assuming that there is just one such port.
test_sys.iobus.mem_side_ports = test_sys.ruby._io_port.in_ports
for (i, cpu) in enumerate(test_sys.cpu):
for i, cpu in enumerate(test_sys.cpu):
#
# Tie the cpu ports to the correct ruby system ports
#
@@ -209,9 +213,9 @@ def build_test_system(np):
IndirectBPClass = ObjectList.indirect_bp_list.get(
args.indirect_bp_type
)
test_sys.cpu[
i
].branchPred.indirectBranchPred = IndirectBPClass()
test_sys.cpu[i].branchPred.indirectBranchPred = (
IndirectBPClass()
)
test_sys.cpu[i].createThreads()
# If elastic tracing is enabled when not restoring from checkpoint and
@@ -378,7 +382,8 @@ else:
np = args.num_cpus
test_sys = build_test_system(np)
isa = ObjectList.cpu_list.get_isa(args.cpu_type)
test_sys = build_test_system(np, isa)
if len(bm) == 2:
drive_sys = build_drive_system(np)

@@ -41,30 +41,35 @@
# "m5 test.py"
import argparse
import sys
import os
import sys
import m5
from m5.defines import buildEnv
from m5.objects import *
from m5.params import NULL
from m5.util import addToPath, fatal, warn
from m5.util import (
addToPath,
fatal,
warn,
)
from gem5.isas import ISA
from gem5.runtime import get_runtime_isa
addToPath("../../")
from ruby import Ruby
from common import Options
from common import Simulation
from common import CacheConfig
from common import CpuConfig
from common import ObjectList
from common import MemConfig
from common.FileSystemConfig import config_filesystem
from common import (
CacheConfig,
CpuConfig,
MemConfig,
ObjectList,
Options,
Simulation,
)
from common.Caches import *
from common.cpu2000 import *
from common.FileSystemConfig import config_filesystem
from ruby import Ruby
def get_processes(args):
@@ -94,7 +99,7 @@ def get_processes(args):
process.gid = os.getgid()
if args.env:
with open(args.env, "r") as f:
with open(args.env) as f:
process.env = [line.rstrip() for line in f]
if len(pargs) > idx:
@@ -113,7 +118,8 @@ def get_processes(args):
idx += 1
if args.smt:
assert args.cpu_type == "DerivO3CPU"
cpu_type = ObjectList.cpu_list.get(args.cpu_type)
assert ObjectList.is_o3_cpu(cpu_type), "SMT requires an O3CPU"
return multiprocesses, idx
else:
return multiprocesses, 1
@@ -144,7 +150,7 @@ if args.bench:
for app in apps:
try:
if get_runtime_isa() == ISA.ARM:
if ObjectList.cpu_list.get_isa(args.cpu_type) == ISA.ARM:
exec(
"workload = %s('arm_%s', 'linux', '%s')"
% (app, args.arm_iset, args.spec_input)
@@ -159,7 +165,7 @@ if args.bench:
multiprocesses.append(workload.makeProcess())
except:
print(
f"Unable to find workload for {get_runtime_isa().name()}: {app}",
f"Unable to find workload for ISA: {app}",
file=sys.stderr,
)
sys.exit(1)
@@ -218,7 +224,7 @@ for cpu in system.cpu:
if ObjectList.is_kvm_cpu(CPUClass) or ObjectList.is_kvm_cpu(FutureClass):
if buildEnv["USE_X86_ISA"]:
system.kvm_vm = KvmVM()
system.m5ops_base = 0xFFFF0000
system.m5ops_base = max(0xFFFF0000, Addr(args.mem_size).getValue())
for process in multiprocesses:
process.useArchPT = True
process.kvmInSE = True
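The new `m5ops_base` expression takes the larger of the fixed address 0xFFFF0000 and the configured memory size, presumably so that the m5ops region cannot land inside physical memory once the memory exceeds that address. The sketch below reproduces only the arithmetic on plain byte counts; `m5ops_base_for` is a hypothetical name, and the real script goes through `Addr(args.mem_size).getValue()`.

```python
DEFAULT_M5OPS_BASE = 0xFFFF0000
GiB = 2**30


def m5ops_base_for(mem_size_bytes):
    """Hypothetical helper mirroring max(0xFFFF0000, mem_size)."""
    return max(DEFAULT_M5OPS_BASE, mem_size_bytes)


small = m5ops_base_for(2 * GiB)  # memory ends below the default base
large = m5ops_base_for(8 * GiB)  # memory extends past the default base
print(hex(small), hex(large))
```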

configs/dist/sw.py

@@ -62,7 +62,7 @@ def build_switch(args):
for i in range(args.dist_size)
]
for (i, link) in enumerate(switch.portlink):
for i, link in enumerate(switch.portlink):
link.int0 = switch.interface[i]
return switch

@@ -33,18 +33,20 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import gzip
import argparse
import gzip
import os
import m5
from m5.objects import *
from m5.util import addToPath
from m5.stats import periodicStatDump
from m5.util import addToPath
addToPath("../")
from common import ObjectList
from common import MemConfig
from common import (
MemConfig,
ObjectList,
)
addToPath("../../util")
import protolib
@@ -96,7 +98,7 @@ parser.add_argument(
"--mem-size",
action="store",
type=str,
default="16MB",
default="16MiB",
help="Specify the memory size",
)
parser.add_argument(
@@ -150,6 +152,7 @@ cfg_file = open(cfg_file_name, "w")
burst_size = 64
system.cache_line_size = burst_size
# lazy version to check if an integer is a power of two
def is_pow2(num):
return num != 0 and ((num & (num - 1)) == 0)
@@ -158,7 +161,7 @@ def is_pow2(num):
# assume we start every range at 0
max_range = int(mem_range.end)
# start at a size of 4 kByte, and go up till we hit the max, increase
# start at a size of 4 kibibytes, and go up till we hit the max, increase
# the step every time we hit a power of two
min_range = 4096
ranges = [min_range]
@@ -177,13 +180,14 @@ iterations = 2
# do not pile up in the system, adjust if needed
itt = 150 * 1000
# for every data point, we create a trace containing a random address
# sequence, so that we can play back the same sequence for warming and
# the actual measurement
def create_trace(filename, max_addr, burst_size, itt):
try:
proto_out = gzip.open(filename, "wb")
except IOError:
except OSError:
print("Failed to open ", filename, " for writing")
exit(-1)
@@ -276,6 +280,7 @@ system.tgen.port = system.monitor.cpu_side_port
# basic to explore some of the options
from common.Caches import *
# a starting point for an L3 cache
class L3Cache(Cache):
assoc = 16
@@ -290,17 +295,17 @@ class L3Cache(Cache):
# note that everything is in the same clock domain, 2.0 GHz as
# specified above
system.l1cache = L1_DCache(size="64kB")
system.l1cache = L1_DCache(size="64KiB")
system.monitor.mem_side_port = system.l1cache.cpu_side
system.l2cache = L2Cache(size="512kB", writeback_clean=True)
system.l2cache = L2Cache(size="512KiB", writeback_clean=True)
system.l2cache.xbar = L2XBar()
system.l1cache.mem_side = system.l2cache.xbar.cpu_side_ports
system.l2cache.cpu_side = system.l2cache.xbar.mem_side_ports
# make the L3 mostly exclusive, and correspondingly ensure that the L2
# writes back also clean lines to the L3
system.l3cache = L3Cache(size="4MB", clusivity="mostly_excl")
system.l3cache = L3Cache(size="4MiB", clusivity="mostly_excl")
system.l3cache.xbar = L2XBar()
system.l2cache.mem_side = system.l3cache.xbar.cpu_side_ports
system.l3cache.cpu_side = system.l3cache.xbar.mem_side_ports
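The `is_pow2` helper in this script relies on a standard bit trick: a power of two has exactly one bit set, so clearing the lowest set bit with `num & (num - 1)` leaves zero. A short check, with the function copied from the hunk above and the test range chosen arbitrarily:

```python
def is_pow2(num):
    # A power of two has a single set bit; num & (num - 1) clears it.
    return num != 0 and ((num & (num - 1)) == 0)


powers = [n for n in range(1, 20) if is_pow2(n)]
print(powers)
```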

@@ -37,13 +37,15 @@ import argparse
import m5
from m5.objects import *
from m5.util import addToPath
from m5.stats import periodicStatDump
from m5.util import addToPath
addToPath("../")
from common import ObjectList
from common import MemConfig
from common import (
MemConfig,
ObjectList,
)
# This script aims at triggering low power state transitions in the DRAM
# controller. The traffic generator is used in DRAM mode and traffic
@@ -114,8 +116,8 @@ system.clk_domain = SrcClockDomain(
clock="2.0GHz", voltage_domain=VoltageDomain(voltage="1V")
)
# We are fine with 256 MB memory for now.
mem_range = AddrRange("256MB")
# We are fine with 256 MiB memory for now.
mem_range = AddrRange("256MiB")
# Start address is 0
system.mem_ranges = [mem_range]

@@ -33,18 +33,20 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import math
import argparse
import math
import m5
from m5.objects import *
from m5.util import addToPath
from m5.stats import periodicStatDump
from m5.util import addToPath
addToPath("../")
from common import ObjectList
from common import MemConfig
from common import (
MemConfig,
ObjectList,
)
# this script is helpful to sweep the efficiency of a specific memory
# controller configuration, by varying the number of banks accessed,
@@ -106,8 +108,8 @@ system.clk_domain = SrcClockDomain(
clock="2.0GHz", voltage_domain=VoltageDomain(voltage="1V")
)
# we are fine with 256 MB memory for now
mem_range = AddrRange("256MB")
# we are fine with 256 MiB memory for now
mem_range = AddrRange("256MiB")
system.mem_ranges = [mem_range]
# do not worry about reserving space for the backing store

@@ -27,28 +27,34 @@
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
import argparse, os, re, getpass
import math
import argparse
import getpass
import glob
import inspect
import math
import os
import re
import m5
from m5.objects import *
from m5.util import addToPath
from gem5.isas import ISA
from gem5.runtime import get_runtime_isa
from gem5.resources.resource import obtain_resource
from gem5.runtime import get_supported_isas
addToPath("../")
from ruby import Ruby
from common import Options
from common import Simulation
from common import GPUTLBOptions, GPUTLBConfig
import hsaTopology
from common import FileSystemConfig
from common import (
FileSystemConfig,
GPUTLBConfig,
GPUTLBOptions,
ObjectList,
Options,
Simulation,
)
from ruby import Ruby
# Adding script options
parser = argparse.ArgumentParser()
@@ -290,6 +296,14 @@ parser.add_argument(
help="Latency for scalar responses from ruby to the cu.",
)
parser.add_argument(
"--memtime-latency",
type=int,
# Set to a default of 41 from micro-benchmarks
default=41,
help="Latency for memtimes in scalar memory pipeline.",
)
parser.add_argument("--TLB-prefetch", type=int, help="prefetch depth for TLBs")
parser.add_argument(
"--pf-type",
@@ -331,6 +345,12 @@ parser.add_argument(
default="dynamic",
help="register allocation policy (simple/dynamic)",
)
parser.add_argument(
"--register-file-cache-size",
type=int,
default=0,
help="number of registers in cache",
)
parser.add_argument(
"--dgpu",
@@ -365,11 +385,52 @@ parser.add_argument(
parser.add_argument(
"--gfx-version",
type=str,
default="gfx801",
default="gfx902",
choices=GfxVersion.vals,
help="Gfx version for gpu. Note: gfx902 is not fully supported by ROCm",
)
parser.add_argument(
"--tcp-rp",
type=str,
default="TreePLRURP",
choices=ObjectList.rp_list.get_names(),
help="cache replacement policy for tcp",
)
parser.add_argument(
"--tcc-rp",
type=str,
default="TreePLRURP",
choices=ObjectList.rp_list.get_names(),
help="cache replacement policy for tcc",
)
# sqc rp both changes sqc rp and scalar cache rp
parser.add_argument(
"--sqc-rp",
type=str,
default="TreePLRURP",
choices=ObjectList.rp_list.get_names(),
help="cache replacement policy for sqc",
)
parser.add_argument(
"--download-resource",
type=str,
default=None,
required=False,
help="Download this resource prior to simulation",
)
parser.add_argument(
"--download-dir",
type=str,
default=None,
required=False,
help="Download resources to this directory",
)
Ruby.define_options(parser)
# add TLB options to the parser
@@ -377,6 +438,17 @@ GPUTLBOptions.tlb_options(parser)
args = parser.parse_args()
# Get the resource if specified.
if args.download_resource:
resources = obtain_resource(
resource_id=args.download_resource,
resource_directory=args.download_dir,
)
# This line seems pointless but is actually what triggers the download.
resources.get_local_path()
# The GPU cache coherence protocols only work with the backing store
args.access_backing_store = True
@@ -394,8 +466,8 @@ if buildEnv["PROTOCOL"] == "None":
fatal("GPU model requires ruby")
# Currently the gpu model requires only timing or detailed CPU
if not (args.cpu_type == "TimingSimpleCPU" or args.cpu_type == "DerivO3CPU"):
fatal("GPU model requires TimingSimpleCPU or DerivO3CPU")
if not (args.cpu_type == "X86TimingSimpleCPU" or args.cpu_type == "X86O3CPU"):
fatal("GPU model requires X86TimingSimpleCPU or X86O3CPU.")
# This file can support multiple compute units
assert args.num_compute_units >= 1
@@ -424,6 +496,7 @@ print(
# shader is the GPU
shader = Shader(
n_wf=args.wfs_per_simd,
cu_per_sqc=args.cu_per_sqc,
clk_domain=SrcClockDomain(
clock=args.gpu_clock,
voltage_domain=VoltageDomain(voltage=args.gpu_voltage),
@@ -478,6 +551,7 @@ for i in range(n_cu):
mem_resp_latency=args.mem_resp_latency,
scalar_mem_req_latency=args.scalar_mem_req_latency,
scalar_mem_resp_latency=args.scalar_mem_resp_latency,
memtime_latency=args.memtime_latency,
localDataStore=LdsState(
banks=args.numLdsBanks,
bankConflictPenalty=args.ldsBankConflictPenalty,
@@ -489,6 +563,7 @@ for i in range(n_cu):
vrfs = []
vrf_pool_mgrs = []
srfs = []
rfcs = []
srf_pool_mgrs = []
for j in range(args.simds_per_cu):
for k in range(shader.n_wf):
@@ -533,10 +608,16 @@ for i in range(n_cu):
simd_id=j, wf_size=args.wf_size, num_regs=args.sreg_file_size
)
)
rfcs.append(
RegisterFileCache(
simd_id=j, cache_size=args.register_file_cache_size
)
)
compute_units[-1].wavefronts = wavefronts
compute_units[-1].vector_register_file = vrfs
compute_units[-1].scalar_register_file = srfs
compute_units[-1].register_file_cache = rfcs
compute_units[-1].register_manager = RegisterManager(
policy=args.registerManagerPolicy,
vrf_pool_managers=vrf_pool_mgrs,
@@ -567,7 +648,7 @@ cp_list = []
cpu_list = []
CpuClass, mem_mode = Simulation.getCPUClass(args.cpu_type)
if CpuClass == AtomicSimpleCPU:
if CpuClass == X86AtomicSimpleCPU or CpuClass == AtomicSimpleCPU:
fatal("AtomicSimpleCPU is not supported")
if mem_mode != "timing":
fatal("Only the timing memory mode is supported")
@@ -667,12 +748,13 @@ render_driver = GPURenderDriver(filename=f"dri/renderD{renderDriNum}")
gpu_hsapp = HSAPacketProcessor(
pioAddr=hsapp_gpu_map_paddr, numHWQueues=args.num_hw_queues
)
dispatcher = GPUDispatcher()
dispatcher = GPUDispatcher(kernel_exit_events=True)
gpu_cmd_proc = GPUCommandProcessor(hsapp=gpu_hsapp, dispatcher=dispatcher)
gpu_driver.device = gpu_cmd_proc
shader.dispatcher = dispatcher
shader.gpu_cmd_proc = gpu_cmd_proc
# Create and assign the workload. Check for rel_path in elements of
# base_list using test, returning the first full path that satisfies test
def find_path(base_list, rel_path, test):
@@ -698,7 +780,7 @@ if os.path.isdir(executable):
executable = find_file(benchmark_path, args.cmd)
if args.env:
with open(args.env, "r") as f:
with open(args.env) as f:
env = [line.rstrip() for line in f]
else:
env = [
@@ -756,7 +838,7 @@ if fast_forward:
]
# Other CPU strings cause bad addresses in ROCm. Revert back to M5 Simulator.
for (i, cpu) in enumerate(cpu_list):
for i, cpu in enumerate(cpu_list):
for j in range(len(cpu)):
cpu.isa[j].vendor_string = "M5 Simulator"
@@ -781,7 +863,7 @@ system.clk_domain = SrcClockDomain(
if fast_forward:
have_kvm_support = "BaseKvmCPU" in globals()
if have_kvm_support and get_runtime_isa() == ISA.X86:
if have_kvm_support and get_supported_isas().contains(ISA.X86):
system.vm = KvmVM()
system.m5ops_base = 0xFFFF0000
for i in range(len(host_cpu.workload)):
@@ -793,6 +875,8 @@ if fast_forward:
# configure the TLB hierarchy
GPUTLBConfig.config_tlb_hierarchy(args, system, shader_idx)
system.exit_on_work_items = True
# create Ruby system
system.piobus = IOXBar(
width=32, response_latency=0, frontend_latency=0, forward_latency=0
@@ -820,18 +904,15 @@ for i in range(args.num_cpus):
system.cpu[i].dcache_port = ruby_port.in_ports
ruby_port.mem_request_port = system.piobus.cpu_side_ports
if get_runtime_isa() == ISA.X86:
system.cpu[i].interrupts[0].pio = system.piobus.mem_side_ports
system.cpu[i].interrupts[
0
].int_requestor = system.piobus.cpu_side_ports
system.cpu[i].interrupts[
0
].int_responder = system.piobus.mem_side_ports
if fast_forward:
system.cpu[i].mmu.connectWalkerPorts(
ruby_port.in_ports, ruby_port.in_ports
)
# X86 ISA is implied from cpu type check above
system.cpu[i].interrupts[0].pio = system.piobus.mem_side_ports
system.cpu[i].interrupts[0].int_requestor = system.piobus.cpu_side_ports
system.cpu[i].interrupts[0].int_responder = system.piobus.mem_side_ports
if fast_forward:
system.cpu[i].mmu.connectWalkerPorts(
ruby_port.in_ports, ruby_port.in_ports
)
# attach CU ports to Ruby
# Because of the peculiarities of the CP core, you may have 1 CPU but 2
@@ -854,9 +935,9 @@ gpu_port_idx = gpu_port_idx - args.num_cp * 2
token_port_idx = 0
for i in range(len(system.ruby._cpu_ports)):
if isinstance(system.ruby._cpu_ports[i], VIPERCoalescer):
system.cpu[shader_idx].CUs[
token_port_idx
].gmTokenPort = system.ruby._cpu_ports[i].gmTokenPort
system.cpu[shader_idx].CUs[token_port_idx].gmTokenPort = (
system.ruby._cpu_ports[i].gmTokenPort
)
token_port_idx += 1
wavefront_size = args.wf_size
@@ -936,19 +1017,15 @@ root = Root(system=system, full_system=False)
# knows what type of GPU hardware we are simulating
if args.dgpu:
assert args.gfx_version in [
"gfx803",
"gfx900",
], "Incorrect gfx version for dGPU"
if args.gfx_version == "gfx803":
hsaTopology.createFijiTopology(args)
elif args.gfx_version == "gfx900":
if args.gfx_version == "gfx900":
hsaTopology.createVegaTopology(args)
else:
assert args.gfx_version in [
"gfx801",
"gfx902",
], "Incorrect gfx version for APU"
hsaTopology.createCarrizoTopology(args)
hsaTopology.createRavenTopology(args)
m5.ticks.setGlobalFrequency("1THz")
if args.abs_max_tick:
@@ -974,6 +1051,41 @@ if args.fast_forward:
exit_event = m5.simulate(maxtick)
while True:
if (
exit_event.getCause() == "m5_exit instruction encountered"
or exit_event.getCause() == "user interrupt received"
or exit_event.getCause() == "simulate() limit reached"
or "exiting with last active thread context" in exit_event.getCause()
):
print(f"breaking loop due to: {exit_event.getCause()}.")
break
elif "checkpoint" in exit_event.getCause():
assert args.checkpoint_dir is not None
m5.checkpoint(args.checkpoint_dir)
print("breaking loop with checkpoint")
break
elif "GPU Kernel Completed" in exit_event.getCause():
print("GPU Kernel Completed dump and reset")
m5.stats.dump()
m5.stats.reset()
elif "GPU Blit Kernel Completed" in exit_event.getCause():
print("GPU Blit Kernel Completed dump and reset")
m5.stats.dump()
m5.stats.reset()
elif "workbegin" in exit_event.getCause():
print("m5 work begin dump and reset")
m5.stats.dump()
m5.stats.reset()
elif "workend" in exit_event.getCause():
print("m5 work end dump and reset")
m5.stats.dump()
m5.stats.reset()
else:
print(f"Unknown exit event: {exit_event.getCause()}. Continuing...")
exit_event = m5.simulate(maxtick - m5.curTick())
if args.fast_forward:
if exit_event.getCause() == "a thread reached the max instruction count":
m5.switchCpus(system, switch_cpu_list)
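The loop added in the last hunk dispatches on the text of the exit cause: some causes end the run, a checkpoint cause writes a checkpoint and ends it, kernel-completion and work-item causes dump and reset statistics, and anything else simply resumes simulation. The decision logic can be sketched without gem5 as a pure function; `classify_exit_cause` and the returned labels are hypothetical names introduced only for this illustration.

```python
# Cause strings copied from the loop above; everything else is illustrative.
STOP_CAUSES = {
    "m5_exit instruction encountered",
    "user interrupt received",
    "simulate() limit reached",
}
DUMP_RESET_MARKERS = (
    "GPU Kernel Completed",
    "GPU Blit Kernel Completed",
    "workbegin",
    "workend",
)


def classify_exit_cause(cause):
    """Return 'stop', 'checkpoint', 'dump_and_reset' or 'continue'."""
    if (
        cause in STOP_CAUSES
        or "exiting with last active thread context" in cause
    ):
        return "stop"
    if "checkpoint" in cause:
        return "checkpoint"
    if any(marker in cause for marker in DUMP_RESET_MARKERS):
        return "dump_and_reset"
    return "continue"


for cause in ("GPU Kernel Completed", "checkpoint", "something else"):
    print(cause, "->", classify_exit_cause(cause))
```

In the script itself, both the "dump_and_reset" and "continue" cases fall through to another `m5.simulate(maxtick - m5.curTick())` call, so only the first two outcomes leave the loop.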

@@ -39,24 +39,29 @@ Research Starter Kit on System Modeling. More information can be found
at: http://www.arm.com/ResearchEnablement/SystemModeling
"""
import argparse
import os
import m5
from m5.util import addToPath
from m5.objects import *
from m5.options import *
from m5.util import addToPath
from gem5.simulate.exit_event import ExitEvent
import argparse
m5.util.addToPath("../..")
from common import SysPaths
from common import MemConfig
from common import ObjectList
from common.cores.arm import HPI
from common.cores.arm import O3_ARM_v7a
import devices
import workloads
from common import (
MemConfig,
ObjectList,
SysPaths,
)
from common.cores.arm import (
HPI,
O3_ARM_v7a,
)
# Pre-defined CPU configurations. Each tuple must be ordered as: (cpu_class,
# l1_icache_class, l1_dcache_class, walk_cache_class, l2_Cache_class). Any of
@@ -171,9 +176,10 @@ def create(args):
system.workload = workload_class(object_file, system)
if args.with_pmu:
enabled_pmu_events = set(
(*args.pmu_dump_stats_on, *args.pmu_reset_stats_on)
)
enabled_pmu_events = {
*args.pmu_dump_stats_on,
*args.pmu_reset_stats_on,
}
exit_sim_on_control = bool(
enabled_pmu_events & set(pmu_control_events.keys())
)
@@ -302,7 +308,7 @@ def main():
"--mem-size",
action="store",
type=str,
default="2GB",
default="2GiB",
help="Specify the physical memory size",
)
parser.add_argument("--checkpoint", action="store_true")
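The rewritten `enabled_pmu_events` expression builds the same set as before, using a set display with unpacking instead of `set()` around a tuple. A small sketch with made-up event names (the lists and the `pmu_control_events` contents below are placeholders, not the script's real values):

```python
# Placeholder values standing in for the parsed command-line arguments.
pmu_dump_stats_on = ["SW_INCR", "CPU_CYCLES"]
pmu_reset_stats_on = ["CPU_CYCLES", "INST_RETIRED"]

old_style = set((*pmu_dump_stats_on, *pmu_reset_stats_on))
new_style = {*pmu_dump_stats_on, *pmu_reset_stats_on}

# Placeholder mapping; only its keys matter for the intersection.
pmu_control_events = {"SW_INCR": 0, "UNUSED_EVENT": 1}
exit_sim_on_control = bool(new_style & set(pmu_control_events.keys()))
print(sorted(new_style), exit_sim_on_control)
```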

@@ -39,8 +39,8 @@ import m5
from m5.objects import *
m5.util.addToPath("../../")
from common.Caches import *
from common import ObjectList
from common.Caches import *
have_kvm = "ArmV8KvmCPU" in ObjectList.cpu_list.get_names()
have_fastmodel = "FastModelCortexA76" in ObjectList.cpu_list.get_names()
@@ -52,7 +52,7 @@ class L1I(L1_ICache):
response_latency = 1
mshrs = 4
tgts_per_mshr = 8
size = "48kB"
size = "48KiB"
assoc = 3
@@ -62,7 +62,7 @@ class L1D(L1_DCache):
response_latency = 1
mshrs = 16
tgts_per_mshr = 16
size = "32kB"
size = "32KiB"
assoc = 2
write_buffers = 16
@@ -73,14 +73,14 @@ class L2(L2Cache):
response_latency = 5
mshrs = 32
tgts_per_mshr = 8
size = "1MB"
size = "1MiB"
assoc = 16
write_buffers = 8
clusivity = "mostly_excl"
class L3(Cache):
size = "16MB"
size = "16MiB"
assoc = 16
tag_latency = 20
data_latency = 20
@@ -338,56 +338,15 @@ class FastmodelCluster(CpuCluster):
pass
class BaseSimpleSystem(ArmSystem):
cache_line_size = 64
def __init__(self, mem_size, platform, **kwargs):
super(BaseSimpleSystem, self).__init__(**kwargs)
self.voltage_domain = VoltageDomain(voltage="1.0V")
self.clk_domain = SrcClockDomain(
clock="1GHz", voltage_domain=Parent.voltage_domain
)
if platform is None:
self.realview = VExpress_GEM5_V1()
else:
self.realview = platform
if hasattr(self.realview.gic, "cpu_addr"):
self.gic_cpu_addr = self.realview.gic.cpu_addr
self.terminal = Terminal()
self.vncserver = VncServer()
self.iobus = IOXBar()
# Device DMA -> MEM
self.mem_ranges = self.getMemRanges(int(Addr(mem_size)))
class ClusterSystem:
"""
Base class providing cpu clusters generation/handling methods to
SE/FS systems
"""
def __init__(self, **kwargs):
self._clusters = []
def getMemRanges(self, mem_size):
"""
Define system memory ranges. This depends on the physical
memory map provided by the realview platform and by the memory
size provided by the user (mem_size argument).
The method is iterating over all platform ranges until they cover
the entire user's memory requirements.
"""
mem_ranges = []
for mem_range in self.realview._mem_regions:
size_in_range = min(mem_size, mem_range.size())
mem_ranges.append(
AddrRange(start=mem_range.start, size=size_in_range)
)
mem_size -= size_in_range
if mem_size == 0:
return mem_ranges
raise ValueError("memory size too big for platform capabilities")
def numCpuClusters(self):
return len(self._clusters)
@@ -423,13 +382,87 @@ class BaseSimpleSystem(ArmSystem):
cluster.connectMemSide(cluster_mem_bus)
class SimpleSeSystem(System, ClusterSystem):
"""
Example system class for syscall emulation mode
"""
# Use a fixed cache line size of 64 bytes
cache_line_size = 64
def __init__(self, **kwargs):
System.__init__(self, **kwargs)
ClusterSystem.__init__(self, **kwargs)
# Create a voltage and clock domain for system components
self.voltage_domain = VoltageDomain(voltage="3.3V")
self.clk_domain = SrcClockDomain(
clock="1GHz", voltage_domain=self.voltage_domain
)
# Create the off-chip memory bus.
self.membus = SystemXBar()
def connect(self):
self.system_port = self.membus.cpu_side_ports
class BaseSimpleSystem(ArmSystem, ClusterSystem):
cache_line_size = 64
def __init__(self, mem_size, platform, **kwargs):
ArmSystem.__init__(self, **kwargs)
ClusterSystem.__init__(self, **kwargs)
self.voltage_domain = VoltageDomain(voltage="1.0V")
self.clk_domain = SrcClockDomain(
clock="1GHz", voltage_domain=Parent.voltage_domain
)
if platform is None:
self.realview = VExpress_GEM5_V1()
else:
self.realview = platform
if hasattr(self.realview.gic, "cpu_addr"):
self.gic_cpu_addr = self.realview.gic.cpu_addr
self.terminal = Terminal()
self.vncserver = VncServer()
self.iobus = IOXBar()
# Device DMA -> MEM
self.mem_ranges = self.getMemRanges(int(Addr(mem_size)))
def getMemRanges(self, mem_size):
"""
Define system memory ranges. This depends on the physical
memory map provided by the realview platform and by the memory
size provided by the user (mem_size argument).
The method is iterating over all platform ranges until they cover
the entire user's memory requirements.
"""
mem_ranges = []
for mem_range in self.realview._mem_regions:
size_in_range = min(mem_size, mem_range.size())
mem_ranges.append(
AddrRange(start=mem_range.start, size=size_in_range)
)
mem_size -= size_in_range
if mem_size == 0:
return mem_ranges
raise ValueError("memory size too big for platform capabilities")
class SimpleSystem(BaseSimpleSystem):
"""
Meant to be used with the classic memory model
"""
def __init__(self, caches, mem_size, platform=None, **kwargs):
super(SimpleSystem, self).__init__(mem_size, platform, **kwargs)
super().__init__(mem_size, platform, **kwargs)
self.membus = MemBus()
# CPUs->PIO
@@ -468,7 +501,7 @@ class ArmRubySystem(BaseSimpleSystem):
"""
def __init__(self, mem_size, platform=None, **kwargs):
super(ArmRubySystem, self).__init__(mem_size, platform, **kwargs)
super().__init__(mem_size, platform, **kwargs)
self._dma_ports = []
self._mem_ports = []
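
The `getMemRanges` method moved in this diff walks the platform's memory regions in order and takes as much as it needs from each until the requested size is covered. A minimal standalone sketch of that loop follows, assuming plain `(start, size)` tuples as a stand-in for gem5's `AddrRange` objects. The function name `split_mem_ranges` and the region values in the usage block are hypothetical.

```python
def split_mem_ranges(regions, mem_size):
    """Cover mem_size bytes using the given regions, in order.

    regions: list of (start, size) tuples (stand-in for AddrRange).
    Returns a list of (start, size) tuples describing the used ranges.
    """
    ranges = []
    remaining = mem_size
    for start, size in regions:
        # Take the whole region, or only what is still needed.
        chunk = min(remaining, size)
        ranges.append((start, chunk))
        remaining -= chunk
        if remaining == 0:
            return ranges
    # All regions were consumed and the request is still not satisfied.
    raise ValueError("memory size too big for platform capabilities")


if __name__ == "__main__":
    GiB = 2**30
    # Hypothetical memory map: a 2 GiB region followed by a 30 GiB region.
    platform = [(0x80000000, 2 * GiB), (0x880000000, 30 * GiB)]
    print(split_mem_ranges(platform, 3 * GiB))
```

A request larger than the sum of all regions raises `ValueError`, mirroring the behaviour of the method in the diff.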

@@ -39,11 +39,11 @@
import argparse
import os
import fs_bigLITTLE as bL
import m5
from m5.objects import *
import fs_bigLITTLE as bL
m5.util.addToPath("../../dist")
import sw

@@ -0,0 +1,191 @@
# Copyright (c) 2016-2017, 2022-2023 Arm Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
# not be construed as granting a license to any other intellectual
# property including but not limited to intellectual property relating
# to a hardware implementation of the functionality of the software
# licensed hereunder. You may use the software subject to the license
# terms below provided that you ensure that this notice is replicated
# unmodified and in its entirety in all distributions of the software,
# modified or unmodified, in source code or in binary form.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import argparse
import os
import shlex
import sys
import m5
from m5.objects import *
from m5.util import addToPath
m5.util.addToPath("../..")
import devices
from common import ObjectList
def get_processes(cmd):
"""Interprets commands to run and returns a list of processes"""
cwd = os.getcwd()
multiprocesses = []
for idx, c in enumerate(cmd):
argv = shlex.split(c)
process = Process(pid=100 + idx, cwd=cwd, cmd=argv, executable=argv[0])
process.gid = os.getgid()
print("info: %d. command and arguments: %s" % (idx + 1, process.cmd))
multiprocesses.append(process)
return multiprocesses
def create(args):
"""Create and configure the system object."""
system = devices.SimpleSeSystem(
mem_mode="timing",
)
# Add CPUs to the system. A cluster of CPUs typically has
# private L1 caches and a shared L2 cache.
system.cpu_cluster = devices.ArmCpuCluster(
system,
args.num_cores,
args.cpu_freq,
"1.2V",
ObjectList.cpu_list.get("O3_ARM_v7a_3_Etrace"),
devices.L1I,
devices.L1D,
devices.L2,
)
# Attach the elastic trace probe listener to every CPU in the cluster
for cpu in system.cpu_cluster:
cpu.attach_probe_listener(args.inst_trace_file, args.data_trace_file)
# As elastic trace generation is enabled, make sure the memory system is
# minimal so that compute delays do not include memory access latencies.
# Configure only the compulsory L1 caches for the O3CPU and no further
# cache levels.
system.addCaches(True, last_cache_level=1)
# For elastic trace, override the SimpleMemory latency to 1ns.
system.memory = SimpleMemory(
range=AddrRange(start=0, size=args.mem_size),
latency="1ns",
port=system.membus.mem_side_ports,
)
# Parse the command line and get a list of Processes instances
# that we can pass to gem5.
processes = get_processes(args.commands_to_run)
if len(processes) != args.num_cores:
print(
"Error: Cannot map %d command(s) onto %d CPU(s)"
% (len(processes), args.num_cores)
)
sys.exit(1)
system.workload = SEWorkload.init_compatible(processes[0].executable)
# Assign one workload to each CPU
for cpu, workload in zip(system.cpu_cluster.cpus, processes):
cpu.workload = workload
return system
def main():
parser = argparse.ArgumentParser(epilog=__doc__)
parser.add_argument(
"commands_to_run",
metavar="command(s)",
nargs="+",
help="Command(s) to run",
)
parser.add_argument(
"--inst-trace-file",
action="store",
type=str,
help="""Instruction fetch trace file input to
Elastic Trace probe in a capture simulation and
Trace CPU in a replay simulation""",
default="fetchtrace.proto.gz",
)
parser.add_argument(
"--data-trace-file",
action="store",
type=str,
help="""Data dependency trace file input to
Elastic Trace probe in a capture simulation and
Trace CPU in a replay simulation""",
default="deptrace.proto.gz",
)
parser.add_argument("--cpu-freq", type=str, default="4GHz")
parser.add_argument(
"--num-cores", type=int, default=1, help="Number of CPU cores"
)
parser.add_argument(
"--mem-size",
action="store",
type=str,
default="2GiB",
help="Specify the physical memory size",
)
args = parser.parse_args()
# Create a single root node for gem5's object hierarchy. There can
# only exist one root node in the simulator at any given
# time. Tell gem5 that we want to use syscall emulation mode
# instead of full system mode.
root = Root(full_system=False)
# Populate the root node with a system. A system corresponds to a
# single node with shared memory.
root.system = create(args)
# Instantiate the C++ object hierarchy. After this point,
# SimObjects can't be instantiated anymore.
m5.instantiate()
# Start the simulator. This gives control to the C++ world and
# starts the simulator. The returned event tells the simulation
# script why the simulator exited.
event = m5.simulate()
# Print the reason for the simulation exit. Some exit codes are
# requests for service (e.g., checkpoints) from the simulation
# script. We'll just ignore them here and exit.
print(f"{event.getCause()} ({event.getCode()}) @ {m5.curTick()}")
if __name__ == "__m5_main__":
main()
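
The `get_processes` helper in this new file splits each command string with `shlex.split` and assigns process IDs starting at 100. The sketch below reproduces only that splitting step, using plain dictionaries where the script builds gem5 `Process` objects. The name `split_commands` and the example commands are hypothetical.

```python
import shlex


def split_commands(commands, first_pid=100):
    """Split shell-style command strings into per-process descriptions."""
    specs = []
    for idx, command in enumerate(commands):
        # shlex.split honours quoting, so 'large input' stays one argument.
        argv = shlex.split(command)
        specs.append(
            {"pid": first_pid + idx, "executable": argv[0], "cmd": argv}
        )
    return specs


if __name__ == "__main__":
    for spec in split_commands(["./hello", "./bench --size 'large input' -n 4"]):
        print(spec)
```

Because one process is produced per command string, the script can compare the number of commands against `--num-cores` before assigning workloads.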

@@ -39,25 +39,33 @@
import argparse
import os
import sys
import m5
import m5.util
from m5.objects import *
m5.util.addToPath("../../")
from common import FSConfig
from common import SysPaths
from common import ObjectList
from common import Options
from common.cores.arm import ex5_big, ex5_LITTLE
import devices
from devices import AtomicCluster, KvmCluster, FastmodelCluster
from common import (
FSConfig,
ObjectList,
Options,
SysPaths,
)
from common.cores.arm import (
ex5_big,
ex5_LITTLE,
)
from devices import (
AtomicCluster,
FastmodelCluster,
KvmCluster,
)
default_disk = "aarch64-ubuntu-trusty-headless.img"
default_mem_size = "2GB"
default_mem_size = "2GiB"
def _to_ticks(value):
@@ -410,7 +418,8 @@ def build(options):
system.generateDtb(system.workload.dtb_filename)
if devices.have_fastmodel and issubclass(big_model, FastmodelCluster):
from m5 import arm_fast_model as fm, systemc as sc
from m5 import arm_fast_model as fm
from m5 import systemc as sc
# setup FastModels for simulation
fm.setup_simulation("cortexa76")

@@ -39,15 +39,18 @@
import argparse
import os
import m5
from m5.objects import MathExprPowerModel, PowerModel
import fs_bigLITTLE as bL
import m5
from m5.objects import (
MathExprPowerModel,
PowerModel,
)
class CpuPowerOn(MathExprPowerModel):
def __init__(self, cpu_path, **kwargs):
super(CpuPowerOn, self).__init__(**kwargs)
super().__init__(**kwargs)
# 2A per IPC, 3pA per cache miss
# and then convert to Watt
self.dyn = (
@@ -64,7 +67,7 @@ class CpuPowerOff(MathExprPowerModel):
class CpuPowerModel(PowerModel):
def __init__(self, cpu_path, **kwargs):
super(CpuPowerModel, self).__init__(**kwargs)
super().__init__(**kwargs)
self.pm = [
CpuPowerOn(cpu_path), # ON
CpuPowerOff(), # CLK_GATED
@@ -75,7 +78,7 @@ class CpuPowerModel(PowerModel):
class L2PowerOn(MathExprPowerModel):
def __init__(self, l2_path, **kwargs):
super(L2PowerOn, self).__init__(**kwargs)
super().__init__(**kwargs)
# Example to report l2 Cache overallAccesses
# The estimated power is converted to Watt and will vary based
# on the size of the cache
@@ -90,7 +93,7 @@ class L2PowerOff(MathExprPowerModel):
class L2PowerModel(PowerModel):
def __init__(self, l2_path, **kwargs):
super(L2PowerModel, self).__init__(**kwargs)
super().__init__(**kwargs)
# Choose a power model for every power state
self.pm = [
L2PowerOn(l2_path), # ON
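
Several hunks in this file replace `super(ClassName, self).__init__(...)` with the zero-argument `super().__init__(...)`. In Python 3 the two forms resolve to the same parent initializer inside a method body, as the small sketch below shows. The classes here are hypothetical stand-ins, not gem5's power model classes.

```python
class Base:
    def __init__(self, **kwargs):
        self.kwargs = kwargs


class OldStyle(Base):
    def __init__(self, path, **kwargs):
        # Explicit form: names the class and the instance.
        super(OldStyle, self).__init__(**kwargs)
        self.path = path


class NewStyle(Base):
    def __init__(self, path, **kwargs):
        # Zero-argument form: Python 3 fills in class and instance.
        super().__init__(**kwargs)
        self.path = path


if __name__ == "__main__":
    old = OldStyle("system.cpu", scale=2)
    new = NewStyle("system.cpu", scale=2)
    print(old.kwargs == new.kwargs, old.path == new.path)
```

The shorter form also survives a class rename, since the class name is no longer repeated inside the method.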

@@ -33,24 +33,28 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import argparse
import os
import m5
from m5.util import addToPath
from m5.objects import *
from m5.options import *
import argparse
from m5.util import addToPath
m5.util.addToPath("../..")
from common import MemConfig
from common import ObjectList
from common import Options
from common import SysPaths
from common.cores.arm import O3_ARM_v7a, HPI
from ruby import Ruby
import devices
from common import (
MemConfig,
ObjectList,
Options,
SysPaths,
)
from common.cores.arm import (
HPI,
O3_ARM_v7a,
)
from ruby import Ruby
default_kernel = "vmlinux.arm64"
default_disk = "linaro-minimal-aarch64.img"
@@ -274,10 +278,10 @@ def main():
parser.add_argument("--num-dirs", type=int, default=1)
parser.add_argument("--num-l2caches", type=int, default=1)
parser.add_argument("--num-l3caches", type=int, default=1)
parser.add_argument("--l1d_size", type=str, default="64kB")
parser.add_argument("--l1i_size", type=str, default="32kB")
parser.add_argument("--l2_size", type=str, default="2MB")
parser.add_argument("--l3_size", type=str, default="16MB")
parser.add_argument("--l1d_size", type=str, default="64KiB")
parser.add_argument("--l1i_size", type=str, default="32KiB")
parser.add_argument("--l2_size", type=str, default="2MiB")
parser.add_argument("--l3_size", type=str, default="16MiB")
parser.add_argument("--l1d_assoc", type=int, default=2)
parser.add_argument("--l1i_assoc", type=int, default=2)
parser.add_argument("--l2_assoc", type=int, default=8)

@@ -38,22 +38,26 @@ Research Starter Kit on System Modeling. More information can be found
at: http://www.arm.com/ResearchEnablement/SystemModeling
"""
import argparse
import os
import m5
from m5.util import addToPath
from m5.objects import *
from m5.options import *
import argparse
from m5.util import addToPath
m5.util.addToPath("../..")
from common import SysPaths
from common import ObjectList
from common import MemConfig
from common.cores.arm import O3_ARM_v7a, HPI
import devices
from common import (
MemConfig,
ObjectList,
SysPaths,
)
from common.cores.arm import (
HPI,
O3_ARM_v7a,
)
default_kernel = "vmlinux.arm64"
default_disk = "linaro-minimal-aarch64.img"
@@ -272,7 +276,7 @@ def main():
"--mem-size",
action="store",
type=str,
default="2GB",
default="2GiB",
help="Specify the physical memory size",
)
parser.add_argument(

@@ -1,4 +1,4 @@
# Copyright (c) 2016-2017, 2022-2023 Arm Limited
# Copyright (c) 2016-2017, 2022-2024 Arm Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
@@ -38,98 +38,42 @@ Research Starter Kit on System Modeling. More information can be found
at: http://www.arm.com/ResearchEnablement/SystemModeling
"""
import os
import m5
from m5.util import addToPath
from m5.objects import *
import argparse
import os
import shlex
import m5
from m5.objects import *
from m5.util import addToPath
m5.util.addToPath("../..")
from common import ObjectList
from common import MemConfig
from common.cores.arm import HPI
import devices
from common import (
MemConfig,
ObjectList,
)
from common.cores.arm import (
HPI,
O3_ARM_v7a,
)
# Pre-defined CPU configurations. Each tuple must be ordered as : (cpu_class,
# l1_icache_class, l1_dcache_class, walk_cache_class, l2_Cache_class). Any of
# l1_icache_class, l1_dcache_class, l2_Cache_class). Any of
# the cache class may be 'None' if the particular cache is not present.
cpu_types = {
"atomic": (AtomicSimpleCPU, None, None, None),
"minor": (MinorCPU, devices.L1I, devices.L1D, devices.L2),
"hpi": (HPI.HPI, HPI.HPI_ICache, HPI.HPI_DCache, HPI.HPI_L2),
"o3": (
O3_ARM_v7a.O3_ARM_v7a_3,
O3_ARM_v7a.O3_ARM_v7a_ICache,
O3_ARM_v7a.O3_ARM_v7a_DCache,
O3_ARM_v7a.O3_ARM_v7aL2,
),
}
class SimpleSeSystem(System):
"""
Example system class for syscall emulation mode
"""
# Use a fixed cache line size of 64 bytes
cache_line_size = 64
def __init__(self, args, **kwargs):
super(SimpleSeSystem, self).__init__(**kwargs)
# Setup book keeping to be able to use CpuClusters from the
# devices module.
self._clusters = []
self._num_cpus = 0
# Create a voltage and clock domain for system components
self.voltage_domain = VoltageDomain(voltage="3.3V")
self.clk_domain = SrcClockDomain(
clock="1GHz", voltage_domain=self.voltage_domain
)
# Create the off-chip memory bus.
self.membus = SystemXBar()
# Wire up the system port that gem5 uses to load the kernel
# and to perform debug accesses.
self.system_port = self.membus.cpu_side_ports
# Add CPUs to the system. A cluster of CPUs typically have
# private L1 caches and a shared L2 cache.
self.cpu_cluster = devices.ArmCpuCluster(
self,
args.num_cores,
args.cpu_freq,
"1.2V",
*cpu_types[args.cpu],
tarmac_gen=args.tarmac_gen,
tarmac_dest=args.tarmac_dest,
)
# Create a cache hierarchy (unless we are simulating a
# functional CPU in atomic memory mode) for the CPU cluster
# and connect it to the shared memory bus.
if self.cpu_cluster.memory_mode() == "timing":
self.cpu_cluster.addL1()
self.cpu_cluster.addL2(self.cpu_cluster.clk_domain)
self.cpu_cluster.connectMemSide(self.membus)
# Tell gem5 about the memory mode used by the CPUs we are
# simulating.
self.mem_mode = self.cpu_cluster.memory_mode()
def numCpuClusters(self):
return len(self._clusters)
def addCpuCluster(self, cpu_cluster):
assert cpu_cluster not in self._clusters
assert len(cpu_cluster) > 0
self._clusters.append(cpu_cluster)
self._num_cpus += len(cpu_cluster)
def numCpus(self):
return self._num_cpus
def get_processes(cmd):
"""Interprets commands to run and returns a list of processes"""
@@ -150,7 +94,31 @@ def get_processes(cmd):
def create(args):
"""Create and configure the system object."""
system = SimpleSeSystem(args)
cpu_class = cpu_types[args.cpu][0]
mem_mode = cpu_class.memory_mode()
# Only simulate caches when using a timing CPU (e.g., the HPI model)
want_caches = mem_mode == "timing"
system = devices.SimpleSeSystem(
mem_mode=mem_mode,
)
# Add CPUs to the system. A cluster of CPUs typically has
# private L1 caches and a shared L2 cache.
system.cpu_cluster = devices.ArmCpuCluster(
system,
args.num_cores,
args.cpu_freq,
"1.2V",
*cpu_types[args.cpu],
tarmac_gen=args.tarmac_gen,
tarmac_dest=args.tarmac_dest,
)
# Create a cache hierarchy for the cluster. We are assuming that
# clusters have core-private L1 caches and an L2 that's shared
# within the cluster.
system.addCaches(want_caches, last_cache_level=2)
# Tell components about the expected physical memory ranges. This
# is, for example, used by the MemConfig helper to determine where
@@ -160,6 +128,9 @@ def create(args):
# Configure the off-chip memory system.
MemConfig.config_mem(args, system)
# Wire up the system's memory system
system.connect()
# Parse the command line and get a list of Processes instances
# that we can pass to gem5.
processes = get_processes(args.commands_to_run)
@@ -218,7 +189,7 @@ def main():
"--mem-size",
action="store",
type=str,
default="2GB",
default="2GiB",
help="Specify the physical memory size",
)
parser.add_argument(
@@ -232,6 +203,19 @@ def main():
default="stdoutput",
help="Destination for the Tarmac trace output. [Default: stdoutput]",
)
parser.add_argument(
"-P",
"--param",
action="append",
default=[],
help="Set a SimObject parameter relative to the root node. "
"An extended Python multi range slicing syntax can be used "
"for arrays. For example: "
"'system.cpu[0,1,3:8:2].max_insts_all_threads = 42' "
"sets max_insts_all_threads for cpus 0, 1, 3, 5 and 7. "
"Direct parameters of the root object are not accessible, "
"only parameters of its children.",
)
args = parser.parse_args()
@@ -244,6 +228,7 @@ def main():
# Populate the root node with a system. A system corresponds to a
# single node with shared memory.
root.system = create(args)
root.apply_config(args.param)
# Instantiate the C++ object hierarchy. After this point,
# SimObjects can't be instantiated anymore.

@@ -35,13 +35,17 @@
#
import inspect
from common.ObjectList import ObjectList
from common.SysPaths import (
binary,
disk,
)
import m5
from m5.objects import *
from m5.options import *
from common.ObjectList import ObjectList
from common.SysPaths import binary, disk
class ArmBaremetal(ArmFsWorkload):
"""Baremetal workload"""
@@ -49,7 +53,7 @@ class ArmBaremetal(ArmFsWorkload):
dtb_addr = 0
def __init__(self, obj, system, **kwargs):
super(ArmBaremetal, self).__init__(**kwargs)
super().__init__(**kwargs)
self.object_file = obj
@@ -76,7 +80,7 @@ class ArmTrustedFirmware(ArmFsWorkload):
dtb_addr = 0
def __init__(self, obj, system, **kwargs):
super(ArmTrustedFirmware, self).__init__(**kwargs)
super().__init__(**kwargs)
self.extras = [binary("bl1.bin"), binary("fip.bin")]
self.extras_addrs = [

@@ -0,0 +1,201 @@
# Copyright (c) 2024 ARM Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
# not be construed as granting a license to any other intellectual
# property including but not limited to intellectual property relating
# to a hardware implementation of the functionality of the software
# licensed hereunder. You may use the software subject to the license
# terms below provided that you ensure that this notice is replicated
# unmodified and in its entirety in all distributions of the software,
# modified or unmodified, in source code or in binary form.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# This script showcases the functionality of cache partitioning policies,
# containing a simple system comprised of a memory requestor (TrafficGen),
# a cache enforcing policies for requests and a SimpleMemory backing store.
#
# Using the Way policy, the cache should show the following statistics in the
# provided configuration:
#
# | Allocated Ways | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
# |----------------|---|-----|-----|-----|-----|-----|-----|------|
# | Cache Hits | 0 | 256 | 384 | 512 | 640 | 768 | 896 | 1024 |
#
# Using the MaxCapacity policy, expected results are the following:
#
# | Allocation % | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
# |--------------|----|-----|-----|-----|-----|-----|-----|-----|-----|------|
# | Cache Hits | 0 | 152 | 307 | 409 | 512 | 614 | 716 | 819 | 921 | 1024 |
import argparse
import m5
from m5.objects import *
def capacityAllocation(capacity_str):
"""
Verify that Max Capacity partitioning policy has been provided with a suitable
configuration
"""
capacity = float(capacity_str)
if capacity > 1 or capacity < 0:
raise argparse.ArgumentTypeError(
"Max Capacity Policy needs allocation in range [0, 1]"
)
return capacity
def wayAllocation(way_str):
"""
Verify that Way partitioning policy has been provided with a suitable
configuration
"""
way_alloc = int(way_str)
if way_alloc < 0:
raise argparse.ArgumentTypeError(
"Way Policy needs positive number of ways"
)
return way_alloc
def generatePartPolicy(args):
"""
Generate Partitioning Policy object based on provided arguments
"""
assert args.policy in [
"way",
"max_capacity",
], "Only support generating way and max_capacity policies"
if args.policy == "way":
allocated_ways = list(range(args.way_allocation))
allocation = WayPolicyAllocation(partition_id=0, ways=allocated_ways)
return WayPartitioningPolicy(allocations=[allocation])
if args.policy == "max_capacity":
return MaxCapacityPartitioningPolicy(
partition_ids=[0], capacities=[args.capacity_allocation]
)
def configSystem():
"""
Configure base system and memory
"""
system = System(membus=IOXBar(width=128))
system.clk_domain = SrcClockDomain(
clock="10THz",
voltage_domain=VoltageDomain(),
)
# Memory configuration
system.mem_ctrl = SimpleMemory(bandwidth="1GiB/s", latency="10ns")
# add memory
system.mem_ctrl.range = AddrRange("64KiB")
system.mem_ctrl.port = system.membus.mem_side_ports
return system
parser = argparse.ArgumentParser(
formatter_class=argparse.ArgumentDefaultsHelpFormatter
)
parser.add_argument(
"--policy",
default="way",
choices=["way", "max_capacity"],
help="This option defines which Cache Partitioning Policy to use for "
"the system cache",
)
parser.add_argument(
"--capacity-allocation",
type=capacityAllocation,
default=0.5,
help="The amount of the cache to partition to the default PartitionID "
"when using Max Capacity Cache Partitioning Policy in [0,1] range",
)
parser.add_argument(
"--way-allocation",
type=wayAllocation,
default=4,
help="The number of ways in the cache to partition to the default "
"PartitionID when using Way Cache Partitioning Policy",
)
args = parser.parse_args()
m5.ticks.setGlobalFrequency("10THz")
system = configSystem()
# create a cache to sit between the memory and traffic gen to enforce
# partitioning policies
part_manager = PartitionManager(
partitioning_policies=[generatePartPolicy(args)]
)
system.cache = NoncoherentCache(
size="64KiB",
assoc=8,
partitioning_manager=part_manager,
tag_latency=0,
data_latency=0,
response_latency=0,
mshrs=1,
tgts_per_mshr=8,
write_buffers=1,
replacement_policy=MRURP(),
)
system.cache.mem_side = system.membus.cpu_side_ports
# instantiate traffic gen and connect to crossbar
system.tgen = PyTrafficGen()
system.tgen.port = system.cache.cpu_side
# finalise config and run simulation
root = Root(full_system=False, system=system)
root.system.mem_mode = "timing"
m5.instantiate()
# configure traffic generator to do 2x 64KiB sequential reads from address 0
# to 65536: one to warm up the cache and one to test cache partitioning
linear_tgen = system.tgen.createLinear(
1000000000, 0, 65536, 64, 1, 1, 100, 65536
)
exit_tgen = system.tgen.createExit(1)
system.tgen.start([linear_tgen, linear_tgen, exit_tgen])
# handle exit reporting
exit_event = m5.simulate(2000000000)
print(f"Exiting @ tick {m5.curTick()} because {exit_event.getCause()}")
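
The Way policy table in the header comment of this script can be reproduced with a small standalone model. The sketch below is a simplified approximation, not gem5's cache implementation: it assumes a 64KiB, 8-way cache with 64-byte lines, that the requestor may only use the first `allocated_ways` ways of each set, that empty ways are filled before anything is evicted, and that MRU replacement evicts the most recently touched line. The function name `count_hits` is hypothetical.

```python
def count_hits(allocated_ways, cache_bytes=64 * 1024, total_ways=8,
               line_bytes=64):
    """Count hits for two sequential sweeps over cache_bytes of memory."""
    if not 1 <= allocated_ways <= total_ways:
        raise ValueError("allocated_ways must be in [1, total_ways]")
    num_sets = cache_bytes // (line_bytes * total_ways)
    # Per set: tags ordered from least to most recently used.
    sets = [[] for _ in range(num_sets)]
    hits = 0
    for _sweep in range(2):
        for addr in range(0, cache_bytes, line_bytes):
            line = addr // line_bytes
            ways = sets[line % num_sets]
            tag = line // num_sets
            if tag in ways:
                hits += 1
                ways.remove(tag)  # refresh recency on a hit
            elif len(ways) >= allocated_ways:
                ways.pop()  # MRU: evict the most recently used line
            ways.append(tag)
    return hits


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, count_hits(n))
```

With a single way, each set behaves as direct-mapped and the eight lines that map to it evict one another, so the second sweep never hits. With two or more ways, MRU keeps all but one way untouched during the first sweep, which is why the hit count grows by one line per set for each extra way.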

@@ -25,7 +25,6 @@
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import m5
from m5.objects import *
traffic_gen = PyTrafficGen()
@@ -37,9 +36,8 @@ system.mem_mode = "timing"
system.cpu = traffic_gen
dramsys = DRAMSys(
configuration="ext/dramsys/DRAMSys/DRAMSys/"
"library/resources/simulations/ddr4-example.json",
resource_directory="ext/dramsys/DRAMSys/DRAMSys/library/resources",
configuration="ext/dramsys/DRAMSys/configs/ddr4-example.json",
resource_directory="ext/dramsys/DRAMSys/configs",
)
system.target = dramsys

@@ -1,4 +1,4 @@
# Copyright (c) 2015 ARM Limited
# Copyright (c) 2015, 2023 Arm Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
@@ -37,16 +37,53 @@
import argparse
from m5.util import addToPath, fatal
from m5.util import (
addToPath,
fatal,
)
addToPath("../")
from common import Options
from common import Simulation
from common import CacheConfig
from common import MemConfig
from common import (
MemConfig,
Options,
Simulation,
)
from common.Caches import *
def config_cache(args, system):
"""
Configure the cache hierarchy. Only two configurations are natively
supported as an example: L1(I/D) only or L1 + L2.
"""
from common.CacheConfig import _get_cache_opts
system.l1i = L1_ICache(**_get_cache_opts("l1i", args))
system.l1d = L1_DCache(**_get_cache_opts("l1d", args))
system.cpu.dcache_port = system.l1d.cpu_side
system.cpu.icache_port = system.l1i.cpu_side
if args.l2cache:
# Provide a clock for the L2 and the L1-to-L2 bus here as they
# are not connected using addTwoLevelCacheHierarchy. Use the
# same clock as the CPUs.
system.l2 = L2Cache(
clk_domain=system.cpu_clk_domain, **_get_cache_opts("l2", args)
)
system.tol2bus = L2XBar(clk_domain=system.cpu_clk_domain)
system.l2.cpu_side = system.tol2bus.mem_side_ports
system.l2.mem_side = system.membus.cpu_side_ports
system.l1i.mem_side = system.tol2bus.cpu_side_ports
system.l1d.mem_side = system.tol2bus.cpu_side_ports
else:
system.l1i.mem_side = system.membus.cpu_side_ports
system.l1d.mem_side = system.membus.cpu_side_ports
parser = argparse.ArgumentParser()
Options.addCommonOptions(parser)
@@ -59,29 +96,18 @@ if "--ruby" in sys.argv:
args = parser.parse_args()
numThreads = 1
if args.cpu_type != "TraceCPU":
fatal(
"This is a script for elastic trace replay simulation, use "
"--cpu-type=TraceCPU\n"
)
if args.num_cpus > 1:
fatal("This script does not support multi-processor trace replay.\n")
# In this case FutureClass will be None as there is not fast forwarding or
# switching
(CPUClass, test_mem_mode, FutureClass) = Simulation.setCPUClass(args)
CPUClass.numThreads = numThreads
system = System(
cpu=CPUClass(cpu_id=0),
mem_mode=test_mem_mode,
mem_mode=TraceCPU.memory_mode(),
mem_ranges=[AddrRange(args.mem_size)],
cache_line_size=args.cacheline_size,
)
# Generate the TraceCPU
system.cpu = TraceCPU()
# Create a top-level voltage domain
system.voltage_domain = VoltageDomain(voltage=args.sys_voltage)
@@ -105,11 +131,6 @@ system.cpu_clk_domain = SrcClockDomain(
for cpu in system.cpu:
cpu.clk_domain = system.cpu_clk_domain
# BaseCPU no longer has default values for the BaseCPU.isa
# createThreads() is needed to fill in the cpu.isa
for cpu in system.cpu:
cpu.createThreads()
# Assign input trace files to the Trace CPU
system.cpu.instTraceFile = args.inst_trace_file
system.cpu.dataTraceFile = args.data_trace_file
@@ -118,8 +139,11 @@ system.cpu.dataTraceFile = args.data_trace_file
MemClass = Simulation.setMemClass(args)
system.membus = SystemXBar()
system.system_port = system.membus.cpu_side_ports
CacheConfig.config_cache(args, system)
# Configure the classic cache hierarchy
config_cache(args, system)
MemConfig.config_mem(args, system)
root = Root(full_system=False, system=system)
Simulation.run(args, root, system, FutureClass)
Simulation.run(args, root, system, None)


@@ -26,11 +26,14 @@
#
# Author: Tushar Krishna
import argparse
import os
import sys
import m5
from m5.objects import *
from m5.defines import buildEnv
from m5.objects import *
from m5.util import addToPath
import os, argparse, sys
addToPath("../")


@@ -0,0 +1,92 @@
# Copyright (c) 2024 The Regents of the University of California
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""
This script further shows an example of booting an ARM based full system Ubuntu
disk image. This simulation boots the disk image using the ArmDemoBoard.
Usage
-----
```bash
scons build/ARM/gem5.opt -j $(nproc)
./build/ARM/gem5.opt configs/example/gem5_library/arm-demo-ubuntu-run.py
```
"""
import argparse
from gem5.isas import ISA
from gem5.prebuilt.demo.arm_demo_board import ArmDemoBoard
from gem5.resources.resource import obtain_resource
from gem5.simulate.exit_event import ExitEvent
from gem5.simulate.simulator import Simulator
from gem5.utils.requires import requires
# This runs a check to ensure the gem5 binary interpreting this file is compiled to include the ARM ISA.
requires(isa_required=ISA.ARM)
parser = argparse.ArgumentParser(
description="An example configuration script to run the ArmDemoBoard."
)
parser.add_argument(
"--use-kvm",
action="store_true",
help="Use KVM cores instead of Timing.",
)
args = parser.parse_args()
board = ArmDemoBoard(use_kvm=args.use_kvm)
board.set_workload(
obtain_resource(
"arm-ubuntu-24.04-boot-with-systemd", resource_version="2.0.0"
)
)
def exit_event_handler():
print("First exit: kernel booted")
yield False # gem5 is now executing systemd startup
print("Second exit: Started `after_boot.sh` script")
# The after_boot.sh script is executed after the kernel and systemd have
# booted.
yield False # gem5 is now executing the `after_boot.sh` script
print("Third exit: Finished `after_boot.sh` script")
# The after_boot.sh script will run a script if it is passed via
# m5 readfile. This is the last exit event before the simulation exits.
yield True
# We define the simulator with the board configured above.
simulator = Simulator(
board=board,
on_exit_event={
ExitEvent.EXIT: exit_event_handler(),
},
)
simulator.run()
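
The generator-based `exit_event_handler` in this file can be modelled without gem5. The following is a minimal sketch, assuming the simulator calls `next()` on the handler once per exit event and stops as soon as the handler yields `True`. `run_toy_simulation` is a hypothetical stand-in written for illustration; it is not gem5's `Simulator`.

```python
from typing import Generator


def exit_event_handler() -> Generator[bool, None, None]:
    """Same shape as the handler in the script: one yield per exit event."""
    print("First exit: kernel booted")
    yield False  # keep simulating
    print("Second exit: started after-boot script")
    yield False  # keep simulating
    print("Third exit: finished after-boot script")
    yield True  # stop the simulation


def run_toy_simulation(
    handler: Generator[bool, None, None], num_exit_events: int
) -> int:
    """Hypothetical driver: advance the handler once per exit event.

    Returns the number of exit events handled before stopping.
    """
    handled = 0
    for _ in range(num_exit_events):
        handled += 1
        should_stop = next(handler)  # run the handler up to its next yield
        if should_stop:
            break
    return handled


if __name__ == "__main__":
    # Even if more exit events could occur, the third one ends the run.
    print(run_toy_simulation(exit_event_handler(), num_exit_events=10))
```

Note that the script passes `exit_event_handler()` (a generator object), not the function itself, so the handler's position is remembered between exit events.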


@@ -43,13 +43,18 @@ scons build/ARM/gem5.opt
from gem5.isas import ISA
from gem5.utils.requires import requires
from gem5.resources.resource import Resource
from gem5.resources.resource import BinaryResource
from gem5.components.memory import SingleChannelDDR3_1600
from gem5.components.processors.cpu_types import CPUTypes
from gem5.components.boards.simple_board import SimpleBoard
from gem5.components.cachehierarchies.classic.no_cache import NoCache
from gem5.components.memory import SingleChannelDDR3_1600
from gem5.components.processors.cpu_types import CPUTypes
from gem5.components.processors.simple_processor import SimpleProcessor
from gem5.isas import ISA
from gem5.resources.resource import obtain_resource
from gem5.simulate.simulator import Simulator
from gem5.utils.requires import requires
# This check ensures the gem5 binary is compiled to the ARM ISA target. If not,
# an exception will be thrown.
@@ -59,12 +64,12 @@ requires(isa_required=ISA.ARM)
cache_hierarchy = NoCache()
# We use a single channel DDR3_1600 memory system
memory = SingleChannelDDR3_1600(size="32MB")
memory = SingleChannelDDR3_1600(size="32MiB")
# We use a simple Timing processor with one core.
processor = SimpleProcessor(cpu_type=CPUTypes.TIMING, isa=ISA.ARM, num_cores=1)
# The gem5 library simble board which can be used to run simple SE-mode
# The gem5 library simple board which can be used to run simple SE-mode
# simulations.
board = SimpleBoard(
clk_freq="3GHz",
@@ -84,7 +89,7 @@ board.set_se_binary_workload(
# Any resource specified in this file will be automatically retrieved.
# At the time of writing, this file is a WIP and does not contain all
# resources. Jira ticket: https://gem5.atlassian.net/browse/GEM5-1096
Resource("arm-hello64-static")
BinaryResource("physical")
)
# Lastly we run the simulation.
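
The change from `size="32MB"` to `size="32MiB"` in this hunk makes the unit explicit. As plain arithmetic, independent of how gem5 parses either string, decimal and binary prefixes give different byte counts. The helper and constant names below are illustrative only and are not part of gem5.

```python
SI_MEGABYTE = 10**6  # decimal prefix: 1 MB = 1,000,000 bytes
MEBIBYTE = 2**20  # binary prefix: 1 MiB = 1,048,576 bytes


def size_in_bytes(count: int, unit_bytes: int) -> int:
    """Convert a count of units into bytes."""
    return count * unit_bytes


if __name__ == "__main__":
    print(size_in_bytes(32, SI_MEGABYTE))  # 32000000
    print(size_in_bytes(32, MEBIBYTE))  # 33554432
```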


@@ -0,0 +1,140 @@
# Copyright (c) 2022-23 The Regents of the University of California
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""
This script further shows an example of booting an ARM based full system Ubuntu
disk image. This simulation boots the disk image using 2 KVM CPU cores and
switches to TIMING cores once the `after_boot.sh` script starts. The simulation
ends at the third exit event, reached when `after_boot.sh` has finished.
Usage
-----
```
scons build/ARM/gem5.opt -j<NUM_CPUS>
./build/ARM/gem5.opt configs/example/gem5_library/arm-ubuntu-run-with-kvm.py
```
"""
from m5.objects import (
ArmDefaultRelease,
VExpress_GEM5_V1,
)
from gem5.coherence_protocol import CoherenceProtocol
from gem5.components.boards.arm_board import ArmBoard
from gem5.components.memory import DualChannelDDR4_2400
from gem5.components.processors.cpu_types import CPUTypes
from gem5.components.processors.simple_switchable_processor import (
SimpleSwitchableProcessor,
)
from gem5.isas import ISA
from gem5.resources.resource import obtain_resource
from gem5.simulate.exit_event import ExitEvent
from gem5.simulate.simulator import Simulator
from gem5.utils.requires import requires
# This runs a check to ensure the gem5 binary is compiled for ARM.
requires(isa_required=ISA.ARM)
from gem5.components.cachehierarchies.classic.private_l1_private_l2_cache_hierarchy import (
PrivateL1PrivateL2CacheHierarchy,
)
# Here we setup the parameters of the l1 and l2 caches.
cache_hierarchy = PrivateL1PrivateL2CacheHierarchy(
l1d_size="16KiB", l1i_size="16KiB", l2_size="256KiB"
)
# Memory: Dual Channel DDR4 2400 DRAM device.
memory = DualChannelDDR4_2400(size="2GiB")
# Here we setup the processor. This is a special switchable processor in which
# a starting core type and a switch core type must be specified. Once a
# configuration is instantiated a user may call `processor.switch()` to switch
# from the starting core types to the switch core types. In this simulation
# we start with KVM cores to simulate the OS boot, then switch to the Timing
# cores for the command we wish to run after boot.
processor = SimpleSwitchableProcessor(
starting_core_type=CPUTypes.KVM,
switch_core_type=CPUTypes.TIMING,
isa=ISA.ARM,
num_cores=2,
)
# The ArmBoard requires a `release` to be specified. This adds all the
# extensions or features to the system. We are setting this to for_kvm()
# to enable KVM simulation.
release = ArmDefaultRelease.for_kvm()
# The platform sets up the memory ranges of all the on-chip and off-chip
# devices present on the ARM system. ARM KVM only works with VExpress_GEM5_V1
# on the ArmBoard at the moment.
platform = VExpress_GEM5_V1()
# Here we setup the board. The ArmBoard allows for Full-System ARM simulations.
board = ArmBoard(
clk_freq="3GHz",
processor=processor,
memory=memory,
cache_hierarchy=cache_hierarchy,
release=release,
platform=platform,
)
# Here we set a full system workload. The "arm-ubuntu-24.04-boot-with-systemd" boots
# Ubuntu 24.04.
workload = obtain_resource("arm-ubuntu-24.04-boot-with-systemd")
board.set_workload(workload)
def exit_event_handler():
print("First exit: kernel booted")
yield False # gem5 is now executing systemd startup
print("Second exit: Started `after_boot.sh` script")
# The after_boot.sh script is executed after the kernel and systemd have
# booted.
# Here we switch the CPU type to Timing.
print("Switching to Timing CPU")
processor.switch()
yield False # gem5 is now executing the `after_boot.sh` script
print("Third exit: Finished `after_boot.sh` script")
# The after_boot.sh script will run a script if it is passed via
# m5 readfile. This is the last exit event before the simulation exits.
yield True
simulator = Simulator(
board=board,
on_exit_event={
# Here we override the default behavior for the m5 exit event.
ExitEvent.EXIT: exit_event_handler()
},
)
simulator.run()

Some files were not shown because too many files have changed in this diff