22308 Commits

Author SHA1 Message Date
aperais
b82ab5ac89 misc: Do not share the random number generator across components (#1534)
Components that require randomness should not share their randomness
source with other components to avoid simulation noise. For instance,
the branch predictor of one core should not impact the random
cache replacement policy of the cache of another core. This currently
happens as all components share a single random number generator.
    
This PR provides their own generators to relevant components, although
a couple components still use rand().
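
A minimal sketch of the idea, not gem5's actual classes: each component owns a privately seeded generator, so draws made by one component cannot shift the sequence seen by another. The class names and the use of `std::mt19937_64` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical component: draws only from its own private stream.
class BranchPredictorSketch
{
  public:
    explicit BranchPredictorSketch(std::uint64_t seed) : rng(seed) {}

    // Consumes one value from this component's generator.
    bool randomTaken() { return (rng() & 1) != 0; }

  private:
    std::mt19937_64 rng;
};

// Hypothetical random replacement policy with its own generator.
class RandomReplacementSketch
{
  public:
    RandomReplacementSketch(std::uint64_t seed, unsigned ways)
        : rng(seed), numWays(ways)
    {
        assert(ways > 0);
    }

    // Picks a victim way; unaffected by draws made in other components.
    unsigned victim() { return static_cast<unsigned>(rng() % numWays); }

  private:
    std::mt19937_64 rng;
    unsigned numWays;
};
```

With a single shared generator, every extra branch-predictor draw would change which victim the cache picks next; with private generators the cache's victim sequence depends only on its own seed.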
    
Change-Id: I3fb7226111c9194ee457af0f0f2b83f8c7b69d1e

Co-authored-by: Arthur Perais <arthur.perais@univ-grenoble-alpes.fr>
2024-11-18 01:37:12 -08:00
Jason Lowe-Power
5ae26c0f09 stdlib: Add interface to set binary in fs mode (#1743) 2024-11-18 00:23:59 -08:00
Marleson Graf
c31bc284a8 mem-ruby,sim-se: Fix functional reads for MESI protocols
This commit fixes three issues in MESI_Three_Level and MESI_Two_Level
implementations (MESI_Three_Level_HTM might still have issues).

1) Define functional read priorities for the cache controllers which
have states with Maybe_Stale access permission (L1 > L2 > Directory).

2) Fix incorrect access permissions in MESI_Three_Level-L1cache:
* S_IL0 is Read_Only, it is waiting for L0 to acknowledge the
  invalidation request before moving to SS, also a Read_Only state.
* E_IL0 is Maybe_Stale, its contents might be valid, since there is a
  transition (E_IL0, L0_Ack, EE) with no writeback data.
* M_IL0 is Maybe_Stale, its contents might be valid, since there is a
  transition (M_IL0, L0_Ack, MM) with no writeback data.

3) Add missing message types carrying valid data in functional reads:
* INV_DATA is a writeback from L0 to L1.
* DATA is a response to GET_S, but there are scenarios where it might
  be the only place with valid data (e.g. during L2 replacement).

Change-Id: Ie44fa317027f9ede272967e7461d337e14355eec
2024-11-18 00:22:45 -08:00
Marleson Graf
63d110fb7a mem-ruby,sim-se: Support Maybe_Stale in functional reads
Functional reads can be satisfied by one of the following, in order:
1. Main memory (when the data is not present in the cache hierarchy);
2. Valid data block in cache;
3. Valid data block in coherence message;
4. Valid data block marked as Maybe_Stale;

Number 4 is not handled by the current implementation. A Maybe_Stale
block can be either truly stale or actually valid. When it is stale,
the memory read will be satisfied by either number 2 or number 3. When
it is valid, there will be no coherence message with valid data inside,
and the Maybe_Stale block will transition to a valid state after
receiving some kind of acknowledgement.

The main challenge in handling number 4 is knowing which Maybe_Stale
block the data should be read from. For instance, in a two-level cache
hierarchy, we might have a block marked as Maybe_Stale in
both L1 and L2. In this case, we should prioritize the cache controller
that is closest to the CPU. To define this priority, a new virtual
function 'functionalReadPriority' was added to the AbstractController
class.
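
The selection rule described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the Ruby implementation: `ControllerView` and `functionalReadSketch` are hypothetical names, and the sketch assumes that a smaller priority value means "closer to the CPU".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class AccessPermission { Invalid, Read_Only, Read_Write, Maybe_Stale };

// Hypothetical per-controller view of one block.
struct ControllerView
{
    AccessPermission perm;  // permission the controller reports
    int priority;           // assumption: smaller value = closer to the CPU
    std::uint64_t data;     // the controller's copy of the block
};

// Returns true and fills `out` if a cache, an in-flight message, or a
// Maybe_Stale block can satisfy the read; returns false when the caller
// should fall back to main memory.
bool
functionalReadSketch(const std::vector<ControllerView> &ctrls,
                     const std::uint64_t *inFlightMsgData,
                     std::uint64_t &out)
{
    const ControllerView *stale = nullptr;
    for (const ControllerView &c : ctrls) {
        assert(c.priority >= 0);
        // A valid cached block wins immediately.
        if (c.perm == AccessPermission::Read_Only ||
            c.perm == AccessPermission::Read_Write) {
            out = c.data;
            return true;
        }
        // Remember the Maybe_Stale copy closest to the CPU.
        if (c.perm == AccessPermission::Maybe_Stale &&
            (stale == nullptr || c.priority < stale->priority)) {
            stale = &c;
        }
    }
    // Valid data carried by a coherence message comes next.
    if (inFlightMsgData != nullptr) {
        out = *inFlightMsgData;
        return true;
    }
    // Only then fall back to the best Maybe_Stale candidate.
    if (stale != nullptr) {
        out = stale->data;
        return true;
    }
    return false;
}
```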

Change-Id: I4774cd01aab7bb9ca53694cd9dc4f9416a8e4025
2024-11-18 00:22:36 -08:00
Bobby R. Bruce
78db0e26b2 misc: Merge branch v24.0 stable into v24.1 release staging
For reasons I do not fully understand, the prefetch code was out of sync
between develop and stable.
2024-11-11 13:51:39 -08:00
Simon Lammer
665d32cba2 misc: Fix typo in README.md (#1763) 2024-11-11 10:03:54 -08:00
Yu-Cheng Chang
8b1075b792 arch, cpu: Add generic getValidAddr to correct exetrace symbol table (#1758)
getValidAddr is a method that returns the virtual address with only its
valid bits. It is useful for getting the correct symbol table via a
valid virtual address.

For ARM, we have `purifyTaggedAddr` to get the right virtual address.
For RISC-V, we only get lower 32 bits in RV32 mode to get the right
symbol table.
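
For the RISC-V case, the masking could look roughly like this. `validAddrSketch` and its `rv32` flag are hypothetical and do not reflect gem5's actual getValidAddr signature.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Hypothetical helper: in RV32 mode keep only the lower 32 bits so the
// address can be matched against symbol table entries; otherwise the
// address is returned unchanged.
Addr
validAddrSketch(Addr vaddr, bool rv32)
{
    if (!rv32) {
        return vaddr;
    }
    const Addr masked = vaddr & 0xFFFFFFFFULL;
    assert((masked >> 32) == 0);  // upper half is cleared
    return masked;
}
```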

Change-Id: I33ad7bec6e7ea4ec82cb1b3a7f521432c6d735b6
2024-11-08 13:33:53 +00:00
Noah Krim
ad8bd6b5c7 arch-arm,util-m5: Change arm64's default m5 call type to addr (#1583)
As of PR #977, gem5 has a defined M5OPS_ADDR for arm64, even if it is
constrained to certain conditions. With that change and the arm64 board
KVM support (#725), it seems like interaction with the m5 utility under
arm64 will most commonly occur in KVM -- where instruction-mode does not
work -- and thus address-mode becomes more desirable as the default.

This also makes m5's behavior in arm64 consistent with x86, the only
other architecture that supports address-mode operations.
2024-11-07 10:08:10 -08:00
dependabot[bot]
ca07a06893 misc: bump mypy from 1.11.2 to 1.13.0 (#1748)
Bumps [mypy](https://github.com/python/mypy) from 1.11.2 to 1.13.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-06 10:41:03 -08:00
Erin (Jianghua) Le
f2892fd5bc misc: update RELEASE-NOTES.md for simInsts and simOps (#1750)
This PR updates RELEASE-NOTES.md to say that simInsts and simOps have
been modified such that they now reset to 0 when m5.stats.reset() is
called.
2024-11-06 10:40:07 -08:00
dependabot[bot]
ecde7d9fa9 misc: bump pre-commit from 3.8.0 to 4.0.1 (#1749)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.8.0
to 4.0.1.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-05 12:16:42 -08:00
Yu-Cheng Chang
70c211236a arch-riscv: sign-extend the PC when enter/leave trap handler (#1756)
PR https://github.com/gem5/gem5/pull/1316 changed the sign-extended
address generation. We also need to sign-extend the address when setting
the PC on entering or leaving the trap handler.
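
As a rough illustration of the operation itself (not the code in this PR), sign-extending a 32-bit value held in a 64-bit variable can be written without relying on signed conversions. The function name is hypothetical, and the sketch assumes the PC is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: sign-extend the low 32 bits of `pc` to 64 bits.
// XOR-ing with the sign bit and then subtracting it propagates bit 31
// into bits 63..32 using only unsigned (wrapping) arithmetic.
std::uint64_t
signExtend32Sketch(std::uint64_t pc)
{
    const std::uint64_t low = pc & 0xFFFFFFFFULL;
    const std::uint64_t result = (low ^ 0x80000000ULL) - 0x80000000ULL;
    assert((result & 0xFFFFFFFFULL) == low);  // low half is preserved
    return result;
}
```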

Change-Id: I62d58a26dba0b0c64125fea8ac9376ebf55c4952
2024-11-05 12:14:55 -08:00
Saúl
63ea52de56 arch-riscv: fix vrgather pin count (#1759)
The number of register pins for the vector gather instructions was not
calculated correctly because the micro vl was not right. This caused
some micros to rename a new register instead of using a pinned one.
2024-11-05 12:13:51 -08:00
Tommaso Marinelli
7f50372979 configs: Update legacy RISC-V FS Linux script (#1753)
This PR improves the legacy RISC-V FS Linux script in the following
ways:

- Adds an argument to specify the bootloader, to (optionally) use the
  `RiscvBootloaderKernelWorkload` class.
- Updates the DTB generation function adding the Chosen node. This
  fixes the execution with recent Linux kernels.
- Checks if the `--kernel` required argument is set.
2024-11-05 10:57:57 -08:00
Matthew Poremba
6881534bd2 misc: Add v24.1 release notes for RubySystem changes (#1735) 2024-11-05 10:47:37 -08:00
Vishnu Ramadas
d463868f28 dev-amdgpu, gpu-compute, mem-ruby: Add support for writeback L2 in GPU (#1692)
Previously, GPU L2 caches could be configured in either writeback or
writethrough mode when used in an APU. However, in a CPU+dGPU system,
only writethrough worked. This is mainly because in a CPU+dGPU system,
the CPU sends either PCI or SDMA requests to transfer data from the GPU
memory to the CPU. When the L2 cache is configured to be writeback, the
dirty data resides in L2 when the CPU transfers data from GPU memory.
This leads to the wrong version being transferred. A similar issue also
crops up when the GPU command processor reads kernel information before
kernel dispatch, only to read incorrect data. This PR contains a set of
commits that fix both these issues.
2024-11-05 10:45:46 -08:00
ylldummy
940f49b63b base: Make BaseGdbRegCache::data() non constant (#1734)
The method is declared const, but the caller actually modifies the
contents of the structure directly through the returned pointer in
BaseRemoteGDB::cmdRegW. Member accesses inside a const method are
treated as const and will cause an error if we use reinterpret_cast
instead.

Remove the const qualifier to match how the virtual method is actually
used.
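
A simplified sketch of the situation, with hypothetical class and function names rather than gem5's real ones: inside a const member function `&regs` has a pointer-to-const type, and `reinterpret_cast` is not allowed to drop that const, so an accessor whose callers write through the pointer has to be non-const.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical register cache; not gem5's BaseGdbRegCache.
class RegCacheSketch
{
  public:
    // Non-const accessor: callers may write through the returned pointer.
    // Declaring this `const` would make `&regs` a `const Regs *`, and
    // reinterpret_cast<char *> would then fail to compile.
    char *data() { return reinterpret_cast<char *>(&regs); }

    std::size_t size() const { return sizeof(regs); }
    std::uint64_t reg(unsigned i) const { return regs.r[i]; }

  private:
    struct Regs { std::uint64_t r[4]; };
    Regs regs{};
};

// Stand-in for a "write registers" command that fills the cache from a
// raw byte buffer.
void
writeRegsSketch(RegCacheSketch &cache, const char *src, std::size_t len)
{
    assert(len <= cache.size());
    std::memcpy(cache.data(), src, len);
}
```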
2024-11-05 10:43:41 -08:00
Giacomo Travaglini
3e628dd1c0 arch-arm: Cache a pointer to previously matched TLB entry (#1752)
One of the perks of the previous TLB storage implementation [1] is that
its custom implementation of LRU exploited temporal locality to speed up
simulation performance:

            TlbEntry tmp_entry = *entry;
            for (int i = idx; i > 0; i--)
                table[i] = table[i - 1];
            table[0] = tmp_entry;
            return &table[0];

In other words, the matching entry was placed as the first entry of the
TLB table (table[0], top of LRU stack). In this way a following lookup
would encounter it as the first entry while looping over the TLB table,
therefore massively reducing simulation time when temporal locality is
present (most TLB table loops would find a match in the first
iteration):

    int x = 0;
    while (x < size) {
        if (table[x].match(lookup_data)) {

With the new implementation we decouple TLB storage from the replacement
policy. The result is a more flexible implementation but with the
drawback of a slower lookup/search. We therefore need to find another
way to exploit temporal locality. This patch addresses it by caching a
pointer to the previously matched entry in the TLB table.
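
The caching idea could be sketched as below. This is a simplified stand-in rather than the ArmTLB code: `TlbSketch`, its linear search, and the `slowLookups` counter are hypothetical and only serve to show the fast path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct TlbEntrySketch
{
    std::uint64_t vpn;
    std::uint64_t ppn;
    bool valid;
};

// Hypothetical TLB: storage is a plain table, and a pointer to the most
// recently matched entry is checked before the full search.
class TlbSketch
{
  public:
    explicit TlbSketch(std::size_t n)
        : slowLookups(0), table(n, TlbEntrySketch{0, 0, false}),
          last(nullptr)
    {}

    void
    insert(std::size_t idx, std::uint64_t vpn, std::uint64_t ppn)
    {
        assert(idx < table.size());
        table[idx] = TlbEntrySketch{vpn, ppn, true};
        last = nullptr;  // drop the cached pointer when storage changes
    }

    const TlbEntrySketch *
    lookup(std::uint64_t vpn)
    {
        // Fast path: temporal locality makes this hit most of the time.
        if (last != nullptr && last->valid && last->vpn == vpn) {
            return last;
        }
        // Slow path: search the whole table and remember the match.
        ++slowLookups;
        for (TlbEntrySketch &e : table) {
            if (e.valid && e.vpn == vpn) {
                last = &e;
                return last;
            }
        }
        return nullptr;
    }

    unsigned slowLookups;  // number of full searches performed

  private:
    std::vector<TlbEntrySketch> table;  // never resized, pointers stay valid
    TlbEntrySketch *last;
};
```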

[1]: https://github.com/gem5/gem5/blob/v24.0.0.0/src/arch/arm/tlb.cc

Change-Id: Id7dedf5411ea6f6724d1e4bdb51635417a6d5363

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-11-05 08:49:17 +00:00
Leon
2e998c9fc0 arch-riscv: Add support for Zicbop extension (#1710)
This PR adds support for the RISC-V
[Zicbop](https://github.com/riscv/riscv-CMOs/blob/master/cmobase/Zicbop.adoc)
extension.

Change-Id: I13b044cf84608fb09b760348366ffad659a00427

Co-authored-by: Zhibo Hong <hongzhibo@bytedance.com>
2024-11-04 17:08:38 -08:00
dependabot[bot]
dba9a9e564 misc: bump tqdm from 4.66.5 to 4.66.6 (#1747)
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.5 to 4.66.6.
2024-11-04 11:11:40 -08:00
Giacomo Travaglini
4f74c3a949 arch-arm: Use the cached release object instead of HaveExt (#1751)
The MMU already stores a pointer to the release object, so it can query
it directly to check for PAN instead of relying on the slower HaveExt
helper.

Change-Id: Ie3a186aa1d65955cff4a40871bde1ee78aa36ec0

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-11-03 11:18:10 +00:00
Matthew Poremba
2ed724b670 mem-ruby: Fix two NetDest locals using default constructor (#1746)
Two NetDest locally declared variables are using default constructor
instead of constructor with RubySystem pointer. This will cause asserts
when (1) garnet is used or (2) a protocol that uses `broadcast()` is
built.

Fix these two by passing the appropriate RubySystem pointers.
2024-11-02 08:37:04 -07:00
handsomeliu-google
956b164a43 Add Python interface to get port actual name (#1744)
In our use case, we'd like to intercept some gadgets in some gem5 ports
and register them to a Python-level collection. The registered name is
the string from the C++ constructor argument (portName), and it would be
great if we could access that from the Python level as well. This commit
enables this by exporting a Python-bound method to access the portName.

Change-Id: I93398697536f27a52d3a1dd0e658fcb321b9e293
2024-11-02 08:59:50 -05:00
Giacomo Travaglini
d376360255 arch-arm: Rewrite the ArmTLB storage to use an AssociativeCache (#1661)
With this PR we change the TlbEntry storage within the TLB from an
array of entries with a custom hardcoded FA indexing policy and LRU
replacement policy to the flexible SetAssociative cache.
2024-11-02 10:18:44 +00:00
Ivana Mitrovic
cc4f466e1e util: Bumps werkzeug in gem5-resources-manager (#1723)
Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.3 to
3.0.6.
2024-11-01 10:51:34 -07:00
Giacomo Travaglini
a2476373c9 arch-arm: Do not compute purifyTaggedAddr in checkPermissions (#1739)
purifyTaggedAddr is known to be an expensive computation regardless of
the memoization we do, as it sits in the critical path from a host
performance point of view (instruction fetch). In checkPermissions64 we
compute it without really needing the tag purification. The only place
where it is used is to check for PCAlignment, but the alignment check
looks at the 3 LSBs whereas a potential tag would be stored in the most
significant ones.
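
The argument can be checked with a small sketch. `stripTopByteSketch` is a deliberately simplified stand-in that just clears the top byte; the real purifyTaggedAddr depends on translation settings that are not modelled here.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Simplified stand-in for tag purification: clears bits 63..56.
Addr
stripTopByteSketch(Addr a)
{
    return a & 0x00FFFFFFFFFFFFFFULL;
}

// Alignment check against a low-bit mask such as 0x7 (the 3 LSBs).
bool
isAlignedSketch(Addr a, Addr lowMask)
{
    assert((lowMask & (lowMask + 1)) == 0);  // mask must be 2^n - 1
    return (a & lowMask) == 0;
}
```

Because the mask only covers the low bits, `isAlignedSketch(a, 0x7)` and `isAlignedSketch(stripTopByteSketch(a), 0x7)` always agree, so the purification step can be skipped for this check.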

Change-Id: I9f39db658c3575dcbacb5351813ff9bb3775046d

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-11-01 16:18:57 +00:00
Jason Lowe-Power
df6a318a86 arch-x86: Update MTRR defType register (#1732)
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-11-01 08:59:33 -07:00
Daniel Carvalho
ad17fa040a base: Remove DPRINTF_UNCONDITIONAL (#1724)
This macro has been marked as deprecated since 2021. Wrap its
deprecation process up.

Signed-off-by: odanrc <odanrc@yahoo.com.br>
2024-10-31 18:40:38 +00:00
Bobby R. Bruce
b5a73b59ef sim: Add include guards in simulate.hh (#1737) 2024-10-31 00:34:39 -07:00
Yu-Cheng Chang
757b272a25 arch-riscv: Fix Zcmp implement typos (#1727)
Fix some typos from previous PR: https://github.com/gem5/gem5/pull/1432

Change-Id: I7126d0a20b3294c7f15d90f2d50842d20ddb5e40
2024-10-30 09:47:30 -07:00
Bobby R. Bruce
24b672ab01 tests: update timout on pannotia fw gpu test (#1736) 2024-10-30 09:47:15 -07:00
Harshil Patel
429580ee77 tests: update timout on pannotia fw gpu test 2024-10-30 16:42:23 +00:00
Bobby R. Bruce
2c6de97ea1 Add SE mode to X86Board and RiscvBoard (#1702) 2024-10-29 20:17:47 -07:00
Bobby R. Bruce
d5d7880840 util-docker: Add qemu-riscv-env Dockerfile (#1731) 2024-10-29 17:19:43 -07:00
Bobby R. Bruce
d8e7c91127 mem-ruby: Remove unused variables/mark [maybe unused] (#1650)
PR gem5#1453 left some unused variables in the ruby code that triggered
"unused variable" warnings when compiling ALL/gem5.opt to use the CHI
protocol. These have been removed.
2024-10-29 14:31:20 -07:00
Matthew Poremba
1442a4dccd mem-ruby: Re-enable assign with implicit_ctor structures (#1694)
In #1453, an `implicit_ctor` option was added for SLICC structures. This
was done to allow statements such as `NetDest tmp;` which now require a
non-default constructor without modifying every protocol. The new
`implicit_ctor` option converts the statement `NetDest tmp;` in SLICC to
`NetDest tmp(<implicit_ctor>);` in C++. This is problematic when doing
something like `NetDest tmp := getMachines(...);` which gets converted
to `NetDest tmp(<implicit_ctor>) = getMachines(...);` as the constructor
doesn't return an object. Before #1453 NetDest had a default constructor
so there was no difference between a local variable definition and local
variable assignment.

This commit fixes this issue by checking in the LocalVariableAST if the
local variable is part of an assignment or not. If it is not part of an
assignment, the implicit_ctor is used. Otherwise, the assignment is
printed to the generated code.
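
The rule can be illustrated with a toy emitter. This is not the SLICC generator: `emitLocalSketch` and its string arguments are hypothetical, and the constructor argument is passed in as an opaque string.

```cpp
#include <cassert>
#include <string>

// Toy code emitter: a bare declaration gets the implicit constructor
// argument, while a declaration with an initializer is emitted as a
// plain assignment.
std::string
emitLocalSketch(const std::string &type, const std::string &name,
                const std::string &implicitCtorArg, const std::string &rhs)
{
    assert(!type.empty() && !name.empty());
    if (rhs.empty()) {
        return type + " " + name + "(" + implicitCtorArg + ");";
    }
    return type + " " + name + " = " + rhs + ";";
}
```

Under these assumptions, `emitLocalSketch("NetDest", "tmp", "ctor_arg", "")` yields `NetDest tmp(ctor_arg);`, whereas passing `"getMachines()"` as the last argument yields `NetDest tmp = getMachines();`.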

Note that this is not done anywhere in the public code but should be
allowed for folks writing their own Ruby protocols who might otherwise
be confused why a simple assignment presents a compile error.
2024-10-29 08:53:14 -07:00
Matt Sinclair
853f2ea012 configs,scons: Update scripts and build_opts to make GPU-FS simulations more configurable (#1693)
This PR adds support for command line arguments in GPU-FS runs to allow
the user to configure several parts of the GPU. It also increases the
bits per set in the build_opts/VEGA_X86 file to enable GPU-FS
simulations to use 64 directories or more.
2024-10-28 17:19:18 -05:00
Erin Le
11dd2c6c09 stdlib: address requested changes to X86, Riscv boards
This commit addresses the requested changes. An additional
comment is added for clarification, the exception type is
changed, and a few of the error messages have been
modified.
2024-10-28 15:00:19 -07:00
Marleson Graf
7bddc764cc mem-ruby: Prevent LL/SC livelock in MESI protocols (#1384) (#1399)
Fix #1384.

MESI_Two_Level and MESI_Three_Level protocols are susceptible to LL/SC
livelocks when simulating boards with high core count.

This fix is based on MOESI_CMP_directory's implementation of locked
states, but tailors the solution to only apply it when a Load-Linked is
initiated.

There are two new states to act as locked states and stall any messages
leading to eviction:
* LLSC_E: equivalent to E state, go to E after timeout.
* LLSC_M: equivalent to M state, go to M after timeout.

The main new event is Load_Linked, which is very similar (in behavior)
to a Store, reusing several transient states. When a controller receives
the exclusive data, it differentiates a Load_Linked from a Store by
checking a new field added to the TBE: 'isLoadLinked'. It triggers a
different event when it is a Load_Linked, which in turn causes the
transition to one of the locked states.

The entire mechanism can be turned off by setting 'use_llsc_lock' to
false, and the amount of time to keep locked is defined by
'llsc_lock_timeout_latency'.
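
A highly simplified sketch of the locked-state behaviour described above. The state names follow the commit message, but the functions and event names are hypothetical and ignore the transient states of the real protocols.

```cpp
#include <cassert>

enum class State { I, E, M, LLSC_E, LLSC_M };
enum class Event { Invalidate, LockTimeout };

struct StepResult
{
    State next;
    bool stalled;  // true when the message is held back
};

// Chooses the state entered when exclusive data arrives: the locked
// variant is used only for a Load_Linked and only when locking is on.
State
afterExclusiveDataSketch(State target, bool isLoadLinked, bool useLlscLock)
{
    assert(target == State::E || target == State::M);
    if (!(isLoadLinked && useLlscLock)) {
        return target;
    }
    return target == State::M ? State::LLSC_M : State::LLSC_E;
}

// Locked states stall invalidations until the timeout fires.
StepResult
stepSketch(State s, Event e)
{
    const bool locked = (s == State::LLSC_E || s == State::LLSC_M);
    if (locked && e == Event::Invalidate) {
        return {s, true};
    }
    if (locked && e == Event::LockTimeout) {
        return {s == State::LLSC_M ? State::M : State::E, false};
    }
    if (e == Event::Invalidate) {
        return {State::I, false};
    }
    return {s, false};  // a timeout outside a locked state does nothing
}
```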

Change-Id: I13f415b6b7890d51d01f23001047d2363467a814
2024-10-28 09:57:10 -07:00
Bobby R. Bruce
dde1c7d3a1 util-docker: Add RISCV to Ubuntu all-deps Docker platforms (#1716)
I have re-implemented building this image to target RISC-V in addition
to X86 and ARM. I have found it makes for quite a good cross compilation
tool.
2024-10-26 21:17:40 -07:00
Giacomo Travaglini
c9f94f4e06 arch-arm: Replace translateAtomic with translateFunctional in AT (#1713)
A previous PR [1] mistakenly replaced translateFunctional with
translateAtomic. This commit reverts that.

[1]: https://github.com/gem5/gem5/pull/1697

Change-Id: I945c3fe59cea36732d9f30109b950d4114aa8fad

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-10-25 09:15:52 -07:00
Harshil Patel
c91af552d4 tests: move weekly gpu tests to have separate jobs (#1698) 2024-10-24 04:02:23 -07:00
Bobby R. Bruce
709f2c7695 mem-ruby,tests: Add CHI with ISA tests (#1651) 2024-10-23 15:12:37 -07:00
Bobby R. Bruce
35db93ada4 arch-riscv: Fix the bug of vsetivli frequently flushing the pipeline (#1526)
This PR fixes the bug of vsetivli frequently flushing the pipeline.

Here are two pictures of the pipeline that illustrate this phenomenon.


![20240830-200208](https://github.com/user-attachments/assets/532a1a8e-8acd-483f-b9a0-c25dadbe76b4)

![20240830-200213](https://github.com/user-attachments/assets/9354a6ad-4024-4afb-be6f-01f08dc9610c)

The vsetivli(0x00013334.0) instruction in the first picture flushes the
pipeline every time it is executed. This is due to vsetivli being
incorrectly flagged as a 'DirectControl' instruction. The branch
predictor cannot predict it correctly.

The second picture is the pipeline after fixing the bug.

Change-Id: I5bede47919c06cea86fa23a81624b502fbdc1159
2024-10-23 08:32:56 -07:00
Zhibo Hong
089d780c76 arch-riscv: Fix the bug of vsetivli frequently flushing the pipeline
Change-Id: I5bede47919c06cea86fa23a81624b502fbdc1159
2024-10-23 17:24:43 +08:00
Erin Le
7b7f5ef34a stdlib: add SE mode to RiscvBoard
This commit adds SE mode to RiscvBoard. RiscvDemoBoard has also
been modified as adding SE mode to RiscvBoard made the
overridden functions in RiscvDemoBoard obsolete.
2024-10-22 16:31:01 -07:00
Erin Le
b9a19625ce stdlib: add SE mode to X86Board
This commit adds SE mode to X86Board. X86DemoBoard was also modified,
as functions that were previously needed to add SE mode to
X86DemoBoard were removed.
2024-10-22 15:01:27 -07:00
Erin (Jianghua) Le
f01d68bf96 stdlib, configs: Add RiscvDemoBoard (#1490)
This PR adds a RiscvDemoBoard that can be used with both SE and FS
mode. This was tested using the workloads riscv-matrix-multiply-run for
SE and riscv-ubuntu-20.04-boot for FS. Two example config scripts have
also been added.
2024-10-22 10:13:22 -07:00
Giacomo Travaglini
3a14a73982 arch-arm: Add support of AArch32 VRINTN/X/A/Z/M/P instructions. (#1655)
Add decoder and function of AArch32 VRINTN, VRINTX, VRINTA, VRINTZ,
VRINTM, and VRINTP (Advanced SIMD) instructions. Support both 16-bit and
32-bit variants.

Add vfpFPRint in vfp.hh to perform the behavior of round-to-integer.

Only support A32 encoding.

Change-Id: Icb9b6f71edf16ea14a439e15c480351cd8e1eb88
2024-10-22 18:37:30 +02:00
Nicholas Mosier
faf764e668 arch-x86: break 32/64-bit LEA's input dependency on prior dest value (#1683)
Fix #1682. Treat LEA as a BigLdStOp. BigLdStOps (as well as other Big*
x86 uops) do not have input dependencies on 32-/64-bit destinations. LEA
will still have input dependencies on 16-bit destinations. (LEA cannot
have an 8-bit destination.)
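
The architectural reason behind this can be sketched as follows. The code models x86-64 register write semantics only, not gem5's microop classes, and the function names are hypothetical: 64-bit writes replace the register, 32-bit writes zero-extend, and only 16-bit writes merge with the old value.

```cpp
#include <cassert>
#include <cstdint>

// Value of a 64-bit register after writing `result` with the given
// operand size in bytes.
std::uint64_t
writeDestSketch(std::uint64_t oldDest, std::uint64_t result, unsigned bytes)
{
    switch (bytes) {
      case 8:
        return result;                     // full overwrite
      case 4:
        return result & 0xFFFFFFFFULL;     // upper 32 bits are zeroed
      case 2:
        // Upper 48 bits are preserved, so the old value is an input.
        return (oldDest & ~0xFFFFULL) | (result & 0xFFFFULL);
      default:
        assert(false && "unsupported operand size in this sketch");
        return oldDest;
    }
}

// True when the old destination value is needed to compute the new one.
bool
readsOldDestSketch(unsigned bytes)
{
    return bytes == 2;
}
```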

Change-Id: I5d0678e6bd79bfd6064941a89c6fe290750543c9
2024-10-22 09:34:30 -07:00