Commit Graph

15466 Commits

Author SHA1 Message Date
Giacomo Travaglini
8381e1c5d3 mem-cache: Helper functions to allow dynamic configuration of partitioning policies (#1609)
This PR is doing a simple refactoring of some partitioning policies. It
moves existing functionalities
within PP methods so that they can be called multiple times throughout
the simulation.
Therefore allowing a dynamic adjustment of the partitioning scheme
2024-09-27 00:01:18 +02:00
Bobby R. Bruce
277b5be4dd arch-arm: Add a method to determine External Abort (#1610)
- Add `isExternalAbort()` in `AbortFault<T>` to determine external
abort.
- Add `virtual isExternalAbort()` in `ArmFault` so the method can be
used in base class.
- Set iss.ea by `isExternalAbort()`
2024-09-26 14:41:33 -07:00
Bobby R. Bruce
054790ad47 ext: Fix GCC v13+ comp of systemc due to problematic overloaded-virtual warn (#1576)
Fixes #1121 in line with the following suggesting:

https://github.com/gem5/gem5/issues/1121#issuecomment-2352743409
2024-09-26 14:32:20 -07:00
Junshi Wang
a4bacb9823 arch-arm: Add a method to determine External Abort.
- Add `isExternalAbort()` in `AbortFault<T>` to determine external abort.
- Add `virtual isExternalAbort()` in `ArmFault` so the method can be
used in base class.
- Set iss.ea by `isExternalAbort()`.

Change-Id: I01c22dc46958ab424b389af96d3c3b6243cbc671
2024-09-26 14:05:09 +01:00
Junshi Wang
9a7a661c66 arch-arm: Set tranMethod for external Data Abort.
The External Data Abort may not set TranMethod, and it leads to assert
error.

- Make `ArmFault::update` virtual.
- Implement override `update` in `AbortFault<T>` to set TranMethod.

Change-Id: I49e18799df8420b214b6059ffa756a13edf343d5
2024-09-26 14:04:17 +01:00
Giacomo Travaglini
b232204b49 mem-cache: Allow dynamic configuration of the Way pp
Change-Id: I1ba9266b24ebc9563f9380fcf155cdc436b2e376
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-26 10:21:46 +01:00
Giacomo Travaglini
fdcfc28cf4 mem-cache: Allow dynamic configuration of MaxCapacity pp
This will allow gem5 to configure the maximum capacity of a
partition dynamically during simulation, rather than
having it statically defined at construction time

Change-Id: Ib55c9990a6bc2930abaf2438c13337acc643520f
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-26 10:21:21 +01:00
Giacomo Travaglini
3100418fb1 mem-cache: Store totalBlockCount directly in MaxCapacity pp
In this way we actually need to store one unsigned integer instead of
two. We also won't need to recompute the total number of cache blocks
whenever we will adapt this policy to be dynamically modified

Change-Id: Ia8cf906539d1891b6cdb821f2a74628127dc68c6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-26 10:20:57 +01:00
Junshi Wang
bf61bd127f arch-arm: Add support of AArch32 VCVTA/P/N/M instructions.
Add decoder and function of AArch32 VCVTA, VCVTP, VCVTN and VCVTM
instructions. Support both 16-bit and 32-bit variants.

Only support A32 encoding.

Change-Id: I6ece0e1b779f9a7cc9d709894a49a7fdcda28373
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-26 10:02:23 +02:00
Ronchi1997
e17875b7c7 misc: Correctly display build information (#1603)
See: #1591

Co-authored-by: Ronchi <ronchi@qq.com>
2024-09-25 14:23:51 -07:00
aperais
36264938db misc: Make random gen portable across compilers. (#1580)
Replace std::uniform_*_distribution by custom code
to make random number generation in gem5 portable across
compilers.

Of note, FP random number generation was not uniformly
distributed, and this PR does not fix that issue.

Thanks to Chandana S. Deshpande (deshpande.s.chandana@gmail.com)
for uncovering the issue.

Co-authored-by: Arthur Perais <arthur.perais@univ-grenoble-alpes.fr>
2024-09-25 07:31:00 -07:00
Saúl
d1ce4fb6c7 arch-riscv: add VLEN/ELEN as class attributes for all vec insts (#1538)
This refactor attempts to homogenize all riscv's vector (macro/micro)
instruction classes so that ELEN and VLEN are guaranteed to be a class
attribute. Since both are constant, all instructions will get it on the
decoding process passed through to their vector base class.

This allows the removal of VLEN in the PC state and also in some
constructor default parameters (solves issue #1207).

Change-Id: I6f0471004335f49b00b015c37e95dc7f9569e303
2024-09-24 14:32:37 -07:00
Yu-Cheng Chang
e9ea18000d arch-riscv: Move static GDB methods to RemoteGDB virtual methods (#1590)
Move getRvType & getPrivilegeModeSet static methods into
RiscvISA::RemoteGDB virtual methods allows the derived
RiscvISA::RemoteGDB to override it without change a lot of methods in
base methods

Change-Id: I3cbb9cf1fdee4a298e903bb4a0a5683c042b749d
2024-09-24 07:46:56 -07:00
Bobby R. Bruce
2fc44a50f8 gpu-compute: Fix '64kB' to '64KiB' in gpu-compute (#1594)
64kB, in these cases, will cast to 64KiB regardless. To improve
readability and understanding of these objects, this patch changes there
SI Prefix (kB -> KiB).
2024-09-23 15:25:43 -07:00
Giacomo Travaglini
c3d356b43d arch-arm: Move generateTrap from MiscRegOp to ArmStaticInst (#1560)
System(Misc) register accesses are not the only trappable instructions.
We move the exception generation logic (generateTrap) from the
MiscRegOp64 to the base ArmStaticInst
2024-09-23 19:37:29 +02:00
Bobby R. Bruce
0bd2cfaf53 systemc: Disable 'overloaded-virtual' warn for systemc bind funcs
For GCC >=v13 systemc was breaking due to the overloaded virtual
warning check.

Issue: #1121
2024-09-23 05:20:29 -07:00
Jason Lowe-Power
fee603fd84 mem-cache: Do not require p.size and p.entry_size in IP template (#1557)
This PR is adjusting the constructor to relax template
requirements. In this way child classes are free to provide
their own way of calculating the number of entries and the
shifting required to extract the set

Why do we need this?
Up to this patch we have been configuring the indexing policy
by setting up the cache/table size (in bytes) and the entry size.
Those parameters make a lot of sense in caching structures
where:

a) We want to configure the caching structure using
the amount of storage (in bytes) provided (e.g. 4kB of Cache)
b) the content of a single entry is addressable therefore
we need the entry size to know how many bits in the indexing
process we need to shift to extract the set

In those cases the number of cache entries is derived from the formula

num_entries = size / entry_size

The adoption of the IndexingPolicy for different kinds
of caching structures (e.g. prefetcher tables) make this
way of configuring the IP a bit quirky.

For some tables directly setting the number of entries is a far more
intuitive way of configuring the IP, instead of allocating the desired
number of entries by working things out with the formula above
2024-09-19 07:48:46 -07:00
Giacomo Travaglini
e564561d41 misc: Remove Serialize-related code in Random (#1567)
The Random ser/des support has been non-existent since 2014.
Removing it will enable the Random class to be unit tested
without having a dependency on the src/sim code.
2024-09-19 14:13:10 +02:00
Arthur perais
85210cf51d misc: Remove unecessary include in random.hh 2024-09-18 13:43:37 +02:00
Giacomo Travaglini
77dff262a1 arch-arm: Fix DC IVAC for Secure EL2 (#1569)
According to the Arm architecture reference manual:

"When the value of HCR_EL2.VM is 1, data cache invalidate instructions
executed at EL1 perform a data cache clean and invalidate"

This behaviour should be exteded to secure mode now that Secure EL2 is
supported

Change-Id: I8b4733e6336a0fd5577f4ef35c0bae5408f91194

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-18 11:07:10 +01:00
Bobby R. Bruce
f2f86a3e42 stdlib, python: Add warning message and clarify binary vs metric units (#1479)
This PR changes memory and cache sizes in various parts of the gem5
codebase to use binary units (e.g. KiB) instead of metric units (e.g.
kB). This makes the codebase more consistent, as gem5 automatically
converts memory and cache sizes that are in metric units to binary
units.

This PR also adds a warning message to let users know when an
auto-conversion from base 10 to base 2 units occurs.

There were a few places in configs and in the comments of various files
where I didn't change the metric units, as I couldn't figure out where
the parameters with those units were being used.
2024-09-17 17:32:27 +00:00
Matt Sinclair
6d49130b0b mem-ruby: Fix replacement policy in GPU_VIPER (#1564)
The current GPU_VIPER protocol's TCC cache update the MRU information
twice with calling a_allocateBlock and ut_updateTag which affects the
LIP and RRIP replacement polies. Remove ut_updateTag fixes the LIP and
RRIP replacement polies.

Change-Id: I79ad9392593e00425a7fe8828048465b2c2c2e1f
2024-09-17 12:16:09 -05:00
Bobby R. Bruce
3feeb5724f stdlib: Issue warn if func is a gen for exit_event (#1499)
Addresses Issue #1492
2024-09-17 09:34:24 -07:00
Arthur perais
4de65bbd57 misc: Remove Serialize-related code in Random
The Random ser/des support has been non-existent since 2014.
Removing it will enable the Random class to be unit tested
without having a dependency on the src/sim code.
2024-09-16 11:17:32 +02:00
Jarvis Jia
9dfd66aca4 mem-ruby: Fix replacement policy in GPU_VIPER
The current GPU_VIPER protocol's TCC cache update the MRU information
twice with calling a_allocateBlock and ut_updateTag which affectgs the
LIP and RRIP replacement polies. Remove ut_updateTag fixes the LIP and
RRIP replacement polies.

Change-Id: I79ad9392593e00425a7fe8828048465b2c2c2e1f
2024-09-14 23:22:22 -05:00
Erin (Jianghua) Le
5aa7b1ce3e python: Redirect into correct subdirectory when using -re with multisim (#1551)
Previously, when passing the -re option while using multisim, the files
simerr.txt and simout.txt would be redirected into the m5out directory
instead of the correct subdirectory. They would also have a name of the
format
Spawn_gem5PoolWorker-some-integer_(simout|simerr).txt, which doesn't
indicate which simulation the files correspond to.

This commit fixes these issues by redirecting simerr.txt and simout.txt
into the correct subdirectory.

Change-Id: I0a25a9fd8dc672949f5f85fc5ca6452529301a73
2024-09-14 01:17:48 -07:00
Yu-Cheng Chang
f94cac6f65 arch-riscv: Change the packed data of GdbRegCache to protected (#1552)
Change it to protected to enable access the packed data from derived
RiscvGdbRegCache class

Change-Id: Ib33732642914ad367773c3fa45adaf6dfdeb248d
2024-09-12 09:52:03 -07:00
Giacomo Travaglini
5eec041e2d arch-arm: Use generateTrap for SME/SVE/SIMD/WFE/WFI trapping
This avoids repeating the same switch construct

Change-Id: Ie16c52519b1e1f984284f2f1344a3903a0010d36
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-12 16:04:03 +01:00
Giacomo Travaglini
a4c9600200 arch-arm: Move generateTrap from MiscRegOp to ArmStaticInst
System(Misc) register accesses are not the only trappable instructions.
We move the exception generation logic (generateTrap) from the
MiscRegOp64 to the base ArmStaticInst

Change-Id: Ie2ba0c39790f50e3e8d504d153025d402283ec95
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-12 14:29:21 +01:00
aperais
e970acb9d2 cpu-o3: Replace integral constants by named constants in FU pool (#1556)
This replaces hardcoded integral values with more explicit constant
names in the code allocating functional units to instructions.

This commit follows ba5886aee7 which
should have read:

"If an instruction requires a functional unit that is not present in the
model (e.g., because it is not present in the configuration), O3CPU
treats it as a 1-cycle operation.

This commit changes the behavior to make the cpu panic when this
happens. The cpu panics only if the instruction reaches the head of the
ROB, meaning it is ok to have unsupported instructions on the wrong
path.

Thanks to Chandana S. Deshpande (deshpande.s.chandana@gmail.com) for
finding the issue."

Change-Id: I5e0a37e5fb8404cb5496bd2cb0a9a5baeae3b895

Co-authored-by: Arthur perais <arthur.perais@univ-grenoble-alpes.fr>
2024-09-12 14:04:34 +01:00
Giacomo Travaglini
e73c442ad8 mem-cache: Move size/entry_size params away from the template
Change-Id: Iec7a79cd9f2fa60d97f4a430e047e286f50338c8
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-12 10:10:58 +01:00
Giacomo Travaglini
3bd54db68d mem-cache: Do not require p.size and p.entry_size in IP template
This commit is adjusting the constructor to relax template
requirements. In this way child classes are free to provide
their own way of calculating the number of entries and the
shifting required to extract the set

Why do we need this?
Up to this patch we have been configuring the indexing policy
by setting up the cache/table size (in bytes) and the entry size.
Those parameters make a lot of sense in caching structures
where:

a) We want to configure the caching structure using
the amount of storage (in bytes) provided (e.g. 4kB of Cache)
b) the content of a single entry is addressable therefore
we need the entry size to know how many bits in the indexing
process we need to shift to extract the set

In those cases the number of cache entries is derived from the formula

num_entries = size / entry_size

The adoption of the IndexingPolicy for different kinds
of caching structures (e.g. prefetcher tables) make this
way of configuring the IP a bit quirky.

For some tables directly setting the number of entries is a far more
intuitive way of configuring the IP, instead of allocating the desired
number of entries by working things out with the formula above

Change-Id: Ic7994c129196d6ba83dc99ce397ad43393d35252
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-09-12 10:10:58 +01:00
Erin Le
52c2ecd033 python: remove outdated comment in convert.py
Change-Id: I0cdeb709e5ae1a3100662172d96a5f6328be1a3d
2024-09-11 11:57:22 -07:00
Erin Le
3a8bbc41b8 python: refactor base 10 to 2 error message
This commit refactors the base 10 to base 2 error message such
that it uses the preexisting _split_suffix function instead
of a new function based off of _split_suffix. This commit also
removes the new helper function used previously.

Change-Id: I44d9ac3d8b98bcff33d6bfea7ffbdb5009272ede
2024-09-11 11:28:55 -07:00
aperais
ba5886aee7 cpu-o3: Panic if no FU exists for an instruction needing to issue (#1516)
At present, if an instruction requires a functional unit that is not
present in the O3CPU config, O3CPU treats it as a 1-cycle operation that
does not consume an FU. This seems like a silent failure : if I forgot
to add a FU for a new operation type I added, then I don't want it to
silently work "for free".

The problem is that the code treats the FU allocator returning
`NoCapableFU` for a given DynInst as equivalent to the case where the
DynInst obtained an FU, with default latency of 1. This is because there
is a single if statement that checks whether the FU allocator returned
`NoFreeFU` or not, and `NoCapableFU` happens to be different. The change
is to introduce `NoNeedFU` and to panic if the FU allocator returns
`NoCapableFU`

An improvement would be to use a strongly typed enum rather than integer
constants. Thoughts ?

In addition to unit tests, I have tested this with `main.py run` and get
panics if I remove support for `IntMul` type in `O3CPU.py` in:

```
./SuiteUID-asm-riscv-rv32um-ps-mul-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mul-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv32um-ps-mulh-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mulh-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv32um-ps-mulhsu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mulhsu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv32um-ps-mulhu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mulhu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv64um-ps-mul-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mul-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv64um-ps-mulh-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulh-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv64um-ps-mulhsu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulhsu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv64um-ps-mulhu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulhu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-asm-riscv-rv64um-ps-mulw-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulw-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-BaseCPUProcessor-arm-hello-ALL-x86_64-opt/TestUID-BaseCPUProcessor-arm-hello-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-cpu_test_ArmDerivO3CPU_Bubblesort-ALL-x86_64-opt/TestUID-cpu_test_ArmDerivO3CPU_Bubblesort-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-cpu_test_ArmDerivO3CPU_FloatMM-ALL-x86_64-opt/TestUID-cpu_test_ArmDerivO3CPU_FloatMM-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-cpu_test_RiscvDerivO3CPU_Bubblesort-ALL-x86_64-opt/TestUID-cpu_test_RiscvDerivO3CPU_Bubblesort-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-cpu_test_RiscvDerivO3CPU_FloatMM-ALL-x86_64-opt/TestUID-cpu_test_RiscvDerivO3CPU_FloatMM-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_arm_boot_test_to-tick-ALL-x86_64-opt/TestUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_arm_boot_test_to-tick-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_riscv-boot-test_to-tick-ALL-x86_64-opt/TestUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_riscv-boot-test_to-tick-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-test-arm-hello32-static-o3-ALL-x86_64-opt/TestUID-test-arm-hello32-static-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-test-arm-hello64-static-o3-ALL-x86_64-opt/TestUID-test-arm-hello64-static-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-test-mips-hello-o3-ALL-x86_64-opt/TestUID-test-mips-hello-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-test-riscv-hello-o3-ALL-x86_64-opt/TestUID-test-riscv-hello-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
./SuiteUID-test-riscv-print-this-o3-ALL-x86_64-opt/TestUID-test-riscv-print-this-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2
```

Co-authored-by: Arthur perais <arthur.perais@univ-grenoble-alpes.fr>
2024-09-11 16:43:31 +01:00
Bobby R. Bruce
f327559ca4 tests,stdlib,python: Add tests for base 10 to 2 SI unit check
**Note**: Erin needs to complete the commit by expanding this test to
properly test the behavior of this change.

To run the pyunit tests:

```sh
scons build/ALL/gem5.opt -j`nproc`
./build/ALL/gem5.opt tests/run_pyunit.py
```

Change-Id: I8cea0fe8b088e03e84072a000444953768bc3151
2024-09-10 15:17:53 -07:00
handsomeliu-google
0da65b31c2 python: Ignore *args and **kwargs when generating cxxMethod pybinding script (#1535)
According to the pybind documentation, "When combining *args or **kwargs
with Keyword arguments you should not include py::arg tags for the
py::args and py::kwargs arguments."

In the current implementation of gem5, if you use the cxxMethod
decorator on a function that has *args or **kwargs, gem5 will
incorrectly add these variables to the pybind generated declaration.

I.e., def f(arg1, arg2,  *args, **kwargs): -> .def("f", &f,
py::arg("arg1"), py::arg("arg2"), py::arg("*args"), py::arg("**kwargs"))
which is incorrect pybind code.

To fix this problem, we should ignore variables in the generator if they
are *args or **kwargs. This change skips these variables when creating
the pybind declaration.

Change-Id: I44a1e0eb0b5fc5c1e1d423ba145d456bff92c6b8
2024-09-09 10:23:26 -07:00
Erin Le
00f927a4e2 mem, python: refactor error message formatting
This commit refactors the error message added to convert.py.
A mapping between the base 10 and base 2 suffix magnitudes
(e.g. k: ki, M: Mi, etc.) and a new function that extracts the
magnitude and numerical value have been added. Also, a warning
message has been added to the toMemoryBandwidth function in
addition to the one in toMemorySize.

Change-Id: I3ae157d13c7089d38a34a6e4c35a2b58978106d0
2024-09-05 18:00:41 -07:00
Giacomo Travaglini
57d82fdbb4 sim-se, arch: Fix syscall parametre sizes for 32-bit OSs (#1482)
A bug was uncovered in that for various syscalls that used 64bit
parametres, the ABI for 32bit operating systems was passing the wrong
values to the syscalls, due to discrepancies between the target and
guest OS. This commit fixes that by replacing 64-bit types, or types
that are platform specific in size, with the exact correspondent for the
guest OS, thus producing the correct signature for the respective
syscalls. On top of this, the --param argument is added to the
starter_se script, in order to support attachment of remote debuggers.
2024-09-03 09:49:59 +01:00
Matt Sinclair
403622f376 dev-amdgpu: Implement UNMAP_QUEUES queue_sel==2 (#1481)
Unmap queues with queue_sel of 2 unmaps all queues while queue_sel of 3
unmaps all non-static queues. The implementation of 3 was actually
correct for 2. Static queues are queues which were mapped using a map
queues packet with a queue_type of 1 or 2.

This commit adds ability to mark a queue as static. When unmap queues
with queue_sel of 2 is sent, the existing code is now executed. With a
value of 3, we now check if the queue was marked static and do not unmap
it if marked.

Change-Id: I87d7cf78a0600c7baa516c01f42c294d3c4e90c5
2024-08-31 22:52:08 -05:00
Matthew Poremba
21f1e54ecd dev-amdgpu: Implement UNMAP_QUEUES queue_sel==2
Unmap queues with queue_sel of 2 unmaps all queues while queue_sel of 3
unmaps all non-static queues. The implementation of 3 was actually
correct for 2. Static queues are queues which were mapped using a map
queues packet with a queue_type of 1 or 2.

This commit adds ability to mark a queue as static. When unmap queues
with queue_sel of 2 is sent, the existing code is now executed. With a
value of 3, we now check if the queue was marked static and do not
unmap it if marked.

Change-Id: I87d7cf78a0600c7baa516c01f42c294d3c4e90c5
2024-08-31 17:41:47 -07:00
Giacomo Travaglini
29d6b46f1f arch-arm: Fix Execution Permission in Stage2 Direct Permission. (#1502)
In Stage 2 under AArch64, execution permission does not need read
permission.

Change-Id: I45887e8f4d50ed5edc4afaed9a2dd8a74db9d0d4
2024-08-29 23:57:58 +01:00
Matthew Poremba
bb9539ad4d arch-vega: Revert incorrect SOPC compare (#1521)
LT <= was previously correct while < is not. Can lead to incorrect
program execution. Related to #1366 related to #1520

Change-Id: I00b7838e920eee7c8adb508e869fdf53a9373e1f
2024-08-29 09:20:25 -07:00
Giacomo Travaglini
d78a571660 base: Allow DPRINTF debugging of AssociativeCache (#1514)
The AssociativeCache is used by different caching agents.
This PR will allow to pass the appropriate flag to the cache so that we
can meaningfully debug
its internals. For instance, when used to model a prefetcher table, will
will pass the
HWPrefetch flag; when used to model a TLB, we will pass the TLB flag.
2024-08-27 16:48:24 +01:00
Marco Kurzynski
a8447b7fc0 arch-vega: Pass s_memtime through smem pipe (#1350)
The Vega ISA's s_memtime instruction is used to obtain a cycle value
from the GPU. Previously, this was implemented to obtain the cycle count
when the memtime instruction reached the execute stage of the GPU
pipeline. However, from microbenchmarking we have found that this under
reports the latency for memtime instructions relative to real hardware.
Thus, we changed its behavior to go through the scalar memory pipeline
and obtain a latency value from the the SQC (L1 I$). This mirrors the
suggestion of the AMD Vega ISA manual that s_memtime should be treated
like a s_load_dwordx2.

The default latency was set based on microbenchmarking.

Change-Id: I5e251dde28c06fe1c492aea4abf9f34f05784420
2024-08-26 19:47:04 -07:00
Alexander Richardson
b9eafdb190 arch-arm: Fix implicit int-to-float conversion in VCMP (#1326)
Explicitly convert to float/double to fix compiler warnings that I have
turned on locally. It might make sense to make use of fplib functions to
be portable across different host float formats but something as simple
as comparison against zero should be safe.

Change-Id: I96c6ee7c5497fece11be07234ff80ff86e7555e2
2024-08-26 10:22:07 +01:00
Alexander Richardson
3e288305c1 arch-arm: downgrade a warning to a DPRINTF (#1438)
Programming an event ID while counters are disabled is perfectly fine,
so we should just log this using DPRINTF instead of printing a warn()
every time it happens.

Change-Id: Ib9499857271033ef941f74a7f012d8694328eaf3
2024-08-26 10:20:49 +01:00
Alexander Richardson
ff12822606 arch-arm: when programming an invalid PMU ID detach the counter (#1510)
I'm not entirely sure what the mandated behaviour is according the the
ARM ARM, but I was very confused by the counters continuing to increment
with the old event even when programmed to an event ID that is not
currently supported by GEM5. Disconnecting the counter if the event is
not supported is less surprising behaviour IMO.

Change-Id: I927d9339c138dafa1484db1515c2aa09b0a9a0a9
2024-08-26 10:17:51 +01:00
Alexander Richardson
a679b9e8a3 arch-arm: Use .f32/.f64 suffixes for vfp mnemonics (#1512)
This matches the Arm manual and the output produced by capstone. Also
avoid unnecessary spaces in vsel* instruction printing.

Change-Id: I071dd834b7104f10f6358a6b2e2895bdab64df82
2024-08-25 14:12:18 +01:00
Giacomo Travaglini
399f85223d base: Add print when inserting/evicting an AssociativeCache entry
We are adding two debug prints in the AssociativeCache:

1) Inserting print
2) Evicting print

Among those, the evicting one is probably the most important
This is because while the DPRINTF can be added in the
Entry::insert implementation (called during insertion),
the AssociativeCache does not reference any evict method.
Instead, the findVictim is transparently invalidating the
victim, which makes it impossible for the client code
to understand whether the victim was a valid
entry or not.

Change-Id: I4fee59cc63c6b0e14c5b02bcf3ba5f58aa21ef9f
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-08-24 08:16:29 +01:00