Commit Graph

14417 Commits

Author SHA1 Message Date
Hoa Nguyen
91e55d9c60 base: Add warning when failing to insert a whole symbol table
Current we drop the insertion of a whole symbol table if the name
of one symbol already exists in the base table. Having similar
symbols across different binaries is very common.

This change adds a warning and recommends a fix instead of silently
dropping the table. This is useful for debugging when there are two
or more workloads, e.g. bootloader + kernel, are added separately.

Change-Id: I9e4cf06037cd70926fb5cee3c4dab464daf0912e
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-25 16:27:21 -07:00
Jason Lowe-Power
010ac43369 arch-riscv: Make RISC-V decodeInst overridable (#350)
The change will allow developers to implement and decode their
non-standard instructions to the CPU models
2023-09-25 06:43:56 -07:00
Giacomo Travaglini
9d63a1492a cpu: Add override to TraceCPU init function (#348)
This introduces a fix that caused the clang compiler tests to fail here:
https://github.com/gem5/gem5/actions/runs/6195015407

Change-Id: I48c61539f497976c038c6e8e379d00285e1c39c7
2023-09-25 09:10:33 +01:00
Roger Chang
d55f8f2716 arch: Enable customized decoder class name
Developers can make the own ISADesc action in the SConscript with
their decoder class name.

Change-Id: I011cf059642e178913e1f62df4e5c02401cc132e
2023-09-22 15:45:56 +08:00
Roger Chang
5b41112e03 arch-riscv: Make RISC-V decodeInst overridable
The change will allow developers to implement and decode their
non-standard instructions to the CPU models

Bug: 289467440
Test: None
Change-Id: I67f4abc71596f819c1265e325784f51c8e9bb359
2023-09-22 11:38:22 +08:00
Bobby R. Bruce
3f9afe96c6 python,util: Add Python MyPy Stubgen to enable Pylance IntelliSense (#307)
This allows us to generate stubs for the modules in gem5. The output
will be a "typings" directory which can be used by Pylance (Python
IntelliSense) to infer typings in Visual Studio Code.

Note: A "typings" directory in the root of the workspace is the default
location for Pylance to look for typings. This can be changed via
`python.analysis.stubPath` in "settings.json".

Usage
=====

```
pip3 install -r requirements.txt
scons build/ALL/gem5.opt -j$(nproc)
./build/ALL/gem5.opt util/gem5-stubgen.py
```
2023-09-21 11:52:16 -07:00
Melissa Jost
d297da3654 cpu: Add override to TraceCPU init function
This introduces a fix that caused the clang compiler tests to
fail here: https://github.com/gem5/gem5/actions/runs/6195015407

Change-Id: I48c61539f497976c038c6e8e379d00285e1c39c7
2023-09-21 10:11:39 -07:00
Bobby R. Bruce
958eda6961 arch-riscv: Fix inst flags for jal and jalr (#325)
The jal and jalr share the same instruction format JumpConstructor,
which sets the IsCall and IsReturn flags by the register ID. However, it
may cause wrong instruction flags set for jal because the section
"handle the 'Jalr' instruction" misses the opcode checking. The PR fix
the issue to ensure the IsReturn can be only set in Jalr.
2023-09-20 16:25:21 -07:00
Bobby R. Bruce
aa0702c6eb dev-amdgpu: Handle GPU atomics on host memory addresses (#328)
It is possible to execute a GPU atomic instruction using a memory
address that is in the host memory space (e.g, HMM, __managed__,
hipHostMalloc'd address). Since these are in host memory they are passed
to the SystemHub DmaDevice. However, this currently executes as a write
packet without modifying data. This leads to hangs in applications that
use atomics for forward progress (e.g., HeteroSync).

It is not clear where these are handled on a real GPU, but they are
certainly not handled by the software stack nor driver, so they must be
handled in hardware and therefore implemented in gem5. Handling for
atomics in the SystemHub makes the most sense.

To make atomics work a few extra changes need to be made to the
SystemHub. (1) The atomic is implemented as a host memory read, followed
by calling the AtomicOpFunctor, followed by a write. This requires a
second event to handle read response, performing atomic, and issuing a
write. (2) Atomics must be serialized otherwise two atomics might return
the same value which is incorrect. This patch adds serialization logic
for all request types to the same address to handle this. (3) With the
added complexity of the SystemHub, a new debug flag explicitly for
SystemHub is added.

Testing done: The heterosync application with input "sleepMutex 10 16 4"
previously hung before this patch. It passes with the patch applied.
This application tests both (1) and (2) above, as it allocates locks
with hipHostMalloc and has multiple workgroups sending an atomic request
in the same Tick, verifying the serialization mechanism.
2023-09-20 16:24:56 -07:00
Matthew Poremba
63cabf2848 dev-amdgpu: Handle GPU atomics on host memory addresses
It is possible to execute a GPU atomic instruction using a memory
address that is in the host memory space (e.g, HMM, __managed__,
hipHostMalloc'd address). Since these are in host memory they are passed
to the SystemHub DmaDevice. However, this currently executes as a write
packet without modifying data. This leads to hangs in applications that
use atomics for forward progress (e.g., HeteroSync).

It is not clear where these are handled on a real GPU, but they are
certianly not handled by the software stack nor driver, so they must be
handled in hardware and therefore implemented in gem5. Handling for
atomics in the SystemHub makes the most sense.

To make atomics work a few extra changes need to be made to the
SystemHub. (1) The atomic is implemented as a host memory read, followed
by calling the AtomicOpFunctor, followed by a write. This requires a
second event to handle read response, performing atomic, and issuing a
write. (2) Atomics must be serialized otherwise two atomics might return
the same value which is incorrect. This patch adds serialization logic
for all request types to the same address to handle this. (3) With the
added complexity of the SystemHub, a new debug flag explicitly for
SystemHub is added.

Testing done: The heterosync application with input "sleepMutex 10 16 4"
previously hung before this patch. It passes with the patch applied.
This application tests both (1) and (2) above, as it allocates locks
with hipHostMalloc and has multiple workgroups sending an atomic request
in the same Tick, verifying the serialization mechanism.

Change-Id: Ife84b30037d1447dd384340cfeb06fdfd472fff9
2023-09-20 13:52:25 -05:00
Roger Chang
70c1d762c7 arch-riscv: Fix inst flags for jal and jalr
The jal and jalr share the same instruction format JumpConstructor,
which sets the IsCall and IsReturn flags by the register ID.
However, it may cause wrong instruction flags set for jal because
the section "handle the 'Jalr' instruction" misses the opcode
checking. The PR fix the issue to ensure the IsReturn can be only
set in Jalr.

Change-Id: I9ad867a389256f9253988552e6567d2b505a6901
2023-09-20 14:27:23 +08:00
Nicholas Mosier
741a901d8d arch-x86: fix negative overflow check bug in PACK micro-op
The implementation of the x86 PACK micro-op had a logical bug that
caused the `PACKSSWB` and `PACKSSDW` instructions to produce
incorrect results. Specifically, due to a signedness error, the
overflow check for negative integers being packed always evaluated
to true, resulting in all negative integers being packed as -1 in
the output.

This patch fixes the signedness error that causes the bug.

GitHub issue: https://github.com/gem5/gem5/issues/331

Change-Id: I44b7328a8ce31742a3c0dfaebd747f81751e8851
2023-09-20 05:09:32 +00:00
Bobby R. Bruce
3bdcfd6f7a mem-ruby: patch fixes a protocol error in MOESI_CMP_Directory (#316)
When there is race between FwdGetX
and PUTX on owner. Owner in this case hands off
ownership to GetX requestor and PUTX still goes
through. But since owner has changed, state should go back to M and PUTX
is essentially trashed.
An Unblock to the Directory in this case will give an undefined
transition. I have added transitions which indicate that when an Unblock
is served to the Directory, it means that some kind of ownership
transfer has happened while a PUTX/PUTO was in progress.
2023-09-15 13:25:51 -07:00
Bobby R. Bruce
a101b1aba3 stdlib: Add 'to_path' arg to obtain_resource
This allows for a user to specify the exact path they want a resource to
be downloaded to. This differs from 'resource_direcctory' in that a user
may specify the file/directory name of the resource (using just the
'resource_directory' will have the resource as its ID in that directory.

Change-Id: I887be6216c7607c22e49cf38226a5e4600f39057
2023-09-14 15:33:17 -07:00
Bobby R. Bruce
b12f28af96 stdlib: Add 'quiet' option to obtain_resource func
Change-Id: I15d3be959ba7ab8af328fc6ec2912a8151941a1e
2023-09-14 15:33:17 -07:00
Bobby R. Bruce
59a96c8c2f mem-cache: Fix bug in classic cache while clflush (#274)
This change, https://github.com/gem5/gem5/pull/205, mistakenly allocates
write buffer for clflush instruction when there's a cache miss. However,
clflush in gem5 is not a write instruction. Thus, the cache should
allocate miss buffer in this case.
2023-09-14 01:14:39 -07:00
Gautham Pathak
178db9e270 mem-ruby: patch fixes a protocol error in MOESI_CMP_Directory
When there is race between FwdGetX
and PUTX on owner. Owner in this case hands off
ownership to GetX requestor and PUTX still goes
through. But since owner has changed, state should
go back to M and PUTX is essentially trashed.
An Unblock to the Directory in this case will give an undefined
transition. I have added transitions which indicate that when
an Unblock is served to the Directory, it means that some kind
of ownership transfer has happened while a PUTX/PUTO was in
progress.

Change-Id: I37439b5a363417096030a0875a51c605bd34c127
2023-09-13 19:09:13 -04:00
Bobby R. Bruce
d38c029195 mem-ruby: This commit patches an error in AbstractController.cc (#294)
Links to #293 

After calling m5_dump_reset_stats(0,0) in a test program, some
statistics like
l1_controllers.L1Dcache.m_demand_hits,
l1_controllers.L1Dcache.m_demand_misses,
l1_controllers.L1Dcache.m_demand_accesses
were not getting reset in the newer stat dumps.
This one line patch fixes that. Changes were tested with calling two
m5_dump_reset_stats(0,0) in a row for a system with 1 core, tested on
both SE and FS.
Credits: @MeatBoy106
2023-09-13 15:48:46 -07:00
Bobby R. Bruce
673d4b2ac2 arch-x86: initialize and correct bitwidth for FPU tag word (#304)
The x87 FPU tag word (FTW) was not explicitly initialized in
{X86_64,i386}Process::initState(), resulting in holding an initial value
of zero, resulting in an invalid x87 FPU state. This commit initializes
FTW to 0xFFFF, indicating the FPU is empty at program start during
syscall emulation.

The 16-bit FTW register was also incorrectly masked down to 8-bits in
X86ISA::ISA::setMiscRegNoEffect(), leading to an invalid X87 FPU state
that later caused crashes in the X86KvmCPU. This commit corrects the
bitwidth of the mask to 16.

GitHub issue: https://github.com/gem5/gem5/issues/303
2023-09-13 15:47:50 -07:00
Gautham Pathak
87db6df8f6 mem-ruby: This commit patches an error in AbstractController.cc
After calling m5_dump_reset_stats(0,0) in a test program,
some statistics like
l1_controllers.L1Dcache.m_demand_hits,
l1_controllers.L1Dcache.m_demand_misses,
l1_controllers.L1Dcache.m_demand_accesses
were not getting reset in the newer stat dumps.
This one line patch fixes that. Changes were tested with
calling two m5_dump_reset_stats(0,0) in a row for a system
with 1 core, tested on both SE and FS.
Credits to Gabriel Busnot for finding the fix.

Change-Id: I19d75996fa53d31ef20f7b206024fd38dbeac643
2023-09-13 14:07:16 -04:00
Bobby R. Bruce
5fd901ffbb cpu, configs: Fix TraceCPU after multi-ISA addition (#302)
This PR fixes #301
2023-09-12 17:26:27 -07:00
Bobby R. Bruce
39f0bcd9af python: Mimic Python 3's -P flag in gem5
Python 3's `-P` flag, when set, means `sys.path` is not prepended with
potentially unsafe paths:
https://docs.python.org/3/using/cmdline.html#cmdoption-P

This patch allows gem5 to mimic this. This is necesssary when using
`mypy.stubgen` as it expects the Python Interpreter to have the `-P`
flag.

Change-Id: I456c8001d3ee1e806190dc37142566d50d54cc90
2023-09-12 14:51:59 -07:00
Nicholas Mosier
2178e26bf2 arch-x86: initialize and correct bitwidth for FPU tag word
The x87 FPU tag word (FTW) was not explicitly initialized in
{X86_64,i386}Process::initState(), resulting in holding an initial
value of zero, resulting in an invalid x87 FPU state. This commit
initializes FTW to 0xFFFF, indicating the FPU is empty at program
start during syscall emulation.

The 16-bit FTW register was also incorrectly masked down to 8-bits
in X86ISA::ISA::setMiscRegNoEffect(), leading to an invalid X87 FPU
state that later caused crashes in the X86KvmCPU. This commit
corrects the bitwidth of the mask to 16.

GitHub issue: https://github.com/gem5/gem5/issues/303

Change-Id: I97892d707998a87c1ff8546e08c15fede7eed66f
2023-09-12 15:39:29 +00:00
Bobby R. Bruce
1bebf6a3cc sim-se: Use tgt_stat64 instead of tgt_stat in newfstatatFunc (#283)
The syscall emulation of newfstatat incorrectly treated the output stat
buffer to be of type `OS::tgt_stat`, not `OS::tgt_stat64`, causing the
invalid output stat buffer in the application to hold invalid data.

This patch fixes the bug by simply substituting the type `OS::tgt_stat`
with `OS::tgt_stat64` in `newstatatFunc()`.

GitHub issue: https://github.com/gem5/gem5/issues/281
2023-09-12 08:33:42 -07:00
Bobby R. Bruce
94e5a0cccf sim-se: Fix tgkill logic bug in handling signal argument (#286)
The syscall emulation of tgkill contained a simple logic bug (a `||`
instead of a `&&`), causing the signal argument to always be considered
invalid. This patch fixes the bug by simply changing the `||` to a `&&`.

GitHub issue: https://github.com/gem5/gem5/issues/284
2023-09-12 08:32:56 -07:00
Bobby R. Bruce
d67a6603c1 cpu-kvm: properly set x86 xsave header on gem5->KVM transition (#298)
If the XSAVE KVM capability is available (KVM_CAP_XSAVE), the X86KvmCPU
will try to set the x87 FPU + SSE state using KVM_SET_XSAVE, which
expects a buffer (struct kvm_xsave) in XSAVE area format (Vol. 1, Sec.
13.4 of Intel x86 SDM). The original implementation of
`X86KvmCPU::updateKvmStateFPUXSave()`, however, improperly sets the
xsave header, which contains a bitmap of state components present in the
xsave area.

This patch defines `XSaveHeader` structure to model the xsave header,
which is expected directly following the legacy FPU region (defined in
the `FXSave` structure) in the xsave area. It then sets two bist in the
xsave header to indicate the presence of x86 FPU and SSE state
components.

GitHub issue: https://github.com/gem5/gem5/issues/296
2023-09-12 08:32:20 -07:00
Giacomo Travaglini
a0a799f474 cpu: Disable CPU switching functionality with TraceCPU
Now that the TraceCPU is no longer a BaseCPU we disable CPU switching
functionality. AFAICS from the code, it seems like using m5.switchCpus
was never really working.
The takeOverFrom was described as being used when checkpointing
(which is not really the case). Moreover the icache/dcache
event loops were not checking if the CPU was switched out
so the trace was always been consumed regardless of the BaseCPU
state.

Note: IMHO the only case where you might want to switch between
an execution-driven CPU to the TraceCPU is when you want to
warm your caches before the ROI.
All other cases don't really make sense as with the TraceCPU
there is no architectural state being maintained/updated.

Change-Id: I0611359d2b833e1bc0762be72642df24a7c92b1e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-12 15:50:05 +01:00
Giacomo Travaglini
9a5d900770 cpu: Stop treating TraceCPU as a BaseCPU
This is fixing a recently reported issue [1] where it is
not possible to use the TraceCPU to replay elastic traces

It requires some architectural data structures (like ArchMMU,
ArchDecoder...) which are no longer defined in the BaseCPU class at
compilation time.  Which Arch version should be used for a class
(TraceCPU) that is supposed to be ISA agnostic ? Does it really make
sense to define them for the TraceCPU? Those classes are not used anyway
during trace replay and their sole purpose would just be to comply to
the BaseCPU interface.

As there is no elegant way to make things work, this patch stops
treating the TraceCPU as a BaseCPU.

While it philosophically makes sense to treat the TraceCPU as a common
CPU (it sort of replays pre-executed instructions), a case can be made
for considering it more like a traffic generator.

[1]: https://github.com/gem5/gem5/issues/301

Change-Id: I7438169e8cc7fb6272731efb336ed2cf271c0844
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-12 15:49:29 +01:00
Roger Chang
def89745bc arch-riscv: Allow Minor and O3 CPU execute RVV
Change-Id: I4780b42c25d349806254b5053fb0da3b6993ca2f
2023-09-12 13:56:22 +08:00
Roger Chang
0f54cb0593 arch-riscv: Remove check vconf done implementation
Change-Id: If633cef209390d0500c4c2c5741d56158ef26c00
2023-09-12 13:56:22 +08:00
Roger Chang
31b95987da arch-riscv: Change the instruction family to jump like
The method that get the vl, vtype from PCState in the next changes

Change-Id: I022b47b7a96572f6434eed30dd9f7caa79854c31
2023-09-12 13:56:22 +08:00
Roger Chang
282765234b arch-riscv: Implement the branchTarget for vset*vl*
Change-Id: I10bf6be736ce2b99323ace410bff1d8e1e2a4123
2023-09-12 13:56:22 +08:00
Roger Chang
a3aaad2ecd arch-riscv: Refactor the execution part of vset*vl*
Change-Id: Ie0d9671242481a85bb0fe5728748b16c3ef62592
2023-09-12 13:56:21 +08:00
Roger Chang
1bde42760f arch-riscv: Get vl, vtype and vlenb from PCState
Change-Id: I0ded57a3dc2db6fcc7121f147bcaf6d8a8873f6a
2023-09-12 13:56:21 +08:00
Roger Chang
8918302239 arch-riscv: Change the implementation of vset*vl*
The changes includes:

1. Add VL, Vtype and VlenbBits operands
2. Change R/W methods of VL, Vtype and VlenbBits from PCState

Change-Id: I0531ddc14344f2cca94d0e750a3b4291e0227d54
2023-09-12 13:56:21 +08:00
Roger Chang
7b5d8b4e5b arch-riscv: Add vlenb, vtype and vl in PCState
Change-Id: I7c2aed7dda34a1a449253671d7b86aa615c28464
2023-09-12 13:56:21 +08:00
Roger Chang
f94658098d arch-riscv: Remove checked_type in StaticInst Constructor
We should not try to check vtype when decoding the instruction.
It should be checked in vset{i}vl{i} since the register can be
modified via vset{i}vl{i}

Change-Id: I403e5c4579bc5b8e6af10f93eac20c14662e4d2d
2023-09-12 13:56:21 +08:00
Roger Chang
3f0475321a arch-riscv: Change VTYPE to BitUnion64
Change-Id: I7620ad1ef3ee0cc045bcd02b3c9a2d83f93bf3fe
2023-09-12 13:56:21 +08:00
Roger Chang
dfc725838e arch-riscv: Refactor PCState class
Change-Id: I1d25350ba2a3c7c366f42340c20b4488c33cde6f
2023-09-12 13:56:21 +08:00
Nicholas Mosier
2b9d558cef cpu-kvm: properly set x86 xsave header on gem5->KVM transition
If the XSAVE KVM capability is available (KVM_CAP_XSAVE), the X86KvmCPU
will try to set the x87 FPU + SSE state using KVM_SET_XSAVE, which
expects a buffer (struct kvm_xsave) in XSAVE area format (Vol. 1,
Sec. 13.4 of Intel x86 SDM). The original implementation of
`X86KvmCPU::updateKvmStateFPUXSave()`, however, improperly sets the
xsave header, which contains a bitmap of state components present
in the xsave area.

This patch defines `XSaveHeader` structure to model the xsave header,
which is expected directly following the legacy FPU region (defined in
the `FXSave` structure) in the xsave area. It then sets two bist in
the xsave header to indicate the presence of x86 FPU and SSE state
components.

GitHub issue: https://github.com/gem5/gem5/issues/296

Change-Id: I5c5c7925fa7f78a7b5e2adc209187deff53ac039
2023-09-10 15:16:50 +00:00
Nicholas Mosier
8740385f9e sim-se: Fix tgkill logic bug in handling signal argument
The syscall emulation of tgkill contained a simple logic bug
(a `||` instead of a `&&`), causing the signal argument to always
be considered invalid. This patch fixes the bug by simply changing
the `||` to a `&&`.

GitHub issue: https://github.com/gem5/gem5/issues/284

Change-Id: I3b02c618c369ef56d32a0b04e0b13eacc9fb4977
2023-09-09 08:51:41 -07:00
Jason Lowe-Power
ebde1133c0 redirect_path patch for restoring cpt (#221)
Modify the FDArray::unserialize function to perform a checkPathRedirect
if a Process pointer is passed in.
Currently when restoring a checkpoint, it doesn't perform
checkPathRedirect for files that were opened during checkpointing. This
patch adds a checkPathRedirect in the FDArray::unserialize to redirect
app path for restoring checkpoints.
2023-09-08 15:30:53 -07:00
Nicholas Mosier
259a5d6272 sim-se: Use tgt_stat64 instead of tgt_stat in newfstatatFunc
The syscall emulation of newfstatat incorrectly treated the output
stat buffer to be of type `OS::tgt_stat`, not `OS::tgt_stat64`, causing
the invalid output stat buffer in the application to hold invalid
data.

This patch fixes the bug by simply substituting the type `OS::tgt_stat`
with `OS::tgt_stat64` in `newstatatFunc()`.

GitHub issue: https://github.com/gem5/gem5/issues/281

Change-Id: Ice97c1fc4cccbfb6824e313ebecde00f134ebf9c
2023-09-08 11:28:54 -07:00
Hoa Nguyen
91d1a5deb5 mem-cache: Fix bug in classic cache while clflush
This change, https://github.com/gem5/gem5/pull/205, mistakenly
allocates write buffer for clflush instruction when there's a
cache miss. However, clflush in gem5 is not a write instruction.
Thus, the cache should allocate miss buffer in this case.

Change-Id: I9c1c9b841159c4420567e9c929e71e4aa27d5c28
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-08 18:16:10 +00:00
Nicholas Mosier
e0498cb490 Merge branch 'develop' into bugfix-chdir 2023-09-07 14:13:00 -07:00
Bobby R. Bruce
eb5ae35341 resources,stdlib: Add workload to resource specialization and deprecate workload.py (#212) 2023-09-07 12:45:45 -07:00
Nicholas Mosier
62e81930d6 Merge branch 'develop' into bugfix-chdir 2023-09-07 09:54:35 -07:00
studyztp
e206b16f73 sim:fixed some style issues
Change-Id: I0832a8b68e802e9671b755d3a71fd9c8f17e1648
2023-09-07 08:52:24 -07:00
studyztp
377c875733 sim: check redirect path when unserialize for cpt
sim/fd_array.hh:
Add "class Process;" to forward declare Process for unserialize
function to pass in a Process object pointer.
Fix the styling issue with include files.

sim/fd_array.cc"
Add comments.

Change-Id: Ifb21eb1c7bad119028b8fd8e610a125100fde696
2023-09-07 08:52:24 -07:00
studyztp
2a4f3f206b sim: modifed the type of path
Change-Id: I56be3b62b1804371b9b9e0f84ee1ec49cbedf553
2023-09-07 08:52:24 -07:00