Commit Graph

20578 Commits

Author SHA1 Message Date
Vishnu Ramadas
107e05266d dev-amdgpu: Add aql, hsa queue information to checkpoint-restore
GPUFS uses aql information from PM4 queues to initialize doorbells. This
commit adds aql information to the checkpoint so that it can be used
during restoration to correctly initialize all doorbells. Additionally,
this commit also sets the hsa queue correctly during checkpoint-restoration.

Change-Id: Ief3ef6dc973f70f27255234872a12c396df05d89
2023-10-02 19:02:50 -05:00
Bobby R. Bruce
f5a255c68d configs: Fixed Typo (#337)
Fixed a typo importing obtain_resource
2023-09-21 11:58:49 -07:00
Bobby R. Bruce
3f9afe96c6 python,util: Add Python MyPy Stubgen to enable Pylance IntelliSense (#307)
This allows us to generate stubs for the modules in gem5. The output
will be a "typings" directory which can be used by Pylance (Python
IntelliSense) to infer typings in Visual Studio Code.

Note: A "typings" directory in the root of the workspace is the default
location for Pylance to look for typings. This can be changed via
`python.analysis.stubPath` in "settings.json".

Usage
=====

```
pip3 install -r requirements.txt
scons build/ALL/gem5.opt -j$(nproc)
./build/ALL/gem5.opt util/gem5-stubgen.py
```
2023-09-21 11:52:16 -07:00
Bobby R. Bruce
958eda6961 arch-riscv: Fix inst flags for jal and jalr (#325)
The jal and jalr instructions share the same instruction format,
JumpConstructor, which sets the IsCall and IsReturn flags based on the
register ID. However, this can set the wrong instruction flags for jal
because the section "handle the 'Jalr' instruction" is missing an opcode
check. This PR fixes the issue so that IsReturn can only be set for Jalr.
2023-09-20 16:25:21 -07:00
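The flag logic described in the jal/jalr fix above can be sketched in Python rather than in gem5's ISA description language. The opcode values are the standard RISC-V encodings. The function name `jump_flags`, the use of x1/x5 as link registers, and the exact call/return rule are assumptions for illustration, not the gem5 code.

```python
JAL_OPCODE = 0x6F   # standard RISC-V opcode for jal
JALR_OPCODE = 0x67  # standard RISC-V opcode for jalr
LINK_REGS = {1, 5}  # x1 (ra) and x5 (t0), assumed to act as link registers


def jump_flags(opcode, rd, rs1=0):
    """Return the set of flags a jump would get in this simplified model."""
    flags = set()
    if rd in LINK_REGS:
        flags.add("IsCall")
    # jal has no rs1 field: those bits belong to its immediate. Without the
    # opcode check, immediate bits could be misread as a link register.
    if opcode == JALR_OPCODE and rs1 in LINK_REGS and rs1 != rd:
        flags.add("IsReturn")
    return flags


# Bits that happen to look like rs1 == x1 must not mark a jal as a return.
print(jump_flags(JAL_OPCODE, rd=0, rs1=1))
print(jump_flags(JALR_OPCODE, rd=0, rs1=1))
```

The first call yields no flags, while the second yields IsReturn, which is the distinction the opcode check restores.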
Bobby R. Bruce
aa0702c6eb dev-amdgpu: Handle GPU atomics on host memory addresses (#328)
It is possible to execute a GPU atomic instruction using a memory
address that is in the host memory space (e.g., HMM, __managed__,
hipHostMalloc'd address). Since these are in host memory they are passed
to the SystemHub DmaDevice. However, this currently executes as a write
packet without modifying data. This leads to hangs in applications that
use atomics for forward progress (e.g., HeteroSync).

It is not clear where these are handled on a real GPU, but they are
certainly not handled by the software stack nor driver, so they must be
handled in hardware and therefore implemented in gem5. Handling for
atomics in the SystemHub makes the most sense.

To make atomics work a few extra changes need to be made to the
SystemHub. (1) The atomic is implemented as a host memory read, followed
by calling the AtomicOpFunctor, followed by a write. This requires a
second event to handle read response, performing atomic, and issuing a
write. (2) Atomics must be serialized; otherwise two atomics might return
the same value, which is incorrect. This patch adds serialization logic
for all request types to the same address to handle this. (3) With the
added complexity of the SystemHub, a new debug flag explicitly for
SystemHub is added.

Testing done: The heterosync application with input "sleepMutex 10 16 4"
previously hung before this patch. It passes with the patch applied.
This application tests both (1) and (2) above, as it allocates locks
with hipHostMalloc and has multiple workgroups sending an atomic request
in the same Tick, verifying the serialization mechanism.
2023-09-20 16:24:56 -07:00
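The two mechanisms described in the SystemHub atomics commit above (read, apply functor, write; and per-address serialization) can be modelled with a toy Python sketch. The class `SystemHubSketch` and its methods are hypothetical and do not mirror the gem5 C++ implementation or its event scheduling.

```python
from collections import defaultdict, deque


class SystemHubSketch:
    """Toy model: serialize requests per address, then read-modify-write."""

    def __init__(self):
        self.memory = defaultdict(int)    # stand-in for host memory
        self.queues = defaultdict(deque)  # addr -> requests in arrival order

    def send_atomic(self, addr, functor, on_response):
        # Requests to the same address are queued instead of issued at once.
        self.queues[addr].append((functor, on_response))

    def drain(self):
        # Service one request at a time per address: read, apply, write back.
        for addr, queue in self.queues.items():
            while queue:
                functor, on_response = queue.popleft()
                old = self.memory[addr]           # host memory read
                self.memory[addr] = functor(old)  # atomic op, then write
                on_response(old)                  # atomics return the old value


hub = SystemHubSketch()
returned = []
for _ in range(2):  # two workgroups hit the same lock in the same tick
    hub.send_atomic(0x1000, lambda v: v + 1, returned.append)
hub.drain()
print(returned, hub.memory[0x1000])  # [0, 1] 2
```

Because the second request only reads after the first has written, the two increments observe different old values. If both had read before either wrote, both would have returned 0.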
Bobby R. Bruce
4526a314a9 arch-x86: fix negative overflow check bug in PACK micro-op (#332)
The implementation of the x86 PACK micro-op had a logical bug that
caused the `PACKSSWB` and `PACKSSDW` instructions to produce incorrect
results. Specifically, due to a signedness error, the overflow check for
negative integers being packed always evaluated to true, resulting in
all negative integers being packed as -1 in the output.

This patch fixes the signedness error that causes the bug.

GitHub issue: https://github.com/gem5/gem5/issues/331
2023-09-20 16:18:16 -07:00
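For reference, the signed saturation that `PACKSSWB` (16-bit to 8-bit) and `PACKSSDW` (32-bit to 16-bit) are expected to perform can be sketched in Python. The helper name `pack_signed_saturate` is hypothetical. It shows the intended result per element, not the gem5 micro-op code or the exact form of the original bug.

```python
def pack_signed_saturate(raw, src_bits, dst_bits):
    """Narrow one element with signed saturation.

    `raw` is the element's bit pattern; it is first reinterpreted as a
    signed src_bits-wide integer so that negative inputs compare correctly.
    """
    raw &= (1 << src_bits) - 1
    value = raw - (1 << src_bits) if raw >> (src_bits - 1) else raw
    lowest = -(1 << (dst_bits - 1))
    highest = (1 << (dst_bits - 1)) - 1
    return max(lowest, min(highest, value))


print(pack_signed_saturate(0xFFFB, 16, 8))  # -5: small negatives pass through
print(pack_signed_saturate(0x8000, 16, 8))  # -128: large negatives saturate
print(pack_signed_saturate(300, 16, 8))     # 127: large positives saturate
```

Comparing the raw bit pattern as an unsigned number would classify every negative input as out of range, which is the kind of signedness error the commit describes.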
Marco Kurzynski
516dcf3bcd configs: Fixed Typo
Fixed a typo importing obtain_resource

Change-Id: I5792ca161187c6576e2501e5aaea610d8b8ee5ea
2023-09-20 21:42:56 +00:00
Matthew Poremba
63cabf2848 dev-amdgpu: Handle GPU atomics on host memory addresses
It is possible to execute a GPU atomic instruction using a memory
address that is in the host memory space (e.g., HMM, __managed__,
hipHostMalloc'd address). Since these are in host memory they are passed
to the SystemHub DmaDevice. However, this currently executes as a write
packet without modifying data. This leads to hangs in applications that
use atomics for forward progress (e.g., HeteroSync).

It is not clear where these are handled on a real GPU, but they are
certainly not handled by the software stack nor driver, so they must be
handled in hardware and therefore implemented in gem5. Handling for
atomics in the SystemHub makes the most sense.

To make atomics work a few extra changes need to be made to the
SystemHub. (1) The atomic is implemented as a host memory read, followed
by calling the AtomicOpFunctor, followed by a write. This requires a
second event to handle read response, performing atomic, and issuing a
write. (2) Atomics must be serialized; otherwise two atomics might return
the same value, which is incorrect. This patch adds serialization logic
for all request types to the same address to handle this. (3) With the
added complexity of the SystemHub, a new debug flag explicitly for
SystemHub is added.

Testing done: The heterosync application with input "sleepMutex 10 16 4"
previously hung before this patch. It passes with the patch applied.
This application tests both (1) and (2) above, as it allocates locks
with hipHostMalloc and has multiple workgroups sending an atomic request
in the same Tick, verifying the serialization mechanism.

Change-Id: Ife84b30037d1447dd384340cfeb06fdfd472fff9
2023-09-20 13:52:25 -05:00
Bobby R. Bruce
6eb7c10eb9 misc: Add HACC GPU tests (#258)
This adds the HACC GPU tests to be run weekly
2023-09-20 11:26:54 -07:00
Roger Chang
70c1d762c7 arch-riscv: Fix inst flags for jal and jalr
The jal and jalr instructions share the same instruction format,
JumpConstructor, which sets the IsCall and IsReturn flags based on the
register ID. However, this can set the wrong instruction flags for jal
because the section "handle the 'Jalr' instruction" is missing an
opcode check. This PR fixes the issue so that IsReturn can only be
set for Jalr.

Change-Id: I9ad867a389256f9253988552e6567d2b505a6901
2023-09-20 14:27:23 +08:00
Nicholas Mosier
741a901d8d arch-x86: fix negative overflow check bug in PACK micro-op
The implementation of the x86 PACK micro-op had a logical bug that
caused the `PACKSSWB` and `PACKSSDW` instructions to produce
incorrect results. Specifically, due to a signedness error, the
overflow check for negative integers being packed always evaluated
to true, resulting in all negative integers being packed as -1 in
the output.

This patch fixes the signedness error that causes the bug.

GitHub issue: https://github.com/gem5/gem5/issues/331

Change-Id: I44b7328a8ce31742a3c0dfaebd747f81751e8851
2023-09-20 05:09:32 +00:00
Bobby R. Bruce
3bdcfd6f7a mem-ruby: patch fixes a protocol error in MOESI_CMP_Directory (#316)
When there is a race between FwdGetX and PUTX on the owner, the owner
hands off ownership to the GetX requestor and the PUTX still goes
through. But since the owner has changed, the state should go back to M
and the PUTX is essentially trashed.
An Unblock to the Directory in this case will give an undefined
transition. I have added transitions which indicate that when an Unblock
is served to the Directory, it means that some kind of ownership
transfer has happened while a PUTX/PUTO was in progress.
2023-09-15 13:25:51 -07:00
Bobby R. Bruce
23442727f7 util,resources,stdlib: Add 'obtain-resource.py' utility to easily obtain resources from the CLI (#317)
This allows users to obtain resources via the CLI instead of having to
write a python script to do so. It is essentially a nice CLI wrapper for
"gem5.resources.resource.obtain_resource"

## Usage

```sh
> scons build/ALL/gem5.opt -j `nproc`
> ./build/ALL/gem5.opt util/obtain-resource.py --help

usage: obtain-resource.py [-h] [-p PATH] [-q] id

positional arguments:
  id                    The resource id to download.

options:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  The path the resource is to be downloaded to. If not specified, the resource will be downloaded to the default
                        location in the gem5 local cache of resources
  -q, --quiet           Suppress output.
```

E.g.:

```sh
./build/ALL/gem5.opt util/obtain-resource.py arm-hello64-static -p arm-hello
```

Will download the resource with ID `arm-hello64-static` to `arm-hello`
in the CWD.
2023-09-14 21:04:30 -07:00
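The command-line interface shown in the help output above can be reproduced with `argparse`, as in this minimal sketch. The parser options are taken from that help text. `obtain_resource_stub` is a hypothetical stand-in so the example runs outside gem5; the real function is `gem5.resources.resource.obtain_resource`, and this is not the contents of `util/obtain-resource.py`.

```python
import argparse


def obtain_resource_stub(resource_id, to_path=None, quiet=False):
    """Hypothetical stand-in that only records what would be requested."""
    return {"id": resource_id, "to_path": to_path, "quiet": quiet}


def build_parser():
    parser = argparse.ArgumentParser(prog="obtain-resource.py")
    parser.add_argument("id", type=str, help="The resource id to download.")
    parser.add_argument(
        "-p", "--path", type=str, default=None,
        help="The path the resource is to be downloaded to.",
    )
    parser.add_argument(
        "-q", "--quiet", action="store_true", help="Suppress output."
    )
    return parser


# Same arguments as the example invocation in the commit message.
args = build_parser().parse_args(["arm-hello64-static", "-p", "arm-hello"])
request = obtain_resource_stub(args.id, to_path=args.path, quiet=args.quiet)
print(request)
```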
Bobby R. Bruce
600ea81031 util: Add 'obtain-resource.py' utility
This can be used to obtain a resource from gem5-resources.

Change-Id: I922d78ae0450bf011f18893ffc05cb1ad6c97572
2023-09-14 15:33:17 -07:00
Bobby R. Bruce
a101b1aba3 stdlib: Add 'to_path' arg to obtain_resource
This allows a user to specify the exact path they want a resource to
be downloaded to. This differs from 'resource_directory' in that a user
may specify the file/directory name of the resource (using just
'resource_directory' will name the resource after its ID in that directory).

Change-Id: I887be6216c7607c22e49cf38226a5e4600f39057
2023-09-14 15:33:17 -07:00
Bobby R. Bruce
b12f28af96 stdlib: Add 'quiet' option to obtain_resource func
Change-Id: I15d3be959ba7ab8af328fc6ec2912a8151941a1e
2023-09-14 15:33:17 -07:00
Bobby R. Bruce
46be2d2339 misc,tests: Use GitHub Docker registry for 22.04 all-deps (#321)
Via this workflow we now can build and push our docker images to the
GitHub Docker container registry:

26a1ee4e61/.github/workflows/docker-build.yaml

GitHub does not charge for downloads to runners (hosted or self-hosted).
This can therefore save the project money if we download from GitHub's
Docker registry over Google Cloud's.

This is a test to ensure this works as intended.
2023-09-14 15:10:58 -07:00
Bobby R. Bruce
017fb51fad misc,tests: Remove duplicate running of daily gem5_library_tests (#318)
The long/daily tests in "tests/gem5/gem5_library_tests" were running in
both the "testlib-long-tests" and the
"testlib-long-gem5_library_example_tests" job in the Daily tests
Workflow. The running in "testlib-long-tests" is removed in this PR.
2023-09-14 15:10:02 -07:00
Bobby R. Bruce
1c5870d775 misc: Update docker-build.yaml artifact actions to v3 (#322)
v2 uses some deprecated dependencies.
2023-09-14 15:09:47 -07:00
Melissa Jost
29fa894e19 misc: Add HACC GPU tests
This adds the HACC GPU tests to be run weekly

Change-Id: I77d58ee9a3d067a749bae83826266bf89bb5020f
2023-09-14 10:35:10 -07:00
Bobby R. Bruce
210ab04bca misc: Update docker-build.yaml artifact actions to v3
Change-Id: I4dea25fcfb786758942e6245133d32949b921774
2023-09-14 01:28:10 -07:00
Bobby R. Bruce
59a96c8c2f mem-cache: Fix bug in classic cache while clflush (#274)
This change, https://github.com/gem5/gem5/pull/205, mistakenly allocates
a write buffer for the clflush instruction when there's a cache miss.
However, clflush in gem5 is not a write instruction. Thus, the cache
should allocate a miss buffer in this case.
2023-09-14 01:14:39 -07:00
Bobby R. Bruce
040f4d5ae0 misc,tests: Use GitHub Docker registry for 22.04 all-deps
Via this workflow we now can build and push our docker images to
the GitHub Docker container registry:
26a1ee4e61/.github/workflows/docker-build.yaml

GitHub does not charge for downloads to runners (hosted or self-hosted).
This can therefore save the project money if we download from GitHub's
Docker registry over Google Cloud's.

This is a test to ensure this works as intended.

Change-Id: Iccdb1b7a912f1e0a0d82b7f888694958099315b3
2023-09-14 01:04:05 -07:00
Bobby R. Bruce
26a1ee4e61 configs: 'memoy' -> 'memory' spelling mistake fix (#314)
Fixes https://github.com/gem5/gem5/issues/309
2023-09-13 22:59:48 -07:00
Bobby R. Bruce
7a17c780bd misc: Use 'workdir' for docker-build.yaml (#320)
2023-09-13 22:54:01 -07:00
Bobby R. Bruce
772a316dab misc: Use 'workdir' for docker-build.yaml
Change-Id: If8b30a31e1a8c3fdba84d69da4bb28e09179cb96
2023-09-13 22:52:26 -07:00
Bobby R. Bruce
61339b6471 misc: Fix docker build workflow (#319)
2023-09-13 22:48:20 -07:00
Bobby R. Bruce
dc02862c56 misc: Fix docker build workflow
Change-Id: Ib66cc124a4c3ce1354faee092f14543e699dca40
2023-09-13 22:47:08 -07:00
Bobby R. Bruce
1d160e6ab0 scons: Revert "Add an option specifying the path to mold linker binary" (#313)
Reverts https://github.com/gem5/gem5/pull/244

Fixes https://github.com/gem5/gem5/issues/312
2023-09-13 22:02:30 -07:00
Bobby R. Bruce
5102072950 misc,tests: Rm duplicate running of daily gem5_library_tests
The long/daily tests in "tests/gem5/gem5_library_tests" were running in
both the "testlib-long-tests" and the
"testlib-long-gem5_library_example_tests" job in the Daily tests
Workflow. The running in "testlib-long-tests" is removed in this patch.

Change-Id: I1c665529e3dcb594ffb7f6e2224077ae366772d6
2023-09-13 17:50:56 -07:00
Gautham Pathak
178db9e270 mem-ruby: patch fixes a protocol error in MOESI_CMP_Directory
When there is a race between FwdGetX and PUTX on the owner, the
owner hands off ownership to the GetX requestor and the PUTX still
goes through. But since the owner has changed, the state should
go back to M and the PUTX is essentially trashed.
An Unblock to the Directory in this case will give an undefined
transition. I have added transitions which indicate that when
an Unblock is served to the Directory, it means that some kind
of ownership transfer has happened while a PUTX/PUTO was in
progress.

Change-Id: I37439b5a363417096030a0875a51c605bd34c127
2023-09-13 19:09:13 -04:00
Bobby R. Bruce
b53a311363 misc,util-docker: Fix docker-build.yaml (#285)
The https://github.com/gem5/gem5/actions/runs/6114221855 failure was due
to running the actions inside our 22.04-all-dependencies container. This
container does not contain docker. We must therefore run this action
outside of the container. However, due to our policy of checking out the
code within this container, we must split this into two jobs and use the
artifact upload and download to get the resources we want.
2023-09-13 15:54:15 -07:00
Bobby R. Bruce
d38c029195 mem-ruby: This commit patches an error in AbstractController.cc (#294)
Links to #293 

After calling m5_dump_reset_stats(0,0) in a test program, some
statistics like
l1_controllers.L1Dcache.m_demand_hits,
l1_controllers.L1Dcache.m_demand_misses,
l1_controllers.L1Dcache.m_demand_accesses
were not getting reset in the newer stat dumps.
This one line patch fixes that. Changes were tested with calling two
m5_dump_reset_stats(0,0) in a row for a system with 1 core, tested on
both SE and FS.
Credits: @MeatBoy106
2023-09-13 15:48:46 -07:00
Bobby R. Bruce
673d4b2ac2 arch-x86: initialize and correct bitwidth for FPU tag word (#304)
The x87 FPU tag word (FTW) was not explicitly initialized in
{X86_64,i386}Process::initState(), leaving it with an initial value of
zero and thus an invalid x87 FPU state. This commit initializes
FTW to 0xFFFF, indicating the FPU is empty at program start during
syscall emulation.

The 16-bit FTW register was also incorrectly masked down to 8 bits in
X86ISA::ISA::setMiscRegNoEffect(), leading to an invalid x87 FPU state
that later caused crashes in the X86KvmCPU. This commit corrects the
bitwidth of the mask to 16.

GitHub issue: https://github.com/gem5/gem5/issues/303
2023-09-13 15:47:50 -07:00
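To see why the mask width matters for the FTW fix above, the following Python sketch decodes the tag word. The x87 tag word holds eight 2-bit tags, with 0b11 meaning "empty". The helper names `mask_ftw` and `decode_tags` are hypothetical and are not gem5 functions.

```python
TAG_NAMES = {0b00: "valid", 0b01: "zero", 0b10: "special", 0b11: "empty"}


def mask_ftw(value, width_bits):
    """Keep only the low width_bits of a value written to FTW."""
    return value & ((1 << width_bits) - 1)


def decode_tags(ftw):
    """Decode the eight 2-bit tags, one per x87 data register."""
    return [TAG_NAMES[(ftw >> (2 * i)) & 0b11] for i in range(8)]


correct = mask_ftw(0xFFFF, 16)   # full 16-bit tag word survives
truncated = mask_ftw(0xFFFF, 8)  # an 8-bit mask drops the upper four tags

print(decode_tags(correct))
print(decode_tags(truncated))
```

With the 16-bit mask all eight registers decode as empty. With the 8-bit mask the upper four tags read as "valid" even though nothing was ever loaded into those registers, and an all-zero FTW marks all eight that way.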
Bobby R. Bruce
23c1014677 util: Fix TLM configs making use of TraceCPU replayer (#310)
A recent PR [1] moved the TraceCPU away from the BaseCPU hierarchy.
While the common etrace_replayer.py has been amended, I missed these
hybrid TLM + TraceCPU example scripts.

[1]: https://github.com/gem5/gem5/pull/302
2023-09-13 15:47:05 -07:00
Bobby R. Bruce
e42d71e802 configs: 'memoy' -> 'memory' spelling mistake fix
Fixes https://github.com/gem5/gem5/issues/309

Change-Id: I41ac7c5559d49353d01b3676b5bdf7b91e4efbda
2023-09-13 14:30:22 -07:00
Bobby R. Bruce
d463f73a43 scons: Revert "Add an option specifying the..."
Change-Id: I2bd952d3cfd6c3c671b5ab3458e44c53f93bf649
2023-09-13 14:28:05 -07:00
Gautham Pathak
87db6df8f6 mem-ruby: This commit patches an error in AbstractController.cc
After calling m5_dump_reset_stats(0,0) in a test program,
some statistics like
l1_controllers.L1Dcache.m_demand_hits,
l1_controllers.L1Dcache.m_demand_misses,
l1_controllers.L1Dcache.m_demand_accesses
were not getting reset in the newer stat dumps.
This one line patch fixes that. Changes were tested with
calling two m5_dump_reset_stats(0,0) in a row for a system
with 1 core, tested on both SE and FS.
Credits to Gabriel Busnot for finding the fix.

Change-Id: I19d75996fa53d31ef20f7b206024fd38dbeac643
2023-09-13 14:07:16 -04:00
Giacomo Travaglini
f95e1505b8 util: Fix TLM configs making use of TraceCPU replayer
A recent PR [1] moved the TraceCPU away from the BaseCPU hierarchy.
While the common etrace_replayer.py has been amended, I missed these
hybrid TLM + TraceCPU example scripts.

[1]: https://github.com/gem5/gem5/pull/302

Change-Id: I7e9bc9a612d2721d72f5881ddb2fb4d9ee011587
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-13 13:36:33 +01:00
Bobby R. Bruce
5fd901ffbb cpu, configs: Fix TraceCPU after multi-ISA addition (#302)
This PR fixes #301
2023-09-12 17:26:27 -07:00
Bobby R. Bruce
133e4ed636 misc: Add "typings" directory to .gitignore
This is used by Pylance IntelliSense to infer gem5 typing.

See "util/gem5-stubgen.py" for generating this directory.

Change-Id: Ie39762c718e5392f6194ff7c8238bd0cd677f486
2023-09-12 15:20:06 -07:00
Bobby R. Bruce
bceac5d951 util: Allow MyPy stubgen to aid Pylance IntelliSense
Change-Id: I42fe177e5ae428afd0f23ea482b6af5b7d3ecaf9
2023-09-12 15:19:49 -07:00
Bobby R. Bruce
39f0bcd9af python: Mimic Python 3's -P flag in gem5
Python 3's `-P` flag, when set, means `sys.path` is not prepended with
potentially unsafe paths:
https://docs.python.org/3/using/cmdline.html#cmdoption-P

This patch allows gem5 to mimic this. This is necessary when using
`mypy.stubgen` as it expects the Python Interpreter to have the `-P`
flag.

Change-Id: I456c8001d3ee1e806190dc37142566d50d54cc90
2023-09-12 14:51:59 -07:00
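The effect of `-P` on `sys.path` can be illustrated with a small Python model. This is only a sketch of the documented behaviour for running a script. The function `initial_sys_path` and its parameters are hypothetical, and how gem5 implements the flag internally is not shown in the commit message.

```python
def initial_sys_path(default_entries, script_dir, safe_path):
    """Model the start of sys.path when a script is run.

    Without -P, the script's directory is prepended. With -P (safe_path),
    that potentially unsafe entry is left out.
    """
    if safe_path:
        return list(default_entries)
    return [script_dir, *default_entries]


defaults = ["/usr/lib/python3", "/usr/lib/python3/site-packages"]
print(initial_sys_path(defaults, "/tmp/scripts", safe_path=False))
print(initial_sys_path(defaults, "/tmp/scripts", safe_path=True))
```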
Nicholas Mosier
2178e26bf2 arch-x86: initialize and correct bitwidth for FPU tag word
The x87 FPU tag word (FTW) was not explicitly initialized in
{X86_64,i386}Process::initState(), leaving it with an initial
value of zero and thus an invalid x87 FPU state. This commit
initializes FTW to 0xFFFF, indicating the FPU is empty at program
start during syscall emulation.

The 16-bit FTW register was also incorrectly masked down to 8 bits
in X86ISA::ISA::setMiscRegNoEffect(), leading to an invalid x87 FPU
state that later caused crashes in the X86KvmCPU. This commit
corrects the bitwidth of the mask to 16.

GitHub issue: https://github.com/gem5/gem5/issues/303

Change-Id: I97892d707998a87c1ff8546e08c15fede7eed66f
2023-09-12 15:39:29 +00:00
Bobby R. Bruce
1bebf6a3cc sim-se: Use tgt_stat64 instead of tgt_stat in newfstatatFunc (#283)
The syscall emulation of newfstatat incorrectly treated the output stat
buffer as being of type `OS::tgt_stat`, not `OS::tgt_stat64`, causing
the output stat buffer in the application to hold invalid data.

This patch fixes the bug by simply substituting the type `OS::tgt_stat`
with `OS::tgt_stat64` in `newfstatatFunc()`.

GitHub issue: https://github.com/gem5/gem5/issues/281
2023-09-12 08:33:42 -07:00
Bobby R. Bruce
94e5a0cccf sim-se: Fix tgkill logic bug in handling signal argument (#286)
The syscall emulation of tgkill contained a simple logic bug (a `||`
instead of a `&&`), causing the signal argument to always be considered
invalid. This patch fixes the bug by simply changing the `||` to a `&&`.

GitHub issue: https://github.com/gem5/gem5/issues/284
2023-09-12 08:32:56 -07:00
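The `||` versus `&&` mistake in the tgkill fix above is easy to reproduce in isolation. The Python sketch below assumes, purely for illustration, that only signal 0 and SIGABRT are accepted. The set of signals gem5 actually accepts and the function names are assumptions, not taken from the commit.

```python
import signal

SIGABRT = int(signal.SIGABRT)


def is_invalid_buggy(sig):
    # No value equals both 0 and SIGABRT, so one side is always true.
    return sig != 0 or sig != SIGABRT


def is_invalid_fixed(sig):
    # Invalid only when the signal matches neither accepted value.
    return sig != 0 and sig != SIGABRT


print(all(is_invalid_buggy(s) for s in range(0, 65)))
print(is_invalid_fixed(0), is_invalid_fixed(SIGABRT), is_invalid_fixed(9))
```

The buggy check rejects every signal number, including the ones meant to be accepted, which matches the "always considered invalid" behaviour described in the commit.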
Bobby R. Bruce
d67a6603c1 cpu-kvm: properly set x86 xsave header on gem5->KVM transition (#298)
If the XSAVE KVM capability is available (KVM_CAP_XSAVE), the X86KvmCPU
will try to set the x87 FPU + SSE state using KVM_SET_XSAVE, which
expects a buffer (struct kvm_xsave) in XSAVE area format (Vol. 1, Sec.
13.4 of Intel x86 SDM). The original implementation of
`X86KvmCPU::updateKvmStateFPUXSave()`, however, improperly sets the
xsave header, which contains a bitmap of state components present in the
xsave area.

This patch defines an `XSaveHeader` structure to model the xsave header,
which is expected directly following the legacy FPU region (defined in
the `FXSave` structure) in the xsave area. It then sets two bits in the
xsave header to indicate the presence of x87 FPU and SSE state
components.

GitHub issue: https://github.com/gem5/gem5/issues/296
2023-09-12 08:32:20 -07:00
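The layout described in the xsave commit above can be sketched with Python's `struct` module. The sizes and bit positions follow the standard (non-compacted) XSAVE format: a 512-byte legacy region, then a 64-byte header whose first 8 bytes are the XSTATE_BV bitmap, with bit 0 for x87 and bit 1 for SSE. The function `build_xsave_area` is hypothetical and is not the gem5 `XSaveHeader` code.

```python
import struct

LEGACY_REGION_SIZE = 512  # FXSAVE-format legacy region
XSAVE_HEADER_SIZE = 64    # header directly follows the legacy region
XSTATE_X87 = 1 << 0       # XSTATE_BV bit 0: x87 FPU state present
XSTATE_SSE = 1 << 1       # XSTATE_BV bit 1: SSE state present


def build_xsave_area(legacy_region):
    """Append an xsave header that marks x87 and SSE state as present."""
    if len(legacy_region) != LEGACY_REGION_SIZE:
        raise ValueError("legacy region must be 512 bytes")
    xstate_bv = XSTATE_X87 | XSTATE_SSE
    xcomp_bv = 0  # standard format, no compaction
    header = struct.pack("<QQ", xstate_bv, xcomp_bv)
    header = header.ljust(XSAVE_HEADER_SIZE, b"\x00")  # rest is reserved
    return legacy_region + header


area = build_xsave_area(bytes(LEGACY_REGION_SIZE))
bitmap = struct.unpack_from("<Q", area, LEGACY_REGION_SIZE)[0]
print(len(area), hex(bitmap))  # 576 0x3
```

If the bitmap is left at zero, the consumer of the buffer treats both components as absent regardless of what the legacy region contains, which is why the header must be filled in.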
Bobby R. Bruce
5fefbe2933 arch-riscv: Enable RVV run in Minor and O3 CPU (#228)
Changes in the PR:

1. Change the vset\*vl\* instructions to jump/branch family, and
implement the branchTarget.
2. Move the Vl and Vtype from decoder to PCState.
3. Get VL, VTYPE and VLENB value from PCState.
4. Remove vtype checking in construction so that the Minor and O3 CPUs
can decode the instructions after the vset\*vl\*.
2023-09-12 08:31:36 -07:00
Giacomo Travaglini
a0a799f474 cpu: Disable CPU switching functionality with TraceCPU
Now that the TraceCPU is no longer a BaseCPU we disable CPU switching
functionality. AFAICS from the code, it seems like using m5.switchCpus
was never really working.
The takeOverFrom was described as being used when checkpointing
(which is not really the case). Moreover the icache/dcache
event loops were not checking if the CPU was switched out
so the trace was always being consumed regardless of the BaseCPU
state.

Note: IMHO the only case where you might want to switch from
an execution-driven CPU to the TraceCPU is when you want to
warm your caches before the ROI.
All other cases don't really make sense as with the TraceCPU
there is no architectural state being maintained/updated.

Change-Id: I0611359d2b833e1bc0762be72642df24a7c92b1e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-12 15:50:05 +01:00
Giacomo Travaglini
785eba6ce1 configs: Reflect TraceCPU changes in the etrace_replay script
As we no longer inherit from the BaseCPU, we can't really use
CPU generation methods (like Simulation.setCPUClass) and
cache generation ones (like CacheConfig.config_cache).

This is good news as it allows us to simplify the etrace
script and to remove a dependency on the soon-to-be-deprecated
common library.

Change-Id: Ic89ce2b9d713ee6f6e11bf20c5065426298b3da2
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-12 15:49:39 +01:00