Commit Graph

5705 Commits

Author SHA1 Message Date
Giacomo Travaglini
8233aa8a9b arch-arm: Implement a CapstoneDisassembler for Arm
Change-Id: Id3135bda065efa9b4f3ab36972957fd00c05a53c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:51 +01:00
Giacomo Travaglini
34336208b7 arch-arm: Disassemble through InstDisassembler in TarmacTracer
Change-Id: I5407338501084c016522749be697dd688ca51735
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:51 +01:00
Giacomo Travaglini
27ce721ad3 arch-arm: Pass a reference of the parent tracer to TarmacContext
Change-Id: I7ab0442353a8b5854bb6b50bd54dac89f83ecc1d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:51 +01:00
Giacomo Travaglini
81b6e296dd arch-arm: disassemble member variable not used by TarmacParser
We move it to the child class which is what the TarmacTracer
actually uses.

Change-Id: Ia30892723d2e1f7306dae87c6c9c1d69d00ad73d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-20 09:27:50 +01:00
Alvaro Moreno
edf1d69257 arch-riscv: Define vlwhole/vswhole mem accesses using vlen.
This patch fixes the size of the memory accesses in vswhole and
vlwhole instructions to the maximum vector length.

Change-Id: Ib86b5356d9f1dfa277cb4b367893e3b08242f93e
2023-10-19 00:27:58 +02:00
Alvaro Moreno
52219e5e6f arch-riscv: Add elen configuration to vector config instructions
This patch adds elen as a member of vector configuration instructions
so it can be used with speculative execution.

Change-Id: Iaf79015717a006374c5198aaa36e050edde40cee
2023-10-19 00:27:58 +02:00
Alvaro Moreno
2c9fca7b60 arch-riscv: Add vlen configuration to vector instructions
First, vlen is added as a member of Vector Macro Instructions,
where it is needed to split the instruction into Micro Instructions.

Then, new PCState methods are used to get dynamic vlen and vlenb
values at execution.

Finally, vector length data types are fixed to 32 bits so every vlen value
is considered.

Change-Id: I5b8ceb0d291f456a30a4b0ae2f58601231d33a7a
2023-10-19 00:27:58 +02:00
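
As a rough illustration of the quantities this commit deals with, the sketch below derives vlenb and VLMAX from VLEN using the relations given in the RISC-V vector specification (vlenb = VLEN / 8, VLMAX = LMUL * VLEN / SEW). The function names are hypothetical and are not gem5 APIs; Python is used only for brevity.

```python
def vlenb(vlen_bits: int) -> int:
    """Vector register length in bytes (the value held by the vlenb CSR)."""
    if vlen_bits <= 0 or vlen_bits % 8 != 0:
        raise ValueError("VLEN must be a positive multiple of 8 bits")
    return vlen_bits // 8


def vlmax(vlen_bits: int, sew_bits: int, lmul: float = 1.0) -> int:
    """Maximum element count of a register group: LMUL * VLEN / SEW."""
    return int(lmul * vlen_bits / sew_bits)
```

With VLEN configurable per ISA object, values like these have to be computed at execution time rather than fixed at compile time, which is the motivation stated in the commit message.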
Alvaro Moreno
8a20f20f79 arch-riscv: Add vlen component to decoder state
This patch adds the vlen definition to the RISC-V decoder so it can be
used in Vector Instruction Constructors.

Change-Id: I52292bc261c43562b690062b16d2b323675c2fe0
2023-10-19 00:27:58 +02:00
Alvaro Moreno
5d97cb8b0b arch-riscv: Define VLEN and ELEN through the ISA object
This commit defines VLEN and ELEN values as parameters of the RiscvISA class.

Change-Id: Ic5b80397d316522d729e4db4f906aa189f27a491
2023-10-19 00:27:58 +02:00
Alvaro Moreno
57e0ba7765 arch-riscv: Define VecRegContainer with maximum expected length
This patch redefines VecRegContainer for RISC-V so it can hold every
possible VLEN + ELEN configuration used at execution time.

Change-Id: Ie6abd01a1c4ebe9aae3d93f4e835fcfdc4a82dcd
2023-10-19 00:27:58 +02:00
Hoa Nguyen
c3acfdc9b8 arch-riscv: Copy Misc Regs when switching cpus (#479)
Misc Regs might contain rather important information about the state of
a core, e.g., information in CSR registers.

This patch enforces copying the CSR registers when switching cpus. The
bug and the proposed fix are reported here [1].

[1] https://github.com/gem5/gem5/issues/451

Change-Id: I611782e6e3bcd5530ddac346342a9e0e44b0f757

Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-18 10:51:37 -07:00
Bobby R. Bruce
334df18dce arch-riscv: Add bootloader+kernel workload (#390)
Aims to boot OpenSBI + Linux kernel.
2023-10-18 09:17:12 -07:00
Bobby R. Bruce
adb5470996 arch-arm: Fix (other) line-length errors (#468)
https://github.com/gem5/gem5/pull/459 missed a couple.
This commit should complete the task.
2023-10-16 17:47:46 -07:00
Bobby R. Bruce
aaefda3b08 arch-arm: Fix line-length error in branch64.is
Change-Id: I62c5d5fd47927a838e6731a464fc7e6d8afab768
2023-10-16 10:57:03 -07:00
Hoa Nguyen
d048ad34d6 arch-riscv: Change VS bits to DIRTY for rvv insts changing vregs (#376)
This is similar to [1] and [2].

Essentially, the VS bits of the STATUS CSR keep track of the state of
the vector registers. (VS bits == DIRTY) means the content of the vector
registers has been updated since the last time the VS bits were
updated.

This chain of changes is supposed to change the VS bits to DIRTY if any
vector register is potentially updated.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/pull/370

Change-Id: I0427890dadc63b74a470d7405807dcfcad18005b
2023-10-16 10:07:40 -07:00
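
A minimal sketch of the bookkeeping described above, assuming the field layout from the RISC-V privileged specification (VS occupies bits [10:9] of the status register, with encodings Off=0, Initial=1, Clean=2, Dirty=3). The helper names are hypothetical and do not correspond to gem5 functions.

```python
VS_SHIFT = 9                  # VS occupies bits [10:9] of mstatus/sstatus
VS_MASK = 0b11 << VS_SHIFT

VS_OFF, VS_INITIAL, VS_CLEAN, VS_DIRTY = 0, 1, 2, 3


def get_vs(status: int) -> int:
    """Extract the two-bit VS field from a status register value."""
    return (status & VS_MASK) >> VS_SHIFT


def set_vs(status: int, vs: int) -> int:
    """Return the status value with VS replaced, other bits untouched."""
    return (status & ~VS_MASK) | ((vs & 0b11) << VS_SHIFT)


def mark_vregs_written(status: int) -> int:
    """Model an instruction that may have updated a vector register."""
    return set_vs(status, VS_DIRTY)
```

An operating system can then check the field on a context switch and skip saving the vector state when it is not DIRTY.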
Hoa Nguyen
9b2b6cd8d2 arch-riscv: Mark vector configuration insts as vector insts (#463)
2023-10-16 09:40:09 -07:00
Bobby R. Bruce
322b105b9d arch-arm: Fix (another) line-length error in misc.cc
https://github.com/gem5/gem5/pull/459 missed one.
This commit should complete the task.

Change-Id: I0aeba79d6f13ddc45effe141945f5636b75daecc
2023-10-16 09:37:51 -07:00
Bobby R. Bruce
97f4b44dd3 arch-arm: Fix line-length error in misc.cc (#459)
2023-10-16 08:35:54 -07:00
Giacomo Travaglini
3f925c4084 arch-arm: Mark gem5 pseudo-ops with IsPseudo flag
Change-Id: I9c8a146d73596597f28cdeca22ad7b7b01b381a7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-16 13:42:23 +01:00
Giacomo Travaglini
2e85c95f4b arch-arm: Remove Jazelle state + ThumbEE support (#364)
This PR removes Jazelle state (while still keeping a "Trivial Jazelle
implementation", see Arm Architecture Reference Manual) and ThumbEE
support.
2023-10-16 09:41:44 +01:00
Yu-Cheng Chang
a3c51ca38c arch-riscv: Fix write back register issue of vmask_mv_micro (#443)
After the removal of setRegOperand in VecRegOperand
(https://github.com/gem5/gem5/pull/341), vmask_mv_micro no longer writes
back to the register because tmp_d0 is not a reference type. This PR
makes tmp_d0 a reference into regFile.

Change-Id: I2a934ad28045ac63950d4e2ed3eecc4a7d137919
2023-10-13 15:20:42 -07:00
Giacomo Travaglini
1c45cdcc41 arch-arm: Remove legacy ThumbEE references
ThumbEE had already been removed but there were still some
references to it dangling around. We were also signaling
ThumbEE as being available through HWCAPS in SE which
was not correct. This patch fixes it.

Change-Id: I8b196f5bd27822cd4dd8b3ab3ad9f12a6f54b047
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-13 09:25:48 +01:00
Giacomo Travaglini
a33f3d3967 arch-arm: Remove Jazelle state support
Jazelle state has been officially removed in Armv8.
Every AArch32 implementation must still support the
"Trivial Jazelle implementation", which means that while
the instruction set has been removed, it is still possible
for privileged software to access some Jazelle registers
like JIDR, JMCR, and JOSCR, which are just treated as RAZ.

Change-Id: Ie403c4f004968eb4cb45fa51067178a550726c87
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-13 09:25:48 +01:00
Matthew Poremba
4d336c0636 arch-vega: Implement buffer_atomic_cmpswap (#439)
This is a standard compare and swap but implemented on vector memory
buffer instructions (i.e., it is the same as FLAT_ATOMIC_CMPSWAP with
MUBUF's special address calculation).

This was tested using a Tensile kernel, a backend for rocBLAS, which is
used by PyTorch and Tensorflow. Prior to this patch both ML frameworks
crashed. With this patch they both make forward progress.

Change-Id: Ie76447a72d210f81624e01e1fa374e41c2c21e06
2023-10-12 07:33:40 -07:00
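
For readers unfamiliar with the operation, the following sketch models only the generic compare-and-swap semantics the message refers to. It deliberately leaves out MUBUF address calculation and everything else specific to the GPU model; the function name and the dictionary-as-memory are assumptions made for illustration.

```python
def compare_and_swap(memory: dict, addr: int, expected: int, new_value: int) -> int:
    """Store new_value at addr only if the current value equals expected.

    Always returns the value that was in memory before the operation.
    """
    old = memory[addr]
    if old == expected:
        memory[addr] = new_value
    return old
```

The caller can tell whether the swap happened by comparing the returned value with `expected`.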
Matthew Poremba
4b7f25fcb6 arch-vega: Ignore s_setprio instruction instead of panic
This instruction is used by ML frameworks to prioritize certain
wavefronts. Since gem5 does not have any support for wavefront
scheduling based on priority (besides wavefront age), we ignore this
instruction and warn_once rather than calling panic. Since hardware can
override this priority anyways, we can be sure that ignoring the value
will not inhibit forward progress resulting in application hangs.

Change-Id: Ic5eef14f9685dd2b316c5cf76078bb78d5bfe3cc
2023-10-11 15:55:16 -05:00
Matthew Poremba
4b85a1710e arch-vega: Implement buffer_atomic_cmpswap
This is a standard compare and swap but implemented on vector memory
buffer instructions (i.e., it is the same as FLAT_ATOMIC_CMPSWAP with
MUBUF's special address calculation).

This was tested using a Tensile kernel, a backend for rocBLAS, which is
used by PyTorch and Tensorflow. Prior to this patch both ML frameworks
crashed. With this patch they both make forward progress.

Change-Id: Ie76447a72d210f81624e01e1fa374e41c2c21e06
2023-10-11 15:42:50 -05:00
Bobby R. Bruce
70b6b53e54 misc,python: Add pyupgrade to pre-commit (#424)
This adds the [pyupgrade](https://github.com/asottile/pyupgrade) hook to
pre-commit.

This hook automatically upgrades the syntax to the recommended standards
for the newer version of the language.
2023-10-11 09:07:09 -07:00
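
To give a feel for the kind of rewrite involved, the sketch below shows an older spelling next to a newer equivalent of the sort pyupgrade is documented to produce (argument-free `super()`, f-strings, set literals). Which rewrites actually apply depends on the target-version options configured for the hook; the class names are made up for this example.

```python
class Base:
    def __init__(self, name):
        self.name = name


class LegacyStyle(Base):
    """Older spellings that still run on current Python 3."""

    def __init__(self, name):
        super(LegacyStyle, self).__init__(name)

    def label(self):
        return "{} core".format(self.name)

    def ids(self):
        return set([1, 2, 2])


class ModernStyle(Base):
    """Equivalent code using the newer syntax."""

    def __init__(self, name):
        super().__init__(name)

    def label(self):
        return f"{self.name} core"

    def ids(self):
        return {1, 2, 2}
```

Both classes behave identically; only the syntax differs.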
Matthew Poremba
da11427ba6 gpu-compute: Update tokens for flat global/scratch (#408)
Memory instructions acquire coalescer tokens in the schedule stage.
Currently this is only done for buffer and flat instructions, but not
flat global or flat scratch. This change now acquires tokens for flat
global and flat scratch instructions. This provides back-pressure to the
CUs and helps to avoid deadlocks in Ruby.

The change also handles returning tokens for buffer, flat global, and
flat scratch instructions. This was previously only being done for
normal flat instructions leading to deadlocks in some applications when
the tokens were exhausted.

To simplify the logic, added a needsToken() method to GPUDynInst which
returns whether the instruction is a buffer or any flat segment instruction.

The waitcnts were also incorrect for flat global and flat scratch. We
should always decrement vmem and exp count for stores and only normal
flat instructions should decrement lgkm. Currently vmem/exp are not
decremented for flat global and flat scratch which can lead to deadlock.
This change set fixes this by always decrementing vmem/exp and lgkm only
for normal flat instructions.

Change-Id: I673f4ac6121e4b5a5e8491bc9130c6d825d95fc5
2023-10-11 09:00:10 -07:00
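
The token handling described above can be sketched roughly as follows. This is a simplified model, not the gem5 implementation: `InstKind`, `needs_token` and `TokenPool` are hypothetical names, and the `SCALAR` kind is included only as an example of an instruction that takes no token.

```python
from enum import Enum, auto


class InstKind(Enum):
    BUFFER = auto()
    FLAT = auto()
    FLAT_GLOBAL = auto()
    FLAT_SCRATCH = auto()
    SCALAR = auto()  # assumed example of an instruction without a token


def needs_token(kind: InstKind) -> bool:
    """True for buffer instructions and for any flat segment."""
    return kind in {
        InstKind.BUFFER,
        InstKind.FLAT,
        InstKind.FLAT_GLOBAL,
        InstKind.FLAT_SCRATCH,
    }


class TokenPool:
    """Fixed pool of coalescer tokens providing back-pressure."""

    def __init__(self, count: int):
        self.available = count

    def try_acquire(self, kind: InstKind) -> bool:
        """Acquire a token at schedule time; False means the inst must wait."""
        if not needs_token(kind):
            return True
        if self.available == 0:
            return False
        self.available -= 1
        return True

    def release(self, kind: InstKind) -> None:
        """Return the token once the memory instruction completes."""
        if needs_token(kind):
            self.available += 1
```

If a kind acquired tokens but never released them (or the reverse), the pool would drain or overflow, which is the class of deadlock the commit message describes.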
Andreas Sandberg
891250192d arch-arm: Implement FEAT_TCR2 and FEAT_SCTLR2 (#416)
This is simply adding the new Armv8.9 registers defined in the related
features:

- FEAT_TCR2
- FEAT_SCTLR2
2023-10-11 10:14:31 +01:00
Bobby R. Bruce
298119e402 misc,python: Run pre-commit run --all-files
Applies the `pyupgrade` hook to all files in the repo.

Change-Id: I9879c634a65c5fcaa9567c63bc5977ff97d5d3bf
2023-10-10 21:47:07 -07:00
Bobby R. Bruce
ddf6cb88e4 misc: Run pre-commit run --all-files
This is to reflect the updates made to black when running `pre-commit
autoupdate`.

Change-Id: Ifb7fea117f354c7f02f26926a5afdf7d67bc5919
2023-10-10 14:01:58 -07:00
Yu-Cheng Chang
141b06d335 arch,arch-riscv: Remove setRegOperand in VecRegOperand (#341)
The RISC-V vector instructions still work without setRegOperand.
We should fix the register statistics issue via
https://github.com/gem5/gem5/pull/360 to avoid counting register
writes twice in the statistics.

Change-Id: Ib6a52935e00c3e557b366abfcf60450dca05614d
2023-10-10 08:00:10 -07:00
Matthew Poremba
9f4d334644 gpu-compute: Update tokens for flat global/scratch
Memory instructions acquire coalescer tokens in the schedule stage.
Currently this is only done for buffer and flat instructions, but not
flat global or flat scratch. This change now acquires tokens for flat
global and flat scratch instructions. This provides back-pressure to the
CUs and helps to avoid deadlocks in Ruby.

The change also handles returning tokens for buffer, flat global, and
flat scratch instructions. This was previously only being done for
normal flat instructions leading to deadlocks in some applications when
the tokens were exhausted.

To simplify the logic, added a needsToken() method to GPUDynInst which
returns whether the instruction is a buffer or any flat segment instruction.

The waitcnts were also incorrect for flat global and flat scratch. We
should always decrement vmem and exp count for stores and only normal
flat instructions should decrement lgkm. Currently vmem/exp are not
decremented for flat global and flat scratch which can lead to deadlock.
This change set fixes this by always decrementing vmem/exp and lgkm only
for normal flat instructions.

Change-Id: I673f4ac6121e4b5a5e8491bc9130c6d825d95fc5
2023-10-10 09:48:16 -05:00
Giacomo Travaglini
8acf49b6fa arch-arm: Revamp takeInt to take VHE/SEL2 into account
The new implementation matches the table in the ARM Architecture
Reference Manual (version DDI 0487J.a, section D1.3.6, table R_SXLWJ)

It takes into consideration features like FEAT_SEL2 (scr.eel2 bit) and
FEAT_VHE (hcr.e2h bit) which affect the masking of interrupts under
certain circumstances

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I07ebd8d859651475bd32fd201eea0f4e64a7dd5f
2023-10-10 09:46:47 +01:00
Giacomo Travaglini
e412ddddbd arch-arm: Split takeInt into AArch64/32 versions
We pay a small duplication cost but we make the code
more readable and we enable further modifications to the
AArch64 code without forcing the same code on the AArch32
method

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I1efa33cf19f91094fd33bd48b6a0a57d8df8f89f
2023-10-10 09:45:59 +01:00
Bobby R. Bruce
bbe05b0cba tests,misc: Fix compilation tests failures (#400)
Exposed in our failing compiler tests:
https://github.com/gem5/gem5/actions/runs/6348223508, this PR:

* Adds missing overrides to `PCState`'s `set` function.
* Removes `std::binary_function` from DramPower (it was deprecated in
C++11 and officially removed in C++17).
2023-10-09 11:20:52 -07:00
Giacomo Travaglini
eac5a8b215 arch-arm: Implement FEAT_TCR2
Change-Id: I0396f5938c09b68fcc3303a6fdda1e4dde290869
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 17:19:57 +01:00
Giacomo Travaglini
49cbb24351 arch-arm: Implement FEAT_SCTLR2
Change-Id: Ifb8c8dc1729cc21007842b950273fe38129d9539
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 17:12:53 +01:00
Giacomo Travaglini
c4c5d2e172 arch-arm: Implement ID_AA64MMFR3_EL1 register
Change-Id: If8c37bdccf35a070870900c06dc4640348f0f063
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 17:12:53 +01:00
Andreas Sandberg
ec7921305b arch-arm: Implement FEAT_TLBIRANGE extension (#414)
2023-10-09 17:09:31 +01:00
Giacomo Travaglini
39fdfaea5a arch-arm: Implement FEAT_TLBIRANGE
Change-Id: I7eb020573420e49a8a54e1fc7a89eb6e2236dacb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 13:59:47 +01:00
Giacomo Travaglini
6b698630a2 arch-arm: Check VMID in secure mode as well (NS=0)
This is still trying to completely remove any artifact
which implies virtualization is only supported in
non-secure mode (NS=1)

Change-Id: I83fed1c33cc745ecdf3c5ad60f4f356f3c58aad5
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 13:56:57 +01:00
Giacomo Travaglini
a8efded644 arch-arm: Include Granule Size in a TLB entry
This info can be used during TLB invalidation

Change-Id: I81247e40b11745f0207178b52c47845ca1b92870
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-09 13:56:57 +01:00
Nicholas Mosier
7a0e84d853 cpu-kvm, arch-x86: flush TLB after syscalls
Modified the x86 KVM-in-SE syscall handler to flush the TLB following
each syscall, in case the page table has been modified. This is done
by reloading the value in %cr3. Doing this requires an intermediate
GPR, which we store in a new scratch buffer following the syscall code
at address `syscallDataBuf`.

GitHub issue: https://github.com/gem5/gem5/issues/409

Change-Id: Ibc20018c97ebb1794fa31a0c71e0857d661c7c9d
2023-10-06 20:41:59 +00:00
Giacomo Travaglini
ae104cc431 mem-ruby: Add new feature far atomics in CHI (#177)
Added a new feature to CHI protocol (in collaboration with @tiagormk).
Here is the Jira Ticket
[https://gem5.atlassian.net/browse/GEM5-1326](https://gem5.atlassian.net/browse/GEM5-1326).
As described in CHI specs, far atomic transactions enable remote
execution of Atomic Memory Operations. This pull request incorporates
several changes:

* Fix Arm ISA definition of Swap instructions. These instructions should
return an operand, so their ISA definition should be Return Operation.
* Enable AMOs in Ruby Mem Test to verify that AMOs work
* Enable near and far AMO in the Cache Controller of CHI

Three configuration parameters have been used to tune this behavior:
* policy_type: sets the atomic policy to one of those described in [our
paper](https://dl.acm.org/doi/10.1145/3579371.3589065)
* atomic_op_latency: simulates the AMO ALU operation latency
* comp_anr: configures the Atomic No return transaction to split
CompDBIDResp into two different messages DBIDResp and Comp
2023-10-06 10:09:58 +01:00
Hoa Nguyen
3fc6b67974 arch-riscv: Add several inform() to RiscvISA::BootloaderKernelWorkload
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-06 00:45:21 -07:00
Hoa Nguyen
46a9d85215 arch-riscv: Add bootloader+kernel workload
Aims to boot OpenSBI + Linux kernel.

Change-Id: I9ee93cc367e8c06bdd0c7ddf43335d32965be14d
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-06 00:33:05 -07:00
Bobby R. Bruce
761f6b73a0 arch-arm: Implement FEAT_FGT (#334)
This PR implements FEAT_FGT (Fine Grain Traps)
2023-10-05 10:44:26 -07:00
Bobby R. Bruce
39c7e7d1ed arch: Adding missing override to PCState.set
As highlighted in this failing compiler test:
https://github.com/gem5/gem5/actions/runs/6348223508/job/17389057995

Clang was failing when compiling "build/ALL/gem5.opt" due to missing
overrides in `PCState`'s "set" function.

This was observed in Clang-14 and, strangely, Clang-8.

Change-Id: I240c1087e8875fd07630e467e7452c62a5d14d5b
2023-10-05 10:18:19 -07:00
Roger Chang
ea3ee880aa arch-riscv: Implement Zcb instructions
Added the following instructions:
c.lbu
c.lh
c.lhu
c.sb
c.sh
c.zext.b
c.sext.b
c.zext.h
c.sext.h
c.zext.w
c.not
c.mul

Reference: https://github.com/riscv/riscv-code-size-reduction
Change-Id: Ib04820bf5591b365a3bfbbd8b90655a8a1d844cf
2023-10-05 18:46:35 +08:00
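
The arithmetic instructions in the list above have simple register-level semantics, sketched below for a 64-bit register width. This is only an illustrative model under that assumption (register values are treated as unsigned 64-bit integers, and the function names merely mirror the mnemonics); the load and store forms are omitted, and none of this is gem5 code.

```python
XLEN = 64
MASK = (1 << XLEN) - 1


def zext(value: int, bits: int) -> int:
    """Keep the low `bits` bits and clear everything above them."""
    return value & ((1 << bits) - 1)


def sext(value: int, bits: int) -> int:
    """Replicate bit `bits - 1` into all higher bits of the register."""
    v = value & ((1 << bits) - 1)
    if v & (1 << (bits - 1)):
        v -= 1 << bits
    return v & MASK


def c_zext_b(rd: int) -> int:
    return zext(rd, 8)


def c_sext_b(rd: int) -> int:
    return sext(rd, 8)


def c_zext_h(rd: int) -> int:
    return zext(rd, 16)


def c_sext_h(rd: int) -> int:
    return sext(rd, 16)


def c_zext_w(rd: int) -> int:
    return zext(rd, 32)


def c_not(rd: int) -> int:
    """Bitwise complement of the register."""
    return ~rd & MASK


def c_mul(rd: int, rs2: int) -> int:
    """Low XLEN bits of the product."""
    return (rd * rs2) & MASK
```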