Commit Graph

5835 Commits

Author SHA1 Message Date
Matthew Poremba
db42aeb630 arch-vega: Implement accumulation offset (#895)
This PR implements a few changes related to the accumulation offset
which is new in MI200. Previously MI100 contained two vector register
files: the architectural and accumulation register files. These have now
been unified and the architectural register file is twice the size. As a
result of this the dispatch packet set an offset into the unified vector
register file for where the former accumulation registers would go. The
changes are:

- Calculate the accumulation offset from dispatch packet and store in
HSA task.
- Update the accumulation move instructions (v_accvgpr_read/write) to
use it.
- Update the current MFMA instructions to use it.
- Make the MFMA examples more clean.
2024-02-29 09:05:39 -08:00
Nicholas Mosier
69762e272e sim-se, arch-x86: initialize max stack size from parameter (#892)
Initialize x86 process' max stack size to the value given in the process
params, rather than hard-coding it to 8 MB, which made it impossible to
run x86 programs requiring more than 8 MB of stack.

Change-Id: I0b17fe60b016b1e4a82d704ef7ad367974ea6a08
2024-02-29 08:15:43 -08:00
Matthew Poremba
2ca7f48828 arch-vega: Accumulation offset for existing MFMA insts
This commit update the two exiting MFMA instructions to support the
accumulation offset for A, B, and C/D matrix. Additionally uses array
indexed C/D matrix registers to reduce duplicate code. Future MFMA
instructions have up to 16 registers for C/D and this reduces the amount
of code being written.

Change-Id: Ibdc3b6255234a3bab99f115c79e8a0248c800400
2024-02-26 14:30:50 -06:00
Matthew Poremba
e0e65221b4 arch-vega: Use accum offset for v_accvgpr_read/write
The accum offset is used as an index into the unified VGPR register file
in MI200 and is not the same as a move if accum_offset in the dispatch
packet is non-zero.

Change these instructions to use the stored accum_offset value.

Change-Id: Ib661804f8f5b8392e4c586082c423645f539e641
2024-02-26 12:57:09 -06:00
Yu-Cheng Chang
816ef46c78 arch-riscv: Fix fflags behavior of float inst. in O3 CPU (#868)
According to the RISC-V spec [1]. Any float-point instructions
accumulate FFLAGS register rather than write it to reflect the CSR
behavior.

In the previous implementation. We read the FFLAGS, set the exception
flags, and write the result back to the FFLAGS. This works in the gem5
simple and minor CPU model as they are actually written to `regFile`
after executing the instructions. However, in the gem5 O3 CPU model, it
will record in the `destMiscReg` buffer until the commit stage when
writing to the `miscReg` in the execution stage. The next instruction
will get the old FFLAGS and cause the incorrect result.

The CL introduced the `MISCREG_FFLAGS_EXE` and used the same size of
`miscRegFile` because the `MISCREG_FFLAGS_EXE` and `MISCREG_FFLAGS`
shared the same space. When executing the float-pointing instruction,
any exception flags should be updated via `MISCREG_FFLAGS_EXE` to
accumulate the FFLAGS in `setMiscReg` method. For the MISCREG_FFLAGS, it
should only be called in the CSROp.

[1] Syntactic Dependencies: Appendix A

c80ecada1c/src/mm-eplan.adoc (syntactic-dependencies-rules-9-11)

gem5 issue: https://github.com/gem5/gem5/issues/755

Change-Id: Ib7f13d95b8a921c37766a54a217a5a4b1ef17c6f
2024-02-22 08:33:34 -08:00
Jason Lowe-Power
c719ea960a arch-arm: Add FEAT_FGT trapping for debug registers (#873)
We already implemented FEAT_FGT but we were missing trapping
capabilities for trapping debug registers accesses
2024-02-21 11:27:43 -08:00
Nicholas Mosier
7ac9733199 arch-x86, cpu-kvm: initialize x87 FCW (#877)
Fix #876. The x87 floating-point control word (FCW) was not initialized
at process startup in syscall emulation mode. This resulted in floating
point exceptions in KVM mode when executing x87 floating-point
instructions.

This patch fixes the bug by initializing FCW to its reset value, 0x37F.

Change-Id: Idd1573c6951524ef59466cc5c9f1e640ea7658ae
2024-02-20 07:46:44 -08:00
Giacomo Travaglini
8759131df3 cpu-o3, arch: Fix SMT bug arising from v23.0 and make gem5 more robust with SMT (#828)
This PR is fixing https://github.com/gem5/gem5/issues/668. It fixes it
for all ISAs other than Arm with the first commit, which is setting the
number of architectural Matrix registers to 0 for those ISA which are
not using them.

It then partly fixes it for Arm as well with the 2nd commit: by removing
RenameMap::numFreeEntries we don't stall renaming unless a matrix
instruction is encountered... This means most binaries will run with SMT
as long as they don't use FEAT_SME instructions. Please note: this is
not simply a SMT fix, it will generally address a shortcoming in the way
we were renaming instructions.

If an Arm binary wants to use SMT with FEAT_SME, the 4th commit will
make sure the lack of physical registers is notified explicitly at the
beginning of simulation, rather than silently blocking renaming
2024-02-19 08:52:31 +00:00
Giacomo Travaglini
2c0cc0040b arch-arm: Implement FEAT_FGT Debug trapping
Change-Id: I30af2b49ee604bcaa43fd419f6bc69e9ee6d9350
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-02-15 15:58:34 +00:00
Giacomo Travaglini
683007c6ca arch-arm: Add FEAT_FGT Debug Read/Write registers
Those are supposed to control trapping for accesses to debug registers

Change-Id: I4a25a379e718ea6d5ea8ae22ac7edbeb452d1836
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
2024-02-15 15:58:34 +00:00
Harshil Patel
47c4dad869 arch-riscv: Remove unnecessary assert (#866)
`assert(interruptID >=0)` is always true as `interruptID` is an unsigned
int.

This was causing compilation tests failures in GCC-8 with the following
error:

```sh
src/arch/riscv/interrupts.cc:47:32: error: comparison is always true due to limited range of data type [-Werror=type-limits]
             assert(interruptID >= 0);
```

Change-Id: I356be78d7f75ea5d20d34768fb8ece0f746be2fc
2024-02-13 08:30:18 -08:00
Vishnu Ramadas
8054459df6 arch-vega: Add support for S_ICACHE_INV instruction
Previously, the S_ICACHE_INV instruction was unimplemented and
simulation panicked if it was encountered. This commit adds support for
executing the instruction by injecting a memory barrier in the scalar
pipeline and invalidating the ICACHE (or SQC)

Change-Id: I0fbd4e53f630a267971a23cea6f17d4fef403d15
2024-02-09 12:19:08 -06:00
Saúl
7d80658a39 arch-riscv: fix vl in mask load/store (i.e vlm.v/vsm.v) (#830)
The vlm.v and vsm.v unit-stride mask load/store instructions are
constructed with an incorrect VL when the current one is larger than
than VLEN/EEW (i.e. when LMUL > 1). This commit fixes the issue for both
instructions.
2024-02-08 14:06:49 -08:00
Bobby R. Bruce
7fe1588546 arch-riscv: Fix load and store to use EEW instead of SEW (#859)
Vector unit-stride instructions have an EEW encoded directly in the
instruction, We should use that instead of SEW in vtype.

Ref:

https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#73-vector-loadstore-width-encoding
2024-02-08 12:14:11 -08:00
Saúl
804f137325 arch-riscv: add unit-stride fault-only-first loads (i.e. vle*ff) (#794)
This patch provides unit-stride fault-only-first loads (i.e. vle*ff) for
the RISC-V architecture.

They are implemented within the regular unit-stride load (i.e. vle*). A
snippet named `fault_code` is inserted with templating to change their
behaviour to fault-only-first.

A part from this, a new micro based on the vset\*vl\* instructions
(VlFFTrimVlMicroOp) is inserted as the last micro in the macro
constructor to trim the VL to it's corresponding length based on the
faulting index.

This trimming micro waits for the load micros to finish (via data
dependency) and has a reference to the other micros to check whether
they faulted or not. The new VL is calculated with the VL of each micro,
stopping on the first faulting one (if there's such a fault).

I've tested this with VLEN=128,256,...,16384 and all the corresponding
SEW+LMUL configurations.


Change-Id: I7b937f6bcb396725461bba4912d2667f3b22f955
2024-02-08 09:15:58 -08:00
QQeg
e685c072d1 arch-riscv: Remove micro_elems in VleMicro template
Change-Id: I91267de8b1142075aa2873bfcedfd8b15c6863d4
2024-02-08 07:24:55 +00:00
QQeg
7eeac98b8d arch-riscv: Fix load and store to use EEW instead of SEW
Vector unit-stride instructions have an EEW encoded directly in the instruction,
We should use that instead of SEW in vtype.

Change-Id: I282041ce8ed57fbcca899f7497ef6c6fb2dfcf85
2024-02-07 21:11:28 +00:00
Robert Hauser
f289f9e8b5 arch-riscv: adding support for local interrupts (#813)
Besides the standard RISC-V interrupts software, timer, and external
interrupt, the RISC-V specification also offers the possibility to
implement local interrupts. With this patch, we contribute an extension
of RiscvInterrupts that enables connecting interrupt sources to the
local interrupt controller. We assigned the local interrupts to
machine-level and gave them the highest priority. If two local
interrupts are pending, there exception code will be the tie-breaker
(higher ID > lower ID). 32 Bit systems only recognize the local
interrupts 16 to 31, 64 Bit systems 16 to 63.

Change-Id: Iff8d34e740b925dce351c0c6f54f4bd37a647e0c

---------

Co-authored-by: Robert Hauser <robert.hauser@uni-rostock.de>
2024-02-06 09:38:50 -08:00
Yu-Cheng Chang
ba6c569b8d arch-riscv: Add BasePMAChecker to support customized PMA (#846)
The RISC-V privilege spec don't specify the implementation of
PMA(physical memory attribute), which is addressed in the previous
CL[1].

This CL creates the BasePMAChecker to support customized PMA so that we
can only focus on the features wanted in the study. The CL also leaves
the common methods `check` and `takeOverFrom` to make MMU easy to
interact with PMA.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/40596

Change-Id: I9725e3a8f7f9276e41f0d06988259456149d2a77
2024-02-06 05:38:34 -08:00
Giacomo Travaglini
a60d6960c7 arch-arm: Remove unused/unimplemented TLB methods (#849)
Change-Id: I3a76a914df1ba65ec5200f11111cf20f3e1eb924

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-06 09:18:06 +00:00
Giacomo Travaglini
05f93175a7 arch-arm: Crypto instruction execution requires SIMD to be enabled (#848)
Crypto instructions will cause an undefined instruction when executed
with SIMD disabled. The PR is also
refactoring their implementation by checking the release object instead
of the ID register field. This is improving
readability
2024-02-05 19:22:04 +00:00
Chong-Teng Wang
40ecdf5fb4 arch-riscv: Fix RVV instructions vmv.s.x/vfmv.s.f (#843)
This commit fixes the implementation of vmv.s.x and vfmv.s.f. 
When vl = 0, no elements are updated in the destination vector register
group, regardless of vstart.

Change-Id: Ib21b3125da8009325743ec70ca0874704328356c

Reference:
[Integer Scalar Move
Instructions](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#161-integer-scalar-move-instructions)
[Floating-Point Scalar Move
Instructions](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#162-floating-point-scalar-move-instructions)
2024-02-05 08:51:42 -08:00
Chong-Teng Wang
85059a369e arch-riscv: Fix control flow in VectorFloatMaskMacroConstructor (#844)
This commit adjusts the logic in VectorFloatMaskMacroConstructor to
ensure the %(copy_old_vd)s section is not skipped when vl = 0, ensuring
correct values in destination vector register.

Change-Id: I2478722d6f003a0f2e4b3cd0ba3e845bed938ee6

This is the same problem as #715 .
2024-02-05 06:29:05 -08:00
Giacomo Travaglini
16e06bad0c arch-arm: Exec Crypto instructions only if SIMD&FP enabled
We not only check for the presence of the relative FEAT_*,
we also check if AdvSIMD is enabled; we throw an undefined
instruction otherwise.

Change-Id: I1fd0cdc8057c5a7901774802dc076817f06c8e66
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-05 12:56:48 +00:00
Giacomo Travaglini
ebef2fc4b1 arch-arm: Crypto instructions checking release object
Check directly if extension is enabled instead of looking
for ID register field value. This makes the code more readable

Change-Id: If0b882ac3464c3587731b72a7edb3b8b65ea86c7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2024-02-05 12:56:48 +00:00
Giacomo Travaglini
d031244ca7 misc: When unused, set #MatRegClass registers to 0
This is working around an existing SMT issue [1].

The BaseO3CPU uses two physical matrix registers [2]. This is
enough for a single threaded CPU which as of now uses
1 architectural matrix only.

The problem arises when SMT is enabled.  As 2 architectural matrices
need to be supported by a single CPU, the O3CPU won't have any available
register in the freeList for renaming.  This causes the SMT O3CPU to
indefinitely stall renaming [3]

If the archtectural number of registers is seto to 0, the regclass won't
be taken into consideration when evaluating if we can rename
instructions.

This issue has been implicitly fixed for RISCV by a preceding PR [4]

[1]: https://github.com/gem5/gem5/issues/668
[2]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/BaseO3CPU.py#L170
[3]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/rename.cc#L1228
[4]: https://github.com/gem5/gem5/pull/83

Change-Id: I99bfdefff11a246b1f191251dc67689e95b3f0db
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-02 18:11:47 +00:00
Giacomo Travaglini
3a2c8feca8 arch-arm: MMU aarch64EL is not used in AArch64 only anymore
We therefore rename it to exceptionLevel

Change-Id: I2a3aabaefa315d95bd034b13d95d5a5b0b8e9319
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:45:06 +00:00
Giacomo Travaglini
3737e8b6df arch-arm: Use MAIR_EL2 mem attribute register when in EL0 host
With the old code, the MAIR_EL1 register was checked when inserting
an EL2&0 TLB entry

Change-Id: I064032fb2946777c2f4c50c06a124f828245e18a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:44:16 +00:00
Giacomo Travaglini
d42ef792bf arch-arm: Check ELIs64 for EL2 when in EL2&0 regime
The problem with:

ELIs64(tc, aarch64EL == EL0 ? EL1 : aarch64EL);

Is that when we are executing at EL0 in host (EL2&0 translation
regime), the execution mode (AArch32 vs AArch64) is dictated
by EL2 and not by EL1 (which is the guest)

Change-Id: I463a2a9461c94d0886990ae3d0a6e22aeb4b9ea3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:43:59 +00:00
Giacomo Travaglini
458c98082c arch-arm: Replace EL based translation with regimes
This is the final step in the transformation process.
We limit the use of the "managing Exception Level" for
a translation in favour of the more standard "Translation
Regime"
This greatly simplifies our code, especially with VHE
where the managing el (EL2) could handle to different
translation regimes (EL and EL2&0).

We can therefore remove the isHost flag wherever it got
used. That case is automatically handled by the proper
regime value (EL2&0)

Change-Id: Iafd1d2ce4757cfa6598656759694e5e7b05267ad
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:43:47 +00:00
Giacomo Travaglini
e333a77c12 arch-arm: Remove _Xt postfix from TLBI instructions
The Xt is not part of the architectural name of the register
and it was likely added with the introduction of extended
register (Xt) TLBIs in Armv8 to differentiate them with
the old Armv7 ones.

The use of _Xt was not consistent anyway: newer TLBIs were
already omitting it.

Change-Id: Ic805340ffa7b5770e3b75a71bfb76e055e651f8b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:43:26 +00:00
Giacomo Travaglini
594428f010 arch-arm: Remove redundant isHyp as a TLB entry field
We should stop using isHyp.. An hypervisor entry is flagged
already by the EL of the entry (el == EL2)

Change-Id: I20c3d06fa2b04e0b938a380ca917d0b596eddcf2
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:43:00 +00:00
Giacomo Travaglini
a6ca81906a arch-arm: Simplify setting of isHyp for mem translations
The isHyp descriptor is an old artifact of armv7 and it flags a PL2
(AArch32) or EL2 & EL2&0 (AArch64) translations.
It is commonly set according to the EL/mode [1] but it may differ from
the execution state in case of explicit translation requests (via
the AT instruction as an example [2]).

There is really no need to complicate the masking of isHyp. We should
just make use of the tranType method (in charge of setting aarch64EL)
to properly set aarch64EL, and make isHyp coincide with the case of
aarch64EL == EL2.

This is a step towards the removal of the isHyp flag.

More specifically the patch does the following:

* HypMode translation type moved in the EL2 case
The translation is used by

ATS1HR/ATS1HW:
Performs stage 1 address translation as defined for PL2 and the
Non-secure state

* S1S2NsTran translation type moved in the EL1 case
The translation is used by

ATS12NSOPR/ATS12NSOPW:
Performs stage 1 and 2 address translations as defined for PL1 and the
Non-secure state

* S1CTran translation type can be at either EL1 or EL3
The translation is used by

ATS1CPR/ATS1CPW
Performs stage 1 address translation as defined for PL1 and the current
Security state

[1]: https://github.com/gem5/gem5/blob/stable/src/arch/arm/mmu.cc#L1281
[2]: https://github.com/gem5/gem5/blob/stable/src/arch/arm/mmu.cc#L1282

Change-Id: Ie653170f6053c5d8141a2de9f50febf5bf53ab9c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-02-01 13:42:40 +00:00
Jason Lowe-Power
b3870ee7b0 arch-riscv: Fix fence.i instruction in O3 CPU (#816)
arch-riscv: Fix fence.i instruction in O3 CPU
2024-01-30 15:39:32 -08:00
Roger Chang
d94ef08a36 arch-riscv: Fix fence.i instruction in O3 CPU
We should clean the instruction buffer after the fence.i is execute
to avoid execute old instruction for self-modifying code

Change-Id: Iece0ee0d10631fcd9bd17ee67cf0c92f72acdbd8
2024-01-29 11:43:27 +08:00
QQeg
08ed87bc9d arch-riscv: Add template Vector1Vs1VdMaskDeclare
This commit adds a new template, Vector1Vs1VdMaskDeclare, to replace
the use of Vector1Vs1RdMaskDeclare in Vector1Vs1VdMaskFormat.

The change addresses the issue with the number of indices in srcRegIdxArr.
Only two indices are available in Vector1Vs1RdMaskDeclare, but instructions
that use Vector1Vs1VdMaskFormat, like 'vmsbf', require three indices
(for vs1, vs2(old_vd), and vm) to function correctly.

Change-Id: I0c966e11289ce07efcc3b0cc56948311289530ad
2024-01-28 09:38:11 +00:00
QQeg
31ffc11c57 arch-riscv: Fix segmentation fault in vmsbf/vmsof/vmsif
This commit simplifies the conditional logic in vmsbf/vmsof/vmsif
by removing an unnecessary variable and condition.
The updated logic checks 'this->vm' or the result of 'elem_mask(v0, i)'
directly, which prevents a segmentation fault regardless of
whether 'vm' is set or not.

Change-Id: I799fa7b684ff98959a64f9694ef9c854f3a1f43a
2024-01-28 09:38:11 +00:00
Giacomo Travaglini
ce32d7c523 arch-arm: Replace CRYPTO extension with canonical names (#810)
These are:

FEAT_AES,
FEAT_PMULL,
FEAT_SHA256,
FEAT_SHA1,
FEAT_CRC32

With this patch we are also enabling them by default by adding them to
the Armv8 release object. Some of them are mandatory anyway since
Armv8.1

Change-Id: I221ae8646d91151fdfaf97a4815168a4fe3d8c5a

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2024-01-26 19:39:35 +00:00
Ivana Mitrovic
24e0d71034 arch-gcn3: Remove gcn3 (#781)
Related to issue #703 , this PR removes GCN3 related files and updates
source code, documentation, and tests to switch over to Vega is that was
not done already. Highlights are:

 - Remove all src/arch/amdgpu/gcn3 files and update Kconfigs.
 - Remove references to GCN3 and replace with Vega where applicable.
- Update the build targets in the gcn-gpu Docker. This will need to be
rebuilt but not urgently.
- Remove the GCN3 tag in testlib. Most tests seem to be using Vega
already, so that commit is small.
2024-01-25 10:14:46 -08:00
QQeg
7a96709b11 arch-riscv: Fix vsadd_vi and vsaddu_vi to match v-spec (#805)
This commit fixes the implementation of two instructions, vsadd_vi and
vsaddu_vi, in the OPIVI category
to match the RISC-V vector specification.

According to
[riscv-v-spec](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#101-vector-arithmetic-instruction-encoding),
the immediate field of these two instructions should be sign extended.

> For integer operations, the scalar can be a 5-bit immediate, imm[4:0],
encoded in the rs1 field. The value is sign-extended to SEW bits, unless
otherwise specified.

There is an example in both
[vsadd](https://github.com/QQeg/rvv_intrinsic_testcases/tree/master/vsadd_vi)
and
[vsaddu](https://github.com/QQeg/rvv_intrinsic_testcases/tree/master/vsaddu_vi).

Change-Id: Ib877627ba01c0868b2103d41613651df488fca13
2024-01-24 17:21:26 -08:00
Yu-Cheng Chang
6dd936e5b5 arch-riscv: Simply implementation of vector multiply and divide instructions (#793)
Align the implementation of scalar multiply and divide instructions

Change-Id: I53297d4c841c41593baaae0ea140bfbbd874a1d9
2024-01-24 13:20:15 -08:00
Matthew Poremba
44c78d843c arch-vega: Implement memory aperture operands (#803)
Vega (gfx900) introduced new memory aperture registers to get the base
address and limit for LDS and private (scratch) memory. These have not
commonly been used by the compiler until ROCm 6. Now that the compiler
is generating reads from these special registers, implement the support
for them.

Tested with LULESH which is using the SHARED_BASE register (LDS) with
ROCm 6.0. This assembly seems to replace S_GETREG_B32 emitted by the
ROCm 5 compiler.

Change-Id: Id2bd26ce8ef687c84a647fa2ac2da54d657913e5
2024-01-24 11:19:43 -08:00
Matthew Poremba
dfafc5792a arch-vega: Remove deleted instruction.cc from build (#801)
Change-Id: I03073d35a0d36788dfe8309e6ed466d0a496e31e
2024-01-23 18:47:01 -08:00
Matthew Poremba
a5757e7e01 arch-vega: Rename mismatched source/header files
The files registers.cc, isa.cc, and decoder.cc do not match the header
name. This is a minor cleanup to make development more straightforward.

Change-Id: Ibab18dfe315b0ce84359939b490f8227ea43cac0
2024-01-19 13:32:24 -06:00
Matthew Poremba
cd91c6321f arch-vega: Reorganize instructions to multiple files
The Vega instructions.cc file is 47k lines long which results in both
large compilation times whenever it is modified and long style check
times. This makes iterating over more complex instruction
implementations very time consuming.

This commit moves the instruction definitions to multiple files based on
the instruction encoding (SOP2, VOP2, FLAT, DS, etc.). The resulting
files are much smaller (max is 8k lines) and compilation and style check
times are much more reasonable. Other than moving code around, there are
no functional changes in this commit.

Change-Id: Id4ac8e98ef11a58de5fd328f8a0cd7ce60a11819
2024-01-19 13:32:24 -06:00
Jason Lowe-Power
a555449c12 arch-arm: Fix compile error in kvm (#784)
The addition of std::optional in #732 caused a compile error. This
change fixes the error by checking to see if the value is present and
panicing otherwise.

Change-Id: I46c3fb76eb0e14ba7bede7c336293fbe9add8c84

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2024-01-19 07:59:59 -08:00
Yu-Cheng Chang
f56459470a arch-riscv: Refactor the RISC-V multiplication utility (#780)
1. Add the new double width for int64_t and uint64_t
2. Use the wider type to get the upper result of multiplication

Change-Id: Id6cfa6f274c65592b2b3e2b70c00f82954b41f1a
2024-01-18 12:40:11 -08:00
QQeg
511729ab76 arch-riscv: Fix issue when vl=0 in VectorIntMaskMacroConstructor (#715)
I’ve been working on a fix for the issue #759 where ‘vd’ incorrectly
stores all zeros when ‘vl’ is set to 0 in VectorIntMaskMacroConstructor.
My solution seems to work, but it behaves differently from other macros
when ‘vl’ = 0. Instead of pushing a ‘nop’ to ‘microops’, it pushes a
micro operation that remains ineffective due to ‘vl’ being 0.
2024-01-17 08:45:08 -08:00
Matthew Poremba
57fb083f43 arch-gcn3: Remove all GCN3 files
Change-Id: Ib7d9e8676a31e51a330e68d81099580e2509a90a
2024-01-17 10:44:44 -06:00
Matthew Poremba
70376d43a3 arch-vega: Fix upsize cast error in newer compilers (#774)
Newer compilers error on -Warray-length in the recent MI200 patches due
to casting from a 32-bit data type to a 64-bit type. Change it to cast
the 32-bit integer first then 64-bit integer latter to remove the
warning.

Rerun of validation tests on the three instructions passed.

Change-Id: I0309e5f7b5b8cc8ce1651660ddddb120fa6e7666
2024-01-16 09:41:23 -08:00