derek/gem5

Author	SHA1	Message	Date
Giacomo Travaglini	7dba30209a	arch-arm: Hook TLBIOS instructions to the TlbiShareable obj FEAT_TLBIOS was introduced by a recent patch [1], which however omitted the outer shareable case from the Msr disambiguation switch. As a result, the TLBIOS instructions were decoded as normal MSR instructions, with no effect whatsoever on the TLBs. [1]: https://gem5-review.googlesource.com/c/public/gem5/+/70567 Change-Id: I41665a4634fbe0ee8cc30dbc5d88d63103082ae9 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-07-24 09:05:01 +01:00
rogerchang23424	5d2edca1e3	arch-riscv: Set default check alignment True (#98) Raise a misaligned trap by default if the effective address is not aligned Change-Id: I634aa7ddbf5282fc583316fc77ab1e37bfe415e3	2023-07-19 11:14:27 -07:00
rogerchang23424	52d9259396	arch-riscv: Fix clearLoadReservation merge (#81) The previous change (https://gem5-review.googlesource.com/c/public/gem5/+/71818) makes clearLoadReservation RISC-V only. Change-Id: I5df1a7fa688489d57fff8da937e3c8addfe4c299	2023-07-14 08:48:43 -07:00
Giacomo Travaglini	18470b4747	arch-arm: Fix assert fail when UQRSHL shiftAmt==0 (#75) When shiftAmt is 0 for a UQRSHL instruction, the code called bits() with incorrect arguments. This fix makes a left shift by 0 a NOP/mov, as required. Change-Id: Ic86ca40ac42bfb767a09e8c65a53cec56382a008 Co-authored-by: Marton Erdos <marton.erdos@arm.com>	2023-07-13 10:57:51 -07:00
Bobby R. Bruce	753933d471	gpu-compute, tests: Fix GPU_X86 compilation, add compiler tests (#64) * gpu-compute: Remove use of 'std::random_shuffle' This was deprecated in C++14 and removed in C++17. This has been replaced with std::random. This has been implemented to ensure reproducible results despite (pseudo)random behavior. Change-Id: Idd52bc997547c7f8c1be88f6130adff8a37b4116 * dev-amdgpu: Add missing 'overrides' This causes warnings/errors in some compilers. Change-Id: I36a3548943c030d2578c2f581c8985c12eaeb0ae * dev: Fix Linux specific includes to be portable This allows for compilation in non-linux systems (e.g., Mac OS). Change-Id: Ib6c9406baf42db8caaad335ebc670c1905584ea2 * tests: Add 'VEGA_X86' build target to compiler-tests.sh Change-Id: Icbf1d60a096b1791a4718a7edf17466f854b6ae5 * tests: Add 'GCN3_X86' build target to compiler-tests.sh Change-Id: Ie7c9c20bb090f8688e48c8619667312196a7c123	2023-07-11 14:35:03 -07:00
Yang Liu	35763bdfb2	arch: Add setRegOperand in VecRegOperand VecRegOperand also needs a setRegOperand method to write back the execution result. Change-Id: Ie50606014827c14a7219558dd003eb4747231649 Co-authored-by: Xuan Hu <huxuan@bosc.ac.cn> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/67292 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-07-10 22:59:12 +00:00
Bobby R. Bruce	63bdde4f63	arch-riscv: Remove clearLoadReservation This was added due to a bad merge from stable to develop. Change-Id: I7adf9604ee4d6f1cf11c404af5e8e1c071461a4a	2023-07-10 15:28:41 -07:00
Bobby R. Bruce	54501c3e2b	misc: Merge branch 'stable' into 'develop' This ensures all commits in v23.0 are now in the develop branch. Change-Id: I791346115dd123f3541a3c8060482e00cf4dbfb5	2023-07-10 12:24:27 -07:00
Adrià Armejach	fe7b18c2d7	arch-riscv: Make virtual method RISC-V private * Prior commit defined a shared virtual method that is only used in RISC-V. This patch makes the method only visible to the RISC-V ISA. Change-Id: Ie31e1e1e5933d7c3b9f5af0c20822d3a6a382eee Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71818 Reviewed-by: Roger Chang <rogerycchang@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-06-30 07:39:40 +00:00
Matthew Poremba	841e6fe978	arch-vega: Add Vega D16 decodings and fix V_SWAP_B32 Vega adds multiple new D16 instructions which load a byte or short into the lower or upper 16 bits of a register for packed math. The decoder table has subDecode tables for FLAT instructions, each of which represents 32 opcodes. The subDecode table for opcodes 32-63 is missing, so it is added here. The opcode for V_SWAP_B32 is also off by one: in the ISA manual this instruction is opcode 81, the instruction before is 79, and there is no opcode 80, so the decoder entry is swapped with the invalid decoding below it. Change-Id: I278fea574ea684ccc6302d5b4d0f5dd8813a88ad Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71899 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-06-29 19:56:56 +00:00
Adrià Armejach	d54b8f8475	arch-riscv: fix load reserved store conditional * According to the manual, load reservations must be cleared on a failed or a successful SC attempt. * A load reservation can be arbitrarily large. The current implementation was reserving something different than cacheBlockSize, which could lead to problems if snoop addresses are cache block aligned. This patch's implementation assumes a cacheBlock granularity. * Load reservations should also be cleared on faults. Change-Id: I64513534710b5f269260fcb204f717801913e2f5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71520 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>	2023-06-16 16:44:49 +00:00
Adrià Armejach	7398e1e401	arch-riscv: fix load reserved store conditional * According to the manual, load reservations must be cleared on a failed or a successful SC attempt. * A load reservation can be arbitrarily large. The current implementation was reserving something different than cacheBlockSize, which could lead to problems if snoop addresses are cache block aligned. This patch's implementation assumes a cacheBlock granularity. * Load reservations should also be cleared on faults. Change-Id: I64513534710b5f269260fcb204f717801913e2f5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71558 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Roger Chang <rogerycchang@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-06-16 06:49:25 +00:00
Matthew Poremba	db903f4fd4	arch-vega: Helper methods for SDWA/DPP for VOP2 Many of the outstanding issues with the GPU model are related to instructions not having SDWA/DPP implementations and executing by ignoring the special registers, leading to incorrect execution. Adding SDWA/DPP is currently very cumbersome as there is a lot of boilerplate code. This changeset adds helper methods for VOP2 with one instruction changed as an example. This review is intended to get feedback before applying this change to all VOP2 instructions that support SDWA/DPP. Change-Id: I1edbc3f3bb166d34f151545aa9f47a94150e1406 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70738 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-06-15 23:02:39 +00:00
Roger Chang	328aaa626f	arch-riscv: Fix unexpected behavior of float operations in Mac OS uint_fast16_t is an integer type at least 16 bits wide; it can be 32 bits, 64 bits, or more. Most simulations run on x86-64 Linux hosts, where uint_fast16_t is 64 bits wide, so double precision float operations work and the FloatMM test passes. On Mac OS, however, uint_fast16_t is 16 bits wide, so the upper bits are lost when converting float register bits to freg_t and the FloatMM test produces unexpected results. This change guarantees that the data in freg_t is at least 64 bits wide, so no data is lost when converting floating point values to freg_t. Reference: https://developer.apple.com/documentation/kernel/uint_fast16_t https://codebrowser.dev/glibc/glibc/stdlib/stdint.h.html Change-Id: I3df6610f0903cdee0f56584d6cbdb51ac26c86c8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71519 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>	2023-06-15 20:10:04 +00:00
Yu-hsin Wang	694673f1d7	arch: set multiline re as default in isa_parser Python 3.11 requires the global flag specifier to be the first token of a regex, which is not possible when using the ply library. Instead, we make the rules multiline regexes by default and modify the single-line rules. Ref: https://github.com/dabeaz/ply/issues/282 Change-Id: I7bdbfeb97a9dd74f45c1890a76f8cc16100e5a42 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71019 Reviewed-by: Richard Cooper <richard.cooper@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2023-06-15 10:04:00 +00:00
Yu-hsin Wang	23a88d0400	fastmodel: only support single line literal when parsing project file Python 3.11 requires the global flag specifier to be the first token of a regex, which is not possible when using the ply library. In the fastmodel case, we don't actually need to support multiline string literals, so we fix this issue by making the string literal single line. Ref: https://github.com/dabeaz/ply/issues/282 Change-Id: I746b628db7ad4c1d7834f1a1b2c1243cef68aa01 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71018 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>	2023-06-15 10:03:47 +00:00
Roger Chang	9a27322c7b	arch-riscv: Fix unexpected behavior of float operations in Mac OS uint_fast16_t is an integer type at least 16 bits wide; it can be 32 bits, 64 bits, or more. Most simulations run on x86-64 Linux hosts, where uint_fast16_t is 64 bits wide, so double precision float operations work and the FloatMM test passes. On Mac OS, however, uint_fast16_t is 16 bits wide, so the upper bits are lost when converting float register bits to freg_t and the FloatMM test produces unexpected results. This change guarantees that the data in freg_t is at least 64 bits wide, so no data is lost when converting floating point values to freg_t. Reference: https://developer.apple.com/documentation/kernel/uint_fast16_t https://codebrowser.dev/glibc/glibc/stdlib/stdint.h.html Change-Id: I3df6610f0903cdee0f56584d6cbdb51ac26c86c8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71578 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-06-13 23:20:47 +00:00
Roger Chang	0fef2300c0	arch-riscv: Refactor fmax and fmin instructions Currently the fmax and fmin instructions convert source float registers such as Fs1_bits to float64_t (or float32_t and float16_t) many times within a single instruction, which makes these instructions harder to maintain. This change adds non-register float_t intermediate variables fs1 and fs2 to hold the converted results so that the conversion does not need to be repeated. It also adds an intermediate variable fd of the specific float type, so that the upper bits of the packed float register are assumed to be all ones. Change-Id: Ic508d5255db6c4b38ca4df6dd805df440c043fff Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71479 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-06-13 00:09:34 +00:00
Giacomo Travaglini	4434d48973	arch-arm: Apply FEAT_IDST to missing ID registers When FEAT_IDST got implemented [1], we forgot to add the logic for AArch64 ID registers tracking AArch32 state/capabilities [1]: https://gem5-review.googlesource.com/c/public/gem5/+/70723 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Change-Id: I19bddf67ecc379a14f91cfede385692536982101 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71178 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-06-07 07:38:08 +00:00
Roger Chang	5e5e81d1c5	arch-riscv: Check FPU status for c.flwsp c.fldsp c.fswsp c.fsdsp The change adds the missing FPU checking for these instructions. Change-Id: I7f2ef89786af0d528f2029f1097cfeac6c7d65f2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71198 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-06-03 00:58:25 +00:00
Yu-hsin Wang	bd1d72f61e	fastmodel: add src include path by default We have some customized protocols in the gem5 repository that require the include path from the src directory. Users of those protocols therefore have to handle the include path correctly themselves, which is tedious and fragile. We should add the default include path to the SIMGEN command line to prevent issues. Change-Id: I2a3748646567635d131a8fb4099e02e332691e97 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/71118 Reviewed-by: Wei-Han Chen <weihanchen@google.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-05-31 23:47:30 +00:00
Giacomo Travaglini	5095e29c8e	arch-arm: Implement FEAT_HCX This is just making the HCRX_EL2 register read/writable; trapping behaviour will be implemented with further extensions Change-Id: Id1ec42a754b7d999782edde3a8ec6c6099e3331e Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70939 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-25 21:36:58 +00:00
Giacomo Travaglini	0fae6e8163	arch-arm: Implement FEAT_EVT This extension is optional in Armv8.2 but mandatory since Armv8.5 We only implement this for AArch64 Change-Id: I063642ac24d27f0a81ba79b1d38f72468bb130eb Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70938 Reviewed-by: Richard Cooper <richard.cooper@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-05-25 21:36:58 +00:00
Richard Cooper	9de1443ebb	arch-arm: Add support for Armv8.2-I8MM NEON extension. Add support for the Armv8.2-I8MM NEON extension. This provides the SUDOT and USDOT mixed-sign SIMD Dot Product instructions, as well as the SMMLA, UMMLA, and USMMLA SIMD Matrix Multiply-Accumulate instructions. For more information please refer to the Arm Architecture Reference Manual (https://developer.arm.com/documentation/ddi0487/latest/). Additional Contributors: Giacomo Travaglini Change-Id: I6fb9318f67cc9d2737079283e1a095630c4d2ad9 Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70737 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-25 21:36:39 +00:00
Richard Cooper	eb4f83b178	arch-arm: Add support for Armv8.2-DotProd NEON extension. Add support for the Armv8.2-DotProd NEON extension. This provides the SDOT and UDOT SIMD Dot Product instructions. For more information please refer to the Arm Architecture Reference Manual (https://developer.arm.com/documentation/ddi0487/latest/). Change-Id: I4caa3b97a74c65f32421487c55c3e36427194e61 Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70736 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-25 21:36:39 +00:00
Richard Cooper	fab3d8a1c1	arch-arm: Fix too long lines in existing Arm NEON instructions. These lines break the current gem5 coding guidelines. Change-Id: I587fcb2d75c4ab9de47fa53b4ae96526a20afe3f Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70735 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-25 21:36:39 +00:00
Richard Cooper	d02ea0dfbb	arch-arm, cpu, configs: Add new Op Classes for Matrix Multiply insts Add SimdMatMultAcc and SimdFloatMatMultAcc Op Classes for the SVE Matrix Multiply Accumulate instructions in the SVE F32MM, F64MM and I8MM extensions. Initial latencies have been set to be the same as SimdMultAcc and SimdFloatMultAcc respectively. Change-Id: Ifab63a0efbb0ccfbd272245e0b0b055279f66e3a Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70734 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-05-25 21:36:39 +00:00
Richard Cooper	560df49c28	arch-arm: Declare support for Armv8.2-I8MM. Sets the appropriate bit in the ID_AA64ZFR0_EL1 sysreg that declares support for ARMv8.2-I8MM. This indicates that all pre-requisites for Armv8.2 SVE Int8 matrix multiplication instructions have been met. SMMLA, SUDOT, UMMLA, USMMLA, and USDOT instructions are implemented. For more information please refer to the "ARM Architecture Reference Manual Supplement - The Scalable Vector Extension (SVE), for ARMv8-A" (https://developer.arm.com/architectures/cpu-architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a) Additional Contributors: Giacomo Travaglini Change-Id: Id97e1c5de8c23a25336a6b323034e9eca8e598e4 Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70733 Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>	2023-05-25 21:36:39 +00:00
Richard Cooper	f8b60b7a1d	arch-arm: Added Armv8.2-I8MM SVE mixed-sign dot product instrs. Add support for the SVE mixed sign dot product instructions (USDOT, SUDOT) required by the Armv8.2 SVE Int8 matrix multiplication extension (ARMv8.2-I8MM). For more information please refer to the "ARM Architecture Reference Manual Supplement - The Scalable Vector Extension (SVE), for ARMv8-A" (https://developer.arm.com/architectures/cpu-architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a) Change-Id: I83841654cee74b940f967b3a37b99d87c01bd92c Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70732 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-25 21:36:39 +00:00
Richard Cooper	9421a46d71	arch-arm: Re-factor Arm decoder for SVE mixed-sign DOT insts. Re-factored the Arm instruction decoder to add placeholders for the SVE Integer mixed-sign DOT product instructions. This has involved moving some existing decode helper functions. Change-Id: I42b280d4bd1b4ab9d8c633bdc523bd08c281d218 Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70731 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-25 21:36:39 +00:00
Richard Cooper	98e67c8610	arch-arm: Add support for Arm SVE Integer Matrix instructions. Add support for the Arm SVE Integer Matrix Multiply-Accumulate (SMMLA, USMMLA, UMMLA) instructions. Because the associated SUDOT and USDOT instructions have not yet been implemented, the SVE Feature ID register 0 (ID_AA64ZFR0_EL1) has not yet been updated to indicate support for SVE Int8 matrix multiplication instructions at this time. For more information please refer to the "ARM Architecture Reference Manual Supplement - The Scalable Vector Extension (SVE), for ARMv8-A" (https://developer.arm.com/architectures/cpu-architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a) Additional Contributors: Giacomo Travaglini Change-Id: Ia50e28fae03634cbe04b42a9900bab65a604817f Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70730 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2023-05-25 21:36:39 +00:00
Richard Cooper	0f857873f9	arch-arm: Declare support for Armv8.2-F64MM. Sets the appropriate bit in the ID_AA64ZFR0_EL1 sysreg that declares support for ARMv8.2-F64MM. This indicates that all pre-requisites for Armv8.2 SVE FP64 double-precision floating-point matrix multiplication instructions have been met. FMMLA, and LD1RO* instructions have been implemented, as well as the 128-bit element variants of TRN1, TRN2, UZP1, UZP2, ZIP1, and ZIP2. For more information please refer to the "ARM Architecture Reference Manual Supplement - The Scalable Vector Extension (SVE), for ARMv8-A" (https://developer.arm.com/architectures/cpu-architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a) Additional Contributors: Giacomo Travaglini Change-Id: Idac3a3ca590e6eb2beb217a40a8c10af1e917440 Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70729 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-05-25 21:36:39 +00:00
Richard Cooper	8bf89d6967	arch-arm: Added 128-bit encodings of SVE TRN, UZP, and ZIP insts. Add support for the 128-bit element encodings of the TRN1, TRN2, UZP1, UZP2, ZIP1, and ZIP2 instructions, required by the Armv8.2 SVE Double-precision floating-point Matrix Multiplication instructions (ARMv8.2-F64MM). For more information please refer to the "ARM Architecture Reference Manual Supplement - The Scalable Vector Extension (SVE), for ARMv8-A" (https://developer.arm.com/architectures/cpu-architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a) Change-Id: I496576340c48410fedb2cf6fc7d1a02e219b3bd4 Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70728 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>	2023-05-25 21:36:39 +00:00
Richard Cooper	19e8023043	arch-arm: Support Arm SVE Load-Broadcast Octaword instructions. Add support for the Arm SVE Load-Broadcast Octaword (LD1RO{B,H,W,D}) instructions. These are similar to the Load-Broadcast Quadword (LD1RQ{B,H,W,D}) instructions, but work on a 32-byte memory segment rather than a 16-byte memory segment. Consequently, the LD1ROx implementations build on the code for the LD1RQx implementations. For more information please refer to the "ARM Architecture Reference Manual Supplement - The Scalable Vector Extension (SVE), for ARMv8-A" (https://developer.arm.com/architectures/cpu-architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a) Change-Id: I98ee4f56c8099bf40c9034baa488d318ae57d3aa Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70727 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-25 21:36:39 +00:00
Richard Cooper	94a629b527	arch-arm: Add support for Arm SVE fmmla instruction. Add support for the Arm SVE Floating Point Matrix Multiply-Accumulate (FMMLA) instruction. Both 32-bit element (single precision) and 64-bit element (double precision) encodings are implemented, but because the associated required instructions (LD1RO*, etc) have not yet been implemented, the SVE Feature ID register 0 (ID_AA64ZFR0_EL1) has only been updated to indicate 32-bit element support at this time. For more information please refer to the "ARM Architecture Reference Manual Supplement - The Scalable Vector Extension (SVE), for ARMv8-A" (https://developer.arm.com/architectures/cpu-architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a) Additional Contributors: Giacomo Travaglini Change-Id: If3547378ffa48527fe540767399bcc37a5dab524 Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70726 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-05-25 21:36:39 +00:00
Matthew Poremba	2aa95ccc7d	arch-x86: Fix CPUID function 0 This should return the number of standard features, not the number of extended features. Change-Id: Ieb3a36d832cee603f1efd39b4f430b5ac0478561 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70778 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2023-05-25 14:38:09 +00:00
Giacomo Travaglini	dc76c00c9b	arch-arm: Add an ArmAllRelease containing every defined extension This is probably the easiest way to instantiate a release containing any implemented extension. It is alternatively possible to use the latest release (e.g. Armv92 as of now). This could be preferable for consistency across simulations. However, if users want to always be up to date with development, using ArmAllRelease will allow them to do so without the need to change their configuration script. Change-Id: Ibca629e99da9b571f233de9d05a5a9186d02aa99 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70958 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-05-25 07:30:48 +00:00
Prajwal Hegde	dfa3c073cf	arch-arm,cpu: Add four Arm SVE2 int instructions This changeset adds ARM SVE2 integer instructions - ADCLB, ADCLT, SBCLB, SBCLT - Decoding logic as per sve encoding of Version: 2023-03 Change-Id: I1bd3fe24b33677baa0b6da3c1dd7423f2b13b2c6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70137 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-05-24 01:22:30 +00:00
Roger Chang	2579bacf06	arch-riscv: Merge rv32 and rv64 version of xperm4 and xperm8 Remove unnecessary suffixes such as '_32' and '_64' from the mnemonics Change-Id: I83d47eeccd04fe61ac8ee0addd7221abbdcefbd1 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70600 Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-05-23 23:58:16 +00:00
Roger Chang	5fa81af8c6	arch-riscv: Simplify the rev8 and brev8 instructions The mnemonics of these instructions should not have the 'rv32_' prefix Change-Id: Ic072ba8b84e5a51be060e5d7ca16dd913c318957 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70599 Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-23 23:58:16 +00:00
Roger Chang	4dccd7dd6c	arch-riscv: Add BS format isa This format is a helper for the disassembly of aes32dsi, aes32dsmi, aes32esi, aes32esmi, sm4ed, and sm4ks Change-Id: Ieff1932e267efc0a8c5fd8e557fc467dc376da4e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70598 Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-05-23 23:58:16 +00:00
Giacomo Travaglini	d537ded9d2	arch-arm: Fix printing of VecElemClass registers At the moment it is not possible to trace the value of VecElemClass registers. If an AArch32 SIMD binary is run with tracing on, simulation will fail the following assertion [1]. std::string valString(const void *val, size_t size) const override { assert(size == sizeof(ValueType)); The problem is that Arm VecElems are stored in RegVal (uint64_t), but the VecElem data type (ValueType above) per se is a uint32_t. So valString is getting called with size = 8 (coming from RegVal) but ValueType has size = 4. We fix this problem by using RegVal as a VecElemRegClassOps template parameter to make them match. This is not changing anything from a functionality perspective. The result will be that we will be able to print VecElems as 64bit values. This solution is the simplest one but a bit dirty. I believe in the long term we should make the VecElemClass use the void* interface rather than the RegVal one. In this way we will be able to correctly print the VecElem size as 32bit value. [1]: https://github.com/gem5/gem5/blob/v22.1.0.0/src/cpu/reg_class.hh#L362 Change-Id: Ic3fc252d41449f828b77f938fefc0cd4274b1c57 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70697 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-05-23 19:51:21 +00:00
Giacomo Travaglini	7b91521c60	arch-arm: Define a AA64ZFR0 data type Change-Id: I6b0dcf0c1882f356783934f625c2bc3a25fbb885 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70725 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-05-23 06:43:21 +00:00
Giacomo Travaglini	3787ab5b20	arch-arm: Rename AdvSIMD instruction pool The decoding function was wrongly named decodeNeon3SameExtra, referring to the "AdvSIMD three same Extra" instruction pool. This might be an old name, as I can only find the "AdvSIMD scalar three same Extra" in the Arm ARM. The encoding space reserved to the pool bears the "Advanced SIMD three-register extension" name; we therefore rename the function to decodeNeon3RegExtension Change-Id: I056da8f0c7808935d12a4b05490d30654178071f Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70724 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-05-23 06:43:21 +00:00
Giacomo Travaglini	ae115fcfd5	arch-arm: Implement FEAT_IDST Change-Id: I3cabcfdb10f4eefaf2ab039376d840cc4c54609a Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70723 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-05-23 06:43:21 +00:00
Giacomo Travaglini	e005e6f250	arch-arm: Implement trapping of SME registers Change-Id: Ic5bcc79a535c928265fbc1db1cd0c85ba1a1b152 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70722 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-05-23 06:43:21 +00:00
Giacomo Travaglini	1629ee71c7	arch-arm: Implement FEAT_RNG Change-Id: I9d60d249172ef4bbaf5d9b38ef279eff344b80d8 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70721 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2023-05-23 06:43:21 +00:00
Giacomo Travaglini	2a5c427c5c	arch-arm: Extend SCR to be 64-bit wide Change-Id: I9928de3db61957404269d189a15a951fd6707c8a Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70720 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2023-05-23 06:43:21 +00:00
Giacomo Travaglini	e3d2191b73	arch-arm: Implement FEAT_FLAGM(2) Change-Id: I21f1eb91ad9acb019a776a7d5edd38754571a62e Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70719 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-23 06:43:21 +00:00
Giacomo Travaglini	223a07031f	arch-arm: Improve debugging of CC regs accesses As of now we are simply printing the CC reg index which is not particularly helpful. With this patch we actually print the (NZ|C|V) reg name. Change-Id: Ib4b56a372b25e5bc2b6b762d2ef3ff2084097cce Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70718 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2023-05-23 06:43:21 +00:00
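
Note on commits 328aaa626f and 9a27322c7b: the data loss they describe can be illustrated with a minimal sketch. This is not gem5 code. It assumes that storing a value in an unsigned integer of a given width keeps only the low bits, and it models that with a mask. The helper names double_bits, store_in_width and survives are hypothetical and exist only for this illustration.

```python
import struct


def double_bits(value: float) -> int:
    """Return the IEEE 754 binary64 bit pattern of value as an integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def store_in_width(bits: int, width: int) -> int:
    """Model storing bits in an unsigned integer `width` bits wide.

    Assumption: the conversion keeps only the low `width` bits.
    """
    return bits & ((1 << width) - 1)


def survives(value: float, width: int) -> bool:
    """True if the full bit pattern of value fits in `width` bits."""
    bits = double_bits(value)
    return store_in_width(bits, width) == bits


if __name__ == "__main__":
    # 16 bits models uint_fast16_t on Mac OS, 64 bits models it on
    # x86-64 Linux, as stated in the commit message.
    for width in (16, 64):
        print(f"width={width}: double preserved = {survives(1.5, width)}")
```

With a 16-bit container the upper bits of the double are dropped, which matches the FloatMM failure the commit message reports on Mac OS; a container of at least 64 bits preserves the value.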

5588 Commits