derek/gem5

Author	SHA1	Message	Date
Debjyoti Bhattacharjee	ec690de0da	arch-riscv: This commit fixes a bug in the vfmv.f.s impl. in riscv (#863) The existing implementation of the vfmv instruction did not type cast the first element of the source vector, which caused the "freg" to interpret the result as a NaN. With the type cast to f32, the value is correctly recognized as a float and sign extended to be stored in the fd register. Git issue: https://github.com/gem5/gem5/issues/827 Change-Id: Ibe9873910827594c0ec11cb51ac0438428c3b54e --------- Co-authored-by: Debjyoti B <bhatta53@imec.be> Co-authored-by: Tommaso Marinelli <tommarin@ucm.es>	2024-03-29 08:23:14 -07:00
Yu-Cheng Chang	896c32cd0d	arch: Add getIsaName in BaseISA (#975) Change-Id: I81bfcd691d570430f7011f0d5023e5ea613e0dd9	2024-03-28 13:27:32 +00:00
Ivan Fernandez	1e743fd85a	arch-riscv: adding vector unit-stride segment stores to RISC-V (#913) This commit adds support for vector unit-stride segment store operations for RISC-V (vssegXeXX). This implementation is based on two types of microops: - VsSegIntrlv microops that properly interleave source registers into structs. - VsSeg microops that store data in memory as contiguous structs of several fields. Change-Id: Id80dd4e781743a60eb76c18b6a28061f8e9f723d Gem5 issue: https://github.com/gem5/gem5/issues/382	2024-03-22 15:45:58 -07:00
Matthew Poremba	7d62da6d10	dev-amdgpu: Support for ROCm 6.0 (#926) Implement several features new in ROCm 6.0 and features required for future devices. Includes the following: - Support for multiple command processors - Improve handling of unknown register addresses - Use AddrRange for MMIO address regions - Handle GART writes through SDMA copy - Implement PCIe indirect reads and writes - Improve PM4 write to check dword count - Implement common MI300X instruction	2024-03-21 21:12:09 -07:00
Matthew Poremba	dca040983b	arch-vega: Various vega fixes to enable nanogpt (#950) This PR fixes several issues that had to be resolved to get nanogpt working.	2024-03-21 21:11:44 -07:00
Michael Boyer	803dbbfdac	arch-vega: Implement flat_load_sbyte instruction (#953) Change-Id: I642a71c504e2d3afecd5d2dfd9db016945aed21b	2024-03-21 21:11:10 -07:00
Matthew Poremba	9ab004cccc	arch-vega: Implement V_LSHL_ADD_U64 This is a new instruction in MI300 and operates similarly to V_LSHL_ADD_U32 but on 64-bit values. Change-Id: Ia4ac65160bdad748fccdcb28286ba03157cc4046	2024-03-21 10:10:01 -05:00
Matthew Poremba	f36be791aa	arch-vega: Expand FLAT subDecode range in main decoder The main decoder for GPU instructions looks at the first 9 bits of a dword to determine either the instruction or a subDecode table with more information for specific instruction types. For flat instructions the first 9 bits currently consist of 6 fixed encoding bits, a reserved bit, and the first two bits of the opcode. Hence to support all opcodes there are four indirections to the flat subDecode table. In MI300 the reserved bit is part of a field to determine memory scope and therefore may be non-zero. This commit adds four additional calls to the subDecode table for the cases where the scope bit is 1. See page 468 (PDF page 478) below: https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf Change-Id: Ic3c786f0ca00a758cbe87f42c5e3470576f73a32	2024-03-21 10:10:01 -05:00
Matthew Poremba	e02f329d5d	arch-vega: Fix VOP3 decode table off-by-one There is no VOP3 opcode 667. Mark that invalid and move the opcodes after down by one. Change-Id: Ia4ccda91f6f501c1ce7c5898d7d0e924604a459a	2024-03-20 16:41:31 -05:00
Matthew Poremba	457d97ea52	arch-vega: Implement V_XNOR_B32 Change-Id: Id23a8d984f227ca23a92adb6c7fde3b4627af054	2024-03-20 16:37:37 -05:00
Matthew Poremba	1b15b2cc4b	arch-vega: Support negative modifiers for packed F32 math MI200 adds support for four FP32 packed math instructions. These are VOP3P instructions which have a negative input modifier field. The description made it unclear whether these were used for F32 packed math; however, the assembly of some Tensile kernels uses these modifiers, so this commit adds support for them. Tested with the PyTorch nn.Dropout kernel, which uses negative modifiers. Change-Id: I568a18c084f93dd2a88439d8f451cf28a51dfe79	2024-03-20 16:37:23 -05:00
Matthew Poremba	3f8d0e1ef8	arch-vega: Fix V_FMAC_F32 data type The datatype is U32 but should be F32. This is causing an implicit cast leading to incorrect results. This fixes nn.Dropout in PyTorch. Change-Id: I546aa917fde1fd6bc832d9d0fa9ffe66505e87dd	2024-03-20 16:37:23 -05:00
Yu-Cheng Chang	dbae09e4d9	arch-riscv: Move alignment check to Physical Memory Attribute (PMA) (#914) In the RISC-V unprivileged spec [1], misaligned load/store support depends on the EEI. In the RISC-V privileged spec v1.12 [2], the PMAs specify whether misaligned access is supported for each data width and memory region. According to the `mcause` section of the spec [3], we could directly raise a misaligned exception if no memory region supports misaligned access. If part of the memory supports misaligned access, we need to translate the `vaddr` to a `paddr` first and then check the `paddr`. A page fault or access fault is raised before a misaligned fault. The benefit of moving the check_alignment option from an ISA option to a PMA option is that we can specify which memory regions support misaligned load/store. The MMU checks alignment on the virtual address if no misaligned memory region is specified. If some memory regions support misaligned access, it translates the address first and checks alignment at the end. [1] https://github.com/riscv/riscv-isa-manual/blob/main/src/rv32.adoc#base-instruction-formats [2] https://github.com/riscv/riscv-isa-manual/blob/main/src/machine.adoc#physical-memory-attributes [3] https://github.com/riscv/riscv-isa-manual/blob/main/src/machine.adoc#machine-cause-register-mcause	2024-03-18 12:59:13 -07:00
Ivan Fernandez	f6c61836b3	arch-riscv: adding vector unit-stride segment loads to RISC-V (#851) This commit adds support for vector unit-stride segment load operations for RISC-V (vlseg<NF>e<X>). This implementation is based on two types of microops: - VlSeg microops that load data as it is organized in memory in structs of several fields. - VectorDeIntrlv microops that properly deinterleave structs into destination registers. Gem5 issue: https://github.com/gem5/gem5/issues/382	2024-03-06 11:27:06 -08:00
Giacomo Travaglini	3d2052bc03	misc: Serialize the ISA as a string in the checkpoint With the introduction of multi-ISA gem5, we don't store the TARGET_ISA anymore as a string in the root section of the checkpoint [1]. There is therefore no way at the moment to assess the ISA of a CPU/ThreadContext. This is a problem when it comes to checkpoint updates which are ISA specific. By explicitly serializing the ISA as a string under the cpu.isa section we avoid this problem and we let cpt_upgraders be aware of the ISA in use. [1]: https://gem5-review.googlesource.com/c/public/gem5/+/48884 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Change-Id: I1e75230cbc370cab84f4a54141b1e425af2dbfac Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2024-03-04 17:51:40 +00:00
Nitish Arya	676d571009	arch-riscv: adding stats to show completed page walks (#869) This commit adds statistics showing completed page walks for 4KB and 2MB pages. This will add to stats.txt the variables num_4kb_walks, num_2mb_walks and the corresponding values. This is done based on the level of page table walk traversed specific to Sv39 Virtual Memory System.	2024-03-04 08:38:28 -08:00
Matthew Poremba	db42aeb630	arch-vega: Implement accumulation offset (#895) This PR implements a few changes related to the accumulation offset which is new in MI200. Previously MI100 contained two vector register files: the architectural and accumulation register files. These have now been unified and the architectural register file is twice the size. As a result of this the dispatch packet sets an offset into the unified vector register file for where the former accumulation registers would go. The changes are: - Calculate the accumulation offset from dispatch packet and store in HSA task. - Update the accumulation move instructions (v_accvgpr_read/write) to use it. - Update the current MFMA instructions to use it. - Clean up the MFMA examples.	2024-02-29 09:05:39 -08:00
Nicholas Mosier	69762e272e	sim-se, arch-x86: initialize max stack size from parameter (#892) Initialize x86 process' max stack size to the value given in the process params, rather than hard-coding it to 8 MB, which made it impossible to run x86 programs requiring more than 8 MB of stack. Change-Id: I0b17fe60b016b1e4a82d704ef7ad367974ea6a08	2024-02-29 08:15:43 -08:00
Matthew Poremba	2ca7f48828	arch-vega: Accumulation offset for existing MFMA insts This commit updates the two existing MFMA instructions to support the accumulation offset for the A, B, and C/D matrices. Additionally, it uses array-indexed C/D matrix registers to reduce duplicate code. Future MFMA instructions have up to 16 registers for C/D and this reduces the amount of code being written. Change-Id: Ibdc3b6255234a3bab99f115c79e8a0248c800400	2024-02-26 14:30:50 -06:00
Matthew Poremba	e0e65221b4	arch-vega: Use accum offset for v_accvgpr_read/write The accum offset is used as an index into the unified VGPR register file in MI200 and is not the same as a move if accum_offset in the dispatch packet is non-zero. Change these instructions to use the stored accum_offset value. Change-Id: Ib661804f8f5b8392e4c586082c423645f539e641	2024-02-26 12:57:09 -06:00
Yu-Cheng Chang	816ef46c78	arch-riscv: Fix fflags behavior of float inst. in O3 CPU (#868) According to the RISC-V spec [1], floating-point instructions accumulate into the FFLAGS register rather than overwriting it, to reflect the CSR behavior. In the previous implementation, we read FFLAGS, set the exception flags, and wrote the result back to FFLAGS. This works in the gem5 simple and minor CPU models because the value is actually written to `regFile` after executing the instruction. However, in the gem5 O3 CPU model, a write to `miscReg` in the execution stage is held in the `destMiscReg` buffer until the commit stage. The next instruction then reads the old FFLAGS and produces an incorrect result. This CL introduces `MISCREG_FFLAGS_EXE` and keeps the same `miscRegFile` size because `MISCREG_FFLAGS_EXE` and `MISCREG_FFLAGS` share the same space. When executing a floating-point instruction, any exception flags should be updated via `MISCREG_FFLAGS_EXE` so that the `setMiscReg` method accumulates them into FFLAGS. `MISCREG_FFLAGS` itself should only be used in the CSROp. [1] Syntactic Dependencies: Appendix A `c80ecada1c/src/mm-eplan.adoc (syntactic-dependencies-rules-9-11)` gem5 issue: https://github.com/gem5/gem5/issues/755 Change-Id: Ib7f13d95b8a921c37766a54a217a5a4b1ef17c6f	2024-02-22 08:33:34 -08:00
Jason Lowe-Power	c719ea960a	arch-arm: Add FEAT_FGT trapping for debug registers (#873) We already implemented FEAT_FGT but we were missing trapping capabilities for debug register accesses	2024-02-21 11:27:43 -08:00
Nicholas Mosier	7ac9733199	arch-x86, cpu-kvm: initialize x87 FCW (#877) Fix #876. The x87 floating-point control word (FCW) was not initialized at process startup in syscall emulation mode. This resulted in floating point exceptions in KVM mode when executing x87 floating-point instructions. This patch fixes the bug by initializing FCW to its reset value, 0x37F. Change-Id: Idd1573c6951524ef59466cc5c9f1e640ea7658ae	2024-02-20 07:46:44 -08:00
Giacomo Travaglini	8759131df3	cpu-o3, arch: Fix SMT bug arising from v23.0 and make gem5 more robust with SMT (#828) This PR fixes https://github.com/gem5/gem5/issues/668. The first commit fixes it for all ISAs other than Arm by setting the number of architectural Matrix registers to 0 for those ISAs that do not use them. The 2nd commit then partly fixes it for Arm as well: by removing RenameMap::numFreeEntries we don't stall renaming unless a matrix instruction is encountered... This means most binaries will run with SMT as long as they don't use FEAT_SME instructions. Please note: this is not simply an SMT fix, it will generally address a shortcoming in the way we were renaming instructions. If an Arm binary wants to use SMT with FEAT_SME, the 4th commit will make sure the lack of physical registers is notified explicitly at the beginning of simulation, rather than silently blocking renaming	2024-02-19 08:52:31 +00:00
Giacomo Travaglini	2c0cc0040b	arch-arm: Implement FEAT_FGT Debug trapping Change-Id: I30af2b49ee604bcaa43fd419f6bc69e9ee6d9350 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com>	2024-02-15 15:58:34 +00:00
Giacomo Travaglini	683007c6ca	arch-arm: Add FEAT_FGT Debug Read/Write registers Those are supposed to control trapping for accesses to debug registers Change-Id: I4a25a379e718ea6d5ea8ae22ac7edbeb452d1836 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Richard Cooper <richard.cooper@arm.com>	2024-02-15 15:58:34 +00:00
Harshil Patel	47c4dad869	arch-riscv: Remove unnecessary assert (#866) `assert(interruptID >= 0)` is always true as `interruptID` is an unsigned int. This was causing compilation test failures with GCC-8 with the following error: ```sh src/arch/riscv/interrupts.cc:47:32: error: comparison is always true due to limited range of data type [-Werror=type-limits] assert(interruptID >= 0); ``` Change-Id: I356be78d7f75ea5d20d34768fb8ece0f746be2fc	2024-02-13 08:30:18 -08:00
Vishnu Ramadas	8054459df6	arch-vega: Add support for S_ICACHE_INV instruction Previously, the S_ICACHE_INV instruction was unimplemented and simulation panicked if it was encountered. This commit adds support for executing the instruction by injecting a memory barrier in the scalar pipeline and invalidating the ICACHE (or SQC) Change-Id: I0fbd4e53f630a267971a23cea6f17d4fef403d15	2024-02-09 12:19:08 -06:00
Saúl	7d80658a39	arch-riscv: fix vl in mask load/store (i.e. vlm.v/vsm.v) (#830) The vlm.v and vsm.v unit-stride mask load/store instructions are constructed with an incorrect VL when the current one is larger than VLEN/EEW (i.e. when LMUL > 1). This commit fixes the issue for both instructions.	2024-02-08 14:06:49 -08:00
Bobby R. Bruce	7fe1588546	arch-riscv: Fix load and store to use EEW instead of SEW (#859) Vector unit-stride instructions have an EEW encoded directly in the instruction. We should use that instead of the SEW in vtype. Ref: https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#73-vector-loadstore-width-encoding	2024-02-08 12:14:11 -08:00
Saúl	804f137325	arch-riscv: add unit-stride fault-only-first loads (i.e. vleff) (#794) This patch provides unit-stride fault-only-first loads (i.e. vleff) for the RISC-V architecture. They are implemented within the regular unit-stride load (i.e. vle). A snippet named `fault_code` is inserted with templating to change their behaviour to fault-only-first. Apart from this, a new micro based on the vset\vl\* instructions (VlFFTrimVlMicroOp) is inserted as the last micro in the macro constructor to trim the VL to its corresponding length based on the faulting index. This trimming micro waits for the load micros to finish (via data dependency) and has a reference to the other micros to check whether they faulted or not. The new VL is calculated with the VL of each micro, stopping on the first faulting one (if there's such a fault). I've tested this with VLEN=128,256,...,16384 and all the corresponding SEW+LMUL configurations. Change-Id: I7b937f6bcb396725461bba4912d2667f3b22f955	2024-02-08 09:15:58 -08:00
QQeg	e685c072d1	arch-riscv: Remove micro_elems in VleMicro template Change-Id: I91267de8b1142075aa2873bfcedfd8b15c6863d4	2024-02-08 07:24:55 +00:00
QQeg	7eeac98b8d	arch-riscv: Fix load and store to use EEW instead of SEW Vector unit-stride instructions have an EEW encoded directly in the instruction. We should use that instead of the SEW in vtype. Change-Id: I282041ce8ed57fbcca899f7497ef6c6fb2dfcf85	2024-02-07 21:11:28 +00:00
Robert Hauser	f289f9e8b5	arch-riscv: adding support for local interrupts (#813) Besides the standard RISC-V software, timer, and external interrupts, the RISC-V specification also offers the possibility to implement local interrupts. With this patch, we contribute an extension of RiscvInterrupts that enables connecting interrupt sources to the local interrupt controller. We assigned the local interrupts to machine-level and gave them the highest priority. If two local interrupts are pending, their exception code will be the tie-breaker (higher ID > lower ID). 32-bit systems only recognize the local interrupts 16 to 31, 64-bit systems 16 to 63. Change-Id: Iff8d34e740b925dce351c0c6f54f4bd37a647e0c --------- Co-authored-by: Robert Hauser <robert.hauser@uni-rostock.de>	2024-02-06 09:38:50 -08:00
Yu-Cheng Chang	ba6c569b8d	arch-riscv: Add BasePMAChecker to support customized PMA (#846) The RISC-V privileged spec doesn't specify the implementation of PMA (physical memory attribute), which was addressed in the previous CL [1]. This CL creates the BasePMAChecker to support customized PMAs so that we can focus only on the features wanted in a study. The CL also keeps the common methods `check` and `takeOverFrom` so that the MMU can easily interact with the PMA. [1] https://gem5-review.googlesource.com/c/public/gem5/+/40596 Change-Id: I9725e3a8f7f9276e41f0d06988259456149d2a77	2024-02-06 05:38:34 -08:00
Giacomo Travaglini	a60d6960c7	arch-arm: Remove unused/unimplemented TLB methods (#849) Change-Id: I3a76a914df1ba65ec5200f11111cf20f3e1eb924 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-06 09:18:06 +00:00
Giacomo Travaglini	05f93175a7	arch-arm: Crypto instruction execution requires SIMD to be enabled (#848) Crypto instructions will cause an undefined instruction when executed with SIMD disabled. The PR also refactors their implementation by checking the release object instead of the ID register field. This improves readability	2024-02-05 19:22:04 +00:00
Chong-Teng Wang	40ecdf5fb4	arch-riscv: Fix RVV instructions vmv.s.x/vfmv.s.f (#843) This commit fixes the implementation of vmv.s.x and vfmv.s.f. When vl = 0, no elements are updated in the destination vector register group, regardless of vstart. Change-Id: Ib21b3125da8009325743ec70ca0874704328356c Reference: [Integer Scalar Move Instructions](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#161-integer-scalar-move-instructions) [Floating-Point Scalar Move Instructions](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#162-floating-point-scalar-move-instructions)	2024-02-05 08:51:42 -08:00
Chong-Teng Wang	85059a369e	arch-riscv: Fix control flow in VectorFloatMaskMacroConstructor (#844) This commit adjusts the logic in VectorFloatMaskMacroConstructor to ensure the %(copy_old_vd)s section is not skipped when vl = 0, ensuring correct values in the destination vector register. Change-Id: I2478722d6f003a0f2e4b3cd0ba3e845bed938ee6 This is the same problem as #715.	2024-02-05 06:29:05 -08:00
Giacomo Travaglini	16e06bad0c	arch-arm: Exec Crypto instructions only if SIMD&FP enabled We not only check for the presence of the relative FEAT_*, we also check if AdvSIMD is enabled; we throw an undefined instruction otherwise. Change-Id: I1fd0cdc8057c5a7901774802dc076817f06c8e66 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2024-02-05 12:56:48 +00:00
Giacomo Travaglini	ebef2fc4b1	arch-arm: Crypto instructions checking release object Check directly if extension is enabled instead of looking for ID register field value. This makes the code more readable Change-Id: If0b882ac3464c3587731b72a7edb3b8b65ea86c7 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2024-02-05 12:56:48 +00:00
Giacomo Travaglini	d031244ca7	misc: When unused, set #MatRegClass registers to 0 This is working around an existing SMT issue [1]. The BaseO3CPU uses two physical matrix registers [2]. This is enough for a single threaded CPU which as of now uses 1 architectural matrix only. The problem arises when SMT is enabled. As 2 architectural matrices need to be supported by a single CPU, the O3CPU won't have any available register in the freeList for renaming. This causes the SMT O3CPU to indefinitely stall renaming [3]. If the architectural number of registers is set to 0, the regclass won't be taken into consideration when evaluating if we can rename instructions. This issue has been implicitly fixed for RISCV by a preceding PR [4] [1]: https://github.com/gem5/gem5/issues/668 [2]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/BaseO3CPU.py#L170 [3]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/rename.cc#L1228 [4]: https://github.com/gem5/gem5/pull/83 Change-Id: I99bfdefff11a246b1f191251dc67689e95b3f0db Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-02 18:11:47 +00:00
Giacomo Travaglini	3a2c8feca8	arch-arm: MMU aarch64EL is not used in AArch64 only anymore We therefore rename it to exceptionLevel Change-Id: I2a3aabaefa315d95bd034b13d95d5a5b0b8e9319 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-01 13:45:06 +00:00
Giacomo Travaglini	3737e8b6df	arch-arm: Use MAIR_EL2 mem attribute register when in EL0 host With the old code, the MAIR_EL1 register was checked when inserting an EL2&0 TLB entry Change-Id: I064032fb2946777c2f4c50c06a124f828245e18a Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-01 13:44:16 +00:00
Giacomo Travaglini	d42ef792bf	arch-arm: Check ELIs64 for EL2 when in EL2&0 regime The problem with: ELIs64(tc, aarch64EL == EL0 ? EL1 : aarch64EL); Is that when we are executing at EL0 in host (EL2&0 translation regime), the execution mode (AArch32 vs AArch64) is dictated by EL2 and not by EL1 (which is the guest) Change-Id: I463a2a9461c94d0886990ae3d0a6e22aeb4b9ea3 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-01 13:43:59 +00:00
Giacomo Travaglini	458c98082c	arch-arm: Replace EL based translation with regimes This is the final step in the transformation process. We limit the use of the "managing Exception Level" for a translation in favour of the more standard "Translation Regime". This greatly simplifies our code, especially with VHE where the managing EL (EL2) could handle two different translation regimes (EL2 and EL2&0). We can therefore remove the isHost flag wherever it got used. That case is automatically handled by the proper regime value (EL2&0) Change-Id: Iafd1d2ce4757cfa6598656759694e5e7b05267ad Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-01 13:43:47 +00:00
Giacomo Travaglini	e333a77c12	arch-arm: Remove _Xt postfix from TLBI instructions The Xt is not part of the architectural name of the register and it was likely added with the introduction of extended register (Xt) TLBIs in Armv8 to differentiate them with the old Armv7 ones. The use of _Xt was not consistent anyway: newer TLBIs were already omitting it. Change-Id: Ic805340ffa7b5770e3b75a71bfb76e055e651f8b Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-01 13:43:26 +00:00
Giacomo Travaglini	594428f010	arch-arm: Remove redundant isHyp as a TLB entry field We should stop using isHyp. A hypervisor entry is flagged already by the EL of the entry (el == EL2) Change-Id: I20c3d06fa2b04e0b938a380ca917d0b596eddcf2 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-01 13:43:00 +00:00
Giacomo Travaglini	a6ca81906a	arch-arm: Simplify setting of isHyp for mem translations The isHyp descriptor is an old artifact of armv7 and it flags a PL2 (AArch32) or EL2 & EL2&0 (AArch64) translations. It is commonly set according to the EL/mode [1] but it may differ from the execution state in case of explicit translation requests (via the AT instruction as an example [2]). There is really no need to complicate the masking of isHyp. We should just make use of the tranType method (in charge of setting aarch64EL) to properly set aarch64EL, and make isHyp coincide with the case of aarch64EL == EL2. This is a step towards the removal of the isHyp flag. More specifically the patch does the following: * HypMode translation type moved in the EL2 case The translation is used by ATS1HR/ATS1HW: Performs stage 1 address translation as defined for PL2 and the Non-secure state * S1S2NsTran translation type moved in the EL1 case The translation is used by ATS12NSOPR/ATS12NSOPW: Performs stage 1 and 2 address translations as defined for PL1 and the Non-secure state * S1CTran translation type can be at either EL1 or EL3 The translation is used by ATS1CPR/ATS1CPW Performs stage 1 address translation as defined for PL1 and the current Security state [1]: https://github.com/gem5/gem5/blob/stable/src/arch/arm/mmu.cc#L1281 [2]: https://github.com/gem5/gem5/blob/stable/src/arch/arm/mmu.cc#L1282 Change-Id: Ie653170f6053c5d8141a2de9f50febf5bf53ab9c Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-01 13:42:40 +00:00
Jason Lowe-Power	b3870ee7b0	arch-riscv: Fix fence.i instruction in O3 CPU (#816)	2024-01-30 15:39:32 -08:00
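
Note on commit 816ef46c78 (fflags in the O3 CPU): the message describes a read-modify-write of FFLAGS going wrong when register writes are buffered until commit. The following is a toy Python model of that reasoning only, not gem5 code. The class `DeferredMiscRegs` and the helpers `run_read_modify_write` and `run_accumulate` are hypothetical names made up for this illustration; the flag bit values follow the standard RISC-V fflags layout.

```python
# Toy model only: the names here are hypothetical and do not mirror gem5 classes.
NX, UF, OF, DZ, NV = 0x01, 0x02, 0x04, 0x08, 0x10  # RISC-V fflags bits


class DeferredMiscRegs:
    """Misc-register file whose writes only become visible at commit."""

    def __init__(self):
        self.fflags = 0    # committed (architectural) value
        self.pending = []  # buffered (kind, value) writes, in program order

    def read_fflags(self):
        # In-flight writes are not visible to younger instructions.
        return self.fflags

    def write_fflags(self, value):
        self.pending.append(("overwrite", value))

    def accumulate_fflags(self, flags):
        self.pending.append(("accumulate", flags))

    def commit(self):
        for kind, value in self.pending:
            if kind == "overwrite":
                self.fflags = value
            else:
                self.fflags |= value
        self.pending.clear()


def run_read_modify_write(raised):
    """Old scheme: each instruction reads FFLAGS, ORs its flags, writes back."""
    regs = DeferredMiscRegs()
    for flags in raised:
        regs.write_fflags(regs.read_fflags() | flags)
    regs.commit()
    return regs.fflags


def run_accumulate(raised):
    """New scheme: each instruction only reports flags; the OR happens at commit."""
    regs = DeferredMiscRegs()
    for flags in raised:
        regs.accumulate_fflags(flags)
    regs.commit()
    return regs.fflags
```

In this sketch, two back-to-back instructions raising NX and DZ leave only DZ set under the read-modify-write scheme, because the second instruction reads the stale committed value, while the accumulate scheme ends with both flags set.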

5851 Commits