derek/gem5 - gem5 - Gitea: Git with a cup of tea

derek/gem5

Author	SHA1	Message	Date
Matthew Poremba	cd91c6321f	arch-vega: Reorganize instructions to multiple files The Vega instructions.cc file is 47k lines long which results in both large compilation times whenever it is modified and long style check times. This makes iterating over more complex instruction implementations very time consuming. This commit moves the instruction definitions to multiple files based on the instruction encoding (SOP2, VOP2, FLAT, DS, etc.). The resulting files are much smaller (max is 8k lines) and compilation and style check times are much more reasonable. Other than moving code around, there are no functional changes in this commit. Change-Id: Id4ac8e98ef11a58de5fd328f8a0cd7ce60a11819	2024-01-19 13:32:24 -06:00
Jason Lowe-Power	a555449c12	arch-arm: Fix compile error in kvm (#784 ) The addition of std::optional in #732 caused a compile error. This change fixes the error by checking to see if the value is present and panicing otherwise. Change-Id: I46c3fb76eb0e14ba7bede7c336293fbe9add8c84 Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2024-01-19 07:59:59 -08:00
Yu-Cheng Chang	f56459470a	arch-riscv: Refactor the RISC-V multiplication utility (#780 ) 1. Add the new double width for int64_t and uint64_t 2. Use the wider type to get the upper result of multiplication Change-Id: Id6cfa6f274c65592b2b3e2b70c00f82954b41f1a	2024-01-18 12:40:11 -08:00
QQeg	511729ab76	arch-riscv: Fix issue when vl=0 in VectorIntMaskMacroConstructor (#715 ) I’ve been working on a fix for the issue #759 where ‘vd’ incorrectly stores all zeros when ‘vl’ is set to 0 in VectorIntMaskMacroConstructor. My solution seems to work, but it behaves differently from other macros when ‘vl’ = 0. Instead of pushing a ‘nop’ to ‘microops’, it pushes a micro operation that remains ineffective due to ‘vl’ being 0.	2024-01-17 08:45:08 -08:00
Matthew Poremba	70376d43a3	arch-vega: Fix upsize cast error in newer compilers (#774 ) Newer compilers error on -Warray-length in the recent MI200 patches due to casting from a 32-bit data type to a 64-bit type. Change it to cast the 32-bit integer first then 64-bit integer latter to remove the warning. Rerun of validation tests on the three instructions passed. Change-Id: I0309e5f7b5b8cc8ce1651660ddddb120fa6e7666	2024-01-16 09:41:23 -08:00
Matthew Poremba	6a9e80c54c	gpu-compute: Support for MI200 GPU model (#733 )	2024-01-15 08:18:34 -08:00
Hoa Nguyen	85eb99388a	arch-riscv: Remove the check of bit 63 of the physical address (#756 ) Currently, the TLB enforces that the bit 63 of a physical address to be zero. This check stems from the riscv-tests that checks for the bit 63 of a physical address [1]. This is due to the fact that the ISA implicitly says that the physical address must be zero-extended on the most significant bits that are not translated [2]. More details on this issue is here [3]. The check for bit 63 of a physical address in the TLB is rather too specific, and I believe the check of invalid physical address is alread implemented in PMA. Thus, this change proposes to remove this check from RISC-V TLB. [1] `bd0a19c136/isa/rv64mi/access.S (L18)` [2] https://groups.google.com/a/groups.riscv.org/g/isa-dev/c/8kO7X0y4ubo [3] https://github.com/gem5/gem5/issues/238 Change-Id: I247e4d4c75c1ef49a16882c431095f6e83f30383 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-01-12 15:17:49 -08:00
Yu-Cheng Chang	2f24ee570e	arch-riscv: Move PMAChecker and PMP to RiscvISA namespace (#691 ) The PMAChecker and PMP are only used in the RisvISA and it should be in the RiscvISA to simply the implementation Change-Id: I4968e2de4c028cb2dceed977f2173fc8b1efd175	2024-01-10 16:58:13 -08:00
Yu-Cheng Chang	74dd0bb9bb	fastmodel: Fix the Fastmodel RemoteGDB initial (#735 ) Change-Id: Iec9ef145ccac353b8a41f501dd76bf53288dd478	2024-01-10 16:55:54 -08:00
Giacomo Travaglini	5e2e748f3a	arch-arm: Handle invalid case for encodeAArch64SysReg (#732 ) This patch is amending encodeAArch64SysReg so that it covers the case where there are no arch numbers available for the misc index passed as an argument. This could happen if the register ID is a gem5 pseudo register which is not associated with any architected op1/op2/crn/crm tuple. Rather than panicking we return a nullopt. Change-Id: I7ab70467105ef93c0c78ac4e999c7dc8e5e09925 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-01-04 10:04:40 +00:00
Matthew Poremba	31e63b01ad	arch-vega: Add vop3p DOT instructions Implemented according to the ISA spec. Validated with silion. In particular the sign extend is important for the signed variants and the unsigned variants seem to overflow lanes (hence why there is no mask() in the unsigned varints. FP16 -> FP32 continues using ARM's fplib. Tested vs. an MI210. Clamp has not been verified. Change-Id: Ifc09aecbc1ef2c92a5524a43ca529983018a6d59	2024-01-03 15:41:06 -06:00
Matthew Poremba	420cda1bef	arch-vega: Implement FP32 packed math Starting with MI200, packed math can operate on double dword inputs. In this case, 64-bits of inputs (two VGPRs per lane) contain two FP32 values. Add instructions to perform add, multiply, and FMA on packed FP32 types. Change-Id: Ib838bff91a10e02e013cc7c33ec3d91ff08647b0	2024-01-03 15:41:06 -06:00
Matthew Poremba	7b0c47d52f	arch-vega: Implement all global atomics up to gfx90a This change adds all of the missing flat/global atomics up to including the new atomics in gfx90a (MI200). Adds all decodings and instruction implementations with the exception of __half2 which does not have a corresponding data type in gem5. This refactors the execute() and completeAcc() methods by creating helper functions similar to what initiateAcc() uses. This reduces redundant code for global atomic instruction implementations. Validated all except PK_ADD_F16, ADD_F32, and ADD_F64 which will be done shortly. Verified the source/dest register sizes in the header are correct and the template parameters for the new execute()/completeAcc() methods are correct. Change-Id: I4b3351229af401a1a4cbfb97166801aac67b74e4	2024-01-03 15:41:06 -06:00
Matthew Poremba	472c697d88	arch-vega: Implement v_mfma_i32_16x16x16i8 Tested using AMD labs notes examples located on github: https://github.com/amd/amd-lab-notes/blob/release/matrix-cores/ src/mfma_i32_16x16x16i8.cpp Change-Id: Ib0e50162288528012b6d3395e1f629ebf12e8e54	2024-01-03 15:41:06 -06:00
Matthew Poremba	7e1b27969f	arch-vega: Improve FLAT disassembly Use the opSelectorToRegSym which will print the full range of VGPRs (e.g., will now print v[2:3] instead of v2 when the source / dest is 64-bits). Fixes atomic disassembly prints. Now shows "glc" if GLC bit is enabled. Fixes some VGPR fields being printed as an SGPR in places where the 9-bit register index bit is implied (e.g., VDST). This makes it easier to use a GPUExec trace to match with LLVM disassembly when debugging. Change-Id: Ia163774850f0054243907aca8fc8d0361e37fdd5	2024-01-03 10:40:34 -06:00
Matthew Poremba	bc69ab0a1f	arch-vega: Add VOP3P encodings and packed 16b insts This adds the VOP3P and VOP3P_MAI encodings from the MI200 spec. These instructions are used for packed math and miSIMD instructions. The first 19 VOP3P opcodes are implemented and validated against hardware. This includes all instructions which operate on one dword containing two packed 16-bit values of fp16, int16_t, or uint16_t. Implement one MFMA instruction for now which was also validated against hardware.	2024-01-03 10:40:34 -06:00
Matthew Poremba	4903fe2db1	arch-arm: Allow fplib to be used outside of ARM build This is useful in other ISAs to implement FP16 computation. For example, it can be used in the GPU model. The ARM specific misc register is ignored in that case. Change-Id: I339ac0ccd9be4371b0f220ad99068e5e12b3d263	2024-01-03 10:40:34 -06:00
Bobby R. Bruce	da3e3b806d	arch-riscv: squash walks with tlb hits in startWalkWrapper (#672 ) Because each vector load is fragmented into 64 byte cache-aligned chunks, and one page-table walk is issued per fragment on tlb miss, walks start to accumulate on a pending queue, which is processed in a blocking way (no pending walks can be issued while one is being processed). This adds noticeable latency on vector loads when VLEN is sufficiently large. This commit fixes the issue by allowing walks to be squashed if a TLB lookup hits just before starting the walk on `startWalkWrapper`. This idea was taken from the ARM walker.	2023-12-13 12:45:40 -08:00
Saúl Adserias	78f23ad2df	arch-riscv: squash walks with tlb hits in startWalkWrapper Change-Id: I1bdfd7b2ee02ddee5a2d4c13bafc8c472f555f61	2023-12-13 16:40:46 +01:00
Giacomo Travaglini	8d09e95420	arch-arm: Partial SVE2 Implementation (#657 ) Instructions added: BGRP, RAX1, EOR3, BCAX, XAR & TBX, PMUL, PMULLB/T, SMULLB/T and UMULLB/T Move from gerrit [1] [1]: https://gem5-review.googlesource.com/c/public/gem5/+/70277 Change-Id: Ia135ba9300eae312b24342bcbda835fef6867113	2023-12-13 10:27:19 +00:00
Bobby R. Bruce	c8cc193db8	arch,arch-riscv: Fix inst flag of RISC-V vector store macro instructions (#674 ) Correct the instruction flags of RISC-V vector store instructions, such as `vse64_v`, `vse32_v`. The `vse64_v` in `decoder.isa` is `Mem_vc.as<uint64_t>()[i] = Vs3_ud[i];` and it will generate the code `Mem.as<uint64_t>()[i] = Vs3[i];`. The current regex of assignRE only mark the operand `Mem` as `dest` only if meet the formats like `Mem = Rd` or `Mem[i] = Rd` because the code ` = Rd` or `[i] = Rd` match the `assignRE` respectively. For the expression `Mem.as<uint64_t>()[i]`, the operand `Mem` will falsely mark the operand as `src` because the code `.as<uint64_t>()[i]` is not match the `assignRE`. The PR will ensure the operand `Mem` is dest for the format like `Mem.as<xxx>()[i] = yyy`.	2023-12-12 13:07:50 -08:00
Bobby R. Bruce	37e4173351	arch-x86: Fix two_byte_opcodes.isa `0x6` -> `0x0` (#666 ) This bug was introduced by https://github.com/gem5/gem5/pull/593 and caused Issue https://github.com/gem5/gem5/issues/664. Change-Id: Ia55de364ee8260e1fe315e37e1cffbc71ab229fb	2023-12-12 08:21:27 -08:00
Roger Chang	bedc3c597c	arch: Fix inst flag of RISC-V vector store macro instructions Correct the instruction flags of RISC-V vector store instructions, such as `vse64_v`, `vse32_v`. The `vse64_v` in `decoder.isa` is `Mem_vc.as<uint64_t>()[i] = Vs3_ud[i];` and it will generate the code `Mem.as<uint64_t>()[i] = Vs3[i];`. The current regex of assignRE only mark the operand `Mem` as `dest` only if meet the formats like `Mem = Rd` or `Mem[i] = Rd` because the code ` = Rd` or `[i] = Rd` match the `assignRE` respectively. For the expression `Mem.as<uint64_t>()[i]`, the operand `Mem` will falsely mark the operand as `src` because the code `.as<uint64_t>()[i]` is not match the `assignRE`. The PR will ensure the operand `Mem` is dest for the format like `Mem.as<xxx>()[i] = yyy`. Change-Id: I9c57986a64f1efb81eb9c7ade90712b118e0788d	2023-12-12 17:04:31 +08:00
Roger Chang	10d344a942	arch-riscv: Fix the vector store indexed instructions declaration Change-Id: I6f8701ef0819c22eda8cb20d09c40101f2d001a0	2023-12-12 16:36:49 +08:00
Giacomo Travaglini	81d3c6307d	arch-arm: add Sve mla and mls indexed (#596 ) This contains the implementation of mla and MLS index version instructions from ARM SVE2 ISA specification.	2023-12-07 21:47:35 +00:00
Nitesh Narayana	d962d2588d	arch-arm: This commit cleans .isa files This commit cleans extra new lines from .isa files from this branch Change-Id: I4087ed230aa041747038b49360c2aba3f82c0790	2023-12-06 16:03:21 +01:00
Matthias Boettcher	e4dccbea8a	arch-arm: Partial SVE2 Implementation Instructions added: BGRP, RAX1, EOR3, BCAX, XAR & TBX, PMUL, PMULLB/T, SMULLB/T and UMULLB/T Change-Id: Ia135ba9300eae312b24342bcbda835fef6867113	2023-12-06 14:26:31 +00:00
Nitesh Narayana	db8e1652e8	arch-arm: This commit uses existing template code for mla/s index This includes mla/s index version implementation using the existing template code to avoid code repeatition. Change-Id: If1de84e01dec638e206c979ca832308ebc904212	2023-12-05 23:40:06 +01:00
Hoa Nguyen	cf087d4d11	arch-riscv: Add PCEvent for RISCV FS Workload kernel panic/oops Inspired by a similar feature in ARM's full system workload, this change adds an option to halt gem5 simulation if the guest system encounter kernel panic or kernel oops. On RiscvISA::BootloaderKernelWorkload, by default, the simulation will exit upon kernel panic, while kernel oops will not induce simulation halt. This is because the system will essentially do nop after a kernel panic, while the system might be still functional after a kernel oops. Dumping kernel's dmesg is useful for diagonizing the cause of kernel panic, so ideally, we want to dump the guest's dmesg to the host. However, due to a bug described in [1], kernel v5.18+ dmesg might not be dumped properly. Hence, the dmesg will not be dumped to the host. On RiscvISA::FsLinux, this feature is turned off by default as the symbols from the official RISC-V kernel resource are stripped from the binary. However, if this feature is enable, the dmesg will be dumped to the host system. [1] https://github.com/gem5/gem5/issues/550 Change-Id: I8f52257727a3a789ebf99fdd4dffe5b3d89f1ebf Signed-off-by: Hoa Nguyen <hn@hnpl.org> Co-authored-by: Jason Lowe-Power <jason@lowepower.com>	2023-12-04 14:59:26 -08:00
Harshil Patel	5eba3941f4	arch-riscv: fix o3 cpu stuck in spinlock bug (#641 )	2023-12-03 13:22:46 -08:00
Hoa Nguyen	7a5052b3a0	arch-arm: Only build ArmCapstoneDisassembler when ISA is arm (#553 ) Currently, if the Capstone header file is found in the host system, scons will try to build the ArmCapstoneDisassembler regardless of the gem5 target ISA. This is causing problem when the host has Capstone, but the gem5 target ISA is not arm. Compiling gem5 in this case will cause errors, e.g., ArmISA and ArmSystem is not found. This change aims to prevent building the ArmCapstoneDisassembler when the gem5 target ISA is not arm. Ref: [1] The Arm Capstone PR https://github.com/gem5/gem5/pull/494 Change-Id: I1e714d34aec8fe2a2af8cd351536951053a4d8a5	2023-12-03 13:22:11 -08:00
Bobby R. Bruce	21919addca	Fix for gem5 Issue #550 (#636 ) This Pull-Request addresses gem5 Issue #550. The code that dumps the Dmesg buffer is now templated on the two variants of the `Metadata` structure, and the correct one is chosen based on the detected Kernel version. To support this functionality, the pull request also adds Symbol Size data to the loader Symbol Table, and adds a method to query the Kernel Version from the image in guest memory. The new attributes in the Symbol class are de-serialized speculatively, so no checkpoint upgrader is required to support this change.	2023-12-01 18:06:20 -08:00
Richard Cooper	d9c870f641	sim: Rework the Linux Kernel exit events (#639 ) This patch reworks the Linux Kernel panic and oops events. The code has been re-factored to provide re-usable events that can be applied to all ISAs from the base `KernelWorkload` `SimObject`. At the moment they are installed for the Arm workloads. This update also provides more configuration options that can be specified using the new `KernelPanicOopsBehaviour` enum. The options are applied to the Kernel Workload parameters `on_panic` and `on_oops` which are available to all subclasses of `KernelWorkload`. The main rationale for this reworking is to add the option to cleanly exit the simulation after dumping the Dmesg buffer. Without this option, the simulation would continue running after a Kernel panic. If system components (e.g. a system timer) keep the event queue alive, this causes the simulation to run slowly to the maximum allowed tick.	2023-12-01 17:33:59 -08:00
Richard Cooper	2fbbdad618	base: Add encapsulation to the loader::Symbol class This commit converts `gem5::loader::Symbol` to a full class with private members, enforcing encapsulation. Until now client code has been able to (and does) access members directly. This change will enable class invariants to be enforced via accessor methods. Change-Id: Ia0b5b080d4f656637a211808e13dce1ddca74541 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2023-12-01 22:00:26 +00:00
Hoa Nguyen	bbe5216d88	arch-riscv: Rename BootloaderKernelWorkload parameters The gem5 standard library hardcoded some parameters of the workload. E.g., the kernel filename must be `object_file`. Change-Id: I5eeb7359be399138693eaba0738eaf524c59408f Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2023-12-01 07:28:30 +00:00
Yu-Cheng Chang	a16fd8a592	scons: Limit adding fastmodel files and libpath (#629 ) The change will only add include and library path if the fastmodel is required to build. The change will benefit for most of gem5 build. Change-Id: I98c20bd1470b7227940036199e02bc001e307eac	2023-11-30 07:36:26 -08:00
Andreas Sandberg	dcdebec0f6	misc,python: Add `isort` hook to pre-commit (#431 )	2023-11-30 09:54:12 +00:00
Bobby R. Bruce	d11c40dcac	misc: Run `pre-commit run --all-files` This ensures `isort` is applied to all files in the repo. Change-Id: Ib7ced1c924ef1639542bf0d1a01c5737f6ba43e9	2023-11-29 22:06:41 -08:00
Bobby R. Bruce	fcbcd1ce72	arch-x86: Fixes page fault for CLFLUSH on write-protected pages (#592 ) Converts CLFLUSHOPT/WB/FLUSH operations from Write to Read operations during address translation so that they don't trigger a page fault when done on write-protected pages. Solves #226	2023-11-29 14:25:21 -08:00
Yu-Cheng Chang	57ba3fccb7	scons: Move CPPPATH systemc_home to "src/systemc" folder (#617 ) Files under src/systemc require the include path of systemc_home Change-Id: Ibcbac2762259a0b997ac444b2c63a218c27af9ee	2023-11-29 13:56:23 -08:00
Bobby R. Bruce	a2e7bd4698	arch-riscv: Support combination of privilege modes configuration (#522 ) The user can select privilege modes witch is included in the system, not always enable the user and supervisor privilege modes.	2023-11-29 10:12:57 -08:00
Adrià Armejach	b0cefac9b2	arch-riscv: Fix narrow datatypes in RVV isa files (#606 ) Some variables hava narrow datatypes that overflow on large VLEN values. For example, the maximum number of microops for LMUL=8 SEW=8 and VLEN=64K is 2^16. Change-Id: I5cce759f040884e09ce83bee7e54a62c4b42c5aa Co-authored-by: Adrià Armejach <adria.armejach@bsc.es>	2023-11-29 10:11:06 -08:00
Harshil Patel	089b82b2e9	arch-riscv: fix tlb bug (#610 ) - one tlb miss was getting counted twice by the lookup function. Change-Id: I5fee08bd6e936896704e7dbbd242720b8d23b547	2023-11-29 08:39:02 -08:00
Jason Lowe-Power	3fe5e58f28	arch-x86: Fix misc registers in mov instructions (#593 ) MOV instructions 8C and 8E can be prefixed with a REX prefix to extend the source/destination register. However, the R bit in REX will be applied to the segment register. The decoder file checks for valid segment registers, checking the MODRM_REG only, however, later this will be extended with the REX_R when adding the register to the sources/destinations of the instruction. This will trigger an assert. Additionally, MOV instructions of various miscelaneous registers are also not check for being valid when taking into account the REX_R bit. This patch checks that the REX_R is not set, otherwise, UD2 will be generated.	2023-11-28 11:14:53 -08:00
Roger Chang	9a0c671cce	arch-riscv: Handle the exception following the privilege mode set Change-Id: I4867941ec286fe485e01db848b8c7357488f6cf4	2023-11-28 09:26:27 +08:00
Roger Chang	d56801c240	arch-riscv: Add misa rvs check for memory translation The memory translation require supervisor mode implement. If the supervisor mode is not implemented, the satp CSR is not exists and should not do address translation Change-Id: Ie6c8a1a130d0aab0647b35e0f731f6b930834176	2023-11-28 09:26:27 +08:00
Roger Chang	6fd4feb797	arch-riscv: fatal_if the process run without SU modes Change-Id: Ifce7eec6cea10881964c29d206a92f3d10271de6	2023-11-28 09:26:27 +08:00
Roger Chang	9e738a65ea	arch-riscv: Add isaExts field for CSR registers Change-Id: Idd94af57f3a721d455ea7fb9d335fab7b16a0f7e	2023-11-28 09:26:27 +08:00
Roger Chang	0e4f82a119	arch-riscv: define the CSR masks for each privilege modes Change-Id: I9936d9bc816921a827b94550847d4898b3aa3292	2023-11-28 09:26:27 +08:00
Roger Chang	f745e8cf89	arch-riscv: Initial the privilege modes configuration 1. Declare the new enum type PrivilegeModes 2. Disallow setting the MISA register RVU and RVS. Change-Id: I932d714bc70c9720a706353c557a5be76c950f81	2023-11-28 09:26:27 +08:00

1 2 3 4 5 ...

5790 Commits