derek/gem5

Author	SHA1	Message	Date
mupton	4b22bfaf3e	arch-arm: fix double delete Change-Id: I05cec0ef8b97fa39aa0d4bf97d7ebd79059e3d7b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32094 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-08-02 03:51:07 +00:00
Ian Jiang	ae75e7fc45	arch-riscv: Fix disassembling of float register instructions In disassembling of float register instructions, Gem5 always gives 2 source registers rs1 and rs2. However, this is not correct for Mul-Add instructions which have three rs1, rs2, and rs3, and for Move, Convert instructions which have only rs1. For example: (Gem5 output vs Expected) - fmadd.d fa0,fa0,fa4 vs fmadd.d fa0,fa0,fa4,fa5 - fcvt.d.l fa4,a6,zero vs fcvt.d.l fa4,a6 This patch fixes the problem. Change-Id: I02d840eab602ac4a9782911b3cdff2935dfe5e68 Signed-off-by: Ian Jiang <ianjiang.ict@gmail.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32054 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-08-01 02:14:37 +00:00
Jordi Vaquero	bd25fc971d	arch-arm: Implementing SecureEL2 feature for Armv8 This patch adds Secure EL2 feature. This allows stage1 EL2/EL&0 and stage2 secure translation. The changes are organized as follows: + insts/static_inst.cc: Modify checks for illegalInstruction on eret + isa.cc/hh: Enabling control bits + isa/insts/misc.hh/64.hh: Smc fault trigger. + miscregs.cc/hh: Declaration and initialization of new registers + self_debug.cc/hh: Add secureEL2 types for breakpoints + stage2_lookup.cc/hh: Allow stage2 in secure state. + tlb.cc/table_walker.cc: Allow secure state for stage2 and stage 1 EL2&0 translation regime + utility.cc/hh: New function InSecure and refactor of other helpers to enable secure state JIRA: https://gem5.atlassian.net/browse/GEM5-686 Change-Id: Ie59438b1828508e944334420da1d8f4745649056 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31394 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-31 13:13:57 +00:00
Matt Sinclair	4d84590dee	arch-gcn3: add support for flat atomic adds, subs, incs, decs Add support for all missing flat atomic adds, subtracts, increments, and decrements, including their x2 variants. Change-Id: I37a67fcacca91a09a82be6597facaa366105d2dc Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31974 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-30 23:57:02 +00:00
Chris January	433546a88f	fastmodel: Implement GIC DTB auto-generation. Implement generateDeviceTree for FastModelGIC so the interrupt controller is automatically added to the DTB. This is sufficient to allow a VExpressFastmodel system model to boot Linux without an explicit DTB. Change-Id: I69d86fd8bba1b86768c8a118d2de079a56179854 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31078 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-29 08:10:37 +00:00
Chris January	b382bb758f	fastmodel: Remove scx_prefix_appli_output binding. The scx_prefix_appli_output function is removed in recent Fast Models releases. Change-Id: I324b911ec7ed68b7d0c324ac20a9795515e4de57 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31077 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-29 08:10:37 +00:00
Chris January	a29cabb545	fastmodel: Fix hierarchical Iris component names. Recent releases of Fast Models structure Iris resources in a hierarchy. Use the parent resource ID if set to construct the hierarchical name of components when constructing the resource map. Change-Id: Iafafa26d5aff560c3b2e93894f81f770c0e98079 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31076 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-29 08:10:37 +00:00
Chris January	e177e6c372	fastmodel: Add missing dependencies. Add -latomic library required by recent Fast Models releases. Add SystemCExport directory for tlm_has_get_protocol_types.h include. Change-Id: Ia0c275d55f5077499588228737ed1ff5975cd5db Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31075 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com>	2020-07-29 08:10:37 +00:00
Jordi Vaquero	980888eb81	arch-arm: Implement ARM8.1-VHE feature This commit implements the VHE feature in ARMv8. It consists of 4 parts: 1. Register decl/init and register redirection from el1 to el2 miscregs.cc/hh miscregs_types.hh isa.cc utility.cc/hh 2. Definition of new EL2&0 translation regime. tlb.cc/hh table_walker.cc pagetable.hh tlbi_op.hh isa.cc (for tlb invalidation functions) 3. Self Debug adaptation for VHE self_debug.cc 4. Effects on AMO/IMO/FMO interruptions faults.cc interrupts.hh JIRA: https://gem5.atlassian.net/browse/GEM5-682 Change-Id: I478389322c295b1ec560571071626373a8c2af61 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31177 Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-27 17:23:55 +00:00
Jordi Vaquero	a5b3a36bf3	arch-arm: Fix Trap to EL1 on register DC CVAU Change-Id: I8add9fc8595bb1ac0a7de9778bd4544a01b94ee4 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31774 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-24 12:04:12 +00:00
Gabe Black	cbefc453c4	arch,sim,misc: Add a new m5 op "sum" which just sums its inputs. This very simple and mostly useless operation has no side effects, and can be used to verify that arguments are making it into gem5, being operated on, and then that a result can be returned into the simulation. Change-Id: I29bce824078526ff77513c80365f8fad88fef128 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/27557 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-24 03:59:49 +00:00
Jordi Vaquero	11e0dccbd5	arch-arm: Add System register trap check for EL1 This change adds and refactors the register trap checks for EL1 in the same function, unifying the registry trapping Change-Id: Ief3e0a9f70cc8cd44c1c8215515f36168927362d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31694 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-23 13:11:13 +00:00
Bobby R. Bruce	b99c316840	base,arch: Fixed usage of `bitfield::replaceBits` `bitfield::replaceBits` has two parameters, `first` and `last`, which relate to the position of the MSB and the LSB of the bits to be replaced respectively. Therefore `first` >= `last`. In some areas of the codebase, this assumption has been flipped with `first` <= `last`. This caused at least one known error, recorded here: https://gem5.atlassian.net/browse/GEM5-695. These inconsistencies have therefore been rectified. A note has been added to the `bitfield::replaceBits` Doxygen to make the usage of this function clearer. Change-Id: Ie75856161d9a5684066430ecbdcc52e04e1e77bf Issue-on: https://gem5.atlassian.net/browse/GEM5-696 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31674 Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu> Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-22 05:17:33 +00:00
Boris Shingarov	18fff9739c	arch-mips: Implement GDB XML target description for MIPS Change-Id: Icff3b2c3e60d5989978de854247232afbb3b0dae Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31574 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2020-07-21 15:57:00 +00:00
Jordi Vaquero	435d53629f	arch-arm: Fix Fault subsystem adding EL2Enable func Change-Id: I7a4f0c22ac31fd56a8976ee8a1d9760cf6055d63 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31374 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-21 14:37:54 +00:00
Matthew Poremba	9b95f32b12	arch-gcn3,gpu-compute: Fix GCN3 related compiler errors Fix all errors that were revealed using the util/compiler-test.sh script. Change-Id: Ie0d35568624e5e1405143593f0677bbd0b066b61 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31154 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-20 14:53:13 +00:00
Tony Gutierrez	4d737462c2	gpu-compute, arch-gcn3: Change how waitcnts are implemented Use single counters per memory operation type and increment them upon issue, not execute. Change-Id: I6afc0b66b21882538ef90a14a57a3ab3cc7bd6f3 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29973 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-17 16:36:23 +00:00
Chow, Marcus	6655161037	arch-gcn3: Add case to op selector when operand is vcc_hi Change-Id: Ib8846656e18aad04ccb8c9112bc629c69078fe36 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29971 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-17 16:35:44 +00:00
Michael LeBeane	f509fa735c	arch-gcn3: Fix stride bug in buffer OOB detection logic The out-of-range logic for buffer accesses is missing the top 4 bits of const_stride when dealing with scratch buffers. This can cause perfectly valid scratch accesses to be suppressed when const_stride is large. Change-Id: I8f94d44c242fda26cf6dfb75db04fa3aca934b3e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29968 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-17 16:34:07 +00:00
Travis Boraten	4c1dc827bc	arch-gcn3: Replace some instances of std::isnormal with std::fpclassify Affected instructions: V_DIV_SCALE_F64, V_CMP_CLASS_F64, V_CMPX_CLASS_F64 and their VOPC, VOP3, F32 variants. These instances of std::isnormal were being used to check for subnormal (denorms) values. std::isnormal is not specific enough. It returns true for normal values but false for NaN, Inf, 0.0, and subnormals. std::fpclassify returns macros for each category of floating point numbers. Now we only catch subnormals. Change-Id: I8d8f4452ff58de71e7c8e0b2b5e73467b532e196 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29967 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-17 16:33:26 +00:00
Travis Boraten	e1d10c3894	arch-gcn3: Fix VOP3 V_LDEXP_F64 Replaced !std::isnormal with std::fpclassify because std::isnormal is not specific enough. !std::isnormal was incorrectly catching NaN, Inf, 0.0, and subnormals (aka denormals), whereas it was only supposed to catch subnormals. The return value and error handling spec of std::ldexp listed on cppreference.com appears to match up in nearly all cases after making these changes. If std::ldexp handled subnormals as described in the GCN3 2016 guide, we could have used vdst[lane] = std::ldexp and not need to check for any corner cases. Change-Id: I4c77af77c3b7798f86d40442610cef1296a28441 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29966 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-17 16:32:56 +00:00
Travis Boraten	e4f7982e90	arch-gcn3: Fix roundNearestEven for V_RNDNE_F64 and V_RNDNE_F32 roundNearestEven is an inst_util function that RNDNE_F64 and F32 call, including both VOP1 and VOP3 formats. IEEE 754 spec says this function should round inputs to the nearest integer but round ties to the nearest even integer. Prior to this patch it was rounding all inputs to nearest even, not just the ties. It was probably implemented this way originally because the language in the ISA manual is ambiguous although it provided the correct logic. Fixed roundNearestEven to use the semantics originally described in the GCN3 ISA manual. Change-Id: I83ecb1d516fcf5bdf17e54ddf409b447a129a9a7 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29964 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-17 16:32:56 +00:00
Matt Sinclair	a23ef78c91	arch-gcn3: add all s_buffer_load_dword instructions Adds the other s_buffer_load_dword* instruction implementations to f134a84. Change-Id: I8d97527278900dc68c32463ea1824409ccd04e1d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29962 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-17 16:31:39 +00:00
Matthew Poremba	39f305b329	arch-gcn3: Add memcpy condition when writing EXEC_LO Some compilers emit an error on the operand template class when writing exec mask. Add a condition to explicitly set memcpy size argument to 32b or 64b based on the number of dwords. Change-Id: I49b0e4a1680283e772d0a5a8efd687b31d4f1624 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29961 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-17 16:31:10 +00:00
Tony Gutierrez	550f0203aa	arch-gcn3: Remove invalid assert when reading EXEC_LO This assert assumed all reads to EXEC_LO would be 64b, that is, we would always read the entire EXEC mask. This is invalid as some kernels read only the low 32b of EXEC. The write to EXEC_LO is also updated to handle 32b writes. Change-Id: Ifeb167578515bf112b1eab70bbf2201a5e936358 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29960 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-17 16:30:41 +00:00
Tony Gutierrez	72e9324ef0	arch-gcn3: Implement ds_swizzle Change-Id: I7d188388afa16932217ae207368666a724207c52 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29958 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-17 16:13:43 +00:00
Tony Gutierrez	513e75d99a	arch-gcn3: Implement s_buffer_load_dwordx16 Change-Id: I25382dcae9bb55eaf035385fa925157f25d39c20 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29957 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-17 16:13:17 +00:00
Tony Gutierrez	0c3b84fd33	arch-gcn3: Fixup DIV instructions Adds support to handle the special cases for GCN3 DIV instructions. Change-Id: I18f91870e802407c93831f313ce76be053bc4230 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29956 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-17 16:12:58 +00:00
Chow, Marcus	b267350ee5	arch-gcn3: fixed scale,fixup,fmas f64 ops Change-Id: Ie13794554db8a958fda1f7103ec18058fda2e66d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29955 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-16 20:37:22 +00:00
Tony Gutierrez	5dc5d23b79	arch-gcn3: Fix s_getpc operand information s_getpc was reporting only a single operand, and was only considering the SSRC operand. However, this instruction's source is implicitly the PC. Because its destination register was never tracked for dependence checking purposes, dependence violations are possible. Change-Id: Ia80b8b3e24d5885f646a9ee41212a2cb35b9ffe6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29954 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-16 20:37:22 +00:00
Chow, Marcus	a0cfd8da6b	arch-gcn3: Add handling for Inf/overflow in CVT insts Change-Id: I0fddffdeaebd9f45fe89f44d536f80a43de63ff5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29953 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-16 20:37:22 +00:00
Tony Gutierrez	5c3b02de09	arch-gcn3: Add ds_bpermute and ds_permute insts The implementation of these insts provided by this change is based on the description provided here: https://gpuopen.com/amd-gcn-assembly-cross-lane-operations/ Change-Id: Id63b6c34c9fdc6e0dbd445d859e7b209023f2874 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29952 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-16 20:37:22 +00:00
Alexandru Dutu	3aa633cc3f	arch-gcn3: ds_read_u8 and ds_read_u16 fix This changeset zero extends the destination register for ds_read_u8 and ds_read_u16 instructions. Change-Id: I193adadd68adf2572b59743b1504f18ad225f506 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29951 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-16 20:37:22 +00:00
Xianwei Zhang	fff185993a	arch-gcn3: implement instruction s_setreg_b32 Instruction s_setreg_b32 was unimplemented, but is used by hipified rodinia 'srad'. The instruction sets values of hardware internal registers. If the instruction is writing into MODE to control single-precision FP round and denorm modes, a simple warn will be printed; for all other cases (non-MODE hw register or other precisions), panic will happen. Change-Id: Idb1cd5f60548a146bc980f1a27faff30259e74ce Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29949 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Xianwei Zhang <xianwei.zhang@amd.com>	2020-07-16 20:37:22 +00:00
Matt Sinclair	1836d58b36	arch-gcn3: add support for v_mbcnt_hi and v_mbcnt_lo Change-Id: I1c70fe693c904f1abd7d5a2b99220c74a075eae5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29948 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-16 20:37:22 +00:00
Matt Sinclair	c7b6e7c613	arch-gcn3: fix bug with DPP support Instructions that use the DPP field need to use the extra SRC0 register associated with the DPP instruction instead of the "default" SRC0 register, since the default SRC0 register contains the DPP information when DPP is being used. This commit fixes 2735c3bb88 to take this into account. Additionally, this commit removes write of the src register from the DPP helper functions, to avoid overwriting any changes made to the destination register. Finally, this change modifies the instructions that use DPP to simplify the flow through the execute() functions. Change-Id: I80fd0af1f131f287f18ff73b3c1c9122d8c60823 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29947 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-16 20:37:22 +00:00
Matt Sinclair	ed3135ea6a	arch-gcn3: implement multi-dword buffer loads and stores Add support for all multi-dword buffer loads and stores: buffer_load_dword x2, x3, and x4 and buffer_store_dword x2, x3, and x4 Change-Id: I4017b6b4f625fc92002ce8ade695ae29700fa55e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29946 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-16 20:37:22 +00:00
Tony Gutierrez	ea52df816d	arch-gcn3: Add support for rd/wr EXEC_HI to operand class Change-Id: Ib22dd604f88ea56801964235082835002deffca1 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29944 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-16 20:37:22 +00:00
Tony Gutierrez	af621cd6e6	gpu-compute, arch-gcn3: refactor barriers Barriers were not modeled properly. Firstly, barriers were allocated to each WG that was launched, which is not correct, and the CU would provide an infinite number of barrier slots. There are a limited number of barrier slots per CU in reality. In addition, the CU will not allocate barrier slots to WGs with a single WF (nothing to sync if only one WF). Beyond modeling problems, there is also the issue of deadlock. The barrier could deadlock because not all WFs are freed from the barrier once it has been satisfied. Instead, we relied on the scoreboard stage to release them lazily, one-by-one. Under this implementation the scoreboard may not fully release all WFs participating in a barrier; this happens because the first WF to be freed from the barrier could reach an s_barrier instruction again, forever causing the barrier counts across WFs to be out-of-sync. This change refactors the barrier logic to: 1) Create a proper barrier slot implementation 2) Enforce (via a parameter) the number of barrier slots on the CU. 3) Simplify the logic and cleanup the code (i.e., we no longer iterate through the entire WF list each time we check if a barrier is satisfied). 4) Fix deadlock issues. Change-Id: If53955b54931886baaae322640a7b9da7a1595e0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29943 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-16 20:37:22 +00:00
Xianwei Zhang	c2641eec89	arch-gcn3: add support of 64-bit SOPK instruction s_setreg_imm32_b32 is a 64-bit instruction, using a 32-bit literal constant. Related functions are added to support decoding the second dword. Change-Id: I290f8578f726885c137dbfac3773035f814e0a3a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29942 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Xianwei Zhang <xianwei.zhang@amd.com>	2020-07-16 20:37:22 +00:00
Matt Sinclair	3e84a8d710	arch-gcn3: ensure that atomics follow HSA conventions Add asserts to make sure atomics are following the HSA conventions that atomics should be word aligned (i.e., can't be byte aligned) and should not be misaligned such that a given lane's access spans multiple cache lines. Change-Id: Ia48758b9ed96764864234dc607f337e30e287d1c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29941 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-16 20:37:22 +00:00
Giacomo Travaglini	3ce7333a36	arch-arm: AddressSize check on translateMmuOff for AArch64 only Motivation: An AddressSizeFault on AArch32 can only happen during a table walk since the register used as a base by LD/ST is always 32 bit wide. On AArch64 on the other hand, addresses can be 64bit wide; when MMU is off (no virtual memory) an invalid physical address can be specified Change-Id: Id3ef170e99202c6b0b511fa7205c754956861720 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31274 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-15 13:14:56 +00:00
Alexandru Dutu	07fcbf16fc	arch-gcn3: Implementation of flat atomic swap instruction Change-Id: I9b9042899e65e8c9848b31c509eb2e3b13293e52 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29937 Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-13 23:32:27 +00:00
Michael LeBeane	6747b127af	arch-gcn3: Fix VOP2 disassembly prints VOP2 prints VSRC1 register index as hex instead of decimal if the instruction contains a literal operand. This patch resets the format specifiers in the stream to print the register correctly. Change-Id: Icc7e6588b3c5af545be6590ce412460e72df253f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29936 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>	2020-07-13 19:48:12 +00:00
Michael LeBeane	ed7daa10aa	arch-gcn3, gpu-compute: Implement out-of-range accesses Certain buffer out-of-range memory accesses should be special cased and not generate memory accesses. This patch implements those special cases and suppresses lanes from accessing memory when the calculated address falls in an ISA-specified out-of-range condition. Change-Id: I8298f861c6b59587789853a01e503ba7d98cb13d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29935 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>	2020-07-13 19:48:00 +00:00
Michael LeBeane	f8e295922b	arch-gcn3: Fix writelane src0,src1 usage Src1 should only be used for lane select. The data should come from src0. Change-Id: Ibe960df2e56d351a3819b40194104d2972a5cd4c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29933 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2020-07-13 19:47:47 +00:00
Matt Sinclair	3846e90737	arch-gcn3: fix bits that SDWA selects This commit fixes a bug in 200f2408 where the SDWA support was selecting bits backwards. As part of this commit, to help resolve this problem in the future, I have added asserts in the helper functions in bitfield.hh to ensure that the number of bits aren't negative. Change-Id: I4b0ecb0e7c110600c0b5063101b75f9adcc512ac Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29931 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>	2020-07-13 16:19:47 +00:00
Giacomo Travaglini	ecd1e05f57	arch-arm: Fix coding style in self_debug.[cc, hh] Change-Id: I67be98af412b745ea9e16d4e8c6d422c9fbb29fc Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31082 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-13 13:56:41 +00:00
Giacomo Travaglini	10519e225c	arch-arm: Remove getters/setters from SelfDebug class Change-Id: I63e5ed25e453cb8fcb2c39ba0728cc81c499c166 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31081 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-13 13:56:41 +00:00
Giacomo Travaglini	8ac717a3a8	arch-arm: Fix pmc == on SelfDebug The Assignment operator was used instead of the Equal-To Change-Id: Ibf5a0006bce79b67d662fd1f8942699582956d58 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31080 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-07-13 13:56:41 +00:00
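
Commit b99c316840 in the list above describes the argument convention of `bitfield::replaceBits`: `first` is the MSB position and `last` is the LSB position, so `first >= last`. The following is a minimal sketch of that convention, not gem5's actual implementation. The name `replaceBitsSketch`, its signature, and the assert are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for illustration only; not the gem5 helper itself.
// `first` is the position of the most significant bit to replace and
// `last` the position of the least significant bit, hence first >= last.
inline uint64_t
replaceBitsSketch(uint64_t val, unsigned first, unsigned last,
                  uint64_t bitVal)
{
    // Reject the flipped (first < last) usage described in the commit.
    assert(first >= last && first < 64);

    const unsigned nbits = first - last + 1;
    // Build nbits ones; treat the full-width case separately so we never
    // shift a 64-bit value by 64, which would be undefined behaviour.
    const uint64_t ones =
        (nbits == 64) ? ~uint64_t{0} : ((uint64_t{1} << nbits) - 1);
    const uint64_t mask = ones << last;

    // Clear the target field, then insert the (truncated) new value.
    return (val & ~mask) | ((bitVal << last) & mask);
}
```

With this convention, replacing bits 7 down to 4 of `0xFF` with zero is written `replaceBitsSketch(0xFF, 7, 4, 0)`; swapping the two positions would trip the assert rather than silently producing a wrong mask.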
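
Commits 4c1dc827bc and e1d10c3894 above replace `!std::isnormal` with `std::fpclassify` because the former is also true for NaN, Inf, and 0.0. The difference can be sketched with two small predicates. The helper names `notNormal` and `isSubnormal` are hypothetical and do not come from the gem5 sources.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helpers for illustration; names are not from gem5.

// True for zero, subnormal, infinite and NaN inputs: too broad when the
// intent is to detect subnormals only.
inline bool
notNormal(double x)
{
    return !std::isnormal(x);
}

// True only for subnormal (denormal) inputs.
inline bool
isSubnormal(double x)
{
    return std::fpclassify(x) == FP_SUBNORMAL;
}
```

For `0.0`, an infinity, or a NaN, `notNormal` returns true while `isSubnormal` returns false; both return true for `std::numeric_limits<double>::denorm_min()`.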
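
Commit e4f7982e90 above states the intended semantics of `roundNearestEven`: round to the nearest integer, and send only exact ties to the nearest even integer. One way to write that rule is sketched below. This is an assumption-laden illustration, not the code from gem5's inst_util; the name `roundTiesToEvenSketch` is hypothetical, and the sketch does not preserve the sign of zero for small negative inputs.

```cpp
#include <cmath>

// Hypothetical sketch of round-half-to-even; not gem5's roundNearestEven.
template <typename T>
T
roundTiesToEvenSketch(T x)
{
    const T lower = std::floor(x);   // largest integer <= x
    const T diff = x - lower;        // fractional distance in [0, 1)

    if (diff < T(0.5))
        return lower;                // closer to the lower integer
    if (diff > T(0.5))
        return lower + T(1);         // closer to the upper integer

    // Exact tie: choose whichever neighbour is even.
    return (std::fmod(lower, T(2)) == T(0)) ? lower : lower + T(1);
}
```

Under this rule 2.5 rounds to 2 and 3.5 rounds to 4, while a non-tie such as 2.7 still rounds to 3 rather than being forced to an even value.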
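
Commit 6747b127af above fixes a register index printed in hex after a literal operand. The underlying C++ behaviour is that `std::hex` is sticky on a stream until another base manipulator is applied. The sketch below shows the general pattern; the function name `disasmSketch` and the output layout are assumptions, not the actual VOP2 disassembly code.

```cpp
#include <sstream>
#include <string>

// Hypothetical formatter for illustration; not gem5's disassembly code.
inline std::string
disasmSketch(unsigned literal, unsigned vsrc1)
{
    std::ostringstream ss;
    // Print the literal in hex, then switch back to decimal before the
    // register index; without std::dec the index would also be hex.
    ss << "0x" << std::hex << literal << std::dec << ", v" << vsrc1;
    return ss.str();
}
```

Calling `disasmSketch(255, 12)` yields `0xff, v12`; dropping the `std::dec` would print the register as `vc` instead.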

4156 Commits