Commit Graph

15586 Commits

Author SHA1 Message Date
Chow, Marcus
a0cfd8da6b arch-gcn3: Add handling for Inf/overflow in CVT insts
Change-Id: I0fddffdeaebd9f45fe89f44d536f80a43de63ff5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29953
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-16 20:37:22 +00:00
Tony Gutierrez
5c3b02de09 arch-gcn3: Add ds_bpermute and ds_permute insts
The implementation of these insts provided by this
change is based on the description provided here:

https://gpuopen.com/amd-gcn-assembly-cross-lane-operations/

Change-Id: Id63b6c34c9fdc6e0dbd445d859e7b209023f2874
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29952
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-16 20:37:22 +00:00
Alexandru Dutu
3aa633cc3f arch-gcn3: ds_read_u8 and ds_read_u16 fix
This changeset zero extends the destination register
for ds_read_u8 and ds_read_u16 instructions.

Change-Id: I193adadd68adf2572b59743b1504f18ad225f506
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29951
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-16 20:37:22 +00:00
Matt Sinclair
13079629a1 arch-gcn3: convert vALU instruction counters from 32 to 64-bit
The vALU instruction counters were previously 32 bits, but for some
workloads this value wraps around and triggers an assert failure
because the max vALU operations are reached.  To resolve this, this
commit increases the counter size to 64 bits.

Change-Id: I90ed4514669485cfea7ccc37ba9d69665277bccb
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29950
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-16 20:37:22 +00:00
Xianwei Zhang
fff185993a arch-gcn3: implement instruction s_setreg_b32
Instruction s_setreg_b32 was unimplemented, but is used by hipified
rodinia 'srad'. The instruction sets values of hardware internal
registers. If the instruction is writing into MODE to control
single-precision FP round and denorm modes, a simple warn will be
printed; for all other cases (non-MODE hw register or other
precisions), panic will happen.

Change-Id: Idb1cd5f60548a146bc980f1a27faff30259e74ce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29949
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Xianwei Zhang <xianwei.zhang@amd.com>
2020-07-16 20:37:22 +00:00
Matt Sinclair
1836d58b36 arch-gcn3: add support for v_mbcnt_hi and v_mbcnt_lo
Change-Id: I1c70fe693c904f1abd7d5a2b99220c74a075eae5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29948
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-16 20:37:22 +00:00
Matt Sinclair
c7b6e7c613 arch-gcn3: fix bug with DPP support
Instructions that use the DPP field need to use the extra SRC0
register associated with the DPP instruction instead of the
"default" SRC0 register, since the default SRC0 register contains
the DPP information when DPP is being used.  This commit fixes
2735c3bb88 to take this into account.  Additionally, this commit
removes write of the src register from the DPP helper functions,
to avoid overwriting any changes made to the destination register.
Finally, this change modifies the instructions that use DPP to
simplify the flow through the execute() functions.

Change-Id: I80fd0af1f131f287f18ff73b3c1c9122d8c60823
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29947
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-16 20:37:22 +00:00
Matt Sinclair
ed3135ea6a arch-gcn3: implement multi-dword buffer loads and stores
Add support for all multi-dword buffer loads and stores:
buffer_load_dword x2, x3, and x4 and buffer_store_dword x2, x3, and x4

Change-Id: I4017b6b4f625fc92002ce8ade695ae29700fa55e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29946
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-16 20:37:22 +00:00
Tony Gutierrez
0c5d671ea1 gpu-compute: Init CU object for pipe stages in their ctors
This change updates the constructors of the CU's pipe
stages/memory pipelines to accept a pointer to their
parent CU. Because the CU creates these objects, and
can pass a pointer to itself to these object via their
constructors, this is the safer way to initalize these
classes.

Change-Id: I0b3732ce7c03781ee15332dac7a21c097ad387a4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29945
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-16 20:37:22 +00:00
Tony Gutierrez
ea52df816d arch-gcn3: Add support for rd/wr EXEC_HI to operand class
Change-Id: Ib22dd604f88ea56801964235082835002deffca1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29944
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-16 20:37:22 +00:00
Tony Gutierrez
af621cd6e6 gpu-compute, arch-gcn3: refactor barriers
Barriers were not modeled properly. Firstly, barriers were
allocated to each WG that was launched, which is not
correct, and the CU would provide an infinite number
of barrier slots. There are a limited number of barrier slots
per CU in reality. In addition, the CU will not allocate
barrier slots to WGs with a single WF (nothing to sync if
only one WF).

Beyond modeling problems, there also the issue of deadlock.
The barrier could deadlock because not all WFs are freed
from the barrier once it has been satisfied. Instead, we
relied on the scoreboard stage to release them lazily,
one-by-one.

Under this implementation the scoreboard may not fully release
all WFs participating in a barrier; this happens because the
first WF to be freed from the barrier could reach an s_barrier
instruction again, forever causing the barrier counts across
WFs to be out-of-sync.

This change refactors the barrier logic to:

1) Create a proper barrier slot implementation

2) Enforce (via a parameter) the number of barrier
   slots on the CU.

3) Simplify the logic and cleanup the code (i.e., we
   no longer iterate through the entire WF list each
   time we check if a barrier is satisfied).

4) Fix deadlock issues.

Change-Id: If53955b54931886baaae322640a7b9da7a1595e0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29943
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-16 20:37:22 +00:00
Xianwei Zhang
c2641eec89 arch-gcn3: add support of 64-bit SOPK instruction
s_setreg_imm32_b32 is a 64-bit instruction, using a 32-bit literal
constant. Related functions are added to support decoding the second
dword.

Change-Id: I290f8578f726885c137dbfac3773035f814e0a3a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29942
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Xianwei Zhang <xianwei.zhang@amd.com>
2020-07-16 20:37:22 +00:00
Matt Sinclair
3e84a8d710 arch-gcn3: ensure that atomics follow HSA conventions
Add asserts to make sure atomics are following the HSA conventions
that atomics should be word aligned (i.e., can't be byte aligned)
and should not be misaligned such that a given lane's access
spans multiple cache lines.

Change-Id: Ia48758b9ed96764864234dc607f337e30e287d1c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29941
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-16 20:37:22 +00:00
Tony Gutierrez
701f026ba5 gpu-compute: Fix LDS out-of-bounds behavior
The LDS is capable of handling out-of-bounds accesses,
that is, accesses that are outside the bounds of the
chunk allocated to a WG. Currently, the simulator asserts
on these accesses. This patch changes the behavior of the
LDS to return 0 for reads and dropping writes that are
out-of-bounds.

Change-Id: I5f467d0f52113e8565e1a3029e82fb89cc6f07ea
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29940
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-16 20:37:22 +00:00
Bobby R. Bruce
cc791d51a2 base,scons: Fixed stats/hdf5.cc CXXFlags to -Wno-deprecated
`Wno-deprecated-copy` was added to disable a warning in hdf5.cc:
https://gem5-review.googlesource.com/c/public/gem5/+/26325.

This works with GCC but does not work with clang. Clang returns
`error: unknown warning option '-Wno-deprecated-copy'; did you mean
'-Wno-deprecated'? [-Werror,-Wunknown-warning-option]` when this flag
is enabled. This flag has therefore been changed to `Wno-deprecated`.
This works in both GCC and Clang.

Change-Id: I38dd58f3007975ccb60b2eec936c3b200b3df3ca
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31216
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-16 02:44:26 +00:00
Bobby R. Bruce
7deef45046 arch-arm: Added missing break to miscregs.cc
Later GCC compilers >=9 fail with a `this statement may fall through`
error due to this missing break.

Change-Id: I44b3386930a0b71b842a3a9b4837e4d6ad588f9d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31215
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-16 02:44:26 +00:00
Bobby R. Bruce
191212a6dc sim: Added M5_VAR_USED to unused cpu var
The `BaseCPU cpu` variable unused when compiling gem5.fast. This
causes the compilation to fail. Adding the M5_VAR_USED resolves this
issue.

Change-Id: I62588563e9cde384e30755742d6bc754e819d7f4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31214
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-16 02:44:26 +00:00
Bobby R. Bruce
5052aae04f util: Added cloudbuild_create_images.yaml
This cloudbuild file is used to build and upload our Docker images
to our Google Cloud infrastructure. To run:

```
gcloud builds submit --config \
util/cloudbuild/cloudbuild_create_images.yaml
```

Change-Id: I812584284a3cf79aac244a3b04a0f316f4281c49
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30516
Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-16 02:44:26 +00:00
Bobby R. Bruce
e416e6b1f1 util,tests: Added gcc version 10 to compiler tests
In addition, gcc version 9 has been added, which was previously
served by "ubuntu-20.04_all-dependencies".

Change-Id: I57aeac2aa75b7751f0d4010efee7780e23d447d4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30515
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-16 02:44:26 +00:00
Bobby R. Bruce
f7a63fbd09 util: Added Dockerfile for GCC versions on ubuntu 20.04
Ubuntu 20.04 contains GCC verisons 9 and 10, which are not easily
obtainable (at least through APT) on Ubuntu 18.04. Therefore, a
Dockerfile for obtaining GCC versions in Ubuntu 20.04 has been added.
The orignal GCC version Dockerfile (Ubuntu 18.04) has been kept as
GCC versions 4.8, 5, and 6 are not obtainable, via APT, on Ubuntu
20.04. A complete migration to the 20.04 Dockerfile is not possible
until these earlier GCC versions are dropped.

The Docker images for GCC Versions 9 and 10 can be found here:
https://gcr.io/gem5-test/gcc-version-10
https://gcr.io/gem5-test/gcc-version-9

The other Dockerfile directories have been renamed for consistency.

Change-Id: I569249331095ee62d1be5be479c7ba7da0077422
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30514
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-16 02:44:26 +00:00
Bobby R. Bruce
91350dc4d5 util,tests: Added compiler test script
This script runs a series of compilations on gem5. The following
compilers are tested:

clang-9
clang-8
clang-7
clang-6
clang-5
clang-4
clang-3.9
gcc-9
gcc-8
gcc-7
gcc-6
gcc-5
gcc-4.8 (to be dropped soon:
         https://gem5.atlassian.net/browse/GEM5-218)

They are tested by building the following build targets:

ARM
ARM_MESI_Three_Level
Garnet_standalone
GCN3_X86
MIPS
NULL_MESI_Two_Level
NULL_MOESI_CMP_directory
NULL_MOESI_CMP_token
NULL_MOESI_hammer
POWER
RISCV
SPARC
X86
X86_MOESI_AMD_BASE

For each, ".opt" and ".fast" compiler build settings are tested.

clang-9 and gcc-9 are tested against all targets with each build
setting. For the remaining compilers, a random build target is
chosen. After the script has run, the output of the tests can be
found in "compile-test-out".

Docker is required to run this script.

Change-Id: Id3bf4c89b9d424c87e9409930ee2aceaef72cb29
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30395
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-15 20:49:57 +00:00
Tony Gutierrez
a408b1ada7 mem-ruby: Add support for MemSync reqs in VIPER
Change-Id: Ib129e82be5348c641a8ae18093324bcedfb38abe
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29939
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-15 18:14:41 +00:00
seanzw
75257c7a42 mem-ruby: Fix type casting in makeNextStrideAddress
The RubyPrefetcher uses makeNextStrideAddress() with
a negative stride to find prefetched address.
The type of this expression is:
uint64_t + uint32_t * int;
This gives wrong result due to implicit conversion.
Fix this with static cast and it works correctly:
uint64_t + int * int;

Change-Id: I36e17e00d5c66c3699fe1d5b287971225a162d04
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31314
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-15 17:38:12 +00:00
Giacomo Travaglini
3ce7333a36 arch-arm: AddressSize check on translateMmuOff for AArch64 only
Motivation:
An AddressSizeFault on AArch32 can only happen during a table walk
since the register used as a base by LD/ST is always 32 bit wide.
On AArch64 on the other hand, addresses can be 64bit wide;
when MMU is off (no virtual memory) an invalid physical address
can be specified

Change-Id: Id3ef170e99202c6b0b511fa7205c754956861720
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31274
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-15 13:14:56 +00:00
Jason Lowe-Power
981bdda174 misc: Add a code of conduct
This file codifies our current unofficial policies. This doesn't change
the community in any way except for clarifying our policies.

This was modeled off of https://www.contributor-covenant.org/.

Change-Id: Ib976636b490bbe4d46bf79260e6a345b46c02e2c
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30954
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-14 18:41:37 +00:00
Xianwei Zhang
024f978cff gpu-compute: enable kernel-end WB functionality
Change-Id: Ib17e1d700586d1aa04d408e7b924270f0de82efe
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29938
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Xianwei Zhang <xianwei.zhang@amd.com>
2020-07-13 23:32:37 +00:00
Alexandru Dutu
07fcbf16fc arch-gcn3: Implementation of flat atomic swap instruction
Change-Id: I9b9042899e65e8c9848b31c509eb2e3b13293e52
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29937
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-13 23:32:27 +00:00
Gabe Black
1427fdb455 misc: Remove support for checking out as a mercurial repo.
This will still be technically possible with the right converters, but
this removes the tags, ignore file, and style checking hooks related to
mercurial. We no longer maintain a mercurial mirror of the main git
repository, and this support adds clutter and could diverge from the git
style hooks, etc, over time.

Change-Id: Icf4833c4f0fda51ea98989d1d741432ae3ddc6dd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31174
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-13 22:25:41 +00:00
Michael LeBeane
6747b127af arch-gcn3: Fix VOP2 dissasembly prints
VOP2 prints VSRC1 register index as hex instead of decimal if the
instruction contains a literal operand.  This patch resets the
format specifiers in the stream to print the register correctly.

Change-Id: Icc7e6588b3c5af545be6590ce412460e72df253f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29936
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
2020-07-13 19:48:12 +00:00
Michael LeBeane
ed7daa10aa arch-gcn3, gpu-compute: Implement out-of-range accesses
Certain buffer out-of-range memory accesses should be special
cased and not generate memory accesses. This patch implements
those special cases and supresses lanes from accessing memory
when the calculated address falls in an ISA-specified out-of-range
condition.

Change-Id: I8298f861c6b59587789853a01e503ba7d98cb13d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29935
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
2020-07-13 19:48:00 +00:00
Michael LeBeane
f8e295922b arch-gcn3: Fix writelane src0,src1 usage
Src1 should only be used for lane select.  The data should come
from src0.

Change-Id: Ibe960df2e56d351a3819b40194104d2972a5cd4c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29933
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
2020-07-13 19:47:47 +00:00
Onur Kayiran
bff8df2288 gpu-compute: Dropping fetchs when no entry is reserved in the buffer
This changeset drops fetches if there is no entry reserved in the
fetch buffer for that instruction. This can happen due to a fetch
attempted to be issued in the same cycle where a branch instruction
flushed the fetch buffer, while an ITLB or I-cache request is still
pending.

Change-Id: I3b80dbd71af27ccf790b543bd5c034bb9b02624a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29932
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Onur Kayıran <onur.kayiran@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
2020-07-13 19:47:26 +00:00
Matt Sinclair
3846e90737 arch-gcn3: fix bits that SDWA selects
This commit fixes a bug in 200f2408 where the SDWA support was selecting bits
backwards.  As part of this commit, to help resolve this problem in the
future, I have added asserts in the helper functions in bitfield.hh to ensure
that the number of bits aren't negative.

Change-Id: I4b0ecb0e7c110600c0b5063101b75f9adcc512ac
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29931
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
2020-07-13 16:19:47 +00:00
Giacomo Travaglini
ecd1e05f57 arch-arm: Fix coding style in self_debug.[cc, hh]
Change-Id: I67be98af412b745ea9e16d4e8c6d422c9fbb29fc
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31082
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-13 13:56:41 +00:00
Giacomo Travaglini
10519e225c arch-arm: Remove getters/setters from SelfDebug class
Change-Id: I63e5ed25e453cb8fcb2c39ba0728cc81c499c166
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31081
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-13 13:56:41 +00:00
Giacomo Travaglini
8ac717a3a8 arch-arm: Fix pmc == on SelfDebug
The Assignment operator was used instead of the Equal-To

Change-Id: Ibf5a0006bce79b67d662fd1f8942699582956d58
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31080
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-13 13:56:41 +00:00
Giacomo Travaglini
7884d5296d arch-arm: Move breakpoint/watchpoint check out of the TLB
The breakpoint, watchpoint, vector catch and software step checks
have been moved from the TLB to the SelfDebug class.

This is cleaningup the TLB model which is simply asking the SelfDebug
class if there is a pending debug fault

Change-Id: I1724896b24e4728b32a6b46c5cd51cc6ef279fd7
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31079
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-13 13:56:41 +00:00
Giacomo Travaglini
2df992b53c dev-arm: Style fixes for src/dev/arm/gic_v2.hh
JIRA: https://gem5.atlassian.net/browse/GEM5-667

Change-Id: I80dce7b72775beabafa3b54e915a369571f2e4c9
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31057
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-11 07:50:24 +00:00
Giacomo Travaglini
01c695d5d3 dev-arm: Implement Level Sensitive PPIs in GICv2
JIRA: https://gem5.atlassian.net/browse/GEM5-667

Change-Id: I9ae411110f08f4a1de95469ff5ed6788354abafc
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31056
Reviewed-by: Hsuan Hsu <kugwa2000@gmail.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-11 07:50:24 +00:00
Giacomo Travaglini
2f4e975d44 dev-arm: Use getIntConfig when reading/writing GICD_ICFGR
This patch is changing the getIntConfig helper (which has been
used so far by isLevelSensitive only) to make it usable by the
read/writes of the GICD_ICFGR register.

While the helper was previously returning the irq config bits
provided a single irq as an input, this new version is returning
the entire GICD_ICFGR word (read/writable)

JIRA: https://gem5.atlassian.net/browse/GEM5-667

Change-Id: I07e455a9e2819fed1f97a0e372d9d9a2e5ad4801
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31055
Reviewed-by: Hsuan Hsu <kugwa2000@gmail.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-11 07:50:24 +00:00
Giacomo Travaglini
548e2884ec dev-arm: Move GICv2 intConfig for consistency
Every other helper is placed below the respective array storage

JIRA: https://gem5.atlassian.net/browse/GEM5-667

Change-Id: I398ac23eb68d84a8e0ed856550bfac8e403a86b3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31054
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-11 07:50:24 +00:00
Gabe Black
a6721c7a73 arch,cpu: Consolidate most of the StackTrace classes into a base class.
These classes are all basically empty now that Alpha has been deleted,
except in cases where the arch versions had copied versions of the Alpha
code.

This change pulls all the generic logic out of the arch versions, making
the arch versions much simpler and making it clearer what the core
functionality of the class is, and what parts are architecture specific
details.

In the future, the way the StackTrace class is instantiated should be
delegated to the Workload class so that ISA agnostic code doesn't need
to know about a particular ISA's StackTrace class, and so that
StackTrace logic can, at least theoretically, be specialized for a
particular workload. The way a stack trace is collected could vary from
OS to OS, for example.

Change-Id: Id8108f94e9fe8baf9b4056f2b6404571e9fa52f1
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30961
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-11 00:24:38 +00:00
Gabe Black
f1fc7ba257 cpu: Slightly modernize and simplify code in cpu/profile.(hh|cc).
Change-Id: Ideb104d20b333305ead2356cbfff2aac2e0173b5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30960
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2020-07-11 00:24:13 +00:00
Tony Gutierrez
787fb48055 arch-arm: Initialized some variables
Some of the variables in pauth_helpers.cc
are uninitialized in certain control paths
which causes a compiler warning. We initialize
these to false since they should be updated
to the correct value in all valid code paths.

Change-Id: If34d7daaf2404c2cf014c7b4c0c2f979580f36b9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31094
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-10 18:10:20 +00:00
Hsuan Hsu
6616354423 arch-arm: Add basic support for KVM_CAP_ARM_USER_IRQ
KVM_CAP_ARM_USER_IRQ is a KVM extension introduced in newer versions of
Linux (>= 4.12). It supports delivering interrupt from the kernel-space
timer to the user-space GIC, which means that it will be unnecessary to
use the memory-mapped timer and emulate it in gem5 anymore.

Using the option provided by this change, Linux is able to boot with 1
CPU successfully, and the speed is slightly faster then the memory-
mapped timer option. However, multicore seems to hang during boot and
still needs more investigation to be enabled.

JIRA: https://gem5.atlassian.net/browse/GEM5-663

Change-Id: I146bbcce3cf66f8f5ebee04ea5f1b9f54868721a
Signed-off-by: Hsuan Hsu <hsuan.hsu@mediatek.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30921
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2020-07-10 18:03:15 +00:00
Hsuan Hsu
d4e9a34590 cpu-kvm,arch-arm: Improve KvmCPU tick event scheduling
The memory-mapped timer emulated by gem5 is driven by the underlying
gem5 tick, which means that we must align the tick with the host time
to make the timer interrupt fire at a nearly native rate.

In each KVM execution round, the number of ticks incremented is
directly calculated from the number of instructions executed. However,
when a guest CPU switches to idle state, KVM seems to stay in kernel-
space until the POSIX timer set up in user-space raises an expiration
signal, instead of trapping to user-space immediately; and somehow the
instruction count is just too low to match the elapsed host time. This
makes the gem5 tick increment very slowly when the guest is idle and
drastically slow down workloads being sensitive to the guest time which
is driven by timer interrupt.

Before switching to KVM to execute the guest code, gem5 programs the
POSIX timer to expire according to the remaining ticks before the next
event in the event queue. Based on this, we can come up with the
following solution: If KVM returns to user-space due to POSIX timer
expiration, it must be time to process the next gem5 event, so we just
fast-forward the tick (by scheduling the next CPU tick event) to that
event directly without calculating from the instruction count.

There is one more related issue needed to be solved. The KVM exit
reason, KVM_EXIT_INTR, was treated as the case where the KVM execution
was disturbed by POSIX timer expiration. However, there exists a case
where the exit reason is KVM_EXIT_INTR but the POSIX timer has not
expired. Its cause is still unknown, but it can be observed via the
"old_value" argument returned by timer_settime() when disarming the
POSIX timer. In addition, it seems to happen often when a guest CPU is
not in idle state. When this happens, the above tick event scheduling
incorrectly treats KVM_EXIT_INTR as POSIX timer expiration and fast-
forwards the tick to process the next event too early. This makes the
guest feel external events come too fast, and will sometimes cause
trouble. One example is the VSYNC interrupt from HDLCD. The guest seems
to get stuck in VSYNC handling if the KVM CPU is not given enough time
between each VSYNC interrupt to complete a service. (Honestly I did not
dig in to see how the guest handled the VSYNC interrupt and how the
above situation became trouble. I just observed from the debug trace of
GIC & HDLCD & timer, and made this conclusion.) This change also uses
a workaround to detect POSIX timer expiration correctly to make the
guest work with HDLCD.

JIRA: https://gem5.atlassian.net/browse/GEM5-663

Change-Id: I6159238a36fc18c0c881d177a742d8a7745a23ca
Signed-off-by: Hsuan Hsu <hsuan.hsu@mediatek.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30919
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-10 18:03:15 +00:00
Hsuan Hsu
c513835c4f dev-arm: Fix handling of writing timer control registers
We should also deal with change of the imask bit, or we will lose timer
interrupt if the timer expires before the guest kernel unmasks the bit.
More precisely, consider the following common pattern in timer interrupt
handling:

    1. Set the interrupt mask bit (CNTV_CTL.IMASK)
    2. Reprogram the downcounter (CNTV_TVAL) for the next interrupt
    3. Clear the interrupt mask bit (CNTV_CTL.IMASK)

The timer can expires between step 2 & 3 if the value programmed in step
2 is small enough, and this seems very likely to happen in KVM mode. If
we don't check for timer expiration right after unmasking, we will miss
the only chance to inject the interrupt.

JIRA: https://gem5.atlassian.net/browse/GEM5-663

Change-Id: I75e8253bb78d15ae72cb985ed132f896d8e92ca6
Signed-off-by: Hsuan Hsu <hsuan.hsu@mediatek.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30918
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-10 18:03:15 +00:00
Hsuan Hsu
f53eefa397 cpu-kvm: Initialize _hasKernelIRQChip in the constructor
This class member was only correctly set to true when using an in-kernel
interrupt controller, but was un-initialized when trying to use a user-
space one and would cause trouble.

JIRA: https://gem5.atlassian.net/browse/GEM5-663

Change-Id: I71b052c6da7e8790b05a15c07e7933bc4f912785
Signed-off-by: Hsuan Hsu <hsuan.hsu@mediatek.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30917
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-10 18:03:15 +00:00
Hsuan Hsu
5a55a242ab dev-arm: Make generic timer work with level-sensitive support
Support for level-sensitive PPIs and SPIs has been added to GICv2 now.
It is therefore the timer's responsibility to notify GICv2 to clear its
interrupt pending state. Without doing this, the guest will get stuck
in just a single round of the interrupt handler because GICv2 does not
clear the pending state, and eventually make the guest treat this
interrupt as problematic and then just disable it.

JIRA: https://gem5.atlassian.net/browse/GEM5-663

Change-Id: Ia8fd96bf00b28e91aa440274e6f8bb000446fbe3
Signed-off-by: Hsuan Hsu <hsuan.hsu@mediatek.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30916
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-10 18:03:15 +00:00
Chris January
98ce167176 configs: Add earlycon to default kernel_cmd.
The earlyprintk kernel command line argument does not take a value on Arm.
Rather pass early console name using the earlycon command line argument.

Change-Id: Ie14fc425e87c50a0b59fa4270a3743ed4fe97589
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/31074
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-07-09 07:25:57 +00:00