The dailies timed out as they were running the entire directory of tests
due to a wrong variable name being used. In addition, the names of tests
were adjusted to include the matrix type so the artifacts won't
overwrite each other
The configs directory for the fs tests was missing the
checkpoint.py file, causing some of the CI tests to fail.
Change-Id: Ifbd775ad658f96d06bea7bee554fe3bedcf5a5b5
* Solves issue #106 by updating the cpts with the necessary vector
registers.
Change-Id: Ifeda90e96097f0b0a65338c6b22a8258c932c585
util: clear vector_element field
Change-Id: I6c9ec4e71f66722b26de030fa139cd626bdb24dc
"tests/gem5/configs/download_check.py" is used by the
"test-resource-downloading" test (defined in
"tests/gem5/gem5-resources/test_download_resources.py" and ran as part
of the "very-long" suite).
Prior to this change "download_check.py" would download each resource,
check it's md5, then at the end of the script remove all the downloaded
resources. This is inefficient on disk space and was causing our
"very-long" suite of tests to require a machines with a lot of disk
space to run.
This change alters 'download_check.py" to remove each resource after the
md5 check. Thus, only one resource is ever downloaded and present at any
given time during the running of this script.
The dailies timed out as they were running the entire directory
of tests due to a wrong variable named being used. In addition,
the names of tests were adjusted to include the matrix type so
the artifacts won't overwrite each other
Change-Id: Iaa1be8e0cfcbf9d64f1a674590bfe2bf1f0dae90
Updates the directories in which tests are run in accordance
with the refactoring of the testing directory
Change-Id: I93f5c5b0236c5180da04deb425ec2ed6804fa003
"tests/gem5/configs/download_check.py" is used by the
"test-resource-downloading" test (defined in
"tests/gem5/gem5-resources/test_download_resources.py" and ran as part
of the "very-long" suite).
Prior to this change "download_check.py" would download each resource,
check it's md5, then at the end of the script remove all the downloaded
resources. This is inefficient on disk space and was causing our
"very-long" suite of tests to require a machines with a lot of disk
space to run.
This change alters 'download_check.py" to remove each resource after the
md5 check. Thus, only one resource is ever downloaded and present at any
given time during the running of this script.
Change-Id: I38fce100ab09f66c256ccddbcb6f29763839ac40
This merges initial support for RVV. Currently, only the simple CPUs are supported.
The decoder stalls for every vsetvl instruction.
In the future, we will implement vsetvl as a control instruction as described in #144
This changeset reorganizes the testing directory within gem5,
removing the bigger config folders, then replacing them with
smaller configs folders within each directory containing only
the scripts necessary for that set of tests. It also changes
the locations of the config scripts used in each set of tests,
and updates the tests accordingly.
Change-Id: I38297d4496f72bd5cf7200471acd5c4d93002b27
This change adds READMEs to each directory within tests/gem5,
with a short description of the test, as well as how to run it.
Change-Id: I574ebcdc837848b52f21e8c0f8856ff09463284b
Since the O3 and Minor CPU models do not support RVV right now as the
implementation stalls the decode until vsetvl instructions are exectued,
this change calls `fatal` if RVV is not explicitly enabled.
It is possible to override this if you explicitly enable RVV in the
config file.
Change-Id: Ia801911141bb2fb2bedcff3e139bf41ba8936085
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
TODOs:
+ vcompress.vm
Change-Id: I86eceae66e90380416fd3be2c10ad616512b5eba
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>
Co-authored-by: Jerin Joy <joy@rivosinc.com>
arch-riscv: Add LICENCE to template files
Change-Id: I825e72bffb84cce559d2e4c1fc2246c3b05a1243
* TODOs:
+ Vector Segment Load/Store
+ Vector Fault-only-first Load
Change-Id: I2815c76404e62babab7e9466e4ea33ea87e66e75
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>
Co-authored-by: Jerin Joy <joy@rivosinc.com>
Change-Id: I84363164ca327151101e8a1c3d8441a66338c909
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>
arch-riscv: Add a todo to fix vsetvl stall on decode
Change-Id: Iafb129648fba89009345f0c0ad3710f773379bf6
This commit add regs and configs for vector extension
* Add 32 vector arch regs as spec defined and 8 internal regs for
uop-based vector implementation.
* Add default vector configs(VLEN = 256, ELEN = 64). These cannot
be changed yet, since the vector implementation has only be tested
with such configs.
* Add disassamble register name v0~v31 and vtmp0~vtmp7.
* Add CSR registers defined in RISCV Vector Spec v1.0.
* Add vector bitfields.
* Add vector operand_types and operands.
Change-Id: I7bbab1ee9e0aa804d6f15ef7b77fac22d4f7212a
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>
Co-authored-by: Jerin Joy <joy@rivosinc.com>
arch-riscv: enable rvv flags only for RV64
Change-Id: I6586e322dfd562b598f63a18964d17326c14d4cf
The MMIO trace contains register values for parts of the GPU that are
not modeled in gem5, such as registers related to the graphics core.
Since MI100 and MI200 do not have anything that is not modeled, the
MMIO trace is not needed, therefore it does not need to be used or
checked and the command line option goes away entirely for MI100/200.
Change-Id: I23839db32b1b072bd44c8c977899a99347fc9687
Starting with ROCm 5.4+, MI100 and MI200 make use of the translate
further bit in the page table. This bit enables mixing 4kiB and 2MiB
pages and is functionally equivalent to mixing page sizes using the
PDE.P bit for which gem5 currently has support.
With PDE.P bit set, we stop walking and the page size is equal to the
level in the page table we stopped at. For example, stopping at level
2 would be a 1GiB page, stopping at level 3 would be a 2MiB page.
This assumes most pages are 4kiB.
When the F bit is used, it is assumed most pages are 2MiB and we will
stop walking at the 3rd level of the page table unless the F bit is set.
When the F bit is set, the 2nd level PDE contains a block fragment size
representing the page size of the next PDE in the form of 2^(12+size).
If the next page has the F bit set we continue walking to the 4th level.
The block fragment size is hardcoded to 9 in the driver therefore we
assert that the block fragment size must be 0 or 9.
This enables MI200 with ROCm 5.4+ in gem5. This functionality was
determine by examining the driver source code in Linux and there is no
public documentation about this feature or why the change is made in or
around ROCm 5.4.
Change-Id: I603c0208cd9e821f7ad6eeb1d94ae15eaa146fb9
This SDMA packet is much more common starting around ROCm 5.4.
Previously this was mostly used to clear page tables after an
application ended and was therefore left unimplemented. It is
now used for basic operation like device memsets.
This patch implements constant fill as it is now necessary.
Change-Id: I9b2cf076ec17f5ed07c20bb820e7db0c082bbfbc
When using the new operator, delete should be called
on any allocated memory after it's use is complete.
Change-Id: Id5fcfb264b6ddc252c0a9dcafc2d3b020f7b5019
A previous change added a vop2Helper to remove 100s of lines of common
code from VOP2 instructions related to processing SDWA and DPP support.
That change inadvertently changed the type of operand source 0 from
const to non-const. The vector container operator[] does not allow
reading a scalar value such as a constant, a dword literal, etc. The
error shows up in the form of: assert(!scalar) in operand.hh.
Since the SDWA and DPP cases need to modify the source vector and
non-SDWA/DPP cases might require const, we make a non-const copy of the
const source 0 vector and place it in a temporary non-const vector. This
non-const vector is passed to the lambda function implementation of the
instruction. This prevents needing a const and non-const version of the
lambda and avoids needing to propagate the template parameters through
the various SDWA/DPP helper methods which seems like it will not work
anyways as they need to modify the vector.
As a result of this, as more VOP2 instructions are implemented using
this helper, they will need to specify the const and non-const template
parameters of the vector container needed for the instruction.
Change-Id: Ia0b3c550d7de32b830040007a110f4821e3385aa
When using the new operator, delete should be called
on any allocated memory after it's use is complete.
Change-Id: Id5fcfb264b6ddc252c0a9dcafc2d3b020f7b5019
In "src/cpu/testers/gpu_ruby_test" a random number generator was used.
This was using the CPP "<random>" library. This patch changes it to the
gem5 random class (that declared in "base/random.hh").
In addition to this, undeterministic behavior has been removed. Via
"protocol_tester.cc" the RNG is either seeded with a seed specified by
the user, or goes with the gem5 default seed. This ensures reproducable
runs. Prior to this patch the RNG was seeded with `time(NULL)`. This
made finding faults difficult.
This, at least partially, addresses Issue #138
Change-Id: Ia8e9f7b87e91323f828e0b7f6c3906c0c5793b2c
arch-x86: Move CPUID values to python
CPUID values for X86 are currently hard-coded in the C++ source file.
This makes it difficult to configure the bits if needed. Move these to
python instead. This will provide a few benefits:
1. We can enable features for certain configurations, for example AVX
can be enabled when the KVM CPU is used, but otherwise should not be
enabled as gem5 does not have full AVX support.
2. We can more accurately communicate things like cache/TLB sizes based
on the actual gem5 configuration. The CPUID values are can be used by
some libraries, e.g., MPI, to query system topology.
3. Enabling some bits breaks things in certain configurations and this
can be prevented by configuring in python. For example, enabling AVX
seems to currently be breaking SMP, meaning gem5 can only boot one CPU
in that configuration.
In "src/cpu/testers/gpu_ruby_test" a random number generator was used.
This was using the CPP "<random>" library. This patch changes it to the
gem5 random class (that declared in "base/random.hh").
In addition to this, undeterministic behavior has been removed. Via
"protocol_tester.cc" the RNG is either seeded with a seed specified by
the user, or goes with the gem5 default seed. This ensures reproducable
runs. Prior to this patch the RNG was seeded with `time(NULL)`. This
made finding faults difficult.
Change-Id: Ia8e9f7b87e91323f828e0b7f6c3906c0c5793b2c
A previous change added a vop2Helper to remove 100s of lines of common
code from VOP2 instructions related to processing SDWA and DPP support.
That change inadvertently changed the type of operand source 0 from
const to non-const. The vector container operator[] does not allow
reading a scalar value such as a constant, a dword literal, etc. The
error shows up in the form of: assert(!scalar) in operand.hh.
Since the SDWA and DPP cases need to modify the source vector and
non-SDWA/DPP cases might require const, we make a non-const copy of the
const source 0 vector and place it in a tempoary non-const vector. This
non-const vector is passed to the lambda function implementation of the
instruction. This prevents needing a const and non-const version of the
lambda and avoids needing to propagate the template parameters through
the various SDWA/DPP helper methods which seems like it will not work
anyways as they need to modify the vector.
As a result of this, as more VOP2 instructions are implemented using
this helper,they will need to specify the const and non-const template
parameters of the vector container needed for the instruction.
Change-Id: Ia0b3c550d7de32b830040007a110f4821e3385aa
This change syncs the repo's contributing documentation with that of the
website's contributing documentation:
https://www.gem5.org/contributing
From now on we'll attempt to keep the repo's CONTRIBUTING.md
documentation in sync with that on the website.
Change-Id: I2c91e6dd5cd7a9b642377878b007d7da3f0ee2ad
AVX is a requirement for some ROCm libraries, such as rocBLAS, which are
themselves requirements for libraries higher up the stack like PyTorch.
This patch sets the necessary CPUID bits in the GPUFS config to enable
AVX, AVX2, and various SSE features so that applications using these
libraries do not cause an illegal instruction trap.
Change-Id: Id22f543fb2a06b268271725a54075ee6a9a1f041
The extended state CPUID function is used to set the values of the XCR0
register as well as specify the size of storage for context switching
storage for x87 and AVX+. This function is iterative and therefore
requires (1) marking it as such in the hsaSignificantIndex function (2)
setting multiple sets of 4-tuples for the default CPUID values where the
last 4-tuple ends with all zeros.
Change-Id: Ib6a43925afb1cae75f61d8acff52a3cc26ce17c8
Related to the recent changes with moving CPUID values to python, this
value is needed to enable AVX and needs a way to be exposed to python as
well in order to set the bit and the corresponding CPUID values at the
same time.
Change-Id: I3cadb0fe61ff4ebf6de903018a8d8a411bfdb4e0