'ext' is set as a Python source path for gem5, like 'src/python'. It
helps vscode users to have vscode aware of this to better analytics and
reduce warnings (most comminly "unable to resolve import).
'tests' isn't in the Python source path when compiling gem5 but it is
when running `tests/main.py`. Though somewhat unideal as is lets vscode
think files in 'src' can import from files in 'test', adding this helps
vscode Python analytics parse the test files which reduces warnings and
aids in betters navigation of the testing code. This is particularly
helpful given the complexity of the testlib testing infrastructure.
With this patch the pannotia tests now:
1. Download the resources to 'gpu-pannotia' in the
'tests/gem5/resources' directory. This is where other test resources are
store.
2. Download thr USA-road-d.NY.gr dataset from Google cloud bucket in a
decompressed state.
2. Avoid re-download the resources if they are already present on the
host machine.
`edited` is what forces a re-run of our tests when the PR title is
updated and other minor metadata stuff. I believe all changes to the
code are covered by the remainder. `synchronize` is means the PR is
triggered with the when the this PR is from (in this case my forked gem5
repo) is synced with the PR branch here. This covers the vast majority
of cases we care about. `opended` covers for the case where the PR is
created and `ready_for_review` for when something moves out of a draft.
Ruby requires each machine type to have a continuous set of version
numbers starting at 0. We were hiding this from users/developers by
using a Python class variable in the stdlib. Unfortunately, with
multiple ruby systems this doesn't work anymore.
As a stop-gap this change adds "resetting" these versions to the
beginning of `incorporate_caches`. It would be better to fix this in the
C++ code (and assign these numbers in C++ probably via the RubySystem),
but that's a bigger change than is needed right now.
---------
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Add decoder and function of AArch32 VCVTA, VCVTP, VCVTN and VCVTM
instructions. Support both 16-bit and 32-bit variants.
Only support A32 encoding.
Change-Id: I6ece0e1b779f9a7cc9d709894a49a7fdcda28373
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This PR fixes the bug where simInsts and simOps don't reset when
m5.stats.reset() is called. The stats hostInstRate and hostOpRate are
affected by this change as well, as they depend on simInsts and simOps
respectively.
This is related to issue 1443 linked
[here](https://github.com/gem5/gem5/issues/1443).
This allows vscode to resolve python imported from "src/python".
Warnings regarding these imports are numerous and the issue stops users
of vscode to utilizubg features like navigating the codebase though "Go
to Definition" queries on imported classes/functions.
Previously, whether the board object or the memory_system returned
the memory ports was not consistent in the cache_hierarchies
This commit makes it consistently use the board. Note: the board
is a better place so it can customize the ports (e.g., add I/O
components or other things.
This commit also makes the arm board consistent with the other
boards and removes the specialized `get_mem_ports` that was not
used.
Where appropriate utilize caching of ALL/gem5.opt or VEGA_X86/gem5.opt.
The cache key is just the date returned by the runner. This is unlikely
the most efficient solution but it is simple and difficulties were
encountered when attempting to create a hash of This solution will do
for now.
The previous https://github.com/gem5/gem5/pull/1617 introduce the CLINT
reset feature. When reset, we changed the mtime to 0 and keep mtimecmp
unchanged by default, we also need to check mtime & mtimecmp regiter to
update the MTI signal. However, the mtime register will be incremented
to 1 by `raiseInterruptPin`.
In the PR, we introduced the interrupt ID for CLINT, the mtime will be
incremented only if received the RTC signal
---------
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
There are several parts to this PR to work towards #1349 .
(1) Make RubySystem::getBlockSizeBytes non-static by providing ways to
access the block size or passing the block size explicitly to classes.
The main changes are:
- DataBlocks must be explicitly allocated. A default ctor still exists
to avoid needing to heavily modify SLICC. The size can be set using a
realloc function, operator=, or copy ctor. This is handled completely
transparently meaning no protocol or config changes are required.
- WriteMask now requires block size to be set. This is also handled
transparently by modifying the SLICC parser to identify WriteMask
types and call setBlockSize().
- AbstractCacheEntry and TBE classes now require block size to be set.
This is handled transparently by modifying the SLICC parser to
identify these classes and call initBlockSize() which calls
setBlockSize() for any DataBlock or WriteMask.
- All AbstractControllers now have a pointer to RubySystem. This is
assigned in SLICC generated code and requires no changes to protocol
or configs.
- The Ruby Message class now requires block size in all constructors.
This is added to the argument list automatically by the SLICC parser.
(2) Relax dependence on common functions in
src/mem/ruby/common/Address.hh
so that RubySystem::getBlockSizeBits is no longer static. Many classes
already have a way to get block size from the previous commit, so they
simply multiple by 8 to get the number of bits. For handling SLICC and
reducing the number of changes, define makeCacheLine, getOffset, etc. in
RubyPort and AbstractController. The only protocol changes required are
to change any "RubySystem::foo()" calls with "m_ruby_system->foo()".
For classes which do not have a way to get access to block size but
still used makeLineAddress, getOffset, etc., the block size must be
passed to that class. This requires some changes to the SimObject
interface for two commonly used classes: DirectoryMemory and
RubyPrefecther, resulting in user-facing API changes
User-facing API changes:
- DirectoryMemory and RubyPrefetcher now require the cache line size as
a non-optional argument.
- RubySequencer SimObjects now require RubySystem as a non-optional
argument.
- TesterThread in the GPU ruby tester now requires the cache line size
as a non-optional argument.
(3) Removes static member variables in RubySystem which control
randomization, cooldown, and warmup. These are mostly used by the Ruby
Network. The network classes are modified to take these former static
variables as parameters which are passed to the corresponding method
(e.g., enqueue, delayHead, etc.) rather than needing a RubySystem object
at all.
Change-Id: Ia63c2ad5cf0bf9d1cbdffba5d3a679bb4d3b1220
(4) There are two major SLICC generated static methods:
getNumControllers()
on each cache controller which returns the number of controllers created
by the configs at run time and the functions which access this method,
which are MachineType_base_count and MachineType_base_number. These need
to be removed to create multiple RubySystem objects otherwise NetDest,
version value, and other objects are incorrect.
To remove the static requirement, MachineType_base_count and
MachineType_base_number are moved to RubySystem. Any class which needs
to call these methods must now have a pointer to a RubySystem. To enable
that, several changes are made:
- RubyRequest and Message now require a RubySystem pointer in the
constructor. The pointer is passed to fields in the Message class
which require a RubySystem pointer (e.g., NetDest). SLICC is modified
to do this automatically.
- SLICC structures may now optionally take an "implicit constructor"
which can be used to call a non-default constructor for locally
defined variables (e.g., temporary variables within SLICC actions). A
statement such as "NetDest bcast_dest;" in SLICC will implicitly
append a call to the NetDest constructor taking RubySystem, for
example.
- RubySystem gets passed to Ruby network objects (Network, Topology).
There was a bug exposed by a recent PR [1] where until recently the O3
CPU was executing an instruction even if it did not have the required
functional unit in the FU pool.
We are adding the matrix descriptors to the Default FU pool in the O3
cpu so that no panic is encountered upon executing of a matrix
instruction
[1]: https://github.com/gem5/gem5/pull/1516
Change-Id: I04250255a2cbb2ee6f3ef204b62bc2c1ee2d4d2c
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
There was a bug exposed by a recent PR [1] where until recently the O3
CPU was executing an instruction even if it did not have the required
functional unit in the FU pool.
We are adding the crypto descriptors to the Default FU pool in the O3
cpu so that no panic is encountered upon executing of a crypto
instruction
[1]: https://github.com/gem5/gem5/pull/1516
Change-Id: Ifaf2f8e4780dfb8ba825a99a02dd587f011dbd23
Reviewed-by: Richard Cooper <richard.cooper@arm.com>
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The Daily tests have been failing as the learning-gem5 Ruby test now
exits at tick 9831 instead of tick 9981.
**Note**: The cause of this change is currently unknown. I'm not sure if
this is symptomatic of something bigger but for now I only observe this
bug failure and this patch at least silences the error.
The clone3 syscall, implemented in commit 87e774c, is currently only
handled for x86-64 in gem5 SE mode. Clone3 is employed by modern glibc
versions instead of clone for processes/threads generation (e.g. issue
#1204). This commit enables the clone3 syscall in riscv64 by adding the
corresponding handler call, as well as its arguments struct.
The ROM field was originally intended as a future alternate way to load
VBIOS without the ROM being on the disk image. This code path is never
taken for the devices gem5 supports and there is no gem5 implementation.
Deprecate the rom_binary field for this reason.
Similarly, MMIO traces were only used for Vega10. Deprecate this as
Vega10 is now deprecated. The MMIO trace reader is kept as it may still
be useful in the future. It is still the primary way to handle devies
which have graphics capability. None of the devices supported by gem5
have graphics now that Vega10 is deprecated.
Hashing the `src` directory is too costly, with some runners reaching
timeout. Also, as we only have 10GB of cache it makes sense to have
more course grained caching
The Weekly GPU tests are failing due to a timeout, but I found the
testing timeout was set to 5 hours, and we have been frequently close to
reaching this but have recently changed the test enough to consistently
go over.
The main two things that appear to have caused this are:
~~1. Moving the X86_VEGA compilation into the same step as the running
of the tests.~~ (I take this back, the timeout is per-job, it shouldn't
matter how stuff is deivided among steps in the job. However, keeping it
separate does no harm and merging the two steps did coincide with
failures occurring. I'll play it safe for now_.
2. Reducing the number of threads per GitHub Actions runner, thus
slowing job execution.
In addition, we've added more tests to this weekly GPU suite, though I
don't believe we have got to running these tests yet. The timeout
appears to always have been triggered before this.
This PR increases the timeout to 3 days and moves the compilation into a
separate step.
**Update: Same changes done for Daily tests too as it appears to be the
same problem.
The Weekly GPU tests are failing due to a timeout but I found the testing
timeout was set to 5 hours and we have been frequently close to reaching this
but have recently changes the test enought o consistently go over.
The main two things that appear to have caused this are:
1. Moving the X86_VEGA compilation into the the same step as the running of
the tests.
2. Reducing the number of threads per GitHub Actions runner, thus slowing
job execution.
In addition we've added more tests to this weekly GPU suite though I don't
believe have got to running these tests yet. The timeout appears to
always been triggered before this.
This PR increases the timout to 3 days and moves the compilation into a
seperate step.
Two faults:
1. You can't give description the docker-bake file for single platform
builds. They must be in the Dockerfile..
2. The gpu docker image def in docker-bake.hcl was not overriding the
"common" setttings as previously thought. This was causing builds to
something build the wrong platform and vairous other weird bugs. This
has been fixed in this patch.
Invalidate requests align to system cache line size. This causes
problems if the GPU cache hierarchy's cache line size is different than
the system as the unlaigned requests never return, leading to deadlock
on deferred dispatch.
This commit uses the cache line size from the GPU memory manager and
makes the cache line size there non-optional.
Tested with multiple RubySystems where CPU side was 64B and GPU side was
128B cache lines.
Vega10 is no longer officially supported by ROCm and ROCm is starting to
use some packet types not supported. These were originally kept to allow
users to use older disk images with newer gem5. Going forward the gem5
version and gem5-resources releases will be required to be the same to
prevent lingering old configs.
As a replacement for vega10*.py, mi300.py or mi200.py should be used.
HIP examples, cookbook, and rodinia configs can be replaced with the
standard flow of building / obtaining the GPU application and running
using mi300.py or mi200.py as they do not require any input options and
therefore do not require changes to the disk image.
The clone3 syscall, implemented in commit 87e774c, is currently only
handled for x86-64 in gem5. Clone3 is employed by modern glibc versions
instead of clone for processes/threads generation (e.g. issue #1204).
This commit enables the clone3 syscall in riscv64 by adding the
corresponding handler call, as well as its arguments struct.
FMAXV, FMINV, FMAXNMV, FMINNMV and ADDV instructions perform recursive
reduction. Different reduction methods lie to different result when
handle NaN values.
Reuse the template of `twoRegAcrossInstX`. Add one more option
`recursive` for recursive reduction.
Change-Id: I69e690ce7668baee818542d3ea463f7a5f269a69
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This commit fixs a bug in the viota instuction.
The two different instructions can be referenced to the same
StaticInstPtr because the decoder behaves as shown in [the section of
the
code](https://github.com/gem5/gem5/blob/stable/src/arch/riscv/decoder.cc#L98-L100).
So every first micro-op should reset the cnt variable in the macro-op.
Change-Id: Id311a05cfed41b01e16fd7256d9baa166aee49da
Co-authored-by: Jack Yung-Chen Lin <jack622@andestech.com>
This commit changes metric units (e.g. kB, MB, and GB) to binary units
(KiB, MiB, GiB) in various files. This PR covers files that were missed
by a previous PR that also made these changes.
This change adds MADT entries to the X86Board. Previously, the kernel in
full-system mode was complaining about a `ACPI BIOS Error (bug): Invalid
table length 0x24 in RSDT/XSDT (20190816/tbutils-291)`. This patch fixes
the invalid length and initializes all the tables correctly.
Signed-off-by: Kaustav Goswami <kggoswami@ucdavis.edu>
FMAXV, FMINV, FMAXNMV, FMINNMV and ADDV instructions perform recursive
reduction. Different reduction methods lie to different result when
handle NaN values.
Reuse the template of `twoRegAcrossInstX`. Add one more option
`recursive` for recursive reduction.
Change-Id: I69e690ce7668baee818542d3ea463f7a5f269a69
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
This PR is doing a simple refactoring of some partitioning policies. It
moves existing functionalities
within PP methods so that they can be called multiple times throughout
the simulation.
Therefore allowing a dynamic adjustment of the partitioning scheme
- Add `isExternalAbort()` in `AbortFault<T>` to determine external
abort.
- Add `virtual isExternalAbort()` in `ArmFault` so the method can be
used in base class.
- Set iss.ea by `isExternalAbort()`