Hashing the `src` directory is too costly, with some runners timing
out. Also, as we only have 10GB of cache, it makes sense to use more
coarse-grained caching.
The Weekly GPU tests are failing due to a timeout. I found the testing
timeout was set to 5 hours. We have frequently been close to reaching this
limit, and recent changes to the tests are enough to consistently push us
over it.
The main two things that appear to have caused this are:
1. Moving the VEGA_X86 compilation into the same step as the running of
the tests.
2. Reducing the number of threads per GitHub Actions runner, thus slowing
job execution.
In addition, we've added more tests to this weekly GPU suite, though I
don't believe we have got to running these tests yet. The timeout appears
to have always been triggered before this.
This PR increases the timeout to 3 days and moves the compilation into a
separate step.
This reverts commit 52fbc8ebcf.
This commit used Ubuntu 22.04 instead of the typical 24.04 as 24.04
has GCC v13 installed by default. GCC v13 (and newer compilers) introduce
an 'overloaded-virtual' check that is triggered in systemc. Systemc
developers suggest this fix to proceed.
Without specifying the "gem5/gpu" directory, this test attempted to run
the entire test suite. This caused the daily and weekly tests to fail.
This change fixes this.
A new host tag `gcn_gpu` has been added. This allows for selection of
those GPU tests which depend upon the gcn-gpu docker image to run.
In addition to this, the square GPU tests have been moved to the CI
tests. This ensures some GPU code is compiled and run on every PR.
This is made to run on the 'stable' branch and schedules workflow runs on
the `develop` branch. This solves the problem of GitHub Workflows being
scheduled to run only on the 'stable' branch, thus ignoring changes made
to them on 'develop'.
With this schedule we no longer need to force a checkout of 'develop' in
the workflows. As such these have been removed.
The scheduled workflows are now triggered through "workflow_dispatch" by
the "scheduler.yaml" workflow.
This is not needed: with upload-artifact v4, directories are archived and
compressed by default.
This zip step was also causing Daily/Weekly test failures due to not
running `apt update` before the `apt install` for the zip utility. Ergo
this patch fixes these errors.
There was some inconsistency in the GitHub Workflow files on using
'ubuntu-latest' (which gets the latest Ubuntu version) or
'ubuntu-22.04'. To keep things consistent 'ubuntu-latest' is now used in
all cases. This also saves us updating workflows upon release of a new
Ubuntu version.
This change ensures all our tests run on our most recent supported LTS
release of Ubuntu.
In the case of compiler tests we still test 22.04 all-dep but test 24.04
all-dep and min-dep (i.e., we drop 22.04 min-dep as it's somewhat
redundant).
Change-Id: I63666d1017594b496523a48e5112a8994f57885f
v3 was causing a 'Node.js 16 actions are deprecated' error.
Note: download-artifact@v4 must be used with upload-artifact@v4 and
vice-versa.
Change-Id: Icb8ab6d27aed4557be95ce31dd89d4655010968e
"build/VEGA_X86/gem5.opt" is not available in directory "hip". `${
github.workspace}` is the default working directory, so the command should
be run from there. This patch fixes this.
Change-Id: I99875270c77dde92d3ec2ae0a07760905eaf903e
This separation was only for convenience while GPU tests were under
development and rapidly changing. This change merges the GPU tests into
the weekly tests where they belong.
Change-Id: I0e7118e863dba51334de89b3bbc3592374ef63ec
This clone is updated to reflect the new advice given in
ext/dramsys/README that was introduced in PR
https://github.com/gem5/gem5/pull/525 to upgrade DRAMSys to v5.0.
Change-Id: I868619ecc1a44298dd3885e5719979bdaa24e9c2
I believe the weekly test failures (example:
https://github.com/gem5/gem5/actions/runs/6832805510/job/18592876184)
are due to a container running out of memory when running the very-long
x86 boot tests. I found that the `-t $(nproc)` flag meant, on our
runners, 4 x86 full system gem5 simulations were being spawned. Locally I
found these gem5 x86 boot sims can reach 4GB in size, so I suspect they
eventually grew big enough to exceed the 16GB memory of the VM.
I have removed `-t $(nproc)`, meaning each execution is single-threaded,
to see if this fixes the issue (we may want to use `-t 2` later if the
Weeklies take too long running single-threaded).
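
The memory arithmetic above can be sketched as follows. This is only a
rough illustration, not logic taken from the workflows: the function name
`max_parallel_sims` and the 2GB of headroom reserved for the OS are
assumptions, while the 16GB VM size, the 4GB peak per simulation and the 4
parallel simulations come from the observations above.

```python
def max_parallel_sims(vm_memory_gb, peak_sim_gb, headroom_gb=2.0):
    """Return how many simulations fit in memory at once (never below 1).

    headroom_gb is an assumed reserve for the OS and the test harness.
    """
    usable_gb = vm_memory_gb - headroom_gb
    return max(1, int(usable_gb // peak_sim_gb))


# Four parallel simulations at a 4GB peak each can claim the whole VM.
worst_case_gb = 4 * 4.0
print(worst_case_gb >= 16.0)

# Upper bound on parallel simulations once some headroom is reserved.
print(max_parallel_sims(16.0, 4.0))
```

Under these assumed numbers, `-t 2` would stay below the bound, while
`-t 4` leaves no memory for anything but the simulations.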
This replaces continue-on-error with fail-fast, as continue-on-error
marks failed matrix runs as successful, whereas setting fail-fast to
false makes sure everything in the matrix runs, but still marks the run
as failed if any part of it fails.
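
The difference between the two settings can be illustrated with a small
model. This is a simplified sketch based on the documented GitHub Actions
behaviour of `continue-on-error` and `strategy.fail-fast`, not code from
the workflows: the function `run_matrix` is hypothetical, and it runs the
matrix entries one after another, whereas real matrix jobs run in
parallel.

```python
def run_matrix(outcomes, fail_fast=True, continue_on_error=False):
    """Model a job matrix; each outcome is True (pass) or False (fail).

    Returns (per-job statuses, whether the overall run is marked passed).
    """
    statuses = []
    cancelled = False
    run_passed = True
    for passed in outcomes:
        if cancelled:
            # An earlier failure cancelled the rest of the matrix.
            statuses.append("cancelled")
        elif passed:
            statuses.append("success")
        elif continue_on_error:
            # The failure is tolerated and does not fail the run.
            statuses.append("failure (ignored)")
        else:
            statuses.append("failure")
            run_passed = False
            if fail_fast:
                cancelled = True
    return statuses, run_passed


outcomes = [True, False, True]
print(run_matrix(outcomes, continue_on_error=True))
print(run_matrix(outcomes, fail_fast=False))
```

In this model, only the `fail_fast=False` case both runs every matrix
entry and reports the overall run as failed.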
Change-Id: Ie20652c229b6cce9f1c0a45958b088391e7aae97
This sets continue-on-error to true on any scheduled test that
uses a matrix so that all sets of tests run regardless of
whether one of them fails.
Change-Id: I8f6137ebdf62a5cecd582387316c330c8a1401ca
This moves the clean runner step in our yaml files to be at the
beginning of a job, so that if a runner goes down and is
unable to clean at the end, we can ensure that
subsequent jobs still run as expected.
Change-Id: Iba52694aefe03c550ad0bfdb5b5f938305273988
* misc: Update CI test workflow
This updates our CI tests to clean the runners after every
workflow, to make sure no leftover files cause problems for
future tests.
Change-Id: Iff6a702bbc2e86a31e4c18ef9764a3cfd3af2f7d
* misc: Update scheduled workflows to clean runners
This updates our scheduled tests to clean up any remaining
files after running tests to avoid anything hanging for
future runs.
Change-Id: Icfdd5a0559337ad0e62d108a47f4e5a12e0db677
* misc: Fix spacing in workflow files
Some commands were incorrectly spaced.
Change-Id: Id340dc77bfb5c5d579b5f1e5b3ddeabea4a35ea8