misc: Merge branch release-staging-v21-2 into develop

Change-Id: I8200ac51c20117f63b51d555fa2f12e5dd35f22e
2021-12-26 23:58:26 -08:00
parent 515c89b860 f554b1a7b5
commit 065a7dbf1b
23 changed files with 213 additions and 72 deletions
--- a/RELEASE-NOTES.md
+++ b/RELEASE-NOTES.md
@@ -1,3 +1,70 @@
+# Version 21.2.0.0
+
+## API (user-facing) changes
+
+All `SimObject` declarations in SConscript files now require a `sim_objects` parameter, which should list all SimObject classes declared in that file that need C++ wrappers.
+Those are the SimObject classes which have a `type` attribute defined.
+
+Also, there is now an optional `enums` parameter, which must list all of the Enum types defined in that SimObject file.
+Technically, this only needs to include Enum types which generate C++ wrapper files, but currently all Enums do, so all Enums should be listed.
+
+## Initial release of the "gem5 standard library"
+
+The previous release included an alpha release of the "components library".
+This has now been wrapped in a larger "standard library".
+
+The *gem5 standard library* is a Python package which contains the following:
+
+- **Components:** A set of Python classes which wrap gem5's models. Some of the components are preconfigured to match real hardware (e.g., `SingleChannelDDR3_1600`) and others are parameterized. Components can be combined together onto *boards* which can be simulated.
+- **Resources:** A set of utilities to interact with the gem5-resources repository/website. Using this module allows you to *automatically* download and use many of gem5's prebuilt resources (e.g., kernels, disk images, etc.).
+- **Simulate:** *THIS MODULE IS IN BETA!* A simpler interface to gem5's simulation/run capabilities. Expect API changes to this module in future releases. Feedback is appreciated.
+- **Prebuilt:** These are fully functioning prebuilt systems. These systems are built from the components in `components`. This release has a "demo" board to show an example of how to use the prebuilt systems.
+
+Examples of using the gem5 standard library can be found in `configs/example/gem5_library/`.
+The source code is found under `src/python/gem5`.
+
+## Many Arm improvements
+
+- [Improved configurability for Arm architectural extensions](https://gem5.atlassian.net/browse/GEM5-1132): we have improved how to enable/disable architectural extensions for an Arm system. Rather than working with independent boolean values, we now use a unified ArmRelease object modelling the architectural features supported by a FS/SE Arm simulation.
+- [Arm TLB can store partial entries](https://gem5.atlassian.net/browse/GEM5-1108): It is now possible to configure an ArmTLB as a walk cache: storing intermediate PAs obtained during a translation table walk.
+- [Implemented a multilevel TLB hierarchy](https://gem5.atlassian.net/browse/GEM5-790): enabling users to compose/model a customizable multilevel TLB hierarchy in gem5. The default Arm MMU now has an Instruction L1 TLB, a Data L1 TLB and a Unified (Instruction + Data) L2 TLB.
+- Provided an Arm example script for the gem5-SST integration (<https://gem5.atlassian.net/browse/GEM5-1121>).
+
+## GPU improvements
+
+- Vega support: gfx900 (Vega) discrete GPUs are now both supported and tested with [gem5-resources applications](https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/gpu/).
+- Improvements to the VIPER coherence protocol to fix bugs and improve performance: this improves scalability for large applications running on relatively small GPU configurations, which previously caused deadlocks in VIPER's L2. Rather than continually replaying these requests, the updated protocol wakes up the pending requests once the prior request to the same cache line has completed.
+- Additional GPU applications: The [Pannotia graph analytics benchmark suite](https://github.com/pannotia/pannotia) has been added to gem5-resources, including Makefiles, READMEs, and sample commands on how to run each application in gem5.
+- Regression Testing: Several GPU applications are now tested as part of the nightly and weekly regressions, which improves test coverage and avoids introducing inadvertent bugs.
+- Minor updates to the architecture model: We also made several small changes/fixes to the HSA queue size (to allow larger GPU applications with many kernels to run) and the TLB (to create GCN3- and Vega-specific TLBs), added new instructions that were previously unimplemented in GCN3 and Vega, and fixed corner cases for some instructions that were leading to incorrect behavior.
+
+## gem5-SST bridges revived
+
+We now support gem5 cores connected to an SST memory system in gem5 full system mode.
+This has been tested for RISC-V and Arm.
+See `ext/sst/README.md` for details.
+
+## LupIO devices
+
+LupIO devices were developed by Prof. Joel Porquet-Lupine as a set of open-source I/O devices to be used for teaching.
+They were designed to model a complete set of I/O devices that are neither too complex to teach in a classroom setting, nor too simple to translate to understanding real-world devices.
+Our collection consists of a real-time clock, random number generator, terminal device, block device, system controller, timer device, programmable interrupt controller, as well as an inter-processor interrupt controller.
+A more detailed outline of LupIO can be found here: <https://luplab.cs.ucdavis.edu/assets/lupio/wcae21-porquet-lupio-paper.pdf>.
+Within gem5, these devices offer the capability to run simulations with a complete set of I/O devices that are both easy to understand and manipulate.
+
+The initial implementation of the LupIO devices is for the RISC-V ISA.
+However, they should be simple to extend to other ISAs through small source changes and updating the SConscripts.
+
+## Other improvements
+
+- Removed master/slave terminology: the ticket had been closed and marked as done even though the config scripts still contained multiple references to master/slave; these have now been fixed.
+- Armv8.2-A FEAT_UAO implementation.
+- Implemented 'at' variants of file syscall in SE mode (<https://gem5.atlassian.net/browse/GEM5-1098>).
+- Improved modularity in SConscripts.
+- Arm atomic support in the CHI protocol.
+- Many testing improvements.
+- New "tester" CPU which mimics GUPS.
+
 # Version 21.1.0.2

 **[HOTFIX]** [A commit introduced `std::vector` with `resize()` to initialize all storages](https://gem5-review.googlesource.com/c/public/gem5/+/27085).
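As a rough illustration of the `sim_objects` and `enums` parameters described in the 21.2.0.0 notes, the sketch below shows what a declaration might look like. SConscript files are Python, but `SimObject` is normally supplied by gem5's SCons build environment, so a stand-in stub is defined here purely so the snippet can run on its own. The file name `MyDevice.py`, the class `MyDevice`, and the enum `MyDeviceMode` are hypothetical.

```python
# Stand-in for the SimObject() function that gem5's build system makes
# available inside SConscript files. This stub only records the call; it is
# an assumption for illustration, not gem5's implementation.
declared = []

def SimObject(source, sim_objects=None, enums=None):
    declared.append({
        "source": source,
        "sim_objects": list(sim_objects or []),
        "enums": list(enums or []),
    })

# Hypothetical SConscript contents: MyDevice.py defines one SimObject class
# with a `type` attribute (MyDevice) and one Enum type (MyDeviceMode).
SimObject("MyDevice.py", sim_objects=["MyDevice"], enums=["MyDeviceMode"])
```

Only the keyword names `sim_objects` and `enums` come from the release notes; everything else is a placeholder.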
--- a/6
+++ b/6
@@ -348,12 +348,6 @@ if main['GCC'] or main['CLANG']:
    if GetOption('gold_linker'):
        main.Append(LINKFLAGS='-fuse-ld=gold')

-    # Treat warnings as errors but white list some warnings that we
-    # want to allow (e.g., deprecation warnings).
-    main.Append(CCFLAGS=['-Werror',
-                         '-Wno-error=deprecated-declarations',
-                         '-Wno-error=deprecated',
-                        ])
 else:
    error('\n'.join((
          "Don't know what compiler options to use for your compiler.",
--- a/configs/example/gem5_library/x86-ubuntu-run-with-kvm.py
+++ b/configs/example/gem5_library/x86-ubuntu-run-with-kvm.py
@@ -121,7 +121,7 @@ board.set_kernel_disk_workload(
    kernel=Resource("x86-linux-kernel-5.4.49"),
    # The x86 ubuntu image will be automatically downloaded to the if not
    # already present.
-    disk_image=Resource("x86-ubuntu-img"),
+    disk_image=Resource("x86-ubuntu-18.04-img"),
    readfile_contents=command,
 )

--- a/configs/example/gem5_library/x86-ubuntu-run.py
+++ b/configs/example/gem5_library/x86-ubuntu-run.py
@@ -58,7 +58,7 @@ board = X86DemoBoard()
 # downloaded.
 board.set_kernel_disk_workload(
    kernel=Resource("x86-linux-kernel-5.4.49"),
-    disk_image=Resource("x86-ubuntu-img"),
+    disk_image=Resource("x86-ubuntu-18.04-img"),
 )

 simulator = Simulator(board=board)
--- a/configs/ruby/CHI_config.py
+++ b/configs/ruby/CHI_config.py
@@ -360,7 +360,7 @@ class CPUSequencerWrapper:
            if str(p) != 'icache_port':
                exec('cpu.%s = self.data_seq.in_ports' % p)
        cpu.connectUncachedPorts(
-            self.data_seq.in_ports, self.data_seq.out_ports)
+            self.data_seq.in_ports, self.data_seq.interrupt_out_port)

    def connectIOPorts(self, piobus):
        self.data_seq.connectIOPorts(piobus)
--- a/ext/sst/README.md
+++ b/ext/sst/README.md
@@ -62,7 +62,7 @@ See `INSTALL.md`.
 Downloading the built bootloader containing a Linux Kernel and a workload,

 ```sh
-wget http://dist.gem5.org/dist/develop/misc/riscv/bbl-busybox-boot-exit
+wget http://dist.gem5.org/dist/v21-2/misc/riscv/bbl-busybox-boot-exit
 ```

 Running the simulation
@@ -78,7 +78,7 @@ the `bbl-busybox-boot-exit` resource, which contains an m5 binary, and
 `m5 exit` will be called upon the booting process reaching the early userspace.
 More information about building a bootloader containing a Linux Kernel and a
 customized workload is available at
-[https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/develop/src/riscv-boot-exit-nodisk/].
+[https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/riscv-boot-exit-nodisk/].

 ## Running an example simulation (Arm)

@@ -87,7 +87,7 @@ extract them under the $M5_PATH directory (make sure M5_PATH points to a valid
 directory):

 ```sh
-wget http://dist.gem5.org/dist/develop/arm/aarch-sst-20211207.tar.bz2
+wget http://dist.gem5.org/dist/v21-2/arm/aarch-sst-20211207.tar.bz2
 tar -xf aarch-sst-20211207.tar.bz2

 # copying bootloaders
--- a/ext/testlib/configuration.py
+++ b/ext/testlib/configuration.py
@@ -213,7 +213,7 @@ def define_defaults(defaults):
                                                      os.pardir,
                                                      os.pardir))
    defaults.result_path = os.path.join(os.getcwd(), 'testing-results')
-    defaults.resource_url = 'http://dist.gem5.org/dist/develop'
+    defaults.resource_url = 'http://dist.gem5.org/dist/v21-2'
    defaults.resource_path = os.path.abspath(os.path.join(defaults.base_dir,
                                            'tests',
                                            'gem5',
--- a/site_scons/gem5_scons/configure.py
+++ b/site_scons/gem5_scons/configure.py
@@ -48,7 +48,10 @@ def CheckCxxFlag(context, flag, autoadd=True):
    context.Message("Checking for compiler %s support... " % flag)
    last_cxxflags = context.env['CXXFLAGS']
    context.env.Append(CXXFLAGS=[flag])
+    pre_werror = context.env['CXXFLAGS']
+    context.env.Append(CXXFLAGS=['-Werror'])
    ret = context.TryCompile('// CheckCxxFlag DO NOTHING', '.cc')
+    context.env['CXXFLAGS'] = pre_werror
    if not (ret and autoadd):
        context.env['CXXFLAGS'] = last_cxxflags
    context.Result(ret)
@@ -58,7 +61,10 @@ def CheckLinkFlag(context, flag, autoadd=True, set_for_shared=True):
    context.Message("Checking for linker %s support... " % flag)
    last_linkflags = context.env['LINKFLAGS']
    context.env.Append(LINKFLAGS=[flag])
+    pre_werror = context.env['LINKFLAGS']
+    context.env.Append(LINKFLAGS=['-Werror'])
    ret = context.TryLink('int main(int, char *[]) { return 0; }', '.cc')
+    context.env['LINKFLAGS'] = pre_werror
    if not (ret and autoadd):
        context.env['LINKFLAGS'] = last_linkflags
    if (ret and set_for_shared):
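The change to `CheckCxxFlag` and `CheckLinkFlag` adds `-Werror` only for the duration of the probe, presumably so that a flag the toolchain merely warns about is reported as unsupported now that `-Werror` is no longer appended globally. The save/probe/restore pattern can be sketched in plain Python as follows; `check_flag`, the dictionary-based `env`, and `fake_build` are hypothetical stand-ins for the SCons configure context, not part of gem5.

```python
def check_flag(env, key, flag, try_build, autoadd=True):
    """Probe `flag`, treating warnings as errors only during the probe."""
    last_flags = list(env[key])          # state before the probe
    env[key] = last_flags + [flag]       # add the candidate flag
    pre_werror = list(env[key])          # copy, so later changes cannot alias it
    env[key] = pre_werror + ["-Werror"]  # warnings become failures for the probe
    ok = try_build(env[key])
    env[key] = pre_werror                # drop -Werror again
    if not (ok and autoadd):
        env[key] = last_flags            # roll the candidate flag back
    return ok


def fake_build(flags):
    # Hypothetical toolchain: it only warns about -fbogus, so the build
    # fails only when -Werror is also present.
    return not ("-fbogus" in flags and "-Werror" in flags)


env = {"CXXFLAGS": ["-O2"]}
kept = check_flag(env, "CXXFLAGS", "-Wall", fake_build)
rejected = check_flag(env, "CXXFLAGS", "-fbogus", fake_build)
print(kept, rejected, env["CXXFLAGS"])
```

In this sketch the accepted flag stays in the environment, the rejected one is rolled back, and `-Werror` never leaks into the final flag list.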
--- a/src/Doxyfile
+++ b/src/Doxyfile
@@ -31,7 +31,7 @@ PROJECT_NAME           = gem5
 # This could be handy for archiving the generated documentation or 
 # if some version control system is used.

-PROJECT_NUMBER         = DEVELOP-FOR-V21-2
+PROJECT_NUMBER         = v21.2.0.0

 # The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute) 
 # base path where the generated documentation will be put. 
--- a/src/arch/amdgpu/gcn3/gpu_mem_helpers.hh
+++ b/src/arch/amdgpu/gcn3/gpu_mem_helpers.hh
@@ -107,7 +107,8 @@ initMemReqHelper(GPUDynInstPtr gpuDynInst, MemCmd mem_req_type,
                pkt1->dataStatic(&(reinterpret_cast<T*>(
                    gpuDynInst->d_data))[lane * N]);
                pkt2->dataStatic(&(reinterpret_cast<T*>(
-                    gpuDynInst->d_data))[lane * N + req1->getSize()]);
+                    gpuDynInst->d_data))[lane * N +
+                                         req1->getSize()/sizeof(T)]);
                DPRINTF(GPUMem, "CU%d: WF[%d][%d]: index: %d unaligned memory "
                        "request for %#x\n", gpuDynInst->cu_id,
                        gpuDynInst->simdId, gpuDynInst->wfSlotId, lane,
--- a/src/arch/amdgpu/vega/gpu_mem_helpers.hh
+++ b/src/arch/amdgpu/vega/gpu_mem_helpers.hh
@@ -107,7 +107,8 @@ initMemReqHelper(GPUDynInstPtr gpuDynInst, MemCmd mem_req_type,
                pkt1->dataStatic(&(reinterpret_cast<T*>(
                    gpuDynInst->d_data))[lane * N]);
                pkt2->dataStatic(&(reinterpret_cast<T*>(
-                    gpuDynInst->d_data))[lane * N + req1->getSize()]);
+                    gpuDynInst->d_data))[lane * N +
+                                         req1->getSize()/sizeof(T)]);
                DPRINTF(GPUMem, "CU%d: WF[%d][%d]: index: %d unaligned memory "
                        "request for %#x\n", gpuDynInst->cu_id,
                        gpuDynInst->simdId, gpuDynInst->wfSlotId, lane,
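The fix in both `gpu_mem_helpers.hh` files divides `req1->getSize()` by `sizeof(T)` because the value is used as an index into a `T*` array, while the request size is in bytes. The arithmetic can be checked with a small Python sketch; the 4-byte element type, `N = 4`, and the 12-byte first request are assumed values chosen for illustration, and `second_packet_index` is a hypothetical helper.

```python
import struct

ELEM = "<I"                        # stand-in for T: a 4-byte unsigned int
SIZEOF_T = struct.calcsize(ELEM)   # plays the role of sizeof(T)


def second_packet_index(lane, n, first_req_bytes, scale_by_sizeof=True):
    """Element index into d_data where the second packet's data begins."""
    if scale_by_sizeof:
        offset = first_req_bytes // SIZEOF_T   # bytes converted to elements
    else:
        offset = first_req_bytes               # pre-fix behaviour: raw bytes
    return lane * n + offset


# One lane holding N = 4 elements; the first request covers 12 of 16 bytes.
d_data = struct.pack("<4I", 10, 11, 12, 13)
fixed = second_packet_index(lane=0, n=4, first_req_bytes=12)
old = second_packet_index(lane=0, n=4, first_req_bytes=12,
                          scale_by_sizeof=False)

print(fixed)  # 3: the element holding the remaining 4 bytes
print(old)    # 12: well past the end of a 4-element buffer
print(struct.unpack_from(ELEM, d_data, fixed * SIZEOF_T)[0])  # 13
```

With the byte count used directly as an index, the second packet would point outside the lane's data whenever `sizeof(T)` is larger than one byte.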
--- a/src/base/version.cc
+++ b/src/base/version.cc
@@ -32,6 +32,6 @@ namespace gem5
 /**
 * @ingroup api_base_utils
 */
-const char *gem5Version = "[DEVELOP-FOR-V21.2]";
+const char *gem5Version = "21.2.0.0";

 } // namespace gem5
--- a/src/python/gem5/resources/downloader.py
+++ b/src/python/gem5/resources/downloader.py
@@ -41,13 +41,16 @@ This Python module contains functions used to download, list, and obtain
 information about resources from resources.gem5.org.
 """

+def _resources_json_version_required() -> str:
+    """
+    Specifies the version of resources.json to obtain.
+    """
+    return "21.2"

 def _get_resources_json_uri() -> str:
-    # TODO: This is hardcoded to develop. This will need updated for each
-    # release to the stable branch.
    uri = (
        "https://gem5.googlesource.com/public/gem5-resources/"
-        + "+/refs/heads/develop/resources.json?format=TEXT"
+        + "+/refs/heads/stable/resources.json?format=TEXT"
    )

    return uri
@@ -64,8 +67,27 @@ def _get_resources_json() -> Dict:
    # text. Therefore when we open the URL we receive the JSON in base64
    # format. Conversion is needed before it can be loaded.
    with urllib.request.urlopen(_get_resources_json_uri()) as url:
-        return json.loads(base64.b64decode(url.read()).decode("utf-8"))
+        to_return = json.loads(base64.b64decode(url.read()).decode("utf-8"))

+    # If the current version pulled is not correct, look up the
+    # "previous-versions" field to find the correct one.
+    version = _resources_json_version_required()
+    if to_return["version"] != version:
+        if version in to_return["previous-versions"].keys():
+            with urllib.request.urlopen(
+                    to_return["previous-versions"][version]
+                ) as url:
+                to_return = json.loads(
+                    base64.b64decode(url.read()).decode("utf-8")
+                )
+        else:
+            # This should never happen, but we throw an exception to explain
+            # that we can't find the version.
+            raise Exception(
+                f"Version '{version}' of resources.json cannot be found."
+                )
+
+    return to_return
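The version lookup added to `_get_resources_json` can be exercised without network access by separating the selection logic from the download. The sketch below is one way to do that under stated assumptions: `select_resources_json` and its `fetch` parameter are hypothetical names, and the URL and JSON contents are made-up test data. Only the `version` and `previous-versions` fields, and the base64 decoding step, come from the diff.

```python
import base64
import json


def select_resources_json(current, required, fetch):
    """Return the resources.json dict whose version matches `required`."""
    if current["version"] == required:
        return current
    previous = current["previous-versions"]
    if required not in previous:
        raise Exception(
            f"Version '{required}' of resources.json cannot be found."
        )
    raw = fetch(previous[required])   # base64 text, as in the original code
    return json.loads(base64.b64decode(raw).decode("utf-8"))


# Hypothetical data: the stable branch has moved on to a newer version.
old_json = {"version": "21.2"}
url = "https://example.invalid/resources-21.2.json"
store = {url: base64.b64encode(json.dumps(old_json).encode("utf-8"))}
stable = {"version": "22.0", "previous-versions": {"21.2": url}}

picked = select_resources_json(stable, "21.2", store.__getitem__)
print(picked["version"])  # 21.2
```

Passing the fetch step in as a callable keeps the fallback path testable with an in-memory dictionary instead of `urllib.request.urlopen`.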

 def _get_url_base() -> str:
    """
--- a/tests/compiler-tests.sh
+++ b/tests/compiler-tests.sh
@@ -99,7 +99,7 @@ for compiler in ${images[@]}; do
    # targets for this test
    build_indices=(${build_permutation[@]:0:$builds_count})

-    repo_name="${base_url}/${compiler}:latest"
+    repo_name="${base_url}/${compiler}:v21-2"

    # Grab compiler image
    docker pull $repo_name >/dev/null
--- a/tests/gem5/configs/boot_kvm_fork_run.py
+++ b/tests/gem5/configs/boot_kvm_fork_run.py
@@ -203,7 +203,7 @@ motherboard.set_kernel_disk_workload(
        resource_directory=args.resource_directory,
    ),
    disk_image=Resource(
-        "x86-ubuntu-img",
+        "x86-ubuntu-18.04-img",
        resource_directory=args.resource_directory,
    ),
    readfile_contents=dedent(
--- a/tests/gem5/configs/boot_kvm_switch_exit.py
+++ b/tests/gem5/configs/boot_kvm_switch_exit.py
@@ -188,7 +188,7 @@ motherboard.set_kernel_disk_workload(
        resource_directory=args.resource_directory,
    ),
    disk_image=Resource(
-        "x86-ubuntu-img",
+        "x86-ubuntu-18.04-img",
        resource_directory=args.resource_directory,
    ),
    # The first exit signals to switch processors.
--- a/tests/gem5/configs/x86_boot_exit_run.py
+++ b/tests/gem5/configs/x86_boot_exit_run.py
@@ -207,7 +207,7 @@ motherboard.set_kernel_disk_workload(
        resource_directory=args.resource_directory,
    ),
    disk_image=Resource(
-        "x86-ubuntu-img",
+        "x86-ubuntu-18.04-img",
        resource_directory=args.resource_directory,
    ),
    kernel_args=kernal_args,
--- a/tests/gem5/cpu_tests/test.py
+++ b/tests/gem5/cpu_tests/test.py
@@ -57,7 +57,7 @@ valid_isas = {

 base_path = joinpath(config.bin_path, 'cpu_tests')

-base_url = config.resource_url + '/gem5/cpu_tests/benchmarks/bin/'
+base_url = config.resource_url + '/test-progs/cpu-tests/bin/'

 isa_url = {
    constants.gcn3_x86_tag : base_url + "x86",
--- a/tests/jenkins/presubmit.cfg
+++ b/tests/jenkins/presubmit.cfg
@@ -3,4 +3,4 @@
 # Location of the continuous batch script in repository.
 build_file: "jenkins-gem5-prod/tests/jenkins/presubmit.sh"

-timeout_mins: 360 # 6 hours
+timeout_mins: 420 # 7 hours
--- a/tests/jenkins/presubmit.sh
+++ b/tests/jenkins/presubmit.sh
@@ -37,8 +37,8 @@

 set -e

-DOCKER_IMAGE_ALL_DEP=gcr.io/gem5-test/ubuntu-20.04_all-dependencies
-DOCKER_IMAGE_CLANG_COMPILE=gcr.io/gem5-test/clang-version-9
+DOCKER_IMAGE_ALL_DEP=gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2
+DOCKER_IMAGE_CLANG_COMPILE=gcr.io/gem5-test/clang-version-9:v21-2
 PRESUBMIT_STAGE2=tests/jenkins/presubmit-stage2.sh
 GEM5ART_TESTS=tests/jenkins/gem5art-tests.sh

--- a/tests/nightly.sh
+++ b/tests/nightly.sh
@@ -46,6 +46,18 @@ if [[ $# -gt 1 ]]; then
    run_threads=$2
 fi

+# The third argument is the GPU ISA to run. If no argument is given we default
+# to GCN3_X86.
+gpu_isa=GCN3_X86
+if [[ $# -gt 2 ]]; then
+    gpu_isa=$3
+fi
+
+if [[ "$gpu_isa" != "GCN3_X86" ]] && [[ "$gpu_isa" != "VEGA_X86" ]]; then
+    echo "Invalid GPU ISA: $gpu_isa"
+    exit 1
+fi
+
 build_target () {
    isa=$1

@@ -53,7 +65,8 @@ build_target () {
    # SCons is not perfect, and occasionally does not catch a necessary
    # compilation: https://gem5.atlassian.net/browse/GEM5-753
    docker run -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
-        "${gem5_root}" --rm gcr.io/gem5-test/ubuntu-20.04_all-dependencies \
+        "${gem5_root}" --rm \
+        gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2 \
            bash -c "scons build/${isa}/gem5.opt -j${compile_threads} \
                || (rm -rf build && scons build/${isa}/gem5.opt -j${compile_threads})"
 }
@@ -62,12 +75,13 @@ unit_test () {
    build=$1

    docker run -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
-        "${gem5_root}" --rm gcr.io/gem5-test/ubuntu-20.04_all-dependencies \
+        "${gem5_root}" --rm \
+        gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2 \
            scons build/NULL/unittests.${build} -j${compile_threads}
 }

 # Ensure we have the latest docker images.
-docker pull gcr.io/gem5-test/ubuntu-20.04_all-dependencies
+docker pull gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2

 # Try to build the ISA targets.
 build_target NULL
@@ -84,19 +98,19 @@ unit_test debug

 # Run the gem5 long tests.
 docker run -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
-    "${gem5_root}"/tests --rm gcr.io/gem5-test/ubuntu-20.04_all-dependencies \
+    "${gem5_root}"/tests --rm \
+    gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2 \
        ./main.py run --length long -j${compile_threads} -t${run_threads} -vv

-# Run the GPU tests.
-# For the GPU tests we compile and run GCN3_X86 inside a gcn-gpu container.
-docker pull gcr.io/gem5-test/gcn-gpu:latest
+# For the GPU tests we compile and run the GPU ISA inside a gcn-gpu container.
+docker pull gcr.io/gem5-test/gcn-gpu:v21-2
 docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
-    "${gem5_root}" gcr.io/gem5-test/gcn-gpu:latest bash -c \
-    "scons build/GCN3_X86/gem5.opt -j${compile_threads} \
-        || (rm -rf build && scons build/GCN3_X86/gem5.opt -j${compile_threads})"
+    "${gem5_root}" gcr.io/gem5-test/gcn-gpu:v21-2  bash -c \
+    "scons build/${gpu_isa}/gem5.opt -j${compile_threads} \
+        || (rm -rf build && scons build/${gpu_isa}/gem5.opt -j${compile_threads})"

 # get square
-wget -qN http://dist.gem5.org/dist/develop/test-progs/square/square
+wget -qN http://dist.gem5.org/dist/v21-2/test-progs/square/square

 mkdir -p tests/testing-results

@@ -104,18 +118,18 @@ mkdir -p tests/testing-results
 # Thus, we always want to run this in the nightly regressions to make sure
 # basic GPU functionality is working.
 docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
-    "${gem5_root}" gcr.io/gem5-test/gcn-gpu:latest build/GCN3_X86/gem5.opt \
+    "${gem5_root}" gcr.io/gem5-test/gcn-gpu:v21-2  build/${gpu_isa}/gem5.opt \
    configs/example/apu_se.py --reg-alloc-policy=dynamic -n3 -c square

 # get HeteroSync
-wget -qN http://dist.gem5.org/dist/develop/test-progs/heterosync/gcn3/allSyncPrims-1kernel
+wget -qN http://dist.gem5.org/dist/v21-2/test-progs/heterosync/gcn3/allSyncPrims-1kernel

 # run HeteroSync sleepMutex -- 16 WGs (4 per CU in default config), each doing
 # 10 Ld/St per thread and 4 iterations of the critical section is a reasonable
 # moderate contention case for the default 4 CU GPU config and help ensure GPU
 # atomics are tested.
 docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
-    "${gem5_root}" gcr.io/gem5-test/gcn-gpu:latest build/GCN3_X86/gem5.opt \
+    "${gem5_root}" gcr.io/gem5-test/gcn-gpu:v21-2 build/${gpu_isa}/gem5.opt \
    configs/example/apu_se.py --reg-alloc-policy=dynamic -n3 -c \
    allSyncPrims-1kernel --options="sleepMutex 10 16 4"

@@ -125,7 +139,7 @@ docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
 # moderate contention case for the default 4 CU GPU config and help ensure GPU
 # atomics are tested.
 docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
-    "${gem5_root}" gcr.io/gem5-test/gcn-gpu:latest build/GCN3_X86/gem5.opt \
+    "${gem5_root}" gcr.io/gem5-test/gcn-gpu:v21-2  build/${gpu_isa}/gem5.opt \
    configs/example/apu_se.py --reg-alloc-policy=dynamic -n3 -c \
    allSyncPrims-1kernel --options="lfTreeBarrUniq 10 16 4"

@@ -135,7 +149,7 @@ build_and_run_SST () {
    variant=$2

    docker run -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
-        "${gem5_root}" --rm gcr.io/gem5-test/sst-env \
+        "${gem5_root}" --rm gcr.io/gem5-test/sst-env:v21-2 \
            bash -c "\
 scons build/${isa}/libgem5_${variant}.so -j${compile_threads} --without-tcmalloc; \
 cd ext/sst; \
--- a/tests/weekly.sh
+++ b/tests/weekly.sh
@@ -32,16 +32,31 @@ set -x
 dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
 gem5_root="${dir}/.."

-# We assume the lone argument is the number of threads. If no argument is
-# given we default to one.
+# We assume the first two arguments are the number of threads followed by the
+# GPU ISA to test. These default to 1 and GCN3_X86 if no arguments are given.
 threads=1
-if [[ $# -gt 0 ]]; then
+gpu_isa=GCN3_X86
+if [[ $# -eq 1 ]]; then
    threads=$1
+elif [[ $# -eq 2 ]]; then
+    threads=$1
+    gpu_isa=$2
+else
+    if [[ $# -gt 0 ]]; then
+        echo "Invalid number of arguments: $#"
+        exit 1
+    fi
+fi
+
+if [[ "$gpu_isa" != "GCN3_X86" ]] && [[ "$gpu_isa" != "VEGA_X86" ]]; then
+    echo "Invalid GPU ISA: $gpu_isa"
+    exit 1
 fi

 # Run the gem5 very-long tests.
 docker run -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
-    "${gem5_root}"/tests --rm gcr.io/gem5-test/ubuntu-20.04_all-dependencies \
+    "${gem5_root}"/tests --rm \
+    gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2 \
        ./main.py run --length very-long -j${threads} -t${threads} -vv

 mkdir -p tests/testing-results
@@ -49,7 +64,7 @@ mkdir -p tests/testing-results
 # GPU weekly tests start here
 # before pulling gem5 resources, make sure it doesn't exist already
 docker run --rm --volume "${gem5_root}":"${gem5_root}" -w \
-       "${gem5_root}" gcr.io/gem5-test/gcn-gpu:latest bash -c \
+       "${gem5_root}" gcr.io/gem5-test/gcn-gpu:v21-2 bash -c \
       "rm -rf ${gem5_root}/gem5-resources"
 # delete Pannotia datasets and output files in case a failed regression run left
 # them around
@@ -61,21 +76,42 @@ rm -f coAuthorsDBLP.graph 1k_128k.gr result.out
 # Moreover, DNNMark builds a library and thus doesn't have a binary, so we
 # need to build it before we run it.
 # Need to pull this first because HACC's docker requires this path to exist
-git clone -b develop https://gem5.googlesource.com/public/gem5-resources \
+git clone https://gem5.googlesource.com/public/gem5-resources \
    "${gem5_root}/gem5-resources"

-# For the GPU tests we compile and run GCN3_X86 inside a gcn-gpu container.
+
+# The following script is to ensure these tests are runnable as the resources
+# directory changes over time. The gem5 resources repository stable branch is
+# tagged upon the new release for that of the previous release. For example,
+# when v22.0 is released, the stable branch will be tagged with "v21.2.X.X"
+# prior to the merging of the develop/staging branch into the stable branch.
+# This is so a user may revert the gem5-resources sources back to a state
+# compatible with a particular major release.
+#
+# To ensure the v21.2 version of these tests continues to run as future
+# versions are released, we run this check. If there's been another release,
+# we checkout the correct version of gem5 resources.
+cd "${gem5_root}/gem5-resources"
+version_tag=$(git tag | grep "v21.2")
+
+if [[ ${version_tag} != "" ]]; then
+       git checkout "${version_tag}"
+fi
+
+cd "${gem5_root}"
+
+# For the GPU tests we compile and run the GPU ISA inside a gcn-gpu container.
 # HACC requires setting numerous environment variables to run correctly.  To
 # avoid needing to set all of these, we instead build a docker for it, which
 # has all these variables pre-set in its Dockerfile
 # To avoid compiling gem5 multiple times, all GPU benchmarks will use this
-docker pull gcr.io/gem5-test/gcn-gpu:latest
+docker pull gcr.io/gem5-test/gcn-gpu:v21-2
 docker build -t hacc-test-weekly ${gem5_root}/gem5-resources/src/gpu/halo-finder

 docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
    "${gem5_root}" hacc-test-weekly bash -c \
-    "scons build/GCN3_X86/gem5.opt -j${threads} \
-        || rm -rf build && scons build/GCN3_X86/gem5.opt -j${threads}"
+    "scons build/${gpu_isa}/gem5.opt -j${threads} \
+        || (rm -rf build && scons build/${gpu_isa}/gem5.opt -j${threads})"

 # Some of the apps we test use m5ops (and x86), so compile them for x86
 # Note: setting TERM in the environment is necessary as scons fails for m5ops if
@@ -94,7 +130,7 @@ docker run --rm --volume "${gem5_root}":"${gem5_root}" -w \
 # LULESH is heavily used in the HPC community on GPUs, and does a good job of
 # stressing several GPU compute and memory components
 docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
-    "${gem5_root}" hacc-test-weekly build/GCN3_X86/gem5.opt \
+    "${gem5_root}" hacc-test-weekly build/${gpu_isa}/gem5.opt \
    configs/example/apu_se.py -n3 --mem-size=8GB --reg-alloc-policy=dynamic \
    --benchmark-root="${gem5_root}/gem5-resources/src/gpu/lulesh/bin" -c lulesh

@@ -136,7 +172,7 @@ docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \
 docker run --rm --volume "${gem5_root}":"${gem5_root}" -v \
       "${gem5_root}/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0" \
       -w "${gem5_root}/gem5-resources/src/gpu/DNNMark" hacc-test-weekly \
-       "${gem5_root}/build/GCN3_X86/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \
+       "${gem5_root}/build/${gpu_isa}/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \
       --reg-alloc-policy=dynamic \
       --benchmark-root="${gem5_root}/gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_softmax" \
       -c dnnmark_test_fwd_softmax \
@@ -146,7 +182,7 @@ docker run --rm --volume "${gem5_root}":"${gem5_root}" -v \
 docker run --rm --volume "${gem5_root}":"${gem5_root}" -v \
       "${gem5_root}/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0" \
       -w "${gem5_root}/gem5-resources/src/gpu/DNNMark" hacc-test-weekly \
-       "${gem5_root}/build/GCN3_X86/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \
+       "${gem5_root}/build/${gpu_isa}/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \
       --reg-alloc-policy=dynamic \
       --benchmark-root="${gem5_root}/gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_pool" \
       -c dnnmark_test_fwd_pool \
@@ -156,7 +192,7 @@ docker run --rm --volume "${gem5_root}":"${gem5_root}" -v \
 docker run --rm --volume "${gem5_root}":"${gem5_root}" -v \
       "${gem5_root}/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0" \
       -w "${gem5_root}/gem5-resources/src/gpu/DNNMark" hacc-test-weekly \
-       "${gem5_root}/build/GCN3_X86/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \
+       "${gem5_root}/build/${gpu_isa}/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \
       --reg-alloc-policy=dynamic \
       --benchmark-root="${gem5_root}/gem5-resources/src/gpu/DNNMark/build/benchmarks/test_bwd_bn" \
       -c dnnmark_test_bwd_bn \
@@ -172,7 +208,7 @@ docker run --rm -v ${PWD}:${PWD} -w \
 # Like LULESH, HACC is heavily used in the HPC community and is used to stress
 # the GPU memory system
 docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \
-       hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \
+       hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \
       ${gem5_root}/configs/example/apu_se.py -n3 --reg-alloc-policy=dynamic \
       --benchmark-root=${gem5_root}/gem5-resources/src/gpu/halo-finder/src/hip \
       -c ForceTreeTest --options="0.5 0.1 64 0.1 1 N 12 rcb"
@@ -189,10 +225,10 @@ docker run --rm -v ${PWD}:${PWD} \
       "export GEM5_PATH=${gem5_root} ; make gem5-fusion"

 # # get input dataset for BC test
-wget http://dist.gem5.org/dist/develop/datasets/pannotia/bc/1k_128k.gr
+wget http://dist.gem5.org/dist/v21-2/datasets/pannotia/bc/1k_128k.gr
 # run BC
 docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \
-       hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \
+       hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \
       ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \
       --reg-alloc-policy=dynamic \
       --benchmark-root=gem5-resources/src/gpu/pannotia/bc/bin -c bc.gem5 \
@@ -206,7 +242,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \

 # run Color (Max) (use same input dataset as BC for faster testing)
 docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \
-       hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \
+       hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \
       ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \
       --reg-alloc-policy=dynamic \
       --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/color/bin \
@@ -220,7 +256,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \

 # run Color (MaxMin) (use same input dataset as BC for faster testing)
 docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \
-       hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \
+       hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \
       ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \
       --reg-alloc-policy=dynamic \
       --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/color/bin \
@@ -234,7 +270,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \

 # run FW (use same input dataset as BC for faster testing)
 docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \
-       hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \
+       hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \
       ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \
       --reg-alloc-policy=dynamic \
       --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/fw/bin \
@@ -248,7 +284,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \

 # run MIS (use same input dataset as BC for faster testing)
 docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \
-       hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \
+       hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \
       ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \
       --reg-alloc-policy=dynamic \
       --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/mis/bin \
@@ -261,10 +297,10 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \
       "export GEM5_PATH=${gem5_root} ; make gem5-fusion"

 # get PageRank input dataset
-wget http://dist.gem5.org/dist/develop/datasets/pannotia/pagerank/coAuthorsDBLP.graph
+wget http://dist.gem5.org/dist/v21-2/datasets/pannotia/pagerank/coAuthorsDBLP.graph
 # run PageRank (Default)
 docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \
-       hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \
+       hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \
       ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \
       --reg-alloc-policy=dynamic \
       --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/pagerank/bin \
@@ -278,7 +314,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \

 # run PageRank (SPMV)
 docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \
-       hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \
+       hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \
       ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \
       --reg-alloc-policy=dynamic \
       --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/pagerank/bin \
@@ -292,7 +328,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \

 # run SSSP (CSR) (use same input dataset as BC for faster testing)
 docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \
-       hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \
+       hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \
       ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \
       --reg-alloc-policy=dynamic \
       --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/sssp/bin \
@@ -306,7 +342,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \

 # run SSSP (ELL) (use same input dataset as BC for faster testing)
 docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \
-       hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \
+       hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \
       ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \
       --reg-alloc-policy=dynamic \
       --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/sssp/bin \
--- a/util/dockerfiles/gcn-gpu/Dockerfile
+++ b/util/dockerfiles/gcn-gpu/Dockerfile
@@ -70,7 +70,7 @@ RUN git clone -b rocm-4.0.0 \

 WORKDIR /ROCclr
 # The patch allows us to avoid building blit kernels on-the-fly in gem5
-RUN wget -q -O - dist.gem5.org/dist/develop/rocm_patches/ROCclr.patch | git apply -v
+RUN wget -q -O - dist.gem5.org/dist/v21-2/rocm_patches/ROCclr.patch | git apply -v

 WORKDIR /ROCclr/build
 RUN cmake -DOPENCL_DIR="/ROCm-OpenCL-Runtime" \