diff --git a/RELEASE-NOTES.md b/RELEASE-NOTES.md index 6f672a88e5..b15e6cb80e 100644 --- a/RELEASE-NOTES.md +++ b/RELEASE-NOTES.md @@ -1,3 +1,70 @@ +# Version 21.2.0.0 + +## API (user-facing) changes + +All `SimObject` declarations in SConscript files now require a `sim_objects` parameter which should list all SimObject classes declared in that file which need c++ wrappers. +Those are the SimObject classes which have a `type` attribute defined. + +Also, there is now an optional `enums` parameter which needs to list all of the Enum types defined in that SimObject file. +This should technically only include Enum types which generate c++ wrapper files, but currently all Enums do that so all Enums should be listed. + +## Initial release of the "gem5 standard library" + +Previous release had an alpha release of the "components library." +This has now been wrapped in a larger "standard library". + +The *gem5 standard library* is a Python package which contains the following: + +- **Components:** A set of Python classes which wrap gem5's models. Some of the components are preconfigured to match real hardware (e.g., `SingleChannelDDR3_1600`) and others are parameterized. Components can be combined together onto *boards* which can be simulated. +- **Resources:** A set of utilities to interact with the gem5-resources repository/website. Using this module allows you to *automatically* download and use many of gem5's prebuilt resources (e.g., kernels, disk images, etc.). +- **Simulate:** *THIS MODULE IS IN BETA!* A simpler interface to gem5's simulation/run capabilities. Expect API changes to this module in future releases. Feedback is appreciated. +- **Prebuilt**: These are fully functioning prebuilt systems. These systems are built from the components in `components`. This release has a "demo" board to show an example of how to use the prebuilt systems. + +Examples of using the gem5 standard library can be found in `configs/example/gem5_library/`. 
+The source code is found under `src/python/gem5`. + +## Many Arm improvements + +- [Improved configurability for Arm architectural extensions](https://gem5.atlassian.net/browse/GEM5-1132): we have improved how to enable/disable architectural extensions for an Arm system. Rather than working with independent boolean values, we now use a unified ArmRelease object modelling the architectural features supported by an FS/SE Arm simulation. +- [Arm TLB can store partial entries](https://gem5.atlassian.net/browse/GEM5-1108): It is now possible to configure an ArmTLB as a walk cache: storing intermediate PAs obtained during a translation table walk. +- [Implemented a multilevel TLB hierarchy](https://gem5.atlassian.net/browse/GEM5-790): enabling users to compose/model a customizable multilevel TLB hierarchy in gem5. The default Arm MMU now has an Instruction L1 TLB, a Data L1 TLB and a Unified (Instruction + Data) L2 TLB. +- Provided an Arm example script for the gem5-SST integration (). + +## GPU improvements + +- Vega support: gfx900 (Vega) discrete GPUs are now both supported and tested with [gem5-resources applications](https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/gpu/). +- Improvements to the VIPER coherence protocol to fix bugs and improve performance: this improves scalability for large applications running on relatively small GPU configurations, which caused deadlocks in VIPER's L2. Instead of continually replaying these requests, the updated protocol wakes up the pending requests once the prior request to this cache line has completed. +- Additional GPU applications: The [Pannotia graph analytics benchmark suite](https://github.com/pannotia/pannotia) has been added to gem5-resources, including Makefiles, READMEs, and sample commands on how to run each application in gem5. 
+- Regression Testing: Several GPU applications are now tested as part of the nightly and weekly regressions, which improves test coverage and helps avoid inadvertently introducing bugs. +- Minor updates to the architecture model: We also added several small changes/fixes to the HSA queue size (to allow larger GPU applications with many kernels to run), the TLB (to create GCN3- and Vega-specific TLBs), adding new instructions that were previously unimplemented in GCN3 and Vega, and fixing corner cases for some instructions that were leading to incorrect behavior. + +## gem5-SST bridges revived + +We now support gem5 cores connected to an SST memory system for gem5 full system mode. +This has been tested for RISC-V and Arm. +See `ext/sst/README.md` for details. + +## LupIO devices + +LupIO devices were developed by Prof. Joel Porquet-Lupine as a set of open-source I/O devices to be used for teaching. +They were designed to model a complete set of I/O devices that are neither too complex to teach in a classroom setting, nor too simple to translate to understanding real-world devices. +Our collection consists of a real-time clock, random number generator, terminal device, block device, system controller, timer device, programmable interrupt controller, as well as an inter-processor interrupt controller. +A more detailed outline of LupIO can be found here: . +Within gem5, these devices offer the capability to run simulations with a complete set of I/O devices that are both easy to understand and manipulate. + +The initial implementation of the LupIO devices is for the RISC-V ISA. +However, they should be simple to extend to other ISAs through small source changes and updating the SConscripts. + +## Other improvements + +- Removed master/slave terminology: this was a closed ticket which was marked as done even though multiple references to master/slave remained in the config scripts; these have now been fixed. +- Armv8.2-A FEAT_UAO implementation. 
+- Implemented 'at' variants of file syscall in SE mode (). +- Improved modularity in SConscripts. +- Arm atomic support in the CHI protocol +- Many testing improvements. +- New "tester" CPU which mimics GUPS. + # Version 21.1.0.2 **[HOTFIX]** [A commit introduced `std::vector` with `resize()` to initialize all storages](https://gem5-review.googlesource.com/c/public/gem5/+/27085). diff --git a/SConstruct b/SConstruct index 8fa5517641..ceeb1ba52f 100755 --- a/SConstruct +++ b/SConstruct @@ -348,12 +348,6 @@ if main['GCC'] or main['CLANG']: if GetOption('gold_linker'): main.Append(LINKFLAGS='-fuse-ld=gold') - # Treat warnings as errors but white list some warnings that we - # want to allow (e.g., deprecation warnings). - main.Append(CCFLAGS=['-Werror', - '-Wno-error=deprecated-declarations', - '-Wno-error=deprecated', - ]) else: error('\n'.join(( "Don't know what compiler options to use for your compiler.", diff --git a/configs/example/gem5_library/x86-ubuntu-run-with-kvm.py b/configs/example/gem5_library/x86-ubuntu-run-with-kvm.py index fa84960db6..1aea258b16 100644 --- a/configs/example/gem5_library/x86-ubuntu-run-with-kvm.py +++ b/configs/example/gem5_library/x86-ubuntu-run-with-kvm.py @@ -121,7 +121,7 @@ board.set_kernel_disk_workload( kernel=Resource("x86-linux-kernel-5.4.49"), # The x86 ubuntu image will be automatically downloaded to the if not # already present. - disk_image=Resource("x86-ubuntu-img"), + disk_image=Resource("x86-ubuntu-18.04-img"), readfile_contents=command, ) diff --git a/configs/example/gem5_library/x86-ubuntu-run.py b/configs/example/gem5_library/x86-ubuntu-run.py index c6f6f83726..2aee8c73df 100644 --- a/configs/example/gem5_library/x86-ubuntu-run.py +++ b/configs/example/gem5_library/x86-ubuntu-run.py @@ -58,7 +58,7 @@ board = X86DemoBoard() # downloaded. 
board.set_kernel_disk_workload( kernel=Resource("x86-linux-kernel-5.4.49"), - disk_image=Resource("x86-ubuntu-img"), + disk_image=Resource("x86-ubuntu-18.04-img"), ) simulator = Simulator(board=board) diff --git a/configs/ruby/CHI_config.py b/configs/ruby/CHI_config.py index 2d39659c15..097f36735d 100644 --- a/configs/ruby/CHI_config.py +++ b/configs/ruby/CHI_config.py @@ -360,7 +360,7 @@ class CPUSequencerWrapper: if str(p) != 'icache_port': exec('cpu.%s = self.data_seq.in_ports' % p) cpu.connectUncachedPorts( - self.data_seq.in_ports, self.data_seq.out_ports) + self.data_seq.in_ports, self.data_seq.interrupt_out_port) def connectIOPorts(self, piobus): self.data_seq.connectIOPorts(piobus) diff --git a/ext/sst/README.md b/ext/sst/README.md index 148adcc5cc..dbec200b43 100644 --- a/ext/sst/README.md +++ b/ext/sst/README.md @@ -62,7 +62,7 @@ See `INSTALL.md`. Downloading the built bootloader containing a Linux Kernel and a workload, ```sh -wget http://dist.gem5.org/dist/develop/misc/riscv/bbl-busybox-boot-exit +wget http://dist.gem5.org/dist/v21-2/misc/riscv/bbl-busybox-boot-exit ``` Running the simulation @@ -78,7 +78,7 @@ the `bbl-busybox-boot-exit` resource, which contains an m5 binary, and `m5 exit` will be called upon the booting process reaching the early userspace. More information about building a bootloader containing a Linux Kernel and a customized workload is available at -[https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/develop/src/riscv-boot-exit-nodisk/]. +[https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/riscv-boot-exit-nodisk/]. 
## Running an example simulation (Arm) @@ -87,7 +87,7 @@ extract them under the $M5_PATH directory (make sure M5_PATH points to a valid directory): ```sh -wget http://dist.gem5.org/dist/develop/arm/aarch-sst-20211207.tar.bz2 +wget http://dist.gem5.org/dist/v21-2/arm/aarch-sst-20211207.tar.bz2 tar -xf aarch-sst-20211207.tar.bz2 # copying bootloaders diff --git a/ext/testlib/configuration.py b/ext/testlib/configuration.py index 95800deb46..d0fca7451a 100644 --- a/ext/testlib/configuration.py +++ b/ext/testlib/configuration.py @@ -213,7 +213,7 @@ def define_defaults(defaults): os.pardir, os.pardir)) defaults.result_path = os.path.join(os.getcwd(), 'testing-results') - defaults.resource_url = 'http://dist.gem5.org/dist/develop' + defaults.resource_url = 'http://dist.gem5.org/dist/v21-2' defaults.resource_path = os.path.abspath(os.path.join(defaults.base_dir, 'tests', 'gem5', diff --git a/site_scons/gem5_scons/configure.py b/site_scons/gem5_scons/configure.py index b335673774..24a4a3deff 100644 --- a/site_scons/gem5_scons/configure.py +++ b/site_scons/gem5_scons/configure.py @@ -48,7 +48,10 @@ def CheckCxxFlag(context, flag, autoadd=True): context.Message("Checking for compiler %s support... " % flag) last_cxxflags = context.env['CXXFLAGS'] context.env.Append(CXXFLAGS=[flag]) + pre_werror = context.env['CXXFLAGS'] + context.env.Append(CXXFLAGS=['-Werror']) ret = context.TryCompile('// CheckCxxFlag DO NOTHING', '.cc') + context.env['CXXFLAGS'] = pre_werror if not (ret and autoadd): context.env['CXXFLAGS'] = last_cxxflags context.Result(ret) @@ -58,7 +61,10 @@ def CheckLinkFlag(context, flag, autoadd=True, set_for_shared=True): context.Message("Checking for linker %s support... 
" % flag) last_linkflags = context.env['LINKFLAGS'] context.env.Append(LINKFLAGS=[flag]) + pre_werror = context.env['LINKFLAGS'] + context.env.Append(LINKFLAGS=['-Werror']) ret = context.TryLink('int main(int, char *[]) { return 0; }', '.cc') + context.env['LINKFLAGS'] = pre_werror if not (ret and autoadd): context.env['LINKFLAGS'] = last_linkflags if (ret and set_for_shared): diff --git a/src/Doxyfile b/src/Doxyfile index dd52c258e1..8ed8839f9a 100644 --- a/src/Doxyfile +++ b/src/Doxyfile @@ -31,7 +31,7 @@ PROJECT_NAME = gem5 # This could be handy for archiving the generated documentation or # if some version control system is used. -PROJECT_NUMBER = DEVELOP-FOR-V21-2 +PROJECT_NUMBER = v21.2.0.0 # The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute) # base path where the generated documentation will be put. diff --git a/src/arch/amdgpu/gcn3/gpu_mem_helpers.hh b/src/arch/amdgpu/gcn3/gpu_mem_helpers.hh index 009bb7c6c7..05299e1a0d 100644 --- a/src/arch/amdgpu/gcn3/gpu_mem_helpers.hh +++ b/src/arch/amdgpu/gcn3/gpu_mem_helpers.hh @@ -107,7 +107,8 @@ initMemReqHelper(GPUDynInstPtr gpuDynInst, MemCmd mem_req_type, pkt1->dataStatic(&(reinterpret_cast( gpuDynInst->d_data))[lane * N]); pkt2->dataStatic(&(reinterpret_cast( - gpuDynInst->d_data))[lane * N + req1->getSize()]); + gpuDynInst->d_data))[lane * N + + req1->getSize()/sizeof(T)]); DPRINTF(GPUMem, "CU%d: WF[%d][%d]: index: %d unaligned memory " "request for %#x\n", gpuDynInst->cu_id, gpuDynInst->simdId, gpuDynInst->wfSlotId, lane, diff --git a/src/arch/amdgpu/vega/gpu_mem_helpers.hh b/src/arch/amdgpu/vega/gpu_mem_helpers.hh index c60325dac0..a5a9ec97a5 100644 --- a/src/arch/amdgpu/vega/gpu_mem_helpers.hh +++ b/src/arch/amdgpu/vega/gpu_mem_helpers.hh @@ -107,7 +107,8 @@ initMemReqHelper(GPUDynInstPtr gpuDynInst, MemCmd mem_req_type, pkt1->dataStatic(&(reinterpret_cast( gpuDynInst->d_data))[lane * N]); pkt2->dataStatic(&(reinterpret_cast( - gpuDynInst->d_data))[lane * N + req1->getSize()]); + 
gpuDynInst->d_data))[lane * N + + req1->getSize()/sizeof(T)]); DPRINTF(GPUMem, "CU%d: WF[%d][%d]: index: %d unaligned memory " "request for %#x\n", gpuDynInst->cu_id, gpuDynInst->simdId, gpuDynInst->wfSlotId, lane, diff --git a/src/base/version.cc b/src/base/version.cc index 078a2f9d12..0da32c2494 100644 --- a/src/base/version.cc +++ b/src/base/version.cc @@ -32,6 +32,6 @@ namespace gem5 /** * @ingroup api_base_utils */ -const char *gem5Version = "[DEVELOP-FOR-V21.2]"; +const char *gem5Version = "21.2.0.0"; } // namespace gem5 diff --git a/src/python/gem5/resources/downloader.py b/src/python/gem5/resources/downloader.py index 86ddefbfc3..2cb73baef4 100644 --- a/src/python/gem5/resources/downloader.py +++ b/src/python/gem5/resources/downloader.py @@ -41,13 +41,16 @@ This Python module contains functions used to download, list, and obtain information about resources from resources.gem5.org. """ +def _resources_json_version_required() -> str: + """ + Specifies the version of resources.json to obtain. + """ + return "21.2" def _get_resources_json_uri() -> str: - # TODO: This is hardcoded to develop. This will need updated for each - # release to the stable branch. uri = ( "https://gem5.googlesource.com/public/gem5-resources/" - + "+/refs/heads/develop/resources.json?format=TEXT" + + "+/refs/heads/stable/resources.json?format=TEXT" ) return uri @@ -64,8 +67,27 @@ def _get_resources_json() -> Dict: # text. Therefore when we open the URL we receive the JSON in base64 # format. Conversion is needed before it can be loaded. with urllib.request.urlopen(_get_resources_json_uri()) as url: - return json.loads(base64.b64decode(url.read()).decode("utf-8")) + to_return = json.loads(base64.b64decode(url.read()).decode("utf-8")) + # If the current version pulled is not correct, look up the + # "previous-versions" field to find the correct one. 
+ version = _resources_json_version_required() + if to_return["version"] != version: + if version in to_return["previous-versions"].keys(): + with urllib.request.urlopen( + to_return["previous-versions"][version] + ) as url: + to_return = json.loads( + base64.b64decode(url.read()).decode("utf-8") + ) + else: + # This should never happen, but we throw an exception to explain + # that we can't find the version. + raise Exception( + f"Version '{version}' of resources.json cannot be found." + ) + + return to_return def _get_url_base() -> str: """ diff --git a/tests/compiler-tests.sh b/tests/compiler-tests.sh index 15ffb1673b..43092dadec 100755 --- a/tests/compiler-tests.sh +++ b/tests/compiler-tests.sh @@ -99,7 +99,7 @@ for compiler in ${images[@]}; do # targets for this test build_indices=(${build_permutation[@]:0:$builds_count}) - repo_name="${base_url}/${compiler}:latest" + repo_name="${base_url}/${compiler}:v21-2" # Grab compiler image docker pull $repo_name >/dev/null diff --git a/tests/gem5/configs/boot_kvm_fork_run.py b/tests/gem5/configs/boot_kvm_fork_run.py index 2cd180ac2e..662ef23010 100644 --- a/tests/gem5/configs/boot_kvm_fork_run.py +++ b/tests/gem5/configs/boot_kvm_fork_run.py @@ -203,7 +203,7 @@ motherboard.set_kernel_disk_workload( resource_directory=args.resource_directory, ), disk_image=Resource( - "x86-ubuntu-img", + "x86-ubuntu-18.04-img", resource_directory=args.resource_directory, ), readfile_contents=dedent( diff --git a/tests/gem5/configs/boot_kvm_switch_exit.py b/tests/gem5/configs/boot_kvm_switch_exit.py index 9f5f7eea04..5cfee40b6d 100644 --- a/tests/gem5/configs/boot_kvm_switch_exit.py +++ b/tests/gem5/configs/boot_kvm_switch_exit.py @@ -188,7 +188,7 @@ motherboard.set_kernel_disk_workload( resource_directory=args.resource_directory, ), disk_image=Resource( - "x86-ubuntu-img", + "x86-ubuntu-18.04-img", resource_directory=args.resource_directory, ), # The first exit signals to switch processors. 
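The version lookup that this diff adds to `_get_resources_json` in `src/python/gem5/resources/downloader.py` can be sketched offline as follows. This is a minimal illustration rather than gem5's implementation: `select_resources_json`, `decode_payload`, and the `fetch` callback are hypothetical names introduced here, and `fetch` stands in for the `urllib.request.urlopen` call so the logic can be exercised without network access.

```python
import base64
import json
from typing import Callable, Dict


def decode_payload(raw: bytes) -> Dict:
    """Decode a base64-encoded JSON payload, as served with ?format=TEXT."""
    return json.loads(base64.b64decode(raw).decode("utf-8"))


def select_resources_json(
    current: Dict, required: str, fetch: Callable[[str], Dict]
) -> Dict:
    """Return the resources.json whose "version" matches `required`.

    `current` is the already-decoded JSON from the stable branch.
    `fetch` is a hypothetical helper that downloads and decodes the JSON
    found at a URL; it is only called when `current` is the wrong version.
    """
    if current["version"] == required:
        return current

    # Unlike the diff, a missing "previous-versions" field is tolerated here
    # and treated as "no older versions are known".
    previous = current.get("previous-versions", {})
    if required in previous:
        return fetch(previous[required])

    raise Exception(f"Version '{required}' of resources.json cannot be found.")
```

In the downloader itself, the role of `fetch` is played by `urllib.request.urlopen` followed by the same base64 decode that `decode_payload` performs.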
diff --git a/tests/gem5/configs/x86_boot_exit_run.py b/tests/gem5/configs/x86_boot_exit_run.py index 5c8b025ded..93c9028780 100644 --- a/tests/gem5/configs/x86_boot_exit_run.py +++ b/tests/gem5/configs/x86_boot_exit_run.py @@ -207,7 +207,7 @@ motherboard.set_kernel_disk_workload( resource_directory=args.resource_directory, ), disk_image=Resource( - "x86-ubuntu-img", + "x86-ubuntu-18.04-img", resource_directory=args.resource_directory, ), kernel_args=kernal_args, diff --git a/tests/gem5/cpu_tests/test.py b/tests/gem5/cpu_tests/test.py index ee56400915..a96233724b 100644 --- a/tests/gem5/cpu_tests/test.py +++ b/tests/gem5/cpu_tests/test.py @@ -57,7 +57,7 @@ valid_isas = { base_path = joinpath(config.bin_path, 'cpu_tests') -base_url = config.resource_url + '/gem5/cpu_tests/benchmarks/bin/' +base_url = config.resource_url + '/test-progs/cpu-tests/bin/' isa_url = { constants.gcn3_x86_tag : base_url + "x86", diff --git a/tests/jenkins/presubmit.cfg b/tests/jenkins/presubmit.cfg index a356c766df..ef2596911a 100644 --- a/tests/jenkins/presubmit.cfg +++ b/tests/jenkins/presubmit.cfg @@ -3,4 +3,4 @@ # Location of the continuous batch script in repository. 
build_file: "jenkins-gem5-prod/tests/jenkins/presubmit.sh" -timeout_mins: 360 # 6 hours +timeout_mins: 420 # 7 hours diff --git a/tests/jenkins/presubmit.sh b/tests/jenkins/presubmit.sh index 2aa0c04469..d15df01193 100755 --- a/tests/jenkins/presubmit.sh +++ b/tests/jenkins/presubmit.sh @@ -37,8 +37,8 @@ set -e -DOCKER_IMAGE_ALL_DEP=gcr.io/gem5-test/ubuntu-20.04_all-dependencies -DOCKER_IMAGE_CLANG_COMPILE=gcr.io/gem5-test/clang-version-9 +DOCKER_IMAGE_ALL_DEP=gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2 +DOCKER_IMAGE_CLANG_COMPILE=gcr.io/gem5-test/clang-version-9:v21-2 PRESUBMIT_STAGE2=tests/jenkins/presubmit-stage2.sh GEM5ART_TESTS=tests/jenkins/gem5art-tests.sh diff --git a/tests/nightly.sh b/tests/nightly.sh index 05a1e0b6ac..190c33d1f3 100755 --- a/tests/nightly.sh +++ b/tests/nightly.sh @@ -46,6 +46,18 @@ if [[ $# -gt 1 ]]; then run_threads=$2 fi +# The third argument is the GPU ISA to run. If no argument is given we default +# to GCN3_X86. +gpu_isa=GCN3_X86 +if [[ $# -gt 2 ]]; then + gpu_isa=$3 +fi + +if [[ "$gpu_isa" != "GCN3_X86" ]] && [[ "$gpu_isa" != "VEGA_X86" ]]; then + echo "Invalid GPU ISA: $gpu_isa" + exit 1 +fi + build_target () { isa=$1 @@ -53,7 +65,8 @@ build_target () { # SCons is not perfect, and occasionally does not catch a necessary # compilation: https://gem5.atlassian.net/browse/GEM5-753 docker run -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ - "${gem5_root}" --rm gcr.io/gem5-test/ubuntu-20.04_all-dependencies \ + "${gem5_root}" --rm \ + gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2 \ bash -c "scons build/${isa}/gem5.opt -j${compile_threads} \ || (rm -rf build && scons build/${isa}/gem5.opt -j${compile_threads})" } @@ -62,12 +75,13 @@ unit_test () { build=$1 docker run -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ - "${gem5_root}" --rm gcr.io/gem5-test/ubuntu-20.04_all-dependencies \ + "${gem5_root}" --rm \ + gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2 \ scons build/NULL/unittests.${build} 
-j${compile_threads} } # Ensure we have the latest docker images. -docker pull gcr.io/gem5-test/ubuntu-20.04_all-dependencies +docker pull gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2 # Try to build the ISA targets. build_target NULL @@ -84,19 +98,19 @@ unit_test debug # Run the gem5 long tests. docker run -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ - "${gem5_root}"/tests --rm gcr.io/gem5-test/ubuntu-20.04_all-dependencies \ + "${gem5_root}"/tests --rm \ + gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2 \ ./main.py run --length long -j${compile_threads} -t${run_threads} -vv -# Run the GPU tests. -# For the GPU tests we compile and run GCN3_X86 inside a gcn-gpu container. -docker pull gcr.io/gem5-test/gcn-gpu:latest +# For the GPU tests we compile and run the GPU ISA inside a gcn-gpu container. +docker pull gcr.io/gem5-test/gcn-gpu:v21-2 docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ - "${gem5_root}" gcr.io/gem5-test/gcn-gpu:latest bash -c \ - "scons build/GCN3_X86/gem5.opt -j${compile_threads} \ - || (rm -rf build && scons build/GCN3_X86/gem5.opt -j${compile_threads})" + "${gem5_root}" gcr.io/gem5-test/gcn-gpu:v21-2 bash -c \ + "scons build/${gpu_isa}/gem5.opt -j${compile_threads} \ + || (rm -rf build && scons build/${gpu_isa}/gem5.opt -j${compile_threads})" # get square -wget -qN http://dist.gem5.org/dist/develop/test-progs/square/square +wget -qN http://dist.gem5.org/dist/v21-2/test-progs/square/square mkdir -p tests/testing-results @@ -104,18 +118,18 @@ mkdir -p tests/testing-results # Thus, we always want to run this in the nightly regressions to make sure # basic GPU functionality is working. 
docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ - "${gem5_root}" gcr.io/gem5-test/gcn-gpu:latest build/GCN3_X86/gem5.opt \ + "${gem5_root}" gcr.io/gem5-test/gcn-gpu:v21-2 build/${gpu_isa}/gem5.opt \ configs/example/apu_se.py --reg-alloc-policy=dynamic -n3 -c square # get HeteroSync -wget -qN http://dist.gem5.org/dist/develop/test-progs/heterosync/gcn3/allSyncPrims-1kernel +wget -qN http://dist.gem5.org/dist/v21-2/test-progs/heterosync/gcn3/allSyncPrims-1kernel # run HeteroSync sleepMutex -- 16 WGs (4 per CU in default config), each doing # 10 Ld/St per thread and 4 iterations of the critical section is a reasonable # moderate contention case for the default 4 CU GPU config and help ensure GPU # atomics are tested. docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ - "${gem5_root}" gcr.io/gem5-test/gcn-gpu:latest build/GCN3_X86/gem5.opt \ + "${gem5_root}" gcr.io/gem5-test/gcn-gpu:v21-2 build/${gpu_isa}/gem5.opt \ configs/example/apu_se.py --reg-alloc-policy=dynamic -n3 -c \ allSyncPrims-1kernel --options="sleepMutex 10 16 4" @@ -125,7 +139,7 @@ docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ # moderate contention case for the default 4 CU GPU config and help ensure GPU # atomics are tested. 
docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ - "${gem5_root}" gcr.io/gem5-test/gcn-gpu:latest build/GCN3_X86/gem5.opt \ + "${gem5_root}" gcr.io/gem5-test/gcn-gpu:v21-2 build/${gpu_isa}/gem5.opt \ configs/example/apu_se.py --reg-alloc-policy=dynamic -n3 -c \ allSyncPrims-1kernel --options="lfTreeBarrUniq 10 16 4" @@ -135,7 +149,7 @@ build_and_run_SST () { variant=$2 docker run -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ - "${gem5_root}" --rm gcr.io/gem5-test/sst-env \ + "${gem5_root}" --rm gcr.io/gem5-test/sst-env:v21-2 \ bash -c "\ scons build/${isa}/libgem5_${variant}.so -j${compile_threads} --without-tcmalloc; \ cd ext/sst; \ diff --git a/tests/weekly.sh b/tests/weekly.sh index d65ee40894..b6eda61932 100755 --- a/tests/weekly.sh +++ b/tests/weekly.sh @@ -32,16 +32,31 @@ set -x dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )" gem5_root="${dir}/.." -# We assume the lone argument is the number of threads. If no argument is -# given we default to one. +# We assume the first two arguments are the number of threads followed by the +# GPU ISA to test. These default to 1 and GCN3_X86 if no argument is given. threads=1 -if [[ $# -gt 0 ]]; then +gpu_isa=GCN3_X86 +if [[ $# -eq 1 ]]; then threads=$1 +elif [[ $# -eq 2 ]]; then + threads=$1 + gpu_isa=$2 +else + if [[ $# -gt 0 ]]; then + echo "Invalid number of arguments: $#" + exit 1 + fi +fi + +if [[ "$gpu_isa" != "GCN3_X86" ]] && [[ "$gpu_isa" != "VEGA_X86" ]]; then + echo "Invalid GPU ISA: $gpu_isa" + exit 1 fi # Run the gem5 very-long tests. 
docker run -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ - "${gem5_root}"/tests --rm gcr.io/gem5-test/ubuntu-20.04_all-dependencies \ + "${gem5_root}"/tests --rm \ + gcr.io/gem5-test/ubuntu-20.04_all-dependencies:v21-2 \ ./main.py run --length very-long -j${threads} -t${threads} -vv mkdir -p tests/testing-results @@ -49,7 +64,7 @@ mkdir -p tests/testing-results # GPU weekly tests start here # before pulling gem5 resources, make sure it doesn't exist already docker run --rm --volume "${gem5_root}":"${gem5_root}" -w \ - "${gem5_root}" gcr.io/gem5-test/gcn-gpu:latest bash -c \ + "${gem5_root}" gcr.io/gem5-test/gcn-gpu:v21-2 bash -c \ "rm -rf ${gem5_root}/gem5-resources" # delete Pannotia datasets and output files in case a failed regression run left # them around @@ -61,21 +76,42 @@ rm -f coAuthorsDBLP.graph 1k_128k.gr result.out # Moreover, DNNMark builds a library and thus doesn't have a binary, so we # need to build it before we run it. # Need to pull this first because HACC's docker requires this path to exist -git clone -b develop https://gem5.googlesource.com/public/gem5-resources \ +git clone https://gem5.googlesource.com/public/gem5-resources \ "${gem5_root}/gem5-resources" -# For the GPU tests we compile and run GCN3_X86 inside a gcn-gpu container. + +# The following script is to ensure these tests are runnable as the resources +# directory changes over time. The gem5 resources repository stable branch is +# tagged upon the new release for that of the previous release. For example, +# when v22.0 is released, the stable branch will be tagged with "v21.2.X.X" +# prior to the merging of the develop/staging branch into the stable branch. +# This is so a user may revert the gem5-resources sources back to a state +# compatible with a particular major release. +# +# To ensure the v21.2 version of these tests continues to run as future +# versions are released, we run this check. 
If there's been another release, +# we checkout the correct version of gem5 resources. +cd "${gem5_root}/gem5-resources" +version_tag=$(git tag | grep "v21.2") + +if [[ ${version_tag} != "" ]]; then + git checkout "${version_tag}" +fi + +cd "${gem5_root}" + +# For the GPU tests we compile and run the GPU ISA inside a gcn-gpu container. # HACC requires setting numerous environment variables to run correctly. To # avoid needing to set all of these, we instead build a docker for it, which # has all these variables pre-set in its Dockerfile # To avoid compiling gem5 multiple times, all GPU benchmarks will use this -docker pull gcr.io/gem5-test/gcn-gpu:latest +docker pull gcr.io/gem5-test/gcn-gpu:v21-2 docker build -t hacc-test-weekly ${gem5_root}/gem5-resources/src/gpu/halo-finder docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ "${gem5_root}" hacc-test-weekly bash -c \ - "scons build/GCN3_X86/gem5.opt -j${threads} \ - || rm -rf build && scons build/GCN3_X86/gem5.opt -j${threads}" + "scons build/${gpu_isa}/gem5.opt -j${threads} \ + || rm -rf build && scons build/${gpu_isa}/gem5.opt -j${threads}" # Some of the apps we test use m5ops (and x86), so compile them for x86 # Note: setting TERM in the environment is necessary as scons fails for m5ops if @@ -94,7 +130,7 @@ docker run --rm --volume "${gem5_root}":"${gem5_root}" -w \ # LULESH is heavily used in the HPC community on GPUs, and does a good job of # stressing several GPU compute and memory components docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ - "${gem5_root}" hacc-test-weekly build/GCN3_X86/gem5.opt \ + "${gem5_root}" hacc-test-weekly build/${gpu_isa}/gem5.opt \ configs/example/apu_se.py -n3 --mem-size=8GB --reg-alloc-policy=dynamic \ --benchmark-root="${gem5_root}/gem5-resources/src/gpu/lulesh/bin" -c lulesh @@ -136,7 +172,7 @@ docker run --rm -u $UID:$GID --volume "${gem5_root}":"${gem5_root}" -w \ docker run --rm --volume "${gem5_root}":"${gem5_root}" -v \ 
"${gem5_root}/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0" \ -w "${gem5_root}/gem5-resources/src/gpu/DNNMark" hacc-test-weekly \ - "${gem5_root}/build/GCN3_X86/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \ + "${gem5_root}/build/${gpu_isa}/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \ --reg-alloc-policy=dynamic \ --benchmark-root="${gem5_root}/gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_softmax" \ -c dnnmark_test_fwd_softmax \ @@ -146,7 +182,7 @@ docker run --rm --volume "${gem5_root}":"${gem5_root}" -v \ docker run --rm --volume "${gem5_root}":"${gem5_root}" -v \ "${gem5_root}/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0" \ -w "${gem5_root}/gem5-resources/src/gpu/DNNMark" hacc-test-weekly \ - "${gem5_root}/build/GCN3_X86/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \ + "${gem5_root}/build/${gpu_isa}/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \ --reg-alloc-policy=dynamic \ --benchmark-root="${gem5_root}/gem5-resources/src/gpu/DNNMark/build/benchmarks/test_fwd_pool" \ -c dnnmark_test_fwd_pool \ @@ -156,7 +192,7 @@ docker run --rm --volume "${gem5_root}":"${gem5_root}" -v \ docker run --rm --volume "${gem5_root}":"${gem5_root}" -v \ "${gem5_root}/gem5-resources/src/gpu/DNNMark/cachefiles:/root/.cache/miopen/2.9.0" \ -w "${gem5_root}/gem5-resources/src/gpu/DNNMark" hacc-test-weekly \ - "${gem5_root}/build/GCN3_X86/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \ + "${gem5_root}/build/${gpu_isa}/gem5.opt" "${gem5_root}/configs/example/apu_se.py" -n3 \ --reg-alloc-policy=dynamic \ --benchmark-root="${gem5_root}/gem5-resources/src/gpu/DNNMark/build/benchmarks/test_bwd_bn" \ -c dnnmark_test_bwd_bn \ @@ -172,7 +208,7 @@ docker run --rm -v ${PWD}:${PWD} -w \ # Like LULESH, HACC is heavily used in the HPC community and is used to stress # the GPU memory system docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \ - hacc-test-weekly 
${gem5_root}/build/GCN3_X86/gem5.opt \ + hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \ ${gem5_root}/configs/example/apu_se.py -n3 --reg-alloc-policy=dynamic \ --benchmark-root=${gem5_root}/gem5-resources/src/gpu/halo-finder/src/hip \ -c ForceTreeTest --options="0.5 0.1 64 0.1 1 N 12 rcb" @@ -189,10 +225,10 @@ docker run --rm -v ${PWD}:${PWD} \ "export GEM5_PATH=${gem5_root} ; make gem5-fusion" # # get input dataset for BC test -wget http://dist.gem5.org/dist/develop/datasets/pannotia/bc/1k_128k.gr +wget http://dist.gem5.org/dist/v21-2/datasets/pannotia/bc/1k_128k.gr # run BC docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \ - hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \ + hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \ ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \ --reg-alloc-policy=dynamic \ --benchmark-root=gem5-resources/src/gpu/pannotia/bc/bin -c bc.gem5 \ @@ -206,7 +242,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \ # run Color (Max) (use same input dataset as BC for faster testing) docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \ - hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \ + hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \ ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \ --reg-alloc-policy=dynamic \ --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/color/bin \ @@ -220,7 +256,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \ # run Color (MaxMin) (use same input dataset as BC for faster testing) docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \ - hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \ + hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \ ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \ --reg-alloc-policy=dynamic \ --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/color/bin \ @@ -234,7 +270,7 @@ docker run --rm -v 
${gem5_root}:${gem5_root} -w \ # run FW (use same input dataset as BC for faster testing) docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \ - hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \ + hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \ ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \ --reg-alloc-policy=dynamic \ --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/fw/bin \ @@ -248,7 +284,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \ # run MIS (use same input dataset as BC for faster testing) docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \ - hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \ + hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \ ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \ --reg-alloc-policy=dynamic \ --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/mis/bin \ @@ -261,10 +297,10 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \ "export GEM5_PATH=${gem5_root} ; make gem5-fusion" # get PageRank input dataset -wget http://dist.gem5.org/dist/develop/datasets/pannotia/pagerank/coAuthorsDBLP.graph +wget http://dist.gem5.org/dist/v21-2/datasets/pannotia/pagerank/coAuthorsDBLP.graph # run PageRank (Default) docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \ - hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \ + hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \ ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \ --reg-alloc-policy=dynamic \ --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/pagerank/bin \ @@ -278,7 +314,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \ # run PageRank (SPMV) docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \ - hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \ + hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \ ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \ 
--reg-alloc-policy=dynamic \ --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/pagerank/bin \ @@ -292,7 +328,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \ # run SSSP (CSR) (use same input dataset as BC for faster testing) docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \ - hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \ + hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \ ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \ --reg-alloc-policy=dynamic \ --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/sssp/bin \ @@ -306,7 +342,7 @@ docker run --rm -v ${gem5_root}:${gem5_root} -w \ # run SSSP (ELL) (use same input dataset as BC for faster testing) docker run --rm -v ${gem5_root}:${gem5_root} -w ${gem5_root} -u $UID:$GID \ - hacc-test-weekly ${gem5_root}/build/GCN3_X86/gem5.opt \ + hacc-test-weekly ${gem5_root}/build/${gpu_isa}/gem5.opt \ ${gem5_root}/configs/example/apu_se.py -n3 --mem-size=8GB \ --reg-alloc-policy=dynamic \ --benchmark-root=${gem5_root}/gem5-resources/src/gpu/pannotia/sssp/bin \ diff --git a/util/dockerfiles/gcn-gpu/Dockerfile b/util/dockerfiles/gcn-gpu/Dockerfile index 8dd1b1b139..50d34bd6c6 100644 --- a/util/dockerfiles/gcn-gpu/Dockerfile +++ b/util/dockerfiles/gcn-gpu/Dockerfile @@ -70,7 +70,7 @@ RUN git clone -b rocm-4.0.0 \ WORKDIR /ROCclr # The patch allows us to avoid building blit kernels on-the-fly in gem5 -RUN wget -q -O - dist.gem5.org/dist/develop/rocm_patches/ROCclr.patch | git apply -v +RUN wget -q -O - dist.gem5.org/dist/v21-2/rocm_patches/ROCclr.patch | git apply -v WORKDIR /ROCclr/build RUN cmake -DOPENCL_DIR="/ROCm-OpenCL-Runtime" \
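The two `gpu_mem_helpers.hh` hunks earlier in this diff change the offset of the second packet's data pointer from `req1->getSize()` to `req1->getSize()/sizeof(T)`. The distinction is between a byte count and an element index. The sketch below illustrates it in Python, assuming the data buffer is indexed in units of `T`, as the `sizeof(T)` divisor suggests. The function name, the element type, and the example sizes are hypothetical and chosen only for illustration.

```python
from array import array


def second_packet_index(
    lane: int, n: int, first_size_bytes: int, itemsize: int
) -> int:
    """Element index at which the second half of a split access begins."""
    # first_size_bytes is a byte count, but the buffer is indexed in
    # elements, so it must be converted before being added to the index.
    return lane * n + first_size_bytes // itemsize


# Hypothetical example: N = 4 elements per lane, lane 1, and a first request
# that covers exactly two elements of the buffer.
buf = array("I", range(16))
first_size_bytes = 2 * buf.itemsize

fixed = second_packet_index(1, 4, first_size_bytes, buf.itemsize)

# Pre-fix arithmetic: the byte count is added directly as if it were an
# element count, which lands past the intended element.
unfixed = 1 * 4 + first_size_bytes
```

With multi-byte elements, `unfixed` always overshoots `fixed`, which matches the kind of misplaced data pointer the hunks correct for unaligned accesses.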