tests,configs,mem-ruby: Adding Ruby tester for GPU_VIPER
This patch adds the GPU protocol tester that uses data-race-free
operation to discover bugs in GPU protocols including GPU_VIPER. For
more information please see the following paper and the README:

T. Ta, X. Zhang, A. Gutierrez and B. M. Beckmann, "Autonomous
Data-Race-Free GPU Testing," 2019 IEEE International Symposium on
Workload Characterization (IISWC), Orlando, FL, USA, 2019, pp. 81-92,
doi: 10.1109/IISWC47752.2019.9042019.

Change-Id: Ic9939d131a930d1e7014ed0290601140bdd1499f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32855
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
129
src/cpu/testers/gpu_ruby_test/README
Normal file
@@ -0,0 +1,129 @@
/*
 * Copyright (c) 2017-2020 Advanced Micro Devices, Inc.
 * All rights reserved.
 *
 * For use for simulation and test purposes only
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 * 1. Redistributions of source code must retain the above copyright notice,
 * this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright notice,
 * this list of conditions and the following disclaimer in the documentation
 * and/or other materials provided with the distribution.
 *
 * 3. Neither the name of the copyright holder nor the names of its
 * contributors may be used to endorse or promote products derived from this
 * software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 */

This directory contains a tester for gem5 GPU protocols. Unlike the Ruby random
tester, this tester does not rely on sequential consistency. Instead, it
assumes that the tested protocols support release consistency.

----- Getting Started -----

To get started quickly, use the following example command line:

    build/GCN3_X86/gem5.opt configs/example/ruby_gpu_random_test.py \
            --test-length=1000 --system-size=medium --cache-size=small

An overview of the main command line options follows. For all options, run
`build/GCN3_X86/gem5.opt configs/example/ruby_gpu_random_test.py --help`
or see the configuration file.

 * --cache-size (small, large): Use smaller sizes to exercise evictions, etc.
 * --system-size (small, medium, large): Effectively the number of threads in
                 the GPU model. Larger sizes create more contention and are
                 useful for checking how the protocol behaves under it.
 * --episode-length (short, medium, long): Number of loads and stores in an
                 episode. Episodes will also have atomics mixed in. See below
                 for a definition of episode.
 * --test-length (int): Number of episodes to execute. This determines how
                 long the tester runs. Longer runs stress the protocol
                 harder.

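As a rough illustration of how the options above combine, the sketch below
assembles a tester command line in Python. The helper name
build_tester_command and the VALID_CHOICES table are hypothetical; the accepted
values are copied from this README rather than from the configuration script,
so treat `--help` as the authority.

```python
# Hypothetical helper: builds the argv list for one tester run.
# The option values below mirror this README, not the script itself.
GEM5_BINARY = "build/GCN3_X86/gem5.opt"
CONFIG_SCRIPT = "configs/example/ruby_gpu_random_test.py"

VALID_CHOICES = {
    "cache-size": ("small", "large"),
    "system-size": ("small", "medium", "large"),
    "episode-length": ("short", "medium", "long"),
}


def build_tester_command(test_length, **sized_options):
    """Return the argv list for a run; raise ValueError on invalid input."""
    if not isinstance(test_length, int) or test_length <= 0:
        raise ValueError("test_length must be a positive integer")
    argv = [GEM5_BINARY, CONFIG_SCRIPT, f"--test-length={test_length}"]
    for name, value in sized_options.items():
        flag = name.replace("_", "-")  # system_size -> system-size
        if flag not in VALID_CHOICES:
            raise ValueError(f"unknown option: --{flag}")
        if value not in VALID_CHOICES[flag]:
            raise ValueError(f"--{flag} must be one of {VALID_CHOICES[flag]}")
        argv.append(f"--{flag}={value}")
    return argv
```

Calling build_tester_command(1000, system_size="medium", cache_size="small")
reproduces the example command line from Getting Started as a list that can be
passed to subprocess.run.
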
The remainder of this file describes the theory behind the tester design. A
reference to a more detailed research paper is provided at the end.

----- Theory Overview -----

The GPU Ruby tester creates a system consisting of both CPU threads and GPU
wavefronts. CPU threads are scalar, so there is one lane per CPU thread. A GPU
wavefront may have multiple lanes. The number of lanes is set when a
thread/wavefront is created.

Each thread/wavefront executes a number of episodes. Each episode is a series
of memory actions (i.e., atomic, load, store, acquire and release). In a
wavefront, all lanes execute the same sequence of actions, but they may target
different addresses. One can think of an episode as a critical section that
is bounded by a lock acquire at the beginning and a lock release at the end. An
episode consists of actions in the following order:

    1 - Atomic action
    2 - Acquire action
    3 - A number of load and store actions
    4 - Release action
    5 - Atomic action that targets the same address as (1) does

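The episode layout above can be sketched in a few lines of Python. This is
only an illustration of the ordering for a single lane: the Action class and
make_episode function are hypothetical names, and the load/store targets are
picked at random here without the data-race-free selection that the tester
actually applies.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Action:
    kind: str                 # "atomic", "acquire", "load", "store", "release"
    location: Optional[int]   # None for acquire and release


def make_episode(atomic_loc: int, data_locs: List[int], num_ld_st: int,
                 rng: random.Random) -> List[Action]:
    """Build one episode in the 1-5 order listed above (single lane)."""
    body = [
        Action(rng.choice(("load", "store")), rng.choice(data_locs))
        for _ in range(num_ld_st)
    ]
    return [
        Action("atomic", atomic_loc),   # (1) opening atomic
        Action("acquire", None),        # (2)
        *body,                          # (3) loads and stores
        Action("release", None),        # (4)
        Action("atomic", atomic_loc),   # (5) same address as (1)
    ]
```
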
There are two separate sets of addresses: atomic and non-atomic. Atomic actions
target only atomic addresses. Load and store actions target only non-atomic
addresses. Memory addresses are all 4-byte aligned in the tester.

To test false sharing cases in which both atomic and non-atomic addresses are
placed in the same cache line, we hide memory addresses from the tester behind
the concept of a location. Locations are numbered from 0 to N-1 (if there are
N addresses). The first X locations [0..X-1] are atomic locations, and the
rest are non-atomic locations. The 1-1 mapping between locations and addresses
is randomly created when the tester is initialized.

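A minimal sketch of such a mapping follows. It is not the AddressManager
implementation: the function names, the base address of 0, the seed handling
and the 64-byte cache line are all assumptions made for illustration. Only the
4-byte alignment and the "first X locations are atomic" rule come from the
text above.

```python
import random

WORD_BYTES = 4          # the README states addresses are 4-byte aligned
CACHE_LINE_BYTES = 64   # assumption, used only to illustrate false sharing


def make_location_map(num_locations, base_addr=0, seed=0):
    """Return a list where index = location and value = byte address."""
    addresses = [base_addr + i * WORD_BYTES for i in range(num_locations)]
    random.Random(seed).shuffle(addresses)  # random 1-1 mapping
    return addresses


def is_atomic(location, num_atomic):
    """Locations [0..num_atomic-1] are atomic, the rest are non-atomic."""
    return location < num_atomic


def mixed_cache_lines(location_map, num_atomic):
    """Cache lines that hold both atomic and non-atomic addresses."""
    atomic_lines = {a // CACHE_LINE_BYTES for a in location_map[:num_atomic]}
    normal_lines = {a // CACHE_LINE_BYTES for a in location_map[num_atomic:]}
    return atomic_lines & normal_lines
```

With 32 locations, 4 of them atomic, the addresses fill two 64-byte lines, so
every line that receives an atomic address also holds non-atomic ones. That
is the false sharing case the location abstraction is meant to exercise.
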
For each load and store action, the target location is selected so that the
generated stream of memory requests contains no data race at any time during
the test. Because the memory system's behavior is undefined for data races
under the Data-Race-Free model, we exclude data race scenarios from our
protocol test.

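This README does not spell out the selection rule, so the sketch below uses
one conservative rule that is sufficient to avoid races: while an episode is
active, a location it stores to may not be touched by any other
thread/wavefront, and a location it loads from may not be stored to by any
other thread/wavefront. The LocationClaims class and its methods are
hypothetical and are not taken from AddressManager.

```python
import random


class LocationClaims:
    """Tracks which active episodes read or write each non-atomic location."""

    def __init__(self, locations):
        self.readers = {loc: set() for loc in locations}
        self.writer = {loc: None for loc in locations}

    def can_load(self, tid, loc):
        # Loading is safe unless another thread is currently the writer.
        return self.writer[loc] in (None, tid)

    def can_store(self, tid, loc):
        # Storing is safe only if no other thread reads or writes the location.
        return self.writer[loc] in (None, tid) and self.readers[loc] <= {tid}

    def pick(self, tid, kind, rng):
        """Pick a race-free location for a "load" or "store", or None."""
        check = self.can_load if kind == "load" else self.can_store
        candidates = [loc for loc in self.readers if check(tid, loc)]
        if not candidates:
            return None
        loc = rng.choice(candidates)
        if kind == "load":
            self.readers[loc].add(tid)
        else:
            self.writer[loc] = tid
        return loc

    def release(self, tid):
        """Drop all claims held by tid when its episode ends."""
        for loc in self.readers:
            self.readers[loc].discard(tid)
            if self.writer[loc] == tid:
                self.writer[loc] = None
```
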
Once the location for a load/store action is determined, each thread/wavefront
either loads the current value at the location or stores an incremented value
to that location. The tester maintains a table tracking the last writer of
every location and the value it wrote, so we know what value a load should
return and what value should be written next at a particular location. The
value returned from a load must match the value written by the last writer.

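The bookkeeping described above might look like the following sketch. The
LastWriterTable class is a hypothetical stand-in for the table kept by the
tester, and the initial value of 0 for every location is an assumption.

```python
class LastWriterTable:
    """Per-location record of the last writer and the value it wrote."""

    def __init__(self, num_locations):
        self.value = [0] * num_locations      # assumed initial memory value
        self.writer = [None] * num_locations  # no writer yet

    def next_store_value(self, loc):
        """Value the next store to loc should write (last value + 1)."""
        return self.value[loc] + 1

    def record_store(self, tid, loc):
        """Register a store by tid and return the value it wrote."""
        self.value[loc] = self.next_store_value(loc)
        self.writer[loc] = tid
        return self.value[loc]

    def check_load(self, loc, returned):
        """Raise if a load did not observe the last writer's value."""
        if returned != self.value[loc]:
            raise AssertionError(
                f"location {loc}: load returned {returned}, expected "
                f"{self.value[loc]} from writer {self.writer[loc]}"
            )
```

A mismatch reported by check_load means the memory system returned stale or
corrupt data for a race-free access, which points to a protocol bug.
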
----- Directory Structure -----

ProtocolTester.hh/cc -- This is the main tester class that orchestrates the
                        entire test.
AddressManager.hh/cc -- This manages the address space, randomly maps
                        addresses to locations, generates locations for all
                        episodes, maintains the per-location last writer and
                        validates values returned from load actions.
GpuThread.hh/cc      -- This is the abstract class for CPU threads and GPU
                        wavefronts. It generates and executes a series of
                        episodes.
CpuThread.hh/cc      -- Thread class for CPU threads. Not fully implemented
                        yet.
GpuWavefront.hh/cc   -- GpuThread class for GPU wavefronts.
Episode.hh/cc        -- Class to encapsulate an episode, notably including
                        episode load/store structure and ordering.

For more detail, please see the following paper:

T. Ta, X. Zhang, A. Gutierrez and B. M. Beckmann, "Autonomous Data-Race-Free
GPU Testing," 2019 IEEE International Symposium on Workload Characterization
(IISWC), Orlando, FL, USA, 2019, pp. 81-92, doi:
10.1109/IISWC47752.2019.9042019.