gpu-compute: Implement per-request MTYPEs

The GPU MTYPE is currently set using a global config passed to the
PACoalescer.  This patch enables the MTYPE to be set by the shader on a
per-request basis.  In real hardware, the MTYPE is extracted from a
GPUVM PTE during address translation.  However, our current simulator
only models x86 page tables, which do not have the appropriate bits for
GPU MTYPEs.  Rather than hacking non-x86 bits into our x86 page table
models, this patch instead keeps an interval tree of all pages that
request custom MTYPEs in the driver itself.  This is currently
only used to map host pages to the GPU as uncacheable, but is easily
extensible to other MTYPEs.
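
As a rough illustration of the lookup described above, the sketch below
maps address ranges to custom MTYPEs and falls back to the global default
for any address without an entry.  It is not the driver's code: the class
and method names are hypothetical, a sorted list with bisect stands in for
the interval tree, and the use of 0 (UC_All) for uncacheable host pages is
an assumption.

```python
import bisect


class MtypeRangeMap:
    """Hypothetical stand-in for the driver's per-page MTYPE interval tree."""

    def __init__(self, default_mtype):
        self.default_mtype = default_mtype
        self._starts = []   # sorted range start addresses
        self._ranges = []   # parallel list of (start, end, mtype), end exclusive

    def set_mtype(self, start, size, mtype):
        """Register a custom MTYPE for [start, start + size)."""
        end = start + size
        idx = bisect.bisect_left(self._starts, start)
        # Reject ranges that overlap an existing neighbour.
        if idx > 0 and self._ranges[idx - 1][1] > start:
            raise ValueError("range overlaps an existing entry")
        if idx < len(self._ranges) and self._ranges[idx][0] < end:
            raise ValueError("range overlaps an existing entry")
        self._starts.insert(idx, start)
        self._ranges.insert(idx, (start, end, mtype))

    def lookup(self, vaddr):
        """Return the MTYPE for vaddr, or the default if no range covers it."""
        idx = bisect.bisect_right(self._starts, vaddr) - 1
        if idx >= 0:
            _, end, mtype = self._ranges[idx]
            if vaddr < end:
                return mtype
        return self.default_mtype


# Default of 6 mirrors the dGPU default in this patch; 0 is assumed to
# mean "uncacheable" for the mapped host pages.
mtypes = MtypeRangeMap(default_mtype=6)
mtypes.set_mtype(0x1000, 0x2000, 0)
print(mtypes.lookup(0x1800), mtypes.lookup(0x4000))
```

Keeping the ranges outside the page tables means the x86 page table model
stays untouched, at the cost of one extra lookup per request.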

Change-Id: I7daab0ffae42084b9131a67c85cd0aa4bbbfc8d6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42216
Maintainer: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Michael LeBeane
2018-10-31 16:25:12 -04:00
committed by Matthew Poremba
parent dfa712f041
commit ad43083bb3
9 changed files with 274 additions and 23 deletions

@@ -173,6 +173,21 @@ parser.add_argument("--dgpu", action="store_true", default=False,
"transfered from host to device memory using runtime calls "
"that copy data over a PCIe-like IO bus.")
# Mtype option
#-- 1 1 1 C_RW_S (Cached-ReadWrite-Shared)
#-- 1 1 0 C_RW_US (Cached-ReadWrite-Unshared)
#-- 1 0 1 C_RO_S (Cached-ReadOnly-Shared)
#-- 1 0 0 C_RO_US (Cached-ReadOnly-Unshared)
#-- 0 1 x UC_L2 (Uncached_GL2)
#-- 0 0 x UC_All (Uncached_All_Load)
# default value: 5/C_RO_S (only allow caching in GL2 for read. Shared)
parser.add_argument("--m-type", type=int, default=5,
help="Default Mtype for GPU memory accesses. This is the "
"value used for all memory accesses on an APU and is the "
"default mode for dGPU unless explicitly overwritten by "
"the driver on a per-page basis. Valid values are "
"between 0-7")
Ruby.define_options(parser)
# add TLB options to the parser
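
The encoding table in the hunk above can be read as three bits: cached,
read-write, and shared.  The helper below is only a sketch of that reading;
the function name is hypothetical, and the bit positions are inferred from
the table and from the stated default of 5 being C_RO_S.

```python
def decode_mtype(m_type):
    """Map a 3-bit --m-type value to the name used in the option's comment.

    Hypothetical helper; bit 2 = cached, bit 1 = read-write (or GL2-only
    when uncached), bit 0 = shared (ignored when uncached).
    """
    if not 0 <= m_type <= 7:
        raise ValueError("m_type must be between 0 and 7")
    if not m_type & 0b100:
        # Uncached: bit 1 selects UC_L2 over UC_All, bit 0 is a don't-care.
        return "UC_L2" if m_type & 0b010 else "UC_All"
    access = "RW" if m_type & 0b010 else "RO"
    sharing = "S" if m_type & 0b001 else "US"
    return f"C_{access}_{sharing}"


for value in range(8):
    print(value, decode_mtype(value))
```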
@@ -407,8 +422,15 @@ hsapp_gpu_map_vaddr = 0x200000000
hsapp_gpu_map_size = 0x1000
hsapp_gpu_map_paddr = int(Addr(args.mem_size))
if args.dgpu:
# Default --m-type for dGPU is write-back gl2 with system coherence
# (coherence at the level of the system directory between other dGPUs and
# CPUs) managed by kernel boundary flush operations targeting the gl2.
args.m_type = 6
# HSA kernel mode driver
gpu_driver = GPUComputeDriver(filename = "kfd", isdGPU = args.dgpu)
gpu_driver = GPUComputeDriver(filename = "kfd", isdGPU = args.dgpu,
dGPUPoolID = 1, m_type = args.m_type)
# Creating the GPU kernel launching components: that is the HSA
# packet processor (HSAPP), GPU command processor (CP), and the