gpu-compute: Support cache line sizes >64B in GPUFS (#939)

This change fixes two issues:

1) The --cacheline_size option was setting the system cache line size
but not the Ruby cache line size, and the mismatch was causing assertion
failures.

2) The submitDispatchPkt() function reads the kernel object in chunks
whose size equals the cache line size. For cache line sizes >64B (e.g.
128B), the kernel object is not guaranteed to be cache-line aligned, so
a chunk could straddle two separate device memories, causing the memory
access to fail.
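
The second issue can be illustrated with a small standalone sketch. It
is not the code from this change, and it does not claim to reproduce the
actual fix: the helper names (naive_chunks, aligned_chunks, crosses),
the addresses, and the assumption that the boundary between the two
memories falls on a 128B-aligned address are all hypothetical. It only
shows why chunks that start at an unaligned base can cross such a
boundary, and that splitting at cache line boundaries is one way to
avoid it.

```python
def naive_chunks(base, size, line_size):
    """Split [base, base + size) into line_size-byte chunks starting at base."""
    return [(addr, min(line_size, base + size - addr))
            for addr in range(base, base + size, line_size)]


def aligned_chunks(base, size, line_size):
    """Split [base, base + size) so that no chunk crosses a line_size boundary."""
    chunks = []
    addr, end = base, base + size
    while addr < end:
        # First line_size-aligned address strictly above addr.
        next_boundary = (addr // line_size + 1) * line_size
        length = min(next_boundary, end) - addr
        chunks.append((addr, length))
        addr += length
    return chunks


def crosses(chunk, boundary):
    """Return True if the chunk has bytes on both sides of boundary."""
    addr, length = chunk
    return addr < boundary < addr + length


# Hypothetical layout: a 256-byte object that is 64B-aligned but not
# 128B-aligned, with a boundary between two memories at 0x1000.
BASE, SIZE, LINE, MEM_BOUNDARY = 0x0FC0, 256, 128, 0x1000

naive = naive_chunks(BASE, SIZE, LINE)
aligned = aligned_chunks(BASE, SIZE, LINE)

print("naive crosses boundary:  ", any(crosses(c, MEM_BOUNDARY) for c in naive))
print("aligned crosses boundary:", any(crosses(c, MEM_BOUNDARY) for c in aligned))
```

With the same hypothetical base address and a 64B line size, the naive
split does not cross the boundary either, which is consistent with the
problem only appearing for cache line sizes >64B.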

Change-Id: I8e45146901943e9c2750d32162c0f35c851e09e1

Co-authored-by: Michael Boyer <Michael.Boyer@amd.com>
Michael Boyer
2024-03-20 11:09:25 -07:00
committed by GitHub
parent 2b67d0eba6
commit ba2f5615ba
2 changed files with 11 additions and 3 deletions

@@ -58,6 +58,8 @@ class Disjoint_VIPER(RubySystem):
        self.network_cpu = DisjointSimple(self)
        self.network_gpu = DisjointSimple(self)
        self.block_size_bytes = options.cacheline_size
        # Construct CPU controllers
        cpu_dir_nodes = construct_dirs(options, system, self, self.network_cpu)
        (cp_sequencers, cp_cntrl_nodes) = construct_corepairs(