Minor changes

2024-04-07 22:41:59 +02:00
parent 3d15758c82
commit d634f97fb2
6 changed files with 107 additions and 61 deletions
--- a/slides/simulations.md
+++ b/slides/simulations.md
@@ -14,7 +14,7 @@

 - Vector-Matrix benchmarks (BLAS level 2)
    - GEMV: $z = A \cdot x$
-    - DNN:
+    - Simple DNN:
      - $f(x) = z = ReLU(A \cdot x)$
      - $z_{n+1} = f(z_n)$
      - 5 layers in total
@@ -36,24 +36,44 @@ Operand Dimensions
 </div>
 </div>

+<!--
+- operand data significantly larger than on-chip cache
+-->
+
 ---

 ## Simulations
 ### System Configuration
 <hr/>

+<br>
 <br>
 <br>

- Two simulated systems:
-    - Generic ARM systems
-    - Infinite compute ARM system
+<div class="grid grid-cols-2 gap-4">
+<div>
+
+#### Two simulated systems:

 <br>

- Two real GPUs using HBM2:
-  - AMD RX Vega 56
-  - NVIDIA V100
+- Generic ARM system
+- Infinite compute system
+  - completely memory bound
+
+</div>
+
+<div>
+
+#### Two real GPUs using HBM2:
+
+<br>
+
+- AMD RX Vega 56
+- NVIDIA V100
+
+</div>
+</div>

 ---
 layout: figure
@@ -75,11 +95,15 @@ figureCaption: Speedups of PIM compared to non-PIM
 ### Speedups / Infinite Compute System
 <hr/>

+<!--
+- VADD: 12.7x
+- GEMV: 9.0x
+-->
+
 ---
 layout: figure
 figureUrl: /samsung.svg
 figureCaption: Speedups of Samsung for VADD and GEMV
-figureFootnoteNumber: 1
 ---

 ## Simulations
@@ -97,6 +121,7 @@ figureFootnoteNumber: 1
 - ADD shows deviation

 -> differences in hardware architecture
+- GPU has no speculative execution
 -->

 ---
@@ -111,6 +136,7 @@ figureCaption: Runtimes for Vector Benchmarks

 <!--
 - Real GPUs use multiple memory channels
+- Memory barriers
 - Also architectural differences
 -->
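
Review note: the `Simple DNN` benchmark from the first hunk of `slides/simulations.md` ($f(x) = ReLU(A \cdot x)$, $z_{n+1} = f(z_n)$, 5 layers in total) can be sketched roughly as below. This is a minimal illustration, not the benchmark code used for the simulations. It assumes a single square matrix `A` is reused in every layer, which is how the recurrence reads literally. The names `gemv`, `relu`, and `simple_dnn` and the 2x2 operands are hypothetical.

```python
def gemv(a, x):
    """Matrix-vector product z = A . x (the GEMV benchmark), in pure Python."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def relu(v):
    """Element-wise ReLU: negative entries are clamped to zero."""
    return [max(0.0, e) for e in v]


def simple_dnn(a, x, layers=5):
    """Apply f(z) = ReLU(A . z) repeatedly, starting from z_0 = x."""
    z = x
    for _ in range(layers):
        z = relu(gemv(a, z))
    return z


# Hypothetical operands, chosen only for illustration; the real operand
# dimensions are the ones listed on the "Operand Dimensions" slide.
A = [[1.0, -1.0],
     [0.5, 0.5]]
x = [1.0, 2.0]

result = simple_dnn(A, x)
```

Each layer is one GEMV followed by an element-wise activation, so the workload stays dominated by matrix-vector products like the plain GEMV benchmark.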