Add more slides and images

2024-04-03 22:45:37 +02:00
parent fb8c674f2a
commit a7d5b77dcd
19 changed files with 20783 additions and 6 deletions
--- a/slides/implementation.md
+++ b/slides/implementation.md
@@ -0,0 +1,68 @@
+---
+layout: figure
+figureUrl: /dramsys.svg
+figureCaption: The PIM-HBM model integrated into DRAMSys
+---
+
+## Virtual Prototype
+### Processing Units
+<hr/>
+
+---
+layout: figure-side
+figureUrl: /data_structures.svg
+figureCaption: The PIM-HBM model integrated into DRAMSys
+---
+
+## Virtual Prototype
+### Software Library
+<hr/>
+
+<br>
+<br>
+
+- Software support library written in Rust
+- Provides data structures for PIM-HBM
+  - Adhering to special memory layout requirements
+- Executes programmed microkernels
+
+---
+layout: figure-side
+figureUrl: /bare_metal.svg
+---
+
+## Virtual Prototype
+### Platform
+<hr/>
+
+<br>
+<br>
+
+- Bare-metal kernel executes on ARM processor model
+- Custom page table configuration
+  - Non-PIM DRAM region mapped as cacheable memory
+  - PIM DRAM region mapped as non-cacheable memory
+
+---
+
+<hr/>
+
+<br>
+<br>
+GEMV Microkernel
+
+```asm{none|1-8|9,10|11|all}{lines:true}
+MOV GRF_A #0, BANK
+MOV GRF_A #1, BANK
+MOV GRF_A #2, BANK
+MOV GRF_A #3, BANK
+MOV GRF_A #4, BANK
+MOV GRF_A #5, BANK
+MOV GRF_A #6, BANK
+MOV GRF_A #7, BANK
+MAC(AAM) GRF_B, BANK, GRF_A
+JUMP -1, 7
+FILL BANK, GRF_B #0
+EXIT
+```
+
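Note on the GEMV microkernel added in `slides/implementation.md`: the following is a minimal Python sketch of what the twelve instructions appear to compute on a single processing unit. It is not taken from the commit. It assumes that each `MOV` loads one 16-lane burst of the input vector into a `GRF_A` entry, that `JUMP -1, 7` repeats the preceding `MAC(AAM)` seven more times, that the i-th `MAC` pairs the i-th weight burst with `GRF_A #i` and always accumulates into `GRF_B #0`, and that the final lane reduction happens outside the microkernel. The function names `run_gemv_microkernel` and `reduce_lanes` are hypothetical.

```python
LANES = 16          # SIMD width of the FPUs, as stated in the speaker notes
GRF_A_ENTRIES = 8   # GRF_A #0..#7, one per MOV instruction


def run_gemv_microkernel(vector_bursts, weight_bursts):
    """Emulate MOV x8, MAC(AAM) x8 and FILL for one processing unit.

    vector_bursts: 8 lists of 16 input-vector elements (read from BANK).
    weight_bursts: 8 lists of 16 weight elements (read from BANK).
    Returns the 16 lane-wise partial sums that FILL writes back.
    """
    assert len(vector_bursts) == GRF_A_ENTRIES
    assert len(weight_bursts) == GRF_A_ENTRIES

    # MOV GRF_A #i, BANK: every bank read fills one GRF_A entry.
    grf_a = [list(burst) for burst in vector_bursts]

    # Assumption: GRF_B #0 is the only accumulator that is used.
    grf_b0 = [0.0] * LANES

    # MAC(AAM) followed by JUMP -1, 7: eight MACs in total.
    for i in range(GRF_A_ENTRIES):
        for lane in range(LANES):
            grf_b0[lane] += weight_bursts[i][lane] * grf_a[i][lane]

    # FILL BANK, GRF_B #0: the accumulator is written back to the bank.
    return grf_b0


def reduce_lanes(partial_sums):
    """Assumed final step outside the microkernel: add up the lanes."""
    return sum(partial_sums)
```

Under these assumptions, one microkernel run covers 8 × 16 = 128 input elements, which is consistent with the 128-element vector in the GEMV figure caption.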
--- a/slides/introduction.md
+++ b/slides/introduction.md
@@ -1,6 +1,6 @@
 ---
 layout: figure
-figureUrl: world_energy.svg
+figureUrl: /world_energy.svg
 figureCaption: Total energy of computing
 figureFootnoteNumber: 1
 ---
@@ -17,7 +17,7 @@ figureFootnoteNumber: 1

 ---
 layout: figure
-figureUrl: gpt.svg
+figureUrl: /gpt.svg
 figureCaption: Roofline model of GPT revisions
 figureFootnoteNumber: 1
 ---
--- a/slides/pim.md
+++ b/slides/pim.md
@@ -1,6 +1,6 @@
 ---
 layout: figure
-figureUrl: dnn.svg
+figureUrl: /dnn.svg
 figureCaption: A fully connected DNN layer
 figureFootnoteNumber: 1
 ---
@@ -37,11 +37,107 @@ Possible placements of compute logic<sup>1</sup>:

 <br>

-<div v-click class="text-xl"> The nearer the computation is to the memory array, the higher the achievable bandwidth! </div>
+<div v-click class="text-xl"> The nearer the computation is to the memory cells, the higher the achievable bandwidth! </div>

 <Footnotes separator>
  <Footnote :number=1>
  Sudarshan et al. „A Critical Assessment of DRAM-PIM Architectures - Trends, Challenges and Solutions“, 2022.
-
 </Footnote>
-</Footnotes>
+</Footnotes>
+
+---
+layout: figure
+figureUrl: /hbm-pim.svg
+figureCaption: Architecture of PIM-HBM
+figureFootnoteNumber: 1
+---
+
+## Processing-in-Memory
+### Samsung's HBM-PIM
+<hr/>
+
+<Footnotes separator>
+  <Footnote :number=1>
+  Lee et al. „Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product“, 2021.
+</Footnote>
+</Footnotes>
+
+<!--
+- Real-world PIM implementation based on HBM2
+- SIMD FPUs are 16-wide, i.e., there are 16 FPU units
+- Three execution modes
+    - Single-Bank (SB)
+    - All-Bank (AB)
+    - All-Bank-PIM (AB-PIM)
+-->
+
+---
+layout: figure
+figureUrl: /pu.svg
+figureCaption: Architecture of a PIM processing unit
+figureFootnoteNumber: 1
+---
+
+## Processing-in-Memory
+### Samsung's HBM-PIM
+<hr/>
+
+<Footnotes separator>
+  <Footnote :number=1>
+  Lee et al. „Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product“, 2021.
+</Footnote>
+</Footnotes>
+
+<!--
+- Control unit executes RISC instructions
+- Two SIMD FPUs
+  - ADD
+  - MUL
+
+- CRF: 32 32-bit entries (32 instructions)
+- GRF: 16 256-bit entries
+- SRF: 16 16-bit entries
+
+- One instruction is executed each time a RD or WR command is issued
+-->
+
+---
+layout: figure
+figureUrl: /gemv.svg
+figureCaption: Procedure to perform a (128×8)×(128) GEMV operation
+figureFootnoteNumber: 1
+---
+
+## Processing-in-Memory
+### Samsung's HBM-PIM
+<hr/>
+
+<Footnotes separator>
+  <Footnote :number=1>
+  Lee et al. „Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product“, 2021.
+</Footnote>
+</Footnotes>
+
+---
+layout: figure
+figureUrl: /layout.svg
+figureCaption: Mapping of the weight matrix onto the memory banks
+---
+
+## Processing-in-Memory
+### Samsung's HBM-PIM
+<hr/>
+
+<!--
+- Data layout in program and address mapping must match
+-->
+
+---
+
+## Processing-in-Memory
+### Research
+<hr/>
+
+Simulation models are needed.
+
+Research should not only focus on hardware but also explore the software side!
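Note on the register-file sizes in the speaker notes of `slides/pim.md` (CRF, GRF, SRF): a small Python sketch of the arithmetic they imply. The entry counts and widths are copied from the notes. The 16-bit element width is an assumption that the notes do not state. The helper names are hypothetical.

```python
# (entries, bits per entry), as listed in the speaker notes
REGISTER_FILES = {
    "CRF": (32, 32),    # command register file, one instruction per entry
    "GRF": (16, 256),   # general register file
    "SRF": (16, 16),    # scalar register file
}

ELEMENT_BITS = 16  # assumption: 16-bit data elements


def capacity_bytes(name):
    """Total capacity of one register file in bytes."""
    entries, bits_per_entry = REGISTER_FILES[name]
    return entries * bits_per_entry // 8


def lanes_per_grf_entry():
    """Number of elements that fit into one GRF entry."""
    _, bits_per_entry = REGISTER_FILES["GRF"]
    return bits_per_entry // ELEMENT_BITS
```

With 16-bit elements, one 256-bit GRF entry holds 16 elements, which matches the 16-wide SIMD FPUs mentioned in the same notes.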
--- a/slides/simulations.md
+++ b/slides/simulations.md
@@ -0,0 +1,38 @@
+## Simulations
+### Microbenchmarks
+<hr/>
+
+<br>
+<br>
+
+<div class="grid grid-cols-2 gap-4">
+<div>
+
+- Vector benchmarks (BLAS level 1)
+    - VADD: $z = x + y$
+    - VMUL: $z = x \cdot y$
+    - HAXPY: $z = a \cdot x + y$
+
+- Matrix-vector benchmarks (BLAS level 2)
+    - GEMV: $z = A \cdot x$
+    - DNN Layer: $z = \mathrm{ReLU}(A \cdot x)$
+
+</div>
+<div>
+
+| Level | Vector | GEMV          | DNN           |
+|-------|--------|---------------|---------------|
+| X1    | (2M)   | (1024 x 4096) | (256 x 256)   |
+| X2    | (4M)   | (2048 x 4096) | (512 x 512)   |
+| X3    | (8M)   | (4096 x 8192) | (1024 x 1024) |
+| X4    | (16M)  | (4096 x 8192) | (2048 x 2048) |
+
+</div>
+</div>
+
+---
+layout: figure
+figureUrl: /dnn.svg
+figureCaption: A fully connected DNN layer
+figureFootnoteNumber: 1
+---
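Note on the microbenchmarks listed in `slides/simulations.md`: the formulas can be written down as plain-Python reference functions, for example to check simulation results against. This is a sketch and not part of the commit. It assumes that VMUL is an element-wise product (the slide only writes $x \cdot y$) and that matrices are stored as lists of rows. All function names are hypothetical.

```python
def vadd(x, y):
    """VADD: z = x + y (element-wise)."""
    return [xi + yi for xi, yi in zip(x, y)]


def vmul(x, y):
    """VMUL: z = x * y, assumed here to be element-wise."""
    return [xi * yi for xi, yi in zip(x, y)]


def haxpy(a, x, y):
    """HAXPY: z = a * x + y with a scalar a."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def gemv(A, x):
    """GEMV: z = A * x, with A given as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def dnn_layer(A, x):
    """DNN layer: z = ReLU(A * x)."""
    return [max(0.0, z_i) for z_i in gemv(A, x)]
```

A short usage example:

```python
x = [1.0, 2.0, 3.0]
A = [[1.0, 0.0, -1.0],
     [0.0, 2.0, 0.0]]
print(gemv(A, x))       # [-2.0, 4.0]
print(dnn_layer(A, x))  # [0.0, 4.0]
```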