Add more slides and images
68
slides/implementation.md
Normal file
@@ -0,0 +1,68 @@
---
layout: figure
figureUrl: /dramsys.svg
figureCaption: The PIM-HBM model integrated into DRAMSys
---

## Virtual Prototype
### Processing Units
<hr/>

---
layout: figure-side
figureUrl: /data_structures.svg
figureCaption: The PIM-HBM model integrated into DRAMSys
---

## Virtual Prototype
### Software Library
<hr/>

<br>
<br>

- Software support library written in Rust
- Provides data structures for PIM-HBM
  - Adhering to special memory layout requirements
- Executes programmed microkernels

---
layout: figure-side
figureUrl: /bare_metal.svg
---

## Virtual Prototype
### Platform
<hr/>

<br>
<br>

- Bare-metal kernel executes on ARM processor model
- Custom page table configuration
  - Non-PIM DRAM region mapped as cacheable memory
  - PIM DRAM region mapped as non-cacheable memory

---

<hr/>

<br>
<br>
GEMV Microkernel

```asm{none|1-8|9,10|11|all}{lines:true}
MOV GRF_A #0, BANK
MOV GRF_A #1, BANK
MOV GRF_A #2, BANK
MOV GRF_A #3, BANK
MOV GRF_A #4, BANK
MOV GRF_A #5, BANK
MOV GRF_A #6, BANK
MOV GRF_A #7, BANK
MAC(AAM) GRF_B, BANK, GRF_A
JUMP -1, 7
FILL BANK, GRF_B #0
EXIT
```
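
One way to read the listing above is as a small loop over register entries. The sketch below is a rough Python model of a single processing unit, not the actual simulator code. It rests on several assumptions that the slides do not confirm: that each GRF entry holds 16 values (the speaker notes elsewhere mention 256-bit GRF entries and 16-wide SIMD FPUs), that `JUMP -1, 7` repeats the preceding `MAC` seven more times, that the address-aligned mode (`AAM`) pairs the i-th `MAC` with `GRF_A #i`, and that all products accumulate lane by lane into `GRF_B #0`. The names `run_gemv_microkernel`, `LANES`, and `GRF_ENTRIES` are hypothetical.

```python
LANES = 16        # assumption: one 256-bit GRF entry holds 16 16-bit values
GRF_ENTRIES = 8   # the listing fills GRF_A #0 to #7


def run_gemv_microkernel(input_chunks, weight_chunks):
    """Model the listing for one processing unit (hypothetical helper).

    input_chunks:  8 lists of 16 numbers, the data each MOV reads from the bank
    weight_chunks: 8 lists of 16 numbers, the data each MAC reads from the bank
    Returns the 16-lane accumulator that FILL would write back to the bank.
    """
    assert len(input_chunks) == GRF_ENTRIES == len(weight_chunks)

    # Lines 1-8: MOV GRF_A #i, BANK copies one input chunk into each entry.
    grf_a = [list(chunk) for chunk in input_chunks]

    # Lines 9-10: one MAC plus seven repeats, accumulating lane by lane.
    # Assumption: every repeat targets GRF_B #0 and reads GRF_A #i.
    grf_b0 = [0.0] * LANES
    for i in range(GRF_ENTRIES):
        for lane in range(LANES):
            grf_b0[lane] += weight_chunks[i][lane] * grf_a[i][lane]

    # Line 11: FILL BANK, GRF_B #0 stores the accumulator in the bank.
    return grf_b0
```

With all inputs set to one and the i-th weight chunk set to `i + 1` in every lane, each lane of the result is 1 + 2 + ... + 8 = 36. Whether the lanes are reduced further afterwards is not stated in the slides.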
@@ -1,6 +1,6 @@
---
layout: figure
-figureUrl: world_energy.svg
+figureUrl: /world_energy.svg
figureCaption: Total energy of computing
figureFootnoteNumber: 1
---
@@ -17,7 +17,7 @@ figureFootnoteNumber: 1

---
layout: figure
-figureUrl: gpt.svg
+figureUrl: /gpt.svg
figureCaption: Roofline model of GPT revisions
figureFootnoteNumber: 1
---

104
slides/pim.md
@@ -1,6 +1,6 @@
---
layout: figure
-figureUrl: dnn.svg
+figureUrl: /dnn.svg
figureCaption: A fully connected DNN layer
figureFootnoteNumber: 1
---
@@ -37,11 +37,107 @@ Possible placements of compute logic<sup>1</sup>:

<br>

-<div v-click class="text-xl"> The nearer the computation is to the memory array, the higher the achievable bandwidth! </div>
+<div v-click class="text-xl"> The nearer the computation is to the memory cells, the higher the achievable bandwidth! </div>

<Footnotes separator>
<Footnote :number=1>
Sudarshan et al. „A Critical Assessment of DRAM-PIM Architectures - Trends, Challenges and Solutions“, 2022.

</Footnote>
-</Footnotes>
+</Footnotes>

---
layout: figure
figureUrl: /hbm-pim.svg
figureCaption: Architecture of PIM-HBM
figureFootnoteNumber: 1
---

## Processing-in-Memory
### Samsung's HBM-PIM
<hr/>

<Footnotes separator>
<Footnote :number=1>
Lee et al. „Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product“, 2021.
</Footnote>
</Footnotes>

<!--
- Real-world PIM implementation based on HBM2
- SIMD FPUs are 16-wide, i.e., there are 16 FPU units
- Three execution modes
  - Single-Bank (SB)
  - All-Bank (AB)
  - All-Bank-PIM (AB-PIM)
-->

---
layout: figure
figureUrl: /pu.svg
figureCaption: Architecture of a PIM processing unit
figureFootnoteNumber: 1
---

## Processing-in-Memory
### Samsung's HBM-PIM
<hr/>

<Footnotes separator>
<Footnote :number=1>
Lee et al. „Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product“, 2021.
</Footnote>
</Footnotes>

<!--
- Control unit executes RISC instructions
- Two SIMD FPUs
  - ADD
  - MUL

- CRF: 32 32-bit entries (32 instructions)
- GRF: 16 256-bit entries
- SRF: 16 16-bit entries

- One instruction is executed when an RD or WR command is issued
-->

---
layout: figure
figureUrl: /gemv.svg
figureCaption: Procedure to perform a (128×8)×(128) GEMV operation
figureFootnoteNumber: 1
---

## Processing-in-Memory
### Samsung's HBM-PIM
<hr/>

<Footnotes separator>
<Footnote :number=1>
Lee et al. „Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product“, 2021.
</Footnote>
</Footnotes>

---
layout: figure
figureUrl: /layout.svg
figureCaption: Mapping of the weight matrix onto the memory banks
---

## Processing-in-Memory
### Samsung's HBM-PIM
<hr/>

<!--
- Data layout in program and address mapping must match
-->

---

## Processing-in-Memory
### Research
<hr/>

Simulation models are needed

Research should not only focus on hardware but also explore the software side!

38
slides/simulations.md
Normal file
@@ -0,0 +1,38 @@
## Simulations
### Microbenchmarks
<hr/>

<br>
<br>

<div class="grid grid-cols-2 gap-4">
<div>

- Vector benchmarks (BLAS level 1)
  - VADD: $z = x + y$
  - VMUL: $z = x \cdot y$
  - HAXPY: $z = a \cdot x + y$

- Vector-Matrix benchmarks (BLAS level 2)
  - GEMV: $z = A \cdot x$
  - DNN Layer: $z = \mathrm{ReLU}(A \cdot x)$

</div>
<div>

| Level | Vector | GEMV          | DNN           |
|-------|--------|---------------|---------------|
| X1    | (2M)   | (1024 x 4096) | (256 x 256)   |
| X2    | (4M)   | (2048 x 4096) | (512 x 512)   |
| X3    | (8M)   | (4096 x 8192) | (1024 x 1024) |
| X4    | (16M)  | (4096 x 8192) | (2048 x 2048) |

</div>
</div>
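
The benchmark formulas above can be written as plain reference functions, for example to check results produced by a PIM model. This is a minimal sketch in pure Python. The function names are hypothetical, and it assumes that VMUL is an element-wise product (the slide only writes $x \cdot y$) and that the DNN layer has no bias term, as in the formula shown.

```python
def vadd(x, y):
    """VADD: z = x + y, element by element."""
    return [xi + yi for xi, yi in zip(x, y)]


def vmul(x, y):
    """VMUL: z = x * y, assumed to be element-wise."""
    return [xi * yi for xi, yi in zip(x, y)]


def haxpy(a, x, y):
    """HAXPY: z = a * x + y with a scalar a."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def gemv(A, x):
    """GEMV: z = A * x, with A given as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]


def dnn_layer(A, x):
    """DNN layer: z = ReLU(A * x), clamping negative outputs to zero."""
    return [max(0.0, value) for value in gemv(A, x)]
```

For instance, `dnn_layer([[1, -2], [3, 4]], [1, 1])` first computes the GEMV result `[-1, 7]` and then clamps the negative entry to zero.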

---
layout: figure
figureUrl: /dnn.svg
figureCaption: A fully connected DNN layer
figureFootnoteNumber: 1
---