Refactor presentation
@@ -2,7 +2,6 @@
### Microbenchmarks
<hr/>

<br>
<br>

<div class="grid grid-cols-2 gap-4">
@@ -15,11 +14,16 @@

- Vector-Matrix benchmarks (BLAS level 2)
- GEMV: $z = A \cdot x$
- DNN Layer: $z = ReLU(A \cdot x)$
- DNN:
- $f(x) = z = ReLU(A \cdot x)$
- $z_{n+1} = f(z_n)$
- 5 layers in total

</div>
<div>

<br>

| Level | Vector | GEMV | DNN |
|-------|--------|---------------|---------------|
| X1 | (2M) | (1024 x 4096) | (256 x 256) |
@@ -27,6 +31,8 @@
| X3 | (8M) | (4096 x 8192) | (1024 x 1024) |
| X4 | (16M) | (4096 x 8192) | (2048 x 2048) |

Operand Dimensions

</div>
</div>

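The benchmark definitions from the Microbenchmarks slide (GEMV, a single ReLU layer, and the five-layer DNN with $z_{n+1} = f(z_n)$) can be sketched in plain Python. This is only an illustrative sketch: the function names and the reuse of a single square matrix `A` for all five layers are assumptions, not taken from the actual benchmark code.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def gemv(a: Matrix, x: Vector) -> Vector:
    """GEMV: z = A . x (one dot product per matrix row)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def relu(v: Vector) -> Vector:
    """Element-wise ReLU: negative entries are clamped to zero."""
    return [max(0.0, v_i) for v_i in v]


def dnn_layer(a: Matrix, x: Vector) -> Vector:
    """One layer: f(x) = ReLU(A . x)."""
    return relu(gemv(a, x))


def dnn(a: Matrix, x: Vector, layers: int = 5) -> Vector:
    """Apply z_{n+1} = f(z_n) repeatedly; 5 layers as stated on the slide.

    Assumption: the same square matrix A is used for every layer.
    """
    z = x
    for _ in range(layers):
        z = dnn_layer(a, z)
    return z


# Tiny hypothetical operands (the real benchmarks use e.g. 256 x 256 and up).
A = [[1.0, -2.0], [0.5, 1.0]]
x = [1.0, 1.0]
result = dnn(A, x)
```

A square `A` is required here so that the output of one layer can be fed into the next, which matches the square DNN operand dimensions in the table.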
@@ -36,11 +42,18 @@
### System Configuration
<hr/>

- Two system configurations:
- ARM 3GHz
- ARM Infinite
<br>
<br>

- TODO ... GPU and so on
- Two simulated systems:
- Generic ARM systems
- Infinite compute ARM system

<br>

- Two real GPUs using HBM2:
- AMD RX Vega 56
- NVIDIA V100

---
layout: figure
@@ -49,7 +62,7 @@ figureCaption: Speedups of PIM compared to non-PIM
---

## Simulations
### Speedups / ARM System
### Speedups / Generic ARM System
<hr/>

---
@@ -74,11 +87,18 @@ figureFootnoteNumber: 1
<hr/>

<Footnotes separator>
<Footnote :number=1>
<Footnote>
Lee et al., "Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product", 2021.
</Footnote>
</Footnotes>

<!--
- GEMV matches well
- ADD shows deviation

-> differences in hardware architecture
-->

---
layout: figure
figureUrl: /runtimes_vector.svg
@@ -89,6 +109,11 @@ figureCaption: Runtimes for Vector Benchmarks
### Runtimes / Vector Benchmarks
<hr/>

<!--
- Real GPUs use multiple memory channels
- Also architectural differences
-->

---
layout: figure
figureUrl: /runtimes_matrix.svg