master-thesis-presentation/slides/simulations.md
2024-04-07 21:21:40 +02:00

## Simulations
### Microbenchmarks
<hr/>
<br>
<div class="grid grid-cols-2 gap-4">
<div>
- Vector benchmarks (BLAS level 1)
  - VADD: $z = x + y$
  - VMUL: $z = x \cdot y$
  - HAXPY: $z = a \cdot x + y$
- Vector-Matrix benchmarks (BLAS level 2)
  - GEMV: $z = A \cdot x$
  - DNN:
    - $f(x) = z = \mathrm{ReLU}(A \cdot x)$
    - $z_{n+1} = f(z_n)$
    - 5 layers in total
</div>
<div>
<br>
| Level | Vector | GEMV | DNN |
|-------|--------|---------------|---------------|
| X1 | (2M) | (1024 x 4096) | (256 x 256) |
| X2 | (4M) | (2048 x 4096) | (512 x 512) |
| X3 | (8M) | (4096 x 8192) | (1024 x 1024) |
| X4 | (16M) | (4096 x 8192) | (2048 x 2048) |
Operand Dimensions
</div>
</div>
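
The kernels listed above can be sketched as plain reference code. This is a minimal illustration, not the benchmark code that was actually run: the function names are hypothetical, VMUL is assumed to be an element-wise product, the DNN is assumed to use one square weight matrix per layer, and the half-precision arithmetic implied by HAXPY is ignored.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def vadd(x: Vector, y: Vector) -> Vector:
    """VADD: z = x + y (element-wise)."""
    return [xi + yi for xi, yi in zip(x, y)]


def vmul(x: Vector, y: Vector) -> Vector:
    """VMUL: z = x * y (assumed element-wise)."""
    return [xi * yi for xi, yi in zip(x, y)]


def haxpy(a: float, x: Vector, y: Vector) -> Vector:
    """HAXPY: z = a * x + y (precision is not modelled here)."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def gemv(A: Matrix, x: Vector) -> Vector:
    """GEMV: z = A * x, one dot product per matrix row."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]


def relu(v: Vector) -> Vector:
    """ReLU: clamp negative entries to zero."""
    return [max(0.0, e) for e in v]


def dnn(layers: List[Matrix], x: Vector) -> Vector:
    """DNN: apply z_{n+1} = ReLU(A_n * z_n) once per layer."""
    z = x
    for A in layers:
        z = relu(gemv(A, z))
    return z
```

Calling `dnn` with a list of five matrices corresponds to the 5-layer configuration on this slide.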
---
## Simulations
### System Configuration
<hr/>
<br>
<br>
- Two simulated systems:
  - Generic ARM system
  - Infinite compute ARM system
<br>

- Two real GPUs using HBM2:
  - AMD RX Vega 56
  - NVIDIA V100
---
layout: figure
figureUrl: /speedup_normal.svg
figureCaption: Speedups of PIM compared to non-PIM
---
## Simulations
### Speedups / Generic ARM System
<hr/>
---
layout: figure
figureUrl: /speedup_inf.svg
figureCaption: Speedups of PIM compared to non-PIM
---
## Simulations
### Speedups / Infinite Compute System
<hr/>
---
layout: figure
figureUrl: /samsung.svg
figureCaption: Speedups reported by Samsung for VADD and GEMV
figureFootnoteNumber: 1
---
## Simulations
### Speedups / Samsung
<hr/>
<Footnotes separator>
<Footnote>
Lee et al. "Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology: Industrial Product", 2021.
</Footnote>
</Footnotes>
<!--
- GEMV matches well
- VADD shows a deviation
  -> differences in hardware architecture
-->
---
layout: figure
figureUrl: /runtimes_vector.svg
figureCaption: Runtimes for Vector Benchmarks
---
## Simulations
### Runtimes / Vector Benchmarks
<hr/>
<!--
- Real GPUs use multiple memory channels
- Also architectural differences
-->
---
layout: figure
figureUrl: /runtimes_matrix.svg
figureCaption: Runtimes for Matrix Benchmarks
---
## Simulations
### Runtimes / Matrix Benchmarks
<hr/>