-- achievable speedup of 17.6 × and 9.0 × hypothetical infinite compute system
- - lower bound
-- linux driver implementation
-- comparison with real neural network workloads
-- consider replacing library approach with compiler approach
-- power comparison, power models needed
+
+
+Speedups of 17.6× and 9.0× were achieved for the hypothetical infinite compute system
+
+
+
+Future work:
+ - Implement a Linux driver
+ - Compare with complete neural networks
+ - Consider replacing library approach with compiler approach
+ - Implement a power model to analyze the power efficiency gains
diff --git a/slides/implementation.md b/slides/implementation.md
index 6e472fa..ded60a6 100644
--- a/slides/implementation.md
+++ b/slides/implementation.md
@@ -1,13 +1,23 @@
----
-layout: figure
-figureUrl: /dramsys.svg
-figureCaption: The PIM-HBM model integrated into DRAMSys
----
-
## Virtual Prototype
### Processing Units
+
+
+- Integrate DRAMSys into gem5
+- Implement PIM-HBM virtual prototype in DRAM model
+
+
+
+
+
+
+
+
+
---
layout: figure-side
figureUrl: /data_structures.svg
@@ -18,12 +28,15 @@ figureCaption: Data structures for instructions and register files
### Software Library
+
-- Software support library
-- Provides data structures for PIM-HBM
- - Adhering special memory layout requirements
+#### Software support library
+
+
+
+- Provides data structures for operand data and microkernels
- Executes programmed microkernels
---
diff --git a/slides/introduction.md b/slides/introduction.md
index a85e18f..07d14b2 100644
--- a/slides/introduction.md
+++ b/slides/introduction.md
@@ -1,33 +1,41 @@
----
-layout: figure
-figureUrl: /world_energy.svg
-figureCaption: Total energy of computing
-figureFootnoteNumber: 1
----
-
## Introduction
### Energy Demand of Applications
+
+
+- Total compute energy approaches world's energy production
+
+ --> drastic improvements in energy efficiency needed
+
+
+
+
+
-
+
SRC. „Decadal Plan for Semiconductors“, January 2021. https://www.src.org/about/decadal-plan/.
-
+
----
-layout: figure
-figureUrl: /gpt.svg
-figureCaption: Roofline model of GPT revisions
-figureFootnoteNumber: 1
---
## Introduction
### Memory Bound Workloads
+
+
+#### Roofline model of GPT revisions¹
+
+
+
+
+
+
+
-
+
Ivo Bolsens. „Scalable AI Architectures for Edge and Cloud“, 2023.
-
+
diff --git a/slides/pim.md b/slides/pim.md
index 7ff319d..01f44e3 100644
--- a/slides/pim.md
+++ b/slides/pim.md
@@ -79,6 +79,10 @@ clicks: 1
+
+
---
## Processing-in-Memory
@@ -94,8 +98,8 @@ clicks: 1
- Inside the memory subarray
-- In the PSA region near a subarray
-- Outside the bank in its peripheral region
+- Near the subarray in the PSA output region
+- Near the bank in its peripheral region
- In the I/O region of the memory
@@ -120,7 +124,7 @@ clicks: 1
The nearer the computation is to the memory cells, the higher the achievable bandwidth!
-
+
Sudarshan et al. „A Critical Assessment of DRAM-PIM Architectures - Trends, Challenges and Solutions“, 2022.
@@ -141,19 +145,27 @@ clicks: 1
- more traditional accelerator approach
-->
----
-layout: figure
-figureUrl: /hbm-pim.svg
-figureCaption: Architecture of PIM-HBM
-figureFootnoteNumber: 1
---
## Processing-in-Memory
-### Samsung's HBM-PIM
+### Samsung's PIM-HBM
+
+
+
+- Real-world PIM implementation based on HBM2
+- PIM units embedded at the bank level
+
+
+
+
+
+
+
+
-
+
Lee et al. „Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product“, 2021.
@@ -167,19 +179,23 @@ figureFootnoteNumber: 1
- All-Bank-PIM (AB-PIM)
-->
----
-layout: figure
-figureUrl: /pu.svg
-figureCaption: Architecture of a PIM processing unit
-figureFootnoteNumber: 1
---
## Processing-in-Memory
-### Samsung's HBM-PIM
+### Samsung's PIM-HBM
+
+
+- Two 16-wide 16-bit FPUs
+- Register files and control unit
+
+
+
+
+
-
+
Lee et al. „Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product“, 2021.
@@ -201,15 +217,14 @@ figureFootnoteNumber: 1
layout: figure
figureUrl: /gemv.svg
figureCaption: Procedure to perform a (128×8)×(128) GEMV operation
-figureFootnoteNumber: 1
---
## Processing-in-Memory
-### Samsung's HBM-PIM
+### Samsung's PIM-HBM
-
+
Lee et al. „Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology : Industrial Product“, 2021.
@@ -221,7 +236,7 @@ figureCaption: Mapping of the weight matrix onto the memory banks
---
## Processing-in-Memory
-### Samsung's HBM-PIM
+### Samsung's PIM-HBM
+
---
layout: figure
figureUrl: /runtimes_vector.svg
@@ -89,6 +109,11 @@ figureCaption: Runtimes for Vector Benchmarks
### Runtimes / Vector Benchmarks
+
+
---
layout: figure
figureUrl: /runtimes_matrix.svg