## Processing-in-Memory
### Applicable Workloads
- Fully connected layers have a large weight matrix
- The weight matrix does not fit into the on-chip cache
- Each weight is used only once per input vector, so there is no data reuse
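
As a rough sketch of why such a layer is memory-bound, the arithmetic intensity (FLOPs per byte moved) of a matrix-vector product can be estimated as below. The function name, the FP16 element size and the 4096×4096 layer size are illustrative assumptions, not values from the slides.

```python
def gemv_intensity(rows, cols, bytes_per_elem=2):
    """Estimate FLOPs per byte for y = W @ x with no on-chip reuse of W.

    Assumes every element of W, x and y crosses the memory interface once.
    """
    flops = 2 * rows * cols  # one multiply and one add per weight
    traffic = bytes_per_elem * (rows * cols + cols + rows)  # W, x and y
    return flops / traffic


# Hypothetical fully connected layer with a 4096x4096 FP16 weight matrix.
intensity = gemv_intensity(4096, 4096)
print(f"{intensity:.3f} FLOPs per byte")
```

The result stays just below 1 FLOP per byte regardless of the layer size, because the traffic is dominated by the weights and each weight contributes exactly two operations.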
---
preload: false
clicks: 1
---
## Processing-in-Memory
### Applicable Workloads
- Convolutional layers have a small filter matrix
- The filter matrix fits into the on-chip cache
- Extensive data reuse in the matrix
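
A minimal sketch of the contrast, assuming a hypothetical 3×3 convolution with 64 input and 64 output channels, FP16 weights and a 56×56 output feature map (none of these sizes come from the slides): the filter weights occupy only a few tens of KiB, and each weight is applied once per output position.

```python
def conv_weight_bytes(kernel, c_in, c_out, bytes_per_elem=2):
    """Size of the filter weights of a kernel x kernel convolution in bytes."""
    return kernel * kernel * c_in * c_out * bytes_per_elem


def conv_weight_reuse(out_h, out_w):
    """Number of times each filter weight is used: once per output position."""
    return out_h * out_w


# Hypothetical layer: 3x3 kernel, 64 -> 64 channels, 56x56 output.
size = conv_weight_bytes(3, 64, 64)
reuse = conv_weight_reuse(56, 56)
print(f"{size / 1024:.0f} KiB of weights, each reused {reuse} times")
```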
---
## Processing-in-Memory
### Applicable Workloads
### Suitable candidates for PIM:
- Multilayer perceptrons (MLPs)
- Layers in recurrent neural networks (RNNs)
### Unsuitable candidates for PIM:
- Convolutional neural networks (CNNs)
---
## Processing-in-Memory
### Architectures
- Inside the memory subarray
- Near the subarray in the PSA output region
- Near the bank in its peripheral region
- In the I/O region of the memory
The nearer the computation is to the memory cells, the higher the achievable bandwidth!
Sudarshan et al. "A Critical Assessment of DRAM-PIM Architectures - Trends, Challenges and Solutions", 2022.
---
## Processing-in-Memory
### Samsung's PIM-HBM
- Real-world PIM implementation based on HBM2
- PIM units embedded at the bank level
Lee et al. "Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology: Industrial Product", 2021.
---
## Processing-in-Memory
### Samsung's PIM-HBM
- Two 16-wide 16-bit FPUs
- Register files and control unit
Lee et al. "Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology: Industrial Product", 2021.
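
To illustrate what a 16-wide 16-bit FPU pair can do in one step, the following sketch models a lane-wise multiply-accumulate in software. It assumes that one FPU multiplies and the other adds, and that the intermediate product is rounded to FP16 before the addition; both points are assumptions of this sketch, and `to_fp16` and `simd_mac` are hypothetical helper names.

```python
import struct

LANES = 16  # SIMD width stated on the slide


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value.

    struct raises OverflowError for values outside the FP16 range.
    """
    return struct.unpack("<e", struct.pack("<e", value))[0]


def simd_mac(acc, a, b):
    """Lane-wise acc + a * b, rounding to FP16 after each operation."""
    assert len(acc) == len(a) == len(b) == LANES
    return [
        to_fp16(acc_i + to_fp16(a_i * b_i))
        for acc_i, a_i, b_i in zip(acc, a, b)
    ]


result = simd_mac([0.0] * LANES, [1.5] * LANES, [2.0] * LANES)
print(result[0])
```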
---
layout: figure
figureUrl: /gemv.svg
figureCaption: Procedure to perform a (128×8)×(128) GEMV operation
---
## Processing-in-Memory
### Samsung's PIM-HBM
Lee et al. "Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology: Industrial Product", 2021.
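
The procedure in the figure can be approximated in software as a chunked GEMV. This sketch reads the operation as 8 output rows by 128 input columns, processes the input vector in chunks of 16 elements, keeps one 16-lane accumulator per output row and reduces the lanes only at the end. The reading of the dimensions, the function name and the use of plain Python floats instead of FP16 are assumptions of this sketch, not details taken from the cited paper.

```python
LANES = 16  # elements processed per SIMD step


def gemv_chunked(weights, x):
    """Compute weights @ x using 16-lane partial sums per output row."""
    rows, cols = len(weights), len(x)
    assert cols % LANES == 0
    # One 16-lane accumulator per output row.
    acc = [[0.0] * LANES for _ in range(rows)]
    for start in range(0, cols, LANES):
        x_chunk = x[start:start + LANES]  # one chunk of the input vector
        for r in range(rows):
            w_chunk = weights[r][start:start + LANES]
            for lane in range(LANES):
                acc[r][lane] += w_chunk[lane] * x_chunk[lane]
    # Final reduction of the 16 lanes into one scalar per output row.
    return [sum(lanes) for lanes in acc]


# Small-integer test data keeps the floating-point arithmetic exact.
W = [[float((r + c) % 5) for c in range(128)] for r in range(8)]
x = [float(c % 3) for c in range(128)]
y = gemv_chunked(W, x)
```

Keeping lane-wise partial sums avoids a horizontal reduction inside the inner loop; the final sum over the lanes is a separate, much smaller step.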
---
layout: figure
figureUrl: /layout.svg
figureCaption: Mapping of the weight matrix onto the memory banks
---
## Processing-in-Memory
### Samsung's PIM-HBM
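
One simple way to think about distributing a weight matrix across banks is a round-robin assignment of matrix rows, so that each bank-level PIM unit works on its own subset of outputs. The sketch below is purely hypothetical and does not reproduce the mapping shown in the figure; `assign_rows_to_banks` and the bank count are illustrative.

```python
def assign_rows_to_banks(num_rows, num_banks):
    """Return, for each bank, the list of matrix rows assigned to it."""
    banks = [[] for _ in range(num_banks)]
    for row in range(num_rows):
        banks[row % num_banks].append(row)  # round-robin interleaving
    return banks


# Hypothetical example: 8 matrix rows spread over 4 banks.
mapping = assign_rows_to_banks(8, 4)
print(mapping)
```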
---
## Processing-in-Memory
### Research
- To analyze the performance gains of PIM, simulation models are needed
- Research should not only focus on hardware but also explore the software side
- In the following, a virtual prototype of PIM-HBM is implemented