Add address mapping example figure

2024-01-30 23:24:19 +01:00
parent eb05bf6507
commit c077fa2dc4
5 changed files with 55 additions and 9 deletions


@@ -98,6 +98,10 @@
short = DDR,
long = double data rate,
}
\DeclareAcronym{ddr}{
short = DDR,
long = double data rate,
}
\DeclareAcronym{dimm}{
short = DIMM,
long = dual in-line memory module,
@@ -126,6 +130,10 @@
short = WR,
long = write,
}
\DeclareAcronym{am}{
short = AM,
long = address mapping,
}
\DeclareAcronym{tlm}{
short = TLM,
long = transaction level modeling,


@@ -15,14 +15,14 @@ The banks can be controlled independently of each other, while the memory arrays
Memory arrays, in turn, are composed of multiple \acp{subarray}.
\Acp{subarray} are grid-like structures composed of \acp{lwl} and \acp{lbl}, with a storage cell at each intersection point.
The \ac{lwl} is connected to the transistor's gate, switching it on and off, while the \ac{lbl} is used to access the stored value.
Global \acp{mwl} and \acp{mbl} span over all \acp{subarray}, forming complete \textit{rows} and \textit{columns} of a memory array.
Because the charge stored in each cell is very small, so-called \acp{psa} are needed to amplify the voltage of each cell while it is connected to the shared \ac{lbl} \cite{jacob2008}; their basic structure is illustrated in Figure \ref{img:psa}.
\begin{figure}
\centering
\includegraphics{images/psa}
\caption[\ac{psa} of an open bitline architecture]{\ac{psa} of an open bitline architecture \cite{jacob2008} \cite{jung2017a}}
\caption[\ac{psa} of an open bitline architecture]{\ac{psa} of an open bitline architecture \cite{jacob2008} \cite{jung2017a}.}
\label{img:psa}
\end{figure}
@@ -39,7 +39,7 @@ The Figure \ref{img:bank} summarizes the basic architecture of a single storage
\begin{figure}
\centering
\includegraphics{images/bank}
\caption[Architecture of a single DRAM device]{Architecture of a single DRAM device \cite{jung2017a}}
\caption[Architecture of a single DRAM device]{Architecture of a single DRAM device \cite{jung2017a}.}
\label{img:bank}
\end{figure}
@@ -49,9 +49,46 @@ A \ac{dimm} may also consist of several independent \textit{ranks}, which are co
Besides the data bus, the channel also consists of the \textit{command bus} and the \textit{address bus}.
The \textit{memory controller}, which sits between the \ac{dram} and the \ac{mpsoc}, issues the commands necessary to control the memory over the command bus.
For example, to read data, the memory controller may first issue a \ac{pre} command to precharge the bitlines in a certain bank, followed by an \ac{act} command to load the contents of a row into the \acp{psa}, and finally a \ac{rd} command to move the data from the \acp{psa} to the \acp{ssa} where it can be exposed to the data bus.
The row, column, bank and rank in question is determined by the address bus.
For example, to read data, the memory controller may first issue a \ac{pre} command to precharge the bitlines in a certain bank, followed by an \ac{act} command to load the contents of a row into the \acp{psa}, and finally a \ac{rd} command to move the data from the \acp{psa} to the \acp{ssa}, from where it can then be exposed to the data bus.
The value on the address bus determines the row, column, bank and rank used by the respective commands, while it is the responsibility of the memory controller to translate the \ac{mpsoc}-side address into these components in a process called \ac{am}.
A well-chosen \ac{am} scheme aims to minimize the number of \textit{row misses}, i.e., the cases in which another row has to be precharged and activated.
% One particularly common \ac{am} scheme is called \textit{Bank Interleaving} \cite{jung2017a}, which maps the lower address bits to the columns, followed by the ranks and banks, and the highest bits to the rows.
One particularly common \ac{am} scheme is called \textit{Bank Interleaving} \cite{jung2017a}, which is illustrated using an exemplary mapping in Figure \ref{img:bank_interleaving}.
Under the assumption of a sequentially increasing address access pattern, this scheme maps the lowest bits of an address to the column bits of a row to exploit the already activated row as much as possible.
After that, instead of addressing the next row of the current bank directly, the mapping switches to another bank to take advantage of \textit{bank parallelism}.
Because banks can be controlled independently, one bank can be outputting the next data burst while another is concurrently precharging or activating a new row.
\begin{figure}
\centering
% \begin{tikzpicture}
% \draw[step=4mm,gray,very thin] (0,0) grid (128mm,4mm);
% \node[draw,minimum width=128mm,minimum height=4mm,inner sep=1pt,anchor=south west] (input) at (0,0) {\tiny Input Address};
% % \node[fill=white,inner sep=1pt] at (input) {\tiny Input Address};
% % \node[draw,grid,gray,very thin] (input.south west) {test};
% % \draw[gray,very thin] (input.north east) grid (2,2);
% \node[draw,minimum width=72mm,outer sep=0,anchor=south west] (row) at (0,-1.5) {\tiny Row};
% \node[draw,minimum width=12mm,outer sep=0,anchor=west] (bank) at (row.east) {\tiny Bank};
% \node[draw,minimum width=12mm,outer sep=0,anchor=west] (rank) at (bank.east) {\tiny Rank};
% \node[draw,minimum width=32mm,outer sep=0,anchor=west] (column) at (rank.east) {\tiny Column};
% \draw [decorate,decoration={brace,mirror}] (0,0) -- (1,0);
% \end{tikzpicture}
\definecolor{verylightgray}{gray}{0.85}
\begin{bytefield}[bitwidth=4mm,bitheight=5mm]{32}
\bitheader[endianness=big]{0,2,3,12,13,16,17,31} \\
\bitbox{15}{Row}
\bitbox{4}{Bank}
\bitbox{10}{Column}
\bitbox{3}[bgcolor=verylightgray]{}
\end{bytefield}
\caption[Exemplary address mapping scheme]{Exemplary address mapping scheme for a 32-bit input address.}
\label{img:bank_interleaving}
\end{figure}
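% Worked example for the exemplary mapping in the figure above.
% Assumption: the three shaded lowest bits do not select a row, bank or column; their purpose is not specified here.
As a minimal worked example of the exemplary mapping in Figure \ref{img:bank_interleaving}, and under the assumption that the three shaded lowest bits do not take part in the component selection, the components of a 32-bit input address $a$ can be extracted as
\[
  \mathrm{column} = \left\lfloor a / 2^{3} \right\rfloor \bmod 2^{10}, \qquad
  \mathrm{bank} = \left\lfloor a / 2^{13} \right\rfloor \bmod 2^{4}, \qquad
  \mathrm{row} = \left\lfloor a / 2^{17} \right\rfloor \bmod 2^{15}.
\]
For the hypothetical address \texttt{0x00013FF8}, this yields row 0, bank 9 and column 1023, i.e., the last column of the activated row.
The next address at this granularity, \texttt{0x00014000}, yields row 0, bank 10 and column 0: the access moves on to another bank instead of opening a new row in bank 9.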
Besides \ac{dimm}-based \ac{dram}, which is mainly used in desktop workstations, there are also other \ac{dram} integrations such as ...
% there are DIMMs or also GDDR
% OR also HBM -> transition to HBM


@@ -17,7 +17,7 @@ In addition, Moore's Law is slowing down as further device scaling approaches ph
\begin{figure}[!ht]
\centering
\input{plots/energy_chart}
\caption[Total energy of computing]{Total energy of computing \cite{src2021}}
\caption[Total energy of computing]{Total energy of computing \cite{src2021}.}
\label{plt:enery_chart}
\end{figure}
@@ -34,7 +34,7 @@ In contrast, compute-intensive workloads, such as visual processing, are referre
\begin{figure}[!ht]
\centering
\input{plots/roofline}
\caption[Roofline model of GPT revisions]{Roofline model of GPT revisions \cite{ivobolsens2023}}
\caption[Roofline model of GPT revisions]{Roofline model of GPT revisions \cite{ivobolsens2023}.}
\label{plt:roofline}
\end{figure}


@@ -20,6 +20,7 @@
\usepackage{url}
\usepackage[square,numbers]{natbib}
\usepackage{pgfplots}
\usepackage{bytefield}
% Configurations
\setlength\textheight{24cm}


@@ -31,14 +31,14 @@
2050 8e+20
};
\addlegendentry{world's energy production}
\addplot [very thick, steelblue]
\addplot [smooth,very thick, steelblue]
table {
2010 5e+17
2020 5e+18
2030 4e+19
};
\addlegendentry{total compute energy}
\addplot [very thick, steelblue, dashed]
\addplot [smooth,very thick, steelblue, dashed]
table {
2030 4e+19
2035 8e+19