diff --git a/drampower-main.tex b/drampower-main.tex index 6a0a25f..f65769f 100644 --- a/drampower-main.tex +++ b/drampower-main.tex @@ -1112,7 +1112,7 @@ Alternatively, a DRAM command trace can be provided as an input file. For the interface power calculation, the provided commands, addresses and data are translated into equivalent bit patterns using the command truth table of the simulated standard. Based on this data, the number of transmitted zeros $n_0$, transmitted ones $n_1$ and zero to one transitions $n_{0 \rightarrow 1}$ can be calculated. To achieve high simulation speeds, bit manipulation instructions including the population count (\texttt{POPCNT}) instruction are used. -If no data is provided, a switching activity $\alpha$ and a ratio between both logic levels \todo{name} has to be provided. +If no data is provided, a switching activity $\alpha$ and a duty cycle $D$ have to be provided. In addition to the command/address and data bus, the remaining signals like the clock signal pair, data strobe pairs or chip select need to be considered (see Section~\ref{subsec:background_interface}). %As explained in Section~\ref{sec:interface_power_modeling}, the interface power calculation can depend on lots of parameters and, thus, can become very complex. %\todo{In order to avoid the complexity within DRAMPower, the tool only receives the precalculated termination and dynamic power values for all signals as inputs.} @@ -1197,12 +1197,19 @@ The total power consumption can be queried at any time even when the simulation \subsection{Simulation Speed} % Since DRAMPower is not used as a standalone tool in the normal use case, but rather coupled to a behavioral DRAM subsystem simulator, we evaluate its simulation speed in terms of the overhead of adding power simulation. 
-For this analysis, DRAMPower is coupled to the well-known DRAM subsystem simulator DRAMSys~\cite{} - - -As DRAMPower is not intended as a standalone simulator, we evaluate the simulation speed - +For this analysis, DRAMPower is coupled to the well-known DRAM subsystem simulator DRAMSys~\cite{stejun_20}. +Within DRAMSys, \todo{one million read and write requests with random addresses and random data are generated.} +This simulation is carried out both with and without power simulation enabled. +Moreover, the simulations are also performed without actual data. +In this case, DRAMPower is provided with a switching activity $\alpha$ and a duty cycle $D$. +For the simulations with data, DRAMSys alone requires 896\,ms to finish, while with added power simulation, +it takes 1004\,ms to finish. +This corresponds to an overhead of 12\,\%. +When no data is simulated, DRAMSys alone requires only 559\,ms to finish, while with DRAMPower enabled, the simulation time increases to 774\,ms. +In this case, the overhead is 38\,\%. +While this overhead is relatively large, there are two things to consider. +First, DRAMSys is highly optimized for simulation speed and outperforms all other simulators \begin{figure} \centering @@ -1250,36 +1257,34 @@ alternatively, duty cycle/toggling rates can be used %\input{content/05_exp_results} \subsection{Simulation Accuracy} % +\todo{ Interface -> comparison with SPICE, maybe use a random pattern in spice with fixed n0, n1 and alpha Core -> we do not yet have a measurement platform for DDR5/LPDDR5/HBM3... where we can issue specific command patterns to DRAM and compare it with the results provided by DRAMPower. -\todo{Marco, Derek} +} % IDD Patterns mit Daimler Messung vergleichen -To verify the power estimates of the new DRAMPower implementation, we use measurement data from DRAMs of three different vendors, as reported in a real LPDDR4 memory measurement platform study~\cite{feldmann_23}. 
+To verify the power estimates of the new DRAMPower implementation, we use core and interface power measurements of DRAMs from three different vendors, as reported in a study of a real LPDDR4 memory measurement platform~\cite{feldmann_23}. Each DRAM is operated with six different access patterns, which are analogous to the following $I_{DD}$ currents: -\tikz{\node[circle,draw,inner sep=1pt] {\tiny 1}}~$I_{DD}0$*, -\tikz{\node[circle,draw,inner sep=1pt] {\tiny 2}}~$I_{DD}4R$, -\tikz{\node[circle,draw,inner sep=1pt] {\tiny 3}}~$I_{DD}4W$, -\tikz{\node[circle,draw,inner sep=1pt] {\tiny 4}}~$I_{DD}5AB$, -\tikz{\node[circle,draw,inner sep=1pt] {\tiny 5}}~$I_{DD}2N$ and -\tikz{\node[circle,draw,inner sep=1pt] {\tiny 6}}~$I_{DD}6$. -As it was not possible to reproduce the usual $I_{DD}0$ pattern of ACT-PRE for the measurement platform, $I_{DD}0$* is a variation using the pattern ACT-RD-PRE, which is also resembled in the DRAMPower simulation. -Also, the measurement platform was not able to accurately measure the write current $I_{DD}4W$ +\tikz{\node[circle,draw,inner sep=1pt] {\tiny 1}}~$I_{DD0*}$, +\tikz{\node[circle,draw,inner sep=1pt] {\tiny 2}}~$I_{DD4R}$, +\tikz{\node[circle,draw,inner sep=1pt] {\tiny 3}}~$I_{DD4W}$, +\tikz{\node[circle,draw,inner sep=1pt] {\tiny 4}}~$I_{DD5B}$, +\tikz{\node[circle,draw,inner sep=1pt] {\tiny 5}}~$I_{DD2N}$ and +\tikz{\node[circle,draw,inner sep=1pt] {\tiny 6}}~$I_{DD6}$. +As it was not possible to reproduce the usual $I_{DD0}$ pattern of ACT-PRE on the measurement platform, $I_{DD0*}$ is a variation using the ACT-RD-PRE pattern, which is also replicated in the DRAMPower simulation. +In addition, as the measurement platform could not accurately measure the write current $I_{DD4W}$ because only one write request could be issued at a time, the simulation was also configured to limit the number of outstanding write requests to one. The initial simulations are based on the current values specified in the datasheet of the specific vendor. 
Then, based on the actual measurements, the current values are reapplied to a second simulation. The results are shown in Figure~\ref{fig:power_plot}. \begin{figure} \centering - % \resizebox{\linewidth}{!}{% \input{img/power_plot} - % } \caption{Average Power Consumption of Simulations and Measurements for Different Vendors} \label{fig:power_plot} \end{figure} As it can be seen, the $I_{DD}$ currents in the datasheet are overly pessimistic for all vendors: -The simulations based on the datasheets show on average a $4.8\times$ higher power consumption than the actual power measurements. -However, when the measured currents are applied to the simulation, there is still a small discrepancy: -This can be explained by the fact that the \todo{wrong:} measurement platform only measures the core power and not the interface power. -As DRAMPower also includes interface power estimates, it therefore reports a higher total power. +The simulations based on the datasheets show, on average, a $2.9\times$ higher power consumption than the actual power measurements. +However, when the measured currents are applied to the simulation, the deviation drops to around $18.8\,\%$. +The largest deviation comes from the $I_{DD0*}$ current. It is unclear whether the measurement platform was able to fully saturate the memory controller's buffer and therefore report a lower average power consumption than in the simulation. % LP4 vs LP5 % DDR4 vs. DDR5 diff --git a/drampower.bib b/drampower.bib index ea4801d..4f7eb57 100644 --- a/drampower.bib +++ b/drampower.bib @@ -240,3 +240,21 @@ Cg\_type: Outlook\\ Subject\_term: Machine learning, Sustainability, Technology, Computer science, Engineering}, file = {/Users/myzinsky/Zotero/storage/2XJ6LXCA/d41586-024-03408-z.html} } + +@InProceedings{stejun_20, +author="Steiner, Lukas +and Jung, Matthias +and Prado, Felipe S. 
+and Bykov, Kirill +and Wehn, Norbert", +editor="Orailoglu, Alex +and Jung, Matthias +and Reichenbach, Marc", +title="{DRAMSys4.0}: A Fast and Cycle-Accurate {SystemC/TLM}-Based {DRAM} Simulator", +booktitle="Embedded Computer Systems: Architectures, Modeling, and Simulation", +year="2020", +publisher="Springer International Publishing", +address="Cham", +pages="110--126", +isbn="978-3-030-60939-9" +} \ No newline at end of file