From 6b9e81a433174484641c9f60bb80e7c778450f12 Mon Sep 17 00:00:00 2001 From: "christ.derek" Date: Mon, 3 Jun 2024 09:28:51 +0000 Subject: [PATCH] Update on Overleaf. --- main.tex | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/main.tex b/main.tex index baf08d6..7075d3d 100644 --- a/main.tex +++ b/main.tex @@ -756,7 +756,7 @@ In the case of an incoming MBE and WD, the SEC engine is not able to correct any \todo{Compared to LPDDR4, LPDDR5 supports higher data transfer rates at the bus interface, which, in turn, leads to higher bit error rates for the transmission between DRAM controller and device. For that reason, LPDDR5 introduces a link ECC mechanism, which uses a \textit{Single Error Correction Double Error Detection} (SECDED) code in form of a $(137,128)$ Hamming ECC. New link ECC proposed by Santiago: are all 0 and all 1 valid codewords? Important for AZ error!!!} \new{Therefore, we analyze the FITs of a typical LPDDR5 interface. -According to JEDEC, the interface must fulfill at least a \textit{Bit Error Rate} (BER) of $10^{-16}$ for a single DRAM pin \todo{(CITE, gilt nur für LP4)}.} +\newer{According to the DDR4 JEDEC standard, the interface must guarantee a \textit{Bit Error Rate} (BER) of at most $10^{-16}$ for a single DRAM pin~\cite{jedec2021c}.}} \newer{As each code word consists of 137 bits, we can compute the probability for multi-bit errors within one code word with \[ p(e) = \binom{n}{e} \cdot \mathrm{BER}^e \cdot \left(1-\mathrm{BER}\right)^{n-e},\] where $e$ is the number of errors and $n$ is the number of transmitted bits.} @@ -934,16 +934,15 @@ In DRAM systems, the best case is usually estimated with a sequential access pat For the worst case, a random access pattern is used, since each memory access results in a row miss, which lowers bandwidth and increases latency.} \new{Figure~\ref{fig:bandwdith} shows the theoretical maximum bandwidth of a single LPDDR5 channel, which is 
\qty{102.4}{\giga\bit\per\second}. -With the sequential access pattern, a bandwidth utilization of \qty{100.45}{\giga\bit\per\second} is reached when the ECC functionality is disabled. -\qty{2}{\percent} of the maximum bandwidth are lost due to refresh and the remaining page misses. -\todo{Update the text with the correct percentages!} -When ECC is enabled, the bandwidth drops to \qty{96.84}{\giga\bit\per\second}, which corresponds to a decrease of another \qty{3.5}{\percent}. +With the sequential access pattern, a bandwidth utilization of \newer{\qty{100.53}{\giga\bit\per\second}} is reached when the ECC functionality is disabled. +\qty{2}{\percent} of the maximum bandwidth is lost due to refresh and the remaining page misses. +When ECC is enabled, the bandwidth drops to \newer{\qty{94.58}{\giga\bit\per\second}}, which corresponds to a decrease of another \newer{\qty{5.8}{\percent}}. The drop is small because with a sequential pattern all the columns within a row are accessed successively and the fetched parity bits can be fully utilized, i.e., only 4 additional ECC accesses are required for 56 user data accesses (see Figure~\ref{fig:in-line}). -When the DRAM is stressed with a worst case scenario, i.e., a fully random access pattern where each data access results in a row miss, the real bandwidth utilization without ECC is \qty{47.28}{\giga\bit\per\second}, which is only \qty{46}{\percent} of the theoretical maximum bandwidth. -With enabled ECC, the bandwidth drops by another \qty{14}{\percent} to \qty{33.51}{\giga\bit\per\second}. +When the DRAM is stressed with a worst-case scenario, i.e., a fully random access pattern where each data access results in a row miss, the real bandwidth utilization without ECC is \newer{\qty{47.27}{\giga\bit\per\second}}, which is only \qty{46}{\percent} of the theoretical maximum bandwidth. +With ECC enabled, the bandwidth drops by another \newer{\qty{13}{\percent}} to \newer{\qty{33.51}{\giga\bit\per\second}}. 
In this case, the drop is greater because each user data access requires an additional ECC access. -This ECC access is at least always a row hit. -When the bandwidth drop is set in direct relation to the real bandwidth utilization, it even corresponds to a decrease of \qty{29}{\percent}, i.e., for random traffic the DRAM channel loses almost one third of its performance due to the additional safety measure. +\newer{This additional ECC access is, however, always at least a row hit. +When the bandwidth drop is set in relation to the bandwidth actually achieved without ECC}, it even corresponds to a decrease of \qty{29}{\percent}, i.e., for random traffic the DRAM channel loses almost one third of its performance due to the additional safety measure. } %This is due to the high row miss rate and the additional ECC memory accesses. @@ -984,14 +983,13 @@ When the bandwidth drop is set in direct relation to the real bandwidth utilizat \new{Furthermore, we analyze the impact of the in-line ECC on latency. To do this, the frequency that requests are issued to the DRAM subsystem is varied from \qty{25}{\mega\hertz} in increasing steps of \qty{25}{\mega\hertz} to \qty{400}{\mega\hertz}, which is the maximum a channel with a data rate of \qty{6400}{\mega\transfer\per\second} and a burst length of 16 can theoretically handle. -The Figures~\ref{fig:lat_bw:linear} and \ref{fig:lat_bw:random} plot the average response latency of all requests over the bandwidth for the four investigated scenarios. -\todo{update the values also here} -In the sequential case, the idle response latency is \qty{30}{\nano\second} with disabled ECC and increases only marginally when ECC is enabled (by less than \qty{0.5}{\nano\second}). -At high request issue frequencies, the impact of ECC becomes more visible as the graph starts to saturate slightly earlier and the maximum response latency is higher (\qty{149}{\nano\second} compared to \qty{90}{\nano\second}). 
-In the random case, the idle response latency without ECC is already \qty{49}{\nano\second} because the target row must always be activated first. -When ECC is enabled, it increases by \qty{10}{\percent} to \qty{54}{\nano\second} because an additional ECC access is issued before each user data access. +Figures~\ref{fig:lat_bw:linear} and \ref{fig:lat_bw:random} plot the average response latency of all requests over the bandwidth for the four investigated scenarios. +In the sequential case, the idle response latency is \qty{30}{\nano\second} with ECC disabled and increases only marginally when ECC is enabled (by less than \newer{\qty{1}{\nano\second}}). +At high request issue frequencies, the impact of ECC becomes more visible as the graph starts to saturate slightly earlier and the maximum response latency is higher \newer{(\qty{164}{\nano\second} compared to \qty{156}{\nano\second})}. +In the random case, the idle response latency without ECC is already \newer{\qty{55}{\nano\second}} because the target row must always be activated first. +When ECC is enabled, it increases by \newer{\qty{9}{\percent} to \qty{60}{\nano\second}} because an additional ECC access is issued before each user data access. Also, the impact at high request issue frequencies is more significant compared to the sequential case. -With ECC, the graph starts saturating at around \qty{150}{\mega\hertz} compared to \qty{200}{\mega\hertz} without ECC, and the maximum response latency increases from \qty{150}{\nano\second} to \qty{336}{\nano\second}. +With ECC, the graph starts saturating at around \qty{150}{\mega\hertz} compared to \qty{200}{\mega\hertz} without ECC, and the maximum response latency increases from \newer{\qty{343}{\nano\second} to \qty{485}{\nano\second}}. This means that the channel with in-line ECC can handle around \qty{25}{\percent} less random traffic, which is consistent with the bandwidth results in Figure~\ref{fig:bandwdith}. % at high freq. 
saturation starts earlier (150 vs. 200 MHz), higher max response latency %Generally, it can be observed that the latency is only weakly affected in the sequential case, whereas in the random case the distribution is shifted more towards higher latencies once ECC is enabled. @@ -1244,7 +1242,7 @@ However, since the current safety measures are not sufficient to reach levels ab \section{Conclusion and Future Work} \label{sec:conclusion} -In this paper, we presented a new methodology for modeling the safety behavior of modern hardware systems in compliance with the ISO\,26262 automotive standard. The implementation of this new methodology is provided as an open-source SystemC library and can be used to enhance legacy models with safety and quality analysis. In order to demonstrate the power of this new methodology, we modeled a state-of-the-art automotive DRAM memory architecture. Based on this model, we simulated a continuous space of failure rates of the DRAM system. We conclude that with the current safety measures, it is not possible to achieve a rating higher than ASIL\,A. \new{Furthermore, we combined the safety simulation with a functional simulation, such that the overhead of the safety measures could be estimated quickly. In fact, we see a storage overhead of \qty{12.5}{\percent} and a bandwidth overhead of \qty{3.5}{\percent} in the best case and \qty{14}{\percent} in the worst case. In the future, we will analyze new safety measures that could help reaching the goal of an ASIL\,D certification by using the presented methodology.} +In this paper, we presented a new methodology for modeling the safety behavior of modern hardware systems in compliance with the ISO\,26262 automotive standard. The implementation of this new methodology is provided as an open-source SystemC library and can be used to enhance legacy models with safety and quality analysis. 
In order to demonstrate the power of this new methodology, we modeled a state-of-the-art automotive DRAM memory architecture. Based on this model, we simulated a continuous space of failure rates of the DRAM system. We conclude that with the current safety measures, it is not possible to achieve a rating higher than ASIL\,A. \new{Furthermore, we combined the safety simulation with a functional simulation, such that the overhead of the safety measures could be estimated quickly. In fact, we see a storage overhead of \qty{12.5}{\percent} and a bandwidth overhead of \newer{\qty{6}{\percent}} in the best case and \newer{\qty{29}{\percent}} in the worst case. In the future, we will analyze new safety measures that could help reach the goal of an ASIL\,D certification by using the presented methodology.} % \section*{Author's Contributions} All authors contributed to all parts of the paper.
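The link-ECC argument in the first hunk rests on the binomial error model $p(e) = \binom{n}{e} \cdot \mathrm{BER}^e \cdot (1-\mathrm{BER})^{n-e}$. A minimal Python sketch of that formula (our own illustration; the function and variable names are not from the paper, and independent bit errors at the stated BER are assumed):

```python
from math import comb

def p_multibit(e, n=137, ber=1e-16):
    # Probability of exactly e bit errors in an n-bit code word,
    # assuming independent bit flips at the given BER (binomial model).
    return comb(n, e) * ber**e * (1.0 - ber)**(n - e)

# comb(137, 2) = 9316, so a double-bit error within one 137-bit code
# word has a probability on the order of 1e-28 per transmission.
print(p_multibit(2))
```

Under this model, multi-bit errors within one code word are many orders of magnitude rarer than single-bit errors, which is what motivates using a SECDED code at the link level.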
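The bandwidth percentages updated throughout the patch can be cross-checked against the quoted throughput numbers. A quick sketch (variable names are ours; values are the ones appearing in the hunks above, with absolute drops taken relative to the theoretical channel peak and the relative drop taken against the achieved bandwidth without ECC):

```python
# Bandwidth figures quoted in the patch, in Gbit/s.
peak = 102.4                             # theoretical maximum of one LPDDR5 channel
seq_no_ecc, seq_ecc = 100.53, 94.58      # sequential pattern, ECC off / on
rnd_no_ecc, rnd_ecc = 47.27, 33.51       # random pattern, ECC off / on

seq_drop = (seq_no_ecc - seq_ecc) / peak * 100            # drop vs. theoretical peak
rnd_drop = (rnd_no_ecc - rnd_ecc) / peak * 100            # drop vs. theoretical peak
rnd_drop_rel = (rnd_no_ecc - rnd_ecc) / rnd_no_ecc * 100  # drop vs. achieved bandwidth

print(round(seq_drop, 1), round(rnd_drop), round(rnd_drop_rel))
```

This reproduces the \qty{5.8}{\percent}, \qty{13}{\percent}, and \qty{29}{\percent} figures in the text; the \qty{6}{\percent} best-case overhead in the conclusion is the same 5.8 rounded.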