A key component of these models is the use of \acp{dnn}, which are a type of machine learning model.
Consequently, \acp{dnn} make it possible to tackle many new classes of problems that were previously beyond the reach of conventional algorithms.
However, the ever-increasing use of these technologies poses new challenges for hardware architectures, as the energy required to train and run these models reaches unprecedented levels.
Recently published numbers approximate that the development and training of Meta's LLaMA model over a period of about five months consumed around $\qty{2638}{\mega\watt\hour}$ of electrical energy and caused a total emission of $\qty{1015}{tCO_2eq}$ \cite{touvron2023}.
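A back-of-the-envelope division of these two reported figures yields the average carbon intensity of the training run:
\[
\frac{\qty{1015}{tCO_2eq}}{\qty{2638}{\mega\watt\hour}} \approx 385\,\mathrm{g\,CO_2eq}/\mathrm{kWh}.
\]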
As these numbers are expected to increase in the future, it is clear that the energy footprint of the current deployment of \ac{ai} applications is not sustainable \cite{blott2023}.
In a more general view, the energy demand of computing for new applications continues to grow exponentially, doubling about every two years, while the world's energy production only grows linearly, at about $\qty{2}{\percent}$ per year \cite{src2021}.
This dramatic increase in energy consumption is due to the fact that, while the energy efficiency of compute processor units has continued to improve, the ever-increasing demand for computing is outpacing this progress.
In addition, Moore's Law is slowing down as further device scaling approaches physical limits.
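The mismatch between these two growth rates compounds quickly; extrapolated over a single decade, demand and supply scale as
\[
2^{10/2} = 32 \qquad \text{versus} \qquad 1.02^{10} \approx 1.22,
\]
i.e., compute demand grows by a factor of 32 while energy production grows by barely a quarter.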
\begin{figure}[!ht]
In recent years, domain-specific accelerators, such as \acp{gpu} or \acp{tpu}, have emerged to address these challenges.
However, research must also take into account off-chip memory: moving data between the computation unit and the \ac{dram} is very costly, as fetching operands consumes more energy than performing the computation on them.
While performing a double-precision floating-point operation on a $\qty{28}{\nano\meter}$ technology might consume about $\qty{20}{\pico\joule}$, fetching the operands from \ac{dram} consumes almost three orders of magnitude more energy, at about $\qty{16}{\nano\joule}$ \cite{dally2010}.
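Relating the two energy figures makes the penalty explicit:
\[
\frac{\qty{16}{\nano\joule}}{\qty{20}{\pico\joule}} = 800 \approx 10^{2.9},
\]
so a single \ac{dram} fetch costs about as much energy as 800 floating-point operations.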
Furthermore, many types of \acp{dnn} used for language and speech processing, such as \acp{rnn}, \acp{mlp} and some layers of \acp{cnn}, are severely limited by the memory bandwidth that the \ac{dram} can provide, making them \textit{memory-bounded} \cite{he2020}.
In contrast, compute-intensive workloads, such as visual processing, are referred to as \textit{compute-bounded}.
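This distinction can be made quantitative through the arithmetic intensity of a layer, i.e., the number of operations performed per byte moved between memory and the compute unit; the layer dimensions below are illustrative assumptions, not taken from \cite{he2020}. A fully connected layer $y = Wx$ with an $n \times n$ weight matrix in single precision performs $2n^2$ operations but must fetch about $4n^2$ bytes of weights, giving
\[
I_{\mathrm{fc}} \approx \frac{2n^2}{4n^2} = 0.5\ \text{ops/byte},
\]
independent of $n$, which places \acp{mlp} and \acp{rnn} deep in the memory-bounded regime. A $3 \times 3$ convolution with $C_{\mathrm{in}} = C_{\mathrm{out}} = 256$ channels on a $56 \times 56$ feature map, by contrast, reuses every weight $56 \cdot 56$ times and reaches an intensity of roughly $420$ ops/byte, making it compute-bounded.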
\begin{figure}[!ht]
The remainder of this work is structured as follows:
In \cref{sec:pim}, various types of \ac{pim} architectures are presented, with some concrete examples discussed in detail.
\Cref{sec:vp} is an introduction to virtual prototyping and system-level hardware simulation.
After explaining the necessary prerequisites, \cref{sec:implementation} presents a software implementation of a concrete \ac{pim} architecture and a development library that applications can use to take advantage of in-memory processing.
\Cref{sec:results} demonstrates the possible performance enhancement of \ac{pim} by simulating a typical neural network inference.
Finally, \cref{sec:conclusion} summarizes the findings and identifies future improvements for \ac{pim} architectures.