Apply Lukas' corrections

2022-07-13 11:27:04 +02:00
parent d890c4cc79
commit a9e7132ed7
12 changed files with 280 additions and 224 deletions

@@ -1,10 +1,10 @@
\section{Introduction}
\label{sec:introduction}
Todays computing systems accompany us in almost all areas of life in the form of smart devices, computers, or game consoles.
% maybe also explain why DRAMs are increasingly being used
Today's computing systems accompany us in almost all areas of life in the form of smart devices, computers, or game consoles.
With the increasing performance requirements on these devices, not only faster processors are needed, but also high-performance memory systems, namely dynamic random access memories, which are supposed to deliver a lot of bandwidth at low latency.
While these storage systems are very complex and offer a lot of room for configuration, as the used DRAM standard, the memory controller configuration or the address mapping, there are different requirements for the very different applications\cite{Gomony2012}.
Consequently, system designers are commissioned with the complex task of finding the most effective configurations that match the performance and power contraints with good optimizations applied for the specific use case.
While these storage systems are very complex and offer a lot of room for configuration, e.g., the \revabbr{dynamic random-access memory}{DRAM} standard, the memory controller configuration, or the address mapping, different applications impose very different requirements\cite{Gomony2012}.
Consequently, system designers are entrusted with the complex task of finding the most effective configurations that match the performance and power constraints with good optimizations applied for the specific use case.
\input{img/thesis.tikzstyles}
\begin{figure}[!ht]
@@ -15,24 +15,24 @@ Consequently, system designers are commissioned with the complex task of finding
\end{center}
\end{figure}
For the exploration of the design space for these configurations it is impractical to use real systems as they expensive and are not suitable for rapid prototyping.
For the exploration of the design space of these configurations, it is impractical to use real systems, as they are too cost-intensive and not modifiable and are therefore not suitable for rapid prototyping.
To overcome this limitation, it is important to simulate the memory system using a simulation framework with sufficient accuracy.
Such a simulation framework is DRAMSys\cite{Steiner2020}\cite{Jung2017}, which is based on transaction level modeling and enables the fast simulation of numerous DRAM standards and controller configurations with cycle-accuracy.
Stimuli for the memory system can either be generated using a prerecorded trace file with fixed or relative timestamps, a traffic generator that acts as a state machine and initiates different request patterns or a detailed processor model of the gem5\cite{Binkert2011} simulation framework.
Such a simulation framework is DRAMSys\cite{Steiner2020}\cite{Jung2017}, which is based on SystemC \revabbr{transaction level modeling}{TLM} and enables the fast simulation of numerous DRAM standards and controller configurations with cycle-accuracy.
Stimuli for the memory system can either be generated using a prerecorded trace file with timestamps, a traffic generator that acts as a state machine and initiates different request patterns, or a detailed processor model of the gem5\cite{Binkert2011} simulation framework.
However, the two former methods lack in accurary whereas the latter may provide the sufficient precision but represents a very time-consuming effort.
However, the two former methods lack accuracy, whereas the latter may provide sufficient precision but is very time-consuming.
To fill this gap of fast but accurate traffic generation, a new simulation frontend for DRAMSys is developed and presented in this thesis.
The methology this new framwork is based on is dynamic binary instrumentation.
It allows the extraction of memory accesses of multi-threaded applications as they are executed on real hardware.
These memory access traces then are played back using a simplified core model and filtered by a cache model before the memory requests are passed to the DRAM.
This allows an accurate modeling of the system and the variing of numerous configuration parameters in a short time.
The methodology this new framework is based on is dynamic binary instrumentation.
It allows the extraction of memory accesses of multi-threaded applications while they are executed on real hardware.
These memory access traces are then played back using a simplified core model and are filtered by a cache model before the memory requests are passed to the DRAM.
This allows an accurate modeling of the system and the variation of numerous configuration parameters in a short time.
The remainder of the thesis is structured as follows:
In section \ref{sec:dynamorio} the used dynamic binary instrumentation framework, DynamoRIO, is introduced.
The section \ref{sec:systemc} presents the modeling language SystemC, the developed core and cache models are based on.
After that, the section \ref{sec:caches} gives a short overview over modern cache architectures and their high-level implementations.
Section \ref{sec:dramsys} introduces the DRAMSys simulator framework and its basic functionalities.
Section \ref{sec:implementation} concerns with the implementation of the cache model, the processor model and the instrumentation tool.
In section \ref{sec:simulation_results} the accuracy of the new framwork is compared against the gem5 and Ramulator\cite{Kim2016} simulators, whereas section \ref{sec:future_work} denotes which future improvements can be achieved.
In Section \ref{sec:dynamorio} the used dynamic binary instrumentation framework, DynamoRIO, is introduced.
Section \ref{sec:systemc} presents the modeling language SystemC, on which the developed core and cache models are based.
After that, Section \ref{sec:caches} gives a short overview of modern cache architectures and their high-level implementations.
Section \ref{sec:dramsys} introduces the DRAMSys simulation framework and its basic functionalities.
Section \ref{sec:implementation} explains the implementation of the cache model, the processor model and the instrumentation tool in detail.
In Section \ref{sec:simulation_results} the accuracy of the new framework is compared against the gem5 and Ramulator\cite{Kim2016} simulators, whereas Section \ref{sec:future_work} outlines possible future improvements.