Apply Lukas' corrections
@@ -1,10 +1,10 @@
\section{Introduction}
\label{sec:introduction}

Todays computing systems accompany us in almost all areas of life in the form of smart devices, computers, or game consoles.
% maybe also explain why DRAMs are used more and more
Today's computing systems accompany us in almost all areas of life in the form of smart devices, computers, or game consoles.
With the increasing performance requirements on these devices, not only faster processors are needed, but also high-performance memory systems, namely dynamic random access memories, which are supposed to deliver a lot of bandwidth at low latency.
While these storage systems are very complex and offer a lot of room for configuration, as the used DRAM standard, the memory controller configuration or the address mapping, there are different requirements for the very different applications\cite{Gomony2012}.
Consequently, system designers are commissioned with the complex task of finding the most effective configurations that match the performance and power contraints with good optimizations applied for the specific use case.
While these storage systems are very complex and offer a lot of room for configuration, e.g., the \revabbr{dynamic random-access memory}{DRAM} standard, the memory controller configuration or the address mapping, there are different requirements for the very different applications\cite{Gomony2012}.
Consequently, system designers are entrusted with the complex task of finding the most effective configurations that match the performance and power constraints with good optimizations applied for the specific use case.

\input{img/thesis.tikzstyles}
\begin{figure}[!ht]
@@ -15,24 +15,24 @@ Consequently, system designers are commissioned with the complex task of finding
\end{center}
\end{figure}

For the exploration of the design space for these configurations it is impractical to use real systems as they expensive and are not suitable for rapid prototyping.
For the exploration of the design space of these configurations it is impractical to use real systems as they are too cost-intensive and not modifiable and therefore not suitable for rapid prototyping.
To overcome this limitation, it is important to simulate the memory system using a simulation framework with sufficient accuracy.

Such a simulation framework is DRAMSys\cite{Steiner2020}\cite{Jung2017}, which is based on transaction level modeling and enables the fast simulation of numerous DRAM standards and controller configurations with cycle-accuracy.
Stimuli for the memory system can either be generated using a prerecorded trace file with fixed or relative timestamps, a traffic generator that acts as a state machine and initiates different request patterns or a detailed processor model of the gem5\cite{Binkert2011} simulation framework.
Such a simulation framework is DRAMSys\cite{Steiner2020}\cite{Jung2017}, which is based on SystemC \revabbr{transaction level modeling}{TLM} and enables the fast simulation of numerous DRAM standards and controller configurations with cycle accuracy.
Stimuli for the memory system can either be generated using a prerecorded trace file with timestamps, a traffic generator that acts as a state machine and initiates different request patterns, or a detailed processor model of the gem5\cite{Binkert2011} simulation framework.

However, the two former methods lack in accurary whereas the latter may provide the sufficient precision but represents a very time-consuming effort.
However, the two former methods lack accuracy, whereas the latter may provide sufficient precision but is a very time-consuming effort.
To fill this gap of fast but accurate traffic generation, a new simulation frontend for DRAMSys is developed and presented in this thesis.

The methology this new framwork is based on is dynamic binary instrumentation.
It allows the extraction of memory accesses of multi-threaded applications as they are executed on real hardware.
These memory access traces then are played back using a simplified core model and filtered by a cache model before the memory requests are passed to the DRAM.
This allows an accurate modeling of the system and the variing of numerous configuration parameters in a short time.
The methodology this new framework is based on is dynamic binary instrumentation.
It allows the extraction of memory accesses of multi-threaded applications while they are executed on real hardware.
These memory access traces are then played back using a simplified core model and are filtered by a cache model before the memory requests are passed to the DRAM.
This allows an accurate modeling of the system and the variation of numerous configuration parameters in a short time.

The remainder of the thesis is structured as follows:
In section \ref{sec:dynamorio} the used dynamic binary instrumentation framework, DynamoRIO, is introduced.
The section \ref{sec:systemc} presents the modeling language SystemC, the developed core and cache models are based on.
After that, the section \ref{sec:caches} gives a short overview over modern cache architectures and their high-level implementations.
Section \ref{sec:dramsys} introduces the DRAMSys simulator framework and its basic functionalities.
Section \ref{sec:implementation} concerns with the implementation of the cache model, the processor model and the instrumentation tool.
In section \ref{sec:simulation_results} the accuracy of the new framwork is compared against the gem5 and Ramulator\cite{Kim2016} simulators, whereas section \ref{sec:future_work} denotes which future improvements can be achieved.
In Section \ref{sec:dynamorio} the used dynamic binary instrumentation framework, DynamoRIO, is introduced.
Section \ref{sec:systemc} presents the modeling language SystemC, on which the developed core and cache models are based.
After that, Section \ref{sec:caches} gives a short overview of modern cache architectures and their high-level implementations.
Section \ref{sec:dramsys} introduces the DRAMSys simulation framework and its basic functionalities.
Section \ref{sec:implementation} explains the implementation of the cache model, the processor model and the instrumentation tool in detail.
In Section \ref{sec:simulation_results} the accuracy of the new framework is compared against the gem5 and Ramulator\cite{Kim2016} simulators, whereas Section \ref{sec:future_work} outlines possible future improvements.