\section{Introduction}
\label{sec:introduction}

Today's computing systems accompany us in almost all areas of life in the form of smart devices, computers, or game consoles.
As the performance requirements for these devices grow, they need not only faster processors but also high-performance memory systems, namely dynamic random access memories (DRAMs), which are expected to deliver high bandwidth at low latency.
These memory systems are very complex and offer many configuration options, such as the DRAM standard used, the memory controller configuration, or the address mapping, while different applications impose very different requirements\cite{Gomony2012}.
Consequently, system designers face the complex task of finding the most effective configuration, one that meets the performance and power constraints and is optimized for the specific use case.

\input{img/thesis.tikzstyles}
\begin{figure}[!ht]
\begin{center}
\tikzfig{img/cloud}
\caption{Exemplary DRAM configuration parameters to consider when designing a system.}
\label{fig:cloud}
\end{center}
\end{figure}

Real systems are impractical for exploring the design space of these configurations, as they are expensive and not suitable for rapid prototyping.
To overcome this limitation, it is important to simulate the memory system with a sufficiently accurate simulation framework.

One such simulation framework is DRAMSys\cite{Steiner2020,Jung2017}, which is based on transaction-level modeling and enables fast, cycle-accurate simulation of numerous DRAM standards and controller configurations.
Stimuli for the memory system can be generated in three ways: from a prerecorded trace file with fixed or relative timestamps, by a traffic generator that acts as a state machine and issues different request patterns, or by a detailed processor model of the gem5\cite{Binkert2011} simulation framework.

However, the former two methods lack accuracy, whereas the latter may provide sufficient precision but is very time-consuming.
To fill this gap with fast yet accurate traffic generation, a new simulation frontend for DRAMSys is developed and presented in this thesis.

The methodology on which this new framework is based is dynamic binary instrumentation.
It allows the memory accesses of multi-threaded applications to be extracted as they are executed on real hardware.
These memory access traces are then played back using a simplified core model and filtered by a cache model before the memory requests are passed to the DRAM.
This allows accurate modeling of the system and the variation of numerous configuration parameters in a short time.

The remainder of the thesis is structured as follows:
Section~\ref{sec:dynamorio} introduces DynamoRIO, the dynamic binary instrumentation framework used.
Section~\ref{sec:systemc} presents SystemC, the modeling language on which the developed core and cache models are based.
After that, Section~\ref{sec:caches} gives a short overview of modern cache architectures and their high-level implementations.
Section~\ref{sec:dramsys} introduces the DRAMSys simulator framework and its basic functionality.
Section~\ref{sec:implementation} covers the implementation of the cache model, the processor model, and the instrumentation tool.
In Section~\ref{sec:simulation_results}, the accuracy of the new framework is compared against the gem5 and Ramulator\cite{Kim2016} simulators, and Section~\ref{sec:future_work} outlines possible future improvements.