Lukas' second improvements
@@ -25,7 +25,7 @@ The gem5 syscall-emulation does not simulate a whole operating system, rather it
In contrast, the gem5 full-system simulation boots into a complete Linux system, including all processes that may run in the background.
Therefore, syscall-emulation is conceptually closer to the DynamoRIO approach than full-system simulation.
-The simulation setup consists in both cases of a two-level cache hierarchy with the following parameters:
+In both cases, the simulation setup consists of a two-level cache hierarchy with the following parameters:
\begin{table}[!ht]
\caption{Cache parameters used in simulations.}
@@ -90,7 +90,7 @@ Their access patterns are as followed:
\label{tab:benchmark_description}
\end{table}
-In the following, the simulation results of the new simulation frontend, the gem5 full-system emulation and the gem5 syscall-emulation will now be presented.
+In the following, the simulation results of the new simulation frontend, the gem5 full-system simulation and the gem5 syscall-emulation are presented.
\begin{table}[!ht]
\caption{Results for bandwidth and bytes read/written with DDR4-2400. \textit{FS} denotes gem5 full-system, \textit{SE} denotes gem5 syscall-emulation, \textit{DS} denotes DRAMSys.}
@@ -248,8 +248,8 @@ Here, the absolute deviations in the average memory bandwidth amount to 27.5\% a
The differences in the number of bytes read amount to 31.6\% for gem5 FS and 14.7\% for gem5 SE.
Here too, the bytes written show only small deviations of 5.2\% for gem5 FS and 0.02\% for gem5 SE.
-It has to be noted that the average memory bandwidth for the new trace player is highly influenced by the configured CPI value.
-So to match a real system, this value has to be chosen wisely to achieve good simulation results for the memory bandwidth.
+% It has to be noted that the average memory bandwidth for the new trace player is highly influenced by the configured CPI value.
+% So to match a real system, this value has to be chosen wisely to achieve good simulation results for the memory bandwidth.
% Latency und simulation time
@@ -440,4 +440,5 @@ Figure \ref{fig:runtimes} presents the runtimes of the various benchmarks and si
As expected, DRAMSys outperforms the gem5 full-system and syscall-emulation simulators in every case.
On average, DRAMSys is 47.0\% faster than gem5 SE and 73.7\% faster than gem5 FS, with a maximum speedup of 82.6\% for the benchmark \texttt{SUM}.
While gem5 SE only simulates the target application using the detailed processor model, gem5 FS has to simulate the complete operating system kernel and the applications that run concurrently in the background.
-This explains the large runtime differences between these two simulation modes.
+However, the bootup process of the operating system was not included in the simulations.
+These conceptual differences explain the large runtime deviations between the two simulation modes.