Fixes from Niklas, Johannes, Hendrik
@@ -1,17 +1,17 @@
\section{Future Work}
\section{Conclusion and Future Work}
\label{sec:future_work}

Due to the complexity of possible memory subsystem configurations, simulation is an indispensable part of the development process of today's systems.
It not only has a high impact on the development cost but also significantly reduces the time-to-market and enables the rapid release of new products.
However, accurately simulating a specific application takes a long time because of the detailed processor core models.
On the other hand, fixed or relative time memory traces allow faster simulation at the expense of accuracy, which often makes them unsuitable.
To fill this gap, this thesis introduced a new simulation frontend for DRAMSys, that is fast and makes only few compromises on accuracy.
To fill this gap, this thesis introduced a new simulation frontend for DRAMSys, which speeds up the process while making only a few compromises on accuracy.

In conclusion, the newly developed instrumentation tool provides an flexible way of generating traces for arbitrary multi-threaded applications.
In conclusion, the newly developed instrumentation tool provides a flexible way of generating traces for arbitrary multi-threaded applications.
The mature DRAMSys simulator framework can then be used to explore the design space and vary numerous configuration parameters of the DRAM subsystem to find a well-suited set of options.

It was shown that in comparison to the well-established full-system simulation framework gem5, only some deviations have to be accepted.
Also, the Pin-Tool based memory access tracing of the Ramulator DRAM simulator was compared to the new fronted. %(briefly summarize the results here)
Also, the Pin-Tool based memory access tracing of the Ramulator DRAM simulator was compared to the new frontend. %(briefly summarize the results here)
Although Ramulator takes a slightly different approach to trace generation than this thesis, a very good correlation in the results could be demonstrated.
A noteworthy advantage of the newly developed tool is its support for all hardware architectures that DynamoRIO provides (currently IA-32, x86-64, ARM, and AArch64) in contrast to the supported architectures of Pin (IA-32 and x86-64).
@@ -23,7 +23,7 @@ As mentioned in \ref{sec:cache_implementation}, the cache models do not yet guar
Although this can be a complex task, it is possible to implement this in future work.

A less impactful inaccuracy results from the scheduling of the application's threads in the new simplified core models.
While an application can spawn a arbitrary number of threads, the platform may not be able to process them all in parallel.
While an application can spawn an arbitrary number of threads, the platform may not be able to process them all in parallel.
Currently, the new trace player does not take this into account and runs all threads in parallel.
This deviation could be prevented by recording used processor cores on the initial system and using this information to better match the scheduling.
@@ -42,5 +42,3 @@ In the future, the DynamoRIO tool could decode those computational instructions
One significant improvement that could still be applied is the consideration of dependencies between the memory accesses.
Similarly to the elastic trace player of gem5 \cite{Jagtap2016}, which captures data load and store dependencies by instrumenting a detailed out-of-order processor model, the DynamoRIO tool could create a dependency graph of the memory accesses using the decoded instructions.
By using this technique, it is possible to also model out-of-order behavior of modern processors and make the simulation more accurate, whereas the current implementation is entirely in-order.

The potential improvements mentioned above could make the new simulation frontend for DRAMSys even more accurate.