b20cc7e6d890aba7feadf27782a773262e6486e6
There is a flow of packets as follows:
WriteResp -> WriteReq -> WriteCompleteResp

These packets share some variables, in particular senderState and a
status vector.

One issue was that the WriteResp packet decremented the status vector,
which the WriteCompleteResp packets used to determine when to handle
the global memory response. This could lead to multiple
WriteCompleteResp packets attempting to handle the global memory
response.

Because of that, the WriteCompleteResp packets needed to handle the
status vector. This patch moves WriteCompleteResp packet handling back
into ComputeUnit::DataPort::processMemRespEvent from
ComputeUnit::DataPort::recvTimingResp. This helps remove some
redundant code.

With this patch, the WriteResp packet returns without doing any status
vector handling and without deleting the senderState; deleting it had
previously caused a segfault.

Another issue was that WriteCompleteResp packets weren't being issued
for each active lane, because the coalesced request was being issued
too early. To fix that, every active lane must put its request into
its applicable coalesced request before the coalesced request is
issued.

Because of that, the issuing of CoalescedRequests moves from
GPUCoalescer::coalescePacket to GPUCoalescer::completeIssue. That
change involves adding a new variable to store the CoalescedRequests
that are created in the calls to coalescePacket. This variable is a
map from instruction sequence number to coalesced requests.

Additionally, the WriteCompleteResp packet was attempting to access
physical memory in hitCallback while not having any data, which caused
a crash. This can be resolved either by not allowing WriteCompleteResp
packets to access memory, or by copying the data from the WriteReq
packet. This patch denies WriteCompleteResp packets memory access in
hitCallback.

Finally, in VIPERCoalescer::writeCompleteCallback there was a map that
held the WriteComplete packets, but no packets were ever removed from
it. This patch removes the packets that match the address passed in to
the function.

Change-Id: I9a064a0def2bf6c513f5295596c56b1b652b0ca4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33656
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
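
The two coalescer changes described in the commit message can be
illustrated with a small standalone sketch. It is not the gem5 code:
ToyPacket, ToyCoalescedRequest, ToyCoalescer, trackWrite and all member
names are hypothetical stand-ins, and the sequence-number argument to
writeCompleteCallback is an assumption made for this example. The
sketch only shows the shape of the idea: coalescePacket records each
lane's request in a map keyed by instruction sequence number,
completeIssue issues the stored requests once every lane has been
added, and writeCompleteCallback erases tracked packets whose address
matches.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a per-lane memory packet.
struct ToyPacket {
    std::uint64_t seqNum;    // instruction sequence number
    std::uint64_t lineAddr;  // cache-line address touched by this lane
};

// Hypothetical stand-in for one coalesced request (one per line address).
struct ToyCoalescedRequest {
    std::uint64_t seqNum;
    std::uint64_t lineAddr;
    std::vector<ToyPacket> packets;  // one entry per active lane
};

class ToyCoalescer {
  public:
    // Called once per active lane. Only records the packet; nothing is
    // issued here, so later lanes can still join the same request.
    void coalescePacket(const ToyPacket &pkt) {
        auto &reqs = pending[pkt.seqNum];
        for (auto &req : reqs) {
            if (req.lineAddr == pkt.lineAddr) {
                req.packets.push_back(pkt);
                return;
            }
        }
        reqs.push_back(ToyCoalescedRequest{pkt.seqNum, pkt.lineAddr, {pkt}});
    }

    // Called after all lanes were coalesced: issue every stored request.
    void completeIssue() {
        for (auto &entry : pending)
            for (auto &req : entry.second)
                issued.push_back(req);
        pending.clear();
    }

    // Remember a write whose completion is still outstanding.
    void trackWrite(const ToyPacket &pkt) {
        writeCompletes[pkt.seqNum].push_back(pkt);
    }

    // Erase tracked packets of this instruction that match lineAddr and
    // return how many were removed, so the map does not grow forever.
    std::size_t writeCompleteCallback(std::uint64_t lineAddr,
                                      std::uint64_t seqNum) {
        auto it = writeCompletes.find(seqNum);
        if (it == writeCompletes.end())
            return 0;
        auto &pkts = it->second;
        const std::size_t before = pkts.size();
        pkts.erase(std::remove_if(pkts.begin(), pkts.end(),
                       [lineAddr](const ToyPacket &p) {
                           return p.lineAddr == lineAddr;
                       }),
                   pkts.end());
        const std::size_t removed = before - pkts.size();
        if (pkts.empty())
            writeCompletes.erase(it);
        return removed;
    }

    std::map<std::uint64_t, std::vector<ToyCoalescedRequest>> pending;
    std::map<std::uint64_t, std::vector<ToyPacket>> writeCompletes;
    std::vector<ToyCoalescedRequest> issued;
};
```

In this sketch, three lanes of one instruction touching two cache lines
produce two coalesced requests, and neither is issued until
completeIssue runs.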
This is the gem5 simulator.

The main website can be found at http://www.gem5.org

A good starting point is http://www.gem5.org/about, and for
more information about building the simulator and getting started
please see http://www.gem5.org/documentation and
http://www.gem5.org/documentation/learning_gem5/introduction.

To build gem5, you will need the following software: g++ or clang,
Python (gem5 links in the Python interpreter), SCons, SWIG, zlib, m4,
and lastly protobuf if you want trace capture and playback
support. Please see http://www.gem5.org/documentation/general_docs/building
for more details concerning the minimum versions of the aforementioned tools.

Once you have all dependencies resolved, type 'scons
build/<ARCH>/gem5.opt' where ARCH is one of ARM, NULL, MIPS, POWER, SPARC,
or X86. This will build an optimized version of the gem5 binary (gem5.opt)
for the specified architecture. See
http://www.gem5.org/documentation/general_docs/building for more details and
options.

The basic source release includes these subdirectories:
   - configs: example simulation configuration scripts
   - ext: less-common external packages needed to build gem5
   - src: source code of the gem5 simulator
   - system: source for some optional system software for simulated systems
   - tests: regression tests
   - util: useful utility programs and files

To run full-system simulations, you will need compiled system firmware
(console and PALcode for Alpha), kernel binaries and one or more disk
images.

If you have questions, please send mail to gem5-users@gem5.org

Enjoy using gem5 and please share your modifications and extensions.