In almost all cases, code that reads or writes through the GPU memory
manager wants to wait until that read or write is complete. Therefore,
remove the default value of the callback argument (no callback) so that
the user must explicitly pass nullptr to indicate that they do not want
to wait for completion.

Also update a write call in the base GPU device code that cannot use a
callback because it is atomic.

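The effect of dropping the default can be sketched as follows. This is
an illustration only, not the gem5 interface: ToyMemoryManager, Callback
and demoExplicitCallback are hypothetical names, and the toy write
completes immediately instead of at a later point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical completion callback type; an empty one means "do not wait".
using Callback = std::function<void()>;

// Hypothetical, simplified stand-in for the GPU memory manager.
class ToyMemoryManager
{
  public:
    explicit ToyMemoryManager(std::size_t bytes) : mem(bytes, 0) {}

    // `callback` has no default value, so omitting it is a compile error.
    // Callers that do not want to wait must spell out nullptr.
    void
    writeRequest(std::size_t addr, const std::uint8_t *data,
                 std::size_t size, Callback callback)
    {
        assert(addr + size <= mem.size());
        std::memcpy(mem.data() + addr, data, size);
        // The toy write completes immediately; a real one finishes later.
        if (callback)
            callback();
    }

    std::uint8_t byteAt(std::size_t addr) const { return mem.at(addr); }

  private:
    std::vector<std::uint8_t> mem;
};

// Returns how many completion callbacks ran for the two writes below.
int
demoExplicitCallback()
{
    ToyMemoryManager mm(16);
    int completions = 0;
    const std::uint8_t payload[4] = {1, 2, 3, 4};

    // Caller that waits: pass a callback.
    mm.writeRequest(0, payload, sizeof(payload), [&] { ++completions; });

    // Caller that cannot wait (e.g. an atomic path): explicit nullptr.
    mm.writeRequest(4, payload, sizeof(payload), nullptr);

    // mm.writeRequest(8, payload, sizeof(payload));  // would not compile
    return completions;
}
```

With the default removed, forgetting to wait becomes a compile error
rather than a silent fire-and-forget request.
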
Change-Id: Id19145d49c7cafc97e2e178819682cb97270a16a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62716
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>

Requests sent using the GPU memory manager are not guaranteed to be
ordered. As a result, the last chunk created by the chunk generator
can complete before all of the previous chunks are done. This triggers
the final callback early and may cause an SDMA/PM4/etc. packet that is
waiting for that completion to resume before the data is ready.
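
One way to avoid the race described above is to count outstanding
chunks and fire the final callback only when the count reaches zero,
regardless of which chunk finishes last. The sketch below assumes that
approach; ChunkTracker and chunksDoneAtCallback are hypothetical names
and this is not necessarily how the change itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical tracker for one request that was split into chunks.
class ChunkTracker
{
  public:
    ChunkTracker(std::size_t numChunks, std::function<void()> onDone)
        : outstanding(numChunks), done(std::move(onDone)) {}

    // Called when any chunk's response arrives, in any order.
    void
    chunkCompleted()
    {
        assert(outstanding > 0);
        // Fire the final callback only once every chunk has completed.
        if (--outstanding == 0 && done)
            done();
    }

  private:
    std::size_t outstanding;
    std::function<void()> done;
};

// Completes chunks in the given order and returns how many chunks had
// completed at the moment the final callback fired.
std::size_t
chunksDoneAtCallback(const std::vector<std::size_t> &completionOrder)
{
    std::size_t completed = 0;
    std::size_t seenAtCallback = 0;
    ChunkTracker tracker(completionOrder.size(),
                         [&] { seenAtCallback = completed; });
    for (std::size_t chunk : completionOrder) {
        (void)chunk;  // the chunk index does not matter to the counter
        ++completed;
        tracker.chunkCompleted();
    }
    return seenAtCallback;
}
```

For example, chunksDoneAtCallback({3, 0, 2, 1}) completes the last
chunk (index 3) first, yet the callback still fires only after all four
chunks are done.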

This likely fixes verification failures in many applications. So far it
has been tested on MatrixTranspose from the HIP cookbook, which now
passes its verification step. It may also fix other race conditions
between reads from and writes to memory, such as using a PTE or PDE
before it is written.

Change-Id: Id6fb342d899db6bd0b86c80056ecf91eeb3026f5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62714
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>