Propagate the fence errors from drivers to userspace. Allows userspace to react to asynchronous errors coming from the drivers.
One of the trickiest bits of the drm syncobj interface is that,
unexpectedly, a syncobj doesn't propagate fence errors on wait.
Whenever something goes wrong in an asynchronous task/job that uses a
drm syncobj to communicate with userspace, there's no way to convey
that issue to userspace, because the drm syncobj wait function only
checks whether a fence has been signaled, not whether it has been
signaled without error.

Instead of assuming that a signaled fence implies success, grab the
actual status of the fence and return the first fence error that has
been spotted. Return the first error because all subsequent errors are
likely to be caused by the initial error in a chain of tasks.

[RFC]: Some drivers (e.g. Xe) accept drm syncobjs in the vm_bind and
exec interfaces, and they also call dma_fence_set_error() when those
operations fail asynchronously. Currently those errors are silently
ignored because they don't propagate. I'm not sure how userspace
written for those drivers will react to actually receiving those
errors, even though silently dropping driver errors seems completely
wrong to me.
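
For reviewers, the intended precedence between fence errors and the
wait's own result can be modelled in plain userspace C. This is only a
sketch, not the kernel code: struct mock_fence and wait_result() are
hypothetical names, and the model inspects every fence, whereas the
patch only queries fences it has found to be signaled. The status
convention follows dma_fence_get_status(): negative for an error, 0 for
not yet signaled, 1 for signaled without error.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a fence. status follows the
 * dma_fence_get_status() convention: < 0 is the error the fence was
 * signaled with, 0 is not yet signaled, 1 is signaled without error.
 */
struct mock_fence {
	int status;
};

/*
 * Simplified model of the result selection: remember the first fence
 * error seen, but let a failure of the wait itself (a negative
 * timeout) take precedence over it.
 */
static long wait_result(const struct mock_fence *fences, size_t count,
			long timeout)
{
	int first_fence_error = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		int fence_status = fences[i].status;

		/* Only the first error is kept; later ones are ignored. */
		if (fence_status < 0 && !first_fence_error)
			first_fence_error = fence_status;
	}

	/* A wait error (timeout < 0) wins over any fence error. */
	if (first_fence_error < 0 && timeout >= 0)
		timeout = first_fence_error;

	return timeout;
}
```

With fences reporting 1, -EIO and -ENOMEM in that order and a
successful wait, the model returns -EIO. If the wait itself failed with
-ETIME, that value is returned unchanged.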

Signed-off-by: Zack Rusin <[email protected]>
Cc: [email protected]
Cc: David Airlie <[email protected]>
Cc: Simona Vetter <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Maxime Ripard <[email protected]>
Cc: Thomas Zimmermann <[email protected]>
---
 drivers/gpu/drm/drm_syncobj.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index e1b0fa4000cd..bcd8eff8b59a 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1067,6 +1067,7 @@ static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
 	struct dma_fence *fence;
 	uint64_t *points;
 	uint32_t signaled_count, i;
+	int fence_status, first_fence_error = 0;
 
 	if (flags & (DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT |
 		     DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE)) {
@@ -1170,6 +1171,9 @@ static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
 			     dma_fence_add_callback(fence,
 						    &entries[i].fence_cb,
 						    syncobj_wait_fence_func))) {
+				fence_status = dma_fence_get_status(fence);
+				if (fence_status < 0 && !first_fence_error)
+					first_fence_error = fence_status;
 				/* The fence has been signaled */
 				if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
 					signaled_count++;
@@ -1213,6 +1217,14 @@ static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
 err_free_points:
 	kfree(points);
 
+	/*
+	 * Propagate the first fence error the code has seen, but
+	 * give precedence to the overall wait error in case one
+	 * was encountered.
+	 */
+	if (first_fence_error < 0 && timeout >= 0)
+		timeout = first_fence_error;
+
 	return timeout;
 }
 
-- 
2.48.1
