There is a need to inform user-space clients when a rebind worker has run out of memory, so that they can react, adjust their working set and restart the job. This patch series aims to start a discussion about the best way to accomplish this.

The series builds on the core "general notification mechanism", or "watch_queue", and attaches a watch queue to each xe drm file. The watch_queue is extremely flexible: it allows filtering for events of interest at the kernel level, and there can be multiple listeners.

Another approach would be to use drm events, but then there could only be one listener per open file and no filtering. OTOH, drm events would probably have the shortest delivery latency. Finally there is eventfd (man 2 eventfd), but it doesn't appear to allow carrying metadata.

Any feedback appreciated, also on method preference.

Patch 1 extends the watch_queue interface slightly.
Patch 2 implements delivery of a VM rebind worker error.

Note that this is to be regarded as a POC at this time. No need for a detailed review.

A user-space igt user is posted as an RFC here:
https://patchwork.freedesktop.org/series/162576/

Thomas Hellström (2):
  watch_queue: Add a DRM_XE_NOTIFY watch type and export init_watch()
  drm/xe: Add watch_queue-based device event notification

 drivers/gpu/drm/xe/Kconfig           |   1 +
 drivers/gpu/drm/xe/Makefile          |   1 +
 drivers/gpu/drm/xe/xe_device.c       |   7 ++
 drivers/gpu/drm/xe/xe_device_types.h |   6 ++
 drivers/gpu/drm/xe/xe_vm.c           |   7 +-
 drivers/gpu/drm/xe/xe_vm_types.h     |   2 +
 drivers/gpu/drm/xe/xe_watch_queue.c  | 107 +++++++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_watch_queue.h  |  20 +++++
 include/uapi/drm/xe_drm.h            |  46 ++++++++++++
 include/uapi/drm/xe_drm_events.h     |  56 ++++++++++++++
 include/uapi/linux/watch_queue.h     |   3 +-
 kernel/watch_queue.c                 |  13 +++-
 12 files changed, 263 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_watch_queue.c
 create mode 100644 drivers/gpu/drm/xe/xe_watch_queue.h
 create mode 100644 include/uapi/drm/xe_drm_events.h

-- 
2.53.0
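
For readers less familiar with watch_queue, here is a rough sketch of how a listener might walk a buffer read from a notification pipe and pick out records of one watch type. It is only an illustration, not code from this series. The struct and the length mask are local mirrors of struct watch_notification and WATCH_INFO_LENGTH from include/uapi/linux/watch_queue.h. The names wq_record_header, WQ_INFO_LENGTH and wq_count_records() are hypothetical. The numeric value of the new watch type is not assumed here, so it is passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Local mirror of struct watch_notification (hypothetical name).
 * Real code would include <linux/watch_queue.h> instead.
 */
struct wq_record_header {
	uint32_t type:24;	/* WATCH_TYPE_* value */
	uint32_t subtype:8;	/* type-specific subtype */
	uint32_t info;		/* record length, watch ID, type-specific bits */
};

/* Mirrors WATCH_INFO_LENGTH: record length in bytes, header included. */
#define WQ_INFO_LENGTH 0x0000007fu

/*
 * Walk a buffer as returned by read() on a notification pipe and count
 * the records whose type equals wanted_type.
 * Returns the count, or -1 if the buffer is malformed.
 */
static int wq_count_records(const unsigned char *buf, size_t len,
			    uint32_t wanted_type)
{
	size_t off = 0;
	int count = 0;

	assert(buf != NULL || len == 0);

	while (off < len) {
		struct wq_record_header hdr;
		size_t rec_len;

		/* A record must at least hold a complete header. */
		if (len - off < sizeof(hdr))
			return -1;

		/* Copy out to avoid unaligned access into the raw buffer. */
		memcpy(&hdr, buf + off, sizeof(hdr));

		rec_len = hdr.info & WQ_INFO_LENGTH;
		if (rec_len < sizeof(hdr) || rec_len > len - off)
			return -1;

		if (hdr.type == wanted_type)
			count++;

		/* Records are variable-length; step by the encoded length. */
		off += rec_len;
	}

	return count;
}
```

A real listener would get the buffer from read() on a pipe created with pipe2(O_NOTIFICATION_PIPE), and would decode the payload following the header according to the event layout the driver defines.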
