When capturing the bo, we allocate an array for min(vma->size,
vma->node.size) pages, plus a bit for compression overhead. The fake vma
built in capture_object() never set vma->size, so that minimum was zero
and the array held little more than the overhead allowance. In my own
and CI testing, this was sufficient for the mostly empty NULL context, as
it compressed well (or the out-of-bounds access simply didn't cause an
issue). However, in real workloads on Cannonlake, we were overflowing
that array and causing random memory corruption.
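
The sizing problem can be sketched outside the driver. This is only an
illustration based on the description above, not the code in
i915_gpu_error.c: struct sketch_vma, SKETCH_PAGE_SHIFT and
sketch_capture_pages() are hypothetical stand-ins, and the compression
overhead is left out. It relies on C zeroing any member that a
designated initializer omits.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver structures; only the fields
 * that matter for the sizing calculation are modelled. */
struct sketch_node {
	uint64_t start;
	uint64_t size;
};

struct sketch_vma {
	struct sketch_node node;
	uint64_t size;
};

/* Assumed 4 KiB pages for the purpose of the sketch. */
#define SKETCH_PAGE_SHIFT 12

/* Pages the capture array is sized for: min(vma->size, vma->node.size),
 * before any allowance for compression overhead. */
static uint64_t sketch_capture_pages(const struct sketch_vma *vma)
{
	uint64_t bytes = vma->size < vma->node.size ? vma->size
						    : vma->node.size;

	return bytes >> SKETCH_PAGE_SHIFT;
}
```

With only .node.size set to a 16-page object size, .size is implicitly
zero and sketch_capture_pages() returns 0; once .size is also set to the
object size, it returns 16.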

Reported-by: Rafael Antognolli <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103964
Fixes: 4e90a6e22272 ("drm/i915: Record default HW state in the GPU error state")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Tested-by: Rodrigo Vivi <[email protected]>
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 876be8f1d930..48418fb81066 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1424,6 +1424,7 @@ capture_object(struct drm_i915_private *dev_priv,
        if (obj && i915_gem_object_has_pages(obj)) {
                struct i915_vma fake = {
                        .node = { .start = U64_MAX, .size = obj->base.size },
+                       .size = obj->base.size,
                        .pages = obj->mm.pages,
                        .obj = obj,
                };
-- 
2.15.1

