userptr requires exactly the same memory layout on the CPU and the GPU. The current implementation computes the size of each slice as row_pitch * h and ignores the slice_pitch provided by the application. So, enable userptr only if slice_pitch == row_pitch * h for image3d, image2d array and image1d array.

Signed-off-by: Guo Yejun <[email protected]>
---
 src/cl_mem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/cl_mem.c b/src/cl_mem.c
index cb2af47..ca3e76f 100644
--- a/src/cl_mem.c
+++ b/src/cl_mem.c
@@ -839,7 +839,8 @@ _cl_mem_new_image(cl_context ctx,
       int cacheline_size = 0;
       cl_get_device_info(ctx->device, CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE, sizeof(cacheline_size), &cacheline_size, NULL);
       if (ALIGN((unsigned long)data, cacheline_size) == (unsigned long)data &&
-          ALIGN(h, cl_buffer_get_tiling_align(ctx, CL_NO_TILE, 1)) == h) {
+          ALIGN(h, cl_buffer_get_tiling_align(ctx, CL_NO_TILE, 1)) == h &&
+          ((image_type != CL_MEM_OBJECT_IMAGE3D && image_type != CL_MEM_OBJECT_IMAGE1D_ARRAY && image_type != CL_MEM_OBJECT_IMAGE2D_ARRAY) || pitch * h == slice_pitch)) {
         tiling = CL_NO_TILE;
         enableUserptr = 1;
       }
-- 
1.9.1

_______________________________________________
Beignet mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/beignet
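
The slice-pitch condition from the patch can be looked at in isolation. The following is a minimal sketch, not the Beignet code: every name prefixed with sketch_ or SKETCH_ is hypothetical and stands in for the real image types and variables in _cl_mem_new_image. It only models the new slice_pitch test and leaves out the cacheline and tiling alignment checks that the real condition also requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CL_MEM_OBJECT_* image types in the patch. */
enum sketch_image_type {
  SKETCH_IMAGE2D,
  SKETCH_IMAGE3D,
  SKETCH_IMAGE1D_ARRAY,
  SKETCH_IMAGE2D_ARRAY
};

/* True for image types that are made of more than one slice. */
static bool sketch_has_slices(enum sketch_image_type type)
{
  return type == SKETCH_IMAGE3D ||
         type == SKETCH_IMAGE1D_ARRAY ||
         type == SKETCH_IMAGE2D_ARRAY;
}

/* Byte offset of slice z as the application laid it out in host memory. */
static size_t sketch_app_slice_offset(size_t slice_pitch, size_t z)
{
  return z * slice_pitch;
}

/* Byte offset of slice z when slices are assumed packed at row_pitch * h. */
static size_t sketch_packed_slice_offset(size_t row_pitch, size_t h, size_t z)
{
  return z * row_pitch * h;
}

/* Models only the slice_pitch part of the condition added by the patch:
 * images without slices pass, sliced images need slice_pitch == row_pitch * h. */
static bool sketch_layout_allows_userptr(enum sketch_image_type type,
                                         size_t row_pitch, size_t h,
                                         size_t slice_pitch)
{
  return !sketch_has_slices(type) || row_pitch * h == slice_pitch;
}
```

For example, with row_pitch = 256, h = 4 and slice_pitch = 2048, slice 1 starts at byte 2048 in the application's buffer but at byte 1024 under the packed assumption, so sketch_layout_allows_userptr() returns false for a 3D image and the host pointer would not be used directly. With slice_pitch = 1024 both offsets agree and the check passes.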
