> -----Original Message-----
> From: Song, Ruiling
> Sent: Tuesday, July 1, 2014 2:20 PM
> To: Gong, Zhigang; [email protected]
> Cc: Gong, Zhigang
> Subject: RE: [Beignet] [PATCH] runtime: fix potential curbe allocation issue.
>
>
> Two comments inline.
> -----Original Message-----
> From: Beignet [mailto:[email protected]] On Behalf Of
> Zhigang Gong
> Sent: Tuesday, July 01, 2014 12:55 PM
> To: [email protected]
> Cc: Gong, Zhigang
> Subject: [Beignet] [PATCH] runtime: fix potential curbe allocation issue.
>
> According to spec, different platforms have different curbe allocation
> restrication. The previous code set the curbe allocated size to 480 statically
> which is not correct.
>
> This patch change to always set the curbe entry num to 64 which is the
> maximum work group size. And set proper curbe allocation size according to
> the platform's hard limitation and a relatively reasonable kernel argument
> usage limitation.
>
> Signed-off-by: Zhigang Gong <[email protected]>
>
> +#define MAX_KERNEL_ARG_SIZE (32 * 4 + 24 * 4 + 5 * 64) * 64 // 32 integer
> arguments, 24 uniform special register and 5 vector special register.
> +
>
>
> >>> Seem that MAX_KERNEL_ARG_SIZE you calculate from backend? If we add
> more payload register, we need also change here?
This is not elegant current, but not a big deal as the backend is relatively
stable now,
at least for the special register part. I will put a comment there.
> >>> I think we don't need this, it is OK we simply return the maximum
> curbe_size provided in hardware here.
The hardware limitation is too large for most of the kernel. The typical
requirement should be about 512 entries.
But IVB and haswell have 2016 maximum entries, which is 4 times as we need.
> >> The recommended function name is intel_gpgpu_get_max_curbe_size.
>
Right, intel_gpgpu_xxx is better here. I firstly implement this function in the
cl_device_id.c but latter
I found it's better to be in the driver domain. I will change to use
intel_gpgpu_xxx name
> +LOCAL cl_int
> +cl_get_max_curbe_size(uint32_t device_id) {
> + int max_curbe_size;
> + if (IS_BAYTRAIL_T(device_id) ||
> + IS_IVB_GT1(device_id))
> + max_curbe_size = 992;
> + else
> + max_curbe_size = 2016;
> +
> + return (max_curbe_size*32) > MAX_KERNEL_ARG_SIZE ?
> + (MAX_KERNEL_ARG_SIZE / 32) : max_curbe_size; }
> +
>
>
> /* curbe_size */
> - OUT_BATCH(gpgpu->batch, 480);
> + OUT_BATCH(gpgpu->batch,
> + cl_get_max_curbe_size(gpgpu->drv->device_id));
>
> >>> I think you can set curbe size here as "gpgpu->curb.size_cs_entry *
> gpgpu->curb.num_cs_entries", what do you think?
size_cs_entry * num_cs_entries may exceed the limitation. get_max_curbe_size
will ensure we set
a smallest valid size here.
Any further comment?
_______________________________________________
Beignet mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/beignet