On Fri, 2014-08-29 at 19:18 +0800, Zhigang Gong wrote:
> I had a quick look at the CL kernel. It uses double:
> #pragma OPENCL EXTENSION cl_khr_fp64: enable
>
> __kernel void sum(
>     __global const double *a,
>     __global const double *b,
>     __global double *c,
>     long n)
> {
>     int gid = get_global_id(0);
>     if (gid < n)
>         c[gid] = a[gid] + b[gid];
> }
>
> But beignet doesn't currently support cl_khr_fp64. If an application
> wants to use double in a kernel, it needs to query the platform/device
> extensions to make sure cl_khr_fp64 is supported.
> You can try changing the demo code to use float rather than double;
> float is a basic data type and is supported by every OpenCL platform.
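
For reference, the extension check suggested above could look roughly like
the sketch below. It is not code from cl-demo or beignet: has_extension()
and kernel_real_type() are hypothetical helper names, and the sketch
assumes the host program has already fetched the device's space-separated
extension string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `name` occurs in `extensions` as a complete
 * space-separated token, 0 otherwise. A plain strstr() is not enough,
 * because it would also match a longer name that merely contains `name`. */
int has_extension(const char *extensions, const char *name)
{
    assert(extensions != NULL && name != NULL);

    size_t len = strlen(name);
    if (len == 0)
        return 0;

    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        /* The match must be bounded by the string edges or by spaces. */
        int at_start = (p == extensions) || (p[-1] == ' ');
        int at_end = (p[len] == '\0') || (p[len] == ' ');
        if (at_start && at_end)
            return 1;
        p += 1; /* keep searching after this partial match */
    }
    return 0;
}

/* Hypothetical helper: the element type the kernel source should use.
 * Falls back to float when the device does not advertise cl_khr_fp64. */
const char *kernel_real_type(const char *extensions)
{
    return has_extension(extensions, "cl_khr_fp64") ? "double" : "float";
}
```

In a real host program the string would come from
clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, size, buffer, NULL). With the
two EXTENSIONS lists quoted further down in this thread, the check picks
float for the beignet device and double for the pocl device.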
Tried. Now I see in stdout:
BAD 1!
Before this change:
BAD 2!

So, float doesn't fix this bug.

>
> > -----Original Message-----
> > From: Beignet [mailto:[email protected]] On Behalf Of Igor Gnatenko
> > Sent: Friday, August 29, 2014 5:53 PM
> > To: An open source open CL implemenation for Intel platform
> > Subject: [Beignet] [hpc12/tools] build of 'sum' on 'Intel(R) HD Graphics IvyBridge M GT2' failed
> >
> > Hi,
> >
> > I've tried to use example opencl program[0] and when I'm using
> > beignet it can't compile CL kernel.
> > $ CL_HELPER_PRINT_COMPILER_OUTPUT=1 ./cl-demo 1000 10
> > Choose platform:
> > [0] The pocl project
> > [1] Intel
> > Enter choice: 1
> > Choose device:
> > [0] Intel(R) HD Graphics IvyBridge M GT2
> > Enter choice: 0
> >
> > ---------------------------------------------------------------------
> > NAME: Intel(R) HD Graphics IvyBridge M GT2
> > VENDOR: Intel
> > PROFILE: FULL_PROFILE
> > VERSION: OpenCL 1.2 beignet 0.9.2
> > EXTENSIONS: cl_khr_global_int32_base_atomics
> >   cl_khr_global_int32_extended_atomics
> >   cl_khr_local_int32_base_atomics
> >   cl_khr_local_int32_extended_atomics
> >   cl_khr_byte_addressable_store cl_khr_icd
> > DRIVER_VERSION: 0.9.2
> >
> > Type: GPU
> > EXECUTION_CAPABILITIES: Kernel Native
> > GLOBAL_MEM_CACHE_TYPE: Read-Write (2)
> > CL_DEVICE_LOCAL_MEM_TYPE: Global (2)
> > SINGLE_FP_CONFIG: 0x6
> > QUEUE_PROPERTIES: 0x2
> >
> > VENDOR_ID: 358
> > MAX_COMPUTE_UNITS: 16
> > MAX_WORK_ITEM_DIMENSIONS: 3
> > MAX_WORK_GROUP_SIZE: 1024
> > PREFERRED_VECTOR_WIDTH_CHAR: 16
> > PREFERRED_VECTOR_WIDTH_SHORT: 16
> > PREFERRED_VECTOR_WIDTH_INT: 16
> > PREFERRED_VECTOR_WIDTH_LONG: 16
> > PREFERRED_VECTOR_WIDTH_FLOAT: 16
> > PREFERRED_VECTOR_WIDTH_DOUBLE: 0
> > MAX_CLOCK_FREQUENCY: 1000
> > ADDRESS_BITS: 32
> > MAX_MEM_ALLOC_SIZE: 268435456
> > IMAGE_SUPPORT: 1
> > MAX_READ_IMAGE_ARGS: 128
> > MAX_WRITE_IMAGE_ARGS: 8
> > IMAGE2D_MAX_WIDTH: 8192
> > IMAGE2D_MAX_HEIGHT: 8192
> > IMAGE3D_MAX_WIDTH: 8192
> > IMAGE3D_MAX_HEIGHT: 8192
> > IMAGE3D_MAX_DEPTH: 2048
> > MAX_SAMPLERS: 16
> > MAX_PARAMETER_SIZE: 1024
> > MEM_BASE_ADDR_ALIGN: 1024
> > MIN_DATA_TYPE_ALIGN_SIZE: 128
> > GLOBAL_MEM_CACHELINE_SIZE: 128
> > GLOBAL_MEM_CACHE_SIZE: 8192
> > GLOBAL_MEM_SIZE: 1073741824
> > MAX_CONSTANT_BUFFER_SIZE: 524288
> > MAX_CONSTANT_ARGS: 8
> > LOCAL_MEM_SIZE: 65536
> > ERROR_CORRECTION_SUPPORT: 0
> > PROFILING_TIMER_RESOLUTION: 80
> > ENDIAN_LITTLE: 1
> > AVAILABLE: 1
> > COMPILER_AVAILABLE: 1
> > MAX_WORK_GROUP_SIZES: 1024 1024 1024
> >
> > ---------------------------------------------------------------------
> > *** build of 'sum' on 'Intel(R) HD Graphics IvyBridge M GT2' said:
> >
> > *** (end of message)
> > 0.000123 s
> > 0.195890 GB/s
> > BAD 2!
> > Aborted (core dumped)
> >
> > When I'm using POCL - it's OK.
> > $ CL_HELPER_PRINT_COMPILER_OUTPUT=1 ./cl-demo 1000 10
> > Choose platform:
> > [0] The pocl project
> > [1] Intel
> > Enter choice: 0
> > Choose device:
> > [0] pthread-Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz
> > Enter choice: 0
> >
> > ---------------------------------------------------------------------
> > NAME: pthread-Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz
> > VENDOR: unknown
> > PROFILE: FULL_PROFILE
> > VERSION: OpenCL 1.2 pocl
> > EXTENSIONS: cl_khr_fp64 cl_khr_fp16 cl_khr_byte_addressable_store
> > DRIVER_VERSION: 0.9
> >
> > Type: Default CPU
> > EXECUTION_CAPABILITIES: Kernel Native
> > GLOBAL_MEM_CACHE_TYPE: None (0)
> > CL_DEVICE_LOCAL_MEM_TYPE: Global (2)
> > SINGLE_FP_CONFIG: 0x6
> > QUEUE_PROPERTIES: 0x2
> >
> > VENDOR_ID: 0
> > MAX_COMPUTE_UNITS: 4
> > MAX_WORK_ITEM_DIMENSIONS: 3
> > MAX_WORK_GROUP_SIZE: 1024
> > PREFERRED_VECTOR_WIDTH_CHAR: 16
> > PREFERRED_VECTOR_WIDTH_SHORT: 8
> > PREFERRED_VECTOR_WIDTH_INT: 4
> > PREFERRED_VECTOR_WIDTH_LONG: 2
> > PREFERRED_VECTOR_WIDTH_FLOAT: 4
> > PREFERRED_VECTOR_WIDTH_DOUBLE: 2
> > MAX_CLOCK_FREQUENCY: 3200
> > ADDRESS_BITS: 64
> > MAX_MEM_ALLOC_SIZE: 2037270528
> > IMAGE_SUPPORT: 1
> > MAX_READ_IMAGE_ARGS: 128
> > MAX_WRITE_IMAGE_ARGS: 128
> > IMAGE2D_MAX_WIDTH: 8192
> > IMAGE2D_MAX_HEIGHT: 8192
> > IMAGE3D_MAX_WIDTH: 2048
> > IMAGE3D_MAX_HEIGHT: 2048
> > IMAGE3D_MAX_DEPTH: 2048
> > MAX_SAMPLERS: 16
> > MAX_PARAMETER_SIZE: 1024
> > MEM_BASE_ADDR_ALIGN: 128
> > MIN_DATA_TYPE_ALIGN_SIZE: 128
> > GLOBAL_MEM_CACHELINE_SIZE: 0
> > GLOBAL_MEM_CACHE_SIZE: 0
> > GLOBAL_MEM_SIZE: 8149082112
> > MAX_CONSTANT_BUFFER_SIZE: 6332237824
> > MAX_CONSTANT_ARGS: 4294967304
> > LOCAL_MEM_SIZE: 2037270528
> > ERROR_CORRECTION_SUPPORT: 0
> > PROFILING_TIMER_RESOLUTION: 0
> > ENDIAN_LITTLE: 1
> > AVAILABLE: 1
> > COMPILER_AVAILABLE: 1
> > MAX_WORK_GROUP_SIZES: 1024 1024 1024
> >
> > ---------------------------------------------------------------------
> > *** build of 'sum' on 'pthread-Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz' said:
> >
> > *** (end of message)
> > 0.054946 s
> > 0.000437 GB/s
> > GOOD
> >
> > I don't know how-to debug this issue, so help me.
> >
> > [0]https://github.com/hpc12/tools/
> > --
> > -Igor Gnatenko
>
> _______________________________________________
> Beignet mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/beignet
>
--
-Igor Gnatenko
