Re: [Beignet] [PATCH v2] Add missed kernel names into built-in kernel list.

2017-06-21 Thread yan . wang
m > Sent: Thursday, June 22, 2017 13:52 > To: beignet@lists.freedesktop.org > Cc: Yan Wang > Subject: [Beignet] [PATCH v2] Add missed kernel names into built-in kernel > list. > > From: Yan Wang > > Signed-off-by: Yan Wang > --- > src/cl_gt_device.h | 17 +++

[Beignet] [PATCH v2] Add missed kernel names into built-in kernel list.

2017-06-21 Thread yan . wang
From: Yan Wang Signed-off-by: Yan Wang --- src/cl_gt_device.h | 17 + 1 file changed, 17 insertions(+) diff --git a/src/cl_gt_device.h b/src/cl_gt_device.h index f6cb5f8..ff23b32 100644 --- a/src/cl_gt_device.h +++ b/src/cl_gt_device.h @@ -115,16 +115,33 @@ DECL_INFO_STRING

Re: [Beignet] [PATCH] Add aligned copy kernels into built-in kernel list.

2017-06-21 Thread yan . wang
? > -Original Message- > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of > yan.w...@linux.intel.com > Sent: Wednesday, June 21, 2017 11:26 > To: beignet@lists.freedesktop.org > Cc: Yan Wang > Subject: [Beignet] [PATCH] Add aligned copy kernels into

[Beignet] [PATCH] Add aligned copy kernels into built-in kernel list.

2017-06-20 Thread yan . wang
From: Yan Wang Signed-off-by: Yan Wang --- src/cl_gt_device.h | 8 1 file changed, 8 insertions(+) diff --git a/src/cl_gt_device.h b/src/cl_gt_device.h index f6cb5f8..8008606 100644 --- a/src/cl_gt_device.h +++ b/src/cl_gt_device.h @@ -122,9 +122,17 @@ DECL_INFO_STRING

Re: [Beignet] [PATCH 2/2] Use aligned16 and aligne4 kernel to copy for large 3D image with TILE_Y.

2017-06-14 Thread yan . wang
manual and pushed, thanks. > -Original Message- > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of > yan.w...@linux.intel.com > Sent: Tuesday, June 13, 2017 16:32 > To: beignet@lists.freedesktop.org > Cc: Yan Wang > Subject: [Beignet] [PATCH 2

[Beignet] [PATCH 2/2] Use aligned16 and aligne4 kernel to copy for large 3D image with TILE_Y.

2017-06-13 Thread yan . wang
From: Yan Wang This is similar to the 2D image case and avoids truncating the extended image width. Signed-off-by: Yan Wang --- src/CMakeLists.txt | 2 + src/cl_context.h | 4 ++ src/cl_mem.c | 46

[Beignet] [PATCH 1/2] Add test case for large 3D image with TILE_Y.

2017-06-13 Thread yan . wang
From: Yan Wang It tests the aligned4 and aligned16 kernels for 3D images. Signed-off-by: Yan Wang --- utests/compiler_fill_large_image.cpp | 98 1 file changed, 98 insertions(+) diff --git a/utests/compiler_fill_large_image.cpp b/utests

[Beignet] [PATCH v5 7/7] Optimize clEnqueueWriteImageByKernel and clEnqueuReadImageByKernel.

2017-06-13 Thread yan . wang
From: Yan Wang 1. Copy only the data within the specified origin and region. 2. Add clFinish to guarantee that the kernel copy has finished for blocking writes. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 25 ++--- 1 file changed, 18 insertions(+), 7 deletions(-) diff --git a

[Beignet] [PATCH v5 6/7] Fix bug of clEnqueueUnmapMemObjectForKernel and clEnqueueMapImageByKernel.

2017-06-13 Thread yan . wang
From: Yan Wang 1. Support writing data in map/unmap mode. 2. Add mapping record logic. 3. Add clFinish to guarantee that the kernel copy has finished. 4. Fix the call to clEnqueueMapImageByKernel: the blocking_map and map_flags arguments need to be swapped. Signed-off-by: Yan Wang --- src

[Beignet] [PATCH v4 7/7] Optimize clEnqueueWriteImageByKernel and clEnqueuReadImageByKernel.

2017-06-12 Thread yan . wang
From: Yan Wang 1. Copy only the data within the specified origin and region. 2. Add clFinish to guarantee that the kernel copy has finished for blocking writes. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/src

[Beignet] [PATCH v4 3/7] Add utest to test writing data into large image (TILE_Y) by map/unmap and USE_HOST_PTR mode.

2017-06-12 Thread yan . wang
From: Yan Wang Signed-off-by: Yan Wang --- utests/runtime_use_host_ptr_large_image.cpp | 115 1 file changed, 115 insertions(+) diff --git a/utests/runtime_use_host_ptr_large_image.cpp b/utests/runtime_use_host_ptr_large_image.cpp index c8200b3..3c77cae 100644

[Beignet] [PATCH v4 5/7] Add clFinish for guarantee the kernel copying is finished when create TILE_Y large image.

2017-06-12 Thread yan . wang
From: Yan Wang Signed-off-by: Yan Wang --- src/cl_mem.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/cl_mem.c b/src/cl_mem.c index 3f41fd8..b6dce3f 100644 --- a/src/cl_mem.c +++ b/src/cl_mem.c @@ -817,6 +817,13 @@ _cl_new_image_copy_from_host_ptr(cl_context ctx, return

[Beignet] [PATCH v4 6/7] Fix bug of clEnqueueUnmapMemObjectForKernel and clEnqueueMapImageByKernel.

2017-06-12 Thread yan . wang
From: Yan Wang 1. Support writing data in map/unmap mode. 2. Add mapping record logic. 3. Add clFinish to guarantee that the kernel copy has finished. 4. Fix the call to clEnqueueMapImageByKernel: the blocking_map and map_flags arguments need to be swapped. Signed-off-by: Yan Wang --- src

[Beignet] [PATCH v4 4/7] Add cl_mem_record_map_mem_for_kernel() for record map adress for TILE_Y image by kernel copying.

2017-06-12 Thread yan . wang
From: Yan Wang Signed-off-by: Yan Wang --- src/cl_mem.c | 109 +-- src/cl_mem.h | 5 +++ 2 files changed, 88 insertions(+), 26 deletions(-) diff --git a/src/cl_mem.c b/src/cl_mem.c index a8543c9..3f41fd8 100644 --- a/src/cl_mem.c +++ b

[Beignet] [PATCH v4 2/7] Add utest to test writing data into large image (TILE_Y) by map/unmap mode.

2017-06-12 Thread yan . wang
From: Yan Wang It reproduces the clCopyImage/clFillImage bug in the conformance test. Signed-off-by: Yan Wang --- utests/compiler_copy_large_image.cpp | 198 +++ 1 file changed, 198 insertions(+) diff --git a/utests/compiler_copy_large_image.cpp b

[Beignet] [PATCH v4 1/7] Add utest case for filling image by small region.

2017-06-12 Thread yan . wang
From: Yan Wang It reproduces the bug in the allocations conformance test. Signed-off-by: Yan Wang --- utests/compiler_fill_large_image.cpp | 50 1 file changed, 50 insertions(+) diff --git a/utests/compiler_fill_large_image.cpp b/utests

[Beignet] [PATCH v3 7/7] Optimize clEnqueueWriteImageByKernel and clEnqueuReadImageByKernel.

2017-06-07 Thread yan . wang
From: Yan Wang 1. Copy only the data within the specified origin and region. 2. Add clFinish to guarantee that the kernel copy has finished for blocking writes. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/src

[Beignet] [PATCH v3 6/7] Fix bug of clEnqueueUnmapMemObjectForKernel and clEnqueueMapImageByKernel.

2017-06-07 Thread yan . wang
From: Yan Wang 1. Support writing data in map/unmap mode. 2. Add mapping record logic. 3. Add clFinish to guarantee that the kernel copy has finished. 4. Fix the call to clEnqueueMapImageByKernel: the blocking_map and map_flags arguments need to be swapped. Signed-off-by: Yan Wang --- src

[Beignet] [PATCH v3 5/7] Add clFinish for guarantee the kernel copying is finished when create TILE_Y large image.

2017-06-07 Thread yan . wang
From: Yan Wang Signed-off-by: Yan Wang --- src/cl_mem.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/cl_mem.c b/src/cl_mem.c index 3f41fd8..b6dce3f 100644 --- a/src/cl_mem.c +++ b/src/cl_mem.c @@ -817,6 +817,13 @@ _cl_new_image_copy_from_host_ptr(cl_context ctx, return

[Beignet] [PATCH v3 1/7] Add utest case for filling image by small region.

2017-06-07 Thread yan . wang
From: Yan Wang It reproduces the bug in the allocations conformance test. Signed-off-by: Yan Wang --- utests/compiler_fill_large_image.cpp | 50 1 file changed, 50 insertions(+) diff --git a/utests/compiler_fill_large_image.cpp b/utests

[Beignet] [PATCH v3 2/7] Add utest to test writing data into large image (TILE_Y) by map/unmap mode.

2017-06-07 Thread yan . wang
From: Yan Wang It reproduces the clCopyImage/clFillImage bug in the conformance test. Signed-off-by: Yan Wang --- utests/compiler_copy_large_image.cpp | 176 +++ 1 file changed, 176 insertions(+) diff --git a/utests/compiler_copy_large_image.cpp b

[Beignet] [PATCH v3 4/7] Add cl_mem_record_map_mem_for_kernel() for record map adress for TILE_Y image by kernel copying.

2017-06-07 Thread yan . wang
From: Yan Wang Signed-off-by: Yan Wang --- src/cl_mem.c | 109 +-- src/cl_mem.h | 5 +++ 2 files changed, 88 insertions(+), 26 deletions(-) diff --git a/src/cl_mem.c b/src/cl_mem.c index a8543c9..3f41fd8 100644 --- a/src/cl_mem.c +++ b

[Beignet] [PATCH v3 3/7] Add utest to test writing data into large image (TILE_Y) by map/unmap and USE_HOST_PTR mode.

2017-06-07 Thread yan . wang
From: Yan Wang Signed-off-by: Yan Wang --- utests/runtime_use_host_ptr_large_image.cpp | 109 1 file changed, 109 insertions(+) diff --git a/utests/runtime_use_host_ptr_large_image.cpp b/utests/runtime_use_host_ptr_large_image.cpp index c8200b3..8f3e330 100644

[Beignet] [PATCH v2 2/2] Fix bug of size of tmp_ker_buf for TILE_Y copying of large image.

2017-05-26 Thread yan . wang
From: Yan Wang 1. The size should be calculated from the region and bpp of the image instead of the whole image size. 2. In blocking mode, the copy kernel must have finished. Otherwise, the allocations conformance test fails. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 26

[Beignet] [PATCH 1/2] Add utest case for filling image by small region.

2017-05-26 Thread yan . wang
From: Yan Wang It reproduces the bug in the allocations conformance test. Signed-off-by: Yan Wang --- utests/compiler_fill_large_image.cpp | 50 1 file changed, 50 insertions(+) diff --git a/utests/compiler_fill_large_image.cpp b/utests

[Beignet] [PATCH 2/2] Fix bug of size of tmp_ker_buf for TILE_Y copying of large image.

2017-05-26 Thread yan . wang
From: Yan Wang The size should be calculated from the region and bpp of the image instead of the whole image size. Otherwise, the allocations conformance test fails. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff
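The sizing rule described in this patch can be sketched as follows. This is only an illustration under stated assumptions, not the Beignet code: `region_buffer_size` and `image_buffer_size` are hypothetical helpers, and the staging buffer is assumed to be tightly packed with `bpp` counted in bytes per pixel.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from Beignet): bytes needed for a tightly packed
// staging buffer that holds only the requested region.
// region = {width, height, depth} in pixels; bpp = bytes per pixel (assumed).
std::size_t region_buffer_size(const std::size_t region[3], std::size_t bpp) {
  assert(bpp > 0);
  return region[0] * region[1] * region[2] * bpp;
}

// For comparison: the size obtained when the whole image is used instead,
// which over-allocates whenever the region is smaller than the image.
std::size_t image_buffer_size(std::size_t width, std::size_t height,
                              std::size_t depth, std::size_t bpp) {
  assert(bpp > 0);
  return width * height * depth * bpp;
}
```

For a 16 x 16 x 1 region at 4 bytes per pixel the region-based size is 1024 bytes, while an 8192 x 8192 x 1 image at the same bpp needs 268435456 bytes, which shows why sizing by the whole image can exhaust allocations.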

[Beignet] [PATCH v2 2/2] Fix bug of clEnqueueCopyBufferToImage and clEnqueueCopyImageToBuffer.

2017-05-25 Thread yan . wang
From: Yan Wang The "imagedim_non_pow_2" cases of the basic module of the conformance test regressed after the previous patch enabled TILE_Y mode for large images. This bug comes from the non-align16 kernel of clEnqueueCopyBufferToImage and clEnqueueCopyImageToBuffer. It will force CL_RGBA/CL_

[Beignet] [PATCH 2/2] Fix bug of clEnqueueCopyBufferToImage and clEnqueueCopyImageToBuffer.

2017-05-24 Thread yan . wang
From: Yan Wang The "imagedim_non_pow_2" cases of the basic module of the conformance test regressed after the previous patch enabled TILE_Y mode for large images. This bug comes from the non-align16 kernel of clEnqueueCopyBufferToImage and clEnqueueCopyImageToBuffer. It will force CL_RGBA/CL_

[Beignet] [PATCH 1/2] Add utest to reproduce the bug of imagedim_non_pow_2 cases of conformance test.

2017-05-24 Thread yan . wang
From: Yan Wang Signed-off-by: Yan Wang --- utests/compiler_fill_large_image.cpp | 46 1 file changed, 46 insertions(+) diff --git a/utests/compiler_fill_large_image.cpp b/utests/compiler_fill_large_image.cpp index 6fb872d..1ecf65b 100644 --- a/utests

[Beignet] [PATCH v3 8/8] Implement TILE_Y large image in clEnqueueWriteImage.

2017-05-16 Thread yan . wang
From: Yan Wang Copying data from the host ptr to a large TILE_Y image with memcpy fails. Use clEnqueueCopyBufferToImage to do this on the GPU side. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 46 ++ 1 file changed, 46 insertions(+) diff --git a/src

[Beignet] [PATCH v3 7/8] Implement TILE_Y large image in clEnqueueReadImage.

2017-05-16 Thread yan . wang
From: Yan Wang Copying data from a large TILE_Y image to a buffer with memcpy fails. Use clEnqueueCopyImageToBuffer to do this on the GPU side. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 55 +++ 1 file changed, 55 insertions(+) diff --git

[Beignet] [PATCH v3 5/8] Create image with TILE_Y mode still when image size>128MB for performance.

2017-05-16 Thread yan . wang
From: Yan Wang Copying data from the host ptr to a large TILE_Y image may fail, so use clCopyBufferToImage to do this on the GPU side. Signed-off-by: Yan Wang --- src/cl_context.c | 6 src/cl_context.h | 2 +- src/cl_mem.c | 107

[Beignet] [PATCH v3 6/8] Implement TILE_Y large image in clEnqueueMapImage and clEnqueueUnmapMemObject.

2017-05-16 Thread yan . wang
From: Yan Wang Copying data from a large TILE_Y image to a buffer with memcpy fails. Use clEnqueueCopyImageToBuffer to do this. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 111 +++ 1 file changed, 111 insertions(+) diff --git a/src

[Beignet] [PATCH v3 4/8] Add image use_hostptr benchmark case for testing large image operations.

2017-05-16 Thread yan . wang
From: Yan Wang It is for testing large image with TILE_Y mode. Signed-off-by: Yan Wang --- benchmark/CMakeLists.txt | 1 + benchmark/benchmark_use_host_ptr_large_image.cpp | 84 2 files changed, 85 insertions(+) create mode 100644 benchmark

[Beignet] [PATCH v3 2/8] Add image filling case for testing large image operations.

2017-05-16 Thread yan . wang
From: Yan Wang It is for testing large image with TILE_Y mode. Signed-off-by: Yan Wang --- utests/CMakeLists.txt| 1 + utests/compiler_fill_large_image.cpp | 120 +++ 2 files changed, 121 insertions(+) create mode 100644 utests

[Beignet] [PATCH v3 3/8] Add image use_hostptr case for testing large image operations.

2017-05-16 Thread yan . wang
From: Yan Wang It is for testing large image with TILE_Y mode. Signed-off-by: Yan Wang --- utests/CMakeLists.txt | 1 + utests/runtime_use_host_ptr_large_image.cpp | 75 + 2 files changed, 76 insertions(+) create mode 100644 utests

[Beignet] [PATCH v3 1/8] Add image copying case for testing large image operations.

2017-05-16 Thread yan . wang
From: Yan Wang It is for testing large image with TILE_Y mode. Signed-off-by: Yan Wang --- utests/CMakeLists.txt| 1 + utests/compiler_copy_large_image.cpp | 121 +++ 2 files changed, 122 insertions(+) create mode 100644 utests

[Beignet] [PATCH v2 6/6] Implement TILE_Y large image in clEnqueueWriteImage.

2017-05-14 Thread yan . wang
From: Yan Wang Copying data from the host ptr to a large TILE_Y image with memcpy fails. Use clEnqueueCopyBufferToImage to do this on the GPU side. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 46 ++ 1 file changed, 46 insertions(+) diff --git a/src

[Beignet] [PATCH v2 5/6] Implement TILE_Y large image in clEnqueueReadImage.

2017-05-14 Thread yan . wang
From: Yan Wang Copying data from a large TILE_Y image to a buffer with memcpy fails. Use clEnqueueCopyImageToBuffer to do this on the GPU side. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 53 + 1 file changed, 53 insertions(+) diff --git a

[Beignet] [PATCH v2 4/6] Implement TILE_Y large image in clEnqueueMapImage and clEnqueueUnmapMemObject.

2017-05-14 Thread yan . wang
From: Yan Wang Copying data from a large TILE_Y image to a buffer with memcpy fails. Use clEnqueueCopyImageToBuffer to do this. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 88 1 file changed, 88 insertions(+) diff --git a/src

[Beignet] [PATCH 5/6] Implement TILE_Y large image in clEnqueueReadImage.

2017-05-09 Thread yan . wang
From: Yan Wang Copying data from a large TILE_Y image to a buffer with memcpy fails. Use clEnqueueCopyImageToBuffer to do this on the GPU side. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 53 + 1 file changed, 53 insertions(+) diff --git a

[Beignet] [PATCH 6/6] Implement TILE_Y large image in clEnqueueWriteImage.

2017-05-09 Thread yan . wang
From: Yan Wang Copying data from the host ptr to a large TILE_Y image with memcpy fails. Use clEnqueueCopyBufferToImage to do this on the GPU side. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 47 +++ 1 file changed, 47 insertions(+) diff --git a/src

[Beignet] [PATCH 4/6] Implement TILE_Y large image in clEnqueueMapImage and clEnqueueUnmapMemObject.

2017-05-09 Thread yan . wang
From: Yan Wang Copying data from a large TILE_Y image to a buffer with memcpy fails. Use clEnqueueCopyImageToBuffer to do this. Signed-off-by: Yan Wang --- src/cl_api_mem.c | 88 1 file changed, 88 insertions(+) diff --git a/src

[Beignet] [PATCH 3/6] Create image with TILE_Y mode still when image size > 128MB for performance.

2017-05-09 Thread yan . wang
From: Yan Wang Copying data from the host ptr to a large TILE_Y image may fail, so use clCopyBufferToImage to do this on the GPU side. Signed-off-by: Yan Wang --- src/cl_mem.c | 100 --- src/cl_mem.h | 2 ++ 2 files changed, 97 insertions

[Beignet] [PATCH 2/6] Add image filling case for testing large image operations.

2017-05-09 Thread yan . wang
From: Yan Wang It is for testing large image with TILE_Y mode. Signed-off-by: Yan Wang --- utests/CMakeLists.txt| 1 + utests/compiler_fill_large_image.cpp | 124 +++ 2 files changed, 125 insertions(+) create mode 100644 utests

[Beignet] [PATCH 1/6] Add image copying case for testing large image operations.

2017-05-09 Thread yan . wang
From: Yan Wang It is for testing large image with TILE_Y mode. Signed-off-by: Yan Wang --- utests/CMakeLists.txt| 1 + utests/compiler_copy_large_image.cpp | 121 +++ 2 files changed, 122 insertions(+) create mode 100644 utests

Re: [Beignet] [PATCH] Set the bit of "cross thread constant dara read length".

2017-04-05 Thread yan . wang
This patch may not be correct. It needs more investigation and changes to adjust the curbe_offset of patches for constant data sharing between GPU threads. Please ignore it. Sorry. yan.wang From: yan.wang Date: 2017-04-05 15:20 To: beignet CC: Yan Wang Subject: [Beignet] [PATCH] Set the bit of "

[Beignet] [PATCH] Set the bit of "cross thread constant dara read length".

2017-04-05 Thread yan . wang
From: Yan Wang Set this bit to enable constant data sharing between GPU threads. Signed-off-by: Yan Wang --- src/intel/intel_gpgpu.c | 2 ++ src/intel/intel_structs.h | 5 - 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/src/intel/intel_gpgpu.c b/src/intel/intel_gpgpu.c

Re: [Beignet] [PATCH v2] Provide more possible candidate of load/store as possible.

2017-03-09 Thread yan . wang
day, March 9, 2017 5:41 PM > To: beignet@lists.freedesktop.org > Cc: Yan Wang > Subject: [Beignet] [PATCH v2] Provide more possible candidate of load/store as > possible. > > From: Yan Wang > > Avoid searching range too small in some case like vector of float. > It will le

Re: [Beignet] [PATCH v2] Provide more possible candidate of load/store as possible.

2017-03-09 Thread yan . wang
t-boun...@lists.freedesktop.org] On Behalf Of > yan.w...@linux.intel.com > Sent: Thursday, March 9, 2017 5:41 PM > To: beignet@lists.freedesktop.org > Cc: Yan Wang > Subject: [Beignet] [PATCH v2] Provide more possible candidate of load/store as > possible. > > From: Yan Wang > >

[Beignet] [PATCH v2] Provide more possible candidate of load/store as possible.

2017-03-09 Thread yan . wang
From: Yan Wang Avoid a search range that is too small in some cases, such as vectors of float. This lets more loads/stores be merged, improving performance. Signed-off-by: Yan Wang --- backend/src/llvm/llvm_loadstore_optimization.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a

[Beignet] [PATCH] Provide more possible candidate of load/store as possible.

2017-03-09 Thread yan . wang
From: Yan Wang Avoid a search range that is too small in some cases, such as vectors of float. This lets more loads/stores be merged, improving performance. Signed-off-by: Yan Wang --- backend/src/llvm/llvm_loadstore_optimization.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a

[Beignet] [PATCH] MAD compact instrcution could not support "absolute" attribute.

2017-02-23 Thread yan . wang
From: Yan Wang If the absolute bit of a MAD instruction's sources is 1, do not use the compact instruction. Signed-off-by: Yan Wang --- backend/src/backend/gen_insn_compact.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/backend/src/backend/gen_insn_compact.cpp b/backend/src/ba
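The guard described here might look roughly like the sketch below. The `SourceOperand` and `MadInstruction` types and the `mad_can_use_compact` function are hypothetical stand-ins invented for illustration; the real encoder structures in gen_insn_compact.cpp are not shown in this snippet, and a real compaction check would involve further conditions beyond this one.

```cpp
#include <cassert>

// Hypothetical operand and instruction types, for illustration only.
struct SourceOperand {
  bool absolute = false;  // "absolute" source modifier
  bool negate = false;    // "negate" source modifier
};

struct MadInstruction {
  SourceOperand src[3];   // MAD takes three sources
};

// Returns false as soon as any source carries the absolute modifier, so the
// caller falls back to the full-size encoding. This models only the single
// check named in the patch description.
bool mad_can_use_compact(const MadInstruction &insn) {
  for (const SourceOperand &s : insn.src) {
    if (s.absolute) return false;
  }
  return true;
}
```

Rejecting compaction up front keeps the fallback simple: the instruction is emitted in its uncompacted form, which can still express the modifier.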

[Beignet] [PATCH] Avoid possible invalid pointer by vector interator.

2016-12-28 Thread yan . wang
From: Yan Wang The "revisit" vector container has more elements pushed into it in findPointerEsacape(), which can invalidate a previously obtained iterator and leave a possibly invalid pointer. When compiling a huge kernel such as Blender's, this causes random segmentation fault crashes. Using the [] operator is safer. Sig
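The hazard described here is standard C++ behavior: `std::vector::push_back` may reallocate and invalidate every outstanding iterator, pointer, and reference, whereas an index remains meaningful. The sketch below illustrates the index-based pattern with a hypothetical worklist; `drain_worklist` is not Beignet code and only mirrors the idea of appending to a container while walking it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical worklist: visiting an item may append follow-up items.
// Walking by index stays valid even if push_back reallocates the storage;
// an iterator obtained before the push_back would not.
int drain_worklist(std::vector<int> work) {
  int visited = 0;
  for (std::size_t i = 0; i < work.size(); ++i) {
    // Copy the element first: a reference into the vector could dangle
    // after the push_back below.
    const int item = work[i];
    if (item > 0) {
      work.push_back(item - 1);  // may reallocate
    }
    ++visited;
  }
  return visited;
}
```

Re-reading `work.size()` on each iteration also ensures the newly appended items are visited, which a cached end iterator would miss even if it stayed valid.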

[Beignet] [PATCH] Avoid possible invalid pointer by vector interator.

2016-12-28 Thread yan . wang
From: Yan Wang The "revisit" vector container has more elements pushed into it in findPointerEsacape(), which can invalidate a previously obtained iterator and leave a possibly invalid pointer. When compiling a huge kernel such as Blender's, this causes random segmentation fault crashes. Using the [] operator is safer. --

Re: [Beignet] [PATCH] GBE: reorder the LLVM pass to reduce the compilation time.

2016-12-25 Thread Yan Wang
LGTM. Thanks. Yan Wang On Fri, 2016-12-16 at 16:38 +0800, Yang Rong wrote: > Set all function's linkage to LinkOnceAnyLinkage, then Inlining pass > could delete the inlined functions. > And reorder createFunctionInliningPass before > createStripAttributesPass > can reduce

[Beignet] [PATCH] Restore jump threading pass for reducing compiling time when run the large and complex kernel like Luxmark.

2016-12-08 Thread yan . wang
From: Yan Wang The jump threading pass can optimize the connections between the LLVM basic blocks of a function and provides the chance to merge and remove unnecessary basic blocks, reducing compilation time and ASM code size. Signed-off-by: Yan Wang --- backend/src/llvm/llvm_to_gen.cpp | 2

Re: [Beignet] [PATCH 1/2] remove some redundant code for printf

2016-11-29 Thread Yan Wang
LGTM. Thanks. Yan Wang On Mon, 2016-11-21 at 18:16 +0800, Guo, Yejun wrote: > tmp0 is added into src in selection stage, and just ignored at > context > stage, it is redundant. > > Signed-off-by: Guo, Yejun > --- > backend/src/backend/gen_context.cpp| 2 -- &

Re: [Beignet] [PATCH 2/2] do not care dst for printf

2016-11-29 Thread Yan Wang
LGTM. Thanks. Yan Wang On Mon, 2016-11-21 at 18:16 +0800, Guo, Yejun wrote: > actually, the dst of printf means nothing, don't need to touch it. > > Signed-off-by: Guo, Yejun > --- > backend/src/backend/gen_context.cpp| 14 ++ > backend/src/backend

[Beignet] [PATCH v2] Fix bug: Initialize bti of LoadInstuctionPattern::shootByteGatherMsg().

2016-11-23 Thread yan . wang
From: Yan Wang If it isn't initialized, the Luxmark hotel scene is displayed incorrectly. --- backend/src/backend/gen_insn_selection.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/src/backend/gen_insn_selection.cpp b/backend/src/backend/gen_insn_selection.cpp

[Beignet] [PATCH] Fix bug: Initialize bti of LoadInstuctionPattern::shootByteGatherMsg().

2016-11-23 Thread yan . wang
From: Yan Wang If it isn't initialized, the Luxmark hotel scene is displayed incorrectly. --- backend/src/backend/gen_insn_selection.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/backend/src/backend/gen_insn_selection.cpp b/backend/src/backend/gen_insn_selection.cpp

[Beignet] [PATCH] Fix bug: Initialize bti LoadInstuctionPattern::shootUntypedReadMsg().

2016-11-23 Thread yan . wang
From: Yan Wang If it isn't initialized, the Luxmark hotel scene is displayed incorrectly. --- backend/src/backend/gen_insn_selection.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/backend/src/backend/gen_insn_selection.cpp b/backend/src/backend/gen_insn_selection.cpp

[Beignet] [PATCH] Fix getting bitwidth of PointerType of LLVM.

2016-11-17 Thread yan . wang
From: Yan Wang A PointerType cannot be cast to IntegerType to get its bit width. Following Rong's comments, use getTypeBitSize() instead of Type::getIntegerBitWidth(). --- backend/src/llvm/llvm_gen_backend.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/src

[Beignet] [PATCH] Reduce the compilation time of inline pass in runModulePass().

2016-10-25 Thread yan . wang
From: Yan Wang This can greatly reduce compilation time when running Luxmark scenes. Avoid calling the inline pass many times in runModulePass when the module is changed by other passes. Create a single function to run the inline pass. In this single function, the lower pass and strict-math-related passes are also added

Re: [Beignet] [PATCH] Add read_imagef benchmark for optimization.

2016-09-13 Thread Yan Wang
On Mon, 2016-09-12 at 06:53 +, Yang, Rong R wrote: > > > -Original Message- > > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On > > Behalf Of > > yan.w...@linux.intel.com > > Sent: Monday, September 5, 2016 14:52 > > To: beignet

[Beignet] [PATCH] Add read_imagef benchmark for optimization.

2016-09-04 Thread yan . wang
From: Yan Wang --- benchmark/CMakeLists.txt | 1 + benchmark/benchmark_read_image_float.cpp | 65 kernels/compiler_read_image_float.cl | 9 + 3 files changed, 75 insertions(+) create mode 100644 benchmark/benchmark_read_image_float.cpp

[Beignet] [PATCH] Add cl_khr_3d_image_writes into info string.

2016-06-02 Thread yan . wang
From: Yan Wang The extension is in fact supported; listing it avoids misunderstanding. --- src/cl_extensions.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/cl_extensions.c b/src/cl_extensions.c index 349f2f1..183aafc 100644 --- a/src/cl_extensions.c +++ b/src/cl_extensions.c @@ -48,6 +48,8

[Beignet] [PATCH] Remove unncessary assertion in printf processing.

2016-05-02 Thread yan . wang
From: Yan Wang It triggers an alert when printf is given a long vector. --- backend/src/llvm/llvm_gen_backend.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/src/llvm/llvm_gen_backend.cpp b/backend/src/llvm/llvm_gen_backend.cpp index 51a1dab..7d21ebf 100644 --- a/backend/src

[Beignet] [PATCH] Add condition checking of residuals because it may be NULL.

2016-03-28 Thread yan . wang
From: Yan Wang --- src/kernels/cl_internal_block_motion_estimate_intel.cl | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/kernels/cl_internal_block_motion_estimate_intel.cl b/src/kernels/cl_internal_block_motion_estimate_intel.cl index 23c5488..e56520a 100644

Re: [Beignet] [PATCH] utest: do not check MV near image border

2016-03-19 Thread yan . wang
Now this case can pass when the preceding test_printf case contains multiple tests. The VME engine seems to read data beyond the specified image buffer, which is backed by a drm bo. If the drm bo of the src/ref image object happens to reuse a previous bo that contains garbage, the MV results will differ. Yan

[Beignet] [Printf v2][PATCH 07/12] Implement emision of printf instruction.

2016-02-04 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/llvm/llvm_gen_backend.cpp | 95 +-- 1 file changed, 80 insertions(+), 15 deletions(-) diff --git a/backend/src/llvm/llvm_gen_backend.cpp b/backend/src/llvm/llvm_gen_backend.cpp

Re: [Beignet] [Printf v2][PATCH 07/12] Add the implementation of printf ir instruction.

2016-02-04 Thread yan . wang
Sorry. I have re-sent 7/12. Yan Wang > patch of 06 and 07 have the same title? > I think it is a typo here. > Please correct it. > All the other things are OK, just rename this one and > the whole patchset can be pushed later. > > Also can push my patch about printf test c

Re: [Beignet] [PATCH] Fix type assert error generated by lstPartSum incorrect type

2016-02-03 Thread yan . wang
After applying this patch, benchmark of workgroup add optimization could run on my BSW platform. Thanks. Yan Wang > Signed-off-by: Grigore Lupescu > --- > backend/src/backend/gen_insn_selection.cpp | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff -

[Beignet] [Printf v2][PATCH 12/12] Scalarize vector in printf.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/llvm/llvm_scalarize.cpp | 5 + 1 file changed, 5 insertions(+) diff --git a/backend/src/llvm/llvm_scalarize.cpp b/backend/src/llvm/llvm_scalarize.cpp index 899a696..2cc8179 100644 --- a/backend/src/llvm

[Beignet] [Printf v2][PATCH 11/12] Output printf result.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/ir/printf.cpp | 122 +- backend/src/ir/printf.hpp | 2 +- 2 files changed, 112 insertions(+), 12 deletions(-) diff --git a/backend/src/ir/printf.cpp b/backend/src/ir

[Beignet] [Printf v2][PATCH 09/12] Implement ASM generation of printf.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/backend/gen8_context.cpp | 36 +++ backend/src/backend/gen8_context.hpp | 1 + backend/src/backend/gen_context.cpp | 70 backend/src/backend/gen_context.hpp

[Beignet] [Printf v2][PATCH 10/12] Implement printf buffer management.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/backend/program.cpp | 10 + backend/src/backend/program.h | 12 +- backend/src/backend/program.hpp | 7 backend/src/ir/printf.cpp | 3 +- backend/src/ir/printf.hpp | 3 +- backend/src

[Beignet] [Printf v2][PATCH 06/12] Add the implementation of printf ir instruction.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/ir/instruction.cpp | 57 +- backend/src/ir/instruction.hpp | 13 ++ backend/src/ir/instruction.hxx | 1 + backend/src/ir/register.cpp| 8 ++ backend/src/ir

[Beignet] [Printf v2][PATCH 08/12] Implement instruction selection of printf.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/backend/gen_context.cpp| 3 + backend/src/backend/gen_context.hpp| 1 + .../src/backend/gen_insn_gen7_schedule_info.hxx| 3 +- backend/src/backend/gen_insn_selection.cpp

[Beignet] [Printf v2][PATCH 07/12] Add the implementation of printf ir instruction.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/llvm/llvm_gen_backend.cpp | 95 +-- 1 file changed, 80 insertions(+), 15 deletions(-) diff --git a/backend/src/llvm/llvm_gen_backend.cpp b/backend/src/llvm/llvm_gen_backend.cpp

[Beignet] [Printf v2][PATCH 05/12] Add tuple processing logic for printf.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/ir/context.hpp | 5 + backend/src/ir/function.hpp | 8 2 files changed, 13 insertions(+) diff --git a/backend/src/ir/context.hpp b/backend/src/ir/context.hpp index b95741f..877d639 100644 --- a

[Beignet] [Printf v2][PATCH 04/12] Add LLVM fcuntion definition of printf.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/llvm/llvm_gen_ocl_function.hxx | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/backend/src/llvm/llvm_gen_ocl_function.hxx b/backend/src/llvm/llvm_gen_ocl_function.hxx index e3d89a3..dd7816c

[Beignet] [Printf v2][PATCH 03/12] Reconstruct printf parser.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/ir/unit.cpp | 1 - backend/src/ir/unit.hpp | 2 +- backend/src/llvm/llvm_gen_backend.cpp | 4 +- backend/src/llvm/llvm_printf_parser.cpp | 115

[Beignet] [Printf v2][PATCH 01/12] Change printf data structure and remove old code.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/backend/program.cpp | 14 -- backend/src/backend/program.hpp | 10 +- backend/src/gbe_bin_interpreter.cpp | 2 - backend/src/ir/printf.cpp | 168 - backend/src

[Beignet] [Printf v2][PATCH 02/12] Add PrintfLog structure.

2016-01-31 Thread yan . wang
From: Yan Wang Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/ir/printf.hpp | 25 + 1 file changed, 25 insertions(+) diff --git a/backend/src/ir/printf.hpp b/backend/src/ir/printf.hpp index def6331..6b2b741 100644 --- a/backend/src/ir/printf.hpp +++ b

Re: [Beignet] [Printf][PATCH 06/11] Implement emision of printf instruction.

2016-01-31 Thread yan . wang
Now the root cause has been found. The allocated surface size is not enough because it depends on the global size. I will fix it and resend the patch set based on all previous review comments. Thanks. Yan Wang > After applied the printf patch set, I find the last test still > failed, pleas

[Beignet] [Printf][PATCH 11/11] Scalarize vector in printf.

2016-01-20 Thread Yan Wang
Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/llvm/llvm_scalarize.cpp | 5 + 1 file changed, 5 insertions(+) diff --git a/backend/src/llvm/llvm_scalarize.cpp b/backend/src/llvm/llvm_scalarize.cpp index 899a696..2cc8179 100644 --- a/backend/src/llvm/llvm_scalarize.cpp

[Beignet] [Printf][PATCH 07/11] Implement instruction selection of printf.

2016-01-20 Thread Yan Wang
Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/backend/gen_context.cpp| 3 + backend/src/backend/gen_context.hpp| 1 + .../src/backend/gen_insn_gen7_schedule_info.hxx| 3 +- backend/src/backend/gen_insn_selection.cpp | 116

[Beignet] [Printf][PATCH 08/11] Implement ASM generation of printf.

2016-01-20 Thread Yan Wang
Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/backend/gen8_context.cpp | 36 +++ backend/src/backend/gen8_context.hpp | 1 + backend/src/backend/gen_context.cpp | 70 backend/src/backend/gen_context.hpp | 1 + 4 files

[Beignet] [Printf][PATCH 06/11] Implement emission of printf instruction.

2016-01-20 Thread Yan Wang
Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/ir/context.hpp| 5 ++ backend/src/llvm/llvm_gen_backend.cpp | 89 --- 2 files changed, 78 insertions(+), 16 deletions(-) diff --git a/backend/src/ir/context.hpp b/backend/src/ir

[Beignet] [Printf][PATCH 10/11] Output printf result.

2016-01-20 Thread Yan Wang
Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/ir/printf.cpp | 122 +- backend/src/ir/printf.hpp | 2 +- 2 files changed, 112 insertions(+), 12 deletions(-) diff --git a/backend/src/ir/printf.cpp b/backend/src/ir/printf.cpp index

[Beignet] [Printf][PATCH 09/11] Implement printf buffer management.

2016-01-20 Thread Yan Wang
Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/backend/program.cpp | 10 + backend/src/backend/program.h | 12 +- backend/src/backend/program.hpp | 7 backend/src/ir/printf.cpp | 3 +- backend/src/ir/printf.hpp | 3 +- backend/src/ir/profile.cpp

[Beignet] [Printf][PATCH 05/11] Add LLVM function definition of printf.

2016-01-20 Thread Yan Wang
Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/llvm/llvm_gen_ocl_function.hxx | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/backend/src/llvm/llvm_gen_ocl_function.hxx b/backend/src/llvm/llvm_gen_ocl_function.hxx index e3d89a3..dd7816c 100644 --- a

[Beignet] [Printf][PATCH 04/11] Add the implementation of printf ir instruction.

2016-01-20 Thread Yan Wang
Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/ir/function.hpp| 8 ++ backend/src/ir/instruction.cpp | 57 +- backend/src/ir/instruction.hpp | 13 ++ backend/src/ir/instruction.hxx | 1 + backend/src/ir/register.cpp

[Beignet] [Printf][PATCH 03/11] Reconstruct printf parser.

2016-01-20 Thread Yan Wang
Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/ir/unit.cpp | 1 - backend/src/ir/unit.hpp | 2 +- backend/src/llvm/llvm_gen_backend.cpp | 4 +- backend/src/llvm/llvm_printf_parser.cpp | 112 ++-- 4 files changed

[Beignet] [Printf][PATCH 02/11] Add PrintfLog structure.

2016-01-20 Thread Yan Wang
Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/ir/printf.hpp | 25 + 1 file changed, 25 insertions(+) diff --git a/backend/src/ir/printf.hpp b/backend/src/ir/printf.hpp index def6331..6b2b741 100644 --- a/backend/src/ir/printf.hpp +++ b/backend/src/ir

[Beignet] [Printf][PATCH 01/11] Change printf data structure and remove old code.

2016-01-20 Thread Yan Wang
Contributor: Junyan He Signed-off-by: Yan Wang --- backend/src/backend/program.cpp | 14 -- backend/src/backend/program.hpp | 10 +- backend/src/gbe_bin_interpreter.cpp | 2 - backend/src/ir/printf.cpp | 168 - backend/src/ir/printf.hpp

Re: [Beignet] [PATCH v2] Use CreateCall instead of CreateCall2.

2015-11-19 Thread yan . wang
Thanks. Yan Wang > The llvm function prototype is CreateCall(Value *Callee, ArrayRef<Value *> Args = None, const Twine &Name = "") > Cast from std::initializer_list to ArrayRef<> is not supported on older > llvm version. > Please try: >/*

Re: [Beignet] [PATCH v2] Use CreateCall instead of CreateCall2.

2015-11-19 Thread yan . wang
So should we roll back to v1? That should be safe because it only applies to LLVM >= 3.7. Thanks. Yan Wang > Build fail in LLVM3.5.2. > >> -Original Message- >> From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf >> Of >> Yan Wang >>

[Beignet] [PATCH v2] Use CreateCall instead of CreateCall2.

2015-11-18 Thread Yan Wang
Signed-off-by: Yan Wang --- backend/src/llvm/llvm_profiling.cpp | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/backend/src/llvm/llvm_profiling.cpp b/backend/src/llvm/llvm_profiling.cpp index 8c9157c..3fbd00d 100644 --- a/backend/src/llvm/llvm_profiling.cpp +++ b
