[libclc] [libclc] Refine __clc_fp*_subnormals_supported and __clc_flush_denormal_if_not_supported (PR #157633)

2025-09-09 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/157633 >From 7e2d210d9c6cd20c342562a44c2e4d2cb238e229 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Tue, 9 Sep 2025 11:05:02 +0200 Subject: [PATCH 1/4] [libclc] Refine __clc_fp*_subnormals_supported and __clc_flush_d

[libclc] [libclc] Refine __clc_fp*_subnormals_supported and __clc_flush_denormal_if_not_supported (PR #157633)

2025-09-09 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/157633 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[libclc] [libclc] Refine __clc_fp*_subnormals_supported and __clc_flush_denormal_if_not_supported (PR #157633)

2025-09-09 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/157633 Remove the dependency on the libclc build-time configuration for __clc_fp*_subnormals_supported. The check is now implemented with LLVM intrinsics so it can be resolved during target lowering or at runtime. I

[libclc] [libclc] Refine __clc_fp*_subnormals_supported and __clc_flush_denormal_if_not_supported (PR #157633)

2025-09-09 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/157633 >From 7e2d210d9c6cd20c342562a44c2e4d2cb238e229 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Tue, 9 Sep 2025 11:05:02 +0200 Subject: [PATCH 1/2] [libclc] Refine __clc_fp*_subnormals_supported and __clc_flush_d

[libclc] Revert "[NFC][libclc] Move _CLC_V_V_VP_VECTORIZE macro into clc_lgamma_r.cl and delete clcmacro.h (#156280)" (PR #157002)

2025-09-07 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/157002 >From 8390286ffa32ce98ba39cfbe313d9396ce0572fc Mon Sep 17 00:00:00 2001 From: Wenju He Date: Fri, 5 Sep 2025 04:47:56 +0200 Subject: [PATCH 1/2] Revert "[NFC][libclc] Move _CLC_V_V_VP_VECTORIZE macro into clc_

[libclc] [libclc] Implement erf/erfc vector function with loop since scalar function is large (PR #157055)

2025-09-07 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/157055 This PR reduces amdgcn--amdhsa.bc size by 3% and nvptx64--nvidiacl.bc size by 4%. Loop trip count is constant and backend can decide whether to unroll. >From 84fbdfea1fc1f9d7d61ef388df4d34eb2d0552d0 Mon Sep 17

[libclc] [libclc] Implement erf/erfc vector function with loop since scalar function is large (PR #157055)

2025-09-07 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/157055

[libclc] Revert "[NFC][libclc] Move _CLC_V_V_VP_VECTORIZE macro into clc_lgamma_r.cl and delete clcmacro.h (#156280)" (PR #157002)

2025-09-07 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/157002 This partially reverts commit d50f2ef437aeb1784f7556fd63639487f245ffaa because _CLC_V_V_VP_VECTORIZE is also used in our downstream code: https://github.com/intel/llvm/blob/0433e4d6f5c9/libclc/libspirv/lib/ptx

[libclc] [libclc] Implement erf/erfc vector function with loop since scalar function is large (PR #157055)

2025-09-07 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/157055 >From 84fbdfea1fc1f9d7d61ef388df4d34eb2d0552d0 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Fri, 5 Sep 2025 10:41:01 +0200 Subject: [PATCH 1/3] [libclc] Implement erf/erfc vector function with loop since scal

[libclc] [libclc] Override generic symbol using llvm-link --override flag instead of using weak linkage (PR #156778)

2025-09-05 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/156778

[libclc] [libclc] Implement erf/erfc vector function with loop since scalar function is large (PR #157055)

2025-09-05 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/157055

[libclc] [libclc] Implement erf/erfc vector function with loop since scalar function is large (PR #157055)

2025-09-05 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/157055 >From 84fbdfea1fc1f9d7d61ef388df4d34eb2d0552d0 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Fri, 5 Sep 2025 10:41:01 +0200 Subject: [PATCH 1/2] [libclc] Implement erf/erfc vector function with loop since scal

[libclc] [libclc] Replace _CLC_V_V_VP_VECTORIZE macro with use of unary_def_with_ptr_scalarize.inc (PR #157002)

2025-09-04 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/157002

[libclc] [NFC][libclc] Set MACRO_ARCH to ${ARCH} unconditionally before customizing (PR #156789)

2025-09-03 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/156789 Our downstream libclc adds a few more targets that customize build_flags and opt_flags. In each customization block, MACRO_ARCH is then defined to be ${ARCH}. Hoisting MACRO_ARCH definition out of if-else-end

[libclc] [NFC][libclc] Define _CLC_DEF_WEAK and replace _CLC_DEF_ldexp with it (PR #156378)

2025-09-03 Thread Wenju He via cfe-commits
wenju-he wrote: Applying weak attribute to a small set of libclc functions might be acceptable because libclc will eventually set linkage to linkonce_odr for all functions. The small set of libclc functions, that were not optimized by PostOrderFunctionAttrsPass during libclc build, will be opt

[libclc] [NFC][libclc] Remove unused -DCLC_INTERNAL build flag, remove unused M_LOG210 (PR #156590)

2025-09-03 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/156590 None >From d4943f75762951d23833b89e2d54efe31def7d98 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Tue, 2 Sep 2025 23:22:50 -0700 Subject: [PATCH] [NFC][libclc] Remove unused -DCLC_INTERNAL build flag, remove

[libclc] [NFC][libclc] Define _CLC_DEF_WEAK and replace _CLC_DEF_ldexp with it (PR #156378)

2025-09-03 Thread Wenju He via cfe-commits
wenju-he wrote: Unfortunately weak linkage prevents PostOrderFunctionAttrsPass from deducing attributes for functions that have weak linkage: https://github.com/llvm/llvm-project/blob/7624c6141974f66f24ea90a18a55a111e98baa40/llvm/lib/Transforms/IPO/FunctionAttrs.cpp#L285 where hasExactDefiniti

[libclc] [NFC][libclc] Define _CLC_DEF_WEAK and replace _CLC_DEF_ldexp with it (PR #156378)

2025-09-01 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/156378 _CLC_DEF_WEAK can be used in our downstream libclc to allow overriding generic __clc_tgamma implementation. >From cc51cd1e096794162a5f3c7be9aa160d83ba2547 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Tue, 2

[libclc] [NFC][libclc] Move _CLC_V_V_VP_VECTORIZE macro into clc_lgamma_r.cl and delete clcmacro.h (PR #156280)

2025-08-31 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/156280 clcmacro.h only defines _CLC_V_V_VP_VECTORIZE which is only used in clc/lib/generic/math/clc_lgamma_r.cl. >From 4125c7faf2d70b4059da9f56d29024e359307513 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Mon, 1 Se

[libclc] [libclc] update __clc_mem_fence: add MemorySemantic arg and use __builtin_amdgcn_fence for AMDGPU (PR #152275)

2025-08-31 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/152275

[libclc] [libclc] update __clc_mem_fence: add MemorySemantic arg and use __builtin_amdgcn_fence for AMDGPU (PR #152275)

2025-08-31 Thread Wenju He via cfe-commits
wenju-he wrote: kindly ping @arsenm @frasercrmck to review the last commit. It is a renaming commit. https://github.com/llvm/llvm-project/pull/152275

[libclc] [libclc] Only create a target per each compile command for cmake MSVC generator (PR #154479)

2025-08-21 Thread Wenju He via cfe-commits
@@ -132,6 +124,33 @@ function(link_bc) ) endfunction() +# Create a custom target for each bitcode file, which is the output of a custom +# command. This is required for parallel compilation of the custom commands that +# generate the bitcode files when using the CMake MSVC

[libclc] [libclc] Only create a target per each compile command for cmake MSVC generator (PR #154479)

2025-08-21 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/154479 >From 75b5f46b2858b399482082eabd1388167aaa58e4 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Wed, 20 Aug 2025 07:57:54 +0200 Subject: [PATCH 1/4] [libclc] Only create a target per each compile command for cmak

[libclc] [libclc] Only create a target per each compile command for cmake MSVC generator (PR #154479)

2025-08-20 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/154479

[libclc] [libclc] Only create a target per each compile command for cmake MSVC generator (PR #154479)

2025-08-20 Thread Wenju He via cfe-commits
@@ -132,6 +124,33 @@ function(link_bc) ) endfunction() +# Create a custom target for each bitcode file, which is the output of a custom +# command. This is required for parallel compilation of the custom commands that +# generate the bitcode files when using the CMake MSVC

[libclc] [libclc] Only create a target per each compile command for cmake MSVC generator (PR #154479)

2025-08-20 Thread Wenju He via cfe-commits
@@ -335,19 +348,28 @@ function(add_libclc_builtin_set) endif() endforeach() + set( builtins_comp_lib_tgt builtins.comp.${ARG_ARCH_SUFFIX} ) + if ( CMAKE_GENERATOR MATCHES "Visual Studio" ) wenju-he wrote: > Perhaps a comment above this to explain what

[libclc] [libclc] Only create a target per each compile command for cmake MSVC generator (PR #154479)

2025-08-20 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/154479 >From 75b5f46b2858b399482082eabd1388167aaa58e4 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Wed, 20 Aug 2025 07:57:54 +0200 Subject: [PATCH 1/3] [libclc] Only create a target per each compile command for cmak

[libclc] [libclc] Only create a target per each compile command for cmake MSVC generator (PR #154479)

2025-08-20 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/154479

[libclc] [libclc] Only create a target per each compile command for cmake MSVC generator (PR #154479)

2025-08-19 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/154479

[libclc] [libclc] Only create a target per each compile command for cmake MSVC generator (PR #154479)

2025-08-19 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/154479 libclc sequential build issue addressed in commit 0c21d6b4c8ad is specific to cmake MSVC generator. Therefore, this PR avoids creating a large number of targets when a non-MSVC generator is used, such as the N

[libclc] [libclc] Implement __clc_get_local_size/__clc_get_max_sub_group_size for amdgcn (PR #153785)

2025-08-18 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/153785

[libclc] [libclc] Enable DEPENDS_EXPLICIT_ONLY if cmake version >= 3.27 (PR #154084)

2025-08-18 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/154084 Commit 0c21d6b4c8ad fixed sequential build of libclc on Windows by adding a target for each compile command. This PR conditionally enables DEPENDS_EXPLICIT_ONLY and requires at least cmake version 3.27. DEPEND

[libclc] [NFC][libclc] add missing __CLC_ prefix all internal macros (PR #153523)

2025-08-17 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/153523

[libclc] [libclc] Implement __clc_get_local_size/__clc_get_max_sub_group_size for amdgcn (PR #153785)

2025-08-17 Thread Wenju He via cfe-commits
wenju-he wrote: https://github.com/llvm/llvm-project/pull/153785/commits/a8fcf405007d05d123f397f66b4139d53586c1b5 is a minor fix to return 1 for an out-of-bound dim. https://github.com/llvm/llvm-project/pull/153785
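The out-of-bound rule mentioned here follows the OpenCL C specification: for a dimension index outside 0 .. get_work_dim() - 1, size-like queries such as get_local_size return 1 and id-like queries such as get_local_id return 0. A minimal C sketch of that rule follows; the launch_sketch struct and both function names are hypothetical and only model the behaviour, they are not the libclc implementation.

```c
#include <stddef.h>

/* Hypothetical stand-in for the state a work-item can query. */
typedef struct {
  unsigned work_dim;    /* number of dimensions in use, 1 to 3 */
  size_t local_size[3]; /* work-group size per dimension */
  size_t local_id[3];   /* this work-item's id within its group */
} launch_sketch;

/* Size-like query: an out-of-range dimension behaves like a size of 1. */
size_t get_local_size_sketch(const launch_sketch *l, unsigned dim) {
  return dim < l->work_dim ? l->local_size[dim] : 1;
}

/* Id-like query: an out-of-range dimension behaves like an id of 0. */
size_t get_local_id_sketch(const launch_sketch *l, unsigned dim) {
  return dim < l->work_dim ? l->local_id[dim] : 0;
}
```

Returning 1 for sizes and 0 for ids keeps products of sizes and linearized ids well defined even when a kernel queries a dimension it does not use.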

[libclc] [libclc] Implement __clc_get_local_size/__clc_get_max_sub_group_size for amdgcn (PR #153785)

2025-08-17 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/153785 >From d0a9e7fa683d294aaabf24ccc34cea54a8a5eb1f Mon Sep 17 00:00:00 2001 From: Wenju He Date: Fri, 15 Aug 2025 12:43:54 +0200 Subject: [PATCH 1/2] [libclc] Implement __clc_get_local_size/__clc_get_max_sub_group

[libclc] [libclc] Fix out-of-bound value for workitem functions according to OpenCL spec (PR #153784)

2025-08-17 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/153784

[libclc] [libclc] Implement __clc_get_local_size/__clc_get_max_sub_group_size for amdgcn (PR #153785)

2025-08-15 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/153785 This simplifies downstream refactoring of libspirv workitem function in https://github.com/intel/llvm/tree/sycl/libclc/libspirv/lib/generic >From d0a9e7fa683d294aaabf24ccc34cea54a8a5eb1f Mon Sep 17 00:00:00 20

[libclc] [libclc] Fix out-of-bound value for workitem functions according to OpenCL spec (PR #153784)

2025-08-15 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/153784 None >From c51ac2969099be4831c3d296fffec9e6f4fce780 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Fri, 15 Aug 2025 12:37:00 +0200 Subject: [PATCH] [libclc] Fix out-of-bound value for workitem functions accord

[libclc] [libclc] Use __ocml_cos/sin/tan/exp*/lgamma/log*/fmax/fmin/sqrt for AMDGPU (PR #153328)

2025-08-13 Thread Wenju He via cfe-commits
wenju-he wrote: > If you want to use these implementations, I'd rather merge the OCML content > into libclc and migrate over to using it. Tag @frasercrmck The amdgcn implementation probably needs improvement to use llvm elementwise builtin for e.g. half/float exp* https://github.com/llvm/llv

[libclc] [libclc] Enable -ffp-contract=fast compile option for math native_* functions (PR #153137)

2025-08-11 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/153137 According to OpenCL spec, native_* functions have implementation-defined accuracy and typically have better performance. We can enable floating-point contraction optimizations for them. >From 719a6914321afc0

[libclc] [libclc] update __clc_mem_fence: add MemorySemantic arg and use __builtin_amdgcn_fence for AMDGPU (PR #152275)

2025-08-11 Thread Wenju He via cfe-commits
@@ -0,0 +1,21 @@ +//===--===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apac

[libclc] [libclc] update __clc_mem_fence: add MemorySemantic arg and use __builtin_amdgcn_fence for AMDGPU (PR #152275)

2025-08-11 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/152275 >From c48a94749e7e4ee261895826f2df2e2c48f040ef Mon Sep 17 00:00:00 2001 From: Wenju He Date: Wed, 6 Aug 2025 11:07:15 +0200 Subject: [PATCH 1/3] [libclc] update __clc_mem_fence: add MemorySemantic arg and use

[libclc] [libclc] Fix libclc install on Windows when MSVC generator is used (PR #152703)

2025-08-11 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/152703

[libclc] [libclc] Fix libclc install on Windows when MSVC generator is used (PR #152703)

2025-08-10 Thread Wenju He via cfe-commits
wenju-he wrote: > LGTM if you've tested it and it works 👍 Yes, the fix is verified both locally and in our internal CI. I also fixed alias install in https://github.com/llvm/llvm-project/pull/152703/commits/b6cbefcbc06e42d7107723cb5b37749f3b1e0931. @frasercrmck please review this commit again

[libclc] [libclc] Fix libclc install on Windows when MSVC generator is used (PR #152703)

2025-08-10 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/152703 >From be71d635d2de980797be595c4f35f307c703bc96 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Fri, 8 Aug 2025 05:59:54 -0700 Subject: [PATCH 1/3] [libclc] Fix libclc install on Windows when MSVC generator is us

[libclc] [libclc] Implement clc_log/sinpi/sqrt with __nv_* functions (PR #150174)

2025-08-10 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/150174

[clang] [libclc] [clang] Add the ability to link libclc OpenCL libraries (PR #146503)

2025-08-08 Thread Wenju He via cfe-commits
@@ -92,10 +95,14 @@ else() get_host_tool_path( llvm-link LLVM_LINK llvm-link_exe llvm-link_target ) get_host_tool_path( opt OPT opt_exe opt_target ) endif() -endif() -# Setup the paths where libclc runtimes should be stored. -set( LIBCLC_OUTPUT_LIBRARY_DIR ${CMAKE_C

[libclc] [libclc] Fix libclc install on Windows when MSVC generator is used (PR #152703)

2025-08-08 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/152703 >From be71d635d2de980797be595c4f35f307c703bc96 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Fri, 8 Aug 2025 05:59:54 -0700 Subject: [PATCH 1/2] [libclc] Fix libclc install on Windows when MSVC generator is us

[libclc] [libclc] Fix libclc install on Windows when MSVC generator is used (PR #152703)

2025-08-08 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/152703

[libclc] [libclc] Fix libclc install on Windows when MSVC generator is used (PR #152703)

2025-08-08 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/152703 >From be71d635d2de980797be595c4f35f307c703bc96 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Fri, 8 Aug 2025 05:59:54 -0700 Subject: [PATCH] [libclc] Fix libclc install on Windows when MSVC generator is used

[libclc] [libclc] Fix libclc bitcodes install on windows when cmake msvc generator is used (PR #152666)

2025-08-08 Thread Wenju He via cfe-commits
wenju-he wrote: Sorry, this is not working; closing the PR. https://github.com/llvm/llvm-project/pull/152666

[libclc] [libclc] Fix libclc bitcodes install on windows when cmake msvc generator is used (PR #152666)

2025-08-08 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/152666

[clang] [libclc] [clang] Add the ability to link libclc OpenCL libraries (PR #146503)

2025-08-08 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/146503

[clang] [libclc] [clang] Add the ability to link libclc OpenCL libraries (PR #146503)

2025-08-08 Thread Wenju He via cfe-commits
@@ -92,10 +95,14 @@ else() get_host_tool_path( llvm-link LLVM_LINK llvm-link_exe llvm-link_target ) get_host_tool_path( opt OPT opt_exe opt_target ) endif() -endif() -# Setup the paths where libclc runtimes should be stored. -set( LIBCLC_OUTPUT_LIBRARY_DIR ${CMAKE_C

[libclc] [libclc] update __clc_mem_fence: add MemorySemantic arg and use __builtin_amdgcn_fence for AMDGPU (PR #152275)

2025-08-07 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/152275

[libclc] [libclc] update __clc_mem_fence: add MemorySemantic arg and use __builtin_amdgcn_fence for AMDGPU (PR #152275)

2025-08-07 Thread Wenju He via cfe-commits
@@ -0,0 +1,21 @@ +//===--===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apac

[libclc] [NFC][libclc] Delete unused clc/shared/binary_decl_with_scalar_second_arg.inc (PR #152463)

2025-08-07 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/152463

[libclc] [NFC][libclc] Delete unused clc/shared/binary_decl_with_scalar_second_arg.inc (PR #152463)

2025-08-07 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/152463 None >From 10c997f9de3c2a99b2f7bd507523be901d7c6ee7 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Thu, 7 Aug 2025 00:47:50 -0700 Subject: [PATCH] [NFC][libclc] Delete unused clc/shared/binary_decl_with_scalar

[libclc] [libclc] Add __attribute__((const)) to functions that don't access memory (PR #152456)

2025-08-07 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/152456 Before this PR, PostOrderFunctionAttrsPass in opt run can deduce memory(none) for these functions. This PR explicitly adds the attribute to align with Clang's OpenCL headers and ensures the attribute is prese

[libclc] [libclc] Add missing clc/lib/ptx-nvidiacl/SOURCES to CMAKE_CONFIGURE_DEPENDS (PR #152431)

2025-08-07 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/152431

[libclc] [libclc] Implement __clc_rsqrt with __ocml_rsqrt_* functions (PR #152436)

2025-08-06 Thread Wenju He via cfe-commits
wenju-he wrote: @arsenm I'm not sure if the change is an improvement, please review. https://github.com/llvm/llvm-project/pull/152436

[libclc] [libclc] Implement __clc_rsqrt with __ocml_rsqrt_* functions (PR #152436)

2025-08-06 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/152436 Motivation is to upstream use of __ocml_rsqrt_ in https://github.com/intel/llvm/blob/sycl/libclc/libspirv/lib/amdgcn-amdhsa/math/rsqrt.cl llvm-diff shows vectorized calls of llvm.sqrt.v2f32 and fdiv are scalari

[libclc] [libclc] Add missing clc/lib/ptx-nvidiacl/SOURCES to CMAKE_CONFIGURE_DEPENDS (PR #152431)

2025-08-06 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/152431 None >From 73299b736c61a2042aeadf46e62a93e11ca5a890 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Thu, 7 Aug 2025 05:03:56 +0200 Subject: [PATCH] [libclc] Add missing clc/lib/ptx-nvidiacl/SOURCES to CMAKE_CON

[libclc] [libclc] Move mem_fence and barrier to clc library (PR #151446)

2025-08-06 Thread Wenju He via cfe-commits
@@ -0,0 +1,37 @@ +//===--===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apac

[libclc] [libclc] update __clc_mem_fence: add MemorySemantic arg and use __builtin_amdgcn_fence for AMDGPU (PR #152275)

2025-08-06 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/152275 It is necessary to add a MemorySemantic argument, which specifies the memory or address space to which the memory ordering is applied. The MemorySemantic is also necessary for implementing the SPIR-V MemoryBarrier i

[libclc] [libclc] Set TARGET_FILE property for prepare-${obj_suffix} target (PR #152245)

2025-08-05 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/152245 The target's output bitcode `libclc_builtins_lib` is located in a sub-directory in clang resource directory since df7473673214. Setting TARGET_FILE property can allow targets in non-libclc project to obtain th

[libclc] [libclc] Move mem_fence and barrier to clc library (PR #151446)

2025-08-05 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/151446

[libclc] [libclc] Move mem_fence and barrier to clc library (PR #151446)

2025-08-05 Thread Wenju He via cfe-commits
@@ -0,0 +1,37 @@ +//===--===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apac

[libclc] [libclc] Move mem_fence and barrier to clc library (PR #151446)

2025-08-05 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/151446

[libclc] [libclc] Move mem_fence and barrier to clc library (PR #151446)

2025-08-05 Thread Wenju He via cfe-commits
@@ -0,0 +1,36 @@ +//===--===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apac

[libclc] [libclc] Move mem_fence and barrier to clc library (PR #151446)

2025-08-05 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/151446 >From eed56d228c0613f563c23f9be23d681ef3d87f2b Mon Sep 17 00:00:00 2001 From: Wenju He Date: Thu, 31 Jul 2025 05:07:23 +0200 Subject: [PATCH 1/3] [libclc] Move mem_fence and barrier to clc library __clc_mem_fe

[libclc] [libclc] Refine id in async_work_group_copy STRIDED_COPY (PR #151644)

2025-07-31 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/151644 Move id first along 0th dimension to achieve coalesced memory access when stride is 1. >From 1fe808b52e11dfe569c489a9dc8f1cdd3fa87afc Mon Sep 17 00:00:00 2001 From: Wenju He Date: Fri, 1 Aug 2025 07:45:50 +02

[libclc] [libclc] Move mem_fence and barrier to clc library (PR #151446)

2025-07-30 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/151446 >From eed56d228c0613f563c23f9be23d681ef3d87f2b Mon Sep 17 00:00:00 2001 From: Wenju He Date: Thu, 31 Jul 2025 05:07:23 +0200 Subject: [PATCH 1/2] [libclc] Move mem_fence and barrier to clc library __clc_mem_fe

[libclc] [libclc] Optimize generic CLC fmin/fmax (PR #128506)

2025-07-29 Thread Wenju He via cfe-commits
https://github.com/wenju-he approved this pull request. LGTM. I think llvm-spirv should be fixed, so that we can also use __builtin_elementwise_max/minimumnum for the target. https://github.com/llvm/llvm-project/pull/128506

[libclc] [libclc] Optimize generic CLC fmin/fmax (PR #128506)

2025-07-29 Thread Wenju He via cfe-commits
@@ -43,8 +48,10 @@ _CLC_DEF _CLC_OVERLOAD half __clc_fmin(half x, half y) { return (y < x) ? y : x; wenju-he wrote: >I wonder if we in fact want to have `half` use `__builtin_fminf16`? We can >simplify the definitions if all types are using a builtin. What do

[libclc] [libclc] Fix building top-level 'libclc' target (PR #150972)

2025-07-29 Thread Wenju He via cfe-commits
https://github.com/wenju-he approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/150972

[libclc] [libclc] Optimize generic CLC fmin/fmax (PR #128506)

2025-07-28 Thread Wenju He via cfe-commits
@@ -43,8 +48,10 @@ _CLC_DEF _CLC_OVERLOAD half __clc_fmin(half x, half y) { return (y < x) ? y : x; wenju-he wrote: can we use this same implementation for float and double? https://github.com/llvm/llvm-project/pull/128506

[libclc] [libclc] Fix building top-level 'libclc' target (PR #150972)

2025-07-28 Thread Wenju He via cfe-commits
@@ -5,6 +5,9 @@ if(CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR) endif() set(LLVM_SUBPROJECT_TITLE "libclc") +# Top level target used to build all Libclc libraries. +add_custom_target( libclc ALL ) wenju-he wrote: can we put this line near line 48~49, o

[libclc] [libclc] Simplify unary_def_scalarize.inc's use in __clc_erf/erfc/tgamma (PR #150181)

2025-07-23 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/150181 Also delete unary_def_via_fp32.inc. There are small changes in amdgcn--amdhsa.bc because vector conversion is scalarized, e.g. %2 = fpext <4 x half> %0 to <4 x float> %3 = extractelement <4 x float> %2, i64

[libclc] [libclc] Implement clc_log/sinpi/sqrt with __nv_* functions (PR #150174)

2025-07-22 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/150174 This is to upstream implementations in https://github.com/intel/llvm/tree/sycl/libclc/clc/lib/ptx-nvidiacl/math >From d4fcaf56d63efe283240bfc582706e268e30c854 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Wed

[libclc] [libclc] Add generic native half implementation of __clc_normalize (PR #150165)

2025-07-22 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/150165

[libclc] [libclc] Add native half implementation of __clc_normalize (PR #150165)

2025-07-22 Thread Wenju He via cfe-commits
wenju-he wrote: llvm-diff amdgcn--amdhsa.bc.new amdgcn--amdhsa.bc.old ``` in function _Z9normalizeDh: in block %1 / %1: > %2 = fpext half %0 to float %2 = fcmp one half %0, 0xH > %4 = select i1 %3, float 1.00e+00, float 0.00e+00 > %5 = tail call noundef flo

[libclc] [libclc] Add native half implementation of __clc_normalize (PR #150165)

2025-07-22 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/150165 This is ported from https://github.com/intel/llvm/blob/sycl/libclc/libspirv/lib/generic/geometric/normalize.cl and can pass a closed-source OpenCL CTS "test_geometrics geom_normalize --half CL_DEVICE_TYPE_GPU"

[libclc] [libclc] Optimize generic CLC fmin/fmax (PR #128506)

2025-07-22 Thread Wenju He via cfe-commits
wenju-he wrote: > If the decision is the conformance test continues doing what it has been > doing, it should directly map to llvm.minimumnum/maximumnum. For now, @frasercrmck can we update this PR to use __builtin_elementwise_maximumnum/minimumnum so that OpenCL CTS can pass? https://github.

[clang] [Clang] Add elementwise maximumnum/minimumnum builtin functions (PR #149775)

2025-07-21 Thread Wenju He via cfe-commits
@@ -4108,6 +4108,22 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, return RValue::get(Result); } + case Builtin::BI__builtin_elementwise_maximumnum: { +Value *Op0 = EmitScalarExpr(E->getArg(0)); wenju-he wrote:

[clang] [Clang] Add elementwise maximumnum/minimumnum builtin functions (PR #149775)

2025-07-21 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/149775 Addresses https://github.com/llvm/llvm-project/issues/112164. minimumnum and maximumnum intrinsics were added in 5bf81e53dbea. The new built-ins can be used for implementing OpenCL math function fmax and fmin
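
The NaN and signed-zero behaviour that makes minimumnum/maximumnum a fit for OpenCL fmin/fmax can be sketched in plain C. This is only an illustration of the semantics as commonly described for these operations (a NaN operand is ignored in favour of the number, and -0.0 orders before +0.0); the names `sketch_minimumnum` and `sketch_maximumnum` are hypothetical and are not the builtins added by the PR.

```c
#include <math.h>

/* Hypothetical helper: smaller of x and y, preferring a number over NaN. */
static float sketch_minimumnum(float x, float y) {
  if (isnan(x)) return y;                /* also yields NaN when both are NaN */
  if (isnan(y)) return x;
  if (x == y) return signbit(x) ? x : y; /* -0.0 is treated as less than +0.0 */
  return x < y ? x : y;
}

/* Hypothetical helper: larger of x and y, preferring a number over NaN. */
static float sketch_maximumnum(float x, float y) {
  if (isnan(x)) return y;
  if (isnan(y)) return x;
  if (x == y) return signbit(x) ? y : x; /* +0.0 is treated as greater than -0.0 */
  return x > y ? x : y;
}
```

A plain `(y < x) ? y : x` select differs from this sketch only when an operand is NaN or when the operands are zeros of opposite sign.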

[libclc] [libclc] Fix installed symlinks to be relative again (PR #149728)

2025-07-20 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/149728

[libclc] [libclc] Fix installed symlinks to be relative again (PR #149728)

2025-07-20 Thread Wenju He via cfe-commits
@@ -425,17 +425,21 @@ function(add_libclc_builtin_set) WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR} ) endif() - if(CMAKE_HOST_UNIX OR LLVM_USE_SYMLINKS) -set(LIBCLC_LINK_OR_COPY create_symlink) - else() -set(LIBCLC_LINK_OR_COPY copy) - endif() - foreach(

[libclc] [libclc] Fix installed symlinks to be relative again (PR #149728)

2025-07-20 Thread Wenju He via cfe-commits
@@ -425,17 +425,21 @@ function(add_libclc_builtin_set) WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR} ) endif() - if(CMAKE_HOST_UNIX OR LLVM_USE_SYMLINKS) -set(LIBCLC_LINK_OR_COPY create_symlink) - else() -set(LIBCLC_LINK_OR_COPY copy) - endif() - foreach(

[libclc] [libclc] Enable `clang fp reciprocal` in clc_native_divide/recip/rsqrt/tan (PR #149269)

2025-07-17 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/149269

[libclc] [libclc] Enable `clang fp reciprocal` in clc_native_divide/recip/rsqrt/tan (PR #149269)

2025-07-17 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/149269 >From 9ac644cb8ed43ac28e0ee715f0a0e6bed4df470a Mon Sep 17 00:00:00 2001 From: Wenju He Date: Thu, 17 Jul 2025 10:02:20 +0200 Subject: [PATCH 1/2] [libclc] Enable `clang fp reciprocal` in clc_native_divide/reci

[libclc] [libclc] Enable `clang fp reciprocal` in clc_native_divide/recip/rsqrt/tan (PR #149269)

2025-07-17 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/149269 The pragma adds the `arcp` flag to the `fdiv` instructions in these functions, which can improve performance. >From 9ac644cb8ed43ac28e0ee715f0a0e6bed4df470a Mon Sep 17 00:00:00 2001 From: Wenju He Date: Thu,

[libclc] [NFC][libclc] Delete clc/include/clc/relational/floatn.inc (PR #149252)

2025-07-16 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/149252 llvm-diff shows no change to amdgcn--amdhsa.bc. >From 91827fa45fbf45936e57241b0bb0c1a215112834 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Thu, 17 Jul 2025 07:27:50 +0200 Subject: [PATCH] [NFC][libclc] Delet

[libclc] [libclc] Add generic implementation of bitfield_insert/extract,bit_reverse (PR #149070)

2025-07-16 Thread Wenju He via cfe-commits
https://github.com/wenju-he updated https://github.com/llvm/llvm-project/pull/149070 >From 9f8b12e6cf600cd05bab586e3d521e5354789e12 Mon Sep 17 00:00:00 2001 From: Wenju He Date: Wed, 16 Jul 2025 12:44:48 +0200 Subject: [PATCH 1/3] [libclc] Add generic implementation of bitfield_insert/extract,

[clang] [SPIR] Set MaxAtomicInlineWidth minimum size to 32 for spir32 and 64 for spir64 (PR #148997)

2025-07-16 Thread Wenju He via cfe-commits
https://github.com/wenju-he closed https://github.com/llvm/llvm-project/pull/148997

[clang] [SPIR] Set MaxAtomicInlineWidth minimum size to 32 for spir32 and 64 for spir64 (PR #148997)

2025-07-16 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/148997

[clang] [SPIR] Set MaxAtomicInlineWidth minimum size to 32 for spir32 and 64 for spir64 (PR #148997)

2025-07-16 Thread Wenju He via cfe-commits
https://github.com/wenju-he edited https://github.com/llvm/llvm-project/pull/148997

[libclc] [libclc] Add generic implementation of bitfield_insert/extract,bit_reverse (PR #149070)

2025-07-16 Thread Wenju He via cfe-commits
https://github.com/wenju-he created https://github.com/llvm/llvm-project/pull/149070 The implementation is based on the reference implementation in OpenCL-CTS/test_integer_ops. The generic implementations pass the OpenCL-CTS/test_integer_ops tests on Intel GPU. >From 9f8b12e6cf600cd05bab586e3d521e535
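
For readers unfamiliar with these operations, a minimal C sketch of what bitfield insert, bitfield extract and bit reverse compute on 32-bit values follows. It is not the code from the PR or from OpenCL-CTS; the `sketch_*` names are hypothetical, and it assumes `offset + count <= 32`.

```c
#include <stdint.h>

/* Mask with the low `count` bits set, for count in [0, 32]. */
static uint32_t sketch_low_mask(uint32_t count) {
  return count >= 32 ? UINT32_MAX : ((UINT32_C(1) << count) - 1u);
}

/* Replace bits [offset, offset + count) of base with the low bits of insert. */
static uint32_t sketch_bitfield_insert(uint32_t base, uint32_t insert,
                                       uint32_t offset, uint32_t count) {
  if (count == 0) return base;           /* nothing to insert */
  uint32_t mask = sketch_low_mask(count) << offset;
  return (base & ~mask) | ((insert << offset) & mask);
}

/* Extract bits [offset, offset + count) of base, zero-extended. */
static uint32_t sketch_bitfield_extract_unsigned(uint32_t base, uint32_t offset,
                                                 uint32_t count) {
  if (count == 0) return 0;
  return (base >> offset) & sketch_low_mask(count);
}

/* Extract bits [offset, offset + count) of base, sign-extended. */
static int32_t sketch_bitfield_extract_signed(uint32_t base, uint32_t offset,
                                              uint32_t count) {
  if (count == 0) return 0;
  uint32_t field = sketch_bitfield_extract_unsigned(base, offset, count);
  uint32_t sign = UINT32_C(1) << (count - 1); /* top bit of the field */
  return (int32_t)((field ^ sign) - sign);    /* two's-complement sign extension */
}

/* Reverse the order of all 32 bits. */
static uint32_t sketch_bit_reverse(uint32_t x) {
  uint32_t r = 0;
  for (int i = 0; i < 32; ++i) {
    r = (r << 1) | (x & 1u);
    x >>= 1;
  }
  return r;
}
```

The early returns for `count == 0` keep every shift amount below 32, which avoids undefined behaviour in C.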

[libclc] [libclc] Enable -fdiscard-value-names build flag to reduce bitcode size (PR #149016)

2025-07-16 Thread Wenju He via cfe-commits
wenju-he wrote: > I think this could optionally do with a comment before explaining what this > flag is helping to achieve, but it's not a blocker. It is explained at https://clang.llvm.org/docs/UsersManual.html#id72, and reducing size and verbosity is what this flag is intended for. I mean the

[libclc] [libclc] Move CMake for prepare_builtins to a subdirectory (PR #148815)

2025-07-15 Thread Wenju He via cfe-commits
https://github.com/wenju-he approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/148815
