skc7 created this revision.
skc7 added reviewers: sameerds, cdevadas, ronlieb.
Herald added subscribers: mattd, asavonic, ThomasRaoux, jdoerfert, kerbowa, 
kbarton, jvesely, nemanjai.
Herald added a project: All.
skc7 requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added subscribers: llvm-commits, cfe-commits, sstefan1.
Herald added projects: clang, LLVM.

Change https://reviews.llvm.org/D105169 enables the noundef attribute by
default. This causes problems for functions tagged with the convergent
attribute.

For example, the SimplifyCFG pass removes a branch leading to a basic block
that has an incoming value which would always trigger undefined behavior. This
modifies the CFG and merges basic blocks, which works for CPU execution. On a
GPU, however, there are intrinsics such as "__shfl_sync(unsigned mask, T var,
int srcLane, int width=warpSize)", where the exchange of the variable occurs
simultaneously for all active threads within the warp. In a CUDA/HIP kernel,
the variable var passed to __shfl_sync may not be initialized, and LLVM IR
treats it as undef. Currently all arguments are tagged with the noundef
attribute, so the SimplifyCFG optimization described above is applied and the
kernel's behavior becomes ambiguous. The proposed change is to skip adding the
noundef attribute to arguments when a function is tagged with the convergent
attribute.
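
As a rough illustration of the rule proposed above, the following standalone
C++ sketch models the decision only. CalleeInfo and noundefArgFlags are
hypothetical names invented for this example; they are not Clang types or
functions, and the actual logic in clang/lib/CodeGen/CGCall.cpp presumably
takes more conditions into account than the two shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the callee information that call lowering
// would inspect; this is NOT a real Clang type.
struct CalleeInfo {
    bool isConvergent;  // function carries the convergent attribute
    int numArgs;        // number of formal arguments
};

// Returns one flag per argument: true means "would be tagged noundef".
// Models only the rule proposed in this revision: when noundef is enabled
// by default, arguments of convergent functions are left untagged.
std::vector<bool> noundefArgFlags(const CalleeInfo &callee,
                                  bool noundefEnabledByDefault) {
    assert(callee.numArgs >= 0 && "argument count must be non-negative");
    const bool tag = noundefEnabledByDefault && !callee.isConvergent;
    return std::vector<bool>(static_cast<std::size_t>(callee.numArgs), tag);
}
```

Under this model, a convergent callee with four arguments (the shape of
__shfl_sync) gets no noundef tags, while an ordinary function keeps them.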


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D124158

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGen/PowerPC/ppc64le-varargs-f128.c
  clang/test/CodeGenCUDA/address-spaces.cu
  clang/test/CodeGenCUDA/builtins-amdgcn.cu
  clang/test/CodeGenCUDA/cuda-builtin-vars.cu
  clang/test/CodeGenCUDA/kernel-args-alignment.cu
  clang/test/CodeGenCUDA/kernel-args.cu
  clang/test/CodeGenCUDA/redux-builtins.cu
  clang/test/CodeGenCUDA/usual-deallocators.cu
  clang/test/CodeGenCUDA/vtbl.cu
  clang/test/CodeGenCUDASPIRV/kernel-argument.cu
  clang/test/CodeGenOpenCL/addr-space-struct-arg.cl
  clang/test/CodeGenOpenCL/address-spaces.cl
  clang/test/CodeGenOpenCL/amdgcn-automatic-variable.cl
  clang/test/CodeGenOpenCL/amdgpu-abi-struct-coerce.cl
  clang/test/CodeGenOpenCL/amdgpu-call-kernel.cl
  clang/test/CodeGenOpenCL/amdgpu-printf.cl
  clang/test/CodeGenOpenCL/as_type.cl
  clang/test/CodeGenOpenCL/atomic-ops-libcall.cl
  clang/test/CodeGenOpenCL/blocks.cl
  clang/test/CodeGenOpenCL/byval.cl
  clang/test/CodeGenOpenCL/const-str-array-decay.cl
  clang/test/CodeGenOpenCL/constant-addr-space-globals.cl
  clang/test/CodeGenOpenCL/convergent.cl
  clang/test/CodeGenOpenCL/fpmath.cl
  clang/test/CodeGenOpenCL/half.cl
  clang/test/CodeGenOpenCL/kernel-param-alignment.cl
  clang/test/CodeGenOpenCL/kernels-have-spir-cc-by-default.cl
  clang/test/CodeGenOpenCL/no-half.cl
  clang/test/CodeGenOpenCL/overload.cl
  clang/test/CodeGenOpenCL/size_t.cl
  clang/test/CodeGenOpenCL/spir-calling-conv.cl
  clang/test/CodeGenOpenCLCXX/address-space-deduction.clcpp
  clang/test/CodeGenOpenCLCXX/addrspace-of-this.clcpp
  clang/test/CodeGenOpenCLCXX/addrspace-operators.clcpp
  clang/test/CodeGenOpenCLCXX/addrspace-references.clcpp
  clang/test/CodeGenOpenCLCXX/addrspace-with-class.clcpp
  clang/test/CodeGenSYCL/address-space-conversions.cpp
  clang/test/CodeGenSYCL/address-space-mangling.cpp
  clang/test/CodeGenSYCL/functionptr-addrspace.cpp
  clang/test/CodeGenSYCL/unique_stable_name.cpp
  clang/test/OpenMP/amdgcn-attributes.cpp
  clang/test/OpenMP/amdgcn_target_global_constructor.cpp
  clang/test/OpenMP/assumes_include_nvptx.cpp
  clang/test/OpenMP/declare_target_codegen.cpp
  clang/test/OpenMP/declare_target_codegen_globalization.cpp
  clang/test/OpenMP/declare_target_link_codegen.cpp
  clang/test/OpenMP/declare_variant_mixed_codegen.c
  clang/test/OpenMP/distribute_codegen.cpp
  clang/test/OpenMP/distribute_simd_codegen.cpp
  clang/test/OpenMP/nvptx_allocate_codegen.cpp
  clang/test/OpenMP/nvptx_data_sharing.cpp
  clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
  clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_firstprivate_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_num_threads_codegen.cpp
  clang/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp
  clang/test/OpenMP/nvptx_target_printf_codegen.c
  clang/test/OpenMP/nvptx_target_teams_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
  clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/nvptx_teams_codegen.cpp
  clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
  clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
  clang/test/OpenMP/openmp_offload_codegen.cpp
  clang/test/OpenMP/reduction_implicit_map.cpp
  clang/test/OpenMP/target_firstprivate_codegen.cpp
  clang/test/OpenMP/target_parallel_codegen.cpp
  clang/test/OpenMP/target_parallel_debug_codegen.cpp
  clang/test/OpenMP/target_parallel_for_codegen.cpp
  clang/test/OpenMP/target_parallel_for_debug_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_parallel_if_codegen.cpp
  clang/test/OpenMP/target_parallel_num_threads_codegen.cpp
  clang/test/OpenMP/target_private_codegen.cpp
  clang/test/OpenMP/target_reduction_codegen.cpp
  clang/test/OpenMP/target_teams_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_firstprivate_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_private_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_firstprivate_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_private_codegen.cpp
  clang/test/OpenMP/target_teams_distribute_simd_codegen.cpp
  clang/test/OpenMP/target_teams_map_codegen.cpp
  clang/test/OpenMP/target_teams_num_teams_codegen.cpp
  clang/test/OpenMP/target_teams_thread_limit_codegen.cpp
  clang/test/OpenMP/teams_codegen.cpp
  llvm/test/Transforms/SimplifyCFG/tautological-conditional-branch-convergent-noundef.ll

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits