[PATCH] D41677: Change memcpy/memove/memset to have dest and source alignment attributes.
dneilson created this revision.
dneilson added a reviewer: rjmccall.
Herald added subscribers: fedor.sergeev, kbarton, aheejin, sbc100, javed.absar, nhaehnle, nemanjai, jyknight.

Upstream LLVM is changing the prototypes of the @llvm.memcpy/memmove/memset intrinsics. This change updates the Clang CGBuilder and the Clang tests to match.

The @llvm.memcpy/memmove/memset intrinsics currently have an explicit alignment argument, which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignments of the two.

This change allows source and dest to each have their own alignment by using the alignment attribute on their arguments. The alignment argument is removed. For example, code which used to read:

  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)

will now read:

  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

Repository:
  rC Clang

https://reviews.llvm.org/D41677

Files: lib/CodeGen/CGBuilder.h test/CodeGen/2007-11-07-CopyAggregateAlign.c test/CodeGen/2007-11-07-ZeroAggregateAlign.c test/CodeGen/64bit-swiftcall.c test/CodeGen/aarch64-neon-intrinsics.c test/CodeGen/aarch64-neon-ldst-one.c test/CodeGen/aarch64-neon-perm.c test/CodeGen/aarch64-poly64.c test/CodeGen/aarch64-v8.2a-neon-intrinsics.c test/CodeGen/arm-arguments.c test/CodeGen/arm64-be-bitfield.c test/CodeGen/arm_neon_intrinsics.c test/CodeGen/atomic-arm64.c test/CodeGen/block-byref-aggr.c test/CodeGen/builtin-memfns.c test/CodeGen/c11atomics-ios.c test/CodeGen/c11atomics.c test/CodeGen/compound-literal.c test/CodeGen/le32-vaarg.c test/CodeGen/ms-intrinsics.c test/CodeGen/no-opt-volatile-memcpy.c test/CodeGen/packed-nest-unpacked.c test/CodeGen/packed-structure.c test/CodeGen/partial-reinitialization2.c test/CodeGen/ppc-varargs-struct.c test/CodeGen/ppc64-align-struct.c test/CodeGen/ppc64-soft-float.c test/CodeGen/ppc64le-aggregates.c 
test/CodeGen/sparc-vaarg.c test/CodeGen/tbaa-struct.cpp test/CodeGen/volatile.c test/CodeGen/wasm-varargs.c test/CodeGen/windows-swiftcall.c test/CodeGen/x86-atomic-long_double.c test/CodeGen/x86_32-arguments-realign.c test/CodeGen/x86_64-arguments.c test/CodeGen/xcore-abi.c test/CodeGenCXX/alignment.cpp test/CodeGenCXX/assign-construct-memcpy.cpp test/CodeGenCXX/constructor-direct-call.cpp test/CodeGenCXX/copy-constructor-elim.cpp test/CodeGenCXX/copy-constructor-synthesis-2.cpp test/CodeGenCXX/copy-constructor-synthesis.cpp test/CodeGenCXX/cxx0x-delegating-ctors.cpp test/CodeGenCXX/cxx0x-initializer-array.cpp test/CodeGenCXX/cxx11-initializer-array-new.cpp test/CodeGenCXX/cxx1z-lambda-star-this.cpp test/CodeGenCXX/eh.cpp test/CodeGenCXX/float16-declarations.cpp test/CodeGenCXX/microsoft-abi-sret-and-byval.cpp test/CodeGenCXX/microsoft-abi-virtual-inheritance.cpp test/CodeGenCXX/microsoft-uuidof.cpp test/CodeGenCXX/new-array-init.cpp test/CodeGenCXX/no-opt-volatile-memcpy.cpp test/CodeGenCXX/pod-member-memcpys.cpp test/CodeGenCXX/pr20897.cpp test/CodeGenCXX/value-init.cpp test/CodeGenCXX/varargs.cpp test/CodeGenObjC/arc-foreach.m test/CodeGenObjC/arc.m test/CodeGenObjC/builtin-memfns.m test/CodeGenObjC/messages-2.m test/CodeGenObjC/stret-1.m test/CodeGenObjCXX/arc-exceptions.mm test/CodeGenOpenCL/amdgcn-automatic-variable.cl test/CodeGenOpenCL/amdgpu-nullptr.cl test/CodeGenOpenCL/partial_initializer.cl test/CodeGenOpenCL/private-array-initialization.cl test/Modules/templates.mm test/OpenMP/atomic_write_codegen.c test/OpenMP/distribute_firstprivate_codegen.cpp test/OpenMP/distribute_lastprivate_codegen.cpp test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp test/OpenMP/distribute_parallel_for_simd_firstprivate_codegen.cpp test/OpenMP/distribute_parallel_for_simd_lastprivate_codegen.cpp test/OpenMP/distribute_simd_firstprivate_codegen.cpp test/OpenMP/distribute_simd_lastprivate_codegen.cpp 
test/OpenMP/for_firstprivate_codegen.cpp test/OpenMP/for_lastprivate_codegen.cpp test/OpenMP/for_reduction_codegen.cpp test/OpenMP/nvptx_target_firstprivate_codegen.cpp test/OpenMP/ordered_doacross_codegen.cpp test/OpenMP/parallel_codegen.cpp test/OpenMP/parallel_copyin_codegen.cpp test/OpenMP/parallel_firstprivate_codegen.cpp test/OpenMP/parallel_reduction_codegen.cpp test/OpenMP/sections_firstprivate_codegen.cpp test/OpenMP/sections_lastprivate_codegen.cpp test/OpenMP/sections_reduction_codegen.cpp test/OpenMP/single_codegen.cpp test/OpenMP/single_firstprivate_codegen.cpp test/OpenMP/target_enter_data_depend_codegen.cpp test/OpenMP/target_exit_data_depend_codegen.cpp test/OpenMP/target_firstprivate_codegen.cpp test/OpenMP/target_teams_distribute_firstprivate_codegen.cpp test/OpenMP/target_teams_distribute_lastpr
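
A minimal C++ sketch of the difference described in the summary of this revision. It is not the Clang implementation: the helper names `oldStyleMemCpy` and `newStyleMemCpy`, and the fixed operand names `%dest` and `%src`, are hypothetical and exist only to show how one shared alignment argument (the minimum of the two) becomes two independent `align` attributes.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical helper: renders the old call form, where a single trailing
// i32 argument carries min(dest alignment, source alignment).
std::string oldStyleMemCpy(uint64_t destAlign, uint64_t srcAlign,
                           uint64_t size) {
  uint64_t align = std::min(destAlign, srcAlign);
  return "call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dest, i8* %src, i64 " +
         std::to_string(size) + ", i32 " + std::to_string(align) +
         ", i1 false)";
}

// Hypothetical helper: renders the new call form, where each pointer
// argument carries its own align attribute and the i32 argument is gone.
std::string newStyleMemCpy(uint64_t destAlign, uint64_t srcAlign,
                           uint64_t size) {
  return "call void @llvm.memcpy.p0i8.p0i8.i64(i8* align " +
         std::to_string(destAlign) + " %dest, i8* align " +
         std::to_string(srcAlign) + " %src, i64 " + std::to_string(size) +
         ", i1 false)";
}
```

With a 16-byte-aligned destination and a 4-byte-aligned source, the old form can only record an alignment of 4 for both pointers, while the new form keeps 16 for the destination.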
[PATCH] D41677: Change memcpy/memove/memset to have dest and source alignment attributes.
dneilson added a comment.

In https://reviews.llvm.org/D41677#966094, @rjmccall wrote:

> I'm glad to hear that progress is finally happening on this.
>
> The change to CGBuilder looks good to me. I'm going to take your word for it
> that the test changes are all just obvious updates; if there's one in
> particular that you'd like me to look at, I'd be happy to.

Thanks! Yeah, the CGBuilder change is the meat of it. The test changes are all either sed-script changes or manual fixes. For the manual fixes, I made an effort to make sure that if a test wasn't checking for a specific alignment value before this change, then it won't be checking for one after it either (i.e. it matches patterns like align {{[0-9]+}} when the alignment value doesn't matter).

Repository:
  rC Clang

https://reviews.llvm.org/D41677
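
The sed script mentioned in this comment is not shown in the thread. As a rough illustration only, the following C++ sketch performs the same kind of mechanical rewrite with `std::regex`: it moves the trailing `i32` alignment argument onto both pointer arguments as `align` attributes. The function name `rewriteMemCpyCall` and the regular expression are assumptions, and the pattern only handles simple `i8*` operands that contain no commas.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites an old-style memcpy/memmove call line into
// the new form. Operands containing commas (e.g. constant expressions) are
// not handled; such a line is returned unchanged.
std::string rewriteMemCpyCall(const std::string &line) {
  // Groups: 1 = "i8* ", 2 = dest operand, 3 = "i8* ", 4 = source operand,
  //         5 = length argument, 6 = old alignment value, 7 = volatile flag.
  static const std::regex oldForm(
      R"re(\((i8\* )([^,]+), (i8\* )([^,]+), (i\d+ [^,]+), i32 (\d+), (i1 \w+)\))re");
  return std::regex_replace(line, oldForm,
                            "($1align $6 $2, $3align $6 $4, $5, $7)");
}
```

Because the old form carried only one alignment value, a purely textual rewrite like this can only give both pointers the same alignment; checks where the two now differ would be among the manual fixes.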
[PATCH] D41677: Change memcpy/memove/memset to have dest and source alignment attributes.
dneilson updated this revision to Diff 131685.
dneilson added a comment.
Herald added subscribers: niosHD, sabuasal, apazos, jordy.potman.lists, simoncook, johnrusso, rbar, asb.

Rebaseline

Repository:
  rC Clang

https://reviews.llvm.org/D41677

Files: lib/CodeGen/CGBuilder.h test/CodeGen/arm-arguments.c test/CodeGen/arm64-be-bitfield.c test/CodeGen/block-byref-aggr.c test/CodeGen/builtin-memfns.c test/CodeGen/c11atomics-ios.c test/CodeGen/c11atomics.c test/CodeGen/packed-nest-unpacked.c test/CodeGen/ppc64-align-struct.c test/CodeGen/ppc64-soft-float.c test/CodeGen/ppc64le-aggregates.c test/CodeGen/riscv32-abi.c test/CodeGen/riscv64-abi.c test/CodeGen/x86_32-arguments-realign.c test/CodeGen/x86_64-arguments.c test/CodeGenCXX/alignment.cpp test/CodeGenCXX/assign-construct-memcpy.cpp test/CodeGenCXX/copy-constructor-elim.cpp test/CodeGenCXX/eh.cpp test/OpenMP/parallel_reduction_codegen.cpp

Index: test/OpenMP/parallel_reduction_codegen.cpp
===
--- test/OpenMP/parallel_reduction_codegen.cpp
+++ test/OpenMP/parallel_reduction_codegen.cpp
@@ -737,7 +737,7 @@
 // CHECK: [[UP:%.+]] = call dereferenceable(4) [[S_INT_TY]]* @{{.+}}([[S_INT_TY]]* [[VAR_REF]], [[S_INT_TY]]* dereferenceable(4) [[VAR_PRIV]])
 // CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR_REF]] to i8*
 // CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[UP]] to i8*
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
 // var1 = var1.operator &&(var1_reduction);
 // CHECK: [[TO_INT:%.+]] = call i{{[0-9]+}} @{{.+}}([[S_INT_TY]]* [[VAR1_REF]])
@@ -752,7 +752,7 @@
 // CHECK: call void @{{.+}}([[S_INT_TY]]* [[COND_LVALUE:%.+]], i32 [[CONV]])
 // CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR1_REF]] to i8*
 // CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[COND_LVALUE]] to i8*
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
 // t_var1 = min(t_var1, t_var1_reduction);
 // CHECK: [[T_VAR1_VAL:%.+]] = load i{{[0-9]+}}, i{{[0-9]+}}* [[T_VAR1_REF]],
@@ -778,7 +778,7 @@
 // CHECK: [[UP:%.+]] = call dereferenceable(4) [[S_INT_TY]]* @{{.+}}([[S_INT_TY]]* [[VAR_REF]], [[S_INT_TY]]* dereferenceable(4) [[VAR_PRIV]])
 // CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR_REF]] to i8*
 // CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[UP]] to i8*
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
 // CHECK: call void @__kmpc_end_critical(
 // var1 = var1.operator &&(var1_reduction);
@@ -796,7 +796,7 @@
 // CHECK: call void @{{.+}}([[S_INT_TY]]* [[COND_LVALUE:%.+]], i32 [[CONV]])
 // CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR1_REF]] to i8*
 // CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[COND_LVALUE]] to i8*
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
 // CHECK: call void @__kmpc_end_critical(
 // t_var1 = min(t_var1, t_var1_reduction);
@@ -864,7 +864,7 @@
 // CHECK: [[UP:%.+]] = call dereferenceable(4) [[S_INT_TY]]* @{{.+}}([[S_INT_TY]]* [[VAR_LHS]], [[S_INT_TY]]* dereferenceable(4) [[VAR_RHS]])
 // CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR_LHS]] to i8*
 // CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[UP]] to i8*
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
 // var1_lhs = var1_lhs.operator &&(var1_rhs);
 // CHECK: [[TO_INT:%.+]] = call i{{[0-9]+}} @{{.+}}([[S_INT_TY]]* [[VAR1_LHS]])
@@ -880,7 +880,7 @@
 // CHECK: call void @{{.+}}([[S_INT_TY]]* [[COND_LVALUE:%.+]], i32 [[CONV]])
 // CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR1_LHS]] to i8*
 // CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[COND_LVALUE]] to i8*
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
 // t_var1_lhs = min(t_var1_lhs, t_var1_rhs);
 // CHECK: [[T_VAR1_LHS_VAL:%.+]] = load i{{[0-9]+}}, i{{[0-9]+}}* [[T_VAR1_LHS]],
Index: test/CodeGenCXX/eh.cpp
===
--- test/CodeGenCXX/eh.cpp
+++ test/CodeGenCXX/eh.cpp
@@ -13,7 +13,7 @@
 // CHECK: [[EXNOBJ:%.*]] = call i8* @__cxa_allocate_exception(i64 8)
 // CHECK-NEXT: [[EXN:%.*
[PATCH] D41677: Change memcpy/memove/memset to have dest and source alignment attributes.
This revision was automatically updated to reflect the committed changes.
Closed by commit rL323617: Change memcpy/memove/memset to have dest and source alignment attributes. (authored by dneilson, committed by ).
Herald added a subscriber: llvm-commits.

Repository:
  rL LLVM

https://reviews.llvm.org/D41677

Files: cfe/trunk/lib/CodeGen/CGBuilder.h cfe/trunk/test/CodeGen/arm-arguments.c cfe/trunk/test/CodeGen/arm64-be-bitfield.c cfe/trunk/test/CodeGen/block-byref-aggr.c cfe/trunk/test/CodeGen/builtin-memfns.c cfe/trunk/test/CodeGen/c11atomics-ios.c cfe/trunk/test/CodeGen/c11atomics.c cfe/trunk/test/CodeGen/packed-nest-unpacked.c cfe/trunk/test/CodeGen/ppc64-align-struct.c cfe/trunk/test/CodeGen/ppc64-soft-float.c cfe/trunk/test/CodeGen/ppc64le-aggregates.c cfe/trunk/test/CodeGen/riscv32-abi.c cfe/trunk/test/CodeGen/riscv64-abi.c cfe/trunk/test/CodeGen/x86_32-arguments-realign.c cfe/trunk/test/CodeGen/x86_64-arguments.c cfe/trunk/test/CodeGenCXX/alignment.cpp cfe/trunk/test/CodeGenCXX/assign-construct-memcpy.cpp cfe/trunk/test/CodeGenCXX/copy-constructor-elim.cpp cfe/trunk/test/CodeGenCXX/eh.cpp cfe/trunk/test/OpenMP/parallel_reduction_codegen.cpp

Index: cfe/trunk/lib/CodeGen/CGBuilder.h
===
--- cfe/trunk/lib/CodeGen/CGBuilder.h
+++ cfe/trunk/lib/CodeGen/CGBuilder.h
@@ -258,23 +258,23 @@
   using CGBuilderBaseTy::CreateMemCpy;
   llvm::CallInst *CreateMemCpy(Address Dest, Address Src, llvm::Value *Size,
                                bool IsVolatile = false) {
-    auto Align = std::min(Dest.getAlignment(), Src.getAlignment());
-    return CreateMemCpy(Dest.getPointer(), Src.getPointer(), Size,
-                        Align.getQuantity(), IsVolatile);
+    return CreateMemCpy(Dest.getPointer(), Dest.getAlignment().getQuantity(),
+                        Src.getPointer(), Src.getAlignment().getQuantity(),
+                        Size,IsVolatile);
   }
   llvm::CallInst *CreateMemCpy(Address Dest, Address Src, uint64_t Size,
                                bool IsVolatile = false) {
-    auto Align = std::min(Dest.getAlignment(), Src.getAlignment());
-    return CreateMemCpy(Dest.getPointer(), Src.getPointer(), Size,
-                        Align.getQuantity(), IsVolatile);
+    return CreateMemCpy(Dest.getPointer(), Dest.getAlignment().getQuantity(),
+                        Src.getPointer(), Src.getAlignment().getQuantity(),
+                        Size, IsVolatile);
   }
   using CGBuilderBaseTy::CreateMemMove;
   llvm::CallInst *CreateMemMove(Address Dest, Address Src, llvm::Value *Size,
                                 bool IsVolatile = false) {
-    auto Align = std::min(Dest.getAlignment(), Src.getAlignment());
-    return CreateMemMove(Dest.getPointer(), Src.getPointer(), Size,
-                         Align.getQuantity(), IsVolatile);
+    return CreateMemMove(Dest.getPointer(), Dest.getAlignment().getQuantity(),
+                         Src.getPointer(), Src.getAlignment().getQuantity(),
+                         Size, IsVolatile);
   }
   using CGBuilderBaseTy::CreateMemSet;
Index: cfe/trunk/test/CodeGen/ppc64le-aggregates.c
===
--- cfe/trunk/test/CodeGen/ppc64le-aggregates.c
+++ cfe/trunk/test/CodeGen/ppc64le-aggregates.c
@@ -104,7 +104,7 @@
 // CHECK-LABEL: @call_f9
 // CHECK: %[[TMP1:[^ ]+]] = alloca [5 x i64]
 // CHECK: %[[TMP2:[^ ]+]] = bitcast [5 x i64]* %[[TMP1]] to i8*
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %[[TMP2]], i8* align 4 bitcast (%struct.f9* @global_f9 to i8*), i64 36, i1 false)
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %[[TMP2]], i8* align 4 bitcast (%struct.f9* @global_f9 to i8*), i64 36, i1 false)
 // CHECK: %[[TMP3:[^ ]+]] = load [5 x i64], [5 x i64]* %[[TMP1]]
 // CHECK: call void @func_f9(%struct.f9* sret %{{[^ ]+}}, [5 x i64] %[[TMP3]])
 struct f9 global_f9;
Index: cfe/trunk/test/CodeGen/x86_32-arguments-realign.c
===
--- cfe/trunk/test/CodeGen/x86_32-arguments-realign.c
+++ cfe/trunk/test/CodeGen/x86_32-arguments-realign.c
@@ -2,7 +2,7 @@
 // RUN: FileCheck < %t %s
 // CHECK-LABEL: define void @f0(%struct.s0* byval align 4)
-// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %{{.*}}, i8* align 4 %{{.*}}, i32 16, i1 false)
+// CHECK: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 16 %{{.*}}, i8* align 4 %{{.*}}, i32 16, i1 false)
 // CHECK: }
 struct s0 { long double a; };
 void f0(struct s0 a0) {
Index: cfe/trunk/test/CodeGen/arm64-be-bitfield.c
===
--- cfe/trunk/test/CodeGen/arm64-be-bitfield.c
+++ cfe/trunk/test/CodeGen/arm64-be-bitfield.c
@@ -7,6 +7,6 @@
 // IR: callee_b0f(i64 [[ARG:%.*]])
 // IR: store i64 [[ARG]], i64* [[PTR:%.*]], align 8
 // IR: [[BITCAST:%.*]] = bitcast i64* [[PTR]] to i8*
-// IR: call void @