[PATCH] D69250: [ARM][AArch64] Implement __cls and __clsl intrinsics from ACLE
vhscampos created this revision.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls.
Herald added projects: clang, LLVM.

Writing support for two ACLE functions:

  unsigned int __cls(uint32_t x)
  unsigned int __clsl(unsigned long x)

CLS stands for "Count number of leading sign bits".

In AArch64, these two intrinsics can be translated into the 'cls' instruction directly. In AArch32, on the other hand, this functionality is achieved by implementing it in terms of clz (count number of leading zeros).


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D69250

Files:
  clang/include/clang/Basic/BuiltinsAArch64.def
  clang/include/clang/Basic/BuiltinsARM.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/arm_acle.h
  clang/test/CodeGen/builtins-arm.c
  clang/test/CodeGen/builtins-arm64.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/ARM/ARMISelLowering.cpp
  llvm/test/CodeGen/AArch64/cls.ll
  llvm/test/CodeGen/ARM/cls.ll

Index: llvm/test/CodeGen/ARM/cls.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/cls.ll
@@ -0,0 +1,12 @@
+; RUN: llc -mtriple=armv5 %s -o - | FileCheck %s
+
+; CHECK: eor [[T:r[0-9]+]], [[T]], [[T]], asr #31
+; CHECK-NEXT: mov [[C1:r[0-9]+]], #1
+; CHECK-NEXT: orr [[T]], [[C1]], [[T]], lsl #1
+; CHECK-NEXT: clz [[T]], [[T]]
+define i32 @cls(i32 %t) {
+  %cls.i = call i32 @llvm.arm.cls(i32 %t)
+  ret i32 %cls.i
+}
+
+declare i32 @llvm.arm.cls(i32) nounwind
Index: llvm/test/CodeGen/AArch64/cls.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/cls.ll
@@ -0,0 +1,20 @@
+; RUN: llc -mtriple=aarch64-eabi %s -o - | FileCheck %s
+
+; @llvm.aarch64.cls must be directly translated into the 'cls' instruction
+
+; CHECK-LABEL: cls
+; CHECK: cls [[REG:w[0-9]+]], [[REG]]
+define i32 @cls(i32 %t) {
+  %cls.i = call i32 @llvm.aarch64.cls(i32 %t)
+  ret i32 %cls.i
+}
+
+; CHECK-LABEL: cls64
+; CHECK: cls [[REG:x[0-9]+]], [[REG]]
+define i32 @cls64(i64 %t) {
+  %cls.i = call i32 @llvm.aarch64.cls64(i64 %t)
+  ret i32 %cls.i
+}
+
+declare i32 @llvm.aarch64.cls(i32) nounwind
+declare i32 @llvm.aarch64.cls64(i64) nounwind
Index: llvm/lib/Target/ARM/ARMISelLowering.cpp
===
--- llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -3625,6 +3625,19 @@
     EVT PtrVT = getPointerTy(DAG.getDataLayout());
     return DAG.getNode(ARMISD::THREAD_POINTER, dl, PtrVT);
   }
+  case Intrinsic::arm_cls: {
+    const SDValue &Operand = Op.getOperand(1);
+    const EVT VTy = Op.getValueType();
+    SDValue SRA =
+        DAG.getNode(ISD::SRA, dl, VTy, Operand, DAG.getConstant(31, dl, VTy));
+    SDValue XOR = DAG.getNode(ISD::XOR, dl, VTy, SRA, Operand);
+    SDValue SHL =
+        DAG.getNode(ISD::SHL, dl, VTy, XOR, DAG.getConstant(1, dl, VTy));
+    SDValue OR =
+        DAG.getNode(ISD::OR, dl, VTy, SHL, DAG.getConstant(1, dl, VTy));
+    SDValue Result = DAG.getNode(ISD::CTLZ, dl, VTy, OR);
+    return Result;
+  }
   case Intrinsic::eh_sjlj_lsda: {
     MachineFunction &MF = DAG.getMachineFunction();
     ARMFunctionInfo *AFI = MF.getInfo<ARMFunctionInfo>();
Index: llvm/lib/Target/AArch64/AArch64InstrInfo.td
===
--- llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -1478,6 +1478,8 @@
 def : Pat<(ctlz (or (shl (xor (sra GPR64:$Rn, (i64 63)), GPR64:$Rn), (i64 1)),
                 (i64 1))),
           (CLSXr GPR64:$Rn)>;
+def : Pat<(int_aarch64_cls GPR32:$Rn), (CLSWr GPR32:$Rn)>;
+def : Pat<(int_aarch64_cls64 GPR64:$Rm), (EXTRACT_SUBREG (CLSXr GPR64:$Rm), sub_32)>;
 
 // Unlike the other one operand instructions, the instructions with the "rev"
 // mnemonic do *not* just different in the size bit, but actually use different
Index: llvm/include/llvm/IR/IntrinsicsARM.td
===
--- llvm/include/llvm/IR/IntrinsicsARM.td
+++ llvm/include/llvm/IR/IntrinsicsARM.td
@@ -787,4 +787,6 @@
     [], [IntrReadMem, IntrWriteMem]>;
 
+def int_arm_cls: Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem]>;
+
 } // end TargetPrefix
Index: llvm/include/llvm/IR/IntrinsicsAArch64.td
===
--- llvm/include/llvm/IR/IntrinsicsAArch64.td
+++ llvm/include/llvm/IR/IntrinsicsAArch64.td
@@ -33,6 +33,9 @@
 def int_aarch64_fjcvtzs : Intrinsic<[llvm_i32_ty], [llvm_double_ty], [IntrNoMem]>;
 
+def int_aarch64_cls: Intrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem]>;
+def int_aarch64_cls64: Intrinsic<[llvm_i32_ty], [llvm_i64_ty], [IntrNoMem]>;
+
 //===
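The AArch32 expansion in this patch (arithmetic shift right by 31, xor, shift left by 1, or with 1, then clz) can be mirrored in portable C as a sanity check. This is only a sketch under stated assumptions: `clz32` and `cls32_ref` are hypothetical helper names that do not appear in the patch or in ACLE, and the loop-based `clz32` merely stands in for the hardware `clz` instruction.

```c
#include <stdint.h>

/* Hypothetical helper: count leading zeros of a 32-bit value.
   Returns 32 when x is zero. */
static unsigned clz32(uint32_t x) {
  unsigned n = 0;
  if (x == 0)
    return 32;
  while ((x & 0x80000000u) == 0) {
    x <<= 1;
    n++;
  }
  return n;
}

/* Hypothetical reference for a 32-bit "count leading sign bits",
   following the same steps as the DAG nodes in the patch. */
static unsigned cls32_ref(uint32_t x) {
  /* Equivalent of "x asr #31": all ones if the sign bit is set, else zero. */
  uint32_t sign = (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;
  /* Set bits now mark the positions that differ from the sign bit. */
  uint32_t diff = x ^ sign;
  /* Shift out the sign position itself; the low 1 caps the result at 31. */
  uint32_t t = (diff << 1) | 1u;
  return clz32(t);
}
```

With this formulation, both `0` and `0xFFFFFFFF` give 31, which is the largest value a 32-bit CLS can return.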
[PATCH] D69250: [ARM][AArch64] Implement __cls and __clsl intrinsics from ACLE
vhscampos marked 4 inline comments as done.
vhscampos added inline comments.


Comment at: clang/lib/Headers/arm_acle.h:150
+__clsl(unsigned long __t) {
+#if __SIZEOF_LONG__ == 4
+  return __builtin_arm_cls(__t);

compnerd wrote:
> I don't see a pattern match for the `cls64` on ARM32, would that not fail to lower?
Yes. However, for now, I am not enabling support for `cls64` on ARM32 as it is not done yet.


Comment at: clang/lib/Headers/arm_acle.h:155
+#endif
+}
+

compnerd wrote:
> Should we have a `__clsll` extension, otherwise these two are the same in LLP64? I'm thinking about the LLP64 environments, where `long` and `long long` are different (32-bit vs 64-bit).
ACLE does provide a `long long` version of `cls` called `__clsll`. But since the support for `cls64` on Arm32 is not done yet, I decided not to write support for `__clsll`. If I did, it would work for 64-bit but not for 32-bit.

Please let me know what you think.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69250/new/

https://reviews.llvm.org/D69250

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D69297: [ARM][AArch64] Implement __arm_rsrf, __arm_rsrf64, __arm_wsrf & __arm_wsrf64
vhscampos created this revision.
Herald added subscribers: cfe-commits, kristof.beyls.
Herald added a project: clang.

Adding support for ACLE intrinsics.

Patch by Michael Platings.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D69297

Files:
  clang/lib/Headers/arm_acle.h
  clang/test/CodeGen/arm_acle.c

Index: clang/test/CodeGen/arm_acle.c
===
--- clang/test/CodeGen/arm_acle.c
+++ clang/test/CodeGen/arm_acle.c
@@ -822,6 +822,55 @@
   __arm_wsrp("sysreg", v);
 }
 
+// ARM-LABEL: test_rsrf
+// AArch64: call i64 @llvm.read_register.i64(metadata ![[M0:[0-9]]])
+// AArch32: call i32 @llvm.read_register.i32(metadata ![[M2:[0-9]]])
+// ARM-NOT: uitofp
+// ARM: bitcast
+float test_rsrf() {
+#ifdef __ARM_32BIT_STATE
+  return __arm_rsrf("cp1:2:c3:c4:5");
+#else
+  return __arm_rsrf("1:2:3:4:5");
+#endif
+}
+// ARM-LABEL: test_rsrf64
+// AArch64: call i64 @llvm.read_register.i64(metadata ![[M0:[0-9]]])
+// AArch32: call i64 @llvm.read_register.i64(metadata ![[M3:[0-9]]])
+// ARM-NOT: uitofp
+// ARM: bitcast
+double test_rsrf64() {
+#ifdef __ARM_32BIT_STATE
+  return __arm_rsrf64("cp1:2:c3");
+#else
+  return __arm_rsrf64("1:2:3:4:5");
+#endif
+}
+// ARM-LABEL: test_wsrf
+// ARM-NOT: fptoui
+// ARM: bitcast
+// AArch64: call void @llvm.write_register.i64(metadata ![[M0:[0-9]]], i64 %{{.*}})
+// AArch32: call void @llvm.write_register.i32(metadata ![[M2:[0-9]]], i32 %{{.*}})
+void test_wsrf(float v) {
+#ifdef __ARM_32BIT_STATE
+  __arm_wsrf("cp1:2:c3:c4:5", v);
+#else
+  __arm_wsrf("1:2:3:4:5", v);
+#endif
+}
+// ARM-LABEL: test_wsrf64
+// ARM-NOT: fptoui
+// ARM: bitcast
+// AArch64: call void @llvm.write_register.i64(metadata ![[M0:[0-9]]], i64 %{{.*}})
+// AArch32: call void @llvm.write_register.i64(metadata ![[M3:[0-9]]], i64 %{{.*}})
+void test_wsrf64(double v) {
+#ifdef __ARM_32BIT_STATE
+  __arm_wsrf64("cp1:2:c3", v);
+#else
+  __arm_wsrf64("1:2:3:4:5", v);
+#endif
+}
+
 // AArch32: ![[M2]] = !{!"cp1:2:c3:c4:5"}
 // AArch32: ![[M3]] = !{!"cp1:2:c3"}
 // AArch32: ![[M4]] = !{!"sysreg"}
Index: clang/lib/Headers/arm_acle.h
===
--- clang/lib/Headers/arm_acle.h
+++ clang/lib/Headers/arm_acle.h
@@ -613,6 +613,35 @@
 #define __arm_wsr64(sysreg, v) __builtin_arm_wsr64(sysreg, v)
 #define __arm_wsrp(sysreg, v) __builtin_arm_wsrp(sysreg, v)
 
+static __inline__ float __attribute__((__always_inline__, __nodebug__))
+__bit_cast_to_float_from_uint32(uint32_t __from) {
+  float __to;
+  __builtin_memcpy(&__to, &__from, sizeof(__to));
+  return __to;
+}
+static __inline__ uint32_t __attribute__((__always_inline__, __nodebug__))
+__bit_cast_to_uint32_from_float(float __from) {
+  uint32_t __to;
+  __builtin_memcpy(&__to, &__from, sizeof(__to));
+  return __to;
+}
+static __inline__ double __attribute__((__always_inline__, __nodebug__))
+__bit_cast_to_double_from_uint64(uint64_t __from) {
+  double __to;
+  __builtin_memcpy(&__to, &__from, sizeof(__to));
+  return __to;
+}
+static __inline__ uint64_t __attribute__((__always_inline__, __nodebug__))
+__bit_cast_to_uint64_from_double(double __from) {
+  uint64_t __to;
+  __builtin_memcpy(&__to, &__from, sizeof(__to));
+  return __to;
+}
+#define __arm_rsrf(sysreg) __bit_cast_to_float_from_uint32(__arm_rsr(sysreg))
+#define __arm_wsrf(sysreg, v) __arm_wsr(sysreg, __bit_cast_to_uint32_from_float(v))
+#define __arm_rsrf64(sysreg) __bit_cast_to_double_from_uint64(__arm_rsr64(sysreg))
+#define __arm_wsrf64(sysreg, v) __arm_wsr64(sysreg, __bit_cast_to_uint64_from_double(v))
+
 /* Memory Tagging Extensions (MTE) Intrinsics */
 #if __ARM_FEATURE_MEMORY_TAGGING
 #define __arm_mte_create_random_tag(__ptr, __mask) __builtin_arm_irg(__ptr, __mask)
[PATCH] D69297: [ARM][AArch64] Implement __arm_rsrf, __arm_rsrf64, __arm_wsrf & __arm_wsrf64
vhscampos updated this revision to Diff 226015.
vhscampos added a comment.

Run clang-format


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69297/new/

https://reviews.llvm.org/D69297

Files:
  clang/lib/Headers/arm_acle.h
  clang/test/CodeGen/arm_acle.c

Index: clang/test/CodeGen/arm_acle.c
===
--- clang/test/CodeGen/arm_acle.c
+++ clang/test/CodeGen/arm_acle.c
@@ -822,6 +822,55 @@
   __arm_wsrp("sysreg", v);
 }
 
+// ARM-LABEL: test_rsrf
+// AArch64: call i64 @llvm.read_register.i64(metadata ![[M0:[0-9]]])
+// AArch32: call i32 @llvm.read_register.i32(metadata ![[M2:[0-9]]])
+// ARM-NOT: uitofp
+// ARM: bitcast
+float test_rsrf() {
+#ifdef __ARM_32BIT_STATE
+  return __arm_rsrf("cp1:2:c3:c4:5");
+#else
+  return __arm_rsrf("1:2:3:4:5");
+#endif
+}
+// ARM-LABEL: test_rsrf64
+// AArch64: call i64 @llvm.read_register.i64(metadata ![[M0:[0-9]]])
+// AArch32: call i64 @llvm.read_register.i64(metadata ![[M3:[0-9]]])
+// ARM-NOT: uitofp
+// ARM: bitcast
+double test_rsrf64() {
+#ifdef __ARM_32BIT_STATE
+  return __arm_rsrf64("cp1:2:c3");
+#else
+  return __arm_rsrf64("1:2:3:4:5");
+#endif
+}
+// ARM-LABEL: test_wsrf
+// ARM-NOT: fptoui
+// ARM: bitcast
+// AArch64: call void @llvm.write_register.i64(metadata ![[M0:[0-9]]], i64 %{{.*}})
+// AArch32: call void @llvm.write_register.i32(metadata ![[M2:[0-9]]], i32 %{{.*}})
+void test_wsrf(float v) {
+#ifdef __ARM_32BIT_STATE
+  __arm_wsrf("cp1:2:c3:c4:5", v);
+#else
+  __arm_wsrf("1:2:3:4:5", v);
+#endif
+}
+// ARM-LABEL: test_wsrf64
+// ARM-NOT: fptoui
+// ARM: bitcast
+// AArch64: call void @llvm.write_register.i64(metadata ![[M0:[0-9]]], i64 %{{.*}})
+// AArch32: call void @llvm.write_register.i64(metadata ![[M3:[0-9]]], i64 %{{.*}})
+void test_wsrf64(double v) {
+#ifdef __ARM_32BIT_STATE
+  __arm_wsrf64("cp1:2:c3", v);
+#else
+  __arm_wsrf64("1:2:3:4:5", v);
+#endif
+}
+
 // AArch32: ![[M2]] = !{!"cp1:2:c3:c4:5"}
 // AArch32: ![[M3]] = !{!"cp1:2:c3"}
 // AArch32: ![[M4]] = !{!"sysreg"}
Index: clang/lib/Headers/arm_acle.h
===
--- clang/lib/Headers/arm_acle.h
+++ clang/lib/Headers/arm_acle.h
@@ -613,6 +613,38 @@
 #define __arm_wsr64(sysreg, v) __builtin_arm_wsr64(sysreg, v)
 #define __arm_wsrp(sysreg, v) __builtin_arm_wsrp(sysreg, v)
 
+static __inline__ float __attribute__((__always_inline__, __nodebug__))
+__bit_cast_to_float_from_uint32(uint32_t __from) {
+  float __to;
+  __builtin_memcpy(&__to, &__from, sizeof(__to));
+  return __to;
+}
+static __inline__ uint32_t __attribute__((__always_inline__, __nodebug__))
+__bit_cast_to_uint32_from_float(float __from) {
+  uint32_t __to;
+  __builtin_memcpy(&__to, &__from, sizeof(__to));
+  return __to;
+}
+static __inline__ double __attribute__((__always_inline__, __nodebug__))
+__bit_cast_to_double_from_uint64(uint64_t __from) {
+  double __to;
+  __builtin_memcpy(&__to, &__from, sizeof(__to));
+  return __to;
+}
+static __inline__ uint64_t __attribute__((__always_inline__, __nodebug__))
+__bit_cast_to_uint64_from_double(double __from) {
+  uint64_t __to;
+  __builtin_memcpy(&__to, &__from, sizeof(__to));
+  return __to;
+}
+#define __arm_rsrf(sysreg) __bit_cast_to_float_from_uint32(__arm_rsr(sysreg))
+#define __arm_wsrf(sysreg, v) \
+  __arm_wsr(sysreg, __bit_cast_to_uint32_from_float(v))
+#define __arm_rsrf64(sysreg) \
+  __bit_cast_to_double_from_uint64(__arm_rsr64(sysreg))
+#define __arm_wsrf64(sysreg, v) \
+  __arm_wsr64(sysreg, __bit_cast_to_uint64_from_double(v))
+
 /* Memory Tagging Extensions (MTE) Intrinsics */
 #if __ARM_FEATURE_MEMORY_TAGGING
 #define __arm_mte_create_random_tag(__ptr, __mask) __builtin_arm_irg(__ptr, __mask)
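The helper functions in this revision reinterpret the bits of an integer as a floating-point value (and back) through `memcpy`, which is why the tests check for `bitcast` and for the absence of `uitofp`/`fptoui`. A minimal standalone sketch of the same idiom follows. The names `float_from_bits` and `bits_from_float` are hypothetical, standard `memcpy` is used in place of `__builtin_memcpy`, and the expected bit patterns assume `float` is IEEE-754 binary32.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 32-bit pattern as a float.
   The bytes are copied unchanged; no numeric conversion takes place. */
static float float_from_bits(uint32_t bits) {
  float value;
  memcpy(&value, &bits, sizeof(value));
  return value;
}

/* Hypothetical helper: reinterpret a float as its 32-bit pattern. */
static uint32_t bits_from_float(float value) {
  uint32_t bits;
  memcpy(&bits, &value, sizeof(bits));
  return bits;
}
```

Under the IEEE-754 assumption, `bits_from_float(1.0f)` yields `0x3F800000`, whereas a value conversion such as `(uint32_t)1.0f` would yield `1`. That difference is what separates a bit cast from `fptoui`.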
[PATCH] D69250: [ARM][AArch64] Implement __cls and __clsl intrinsics from ACLE
vhscampos marked 2 inline comments as done.
vhscampos added inline comments.


Comment at: clang/lib/Headers/arm_acle.h:150
+__clsl(unsigned long __t) {
+#if __SIZEOF_LONG__ == 4
+  return __builtin_arm_cls(__t);

compnerd wrote:
> vhscampos wrote:
> > compnerd wrote:
> > > I don't see a pattern match for the `cls64` on ARM32, would that not fail to lower?
> > Yes. However, for now, I am not enabling support for `cls64` on ARM32 as it is not done yet.
> Is the difference not just the parameter type? I think that implementing it should be a trivial change to the existing implementation. Is there a reason that you are not implementing that?
At clang's side, yes, but not in the backend: Arm32 does not have a `cls` instruction, thus the CLS operations need to be custom lowered. In the `llvm.arm.cls(i32)` case, lowering is quite simple, and it's been included in this patch. For `llvm.arm.cls64(i64)`, on the other hand, it is not as trivial since it's necessary to break its logic into 32-bit instructions.

So the reason not to implement that (yet) is just to split work in two different efforts.


Comment at: clang/lib/Headers/arm_acle.h:155
+#endif
+}
+

compnerd wrote:
> vhscampos wrote:
> > compnerd wrote:
> > > Should we have a `__clsll` extension, otherwise these two are the same in LLP64? I'm thinking about the LLP64 environments, where `long` and `long long` are different (32-bit vs 64-bit).
> > ACLE does provide a `long long` version of `cls` called `__clsll`. But since the support for `cls64` on Arm32 is not done yet, I decided not to write support for `__clsll`. If I did, it would work for 64-bit but not for 32-bit.
> >
> > Please let me know what you think.
> clang supports Windows where `long` is 4-bytes even on 64-bit targets, and this means that this doesn't work for that target. I think that we need to add `__clsll` so that 64-bit ARM at least is covered.
I'm not sure if I am following you. On AArch64-Windows, `__clsl` will be lowered to `llvm.aarch64.cls(i32)` which will then be custom lowered correctly.

Let me know if I am thinking this wrong.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69250/new/

https://reviews.llvm.org/D69250
[PATCH] D69297: [ARM][AArch64] Implement __arm_rsrf, __arm_rsrf64, __arm_wsrf & __arm_wsrf64
vhscampos updated this revision to Diff 226406.
vhscampos added a comment.

Use __builtin_bit_cast to perform the relevant bitcasts.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69297/new/

https://reviews.llvm.org/D69297

Files:
  clang/lib/Headers/arm_acle.h
  clang/test/CodeGen/arm_acle.c

Index: clang/test/CodeGen/arm_acle.c
===
--- clang/test/CodeGen/arm_acle.c
+++ clang/test/CodeGen/arm_acle.c
@@ -822,6 +822,55 @@
   __arm_wsrp("sysreg", v);
 }
 
+// ARM-LABEL: test_rsrf
+// AArch64: call i64 @llvm.read_register.i64(metadata ![[M0:[0-9]]])
+// AArch32: call i32 @llvm.read_register.i32(metadata ![[M2:[0-9]]])
+// ARM-NOT: uitofp
+// ARM: bitcast
+float test_rsrf() {
+#ifdef __ARM_32BIT_STATE
+  return __arm_rsrf("cp1:2:c3:c4:5");
+#else
+  return __arm_rsrf("1:2:3:4:5");
+#endif
+}
+// ARM-LABEL: test_rsrf64
+// AArch64: call i64 @llvm.read_register.i64(metadata ![[M0:[0-9]]])
+// AArch32: call i64 @llvm.read_register.i64(metadata ![[M3:[0-9]]])
+// ARM-NOT: uitofp
+// ARM: bitcast
+double test_rsrf64() {
+#ifdef __ARM_32BIT_STATE
+  return __arm_rsrf64("cp1:2:c3");
+#else
+  return __arm_rsrf64("1:2:3:4:5");
+#endif
+}
+// ARM-LABEL: test_wsrf
+// ARM-NOT: fptoui
+// ARM: bitcast
+// AArch64: call void @llvm.write_register.i64(metadata ![[M0:[0-9]]], i64 %{{.*}})
+// AArch32: call void @llvm.write_register.i32(metadata ![[M2:[0-9]]], i32 %{{.*}})
+void test_wsrf(float v) {
+#ifdef __ARM_32BIT_STATE
+  __arm_wsrf("cp1:2:c3:c4:5", v);
+#else
+  __arm_wsrf("1:2:3:4:5", v);
+#endif
+}
+// ARM-LABEL: test_wsrf64
+// ARM-NOT: fptoui
+// ARM: bitcast
+// AArch64: call void @llvm.write_register.i64(metadata ![[M0:[0-9]]], i64 %{{.*}})
+// AArch32: call void @llvm.write_register.i64(metadata ![[M3:[0-9]]], i64 %{{.*}})
+void test_wsrf64(double v) {
+#ifdef __ARM_32BIT_STATE
+  __arm_wsrf64("cp1:2:c3", v);
+#else
+  __arm_wsrf64("1:2:3:4:5", v);
+#endif
+}
+
 // AArch32: ![[M2]] = !{!"cp1:2:c3:c4:5"}
 // AArch32: ![[M3]] = !{!"cp1:2:c3"}
 // AArch32: ![[M4]] = !{!"sysreg"}
Index: clang/lib/Headers/arm_acle.h
===
--- clang/lib/Headers/arm_acle.h
+++ clang/lib/Headers/arm_acle.h
@@ -609,9 +609,13 @@
 #define __arm_rsr(sysreg) __builtin_arm_rsr(sysreg)
 #define __arm_rsr64(sysreg) __builtin_arm_rsr64(sysreg)
 #define __arm_rsrp(sysreg) __builtin_arm_rsrp(sysreg)
+#define __arm_rsrf(sysreg) __builtin_bit_cast(float, __arm_rsr(sysreg))
+#define __arm_rsrf64(sysreg) __builtin_bit_cast(double, __arm_rsr64(sysreg))
 #define __arm_wsr(sysreg, v) __builtin_arm_wsr(sysreg, v)
 #define __arm_wsr64(sysreg, v) __builtin_arm_wsr64(sysreg, v)
 #define __arm_wsrp(sysreg, v) __builtin_arm_wsrp(sysreg, v)
+#define __arm_wsrf(sysreg, v) __arm_wsr(sysreg, __builtin_bit_cast(uint32_t, v))
+#define __arm_wsrf64(sysreg, v) __arm_wsr64(sysreg, __builtin_bit_cast(uint64_t, v))
 
 /* Memory Tagging Extensions (MTE) Intrinsics */
 #if __ARM_FEATURE_MEMORY_TAGGING
[PATCH] D69250: [ARM][AArch64] Implement __cls and __clsl intrinsics from ACLE
vhscampos updated this revision to Diff 226430.
vhscampos added a comment.

Add support for __clsll.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69250/new/

https://reviews.llvm.org/D69250

Files:
  clang/include/clang/Basic/BuiltinsAArch64.def
  clang/include/clang/Basic/BuiltinsARM.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Headers/arm_acle.h
  clang/test/CodeGen/arm_acle.c
  clang/test/CodeGen/builtins-arm.c
  clang/test/CodeGen/builtins-arm64.c
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/ARM/ARMISelLowering.cpp
  llvm/test/CodeGen/AArch64/cls.ll
  llvm/test/CodeGen/ARM/cls.ll

Index: llvm/test/CodeGen/ARM/cls.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/cls.ll
@@ -0,0 +1,27 @@
+; RUN: llc -mtriple=armv5 %s -o - | FileCheck %s
+
+; CHECK: eor [[T:r[0-9]+]], [[T]], [[T]], asr #31
+; CHECK-NEXT: mov [[C1:r[0-9]+]], #1
+; CHECK-NEXT: orr [[T]], [[C1]], [[T]], lsl #1
+; CHECK-NEXT: clz [[T]], [[T]]
+define i32 @cls(i32 %t) {
+  %cls.i = call i32 @llvm.arm.cls(i32 %t)
+  ret i32 %cls.i
+}
+
+; CHECK: cmp r1, #0
+; CHECK: mvnne [[ADJUSTEDLO:r[0-9]+]], r0
+; CHECK: clz [[CLZLO:r[0-9]+]], [[ADJUSTEDLO]]
+; CHECK: eor [[A:r[0-9]+]], r1, r1, asr #31
+; CHECK: mov r1, #1
+; CHECK: orr [[A]], r1, [[A]], lsl #1
+; CHECK: clz [[CLSHI:r[0-9]+]], [[A]]
+; CHECK: cmp [[CLSHI]], #31
+; CHECK: addeq r0, [[CLZLO]], #31
+define i32 @cls64(i64 %t) {
+  %cls.i = call i32 @llvm.arm.cls64(i64 %t)
+  ret i32 %cls.i
+}
+
+declare i32 @llvm.arm.cls(i32) nounwind
+declare i32 @llvm.arm.cls64(i64) nounwind
Index: llvm/test/CodeGen/AArch64/cls.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/cls.ll
@@ -0,0 +1,20 @@
+; RUN: llc -mtriple=aarch64 %s -o - | FileCheck %s
+
+; @llvm.aarch64.cls must be directly translated into the 'cls' instruction
+
+; CHECK-LABEL: cls
+; CHECK: cls [[REG:w[0-9]+]], [[REG]]
+define i32 @cls(i32 %t) {
+  %cls.i = call i32 @llvm.aarch64.cls(i32 %t)
+  ret i32 %cls.i
+}
+
+; CHECK-LABEL: cls64
+; CHECK: cls [[REG:x[0-9]+]], [[REG]]
+define i32 @cls64(i64 %t) {
+  %cls.i = call i32 @llvm.aarch64.cls64(i64 %t)
+  ret i32 %cls.i
+}
+
+declare i32 @llvm.aarch64.cls(i32) nounwind
+declare i32 @llvm.aarch64.cls64(i64) nounwind
Index: llvm/lib/Target/ARM/ARMISelLowering.cpp
===
--- llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -3629,6 +3629,49 @@
     EVT PtrVT = getPointerTy(DAG.getDataLayout());
     return DAG.getNode(ARMISD::THREAD_POINTER, dl, PtrVT);
   }
+  case Intrinsic::arm_cls: {
+    const SDValue &Operand = Op.getOperand(1);
+    const EVT VTy = Op.getValueType();
+    SDValue SRA =
+        DAG.getNode(ISD::SRA, dl, VTy, Operand, DAG.getConstant(31, dl, VTy));
+    SDValue XOR = DAG.getNode(ISD::XOR, dl, VTy, SRA, Operand);
+    SDValue SHL =
+        DAG.getNode(ISD::SHL, dl, VTy, XOR, DAG.getConstant(1, dl, VTy));
+    SDValue OR =
+        DAG.getNode(ISD::OR, dl, VTy, SHL, DAG.getConstant(1, dl, VTy));
+    SDValue Result = DAG.getNode(ISD::CTLZ, dl, VTy, OR);
+    return Result;
+  }
+  case Intrinsic::arm_cls64: {
+    // cls(x) = if cls(hi(x)) != 31 then cls(hi(x))
+    //          else 31 + clz(if hi(x) == 0 then lo(x) else not(lo(x)))
+    const SDValue &Operand = Op.getOperand(1);
+    const EVT VTy = Op.getValueType();
+
+    SDValue Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, VTy, Operand,
+                             DAG.getConstant(1, dl, VTy));
+    SDValue Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, VTy, Operand,
+                             DAG.getConstant(0, dl, VTy));
+    SDValue Constant0 = DAG.getConstant(0, dl, VTy);
+    SDValue Constant1 = DAG.getConstant(1, dl, VTy);
+    SDValue Constant31 = DAG.getConstant(31, dl, VTy);
+    SDValue SRAHi = DAG.getNode(ISD::SRA, dl, VTy, Hi, Constant31);
+    SDValue XORHi = DAG.getNode(ISD::XOR, dl, VTy, SRAHi, Hi);
+    SDValue SHLHi = DAG.getNode(ISD::SHL, dl, VTy, XORHi, Constant1);
+    SDValue ORHi = DAG.getNode(ISD::OR, dl, VTy, SHLHi, Constant1);
+    SDValue CLSHi = DAG.getNode(ISD::CTLZ, dl, VTy, ORHi);
+    SDValue CheckLo =
+        DAG.getSetCC(dl, MVT::i1, CLSHi, Constant31, ISD::CondCode::SETEQ);
+    SDValue HiIsZero =
+        DAG.getSetCC(dl, MVT::i1, Hi, Constant0, ISD::CondCode::SETEQ);
+    SDValue AdjustedLo =
+        DAG.getSelect(dl, VTy, HiIsZero, Lo, DAG.getNOT(dl, Lo, VTy));
+    SDValue CLZAdjustedLo = DAG.getNode(ISD::CTLZ, dl, VTy, AdjustedLo);
+    SDValue Result =
+        DAG.getSelect(dl, VTy, CheckLo,
+                      DAG.getNode(ISD::ADD, dl, VTy, CLZAdjustedLo, Constant31), CLSHi);
+    return Result;
+  }
   case Intrinsic::eh_sjlj_lsda: {
     MachineFunction &MF = DAG.getMachineFunction();
     ARM
[PATCH] D69250: [ARM][AArch64] Implement __cls, __clsl and __clsll intrinsics from ACLE
vhscampos marked an inline comment as done.
vhscampos added a comment.

Added support for `__clsll` as requested.


Comment at: clang/lib/Headers/arm_acle.h:150
+__clsl(unsigned long __t) {
+#if __SIZEOF_LONG__ == 4
+  return __builtin_arm_cls(__t);

compnerd wrote:
> vhscampos wrote:
> > compnerd wrote:
> > > vhscampos wrote:
> > > > compnerd wrote:
> > > > > I don't see a pattern match for the `cls64` on ARM32, would that not fail to lower?
> > > > Yes. However, for now, I am not enabling support for `cls64` on ARM32 as it is not done yet.
> > > Is the difference not just the parameter type? I think that implementing it should be a trivial change to the existing implementation. Is there a reason that you are not implementing that?
> > At clang's side, yes, but not in the backend: Arm32 does not have a `cls` instruction, thus the CLS operations need to be custom lowered. In the `llvm.arm.cls(i32)` case, lowering is quite simple, and it's been included in this patch. For `llvm.arm.cls64(i64)`, on the other hand, it is not as trivial since it's necessary to break its logic into 32-bit instructions.
> >
> > So the reason not to implement that (yet) is just to split work in two different efforts.
> Would it not be sufficient to do the top half (after a shift right of 32-bits), and if it is exactly 32, then do the bottom 32-bits, otherwise, you're done?
Sort of. How we interpret the bottom half depends on the value of the top half. I've added this custom lowering in the latest revision.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69250/new/

https://reviews.llvm.org/D69250
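The point that the bottom half is interpreted according to the top half can be sketched in C by following the formula from the comment in the `arm_cls64` lowering: if `cls(hi) != 31` the answer is `cls(hi)`, otherwise it is `31 + clz(hi == 0 ? lo : ~lo)`. This is an illustrative sketch only. `clz32`, `cls32_ref` and `cls64_ref` are hypothetical names that are not part of the patch, and the loop-based `clz32` stands in for the hardware instruction.

```c
#include <stdint.h>

/* Hypothetical helper: count leading zeros; returns 32 when x is zero. */
static unsigned clz32(uint32_t x) {
  unsigned n = 0;
  if (x == 0)
    return 32;
  while ((x & 0x80000000u) == 0) {
    x <<= 1;
    n++;
  }
  return n;
}

/* Hypothetical 32-bit CLS, same steps as the arm_cls lowering. */
static unsigned cls32_ref(uint32_t x) {
  uint32_t sign = (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;
  return clz32(((x ^ sign) << 1) | 1u);
}

/* Hypothetical 64-bit CLS built from the two 32-bit halves. */
static unsigned cls64_ref(uint64_t x) {
  uint32_t hi = (uint32_t)(x >> 32);
  uint32_t lo = (uint32_t)x;
  unsigned cls_hi = cls32_ref(hi);

  /* The run of sign bits ends inside the top half. */
  if (cls_hi != 31)
    return cls_hi;

  /* Here hi is all zeros or all ones. Count how many leading bits of lo
     match the sign: leading zeros of lo if hi is zero, leading ones of lo
     (leading zeros of ~lo) otherwise. */
  uint32_t adjusted_lo = (hi == 0) ? lo : ~lo;
  return 31 + clz32(adjusted_lo);
}
```

For example, `0xFFFFFFFF00000000` has an all-ones top half, so the bottom half is inverted before counting. The inverted value has no leading zeros, and the result is 31.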
[PATCH] D69297: [ARM][AArch64] Implement __arm_rsrf, __arm_rsrf64, __arm_wsrf & __arm_wsrf64
This revision was automatically updated to reflect the committed changes. Closed by commit rG5d35b7d9e1a3: [ARM][AArch64] Implement __arm_rsrf, __arm_rsrf64, __arm_wsrf & __arm_wsrf64 (authored by vhscampos). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D69297/new/ https://reviews.llvm.org/D69297 Files: clang/lib/Headers/arm_acle.h clang/test/CodeGen/arm_acle.c Index: clang/test/CodeGen/arm_acle.c === --- clang/test/CodeGen/arm_acle.c +++ clang/test/CodeGen/arm_acle.c @@ -822,6 +822,55 @@ __arm_wsrp("sysreg", v); } +// ARM-LABEL: test_rsrf +// AArch64: call i64 @llvm.read_register.i64(metadata ![[M0:[0-9]]]) +// AArch32: call i32 @llvm.read_register.i32(metadata ![[M2:[0-9]]]) +// ARM-NOT: uitofp +// ARM: bitcast +float test_rsrf() { +#ifdef __ARM_32BIT_STATE + return __arm_rsrf("cp1:2:c3:c4:5"); +#else + return __arm_rsrf("1:2:3:4:5"); +#endif +} +// ARM-LABEL: test_rsrf64 +// AArch64: call i64 @llvm.read_register.i64(metadata ![[M0:[0-9]]]) +// AArch32: call i64 @llvm.read_register.i64(metadata ![[M3:[0-9]]]) +// ARM-NOT: uitofp +// ARM: bitcast +double test_rsrf64() { +#ifdef __ARM_32BIT_STATE + return __arm_rsrf64("cp1:2:c3"); +#else + return __arm_rsrf64("1:2:3:4:5"); +#endif +} +// ARM-LABEL: test_wsrf +// ARM-NOT: fptoui +// ARM: bitcast +// AArch64: call void @llvm.write_register.i64(metadata ![[M0:[0-9]]], i64 %{{.*}}) +// AArch32: call void @llvm.write_register.i32(metadata ![[M2:[0-9]]], i32 %{{.*}}) +void test_wsrf(float v) { +#ifdef __ARM_32BIT_STATE + __arm_wsrf("cp1:2:c3:c4:5", v); +#else + __arm_wsrf("1:2:3:4:5", v); +#endif +} +// ARM-LABEL: test_wsrf64 +// ARM-NOT: fptoui +// ARM: bitcast +// AArch64: call void @llvm.write_register.i64(metadata ![[M0:[0-9]]], i64 %{{.*}}) +// AArch32: call void @llvm.write_register.i64(metadata ![[M3:[0-9]]], i64 %{{.*}}) +void test_wsrf64(double v) { +#ifdef __ARM_32BIT_STATE + __arm_wsrf64("cp1:2:c3", v); +#else + __arm_wsrf64("1:2:3:4:5", v); +#endif +} + // AArch32: 
![[M2]] = !{!"cp1:2:c3:c4:5"} // AArch32: ![[M3]] = !{!"cp1:2:c3"} // AArch32: ![[M4]] = !{!"sysreg"} Index: clang/lib/Headers/arm_acle.h === --- clang/lib/Headers/arm_acle.h +++ clang/lib/Headers/arm_acle.h @@ -609,9 +609,13 @@ #define __arm_rsr(sysreg) __builtin_arm_rsr(sysreg) #define __arm_rsr64(sysreg) __builtin_arm_rsr64(sysreg) #define __arm_rsrp(sysreg) __builtin_arm_rsrp(sysreg) +#define __arm_rsrf(sysreg) __builtin_bit_cast(float, __arm_rsr(sysreg)) +#define __arm_rsrf64(sysreg) __builtin_bit_cast(double, __arm_rsr64(sysreg)) #define __arm_wsr(sysreg, v) __builtin_arm_wsr(sysreg, v) #define __arm_wsr64(sysreg, v) __builtin_arm_wsr64(sysreg, v) #define __arm_wsrp(sysreg, v) __builtin_arm_wsrp(sysreg, v) +#define __arm_wsrf(sysreg, v) __arm_wsr(sysreg, __builtin_bit_cast(uint32_t, v)) +#define __arm_wsrf64(sysreg, v) __arm_wsr64(sysreg, __builtin_bit_cast(uint64_t, v)) /* Memory Tagging Extensions (MTE) Intrinsics */ #if __ARM_FEATURE_MEMORY_TAGGING
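The macros in this patch reinterpret the register bits rather than convert the integer value, which is why the tests expect `bitcast` and reject `uitofp`/`fptoui`. The difference can be sketched in portable C++ as follows. This is only an illustration: `bitsToFloat` and `floatToBits` are hypothetical names, `std::memcpy` stands in for `__builtin_bit_cast`, and IEEE-754 single precision is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: reinterpret 32 raw register bits as a float.
// This mirrors what __arm_rsrf does with the result of __arm_rsr.
float bitsToFloat(std::uint32_t Bits) {
  float F;
  std::memcpy(&F, &Bits, sizeof F); // copies the bit pattern unchanged
  return F;
}

// Hypothetical helper: reinterpret a float as 32 raw bits.
// This mirrors what __arm_wsrf does before calling __arm_wsr.
std::uint32_t floatToBits(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof Bits);
  return Bits;
}
```

With IEEE-754 floats, `bitsToFloat(0x3F800000u)` yields `1.0f`, whereas a numeric conversion such as `static_cast<float>(0x3F800000u)` yields a value close to 1.07e9.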
[PATCH] D69250: [ARM][AArch64] Implement __cls, __clsl and __clsll intrinsics from ACLE
This revision was automatically updated to reflect the committed changes. Closed by commit rGf6e11a36c49c: [ARM][AArch64] Implement __cls, __clsl and __clsll intrinsics from ACLE (authored by vhscampos). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D69250/new/ https://reviews.llvm.org/D69250 Files: clang/include/clang/Basic/BuiltinsAArch64.def clang/include/clang/Basic/BuiltinsARM.def clang/lib/CodeGen/CGBuiltin.cpp clang/lib/Headers/arm_acle.h clang/test/CodeGen/arm_acle.c clang/test/CodeGen/builtins-arm.c clang/test/CodeGen/builtins-arm64.c llvm/include/llvm/IR/IntrinsicsAArch64.td llvm/include/llvm/IR/IntrinsicsARM.td llvm/lib/Target/AArch64/AArch64InstrInfo.td llvm/lib/Target/ARM/ARMISelLowering.cpp llvm/test/CodeGen/AArch64/cls.ll llvm/test/CodeGen/ARM/cls.ll Index: llvm/test/CodeGen/ARM/cls.ll === --- /dev/null +++ llvm/test/CodeGen/ARM/cls.ll @@ -0,0 +1,27 @@ +; RUN: llc -mtriple=armv5 %s -o - | FileCheck %s + +; CHECK: eor [[T:r[0-9]+]], [[T]], [[T]], asr #31 +; CHECK-NEXT: mov [[C1:r[0-9]+]], #1 +; CHECK-NEXT: orr [[T]], [[C1]], [[T]], lsl #1 +; CHECK-NEXT: clz [[T]], [[T]] +define i32 @cls(i32 %t) { + %cls.i = call i32 @llvm.arm.cls(i32 %t) + ret i32 %cls.i +} + +; CHECK: cmp r1, #0 +; CHECK: mvnne [[ADJUSTEDLO:r[0-9]+]], r0 +; CHECK: clz [[CLZLO:r[0-9]+]], [[ADJUSTEDLO]] +; CHECK: eor [[A:r[0-9]+]], r1, r1, asr #31 +; CHECK: mov r1, #1 +; CHECK: orr [[A]], r1, [[A]], lsl #1 +; CHECK: clz [[CLSHI:r[0-9]+]], [[A]] +; CHECK: cmp [[CLSHI]], #31 +; CHECK: addeq r0, [[CLZLO]], #31 +define i32 @cls64(i64 %t) { + %cls.i = call i32 @llvm.arm.cls64(i64 %t) + ret i32 %cls.i +} + +declare i32 @llvm.arm.cls(i32) nounwind +declare i32 @llvm.arm.cls64(i64) nounwind Index: llvm/test/CodeGen/AArch64/cls.ll === --- /dev/null +++ llvm/test/CodeGen/AArch64/cls.ll @@ -0,0 +1,20 @@ +; RUN: llc -mtriple=aarch64 %s -o - | FileCheck %s + +; @llvm.aarch64.cls must be directly translated into the 'cls' instruction + +; CHECK-LABEL: cls +; 
CHECK: cls [[REG:w[0-9]+]], [[REG]] +define i32 @cls(i32 %t) { + %cls.i = call i32 @llvm.aarch64.cls(i32 %t) + ret i32 %cls.i +} + +; CHECK-LABEL: cls64 +; CHECK: cls [[REG:x[0-9]+]], [[REG]] +define i32 @cls64(i64 %t) { + %cls.i = call i32 @llvm.aarch64.cls64(i64 %t) + ret i32 %cls.i +} + +declare i32 @llvm.aarch64.cls(i32) nounwind +declare i32 @llvm.aarch64.cls64(i64) nounwind Index: llvm/lib/Target/ARM/ARMISelLowering.cpp === --- llvm/lib/Target/ARM/ARMISelLowering.cpp +++ llvm/lib/Target/ARM/ARMISelLowering.cpp @@ -3629,6 +3629,49 @@ EVT PtrVT = getPointerTy(DAG.getDataLayout()); return DAG.getNode(ARMISD::THREAD_POINTER, dl, PtrVT); } + case Intrinsic::arm_cls: { +const SDValue &Operand = Op.getOperand(1); +const EVT VTy = Op.getValueType(); +SDValue SRA = +DAG.getNode(ISD::SRA, dl, VTy, Operand, DAG.getConstant(31, dl, VTy)); +SDValue XOR = DAG.getNode(ISD::XOR, dl, VTy, SRA, Operand); +SDValue SHL = +DAG.getNode(ISD::SHL, dl, VTy, XOR, DAG.getConstant(1, dl, VTy)); +SDValue OR = +DAG.getNode(ISD::OR, dl, VTy, SHL, DAG.getConstant(1, dl, VTy)); +SDValue Result = DAG.getNode(ISD::CTLZ, dl, VTy, OR); +return Result; + } + case Intrinsic::arm_cls64: { +// cls(x) = if cls(hi(x)) != 31 then cls(hi(x)) +// else 31 + clz(if hi(x) == 0 then lo(x) else not(lo(x))) +const SDValue &Operand = Op.getOperand(1); +const EVT VTy = Op.getValueType(); + +SDValue Hi = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, VTy, Operand, + DAG.getConstant(1, dl, VTy)); +SDValue Lo = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, VTy, Operand, + DAG.getConstant(0, dl, VTy)); +SDValue Constant0 = DAG.getConstant(0, dl, VTy); +SDValue Constant1 = DAG.getConstant(1, dl, VTy); +SDValue Constant31 = DAG.getConstant(31, dl, VTy); +SDValue SRAHi = DAG.getNode(ISD::SRA, dl, VTy, Hi, Constant31); +SDValue XORHi = DAG.getNode(ISD::XOR, dl, VTy, SRAHi, Hi); +SDValue SHLHi = DAG.getNode(ISD::SHL, dl, VTy, XORHi, Constant1); +SDValue ORHi = DAG.getNode(ISD::OR, dl, VTy, SHLHi, Constant1); +SDValue CLSHi = 
DAG.getNode(ISD::CTLZ, dl, VTy, ORHi); +SDValue CheckLo = +DAG.getSetCC(dl, MVT::i1, CLSHi, Constant31, ISD::CondCode::SETEQ); +SDValue HiIsZero = +DAG.getSetCC(dl, MVT::i1, Hi, Constant0, ISD::CondCode::SETEQ); +SDValue AdjustedLo = +DAG.getSelect(dl, VTy, HiIsZero, Lo, DAG.getNOT(dl, Lo, VTy)); +SDValue CLZAdjustedLo = DAG.getNode(ISD::CTLZ, dl, VTy, AdjustedLo); +SDValue Result = +DAG.getSelect(dl, VTy, CheckLo, + DAG.getNode(ISD::ADD, dl, VTy, CLZAdjustedLo, Constant31), CLSHi); +return Resul
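The AArch32 lowering above expresses CLS in terms of CLZ. A minimal C++ sketch of the same arithmetic is given below, assuming two's-complement integers; `clz32`, `cls32` and `cls64` are hypothetical helper names, not part of the patch, and `clz32(0)` is taken to be 32, matching `ISD::CTLZ`.

```cpp
#include <cassert>
#include <cstdint>

// Count leading zero bits; returns 32 for an input of zero.
unsigned clz32(std::uint32_t V) {
  unsigned N = 0;
  for (std::uint32_t Bit = 0x80000000u; Bit != 0 && (V & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// cls(x) = clz(((x ^ (x asr 31)) << 1) | 1)
// The XOR turns every bit equal to the sign bit into zero; the shift drops
// the sign bit itself; the OR with 1 caps the result at 31.
unsigned cls32(std::uint32_t X) {
  const std::uint32_t SignMask = (X >> 31) ? 0xFFFFFFFFu : 0u; // x asr 31
  const std::uint32_t T = X ^ SignMask;
  return clz32((T << 1) | 1u);
}

// cls(x) = if cls(hi(x)) != 31 then cls(hi(x))
//          else 31 + clz(if hi(x) == 0 then lo(x) else not(lo(x)))
unsigned cls64(std::uint64_t X) {
  const std::uint32_t Hi = static_cast<std::uint32_t>(X >> 32);
  const std::uint32_t Lo = static_cast<std::uint32_t>(X);
  const unsigned ClsHi = cls32(Hi);
  if (ClsHi != 31)
    return ClsHi;
  const std::uint32_t AdjustedLo = (Hi == 0) ? Lo : ~Lo;
  return 31 + clz32(AdjustedLo);
}
```

For example, `cls32(1)` is 30 and `cls64(0)` is 63: the sign bit itself is never counted.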
[PATCH] D82948: [Driver][ARM] Disable unsupported features when nofp arch extension is used
vhscampos added a comment. Ping. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82948/new/ https://reviews.llvm.org/D82948 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D82948: [Driver][ARM] Disable unsupported features when nofp arch extension is used
vhscampos updated this revision to Diff 280072. vhscampos added a comment. 1. Add comment explaining the MVE-Integer detail. 2. Add another test to check the disabled features. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82948/new/ https://reviews.llvm.org/D82948 Files: clang/lib/Driver/ToolChains/Arch/ARM.cpp clang/test/CodeGen/arm-bf16-softfloat.c clang/test/Driver/arm-nofp-disabled-features.c llvm/include/llvm/Support/ARMTargetParser.h llvm/lib/Support/ARMTargetParser.cpp llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -668,9 +668,10 @@ testArchExtDependency(const char *ArchExt, const std::initializer_list &Expected) { std::vector Features; + unsigned FPUID; if (!ARM::appendArchExtFeatures("", ARM::ArchKind::ARMV8_1MMainline, ArchExt, - Features)) + Features, FPUID)) return false; return llvm::all_of(Expected, [&](StringRef Ext) { Index: llvm/lib/Support/ARMTargetParser.cpp === --- llvm/lib/Support/ARMTargetParser.cpp +++ llvm/lib/Support/ARMTargetParser.cpp @@ -490,9 +490,10 @@ return ARM::FK_INVALID; } -bool ARM::appendArchExtFeatures( - StringRef CPU, ARM::ArchKind AK, StringRef ArchExt, - std::vector &Features) { +bool ARM::appendArchExtFeatures(StringRef CPU, ARM::ArchKind AK, +StringRef ArchExt, +std::vector &Features, +unsigned &ArgFPUID) { size_t StartingNumFeatures = Features.size(); const bool Negated = stripNegationPrefix(ArchExt); @@ -527,6 +528,7 @@ } else { FPUKind = getDefaultFPU(CPU, AK); } +ArgFPUID = FPUKind; return ARM::getFPUFeatures(FPUKind, Features); } return StartingNumFeatures != Features.size(); Index: llvm/include/llvm/Support/ARMTargetParser.h === --- llvm/include/llvm/Support/ARMTargetParser.h +++ llvm/include/llvm/Support/ARMTargetParser.h @@ -250,7 +250,8 @@ StringRef getArchExtName(uint64_t ArchExtKind); StringRef 
getArchExtFeature(StringRef ArchExt); bool appendArchExtFeatures(StringRef CPU, ARM::ArchKind AK, StringRef ArchExt, - std::vector &Features); + std::vector &Features, + unsigned &ArgFPUKind); StringRef getHWDivName(uint64_t HWDivKind); // Information by Name Index: clang/test/Driver/arm-nofp-disabled-features.c === --- /dev/null +++ clang/test/Driver/arm-nofp-disabled-features.c @@ -0,0 +1,18 @@ +// RUN: %clang -target arm-arm-none-eabi -mfloat-abi=soft %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-MFLOAT-ABI-SOFT +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-dotprod" +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-fp16fml" +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-bf16" +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-mve" +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-mve.fp" +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-fpregs" + +// RUN: %clang -target arm-arm-none-eabi -mfpu=none %s -### 2>&1 | FileCheck %s +// RUN: %clang -target arm-arm-none-eabi -march=armv8-a+nofp %s -### 2>&1 | FileCheck %s +// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a35+nofp %s -### 2>&1 | FileCheck %s +// RUN: %clang -target arm-arm-none-eabi -march=armv8-a+nofp+nomve %s -### 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-NOMVE +// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a35+nofp+nomve %s -### 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-NOMVE +// CHECK: "-target-feature" "-dotprod" +// CHECK: "-target-feature" "-fp16fml" +// CHECK: "-target-feature" "-bf16" +// CHECK: "-target-feature" "-mve.fp" +// CHECK-NOMVE: "-target-feature" "-fpregs" Index: clang/test/CodeGen/arm-bf16-softfloat.c === --- clang/test/CodeGen/arm-bf16-softfloat.c +++ clang/test/CodeGen/arm-bf16-softfloat.c @@ -1,4 +1,9 @@ -// RUN: not %clang -o %t.out -target arm-arm-eabi -march=armv8-a+bf16 -mfloat-abi=soft -c %s 2>&1 | FileCheck %s +// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16 -mfloat-abi=soft -c %s -o %t 2>&1 | FileCheck %s +// RUN: not %clang -target 
arm-arm-none-eabi -march=armv8-a+bf16 -mfpu=none -c %s -o %t 2>&1 | FileCheck %s +// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16+nofp -c %s -o %t 2>&1 | FileCheck %s +// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16+fp+nofp -c %s -o %t 2>&1 | FileCheck %s +// RUN: %clang -target arm-arm-none-eabi -march=armv8-a+bf16+
[PATCH] D82948: [Driver][ARM] Disable unsupported features when nofp arch extension is used
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGd1a3396bfbc6: [Driver][ARM] Disable unsupported features when nofp arch extension is used (authored by vhscampos). Changed prior to commit: https://reviews.llvm.org/D82948?vs=280072&id=281549#toc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82948/new/ https://reviews.llvm.org/D82948 Files: clang/lib/Driver/ToolChains/Arch/ARM.cpp clang/test/CodeGen/arm-bf16-softfloat.c clang/test/Driver/arm-nofp-disabled-features.c llvm/include/llvm/Support/ARMTargetParser.h llvm/lib/Support/ARMTargetParser.cpp llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -668,9 +668,10 @@ testArchExtDependency(const char *ArchExt, const std::initializer_list &Expected) { std::vector Features; + unsigned FPUID; if (!ARM::appendArchExtFeatures("", ARM::ArchKind::ARMV8_1MMainline, ArchExt, - Features)) + Features, FPUID)) return false; return llvm::all_of(Expected, [&](StringRef Ext) { Index: llvm/lib/Support/ARMTargetParser.cpp === --- llvm/lib/Support/ARMTargetParser.cpp +++ llvm/lib/Support/ARMTargetParser.cpp @@ -490,9 +490,10 @@ return ARM::FK_INVALID; } -bool ARM::appendArchExtFeatures( - StringRef CPU, ARM::ArchKind AK, StringRef ArchExt, - std::vector &Features) { +bool ARM::appendArchExtFeatures(StringRef CPU, ARM::ArchKind AK, +StringRef ArchExt, +std::vector &Features, +unsigned &ArgFPUID) { size_t StartingNumFeatures = Features.size(); const bool Negated = stripNegationPrefix(ArchExt); @@ -527,6 +528,7 @@ } else { FPUKind = getDefaultFPU(CPU, AK); } +ArgFPUID = FPUKind; return ARM::getFPUFeatures(FPUKind, Features); } return StartingNumFeatures != Features.size(); Index: llvm/include/llvm/Support/ARMTargetParser.h === --- 
llvm/include/llvm/Support/ARMTargetParser.h +++ llvm/include/llvm/Support/ARMTargetParser.h @@ -250,7 +250,8 @@ StringRef getArchExtName(uint64_t ArchExtKind); StringRef getArchExtFeature(StringRef ArchExt); bool appendArchExtFeatures(StringRef CPU, ARM::ArchKind AK, StringRef ArchExt, - std::vector &Features); + std::vector &Features, + unsigned &ArgFPUKind); StringRef getHWDivName(uint64_t HWDivKind); // Information by Name Index: clang/test/Driver/arm-nofp-disabled-features.c === --- /dev/null +++ clang/test/Driver/arm-nofp-disabled-features.c @@ -0,0 +1,18 @@ +// RUN: %clang -target arm-arm-none-eabi -mfloat-abi=soft %s -### 2>&1 | FileCheck %s --check-prefix=CHECK-MFLOAT-ABI-SOFT +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-dotprod" +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-fp16fml" +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-bf16" +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-mve" +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-mve.fp" +// CHECK-MFLOAT-ABI-SOFT: "-target-feature" "-fpregs" + +// RUN: %clang -target arm-arm-none-eabi -mfpu=none %s -### 2>&1 | FileCheck %s +// RUN: %clang -target arm-arm-none-eabi -march=armv8-a+nofp %s -### 2>&1 | FileCheck %s +// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a35+nofp %s -### 2>&1 | FileCheck %s +// RUN: %clang -target arm-arm-none-eabi -march=armv8-a+nofp+nomve %s -### 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-NOMVE +// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-a35+nofp+nomve %s -### 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-NOMVE +// CHECK: "-target-feature" "-dotprod" +// CHECK: "-target-feature" "-fp16fml" +// CHECK: "-target-feature" "-bf16" +// CHECK: "-target-feature" "-mve.fp" +// CHECK-NOMVE: "-target-feature" "-fpregs" Index: clang/test/CodeGen/arm-bf16-softfloat.c === --- clang/test/CodeGen/arm-bf16-softfloat.c +++ clang/test/CodeGen/arm-bf16-softfloat.c @@ -1,4 +1,9 @@ -// RUN: not %clang -o %t.out -target arm-arm-eabi -march=armv8-a+bf16 -mfloat-abi=soft -c %s 
2>&1 | FileCheck %s +// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16 -mfloat-abi=soft -c %s -o %t 2>&1 | FileCheck %s +// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16 -mfpu=none -c %s -o %t 2>&1 | FileCheck %s +// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16+nofp -c %s -o %t 2>&1 | FileCheck %
[PATCH] D81847: [ARM] Improve diagnostics message when Neon is unsupported
vhscampos created this revision.
Herald added subscribers: cfe-commits, danielkiss, kristof.beyls.
Herald added a project: clang.
vhscampos added a reviewer: stuij.

Whenever Neon is not supported, a generic message is printed:

  error: "NEON support not enabled"

It is followed by a series of other error messages that are not useful once the first one has been printed.

This patch gives a more precise message in the case where Neon is unsupported because an invalid float ABI was specified: the soft float ABI.

  error: "NEON intrinsics not available with the soft-float ABI. Please use -mfloat-abi=softfp or -mfloat-abi=hard"

This is the same message that GCC gives, so the change also makes the two compilers' diagnostics more consistent with each other.

Also, by rearranging preprocessor directives, these "unsupported" error messages are now the only ones printed out, which is also GCC's behaviour.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D81847

Files:
  clang/utils/TableGen/NeonEmitter.cpp

Index: clang/utils/TableGen/NeonEmitter.cpp
===
--- clang/utils/TableGen/NeonEmitter.cpp
+++ clang/utils/TableGen/NeonEmitter.cpp
@@ -2311,9 +2311,14 @@
   OS << "#ifndef __ARM_NEON_H\n";
   OS << "#define __ARM_NEON_H\n\n";
 
+  OS << "#ifndef __ARM_FP\n";
+  OS << "#error \"NEON intrinsics not available with the soft-float ABI. "
+        "Please use -mfloat-abi=softfp or -mfloat-abi=hard\"\n";
+  OS << "#else\n\n";
+
   OS << "#if !defined(__ARM_NEON)\n";
   OS << "#error \"NEON support not enabled\"\n";
-  OS << "#endif\n\n";
+  OS << "#else\n\n";
 
   OS << "#include \n\n";
 
@@ -2403,6 +2408,8 @@
   OS << "\n";
   OS << "#undef __ai\n\n";
+  OS << "#endif /* if !defined(__ARM_NEON) */\n";
+  OS << "#endif /* ifndef __ARM_FP */\n";
   OS << "#endif /* __ARM_NEON_H */\n";
 }
[PATCH] D82947: [Driver][ARM] Disable unsupported features when nofp arch extension is used
vhscampos created this revision.
Herald added subscribers: cfe-commits, danielkiss, kristof.beyls.
Herald added a project: clang.
vhscampos abandoned this revision.

A list of target features is disabled when there is no hardware floating-point support. This is the case when one of the following options is passed to clang:

- -mfloat-abi=soft
- -mfpu=none

This list of options, however, does not cover the "+nofp" extension that can be specified in -march flags, such as "-march=armv8-a+nofp". This patch also disables unsupported target features when nofp is passed to -march.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82947

Files:
  clang/lib/Driver/ToolChains/Arch/ARM.cpp

Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -285,6 +285,35 @@
          (NoMVE == F.rend() || std::distance(MVE, NoMVE) > 0);
 }
 
+static void appendNoFPUnsupportedFeatures(const arm::FloatABI ABI,
+                                          const unsigned FPUID,
+                                          const StringRef &ArchName,
+                                          std::vector &Features) {
+  auto checkFPDisabledInArchName = [](const StringRef &ArchName) {
+    SmallVector Split;
+    ArchName.split(Split, '+', -1, false);
+    return llvm::any_of(
+        Split, [](const StringRef &Extension) { return Extension == "nofp"; });
+  };
+
+  if (ABI == arm::FloatABI::Soft) {
+    llvm::ARM::getFPUFeatures(llvm::ARM::FK_NONE, Features);
+
+    // Disable all features relating to hardware FP, not already disabled by the
+    // above call.
+    Features.insert(Features.end(),
+                    {"-dotprod", "-fp16fml", "-mve", "-mve.fp", "-fpregs"});
+  } else if (FPUID == llvm::ARM::FK_NONE ||
+             checkFPDisabledInArchName(ArchName)) {
+    // -mfpu=none or -march=armvX+nofp is *very* similar to -mfloat-abi=soft,
+    // only that it should not disable MVE-I.
+    Features.insert(Features.end(), {"-dotprod", "-fp16fml", "-mve.fp"});
+    if (!hasIntegerMVE(Features)) {
+      Features.emplace_back("-fpregs");
+    }
+  }
+}
+
 void arm::getARMTargetFeatures(const Driver &D, const llvm::Triple &Triple,
                                const ArgList &Args, ArgStringList &CmdArgs,
                                std::vector &Features, bool ForAS) {
@@ -455,23 +484,10 @@
     Features.push_back("+fullfp16");
   }
 
-  // Setting -msoft-float/-mfloat-abi=soft effectively disables the FPU (GCC
-  // ignores the -mfpu options in this case).
-  // Note that the ABI can also be set implicitly by the target selected.
-  if (ABI == arm::FloatABI::Soft) {
-    llvm::ARM::getFPUFeatures(llvm::ARM::FK_NONE, Features);
-
-    // Disable all features relating to hardware FP, not already disabled by the
-    // above call.
-    Features.insert(Features.end(),
-                    {"-dotprod", "-fp16fml", "-mve", "-mve.fp", "-fpregs"});
-  } else if (FPUID == llvm::ARM::FK_NONE) {
-    // -mfpu=none is *very* similar to -mfloat-abi=soft, only that it should not
-    // disable MVE-I.
-    Features.insert(Features.end(), {"-dotprod", "-fp16fml", "-mve.fp"});
-    if (!hasIntegerMVE(Features))
-      Features.emplace_back("-fpregs");
-  }
+  // Setting -msoft-float/-mfloat-abi=soft, -mfpu=none, or adding +nofp to
+  // -march effectively disables the FPU (GCC ignores the -mfpu options in this
+  // case). Note that the ABI can also be set implicitly by the target selected.
+  appendNoFPUnsupportedFeatures(ABI, FPUID, ArchName, Features);
 
   // En/disable crc code generation.
   if (Arg *A = Args.getLastArg(options::OPT_mcrc, options::OPT_mnocrc)) {
[PATCH] D82949: [Driver][ARM] Disable bf16 when hardware FP support is missing
vhscampos created this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.
vhscampos retitled this revision from "Disable bf16 when hardware FP support is missing" to "[Driver][ARM] Disable bf16 when hardware FP support is missing".
vhscampos edited the summary of this revision.
Herald added subscribers: danielkiss, kristof.beyls.
vhscampos added a reviewer: chill.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82949

Files:
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/CodeGen/arm-bf16-softfloat.c

Index: clang/test/CodeGen/arm-bf16-softfloat.c
===
--- clang/test/CodeGen/arm-bf16-softfloat.c
+++ clang/test/CodeGen/arm-bf16-softfloat.c
@@ -1,4 +1,6 @@
-// RUN: not %clang -o %t.out -target arm-arm-eabi -march=armv8-a+bf16 -mfloat-abi=soft -c %s 2>&1 | FileCheck %s
+// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16 -mfloat-abi=soft -c %s -o %t 2>&1 | FileCheck %s
+// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16 -mfpu=none -c %s -o %t 2>&1 | FileCheck %s
+// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16+nofp -c %s -o %t 2>&1 | FileCheck %s
 
 // CHECK: error: __bf16 is not supported on this target
 extern __bf16 var;
Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -301,13 +301,14 @@
 
     // Disable all features relating to hardware FP, not already disabled by the
     // above call.
-    Features.insert(Features.end(),
-                    {"-dotprod", "-fp16fml", "-mve", "-mve.fp", "-fpregs"});
+    Features.insert(Features.end(), {"-dotprod", "-fp16fml", "-bf16", "-mve",
+                                     "-mve.fp", "-fpregs"});
   } else if (FPUID == llvm::ARM::FK_NONE ||
              checkFPDisabledInArchName(ArchName)) {
     // -mfpu=none or -march=armvX+nofp is *very* similar to -mfloat-abi=soft,
     // only that it should not disable MVE-I.
-    Features.insert(Features.end(), {"-dotprod", "-fp16fml", "-mve.fp"});
+    Features.insert(Features.end(),
+                    {"-dotprod", "-fp16fml", "-bf16", "-mve.fp"});
     if (!hasIntegerMVE(Features)) {
       Features.emplace_back("-fpregs");
     }
[PATCH] D82948: [Driver][ARM] Disable unsupported features when nofp arch extension is used
vhscampos created this revision.
Herald added subscribers: cfe-commits, danielkiss, kristof.beyls.
Herald added a project: clang.
vhscampos added a child revision: D82949: [Driver][ARM] Disable bf16 when hardware FP support is missing.
vhscampos added a reviewer: chill.

A list of target features is disabled when there is no hardware floating-point support. This is the case when one of the following options is passed to clang:

- -mfloat-abi=soft
- -mfpu=none

This list of options, however, does not cover the "+nofp" extension that can be specified in -march flags, such as "-march=armv8-a+nofp". This patch also disables unsupported target features when nofp is passed to -march.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82948

Files:
  clang/lib/Driver/ToolChains/Arch/ARM.cpp

Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -285,6 +285,35 @@
          (NoMVE == F.rend() || std::distance(MVE, NoMVE) > 0);
 }
 
+static void appendNoFPUnsupportedFeatures(const arm::FloatABI ABI,
+                                          const unsigned FPUID,
+                                          const StringRef &ArchName,
+                                          std::vector &Features) {
+  auto checkFPDisabledInArchName = [](const StringRef &ArchName) {
+    SmallVector Split;
+    ArchName.split(Split, '+', -1, false);
+    return llvm::any_of(
+        Split, [](const StringRef &Extension) { return Extension == "nofp"; });
+  };
+
+  if (ABI == arm::FloatABI::Soft) {
+    llvm::ARM::getFPUFeatures(llvm::ARM::FK_NONE, Features);
+
+    // Disable all features relating to hardware FP, not already disabled by the
+    // above call.
+    Features.insert(Features.end(),
+                    {"-dotprod", "-fp16fml", "-mve", "-mve.fp", "-fpregs"});
+  } else if (FPUID == llvm::ARM::FK_NONE ||
+             checkFPDisabledInArchName(ArchName)) {
+    // -mfpu=none or -march=armvX+nofp is *very* similar to -mfloat-abi=soft,
+    // only that it should not disable MVE-I.
+    Features.insert(Features.end(), {"-dotprod", "-fp16fml", "-mve.fp"});
+    if (!hasIntegerMVE(Features)) {
+      Features.emplace_back("-fpregs");
+    }
+  }
+}
+
 void arm::getARMTargetFeatures(const Driver &D, const llvm::Triple &Triple,
                                const ArgList &Args, ArgStringList &CmdArgs,
                                std::vector &Features, bool ForAS) {
@@ -455,23 +484,10 @@
     Features.push_back("+fullfp16");
   }
 
-  // Setting -msoft-float/-mfloat-abi=soft effectively disables the FPU (GCC
-  // ignores the -mfpu options in this case).
-  // Note that the ABI can also be set implicitly by the target selected.
-  if (ABI == arm::FloatABI::Soft) {
-    llvm::ARM::getFPUFeatures(llvm::ARM::FK_NONE, Features);
-
-    // Disable all features relating to hardware FP, not already disabled by the
-    // above call.
-    Features.insert(Features.end(),
-                    {"-dotprod", "-fp16fml", "-mve", "-mve.fp", "-fpregs"});
-  } else if (FPUID == llvm::ARM::FK_NONE) {
-    // -mfpu=none is *very* similar to -mfloat-abi=soft, only that it should not
-    // disable MVE-I.
-    Features.insert(Features.end(), {"-dotprod", "-fp16fml", "-mve.fp"});
-    if (!hasIntegerMVE(Features))
-      Features.emplace_back("-fpregs");
-  }
+  // Setting -msoft-float/-mfloat-abi=soft, -mfpu=none, or adding +nofp to
+  // -march effectively disables the FPU (GCC ignores the -mfpu options in this
+  // case). Note that the ABI can also be set implicitly by the target selected.
+  appendNoFPUnsupportedFeatures(ABI, FPUID, ArchName, Features);
 
   // En/disable crc code generation.
   if (Arg *A = Args.getLastArg(options::OPT_mcrc, options::OPT_mnocrc)) {
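The decision made by `appendNoFPUnsupportedFeatures` can be sketched without any LLVM types. The version below is a simplified, hypothetical model: `noFPDisabledFeatures` and `contains` are made-up names, the FPU kind, the `+nofp` check and `hasIntegerMVE` are reduced to plain booleans, and the extra features that `getFPUFeatures(FK_NONE, ...)` adds in the soft-float case are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class FloatABI { Soft, SoftFP, Hard };

// Hypothetical stand-in for the patch's helper. Returns the features that
// would be disabled for the given combination of options.
std::vector<std::string> noFPDisabledFeatures(FloatABI ABI, bool FPUIsNone,
                                              bool ArchHasNoFP,
                                              bool HasIntegerMVE) {
  std::vector<std::string> Features;
  if (ABI == FloatABI::Soft) {
    // Soft float ABI: everything related to hardware FP goes, including MVE.
    Features = {"-dotprod", "-fp16fml", "-mve", "-mve.fp", "-fpregs"};
  } else if (FPUIsNone || ArchHasNoFP) {
    // -mfpu=none or +nofp: like soft float, but integer MVE must survive,
    // and it needs the FP registers.
    Features = {"-dotprod", "-fp16fml", "-mve.fp"};
    if (!HasIntegerMVE)
      Features.push_back("-fpregs");
  }
  return Features;
}

// Small lookup helper used to inspect the result.
bool contains(const std::vector<std::string> &Features,
              const std::string &Name) {
  return std::find(Features.begin(), Features.end(), Name) != Features.end();
}
```

Under this model, `+nofp` with integer MVE enabled keeps both `mve` and `fpregs`, while `-mfloat-abi=soft` removes them regardless.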
[PATCH] D82946: [Driver][ARM] Disable bf16 when hardware FP support is missing
vhscampos created this revision.
Herald added subscribers: cfe-commits, danielkiss, kristof.beyls.
Herald added a project: clang.
vhscampos abandoned this revision.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82946

Files:
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/CodeGen/arm-bf16-softfloat.c

Index: clang/test/CodeGen/arm-bf16-softfloat.c
===
--- clang/test/CodeGen/arm-bf16-softfloat.c
+++ clang/test/CodeGen/arm-bf16-softfloat.c
@@ -1,4 +1,6 @@
-// RUN: not %clang -o %t.out -target arm-arm-eabi -march=armv8-a+bf16 -mfloat-abi=soft -c %s 2>&1 | FileCheck %s
+// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16 -mfloat-abi=soft -c %s -o %t 2>&1 | FileCheck %s
+// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16 -mfpu=none -c %s -o %t 2>&1 | FileCheck %s
+// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16+nofp -c %s -o %t 2>&1 | FileCheck %s
 
 // CHECK: error: __bf16 is not supported on this target
 extern __bf16 var;
Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -301,13 +301,14 @@
 
     // Disable all features relating to hardware FP, not already disabled by the
     // above call.
-    Features.insert(Features.end(),
-                    {"-dotprod", "-fp16fml", "-mve", "-mve.fp", "-fpregs"});
+    Features.insert(Features.end(), {"-dotprod", "-fp16fml", "-bf16", "-mve",
+                                     "-mve.fp", "-fpregs"});
   } else if (FPUID == llvm::ARM::FK_NONE ||
              checkFPDisabledInArchName(ArchName)) {
     // -mfpu=none or -march=armvX+nofp is *very* similar to -mfloat-abi=soft,
     // only that it should not disable MVE-I.
-    Features.insert(Features.end(), {"-dotprod", "-fp16fml", "-mve.fp"});
+    Features.insert(Features.end(),
+                    {"-dotprod", "-fp16fml", "-bf16", "-mve.fp"});
     if (!hasIntegerMVE(Features)) {
       Features.emplace_back("-fpregs");
     }
[PATCH] D82948: [Driver][ARM] Disable unsupported features when nofp arch extension is used
vhscampos marked 2 inline comments as done. vhscampos added a comment.

I will merge the two patches into one. Please also see my inline responses.

Comment at: clang/lib/Driver/ToolChains/Arch/ARM.cpp:288
+static void appendNoFPUnsupportedFeatures(const arm::FloatABI ABI,
+ const unsigned FPUID,

chill wrote:
> That's kinda mouthful name.

Not sure how to compress it more without making it meaningless, though.

Comment at: clang/lib/Driver/ToolChains/Arch/ARM.cpp:292-297
+ auto checkFPDisabledInArchName = [](const StringRef &ArchName) {
+SmallVector Split;
+ArchName.split(Split, '+', -1, false);
+return llvm::any_of(
+Split, [](const StringRef &Extension) { return Extension == "nofp"; });
+ };

chill wrote:
> chill wrote:
> > Wouldn't just looking for the substring do the job?
> >
> > Also need to handle `-mcpu=...+nofp`.
> >
> > We already "parse" the arguments to `-march=` and `-mcpu=` (and `-mfpu=`)
> > earlier, it seems to me we
> > could note the `+nofp` and `+nofp.dp` earlier. (TBH, it isn't immediately
> > obvious to me how to untangle this mess).
> >
> Hmm, actually, `+nofp.dp` should not disable the FPU, I think.

Just looking for the substring might be sufficient indeed.

Yes, we already do `-march`/`-mcpu` parsing a bit earlier. However, this parsing and the subsequent handling of its result are done deeper in the call stack. I wondered about ways to propagate this information back to this point (e.g. adding one more by-ref argument that is set by the first round of parsing), but I am not confident about that approach.

Are you okay with me just changing it to a substring search?

Repository: rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82948/new/
https://reviews.llvm.org/D82948
[PATCH] D82948: [Driver][ARM] Disable unsupported features when nofp arch extension is used
vhscampos marked 2 inline comments as done. vhscampos added inline comments.

Comment at: clang/lib/Driver/ToolChains/Arch/ARM.cpp:292-297
+ auto checkFPDisabledInArchName = [](const StringRef &ArchName) {
+SmallVector Split;
+ArchName.split(Split, '+', -1, false);
+return llvm::any_of(
+Split, [](const StringRef &Extension) { return Extension == "nofp"; });
+ };

vhscampos wrote:
> chill wrote:
> > chill wrote:
> > > Wouldn't just looking for the substring do the job?
> > >
> > > Also need to handle `-mcpu=...+nofp`.
> > >
> > > We already "parse" the arguments to `-march=` and `-mcpu=` (and `-mfpu=`)
> > > earlier, it seems to me we
> > > could note the `+nofp` and `+nofp.dp` earlier. (TBH, it isn't immediately
> > > obvious to me how to untangle this mess).
> > >
> > Hmm, actually, `+nofp.dp` should not disable the FPU, I think.
> Just looking for the substring might be sufficient indeed.
>
> Yes, we already do `-march`/`-mcpu` parsing a bit earlier. However, this
> parsing and the following handling of it is done deeper in the call stack. I
> wondered about ways to propagate this information back to this point here
> (e.g. adding one more by-ref argument that is set by the first round of
> parsing), but I don't feel confident to back it up.
>
> Are you okay with me just changing it to a substring search?

Actually, it may be better to keep the string-splitting method. The search required here must be whole-word, so that it flags "+nofp" but not "+nofp.dp". Reusing the current list of tokens takes less code than a substring search followed by an "is it whole-word?" check.

Comment at: clang/lib/Driver/ToolChains/Arch/ARM.cpp:296
+return llvm::any_of(
+Split, [](const StringRef &Extension) { return Extension == "nofp"; });
+ };

DavidSpickett wrote:
> I would check what this does:
> $ ./bin/clang -target arm-arm-none-eabi /tmp/test.c -c
> -march=armv8.1-a+nofp+fp -###
>
> I'm not sure if by this point we've resolved "+nofp+fp" to simply "+fp".
> (perhaps you'd end up with a bunch of "-" followed by the same as
> "+". Either way I'd expect +fp to win.

Good catch. I will be sure to cover this case.

Repository: rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82948/new/
https://reviews.llvm.org/D82948
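The difference between the whole-token check and a plain substring search discussed in this thread can be sketched as follows. This is only an illustration using the standard library rather than `StringRef`; the helper names `splitOnPlus`, `hasNoFPToken` and `hasNoFPSubstring` are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split a string such as "armv8-a+bf16+nofp" on '+', dropping empty tokens.
// (Hypothetical helper; the patch itself uses StringRef::split.)
static std::vector<std::string> splitOnPlus(const std::string &Text) {
  std::vector<std::string> Tokens;
  std::string::size_type Start = 0;
  while (Start <= Text.size()) {
    std::string::size_type End = Text.find('+', Start);
    if (End == std::string::npos)
      End = Text.size();
    if (End > Start)
      Tokens.push_back(Text.substr(Start, End - Start));
    Start = End + 1;
  }
  return Tokens;
}

// Whole-token check: true only if some extension is exactly "nofp".
static bool hasNoFPToken(const std::string &ArchName) {
  for (const std::string &Ext : splitOnPlus(ArchName))
    if (Ext == "nofp")
      return true;
  return false;
}

// Naive substring check, shown only for contrast: it also matches "+nofp.dp".
static bool hasNoFPSubstring(const std::string &ArchName) {
  return ArchName.find("+nofp") != std::string::npos;
}
```

With these helpers, `hasNoFPToken("armv8-a+nofp.dp")` is false while `hasNoFPSubstring("armv8-a+nofp.dp")` is true, which is the false positive the whole-word requirement avoids. Neither helper accounts for a later "+fp" overriding an earlier "+nofp", the case raised in the last inline comment.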
[PATCH] D82948: [Driver][ARM] Disable unsupported features when nofp arch extension is used
vhscampos updated this revision to Diff 275392. vhscampos added a comment. Herald added subscribers: llvm-commits, hiraditya. Herald added a project: LLVM. 1. Merged the second patch into this (handle bf16). 2. Do the same treatment for -mcpu. 3. Instead of doing string search once again, return the desired information in the first time using a by-ref argument. 4. This new approach covers positional differences between +fp and +nofp. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82948/new/ https://reviews.llvm.org/D82948 Files: clang/lib/Driver/ToolChains/Arch/ARM.cpp clang/test/CodeGen/arm-bf16-softfloat.c llvm/include/llvm/Support/ARMTargetParser.h llvm/lib/Support/ARMTargetParser.cpp Index: llvm/lib/Support/ARMTargetParser.cpp === --- llvm/lib/Support/ARMTargetParser.cpp +++ llvm/lib/Support/ARMTargetParser.cpp @@ -490,9 +490,10 @@ return ARM::FK_INVALID; } -bool ARM::appendArchExtFeatures( - StringRef CPU, ARM::ArchKind AK, StringRef ArchExt, - std::vector &Features) { +bool ARM::appendArchExtFeatures(StringRef CPU, ARM::ArchKind AK, +StringRef ArchExt, +std::vector &Features, +unsigned &ArgFPUID) { size_t StartingNumFeatures = Features.size(); const bool Negated = stripNegationPrefix(ArchExt); @@ -527,6 +528,7 @@ } else { FPUKind = getDefaultFPU(CPU, AK); } +ArgFPUID = FPUKind; return ARM::getFPUFeatures(FPUKind, Features); } return StartingNumFeatures != Features.size(); Index: llvm/include/llvm/Support/ARMTargetParser.h === --- llvm/include/llvm/Support/ARMTargetParser.h +++ llvm/include/llvm/Support/ARMTargetParser.h @@ -250,7 +250,8 @@ StringRef getArchExtName(uint64_t ArchExtKind); StringRef getArchExtFeature(StringRef ArchExt); bool appendArchExtFeatures(StringRef CPU, ARM::ArchKind AK, StringRef ArchExt, - std::vector &Features); + std::vector &Features, + unsigned &ArgFPUKind); StringRef getHWDivName(uint64_t HWDivKind); // Information by Name Index: clang/test/CodeGen/arm-bf16-softfloat.c === --- 
clang/test/CodeGen/arm-bf16-softfloat.c +++ clang/test/CodeGen/arm-bf16-softfloat.c @@ -1,4 +1,9 @@ -// RUN: not %clang -o %t.out -target arm-arm-eabi -march=armv8-a+bf16 -mfloat-abi=soft -c %s 2>&1 | FileCheck %s +// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16 -mfloat-abi=soft -c %s -o %t 2>&1 | FileCheck %s +// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16 -mfpu=none -c %s -o %t 2>&1 | FileCheck %s +// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16+nofp -c %s -o %t 2>&1 | FileCheck %s +// RUN: not %clang -target arm-arm-none-eabi -march=armv8-a+bf16+fp+nofp -c %s -o %t 2>&1 | FileCheck %s +// RUN: %clang -target arm-arm-none-eabi -march=armv8-a+bf16+fp -c %s -o %t +// RUN: %clang -target arm-arm-none-eabi -march=armv8-a+bf16+nofp+fp -c %s -o %t // CHECK: error: __bf16 is not supported on this target extern __bf16 var; Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp === --- clang/lib/Driver/ToolChains/Arch/ARM.cpp +++ clang/lib/Driver/ToolChains/Arch/ARM.cpp @@ -73,14 +73,15 @@ } // Decode ARM features from string like +[no]featureA+[no]featureB+... 
-static bool DecodeARMFeatures(const Driver &D, StringRef text, - StringRef CPU, llvm::ARM::ArchKind ArchKind, - std::vector &Features) { +static bool DecodeARMFeatures(const Driver &D, StringRef text, StringRef CPU, + llvm::ARM::ArchKind ArchKind, + std::vector &Features, + unsigned &ArgFPUID) { SmallVector Split; text.split(Split, StringRef("+"), -1, false); for (StringRef Feature : Split) { -if (!appendArchExtFeatures(CPU, ArchKind, Feature, Features)) +if (!appendArchExtFeatures(CPU, ArchKind, Feature, Features, ArgFPUID)) return false; } return true; @@ -102,14 +103,14 @@ static void checkARMArchName(const Driver &D, const Arg *A, const ArgList &Args, llvm::StringRef ArchName, llvm::StringRef CPUName, std::vector &Features, - const llvm::Triple &Triple) { + const llvm::Triple &Triple, unsigned &ArgFPUID) { std::pair Split = ArchName.split("+"); std::string MArch = arm::getARMArch(ArchName, Triple); llvm::ARM::ArchKind ArchKind = llvm::ARM::parseArch(MArch); if (ArchKind == llvm::ARM::ArchKind::INVALID || - (Split.second.size() && !DecodeARMFeatures(
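The by-reference approach in this revision, where the parser itself reports the FPU state so that the position of +fp and +nofp matters, might look roughly like the sketch below. It is a simplified model under stated assumptions: `FPUState`, `appendExt` and `decodeFeatures` are hypothetical stand-ins, not the real `llvm::ARM` enum or the functions in the diff, and the feature strings are illustrative only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the FPU kind; not the real llvm::ARM enum.
enum FPUState { FPU_Unset, FPU_None, FPU_Default };

// Hypothetical analogue of appendArchExtFeatures: besides appending features,
// it records the FPU state chosen by "fp"/"nofp" through a by-ref argument.
static bool appendExt(const std::string &Ext,
                      std::vector<std::string> &Features, FPUState &ArgFPU) {
  if (Ext == "nofp") {
    ArgFPU = FPU_None;
    Features.push_back("-fpregs");
    return true;
  }
  if (Ext == "fp") {
    ArgFPU = FPU_Default;
    Features.push_back("+fpregs");
    return true;
  }
  if (Ext == "bf16") {
    Features.push_back("+bf16");
    return true;
  }
  return false; // Unknown extension.
}

// Walk "+ext1+ext2+..." left to right; a later fp/nofp overwrites an earlier
// one, so the caller sees whichever came last.
static bool decodeFeatures(const std::string &Text,
                           std::vector<std::string> &Features,
                           FPUState &ArgFPU) {
  std::string::size_type Start = 0;
  while (Start < Text.size()) {
    std::string::size_type End = Text.find('+', Start);
    if (End == std::string::npos)
      End = Text.size();
    if (End > Start &&
        !appendExt(Text.substr(Start, End - Start), Features, ArgFPU))
      return false;
    Start = End + 1;
  }
  return true;
}
```

In this model "bf16+nofp+fp" leaves the state at `FPU_Default` and "bf16+fp+nofp" leaves it at `FPU_None`, mirroring the new RUN lines in arm-bf16-softfloat.c, where the `+nofp+fp` spelling compiles and the `+fp+nofp` spelling is expected to fail.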
[PATCH] D81847: [ARM] Improve diagnostics message when Neon is unsupported
This revision was automatically updated to reflect the committed changes. Closed by commit rG1b090db0df47: [ARM] Improve diagnostics message when Neon is unsupported (authored by vhscampos). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D81847/new/ https://reviews.llvm.org/D81847 Files: clang/utils/TableGen/NeonEmitter.cpp Index: clang/utils/TableGen/NeonEmitter.cpp === --- clang/utils/TableGen/NeonEmitter.cpp +++ clang/utils/TableGen/NeonEmitter.cpp @@ -2312,9 +2312,14 @@ OS << "#ifndef __ARM_NEON_H\n"; OS << "#define __ARM_NEON_H\n\n"; + OS << "#ifndef __ARM_FP\n"; + OS << "#error \"NEON intrinsics not available with the soft-float ABI. " +"Please use -mfloat-abi=softfp or -mfloat-abi=hard\"\n"; + OS << "#else\n\n"; + OS << "#if !defined(__ARM_NEON)\n"; OS << "#error \"NEON support not enabled\"\n"; - OS << "#endif\n\n"; + OS << "#else\n\n"; OS << "#include \n\n"; @@ -2404,6 +2409,8 @@ OS << "\n"; OS << "#undef __ai\n\n"; + OS << "#endif /* if !defined(__ARM_NEON) */\n"; + OS << "#endif /* ifndef __ARM_FP */\n"; OS << "#endif /* __ARM_NEON_H */\n"; }
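To make the resulting guard layout easier to see, here is a small stand-alone sketch that writes the same nesting as the diff around a caller-supplied body. It is not the NeonEmitter code: `emitNeonHeader` and `countLinesStartingWith` are hypothetical helpers, and the body text is a placeholder. The point it illustrates is that each `#error` branch now has an `#else`, so the rest of the header is skipped once an error has been reported.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Writes the guard structure from the patch around Body (hypothetical helper).
static void emitNeonHeader(std::ostream &OS, const std::string &Body) {
  OS << "#ifndef __ARM_NEON_H\n";
  OS << "#define __ARM_NEON_H\n\n";
  OS << "#ifndef __ARM_FP\n";
  OS << "#error \"NEON intrinsics not available with the soft-float ABI. "
        "Please use -mfloat-abi=softfp or -mfloat-abi=hard\"\n";
  OS << "#else\n\n";
  OS << "#if !defined(__ARM_NEON)\n";
  OS << "#error \"NEON support not enabled\"\n";
  OS << "#else\n\n";
  OS << Body; // Only reached when both checks pass.
  OS << "#endif /* if !defined(__ARM_NEON) */\n";
  OS << "#endif /* ifndef __ARM_FP */\n";
  OS << "#endif /* __ARM_NEON_H */\n";
}

// Counts the lines of Text that begin with Prefix.
static int countLinesStartingWith(const std::string &Text,
                                  const std::string &Prefix) {
  std::istringstream In(Text);
  std::string Line;
  int N = 0;
  while (std::getline(In, Line))
    if (Line.compare(0, Prefix.size(), Prefix) == 0)
      ++N;
  return N;
}
```

Counting directives in the emitted text is a cheap way to check that every opened conditional is closed: the three `#if`/`#ifndef` lines are matched by three `#endif` lines.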
[PATCH] D82949: [Driver][ARM] Disable bf16 when hardware FP support is missing
vhscampos added a comment. Not really. Closing it Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82949/new/ https://reviews.llvm.org/D82949
[PATCH] D111283: [clang] template / auto deduction deduces common sugar
vhscampos added a comment.

For reference, another small reproducer of the crash, but with a different stack trace than the first example posted here:

  // Must compile with -std=c++03 to crash
  #include
  int main(int, char**) {
    int i[3] = {1, 2, 3};
    int j[3] = {4, 5, 6};
    std::swap(i, j);
    return 0;
  }

Compile with -std=c++03 to reproduce the assertion failure. We found it by running the libcxx tests.

Repository: rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D111283/new/
https://reviews.llvm.org/D111283
[PATCH] D121983: Driver: Don't warn on -mbranch-protection when linking
vhscampos accepted this revision. vhscampos added a comment. This revision is now accepted and ready to land. LGTM, thanks Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D121983/new/ https://reviews.llvm.org/D121983
[PATCH] D88645: [Annotation] Allows annotation to carry some additional constant arguments.
vhscampos added a comment.

@Tyker This is causing another build failure in another example:

[2858/4034] Building CXX object tools/clang/examples/Attribute/CMakeFiles/Attribute.dir/Attribute.cpp.o
FAILED: tools/clang/examples/Attribute/CMakeFiles/Attribute.dir/Attribute.cpp.o
CCACHE_CPP2=yes CCACHE_HASHDIR=yes /usr/bin/ccache /usr/lib/ccache/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/clang/examples/Attribute -I/home/vicspe01/workspace/upstream/llvm-project/clang/examples/Attribute -I/home/vicspe01/workspace/upstream/llvm-project/clang/include -Itools/clang/include -Iinclude -I/home/vicspe01/workspace/upstream/llvm-project/llvm/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 -fPIC -fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT tools/clang/examples/Attribute/CMakeFiles/Attribute.dir/Attribute.cpp.o -MF tools/clang/examples/Attribute/CMakeFiles/Attribute.dir/Attribute.cpp.o.d -o tools/clang/examples/Attribute/CMakeFiles/Attribute.dir/Attribute.cpp.o -c /home/vicspe01/workspace/upstream/llvm-project/clang/examples/Attribute/Attribute.cpp
/home/vicspe01/workspace/upstream/llvm-project/clang/examples/Attribute/Attribute.cpp:73:16: error: no matching function for call to 'Create'
D->addAttr(AnnotateAttr::Create(S.Context, "example(" + Str.str() + ")",
^~~~
tools/clang/include/clang/AST/Attrs.inc:885:24: note: candidate function not viable: requires at least 4 arguments, but 3 were provided
static AnnotateAttr *Create(ASTContext &Ctx, llvm::StringRef Annotation, Expr * *Args, unsigned ArgsSize, const AttributeCommonInfo &CommonInfo = {SourceRange{}});
^
tools/clang/include/clang/AST/Attrs.inc:887:24: note: candidate function not viable: requires 6 arguments, but 3 were provided
static AnnotateAttr *Create(ASTContext &Ctx, llvm::StringRef Annotation, Expr * *Args, unsigned ArgsSize, SourceRange Range, AttributeCommonInfo::Syntax Syntax);
^
1 error generated.

Repository: rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D88645/new/
https://reviews.llvm.org/D88645
[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A
vhscampos updated this revision to Diff 373943. vhscampos added a comment. 1. Fix bug in "+sve2" feature position in the target features list. It was being inserted at the end, which made it impossible to disable it using +nosve2, as the positive option would always be placed after the negative one. 2. Add tests to the bug fix above. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D109517/new/ https://reviews.llvm.org/D109517 Files: clang/lib/Basic/Targets/AArch64.cpp clang/lib/Basic/Targets/AArch64.h clang/lib/Basic/Targets/ARM.cpp clang/lib/Driver/ToolChains/Arch/AArch64.cpp clang/test/Driver/aarch64-cpus.c clang/test/Driver/arm-cortex-cpus.c clang/test/Preprocessor/aarch64-target-features.c clang/test/Preprocessor/arm-target-features.c llvm/include/llvm/ADT/Triple.h llvm/include/llvm/Support/AArch64TargetParser.def llvm/include/llvm/Support/ARMTargetParser.def llvm/lib/Support/AArch64TargetParser.cpp llvm/lib/Support/ARMTargetParser.cpp llvm/lib/Support/Triple.cpp llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64InstrInfo.td llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp llvm/lib/Target/ARM/ARM.td llvm/lib/Target/ARM/ARMSubtarget.h llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp llvm/test/MC/AArch64/SME/directives-negative.s llvm/test/MC/AArch64/SME/directives.s llvm/test/MC/AArch64/SVE2/directive-arch-negative.s llvm/test/MC/AArch64/SVE2/directive-arch.s llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -31,6 +31,7 @@ "armv8.5a","armv8.6-a","armv8.6a","armv8.7-a","armv8.7a", "armv8-r", "armv8r", "armv8-m.base","armv8m.base", "armv8-m.main", "armv8m.main", "iwmmxt", "iwmmxt2", "xscale", "armv8.1-m.main", +"armv9-a", "armv9.1-a","armv9.2-a", }; template @@ -492,6 +493,15 @@ EXPECT_TRUE( 
testARMArch("armv8.7-a", "generic", "v8.7a", ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9-a", "generic", "v9a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9.1-a", "generic", "v9.1a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9.2-a", "generic", "v9.2a", + ARMBuildAttrs::CPUArch::v8_A)); EXPECT_TRUE( testARMArch("armv8-r", "cortex-r52", "v8r", ARMBuildAttrs::CPUArch::v8_R)); @@ -821,6 +831,9 @@ case ARM::ArchKind::ARMV8_5A: case ARM::ArchKind::ARMV8_6A: case ARM::ArchKind::ARMV8_7A: +case ARM::ArchKind::ARMV9A: +case ARM::ArchKind::ARMV9_1A: +case ARM::ArchKind::ARMV9_2A: EXPECT_EQ(ARM::ProfileKind::A, ARM::parseArchProfile(ARMArch[i])); break; default: @@ -1204,6 +1217,12 @@ ARMBuildAttrs::CPUArch::v8_A)); EXPECT_TRUE(testAArch64Arch("armv8.7-a", "generic", "v8.7a", ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9-a", "generic", "v9a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9.1-a", "generic", "v9.1a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9.2-a", "generic", "v9.2a", + ARMBuildAttrs::CPUArch::v8_A)); } bool testAArch64Extension(StringRef CPUName, AArch64::ArchKind AK, Index: llvm/test/MC/AArch64/SVE2/directive-arch.s === --- llvm/test/MC/AArch64/SVE2/directive-arch.s +++ llvm/test/MC/AArch64/SVE2/directive-arch.s @@ -1,21 +1,21 @@ // RUN: llvm-mc -triple aarch64 -filetype asm -o - %s 2>&1 | FileCheck %s -.arch armv8-a+sve2 +.arch armv9-a+sve2 tbx z0.b, z1.b, z2.b // CHECK: tbx z0.b, z1.b, z2.b -.arch armv8-a+sve2-aes +.arch armv9-a+sve2-aes aesd z23.b, z23.b, z13.b // CHECK: aesd z23.b, z23.b, z13.b -.arch armv8-a+sve2-sm4 +.arch armv9-a+sve2-sm4 sm4e z0.s, z0.s, z0.s // CHECK: sm4e z0.s, z0.s, z0.s -.arch armv8-a+sve2-sha3 +.arch armv9-a+sve2-sha3 rax1 z0.d, z0.d, z0.d // CHECK: rax1 z0.d, z0.d, z0.d -.arch armv8-a+sve2-bitperm +.arch armv9-a+sve2-bitperm bgrp z21.s, z10.s, z21.s // CHECK: bgrp z21.s, z10.s, z21.s 
Index: llvm/test/MC/AArch64/SVE2/directive-arch-negative.s === --- llvm/test/MC/AArch64/SVE2/directive-arch-negative.s +++ llvm/test/MC/AArch64/SVE2/directive-arch-negative.s @@ -1,31 +1,31 @@ // RUN: not llvm-mc -triple aarch64 -filetype asm -o - %s 2>&1 | FileCheck %s -.arch armv8-a+sve2 -.a
[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A
vhscampos requested review of this revision. vhscampos added a comment. Sorry @SjoerdMeijer, I found a bug in the implementation (as described in the latest comment). I therefore kindly ask for another round of review. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D109517/new/ https://reviews.llvm.org/D109517
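The ordering bug described for Diff 373943 can be modelled with a short sketch. It assumes, as that comment implies, that the last mention of a feature in the target features list wins; `isEnabled`, `withDefaultAtEnd` and `withDefaultAtFront` are hypothetical helpers written for illustration and are not functions from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Resolve a feature list under the assumption that the last "+name" or
// "-name" entry for a feature decides whether it is enabled.
static bool isEnabled(const std::vector<std::string> &Features,
                      const std::string &Name) {
  bool Enabled = false;
  for (const std::string &F : Features) {
    if (F == "+" + Name)
      Enabled = true;
    else if (F == "-" + Name)
      Enabled = false;
  }
  return Enabled;
}

// Buggy placement: the architecture default is appended after the user's
// options, so it overrides an explicit "-sve2".
static std::vector<std::string> withDefaultAtEnd(std::vector<std::string> User) {
  User.push_back(std::string("+sve2"));
  return User;
}

// Fixed placement: the default goes first, so user options can override it.
static std::vector<std::string>
withDefaultAtFront(std::vector<std::string> User) {
  User.insert(User.begin(), std::string("+sve2"));
  return User;
}
```

With the default appended at the end, a user-supplied "-sve2" has no effect; with the default placed first, the same option disables the feature, which is the behaviour +nosve2 is expected to have.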
[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A
vhscampos updated this revision to Diff 374162. vhscampos added a comment. Add missing . to end of sentences in comments. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D109517/new/ https://reviews.llvm.org/D109517 Files: clang/lib/Basic/Targets/AArch64.cpp clang/lib/Basic/Targets/AArch64.h clang/lib/Basic/Targets/ARM.cpp clang/lib/Driver/ToolChains/Arch/AArch64.cpp clang/test/Driver/aarch64-cpus.c clang/test/Driver/arm-cortex-cpus.c clang/test/Preprocessor/aarch64-target-features.c clang/test/Preprocessor/arm-target-features.c llvm/include/llvm/ADT/Triple.h llvm/include/llvm/Support/AArch64TargetParser.def llvm/include/llvm/Support/ARMTargetParser.def llvm/lib/Support/AArch64TargetParser.cpp llvm/lib/Support/ARMTargetParser.cpp llvm/lib/Support/Triple.cpp llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64InstrInfo.td llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp llvm/lib/Target/ARM/ARM.td llvm/lib/Target/ARM/ARMSubtarget.h llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp llvm/test/MC/AArch64/SME/directives-negative.s llvm/test/MC/AArch64/SME/directives.s llvm/test/MC/AArch64/SVE2/directive-arch-negative.s llvm/test/MC/AArch64/SVE2/directive-arch.s llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -31,6 +31,7 @@ "armv8.5a","armv8.6-a","armv8.6a","armv8.7-a","armv8.7a", "armv8-r", "armv8r", "armv8-m.base","armv8m.base", "armv8-m.main", "armv8m.main", "iwmmxt", "iwmmxt2", "xscale", "armv8.1-m.main", +"armv9-a", "armv9.1-a","armv9.2-a", }; template @@ -492,6 +493,15 @@ EXPECT_TRUE( testARMArch("armv8.7-a", "generic", "v8.7a", ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9-a", "generic", "v9a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9.1-a", "generic", "v9.1a", + 
ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9.2-a", "generic", "v9.2a", + ARMBuildAttrs::CPUArch::v8_A)); EXPECT_TRUE( testARMArch("armv8-r", "cortex-r52", "v8r", ARMBuildAttrs::CPUArch::v8_R)); @@ -821,6 +831,9 @@ case ARM::ArchKind::ARMV8_5A: case ARM::ArchKind::ARMV8_6A: case ARM::ArchKind::ARMV8_7A: +case ARM::ArchKind::ARMV9A: +case ARM::ArchKind::ARMV9_1A: +case ARM::ArchKind::ARMV9_2A: EXPECT_EQ(ARM::ProfileKind::A, ARM::parseArchProfile(ARMArch[i])); break; default: @@ -1204,6 +1217,12 @@ ARMBuildAttrs::CPUArch::v8_A)); EXPECT_TRUE(testAArch64Arch("armv8.7-a", "generic", "v8.7a", ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9-a", "generic", "v9a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9.1-a", "generic", "v9.1a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9.2-a", "generic", "v9.2a", + ARMBuildAttrs::CPUArch::v8_A)); } bool testAArch64Extension(StringRef CPUName, AArch64::ArchKind AK, Index: llvm/test/MC/AArch64/SVE2/directive-arch.s === --- llvm/test/MC/AArch64/SVE2/directive-arch.s +++ llvm/test/MC/AArch64/SVE2/directive-arch.s @@ -1,21 +1,21 @@ // RUN: llvm-mc -triple aarch64 -filetype asm -o - %s 2>&1 | FileCheck %s -.arch armv8-a+sve2 +.arch armv9-a+sve2 tbx z0.b, z1.b, z2.b // CHECK: tbx z0.b, z1.b, z2.b -.arch armv8-a+sve2-aes +.arch armv9-a+sve2-aes aesd z23.b, z23.b, z13.b // CHECK: aesd z23.b, z23.b, z13.b -.arch armv8-a+sve2-sm4 +.arch armv9-a+sve2-sm4 sm4e z0.s, z0.s, z0.s // CHECK: sm4e z0.s, z0.s, z0.s -.arch armv8-a+sve2-sha3 +.arch armv9-a+sve2-sha3 rax1 z0.d, z0.d, z0.d // CHECK: rax1 z0.d, z0.d, z0.d -.arch armv8-a+sve2-bitperm +.arch armv9-a+sve2-bitperm bgrp z21.s, z10.s, z21.s // CHECK: bgrp z21.s, z10.s, z21.s Index: llvm/test/MC/AArch64/SVE2/directive-arch-negative.s === --- llvm/test/MC/AArch64/SVE2/directive-arch-negative.s +++ llvm/test/MC/AArch64/SVE2/directive-arch-negative.s @@ -1,31 +1,31 @@ // RUN: not llvm-mc -triple aarch64 
-filetype asm -o - %s 2>&1 | FileCheck %s -.arch armv8-a+sve2 -.arch armv8-a+nosve2 +.arch armv9-a+sve2 +.arch armv9-a+nosve2 tbx z0.b, z1.b, z2.b // CHECK: error: instruction requires: streaming-sve or sve2 // CHECK-NEXT: tbx z0.b, z1.b, z2.b -.arch armv8-a+sve2-aes -.arch a
[PATCH] D110241: [docs] List support for Armv9-A, Armv9.1-A and Armv9.2-A in LLVM and Clang
vhscampos created this revision. Herald added a subscriber: kristof.beyls. vhscampos requested review of this revision. Herald added projects: clang, LLVM. Herald added subscribers: llvm-commits, cfe-commits. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D110241 Files: clang/docs/ReleaseNotes.rst llvm/docs/ReleaseNotes.rst Index: llvm/docs/ReleaseNotes.rst === --- llvm/docs/ReleaseNotes.rst +++ llvm/docs/ReleaseNotes.rst @@ -72,12 +72,12 @@ Changes to the AArch64 Backend -- -* ... +* Added support for Armv9-A, Armv9.1-A and Armv9.2-A architectures. Changes to the ARM Backend -- -During this release ... +* Added support for Armv9-A, Armv9.1-A and Armv9.2-A architectures. Changes to the MIPS Target -- Index: clang/docs/ReleaseNotes.rst === --- clang/docs/ReleaseNotes.rst +++ clang/docs/ReleaseNotes.rst @@ -83,6 +83,12 @@ - RISC-V SiFive S54 (``sifive-s54``). - RISC-V SiFive S76 (``sifive-s76``). +- Support has been added for the following architectures (``-march`` identifiers in parentheses): + + - Armv9-A (``armv9-a``). + - Armv9.1-A (``armv9.1-a``). + - Armv9.2-A (``armv9.2-a``). + Removed Compiler Flags -
[PATCH] D110241: [docs] List support for Armv9-A, Armv9.1-A and Armv9.2-A in LLVM and Clang
vhscampos updated this revision to Diff 374207. vhscampos added a comment. Added 'the' for better phrasing. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110241/new/ https://reviews.llvm.org/D110241 Files: clang/docs/ReleaseNotes.rst llvm/docs/ReleaseNotes.rst Index: llvm/docs/ReleaseNotes.rst === --- llvm/docs/ReleaseNotes.rst +++ llvm/docs/ReleaseNotes.rst @@ -72,12 +72,12 @@ Changes to the AArch64 Backend -- -* ... +* Added support for the Armv9-A, Armv9.1-A and Armv9.2-A architectures. Changes to the ARM Backend -- -During this release ... +* Added support for the Armv9-A, Armv9.1-A and Armv9.2-A architectures. Changes to the MIPS Target -- Index: clang/docs/ReleaseNotes.rst === --- clang/docs/ReleaseNotes.rst +++ clang/docs/ReleaseNotes.rst @@ -83,6 +83,12 @@ - RISC-V SiFive S54 (``sifive-s54``). - RISC-V SiFive S76 (``sifive-s76``). +- Support has been added for the following architectures (``-march`` identifiers in parentheses): + + - Armv9-A (``armv9-a``). + - Armv9.1-A (``armv9.1-a``). + - Armv9.2-A (``armv9.2-a``). + Removed Compiler Flags -
[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A
vhscampos updated this revision to Diff 376228. vhscampos edited the summary of this revision. vhscampos added a comment. 1. Disable the cryptographic extensions by default. 2. Small fix in TargetParserTest.cpp to include different spellings of the -march values. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D109517/new/ https://reviews.llvm.org/D109517 Files: clang/lib/Basic/Targets/AArch64.cpp clang/lib/Basic/Targets/AArch64.h clang/lib/Basic/Targets/ARM.cpp clang/lib/Driver/ToolChains/Arch/AArch64.cpp clang/test/Driver/aarch64-cpus.c clang/test/Driver/arm-cortex-cpus.c clang/test/Preprocessor/aarch64-target-features.c clang/test/Preprocessor/arm-target-features.c llvm/include/llvm/ADT/Triple.h llvm/include/llvm/Support/AArch64TargetParser.def llvm/include/llvm/Support/ARMTargetParser.def llvm/lib/Support/AArch64TargetParser.cpp llvm/lib/Support/ARMTargetParser.cpp llvm/lib/Support/Triple.cpp llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64InstrInfo.td llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp llvm/lib/Target/ARM/ARM.td llvm/lib/Target/ARM/ARMSubtarget.h llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp llvm/test/MC/AArch64/SME/directives-negative.s llvm/test/MC/AArch64/SME/directives.s llvm/test/MC/AArch64/SVE2/directive-arch-negative.s llvm/test/MC/AArch64/SVE2/directive-arch.s llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -31,6 +31,8 @@ "armv8.5a","armv8.6-a","armv8.6a","armv8.7-a","armv8.7a", "armv8-r", "armv8r", "armv8-m.base","armv8m.base", "armv8-m.main", "armv8m.main", "iwmmxt", "iwmmxt2", "xscale", "armv8.1-m.main", +"armv9-a", "armv9","armv9a", "armv9.1-a","armv9.1a", +"armv9.2-a", "armv9.2a", }; template @@ -492,6 +494,15 @@ EXPECT_TRUE( testARMArch("armv8.7-a", "generic", "v8.7a", 
ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9-a", "generic", "v9a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9.1-a", "generic", "v9.1a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9.2-a", "generic", "v9.2a", + ARMBuildAttrs::CPUArch::v8_A)); EXPECT_TRUE( testARMArch("armv8-r", "cortex-r52", "v8r", ARMBuildAttrs::CPUArch::v8_R)); @@ -821,6 +832,9 @@ case ARM::ArchKind::ARMV8_5A: case ARM::ArchKind::ARMV8_6A: case ARM::ArchKind::ARMV8_7A: +case ARM::ArchKind::ARMV9A: +case ARM::ArchKind::ARMV9_1A: +case ARM::ArchKind::ARMV9_2A: EXPECT_EQ(ARM::ProfileKind::A, ARM::parseArchProfile(ARMArch[i])); break; default: @@ -1204,6 +1218,12 @@ ARMBuildAttrs::CPUArch::v8_A)); EXPECT_TRUE(testAArch64Arch("armv8.7-a", "generic", "v8.7a", ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9-a", "generic", "v9a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9.1-a", "generic", "v9.1a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9.2-a", "generic", "v9.2a", + ARMBuildAttrs::CPUArch::v8_A)); } bool testAArch64Extension(StringRef CPUName, AArch64::ArchKind AK, Index: llvm/test/MC/AArch64/SVE2/directive-arch.s === --- llvm/test/MC/AArch64/SVE2/directive-arch.s +++ llvm/test/MC/AArch64/SVE2/directive-arch.s @@ -1,21 +1,21 @@ // RUN: llvm-mc -triple aarch64 -filetype asm -o - %s 2>&1 | FileCheck %s -.arch armv8-a+sve2 +.arch armv9-a+sve2 tbx z0.b, z1.b, z2.b // CHECK: tbx z0.b, z1.b, z2.b -.arch armv8-a+sve2-aes +.arch armv9-a+sve2-aes aesd z23.b, z23.b, z13.b // CHECK: aesd z23.b, z23.b, z13.b -.arch armv8-a+sve2-sm4 +.arch armv9-a+sve2-sm4 sm4e z0.s, z0.s, z0.s // CHECK: sm4e z0.s, z0.s, z0.s -.arch armv8-a+sve2-sha3 +.arch armv9-a+sve2-sha3 rax1 z0.d, z0.d, z0.d // CHECK: rax1 z0.d, z0.d, z0.d -.arch armv8-a+sve2-bitperm +.arch armv9-a+sve2-bitperm bgrp z21.s, z10.s, z21.s // CHECK: bgrp z21.s, z10.s, z21.s Index: 
llvm/test/MC/AArch64/SVE2/directive-arch-negative.s === --- llvm/test/MC/AArch64/SVE2/directive-arch-negative.s +++ llvm/test/MC/AArch64/SVE2/directive-arch-negative.s @@ -1,31 +1,31 @@ // RUN: not llvm-mc -triple aarch64 -filetype asm -o - %s 2>&1 | FileCheck %s -.arch armv8-a+sve2 -.arch armv8-a
[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A
vhscampos added a comment. To the relevant persons I have just added to the review: @srhines @nickdesaulniers @llozano The cryptographic extensions will **//NOT//** be enabled by default on Armv9-A and on Armv9-A ARM CPUs. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D109517/new/ https://reviews.llvm.org/D109517 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D112420: [clang][ARM] PACBTI-M assembly support
vhscampos added inline comments.

Comment at: llvm/lib/Target/ARM/ARMInstrThumb2.td:4083
 def : t2InstAlias<"csdb$p", (t2HINT 20, pred:$p), 1>;
+def : t2InstAlias<"pacbti$p r12,lr,sp", (t2HINT 13, pred:$p), 1>;
+def : t2InstAlias<"bti$p", (t2HINT 15, pred:$p), 1>;

ostannard wrote:
> Why are these needed in addition to the PACBTIHintSpaceInst instructions
> below?

These instructions are in the HINT space, so without instruction aliases llvm-mc printed them in their HINT form rather than in their PACBTI form. That is why I added these t2InstAlias instances.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112420/new/

https://reviews.llvm.org/D112420
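The effect of the aliases described above can be pictured with a small C++ sketch. This is only an illustration, not LLVM's instruction printer: `printThumb2Hint` and `checkHintPrinting` are hypothetical names, and only the hint numbers (13, 15, 20) and mnemonics are taken from the quoted TableGen lines.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not LLVM's real printer: pick a friendly spelling
// for a HINT immediate when an alias is known, otherwise print the raw form.
std::string printThumb2Hint(unsigned Imm) {
  switch (Imm) {
  case 13:
    return "pacbti r12, lr, sp";
  case 15:
    return "bti";
  case 20:
    return "csdb";
  default:
    // No alias registered for this immediate: generic HINT spelling.
    return "hint #" + std::to_string(Imm);
  }
}

// Small self-check of the aliased and non-aliased cases.
void checkHintPrinting() {
  assert(printThumb2Hint(15) == "bti");
  assert(printThumb2Hint(7) == "hint #7");
}
```

Without the entries for 13 and 15, this sketch would fall through to the generic spelling, which mirrors the behaviour the comment reports for llvm-mc before the aliases were added.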
[PATCH] D112421: [clang][ARM] PACBTI-M frontend support
vhscampos added inline comments.

Comment at: clang/lib/Basic/Targets/AArch64.cpp:134-135
 StringRef &Err) const {
- llvm::AArch64::ParsedBranchProtection PBP;
- if (!llvm::AArch64::parseBranchProtection(Spec, PBP, Err))
+ llvm::ARM::ParsedBranchProtection PBP;
+ if (!llvm::ARM::parseBranchProtection(Spec, PBP, Err))
 return false;

aaron.ballman wrote:
> This change surprises me. Why should AArch64TargetInfo prefer calling into
> ARM instead?

That function ended up identical in ARM and AArch64, so we removed the AArch64-specific copy and kept a single one under ARM. You can spot the removal further down the patch. The ARM namespace in ARMTargetParser.h already contained code used by AArch64TargetParser, so we did not introduce any new cross dependencies.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112421/new/

https://reviews.llvm.org/D112421
[PATCH] D112421: [clang][ARM] PACBTI-M frontend support
vhscampos added inline comments.

Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:2979
+def warn_unsupported_branch_protection_spec : Warning<
+ "unsupported branch protection specification '%0'">, InGroup;
+

The extraneous whitespace still needs to be removed.

Comment at: clang/lib/CodeGen/TargetInfo.cpp:6377
+
+static const char *SignReturnAddrStr[] = {"none", "non-leaf", "all"};
+Fn->addFnAttr("sign-return-address",

I reckon selecting the string with a switch statement on BPI.SignReturnAddr is more type safe than indexing into an array like this. The current selection is prone to out-of-bounds accesses to the array if the enum ever changes. Please consider doing so.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112421/new/

https://reviews.llvm.org/D112421
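A minimal sketch of the switch-based selection suggested above, under stated assumptions: `SignReturnAddressScope`, `signReturnAddrStr` and `checkScopeStrings` are hypothetical stand-ins, and the real enum behind BPI.SignReturnAddr may be declared differently. Only the three strings come from the quoted code.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the enum behind BPI.SignReturnAddr.
enum class SignReturnAddressScope { None, NonLeaf, All };

// Map each enumerator to its attribute string. With no default label,
// GCC and Clang can warn (-Wswitch) when a new enumerator is left
// unhandled, instead of silently reading past the end of an array.
std::string signReturnAddrStr(SignReturnAddressScope Scope) {
  switch (Scope) {
  case SignReturnAddressScope::None:
    return "none";
  case SignReturnAddressScope::NonLeaf:
    return "non-leaf";
  case SignReturnAddressScope::All:
    return "all";
  }
  // Only reachable if Scope holds a value outside the enum.
  assert(false && "unhandled SignReturnAddressScope");
  return "none";
}

// Small self-check of the mapping.
void checkScopeStrings() {
  assert(signReturnAddrStr(SignReturnAddressScope::NonLeaf) == "non-leaf");
}
```

The trade-off is a few more lines than the array, in exchange for a compile-time diagnostic when the enum grows.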
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
vhscampos created this revision. Herald added subscribers: hiraditya, kristof.beyls. Herald added a project: All. vhscampos requested review of this revision. Herald added projects: clang, LLVM. Herald added subscribers: llvm-commits, cfe-commits. Cortex-X3 is an Armv9-A AArch64 CPU. This patch introduces support for Cortex-X3. Technical Reference Manual: https://developer.arm.com/documentation/101593/latest Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D136589 Files: clang/docs/ReleaseNotes.rst clang/test/Misc/target-invalid-cpu-note.c llvm/docs/ReleaseNotes.rst llvm/include/llvm/Support/AArch64TargetParser.def llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64Subtarget.cpp llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -1073,6 +1073,18 @@ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SSBS | AArch64::AEK_SB | AArch64::AEK_FP16FML, "9-A"), +ARMCPUTestParams("cortex-x3", "armv9-a", "crypto-neon-fp-armv8", + AArch64::AEK_CRC | AArch64::AEK_FP | AArch64::AEK_BF16 | + AArch64::AEK_SIMD | AArch64::AEK_RAS | + AArch64::AEK_LSE | AArch64::AEK_RDM | + AArch64::AEK_RCPC | AArch64::AEK_DOTPROD | + AArch64::AEK_MTE | AArch64::AEK_PAUTH | + AArch64::AEK_SVE | AArch64::AEK_SVE2 | + AArch64::AEK_SVE2BITPERM | AArch64::AEK_SB | + AArch64::AEK_PROFILE | AArch64::AEK_PERFMON | + AArch64::AEK_I8MM | AArch64::AEK_FP16 | + AArch64::AEK_FP16FML | AArch64::AEK_PREDRES, + "9-A"), ARMCPUTestParams("cyclone", "armv8-a", "crypto-neon-fp-armv8", AArch64::AEK_NONE | AArch64::AEK_CRYPTO | AArch64::AEK_FP | AArch64::AEK_SIMD, @@ -1295,7 +1307,7 @@ AArch64::AEK_LSE | AArch64::AEK_RDM, "8.2-A"))); -static constexpr unsigned NumAArch64CPUArchs = 58; +static constexpr unsigned NumAArch64CPUArchs = 59; TEST(TargetParserTest, testAArch64CPUArchList) { SmallVector List; Index: 
llvm/lib/Target/AArch64/AArch64Subtarget.h === --- llvm/lib/Target/AArch64/AArch64Subtarget.h +++ llvm/lib/Target/AArch64/AArch64Subtarget.h @@ -68,6 +68,7 @@ CortexX1, CortexX1C, CortexX2, +CortexX3, ExynosM3, Falkor, Kryo, Index: llvm/lib/Target/AArch64/AArch64Subtarget.cpp === --- llvm/lib/Target/AArch64/AArch64Subtarget.cpp +++ llvm/lib/Target/AArch64/AArch64Subtarget.cpp @@ -143,6 +143,7 @@ break; case CortexA710: case CortexX2: + case CortexX3: PrefFunctionLogAlignment = 4; VScaleForTuning = 1; PrefLoopLogAlignment = 5; Index: llvm/lib/Target/AArch64/AArch64.td === --- llvm/lib/Target/AArch64/AArch64.td +++ llvm/lib/Target/AArch64/AArch64.td @@ -785,6 +785,11 @@ FeatureLSLFast, FeaturePostRAScheduler]>; +def TuneX3 : SubtargetFeature<"cortex-x3", "ARMProcFamily", "CortexX3", + "Cortex-X3 ARM processors", [ + FeaturePostRAScheduler, + FeatureFuseAES]>; + def TuneA64FX : SubtargetFeature<"a64fx", "ARMProcFamily", "A64FX", "Fujitsu A64FX processors", [ FeaturePostRAScheduler, @@ -1109,6 +1114,11 @@ FeatureMatMulInt8, FeatureBF16, FeatureAM, FeatureMTE, FeatureETE, FeatureSVE2BitPerm, FeatureFP16FML]; + list X3 = [HasV9_0aOps, FeatureSVE, FeatureNEON, + FeaturePerfMon, FeatureETE, FeatureTRBE, + FeatureSPE, FeatureBF16, FeatureMatMulInt8, + FeatureMTE, FeatureSVE2BitPerm, FeatureFullFP16, + FeatureFP16FML]; list A64FX= [HasV8_2aOps, FeatureFPARMv8, FeatureNEON, FeatureSHA2, FeaturePerfMon, FeatureFullFP16, FeatureSVE, FeatureComplxNum]; @@ -1239,6 +1249,8 @@ [TuneX1]>; def : ProcessorModel<"cortex-x2", NeoverseN2Model, ProcessorFeatures.X2,
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
vhscampos updated this revision to Diff 470449. vhscampos marked 2 inline comments as done. vhscampos added a comment. Comments addressed Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136589/new/ https://reviews.llvm.org/D136589 Files: clang/docs/ReleaseNotes.rst clang/test/Driver/aarch64-mcpu.c clang/test/Misc/target-invalid-cpu-note.c llvm/docs/ReleaseNotes.rst llvm/include/llvm/Support/AArch64TargetParser.def llvm/lib/Support/Host.cpp llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64Subtarget.cpp llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -1073,6 +1073,18 @@ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SSBS | AArch64::AEK_SB | AArch64::AEK_FP16FML, "9-A"), +ARMCPUTestParams("cortex-x3", "armv9-a", "neon-fp-armv8", + AArch64::AEK_CRC | AArch64::AEK_FP | AArch64::AEK_BF16 | + AArch64::AEK_SIMD | AArch64::AEK_RAS | + AArch64::AEK_LSE | AArch64::AEK_RDM | + AArch64::AEK_RCPC | AArch64::AEK_DOTPROD | + AArch64::AEK_MTE | AArch64::AEK_PAUTH | + AArch64::AEK_SVE | AArch64::AEK_SVE2 | + AArch64::AEK_SVE2BITPERM | AArch64::AEK_SB | + AArch64::AEK_PROFILE | AArch64::AEK_PERFMON | + AArch64::AEK_I8MM | AArch64::AEK_FP16 | + AArch64::AEK_FP16FML | AArch64::AEK_PREDRES, + "9-A"), ARMCPUTestParams("cyclone", "armv8-a", "crypto-neon-fp-armv8", AArch64::AEK_NONE | AArch64::AEK_CRYPTO | AArch64::AEK_FP | AArch64::AEK_SIMD, @@ -1296,7 +1308,7 @@ "8.2-A"))); // Note: number of CPUs includes aliases. 
-static constexpr unsigned NumAArch64CPUArchs = 59; +static constexpr unsigned NumAArch64CPUArchs = 60; TEST(TargetParserTest, testAArch64CPUArchList) { SmallVector List; Index: llvm/lib/Target/AArch64/AArch64Subtarget.h === --- llvm/lib/Target/AArch64/AArch64Subtarget.h +++ llvm/lib/Target/AArch64/AArch64Subtarget.h @@ -68,6 +68,7 @@ CortexX1, CortexX1C, CortexX2, +CortexX3, ExynosM3, Falkor, Kryo, Index: llvm/lib/Target/AArch64/AArch64Subtarget.cpp === --- llvm/lib/Target/AArch64/AArch64Subtarget.cpp +++ llvm/lib/Target/AArch64/AArch64Subtarget.cpp @@ -143,6 +143,7 @@ break; case CortexA710: case CortexX2: + case CortexX3: PrefFunctionLogAlignment = 4; VScaleForTuning = 1; PrefLoopLogAlignment = 5; Index: llvm/lib/Target/AArch64/AArch64.td === --- llvm/lib/Target/AArch64/AArch64.td +++ llvm/lib/Target/AArch64/AArch64.td @@ -785,6 +785,13 @@ FeatureLSLFast, FeaturePostRAScheduler]>; +def TuneX3 : SubtargetFeature<"cortex-x3", "ARMProcFamily", "CortexX3", + "Cortex-X3 ARM processors", [ + FeatureLSLFast, + FeatureFuseAdrpAdd, + FeatureFuseAES, + FeaturePostRAScheduler]>; + def TuneA64FX : SubtargetFeature<"a64fx", "ARMProcFamily", "A64FX", "Fujitsu A64FX processors", [ FeaturePostRAScheduler, @@ -1109,6 +1116,11 @@ FeatureMatMulInt8, FeatureBF16, FeatureAM, FeatureMTE, FeatureETE, FeatureSVE2BitPerm, FeatureFP16FML]; + list X3 = [HasV9_0aOps, FeatureSVE, FeatureNEON, + FeaturePerfMon, FeatureETE, FeatureTRBE, + FeatureSPE, FeatureBF16, FeatureMatMulInt8, + FeatureMTE, FeatureSVE2BitPerm, FeatureFullFP16, + FeatureFP16FML]; list A64FX= [HasV8_2aOps, FeatureFPARMv8, FeatureNEON, FeatureSHA2, FeaturePerfMon, FeatureFullFP16, FeatureSVE, FeatureComplxNum]; @@ -1239,6 +1251,8 @@ [TuneX1]>; def : ProcessorModel<"cortex-x2", NeoverseN2Model, ProcessorFeatures.X2, [TuneX2]>; +def : ProcessorModel<"cortex-x3", CortexA57Model
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
vhscampos added inline comments.

Comment at: llvm/docs/ReleaseNotes.rst:84
+* Added support for the Cortex-X3 CPU.
+

dmgreen wrote:
> Can you add a reference to neoverse-v2 too.

I could do this in another patch, but I prefer to restrict this one to X3.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136589/new/

https://reviews.llvm.org/D136589
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
vhscampos added inline comments.

Comment at: llvm/docs/ReleaseNotes.rst:84
+* Added support for the Cortex-X3 CPU.
+

vhscampos wrote:
> dmgreen wrote:
> > Can you add a reference to neoverse-v2 too.
> I could do this in another patch, but I prefer to restrict this one to X3.

I meant I prefer to restrict this patch to X3 only.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136589/new/

https://reviews.llvm.org/D136589
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
vhscampos planned changes to this revision. vhscampos added a comment. We found an issue in the list of target features for v9-A. This patch will be affected. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136589/new/ https://reviews.llvm.org/D136589
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
vhscampos updated this revision to Diff 472001. vhscampos added a comment. Combining release notes of Cortex-X3 and Neoverse V2. Make cortex-x3 use neoverse-v2 model. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136589/new/ https://reviews.llvm.org/D136589 Files: clang/docs/ReleaseNotes.rst clang/test/Driver/aarch64-mcpu.c clang/test/Misc/target-invalid-cpu-note.c llvm/docs/ReleaseNotes.rst llvm/include/llvm/Support/AArch64TargetParser.def llvm/lib/Support/Host.cpp llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64Subtarget.cpp llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -1073,6 +1073,18 @@ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SSBS | AArch64::AEK_SB | AArch64::AEK_FP16FML, "9-A"), +ARMCPUTestParams("cortex-x3", "armv9-a", "neon-fp-armv8", + AArch64::AEK_CRC | AArch64::AEK_FP | AArch64::AEK_BF16 | + AArch64::AEK_SIMD | AArch64::AEK_RAS | + AArch64::AEK_LSE | AArch64::AEK_RDM | + AArch64::AEK_RCPC | AArch64::AEK_DOTPROD | + AArch64::AEK_MTE | AArch64::AEK_PAUTH | + AArch64::AEK_SVE | AArch64::AEK_SVE2 | + AArch64::AEK_SVE2BITPERM | AArch64::AEK_SB | + AArch64::AEK_PROFILE | AArch64::AEK_PERFMON | + AArch64::AEK_I8MM | AArch64::AEK_FP16 | + AArch64::AEK_FP16FML | AArch64::AEK_PREDRES, + "9-A"), ARMCPUTestParams("cyclone", "armv8-a", "crypto-neon-fp-armv8", AArch64::AEK_NONE | AArch64::AEK_CRYPTO | AArch64::AEK_FP | AArch64::AEK_SIMD, @@ -1296,7 +1308,7 @@ "8.2-A"))); // Note: number of CPUs includes aliases. 
-static constexpr unsigned NumAArch64CPUArchs = 59; +static constexpr unsigned NumAArch64CPUArchs = 60; TEST(TargetParserTest, testAArch64CPUArchList) { SmallVector List; Index: llvm/lib/Target/AArch64/AArch64Subtarget.h === --- llvm/lib/Target/AArch64/AArch64Subtarget.h +++ llvm/lib/Target/AArch64/AArch64Subtarget.h @@ -68,6 +68,7 @@ CortexX1, CortexX1C, CortexX2, +CortexX3, ExynosM3, Falkor, Kryo, Index: llvm/lib/Target/AArch64/AArch64Subtarget.cpp === --- llvm/lib/Target/AArch64/AArch64Subtarget.cpp +++ llvm/lib/Target/AArch64/AArch64Subtarget.cpp @@ -143,6 +143,7 @@ break; case CortexA710: case CortexX2: + case CortexX3: PrefFunctionLogAlignment = 4; VScaleForTuning = 1; PrefLoopLogAlignment = 5; Index: llvm/lib/Target/AArch64/AArch64.td === --- llvm/lib/Target/AArch64/AArch64.td +++ llvm/lib/Target/AArch64/AArch64.td @@ -785,6 +785,13 @@ FeatureLSLFast, FeaturePostRAScheduler]>; +def TuneX3 : SubtargetFeature<"cortex-x3", "ARMProcFamily", "CortexX3", + "Cortex-X3 ARM processors", [ + FeatureLSLFast, + FeatureFuseAdrpAdd, + FeatureFuseAES, + FeaturePostRAScheduler]>; + def TuneA64FX : SubtargetFeature<"a64fx", "ARMProcFamily", "A64FX", "Fujitsu A64FX processors", [ FeaturePostRAScheduler, @@ -1109,6 +1116,11 @@ FeatureMatMulInt8, FeatureBF16, FeatureAM, FeatureMTE, FeatureETE, FeatureSVE2BitPerm, FeatureFP16FML]; + list X3 = [HasV9_0aOps, FeatureSVE, FeatureNEON, + FeaturePerfMon, FeatureETE, FeatureTRBE, + FeatureSPE, FeatureBF16, FeatureMatMulInt8, + FeatureMTE, FeatureSVE2BitPerm, FeatureFullFP16, + FeatureFP16FML]; list A64FX= [HasV8_2aOps, FeatureFPARMv8, FeatureNEON, FeatureSHA2, FeaturePerfMon, FeatureFullFP16, FeatureSVE, FeatureComplxNum]; @@ -1239,6 +1251,8 @@ [TuneX1]>; def : ProcessorModel<"cortex-x2", NeoverseN2Model, ProcessorFeatures.X2, [TuneX2]>; +def : ProcessorMod
[PATCH] D131555: [Clang] Propagate const context info when emitting compound literal
vhscampos added inline comments.

Comment at: clang/lib/CodeGen/CGExprConstant.cpp:2217
 assert(E->isFileScope() && "not a file-scope compound literal expr");
- return tryEmitGlobalCompoundLiteral(*this, nullptr, E);
+ ConstantEmitter emitter(*this, nullptr);
+ return tryEmitGlobalCompoundLiteral(emitter, E);

The second parameter of this constructor is optional anyway. I suggest you omit the nullptr here.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131555/new/

https://reviews.llvm.org/D131555
[PATCH] D131555: [Clang] Propagate const context info when emitting compound literal
vhscampos added inline comments.

Comment at: clang/lib/CodeGen/CGExprConstant.cpp:2217
 assert(E->isFileScope() && "not a file-scope compound literal expr");
- return tryEmitGlobalCompoundLiteral(*this, nullptr, E);
+ ConstantEmitter emitter(*this, nullptr);
+ return tryEmitGlobalCompoundLiteral(emitter, E);

vhscampos wrote:
> This constructor has the second parameter optional anyway. I suggest you omit
> the nullptr here.

To clarify, when I say optional, I mean it has a default value, which is already nullptr.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131555/new/

https://reviews.llvm.org/D131555
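The point about the default argument can be shown with a small standalone sketch. `Module`, `Function`, `Emitter` and `checkDefaultArgument` below are hypothetical stand-ins invented for illustration; they are not Clang's classes, and the real ConstantEmitter constructor may differ in its details.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not Clang's real types.
struct Module {};
struct Function {};

class Emitter {
public:
  // Fn defaults to nullptr, so a caller without a function context
  // can simply leave the second argument out.
  explicit Emitter(Module &M, Function *Fn = nullptr) : M(M), Fn(Fn) {}

  bool hasFunction() const { return Fn != nullptr; }
  Module &module() const { return M; }

private:
  Module &M;
  Function *Fn;
};

// Passing nullptr explicitly and omitting it build equivalent objects.
void checkDefaultArgument() {
  Module M;
  Emitter Explicit(M, nullptr);
  Emitter Implicit(M);
  assert(!Explicit.hasFunction());
  assert(!Implicit.hasFunction());
}
```

Under that assumption, the two spellings are interchangeable, so the shorter one is purely a readability choice.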
[PATCH] D136957: [AArch64] Add support for the Cortex-A715 CPU
vhscampos added inline comments.

Comment at: clang/docs/ReleaseNotes.rst:699
+
+ * Arm cortex-A715 (cortex-a715).

Please capitalise the first cortex-A715. It should be "Arm Cortex-A715".

Comment at: llvm/docs/ReleaseNotes.rst:84
+* Added support for the cortex-a715 CPU.
+

Please capitalise. "Cortex-A715".

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136957/new/

https://reviews.llvm.org/D136957
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
vhscampos updated this revision to Diff 472567. vhscampos added a comment. Added AEK_FLAGM to the list of features for Cortex-X3. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136589/new/ https://reviews.llvm.org/D136589 Files: clang/docs/ReleaseNotes.rst clang/test/Driver/aarch64-mcpu.c clang/test/Misc/target-invalid-cpu-note.c llvm/docs/ReleaseNotes.rst llvm/include/llvm/Support/AArch64TargetParser.def llvm/lib/Support/Host.cpp llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64Subtarget.cpp llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -1073,6 +1073,19 @@ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SSBS | AArch64::AEK_SB | AArch64::AEK_FP16FML, "9-A"), +ARMCPUTestParams("cortex-x3", "armv9-a", "neon-fp-armv8", + AArch64::AEK_CRC | AArch64::AEK_FP | AArch64::AEK_BF16 | + AArch64::AEK_SIMD | AArch64::AEK_RAS | + AArch64::AEK_LSE | AArch64::AEK_RDM | + AArch64::AEK_RCPC | AArch64::AEK_DOTPROD | + AArch64::AEK_MTE | AArch64::AEK_PAUTH | + AArch64::AEK_SVE | AArch64::AEK_SVE2 | + AArch64::AEK_SVE2BITPERM | AArch64::AEK_SB | + AArch64::AEK_PROFILE | AArch64::AEK_PERFMON | + AArch64::AEK_I8MM | AArch64::AEK_FP16 | + AArch64::AEK_FP16FML | AArch64::AEK_PREDRES | + AArch64::AEK_FLAGM, + "9-A"), ARMCPUTestParams("cyclone", "armv8-a", "crypto-neon-fp-armv8", AArch64::AEK_NONE | AArch64::AEK_CRYPTO | AArch64::AEK_FP | AArch64::AEK_SIMD, @@ -1296,7 +1309,7 @@ "8.2-A"))); // Note: number of CPUs includes aliases. 
-static constexpr unsigned NumAArch64CPUArchs = 59; +static constexpr unsigned NumAArch64CPUArchs = 60; TEST(TargetParserTest, testAArch64CPUArchList) { SmallVector List; Index: llvm/lib/Target/AArch64/AArch64Subtarget.h === --- llvm/lib/Target/AArch64/AArch64Subtarget.h +++ llvm/lib/Target/AArch64/AArch64Subtarget.h @@ -68,6 +68,7 @@ CortexX1, CortexX1C, CortexX2, +CortexX3, ExynosM3, Falkor, Kryo, Index: llvm/lib/Target/AArch64/AArch64Subtarget.cpp === --- llvm/lib/Target/AArch64/AArch64Subtarget.cpp +++ llvm/lib/Target/AArch64/AArch64Subtarget.cpp @@ -143,6 +143,7 @@ break; case CortexA710: case CortexX2: + case CortexX3: PrefFunctionLogAlignment = 4; VScaleForTuning = 1; PrefLoopLogAlignment = 5; Index: llvm/lib/Target/AArch64/AArch64.td === --- llvm/lib/Target/AArch64/AArch64.td +++ llvm/lib/Target/AArch64/AArch64.td @@ -785,6 +785,13 @@ FeatureLSLFast, FeaturePostRAScheduler]>; +def TuneX3 : SubtargetFeature<"cortex-x3", "ARMProcFamily", "CortexX3", + "Cortex-X3 ARM processors", [ + FeatureLSLFast, + FeatureFuseAdrpAdd, + FeatureFuseAES, + FeaturePostRAScheduler]>; + def TuneA64FX : SubtargetFeature<"a64fx", "ARMProcFamily", "A64FX", "Fujitsu A64FX processors", [ FeaturePostRAScheduler, @@ -1109,6 +1116,11 @@ FeatureMatMulInt8, FeatureBF16, FeatureAM, FeatureMTE, FeatureETE, FeatureSVE2BitPerm, FeatureFP16FML]; + list X3 = [HasV9_0aOps, FeatureSVE, FeatureNEON, + FeaturePerfMon, FeatureETE, FeatureTRBE, + FeatureSPE, FeatureBF16, FeatureMatMulInt8, + FeatureMTE, FeatureSVE2BitPerm, FeatureFullFP16, + FeatureFP16FML]; list A64FX= [HasV8_2aOps, FeatureFPARMv8, FeatureNEON, FeatureSHA2, FeaturePerfMon, FeatureFullFP16, FeatureSVE, FeatureComplxNum]; @@ -1239,6 +1251,8 @@ [TuneX1]>; def : ProcessorModel<"cortex-x2", NeoverseN2Model, ProcessorFeatures.X2, [TuneX2]>; +def : P
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
vhscampos marked an inline comment as done.
vhscampos added inline comments.

Comment at: clang/docs/ReleaseNotes.rst:696
-- Add driver and tuning support for Neoverse V2 via the flag ``-mcpu=neoverse-v2``.
- Native detection is also supported via ``-mcpu=native``.

tschuett wrote:
> dmgreen wrote:
> > tschuett wrote:
> > > What happened with native detection?
> > I think I prefer the new wording, FWIW, as we add more CPUs. The native
> > detection should just be automatically implied.
> But the word `native` disappeared.

I've looked at https://reviews.llvm.org/D134352 and at the code that implements -mcpu=native, and I couldn't find anything related to Neoverse V2. Can you please point me to what that release note refers to?

Comment at: llvm/unittests/Support/TargetParserTest.cpp:1083
+ AArch64::AEK_SVE | AArch64::AEK_SVE2 |
+ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SB |
+ AArch64::AEK_PROFILE | AArch64::AEK_PERFMON |

dmgreen wrote:
> Should SSBS be included in here? As far as I understand it is enabled by
> default in 8.5-A. That should mean it's included automatically, but that can
> be said for a lot of the other features here too.
> Same for AEK_FLAGM, which I noticed as a difference between A715 and X3.

SSBS is an optional feature for all Armv8-A architectures. But FlagM should be present here as it's mandatory in v8.4-A. Fixed.

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136589/new/

https://reviews.llvm.org/D136589
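To make the FlagM discussion concrete, here is a minimal sketch of how an expected-extensions mask built by OR-ing flags changes when one extension is added. The bit values and the names `hasAllExtensions` and `checkFlagMExpectation` are assumptions made for this example; the real AEK_* constants are defined in LLVM's AArch64 target parser and differ from these.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit assignments for illustration; the real AEK_* values
// live in LLVM's AArch64 target parser and are not these.
enum ArchExtKind : uint64_t {
  AEK_FP = 1ULL << 0,
  AEK_SIMD = 1ULL << 1,
  AEK_FLAGM = 1ULL << 2,
  AEK_SSBS = 1ULL << 3,
};

// True when every bit in Required is also set in Actual.
bool hasAllExtensions(uint64_t Actual, uint64_t Required) {
  return (Actual & Required) == Required;
}

// Adding one flag changes the mask, so an exact-match expectation written
// without that flag no longer describes the CPU.
void checkFlagMExpectation() {
  uint64_t Before = AEK_FP | AEK_SIMD;
  uint64_t After = Before | AEK_FLAGM;
  assert(!hasAllExtensions(Before, AEK_FLAGM));
  assert(hasAllExtensions(After, AEK_FLAGM));
  assert(Before != After);
}
```

This is why a mandatory extension that is missing from the expected list shows up as a mismatch in a test that compares whole masks, assuming the test does compare them exactly.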
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
vhscampos updated this revision to Diff 472583. vhscampos added a comment. Added SSBS Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136589/new/ https://reviews.llvm.org/D136589 Files: clang/docs/ReleaseNotes.rst clang/test/Driver/aarch64-mcpu.c clang/test/Misc/target-invalid-cpu-note.c llvm/docs/ReleaseNotes.rst llvm/include/llvm/Support/AArch64TargetParser.def llvm/lib/Support/Host.cpp llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64Subtarget.cpp llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -1073,6 +1073,19 @@ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SSBS | AArch64::AEK_SB | AArch64::AEK_FP16FML, "9-A"), +ARMCPUTestParams("cortex-x3", "armv9-a", "neon-fp-armv8", + AArch64::AEK_CRC | AArch64::AEK_FP | AArch64::AEK_BF16 | + AArch64::AEK_SIMD | AArch64::AEK_RAS | + AArch64::AEK_LSE | AArch64::AEK_RDM | + AArch64::AEK_RCPC | AArch64::AEK_DOTPROD | + AArch64::AEK_MTE | AArch64::AEK_PAUTH | + AArch64::AEK_SVE | AArch64::AEK_SVE2 | + AArch64::AEK_SVE2BITPERM | AArch64::AEK_SB | + AArch64::AEK_PROFILE | AArch64::AEK_PERFMON | + AArch64::AEK_I8MM | AArch64::AEK_FP16 | + AArch64::AEK_FP16FML | AArch64::AEK_PREDRES | + AArch64::AEK_FLAGM | AArch64::AEK_SSBS, + "9-A"), ARMCPUTestParams("cyclone", "armv8-a", "crypto-neon-fp-armv8", AArch64::AEK_NONE | AArch64::AEK_CRYPTO | AArch64::AEK_FP | AArch64::AEK_SIMD, @@ -1296,7 +1309,7 @@ "8.2-A"))); // Note: number of CPUs includes aliases. 
-static constexpr unsigned NumAArch64CPUArchs = 59; +static constexpr unsigned NumAArch64CPUArchs = 60; TEST(TargetParserTest, testAArch64CPUArchList) { SmallVector List; Index: llvm/lib/Target/AArch64/AArch64Subtarget.h === --- llvm/lib/Target/AArch64/AArch64Subtarget.h +++ llvm/lib/Target/AArch64/AArch64Subtarget.h @@ -68,6 +68,7 @@ CortexX1, CortexX1C, CortexX2, +CortexX3, ExynosM3, Falkor, Kryo, Index: llvm/lib/Target/AArch64/AArch64Subtarget.cpp === --- llvm/lib/Target/AArch64/AArch64Subtarget.cpp +++ llvm/lib/Target/AArch64/AArch64Subtarget.cpp @@ -143,6 +143,7 @@ break; case CortexA710: case CortexX2: + case CortexX3: PrefFunctionLogAlignment = 4; VScaleForTuning = 1; PrefLoopLogAlignment = 5; Index: llvm/lib/Target/AArch64/AArch64.td === --- llvm/lib/Target/AArch64/AArch64.td +++ llvm/lib/Target/AArch64/AArch64.td @@ -785,6 +785,13 @@ FeatureLSLFast, FeaturePostRAScheduler]>; +def TuneX3 : SubtargetFeature<"cortex-x3", "ARMProcFamily", "CortexX3", + "Cortex-X3 ARM processors", [ + FeatureLSLFast, + FeatureFuseAdrpAdd, + FeatureFuseAES, + FeaturePostRAScheduler]>; + def TuneA64FX : SubtargetFeature<"a64fx", "ARMProcFamily", "A64FX", "Fujitsu A64FX processors", [ FeaturePostRAScheduler, @@ -1109,6 +1116,11 @@ FeatureMatMulInt8, FeatureBF16, FeatureAM, FeatureMTE, FeatureETE, FeatureSVE2BitPerm, FeatureFP16FML]; + list X3 = [HasV9_0aOps, FeatureSVE, FeatureNEON, + FeaturePerfMon, FeatureETE, FeatureTRBE, + FeatureSPE, FeatureBF16, FeatureMatMulInt8, + FeatureMTE, FeatureSVE2BitPerm, FeatureFullFP16, + FeatureFP16FML]; list A64FX= [HasV8_2aOps, FeatureFPARMv8, FeatureNEON, FeatureSHA2, FeaturePerfMon, FeatureFullFP16, FeatureSVE, FeatureComplxNum]; @@ -1239,6 +1251,8 @@ [TuneX1]>; def : ProcessorModel<"cortex-x2", NeoverseN2Model, ProcessorFeatures.X2, [TuneX2]>; +def : ProcessorModel<"cortex-
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
vhscampos added a comment. @tschuett Do you want me to put the note about 'native' detection for neoverse-v2 back? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136589/new/ https://reviews.llvm.org/D136589
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
vhscampos updated this revision to Diff 474029. vhscampos added a comment. Rebasing and conflict resolution Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136589/new/ https://reviews.llvm.org/D136589 Files: clang/docs/ReleaseNotes.rst clang/test/Driver/aarch64-mcpu.c clang/test/Misc/target-invalid-cpu-note.c llvm/docs/ReleaseNotes.rst llvm/include/llvm/Support/AArch64TargetParser.def llvm/lib/Support/Host.cpp llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64Subtarget.cpp llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -1086,6 +1086,19 @@ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SSBS | AArch64::AEK_SB | AArch64::AEK_FP16FML, "9-A"), +ARMCPUTestParams("cortex-x3", "armv9-a", "neon-fp-armv8", + AArch64::AEK_CRC | AArch64::AEK_FP | AArch64::AEK_BF16 | + AArch64::AEK_SIMD | AArch64::AEK_RAS | + AArch64::AEK_LSE | AArch64::AEK_RDM | + AArch64::AEK_RCPC | AArch64::AEK_DOTPROD | + AArch64::AEK_MTE | AArch64::AEK_PAUTH | + AArch64::AEK_SVE | AArch64::AEK_SVE2 | + AArch64::AEK_SVE2BITPERM | AArch64::AEK_SB | + AArch64::AEK_PROFILE | AArch64::AEK_PERFMON | + AArch64::AEK_I8MM | AArch64::AEK_FP16 | + AArch64::AEK_FP16FML | AArch64::AEK_PREDRES | + AArch64::AEK_FLAGM | AArch64::AEK_SSBS, + "9-A"), ARMCPUTestParams("cyclone", "armv8-a", "crypto-neon-fp-armv8", AArch64::AEK_NONE | AArch64::AEK_CRYPTO | AArch64::AEK_FP | AArch64::AEK_SIMD, @@ -1309,7 +1322,7 @@ "8.2-A"))); // Note: number of CPUs includes aliases. 
-static constexpr unsigned NumAArch64CPUArchs = 60; +static constexpr unsigned NumAArch64CPUArchs = 61; TEST(TargetParserTest, testAArch64CPUArchList) { SmallVector List; Index: llvm/lib/Target/AArch64/AArch64Subtarget.h === --- llvm/lib/Target/AArch64/AArch64Subtarget.h +++ llvm/lib/Target/AArch64/AArch64Subtarget.h @@ -69,6 +69,7 @@ CortexX1, CortexX1C, CortexX2, +CortexX3, ExynosM3, Falkor, Kryo, Index: llvm/lib/Target/AArch64/AArch64Subtarget.cpp === --- llvm/lib/Target/AArch64/AArch64Subtarget.cpp +++ llvm/lib/Target/AArch64/AArch64Subtarget.cpp @@ -144,6 +144,7 @@ case CortexA710: case CortexA715: case CortexX2: + case CortexX3: PrefFunctionLogAlignment = 4; VScaleForTuning = 1; PrefLoopLogAlignment = 5; Index: llvm/lib/Target/AArch64/AArch64.td === --- llvm/lib/Target/AArch64/AArch64.td +++ llvm/lib/Target/AArch64/AArch64.td @@ -799,6 +799,13 @@ FeatureLSLFast, FeaturePostRAScheduler]>; +def TuneX3 : SubtargetFeature<"cortex-x3", "ARMProcFamily", "CortexX3", + "Cortex-X3 ARM processors", [ + FeatureLSLFast, + FeatureFuseAdrpAdd, + FeatureFuseAES, + FeaturePostRAScheduler]>; + def TuneA64FX : SubtargetFeature<"a64fx", "ARMProcFamily", "A64FX", "Fujitsu A64FX processors", [ FeaturePostRAScheduler, @@ -1127,6 +1134,11 @@ FeatureMatMulInt8, FeatureBF16, FeatureAM, FeatureMTE, FeatureETE, FeatureSVE2BitPerm, FeatureFP16FML]; + list X3 = [HasV9_0aOps, FeatureSVE, FeatureNEON, + FeaturePerfMon, FeatureETE, FeatureTRBE, + FeatureSPE, FeatureBF16, FeatureMatMulInt8, + FeatureMTE, FeatureSVE2BitPerm, FeatureFullFP16, + FeatureFP16FML]; list A64FX= [HasV8_2aOps, FeatureFPARMv8, FeatureNEON, FeatureSHA2, FeaturePerfMon, FeatureFullFP16, FeatureSVE, FeatureComplxNum]; @@ -1259,6 +1271,8 @@ [TuneX1]>; def : ProcessorModel<"cortex-x2", NeoverseN2Model, ProcessorFeatures.X2, [TuneX2]>;
[PATCH] D136589: [AArch64] Add support for the Cortex-X3 CPU
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG9d1ff787e5c2: [AArch64] Add support for the Cortex-X3 CPU (authored by vhscampos). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136589/new/ https://reviews.llvm.org/D136589 Files: clang/docs/ReleaseNotes.rst clang/test/Driver/aarch64-mcpu.c clang/test/Misc/target-invalid-cpu-note.c llvm/docs/ReleaseNotes.rst llvm/include/llvm/Support/AArch64TargetParser.def llvm/lib/Support/Host.cpp llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64Subtarget.cpp llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -1086,6 +1086,19 @@ AArch64::AEK_SVE2BITPERM | AArch64::AEK_SSBS | AArch64::AEK_SB | AArch64::AEK_FP16FML, "9-A"), +ARMCPUTestParams("cortex-x3", "armv9-a", "neon-fp-armv8", + AArch64::AEK_CRC | AArch64::AEK_FP | AArch64::AEK_BF16 | + AArch64::AEK_SIMD | AArch64::AEK_RAS | + AArch64::AEK_LSE | AArch64::AEK_RDM | + AArch64::AEK_RCPC | AArch64::AEK_DOTPROD | + AArch64::AEK_MTE | AArch64::AEK_PAUTH | + AArch64::AEK_SVE | AArch64::AEK_SVE2 | + AArch64::AEK_SVE2BITPERM | AArch64::AEK_SB | + AArch64::AEK_PROFILE | AArch64::AEK_PERFMON | + AArch64::AEK_I8MM | AArch64::AEK_FP16 | + AArch64::AEK_FP16FML | AArch64::AEK_PREDRES | + AArch64::AEK_FLAGM | AArch64::AEK_SSBS, + "9-A"), ARMCPUTestParams("cyclone", "armv8-a", "crypto-neon-fp-armv8", AArch64::AEK_NONE | AArch64::AEK_CRYPTO | AArch64::AEK_FP | AArch64::AEK_SIMD, @@ -1309,7 +1322,7 @@ "8.2-A"))); // Note: number of CPUs includes aliases. 
-static constexpr unsigned NumAArch64CPUArchs = 60; +static constexpr unsigned NumAArch64CPUArchs = 61; TEST(TargetParserTest, testAArch64CPUArchList) { SmallVector List; Index: llvm/lib/Target/AArch64/AArch64Subtarget.h === --- llvm/lib/Target/AArch64/AArch64Subtarget.h +++ llvm/lib/Target/AArch64/AArch64Subtarget.h @@ -69,6 +69,7 @@ CortexX1, CortexX1C, CortexX2, +CortexX3, ExynosM3, Falkor, Kryo, Index: llvm/lib/Target/AArch64/AArch64Subtarget.cpp === --- llvm/lib/Target/AArch64/AArch64Subtarget.cpp +++ llvm/lib/Target/AArch64/AArch64Subtarget.cpp @@ -144,6 +144,7 @@ case CortexA710: case CortexA715: case CortexX2: + case CortexX3: PrefFunctionLogAlignment = 4; VScaleForTuning = 1; PrefLoopLogAlignment = 5; Index: llvm/lib/Target/AArch64/AArch64.td === --- llvm/lib/Target/AArch64/AArch64.td +++ llvm/lib/Target/AArch64/AArch64.td @@ -799,6 +799,13 @@ FeatureLSLFast, FeaturePostRAScheduler]>; +def TuneX3 : SubtargetFeature<"cortex-x3", "ARMProcFamily", "CortexX3", + "Cortex-X3 ARM processors", [ + FeatureLSLFast, + FeatureFuseAdrpAdd, + FeatureFuseAES, + FeaturePostRAScheduler]>; + def TuneA64FX : SubtargetFeature<"a64fx", "ARMProcFamily", "A64FX", "Fujitsu A64FX processors", [ FeaturePostRAScheduler, @@ -1127,6 +1134,11 @@ FeatureMatMulInt8, FeatureBF16, FeatureAM, FeatureMTE, FeatureETE, FeatureSVE2BitPerm, FeatureFP16FML]; + list X3 = [HasV9_0aOps, FeatureSVE, FeatureNEON, + FeaturePerfMon, FeatureETE, FeatureTRBE, + FeatureSPE, FeatureBF16, FeatureMatMulInt8, + FeatureMTE, FeatureSVE2BitPerm, FeatureFullFP16, + FeatureFP16FML]; list A64FX= [HasV8_2aOps, FeatureFPARMv8, FeatureNEON, FeatureSHA2, FeaturePerfMon, FeatureFullFP16, FeatureSVE, FeatureComplxNum]; @@ -1259,6 +1271,8 @@
[PATCH] D157479: [Clang][DebugInfo] Emit narrower base types for structured binding declarations that bind to struct bitfields
vhscampos updated this revision to Diff 549359. vhscampos added a comment. - Addressed the one comment regarding code. - Changed the test to use update_cc_test Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157479/new/ https://reviews.llvm.org/D157479 Files: clang/lib/CodeGen/CGDebugInfo.cpp clang/lib/CodeGen/CGDebugInfo.h clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp Index: clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp === --- /dev/null +++ clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp @@ -0,0 +1,557 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --check-globals --version 2 +// RUN: %clang_cc1 -emit-llvm -debug-info-kind=standalone -triple aarch64-arm-none-eabi %s -o - | FileCheck %s + +struct S0 { + unsigned int x : 16; + unsigned int y : 16; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS0v +// CHECK-SAME: () #[[ATTR0:[0-9]+]] !dbg [[DBG5:![0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[S0:%.*]] = alloca [[STRUCT_S0:%.*]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca [[STRUCT_S0]], align 4 +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[S0]], metadata [[META10:![0-9]+]], metadata !DIExpression()), !dbg [[DBG16:![0-9]+]] +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[META17:![0-9]+]], metadata !DIExpression()), !dbg [[DBG19:![0-9]+]] +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[META20:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2)), !dbg [[DBG21:![0-9]+]] +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[META22:![0-9]+]], metadata !DIExpression()), !dbg [[DBG23:![0-9]+]] +// CHECK-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[TMP0]], ptr align 4 [[S0]], i64 4, i1 false), !dbg [[DBG24:![0-9]+]] +// CHECK-NEXT:ret void, !dbg [[DBG25:![0-9]+]] +// +void fS0() { + S0 s0; + auto [a, b] = s0; +} + +struct S1 { 
+ volatile unsigned int x : 16; + volatile unsigned int y : 16; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS1v +// CHECK-SAME: () #[[ATTR0]] !dbg [[DBG26:![0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[S1:%.*]] = alloca [[STRUCT_S1:%.*]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca [[STRUCT_S1]], align 4 +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[S1]], metadata [[META27:![0-9]+]], metadata !DIExpression()), !dbg [[DBG33:![0-9]+]] +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[META34:![0-9]+]], metadata !DIExpression()), !dbg [[DBG36:![0-9]+]] +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[META37:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2)), !dbg [[DBG38:![0-9]+]] +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[META39:![0-9]+]], metadata !DIExpression()), !dbg [[DBG40:![0-9]+]] +// CHECK-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[TMP0]], ptr align 4 [[S1]], i64 4, i1 false), !dbg [[DBG41:![0-9]+]] +// CHECK-NEXT:ret void, !dbg [[DBG42:![0-9]+]] +// +void fS1() { + S1 s1; + auto [a, b] = s1; +} + +struct S2 { + unsigned int x : 8; + unsigned int y : 8; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS2v +// CHECK-SAME: () #[[ATTR0]] !dbg [[DBG43:![0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[S2:%.*]] = alloca [[STRUCT_S2:%.*]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca [[STRUCT_S2]], align 4 +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[S2]], metadata [[META44:![0-9]+]], metadata !DIExpression()), !dbg [[DBG49:![0-9]+]] +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[META50:![0-9]+]], metadata !DIExpression()), !dbg [[DBG52:![0-9]+]] +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[META53:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1)), !dbg [[DBG54:![0-9]+]] +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata 
[[META55:![0-9]+]], metadata !DIExpression()), !dbg [[DBG56:![0-9]+]] +// CHECK-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[TMP0]], ptr align 4 [[S2]], i64 4, i1 false), !dbg [[DBG57:![0-9]+]] +// CHECK-NEXT:ret void, !dbg [[DBG58:![0-9]+]] +// +void fS2() { + S2 s2; + auto [a, b] = s2; +} + +struct S3 { + volatile unsigned int x : 8; + volatile unsigned int y : 8; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS3v +// CHECK-SAME: () #[[ATTR0]] !dbg [[DBG59:![0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[S3:%.*]] = alloca [[STRUCT_S3:%.*]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca [[STRUCT_S3]], align 4 +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[S3]], metadata [[META60:![0-9]+]], metadata !DIExpression()), !dbg [[DBG65:![0-9]+]] +// CHECK-NEXT:call void @llv
[PATCH] D157479: [Clang][DebugInfo] Emit narrower base types for structured binding declarations that bind to struct bitfields
vhscampos marked 3 inline comments as done.
vhscampos added inline comments.

Comment at: clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp:6
+ unsigned int y : 16;
+};
+

tmatheson wrote:
> This would be easier to check if the structs had meaningful names, e.g.
> S_16_16

I left naming as it was, but now used your suggested code structure and also used update_cc_test

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157479/new/

https://reviews.llvm.org/D157479

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D157479: [Clang][DebugInfo] Emit narrower base types for structured binding declarations that bind to struct bitfields
vhscampos updated this revision to Diff 550298. vhscampos added a comment. - Redone test to cover only what's needed. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157479/new/ https://reviews.llvm.org/D157479 Files: clang/lib/CodeGen/CGDebugInfo.cpp clang/lib/CodeGen/CGDebugInfo.h clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp Index: clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp === --- /dev/null +++ clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp @@ -0,0 +1,285 @@ +// RUN: %clang_cc1 -emit-llvm -debug-info-kind=standalone -triple aarch64-arm-none-eabi %s -o - | FileCheck %s + +struct S0 { + unsigned int x : 16; + unsigned int y : 16; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS0v +// CHECK:alloca %struct.S0, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S0, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S0_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S0_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2)) +// +void fS0() { + S0 s0; + auto [a, b] = s0; +} + +struct S1 { + volatile unsigned int x : 16; + volatile unsigned int y : 16; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS1v +// CHECK:alloca %struct.S1, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S1, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S1_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S1_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2)) +// +void fS1() { + S1 s1; + auto [a, b] = s1; +} + +struct S2 { + unsigned int x : 8; + unsigned int y : 8; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS2v +// CHECK:alloca %struct.S2, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S2, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata 
[[S2_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S2_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1)) +// +void fS2() { + S2 s2; + auto [a, b] = s2; +} + +struct S3 { + volatile unsigned int x : 8; + volatile unsigned int y : 8; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS3v +// CHECK:alloca %struct.S3, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S3, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S3_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S3_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1)) +// +void fS3() { + S3 s3; + auto [a, b] = s3; +} + +struct S4 { + unsigned int x : 8; + unsigned int y : 16; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS4v +// CHECK:alloca %struct.S4, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S4, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S4_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S4_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1)) +// +void fS4() { + S4 s4; + auto [a, b] = s4; +} + +struct S5 { + volatile unsigned int x : 8; + volatile unsigned int y : 16; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS5v +// CHECK:alloca %struct.S5, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S5, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S5_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S5_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1)) +// +void fS5() { + S5 s5; + auto [a, b] = s5; +} + +struct S6 { + unsigned int x : 16; + unsigned int y : 8; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS6v +// CHECK:alloca %struct.S6, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca 
%struct.S6, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S6_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S6_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2)) +// +void fS6() { + S6 s6; + auto [a, b] = s6; +} + +struct S7 { + volatile unsigned int x : 16; + volatile unsigned int y : 8; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS7v +// CHECK:alloca %struct.S7, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S7, align 4 +// CHECK: call void @llvm.dbg
[PATCH] D157479: [Clang][DebugInfo] Emit narrower base types for structured binding declarations that bind to struct bitfields
vhscampos marked an inline comment as done.
vhscampos added a comment.

@aprantl We have discussed internally a few options to implement correct debug information in this case, but this will be future work. For the time being, we believe it is better to have no debug information than wrong debug information, as can also be concluded from the first line of https://llvm.org/docs/HowToUpdateDebugInfo.html

Comment at: clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp:525
+// CHECK: !202 = !DILocation(line: 279, column: 8, scope: !194)
+// CHECK: !203 = !DILocation(line: 279, column: 17, scope: !194)
+// CHECK: !204 = !DILocation(line: 280, column: 1, scope: !194)

aprantl wrote:
> This test is going to be a nightmare to maintain since it's hardcoding all
> the metadata numbering. Please use FileCheck variables `![[VAR:[0-9]+]]` to
> refer to other fields. Also, this test is probably checking too much.
> The primary thing this patch changes is the data types of the fields, so the
> CHECK lines should focus on that.

Ok. I have redone the checks to cover only what's needed.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157479/new/

https://reviews.llvm.org/D157479
[PATCH] D157479: [Clang][DebugInfo] Emit narrower base types for structured binding declarations that bind to struct bitfields
This revision was automatically updated to reflect the committed changes. vhscampos marked an inline comment as done. Closed by commit rGd77cba6d474a: [Clang][DebugInfo] Emit narrower base types for structured binding declarations… (authored by vhscampos). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157479/new/ https://reviews.llvm.org/D157479 Files: clang/lib/CodeGen/CGDebugInfo.cpp clang/lib/CodeGen/CGDebugInfo.h clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp Index: clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp === --- /dev/null +++ clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp @@ -0,0 +1,285 @@ +// RUN: %clang_cc1 -emit-llvm -debug-info-kind=standalone -triple aarch64-arm-none-eabi %s -o - | FileCheck %s + +struct S0 { + unsigned int x : 16; + unsigned int y : 16; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS0v +// CHECK:alloca %struct.S0, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S0, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S0_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S0_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2)) +// +void fS0() { + S0 s0; + auto [a, b] = s0; +} + +struct S1 { + volatile unsigned int x : 16; + volatile unsigned int y : 16; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS1v +// CHECK:alloca %struct.S1, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S1, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S1_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S1_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2)) +// +void fS1() { + S1 s1; + auto [a, b] = s1; +} + +struct S2 { + unsigned int x : 8; + unsigned int y : 8; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS2v +// CHECK:alloca 
%struct.S2, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S2, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S2_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S2_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1)) +// +void fS2() { + S2 s2; + auto [a, b] = s2; +} + +struct S3 { + volatile unsigned int x : 8; + volatile unsigned int y : 8; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS3v +// CHECK:alloca %struct.S3, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S3, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S3_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S3_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1)) +// +void fS3() { + S3 s3; + auto [a, b] = s3; +} + +struct S4 { + unsigned int x : 8; + unsigned int y : 16; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS4v +// CHECK:alloca %struct.S4, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S4, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S4_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S4_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1)) +// +void fS4() { + S4 s4; + auto [a, b] = s4; +} + +struct S5 { + volatile unsigned int x : 8; + volatile unsigned int y : 16; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS5v +// CHECK:alloca %struct.S5, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S5, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S5_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S5_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1)) +// +void fS5() { + S5 s5; + auto [a, b] = s5; +} + +struct S6 { + unsigned int x : 16; + 
unsigned int y : 8; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS6v +// CHECK:alloca %struct.S6, align 4 +// CHECK-NEXT:[[TMP0:%.*]] = alloca %struct.S6, align 4 +// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S6_A:![0-9]+]], metadata !DIExpression()) +// CHECK-NEXT:call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S6_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2)) +// +void fS6() { + S6 s6; + auto [a, b] = s6; +} + +struct S7 { + volatile unsigned int x : 16; + volatile unsigned int y : 8; +}; + +// CHECK-LABEL: define dso_local void @_Z3fS7v +// CHECK:
[PATCH] D157479: [Clang][DebugInfo] Emit narrower base types for structured binding declarations that bind to struct bitfields
vhscampos created this revision.
Herald added a project: All.
vhscampos requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

In cases where a structured binding declaration is made to a struct with bitfields:

  struct A {
    unsigned int x : 16;
    unsigned int y : 16;
  } g;

  auto [a, b] = g; // structured binding declaration

Clang assigns the 'unsigned int' DWARF base type to 'a' and 'b' because this is their deduced C++ type in the structured binding declaration. However, their actual type in memory is 'unsigned short' as they have 16 bits allocated for each. This is a problem for debug information consumers: if the debug information for 'a' has the 'unsigned int' base type, a debugger will assume it has 4 bytes, whereas it actually has a length of 2, resulting in a read (or write) past its length.

This patch mimics GCC's behaviour: in case of structured bindings to bitfields, the binding declaration's DWARF base type is of the target's integer type with the same bitwidth as the bitfield. If no suitable integer type is found in the target, no debug information is emitted anymore in order to prevent wrong debug output.
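The size mismatch described above can be illustrated with a small standalone C++17 sketch. This is not part of the patch: `readAt` and `demo` are hypothetical names introduced only for illustration, and the sketch assumes a 32-bit `unsigned int` on a little-endian target where 'x' occupies bytes 0-1 and 'y' bytes 2-3 (consistent with the `DW_OP_plus_uconst, 2` offset the test expects for the second binding).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

struct A {
  unsigned int x : 16;
  unsigned int y : 16;
};

// Assumption: unsigned int is 32 bits, so both bitfields share one 4-byte unit.
static_assert(sizeof(A) == 4, "sketch assumes a 4-byte struct");

// Hypothetical helper: copy sizeof(T) bytes starting at `offset` inside `obj`,
// roughly how a debugger reads a variable described as "type T at this offset".
// Returns false when the read would run past the end of the object.
template <typename T>
bool readAt(const A &obj, std::size_t offset, T &out) {
  if (offset + sizeof(T) > sizeof(A))
    return false;
  unsigned char bytes[sizeof(A)];
  std::memcpy(bytes, &obj, sizeof(A));
  std::memcpy(&out, bytes + offset, sizeof(T));
  return true;
}

void demo() {
  A g{};
  g.x = 0x1234;
  g.y = 0xABCD;

  auto [a, b] = g; // structured binding declaration
  // The C++ type of each binding is the declared member type, 'unsigned int'.
  static_assert(std::is_same_v<decltype(a), unsigned int>);
  static_assert(std::is_same_v<decltype(b), unsigned int>);
  assert(a == 0x1234 && b == 0xABCD);

  std::uint16_t narrow = 0;
  std::uint32_t wide = 0;
  // A 2-byte read at offset 2 stays in bounds and yields 'y'
  // (little-endian layout assumed).
  assert(readAt(g, 2, narrow) && narrow == 0xABCD);
  // A 4-byte read at the same offset would run past the 4-byte object.
  assert(!readAt(g, 2, wide));
}
```

Calling `demo()` from a test or from `main` exercises both reads. The point is only that a consumer trusting a 4-byte base type for the second binding would read beyond the storage, which is why a 16-bit base type is the safer description.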
Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D157479 Files: clang/lib/CodeGen/CGDebugInfo.cpp clang/lib/CodeGen/CGDebugInfo.h clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp Index: clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp === --- /dev/null +++ clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp @@ -0,0 +1,207 @@ +// RUN: %clang_cc1 -emit-llvm -debug-info-kind=standalone -triple aarch64-arm-none-eabi %s -o - | FileCheck %s + +struct S0 { + unsigned int x : 16; + unsigned int y : 16; +}; + +struct S1 { + volatile unsigned int x : 16; + volatile unsigned int y : 16; +}; + +struct S2 { + unsigned int x : 8; + unsigned int y : 8; +}; + +struct S3 { + volatile unsigned int x : 8; + volatile unsigned int y : 8; +}; + +struct S4 { + unsigned int x : 8; + unsigned int y : 16; +}; + +struct S5 { + volatile unsigned int x : 8; + volatile unsigned int y : 16; +}; + +struct S6 { + unsigned int x : 16; + unsigned int y : 8; +}; + +struct S7 { + volatile unsigned int x : 16; + volatile unsigned int y : 8; +}; + +struct S8 { + unsigned int x : 16; + volatile unsigned int y : 16; +}; + +struct S9 { + unsigned int x : 16; + unsigned int y : 32; +}; + +struct S10 { + const unsigned int x : 8; + const volatile unsigned int y : 8; + S10() : x(0), y(0) {} +}; + +// It's currently not possible to produce complete debug information for the following cases. +// Confirm that no wrong debug info is output. +// Once this is implemented, these tests should be amended. 
+struct S11 { + unsigned int x : 15; + unsigned int y : 16; +}; + +struct S12 { + unsigned int x : 16; + unsigned int y : 17; +}; + +struct __attribute__((packed)) S13 { + unsigned int x : 15; + unsigned int y : 16; +}; + +int main() { +// CHECK: %s0 = alloca %struct.S0, align 4 +// CHECK: %s1 = alloca %struct.S1, align 4 +// CHECK: %s2 = alloca %struct.S2, align 4 +// CHECK: %s3 = alloca %struct.S3, align 4 +// CHECK: %s4 = alloca %struct.S4, align 4 +// CHECK: %s5 = alloca %struct.S5, align 4 +// CHECK: %s6 = alloca %struct.S6, align 4 +// CHECK: %s7 = alloca %struct.S7, align 4 +// CHECK: %s8 = alloca %struct.S8, align 4 +// CHECK: %s9 = alloca %struct.S9, align 4 +// CHECK: %s10 = alloca %struct.S10, align 4 +// CHECK: [[ADDR0:%.*]] = alloca %struct.S0, align 4 +// CHECK: [[ADDR1:%.*]] = alloca %struct.S1, align 4 +// CHECK: [[ADDR2:%.*]] = alloca %struct.S2, align 4 +// CHECK: [[ADDR3:%.*]] = alloca %struct.S3, align 4 +// CHECK: [[ADDR4:%.*]] = alloca %struct.S4, align 4 +// CHECK: [[ADDR5:%.*]] = alloca %struct.S5, align 4 +// CHECK: [[ADDR6:%.*]] = alloca %struct.S6, align 4 +// CHECK: [[ADDR7:%.*]] = alloca %struct.S7, align 4 +// CHECK: [[ADDR8:%.*]] = alloca %struct.S8, align 4 +// CHECK: [[ADDR9:%.*]] = alloca %struct.S9, align 4 +// CHECK: [[ADDR10:%.*]] = alloca %struct.S10, align 4 +// CHECK: [[ADDR11:%.*]] = alloca %struct.S11, align 4 +// CHECK: [[ADDR12:%.*]] = alloca %struct.S12, align 4 +// CHECK: [[ADDR13:%.*]] = alloca %struct.S13, align 1 + S0 s0; + S1 s1; + S2 s2; + S3 s3; + S4 s4; + S5 s5; + S6 s6; + S7 s7; + S8 s8; + S9 s9; + S10 s10; + S11 s11; + S12 s12; + S13 s13; + +// CHECK: call void @llvm.dbg.declare(metadata ptr [[ADDR0]], metadata [[A0:![0-9]+]], metadata !DIExpression()) +// CHECK: call void @llvm.dbg.declare(metadata ptr [[ADDR0]], metadata [[B0:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2)) + auto [a0, b0] = s0; +// CHECK: call void @llvm.dbg.declare(metadata ptr [[ADDR1]], metadata [[A1:![0-9]+]], metadata 
!DIExpression()) +// CHECK: call void @llvm.dbg.declare(metadata ptr [[ADDR1]], metadata [[B1:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2)) + auto [
[PATCH] D158626: [AArch64] Add missing vrnd intrinsics
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGdbeb3d029d8e: Add missing vrnd intrinsics (authored by miyengar, committed by vhscampos). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D158626/new/ https://reviews.llvm.org/D158626 Files: clang/include/clang/Basic/arm_neon.td clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGen/aarch64-v8.5a-neon-frint3264-intrinsic.c llvm/lib/Target/AArch64/AArch64InstrInfo.td llvm/test/CodeGen/AArch64/v8.5a-neon-frint3264-intrinsic.ll Index: llvm/test/CodeGen/AArch64/v8.5a-neon-frint3264-intrinsic.ll === --- llvm/test/CodeGen/AArch64/v8.5a-neon-frint3264-intrinsic.ll +++ llvm/test/CodeGen/AArch64/v8.5a-neon-frint3264-intrinsic.ll @@ -81,3 +81,85 @@ %val = tail call <4 x float> @llvm.aarch64.neon.frint64z.v4f32(<4 x float> %a) ret <4 x float> %val } + +declare <1 x double> @llvm.aarch64.neon.frint32x.v1f64(<1 x double>) +declare <2 x double> @llvm.aarch64.neon.frint32x.v2f64(<2 x double>) +declare <1 x double> @llvm.aarch64.neon.frint32z.v1f64(<1 x double>) +declare <2 x double> @llvm.aarch64.neon.frint32z.v2f64(<2 x double>) + +define dso_local <1 x double> @t_vrnd32x_f64(<1 x double> %a) { +; CHECK-LABEL: t_vrnd32x_f64: +; CHECK: frint32x d0, d0 +; CHECK-NEXT:ret +entry: + %val = tail call <1 x double> @llvm.aarch64.neon.frint32x.v1f64(<1 x double> %a) + ret <1 x double> %val +} + +define dso_local <2 x double> @t_vrnd32xq_f64(<2 x double> %a) { +; CHECK-LABEL: t_vrnd32xq_f64: +; CHECK: frint32x v0.2d, v0.2d +; CHECK-NEXT:ret +entry: + %val = tail call <2 x double> @llvm.aarch64.neon.frint32x.v2f64(<2 x double> %a) + ret <2 x double> %val +} + +define dso_local <1 x double> @t_vrnd32z_f64(<1 x double> %a) { +; CHECK-LABEL: t_vrnd32z_f64: +; CHECK: frint32z d0, d0 +; CHECK-NEXT:ret +entry: + %val = tail call <1 x double> @llvm.aarch64.neon.frint32z.v1f64(<1 x double> %a) + ret <1 x double> %val 
+} + +define dso_local <2 x double> @t_vrnd32zq_f64(<2 x double> %a) { +; CHECK-LABEL: t_vrnd32zq_f64: +; CHECK: frint32z v0.2d, v0.2d +; CHECK-NEXT:ret +entry: + %val = tail call <2 x double> @llvm.aarch64.neon.frint32z.v2f64(<2 x double> %a) + ret <2 x double> %val +} + +declare <1 x double> @llvm.aarch64.neon.frint64x.v1f64(<1 x double>) +declare <2 x double> @llvm.aarch64.neon.frint64x.v2f64(<2 x double>) +declare <1 x double> @llvm.aarch64.neon.frint64z.v1f64(<1 x double>) +declare <2 x double> @llvm.aarch64.neon.frint64z.v2f64(<2 x double>) + +define dso_local <1 x double> @t_vrnd64x_f64(<1 x double> %a) { +; CHECK-LABEL: t_vrnd64x_f64: +; CHECK: frint64x d0, d0 +; CHECK-NEXT:ret +entry: + %val = tail call <1 x double> @llvm.aarch64.neon.frint64x.v1f64(<1 x double> %a) + ret <1 x double> %val +} + +define dso_local <2 x double> @t_vrnd64xq_f64(<2 x double> %a) { +; CHECK-LABEL: t_vrnd64xq_f64: +; CHECK: frint64x v0.2d, v0.2d +; CHECK-NEXT:ret +entry: + %val = tail call <2 x double> @llvm.aarch64.neon.frint64x.v2f64(<2 x double> %a) + ret <2 x double> %val +} + +define dso_local <1 x double> @t_vrnd64z_f64(<1 x double> %a) { +; CHECK-LABEL: t_vrnd64z_f64: +; CHECK: frint64z d0, d0 +; CHECK-NEXT:ret +entry: + %val = tail call <1 x double> @llvm.aarch64.neon.frint64z.v1f64(<1 x double> %a) + ret <1 x double> %val +} + +define dso_local <2 x double> @t_vrnd64zq_f64(<2 x double> %a) { +; CHECK-LABEL: t_vrnd64zq_f64: +; CHECK: frint64z v0.2d, v0.2d +; CHECK-NEXT:ret +entry: + %val = tail call <2 x double> @llvm.aarch64.neon.frint64z.v2f64(<2 x double> %a) + ret <2 x double> %val +} Index: llvm/lib/Target/AArch64/AArch64InstrInfo.td === --- llvm/lib/Target/AArch64/AArch64InstrInfo.td +++ llvm/lib/Target/AArch64/AArch64InstrInfo.td @@ -4447,6 +4447,16 @@ defm FRINT64X : FRIntNNT<0b11, "frint64x", int_aarch64_frint64x>; } // HasFRInt3264 +// Pattern to convert 1x64 vector intrinsics to equivalent scalar instructions +def : Pat<(v1f64 (int_aarch64_neon_frint32z (v1f64 
FPR64:$Rn))), + (FRINT32ZDr FPR64:$Rn)>; +def : Pat<(v1f64 (int_aarch64_neon_frint64z (v1f64 FPR64:$Rn))), + (FRINT64ZDr FPR64:$Rn)>; +def : Pat<(v1f64 (int_aarch64_neon_frint32x (v1f64 FPR64:$Rn))), + (FRINT32XDr FPR64:$Rn)>; +def : Pat<(v1f64 (int_aarch64_neon_frint64x (v1f64 FPR64:$Rn))), + (FRINT64XDr FPR64:$Rn)>; + // Emitting strict_lrint as two instructions is valid as any exceptions that // occur will happen in exactly one of the instructions (e.g. if the input is // not an integer the inexact exception will happen in the FRINTX but not then Index: clang/test/CodeGen/aarch64-v8.5a-neon-frint3264-intrinsic.c =
[PATCH] D70862: [ARM][AArch64] Complex addition Neon intrinsics for Armv8.3-A
vhscampos created this revision. Herald added subscribers: llvm-commits, cfe-commits, hiraditya, kristof.beyls. Herald added projects: clang, LLVM. Add support for vcadd_* family of intrinsics. This set of intrinsics is available in Armv8.3-A. The fp16 versions require the FP16 extension, which has been available (opt-in) since Armv8.2-A. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D70862 Files: clang/include/clang/Basic/arm_neon.td clang/lib/Basic/Targets/AArch64.cpp clang/lib/Basic/Targets/ARM.cpp clang/lib/Basic/Targets/ARM.h clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGen/aarch64-neon-vcadd.c clang/test/CodeGen/arm-neon-vcadd.c llvm/include/llvm/IR/IntrinsicsAArch64.td llvm/include/llvm/IR/IntrinsicsARM.td llvm/lib/Target/AArch64/AArch64InstrInfo.td llvm/lib/Target/ARM/ARMInstrNEON.td llvm/test/CodeGen/AArch64/neon-vcadd.ll llvm/test/CodeGen/ARM/neon-vcadd.ll Index: llvm/test/CodeGen/ARM/neon-vcadd.ll === --- /dev/null +++ llvm/test/CodeGen/ARM/neon-vcadd.ll @@ -0,0 +1,54 @@ +; RUN: llc %s -mtriple=arm -mattr=+armv8.3-a,+fullfp16 -o - | FileCheck %s + +define <4 x half> @foo16x4_rot(<4 x half> %a, <4 x half> %b) { +entry: +; CHECK-LABEL: foo16x4_rot +; CHECK-DAG: vcadd.f16 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}, #90 +; CHECK-DAG: vcadd.f16 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}, #270 + %vcadd_rot90_v2.i = tail call <4 x half> @llvm.arm.neon.vcadd.rot90.v4f16(<4 x half> %a, <4 x half> %b) + %vcadd_rot270_v2.i = tail call <4 x half> @llvm.arm.neon.vcadd.rot270.v4f16(<4 x half> %a, <4 x half> %b) + %add = fadd <4 x half> %vcadd_rot90_v2.i, %vcadd_rot270_v2.i + ret <4 x half> %add +} + +define <2 x float> @foo32x2_rot(<2 x float> %a, <2 x float> %b) { +entry: +; CHECK-LABEL: foo32x2_rot +; CHECK-DAG: vcadd.f32 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}, #90 +; CHECK-DAG: vcadd.f32 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}, #270 + %vcadd_rot90_v2.i = tail call <2 x float> @llvm.arm.neon.vcadd.rot90.v2f32(<2 x float> %a, <2 x float> %b) + %vcadd_rot270_v2.i 
= tail call <2 x float> @llvm.arm.neon.vcadd.rot270.v2f32(<2 x float> %a, <2 x float> %b) + %add = fadd <2 x float> %vcadd_rot90_v2.i, %vcadd_rot270_v2.i + ret <2 x float> %add +} + +define <8 x half> @foo16x8_rot(<8 x half> %a, <8 x half> %b) { +entry: +; CHECK-LABEL: foo16x8_rot +; CHECK-DAG: vcadd.f16 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}, #90 +; CHECK-DAG: vcadd.f16 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}, #270 + %vcaddq_rot90_v2.i = tail call <8 x half> @llvm.arm.neon.vcadd.rot90.v8f16(<8 x half> %a, <8 x half> %b) + %vcaddq_rot270_v2.i = tail call <8 x half> @llvm.arm.neon.vcadd.rot270.v8f16(<8 x half> %a, <8 x half> %b) + %add = fadd <8 x half> %vcaddq_rot90_v2.i, %vcaddq_rot270_v2.i + ret <8 x half> %add +} + +define <4 x float> @foo32x4_rot(<4 x float> %a, <4 x float> %b) { +entry: +; CHECK-LABEL: foo32x4_rot +; CHECK-DAG: vcadd.f32 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}, #90 +; CHECK-DAG: vcadd.f32 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}, #270 + %vcaddq_rot90_v2.i = tail call <4 x float> @llvm.arm.neon.vcadd.rot90.v4f32(<4 x float> %a, <4 x float> %b) + %vcaddq_rot270_v2.i = tail call <4 x float> @llvm.arm.neon.vcadd.rot270.v4f32(<4 x float> %a, <4 x float> %b) + %add = fadd <4 x float> %vcaddq_rot90_v2.i, %vcaddq_rot270_v2.i + ret <4 x float> %add +} + +declare <4 x half> @llvm.arm.neon.vcadd.rot90.v4f16(<4 x half>, <4 x half>) +declare <4 x half> @llvm.arm.neon.vcadd.rot270.v4f16(<4 x half>, <4 x half>) +declare <2 x float> @llvm.arm.neon.vcadd.rot90.v2f32(<2 x float>, <2 x float>) +declare <2 x float> @llvm.arm.neon.vcadd.rot270.v2f32(<2 x float>, <2 x float>) +declare <8 x half> @llvm.arm.neon.vcadd.rot90.v8f16(<8 x half>, <8 x half>) +declare <8 x half> @llvm.arm.neon.vcadd.rot270.v8f16(<8 x half>, <8 x half>) +declare <4 x float> @llvm.arm.neon.vcadd.rot90.v4f32(<4 x float>, <4 x float>) +declare <4 x float> @llvm.arm.neon.vcadd.rot270.v4f32(<4 x float>, <4 x float>) Index: llvm/test/CodeGen/AArch64/neon-vcadd.ll === --- /dev/null +++ 
llvm/test/CodeGen/AArch64/neon-vcadd.ll @@ -0,0 +1,67 @@ +; RUN: llc %s -mtriple=aarch64 -mattr=+v8.3a,+fullfp16 -o - | FileCheck %s + +define <4 x half> @foo16x4_rot(<4 x half> %a, <4 x half> %b) { +entry: +; CHECK-LABEL: foo16x4_rot +; CHECK-DAG: fcadd v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, #90 +; CHECK-DAG: fcadd v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, #270 + %vcadd_rot90_v2.i = tail call <4 x half> @llvm.aarch64.neon.vcadd.rot90.v4f16(<4 x half> %a, <4 x half> %b) + %vcadd_rot270_v2.i = tail call <4 x half> @llvm.aarch64.neon.vcadd.rot270.v4f16(<4 x half> %a, <4 x half> %b) + %add = fadd <4 x half> %vcadd_rot90_v2.i, %vcadd_rot270_v2.i + ret <4 x half> %add +} + +define <2 x float> @foo32x2_rot(<2 x float> %a, <2 x float> %b) { +entry: +; CHECK-LABEL: foo32x2_rot +; CHECK-DAG: fcadd v{{[0-9]+}}.2s, v{{[0-
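For readers following the tests above, here is a minimal scalar sketch of what the rot90 and rot270 complex additions compute. It is a reference model only, not the Clang intrinsics or the LLVM lowering. The helper names `caddRot90` and `caddRot270` are hypothetical, and the layout is assumed to be interleaved complex pairs (even lane real, odd lane imaginary).

```cpp
#include <array>
#include <cstddef>

// Scalar reference model (hypothetical helpers, not the vcadd_* intrinsics).
// Each pair of lanes holds one complex number: even lane = real part,
// odd lane = imaginary part.

// a + i*b per complex pair: b is rotated by 90 degrees before the add.
template <std::size_t N>
std::array<float, N> caddRot90(const std::array<float, N> &a,
                               const std::array<float, N> &b) {
  static_assert(N % 2 == 0, "lanes must form whole complex pairs");
  std::array<float, N> r{};
  for (std::size_t i = 0; i < N; i += 2) {
    r[i] = a[i] - b[i + 1];     // re(a) - im(b)
    r[i + 1] = a[i + 1] + b[i]; // im(a) + re(b)
  }
  return r;
}

// a - i*b per complex pair: b is rotated by 270 degrees before the add.
template <std::size_t N>
std::array<float, N> caddRot270(const std::array<float, N> &a,
                                const std::array<float, N> &b) {
  static_assert(N % 2 == 0, "lanes must form whole complex pairs");
  std::array<float, N> r{};
  for (std::size_t i = 0; i < N; i += 2) {
    r[i] = a[i] + b[i + 1];     // re(a) + im(b)
    r[i + 1] = a[i + 1] - b[i]; // im(a) - re(b)
  }
  return r;
}
```

Under this model, adding the rot90 and rot270 results (as each `foo*_rot` test function does with `fadd`) cancels the contribution of `b` and leaves `2 * a` in every lane.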
[PATCH] D70862: [ARM][AArch64] Complex addition Neon intrinsics for Armv8.3-A
This revision was automatically updated to reflect the committed changes. Closed by commit rGdcf11c5e86ce: [ARM][AArch64] Complex addition Neon intrinsics for Armv8.3-A (authored by vhscampos). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70862/new/ https://reviews.llvm.org/D70862 Files: clang/include/clang/Basic/arm_neon.td clang/lib/Basic/Targets/AArch64.cpp clang/lib/Basic/Targets/ARM.cpp clang/lib/Basic/Targets/ARM.h clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGen/aarch64-neon-vcadd.c clang/test/CodeGen/arm-neon-vcadd.c llvm/include/llvm/IR/IntrinsicsAArch64.td llvm/include/llvm/IR/IntrinsicsARM.td llvm/lib/Target/AArch64/AArch64InstrInfo.td llvm/lib/Target/ARM/ARMInstrNEON.td llvm/test/CodeGen/AArch64/neon-vcadd.ll llvm/test/CodeGen/ARM/neon-vcadd.ll Index: llvm/test/CodeGen/ARM/neon-vcadd.ll === --- /dev/null +++ llvm/test/CodeGen/ARM/neon-vcadd.ll @@ -0,0 +1,54 @@ +; RUN: llc %s -mtriple=arm -mattr=+armv8.3-a,+fullfp16 -o - | FileCheck %s + +define <4 x half> @foo16x4_rot(<4 x half> %a, <4 x half> %b) { +entry: +; CHECK-LABEL: foo16x4_rot +; CHECK-DAG: vcadd.f16 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}, #90 +; CHECK-DAG: vcadd.f16 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}, #270 + %vcadd_rot90_v2.i = tail call <4 x half> @llvm.arm.neon.vcadd.rot90.v4f16(<4 x half> %a, <4 x half> %b) + %vcadd_rot270_v2.i = tail call <4 x half> @llvm.arm.neon.vcadd.rot270.v4f16(<4 x half> %a, <4 x half> %b) + %add = fadd <4 x half> %vcadd_rot90_v2.i, %vcadd_rot270_v2.i + ret <4 x half> %add +} + +define <2 x float> @foo32x2_rot(<2 x float> %a, <2 x float> %b) { +entry: +; CHECK-LABEL: foo32x2_rot +; CHECK-DAG: vcadd.f32 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}, #90 +; CHECK-DAG: vcadd.f32 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}, #270 + %vcadd_rot90_v2.i = tail call <2 x float> @llvm.arm.neon.vcadd.rot90.v2f32(<2 x float> %a, <2 x float> %b) + %vcadd_rot270_v2.i = tail call <2 x float> @llvm.arm.neon.vcadd.rot270.v2f32(<2 x float> %a, <2 x 
float> %b) + %add = fadd <2 x float> %vcadd_rot90_v2.i, %vcadd_rot270_v2.i + ret <2 x float> %add +} + +define <8 x half> @foo16x8_rot(<8 x half> %a, <8 x half> %b) { +entry: +; CHECK-LABEL: foo16x8_rot +; CHECK-DAG: vcadd.f16 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}, #90 +; CHECK-DAG: vcadd.f16 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}, #270 + %vcaddq_rot90_v2.i = tail call <8 x half> @llvm.arm.neon.vcadd.rot90.v8f16(<8 x half> %a, <8 x half> %b) + %vcaddq_rot270_v2.i = tail call <8 x half> @llvm.arm.neon.vcadd.rot270.v8f16(<8 x half> %a, <8 x half> %b) + %add = fadd <8 x half> %vcaddq_rot90_v2.i, %vcaddq_rot270_v2.i + ret <8 x half> %add +} + +define <4 x float> @foo32x4_rot(<4 x float> %a, <4 x float> %b) { +entry: +; CHECK-LABEL: foo32x4_rot +; CHECK-DAG: vcadd.f32 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}, #90 +; CHECK-DAG: vcadd.f32 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}, #270 + %vcaddq_rot90_v2.i = tail call <4 x float> @llvm.arm.neon.vcadd.rot90.v4f32(<4 x float> %a, <4 x float> %b) + %vcaddq_rot270_v2.i = tail call <4 x float> @llvm.arm.neon.vcadd.rot270.v4f32(<4 x float> %a, <4 x float> %b) + %add = fadd <4 x float> %vcaddq_rot90_v2.i, %vcaddq_rot270_v2.i + ret <4 x float> %add +} + +declare <4 x half> @llvm.arm.neon.vcadd.rot90.v4f16(<4 x half>, <4 x half>) +declare <4 x half> @llvm.arm.neon.vcadd.rot270.v4f16(<4 x half>, <4 x half>) +declare <2 x float> @llvm.arm.neon.vcadd.rot90.v2f32(<2 x float>, <2 x float>) +declare <2 x float> @llvm.arm.neon.vcadd.rot270.v2f32(<2 x float>, <2 x float>) +declare <8 x half> @llvm.arm.neon.vcadd.rot90.v8f16(<8 x half>, <8 x half>) +declare <8 x half> @llvm.arm.neon.vcadd.rot270.v8f16(<8 x half>, <8 x half>) +declare <4 x float> @llvm.arm.neon.vcadd.rot90.v4f32(<4 x float>, <4 x float>) +declare <4 x float> @llvm.arm.neon.vcadd.rot270.v4f32(<4 x float>, <4 x float>) Index: llvm/test/CodeGen/AArch64/neon-vcadd.ll === --- /dev/null +++ llvm/test/CodeGen/AArch64/neon-vcadd.ll @@ -0,0 +1,67 @@ +; RUN: llc %s -mtriple=aarch64 
-mattr=+v8.3a,+fullfp16 -o - | FileCheck %s + +define <4 x half> @foo16x4_rot(<4 x half> %a, <4 x half> %b) { +entry: +; CHECK-LABEL: foo16x4_rot +; CHECK-DAG: fcadd v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, #90 +; CHECK-DAG: fcadd v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, #270 + %vcadd_rot90_v2.i = tail call <4 x half> @llvm.aarch64.neon.vcadd.rot90.v4f16(<4 x half> %a, <4 x half> %b) + %vcadd_rot270_v2.i = tail call <4 x half> @llvm.aarch64.neon.vcadd.rot270.v4f16(<4 x half> %a, <4 x half> %b) + %add = fadd <4 x half> %vcadd_rot90_v2.i, %vcadd_rot270_v2.i + ret <4 x half> %add +} + +define <2 x float> @foo32x2_rot(<2 x float> %a, <2 x float> %b) { +entry: +; CHECK-LABEL: foo32x2_rot +; CHECK-DAG: fcadd v{{[0-9]+}}.2s, v{{[0-9]+}}.2s, v{{[0-9]+}}.2s, #90 +; CHECK-DAG: fcadd v{{[0-9]+}}.2s, v{{[0-9]+}}.2s, v
[PATCH] D116153: [ARM][AArch64] Add missing v8.x checks
vhscampos added a comment. I don't see any problem with the patch, but we should wait on @SjoerdMeijer. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D116153/new/ https://reviews.llvm.org/D116153 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D118757: [AArch64] Remove unused feature flags from AArch64TargetInfo
vhscampos accepted this revision. vhscampos added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D118757/new/ https://reviews.llvm.org/D118757
[PATCH] D116154: [ARM] Adding macros for coprocessor intrinsics as per ACLE
vhscampos added inline comments.

Comment at: clang/lib/Basic/Targets/ARM.cpp:929-949
+ if (ArchKind == llvm::ARM::ArchKind::ARMV8A ||
+ ArchKind == llvm::ARM::ArchKind::ARMV8R ||
+ ArchKind == llvm::ARM::ArchKind::ARMV8_1A ||
+ ArchKind == llvm::ARM::ArchKind::ARMV8_2A ||
+ ArchKind == llvm::ARM::ArchKind::ARMV8_3A ||
+ ArchKind == llvm::ARM::ArchKind::ARMV8_4A ||
+ ArchKind == llvm::ARM::ArchKind::ARMV8_5A ||

Consider merging these two if statements.

Comment at: clang/lib/Basic/Targets/ARM.cpp:951
+
+ if (ArchKind == llvm::ARM::ArchKind::ARMV8MMainline) {
+Builder.defineMacro("__ARM_TARGET_COPROC", "1");

Is v8.1-M not included on purpose?

Comment at: clang/test/Preprocessor/aarch64-target-features.c:46
 // CHECK-NOT: __ARM_SIZEOF_WCHAR_T 2
+// CHECK-NOT: __ARM_TARGET_COPROC 1
 // CHECK-NOT: __ARM_FEATURE_SVE

I don't see any change to the AArch64 target macros. Does it use the same function as ARM?

Repository: rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION https://reviews.llvm.org/D116154/new/ https://reviews.llvm.org/D116154
[PATCH] D115507: Add PACBTI-M support to LLVM release notes.
vhscampos accepted this revision. vhscampos added a comment. This revision is now accepted and ready to land. LGTM. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D115507/new/ https://reviews.llvm.org/D115507
[PATCH] D100372: [Clang][ARM] Define __VFP_FP__ macro unconditionally
vhscampos created this revision. Herald added subscribers: danielkiss, kristof.beyls. vhscampos requested review of this revision. Herald added a project: clang. Herald added a subscriber: cfe-commits. Clang only defines __VFP_FP__ when the FPU is enabled. However, gcc defines it unconditionally. This patch aligns Clang with gcc. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D100372 Files: clang/lib/Basic/Targets/ARM.cpp clang/test/Preprocessor/arm-target-features.c Index: clang/test/Preprocessor/arm-target-features.c === --- clang/test/Preprocessor/arm-target-features.c +++ clang/test/Preprocessor/arm-target-features.c @@ -141,6 +141,11 @@ // CHECK-V7S-NOT: __ARM_FEATURE_DIRECTED_ROUNDING // CHECK-V7S: #define __ARM_FP 0xe +// RUN: %clang -target arm-arm-none-eabi -march=armv7-m -mfloat-abi=soft -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-VFP-FP %s +// RUN: %clang -target arm-arm-none-eabi -march=armv7-m -mfloat-abi=softfp -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-VFP-FP %s +// RUN: %clang -target arm-arm-none-eabi -march=armv7-m -mfloat-abi=hard -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-VFP-FP %s +// CHECK-VFP-FP: #define __VFP_FP__ 1 + // RUN: %clang -target armv8a -mfloat-abi=hard -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-V8-BAREHF %s // CHECK-V8-BAREHF: #define __ARMEL__ 1 // CHECK-V8-BAREHF: #define __ARM_ARCH 8 Index: clang/lib/Basic/Targets/ARM.cpp === --- clang/lib/Basic/Targets/ARM.cpp +++ clang/lib/Basic/Targets/ARM.cpp @@ -755,8 +755,9 @@ // Note, this is always on in gcc, even though it doesn't make sense. 
Builder.defineMacro("__APCS_32__"); + Builder.defineMacro("__VFP_FP__"); + + if (FPUModeIsVFP((FPUMode)FPU)) { -Builder.defineMacro("__VFP_FP__"); if (FPU & VFP2FPU) Builder.defineMacro("__ARM_VFPV2__"); if (FPU & VFP3FPU)
[PATCH] D100372: [Clang][ARM] Define __VFP_FP__ macro unconditionally
vhscampos updated this revision to Diff 337723. vhscampos added a comment. Add a clarifying comment. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D100372/new/ https://reviews.llvm.org/D100372 Files: clang/lib/Basic/Targets/ARM.cpp clang/test/Preprocessor/arm-target-features.c Index: clang/test/Preprocessor/arm-target-features.c === --- clang/test/Preprocessor/arm-target-features.c +++ clang/test/Preprocessor/arm-target-features.c @@ -141,6 +141,11 @@ // CHECK-V7S-NOT: __ARM_FEATURE_DIRECTED_ROUNDING // CHECK-V7S: #define __ARM_FP 0xe +// RUN: %clang -target arm-arm-none-eabi -march=armv7-m -mfloat-abi=soft -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-VFP-FP %s +// RUN: %clang -target arm-arm-none-eabi -march=armv7-m -mfloat-abi=softfp -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-VFP-FP %s +// RUN: %clang -target arm-arm-none-eabi -march=armv7-m -mfloat-abi=hard -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-VFP-FP %s +// CHECK-VFP-FP: #define __VFP_FP__ 1 + // RUN: %clang -target armv8a -mfloat-abi=hard -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-V8-BAREHF %s // CHECK-V8-BAREHF: #define __ARMEL__ 1 // CHECK-V8-BAREHF: #define __ARM_ARCH 8 Index: clang/lib/Basic/Targets/ARM.cpp === --- clang/lib/Basic/Targets/ARM.cpp +++ clang/lib/Basic/Targets/ARM.cpp @@ -755,8 +755,12 @@ // Note, this is always on in gcc, even though it doesn't make sense. Builder.defineMacro("__APCS_32__"); + // __VFP_FP__ means that the floating-point format is VFP, not that a hardware + // FPU is present. Moreover, the VFP format is the only one supported by + // clang. For these reasons, this macro is always defined. 
+ Builder.defineMacro("__VFP_FP__"); + + if (FPUModeIsVFP((FPUMode)FPU)) { -Builder.defineMacro("__VFP_FP__"); if (FPU & VFP2FPU) Builder.defineMacro("__ARM_VFPV2__"); if (FPU & VFP3FPU)
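The clarifying comment in this revision says `__VFP_FP__` describes the floating-point format, not the presence of a hardware FPU. The sketch below shows how user code might tell the two apart. It assumes the ACLE macro `__ARM_FP` (visible in the test as `#define __ARM_FP 0xe`) is defined only when hardware floating point is available, which is my understanding of ACLE and not something this patch states. The function name `describeFloatSupport` is hypothetical.

```cpp
#include <string>

// Reports the floating-point configuration the compiler advertises through
// predefined macros at build time.
//   __ARM_FP   : assumed (per ACLE) to be defined only with hardware FP.
//   __VFP_FP__ : per the patch comment, only says the FP *format* is VFP.
std::string describeFloatSupport() {
#if defined(__ARM_FP)
  return "hardware floating point available";
#elif defined(__VFP_FP__)
  return "VFP format, no hardware floating point reported";
#else
  return "not an Arm target, or neither macro is defined";
#endif
}
```

With this patch applied, a build such as `-march=armv7-m -mfloat-abi=soft` would be expected to take the second branch, since `__VFP_FP__` is now defined even without an FPU.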
[PATCH] D100372: [Clang][ARM] Define __VFP_FP__ macro unconditionally
vhscampos added a comment. Thanks Peter. Since one week has passed, I plan to commit these changes by the end of the day if nothing surfaces. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D100372/new/ https://reviews.llvm.org/D100372
[PATCH] D100372: [Clang][ARM] Define __VFP_FP__ macro unconditionally
This revision was automatically updated to reflect the committed changes. Closed by commit rGee3e01627ff8: [Clang][ARM] Define __VFP_FP__ macro unconditionally (authored by vhscampos). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D100372/new/ https://reviews.llvm.org/D100372 Files: clang/lib/Basic/Targets/ARM.cpp clang/test/Preprocessor/arm-target-features.c Index: clang/test/Preprocessor/arm-target-features.c === --- clang/test/Preprocessor/arm-target-features.c +++ clang/test/Preprocessor/arm-target-features.c @@ -141,6 +141,11 @@ // CHECK-V7S-NOT: __ARM_FEATURE_DIRECTED_ROUNDING // CHECK-V7S: #define __ARM_FP 0xe +// RUN: %clang -target arm-arm-none-eabi -march=armv7-m -mfloat-abi=soft -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-VFP-FP %s +// RUN: %clang -target arm-arm-none-eabi -march=armv7-m -mfloat-abi=softfp -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-VFP-FP %s +// RUN: %clang -target arm-arm-none-eabi -march=armv7-m -mfloat-abi=hard -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-VFP-FP %s +// CHECK-VFP-FP: #define __VFP_FP__ 1 + // RUN: %clang -target armv8a -mfloat-abi=hard -x c -E -dM %s | FileCheck -match-full-lines --check-prefix=CHECK-V8-BAREHF %s // CHECK-V8-BAREHF: #define __ARMEL__ 1 // CHECK-V8-BAREHF: #define __ARM_ARCH 8 Index: clang/lib/Basic/Targets/ARM.cpp === --- clang/lib/Basic/Targets/ARM.cpp +++ clang/lib/Basic/Targets/ARM.cpp @@ -755,8 +755,12 @@ // Note, this is always on in gcc, even though it doesn't make sense. Builder.defineMacro("__APCS_32__"); + // __VFP_FP__ means that the floating-point format is VFP, not that a hardware + // FPU is present. Moreover, the VFP format is the only one supported by + // clang. For these reasons, this macro is always defined. 
+ Builder.defineMacro("__VFP_FP__"); + + if (FPUModeIsVFP((FPUMode)FPU)) { -Builder.defineMacro("__VFP_FP__"); if (FPU & VFP2FPU) Builder.defineMacro("__ARM_VFPV2__"); if (FPU & VFP3FPU)
[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG3550e242fad6: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A (authored by vhscampos). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D109517/new/ https://reviews.llvm.org/D109517 Files: clang/lib/Basic/Targets/AArch64.cpp clang/lib/Basic/Targets/AArch64.h clang/lib/Basic/Targets/ARM.cpp clang/lib/Driver/ToolChains/Arch/AArch64.cpp clang/test/Driver/aarch64-cpus.c clang/test/Driver/arm-cortex-cpus.c clang/test/Preprocessor/aarch64-target-features.c clang/test/Preprocessor/arm-target-features.c llvm/include/llvm/ADT/Triple.h llvm/include/llvm/Support/AArch64TargetParser.def llvm/include/llvm/Support/ARMTargetParser.def llvm/lib/Support/AArch64TargetParser.cpp llvm/lib/Support/ARMTargetParser.cpp llvm/lib/Support/Triple.cpp llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64InstrInfo.td llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp llvm/lib/Target/ARM/ARM.td llvm/lib/Target/ARM/ARMSubtarget.h llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp llvm/test/MC/AArch64/SME/directives-negative.s llvm/test/MC/AArch64/SME/directives.s llvm/test/MC/AArch64/SVE2/directive-arch-negative.s llvm/test/MC/AArch64/SVE2/directive-arch.s llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -31,6 +31,8 @@ "armv8.5a","armv8.6-a","armv8.6a","armv8.7-a","armv8.7a", "armv8-r", "armv8r", "armv8-m.base","armv8m.base", "armv8-m.main", "armv8m.main", "iwmmxt", "iwmmxt2", "xscale", "armv8.1-m.main", +"armv9-a", "armv9","armv9a", "armv9.1-a","armv9.1a", +"armv9.2-a", "armv9.2a", }; template @@ -492,6 +494,15 @@ EXPECT_TRUE( testARMArch("armv8.7-a", "generic", "v8.7a", 
ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9-a", "generic", "v9a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9.1-a", "generic", "v9.1a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9.2-a", "generic", "v9.2a", + ARMBuildAttrs::CPUArch::v8_A)); EXPECT_TRUE( testARMArch("armv8-r", "cortex-r52", "v8r", ARMBuildAttrs::CPUArch::v8_R)); @@ -821,6 +832,9 @@ case ARM::ArchKind::ARMV8_5A: case ARM::ArchKind::ARMV8_6A: case ARM::ArchKind::ARMV8_7A: +case ARM::ArchKind::ARMV9A: +case ARM::ArchKind::ARMV9_1A: +case ARM::ArchKind::ARMV9_2A: EXPECT_EQ(ARM::ProfileKind::A, ARM::parseArchProfile(ARMArch[i])); break; default: @@ -1204,6 +1218,12 @@ ARMBuildAttrs::CPUArch::v8_A)); EXPECT_TRUE(testAArch64Arch("armv8.7-a", "generic", "v8.7a", ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9-a", "generic", "v9a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9.1-a", "generic", "v9.1a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9.2-a", "generic", "v9.2a", + ARMBuildAttrs::CPUArch::v8_A)); } bool testAArch64Extension(StringRef CPUName, AArch64::ArchKind AK, Index: llvm/test/MC/AArch64/SVE2/directive-arch.s === --- llvm/test/MC/AArch64/SVE2/directive-arch.s +++ llvm/test/MC/AArch64/SVE2/directive-arch.s @@ -1,21 +1,21 @@ // RUN: llvm-mc -triple aarch64 -filetype asm -o - %s 2>&1 | FileCheck %s -.arch armv8-a+sve2 +.arch armv9-a+sve2 tbx z0.b, z1.b, z2.b // CHECK: tbx z0.b, z1.b, z2.b -.arch armv8-a+sve2-aes +.arch armv9-a+sve2-aes aesd z23.b, z23.b, z13.b // CHECK: aesd z23.b, z23.b, z13.b -.arch armv8-a+sve2-sm4 +.arch armv9-a+sve2-sm4 sm4e z0.s, z0.s, z0.s // CHECK: sm4e z0.s, z0.s, z0.s -.arch armv8-a+sve2-sha3 +.arch armv9-a+sve2-sha3 rax1 z0.d, z0.d, z0.d // CHECK: rax1 z0.d, z0.d, z0.d -.arch armv8-a+sve2-bitperm +.arch armv9-a+sve2-bitperm bgrp z21.s, z10.s, z21.s // CHECK: bgrp z21.s, z10.s, z21.s Index: 
llvm/test/MC/AArch64/SVE2/directive-arch-negative.s === --- llvm/test/MC/AArch64/SVE2/directive-arch-negative.s +++ llvm/test/MC/AArch64/SVE2/directive-arch-negative.s @@ -1,31 +1,31 @@ // RUN: not llvm-mc -triple aarch64 -filetype asm -o - %s 2>&1 | FileCheck %s -.arch armv8-a+sve2 -.arch armv8-a+nosve2
[PATCH] D110241: [docs] List support for Armv9-A, Armv9.1-A and Armv9.2-A in LLVM and Clang
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG3e7cf33a8376: [docs] List support for Armv9-A, Armv9.1-A and Armv9.2-A in LLVM and Clang (authored by vhscampos). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110241/new/ https://reviews.llvm.org/D110241 Files: clang/docs/ReleaseNotes.rst llvm/docs/ReleaseNotes.rst Index: llvm/docs/ReleaseNotes.rst === --- llvm/docs/ReleaseNotes.rst +++ llvm/docs/ReleaseNotes.rst @@ -73,12 +73,12 @@ Changes to the AArch64 Backend -- -* ... +* Added support for the Armv9-A, Armv9.1-A and Armv9.2-A architectures. Changes to the ARM Backend -- -During this release ... +* Added support for the Armv9-A, Armv9.1-A and Armv9.2-A architectures. Changes to the MIPS Target -- Index: clang/docs/ReleaseNotes.rst === --- clang/docs/ReleaseNotes.rst +++ clang/docs/ReleaseNotes.rst @@ -83,6 +83,12 @@ - RISC-V SiFive S54 (``sifive-s54``). - RISC-V SiFive S76 (``sifive-s76``). +- Support has been added for the following architectures (``-march`` identifiers in parentheses): + + - Armv9-A (``armv9-a``). + - Armv9.1-A (``armv9.1-a``). + - Armv9.2-A (``armv9.2-a``). + Removed Compiler Flags -
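As a rough illustration of the `-march` identifiers listed in the release note, the sketch below maps each spelling to the sub-arch string that the TargetParserTest expectations in this thread pair it with (`v9a`, `v9.1a`, `v9.2a`). It is a hypothetical lookup table, not LLVM's target parser. The test shows the mapping only for the hyphenated names, so treating the short spellings (`armv9`, `armv9a`, `armv9.1a`, `armv9.2a`) as synonyms with the same result is an assumption.

```cpp
#include <string>

// Hypothetical table entry; not an LLVM type.
struct MarchEntry {
  const char *name;    // spelling accepted after -march=
  const char *subArch; // sub-arch string expected by the unit test
};

// Returns the sub-arch string for an Armv9-A -march spelling, or an empty
// string if the spelling is not one of the Armv9-A names in this sketch.
std::string subArchFor(const std::string &march) {
  static const MarchEntry table[] = {
      {"armv9-a", "v9a"},     {"armv9", "v9a"},       {"armv9a", "v9a"},
      {"armv9.1-a", "v9.1a"}, {"armv9.1a", "v9.1a"},
      {"armv9.2-a", "v9.2a"}, {"armv9.2a", "v9.2a"},
  };
  for (const MarchEntry &e : table) {
    if (march == e.name)
      return e.subArch;
  }
  return "";
}
```

A flat table keeps the sketch easy to compare against the test list; the real parser is generated from `ARMTargetParser.def` and `AArch64TargetParser.def`, which this sketch does not attempt to reproduce.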
[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A
vhscampos created this revision. Herald added subscribers: dexonsmith, hiraditya, kristof.beyls. vhscampos requested review of this revision. Herald added projects: clang, LLVM. Herald added subscribers: llvm-commits, cfe-commits. armv9-a, armv9.1-a and armv9.2-a can be targeted using the -march option both in ARM and AArch64. The Armv9-A architecture is described in the Arm® Architecture Reference Manual Supplement Armv9, for Armv9-A architecture profile (https://developer.arm.com/documentation/ddi0608/latest). Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D109517 Files: clang/lib/Basic/Targets/AArch64.cpp clang/lib/Basic/Targets/AArch64.h clang/lib/Basic/Targets/ARM.cpp clang/lib/Driver/ToolChains/Arch/AArch64.cpp clang/test/Driver/aarch64-cpus.c clang/test/Driver/arm-cortex-cpus.c clang/test/Preprocessor/aarch64-target-features.c clang/test/Preprocessor/arm-target-features.c llvm/include/llvm/ADT/Triple.h llvm/include/llvm/Support/AArch64TargetParser.def llvm/include/llvm/Support/ARMTargetParser.def llvm/lib/Support/AArch64TargetParser.cpp llvm/lib/Support/ARMTargetParser.cpp llvm/lib/Support/Triple.cpp llvm/lib/Target/AArch64/AArch64.td llvm/lib/Target/AArch64/AArch64InstrInfo.td llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp llvm/lib/Target/ARM/ARM.td llvm/lib/Target/ARM/ARMSubtarget.h llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp llvm/test/MC/AArch64/SME/directives-negative.s llvm/test/MC/AArch64/SME/directives.s llvm/test/MC/AArch64/SVE2/directive-arch-negative.s llvm/test/MC/AArch64/SVE2/directive-arch.s llvm/unittests/Support/TargetParserTest.cpp Index: llvm/unittests/Support/TargetParserTest.cpp === --- llvm/unittests/Support/TargetParserTest.cpp +++ llvm/unittests/Support/TargetParserTest.cpp @@ -492,6 +492,15 @@ EXPECT_TRUE( testARMArch("armv8.7-a", "generic", "v8.7a", ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9-a", "generic", "v9a", + 
ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9.1-a", "generic", "v9.1a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE( + testARMArch("armv9.2-a", "generic", "v9.2a", + ARMBuildAttrs::CPUArch::v8_A)); EXPECT_TRUE( testARMArch("armv8-r", "cortex-r52", "v8r", ARMBuildAttrs::CPUArch::v8_R)); @@ -821,6 +830,9 @@ case ARM::ArchKind::ARMV8_5A: case ARM::ArchKind::ARMV8_6A: case ARM::ArchKind::ARMV8_7A: +case ARM::ArchKind::ARMV9A: +case ARM::ArchKind::ARMV9_1A: +case ARM::ArchKind::ARMV9_2A: EXPECT_EQ(ARM::ProfileKind::A, ARM::parseArchProfile(ARMArch[i])); break; default: @@ -1204,6 +1216,12 @@ ARMBuildAttrs::CPUArch::v8_A)); EXPECT_TRUE(testAArch64Arch("armv8.7-a", "generic", "v8.7a", ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9-a", "generic", "v9a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9.1-a", "generic", "v9.1a", + ARMBuildAttrs::CPUArch::v8_A)); + EXPECT_TRUE(testAArch64Arch("armv9.2-a", "generic", "v9.2a", + ARMBuildAttrs::CPUArch::v8_A)); } bool testAArch64Extension(StringRef CPUName, AArch64::ArchKind AK, Index: llvm/test/MC/AArch64/SVE2/directive-arch.s === --- llvm/test/MC/AArch64/SVE2/directive-arch.s +++ llvm/test/MC/AArch64/SVE2/directive-arch.s @@ -1,21 +1,21 @@ // RUN: llvm-mc -triple aarch64 -filetype asm -o - %s 2>&1 | FileCheck %s -.arch armv8-a+sve2 +.arch armv9-a+sve2 tbx z0.b, z1.b, z2.b // CHECK: tbx z0.b, z1.b, z2.b -.arch armv8-a+sve2-aes +.arch armv9-a+sve2-aes aesd z23.b, z23.b, z13.b // CHECK: aesd z23.b, z23.b, z13.b -.arch armv8-a+sve2-sm4 +.arch armv9-a+sve2-sm4 sm4e z0.s, z0.s, z0.s // CHECK: sm4e z0.s, z0.s, z0.s -.arch armv8-a+sve2-sha3 +.arch armv9-a+sve2-sha3 rax1 z0.d, z0.d, z0.d // CHECK: rax1 z0.d, z0.d, z0.d -.arch armv8-a+sve2-bitperm +.arch armv9-a+sve2-bitperm bgrp z21.s, z10.s, z21.s // CHECK: bgrp z21.s, z10.s, z21.s Index: llvm/test/MC/AArch64/SVE2/directive-arch-negative.s === --- llvm/test/MC/AArch64/SVE2/directive-arch-negative.s +++ 
llvm/test/MC/AArch64/SVE2/directive-arch-negative.s @@ -1,31 +1,31 @@ // RUN: not llvm-mc -triple aarch64 -filetype asm -o - %s 2>&1 | FileCheck %s -.arch armv8-a+sve2 -.arch armv8-a+nosve2 +.arch armv9-a+sve2 +.arch armv9-a+nosve2 tbx z0.b, z1.b, z2.b // CHECK: error: instruction requires: streaming-sve or sve2 // CHECK-NEXT: tbx z0.b, z1.b, z2.b -.arch armv8-a+sve2-aes -.arch ar
[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A
vhscampos updated this revision to Diff 372473.
vhscampos marked 3 inline comments as done.
vhscampos added a comment.

1. Enable the SVE2 extension as default.
2. Remove out of date comments in tests.
3. Remove unrelated change.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109517/new/

https://reviews.llvm.org/D109517

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/Basic/Targets/ARM.cpp
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Driver/arm-cortex-cpus.c
  clang/test/Preprocessor/aarch64-target-features.c
  clang/test/Preprocessor/arm-target-features.c
  llvm/include/llvm/ADT/Triple.h
  llvm/include/llvm/Support/AArch64TargetParser.def
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Support/ARMTargetParser.cpp
  llvm/lib/Support/Triple.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/lib/Target/ARM/MCTargetDesc/ARMELFStreamer.cpp
  llvm/test/MC/AArch64/SME/directives-negative.s
  llvm/test/MC/AArch64/SME/directives.s
  llvm/test/MC/AArch64/SVE2/directive-arch-negative.s
  llvm/test/MC/AArch64/SVE2/directive-arch.s
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -31,6 +31,7 @@
 "armv8.5a","armv8.6-a","armv8.6a","armv8.7-a","armv8.7a", "armv8-r", "armv8r", "armv8-m.base","armv8m.base", "armv8-m.main", "armv8m.main", "iwmmxt", "iwmmxt2", "xscale", "armv8.1-m.main",
+"armv9-a", "armv9.1-a","armv9.2-a",
 };
 
 template
@@ -492,6 +493,15 @@
   EXPECT_TRUE(
       testARMArch("armv8.7-a", "generic", "v8.7a",
                   ARMBuildAttrs::CPUArch::v8_A));
+  EXPECT_TRUE(
+      testARMArch("armv9-a", "generic", "v9a",
+                  ARMBuildAttrs::CPUArch::v8_A));
+  EXPECT_TRUE(
+      testARMArch("armv9.1-a", "generic", "v9.1a",
+                  ARMBuildAttrs::CPUArch::v8_A));
+  EXPECT_TRUE(
+      testARMArch("armv9.2-a", "generic", "v9.2a",
+                  ARMBuildAttrs::CPUArch::v8_A));
   EXPECT_TRUE(
       testARMArch("armv8-r", "cortex-r52", "v8r",
                   ARMBuildAttrs::CPUArch::v8_R));
@@ -821,6 +831,9 @@
 case ARM::ArchKind::ARMV8_5A:
 case ARM::ArchKind::ARMV8_6A:
 case ARM::ArchKind::ARMV8_7A:
+case ARM::ArchKind::ARMV9A:
+case ARM::ArchKind::ARMV9_1A:
+case ARM::ArchKind::ARMV9_2A:
   EXPECT_EQ(ARM::ProfileKind::A, ARM::parseArchProfile(ARMArch[i]));
   break;
 default:
@@ -1204,6 +1217,12 @@
                               ARMBuildAttrs::CPUArch::v8_A));
   EXPECT_TRUE(testAArch64Arch("armv8.7-a", "generic", "v8.7a",
                               ARMBuildAttrs::CPUArch::v8_A));
+  EXPECT_TRUE(testAArch64Arch("armv9-a", "generic", "v9a",
+                              ARMBuildAttrs::CPUArch::v8_A));
+  EXPECT_TRUE(testAArch64Arch("armv9.1-a", "generic", "v9.1a",
+                              ARMBuildAttrs::CPUArch::v8_A));
+  EXPECT_TRUE(testAArch64Arch("armv9.2-a", "generic", "v9.2a",
+                              ARMBuildAttrs::CPUArch::v8_A));
 }
 
 bool testAArch64Extension(StringRef CPUName, AArch64::ArchKind AK,
Index: llvm/test/MC/AArch64/SVE2/directive-arch.s
===
--- llvm/test/MC/AArch64/SVE2/directive-arch.s
+++ llvm/test/MC/AArch64/SVE2/directive-arch.s
@@ -1,21 +1,21 @@
 // RUN: llvm-mc -triple aarch64 -filetype asm -o - %s 2>&1 | FileCheck %s
 
-.arch armv8-a+sve2
+.arch armv9-a+sve2
 tbx z0.b, z1.b, z2.b
 // CHECK: tbx z0.b, z1.b, z2.b
 
-.arch armv8-a+sve2-aes
+.arch armv9-a+sve2-aes
 aesd z23.b, z23.b, z13.b
 // CHECK: aesd z23.b, z23.b, z13.b
 
-.arch armv8-a+sve2-sm4
+.arch armv9-a+sve2-sm4
 sm4e z0.s, z0.s, z0.s
 // CHECK: sm4e z0.s, z0.s, z0.s
 
-.arch armv8-a+sve2-sha3
+.arch armv9-a+sve2-sha3
 rax1 z0.d, z0.d, z0.d
 // CHECK: rax1 z0.d, z0.d, z0.d
 
-.arch armv8-a+sve2-bitperm
+.arch armv9-a+sve2-bitperm
 bgrp z21.s, z10.s, z21.s
 // CHECK: bgrp z21.s, z10.s, z21.s
Index: llvm/test/MC/AArch64/SVE2/directive-arch-negative.s
===
--- llvm/test/MC/AArch64/SVE2/directive-arch-negative.s
+++ llvm/test/MC/AArch64/SVE2/directive-arch-negative.s
@@ -1,31 +1,31 @@
 // RUN: not llvm-mc -triple aarch64 -filetype asm -o - %s 2>&1 | FileCheck %s
 
-.arch armv8-a+sve2
-.arch armv8-a+nosve2
+.arch armv9-a+sve2
+.arch armv9-a+nosve2
 tbx z0.b, z1.b, z2.b
 // CHECK: error: instruct
[PATCH] D109517: [Clang][ARM][AArch64] Add support for Armv9-A, Armv9.1-A and Armv9.2-A
vhscampos added inline comments.


Comment at: clang/lib/Driver/ToolChains/Arch/AArch64.cpp:413
-  auto V8_6Pos = llvm::find(Features, "+v8.6a");
-  if (V8_6Pos != std::end(Features))
-    V8_6Pos = Features.insert(std::next(V8_6Pos), {"+i8mm", "+bf16"});
+  const char *Archs[] = {"+v8.6a", "+v8.7a", "+v9.1a", "+v9.2a"};
+  auto Pos = std::find_first_of(Features.begin(), Features.end(),

SjoerdMeijer wrote:
> How about `+v9a`?
Since v9a maps to v8.5a, and the latter was skipped in the original code, I did not include v9a here.


Comment at: llvm/unittests/Support/TargetParserTest.cpp:495
                   ARMBuildAttrs::CPUArch::v8_A));
+  EXPECT_TRUE(
+      testARMArch("armv9-a", "generic", "v9a",

SjoerdMeijer wrote:
> I haven't looked, but in these target parser tests, do we also not need to check the architecture descriptions?
> Copied this for example from the target parser def file:
>
>   (ARM::AEK_SEC | ARM::AEK_MP | ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
>    ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP | ARM::AEK_CRC | ARM::AEK_RAS |
>    ARM::AEK_DOTPROD)
If I understand it correctly, we only check architecture extensions for CPUs, not for the architectures themselves.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109517/new/

https://reviews.llvm.org/D109517
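
For readers following the first inline comment, here is a minimal standalone sketch of the lookup being discussed. It is not the code from the patch: it assumes a plain std::vector<std::string> in place of the driver's feature list, and the helper name addImpliedFeatures is hypothetical. It only illustrates how std::find_first_of can locate the first of several architecture features so that "+i8mm" and "+bf16" are inserted after it, and why a list containing only "+v9a" is left unchanged.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper for illustration only (not part of D109517).
// Finds the first feature naming one of the listed architecture versions
// and inserts "+i8mm" and "+bf16" immediately after it.
std::vector<std::string> addImpliedFeatures(std::vector<std::string> Features) {
  // "+v9a" is deliberately absent, mirroring the reply in the review:
  // v9a maps to v8.5a, which the original code also skipped.
  const char *Archs[] = {"+v8.6a", "+v8.7a", "+v9.1a", "+v9.2a"};
  const std::string Implied[] = {"+i8mm", "+bf16"};

  auto Pos = std::find_first_of(Features.begin(), Features.end(),
                                std::begin(Archs), std::end(Archs));
  if (Pos != Features.end())
    Features.insert(std::next(Pos), std::begin(Implied), std::end(Implied));
  return Features;
}
```

The range form of insert is used here instead of a braced list only to keep the sketch unambiguous with std::string elements; the snippet quoted in the review is cut off before the insertion call, so its exact form is not shown there.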