[PATCH] D50175: [AArch64][NFC] better matching of AArch64 target in aarch64-cpus.c tests
SjoerdMeijer added inline comments.

Comment at: test/Driver/aarch64-cpus.c:10
+// GENERIC: "-cc1"{{.*}} "-triple" "aarch64"{{.*}} "-target-cpu" "generic"
+// GENERIC-LE: "-cc1"{{.*}} "-triple" "aarch64--"{{.*}} "-target-cpu" "generic"

olista01 wrote:
> Why do these need new check prefixes? All of the RUN lines above are
> selecting little-endian, so I'd expect GENERIC and GENERIC-LE to be the same.
Ok, good point. The output is slightly different. For the little-endian runs above the output is:

  "-triple" "aarch64"

and with "-target aarch64_be -mlittle-endian" the output is:

  "-triple" "aarch64--"

As we don't want to be too generic and match "aarch64{{.*}}", I will therefore change the GENERIC checks to match "aarch64{{[--]*}}", and indeed remove GENERIC-LE.

https://reviews.llvm.org/D50175

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D50175: [AArch64][NFC] better matching of AArch64 target in aarch64-cpus.c tests
SjoerdMeijer updated this revision to Diff 159715.
SjoerdMeijer added a comment.

Addressed comments.

https://reviews.llvm.org/D50175

Files:
  test/Driver/aarch64-cpus.c

Index: test/Driver/aarch64-cpus.c
===
--- test/Driver/aarch64-cpus.c
+++ test/Driver/aarch64-cpus.c
@@ -6,7 +6,7 @@
 // RUN: %clang -target aarch64 -mlittle-endian -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s
 // RUN: %clang -target aarch64_be -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s
-// GENERIC: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// GENERIC: "-cc1"{{.*}} "-triple" "aarch64{{[--]*}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-GENERIC %s
 // RUN: %clang -target arm64 -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-GENERIC %s
@@ -29,8 +29,8 @@
 // RUN: %clang -target aarch64 -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s
 // RUN: %clang -target aarch64 -mlittle-endian -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s
-// CA35: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a35"
-// CA35-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA35: "-cc1"{{.*}} "-triple" "aarch64{{[--]*}}"{{.*}} "-target-cpu" "cortex-a35"
+// CA35-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{[--]*}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA35 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA35 %s
@@ -44,8 +44,8 @@
 // RUN: %clang -target aarch64_be -mlittle-endian -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53 %s
 // RUN: %clang -target aarch64 -mtune=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53-TUNE %s
-// CA53: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a53"
-// CA53-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA53: "-cc1"{{.*}} "-triple" "aarch64{{[--]*}}"{{.*}} "-target-cpu" "cortex-a53"
+// CA53-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{[--]*}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA53 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA53 %s
@@ -59,8 +59,8 @@
 // RUN: %clang -target aarch64_be -mlittle-endian -mcpu=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=CA55 %s
 // RUN: %clang -target aarch64 -mtune=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=CA55-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=CA55-TUNE %s
-// CA55: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a55"
-// CA55-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA55: "-cc1"{{.*}} "-triple" "aarch64{{[--]*}}"{{.*}} "-target-cpu" "cortex-a55"
+// CA55-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{[--]*}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA55 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA55 %s
@@ -75,8 +75,8 @@
 // RUN: %clang -target aarch64 -mtune=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=CA57-TUNE %s
 // RUN: %clang -target aarch64 -mlittle-endian -mtune=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=CA57-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=CA57-TUNE %s
-// CA57: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a57"
-// CA57-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA57: "-cc1"{{.*}} "-triple" "aarch64{{[--]*}}"{{.*}} "-target-cpu" "cortex-a57"
+// CA57-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{[--]*}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA57 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA57 %s
@@ -91,8 +91,8 @@
 // RUN: %clang -target aarch64 -mtune=cortex-a72 -### -c %s 2>&1 | FileCheck -check-prefix=CA72-TUNE %s
 // RUN: %clang -target aarch64 -mlittle-endian -mtune=cortex-a72 -### -c %s 2>&1 | FileCheck -check-prefix=CA72-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle
[PATCH] D50175: [AArch64][NFC] better matching of AArch64 target in aarch64-cpus.c tests
SjoerdMeijer added inline comments.

Comment at: test/Driver/aarch64-cpus.c:10
+// GENERIC: "-cc1"{{.*}} "-triple" "aarch64"{{.*}} "-target-cpu" "generic"
+// GENERIC-LE: "-cc1"{{.*}} "-triple" "aarch64--"{{.*}} "-target-cpu" "generic"

olista01 wrote:
> SjoerdMeijer wrote:
> > olista01 wrote:
> > > Why do these need new check prefixes? All of the RUN lines above are
> > > selecting little-endian, so I'd expect GENERIC and GENERIC-LE to be the
> > > same.
> > Ok, good point. The output is slightly different. For the little-endian
> > runs above the output is:
> >
> >   "-triple" "aarch64"
> >
> > and with "-target aarch64_be -mlittle-endian" the output is:
> >
> >   "-triple" "aarch64--"
> >
> > As we don't want to be too generic and match "aarch64{{.*}}", I will
> > therefore change the GENERIC checks to match "aarch64{{[--]*}}", and indeed
> > remove GENERIC-LE.
> I think that works, but it's a strange way to write the regex. You have "-"
> twice inside a character set, which is the same as only having it once, so
> "[--]*" matches zero or more occurrences of "-". I'd suggest using something
> like "(--)?" which matches either "--" or nothing.
Ah, of course, thanks! That was a bit silly, will fix.

https://reviews.llvm.org/D50175
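The difference between the two patterns discussed above can be checked outside FileCheck. What follows is a minimal C++ sketch, assuming that std::regex with its default ECMAScript grammar agrees with FileCheck's regex engine for patterns this simple. The function names matchesLoose and matchesStrict are made up for this example, and "[--]*" is written as the equivalent "-*", since a character set that lists "-" twice is the same as one that lists it once.

```cpp
#include <regex>
#include <string>

// Loose form from the earlier diff: "[--]*" means "zero or more dashes".
// It is spelled "-*" here, which is assumed to be equivalent.
inline bool matchesLoose(const std::string &quotedTriple) {
  static const std::regex re(R"re("aarch64-*")re");
  return std::regex_match(quotedTriple, re);
}

// Form suggested in the review: "(--)?" accepts exactly "--" or nothing.
inline bool matchesStrict(const std::string &quotedTriple) {
  static const std::regex re(R"re("aarch64(--)?")re");
  return std::regex_match(quotedTriple, re);
}
```

Both functions accept the two quoted triples the driver prints in this thread, "aarch64" and "aarch64--". Only the loose form also accepts a single dash or three dashes, which is why the stricter "(--)?" is the tighter check.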
[PATCH] D50175: [AArch64][NFC] better matching of AArch64 target in aarch64-cpus.c tests
SjoerdMeijer updated this revision to Diff 159915.
SjoerdMeijer added a comment.

Addressed comments.

https://reviews.llvm.org/D50175

Files:
  test/Driver/aarch64-cpus.c

Index: test/Driver/aarch64-cpus.c
===
--- test/Driver/aarch64-cpus.c
+++ test/Driver/aarch64-cpus.c
@@ -6,7 +6,7 @@
 // RUN: %clang -target aarch64 -mlittle-endian -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s
 // RUN: %clang -target aarch64_be -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s
-// GENERIC: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// GENERIC: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-GENERIC %s
 // RUN: %clang -target arm64 -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-GENERIC %s
@@ -29,8 +29,8 @@
 // RUN: %clang -target aarch64 -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s
 // RUN: %clang -target aarch64 -mlittle-endian -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s
-// CA35: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a35"
-// CA35-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA35: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "cortex-a35"
+// CA35-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA35 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA35 %s
@@ -44,8 +44,8 @@
 // RUN: %clang -target aarch64_be -mlittle-endian -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53 %s
 // RUN: %clang -target aarch64 -mtune=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53-TUNE %s
-// CA53: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a53"
-// CA53-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA53: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "cortex-a53"
+// CA53-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA53 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA53 %s
@@ -59,8 +59,8 @@
 // RUN: %clang -target aarch64_be -mlittle-endian -mcpu=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=CA55 %s
 // RUN: %clang -target aarch64 -mtune=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=CA55-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=CA55-TUNE %s
-// CA55: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a55"
-// CA55-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA55: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "cortex-a55"
+// CA55-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA55 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA55 %s
@@ -75,8 +75,8 @@
 // RUN: %clang -target aarch64 -mtune=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=CA57-TUNE %s
 // RUN: %clang -target aarch64 -mlittle-endian -mtune=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=CA57-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=CA57-TUNE %s
-// CA57: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a57"
-// CA57-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA57: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "cortex-a57"
+// CA57-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA57 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA57 %s
@@ -91,8 +91,8 @@
 // RUN: %clang -target aarch64 -mtune=cortex-a72 -### -c %s 2>&1 | FileCheck -check-prefix=CA72-TUNE %s
 // RUN: %clang -target aarch64 -mlittle-endian -mtune=cortex-a72 -### -c %s 2>&1 | FileCheck -check-prefix=CA72-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle
[PATCH] D50175: [AArch64][NFC] better matching of AArch64 target in aarch64-cpus.c tests
This revision was automatically updated to reflect the committed changes.
Closed by commit rL339347: [AArch64][NFC] better matching of AArch64 target in aarch64-cpus.c tests (authored by SjoerdMeijer, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D50175?vs=159915&id=159920#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D50175

Files:
  cfe/trunk/test/Driver/aarch64-cpus.c

Index: cfe/trunk/test/Driver/aarch64-cpus.c
===
--- cfe/trunk/test/Driver/aarch64-cpus.c
+++ cfe/trunk/test/Driver/aarch64-cpus.c
@@ -6,7 +6,7 @@
 // RUN: %clang -target aarch64 -mlittle-endian -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s
 // RUN: %clang -target aarch64_be -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s
-// GENERIC: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// GENERIC: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-GENERIC %s
 // RUN: %clang -target arm64 -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-GENERIC %s
@@ -29,8 +29,8 @@
 // RUN: %clang -target aarch64 -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s
 // RUN: %clang -target aarch64 -mlittle-endian -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s
-// CA35: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a35"
-// CA35-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA35: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "cortex-a35"
+// CA35-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA35 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA35 %s
@@ -44,8 +44,8 @@
 // RUN: %clang -target aarch64_be -mlittle-endian -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53 %s
 // RUN: %clang -target aarch64 -mtune=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53-TUNE %s
-// CA53: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a53"
-// CA53-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA53: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "cortex-a53"
+// CA53-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA53 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA53 %s
@@ -59,8 +59,8 @@
 // RUN: %clang -target aarch64_be -mlittle-endian -mcpu=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=CA55 %s
 // RUN: %clang -target aarch64 -mtune=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=CA55-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=CA55-TUNE %s
-// CA55: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a55"
-// CA55-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA55: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "cortex-a55"
+// CA55-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA55 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a55 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA55 %s
@@ -75,8 +75,8 @@
 // RUN: %clang -target aarch64 -mtune=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=CA57-TUNE %s
 // RUN: %clang -target aarch64 -mlittle-endian -mtune=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=CA57-TUNE %s
 // RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=CA57-TUNE %s
-// CA57: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a57"
-// CA57-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"
+// CA57: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "cortex-a57"
+// CA57-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{(--)?}}"{{.*}} "-target-cpu" "generic"
 
 // RUN: %clang -target arm64 -mcpu=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA57 %s
 // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a57 -### -c %s 2>&1 | FileCheck -check-prefi
[PATCH] D50179: [AArch64][ARM] Context sensitive meaning of option "crypto"
SjoerdMeijer added inline comments.

Comment at: lib/Driver/ToolChains/Arch/ARM.cpp:430
+ if (ArchName.find_lower("+noaes") == StringRef::npos)
+Features.push_back("+aes");
+} else if (ArchName.find_lower("-crypto") != StringRef::npos) {

efriedma wrote:
> The ARM backend doesn't support features named "sha2" and "aes" at the moment.
These ARM target features were introduced in rL335953.

https://reviews.llvm.org/D50179
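The quoted fragment suggests that "+crypto" pulls in "+aes" unless the architecture string also contains "+noaes". A stand-alone C++ sketch of that idea follows. It is an assumption-laden illustration, not the patch's code: it uses std::string instead of llvm::StringRef, the helpers containsLower and expandCrypto are hypothetical names, treating "+nosha2" the same way as "+noaes" is inferred from the tests in the patch, and the ordering of repeated "+crypto"/"+nocrypto" options is ignored.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Case-insensitive substring test; a stand-in for StringRef::find_lower.
inline bool containsLower(const std::string &haystack,
                          const std::string &needle) {
  auto lower = [](std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
      return static_cast<char>(std::tolower(c));
    });
    return s;
  };
  return lower(haystack).find(lower(needle)) != std::string::npos;
}

// Hypothetical helper: expand "+crypto" into "+sha2" and "+aes", skipping
// any component that the architecture string disables explicitly.
inline std::vector<std::string> expandCrypto(const std::string &archName) {
  std::vector<std::string> features;
  if (containsLower(archName, "+crypto")) {
    if (!containsLower(archName, "+nosha2"))
      features.push_back("+sha2");
    if (!containsLower(archName, "+noaes"))
      features.push_back("+aes");
  }
  return features;
}
```

With this sketch, "armv8.1a+crypto+nosha2+noaes" yields no extra features, while "armv8a+crypto" yields "+sha2" and "+aes", matching the "crypto = sha2 + aes" rule the tests state for Armv8.3-A and earlier.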
[PATCH] D50179: [AArch64][ARM] Context sensitive meaning of option "crypto"
SjoerdMeijer updated this revision to Diff 159979.
SjoerdMeijer added a comment.

fixed typo

https://reviews.llvm.org/D50179

Files:
  lib/Driver/ToolChains/Arch/AArch64.cpp
  lib/Driver/ToolChains/Arch/ARM.cpp
  test/Driver/arm-features.c
  test/Preprocessor/aarch64-target-features.c

Index: test/Preprocessor/aarch64-target-features.c
===
--- test/Preprocessor/aarch64-target-features.c
+++ test/Preprocessor/aarch64-target-features.c
@@ -143,6 +143,101 @@
 // CHECK-MARCH-2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "-fp-armv8" "-target-feature" "-neon" "-target-feature" "-crc" "-target-feature" "-crypto"
 // CHECK-MARCH-3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "-neon"
+// Check +sm4:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+sm4 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SM4 %s
+// CHECK-SM4: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "+sm4"
+//
+// Check +sha3:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+sha3 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SHA3 %s
+// CHECK-SHA3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "+sha3"
+//
+// Check +sha2:
+//
+// RUN: %clang -target aarch64 -march=armv8.3a+sha2 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SHA2 %s
+// CHECK-SHA2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.{{.}}a" "-target-feature" "+sha2"
+//
+// Check +aes:
+//
+// RUN: %clang -target aarch64 -march=armv8.3a+aes -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-AES %s
+// CHECK-AES: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.{{.}}a" "-target-feature" "+aes"
+//
+// Check -sm4:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noSM4 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SM4 %s
+// CHECK-NO-SM4: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sm4"
+//
+// Check -sha3:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noSHA3 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SHA3 %s
+// CHECK-NO-SHA3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sha3"
+//
+// Check -sha2:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noSHA2 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SHA2 %s
+// CHECK-NO-SHA2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sha2"
+//
+// Check -aes:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noAES -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-AES %s
+// CHECK-NO-AES: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-aes"
+//
+//
+// Arch <= ARMv8.3: crypto = sha2 + aes
+// -
+//
+// Check +crypto:
+//
+// RUN: %clang -target aarch64 -march=armv8a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8.1a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8.2a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8.3a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8a+crypto+nocrypto+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// CHECK-CRYPTO83: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+crypto" "-target-feature" "+sha2" "-target-feature" "+aes"
+//
+// Check -crypto:
+//
+// RUN: %clang -target aarch64 -march=armv8a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO8A %s
+// RUN: %clang -target aarch64 -march=armv8.1a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO81 %s
+// RUN: %clang -target aarch64 -march=armv8.2a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s
+// RUN: %clang -target aarch64 -march=armv8.3a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s
+// RUN: %clang -target aarch64 -march=armv8.3a+nocrypto+crypto+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s
+
+// CHECK-NOCRYPTO8A: "-target-feature" "+neon" "-target-feature" "-crypto" "-target-feature" "-sha2" "-target-feature" "-aes" "-target-abi" "aapcs"
+// CHECK-NOCRYPTO81: "-target-feature" "+neon" "-target-feature" "+v8.1a" "-target-feature" "-crypto" "-target-feature" "-sha2" "-target-feature" "-aes" "-target-abi" "aapcs"
+// CHECK-NOCRYPTO82: "-target-feature" "+neon" "-target-feature" "+v8.{{.}}a" "-target-feature" "-crypto" "-target-feature" "-sha2" "-target-feature" "-aes" "-target-feature" "-sm4" "-target-feature" "-sha3" "-target-abi" "aapcs"
+//
+// Check +crypto -sha2 -aes:
+//
+// RUN: %clang -target aarch64 -march=armv8.1a+crypto+nosha2+noaes -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83-NOSHA2-NOAES %s
+// CHECK-CRYPTO83-NOSHA2-NOAES-NOT: "-target-fea
[PATCH] D50179: [AArch64][ARM] Context sensitive meaning of option "crypto"
SjoerdMeijer updated this revision to Diff 159991.

https://reviews.llvm.org/D50179

Files:
  lib/Driver/ToolChains/Arch/AArch64.cpp
  lib/Driver/ToolChains/Arch/ARM.cpp
  test/Driver/arm-features.c
  test/Preprocessor/aarch64-target-features.c

Index: test/Preprocessor/aarch64-target-features.c
===
--- test/Preprocessor/aarch64-target-features.c
+++ test/Preprocessor/aarch64-target-features.c
@@ -143,6 +143,101 @@
 // CHECK-MARCH-2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "-fp-armv8" "-target-feature" "-neon" "-target-feature" "-crc" "-target-feature" "-crypto"
 // CHECK-MARCH-3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "-neon"
+// Check +sm4:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+sm4 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SM4 %s
+// CHECK-SM4: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "+sm4"
+//
+// Check +sha3:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+sha3 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SHA3 %s
+// CHECK-SHA3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "+sha3"
+//
+// Check +sha2:
+//
+// RUN: %clang -target aarch64 -march=armv8.3a+sha2 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SHA2 %s
+// CHECK-SHA2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.{{.}}a" "-target-feature" "+sha2"
+//
+// Check +aes:
+//
+// RUN: %clang -target aarch64 -march=armv8.3a+aes -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-AES %s
+// CHECK-AES: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.{{.}}a" "-target-feature" "+aes"
+//
+// Check -sm4:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noSM4 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SM4 %s
+// CHECK-NO-SM4: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sm4"
+//
+// Check -sha3:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noSHA3 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SHA3 %s
+// CHECK-NO-SHA3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sha3"
+//
+// Check -sha2:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noSHA2 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SHA2 %s
+// CHECK-NO-SHA2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sha2"
+//
+// Check -aes:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noAES -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-AES %s
+// CHECK-NO-AES: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-aes"
+//
+//
+// Arch <= ARMv8.3: crypto = sha2 + aes
+// -
+//
+// Check +crypto:
+//
+// RUN: %clang -target aarch64 -march=armv8a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8.1a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8.2a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8.3a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8a+crypto+nocrypto+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// CHECK-CRYPTO83: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+crypto" "-target-feature" "+sha2" "-target-feature" "+aes"
+//
+// Check -crypto:
+//
+// RUN: %clang -target aarch64 -march=armv8a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO8A %s
+// RUN: %clang -target aarch64 -march=armv8.1a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO81 %s
+// RUN: %clang -target aarch64 -march=armv8.2a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s
+// RUN: %clang -target aarch64 -march=armv8.3a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s
+// RUN: %clang -target aarch64 -march=armv8.3a+nocrypto+crypto+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s
+
+// CHECK-NOCRYPTO8A: "-target-feature" "+neon" "-target-feature" "-crypto" "-target-feature" "-sha2" "-target-feature" "-aes" "-target-abi" "aapcs"
+// CHECK-NOCRYPTO81: "-target-feature" "+neon" "-target-feature" "+v8.1a" "-target-feature" "-crypto" "-target-feature" "-sha2" "-target-feature" "-aes" "-target-abi" "aapcs"
+// CHECK-NOCRYPTO82: "-target-feature" "+neon" "-target-feature" "+v8.{{.}}a" "-target-feature" "-crypto" "-target-feature" "-sha2" "-target-feature" "-aes" "-target-feature" "-sm4" "-target-feature" "-sha3" "-target-abi" "aapcs"
+//
+// Check +crypto -sha2 -aes:
+//
+// RUN: %clang -target aarch64 -march=armv8.1a+crypto+nosha2+noaes -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83-NOSHA2-NOAES %s
+// CHECK-CRYPTO83-NOSHA2-NOAES-NOT: "-target-feature" "+sha2" "-target-feature" "+aes"
+//
[PATCH] D47267: [UnrollAndJam] Add unroll_and_jam pragma handling
SjoerdMeijer added a comment.

Just out of curiosity:

- How do you plan to implement this? Are you going to generate from the pragma some sort of "script" that dictates the transformation order, which is then fed to the pass manager?
- About the stacking of pragmas: in your example you apply the bottom one first, but would a user perhaps expect the first to be applied? In other words, is the expected behaviour described somewhere (in a spec)?

https://reviews.llvm.org/D47267
[PATCH] D51093: [ARM] Set __ARM_FEATURE_SIMD32 for +dsp cores
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Looks reasonable to me.

https://reviews.llvm.org/D51093
[PATCH] D51429: [AArch64] Return Address Signing B Key Support
SjoerdMeijer added inline comments.

Comment at: test/CodeGen/aarch64-sign-return-address.c:3
 // RUN: %clang -target aarch64-arm-none-eabi -S -emit-llvm -o - -msign-return-address=non-leaf %s | FileCheck %s --check-prefix=CHECK-PARTIAL
 // RUN: %clang -target aarch64-arm-none-eabi -S -emit-llvm -o - -msign-return-address=all %s | FileCheck %s --check-prefix=CHECK-ALL
+// RUN: %clang -target aarch64-arm-none-eabi -S -emit-llvm -o - -msign-return-address=all+a_key %s | FileCheck %s --check-prefix=CHECK-A-KEY

If the default is the a_key, does this test need to check the a_key attribute?

Repository:
  rC Clang

https://reviews.llvm.org/D51429
[PATCH] D49376: [NEON] Define half-precision vrnd intrinsics only when available
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

LGTM

Comment at: include/clang/Basic/arm_neon.td:1419
 // Vector rounding
- def FRINTZH : SInst<"vrnd", "dd", "hQh">;
- def FRINTNH : SInst<"vrndn", "dd", "hQh">;
- def FRINTAH : SInst<"vrnda", "dd", "hQh">;
- def FRINTPH : SInst<"vrndp", "dd", "hQh">;
- def FRINTMH : SInst<"vrndm", "dd", "hQh">;
- def FRINTXH : SInst<"vrndx", "dd", "hQh">;
+ let ArchGuard = "__ARM_ARCH >= 8 && defined(__ARM_FEATURE_DIRECTED_ROUNDING) && defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)" in {
+def FRINTZH : SInst<"vrnd", "dd", "hQh">;

nit: is the indentation a bit off here?

https://reviews.llvm.org/D49376
[PATCH] D49375: [NEON] Define half-precision vmaxnm intrinsics only when available
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

LGTM

Comment at: include/clang/Basic/arm_neon.td:1466
 def VMINH : SInst<"vmin", "ddd", "hQh">;
- def FMAXNMH : SInst<"vmaxnm", "ddd", "hQh">;
- def FMINNMH : SInst<"vminnm", "ddd", "hQh">;
+ let ArchGuard = "__ARM_ARCH >= 8 && defined(__ARM_FEATURE_NUMERIC_MAXMIN) && defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)" in {
+def FMAXNMH : SInst<"vmaxnm", "ddd", "hQh">;

nit: indentation?

https://reviews.llvm.org/D49375
[PATCH] D49075: [NEON] Define fp16 vld and vst intrinsics conditionally
SjoerdMeijer added a comment.

Now that they are conditionally defined, do we need negative tests (in test/Sema/arm-no-fp16.c?) to check that they are not available when fp16 is not enabled?

https://reviews.llvm.org/D49075
[PATCH] D48829: [NEON] Fix support for vrndi_f32(), vrndiq_f32() and vrndns_f32() intrinsics
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

LGTM

https://reviews.llvm.org/D48829
[PATCH] D49375: [NEON] Define half-precision vmaxnm intrinsics only when available
SjoerdMeijer added inline comments.

Comment at: include/clang/Basic/arm_neon.td:1466
 def VMINH : SInst<"vmin", "ddd", "hQh">;
- def FMAXNMH : SInst<"vmaxnm", "ddd", "hQh">;
- def FMINNMH : SInst<"vminnm", "ddd", "hQh">;
+ let ArchGuard = "__ARM_ARCH >= 8 && defined(__ARM_FEATURE_NUMERIC_MAXMIN) && defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)" in {
+def FMAXNMH : SInst<"vmaxnm", "ddd", "hQh">;

kosarev wrote:
> SjoerdMeijer wrote:
> > nit: indentation?
> Do we want some special indentation here?
Ah, sorry, got confused, it's just one big string.

https://reviews.llvm.org/D49375
[PATCH] D49376: [NEON] Define half-precision vrnd intrinsics only when available
SjoerdMeijer added inline comments.

Comment at: include/clang/Basic/arm_neon.td:1419
 // Vector rounding
- def FRINTZH : SInst<"vrnd", "dd", "hQh">;
- def FRINTNH : SInst<"vrndn", "dd", "hQh">;
- def FRINTAH : SInst<"vrnda", "dd", "hQh">;
- def FRINTPH : SInst<"vrndp", "dd", "hQh">;
- def FRINTMH : SInst<"vrndm", "dd", "hQh">;
- def FRINTXH : SInst<"vrndx", "dd", "hQh">;
+ let ArchGuard = "__ARM_ARCH >= 8 && defined(__ARM_FEATURE_DIRECTED_ROUNDING) && defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)" in {
+def FRINTZH : SInst<"vrnd", "dd", "hQh">;

kosarev wrote:
> SjoerdMeijer wrote:
> > nit: is the indentation a bit off here?
> It's a nested `let ArchGuard`, so I guess we do want the indentation here?
Yep, got confused, please ignore.

https://reviews.llvm.org/D49376
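For readers following the ArchGuard discussion in D49375 and D49376: the guard is a plain preprocessor condition, so user code can test the same macros before relying on the half-precision vmaxnm/vminnm intrinsics. The following is only a minimal sketch under stated assumptions. HAVE_FP16_VMAXNM, have_fp16_vmaxnm and check_guard are hypothetical names that do not appear in the patches, and the extra defined(__ARM_ARCH) term is an addition of this sketch (it avoids evaluating an undefined macro) rather than part of the ArchGuard string.

```c
#include <assert.h>

/* Mirrors the ArchGuard condition quoted in D49375. The leading
   defined(__ARM_ARCH) check is an assumption of this sketch, not part
   of the original guard string. */
#if defined(__ARM_ARCH) && __ARM_ARCH >= 8 && \
    defined(__ARM_FEATURE_NUMERIC_MAXMIN) &&  \
    defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
#define HAVE_FP16_VMAXNM 1
#else
#define HAVE_FP16_VMAXNM 0
#endif

/* Hypothetical helper: 1 if the guard condition holds for this build. */
int have_fp16_vmaxnm(void) { return HAVE_FP16_VMAXNM; }

/* Sanity check: on a non-Arm host the guard must evaluate to 0. */
void check_guard(void) {
#if !defined(__ARM_ARCH)
    assert(have_fp16_vmaxnm() == 0);
#endif
    assert(have_fp16_vmaxnm() == 0 || have_fp16_vmaxnm() == 1);
}
```

Code that wants the intrinsics would wrap their use in `#if HAVE_FP16_VMAXNM` and provide a fallback otherwise; whether that is the right structure depends on the project.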
[PATCH] D50068: [AArch64][ARM] Add Armv8.4-A tests
SjoerdMeijer created this revision. SjoerdMeijer added reviewers: samparker, olista01, john.brawn, ab, t.p.northover. Herald added a reviewer: javed.absar. Herald added subscribers: chrib, kristof.beyls. This adds tests for Armv8.4-A, and also some v8.2 and v8.3 tests that were missed in previous upstreaming exercises. https://reviews.llvm.org/D50068 Files: lib/Basic/Targets/ARM.cpp test/Driver/aarch64-cpus.c test/Driver/arm-cortex-cpus.c Index: test/Driver/arm-cortex-cpus.c === --- test/Driver/arm-cortex-cpus.c +++ test/Driver/arm-cortex-cpus.c @@ -287,13 +287,62 @@ // RUN: %clang -target armv8a-linux-eabi -march=armv8.2-a+fp16 -### -c %s 2>&1 | FileCheck --check-prefix CHECK-V82A-FP16 %s // CHECK-V82A-FP16: "-cc1"{{.*}} "-triple" "armv8.2{{.*}}" "-target-cpu" "generic" {{.*}}"-target-feature" "+fullfp16" +// RUN: %clang -target armv8.3a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V83A %s +// RUN: %clang -target arm -march=armv8.3a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V83A %s +// RUN: %clang -target arm -march=armv8.3-a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V83A %s +// RUN: %clang -target arm -march=armv8.3a -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V83A %s +// RUN: %clang -target armv8.3a -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V83A %s +// RUN: %clang -target arm -march=armv8.3a -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V83A %s +// RUN: %clang -target arm -mlittle-endian -march=armv8.3-a -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V83A %s +// CHECK-V83A: "-cc1"{{.*}} "-triple" "armv8.3{{.*}}" "-target-cpu" "generic" + +// RUN: %clang -target armebv8.3a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-BE-V83A %s +// RUN: %clang -target armv8.3a -mbig-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-BE-V83A %s +// RUN: %clang -target armeb -march=armebv8.3a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-BE-V83A %s +// RUN: %clang -target armeb 
-march=armebv8.3-a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-BE-V83A %s +// RUN: %clang -target arm -march=armebv8.3a -mbig-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-BE-V83A %s +// RUN: %clang -target arm -march=armebv8.3-a -mbig-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-BE-V83A %s +// CHECK-BE-V83A: "-cc1"{{.*}} "-triple" "armebv8.3{{.*}}" "-target-cpu" "generic" + +// RUN: %clang -target armv8a-linux-eabi -march=armv8.3-a+fp16 -### -c %s 2>&1 | FileCheck --check-prefix CHECK-V83A-FP16 %s +// CHECK-V83A-FP16: "-cc1"{{.*}} "-triple" "armv8.3{{.*}}" "-target-cpu" "generic" {{.*}}"-target-feature" "+fullfp16" + +// RUN: %clang -target armv8.4a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V84A %s +// RUN: %clang -target arm -march=armv8.4a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V84A %s +// RUN: %clang -target arm -march=armv8.4-a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V84A %s +// RUN: %clang -target arm -march=armv8.4a -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V84A %s +// RUN: %clang -target armv8.4a -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V84A %s +// RUN: %clang -target arm -march=armv8.4a -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V84A %s +// RUN: %clang -target arm -mlittle-endian -march=armv8.4-a -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-V84A %s +// CHECK-V84A: "-cc1"{{.*}} "-triple" "armv8.4{{.*}}" "-target-cpu" "generic" + +// RUN: %clang -target armebv8.4a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-BE-V84A %s +// RUN: %clang -target armv8.4a -mbig-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-BE-V84A %s +// RUN: %clang -target armeb -march=armebv8.4a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-BE-V84A %s +// RUN: %clang -target armeb -march=armebv8.4-a -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-BE-V84A %s +// RUN: %clang -target arm -march=armebv8.4a -mbig-endian -### -c %s 2>&1 | FileCheck 
-check-prefix=CHECK-BE-V84A %s +// RUN: %clang -target arm -march=armebv8.4-a -mbig-endian -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-BE-V84A %s +// CHECK-BE-V84A: "-cc1"{{.*}} "-triple" "armebv8.4{{.*}}" "-target-cpu" "generic" + +// RUN: %clang -target armv8a-linux-eabi -march=armv8.4-a+fp16 -### -c %s 2>&1 | FileCheck --check-prefix CHECK-V84A-FP16 %s +// CHECK-V84A-FP16: "-cc1"{{.*}} "-triple" "armv8.4{{.*}}" "-target-cpu" "generic" {{.*}}"-target-feature" "+fullfp16" + // Once we have CPUs with optional v8.2-A FP16, we will need a way to turn it // on and off. Cortex-A53 is a placeholder for now. // RUN: %clang -target armv8a-linux-eabi -mcpu=cortex-a53+fp16 -### -c %s 2>&1 | FileCheck --check-prefix CHECK-CORTEX-A53-FP16 %s // RUN: %clang -target armv8a-linux-eabi -mcpu=cortex-a53+nofp16 -### -c %s 2>&1 | FileCheck --check-prefix CHECK-CORTEX-A53-NOFP16 %s // CHECK-CORTEX-A53-FP16: "-cc1" {{.*}}"-target
[PATCH] D50068: [AArch64][ARM] Add Armv8.4-A tests
SjoerdMeijer added inline comments.

Comment at: test/Driver/aarch64-cpus.c:9
 // RUN: %clang -target aarch64_be -mlittle-endian -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s
 // GENERIC: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic"

ab wrote:
> Ideally, these should test `aarch64-{{.*}}`, no?
Agreed, good point. I will commit this first and address it in a follow-up, as it looks like this needs fixing in a few places in this file.

https://reviews.llvm.org/D50068
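The concern in this exchange is that a pattern such as "aarch64{{.*}}" is loose enough to also accept a big-endian triple. The sketch below tries to illustrate that with POSIX <regex.h> standing in for FileCheck's regex blocks; FileCheck itself is not involved. The matches helper, the two pattern macros and the sample command-line fragments are all hypothetical, chosen only to resemble the strings discussed in the review.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if `text` contains a match for the POSIX
   extended regular expression `pattern`. */
bool matches(const char *pattern, const char *text) {
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    bool found = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return found;
}

/* Illustrative patterns: the loose one mirrors "aarch64{{.*}}"; the tight
   one allows only trailing dashes between "aarch64 and the closing quote. */
#define LOOSE_TRIPLE "\"aarch64.*\""
#define TIGHT_TRIPLE "\"aarch64-*\""

void check_triple_patterns(void) {
    /* Assumed fragments of driver output, not captured from a real run. */
    const char *le = "\"-triple\" \"aarch64--\" \"-target-cpu\" \"generic\"";
    const char *be = "\"-triple\" \"aarch64_be\" \"-target-cpu\" \"generic\"";

    assert(matches(LOOSE_TRIPLE, le));
    assert(matches(LOOSE_TRIPLE, be));   /* loose pattern accepts both */
    assert(matches(TIGHT_TRIPLE, le));
    assert(!matches(TIGHT_TRIPLE, be));  /* tight pattern rejects aarch64_be */
}
```

With the loose pattern a little-endian test would still pass if the driver wrongly emitted a big-endian triple, which is presumably why the tighter match was requested.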
[PATCH] D50068: [AArch64][ARM] Add Armv8.4-A tests
This revision was automatically updated to reflect the committed changes. Closed by commit rC338525: [AArch64][ARM] Add Armv8.4-A tests (authored by SjoerdMeijer, committed by ). Repository: rC Clang https://reviews.llvm.org/D50068 Files: lib/Basic/Targets/ARM.cpp test/Driver/aarch64-cpus.c test/Driver/arm-cortex-cpus.c Index: lib/Basic/Targets/ARM.cpp === --- lib/Basic/Targets/ARM.cpp +++ lib/Basic/Targets/ARM.cpp @@ -185,6 +185,10 @@ return "8_1A"; case llvm::ARM::ArchKind::ARMV8_2A: return "8_2A"; + case llvm::ARM::ArchKind::ARMV8_3A: +return "8_3A"; + case llvm::ARM::ArchKind::ARMV8_4A: +return "8_4A"; case llvm::ARM::ArchKind::ARMV8MBaseline: return "8M_BASE"; case llvm::ARM::ArchKind::ARMV8MMainline: Index: test/Driver/aarch64-cpus.c === --- test/Driver/aarch64-cpus.c +++ test/Driver/aarch64-cpus.c @@ -388,6 +388,14 @@ // RUN: %clang -target aarch64_be -mlittle-endian -march=armv8.2-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV82A %s // GENERICV82A: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8.2a" +// RUN: %clang -target aarch64_be -march=armv8.2a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV82A-BE %s +// RUN: %clang -target aarch64_be -march=armv8.2-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV82A-BE %s +// RUN: %clang -target aarch64 -mbig-endian -march=armv8.2a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV82A-BE %s +// RUN: %clang -target aarch64 -mbig-endian -march=armv8.2-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV82A-BE %s +// RUN: %clang -target aarch64_be -mbig-endian -march=armv8.2a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV82A-BE %s +// RUN: %clang -target aarch64_be -mbig-endian -march=armv8.2-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV82A-BE %s +// GENERICV82A-BE: "-cc1"{{.*}} "-triple" "aarch64_be{{.*}}" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8.2a" + // RUN: %clang -target aarch64 
-march=armv8.2-a+fp16 -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV82A-FP16 %s // GENERICV82A-FP16: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8.2a" "-target-feature" "+fullfp16" @@ -397,6 +405,71 @@ // RUN: %clang -target aarch64 -march=armv8.2-a+fp16+profile -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV82A-FP16-SPE %s // GENERICV82A-FP16-SPE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8.2a" "-target-feature" "+fullfp16" "-target-feature" "+spe" +// RUN: %clang -target aarch64 -march=armv8.3a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A %s +// RUN: %clang -target aarch64 -march=armv8.3-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A %s +// RUN: %clang -target aarch64 -mlittle-endian -march=armv8.3a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A %s +// RUN: %clang -target aarch64 -mlittle-endian -march=armv8.3-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A %s +// RUN: %clang -target aarch64_be -mlittle-endian -march=armv8.3a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A %s +// RUN: %clang -target aarch64_be -mlittle-endian -march=armv8.3-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A %s +// GENERICV83A: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8.3a" + +// RUN: %clang -target aarch64_be -march=armv8.3a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A-BE %s +// RUN: %clang -target aarch64_be -march=armv8.3-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A-BE %s +// RUN: %clang -target aarch64 -mbig-endian -march=armv8.3a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A-BE %s +// RUN: %clang -target aarch64 -mbig-endian -march=armv8.3-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A-BE %s +// RUN: %clang -target aarch64_be -mbig-endian -march=armv8.3a -### -c %s 2>&1 | FileCheck 
-check-prefix=GENERICV83A-BE %s +// RUN: %clang -target aarch64_be -mbig-endian -march=armv8.3-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A-BE %s +// GENERICV83A-BE: "-cc1"{{.*}} "-triple" "aarch64_be{{.*}}" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8.3a" + +// RUN: %clang -target aarch64 -march=armv8.3-a+fp16 -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV83A-FP16 %s +// GENERICV83A-FP16: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic" "-target-feature" "+neon" "-target-feature" "+v8.3a" "-target-feature" "+fullfp16" + +// RUN: %clang -target aarch64 -march=armv8.4a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV84A %s +// RUN: %clang -target aarch64 -march=armv8.4-a -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV84A %s +// RUN: %clang -target aarch64 -mlittle-endia
[PATCH] D50175: [AArch64][NFC] better matching of AArch64 target in aarch64-cpus.c tests
SjoerdMeijer created this revision. SjoerdMeijer added reviewers: olista01, ab. Herald added a reviewer: javed.absar. Herald added a subscriber: kristof.beyls. In https://reviews.llvm.org/D50068, @ab noticed that it would be better to match aarch64-{{.*}} for tests that use "-target aarch64_be -mlittle-endian" instead of just aarch64{{.*}}. https://reviews.llvm.org/D50175 Files: test/Driver/aarch64-cpus.c Index: test/Driver/aarch64-cpus.c === --- test/Driver/aarch64-cpus.c +++ test/Driver/aarch64-cpus.c @@ -4,9 +4,10 @@ // RUN: %clang -target aarch64 -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s // RUN: %clang -target aarch64 -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s // RUN: %clang -target aarch64 -mlittle-endian -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s -// RUN: %clang -target aarch64_be -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s -// RUN: %clang -target aarch64_be -mlittle-endian -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC %s -// GENERIC: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic" +// RUN: %clang -target aarch64_be -mlittle-endian -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC-LE %s +// RUN: %clang -target aarch64_be -mlittle-endian -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=GENERIC-LE %s +// GENERIC: "-cc1"{{.*}} "-triple" "aarch64"{{.*}} "-target-cpu" "generic" +// GENERIC-LE: "-cc1"{{.*}} "-triple" "aarch64--"{{.*}} "-target-cpu" "generic" // RUN: %clang -target arm64 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-GENERIC %s // RUN: %clang -target arm64 -mcpu=generic -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-GENERIC %s @@ -25,12 +26,14 @@ // RUN: %clang -target aarch64 -mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35 %s // RUN: %clang -target aarch64 -mlittle-endian -mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35 %s -// RUN: %clang -target aarch64_be -mlittle-endian 
-mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35 %s +// RUN: %clang -target aarch64_be -mlittle-endian -mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-LE %s // RUN: %clang -target aarch64 -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s // RUN: %clang -target aarch64 -mlittle-endian -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s -// RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE %s -// CA35: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a35" -// CA35-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic" +// RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=CA35-TUNE-LE %s +// CA35: "-cc1"{{.*}} "-triple" "aarch64"{{.*}} "-target-cpu" "cortex-a35" +// CA35-LE: "-cc1"{{.*}} "-triple" "aarch64--"{{.*}} "-target-cpu" "cortex-a35" +// CA35-TUNE: "-cc1"{{.*}} "-triple" "aarch64"{{.*}} "-target-cpu" "generic" +// CA35-TUNE-LE: "-cc1"{{.*}} "-triple" "aarch64--"{{.*}} "-target-cpu" "generic" // RUN: %clang -target arm64 -mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA35 %s // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a35 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA35 %s @@ -41,11 +44,13 @@ // RUN: %clang -target aarch64 -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53 %s // RUN: %clang -target aarch64 -mlittle-endian -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53 %s -// RUN: %clang -target aarch64_be -mlittle-endian -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53 %s +// RUN: %clang -target aarch64_be -mlittle-endian -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53-LE %s // RUN: %clang -target aarch64 -mtune=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53-TUNE %s -// RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a53 
-### -c %s 2>&1 | FileCheck -check-prefix=CA53-TUNE %s -// CA53: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a53" -// CA53-TUNE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "generic" +// RUN: %clang -target aarch64_be -mlittle-endian -mtune=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=CA53-TUNE-LE %s +// CA53: "-cc1"{{.*}} "-triple" "aarch64"{{.*}} "-target-cpu" "cortex-a53" +// CA53-LE: "-cc1"{{.*}} "-triple" "aarch64--"{{.*}} "-target-cpu" "cortex-a53" +// CA53-TUNE: "-cc1"{{.*}} "-triple" "aarch64"{{.*}} "-target-cpu" "generic" +// CA53-TUNE-LE: "-cc1"{{.*}} "-triple" "aarch64--"{{.*}} "-target-cpu" "generic" // RUN: %clang -target arm64 -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA53 %s // RUN: %clang -target arm64 -mlittle-endian -mcpu=cortex-a53 -### -c %s 2>&1 | FileCheck -check-prefix=ARM64-CA53 %s @@ -56,11 +61,13 @@ // RUN: %cl
[PATCH] D50179: [AArch64][ARM] Context sensitive meaning of option "crypto"
SjoerdMeijer created this revision. SjoerdMeijer added reviewers: olista01, samparker, john.brawn, ab, t.p.northover. Herald added a reviewer: javed.absar. Herald added subscribers: chrib, kristof.beyls. For AArch64: 1. Crypto means sm4 + sha3 + sha2 + aes for Armv8.4-A and up, 2. and sha2 + aes for Armv8.3-A and earlier. And for AArch32: Crypto means sha2 + aes, because the Armv8.2-A crypto instructions were added to AArch64 only. https://reviews.llvm.org/D50179 Files: lib/Driver/ToolChains/Arch/AArch64.cpp lib/Driver/ToolChains/Arch/ARM.cpp test/Driver/arm-features.c test/Preprocessor/aarch64-target-features.c Index: test/Preprocessor/aarch64-target-features.c === --- test/Preprocessor/aarch64-target-features.c +++ test/Preprocessor/aarch64-target-features.c @@ -143,6 +143,101 @@ // CHECK-MARCH-2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "-fp-armv8" "-target-feature" "-neon" "-target-feature" "-crc" "-target-feature" "-crypto" // CHECK-MARCH-3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "-neon" +// Check +sm4: +// +// RUN: %clang -target aarch64 -march=armv8.2a+sm4 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SM4 %s +// CHECK-SM4: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "+sm4" +// +// Check +sha3: +// +// RUN: %clang -target aarch64 -march=armv8.2a+sha3 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SHA3 %s +// CHECK-SHA3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "+sha3" +// +// Check +sha2: +// +// RUN: %clang -target aarch64 -march=armv8.3a+sha2 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SHA2 %s +// CHECK-SHA2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.{{.}}a" "-target-feature" "+sha2" +// +// Check +aes: +// +// RUN: %clang -target aarch64 -march=armv8.3a+aes -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-AES %s +// CHECK-AES: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.{{.}}a" "-target-feature" 
"+aes" +// +// Check -sm4: +// +// RUN: %clang -target aarch64 -march=armv8.2a+noSM4 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SM4 %s +// CHECK-NO-SM4: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sm4" +// +// Check -sha3: +// +// RUN: %clang -target aarch64 -march=armv8.2a+noSHA3 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SHA3 %s +// CHECK-NO-SHA3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sha3" +// +// Check -sha2: +// +// RUN: %clang -target aarch64 -march=armv8.2a+noSHA2 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SHA2 %s +// CHECK-NO-SHA2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sha2" +// +// Check -aes: +// +// RUN: %clang -target aarch64 -march=armv8.2a+noAES -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-AES %s +// CHECK-NO-AES: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-aes" +// +// +// Arch <= ARMv8.3: crypto = sha2 + aes +// - +// +// Check +crypto: +// +// RUN: %clang -target aarch64 -march=armv8a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s +// RUN: %clang -target aarch64 -march=armv8.1a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s +// RUN: %clang -target aarch64 -march=armv8.2a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s +// RUN: %clang -target aarch64 -march=armv8.3a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s +// RUN: %clang -target aarch64 -march=armv8a+crypto+nocrypto+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s +// CHECK-CRYPTO83: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+crypto" "-target-feature" "+sha2" "-target-feature" "+aes" +// +// Check -crypto: +// +// RUN: %clang -target aarch64 -march=armv8a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO8A %s +// RUN: %clang -target aarch64 
-march=armv8.1a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO81 %s +// RUN: %clang -target aarch64 -march=armv8.2a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s +// RUN: %clang -target aarch64 -march=armv8.3a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s +// RUN: %clang -target aarch64 -march=armv8.3a+nocrypto+crypto+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s + +// CHECK-NOCRYPTO8A: "-target-feature" "+neon" "-target-feature" "-crypto" "-target-feature" "-sha2" "-target-feature" "-aes" "-target-abi" "aapcs" +// CHECK-NOCRYPTO81: "-target-feature" "+neon" "-target-feature" "+v8.1a" "-target-feature" "-crypto" "-target-feature" "-sha2" "-target-feature" "-aes" "-target-abi" "aapcs" +// CHECK-NOCRYPTO82: "-target-feature" "+neon" "-target-feature" "+v8.{{.}}a" "-target-feature" "-crypto" "-tar
[PATCH] D50179: [AArch64][ARM] Context sensitive meaning of option "crypto"
SjoerdMeijer added a comment.

Hi Eli, thanks for the feedback.

> Yes, this logic should be in TargetParser, not here. Trying to rewrite the
> target features afterwards is messy at best. (Actually, the target feature
> list generated by TargetParser probably shouldn't contain the string "crypto"
> at all.)

I appreciate there is room for improvement here, which is an understatement! :) I probably should have mentioned earlier that my colleague is working on TargetParser and options, and he will send the proposal in the form of an RFC to the dev list soon. Very briefly, the proposal will elaborate on how we want to capture and enforce architecture extension dependencies (which I believe also means disallowing architecturally invalid combinations), imply options, and e.g. warn on redundant options. I want to move the crypto logic to this new framework as soon as it is there.

Thus, for the time being, this is a stopgap to demonstrate what we want to achieve (with crypto) and, quite importantly, it gives us something that works today. But again, I fully agree that the current implementation is far from ideal; hopefully these explanations make it somewhat acceptable.

https://reviews.llvm.org/D50179
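For reference, the rule that D50179 describes (crypto means sm4 + sha3 + sha2 + aes for AArch64 at Armv8.4-A and up, and sha2 + aes otherwise, including all of AArch32) can be sketched in a few lines of C. This is not the driver code from the patch: the FEAT_* constants, expand_crypto and check_expand_crypto are hypothetical names, and the architecture is reduced to a minor version number purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature bits, for illustration only. */
enum {
    FEAT_AES  = 1 << 0,
    FEAT_SHA2 = 1 << 1,
    FEAT_SHA3 = 1 << 2,
    FEAT_SM4  = 1 << 3
};

/* Expand "+crypto" for Armv8.<minor>-A. `is_aarch64` selects AArch64;
   AArch32 always gets sha2 + aes, as stated in the review summary. */
unsigned expand_crypto(bool is_aarch64, unsigned minor) {
    unsigned features = FEAT_SHA2 | FEAT_AES;
    if (is_aarch64 && minor >= 4)
        features |= FEAT_SM4 | FEAT_SHA3;
    return features;
}

void check_expand_crypto(void) {
    /* AArch64, Armv8.3-A and earlier: sha2 + aes. */
    assert(expand_crypto(true, 3) == (FEAT_SHA2 | FEAT_AES));
    /* AArch64, Armv8.4-A and up: sm4 + sha3 + sha2 + aes. */
    assert(expand_crypto(true, 4) ==
           (FEAT_SM4 | FEAT_SHA3 | FEAT_SHA2 | FEAT_AES));
    /* AArch32: sha2 + aes regardless of version. */
    assert(expand_crypto(false, 4) == (FEAT_SHA2 | FEAT_AES));
}
```

The sketch deliberately ignores the interaction with explicit +sm4/+nosha2 style modifiers, which is the part the thread expects the future TargetParser work to handle.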
[PATCH] D49075: [NEON] Define fp16 vld and vst intrinsics conditionally
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Thanks, LGTM.

https://reviews.llvm.org/D49075
[PATCH] D42993: [AArch64] Fixes for ARMv8.2-A FP16 scalar intrinsic
SjoerdMeijer added a comment.

Thanks for fixing this. Looks very reasonable to me.

Question about the failures: I am now wondering if this means we were and still are missing tests?

Nit: for future reviews, I think it is better to split patches up if they are commits to different repos.

https://reviews.llvm.org/D42993
[PATCH] D41792: [AArch64] Add ARMv8.2-A FP16 scalar intrinsics
SjoerdMeijer closed this revision.
SjoerdMeijer added a comment.
Herald added a subscriber: hintonda.

Committed as r323005

https://reviews.llvm.org/D41792
[PATCH] D42993: [AArch64] Fixes for ARMv8.2-A FP16 scalar intrinsic
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Thanks

https://reviews.llvm.org/D42993
[PATCH] D43372: [ARM] Add tests for the vcvtr builtins
SjoerdMeijer created this revision. SjoerdMeijer added reviewers: samparker, olista01, rengolin. Herald added subscribers: kristof.beyls, javed.absar. This adds Sema and Codegen tests for the vcvtr builtins (because they were missing). https://reviews.llvm.org/D43372 Files: test/CodeGen/builtins-arm.c test/Sema/builtins-arm.c Index: test/Sema/builtins-arm.c === --- test/Sema/builtins-arm.c +++ test/Sema/builtins-arm.c @@ -320,3 +320,18 @@ x = __builtin_arm_smusd(a, b); x = __builtin_arm_smusdx(a, b); } + +void test_VFP(float f, double d) { + float fr; + double dr; + + fr = __builtin_arm_vcvtr_f(f, 0); + fr = __builtin_arm_vcvtr_f(f, 1); + fr = __builtin_arm_vcvtr_f(f, -1); // expected-error {{argument should be a value from 0 to 1}} + fr = __builtin_arm_vcvtr_f(f, 2); // expected-error {{argument should be a value from 0 to 1}} + + dr = __builtin_arm_vcvtr_f(d, 0); + dr = __builtin_arm_vcvtr_f(d, 1); + dr = __builtin_arm_vcvtr_f(d, -1); // expected-error {{argument should be a value from 0 to 1}} + dr = __builtin_arm_vcvtr_f(d, 2); // expected-error {{argument should be a value from 0 to 1}} +} Index: test/CodeGen/builtins-arm.c === --- test/CodeGen/builtins-arm.c +++ test/CodeGen/builtins-arm.c @@ -8,69 +8,85 @@ } void f1(char *a, char *b) { + // CHECK: call {{.*}} @__clear_cache __clear_cache(a,b); } -// CHECK: call {{.*}} @__clear_cache +float test_vcvtrf0(float f) { + // CHECK: call float @llvm.arm.vcvtr.f32(float %f) + return __builtin_arm_vcvtr_f(f, 0); +} + +float test_vcvtrf1(float f) { + // CHECK: call float @llvm.arm.vcvtru.f32(float %f) + return __builtin_arm_vcvtr_f(f, 1); +} + +double test_vcvtrd0(double d) { + // CHECK: call float @llvm.arm.vcvtr.f64(double %d) + return __builtin_arm_vcvtr_d(d, 0); +} + +double test_vcvtrd1(double d) { + // call float @llvm.arm.vcvtru.f64(double %d) + return __builtin_arm_vcvtr_d(d, 1); +} void test_eh_return_data_regno() { + // CHECK: store volatile i32 0 + // CHECK: store volatile i32 1 volatile int res; - res = 
__builtin_eh_return_data_regno(0); // CHECK: store volatile i32 0 - res = __builtin_eh_return_data_regno(1); // CHECK: store volatile i32 1 + res = __builtin_eh_return_data_regno(0); + res = __builtin_eh_return_data_regno(1); } void nop() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 0) __builtin_arm_nop(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 0) - void yield() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 1) __builtin_arm_yield(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 1) - void wfe() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 2) __builtin_arm_wfe(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 2) - void wfi() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 3) __builtin_arm_wfi(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 3) - void sev() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 4) __builtin_arm_sev(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 4) - void sevl() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 5) __builtin_arm_sevl(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 5) - void dbg() { + // CHECK: call {{.*}} @llvm.arm.dbg(i32 0) __builtin_arm_dbg(0); } -// CHECK: call {{.*}} @llvm.arm.dbg(i32 0) - void test_barrier() { - __builtin_arm_dmb(1); //CHECK: call {{.*}} @llvm.arm.dmb(i32 1) - __builtin_arm_dsb(2); //CHECK: call {{.*}} @llvm.arm.dsb(i32 2) - __builtin_arm_isb(3); //CHECK: call {{.*}} @llvm.arm.isb(i32 3) + //CHECK: call {{.*}} @llvm.arm.dmb(i32 1) + //CHECK: call {{.*}} @llvm.arm.dsb(i32 2) + //CHECK: call {{.*}} @llvm.arm.isb(i32 3) + __builtin_arm_dmb(1); + __builtin_arm_dsb(2); + __builtin_arm_isb(3); } -// CHECK: call {{.*}} @llvm.bitreverse.i32(i32 %a) - unsigned rbit(unsigned a) { + // CHECK: call {{.*}} @llvm.bitreverse.i32(i32 %a) return __builtin_arm_rbit(a); } ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D43372: [ARM] Add tests for the vcvtr builtins
SjoerdMeijer added a comment.

Thanks for reviewing!

https://reviews.llvm.org/D43372
[PATCH] D43372: [ARM] Add tests for the vcvtr builtins
This revision was automatically updated to reflect the committed changes. Closed by commit rC325351: [ARM] Add tests for the vcvtr builtins (authored by SjoerdMeijer, committed by ). Repository: rL LLVM https://reviews.llvm.org/D43372 Files: test/CodeGen/builtins-arm.c test/Sema/builtins-arm.c Index: test/CodeGen/builtins-arm.c === --- test/CodeGen/builtins-arm.c +++ test/CodeGen/builtins-arm.c @@ -8,69 +8,85 @@ } void f1(char *a, char *b) { + // CHECK: call {{.*}} @__clear_cache __clear_cache(a,b); } -// CHECK: call {{.*}} @__clear_cache +float test_vcvtrf0(float f) { + // CHECK: call float @llvm.arm.vcvtr.f32(float %f) + return __builtin_arm_vcvtr_f(f, 0); +} + +float test_vcvtrf1(float f) { + // CHECK: call float @llvm.arm.vcvtru.f32(float %f) + return __builtin_arm_vcvtr_f(f, 1); +} + +double test_vcvtrd0(double d) { + // CHECK: call float @llvm.arm.vcvtr.f64(double %d) + return __builtin_arm_vcvtr_d(d, 0); +} + +double test_vcvtrd1(double d) { + // call float @llvm.arm.vcvtru.f64(double %d) + return __builtin_arm_vcvtr_d(d, 1); +} void test_eh_return_data_regno() { + // CHECK: store volatile i32 0 + // CHECK: store volatile i32 1 volatile int res; - res = __builtin_eh_return_data_regno(0); // CHECK: store volatile i32 0 - res = __builtin_eh_return_data_regno(1); // CHECK: store volatile i32 1 + res = __builtin_eh_return_data_regno(0); + res = __builtin_eh_return_data_regno(1); } void nop() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 0) __builtin_arm_nop(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 0) - void yield() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 1) __builtin_arm_yield(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 1) - void wfe() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 2) __builtin_arm_wfe(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 2) - void wfi() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 3) __builtin_arm_wfi(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 3) - void sev() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 4) 
__builtin_arm_sev(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 4) - void sevl() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 5) __builtin_arm_sevl(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 5) - void dbg() { + // CHECK: call {{.*}} @llvm.arm.dbg(i32 0) __builtin_arm_dbg(0); } -// CHECK: call {{.*}} @llvm.arm.dbg(i32 0) - void test_barrier() { - __builtin_arm_dmb(1); //CHECK: call {{.*}} @llvm.arm.dmb(i32 1) - __builtin_arm_dsb(2); //CHECK: call {{.*}} @llvm.arm.dsb(i32 2) - __builtin_arm_isb(3); //CHECK: call {{.*}} @llvm.arm.isb(i32 3) + //CHECK: call {{.*}} @llvm.arm.dmb(i32 1) + //CHECK: call {{.*}} @llvm.arm.dsb(i32 2) + //CHECK: call {{.*}} @llvm.arm.isb(i32 3) + __builtin_arm_dmb(1); + __builtin_arm_dsb(2); + __builtin_arm_isb(3); } -// CHECK: call {{.*}} @llvm.bitreverse.i32(i32 %a) - unsigned rbit(unsigned a) { + // CHECK: call {{.*}} @llvm.bitreverse.i32(i32 %a) return __builtin_arm_rbit(a); } Index: test/Sema/builtins-arm.c === --- test/Sema/builtins-arm.c +++ test/Sema/builtins-arm.c @@ -320,3 +320,18 @@ x = __builtin_arm_smusd(a, b); x = __builtin_arm_smusdx(a, b); } + +void test_VFP(float f, double d) { + float fr; + double dr; + + fr = __builtin_arm_vcvtr_f(f, 0); + fr = __builtin_arm_vcvtr_f(f, 1); + fr = __builtin_arm_vcvtr_f(f, -1); // expected-error {{argument should be a value from 0 to 1}} + fr = __builtin_arm_vcvtr_f(f, 2); // expected-error {{argument should be a value from 0 to 1}} + + dr = __builtin_arm_vcvtr_f(d, 0); + dr = __builtin_arm_vcvtr_f(d, 1); + dr = __builtin_arm_vcvtr_f(d, -1); // expected-error {{argument should be a value from 0 to 1}} + dr = __builtin_arm_vcvtr_f(d, 2); // expected-error {{argument should be a value from 0 to 1}} +} ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D43372: [ARM] Add tests for the vcvtr builtins
This revision was automatically updated to reflect the committed changes. Closed by commit rL325351: [ARM] Add tests for the vcvtr builtins (authored by SjoerdMeijer, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D43372?vs=134570&id=134626#toc Repository: rL LLVM https://reviews.llvm.org/D43372 Files: cfe/trunk/test/CodeGen/builtins-arm.c cfe/trunk/test/Sema/builtins-arm.c Index: cfe/trunk/test/CodeGen/builtins-arm.c === --- cfe/trunk/test/CodeGen/builtins-arm.c +++ cfe/trunk/test/CodeGen/builtins-arm.c @@ -8,69 +8,85 @@ } void f1(char *a, char *b) { + // CHECK: call {{.*}} @__clear_cache __clear_cache(a,b); } -// CHECK: call {{.*}} @__clear_cache +float test_vcvtrf0(float f) { + // CHECK: call float @llvm.arm.vcvtr.f32(float %f) + return __builtin_arm_vcvtr_f(f, 0); +} + +float test_vcvtrf1(float f) { + // CHECK: call float @llvm.arm.vcvtru.f32(float %f) + return __builtin_arm_vcvtr_f(f, 1); +} + +double test_vcvtrd0(double d) { + // CHECK: call float @llvm.arm.vcvtr.f64(double %d) + return __builtin_arm_vcvtr_d(d, 0); +} + +double test_vcvtrd1(double d) { + // call float @llvm.arm.vcvtru.f64(double %d) + return __builtin_arm_vcvtr_d(d, 1); +} void test_eh_return_data_regno() { + // CHECK: store volatile i32 0 + // CHECK: store volatile i32 1 volatile int res; - res = __builtin_eh_return_data_regno(0); // CHECK: store volatile i32 0 - res = __builtin_eh_return_data_regno(1); // CHECK: store volatile i32 1 + res = __builtin_eh_return_data_regno(0); + res = __builtin_eh_return_data_regno(1); } void nop() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 0) __builtin_arm_nop(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 0) - void yield() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 1) __builtin_arm_yield(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 1) - void wfe() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 2) __builtin_arm_wfe(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 2) - void wfi() { + // CHECK: call 
{{.*}} @llvm.arm.hint(i32 3) __builtin_arm_wfi(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 3) - void sev() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 4) __builtin_arm_sev(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 4) - void sevl() { + // CHECK: call {{.*}} @llvm.arm.hint(i32 5) __builtin_arm_sevl(); } -// CHECK: call {{.*}} @llvm.arm.hint(i32 5) - void dbg() { + // CHECK: call {{.*}} @llvm.arm.dbg(i32 0) __builtin_arm_dbg(0); } -// CHECK: call {{.*}} @llvm.arm.dbg(i32 0) - void test_barrier() { - __builtin_arm_dmb(1); //CHECK: call {{.*}} @llvm.arm.dmb(i32 1) - __builtin_arm_dsb(2); //CHECK: call {{.*}} @llvm.arm.dsb(i32 2) - __builtin_arm_isb(3); //CHECK: call {{.*}} @llvm.arm.isb(i32 3) + //CHECK: call {{.*}} @llvm.arm.dmb(i32 1) + //CHECK: call {{.*}} @llvm.arm.dsb(i32 2) + //CHECK: call {{.*}} @llvm.arm.isb(i32 3) + __builtin_arm_dmb(1); + __builtin_arm_dsb(2); + __builtin_arm_isb(3); } -// CHECK: call {{.*}} @llvm.bitreverse.i32(i32 %a) - unsigned rbit(unsigned a) { + // CHECK: call {{.*}} @llvm.bitreverse.i32(i32 %a) return __builtin_arm_rbit(a); } Index: cfe/trunk/test/Sema/builtins-arm.c === --- cfe/trunk/test/Sema/builtins-arm.c +++ cfe/trunk/test/Sema/builtins-arm.c @@ -320,3 +320,18 @@ x = __builtin_arm_smusd(a, b); x = __builtin_arm_smusdx(a, b); } + +void test_VFP(float f, double d) { + float fr; + double dr; + + fr = __builtin_arm_vcvtr_f(f, 0); + fr = __builtin_arm_vcvtr_f(f, 1); + fr = __builtin_arm_vcvtr_f(f, -1); // expected-error {{argument should be a value from 0 to 1}} + fr = __builtin_arm_vcvtr_f(f, 2); // expected-error {{argument should be a value from 0 to 1}} + + dr = __builtin_arm_vcvtr_f(d, 0); + dr = __builtin_arm_vcvtr_f(d, 1); + dr = __builtin_arm_vcvtr_f(d, -1); // expected-error {{argument should be a value from 0 to 1}} + dr = __builtin_arm_vcvtr_f(d, 2); // expected-error {{argument should be a value from 0 to 1}} +} ___ cfe-commits mailing list cfe-commits@lists.llvm.org 
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer created this revision. Herald added a subscriber: klimek. This adds _Float16 as a source language type. As a first step, _Float16 behaves the same as __fp16 and is thus an alias. This means that _Float16 also behaves like a storage-only type. Subsequent patches will implement the proper semantics of both types. https://reviews.llvm.org/D33719 Files: include/clang-c/Index.h include/clang/AST/ASTContext.h include/clang/AST/BuiltinTypes.def include/clang/Basic/Specifiers.h include/clang/Basic/TokenKinds.def include/clang/Sema/DeclSpec.h include/clang/Serialization/ASTBitCodes.h lib/AST/ASTContext.cpp lib/AST/ItaniumMangle.cpp lib/AST/MicrosoftMangle.cpp lib/AST/NSAPI.cpp lib/AST/Type.cpp lib/AST/TypeLoc.cpp lib/Analysis/PrintfFormatString.cpp lib/CodeGen/CGDebugInfo.cpp lib/CodeGen/CodeGenTypes.cpp lib/CodeGen/ItaniumCXXABI.cpp lib/Format/FormatToken.cpp lib/Index/USRGeneration.cpp lib/Lex/LiteralSupport.cpp lib/Parse/ParseDecl.cpp lib/Parse/ParseExpr.cpp lib/Parse/ParseExprCXX.cpp lib/Parse/ParseTentative.cpp lib/Sema/DeclSpec.cpp lib/Sema/SemaDecl.cpp lib/Sema/SemaTemplateVariadic.cpp lib/Sema/SemaType.cpp lib/Serialization/ASTCommon.cpp lib/Serialization/ASTReader.cpp test/CodeGenCXX/float16-declarations-error.cpp test/CodeGenCXX/float16-declarations.cpp tools/libclang/CXType.cpp Index: tools/libclang/CXType.cpp === --- tools/libclang/CXType.cpp +++ tools/libclang/CXType.cpp @@ -52,6 +52,7 @@ BTCASE(Float); BTCASE(Double); BTCASE(LongDouble); +BTCASE(Float16); BTCASE(Float128); BTCASE(NullPtr); BTCASE(Overload); Index: test/CodeGenCXX/float16-declarations.cpp === --- /dev/null +++ test/CodeGenCXX/float16-declarations.cpp @@ -0,0 +1,90 @@ +// RUN: %clang -S -emit-llvm --target=aarch64 %s -o - | FileCheck %s +// +/* Various contexts where type _Float16 can appear. The different check +prefixes are due to different mangling on X86 and different calling +convention on SystemZ. 
*/ + +/* Namespace */ +namespace { + _Float16 f1n; + _Float16 f2n = 33.f16; + _Float16 arr1n[10]; + _Float16 arr2n[] = { 1.2, 3.0, 3.e4 }; + const volatile _Float16 func1n(const _Float16 &arg) { +return arg + f2n + arr1n[4] - arr2n[1]; + } +} + +/* File */ +_Float16 f1f; +_Float16 f2f = 32.4; +static _Float16 f3f = f2f; +_Float16 arr1f[10]; +_Float16 arr2f[] = { -1.2, -3.0, -3.e4 }; +_Float16 func1f(_Float16 arg); + +/* Class */ +class C1 { + _Float16 f1c; + static const _Float16 f2c; + volatile _Float16 f3c; +public: + C1(_Float16 arg) : f1c(arg), f3c(arg) { } + _Float16 func1c(_Float16 arg ) { +return f1c + arg; + } + static _Float16 func2c(_Float16 arg) { +return arg * C1::f2c; + } +}; + +/* Template */ +template C func1t(C arg) { return arg * 2.f16; } +template struct S1 { + C mem1; +}; +template <> struct S1<_Float16> { + _Float16 mem2; +}; + +/* Local */ +int main(void) { + _Float16 f1l = 123e220q; + _Float16 f2l = -0.f16; + _Float16 f3l = 1.000976562; + C1 c1(f1l); + S1<_Float16> s1 = { 132.f16 }; + _Float16 f4l = func1n(f1l) + func1f(f2l) + c1.func1c(f3l) + c1.func2c(f1l) + +func1t(f1l) + s1.mem2 - f1n + f2n; +#if (__cplusplus >= 201103L) + auto f5l = -1.f16, *f6l = &f2l, f7l = func1t(f3l); +#endif + _Float16 f8l = f4l++; + _Float16 arr1l[] = { -1.f16, -0.f16, -11.f16 }; +} + + +// CHECK-DAG: @f1f = global half 0xH, align 2 +// CHECK-DAG: @f2f = global half 0xH500D, align 2 +// CHECK-DAG: @arr1f = global [10 x half] zeroinitializer, align 2 +// CHECK-DAG: @arr2f = global [3 x half] [half 0xHBCCD, half 0xHC200, half 0xHF753], align 2 +// CHECK-DAG: @_ZZ4mainE2s1 = private unnamed_addr constant %struct.S1 { half 0xH5820 }, align 2 +// CHECK-DAG: @_ZN12_GLOBAL__N_13f1nE = internal global half 0xH, align 2 +// CHECK-DAG: @_ZN12_GLOBAL__N_13f2nE = internal global half 0xH5020, align 2 +// CHECK-DAG: @_ZZ4mainE5arr1l = private unnamed_addr constant [3 x half] [half 0xHBC00, half 0xH8000, half 0xHC980], align 2 +// CHECK-DAG: @_ZN12_GLOBAL__N_15arr1nE = internal 
global [10 x half] zeroinitializer, align 2 +// CHECK-DAG: @_ZN12_GLOBAL__N_15arr2nE = internal global [3 x half] [half 0xH3CCD, half 0xH4200, half 0xH7753], align 2 +// CHECK-DAG: @_ZN2C13f2cE = external constant half, align 2 + +// CHECK-DAG: define linkonce_odr void @_ZN2C1C2EDh(%class.C1* %this, half %arg) +// CHECK-DAG: define linkonce_odr half @_ZN2C16func1cEDh(%class.C1* %this, half %arg) +// CHECK-DAG: define linkonce_odr half @_ZN2C16func2cEDh(half %arg) +// CHECK-DAG: define linkonce_odr half @_Z6func1tIDhET_S0_(half %arg) + +// CHECK-DAG: store half 0xH7C00, half* %f1l, align 2 +// CHECK-DAG: store half 0xH8000, half* %f2l, align 2 +// CHECK-DAG: store half 0xH3C01, half* %f3l, align 2 + +// CHECK-DAG: [[F4L:%[a-z0-9]+]] = load half, half* %f4l, align 2 +// CHECK
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer added inline comments.

Comment at: include/clang-c/Index.h:3015
   CXType_Half = 31,
+  CXType_Float16 = 30,
   CXType_FirstBuiltin = CXType_Void,

rogfer01 wrote:
> This enumerator is the same as `CXType_Float128` above, is that intended?

Ah, thanks, copy-paste mistake. Will fix.

https://reviews.llvm.org/D33719
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer updated this revision to Diff 100866. SjoerdMeijer added a comment. Fixed typos 'TST_float16' and 'CXType_Float16 = 30', and have also added it to a switch that I had missed. https://reviews.llvm.org/D33719 Files: include/clang-c/Index.h include/clang/AST/ASTContext.h include/clang/AST/BuiltinTypes.def include/clang/Basic/Specifiers.h include/clang/Basic/TokenKinds.def include/clang/Sema/DeclSpec.h include/clang/Serialization/ASTBitCodes.h lib/AST/ASTContext.cpp lib/AST/ItaniumMangle.cpp lib/AST/MicrosoftMangle.cpp lib/AST/NSAPI.cpp lib/AST/Type.cpp lib/AST/TypeLoc.cpp lib/Analysis/PrintfFormatString.cpp lib/CodeGen/CGDebugInfo.cpp lib/CodeGen/CodeGenTypes.cpp lib/CodeGen/ItaniumCXXABI.cpp lib/Format/FormatToken.cpp lib/Index/USRGeneration.cpp lib/Lex/LiteralSupport.cpp lib/Parse/ParseDecl.cpp lib/Parse/ParseExpr.cpp lib/Parse/ParseExprCXX.cpp lib/Parse/ParseTentative.cpp lib/Sema/DeclSpec.cpp lib/Sema/SemaDecl.cpp lib/Sema/SemaTemplateVariadic.cpp lib/Sema/SemaType.cpp lib/Serialization/ASTCommon.cpp lib/Serialization/ASTReader.cpp test/CodeGenCXX/float16-declarations-error.cpp test/CodeGenCXX/float16-declarations.cpp tools/libclang/CXType.cpp Index: tools/libclang/CXType.cpp === --- tools/libclang/CXType.cpp +++ tools/libclang/CXType.cpp @@ -52,6 +52,7 @@ BTCASE(Float); BTCASE(Double); BTCASE(LongDouble); +BTCASE(Float16); BTCASE(Float128); BTCASE(NullPtr); BTCASE(Overload); @@ -490,7 +491,7 @@ TKIND(Char_U); TKIND(UChar); TKIND(Char16); -TKIND(Char32); +TKIND(Char32); TKIND(UShort); TKIND(UInt); TKIND(ULong); @@ -508,6 +509,7 @@ TKIND(Float); TKIND(Double); TKIND(LongDouble); +TKIND(Float16); TKIND(Float128); TKIND(NullPtr); TKIND(Overload); Index: test/CodeGenCXX/float16-declarations.cpp === --- /dev/null +++ test/CodeGenCXX/float16-declarations.cpp @@ -0,0 +1,90 @@ +// RUN: %clang -S -emit-llvm --target=aarch64 %s -o - | FileCheck %s +// +/* Various contexts where type _Float16 can appear. 
The different check +prefixes are due to different mangling on X86 and different calling +convention on SystemZ. */ + +/* Namespace */ +namespace { + _Float16 f1n; + _Float16 f2n = 33.f16; + _Float16 arr1n[10]; + _Float16 arr2n[] = { 1.2, 3.0, 3.e4 }; + const volatile _Float16 func1n(const _Float16 &arg) { +return arg + f2n + arr1n[4] - arr2n[1]; + } +} + +/* File */ +_Float16 f1f; +_Float16 f2f = 32.4; +static _Float16 f3f = f2f; +_Float16 arr1f[10]; +_Float16 arr2f[] = { -1.2, -3.0, -3.e4 }; +_Float16 func1f(_Float16 arg); + +/* Class */ +class C1 { + _Float16 f1c; + static const _Float16 f2c; + volatile _Float16 f3c; +public: + C1(_Float16 arg) : f1c(arg), f3c(arg) { } + _Float16 func1c(_Float16 arg ) { +return f1c + arg; + } + static _Float16 func2c(_Float16 arg) { +return arg * C1::f2c; + } +}; + +/* Template */ +template C func1t(C arg) { return arg * 2.f16; } +template struct S1 { + C mem1; +}; +template <> struct S1<_Float16> { + _Float16 mem2; +}; + +/* Local */ +int main(void) { + _Float16 f1l = 123e220q; + _Float16 f2l = -0.f16; + _Float16 f3l = 1.000976562; + C1 c1(f1l); + S1<_Float16> s1 = { 132.f16 }; + _Float16 f4l = func1n(f1l) + func1f(f2l) + c1.func1c(f3l) + c1.func2c(f1l) + +func1t(f1l) + s1.mem2 - f1n + f2n; +#if (__cplusplus >= 201103L) + auto f5l = -1.f16, *f6l = &f2l, f7l = func1t(f3l); +#endif + _Float16 f8l = f4l++; + _Float16 arr1l[] = { -1.f16, -0.f16, -11.f16 }; +} + + +// CHECK-DAG: @f1f = global half 0xH, align 2 +// CHECK-DAG: @f2f = global half 0xH500D, align 2 +// CHECK-DAG: @arr1f = global [10 x half] zeroinitializer, align 2 +// CHECK-DAG: @arr2f = global [3 x half] [half 0xHBCCD, half 0xHC200, half 0xHF753], align 2 +// CHECK-DAG: @_ZZ4mainE2s1 = private unnamed_addr constant %struct.S1 { half 0xH5820 }, align 2 +// CHECK-DAG: @_ZN12_GLOBAL__N_13f1nE = internal global half 0xH, align 2 +// CHECK-DAG: @_ZN12_GLOBAL__N_13f2nE = internal global half 0xH5020, align 2 +// CHECK-DAG: @_ZZ4mainE5arr1l = private unnamed_addr constant [3 
x half] [half 0xHBC00, half 0xH8000, half 0xHC980], align 2 +// CHECK-DAG: @_ZN12_GLOBAL__N_15arr1nE = internal global [10 x half] zeroinitializer, align 2 +// CHECK-DAG: @_ZN12_GLOBAL__N_15arr2nE = internal global [3 x half] [half 0xH3CCD, half 0xH4200, half 0xH7753], align 2 +// CHECK-DAG: @_ZN2C13f2cE = external constant half, align 2 + +// CHECK-DAG: define linkonce_odr void @_ZN2C1C2EDh(%class.C1* %this, half %arg) +// CHECK-DAG: define linkonce_odr half @_ZN2C16func1cEDh(%class.C1* %this, half %arg) +// CHECK-DAG: define linkonce_odr half @_ZN2C16func2cEDh(half %arg) +// CHECK-DAG: define linkonce_odr half @_Z6func1tIDhET_S0_(half %arg) + +// CHECK-DAG: store half 0xH7C00, h
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer added a comment.

Hi Bruno, Akira,

Many thanks for your feedback!

Apologies for the missing context. The patch touches many files, so with context it is quite big (~4MB). I thought this would be too much if we need a few iterations. Anyway, I will include it from now on.

I am working on a new revision and fixing an issue that I noticed while restructuring the regression test: it was actually not creating float16 literals properly.

About the tests and using --target=aarch64: you're right that there should be nothing ARM-specific here. It is just that for AArch64 the output shows "half" IR types, which I preferred, while it looks like the x86 way of dealing with it is to convert it and work on i16 types. Perhaps I need tests for both?

Yes, initially I wanted to support _Float16 unconditionally, but now that you ask about it, I agree it makes more sense to enable it for C11 and C++11 and have:

KEYWORD(_Float16, KEYC11|KEYCXX11)

Thanks,
Sjoerd.

https://reviews.llvm.org/D33719
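As a rough illustration of the "half" versus i16 point: the two spellings carry the same 16 bits, only printed differently. The sketch below uses bit patterns taken from the test expectations elsewhere in this review (for example `half 0xH5020` and `i16 20512` for the literal 33). The helper names are hypothetical, and the decoder only handles normal numbers (no zeros, subnormals, infinities or NaNs).

```cpp
#include <cstdint>

// Hypothetical helper: 2 raised to a small integer power.
constexpr double pow2(int e) {
  return e == 0 ? 1.0 : e > 0 ? 2.0 * pow2(e - 1) : 0.5 * pow2(e + 1);
}

// Decode IEEE 754 binary16 bits into a double.
// Assumption: the input is a normal number (biased exponent 1..30).
constexpr double halfBitsToDouble(std::uint16_t bits) {
  return ((bits & 0x8000u) ? -1.0 : 1.0) *
         (1.0 + (bits & 0x03FFu) / 1024.0) *
         pow2(((bits >> 10) & 0x1F) - 15);
}

// Reinterpret the same 16 bits as a signed two's-complement integer,
// which is how an i16 constant is printed in LLVM IR.
constexpr int asSignedI16(std::uint16_t bits) {
  return bits >= 0x8000u ? static_cast<int>(bits) - 65536
                         : static_cast<int>(bits);
}

// 33.f16: "half 0xH5020" and "i16 20512" are the same bits.
static_assert(halfBitsToDouble(0x5020) == 33.0, "0x5020 encodes 33.0");
static_assert(asSignedI16(0x5020) == 20512, "0x5020 is 20512 as i16");

// -1.2 rounded to half: "half 0xHBCCD" and "i16 -17203".
static_assert(asSignedI16(0xBCCD) == -17203, "0xBCCD is -17203 as i16");
```

This suggests that a test covering both targets only needs different spellings of the constants, not different values.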
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer updated this revision to Diff 102543. SjoerdMeijer added a comment. Float16 is added as a native type. Implementing it as some sort of alias to fp16 caused too many type issues in expression/literals/etc., and thus was not an easier first step, and also in the end we want it to be a native type anyway. The tests have been restructured, and now also x86 is tested. It turned out I needed a helper function isFloat16Ty, for which I created llvm patch https://reviews.llvm.org/D34205. https://reviews.llvm.org/D33719 Files: include/clang-c/Index.h include/clang/AST/ASTContext.h include/clang/AST/BuiltinTypes.def include/clang/AST/Type.h include/clang/Basic/Specifiers.h include/clang/Basic/TokenKinds.def include/clang/Lex/LiteralSupport.h include/clang/Sema/DeclSpec.h include/clang/Serialization/ASTBitCodes.h lib/AST/ASTContext.cpp lib/AST/ItaniumMangle.cpp lib/AST/MicrosoftMangle.cpp lib/AST/NSAPI.cpp lib/AST/StmtPrinter.cpp lib/AST/Type.cpp lib/AST/TypeLoc.cpp lib/Analysis/PrintfFormatString.cpp lib/CodeGen/CGDebugInfo.cpp lib/CodeGen/CGExprScalar.cpp lib/CodeGen/CodeGenTypes.cpp lib/CodeGen/ItaniumCXXABI.cpp lib/Format/FormatToken.cpp lib/Index/USRGeneration.cpp lib/Lex/LiteralSupport.cpp lib/Parse/ParseDecl.cpp lib/Parse/ParseExpr.cpp lib/Parse/ParseExprCXX.cpp lib/Parse/ParseTentative.cpp lib/Sema/DeclSpec.cpp lib/Sema/SemaDecl.cpp lib/Sema/SemaExpr.cpp lib/Sema/SemaTemplateVariadic.cpp lib/Sema/SemaType.cpp lib/Serialization/ASTCommon.cpp lib/Serialization/ASTReader.cpp test/CodeGenCXX/float16-declarations.cpp test/Lexer/half-literal.cpp tools/libclang/CXType.cpp Index: tools/libclang/CXType.cpp === --- tools/libclang/CXType.cpp +++ tools/libclang/CXType.cpp @@ -52,6 +52,7 @@ BTCASE(Float); BTCASE(Double); BTCASE(LongDouble); +BTCASE(Float16); BTCASE(Float128); BTCASE(NullPtr); BTCASE(Overload); @@ -490,7 +491,7 @@ TKIND(Char_U); TKIND(UChar); TKIND(Char16); -TKIND(Char32); +TKIND(Char32); TKIND(UShort); TKIND(UInt); TKIND(ULong); @@ -508,6 +509,7 @@ 
TKIND(Float); TKIND(Double); TKIND(LongDouble); +TKIND(Float16); TKIND(Float128); TKIND(NullPtr); TKIND(Overload); Index: test/Lexer/half-literal.cpp === --- test/Lexer/half-literal.cpp +++ test/Lexer/half-literal.cpp @@ -1,3 +1,6 @@ // RUN: %clang_cc1 -fsyntax-only -verify -pedantic %s float a = 1.0h; // expected-error{{invalid suffix 'h' on floating constant}} float b = 1.0H; // expected-error{{invalid suffix 'H' on floating constant}} + +_Float16 c = 1.f166; // expected-error{{invalid suffix 'f166' on floating constant}} +_Float16 d = 1.f1; // expected-error{{invalid suffix 'f1' on floating constant}} Index: test/CodeGenCXX/float16-declarations.cpp === --- /dev/null +++ test/CodeGenCXX/float16-declarations.cpp @@ -0,0 +1,132 @@ +// RUN: %clang -std=c++11 --target=aarch64-arm--eabi -S -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-AARCH64 +// RUN: %clang -std=c++11 --target=x86_64 -S -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-X86 + +/* Various contexts where type _Float16 can appear. 
*/ + + +/* Namespace */ + +namespace { + _Float16 f1n; +// CHECK-DAG: @_ZN12_GLOBAL__N_13f1nE = internal global half 0xH, align 2 + + _Float16 f2n = 33.f16; +// CHECK-AARCH64-DAG: @_ZN12_GLOBAL__N_13f2nE = internal global half 0xH5020, align 2 +// CHECK-X86-DAG: @_ZN12_GLOBAL__N_13f2nE = internal global i16 20512, align 2 + + _Float16 arr1n[10]; +// CHECK-AARCH64-DAG: @_ZN12_GLOBAL__N_15arr1nE = internal global [10 x half] zeroinitializer, align 2 +// CHECK-X86-DAG: @_ZN12_GLOBAL__N_15arr1nE = internal global [10 x half] zeroinitializer, align 16 + + _Float16 arr2n[] = { 1.2, 3.0, 3.e4 }; +// CHECK-AARCH64-DAG: @_ZN12_GLOBAL__N_15arr2nE = internal global [3 x half] [half 0xH3CCD, half 0xH4200, half 0xH7753], align 2 +// CHECK-X86-DAG: @_ZN12_GLOBAL__N_15arr2nE = internal global [3 x i16] [i16 15565, i16 16896, i16 30547], align 2 + + const volatile _Float16 func1n(const _Float16 &arg) { +return arg + f2n + arr1n[4] - arr2n[1]; + } +} + + +/* File */ + +_Float16 f1f; +// CHECK-AARCH64-DAG: @f1f = global half 0xH, align 2 +// CHECK-X86-DAG: @f1f = global half 0xH, align 2 + +_Float16 f2f = 32.4; +// CHECK-AARCH64-DAG: @f2f = global half 0xH500D, align 2 +// CHECK-X86-DAG: @f2f = global i16 20493, align 2 + +_Float16 arr1f[10]; +// CHECK-AARCH64-DAG: @arr1f = global [10 x half] zeroinitializer, align 2 +// CHECK-X86-DAG: @arr1f = global [10 x half] zeroinitializer, align 16 + +_Float16 arr2f[] = { -1.2, -3.0, -3.e4 }; +// CHECK-AARCH64-DAG: @arr2f = global [3 x half] [half 0xHBCCD, half 0x
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer updated this revision to Diff 102649. SjoerdMeijer added a comment. Thanks Roger. I did the clean up; there were indeed still a few fixmes there. The good thing is that it's a self-contained clang patch again: we don't need https://reviews.llvm.org/D34205, which I have abandoned. https://reviews.llvm.org/D33719 Files: include/clang-c/Index.h include/clang/AST/ASTContext.h include/clang/AST/BuiltinTypes.def include/clang/Basic/Specifiers.h include/clang/Basic/TokenKinds.def include/clang/Lex/LiteralSupport.h include/clang/Sema/DeclSpec.h include/clang/Serialization/ASTBitCodes.h lib/AST/ASTContext.cpp lib/AST/ItaniumMangle.cpp lib/AST/MicrosoftMangle.cpp lib/AST/NSAPI.cpp lib/AST/StmtPrinter.cpp lib/AST/Type.cpp lib/AST/TypeLoc.cpp lib/Analysis/PrintfFormatString.cpp lib/CodeGen/CGDebugInfo.cpp lib/CodeGen/CGExprScalar.cpp lib/CodeGen/CodeGenTypes.cpp lib/CodeGen/ItaniumCXXABI.cpp lib/Format/FormatToken.cpp lib/Index/USRGeneration.cpp lib/Lex/LiteralSupport.cpp lib/Parse/ParseDecl.cpp lib/Parse/ParseExpr.cpp lib/Parse/ParseExprCXX.cpp lib/Parse/ParseTentative.cpp lib/Sema/DeclSpec.cpp lib/Sema/SemaDecl.cpp lib/Sema/SemaExpr.cpp lib/Sema/SemaTemplateVariadic.cpp lib/Sema/SemaType.cpp lib/Serialization/ASTCommon.cpp lib/Serialization/ASTReader.cpp test/CodeGenCXX/float16-declarations.cpp test/Lexer/half-literal.cpp tools/libclang/CXType.cpp Index: tools/libclang/CXType.cpp === --- tools/libclang/CXType.cpp +++ tools/libclang/CXType.cpp @@ -53,6 +53,7 @@ BTCASE(Float); BTCASE(Double); BTCASE(LongDouble); +BTCASE(Float16); BTCASE(Float128); BTCASE(NullPtr); BTCASE(Overload); @@ -520,7 +521,7 @@ TKIND(Char_U); TKIND(UChar); TKIND(Char16); -TKIND(Char32); +TKIND(Char32); TKIND(UShort); TKIND(UInt); TKIND(ULong); @@ -538,6 +539,7 @@ TKIND(Float); TKIND(Double); TKIND(LongDouble); +TKIND(Float16); TKIND(Float128); TKIND(NullPtr); TKIND(Overload); Index: test/Lexer/half-literal.cpp === --- test/Lexer/half-literal.cpp +++ test/Lexer/half-literal.cpp @@ -1,3 +1,6 
@@ // RUN: %clang_cc1 -fsyntax-only -verify -pedantic %s float a = 1.0h; // expected-error{{invalid suffix 'h' on floating constant}} float b = 1.0H; // expected-error{{invalid suffix 'H' on floating constant}} + +_Float16 c = 1.f166; // expected-error{{invalid suffix 'f166' on floating constant}} +_Float16 d = 1.f1; // expected-error{{invalid suffix 'f1' on floating constant}} Index: test/CodeGenCXX/float16-declarations.cpp === --- /dev/null +++ test/CodeGenCXX/float16-declarations.cpp @@ -0,0 +1,132 @@ +// RUN: %clang -std=c++11 --target=aarch64-arm--eabi -S -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-AARCH64 +// RUN: %clang -std=c++11 --target=x86_64 -S -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-X86 + +/* Various contexts where type _Float16 can appear. */ + + +/* Namespace */ + +namespace { + _Float16 f1n; +// CHECK-DAG: @_ZN12_GLOBAL__N_13f1nE = internal global half 0xH, align 2 + + _Float16 f2n = 33.f16; +// CHECK-AARCH64-DAG: @_ZN12_GLOBAL__N_13f2nE = internal global half 0xH5020, align 2 +// CHECK-X86-DAG: @_ZN12_GLOBAL__N_13f2nE = internal global i16 20512, align 2 + + _Float16 arr1n[10]; +// CHECK-AARCH64-DAG: @_ZN12_GLOBAL__N_15arr1nE = internal global [10 x half] zeroinitializer, align 2 +// CHECK-X86-DAG: @_ZN12_GLOBAL__N_15arr1nE = internal global [10 x half] zeroinitializer, align 16 + + _Float16 arr2n[] = { 1.2, 3.0, 3.e4 }; +// CHECK-AARCH64-DAG: @_ZN12_GLOBAL__N_15arr2nE = internal global [3 x half] [half 0xH3CCD, half 0xH4200, half 0xH7753], align 2 +// CHECK-X86-DAG: @_ZN12_GLOBAL__N_15arr2nE = internal global [3 x i16] [i16 15565, i16 16896, i16 30547], align 2 + + const volatile _Float16 func1n(const _Float16 &arg) { +return arg + f2n + arr1n[4] - arr2n[1]; + } +} + + +/* File */ + +_Float16 f1f; +// CHECK-AARCH64-DAG: @f1f = global half 0xH, align 2 +// CHECK-X86-DAG: @f1f = global half 0xH, align 2 + +_Float16 f2f = 32.4; +// CHECK-AARCH64-DAG: @f2f = global half 0xH500D, align 
2 +// CHECK-X86-DAG: @f2f = global i16 20493, align 2 + +_Float16 arr1f[10]; +// CHECK-AARCH64-DAG: @arr1f = global [10 x half] zeroinitializer, align 2 +// CHECK-X86-DAG: @arr1f = global [10 x half] zeroinitializer, align 16 + +_Float16 arr2f[] = { -1.2, -3.0, -3.e4 }; +// CHECK-AARCH64-DAG: @arr2f = global [3 x half] [half 0xHBCCD, half 0xHC200, half 0xHF753], align 2 +// CHECK-X86-DAG: @arr2f = global [3 x i16] [i16 -17203, i16 -15872, i16 -2221], align 2 + +_Float16 func1f(_Float16 arg); + + +/* Class */ + +class C1 { + _Float16 f1c; + + static const _Float16 f2c;
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer updated this revision to Diff 103201. SjoerdMeijer edited the summary of this revision. SjoerdMeijer added a comment. I have added a fix for mixed __fp16 and _Float16 expressions: _Float16 type is converted to __fp16 type and then the operation is completed as if both operands were of __fp16 type. I've implemented this by adding _Float16 to the FloatingRank. By declaring Float16Rank to be smaller than HalfRank, we automatically get the promotions to __fp16 when required. I thought this approach is more elegant than adding more special cases in the floating point conversion functions. I've added another test case ##test/Frontend/float16.cpp## that checks the AST (this is in addition to the codegen test that we already have). This test also checks mixed type expressions. https://reviews.llvm.org/D33719 Files: include/clang-c/Index.h include/clang/AST/ASTContext.h include/clang/AST/BuiltinTypes.def include/clang/Basic/Specifiers.h include/clang/Basic/TokenKinds.def include/clang/Lex/LiteralSupport.h include/clang/Sema/DeclSpec.h include/clang/Serialization/ASTBitCodes.h lib/AST/ASTContext.cpp lib/AST/ItaniumMangle.cpp lib/AST/MicrosoftMangle.cpp lib/AST/NSAPI.cpp lib/AST/StmtPrinter.cpp lib/AST/Type.cpp lib/AST/TypeLoc.cpp lib/Analysis/PrintfFormatString.cpp lib/CodeGen/CGDebugInfo.cpp lib/CodeGen/CGExprScalar.cpp lib/CodeGen/CodeGenTypes.cpp lib/CodeGen/ItaniumCXXABI.cpp lib/Format/FormatToken.cpp lib/Index/USRGeneration.cpp lib/Lex/LiteralSupport.cpp lib/Parse/ParseDecl.cpp lib/Parse/ParseExpr.cpp lib/Parse/ParseExprCXX.cpp lib/Parse/ParseTentative.cpp lib/Sema/DeclSpec.cpp lib/Sema/SemaDecl.cpp lib/Sema/SemaExpr.cpp lib/Sema/SemaTemplateVariadic.cpp lib/Sema/SemaType.cpp lib/Serialization/ASTCommon.cpp lib/Serialization/ASTReader.cpp test/CodeGenCXX/float16-declarations.cpp test/Frontend/float16.cpp test/Lexer/half-literal.cpp tools/libclang/CXType.cpp Index: tools/libclang/CXType.cpp === --- tools/libclang/CXType.cpp +++ tools/libclang/CXType.cpp @@ 
-53,6 +53,7 @@ BTCASE(Float); BTCASE(Double); BTCASE(LongDouble); +BTCASE(Float16); BTCASE(Float128); BTCASE(NullPtr); BTCASE(Overload); @@ -520,7 +521,7 @@ TKIND(Char_U); TKIND(UChar); TKIND(Char16); -TKIND(Char32); +TKIND(Char32); TKIND(UShort); TKIND(UInt); TKIND(ULong); @@ -538,6 +539,7 @@ TKIND(Float); TKIND(Double); TKIND(LongDouble); +TKIND(Float16); TKIND(Float128); TKIND(NullPtr); TKIND(Overload); Index: test/Lexer/half-literal.cpp === --- test/Lexer/half-literal.cpp +++ test/Lexer/half-literal.cpp @@ -1,3 +1,6 @@ // RUN: %clang_cc1 -fsyntax-only -verify -pedantic %s float a = 1.0h; // expected-error{{invalid suffix 'h' on floating constant}} float b = 1.0H; // expected-error{{invalid suffix 'H' on floating constant}} + +_Float16 c = 1.f166; // expected-error{{invalid suffix 'f166' on floating constant}} +_Float16 d = 1.f1; // expected-error{{invalid suffix 'f1' on floating constant}} Index: test/Frontend/float16.cpp === --- /dev/null +++ test/Frontend/float16.cpp @@ -0,0 +1,297 @@ +// RUN: %clang_cc1 -std=c++11 -ast-dump %s | FileCheck %s +// RUN: %clang_cc1 -std=c++11 -ast-dump -fnative-half-type %s | FileCheck %s --check-prefix=CHECK-NATIVE + +/* Various contexts where type _Float16 can appear. 
*/ + +/* Namespace */ +namespace { + _Float16 f1n; + _Float16 f2n = 33.f16; + _Float16 arr1n[10]; + _Float16 arr2n[] = { 1.2, 3.0, 3.e4 }; + const volatile _Float16 func1n(const _Float16 &arg) { +return arg + f2n + arr1n[4] - arr2n[1]; + } +} + +//CHECK: |-NamespaceDecl +//CHECK: | |-VarDecl {{.*}} f1n '_Float16' +//CHECK: | |-VarDecl {{.*}} f2n '_Float16' cinit +//CHECK: | | `-FloatingLiteral {{.*}} '_Float16' 3.30e+01 +//CHECK: | |-VarDecl {{.*}} arr1n '_Float16 [10]' +//CHECK: | |-VarDecl {{.*}} arr2n '_Float16 [3]' cinit +//CHECK: | | `-InitListExpr {{.*}} '_Float16 [3]' +//CHECK: | | |-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | | | `-FloatingLiteral {{.*}} 'double' 1.20e+00 +//CHECK: | | |-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | | | `-FloatingLiteral {{.*}} 'double' 3.00e+00 +//CHECK: | | `-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | | `-FloatingLiteral {{.*}} 'double' 3.00e+04 +//CHECK: | `-FunctionDecl {{.*}} func1n 'const volatile _Float16 (const _Float16 &)' + +/* File */ +_Float16 f1f; +_Float16 f2f = 32.4; +_Float16 arr1f[10]; +_Float16 arr2f[] = { -1.2, -3.0, -3.e4 }; +_Float16 func1f(_Float16 arg); + +//CHECK: |-VarDecl {{.*}} f1f '_Float16' +//CHECK: |-VarDecl {{.*}} f2f '_Float16' cinit +//CHECK: | `-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | `-FloatingLiteral {{.*}} 'double' 3.
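The rank-based promotion described in this revision's comment can be modelled roughly as follows. This is a simplified sketch, not the code from the patch: the enumerator names `Float16Rank` and `HalfRank` follow the review text, while `commonRank`, `rankSpelling` and the remaining enumerators are assumptions made for illustration.

```cpp
#include <string>

// Simplified model of a floating-point rank ordering. Listing Float16Rank
// before HalfRank makes it the smaller of the two.
enum FloatingRank { Float16Rank, HalfRank, FloatRank, DoubleRank };

// Hypothetical helper: a mixed expression is evaluated in the type of the
// operand with the larger rank.
constexpr FloatingRank commonRank(FloatingRank lhs, FloatingRank rhs) {
  return lhs < rhs ? rhs : lhs;
}

// Hypothetical helper: source spelling for each rank in this model.
constexpr const char *rankSpelling(FloatingRank rank) {
  return rank == Float16Rank ? "_Float16"
         : rank == HalfRank  ? "__fp16"
         : rank == FloatRank ? "float"
                             : "double";
}

// Mixing _Float16 with __fp16 selects __fp16 without any special case.
static_assert(commonRank(Float16Rank, HalfRank) == HalfRank,
              "_Float16 op __fp16 is performed as __fp16");
static_assert(commonRank(HalfRank, Float16Rank) == HalfRank,
              "operand order does not matter");
```

The design point is that the ordering alone decides the conversion, so no extra branch for the `_Float16`/`__fp16` pair is needed in this model.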
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer updated this revision to Diff 103397. SjoerdMeijer added a comment. This fixes the “DefaultVariadicArgumentPromotion” for Float16: they should be promoted to double, which makes now e.g. printf work. I have added printf tests to both the AST and codegen test to check variadic functions (and thus checked that printf works), and also restructured the AST test a bit so that the check lines are near the corresponding program statement. Yesterday I fixed mixed type (fp16 and float16) expressions, and with this fix for variadic arguments, I think the C support is ready. Do you agree @bruno, or do you think I've missed anything? https://reviews.llvm.org/D33719 Files: include/clang-c/Index.h include/clang/AST/ASTContext.h include/clang/AST/BuiltinTypes.def include/clang/Basic/Specifiers.h include/clang/Basic/TokenKinds.def include/clang/Lex/LiteralSupport.h include/clang/Sema/DeclSpec.h include/clang/Serialization/ASTBitCodes.h lib/AST/ASTContext.cpp lib/AST/ItaniumMangle.cpp lib/AST/MicrosoftMangle.cpp lib/AST/NSAPI.cpp lib/AST/StmtPrinter.cpp lib/AST/Type.cpp lib/AST/TypeLoc.cpp lib/Analysis/PrintfFormatString.cpp lib/CodeGen/CGDebugInfo.cpp lib/CodeGen/CGExprScalar.cpp lib/CodeGen/CodeGenTypes.cpp lib/CodeGen/ItaniumCXXABI.cpp lib/Format/FormatToken.cpp lib/Index/USRGeneration.cpp lib/Lex/LiteralSupport.cpp lib/Parse/ParseDecl.cpp lib/Parse/ParseExpr.cpp lib/Parse/ParseExprCXX.cpp lib/Parse/ParseTentative.cpp lib/Sema/DeclSpec.cpp lib/Sema/SemaDecl.cpp lib/Sema/SemaExpr.cpp lib/Sema/SemaTemplateVariadic.cpp lib/Sema/SemaType.cpp lib/Serialization/ASTCommon.cpp lib/Serialization/ASTReader.cpp test/CodeGenCXX/float16-declarations.cpp test/Frontend/float16.cpp test/Lexer/half-literal.cpp tools/libclang/CXType.cpp Index: tools/libclang/CXType.cpp === --- tools/libclang/CXType.cpp +++ tools/libclang/CXType.cpp @@ -53,6 +53,7 @@ BTCASE(Float); BTCASE(Double); BTCASE(LongDouble); +BTCASE(Float16); BTCASE(Float128); BTCASE(NullPtr); BTCASE(Overload); @@ -520,7 
+521,7 @@ TKIND(Char_U); TKIND(UChar); TKIND(Char16); -TKIND(Char32); +TKIND(Char32); TKIND(UShort); TKIND(UInt); TKIND(ULong); @@ -538,6 +539,7 @@ TKIND(Float); TKIND(Double); TKIND(LongDouble); +TKIND(Float16); TKIND(Float128); TKIND(NullPtr); TKIND(Overload); Index: test/Lexer/half-literal.cpp === --- test/Lexer/half-literal.cpp +++ test/Lexer/half-literal.cpp @@ -1,3 +1,6 @@ // RUN: %clang_cc1 -fsyntax-only -verify -pedantic %s float a = 1.0h; // expected-error{{invalid suffix 'h' on floating constant}} float b = 1.0H; // expected-error{{invalid suffix 'H' on floating constant}} + +_Float16 c = 1.f166; // expected-error{{invalid suffix 'f166' on floating constant}} +_Float16 d = 1.f1; // expected-error{{invalid suffix 'f1' on floating constant}} Index: test/Frontend/float16.cpp === --- /dev/null +++ test/Frontend/float16.cpp @@ -0,0 +1,314 @@ +// RUN: %clang_cc1 -std=c++11 -ast-dump %s | FileCheck %s +// RUN: %clang_cc1 -std=c++11 -ast-dump -fnative-half-type %s | FileCheck %s --check-prefix=CHECK-NATIVE + +/* Various contexts where type _Float16 can appear. 
*/ + +/* Namespace */ +namespace { + _Float16 f1n; + _Float16 f2n = 33.f16; + _Float16 arr1n[10]; + _Float16 arr2n[] = { 1.2, 3.0, 3.e4 }; + const volatile _Float16 func1n(const _Float16 &arg) { +return arg + f2n + arr1n[4] - arr2n[1]; + } +} + +//CHECK: |-NamespaceDecl +//CHECK: | |-VarDecl {{.*}} f1n '_Float16' +//CHECK: | |-VarDecl {{.*}} f2n '_Float16' cinit +//CHECK: | | `-FloatingLiteral {{.*}} '_Float16' 3.30e+01 +//CHECK: | |-VarDecl {{.*}} arr1n '_Float16 [10]' +//CHECK: | |-VarDecl {{.*}} arr2n '_Float16 [3]' cinit +//CHECK: | | `-InitListExpr {{.*}} '_Float16 [3]' +//CHECK: | | |-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | | | `-FloatingLiteral {{.*}} 'double' 1.20e+00 +//CHECK: | | |-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | | | `-FloatingLiteral {{.*}} 'double' 3.00e+00 +//CHECK: | | `-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | | `-FloatingLiteral {{.*}} 'double' 3.00e+04 +//CHECK: | `-FunctionDecl {{.*}} func1n 'const volatile _Float16 (const _Float16 &)' + +/* File */ +_Float16 f1f; +_Float16 f2f = 32.4; +_Float16 arr1f[10]; +_Float16 arr2f[] = { -1.2, -3.0, -3.e4 }; +_Float16 func1f(_Float16 arg); + +//CHECK: |-VarDecl {{.*}} f1f '_Float16' +//CHECK: |-VarDecl {{.*}} f2f '_Float16' cinit +//CHECK: | `-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | `-FloatingLiteral {{.*}} 'double' 3.24e+01 +//CHECK: |-VarDecl {{.*}} arr1f '_Float16 [10]' +//CHECK: |-VarDecl {{.*}} arr2f '_Float16 [3]' cinit +//CHECK: | `-InitListExpr {{.*}} '_Flo
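The variadic promotion discussed in the D33719 update above mirrors what the language already does for float arguments. The following is only a minimal sketch of that mechanism: it uses float as a stand-in because _Float16 is not available on every host compiler, and sum_promoted and check_sum_promoted are hypothetical names that do not appear in the patch.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical helper: sums `count` variadic arguments, reading each as double.
// Arguments narrower than double (float here) arrive already promoted.
double sum_promoted(int count, ...) {
    va_list args;
    va_start(args, count);
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        // The callee must read the promoted type; reading float would be
        // undefined behavior.
        total += va_arg(args, double);
    }
    va_end(args);
    return total;
}

// Self-check with values that are exactly representable in binary.
void check_sum_promoted() {
    float a = 1.5f;
    float b = 2.25f;
    assert(sum_promoted(2, a, b) == 3.75);
}
```

This is the same reason printf's %f conversion consumes a double: the callee never sees the narrower type.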
[PATCH] D41792: [AArch64] Add ARMv8.2-A FP16 scalar intrinsics
SjoerdMeijer added a comment. Thanks for working on this! Some comments inline. Comment at: clang/include/clang/Basic/arm_fp16.td:19 +// The operations are subclasses of Operation providing a list of DAGs, the +// last of which is the return value. +// nit: trailing whitespace. Comment at: clang/include/clang/Basic/arm_fp16.td:58 +class IInst : Inst {} + +// ARMv8.2-A FP16 intrinsics. There's a little bit of duplication here: the definitions above are the same as in arm_neon.td. Would it be easy to share this, with e.g. an include? Comment at: clang/include/clang/Basic/arm_fp16.td:79 + + // Rounding + def FRINTZ_S64H : SInst<"vrnd", "ss", "Sh">; trailing whitespace Comment at: clang/include/clang/Basic/arm_fp16.td:88 + + // Conversion + def SCALAR_SCVTFSH : SInst<"vcvth_f16", "Ys", "silUsUiUl">; trailing whitespace Comment at: clang/include/clang/Basic/arm_fp16.td:89 + // Conversion + def SCALAR_SCVTFSH : SInst<"vcvth_f16", "Ys", "silUsUiUl">; + def SCALAR_FCVTZSH : SInst<"vcvt_s16", "$s", "Sh">; Nit: for the definitions below, indentation is sometimes a bit off. I.e. some defs have 1 space after the semicolon others have 2. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:4102 NEONMAP1(vuqadds_s32, aarch64_neon_suqadd, Add1ArgType), + // FP16 scalar intrinisics go here. + NEONMAP1(vabdh_f16, aarch64_sisd_fabd, Add1ArgType), Looks like a few intrinsic descriptions are missing here. For example, the first 2-operand intrinsic vaddh_f16 is missing, but there are also more. Is this intentional, or might they have slipped through the cracks (or am I missing something)? Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:149 [IntrNoMem]>; + + class AdvSIMD_1Arg_Intrinsic This and the other changes in this file are changes to LLVM. Do we need these changes for this patch? It doesn't look like it. 
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:360 // Vector Absolute Value - def int_aarch64_neon_abs : AdvSIMD_1IntArg_Intrinsic; + //def int_aarch64_neon_abs : AdvSIMD_1IntArg_Intrinsic; + def int_aarch64_neon_abs : AdvSIMD_1Arg_Intrinsic; Forgot to remove this? https://reviews.llvm.org/D41792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D41792: [AArch64] Add ARMv8.2-A FP16 scalar intrinsics
SjoerdMeijer added inline comments. Comment at: clang/include/clang/Basic/arm_fp16.td:58 +class IInst : Inst {} + +// ARMv8.2-A FP16 intrinsics. az wrote: > SjoerdMeijer wrote: > > There's a little bit of duplication here: the definitions above are the > > same as in arm_neon.td. Would it be easy to share this, with e.g. an > > include? > The duplication is small compared to the overall infrastructure/data > structure needed to automatically generate the intrinsics. There are 3 ways > to do this: 1) copy only the needed data structure in arm_fp16.td (this is > what was done in original review) 2) put all data structure in a newly > created file and include it in arm_neon.td and arm_fp16.td (done here). 3) > put only the duplication in a new file and include it. I did not go for this > one given that we create a new file for the only purpose of avoiding a small > duplication but I am fine of going with 3 too. Note that some of the > duplicated structure in the original arm_fp16.td was a stripped down version > of the copied one. Given that the duplication is tiny, I don't have strong opinions to be honest. Would be nice to share these definitions if that's easy to do, otherwise we can perfectly live with this I think. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:4102 NEONMAP1(vuqadds_s32, aarch64_neon_suqadd, Add1ArgType), + // FP16 scalar intrinisics go here. + NEONMAP1(vabdh_f16, aarch64_sisd_fabd, Add1ArgType), az wrote: > SjoerdMeijer wrote: > > Looks like a few intrinsic descriptions are missing here. For example, the > > first 2-operand intrinsic vaddh_f16 is missing, but there are also more. Is > > this intentional, or might they have slipped through the cracks (or am I > > missing something)? > I agree that this is confusing. For the intrinsics listed in this table, code > generation happens in a generic way based on the info in the table. 
The ones > not listed in this table are addressed in a more specific way below in a the > function called EmitAArch64BuiltinExpr. While I do not like how few things > were implemented in generating the intrinsics, I am in general following the > approach taken for arm_neon instead of introducing a new approach. Ah, I see, I somehow missed that. Fair enough. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:6511 "vgetq_lane"); + case NEON::BI__builtin_neon_vaddh_f16: + Ops.push_back(EmitScalarExpr(E->getArg(1))); nit: indentation seems off by 1 for these case statements. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:6531 + Value *F = CGM.getIntrinsic(Intrinsic::fma, HalfTy); +Value *Zero = llvm::ConstantFP::getZeroValueForNegation(HalfTy); + Value* Sub = Builder.CreateFSub(Zero, EmitScalarExpr(E->getArg(1)), "vsubh"); nit: indentation seems off by 1 Comment at: clang/utils/TableGen/NeonEmitter.cpp:2464 +" * Permission is hereby granted, free of charge, to any person " +"obtaining " +"a copy\n" more nits: I see this is copied from above, but I think this and the next line can be on the same line, just increasing readability a tiny bit. Comment at: clang/utils/TableGen/NeonEmitter.cpp:2468 +"\"Software\")," +" to deal\n" +" * in the Software without restriction, including without limitation " same here Comment at: clang/utils/TableGen/NeonEmitter.cpp:2470 +" * in the Software without restriction, including without limitation " +"the " +"rights\n" and same here Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:250 def int_aarch64_neon_umax : AdvSIMD_2VectorArg_Intrinsic; - def int_aarch64_neon_fmax : AdvSIMD_2VectorArg_Intrinsic; + def int_aarch64_neon_fmax : AdvSIMD_2FloatArg_Intrinsic; def int_aarch64_neon_fmaxnmp : AdvSIMD_2VectorArg_Intrinsic; There's a scalar and vector variant of FMAX and thus I am wondering if we don't need two definitions here: one using AdvSIMD_2FloatArg_Intrinsic and the other AdvSIMD_2VectorArg_Intrinsic? 
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:262 def int_aarch64_neon_umin : AdvSIMD_2VectorArg_Intrinsic; - def int_aarch64_neon_fmin : AdvSIMD_2VectorArg_Intrinsic; + def int_aarch64_neon_fmin : AdvSIMD_2FloatArg_Intrinsic; def int_aarch64_neon_fminnmp : AdvSIMD_2VectorArg_Intrinsic; Same here for FMIN https://reviews.llvm.org/D41792 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
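The split az describes in this thread (a generic table for most builtins, dedicated cases for the rest) can be modelled roughly as below. This is a simplified sketch and not the CGBuiltin.cpp code: generic_table and lower_builtin are hypothetical names, and the only facts taken from the review are that vabdh_f16 maps to aarch64_sisd_fabd in the table and that vaddh_f16 is handled by its own case.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical builtin -> intrinsic table, standing in for the NEONMAP entries.
const std::map<std::string, std::string>& generic_table() {
    static const std::map<std::string, std::string> table = {
        {"vabdh_f16", "aarch64_sisd_fabd"},
    };
    return table;
}

// Describes how a builtin would be lowered in this model: first the generic
// table, then a dedicated case, mirroring the two paths from the review.
std::string lower_builtin(const std::string& name) {
    assert(!name.empty());
    const auto& table = generic_table();
    const auto it = table.find(name);
    if (it != table.end()) {
        return "intrinsic:" + it->second;  // generic, table-driven path
    }
    if (name == "vaddh_f16") {
        return "custom-case";              // handled by a dedicated case
    }
    return "unhandled";
}
```

A builtin missing from the table is therefore not necessarily unimplemented; it may simply take the second path.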
[PATCH] D43650: [ARM] Add ARMv8.2-A FP16 vector intrinsics
SjoerdMeijer accepted this revision. SjoerdMeijer added a comment. This revision is now accepted and ready to land. Looks like sensible cleanups/fixes/additions to me. We were struggling with whether to pass an i16 or f16 type, both of which can be illegal types. Therefore, it perhaps doesn't really matter much which one we use. At the time we were having these problems, the ARM backend wasn't ready for f16s, but a lot has changed since then. So I think we can get this in, review where we are, and iterate on this if necessary. https://reviews.llvm.org/D43650
[PATCH] D44222: [AArch64] Add vmulxh_lane FP16 intrinsics
SjoerdMeijer created this revision. SjoerdMeijer added reviewers: az, evandro, olista01. Herald added subscribers: kristof.beyls, javed.absar, rengolin. Add 2 vmulxh_lane vector intrinsics that were commented out. https://reviews.llvm.org/D44222 Files: include/clang/Basic/arm_neon.td test/CodeGen/aarch64-v8.2a-neon-intrinsics.c Index: test/CodeGen/aarch64-v8.2a-neon-intrinsics.c === --- test/CodeGen/aarch64-v8.2a-neon-intrinsics.c +++ test/CodeGen/aarch64-v8.2a-neon-intrinsics.c @@ -1223,27 +1223,25 @@ return vmulxq_n_f16(a, b); } -/* TODO: Not implemented yet (needs scalar intrinsic from arm_fp16.h) -// CCHECK-LABEL: test_vmulxh_lane_f16 -// CCHECK: [[CONV0:%.*]] = fpext half %a to float -// CCHECK: [[CONV1:%.*]] = fpext half %{{.*}} to float -// CCHECK: [[MUL:%.*]] = fmul float [[CONV0:%.*]], [[CONV0:%.*]] -// CCHECK: [[CONV3:%.*]] = fptrunc float %mul to half -// CCHECK: ret half [[CONV3:%.*]] +// CHECK-LABEL: test_vmulxh_lane_f16 +// CHECK: [[CONV0:%.*]] = fpext half %a to float +// CHECK: [[CONV1:%.*]] = fpext half %{{.*}} to float +// CHECK: [[MUL:%.*]] = fmul float [[CONV0:%.*]], [[CONV0:%.*]] +// CHECK: [[CONV3:%.*]] = fptrunc float %mul to half +// CHECK: ret half [[CONV3:%.*]] float16_t test_vmulxh_lane_f16(float16_t a, float16x4_t b) { return vmulxh_lane_f16(a, b, 3); } -// CCHECK-LABEL: test_vmulxh_laneq_f16 -// CCHECK: [[CONV0:%.*]] = fpext half %a to float -// CCHECK: [[CONV1:%.*]] = fpext half %{{.*}} to float -// CCHECK: [[MUL:%.*]] = fmul float [[CONV0:%.*]], [[CONV0:%.*]] -// CCHECK: [[CONV3:%.*]] = fptrunc float %mul to half -// CCHECK: ret half [[CONV3:%.*]] +// CHECK-LABEL: test_vmulxh_laneq_f16 +// CHECK: [[CONV0:%.*]] = fpext half %a to float +// CHECK: [[CONV1:%.*]] = fpext half %{{.*}} to float +// CHECK: [[MUL:%.*]] = fmul float [[CONV0:%.*]], [[CONV0:%.*]] +// CHECK: [[CONV3:%.*]] = fptrunc float %mul to half +// CHECK: ret half [[CONV3:%.*]] float16_t test_vmulxh_laneq_f16(float16_t a, float16x8_t b) { return vmulxh_laneq_f16(a, b, 7); } 
-*/ // CHECK-LABEL: test_vmaxv_f16 // CHECK: [[TMP0:%.*]] = bitcast <4 x half> %a to <8 x i8> Index: include/clang/Basic/arm_neon.td === --- include/clang/Basic/arm_neon.td +++ include/clang/Basic/arm_neon.td @@ -1499,11 +1499,10 @@ def VMULX_LANEH : IOpInst<"vmulx_lane", "ddgi", "hQh", OP_MULX_LN>; def VMULX_LANEQH : IOpInst<"vmulx_laneq", "ddji", "hQh", OP_MULX_LN>; def VMULX_NH : IOpInst<"vmulx_n", "dds", "hQh", OP_MULX_N>; - // TODO: Scalar floating point multiply extended (scalar, by element) - // Below ones are commented out because they need vmulx_f16(float16_t, float16_t) - // which will be implemented later with fp16 scalar intrinsic (arm_fp16.h) - //def SCALAR_FMULX_LANEH : IOpInst<"vmulx_lane", "ssdi", "Sh", OP_SCALAR_MUL_LN>; - //def SCALAR_FMULX_LANEQH : IOpInst<"vmulx_laneq", "ssji", "Sh", OP_SCALAR_MUL_LN>; + + // Scalar floating point multiply extended (scalar, by element) + def SCALAR_FMULX_LANEH : IOpInst<"vmulx_lane", "ssdi", "Sh", OP_SCALAR_MUL_LN>; + def SCALAR_FMULX_LANEQH : IOpInst<"vmulx_laneq", "ssji", "Sh", OP_SCALAR_MUL_LN>; // ARMv8.2-A FP16 reduction vector intrinsics. 
def VMAXVH : SInst<"vmaxv", "sd", "hQh">;
[PATCH] D44222: [AArch64] Add vmulxh_lane FP16 intrinsics
SjoerdMeijer added inline comments. Comment at: include/clang/Basic/arm_neon.td:1504 + // Scalar floating point multiply extended (scalar, by element) + def SCALAR_FMULX_LANEH : IOpInst<"vmulx_lane", "ssdi", "Sh", OP_SCALAR_MUL_LN>; + def SCALAR_FMULX_LANEQH : IOpInst<"vmulx_laneq", "ssji", "Sh", OP_SCALAR_MUL_LN>; I found that this is unfortunately not that straightforward. It leads to wrong code generation: an fmul is generated instead of an fmulx. I suspect this instruction description should be using OP_SCALAR_MULX_LN, and the type decls also look wrong. I need to dig a bit further here. https://reviews.llvm.org/D44222
[PATCH] D44222: [AArch64] Add vmulxh_lane FP16 intrinsics
SjoerdMeijer added inline comments. Comment at: include/clang/Basic/arm_neon.td:1504 + // Scalar floating point multiply extended (scalar, by element) + def SCALAR_FMULX_LANEH : IOpInst<"vmulx_lane", "ssdi", "Sh", OP_SCALAR_MUL_LN>; + def SCALAR_FMULX_LANEQH : IOpInst<"vmulx_laneq", "ssji", "Sh", OP_SCALAR_MUL_LN>; az wrote: > SjoerdMeijer wrote: > > I found that unfortunately it's not that straightforward. This leads to > > wrong code generation as it is generating a fmul instead of fmulx. I am > > suspecting this instruction description should be using OP_SCALAR_MULX_LN, > > but also the type decls are wrong. Need to dig a bit further here. > Sorry for confusion as the commented code was never intended to be used and > it is a copy of the code for the intrinsic vmulh_lane(). It was done that way > in order to point out that vmulh_lane() and vmulxh_lane() intrinsics should > be implemented in a similar way. The only useful thing in the commented code > is the explanation that we need the scalar intrinsic vmulxh_f16() which was > implemented in the scalar intrinsic patch later on. > > If we look at how vmulh_lane (a, b, lane) is implemented: > x = extract (b, lane); > res = a * x; > return res; > > Similarly, I thought at the time that vmulxh_lane (a, b, lane) can be > implemented: > x = extract (b, lane); > res = vmulxh_f16 (a, x); // no llvm native mulx instruction, so we use > the fp16 scalar intrinsic. > return res; > > I am not sure now that we can easily use scalar intrinsic while generating > the arm_neon.h file. In case we can not do that, I am thinking that the > frontend should generate a new builtin for intrinsic vmulxh_lane() that the > backend recognizes and generate the right code for it which is fmulx h0, h0, > v1.h[lane]. If you made or will be making progress on this, then that is > great. Otherwise, I can look at a frontend solution for it. Hi Abderrazek, Thanks for the clarifications! And I agree with your observations. 
This simple change looked like it did the right thing because, as you also said, this vmulx is just an extract and a multiply, but it then incorrectly generated an fmul where an fmulx is needed. I briefly looked at fixing this, but also didn't see how I could use the scalar intrinsic here. Passing a builtin does indeed look like the best option, also because fmulx is instruction-selected based on an intrinsic: defm FMULX: SIMDThreeSameVectorFP<0,0,0b011,"fmulx", int_aarch64_neon_fmulx>; If you have the bandwidth to pick this up, that would be great; I have started looking into the other failing AArch64 vector intrinsics. Cheers, Sjoerd. https://reviews.llvm.org/D44222
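The reason a plain fmul is not an acceptable substitute is that FMULX differs from an ordinary multiply in one case: zero times infinity yields 2.0, with the sign given by the XOR of the operand signs, rather than NaN. The following is a rough reference model of the "extract a lane, then multiply-extended" shape discussed in this thread. It works on float rather than half for portability, and fmulx_model and fmulx_lane_model are hypothetical names, not functions from arm_neon.h or the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical reference model of FMULX on float: identical to a multiply,
// except that zero * infinity yields +/-2.0 instead of NaN.
float fmulx_model(float a, float b) {
    const bool zero_times_inf =
        (a == 0.0f && std::isinf(b)) || (std::isinf(a) && b == 0.0f);
    if (zero_times_inf) {
        // The sign of the result is the XOR of the operand signs.
        const bool negative = std::signbit(a) != std::signbit(b);
        return negative ? -2.0f : 2.0f;
    }
    return a * b;
}

// Hypothetical model of the by-lane form: extract one lane, then apply fmulx.
template <std::size_t N>
float fmulx_lane_model(float a, const float (&b)[N], std::size_t lane) {
    assert(lane < N);
    return fmulx_model(a, b[lane]);
}
```

For all other inputs the model agrees with a * b, which is why a test with ordinary values would not catch an fmul being emitted in place of an fmulx.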
[PATCH] D43650: [ARM] Add ARMv8.2-A FP16 vector intrinsics
SjoerdMeijer added a comment. Hi @mstorsjo, thanks for reporting this! I was waiting for @az and only had a quick look myself, but I don't think it's going to be a quick fix. That suggests a revert is indeed best. Perhaps we can wait a few more hours to give the people in the US time zone the opportunity to reply in case I've missed something, but otherwise let's revert at the end of today. https://reviews.llvm.org/D43650
[PATCH] D43650: [ARM] Add ARMv8.2-A FP16 vector intrinsics
SjoerdMeijer added a comment. Reverted in r327437. https://reviews.llvm.org/D43650
[PATCH] D43650: [ARM] Add ARMv8.2-A FP16 vector intrinsics
SjoerdMeijer added a comment. FYI: I have partially recommitted this in r327455; I have separated out the minimal functional change related to the FP16 macros. https://reviews.llvm.org/D43650
[PATCH] D44512: [AAch64] Tests for ACLE FP16 macros
SjoerdMeijer created this revision. SjoerdMeijer added reviewers: samparker, olista01, evandro, az. Herald added subscribers: kristof.beyls, javed.absar. This adds some missing tests for the AArch64 FP16 macros. https://reviews.llvm.org/D44512 Files: test/Preprocessor/aarch64-target-features.c Index: test/Preprocessor/aarch64-target-features.c === --- test/Preprocessor/aarch64-target-features.c +++ test/Preprocessor/aarch64-target-features.c @@ -89,6 +89,18 @@ // RUN: %clang -target aarch64-none-linux-gnu -march=armv8-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE %s // CHECK-SVE: __ARM_FEATURE_SVE 1 +// RUN: %clang -target aarch64-none-linux-gnueabi -march=armv8.2a+fp16 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1 +// CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1 +// CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FP 0xE +// CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FP16_FORMAT_IEEE 1 + +// RUN: %clang -target aarch64-none-linux-gnueabi -march=armv8.2a+fp16+nosimd -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-SCALAR %s +// CHECK-FULLFP16-SCALAR: #define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1 +// CHECK-FULLFP16-SCALAR-NOT: #define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1 +// CHECK-FULLFP16-SCALAR: #define __ARM_FP 0xE +// CHECK-FULLFP16-SCALAR: #define __ARM_FP16_FORMAT_IEEE 1 + // == Check whether -mtune accepts mixed-case features. 
// RUN: %clang -target aarch64 -mtune=CYCLONE -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-MTUNE-CYCLONE %s // CHECK-MTUNE-CYCLONE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+neon" "-target-feature" "+zcm" "-target-feature" "+zcz"
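User code would typically consume the macros checked in this test through preprocessor guards. The following is only an illustrative sketch: fp16_macro_summary is a hypothetical helper, and the only names taken from the test are the two __ARM_FEATURE_FP16_* macros. On a host without these features it simply reports "none".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reports which ACLE FP16 arithmetic macros the current
// compilation defines.
std::string fp16_macro_summary() {
    std::string summary;
#if defined(__ARM_FEATURE_FP16_SCALAR_ARITHMETIC)
    summary += "scalar ";
#endif
#if defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
    summary += "vector ";
#endif
    if (summary.empty()) {
        summary = "none";
    }
    assert(!summary.empty());
    return summary;
}
```

With -march=armv8.2a+fp16+nosimd, the test above expects only the scalar macro to be defined, so this helper would report the scalar entry alone in that configuration.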
[PATCH] D44512: [AAch64] Tests for ACLE FP16 macros
SjoerdMeijer added a comment. Thanks for reviewing! https://reviews.llvm.org/D44512
[PATCH] D44512: [AAch64] Tests for ACLE FP16 macros
This revision was automatically updated to reflect the committed changes. Closed by commit rL327623: [AAch64] Tests for ACLE FP16 macros (authored by SjoerdMeijer, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D44512?vs=138521&id=138541#toc Repository: rL LLVM https://reviews.llvm.org/D44512 Files: cfe/trunk/test/Preprocessor/aarch64-target-features.c Index: cfe/trunk/test/Preprocessor/aarch64-target-features.c === --- cfe/trunk/test/Preprocessor/aarch64-target-features.c +++ cfe/trunk/test/Preprocessor/aarch64-target-features.c @@ -89,6 +89,18 @@ // RUN: %clang -target aarch64-none-linux-gnu -march=armv8-a+sve -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-SVE %s // CHECK-SVE: __ARM_FEATURE_SVE 1 +// RUN: %clang -target aarch64-none-linux-gnueabi -march=armv8.2a+fp16 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1 +// CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1 +// CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FP 0xE +// CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FP16_FORMAT_IEEE 1 + +// RUN: %clang -target aarch64-none-linux-gnueabi -march=armv8.2a+fp16+nosimd -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-SCALAR %s +// CHECK-FULLFP16-SCALAR: #define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1 +// CHECK-FULLFP16-SCALAR-NOT: #define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1 +// CHECK-FULLFP16-SCALAR: #define __ARM_FP 0xE +// CHECK-FULLFP16-SCALAR: #define __ARM_FP16_FORMAT_IEEE 1 + // == Check whether -mtune accepts mixed-case features. 
// RUN: %clang -target aarch64 -mtune=CYCLONE -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-MTUNE-CYCLONE %s // CHECK-MTUNE-CYCLONE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+neon" "-target-feature" "+zcm" "-target-feature" "+zcz"
[PATCH] D44561: [ARM] Add HasFloat16 to TargetInfo
SjoerdMeijer created this revision. SjoerdMeijer added reviewers: t.p.northover, samparker, olista01. Herald added subscribers: kristof.beyls, javed.absar. When generating NEON intrinsics, this determines the NEON data type: whether it should be a half type or an i16 type. We always pass a half type for AArch64, which hasn't changed; for ARM we now also pass a half type, but only when FullFP16 is enabled, and an i16 type otherwise. This is intended to be a non-functional change, but together with the backend work in https://reviews.llvm.org/D44538, which adds support for f16 vectors, it enables adding the AArch32 FP16 (vector) intrinsics. https://reviews.llvm.org/D44561 Files: include/clang/Basic/TargetInfo.h lib/Basic/TargetInfo.cpp lib/Basic/Targets/ARM.cpp lib/Basic/Targets/ARM.h lib/CodeGen/CGBuiltin.cpp Index: lib/CodeGen/CGBuiltin.cpp === --- lib/CodeGen/CGBuiltin.cpp +++ lib/CodeGen/CGBuiltin.cpp @@ -3442,6 +3442,7 @@ static llvm::VectorType *GetNeonType(CodeGenFunction *CGF, NeonTypeFlags TypeFlags, llvm::Triple::ArchType Arch, + bool HasFloat16=false, bool V1Ty=false) { int IsQuad = TypeFlags.isQuad(); switch (TypeFlags.getEltType()) { @@ -3452,9 +3453,8 @@ case NeonTypeFlags::Poly16: return llvm::VectorType::get(CGF->Int16Ty, V1Ty ? 1 : (4 << IsQuad)); case NeonTypeFlags::Float16: -// FIXME: Only AArch64 backend can so far properly handle half types. -// Remove else part once ARM backend support for half is complete. -if (Arch == llvm::Triple::aarch64) +if (Arch == llvm::Triple::aarch64 || Arch == llvm::Triple::aarch64_be || +HasFloat16) return llvm::VectorType::get(CGF->HalfTy, V1Ty ? 1 : (4 << IsQuad)); else return llvm::VectorType::get(CGF->Int16Ty, V1Ty ? 
1 : (4 << IsQuad)); @@ -4338,8 +4338,9 @@ NeonTypeFlags Type(NeonTypeConst.getZExtValue()); bool Usgn = Type.isUnsigned(); bool Quad = Type.isQuad(); + const bool HasFloat16 = getTarget().hasFloat16Type(); - llvm::VectorType *VTy = GetNeonType(this, Type, Arch); + llvm::VectorType *VTy = GetNeonType(this, Type, Arch, HasFloat16); llvm::Type *Ty = VTy; if (!Ty) return nullptr; Index: lib/Basic/Targets/ARM.h === --- lib/Basic/Targets/ARM.h +++ lib/Basic/Targets/ARM.h @@ -69,7 +69,6 @@ unsigned Crypto : 1; unsigned DSP : 1; unsigned Unaligned : 1; - unsigned HasFullFP16 : 1; enum { LDREX_B = (1 << 0), /// byte (8-bit) Index: lib/Basic/Targets/ARM.cpp === --- lib/Basic/Targets/ARM.cpp +++ lib/Basic/Targets/ARM.cpp @@ -379,7 +379,7 @@ Unaligned = 1; SoftFloat = SoftFloatABI = false; HWDiv = 0; - HasFullFP16 = 0; + HasFloat16 = 0; // This does not diagnose illegal cases like having both // "+vfpv2" and "+vfpv3" or having "+neon" and "+fp-only-sp". @@ -421,7 +421,7 @@ } else if (Feature == "+fp16") { HW_FP |= HW_FP_HP; } else if (Feature == "+fullfp16") { - HasFullFP16 = 1; + HasFloat16 = true; } } HW_FP &= ~HW_FP_remove; @@ -714,11 +714,11 @@ Builder.defineMacro("__ARM_FP_FAST", "1"); // Armv8.2-A FP16 vector intrinsic - if ((FPU & NeonFPU) && HasFullFP16) + if ((FPU & NeonFPU) && HasFloat16) Builder.defineMacro("__ARM_FEATURE_FP16_VECTOR_ARITHMETIC", "1"); // Armv8.2-A FP16 scalar intrinsics - if (HasFullFP16) + if (HasFloat16) Builder.defineMacro("__ARM_FEATURE_FP16_SCALAR_ARITHMETIC", "1"); Index: lib/Basic/TargetInfo.cpp === --- lib/Basic/TargetInfo.cpp +++ lib/Basic/TargetInfo.cpp @@ -32,6 +32,7 @@ TLSSupported = true; VLASupported = true; NoAsmVariants = false; + HasFloat16 = false; HasFloat128 = false; PointerWidth = PointerAlign = 32; BoolWidth = BoolAlign = 8; Index: include/clang/Basic/TargetInfo.h === --- include/clang/Basic/TargetInfo.h +++ include/clang/Basic/TargetInfo.h @@ -61,6 +61,7 @@ bool TLSSupported; bool VLASupported; bool NoAsmVariants; // True if 
{|} are normal characters. + bool HasFloat16; bool HasFloat128; unsigned char PointerWidth, PointerAlign; unsigned char BoolWidth, BoolAlign; @@ -361,6 +362,9 @@ return (getPointerWidth(0) >= 64) || getTargetOpts().ForceEnableInt128; } // FIXME + /// \brief Determine whether _Float16 is supported on this target. + virtual bool hasFloat16Type() const { return HasFloat16; } + /// \brief Determine whether the __float128 type is supported on this target. virtual bool hasFloat128Type() const { return HasFloat128; } ___ cfe-com
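The Float16 case of GetNeonType in the diff above boils down to two independent choices: the element type (half or i16) and the lane count (1, 4 or 8). A standalone model of that decision might look as follows. It is a simplified sketch with hypothetical names (Elt, VecShape, float16_vector_shape) that stand in for the LLVM types, and it does not use the LLVM API.

```cpp
#include <cassert>

// Hypothetical stand-ins for the LLVM element types involved.
enum class Elt { Int16, Half };

struct VecShape {
    Elt element;
    unsigned lanes;
};

// Simplified model of the Float16 case: half elements on AArch64 or when the
// target reports _Float16 support, i16 elements otherwise. The lane count
// follows the `V1Ty ? 1 : (4 << IsQuad)` expression from the diff.
VecShape float16_vector_shape(bool is_aarch64, bool has_float16, bool is_quad,
                              bool v1ty = false) {
    const unsigned lanes = v1ty ? 1u : (4u << (is_quad ? 1 : 0));
    assert(lanes == 1 || lanes == 4 || lanes == 8);
    const Elt element = (is_aarch64 || has_float16) ? Elt::Half : Elt::Int16;
    return {element, lanes};
}
```

In this model a 64-bit vector has 4 lanes and a 128-bit (quad) vector has 8, while the element type depends only on the target, not on the vector width.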
[PATCH] D44561: [ARM] Add HasFloat16 to TargetInfo
SjoerdMeijer added a comment. Thanks for the review. Please see a first comment inline. Comment at: include/clang/Basic/TargetInfo.h:365 + /// \brief Determine whether _Float16 is supported on this target. + virtual bool hasFloat16Type() const { return HasFloat16; } t.p.northover wrote: > `_Float16` doesn't seem to be supported anywhere in Clang (`__fp16` is). > > But we should probably clarify exactly what kind of support we mean here. > This variable doesn't affect: > > * Whether __fp16 can be used at all in source. > * Its ABI. > > I'm actually slightly worried that when we document what it does affect it'll > end up being an ARM implementation-detail. > > I've added _Float16 support in Clang commit r312794: "Add _Float16 as a C/C++ source language type" :-) https://reviews.llvm.org/D44561 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D44561: [ARM] Add HasFloat16 to TargetInfo
SjoerdMeijer updated this revision to Diff 138722. SjoerdMeijer added a comment. Addressed comments: simplified the logic in GetNeonType. https://reviews.llvm.org/D44561 Files: include/clang/Basic/TargetInfo.h lib/Basic/TargetInfo.cpp lib/Basic/Targets/AArch64.cpp lib/Basic/Targets/ARM.cpp lib/Basic/Targets/ARM.h lib/CodeGen/CGBuiltin.cpp Index: lib/CodeGen/CGBuiltin.cpp === --- lib/CodeGen/CGBuiltin.cpp +++ lib/CodeGen/CGBuiltin.cpp @@ -3442,6 +3442,7 @@ static llvm::VectorType *GetNeonType(CodeGenFunction *CGF, NeonTypeFlags TypeFlags, llvm::Triple::ArchType Arch, + bool HasFloat16=true, bool V1Ty=false) { int IsQuad = TypeFlags.isQuad(); switch (TypeFlags.getEltType()) { @@ -3452,9 +3453,7 @@ case NeonTypeFlags::Poly16: return llvm::VectorType::get(CGF->Int16Ty, V1Ty ? 1 : (4 << IsQuad)); case NeonTypeFlags::Float16: -// FIXME: Only AArch64 backend can so far properly handle half types. -// Remove else part once ARM backend support for half is complete. -if (Arch == llvm::Triple::aarch64) +if (HasFloat16) return llvm::VectorType::get(CGF->HalfTy, V1Ty ? 1 : (4 << IsQuad)); else return llvm::VectorType::get(CGF->Int16Ty, V1Ty ? 1 : (4 << IsQuad)); @@ -4338,8 +4337,9 @@ NeonTypeFlags Type(NeonTypeConst.getZExtValue()); bool Usgn = Type.isUnsigned(); bool Quad = Type.isQuad(); + const bool HasFloat16 = getTarget().hasFloat16Type(); - llvm::VectorType *VTy = GetNeonType(this, Type, Arch); + llvm::VectorType *VTy = GetNeonType(this, Type, Arch, HasFloat16); llvm::Type *Ty = VTy; if (!Ty) return nullptr; @@ -4413,13 +4413,15 @@ case NEON::BI__builtin_neon_vcvt_f32_v: case NEON::BI__builtin_neon_vcvtq_f32_v: Ops[0] = Builder.CreateBitCast(Ops[0], Ty); -Ty = GetNeonType(this, NeonTypeFlags(NeonTypeFlags::Float32, false, Quad), Arch); +Ty = GetNeonType(this, NeonTypeFlags(NeonTypeFlags::Float32, false, Quad), + Arch, HasFloat16); return Usgn ? 
Builder.CreateUIToFP(Ops[0], Ty, "vcvt") : Builder.CreateSIToFP(Ops[0], Ty, "vcvt"); case NEON::BI__builtin_neon_vcvt_f16_v: case NEON::BI__builtin_neon_vcvtq_f16_v: Ops[0] = Builder.CreateBitCast(Ops[0], Ty); -Ty = GetNeonType(this, NeonTypeFlags(NeonTypeFlags::Float16, false, Quad), Arch); +Ty = GetNeonType(this, NeonTypeFlags(NeonTypeFlags::Float16, false, Quad), Arch, + HasFloat16); return Usgn ? Builder.CreateUIToFP(Ops[0], Ty, "vcvt") : Builder.CreateSIToFP(Ops[0], Ty, "vcvt"); case NEON::BI__builtin_neon_vcvt_n_f16_v: @@ -5528,7 +5530,8 @@ bool usgn = Type.isUnsigned(); bool rightShift = false; - llvm::VectorType *VTy = GetNeonType(this, Type, Arch); + llvm::VectorType *VTy = GetNeonType(this, Type, Arch, + getTarget().hasFloat16Type()); llvm::Type *Ty = VTy; if (!Ty) return nullptr; Index: lib/Basic/Targets/ARM.h === --- lib/Basic/Targets/ARM.h +++ lib/Basic/Targets/ARM.h @@ -69,7 +69,6 @@ unsigned Crypto : 1; unsigned DSP : 1; unsigned Unaligned : 1; - unsigned HasFullFP16 : 1; enum { LDREX_B = (1 << 0), /// byte (8-bit) Index: lib/Basic/Targets/ARM.cpp === --- lib/Basic/Targets/ARM.cpp +++ lib/Basic/Targets/ARM.cpp @@ -379,7 +379,6 @@ Unaligned = 1; SoftFloat = SoftFloatABI = false; HWDiv = 0; - HasFullFP16 = 0; // This does not diagnose illegal cases like having both // "+vfpv2" and "+vfpv3" or having "+neon" and "+fp-only-sp". 
@@ -421,7 +420,7 @@ } else if (Feature == "+fp16") { HW_FP |= HW_FP_HP; } else if (Feature == "+fullfp16") { - HasFullFP16 = 1; + HasFloat16 = true; } } HW_FP &= ~HW_FP_remove; @@ -714,11 +713,11 @@ Builder.defineMacro("__ARM_FP_FAST", "1"); // Armv8.2-A FP16 vector intrinsic - if ((FPU & NeonFPU) && HasFullFP16) + if ((FPU & NeonFPU) && HasFloat16) Builder.defineMacro("__ARM_FEATURE_FP16_VECTOR_ARITHMETIC", "1"); // Armv8.2-A FP16 scalar intrinsics - if (HasFullFP16) + if (HasFloat16) Builder.defineMacro("__ARM_FEATURE_FP16_SCALAR_ARITHMETIC", "1"); Index: lib/Basic/Targets/AArch64.cpp === --- lib/Basic/Targets/AArch64.cpp +++ lib/Basic/Targets/AArch64.cpp @@ -49,6 +49,9 @@ IntMaxType = SignedLong; } + // AArch64 has H-registers and at least some level instruction half-precision + // support (as opposed to ARM, where it can be completely unsupported). + HasFloat16 = true; LongWidth = LongAlign = PointerWidth = PointerAlign = 64; MaxVectorA
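As a reading aid for the ARM.cpp hunks in this diff, here is a minimal standalone sketch of how the `+fullfp16` feature ends up controlling the two FP16 arithmetic macros. It is not the Clang implementation: `ArmFeatureState`, `parseArmFeatures` and `fp16ArithmeticMacros` are hypothetical names, and the `HasNeon` flag is an assumption standing in for the `(FPU & NeonFPU)` test in the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the ARM target state touched by the
// patch; the real target info tracks many more features.
struct ArmFeatureState {
  bool HasNeon = false;     // assumption: stands in for (FPU & NeonFPU)
  bool HasFloat16 = false;  // set by "+fullfp16", as in the diff
};

// Scan a feature list and record only the two flags relevant here.
ArmFeatureState parseArmFeatures(const std::vector<std::string> &Features) {
  ArmFeatureState State;
  for (const std::string &Feature : Features) {
    if (Feature == "+neon")
      State.HasNeon = true;
    else if (Feature == "+fullfp16")
      State.HasFloat16 = true;
  }
  return State;
}

// Return the FP16 arithmetic macros that would be defined for this state:
// the vector macro needs NEON and full FP16, the scalar macro only full FP16.
std::vector<std::string> fp16ArithmeticMacros(const ArmFeatureState &State) {
  std::vector<std::string> Macros;
  if (State.HasNeon && State.HasFloat16)
    Macros.push_back("__ARM_FEATURE_FP16_VECTOR_ARITHMETIC");
  if (State.HasFloat16)
    Macros.push_back("__ARM_FEATURE_FP16_SCALAR_ARITHMETIC");
  return Macros;
}
```

Note that plain `+fp16` does not set the flag in this sketch, which matches the diff: only `+fullfp16` assigns `HasFloat16`.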
[PATCH] D44561: [ARM] Pass half or i16 types for NEON intrinsics
SjoerdMeijer updated this revision to Diff 138890. SjoerdMeijer retitled this revision from "[ARM] Add HasFloat16 to TargetInfo" to "[ARM] Pass half or i16 types for NEON intrinsics". SjoerdMeijer added a comment. Herald added a subscriber: rengolin. Removed unused function argument, and renamed HasFloat16 https://reviews.llvm.org/D44561 Files: include/clang/Basic/TargetInfo.h lib/Basic/TargetInfo.cpp lib/Basic/Targets/AArch64.cpp lib/Basic/Targets/ARM.cpp lib/Basic/Targets/ARM.h lib/CodeGen/CGBuiltin.cpp Index: lib/CodeGen/CGBuiltin.cpp === --- lib/CodeGen/CGBuiltin.cpp +++ lib/CodeGen/CGBuiltin.cpp @@ -3441,7 +3441,7 @@ static llvm::VectorType *GetNeonType(CodeGenFunction *CGF, NeonTypeFlags TypeFlags, - llvm::Triple::ArchType Arch, + bool HasLegalHalfType=true, bool V1Ty=false) { int IsQuad = TypeFlags.isQuad(); switch (TypeFlags.getEltType()) { @@ -3452,9 +3452,7 @@ case NeonTypeFlags::Poly16: return llvm::VectorType::get(CGF->Int16Ty, V1Ty ? 1 : (4 << IsQuad)); case NeonTypeFlags::Float16: -// FIXME: Only AArch64 backend can so far properly handle half types. -// Remove else part once ARM backend support for half is complete. -if (Arch == llvm::Triple::aarch64) +if (HasLegalHalfType) return llvm::VectorType::get(CGF->HalfTy, V1Ty ? 1 : (4 << IsQuad)); else return llvm::VectorType::get(CGF->Int16Ty, V1Ty ? 
1 : (4 << IsQuad)); @@ -4338,8 +4336,9 @@ NeonTypeFlags Type(NeonTypeConst.getZExtValue()); bool Usgn = Type.isUnsigned(); bool Quad = Type.isQuad(); + const bool HasLegalHalfType = getTarget().hasLegalHalfType(); - llvm::VectorType *VTy = GetNeonType(this, Type, Arch); + llvm::VectorType *VTy = GetNeonType(this, Type, HasLegalHalfType); llvm::Type *Ty = VTy; if (!Ty) return nullptr; @@ -4413,13 +4412,15 @@ case NEON::BI__builtin_neon_vcvt_f32_v: case NEON::BI__builtin_neon_vcvtq_f32_v: Ops[0] = Builder.CreateBitCast(Ops[0], Ty); -Ty = GetNeonType(this, NeonTypeFlags(NeonTypeFlags::Float32, false, Quad), Arch); +Ty = GetNeonType(this, NeonTypeFlags(NeonTypeFlags::Float32, false, Quad), + HasLegalHalfType); return Usgn ? Builder.CreateUIToFP(Ops[0], Ty, "vcvt") : Builder.CreateSIToFP(Ops[0], Ty, "vcvt"); case NEON::BI__builtin_neon_vcvt_f16_v: case NEON::BI__builtin_neon_vcvtq_f16_v: Ops[0] = Builder.CreateBitCast(Ops[0], Ty); -Ty = GetNeonType(this, NeonTypeFlags(NeonTypeFlags::Float16, false, Quad), Arch); +Ty = GetNeonType(this, NeonTypeFlags(NeonTypeFlags::Float16, false, Quad), + HasLegalHalfType); return Usgn ? Builder.CreateUIToFP(Ops[0], Ty, "vcvt") : Builder.CreateSIToFP(Ops[0], Ty, "vcvt"); case NEON::BI__builtin_neon_vcvt_n_f16_v: @@ -5528,7 +5529,8 @@ bool usgn = Type.isUnsigned(); bool rightShift = false; - llvm::VectorType *VTy = GetNeonType(this, Type, Arch); + llvm::VectorType *VTy = GetNeonType(this, Type, + getTarget().hasLegalHalfType()); llvm::Type *Ty = VTy; if (!Ty) return nullptr; @@ -5774,7 +5776,7 @@ // Determine the type of this overloaded NEON intrinsic. 
NeonTypeFlags Type(Result.getZExtValue()); - llvm::VectorType *Ty = GetNeonType(&CGF, Type, Arch); + llvm::VectorType *Ty = GetNeonType(&CGF, Type); if (!Ty) return nullptr; @@ -6828,7 +6830,7 @@ } } - llvm::VectorType *VTy = GetNeonType(this, Type, Arch); + llvm::VectorType *VTy = GetNeonType(this, Type); llvm::Type *Ty = VTy; if (!Ty) return nullptr; @@ -6893,7 +6895,7 @@ Ops[0] = Builder.CreateBitCast(Ops[0], DoubleTy); Ops[1] = Builder.CreateBitCast(Ops[1], DoubleTy); llvm::Type *VTy = GetNeonType(this, -NeonTypeFlags(NeonTypeFlags::Float64, false, true), Arch); +NeonTypeFlags(NeonTypeFlags::Float64, false, true)); Ops[2] = Builder.CreateBitCast(Ops[2], VTy); Ops[2] = Builder.CreateExtractElement(Ops[2], Ops[3], "extract"); Value *F = CGM.getIntrinsic(Intrinsic::fma, DoubleTy); @@ -7132,22 +7134,22 @@ case NEON::BI__builtin_neon_vcvt_f64_v: case NEON::BI__builtin_neon_vcvtq_f64_v: Ops[0] = Builder.CreateBitCast(Ops[0], Ty); -Ty = GetNeonType(this, NeonTypeFlags(NeonTypeFlags::Float64, false, quad), Arch); +Ty = GetNeonType(this, NeonTypeFlags(NeonTypeFlags::Float64, false, quad)); return usgn ? Builder.CreateUIToFP(Ops[0], Ty, "vcvt") : Builder.CreateSIToFP(Ops[0], Ty, "vcvt"); case NEON::BI__builtin_neon_vcvt_f64_f32: { assert(Type.getEltType() == NeonTypeFlags::Float64 && quad && "unexpected vcvt_f64_f32 builtin"); NeonTypeFlags SrcFlag = NeonTypeFlags(NeonTypeFlags::Float3
[PATCH] D44561: [ARM] Pass half or i16 types for NEON intrinsics
SjoerdMeijer added a comment. Thanks a lot for your help and reviews. https://reviews.llvm.org/D44561
[PATCH] D44561: [ARM] Pass half or i16 types for NEON intrinsics
This revision was automatically updated to reflect the committed changes. Closed by commit rL327836: [ARM] Pass half or i16 types for NEON intrinsics (authored by SjoerdMeijer, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D44561?vs=138890&id=138904#toc Repository: rL LLVM https://reviews.llvm.org/D44561 Files: cfe/trunk/include/clang/Basic/TargetInfo.h cfe/trunk/lib/Basic/TargetInfo.cpp cfe/trunk/lib/Basic/Targets/AArch64.cpp cfe/trunk/lib/Basic/Targets/ARM.cpp cfe/trunk/lib/Basic/Targets/ARM.h cfe/trunk/lib/CodeGen/CGBuiltin.cpp Index: cfe/trunk/include/clang/Basic/TargetInfo.h === --- cfe/trunk/include/clang/Basic/TargetInfo.h +++ cfe/trunk/include/clang/Basic/TargetInfo.h @@ -61,6 +61,8 @@ bool TLSSupported; bool VLASupported; bool NoAsmVariants; // True if {|} are normal characters. + bool HasLegalHalfType; // True if the backend supports operations on the half + // LLVM IR type. bool HasFloat128; unsigned char PointerWidth, PointerAlign; unsigned char BoolWidth, BoolAlign; @@ -361,6 +363,9 @@ return (getPointerWidth(0) >= 64) || getTargetOpts().ForceEnableInt128; } // FIXME + /// \brief Determine whether _Float16 is supported on this target. + virtual bool hasLegalHalfType() const { return HasLegalHalfType; } + /// \brief Determine whether the __float128 type is supported on this target. virtual bool hasFloat128Type() const { return HasFloat128; } Index: cfe/trunk/lib/Basic/Targets/ARM.h === --- cfe/trunk/lib/Basic/Targets/ARM.h +++ cfe/trunk/lib/Basic/Targets/ARM.h @@ -69,7 +69,6 @@ unsigned Crypto : 1; unsigned DSP : 1; unsigned Unaligned : 1; - unsigned HasFullFP16 : 1; enum { LDREX_B = (1 << 0), /// byte (8-bit) Index: cfe/trunk/lib/Basic/Targets/AArch64.cpp === --- cfe/trunk/lib/Basic/Targets/AArch64.cpp +++ cfe/trunk/lib/Basic/Targets/AArch64.cpp @@ -49,6 +49,8 @@ IntMaxType = SignedLong; } + // All AArch64 implementations support ARMv8 FP, which makes half a legal type. 
+ HasLegalHalfType = true; LongWidth = LongAlign = PointerWidth = PointerAlign = 64; MaxVectorAlign = 128; Index: cfe/trunk/lib/Basic/Targets/ARM.cpp === --- cfe/trunk/lib/Basic/Targets/ARM.cpp +++ cfe/trunk/lib/Basic/Targets/ARM.cpp @@ -379,7 +379,6 @@ Unaligned = 1; SoftFloat = SoftFloatABI = false; HWDiv = 0; - HasFullFP16 = 0; // This does not diagnose illegal cases like having both // "+vfpv2" and "+vfpv3" or having "+neon" and "+fp-only-sp". @@ -421,7 +420,7 @@ } else if (Feature == "+fp16") { HW_FP |= HW_FP_HP; } else if (Feature == "+fullfp16") { - HasFullFP16 = 1; + HasLegalHalfType = true; } } HW_FP &= ~HW_FP_remove; @@ -714,11 +713,11 @@ Builder.defineMacro("__ARM_FP_FAST", "1"); // Armv8.2-A FP16 vector intrinsic - if ((FPU & NeonFPU) && HasFullFP16) + if ((FPU & NeonFPU) && HasLegalHalfType) Builder.defineMacro("__ARM_FEATURE_FP16_VECTOR_ARITHMETIC", "1"); // Armv8.2-A FP16 scalar intrinsics - if (HasFullFP16) + if (HasLegalHalfType) Builder.defineMacro("__ARM_FEATURE_FP16_SCALAR_ARITHMETIC", "1"); Index: cfe/trunk/lib/Basic/TargetInfo.cpp === --- cfe/trunk/lib/Basic/TargetInfo.cpp +++ cfe/trunk/lib/Basic/TargetInfo.cpp @@ -32,6 +32,7 @@ TLSSupported = true; VLASupported = true; NoAsmVariants = false; + HasLegalHalfType = false; HasFloat128 = false; PointerWidth = PointerAlign = 32; BoolWidth = BoolAlign = 8; Index: cfe/trunk/lib/CodeGen/CGBuiltin.cpp === --- cfe/trunk/lib/CodeGen/CGBuiltin.cpp +++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp @@ -3441,7 +3441,7 @@ static llvm::VectorType *GetNeonType(CodeGenFunction *CGF, NeonTypeFlags TypeFlags, - llvm::Triple::ArchType Arch, + bool HasLegalHalfType=true, bool V1Ty=false) { int IsQuad = TypeFlags.isQuad(); switch (TypeFlags.getEltType()) { @@ -3452,9 +3452,7 @@ case NeonTypeFlags::Poly16: return llvm::VectorType::get(CGF->Int16Ty, V1Ty ? 1 : (4 << IsQuad)); case NeonTypeFlags::Float16: -// FIXME: Only AArch64 backend can so far properly handle half types. 
-// Remove else part once ARM backend support for half is complete. -if (Arch == llvm::Triple::aarch64) +if (HasLegalHalfType) return llvm::VectorType::get(CGF->HalfTy, V1Ty ? 1 : (4 << IsQuad)); else return llvm::VectorType::get(CGF->Int16Ty, V1Ty ? 1 : (4
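The type choice made in the `GetNeonType` hunk can be illustrated in isolation. The following is a minimal sketch, assuming nothing beyond what the diff shows: `EltKind`, `VecShape` and `float16VectorShape` are hypothetical names, and no LLVM types are used.

```cpp
// Hypothetical stand-ins for the two LLVM IR element types involved.
enum class EltKind { Half, Int16 };

// Element type plus lane count of the resulting vector type.
struct VecShape {
  EltKind Elt;
  unsigned NumElts;
};

// Mirrors the NeonTypeFlags::Float16 case in the patch: use half when the
// target reports it as a legal type, otherwise fall back to i16.
// The lane count follows the expression V1Ty ? 1 : (4 << IsQuad).
constexpr VecShape float16VectorShape(bool HasLegalHalfType, bool IsQuad,
                                      bool V1Ty = false) {
  return {HasLegalHalfType ? EltKind::Half : EltKind::Int16,
          V1Ty ? 1u : (4u << (IsQuad ? 1 : 0))};
}
```

For example, `float16VectorShape(true, true)` describes an 8-lane half vector, while `float16VectorShape(false, false)` describes a 4-lane i16 vector, which is the fallback the patch keeps for ARM targets without `+fullfp16`.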
[PATCH] D44222: [AArch64] Add vmulxh_lane FP16 intrinsics
SjoerdMeijer abandoned this revision. SjoerdMeijer added a comment. This is implemented in https://reviews.llvm.org/D44591. https://reviews.llvm.org/D44222
[PATCH] D44591: [AArch64] Add vmulxh_lane FP16 vector intrinsic
SjoerdMeijer accepted this revision. SjoerdMeijer added a comment. This revision is now accepted and ready to land. This looks good to me, but we need a companion LLVM patch, and codegen tests for this should be added to CodeGen/AArch64/fp16_intrinsic_lane.ll. https://reviews.llvm.org/D44591
[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)
SjoerdMeijer added a comment.

I've only had a first brief look; see some first drive-by comments inline.

Comment at: lib/CodeGen/CGBuiltin.cpp:7865
 }
 // FIXME: Sharing loads & stores with 32-bit is complicated by the absence
 // of an Align parameter here.

How about this FIXME? Is it still relevant? And does it need to be moved up? Or perhaps better: move the code back here to minimise the diff?

Comment at: test/CodeGen/arm-neon-vld.c:4
+// RUN: FileCheck -check-prefixes=CHECK,CHECK-A64 %s
+// RUN: %clang_cc1 -triple armv8-none-linux-gnueabi -target-feature +neon \
+// RUN: -S -disable-O0-optnone -emit-llvm -o - %s | opt -S -mem2reg | \

Should this be armv7?

https://reviews.llvm.org/D47121
[PATCH] D47121: [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

I agree: these intrinsics are available in v7/A32/A64.

Comment at: lib/CodeGen/CGBuiltin.cpp:7865
 }
 // FIXME: Sharing loads & stores with 32-bit is complicated by the absence
 // of an Align parameter here.

kosarev wrote:
> SjoerdMeijer wrote:
> > How about this FIXME? Is it still relevant? And does it need to be moved
> > up? Or perhaps better: move the code back here to minimise the diff?
> Yes, it's still true for the vst builtins handled below. None of the vld/vst
> patches removes this comment, but it should go away with whatever is the one
> to be committed last.
>
> Umm, it seems leaving the vld code here wouldn't make the diff smaller?

Ah yes, never mind, I got confused here.

https://reviews.llvm.org/D47121
[PATCH] D47592: [AArch64] Corrected FP16 Intrinsic range checks in Clang + added Sema tests
SjoerdMeijer added inline comments.

Comment at: test/Sema/aarch64-neon-fp16-ranges.c:1
+// RUN: %clang_cc1 -triple arm64-linux-gnu -target-feature +neon -fallow-half-arguments-and-returns -target-feature +fullfp16 -ffreestanding -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon -fallow-half-arguments-and-returns -target-feature +fullfp16 -ffreestanding -fsyntax-only -verify %s

Nit: target feature fullfp16 implies ARMv8 FP, so I think you can remove +neon; just a tiny optimisation to make the command line shorter (same below).

Comment at: test/Sema/aarch64-neon-fp16-ranges.c:39
+
+void test_vcvt_su_f(int64_t a){
+ vcvth_n_s16_f16(a, 1);

Why is 'a' an int64_t here? Should this not be float16_t?

https://reviews.llvm.org/D47592
[PATCH] D47592: [AArch64] Corrected FP16 Intrinsic range checks in Clang + added Sema tests
SjoerdMeijer added inline comments.

Comment at: lib/Sema/SemaChecking.cpp:1409
- switch (BuiltinID) {
-#define GET_NEON_OVERLOAD_CHECK
-#include "clang/Basic/arm_neon.inc"

Why do we need to remove this?

Comment at: lib/Sema/SemaChecking.cpp:1462
-#define GET_NEON_IMMEDIATE_CHECK
-#include "clang/Basic/arm_neon.inc"
-#include "clang/Basic/arm_fp16.inc"

And also this one?

https://reviews.llvm.org/D47592
[PATCH] D47592: [AArch64] Corrected FP16 Intrinsic range checks in Clang + added Sema tests
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

I think this looks ok now, just some nits inline. Can you please upload your diffs with more context next time?

Comment at: utils/TableGen/NeonEmitter.cpp:2166
+void NeonEmitter::genIntrinsicRangeCheckCode(raw_ostream &OS,
 SmallVectorImpl &Defs) {
 OS << "#ifdef GET_NEON_IMMEDIATE_CHECK\n";

Nit: can you realign this?

Comment at: utils/TableGen/NeonEmitter.cpp:2193
+ if (Def->getBaseType().getElementSizeInBits() == 16 ||
+ Def->getName().find('h') != std::string::npos)
+ // VCVTh operating on FP16 intrinsics in range [1, 16)

Nit: for a moment I thought this could match more cases than intended, but we have already checked for isVCVT_N, so it should be fine?

Comment at: utils/TableGen/NeonEmitter.cpp:2194
+ Def->getName().find('h') != std::string::npos)
+ // VCVTh operating on FP16 intrinsics in range [1, 16)
+ UpperBound = "15";

Nit: I think you're (almost) repeating the comment above, so you can omit this one?

https://reviews.llvm.org/D47592
[PATCH] D47446: [NEON] Support VST1xN intrinsics in AArch32 mode (Clang part)
SjoerdMeijer accepted this revision. SjoerdMeijer added a comment. This revision is now accepted and ready to land. Agreed: supported architectures are v7/A32/A64. https://reviews.llvm.org/D47446
[PATCH] D48119: [AArch64] Added Clang Codegen+Test Support for FP16 VCVTA_U16 intrinsic
SjoerdMeijer added a comment.

Nits on the title:

- I think you can simplify the title to something along the lines of: "[AArch64] Support the FP16 VCVTA_U16 intrinsic". No need to mention in the subject that tests are added (tests should always be added).

Nits on the summary:

- Arm v8.2a -> Armv8.2-A
- Aarch64 -> AArch64
- No need to mention "tested and works using ninja check"; this is obvious and should always be done.
- This is a bit vague: "Added support for existing IP". I think you can reduce the summary to just: "Added support for the vcvta_u16_f16 intrinsic for FP16 Armv8.2-A"

Comment at: CodeGen/arm-v8.2a-neon-intrinsics.c:170
+// CHECK: ret <4 x i16> [[VCVT]]
+int16x4_t test_vcvta_u16_f16 (float16x4_t a) {
+ return vcvta_u16_f16(a);

Is this exactly the same test that was also added in aarch64-v8.2a-neon-intrinsics.c?

Repository: rC Clang

https://reviews.llvm.org/D48119
[PATCH] D48119: [AArch64] Added support for the vcvta_u16_f16 intrinsic for FP16 Armv8.2-A
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Thanks. Looks like a straightforward fix to me.

Comment at: CodeGen/arm-v8.2a-neon-intrinsics.c:170
+// CHECK: ret <4 x i16> [[VCVT]]
+int16x4_t test_vcvta_u16_f16 (float16x4_t a) {
+ return vcvta_u16_f16(a);

LukeGeeson wrote:
> SjoerdMeijer wrote:
> > Is this exactly the same test also added in aarch64-v8.2a-neon-intrinsics.c?
> Not that I'm aware. One is for AArch platforms and the other is Arm. see the
> second CHECK line of each

Ah, of course, my bad. Looked too quickly.

Repository: rC Clang

https://reviews.llvm.org/D48119
[PATCH] D48188: [SPIR] Prevent SPIR targets from using half conversion intrinsics
SjoerdMeijer added a comment.

I know very little about SPIR, and initially didn't understand this:

> The SPIR target currently allows for half precision floating point types to
> use the LLVM intrinsic functions to convert to floats and doubles. This is
> illegal in SPIR as the only intrinsic allowed by SPIR is memcpy

... until I looked at the implementation and saw what you're trying to achieve here. Perhaps you can make the commit message a bit more descriptive and specific.

Comment at: lib/Basic/Targets/SPIR.h:50
 UseAddrSpaceMapMangling = true;
+HasLegalHalfType = true;
 // Define available target features

It doesn't hurt to set this, but you're not using it, so you could omit it. I had to introduce this to deal differently with half types depending on architecture extensions, but I don't think you have this problem.

Comment at: lib/Basic/Targets/SPIR.h:65
+ // memcpy as per section 3 of the SPIR spec.
+ bool useFP16ConversionIntrinsics() const override { return false; }
+

Just a note: this is the only functional change, but you're testing a lot more in test/CodeGen/spir-half-type.cpp.

Comment at: test/CodeGen/spir-half-type.cpp:3
+// RUN: %clang_cc1 -O0 -triple spir64 -emit-llvm %s -o - | FileCheck %s
+
+// This file tests that using the _Float16 type with the spir target will not use the llvm intrinsics but instead will use the half arithmetic instructions directly.

I think you need one reproducer to test:

  // CHECK-NOT: llvm.convert.from.fp16

The other tests, like all the compares, are valid tests, but they are not related to this change and also not specific to SPIR. I put my _Float16 "smoke tests" in test/CodeGenCXX/float16-declarations.cpp; perhaps you can move some of these generic tests there, because I see, for example, that I didn't add any compares there.

Comment at: test/CodeGen/spir-half-type.cpp:4
+
+// This file tests that using the _Float16 type with the spir target will not use the llvm intrinsics but instead will use the half arithmetic instructions directly.
+

Nit: this comment exceeds 80 columns; same for the other comment below.

Repository: rC Clang

https://reviews.llvm.org/D48188
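To make the "only functional change" concrete, here is a minimal sketch of the override pattern under discussion. It is not the Clang class hierarchy: `TargetInfoSketch`, `SPIRTargetInfoSketch` and `describeHalfToFloat` are hypothetical names, the default of `true` in the base class is an assumption, and the returned strings only describe the two lowering styles rather than reproducing real IR.

```cpp
#include <string>

// Hypothetical, simplified stand-in for a target-info base class.
struct TargetInfoSketch {
  virtual ~TargetInfoSketch() = default;
  // Assumed default for this sketch: half conversions go through intrinsics.
  virtual bool useFP16ConversionIntrinsics() const { return true; }
};

// SPIR opts out, as in the reviewed SPIR.h hunk.
struct SPIRTargetInfoSketch : TargetInfoSketch {
  bool useFP16ConversionIntrinsics() const override { return false; }
};

// Describe (in words, not real IR) how a half-to-float conversion would be
// emitted depending on the target hook.
std::string describeHalfToFloat(const TargetInfoSketch &Target) {
  return Target.useFP16ConversionIntrinsics()
             ? "call @llvm.convert.from.fp16"
             : "fpext half to float";
}
```

With this shape, a single `CHECK-NOT: llvm.convert.from.fp16` reproducer exercises exactly the behaviour that the override changes.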
[PATCH] D48188: [SPIR] Prevent SPIR targets from using half conversion intrinsics
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Looks OK to me.

Comment at: test/CodeGen/spir-half-type.cpp:89
+
+_Float16 fadd() {
+ _Float16 a = 1.0f16;

Nit: let one of these functions take a _Float16 function argument?

https://reviews.llvm.org/D48188
[PATCH] D48188: [SPIR] Prevent SPIR targets from using half conversion intrinsics
SjoerdMeijer added a comment. No problem, will commit this today. https://reviews.llvm.org/D48188
[PATCH] D48188: [SPIR] Prevent SPIR targets from using half conversion intrinsics
This revision was automatically updated to reflect the committed changes. Closed by commit rC335111: [SPIR] Prevent SPIR targets from using half conversion intrinsics (authored by SjoerdMeijer, committed by ). Changed prior to commit: https://reviews.llvm.org/D48188?vs=151938&id=152045#toc Repository: rC Clang https://reviews.llvm.org/D48188 Files: lib/Basic/Targets/SPIR.h test/CodeGen/spir-half-type.cpp Index: test/CodeGen/spir-half-type.cpp === --- test/CodeGen/spir-half-type.cpp +++ test/CodeGen/spir-half-type.cpp @@ -0,0 +1,146 @@ +// RUN: %clang_cc1 -O0 -triple spir -emit-llvm %s -o - | FileCheck %s +// RUN: %clang_cc1 -O0 -triple spir64 -emit-llvm %s -o - | FileCheck %s + +// This file tests that using the _Float16 type with the spir target will not +// use the llvm intrinsics but instead will use the half arithmetic +// instructions directly. + +// Previously attempting to use a constant _Float16 with a comparison +// instruction when the target is spir or spir64 lead to an assert being hit. 
+bool fcmp_const() { + _Float16 a = 0.0f16; + const _Float16 b = 1.0f16; + + // CHECK-NOT: llvm.convert.to.fp16 + // CHECK-NOT: llvm.convert.from.fp16 + + // CHECK: [[REG1:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp olt half [[REG1]], 0xH3C00 + + // CHECK: [[REG2:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp olt half [[REG2]], 0xH4000 + + // CHECK: [[REG3:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp ogt half [[REG3]], 0xH3C00 + + // CHECK: [[REG4:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp ogt half [[REG4]], 0xH4200 + + // CHECK: [[REG5:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp oeq half [[REG5]], 0xH3C00 + + // CHECK: [[REG7:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp oeq half [[REG7]], 0xH4400 + + // CHECK: [[REG8:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp une half [[REG8]], 0xH3C00 + + // CHECK: [[REG9:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp une half [[REG9]], 0xH4500 + + // CHECK: [[REG10:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp ole half [[REG10]], 0xH3C00 + + // CHECK: [[REG11:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp ole half [[REG11]], 0xH4600 + + // CHECK: [[REG12:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp oge half [[REG12]], 0xH3C00 + + // CHECK: [[REG13:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: fcmp oge half [[REG13]], 0xH4700 + return a < b || a < 2.0f16 || a > b || a > 3.0f16 || a == b || a == 4.0f16 || + a != b || a != 5.0f16 || a <= b || a <= 6.0f16 || a >= b || + a >= 7.0f16; +} + +bool fcmp() { + _Float16 a = 0.0f16; + _Float16 b = 1.0f16; + + // CHECK-NOT: llvm.convert.to.fp16 + // CHECK-NOT: llvm.convert.from.fp16 + // CHECK: [[REG1:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: [[REG2:%.*]] = load half, half* %b, align 2 + // CHECK-NEXT: fcmp olt half [[REG1]], [[REG2]] + + // CHECK: [[REG3:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: [[REG4:%.*]] 
= load half, half* %b, align 2 + // CHECK-NEXT: fcmp ogt half [[REG3]], [[REG4]] + + // CHECK: [[REG5:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: [[REG6:%.*]] = load half, half* %b, align 2 + // CHECK-NEXT: fcmp oeq half [[REG5]], [[REG6]] + + // CHECK: [[REG7:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: [[REG8:%.*]] = load half, half* %b, align 2 + // CHECK-NEXT: fcmp une half [[REG7]], [[REG8]] + + // CHECK: [[REG7:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: [[REG8:%.*]] = load half, half* %b, align 2 + // CHECK-NEXT: fcmp ole half [[REG7]], [[REG8]] + + // CHECK: [[REG7:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: [[REG8:%.*]] = load half, half* %b, align 2 + // CHECK-NEXT: fcmp oge half [[REG7]], [[REG8]] + return a < b || a > b || a == b || a != b || a <= b || a >= b; +} + +_Float16 fadd() { + _Float16 a = 1.0f16; + const _Float16 b = 2.0f16; + + // CHECK-NOT: llvm.convert.to.fp16 + // CHECK-NOT: llvm.convert.from.fp16 + + // CHECK: [[REG1:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: [[REG2:%.*]] = fadd half [[REG1]], 0xH4000 + // CHECK-NEXT: [[REG3:%.*]] = fadd half [[REG2]], 0xH4200 + // CHECK-NEXT: ret half [[REG3]] + return a + b + 3.0f16; +} + +_Float16 fsub() { + _Float16 a = 1.0f16; + const _Float16 b = 2.0f16; + + // CHECK-NOT: llvm.convert.to.fp16 + // CHECK-NOT: llvm.convert.from.fp16 + + // CHECK: [[REG1:%.*]] = load half, half* %a, align 2 + // CHECK-NEXT: [[REG2:%.*]] = fsub half [[REG1]], 0xH4000 + // CHECK-NEXT: [[REG3:%.*]] = fsub half [[REG2]], 0xH4200 + // CHECK-NEXT: ret half [[REG3]] + return a - b - 3.0f16; +} + +// CHECK: define spir_func half @_Z4fmulDF16_(half %arg) +_Float16 fmul(_Float16 arg) { + _Float16 a = 1.0f16; + const _Float16 b = 2.0f16; + + // CHECK-NOT: llvm.convert.to.fp16 + // CHECK-NOT: llvm.conver
[PATCH] D57188: Disable _Float16 for non ARM/SPIR Targets
SjoerdMeijer added inline comments.

Comment at: include/clang/Basic/TargetInfo.h:66
 bool HasFloat128;
+ bool HasFloat16;
 unsigned char PointerWidth, PointerAlign;

I think this is the same as `HasLegalHalfType`, and we can (re)use that. Or, at least, I don't think we need both `HasLegalHalfType` and `HasFloat16`. For context, I needed `HasLegalHalfType` for argument passing, but it looks like it can serve another purpose now.

Out of curiosity, I was wondering if specifying:

  KEYWORD(_Float16, HALFSUPPORT)

in TokenKinds.def is an alternative approach (it is currently set to KEYALL). That would enable the keyword only when `LangOpts.Half` is set. By adding this `HasFloat16` property here in clang's TargetInfo, we're sort of defining again how targets support different types. I.e., if you throw a `half` type at the backend, the TypeLegalizer will deal with it in one way or another. Perhaps disabling `_Float16` can be achieved by disabling the keyword. But I do see that the big advantage of this patch is the much nicer error message (otherwise we would get something like "unknown type name '_Float16'").

Comment at: include/clang/Basic/TargetInfo.h:521
+ /// Determine whether the _Float16 type is supported on this target.
+ virtual bool hasFloat16Type() const { return HasFloat16; }
+

Similar remark: isn't this the same as `hasLegalHalfType()`?

https://reviews.llvm.org/D57188
[PATCH] D53633: [AArch64] Implement FP16FML intrinsics
SjoerdMeijer added a comment. FYI: a new ACLE version has been published, please find it here: https://developer.arm.com/architectures/system-architectures/software-standards/acle The "Neon Intrinsics" section contains these new intrinsics. Repository: rL LLVM https://reviews.llvm.org/D53633
[PATCH] D60699: [ARM] add CLI support for 8.1-M and MVE.
SjoerdMeijer added inline comments.

Comment at: clang/test/Driver/armv8.1m.main.c:1
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.1-m.main+dsp -### %s 2> %t
+// RUN: FileCheck --check-prefix=CHECK-DSP < %t %s

It doesn't really matter, I guess, but we don't need a temp file and can pipe the output directly to FileCheck?

Comment at: clang/test/Driver/armv8.1m.main.c:3
+// RUN: FileCheck --check-prefix=CHECK-DSP < %t %s
+// CHECK-DSP: "-target-feature" "+dsp"
+

Do we also want to check that just: -march=armv8.1-m doesn't enable DSP (and other non-mandatory extensions)?

Repository: rG LLVM Github Monorepo

https://reviews.llvm.org/D60699
[PATCH] D61717: Fix arm_neon.h to be clean under -fno-lax-vector-conversions.
SjoerdMeijer accepted this revision. SjoerdMeijer added a comment. This revision is now accepted and ready to land. Looks okay to me. Repository: rC Clang https://reviews.llvm.org/D61717
[PATCH] D50229: +fp16fml feature for ARM and AArch64
SjoerdMeijer added a comment.

(I am now picking this up, and will try to progress this patch and also https://reviews.llvm.org/D50179)

> Do you expect that the regression tests will be affected by the TargetParser
> fixes?

No, and that's exactly the reason why it would be nice to get this in. The tests won't change, they show the expected behaviour, and thus we have a sort of "baseline implementation" while we are working on the new options framework.

And just repeating what I said in the other ticket: this option handling implementation is far from ideal and far from pretty; it's very easy to agree on that. This is a low-maintenance patch, so it is very easy for us to keep downstream, but it would perhaps be useful to have it on trunk too. I will add comments and a FIXME that we expect a full reimplementation of it.

Repository: rC Clang

https://reviews.llvm.org/D50229
[PATCH] D50229: +fp16fml feature for ARM and AArch64
SjoerdMeijer added a comment. Ah, and just for your info, the proposal was just sent to the dev list: http://lists.llvm.org/pipermail/llvm-dev/2018-September/126346.html Repository: rC Clang https://reviews.llvm.org/D50229
[PATCH] D50229: [ARM][AArch64] Add feature +fp16fml
SjoerdMeijer updated this revision to Diff 166439. SjoerdMeijer retitled this revision from "+fp16fml feature for ARM and AArch64" to "[ARM][AArch64] Add feature +fp16fml". SjoerdMeijer edited the summary of this revision. SjoerdMeijer added a comment. Added FIXMEs. https://reviews.llvm.org/D50229 Files: lib/Driver/ToolChains/Arch/AArch64.cpp lib/Driver/ToolChains/Arch/ARM.cpp test/Driver/aarch64-cpus.c test/Driver/arm-cortex-cpus.c test/Preprocessor/aarch64-target-features.c test/Preprocessor/arm-target-features.c Index: test/Preprocessor/arm-target-features.c === --- test/Preprocessor/arm-target-features.c +++ test/Preprocessor/arm-target-features.c @@ -21,18 +21,58 @@ // CHECK-V8A-ALLOW-FP-INSTR: #define __ARM_FP16_FORMAT_IEEE 1 // CHECK-V8A-ALLOW-FP-INSTR-V8A-NOT: #define __ARM_FEATURE_DOTPROD -// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.2a+fp16 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.2-a+nofp16fml+fp16 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.2-a+nofp16+fp16fml -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.2-a+fp16+nofp16fml -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8-a+fp16fml -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8-a+fp16 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.4-a+nofp16fml+fp16 -x c -E -dM %s -o - | FileCheck -match-full-lines 
--check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.4-a+nofp16+fp16fml -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.4-a+fp16+nofp16fml -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.4-a+fp16fml -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.4-a+fp16 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-VECTOR-SCALAR %s // CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1 // CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1 // CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FP 0xe // CHECK-FULLFP16-VECTOR-SCALAR: #define __ARM_FP16_FORMAT_IEEE 1 -// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.2a+fp16 -mfpu=vfp4 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-SCALAR %s +// +fp16fml without neon doesn't make sense as the fp16fml instructions all require SIMD. +// However, as +fp16fml implies +fp16 there is a set of defines that we would expect. 
+// RUN: %clang -target arm-none-linux-gnueabi -march=armv8-a+fp16fml -mfpu=vfp4 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8-a+fp16 -mfpu=vfp4 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.4-a+fp16fml -mfpu=vfp4 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.4-a+fp16 -mfpu=vfp4 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-SCALAR %s // CHECK-FULLFP16-SCALAR: #define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1 // CHECK-FULLFP16-SCALAR-NOT: #define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1 // CHECK-FULLFP16-SCALAR: #define __ARM_FP 0xe // CHECK-FULLFP16-SCALAR: #define __ARM_FP16_FORMAT_IEEE 1 -// + +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.2-a -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-NOFML-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.2-a+nofp16 -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-NOFML-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.2-a+nofp16fml -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-FULLFP16-NOFML-VECTOR-SCALAR %s +// RUN: %clang -target arm-none-linux-gnueabi -march=armv8.2-a+fp16fml+nofp16 -x c -E -dM %s -o -
[PATCH] D50179: [AArch64][ARM] Context sensitive meaning of option "crypto"
SjoerdMeijer updated this revision to Diff 166643.
SjoerdMeijer added a comment.

Added FIXMEs, as in https://reviews.llvm.org/D50229, noting that this also needs reimplementation after the TargetParser rewrite.

About v8.5: the ISA description is now available here:
https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools
We will add support for it when we upstream v8.5 support, so that will come later.

https://reviews.llvm.org/D50179

Files:
  lib/Driver/ToolChains/Arch/AArch64.cpp
  lib/Driver/ToolChains/Arch/ARM.cpp
  test/Driver/arm-features.c
  test/Preprocessor/aarch64-target-features.c

Index: test/Preprocessor/aarch64-target-features.c
===
--- test/Preprocessor/aarch64-target-features.c
+++ test/Preprocessor/aarch64-target-features.c
@@ -170,6 +170,101 @@
 // CHECK-MARCH-2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "-fp-armv8" "-target-feature" "-neon" "-target-feature" "-crc" "-target-feature" "-crypto"
 // CHECK-MARCH-3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "-neon"
+// Check +sm4:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+sm4 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SM4 %s
+// CHECK-SM4: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "+sm4"
+//
+// Check +sha3:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+sha3 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SHA3 %s
+// CHECK-SHA3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "+sha3"
+//
+// Check +sha2:
+//
+// RUN: %clang -target aarch64 -march=armv8.3a+sha2 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-SHA2 %s
+// CHECK-SHA2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.{{.}}a" "-target-feature" "+sha2"
+//
+// Check +aes:
+//
+// RUN: %clang -target aarch64 -march=armv8.3a+aes -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-AES %s
+// CHECK-AES: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.{{.}}a" "-target-feature" "+aes"
+//
+// Check -sm4:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noSM4 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SM4 %s
+// CHECK-NO-SM4: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sm4"
+//
+// Check -sha3:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noSHA3 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SHA3 %s
+// CHECK-NO-SHA3: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sha3"
+//
+// Check -sha2:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noSHA2 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SHA2 %s
+// CHECK-NO-SHA2: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-sha2"
+//
+// Check -aes:
+//
+// RUN: %clang -target aarch64 -march=armv8.2a+noAES -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NO-AES %s
+// CHECK-NO-AES: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+v8.2a" "-target-feature" "-aes"
+//
+//
+// Arch <= ARMv8.3: crypto = sha2 + aes
+// -
+//
+// Check +crypto:
+//
+// RUN: %clang -target aarch64 -march=armv8a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8.1a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8.2a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8.3a+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// RUN: %clang -target aarch64 -march=armv8a+crypto+nocrypto+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO83 %s
+// CHECK-CRYPTO83: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+crypto" "-target-feature" "+sha2" "-target-feature" "+aes"
+//
+// Check -crypto:
+//
+// RUN: %clang -target aarch64 -march=armv8a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO8A %s
+// RUN: %clang -target aarch64 -march=armv8.1a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO81 %s
+// RUN: %clang -target aarch64 -march=armv8.2a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s
+// RUN: %clang -target aarch64 -march=armv8.3a+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s
+// RUN: %clang -target aarch64 -march=armv8.3a+nocrypto+crypto+nocrypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO82 %s
+
+// CHECK-NOCRYPTO8A: "-target-feature" "+neon" "-target-feature" "-crypto" "-target-feature" "-sha2" "-target-feature" "-aes" "-target-abi" "aapcs"
+// CHECK-NOCRYPTO81: "-target-feature" "+neon" "-target-feature" "+v8.1a" "-target-feature" "-crypto" "-target-feature" "-sha2" "-target-feature" "-aes" "-target-abi" "aapcs"
+// CHECK-NOCRYPTO82: "-target-feature" "+neon" "-target-feature" "+v8.{{.}}a" "-target-feature" "-crypto
[PATCH] D50229: [ARM][AArch64] Add feature +fp16fml
This revision was automatically updated to reflect the committed changes. Closed by commit rC342862: [ARM][AArch64] Add feature +fp16fml (authored by SjoerdMeijer, committed by ). Repository: rC Clang https://reviews.llvm.org/D50229 Files: lib/Driver/ToolChains/Arch/AArch64.cpp lib/Driver/ToolChains/Arch/ARM.cpp test/Driver/aarch64-cpus.c test/Driver/arm-cortex-cpus.c test/Preprocessor/aarch64-target-features.c test/Preprocessor/arm-target-features.c Index: lib/Driver/ToolChains/Arch/AArch64.cpp === --- lib/Driver/ToolChains/Arch/AArch64.cpp +++ lib/Driver/ToolChains/Arch/AArch64.cpp @@ -193,6 +193,32 @@ Features.push_back("-crc"); } + // Handle (arch-dependent) fp16fml/fullfp16 relationship. + // FIXME: this fp16fml option handling will be reimplemented after the + // TargetParser rewrite. + const auto ItRNoFullFP16 = std::find(Features.rbegin(), Features.rend(), "-fullfp16"); + const auto ItRFP16FML = std::find(Features.rbegin(), Features.rend(), "+fp16fml"); + if (std::find(Features.begin(), Features.end(), "+v8.4a") != Features.end()) { +const auto ItRFullFP16 = std::find(Features.rbegin(), Features.rend(), "+fullfp16"); +if (ItRFullFP16 < ItRNoFullFP16 && ItRFullFP16 < ItRFP16FML) { + // Only entangled feature that can be to the right of this +fullfp16 is -fp16fml. + // Only append the +fp16fml if there is no -fp16fml after the +fullfp16. + if (std::find(Features.rbegin(), ItRFullFP16, "-fp16fml") == ItRFullFP16) +Features.push_back("+fp16fml"); +} +else + goto fp16_fml_fallthrough; + } + else { +fp16_fml_fallthrough: +// In both of these cases, putting the 'other' feature on the end of the vector will +// result in the same effect as placing it immediately after the current feature. 
+if (ItRNoFullFP16 < ItRFP16FML) + Features.push_back("-fp16fml"); +else if (ItRNoFullFP16 > ItRFP16FML) + Features.push_back("+fullfp16"); + } + if (Arg *A = Args.getLastArg(options::OPT_mno_unaligned_access, options::OPT_munaligned_access)) if (A->getOption().matches(options::OPT_mno_unaligned_access)) Index: lib/Driver/ToolChains/Arch/ARM.cpp === --- lib/Driver/ToolChains/Arch/ARM.cpp +++ lib/Driver/ToolChains/Arch/ARM.cpp @@ -391,6 +391,33 @@ } else if (HDivArg) getARMHWDivFeatures(D, HDivArg, Args, HDivArg->getValue(), Features); + // Handle (arch-dependent) fp16fml/fullfp16 relationship. + // Must happen before any features are disabled due to soft-float. + // FIXME: this fp16fml option handling will be reimplemented after the + // TargetParser rewrite. + const auto ItRNoFullFP16 = std::find(Features.rbegin(), Features.rend(), "-fullfp16"); + const auto ItRFP16FML = std::find(Features.rbegin(), Features.rend(), "+fp16fml"); + if (Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v8_4a) { +const auto ItRFullFP16 = std::find(Features.rbegin(), Features.rend(), "+fullfp16"); +if (ItRFullFP16 < ItRNoFullFP16 && ItRFullFP16 < ItRFP16FML) { + // Only entangled feature that can be to the right of this +fullfp16 is -fp16fml. + // Only append the +fp16fml if there is no -fp16fml after the +fullfp16. + if (std::find(Features.rbegin(), ItRFullFP16, "-fp16fml") == ItRFullFP16) +Features.push_back("+fp16fml"); +} +else + goto fp16_fml_fallthrough; + } + else { +fp16_fml_fallthrough: +// In both of these cases, putting the 'other' feature on the end of the vector will +// result in the same effect as placing it immediately after the current feature. +if (ItRNoFullFP16 < ItRFP16FML) + Features.push_back("-fp16fml"); +else if (ItRNoFullFP16 > ItRFP16FML) + Features.push_back("+fullfp16"); + } + // Setting -msoft-float/-mfloat-abi=soft effectively disables the FPU (GCC // ignores the -mfpu options in this case). 
// Note that the ABI can also be set implicitly by the target selected. @@ -404,7 +431,7 @@ //now just be explicit and disable all known dependent features //as well. for (std::string Feature : {"vfp2", "vfp3", "vfp4", "fp-armv8", "fullfp16", -"neon", "crypto", "dotprod"}) +"neon", "crypto", "dotprod", "fp16fml"}) if (std::find(std::begin(Features), std::end(Features), "+" + Feature) != std::end(Features)) Features.push_back(Args.MakeArgString("-" + Feature)); } Index: test/Preprocessor/aarch64-target-features.c === --- test/Preprocessor/aarch64-target-features.c +++ test/Preprocessor/aarch64-target-features.c @@ -93,18 +93,45 @@ // RUN: %clang -target aarch64-none-linux-gnu -march=armv8.2a+dotprod -x c -E -dM %s -o - | FileCheck --check-prefix=CHECK-DOTPROD %s // C
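For readers following the fp16fml/fullfp16 handling in the commit above, the "last option on the command line wins" idea can be sketched in isolation as follows. This is only an illustration under stated assumptions, not the committed driver code: `distanceFromEnd` and `resolveFP16FML` are hypothetical names, the feature list is a plain `std::vector<std::string>`, and the v8.4-A check is reduced to a boolean parameter.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: how far from the end the last occurrence of Name sits
// (0 means it is the final element); returns Features.size() when absent.
// A smaller value therefore means "appears later on the command line".
static std::size_t distanceFromEnd(const std::vector<std::string> &Features,
                                   const std::string &Name) {
  auto It = std::find(Features.rbegin(), Features.rend(), Name);
  return static_cast<std::size_t>(It - Features.rbegin());
}

// Hypothetical stand-in for the entanglement logic in the patch:
//  * on v8.4-A, a trailing +fullfp16 implies +fp16fml unless -fp16fml follows it;
//  * otherwise, -fullfp16 after +fp16fml disables fp16fml, and
//    +fp16fml after -fullfp16 re-enables fullfp16.
void resolveFP16FML(std::vector<std::string> &Features, bool IsV84a) {
  const std::size_t NoFullFP16 = distanceFromEnd(Features, "-fullfp16");
  const std::size_t FP16FML = distanceFromEnd(Features, "+fp16fml");

  if (IsV84a) {
    const std::size_t FullFP16 = distanceFromEnd(Features, "+fullfp16");
    if (FullFP16 < NoFullFP16 && FullFP16 < FP16FML) {
      // +fullfp16 is the latest of the three entangled options. Only add
      // +fp16fml when no -fp16fml was given after that +fullfp16.
      if (distanceFromEnd(Features, "-fp16fml") > FullFP16)
        Features.push_back("+fp16fml");
      return;
    }
    // Otherwise fall through to the generic handling below.
  }

  if (NoFullFP16 < FP16FML)
    Features.push_back("-fp16fml");  // -fullfp16 came last: fp16fml must go too
  else if (FP16FML < NoFullFP16)
    Features.push_back("+fullfp16"); // +fp16fml came last: it implies fullfp16
}
```

With this sketch, `{"+v8.2a", "+fp16fml"}` gains a trailing `+fullfp16`, while `{"+fp16fml", "-fullfp16"}` gains a trailing `-fp16fml`. Working with distances instead of live reverse iterators is a choice made for this sketch only; it avoids holding iterators across `push_back`.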
[PATCH] D52491: [ARM/AArch64][v8.5A] Add Armv8.5-A target
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Looks okay to me.

Comment at: test/Driver/arm-cortex-cpus.c:338
+
+// RUN: %clang -target armv8a-linux-eabi -march=armv8.5-a+fp16 -### -c %s 2>&1 | FileCheck --check-prefix CHECK-V85A-FP16 %s
+// CHECK-V85A-FP16: "-cc1"{{.*}} "-triple" "armv8.5{{.*}}" "-target-cpu" "generic" {{.*}}"-target-feature" "+fullfp16"

nit: perhaps move this further down, to where we have the other fp16 checks?

Comment at: test/Preprocessor/arm-target-features.c:746
+
+// RUN: %clang -target armv8.4a-none-none-eabi -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-V84A %s
+// CHECK-V84A: #define __ARM_ARCH 8

Thanks for upstreaming a little bit of v8.3 and v8.4 too :-)

Repository: rC Clang

https://reviews.llvm.org/D52491
[PATCH] D52492: [AArch64][v8.5A] Test optional Armv8.5-A random number extension
SjoerdMeijer accepted this revision. SjoerdMeijer added a comment. This revision is now accepted and ready to land. LGTM Repository: rC Clang https://reviews.llvm.org/D52492
[PATCH] D52493: [AArch64][v8.5A] Test clang option for the Memory Tagging Extension
SjoerdMeijer accepted this revision. SjoerdMeijer added a comment. This revision is now accepted and ready to land. LGTM Repository: rC Clang https://reviews.llvm.org/D52493
[PATCH] D50179: [AArch64][ARM] Context sensitive meaning of option "crypto"
SjoerdMeijer added a comment. @efriedma : apologies for the ping, but does this look reasonable? https://reviews.llvm.org/D50179
[PATCH] D50179: [AArch64][ARM] Context sensitive meaning of option "crypto"
SjoerdMeijer added a comment. Thanks! https://reviews.llvm.org/D50179
[PATCH] D50179: [AArch64][ARM] Context sensitive meaning of option "crypto"
This revision was automatically updated to reflect the committed changes. Closed by commit rL343758: [AArch64][ARM] Context sensitive meaning of crypto (authored by SjoerdMeijer, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D50179?vs=166643&id=168233#toc Repository: rL LLVM https://reviews.llvm.org/D50179 Files: cfe/trunk/lib/Driver/ToolChains/Arch/AArch64.cpp cfe/trunk/lib/Driver/ToolChains/Arch/ARM.cpp cfe/trunk/test/Driver/arm-features.c cfe/trunk/test/Preprocessor/aarch64-target-features.c Index: cfe/trunk/lib/Driver/ToolChains/Arch/AArch64.cpp === --- cfe/trunk/lib/Driver/ToolChains/Arch/AArch64.cpp +++ cfe/trunk/lib/Driver/ToolChains/Arch/AArch64.cpp @@ -219,6 +219,87 @@ Features.push_back("+fullfp16"); } + // FIXME: this needs reimplementation too after the TargetParser rewrite + // + // Context sensitive meaning of Crypto: + // 1) For Arch >= ARMv8.4a: crypto = sm4 + sha3 + sha2 + aes + // 2) For Arch <= ARMv8.3a: crypto = sha2 + aes + const auto ItBegin = Features.begin(); + const auto ItEnd = Features.end(); + const auto ItRBegin = Features.rbegin(); + const auto ItREnd = Features.rend(); + const auto ItRCrypto = std::find(ItRBegin, ItREnd, "+crypto"); + const auto ItRNoCrypto = std::find(ItRBegin, ItREnd, "-crypto"); + const auto HasCrypto = ItRCrypto != ItREnd; + const auto HasNoCrypto = ItRNoCrypto != ItREnd; + const ptrdiff_t PosCrypto = ItRCrypto - ItRBegin; + const ptrdiff_t PosNoCrypto = ItRNoCrypto - ItRBegin; + + bool NoCrypto = false; + if (HasCrypto && HasNoCrypto) { +if (PosNoCrypto < PosCrypto) + NoCrypto = true; + } + + if (std::find(ItBegin, ItEnd, "+v8.4a") != ItEnd) { +if (HasCrypto && !NoCrypto) { + // Check if we have NOT disabled an algorithm with something like: + // +crypto, -algorithm + // And if "-algorithm" does not occur, we enable that crypto algorithm. 
+ const bool HasSM4 = (std::find(ItBegin, ItEnd, "-sm4") == ItEnd); + const bool HasSHA3 = (std::find(ItBegin, ItEnd, "-sha3") == ItEnd); + const bool HasSHA2 = (std::find(ItBegin, ItEnd, "-sha2") == ItEnd); + const bool HasAES = (std::find(ItBegin, ItEnd, "-aes") == ItEnd); + if (HasSM4) +Features.push_back("+sm4"); + if (HasSHA3) +Features.push_back("+sha3"); + if (HasSHA2) +Features.push_back("+sha2"); + if (HasAES) +Features.push_back("+aes"); +} else if (HasNoCrypto) { + // Check if we have NOT enabled a crypto algorithm with something like: + // -crypto, +algorithm + // And if "+algorithm" does not occur, we disable that crypto algorithm. + const bool HasSM4 = (std::find(ItBegin, ItEnd, "+sm4") != ItEnd); + const bool HasSHA3 = (std::find(ItBegin, ItEnd, "+sha3") != ItEnd); + const bool HasSHA2 = (std::find(ItBegin, ItEnd, "+sha2") != ItEnd); + const bool HasAES = (std::find(ItBegin, ItEnd, "+aes") != ItEnd); + if (!HasSM4) +Features.push_back("-sm4"); + if (!HasSHA3) +Features.push_back("-sha3"); + if (!HasSHA2) +Features.push_back("-sha2"); + if (!HasAES) +Features.push_back("-aes"); +} + } else { +if (HasCrypto && !NoCrypto) { + const bool HasSHA2 = (std::find(ItBegin, ItEnd, "-sha2") == ItEnd); + const bool HasAES = (std::find(ItBegin, ItEnd, "-aes") == ItEnd); + if (HasSHA2) +Features.push_back("+sha2"); + if (HasAES) +Features.push_back("+aes"); +} else if (HasNoCrypto) { + const bool HasSHA2 = (std::find(ItBegin, ItEnd, "+sha2") != ItEnd); + const bool HasAES = (std::find(ItBegin, ItEnd, "+aes") != ItEnd); + const bool HasV82a = (std::find(ItBegin, ItEnd, "+v8.2a") != ItEnd); + const bool HasV83a = (std::find(ItBegin, ItEnd, "+v8.3a") != ItEnd); + const bool HasV84a = (std::find(ItBegin, ItEnd, "+v8.4a") != ItEnd); + if (!HasSHA2) +Features.push_back("-sha2"); + if (!HasAES) +Features.push_back("-aes"); + if (HasV82a || HasV83a || HasV84a) { +Features.push_back("-sm4"); +Features.push_back("-sha3"); + } +} + } + if (Arg *A = 
Args.getLastArg(options::OPT_mno_unaligned_access, options::OPT_munaligned_access)) if (A->getOption().matches(options::OPT_mno_unaligned_access)) Index: cfe/trunk/lib/Driver/ToolChains/Arch/ARM.cpp === --- cfe/trunk/lib/Driver/ToolChains/Arch/ARM.cpp +++ cfe/trunk/lib/Driver/ToolChains/Arch/ARM.cpp @@ -444,6 +444,26 @@ Features.push_back("-crc"); } + // For Arch >= ARMv8.0: crypto = sha2 + aes + // FIXME: this needs reimplementation after the TargetParser rewrite + if (ArchName.find_lower("armv8a") != StringRef::npos || + ArchName.find_lower("armv8.1a") != StringRe
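The context-sensitive meaning of "crypto" described in the FIXME comment above can be sketched as a small standalone function. This is a simplified illustration and not the committed code: `distanceFromEnd`, `contains` and `expandCrypto` are hypothetical names, the feature list is a plain `std::vector<std::string>`, and the extra `-sm4`/`-sha3` handling that the patch applies to `-crypto` on v8.2-A/v8.3-A is deliberately left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: distance of the last occurrence of Name from the end
// of the list (0 = final element); Features.size() when Name is absent.
static std::size_t distanceFromEnd(const std::vector<std::string> &Features,
                                   const std::string &Name) {
  auto It = std::find(Features.rbegin(), Features.rend(), Name);
  return static_cast<std::size_t>(It - Features.rbegin());
}

// Hypothetical helper: true when Name occurs anywhere in the list.
static bool contains(const std::vector<std::string> &Features,
                     const std::string &Name) {
  return std::find(Features.begin(), Features.end(), Name) != Features.end();
}

// Sketch of the expansion rule quoted in the patch:
//   Arch >= ARMv8.4a: crypto = sm4 + sha3 + sha2 + aes
//   Arch <= ARMv8.3a: crypto = sha2 + aes
// The later of +crypto / -crypto decides the direction; an algorithm that the
// user explicitly set the other way is left alone.
void expandCrypto(std::vector<std::string> &Features) {
  const std::size_t Size = Features.size();
  const std::size_t PosCrypto = distanceFromEnd(Features, "+crypto");
  const std::size_t PosNoCrypto = distanceFromEnd(Features, "-crypto");
  const bool HasCrypto = PosCrypto != Size;
  const bool HasNoCrypto = PosNoCrypto != Size;
  if (!HasCrypto && !HasNoCrypto)
    return; // crypto was never mentioned: nothing to expand

  // +crypto wins when it is present and appears after any -crypto.
  const bool Enable = HasCrypto && PosCrypto < PosNoCrypto;

  std::vector<std::string> Algorithms;
  if (contains(Features, "+v8.4a")) {
    Algorithms.push_back("sm4");
    Algorithms.push_back("sha3");
  }
  Algorithms.push_back("sha2");
  Algorithms.push_back("aes");

  const std::string Sign = Enable ? "+" : "-";
  const std::string Opposite = Enable ? "-" : "+";
  for (const std::string &Alg : Algorithms)
    if (!contains(Features, Opposite + Alg))
      Features.push_back(Sign + Alg);
}
```

Under these assumptions, `{"+v8.2a", "+crypto"}` expands to `+crypto +sha2 +aes`, which is the shape the CHECK-CRYPTO83 lines in the test diff expect, and `{"+v8.4a", "+crypto", "-sha3"}` gains `+sm4 +sha2 +aes` but no `+sha3`.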
[PATCH] D57577: Make predefined FLT16 macros conditional on support for the type
SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

Looks okay to me, with one nit inline.

Comment at: test/Preprocessor/init.c:9169
 // WEBASSEMBLY-NEXT:#define __FLOAT128__ 1
-// WEBASSEMBLY-NEXT:#define __FLT16_DECIMAL_DIG__ 5
-// WEBASSEMBLY-NEXT:#define __FLT16_DENORM_MIN__ 5.9604644775390625e-8F16

Perhaps change this to a WEBASSEMBLY-NOT check, so that we also have one negative test for this?

Repository: rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57577/new/

https://reviews.llvm.org/D57577
[PATCH] D53633: [AArch64] Implement FP16FML intrinsics
SjoerdMeijer added inline comments.

Comment at: cfe/trunk/test/CodeGen/aarch64-neon-fp16fml.c:12
+
+float32x2_t test_vfmlal_low_u32(float32x2_t a, float16x4_t b, float16x4_t c) {
+// CHECK-LABEL: define <2 x float> @test_vfmlal_low_u32(<2 x float> %a, <4 x half> %b, <4 x half> %c)

ab wrote:
> Hey folks, I'm curious: where does the "_u32" suffix come from? Should it be
> _f16?
>
> Also, are there any new ACLE/intrinsic list documents? As far as I can tell
> there hasn't been any release since IHI0073B/IHI0053D.

> Also, are there any new ACLE/intrinsic list documents? As far as I can tell
> there hasn't been any release since IHI0073B/IHI0053D.

I've checked, and an updated ACLE that includes these FP16FML intrinsics is coming soon.

> where does the "_u32" suffix come from? Should it be _f16?

Good question. It could probably be _f32 or _f16, but _u32 doesn't seem to make much sense. It looks like the spec says _u32, and that's also what GCC has implemented. I think we want to update the spec and fix the name before the updated spec is available. I will chase this, and let you know once I know more.

Repository: rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D53633/new/

https://reviews.llvm.org/D53633
[PATCH] D53633: [AArch64] Implement FP16FML intrinsics
SjoerdMeijer added inline comments.

Comment at: cfe/trunk/test/CodeGen/aarch64-neon-fp16fml.c:12
+
+float32x2_t test_vfmlal_low_u32(float32x2_t a, float16x4_t b, float16x4_t c) {
+// CHECK-LABEL: define <2 x float> @test_vfmlal_low_u32(<2 x float> %a, <4 x half> %b, <4 x half> %c)

SjoerdMeijer wrote:
> ab wrote:
> > Hey folks, I'm curious: where does the "_u32" suffix come from? Should it
> > be _f16?
> >
> > Also, are there any new ACLE/intrinsic list documents? As far as I can tell
> > there hasn't been any release since IHI0073B/IHI0053D.
> > Also, are there any new ACLE/intrinsic list documents? As far as I can tell
> > there hasn't been any release since IHI0073B/IHI0053D.
>
> I've checked, and an updated ACLE that includes these FP16FML intrinsics is
> coming soon.
>
> > where does the "_u32" suffix come from? Should it be _f16?
>
> Good question. It could probably be _f32 or _f16, but _u32 doesn't seem to
> make much sense. Looks like the spec says _u32, and that's also what GCC has
> implemented. I think we want to update the spec and fix the name before the
> updated spec is available. Will chase this, and let you know once I know more.

An update on this: we should change this to _f32 (because the first suffixes were referring to the output type). The ACLE will be updated accordingly, and GCC will also change its current implementation (from _u32 to _f32).

Many thanks for raising this issue. Is there a volunteer to prepare a patch? Or do you have one already? :-) I could look at it, but that would be towards the end of next week.

Repository: rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D53633/new/

https://reviews.llvm.org/D53633
[PATCH] D58306: [AArch64] Change size suffix for FP16FML intrinsics.
SjoerdMeijer added a comment. I am discussing this with our GCC team, as we would like the Clang and GCC implementations to be the same. But you're right that _f16 looks to be the more consistent choice. I will let you know as soon as I know more. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D58306/new/ https://reviews.llvm.org/D58306
[PATCH] D58306: [AArch64] Change size suffix for FP16FML intrinsics.
SjoerdMeijer accepted this revision. SjoerdMeijer added a comment. This revision is now accepted and ready to land. LGTM. The ACLE has been updated, and a new version with this change included will be released soon. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D58306/new/ https://reviews.llvm.org/D58306
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer updated this revision to Diff 113100. SjoerdMeijer added a comment. No changes were needed to make the conversions work, the existing logic is taking care of that, but I agree it doesn't hurt to add a few test cases. So I've added tests to both files, and cleaned up that comment. https://reviews.llvm.org/D33719 Files: include/clang-c/Index.h include/clang/AST/ASTContext.h include/clang/AST/BuiltinTypes.def include/clang/Basic/Specifiers.h include/clang/Basic/TokenKinds.def include/clang/Lex/LiteralSupport.h include/clang/Sema/DeclSpec.h include/clang/Serialization/ASTBitCodes.h lib/AST/ASTContext.cpp lib/AST/ItaniumMangle.cpp lib/AST/MicrosoftMangle.cpp lib/AST/NSAPI.cpp lib/AST/StmtPrinter.cpp lib/AST/Type.cpp lib/AST/TypeLoc.cpp lib/Analysis/PrintfFormatString.cpp lib/CodeGen/CGDebugInfo.cpp lib/CodeGen/CGExprScalar.cpp lib/CodeGen/CodeGenTypes.cpp lib/CodeGen/ItaniumCXXABI.cpp lib/Format/FormatToken.cpp lib/Index/USRGeneration.cpp lib/Lex/LiteralSupport.cpp lib/Parse/ParseDecl.cpp lib/Parse/ParseExpr.cpp lib/Parse/ParseExprCXX.cpp lib/Parse/ParseTentative.cpp lib/Sema/DeclSpec.cpp lib/Sema/SemaDecl.cpp lib/Sema/SemaExpr.cpp lib/Sema/SemaTemplateVariadic.cpp lib/Sema/SemaType.cpp lib/Serialization/ASTCommon.cpp lib/Serialization/ASTReader.cpp test/CodeGenCXX/float16-declarations.cpp test/Frontend/float16.cpp test/Lexer/half-literal.cpp tools/libclang/CXType.cpp Index: tools/libclang/CXType.cpp === --- tools/libclang/CXType.cpp +++ tools/libclang/CXType.cpp @@ -53,6 +53,7 @@ BTCASE(Float); BTCASE(Double); BTCASE(LongDouble); +BTCASE(Float16); BTCASE(Float128); BTCASE(NullPtr); BTCASE(Overload); @@ -520,7 +521,7 @@ TKIND(Char_U); TKIND(UChar); TKIND(Char16); -TKIND(Char32); +TKIND(Char32); TKIND(UShort); TKIND(UInt); TKIND(ULong); @@ -538,6 +539,7 @@ TKIND(Float); TKIND(Double); TKIND(LongDouble); +TKIND(Float16); TKIND(Float128); TKIND(NullPtr); TKIND(Overload); Index: test/Lexer/half-literal.cpp === --- test/Lexer/half-literal.cpp +++ 
test/Lexer/half-literal.cpp @@ -1,3 +1,6 @@ // RUN: %clang_cc1 -fsyntax-only -verify -pedantic %s float a = 1.0h; // expected-error{{invalid suffix 'h' on floating constant}} float b = 1.0H; // expected-error{{invalid suffix 'H' on floating constant}} + +_Float16 c = 1.f166; // expected-error{{invalid suffix 'f166' on floating constant}} +_Float16 d = 1.f1; // expected-error{{invalid suffix 'f1' on floating constant}} Index: test/Frontend/float16.cpp === --- /dev/null +++ test/Frontend/float16.cpp @@ -0,0 +1,326 @@ +// RUN: %clang_cc1 -std=c++11 -ast-dump %s | FileCheck %s +// RUN: %clang_cc1 -std=c++11 -ast-dump -fnative-half-type %s | FileCheck %s --check-prefix=CHECK-NATIVE + +/* Various contexts where type _Float16 can appear. */ + +/* Namespace */ +namespace { + _Float16 f1n; + _Float16 f2n = 33.f16; + _Float16 arr1n[10]; + _Float16 arr2n[] = { 1.2, 3.0, 3.e4 }; + const volatile _Float16 func1n(const _Float16 &arg) { +return arg + f2n + arr1n[4] - arr2n[1]; + } +} + +//CHECK: |-NamespaceDecl +//CHECK: | |-VarDecl {{.*}} f1n '_Float16' +//CHECK: | |-VarDecl {{.*}} f2n '_Float16' cinit +//CHECK: | | `-FloatingLiteral {{.*}} '_Float16' 3.30e+01 +//CHECK: | |-VarDecl {{.*}} arr1n '_Float16 [10]' +//CHECK: | |-VarDecl {{.*}} arr2n '_Float16 [3]' cinit +//CHECK: | | `-InitListExpr {{.*}} '_Float16 [3]' +//CHECK: | | |-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | | | `-FloatingLiteral {{.*}} 'double' 1.20e+00 +//CHECK: | | |-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | | | `-FloatingLiteral {{.*}} 'double' 3.00e+00 +//CHECK: | | `-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | | `-FloatingLiteral {{.*}} 'double' 3.00e+04 +//CHECK: | `-FunctionDecl {{.*}} func1n 'const volatile _Float16 (const _Float16 &)' + +/* File */ +_Float16 f1f; +_Float16 f2f = 32.4; +_Float16 arr1f[10]; +_Float16 arr2f[] = { -1.2, -3.0, -3.e4 }; +_Float16 func1f(_Float16 arg); + +//CHECK: |-VarDecl {{.*}} f1f '_Float16' +//CHECK: |-VarDecl {{.*}} f2f '_Float16' cinit +//CHECK: | 
`-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | `-FloatingLiteral {{.*}} 'double' 3.24e+01 +//CHECK: |-VarDecl {{.*}} arr1f '_Float16 [10]' +//CHECK: |-VarDecl {{.*}} arr2f '_Float16 [3]' cinit +//CHECK: | `-InitListExpr {{.*}} '_Float16 [3]' +//CHECK: | |-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | | `-UnaryOperator {{.*}} 'double' prefix '-' +//CHECK: | | `-FloatingLiteral {{.*}} 'double' 1.20e+00 +//CHECK: | |-ImplicitCastExpr {{.*}} '_Float16' +//CHECK: | | `-UnaryOperator {{.*}} 'double' prefix '-' +//CHECK: | | `-FloatingLiteral {{.*}} 'double' 3.00
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer updated this revision to Diff 113218.
SjoerdMeijer added a comment.

Comments addressed. Thanks for reviewing.

https://reviews.llvm.org/D33719

Files:
  include/clang-c/Index.h
  include/clang/AST/ASTContext.h
  include/clang/AST/BuiltinTypes.def
  include/clang/Basic/Specifiers.h
  include/clang/Basic/TokenKinds.def
  include/clang/Lex/LiteralSupport.h
  include/clang/Sema/DeclSpec.h
  include/clang/Serialization/ASTBitCodes.h
  lib/AST/ASTContext.cpp
  lib/AST/ItaniumMangle.cpp
  lib/AST/MicrosoftMangle.cpp
  lib/AST/NSAPI.cpp
  lib/AST/StmtPrinter.cpp
  lib/AST/Type.cpp
  lib/AST/TypeLoc.cpp
  lib/Analysis/PrintfFormatString.cpp
  lib/CodeGen/CGDebugInfo.cpp
  lib/CodeGen/CGExprScalar.cpp
  lib/CodeGen/CodeGenTypes.cpp
  lib/CodeGen/ItaniumCXXABI.cpp
  lib/Format/FormatToken.cpp
  lib/Index/USRGeneration.cpp
  lib/Lex/LiteralSupport.cpp
  lib/Parse/ParseDecl.cpp
  lib/Parse/ParseExpr.cpp
  lib/Parse/ParseExprCXX.cpp
  lib/Parse/ParseTentative.cpp
  lib/Sema/DeclSpec.cpp
  lib/Sema/SemaDecl.cpp
  lib/Sema/SemaExpr.cpp
  lib/Sema/SemaTemplateVariadic.cpp
  lib/Sema/SemaType.cpp
  lib/Serialization/ASTCommon.cpp
  lib/Serialization/ASTReader.cpp
  test/CodeGenCXX/float16-declarations.cpp
  test/Frontend/float16.cpp
  test/Lexer/half-literal.cpp
  tools/libclang/CXType.cpp

Index: tools/libclang/CXType.cpp
===
--- tools/libclang/CXType.cpp
+++ tools/libclang/CXType.cpp
@@ -53,6 +53,7 @@
 BTCASE(Float);
 BTCASE(Double);
 BTCASE(LongDouble);
+BTCASE(Float16);
 BTCASE(Float128);
 BTCASE(NullPtr);
 BTCASE(Overload);
@@ -520,7 +521,7 @@
 TKIND(Char_U);
 TKIND(UChar);
 TKIND(Char16);
-TKIND(Char32);
+TKIND(Char32);
 TKIND(UShort);
 TKIND(UInt);
 TKIND(ULong);
@@ -538,6 +539,7 @@
 TKIND(Float);
 TKIND(Double);
 TKIND(LongDouble);
+TKIND(Float16);
 TKIND(Float128);
 TKIND(NullPtr);
 TKIND(Overload);
Index: test/Lexer/half-literal.cpp
===
--- test/Lexer/half-literal.cpp
+++ test/Lexer/half-literal.cpp
@@ -1,3 +1,6 @@
 // RUN: %clang_cc1 -fsyntax-only -verify -pedantic %s
 float a = 1.0h; // expected-error{{invalid suffix 'h' on floating constant}}
 float b = 1.0H; // expected-error{{invalid suffix 'H' on floating constant}}
+
+_Float16 c = 1.f166; // expected-error{{invalid suffix 'f166' on floating constant}}
+_Float16 d = 1.f1; // expected-error{{invalid suffix 'f1' on floating constant}}
Index: test/Frontend/float16.cpp
===
--- /dev/null
+++ test/Frontend/float16.cpp
@@ -0,0 +1,326 @@
+// RUN: %clang_cc1 -std=c++11 -ast-dump %s | FileCheck %s --strict-whitespace
+// RUN: %clang_cc1 -std=c++11 -ast-dump -fnative-half-type %s | FileCheck %s --check-prefix=CHECK-NATIVE --strict-whitespace
+
+/* Various contexts where type _Float16 can appear. */
+
+/* Namespace */
+namespace {
+ _Float16 f1n;
+ _Float16 f2n = 33.f16;
+ _Float16 arr1n[10];
+ _Float16 arr2n[] = { 1.2, 3.0, 3.e4 };
+ const volatile _Float16 func1n(const _Float16 &arg) {
+return arg + f2n + arr1n[4] - arr2n[1];
+ }
+}
+
+//CHECK: |-NamespaceDecl
+//CHECK-NEXT: | |-VarDecl {{.*}} f1n '_Float16'
+//CHECK-NEXT: | |-VarDecl {{.*}} f2n '_Float16' cinit
+//CHECK-NEXT: | | `-FloatingLiteral {{.*}} '_Float16' 3.30e+01
+//CHECK-NEXT: | |-VarDecl {{.*}} arr1n '_Float16 [10]'
+//CHECK-NEXT: | |-VarDecl {{.*}} arr2n '_Float16 [3]' cinit
+//CHECK-NEXT: | | `-InitListExpr {{.*}} '_Float16 [3]'
+//CHECK-NEXT: | | |-ImplicitCastExpr {{.*}} '_Float16'
+//CHECK-NEXT: | | | `-FloatingLiteral {{.*}} 'double' 1.20e+00
+//CHECK-NEXT: | | |-ImplicitCastExpr {{.*}} '_Float16'
+//CHECK-NEXT: | | | `-FloatingLiteral {{.*}} 'double' 3.00e+00
+//CHECK-NEXT: | | `-ImplicitCastExpr {{.*}} '_Float16'
+//CHECK-NEXT: | | `-FloatingLiteral {{.*}} 'double' 3.00e+04
+//CHECK-NEXT: | `-FunctionDecl {{.*}} func1n 'const volatile _Float16 (const _Float16 &)'
+
+/* File */
+_Float16 f1f;
+_Float16 f2f = 32.4;
+_Float16 arr1f[10];
+_Float16 arr2f[] = { -1.2, -3.0, -3.e4 };
+_Float16 func1f(_Float16 arg);
+
+//CHECK: |-VarDecl {{.*}} f1f '_Float16'
+//CHECK-NEXT: |-VarDecl {{.*}} f2f '_Float16' cinit
+//CHECK-NEXT: | `-ImplicitCastExpr {{.*}} '_Float16'
+//CHECK-NEXT: | `-FloatingLiteral {{.*}} 'double' 3.24e+01
+//CHECK-NEXT: |-VarDecl {{.*}} arr1f '_Float16 [10]'
+//CHECK-NEXT: |-VarDecl {{.*}} arr2f '_Float16 [3]' cinit
+//CHECK-NEXT: | `-InitListExpr {{.*}} '_Float16 [3]'
+//CHECK-NEXT: | |-ImplicitCastExpr {{.*}} '_Float16'
+//CHECK-NEXT: | | `-UnaryOperator {{.*}} 'double' prefix '-'
+//CHECK-NEXT: | | `-FloatingLiteral {{.*}} 'double' 1.20e+00
+//CHECK-NEXT: | |-ImplicitCastExpr {{.*}} '_Float16'
+//CHECK-NEXT: | | `-UnaryOperator {{.*}} 'double' prefix '-'
+//CHECK-NEXT: | | `-FloatingLiteral {{.*}} 'double' 3.
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer added a comment. I am going to commit this within a few days. That seems reasonable to me, given that the comments in the last review rounds were very minor (and I have of course addressed them already). Also, in case of issues, I expect fixes and/or additions can easily be made post-commit, and if not, a revert is cheap. Committing this allows me to make some progress on the FP16 work: I have mostly fixed up the AArch64 back-end, and will now focus on ARM and the remaining Clang patches (documentation). Comments are still welcome of course, and I will address them before committing. https://reviews.llvm.org/D33719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D33719: Add _Float16 as a C/C++ source language type
SjoerdMeijer added a comment. Many thanks for reviewing and your help! https://reviews.llvm.org/D33719
[PATCH] D33719: Add _Float16 as a C/C++ source language type
This revision was automatically updated to reflect the committed changes.
Closed by commit rL312781: Add _Float16 as a C/C++ source language type (authored by SjoerdMeijer).

Changed prior to commit:
  https://reviews.llvm.org/D33719?vs=113218&id=114325#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D33719

Files:
  cfe/trunk/include/clang-c/Index.h
  cfe/trunk/include/clang/AST/ASTContext.h
  cfe/trunk/include/clang/AST/BuiltinTypes.def
  cfe/trunk/include/clang/Basic/Specifiers.h
  cfe/trunk/include/clang/Basic/TokenKinds.def
  cfe/trunk/include/clang/Lex/LiteralSupport.h
  cfe/trunk/include/clang/Sema/DeclSpec.h
  cfe/trunk/include/clang/Serialization/ASTBitCodes.h
  cfe/trunk/lib/AST/ASTContext.cpp
  cfe/trunk/lib/AST/ItaniumMangle.cpp
  cfe/trunk/lib/AST/MicrosoftMangle.cpp
  cfe/trunk/lib/AST/NSAPI.cpp
  cfe/trunk/lib/AST/StmtPrinter.cpp
  cfe/trunk/lib/AST/Type.cpp
  cfe/trunk/lib/AST/TypeLoc.cpp
  cfe/trunk/lib/Analysis/PrintfFormatString.cpp
  cfe/trunk/lib/CodeGen/CGDebugInfo.cpp
  cfe/trunk/lib/CodeGen/CGExprScalar.cpp
  cfe/trunk/lib/CodeGen/CodeGenTypes.cpp
  cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp
  cfe/trunk/lib/Format/FormatToken.cpp
  cfe/trunk/lib/Index/USRGeneration.cpp
  cfe/trunk/lib/Lex/LiteralSupport.cpp
  cfe/trunk/lib/Parse/ParseDecl.cpp
  cfe/trunk/lib/Parse/ParseExpr.cpp
  cfe/trunk/lib/Parse/ParseExprCXX.cpp
  cfe/trunk/lib/Parse/ParseTentative.cpp
  cfe/trunk/lib/Sema/DeclSpec.cpp
  cfe/trunk/lib/Sema/SemaDecl.cpp
  cfe/trunk/lib/Sema/SemaExpr.cpp
  cfe/trunk/lib/Sema/SemaTemplateVariadic.cpp
  cfe/trunk/lib/Sema/SemaType.cpp
  cfe/trunk/lib/Serialization/ASTCommon.cpp
  cfe/trunk/lib/Serialization/ASTReader.cpp
  cfe/trunk/test/CodeGenCXX/float16-declarations.cpp
  cfe/trunk/test/Frontend/float16.cpp
  cfe/trunk/test/Lexer/half-literal.cpp
  cfe/trunk/tools/libclang/CXType.cpp

Index: cfe/trunk/include/clang-c/Index.h
===
--- cfe/trunk/include/clang-c/Index.h
+++ cfe/trunk/include/clang-c/Index.h
@@ -3115,8 +3115,9 @@
 CXType_ObjCSel = 29,
 CXType_Float128 = 30,
 CXType_Half = 31,
+ CXType_Float16 = 32,
 CXType_FirstBuiltin = CXType_Void,
- CXType_LastBuiltin = CXType_Half,
+ CXType_LastBuiltin = CXType_Float16,
 CXType_Complex = 100,
 CXType_Pointer = 101,
Index: cfe/trunk/include/clang/Lex/LiteralSupport.h
===
--- cfe/trunk/include/clang/Lex/LiteralSupport.h
+++ cfe/trunk/include/clang/Lex/LiteralSupport.h
@@ -65,6 +65,7 @@
 bool isHalf : 1; // 1.0h
 bool isFloat : 1; // 1.0f
 bool isImaginary : 1; // 1.0i
+ bool isFloat16 : 1; // 1.0f16
 bool isFloat128 : 1; // 1.0q
 uint8_t MicrosoftInteger; // Microsoft suffix extension i8, i16, i32, or i64.
Index: cfe/trunk/include/clang/AST/BuiltinTypes.def
===
--- cfe/trunk/include/clang/AST/BuiltinTypes.def
+++ cfe/trunk/include/clang/AST/BuiltinTypes.def
@@ -133,6 +133,9 @@
 // 'long double'
 FLOATING_TYPE(LongDouble, LongDoubleTy)
+// '_Float16'
+FLOATING_TYPE(Float16, HalfTy)
+
 // '__float128'
 FLOATING_TYPE(Float128, Float128Ty)
Index: cfe/trunk/include/clang/AST/ASTContext.h
===
--- cfe/trunk/include/clang/AST/ASTContext.h
+++ cfe/trunk/include/clang/AST/ASTContext.h
@@ -973,6 +973,7 @@
 CanQualType UnsignedLongLongTy, UnsignedInt128Ty;
 CanQualType FloatTy, DoubleTy, LongDoubleTy, Float128Ty;
 CanQualType HalfTy; // [OpenCL 6.1.1.1], ARM NEON
+ CanQualType Float16Ty; // C11 extension ISO/IEC TS 18661-3
 CanQualType FloatComplexTy, DoubleComplexTy, LongDoubleComplexTy;
 CanQualType Float128ComplexTy;
 CanQualType VoidPtrTy, NullPtrTy;
Index: cfe/trunk/include/clang/Sema/DeclSpec.h
===
--- cfe/trunk/include/clang/Sema/DeclSpec.h
+++ cfe/trunk/include/clang/Sema/DeclSpec.h
@@ -280,6 +280,7 @@
 static const TST TST_half = clang::TST_half;
 static const TST TST_float = clang::TST_float;
 static const TST TST_double = clang::TST_double;
+ static const TST TST_float16 = clang::TST_Float16;
 static const TST TST_float128 = clang::TST_float128;
 static const TST TST_bool = clang::TST_bool;
 static const TST TST_decimal32 = clang::TST_decimal32;
Index: cfe/trunk/include/clang/Basic/TokenKinds.def
===
--- cfe/trunk/include/clang/Basic/TokenKinds.def
+++ cfe/trunk/include/clang/Basic/TokenKinds.def
@@ -379,6 +379,9 @@
 MODULES_KEYWORD(module)
 MODULES_KEYWORD(import)
+// C11 Extension
+KEYWORD(_Float16, KEYALL)
+
 // GNU Extensions (in impl-reserved namespace)
 KEYWORD(_Decimal32 , KEYALL)
 KEYWORD(_Decimal64 , KEYALL)
Index: cfe/trunk/include/clang/Basic/Specifiers.h
=
[PATCH] D34695: _Float16 preprocessor macro definitions
SjoerdMeijer added inline comments.

Comment at: lib/Headers/float.h:137
+#ifdef __STDC_WANT_IEC_60559_TYPES_EXT__
+# define FLT16_MANT_DIG __FLT16_MANT_DIG__

scanon wrote:
> rogfer01 wrote:
> > scanon wrote:
> > > rogfer01 wrote:
> > > > My understanding is that, given that we support TS18661-2 by default,
> > > > this macro should be predefined by clang and then there is no need to
> > > > protect these macros.
> > > >
> > > > You may want to add a test for this in `test/Preprocessor/init.c`.
> > > Where do you see that the `__STDC_WANT_IEC_60559_TYPES_EXT__` macro
> > > should be predefined by clang?
> > Hi Steve,
> >
> > certainly you're right, the TS says
> >
> > > The new identifiers added to C11 library headers by this part of ISO/IEC
> > > TS-18661 are defined or declared by their respective headers only if
> > > `__STDC_WANT_IEC_60559_TYPES_EXT__` is defined as a macro at the point in
> > > the source file where the appropriate header is first included.
> >
> > so (if I read this right) these identifiers are only available if such
> > macro is defined when including `float.h`.
> >
> > Can I assume from your comment that someone else should define it? Perhaps
> > the `float.h` header itself, some other file in the C-library
> > implementation or the user of the compiler via some
> > `-D__STDC_WANT_IEC_60559_TYPES_EXT__`, but not be predefined by the
> > compiler? If this is the case, then the macros still have to be guarded
> > conditionally (as they were in the original patch).
> >
> > Does this make sense? Thanks.
> I think we could justify defining it ourselves under non-strict compilation
> modes; alternatively, system headers might define it for users in non-strict
> modes.
>
> My reading of the TS is that in strict mode, these types and macros should be
> hidden unless the user explicitly requests them by defining
> `__STDC_WANT_IEC_60559_TYPES_EXT__` themselves.

Thanks, very useful discussion and clarification. I will add some tests for this, which I indeed forgot. Cheers.

https://reviews.llvm.org/D34695
[PATCH] D34695: _Float16 preprocessor macro definitions
SjoerdMeijer updated this revision to Diff 114998.
SjoerdMeijer added a comment.

Fixed the typos, and added tests.

https://reviews.llvm.org/D34695

Files:
  lib/Frontend/InitPreprocessor.cpp
  lib/Headers/float.h
  test/Headers/float16.c
  test/Preprocessor/init.c

Index: test/Preprocessor/init.c
===
--- test/Preprocessor/init.c
+++ test/Preprocessor/init.c
@@ -301,6 +301,20 @@
 // AARCH64:#define __DBL_MIN_EXP__ (-1021)
 // AARCH64:#define __DBL_MIN__ 2.2250738585072014e-308
 // AARCH64:#define __DECIMAL_DIG__ __LDBL_DECIMAL_DIG__
+// AARCH64:#define __FLT16_DECIMAL_DIG__ 5
+// AARCH64:#define __FLT16_DENORM_MIN__ 5.9604644775390625e-8F16
+// AARCH64:#define __FLT16_DIG__ 3
+// AARCH64:#define __FLT16_EPSILON__ 9.765625e-4F16
+// AARCH64:#define __FLT16_HAS_DENORM__ 1
+// AARCH64:#define __FLT16_HAS_INFINITY__ 1
+// AARCH64:#define __FLT16_HAS_QUIET_NAN__ 1
+// AARCH64:#define __FLT16_MANT_DIG__ 11
+// AARCH64:#define __FLT16_MAX_10_EXP__ 4
+// AARCH64:#define __FLT16_MAX_EXP__ 15
+// AARCH64:#define __FLT16_MAX__ 6.5504e+4F16
+// AARCH64:#define __FLT16_MIN_10_EXP__ (-13)
+// AARCH64:#define __FLT16_MIN_EXP__ (-14)
+// AARCH64:#define __FLT16_MIN__ 6.103515625e-5F16
 // AARCH64:#define __FLT_DENORM_MIN__ 1.40129846e-45F
 // AARCH64:#define __FLT_DIG__ 6
 // AARCH64:#define __FLT_EPSILON__ 1.19209290e-7F
@@ -9071,7 +9085,7 @@
 // WEBASSEMBLY32-NEXT:#define __DECIMAL_DIG__ __LDBL_DECIMAL_DIG__
 // WEBASSEMBLY32-NOT:#define __ELF__
 // WEBASSEMBLY32-NEXT:#define __FINITE_MATH_ONLY__ 0
-// WEBASSEMBLY32-NEXT:#define __FLT_DECIMAL_DIG__ 9
+// WEBASSEMBLY32:#define __FLT_DECIMAL_DIG__ 9
 // WEBASSEMBLY32-NEXT:#define __FLT_DENORM_MIN__ 1.40129846e-45F
 // WEBASSEMBLY32-NEXT:#define __FLT_DIG__ 6
 // WEBASSEMBLY32-NEXT:#define __FLT_EPSILON__ 1.19209290e-7F
@@ -9402,7 +9416,7 @@
 // WEBASSEMBLY64-NEXT:#define __DECIMAL_DIG__ __LDBL_DECIMAL_DIG__
 // WEBASSEMBLY64-NOT:#define __ELF__
 // WEBASSEMBLY64-NEXT:#define __FINITE_MATH_ONLY__ 0
-// WEBASSEMBLY64-NEXT:#define __FLT_DECIMAL_DIG__ 9
+// WEBASSEMBLY64:#define __FLT_DECIMAL_DIG__ 9
 // WEBASSEMBLY64-NEXT:#define __FLT_DENORM_MIN__ 1.40129846e-45F
 // WEBASSEMBLY64-NEXT:#define __FLT_DIG__ 6
 // WEBASSEMBLY64-NEXT:#define __FLT_EPSILON__ 1.19209290e-7F
Index: test/Headers/float16.c
===
--- /dev/null
+++ test/Headers/float16.c
@@ -0,0 +1,65 @@
+// RUN: %clang_cc1 -fsyntax-only -verify -std=c89 -ffreestanding %s
+// RUN: %clang_cc1 -fsyntax-only -verify -std=c99 -ffreestanding %s
+// RUN: %clang_cc1 -fsyntax-only -verify -std=c11 -ffreestanding %s
+// RUN: %clang_cc1 -fsyntax-only -verify -std=c++11 -x c++ -ffreestanding %s
+// expected-no-diagnostics
+
+#define __STDC_WANT_IEC_60559_TYPES_EXT__
+#include <float.h>
+
+#ifndef FLT16_MIN_10_EXP
+#error "Macro FLT16_MIN_10_EXP is missing."
+#elif FLT16_MIN_10_EXP > -13
+#error "Macro FLT16_MIN_10_EXP is invalid."
+#endif
+
+_Static_assert(FLT16_MIN_10_EXP == __FLT16_MIN_10_EXP__, "");
+
+#ifndef FLT16_MIN_EXP
+#error "Macro FLT16_MIN_EXP is missing."
+#elif FLT16_MIN_EXP > -14
+#error "Macro FLT16_MIN_EXP is invalid."
+#endif
+
+_Static_assert(FLT16_MIN_EXP == __FLT16_MIN_EXP__, "");
+
+#ifndef FLT16_MAX_10_EXP
+#error "Macro FLT16_MAX_10_EXP is missing."
+#elif FLT16_MAX_10_EXP < 4
+#error "Macro FLT16_MAX_10_EXP is invalid."
+#endif
+
+_Static_assert(FLT16_MAX_10_EXP == __FLT16_MAX_10_EXP__, "");
+
+#ifndef FLT16_MAX_EXP
+#error "Macro FLT16_MAX_EXP is missing."
+#elif FLT16_MAX_EXP < 15
+#error "Macro FLT16_MAX_EXP is invalid."
+#endif
+
+_Static_assert(FLT16_MAX_EXP == __FLT16_MAX_EXP__, "");
+
+#ifndef FLT16_DECIMAL_DIG
+#error "Macro FLT16_DECIMAL_DIG is missing."
+#elif FLT16_DECIMAL_DIG < 5
+#error "Macro FLT16_DECIMAL_DIG is invalid."
+#endif
+
+_Static_assert(FLT16_DECIMAL_DIG == __FLT16_DECIMAL_DIG__, "");
+
+#ifndef FLT16_DIG
+#error "Macro FLT16_DIG is missing."
+#elif FLT16_DIG < 3
+#error "Macro FLT16_DIG is invalid."
+#endif
+
+_Static_assert(FLT16_DIG == __FLT16_DIG__, "");
+
+#ifndef FLT16_MANT_DIG
+#error "Macro FLT16_MANT_DIG is missing."
+#elif FLT16_MANT_DIG < 11
+#error "Macro FLT16_MANT_DIG is invalid."
+#endif
+
+_Static_assert(FLT16_MANT_DIG == __FLT16_MANT_DIG__, "");
+
Index: lib/Headers/float.h
===
--- lib/Headers/float.h
+++ lib/Headers/float.h
@@ -143,4 +143,18 @@
 # define LDBL_DECIMAL_DIG __LDBL_DECIMAL_DIG__
 #endif
+#ifdef __STDC_WANT_IEC_60559_TYPES_EXT__
+# define FLT16_MANT_DIG __FLT16_MANT_DIG__
+# define FLT16_DECIMAL_DIG __FLT16_DECIMAL_DIG__
+# define FLT16_DIG __FLT16_DIG__
+# define FLT16_MIN_EXP __FLT16_MIN_EXP__
+# define FLT16_MIN_10_EXP __FLT16_MIN_10_EXP__
+# define FLT16_MAX_EXP __FLT16_MAX_EXP__
+# define FLT16_MAX_10_EXP __FLT16_MAX_10_EXP__
+# define FLT16_MAX __FLT16_MAX
[PATCH] D34695: _Float16 preprocessor macro definitions
SjoerdMeijer added a comment. Many thanks for reviewing and your help. https://reviews.llvm.org/D34695
[PATCH] D34695: _Float16 preprocessor macro definitions
This revision was automatically updated to reflect the committed changes.
Closed by commit rL313152: This adds the _Float16 preprocessor macro definitions. (authored by SjoerdMeijer).

Changed prior to commit:
  https://reviews.llvm.org/D34695?vs=114998&id=115050#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D34695

Files:
  cfe/trunk/lib/Frontend/InitPreprocessor.cpp
  cfe/trunk/lib/Headers/float.h
  cfe/trunk/test/Headers/float16.c
  cfe/trunk/test/Preprocessor/init.c

Index: cfe/trunk/lib/Headers/float.h
===
--- cfe/trunk/lib/Headers/float.h
+++ cfe/trunk/lib/Headers/float.h
@@ -143,4 +143,18 @@
 # define LDBL_DECIMAL_DIG __LDBL_DECIMAL_DIG__
 #endif
+#ifdef __STDC_WANT_IEC_60559_TYPES_EXT__
+# define FLT16_MANT_DIG __FLT16_MANT_DIG__
+# define FLT16_DECIMAL_DIG __FLT16_DECIMAL_DIG__
+# define FLT16_DIG __FLT16_DIG__
+# define FLT16_MIN_EXP __FLT16_MIN_EXP__
+# define FLT16_MIN_10_EXP __FLT16_MIN_10_EXP__
+# define FLT16_MAX_EXP __FLT16_MAX_EXP__
+# define FLT16_MAX_10_EXP __FLT16_MAX_10_EXP__
+# define FLT16_MAX __FLT16_MAX__
+# define FLT16_EPSILON __FLT16_EPSILON__
+# define FLT16_MIN __FLT16_MIN__
+# define FLT16_TRUE_MIN __FLT16_TRUE_MIN__
+#endif /* __STDC_WANT_IEC_60559_TYPES_EXT__ */
+
 #endif /* __FLOAT_H */
Index: cfe/trunk/lib/Frontend/InitPreprocessor.cpp
===
--- cfe/trunk/lib/Frontend/InitPreprocessor.cpp
+++ cfe/trunk/lib/Frontend/InitPreprocessor.cpp
@@ -110,9 +110,11 @@
 /// PickFP - This is used to pick a value based on the FP semantics of the
 /// specified FP model.
 template <typename T>
-static T PickFP(const llvm::fltSemantics *Sem, T IEEESingleVal,
+static T PickFP(const llvm::fltSemantics *Sem, T IEEEHalfVal, T IEEESingleVal,
 T IEEEDoubleVal, T X87DoubleExtendedVal,
 T PPCDoubleDoubleVal, T IEEEQuadVal) {
+ if (Sem == (const llvm::fltSemantics*)&llvm::APFloat::IEEEhalf())
+return IEEEHalfVal;
 if (Sem == (const llvm::fltSemantics*)&llvm::APFloat::IEEEsingle())
 return IEEESingleVal;
 if (Sem == (const llvm::fltSemantics*)&llvm::APFloat::IEEEdouble())
@@ -128,26 +130,26 @@
 static void DefineFloatMacros(MacroBuilder &Builder, StringRef Prefix,
 const llvm::fltSemantics *Sem, StringRef Ext) {
 const char *DenormMin, *Epsilon, *Max, *Min;
- DenormMin = PickFP(Sem, "1.40129846e-45", "4.9406564584124654e-324",
- "3.64519953188247460253e-4951",
+ DenormMin = PickFP(Sem, "5.9604644775390625e-8", "1.40129846e-45",
+ "4.9406564584124654e-324", "3.64519953188247460253e-4951",
 "4.94065645841246544176568792868221e-324",
 "6.47517511943802511092443895822764655e-4966");
- int Digits = PickFP(Sem, 6, 15, 18, 31, 33);
- int DecimalDigits = PickFP(Sem, 9, 17, 21, 33, 36);
- Epsilon = PickFP(Sem, "1.19209290e-7", "2.2204460492503131e-16",
- "1.08420217248550443401e-19",
+ int Digits = PickFP(Sem, 3, 6, 15, 18, 31, 33);
+ int DecimalDigits = PickFP(Sem, 5, 9, 17, 21, 33, 36);
+ Epsilon = PickFP(Sem, "9.765625e-4", "1.19209290e-7",
+ "2.2204460492503131e-16", "1.08420217248550443401e-19",
 "4.94065645841246544176568792868221e-324",
 "1.92592994438723585305597794258492732e-34");
- int MantissaDigits = PickFP(Sem, 24, 53, 64, 106, 113);
- int Min10Exp = PickFP(Sem, -37, -307, -4931, -291, -4931);
- int Max10Exp = PickFP(Sem, 38, 308, 4932, 308, 4932);
- int MinExp = PickFP(Sem, -125, -1021, -16381, -968, -16381);
- int MaxExp = PickFP(Sem, 128, 1024, 16384, 1024, 16384);
- Min = PickFP(Sem, "1.17549435e-38", "2.2250738585072014e-308",
+ int MantissaDigits = PickFP(Sem, 11, 24, 53, 64, 106, 113);
+ int Min10Exp = PickFP(Sem, -13, -37, -307, -4931, -291, -4931);
+ int Max10Exp = PickFP(Sem, 4, 38, 308, 4932, 308, 4932);
+ int MinExp = PickFP(Sem, -14, -125, -1021, -16381, -968, -16381);
+ int MaxExp = PickFP(Sem, 15, 128, 1024, 16384, 1024, 16384);
+ Min = PickFP(Sem, "6.103515625e-5", "1.17549435e-38", "2.2250738585072014e-308",
 "3.36210314311209350626e-4932",
 "2.00416836000897277799610805135016e-292",
 "3.36210314311209350626267781732175260e-4932");
- Max = PickFP(Sem, "3.40282347e+38", "1.7976931348623157e+308",
+ Max = PickFP(Sem, "6.5504e+4", "3.40282347e+38", "1.7976931348623157e+308",
 "1.18973149535723176502e+4932",
 "1.79769313486231580793728971405301e+308",
 "1.18973149535723176508575932662800702e+4932");
@@ -802,6 +804,7 @@
 DefineFmt("__UINTPTR", TI.getUIntPtrType(), TI, Builder);
 DefineTypeWidth("__UINTPTR_WIDTH__", TI.getUIntPtrType(), TI, Builder);
+ DefineFloatMacros(Builder, "FLT16", &TI.getHalfFormat(), "F16");
 DefineFloatMacros(Builder, "FLT", &TI.getFl