https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102889
Bug ID: 102889
Summary: -funroll-loops generates incorrect codes from inline assembly on aarch64
Product: gcc
Version: 11.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: ariel at amazon dot com
Target Milestone: ---

Created attachment 51651
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51651&action=edit
Source code that reproduces bug.

The attached code works correctly with both -O2 and -O3 on GCC 7 and 11.2, and with all clang versions, on AArch64. However, it generates incorrect code with:

/usr/local/gcc11/bin/gcc -O2 -funroll-loops -Wall -Werror -Wextra aa1.c -fsanitize=undefined -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations -o aa11y

The same happens at -O3: just adding -funroll-loops breaks it. This occurs on all versions of gcc and triggers printf("Mismatch %i != %i\n", res1, res2);

Attaching code to reproduce.

---
Using built-in specs.
COLLECT_GCC=/usr/local/gcc11/bin/gcc
COLLECT_LTO_WRAPPER=/usr/local/gcc11/libexec/gcc/aarch64-unknown-linux-gnu/11.2.0/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: ./configure --prefix=/usr/local/gcc11
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.2.0 (GCC)
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O3' '-funroll-loops' '-Wall' '-Werror' '-Wextra' '-march=armv8.2-a+fp16+rcpc+dotprod+crypto' '-mtune=neoverse-n1' '-fsanitize=undefined' '-fno-strict-aliasing' '-fwrapv' '-fno-aggressive-loop-optimizations' '-o' 'aa11y' '-mlittle-endian' '-mabi=lp64' '-dumpdir' 'aa11y-'
 /usr/local/gcc11/libexec/gcc/aarch64-unknown-linux-gnu/11.2.0/cc1 -E -quiet -v aa1.c -march=armv8.2-a+fp16+rcpc+dotprod+crypto -mtune=neoverse-n1 -mlittle-endian -mabi=lp64 -Wall -Werror -Wextra -funroll-loops -fsanitize=undefined -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations -O3 -fpch-preprocess -o aa11y-aa1.i
ignoring nonexistent directory "/usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0/../../../../aarch64-unknown-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0/include
 /usr/local/include
 /usr/local/gcc11/include
 /usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0/include-fixed
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O3' '-funroll-loops' '-Wall' '-Werror' '-Wextra' '-march=armv8.2-a+fp16+rcpc+dotprod+crypto' '-mtune=neoverse-n1' '-fsanitize=undefined' '-fno-strict-aliasing' '-fwrapv' '-fno-aggressive-loop-optimizations' '-o' 'aa11y' '-mlittle-endian' '-mabi=lp64' '-dumpdir' 'aa11y-'
 /usr/local/gcc11/libexec/gcc/aarch64-unknown-linux-gnu/11.2.0/cc1 -fpreprocessed aa11y-aa1.i -quiet -dumpdir aa11y- -dumpbase aa1.c -dumpbase-ext .c -march=armv8.2-a+fp16+rcpc+dotprod+crypto -mtune=neoverse-n1 -mlittle-endian -mabi=lp64 -O3 -Wall -Werror -Wextra -version -funroll-loops -fsanitize=undefined -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations -o aa11y-aa1.s
GNU C17 (GCC) version 11.2.0 (aarch64-unknown-linux-gnu)
	compiled by GNU C version 11.2.0, GMP version 6.2.1, MPFR version 4.1.0, MPC version 1.2.1, isl version none
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C17 (GCC) version 11.2.0 (aarch64-unknown-linux-gnu)
	compiled by GNU C version 11.2.0, GMP version 6.2.1, MPFR version 4.1.0, MPC version 1.2.1, isl version none
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 3921e1b032be4ab2b700d43daf3de441
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O3' '-funroll-loops' '-Wall' '-Werror' '-Wextra' '-march=armv8.2-a+fp16+rcpc+dotprod+crypto' '-mtune=neoverse-n1' '-fsanitize=undefined' '-fno-strict-aliasing' '-fwrapv' '-fno-aggressive-loop-optimizations' '-o' 'aa11y' '-mlittle-endian' '-mabi=lp64' '-dumpdir' 'aa11y-'
 as -v -EL -march=armv8.2-a+fp16+rcpc+dotprod+crypto -mabi=lp64 -o aa11y-aa1.o aa11y-aa1.s
GNU assembler version 2.29.1 (aarch64-redhat-linux) using BFD version version 2.29.1-30.amzn2
COMPILER_PATH=/usr/local/gcc11/libexec/gcc/aarch64-unknown-linux-gnu/11.2.0/:/usr/local/gcc11/libexec/gcc/aarch64-unknown-linux-gnu/11.2.0/:/usr/local/gcc11/libexec/gcc/aarch64-unknown-linux-gnu/:/usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0/:/usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/
LIBRARY_PATH=/usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0/:/usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O3' '-funroll-loops' '-Wall' '-Werror' '-Wextra' '-march=armv8.2-a+fp16+rcpc+dotprod+crypto' '-mtune=neoverse-n1' '-fsanitize=undefined' '-fno-strict-aliasing' '-fwrapv' '-fno-aggressive-loop-optimizations' '-o' 'aa11y' '-mlittle-endian' '-mabi=lp64' '-dumpdir' 'aa11y.'
 /usr/local/gcc11/libexec/gcc/aarch64-unknown-linux-gnu/11.2.0/collect2 -plugin /usr/local/gcc11/libexec/gcc/aarch64-unknown-linux-gnu/11.2.0/liblto_plugin.so -plugin-opt=/usr/local/gcc11/libexec/gcc/aarch64-unknown-linux-gnu/11.2.0/lto-wrapper -plugin-opt=-fresolution=aa11y.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --eh-frame-hdr -dynamic-linker /lib/ld-linux-aarch64.so.1 -X -EL -maarch64linux -o aa11y /lib/../lib64/crt1.o /lib/../lib64/crti.o /usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0/crtbegin.o -L/usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0 -L/usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0/../../.. aa11y-aa1.o -lubsan -lgcc --push-state --as-needed -lgcc_s --pop-state -lc -lgcc --push-state --as-needed -lgcc_s --pop-state /usr/local/gcc11/lib/gcc/aarch64-unknown-linux-gnu/11.2.0/crtend.o /lib/../lib64/crtn.o
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O3' '-funroll-loops' '-Wall' '-Werror' '-Wextra' '-march=armv8.2-a+fp16+rcpc+dotprod+crypto' '-mtune=neoverse-n1' '-fsanitize=undefined' '-fno-strict-aliasing' '-fwrapv' '-fno-aggressive-loop-optimizations' '-o' 'aa11y' '-mlittle-endian' '-mabi=lp64' '-dumpdir' 'aa11y.'