http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58727

--- Comment #2 from Niels Penneman <niels at penneman dot org> ---
You could be right about x86 being a different issue, since the superfluous
clear is there for every single optimization level that I have tested.

In that case, for the sake of completeness:
- ARM results have also been verified on 4.6.1 and 4.7.2;
- x86_64 results have also been verified on 4.8.1.

Looking at the list of GCC primary targets, I also see MIPS and PowerPC. Here
is what happens on MIPS.

## GCC version (MIPS) #########################################################

$ mips-none-elf-gcc -###
Using built-in specs.
COLLECT_GCC=mips-none-elf-gcc
COLLECT_LTO_WRAPPER=/path/to/toolchain/libexec/gcc/mips-none-elf/4.8.1/lto-wrapper
Target: mips-none-elf
Configured with: /path/to/builddir/src/gcc-4.8.1/configure
--build=x86_64-build_unknown-linux-gnu --host=x86_64-build_unknown-linux-gnu
--target=mips-none-elf --prefix=/path/to/toolchain
--with-local-prefix=/path/to/toolchain/mips-none-elf/sysroot
--disable-libmudflap --with-sysroot=/path/to/toolchain/mips-none-elf/sysroot
--with-newlib --enable-threads=no --disable-shared
--with-pkgversion='crosstool-NG 1.19.0' --with-abi=32 --with-float=soft
--disable-__cxa_atexit --with-gmp=/path/to/builddir/mips-none-elf/buildtools
--with-mpfr=/path/to/builddir/mips-none-elf/buildtools
--with-mpc=/path/to/builddir/mips-none-elf/buildtools --with-ppl=no
--with-isl=no --with-cloog=no --with-libelf=no --disable-lto
--with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm'
--enable-target-optspace --disable-libgomp --disable-libmudflap --disable-nls
--disable-multilib --enable-languages=c
Thread model: single
gcc version 4.8.1 (crosstool-NG 1.19.0) 

$ mips64-none-elf-gcc -###
Using built-in specs.
COLLECT_GCC=./mips64-none-elf-gcc
COLLECT_LTO_WRAPPER=/path/to/toolchain/libexec/gcc/mips64-none-elf/4.8.1/lto-wrapper
Target: mips64-none-elf
Configured with: /path/to/builddir/src/gcc-4.8.1/configure
--build=x86_64-build_unknown-linux-gnu --host=x86_64-build_unknown-linux-gnu
--target=mips64-none-elf --prefix=/path/to/toolchain
--with-local-prefix=/path/to/toolchain/mips64-none-elf/sysroot
--disable-libmudflap --with-sysroot=/path/to/toolchain/mips64-none-elf/sysroot
--with-newlib --enable-threads=no --disable-shared
--with-pkgversion='crosstool-NG 1.19.0' --with-abi=64 --with-float=soft
--disable-__cxa_atexit --with-gmp=/path/to/builddir/mips64-none-elf/buildtools
--with-mpfr=/path/to/builddir/mips64-none-elf/buildtools
--with-mpc=/path/to/builddir/mips64-none-elf/buildtools --with-ppl=no
--with-isl=no --with-cloog=no --with-libelf=no --disable-lto
--with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm'
--enable-target-optspace --disable-libgomp --disable-libmudflap --disable-nls
--disable-multilib --enable-languages=c
Thread model: single
gcc version 4.8.1 (crosstool-NG 1.19.0) 

## Compiler invocation / steps to reproduce ###################################

mips-none-elf-gcc -O1 -S -o - -c testcase.c | grep -vE '^\s*\.'
mips64-none-elf-gcc -O1 -S -o - -c testcase.c | grep -vE '^\s*\.'

## Observations & expected results ############################################

The instructions generated for the "clear_set" and "set_clear" functions are
identical:

MIPS32 (o32 ABI):
        li      $2,-4259840                     # 0xffffffffffbf0000
        ori     $2,$2,0xfffd
        and     $2,$4,$2
        j       $31
        ori     $2,$2,0x2

MIPS64 (n64 ABI):
        li      $2,-4259840                     # 0xffffffffffbf0000
        ori     $2,$2,0xfffd
        and     $4,$4,$2
        j       $31
        ori     $2,$4,0x2

Just to compare to the "*_inline" variants (also both identical):

MIPS32:
        li      $2,-4259840                     # 0xffffffffffbf0000
        ori     $2,$2,0xffff
        and     $2,$4,$2
        j       $31
        ori     $2,$2,0x2

MIPS64:
        li      $2,-4259840                     # 0xffffffffffbf0000
        ori     $2,$2,0xffff
        and     $4,$4,$2
        j       $31
        ori     $2,$4,0x2

Comparing the two: in the non-inlined functions the mask used to clear bits is
0xffbffffd, while with the inlined split operations it is 0xffbfffff. The SET
bit also appears in the mask to be cleared when using "-O0", "-O2", "-O3" and
"-Os", much like on x86. Once again, judging from the clear function I doubt
that any superfluous instruction is emitted here, so there are probably no
run-time effects. I only have limited knowledge of MIPS, so correct me if I am
wrong.
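
To illustrate why the two masks should give the same result, here is a minimal
sketch in C. The values of CLEAR_MASK and SET_MASK are assumptions read off the
constants in the listings above (0xffbfffff and 0x2); testcase.c itself is not
reproduced in this comment, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, inferred from the assembly listings; not taken from
   testcase.c. */
#define CLEAR_MASK UINT32_C(0x00400000)
#define SET_MASK   UINT32_C(0x00000002)

/* Mirrors the code emitted for clear_set/set_clear: AND with 0xffbffffd,
   which also clears the SET bit, then OR with 0x2. */
uint32_t with_redundant_clear(uint32_t x)
{
    return (x & ~(CLEAR_MASK | SET_MASK)) | SET_MASK;
}

/* Mirrors the code emitted for the *_inline variants: AND with 0xffbfffff,
   then OR with 0x2. */
uint32_t with_minimal_clear(uint32_t x)
{
    return (x & ~CLEAR_MASK) | SET_MASK;
}

/* Checks that both forms agree on a few sample inputs. */
void check_equivalence(void)
{
    static const uint32_t samples[] = {
        UINT32_C(0x00000000), UINT32_C(0xffffffff),
        UINT32_C(0x00400000), UINT32_C(0x00000002),
        UINT32_C(0x12345678)
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(with_redundant_clear(samples[i]) ==
               with_minimal_clear(samples[i]));
}
```

Clearing the SET bit first makes no difference to the result because the
following OR sets it again unconditionally; only the constant differs.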

For your reference, here is the clear function (same code at all optimization
levels other than "-O0"):

MIPS32/64:
        li      $2,-4259840                     # 0xffffffffffbf0000
        ori     $2,$2,0xffff
        j       $31
        and     $2,$4,$2

It apparently needs the extra "ori" instruction regardless of whether "SET"
appears in the mask or not.
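
As far as I understand it, that is because a MIPS immediate is only 16 bits
wide, so a 32-bit constant whose lower half is non-zero takes two instructions
to build either way. The sketch below models this in C; it assumes that the
"li" shown above loads only the upper half (leaving the lower 16 bits zero) and
it ignores the sign extension to 64 bits on MIPS64. The function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models the low 32 bits of the register after "li" (upper half only)
   followed by "ori" (zero-extended 16-bit immediate). */
uint32_t build_constant(uint16_t upper, uint16_t lower)
{
    uint32_t reg = (uint32_t)upper << 16;  /* li: lower 16 bits are zero */
    return reg | lower;                    /* ori: fill in the lower half */
}

/* Both masks from the listings have a non-zero lower half, so both need
   the second instruction. */
void check_masks(void)
{
    assert(build_constant(0xffbf, 0xfffd) == UINT32_C(0xffbffffd));
    assert(build_constant(0xffbf, 0xffff) == UINT32_C(0xffbfffff));
}
```

With or without the SET bit, the lower half (0xfffd or 0xffff) is non-zero, so
dropping SET from the mask would not save an instruction here.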

In the end, the problem with ARM does look different, since it does not occur
with -O1; perhaps some optimization is able to undo the superfluous clear, but
this optimization is in turn undone by -fexpensive-optimizations.
