[Bug c++/54899] New: -fpredictive-commoning and -ftree-vectorize optimizations generate a nonsensical binary which segfaults

2012-10-11 Thread phiren at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54899



             Bug #: 54899
           Summary: -fpredictive-commoning and -ftree-vectorize
                    optimizations generate a nonsensical binary which
                    segfaults
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: phi...@gmail.com





Created attachment 28423
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28423
minimal(ish) testcase which reproduces the bug.



When this code is compiled with -O3 (or even just -O1 with
-fpredictive-commoning and -ftree-vectorize), gcc generates code which
segfaults when run.



The original code was deep inside a template metaprogramming math library. I
don't have much experience with templates, so I only managed to trim the
minimal testcase down to 60 lines. Templates may or may not be needed to
trigger the bug; it may be possible to factor them out completely.



Version info:



Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /build/src/gcc-4.7.2/configure --prefix=/usr --libdir=/usr/lib
--libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info
--with-bugurl=https://bugs.archlinux.org/
--enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared
--enable-threads=posix --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch
--enable-libstdcxx-time --enable-gnu-unique-object --enable-linker-build-id
--with-ppl --enable-cloog-backend=isl --disable-ppl-version-check
--disable-cloog-version-check --enable-lto --enable-gold --enable-ld=default
--enable-plugin --with-plugin-ld=ld.gold --with-linker-hash-style=gnu
--disable-multilib --disable-libssp --disable-build-with-cxx
--disable-build-poststage1-with-cxx --enable-checking=release
Thread model: posix
gcc version 4.7.2 (GCC)



Command line which triggers the bug:



gcc bug2.ii -o bug -O3 && ./bug



(no compiler output, ./bug will segfault when run)







Comparing the assembly output with and without predictive commoning, there are
only 3 changes, and if it weren't for an off-by-0x8 error the two versions
would be functionally identical.



-O3 -fno-predictive-commoning:



    movsd   (%rdi), %xmm1
    movsd   24(%rdi), %xmm2
    movhpd  8(%rdi), %xmm1
    movhpd  32(%rdi), %xmm2
    movapd  %xmm1, %xmm0
    movsd   16(%rdi), %xmm1





-O3:



    movsd   (%rdi), %xmm1
    movabsq $34359738384, %rax      <-- Inserted
    movsd   24(%rdi), %xmm2
    movhpd  8(%rdi), %xmm1
    movhpd  32(%rdi), %xmm2
    movapd  %xmm1, %xmm0
    movsd   (%rdi,%rax), %xmm1      <-- Changed
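
As a quick check of the "off-by-0x8" remark, the sketch below splits the
constant loaded by movabsq into its two 32-bit halves. The helper names
disp_high and disp_low are hypothetical and exist only for this illustration.
The constant 34359738384 is 0x800000010: the low half is 16, matching the
16(%rdi) displacement in the -fno-predictive-commoning output, while the high
half is 8. This only decodes the number; it does not show where the stray high
bits come from.

```c
#include <stdint.h>

/* Hypothetical helper: upper 32 bits of the 64-bit value loaded into %rax. */
static uint32_t disp_high(uint64_t disp)
{
  return (uint32_t)(disp >> 32);
}

/* Hypothetical helper: lower 32 bits of the same value. */
static uint32_t disp_low(uint64_t disp)
{
  return (uint32_t)(disp & 0xFFFFFFFFu);
}
```

Calling disp_low(34359738384ULL) yields 16 and disp_high(34359738384ULL)
yields 8, so the address used by the changed movsd is 8 * 2^32 bytes past the
intended one.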




[Bug middle-end/33989] Extra load/store for float with union

2010-11-11 Thread phiren at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33989

Scott Mansell changed:

   What|Removed |Added

 CC||phiren at gmail dot com

--- Comment #12 from Scott Mansell 2010-11-11 14:38:37 UTC ---
I'm still getting this same bug with PPC and gcc 4.5.1



--- Comment #14 from Scott Mansell 2010-11-12 03:15:21 UTC ---
I downloaded and compiled the 2010-11-6 snapshot of gcc 4.6.
I'm still getting the extra load/store in ppc with -O3.

.L.f:
lfs 0,0(3)
fadds 0,1,0
stfs 0,-16(1)
lwz 0,-16(1)
stw 0,0(4)
blr



--- Comment #15 from Scott Mansell 2010-11-12 04:04:25 UTC ---
Weirdly, it works fine with doubles.

Testcase:
union a
{
  long int i;
  double f;
};

void d(double *a, long int *b, double e)
{
  union a c;
  c.f = *a + e;
  *b = c.i;
}

Results in:
.L.d:
lfd 0,0(3)
fadd 0,1,0
stfd 0,0(4)
blr
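
For comparison, a float counterpart of the testcase above might look like the
sketch below. This is an assumption: the float testcase is not quoted in these
messages, and the names a32 and f are chosen only to mirror the double version
and the .L.f label in the earlier listing. It assumes a 32-bit int and IEEE 754
single precision.

```c
/* Assumed float analogue of the double testcase; not taken from the report. */
union a32
{
  int i;      /* assumed to be 32 bits wide */
  float f;
};

void f(float *a, int *b, float e)
{
  union a32 c;
  c.f = *a + e;   /* single-precision add */
  *b = c.i;       /* reinterpret the float's bits as an int via the union */
}
```

With *a = 1.0f and e = 1.0f, the function stores the bit pattern of 2.0f
(0x40000000) into *b.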

One thing that I noticed while looking at the assembly of my program is that
it was allocating 8 bytes on the stack for each stfs instruction. Does gcc
think that stfs writes a 64-bit value, which would prevent it from writing
directly into the space of an int?

I've checked the IBM docs: stfs does write a 32-bit value.