[Bug c/47617] New: SSE rounding mode works -g, not -O3

2011-02-05 Thread cck0011 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47617

   Summary: SSE rounding mode works -g, not -O3
   Product: gcc
   Version: 4.3.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: cck0...@yahoo.com


Created attachment 23252
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23252
generated .i file

Hi folks,

  I'm working with SSE intrinsics and think I have a rounding problem. When I
change modes with _MM_SET_ROUNDING_MODE, the results change as expected when
compiled with "-g", but not with "-O3".

  What am I missing?

thanks!

Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
--disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-cpu=generic --build=i386-redhat-linux
Thread model: posix
gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC)
COLLECT_GCC_OPTIONS='-O3' '-Wall' '-o' 'round' '-msse' '-mmmx' '-save-temps'
'-v' '-mtune=generic'
 /usr/libexec/gcc/i386-redhat-linux/4.3.2/cc1 -E -quiet -v round.c -msse -mmmx
-mtune=generic -Wall -O3 -fpch-preprocess -o round.i
ignoring nonexistent directory
"/usr/lib/gcc/i386-redhat-linux/4.3.2/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/i386-redhat-linux/4.3.2/../../../../i386-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/lib/gcc/i386-redhat-linux/4.3.2/include
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-O3' '-Wall' '-o' 'round' '-msse' '-mmmx' '-save-temps'
'-v' '-mtune=generic'
 /usr/libexec/gcc/i386-redhat-linux/4.3.2/cc1 -fpreprocessed round.i -quiet
-dumpbase round.c -msse -mmmx -mtune=generic -auxbase round -O3 -Wall -version
-o round.s
GNU C (GCC) version 4.3.2 20081105 (Red Hat 4.3.2-7) (i386-redhat-linux)
compiled by GNU C version 4.3.2 20081105 (Red Hat 4.3.2-7), GMP version
4.2.2, MPFR version 2.3.2.
GGC heuristics: --param ggc-min-expand=55 --param ggc-min-heapsize=48000
Compiler executable checksum: 3bee52601079f736b7b63b762646f4ba
round.c: In function ‘test_sse1_feature’:
round.c:150: warning: unused variable ‘sig’
round.c:150: warning: unused variable ‘extensions’
round.c:149: warning: ‘edx’ may be used uninitialized in this function
COLLECT_GCC_OPTIONS='-O3' '-Wall' '-o' 'round' '-msse' '-mmmx' '-save-temps'
'-v' '-mtune=generic'
 as -V -Qy -o round.o round.s
GNU assembler version 2.18.50.0.9 (i386-redhat-linux) using BFD version version
2.18.50.0.9-8.fc10 20080822
COMPILER_PATH=/usr/libexec/gcc/i386-redhat-linux/4.3.2/:/usr/libexec/gcc/i386-redhat-linux/4.3.2/:/usr/libexec/gcc/i386-redhat-linux/:/usr/lib/gcc/i386-redhat-linux/4.3.2/:/usr/lib/gcc/i386-redhat-linux/:/usr/libexec/gcc/i386-redhat-linux/4.3.2/:/usr/libexec/gcc/i386-redhat-linux/:/usr/lib/gcc/i386-redhat-linux/4.3.2/:/usr/lib/gcc/i386-redhat-linux/
LIBRARY_PATH=/usr/lib/gcc/i386-redhat-linux/4.3.2/:/usr/lib/gcc/i386-redhat-linux/4.3.2/:/usr/lib/gcc/i386-redhat-linux/4.3.2/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-O3' '-Wall' '-o' 'round' '-msse' '-mmmx' '-save-temps'
'-v' '-mtune=generic'
 /usr/libexec/gcc/i386-redhat-linux/4.3.2/collect2 --eh-frame-hdr --build-id -m
elf_i386 --hash-style=gnu -dynamic-linker /lib/ld-linux.so.2 -o round
/usr/lib/gcc/i386-redhat-linux/4.3.2/../../../crt1.o
/usr/lib/gcc/i386-redhat-linux/4.3.2/../../../crti.o
/usr/lib/gcc/i386-redhat-linux/4.3.2/crtbegin.o
-L/usr/lib/gcc/i386-redhat-linux/4.3.2 -L/usr/lib/gcc/i386-redhat-linux/4.3.2
-L/usr/lib/gcc/i386-redhat-linux/4.3.2/../../..
round.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed
-lgcc_s --no-as-needed /usr/lib/gcc/i386-redhat-linux/4.3.2/crtend.o
/usr/lib/gcc/i386-redhat-linux/4.3.2/../../../crtn.o


[Bug c/47617] SSE rounding mode works -g, not -O3

2011-02-06 Thread cck0011 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47617

--- Comment #2 from cck0011 at yahoo dot com 2011-02-06 16:25:55 UTC ---
(In reply to comment #1)
> I think you need to use -frounding-math.  GCC assumes by default the rounding
> mode is round-to-nearest.  See
> http://gcc.gnu.org/onlinedocs/gcc-4.5.2/gcc/Optimize-Options.html#index-frounding_002dmath-819
> .

Hi Andrew, 

  thanks for writing. I tried -frounding-math and the result is still the same.
Adding/removing -mfpmath=sse doesn't change it either. Is there any additional
information I can provide?

Thanks!


[Bug target/47617] SSE rounding mode works -g, not -O3

2011-02-07 Thread cck0011 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47617

--- Comment #4 from cck0011 at yahoo dot com 2011-02-08 01:37:58 UTC ---
Created attachment 23273
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23273
source file

Here's the source code. Rename to round.c.


[Bug target/47617] SSE rounding mode works -g, not -O3

2011-02-07 Thread cck0011 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47617

--- Comment #5 from cck0011 at yahoo dot com 2011-02-08 01:46:18 UTC ---
(In reply to comment #4)
> Created attachment 23273 [details]
> source file
> 
> Here's the source code. Rename to round.c.

Hi Richard,

  here's the source code. Rename to round.c. 

  I think I must be doing something wrong here. Someone would have noticed that
results from _mm_cvtps_pi16 weren't changing when _MM_SET_ROUNDING_MODE() was
called. :-) I'm puzzled by it working with -g, but not with -O3.

  Any additional information I can provide?

  Thanks!


[Bug target/47617] SSE rounding mode works -g, not -O3

2011-02-08 Thread cck0011 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47617

--- Comment #8 from cck0011 at yahoo dot com 2011-02-09 02:08:23 UTC ---
Hi folks,

  First, thanks for working on this.

  Second, I read the link and I _think_ I understand it. Let me paraphrase it
back to you and you can tell me if I've got the point:

  There is an optimizer that extracts common expressions and evaluates them
once instead of every time they occur. (What's the name of that so I can call
it by the right name?) In my code it finds the expression:

  dest = _mm_cvtps_pi16(source);

  several times. Since it doesn't see source changing, this expression only
gets evaluated once. Now, the change to rounding mode that happens with
_MM_SET_ROUNDING_MODE(...) isn't detected as something that would change the
value of the _mm_cvtps_pi16(...) expression, so the optimization is not
removed. Recognizing that change to rounding mode and reacting to it is what's
at the heart of bug 34678, and that's why this is a duplicate.

  The work-arounds are:

1) insert 'asm("":"+X"(source));' before changing rounding mode to make the
compiler re-evaluate expressions that use source.

2) do _MM_SET_ROUNDING_MODE(...) before any divisions or integer conversions
that might get optimized out. The scope of the optimization is a function body
and any inlined code. So do _MM_SET_ROUNDING_MODE early within that scope. 

  Is my understanding correct? 

  A few more questions:

  Will this bug exist on non-X86 processors?

  What does the 'asm("":"+X"(source));' expression do ?

  Will this syntax work for non-X86 processors?

  To be correct, should I compile with -frounding-math ?


Thanks!


[Bug target/47617] SSE rounding mode works -g, not -O3

2011-02-12 Thread cck0011 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47617

--- Comment #9 from cck0011 at yahoo dot com 2011-02-12 18:20:03 UTC ---
Hi folks,

  I tried the asm("":"+X"(source)); as shown. I get an error: inconsistent
operand constraints in an ‘asm’.

  The info pages make it look like this should work, but the Inline Assembly
Howto doesn't mention the X constraint. If the compiler should agree with the
info pages, I'm doing something wrong. What am I missing?
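
For readers following along, here is a minimal sketch of the first workaround. It assumes the lowercase "x" constraint (any SSE register) in place of the "X" quoted above, which is a substitution and not something this thread confirms. It also uses the scalar _mm_cvtss_si32 rather than _mm_cvtps_pi16 to keep MMX out of the picture. The helper name convert_with_mode is hypothetical. The empty asm statements are there only to pin the conversion between the two changes of rounding mode.

```c
#include <assert.h>
#include <xmmintrin.h>

/* Hypothetical helper: convert one float to int under a given MXCSR
 * rounding mode, then restore the previous mode. */
static int convert_with_mode(float value, unsigned int mode)
{
    unsigned int saved = _MM_GET_ROUNDING_MODE();
    __m128 v = _mm_set_ss(value);
    int result;

    /* Only the two rounding-control bits may be set in 'mode'. */
    assert((mode & ~(unsigned int)_MM_ROUND_MASK) == 0);

    _MM_SET_ROUNDING_MODE(mode);

    /* Assumption: "+x" tells the compiler that v may have changed here,
     * so a conversion evaluated under the old mode cannot be reused. */
    __asm__ volatile("" : "+x"(v));

    result = _mm_cvtss_si32(v);

    /* Require the result before the mode is restored. */
    __asm__ volatile("" : "+r"(result));

    _MM_SET_ROUNDING_MODE(saved);
    return result;
}
```

With this in place, convert_with_mode(1.5f, _MM_ROUND_DOWN) and convert_with_mode(1.5f, _MM_ROUND_UP) should differ regardless of optimization level, since neither conversion can be merged with the other.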

thanks


[Bug c/47825] New: SSE bitwise operations on floats work -g, fail -O3

2011-02-20 Thread cck0011 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47825

   Summary: SSE bitwise operations on floats work -g, fail -O3
   Product: gcc
   Version: 4.3.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: cck0...@yahoo.com


Created attachment 23413
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23413
.i file

Hi folks,

  I'm trying the vec_sel() example using SSE (bitwise and, or, andnot) to
select bits from two vectors to build a third. It works under -g, but fails
under -O3.
What am I missing?

thanks.
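
For reference, a minimal sketch of the selection being described, using only SSE1 intrinsics. This is not the attached setel.c. The function names and the way the mask is built (from a comparison, so no pointer casts are involved) are assumptions made for illustration.

```c
#include <assert.h>
#include <xmmintrin.h>

/* Bitwise select: take bits from b where mask is 1, from a where it is 0. */
static __m128 vec_sel(__m128 a, __m128 b, __m128 mask)
{
    /* _mm_andnot_ps(mask, a) computes (~mask) & a. */
    return _mm_or_ps(_mm_and_ps(mask, b), _mm_andnot_ps(mask, a));
}

/* Small self-check: lanes where a < 2.5 are replaced by the lanes of b. */
static void vec_sel_demo(float out[4])
{
    __m128 a = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f);
    __m128 b = _mm_setr_ps(10.0f, 20.0f, 30.0f, 40.0f);
    __m128 mask = _mm_cmplt_ps(a, _mm_set1_ps(2.5f)); /* all-ones in lanes 0, 1 */

    assert(out);
    _mm_storeu_ps(out, vec_sel(a, b, mask));
}
```

Building the mask with a comparison keeps everything in __m128 values, which sidesteps any question of how an integer array is reinterpreted as floats.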


gcc -v
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
--disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-cpu=generic --build=i386-redhat-linux
Thread model: posix
gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC)

works under:
  cc -g -o setel setel.c  -msse -mmmx -pedantic -Wall

fails under:
  cc -O3 -o setel setel.c  -msse -mmmx -pedantic -Wall


.i file attached


[Bug c/47825] SSE bitwise operations on floats work -g, fail -O3

2011-02-20 Thread cck0011 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47825

--- Comment #2 from cck0011 at yahoo dot com 2011-02-20 17:35:56 UTC ---
Created attachment 23414
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23414
source file with _may_alias_ attribute added

Hi Richard,

  thanks for replying. I added _may_alias_; no change in results.

  What am I missing?

  Source attached.

Thanks!


[Bug c/47825] SSE bitwise operations on floats work -g, fail -O3

2011-02-20 Thread cck0011 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47825

--- Comment #4 from cck0011 at yahoo dot com 2011-02-20 18:22:46 UTC ---
(In reply to comment #3)
> Probably the same on maskarray.  You should also try to update GCC to
> 4.3.5.

Hi Richard,

  added to maskarray as well: no change in results. Adding option
-fno-strict-aliasing fixes it as well. I'm confused: since _mm_store_ps
explicitly copies into floatarray why is aliasing an issue?

thanks!


[Bug target/47825] SSE bitwise operations on floats work -g, fail -O3

2011-02-21 Thread cck0011 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47825

--- Comment #7 from cck0011 at yahoo dot com 2011-02-22 00:54:17 UTC ---
(In reply to comment #5)
> The issue is that maskarray is initialized as array of ints but then you
> load it as array of floats, the scheduler re-orders those so you see
> a load from uninitialized stack:
> 
[looks puzzled; reads info page for gcc -fstrict-aliasing for the N + 1th time;
sudden look of comprehension dawns...]

"taking a pointer and casting gives undefined result..."

 Thank you Richard. That makes sense now given the context... :-)

Thanks folks!
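
A minimal sketch of one way to avoid the problem described in this comment, assuming the mask starts life as an array of 32-bit integers, as maskarray apparently did. Copying the bytes with memcpy, instead of casting the int pointer to a float pointer, gives the compiler no aliasing assumption to exploit. The name load_mask is hypothetical, and this is not the fix applied to the attached source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>

/* Hypothetical helper: reinterpret four 32-bit patterns as an SSE mask
 * without casting between pointer types. */
static __m128 load_mask(const uint32_t bits[4])
{
    __m128 mask;

    assert(sizeof mask == 4 * sizeof bits[0]);
    memcpy(&mask, bits, sizeof mask); /* byte copy, no type-punned load */
    return mask;
}

/* Usage: keep lanes 0 and 2 of v, zero lanes 1 and 3. */
static void mask_demo(float out[4])
{
    const uint32_t bits[4] = { 0xFFFFFFFFu, 0u, 0xFFFFFFFFu, 0u };
    __m128 v = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f);

    _mm_storeu_ps(out, _mm_and_ps(load_mask(bits), v));
}
```

Compilers typically turn a fixed-size memcpy like this into a plain register load, so the safe form should cost nothing at -O3.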


[Bug c/48036] New: unexpected byte swap in sse _mm_cvtpu16_ps in 64-bit 4.5.1

2011-03-08 Thread cck0011 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48036

   Summary: unexpected byte swap in sse _mm_cvtpu16_ps in 64-bit 4.5.1
   Product: gcc
   Version: 4.5.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: cck0...@yahoo.com


Created attachment 23586
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23586
.i file

Hi folks,

  I'm getting an unexpected byte swap in _mm_cvtpu16_ps. Version 32-bit 4.3.2
has the behavior I was expecting. I'm posting a program that takes a vector of
chars, widens them to vectors of shorts, and then converts the shorts to
vectors of floats. I was expecting the order of the floats in the vectors to
match the order of the chars in the starting vector. Either my understanding is
wrong or a bug crept in after 4.3.2. What am I missing?

Thanks!
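
As a point of comparison, here is a minimal sketch of the same char to short to float widening done with SSE2 integer unpacks instead of the MMX-based _mm_cvtpu16_ps. This is a different technique from the one in the attached program and says nothing about whether _mm_cvtpu16_ps itself is at fault. It assumes SSE2 is available (it is the baseline on x86-64), and the function name chars_to_floats is hypothetical.

```c
#include <assert.h>
#include <emmintrin.h>

/* Hypothetical helper: widen 8 unsigned chars to 8 floats, keeping
 * element i of the input in element i of the output. */
static void chars_to_floats(const unsigned char in[8], float out[8])
{
    __m128i zero, bytes, words, lo, hi;

    assert(in && out);

    zero  = _mm_setzero_si128();
    bytes = _mm_loadl_epi64((const __m128i *)in); /* 8 bytes in the low half */
    words = _mm_unpacklo_epi8(bytes, zero);       /* zero-extend to 8 x 16-bit */
    lo    = _mm_unpacklo_epi16(words, zero);      /* elements 0..3 as 32-bit */
    hi    = _mm_unpackhi_epi16(words, zero);      /* elements 4..7 as 32-bit */

    _mm_storeu_ps(out,     _mm_cvtepi32_ps(lo));
    _mm_storeu_ps(out + 4, _mm_cvtepi32_ps(hi));
}
```

Interleaving with a zero register zero-extends each element in place, so the output order follows the input order by construction.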

Using built-in specs.
COLLECT_GCC=/usr/bin/cc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.5.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada,lto --enable-plugin
--enable-java-awt=gtk --disable-dssi
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.5.1 20100924 (Red Hat 4.5.1-4) (GCC) 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-g' '-o' 'char_to_float' '-I.' '-msse'
'-mmmx' '-mtune=generic' '-march=x86-64'
 /usr/libexec/gcc/x86_64-redhat-linux/4.5.1/cc1 -E -quiet -v -I.
char_to_float.c -msse -mmmx -mtune=generic -march=x86-64 -g -fworking-directory
-fpch-preprocess -o char_to_float.i
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-redhat-linux/4.5.1/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-redhat-linux/4.5.1/../../../../x86_64-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
 .
 /usr/local/include
 /usr/lib/gcc/x86_64-redhat-linux/4.5.1/include
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-g' '-o' 'char_to_float' '-I.' '-msse'
'-mmmx' '-mtune=generic' '-march=x86-64'
 /usr/libexec/gcc/x86_64-redhat-linux/4.5.1/cc1 -fpreprocessed char_to_float.i
-quiet -dumpbase char_to_float.c -msse -mmmx -mtune=generic -march=x86-64
-auxbase char_to_float -g -version -o char_to_float.s
GNU C (GCC) version 4.5.1 20100924 (Red Hat 4.5.1-4) (x86_64-redhat-linux)
compiled by GNU C version 4.5.1 20100924 (Red Hat 4.5.1-4), GMP version
4.3.1, MPFR version 2.4.2, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C (GCC) version 4.5.1 20100924 (Red Hat 4.5.1-4) (x86_64-redhat-linux)
compiled by GNU C version 4.5.1 20100924 (Red Hat 4.5.1-4), GMP version
4.3.1, MPFR version 2.4.2, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: ea394b69293dd698607206e8e43d607e
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-g' '-o' 'char_to_float' '-I.' '-msse'
'-mmmx' '-mtune=generic' '-march=x86-64'
 as -V -Qy --64 -o char_to_float.o char_to_float.s
GNU assembler version 2.20.51.0.7 (x86_64-redhat-linux) using BFD version
version 2.20.51.0.7-5.fc14 20100318
COMPILER_PATH=/usr/libexec/gcc/x86_64-redhat-linux/4.5.1/:/usr/libexec/gcc/x86_64-redhat-linux/4.5.1/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/4.5.1/:/usr/lib/gcc/x86_64-redhat-linux/
LIBRARY_PATH=/usr/lib/gcc/x86_64-redhat-linux/4.5.1/:/usr/lib/gcc/x86_64-redhat-linux/4.5.1/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/gcc/x86_64-redhat-linux/4.5.1/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-g' '-o' 'char_to_float' '-I.' '-msse'
'-mmmx' '-mtune=generic' '-march=x86-64'
 /usr/libexec/gcc/x86_64-redhat-linux/4.5.1/collect2 --build-id --no-add-needed
--eh-frame-hdr -m elf_x86_64 --hash-style=gnu -dynamic-linker
/lib64/ld-linux-x86-64.so.2 -o char_to_float
/usr/lib/gcc/x86_64-redhat-linux/4.5.1/../../../../lib64/crt1.o
/usr/lib/gcc/x86_64-redhat-linux/4.5.1/../../../../lib64/crti.o
/usr/lib/gcc/x86_64-redhat-linux/4.5.1/crtbegin.o
-L/usr/lib/gcc/x86_64-redhat-linux/4.5.1
-L/usr/lib/gcc/x86_64-redhat-linux/4.5.1/../../../../lib64 -L/lib/../lib64
-L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.5.1/../../..
char_to_float.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed
-lgcc_s --no-as-needed /usr/lib/gcc/x8