date:20120612

[Bug middle-end/50749] Auto-inc-dec does not find subsequent contiguous mem accesses

2012-06-12 Thread olegendo at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50749

--- Comment #12 from Oleg Endo 2012-06-12 07:09:58 UTC ---
Author: olegendo
Date: Tue Jun 12 07:09:52 2012
New Revision: 188426

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188426
Log:
PR target/50749
* gcc.target/sh/pr50749-sf-postinc-2.c: New.
* gcc.target/sh/pr50749-sf-postinc-4.c: New.
* gcc.target/sh/pr50749-qihisi-postinc-2.c: New.
* gcc.target/sh/pr50749-qihisi-postinc-4.c: New.
* gcc.target/sh/pr50749-sf-predec-2.c: New.
* gcc.target/sh/pr50749-sf-predec-4.c: New.
* gcc.target/sh/pr50749-qihisi-predec-1.c: New.
* gcc.target/sh/pr50749-qihisi-predec-3.c: New.
* gcc.target/sh/pr50749-sf-postinc-1.c: New.
* gcc.target/sh/pr50749-sf-postinc-3.c: New.
* gcc.target/sh/pr50749-qihisi-postinc-1.c: New.
* gcc.target/sh/pr50749-qihisi-postinc-3.c: New.
* gcc.target/sh/pr50749-sf-predec-1.c: New.
* gcc.target/sh/pr50749-sf-predec-3.c: New.
* gcc.target/sh/pr50749-qihisi-predec-2.c: New.
* gcc.target/sh/pr50749-qihisi-predec-4.c: New.


Added:
trunk/gcc/testsuite/gcc.target/sh/pr50749-qihisi-postinc-1.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-qihisi-postinc-2.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-qihisi-postinc-3.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-qihisi-postinc-4.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-qihisi-predec-1.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-qihisi-predec-2.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-qihisi-predec-3.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-qihisi-predec-4.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-sf-postinc-1.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-sf-postinc-2.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-sf-postinc-3.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-sf-postinc-4.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-sf-predec-1.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-sf-predec-2.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-sf-predec-3.c
trunk/gcc/testsuite/gcc.target/sh/pr50749-sf-predec-4.c
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug c++/53613] Cannot override a inline "= default" virtual destructor.

2012-06-12 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53613

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |jason at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek 2012-06-12 07:24:07 UTC ---
As the testcase compiles with 4.4.x and trunk, but doesn't in 4.6.x and 4.7.x,
I'd call this a 4.6/4.7 Regression (unless what 4.4.x compiled it into was
completely broken).  The question is if the patch is safely backportable
though.

[Bug target/53639] x86_64: redundant 64-bit operations on 32-bit integers

2012-06-12 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53639

--- Comment #1 from Jakub Jelinek 2012-06-12 07:40:26 UTC ---
Created attachment 27606
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27606
gcc48-pr53639.patch

The first problem is that combiner combines:
(insn 9 8 10 2 (parallel [
(set (reg:SI 74 [ D.1765 ])
(and:SI (reg/v:SI 60 [ vpn ])
(const_int 1023 [0x3ff])))
(clobber (reg:CC 17 flags))
]) pr53639.c:19 378 {*andsi_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))

(insn 10 9 11 2 (set (reg:DI 75 [ D.1765 ])
(zero_extend:DI (reg:SI 74 [ D.1765 ]))) pr53639.c:19 112
{*zero_extendsidi2_rex64}
 (expr_list:REG_DEAD (reg:SI 74 [ D.1765 ])
(nil)))

into:
(insn 10 9 11 2 (parallel [
(set (reg:DI 75 [ D.1765 ])
(and:DI (subreg:DI (reg/v:SI 60 [ vpn ]) 0)
(const_int 1023 [0x3ff])))
(clobber (reg:CC 17 flags))
]) pr53639.c:19 377 {*anddi_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(expand_compound_operation in particular).  But the presence of the DImode
paradoxical subreg leads the RA to do the move in 64-bit rather than 32-bit.

The attached untested patch cures that by splitting *anddi_1 into *andsi_1_zext
so that the zero extension from SImode to DImode is done only on the result of
the and.

The second problem looks like an RA decision: initially the SI 59 register (read
from *q) and the DI 80 register (zero_extend:DI (reg:SI 59)) are both given the
eax/rax register:
  Popping a2(r80,l0)  -- assign reg 0
...
  Popping a5(r59,l0)  -- assign reg 0
but there is also an esi = (reg:SI 59) assignment in another code branch (set
up of parameters for the tail call), so in the end IRA decides to put SI 59
into %esi register, but doesn't reconsider that the corresponding DI 80
register could be very well moved to that register as well:
Assigning 4 to a5r59
Disposition:
    5:r59  l0     4    6:r60  l0     1    4:r62  l0     2    0:r70  l0     0
    3:r72  l0     5    8:r73  l0     4    7:r75  l0     2    1:r78  l0     1
    2:r80  l0     0
Nothing afterwards fixes this up then.  The REE pass does nothing, as the
zero_extend uses different registers (%rax = zext (%esi)), so it doesn't
eliminate the extension, and supposedly because of that other passes don't
consider it worthwhile to rename the regs.
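
As a rough illustration of the source-level shape behind the first problem: a 32-bit value is masked with 1023 and the result is then used as a 64-bit quantity. pr53639.c itself is not quoted in this comment, so the function name and types below are assumptions. The AND only needs SImode; the extension to DImode applies to its result.

```cpp
#include <cstdint>

// Hypothetical stand-in for the pattern; not the actual pr53639.c testcase.
std::uint64_t masked_index(std::uint32_t vpn)
{
    // Corresponds to (and:SI (reg:SI vpn) (const_int 1023)).
    std::uint32_t low = vpn & 1023u;
    // Corresponds to (zero_extend:DI (reg:SI low)); on x86_64 a 32-bit
    // operation already clears the upper 32 bits of the destination register.
    return static_cast<std::uint64_t>(low);
}
```

Because the mask fits in 32 bits, doing the AND in DImode on a paradoxical subreg gains nothing over the SImode AND followed by an implicit zero extension.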

[Bug rtl-optimization/53589] [4.7/4.8 Regression] ICE in maybe_record_trace_start with asm goto

2012-06-12 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53589

--- Comment #3 from Jakub Jelinek 2012-06-12 07:52:53 UTC ---
Author: jakub
Date: Tue Jun 12 07:52:47 2012
New Revision: 188428

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188428
Log:
PR rtl-optimization/53589
* cfgrtl.c (force_nonfallthru_and_redirect): Do asm_goto_edge
discovery even when e->dest != target.  If any LABEL_REF points
to e->dest label, redirect it to target's label.

* gcc.dg/torture/pr53589.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/torture/pr53589.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/cfgrtl.c
trunk/gcc/testsuite/ChangeLog

[Bug fortran/53642] New: Front-end optimization: Wrong string length for deferred-length strings

2012-06-12 Thread burnus at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53642

 Bug #: 53642
   Summary: Front-end optimization: Wrong string length for
deferred-length strings
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bur...@gcc.gnu.org
CC: dam...@rouson.net, tkoe...@gcc.gnu.org


As reported by Damian Rouson at
http://gcc.gnu.org/ml/fortran/2012-06/msg00069.html

Without optimization, the following program prints - and should print:
   3   3

However, with -ffrontend-optimize the result is, wrongly,
   3   4

Damian: Use -fno-frontend-optimize as work-around.

character(len=4) :: string="123 "
character(:), allocatable :: trimmed
trimmed = trim(string)
print *,len_trim(string),len(trimmed)
end


From the original dump:

(a) Without FE optimization
_gfortran_string_trim (&len.1, (void * *) &pstr.0, 4, &string);
D.1864 = len.1;
if (trimmed != 0B) goto L.1;
trimmed = (character(kind=1)[1:.trimmed] *)
   __builtin_malloc ((sizetype) len.1);
...
.trimmed = len.1;

(b) With FE optimization
if (trimmed != 0B) goto L.1;
trimmed = (character(kind=1)[1:.trimmed] *) __builtin_malloc (4);
...
.trimmed = 4;
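
The expected semantics can be mimicked outside Fortran. The following is a minimal C++ sketch; `len_trim` and `trim` here are ad hoc re-implementations written for this illustration, not gfortran runtime functions. The point is that the deferred-length variable must take the length of the trimmed value (3), not the declared length of `string` (4).

```cpp
#include <cstddef>
#include <string>

// Length without trailing blanks, modelled on Fortran's LEN_TRIM.
std::size_t len_trim(const std::string& s)
{
    std::size_t n = s.size();
    while (n > 0 && s[n - 1] == ' ')
        --n;
    return n;
}

// Copy without trailing blanks, modelled on Fortran's TRIM.
std::string trim(const std::string& s)
{
    return s.substr(0, len_trim(s));
}
```

For the input "123 " both `len_trim(s)` and `trim(s).size()` are 3, which matches the unoptimized output of the Fortran program.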

[Bug fortran/53643] New: [OOP] ICE (segfault) with INTENT(OUT) CLASS array

2012-06-12 Thread burnus at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53643

 Bug #: 53643
   Summary: [OOP] ICE (segfault) with INTENT(OUT) CLASS array
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bur...@gcc.gnu.org
CC: fanfarillo@gmail.com


Found by Alessandro Fanfarillo and me, cf.
http://gcc.gnu.org/ml/fortran/2012-06/msg00070.html

The following program gives an ICE (segfault)

type t
  integer, allocatable :: comp
end type t
contains
  subroutine foo(x)
    class(t), allocatable, intent(out) :: x(:)
  end subroutine
end



Untested draft patch:

--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -3453,8 +3453,3 @@ init_intent_out_dt (gfc_symbol * proc_sym,
   {
-   tree decl = build_fold_indirect_ref_loc (input_location,
-f->sym->backend_decl);
-   tmp = CLASS_DATA (f->sym)->backend_decl;
-   tmp = fold_build3_loc (input_location, COMPONENT_REF,
-  TREE_TYPE (tmp), decl, tmp, NULL_TREE);
-   tmp = build_fold_indirect_ref_loc (input_location, tmp);
+   tmp = gfc_class_data_get (f->sym->backend_decl);
tmp = gfc_deallocate_alloc_comp (CLASS_DATA (f->sym)->ts.u.derived,

[Bug c++/53549] g++ and armadillo 3.2.0, operator() is inaccessible

2012-06-12 Thread conradsand.arma at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53549

--- Comment #4 from Conrad 2012-06-12 08:59:54 UTC ---
Created attachment 27607
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27607
pre-processed source exposing the bug

bug confirmed on Fedora 17, using gcc version 4.7.0 20120507 (Red Hat 4.7.0-5)

attached is "arma.ii.gz", generated using:
g++ -v -save-temps -O2 -o arma arma.cpp


output of g++ -v:
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.7.0/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --disable-build-with-cxx
--disable-build-poststage1-with-cxx --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id --with-linker-hash-style=gnu
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin
--enable-initfini-array --enable-java-awt=gtk --disable-dssi
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.7.0 20120507 (Red Hat 4.7.0-5) (GCC)

[Bug c++/53549] [4.7/4.8 Regression] g++ and armadillo 3.2.0, operator() is inaccessible

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53549

Richard Guenther changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |rejects-valid
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-06-12
      Known to work|                            |4.6.3
   Target Milestone|---                         |4.7.2
            Summary|g++ and armadillo 3.2.0,    |[4.7/4.8 Regression] g++
                   |operator() is inaccessible  |and armadillo 3.2.0,
                   |                            |operator() is inaccessible
     Ever Confirmed|0                           |1

--- Comment #5 from Richard Guenther 2012-06-12 09:18:57 UTC ---
Confirmed.  May be a deliberate change to conform to the C++ standard.

[Bug tree-optimization/50569] [4.6/4.7 regression] unaligned memory accesses generated for memcpy

2012-06-12 Thread liujiangning at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50569

--- Comment #16 from Jiangning Liu 2012-06-12 09:24:21 UTC ---
Author: liujiangning
Date: Tue Jun 12 09:24:11 2012
New Revision: 188431

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188431
Log:
2011-06-12  Jiangning Liu

Backport r182252 from mainline
2011-12-12  Eric Botcazou  

PR tree-optimization/50569
* tree-sra.c (build_ref_for_model): Replicate a chain of COMPONENT_REFs
in the expression of MODEL instead of just the last one.

2011-06-12  Jiangning Liu

Backport r182252 from mainline
2011-12-12  Eric Botcazou  

PR tree-optimization/50569
* gcc.c-torture/execute/20111212-1.c: New test.


Added:
branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.c-torture/execute/20111212-1.c
Modified:
branches/ARM/embedded-4_6-branch/gcc/ChangeLog.arm
branches/ARM/embedded-4_6-branch/gcc/testsuite/ChangeLog.arm
branches/ARM/embedded-4_6-branch/gcc/tree-sra.c

[Bug tree-optimization/53640] Missed cmove with stores

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53640

Richard Guenther changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WORKSFORME

--- Comment #2 from Richard Guenther 2012-06-12 09:29:10 UTC ---
cond_if_else_store_replacement should do this, thus cselim.  But you need
-O3 as otherwise --param max-stores-to-sink is zero:

  /* Set PARAM_MAX_STORES_TO_SINK to 0 if either vectorization or if-conversion
     is disabled.  */
  if (!opts->x_flag_tree_vectorize || !opts->x_flag_tree_loop_if_convert)
    maybe_set_param_value (PARAM_MAX_STORES_TO_SINK, 0,
                           opts->x_param_values, opts_set->x_param_values);
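
For reference, the kind of code that cond_if_else_store_replacement targets is a store to the same location on both arms of a conditional. The following is a minimal sketch with hypothetical function names; the testcase from the report is not quoted in this comment. The two functions are equivalent, and the second shows the sunk form with a single unconditional store, which can then become a conditional move.

```cpp
// Store on both branches: the form before the transformation.
void store_both_arms(int* p, bool cond, int a, int b)
{
    if (cond)
        *p = a;
    else
        *p = b;
}

// Single store of a selected value: the form after sinking the stores.
void store_selected(int* p, bool cond, int a, int b)
{
    int value = cond ? a : b;
    *p = value;
}
```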

[Bug c++/53549] [4.7/4.8 Regression] g++ and armadillo 3.2.0, operator() is inaccessible

2012-06-12 Thread conradsand.arma at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53549

--- Comment #6 from Conrad 2012-06-12 09:42:08 UTC ---
bug not present when compiling with Clang 3.0

(I've found clang to often have more thorough/readable diagnostics than gcc)

output of clang -v:
clang version 3.0 (tags/RELEASE_30/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix

[Bug tree-optimization/51070] [4.6/4.7 Regression] ICE verify_gimple failed

2012-06-12 Thread liujiangning at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51070

--- Comment #10 from Jiangning Liu 2012-06-12 09:44:28 UTC ---
Author: liujiangning
Date: Tue Jun 12 09:44:24 2012
New Revision: 188432

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188432
Log:
2011-06-12  Jiangning Liu

Backport r182839 from mainline
2012-01-03  Richard Guenther  

PR tree-optimization/51070
* tree-loop-distribution.c (generate_builtin): Do not replace
the loop with a builtin if the partition contains statements which
results are used outside of the loop.
(stmt_has_scalar_dependences_outside_loop): Properly handle calls.

2011-06-12  Jiangning Liu

Backport r182839 from mainline
2012-01-03  Richard Guenther  

PR tree-optimization/51070
* gcc.dg/torture/pr51070.c: New testcase.
* gcc.dg/torture/pr51070-2.c: Likewise.


Added:
branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.dg/torture/pr51070-2.c
branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.dg/torture/pr51070.c
Modified:
branches/ARM/embedded-4_6-branch/gcc/ChangeLog.arm
branches/ARM/embedded-4_6-branch/gcc/testsuite/ChangeLog.arm
branches/ARM/embedded-4_6-branch/gcc/tree-loop-distribution.c

[Bug fortran/53643] [OOP] ICE (segfault) with INTENT(OUT) CLASS array

2012-06-12 Thread burnus at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53643

Tobias Burnus changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #1 from Tobias Burnus 2012-06-12 09:50:40 UTC ---
The patch fails at run time for  gfortran.dg/typebound_operator_13.f03  in
assign. In GCC 4.7, the generated code is:
  if (lhs->_data->position.data != 0B)

With 4.8 and the patch, one has:
  if (lhs->_data.position.data != 0B)
   ^^^

[Bug tree-optimization/51042] [4.5 Regression] endless recursion in phi_translate

2012-06-12 Thread liujiangning at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51042

--- Comment #9 from Jiangning Liu 2012-06-12 09:53:57 UTC ---
Author: liujiangning
Date: Tue Jun 12 09:53:53 2012
New Revision: 188433

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188433
Log:
2011-06-12  Jiangning Liu

Backport r181256 from mainline
2011-11-10  Richard Guenther  

PR tree-optimization/51042
* tree-ssa-pre.c (phi_translate_1): Avoid recursing on
self-referential expressions.  Refactor code to avoid duplication.

2011-06-12  Jiangning Liu

Backport r181256 from mainline
2011-11-10  Richard Guenther  


Added:
branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.dg/torture/pr51042.c
Modified:
branches/ARM/embedded-4_6-branch/gcc/ChangeLog.arm
branches/ARM/embedded-4_6-branch/gcc/testsuite/ChangeLog.arm
branches/ARM/embedded-4_6-branch/gcc/tree-ssa-pre.c

[Bug rtl-optimization/53533] [4.7/4.8 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

Richard Guenther changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*
             Status|WAITING                     |NEW
      Known to work|                            |4.6.3
           Keywords|                            |missed-optimization
          Component|middle-end                  |rtl-optimization
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org
            Summary|[4.7 regression] loop       |[4.7/4.8 regression]
                   |unrolling as measured by    |vectorization causes loop
                   |Adobe's C++Benchmark is     |unrolling test slowdown as
                   |twice as slow versus        |measured by Adobe's
                   |4.4-4.6                     |C++Benchmark
      Known to fail|                            |4.7.1, 4.8.0
           Severity|major                       |normal

--- Comment #6 from Richard Guenther 2012-06-12 09:54:02 UTC ---
Ok, it seems to me that this has template-metaprogramming loop unrolling.  With
GCC 4.7 we unroll and vectorize all loops, for example unroll factor 8 looks
like

:
  # vect_var_.941_3474 = PHI 
  # vect_var_.941_3473 = PHI 
  # ivtmp.1325_970 = PHI 
  D.9934_819 = (void *) ivtmp.1325_970;
  vect_var_.918_323 = MEM[base: D.9934_819, offset: 0B];
  vect_var_.919_325 = MEM[base: D.9934_819, offset: 16B];
  vect_var_.920_328 = vect_var_.918_323 + { 12345, 12345, 12345, 12345 };
  vect_var_.920_330 = vect_var_.919_325 + { 12345, 12345, 12345, 12345 };
  vect_var_.923_480 = vect_var_.920_328 * { 914237, 914237, 914237, 914237 };
  vect_var_.923_895 = vect_var_.920_330 * { 914237, 914237, 914237, 914237 };
  vect_var_.926_231 = vect_var_.923_480 + { 12332, 12332, 12332, 12332 };
  vect_var_.926_232 = vect_var_.923_895 + { 12332, 12332, 12332, 12332 };
  vect_var_.929_235 = vect_var_.926_231 * { 914237, 914237, 914237, 914237 };
  vect_var_.929_236 = vect_var_.926_232 * { 914237, 914237, 914237, 914237 };
  vect_var_.932_239 = vect_var_.929_235 + { 12332, 12332, 12332, 12332 };
  vect_var_.932_240 = vect_var_.929_236 + { 12332, 12332, 12332, 12332 };
  vect_var_.935_113 = vect_var_.932_239 * { 914237, 914237, 914237, 914237 };
  vect_var_.935_247 = vect_var_.932_240 * { 914237, 914237, 914237, 914237 };
  vect_var_.938_582 = vect_var_.935_113 + { -13, -13, -13, -13 };
  vect_var_.938_839 = vect_var_.935_247 + { -13, -13, -13, -13 };
  vect_var_.941_3472 = vect_var_.938_582 + vect_var_.941_3474;
  vect_var_.941_3471 = vect_var_.938_839 + vect_var_.941_3473;
  ivtmp.1325_812 = ivtmp.1325_970 + 32;
  if (ivtmp.1325_812 != D.9937_388)
goto ;
  else
goto ;

:
  # vect_var_.941_3468 = PHI 
  # vect_var_.941_3467 = PHI 
  vect_var_.945_3466 = vect_var_.941_3468 + vect_var_.941_3467;
  vect_var_.946_3465 = vect_var_.945_3466 v>> 64;
  vect_var_.946_3464 = vect_var_.946_3465 + vect_var_.945_3466;
  vect_var_.946_3463 = vect_var_.946_3464 v>> 32;
  vect_var_.946_3462 = vect_var_.946_3463 + vect_var_.946_3464;
  stmp_var_.944_3461 = BIT_FIELD_REF ;
  init_value.7_795 = init_value;
  D.8606_796 = (int) init_value.7_795;
  D.8600_797 = D.8606_796 + 12345;
  D.8599_798 = D.8600_797 * 914237;
  D.8602_799 = D.8599_798 + 12332;
  D.8601_800 = D.8602_799 * 914237;
  D.8604_801 = D.8601_800 + 12332;
  D.8603_802 = D.8604_801 * 914237;
  D.8605_803 = D.8603_802 + -13;
  temp_804 = D.8605_803 * 8000;
  if (temp_804 != stmp_var_.944_3461)
goto ;
  else
goto ;


With GCC 4.6 OTOH the above loop is not vectorized, only the (slow) not
unrolled loop is.

:
  # result_622 = PHI 
  # ivtmp.852_1026 = PHI 
  D.9283_3302 = (void *) ivtmp.852_1026;
  temp_801 = MEM[base: D.9283_3302, offset: 0B];
  D.8366_802 = temp_801 + 12345;
  D.8365_803 = D.8366_802 * 914237;
  D.8368_804 = D.8365_803 + 12332;
  D.8367_805 = D.8368_804 * 914237;
  D.8370_806 = D.8367_805 + 12332;
  D.8369_807 = D.8370_806 * 914237;
  temp_808 = D.8369_807 + -13;
  result_810 = temp_808 + result_622;
  temp_815 = MEM[base: D.9283_3302, offset: 4B];
  D.8381_816 = temp_815 + 12345;
  D.8382_817 = D.8381_816 * 914237;
  D.8378_818 = D.8382_817 + 12332;
  D.8379_819 = D.8378_818 * 914237;
  D.8376_820 = D.8379_819 + 12332;
  D.8377_821 = D.8376_820 * 914237;
  temp_822 = D.8377_821 + -13;
  result_824 = result_810 + temp_822;
  temp_788 = MEM[base: D.9283_3302, offset: 8B];
  D.8351_789 = temp_788 + 12345;
  D.8352_790 = D.8351_789 * 914237;
  D.8348_791 = D.8352_790 + 12332;
  D.8349_792 = D.8348_791 * 914237;
  D.8346_793 = D.8349_792 + 12332;
  D.8347_794 = D.8346_793 * 914237;
  temp_795 = D.8347_794 + -13;
  result_797 = temp_795 + result_824;
  temp_774 = MEM[base: D.9283_3302, offset: 12B];
  D.8333_775 = temp_774 + 12345;
  D.8334_776 = D.8333_775 * 914237;
  D.8330_777 = D.

[Bug rtl-optimization/53533] [4.7/4.8 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #7 from Richard Guenther 2012-06-12 10:11:51 UTC ---
Btw, when I run the benchmark with the addition of -march=native (for me,
that's -march=corei7) then GCC 4.7 performs better than 4.6:

4.6:

./t 10 

test          description          absolute   operations   ratio with
number                             time       per second   test0

 0 "int32_t for loop unroll 1"     0.41 sec   1951.22 M    1.00
 1 "int32_t for loop unroll 2"     0.51 sec   1568.63 M    1.24
 2 "int32_t for loop unroll 3"     0.47 sec   1702.13 M    1.15
 3 "int32_t for loop unroll 4"     0.48 sec   1666.67 M    1.17
 4 "int32_t for loop unroll 5"     0.47 sec   1702.13 M    1.15
 5 "int32_t for loop unroll 6"     0.51 sec   1568.63 M    1.24
 6 "int32_t for loop unroll 7"     0.47 sec   1702.13 M    1.15
 7 "int32_t for loop unroll 8"     0.47 sec   1702.13 M    1.15

Total absolute time for int32_t for loop unrolling: 3.79 sec

4.7:

./t 10 

test          description          absolute   operations   ratio with
number                             time       per second   test0

 0 "int32_t for loop unroll 1"     0.39 sec   2051.28 M    1.00
 1 "int32_t for loop unroll 2"     0.40 sec   2000.00 M    1.03
 2 "int32_t for loop unroll 3"     0.39 sec   2051.28 M    1.00
 3 "int32_t for loop unroll 4"     0.39 sec   2051.28 M    1.00
 4 "int32_t for loop unroll 5"     0.38 sec   2105.26 M    0.97
 5 "int32_t for loop unroll 6"     0.41 sec   1951.22 M    1.05
 6 "int32_t for loop unroll 7"     0.37 sec   2162.16 M    0.95
 7 "int32_t for loop unroll 8"     0.36 sec   2222.22 M    0.92

Total absolute time for int32_t for loop unrolling: 3.09 sec

The loop then looks like (the expected)

.L53:
movdqa  (%rax), %xmm4
paddd   %xmm3, %xmm4
pmulld  %xmm0, %xmm4
paddd   %xmm1, %xmm4
pmulld  %xmm0, %xmm4
paddd   %xmm1, %xmm4
pmulld  %xmm0, %xmm4
paddd   %xmm2, %xmm4
paddd   %xmm4, %xmm6
movdqa  16(%rax), %xmm4
addq$32, %rax
cmpq$data32+32000, %rax
paddd   %xmm3, %xmm4
pmulld  %xmm0, %xmm4
paddd   %xmm1, %xmm4
pmulld  %xmm0, %xmm4
paddd   %xmm1, %xmm4
pmulld  %xmm0, %xmm4
paddd   %xmm2, %xmm4
paddd   %xmm4, %xmm5
jne .L53

looks like pmulld is only available with SSE 4.1 and otherwise we fall back
to the define_insn_and_split "*sse2_mulv4si3".  But that complexity is not
reflected in the vectorizer cost model (which needs improvement ...).
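
The per-element arithmetic can be read off the GIMPLE dumps in comment #6. The following is a scalar sketch only: it assumes unsigned 32-bit arithmetic so that wraparound is well defined (the benchmark itself uses int32_t), and the function names are made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// One element of the kernel: add 12345, then three multiplies by 914237
// interleaved with +12332, +12332 and -13, as in the GIMPLE dump.
std::uint32_t kernel(std::uint32_t x)
{
    x += 12345u;
    x *= 914237u;
    x += 12332u;
    x *= 914237u;
    x += 12332u;
    x *= 914237u;
    x -= 13u;
    return x;
}

// Sum of the kernel over an array; this is the loop the vectorizer handles.
std::uint32_t sum_kernel(const std::uint32_t* data, std::size_t n)
{
    std::uint32_t result = 0;
    for (std::size_t i = 0; i < n; ++i)
        result += kernel(data[i]);
    return result;
}
```

Each element needs three 32-bit multiplies, so the cost of a V4SI multiply (one pmulld with SSE4.1, a multi-instruction split without it) dominates the vectorized loop. When all 8000 elements hold the same value, the sum equals `kernel(value) * 8000`, which is the check visible at the end of the 4.7 dump.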

[Bug rtl-optimization/53533] [4.7/4.8 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #8 from Richard Guenther 2012-06-12 10:27:15 UTC ---
Small testcase:

int a[256];
int b[256];

void foo (void)
{
  int i;
  for (i = 0; i < 256; ++i)
    {
      b[i] = a[i] * 23;
    }
}

you can see that we shuffle even the vector with constants around!  Not taking
into account the REG_EQUAL note which is gone at split1 time, removed by
either loop2_invariant or loop2_unswitch.

(insn 26 24 27 3 (set (reg:V4SI 82 [ vect_var_.10 ])
(mult:V4SI (reg:V4SI 83 [ MEM[symbol: a, index: ivtmp.20_9, offset: 0B]
])
(reg:V4SI 85))) t.c:9 1496 {*sse2_mulv4si3}
 (expr_list:REG_EQUAL (mult:V4SI (reg:V4SI 83 [ MEM[symbol: a, index:
ivtmp.20_9, offset: 0B] ])
(const_vector:V4SI [
(const_int 23 [0x17])
(const_int 23 [0x17])
(const_int 23 [0x17])
(const_int 23 [0x17])
]))
(expr_list:REG_DEAD (reg:V4SI 84)
(expr_list:REG_DEAD (reg:V4SI 83 [ MEM[symbol: a, index:
ivtmp.20_9, offset: 0B] ])
(nil)))))

[Bug c++/53549] [4.7/4.8 Regression] g++ and armadillo 3.2.0, operator() is inaccessible

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53549

Jonathan Wakely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fabien at gcc dot gnu.org

--- Comment #7 from Jonathan Wakely 2012-06-12 10:33:26 UTC ---
I *think* this is equivalent:

template
class M
{
public:
int operator()(unsigned);
};

template
class C : public M
{
public:
using M::operator();

int operator()();


template
class F : public C
{
public:

using C::operator();
};

};

int main()
{
C::F<2> f;
f();
}

using.cc: In instantiation of 'class C::F<2>':
using.cc:29:18:   required from here
using.cc:14:9: error: 'int C::operator()()' is inaccessible
 int operator()();
 ^
using.cc:18:15: error: within this context
 class F : public C
   ^

There's only one "inaccessible" error because there's only one operator()
overload in the reduced example; adding a const overload (as in the original)
gives two "inaccessible" errors.

This looks like another "using" issue, CC'ing Fabien.

N.B. Clang++ accepts the example above, but Comeau online rejects this because
C::F uses C before it is complete.  I think Comeau is correct.

Moving the definition of F to a point where C is complete is accepted by
Comeau and Clang and older G++ versions but gives a different error with G++
trunk:

template
class M
{
public:
int operator()(unsigned);
};

template
class C : public M
{
public:
using M::operator();

int operator()();

template class F;

};

template
template
class C::F : public C
{
public:

using C::operator();
};

int main()
{
C::F<2> f;
f();
}

using.cc:26:30: error: no members matching 'C::operator()' in 'class C'
 using C::operator();
  ^

[Bug rtl-optimization/53533] [4.7/4.8 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #9 from Richard Guenther 2012-06-12 10:39:19 UTC ---
And cprop fails to propagate

  (reg:V4SI 85) := (const_vector:V4SI [
(const_int 23 [0x17])
(const_int 23 [0x17])
(const_int 23 [0x17])
(const_int 23 [0x17])
])

but it at least re-adds the REG_EQUAL note, but DSE drops it again.  From

(insn 26 24 27 3 (set (reg:V4SI 82 [ vect_var_.10 ])
(mult:V4SI (reg:V4SI 83 [ MEM[symbol: a, index: ivtmp.20_9, offset: 0B]
])
(reg:V4SI 85))) t.c:9 1496 {*sse2_mulv4si3}
 (expr_list:REG_EQUAL (mult:V4SI (reg:V4SI 83 [ MEM[symbol: a, index:
ivtmp.20_9, offset: 0B] ])
(const_vector:V4SI [
(const_int 23 [0x17])
(const_int 23 [0x17])
(const_int 23 [0x17])
(const_int 23 [0x17])
]))
(expr_list:REG_DEAD (reg:V4SI 85)
(expr_list:REG_DEAD (reg:V4SI 83 [ MEM[symbol: a, index:
ivtmp.20_9, offset: 0B] ])
(nil)))))


we go to

(insn 26 24 27 3 (set (reg:V4SI 82 [ vect_var_.10 ])
(mult:V4SI (reg:V4SI 83 [ MEM[symbol: a, index: ivtmp.20_9, offset: 0B]
])
(reg:V4SI 85))) t.c:9 1496 {*sse2_mulv4si3}
 (expr_list:REG_DEAD (reg:V4SI 83 [ MEM[symbol: a, index: ivtmp.20_9,
offset: 0B] ])
(nil)))

Unfortunately there is no cprop pass after split1 to eventually clean things
up again (because of out-of-cfg-layout-mode ...).  If I force it to run
it cannot simplify

(insn 42 24 43 3 (set (subreg:V2DI (reg:V4SI 86) 0)
(mult:V2DI (zero_extend:V2DI (vec_select:V2SI (reg:V4SI 83 [
MEM[symbol: a, index: ivtmp.20_9, offset: 0B] ])
(parallel [
(const_int 0 [0])
(const_int 2 [0x2])
])))
(zero_extend:V2DI (vec_select:V2SI (reg:V4SI 85)
(parallel [
(const_int 0 [0])
(const_int 2 [0x2])
]))))) t.c:9 -1
 (nil))

either though.
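
Without SSE4.1 there is no single instruction for an element-wise 32-bit vector multiply, and insn 42 shows the first piece of the split: lanes 0 and 2 are zero-extended and multiplied into a V2DI. The following is a portable scalar model of that idea, as a sketch only. The real split works on XMM registers with shuffles and widening multiplies; `V4` and `mul_v4si_model` are names invented here.

```cpp
#include <array>
#include <cstdint>

using V4 = std::array<std::uint32_t, 4>;

// Models a V4SI multiply built from widening 32x32->64 multiplies.
V4 mul_v4si_model(const V4& a, const V4& b)
{
    // Even lanes (0 and 2): zero-extend and multiply, as in insn 42.
    std::uint64_t p0 = static_cast<std::uint64_t>(a[0]) * b[0];
    std::uint64_t p2 = static_cast<std::uint64_t>(a[2]) * b[2];
    // Odd lanes (1 and 3): the same widening multiply after moving them
    // into the even positions.
    std::uint64_t p1 = static_cast<std::uint64_t>(a[1]) * b[1];
    std::uint64_t p3 = static_cast<std::uint64_t>(a[3]) * b[3];
    // Keep the low 32 bits of each product and interleave the lanes again.
    return V4{{static_cast<std::uint32_t>(p0), static_cast<std::uint32_t>(p1),
               static_cast<std::uint32_t>(p2), static_cast<std::uint32_t>(p3)}};
}
```

When one operand is a known constant vector such as { 23, 23, 23, 23 }, the lane shuffles applied to it are pure overhead, which is why losing the REG_EQUAL note before split1 hurts.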

[Bug c++/53549] [4.7/4.8 Regression] g++ and armadillo 3.2.0, operator() is inaccessible

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53549

--- Comment #8 from Jonathan Wakely 2012-06-12 10:39:20 UTC ---
Further reduced, a single example showing both errors:

template
struct C
{
int operator()();

template
struct F : C
{
using C::operator();
};
};

template
struct C2
{
int operator()();

template struct F2;
};


template
template
struct C2::F2 : C2
{
using C2::operator();
};

C::F<2> f1;

using.cc:26:31: error: no members matching 'C2::operator()' in 'struct
C2'
 using C2::operator();
   ^
using.cc: In instantiation of 'struct C::F<2>':
using.cc:29:14:   required from here
using.cc:4:9: error: 'int C::operator()()' is inaccessible
 int operator()();
 ^
using.cc:7:16: error: within this context
 struct F : C
^
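
The template parameter lists and arguments of the testcases in this thread were lost in this archive copy (`template` with nothing after it, `C::F<2>`). The following is a compilable sketch of the out-of-class variant. It assumes a type parameter `T`, a non-type parameter `int N`, and `int` as the template argument; these names, the function body, and the helper `call_inherited` are guesses for illustration, not the original text.

```cpp
template<typename T>
struct C2
{
    // Body added so the call can be checked; the report only declares it.
    int operator()() { return 42; }

    template<int N> struct F2;
};

// Nested class template defined at a point where C2<T> is complete.
template<typename T>
template<int N>
struct C2<T>::F2 : C2<T>
{
    using C2<T>::operator();
};

// Hypothetical helper: instantiate F2 and call the inherited operator().
int call_inherited()
{
    C2<int>::F2<2> f;
    return f();
}
```

This is the form that, according to comment #7, Comeau, Clang and older G++ versions accept, while G++ trunk at the time reported "no members matching".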

[Bug target/53639] x86_64: redundant 64-bit operations on 32-bit integers

2012-06-12 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53639

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #27606|0                           |1
        is obsolete|                            |

--- Comment #2 from Jakub Jelinek 2012-06-12 10:41:18 UTC ---
Created attachment 27608
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27608
gcc48-pr53639.patch

Unfortunately that patch regressed pr49095.c testcase.  So, either we limit the
splitter to the paradoxical subreg that is created by the combiner when seeing
SImode and followed by zero_extend to DImode of the result (done in this
patch), or we'd need to add new peepholes for the a = mem; a &= const; mem = a;
if (a)
cases where a &= const has been transformed into andsi_1_zext.  Uros, any
preference?

[Bug lto/53604] ld reports errors using lto after upgrading from gcc-4.6.2 to gcc-4.7.0

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53604

Richard Guenther changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE

--- Comment #4 from Richard Guenther 2012-06-12 10:42:15 UTC ---
dup

*** This bug has been marked as a duplicate of bug 53572 ***

[Bug lto/53572] Some public symbols don't get to serialized LTO

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53572

Richard Guenther changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |paul.scruby at ghco dot
                   |                            |co.uk

--- Comment #7 from Richard Guenther 2012-06-12 10:42:15 UTC ---
*** Bug 53604 has been marked as a duplicate of this bug. ***

[Bug c++/53613] Cannot override a inline "= default" virtual destructor.

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53613

--- Comment #8 from Jonathan Wakely 2012-06-12 10:47:58 UTC ---
I think it compiles with 4.4 because __cook::~__cook is not noexcept, because
4.4 doesn't infer an empty throw spec for a trivial destructor.

If you add throw() to ~__cook you get the same errors about looser throw
specifiers.  So 4.4 was generating different code, but in this example it makes
no practical difference because you can't use the noexcept operator in 4.4 to
test whether something has an empty exception specifier anyway.
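
The rule under discussion: in C++11 a defaulted or implicitly declared destructor gets a non-throwing exception specification (as long as no base or member destructor can throw), so an overrider must not have a looser one. The following is a small sketch that checks this with a type trait; the class names and the helper `dtor_is_nothrow` are placeholders, not the reporter's `__cook` testcase.

```cpp
#include <type_traits>

struct Base
{
    virtual ~Base() = default;   // implicitly noexcept in C++11 and later
};

struct Derived : Base
{
    ~Derived() override {}       // user-provided, still implicitly noexcept
};

// True when destroying T cannot throw according to its exception specification.
template<typename T>
constexpr bool dtor_is_nothrow()
{
    return std::is_nothrow_destructible<T>::value;
}

static_assert(dtor_is_nothrow<Base>(), "defaulted destructor should be noexcept");
static_assert(dtor_is_nothrow<Derived>(), "overrider should be noexcept too");
```

A compiler without this inference, as described for 4.4, would simply see two destructors with no exception specification and accept the override.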

[Bug rtl-optimization/53533] [4.7/4.8 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

Richard Guenther changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stevenb.gcc at gmail dot
                   |                            |com

--- Comment #10 from Richard Guenther 2012-06-12 11:57:20 UTC ---
Changing the insn_and_split to

(define_insn_and_split "*sse2_mulv4si3"
  [(set (match_operand:V4SI 0 "register_operand")
(mult:V4SI (match_operand:V4SI 1 "register_operand")
   (match_operand:V4SI 2 "nonmemory_vector_operand")))]
...

and defining

(define_predicate "nonmemory_vector_operand"
(ior (match_operand 0 "register_operand")
 (match_code "const_vector")))

we ICE because when splitting

(insn 26 24 27 3 (set (reg:V4SI 82 [ vect_var_.10 ])
(mult:V4SI (reg:V4SI 83 [ MEM[symbol: a, index: ivtmp.20_9, offset: 0B]
])
(const_vector:V4SI [
(const_int 23 [0x17])
(const_int 23 [0x17])
(const_int 23 [0x17])
(const_int 23 [0x17])
]))) t.c:9 1496 {*sse2_mulv4si3}
 (expr_list:REG_DEAD (reg:V4SI 83 [ MEM[symbol: a, index: ivtmp.20_9,
offset: 0B] ])
(nil)))

we don't even try to simplify when emitting the code.

But maybe allowing const_vector in (some of) the define_insn_and_split would
be the way to go ...

[Bug c++/53549] [4.7/4.8 Regression] g++ and armadillo 3.2.0, operator() is inaccessible

2012-06-12 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53549

--- Comment #9 from Jakub Jelinek  2012-06-12 
12:17:37 UTC ---
This started to be rejected with
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182711

[Bug target/53639] x86_64: redundant 64-bit operations on 32-bit integers

2012-06-12 Thread ubizjak at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53639

--- Comment #3 from Uros Bizjak  2012-06-12 12:21:07 
UTC ---
(In reply to comment #2)

> Unfortunately that patch regressed pr49095.c testcase.  So, either we limit the
> splitter to the paradoxical subreg that is created by the combiner when seeing
> SImode and followed by zero_extend to DImode of the result (done in this
> patch), or we'd need to add new peepholes for the a = mem; a &= const; mem = a;
> if (a)
> cases where a &= const has been transformed into andsi_1_zext.  Uros, any
> preference?

The splitter, since the scheduler can break the interesting sequence by
inserting unrelated instructions.

[Bug middle-end/53616] [4.8 Regression] 416.gamess in SPEC CPU 2006 miscompiled

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53616

Richard Guenther  changed:

   What|Removed |Added

 Status|ASSIGNED|WAITING

--- Comment #4 from Richard Guenther  2012-06-12 
12:29:46 UTC ---
I can't reproduce the miscompares - are these really the only flags you use?
Which local patches do you apply to gamess?  For me gamess just hangs (but
even before the patch - ah, PR53086).

I do see a _lot_ of generated memcpy/memmove calls, so bisecting those will be
hard.

I can reproduce it with -flto which avoids PR53086, but only with ref input.

Do you have a SPEC patch for PR53086?

Thx.

[Bug fortran/53643] [OOP] ICE (segfault) with INTENT(OUT) CLASS array

2012-06-12 Thread burnus at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53643

--- Comment #2 from Tobias Burnus  2012-06-12 
12:33:34 UTC ---
I am not sure whether the following (in trans-decl.c) is the proper fix or an
ugly work around, but it seems to work. -- Maybe, a proper fix would be to
modify the following "if" block in trans-array.c's structure_alloc_comps?

  if ((POINTER_TYPE_P (decl_type) && rank != 0)
|| (TREE_CODE (decl_type) == REFERENCE_TYPE && rank == 0))
decl = build_fold_indirect_ref_loc (input_location,
decl);

 * * *

The scalar coarray version does not seem to work; using an array coarray seems
to be okay. My impression is that structure_alloc_comps simply doesn't handle
coarray types correctly. (Coarray components are a different issue and
currently not properly supported at all.) See the trans-array.c part of the
patch below.

  type t
integer, allocatable :: comp
  end type t
  contains
subroutine foo(x)
  class(t), intent(out) :: x[*]
end subroutine
  end


--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -3453,8 +3453,5 @@ init_intent_out_dt (gfc_symbol * proc_sym,
gfc_wrapped_block * block)
   {
-   tree decl = build_fold_indirect_ref_loc (input_location,
-f->sym->backend_decl);
-   tmp = CLASS_DATA (f->sym)->backend_decl;
-   tmp = fold_build3_loc (input_location, COMPONENT_REF,
-  TREE_TYPE (tmp), decl, tmp, NULL_TREE);
-   tmp = build_fold_indirect_ref_loc (input_location, tmp);
+   tmp = gfc_class_data_get (f->sym->backend_decl);
+   if (CLASS_DATA (f->sym)->as == NULL)
+ tmp = build_fold_indirect_ref_loc (input_location, tmp);
tmp = gfc_deallocate_alloc_comp (CLASS_DATA (f->sym)->ts.u.derived,
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -7320,5 +7320,3 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
|| (TREE_CODE (decl_type) == REFERENCE_TYPE && rank == 0))
-
-decl = build_fold_indirect_ref_loc (input_location,
-   decl);
+decl = build_fold_indirect_ref_loc (input_location, decl);

@@ -7330,3 +7328,3 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
   if (TREE_CODE (decl_type) == ARRAY_TYPE
-   || GFC_DESCRIPTOR_TYPE_P (decl_type))
+  || (GFC_DESCRIPTOR_TYPE_P (decl_type) && rank != 0))
 {

[Bug middle-end/53644] New: ICE in force_move_args_size_note, at combine-stack-adj.c:419

2012-06-12 Thread doko at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53644

 Bug #: 53644
   Summary: ICE in force_move_args_size_note, at
combine-stack-adj.c:419
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: d...@gcc.gnu.org


Created attachment 27609
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27609
preprocessed source

seen on i686-linux-gnu only, with -Os, works with -O[123]. I see the ICE with
4.6 as well.


$ g++ -c -ggdb2 -Os -std=gnu++0x skeletoncommon.ii
/home/bjoern/libreoffice-3.6.0~alpha1/unodevtools/source/skeletonmaker/skeletoncommon.cxx:
In function 'void
skeletonmaker::checkDefaultInterfaces(boost::unordered::unordered_set&, const boost::unordered::unordered_set&, const rtl::OString&)':
/home/bjoern/libreoffice-3.6.0~alpha1/unodevtools/source/skeletonmaker/skeletoncommon.cxx:317:1:
internal compiler error: in force_move_args_size_note, at
combine-stack-adj.c:419
Please submit a full bug report,
with preprocessed source if appropriate.

[Bug c/53645] New: Missed optimization for division of vector types

2012-06-12 Thread andrii.riabushenko at barclays dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53645

 Bug #: 53645
   Summary: Missed optimization for division of vector types
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: andrii.riabushe...@barclays.com


for the following code

v4si ttt(v4si x) {

 return x / (v4si) {3,3,3,3};
}


GCC generates the following assembler

ttt:
    movdqa  (%rcx), %xmm0
    movl    $1431655766, %ecx
    movd    %xmm0, %r8d
    pextrd  $1, %xmm0, %r10d
    pextrd  $2, %xmm0, %r11d
    movl    %r8d, %eax
    sarl    $31, %r8d
    imull   %ecx
    movl    %r10d, %eax
    sarl    $31, %r10d
    movl    %edx, %r9d
    imull   %ecx
    movl    %r11d, %eax
    subl    %r8d, %r9d
    sarl    $31, %r11d
    movl    %edx, %r8d
    imull   %ecx
    subl    %r10d, %r8d
    movl    %edx, %r10d
    subl    %r11d, %r10d
    pextrd  $3, %xmm0, %r11d
    movl    %r11d, %eax
    imull   %ecx
    sarl    $31, %r11d
    movd    %r10d, %xmm1
    movd    %r9d, %xmm0
    pinsrd  $0x1, %r8d, %xmm0
    subl    %r11d, %edx
    pinsrd  $0x1, %edx, %xmm1
    punpcklqdq  %xmm1, %xmm0
    ret


Thus gcc DOES optimize the division into a high multiplication, but it is
applied to each value separately instead of to the whole vector. The assembler
should look like


    movdqa  .LC190(%rip), %xmm0
    pmulld  (%rcx), %xmm0
    pslld   $31, %xmm0
    ret

[Bug c++/53599] [4.7/4.8 Regression] gcc-4.7.1_rc20120606 segfaults compiling boost.karma

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53599

Richard Guenther  changed:

   What|Removed |Added

   Priority|P3  |P1
  Known to work||4.7.0
  Known to fail||4.7.1, 4.8.0

--- Comment #8 from Richard Guenther  2012-06-12 
12:47:31 UTC ---
Can someone please revert it on the branch at least?  Thx.

[Bug middle-end/53644] ICE in force_move_args_size_note, at combine-stack-adj.c:419

2012-06-12 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53644

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  2012-06-12 
12:50:19 UTC ---
Dup of PR53602 ?

[Bug middle-end/53616] [4.8 Regression] 416.gamess in SPEC CPU 2006 miscompiled

2012-06-12 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53616

H.J. Lu  changed:

   What|Removed |Added

 Status|WAITING |NEW

[Bug middle-end/53616] [4.8 Regression] 416.gamess in SPEC CPU 2006 miscompiled

2012-06-12 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53616

--- Comment #5 from H.J. Lu  2012-06-12 13:03:57 
UTC ---
Created attachment 27610
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27610
alt.src for 416.gamess

[Bug c++/53646] New: Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread matz at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

 Bug #: 53646
   Summary: Surprising effects of cxx11 vs cxx98 ABI compatibility
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: m...@gcc.gnu.org


Created attachment 27611
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27611
tarball containing testcase

As has long been known, mixing cxx11 and cxx98 code isn't supported.  I'm
reporting this anyway because I'm not sure that the full consequences of this
were realized.  Unpack the tarball, then:

# cd cxxabi-incompat
# make CXX=
...
# ./app
Segmentation fault
# ./app2
Segmentation fault

In this specific case the problem is the Rb_tree::equal_range function,
which returns a pair.  Under cxx98 that's POD (returned via registers); under
cxx11 it's not POD and is returned via invisible reference.  I.e. calls from
the cxx98 to the cxx11 version of that function will segfault, and vice versa.

The testcase consists of two libraries (lib1 lib2), compiled with cxx11
and cxx98 respectively.  The libs neither depend on nor interact with each
other.  The app stands for a random application making use of random libraries,
calling simple functions in them, which themselves don't exhibit any ABI
problem.  I.e. the testcase tries to show a typical situation for
application developers using several different 3rd party libraries not under
control of that developer.

The problem happens because both libs contain a (weak) definition of
the equal_range function.  As the libs don't control their exports this
symbol is also exported.  Hence the calls will be resolved to whatever
version comes first in dynamic linker search order (that's the reason
for the two apps built, once with "-l1 -l2", once with "-l2 -l1").

So, whichever library is first in search order, the other library will
resolve its own equal_range call (stemming from the inlined erase) to that
first library, and thereby crash because that version was compiled with
different c++ ABI.

Now, this might all be as designed, but what this "you can't mix c++11 and
c++98" means is that an application can't even _link_ against two
libraries compiled with different settings, when the libs don't tightly
control their exports (which might not be possible on all targets).
That's even prohibited if the library authors made sure that the API
itself doesn't contain any problematic constructs, for instance just a
C API, and c++ is used only internally.

In effect this means, that if library authors don't control all their
consumers (which is the case for most libraries) they must expect that
those will also link against some c++98 libraries, and hence can't make use
of c++11 constructs even internally.

Furthermore, if one wants to link against a c++11 compiled library (which
the typical user won't even know, especially if only used in the internal
implementation) the whole stack also needs to be c++11, effectively requiring
two versions of every library that somehow uses c++ to be installed.
The dynamic linker (or games via separate paths) would have to resolve to the
right variants.  No distribution is going to do something like this (and not
all libraries might even be compilable in c++11 mode), and this effectively
prevents c++11 from being used _at all_ in development.

This report is a result from a real issue we had: our package management
is written in C++, and the library author started using c++11.  The
applications linking against that component were still at c++98 leading
to crashes.  I had to advise them to control their exports or stop using
c++11.  Luckily we were in the position to do the former, but if it had
been a third party lib there would have been no way out (rewriting the
apps isn't possible because we don't control all of them).

[Bug target/53621] [SH] Frame pointers not generated with -fno-omit-frame-pointer on GCC 4.7.0

2012-06-12 Thread chrbr at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53621

--- Comment #8 from chrbr at gcc dot gnu.org 2012-06-12 13:26:42 UTC ---
Created attachment 27612
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27612
fix

All the suspicious flags were reviewed and looked OK, except maybe
-maccumulate-outgoing-args, which looks safer when initialized from the .opt
files and overridden only in case of conflict with unwind tables.

Removed the forcing of -fomit-frame-pointer entirely, as we discussed, since it
doesn't seem to be needed anymore. However, note that if we discover that it is
still in use for a non-dwarf2 target, the line should be moved to
common/config/sh:sh_option_init_struct()

Checked that omit-frame-pointer is correctly enforced for -pg from
sh_frame_pointer_required.

I'd like to push it to the 4.6 and 4.7 branches; validated under sh-superh-elf.
sh4-linux is on-going.

Thanks

[Bug c/53645] Missed optimization for division of vector types

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53645

Richard Guenther  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2012-06-12
 Ever Confirmed|0   |1

--- Comment #1 from Richard Guenther  2012-06-12 
13:34:21 UTC ---
The issue is that no packed integer division exists and that we lower the
vector division to scalar code:

:
  D.2165_4 = BIT_FIELD_REF ;
  D.2166_5 = D.2165_4 / 3;
  D.2167_6 = BIT_FIELD_REF ;
  D.2168_7 = D.2167_6 / 3;
  D.2169_8 = BIT_FIELD_REF ;
  D.2170_9 = D.2169_8 / 3;
  D.2171_10 = BIT_FIELD_REF ;
  D.2172_11 = D.2171_10 / 3;
  D.2159_2 = {D.2166_5, D.2168_7, D.2170_9, D.2172_11};

this lowering should instead try to do the multiplication trick.

Compilable testcase:

typedef int v4si __attribute__((vector_size(16)));
v4si ttt(v4si x) 
{
return x / (v4si) {3,3,3,3};
}
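
The "multiplication trick" can be sketched per lane as below. This only illustrates the arithmetic visible in the scalar assembly of comment #0 (multiply by 1431655766, keep the high 32 bits, subtract the sign); it is not the lowering GCC implements. The names div3 and div3_lanes are hypothetical, and the code assumes an arithmetic right shift for negative values, which GCC provides.

```cpp
#include <cstdint>

// Signed division by 3 without a divide instruction.
// 1431655766 == 0x55555556 == ceil(2^32 / 3), the constant loaded into %ecx above.
constexpr std::int32_t div3(std::int32_t x)
{
    // High 32 bits of the 64-bit product (what imull leaves in %edx),
    // minus x >> 31 (-1 for negative x, 0 otherwise) to round toward zero.
    return static_cast<std::int32_t>(
               (static_cast<std::int64_t>(x) * 1431655766LL) >> 32)
           - (x >> 31);
}

// A vectorized lowering would apply the same steps to all four lanes at once.
inline void div3_lanes(const std::int32_t in[4], std::int32_t out[4])
{
    for (int i = 0; i < 4; ++i)
        out[i] = div3(in[i]);
}
```

A vector form needs a packed high multiply plus packed shift and subtract, so the instructions available depend on the target ISA.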

[Bug tree-optimization/53645] Missed optimization for vector integer division lowering

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53645

Richard Guenther  changed:

   What|Removed |Added

   Keywords||missed-optimization
  Component|c   |tree-optimization
Summary|Missed optimization for |Missed optimization for
   |division of vector types|vector integer division
   ||lowering
   Severity|normal  |enhancement

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

--- Comment #1 from Jonathan Wakely  2012-06-12 
13:38:01 UTC ---
(In reply to comment #0)
> In this specific case the problem is the Rb_tree::equal_range function,
> which returns a pair, under cxx98 that's POD (returned via registers), under 
> cxx11 it's not POD and returned via invisible reference.

Ah, so that explains it.  Same issue as
http://gcc.gnu.org/ml/gcc/2012-05/msg00409.html

[Bug middle-end/53644] ICE in force_move_args_size_note, at combine-stack-adj.c:419

2012-06-12 Thread doko at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53644

Matthias Klose  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||DUPLICATE

--- Comment #2 from Matthias Klose  2012-06-12 
13:40:52 UTC ---
duplicate

*** This bug has been marked as a duplicate of bug 53602 ***

[Bug c++/53602] [4.7 Regression] Libre Office causes an internal compiler error

2012-06-12 Thread doko at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53602

Matthias Klose  changed:

   What|Removed |Added

 CC||doko at gcc dot gnu.org

--- Comment #18 from Matthias Klose  2012-06-12 
13:40:52 UTC ---
*** Bug 53644 has been marked as a duplicate of this bug. ***

[Bug target/53621] [SH] Frame pointers not generated with -fno-omit-frame-pointer on GCC 4.7.0

2012-06-12 Thread kkojima at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53621

--- Comment #9 from Kazumoto Kojima  2012-06-12 
13:56:37 UTC ---
The patch is pre-approved.  Thanks for looking into the issue
thoroughly.

[Bug middle-end/53616] [4.8 Regression] 416.gamess in SPEC CPU 2006 miscompiled

2012-06-12 Thread rguenth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53616

--- Comment #6 from Richard Guenther  2012-06-12 
14:06:18 UTC ---
Ok, that doesn't apply for me (I'm stuck on v1.0).  I can reproduce it with
-fno-tree-vrp without LTO but only with reference input (thus bisecting
to a single file will take _quite_ some time).  There are more than 1000
loop replacements with memcpy/memmove.

Let's say I hope somebody comes up with a smaller wrong-code testcase ;))

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

--- Comment #2 from Jonathan Wakely  2012-06-12 
14:19:33 UTC ---
N.B. std::pair is not a POD in c++98 or c++11, so I don't know what libstdc++
could have done to cause the FE to change how it returns a std::pair.

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

--- Comment #3 from Jonathan Wakely  2012-06-12 
14:27:58 UTC ---
I don't think this is a libstdc++ issue, precompiling the code with 4.7 and
then compiling with 4.8 still segfaults, so it's a FE change not a libstdc++
change.

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

--- Comment #4 from Jonathan Wakely  2012-06-12 
14:29:03 UTC ---
(In reply to comment #3)
> I don't think this is a libstdc++ issue, precompiling ...

Sorry, brainfart, I meant preprocessing

[Bug c++/53137] [4.7/4.8 Regression] g++ segfault

2012-06-12 Thread jason at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53137

--- Comment #6 from Jason Merrill  2012-06-12 
15:01:31 UTC ---
Author: jason
Date: Tue Jun 12 15:01:17 2012
New Revision: 188460

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188460
Log:
PR c++/53599
Revert:
PR c++/53137
* pt.c (instantiate_class_template_1): Set LAMBDA_EXPR_THIS_CAPTURE.
(instantiate_decl): Don't push_to_top_level for local class methods.
(instantiate_class_template_1): Or for local classes.

Added:
branches/gcc-4_7-branch/gcc/testsuite/g++.dg/template/local7.C
Removed:
   
branches/gcc-4_7-branch/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template5.C
Modified:
branches/gcc-4_7-branch/gcc/cp/ChangeLog
branches/gcc-4_7-branch/gcc/cp/pt.c
branches/gcc-4_7-branch/gcc/testsuite/ChangeLog

[Bug c++/53599] [4.7/4.8 Regression] gcc-4.7.1_rc20120606 segfaults compiling boost.karma

2012-06-12 Thread jason at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53599

--- Comment #9 from Jason Merrill  2012-06-12 
15:01:29 UTC ---
Author: jason
Date: Tue Jun 12 15:01:17 2012
New Revision: 188460

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188460
Log:
PR c++/53599
Revert:
PR c++/53137
* pt.c (instantiate_class_template_1): Set LAMBDA_EXPR_THIS_CAPTURE.
(instantiate_decl): Don't push_to_top_level for local class methods.
(instantiate_class_template_1): Or for local classes.

Added:
branches/gcc-4_7-branch/gcc/testsuite/g++.dg/template/local7.C
Removed:
   
branches/gcc-4_7-branch/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template5.C
Modified:
branches/gcc-4_7-branch/gcc/cp/ChangeLog
branches/gcc-4_7-branch/gcc/cp/pt.c
branches/gcc-4_7-branch/gcc/testsuite/ChangeLog

[Bug c++/53137] [4.7/4.8 Regression] g++ segfault

2012-06-12 Thread jason at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53137

Jason Merrill  changed:

   What|Removed |Added

   Target Milestone|4.7.1   |4.7.2

--- Comment #7 from Jason Merrill  2012-06-12 
15:06:27 UTC ---
Fix pushed off to 4.7.2.

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread matz at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

--- Comment #5 from Michael Matz  2012-06-12 15:36:01 
UTC ---
(In reply to comment #2)
> N.B. std::pair is not a POD in c++98 or c++11, so I don't know what libstdc++
> could have done to cause the FE to change how it returns a std::pair.

I don't know if it's PODness (but I believe that it is), but the calling
convention is changed by the existence of this c++11 construct in std::pair:

  pair(pair&& __p)
  noexcept(__and_<is_nothrow_move_constructible<_T1>,
                  is_nothrow_move_constructible<_T2>>::value)
  : first(std::forward<first_type>(__p.first)),
    second(std::forward<second_type>(__p.second)) { }

It's some sort of copy-ctor, right?  In any case when it's there (and it's
only there when compiling/preprocessing in c++11 mode), then the frontend
makes this type be TREE_ADDRESSABLE, aggregate_value_p will return true,
and that makes the ABI use an invisible reference.

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread matz at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

--- Comment #6 from Michael Matz  2012-06-12 15:41:49 
UTC ---
FWIW, it's finish_struct_bits setting TREE_ADDRESSABLE, because
type_has_nontrivial_copy_init returns true for pair with that ctor.
I think this indeed makes pair non-POD.

[Bug middle-end/53647] New: [4.8 Regression] gcc.c-torture/compile/20011229-1.c and gcc.c-torture/compile/pr25311.c

2012-06-12 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647

 Bug #: 53647
   Summary: [4.8 Regression] gcc.c-torture/compile/20011229-1.c
and gcc.c-torture/compile/pr25311.c
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: hjl.to...@gmail.com
CC: wschm...@linux.vnet.ibm.com


On Linux/x86, revision 188457 gave

FAIL: gcc.c-torture/compile/20011229-1.c  -Os  (test for excess errors)
FAIL: gcc.c-torture/compile/pr25311.c  -Os  (internal compiler error)

Revision 188451 is OK.

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

--- Comment #7 from Jonathan Wakely  2012-06-12 
15:50:05 UTC ---
Trivially copyable is just one small part of the POD requirements. std::pair
has always been non-POD, even in c++98, but in c++98 it is trivially copyable,
in c++11 that move constructor is non-trivial.

It can be made trivial by changing it to:

  pair(pair&& __p)
  noexcept(__and_<is_nothrow_move_constructible<_T1>,
                  is_nothrow_move_constructible<_T2>>::value)
  = default;

That might fix this problem, could you test it?
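
The difference can be observed with type traits. This is a sketch with two hypothetical stand-in types (UserMove, DefaultedMove) rather than std::pair itself, and it assumes a standard library that provides the C++11 triviality traits. Only the type with the defaulted move constructor stays trivially copyable, which is the kind of property the frontend consults before deciding whether a class can be returned in registers.

```cpp
#include <type_traits>
#include <utility>

struct UserMove {
    int first;
    int second;
    UserMove() = default;
    UserMove(const UserMove&) = default;
    // User-provided, hence non-trivial, like the pair(pair&&) quoted above.
    UserMove(UserMove&& p)
      : first(std::move(p.first)), second(std::move(p.second)) { }
};

struct DefaultedMove {
    int first;
    int second;
    DefaultedMove() = default;
    DefaultedMove(const DefaultedMove&) = default;
    DefaultedMove(DefaultedMove&&) = default;   // trivial, since all members are trivial
};

static_assert(!std::is_trivially_move_constructible<UserMove>::value,
              "user-provided move ctor is non-trivial");
static_assert(!std::is_trivially_copyable<UserMove>::value,
              "and makes the type not trivially copyable");
static_assert(std::is_trivially_move_constructible<DefaultedMove>::value,
              "defaulted move ctor is trivial");
static_assert(std::is_trivially_copyable<DefaultedMove>::value,
              "so the type stays trivially copyable");
```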

[Bug fortran/53642] Front-end optimization: Wrong string length for deferred-length strings

2012-06-12 Thread burnus at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53642

Tobias Burnus  changed:

   What|Removed |Added

 CC||burnus at gcc dot gnu.org

--- Comment #1 from Tobias Burnus  2012-06-12 
15:52:34 UTC ---
I was thinking about the following patch - namely, doing in optimize_assignment
the remove_trim only if "!lhs->ts.deferred".

 * * *

However, that does not work; seemingly,  a = trim(b) is replaced by
   a = b(1:len_trim(b))  in such a way, that  len(a) == 0  instead of 3.

The same issue occurs for a manual:
   trimmed = string(1:len_trim(string))

Thus, the follow-up issue is not a FE optimization issue but rather a trans*.c
issue. The generated code is [some casting removed]:

trimmed = __builtin_malloc (MAX_EXPR , 0>);

.trimmed = MAX_EXPR , 0>;

D.1861 = _gfortran_string_len_trim (4, &string);

Which shows the wrong ordering, similar to bug 45170 comment 34, which is fixed
by the patch at bug 45170 comment 34. That patch also solves the len_trim issue
above.


--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -738 +738 @@ optimize_assignment (gfc_code * c)
-  if (lhs->ts.type == BT_CHARACTER)
+  if (lhs->ts.type == BT_CHARACTER && !lhs->ts.deferred)
@@ -740 +740 @@ optimize_assignment (gfc_code * c)
-  /* Optimize away a = trim(b), where a is a character variable.  */
+  /* Optimize  a = trim(b)  to  a = b.  */
@@ -743,4 +743,2 @@ optimize_assignment (gfc_code * c)
-  /* Replace a = '   ' by a = '' to optimize away a memcpy, but only
-for strings with non-deferred length (otherwise we would
-reallocate the length.  */
-  if (empty_string(rhs) && ! lhs->ts.deferred)
+  /* Replace a = '   ' by a = '' to optimize away a memcpy.  */
+  if (empty_string(rhs))
@@ -1174 +1172 @@ optimize_trim (gfc_expr *e)
-  /* Build the function call to len_trim(x, gfc_defaul_integer_kind).  */
+  /* Build the function call to len_trim(x, gfc_default_integer_kind).  */

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

Jonathan Wakely  changed:

   What|Removed |Added

 CC||paolo.carlini at oracle dot
   ||com

--- Comment #8 from Jonathan Wakely  2012-06-12 
15:53:15 UTC ---
Ah, the stl_pair.h header has a comment (I think from Paolo) saying that
defaulting that move ctor breaks std::map:

  // XXX Defaulted?!? Breaks std::map!!!

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

--- Comment #9 from Jonathan Wakely  2012-06-12 
15:57:16 UTC ---
Defaulting that move-ctor fixes the issue referred to in comment 1 too.

I think we need to find out if that comment is still relevant and fix it if it
is, so we can default the move-ctor.

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread matz at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

--- Comment #10 from Michael Matz  2012-06-12 16:02:28 
UTC ---
Yep, defaulting that ctor changes the ABI back to register passing.
If we could change that in libstdc++, all the better, but I still think the
issue is larger than just this specific case.

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread paolo.carlini at oracle dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

--- Comment #11 from Paolo Carlini  2012-06-12 
16:04:58 UTC ---
Daniel should have all the details. It might be possible to do the change
*together* with changing the constraining in the various container::insert to
use is_constructible instead of is_convertible (we got PRs about this, but
please also get details from Daniel). Actually, we may be close to being able
to do this, mainline only of course, but really run the testsuite to completion
and be ready for fallouts in unexpected places (I didn't feel brave enough to
try for 4.7 ;)

[Bug libstdc++/53646] Surprising effects of cxx11 vs cxx98 ABI compatibility

2012-06-12 Thread paolo.carlini at oracle dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53646

--- Comment #12 from Paolo Carlini  2012-06-12 
16:12:55 UTC ---
If I remember correctly, the last time I tried, default + is_constructible
worked pretty well modulo testcases sensitive to access control under sfinae.
But the latter we are going to implement anyway for 4.8, thus a temporary FIXME
or something in the testcases would be fine.

[Bug ada/53592] ICE when hitting assigment to component of SSE vector_type

2012-06-12 Thread ebotcazou at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53592

Eric Botcazou  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||ebotcazou at gcc dot
   ||gnu.org
 AssignedTo|unassigned at gcc dot   |ebotcazou at gcc dot
   |gnu.org |gnu.org

--- Comment #2 from Eric Botcazou  2012-06-12 
16:15:14 UTC ---
Fixing.

[Bug c++/53648] New: nested empty tuple

2012-06-12 Thread chesstr at hotmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53648

 Bug #: 53648
   Summary: nested empty tuple
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: ches...@hotmail.com


#include <tuple>

int main(){
auto b = std::tuple<std::tuple<std::tuple<>>>{};
}

gives : 

error: 'std::_Tuple_impl<1ul>' is an ambiguous base of 'std::_Tuple_impl<0ul,
std::tuple<std::tuple<> > >'

[Bug c++/53648] nested empty tuples

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53648

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2012-06-12
 Ever Confirmed|0   |1

--- Comment #1 from Jonathan Wakely  2012-06-12 
16:42:37 UTC ---
I have a fix for this already, IIRC it's simply a case of not using the EBO for
a tuple that contains std::tuple<>

[Bug ada/53590] compiler fails to generate SIMD instruction for FP division

2012-06-12 Thread ebotcazou at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53590

Eric Botcazou  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed|2012-06-11 00:00:00 |2012-06-12
 AssignedTo|unassigned at gcc dot   |ebotcazou at gcc dot
   |gnu.org |gnu.org
 Ever Confirmed|0   |1

--- Comment #7 from Eric Botcazou  2012-06-12 
16:54:36 UTC ---
> That wasn't particularly clear; the C compiler and the C++ compiler
> used for comparing things on the machine are from the very
> same build (4.8.0 20120525), though.

Yes, it's a fallout of -fnon-call-exceptions that stems from the Java
semantics.  GNAT GPL doesn't care about Java so it implements more aggressive
dead code elimination passes.  We could add an internal flag enabling this
behavior.

[Bug libstdc++/53648] [C++11] nested empty tuples

2012-06-12 Thread chesstr at hotmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53648

--- Comment #2 from chesstr at hotmail dot com 2012-06-12 17:13:33 UTC ---
(In reply to comment #1)
> I have a fix for this already, IIRC it's simply a case of not using the EBO for
> a tuple that contains std::tuple<>

Yes, an easy fix in tuple implementation by modifying __empty_not_final as
below compiles :


template<typename _Tp>
using __empty_not_final
  = typename conditional<__is_final(_Tp)||is_same<_Tp,tuple<>>::value,
false_type, is_empty<_Tp>>::type;

instead of 

template<typename _Tp>
using __empty_not_final
  = typename conditional<__is_final(_Tp), false_type, is_empty<_Tp>>::type;
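
For experimenting outside the library, the modified trait can be approximated with public facilities. This is only a sketch: empty_not_final_sketch, Empty and FinalEmpty are hypothetical names, and std::is_final (C++14) is swapped in for the __is_final builtin used in the header.

```cpp
#include <tuple>
#include <type_traits>

// Stand-in for the internal alias: true only for empty, non-final types
// other than std::tuple<>.
template<typename T>
using empty_not_final_sketch =
    typename std::conditional<
        std::is_final<T>::value || std::is_same<T, std::tuple<>>::value,
        std::false_type,
        std::is_empty<T>>::type;

struct Empty { };
struct FinalEmpty final { };

static_assert(empty_not_final_sketch<Empty>::value, "plain empty class qualifies");
static_assert(!empty_not_final_sketch<FinalEmpty>::value, "final classes cannot be bases");
static_assert(!empty_not_final_sketch<std::tuple<>>::value, "tuple<> is excluded");
static_assert(!empty_not_final_sketch<int>::value, "non-class types are not empty");
```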

[Bug middle-end/53647] [4.8 Regression] gcc.c-torture/compile/20011229-1.c and gcc.c-torture/compile/pr25311.c

2012-06-12 Thread wschmidt at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647

William J. Schmidt  changed:

   What|Removed |Added

 CC||wschmidt at gcc dot gnu.org

--- Comment #1 from William J. Schmidt  2012-06-12 
17:27:18 UTC ---
Can you please post the symptom of the ICE?  These errors don't occur on ppc so
it will be some work for me to replicate.

[Bug libstdc++/53648] [C++11] nested empty tuples

2012-06-12 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53648

--- Comment #3 from Jonathan Wakely  2012-06-12 
17:27:50 UTC ---
There are other cases involving non-empty tuples that will still result in an
ambiguity e.g.

struct A { };
auto d = tuple, A>, A>{};

My solution avoids using the EBO for some condition I don't remember (the
code's on another machine) but it handles all the cases I tested.

I also preserve the property that sizeof(tuple<tuple<>>)==1, which I
think your suggestion loses.
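
For context, a sketch of why the EBO interacts badly with a repeated empty base, using hypothetical types E, W1, W2 and D rather than the _Tuple_impl hierarchy. When the same empty type is reached as a base along two paths, the two subobjects must have distinct addresses and conversions to that base become ambiguous:

```cpp
struct E { };            // empty class
struct W1 : E { };       // each wrapper can share its storage with E via the EBO
struct W2 : E { };
struct D : W1, W2 { };   // contains two distinct E subobjects

// With the EBO applied, a wrapper is no larger than the empty base itself
// (true on the common ABIs, although the standard does not promise it).
static_assert(sizeof(W1) == sizeof(E), "EBO applied to a single empty base");

// Two subobjects of the same type cannot share an address, so D needs room
// for both and cannot collapse to a single byte.
static_assert(sizeof(D) >= 2, "repeated empty base costs space");

// E* p = static_cast<D*>(nullptr);  // rejected: 'E' is an ambiguous base of 'D'
```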

[Bug middle-end/53647] [4.8 Regression] gcc.c-torture/compile/20011229-1.c and gcc.c-torture/compile/pr25311.c

2012-06-12 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647

--- Comment #2 from H.J. Lu  2012-06-12 17:45:40 
UTC ---
/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/ -fno-diagnostics-show-caret
-Os -w -c -o pr25311.o
/export/gnu/import/git/gcc/gcc/testsuite/gcc.c-torture/compile/pr25311.c
/export/gnu/import/git/gcc/gcc/testsuite/gcc.c-torture/compile/pr25311.c: In
function ‘set_size’:
/export/gnu/import/git/gcc/gcc/testsuite/gcc.c-torture/compile/pr25311.c:16:1:
internal compiler error: Floating point exception
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

Program received signal SIGFPE, Arithmetic exception.
0x00cdefb5 in hoist_adjacent_loads (bb0=0x71326d90, 
bb1=0x71326e00, bb2=0x71326e70, bb3=0x71326ee0)
at /export/gnu/import/git/gcc/gcc/tree-ssa-phiopt.c:1941
1941  align1 = DECL_ALIGN (field1) % param_align_bits;
(gdb) p param_align_bits
$1 = 0
(gdb)

[Bug middle-end/53647] [4.8 Regression] gcc.c-torture/compile/20011229-1.c and gcc.c-torture/compile/pr25311.c

2012-06-12 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647

H.J. Lu  changed:

   What|Removed |Added

 CC||ubizjak at gmail dot com

--- Comment #3 from H.J. Lu  2012-06-12 17:56:04 
UTC ---
i386.c has

  maybe_set_param_value (PARAM_L1_CACHE_LINE_SIZE, ix86_cost->prefetch_block,
 global_options.x_param_values,
 global_options_set.x_param_values);

But some costs have

  0,/* size of l1 cache  */
  0,/* size of l2 cache  */
  0,/* size of prefetch block */
  0,/* number of parallel prefetches */

and we get param_align == 0:

+  int param_align = PARAM_VALUE (PARAM_L1_CACHE_LINE_SIZE);
+  unsigned param_align_bits = (unsigned) (param_align * BITS_PER_UNIT);
+  gimple_stmt_iterator gsi;

[Bug libstdc++/53648] [C++11] nested empty tuples

2012-06-12 Thread chesstr at hotmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53648

--- Comment #4 from chesstr at hotmail dot com 2012-06-12 18:03:56 UTC ---
(In reply to comment #3)
> There are other cases involving non-empty tuples that will still result in an
> ambiguity e.g.
> 
> struct A { };
> auto d = tuple<tuple<tuple<>, A>, A>{};
> 
> My solution avoids using the EBO for some condition I don't remember (the
> code's on another machine) but it handles all the cases I tested.
> 
> I also preserve the property that sizeof(tuple<tuple<>>)==1, which I
> think your suggestion loses.

You are right, the suggestion does not solve the real problem at all. Nice
example.

[Bug target/53511] SH Target: Add support for fma patterns

2012-06-12 Thread olegendo at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53511

--- Comment #13 from Oleg Endo  2012-06-12 
18:25:46 UTC ---
Author: olegendo
Date: Tue Jun 12 18:25:40 2012
New Revision: 188471

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188471
Log:
PR target/53511
* gcc.target/sh/pr51340-1.c: Delete obsolete test case.
* gcc.target/sh/pr51340-2.c: Likewise.
* gcc.target/sh/pr51340-3.c: Likewise.


Removed:
trunk/gcc/testsuite/gcc.target/sh/pr51340-1.c
trunk/gcc/testsuite/gcc.target/sh/pr51340-2.c
trunk/gcc/testsuite/gcc.target/sh/pr51340-3.c
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug rtl-optimization/53533] [4.7/4.8 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

2012-06-12 Thread matt at use dot net

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #11 from Matt Hargett  2012-06-12 18:25:25 UTC 
---
Richard,

Thanks for the quick analysis! Sounds like a perfect storm of sorts :/

re: cprop failure: this may be indicated by another major regression in their
suite for the "simple constant folding" tests. in GCC 4.1-4.6, those tests are
all 0.0s but in 4.7 take tens of seconds. Let me know if you want me to file a
separate bug/reduced test case for that, and then have that new bug depend on
this one. Otherwise, I'll wait until this one sees some resolution and then
retest.

re: multiple passes: if you think that feature has enough merit to be revisited
now, I can look into re-proposing Maxim's patches from October/November 2011
that integrated your feedback at the time.

re: -march workaround: our deployment platform's minimum arch is nocona, and
enabling -march=nocona doesn't work around the issue. For grins, I tried
-march=amdfam10 (another deployment target, but would require a separate
distributable binary), but that also didn't work around the issue.

I see a small improvement when using -fno-tree-vectorize, but not nearly as
dramatic as yours. For the int32_t for and while loop unrolling, the times go
from ~107s and ~105s to ~96s and ~95s, respectively. The do and goto loop
unrolling times get slightly worse (~2%), but it might be noise.

Let me know if there's any additional testing/footwork you'd like me to do.
Again, thanks for the quick turnaround on such a deep analysis!

[Bug c++/53599] [4.7/4.8 Regression] gcc-4.7.1_rc20120606 segfaults compiling boost.karma

2012-06-12 Thread jason at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53599

--- Comment #10 from Jason Merrill  2012-06-12 
18:32:10 UTC ---
Author: jason
Date: Tue Jun 12 18:32:04 2012
New Revision: 188473

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188473
Log:
PR c++/53599
* name-lookup.c (pushtag_1): Add a DECL_EXPR for a local class.
* semantics.c (finish_cond): Build a COMPOUND_EXPR.
* pt.c (tsubst_expr) [COMPOUND_EXPR]: Handle.
[DECL_EXPR]: Don't call cp_finish_decl for an implicit typedef.
Don't return the decl.

Added:
trunk/gcc/testsuite/g++.dg/template/local7.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/name-lookup.c
trunk/gcc/cp/pt.c
trunk/gcc/cp/semantics.c
trunk/gcc/testsuite/ChangeLog

[Bug tree-optimization/53645] Missed optimization for vector integer division lowering

2012-06-12 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53645

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  2012-06-12 
18:33:56 UTC ---
I think we've talked about enhancing the pattern recognizer to expand it as
mult at the tree level, reusing parts of the expander code for that.  I believe
I've looked at it, but can't find any patch (I think it was around mid December
when the sdivmod pattern recognizer was added).  I'll look at it again.
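
To make "expand it as mult" concrete, here is a scalar sketch of the
well-known trick of replacing an unsigned division by a constant with a
multiply and a shift. It is not GCC's expander code; div3 is a made-up
helper that handles only the divisor 3 on 32-bit unsigned values.

```cpp
#include <cstdint>

// Unsigned 32-bit division by 3 as a multiply and a shift.
// 0xAAAAAAAB is ceil(2^33 / 3); the product is formed in 64 bits so that
// no high bits are lost before the shift.
constexpr std::uint32_t div3(std::uint32_t x)
{
  return static_cast<std::uint32_t>((std::uint64_t{x} * 0xAAAAAAABull) >> 33);
}

static_assert(div3(10) == 3, "10 / 3");
static_assert(div3(0xFFFFFFFFu) == 0xFFFFFFFFu / 3, "largest input");
```

A vectorized version would apply the same multiply-high and shift to every
lane, provided the target has suitable vector multiply instructions.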

[Bug rtl-optimization/53533] [4.7/4.8 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

2012-06-12 Thread rth at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #12 from Richard Henderson  2012-06-12 
18:54:24 UTC ---
(In reply to comment #10)
> But maybe allowing const_vector in (some of) the define_insn_and_split would
> be the way to go ...

Maybe.  It certainly would ease some of the simplifications.
At the moment I don't think we can go from

  mem -> const -> simplify -> const ->newmem

On the other hand, for this particular test case, where all
of the vector_cst elements are the same, and a reasonably
small number of bits set, it would be great to be able to
leverage synth_mult.

The main complexity for sse2_mulv4si3 is due to the fact that
we have to decompose the operation into V8HImode multiplies.
Whereas if we decompose the multiply, we have the shifts and
adds in V4SImode.
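
As a rough illustration of the shifts-and-adds decomposition, the sketch
below multiplies by a constant using one shift and one add per set bit.
This is only the plain binary method, written for scalars; synth_mult picks
cheaper sequences based on target costs, and mul_shift_add is a made-up
name.

```cpp
#include <cstdint>

// Multiply x by c using shifts and adds only: for every set bit k of c,
// add (x << k).  Arithmetic wraps modulo 2^32, like a 32-bit multiply.
constexpr std::uint32_t mul_shift_add(std::uint32_t x, std::uint32_t c,
                                      unsigned shift = 0)
{
  return c == 0
         ? 0u
         : ((c & 1u) ? (x << shift) : 0u)
           + mul_shift_add(x, c >> 1, shift + 1);
}

static_assert(mul_shift_add(7, 10) == 70, "7 * 10 = (7 << 1) + (7 << 3)");
```

With few bits set in the constant, the sequence stays short, which is the
situation described above for this test case.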

[Bug target/53647] [4.8 Regression] gcc.c-torture/compile/20011229-1.c and gcc.c-torture/compile/pr25311.c

2012-06-12 Thread wschmidt at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647

William J. Schmidt  changed:

   What|Removed |Added

  Component|middle-end  |target

--- Comment #4 from William J. Schmidt  2012-06-12 
19:05:55 UTC ---
So, this appears to be a target issue.  The only other use of L1_CACHE_SIZE in the
top-level gcc directory is in tree-ssa-loop-prefetch.c, where 0 is apparently
undetected but causes prefetching to be quietly disabled.  Specifying 0 doesn't
appear to be intentionally valid.

[Bug target/53647] [4.8 Regression] gcc.c-torture/compile/20011229-1.c and gcc.c-torture/compile/pr25311.c

2012-06-12 Thread wschmidt at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647

--- Comment #5 from William J. Schmidt  2012-06-12 
19:09:48 UTC ---
If this is incorrect, and zero is supposed to indicate a cacheless memory (do
they exist anymore?), then I can disable the adjacent-loads hoisting
optimization for that case.  Does anyone know the answer to this?

[Bug target/53647] [4.8 Regression] gcc.c-torture/compile/20011229-1.c and gcc.c-torture/compile/pr25311.c

2012-06-12 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647

--- Comment #6 from H.J. Lu  2012-06-12 19:26:12 
UTC ---
We should update i386.c to have reasonable values for
size of l1 cache, size of l2 cache, size of prefetch block
and number of parallel prefetches, instead of 0s.

[Bug target/53647] [4.8 Regression] gcc.c-torture/compile/20011229-1.c and gcc.c-torture/compile/pr25311.c

2012-06-12 Thread ubizjak at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647

--- Comment #7 from Uros Bizjak  2012-06-12 19:40:21 
UTC ---
Perhaps simply:

--cut here--
Index: tree-ssa-phiopt.c
===
--- tree-ssa-phiopt.c   (revision 188475)
+++ tree-ssa-phiopt.c   (working copy)
@@ -1830,6 +1830,11 @@
   unsigned param_align_bits = (unsigned) (param_align * BITS_PER_UNIT);
   gimple_stmt_iterator gsi;

+  /* We assume that transformation is not profitable
+ on targets without cache.  */
+  if (param_align_bits == 0)
+return;
+
   /* Walk the phis in bb3 looking for an opportunity.  We are looking
  for phis of two SSA names, one each of which is defined in bb1 and
  bb2.  */
--cut here--
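
The effect of the early return can be shown outside of GCC with a small
sketch. Everything here is hypothetical: bits_per_unit stands in for
BITS_PER_UNIT, and align_in_line is an invented helper that mimics the
"DECL_ALIGN (field1) % param_align_bits" computation from the backtrace.

```cpp
// Hypothetical stand-in for GCC's BITS_PER_UNIT.
constexpr unsigned bits_per_unit = 8;

// Returns decl_align_bits modulo the cache line size in bits, or -1 when
// the target reports a line size of 0 and the computation must be skipped
// to avoid a modulo by zero.
constexpr long align_in_line(unsigned decl_align_bits, unsigned line_size_bytes)
{
  return line_size_bytes == 0
         ? -1L
         : static_cast<long>(decl_align_bits
                             % (line_size_bytes * bits_per_unit));
}

static_assert(align_in_line(64, 0) == -1, "no cache line size: bail out");
static_assert(align_in_line(1024, 64) == 0, "1024 bits is 2 full lines");
```

Checking the divisor once, before any field is examined, keeps the rest of
the function free of special cases.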

[Bug target/53647] [4.8 Regression] gcc.c-torture/compile/20011229-1.c and gcc.c-torture/compile/pr25311.c

2012-06-12 Thread ubizjak at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647

--- Comment #8 from Uros Bizjak  2012-06-12 19:45:56 
UTC ---
Alternative patch with the same functionality:

--cut here--
Index: tree-ssa-phiopt.c
===
--- tree-ssa-phiopt.c   (revision 188475)
+++ tree-ssa-phiopt.c   (working copy)
@@ -1976,12 +1976,14 @@
 /* Determine whether we should attempt to hoist adjacent loads out of
diamond patterns in pass_phiopt.  Always hoist loads if
-fhoist-adjacent-loads is specified and the target machine has
-   a conditional move instruction.  */
+   defined cache line size and a conditional move instruction.  */

 static bool
 gate_hoist_loads (void)
 {
-  return (flag_hoist_adjacent_loads == 1 && HAVE_conditional_move);
+  return (flag_hoist_adjacent_loads == 1
+ && PARAM_VALUE (PARAM_L1_CACHE_LINE_SIZE)
+ && HAVE_conditional_move);
 }

 /* Always do these optimizations if we have SSA
--cut here--

[Bug target/53647] [4.8 Regression] gcc.c-torture/compile/20011229-1.c and gcc.c-torture/compile/pr25311.c

2012-06-12 Thread wschmidt at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647

--- Comment #9 from William J. Schmidt  2012-06-12 
19:46:30 UTC ---
Yes, that can be done, but the question is whether it is correct, or just
hiding the issue.  Do the processors really not have cache?  Or was this just
an error not filling in the values?  I don't want to hide a real problem if
that's the situation.  Waiting for someone to clarify the intent behind these
processor descriptions.

There's nothing in the documentation of the parameter to suggest zero is a
magic number for no cache, but nothing to suggest otherwise, either.

[Bug tree-optimization/53647] [4.8 Regression] gcc.c-torture/compile/20011229-1.c and gcc.c-torture/compile/pr25311.c

2012-06-12 Thread wschmidt at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647

William J. Schmidt  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2012-06-12
  Component|target  |tree-optimization
 AssignedTo|unassigned at gcc dot   |wschmidt at gcc dot gnu.org
   |gnu.org |
 Ever Confirmed|0   |1

--- Comment #10 from William J. Schmidt  
2012-06-12 20:30:46 UTC ---
OK, after looking at these machine descriptions I think we will have to
tolerate a line size of 0.  They appear to be for the ancient 386 chips that
didn't have onboard cache, and sometimes no offboard cache.  So I'll go ahead
and make the fix.

[Bug middle-end/17958] expand_divmod fails to optimize division of 64-bit quantity by small constant when BITS_PER_WORD is 32

2012-06-12 Thread pinskia at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17958

Andrew Pinski  changed:

   What|Removed |Added

 AssignedTo|roger at eyesopen dot com   |dtemirbulatov at gmail dot
   ||com

--- Comment #2 from Andrew Pinski  2012-06-12 
20:52:43 UTC ---
http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00450.html

[Bug tree-optimization/53645] Missed optimization for vector integer division lowering

2012-06-12 Thread pinskia at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53645

Andrew Pinski  changed:

   What|Removed |Added

 Depends on||51581

--- Comment #3 from Andrew Pinski  2012-06-12 
20:54:43 UTC ---
I think this is a dup of bug 51581.

[Bug c/53196] unknown struct name in C99 compound initializer doesn't generate error

2012-06-12 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53196

--- Comment #8 from Jakub Jelinek  2012-06-12 
21:16:24 UTC ---
Author: jakub
Date: Tue Jun 12 21:16:20 2012
New Revision: 188483

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188483
Log:
PR c/53532
PR c/51034
PR c/53196
* c-decl.c (build_compound_literal): Call c_incomplete_type_error
if type isn't complete.

* gcc.dg/pr53532.c: New test.
* gcc.dg/c99-complit-2.c: Add two new dg-error directives,
adjust line numbers.
* gcc.dg/noncompile/950825-1.c: Expect incomplete type error message.
* gcc.dg/Wcxx-compat-8.c: Likewise.
* gcc.dg/pr51034.c: New test.
* gcc.dg/pr53196-1.c: New test.
* gcc.dg/pr53196-2.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/pr51034.c
trunk/gcc/testsuite/gcc.dg/pr53196-1.c
trunk/gcc/testsuite/gcc.dg/pr53196-2.c
trunk/gcc/testsuite/gcc.dg/pr53532.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/c-decl.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/Wcxx-compat-8.c
trunk/gcc/testsuite/gcc.dg/c99-complit-2.c
trunk/gcc/testsuite/gcc.dg/noncompile/950825-1.c

[Bug c/51034] invalid typeof usage

2012-06-12 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51034

--- Comment #4 from Jakub Jelinek  2012-06-12 
21:16:24 UTC ---
Author: jakub
Date: Tue Jun 12 21:16:20 2012
New Revision: 188483

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188483
Log:
PR c/53532
PR c/51034
PR c/53196
* c-decl.c (build_compound_literal): Call c_incomplete_type_error
if type isn't complete.

* gcc.dg/pr53532.c: New test.
* gcc.dg/c99-complit-2.c: Add two new dg-error directives,
adjust line numbers.
* gcc.dg/noncompile/950825-1.c: Expect incomplete type error message.
* gcc.dg/Wcxx-compat-8.c: Likewise.
* gcc.dg/pr51034.c: New test.
* gcc.dg/pr53196-1.c: New test.
* gcc.dg/pr53196-2.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/pr51034.c
trunk/gcc/testsuite/gcc.dg/pr53196-1.c
trunk/gcc/testsuite/gcc.dg/pr53196-2.c
trunk/gcc/testsuite/gcc.dg/pr53532.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/c-decl.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/Wcxx-compat-8.c
trunk/gcc/testsuite/gcc.dg/c99-complit-2.c
trunk/gcc/testsuite/gcc.dg/noncompile/950825-1.c

[Bug c/53532] function call ignored when called with argument of incompatible, undefined structure

2012-06-12 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53532

--- Comment #3 from Jakub Jelinek  2012-06-12 
21:16:24 UTC ---
Author: jakub
Date: Tue Jun 12 21:16:20 2012
New Revision: 188483

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188483
Log:
PR c/53532
PR c/51034
PR c/53196
* c-decl.c (build_compound_literal): Call c_incomplete_type_error
if type isn't complete.

* gcc.dg/pr53532.c: New test.
* gcc.dg/c99-complit-2.c: Add two new dg-error directives,
adjust line numbers.
* gcc.dg/noncompile/950825-1.c: Expect incomplete type error message.
* gcc.dg/Wcxx-compat-8.c: Likewise.
* gcc.dg/pr51034.c: New test.
* gcc.dg/pr53196-1.c: New test.
* gcc.dg/pr53196-2.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/pr51034.c
trunk/gcc/testsuite/gcc.dg/pr53196-1.c
trunk/gcc/testsuite/gcc.dg/pr53196-2.c
trunk/gcc/testsuite/gcc.dg/pr53532.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/c-decl.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/Wcxx-compat-8.c
trunk/gcc/testsuite/gcc.dg/c99-complit-2.c
trunk/gcc/testsuite/gcc.dg/noncompile/950825-1.c

[Bug c/53649] New: ICE when using 'C' x86 asm constraint

2012-06-12 Thread svfuerst at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53649

 Bug #: 53649
   Summary: ICE when using 'C' x86 asm constraint
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: trivial
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: svfue...@gmail.com


The following prints:

internal compiler error: in ix86_print_operand, at config/i386/i386.c:14393

on x86_64, gcc version 4.7

void foo(void)
{
    asm (".asciz \"%0\""
         : : "C" ((unsigned __attribute__ ((vector_size (16)))){-1,-1,-1,-1}));
}

[Bug lto/51997] LTO does not inline available builtin implementations

2012-06-12 Thread aldyh at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51997

Aldy Hernandez  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2012-06-12
 Ever Confirmed|0   |1

--- Comment #2 from Aldy Hernandez  2012-06-12 
21:53:59 UTC ---
confirmed with:

houston:/build/t/gcc$ ./xgcc -B./ -flto a.c b.c -save-temps -o foo -O
[Leaving LTRANS /tmp/ccbO9bZt.args]
[Leaving LTRANS foo.ltrans.out]
[Leaving LTRANS /tmp/ccYBPyHk.args]
[Leaving LTRANS foo.ltrans0.o]
houston:/build/t/gcc$ cat foo.ltrans0.s
...
...

main:
.LFB0:
.cfi_startproc
rep
ret
...
...

[Bug c++/53650] New: large array causes huge memory use

2012-06-12 Thread david at doublewise dot net

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53650

 Bug #: 53650
   Summary: large array causes huge memory use
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: da...@doublewise.net


This problem did not exist in 4.6.x. The following program uses > 4 GiB of
memory (more than my system has) during compilation. If I reduce the size of
the array a bit, it still uses several GiB but does eventually compile (so the
problem is not infinite recursion) and takes a very long time to compile.
Compiling with optimizations on seems to eliminate the large memory usage but
takes longer than I was willing to wait to compile.

class Class {
public:
    Class() {}
};

int main() {
    Class table [2048][256] = {};
    return 0;
}


Using something like int causes no problems and compiles quickly.

This may also relate to a C++11 missed optimization opportunity, because this
may suggest that the fact that Class has a constructor is making it non-POD,
but the POD rules were relaxed in C++11 to make Class a POD structure (I
think). Compiling with -std=c++11 doesn't fix the problem, however.
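
On the POD question, the standard type traits can be used to check how a
class like the one above is classified. The sketch below is only a check of
the language rules and says nothing about the cause of the memory use;
UserProvided and Defaulted are invented names.

```cpp
#include <type_traits>

struct UserProvided { UserProvided() {} };       // like Class in the report
struct Defaulted    { Defaulted() = default; };  // compiler-generated body

// A user-provided default constructor is never trivial, even if its body
// is empty, so such a class is not trivial and hence not a POD in C++11.
static_assert(!std::is_trivially_default_constructible<UserProvided>::value,
              "user-provided constructor is not trivial");
static_assert(std::is_trivially_default_constructible<Defaulted>::value,
              "defaulted constructor is trivial");
static_assert(std::is_standard_layout<UserProvided>::value,
              "layout is unaffected by the constructor");
```

If the constructor really has nothing to do, declaring it as defaulted on
its first declaration keeps the class trivial.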

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.7.0/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --disable-build-with-cxx
--disable-build-poststage1-with-cxx --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id --with-linker-hash-style=gnu
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin
--enable-initfini-array --enable-java-awt=gtk --disable-dssi
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.7.0 20120507 (Red Hat 4.7.0-5) (GCC)

[Bug c++/53651] New: seg fault when specifying using decltype(...)::method

2012-06-12 Thread dlarimer at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53651

 Bug #: 53651
   Summary: seg fault when specifying using decltype(...)::method
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: dlari...@gmail.com


Created attachment 27613
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27613
Source file that generated the error.

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/opt/local/libexec/gcc/x86_64-apple-darwin11/4.7.0/lto-wrapper
Target: x86_64-apple-darwin11
Configured with: ../gcc-4.7.0/configure --prefix=/opt/local
--build=x86_64-apple-darwin11
--enable-languages=c,c++,objc,obj-c++,lto,fortran,java
--libdir=/opt/local/lib/gcc47 --includedir=/opt/local/include/gcc47
--infodir=/opt/local/share/info --mandir=/opt/local/share/man
--datarootdir=/opt/local/share/gcc-4.7 --with-libiconv-prefix=/opt/local
--with-local-prefix=/opt/local --with-system-zlib --disable-nls
--program-suffix=-mp-4.7 --with-gxx-include-dir=/opt/local/include/gcc47/c++/
--with-gmp=/opt/local --with-mpfr=/opt/local --with-mpc=/opt/local
--with-ppl=/opt/local --with-cloog=/opt/local --enable-cloog-backend=isl
--enable-stage1-checking --disable-multilib --enable-lto
--with-as=/opt/local/bin/as --with-ld=/opt/local/bin/ld
--with-ar=/opt/local/bin/ar --with-bugurl=https://trac.macports.org/newticket
--with-pkgversion='MacPorts gcc47 4.7.0_3'
Thread model: posix
gcc version 4.7.0 (MacPorts gcc47 4.7.0_3) 
COLLECT_GCC_OPTIONS='-mmacosx-version-min=10.7.4' '-v' '-save-temps' '-o' 'v5'
'-std=c++11' '-shared-libgcc' '-mtune=core2'
 /opt/local/libexec/gcc/x86_64-apple-darwin11/4.7.0/cc1plus -E -quiet -v
-D__DYNAMIC__ value5.cpp -fPIC -mmacosx-version-min=10.7.4 -mtune=core2
-std=c++11 -fpch-preprocess -o value5.ii
ignoring nonexistent directory
"/opt/local/lib/gcc47/gcc/x86_64-apple-darwin11/4.7.0/../../../../../x86_64-apple-darwin11/include"
#include "..." search starts here:
#include <...> search starts here:
 /opt/local/include/gcc47/c++/
 /opt/local/include/gcc47/c++//x86_64-apple-darwin11
 /opt/local/include/gcc47/c++//backward
 /opt/local/lib/gcc47/gcc/x86_64-apple-darwin11/4.7.0/include
 /opt/local/include
 /opt/local/lib/gcc47/gcc/x86_64-apple-darwin11/4.7.0/include-fixed
 /usr/include
 /System/Library/Frameworks
 /Library/Frameworks
End of search list.
COLLECT_GCC_OPTIONS='-mmacosx-version-min=10.7.4' '-v' '-save-temps' '-o' 'v5'
'-std=c++11' '-shared-libgcc' '-mtune=core2'
 /opt/local/libexec/gcc/x86_64-apple-darwin11/4.7.0/cc1plus -fpreprocessed
value5.ii -fPIC -quiet -dumpbase value5.cpp -mmacosx-version-min=10.7.4
-mtune=core2 -auxbase value5 -std=c++11 -version -o value5.s
GNU C++ (MacPorts gcc47 4.7.0_3) version 4.7.0 (x86_64-apple-darwin11)
compiled by GNU C version 4.7.0, GMP version 5.0.4, MPFR version 3.1.0-p3,
MPC version 0.9
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C++ (MacPorts gcc47 4.7.0_3) version 4.7.0 (x86_64-apple-darwin11)
compiled by GNU C version 4.7.0, GMP version 5.0.4, MPFR version 3.1.0-p3,
MPC version 0.9
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: e694c643a962d7980e49500e9a4c6aa5
value5.cpp:38:76: internal compiler error: Segmentation fault: 11
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions

[Bug c++/50043] [C++0x] Implement core/1123

2012-06-12 Thread kirbyz...@sogou-inc.com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50043

--- Comment #13 from Kirby Zhou  2012-06-13 03:48:34 
UTC ---
How about backporting this patch to the 4.7 branch?
It causes a lot of compile errors, which easily confuse programmers.


(In reply to comment #9)
> Author: paolo
> Date: Mon Apr  2 00:13:30 2012
> New Revision: 186058
> 
> URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=186058
> Log:
> /cp
> 2012-04-01  Paolo Carlini  
> 
> PR c++/50043
> * class.c (deduce_noexcept_on_destructor,
> deduce_noexcept_on_destructors): New.
> (check_bases_and_members): Call the latter.
> * decl.c (grokfndecl): Call the former.
> * method.c (implicitly_declare_fn): Not static.
> * cp-tree.h (deduce_noexcept_on_destructor, implicitly_declare_fn):
> Declare
> 
> /testsuite
> 2012-04-01  Paolo Carlini  
> 
> PR c++/50043
> * g++.dg/cpp0x/noexcept17.C: New.
> * g++.old-deja/g++.eh/cleanup1.C: Adjust.
> * g++.dg/tree-ssa/ehcleanup-1.C: Likewise.
> * g++.dg/cpp0x/noexcept01.C: Likewise.
> * g++.dg/eh/init-temp1.C: Likewise.
> * g++.dg/eh/ctor1.C: Likewise.
> 
> Added:
> trunk/gcc/testsuite/g++.dg/cpp0x/noexcept17.C
> Modified:
> trunk/gcc/cp/ChangeLog
> trunk/gcc/cp/class.c
> trunk/gcc/cp/cp-tree.h
> trunk/gcc/cp/decl.c
> trunk/gcc/cp/method.c
> trunk/gcc/testsuite/ChangeLog
> trunk/gcc/testsuite/g++.dg/cpp0x/noexcept01.C
> trunk/gcc/testsuite/g++.dg/eh/ctor1.C
> trunk/gcc/testsuite/g++.dg/eh/init-temp1.C
> trunk/gcc/testsuite/g++.dg/tree-ssa/ehcleanup-1.C
> trunk/gcc/testsuite/g++.old-deja/g++.eh/cleanup1.C

[Bug c++/53650] large array causes huge memory use

2012-06-12 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53650

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2012-06-13
 CC||jason at redhat dot com
   Target Milestone|--- |4.7.2
 Ever Confirmed|0   |1

--- Comment #1 from H.J. Lu  2012-06-13 04:58:35 
UTC ---
It is caused by revision 180944:

http://gcc.gnu.org/ml/gcc-cvs/2011-11/msg00230.html

[Bug target/53621] [SH] Frame pointers not generated with -fno-omit-frame-pointer on GCC 4.7.0

2012-06-12 Thread chrbr at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53621

--- Comment #10 from chrbr at gcc dot gnu.org 2012-06-13 05:59:16 UTC ---
currently analyzing a regression

gcc.dg/stack-usage-1.c scan-file foo\t(256|264)\tstatic

Don't know yet if it's a problem with the test or a side effect. But this
delays the commit.

[Bug rtl-optimization/53652] New: andn isn't used for vectorization

2012-06-12 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652

 Bug #: 53652
   Summary: *andn* isn't used for vectorization
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: ja...@gcc.gnu.org
CC: u...@gcc.gnu.org
Target: x86_64-linux


#define N 1024
long a[N], b[N], c[N];
int d[N], e[N], f[N];

void
foo (void)
{
  int i;
  for (i = 0; i < N; i++)
    a[i] = b[i] & ~c[i];
}

void
bar (void)
{
  int i;
  for (i = 0; i < N; i++)
    d[i] = e[i] & ~f[i];
}

doesn't use *andn* insns (e.g. vandnp[sd] for -O3 -mavx).  The problem is that
combiner doesn't help here, because
(insn 42 18 33 2 (set (reg:V4DI 94)
(mem/u/c:V4DI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [2 S32 A256])) -1
 (expr_list:REG_EQUAL (const_vector:V4DI [
(const_int -1 [0x])
(const_int -1 [0x])
(const_int -1 [0x])
(const_int -1 [0x])
])
(nil)))
is before the loop and thus in a different bb, so the combiner doesn't first
substitute the all ones constant into the xor (which should fail, as i?86
doesn't have a *not* SSE/AVX insn) and then, later on, substitute the xor into
the and (at which point it could figure out that and (xor x -1) y is andn).
I wonder if we should change the combiner somehow for the cases where a
REG_N_SETS == 1 pseudo has a REG_EQUAL note, or if we want instead to handle
this during expansion (introduce optional andnotM3 standard patterns?).
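
The identity the combiner would need to recognize can be written down for
scalars as follows. This only restates the algebra; and_not and
and_xor_ones are made-up helper names, and no vector instructions are
involved.

```cpp
#include <cstdint>

// What the source loops compute for each element.
constexpr std::uint64_t and_not(std::uint64_t b, std::uint64_t c)
{
  return b & ~c;
}

// The form seen before combine: NOT is expressed as XOR with all ones,
// followed by a separate AND.
constexpr std::uint64_t and_xor_ones(std::uint64_t b, std::uint64_t c)
{
  return b & (c ^ ~std::uint64_t{0});
}

static_assert(and_not(0xF0F0, 0xFF00) == and_xor_ones(0xF0F0, 0xFF00),
              "both forms agree");
```

Since the two forms are equal for all inputs, the xor with the all ones
constant followed by an and can always be replaced by a single andn.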

[Bug target/53621] [SH] Frame pointers not generated with -fno-omit-frame-pointer on GCC 4.7.0

2012-06-12 Thread kkojima at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53621

--- Comment #11 from Kazumoto Kojima  2012-06-13 
06:52:20 UTC ---
Looks like a problem with the test.  It should be tweaked by adding

#elif defined (__sh__)
#  define SIZE 252

for frame pointer save area.
