[Bug tree-optimization/59523] ICE on spec2000/176.gcc, 200.sixtrack after r205856 for -march=core-avx2

2013-12-17 Thread izamyatin at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59523

--- Comment #2 from Igor Zamyatin  ---
Created attachment 31454
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31454&action=edit
Reduced testcase


[Bug tree-optimization/52272] [4.7/4.8/4.9 regression] Performance regression of 410.bwaves on x86.

2013-12-17 Thread amker.cheng at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52272

bin.cheng  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amker.cheng at gmail dot com

--- Comment #21 from bin.cheng  ---
Hi Richard,
I looked into PR50955, for which the commit that caused this PR was
applied:

Commit 
2012-02-06  Richard Guenther  

PR tree-optimization/50955
* tree-ssa-loop-ivopts.c (get_computation_cost_at): Artificially
raise cost of expressions that replace an address with an
expression based on a different pointer.

I noticed that the offending non-linear use in PR50955 actually comes from a
memory reference.  If I understand the issue correctly, the whole alias problem
is introduced by rewriting an iv use with one base_object through a candidate
with another, incompatible base_object, and it is tied to memory references.  A
genuine non-linear iv use (where the pointer is never dereferenced, as in this
PR) won't have this issue.

So I came up with this idea to relax the condition:

-  if (address_p)
+  if (address_p
+  || (use->iv->base_object
+ && cand->iv->base_object
+ && POINTER_TYPE_P (TREE_TYPE (use->iv->base_object))
+ && POINTER_TYPE_P (TREE_TYPE (cand->iv->base_object))))
 {
   /* Do not try to express address of an object with computation based
 on address of a different object.  This may cause problems in rtl

so that it applies only to non-linear uses which truly occur in a memory
reference, something like:

-  if (address_p)
+  if (address_p
+  || (use->in_mem_ref_p
+ && use->iv->base_object
+ && cand->iv->base_object
+ && POINTER_TYPE_P (TREE_TYPE (use->iv->base_object))
+ && POINTER_TYPE_P (TREE_TYPE (cand->iv->base_object))))
 {
   /* Do not try to express address of an object with computation based
 on address of a different object.  This may cause problems in rtl

The flag in_mem_ref_p can be set for appropriate uses when finding interesting
address uses.

With this change, this PR should be resolved while not violating PR50955.

I am not very familiar with 50955, so how does this sound? I can send a patch
for review if the idea is in the right direction.

BTW, I cannot reproduce 50955 with the reported revision of GCC.  The store
isn't deleted by pass_cd_dce, though it is re-written just as the PR reported. 
So maybe I just misunderstood something.

Any words?

Thanks,
bin
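
The narrowed guard proposed in comment #21 can be sketched as a standalone
predicate. This is only an illustration under stated assumptions: the structs
below are hypothetical stand-ins rather than the real ivopts data structures,
pointer-ness is modeled with a plain flag instead of
POINTER_TYPE_P (TREE_TYPE (...)), and in_mem_ref_p is the flag the comment
proposes to add, not an existing field.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the ivopts data structures.  */
struct sketch_iv {
  const void *base_object;   /* NULL when no base object is known */
  bool base_is_pointer;      /* models POINTER_TYPE_P (TREE_TYPE (base_object)) */
};

struct sketch_use {
  struct sketch_iv iv;
  bool in_mem_ref_p;         /* proposed flag: the use occurs in a memory reference */
};

struct sketch_cand {
  struct sketch_iv iv;
};

/* Returns true when the "different base object" check should be entered,
   following the second (narrowed) condition quoted in the comment.  */
bool
base_object_check_applies (bool address_p,
                           const struct sketch_use *use,
                           const struct sketch_cand *cand)
{
  return address_p
         || (use->in_mem_ref_p
             && use->iv.base_object != NULL
             && cand->iv.base_object != NULL
             && use->iv.base_is_pointer
             && cand->iv.base_is_pointer);
}
```

With this shape, a genuine non-linear use whose pointer is never dereferenced
(in_mem_ref_p false, address_p false) skips the check, while address uses and
non-linear uses inside memory references still enter it.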


[Bug fortran/46371] [Coarray] [OOP] SELECT TYPE: scalar coarray variable is rejected

2013-12-17 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46371

Dominique d'Humieres  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-12-17
     Ever confirmed|0                           |1

--- Comment #4 from Dominique d'Humieres  ---
Still present at r206026. What happened to the patch?


[Bug rtl-optimization/59535] New: [4.9 regression] -Os code size regressions for Thumb1/Thumb2 (with LRA)?

2013-12-17 Thread rearnsha at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

            Bug ID: 59535
           Summary: [4.9 regression] -Os code size regressions for
                    Thumb1/Thumb2 (with LRA)?
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rearnsha at gcc dot gnu.org
                CC: ramana.radhakrishnan at arm dot com,
                    vmakarov at redhat dot com, yvan.roux at linaro dot org
            Target: arm

Running the CSiBE benchmark for -Os with Thumb1 and Thumb2 code generation
shows significant regressions since 2013-12-11 (the day LRA was turned on by
default for ARM).  These cause code size to grow back to the size it was in
2011.

I'll try to find some simple examples to use as test cases.


[Bug rtl-optimization/59535] [4.9 regression] -Os code size regressions for Thumb1/Thumb2 (with LRA)?

2013-12-17 Thread rearnsha at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

--- Comment #1 from Richard Earnshaw  ---
CSiBE code size results for Thumb2
2013/12/09 2543786
2013/12/11 2563522


[Bug ipa/59226] [4.9 Regression] ICE: in record_target_from_binfo, at ipa-devirt.c:661

2013-12-17 Thread aivchenk at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59226

--- Comment #10 from Alexander Ivchenko  ---
Patch from comment #7 didn't cure Android build as well..


[Bug ipa/59226] [4.9 Regression] ICE: in record_target_from_binfo, at ipa-devirt.c:661

2013-12-17 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59226

--- Comment #11 from Markus Trippelsdorf  ---
(In reply to Alexander Ivchenko from comment #10)
> Patch from comment #7 didn't cure Android build as well..

Can you try the patch from PR58252 comment 7? 
I've built Chromium (browser) successfully with it.


[Bug c++/58252] [4.9 Regression] ice in gimple_get_virt_method_for_binfo with -O2

2013-12-17 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58252

Markus Trippelsdorf  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |trippels at gcc dot gnu.org

--- Comment #8 from Markus Trippelsdorf  ---
FWIW, with the patch from comment 7 I could successfully build Chromium.


[Bug middle-end/59471] [4.9 Regression] ICE using vector extensions (non-top-level BIT_FIELD_REF, IMAGPART_EXPR or REALPART_EXPR)

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59471

--- Comment #4 from Jakub Jelinek  ---
I think BIT_FIELD_REF's type can't be a vector, so it has to be integral type
in this case.


[Bug rtl-optimization/59535] [4.9 regression] -Os code size regressions for Thumb1/Thumb2 (with LRA)?

2013-12-17 Thread rearnsha at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

--- Comment #2 from Richard Earnshaw  ---
CSiBE code size results for Thumb1

2013/12/09 2634640
2013/12/11 2683980

=1.8% size regression.
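
The relative growth behind these figures can be recomputed from the totals
posted in comments #1 and #2. The helper below is a minimal sketch; the name
percent_growth is made up for this illustration.

```c
/* Relative growth from `before` to `after`, expressed in percent.  */
double
percent_growth (double before, double after)
{
  return (after - before) / before * 100.0;
}
```

Called with the Thumb1 totals (2634640 and 2683980) this gives a little under
1.9%; with the Thumb2 totals (2543786 and 2563522) it gives a little under 0.8%.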


[Bug target/59371] [4.8/4.9 Regression] Performance regression in GCC 4.8 and later versions.

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek  ---
(In reply to Maciej W. Rozycki from comment #8)
> Richard,
> 
>  I wasn't aware integer promotions applied here, thanks for pointing it
> out.  New code is therefore correct while old one was not.  Unfortunately
> neither -fwrapv nor -funsafe-loop-optimizations changes anything.

But then it must be target specific thing.  Because, -fwrapv certainly changes
it to the same IL as has been emitted before that change (also with -fwrapv, of
course).

So, any reason not to close this PR, because while we generate slower code, the
slower code is actually correct while the old one was wrong?


[Bug target/54089] [SH] Refactor shift patterns

2013-12-17 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #30 from Oleg Endo  ---
A case from libmpeg2/slice.c:

mov.b   @(1,r10),r0    // load of shift amount
shld    r7,r6
add     #1,r6
extu.b  r0,r0          // zero extend shift amount
shld    r0,r1          // r1 <<= r0
mov     r1,r0

The zero extension of the shift amount variable could be omitted because shift
amounts > 31 are undefined behavior.  If the shift amount is in the valid range
of 0...31 the zero extension won't do anything.
A reduced test case:

int test33 (unsigned char* x, int y)
{
  return y << x[4];
}

results in:
mov.b   @(4,r4),r0
extu.b  r0,r0
shld    r0,r5
rts
mov     r5,r0
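
The claim that the zero extension is a no-op for valid shift amounts can be
checked with a small model. This is a sketch, not compiler code: it assumes
that mov.b sign-extends the loaded byte to 32 bits and that extu.b keeps only
the low 8 bits; the function names are invented for the illustration.

```c
#include <stdint.h>

/* Models "mov.b @(disp,Rm),R0": the loaded byte is sign-extended to 32 bits.  */
uint32_t
load_byte_sign_extended (uint8_t byte)
{
  return (byte & 0x80u) ? (0xFFFFFF00u | byte) : byte;
}

/* Models "extu.b Rm,Rn": keep the low 8 bits and clear the rest.  */
uint32_t
extu_b (uint32_t reg)
{
  return reg & 0xFFu;
}

/* Returns 1 if extu.b leaves the register unchanged for every valid
   shift amount 0..31, and 0 otherwise.  */
int
extu_b_is_noop_for_valid_shifts (void)
{
  for (unsigned amount = 0; amount <= 31; amount++)
    {
      uint32_t loaded = load_byte_sign_extended ((uint8_t) amount);
      if (extu_b (loaded) != loaded)
        return 0;
    }
  return 1;
}
```

The two forms only differ for bytes of 128 and above, which are far outside
the defined shift range.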


[Bug rtl-optimization/59535] [4.9 regression] -Os code size regressions for Thumb1/Thumb2 with LRA

2013-12-17 Thread rearnsha at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

Richard Earnshaw  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.9 regression] -Os code   |[4.9 regression] -Os code
                   |size regressions for        |size regressions for
                   |Thumb1/Thumb2 (with LRA)?   |Thumb1/Thumb2 with LRA

--- Comment #3 from Richard Earnshaw  ---
It seems that one of the major problems is a significant increase in the number
of mov instructions to copy registers from one location to another.  This is
probably due to the two-operand format instructions we have in Thumb.


[Bug rtl-optimization/57189] [4.9 Regression] Vector register is spilled for vector extract pattern

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57189

--- Comment #5 from Jakub Jelinek  ---
(In reply to Uroš Bizjak from comment #4)
> (In reply to Jakub Jelinek from comment #3)
> > Likely caused by r198611.
> 
> This is the patch that exposes the problem.
> 
> I have filed this PR due to the difference with IRA vs. reload, it looks
> that spill size should be somehow taken into account.

What was the point of the r198611 patch other than macroizing some insns
though?
Isn't it worth to just restore the previous behavior when it worked properly
with both reload and LRA?

[Bug rtl-optimization/59535] [4.9 regression] -Os code size regressions for Thumb1/Thumb2 with LRA

2013-12-17 Thread rearnsha at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

--- Comment #4 from Richard Earnshaw  ---
Created attachment 31455
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31455&action=edit
testcase

Compile with -Os -mthumb -mcpu=arm7tdmi -fno-short-enums and either -mlra or
-mno-lra


[Bug bootstrap/59536] New: [4.9 regression] internal compiler error: in cselib_record_set, at cselib.c:2376 breaks m68k-linux bootstrap

2013-12-17 Thread mikpelinux at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59536

            Bug ID: 59536
           Summary: [4.9 regression] internal compiler error: in
                    cselib_record_set, at cselib.c:2376 breaks m68k-linux
                    bootstrap
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: bootstrap
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mikpelinux at gmail dot com

Attempting to bootstrap gcc-4.9-20131215 (r206004) on m68k-linux fails with:

/mnt/scratch/objdir49/./prev-gcc/xg++ -B/mnt/scratch/objdir49/./prev-gcc/
-B/mnt/scratch/install49/m68k-unknown-linux-gnu/bin/ -nostdinc++
-B/mnt/scratch/objdir49/prev-m68k-unknown-linux-gnu/libstdc++-v3/src/.libs
-B/mnt/scratch/objdir49/prev-m68k-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
-I/mnt/scratch/objdir49/prev-m68k-unknown-linux-gnu/libstdc++-v3/include/m68k-unknown-linux-gnu
-I/mnt/scratch/objdir49/prev-m68k-unknown-linux-gnu/libstdc++-v3/include
-I/mnt/scratch/gcc-4.9-20131215/libstdc++-v3/libsupc++
-L/mnt/scratch/objdir49/prev-m68k-unknown-linux-gnu/libstdc++-v3/src/.libs
-L/mnt/scratch/objdir49/prev-m68k-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
-c   -g -O2 -gtoggle -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror   -DHAVE_CONFIG_H -I. -I.
-I/mnt/scratch/gcc-4.9-20131215/gcc -I/mnt/scratch/gcc-4.9-20131215/gcc/.
-I/mnt/scratch/gcc-4.9-20131215/gcc/../include
-I/mnt/scratch/gcc-4.9-20131215/gcc/../libcpp/include 
-I/mnt/scratch/gcc-4.9-20131215/gcc/../libdecnumber
-I/mnt/scratch/gcc-4.9-20131215/gcc/../libdecnumber/dpd -I../libdecnumber
-I/mnt/scratch/gcc-4.9-20131215/gcc/../libbacktrace -o
tree-loop-distribution.o -MT tree-loop-distribution.o -MMD -MP -MF
./.deps/tree-loop-distribution.TPo
/mnt/scratch/gcc-4.9-20131215/gcc/tree-loop-distribution.c
/mnt/scratch/gcc-4.9-20131215/gcc/tree-loop-distribution.c: In member function
'virtual unsigned int {anonymous}::pass_loop_distribution::execute()':
/mnt/scratch/gcc-4.9-20131215/gcc/tree-loop-distribution.c:1826:63: internal
compiler error: in cselib_record_set, at cselib.c:2376
   unsigned int execute () { return tree_loop_distribution (); }
   ^
0x80260fa7 cselib_record_set
/mnt/scratch/gcc-4.9-20131215/gcc/cselib.c:2376
0x80261715 cselib_record_sets
/mnt/scratch/gcc-4.9-20131215/gcc/cselib.c:2593
0x8026195b cselib_process_insn(rtx_def*)
/mnt/scratch/gcc-4.9-20131215/gcc/cselib.c:2668
0x804d77b1 reload_cse_regs_1
/mnt/scratch/gcc-4.9-20131215/gcc/postreload.c:222
0x804d731f reload_cse_regs
/mnt/scratch/gcc-4.9-20131215/gcc/postreload.c:68
0x804dc70d rest_of_handle_postreload
/mnt/scratch/gcc-4.9-20131215/gcc/postreload.c:2332
0x804dc789 execute
/mnt/scratch/gcc-4.9-20131215/gcc/postreload.c:2368
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
make[3]: *** [tree-loop-distribution.o] Error 1
make[3]: Leaving directory `/mnt/scratch/objdir49/gcc'
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory `/mnt/scratch/objdir49'
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory `/mnt/scratch/objdir49'
make: *** [bootstrap] Error 2

The previous weekly snapshot, gcc-4.9-20131208, bootstrapped fine.

Configured as:
/mnt/scratch/gcc-4.9-20131215/configure --prefix=/mnt/scratch/install49
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-linker-build-id
--enable-languages=c,c++ --disable-dssi
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --disable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--disable-sjlj-exceptions --disable-libmudflap --disable-plugin --disable-lto
--disable-multilib


[Bug rtl-optimization/59535] [4.9 regression] -Os code size regressions for Thumb1/Thumb2 with LRA

2013-12-17 Thread rearnsha at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

--- Comment #5 from Richard Earnshaw  ---
Number of register-register move operations in the testcase
lra:    208
no-lra: 105


[Bug rtl-optimization/57189] [4.9 Regression] Vector register is spilled for vector extract pattern

2013-12-17 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57189

--- Comment #6 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #5)
> (In reply to Uroš Bizjak from comment #4)
> > (In reply to Jakub Jelinek from comment #3)
> > > Likely caused by r198611.
> > 
> > This is the patch that exposes the problem.
> > 
> > I have filed this PR due to the difference with IRA vs. reload, it looks
> > that spill size should be somehow taken into account.
> 
> What was the point of the r198611 patch other than macroizing some insns
> though?
> Isn't it worth to just restore the previous behavior when it worked properly
> with both reload and LRA?

The patch added a missing alternative (xmm->mem, IIRC) that exposed this
problem.  Since there is no other weight in play, IRA is now free to spill the
V4SI input value from an xmm register and later partially load SImode to an
integer register.  However, we have a better alternative at hand, where we can
spill the SImode value.

Many other examples can be constructed for -march=k8, where interunit moves are
disabled. -march=core2 and other Intel processors are immune to this problem.

This is the reason why I think this problem should be solved in IRA in a
generic way. IRA should choose the most appropriate register spill out of
otherwise equal choices based on some criteria, so we won't have to
artificially limit insn alternatives.

[Bug ipa/59265] [4.9 Regression] Segmentation fault in ipa_note_param_call for -fprofile-use in SPEC CPU2006

2013-12-17 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59265

--- Comment #21 from Markus Trippelsdorf  ---
(In reply to Jan Hubicka from comment #20)
> Hmm, it may be someone altering the insns during streaming process.  You may
> try to check
> who is doing that while streaming out the relevant .o file.
> Is it compilation->linker streaming or wpa->ltrans?

It is wpa->ltrans:
 *** [/tmp/ccgCD7yp.ltrans19.ltrans.o] Error 1

> UIDs are initialized in renumber_gimple_stmt_uids calls in passes.c and then
> you can
> try to check if someone touch the statements
> 
> I will try to check if I spot something obvious, but probably only tomorrow.

I've tried to bisect the issue, but it's messy. 
However I think I can rule out any commit since r205447.
If r205447 is the culprit I cannot say for sure,
because Firefox ICEs during the build.


[Bug lto/59505] gcc-4.9.0-20131208 can't link glsl_compiler with -flto=4 in -m32 where gcc-4.8.2 works fine

2013-12-17 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59505

Markus Trippelsdorf  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |trippels at gcc dot gnu.org
         Resolution|---                         |INVALID

--- Comment #13 from Markus Trippelsdorf  ---
Let's close this one.


[Bug target/53949] [SH] Add support for mac.w / mac.l instructions

2013-12-17 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949

--- Comment #10 from Oleg Endo  ---
I was wondering whether it would make sense to convert sequences such as

                              SH4   SH4A
    mov.l   @r15,r3           LS/2  LS/2
    mul.l   r2,r3             CO/4  EX/3
    sts     macl,r3           CO/3  LS/2
    add     r1,r3             EX/1  EX/1

into
    mov     r15,r0            MT/0  MT/1
    mov.l   r2,@-r15          LS/1  LS/1
    lds     r1,macl           CO/3  LS/1
    mac.l   @r15+,@r0+        CO/4  CO/5
    sts     macl,r3           CO/3  LS/2

Looking simply at the issue cycles (the numbers above) would suggest that it's
not worth doing it, at least not if the value has to be pulled out from the mac
register immediately after the mac operation.  Probably it's not beneficial to
emit a single mac insn if the data is not already in place so that it can be
reached easily with the post-inc addressing.
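
A naive way to compare the two sequences is to add up the per-instruction
numbers from the table. The sketch below does only that; it deliberately
ignores dual issue and pipeline overlap, so it is an upper-level illustration
of the "looking simply at the numbers" argument and not a timing model. The
array and function names are invented here.

```c
#include <stddef.h>

/* Cycle numbers copied from the table, one entry per instruction.  */
static const int mul_seq_sh4[]  = { 2, 4, 3, 1 };
static const int mac_seq_sh4[]  = { 0, 1, 3, 4, 3 };
static const int mul_seq_sh4a[] = { 2, 3, 2, 1 };
static const int mac_seq_sh4a[] = { 1, 1, 1, 5, 2 };

/* Sum of the per-instruction cycle numbers of one sequence.  */
int
total_cycles (const int *cycles, size_t count)
{
  int total = 0;
  for (size_t i = 0; i < count; i++)
    total += cycles[i];
  return total;
}
```

Summed this way, the mul.l sequence comes to 10 (SH4) and 8 (SH4A), while the
mac.l sequence comes to 11 and 10, which matches the conclusion that the
conversion does not pay off in this form.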

On the other hand something like ...

int test33 (int* x, int y, int z)
{
  return x[0] * 40 + z;
}

currently compiles to:
mov.l   @r4,r2
mov #40,r1
mul.l   r1,r2
sts macl,r0
rts
add r6,r0

where this one maybe could be better:
mova    .L40,r0
lds     r6,macl
mac.l   @r4+,@r0+
rts
sts     macl,r0

.align 2
.L40:   .long   40


[Bug target/53949] [SH] Add support for mac.w / mac.l instructions

2013-12-17 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53949

--- Comment #11 from Oleg Endo  ---
Another question is whether the following is OK to do on all SH
implementations:

int test33 (int x, int y, int z)
{
  return x * y + z;
}

currently compiles:
mul.l   r5,r4
sts macl,r0
rts
add r6,r0

could also be done as:
lds r6,macl
mov.l   r4,@-r15
mov.l   r5,@-r15
mac.l   @r15+,@r15+
rts
sts macl,r0

This assumes that a mac insn with both address operands being the same works
exactly as it's described in the Renesas manuals:

  tempn = Read_32 (R[n]);
  R[n] += 4;
  tempm = Read_32 (R[m]);
  R[m] += 4;

However, I don't know whether this is true for all SH implementations.
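
What the proposed sequence relies on can be modeled directly from the manual
pseudo-code quoted above. The following is a sketch under assumptions: the
tiny CPU struct is hypothetical, only MACL is modeled (MACH and the S-bit
saturation mode are ignored), and results are assumed to fit in 32 bits.

```c
#include <stdint.h>

/* Hypothetical, minimal machine state: a few words of stack memory,
   a stack pointer expressed as a word index, and the MACL register.  */
struct sketch_cpu {
  int32_t stack[8];
  unsigned sp;          /* index of the word on top of the stack */
  int32_t macl;
};

/* Models "mov.l Rm,@-r15": pre-decrement, then store.  */
static void
push (struct sketch_cpu *cpu, int32_t value)
{
  cpu->sp -= 1;
  cpu->stack[cpu->sp] = value;
}

/* Models "mac.l @r15+,@r15+" with both operands using the same register,
   in the order given by the manual pseudo-code.  */
static void
mac_l_same_reg (struct sketch_cpu *cpu)
{
  int32_t tempn = cpu->stack[cpu->sp];   /* tempn = Read_32 (R[n]); */
  cpu->sp += 1;                          /* R[n] += 4;              */
  int32_t tempm = cpu->stack[cpu->sp];   /* tempm = Read_32 (R[m]); */
  cpu->sp += 1;                          /* R[m] += 4;              */
  /* Accumulate; only the low 32 bits are kept in this model.  */
  cpu->macl = (int32_t) ((int64_t) cpu->macl + (int64_t) tempn * tempm);
}

/* Mirrors the proposed instruction sequence for x * y + z.  */
int32_t
mul_add_via_mac (int32_t x, int32_t y, int32_t z)
{
  struct sketch_cpu cpu = { {0}, 8, 0 };
  cpu.macl = z;            /* lds   r6,macl     */
  push (&cpu, x);          /* mov.l r4,@-r15    */
  push (&cpu, y);          /* mov.l r5,@-r15    */
  mac_l_same_reg (&cpu);   /* mac.l @r15+,@r15+ */
  return cpu.macl;         /* sts   macl,r0     */
}
```

If an implementation instead performed both reads before either increment,
the same word would be read twice and the result would be y * y + z, which is
exactly the behaviour the open question is about.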


[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction

2013-12-17 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263

--- Comment #22 from Oleg Endo  ---
(In reply to Oleg Endo from comment #21)
> What happens is that the sequence is expanded to RTL as follows:
> 
> (insn 7 4 8 2 (set (reg:SI 163 [ D.1856 ])
> (and:SI (reg/v:SI 162 [ xb ])
> (const_int 33 [0x21]))) sh_tmp.cpp:17 -1
>  (nil))
> (insn 8 7 9 2 (set (reg:SI 147 t)
> (eq:SI (reg:SI 163 [ D.1856 ])
> (const_int 0 [0]))) sh_tmp.cpp:17 -1
>  (nil))
> (jump_insn 9 8 10 2 (set (pc)
> (if_then_else (eq (reg:SI 147 t)
> (const_int 0 [0]))
> (label_ref:SI 15)
> (pc))) sh_tmp.cpp:17 301 {*cbranch_t}
>  (int_list:REG_BR_PROB 3900 (nil))
>  -> 15)
> (note 10 9 11 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> (insn 11 10 12 4 (set (reg:SI 164)
> (const_int 0 [0])) sh_tmp.cpp:18 -1
>  (nil))
> (insn 12 11 15 4 (set (mem:SI (reg/v/f:SI 161 [ x ]) [2 *x_5(D)+0 S4 A32])
> (reg:SI 164)) sh_tmp.cpp:18 -1
>  (nil))
> 
> 
> and insn 11 becomes dead code and is eliminated.
> All of that happens long time before combine, so the tst combine patterns
> have no chance to reconstruct the original code.
> 

Adding an early peephole pass as described in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59533#c2 and then adding the
following peephole:

;; Peephole after initial expansion.
(define_peephole2
  [(set (match_operand:SI 0 "arith_reg_dest")
        (and:SI (match_operand:SI 1 "arith_reg_operand")
                (match_operand:SI 2 "logical_operand")))
   (set (reg:SI T_REG) (eq:SI (match_dup 0) (const_int 0)))]
  "TARGET_SH1 && can_create_pseudo_p ()"
  [(set (reg:SI T_REG) (eq:SI (and:SI (match_dup 1) (match_dup 2))
                              (const_int 0)))
   (set (match_dup 0) (and:SI (match_dup 1) (match_dup 2)))])

... fixes the problem and results in more uses of the tst #imm,r0 insn
according to the CSiBE set.  On the other hand there is a total code size
increase of 792 bytes on the whole set.  Below are some things that get worse
in the Linux source (mm/filemap.c):

mov.b   @(15,r1),r0     ->      mov.b   @(15,r1),r0
cmp/pz  r0                      tst     #128,r0   // cmp/pz has less
bf      .L1016                  bf      .L1001    // pressure on r0


mov.b   @(15,r0),r0     ->      mov.b   @(15,r0),r0
tst     #4,r0                   shar    r0
bf      .L107                   shar    r0
                                tst     #1,r0


add     #16,r0          ->      add     #16,r0
mov.b   @(15,r0),r0             mov.b   @(15,r0),r0
tst     #16,r0                  mov     #-4,r1
bf/s    .L509                   shad    r1,r0
                                tst     #1,r0
                                bf/s    .L509
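
The left and right columns above set the T bit identically, so the regression
is in instruction count and register pressure only. A small model can confirm
the equivalence for every possible loaded byte. This is a sketch with assumed
semantics: mov.b sign-extends, tst #imm sets T when the masked value is zero,
cmp/pz sets T when the register is non-negative, shar shifts right
arithmetically by one, and shad with -4 is modeled as four shar steps. All
names are invented for the illustration.

```c
#include <stdint.h>

/* Registers are modeled as uint32_t so that every shift is well defined in C.  */

/* "mov.b": byte load with sign extension.  */
static uint32_t
load_byte (uint8_t b)
{
  return (b & 0x80u) ? (0xFFFFFF00u | b) : b;
}

/* "tst #imm,r0": T = 1 when (r0 & imm) == 0.  */
static int
tst_imm (uint32_t r0, uint32_t imm)
{
  return (r0 & imm) == 0;
}

/* "cmp/pz r0": T = 1 when r0 is non-negative as a signed value.  */
static int
cmp_pz (uint32_t r0)
{
  return (r0 & 0x80000000u) == 0;
}

/* "shar r0": arithmetic shift right by one bit.  */
static uint32_t
shar (uint32_t r0)
{
  return (r0 >> 1) | (r0 & 0x80000000u);
}

/* Returns 1 when each pair of sequences yields the same T bit for all
   256 byte values, and 0 otherwise.  */
int
tst_variants_agree (void)
{
  for (unsigned b = 0; b <= 0xFF; b++)
    {
      uint32_t r0 = load_byte ((uint8_t) b);
      uint32_t shifted4 = shar (shar (shar (shar (r0))));  /* shad by -4 */

      if (cmp_pz (r0) != tst_imm (r0, 128))
        return 0;
      if (tst_imm (r0, 4) != tst_imm (shar (shar (r0)), 1))
        return 0;
      if (tst_imm (r0, 16) != tst_imm (shifted4, 1))
        return 0;
    }
  return 1;
}
```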


[Bug rtl-optimization/57422] [4.9 Regression] ICE: SIGSEGV in dominated_by_p with custom flags

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57422

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |abel at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Started with r198695, but guess it just uncovered a latent issue in the
selective scheduler.

The reason why FENCE_INSN doesn't have BLOCK_FOR_INSN is that it has been
earlier removed:
#0  delete_insn (insn=0x719e8750) at ../../gcc/cfgrtl.c:175
#1  0x00a276e1 in sel_remove_insn (insn=0x719e8750,
only_disconnect=false, full_tidying=false) at ../../gcc/sel-sched-ir.c:3938
#2  0x00a3e035 in remove_insn_from_stream (insn=0x719e8750,
only_disconnect=false) at ../../gcc/sel-sched.c:6042
#3  0x00a3e0ea in move_op_orig_expr_found (insn=0x719e8750,
expr=0x19b6848, lparams=0x7fffdc60, static_params=0x7fffdc30)
at ../../gcc/sel-sched.c:6065
#4  0x00a3ecdb in code_motion_path_driver (insn=0x719e8750,
orig_ops=0x19b6840, path=0x19b7f38, local_params_in=0x7fffdc60, 
static_params=0x7fffdc30) at ../../gcc/sel-sched.c:6603
#5  0x00a3f000 in move_op (insn=0x719e8750, orig_ops=0x19b5058,
expr_vliw=0x19b7e50, dest=0x719fc9e0, c_expr=0x7fffdd40, 
should_move=0x7fffdd1a) at ../../gcc/sel-sched.c:6758
#6  0x00a3c651 in move_exprs_to_boundary (bnd=0x19b8af8,
expr_vliw=0x19b7e50, expr_seq=0x19b5058, c_expr=0x7fffdd40)
at ../../gcc/sel-sched.c:5292
#7  0x00a3d122 in schedule_expr_on_boundary (bnd=0x19b8af8,
expr_vliw=0x19b7e50, seqno=-1) at ../../gcc/sel-sched.c:5504
#8  0x00a3d55c in fill_insns (fence=0x19b7748, seqno=-1,
scheduled_insns_tailpp=0x7fffdf20) at ../../gcc/sel-sched.c:5646
#9  0x00a406ae in schedule_on_fences (fences=0x19b7740, max_seqno=30,
scheduled_insns_tailpp=0x7fffdf20) at ../../gcc/sel-sched.c:7410
#10 0x00a40b16 in sel_sched_region_2 (orig_max_seqno=85) at
../../gcc/sel-sched.c:7544
#11 0x00a40c83 in sel_sched_region_1 () at ../../gcc/sel-sched.c:7583
#12 0x00a410cf in sel_sched_region (rgn=0) at
../../gcc/sel-sched.c:7684
#13 0x00a411e9 in run_selective_scheduling () at
../../gcc/sel-sched.c:7760

and within the same fill_insns call (so the same fence) in the next cycle we
ICE.


[Bug ipa/59265] [4.9 Regression] Segmentation fault in ipa_note_param_call for -fprofile-use in SPEC CPU2006

2013-12-17 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59265

--- Comment #22 from Markus Trippelsdorf  ---
(In reply to Markus Trippelsdorf from comment #21)
> 
> I've tried to bisect the issue, but it's messy. 
> However I think I can rule out any commit since r205447.
> If r205447 is the culprit I cannot say for sure,
> because Firefox ICEs during the build.

No, it is not r205447, because reverting it on top of trunk
doesn't fix the issue.


[Bug rtl-optimization/59350] [4.9 regression] ICE: in vt_expand_var_loc_chain, at var-tracking.c:8212

2013-12-17 Thread dcb314 at hotmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59350

--- Comment #31 from David Binderman  ---
Created attachment 31456
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31456&action=edit
C source code


[Bug rtl-optimization/59350] [4.9 regression] ICE: in vt_expand_var_loc_chain, at var-tracking.c:8212

2013-12-17 Thread dcb314 at hotmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59350

--- Comment #32 from David Binderman  ---
Same error with gcc trunk, dated 20131215, for attached
source code.

Flags -O3 -g -fPIC -fstack-protector-strong required.

[dcb@zippy4 foundBugs]$ ../results/bin/gcc -c -O3 -g -fPIC
-fstack-protector-strong bug126.c
zprime.c: In function ‘zfactor’:
zprime.c:577:1: internal compiler error: in vt_expand_var_loc_chain, at
var-tracking.c:8213
0xd1b1d0 vt_expand_var_loc_chain
../../src/trunk/gcc/var-tracking.c:8213
0xd1b5c7 vt_expand_loc_callback
../../src/trunk/gcc/var-tracking.c:8409
0x6d44de cselib_expand_value_rtx_1
../../src/trunk/gcc/cselib.c:1684
0x6d44de cselib_expand_value_rtx_cb(rtx_def*, bitmap_head*, int, rtx_def*
(*)(rtx_def*, bitmap_head*, int, void*), void*)
../../src/trunk/gcc/cselib.c:1531

...

[Bug rtl-optimization/59535] [4.9 regression] -Os code size regressions for Thumb1/Thumb2 with LRA

2013-12-17 Thread rearnsha at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

--- Comment #6 from Richard Earnshaw  ---
Created attachment 31457
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31457&action=edit
Another testcase

Another testcase, but this one has some obvious examples of poor behaviour for
-Os.

In addition to the options used on the previous case, this might need
-fno-strict-aliasing -fno-common -fomit-frame-pointer -fno-strength-reduce 

Example one, spilling a value and then keeping a copy in a hard reg over a
call.

        mov     r5, r1          <= R1 copied to R5
        sub     sp, sp, #28
        str     r1, [sp, #8]    <= And spilled to the stack
        mov     r2, #12
        mov     r1, #0
        mov     r4, r0
        bl      memset
        mov     r3, #2
        mov     r2, r5          <= Could reload from the stack instead

Example two, use of multiple reloads to use high register:

        ldr     r3, [sp, #4]
        mov     ip, r3          <= Copying value into high register
        add     ip, ip, r5      <= Arithmetic
        mov     r3, ip          <= Copying result back to original register
        str     r3, [sp, #4]
        ldr     r3, [sp, #12]
        mov     ip, r3          <= And IP is dead anyway...

In this case,
        mov     ip, r3
        add     ip, ip, r5
        mov     r3, ip

can be replaced entirely with
        add     r3, r5
saving two completely unnecessary MOV instructions.

Third, related case,

        mov     r1, #12
        mov     ip, r1
        add     ip, ip, r4
        mov     r1, ip

Could be done either as
        mov     r1, #12
        add     r1, r4
        mov     ip, r1
or
        mov     r1, r4
        add     r1, #12
        mov     ip, r1

both saving one instruction, or even two if the value doesn't really need
copying to a high reg.


[Bug target/59305] [4.9 Regression] gcc.dg/atomic/c11-atomic-exec-5.c fails with WARNING: program timed out on x86_64-apple-darwin13

2013-12-17 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59305

--- Comment #7 from Dominique d'Humieres  ---
Created attachment 31458
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31458&action=edit
reduced test case

c11-atomic-exec-5.c reduced to the test of complex_long_double_add_overflow
only. It takes between 35 and 40 s on an unloaded machine (less than 1 s for
the full test on a fully loaded machine).


[Bug target/59305] [4.9 Regression] gcc.dg/atomic/c11-atomic-exec-5.c fails with WARNING: program timed out on x86_64-apple-darwin13

2013-12-17 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59305

--- Comment #8 from Dominique d'Humieres  ---
Created attachment 31459
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31459&action=edit
test without the complex instances

The running time fluctuates between 1.6 and 7.5s on an unloaded machine.


[Bug middle-end/55500] [devirt] trunk fails inline-devirt test #7

2013-12-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55500

Jan Hubicka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-12-17
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |mkuvyrkov at ispras dot ru,
                   |                            |rguenther at suse dot de
     Ever confirmed|0                           |1

--- Comment #1 from Jan Hubicka  ---
The testcase still has three OBJ_TYPE_REF.

  # ivtmp_25 = PHI 
  _18 = operator new (16);
  MEM[(struct LinuxSocket *)_18].D.2889.D.2853._vptr.Stream = &MEM[(void
*)&_ZTV11LinuxSocket + 16B];
  MEM[(struct LinuxSocket *)_18].D.2889.D.2854._vptr.Connection = &MEM[(void
*)&_ZTV11LinuxSocket + 72B];
  LinuxSocket::open (_18);
  goto ;

  :
  __builtin_puts (&"got it"[0]);

  :
  _19 = MEM[(struct Stream *)_18]._vptr.Stream;
  _20 = *_19;
  _21 = OBJ_TYPE_REF(_20;(struct Stream)_18->0) (_18);


Here LinuxSocket::open calls __builtin_puts, so it is not pure, and therefore
basic folding cannot propagate the known value of the vptr pointers.

We sort of agreed not to mark operator new as malloc by default, so I
understand that AA cannot easily determine that puts is not touching _18.

Maxim, how did this testcase pass with your patches?

Perhaps type-based analysis can work out the type, using the assumption that
an in-place new, which would be required to change the type, is not allowed to
change the type of a non-POD, but this kind of analysis is not in yet
(I have a patch that would make this happen with the ctor not being inlined).


[Bug tree-optimization/59523] [4.9 Regression] r205856 caused internal compiler error: verify_ssa failed

2013-12-17 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59523

H.J. Lu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
                 CC|                            |jakub at redhat dot com
   Target Milestone|---                         |4.9.0
            Summary|ICE on spec2000/176.gcc,    |[4.9 Regression] r205856
                   |200.sixtrack after r205856  |caused internal compiler
                   |for -march=core-avx2        |error: verify_ssa failed


[Bug gcov-profile/59527] [4.9 Regression] ICE: in fixup_reorder_chain, at cfgrtl.c:3739 during PGO Firefox build

2013-12-17 Thread tejohnson at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59527

--- Comment #3 from Teresa Johnson  ---
On Mon, Dec 16, 2013 at 10:41 AM, Teresa Johnson  wrote:
> I will take a look and report back. -freorder-blocks-and-partition was
> recently enabled by default, which presumably exposed this issue.

The issue is that there is a region crossing branch that cannot be
optimized away (since it is needed to cross the region boundary), and
fixup_reorder_chain was not handling this case. In the case where
there was no fallthru for a conditional jump the comments indicate
that this can happen if the conditional jump has side effects and
can't be deleted, in which case a barrier is inserted and no change is
made to the branch. In this case since the branch is region crossing
it also cannot be eliminated, but the assert was not handling that
case. I fixed this by simply adding a check for it to the assert.

This routine already has some handling for region-crossing branches,
but it was only handling the case where there was both a taken and
fallthru edge. In this case we had no fallthru. The reason was that
the fallthru had been eliminated in an earlier round of cfg
optimizations when going in/out of cfglayout mode during
pro_and_epilogue. The fallthru was an empty block that appears to be
due to switch expansion with the case having a
__builtin_unreachable().

Here is the patch that fixes the issue, regression testing in progress:

2013-12-17  Teresa Johnson  

* cfgrtl.c (fixup_reorder_chain): Handle a region-crossing
branch, which can't be eliminated.

Index: cfgrtl.c
===
--- cfgrtl.c(revision 206033)
+++ cfgrtl.c(working copy)
@@ -3736,7 +3736,8 @@ fixup_reorder_chain (void)
  if (!e_fall)
{
  gcc_assert (!onlyjump_p (bb_end_insn)
- || returnjump_p (bb_end_insn));
+ || returnjump_p (bb_end_insn)
+  || (e_taken->flags & EDGE_CROSSING));
  emit_barrier_after (bb_end_insn);
  continue;
}
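
For readers who want the relaxed condition in isolation, here is a minimal
standalone sketch. It is a toy model and not GCC code: the struct, its field
names and no_fallthru_case_ok are hypothetical, and only the three-way
condition mirrors the patch above.

```c
#include <stdbool.h>

/* Toy stand-in for the properties of a block-ending jump that the
   assert in fixup_reorder_chain inspects.  All names are hypothetical.  */
struct end_jump
{
  bool only_jump;        /* models onlyjump_p (bb_end_insn) */
  bool return_jump;      /* models returnjump_p (bb_end_insn) */
  bool taken_crossing;   /* models e_taken->flags & EDGE_CROSSING */
};

/* With no fallthru edge, the jump is acceptable if it is not a pure
   jump (it has side effects), if it is a return jump, or, with the
   patch, if its taken edge crosses a region boundary.  */
static bool
no_fallthru_case_ok (const struct end_jump *j)
{
  return !j->only_jump || j->return_jump || j->taken_crossing;
}
```

In this model the ICE corresponds to a pure, non-return jump whose taken edge
crosses regions: the old two-way condition rejects it, the three-way one
accepts it.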


Thanks,
Teresa

> Thanks,
> Teresa
>
> On Mon, Dec 16, 2013 at 8:21 AM, octoploid at yandex dot com
>  wrote:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59527
>>
>> --- Comment #1 from Markus Trippelsdorf  ---
>> Created attachment 31447
>>   --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31447&action=edit
>> unreduced testcase
>>
>>  % g++ -w -r -nostdlib -fprofile-use -fprofile-correction -march=amdfam10
>> -fno-exceptions -std=gnu++0x -O3 test.ii
>> In file included from /var/tmp/moz-build-dir/js/src/Unified_cpp_9.cpp:101:0:
>> /var/tmp/mozilla-central/js/src/vm/Stack.cpp: In member function
>> ‘js::ScriptFrameIter& js::ScriptFrameIter::operator++()’:
>> /var/tmp/mozilla-central/js/src/vm/Stack.cpp:717:1: internal compiler error: in
>> fixup_reorder_chain, at cfgrtl.c:3739

[Bug testsuite/59534] [4.9 Regression] FAIL: libgomp.fortran/retval1.f90 execution test due to denormals

2013-12-17 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59534

Uroš Bizjak  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2013-12-17
 CC||burnus at gcc dot gnu.org
  Component|libgomp |testsuite
 Ever confirmed|0   |1

--- Comment #9 from Uroš Bizjak  ---
Recategorized as (fortran) testsuite bug.

Adding CC of fortran expert.

[Bug c/59486] math functions take more cycles after running any Intel AVX function

2013-12-17 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59486

H.J. Lu  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
   Last reconfirmed||2013-12-17
Version|unknown |4.4.6
 Resolution|INVALID |---
 Ever confirmed|0   |1

--- Comment #2 from H.J. Lu  ---
It was fixed in GCC 4.6 by adding -mvzeroupper.


[Bug tree-optimization/52272] [4.7/4.8/4.9 regression] Performance regression of 410.bwaves on x86.

2013-12-17 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52272

--- Comment #22 from rguenther at suse dot de  ---
On 12/17/13 9:29 AM, amker.cheng at gmail dot com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52272
> 
> bin.cheng  changed:
> 
>What|Removed |Added
> 
>  CC||amker.cheng at gmail dot com
> 
> --- Comment #21 from bin.cheng  ---
> Hi Richard,
> I looked into PR50955 for which the mentioned commit causing this PR is
> applied:
> 
> Commit 
> 2012-02-06  Richard Guenther  
> 
> PR tree-optimization/50955
> * tree-ssa-loop-ivopts.c (get_computation_cost_at): Artificially
> raise cost of expressions that replace an address with an
> expression based on a different pointer.
> 
> I noticed that the offending non-linear use in PR50955 is actually from memory
> reference.  If I understand the issue correct, the whole alias issue is
> introduced by rewriting iv use with one base_object through candidate with
> another incompatible base_object, and it is related to memory reference.  An
> genuine non-linear iv use (the pointer never de-referenced, like in this PR)
> won't have this issue.
> 
> So I come up this idea to relax the condition:
> 
> -  if (address_p)
> +  if (address_p
> +  || (use->iv->base_object
> + && cand->iv->base_object
> + && POINTER_TYPE_P (TREE_TYPE (use->iv->base_object))
> + && POINTER_TYPE_P (TREE_TYPE (cand->iv->base_object
>  {
>/* Do not try to express address of an object with computation based
>  on address of a different object.  This may cause problems in rtl
> 
> to non-linear uses which truly occurred in memory reference, something like:
> 
> -  if (address_p)
> +  if (address_p
> +  || (use->in_mem_ref_p
> + && use->iv->base_object
> + && cand->iv->base_object
> + && POINTER_TYPE_P (TREE_TYPE (use->iv->base_object))
> + && POINTER_TYPE_P (TREE_TYPE (cand->iv->base_object
>  {
>/* Do not try to express address of an object with computation based
>  on address of a different object.  This may cause problems in rtl
> 
> The flag in_mem_ref_p can be set for appropriate uses when finding interesting
> address uses.
> 
> With this change, this PR should be resolved while not violating PR50955.
> 
> I am not very much into 50955, so how does this sound? I can send a patch for
> review if the idea is in right direction.

I'm not 100% sure.

> BTW, I cannot reproduce 50955 with the reported revision of GCC.  The store
> isn't deleted by pass_cd_dce, though it is re-written just as the PR 
> reported. 
> So maybe I just misunderstood something.

It's been too long to remember ;)  The issue boils down to a bogus
TMR_SYMBOL. Later passes may forward a !use->in_mem_ref into a mem-ref,
exposing the issue as well.
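
To make the second variant quoted above concrete, here is a rough standalone
sketch of the proposed guard. It is only a toy model: the struct types and
guard_applies are hypothetical and are not the real ivopts data structures,
address_p is modelled as a field for convenience, and in_mem_ref_p is the flag
proposed in comment #21, not something that exists today.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model; none of these types are the real ivopts structures.  */
struct toy_iv
{
  const void *base_object;   /* NULL when there is no base object */
  bool base_is_pointer;      /* models POINTER_TYPE_P (TREE_TYPE (...)) */
};

struct toy_use
{
  struct toy_iv iv;
  bool address_p;            /* the use is an address use */
  bool in_mem_ref_p;         /* proposed flag: use occurs in a memory ref */
};

struct toy_cand
{
  struct toy_iv iv;
};

/* Returns true when the "different base object" check would be entered
   under the second variant of the condition.  */
static bool
guard_applies (const struct toy_use *use, const struct toy_cand *cand)
{
  return use->address_p
         || (use->in_mem_ref_p
             && use->iv.base_object != NULL
             && cand->iv.base_object != NULL
             && use->iv.base_is_pointer
             && cand->iv.base_is_pointer);
}
```

Note that the sketch only looks at the flag at the time of the query. It
cannot capture the concern above, where a use that is not in a mem-ref is
forwarded into one by a later pass.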

Richard.

> Any words?
> 
> Thanks,
> bin
>


[Bug c/59486] math functions take more cycles after running any Intel AVX function

2013-12-17 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59486

H.J. Lu  changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
  Known to work||4.6.0, 4.7.0, 4.8.0
 Resolution|--- |FIXED
   Target Milestone|--- |4.6.0

--- Comment #3 from H.J. Lu  ---
Fixed.


[Bug middle-end/55498] [devirt] trunk fails inline-devirt test #6 due to lack of return functions

2013-12-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55498

Jan Hubicka  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org
Summary|[devirt] trunk fails|[devirt] trunk fails
   |inline-devirt test #6   |inline-devirt test #6 due
   ||to lack of return functions

--- Comment #4 from Jan Hubicka  ---
Indeed, this is a testcase for return functions in ipa-prop.  All other
solutions would need iteration and be fragile/unreliable.

-funroll-loops does not seem to have any effect on the second testcase for me.
We end up with direct calls to one() and two() there.


[Bug testsuite/59534] [4.9 Regression] FAIL: libgomp.fortran/retval1.f90 execution test due to denormals

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59534

--- Comment #10 from Jakub Jelinek  ---
Author: jakub
Date: Tue Dec 17 15:17:00 2013
New Revision: 206051

URL: http://gcc.gnu.org/viewcvs?rev=206051&root=gcc&view=rev
Log:
PR testsuite/59534
* testsuite/libgomp.fortran/retval1.f90 (e5): Avoid non-shortcircuited
comparisons.

Modified:
trunk/libgomp/ChangeLog
trunk/libgomp/testsuite/libgomp.fortran/retval1.f90


[Bug tree-optimization/59523] [4.9 Regression] r205856 caused internal compiler error: verify_ssa failed

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59523

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Reduced testcase:
/* PR tree-optimization/59523 */
/* { dg-do compile } */
/* { dg-options "-Ofast" } */
/* { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } } */

int *
foo (int a, int *b, int *c)
{
  int i, *r = __builtin_alloca (a * sizeof (int));
  __builtin_memset (r, 0, a * sizeof (int));
  for (i = 0; i < 64; i++)
    c[i] += b[i];
  for (i = 0; i < a; i++)
    if (r[i] == 0)
      r[i] = 1;
  return r;
}


[Bug tree-optimization/59523] [4.9 Regression] r205856 caused internal compiler error: verify_ssa failed

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59523

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
Created attachment 31460
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31460&action=edit
gcc49-pr59523.patch

Untested fix.


[Bug middle-end/54957] Two crashes introduced by rev192488

2013-12-17 Thread ktietz at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54957

Kai Tietz  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||ktietz at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #22 from Kai Tietz  ---
As the issue was fixed for 4.8 and also for 4.9, I am closing this bug.


[Bug rtl-optimization/59350] [4.9 regression] ICE: in vt_expand_var_loc_chain, at var-tracking.c:8212

2013-12-17 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59350

Eric Botcazou  changed:

   What|Removed |Added

 Status|WAITING |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ebotcazou at gcc dot gnu.org

--- Comment #33 from Eric Botcazou  ---
OK, thanks for the testcase, which boils down to:

typedef struct
{
  void *v;
  int len;
  int sign;
} ZVALUE;

extern int pred (ZVALUE);

static unsigned long
small_factor (ZVALUE z)
{
  if (z.len > 0)
    return 0;

  return pred (z) ? -1 : 0;
}

unsigned long
zfactor (ZVALUE z)
{
  z.sign = 0;
  return small_factor (z);
}

eric@polaris:~/build/gcc/native> gcc/xgcc -Bgcc -S -O -g pr59350-2.c
pr59350-2.c: In function 'zfactor':
pr59350-2.c:24:1: internal compiler error: in vt_expand_var_loc_chain, at
var-tracking.c:8213
 }
 ^
0xd6d63c vt_expand_var_loc_chain
/home/eric/svn/gcc/gcc/var-tracking.c:8213
0xd6d63c vt_expand_loc_callback
/home/eric/svn/gcc/gcc/var-tracking.c:8409


[Bug fortran/35913] INTRINISIC vs. host-associated procedures (check conformance)

2013-12-17 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35913

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #4 from Dominique d'Humieres  ---
> Nearly two years unconfirmed time to close this one as a WONTFIX?

More than two and a half years later without feedback; closing.


[Bug fortran/41823] gcc/fortran/trans-openmp.c: possible null pointer dereference

2013-12-17 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41823

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #5 from Dominique d'Humieres  ---
No feedback for almost two years. Closing.


[Bug fortran/42478] [meta-bug] gfortran OpenMP bugs

2013-12-17 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42478

Bug 42478 depends on bug 41823, which changed state.

Bug 41823 Summary: gcc/fortran/trans-openmp.c: possible null pointer dereference
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41823

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX


[Bug fortran/51610] [OOP] Class container does not properly handle POINTER and TARGET

2013-12-17 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51610

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|REOPENED|NEW

--- Comment #5 from Dominique d'Humieres  ---
> AFAICS, we still get the bogus error:
>
>   target :: a, b, c
> 1
> Error: Duplicate TARGET attribute specified at (1)

Reduced test:

   type t
   end type t
   class(t), allocatable :: a(:), b(:), c(:)
! Bogus error: Error: Duplicate TARGET attribute specified
   target :: a, b, c
 end

There is no error if CLASS is replaced with TYPE or REAL. I still think it
would be better to have a new PR opened for it.

BTW I think the test in comment 1 is invalid due to

...
   allocate (a(1), b(1), c(1))
...
class(t), target :: y(3)
class(t) :: x(3)
...


[Bug tree-optimization/47316] devirtualize calls to virtual methods that are never further overriden

2013-12-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47316

Jan Hubicka  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #4 from Jan Hubicka  ---
We now support final and do this type of analysis, which leads to
devirtualization (if final or an anonymous namespace is used) or to speculative
devirtualization otherwise, if doing so seems to be a win.

We still do not assume that the type hierarchy is complete even with
-fwhole-program, since we allow linking with libraries that may provide their
own derivations.
We may add a flag to enable this assumption, and we probably want to strengthen
the analysis of what can be supplied by a library.


[Bug tree-optimization/47316] devirtualize calls to virtual methods that are never further overriden

2013-12-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47316

--- Comment #5 from Jan Hubicka  ---
The testcase provided now generates:
void foo(A*) (struct A * a)
{
  int (*__vtbl_ptr_type) () * _3;
  int (*__vtbl_ptr_type) () _4;
  int i.0_6;
  int i.1_7;
  void * PROF_9;
  int i.2_11;
  int i.3_12;

  :
  _3 = a_2(D)->_vptr.A;
  _4 = *_3;
  PROF_9 = [obj_type_ref] OBJ_TYPE_REF(_4;(struct A)a_2(D)->0);
  if (PROF_9 == f)
goto ;
  else
goto ;

  :
  i.2_11 ={v} i;
  i.3_12 = i.2_11 + -1;
  i ={v} i.3_12;
  goto ;

  :
  OBJ_TYPE_REF(_4;(struct A)a_2(D)->0) (a_2(D));

  :
  i.0_6 ={v} i;
  i.1_7 = i.0_6 + -1;
  i ={v} i.1_7;
  return;

}

where
  PROF_9 = [obj_type_ref] OBJ_TYPE_REF(_4;(struct A)a_2(D)->0);
  if (PROF_9 == f)
goto ;
  else
goto ;

Looking at it, I am not sure we want to keep OBJ_TYPE_REF here.
It is generally ignored except for the call statements themselves.
So either we want to drop it or extend type devirtualization in the folding
machinery to work for non-calls, too.
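
For reference, the guarded-call shape in the dump can be written by hand in
plain C with a function pointer. This is only an illustrative sketch under
assumed names: likely_target plays the role of f, counter the role of the
global i, and other_target and speculative_call are hypothetical. It does not
model OBJ_TYPE_REF or the vtable load.

```c
/* Hypothetical names throughout; see the note above.  */
static int counter;

static void
likely_target (void)
{
  counter--;
}

static void
other_target (void)
{
  counter += 10;
}

/* Indirect call rewritten as a guarded direct call: compare the pointer
   against the predicted target, make the direct call on a match, and
   fall back to the original indirect call otherwise.  */
static void
speculative_call (void (*fn) (void))
{
  if (fn == likely_target)
    likely_target ();   /* direct call, a candidate for inlining */
  else
    fn ();              /* unchanged indirect call */
}
```

The point of the transformation is that the direct branch can be inlined, as
in the dump where the body of f appears in place of the call.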


[Bug tree-optimization/47462] g++.dg/opt/devirt1.C no longer devirtualized

2013-12-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47462

Jan Hubicka  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||hubicka at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #1 from Jan Hubicka  ---
This testcase is no longer xfailed, so I am closing this one.


[Bug middle-end/55150] Crash in copy_rtx

2013-12-17 Thread rmansfield at qnx dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55150

Ryan Mansfield  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Ryan Mansfield  ---
Fixed


[Bug ipa/59226] [4.9 Regression] ICE: in record_target_from_binfo, at ipa-devirt.c:661

2013-12-17 Thread aivchenk at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59226

--- Comment #12 from Alexander Ivchenko  ---
(In reply to Markus Trippelsdorf from comment #11)
> (In reply to Alexander Ivchenko from comment #10)
> > Patch from comment #7 didn't cure Android build as well..
> 
> Can you try the patch from PR58252 comment 7? 
> I've build Chromium (browser) successfully with it.

Yep, the patch from PR58252 comment 7 has cured my build of Android


[Bug middle-end/35545] virtual call specialization not happening with FDO

2013-12-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545

Jan Hubicka  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2013-12-17
 CC||hubicka at gcc dot gnu.org,
   ||mjambor at suse dot cz,
   ||rguenther at suse dot de
 Ever confirmed|0   |1

--- Comment #2 from Jan Hubicka  ---
A couple of years later we finally devirtualize here:
  :
  ap_8 = operator new (16);
  ap_8->i = 0;
  ap_8->_vptr.A = &MEM[(void *)&_ZTV1A + 16B];
  _19 = foo;
  PROF_26 = [obj_type_ref] OBJ_TYPE_REF(_19;(struct A)ap_8->0);
  if (PROF_26 == foo)
goto ;
  else
goto ;

  :
  ap_13 = operator new (16);
  MEM[(struct B *)ap_13].D.2237.i = 0;
  MEM[(struct B *)ap_13].b = 0;
  MEM[(struct B *)ap_13].D.2237._vptr.A = &MEM[(void *)&_ZTV1B + 16B];
  _1 = foo;
  PROF_30 = [obj_type_ref] OBJ_TYPE_REF(_1;(struct A)ap_13->0);
  if (PROF_30 == foo)
goto ;
  else
goto ;

However, the code ends up super silly after tracer.
For some reason we do not manage to fold away the virtual table lookup.
Why?


[Bug middle-end/45631] devirtualization with profile feedback does not work for function pointers

2013-12-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45631

Jan Hubicka  changed:

   What|Removed |Added

 CC||davidxl at google dot com

--- Comment #5 from Jan Hubicka  ---
Google has mentioned having patches for smarter multi-target collection.
So perhaps for next stage1?


[Bug rtl-optimization/59511] [4.9 Regression] FAIL: gcc.target/i386/pr36222-1.c scan-assembler-not movdqa with -mtune=corei7

2013-12-17 Thread vmakarov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59511

--- Comment #3 from Vladimir Makarov  ---
(In reply to Jakub Jelinek from comment #2)
> One movdqa started appearing with r204212, the second movdqa started
> appearing with r204752.  Vlad, can you please have a look?

It seems the changes triggered a bug in register move cost calculations.  I
have a patch to fix it but I need more time to check its effect on
performance.  So the fix will be ready at the end of the week if everything is ok.


[Bug target/42949] ICE: reload_cse_simplify_operands, at postreload.c:396

2013-12-17 Thread rmansfield at qnx dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42949

Ryan Mansfield  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #5 from Ryan Mansfield  ---
ARM OABI is no longer a supported target


[Bug target/45511] ICE in neon_valid_immediate, at config/arm/arm.c:8294

2013-12-17 Thread rmansfield at qnx dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45511

Ryan Mansfield  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #8 from Ryan Mansfield  ---
ARM OABI is no longer a supported target.


[Bug target/45814] ICE in extract_insn, at recog.c:2127

2013-12-17 Thread rmansfield at qnx dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45814

Ryan Mansfield  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #3 from Ryan Mansfield  ---
ARM OABI is no longer a supported target.


[Bug target/59233] [4.9 Regression] C++ failures after revision 205058 on *-apple-darwin* with -m32

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59233

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek  ---
So is this fixed now?


[Bug target/49521] [arm] Bad PIC register load in static initializers

2013-12-17 Thread rmansfield at qnx dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49521

Ryan Mansfield  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #4 from Ryan Mansfield  ---
ARM OABI is no longer a supported target


[Bug target/43588] ICE in copy_to_mode_reg, at explow.c:635

2013-12-17 Thread rmansfield at qnx dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43588

Ryan Mansfield  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #2 from Ryan Mansfield  ---
ARM OABI is no longer a supported target


[Bug target/45885] ICE in arm_dbx_register_number, at config/arm/arm.c:22071

2013-12-17 Thread rmansfield at qnx dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45885

Ryan Mansfield  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #8 from Ryan Mansfield  ---
ARM OABI is no longer a supported target


[Bug target/59147] 128-bit division error

2013-12-17 Thread ktietz at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59147

Kai Tietz  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2013-12-17
 CC||ktietz at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #4 from Kai Tietz  ---
I've tested the testcase on an x86_64-w64-mingw32 cross-compiler, and I can't
reproduce the issue with current trunk.
As 4.6.1 isn't supported upstream anymore, could you please test with a more
recent gcc version?
Otherwise I will need to close the bug as works-for-me.


[Bug target/59233] [4.9 Regression] C++ failures after revision 205058 on *-apple-darwin* with -m32

2013-12-17 Thread dominiq at lps dot ens.fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59233

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Dominique d'Humieres  ---
> So is this fixed now?

AFAICT yes, closing.


[Bug rtl-optimization/59466] Slow code generation by LRA for memory addresses on PPC

2013-12-17 Thread vmakarov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59466

Vladimir Makarov  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Vladimir Makarov  ---
It seems fixed.  The code size is definitely improved too.


[Bug target/59147] 128-bit division error

2013-12-17 Thread rglindley at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59147

--- Comment #5 from rglindley at gmail dot com ---
(In reply to Kai Tietz from comment #4)
> I've tested testcase on x86_64-w64-mingw32 cross-compiler, and I can't
> reproduce issue with current trunk.
> As 4.6.1 isn't supported anymore upstream, could you test with more recent
> gcc-version please?
> Otherwise I will need to close bug a works for me.

Go ahead and close as far as I'm concerned.  I will test again after I upgrade,
but that may be a while as I'm in the middle of a project right now.  Thanks.


[Bug middle-end/45631] devirtualization with profile feedback does not work for function pointers

2013-12-17 Thread xinliangli at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45631

davidxl  changed:

   What|Removed |Added

 CC||xinliangli at gmail dot com

--- Comment #6 from davidxl  ---
(In reply to Jan Hubicka from comment #5)
> Google has mentioned to have patches for smarter multi-target collection.
> So perhaps for next stage1?

Yes -- we will contribute that code back to trunk in next stage1 (was planning
to do so in this cycle, but Rong was busy with contributing other patches: fast
coverage mode, atomic update, libgcov refactoring, and profile-tool).

David


[Bug middle-end/58290] [4.9 Regression] error: virtual definition of statement not up-to-date

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58290

--- Comment #6 from Jakub Jelinek  ---
Author: jakub
Date: Tue Dec 17 17:35:59 2013
New Revision: 206062

URL: http://gcc.gnu.org/viewcvs?rev=206062&root=gcc&view=rev
Log:
PR ipa/58290
* gfortran.dg/pr58290.f90: New test.

Added:
trunk/gcc/testsuite/gfortran.dg/pr58290.f90
Modified:
trunk/gcc/testsuite/ChangeLog


[Bug fortran/59537] New: "Automatic array cannot have an initializer", for -finit-real and a SAVE statement present in subroutine

2013-12-17 Thread bugs at stellardeath dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59537

Bug ID: 59537
   Summary: "Automatic array cannot have an initializer", for
-finit-real and a SAVE statement present in subroutine
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bugs at stellardeath dot org

Created attachment 31461
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31461&action=edit
Minimal example code

The following valid, minimal example code

  subroutine foo(n)
  implicit none
  integer n
  real :: a(1:n)
  save
  a(1) = 3
  end subroutine foo

cannot be compiled with -finit-real. The presence of the catch-all SAVE
statement seems to affect this, even though it should not apply to the
automatic array a (right?):

#> gfortran -finit-real=nan -c minimal.f90
minimal.f:4.20:

  real :: a(1:n)
1
Error: Automatic array 'a' at (1) cannot have an initializer
#>

Without the SAVE statement it compiles fine.

I see this with the installed gfortran 4.8.2 from openSUSE and also with a
self-compiled gfortran from today (2013-12-17):

#> gfortran --version
GNU Fortran (SUSE Linux) 4.8.2 20131210 [gcc-4_8-branch revision 205857]
Copyright (C) 2013 Free Software Foundation, Inc.

GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING


#> ~/sys/stow/gcc-2013-12-17/bin/gfortran --version
GNU Fortran (GCC) 4.9.0 20131217 (experimental)
Copyright (C) 2013 Free Software Foundation, Inc.

GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING


[Bug middle-end/58290] [4.9 Regression] error: virtual definition of statement not up-to-date

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58290

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Jakub Jelinek  ---
I've looked at this some more and it seems Richard's change was the right fix
for this, so I've committed the testcase and am closing this.


[Bug middle-end/35545] virtual call specialization not happening with FDO

2013-12-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545

--- Comment #3 from Jan Hubicka  ---
The following patch gets rid of OBJ_TYPE_REF:
Index: value-prof.c
===
--- value-prof.c(revision 206040)
+++ value-prof.c(working copy)
@@ -1333,9 +1333,15 @@ gimple_ic (gimple icall_stmt, struct cgr
   cond_bb = gimple_bb (icall_stmt);
   gsi = gsi_for_stmt (icall_stmt);

-  tmp0 = make_temp_ssa_name (optype, NULL, "PROF");
-  tmp1 = make_temp_ssa_name (optype, NULL, "PROF");
-  tmp = unshare_expr (gimple_call_fn (icall_stmt));
+  tmp0 = make_temp_ssa_name (optype, NULL, "SPEC");
+  tmp1 = make_temp_ssa_name (optype, NULL, "SPEC");
+  tmp = gimple_call_fn (icall_stmt);
+
+  /* Drop OBJ_TYPE_REF expression, it is ignored by rest of
+ optimization queue anyway.  */
+  if (TREE_CODE (tmp) == OBJ_TYPE_REF)
+tmp = OBJ_TYPE_REF_EXPR (tmp);
+  tmp = unshare_expr (tmp);
   load_stmt = gimple_build_assign (tmp0, tmp);
   gsi_insert_before (&gsi, load_stmt, GSI_SAME_STMT);


but it does not help.  We end up with:
  _1 = foo;
  if (_1 == foo)
goto ;

not much better.  This nonsense appears only in optimized dump, before we have
the following nonsense:
  # ap_9 = PHI 
  # prephitmp_18 = PHI <&MEM[(void *)&_ZTV1B + 16B](6)>
  _1 = *prephitmp_18;

that is introduced by duplicating
  :
  # ap_2 = PHI 
  # prephitmp_14 = PHI <&MEM[(void *)&_ZTV1A + 16B](5), &MEM[(void *)&_ZTV1B + 16B](6)>
  _19 = *prephitmp_14;
  if (_19 == foo)
goto ;
  else
goto ;

The following patch:
Index: passes.def
===
--- passes.def  (revision 206040)
+++ passes.def  (working copy)
@@ -242,9 +242,9 @@ along with GCC; see the file COPYING3.
 only examines PHIs to discover const/copy propagation
 opportunities.  */
   NEXT_PASS (pass_phi_only_cprop);
+  NEXT_PASS (pass_tracer);
   NEXT_PASS (pass_vrp);
   NEXT_PASS (pass_cd_dce);
-  NEXT_PASS (pass_tracer);
   NEXT_PASS (pass_dse);
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt);

lets VRP propagate the value through, but only when OBJ_TYPE_REF is not around.
I think pushing tracer up is a good idea: we should have at least one vrp/ccp
pass after tracer. Its main purpose is to provide enough context so those
forward propagating passes can do a better job. It seems stupid to do it only
after everything was finished.  I also see why tracer would prefer DCE to be
done first, but we have only so many passes to do.


[Bug middle-end/58290] [4.9 Regression] error: virtual definition of statement not up-to-date

2013-12-17 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58290

--- Comment #8 from Joost VandeVondele  ---
(In reply to Jakub Jelinek from comment #7)
> I've looked at this some more and it seems Richard's change was the right
> fix for this, so I've committed the testcase and am closing this.

Thanks. I notice that you're writing your own testcases. If this is for
copyright reasons, note that I extract most of my reports from our GPLed code
(CP2K), so there would be no need for this. Of course, I don't mind if you
have other reasons.


[Bug middle-end/58290] [4.9 Regression] error: virtual definition of statement not up-to-date

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58290

--- Comment #9 from Jakub Jelinek  ---
This is actually the same testcase, just somewhat manually reduced and with
symbol names simplified.


[Bug middle-end/35545] virtual call specialization not happening with FDO

2013-12-17 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545

--- Comment #4 from rguenther at suse dot de  ---
"hubicka at gcc dot gnu.org"  wrote:
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545
>
>Jan Hubicka  changed:
>
>   What|Removed |Added
>
> Status|UNCONFIRMED |NEW
>   Last reconfirmed||2013-12-17
>CC||hubicka at gcc dot gnu.org,
>  ||mjambor at suse dot cz,
> ||rguenther at suse dot de
> Ever confirmed|0   |1
>
>--- Comment #2 from Jan Hubicka  ---
>Couple years later we finally devirtualize here:
>  :
>  ap_8 = operator new (16);
>  ap_8->i = 0;
>  ap_8->_vptr.A = &MEM[(void *)&_ZTV1A + 16B];
>  _19 = foo;
>  PROF_26 = [obj_type_ref] OBJ_TYPE_REF(_19;(struct A)ap_8->0);
>  if (PROF_26 == foo)
>goto ;
>  else
>goto ;
>
>  :
>  ap_13 = operator new (16);
>  MEM[(struct B *)ap_13].D.2237.i = 0;
>  MEM[(struct B *)ap_13].b = 0;
>  MEM[(struct B *)ap_13].D.2237._vptr.A = &MEM[(void *)&_ZTV1B + 16B];
>  _1 = foo;
>  PROF_30 = [obj_type_ref] OBJ_TYPE_REF(_1;(struct A)ap_13->0);
>  if (PROF_30 == foo)
>goto ;
>  else
>goto ;
>
>however the code ends up super sily after tracer.
>for some reason we do not manage to fold away the virtual table lookup.
>Why?

This is probably DOM's fault. Does it even handle function pointers?

Tracer runs way too late and with no cleanups behind it.


[Bug middle-end/35545] virtual call specialization not happening with FDO

2013-12-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545

--- Comment #5 from Jan Hubicka  ---
Main issue seems to be that VRP messes up on:
  # ap_2 = PHI 
  # prephitmp_14 = PHI <&MEM[(void *)&_ZTV1A + 16B](5)>
  _19 = *prephitmp_14;

here it somehow won't constant propagate the load. 
Index: passes.def
===
--- passes.def  (revision 206040)
+++ passes.def  (working copy)
@@ -236,6 +236,7 @@ along with GCC; see the file COPYING3.
   NEXT_PASS (pass_reassoc);
   NEXT_PASS (pass_strength_reduction);
   NEXT_PASS (pass_dominator);
+  NEXT_PASS (pass_tracer);
   /* The only const/copy propagation opportunities left after
 DOM should be due to degenerate PHI nodes.  So rather than
 run the full propagators, run a specialized pass which
@@ -244,7 +245,6 @@ along with GCC; see the file COPYING3.
   NEXT_PASS (pass_phi_only_cprop);
   NEXT_PASS (pass_vrp);
   NEXT_PASS (pass_cd_dce);
-  NEXT_PASS (pass_tracer);
   NEXT_PASS (pass_dse);
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt);

actually helps since phi_only_cprop is good on this transform. I do not quite
gather why VRP can't do it itself.

I sent the first patch to http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01517.html


[Bug c++/59085] internal compiler error: Segmentation fault

2013-12-17 Thread ktietz at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59085

Kai Tietz  changed:

   What|Removed |Added

 CC||ktietz at gcc dot gnu.org

--- Comment #3 from Kai Tietz  ---
The issue is that an attempt is made to take the DECL_ASSEMBLER_NAME of a
VAR_DECL, and it is NULL_TREE.

I used the reduced testcase.

debug_tree shows: '

[Bug middle-end/59471] [4.9 Regression] ICE using vector extensions (non-top-level BIT_FIELD_REF, IMAGPART_EXPR or REALPART_EXPR)

2013-12-17 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59471

--- Comment #5 from Marc Glisse  ---
(In reply to Jakub Jelinek from comment #4)
> I think BIT_FIELD_REF's type can't be a vector,

Er, I am quite sure a BIT_FIELD_REF can be a vector. Maybe that wasn't a
general statement and I missed the context?


[Bug middle-end/59471] [4.9 Regression] ICE using vector extensions (non-top-level BIT_FIELD_REF, IMAGPART_EXPR or REALPART_EXPR)

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59471

--- Comment #6 from Jakub Jelinek  ---
You mean BIT_FIELD_REF argument can be a vector?  Sure.  But the type of the
BIT_FIELD_REF itself?


[Bug middle-end/35545] virtual call specialization not happening with FDO

2013-12-17 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545

Jeffrey A. Law  changed:

   What|Removed |Added

 CC||law at redhat dot com

--- Comment #6 from Jeffrey A. Law  ---
You certainly don't want to put something between DOM and phi-only-cprop.  Jump
threading will tend to expose lots of degenerate PHIs.  phi-only-cprop
eliminates those degenerates.  We could have used the normal cprop code, but it
seemed too heavy-weight for the cleanups we wanted to do.

One could argue we want a phi-only-cprop cleanup after VRP since it threads
jumps too.


[Bug rtl-optimization/58668] [4.8/4.9 regression] internal compiler error: in cond_exec_process_insns, at ifcvt.c:339

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58668

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
Reduced testcase, ICEs with -march=armv7-a -mthumb -O2:

void *fn1 (void *);
void *fn2 (void *, const char *);
void fn3 (void *);
void fn4 (void *, int);

void *
test (void *x)
{
  void *a, *b;
  if (!(a = fn1 (x)))
    return (void *) 0;
  if (!(b = fn2 (a, "w")))
    {
      fn3 (a);
      return (void *) 0;
    }
  fn3 (a);
  fn4 (b, 1);
  return b;
}


[Bug target/59533] [SH] Missed cmp/pz opportunity

2013-12-17 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59533

--- Comment #3 from Oleg Endo  ---
This is basically the same issue as PR 54685.


[Bug tree-optimization/59519] [4.9 Regression] ICE on valid code at -O3 on x86_64-linux-gnu in slpeel_update_phi_nodes_for_guard1, at tree-vect-loop-manip.c:486

2013-12-17 Thread mpolacek at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59519

Marek Polacek  changed:

   What|Removed |Added

   Priority|P3  |P1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2013-12-17
 CC||mpolacek at gcc dot gnu.org
   Target Milestone|--- |4.9.0
Summary|ICE on valid code at -O3 on |[4.9 Regression] ICE on
   |x86_64-linux-gnu in |valid code at -O3 on
   |slpeel_update_phi_nodes_for |x86_64-linux-gnu in
   |_guard1, at |slpeel_update_phi_nodes_for
   |tree-vect-loop-manip.c:486  |_guard1, at
   ||tree-vect-loop-manip.c:486
 Ever confirmed|0   |1


[Bug middle-end/35545] virtual call specialization not happening with FDO

2013-12-17 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545

--- Comment #7 from Jan Hubicka  ---
> You certainly don't want to put something between DOM and phi-only-cprop.
> Jump threading will tend to expose lots of degenerate PHIs.  phi-only-cprop
> eliminates those degenerates.  We could have used the normal cprop code, but
> it seemed too heavy-weight for the cleanups we wanted to do.

Well, constants, copies and PHIs are accounted as 0 size and thus are not part
of tracer's cost model, so perhaps we do not care about the presence of
degenerate PHIs here.  Moreover, it is the degenerate PHI produced by tracer
that causes our problems.  I assume DOM produces degenerate PHIs by code
duplication for jump threading; tracer is exactly the same type of
transformation, and for the same reasons we may want phi-only-cprop after
tracer as we do after DOM.

It however seems more like a missed optimization in VRP that it is not able to
paper over the degenerate PHI produced by tracer in the first place, so I will
try to poke at this a bit more tomorrow.


[Bug middle-end/54685] [SH] Improve unsigned int comparison with 0x7FFFFFFF

2013-12-17 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54685

--- Comment #9 from Oleg Endo  ---
This is basically the same issue as PR 59533.  emit_store_flag_1 in expmed.c
always expands the not-shift because the assumption there is that it's cheaper,
which is not true for SH.

The pre-peephole idea from http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59533#c2
also fixes this problem and makes the change in r192200 superfluous.


[Bug rtl-optimization/58668] [4.8/4.9 regression] internal compiler error: in cond_exec_process_insns, at ifcvt.c:339

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58668

Jakub Jelinek  changed:

   What|Removed |Added

 CC||ebotcazou at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
I think the problem is that ifcvt relies on consistent counting of insns, but
the various functions count different things.
- count_bb_insns counts CALL_INSN and INSN
- flow_find_cross_jump counts in some cases the jump insn (quite complicated
test whether it counts it or not), and doesn't count insns with USE or CLOBBER
PATTERNs
- flow_find_head_matching_sequence counts all CALL_INSN/INSN, including USE and
CLOBBER
- first_active_insn finds what count_bb_insns counts
- last_active_insn (..., TRUE) finds what count_bb_insns counts, except it
skips over USE patterns

I guess best would be to count/skip/etc. the same things consistently, the
problem is that some of the functions have other uses etc.

So, perhaps
1) let count_bb_insns not count insns with USE or CLOBBER PATTERNs
2) perhaps not count any JUMP_INSNs in flow_find_cross_jump if dir_p == NULL
(i.e. when called from ifcvt)?
3) perhaps not count USE/CLOBBER insns in flow_find_head_matching_sequence if
stop_after is non-zero?
4) perhaps add also skip_use argument to first_active_insn and if TRUE, ignore
USE insns and for both {first,last}_active_insn if skip_use is TRUE, also
ignore CLOBBER insns
5) in find_active_insn_{before,after} ignore USE/CLOBBER insns
and document this properly?
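
To illustrate the kind of consistency suggested above, here is a minimal
self-contained sketch.  It is a toy model, not GCC code: the enum toy_kind, the
struct toy_insn and the helpers toy_active_p, toy_count_bb_insns,
toy_first_active and toy_last_active are all hypothetical names.  The only
point is that one shared predicate decides what counts, so the counting
function and the first/last searches cannot disagree.

```c
#include <stddef.h>

/* Toy model only; none of these names exist in GCC.  */
enum toy_kind { TOY_INSN, TOY_CALL, TOY_JUMP, TOY_USE, TOY_CLOBBER, TOY_NOTE };

struct toy_insn { enum toy_kind kind; };

/* The single definition of "active": ordinary insns and calls,
   but not jumps, notes, or bare USE/CLOBBER patterns.  */
int
toy_active_p (const struct toy_insn *insn)
{
  return insn->kind == TOY_INSN || insn->kind == TOY_CALL;
}

/* Count the active insns in a block of N insns.  */
size_t
toy_count_bb_insns (const struct toy_insn *bb, size_t n)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (toy_active_p (&bb[i]))
      count++;
  return count;
}

/* Return the first active insn, or NULL if there is none.  */
const struct toy_insn *
toy_first_active (const struct toy_insn *bb, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (toy_active_p (&bb[i]))
      return &bb[i];
  return NULL;
}

/* Return the last active insn, or NULL if there is none.  */
const struct toy_insn *
toy_last_active (const struct toy_insn *bb, size_t n)
{
  for (size_t i = n; i-- > 0; )
    if (toy_active_p (&bb[i]))
      return &bb[i];
  return NULL;
}
```

With a block such as NOTE, USE, INSN, CALL, CLOBBER, JUMP, all three helpers
agree that only the INSN and the CALL are of interest.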


[Bug go/59431] [4.9 regression] runtime FAILs on Solaris

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59431

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P5
 CC||jakub at gcc dot gnu.org


[Bug c/59538] New: Optimization of -O2 or higher creates incorrect code in loop

2013-12-17 Thread arsham at skrenes dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59538

Bug ID: 59538
   Summary: Optimization of -O2 or higher creates incorrect code
in loop
   Product: gcc
   Version: 4.8.1
Status: UNCONFIRMED
  Severity: critical
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arsham at skrenes dot com

Created attachment 31462
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31462&action=edit
Small prime number calculating program that shows bug

I'm using Ubuntu 13.10 with all updates installed. GCC was recently updated for
the distribution to:
gcc (Ubuntu/Linaro 4.8.1-10ubuntu9) 4.8.1

Before the update, everything worked correctly. After the update, the -O2 or
higher optimization flag breaks code (in this instance, it's a loop).

I have attached bug.c to this bug report, which compiles with no warnings or
errors. It determines the 1th prime number (naive algorithm) 5 times and
reports the duration it took for each iteration. If you compile as follows, it
works correctly:
gcc -Wall -Wextra bug.c

If you compile it with optimization level -O2 or higher such as the following,
it breaks the loop:
gcc -O3 -Wall -Wextra bug.c

You will see that it returns the results almost instantly. In the code, I have
a commented print statement that shows the 1th prime number. If you
uncomment this, it also suddenly works correctly with -O3. It seems that gcc is
"over-optimizing" and breaking the code if variable "i" is not printed.


[Bug c++/58701] [4.9 Regression] [c++11] ICE initializing member of static union

2013-12-17 Thread mpolacek at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58701

Marek Polacek  changed:

   What|Removed |Added

   Priority|P1  |P2

--- Comment #5 from Marek Polacek  ---
Downgrading to P2.


[Bug ipa/59265] [4.9 Regression] Segmentation fault in ipa_note_param_call for -fprofile-use in SPEC CPU2006

2013-12-17 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59265

--- Comment #23 from Markus Trippelsdorf  ---
Here's a testcase:

 tmp % wget trippelsdorf.de/cceI2Nud.ltrans22.o.bz2
 tmp % bzip2 -d cceI2Nud.ltrans22.o.bz2
 tmp % g++ -xlto -fltrans cceI2Nud.ltrans22.o
In member function ‘extractBetween’:
lto1: fatal error: Cgraph edge statement index out of range 25 < 50
compilation terminated.

[Bug target/59147] 128-bit division error

2013-12-17 Thread ktietz at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59147

Kai Tietz  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WORKSFORME

--- Comment #6 from Kai Tietz  ---
Ok, thanks for the feedback


[Bug middle-end/35545] virtual call specialization not happening with FDO

2013-12-17 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545

--- Comment #10 from Jan Hubicka  ---
> Tracer depends on the usual estimate_num_insns limits
>  (it is 12 years since I wrote it, so what I recall)

Note that one important thing that changed in those 12 years is that I
originally carefully tuned tracer in combination with crossjumping. Tracer
produced duplicates and if no pass managed to make use of them, crossjumping
cleaned them up pre-reload. Trace formation in bb-reorder re-instantiated the
duplicates when it seemed sensible for code layout.

This broke, since SSA makes RTL cross jumping quite useless and it is now done
after reg-alloc only.  We never really got a working code unification pass on
gimple.
http://www.ucw.cz/~hubicka/papers/amd64/node4.html
claims that at that time I got a 1.6% speedup on SPECint with profile feedback,
at 1.43% code growth.  That is not bad, but I wonder what it translates to
today.

Honza


[Bug middle-end/35545] virtual call specialization not happening with FDO

2013-12-17 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545

--- Comment #9 from Jan Hubicka  ---
> It's not a matter of cost model, but if propagating the values to their uses. 
> I haven't looked closely at the tracer, but wouldn't it benefit by having
> constants in particular propagated to their uses?

Tracer depends on the usual estimate_num_insns limits
 (it is 12 years since I wrote it, so what I recall)
It collects BBs that are interesting starts of traces, takes them in priority
order and duplicates from the seeds until the code growth limit is met or it
runs out of interesting candidates by other criteria.

I think it generally tends to starve on candidates as the definition of trace
is relatively strong, but I am not 100% sure of it. So it is not that much
dependent on bounds given by the code size metric.

If we had unlimited time, it would be better to propagate constants and clean
up both before and after tracer.

If we can choose whether we want to do tracer before or after the last pass
that is able to propagate and fold constants, I would choose before for the
reason I mentioned at the beginning; the whole point of tail duplication is to
simplify the CFG and allow better propagation.

I think missed tracing here and there is less painful than missed optimization
in duplicated code.

We may even consider pushing tracer before DOM, since tail duplication may
enable DOM to produce more useful threading/propagation and code after tracer
is not too painfully obfuscated. Sure, you can end up with a PHI that has only
one constant argument.  I can see that DOM may miss optimization here.

> Propagating the constant for x' in BBm and eliminating the degenerate is what
> the phi-only cprop pass does.  If the tracer generates similar things, then
> running phi-only cprop after it might be useful as well.  It *should* be very
> fast.

Yes, tracer does similar things.  You can think about it as about speculative
jump threading - if one path through meet points seems more likely than the
other based on profile, tracer will duplicate it in a hope that later
optimization pass will prove some of conditionals constant over the duplicated
path.  For that it needs subsequent propagation pass (CCP or better VRP) to
match.  That is why its current place in pass queue is unlucky.  Possible
benefits of tail duplications are of course not limited to threading.
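
As a rough source-level illustration of the tail duplication described above,
here is a hand-written sketch.  It is not tracer output, and the function names
before_dup and after_dup are made up.  It shows a join point followed by a test
that becomes constant once the tail is copied into the likely path.

```c
/* Before: both arms meet at a join point, so the test on `flag`
   after the join cannot be resolved statically.  */
int
before_dup (int x)
{
  int flag;
  if (x > 0)        /* assume the profile says this is the hot arm */
    flag = 1;
  else
    flag = x & 1;
  /* join point */
  if (flag)
    return x + 10;
  return x - 10;
}

/* After: the tail is copied into the hot arm.  On that path `flag`
   is the constant 1, so a later propagation pass can fold the test.  */
int
after_dup (int x)
{
  if (x > 0)
    return x + 10;  /* duplicated tail, folded with flag == 1 */

  int flag = x & 1; /* cold path keeps the original tail */
  if (flag)
    return x + 10;
  return x - 10;
}
```

Both functions compute the same result for every input; the second form is
what one would hope to get after duplication plus a propagation pass.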

We can do one extra cleanup pass, too.  Tracer is on by default only with
-fprofile-use so extra phi-only cprop with -ftracer probably is not dangerous
to overall compile time experience.

Honza


[Bug go/59433] [4.9 regression] Many 64-bit Go tests SEGV on Solaris

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59433

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P5
 CC||jakub at gcc dot gnu.org


[Bug target/39578] Linkage broken for dllimport vtables

2013-12-17 Thread ktietz at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39578

Kai Tietz  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Kai Tietz  ---
I retested the issue on all open branches and can't reproduce it anymore.
The import library contains the symbol for virtual.

If the issue still exists, please provide a testcase demonstrating it on the
maintained branches.


[Bug middle-end/59471] [4.9 Regression] ICE using vector extensions (non-top-level BIT_FIELD_REF, IMAGPART_EXPR or REALPART_EXPR)

2013-12-17 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59471

--- Comment #7 from Marc Glisse  ---
(In reply to Jakub Jelinek from comment #6)
> You mean BIT_FIELD_REF argument can be a vector?  Sure.  But the type of the
> BIT_FIELD_REF itself?

Yes, the type of the BIT_FIELD_REF itself. A quick grep gives:
t = build3 (BIT_FIELD_REF, vectype, new_temp,
in tree-vect-stmts.c where I assume vectype is a vector type. IIRC,
tree-vect-generic.c also produces plenty of those when lowering extra-long
vectors.


[Bug go/59432] [4.9 regression] sync/atomic FAILs on Solaris/x86

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59432

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P5
 CC||jakub at gcc dot gnu.org


[Bug go/59430] [4.9 regression] os/user FAILs on Solaris

2013-12-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59430

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P5
 CC||jakub at gcc dot gnu.org


[Bug middle-end/59538] Optimization of -O2 or higher creates incorrect code in loop

2013-12-17 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59538

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Andrew Pinski  ---
(In reply to Arsham Skrenes from comment #2)
> The result of the loop is implicitly used though (benchmarking how long it
> takes to find nth prime number; I also use this code to create an artificial
> workload for a graduate-level project). This is new behaviour by this
> version of GCC. This is NOT a valid optimization as it clearly is having
> unintended side-effects which I am showcasing.

No, you need an explicit use to avoid removal of empty finite loops.  The time
a loop takes is not a side effect as defined by the C/C++ standards.
Many benchmarks can be optimized away if you are not careful; this is true of
any code and most programming languages.
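
A minimal sketch of what such an explicit use can look like, assuming a naive
prime search similar in spirit to the attached bug.c (which is not reproduced
here; nth_prime, run_workload, input and sink are hypothetical names).
Volatile accesses count as observable behavior, so the work between the
volatile read and the volatile write has to be carried out on every iteration.

```c
/* Hypothetical stand-in for the attached program's workload:
   naive search for the Nth prime (N >= 1).  */
unsigned
nth_prime (unsigned n)
{
  unsigned found = 0, candidate = 1;
  while (found < n)
    {
      candidate++;
      int is_prime = 1;
      for (unsigned d = 2; d * d <= candidate; d++)
        if (candidate % d == 0)
          {
            is_prime = 0;
            break;
          }
      if (is_prime)
        found++;
    }
  return candidate;
}

/* Each read of `input` and each write to `sink` must be performed,
   so the computation in between cannot be dropped or done only once.  */
volatile unsigned input;
volatile unsigned sink;

/* Run the workload REPS times, keeping every result live.  */
void
run_workload (unsigned n, unsigned reps)
{
  input = n;
  for (unsigned i = 0; i < reps; i++)
    sink = nth_prime (input);
}
```

Timing calls can then be placed around run_workload.  Printing the result, as
the reporter observed, is another form of explicit use.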


[Bug middle-end/35545] virtual call specialization not happening with FDO

2013-12-17 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545

--- Comment #8 from Jeffrey A. Law  ---
It's not a matter of cost model, but of propagating the values to their uses.
I haven't looked closely at the tracer, but wouldn't it benefit by having
constants in particular propagated to their uses?

Yes, DOM's duplication/isolation of paths exposes the degenerate PHIs.  One of
the nice side effects of jump threading is that it isolates paths.  So we
might have had

BBn:
x = phi (a, b, c, constant, d, e, f)

duplication for threading might turn that into

BBn:
x = phi (a, b, c, d, e, f)  // The original


elsewhere

BBm:
x' = phi (constant)  // the duplicate


Propagating the constant for x' in BBm and eliminating the degenerate is what
the phi-only cprop pass does.  If the tracer generates similar things, then
running phi-only cprop after it might be useful as well.  It *should* be very
fast.
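
As a toy illustration of that cleanup (a sketch only, not the actual
phi-only cprop code; struct toy_phi, toy_degenerate_phi_p and toy_replace_uses
are hypothetical names, and values are modelled as plain integer ids), the two
steps are detecting a degenerate PHI and rewriting the uses of its result:

```c
#include <stddef.h>

/* Toy model only: a PHI is a result id plus a list of argument ids.
   These types are not GCC's.  */
struct toy_phi
{
  int result;
  const int *args;
  size_t nargs;
};

/* If every argument of PHI is the same value, store that value in *VALUE
   and return 1; otherwise return 0.  A single-argument PHI, such as
   x' = phi (constant), is the simplest degenerate case.  */
int
toy_degenerate_phi_p (const struct toy_phi *phi, int *value)
{
  if (phi->nargs == 0)
    return 0;
  for (size_t i = 1; i < phi->nargs; i++)
    if (phi->args[i] != phi->args[0])
      return 0;
  *value = phi->args[0];
  return 1;
}

/* Replace every use of the PHI result in USES with VALUE; return the
   number of replacements.  */
size_t
toy_replace_uses (int *uses, size_t nuses, int result, int value)
{
  size_t replaced = 0;
  for (size_t i = 0; i < nuses; i++)
    if (uses[i] == result)
      {
        uses[i] = value;
        replaced++;
      }
  return replaced;
}
```

Both steps are linear in the number of arguments and uses, which is in line
with the expectation that such a pass should be cheap.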


[Bug target/58115] testcase gcc.target/i386/intrinsics_4.c failure

2013-12-17 Thread bernd.edlinger at hotmail dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58115

Bernd Edlinger  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Bernd Edlinger  ---
patch applied

