[Bug driver/47785] GCC with -flto does not pass -Wa options to the assembler

2019-10-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47785

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #14 from kugan at gcc dot gnu.org ---
A patch for this is posted at
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01471.html

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-03 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #2 from kugan at gcc dot gnu.org ---
I'll assign it to myself unless it is being looked at by someone else.

[Bug tree-optimization/64946] [AArch64] gcc.target/aarch64/vect-abs-compile.c - "abs" vectorization fails for char/short types

2018-06-16 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64946

--- Comment #24 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sat Jun 16 21:34:29 2018
New Revision: 261681

URL: https://gcc.gnu.org/viewcvs?rev=261681&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

PR middle-end/64946
* cfgexpand.c (expand_debug_expr): Handle ABSU_EXPR.
* config/i386/i386.c (ix86_add_stmt_cost): Likewise.
* dojump.c (do_jump): Likewise.
* expr.c (expand_expr_real_2): Check operand type's sign.
* fold-const.c (const_unop): Handle ABSU_EXPR.
(fold_abs_const): Likewise.
* gimple-pretty-print.c (dump_unary_rhs): Likewise.
* gimple-ssa-backprop.c (backprop::process_assign_use): Likewise.
(strip_sign_op_1): Likewise.
* match.pd: Add new pattern to generate ABSU_EXPR.
* optabs-tree.c (optab_for_tree_code): Handle ABSU_EXPR.
* tree-cfg.c (verify_gimple_assign_unary): Likewise.
* tree-eh.c (operation_could_trap_helper_p): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree-vect-patterns.c (vect_recog_sad_pattern): Likewise.
* tree.def (ABSU_EXPR): New.

gcc/c-family/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

* c-common.c (c_common_truthvalue_conversion): Handle ABSU_EXPR.

gcc/c/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

* c-typeck.c (build_unary_op): Handle ABSU_EXPR.
* gimple-parser.c (c_parser_gimple_statement): Likewise.
(c_parser_gimple_unary_expression): Likewise.

gcc/cp/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

* constexpr.c (potential_constant_expression_1): Handle ABSU_EXPR.
* cp-gimplify.c (cp_fold): Likewise.

gcc/testsuite/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

PR middle-end/64946
* gcc.dg/absu.c: New test.
* gcc.dg/gimplefe-29.c: New test.
* gcc.target/aarch64/pr64946.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/absu.c
trunk/gcc/testsuite/gcc.dg/gimplefe-29.c
trunk/gcc/testsuite/gcc.target/aarch64/pr64946.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c-common.c
trunk/gcc/c/ChangeLog
trunk/gcc/c/c-typeck.c
trunk/gcc/c/gimple-parser.c
trunk/gcc/cfgexpand.c
trunk/gcc/config/i386/i386.c
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/constexpr.c
trunk/gcc/cp/cp-gimplify.c
trunk/gcc/dojump.c
trunk/gcc/expr.c
trunk/gcc/fold-const.c
trunk/gcc/gimple-pretty-print.c
trunk/gcc/gimple-ssa-backprop.c
trunk/gcc/match.pd
trunk/gcc/optabs-tree.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-cfg.c
trunk/gcc/tree-eh.c
trunk/gcc/tree-inline.c
trunk/gcc/tree-pretty-print.c
trunk/gcc/tree-vect-patterns.c
trunk/gcc/tree.def
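
For readers unfamiliar with the new tree code, the semantics that ABSU_EXPR is understood to have (a signed operand, an unsigned result) can be sketched in plain C. The helper name absu_int is hypothetical and is not a GCC function; this is only an illustration of why an unsigned result avoids the overflow that abs has for the most negative value.

```c
#include <limits.h>

/* Hypothetical helper, not part of GCC: absolute value of X computed
   in the unsigned type, mirroring the assumed meaning of ABSU_EXPR.  */
unsigned int
absu_int (int x)
{
  /* Converting to unsigned is modular and therefore well defined.  */
  unsigned int ux = (unsigned int) x;

  /* Unsigned negation wraps instead of overflowing, so INT_MIN is
     handled without undefined behaviour.  */
  return x < 0 ? 0u - ux : ux;
}
```

With this definition, absu_int (INT_MIN) yields (unsigned int) INT_MAX + 1, a value that a signed abs cannot represent.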

[Bug middle-end/82479] missing popcount builtin detection

2018-06-16 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82479

--- Comment #13 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sat Jun 16 21:39:31 2018
New Revision: 261682

URL: https://gcc.gnu.org/viewcvs?rev=261682&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

PR middle-end/82479
* ipa-fnsummary.c (will_be_nonconstant_expr_predicate): Handle
CALL_EXPR.
* tree-scalar-evolution.c (interpret_expr): Likewise.
(expression_expensive_p): Likewise.
* tree-ssa-loop-ivopts.c (contains_abnormal_ssa_name_p): Likewise.
* tree-ssa-loop-niter.c (number_of_iterations_popcount): New.
(number_of_iterations_exit_assumptions): Use
number_of_iterations_popcount.
(ssa_defined_by_minus_one_stmt_p): New.

gcc/testsuite/ChangeLog:

2018-06-16  Kugan Vivekanandarajah  

PR middle-end/82479
* gcc.dg/tree-ssa/popcount.c: New test.
* gcc.dg/tree-ssa/popcount2.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-fnsummary.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-scalar-evolution.c
trunk/gcc/tree-ssa-loop-ivopts.c
trunk/gcc/tree-ssa-loop-niter.c
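
As an illustration of the kind of loop this change is about, the sketch below shows a bit-clearing loop next to the builtin it is equivalent to. That number_of_iterations_popcount looks for exactly this shape is an assumption based on the ChangeLog (note ssa_defined_by_minus_one_stmt_p); the function names here are hypothetical.

```c
/* Counts set bits by clearing the lowest set bit on every iteration,
   so the loop runs once per set bit.  */
int
popcount_loop (unsigned long long b)
{
  int c = 0;
  while (b)
    {
      b &= b - 1;   /* clear the lowest set bit */
      c++;
    }
  return c;
}

/* The same result computed with the GCC builtin.  */
int
popcount_builtin (unsigned long long b)
{
  return __builtin_popcountll (b);
}
```

Because the iteration count of the loop equals the population count of the initial value, the final value of c can be computed without executing the loop.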

[Bug ipa/91468] Suspicious codes in ipa-prop.c and ipa-cp.c

2019-08-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91468

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #2 from kugan at gcc dot gnu.org ---
(In reply to Martin Jambor from comment #1)
> (In reply to Feng Xue from comment #0)

> > 
> > In function update_jump_functions_after_inlining(),
> > 
> >   if (dst->type == IPA_JF_ANCESTOR)
> > {
> >   ..
> > 
> >   if (src->type == IPA_JF_PASS_THROUGH
> >   && src->value.pass_through.operation == NOP_EXPR)
> > {
> >..
> > }
> >   else if (src->type == IPA_JF_PASS_THROUGH
> >&& TREE_CODE_CLASS (src->value.pass_through.operation) == 
> > tcc_unary)
> > {
> >   dst->value.ancestor.formal_id = src->value.pass_through.formal_id;
> >   dst->value.ancestor.agg_preserved = false;
> > }
> >   ..   
> > }
> > 
> > If we suppose pass_through operation is "negate_expr" (while it is not a
> > reasonable operation on pointer type), the code might be incorrect. It's
> > better to specify expected unary operations here.
> 
> Kugan, you added this in 2016 and unfortunately I think it is wrong.
> Are there any unary operations we could possibly want to handle?
> In any event, the information that there was an arithmetic function in
> the path of the parameter would be completely lost if the code ever
> executed.  (Which I don't think it ever does, I think it would take
> crazy code that employs LTO to pass an integer to a pointer parameter
> to trigger).
> 
> So I plan to remove the whole if.
> 

Yes, I think this is a mistake and should go. Thanks for doing that.

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #3 from kugan at gcc dot gnu.org ---
I added iv-use for MASKED_LOAD_LANE and the result is
cmp w3, 0
ble .L1
sub w5, w3, #1
mov x4, 0
lsr w5, w5, 1
add w5, w5, 1
whilelo p0.s, xzr, x5
.p2align 3,,7
.L3:
lsl x3, x4, 3
incw x4
add x7, x1, x3
add x6, x2, x3
ld2w {z4.s - z5.s}, p0/z, [x7]
ld2w {z2.s - z3.s}, p0/z, [x6]
add x3, x0, x3
add z0.s, z4.s, z2.s
sub z1.s, z5.s, z3.s
st2w {z0.s - z1.s}, p0, [x3]
whilelo p0.s, x4, x5
bne .L3
.L1:
ret

No base plus scaled index addressing mode is used. This is because the same
kind of address is classified differently in ivopt and in cfgexpand.

When called from ivopt:
Breakpoint 4, aarch64_classify_address (info=0x7fffcba0, x=0x76c44f30,
mode=E_DImode, strict_p=false, type=ADDR_QUERY_M)
at
/home/kugan/work/abe/snapshots/gcc.git~origin~aarch64~sve-acle-branch/gcc/config/aarch64/aarch64.c:5689
5689    {
(gdb) p debug_rtx (x)
(plus:DI (mult:DI (reg:DI 91)
(const_int 8 [0x8]))
(reg:DI 90))

it accepts it.

When in cfgexpand:
Breakpoint 5, aarch64_classify_address (info=0x7fffcca0, x=0x76c5b840,
mode=E_VNx8SImode, strict_p=false, type=ADDR_QUERY_M)
at
/home/kugan/work/abe/snapshots/gcc.git~origin~aarch64~sve-acle-branch/gcc/config/aarch64/aarch64.c:5689
5689    {
(gdb) p debug_rtx (x)
(plus:DI (mult:DI (reg:DI 92 [ ivtmp_28 ])
(const_int 8 [0x8]))
(reg/v/f:DI 110 [ y ]))


This is not accepted because of aarch64_classify_index (info, op1, mode,
strict_p) failing (as it should).

Note the difference in mode for aarch64_classify_address. Not sure if this is
because of the way my patch changes ivopt.

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #4 from kugan at gcc dot gnu.org ---
Created attachment 45661
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45661&action=edit
ivopt patch v1

[Bug tree-optimization/89296] New: tree copy-header masking uninitialized warning

2019-02-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89296

Bug ID: 89296
   Summary: tree copy-header masking uninitialized warning
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

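extern int get_a_value(void);             /* assumed declaration */
extern int printk(const char *fmt, ...);  /* assumed declaration */

void test_func(void) {
  int loop;  // uninitialized and "garbage"
  while (!loop) {
   loop = get_a_value();  // <- must be for this test
   printk("...");
  }
}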

This is from Linaro bug report https://bugs.linaro.org/show_bug.cgi?id=4134
Compiling with -fno-tree-ch gets the required warning.

diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index c876d62..d405d00 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -393,7 +393,7 @@ ch_base::copy_headers (function *fun)
{
  gimple *stmt = gsi_stmt (bsi);
  if (gimple_code (stmt) == GIMPLE_COND)
-   gimple_set_no_warning (stmt, true);
+   ;//gimple_set_no_warning (stmt, true);
  else if (is_gimple_assign (stmt))
{
  enum tree_code rhs_code = gimple_assign_rhs_code (stmt);

also gets the required warning. Looking into it.
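
To make the role of loop header copying concrete, the sketch below shows a while loop and the shape it takes once the header test is duplicated in front of the loop. This is a hand-written illustration of the general transformation, not the output of tree-ssa-loop-ch.c; the function names and the values/n parameters are hypothetical, and the variable is initialised here so that the example has defined behaviour.

```c
/* Original shape: the condition is tested at the top of every iteration.  */
int
steps_while (int loop, const int *values, int n)
{
  int i = 0;
  while (!loop && i < n)
    loop = values[i++];
  return i;
}

/* Shape after header copying: the first test is a copy placed in front
   of the loop, and the original test moves to the bottom.  */
int
steps_header_copied (int loop, const int *values, int n)
{
  int i = 0;
  if (!loop && i < n)          /* copied header condition */
    do
      loop = values[i++];
    while (!loop && i < n);    /* original condition, now the latch test */
  return i;
}
```

In the reported testcase the first read of the uninitialized variable ends up in the copied condition, which is the statement the diff above stops marking as no-warning.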

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

kugan at gcc dot gnu.org changed:

   What|Removed |Added

  Attachment #45661|0   |1
is obsolete||

--- Comment #5 from kugan at gcc dot gnu.org ---
Created attachment 45686
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45686&action=edit
ivopt patch v2

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-02-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #6 from kugan at gcc dot gnu.org ---

> 
> Note the difference in mode for aarch64_classify_address. Not sure if this
> is because of the way my patch changes ivopt.

Yes, it was my mistake in iv-use. With the attached patch, I now get:
cmp w3, 0
ble .L1
sub w3, w3, #1
mov x4, 0
cntw x5
ptrue   p1.s, all
lsr w3, w3, 1
add w3, w3, 1
whilelo p0.s, xzr, x3
.p2align 3,,7
.L3:
ld2w {z4.s - z5.s}, p0/z, [x1, x4, lsl 2]
ld2w {z2.s - z3.s}, p0/z, [x2, x4, lsl 2]
add z0.s, z4.s, z2.s
sub z1.s, z5.s, z3.s
st2w {z0.s - z1.s}, p0, [x0, x4, lsl 2]
whilelo p0.s, x5, x3
incb x4, all, mul #2
incw x5
ptest   p1, p0.b
bne .L3
.L1:
ret
.cfi_endproc

I will post the patch for review after stage-1 opens. In the meantime any
review is appreciated, especially of the part where iv-use is set up and of
get_alias_ptr_type_for_ptr_address.
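
For readers following the assembly above, a C loop of roughly the following shape would plausibly produce it: two interleaved inputs are loaded with LD2W, the even lanes are added, the odd lanes are subtracted, and the result is stored with ST2W. This is reconstructed from the assembly and is an assumption; it is not the testcase attached to the bug.

```c
/* Assumed source shape: even elements are summed, odd elements are
   subtracted, which matches the add/sub pair between ld2w and st2w.
   N is expected to be even so that x[i + 1] stays in bounds.  */
void
f (int *restrict x, const int *restrict y, const int *restrict z, int n)
{
  for (int i = 0; i < n; i += 2)
    {
      x[i] = y[i] + z[i];
      x[i + 1] = y[i + 1] - z[i + 1];
    }
}
```

Each access is base + index * 4, which is why an address of the form [x1, x4, lsl 2] is the natural fit.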

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #2 from kugan at gcc dot gnu.org ---
AFAIK, we need to:
1. Change the whilelo pattern in backend
2. Change RTL CSE
- Add support for VEC_DUPLICATE
- When handling a PARALLEL rtx, we may kill the CSE defined in the first set so
that it doesn't reach

The attached patch fixes this. With the patch I now have:
.LFB0:
.cfi_startproc
cmp w3, 0
ble .L1
sub w4, w3, #1
cntw x3
lsr w4, w4, 1
add w4, w4, 1
whilelo p0.s, xzr, x4
.p2align 3,,7
.L3:
ld2w {z4.s - z5.s}, p0/z, [x1]
ld2w {z2.s - z3.s}, p0/z, [x2]
add z0.s, z4.s, z2.s
sub z1.s, z5.s, z3.s
st2w {z0.s - z1.s}, p0, [x0]
incb x1, all, mul #2
whilelo p0.s, x3, x4
incb x0, all, mul #2
incw x3
incb x2, all, mul #2
bne .L3
.L1:
ret
.cfi_endproc

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838

--- Comment #3 from kugan at gcc dot gnu.org ---
Created attachment 45794
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45794&action=edit
RFC patch

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838

--- Comment #4 from kugan at gcc dot gnu.org ---
(In reply to kugan from comment #3)
> Created attachment 45794 [details]
> RFC patch

Sorry, wrong place. It should be for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88836

[Bug target/88836] [SVE] Redundant PTEST in loop test

2019-02-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88836

--- Comment #2 from kugan at gcc dot gnu.org ---
Created attachment 45795
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45795&action=edit
RFC patch

AFAIK, we need to:
1. Change the whilelo pattern in backend
2. Change RTL CSE
- Add support for VEC_DUPLICATE
- When handling a PARALLEL rtx, we may kill the CSE defined in the first set so
that it doesn't reach

The attached patch fixes this. With the patch I now have:
.LFB0:
.cfi_startproc
cmp w3, 0
ble .L1
sub w4, w3, #1
cntw x3
lsr w4, w4, 1
add w4, w4, 1
whilelo p0.s, xzr, x4
.p2align 3,,7
.L3:
ld2w {z4.s - z5.s}, p0/z, [x1]
ld2w {z2.s - z3.s}, p0/z, [x2]
add z0.s, z4.s, z2.s
sub z1.s, z5.s, z3.s
st2w {z0.s - z1.s}, p0, [x0]
incb x1, all, mul #2
whilelo p0.s, x3, x4
incb x0, all, mul #2
incw x3
incb x2, all, mul #2
bne .L3
.L1:
ret
.cfi_endproc

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-03-20 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838

--- Comment #5 from kugan at gcc dot gnu.org ---
Created attachment 46000
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46000&action=edit
RFC patch

RFC patch fixes this for review.

[Bug rtl-optimization/89862] New: LTO bootstrap fails for ARM

2019-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89862

Bug ID: 89862
   Summary: LTO bootstrap fails for ARM
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

Created attachment 46039
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46039&action=edit
patch

With the commit:
commit 67c18bce7054934528ff5930cca283b4ac967dca
Author: ebotcazou 
Date:   Wed Jan 31 10:03:06 2018 +

PR rtl-optimization/84071
* combine.c (record_dead_and_set_regs_1): Record the source unmodified
for a paradoxical SUBREG on a WORD_REGISTER_OPERATIONS target.

LTO bootstrap fails for arm (possibly for other WORD_REGISTER_OPERATIONS
targets).

There is an internal compiler error: in operator+=, at profile-count.h:792. It
looks like the profile_count is set incorrectly.

Commit 67c18bce7054934528ff5930cca283b4ac967dca skips generating gen_lowpart
for
(set (subreg:SI (reg:QI 1434) 0)
(const_int 224 [0xe0]))
and the like. This seems to be the reason for the error.

The attached patch fixes this. Does this look reasonable?
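
One plausible reading of why the missing gen_lowpart matters, assuming GCC's convention that a constant is kept sign-extended from the precision of its mode: 224 is a valid 32-bit value, but the same bit pattern viewed as an 8-bit value is -32. The helper below is a hypothetical sketch of that narrowing and is not GCC code.

```c
#include <stdint.h>

/* Hypothetical helper: keep the low PREC bits of VALUE and sign-extend
   them.  PREC must be between 1 and 63 in this sketch.  */
int64_t
sign_extend_low_bits (int64_t value, unsigned int prec)
{
  uint64_t mask = (UINT64_C (1) << prec) - 1;
  uint64_t low = (uint64_t) value & mask;
  uint64_t sign = UINT64_C (1) << (prec - 1);

  /* Flipping the sign bit and subtracting it again is the usual
     branch-free sign extension; both operands fit in int64_t here.  */
  return (int64_t) (low ^ sign) - (int64_t) sign;
}
```

Under that assumption, recording 224 unmodified for a QImode register keeps a value that differs from its 8-bit form, whereas narrowing it first gives -32.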

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #8 from kugan at gcc dot gnu.org ---
(In reply to rsand...@gcc.gnu.org from comment #7)
> Thanks for looking at this.
> 
> (In reply to kugan from comment #6)
> > cmp w3, 0
> > ble .L1
> > sub w3, w3, #1
> > mov x4, 0
> > cntw x5
> > ptrue   p1.s, all
> > lsr w3, w3, 1
> > add w3, w3, 1
> > whilelo p0.s, xzr, x3
> > .p2align 3,,7
> > .L3:
> > ld2w {z4.s - z5.s}, p0/z, [x1, x4, lsl 2]
> > ld2w {z2.s - z3.s}, p0/z, [x2, x4, lsl 2]
> > add z0.s, z4.s, z2.s
> > sub z1.s, z5.s, z3.s
> > st2w {z0.s - z1.s}, p0, [x0, x4, lsl 2]
> > whilelo p0.s, x5, x3
> > incb x4, all, mul #2
> > incw x5
> > ptest   p1, p0.b
> > bne .L3
> > .L1:
> > ret
> > .cfi_endproc
> 
> This doesn't look right.  x4 is an index, so it should be
> incremented by the number of words in two vectors, rather than
> the number of bytes in two vectors.

Thanks for the comments. I have fixed it; with the attached patch it generates:

f:
.LFB0:
.cfi_startproc
cmp w3, 0
ble .L1
sub w5, w3, #1
cntw x4
mov x3, 0
ptrue   p1.s, all
lsr w5, w5, 1
add w5, w5, 1
whilelo p0.s, xzr, x5
.p2align 3,,7
.L3:
ld2w {z4.s - z5.s}, p0/z, [x1, x3, lsl 2]
ld2w {z2.s - z3.s}, p0/z, [x2, x3, lsl 2]
add z0.s, z4.s, z2.s
sub z1.s, z5.s, z3.s
st2w {z0.s - z1.s}, p0, [x0, x3, lsl 2]
whilelo p0.s, x4, x5
inch x3
incw x4
ptest   p1, p0.b
bne .L3
.L1:
ret
.cfi_endproc
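
The arithmetic behind replacing incb with inch can be checked with a small sketch. It relies only on the element counts of the SVE increment instructions (incb counts bytes, inch counts halfwords, incw counts words per vector); the function names and the vl_bytes parameter are hypothetical.

```c
/* VL_BYTES is the vector length in bytes (a hypothetical parameter;
   SVE allows multiples of 16 bytes).  */

/* Amount added by "incw": 32-bit elements in one vector.  */
unsigned int
words_per_vector (unsigned int vl_bytes)
{
  return vl_bytes / 4;
}

/* Amount added by "inch": 16-bit elements in one vector.  */
unsigned int
halfwords_per_vector (unsigned int vl_bytes)
{
  return vl_bytes / 2;
}

/* Amount added by "incb ..., all, mul #2": bytes in two vectors.  */
unsigned int
bytes_in_two_vectors (unsigned int vl_bytes)
{
  return 2 * vl_bytes;
}

/* What a word index must advance by when each iteration consumes two
   vectors of words, as ld2w and st2w do.  */
unsigned int
words_in_two_vectors (unsigned int vl_bytes)
{
  return 2 * words_per_vector (vl_bytes);
}
```

Since the index register is scaled by lsl 2 in the address, it has to advance by words_in_two_vectors, which equals halfwords_per_vector; bytes_in_two_vectors is four times too large.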

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

kugan at gcc dot gnu.org changed:

   What|Removed |Added

  Attachment #45686|0   |1
is obsolete||

--- Comment #9 from kugan at gcc dot gnu.org ---
Created attachment 46040
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46040&action=edit
patch

[Bug rtl-optimization/89862] LTO bootstrap fails for ARM

2019-03-28 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89862

--- Comment #2 from kugan at gcc dot gnu.org ---
(In reply to Eric Botcazou from comment #1)
> Can you try this instead?
> 
> Index: rtl.h
> ===
> --- rtl.h   (revision 269886)
> +++ rtl.h   (working copy)
> @@ -4401,6 +4401,7 @@ word_register_operation_p (const_rtx x)
>  {
>switch (GET_CODE (x))
>  {
> +case CONST_INT:
>  case ROTATE:
>  case ROTATERT:
>  case SIGN_EXTRACT:
Thanks for looking into it. Disallowing all the CONST_INT works for me. I have
verified that lto-bootstrap works with the above changes. I will test for
regression and post it to gcc-patches.

[Bug rtl-optimization/89862] LTO bootstrap fails for ARM

2019-03-29 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89862

--- Comment #3 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sat Mar 30 04:24:22 2019
New Revision: 270030

URL: https://gcc.gnu.org/viewcvs?rev=270030&root=gcc&view=rev
Log:

2019-03-29  Kugan Vivekanandarajah  
Eric Botcazou  

PR rtl-optimization/89862
* rtl.h (word_register_operation_p): Exclude CONST_INT from operations
that operate on the full registers for WORD_REGISTER_OPERATIONS
architectures.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/rtl.h

[Bug rtl-optimization/89862] LTO bootstrap fails for ARM

2019-03-29 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89862

--- Comment #4 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sat Mar 30 04:28:51 2019
New Revision: 270031

URL: https://gcc.gnu.org/viewcvs?rev=270031&root=gcc&view=rev
Log:

2019-03-29  Kugan Vivekanandarajah  

Backport from mainline
2019-03-29  Kugan Vivekanandarajah  
Eric Botcazou  

PR rtl-optimization/89862
* rtl.h (word_register_operation_p): Exclude CONST_INT from operations
that operate on the full registers for WORD_REGISTER_OPERATIONS
architectures.


Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/rtl.h

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #12 from kugan at gcc dot gnu.org ---
(In reply to rsand...@gcc.gnu.org from comment #10)
> (In reply to kugan from comment #9)
> > Created attachment 46040 [details]
> > patch
> 
> Wasn't sure whether this patch was WIP or the final version
> for review, but we need to do something more generic than
> dividing by 4.  I think the test will still fail with "int"
> changed to "short" for example.
> 
> I also don't think the new candidate should be tied to the
> mask/load store functions.  Maybe one approach would be to
> check when adding a zero-based candidate for a use in:
> 
>   /* Record common candidate with initial value zero.  */
>   basetype = TREE_TYPE (iv->base);
>   if (POINTER_TYPE_P (basetype))
> basetype = sizetype;
>   record_common_cand (data, build_int_cst (basetype, 0), iv->step, use);
> 
> whether the use actually benefits from this unscaled iv.
> If the use is USE_REF_ADDRESS, we could compare the cost
> of an address with an unscaled index with the cost of an address
> with a scaled index.  I think the natural scale value to try
> would be GET_MODE_INNER (TYPE_MODE (mem_type)).

Thanks for the comments. I agree this is the right place, but I am not sure
that checking the cost at this point is what IV-opt generally does. In general,
IV-opt adds candidates that can be helpful and later decides the optimal set.

If we are to use get_computation_cost to see the costs, we have to create an
iv_cand and then discard it. Since we are adding only one candidate, and only
for SVE-like targets, I think that it is OK. If you still prefer to check
the cost, I will change that.

I have attached the patch (only the ivopt changes) and a testcase.

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

kugan at gcc dot gnu.org changed:

   What|Removed |Added

  Attachment #46040|0   |1
is obsolete||

--- Comment #13 from kugan at gcc dot gnu.org ---
Created attachment 46103
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46103&action=edit
ivopt changes alone

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #14 from kugan at gcc dot gnu.org ---
Created attachment 46104
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46104&action=edit
testcase

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #15 from kugan at gcc dot gnu.org ---
(In reply to Wilco from comment #11)
> There is also something odd with the way the loop iterates, this doesn't
> look right:
> 
> whilelo p0.s, x3, x4
> incw x3
> ptest   p1, p0.b
> bne .L3

I am not sure I understand this. I tried with qemu using an execution testcase
and it seems to work.

whilelo p0.s, x4, x5
incw x4
ptest   p1, p0.b
bne .L3

In my case I have the above (only the register allocation differs). Isn't incw
correct, considering two vector word registers? Am I missing something here?

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-04-09 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #17 from kugan at gcc dot gnu.org ---
(In reply to Wilco from comment #16)
> (In reply to kugan from comment #15)
> > (In reply to Wilco from comment #11)
> > > There is also something odd with the way the loop iterates, this doesn't
> > > look right:
> > > 
> > > whilelo p0.s, x3, x4
> > > incw x3
> > > ptest   p1, p0.b
> > > bne .L3
> > 
> > I am not sure I understand this. I tried with qemu using an execution
> > testcase and It seems to work.
> > 
> > whilelo p0.s, x4, x5
> > incw x4
> > ptest   p1, p0.b
> > bne .L3
> > In my case I have the above (register allocation difference only) incw is
> > correct considering two vector word registers? Am I missing something here?
> 
> I'm talking about the completely redundant ptest, where does that come from?

It is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88836

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-06-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #19 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Thu Jun 13 03:18:54 2019
New Revision: 272232

URL: https://gcc.gnu.org/viewcvs?rev=272232&root=gcc&view=rev
Log:

gcc/ChangeLog:

2019-06-13  Kugan Vivekanandarajah  

PR target/88834
* tree-ssa-loop-ivopts.c (get_mem_type_for_internal_fn): Handle
IFN_MASK_LOAD_LANES and IFN_MASK_STORE_LANES.
(get_alias_ptr_type_for_ptr_address): Likewise.
(add_iv_candidate_for_use): Add scaled index candidate if useful.
* tree-ssa-address.c (preferred_mem_scale_factor): New.
* config/aarch64/aarch64.c (aarch64_classify_address): Relax
allow_reg_index_p.

gcc/testsuite/ChangeLog:

2019-06-13  Kugan Vivekanandarajah  

PR target/88834
* gcc.target/aarch64/pr88834.c: New test.
* gcc.target/aarch64/sve/struct_vect_1.c: Adjust.
* gcc.target/aarch64/sve/struct_vect_14.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_15.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_16.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_17.c: Likewise.
* gcc.target/aarch64/sve/struct_vect_7.c: Likewise.


Added:
trunk/gcc/testsuite/gcc.target/aarch64/pr88834.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_1.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_14.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_15.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_16.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_17.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/struct_vect_7.c
trunk/gcc/tree-ssa-address.c
trunk/gcc/tree-ssa-address.h
trunk/gcc/tree-ssa-loop-ivopts.c

[Bug target/88838] [SVE] Use 32-bit WHILELO in LP64 mode

2019-06-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88838

--- Comment #6 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Thu Jun 13 03:34:28 2019
New Revision: 272233

URL: https://gcc.gnu.org/viewcvs?rev=272233&root=gcc&view=rev
Log:

gcc/ChangeLog:

2019-06-13  Kugan Vivekanandarajah  

PR target/88838
* tree-vect-loop-manip.c (vect_set_loop_masks_directly): If the
compare_type is not with Pmode size, we will create an IV with
Pmode size with truncated use (i.e. converted to the correct type).
* tree-vect-loop.c (vect_verify_full_masking): Find IV type.
(vect_iv_limit_for_full_masking): New. Factored out of
vect_set_loop_condition_masked.
* tree-vectorizer.h (LOOP_VINFO_MASK_IV_TYPE): New.
(vect_iv_limit_for_full_masking): Declare.

gcc/testsuite/ChangeLog:

2019-06-13  Kugan Vivekanandarajah  

PR target/88838
* gcc.target/aarch64/pr88838.c: New test.
* gcc.target/aarch64/sve/while_1.c: Adjust.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/pr88838.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/sve/while_1.c
trunk/gcc/tree-vect-loop-manip.c
trunk/gcc/tree-vect-loop.c
trunk/gcc/tree-vectorizer.h

[Bug target/88834] [SVE] Poor addressing mode choices for LD2 and ST2

2019-06-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834

--- Comment #21 from kugan at gcc dot gnu.org ---
(In reply to Christophe Lyon from comment #20)
> Hi Kugan,
> 
> The new test fails with -mabi=ilp32:
> FAIL: gcc.target/aarch64/pr88834.c scan-assembler-times \\tld2w\\t{z[0-9]+.s
> - z[0-9]+.s}, p[0-7]/z, \\[x[0-9]+, x[0-9]+, lsl 2\\]\\n 2
> FAIL: gcc.target/aarch64/pr88834.c scan-assembler-times \\tst2w\\t{z[0-9]+.s
> - z[0-9]+.s}, p[0-7], \\[x[0-9]+, x[0-9]+, lsl 2\\]\\n 1

Thanks Christophe. In the back-end, when we use ILP32, we don't accept SImode
ops like:

(plus:SI (mult:SI (reg:SI 91)
(const_int 4 [0x4]))
(reg:SI 90))

whereas we would accept Pmode. My question is, should we care about ILP32 for
SVE? If so, we need to fix this. Otherwise, we can run the test for LP64 only.

[Bug tree-optimization/86489] ICE in gimple_phi_arg starting with r261682 when building 531.deepsjeng_r with FDO + LTO

2018-07-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86489

--- Comment #1 from kugan at gcc dot gnu.org ---
Sorry about the breakage. I am trying to reproduce it on x86-64. Please let me
know if you have a testcase.

[Bug tree-optimization/86489] ICE in gimple_phi_arg starting with r261682 when building 531.deepsjeng_r with FDO + LTO

2018-07-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86489

--- Comment #3 from kugan at gcc dot gnu.org ---
(In reply to Richard Biener from comment #2)
>   gimple *phi = SSA_NAME_DEF_STMT (b_11);
>   if (gimple_code (phi) != GIMPLE_PHI
>   || (gimple_assign_lhs (and_stmt)
>   != gimple_phi_arg_def (phi, loop_latch_edge (loop)->dest_idx)))
> return false;
> 
> this may fail if the PHI in question is not the correct one in which case
> it may not have the argument at the latch dest_idx.  Try first verifying
> that the loop latch destination is indeed gimple_bb (phi).

Yes, thanks for spotting that. I am testing the following patch:

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index f6fa2f7..fbdf838 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2555,6 +2555,7 @@ number_of_iterations_popcount (loop_p loop, edge exit,
... = PHI .  */
   gimple *phi = SSA_NAME_DEF_STMT (b_11);
   if (gimple_code (phi) != GIMPLE_PHI
+  || (gimple_bb (phi) != loop_latch_edge (loop)->dest)
   || (gimple_assign_lhs (and_stmt)
  != gimple_phi_arg_def (phi, loop_latch_edge (loop)->dest_idx)))
 return false;

Is checking that there is an argument at the latch dest_idx (argument count of
PHI) still necessary?

[Bug tree-optimization/86489] ICE in gimple_phi_arg starting with r261682 when building 531.deepsjeng_r with FDO + LTO

2018-07-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86489

--- Comment #7 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Fri Jul 13 05:25:47 2018
New Revision: 262622

URL: https://gcc.gnu.org/viewcvs?rev=262622&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-07-13  Kugan Vivekanandarajah  
Richard Biener  

PR middle-end/86489
* tree-ssa-loop-niter.c (number_of_iterations_popcount): Check
that the loop latch destination where phi is defined.

gcc/testsuite/ChangeLog:

2018-07-13  Kugan Vivekanandarajah  

PR middle-end/86489
* gcc.dg/pr86489.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/pr86489.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-niter.c

[Bug tree-optimization/86544] Popcount detection generates different code on C and C++

2018-07-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86544

--- Comment #1 from kugan at gcc dot gnu.org ---
(In reply to ktkachov from comment #0)
> Great to see that GCC now detects the popcount loop in PR 82479!
> I am seeing some curious differences between gcc and g++ though.
> int
> pc (unsigned long long b)
> {
> int c = 0;
> 
> while (b) {
> b &= b - 1;
> c++;
> }
> 
> return c;
> }
> 
> If compiled with gcc -O3 on aarch64 this gives:
> pc:
> fmov d0, x0
> cnt v0.8b, v0.8b
> addv b0, v0.8b
> umov w0, v0.b[0]
> ret
> 
> whereas if compiled with g++ -O3 it gives:
> _Z2pcy:
> .LFB0:
> .cfi_startproc
> fmov d0, x0
> cmp x0, 0
> cnt v0.8b, v0.8b
> addv b0, v0.8b
> umov w0, v0.b[0]
> and x0, x0, 255
> csel w0, w0, wzr, ne
> ret
> 
> which is suboptimal. It seems that phiopt3 manages to optimise the C version
> better. The GIMPLE dumps just before the phiopt pass are:
> For the C (good version):
> 
>   int c;
>   int _7;
> 
>[local count: 118111601]:
>   if (b_4(D) != 0)
> goto ; [89.00%]
>   else
> goto ; [11.00%]
> 
>[local count: 105119324]:
>   _7 = __builtin_popcountl (b_4(D));
> 
>[local count: 118111601]:
>   # c_12 = PHI <0(2), _7(3)>
>   return c_12;
> 
> 
> For the C++ (bad version):
> 
>   int c;
>   int _7;
> 
>[local count: 118111601]:
>   if (b_4(D) == 0)
> goto ; [11.00%]
>   else
> goto ; [89.00%]
> 
>[local count: 105119324]:
>   _7 = __builtin_popcountl (b_4(D));
> 
>[local count: 118111601]:
>   # c_12 = PHI <0(2), _7(3)>
>   return c_12;
> 
> As you can see the order of the gotos and the jump conditions is inverted.
> 
> It seems to me that the two are equivalent and GCC could be doing a better
> job of optimising.
> 
> Can we improve phiopt to handle this more effectively?

Thanks for the test case. I will look at it.
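
For reference, a minimal source-level sketch of why the guard on b is
redundant in both GIMPLE dumps: the loop from comment #0 already returns 0 for
b == 0, and so does the builtin. The names pc_loop and pc_builtin are made up
for this illustration; __builtin_popcountll is the GCC builtin for
unsigned long long.

```c
/* Loop from comment #0: each iteration clears the lowest set bit. */
static int pc_loop(unsigned long long b)
{
    int c = 0;

    while (b) {
        b &= b - 1;
        c++;
    }
    return c;
}

/* Form the optimizer is expected to reach. No "b != 0" test is needed,
   because the population count of 0 is 0. */
static int pc_builtin(unsigned long long b)
{
    return __builtin_popcountll(b);
}
```

Both the "!= 0" and the "== 0" variant of the conditional select between 0
and the popcount, so dropping the condition should be valid in either case.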

[Bug tree-optimization/86544] Popcount detection generates different code on C and C++

2018-07-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86544

--- Comment #2 from kugan at gcc dot gnu.org ---
Patch posted at https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00975.html

[Bug tree-optimization/86544] Popcount detection generates different code on C and C++

2018-07-18 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86544

--- Comment #4 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Wed Jul 18 22:11:24 2018
New Revision: 262864

URL: https://gcc.gnu.org/viewcvs?rev=262864&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-07-18  Kugan Vivekanandarajah  

PR middle-end/86544
* tree-ssa-phiopt.c (cond_removal_in_popcount_pattern): Handle
comparison with EQ_EXPR in last stmt.

gcc/testsuite/ChangeLog:

2018-07-18  Kugan Vivekanandarajah  

PR middle-end/86544
* g++.dg/tree-ssa/pr86544.C: New test.


Added:
trunk/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-phiopt.c

[Bug target/86677] New: popcount builtin detection is breaking some kernel build

2018-07-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86677

Bug ID: 86677
   Summary: popcount builtin detection is breaking some kernel
build
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

Popcount builtin detection breaks the Linux kernel build for arm/aarch64 (and
possibly other targets) when the backend does not provide appropriate popcount
patterns.

On aarch64 this happens because the kernel is built with -mgeneral-regs-only.

Also discussed in:
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00489.html

[Bug target/86677] popcount builtin detection is breaking some kernel build

2018-07-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86677

--- Comment #2 from kugan at gcc dot gnu.org ---
(In reply to Richard Biener from comment #1)
> The kernel simply has to provide __popcount{s,d}i2 like it provides other
> libgcc functions if it chooses to not link against libgcc.

Yes, I created this bug just so that I can point it to the kernel people. I
will raise it with the kernel people internally and see what I can do. Thanks.
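
As an illustration of what such a kernel-side helper could look like, here is
a rough sketch of a popcount routine that uses only general-purpose registers.
The name sketch_popcountdi2 is hypothetical; the symbol the compiler actually
calls is __popcountdi2, and its exact prototype should be checked against the
libgcc documentation for the target.

```c
/* Hypothetical stand-in for the libgcc helper, for illustration only.
   Counts the set bits of a 64-bit value without any SIMD instructions. */
int sketch_popcountdi2(unsigned long long x)
{
    /* SWAR bit count: sum adjacent 1-bit, 2-bit and 4-bit fields ... */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    /* ... then add the eight per-byte counts into the top byte. */
    return (int)((x * 0x0101010101010101ULL) >> 56);
}
```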

[Bug target/87253] New: Python test_ctypes fails when built with gcc 8.2

2018-09-08 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87253

Bug ID: 87253
   Summary: Python test_ctypes fails when built with gcc 8.2
   Product: gcc
   Version: 8.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

Python-2.7.15

Steps to reproduce error
In Python src directory:
./configure
make
./python Lib/test/regrtest.py -v test_ctypes

==
FAIL: test_struct_by_value (ctypes.test.test_win32.Structures)
--
Traceback (most recent call last):
  File
"/home/kugan.vivekanandarajah/Python-2.7.15/Lib/ctypes/test/test_win32.py",
line 113, in test_struct_by_value
self.assertEqual(ret.left, left.value)
AssertionError: -200 != 10



gdb ./python
b ReturnRect
r Lib/test/regrtest.py -v test_ctypes

(gdb) p cp
$9 = {x = 15, y = 25}
(gdb) p fp
$10 = {x = 548534164448, y = 9890688}

cp and fp are passed the same value, as can be seen below:

vi /home/kugan.vivekanandarajah/Python-2.7.15/Lib/ctypes/test/test_win32.py
+112

pt = POINT(15, 25)
...
ReturnRect = dll.ReturnRect
ReturnRect.argtypes = [c_int, RECT, POINTER(RECT), POINT, RECT,
  POINTER(RECT), POINT, RECT]


ret = ReturnRect(i, rect, pointer(rect), pt, rect,
 byref(rect), pt, rect)


gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/kugan.vivekanandarajah/install/usr/local/bin/../libexec/gcc/aarch64-unknown-linux-gnu/8.2.1/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: ../gcc/configure --disable-bootstrap
Thread model: posix
gcc version 8.2.1 20180907 (GCC)

[Bug c++/87469] [9 Regression] ice in record_estimate, at tree-ssa-loop-niter.c:3271

2018-10-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87469

--- Comment #4 from kugan at gcc dot gnu.org ---
In the loop here, the value defined in the loop (e) is used outside the loop,
hence this should not be detected as popcount (AFAIK). I will have a look at
fixing this.

[Bug c++/87469] [9 Regression] ice in record_estimate, at tree-ssa-loop-niter.c:3271

2018-10-29 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87469

--- Comment #5 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Mon Oct 29 22:02:45 2018
New Revision: 265605

URL: https://gcc.gnu.org/viewcvs?rev=265605&root=gcc&view=rev
Log:
gcc/testsuite/ChangeLog:

2018-10-29  Kugan Vivekanandarajah  

PR middle-end/87469
* g++.dg/pr87469.C: New test.

gcc/ChangeLog:

2018-10-29  Kugan Vivekanandarajah  

PR middle-end/87469
* tree-ssa-loop-niter.c (number_of_iterations_popcount): Fix niter
max value.



Added:
trunk/gcc/testsuite/g++.dg/pr87469.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-niter.c

[Bug middle-end/87528] Popcount changes caused 531.deepsjeng_r run-time regression on Skylake

2018-11-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87528

--- Comment #7 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Mon Nov 12 23:43:56 2018
New Revision: 266039

URL: https://gcc.gnu.org/viewcvs?rev=266039&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-11-13  Kugan Vivekanandarajah  

PR middle-end/86677
PR middle-end/87528
* tree-scalar-evolution.c (expression_expensive_p): Make BUILTIN POPCOUNT
as expensive when backend does not define it.

gcc/testsuite/ChangeLog:

2018-11-13  Kugan Vivekanandarajah  

PR middle-end/86677
PR middle-end/87528
* g++.dg/tree-ssa/pr86544.C: Run only for target supporting popcount
pattern.
* gcc.dg/tree-ssa/popcount.c: Likewise.
* gcc.dg/tree-ssa/popcount2.c: Likewise.
* gcc.dg/tree-ssa/popcount3.c: Likewise.
* gcc.target/aarch64/popcount4.c: New test.
* lib/target-supports.exp (check_effective_target_popcountl): New.


Added:
trunk/gcc/testsuite/gcc.target/aarch64/popcount4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
trunk/gcc/testsuite/lib/target-supports.exp
trunk/gcc/tree-scalar-evolution.c

[Bug target/86677] popcount builtin detection is breaking some kernel build

2018-11-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86677

--- Comment #13 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Mon Nov 12 23:43:56 2018
New Revision: 266039

URL: https://gcc.gnu.org/viewcvs?rev=266039&root=gcc&view=rev
Log:
gcc/ChangeLog:

2018-11-13  Kugan Vivekanandarajah  

PR middle-end/86677
PR middle-end/87528
* tree-scalar-evolution.c (expression_expensive_p): Make BUILTIN POPCOUNT
as expensive when backend does not define it.

gcc/testsuite/ChangeLog:

2018-11-13  Kugan Vivekanandarajah  

PR middle-end/86677
PR middle-end/87528
* g++.dg/tree-ssa/pr86544.C: Run only for target supporting popcount
pattern.
* gcc.dg/tree-ssa/popcount.c: Likewise.
* gcc.dg/tree-ssa/popcount2.c: Likewise.
* gcc.dg/tree-ssa/popcount3.c: Likewise.
* gcc.target/aarch64/popcount4.c: New test.
* lib/target-supports.exp (check_effective_target_popcountl): New.


Added:
trunk/gcc/testsuite/gcc.target/aarch64/popcount4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
trunk/gcc/testsuite/lib/target-supports.exp
trunk/gcc/tree-scalar-evolution.c

[Bug rtl-optimization/88212] New: IRA Register Coalescing not working for the testcase

2018-11-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88212

Bug ID: 88212
   Summary: IRA Register Coalescing not working for the testcase
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
  Target Milestone: ---

When compiling the following on aarch64 with -O2:
#include <arm_neon.h>
void g(int32_t *p, int32x2x2_t val, int x)
{
 vst2_lane_s32(p,val,0);
}

generates:
.cfi_startproc
mov v2.8b, v0.8b
mov v3.8b, v1.8b
st2 {v2.s - v3.s}[0], [x0]
ret

clang produces:
st2 { v0.s, v1.s }[0], [x0]
ret

Essentially the problem is that access to part-registers doesn't get
coalesced, so IRA generates moves which aren't actually required.

[Bug sanitizer/88350] New: Linux kernel build ICE with allyesconfig for aarch64

2018-12-04 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88350

Bug ID: 88350
   Summary: Linux kernel build ICE with allyesconfig for aarch64
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kugan at gcc dot gnu.org
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

When the Linux kernel is built (allyesconfig) with trunk, the build fails with
an ICE:


++ make
CC=/home/tcwg-buildslave/workspace/tcwg_kernel-bisect-gnu_0/bin/aarch64-cc
ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- HOSTCC=gcc -j32 -s -k
:1335:2: warning: #warning syscall rseq not implemented [-Wcpp]
*** WARNING *** there are active plugins, do not report this as a bug unless
you can reproduce it without enabling any plugins.
Event| Plugins
PLUGIN_FINISH_TYPE   | randomize_layout_plugin structleak_plugin
PLUGIN_FINISH_DECL   | randomize_layout_plugin
PLUGIN_ATTRIBUTES| randomize_layout_plugin
latent_entropy_plugin structleak_plugin
PLUGIN_START_UNIT| latent_entropy_plugin
PLUGIN_ALL_IPA_PASSES_START  | randomize_layout_plugin
during RTL pass: expand
arch/arm64/mm/flush.c: In function '__sync_icache_dcache':
arch/arm64/mm/flush.c:61:6: internal compiler error: in
asan_emit_stack_protection, at asan.c:1574
   61 | void __sync_icache_dcache(pte_t pte)
  |  ^~~~


Full build Log can be found in:
https://ci.linaro.org/job/tcwg_kernel-bisect-gnu-master-aarch64-stable-allyesconfig/11/artifact/artifacts/build-1d89613e77d7db420b13ce3ad8b98f07aaf474e8/console.log


The commit that seems to trigger this is:
Author: marxin 
Date:   Fri Nov 30 14:25:15 2018 +

Make red zone size more flexible for stack variables (PR sanitizer/81715).

2018-11-30  Martin Liska  

PR sanitizer/81715
* asan.c (asan_shadow_cst): Remove, partially transform
into flush_redzone_payload.
(RZ_BUFFER_SIZE): New.
(struct asan_redzone_buffer): New.
(asan_redzone_buffer::emit_redzone_byte): Likewise.
(asan_redzone_buffer::flush_redzone_payload): Likewise.
(asan_redzone_buffer::flush_if_full): Likewise.
(asan_emit_stack_protection): Use asan_redzone_buffer class
that is responsible for proper aligned stores and flushing
of shadow memory payload.
* asan.h (ASAN_MIN_RED_ZONE_SIZE): New.
(asan_var_and_redzone_size): Likewise.
* cfgexpand.c (expand_stack_vars): Use smaller alignment
(ASAN_MIN_RED_ZONE_SIZE) in order to make shadow memory
for automatic variables more compact.
2018-11-30  Martin Liska  

PR sanitizer/81715
* c-c++-common/asan/asan-stack-small.c: New test.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@24
138bc75d-0d04-0410-961f-82ee72b054a4

[Bug sanitizer/88350] Linux kernel build ICE with allyesconfig for aarch64

2018-12-06 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88350

kugan at gcc dot gnu.org changed:

   What|Removed |Added

  Alias|PR88333 |

--- Comment #2 from kugan at gcc dot gnu.org ---
Dup of PR88333 and fixed.

[Bug sanitizer/88333] [9 Regression] ice in asan_emit_stack_protection, at asan.c:1574

2018-12-06 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88333

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #7 from kugan at gcc dot gnu.org ---
*** Bug 88350 has been marked as a duplicate of this bug. ***

[Bug sanitizer/88350] Linux kernel build ICE with allyesconfig for aarch64

2018-12-06 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88350

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from kugan at gcc dot gnu.org ---
Duplicate

*** This bug has been marked as a duplicate of bug 88333 ***

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2016-02-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726

--- Comment #16 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Fri Feb 12 00:24:22 2016
New Revision: 233362

URL: https://gcc.gnu.org/viewcvs?rev=233362&root=gcc&view=rev
Log:
gcc/ChangeLog:

2016-02-12  Kugan Vivekanandarajah  

PR middle-end/66726
* tree-ssa-reassoc.c (optimize_range_tests): Handle tcc_compare stmt
whose result is used in PHI.
(maybe_optimize_range_tests): Likewise.
(final_range_test_p): Likewise.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug ipa/69708] ipa inline not working for function reference in static const struct

2016-02-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69708

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #2 from kugan at gcc dot gnu.org ---
In ipa-prop.c, function ipa_compute_jump_functions_for_edge called from
determine_locally_known_aggregate_parts doesn't compute jump functions for
aggregates when they are initialized during definition.

If I change the testcase to the following, it works.

typedef struct F {
int (*call)(int);
} F;

static int g(F f, int x) {
x = f.call(x);
x = f.call(x);
x = f.call(x);
x = f.call(x);
x = f.call(x);
x = f.call(x);
x = f.call(x);
x = f.call(x);
return x;
}

static int sq(int x) {
return x * x;
}

static F f = {sq};
//static const F f = {sq};

void dosomething(int);

int h(int x) {
f.call = sq;
dosomething(g(f, x));
dosomething(g((F){sq}, x));
}

I guess, for the varpool_node corresponding to static const F f = {sq};, we
have to check that the initializers are constant and create the jump function.

I still haven't figured out how I can get the initialization list for the
structure. Any pointers?

varpool_node *node = varpool_node::get (arg); should get me to the varpool
node.

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2016-02-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726

--- Comment #17 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Fri Feb 12 06:40:55 2016
New Revision: 233368

URL: https://gcc.gnu.org/viewcvs?rev=233368&root=gcc&view=rev
Log:
2016-02-12  Kugan Vivekanandarajah  

revert:
2016-02-12  Kugan Vivekanandarajah  

PR middle-end/66726
* tree-ssa-reassoc.c (optimize_range_tests): Handle tcc_compare stmt
whose result is used in PHI.
(maybe_optimize_range_tests): Likewise.
(final_range_test_p): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug ipa/69708] ipa inline not working for function reference in static const struct

2016-02-13 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69708

--- Comment #3 from kugan at gcc dot gnu.org ---
Created attachment 37685
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37685&action=edit
possible fix

Attached patch fixes the testcase.

[Bug ipa/69589] [6 Regression] ICE in initialize_node_lattices, at ipa-cp.c:971

2016-02-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69589

--- Comment #11 from kugan at gcc dot gnu.org ---
In remove_unreachable_nodes, just before ipa-cp, this node becomes local
(address taken is false and local.local = true). After that,
ipa_propagate_frequency is run, which updates the frequency to zero. I think we
should check the frequency at this point in time and remove such nodes.

[Bug ipa/69589] [6 Regression] ICE in initialize_node_lattices, at ipa-cp.c:971

2016-02-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69589

--- Comment #12 from kugan at gcc dot gnu.org ---
Created attachment 37688
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37688&action=edit
possible fix

This fixes the testcase.

[Bug c/69819] ICE on invalid code on x86_64-linux-gnu in tree check: expected function_type or method_type, have array_type in function_args_iter_init, at tree.h:4536

2016-02-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69819

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #1 from kugan at gcc dot gnu.org ---
It looks like a C front-end bug. It should be able to see that "foo" is already
defined and flag an error. The C++ front-end does just that.

[Bug tree-optimization/66726] missed optimization, factor conversion out of COND_EXPR

2016-02-14 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66726

--- Comment #18 from kugan at gcc dot gnu.org ---
Reverted r233362 as it caused PR69786 and PR69781. I will test for these and
post a revised patch for next stage1.

[Bug target/70359] [6 Regression] Code size increase for ARM compared to gcc-5.3.0

2016-03-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #10 from kugan at gcc dot gnu.org ---
I am looking into it. -mcpu=arm966e-s does not use
TARGET_NEW_GENERIC_COSTS, i.e., the rtx costs set up by the back-end might not
be optimal.

[Bug target/70359] [6 Regression] Code size increase for ARM compared to gcc-5.3.0

2016-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #11 from kugan at gcc dot gnu.org ---
The optimized GIMPLE diff between 5.3 and trunk is:

-;; Function inttostr (inttostr, funcdef_no=0, decl_uid=5268, cgraph_uid=0,
symbol_order=0)
+;; Function inttostr (inttostr, funcdef_no=0, decl_uid=4222, cgraph_uid=0,
symbol_order=0)

 Removing basic block 7
 Removing basic block 8
@@ -43,7 +43,7 @@
 goto ;

   :
-  p_22 = p_2 + 4294967294;
+  p_22 = p_16 + 4294967295;
   MEM[(char *)p_16 + 4294967295B] = 45;

   :

[Bug target/70359] [6 Regression] Code size increase for ARM compared to gcc-5.3.0

2016-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #12 from kugan at gcc dot gnu.org ---
However, the cfgexpand dumps differ significantly:
 ;; Full RTL generated for this function:
 ;;
32: NOTE_INSN_DELETED
-   38: NOTE_INSN_BASIC_BLOCK 2
+   39: NOTE_INSN_BASIC_BLOCK 2
33: r151:SI=r0:SI
34: r152:SI=r1:SI
35: r153:SI=r2:SI
36: NOTE_INSN_FUNCTION_BEG
-   40: {r141:SI=abs(r151:SI);clobber cc:CC;}
-   41: r154:SI=r153:SI-0x1
-   42: r142:SI=r152:SI+r154:SI
-   43: r155:SI=0
-   44: r156:QI=r155:SI#0
-   45: [r142:SI]=r156:QI
-   61: L61:
-   46: NOTE_INSN_BASIC_BLOCK 4
-   47: r142:SI=r142:SI-0x1
-   48: r1:SI=0xa
-   49: r0:SI=r141:SI
-   50: r0:DI=call [`__aeabi_uidivmod'] argc:0
+   41: {r141:SI=abs(r151:SI);clobber cc:CC;}
+   42: r154:SI=r153:SI-0x1
+   43: r142:SI=r152:SI+r154:SI
+   44: r155:SI=0
+   45: r156:QI=r155:SI#0
+   46: [r142:SI]=r156:QI
+   81: pc=L62
+   82: barrier
+   84: L84:
+   83: NOTE_INSN_BASIC_BLOCK 4
+   37: r142:SI=r150:SI
+   62: L62:
+   47: NOTE_INSN_BASIC_BLOCK 5
+   48: r150:SI=r142:SI-0x1
+   49: r1:SI=0xa
+   50: r0:SI=r141:SI
+   51: r0:DI=call [`__aeabi_uidivmod'] argc:0
   REG_CALL_DECL `__aeabi_uidivmod'
   REG_EH_REGION 0x8000
-   51: r162:SI=r1:SI
+   52: r162:SI=r1:SI
   REG_EQUAL umod(r141:SI,0xa)
-   52: r163:QI=r162:SI#0
-   53: r164:SI=r163:QI#0+0x30
-   54: r165:QI=r164:SI#0
-   55: [r142:SI]=r165:QI
-   56: r1:SI=0xa
-   57: r0:SI=r141:SI
-   58: r0:SI=call [`__aeabi_uidiv'] argc:0
+   53: r163:QI=r162:SI#0
+   54: r164:SI=r163:QI#0+0x30
+   55: r165:QI=r164:SI#0
+   56: [r150:SI]=r165:QI
+   57: r1:SI=0xa
+   58: r0:SI=r141:SI
+   59: r0:SI=call [`__aeabi_uidiv'] argc:0
   REG_CALL_DECL `__aeabi_uidiv'
   REG_EH_REGION 0x8000
-   59: r169:SI=r0:SI
+   60: r169:SI=r0:SI
   REG_EQUAL udiv(r141:SI,0xa)
-   60: r141:SI=r169:SI
-   62: cc:CC=cmp(r141:SI,0)
-   63: pc={(cc:CC!=0)?L61:pc}
+   61: r141:SI=r169:SI
+   63: cc:CC=cmp(r141:SI,0)
+   64: pc={(cc:CC!=0)?L84:pc}
   REG_BR_PROB 9100
-   64: NOTE_INSN_BASIC_BLOCK 5
-   65: cc:CC=cmp(r151:SI,0)
-   66: pc={(cc:CC>=0)?L72:pc}
+   65: NOTE_INSN_BASIC_BLOCK 6
+   66: cc:CC=cmp(r151:SI,0)
+   67: pc={(cc:CC>=0)?L77:pc}
   REG_BR_PROB 6335
-   67: NOTE_INSN_BASIC_BLOCK 6
-   68: r149:SI=r142:SI-0x1
-   69: r170:SI=0x2d
-   70: r171:QI=r170:SI#0
-   71: [r142:SI-0x1]=r171:QI
-   37: r142:SI=r149:SI
-   72: L72:
-   73: NOTE_INSN_BASIC_BLOCK 7
-   74: r150:SI=r142:SI
+   68: NOTE_INSN_BASIC_BLOCK 7
+   69: r149:SI=r142:SI-0x2
+   70: r170:SI=0x2d
+   71: r171:QI=r170:SI#0
+   72: [r150:SI-0x1]=r171:QI
+   38: r150:SI=r149:SI
+   77: L77:
+   80: NOTE_INSN_BASIC_BLOCK 9
78: r0:SI=r150:SI
79: use r0:SI

[Bug target/70359] [6 Regression] Code size increase for ARM compared to gcc-5.3.0

2016-03-29 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #14 from kugan at gcc dot gnu.org ---
(In reply to Jeffrey A. Law from comment #13)
> The change to the assignment of p_22 is made by forwprop1.
> 
> It does create a situation where p_2 is live outside the loop and hides the
> CSE opportunity, which may be the cause of the more significant differences
> at expansion time.

Indeed. This is what I see: 

gcc 5 branch with  -O2 t2.c  -S -Os -marm  -mcpu=arm966e-s 96 Bytes
trunk with -O2 t2.c  -c -Os -marm  -mcpu=arm966e-s  112 Bytes
trunk with -O2 t2.c  -c -Os -marm   108 Bytes
trunk with -O2 t2.c  -c -Os -marm  -mcpu=arm966e-s  -fno-tree-forwprop 96 Bytes
trunk with -O2 t2.c  -c -Os -marm  -mcpu=arm966e-s and Jakub's changes in c# 5
- 100 Bytes

[Bug tree-optimization/61839] More optimize opportunity for VRP

2016-04-12 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61839

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org
  Known to fail|4.10.0  |5.0

--- Comment #2 from kugan at gcc dot gnu.org ---

> -  c = b != 0 ? 486097858 : 972195717;
> +  c = a + 972195718 >> (b != 0);

...

> until the very end, not transforming c_6.  Note that VRP could do the
> missing transform as it knows that _5 is [0, 1] (it has to jump through
> the shift - the value-range for the shift itself is too broad).
> 
> If written this kind of transform should be applied more generally, not
> just for shifts.  It basically wants to ask whether a conditional test
> can be carried out against another SSA name (and another constant) if
> an intermediate compute can be omitted in that case.

Do you mean something like,

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index bbdf9ce..dfce619 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -9902,6 +9902,47 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 {
   enum tree_code rhs_code = gimple_assign_rhs_code (stmt);
   tree rhs1 = gimple_assign_rhs1 (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+  tree var;
+
+  /* Convert:
+COND_RES = X COMPARE Y
+TMP = (CAST) COND_RES
+LHS = CST BINOP TMP
+
+To:
+LHS = COND_RES ? (CST BINOP 1) : (CST BINOP 0) */
+
+  if (TREE_CODE_CLASS (rhs_code) == tcc_binary
+ && TREE_CODE (rhs1) == INTEGER_CST
+ && TREE_CODE (rhs2) == SSA_NAME
+ && is_gimple_assign (SSA_NAME_DEF_STMT (rhs2))
+ && gimple_assign_rhs_code (SSA_NAME_DEF_STMT (rhs2)) == NOP_EXPR
+ && (var = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (rhs2)))
+ && TREE_CODE (var) == SSA_NAME
+ && is_gimple_assign (SSA_NAME_DEF_STMT (var))
+ && TREE_CODE_CLASS (gimple_assign_rhs_code (SSA_NAME_DEF_STMT (var)))
+ == tcc_comparison)
+
+   {
+ gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+ value_range *vr = get_value_range (var);
+ if (range_int_cst_p (vr)
+ && integer_zerop (vr->min)
+ && integer_onep (vr->max))
+   {
+
+ tree lhs = gimple_assign_lhs (stmt);
+ tree new_rhs1 =  int_const_binop (rhs_code, rhs1, vr->min);
+ tree new_rhs2 =  int_const_binop (rhs_code, rhs1, vr->max);
+
+ gimple *s = gimple_build_assign (lhs, COND_EXPR, var,
+  new_rhs1,
+  new_rhs2 PASS_MEM_STAT);
+ gsi_replace (&gsi, s, false);
+ return true;
+   }
+   }

   switch (rhs_code)
{
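
To make the intended rewrite concrete, here is a small source-level sketch of
the "LHS = CST BINOP TMP" to "COND_RES ? (CST BINOP 1) : (CST BINOP 0)" idea
described in the comment inside the patch. The function names are made up, and
the constant is taken from the quoted example purely for illustration.

```c
/* Original shape: a constant combined with the 0/1 result of a comparison. */
static int shift_form(int b)
{
    return 972195717 >> (b != 0);
}

/* Shape after the folding: the comparison selects between two constants
   that were computed at compile time (CST >> 1 and CST >> 0). */
static int select_form(int b)
{
    return (b != 0) ? (972195717 >> 1) : (972195717 >> 0);
}
```

Because the comparison result can only be 0 or 1, both arms fold to constants
(486097858 and 972195717 here) and the shift disappears.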

[Bug tree-optimization/70841] reassoc fails to handle FP division

2016-05-03 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70841

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #2 from kugan at gcc dot gnu.org ---
Created attachment 38405
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38405&action=edit
prototype patch

Untested prototype patch.

[Bug tree-optimization/63586] x+x+x+x -> 4*x in gimple

2016-05-05 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63586

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||zboson at zboson dot net

--- Comment #7 from kugan at gcc dot gnu.org ---
*** Bug 68105 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/68105] optimizing repeated floating point addition to multiplication

2016-05-05 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68105

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||kugan at gcc dot gnu.org
 Resolution|--- |DUPLICATE

--- Comment #2 from kugan at gcc dot gnu.org ---
Looks like a duplicate of PR63586.

*** This bug has been marked as a duplicate of bug 63586 ***

[Bug tree-optimization/63586] x+x+x+x -> 4*x in gimple

2016-05-17 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63586

--- Comment #8 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Wed May 18 00:58:45 2016
New Revision: 236356

URL: https://gcc.gnu.org/viewcvs?rev=236356&root=gcc&view=rev
Log:
gcc/testsuite/ChangeLog:

2016-05-17  Kugan Vivekanandarajah  

PR middle-end/63586
* gcc.dg/tree-ssa/pr63586-2.c: New test.
* gcc.dg/tree-ssa/pr63586.c: New test.
* gcc.dg/tree-ssa/reassoc-14.c: Adjust multiplication count.

gcc/ChangeLog:

2016-05-17  Kugan Vivekanandarajah  

PR middle-end/63586
* tree-ssa-reassoc.c (transform_add_to_multiply): New.
(reassociate_bb): Call transform_add_to_multiply.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
trunk/gcc/tree-ssa-reassoc.c
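
A minimal sketch of the rewrite this PR asks for, written at the source level
rather than in GIMPLE. The function names add4 and mul4 are made up; unsigned
arithmetic is used so that wraparound is well defined. For floating-point
operands the two forms can round differently, so the rewrite presumably
requires relaxed FP semantics there.

```c
/* Repeated addition as it might appear in the source. */
static unsigned add4(unsigned x)
{
    return x + x + x + x;
}

/* Equivalent single multiplication that reassociation can produce. */
static unsigned mul4(unsigned x)
{
    return 4u * x;
}
```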

[Bug tree-optimization/63586] x+x+x+x -> 4*x in gimple

2016-05-18 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63586

--- Comment #9 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Wed May 18 07:50:05 2016
New Revision: 236359

URL: https://gcc.gnu.org/viewcvs?rev=236359&root=gcc&view=rev
Log:
Adding the testcases which were not added as part of r236356.
gcc/testsuite/ChangeLog:

2016-05-17  Kugan Vivekanandarajah  

PR middle-end/63586
* gcc.dg/tree-ssa/pr63586-2.c: New test.
* gcc.dg/tree-ssa/pr63586.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c

[Bug tree-optimization/71170] [7 Regression] ICE in rewrite_expr_tree, at tree-ssa-reassoc.c:3898

2016-05-18 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71170

--- Comment #7 from kugan at gcc dot gnu.org ---
Created attachment 38519
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38519&action=edit
Another way to fix

Thanks, Martin Liška, for looking into this. I am attaching another way to fix
this. Testing is ongoing and I will update once the results are available.

[Bug tree-optimization/71170] [7 Regression] ICE in rewrite_expr_tree, at tree-ssa-reassoc.c:3898

2016-05-18 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71170

--- Comment #8 from kugan at gcc dot gnu.org ---
My patch is not going to work.
At tree-ssa-reassoc.c:3897, we have:

stmt: 
_15 = _4 + c_7(D);

oe->op def_stmt:
_17 = c_7(D) * 3;


:
a1_6 = s_5(D) * 2;
_1 = (long int) a1_6;
x1_8 = _1 + c_7(D);
a2_9 = s_5(D) * 4;
_2 = (long int) a2_9;
a3_11 = s_5(D) * 6;
_3 = (long int) a3_11;
_16 = x1_8 + c_7(D);
_18 = _1 + _2;
_4 = _16 + _2;
_15 = _4 + c_7(D);
_17 = c_7(D) * 3;
x_13 = _15 + _3;
return x_13;


The root cause of this is the place in which we are adding (_17 = c_7(D) * 3).
Finding the right place is not always straightforward, as this case shows.

We could try  Martin Liška's approach.

[Bug tree-optimization/71170] [7 Regression] ICE in rewrite_expr_tree, at tree-ssa-reassoc.c:3898

2016-05-18 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71170

--- Comment #9 from kugan at gcc dot gnu.org ---
We could also move _17 = c_7(D) * 3; at tree-ssa-reassoc.c:3897 to satisfy the
gcc_assert. We could do this based on the use count of _17. Any suggestions?

[Bug tree-optimization/71179] [7 Regression] ice fold_convert_loc, at fold-const.c:2360

2016-05-19 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71179

--- Comment #3 from kugan at gcc dot gnu.org ---
Created attachment 38521
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38521&action=edit
untested patch

[Bug tree-optimization/71179] [7 Regression] ice fold_convert_loc, at fold-const.c:2360

2016-05-21 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71179

--- Comment #4 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sat May 21 07:09:16 2016
New Revision: 236554

URL: https://gcc.gnu.org/viewcvs?rev=236554&root=gcc&view=rev
Log:
gcc/testsuite/ChangeLog:

2016-05-21  Kugan Vivekanandarajah  

PR middle-end/71179
* gcc.dg/tree-ssa/pr71179.c: New test.

gcc/ChangeLog:

2016-05-21  Kugan Vivekanandarajah  

PR middle-end/71179
* tree-ssa-reassoc.c (transform_add_to_multiply): Disallow float
VECTOR type.


Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr71179.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug tree-optimization/40921] missed optimization: x + (-y * z * z) => x - y * z * z

2016-05-22 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40921

--- Comment #5 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Sun May 22 08:13:13 2016
New Revision: 236564

URL: https://gcc.gnu.org/viewcvs?rev=236564&root=gcc&view=rev
Log:
gcc/testsuite/ChangeLog:

2016-05-22  Kugan Vivekanandarajah  

PR middle-end/40921
* gcc.dg/tree-ssa/pr40921.c: New test.

gcc/ChangeLog:

2016-05-22  Kugan Vivekanandarajah  

PR middle-end/40921
* tree-ssa-reassoc.c (try_special_add_to_ops): New.
(linearize_expr_tree): Call try_special_add_to_ops.
(reassociate_bb): Convert MULT_EXPR by (-1) to NEGATE_EXPR.



Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr40921.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-reassoc.c
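
As a source-level illustration of the missed optimization in the PR title, the
two functions below should compute the same value; the second form avoids the
explicit negation. The function names are made up for this sketch.

```c
/* Form from the PR title: add a negated product. */
static double neg_form(double x, double y, double z)
{
    return x + (-y * z * z);
}

/* Desired form: subtract the product instead. */
static double sub_form(double x, double y, double z)
{
    return x - y * z * z;
}
```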

[Bug tree-optimization/71170] [7 Regression] ICE in rewrite_expr_tree, at tree-ssa-reassoc.c:3898

2016-05-23 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71170

--- Comment #10 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Tue May 24 00:14:13 2016
New Revision: 236619

URL: https://gcc.gnu.org/viewcvs?rev=236619&root=gcc&view=rev
Log:
gcc/ChangeLog:

2016-05-24  Kugan Vivekanandarajah  

PR middle-end/71170
* tree-ssa-reassoc.c (struct operand_entry): Add field stmt_to_insert.
(add_to_ops_vec): Add stmt_to_insert.
(add_repeat_to_ops_vec): Init stmt_to_insert.
(insert_stmt_before_use): New.
(transform_add_to_multiply): Remove mult_stmt insertion and add it to
ops vector.
(get_ops): Init stmt_to_insert.
(maybe_optimize_range_tests): Likewise.
(rewrite_expr_tree): Insert stmt_to_insert before use stmt.
(rewrite_expr_tree_parallel): Likewise.
(reassociate_bb): Likewise.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug middle-end/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-23 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #1 from kugan at gcc dot gnu.org ---
This looks like an issue with my reassociation commit. I will have a look.

[Bug middle-end/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-24 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #2 from kugan at gcc dot gnu.org ---
Created attachment 38549
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38549&action=edit
untested patch

I am testing this patch, which fixes the issue.

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-24 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #4 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Tue May 24 10:50:01 2016
New Revision: 236634

URL: https://gcc.gnu.org/viewcvs?rev=236634&root=gcc&view=rev
Log:
gcc/testsuite/ChangeLog:

2016-05-24  Kugan Vivekanandarajah  

PR middle-end/71252
* gfortran.dg/pr71252.f90: New test.

gcc/ChangeLog:

2016-05-24  Kugan Vivekanandarajah  

PR middle-end/71252
* tree-ssa-reassoc.c (rewrite_expr_tree_parallel): Add stmt_to_insert
after build_and_add_sum creates new use stmt.


Added:
trunk/gcc/testsuite/gfortran.dg/pr71252.f90
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug tree-optimization/71230] [7 Regression] ICE : in zero_one_operation, at tree-ssa-reassoc.c:1230

2016-05-24 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71230

--- Comment #15 from kugan at gcc dot gnu.org ---

> internal compiler error: Segmentation fault
> 0x1090f973 crash_signal
>   /home/seurer/gcc/gcc-test/gcc/toplev.c:333
> 0x10b12ca0 sort_by_operand_rank
>   /home/seurer/gcc/gcc-test/gcc/tree-ssa-reassoc.c:530
> 0x10b21863 vec::qsort(int (*)(void
> const*, void const*))
>   /home/seurer/gcc/gcc-test/gcc/vec.h:951
> 0x10b21863 vec::qsort(int (*)(void const*,
> void const*))
>   /home/seurer/gcc/gcc-test/gcc/vec.h:1698
> 0x10b21863 reassociate_bb
>   /home/seurer/gcc/gcc-test/gcc/tree-ssa-reassoc.c:5268
> 0x10b21037 reassociate_bb
>   /home/seurer/gcc/gcc-test/gcc/tree-ssa-reassoc.c:5389
> 0x10b21037 reassociate_bb
>   /home/seurer/gcc/gcc-test/gcc/tree-ssa-reassoc.c:5389
> 0x10b23eff do_reassoc
>   /home/seurer/gcc/gcc-test/gcc/tree-ssa-reassoc.c:5503
> 0x10b23eff execute_reassoc
>   /home/seurer/gcc/gcc-test/gcc/tree-ssa-reassoc.c:5590
> 0x10b23eff execute
>   /home/seurer/gcc/gcc-test/gcc/tree-ssa-reassoc.c:5629
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See <http://gcc.gnu.org/bugs.html> for instructions.
This looks like what is fixed in r236673. Can you please try again with
that commit?

Thanks,
Kugan

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #8 from kugan at gcc dot gnu.org ---
Sorry for the breakage. I can reproduce this. I am looking into it.

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #9 from kugan at gcc dot gnu.org ---
What application is this testcase from? I have a patch which I want to try.

[Bug middle-end/71269] [7 Regression] segfault while compiling sqlite

2016-05-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71269

--- Comment #3 from kugan at gcc dot gnu.org ---
(In reply to Markus Trippelsdorf from comment #2)
> Started with r236634.

Hi Markus,

This looks like a dup of PR71252. I have a patch that I am testing now. Do
you have a preprocessed file I can test it with?

Thanks,
Kugan

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-25 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #12 from kugan at gcc dot gnu.org ---
Posted patch at:
https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02061.html

[Bug middle-end/71269] [7 Regression] segfault while compiling sqlite

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71269

--- Comment #8 from kugan at gcc dot gnu.org ---
(In reply to Roger Orr from comment #7)
> I've got a very similar problem, building valgrind with trunk revision
> 236644:
> 
> m_mallocfree.c: In function 'sanity_check_malloc_arena':
> m_mallocfree.c:1055:13: internal compiler error: Segmentation fault
>  static void sanity_check_malloc_arena ( ArenaId aid )
>  ^
> 0xbac3df crash_signal
> ../../gcc/toplev.c:333
> 0xd46249 sort_by_operand_rank
> ../../gcc/tree-ssa-reassoc.c:530
> 0xd52d83 reassociate_bb
> ../../gcc/tree-ssa-reassoc.c:5268
> 0xd51027 reassociate_bb
> ../../gcc/tree-ssa-reassoc.c:5389
> 0xd53f7a do_reassoc
> ../../gcc/tree-ssa-reassoc.c:5503
> 0xd53f7a execute_reassoc
> ../../gcc/tree-ssa-reassoc.c:5590
> 0xd53f7a execute
> ../../gcc/tree-ssa-reassoc.c:5629
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See <http://gcc.gnu.org/bugs.html> for instructions.
> 
> I've uploaded the preprocessed input and compiler output.

This looks like what is fixed in r236673. Can you attach a preprocessed
source, please?

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #14 from kugan at gcc dot gnu.org ---
(In reply to Roger Orr from comment #13)
> The patch sadly does not appear to fix the (very similar looking) valgrind
> compilation failure I posted in pr71269.
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71269#c7

That looks like what is fixed in r236673. If you still have the issue, could
you please attach a preprocessed source to reproduce it?

[Bug middle-end/71269] [7 Regression] segfault while compiling sqlite

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71269

--- Comment #10 from kugan at gcc dot gnu.org ---
(In reply to Markus Trippelsdorf from comment #9)
> (In reply to kugan from comment #8)
> > (In reply to Roger Orr from comment #7)
> > > I've got a very similar problem, building valgrind with trunk revision
> > > 236644:
> > > 
> > > m_mallocfree.c: In function 'sanity_check_malloc_arena':
> > > m_mallocfree.c:1055:13: internal compiler error: Segmentation fault
> > >  static void sanity_check_malloc_arena ( ArenaId aid )
> > >  ^
> > > 0xbac3df crash_signal
> > > ../../gcc/toplev.c:333
> > > 0xd46249 sort_by_operand_rank
> > > ../../gcc/tree-ssa-reassoc.c:530
> > > 0xd52d83 reassociate_bb
> > > ../../gcc/tree-ssa-reassoc.c:5268
> > > 0xd51027 reassociate_bb
> > > ../../gcc/tree-ssa-reassoc.c:5389
> > > 0xd53f7a do_reassoc
> > > ../../gcc/tree-ssa-reassoc.c:5503
> > > 0xd53f7a execute_reassoc
> > > ../../gcc/tree-ssa-reassoc.c:5590
> > > 0xd53f7a execute
> > > ../../gcc/tree-ssa-reassoc.c:5629
> > > Please submit a full bug report,
> > > with preprocessed source if appropriate.
> > > Please include the complete backtrace with any bug report.
> > > See <http://gcc.gnu.org/bugs.html> for instructions.
> > > 
> > > I've uploaded the preprocessed input and compiler output.
> > 
> > This Looks like what is fixed in r236673. Can you attach a per-processed src
> > please.
> 
> It is already attached.

Sorry, I missed that.

On x86-64-linux-gnu, with the current trunk:
./build/gcc/cc1 -O2 m_mallocfree.i
./build/gcc/cc1 -O3 m_mallocfree.i
both work. What command line did you use to trigger the ICE?

[Bug middle-end/71269] [7 Regression] segfault while compiling sqlite

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71269

--- Comment #12 from kugan at gcc dot gnu.org ---
(In reply to Markus Trippelsdorf from comment #11)
>  > Sorry, I missed that.
> > 
> > on x86-64-linux-gnu, with the current trunk:
> > ./build/gcc/cc1 -O2 m_mallocfree.i 
> > ./build/gcc/cc1 -O3 m_mallocfree.i
> > are working. What command line did you use to ICE ?
> 
> See the other attachment ;)
> I cannot reproduce the issue on current trunk.

Thanks for checking. I need some sleep now :)

[Bug bootstrap/71292] Bootstrap failure with segfault in tree-reassoc.c

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71292

--- Comment #1 from kugan at gcc dot gnu.org ---
(In reply to ktkachov from comment #0)
> An aarch64-none-linux-gnu bootstrap with an in-tree mpfr fails with an ICE:
> exp_2.i: In function ‘fn1’:
> exp_2.i:4:6: internal compiler error: Segmentation fault
>  void fn1() {
>   ^~~
> 0xb1f8cf crash_signal
> $SRC/gcc/toplev.c:333
> 0x89bcb9 bb_seq_addr
> $SRC/gcc/gimple.h:1655
> 0x89bcb9 gsi_start_bb
> $SRC/gcc/gimple-iterator.h:129
> 0x89bcb9 gsi_for_stmt(gimple*)
> $SRC/gcc/gimple-iterator.c:617
> 0xcbbeba insert_stmt_after
> $SRC/gcc/tree-ssa-reassoc.c:1323
> 0xcbd67a build_and_add_sum
> $SRC/gcc/tree-ssa-reassoc.c:1392
> 0xcbf34f rewrite_expr_tree_parallel
> $SRC/gcc/tree-ssa-reassoc.c:4128
> 0xcc8b95 reassociate_bb
> $SRC/gcc/tree-ssa-reassoc.c:5339
> 0xcc8ad7 reassociate_bb
> $SRC/gcc/tree-ssa-reassoc.c:5391
> 0xccb523 do_reassoc
> $SRC/gcc/tree-ssa-reassoc.c:5505
> 0xccb523 execute_reassoc
> $SRC/gcc/tree-ssa-reassoc.c:5592
> 0xccb523 execute
> $SRC/gcc/tree-ssa-reassoc.c:5631
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See <http://gcc.gnu.org/bugs.html> for instructions.
> 
> The reduced testcase for that reproducible with trunk at:
> gcc version 7.0.0 20160526 
> is:
> 
> unsigned long a;
> long b, d;
> int c;
> void fn1() {
>   unsigned long e = a + c;
>   b = d + e + a + 8;
> }
> 
> compile with -O2.
> Compiling with -fno-tree-reassoc doesn't ICE

It looks like a dup of PR71252.

Does the patch at
https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02076.html help?

[Bug bootstrap/71292] Bootstrap failure with segfault in tree-reassoc.c

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71292

--- Comment #2 from kugan at gcc dot gnu.org ---
(In reply to kugan from comment #1)
> (In reply to ktkachov from comment #0)
> > An aarch64-none-linux-gnu bootstrap with an in-tree mpfr fails with an ICE:
> > exp_2.i: In function ‘fn1’:
> > exp_2.i:4:6: internal compiler error: Segmentation fault
> >  void fn1() {
> >   ^~~
> > 0xb1f8cf crash_signal
> > $SRC/gcc/toplev.c:333
> > 0x89bcb9 bb_seq_addr
> > $SRC/gcc/gimple.h:1655
> > 0x89bcb9 gsi_start_bb
> > $SRC/gcc/gimple-iterator.h:129
> > 0x89bcb9 gsi_for_stmt(gimple*)
> > $SRC/gcc/gimple-iterator.c:617
> > 0xcbbeba insert_stmt_after
> > $SRC/gcc/tree-ssa-reassoc.c:1323
> > 0xcbd67a build_and_add_sum
> > $SRC/gcc/tree-ssa-reassoc.c:1392
> > 0xcbf34f rewrite_expr_tree_parallel
> > $SRC/gcc/tree-ssa-reassoc.c:4128
> > 0xcc8b95 reassociate_bb
> > $SRC/gcc/tree-ssa-reassoc.c:5339
> > 0xcc8ad7 reassociate_bb
> > $SRC/gcc/tree-ssa-reassoc.c:5391
> > 0xccb523 do_reassoc
> > $SRC/gcc/tree-ssa-reassoc.c:5505
> > 0xccb523 execute_reassoc
> > $SRC/gcc/tree-ssa-reassoc.c:5592
> > 0xccb523 execute
> > $SRC/gcc/tree-ssa-reassoc.c:5631
> > Please submit a full bug report,
> > with preprocessed source if appropriate.
> > Please include the complete backtrace with any bug report.
> > See <http://gcc.gnu.org/bugs.html> for instructions.
> > 
> > The reduced testcase for that reproducible with trunk at:
> > gcc version 7.0.0 20160526 
> > is:
> > 
> > unsigned long a;
> > long b, d;
> > int c;
> > void fn1() {
> >   unsigned long e = a + c;
> >   b = d + e + a + 8;
> > }
> > 
> > compile with -O2.
> > Compiling with -fno-tree-reassoc doesn't ICE
> 
> It looks like dup of PR71252.
> 
> does the patch at help? 
> https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02076.html

I can reproduce the error. With the patch from
https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02076.html, I can bootstrap.
Please let me know if you still have any problem with the patch.

[Bug bootstrap/71292] Bootstrap failure with segfault in tree-reassoc.c

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71292

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from kugan at gcc dot gnu.org ---
Dup of PR71252

*** This bug has been marked as a duplicate of bug 71252 ***

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||ktkachov at gcc dot gnu.org

--- Comment #16 from kugan at gcc dot gnu.org ---
*** Bug 71292 has been marked as a duplicate of this bug. ***

[Bug middle-end/71269] [7 Regression] segfault while compiling sqlite

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71269

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #15 from kugan at gcc dot gnu.org ---
Duplicate of 71252

*** This bug has been marked as a duplicate of bug 71252 ***

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||trippels at gcc dot gnu.org

--- Comment #17 from kugan at gcc dot gnu.org ---
*** Bug 71269 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com

--- Comment #18 from kugan at gcc dot gnu.org ---
*** Bug 71284 has been marked as a duplicate of this bug. ***

[Bug middle-end/71284] [7 Regression] internal compiler error: Segmentation fault

2016-05-26 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71284

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||kugan at gcc dot gnu.org
 Resolution|--- |DUPLICATE

--- Comment #2 from kugan at gcc dot gnu.org ---
Duplicate of PR71252

*** This bug has been marked as a duplicate of bug 71252 ***

[Bug tree-optimization/71323] [7 Regression] ice in verify_ssa failed

2016-05-30 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71323

kugan at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kugan at gcc dot gnu.org

--- Comment #2 from kugan at gcc dot gnu.org ---
Duplicate of PR71252. Patch posted in
https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02298.html fixes this.

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-30 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #21 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Mon May 30 10:45:19 2016
New Revision: 236875

URL: https://gcc.gnu.org/viewcvs?rev=236875&root=gcc&view=rev
Log:
gcc/ChangeLog:

2016-05-30  Kugan Vivekanandarajah  

PR middle-end/71252
* tree-ssa-reassoc.c (swap_ops_for_binary_stmt): Fix swap such that
all fields including stmt_to_insert are swapped.


gcc/testsuite/ChangeLog:

2016-05-30  Kugan Vivekanandarajah  

PR middle-end/71252
* gcc.dg/tree-ssa/pr71252-2.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr71252-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-reassoc.c
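
The log entry above describes fixing a swap so that every field, including stmt_to_insert, moves with its entry. The sketch below illustrates that class of bug under stated assumptions: struct entry is a simplified, hypothetical stand-in rather than GCC's operand_entry, and both swap functions are invented for illustration.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for an operand entry; not GCC's type. */
struct entry {
    unsigned rank;
    int op;
    const char *stmt_to_insert;   /* pending statement, or NULL if none */
};

/* Field-by-field swap that forgets a later-added member. */
void swap_partial(struct entry *a, struct entry *b)
{
    unsigned rank = a->rank;
    int op = a->op;

    a->rank = b->rank;
    a->op = b->op;
    b->rank = rank;
    b->op = op;
    /* stmt_to_insert is not swapped, so it now belongs to the wrong entry. */
}

/* Whole-struct swap: every member, present and future, moves together. */
void swap_whole(struct entry *a, struct entry *b)
{
    struct entry tmp = *a;

    *a = *b;
    *b = tmp;
}
```

After swap_partial, a pending statement stays attached to the slot rather than to the operand it was created for. Swapping the whole object avoids that mismatch.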

[Bug tree-optimization/71252] [7 Regression] ICE: verify_ssa failed : definition in block 7 does not dominate use in block 6

2016-05-30 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71252

--- Comment #22 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Mon May 30 10:47:57 2016
New Revision: 236876

URL: https://gcc.gnu.org/viewcvs?rev=236876&root=gcc&view=rev
Log:
gcc/testsuite/ChangeLog:

2016-05-30  Kugan Vivekanandarajah  

PR middle-end/71269
PR middle-end/71292
* gcc.dg/tree-ssa/pr71269.c: New test.
* gcc.dg/tree-ssa/pr71292.c: New test.

gcc/ChangeLog:

2016-05-30  Kugan Vivekanandarajah  

PR middle-end/71269
PR middle-end/71252
* tree-ssa-reassoc.c (insert_stmt_before_use): Use find_insert_point so
that inserted stmt will not dominate stmts that defines its operand.
(rewrite_expr_tree): Add stmt_to_insert before adding the use stmt.
(rewrite_expr_tree_parallel): Likewise.


Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr71269.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr71292.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug middle-end/71269] [7 Regression] segfault while compiling sqlite

2016-05-30 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71269

--- Comment #16 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Mon May 30 10:47:57 2016
New Revision: 236876

URL: https://gcc.gnu.org/viewcvs?rev=236876&root=gcc&view=rev
Log:
gcc/testsuite/ChangeLog:

2016-05-30  Kugan Vivekanandarajah  

PR middle-end/71269
PR middle-end/71292
* gcc.dg/tree-ssa/pr71269.c: New test.
* gcc.dg/tree-ssa/pr71292.c: New test.

gcc/ChangeLog:

2016-05-30  Kugan Vivekanandarajah  

PR middle-end/71269
PR middle-end/71252
* tree-ssa-reassoc.c (insert_stmt_before_use): Use find_insert_point so
that inserted stmt will not dominate stmts that defines its operand.
(rewrite_expr_tree): Add stmt_to_insert before adding the use stmt.
(rewrite_expr_tree_parallel): Likewise.


Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr71269.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr71292.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug bootstrap/71292] Bootstrap failure with segfault in tree-reassoc.c

2016-05-30 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71292

--- Comment #5 from kugan at gcc dot gnu.org ---
Author: kugan
Date: Mon May 30 10:47:57 2016
New Revision: 236876

URL: https://gcc.gnu.org/viewcvs?rev=236876&root=gcc&view=rev
Log:
gcc/testsuite/ChangeLog:

2016-05-30  Kugan Vivekanandarajah  

PR middle-end/71269
PR middle-end/71292
* gcc.dg/tree-ssa/pr71269.c: New test.
* gcc.dg/tree-ssa/pr71292.c: New test.

gcc/ChangeLog:

2016-05-30  Kugan Vivekanandarajah  

PR middle-end/71269
PR middle-end/71252
* tree-ssa-reassoc.c (insert_stmt_before_use): Use find_insert_point so
that inserted stmt will not dominate stmts that defines its operand.
(rewrite_expr_tree): Add stmt_to_insert before adding the use stmt.
(rewrite_expr_tree_parallel): Likewise.


Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr71269.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr71292.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-reassoc.c

[Bug target/71281] [7 Regression] ICE on gcc trunk on knl, wsm, ivb and bdw targets (tree-ssa-reassoc)

2016-06-03 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71281

--- Comment #4 from kugan at gcc dot gnu.org ---
Sorry about the breakage. Looking into it.

[Bug target/71281] [7 Regression] ICE on gcc trunk on knl, wsm, ivb and bdw targets (tree-ssa-reassoc)

2016-06-03 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71281

--- Comment #5 from kugan at gcc dot gnu.org ---
Created attachment 38640
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38640&action=edit
proposed patch

testing this patch.

[Bug target/71281] [7 Regression] ICE on gcc trunk on knl, wsm, ivb and bdw targets (tree-ssa-reassoc)

2016-06-03 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71281

--- Comment #6 from kugan at gcc dot gnu.org ---
Patch posted for review at
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00316.html
