[Bug lto/94157] [10 Regression] error: lto-wrapper failed with -Wa,--noexecstack -Wa,--noexecstack since r10-6807-gf1a681a174cdfb82

2020-03-12 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94157

--- Comment #4 from prathamesh3492 at gcc dot gnu.org ---
(In reply to Martin Liška from comment #3)
> I've got a patch candidate, will send it to GCC patches mailing list.

Sorry for the breakage, and thanks for taking a look!

Regards,
Prathamesh

[Bug target/86753] [9/10 Regression] gcc.target/aarch64/sve/vcond_[45].c fail after recent combine patch

2019-10-17 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86753

--- Comment #10 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Fri Oct 18 05:13:26 2019
New Revision: 277141

URL: https://gcc.gnu.org/viewcvs?rev=277141&root=gcc&view=rev
Log:
2019-10-18  Prathamesh Kulkarni  
Richard Sandiford  

PR target/86753
* tree-vectorizer.h (scalar_cond_masked_key): New struct,
and define hashmap traits for it.
(loop_vec_info::scalar_cond_masked_set): New member.
(vect_record_loop_mask): Adjust prototype.
* tree-vectorizer.c (scalar_cond_masked_key::get_cond_ops_from_tree):
Implement method.
* tree-vect-loop.c (vectorizable_reduction): Pass NULL as last arg to
vect_record_loop_mask.
(vectorizable_live_operation): Likewise.
(vect_record_loop_mask): New param scalar_mask. Add entry
cond, loop_mask to scalar_cond_masked_set if scalar_mask is non NULL.
* tree-vect-stmts.c (check_load_store_masking): New param scalar_mask.
Pass it as last arg to vect_record_loop_mask.
(vectorizable_call): Pass scalar_mask as last arg to
vect_record_loop_mask.
(vectorizable_store): Likewise.
(vectorizable_load): Likewise.
(vectorizable_condition): Check if another part of vectorized code
applies loop_mask to condition or to its inverse, and if yes,
apply loop_mask to result of vector comparison.

testsuite/
* gcc.target/aarch64/sve/cond_cnot_2.c: Remove XFAIL
from { scan-assembler-not {\tsel\t}.
* gcc.target/aarch64/sve/cond_convert_1.c: Adjust to make
only one load conditional.
* gcc.target/aarch64/sve/cond_convert_4.c: Likewise.
* gcc.target/aarch64/sve/cond_unary_2.c: Likewise.
* gcc.target/aarch64/sve/vcond_4.c: Remove XFAIL's.
* gcc.target/aarch64/sve/vcond_5.c: Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/sve/cond_cnot_2.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/cond_convert_1.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/cond_convert_4.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_2.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/vcond_4.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/vcond_5.c
trunk/gcc/tree-vect-loop.c
trunk/gcc/tree-vect-stmts.c
trunk/gcc/tree-vectorizer.c
trunk/gcc/tree-vectorizer.h

[Bug tree-optimization/92155] strlen(a) not folded after memset(a, 0, sizeof a)

2019-10-18 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92155

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Hi Martin,
Just wondering if it's necessary for the 3rd arg to be sizeof ?
IIUC memset (a, 0, n) for any valid non-zero n should result in strlen (a)
being equal to 0 ?

Btw, it seems the comparison is folded to 0 in the following case:

extern char a4[4];

void g ()
{
  __builtin_memset (a4, 0, sizeof a4);
  if (__builtin_strlen (a4) != 0)
    __builtin_abort ();
}

.optimized dump shows only call to memset.
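To illustrate the question about the third argument, here is a minimal sketch, assuming only standard C semantics, of why clearing any non-zero prefix of the array gives a zero length at run time. The helper name len_after_clear and the 8-byte buffer are hypothetical and not taken from the PR.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: fill a local buffer with non-zero bytes, clear only
   its first N bytes (1 <= N <= sizeof buf) and return strlen of the result.  */
size_t
len_after_clear (size_t n)
{
  char buf[8];

  assert (n >= 1 && n <= sizeof buf);
  memset (buf, 'x', sizeof buf - 1);
  buf[sizeof buf - 1] = '\0';
  memset (buf, 0, n);	/* buf[0] is now '\0' whatever N is.  */
  return strlen (buf);
}
```

This only shows the run-time result; whether the strlen call gets folded when n is not sizeof a is the open question here.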

Thanks,
Prathamesh

[Bug tree-optimization/91532] [SVE] Redundant predicated store in gcc.target/aarch64/fmla_2.c

2019-10-21 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91532

--- Comment #4 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Mon Oct 21 07:31:45 2019
New Revision: 277237

URL: https://gcc.gnu.org/viewcvs?rev=277237&root=gcc&view=rev
Log:
2019-10-21  Prathamesh Kulkarni  

PR tree-optimization/91532
* gcc.target/aarch64/sve/fmla_2.c: Add dg-scan check for two st1d
insns.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/sve/fmla_2.c

[Bug tree-optimization/92163] [10 Regression] ICE: Segmentation fault (in bitmap_set_bit)

2019-10-21 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92163

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
Created attachment 47079
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47079&action=edit
Untested fix

Does this patch look OK ?

Thanks,
Prathamesh

[Bug tree-optimization/92163] [10 Regression] ICE: Segmentation fault (in bitmap_set_bit)

2019-10-23 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92163

--- Comment #6 from prathamesh3492 at gcc dot gnu.org ---
Posted updated patch upstream:
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01702.html

Thanks,
Prathamesh

[Bug middle-end/91272] [SVE] Use fully-masked loops for CLASTB reductions

2019-10-28 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91272

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Mon Oct 28 14:50:58 2019
New Revision: 277524

URL: https://gcc.gnu.org/viewcvs?rev=277524&root=gcc&view=rev
Log:
2019-10-28  Prathamesh Kulkarni  

PR middle-end/91272
* tree-vect-stmts.c (vectorizable_condition): Support
EXTRACT_LAST_REDUCTION with fully-masked loops.

testsuite/
* gcc.target/aarch64/sve/clastb_1.c: Add dg-scan.
* gcc.target/aarch64/sve/clastb_2.c: Likewise.
* gcc.target/aarch64/sve/clastb_3.c: Likewise.
* gcc.target/aarch64/sve/clastb_4.c: Likewise.
* gcc.target/aarch64/sve/clastb_5.c: Likewise.
* gcc.target/aarch64/sve/clastb_6.c: Likewise.
* gcc.target/aarch64/sve/clastb_7.c: Likewise.
* gcc.target/aarch64/sve/clastb_8.c: Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/sve/clastb_1.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/clastb_2.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/clastb_3.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/clastb_4.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/clastb_5.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/clastb_6.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/clastb_7.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/clastb_8.c
trunk/gcc/tree-vect-stmts.c
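
For context, a sketch of the kind of loop this change is about: a conditional "last value" reduction, where the result comes from the last iteration whose condition held. The function name and the -1 start value are illustrative only; the loop is loosely modelled on the idea behind the clastb_*.c tests, not copied from them.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Return the index of the last element of A[0..N-1] that is below LIMIT,
   or -1 if there is none.  */
int
last_index_below (const int *a, size_t n, int limit)
{
  int last = -1;

  assert (n <= INT_MAX);
  for (size_t i = 0; i < n; i++)
    if (a[i] < limit)
      last = (int) i;
  return last;
}
```

Per the log above, the commit lets such an EXTRACT_LAST_REDUCTION be vectorized when the loop is fully masked.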

[Bug tree-optimization/92163] [10 Regression] ICE: Segmentation fault (in bitmap_set_bit)

2019-10-28 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92163

--- Comment #7 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Mon Oct 28 15:01:24 2019
New Revision: 277525

URL: https://gcc.gnu.org/viewcvs?rev=277525&root=gcc&view=rev
Log:
2019-10-28  Prathamesh Kulkarni  

PR tree-optimization/92163
* tree-ssa-dse.c (delete_dead_or_redundant_assignment): New param
need_eh_cleanup with default value NULL. Gate on need_eh_cleanup
before calling bitmap_set_bit.
(dse_optimize_redundant_stores): Pass global need_eh_cleanup to
delete_dead_or_redundant_assignment.
(dse_dom_walker::dse_optimize_stmt): Likewise.
* tree-ssa-dse.h (delete_dead_or_redundant_assignment): Adjust
prototype.

testsuite/
* gcc.dg/tree-ssa/pr92163.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr92163.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-dse.c
trunk/gcc/tree-ssa-dse.h

[Bug rtl-optimization/92342] [10 Regression] a small missed transformation into x?b:0

2019-11-03 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92342

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Hi,
I reverted Segher's commit in my local tree, but am still seeing the same
code-gen for g().

Thanks,
Prathamesh

[Bug rtl-optimization/92342] [10 Regression] a small missed transformation into x?b:0

2019-11-03 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92342

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
(In reply to prathamesh3492 from comment #1)
> Hi,
> I reverted Segher's commit in my local tree, but am still seeing the same
> code-gen for g().
Oops, I was modifying the wrong branch :-/
I can confirm reverting the commit fixes this issue.
Sorry for the noise.

Regards,
Prathamesh
> 
> Thanks,
> Prathamesh

[Bug tree-optimization/92328] [10 Regression] ICE in eliminate_stmt, at tree-ssa-sccvn.c:5497

2019-11-04 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92328

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Reverting the following hunk in vn_reference_lookup_3 from r276882 seems to
resolve the ICE:

#if 0
  if (known_eq (ref->size, size2))
    return vn_reference_lookup_or_insert_for_pieces
	     (vuse, get_alias_set (lhs), vr->type, vr->operands,
	      SSA_VAL (def_rhs));
#endif
  if (! INTEGRAL_TYPE_P (TREE_TYPE (def_rhs))
      || type_has_mode_precision_p (TREE_TYPE (def_rhs)))
    {
      gimple_match_op op (gimple_match_cond::UNCOND,

Although I am not sure if that's the issue.

In eliminate_stmt, lhs is unsigned and sprime is int, and thus it goes into
the else branch and hits gcc_unreachable ():

  if (!useless_type_conversion_p (TREE_TYPE (lhs),
				  TREE_TYPE (sprime)))
    {
      /* We preserve conversions to but not from function or method
	 types.  This asymmetry makes it necessary to re-instantiate
	 conversions here.  */
      if (POINTER_TYPE_P (TREE_TYPE (lhs))
	  && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (TREE_TYPE (lhs))))
	sprime = fold_convert (TREE_TYPE (lhs), sprime);
      else
	gcc_unreachable ();


Thanks,
Prathamesh

[Bug tree-optimization/92608] [9/10 Regression] ICE: Segmentation fault (in find_loop_guard)

2019-11-20 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92608

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Posted patch upstream: https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02061.html

Thanks,
Prathamesh

[Bug tree-optimization/92608] [9/10 Regression] ICE: Segmentation fault (in find_loop_guard)

2019-11-21 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92608

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Thu Nov 21 20:20:36 2019
New Revision: 278598

URL: https://gcc.gnu.org/viewcvs?rev=278598&root=gcc&view=rev
Log:
Use safe_dyn_cast instead of dyn_cast in find_loop_guard to fix PR92608.

2019-11-22  Prathamesh Kulkarni  

PR tree-optimization/92608
* tree-ssa-loop-unswitch.c (find_loop_guard): Use safe_dyn_cast instead
of dyn_cast.

testsuite/
* gcc.dg/torture/pr92608.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/torture/pr92608.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-unswitch.c

[Bug tree-optimization/92649] dead store elimination

2019-11-24 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92649

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
This is likely a dup of PR89332.

Thanks,
Prathamesh

[Bug tree-optimization/92704] [8/9/10 Regression] ICE: Segmentation fault (in process_bb)

2019-11-28 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92704

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
This seems to happen because we end up with the following phi defining
.MEM_113 in the ifcvt dump:

   [local count: 3667364]:
  # q7_91 = PHI 
  # zr_lsm.55_92 = PHI 
  # .MEM_113 = PHI <(5), .MEM_106(50)>

The .MEM_113 phi seems to have a NULL (!) arg, which then causes a segfault in
the following hunk in tree-ssa-sccvn.c:process_bb()

  gphi *phi = gsi.phi ();
  use_operand_p use_p = PHI_ARG_DEF_PTR_FROM_EDGE (phi, e);
  tree arg = USE_FROM_PTR (use_p);
  if (TREE_CODE (arg) != SSA_NAME
      || virtual_operand_p (arg))
    continue;

Passing -fno-tree-loop-ifconvert in addition to the other options avoids
the segfault. I assume phi args cannot be NULL ?

Thanks,
Prathamesh

[Bug tree-optimization/89007] [SVE] Implement generic vector average expansion

2019-12-09 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89007

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Mon Dec  9 09:59:42 2019
New Revision: 279112

URL: https://gcc.gnu.org/viewcvs?rev=279112&root=gcc&view=rev
Log:
2019-12-09  Prathamesh Kulkarni  

PR tree-optimization/89007
* tree-vect-patterns.c (vect_recog_average_pattern): If there is no
target support available, generate code to distribute rshift over plus
and add a carry.

testsuite/
* gcc.target/aarch64/sve/pr89007-1.c: New test.
* gcc.target/aarch64/sve/pr89007-2.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/sve/pr89007-1.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/pr89007-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-patterns.c
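
As a scalar illustration of "distribute rshift over plus and add a carry": a minimal sketch, assuming unsigned 8-bit elements, of computing an average without a wider intermediate type. The function names are hypothetical; the code actually emitted by vect_recog_average_pattern is vector GIMPLE and may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* floor ((a + b) / 2): shift each operand, then add back the carry that
   the two dropped low bits would have produced together.  */
uint8_t
avg_floor_u8 (uint8_t a, uint8_t b)
{
  return (uint8_t) ((a >> 1) + (b >> 1) + (a & b & 1));
}

/* ceil ((a + b) / 2), i.e. (a + b + 1) >> 1: here a single set low bit is
   already enough to round up.  */
uint8_t
avg_ceil_u8 (uint8_t a, uint8_t b)
{
  return (uint8_t) ((a >> 1) + (b >> 1) + ((a | b) & 1));
}

/* Exhaustively compare both helpers against the widened computation.  */
void
check_avg_u8 (void)
{
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      {
	assert (avg_floor_u8 ((uint8_t) a, (uint8_t) b) == ((a + b) >> 1));
	assert (avg_ceil_u8 ((uint8_t) a, (uint8_t) b) == ((a + b + 1) >> 1));
      }
}
```

Neither helper ever forms a + b in a wider type, which is the point when the target has no widening or averaging support.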

[Bug tree-optimization/92867] Use ERF_RETURNS_ARG in more places

2019-12-09 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92867

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
If it's OK, I will try to implement this.

Thanks,
Prathamesh

[Bug tree-optimization/93054] ICE in gimple_set_lhs, at gimple.c:1820

2019-12-23 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93054

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
I wonder if we should emit an error in the front-end if a noreturn function
has a non-void return type ? For the above test-case, the function cb() is
marked with the noreturn attribute but has return type int.

Thanks,
Prathamesh

[Bug tree-optimization/93397] [10 Regression] ICE in vect_create_epilog_for_reduction

2020-01-22 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93397

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Seems to ICE at all optimization levels, because slp_node is NULL in the
following hunk in vect_create_epilog_for_reduction, since we're calling it
from the loop vectorizer:

  /* In SLP reduction chain we reduce vector results into one vector if
     necessary, hence we set here REDUC_GROUP_SIZE to 1.  SCALAR_DEST is the
     LHS of the last stmt in the reduction chain, since we are looking for
     the loop exit phi node.  */
  if (REDUC_GROUP_FIRST_ELEMENT (stmt_info))
    {
      stmt_vec_info t = SLP_TREE_SCALAR_STMTS (slp_node)[group_size - 1];

Gating the condition on slp_node seems to work, but not sure if that's the
right fix ?

Thanks,
Prathamesh

[Bug ipa/88788] [9 Regression] Infinite loop in malloc_candidate_p_1 since r264838

2019-01-10 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88788

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Sorry for the breakage, I will take a look.

Regards,
Prathamesh

[Bug ipa/88788] [9 Regression] Infinite loop in malloc_candidate_p_1 since r264838

2019-01-10 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88788

--- Comment #5 from prathamesh3492 at gcc dot gnu.org ---
(In reply to Martin Liška from comment #4)
> Created attachment 45403 [details]
> reduced test-case

Thanks!

[Bug ipa/88788] [9 Regression] Infinite loop in malloc_candidate_p_1 since r264838

2019-01-11 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88788

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |prathamesh3492 at gcc dot gnu.org

--- Comment #7 from prathamesh3492 at gcc dot gnu.org ---
Created attachment 45412
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45412&action=edit
Untested fix

Hi,
The issue seems to be that malloc_candidate_p_1 is called recursively with
duplicate arguments. For example, with the above test-case, it shows the
following trace:

https://pastebin.com/tF5Qg06X

We can see it is calling malloc_candidate_p_1 with resultobj_164=PHI<...>
thrice because resultobj_164 appears 3 times as a phi-arg in:

resultobj_165 = PHI <_12(12), resultobj_164(13), resultobj_164(14),
resultobj_164(15)>

I think it's more of a compile time hog rather than infinite recursion
happening. To avoid that, I simply skipped walking over duplicate args in the
phi in the attached patch:


+bool skip_dup_arg = false;
+for (unsigned j = i; j > 0; j--)
+  if (operand_equal_p (gimple_phi_arg_def (phi, j - 1), arg, 0))
+    {
+      skip_dup_arg = true;
+      break;
+    }
+if (skip_dup_arg)
+  continue;
+

which appears to compile both the tests again.

I assume a phi stmt usually won't have more than 4 or 5 args, so the loop
shouldn't be too slow in practice ? I will be grateful for any other
suggestions. For the larger test-case, compilation takes 164.08 seconds of
wall time.


Thanks,
Prathamesh

[Bug ipa/88788] [9 Regression] Infinite loop in malloc_candidate_p_1 since r264838

2019-01-11 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88788

--- Comment #10 from prathamesh3492 at gcc dot gnu.org ---
Oops, I didn't realize there could be a loop within a phi (the phi result
being an arg too). I will try to come up with a better approach for handling
nested PHIs. In the meantime, for stage 4, should I revert the hunk that adds
the recursive call ?

Thanks,
Prathamesh

[Bug ipa/88788] [9 Regression] Infinite loop in malloc_candidate_p_1 since r264838

2019-01-14 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88788

--- Comment #12 from prathamesh3492 at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #11)
> Look e.g. at -O2:
> void bar (int);
> 
> void
> foo (int x)
> {
>   int i = 0;
>   if (x == 8)
>     {
>       x = 16;
>       goto lab;
>     }
>   for (; i < 100; i++)
>     {
> lab:
>       bar (x);
>     }
> }
> 
> but pretty much any time you have a loop where some var doesn't really
> change, but there is some other edge to the loop header with a different
> value for that var.

Ah indeed.
Thanks for the explanation!

[Bug ipa/88788] [9 Regression] Infinite loop in malloc_candidate_p_1 since r264838

2019-01-14 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88788

--- Comment #14 from prathamesh3492 at gcc dot gnu.org ---
Created attachment 45425
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45425&action=edit
Patch

Hi,
In the attached patch, I cache results of malloc_candidate_p_1 and avoid
traversing "back edges".
Does it look OK ?

One issue was with creation of hash_table:
hash_table *mc_cache = new hash_table (100);

Using num_ssa_names instead of 100 resulted in allocation failure (and ICE)
for spinning-smaller.ii.
Is using a smaller number like 100 OK correctness wise ?

I think Richard's patch in comment 13 is a better approach, since returning
false should indeed propagate quickly. Testing that patch.

Thanks,
Prathamesh

[Bug ipa/88788] [9 Regression] Infinite loop in malloc_candidate_p_1 since r264838

2019-01-15 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88788

--- Comment #16 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Tue Jan 15 09:37:22 2019
New Revision: 267933

URL: https://gcc.gnu.org/viewcvs?rev=267933&root=gcc&view=rev
Log:
2019-01-15  Richard Biener  
Prathamesh Kulkarni  

PR ipa/88788
* ipa-pure-const.c (malloc_candidate_p_1): Add parameter visited and
return true if SSA_NAME is already marked in visited bitmap.
(malloc_candidate_p): Pass visited to malloc_candidate_p_1.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-pure-const.c
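
To make the "visited" idea concrete, here is a toy sketch of a recursive walk that treats an already-visited node as acceptable, so duplicate phi args and phi cycles are each walked once. All names (struct node, candidate_p_1, MAX_NODES) are hypothetical stand-ins; the real code works on SSA names and a bitmap in ipa-pure-const.c.

```c
#include <assert.h>
#include <stdbool.h>

enum kind { K_MALLOC, K_OTHER, K_PHI };

/* Hypothetical stand-in for an SSA definition.  A PHI has NARGS operands,
   given as indices into the node array.  */
struct node
{
  enum kind kind;
  int nargs;
  int args[4];
};

#define MAX_NODES 32

/* Return true if node N can only come from malloc.  A node that is already
   marked in VISITED is accepted: if it were bad, the walk that marked it
   returns false and that result propagates to the caller anyway.  */
static bool
candidate_p_1 (const struct node *nodes, int n, bool *visited)
{
  assert (n >= 0 && n < MAX_NODES);
  if (visited[n])
    return true;
  visited[n] = true;

  switch (nodes[n].kind)
    {
    case K_MALLOC:
      return true;
    case K_PHI:
      for (int i = 0; i < nodes[n].nargs; i++)
	if (!candidate_p_1 (nodes, nodes[n].args[i], visited))
	  return false;
      return true;
    default:
      return false;
    }
}

bool
candidate_p (const struct node *nodes, int n)
{
  bool visited[MAX_NODES] = { false };
  return candidate_p_1 (nodes, n, visited);
}
```

With a phi such as PHI <m, p, p, p> where p is the phi itself, the walk terminates after visiting each node once.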

[Bug ipa/85734] --suggest-attribute=malloc misdiagnoses static functions

2018-05-11 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85734

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
I will take a look.

Regards,
Prathamesh

[Bug ipa/85734] --suggest-attribute=malloc misdiagnoses static functions

2018-05-14 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85734

--- Comment #4 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Tue May 15 04:44:33 2018
New Revision: 260249

URL: https://gcc.gnu.org/viewcvs?rev=260249&root=gcc&view=rev
Log:
2018-05-15  Prathamesh Kulkarni  

PR ipa/85734
* ipa-pure-const.c (warn_function_malloc): Pass value of known_finite
param as true in call to suggest_attribute.

testsuite/
* gcc.dg/ipa/pr85734.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/ipa/pr85734.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-pure-const.c
trunk/gcc/testsuite/ChangeLog

[Bug ipa/85734] --suggest-attribute=malloc misdiagnoses static functions

2018-05-14 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85734

--- Comment #5 from prathamesh3492 at gcc dot gnu.org ---
Fixed on trunk.

[Bug c/85562] -Wsuggest-attribute=malloc misleads about "returning normally"

2018-05-14 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85562

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
Fix for PR85734 also fixes this bug.

[Bug ipa/85787] New: malloc_candidate_p fails to detect malloc attribute on nested phis

2018-05-14 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85787

Bug ID: 85787
   Summary: malloc_candidate_p fails to detect malloc attribute on
nested phis
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

For the following test-case, g should have been detected as a malloc-like
function by malloc_candidate_p().

void *g (int cond1, int cond2, int cond3)
{
  void *ret;
  void *a;
  void *b;

  if (cond1)
    a = __builtin_malloc (10);
  else
    a = __builtin_malloc (20);

  if (cond2)
    b = __builtin_malloc (30);
  else
    b = __builtin_malloc (40);

  if (cond3)
    ret = a;
  else
    ret = b;

  return ret;
}

[Bug ipa/85787] malloc_candidate_p fails to detect malloc attribute on nested phis

2018-05-14 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85787

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Working on a patch.

[Bug tree-optimization/83648] missing -Wsuggest-attribute=malloc on a trivial malloc-like function

2018-05-14 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83648

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Tue May 15 06:07:48 2018
New Revision: 260250

URL: https://gcc.gnu.org/viewcvs?rev=260250&root=gcc&view=rev
Log:
2018-05-15  Prathamesh Kulkarni  

PR tree-optimization/83648
* ipa-pure-const.c (malloc_candidate_p): Allow function with NULL
return value as malloc candidate.

testsuite/
* gcc.dg/tree-ssa/pr83648.c: New test.
* gcc.dg/tree-ssa/pr83648-2.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr83648-2.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr83648.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-pure-const.c
trunk/gcc/testsuite/ChangeLog

[Bug ipa/85817] [9 Regression] ICE in expand_call at gcc/calls.c:4291

2018-05-17 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85817

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Created attachment 44142
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44142&action=edit
Untested fix

Oops, sorry about that.
I put the condition
if (integer_zerop (retval))
  continue;
at the wrong place -;)

Can also be reproduced with:
_Bool f()
{
  return 0;
}
In that case the pure-const dump shows the function marked as malloc -:(
Could you check if the attached patch fixes the GIMP issue ?

Thanks,
Prathamesh

[Bug tree-optimization/85820] [9 Regression] internal compiler error: Segmentation fault

2018-05-17 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85820

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
This is most likely a dup of PR85817. Could you check if the fix in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85817#c1 works ?

Thanks,
Prathamesh

[Bug middle-end/85817] [9 Regression] ICE in expand_call at gcc/calls.c:4291

2018-05-18 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85817

--- Comment #4 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Fri May 18 12:31:04 2018
New Revision: 260358

URL: https://gcc.gnu.org/viewcvs?rev=260358&root=gcc&view=rev
Log:
2018-05-18  Prathamesh Kulkarni  

PR middle-end/85817
* ipa-pure-const.c (malloc_candidate_p): Remove the check integer_zerop
for retval and return false if all args to phi are zero.

testsuite/
* gcc.dg/tree-ssa/pr83648.c: Change scan-tree-dump to
scan-tree-dump-not for h.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-pure-const.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr83648.c

[Bug middle-end/86332] New: Incorrect warning with Wstringop-overflow

2018-06-27 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86332

Bug ID: 86332
   Summary: Incorrect warning with Wstringop-overflow
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
For the following test-case:

void foo(void)
{
  void escape(unsigned char *);

  unsigned char tmp[12];
  unsigned char *p = tmp + 7;
  __builtin_memset (p, 0, 6);
  escape (p);
}

I get warning:
test.c: In function ‘foo’:
test.c:7:3: warning: ‘__builtin_memset’ writing 6 bytes into a region of size 5
overflows the destination [-Wstringop-overflow=]
   __builtin_memset (p, 0, 6);
   ^~

Seems like an "off by one" mistake. Doesn't warn if size of tmp is increased by
1.

[Bug middle-end/86332] Incorrect warning with Wstringop-overflow

2018-06-27 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86332

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Created attachment 44325
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44325&action=edit
Untested fix

[Bug middle-end/86332] Incorrect warning with Wstringop-overflow

2018-06-27 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86332

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Oops, looks like I messed up reducing the test-case from the original program
which triggered this bug -;(
Sorry for the noise.

[Bug middle-end/86332] Incorrect warning with Wstringop-overflow

2018-06-27 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86332

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
Invalid.
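
The arithmetic for the reduced test-case can be checked directly: tmp has 12 bytes and p points 7 bytes into it, so 5 bytes remain and a 6-byte memset overruns by one. With tmp one byte larger, 6 bytes remain, which matches the warning going away. A small sketch with a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes available from OFFSET to the end of an object of SIZE bytes.  */
size_t
bytes_remaining (size_t size, size_t offset)
{
  assert (offset <= size);
  return size - offset;
}
```

So for the test-case as posted the diagnostic is accurate rather than off by one.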

[Bug middle-end/91166] [SVE] Unfolded ZIPs of constants

2019-07-24 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91166

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Wed Jul 24 07:20:24 2019
New Revision: 273758

URL: https://gcc.gnu.org/viewcvs?rev=273758&root=gcc&view=rev
Log:
2019-07-24  Prathamesh Kulkarni  

PR middle-end/91166
* match.pd (vec_perm_expr(v, v, mask) -> v): New pattern.
(define_predicates): Add entry for uniform_vector_p.
(vec_same_elem_p): New match pattern.

testsuite/
* gcc.target/aarch64/sve/pr91166.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/match.pd
trunk/gcc/testsuite/ChangeLog
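
A scalar model of why vec_perm_expr (v, v, mask) can fold to v when every element of v is the same: whatever the mask selects from the concatenation of v with itself, it selects that one value. The helpers below are hypothetical and use plain arrays of 4 ints; they assume the usual two-input permute convention that selector values are taken modulo twice the element count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NELTS 4

/* Model of a two-input permute: element I of OUT is element MASK[I]
   (modulo 2 * NELTS) of the concatenation of V0 and V1.  */
void
perm2 (const int *v0, const int *v1, const unsigned *mask, int *out)
{
  assert (v0 && v1 && mask && out);
  for (size_t i = 0; i < NELTS; i++)
    {
      unsigned sel = mask[i] % (2 * NELTS);
      out[i] = sel < NELTS ? v0[sel] : v1[sel - NELTS];
    }
}

/* Return true if all NELTS elements of V are equal.  */
bool
uniform_p (const int *v)
{
  for (size_t i = 1; i < NELTS; i++)
    if (v[i] != v[0])
      return false;
  return true;
}
```

For a uniform v, perm2 (v, v, mask, out) leaves out equal to v for any mask, which is the property the new pattern relies on.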

[Bug target/91452] New: tls_preserve_1.c fails with -O3 -fpic -march=armv8.2-a+sve

2019-08-14 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91452

Bug ID: 91452
   Summary: tls_preserve_1.c fails with -O3 -fpic
-march=armv8.2-a+sve
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
It seems tls_preserve_1.c is failing with -O3 -fpic -march=armv8.2-a+sve
because it generates:
stp q0, q1, [sp, 16]
str q2, [sp, 48]

which doesn't happen with -march=armv8.2-a.

Thanks,
Prathamesh

[Bug target/90724] ICE with __sync_bool_compare_and_swap with -march=armv8.2-a+sve

2019-08-21 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90724

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Wed Aug 21 18:34:43 2019
New Revision: 274805

URL: https://gcc.gnu.org/viewcvs?rev=274805&root=gcc&view=rev
Log:
2019-08-21  Prathamesh Kulkarni  

PR target/90724
* config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): Force y
in reg if it fails aarch64_plus_operand predicate.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64.c

[Bug target/88839] [SVE] Poor implementation of blend-like permutes

2019-08-21 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88839

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Wed Aug 21 20:41:41 2019
New Revision: 274810

URL: https://gcc.gnu.org/viewcvs?rev=274810&root=gcc&view=rev
Log:
2019-08-22  Prathamesh Kulkarni  
Richard Sandiford  

PR target/88839
* config/aarch64/aarch64.c (aarch64_evpc_sel): New function.
(aarch64_expand_vec_perm_const_1): Call aarch64_evpc_sel.

testsuite/
* gcc.target/aarch64/sve/sel_1.c: New test.
* gcc.target/aarch64/sve/sel_2.c: Likewise.
* gcc.target/aarch64/sve/sel_3.c: Likewise.
* gcc.target/aarch64/sve/sel_4.c: Likewise.
* gcc.target/aarch64/sve/sel_5.c: Likewise.
* gcc.target/aarch64/sve/sel_6.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/sve/sel_1.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/sel_2.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/sel_3.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/sel_4.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/sel_5.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/sel_6.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/testsuite/ChangeLog

[Bug target/90724] ICE with __sync_bool_compare_and_swap with -march=armv8.2-a+sve

2019-08-22 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90724

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
(In reply to Eric Gallager from comment #2)
> (In reply to prathamesh3492 from comment #1)
> > Author: prathamesh3492
> > Date: Wed Aug 21 18:34:43 2019
> > New Revision: 274805
> > 
> > URL: https://gcc.gnu.org/viewcvs?rev=274805&root=gcc&view=rev
> > Log:
> > 2019-08-21  Prathamesh Kulkarni  
> > 
> > PR target/90724
> > * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): Force y
> > in reg if it fails aarch64_plus_operand predicate.
> > 
> > Modified:
> > trunk/gcc/ChangeLog
> > trunk/gcc/config/aarch64/aarch64.c
> 
> Did this fix it?

On trunk, yes. Needs to be backported to gcc-9-branch.

Thanks,
Prathamesh

[Bug libfortran/91593] New: Implicit enum conversions in libgfortran/io/transfer.c

2019-08-28 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91593

Bug ID: 91593
   Summary: Implicit enum conversions in libgfortran/io/transfer.c
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libfortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
I added a patch for Wenum-conversion (PR78736), that exposes some implicit enum
conversions in libgfortran/io/transfer.c:

./../../gcc/libgfortran/io/transfer.c: In function ‘current_mode’:
../../../gcc/libgfortran/io/transfer.c:206:5: warning: implicit conversion from
‘enum <anonymous>’ to ‘file_mode’ {aka ‘enum <anonymous>’} [-Wenum-conversion]
  206 |   m = FORM_UNSPECIFIED;
  | ^
../../../gcc/libgfortran/io/transfer.c: In function
‘formatted_transfer_scalar_read’:
../../../gcc/libgfortran/io/transfer.c:1730:25: warning: implicit conversion
from ‘enum <anonymous>’ to ‘unit_sign’ {aka ‘enum <anonymous>’}
[-Wenum-conversion]
 1730 |dtp->u.p.sign_status = SIGN_S;
  | ^
../../../gcc/libgfortran/io/transfer.c:1735:25: warning: implicit conversion
from ‘enum <anonymous>’ to ‘unit_sign’ {aka ‘enum <anonymous>’}
[-Wenum-conversion]
 1735 |dtp->u.p.sign_status = SIGN_SS;
  | ^
../../../gcc/libgfortran/io/transfer.c:1740:25: warning: implicit conversion
from ‘enum <anonymous>’ to ‘unit_sign’ {aka ‘enum <anonymous>’}
[-Wenum-conversion]
 1740 |dtp->u.p.sign_status = SIGN_SP;
  | ^
./../../gcc/libgfortran/io/transfer.c: In function
‘formatted_transfer_scalar_write’:
../../../gcc/libgfortran/io/transfer.c:2189:25: warning: implicit conversion
from ‘enum <anonymous>’ to ‘unit_sign’ {aka ‘enum <anonymous>’}
[-Wenum-conversion]
 2189 |dtp->u.p.sign_status = SIGN_S;
  | ^
./../../gcc/libgfortran/io/transfer.c:2194:25: warning: implicit conversion
from ‘enum <anonymous>’ to ‘unit_sign’ {aka ‘enum <anonymous>’}
[-Wenum-conversion]
 2194 |dtp->u.p.sign_status = SIGN_SS;
  | ^
../../../gcc/libgfortran/io/transfer.c:2199:25: warning: implicit conversion
from ‘enum <anonymous>’ to ‘unit_sign’ {aka ‘enum <anonymous>’}
[-Wenum-conversion]
2199 |dtp->u.p.sign_status = SIGN_SP;
 | ^

AFAIU, the warnings are correct in this case: the enums are distinct types, so
each assignment implicitly converts from one enum type to another. Is that right?

Thanks,
Prathamesh
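
The kind of assignment being warned about can be reduced to a few lines of C. This is only a sketch: form_kind, mode_kind and their enumerators are hypothetical stand-ins for two unrelated anonymous enums and are not the libgfortran definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for two unrelated anonymous enums.  */
typedef enum { FORM_A, FORM_B, FORM_NONE } form_kind;
typedef enum { MODE_X, MODE_Y, MODE_NONE } mode_kind;

mode_kind
pick_mode_implicit (void)
{
  mode_kind m;
  /* The two enumerators happen to share a value, which is why code like
     this appears to work.  */
  assert ((int) FORM_NONE == (int) MODE_NONE);
  /* FORM_NONE belongs to form_kind, so this assignment converts between
     enum types; -Wenum-conversion is expected to warn here.  */
  m = FORM_NONE;
  return m;
}

mode_kind
pick_mode_explicit (void)
{
  /* Enumerator of the destination type: no conversion takes place.  */
  return MODE_NONE;
}
```

Both functions return the same value, so the warning points at a type mix-up rather than at a behavioural difference.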

[Bug libfortran/91593] Implicit enum conversions in libgfortran/io/transfer.c

2019-08-28 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91593

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Patch for PR78736 that triggers the warnings:
https://gcc.gnu.org/ml/gcc-patches/2019-08/msg01938.html

Thanks,
Prathamesh

[Bug tree-optimization/83661] sincos does not handle sin(2x)

2019-08-30 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83661

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
Oh, I thought sincos computed the values of sin and cos simultaneously?
If that's not the case, then I wonder how the sincos transform itself is
beneficial?

Thanks,
Prathamesh

[Bug c/78736] enum warnings in GCC (request for -Wenum-conversion to be added)

2019-09-04 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78736

--- Comment #14 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Wed Sep  4 16:25:21 2019
New Revision: 275376

URL: https://gcc.gnu.org/viewcvs?rev=275376&root=gcc&view=rev
Log:
Add warning Wenum-conversion for C and ObjC.

The patch enables warning with Wextra due to PR91593 and warnings with
allmodconfig kernel build. Once these issues are resolved, we could
consider promoting it to Wall.

2019-09-04  Prathamesh Kulkarni  

PR c/78736
* doc/invoke.texi: Document -Wenum-conversion.

c-family
* c.opt (Wenum-conversion): New option.

c/
* c-typeck.c (convert_for_assignment): Handle Wenum-conversion.

testsuite/
* gcc.dg/Wenum-conversion.c: New test-case.

Added:
trunk/gcc/testsuite/gcc.dg/Wenum-conversion.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c.opt
trunk/gcc/c/ChangeLog
trunk/gcc/c/c-typeck.c
trunk/gcc/doc/invoke.texi
trunk/gcc/testsuite/ChangeLog

[Bug target/91982] gcc.target/aarch64/sve/clastb_*.c tests failing with segfault

2019-10-03 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91982

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
Probably started with r276299 ?

We segfault in vect_transform_stmt in call to dominated_by_p:

  if (!slp_node && STMT_VINFO_REDUC_DEF (orig_stmt_info)
      && STMT_VINFO_REDUC_TYPE (orig_stmt_info) != FOLD_LEFT_REDUCTION
      && is_a <gphi *> (STMT_VINFO_REDUC_DEF (orig_stmt_info)->stmt))
    {
      gphi *phi = as_a <gphi *> (STMT_VINFO_REDUC_DEF (orig_stmt_info)->stmt);
      if (dominated_by_p (CDI_DOMINATORS,
                          gimple_bb (orig_stmt_info->stmt), gimple_bb (phi)))
        {

because gimple_bb (orig_stmt_info->stmt) is NULL.

Thanks,
Prathamesh

[Bug tree-optimization/91532] [SVE] Redundant predicated store in gcc.target/aarch64/fmla_2.c

2019-10-07 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91532

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Mon Oct  7 23:44:49 2019
New Revision: 276681

URL: https://gcc.gnu.org/viewcvs?rev=276681&root=gcc&view=rev
Log:
2019-10-07  Prathamesh Kulkarni  
Richard Biener  

PR tree-optimization/91532
* tree-if-conv.c: Include tree-ssa-dse.h.
(ifcvt_local_dce): Change param from bb to loop,
and call dse_classify_store.
(tree_if_conversion): Pass loop instead of loop->header as arg
to ifcvt_local_dce.
* tree-ssa-dse.c: Include tree-ssa-dse.h.
(delete_dead_or_redundant_assignment): Remove static qualifier from
declaration, and add prototype in tree-ssa-dse.h.
(dse_store_status): Move to tree-ssa-dse.h.
(dse_classify_store): Remove static qualifier and add new tree param
stop_at_vuse, and add prototype in tree-ssa-dse.h.
* tree-ssa-dse.h: New header.

Added:
trunk/gcc/tree-ssa-dse.h
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-if-conv.c
trunk/gcc/tree-ssa-dse.c

[Bug tree-optimization/92033] New: ICE during dom with -march=armv8.2-a+sve

2019-10-08 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92033

Bug ID: 92033
   Summary: ICE during dom with -march=armv8.2-a+sve
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
With PR89007 test-case:

#define N 1024
unsigned char dst[N];
unsigned char in1[N];
unsigned char in2[N];

void
foo ()
{
  for( int x = 0; x < N; x++ )
    dst[x] = (in1[x] + in2[x] + 1) >> 1;
}

Compiling with -O3 -march=armv8.2-a+sve results in the following ICE:

pr89007.c: In function ‘foo’:
pr89007.c:7:1: internal compiler error: tree check: expected integer_cst, have
poly_int_cst in to_wide, at tree.h:5795
7 | foo ()
  | ^~~
0x722ffd tree_check_failed(tree_node const*, char const*, int, char const*,
...)
../../gcc/gcc/tree.c:9924
0x721307 tree_check(tree_node const*, char const*, int, char const*, tree_code)
../../gcc/gcc/tree.h:3523
0x721307 wi::to_wide(tree_node const*)
../../gcc/gcc/tree.h:5795
0x721307 value_range_base::lower_bound(unsigned int) const
../../gcc/gcc/tree-vrp.c:6136
0x104e03f value_range_base::lower_bound(unsigned int) const
../../gcc/gcc/tree-vrp.c:6123
0x1604c36 range_operator::fold_range(tree_node*, value_range_base const&,
value_range_base const&) const
../../gcc/gcc/range-op.cc:156
0x104fb21 range_fold_binary_expr(value_range_base*, tree_code, tree_node*,
value_range_base const*, value_range_base const*)
../../gcc/gcc/tree-vrp.c:1915
0x10cad37 vr_values::extract_range_from_binary_expr(value_range*, tree_code,
tree_node*, tree_node*, tree_node*)
../../gcc/gcc/vr-values.c:808
0x10cd7a8 vr_values::extract_range_from_assignment(value_range*, gassign*)
../../gcc/gcc/vr-values.c:1466
0x1538ef5 evrp_range_analyzer::record_ranges_from_stmt(gimple*, bool)
../../gcc/gcc/gimple-ssa-evrp-analyze.c:307
0xec9e9b dom_opt_dom_walker::before_dom_children(basic_block_def*)
../../gcc/gcc/tree-ssa-dom.c:1503
0x150fae7 dom_walker::walk(basic_block_def*)
../../gcc/gcc/domwalk.c:309
0xec7c94 execute
../../gcc/gcc/tree-ssa-dom.c:724

Possibly caused by r276504.

Thanks,
Prathamesh

[Bug tree-optimization/92033] ICE during dom with -march=armv8.2-a+sve

2019-10-08 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92033

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
This seems to happen for pretty much any arithmetic operation inside a loop
with SVE, for instance in these cases:

for (int i = 0; i < N; i++)
  dst[i] = ~in1[i];

for (int i = 0; i < N; i++)
  dst[i] = in1[i] + in2[i];

The following workaround "fixes" the issue by punting on POLY_INT_CST in
range_operator::fold_range, but I'm not sure that's the correct approach.

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index fc31485384b..93eb59436dc 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -148,6 +148,13 @@ range_operator::fold_range (tree type,
   if (empty_range_check (r, lh, rh))
 return r;

+  if (POLY_INT_CST_P (lh.min ()) || POLY_INT_CST_P (lh.max ())
+  || POLY_INT_CST_P (rh.min ()) || POLY_INT_CST_P (rh.max ()))
+{
+  r.set_varying (lh.type ());
+  return r;
+}
+
   for (unsigned x = 0; x < lh.num_pairs (); ++x)
 for (unsigned y = 0; y < rh.num_pairs (); ++y)
   {

Thanks,
Prathamesh

[Bug tree-optimization/92085] [10 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in useless_type_conversion_p, at gimple-expr.c:86

2019-10-14 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92085

--- Comment #4 from prathamesh3492 at gcc dot gnu.org ---
Patch posted upstream: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01031.html

Thanks,
Prathamesh

[Bug tree-optimization/92085] [10 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in useless_type_conversion_p, at gimple-expr.c:86

2019-10-15 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92085

--- Comment #5 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Tue Oct 15 07:19:41 2019
New Revision: 276984

URL: https://gcc.gnu.org/viewcvs?rev=276984&root=gcc&view=rev
Log:
2019-10-15  Prathamesh Kulkarni  

PR tree-optimization/92085
* tree-if-conv.c (ifcvt_local_dce): Call gsi_next in else clause,
instead of calling it unconditionally after
delete_dead_or_redundant_assignment and fix indentation.

testsuite/
* gcc.dg/tree-ssa/pr92085-1.c: New test.
* gcc.dg/tree-ssa/pr92085-2.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr92085-1.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr92085-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-if-conv.c
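
The log entry above describes a common iteration pitfall: if removing an element already leaves the iterator on the following element, advancing again skips that element. A minimal C sketch of the corrected shape is given below, using a singly linked list instead of GCC's statement iterators. The struct and function names are hypothetical, and the sketch assumes, as the log entry implies, that deletion itself moves the iterator forward.

```c
#include <assert.h>
#include <stddef.h>

struct node
{
  int value;
  struct node *next;
};

/* Unlink every node with a negative value from the list at *HEAD and
   return how many nodes were unlinked.  */
size_t
remove_negative (struct node **head)
{
  assert (head != NULL);
  size_t removed = 0;
  struct node **it = head;
  while (*it != NULL)
    {
      if ((*it)->value < 0)
        {
          /* Unlinking already makes IT refer to the following node.  */
          *it = (*it)->next;
          removed++;
        }
      else
        /* Advance only when nothing was removed.  */
        it = &(*it)->next;
    }
  return removed;
}
```

Advancing unconditionally after the unlink would skip the second of two adjacent negative nodes, which is the analogue of the bug the log entry describes.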

[Bug target/90723] pr88598-2.c segfaults with -msve-vector-bits=256

2019-10-15 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90723

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
(In reply to Eric Gallager from comment #2)
> (In reply to prathamesh3492 from comment #1)
> > Author: prathamesh3492
> > Date: Sat Jul 13 08:28:33 2019
> > New Revision: 273466
> > 
> > URL: https://gcc.gnu.org/viewcvs?rev=273466&root=gcc&view=rev
> > Log:
> > 2019-07-15  Prathamesh Kulkarni  
> > 
> > PR target/90723
> > * recog.h (temporary_volatile_ok): New class.
> > * config/aarch64/aarch64.c (aarch64_emit_sve_pred_move): Set
> > volatile_ok temporarily to true using temporary_volatile_ok.
> > * expr.c (emit_block_move_via_cpymem): Likewise.
> > * optabs.c (maybe_legitimize_operand): Likewise.
> > 
> > Modified:
> > trunk/gcc/ChangeLog
> > trunk/gcc/config/aarch64/aarch64.c
> > trunk/gcc/expr.c
> > trunk/gcc/optabs.c
> > trunk/gcc/recog.h
> 
> Did this fix it?
Yes.

Thanks,
Prathamesh

[Bug tree-optimization/83501] [8 Regression] strlen(a) not folded after strcpy(a, "...")

2017-12-19 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83501

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Created attachment 42927
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42927&action=edit
Untested fix

[Bug ipa/83506] [8 Regression] ICE: Segmentation fault in force_nonfallthru_and_redirect

2017-12-20 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83506

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Sorry for the breakage, I will take a look.

Regards,
Prathamesh

[Bug ipa/83506] [8 Regression] ICE: Segmentation fault in force_nonfallthru_and_redirect

2017-12-20 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83506

--- Comment #5 from prathamesh3492 at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #4)
> Though, I guess the real bug is that ipa_free_fn_summary (); is no longer
> called for -fno-ipa-pure-const.  While the ipa_inline pass had unconditional
> gate and so it was freed always if !flag_wpa, ipa-pure-const has a
> non-trivial gate and thus it frees only sometimes.  Calling
> ipa_free_fn_summary () in ipa-inline.c if if (!flag_wpa &&
> !flag_ipa_pure_const && !in_lto_p) is not nice, as it duplicates the
> ipa-pure-const.c gate.  So, we can do something like:
> --- gcc/ipa.c.jj  2017-09-01 09:26:37.0 +0200
> +++ gcc/ipa.c 2017-12-20 11:22:57.915226765 +0100
> @@ -1270,6 +1270,11 @@ ipa_single_use (void)
>varpool_node *var;
>hash_map<varpool_node *, cgraph_node *> single_user_map;
>  
> +  /* In WPA we use inline summaries for partitioning process.  Otherwise,
> + free it if earlier IPA passes have not done so yet.  */
> +  if (!flag_wpa)
> +ipa_free_fn_summary ();
> +
>FOR_EACH_DEFINED_VARIABLE (var)
>  if (!var->all_refs_explicit_p ())
>var->aux = BOTTOM;
> But I think I have a cleaner patch than that.

Hi Jakub,
Thanks for the fix! In r254140, I removed the call to ipa_free_fn_summary(),
gated on !flag_wpa, from the inline pass because ipa-pure-const needed the
summaries for propagating the malloc attribute, which unfortunately caused the
above bug.

Regards,
Prathamesh

[Bug tree-optimization/83648] missing -Wsuggest-attribute=malloc on a trivial malloc-like function

2018-01-02 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83648

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
ipa-pure-const dump shows:
MALLOC LATTICE after propagation:
__builtin_malloc: malloc
g: malloc_bottom
f: malloc

So it's not able to detect that g could be annotated with malloc.
I will take a look.

Thanks,
Prathamesh
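
For context, a "malloc-like" function in the sense of the bug title is one whose non-null return value is always a fresh allocation. The sketch below shows that shape with the attribute applied by hand. alloc_zeroed is a hypothetical name, and this is not the f/g test-case from the bug, which is not reproduced in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical malloc-like wrapper: every non-null return value is a fresh
   allocation that nothing else points to, which is what the malloc
   attribute asserts.  */
__attribute__ ((malloc)) void *
alloc_zeroed (size_t size)
{
  assert (size > 0);
  void *p = malloc (size);
  if (p != NULL)
    memset (p, 0, size);
  return p;
}
```

-Wsuggest-attribute=malloc is meant to point out functions of this shape that lack the attribute.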

[Bug tree-optimization/83648] missing -Wsuggest-attribute=malloc on a trivial malloc-like function

2018-01-02 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83648

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Created attachment 43004
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43004&action=edit
Untested fix

[Bug tree-optimization/82665] missing value range optimization for memchr

2018-01-02 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82665

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Patch posted upstream -
https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00020.html

[Bug tree-optimization/83661] New: sincos does not handle sin(2x)

2018-01-02 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83661

Bug ID: 83661
   Summary: sincos does not handle sin(2x)
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
For the following test-case:

double f(double x)
{
  return __builtin_sin(2*x) + __builtin_sin(x);
}

The optimized dump with -O2 -funsafe-math-optimizations -ffast-math shows:
;; Function f (f, funcdef_no=0, decl_uid=1952, cgraph_uid=0, symbol_order=0)

  <bb 2> [local count: 1073741825]:
  _1 = __builtin_sin (x_4(D));
  _2 = x_4(D) * 2.0e+0;
  _3 = __builtin_sin (_2);
  _5 = _1 + _3;
  return _5;

Would it be a good idea to enhance the sincos pass to recognize the
identity sin(2*x) = 2*sin(x)*cos(x) and thus eliminate one call
to __builtin_sin?

Writing 2*sin(x)*cos(x) explicitly in the source yields the following optimized
dump:

  <bb 2> [local count: 1073741825]:
  sincostmp_8 = __builtin_cexpi (x_5(D));
  _1 = IMAGPART_EXPR <sincostmp_8>;
  _2 = REALPART_EXPR <sincostmp_8>;
  _3 = _1 * _2;
  _4 = _3 * 2.0e+0;
  _6 = _2 + _4;
  return _6;

I agree in general that adding math identities like sin(x)**2 + cos(x)**2 = 1
isn't a good idea, since users would almost always write the "optimized" version
in practice. For the above case, however, would the transform make sense?

Thanks,
Prathamesh

[Bug tree-optimization/83661] sincos does not handle sin(2x)

2018-01-02 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83661

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |x86_64-unknown-linux-gnu
               Host|                            |x86_64-unknown-linux-gnu
              Build|                            |x86_64-unknown-linux-gnu
           Severity|normal                      |enhancement

[Bug tree-optimization/83501] [8 Regression] strlen(a) not folded after strcpy(a, "...")

2018-01-03 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83501

--- Comment #5 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Wed Jan  3 16:07:32 2018
New Revision: 256180

URL: https://gcc.gnu.org/viewcvs?rev=256180&root=gcc&view=rev
Log:
2018-01-03  Prathamesh Kulkarni  

PR tree-optimization/83501
* tree-ssa-strlen.c (get_string_cst): New.
(handle_char_store): Call get_string_cst.

testsuite/
* gcc.dg/tree-ssa/pr83501.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr83501.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-strlen.c
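
The pattern named in the PR title can be illustrated with a small hypothetical C function. This is not the contents of pr83501.c, which is not reproduced in this thread, and the names buf and copy_and_measure are made up for the sketch.

```c
#include <assert.h>
#include <string.h>

char buf[16];

size_t
copy_and_measure (void)
{
  /* The literal, including its terminator, must fit in the destination.  */
  assert (sizeof buf >= sizeof "hello");
  strcpy (buf, "hello");
  /* The length of the copied literal is known at compile time, so an
     optimizing compiler can replace this call with the constant 5.  */
  return strlen (buf);
}
```

Whether the call is actually folded depends on the compiler version and optimization level; the result is 5 either way.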

[Bug tree-optimization/83501] [8 Regression] strlen(a) not folded after strcpy(a, "...")

2018-01-03 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83501

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from prathamesh3492 at gcc dot gnu.org ---
Fixed.

[Bug tree-optimization/83750] New: CSE erf/erfc pair

2018-01-09 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83750

Bug ID: 83750
   Summary: CSE erf/erfc pair
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

For the following test case:

double f(double x)
{
  double g(double, double);

  double t1 = __builtin_erf (x);
  double t2 = __builtin_erfc (x);
  return g(t1, t2);
}

optimized dump shows:
  <bb 2> [local count: 1073741825]:
  t1_2 = __builtin_erf (x_1(D));
  t2_5 = __builtin_erfc (x_1(D));
  _7 = g (t1_2, t2_5); [tail call]
  return _7;

I was wondering if it'd be a good idea to add a simple dom pass to
tree-ssa-math-opts.c that would eliminate the call to erfc(x) if erf(x) is
present, at least with -funsafe-math-optimizations?

erfc(x) == 1.0 - erf(x)

Thanks,
Prathamesh

[Bug tree-optimization/83751] New: CSE erf/erfc pair

2018-01-09 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83751

Bug ID: 83751
   Summary: CSE erf/erfc pair
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

For the following test case:

double f(double x)
{
  double g(double, double);

  double t1 = __builtin_erf (x);
  double t2 = __builtin_erfc (x);
  return g(t1, t2);
}

optimized dump shows:
  <bb 2> [local count: 1073741825]:
  t1_2 = __builtin_erf (x_1(D));
  t2_5 = __builtin_erfc (x_1(D));
  _7 = g (t1_2, t2_5); [tail call]
  return _7;

I was wondering if it'd be a good idea to add a simple dom pass to
tree-ssa-math-opts.c that would eliminate the call to erfc(x) if erf(x) is
present, at least with -funsafe-math-optimizations?

erfc(x) == 1.0 - erf(x)

Thanks,
Prathamesh

[Bug tree-optimization/83751] CSE erf/erfc pair

2018-01-09 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83751

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Oops, looks like this got posted twice :(
Sorry about that.

[Bug target/83775] New: Segfault in arm_declare_function_name()

2018-01-10 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83775

Bug ID: 83775
   Summary: Segfault in arm_declare_function_name()
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
For the following test-case:

#define STR "1234567"

const char str[] = STR;

char dst[10];

void copy_from_global_str (void)
{
  __builtin_strcpy (dst, str);

  if (__builtin_strlen (dst) != sizeof str - 1)
    __builtin_abort ();
}

With arm-linux-gnueabihf-gcc -O2 I get the following ICE:
strlenopt-39.c: In function 'copy_from_global_str':
strlenopt-39.c:13:1: internal compiler error: Segmentation fault
 }
 ^
0xbc1f1f crash_signal
../../gcc/gcc/toplev.c:325
0xf4a5bb std::char_traits<char>::length(char const*)
/usr/include/c++/6/bits/char_traits.h:267
0xf4a5bb std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::assign(char const*)
/usr/include/c++/6/bits/basic_string.h:1268
0xf4a5bb std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::operator=(char const*)
/usr/include/c++/6/bits/basic_string.h:605
0xf4a5bb arm_declare_function_name(_IO_FILE*, char const*, tree_node*)
../../gcc/gcc/config/arm/arm.c:30958
0xf4ad2d arm_asm_declare_function_name(_IO_FILE*, char const*, tree_node*)
../../gcc/gcc/config/arm/arm.c:19899
0xefd8fc assemble_start_function(tree_node*, char const*)
../../gcc/gcc/varasm.c:1880
0x87929f rest_of_handle_final
../../gcc/gcc/final.c:4549
0x87929f execute
../../gcc/gcc/final.c:4625

This happens because of the following in arm_declare_function_name():
  /* Only update the assembler .arch string if it is distinct from the last
     such string we printed.  */
  std::string arch_to_print = targ_options->x_arm_arch_string;

In this case, targ_options->x_arm_arch_string is NULL, so constructing the
std::string from it crashes.
Does the following (untested) fix look OK?

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 196aa6de1ac..868251a154c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -30954,7 +30954,10 @@ arm_declare_function_name (FILE *stream, const char
*name, tree decl)

   /* Only update the assembler .arch string if it is distinct from the last
  such string we printed.  */
-  std::string arch_to_print = targ_options->x_arm_arch_string;
+  std::string arch_to_print;
+  if (targ_options->x_arm_arch_string)
+arch_to_print = targ_options->x_arm_arch_string;
+
   if (arch_to_print != arm_last_printed_arch_string)
 {
   std::string arch_name


Thanks,
Prathamesh
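
The idea behind the proposed guard can also be shown outside GCC. The following is a C analogue only, not the patch itself: arch_changed and its parameters are hypothetical, and a NULL input is treated as the empty string to mirror a default-constructed std::string.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical C analogue of the guard: report whether ARCH, which may be
   NULL, differs from the string printed last time, which is never NULL.  */
bool
arch_changed (const char *arch, const char *last_printed)
{
  assert (last_printed != NULL);
  /* Start from the empty string and only take ARCH when it is non-null,
     so no string function ever sees a null pointer.  */
  const char *arch_to_print = "";
  if (arch != NULL)
    arch_to_print = arch;
  return strcmp (arch_to_print, last_printed) != 0;
}
```

Without the check, passing the null pointer straight to strcmp would be undefined behaviour, much like passing it to the std::string assignment in the backtrace above.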

[Bug tree-optimization/81703] memcpy folding defeats strlen optimization

2018-01-10 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81703

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Thu Jan 11 04:37:48 2018
New Revision: 256475

URL: https://gcc.gnu.org/viewcvs?rev=256475&root=gcc&view=rev
Log:
2018-01-11  Martin Sebor  
Prathamesh Kulkarni  

PR tree-optimization/83501
PR tree-optimization/81703

* tree-ssa-strlen.c (get_string_cst): Rename...
(get_string_len): ...to this.  Handle global constants.
(handle_char_store): Adjust.

testsuite/
* gcc.dg/strlenopt-39.c: New test-case.
* gcc.dg/pr81703.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.dg/pr81703.c
trunk/gcc/testsuite/gcc.dg/strlenopt-39.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-strlen.c

[Bug tree-optimization/83501] [8 Regression] strlen(a) not folded after strcpy(a, "...")

2018-01-10 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83501

--- Comment #8 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Thu Jan 11 04:37:48 2018
New Revision: 256475

URL: https://gcc.gnu.org/viewcvs?rev=256475&root=gcc&view=rev
Log:
2018-01-11  Martin Sebor  
Prathamesh Kulkarni  

PR tree-optimization/83501
PR tree-optimization/81703

* tree-ssa-strlen.c (get_string_cst): Rename...
(get_string_len): ...to this.  Handle global constants.
(handle_char_store): Adjust.

testsuite/
* gcc.dg/strlenopt-39.c: New test-case.
* gcc.dg/pr81703.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.dg/pr81703.c
trunk/gcc/testsuite/gcc.dg/strlenopt-39.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-strlen.c

[Bug tree-optimization/81703] memcpy folding defeats strlen optimization

2018-01-10 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81703

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
Fixed.

[Bug target/83775] Segfault in arm_declare_function_name()

2018-01-11 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83775

--- Comment #3 from prathamesh3492 at gcc dot gnu.org ---
(In reply to prathamesh3492 from comment #0)
> Hi,
> For the following test-case:
> 
> #define STR "1234567"
> 
> const char str[] = STR;
> 
> char dst[10];
> 
> void copy_from_global_str (void)
> {
>   __builtin_strcpy (dst, str);
> 
>   if (__builtin_strlen (dst) != sizeof str - 1)
> __builtin_abort ();
> }
> 
> With arm-linux-gnueabihf-gcc -O2 I get the following ICE:
Oops, this should be cc1: I invoked cc1 directly rather than the driver.

> strlenopt-39.c: In function 'copy_from_global_str':
> strlenopt-39.c:13:1: internal compiler error: Segmentation fault
>  }
>  ^
> 0xbc1f1f crash_signal
>   ../../gcc/gcc/toplev.c:325
> 0xf4a5bb std::char_traits<char>::length(char const*)
>   /usr/include/c++/6/bits/char_traits.h:267
> 0xf4a5bb std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> >::assign(char const*)
>   /usr/include/c++/6/bits/basic_string.h:1268
> 0xf4a5bb std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> >::operator=(char const*)
>   /usr/include/c++/6/bits/basic_string.h:605
> 0xf4a5bb arm_declare_function_name(_IO_FILE*, char const*, tree_node*)
>   ../../gcc/gcc/config/arm/arm.c:30958
> 0xf4ad2d arm_asm_declare_function_name(_IO_FILE*, char const*, tree_node*)
>   ../../gcc/gcc/config/arm/arm.c:19899
> 0xefd8fc assemble_start_function(tree_node*, char const*)
>   ../../gcc/gcc/varasm.c:1880
> 0x87929f rest_of_handle_final
>   ../../gcc/gcc/final.c:4549
> 0x87929f execute
>   ../../gcc/gcc/final.c:4625
> 
> This happens because of following in arm_declare_function_name():
>   /* Only update the assembler .arch string if it is distinct from the last
>  such string we printed.  */
>   std::string arch_to_print = targ_options->x_arm_arch_string;
> 
> In this case, targ_options->x_arm_arch_string is NULL and hence the above
> error.
> Does the following (untested) fix look OK ?
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 196aa6de1ac..868251a154c 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -30954,7 +30954,10 @@ arm_declare_function_name (FILE *stream, const char
> *name, tree decl)
> 
>/* Only update the assembler .arch string if it is distinct from the last
>   such string we printed.  */
> -  std::string arch_to_print = targ_options->x_arm_arch_string;
> +  std::string arch_to_print;
> +  if (targ_options->x_arm_arch_string)
> +arch_to_print = targ_options->x_arm_arch_string;
> +
>if (arch_to_print != arm_last_printed_arch_string)
>  {
>std::string arch_name
> 
> 
> Thanks,
> Prathamesh

[Bug target/83514] ABRT in arm_declare_function_name passing a null pointer to std::string

2018-01-11 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83514

--- Comment #5 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Thu Jan 11 12:13:42 2018
New Revision: 256529

URL: https://gcc.gnu.org/viewcvs?rev=256529&root=gcc&view=rev
Log:
2018-01-11  Prathamesh Kulkarni  

PR target/83514
* config/arm/arm.c (arm_declare_function_name): Set arch_to_print if
targ_options->x_arm_arch_string is non NULL.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.c

[Bug target/83514] ABRT in arm_declare_function_name passing a null pointer to std::string

2018-01-11 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83514

--- Comment #6 from prathamesh3492 at gcc dot gnu.org ---
Committed patch to conditionally set arch_to_print after Kyrill's approval.

Thanks,
Prathamesh

[Bug tree-optimization/83501] [8 Regression] strlen(a) not folded after strcpy(a, "...")

2018-01-14 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83501

--- Comment #9 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Sun Jan 14 08:58:58 2018
New Revision: 256657

URL: https://gcc.gnu.org/viewcvs?rev=256657&root=gcc&view=rev
Log:
2018-01-14  Prathamesh Kulkarni  

PR tree-optimization/83501
* gcc.dg/strlenopt-39.c: Restrict to i?86 and x86_64-*-* targets.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/strlenopt-39.c

[Bug c/83959] New: Missing buffer overflow warning on printf %s

2018-01-21 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83959

Bug ID: 83959
   Summary: Missing buffer overflow warning on printf %s
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

int main(void)
{
  char a[3] = "xyz";
  __builtin_printf ("%s", a);
  return 0;
}

No warning generated with -Wall -Wextra -Wstringop-overflow=2.
Should -Wstringop-overflow be catching this case ?

I wonder if the compiler should warn (with -Wextra maybe?) for
char a[3] = "xyz";
i.e. when sizeof(array) == strlen(initializer)?

Although the above initializer doesn't cause an overflow by itself, I suppose
almost all string functions expect char arrays to end with '\0' and would end
up reading past the end of the array, thus causing an overflow.

Thanks,
Prathamesh
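
To make the concern concrete, the sketch below checks whether a char array holds a terminating NUL and shows one way to format such an array without reading past its end, by passing its size as the precision of %s. The helper names is_terminated and format_bounded are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if the SIZE bytes at BUF contain a terminating NUL.  */
bool
is_terminated (const char *buf, size_t size)
{
  return memchr (buf, '\0', size) != NULL;
}

/* Format at most SIZE bytes of BUF into OUT.  The precision bounds how
   many bytes %s may read, so BUF does not need to be NUL-terminated.  */
int
format_bounded (char *out, size_t out_size, const char *buf, size_t size)
{
  assert (size <= INT_MAX);
  return snprintf (out, out_size, "%.*s", (int) size, buf);
}
```

With char a[3] = "xyz", is_terminated (a, sizeof a) is false, whereas it is true for a four-element array initialized with the same literal.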

[Bug tree-optimization/89332] New: Missed detection of dead stores to array in a loop

2019-02-13 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89332

Bug ID: 89332
   Summary: Missed detection of dead stores to array in a loop
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
For the following test-case:

#define ARR_MAX 6

__attribute__((const)) int f(int);

int foo()
{
  int arr[ARR_MAX];

  for (int i = 0; i < ARR_MAX; i++)
    arr[i] = f(i);

  return arr[0];
}

With -O3, gcc generates call to f(i) and store to arr[i] on every iteration,
while clang detects the stores to arr are dead (except for arr[0]), removes the
loop and emits a tail-call to f(0).

aarch64-linux-gnu-gcc -O3:
foo:
.LFB0:
.cfi_startproc
stp x29, x30, [sp, -64]!
.cfi_def_cfa_offset 64
.cfi_offset 29, -64
.cfi_offset 30, -56
mov x29, sp
stp x19, x20, [sp, 16]
.cfi_offset 19, -48
.cfi_offset 20, -40
add x20, sp, 40
mov w19, 0
.p2align 3,,7
.L2:
mov w0, w19
bl  f
str w0, [x20], 4
add w19, w19, 1
cmp w19, 6
bne .L2
ldr w0, [sp, 40]
ldp x19, x20, [sp, 16]
ldp x29, x30, [sp], 64
.cfi_restore 30
.cfi_restore 29
.cfi_restore 19
.cfi_restore 20
.cfi_def_cfa_offset 0
ret


clang -O3 --target=aarch64-linux-gnu:
foo:// @foo
// %bb.0:
mov w0, wzr
b   f

It seems clang takes advantage of loop unrolling for the above case,
while gcc doesn't seem to. After increasing ARR_MAX from 6 to 512, clang
generates the same/similar code as gcc.

I doubt though whether such code is written in practice, or can result from
abstraction lowering ? It was just a contrived test-case I made up.
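
The equivalence the optimization relies on can be sketched as follows. The body of f here is a hypothetical stand-in (the test-case only declares it); any function without side effects whose result depends only on its argument would do.

```c
#define ARR_MAX 6

/* Hypothetical stand-in for the f() declared in the test-case: no side
   effects, result depends only on the argument.  */
__attribute__((const)) int f (int i)
{
  return i * 7 + 3;
}

/* The loop from the report: only arr[0] is ever read back.  */
int foo_loop (void)
{
  int arr[ARR_MAX];

  for (int i = 0; i < ARR_MAX; i++)
    arr[i] = f (i);

  return arr[0];
}

/* What remains once the dead stores and the unused calls are removed.  */
int foo_direct (void)
{
  return f (0);
}
```

Because f is const, dropping the calls for i = 1..5 cannot change observable behaviour, so both functions return the same value.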

Thanks,
Prathamesh

[Bug target/88839] [SVE] Poor implementation of blend-like permutes

2019-04-06 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88839

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Fix committed to sve-acle-branch:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=2cd1f397ed5a155e74719977823b28777caa8312


Thanks,
Prathamesh

[Bug target/90644] New: Call to __builtin_memcmp not folded for identical vectors

2019-05-27 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90644

Bug ID: 90644
   Summary: Call to __builtin_memcmp not folded for identical
vectors
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
For the following test-case:

#include <stdint.h>

typedef int32_t vnx4si __attribute__((vector_size (32)));

void foo(int a, int b)
{
  vnx4si v = (vnx4si) { a, b, 1, 2 };
  vnx4si expected = (vnx4si) { a, b, 1, 2 };

  if (__builtin_memcmp (&v, &expected, sizeof (vnx4si)) != 0)
    __builtin_abort ();
}

-O2 -ftree-vectorize -march=armv8.2-a+sve folds call to __builtin_memcmp
correctly, since both vectors are identical.

But with -msve-vector-bits=256, it fails to fold the call to
__builtin_memcmp().

The issue can also be reproduced with AdvSIMD: Fails to fold the call to
__builtin_memcmp with vector_size == 16 but folds with vector_size == 32.
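
A reduced form of the 16-byte variant, as a sketch only: the names v4si and vectors_differ are illustrative and not taken from the report. Since both vectors are built from the same operands and a 16-byte vector of four int32_t has no padding, the comparison is always equal and the function could be folded to return 0.

```c
#include <stdint.h>
#include <string.h>

/* GNU vector extension: four 32-bit lanes, 16 bytes in total.  */
typedef int32_t v4si __attribute__((vector_size (16)));

/* Returns nonzero if the two identically constructed vectors compare
   unequal byte-wise, which can never happen.  */
int vectors_differ (int a, int b)
{
  v4si v = { a, b, 1, 2 };
  v4si expected = { a, b, 1, 2 };

  return memcmp (&v, &expected, sizeof (v4si)) != 0;
}
```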

Thanks,
Prathamesh

[Bug target/88837] [SVE] Poor vector construction code in VL-specific mode

2019-06-03 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88837

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Mon Jun  3 09:35:37 2019
New Revision: 271857

URL: https://gcc.gnu.org/viewcvs?rev=271857&root=gcc&view=rev
Log:
2019-06-03  Prathamesh Kulkarni  

PR target/88837
* vector-builder.h (vector_builder::count_dups): New method.
* config/aarch64/aarch64-protos.h (aarch64_expand_sve_vector_init):
Declare prototype.
* config/aarch64/aarch64-sve.md (aarch64_sve_rev64): Use @.
(vec_init): New pattern.
* config/aarch64/aarch64.c (emit_insr): New function.
(aarch64_sve_expand_vector_init_handle_trailing_constants): Likewise.
(aarch64_sve_expand_vector_init_insert_elems): Likewise.
(aarch64_sve_expand_vector_init_handle_trailing_same_elem): Likewise.
(aarch64_sve_expand_vector_init): Define two overloaded functions.

testsuite/
* gcc.target/aarch64/sve/init_1.c: New test.
* gcc.target/aarch64/sve/init_1_run.c: Likewise.
* gcc.target/aarch64/sve/init_2.c: Likewise.
* gcc.target/aarch64/sve/init_2_run.c: Likewise.
* gcc.target/aarch64/sve/init_3.c: Likewise.
* gcc.target/aarch64/sve/init_3_run.c: Likewise.
* gcc.target/aarch64/sve/init_4.c: Likewise.
* gcc.target/aarch64/sve/init_4_run.c: Likewise.
* gcc.target/aarch64/sve/init_5.c: Likewise.
* gcc.target/aarch64/sve/init_5_run.c: Likewise.
* gcc.target/aarch64/sve/init_6.c: Likewise.
* gcc.target/aarch64/sve/init_6_run.c: Likewise.
* gcc.target/aarch64/sve/init_7.c: Likewise.
* gcc.target/aarch64/sve/init_7_run.c: Likewise.
* gcc.target/aarch64/sve/init_8.c: Likewise.
* gcc.target/aarch64/sve/init_8_run.c: Likewise.
* gcc.target/aarch64/sve/init_9.c: Likewise.
* gcc.target/aarch64/sve/init_9_run.c: Likewise.
* gcc.target/aarch64/sve/init_10.c: Likewise.
* gcc.target/aarch64/sve/init_10_run.c: Likewise.
* gcc.target/aarch64/sve/init_11.c: Likewise.
* gcc.target/aarch64/sve/init_11_run.c: Likewise.
* gcc.target/aarch64/sve/init_12.c: Likewise.
* gcc.target/aarch64/sve/init_12_run.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_1.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_10.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_10_run.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_11.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_11_run.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_12.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_12_run.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_1_run.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_2.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_2_run.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_3.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_3_run.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_4.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_4_run.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_5.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_5_run.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_6.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_6_run.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_7.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_7_run.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_8.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_8_run.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_9.c
trunk/gcc/testsuite/gcc.target/aarch64/sve/init_9_run.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64-protos.h
trunk/gcc/config/aarch64/aarch64-sve.md
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/vector-builder.h

[Bug target/90722] New: ICE with __builtin_convertvector with -msve-vector-bits=256

2019-06-03 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90722

Bug ID: 90722
   Summary: ICE with __builtin_convertvector with
-msve-vector-bits=256
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

The following test-case:

typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
typedef double v4df __attribute__((vector_size (4 * sizeof (double))));

void
f4 (v4df *x, v4si *y)
{
  *y = __builtin_convertvector (*x, v4si);
}

results in ICE with -O2 -march=armv8.2-a+sve -msve-vector-bits=256:

0xcddacd simplify_const_unary_operation(rtx_code, machine_mode, rtx_def*,
machine_mode)
../../gcc/gcc/simplify-rtx.c:1763
0xcd9c2a simplify_unary_operation(rtx_code, machine_mode, rtx_def*,
machine_mode)
../../gcc/gcc/simplify-rtx.c:873
0x13bca5a combine_simplify_rtx
../../gcc/gcc/combine.c:5787
0x13bf1a6 subst
../../gcc/gcc/combine.c:5727
0x13bf2bb subst
../../gcc/gcc/combine.c:5590
0x13bf102 subst
../../gcc/gcc/combine.c:5661
0x13c0568 try_combine
../../gcc/gcc/combine.c:3420
0x13c66c6 combine_instructions
../../gcc/gcc/combine.c:1306
0x13c66c6 rest_of_handle_combine
../../gcc/gcc/combine.c:15068
0x13c66c6 execute
../../gcc/gcc/combine.c:15113

because it hits the following assert in simplify_const_unary_operation:
gcc_assert (known_eq (GET_MODE_NUNITS (mode), n_elts));

GET_MODE_NUNITS (mode) == 8 and n_elts == 4 for the test-case.
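
For reference, the result the test-case expects from __builtin_convertvector can be written as a scalar loop. This is only a sketch of the element-wise semantics; the function name is hypothetical and plain arrays stand in for the vector types.

```c
/* Scalar reference for a 4-element double -> int conversion.  Each
   element is converted as a C cast would convert it, truncating toward
   zero.  */
void convert_v4df_to_v4si_ref (const double x[4], int y[4])
{
  for (int i = 0; i < 4; i++)
    y[i] = (int) x[i];
}
```

The input and the result both have four elements, which matches n_elts == 4 above.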

Thanks,
Prathamesh

[Bug target/90723] New: pr88598-2.c segfaults with -msve-vector-bits=256

2019-06-03 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90723

Bug ID: 90723
   Summary: pr88598-2.c segfaults with -msve-vector-bits=256
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

cc1 segfaults with the following test-case, with -O2 -march=armv8.2-a+sve
-msve-vector-bits=256:

typedef double v4df __attribute__ ((vector_size (32)));

void foo(v4df);

int
main ()
{
  volatile v4df x1;
  x1 = (v4df) { 0, 1, 1, 2 };
  foo (x1);
  return 0;
}

gdb backtrace (clipped to last 14):

Program received signal SIGSEGV, Segmentation fault.
0x00bfdad1 in expand_binop_directly (icode=CODE_FOR_adddi3,
mode=mode@entry=E_DImode, binoptab=binoptab@entry=add_optab, 
op0=op0@entry=0x77a233a8, op1=op1@entry=0x77a2b290,
target=target@entry=0x75095468, unsignedp=1, methods=OPTAB_LIB_WIDEN, 
last=0x75094bc0) at ../../gcc/gcc/optabs.c:1038
1038{
(gdb) bt
#0  0x00bfdad1 in expand_binop_directly (icode=CODE_FOR_adddi3,
mode=mode@entry=E_DImode, binoptab=binoptab@entry=add_optab, 
op0=op0@entry=0x77a233a8, op1=op1@entry=0x77a2b290,
target=target@entry=0x75095468, unsignedp=1, methods=OPTAB_LIB_WIDEN, 
last=0x75094bc0) at ../../gcc/gcc/optabs.c:1038
#1  0x00bfc0dd in expand_binop (mode=E_DImode, binoptab=, op0=0x77a233a8, op1=0x77a2b290, target=0x75095468, 
unsignedp=1, methods=OPTAB_LIB_WIDEN) at ../../gcc/gcc/optabs.c:1209
#2  0x009cc7e4 in force_operand (value=0x77859f90,
target=0x75095468) at ../../gcc/gcc/expr.c:7527
#3  0x009a80a3 in copy_to_mode_reg (mode=E_DImode,
x=x@entry=0x77859f90) at ../../gcc/gcc/explow.c:627
#4  0x00bf2dce in maybe_legitimize_operand_same_code
(icode=icode@entry=CODE_FOR_aarch64_pred_movvnx2df, opno=opno@entry=2,
op=)
at ../../gcc/gcc/optabs.c:7146
#5  0x00bf56ee in maybe_legitimize_operand (op=0x7bfff400, opno=2,
icode=CODE_FOR_aarch64_pred_movvnx2df) at ../../gcc/gcc/optabs.c:7196
#6  maybe_legitimize_operands (icode=CODE_FOR_aarch64_pred_movvnx2df, opno=0,
nops=, ops=0x7bfff3c0) at ../../gcc/gcc/optabs.c:7323
#7  0x00bf5c0a in maybe_gen_insn
(icode=CODE_FOR_aarch64_pred_movvnx2df, nops=,
ops=0x7bfff3c0)
at ../../gcc/gcc/optabs.c:7342
#8  0x00bf8c39 in maybe_expand_insn (ops=ops@entry=0x7bfff3b0,
nops=nops@entry=3, icode=) at ../../gcc/gcc/optabs.c:7416
#9  expand_insn (icode=, nops=nops@entry=3,
ops=ops@entry=0x7bfff3c0) at ../../gcc/gcc/optabs.c:7416
#10 0x010378a4 in aarch64_emit_sve_pred_move (dest=,
pred=, src=) at ./insn-opinit.h:735
#11 0x012cb710 in gen_movvnx2df (operand0=0x75095408,
operand1=0x77859f78) at ../../gcc/gcc/config/aarch64/aarch64-sve.md:77
#12 0x009c7505 in insn_gen_fn::operator() (this=,
a1=0x77859f78, a0=0x75095408) at ../../gcc/gcc/recog.h:301
#13 emit_move_insn_1 (x=0x75095408, y=0x77859f78) at
../../gcc/gcc/expr.c:3701
#14 0x009c7950 in emit_move_insn (x=x@entry=0x75095408,
y=y@entry=0x77859f78) at ../../gcc/gcc/expr.c:3797

Thanks,
Prathamesh

[Bug target/90724] New: ICE with __sync_bool_compare_and_swap with -march=armv8.2-a+sve

2019-06-03 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90724

Bug ID: 90724
   Summary: ICE with __sync_bool_compare_and_swap with
-march=armv8.2-a+sve
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

The following test (pr82096.c) and a few others that contain a call to
__sync_bool_compare_and_swap fail with the following ICE with
-march=armv8.2-a+sve:

static long long AL[24];

int
check_ok (void)
{
  return (__sync_bool_compare_and_swap (AL+1, 0x200000003ll, 0x1234567890ll));
}

pr82096.c: In function 'check_ok':
pr82096.c:11:1: error: unrecognizable insn:
   11 | }
  | ^
(insn 11 10 12 2 (set (reg:CC 66 cc)
(compare:CC (reg:DI 95)
(const_int 8589934595 [0x200000003]))) "pr82096.c":10:11 -1
 (nil))
during RTL pass: vregs
pr82096.c:11:1: internal compiler error: in extract_insn, at recog.c:2310
0x64bb6e _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
../../gcc/gcc/rtl-error.c:108
0x64bb8a _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
../../gcc/gcc/rtl-error.c:116
0x64a58b extract_insn(rtx_insn*)
../../gcc/gcc/recog.c:2310
0xa28a45 instantiate_virtual_regs_in_insn
../../gcc/gcc/function.c:1605
0xa28a45 instantiate_virtual_regs
../../gcc/gcc/function.c:1975
0xa28a45 execute
../../gcc/gcc/function.c:2024

Thanks,
Prathamesh

[Bug target/88833] [SVE] Redundant moves for WHILELO-based loops

2019-07-03 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88833

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Thu Jul  4 06:48:42 2019
New Revision: 273040

URL: https://gcc.gnu.org/viewcvs?rev=273040&root=gcc&view=rev
Log:
2019-07-04  Prathamesh Kulkarni  

PR target/88833
* fwprop.c (reg_single_def_p): New function.
(propagate_rtx_1): Add unconditional else inside RTX_EXTRA case.
(forward_propagate_into): New parameter reg_prop_only
with default value false.
Propagate def's src into loop only if SET_SRC and SET_DEST
of def_set have single definitions.
Likewise if reg_prop_only is set to true.
(fwprop): New param fwprop_addr_p.
Integrate fwprop_addr into fwprop.
(fwprop_addr): Remove.
(pass_rtl_fwprop_addr::execute): Call fwprop with arg set
to true.
(pass_rtl_fwprop::execute): Call fwprop with arg set to false.
* simplify-rtx.c (simplify_subreg): Add case for vector comparison.
* config/i386/sse.md (UNSPEC_BLENDV): Adjust pattern.

testsuite/
* gfortran.dg/pr88833.f90: New test.

Added:
trunk/gcc/testsuite/gfortran.dg/pr88833.f90
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/sse.md
trunk/gcc/fwprop.c
trunk/gcc/simplify-rtx.c
trunk/gcc/testsuite/ChangeLog

[Bug target/90723] pr88598-2.c segfaults with -msve-vector-bits=256

2019-07-13 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90723

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Sat Jul 13 08:28:33 2019
New Revision: 273466

URL: https://gcc.gnu.org/viewcvs?rev=273466&root=gcc&view=rev
Log:
2019-07-15  Prathamesh Kulkarni  

PR target/90723
* recog.h (temporary_volatile_ok): New class.
* config/aarch64/aarch64.c (aarch64_emit_sve_pred_move): Set
volatile_ok temporarily to true using temporary_volatile_ok.
* expr.c (emit_block_move_via_cpymem): Likewise.
* optabs.c (maybe_legitimize_operand): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/expr.c
trunk/gcc/optabs.c
trunk/gcc/recog.h

[Bug tree-optimization/86570] New: Conditional statement doesn't trigger sincos transform

2018-07-18 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86570

Bug ID: 86570
   Summary: Conditional statement doesn't trigger sincos transform
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

For the following test-case:

double f2(double x, double a, double b)
{
  if (a == b)
    return __builtin_sin (a * x) + __builtin_cos (b * x);
  return 0;
}

Optimized dump with -O2 -ffast-math -funsafe-math-optimizations yields:

f2 (double x, double a, double b)
{
  double _1;
  double _2;
  double _3;
  double _4;
  double _5;
  double _9;

  <bb 2> [local count: 1073741825]:
  if (a_6(D) == b_7(D))
    goto <bb 3>; [34.00%]
  else
    goto <bb 4>; [66.00%]

  <bb 3> [local count: 365072220]:
  _1 = a_6(D) * x_8(D);
  _2 = __builtin_sin (_1);
  _3 = b_7(D) * x_8(D);
  _4 = __builtin_cos (_3);
  _9 = _2 + _4;

  <bb 4> [local count: 1073741825]:
  # _5 = PHI <_9(3), 0.0(2)>
  return _5;

}

I assume the sincos transform would have been valid in the above case ?
Similarly missed for the divmod transform.

Thanks,
Prathamesh

[Bug tree-optimization/86570] Conditional statement doesn't trigger sincos transform (with -ffast-math)

2018-07-19 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86570

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
AFAIU, the underlying issue doesn't seem particular to float. For example,
there's a similar missed optimization with divmod transform:

unsigned f(unsigned x, unsigned y, unsigned a, unsigned b)
{
  if (a == b)
    {
      unsigned t1 = (a * x) / y;
      unsigned t2 = (b * x) % y;
      return t1 + t2;
    }
  return 0;
}

With -O2, optimized dump shows:

f (unsigned int x, unsigned int y, unsigned int a, unsigned int b)
{
  unsigned int t2;
  unsigned int t1;
  unsigned int _1;
  unsigned int _2;
  unsigned int _3;
  unsigned int _10;

  <bb 2> [local count: 1073741825]:
  if (a_4(D) == b_5(D))
    goto <bb 3>; [20.97%]
  else
    goto <bb 4>; [79.03%]

  <bb 3> [local count: 225163661]:
  _1 = a_4(D) * x_6(D);
  t1_8 = _1 / y_7(D);
  _2 = b_5(D) * x_6(D);
  t2_9 = _2 % y_7(D);
  _10 = t1_8 + t2_9;

  <bb 4> [local count: 1073741825]:
  # _3 = PHI <_10(3), 0(2)>
  return _3;

}

I assume the divmod transform would be applicable in this case ?
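
A sketch of why the transform should apply: inside the a == b branch, a * x and b * x are the same value, so the quotient and the remainder share one dividend and can come from a single division. The function names are illustrative and y is assumed to be nonzero.

```c
/* As written in the test-case: two multiplies, one divide, one modulo.  */
unsigned f_separate (unsigned x, unsigned y, unsigned a, unsigned b)
{
  if (a == b)
    {
      unsigned t1 = (a * x) / y;
      unsigned t2 = (b * x) % y;
      return t1 + t2;
    }
  return 0;
}

/* Hand-applied form: since a == b, both operations use the same
   dividend n, and the remainder follows from the quotient.  */
unsigned f_combined (unsigned x, unsigned y, unsigned a, unsigned b)
{
  if (a == b)
    {
      unsigned n = a * x;
      unsigned q = n / y;
      unsigned r = n - q * y;
      return q + r;
    }
  return 0;
}
```

The same reasoning covers the sincos case, where a * x and b * x are the same argument once a == b is known.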

Thanks,
Prathamesh

[Bug tree-optimization/80155] [7/8/9 regression] Performance regression with code hoisting enabled

2018-07-19 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80155

--- Comment #38 from prathamesh3492 at gcc dot gnu.org ---
Hi,
The issue can be reproduced exactly, with pr77445-2.c. I am testing with making
is_digit() noinline.

* Reordering SINK before PRE

SPEC2006 data for building SPEC2006 with sink before pre:
Number of statements sunk: +2677 (~ +14%)
Number of total PRE insertions: -3971 (~ -1%)
On the private embedded benchmark suite, there's overall no significant
difference.

Not sure if this is very helpful. Is there a way to get info about the number
of registers spilled from the lra dump or assembly ?
I would like to see the effect on spills of reordering passes.

Reordering sink before pre seems to regress no-scevccp-outer-22.c and
ssa-dom-thread-7.c, and several SVE tests on aarch64:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/262002-sink-pre/aarch64-none-linux-gnu/diff-gcc-rh60-aarch64-none-linux-gnu-default-default-default.txt

Also there seems to be some interplay with hoisting and forwprop. Disabling
forwprop3 and forwprop4 seems to eliminate the spill too. However as Bin
pointed out on the list, forwprop is also helping to reduce register pressure
for this case by mem_ref folding (forward_propagate_addr_expr).

* Jump threading cost models

It seems jump-threading pass increases the size for this case from 38 to 79
blocks. Wondering if that adds up to "resource hog", eventually leading
to extra spill ? Disabling jump threading pass eliminates the spill.

I looked a bit into fine tuning jump threading cost models for cortex-m7.
Strangely, setting max-jump-thread-duplication-stmts to 20 and
fsm-scale-path-stmts to 3 not only removes the spill but also results in 9 more
hoistings! I am investigating why this resulted
in improved performance. However it regresses ssa-dom-thread-7.c:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/262539-jump-thread-cost-models/aarch64-none-elf/diff-gcc-rh60-aarch64-none-elf-default-default-default.txt

* Stop-gap measure for hoisting ?

As a stop-gap measure, would it make sense to "localize" hoisting within
"large" loop (based on loop->num_nodes?) by refusing to hoist expressions
computed outside loop ?
My assumption is that hoisting will increase live range of expression which was
previously computed in a block outside loop but is brought inside the
loop due to hoisting since we'd now need to consider path along the loop as
well for estimating its live range ? I suppose a cheap way to test that would
be to check if block's post-dominator also lies within the same loop since it
would ensure all paths from block to EXIT would lie inside the loop ?
I created a patch for this
(http://people.linaro.org/~prathamesh.kulkarni/pdom.diff), which works to
remove the spill but regressed pr77445-2.c (which is how I stumbled on that
test). Although the underlying issue doesn't seem particularly relevant to
hoisting, so not sure if this "heuristic" makes much sense.

* Live range shrinking pass

There was some discussion about an inter-block live-range shrinking GIMPLE pass
on the list (https://gcc.gnu.org/ml/gcc/2018-05/msg00260.html), which will run
just before expand. I would be grateful for suggestions on how to get started
with it. I realize this'd be pretty hard, but would like to give a try. 

Thanks,
Prathamesh

[Bug tree-optimization/80155] [7/8/9 regression] Performance regression with code hoisting enabled

2018-08-01 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80155

--- Comment #40 from prathamesh3492 at gcc dot gnu.org ---
ping https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80155#c38

Thanks,
Prathamesh

[Bug middle-end/87209] New: Wuninitialized or Wmaybe-uninitialized doesn't warn when malloc's return value is used without being initialized

2018-09-03 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87209

Bug ID: 87209
   Summary: Wuninitialized or Wmaybe-uninitialized doesn't warn
when malloc's return value is used without being
initialized
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

There's no warnings emitted for the following test-case:

int f(void)
{
  int *p = __builtin_malloc (sizeof (*p));
  return *p;
}

I assume this should have been diagnosed with Wuninitialized or
Wmaybe-uninitialized ?
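
For comparison, two well-defined variants of the test-case, as a sketch with hypothetical function names: either store to the object before reading it, or allocate zeroed memory.

```c
#include <stdlib.h>

/* Writes the allocated object before reading it back.  Returns -1 if
   the allocation fails.  */
int read_after_store (void)
{
  int *p = malloc (sizeof (*p));
  if (p == NULL)
    return -1;
  *p = 42;
  int value = *p;
  free (p);
  return value;
}

/* calloc returns zero-initialized storage, so the read is defined.
   Returns -1 if the allocation fails.  */
int read_zeroed (void)
{
  int *p = calloc (1, sizeof (*p));
  if (p == NULL)
    return -1;
  int value = *p;
  free (p);
  return value;
}
```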

Thanks,
Prathamesh

[Bug tree-optimization/84712] New: Missed evaluating to constant at tree level

2018-03-05 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84712

Bug ID: 84712
   Summary: Missed evaluating to constant at tree level
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
It seems GCC does not evaluate the following function to a constant at the tree
level:

int sum(void)
{
  int a[] = {1, 2, 3, -1};
  int x = 0;

  for (int i = 0; i < 4; i++)
    if (a[i] < 0)
      break;
    else
      x += a[i];

  return x;
}

optimized dump shows:
sum ()
{
  int x;
  int a[4];
  int _25;
  int _33;
  int _41;

   [local count: 261993005]:
  MEM[(int *)&a] = { 1, 2, 3, -1 };
  _25 = a[1];
  if (_25 < 0)
goto ; [7.91%]
  else
goto ; [92.09%]

   [local count: 246744733]:
  x_30 = _25 + 1;
  _33 = a[2];
  if (_33 < 0)
goto ; [7.91%]
  else
goto ; [92.09%]

   [local count: 232383926]:
  x_38 = x_30 + _33;
  _41 = a[3];
  if (_41 < 0)
goto ; [7.91%]
  else
goto ; [92.09%]

   [local count: 47244641]:
  # x_17 = PHI 
  goto ; [100.00%]

   [local count: 218858940]:
  x_10 = x_38 + _41;

   [local count: 261993005]:
  # x_2 = PHI 
  a ={v} {CLOBBER};
  return x_2;

}

However at RTL, cprop seems to do the constant folding and set return value
register to 6.
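
The expected constant can be checked by hand: the loop adds 1, 2 and 3 and stops at the -1 sentinel, giving 6. The sketch below repeats the function next to the folded form one would hope to see at the tree level; sum_folded is an illustrative name.

```c
/* The function from the report: sums the elements up to the first
   negative one.  */
int sum (void)
{
  int a[] = {1, 2, 3, -1};
  int x = 0;

  for (int i = 0; i < 4; i++)
    if (a[i] < 0)
      break;
    else
      x += a[i];

  return x;
}

/* Fully folded equivalent: 1 + 2 + 3.  */
int sum_folded (void)
{
  return 6;
}
```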

Thanks,
Prathamesh

[Bug target/84759] Calculation of quotient and remainder with constant denominator uses __umoddi3+__udivdi3 instead of __udivmoddi4

2018-03-08 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84759

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added

                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #1 from prathamesh3492 at gcc dot gnu.org ---
In the former case, the divmod transform takes place and we emit call to
__udivmoddi4. However it doesn't trigger for divmodConst, because we avoid
handling constants in the transform since expand_divmod has specialized
expansions for few constants, which would otherwise have been missed. I suppose
this could be somewhat improved.

Thanks,
Prathamesh

[Bug tree-optimization/85787] malloc_candidate_p fails to detect malloc attribute on nested phis

2018-10-04 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85787

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Author: prathamesh3492
Date: Thu Oct  4 11:06:24 2018
New Revision: 264838

URL: https://gcc.gnu.org/viewcvs?rev=264838&root=gcc&view=rev
Log:
2018-10-04  Prathamesh Kulkarni  

PR tree-optimization/85787
* ipa-pure-const.c (malloc_candidate_p_1): Move most of
malloc_candidate_p
into this function and add support for detecting multiple phis.
(DUMP_AND_RETURN): Move from malloc_candidate_p into top-level macro.

testsuite/
* gcc.dg/ipa/propmalloc-4.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/ipa/propmalloc-4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-pure-const.c
trunk/gcc/testsuite/ChangeLog

[Bug tree-optimization/80155] [7/8/9 regression] Performance regression with code hoisting enabled

2018-10-17 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80155

--- Comment #42 from prathamesh3492 at gcc dot gnu.org ---
Hi,
This is another simpler approach I tried to apply "cost-model" on hoisting
before approaching a more general solution:
http://people.linaro.org/~prathamesh.kulkarni/hoist-change-order.diff
In this prototype patch, I changed the order of hoisting such that instead of
hoisting an expression into the first candidate block, it hoists the
expression one dominator at a time.

For pr77445-2.c test-case, str_225 + 1 gets hoisted in block 10 because it's
the first candidate block found from the top-down dom-tree walk, which leaves
little room for controlling hoisting.
The patch forces expressions to be inserted into one immediate dominator at a
time instead of the first candidate block. With this change, the following
series of hoistings take place for str_225 + 1:

Inserting expression in block 15 for code
hoisting:{pointer_plus_expr,str_225,1} (0079)
Inserting expression in block 14 for code hoisting:
{pointer_plus_expr,str_225,1} (0079)
Inserting expression in block 11 for code hoisting:
{pointer_plus_expr,str_225,1} (0079)
Inserting expression in block 10 for code hoisting:
{pointer_plus_expr,str_225,1} (0079)
Inserting expression in block 53 for code hoisting:
{pointer_plus_expr,str_225,1} (0079)

str_225 + 1 originally appears in blocks 16 and 17. It is then first hoisted
into their predecessor block 15, then into block 14 and so on. The advantage I
see with this order of hoisting is, we can control hoisting after each
insertion in its immediate dominator. So for instance if according to our cost
model, we reach "hoisting threshold" after say block 14, we can then prevent
further hoistings of str_225 + 1. Whereas with the current approach it gets
hoisted right up to block 10 initially. Alternatively we could try to "sink"
the expression down to dominated blocks. I didn't explore this option yet.

* Cost model for hoisting
The cost model would be entirely target specific defined by a target hook and
shouldn't affect other architectures that don't wish to use it. I suppose a
very simple cost model for hoisting could take following two factors:
a) Number of hoistings of a particular expression measured in terms of
dominator depth - This is recorded by expr_hoist_map, a map whose key is the
value number of the pre_expr and whose value is the count.
b) Number of insertions in a basic block - This is recorded by a map whose key
is the block index and whose value is the count.

I didn't attempt to define the cost-model in the patch. I was wondering what
could be other potential factors that we can consider ?

* Issues with changing hoisting order
I am not entirely sure if the result of changing hoisting order can result in
correctness issues or missed optimizations ? For some confidence, I validated
the patch with bootstrap+test on x86_64, which worked.

There are two problems I see:
(1) Interference with statistics of hoisting, which is easy to fix.
(2) Does not honor the "expression should be available in at least one
successor" constraint, which leads to more aggressive hoisting for
architectures that will not use cost model. In example above, str_225 + 1 got
hoisted one block further up to block-53, while with current-order it's
restricted to block-10. I suppose we could fix this by recording which
expressions were originally available at end of block ?

The patch passes bootstrap+test on x86_64.

* Hoistings crossing loop boundary - One "peculiarity" I see with FMS function
in pr77445-2.c is that all the hoistings cross loop boundaries at one point,
while other tests have significantly fewer.
I did a quick test with SPEC2006 to collect some data:
(number-of-hoistings vs number-of-functions)
{2: 89, 1: 166, 3: 37, 4: 14, 5: 8, 6: 10, 7: 11, 8: 2, 13: 3, 10: 1, 11: 5, 9:
4, 17: 1, 15: 2, 27: 1, 12: 2, 18: 1, 21: 1, 26: 1}

It seems most of the functions have cross loop hoistings less than 5 with 166
functions having one hoisting inside loop and 89 functions having two hoistings
across loops. I was wondering if a hoisting into a block from its successor
should have "extra penalty" if it crosses a loop boundary ? Or does hoisting
inside a loop have no effect on register pressure ?
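
The "most functions have fewer than 5" reading of the histogram above can be checked with a few lines of arithmetic. The table below is copied from the data listed; the array and function names are illustrative.

```c
/* (number-of-hoistings, number-of-functions) pairs from the SPEC2006
   data above.  */
static const int hoistings[] = { 2, 1, 3, 4, 5, 6, 7, 8, 13, 10,
                                 11, 9, 17, 15, 27, 12, 18, 21, 26 };
static const int functions[] = { 89, 166, 37, 14, 8, 10, 11, 2, 3, 1,
                                 5, 4, 1, 2, 1, 2, 1, 1, 1 };
enum { N_ENTRIES = sizeof hoistings / sizeof hoistings[0] };

/* Total number of functions that have cross-loop hoistings.  */
int total_functions (void)
{
  int total = 0;
  for (int i = 0; i < N_ENTRIES; i++)
    total += functions[i];
  return total;
}

/* Number of functions with fewer than limit cross-loop hoistings.  */
int functions_below (int limit)
{
  int count = 0;
  for (int i = 0; i < N_ENTRIES; i++)
    if (hoistings[i] < limit)
      count += functions[i];
  return count;
}
```

With these numbers, 306 of the 359 functions listed (roughly 85%) have fewer than 5 cross-loop hoistings.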

Thanks,
Prathamesh

[Bug tree-optimization/80155] [7/8/9 regression] Performance regression with code hoisting enabled

2018-10-17 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80155

--- Comment #43 from prathamesh3492 at gcc dot gnu.org ---
Sorry for duplications / formatting errors in previous comment. Is there a way
to edit posted comments ?

Thanks,
Prathamesh

[Bug target/87920] New: Lots of regression tests fail with bootstrap build of arm-linux-gnueabihf

2018-11-07 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87920

Bug ID: 87920
   Summary: Lots of regression tests fail with bootstrap build of
arm-linux-gnueabihf
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
It seems lots of tests are failing with bootstrap build of arm-linux-gnueabihf
with following ICE:

during GIMPLE pass: ldist
/home/prathamesh.kulkarni/gnu-toolchain/gcc/tcwg-319-4/gcc/gcc/testsuite/c-c++-common/torture/pr53505.c:
In function 'main':
/home/prathamesh.kulkarni/gnu-toolchain/gcc/tcwg-319-4/gcc/gcc/testsuite/c-c++-common/torture/pr53505.c:29:1:
internal compiler error: Segmentation fault
0x5e83cb crash_signal
../../gcc/gcc/toplev.c:325
0x660e57 inchash::hash::add(void const*, unsigned int)
../../gcc/gcc/inchash.h:100
0x660e57 inchash::hash::add_ptr(void const*)
../../gcc/gcc/inchash.h:94
0x660e57 ddr_hasher::hash(data_dependence_relation const*)
../../gcc/gcc/tree-loop-distribution.c:143
0x660e57 hash_table::find_slot(data_dependence_relation* const&, insert_option)
../../gcc/gcc/hash-table.h:414
0x660e57 get_data_dependence
../../gcc/gcc/tree-loop-distribution.c:1184
0x66157b pg_add_dependence_edges
../../gcc/gcc/tree-loop-distribution.c:1890
0x66157b build_partition_graph
../../gcc/gcc/tree-loop-distribution.c:2107
0x66180f merge_dep_scc_partitions
../../gcc/gcc/tree-loop-distribution.c:2171
0x662e69 distribute_loop
../../gcc/gcc/tree-loop-distribution.c:2892
0x66416d execute
../../gcc/gcc/tree-loop-distribution.c:3133

Several tests fail with above ICE, like pr53505.c,
20131115-1.c, 20181024-1.c etc.

Thanks,
Prathamesh

[Bug target/87920] Lots of regression tests fail with bootstrap build of arm-linux-gnueabihf

2018-11-07 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87920

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added

             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #2 from prathamesh3492 at gcc dot gnu.org ---
Likely yes, thanks for the pointer! I will mark this as dup.

Thanks,
Prathamesh

*** This bug has been marked as a duplicate of bug 87899 ***

[Bug middle-end/87899] [9 regression]r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-07 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87899

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added

                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #4 from prathamesh3492 at gcc dot gnu.org ---
*** Bug 87920 has been marked as a duplicate of this bug. ***
