[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-10-28 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #17 from iii at gcc dot gnu.org ---
Author: iii
Date: Mon Oct 28 10:04:31 2019
New Revision: 277507

URL: https://gcc.gnu.org/viewcvs?rev=277507&root=gcc&view=rev
Log:
Move jump threading before reload

r266734 has introduced a new instance of jump threading pass in order to
take advantage of opportunities that combine opens up.  It was perceived
back then that it was beneficial to delay it after reload, since that
might produce even more such opportunities.

Unfortunately jump threading interferes with hot/cold partitioning.  In
the code from PR92007, it converts the following

  +------------------------- 2/HOT ------------------------+
  |                                                        |
  v                                                        v
3/HOT --> 5/HOT --> 8/HOT --> 11/COLD --> 6/HOT --EH--> 16/HOT
            |                               ^
            |                               |
            +-------------------------------+

into the following:

  +-------------------- 2/HOT -------------------+
  |                                              |
  v                                              v
3/HOT --> 8/HOT --> 11/COLD --> 6/COLD --EH--> 16/HOT

This makes hot bb 6 dominated by cold bb 11, and because of this
fixup_partitions makes bb 6 cold as well, which in turn makes EH edge
6->16 a crossing one.  Not only can't we have crossing EH edges, we are
also not allowed to introduce new crossing edges after reload in
general, since it might require extra registers on some targets.
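
As a rough illustration of the effect described above, the following
Python sketch models partition fixup on a toy CFG.  It is not GCC code.
The edge lists are an assumption loosely based on the diagrams (in
particular the 2->16 and 5->6 edges), and fixup_partitions here simply
demotes every block that cannot be reached from the entry without
passing through a cold block, which is a simplification of the
dominance-based rule.

```python
ENTRY = 2  # entry block of the toy CFG (assumption)

def fixup_partitions(edges, cold):
    """Return the cold set after demoting blocks reachable only via cold blocks."""
    succs = {}
    for src, dst, _kind in edges:
        succs.setdefault(src, []).append(dst)
    # Walk from the entry without ever stepping through a cold block.
    hot_reachable, stack = set(), [ENTRY]
    while stack:
        bb = stack.pop()
        if bb in hot_reachable or bb in cold:
            continue
        hot_reachable.add(bb)
        stack.extend(succs.get(bb, []))
    blocks = {bb for src, dst, _kind in edges for bb in (src, dst)}
    return blocks - hot_reachable

def crossing_eh_edges(edges, cold):
    """List EH edges whose endpoints sit in different partitions."""
    return [(s, d) for s, d, kind in edges
            if kind == "eh" and ((s in cold) != (d in cold))]

# Hypothetical edge lists: (source, destination, kind).
before = [(2, 3, "normal"), (2, 16, "normal"), (3, 5, "normal"),
          (5, 8, "normal"), (5, 6, "normal"), (8, 11, "normal"),
          (11, 6, "normal"), (6, 16, "eh")]
# Threading removes bb 5, so bb 6 is now reachable only through cold bb 11.
after = [(2, 3, "normal"), (2, 16, "normal"), (3, 8, "normal"),
         (8, 11, "normal"), (11, 6, "normal"), (6, 16, "eh")]

cold_before = fixup_partitions(before, {11})
cold_after = fixup_partitions(after, {11})
print(sorted(cold_before), crossing_eh_edges(before, cold_before))
print(sorted(cold_after), crossing_eh_edges(after, cold_after))
```

In this model only the threaded CFG ends up with bb 6 in the cold set
and with 6->16 reported as a crossing EH edge.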

Therefore, move the jump threading pass between combine and hot/cold
partitioning.  Building SPEC 2006 and SPEC 2017 with the old and the new
code indicates that:

* When doing jump threading right after reload, 3889 edges are threaded.
* When doing jump threading right after combine, 3918 edges are
  threaded.

This means this change will not introduce performance regressions.
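
For reference, the two counts quoted above can be compared directly.
The snippet below is only arithmetic on those two numbers and says
nothing about run time by itself.

```python
# Threaded edge counts quoted in the commit message.
after_reload = 3889
after_combine = 3918

delta = after_combine - after_reload
relative = delta / after_reload * 100

print(f"{delta} more edges threaded ({relative:.2f}% more)")
```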

gcc/ChangeLog:

2019-10-28  Ilya Leoshkevich  

PR rtl-optimization/92007
* cfgcleanup.c (thread_jump): Add an assertion that we don't
call it after reload if hot/cold partitioning has been done.
(class pass_postreload_jump): Rename to
pass_jump_after_combine.
(make_pass_postreload_jump): Rename to
make_pass_jump_after_combine.
* passes.def (pass_postreload_jump): Move before reload, rename
to pass_jump_after_combine.
* tree-pass.h (make_pass_postreload_jump): Rename to
make_pass_jump_after_combine.

gcc/testsuite/ChangeLog:

2019-10-28  Ilya Leoshkevich  

PR rtl-optimization/92007
* g++.dg/opt/pr92007.C: New test (from Arseny Solokha).

Added:
trunk/gcc/testsuite/g++.dg/opt/pr92007.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/cfgcleanup.c
trunk/gcc/passes.def
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-pass.h

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-10-28 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #18 from iii at gcc dot gnu.org ---
Author: iii
Date: Mon Oct 28 13:09:54 2019
New Revision: 277515

URL: https://gcc.gnu.org/viewcvs?rev=277515&root=gcc&view=rev
Log:
Move jump threading before reload

r266734 has introduced a new instance of jump threading pass in order to
take advantage of opportunities that combine opens up.  It was perceived
back then that it was beneficial to delay it after reload, since that
might produce even more such opportunities.

Unfortunately jump threading interferes with hot/cold partitioning.  In
the code from PR92007, it converts the following

  +------------------------- 2/HOT ------------------------+
  |                                                        |
  v                                                        v
3/HOT --> 5/HOT --> 8/HOT --> 11/COLD --> 6/HOT --EH--> 16/HOT
            |                               ^
            |                               |
            +-------------------------------+

into the following:

  +-------------------- 2/HOT -------------------+
  |                                              |
  v                                              v
3/HOT --> 8/HOT --> 11/COLD --> 6/COLD --EH--> 16/HOT

This makes hot bb 6 dominated by cold bb 11, and because of this
fixup_partitions makes bb 6 cold as well, which in turn makes EH edge
6->16 a crossing one.  Not only can't we have crossing EH edges, we are
also not allowed to introduce new crossing edges after reload in
general, since it might require extra registers on some targets.

Therefore, move the jump threading pass between combine and hot/cold
partitioning.  Building SPEC 2006 and SPEC 2017 with the old and the new
code indicates that:

* When doing jump threading right after reload, 3889 edges are threaded.
* When doing jump threading right after combine, 3918 edges are
  threaded.

This means this change will not introduce performance regressions.

gcc/ChangeLog:

2019-10-28  Ilya Leoshkevich  

Backport from mainline
PR rtl-optimization/92007
* cfgcleanup.c (thread_jump): Add an assertion that we don't
call it after reload if hot/cold partitioning has been done.
(class pass_postreload_jump): Rename to
pass_jump_after_combine.
(make_pass_postreload_jump): Rename to
make_pass_jump_after_combine.
* passes.def (pass_postreload_jump): Move before reload, rename
to pass_jump_after_combine.
* tree-pass.h (make_pass_postreload_jump): Rename to
make_pass_jump_after_combine.

gcc/testsuite/ChangeLog:

2019-10-28  Ilya Leoshkevich  

Backport from mainline
PR rtl-optimization/92007
* g++.dg/opt/pr92007.C: New test (from Arseny Solokha).

Added:
branches/gcc-9-branch/gcc/testsuite/g++.dg/opt/pr92007.C
Modified:
branches/gcc-9-branch/gcc/ChangeLog
branches/gcc-9-branch/gcc/cfgcleanup.c
branches/gcc-9-branch/gcc/passes.def
branches/gcc-9-branch/gcc/testsuite/ChangeLog
branches/gcc-9-branch/gcc/tree-pass.h

[Bug rtl-optimization/92430] [9/10 Regression] Compile-time hog w/ -Os -fno-if-conversion -fno-tree-dce -fno-tree-loop-optimize -fno-tree-vrp

2019-11-12 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92430

--- Comment #5 from iii at gcc dot gnu.org ---
Author: iii
Date: Tue Nov 12 14:24:35 2019
New Revision: 278095

URL: https://gcc.gnu.org/viewcvs?rev=278095&root=gcc&view=rev
Log:
Free dominance info at the beginning of pass_jump_after_combine

try_forward_edges does not update dominance info, and merge_blocks
relies on it being up-to-date.  In PR92430 stale dominance info makes
merge_blocks produce a loop in the dominator tree, which in turn makes
delete_basic_block loop forever.

Fix by freeing dominance info at the beginning of cleanup_cfg.
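
To see why a loop in the dominator tree is fatal for any code that
walks it, consider this generic Python sketch.  It is not the GCC
implementation: the idom maps and walk_to_root are hypothetical, and
the cycle check exists only so that the example terminates where an
unguarded walk would spin forever.

```python
def walk_to_root(idom, bb):
    """Follow immediate-dominator links from bb up to the root.

    Returns the visited blocks, or raises ValueError if the links form a
    cycle (an unguarded loop would never terminate in that case).
    """
    visited = []
    while bb is not None:
        if bb in visited:
            raise ValueError(f"cycle in dominator links at bb {bb}")
        visited.append(bb)
        bb = idom.get(bb)
    return visited

# Hypothetical maps: block -> its immediate dominator.
fresh = {2: None, 3: 2, 4: 3}
stale = {2: None, 3: 4, 4: 3}  # 3 and 4 claim to dominate each other

print(walk_to_root(fresh, 4))
try:
    walk_to_root(stale, 4)
except ValueError as err:
    print(err)
```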

gcc/ChangeLog:

2019-11-12  Ilya Leoshkevich  

PR rtl-optimization/92430
* cfgcleanup.c (pass_jump_after_combine::execute): Free
dominance info at the beginning.

gcc/testsuite/ChangeLog:

2019-11-12  Ilya Leoshkevich  

PR rtl-optimization/92430
* gcc.dg/pr92430.c: New test (from Arseny Solokha).

Added:
trunk/gcc/testsuite/gcc.dg/pr92430.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/cfgcleanup.c
trunk/gcc/testsuite/ChangeLog

[Bug rtl-optimization/92430] [9 Regression] Compile-time hog w/ -Os -fno-if-conversion -fno-tree-dce -fno-tree-loop-optimize -fno-tree-vrp

2019-11-14 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92430

--- Comment #7 from iii at gcc dot gnu.org ---
Author: iii
Date: Thu Nov 14 16:40:33 2019
New Revision: 278254

URL: https://gcc.gnu.org/viewcvs?rev=278254&root=gcc&view=rev
Log:
Make flag_thread_jumps a gate of pass_jump_after_combine

This is a follow-up to
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00919.html (r278095).
Dominance info is deleted even if we don't perform jump threading.
Since the whole point of this pass is to perform jump threading (other
cleanups are not valuable at this point), skip it completely when
flag_thread_jumps is not set.
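
The gate idea can be sketched as follows.  This is a Python stand-in
and not GCC's pass class: JumpAfterCombine, its attributes and run ()
are hypothetical names that only mirror the gate/execute split named in
the ChangeLog below.

```python
class JumpAfterCombine:
    """Hypothetical stand-in for a pass with a gate."""

    def __init__(self, thread_jumps_enabled):
        self.thread_jumps_enabled = thread_jumps_enabled
        self.dominance_info_valid = True

    def gate(self):
        # The pass exists only for jump threading, so gate on that flag.
        return self.thread_jumps_enabled

    def execute(self):
        # Dominance info is dropped up front; threading would leave it stale.
        self.dominance_info_valid = False
        return "threaded"

def run(p):
    """Run the pass only when its gate allows it."""
    return p.execute() if p.gate() else None

disabled = JumpAfterCombine(thread_jumps_enabled=False)
enabled = JumpAfterCombine(thread_jumps_enabled=True)
print(run(disabled), disabled.dominance_info_valid)
print(run(enabled), enabled.dominance_info_valid)
```

With the gate in place, a disabled pass no longer throws away
dominance info as a side effect.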

gcc/ChangeLog:

2019-11-14  Ilya Leoshkevich  

PR rtl-optimization/92430
* cfgcleanup.c (pass_jump_after_combine::gate): New function.
(pass_jump_after_combine::execute): Perform jump threading
unconditionally.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/cfgcleanup.c

[Bug rtl-optimization/92430] [9 Regression] Compile-time hog w/ -Os -fno-if-conversion -fno-tree-dce -fno-tree-loop-optimize -fno-tree-vrp

2019-11-15 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92430

--- Comment #8 from iii at gcc dot gnu.org ---
Author: iii
Date: Fri Nov 15 12:55:05 2019
New Revision: 278291

URL: https://gcc.gnu.org/viewcvs?rev=278291&root=gcc&view=rev
Log:
Free dominance info at the beginning of pass_jump_after_combine

try_forward_edges does not update dominance info, and merge_blocks
relies on it being up-to-date.  In PR92430 stale dominance info makes
merge_blocks produce a loop in the dominator tree, which in turn makes
delete_basic_block loop forever.

Fix by freeing dominance info at the beginning of cleanup_cfg.

Also, since the whole point of this pass is to perform jump threading
(other cleanups are not valuable at this point), skip it completely when
flag_thread_jumps is not set.

gcc/ChangeLog:

2019-11-15  Ilya Leoshkevich  

Backport from mainline
PR rtl-optimization/92430
* cfgcleanup.c (pass_jump_after_combine::gate): New function.
(pass_jump_after_combine::execute): Free
dominance info at the beginning.

gcc/testsuite/ChangeLog:

2019-11-15  Ilya Leoshkevich  

Backport from mainline
PR rtl-optimization/92430
* gcc.dg/pr92430.c: New test (from Arseny Solokha).

Added:
branches/gcc-9-branch/gcc/testsuite/gcc.dg/pr92430.c
Modified:
branches/gcc-9-branch/gcc/ChangeLog
branches/gcc-9-branch/gcc/cfgcleanup.c
branches/gcc-9-branch/gcc/testsuite/ChangeLog

[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.

2019-09-30 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

--- Comment #3 from iii at gcc dot gnu.org ---
Author: iii
Date: Mon Sep 30 17:40:02 2019
New Revision: 276360

URL: https://gcc.gnu.org/viewcvs?rev=276360&root=gcc&view=rev
Log:
S/390: Remove code duplication in vec_unordered

vec_unordered is vec_ordered plus a negation at the end.
Reuse vec_ordered logic.
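
The relationship is easy to check on scalars.  The Python sketch below
uses hypothetical helper names and ordinary floats rather than vector
registers.

```python
import math

def ordered(a, b):
    """True when neither operand is a NaN."""
    return not (math.isnan(a) or math.isnan(b))

def unordered(a, b):
    """Negation of ordered (), mirroring the reuse described above."""
    return not ordered(a, b)

print(ordered(1.0, 2.0), unordered(float("nan"), 1.0))
```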

gcc/ChangeLog:

2019-09-30  Ilya Leoshkevich  

PR target/77918
* config/s390/vector.md (vec_unordered): Call
gen_vec_ordered.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/vector.md

[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.

2019-10-01 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

--- Comment #4 from iii at gcc dot gnu.org ---
Author: iii
Date: Tue Oct  1 14:03:08 2019
New Revision: 276408

URL: https://gcc.gnu.org/viewcvs?rev=276408&root=gcc&view=rev
Log:
S/390: Implement vcond expander for V1TI,V1TF

Currently gcc does not emit wf{c,k}* instructions when comparing long
double values.  Middle-end actually adds them in the first place, but
then veclower pass replaces them with floating point register pair
operations, because the corresponding expander is missing.

gcc/ChangeLog:

2019-10-01  Ilya Leoshkevich  

PR target/77918
* config/s390/vector.md (V_HW): Add V1TI in order to make
vcond$a$b generate vcondv1tiv1tf.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/vector.md

[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.

2019-10-01 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

--- Comment #5 from iii at gcc dot gnu.org ---
Author: iii
Date: Tue Oct  1 14:04:08 2019
New Revision: 276409

URL: https://gcc.gnu.org/viewcvs?rev=276409&root=gcc&view=rev
Log:
S/390: Remove code duplication in vec_* comparison expanders

s390.md uses a lot of near-identical expanders that perform dispatching
to other expanders based on operand types. Since the following patch
would require even more of these, avoid copy-pasting the code by
generating these expanders using an iterator.

gcc/ChangeLog:

2019-10-01  Ilya Leoshkevich  

PR target/77918
* config/s390/s390.c (s390_expand_vec_compare): Use
gen_vec_cmpordered and gen_vec_cmpunordered.
* config/s390/vector.md (vec_cmpuneq, vec_cmpltgt, vec_ordered,
vec_unordered): Delete.
(vec_ordered): Rename to vec_cmpordered.
(vec_unordered): Rename to vec_cmpunordered.
(VEC_CMP_EXPAND): New iterator for the generic dispatcher.
(vec_cmp): Generic dispatcher.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.c
trunk/gcc/config/s390/vector.md

[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.

2019-10-07 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

--- Comment #6 from iii at gcc dot gnu.org ---
Author: iii
Date: Mon Oct  7 14:59:00 2019
New Revision: 276659

URL: https://gcc.gnu.org/viewcvs?rev=276659&root=gcc&view=rev
Log:
Allow COND_EXPR and VEC_COND_EXPR conditions to trap

Right now gimplifier does not allow VEC_COND_EXPR's condition to trap
and introduces a temporary if this could happen, for example, generating

  _5 = _4 > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 };
  _6 = VEC_COND_EXPR <_5, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }>;

from GENERIC

  VEC_COND_EXPR < (*b > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 }) ,
  { -1, -1, -1, -1 } ,
  { 0, 0, 0, 0 } >

This is not necessary and makes the resulting GIMPLE harder to analyze.
Change the gimplifier so as to allow COND_EXPR and VEC_COND_EXPR
conditions to trap.

This patch takes special care to avoid introducing trapping comparisons
in GIMPLE_COND.  They are not allowed, because they would require 3
outgoing edges (then, else and EH), which is awkward to say the least.
Therefore, computations of such conditions should live in their own basic
blocks.

gcc/ChangeLog:

2019-10-07  Ilya Leoshkevich  

PR target/77918
* gimple-expr.c (gimple_cond_get_ops_from_tree): Assert that the
caller passes a non-trapping condition.
(is_gimple_condexpr): Allow trapping conditions.
(is_gimple_condexpr_1): New helper function.
(is_gimple_condexpr_for_cond): New function, acts like old
is_gimple_condexpr.
* gimple-expr.h (is_gimple_condexpr_for_cond): New function.
* gimple.c (gimple_could_trap_p_1): Handle COND_EXPR and
VEC_COND_EXPR. Fix an issue with statements like i = (fp < 1.).
* gimplify.c (gimplify_cond_expr): Use
is_gimple_condexpr_for_cond.
(gimplify_expr): Allow is_gimple_condexpr_for_cond.
* tree-eh.c (operation_could_trap_p): Assert on COND_EXPR and
VEC_COND_EXPR.
(tree_could_trap_p): Handle COND_EXPR and VEC_COND_EXPR.
* tree-ssa-forwprop.c (forward_propagate_into_gimple_cond): Use
is_gimple_condexpr_for_cond, remove pointless tmp check.
(forward_propagate_into_cond): Remove pointless tmp check.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/gimple-expr.c
trunk/gcc/gimple-expr.h
trunk/gcc/gimple.c
trunk/gcc/gimplify.c
trunk/gcc/tree-eh.c
trunk/gcc/tree-ssa-forwprop.c

[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.

2019-10-07 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

--- Comment #7 from iii at gcc dot gnu.org ---
Author: iii
Date: Mon Oct  7 15:01:15 2019
New Revision: 276660

URL: https://gcc.gnu.org/viewcvs?rev=276660&root=gcc&view=rev
Log:
Introduce can_vcond_compare_p function

z13 supports only non-signaling vector comparisons.  This means we
cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13.
However, we cannot express this restriction today: the code only checks
whether vcond$a$b optab exists, but this does not say anything about the
operation.

Introduce a function that checks whether back-end supports vector
comparisons with individual rtx codes by matching vcond expander's third
argument with a fake comparison with the corresponding rtx code.
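
The difference between "an expander exists" and "the expander accepts
this comparison code" can be sketched in Python.  Everything here is
hypothetical and is not the optabs API: the predicates are plain
functions, and the set of rejected codes is just the list quoted above.

```python
# Comparison codes that the text above lists as not vectorizable on z13.
SIGNALING_CODES = {"LT", "LE", "GT", "GE", "LTGT"}

def z13_like_predicate(code):
    """Assumed model of a target with only non-signaling vector comparisons."""
    return code not in SIGNALING_CODES

def accept_all(code):
    """Model of a target whose expander takes any comparison code."""
    return True

def can_compare(predicate, code):
    """Ask the target's operator predicate about one specific code."""
    return predicate(code)

for code in ("EQ", "GT", "LTGT"):
    print(code, can_compare(z13_like_predicate, code))
```

Both targets "have" the expander in this model, but only the per-code
query reveals that one of them rejects GT.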

gcc/ChangeLog:

2019-10-07  Ilya Leoshkevich  

PR target/77918
* optabs-tree.c (vcond_icode_p): New function.
(vcond_eq_icode_p): Likewise.
(expand_vec_cond_expr_p): Use vcond_icode_p and
vcond_eq_icode_p.
* optabs.c (can_vcond_compare_p): New function.
* optabs.h (can_vcond_compare_p): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/optabs-tree.c
trunk/gcc/optabs.c
trunk/gcc/optabs.h

[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.

2019-10-10 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

--- Comment #8 from iii at gcc dot gnu.org ---
Author: iii
Date: Thu Oct 10 17:00:29 2019
New Revision: 276842

URL: https://gcc.gnu.org/viewcvs?rev=276842&root=gcc&view=rev
Log:
[PATCH 1/3] S/390: Do not use signaling vector comparisons on z13

z13 supports only non-signaling vector comparisons.  This means we
cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13.  Notify
middle-end about this by using more restrictive operator predicate in
vcond.

gcc/ChangeLog:

2019-10-10  Ilya Leoshkevich  

PR target/77918
* config/s390/vector.md (vcond_comparison_operator): New
predicate.
(vcond): Use vcond_comparison_operator.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/vector.md

[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.

2019-10-11 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

--- Comment #9 from iii at gcc dot gnu.org ---
Author: iii
Date: Fri Oct 11 09:00:26 2019
New Revision: 276871

URL: https://gcc.gnu.org/viewcvs?rev=276871&root=gcc&view=rev
Log:
S/390: Use signaling FP comparison instructions

dg-torture.exp=inf-compare-1.c is failing, because (qNaN > +Inf)
comparison is compiled to CDB instruction, which does not signal an
invalid operation exception. KDB should have been used instead.

This patch introduces a new CCmode and a new pattern in order to
generate signaling instructions in this and similar cases.
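
The behavioural difference can be modelled in a few lines of Python.
This is a sketch under assumptions: the function names are
hypothetical, every NaN is treated as a quiet NaN, and the "invalid"
flag is returned as a value instead of being raised in hardware.

```python
import math

def compare_gt_quiet(a, b):
    """Return (result, invalid) for a quiet a > b on quiet-NaN inputs."""
    # A quiet comparison does not flag invalid for quiet NaNs.
    return (a > b, False)

def compare_gt_signaling(a, b):
    """Return (result, invalid) for a signaling a > b."""
    # A signaling comparison flags invalid whenever an operand is a NaN.
    is_unordered = math.isnan(a) or math.isnan(b)
    return (a > b, is_unordered)

nan, inf = float("nan"), float("inf")
print(compare_gt_quiet(nan, inf))      # models the CDB-style behaviour
print(compare_gt_signaling(nan, inf))  # models the KDB-style behaviour
```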

gcc/ChangeLog:

2019-10-11  Ilya Leoshkevich  

PR target/77918
* config/s390/2827.md: Add new opcodes.
* config/s390/2964.md: Likewise.
* config/s390/3906.md: Likewise.
* config/s390/8561.md: Likewise.
* config/s390/s390-builtins.def (s390_vfchesb): Use
the new vec_cmpgev4sf_quiet_nocc.
(s390_vfchedb): Use the new vec_cmpgev2df_quiet_nocc.
(s390_vfchsb): Use the new vec_cmpgtv4sf_quiet_nocc.
(s390_vfchdb): Use the new vec_cmpgtv2df_quiet_nocc.
(vec_cmplev4sf): Use the new vec_cmplev4sf_quiet_nocc.
(vec_cmplev2df): Use the new vec_cmplev2df_quiet_nocc.
(vec_cmpltv4sf): Use the new vec_cmpltv4sf_quiet_nocc.
(vec_cmpltv2df): Use the new vec_cmpltv2df_quiet_nocc.
* config/s390/s390-modes.def (CCSFPS): New mode.
* config/s390/s390.c (s390_match_ccmode_set): Support CCSFPS.
(s390_select_ccmode): Return CCSFPS for LT, LE, GT, GE and LTGT.
(s390_branch_condition_mask): Reuse CCS for CCSFPS.
(s390_expand_vec_compare): Use non-signaling patterns where
necessary.
(s390_reverse_condition): Support CCSFPS.
* config/s390/s390.md (*cmp_ccsfps): New pattern.
* config/s390/vector.md (VFCMP_HW_OP): Remove.
(asm_fcmp_op): Likewise.
(*smaxv2df3_vx): Use pattern for quiet comparison.
(*sminv2df3_vx): Likewise.
(*vec_cmp_nocc): Remove.
(*vec_cmpeq_quiet_nocc): New pattern.
(vec_cmpgt_quiet_nocc): Likewise.
(vec_cmplt_quiet_nocc): New expander.
(vec_cmpge_quiet_nocc): New pattern.
(vec_cmple_quiet_nocc): New expander.
(*vec_cmpeq_signaling_nocc): New pattern.
(*vec_cmpgt_signaling_nocc): Likewise.
(*vec_cmpgt_signaling_finite_nocc): Likewise.
(*vec_cmpge_signaling_nocc): Likewise.
(*vec_cmpge_signaling_finite_nocc): Likewise.
(vec_cmpungt): New expander.
(vec_cmpunge): Likewise.
(vec_cmpuneq): Use quiet patterns.
(vec_cmpltgt): Allow only on z14+.
(vec_cmpordered): Use quiet patterns.
(vec_cmpunordered): Likewise.
(VEC_CMP_EXPAND): Add ungt and unge.

gcc/testsuite/ChangeLog:

2019-10-11  Ilya Leoshkevich  

* gcc.target/s390/vector/vec-scalar-cmp-1.c: Adjust
expectations.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/2827.md
trunk/gcc/config/s390/2964.md
trunk/gcc/config/s390/3906.md
trunk/gcc/config/s390/8561.md
trunk/gcc/config/s390/s390-builtins.def
trunk/gcc/config/s390/s390-modes.def
trunk/gcc/config/s390/s390.c
trunk/gcc/config/s390/s390.md
trunk/gcc/config/s390/vector.md
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c

[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.

2019-10-11 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

--- Comment #11 from iii at gcc dot gnu.org ---
Author: iii
Date: Fri Oct 11 09:03:00 2019
New Revision: 276872

URL: https://gcc.gnu.org/viewcvs?rev=276872&root=gcc&view=rev
Log:
S/390: Test signaling FP comparison instructions

gcc/testsuite/ChangeLog:

2019-10-11  Ilya Leoshkevich  

PR target/77918
* gcc.target/s390/s390.exp: Enable Fortran tests.
* gcc.target/s390/zvector/autovec-double-quiet-eq.c: New test.
* gcc.target/s390/zvector/autovec-double-quiet-ge.c: New test.
* gcc.target/s390/zvector/autovec-double-quiet-gt.c: New test.
* gcc.target/s390/zvector/autovec-double-quiet-le.c: New test.
* gcc.target/s390/zvector/autovec-double-quiet-lt.c: New test.
* gcc.target/s390/zvector/autovec-double-quiet-ordered.c: New test.
* gcc.target/s390/zvector/autovec-double-quiet-uneq.c: New test.
* gcc.target/s390/zvector/autovec-double-quiet-unordered.c: New test.
* gcc.target/s390/zvector/autovec-double-signaling-eq-z13-finite.c: New
test.
* gcc.target/s390/zvector/autovec-double-signaling-eq-z13.c: New test.
* gcc.target/s390/zvector/autovec-double-signaling-eq.c: New test.
* gcc.target/s390/zvector/autovec-double-signaling-ge-z13-finite.c: New
test.
* gcc.target/s390/zvector/autovec-double-signaling-ge-z13.c: New test.
* gcc.target/s390/zvector/autovec-double-signaling-ge.c: New test.
* gcc.target/s390/zvector/autovec-double-signaling-gt-z13-finite.c: New
test.
* gcc.target/s390/zvector/autovec-double-signaling-gt-z13.c: New test.
* gcc.target/s390/zvector/autovec-double-signaling-gt.c: New test.
* gcc.target/s390/zvector/autovec-double-signaling-le-z13-finite.c: New
test.
* gcc.target/s390/zvector/autovec-double-signaling-le-z13.c: New test.
* gcc.target/s390/zvector/autovec-double-signaling-le.c: New test.
* gcc.target/s390/zvector/autovec-double-signaling-lt-z13-finite.c: New
test.
* gcc.target/s390/zvector/autovec-double-signaling-lt-z13.c: New test.
* gcc.target/s390/zvector/autovec-double-signaling-lt.c: New test.
* gcc.target/s390/zvector/autovec-double-signaling-ltgt-z13-finite.c:
New test.
* gcc.target/s390/zvector/autovec-double-signaling-ltgt-z13.c: New
test.
* gcc.target/s390/zvector/autovec-double-signaling-ltgt.c: New test.
* gcc.target/s390/zvector/autovec-double-smax-z13.F90: New test.
* gcc.target/s390/zvector/autovec-double-smax.F90: New test.
* gcc.target/s390/zvector/autovec-double-smin-z13.F90: New test.
* gcc.target/s390/zvector/autovec-double-smin.F90: New test.
* gcc.target/s390/zvector/autovec-float-quiet-eq.c: New test.
* gcc.target/s390/zvector/autovec-float-quiet-ge.c: New test.
* gcc.target/s390/zvector/autovec-float-quiet-gt.c: New test.
* gcc.target/s390/zvector/autovec-float-quiet-le.c: New test.
* gcc.target/s390/zvector/autovec-float-quiet-lt.c: New test.
* gcc.target/s390/zvector/autovec-float-quiet-ordered.c: New test.
* gcc.target/s390/zvector/autovec-float-quiet-uneq.c: New test.
* gcc.target/s390/zvector/autovec-float-quiet-unordered.c: New test.
* gcc.target/s390/zvector/autovec-float-signaling-eq.c: New test.
* gcc.target/s390/zvector/autovec-float-signaling-ge.c: New test.
* gcc.target/s390/zvector/autovec-float-signaling-gt.c: New test.
* gcc.target/s390/zvector/autovec-float-signaling-le.c: New test.
* gcc.target/s390/zvector/autovec-float-signaling-lt.c: New test.
* gcc.target/s390/zvector/autovec-float-signaling-ltgt.c: New test.
* gcc.target/s390/zvector/autovec-fortran.h: New test.
* gcc.target/s390/zvector/autovec-long-double-signaling-ge.c: New test.
* gcc.target/s390/zvector/autovec-long-double-signaling-gt.c: New test.
* gcc.target/s390/zvector/autovec-long-double-signaling-le.c: New test.
* gcc.target/s390/zvector/autovec-long-double-signaling-lt.c: New test.
* gcc.target/s390/zvector/autovec.h: New test.

Added:
trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-eq.c
trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-ge.c
trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-gt.c
trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-le.c
trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-lt.c
trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-ordered.c
trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-uneq.c
trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-unordered.c
trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq-z13-finite.c
trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq-z13.c
t

[Bug target/89233] [9 Regression] ICE in change_address_1, at emit-rtl.c:2286

2019-02-12 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89233

--- Comment #3 from iii at gcc dot gnu.org ---
Author: iii
Date: Tue Feb 12 14:51:39 2019
New Revision: 268798

URL: https://gcc.gnu.org/viewcvs?rev=268798&root=gcc&view=rev
Log:
S/390: Reject invalid Q/R/S/T addresses after LRA

The following insn:

(insn (set (reg:DI %r2)
           (sign_extend:DI (mem:SI
            (const:DI (plus:DI (symbol_ref:DI ("*.LC0"))
                               (const_int 16)))))))

is correctly recognized by LRA as RIL alternative of extendsidi2
define_insn.  However, when recognition runs after LRA, it returns RXY
alternative, which is incorrect, since the offset 16 points past the
end of the *.LC0 literal pool entry.  Such addresses are normally
rejected by s390_decompose_address ().

This inconsistency confuses annotate_constant_pool_refs: the selected
alternative makes it proceed with annotation, only to find that the
annotated address is invalid, causing ICE.

This patch fixes the root cause, namely, that s390_check_qrst_address ()
behaves differently during and after LRA.

gcc/ChangeLog:

2019-02-12  Ilya Leoshkevich  

PR target/89233
* config/s390/s390.c (s390_decompose_address): Update comment.
(s390_check_qrst_address): Reject invalid address forms after
LRA.

gcc/testsuite/ChangeLog:

2019-02-12  Ilya Leoshkevich  

PR target/89233
* gcc.target/s390/pr89233.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/s390/pr89233.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.c
trunk/gcc/testsuite/ChangeLog

[Bug target/80080] S390: Issues with emitted cs-instructions for __atomic builtins.

2018-09-24 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080

--- Comment #15 from iii at gcc dot gnu.org ---
Author: iii
Date: Mon Sep 24 14:21:03 2018
New Revision: 264535

URL: https://gcc.gnu.org/viewcvs?rev=264535&root=gcc&view=rev
Log:
S/390: Fix conditional returns on z196+

S/390 epilogue ends with (parallel [(return) (use %r14)]) instead of
the more usual (return) or (simple_return).  This sequence is not
recognized by the conditional return logic in try_optimize_cfg ().

This was introduced for processors older than z196, where it is
sometimes profitable to use call-clobbered register for returning
instead of %r14.  On newer processors we always return via %r14,
for which the fact that it's used is already reflected by
EPILOGUE_USES.  In this case a simple (return) suffices.

This patch changes return_use () to emit simple (return)s when
returning via %r14.  The resulting sequences are recognized by the
conditional return logic in try_optimize_cfg ().

gcc/ChangeLog:

2018-09-24  Ilya Leoshkevich  

PR target/80080
* config/s390/s390.c (s390_emit_epilogue): Do not use PARALLEL
RETURN+USE when returning via %r14.

gcc/testsuite/ChangeLog:

2018-09-24  Ilya Leoshkevich  

PR target/80080
* gcc.target/s390/risbg-ll-3.c: Expect conditional returns.
* gcc.target/s390/zvector/vec-cmp-2.c: Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/s390/risbg-ll-3.c
trunk/gcc/testsuite/gcc.target/s390/zvector/vec-cmp-2.c

[Bug bootstrap/87417] [9 regression] Internal error: abort in attr_alt_intersection, at genattrtab.c:2357

2018-09-24 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87417

--- Comment #8 from iii at gcc dot gnu.org ---
Author: iii
Date: Tue Sep 25 06:38:20 2018
New Revision: 264556

URL: https://gcc.gnu.org/viewcvs?rev=264556&root=gcc&view=rev
Log:
Fix EQ_ATTR_ALT size calculation (PR bootstrap/87417)

"r264537: Change EQ_ATTR_ALT to support up to 64 alternatives" changed
the format of EQ_ATTR_ALT from ii to ww.  This broke the bootstrap on
32-bit systems, because the formula for rtx_code_size assumed that only
certain codes contain HOST_WIDE_INTs.  This did not surface on 64-bit
systems, because rtunion is 8 bytes anyway, but on 32-bit systems it's
only 4 bytes.  This resulted in out-of-bounds writes and memory
corruptions in genattrtab.
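
The size mismatch can be worked through with a small Python sketch.
It is a model under assumptions: the RTX header is ignored,
HOST_WIDE_INT is taken to be 8 bytes, and allocated (), required () and
shortfall () are hypothetical helpers, not functions from rtl.c.

```python
HWINT_SIZE = 8  # HOST_WIDE_INT is assumed to be 64 bits wide

def allocated(fmt, rtunion_size):
    """Old assumption: every operand fits in one rtunion slot."""
    return len(fmt) * rtunion_size

def required(fmt, rtunion_size):
    """'w' operands hold a HOST_WIDE_INT, which may be wider than rtunion."""
    return sum(max(HWINT_SIZE, rtunion_size) if ch == "w" else rtunion_size
               for ch in fmt)

def shortfall(fmt, rtunion_size):
    """Bytes written past the end of the allocation."""
    return required(fmt, rtunion_size) - allocated(fmt, rtunion_size)

print(shortfall("ww", 8))  # 64-bit host
print(shortfall("ww", 4))  # 32-bit host
print(shortfall("ii", 4))  # old "ii" format on a 32-bit host
```

In this model only the "ww" format on a host with 4-byte rtunion comes
up short, which matches the description of a 32-bit-only failure.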

gcc/ChangeLog:

2018-09-25  Ilya Leoshkevich  

PR bootstrap/87417
* rtl.c (rtx_code_size): Take into account that EQ_ATTR_ALT
contains HOST_WIDE_INTs when computing its size.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/rtl.c

[Bug rtl-optimization/87596] [9 Regression] ICE: Segmentation fault (in spill_hard_reg_in_range)

2018-10-19 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87596

--- Comment #6 from iii at gcc dot gnu.org ---
Author: iii
Date: Fri Oct 19 08:33:52 2018
New Revision: 265306

URL: https://gcc.gnu.org/viewcvs?rev=265306&root=gcc&view=rev
Log:
lra: fix spill_hard_reg_in_range clobber check

FROM..TO range might contain NOTE_INSN_DELETED insns, for which the
corresponding entries in lra_insn_recog_data[] are NULLs.  Example from
the problematic code from PR87596:

(note 148 154 68 7 NOTE_INSN_DELETED)

lra_insn_recog_data[] is used directly only when the insn in question
is taken from insn_bitmap, which is not the case here.  In other
situations lra_get_insn_recog_data () guarded by INSN_P () or another
stricter predicate is used.  So we need to do this here as well.

A tiny detail worth noting: I put the INSN_P () check before the
insn_bitmap check, because I believe that insn_bitmap can contain only
real insns anyway.
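
The guard can be illustrated with a short Python sketch.  The data
layout is hypothetical and only mimics the situation described above: a
note sits in the walked range and has no recog data entry.

```python
# Hypothetical insn stream: (uid, kind).  Notes have no recog data.
insns = [(154, "insn"), (148, "note"), (68, "insn")]
recog_data = {154: {"clobbers": ["r1"]}, 148: None, 68: {"clobbers": []}}

def is_insn(kind):
    """Stand-in for INSN_P (): true for real instructions only."""
    return kind == "insn"

def clobbered_regs(insns, recog_data):
    """Collect clobbered registers, skipping anything that is not an insn."""
    regs = []
    for uid, kind in insns:
        if not is_insn(kind):
            continue  # without this guard, recog_data[uid] is None below
        regs.extend(recog_data[uid]["clobbers"])
    return regs

print(clobbered_regs(insns, recog_data))
```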

gcc/ChangeLog:

2018-10-19  Ilya Leoshkevich  

PR rtl-optimization/87596
* lra-constraints.c (spill_hard_reg_in_range): Use INSN_P () +
lra_get_insn_recog_data () instead of lra_insn_recog_data[]
for instructions in FROM..TO range.

gcc/testsuite/ChangeLog:

2018-10-19  Ilya Leoshkevich  

PR rtl-optimization/87596
* gcc.target/i386/pr87596.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr87596.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/lra-constraints.c
trunk/gcc/testsuite/ChangeLog

[Bug bootstrap/87747] [9 regression] Bootstrap failure if using gcc-4.6 as stage1 compiler

2018-10-25 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87747

--- Comment #3 from iii at gcc dot gnu.org ---
Author: iii
Date: Thu Oct 25 13:47:10 2018
New Revision: 265488

URL: https://gcc.gnu.org/viewcvs?rev=265488&root=gcc&view=rev
Log:
Fix rtx_code_size static initialization order fiasco

r264556 and r264537 changed the format of EQ_ATTR_ALT RTXs to "ww",
which also required adjusting rtx_code_size initializer.  In order to
simplify things, the list of rtx_codes known to use HOST_WIDE_INTs was
replaced by the format string check.  However, unlike the old one, this
new check cannot be always performed at compile time, in which case a
static constructor is generated.  This may lead to a static
initialization order fiasco with respect to other static constructors
in the compiler, in case of PR87747, cselib's pool_allocator.

gcc/ChangeLog:

2018-10-25  Ilya Leoshkevich  

PR bootstrap/87747
* rtl.c (RTX_CODE_HWINT_P_1): New helper macro.
(RTX_CODE_HWINT_P): New macro.
(rtx_code_size): Use RTX_CODE_HWINT_P ().

Modified:
trunk/gcc/ChangeLog
trunk/gcc/rtl.c

[Bug target/87762] [9 Regression] extract_constrain_insn, at recog.c:2206 on s390x

2018-11-06 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87762

--- Comment #3 from iii at gcc dot gnu.org ---
Author: iii
Date: Tue Nov  6 13:20:21 2018
New Revision: 265844

URL: https://gcc.gnu.org/viewcvs?rev=265844&root=gcc&view=rev
Log:
S/390: Introduce relative_long attribute

In order to properly fix PR87762, we need to distinguish between
instructions which support relative addressing and instructions which
don't.  We could check whether the existing "type" attribute is equal to
"larl", but there are notable exceptions (lrl, for example), and
changing them makes scheduling worse on z10.  We could also check
whether the existing "op_type" attribute is equal to "RIL-b" or "RIL-c".
However, adding a new attribute provides more flexibility, since we
don't depend on idiosyncrasies which might be introduced into PoP in the
future.

gcc/ChangeLog:

2018-11-06  Ilya Leoshkevich  

PR target/87762
* config/s390/s390.md: Add relative_long attribute.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.md

[Bug target/87762] [9 Regression] extract_constrain_insn, at recog.c:2206 on s390x

2018-11-09 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87762

--- Comment #4 from iii at gcc dot gnu.org ---
Author: iii
Date: Fri Nov  9 20:33:19 2018
New Revision: 265991

URL: https://gcc.gnu.org/viewcvs?rev=265991&root=gcc&view=rev
Log:
S/390: Allow relative addressing of literal pool entries

r265490 allowed the compiler to choose in a more flexible way whether to
use load or load-address-relative-long (LARL) instruction.  When it
chose LARL for literal pool references, those references were rewritten
by pass_s390_early_mach to use UNSPEC_LTREF, which assumes base register
usage, which in turn is not compatible with LARL.  The end result was an
ICE because of unrecognizable insn.

UNSPEC_LTREF and friends are necessary in order to communicate the
dependency on the base register to pass_sched2.  When relative
addressing is used, no base register is necessary, so in such cases the
rewrite must be avoided.

gcc/ChangeLog:

2018-11-09  Ilya Leoshkevich  

PR target/87762
* config/s390/s390.c (s390_safe_relative_long_p): New function.
(annotate_constant_pool_refs): Skip insns which support
relative addressing.
(annotate_constant_pool_refs_1): New helper function.
(find_constant_pool_ref): Skip insns which support relative
addressing.
(find_constant_pool_ref_1): New helper function.
(replace_constant_pool_ref): Skip insns which support
relative addressing.
(replace_constant_pool_ref_1): New helper function.
(s390_mainpool_start): Adapt to the new signature.
(s390_mainpool_finish): Likewise.
(s390_chunkify_start): Likewise.
(s390_chunkify_finish): Likewise.
(pass_s390_early_mach::execute): Likewise.
(s390_prologue_plus_offset): Likewise.
(s390_emit_prologue): Likewise.
(s390_emit_epilogue): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.c

[Bug target/88083] ICE in find_constant_pool_ref_1, at config/s390/s390.c:8231

2018-11-20 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88083

--- Comment #1 from iii at gcc dot gnu.org ---
Author: iii
Date: Tue Nov 20 09:32:49 2018
New Revision: 266306

URL: https://gcc.gnu.org/viewcvs?rev=266306&root=gcc&view=rev
Log:
S/390: Skip LT(G) peephole when literal pool is involved

By the time peephole optimizations run, we've already made up our mind
whether to use base-register or relative addressing for literal pool
entries.  LT(G) supports only base-register addressing, and so it is
too late to convert L(G)RL + compare to LT(G).  This change should not
make the code worse unless building with e.g. -fno-dce, since comparing
literal pool entries to zero should be optimized away during earlier
passes.

gcc/ChangeLog:

2018-11-20  Ilya Leoshkevich  

PR target/88083
* config/s390/s390.md: Skip LT(G) peephole when literal pool is
involved.
* rtl.h (contains_constant_pool_address_p): New function.
* rtlanal.c (contains_constant_pool_address_p): Likewise.

gcc/testsuite/ChangeLog:

2018-11-20  Ilya Leoshkevich  

PR target/88083
* gcc.target/s390/pr88083.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/s390/pr88083.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/s390/s390.md
trunk/gcc/rtl.h
trunk/gcc/rtlanal.c
trunk/gcc/testsuite/ChangeLog

[Bug target/80080] S390: Isses with emitted cs-instructions for __atomic builtins.

2018-12-03 Thread iii at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080

--- Comment #16 from iii at gcc dot gnu.org ---
Author: iii
Date: Mon Dec  3 09:49:02 2018
New Revision: 266734

URL: https://gcc.gnu.org/viewcvs?rev=266734&root=gcc&view=rev
Log:
Repeat jump threading after combine

Consider the following RTL:

(insn (set (reg 65) (if_then_else (eq %cc 0) 1 0)))
(insn (parallel [(set %cc (compare (reg 65) 0)) (clobber %scratch)]))
(jump_insn (set %pc (if_then_else (ne %cc 0) (label_ref 23) %pc)))

Combine simplifies this into:

(note NOTE_INSN_DELETED)
(note NOTE_INSN_DELETED)
(jump_insn (set %pc (if_then_else (eq %cc 0) (label_ref 23) %pc)))

opening up the possibility to perform jump threading.
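
A source-level analogy of the simplification above, as a rough sketch:
the function names are hypothetical, and the plain int parameter only
stands in for the %cc condition code register.

```cpp
#include <cassert>

// Before combine: the condition is first materialized as 0/1 in a
// pseudo register and then compared against zero again.
bool branch_taken_before (int cc)
{
  int reg65 = (cc == 0) ? 1 : 0;  // (set (reg 65) (if_then_else ...))
  int cc2 = reg65;                // (set %cc (compare (reg 65) 0))
  return cc2 != 0;                // jump if (ne %cc 0)
}

// After combine: the jump tests the original condition directly.
bool branch_taken_after (int cc)
{
  return cc == 0;                 // jump if (eq %cc 0)
}
```

Both functions take the branch for exactly the same inputs.  Once the
jump depends directly on the original condition, a preceding branch on
the same condition makes its outcome known, which is what jump threading
can exploit.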

gcc/ChangeLog:

2018-12-03  Ilya Leoshkevich  

PR target/80080
* cfgcleanup.c (class pass_postreload_jump): New pass.
(pass_postreload_jump::execute): Likewise.
(make_pass_postreload_jump): Likewise.
* passes.def: Add pass_postreload_jump before
pass_postreload_cse.
* tree-pass.h (make_pass_postreload_jump): New pass.

gcc/testsuite/ChangeLog:

2018-12-03  Ilya Leoshkevich  

PR target/80080
* gcc.target/s390/pr80080-4.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/s390/pr80080-4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/cfgcleanup.c
trunk/gcc/passes.def
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-pass.h