[PATCH v6 3/3] PR80791 Consider doloop cmp use in ivopts
Hi!

Compared to the previous versions of the implementation, which were mainly
based on the existing IV cands but zeroed the related group/use cost, this
new one follows Richard and Segher's suggestion and introduces one dedicated
doloop IV cand. Some key points are listed below:

1) New field doloop_p in struct iv_cand to indicate a doloop-dedicated
   IV cand.
2) Special name "doloop" assigned.
3) Doloop IV cand with form (niter+1, +, -1).
4) For the doloop IV cand, there is no extra cost of one as for a BIV;
   assign zero cost for its step.
5) Support may_be_zero (the regressed PR is in this case): the base of the
   doloop IV can be a COND_EXPR, so add handling in cand_value_at and
   may_eliminate_iv.
6) Add more expr support in force_expr_to_var_cost for reasonable cost
   calculation on an IV base with may_be_zero (like COND_EXPR).
7) Set zero cost when using the doloop IV cand for the doloop use.
8) Add three hooks (should we merge _generic and _address?):
   *) have_count_reg_decr_p indicates that the target has a special
      hardware count register; if so, we shouldn't consider the impact of
      the doloop IV when calculating register pressure.
   *) doloop_cost_for_generic is the extra cost when using the doloop IV
      cand for a generic-type IV use.
   *) doloop_cost_for_address is the extra cost when using the doloop IV
      cand for an address-type IV use.

Bootstrapped on powerpc64le-linux-gnu and regression testing passed, except
for one failure on gcc/testsuite/gcc.dg/guality/loop-1.c at -O3, which is
tracked by PR89983.

Any comments and suggestions are highly appreciated. Thanks!

Kewen

-

gcc/ChangeLog

2019-08-14  Kewen Lin

	PR middle-end/80791
	* config/rs6000/rs6000.c (TARGET_HAVE_COUNT_REG_DECR_P): New macro.
	(TARGET_DOLOOP_COST_FOR_GENERIC): Likewise.
	(TARGET_DOLOOP_COST_FOR_ADDRESS): Likewise.
	* target.def (have_count_reg_decr_p): New hook.
	(doloop_cost_for_generic): Likewise.
	(doloop_cost_for_address): Likewise.
	* doc/tm.texi.in (TARGET_HAVE_COUNT_REG_DECR_P): Likewise.
	(TARGET_DOLOOP_COST_FOR_GENERIC): Likewise.
	(TARGET_DOLOOP_COST_FOR_ADDRESS): Likewise.
	* doc/tm.texi: Regenerate.
	* tree-ssa-loop-ivopts.c (comp_cost::operator+=): Consider infinite
	cost addend.
	(record_group): Init doloop_p.
	(add_candidate_1): Add optional argument doloop, change the
	handlings accordingly.
	(add_candidate): Likewise.
	(add_iv_candidate_for_biv): Update the call to add_candidate.
	(generic_predict_doloop_p): Update attribute.
	(force_expr_to_var_cost): Add costing for expressions COND_EXPR/
	LT_EXPR/LE_EXPR/GT_EXPR/GE_EXPR/EQ_EXPR/NE_EXPR/UNORDERED_EXPR/
	ORDERED_EXPR/UNLT_EXPR/UNLE_EXPR/UNGT_EXPR/UNGE_EXPR/UNEQ_EXPR/
	LTGT_EXPR/MAX_EXPR/MIN_EXPR.
	(determine_group_iv_cost_generic): Update for doloop IV cand.
	(determine_group_iv_cost_address): Likewise.
	(determine_group_iv_cost_cond): Likewise.
	(determine_iv_cost): Likewise.
	(ivopts_estimate_reg_pressure): Likewise.
	(cand_value_at): Update argument niter type to struct
	tree_niter_desc*, consider doloop IV cand and may_be_zero.
	(may_eliminate_iv): Update the call to cand_value_at, consider
	doloop IV cand and may_be_zero.
	(add_iv_candidate_for_doloop): New function.
	(find_iv_candidates): Call function add_iv_candidate_for_doloop.
	(determine_set_costs): Update the call to
	ivopts_estimate_reg_pressure.
	(iv_ca_recount_cost): Likewise.
	(iv_ca_new): Init n_doloop_cands.
	(iv_ca_set_no_cp): Update n_doloop_cands.
	(iv_ca_set_cp): Likewise.
	(iv_ca_dump): Dump register cost.
	(find_doloop_use): Likewise.
	(tree_ssa_iv_optimize_loop): Call function generic_predict_doloop_p
	and find_doloop_use.
gcc/testsuite/ChangeLog

2019-08-14  Kewen Lin

	PR middle-end/80791
	* gcc.dg/tree-ssa/ivopts-3.c: Adjust for doloop change.
	* gcc.dg/tree-ssa/ivopts-lt.c: Likewise.
	* gcc.dg/tree-ssa/pr32044.c: Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6667cd0..5eccbdc 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1912,6 +1912,16 @@ static const struct attribute_spec rs6000_attribute_table[] =
 #undef TARGET_PREDICT_DOLOOP_P
 #define TARGET_PREDICT_DOLOOP_P rs6000_predict_doloop_p
 
+#undef TARGET_HAVE_COUNT_REG_DECR_P
+#define TARGET_HAVE_COUNT_REG_DECR_P true
+
+/* 10 is infinite cost in IVOPTs.  */
+#undef TARGET_DOLOOP_COST_FOR_GENERIC
+#define TARGET_DOLOOP_COST_FOR_GENERIC 10
+
+#undef TARGET_DOLOOP_COST_FOR_ADDRESS
+#define TARGET_DOLOOP_COST_FOR_ADDRESS 10
+
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV rs6000_atomic_assign_expand_fenv
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c2aa4d0..9f3a08a 100644
--- a/g
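
As a concrete illustration of the kind of loop this series targets (a
sketch, not part of the patch): on Power, a counted loop like the one
below is a doloop candidate, and with the dedicated IV cand ivopts can
let the loop counter live in the hardware count register (bdnz) instead
of keeping a separate GPR index IV. The may_be_zero case in point 5)
arises when the trip count can be zero, making the doloop IV base a
COND_EXPR.

    /* Compile with, e.g., gcc -O2 -mcpu=power9 -fdump-tree-ivopts on
       powerpc64le; the ivopts dump should show the "doloop" candidate.  */
    unsigned int
    sum (const unsigned int *a, unsigned int n)
    {
      unsigned int s = 0;
      /* N may be zero here, so the doloop IV base becomes a COND_EXPR
         (the regressed may_be_zero case this version handles).  */
      for (unsigned int i = 0; i < n; i++)
        s += a[i];
      return s;
    }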
RE: Add TIGERLAKE and COOPERLAKE to GCC
Resend this mail since GCC Patches rejected my message, thanks.

-Original Message-

Hi Uros and all:

This patch adds TIGERLAKE and COOPERLAKE to GCC. TIGERLAKE is based on
ICELAKE_CLIENT and adds the new ISAs MOVDIRI/MOVDIR64B/AVX512VP2INTERSECT.
COOPERLAKE is based on CASCADELAKE and adds the new ISA AVX512BF16.

Bootstrap is OK, and there are no regressions for the i386/x86-64
testsuite.

Changelog:

gcc/
	* common/config/i386/i386-common.c (processor_names): Add tigerlake
	and cooperlake.
	(processor_alias_table): Add tigerlake and cooperlake.
	* config.gcc: Add -march=tigerlake and cooperlake.
	* config/i386/driver-i386.c (host_detect_local_cpu): Detect tigerlake
	and cooperlake.
	* config/i386/i386-builtins.c (processor_model): Add
	M_INTEL_COREI7_TIGERLAKE and M_INTEL_COREI7_COOPERLAKE.
	(arch_names_table): Add tigerlake and cooperlake.
	(get_builtin_code_for_version): Handle PROCESSOR_TIGERLAKE and
	PROCESSOR_COOPERLAKE.
	* config/i386/i386-c.c (ix86_target_macros_internal): Handle
	tigerlake and cooperlake.
	(ix86_target_macros_internal): Handle
	OPTION_MASK_ISA_AVX512VP2INTERSECT.
	* config/i386/i386-options.c (m_TIGERLAKE): Define.
	(m_COOPERLAKE): Ditto.
	(m_CORE_AVX512): Ditto.
	(processor_cost_table): Add cascadelake.
	(ix86_target_string): Handle -mavx512vp2intersect.
	(ix86_valid_target_attribute_inner_p): Handle avx512vp2intersect.
	(ix86_option_override_internal): Handle PTA_SHSTK, PTA_MOVDIRI,
	PTA_MOVDIR64B, PTA_AVX512VP2INTERSECT.
	* config/i386/i386.h (ix86_size_cost): Define TARGET_TIGERLAKE
	and TARGET_COOPERLAKE.
	(processor_type): Add PROCESSOR_TIGERLAKE and PROCESSOR_COOPERLAKE.
	(PTA_SHSTK): Define.
	(PTA_MOVDIRI): Ditto.
	(PTA_MOVDIR64B): Ditto.
	(PTA_COOPERLAKE): Ditto.
	(PTA_TIGERLAKE): Ditto.
	(TARGET_AVX512VP2INTERSECT): Ditto.
	(TARGET_AVX512VP2INTERSECT_P(x)): Ditto.
	(processor_type): Add PROCESSOR_TIGERLAKE and PROCESSOR_COOPERLAKE.
	* doc/extend.texi: Add tigerlake and cooperlake.

gcc/testsuite/
	* gcc.target/i386/funcspec-56.inc: Handle new march.
	* g++.target/i386/mv16.C: Handle new march.

libgcc/
	* config/i386/cpuinfo.h: Add INTEL_COREI7_TIGERLAKE and
	INTEL_COREI7_COOPERLAKE.
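
A usage sketch (assuming a compiler with this patch applied): the new CPU
names become valid both on the command line (-march=tigerlake,
-march=cooperlake) and in runtime dispatch, since the patch wires them
into __builtin_cpu_is via arch_names_table and cpuinfo.h. The function
names below are hypothetical, for illustration only.

    __attribute__ ((target ("arch=tigerlake")))
    static int impl_tgl (void) { return 1; }  /* May use the new ISAs.  */

    static int impl_generic (void) { return 0; }

    int
    dispatch (void)
    {
      __builtin_cpu_init ();
      if (__builtin_cpu_is ("tigerlake"))  /* Enabled by the libgcc change.  */
        return impl_tgl ();
      return impl_generic ();
    }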
Re: Canonicalization of compares performed as side-effect operations
[Sorry for the delay, I missed your question...]

> Interesting. Does it work for the general case of a reverse subtract,
> which I need to handle as well?

Not clear, Visium only uses it for SNE and combined NEG/SNE.

-- 
Eric Botcazou
[committed][AArch64] Rework SVE PTEST patterns
This patch reworks the rtl representation of the SVE PTEST operation so that: - the governing predicate is always VNx16BI (and so all bits are defined) - it is still possible to pattern-match the governing predicate in the mode that it had previously - a new hint operand says whether the governing predicate is known to be all true for the element size of interest, rather than this being part of the unspec name. These changes make it easier to handle more flag-setting instructions as part of the ACLE work. See the comment in aarch64-sve.md for more details. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274414. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/aarch64-protos.h (aarch64_ptrue_all): Declare. * config/aarch64/aarch64.c (aarch64_ptrue_all): New function. * config/aarch64/aarch64.md (UNSPEC_PTEST_PTRUE): Delete. (UNSPEC_PTEST): New unspec. (SVE_MAYBE_NOT_PTRUE, SVE_KNOWN_PTRUE): New constants. * config/aarch64/iterators.md (data_bytes): New mode attribute. * config/aarch64/predicates.md (aarch64_sve_ptrue_flag): New predicate. * config/aarch64/aarch64-sve.md: Add a new section describing the handling of UNSPEC_PTEST. (pred_3): Rename to... (@aarch64_pred__z): ...this. (ptest_ptrue): Replace with... (aarch64_ptest): ...this new pattern. (cbranch4): Update after above changes. (*3_cc): Use UNSPEC_PTEST instead of UNSPEC_PTEST_PTRUE. (*cmp_cc): Likewise. (*cmp_ptest): Likewise. (*while_ult_cc): Likewise. Index: gcc/config/aarch64/aarch64-protos.h === --- gcc/config/aarch64/aarch64-protos.h 2019-08-13 22:33:36.213955216 +0100 +++ gcc/config/aarch64/aarch64-protos.h 2019-08-14 08:56:12.498608977 +0100 @@ -550,6 +550,7 @@ const char * aarch64_output_probe_stack_ const char * aarch64_output_probe_sve_stack_clash (rtx, rtx, rtx, rtx); void aarch64_err_no_fpadvsimd (machine_mode); void aarch64_expand_epilogue (bool); +rtx aarch64_ptrue_all (unsigned int); void aarch64_expand_mov_immediate (rtx, rtx); rtx aarch64_ptrue_reg (machine_mode); rtx aarch64_pfalse_reg (machine_mode); Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c2019-08-13 22:35:11.717252343 +0100 +++ gcc/config/aarch64/aarch64.c2019-08-14 08:56:12.502608946 +0100 @@ -2699,6 +2699,22 @@ aarch64_svpattern_for_vl (machine_mode p return AARCH64_NUM_SVPATTERNS; } +/* Return a VNx16BImode constant in which every sequence of ELT_SIZE + bits has the lowest bit set and the upper bits clear. This is the + VNx16BImode equivalent of a PTRUE for controlling elements of + ELT_SIZE bytes. However, because the constant is VNx16BImode, + all bits are significant, even the upper zeros. */ + +rtx +aarch64_ptrue_all (unsigned int elt_size) +{ + rtx_vector_builder builder (VNx16BImode, elt_size, 1); + builder.quick_push (const1_rtx); + for (unsigned int i = 1; i < elt_size; ++i) +builder.quick_push (const0_rtx); + return builder.build (); +} + /* Return an all-true predicate register of mode MODE. */ rtx Index: gcc/config/aarch64/aarch64.md === --- gcc/config/aarch64/aarch64.md 2019-08-13 22:33:30.365998256 +0100 +++ gcc/config/aarch64/aarch64.md 2019-08-14 08:56:12.502608946 +0100 @@ -220,7 +220,7 @@ (define_c_enum "unspec" [ UNSPEC_LD1_GATHER UNSPEC_ST1_SCATTER UNSPEC_MERGE_PTRUE -UNSPEC_PTEST_PTRUE +UNSPEC_PTEST UNSPEC_UNPACKSHI UNSPEC_UNPACKUHI UNSPEC_UNPACKSLO @@ -259,6 +259,15 @@ (define_c_enum "unspecv" [ ] ) +;; These constants are used as a const_int in various SVE unspecs +;; to indicate whether the governing predicate is known to be a PTRUE. 
+(define_constants + [; Indicates that the predicate might not be a PTRUE. + (SVE_MAYBE_NOT_PTRUE 0) + + ; Indicates that the predicate is known to be a PTRUE. + (SVE_KNOWN_PTRUE 1)]) + ;; If further include files are added the defintion of MD_INCLUDES ;; must be updated. Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64/iterators.md 2019-08-13 10:38:35.963894971 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 08:56:12.502608946 +0100 @@ -1169,6 +1169,10 @@ (define_mode_attr FCMLA_maybe_lane [(V2S (V4HF "[%4]") (V8HF "[%4]") ]) +;; The number of bytes controlled by a predicate +(define_mode_attr data_bytes [(VNx16BI "1") (VNx8BI "2") + (VNx4BI "4") (VNx2BI "8")]) + ;; --- ;; Code Iterators ;;
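
To see where these PTEST patterns fire (a sketch, assuming a toolchain
built from this revision): the loop-control sequence of a fully-masked
SVE loop ends in a flag-setting test of the governing predicate, which is
the *while_ult_cc pattern reworked above.

    /* Compile with gcc -O3 -march=armv8.2-a+sve; the loop latch becomes a
       flag-setting WHILELO feeding a B.ANY/B.NONE branch -- the
       PTEST-style pattern this patch rewrites to UNSPEC_PTEST.  */
    void
    scale (double *x, long n)
    {
      for (long i = 0; i < n; i++)
        x[i] *= 2.0;
    }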
[committed][AArch64] Canonicalise SVE predicate constants
This patch makes sure that we build all SVE predicate constants as VNx16BI before RA, to encourage similar constants to be reused between modes. This is also useful for the ACLE, where the single predicate type svbool_t is always a VNx16BI. Also, and again to encourage reuse, the patch makes us use a .B PTRUE for all ptrue-predicated operations, rather than (for example) using a .S PTRUE for 32-bit operations and a .D PTRUE for 64-bit operations. The only current case in which a .H, .S or .D operation needs to be predicated by a "strict" .H/.S/.D PTRUE is the PTEST in a conditional branch, which an earlier patch fixed to use an appropriate VNx16BI constant. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274415. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/aarch64.c (aarch64_target_reg): New function. (aarch64_emit_set_immediate): Likewise. (aarch64_ptrue_reg): Build a VNx16BI constant and then bitcast it. (aarch64_pfalse_reg): Likewise. (aarch64_convert_sve_data_to_pred): New function. (aarch64_sve_move_pred_via_while): Take an optional target register and the required register mode. (aarch64_expand_sve_const_pred_1): New function. (aarch64_expand_sve_const_pred): Likewise. (aarch64_expand_mov_immediate): Build an all-true predicate if the significant bits of the immediate are all true. Use aarch64_expand_sve_const_pred for all compile-time predicate constants. (aarch64_mov_operand_p): Force predicate constants to be VNx16BI before register allocation. * config/aarch64/aarch64-sve.md (*vec_duplicate_reg): Use a VNx16BI PTRUE when splitting the memory alternative. (vec_duplicate): Update accordingly. (*pred_cmp): Rename to... (@aarch64_pred_cmp): ...this. gcc/testsuite/ * gcc.target/aarch64/sve/spill_4.c: Expect all ptrues to be .Bs. * gcc.target/aarch64/sve/single_1.c: Likewise. * gcc.target/aarch64/sve/single_2.c: Likewise. * gcc.target/aarch64/sve/single_3.c: Likewise. * gcc.target/aarch64/sve/single_4.c: Likewise. Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c2019-08-14 08:58:06.353767448 +0100 +++ gcc/config/aarch64/aarch64.c2019-08-14 09:00:55.960509992 +0100 @@ -2546,6 +2546,36 @@ aarch64_zero_extend_const_eq (machine_mo } +/* Return TARGET if it is nonnull and a register of mode MODE. + Otherwise, return a fresh register of mode MODE if we can, + or TARGET reinterpreted as MODE if we can't. */ + +static rtx +aarch64_target_reg (rtx target, machine_mode mode) +{ + if (target && REG_P (target) && GET_MODE (target) == mode) +return target; + if (!can_create_pseudo_p ()) +{ + gcc_assert (target); + return gen_lowpart (mode, target); +} + return gen_reg_rtx (mode); +} + +/* Return a register that contains the constant in BUILDER, given that + the constant is a legitimate move operand. Use TARGET as the register + if it is nonnull and convenient. */ + +static rtx +aarch64_emit_set_immediate (rtx target, rtx_vector_builder &builder) +{ + rtx src = builder.build (); + target = aarch64_target_reg (target, GET_MODE (src)); + emit_insn (gen_rtx_SET (target, src)); + return target; +} + static rtx aarch64_force_temporary (machine_mode mode, rtx x, rtx value) { @@ -2721,7 +2751,8 @@ aarch64_ptrue_all (unsigned int elt_size aarch64_ptrue_reg (machine_mode mode) { gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL); - return force_reg (mode, CONSTM1_RTX (mode)); + rtx reg = force_reg (VNx16BImode, CONSTM1_RTX (VNx16BImode)); + return gen_lowpart (mode, reg); } /* Return an all-false predicate register of mode MODE. 
*/ @@ -2730,7 +2761,26 @@ aarch64_ptrue_reg (machine_mode mode) aarch64_pfalse_reg (machine_mode mode) { gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL); - return force_reg (mode, CONST0_RTX (mode)); + rtx reg = force_reg (VNx16BImode, CONST0_RTX (VNx16BImode)); + return gen_lowpart (mode, reg); +} + +/* Use a comparison to convert integer vector SRC into MODE, which is + the corresponding SVE predicate mode. Use TARGET for the result + if it's nonnull and convenient. */ + +static rtx +aarch64_convert_sve_data_to_pred (rtx target, machine_mode mode, rtx src) +{ + machine_mode src_mode = GET_MODE (src); + insn_code icode = code_for_aarch64_pred_cmp (NE, src_mode); + expand_operand ops[4]; + create_output_operand (&ops[0], target, mode); + create_input_operand (&ops[1], CONSTM1_RTX (mode), mode); + create_input_operand (&ops[2], src, src_mode); + create_input_operand (&ops[3], CONST0_RTX (src_mode), src_mode); + expand_insn (icode, 4, ops); + return ops[0].value; } /* Return true if we can move VALUE into a register using a s
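
A sketch of the reuse this enables (exact code generation depends on the
vectorizer): both loops below need an all-true governing predicate for
their FP arithmetic; building every predicate constant as VNx16BI and
printing a .B PTRUE lets the two share a single register.

    /* With gcc -O3 -march=armv8.2-a+sve, one "ptrue pN.b, all" can now
       serve both element sizes.  */
    void
    bump (float *a, double *b, int n)
    {
      for (int i = 0; i < n; i++)
        a[i] += 1.0f;   /* .S-sized arithmetic.  */
      for (int i = 0; i < n; i++)
        b[i] += 1.0;    /* .D-sized arithmetic.  */
    }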
[committed][AArch64] Don't rely on REG_EQUAL notes to combine SVE BIC
This patch generalises the SVE BIC pattern so that it doesn't rely on REG_EQUAL notes. The danger with relying on the notes is that an optimisation could for example replace the original (not ...) note with an (unspec ... UNSPEC_MERGE_PTRUE) in which the predicate is a constant. That's a legitimate change and could even be useful in some situations. The patch also makes the operand order match the SVE operand order in both the vector and predicate BIC patterns, which makes things easier for the ACLE. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274416. Richard 2019-08-14 Richard Sandiford Kugan Vivekanandarajah gcc/ * config/aarch64/aarch64-sve.md (bic3): Rename to... (*bic3): ...this. Match the form that an SVE inverse actually has, rather than relying on REG_EQUAL notes. Make the insn operand order match the SVE operand order. (*3): Make the insn operand order match the SVE operand order. Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:03:20.515438326 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:05:41.902390293 +0100 @@ -1779,15 +1779,20 @@ (define_insn "3" ;; - BIC ;; - -;; REG_EQUAL notes on "not3" should ensure that we can generate -;; this pattern even though the NOT instruction itself is predicated. -(define_insn "bic3" +(define_insn_and_rewrite "*bic3" [(set (match_operand:SVE_I 0 "register_operand" "=w") (and:SVE_I - (not:SVE_I (match_operand:SVE_I 1 "register_operand" "w")) - (match_operand:SVE_I 2 "register_operand" "w")))] + (unspec:SVE_I + [(match_operand 3) +(not:SVE_I (match_operand:SVE_I 2 "register_operand" "w"))] + UNSPEC_MERGE_PTRUE) + (match_operand:SVE_I 1 "register_operand" "w")))] "TARGET_SVE" - "bic\t%0.d, %2.d, %1.d" + "bic\t%0.d, %1.d, %2.d" + "&& !CONSTANT_P (operands[3])" + { +operands[3] = CONSTM1_RTX (mode); + } ) ;; - @@ -2451,11 +2456,11 @@ (define_insn "*3" [(set (match_operand:PRED_ALL 0 "register_operand" "=Upa") (and:PRED_ALL (NLOGICAL:PRED_ALL - (not:PRED_ALL (match_operand:PRED_ALL 2 "register_operand" "Upa")) - (match_operand:PRED_ALL 3 "register_operand" "Upa")) + (not:PRED_ALL (match_operand:PRED_ALL 3 "register_operand" "Upa")) + (match_operand:PRED_ALL 2 "register_operand" "Upa")) (match_operand:PRED_ALL 1 "register_operand" "Upa")))] "TARGET_SVE" - "\t%0.b, %1/z, %3.b, %2.b" + "\t%0.b, %1/z, %2.b, %3.b" ) ;; -
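
For reference, a sketch of source code whose vectorized form should now be
matched by the rewritten pattern regardless of whether a REG_EQUAL note
survives:

    /* gcc -O3 -march=armv8.2-a+sve should emit BIC z_dst, z_val, z_mask
       for the AND-with-inverted-operand in this loop.  */
    void
    mask_out (unsigned int *restrict dst, const unsigned int *restrict val,
              const unsigned int *restrict mask, int n)
    {
      for (int i = 0; i < n; i++)
        dst[i] = val[i] & ~mask[i];
    }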
[committed][AArch64] Use unspecs for remaining SVE FP binary ops
Another patch in the series to make the SVE FP patterns use unspecs, so that they can accurately describe cases in which the predicate isn't a PTRUE. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274417. Richard 2019-08-14 Richard Sandiford Kugan Vivekanandarajah gcc/ * config/aarch64/aarch64-sve.md (add3, *add3) (sub3, *sub3, *fabd3, mul3, *mul3) (div3, *div3): Use SVE_COND_FP_* unspecs instead of rtx codes. (cond_, *cond__2, *cond__3) (*cond__any): Add the predicate to the SVE_COND_FP_* unspecs. Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:08:04.289334990 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:10:48.912115057 +0100 @@ -1963,7 +1963,8 @@ (define_expand "cond_" (unspec:SVE_F [(match_operand: 1 "register_operand") (unspec:SVE_F -[(match_operand:SVE_F 2 "register_operand") +[(match_dup 1) + (match_operand:SVE_F 2 "register_operand") (match_operand:SVE_F 3 "register_operand")] SVE_COND_FP_BINARY) (match_operand:SVE_F 4 "aarch64_simd_reg_or_zero")] @@ -1977,7 +1978,8 @@ (define_insn "*cond__2" (unspec:SVE_F [(match_operand: 1 "register_operand" "Upl, Upl") (unspec:SVE_F -[(match_operand:SVE_F 2 "register_operand" "0, w") +[(match_dup 1) + (match_operand:SVE_F 2 "register_operand" "0, w") (match_operand:SVE_F 3 "register_operand" "w, w")] SVE_COND_FP_BINARY) (match_dup 2)] @@ -1995,7 +1997,8 @@ (define_insn "*cond__3" (unspec:SVE_F [(match_operand: 1 "register_operand" "Upl, Upl") (unspec:SVE_F -[(match_operand:SVE_F 2 "register_operand" "w, w") +[(match_dup 1) + (match_operand:SVE_F 2 "register_operand" "w, w") (match_operand:SVE_F 3 "register_operand" "0, w")] SVE_COND_FP_BINARY) (match_dup 3)] @@ -2013,7 +2016,8 @@ (define_insn_and_rewrite "*cond_< (unspec:SVE_F [(match_operand: 1 "register_operand" "Upl, Upl, Upl, Upl, Upl") (unspec:SVE_F -[(match_operand:SVE_F 2 "register_operand" "0, w, w, w, w") +[(match_dup 1) + (match_operand:SVE_F 2 "register_operand" "0, w, w, w, w") (match_operand:SVE_F 3 "register_operand" "w, 0, w, w, w")] SVE_COND_FP_BINARY) (match_operand:SVE_F 4 "aarch64_simd_reg_or_zero" "Dz, Dz, Dz, 0, w")] @@ -2051,10 +2055,9 @@ (define_expand "add3" [(set (match_operand:SVE_F 0 "register_operand") (unspec:SVE_F [(match_dup 3) - (plus:SVE_F -(match_operand:SVE_F 1 "register_operand") -(match_operand:SVE_F 2 "aarch64_sve_float_arith_with_sub_operand"))] - UNSPEC_MERGE_PTRUE))] + (match_operand:SVE_F 1 "register_operand") + (match_operand:SVE_F 2 "aarch64_sve_float_arith_with_sub_operand")] + UNSPEC_COND_FADD))] "TARGET_SVE" { operands[3] = aarch64_ptrue_reg (mode); @@ -2066,10 +2069,9 @@ (define_insn_and_split "*add3" [(set (match_operand:SVE_F 0 "register_operand" "=w, w, w") (unspec:SVE_F [(match_operand: 1 "register_operand" "Upl, Upl, Upl") - (plus:SVE_F - (match_operand:SVE_F 2 "register_operand" "%0, 0, w") - (match_operand:SVE_F 3 "aarch64_sve_float_arith_with_sub_operand" "vsA, vsN, w"))] - UNSPEC_MERGE_PTRUE))] + (match_operand:SVE_F 2 "register_operand" "%0, 0, w") + (match_operand:SVE_F 3 "aarch64_sve_float_arith_with_sub_operand" "vsA, vsN, w")] + UNSPEC_COND_FADD))] "TARGET_SVE" "@ fadd\t%0., %1/m, %0., #%3 @@ -2098,10 +2100,9 @@ (define_expand "sub3" [(set (match_operand:SVE_F 0 "register_operand") (unspec:SVE_F [(match_dup 3) - (minus:SVE_F -(match_operand:SVE_F 1 "aarch64_sve_float_arith_operand") -(match_operand:SVE_F 2 "register_operand"))] - UNSPEC_MERGE_PTRUE))] + (match_operand:SVE_F 1 "aarch64_sve_float_arith_operand") + (match_operand:SVE_F 2 
"register_operand")] + UNSPEC_COND_FSUB))] "TARGET_SVE" { operands[3] = aarch64_ptrue_reg (mode); @@ -2113,10 +2114,9 @@ (define_insn_and_split "*sub3" [(set (match_operand:SVE_F 0 "register_operand" "=w, w, w, w") (unspec:SVE_F [(match_operand: 1 "register_operand" "Upl, Upl, Upl, Upl") - (minus:SVE_F -(match_operand:SVE_F 2 "aarch64_sve_float_arith_operand" "0, 0, vsA, w") -(match_operand:SVE_F 3 "aarch64_sve_float_arith_with_sub_operand" "vsA, vsN, 0, w"))] - UNSPEC_MERGE_PTRUE))] + (match_operand:SVE_F 2 "a
[committed][AArch64] Add a "GP strictness" operand to SVE FP unspecs
This patch makes the SVE unary, binary and ternary FP unspecs take a new "GP strictness" operand that indicates whether the predicate has to be taken literally, or whether it is valid to make extra lanes active (up to and including using a PTRUE). This again is laying the groundwork for the ACLE patterns, in which the value can depend on the FP command-line flags. At the moment it's only needed for addition, subtraction and multiplication, which have unpredicated forms that can only be used when operating on all lanes is safe. But in future it might be useful for optimising predicate usage. The strict mode requires extra alternatives for addition, subtraction and multiplication, but I've left those for the main ACLE patch. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274418. Richard 2019-08-14 Richard Sandiford Kugan Vivekanandarajah gcc/ * config/aarch64/aarch64.md (SVE_RELAXED_GP, SVE_STRICT_GP): New constants. * config/aarch64/predicates.md (aarch64_sve_gp_strictness): New predicate. * config/aarch64/aarch64-protos.h (aarch64_sve_pred_dominates_p): Declare. * config/aarch64/aarch64.c (aarch64_sve_pred_dominates_p): New function. * config/aarch64/aarch64-sve.md: Add a block comment about the handling of predicated FP operations. (2, add3) (sub3, mul3, div3) (3) (3) (4): Add an SVE_RELAXED_GP operand. (cond_) (cond_): Add an SVE_STRICT_GP operand. (*2) (*cond__2) (*cond__3) (*cond__any) (*fabd3, *div3) (*3) (*4) (*cond__2) (*cond__4) (*cond__any): Match the strictness operands. Use aarch64_sve_pred_dominates_p to check whether the predicate on the conditional operation is suitable for merging. Split patterns into the canonical equal-predicate form. (*add3, *sub3, *mul3): Likewise. Restrict the unpredicated alternatives to SVE_RELAXED_GP. Index: gcc/config/aarch64/aarch64.md === --- gcc/config/aarch64/aarch64.md 2019-08-14 08:58:06.357767418 +0100 +++ gcc/config/aarch64/aarch64.md 2019-08-14 09:13:55.210734712 +0100 @@ -268,6 +268,18 @@ (define_constants ; Indicates that the predicate is known to be a PTRUE. (SVE_KNOWN_PTRUE 1)]) +;; These constants are used as a const_int in predicated SVE FP arithmetic +;; to indicate whether the operation is allowed to make additional lanes +;; active without worrying about the effect on faulting behavior. +(define_constants + [; Indicates either that all lanes are active or that the instruction may + ; operate on inactive inputs even if doing so could induce a fault. + (SVE_RELAXED_GP 0) + + ; Indicates that some lanes might be inactive and that the instruction + ; must not operate on inactive inputs if doing so could induce a fault. + (SVE_STRICT_GP 1)]) + ;; If further include files are added the defintion of MD_INCLUDES ;; must be updated. 
Index: gcc/config/aarch64/predicates.md === --- gcc/config/aarch64/predicates.md2019-08-14 08:58:06.357767418 +0100 +++ gcc/config/aarch64/predicates.md2019-08-14 09:13:55.210734712 +0100 @@ -689,6 +689,11 @@ (define_predicate "aarch64_sve_ptrue_fla (ior (match_test "INTVAL (op) == SVE_MAYBE_NOT_PTRUE") (match_test "INTVAL (op) == SVE_KNOWN_PTRUE" +(define_predicate "aarch64_sve_gp_strictness" + (and (match_code "const_int") + (ior (match_test "INTVAL (op) == SVE_RELAXED_GP") + (match_test "INTVAL (op) == SVE_STRICT_GP" + (define_predicate "aarch64_gather_scale_operand_w" (and (match_code "const_int") (match_test "INTVAL (op) == 1 || INTVAL (op) == 4"))) Index: gcc/config/aarch64/aarch64-protos.h === --- gcc/config/aarch64/aarch64-protos.h 2019-08-14 08:58:06.349767478 +0100 +++ gcc/config/aarch64/aarch64-protos.h 2019-08-14 09:13:55.206734742 +0100 @@ -554,6 +554,7 @@ rtx aarch64_ptrue_all (unsigned int); void aarch64_expand_mov_immediate (rtx, rtx); rtx aarch64_ptrue_reg (machine_mode); rtx aarch64_pfalse_reg (machine_mode); +bool aarch64_sve_pred_dominates_p (rtx *, rtx); void aarch64_emit_sve_pred_move (rtx, rtx, rtx); void aarch64_expand_sve_mem_move (rtx, rtx, machine_mode); bool aarch64_maybe_expand_sve_subreg_move (rtx, rtx); Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c2019-08-14 09:03:20.523438266 +0100 +++ gcc/config/aarch64/aarch64.c2019-08-14 09:13:55.210734712 +0100 @@ -2765,6 +2765,24 @@ aarch64_pfalse_reg (machine_mode mode) return gen_lowpart (mode, reg); } +/* Return true i
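
An illustrative sketch of the relaxed/strict distinction (the precise
flag handling lands with the ACLE patches): under trapping math, the
masked addition below must not be evaluated in inactive lanes, since a
lane holding a signalling value could raise a spurious FP exception; that
is the SVE_STRICT_GP case. With -fno-trapping-math it is safe to widen
the predicate, e.g. to an existing PTRUE (SVE_RELAXED_GP).

    /* Whether the compiler may operate on the inactive lanes of this
       masked add depends on the FP flags; the new operand records that.  */
    void
    guarded_add (float *restrict x, const float *restrict y, int n)
    {
      for (int i = 0; i < n; i++)
        if (y[i] != 0.0f)
          x[i] = x[i] + y[i];
    }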
[committed][AArch64] Commonise some SVE FP patterns
This patch uses a single expander for generic FP binary optabs that map to predicated SVE instructions. This makes them consistent with the associated conditional optabs, which already work this way. The patch also generalises the division handling to be one example of a register-only predicated FP operation. The ACLE patches will add FMULX to the same category. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274419. Richard 2019-08-14 Richard Sandiford Kugan Vivekanandarajah gcc/ * config/aarch64/iterators.md (SVE_COND_FP_BINARY_REG): New int iterator. (sve_pred_fp_rhs1_operand, sve_pred_fp_rhs1_operand): New int attributes. * config/aarch64/aarch64-sve.md (add3, sub3) (mul3, div3) (3): Merge into... (3): ...this new expander. (*div3): Generalize to... (*3): ...this. Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64/iterators.md 2019-08-14 08:58:06.357767418 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 09:18:49.360558297 +0100 @@ -1646,6 +1646,8 @@ (define_int_iterator SVE_COND_FP_BINARY UNSPEC_COND_FMUL UNSPEC_COND_FSUB]) +(define_int_iterator SVE_COND_FP_BINARY_REG [UNSPEC_COND_FDIV]) + ;; Floating-point max/min operations that correspond to optabs, ;; as opposed to those that are internal to the port. (define_int_iterator SVE_COND_FP_MAXMIN_PUBLIC [UNSPEC_COND_FMAXNM @@ -2003,3 +2005,23 @@ (define_int_attr sve_fmad_op [(UNSPEC_CO (UNSPEC_COND_FMLS "fmsb") (UNSPEC_COND_FNMLA "fnmad") (UNSPEC_COND_FNMLS "fnmsb")]) + +;; The predicate to use for the first input operand in a floating-point +;; 3 pattern. +(define_int_attr sve_pred_fp_rhs1_operand + [(UNSPEC_COND_FADD "register_operand") + (UNSPEC_COND_FDIV "register_operand") + (UNSPEC_COND_FMAXNM "register_operand") + (UNSPEC_COND_FMINNM "register_operand") + (UNSPEC_COND_FMUL "register_operand") + (UNSPEC_COND_FSUB "aarch64_sve_float_arith_operand")]) + +;; The predicate to use for the second input operand in a floating-point +;; 3 pattern. +(define_int_attr sve_pred_fp_rhs2_operand + [(UNSPEC_COND_FADD "aarch64_sve_float_arith_with_sub_operand") + (UNSPEC_COND_FDIV "register_operand") + (UNSPEC_COND_FMAXNM "register_operand") + (UNSPEC_COND_FMINNM "register_operand") + (UNSPEC_COND_FMUL "aarch64_sve_float_mul_operand") + (UNSPEC_COND_FSUB "register_operand")]) Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:15:57.613827991 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:18:49.360558297 +0100 @@ -73,7 +73,6 @@ ;; [FP] Subtraction ;; [FP] Absolute difference ;; [FP] Multiplication -;; [FP] Division ;; [FP] Binary logical operations ;; [FP] Sign copying ;; [FP] Maximum and minimum @@ -2037,6 +2036,38 @@ (define_insn "*post_ra_ ;; - FSUBR ;; - +;; Unpredicated floating-point binary operations. +(define_expand "3" + [(set (match_operand:SVE_F 0 "register_operand") + (unspec:SVE_F + [(match_dup 3) + (const_int SVE_RELAXED_GP) + (match_operand:SVE_F 1 "") + (match_operand:SVE_F 2 "")] + SVE_COND_FP_BINARY))] + "TARGET_SVE" + { +operands[3] = aarch64_ptrue_reg (mode); + } +) + +;; Predicated floating-point binary operations that have no immediate forms. 
+(define_insn "*3" + [(set (match_operand:SVE_F 0 "register_operand" "=w, w, ?&w") + (unspec:SVE_F + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") + (match_operand:SI 4 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 2 "register_operand" "0, w, w") + (match_operand:SVE_F 3 "register_operand" "w, 0, w")] + SVE_COND_FP_BINARY_REG))] + "TARGET_SVE" + "@ + \t%0., %1/m, %0., %3. + \t%0., %1/m, %0., %2. + movprfx\t%0, %2\;\t%0., %1/m, %0., %3." + [(set_attr "movprfx" "*,*,yes")] +) + ;; Predicated floating-point operations with merging. (define_expand "cond_" [(set (match_operand:SVE_F 0 "register_operand") @@ -2150,21 +2181,6 @@ (define_insn_and_rewrite "*cond_< ;; - FSUB ;; - -;; Unpredicated floating-point addition. -(define_expand "add3" - [(set (match_operand:SVE_F 0 "register_operand") - (unspec:SVE_F - [(match_dup 3) - (const_int SVE_RELAXED_GP) - (match_operand:SVE_F 1 "register_operand") - (match_operand:SVE_F 2 "aarch64_sve_float_arith_with_sub_operand")] - UNSPEC_COND_FADD))] - "TARGE
Re: [PATCH] Add missing popcount simplifications (PR90693)
On Tue, Aug 13, 2019 at 6:47 PM Andrew Pinski wrote:
>
> On Tue, Aug 13, 2019 at 8:50 AM Wilco Dijkstra wrote:
> >
> > Add simplifications for popcount (x) > 1 to (x & (x-1)) != 0 and
> > popcount (x) == 1 into (x-1) < (x & -x). The patterns only apply to
> > single-use cases and support an optional convert. A microbenchmark
> > shows a speedup of 2-2.5x on both x64 and AArch64.
> >
> > Bootstrap OK, OK for commit?
>
> I think this should be in expand stage where there could be comparison
> of the cost of the RTLs.

I tend to agree here, if only because the "simplified" variants have more
GIMPLE stmts, which means they are not "simpler". In fact I'd argue that
for canonicalization we'd want to have the reverse "simplifications" on
GIMPLE and expansion based on target cost.

Richard.

> The only reason why it is faster for AARCH64 is the requirement of
> moving between the GPRs and the SIMD registers.
>
> Thanks,
> Andrew Pinski
>
> >
> > ChangeLog:
> > 2019-08-13  Wilco Dijkstra
> >
> > gcc/
> > 	PR middle-end/90693
> > 	* match.pd: Add popcount simplifications.
> >
> > testsuite/
> > 	PR middle-end/90693
> > 	* gcc.dg/fold-popcount-5.c: Add new test.
> >
> > ---
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 0317bc704f771f626ab72189b3a54de00087ad5a..bf4351a330f45f3a1424d9792cefc3da6267597d
> > 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -5356,7 +5356,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >    rep (eq eq ne ne)
> >    (simplify
> >     (cmp (popcount @0) integer_zerop)
> > -   (rep @0 { build_zero_cst (TREE_TYPE (@0)); }
> > +   (rep @0 { build_zero_cst (TREE_TYPE (@0)); })))
> > + /* popcount(X) == 1 -> (X-1) < (X & -X). */
> > + (for cmp (eq ne)
> > +  rep (lt ge)
> > +  (simplify
> > +   (cmp (convert? (popcount:s @0)) integer_onep)
> > +   (with {
> > +     tree utype = unsigned_type_for (TREE_TYPE (@0));
> > +     tree a0 = fold_convert (utype, @0); }
> > +    (rep (plus { a0; } { build_minus_one_cst (utype); })
> > +     (bit_and (negate { a0; }) { a0; }
> > + /* popcount(X) > 1 -> (X & (X-1)) != 0. */
> > + (for cmp (gt le)
> > +  rep (ne eq)
> > +  (simplify
> > +   (cmp (convert? (popcount:s @0)) integer_onep)
> > +   (rep (bit_and (plus @0 { build_minus_one_cst (TREE_TYPE (@0)); }) @0)
> > +    { build_zero_cst (TREE_TYPE (@0)); }
> >
> > /* Simplify:
> >
> > diff --git a/gcc/testsuite/gcc.dg/fold-popcount-5.c
> > b/gcc/testsuite/gcc.dg/fold-popcount-5.c
> > new file mode 100644
> > index
> > ..fcf3910587caacb8e39cf437dc3971df892f405a
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/fold-popcount-5.c
> > @@ -0,0 +1,69 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-optimized" } */
> > +
> > +/* Test popcount (x) > 1 -> (x & (x-1)) != 0. */
> > +
> > +int test_1 (long x)
> > +{
> > +  return __builtin_popcountl (x) >= 2;
> > +}
> > +
> > +int test_2 (int x)
> > +{
> > +  return (unsigned) __builtin_popcount (x) <= 1u;
> > +}
> > +
> > +int test_3 (unsigned x)
> > +{
> > +  return __builtin_popcount (x) > 1u;
> > +}
> > +
> > +int test_4 (unsigned long x)
> > +{
> > +  return (unsigned char) __builtin_popcountl (x) > 1;
> > +}
> > +
> > +int test_5 (unsigned long x)
> > +{
> > +  return (signed char) __builtin_popcountl (x) <= (signed char)1;
> > +}
> > +
> > +int test_6 (unsigned long long x)
> > +{
> > +  return 2u <= __builtin_popcountll (x);
> > +}
> > +
> > +/* Test popcount (x) == 1 -> (x-1) < (x & -x). */
> > +
> > +int test_7 (unsigned long x)
> > +{
> > +  return __builtin_popcountl (x) != 1;
> > +}
> > +
> > +int test_8 (long long x)
> > +{
> > +  return (unsigned) __builtin_popcountll (x) == 1u;
> > +}
> > +
> > +int test_9 (int x)
> > +{
> > +  return (unsigned char) __builtin_popcount (x) != 1u;
> > +}
> > +
> > +int test_10 (unsigned x)
> > +{
> > +  return (unsigned char) __builtin_popcount (x) == 1;
> > +}
> > +
> > +int test_11 (long x)
> > +{
> > +  return (signed char) __builtin_popcountl (x) == 1;
> > +}
> > +
> > +int test_12 (long x)
> > +{
> > +  return 1u == __builtin_popcountl (x);
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "popcount" 0 "optimized" } } */
> > +
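
For anyone checking the transforms themselves, a quick brute-force
validation of the two identities under discussion (unsigned arithmetic;
not part of the patch):

    #include <assert.h>

    int
    main (void)
    {
      for (unsigned int x = 0; x < 1000000; x++)
        {
          /* popcount(x) > 1  <->  (x & (x-1)) != 0.  */
          assert ((__builtin_popcount (x) > 1) == ((x & (x - 1)) != 0));
          /* popcount(x) == 1  <->  (x-1) < (x & -x); unsigned wrap-around
             makes x == 0 evaluate to UINT_MAX < 0, i.e. false.  */
          assert ((__builtin_popcount (x) == 1) == ((x - 1) < (x & -x)));
        }
      return 0;
    }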
[committed][AArch64] Add support for SVE HF vconds
We were missing vcond patterns that had HF comparisons and HI or HF data. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274420. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/iterators.md (SVE_HSD): New mode iterator. (V_FP_EQUIV, v_fp_equiv): Handle VNx8HI and VNx8HF. * config/aarch64/aarch64-sve.md (vcond): Use SVE_HSD instead of SVE_SD. gcc/testsuite/ * gcc.target/aarch64/sve/vcond_17.c: New test. * gcc.target/aarch64/sve/vcond_17_run.c: Likewise. Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64/iterators.md 2019-08-14 09:20:57.547610677 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 09:23:37.626427347 +0100 @@ -301,6 +301,9 @@ (define_mode_iterator SVE_HSDI [VNx16QI ;; All SVE floating-point vector modes that have 16-bit or 32-bit elements. (define_mode_iterator SVE_HSF [VNx8HF VNx4SF]) +;; All SVE vector modes that have 16-bit, 32-bit or 64-bit elements. +(define_mode_iterator SVE_HSD [VNx8HI VNx4SI VNx2DI VNx8HF VNx4SF VNx2DF]) + ;; All SVE vector modes that have 32-bit or 64-bit elements. (define_mode_iterator SVE_SD [VNx4SI VNx2DI VNx4SF VNx2DF]) @@ -928,9 +931,11 @@ (define_mode_attr v_int_equiv [(V8QI "v8 ]) ;; Floating-point equivalent of selected modes. -(define_mode_attr V_FP_EQUIV [(VNx4SI "VNx4SF") (VNx4SF "VNx4SF") +(define_mode_attr V_FP_EQUIV [(VNx8HI "VNx8HF") (VNx8HF "VNx8HF") + (VNx4SI "VNx4SF") (VNx4SF "VNx4SF") (VNx2DI "VNx2DF") (VNx2DF "VNx2DF")]) -(define_mode_attr v_fp_equiv [(VNx4SI "vnx4sf") (VNx4SF "vnx4sf") +(define_mode_attr v_fp_equiv [(VNx8HI "vnx8hf") (VNx8HF "vnx8hf") + (VNx4SI "vnx4sf") (VNx4SF "vnx4sf") (VNx2DI "vnx2df") (VNx2DF "vnx2df")]) ;; Mode for vector conditional operations where the comparison has Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:20:57.547610677 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:23:37.626427347 +0100 @@ -2884,13 +2884,13 @@ (define_expand "vcondu" - [(set (match_operand:SVE_SD 0 "register_operand") - (if_then_else:SVE_SD + [(set (match_operand:SVE_HSD 0 "register_operand") + (if_then_else:SVE_HSD (match_operator 3 "comparison_operator" [(match_operand: 4 "register_operand") (match_operand: 5 "aarch64_simd_reg_or_zero")]) - (match_operand:SVE_SD 1 "register_operand") - (match_operand:SVE_SD 2 "register_operand")))] + (match_operand:SVE_HSD 1 "register_operand") + (match_operand:SVE_HSD 2 "register_operand")))] "TARGET_SVE" { aarch64_expand_sve_vcond (mode, mode, operands); Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_17.c === --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_17.c 2019-08-14 09:23:37.626427347 +0100 @@ -0,0 +1,94 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define eq(A, B) ((A) == (B)) +#define ne(A, B) ((A) != (B)) +#define olt(A, B) ((A) < (B)) +#define ole(A, B) ((A) <= (B)) +#define oge(A, B) ((A) >= (B)) +#define ogt(A, B) ((A) > (B)) +#define ordered(A, B) (!__builtin_isunordered (A, B)) +#define unordered(A, B) (__builtin_isunordered (A, B)) +#define ueq(A, B) (!__builtin_islessgreater (A, B)) +#define ult(A, B) (__builtin_isless (A, B)) +#define ule(A, B) (__builtin_islessequal (A, B)) +#define uge(A, B) (__builtin_isgreaterequal (A, B)) +#define ugt(A, B) (__builtin_isgreater (A, B)) +#define nueq(A, B) (__builtin_islessgreater (A, B)) +#define nult(A, B) (!__builtin_isless (A, B)) +#define nule(A, B) (!__builtin_islessequal (A, B)) +#define nuge(A, B) 
(!__builtin_isgreaterequal (A, B)) +#define nugt(A, B) (!__builtin_isgreater (A, B)) + +#define DEF_LOOP(CMP, EXPECT_INVALID) \ + void __attribute__ ((noinline, noclone)) \ + test_##CMP##_var (__fp16 *restrict dest, __fp16 *restrict src, \ + __fp16 fallback, __fp16 *restrict a,\ + __fp16 *restrict b, int count) \ + {\ +for (int i = 0; i < count; ++i)\ + dest[i] = CMP (a[i], b[i]) ? src[i] : fallback; \ + }\ + \ + void __attribute__ ((noinline, noclone)) \ + test_##CMP##_zero (__fp16 *restrict dest, __fp16 *restrict src,
[committed][AArch64] Rework SVE FP comparisons
This patch rewrites the SVE FP comparisons so that they always use unspecs and so that they have an additional operand to indicate whether the predicate is known to be a PTRUE. It's part of a series that rewrites the SVE FP patterns so that they can cope with non-PTRUE predicates. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274421. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/iterators.md (UNSPEC_COND_FCMUO): New unspec. (cmp_op): Handle it. (SVE_COND_FP_CMP): Rename to... (SVE_COND_FP_CMP_I0): ...this. (SVE_FP_CMP): Remove. * config/aarch64/aarch64-sve.md (*fcm): Replace with... (*fcm): ...this new pattern, using unspecs to represent the comparison. (*fcmuo): Use UNSPEC_COND_FCMUO. (*fcm_and_combine, *fcmuo_and_combine): Update accordingly. * config/aarch64/aarch64.c (aarch64_emit_sve_ptrue_op): Delete. (aarch64_unspec_cond_code): Move after integer code. Handle UNORDERED. (aarch64_emit_sve_predicated_cond): Replace with... (aarch64_emit_sve_fp_cond): ...this new function. (aarch64_emit_sve_or_conds): Replace with... (aarch64_emit_sve_or_fp_conds): ...this new function. (aarch64_emit_sve_inverted_cond): Replace with... (aarch64_emit_sve_invert_fp_cond): ...this new function. (aarch64_expand_sve_vec_cmp_float): Update accordingly. Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64/iterators.md 2019-08-14 09:25:49.689451157 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 09:29:14.195939545 +0100 @@ -479,6 +479,7 @@ (define_c_enum "unspec" UNSPEC_COND_FCMLE ; Used in aarch64-sve.md. UNSPEC_COND_FCMLT ; Used in aarch64-sve.md. UNSPEC_COND_FCMNE ; Used in aarch64-sve.md. +UNSPEC_COND_FCMUO ; Used in aarch64-sve.md. UNSPEC_COND_FDIV ; Used in aarch64-sve.md. UNSPEC_COND_FMAXNM ; Used in aarch64-sve.md. UNSPEC_COND_FMINNM ; Used in aarch64-sve.md. @@ -1273,9 +1274,6 @@ (define_code_iterator SVE_UNPRED_FP_BINA ;; SVE integer comparisons. (define_code_iterator SVE_INT_CMP [lt le eq ne ge gt ltu leu geu gtu]) -;; SVE floating-point comparisons. -(define_code_iterator SVE_FP_CMP [lt le eq ne ge gt]) - ;; --- ;; Code Attributes ;; --- @@ -1663,12 +1661,13 @@ (define_int_iterator SVE_COND_FP_TERNARY UNSPEC_COND_FNMLA UNSPEC_COND_FNMLS]) -(define_int_iterator SVE_COND_FP_CMP [UNSPEC_COND_FCMEQ - UNSPEC_COND_FCMGE - UNSPEC_COND_FCMGT - UNSPEC_COND_FCMLE - UNSPEC_COND_FCMLT - UNSPEC_COND_FCMNE]) +;; SVE FP comparisons that accept #0.0. +(define_int_iterator SVE_COND_FP_CMP_I0 [UNSPEC_COND_FCMEQ +UNSPEC_COND_FCMGE +UNSPEC_COND_FCMGT +UNSPEC_COND_FCMLE +UNSPEC_COND_FCMLT +UNSPEC_COND_FCMNE]) (define_int_iterator FCADD [UNSPEC_FCADD90 UNSPEC_FCADD270]) @@ -1955,7 +1954,8 @@ (define_int_attr cmp_op [(UNSPEC_COND_FC (UNSPEC_COND_FCMGT "gt") (UNSPEC_COND_FCMLE "le") (UNSPEC_COND_FCMLT "lt") -(UNSPEC_COND_FCMNE "ne")]) +(UNSPEC_COND_FCMNE "ne") +(UNSPEC_COND_FCMUO "uo")]) (define_int_attr sve_int_op [(UNSPEC_ANDV "andv") (UNSPEC_IORV "orv") Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:25:49.685451187 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:29:14.191939575 +0100 @@ -3136,15 +3136,15 @@ (define_expand "vec_cmp" } ) -;; Floating-point comparisons predicated with a PTRUE. +;; Predicated floating-point comparisons. 
(define_insn "*fcm" [(set (match_operand: 0 "register_operand" "=Upa, Upa") (unspec: [(match_operand: 1 "register_operand" "Upl, Upl") - (SVE_FP_CMP: -(match_operand:SVE_F 2 "register_operand" "w, w") -(match_operand:SVE_F 3 "aarch64_simd_reg_or_zero" "Dz, w"))] - UNSPEC_MERGE_PTRUE))] + (match_operand:SI 4 "aarch64_sve_ptrue_flag") + (match_operand:SVE_F 2 "register_operand" "w, w") + (match_operand:SVE_F 3 "aarch64_simd_reg_or_zer
[committed][AArch64] Use unspecs for SVE conversions involving floats
This patch changes the SVE FP<->FP and FP<->INT patterns so that they use unspecs rather than rtx codes, continuing the series to make the patterns work with predicates that might not be all-true. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274423. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/aarch64.md (UNSPEC_FLOAT_CONVERT): Delete. * config/aarch64/iterators.md (UNSPEC_COND_FCVT, UNSPEC_COND_FCVTZS) (UNSPEC_COND_FCVTZU, UNSPEC_COND_SCVTF, UNSPEC_COND_UCVTF): New unspecs. (optab, su): Handle them. (SVE_COND_FCVTI, SVE_COND_ICVTF): New int iterators. * config/aarch64/aarch64-sve.md (2): Replace with... (2): ...this. (*v16hsf<:SVE_HSDImode>2): Replace with... (*v16hsf2): ...this. (*vnx4sf2): Replace with... (*vnx4sf2): ...this. (*vnx2df2): Replace with... (*vnx2df2): ...this. (vec_pack_fix_trunc_vnx2df): Use SVE_COND_FCVTI instead of FIXUORS. (2): Replace with... (2): ...this. (*vnx8hf2): Replace with... (*vnx8hf2): ...this. (*vnx4sf2): Replace with... (*vnx4sf2): ...this. (aarch64_sve_vnx2df2): Replace with... (aarch64_sve_vnx2df2): ...this. (vec_unpack_float__vnx4si): Pass a GP strictness operand to aarch64_sve_vnx2df2. (vec_pack_trunc_, *trunc2) (aarch64_sve_extend2): Use UNSPEC_COND_FCVT instead of UNSPEC_FLOAT_CONVERT. (vec_unpacks__): Pass a GP strictness operand to aarch64_sve_extend2. Index: gcc/config/aarch64/aarch64.md === --- gcc/config/aarch64/aarch64.md 2019-08-14 09:15:57.617827961 +0100 +++ gcc/config/aarch64/aarch64.md 2019-08-14 09:32:41.846404780 +0100 @@ -226,7 +226,6 @@ (define_c_enum "unspec" [ UNSPEC_UNPACKSLO UNSPEC_UNPACKULO UNSPEC_PACK -UNSPEC_FLOAT_CONVERT UNSPEC_WHILE_LO UNSPEC_LDN UNSPEC_STN Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64/iterators.md 2019-08-14 09:29:52.871653684 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 09:32:41.846404780 +0100 @@ -480,6 +480,9 @@ (define_c_enum "unspec" UNSPEC_COND_FCMLT ; Used in aarch64-sve.md. UNSPEC_COND_FCMNE ; Used in aarch64-sve.md. UNSPEC_COND_FCMUO ; Used in aarch64-sve.md. +UNSPEC_COND_FCVT ; Used in aarch64-sve.md. +UNSPEC_COND_FCVTZS ; Used in aarch64-sve.md. +UNSPEC_COND_FCVTZU ; Used in aarch64-sve.md. UNSPEC_COND_FDIV ; Used in aarch64-sve.md. UNSPEC_COND_FMAXNM ; Used in aarch64-sve.md. UNSPEC_COND_FMINNM ; Used in aarch64-sve.md. @@ -498,6 +501,8 @@ (define_c_enum "unspec" UNSPEC_COND_FRINTZ ; Used in aarch64-sve.md. UNSPEC_COND_FSQRT ; Used in aarch64-sve.md. UNSPEC_COND_FSUB ; Used in aarch64-sve.md. +UNSPEC_COND_SCVTF ; Used in aarch64-sve.md. +UNSPEC_COND_UCVTF ; Used in aarch64-sve.md. UNSPEC_LASTB ; Used in aarch64-sve.md. UNSPEC_FCADD90 ; Used in aarch64-simd.md. UNSPEC_FCADD270; Used in aarch64-simd.md. 
@@ -1642,6 +1647,9 @@ (define_int_iterator SVE_COND_FP_UNARY [ UNSPEC_COND_FRINTZ UNSPEC_COND_FSQRT]) +(define_int_iterator SVE_COND_FCVTI [UNSPEC_COND_FCVTZS UNSPEC_COND_FCVTZU]) +(define_int_iterator SVE_COND_ICVTF [UNSPEC_COND_SCVTF UNSPEC_COND_UCVTF]) + (define_int_iterator SVE_COND_FP_BINARY [UNSPEC_COND_FADD UNSPEC_COND_FDIV UNSPEC_COND_FMAXNM @@ -1715,6 +1723,9 @@ (define_int_attr optab [(UNSPEC_ANDF "an (UNSPEC_FMINV "smin_nan") (UNSPEC_COND_FABS "abs") (UNSPEC_COND_FADD "add") + (UNSPEC_COND_FCVT "fcvt") + (UNSPEC_COND_FCVTZS "fix_trunc") + (UNSPEC_COND_FCVTZU "fixuns_trunc") (UNSPEC_COND_FDIV "div") (UNSPEC_COND_FMAXNM "smax") (UNSPEC_COND_FMINNM "smin") @@ -1732,7 +1743,9 @@ (define_int_attr optab [(UNSPEC_ANDF "an (UNSPEC_COND_FRINTX "rint") (UNSPEC_COND_FRINTZ "btrunc") (UNSPEC_COND_FSQRT "sqrt") - (UNSPEC_COND_FSUB "sub")]) + (UNSPEC_COND_FSUB "sub") + (UNSPEC_COND_SCVTF "float") + (UNSPEC_COND_UCVTF "floatuns")]) (define_int_attr maxmin_uns [(UNSPEC_UMAXV "umax") (UNSPEC_UMINV "umin") @@ -1773,7 +1786,11 @@ (define_int_attr su [(UNSPEC_UNPACKSHI " (UNSPEC_UNPACKSLO "s") (UNSPEC_UNPA
[committed][AArch64] Rearrange SVE conversion patterns
The SVE int<->float conversion patterns need to handle various combinations of modes, making sure that the predicate mode is based on the widest element size. We did this using separate patterns for conversions involving: - HF (converting to/from [HSD]I, predicated based on the int operand) - SF (converting to/from [SD]I, predicated based on the int operand) - DF (converting to/from [SD]I, predicated based on the float operand) This worked, and meant that there were no redundant patterns. However, the ACLE needs various new predicated patterns too, and having three versions of each one seemed excessive. This patch instead splits the patterns into two groups rather than three. For conversions to integers: - truncating (predicated based on the source type, DF->SI only) - non-truncating (predicated based on the destination type) For conversions from integers: - extending (predicated based on the destination type, SI->DF only) - non-extending (predicated based on the source type) This means that we still don't create pattern names for the invalid combinations DF<->HI and SF<->HI. The downside is that we need to use C conditions to exclude the SI<->DF case from the non-truncating/ non-extending patterns. We therefore have two pattern names for SI<->DF, but genconditions ensures that the invalid one always has the value CODE_FOR_nothing. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274424. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/iterators.md (VNx4SI_ONLY, VNx2DF_ONLY): New mode iterators. (SVE_BHSI, SVE_SDI): Tweak comment. (SVE_HSDI): Likewise. Fix definition. (SVE_SDF): New mode iterator. (elem_bits): New mode attribute. (SVE_COND_FCVT): New int iterator. * config/aarch64/aarch64-sve.md (*v16hsf2) (*vnx4sf2) (*vnx2df2): Merge into... (*aarch64_sve__nontrunc) (*aarch64_sve__trunc): ...these new patterns. (*vnx8hf2) (*vnx4sf2) (aarch64_sve_vnx2df2): Merge into... (*aarch64_sve__nonextend) (aarch64_sve__extend): ...these new patterns. (vec_unpack_float__vnx4si): Update accordingly. (*trunc2): Replace with... (*aarch64_sve__trunc): ...this new pattern. (aarch64_sve_extend2): Replace with... (aarch64_sve__nontrunc): ...this new pattern. (vec_unpacks__): Update accordingly. Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64/iterators.md 2019-08-14 09:34:05.509786440 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 09:38:47.027705882 +0100 @@ -278,6 +278,10 @@ (define_mode_iterator VMUL_CHANGE_NLANES (define_mode_iterator SVE_ALL [VNx16QI VNx8HI VNx4SI VNx2DI VNx8HF VNx4SF VNx2DF]) +;; Iterators for single modes, for "@" patterns. +(define_mode_iterator VNx4SI_ONLY [VNx4SI]) +(define_mode_iterator VNx2DF_ONLY [VNx2DF]) + ;; All SVE vector structure modes. (define_mode_iterator SVE_STRUCT [VNx32QI VNx16HI VNx8SI VNx4DI VNx16HF VNx8SF VNx4DF @@ -292,15 +296,21 @@ (define_mode_iterator SVE_BH [VNx16QI VN ;; All SVE vector modes that have 8-bit, 16-bit or 32-bit elements. (define_mode_iterator SVE_BHS [VNx16QI VNx8HI VNx4SI VNx8HF VNx4SF]) -;; All SVE integer vector modes that have 8-bit, 16-bit or 32-bit elements. +;; SVE integer vector modes that have 8-bit, 16-bit or 32-bit elements. (define_mode_iterator SVE_BHSI [VNx16QI VNx8HI VNx4SI]) -;; All SVE integer vector modes that have 16-bit, 32-bit or 64-bit elements. -(define_mode_iterator SVE_HSDI [VNx16QI VNx8HI VNx4SI]) +;; SVE integer vector modes that have 16-bit, 32-bit or 64-bit elements. 
+(define_mode_iterator SVE_HSDI [VNx8HI VNx4SI VNx2DI]) -;; All SVE floating-point vector modes that have 16-bit or 32-bit elements. +;; SVE floating-point vector modes that have 16-bit or 32-bit elements. (define_mode_iterator SVE_HSF [VNx8HF VNx4SF]) +;; SVE integer vector modes that have 32-bit or 64-bit elements. +(define_mode_iterator SVE_SDI [VNx4SI VNx2DI]) + +;; SVE floating-point vector modes that have 32-bit or 64-bit elements. +(define_mode_iterator SVE_SDF [VNx4SF VNx2DF]) + ;; All SVE vector modes that have 16-bit, 32-bit or 64-bit elements. (define_mode_iterator SVE_HSD [VNx8HI VNx4SI VNx2DI VNx8HF VNx4SF VNx2DF]) @@ -313,9 +323,6 @@ (define_mode_iterator SVE_S [VNx4SI VNx4 ;; All SVE vector modes that have 64-bit elements. (define_mode_iterator SVE_D [VNx2DI VNx2DF]) -;; All SVE integer vector modes that have 32-bit or 64-bit elements. -(define_mode_iterator SVE_SDI [VNx4SI VNx2DI]) - ;; All SVE integer vector modes. (define_mode_iterator SVE_I [VNx16QI VNx8HI VNx4SI VNx2DI]) @@ -629,6 +636,11 @@ (define_mode_attr sizen [(QI "8") (HI "1 (define_mode_attr sizem1 [(QI "#7")
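
A sketch of the two groups in source form (assuming this revision):
converting double to int shrinks the element size, so it is the
"truncating" case predicated on the double side; converting int to float
of the same or smaller size falls in the "non-extending" group predicated
on the integer side.

    /* gcc -O3 -march=armv8.2-a+sve: FCVTZS for the truncating DF->SI
       conversion, SCVTF for the non-extending SI->SF conversion.  */
    void
    convert (int *restrict i, const double *restrict d,
             float *restrict f, const int *restrict j, int n)
    {
      for (int k = 0; k < n; k++)
        {
          i[k] = (int) d[k];
          f[k] = (float) j[k];
        }
    }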
Re: [PATCH] Automatics in equivalence statements
I now have commit access.

gcc/fortran

	Jeff Law
	Mark Eggleston

	* gfortran.h: Add gfc_check_conflict declaration.
	* symbol.c (check_conflict): Rename to gfc_check_conflict and
	remove static.
	* symbol.c (gfc_check_conflict): Remove automatic in equivalence
	conflict check.
	* symbol.c (save_symbol): Add check for in equivalence to stop the
	save attribute being added.
	* trans-common.c (build_equiv_decl): Add is_auto parameter and add
	!is_auto to condition where TREE_STATIC (decl) is set.
	* trans-common.c (build_equiv_decl): Add local variable is_auto,
	set it true if an automatic attribute is encountered in the
	variable list.  Call build_equiv_decl with is_auto as an
	additional parameter.
	* trans-common.c (accumulate_equivalence_attributes): New
	subroutine.
	* trans-common.c (find_equivalence): New local variable
	dummy_symbol, accumulate equivalence attributes from each symbol,
	then check for conflicts.

gcc/testsuite

	Mark Eggleston

	* gfortran.dg/auto_in_equiv_1.f90: New test.
	* gfortran.dg/auto_in_equiv_2.f90: New test.
	* gfortran.dg/auto_in_equiv_3.f90: New test.

OK to commit? How do I know that I have approval to commit?

On 23/07/2019 03:50, Jeff Law wrote:
> On 7/22/19 8:36 PM, Steve Kargl wrote:
>> On Mon, Jul 22, 2019 at 08:07:12PM -0600, Jeff Law wrote:
>>> On 7/22/19 7:38 PM, Steve Kargl wrote:
>>>> Someone needs to get commit access.
>>> I've sent Mark the link for authenticated access. So is he clear to
>>> commit once that's set up?
>> Yes, IMHO. He's sent a number of quality patches, and from what I
>> gathered you've worked with him so he has a good mentor.
> Perfect. Thanks.
>> Unfortunately, gfortran has too few contributors at the moment.
> Y'all aren't alone...
> Jeff

-- 
https://www.codethink.co.uk/privacy.html

From 8487aa2c195261f62489f94c2e2d16d81f945362 Mon Sep 17 00:00:00 2001
From: Mark Eggleston
Date: Tue, 11 Sep 2018 12:50:11 +0100
Subject: [PATCH 2/3] Allow automatics in equivalence

If a variable with an automatic attribute appears in an equivalence
statement, the storage should be allocated on the stack.

Note: most of this patch was provided by Jeff Law.
--- gcc/fortran/gfortran.h| 1 + gcc/fortran/symbol.c | 102 +- gcc/fortran/trans-common.c| 73 -- gcc/testsuite/gfortran.dg/auto_in_equiv_1.f90 | 36 + gcc/testsuite/gfortran.dg/auto_in_equiv_2.f90 | 38 ++ gcc/testsuite/gfortran.dg/auto_in_equiv_3.f90 | 63 6 files changed, 257 insertions(+), 56 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/auto_in_equiv_1.f90 create mode 100644 gcc/testsuite/gfortran.dg/auto_in_equiv_2.f90 create mode 100644 gcc/testsuite/gfortran.dg/auto_in_equiv_3.f90 diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 75e5b2f0644..49bcacc9a54 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -3007,6 +3007,7 @@ bool gfc_merge_new_implicit (gfc_typespec *); void gfc_set_implicit_none (bool, bool, locus *); void gfc_check_function_type (gfc_namespace *); bool gfc_is_intrinsic_typename (const char *); +bool gfc_check_conflict (symbol_attribute *, const char *, locus *); gfc_typespec *gfc_get_default_type (const char *, gfc_namespace *); bool gfc_set_default_type (gfc_symbol *, int, gfc_namespace *); diff --git a/gcc/fortran/symbol.c b/gcc/fortran/symbol.c index 2b8f86e0881..cc5b5efa3a8 100644 --- a/gcc/fortran/symbol.c +++ b/gcc/fortran/symbol.c @@ -407,8 +407,8 @@ gfc_check_function_type (gfc_namespace *ns) goto conflict_std;\ } -static bool -check_conflict (symbol_attribute *attr, const char *name, locus *where) +bool +gfc_check_conflict (symbol_attribute *attr, const char *name, locus *where) { static const char *dummy = "DUMMY", *save = "SAVE", *pointer = "POINTER", *target = "TARGET", *external = "EXTERNAL", *intent = "INTENT", @@ -544,7 +544,6 @@ check_conflict (symbol_attribute *attr, const char *name, locus *where) conf (allocatable, elemental); conf (in_common, automatic); - conf (in_equivalence, automatic); conf (result, automatic); conf (use_assoc, automatic); conf (dummy, automatic); @@ -1004,7 +1003,7 @@ gfc_add_attribute (symbol_attribute *attr, locus *where) if (check_used (attr, NULL, where)) return false; - return check_conflict (attr, NULL, where); + return gfc_check_conflict (attr, NULL, where); } @@ -1030,7 +1029,7 @@ gfc_add_allocatable (symbol_attribute *attr, locus *where) } attr->allocatable = 1; - return check_conflict (attr, NULL, where); + return gfc_check_conflict (attr, NULL, where); } @@ -1045,7 +1044,7 @@ gfc_add_automatic (symbol_attribute *attr, const char *name, locus *where) return false; attr->automatic = 1; - retur
[committed][AArch64] Use "x" predication for SVE integer arithmetic patterns
The SVE patterns used an UNSPEC_MERGE_PTRUE unspec to attach a predicate to an otherwise unpredicated integer arithmetic operation. As its name suggests, this was designed to be a wrapper used for merging instructions in which the predicate is known to be a PTRUE. This unspec dates from the very early days of the port and nothing has ever taken advantage of the PTRUE guarantee for arithmetic (as opposed to comparisons). This patch replaces it with the less stringent guarantee that: (a) the values of inactive lanes don't matter and (b) it is valid to make extra lanes active if there's a specific benefit Doing this makes the patterns suitable for the ACLE _x functions, which have the above semantics. See the block comment in the patch for more details. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274425. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/aarch64.md (UNSPEC_PRED_X): New unspec. * config/aarch64/aarch64-sve.md: Add a section describing it. (@aarch64_pred_mov, @aarch64_pred_mov) (2, *2) (aarch64_abd_3, mul3, *mul3) (mul3_highpart, *mul3_highpart) (3, *3) (*bic3, v3, *v3) (3, *3, *madd) (*msub3, *aarch64_sve_rev64) (*aarch64_sve_rev32, *aarch64_sve_rev16vnx16qi): Use UNSPEC_PRED_X instead of UNSPEC_MERGE_PTRUE. * config/aarch64/aarch64-sve2.md (avg3_floor) (avg3_ceil, *h): Likewise. * config/aarch64/aarch64.c (aarch64_split_sve_subreg_move) (aarch64_evpc_rev_local): Update accordingly. Index: gcc/config/aarch64/aarch64.md === --- gcc/config/aarch64/aarch64.md 2019-08-14 09:34:05.509786440 +0100 +++ gcc/config/aarch64/aarch64.md 2019-08-14 09:43:23.977659217 +0100 @@ -220,6 +220,7 @@ (define_c_enum "unspec" [ UNSPEC_LD1_GATHER UNSPEC_ST1_SCATTER UNSPEC_MERGE_PTRUE +UNSPEC_PRED_X UNSPEC_PTEST UNSPEC_UNPACKSHI UNSPEC_UNPACKUHI Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:39:44.323282457 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:43:23.973659247 +0100 @@ -24,6 +24,7 @@ ;; == General notes ;; Note on the handling of big-endian SVE ;; Description of UNSPEC_PTEST +;; Note on predicated integer arithemtic and UNSPEC_PRED_X ;; Note on predicated FP arithmetic patterns and GP "strictness" ;; ;; == Moves @@ -230,6 +231,63 @@ ;; - OP is the predicate we want to test, of the same mode as CAST_GP. ;; ;; - +;; Note on predicated integer arithemtic and UNSPEC_PRED_X +;; - +;; +;; Many SVE integer operations are predicated. We can generate them +;; from four sources: +;; +;; (1) Using normal unpredicated optabs. In this case we need to create +;; an all-true predicate register to act as the governing predicate +;; for the SVE instruction. There are no inactive lanes, and thus +;; the values of inactive lanes don't matter. +;; +;; (2) Using _x ACLE functions. In this case the function provides a +;; specific predicate and some lanes might be inactive. However, +;; as for (1), the values of the inactive lanes don't matter. +;; We can make extra lanes active without changing the behavior +;; (although for code-quality reasons we should avoid doing so +;; needlessly). +;; +;; (3) Using cond_* optabs that correspond to IFN_COND_* internal functions. +;; These optabs have a predicate operand that specifies which lanes are +;; active and another operand that provides the values of inactive lanes. +;; +;; (4) Using _m and _z ACLE functions. 
These functions map to the same +;; patterns as (3), with the _z functions setting inactive lanes to zero +;; and the _m functions setting the inactive lanes to one of the function +;; arguments. +;; +;; For (1) and (2) we need a way of attaching the predicate to a normal +;; unpredicated integer operation. We do this using: +;; +;; (unspec:M [pred (code:M (op0 op1 ...))] UNSPEC_PRED_X) +;; +;; where (code:M (op0 op1 ...)) is the normal integer operation and PRED +;; is a predicate of mode . PRED might or might not be a PTRUE; +;; it always is for (1), but might not be for (2). +;; +;; The unspec as a whole has the same value as (code:M ...) when PRED is +;; all-true. It is always semantically valid to replace PRED with a PTRUE, +;; but as noted above, we should only do so if there's a specific benefit. +;; +;; (The "_X" in the unspec is named after the ACLE functions in (2).) +;; +;; For (3) and (4) we can simply use the SVE port's normal representation +;; of a predicate-based select: +;; +;; (unspec:M [pred (code:M (
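For concreteness, the four sources above correspond to ACLE-level code along the following lines. This is a rough sketch, not part of the patch; it assumes the standard arm_sve.h intrinsic names svadd_s32_x/_z/_m and an SVE-enabled compiler:

#include <arm_sve.h>

/* _x: inactive lanes of the result are "don't care", so the compiler
   may widen PG to a PTRUE if that helps (source (2) above; source (1)
   behaves the same way with PG already a PTRUE).  */
svint32_t add_x (svbool_t pg, svint32_t a, svint32_t b)
{
  return svadd_s32_x (pg, a, b);
}

/* _z: inactive lanes of the result must be zero (source (4) above).  */
svint32_t add_z (svbool_t pg, svint32_t a, svint32_t b)
{
  return svadd_s32_z (pg, a, b);
}

/* _m: inactive lanes take their value from the first data argument
   (also source (4) above).  */
svint32_t add_m (svbool_t pg, svint32_t a, svint32_t b)
{
  return svadd_s32_m (pg, a, b);
}

Only the _x form has the freedom that UNSPEC_PRED_X encodes; the _z and _m forms pin down the inactive lanes and therefore map to the UNSPEC_SEL-based representation instead.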
Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.
On Tue, Aug 13, 2019 at 10:36 AM Robin Dapp wrote: > > We would like to simplify code like > (larger_type)(var + const1) + const2 > to > (larger_type)(var + combined_const1_const2) > when we know that no overflow happens. Throwing in my own comments... > --- > gcc/match.pd | 101 +++ > 1 file changed, 101 insertions(+) > > diff --git a/gcc/match.pd b/gcc/match.pd > index 0317bc704f7..94400529ad8 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -2020,6 +2020,107 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (if (cst && !TREE_OVERFLOW (cst)) > (plus { cst; } @0 > > +/* ((T)(A + CST1)) + CST2 -> (T)(A) + CST */ > +#if GIMPLE > + (simplify > +(plus (convert (plus @0 INTEGER_CST@1)) INTEGER_CST@2) > + (if (INTEGRAL_TYPE_P (type) > + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0))) > + /* We actually want to perform two simplifications here: > + (1) (T)(A + CST1) + CST2 --> (T)(A) + (T)(CST1) > + If for (A + CST1) we either do not care about overflow (e.g. > + for a signed inner type) or the overflow is ok for an unsigned > + inner type. > + (2) (T)(A) + (T)(CST1) + CST2 --> (T)(A) + (T)(CST1 + CST2) But the original is already (T)(A) + CST1-in-T + CST2-in-T and thus we can always do 2) by means of already existing patterns. So it's only 1) you need to implement?! And 1) is really (T)(A + CST) -> (T)A + CST-in-T, no? > + If (CST1 + CST2) does not overflow and we do care about overflow > + (for a signed outer type) or we do not care about overflow in an > + unsigned outer type. */ > + (with > + { > + tree inner_type = TREE_TYPE (@0); > + wide_int wmin0, wmax0; > + wide_int cst1 = wi::to_wide (@1); > + > +wi::overflow_type min_ovf = wi::OVF_OVERFLOW, > + max_ovf = wi::OVF_OVERFLOW; > + > +/* Get overflow behavior. */ > + bool ovf_undef_inner = TYPE_OVERFLOW_UNDEFINED (inner_type); > + bool ovf_undef_outer = TYPE_OVERFLOW_UNDEFINED (type); > + > +/* Get value range of A. */ > + enum value_range_kind vr0 = get_range_info (@0, &wmin0, &wmax0); > + > +/* If we have a proper range, determine min and max overflow > + of (A + CST1). > + ??? We might also want to handle anti ranges. */ > +if (vr0 == VR_RANGE) > + { > +wi::add (wmin0, cst1, TYPE_SIGN (inner_type), &min_ovf); > +wi::add (wmax0, cst1, TYPE_SIGN (inner_type), &max_ovf); > + } > + > +/* Inner overflow does not matter in this case. */ > +if (ovf_undef_inner) > + { > +min_ovf = wi::OVF_NONE; > +max_ovf = wi::OVF_NONE; > + } > + > +/* Extend CST from INNER_TYPE to TYPE. */ > +cst1 = cst1.from (cst1, TYPE_PRECISION (type), TYPE_SIGN > (inner_type)); > + > +/* Check for overflow of (TYPE)(CST1 + CST2). */ > +wi::overflow_type outer_ovf = wi::OVF_OVERFLOW; > +wide_int cst = wi::add (cst1, wi::to_wide (@2), TYPE_SIGN (type), > +&outer_ovf); > + > +/* We *do* care about an overflow here as we do not want to introduce > + new undefined behavior that was not there before. */ > +if (ovf_undef_outer && outer_ovf) > + { > +/* Set these here to prevent the final conversion below > + to take place instead of introducing a new guard variable. */ > +min_ovf = wi::OVF_OVERFLOW; > +max_ovf = wi::OVF_OVERFLOW; > + } > + } > + (if (min_ovf == wi::OVF_NONE && max_ovf == wi::OVF_NONE) > +(plus (convert @0) { wide_int_to_tree (type, cst); } > + ) > +#endif > + > +/* ((T)(A)) + CST -> (T)(A + CST) */ But then this is the reverse... (as Marc already noticed). So - what are you really after?
(sorry if I don't remember, testcase(s) are missing from this patch) To me it seems that 1) loses information if A + CST was done in a signed type and we know that overflow doesn't happen because of that. For the reverse transformation we don't. Btw, if you make A == A' + CST' then you get (T)A + CST -> (T)(A' + CST' + CST) which is again trivially handled so why do you need both transforms again? > +#if GIMPLE > + (simplify > + (plus (convert SSA_NAME@0) INTEGER_CST@1) > +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) > + && INTEGRAL_TYPE_P (type) > + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) > + && int_fits_type_p (@1, TREE_TYPE (@0))) > + /* Perform binary operation inside the cast if the constant fits > +and (A + CST)'s range does not overflow. */ > + (with > + { > + wi::overflow_type min_ovf = wi::OVF_OVERFLOW, > + max_ovf = wi::OVF_OVERFLOW; > +tree inner_type = TREE_TYPE (@0); > +
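For readers following along, the source-level shape under discussion boils down to something like the following. This is an illustration of the intended transform, not a testcase from the patch:

/* (T)(A + CST1) + CST2 -> (T)A + (CST1 + CST2), provided the known
   range of a rules out overflow of the inner addition.  */
long
f (int a)
{
  if (a > 100)
    return 0;
  /* Here a <= 100, so a + 1 cannot overflow, and this is ideally
     folded to (long) a + 3.  */
  return (long) (a + 1) + 2;
}

The review question is which direction of the rewrite actually carries new information, since the widening direction can be composed from existing patterns once (T)(A + CST) -> (T)A + CST-in-T is available.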
Re: [PATCH 0/3] Libsanitizer: merge from trunk
On 8/13/19 5:02 PM, Jeff Law wrote: > On 8/13/19 7:07 AM, Martin Liska wrote: >> Hi. >> >> For this year, I decided to make a first merge now and the >> next (much smaller) at the end of October. >> >> The biggest change is rename of many files from .cc to .cpp. >> >> I bootstrapped the patch set on x86_64-linux-gnu and run >> asan/ubsan/tsan tests on x86_64, ppc64le (power8) and >> aarch64. >> >> Libasan SONAME has been already bumped compared to GCC 9. >> >> For other libraries, I don't see a reason for library bumping: >> >> $ abidiff /usr/lib64/libubsan.so.1.0.0 >> ./x86_64-pc-linux-gnu/libsanitizer/ubsan/.libs/libubsan.so.1.0.0 --stat >> Functions changes summary: 0 Removed, 0 Changed, 4 Added functions >> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable >> Function symbols changes summary: 3 Removed, 0 Added function symbols not >> referenced by debug info >> Variable symbols changes summary: 0 Removed, 0 Added variable symbol not >> referenced by debug info >> >> $ abidiff /usr/lib64/libtsan.so.0.0.0 >> ./x86_64-pc-linux-gnu/libsanitizer/tsan/.libs/libtsan.so.0.0.0 --stat >> Functions changes summary: 0 Removed, 0 Changed, 47 Added functions >> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable >> Function symbols changes summary: 1 Removed, 2 Added function symbols not >> referenced by debug info >> Variable symbols changes summary: 0 Removed, 0 Added variable symbol not >> referenced by debug info >> >> Ready to be installed? > ISTM that a sanitizer merge during stage1 should be able to move forward > without ACKs. Similarly for other runtimes where we pull from some > upstream master. Good then. I've just installed the patch and also the refresh of LOCAL_PATCHES. > > I'd be slightly concerned about the function removals, but I don't think > we've really tried to be ABI stable for the sanitizer runtimes. These are fine based on the function names. Martin > > jeff > From 090353a2c70b2cf18add7520e34366e10b7f54f7 Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Wed, 14 Aug 2019 10:48:38 +0200 Subject: [PATCH] Refresh LOCAL_PATCHES libsanitizer/ChangeLog: 2019-08-14 Martin Liska * LOCAL_PATCHES: Refresh based on what was committed. --- libsanitizer/LOCAL_PATCHES | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/libsanitizer/LOCAL_PATCHES b/libsanitizer/LOCAL_PATCHES index f653712fdda..121df67826b 100644 --- a/libsanitizer/LOCAL_PATCHES +++ b/libsanitizer/LOCAL_PATCHES @@ -1,6 +1 @@ -r258525 -r265667 -r265668 -r265669 -r265950 -r270208 +r274427 -- 2.22.0
[committed][AArch64] Rework SVE integer comparisons
The remaining uses of UNSPEC_MERGE_PTRUE were in integer comparison patterns. These aren't actually merging operations but zeroing ones, although there's no practical difference when the predicate is a PTRUE. All comparisons produced by expand are predicated on a PTRUE, although we try to pattern-match a compare-and-AND as a predicated comparison during combine. Like previous patches, this one rearranges things in a way that works better with the ACLE, where the initial predicate might or might not be a PTRUE. The new patterns use UNSPEC_PRED_Z to represent zeroing predication, with an aarch64_sve_ptrue_flag to record whether the predicate is all-true (as for UNSPEC_PTEST). See the block comment in the patch for more details. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274429. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/aarch64-protos.h (aarch64_sve_same_pred_for_ptest_p): Declare. * config/aarch64/aarch64.c (aarch64_sve_same_pred_for_ptest_p) (aarch64_sve_emit_int_cmp): New functions. (aarch64_convert_sve_data_to_pred): Use aarch64_sve_emit_int_cmp. (aarch64_sve_cmp_operand_p, aarch64_emit_sve_ptrue_op_cc): Delete. (aarch64_expand_sve_vec_cmp_int): Use aarch64_sve_emit_int_cmp. * config/aarch64/aarch64.md (UNSPEC_MERGE_PTRUE): Delete. (UNSPEC_PRED_Z): New unspec. (set_clobber_cc_nzc): Delete. * config/aarch64/aarch64-sve.md: Add a block comment about UNSPEC_PRED_Z. (*cmp): Rename to... (@aarch64_pred_cmp): ...this, replacing the old pattern with that name. Use UNSPEC_PRED_Z instead of UNSPEC_MERGE_PTRUE. (*cmp_cc): Use UNSPEC_PRED_Z instead of UNSPEC_MERGE_PTRUE. Use aarch64_sve_same_pred_for_ptest_p to check for compatible predicates. (*cmp_ptest): Likewise. (*cmp_and): Match a known-ptrue UNSPEC_PRED_Z instead of UNSPEC_MERGE_PTRUE. Split into the new form of predicated comparisons above. Index: gcc/config/aarch64/aarch64-protos.h === --- gcc/config/aarch64/aarch64-protos.h 2019-08-14 09:15:57.609828019 +0100 +++ gcc/config/aarch64/aarch64-protos.h 2019-08-14 09:47:31.355831192 +0100 @@ -555,6 +555,7 @@ void aarch64_expand_mov_immediate (rtx, rtx aarch64_ptrue_reg (machine_mode); rtx aarch64_pfalse_reg (machine_mode); bool aarch64_sve_pred_dominates_p (rtx *, rtx); +bool aarch64_sve_same_pred_for_ptest_p (rtx *, rtx *); void aarch64_emit_sve_pred_move (rtx, rtx, rtx); void aarch64_expand_sve_mem_move (rtx, rtx, machine_mode); bool aarch64_maybe_expand_sve_subreg_move (rtx, rtx); Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c2019-08-14 09:45:45.464613673 +0100 +++ gcc/config/aarch64/aarch64.c2019-08-14 09:47:31.355831192 +0100 @@ -2783,6 +2783,48 @@ aarch64_sve_pred_dominates_p (rtx *pred1 || rtx_equal_p (pred1[0], pred2)); } +/* PRED1[0] is a PTEST predicate and PRED1[1] is an aarch64_sve_ptrue_flag + for it. PRED2[0] is the predicate for the instruction whose result + is tested by the PTEST and PRED2[1] is again an aarch64_sve_ptrue_flag + for it. Return true if we can prove that the two predicates are + equivalent for PTEST purposes; that is, if we can replace PRED2[0] + with PRED1[0] without changing behavior.
*/ + +bool +aarch64_sve_same_pred_for_ptest_p (rtx *pred1, rtx *pred2) +{ + machine_mode mode = GET_MODE (pred1[0]); + gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL + && mode == GET_MODE (pred2[0]) + && aarch64_sve_ptrue_flag (pred1[1], SImode) + && aarch64_sve_ptrue_flag (pred2[1], SImode)); + + bool ptrue1_p = (pred1[0] == CONSTM1_RTX (mode) + || INTVAL (pred1[1]) == SVE_KNOWN_PTRUE); + bool ptrue2_p = (pred2[0] == CONSTM1_RTX (mode) + || INTVAL (pred2[1]) == SVE_KNOWN_PTRUE); + return (ptrue1_p && ptrue2_p) || rtx_equal_p (pred1[0], pred2[0]); +} + +/* Emit a comparison CMP between OP0 and OP1, both of which have mode + DATA_MODE, and return the result in a predicate of mode PRED_MODE. + Use TARGET as the target register if nonnull and convenient. */ + +static rtx +aarch64_sve_emit_int_cmp (rtx target, machine_mode pred_mode, rtx_code cmp, + machine_mode data_mode, rtx op1, rtx op2) +{ + insn_code icode = code_for_aarch64_pred_cmp (cmp, data_mode); + expand_operand ops[5]; + create_output_operand (&ops[0], target, pred_mode); + create_input_operand (&ops[1], CONSTM1_RTX (pred_mode), pred_mode); + create_integer_operand (&ops[2], SVE_KNOWN_PTRUE); + create_input_operand (&ops[3], op1, data_mode); + create_input_operand (&ops[4], op2, data_mode); + expand_insn (icode, 5, ops); + return ops[0].value; +} + /* Use a comparison to
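At the ACLE level, the reason the PTRUE requirement has to go is visible in code like the following rough sketch (not part of the patch; it assumes the standard arm_sve.h name svcmplt_s32):

#include <arm_sve.h>

/* The governing predicate PG is whatever the caller passes in.  With
   UNSPEC_PRED_Z the comparison pattern can take PG directly, with the
   ptrue flag recording that PG is *not* known to be all-true; the old
   UNSPEC_MERGE_PTRUE representation had no way to express this.  */
svbool_t
cmp (svbool_t pg, svint32_t a, svint32_t b)
{
  return svcmplt_s32 (pg, a, b);
}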
[committed][AArch64] Handle more SVE predicate constants
This patch handles more predicate constants by using TRN1, TRN2 and EOR. For now, only one operation is allowed before we fall back to loading from memory or doing an integer move and a compare. The EOR support includes the important special case of an inverted predicate. The real motivating case for this is the ACLE svdupq function, which allows a repeating 16-bit predicate to be built from individual scalar booleans. It's not easy to test properly before that support is merged. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274434. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/aarch64.c (aarch64_expand_sve_const_pred_eor) (aarch64_expand_sve_const_pred_trn): New functions. (aarch64_expand_sve_const_pred_1): Add a recurse_p parameter and use the above functions when the parameter is true. (aarch64_expand_sve_const_pred): Update call accordingly. * config/aarch64/aarch64-sve.md (*aarch64_sve_): Rename to... (@aarch64_sve_): ...this. gcc/testsuite/ * gcc.target/aarch64/sve/peel_ind_1.c: Look for an inverted .B VL1. * gcc.target/aarch64/sve/peel_ind_2.c: Likewise .S VL7. Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c2019-08-14 09:50:03.682705602 +0100 +++ gcc/config/aarch64/aarch64.c2019-08-14 09:52:02.893827778 +0100 @@ -3751,13 +3751,163 @@ aarch64_sve_move_pred_via_while (rtx tar return target; } +static rtx +aarch64_expand_sve_const_pred_1 (rtx, rtx_vector_builder &, bool); + +/* BUILDER is a constant predicate in which the index of every set bit + is a multiple of ELT_SIZE (which is <= 8). Try to load the constant + by inverting every element at a multiple of ELT_SIZE and EORing the + result with an ELT_SIZE PTRUE. + + Return a register that contains the constant on success, otherwise + return null. Use TARGET as the register if it is nonnull and + convenient. */ + +static rtx +aarch64_expand_sve_const_pred_eor (rtx target, rtx_vector_builder &builder, + unsigned int elt_size) +{ + /* Invert every element at a multiple of ELT_SIZE, keeping the + other bits zero. */ + rtx_vector_builder inv_builder (VNx16BImode, builder.npatterns (), + builder.nelts_per_pattern ()); + for (unsigned int i = 0; i < builder.encoded_nelts (); ++i) +if ((i & (elt_size - 1)) == 0 && INTVAL (builder.elt (i)) == 0) + inv_builder.quick_push (const1_rtx); +else + inv_builder.quick_push (const0_rtx); + inv_builder.finalize (); + + /* See if we can load the constant cheaply. */ + rtx inv = aarch64_expand_sve_const_pred_1 (NULL_RTX, inv_builder, false); + if (!inv) +return NULL_RTX; + + /* EOR the result with an ELT_SIZE PTRUE. */ + rtx mask = aarch64_ptrue_all (elt_size); + mask = force_reg (VNx16BImode, mask); + target = aarch64_target_reg (target, VNx16BImode); + emit_insn (gen_aarch64_pred_z (XOR, VNx16BImode, target, mask, inv, mask)); + return target; +} + +/* BUILDER is a constant predicate in which the index of every set bit + is a multiple of ELT_SIZE (which is <= 8). Try to load the constant + using a TRN1 of size PERMUTE_SIZE, which is >= ELT_SIZE. Return the + register on success, otherwise return null. Use TARGET as the register + if nonnull and convenient. */ + +static rtx +aarch64_expand_sve_const_pred_trn (rtx target, rtx_vector_builder &builder, + unsigned int elt_size, + unsigned int permute_size) +{ + /* We're going to split the constant into two new constants A and B, + with element I of BUILDER going into A if (I & PERMUTE_SIZE) == 0 + and into B otherwise. E.g. 
for PERMUTE_SIZE == 4 && ELT_SIZE == 1: + + A: { 0, 1, 2, 3, _, _, _, _, 8, 9, 10, 11, _, _, _, _ } + B: { 4, 5, 6, 7, _, _, _, _, 12, 13, 14, 15, _, _, _, _ } + + where _ indicates elements that will be discarded by the permute. + + First calculate the ELT_SIZEs for A and B. */ + unsigned int a_elt_size = GET_MODE_SIZE (DImode); + unsigned int b_elt_size = GET_MODE_SIZE (DImode); + for (unsigned int i = 0; i < builder.encoded_nelts (); i += elt_size) +if (INTVAL (builder.elt (i)) != 0) + { + if (i & permute_size) + b_elt_size |= i - permute_size; + else + a_elt_size |= i; + } + a_elt_size &= -a_elt_size; + b_elt_size &= -b_elt_size; + + /* Now construct the vectors themselves. */ + rtx_vector_builder a_builder (VNx16BImode, builder.npatterns (), + builder.nelts_per_pattern ()); + rtx_vector_builder b_builder (VNx16BImode, builder.npatterns (), + builder.nelts_per_pattern ()); + unsigned int nelts = builder.encoded_nelts (); + for (unsigned int i = 0; i < nelts; ++i) +
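The EOR case is easy to sanity-check by hand. Below is a throwaway sketch (an editorial illustration, not part of the patch) of the identity being exploited, with a 16-lane predicate modelled as one bit per byte lane:

#include <cassert>
#include <cstdint>

int
main ()
{
  /* For ELT_SIZE == 2 (.H) the PTRUE has a bit at every even lane.  */
  uint16_t ptrue_h = 0x5555;
  /* A constant with set bits only at .H lane boundaries...  */
  uint16_t want = 0x4105;
  /* ...inverted at every .H element...  */
  uint16_t inv = want ^ ptrue_h;
  /* ...is recovered by EORing with the .H PTRUE.  The win comes when
     INV itself is cheap to construct, e.g. for an inverted predicate.  */
  assert ((inv ^ ptrue_h) == want);
  return 0;
}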
[committed][AArch64] Use SVE ADR to optimise shift-add sequences
This patch uses SVE ADR to optimise shift-and-add and uxtw-and-add sequences. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274436. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/predicates.md (const_1_to_3_operand): New predicate. * config/aarch64/aarch64-sve.md (*aarch64_adr_uxtw) (*aarch64_adr_shift, *aarch64_adr_shift_uxtw): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/adr_1.c: New test. * gcc.target/aarch64/sve/adr_1_run.c: Likewise. * gcc.target/aarch64/sve/adr_2.c: Likewise. * gcc.target/aarch64/sve/adr_2_run.c: Likewise. * gcc.target/aarch64/sve/adr_3.c: Likewise. * gcc.target/aarch64/sve/adr_3_run.c: Likewise. * gcc.target/aarch64/sve/adr_4.c: Likewise. * gcc.target/aarch64/sve/adr_4_run.c: Likewise. * gcc.target/aarch64/sve/adr_5.c: Likewise. * gcc.target/aarch64/sve/adr_5_run.c: Likewise. -- Index: gcc/config/aarch64/predicates.md === --- gcc/config/aarch64/predicates.md2019-08-14 09:15:57.617827961 +0100 +++ gcc/config/aarch64/predicates.md2019-08-14 09:56:55.323680943 +0100 @@ -39,6 +39,13 @@ (define_predicate "const0_operand" (and (match_code "const_int") (match_test "op == CONST0_RTX (mode)"))) +(define_predicate "const_1_to_3_operand" + (match_code "const_int,const_vector") +{ + op = unwrap_const_vec_duplicate (op); + return CONST_INT_P (op) && IN_RANGE (INTVAL (op), 1, 3); +}) + (define_special_predicate "subreg_lowpart_operator" (and (match_code "subreg") (match_test "subreg_lowpart_p (op)"))) @@ -595,6 +602,11 @@ (define_predicate "aarch64_sve_inc_dec_i (and (match_code "const,const_vector") (match_test "aarch64_sve_inc_dec_immediate_p (op)"))) +(define_predicate "aarch64_sve_uxtw_immediate" + (and (match_code "const_vector") + (match_test "GET_MODE_UNIT_BITSIZE (GET_MODE (op)) > 32") + (match_test "aarch64_const_vec_all_same_int_p (op, 0x)"))) + (define_predicate "aarch64_sve_logical_immediate" (and (match_code "const,const_vector") (match_test "aarch64_sve_bitmask_immediate_p (op)"))) Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:54:30.808741952 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:56:55.323680943 +0100 @@ -61,6 +61,7 @@ ;; [INT] General binary arithmetic corresponding to rtx codes ;; [INT] Addition ;; [INT] Subtraction +;; [INT] Take address ;; [INT] Absolute difference ;; [INT] Multiplication ;; [INT] Highpart multiplication @@ -1672,6 +1673,65 @@ (define_insn "sub3" ;; Merging forms are handled through SVE_INT_BINARY. ;; - +;; [INT] Take address +;; - +;; Includes: +;; - ADR +;; - + +;; Unshifted ADR, with the offset being zero-extended from the low 32 bits. +(define_insn "*aarch64_adr_uxtw" + [(set (match_operand:VNx2DI 0 "register_operand" "=w") + (plus:VNx2DI + (and:VNx2DI + (match_operand:VNx2DI 2 "register_operand" "w") + (match_operand:VNx2DI 3 "aarch64_sve_uxtw_immediate")) + (match_operand:VNx2DI 1 "register_operand" "w")))] + "TARGET_SVE" + "adr\t%0.d, [%1.d, %2.d, uxtw]" +) + +;; ADR with a nonzero shift. 
+(define_insn_and_rewrite "*aarch64_adr_shift" + [(set (match_operand:SVE_SDI 0 "register_operand" "=w") + (plus:SVE_SDI + (unspec:SVE_SDI + [(match_operand 4) +(ashift:SVE_SDI + (match_operand:SVE_SDI 2 "register_operand" "w") + (match_operand:SVE_SDI 3 "const_1_to_3_operand"))] + UNSPEC_PRED_X) + (match_operand:SVE_SDI 1 "register_operand" "w")))] + "TARGET_SVE" + "adr\t%0., [%1., %2., lsl %3]" + "&& !CONSTANT_P (operands[4])" + { +operands[4] = CONSTM1_RTX (mode); + } +) + +;; Same, but with the index being zero-extended from the low 32 bits. +(define_insn_and_rewrite "*aarch64_adr_shift_uxtw" + [(set (match_operand:VNx2DI 0 "register_operand" "=w") + (plus:VNx2DI + (unspec:VNx2DI + [(match_operand 5) +(ashift:VNx2DI + (and:VNx2DI +(match_operand:VNx2DI 2 "register_operand" "w") +(match_operand:VNx2DI 4 "aarch64_sve_uxtw_immediate")) + (match_operand:VNx2DI 3 "const_1_to_3_operand"))] + UNSPEC_PRED_X) + (match_operand:VNx2DI 1 "register_operand" "w")))] + "TARGET_SVE" + "adr\t%0.d, [%1.d, %2.d, uxtw %3]" + "&& !CONSTANT_P (operands[5])" +
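As a hedged example, scalar source of roughly the following shape exposes the shift-and-add form once vectorized; whether ADR is actually chosen over a separate LSL and ADD depends on the usual cost decisions:

#include <cstdint>

void
f (uint64_t *__restrict r, uint64_t *__restrict a,
   uint64_t *__restrict b, int n)
{
  for (int i = 0; i < n; ++i)
    r[i] = a[i] + (b[i] << 2);  /* shift amount in the ADR range 1-3 */
}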
[committed][AArch64] Add support for SVE CLS and CLZ
This patch adds support for unpredicated SVE CLS and CLZ. A later patch will add support for predicated unary integer arithmetic. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274437. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/iterators.md (SVE_INT_UNARY): Add clrsb and clz. (optab, sve_int_op): Handle them. * config/aarch64/aarch64-sve.md: Expand comment. gcc/testsuite/ * gcc.target/aarch64/vect-clz.c: Force SVE off. * gcc.target/aarch64/sve/clrsb_1.c: New test. * gcc.target/aarch64/sve/clrsb_1_run.c: Likewise. * gcc.target/aarch64/sve/clz_1.c: Likewise. * gcc.target/aarch64/sve/clz_1_run.c: Likewise. Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64/iterators.md 2019-08-14 09:39:44.323282457 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 10:00:45.485990851 +0100 @@ -1276,7 +1276,7 @@ (define_code_iterator UCOMPARISONS [ltu (define_code_iterator FAC_COMPARISONS [lt le ge gt]) ;; SVE integer unary operations. -(define_code_iterator SVE_INT_UNARY [abs neg not popcount]) +(define_code_iterator SVE_INT_UNARY [abs neg not clrsb clz popcount]) ;; SVE integer binary operations. (define_code_iterator SVE_INT_BINARY [plus minus mult smax umax smin umin @@ -1307,6 +1307,8 @@ (define_code_attr optab [(ashift "ashl") (unsigned_fix "fixuns") (float "float") (unsigned_float "floatuns") +(clrsb "clrsb") +(clz "clz") (popcount "popcount") (and "and") (ior "ior") @@ -1474,6 +1476,8 @@ (define_code_attr sve_int_op [(plus "add (ior "orr") (xor "eor") (not "not") + (clrsb "cls") + (clz "clz") (popcount "cnt")]) (define_code_attr sve_int_op_rev [(plus "add") Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:58:35.914942337 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:00:45.485990851 +0100 @@ -1422,6 +1422,8 @@ (define_expand "vec_extract" ;; - ;; Includes: ;; - ABS +;; - CLS (= clrsb) +;; - CLZ ;; - CNT (= popcount) ;; - NEG ;; - NOT Index: gcc/testsuite/gcc.target/aarch64/vect-clz.c === --- gcc/testsuite/gcc.target/aarch64/vect-clz.c 2019-03-08 18:14:30.068993639 + +++ gcc/testsuite/gcc.target/aarch64/vect-clz.c 2019-08-14 10:00:45.485990851 +0100 @@ -1,6 +1,8 @@ /* { dg-do run } */ /* { dg-options "-O3 -save-temps -fno-inline -fno-vect-cost-model" } */ +#pragma GCC target "+nosve" + extern void abort (); void Index: gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c === --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c 2019-08-14 10:00:45.485990851 +0100 @@ -0,0 +1,22 @@ +/* { dg-do assemble { target aarch64_asm_sve_ok } } */ +/* { dg-options "-O2 -ftree-vectorize --save-temps" } */ + +#include + +void __attribute__ ((noinline, noclone)) +clrsb_32 (unsigned int *restrict dst, uint32_t *restrict src, int size) +{ + for (int i = 0; i < size; ++i) +dst[i] = __builtin_clrsb (src[i]); +} + +void __attribute__ ((noinline, noclone)) +clrsb_64 (unsigned int *restrict dst, uint64_t *restrict src, int size) +{ + for (int i = 0; i < size; ++i) +dst[i] = __builtin_clrsbll (src[i]); +} + +/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 2 } } */ +/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/clrsb_1_run.c === --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/clrsb_1_run.c 2019-08-14 
10:00:45.485990851 +0100 @@ -0,0 +1,50 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include "clrsb_1.c" + +extern void abort (void) __attribute__ ((noreturn)); + +unsigned int data[] = { + 0xff80, 24, + 0x, 31, + 0x, 31, + 0x8000, 0, + 0x7fff, 0, + 0x03ff, 21, + 0x1fff, 2, + 0x, 15, + 0x, 15 +}; + +int __attribute__ ((optimize (1))) +main (void) +{ + unsigned int count = sizeof (data) / sizeof (data[0]) / 2; + + uint32_t i
[committed][AArch64] Add support for SVE CNOT
This patch adds support for predicated and unpredicated CNOT (logical NOT on integers). In RTL terms, this is a select between 1 and 0 in which the predicate is fed by a comparison with zero. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274438. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/predicates.md (aarch64_simd_imm_one): New predicate. * config/aarch64/aarch64-sve.md (*cnot): New pattern. (*cond_cnot_2, *cond_cnot_any): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/cnot_1.c: New test. * gcc.target/aarch64/sve/cond_cnot_1.c: Likewise. * gcc.target/aarch64/sve/cond_cnot_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_cnot_2.c: Likewise. * gcc.target/aarch64/sve/cond_cnot_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_cnot_3.c: Likewise. * gcc.target/aarch64/sve/cond_cnot_3_run.c: Likewise. Index: gcc/config/aarch64/predicates.md === --- gcc/config/aarch64/predicates.md2019-08-14 09:58:35.914942337 +0100 +++ gcc/config/aarch64/predicates.md2019-08-14 10:04:58.948129300 +0100 @@ -460,6 +460,10 @@ (define_predicate "aarch64_simd_imm_zero (and (match_code "const,const_vector") (match_test "op == CONST0_RTX (GET_MODE (op))"))) +(define_predicate "aarch64_simd_imm_one" + (and (match_code "const_vector") + (match_test "op == CONST1_RTX (GET_MODE (op))"))) + (define_predicate "aarch64_simd_or_scalar_imm_zero" (and (match_code "const_int,const_double,const,const_vector") (match_test "op == CONST0_RTX (GET_MODE (op))"))) Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:02:44.165119259 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:04:58.948129300 +0100 @@ -54,6 +54,7 @@ ;; ;; == Unary arithmetic ;; [INT] General unary arithmetic corresponding to rtx codes +;; [INT] Logical inverse ;; [FP] General unary arithmetic corresponding to unspecs ;; [PRED] Inverse @@ -1455,6 +1456,95 @@ (define_insn "*2" ) ;; - +;; [INT] Logical inverse +;; - + +;; Predicated logical inverse. +(define_insn "*cnot" + [(set (match_operand:SVE_I 0 "register_operand" "=w") + (unspec:SVE_I + [(unspec: +[(match_operand: 1 "register_operand" "Upl") + (match_operand:SI 5 "aarch64_sve_ptrue_flag") + (eq: + (match_operand:SVE_I 2 "register_operand" "w") + (match_operand:SVE_I 3 "aarch64_simd_imm_zero"))] +UNSPEC_PRED_Z) + (match_operand:SVE_I 4 "aarch64_simd_imm_one") + (match_dup 3)] + UNSPEC_SEL))] + "TARGET_SVE" + "cnot\t%0., %1/m, %2." +) + +;; Predicated logical inverse, merging with the first input. +(define_insn_and_rewrite "*cond_cnot_2" + [(set (match_operand:SVE_I 0 "register_operand" "=w, ?&w") + (unspec:SVE_I + [(match_operand: 1 "register_operand" "Upl, Upl") + ;; Logical inverse of operand 2 (as above). + (unspec:SVE_I +[(unspec: + [(match_operand 5) +(const_int SVE_KNOWN_PTRUE) +(eq: + (match_operand:SVE_I 2 "register_operand" "0, w") + (match_operand:SVE_I 3 "aarch64_simd_imm_zero"))] + UNSPEC_PRED_Z) + (match_operand:SVE_I 4 "aarch64_simd_imm_one") + (match_dup 3)] +UNSPEC_SEL) + (match_dup 2)] + UNSPEC_SEL))] + "TARGET_SVE" + "@ + cnot\t%0., %1/m, %0. + movprfx\t%0, %2\;cnot\t%0., %1/m, %2." + "&& !CONSTANT_P (operands[5])" + { +operands[5] = CONSTM1_RTX (mode); + } + [(set_attr "movprfx" "*,yes")] +) + +;; Predicated logical inverse, merging with an independent value. +;; +;; The earlyclobber isn't needed for the first alternative, but omitting +;; it would only help the case in which operands 2 and 6 are the same, +;; which is handled above rather than here. 
Marking all the alternatives +;; as earlyclobber helps to make the instruction more regular to the +;; register allocator. +(define_insn_and_rewrite "*cond_cnot_any" + [(set (match_operand:SVE_I 0 "register_operand" "=&w, ?&w, ?&w") + (unspec:SVE_I + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") + ;; Logical inverse of operand 2 (as above). + (unspec:SVE_I +[(unspec: + [(match_operand 5) +(const_int SVE_KNOWN_PTRUE) +(eq: + (match_operand:SVE_I 2 "register_operand" "w, w, w") + (match_operand:SVE_I 3 "aarch64_simd_imm_zero"))] + UNSPEC_PRED_Z) + (m
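A minimal sketch of the C-level idiom these patterns target (my illustration, not one of the new tests):

void
f (int *__restrict r, int *__restrict a, int n)
{
  for (int i = 0; i < n; ++i)
    r[i] = !a[i];  /* select between 1 and 0 on (a[i] == 0), i.e. CNOT */
}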
Re: [PATCH][RFC][x86] Fix PR91154, add SImode smax, allow SImode add in SSE regs
On Tue, 13 Aug 2019, Jeff Law wrote: > On 8/9/19 7:00 AM, Richard Biener wrote: > > > > It fixes the slowdown observed in 416.gamess and 464.h264ref. > > > > Bootstrapped on x86_64-unknown-linux-gnu, testing still in progress. > > > > CCing Jeff who "knows RTL". > What specifically do you want me to look at? I'm not really familiar > with the STV stuff, but can certainly take a peek. Below is the updated patch with the already approved and committed parts taken out. It is now mostly mechanical apart from the make_vector_copies and convert_reg changes which move existing "patterns" under appropriate conditionals and add handling of the case where the scalar mode fits in a single GPR (previously it was -m32 DImode only, now it handles -m32/-m64 SImode and DImode). I'm redoing bootstrap / regtest on x86_64-unknown-linux-gnu now just to be safe. OK? I do expect we need to work on the compile-time issue I placed ??? comments on and more generally try to avoid using DF so much. Thanks, Richard. 2019-08-13 Richard Biener PR target/91154 * config/i386/i386-features.h (scalar_chain::scalar_chain): Add mode arguments. (scalar_chain::smode): New member. (scalar_chain::vmode): Likewise. (dimode_scalar_chain): Rename to... (general_scalar_chain): ... this. (general_scalar_chain::general_scalar_chain): Take mode arguments. (timode_scalar_chain::timode_scalar_chain): Initialize scalar_chain base with TImode and V1TImode. * config/i386/i386-features.c (scalar_chain::scalar_chain): Adjust. (general_scalar_chain::vector_const_cost): Adjust for SImode chains. (general_scalar_chain::compute_convert_gain): Likewise. Add {S,U}{MIN,MAX} support. (general_scalar_chain::replace_with_subreg): Use vmode/smode. (general_scalar_chain::make_vector_copies): Likewise. Handle non-DImode chains appropriately. (general_scalar_chain::convert_reg): Likewise. (general_scalar_chain::convert_op): Likewise. (general_scalar_chain::convert_insn): Likewise. Add fatal_insn_not_found if the result is not recognized. (convertible_comparison_p): Pass in the scalar mode and use that. (general_scalar_to_vector_candidate_p): Likewise. Rename from dimode_scalar_to_vector_candidate_p. Add {S,U}{MIN,MAX} support. (scalar_to_vector_candidate_p): Remove by inlining into single caller. (general_remove_non_convertible_regs): Rename from dimode_remove_non_convertible_regs. (remove_non_convertible_regs): Remove by inlining into single caller. (convert_scalars_to_vector): Handle SImode and DImode chains in addition to TImode chains. * config/i386/i386.md (3): New expander. (*3_1): New insn-and-split. (*di3_doubleword): Likewise. * gcc.target/i386/pr91154.c: New testcase. * gcc.target/i386/minmax-3.c: Likewise. * gcc.target/i386/minmax-4.c: Likewise. * gcc.target/i386/minmax-5.c: Likewise. * gcc.target/i386/minmax-6.c: Likewise. * gcc.target/i386/minmax-1.c: Add -mno-stv. * gcc.target/i386/minmax-2.c: Likewise. Index: gcc/config/i386/i386-features.c === --- gcc/config/i386/i386-features.c (revision 274422) +++ gcc/config/i386/i386-features.c (working copy) @@ -276,8 +276,11 @@ unsigned scalar_chain::max_id = 0; /* Initialize new chain. */ -scalar_chain::scalar_chain () +scalar_chain::scalar_chain (enum machine_mode smode_, enum machine_mode vmode_) { + smode = smode_; + vmode = vmode_; + chain_id = ++max_id; if (dump_file) @@ -319,7 +322,7 @@ scalar_chain::add_to_queue (unsigned ins conversion.
*/ void -dimode_scalar_chain::mark_dual_mode_def (df_ref def) +general_scalar_chain::mark_dual_mode_def (df_ref def) { gcc_assert (DF_REF_REG_DEF_P (def)); @@ -409,6 +412,9 @@ scalar_chain::add_insn (bitmap candidate && !HARD_REGISTER_P (SET_DEST (def_set))) bitmap_set_bit (defs, REGNO (SET_DEST (def_set))); + /* ??? The following is quadratic since analyze_register_chain + iterates over all refs to look for dual-mode regs. Instead this + should be done separately for all regs mentioned in the chain once. */ df_ref ref; df_ref def; for (ref = DF_INSN_UID_DEFS (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref)) @@ -469,19 +475,21 @@ scalar_chain::build (bitmap candidates, instead of using a scalar one. */ int -dimode_scalar_chain::vector_const_cost (rtx exp) +general_scalar_chain::vector_const_cost (rtx exp) { gcc_assert (CONST_INT_P (exp)); - if (standard_sse_constant_p (exp, V2DImode)) -return COSTS_N_INSNS (1); - return ix86_cost->sse_load[1]; + if (standard_sse_constant_p (exp, vmode)) +return ix86_cost->sse_op; + /* We have separate costs for SImode and DImode, us
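For readers not following the PR, the new minmax-* tests exercise SImode shapes of roughly this form (my paraphrase, not the exact testcase text); with STV the max can be kept in an SSE register as PMAXSD rather than a compare/cmov sequence, when the chain's cost estimate says so:

int
smax (int a, int b)
{
  return a > b ? a : b;
}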
[committed][AArch64] Add support for SVE [SU]{MAX,MIN} immediate
This patch adds support for the immediate forms of SVE SMAX, SMIN, UMAX and UMIN. SMAX and SMIN take the same range as MUL, so the patch basically just moves and generalises the existing MUL patterns. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274439. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/constraints.md (vsb): New constraint. (vsm): Generalize description. * config/aarch64/iterators.md (SVE_INT_BINARY_IMM): New code iterator. (sve_imm_con): Handle smax, smin, umax and umin. (sve_imm_prefix): New code attribute. * config/aarch64/predicates.md (aarch64_sve_vsb_immediate) (aarch64_sve_vsb_operand): New predicates. (aarch64_sve_mul_immediate): Rename to... (aarch64_sve_vsm_immediate): ...this. (aarch64_sve_mul_operand): Rename to... (aarch64_sve_vsm_operand): ...this. * config/aarch64/aarch64-sve.md (mul3): Generalize to... (3): ...this. (*mul3, *post_ra_mul3): Generalize to... (*3) (*post_ra_3): ...these and add movprfx support for the immediate alternatives. (3, *3): Delete in favor of the above. (*3): Fix incorrect predicate for operand 3. gcc/testsuite/ * gcc.target/aarch64/sve/smax_1.c: New test. * gcc.target/aarch64/sve/smin_1.c: Likewise. * gcc.target/aarch64/sve/umax_1.c: Likewise. * gcc.target/aarch64/sve/umin_1.c: Likewise. Index: gcc/config/aarch64/constraints.md === --- gcc/config/aarch64/constraints.md 2019-08-13 11:39:54.753376024 +0100 +++ gcc/config/aarch64/constraints.md 2019-08-14 10:08:03.446774020 +0100 @@ -388,6 +388,12 @@ (define_constraint "vsa" arithmetic instructions." (match_operand 0 "aarch64_sve_arith_immediate")) +(define_constraint "vsb" + "@internal + A constraint that matches an immediate operand valid for SVE UMAX + and UMIN operations." + (match_operand 0 "aarch64_sve_vsb_immediate")) + (define_constraint "vsc" "@internal A constraint that matches a signed immediate operand valid for SVE @@ -420,9 +426,9 @@ (define_constraint "vsl" (define_constraint "vsm" "@internal - A constraint that matches an immediate operand valid for SVE MUL - operations." - (match_operand 0 "aarch64_sve_mul_immediate")) + A constraint that matches an immediate operand valid for SVE MUL, + SMAX and SMIN operations." + (match_operand 0 "aarch64_sve_vsm_immediate")) (define_constraint "vsA" "@internal Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64/iterators.md 2019-08-14 10:02:44.165119259 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 10:08:03.446774020 +0100 @@ -1285,6 +1285,9 @@ (define_code_iterator SVE_INT_BINARY [pl ;; SVE integer binary division operations. (define_code_iterator SVE_INT_BINARY_SD [div udiv]) +;; SVE integer binary operations that have an immediate form. +(define_code_iterator SVE_INT_BINARY_IMM [mult smax smin umax umin]) + ;; SVE floating-point operations with an unpredicated all-register form. (define_code_iterator SVE_UNPRED_FP_BINARY [plus minus mult]) @@ -1499,7 +1502,12 @@ (define_code_attr sve_fp_op [(plus "fadd (mult "fmul")]) ;; The SVE immediate constraint to use for an rtl code. -(define_code_attr sve_imm_con [(eq "vsc") +(define_code_attr sve_imm_con [(mult "vsm") + (smax "vsm") + (smin "vsm") + (umax "vsb") + (umin "vsb") + (eq "vsc") (ne "vsc") (lt "vsc") (ge "vsc") @@ -1510,6 +1518,13 @@ (define_code_attr sve_imm_con [(eq "vsc" (geu "vsd") (gtu "vsd")]) +;; The prefix letter to use when printing an immediate operand. +(define_code_attr sve_imm_prefix [(mult "") + (smax "") + (smin "") + (umax "D") + (umin "D")]) + ;; --- ;; Int Iterators. 
;; --- Index: gcc/config/aarch64/predicates.md === --- gcc/config/aarch64/predicates.md2019-08-14 10:06:06.331634340 +0100 +++ gcc/config/aarch64/predicates.md2019-08-14 10:08:03.446774020 +0100 @@ -615,7 +615,15 @@ (define_predicate "aarch64_sve_logical_i (and (match_code "const,const_vector") (match_test "aarch64_sve_bitmask_immediate_p (op)"))) -(define_predicate "aarch64_sve_mul
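A hedged example of the source shape this enables, assuming the vectorizer picks the immediate alternative (the immediate shares MUL's signed 8-bit range, so 50 qualifies):

void
f (int *x, int n)
{
  for (int i = 0; i < n; ++i)
    x[i] = x[i] > 50 ? x[i] : 50;  /* smax on the .s elements with #50 */
}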
[committed][AArch64] Add support for SVE F{MAX,MIN}NM immediate
This patch uses the immediate forms of FMAXNM and FMINNM for unconditional arithmetic. The same rules apply to FMAX and FMIN, but we only generate those via the ACLE. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274440. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/predicates.md (aarch64_sve_float_maxmin_immediate) (aarch64_sve_float_maxmin_operand): New predicates. * config/aarch64/constraints.md (vsB): New constraint. (vsM): Fix typo. * config/aarch64/iterators.md (sve_pred_fp_rhs2_operand): Use aarch64_sve_float_maxmin_operand for UNSPEC_COND_FMAXNM and UNSPEC_COND_FMINNM. * config/aarch64/aarch64-sve.md (3): Use aarch64_sve_float_maxmin_operand for operand 2. (*3): Likewise. Add alternatives for the constant forms. gcc/testsuite/ * gcc.target/aarch64/sve/fmaxnm_1.c: New test. * gcc.target/aarch64/sve/fminnm_1.c: Likewise. Index: gcc/config/aarch64/predicates.md === --- gcc/config/aarch64/predicates.md2019-08-14 10:10:02.497900721 +0100 +++ gcc/config/aarch64/predicates.md2019-08-14 10:12:12.864944397 +0100 @@ -655,6 +655,11 @@ (define_predicate "aarch64_sve_float_mul (and (match_code "const,const_vector") (match_test "aarch64_sve_float_mul_immediate_p (op)"))) +(define_predicate "aarch64_sve_float_maxmin_immediate" + (and (match_code "const_vector") + (ior (match_test "op == CONST0_RTX (GET_MODE (op))") + (match_test "op == CONST1_RTX (GET_MODE (op))" + (define_predicate "aarch64_sve_arith_operand" (ior (match_operand 0 "register_operand") (match_operand 0 "aarch64_sve_arith_immediate"))) @@ -708,6 +713,10 @@ (define_predicate "aarch64_sve_float_mul (ior (match_operand 0 "register_operand") (match_operand 0 "aarch64_sve_float_mul_immediate"))) +(define_predicate "aarch64_sve_float_maxmin_operand" + (ior (match_operand 0 "register_operand") + (match_operand 0 "aarch64_sve_float_maxmin_immediate"))) + (define_predicate "aarch64_sve_vec_perm_operand" (ior (match_operand 0 "register_operand") (match_operand 0 "aarch64_constant_vector_operand"))) Index: gcc/config/aarch64/constraints.md === --- gcc/config/aarch64/constraints.md 2019-08-14 10:10:02.497900721 +0100 +++ gcc/config/aarch64/constraints.md 2019-08-14 10:12:12.864944397 +0100 @@ -436,9 +436,16 @@ (define_constraint "vsA" and FSUB operations." (match_operand 0 "aarch64_sve_float_arith_immediate")) +;; "B" for "bound". +(define_constraint "vsB" + "@internal + A constraint that matches an immediate operand valid for SVE FMAX + and FMIN operations." + (match_operand 0 "aarch64_sve_float_maxmin_immediate")) + (define_constraint "vsM" "@internal - A constraint that matches an imediate operand valid for SVE FMUL + A constraint that matches an immediate operand valid for SVE FMUL operations." 
(match_operand 0 "aarch64_sve_float_mul_immediate")) Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64/iterators.md 2019-08-14 10:10:02.497900721 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 10:12:12.864944397 +0100 @@ -2075,7 +2075,7 @@ (define_int_attr sve_pred_fp_rhs1_operan (define_int_attr sve_pred_fp_rhs2_operand [(UNSPEC_COND_FADD "aarch64_sve_float_arith_with_sub_operand") (UNSPEC_COND_FDIV "register_operand") - (UNSPEC_COND_FMAXNM "register_operand") - (UNSPEC_COND_FMINNM "register_operand") + (UNSPEC_COND_FMAXNM "aarch64_sve_float_maxmin_operand") + (UNSPEC_COND_FMINNM "aarch64_sve_float_maxmin_operand") (UNSPEC_COND_FMUL "aarch64_sve_float_mul_operand") (UNSPEC_COND_FSUB "register_operand")]) Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:10:02.497900721 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:12:12.864944397 +0100 @@ -2604,7 +2604,7 @@ (define_expand "3" [(match_dup 3) (const_int SVE_RELAXED_GP) (match_operand:SVE_F 1 "register_operand") - (match_operand:SVE_F 2 "register_operand")] + (match_operand:SVE_F 2 "aarch64_sve_float_maxmin_operand")] SVE_COND_FP_MAXMIN_PUBLIC))] "TARGET_SVE" { @@ -2614,18 +2614,20 @@ (define_expand "3" ;; Predicated floating-point maximum/minimum. (define_insn "*3" - [(set (match_operand:SVE_F 0 "register_operand" "=w, ?&w") + [(set (match_operand:SVE_F 0 "register_operand" "=w, w, ?&w, ?&w") (unspec:SVE_F - [(match_operand: 1 "register_operand" "Upl, Upl") + [(match_operand: 1 "register_operand" "Upl, Upl, Upl, Upl") (match_operand:SI 4 "aarch64_sve_gp_strictness") -
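Since aarch64_sve_float_maxmin_immediate only accepts 0.0 and 1.0, the typical shape that benefits is a clamp against one of those two constants. A rough illustration, assuming __builtin_fmax vectorizes to the FMAXNM form:

void
f (double *x, int n)
{
  for (int i = 0; i < n; ++i)
    x[i] = __builtin_fmax (x[i], 1.0);  /* fmaxnm on .d elements with #1.0 */
}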
[committed][AArch64] Make more use of SVE conditional constant moves
This patch extends the SVE UNSPEC_SEL patterns so that they can use: (1) MOV /M of a duplicated integer constant (2) MOV /M of a duplicated floating-point constant bitcast to an integer, accepting the same constants as (1) (3) FMOV /M of a duplicated floating-point constant (4) MOV /Z of a duplicated integer constant (5) MOV /Z of a duplicated floating-point constant bitcast to an integer, accepting the same constants as (4) (6) MOVPRFXed FMOV /M of a duplicated floating-point constant We already handled (4) with a special pattern; the rest are new. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274441. Richard 2019-08-14 Richard Sandiford Kugan Vivekanandarajah gcc/ * config/aarch64/aarch64.c (aarch64_bit_representation): New function. (aarch64_print_vector_float_operand): Also handle 8-bit floats. (aarch64_print_operand): Add support for %I. (aarch64_sve_dup_immediate_p): Handle scalars as well as vectors. Bitcast floating-point constants to the corresponding integer constant. (aarch64_float_const_representable_p): Handle vectors as well as scalars. (aarch64_expand_sve_vcond): Make sure that the operands are valid for the new vcond_mask_ expander. * config/aarch64/predicates.md (aarch64_sve_dup_immediate): Also test aarch64_float_const_representable_p. (aarch64_sve_reg_or_dup_imm): New predicate. * config/aarch64/aarch64-sve.md (vec_extract): Use gen_vcond_mask_ instead of gen_aarch64_sve_dup_const. (vcond_mask_): Turn into a define_expand that accepts aarch64_sve_reg_or_dup_imm and aarch64_simd_reg_or_zero for operands 1 and 2 respectively. Force operand 2 into a register if operand 1 is a register. Fold old define_insn... (aarch64_sve_dup_const): ...and this define_insn... (*vcond_mask_): ...into this new pattern. Handle floating-point constants that can be moved as integers. Add alternatives for MOV /M and FMOV /M. (vcond, vcondu) (vcond): Accept nonmemory_operand for operands 1 and 2 respectively. * config/aarch64/constraints.md (Ufc): Handle vectors as well as scalars. (vss): New constraint. gcc/testsuite/ * gcc.target/aarch64/sve/vcond_18.c: New test. * gcc.target/aarch64/sve/vcond_18_run.c: Likewise. * gcc.target/aarch64/sve/vcond_19.c: Likewise. * gcc.target/aarch64/sve/vcond_19_run.c: Likewise. * gcc.target/aarch64/sve/vcond_20.c: Likewise. * gcc.target/aarch64/sve/vcond_20_run.c: Likewise. Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c2019-08-14 09:54:30.816741891 +0100 +++ gcc/config/aarch64/aarch64.c2019-08-14 10:16:30.671052843 +0100 @@ -1482,6 +1482,16 @@ aarch64_dbx_register_number (unsigned re return DWARF_FRAME_REGISTERS; } +/* If X is a CONST_DOUBLE, return its bit representation as a constant + integer, otherwise return X unmodified. */ +static rtx +aarch64_bit_representation (rtx x) +{ + if (CONST_DOUBLE_P (x)) +x = gen_lowpart (int_mode_for_mode (GET_MODE (x)).require (), x); + return x; +} + /* Return true if MODE is any of the Advanced SIMD structure modes. */ static bool aarch64_advsimd_struct_mode_p (machine_mode mode) @@ -8275,7 +8285,8 @@ aarch64_print_vector_float_operand (FILE if (negate) r = real_value_negate (&r); - /* We only handle the SVE single-bit immediates here. */ + /* Handle the SVE single-bit immediates specially, since they have a + fixed form in the assembly syntax. 
*/ if (real_equal (&r, &dconst0)) asm_fprintf (f, "0.0"); else if (real_equal (&r, &dconst1)) @@ -8283,7 +8294,13 @@ aarch64_print_vector_float_operand (FILE else if (real_equal (&r, &dconsthalf)) asm_fprintf (f, "0.5"); else -return false; +{ + const int buf_size = 20; + char float_buf[buf_size] = {'\0'}; + real_to_decimal_for_mode (float_buf, &r, buf_size, buf_size, + 1, GET_MODE (elt)); + asm_fprintf (f, "%s", float_buf); +} return true; } @@ -8312,6 +8329,11 @@ sizetochar (int size) and print it as an unsigned integer, in decimal. 'e': Print the sign/zero-extend size as a character 8->b, 16->h, 32->w. + 'I': If the operand is a duplicated vector constant, + replace it with the duplicated scalar. If the + operand is then a floating-point constant, replace + it with the integer bit representation. Print the + transformed constant as a signed decimal number. 'p': Prints N such that 2^N == X (X must be power of 2 and
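A sketch of source that can now use case (1), a MOV /M of a duplicated integer constant (my illustration; whether the vectorizer emits exactly this depends on costs, and the constant must be in the MOV /M immediate range):

void
f (long *__restrict r, long *__restrict a, long *__restrict b, int n)
{
  for (int i = 0; i < n; ++i)
    r[i] = a[i] > 0 ? 99 : b[i];  /* #99 merged into a copy of b under p */
}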
[committed][AArch64] Use SVE MOV /M of scalars
This patch uses MOV /M to optimise selects between a duplicated scalar variable and a vector. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274442. Richard 2019-08-14 Richard Sandiford Kugan Vivekanandarajah gcc/ * config/aarch64/aarch64-sve.md (*aarch64_sel_dup): New pattern. gcc/testsuite/ * g++.target/aarch64/sve/dup_sel_1.C: New test. * g++.target/aarch64/sve/dup_sel_2.C: Likewise. * g++.target/aarch64/sve/dup_sel_3.C: Likewise. * g++.target/aarch64/sve/dup_sel_4.C: Likewise. * g++.target/aarch64/sve/dup_sel_5.C: Likewise. * g++.target/aarch64/sve/dup_sel_6.C: Likewise. Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:18:10.634319267 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:20:21.241360707 +0100 @@ -3070,6 +3070,29 @@ (define_insn "*vcond_mask_" [(set_attr "movprfx" "*,*,*,*,yes,yes,yes")] ) +;; Optimize selects between a duplicated scalar variable and another vector, +;; the latter of which can be a zero constant or a variable. Treat duplicates +;; of GPRs as being more expensive than duplicates of FPRs, since they +;; involve a cross-file move. +(define_insn "*aarch64_sel_dup" + [(set (match_operand:SVE_ALL 0 "register_operand" "=?w, w, ??w, ?&w, ??&w, ?&w") + (unspec:SVE_ALL + [(match_operand: 3 "register_operand" "Upa, Upa, Upl, Upl, Upl, Upl") + (vec_duplicate:SVE_ALL +(match_operand: 1 "register_operand" "r, w, r, w, r, w")) + (match_operand:SVE_ALL 2 "aarch64_simd_reg_or_zero" "0, 0, Dz, Dz, w, w")] + UNSPEC_SEL))] + "TARGET_SVE" + "@ + mov\t%0., %3/m, %1 + mov\t%0., %3/m, %1 + movprfx\t%0., %3/z, %0.\;mov\t%0., %3/m, %1 + movprfx\t%0., %3/z, %0.\;mov\t%0., %3/m, %1 + movprfx\t%0, %2\;mov\t%0., %3/m, %1 + movprfx\t%0, %2\;mov\t%0., %3/m, %1" + [(set_attr "movprfx" "*,*,yes,yes,yes,yes")] +) + ;; - ;; [INT,FP] Compare and select ;; - Index: gcc/testsuite/g++.target/aarch64/sve/dup_sel_1.C === --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/g++.target/aarch64/sve/dup_sel_1.C2019-08-14 10:20:21.245360681 +0100 @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -msve-vector-bits=256" } */ + +#include + +typedef int32_t vnx4si __attribute__((vector_size(32))); + +void +foo (int32_t val) +{ + register vnx4si x asm ("z0"); + register vnx4si y asm ("z0"); + asm volatile ("" : "=w" (y)); + val += 1; + vnx4si z = { val, val, val, val, val, val, val, val }; + x = (vnx4si) { -1, 0, 0, -1, 0, -1, 0, -1 } ? z : y; + asm volatile ("" :: "w" (x)); +} + +/* { dg-final { scan-assembler {\tmov\tz0\.s, p[0-7]/m, w[0-9]+\n} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ Index: gcc/testsuite/g++.target/aarch64/sve/dup_sel_2.C === --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/g++.target/aarch64/sve/dup_sel_2.C2019-08-14 10:20:21.245360681 +0100 @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -msve-vector-bits=256" } */ + +#include + +typedef int32_t vnx4si __attribute__((vector_size(32))); + +void +foo (int32_t val) +{ + register vnx4si x asm ("z0"); + register vnx4si y asm ("z1"); + asm volatile ("" : "=w" (y)); + val += 1; + vnx4si z = { val, val, val, val, val, val, val, val }; + x = (vnx4si) { -1, 0, 0, -1, 0, -1, 0, -1 } ? 
z : y; + asm volatile ("" :: "w" (x)); +} + +/* { dg-final { scan-assembler {\tmovprfx\tz0, z1\n\tmov\tz0\.s, p[0-7]/m, w[0-9]+\n} } } */ Index: gcc/testsuite/g++.target/aarch64/sve/dup_sel_3.C === --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/g++.target/aarch64/sve/dup_sel_3.C2019-08-14 10:20:21.245360681 +0100 @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -msve-vector-bits=256" } */ + +#include + +typedef int32_t vnx4si __attribute__((vector_size(32))); +typedef float vnx4sf __attribute__((vector_size(32))); + +void +foo (float val) +{ + register vnx4sf x asm ("z0"); + register vnx4sf y asm ("z0"); + asm volatile ("" : "=w" (y)); + vnx4sf z = { val, val, val, val, val, val, val, val }; + x = (vnx4si) { -1, 0, 0, -1, 0, -1, 0, -1 } ? z : y; + asm volatile ("" :: "w" (x)); +} + +/* { dg-final { scan-assembler {\tmov\tz0\.s, p[0-7]/m, s[0-9]+\n} } } */ +/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */ Index: gcc/testsuite/g++.target/aarch64/sve/dup_sel_4.C === --- /dev/null 2019-07-30 08:53:31.3
Re: [PATCH][RFC][x86] Fix PR91154, add SImode smax, allow SImode add in SSE regs
On Wed, Aug 14, 2019 at 11:08 AM Richard Biener wrote: > > On Tue, 13 Aug 2019, Jeff Law wrote: > > > On 8/9/19 7:00 AM, Richard Biener wrote: > > > > > > It fixes the slowdown observed in 416.gamess and 464.h264ref. > > > > > > Bootstrapped on x86_64-unknown-linux-gnu, testing still in progress. > > > > > > CCing Jeff who "knows RTL". > > What specifically do you want me to look at? I'm not really familiar > > with the STV stuff, but can certainly take a peek. > > Below is the updated patch with the already approved and committed > parts taken out. It is not mostly mechanical apart from the > make_vector_copies and convert_reg changes which move existing > "patterns" under appropriate conditionals and adds handling of the > case where the scalar mode fits in a single GPR (previously it > was -m32 DImode only, now it handles -m32/-m64 SImode and DImode). > > I'm redoing bootstrap / regtest on x86_64-unknown-linux-gnu now just > to be safe. > > OK? > > I do expect we need to work on the compile-time issue I placed ??? > comments on and more generally try to avoid using DF so much. > > Thanks, > Richard. > > 2019-08-13 Richard Biener > > PR target/91154 > * config/i386/i386-features.h (scalar_chain::scalar_chain): Add > mode arguments. > (scalar_chain::smode): New member. > (scalar_chain::vmode): Likewise. > (dimode_scalar_chain): Rename to... > (general_scalar_chain): ... this. > (general_scalar_chain::general_scalar_chain): Take mode arguments. > (timode_scalar_chain::timode_scalar_chain): Initialize scalar_chain > base with TImode and V1TImode. > * config/i386/i386-features.c (scalar_chain::scalar_chain): Adjust. > (general_scalar_chain::vector_const_cost): Adjust for SImode > chains. > (general_scalar_chain::compute_convert_gain): Likewise. Add > {S,U}{MIN,MAX} support. > (general_scalar_chain::replace_with_subreg): Use vmode/smode. > (general_scalar_chain::make_vector_copies): Likewise. Handle > non-DImode chains appropriately. > (general_scalar_chain::convert_reg): Likewise. > (general_scalar_chain::convert_op): Likewise. > (general_scalar_chain::convert_insn): Likewise. Add > fatal_insn_not_found if the result is not recognized. > (convertible_comparison_p): Pass in the scalar mode and use that. > (general_scalar_to_vector_candidate_p): Likewise. Rename from > dimode_scalar_to_vector_candidate_p. Add {S,U}{MIN,MAX} support. > (scalar_to_vector_candidate_p): Remove by inlining into single > caller. > (general_remove_non_convertible_regs): Rename from > dimode_remove_non_convertible_regs. > (remove_non_convertible_regs): Remove by inlining into single caller. > (convert_scalars_to_vector): Handle SImode and DImode chains > in addition to TImode chains. > * config/i386/i386.md (3): New expander. > (*3_1): New insn-and-split. > (*di3_doubleword): Likewise. > > * gcc.target/i386/pr91154.c: New testcase. > * gcc.target/i386/minmax-3.c: Likewise. > * gcc.target/i386/minmax-4.c: Likewise. > * gcc.target/i386/minmax-5.c: Likewise. > * gcc.target/i386/minmax-6.c: Likewise. > * gcc.target/i386/minmax-1.c: Add -mno-stv. > * gcc.target/i386/minmax-2.c: Likewise. OK. Thanks, Uros. > Index: gcc/config/i386/i386-features.c > === > --- gcc/config/i386/i386-features.c (revision 274422) > +++ gcc/config/i386/i386-features.c (working copy) > @@ -276,8 +276,11 @@ unsigned scalar_chain::max_id = 0; > > /* Initialize new chain. 
*/ > > -scalar_chain::scalar_chain () > +scalar_chain::scalar_chain (enum machine_mode smode_, enum machine_mode > vmode_) > { > + smode = smode_; > + vmode = vmode_; > + >chain_id = ++max_id; > > if (dump_file) > @@ -319,7 +322,7 @@ scalar_chain::add_to_queue (unsigned ins > conversion. */ > > void > -dimode_scalar_chain::mark_dual_mode_def (df_ref def) > +general_scalar_chain::mark_dual_mode_def (df_ref def) > { >gcc_assert (DF_REF_REG_DEF_P (def)); > > @@ -409,6 +412,9 @@ scalar_chain::add_insn (bitmap candidate >&& !HARD_REGISTER_P (SET_DEST (def_set))) > bitmap_set_bit (defs, REGNO (SET_DEST (def_set))); > > + /* ??? The following is quadratic since analyze_register_chain > + iterates over all refs to look for dual-mode regs. Instead this > + should be done separately for all regs mentioned in the chain once. */ >df_ref ref; >df_ref def; >for (ref = DF_INSN_UID_DEFS (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref)) > @@ -469,19 +475,21 @@ scalar_chain::build (bitmap candidates, > instead of using a scalar one. */ > > int > -dimode_scalar_chain::vector_const_cost (rtx exp) > +general_scalar_chain::vector_const_cost (rtx
[committed][AArch64] Add support for SVE absolute comparisons
This patch adds support for floating-point absolute comparisons FACLT and FACLE (aliased as FACGT and FACGE with swapped operands). Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274443. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/iterators.md (SVE_COND_FP_ABS_CMP): New iterator. * config/aarch64/aarch64-sve.md (*aarch64_pred_fac): New pattern. gcc/testsuite/ * gcc.target/aarch64/sve/vcond_21.c: New test. * gcc.target/aarch64/sve/vcond_21_run.c: Likewise. Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64/iterators.md 2019-08-14 10:14:27.899953691 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 10:24:53.211364279 +0100 @@ -1709,6 +1709,11 @@ (define_int_iterator SVE_COND_FP_CMP_I0 UNSPEC_COND_FCMLT UNSPEC_COND_FCMNE]) +(define_int_iterator SVE_COND_FP_ABS_CMP [UNSPEC_COND_FCMGE + UNSPEC_COND_FCMGT + UNSPEC_COND_FCMLE + UNSPEC_COND_FCMLT]) + (define_int_iterator FCADD [UNSPEC_FCADD90 UNSPEC_FCADD270]) Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:22:19.524492496 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:24:53.211364279 +0100 @@ -94,7 +94,8 @@ ;; [INT,FP] Compare and select ;; [INT] Comparisons ;; [INT] While tests -;; [FP] Comparisons +;; [FP] Direct comparisons +;; [FP] Absolute comparisons ;; [PRED] Test bits ;; ;; == Reductions @@ -3364,7 +3365,7 @@ (define_insn_and_rewrite "*while_ult_and ) ;; - +;; [FP] Absolute comparisons +;; - +;; Includes: +;; - FACGE +;; - FACGT +;; - FACLE +;; - FACLT +;; - + +;; Predicated floating-point absolute comparisons. +(define_insn_and_rewrite "*aarch64_pred_fac" + [(set (match_operand: 0 "register_operand" "=Upa") + (unspec: + [(match_operand: 1 "register_operand" "Upl") + (match_operand:SI 4 "aarch64_sve_ptrue_flag") + (unspec:SVE_F +[(match_operand 5) + (match_operand:SI 6 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 2 "register_operand" "w")] +UNSPEC_COND_FABS) + (unspec:SVE_F +[(match_operand 7) + (match_operand:SI 8 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 3 "register_operand" "w")] +UNSPEC_COND_FABS)] + SVE_COND_FP_ABS_CMP))] + "TARGET_SVE + && aarch64_sve_pred_dominates_p (&operands[5], operands[1]) + && aarch64_sve_pred_dominates_p (&operands[7], operands[1])" + "fac\t%0., %1/z, %2., %3." + "&& (!rtx_equal_p (operands[1], operands[5]) + || !rtx_equal_p (operands[1], operands[7]))" + { +operands[5] = copy_rtx (operands[1]); +operands[7] = copy_rtx (operands[1]); + } +) + +;; - ;; [PRED] Test bits ;; - ;; Includes: Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_21.c === --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_21.c 2019-08-14 10:24:53.211364279 +0100 @@ -0,0 +1,34 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#define DEF_LOOP(TYPE, ABS, NAME, OP) \ + void \ + test_##TYPE##_##NAME (TYPE *restrict r, \ + TYPE *restrict a, \ + TYPE *restrict b, int n)\ + {\ +for (int i = 0; i < n; ++i)\ + r[i] = ABS (a[i]) OP ABS (b[i]) ? 1.0 : 0.0; \ + } + +#define TEST_TYPE(T, TYPE, ABS)\ + T (TYPE, ABS, lt, <) \ + T (TYPE, ABS, le, <=)\ + T (TYPE, ABS, ge, >=)\ + T (TYPE, ABS, gt, >) + +#define TEST_ALL(T)\ + TEST_TYPE (T, _Float16, __builtin_fabsf16) \ + TEST_TYPE (T, float, __builtin_fabsf)\ + TEST_TYPE (T, double, __builtin_fabs) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tfac[lg]t\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */ +/* { dg-fin
Re: [PATCH 1/3] Perform fold when propagating.
On Tue, Aug 13, 2019 at 1:24 PM Robin Dapp wrote: > > > May I suggest to add a parameter to the substitute-and-fold engine > > so we can do the folding on all stmts only when enabled and enable > > it just for VRP? That also avoids the testsuite noise. > > Would something along these lines do? > > diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c > index 7a8f1e037b0..6c0d743b823 100644 > --- a/gcc/tree-ssa-propagate.c > +++ b/gcc/tree-ssa-propagate.c > @@ -814,7 +814,6 @@ ssa_propagation_engine::ssa_propagate (void) >ssa_prop_fini (); > } > > - > /* Return true if STMT is of the form 'mem_ref = RHS', where 'mem_ref' > is a non-volatile pointer dereference, a structure reference or a > reference to a single _DECL. Ignore volatile memory references > @@ -1064,11 +1063,10 @@ > substitute_and_fold_dom_walker::before_dom_children (basic_block bb) >/* Replace real uses in the statement. */ >did_replace |= substitute_and_fold_engine->replace_uses_in (stmt); > > - if (did_replace) > - gimple_set_modified (stmt, true); > - > - if (fold_stmt (&i, follow_single_use_edges)) > + /* If we made a replacement, fold the statement. */ > + if (did_replace || > substitute_and_fold_engine->should_fold_all_stmts ()) > { > + fold_stmt (&i, follow_single_use_edges); > did_replace = true; > stmt = gsi_stmt (i); > gimple_set_modified (stmt, true); > diff --git a/gcc/tree-ssa-propagate.h b/gcc/tree-ssa-propagate.h > index 81b635e0787..939680f487c 100644 > --- a/gcc/tree-ssa-propagate.h > +++ b/gcc/tree-ssa-propagate.h > @@ -107,6 +107,13 @@ class substitute_and_fold_engine >bool substitute_and_fold (basic_block = NULL); >bool replace_uses_in (gimple *); >bool replace_phi_args_in (gphi *); > + > + /* Users like VRP can overwrite this when they want to perform > + folding for every propagation. */ > + virtual bool should_fold_all_stmts (void) > +{ > + return false; > +} Since this is constant for a single invocation I'd either add a flag param to substitute_and_fold or a bool class member initialized at construction time. Also do if ((did_replace || fold_all_stmts) && fold_stmt (...)) { } to avoid extra work when folding does nothing. Otherwise yes, this would work. > }; > > #endif /* _TREE_SSA_PROPAGATE_H */ > diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c > index e2850682da2..8c8fa6f2bec 100644 > --- a/gcc/tree-vrp.c > +++ b/gcc/tree-vrp.c > @@ -6271,6 +6271,9 @@ class vrp_folder : public substitute_and_fold_engine > { return vr_values->simplify_stmt_using_ranges (gsi); } > tree op_with_constant_singleton_value_range (tree op) > { return vr_values->op_with_constant_singleton_value_range (op); } > + > + /* Enable aggressive folding in every propagation. */ > + bool should_fold_all_stmts (void) { return true; } > }; > > /* If the statement pointed by SI has a predicate whose value can be > > > > I think it's also only necessary to fold a stmt when an (indirect) use > > after substitution has either been folded or has (new) SSA name > > info (range/known-bits) set? > > Where would this need to be changed? It was just a random thought, doing this would need to keep track of "changed" SSA names (in a bitmap?) and before folding checking all uses on the stmt if they are "changed". The overhead for this may be higher than the folding savings we get. The "changed" would also need to trickle down some distance so patterns with deeper nested subexpressions would be tried. Richard. > > Regards > Robin >
Re: [PATCH][testsuite] Fix PR91419
On Tue, 13 Aug 2019, Hans-Peter Nilsson wrote: > > From: Richard Biener > > Date: Tue, 13 Aug 2019 09:50:34 +0200 > > > 2019-08-13 Richard Biener > > > > PR testsuite/91419 > > * lib/target-supports.exp (natural_alignment_32): Amend target > > list based on BIGGEST_ALIGNMENT. > > (natural_alignment_64): Targets not natural_alignment_32 cannot > > be natural_alignment_64. > > * gcc.dg/tree-ssa/pr91091-2.c: XFAIL for !natural_alignment_32. > > * gcc.dg/tree-ssa/ssa-fre-77.c: Likewise. > > * gcc.dg/tree-ssa/ssa-fre-61.c: Require natural_alignment_32. > > LGTM, thanks. (Not tested myself but my cris-elf autotester will pick it up.) Committed. Thanks, Richard.
[Ada] Illegal selection of first object in a task type's body not detected
The compiler was improperly allowing selection of an object declared within a task body when the prefix was of the task type, specifically in the case where the object was the very first declared in the body (selections of later body declarations were being flagged). The flag Is_Private_Op was only set at the point of the first "private" declaration of the type in cases where the first declaration's name didn't match the selector. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Gary Dismukes gcc/ada/ * sem_ch4.adb (Analyze_Selected_Component): In the case where the prefix is of a concurrent type, and the selected entity matching the selector is the first private declaration of the type (such as the first local variable in a task's body), set Is_Private_Op. gcc/testsuite/ * gnat.dg/task5.adb: New testcase.--- gcc/ada/sem_ch4.adb +++ gcc/ada/sem_ch4.adb @@ -4994,7 +4994,15 @@ package body Sem_Ch4 is if Comp = First_Private_Entity (Type_To_Use) then if Etype (Sel) /= Any_Type then - -- We have a candiate + -- If the first private entity's name matches, then treat + -- it as a private op: needed for the error check for + -- illegal selection of private entities further below. + + if Chars (Comp) = Chars (Sel) then +Is_Private_Op := True; + end if; + + -- We have a candidate, so exit the loop exit; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/task5.adb @@ -0,0 +1,26 @@ +procedure Task5 is + + task type T is + entry E (V1, V2 : Integer); + end T; + + T_Obj : T; + + task body T is + V1 : Integer; + V2 : Integer; + V3 : Integer; + begin + accept E (V1, V2 : Integer) do + T.V1 := V1; + T.V2 := V2; + + T_Obj.V1 := V1; -- { dg-error "invalid reference to private operation of some object of type \"T\"" } + T_Obj.V2 := V2; -- { dg-error "invalid reference to private operation of some object of type \"T\"" } + T_Obj.V3 := V3; -- { dg-error "invalid reference to private operation of some object of type \"T\"" } + end E; + end T; + +begin + null; +end Task5;
[Ada] Fix failing assertions on SPARK elaboration
Checking of SPARK elaboration rules may lead to assertion failures on a compiler built with assertions. Now fixed. There is no impact on compilation. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Yannick Moy gcc/ada/ * sem_disp.adb (Check_Dispatching_Operation): Update assertion for the separate declarations created in GNATprove mode. * sem_disp.ads (Is_Overriding_Subprogram): Update comment. * sem_elab.adb (SPARK_Processor): Fix test for checking of overriding primitives.--- gcc/ada/sem_disp.adb +++ gcc/ada/sem_disp.adb @@ -1149,6 +1149,10 @@ package body Sem_Disp is -- overridden primitives. The wrappers include checks on these -- modified conditions. (AI12-113). + -- 5. Declarations built for subprograms without separate spec which + -- are eligible for inlining in GNATprove (inside + -- Sem_Ch6.Analyze_Subprogram_Body_Helper). + if Present (Old_Subp) and then Present (Overridden_Operation (Subp)) and then Is_Dispatching_Operation (Old_Subp) @@ -1168,7 +1172,9 @@ package body Sem_Disp is or else Get_TSS_Name (Subp) = TSS_Stream_Read or else Get_TSS_Name (Subp) = TSS_Stream_Write - or else Present (Contract (Overridden_Operation (Subp; + or else Present (Contract (Overridden_Operation (Subp))) + + or else GNATprove_Mode); Check_Controlling_Formals (Tagged_Type, Subp); Override_Dispatching_Operation (Tagged_Type, Old_Subp, Subp); --- gcc/ada/sem_disp.ads +++ gcc/ada/sem_disp.ads @@ -151,7 +151,8 @@ package Sem_Disp is -- Returns True if E is a null procedure that is an interface primitive function Is_Overriding_Subprogram (E : Entity_Id) return Boolean; - -- Returns True if E is an overriding subprogram + -- Returns True if E is an overriding subprogram and False otherwise, in + -- particular for an inherited subprogram. function Is_Tag_Indeterminate (N : Node_Id) return Boolean; -- Returns true if the expression N is tag-indeterminate. An expression --- gcc/ada/sem_elab.adb +++ gcc/ada/sem_elab.adb @@ -49,6 +49,7 @@ with Sem_Aux; use Sem_Aux; with Sem_Cat; use Sem_Cat; with Sem_Ch7; use Sem_Ch7; with Sem_Ch8; use Sem_Ch8; +with Sem_Disp; use Sem_Disp; with Sem_Prag; use Sem_Prag; with Sem_Util; use Sem_Util; with Sinfo;use Sinfo; @@ -15233,9 +15234,12 @@ package body Sem_Elab is begin -- Nothing to do for predefined primitives because they are -- artifacts of tagged type expansion and cannot override source --- primitives. +-- primitives. Nothing to do as well for inherited primitives as +-- the check concerns overridding ones. -if Is_Predefined_Dispatching_Operation (Prim) then +if Is_Predefined_Dispatching_Operation (Prim) + or else not Is_Overriding_Subprogram (Prim) +then return; end if;
[Ada] Crash on precondition involving quantified expression
This patch fixes a compiler abort on a precondition whose condition includes a quantified expression. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Ed Schonberg gcc/ada/ * sem_util.adb (New_Copy_Tree, Visit_Entity): A quantified expression includes the implicit declaration of the loop parameter. When a quantified expression is copied during expansion, for example when building the precondition code from the generated pragma, a new loop parameter must be created for the new tree, to prevent duplicate declarations for the same symbol. gcc/testsuite/ * gnat.dg/predicate12.adb, gnat.dg/predicate12.ads: New testcase.--- gcc/ada/sem_util.adb +++ gcc/ada/sem_util.adb @@ -20799,16 +20799,27 @@ package body Sem_Util is -- this restriction leads to a performance penalty. -- ??? this list is flaky, and may hide dormant bugs + -- Should functions be included??? + + -- Loop parameters appear within quantified expressions and contain + -- an entity declaration that must be replaced when the expander is + -- active if the expression has been preanalyzed or analyzed. elsif not Ekind_In (Id, E_Block, E_Constant, E_Label, + E_Loop_Parameter, E_Procedure, E_Variable) and then not Is_Type (Id) then return; + elsif Ekind (Id) = E_Loop_Parameter + and then No (Etype (Condition (Parent (Parent (Id) + then +return; + -- Nothing to do when the entity was already visited elsif NCT_Tables_In_Use @@ -21081,7 +21092,14 @@ package body Sem_Util is begin pragma Assert (Nkind (N) not in N_Entity); - if Nkind (N) = N_Expression_With_Actions then + -- If the node is a quantified expression and expander is active, + -- it contains an implicit declaration that may require a new entity + -- when the condition has already been (pre)analyzed. + + if Nkind (N) = N_Expression_With_Actions + or else + (Nkind (N) = N_Quantified_Expression and then Expander_Active) + then EWA_Level := EWA_Level + 1; elsif EWA_Level > 0 @@ -21225,6 +21243,12 @@ package body Sem_Util is --* Semantic fields of nodes such as First_Real_Statement must be -- updated to reference the proper replicated nodes. + -- Finally, quantified expressions contain an implicit delaration for + -- the bound variable. Given that quantified expressions appearing + -- in contracts are copied to create pragmas and eventually checking + -- procedures, a new bound variable must be created for each copy, to + -- prevent multiple declarations of the same symbol. + -- To meet all these demands, routine New_Copy_Tree is split into two -- phases. --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/predicate12.adb @@ -0,0 +1,6 @@ +-- { dg-do compile } +-- { dg-options "-gnata" } + +package body Predicate12 is + procedure Dummy is null; +end Predicate12; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/predicate12.ads @@ -0,0 +1,42 @@ +package Predicate12 is + + subtype Index_Type is Positive range 1 .. 100; + type Array_Type is array(Index_Type) of Integer; + + type Search_Engine is interface; + + procedure Search + (S : in Search_Engine; + Search_Item : in Integer; + Items : in Array_Type; + Found : out Boolean; + Result : out Index_Type) is abstract + with + Pre'Class => + (for all J in Items'Range => + (for all K in J + 1 .. 
Items'Last => Items(J) <= Items(K))), + Post'Class => + (if Found then Search_Item = Items(Result) + else (for all J in Items'Range => Items(J) /= Search_Item)); + + type Binary_Search_Engine is new Search_Engine with null record; + + procedure Search + (S : in Binary_Search_Engine; + Search_Item : in Integer; + Items : in Array_Type; + Found : out Boolean; + Result : out Index_Type) is null; + + type Forward_Search_Engine is new Search_Engine with null record; + + procedure Search + (S : in Forward_Search_Engine; + Search_Item : in Integer; + Items : in Array_Type; + Found : out Boolean; + Result : out Index_Type) is null; + + procedure Dummy; + +end Predicate12;
[Ada] Fix discrepancy in mechanism tracking private and full views
This fixes a discrepancy in the mechanism tracking the private and full views of entities when entering and leaving scopes. This mechanism records private entities that are dependent on other private entities, so that the exchange done on entering and leaving scopes can be propagated. The propagation is done recursively on entering child units, but it was not done recursively on leaving them, which would leave the dependency chains in an uncertain state in this case. That's mostly visible when inlining across units is enabled for code involving a lot of generic units. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Eric Botcazou gcc/ada/ * sem_ch7.adb (Install_Private_Declarations) <Swap_Private_Dependents>: Do not rely solely on the Is_Child_Unit flag on the unit to recurse. (Uninstall_Declarations) <Swap_Private_Dependents>: New function. Use it to recurse on the private dependent entities for child units. gcc/testsuite/ * gnat.dg/inline18.adb, gnat.dg/inline18.ads, gnat.dg/inline18_gen1-inner_g.ads, gnat.dg/inline18_gen1.adb, gnat.dg/inline18_gen1.ads, gnat.dg/inline18_gen2.adb, gnat.dg/inline18_gen2.ads, gnat.dg/inline18_gen3.adb, gnat.dg/inline18_gen3.ads, gnat.dg/inline18_pkg1.adb, gnat.dg/inline18_pkg1.ads, gnat.dg/inline18_pkg2-child.ads, gnat.dg/inline18_pkg2.ads: New testcase.--- gcc/ada/sem_ch7.adb +++ gcc/ada/sem_ch7.adb @@ -2261,13 +2261,14 @@ package body Sem_Ch7 is procedure Swap_Private_Dependents (Priv_Deps : Elist_Id); -- When the full view of a private type is made available, we do the -- same for its private dependents under proper visibility conditions. - -- When compiling a grandchild unit this needs to be done recursively. + -- When compiling a child unit this needs to be done recursively. - -- Swap_Private_Dependents -- - procedure Swap_Private_Dependents (Priv_Deps : Elist_Id) is + Cunit : Entity_Id; Deps : Elist_Id; Priv : Entity_Id; Priv_Elmt : Elmt_Id; @@ -2285,6 +2286,7 @@ package body Sem_Ch7 is if Present (Full_View (Priv)) and then Is_Visible_Dependent (Priv) then if Is_Private_Type (Priv) then + Cunit := Cunit_Entity (Current_Sem_Unit); Deps := Private_Dependents (Priv); Is_Priv := True; else @@ -2312,11 +2314,14 @@ package body Sem_Ch7 is Set_Is_Potentially_Use_Visible (Priv, Is_Potentially_Use_Visible (Node (Priv_Elmt))); - -- Within a child unit, recurse, except in generic child unit, - -- which (unfortunately) handle private_dependents separately. + -- Recurse for child units, except in generic child units, + -- which unfortunately handle private_dependents separately. + -- Note that the current unit may not have been analyzed, + -- for example a package body, so we cannot rely solely on + -- the Is_Child_Unit flag, but that's only an optimization. if Is_Priv - and then Is_Child_Unit (Cunit_Entity (Current_Sem_Unit)) + and then (No (Etype (Cunit)) or else Is_Child_Unit (Cunit)) and then not Is_Empty_Elmt_List (Deps) and then not Inside_A_Generic then @@ -2701,13 +2706,16 @@ package body Sem_Ch7 is Decl : constant Node_Id := Unit_Declaration_Node (P); Id: Entity_Id; Full : Entity_Id; - Priv_Elmt : Elmt_Id; - Priv_Sub : Entity_Id; procedure Preserve_Full_Attributes (Priv : Entity_Id; Full : Entity_Id); -- Copy to the private declaration the attributes of the full view that -- need to be available for the partial view also. + procedure Swap_Private_Dependents (Priv_Deps : Elist_Id); + -- When the full view of a private type is made unavailable, we do the + -- same for its private dependents under proper visibility conditions. + -- When compiling a child unit this needs to be done recursively. 
+ function Type_In_Use (T : Entity_Id) return Boolean; -- Check whether type or base type appear in an active use_type clause @@ -2826,6 +2834,66 @@ package body Sem_Ch7 is end if; end Preserve_Full_Attributes; + - + -- Swap_Private_Dependents -- + - + + procedure Swap_Private_Dependents (Priv_Deps : Elist_Id) is + Cunit : Entity_Id; + Deps : Elist_Id; + Priv : Entity_Id; + Priv_Elmt : Elmt_Id; + Is_Priv : Boolean; + + begin + Priv_Elmt := First_Elmt (Priv_Deps); + while Present (Priv_Elmt) loop +
[Ada] Spurious error in discriminated aggregate
This patch fixes a bug in which a spurious error is given on an aggregate of a type derived from a subtype with a constrained discriminant. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Bob Duff gcc/ada/ * exp_aggr.adb (Init_Hidden_Discriminants): Avoid processing the wrong discriminant, which could be of the wrong type. gcc/testsuite/ * gnat.dg/discr57.adb: New testcase.--- gcc/ada/exp_aggr.adb +++ gcc/ada/exp_aggr.adb @@ -2689,8 +2689,10 @@ package body Exp_Aggr is Discr_Constr := First_Elmt (Stored_Constraint (Full_View (Base_Typ))); +-- Otherwise, no discriminant to process + else - Discr_Constr := First_Elmt (Stored_Constraint (Typ)); + Discr_Constr := No_Elmt; end if; while Present (Discr) and then Present (Discr_Constr) loop --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/discr57.adb @@ -0,0 +1,17 @@ +-- { dg-do compile } + +procedure Discr57 is + + type T1(Scalar : Boolean) is abstract tagged null record; + + subtype S1 is T1 (Scalar => False); + + type T2(Lower_Bound : Natural) is new + S1 with null record; + + Obj : constant T2 := + (Lower_Bound => 123); + +begin + null; +end Discr57;
[Ada] Expose part of ownership checking for use in GNATprove
GNATprove needs to be able to call a subset of the ownership legality rules from marking. This is provided by a new function Sem_SPARK.Is_Legal. There is no impact on compilation. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Yannick Moy gcc/ada/ * sem_spark.adb, sem_spark.ads (Is_Legal): New function exposed for use in GNATprove, to test legality rules not related to permissions. (Check_Declaration_Legality): Extract the part of Check_Declaration that checks rules not related to permissions. (Check_Declaration): Call the new Check_Declaration_Legality. (Check_Type_Legality): Rename of Check_Type. Introduce parameters to force or not checking, and update a flag detecting illegalities. (Check_Node): Ignore attribute references in statement position.--- gcc/ada/sem_spark.adb +++ gcc/ada/sem_spark.adb @@ -637,6 +637,14 @@ package body Sem_SPARK is procedure Check_Declaration (Decl : Node_Id); + procedure Check_Declaration_Legality + (Decl : Node_Id; + Force : Boolean; + Legal : in out Boolean); + -- Check the legality of declaration Decl regarding rules not related to + -- permissions. Update Legal to False if a rule is violated. Issue an + -- error message if Force is True and Emit_Messages returns True. + procedure Check_Expression (Expr : Node_Id; Mode : Extended_Checking_Mode); pragma Precondition (Nkind_In (Expr, N_Index_Or_Discriminant_Constraint, N_Range_Constraint, @@ -686,7 +694,10 @@ package body Sem_SPARK is procedure Check_Statement (Stmt : Node_Id); - procedure Check_Type (Typ : Entity_Id); + procedure Check_Type_Legality + (Typ : Entity_Id; + Force : Boolean; + Legal : in out Boolean); -- Check that type Typ is either not deep, or that it is an observing or -- owning type according to SPARK RM 3.10 @@ -1138,11 +1149,12 @@ package body Sem_SPARK is Expr_Root : Entity_Id; Perm: Perm_Kind; Status : Error_Status; + Dummy : Boolean := True; -- Start of processing for Check_Assignment begin - Check_Type (Target_Typ); + Check_Type_Legality (Target_Typ, Force => True, Legal => Dummy); if Is_Anonymous_Access_Type (Target_Typ) then Check_Source_Of_Borrow_Or_Observe (Expr, Status); @@ -1410,11 +1422,18 @@ package body Sem_SPARK is Target : constant Entity_Id := Defining_Identifier (Decl); Target_Typ : constant Node_Id := Etype (Target); Expr : Node_Id; + Dummy : Boolean := True; begin + -- Start with legality rules not related to permissions + + Check_Declaration_Legality (Decl, Force => True, Legal => Dummy); + + -- Now check permission-related legality rules + case N_Declaration'(Nkind (Decl)) is when N_Full_Type_Declaration => -Check_Type (Target); +null; -- ??? What about component declarations with defaults. 
@@ -1424,7 +1443,105 @@ package body Sem_SPARK is when N_Object_Declaration => Expr := Expression (Decl); -Check_Type (Target_Typ); +if Present (Expr) then + Check_Assignment (Target => Target, + Expr => Expr); +end if; + +if Is_Deep (Target_Typ) then + declare + Tree : constant Perm_Tree_Access := +new Perm_Tree_Wrapper' + (Tree => + (Kind=> Entire_Object, + Is_Node_Deep=> True, + Explanation => Decl, + Permission => Read_Write, + Children_Permission => Read_Write)); + begin + Set (Current_Perm_Env, Target, Tree); + end; +end if; + + when N_Iterator_Specification => +null; + + when N_Loop_Parameter_Specification => +null; + + -- Checking should not be called directly on these nodes + + when N_Function_Specification +| N_Entry_Declaration +| N_Procedure_Specification +| N_Component_Declaration + => +raise Program_Error; + + -- Ignored constructs for pointer checking + + when N_Formal_Object_Declaration +| N_Formal_Type_Declaration +| N_Incomplete_Type_Declaration +| N_Private_Extension_Declaration +| N_Private_Type_Declaration +| N_Protected_Type_Declaration + => +null; + + -- The following nodes are rewritten by semantic analysis + + when N_Expression_Function => +raise Program_Error; + end case; + end Check_Declaratio
[Ada] Equality for nonabstract type derived from interface treated as abstract
The compiler was creating an abstract function for the equality operation of a (nonlimited) interface type, and that could result in errors on generic instantiations that are passed nonabstract types derived from the interface type along with the derived type's inherited equality operation (complaining about an abstract subprogram being passed to a nonabstract formal). The "=" operation of an interface is supposed to be nonabstract (a direct consequence of the rule in RM 4.5.2(6-7)), so we now create an expression function rather than an abstract function. The function returns False, but the result is unimportant since a function of an abstract type can never actually be invoked (its arguments must generally be class-wide, since there can be no objects of the type, and calling it will dispatch). Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Gary Dismukes gcc/ada/ * exp_ch3.adb (Predef_Spec_Or_Body): For an equality operation of an interface type, create an expression function (that returns False) rather than declaring an abstract function. * freeze.adb (Check_Inherited_Conditions): Set Needs_Wrapper to False unconditionally at the start of the loop creating wrappers for inherited operations. gcc/testsuite/ * gnat.dg/equal11.adb, gnat.dg/equal11_interface.ads, gnat.dg/equal11_record.adb, gnat.dg/equal11_record.ads: New testcase.--- gcc/ada/exp_ch3.adb +++ gcc/ada/exp_ch3.adb @@ -10313,8 +10313,24 @@ package body Exp_Ch3 is Result_Definition=> New_Occurrence_Of (Ret_Type, Loc)); end if; + -- Declare an abstract subprogram for primitive subprograms of an + -- interface type (except for "="). + if Is_Interface (Tag_Typ) then - return Make_Abstract_Subprogram_Declaration (Loc, Spec); + if Name /= Name_Op_Eq then +return Make_Abstract_Subprogram_Declaration (Loc, Spec); + + -- The equality function (if any) for an interface type is defined + -- to be nonabstract, so we create an expression function for it that + -- always returns False. Note that the function can never actually be + -- invoked because interface types are abstract, so there aren't any + -- objects of such types (and their equality operation will always + -- dispatch). + + else +return Make_Expression_Function + (Loc, Spec, New_Occurrence_Of (Standard_False, Loc)); + end if; -- If body case, return empty subprogram body. Note that this is ill- -- formed, because there is not even a null statement, and certainly not --- gcc/ada/freeze.adb +++ gcc/ada/freeze.adb @@ -1526,11 +1526,11 @@ package body Freeze is -- so that LSP can be verified/enforced. 
Op_Node := First_Elmt (Prim_Ops); - Needs_Wrapper := False; while Present (Op_Node) loop - Decls := Empty_List; - Prim := Node (Op_Node); + Decls := Empty_List; + Prim := Node (Op_Node); + Needs_Wrapper := False; if not Comes_From_Source (Prim) and then Present (Alias (Prim)) then Par_Prim := Alias (Prim); @@ -1601,8 +1601,6 @@ package body Freeze is (Par_R, New_List (New_Decl, New_Body)); end if; end; - -Needs_Wrapper := False; end if; Next_Elmt (Op_Node); --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/equal11.adb @@ -0,0 +1,37 @@ +-- { dg-do run } + +with Equal11_Record; + +procedure Equal11 is + + use Equal11_Record; + + R : My_Record_Type; + L : My_Record_Type_List_Pck.List; +begin + -- Single record + R.F := 42; + R.Put; + if Put_Result /= 42 then +raise Program_Error; + end if; + + -- List of records + L.Append ((F => 3)); + L.Append ((F => 2)); + L.Append ((F => 1)); + + declare +Expected : constant array (Positive range <>) of Integer := + (3, 2, 1); +I : Positive := 1; + begin +for LR of L loop + LR.Put; + if Put_Result /= Expected (I) then +raise Program_Error; + end if; + I := I + 1; +end loop; + end; +end Equal11; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/equal11_interface.ads @@ -0,0 +1,7 @@ +package Equal11_Interface is + + type My_Interface_Type is interface; + + procedure Put (R : in My_Interface_Type) is abstract; + +end Equal11_Interface; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/equal11_record.adb @@ -0,0 +1,10 @@ +with Ada.Text_IO; + +package body Equal11_Record is + + procedure Put (R : in My_Record_Type) is + begin +Put_Result := R.F; + end Put; + +end Equal11_Record; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/equal11_record.ads @@ -0,0 +1,21 @@ +with Ada.Containers.Doubly_Linked_Lists; +with Equal11_Interface; + +package Equal11_Record is + + use Eq
[Ada] Strengthen Locked flag
This patch strengthens the Locked flag, by Asserting that it is False on operations that might cause reallocation. No change in behavior (except in the presence of compiler bugs), so no test. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Bob Duff gcc/ada/ * table.adb: Assert that the table is not locked when increasing Last, even if it doesn't cause reallocation. In other words, assert that on operations that MIGHT cause reallocation. * table.ads: Fix comment accordingly.--- gcc/ada/table.adb +++ gcc/ada/table.adb @@ -80,6 +80,7 @@ package body Table is procedure Append (New_Val : Table_Component_Type) is begin + pragma Assert (not Locked); Set_Item (Table_Index_Type (Last_Val + 1), New_Val); end Append; @@ -120,6 +121,7 @@ package body Table is procedure Increment_Last is begin + pragma Assert (not Locked); Last_Val := Last_Val + 1; if Last_Val > Max then @@ -384,6 +386,8 @@ package body Table is procedure Set_Last (New_Val : Table_Index_Type) is begin + pragma Assert (Int (New_Val) <= Last_Val or else not Locked); + if Int (New_Val) < Last_Val then Last_Val := Int (New_Val); --- gcc/ada/table.ads +++ gcc/ada/table.ads @@ -130,14 +130,15 @@ package Table is -- First .. Last. Locked : Boolean := False; - -- Table expansion is permitted only if this switch is set to False. A - -- client may set Locked to True, in which case any attempt to expand - -- the table will cause an assertion failure. Note that while a table - -- is locked, its address in memory remains fixed and unchanging. This - -- feature is used to control table expansion during Gigi processing. - -- Gigi assumes that tables other than the Uint and Ureal tables do - -- not move during processing, which means that they cannot be expanded. - -- The Locked flag is used to enforce this restriction. + -- Increasing the value of Last is permitted only if this switch is set + -- to False. A client may set Locked to True, in which case any attempt + -- to increase the value of Last (which might expand the table) will + -- cause an assertion failure. Note that while a table is locked, its + -- address in memory remains fixed and unchanging. This feature is used + -- to control table expansion during Gigi processing. Gigi assumes that + -- tables other than the Uint and Ureal tables do not move during + -- processing, which means that they cannot be expanded. The Locked + -- flag is used to enforce this restriction. procedure Init; -- This procedure allocates a new table of size Initial (freeing any
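For concreteness, here is a hedged sketch of what the stricter assertion now catches. The instantiation and procedure below are invented, and Table is a GNAT-internal generic, so this is illustrative rather than a standalone program:

   package Int_Table is new Table
     (Table_Component_Type => Integer,
      Table_Index_Type     => Natural,
      Table_Low_Bound      => 1,
      Table_Initial        => 8,
      Table_Increment      => 100,
      Table_Name           => "Int_Table");

   procedure Demo is
   begin
      Int_Table.Append (1);      --  fine: the table is not locked
      Int_Table.Locked := True;
      Int_Table.Append (2);      --  now fails the assertion, even when no
                                 --  reallocation would actually occur
   end Demo;

Previously only the reallocation itself asserted, so a locked table could still grow within its current capacity; with this change any increase of Last is rejected.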
[Ada] Crash on quantified expression in disabled assertion
The defining identifier of a quantified expression may be the freeze point of its type. If the quantified expression appears in an assertion that is disabled, the freeze node for that type may appear in a tree that will be discarded when the enclosing pragma is elaborated. To ensure that the freeze node is reachable for subsequent uses, we must generate its freeze node explicitly when the quantified expression is analyzed. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Ed Schonberg gcc/ada/ * exp_ch4.adb (Expand_N_Quantified_Expression): Freeze explicitly the type of the loop parameter. gcc/testsuite/ * gnat.dg/assert2.adb, gnat.dg/assert2.ads: New testcase.--- gcc/ada/exp_ch4.adb +++ gcc/ada/exp_ch4.adb @@ -10337,8 +10337,30 @@ package body Exp_Ch4 is Flag : Entity_Id; Scheme: Node_Id; Stmts : List_Id; + Var : Entity_Id; begin + -- Ensure that the bound variable is properly frozen. We must do + -- this before expansion because the expression is about to be + -- converted into a loop, and resulting freeze nodes may end up + -- in the wrong place in the tree. + + if Present (Iter_Spec) then + Var := Defining_Identifier (Iter_Spec); + else + Var := Defining_Identifier (Loop_Spec); + end if; + + declare + P : Node_Id := Parent (N); + begin + while Nkind (P) in N_Subexpr loop +P := Parent (P); + end loop; + + Freeze_Before (P, Etype (Var)); + end; + -- Create the declaration of the flag which tracks the status of the -- quantified expression. Generate: --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/assert2.adb @@ -0,0 +1,5 @@ +-- { dg-do compile } + +package body Assert2 is + procedure Dummy is null; +end Assert2; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/assert2.ads @@ -0,0 +1,15 @@ +package Assert2 +with SPARK_Mode +is + type Living is new Integer; + function Is_Martian (Unused : Living) return Boolean is (False); + + function Is_Green (Unused : Living) return Boolean is (True); + + pragma Assert + (for all M in Living => (if Is_Martian (M) then Is_Green (M))); + pragma Assert + (for all M in Living => (if Is_Martian (M) then not Is_Green (M))); + + procedure Dummy; +end Assert2;
[Ada] Warn about unknown condition in Compile_Time_Warning
The compiler now warns if the condition in a pragma Compile_Time_Warning or Compile_Time_Error does not have a compile-time-known value. The warning is not given for pragmas in a generic template, but is given for pragmas in an instance. The -gnatw_c and -gnatw_C switches turn the warning on and off. The default is on. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Bob Duff gcc/ada/ * sem_prag.ads, sem_prag.adb (Process_Compile_Time_Warning_Or_Error): In parameterless version, improve detection of whether we are in a generic unit to cover the case of an instance within a generic unit. (Process_Compile_Time_Warning_Or_Error): Rename the two-parameter version to be Validate_Compile_Time_Warning_Or_Error, and do not export it. Issue a warning if the condition is not known at compile time. The key point is that the warning must be given only for pragmas deferred to the back end, because the back end discovers additional values that are known at compile time. Previous changes in this ticket have enabled this by deferring to the back end without checking for special cases such as 'Size. (Validate_Compile_Time_Warning_Or_Error): Rename to be Defer_Compile_Time_Warning_Error_To_BE. * warnsw.ads, warnsw.adb (Warn_On_Unknown_Compile_Time_Warning): Add new switches -gnatw_c and -gnatw_C to control the above warning. * doc/gnat_ugn/building_executable_programs_with_gnat.rst: Document new switches. * gnat_ugn.texi: Regenerate. gcc/testsuite/ * gnat.dg/warn27.adb: New testcase.
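Since the testcase travels in the gzipped attachment, here is a hedged sketch (with invented names, not the attached warn27.adb) of the kind of condition the new warning is about:

   procedure Warn_Demo (X : Integer) is
      pragma Compile_Time_Warning (X > 0, "X is positive");
      --  The condition depends on a parameter, so it does not have a
      --  compile-time-known value: with -gnatw_c (the default) the
      --  compiler now warns that the condition cannot be evaluated.

      pragma Compile_Time_Warning (Integer'Size < 32, "small Integer");
      --  Compile-time-known condition: no new warning here.
   begin
      null;
   end Warn_Demo;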
[Ada] Fix spurious ownership error in GNATprove
Like Is_Path_Expression, function Is_Subpath_Expression should consider the possibility that the subpath is a type conversion or type qualification over the actual subpath node. This avoids spurious ownership errors in GNATprove. There is no impact on compilation. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Yannick Moy gcc/ada/ * sem_spark.adb (Is_Subpath_Expression): Take into account conversion and qualification.--- gcc/ada/sem_spark.adb +++ gcc/ada/sem_spark.adb @@ -4266,6 +4266,12 @@ package body Sem_SPARK is is begin return Is_Path_Expression (Expr, Is_Traversal) + +or else (Nkind_In (Expr, N_Qualified_Expression, + N_Type_Conversion, + N_Unchecked_Type_Conversion) + and then Is_Subpath_Expression (Expression (Expr))) + or else (Nkind (Expr) = N_Attribute_Reference and then (Get_Attribute_Id (Attribute_Name (Expr)) = @@ -4276,7 +4282,8 @@ package body Sem_SPARK is or else Get_Attribute_Id (Attribute_Name (Expr)) = Attribute_Image)) - or else Nkind (Expr) = N_Op_Concat; + +or else Nkind (Expr) = N_Op_Concat; end Is_Subpath_Expression; ---
[Ada] Alignment may be specified as zero
An Alignment clause or an aspect_specification for Alignment may be specified as 0, which is treated the same as 1. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Bob Duff gcc/ada/ * sem_ch13.adb (Get_Alignment_Value): Return 1 for Alignment 0, and do not give an error. * doc/gnat_rm/representation_clauses_and_pragmas.rst: Update the corresponding documentation. * gnat_rm.texi: Regenerate. gcc/testsuite/ * gnat.dg/alignment15.adb: New testcase.--- gcc/ada/doc/gnat_rm/representation_clauses_and_pragmas.rst +++ gcc/ada/doc/gnat_rm/representation_clauses_and_pragmas.rst @@ -30,9 +30,11 @@ Alignment Clauses .. index:: Alignment Clause -GNAT requires that all alignment clauses specify a power of 2, and all -default alignments are always a power of 2. The default alignment -values are as follows: +GNAT requires that all alignment clauses specify 0 or a power of 2, and +all default alignments are always a power of 2. Specifying 0 is the +same as specifying 1. + +The default alignment values are as follows: * *Elementary Types*. @@ -610,23 +612,23 @@ alignment of the type (this is true for all types). In some cases the end record; -On a typical 32-bit architecture, the X component will occupy four bytes -and the Y component will occupy one byte, for a total of 5 bytes. As a -result ``R'Value_Size`` will be 40 (bits) since this is the minimum size -required to store a value of this type. For example, it is permissible -to have a component of type R in an array whose component size is -specified to be 40 bits. - -However, ``R'Object_Size`` will be 64 (bits). The difference is due to -the alignment requirement for objects of the record type. The X -component will require four-byte alignment because that is what type -Integer requires, whereas the Y component, a Character, will only -require 1-byte alignment. Since the alignment required for X is the -greatest of all the components' alignments, that is the alignment -required for the enclosing record type, i.e., 4 bytes or 32 bits. As -indicated above, the actual object size must be rounded up so that it is -a multiple of the alignment value. Therefore, 40 bits rounded up to the -next multiple of 32 yields 64 bits. +On a typical 32-bit architecture, the X component will occupy four bytes +and the Y component will occupy one byte, for a total of 5 bytes. As a +result ``R'Value_Size`` will be 40 (bits) since this is the minimum size +required to store a value of this type. For example, it is permissible +to have a component of type R in an array whose component size is +specified to be 40 bits. + +However, ``R'Object_Size`` will be 64 (bits). The difference is due to +the alignment requirement for objects of the record type. The X +component will require four-byte alignment because that is what type +Integer requires, whereas the Y component, a Character, will only +require 1-byte alignment. Since the alignment required for X is the +greatest of all the components' alignments, that is the alignment +required for the enclosing record type, i.e., 4 bytes or 32 bits. As +indicated above, the actual object size must be rounded up so that it is +a multiple of the alignment value. Therefore, 40 bits rounded up to the +next multiple of 32 yields 64 bits. For all other types, the ``Object_Size`` and ``Value_Size`` are the same (and equivalent to the RM attribute ``Size``). 
--- gcc/ada/gnat_rm.texi +++ gcc/ada/gnat_rm.texi @@ -21,7 +21,7 @@ @copying @quotation -GNAT Reference Manual , Jul 31, 2019 +GNAT Reference Manual , Aug 01, 2019 AdaCore @@ -18369,9 +18369,11 @@ and this section describes the additional capabilities provided. @geindex Alignment Clause -GNAT requires that all alignment clauses specify a power of 2, and all -default alignments are always a power of 2. The default alignment -values are as follows: +GNAT requires that all alignment clauses specify 0 or a power of 2, and +all default alignments are always a power of 2. Specifying 0 is the +same as specifying 1. + +The default alignment values are as follows: @itemize * --- gcc/ada/sem_ch13.adb +++ gcc/ada/sem_ch13.adb @@ -11509,7 +11509,7 @@ package body Sem_Ch13 is if Align = No_Uint then return No_Uint; - elsif Align <= 0 then + elsif Align < 0 then -- This error is suppressed in ASIS mode to allow for different ASIS -- back ends or ASIS-based tools to query the illegal clause. @@ -11520,6 +11520,11 @@ package body Sem_Ch13 is return No_Uint; + -- If Alignment is specified to be 0, we treat it the same as 1 + + elsif Align = 0 then + return Uint_1; + else for J in Int range 0 .. 64 loop declare --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/alignment15.adb @@ -0,0 +1,17 @@ +-- { dg-compile } + +procedure Alignment15 is + type T0 is record + X
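As a minimal sketch of the relaxed rule (an invented example, separate from the attached alignment15.adb):

   procedure Align_Demo is
      type Byte_Pair is record
         A, B : Character;
      end record;
      for Byte_Pair'Alignment use 0;   --  now accepted and treated as 1

      type Word is new Integer with Alignment => 0;   --  aspect form, ditto
   begin
      null;
   end Align_Demo;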
[Ada] Incorrect error on inline protected function
This patch fixes a bug where if a protected function has a pragma Inline, and has no local variables, and the body consists of a single extended_return_statement, and the result type is an indefinite composite subtype, and inlining is enabled, the compiler gives an error, even though the program is legal. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Bob Duff gcc/ada/ * inline.adb (Check_And_Split_Unconstrained_Function): Ignore protected functions to get rid of spurious error. The transformation done by this procedure triggers legality errors in the generated code in this case. gcc/testsuite/ * gnat.dg/inline19.adb, gnat.dg/inline19.ads: New testcase.--- gcc/ada/inline.adb +++ gcc/ada/inline.adb @@ -2041,6 +2041,8 @@ package body Inline is Original_Body : Node_Id; Body_To_Analyze : Node_Id; + -- Start of processing for Build_Body_To_Inline + begin pragma Assert (Current_Scope = Spec_Id); @@ -2448,6 +2450,18 @@ package body Inline is elsif Present (Body_To_Inline (Decl)) then return; + -- Do not generate a body to inline for protected functions, because the + -- transformation generates a call to a protected procedure, causing + -- spurious errors. We don't inline protected operations anyway, so + -- this is no loss. We might as well ignore intrinsics and foreign + -- conventions as well -- just allow Ada conventions. + + elsif not (Convention (Spec_Id) = Convention_Ada +or else Convention (Spec_Id) = Convention_Ada_Pass_By_Copy +or else Convention (Spec_Id) = Convention_Ada_Pass_By_Reference) + then + return; + -- Check excluded declarations elsif Present (Declarations (N)) --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/inline19.adb @@ -0,0 +1,17 @@ +-- { dg-do compile } +-- { dg-options "-O2" } + +package body Inline19 is + + S : String := "Hello"; + + protected body P is + function F return String is + begin + return Result : constant String := S do +null; + end return; + end F; + end P; + +end Inline19; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/inline19.ads @@ -0,0 +1,8 @@ +package Inline19 is + + protected P is + function F return String; + pragma Inline (F); + end P; + +end Inline19;
[Ada] Check SPARK restriction on Old/Loop_Entry with pointers
SPARK RM rule 3.10(14) restricts the use of Old and Loop_Entry attributes on prefixes of an owning or observing type (i.e. a type with access inside). There is no impact on compilation. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Yannick Moy gcc/ada/ * sem_spark.adb (Check_Old_Loop_Entry): New procedure to check correct use of Old and Loop_Entry. (Check_Node): Check subprogram contracts. (Check_Pragma): Check Loop_Variant. (Check_Safe_Pointers): Apply checking to library-level subprogram declarations as well, in order to check their contract. --- gcc/ada/sem_spark.adb +++ gcc/ada/sem_spark.adb @@ -663,6 +663,9 @@ package body Sem_SPARK is procedure Check_Node (N : Node_Id); -- Main traversal procedure to check safe pointer usage + procedure Check_Old_Loop_Entry (N : Node_Id); + -- Check SPARK RM 3.10(14) regarding 'Old and 'Loop_Entry + procedure Check_Package_Body (Pack : Node_Id); procedure Check_Package_Spec (Pack : Node_Id); @@ -2583,6 +2586,43 @@ package body Sem_SPARK is procedure Check_Node (N : Node_Id) is + + procedure Check_Subprogram_Contract (N : Node_Id); + -- Check the postcondition-like contracts for use of 'Old + + --- + -- Check_Subprogram_Contract -- + --- + + procedure Check_Subprogram_Contract (N : Node_Id) is + begin + if Nkind (N) = N_Subprogram_Declaration + or else Acts_As_Spec (N) + then +declare + E: constant Entity_Id := Unique_Defining_Entity (N); + Post : constant Node_Id := + Get_Pragma (E, Pragma_Postcondition); + Cases: constant Node_Id := + Get_Pragma (E, Pragma_Contract_Cases); +begin + Check_Old_Loop_Entry (Post); + Check_Old_Loop_Entry (Cases); +end; + + elsif Nkind (N) = N_Subprogram_Body then +declare + E: constant Entity_Id := Defining_Entity (N); + Ref_Post : constant Node_Id := + Get_Pragma (E, Pragma_Refined_Post); +begin + Check_Old_Loop_Entry (Ref_Post); +end; + end if; + end Check_Subprogram_Contract; + + -- Start of processing for Check_Node + begin case Nkind (N) is when N_Declaration => @@ -2602,14 +2642,17 @@ package body Sem_SPARK is Check_Package_Body (N); end if; - when N_Subprogram_Body -| N_Entry_Body -| N_Task_Body - => + when N_Subprogram_Body => if not Is_Generic_Unit (Unique_Defining_Entity (N)) then + Check_Subprogram_Contract (N); Check_Callable_Body (N); end if; + when N_Entry_Body +| N_Task_Body + => +Check_Callable_Body (N); + when N_Protected_Body => Check_List (Declarations (N)); @@ -2622,6 +2665,9 @@ package body Sem_SPARK is when N_Pragma => Check_Pragma (N); + when N_Subprogram_Declaration => +Check_Subprogram_Contract (N); + -- Ignored constructs for pointer checking when N_Abstract_Subprogram_Declaration @@ -2655,7 +2701,6 @@ package body Sem_SPARK is | N_Procedure_Instantiation | N_Raise_xxx_Error | N_Record_Representation_Clause -| N_Subprogram_Declaration | N_Subprogram_Renaming_Declaration | N_Task_Type_Declaration | N_Use_Package_Clause @@ -2677,6 +2722,65 @@ package body Sem_SPARK is end case; end Check_Node; + -- + -- Check_Old_Loop_Entry -- + -- + + procedure Check_Old_Loop_Entry (N : Node_Id) is + + function Check_Attribute (N : Node_Id) return Traverse_Result; + + - + -- Check_Attribute -- + - + + function Check_Attribute (N : Node_Id) return Traverse_Result is + Attr_Id : Attribute_Id; + Aname : Name_Id; + Pref: Node_Id; + + begin + if Nkind (N) = N_Attribute_Reference then +Attr_Id := Get_Attribute_Id (Attribute_Name (N)); +Aname := Attribute_Name (N); + +if Attr_Id = Attribute_Old + or else Attr_Id = Attribute_Loop_Entry +then + Pref := Prefix (N); + + if Is_Deep (Etype (Pref)) then + if 
Nkind (Pref) /= N_Function_Call then + if Emit_Messages then +Error_Msg_Name_1 := Aname; +Error_Msg_N + ("prefix of % attribute must be a function call " +
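To make the restriction concrete, here is a hedged sketch (invented names) of what SPARK RM 3.10(14) rules out:

   package Own_Demo with SPARK_Mode is
      type Int_Acc is access Integer;

      procedure Incr (P : in out Int_Acc)
        with Post => P.all = P.all'Old + 1;
      --  OK: the prefix P.all has type Integer, which is not deep

      --  A postcondition such as "Post => P'Old.all = 0" would now be
      --  flagged: the prefix P has an owning type and is not a function
      --  call, so 'Old would have to copy owned memory at entry.
   end Own_Demo;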
[Ada] Improve performance of Containers.Functional_Base
This patch modifies the implementation of Functional_Base to damp the cost of its subprograms at runtime in specific cases. Instead of copying the entire underlying array to create a new container, containers can share the same Array_Base attribute. Performance on common use cases of formal and functional containers is improved with this patch. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Joffrey Huguet gcc/ada/ * libgnat/a-cofuba.ads: Add a Length attribute to type Container. Add a type Array_Base which replaces the previous Elements attribute of Container. (Content_Init): New subprogram. It is used to initialize the Base attribute of Container. * libgnat/a-cofuba.adb (Resize): New subprogram. It is used to resize the underlying array of a container if necessary. (=, <=, Find, Get, Intersection, Length, Num_Overlaps, Set, Union): Update to match changes in type declarations. (Add): Modify body to damp the time and space cost in a specific case. (Content_Init): New subprogram. It is used to initialize the Base attribute of Container. (Remove): Modify body to damp the time and space cost in a specific case.--- gcc/ada/libgnat/a-cofuba.adb +++ gcc/ada/libgnat/a-cofuba.adb @@ -30,6 +30,7 @@ -- pragma Ada_2012; +with Ada.Unchecked_Deallocation; package body Ada.Containers.Functional_Base with SPARK_Mode => Off is @@ -47,18 +48,22 @@ package body Ada.Containers.Functional_Base with SPARK_Mode => Off is -- Search a container C for an element equal to E.all, returning the -- position in the underlying array. + procedure Resize (Base : Array_Base_Access); + -- Resize the underlying array if needed so that it can contain one more + -- element. + - -- "=" -- - function "=" (C1 : Container; C2 : Container) return Boolean is begin - if C1.Elements'Length /= C2.Elements'Length then + if C1.Length /= C2.Length then return False; end if; - for I in C1.Elements'Range loop - if C1.Elements (I).all /= C2.Elements (I).all then + for I in 1 .. C1.Length loop + if C1.Base.Elements (I).all /= C2.Base.Elements (I).all then return False; end if; end loop; @@ -72,8 +77,8 @@ package body Ada.Containers.Functional_Base with SPARK_Mode => Off is function "<=" (C1 : Container; C2 : Container) return Boolean is begin - for I in C1.Elements'Range loop - if Find (C2, C1.Elements (I)) = 0 then + for I in 1 .. C1.Length loop + if Find (C2, C1.Base.Elements (I)) = 0 then return False; end if; end loop; @@ -90,31 +95,58 @@ package body Ada.Containers.Functional_Base with SPARK_Mode => Off is I : Index_Type; E : Element_Type) return Container is - A : constant Element_Array_Access := -new Element_Array'(1 .. C.Elements'Last + 1 => <>); - P : Count_Type := 0; - begin - for J in 1 .. C.Elements'Last + 1 loop - if J /= To_Count (I) then -P := P + 1; -A (J) := C.Elements (P); - else -A (J) := new Element_Type'(E); - end if; - end loop; - - return Container'(Elements => A); + if To_Count (I) = C.Length + 1 and then C.Length = C.Base.Max_Length then + Resize (C.Base); + C.Base.Max_Length := C.Base.Max_Length + 1; + C.Base.Elements (C.Base.Max_Length) := new Element_Type'(E); + + return Container'(Length => C.Base.Max_Length, Base => C.Base); + else + declare +A : constant Array_Base_Access := Content_Init (C.Length); +P : Count_Type := 0; + begin +A.Max_Length := C.Length + 1; +for J in 1 .. 
C.Length + 1 loop + if J /= To_Count (I) then + P := P + 1; + A.Elements (J) := C.Base.Elements (P); + else + A.Elements (J) := new Element_Type'(E); + end if; +end loop; + +return Container'(Length => A.Max_Length, + Base => A); + end; + end if; end Add; + -- + -- Content_Init -- + -- + + function Content_Init (L : Count_Type := 0) return Array_Base_Access + is + Max_Init : constant Count_Type := 100; + Size : constant Count_Type := +(if L < Count_Type'Last - Max_Init then L + Max_Init + else Count_Type'Last); + Elements : constant Element_Array_Access := +new Element_Array'(1 .. Size => <>); + begin + return new Array_Base'(Max_Length => 0, Elements => Elements); + end Content_Init; + -- -- Find -- -- function Find (C :
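As a rough user-level illustration of the payoff, assuming the API of Ada.Containers.Functional_Vectors (which is built on Functional_Base):

   with Ada.Containers.Functional_Vectors;

   procedure Seq_Demo is
      package Int_Seqs is new Ada.Containers.Functional_Vectors
        (Index_Type => Positive, Element_Type => Integer);
      use Int_Seqs;

      S : Sequence;
   begin
      for I in 1 .. 1_000 loop
         --  Each Add appends at the end, so the new container can share
         --  the previous one's base array instead of copying I elements.
         S := Add (S, I);
      end loop;
   end Seq_Demo;

Building a sequence this way was previously quadratic in the number of elements; with the shared Array_Base it is amortized linear.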
[Ada] Fix internal error on inlined subprogram instance
This fixes a long-standing oddity in the procedure analyzing the instantiation of a generic subprogram, which would set the Is_Generic_Instance flag on the enclosing package generated for the instantiation but only to reset it a few lines below. Now this flag is relied upon by the machinery which computes the set of public entities to be exposed by a package. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Eric Botcazou gcc/ada/ * sem_ch12.adb (Analyze_Instance_And_Renamings): Do not reset the Is_Generic_Instance flag previously set on the package generated for the instantiation of a generic subprogram. gcc/testsuite/ * gnat.dg/generic_inst11.adb, gnat.dg/generic_inst11_pkg.adb, gnat.dg/generic_inst11_pkg.ads: New testcase.--- gcc/ada/sem_ch12.adb +++ gcc/ada/sem_ch12.adb @@ -5264,10 +5264,6 @@ package body Sem_Ch12 is Analyze (Pack_Decl); Check_Formal_Packages (Pack_Id); - Set_Is_Generic_Instance (Pack_Id, False); - - -- Why do we clear Is_Generic_Instance??? We set it 20 lines - -- above??? -- Body of the enclosing package is supplied when instantiating the -- subprogram body, after semantic analysis is completed. --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/generic_inst11.adb @@ -0,0 +1,9 @@ +-- { dg-do compile } +-- { dg-options "-O -gnatn" } + +with Generic_Inst11_Pkg; + +procedure Generic_Inst11 is +begin + Generic_Inst11_Pkg.Proc; +end; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/generic_inst11_pkg.adb @@ -0,0 +1,21 @@ +with System; + +package body Generic_Inst11_Pkg is + + Data : Integer; + + generic + Reg_Address : System.Address; + procedure Inner_G with Inline; + + procedure Inner_G is + Reg : Integer with Address => Reg_Address; + begin + null; + end; + + procedure My_Inner_G is new Inner_G (Data'Address); + + procedure Proc renames My_Inner_G; + +end Generic_Inst11_Pkg; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/generic_inst11_pkg.ads @@ -0,0 +1,5 @@ +package Generic_Inst11_Pkg is + + procedure Proc with Inline; + +end Generic_Inst11_Pkg;
[Ada] Compiler speedup with inlining across units
This change is aimed at speeding up the inlining across units done by the Ada compiler when -gnatn is specified and in the presence of units instantiating a lot of generic packages. The current implementation is as follows: when a generic package is being instantiated, the compiler scans its spec for the presence of subprograms with an aspect/pragma Inline and, upon finding one, schedules the instantiation of its body. That's not very efficient because the compiler doesn't know yet if one of those inlined subprograms will eventually be called from the main unit. The new implementation arranges for the compiler to instantiate the body on demand, i.e. when it encounters a call to one of the inlined subprograms. That's still not optimal because, at this point, the compiler has not yet computed whether the call itself is reachable from the main unit (it will do this computation at the very end of the processing, just before sending the inlined units to the code generator), but that's nevertheless net progress. The patch also enhances the -gnatd.j option to make it output the list of instances "inlined" this way. The following package is a simple example:

with Q;
procedure P is
begin
   Q.Proc;
end;

package Q is
   procedure Proc;
   pragma Inline (Proc);
end Q;

with G;
package body Q is
   package My_G is new G (1);

   procedure Proc is
      Val : constant Integer := My_G.Func;
   begin
      if Val /= 1 then
         raise Program_Error;
      end if;
   end;
end Q;

generic
   Value : Integer;
package G is
   function Func return Integer;
   pragma Inline (Func);
end G;

package body G is
   function Func return Integer is
   begin
      return Value;
   end;
end G;

Tested on x86_64-pc-linux-gnu, committed on trunk 2019-08-14 Eric Botcazou gcc/ada/ * einfo.ads (Is_Called): Document new usage on E_Package entities. * einfo.adb (Is_Called): Accept E_Package entities. (Set_Is_Called): Likewise. * exp_ch6.adb (Expand_Call_Helper): Move code dealing with instances for back-end inlining to Add_Inlined_Body. * inline.ads: Remove with clauses for Alloc and Table. (Pending_Instantiations): Move to... * inline.adb: Add with clauses for Alloc, Uintp, Table and GNAT.HTable. (Backend_Instances): New variable. (Pending_Instantiations): ...here. (Called_Pending_Instantiations): New table. (Node_Table_Size): New constant. (Node_Header_Num): New subtype. (Node_Hash): New function. (To_Pending_Instantiations): New hash table. (Add_Inlined_Body): Bail out early for subprograms in the main unit or subunit. Likewise if the Is_Called flag is set. If the subprogram is an instance, invoke Add_Inlined_Instance. Call Set_Is_Called earlier. If the subprogram is within an instance, invoke Add_Inlined_Instance. Also deal with the case where the call itself is within an instance. (Add_Inlined_Instance): New procedure. (Add_Inlined_Subprogram): Remove conditions always fulfilled. (Add_Pending_Instantiation): Move the defence against a ludicrous number of instantiations to here. When back-end inlining is enabled, associate an instantiation with its index in table and mark a few selected kinds of instantiations as always needed. (Initialize): Set Backend_Instances to No_Elist. (Instantiate_Body): New procedure doing the work extracted from... (Instantiate_Bodies): ...here. When back-end inlining is enabled, loop over Called_Pending_Instantiations instead of Pending_Instantiations. (Is_Nested): Minor tweak. (List_Inlining_Info): Also list the contents of Backend_Instances. * sem_ch12.adb (Might_Inline_Subp): Return early if Is_Inlined is set and otherwise set it before returning true. 
(Analyze_Package_Instantiation): Remove the defence against a ludicrous number of instantiations. Invoke Remove_Dead_Instance instead of doing the removal manually if there is a guaranteed ABE.
Re: [PATCH 2/2] Add more entries to the C++ get_std_name_hint array
On 13/08/19 16:08 -0400, Jason Merrill wrote: On 8/13/19 9:36 AM, Jonathan Wakely wrote: This adds some commonly-used C++11/14 names, and some new C++17/20 names. The latter aren't available when using the -std=gnu++14 default, so the fix-it suggesting to use a newer dialect is helpful. * name-lookup.c (get_std_name_hint): Add more entries. Tested x86_64-linux. OK for trunk? OK. I realised as I was about to commit it that cxx17 is the wrong dialect for remove_cvref and remove_cvref_t, so I corrected them to cxx2a before committing it. (I've tried to use remove_cvref_t in C++17 a few times, so this diagnostic should help me!)
[PING] [PATCHv4] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
Hi! I'd like to ping for this patch: https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00546.html Thanks Bernd.
[committed][AArch64] Add SVE conditional integer unary patterns
This patch adds patterns to match conditional unary operations on integers. At the moment we rely on combine to merge separate arithmetic and vcond_mask operations, and since the latter doesn't accept zero operands, we miss out on the opportunity to use the movprfx /z alternative. (This alternative is tested by the ACLE patches though.) Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274476. Richard 2019-08-14 Richard Sandiford Kugan Vivekanandarajah gcc/ * config/aarch64/aarch64-sve.md (*cond__2): New pattern. (*cond__any): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/cond_unary_1.c: New test. * gcc.target/aarch64/sve/cond_unary_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_unary_2.c: Likewise. * gcc.target/aarch64/sve/cond_unary_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_unary_3.c: Likewise. * gcc.target/aarch64/sve/cond_unary_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_unary_4.c: Likewise. * gcc.target/aarch64/sve/cond_unary_4_run.c: Likewise. Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 10:28:46.145666799 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 11:47:53.151171700 +0100 @@ -1454,6 +1454,45 @@ (define_insn "*2" "\t%0., %1/m, %2." ) +;; Predicated integer unary arithmetic, merging with the first input. +(define_insn "*cond__2" + [(set (match_operand:SVE_I 0 "register_operand" "=w, ?&w") + (unspec:SVE_I + [(match_operand: 1 "register_operand" "Upl, Upl") + (SVE_INT_UNARY:SVE_I +(match_operand:SVE_I 2 "register_operand" "0, w")) + (match_dup 2)] + UNSPEC_SEL))] + "TARGET_SVE" + "@ + \t%0., %1/m, %0. + movprfx\t%0, %2\;\t%0., %1/m, %2." + [(set_attr "movprfx" "*,yes")] +) + +;; Predicated integer unary arithmetic, merging with an independent value. +;; +;; The earlyclobber isn't needed for the first alternative, but omitting +;; it would only help the case in which operands 2 and 3 are the same, +;; which is handled above rather than here. Marking all the alternatives +;; as earlyclobber helps to make the instruction more regular to the +;; register allocator. +(define_insn "*cond__any" + [(set (match_operand:SVE_I 0 "register_operand" "=&w, ?&w, ?&w") + (unspec:SVE_I + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") + (SVE_INT_UNARY:SVE_I +(match_operand:SVE_I 2 "register_operand" "w, w, w")) + (match_operand:SVE_I 3 "aarch64_simd_reg_or_zero" "0, Dz, w")] + UNSPEC_SEL))] + "TARGET_SVE && !rtx_equal_p (operands[2], operands[3])" + "@ + \t%0., %1/m, %2. + movprfx\t%0., %1/z, %2.\;\t%0., %1/m, %2. + movprfx\t%0, %3\;\t%0., %1/m, %2." + [(set_attr "movprfx" "*,yes,yes")] +) + ;; - ;; [INT] Logical inverse ;; - Index: gcc/testsuite/gcc.target/aarch64/sve/cond_unary_1.c === --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_unary_1.c 2019-08-14 11:47:53.151171700 +0100 @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define abs(A) ((A) < 0 ? -(A) : (A)) +#define neg(A) (-(A)) + +#define DEF_LOOP(TYPE, OP) \ + void __attribute__ ((noipa)) \ + test_##TYPE##_##OP (TYPE *__restrict r, TYPE *__restrict a, \ + TYPE *__restrict pred, int n) \ + {\ +for (int i = 0; i < n; ++i)\ + r[i] = pred[i] ? 
OP (a[i]) : a[i]; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, abs) \ + T (TYPE, neg) + +#define TEST_ALL(T) \ + TEST_TYPE (T, int8_t) \ + TEST_TYPE (T, int16_t) \ + TEST_TYPE (T, int32_t) \ + TEST_TYPE (T, int64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tabs\tz[0-9]+\.b, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tabs\tz[0-9]+\.h, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tabs\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tabs\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-times {\tneg\tz[0-9]+\.b, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tneg\tz[0-9]+\.h, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tneg\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tneg\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ + +/* { dg-final { scan-assembler-not {\tmov\tz} } } */ +/* { dg-final { scan-assembler-not
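As a concrete illustration (a sketch, not one of the committed tests; assumed options -O2 -ftree-vectorize on an SVE target), a loop that merges with an independent value and should therefore map onto the second pattern, using its movprfx alternative:

    #include <stdint.h>

    // Expected shape: movprfx of the fallback vector, then a predicated
    // neg z0.s, p0/m, z1.s under the loop predicate.
    void
    cond_neg_any (int32_t *__restrict r, int32_t *__restrict a,
                  int32_t *__restrict b, int32_t *__restrict pred, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? -a[i] : b[i];
    }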
[committed][AArch64] Add SVE conditional floating-point unary patterns
This patch adds patterns to match conditional unary operations on floating-point modes. At the moment we rely on combine to merge separate arithmetic and vcond_mask operations, and since the latter doesn't accept zero operands, we miss out on the opportunity to use the movprfx /z alternative. (This alternative is tested by the ACLE patches though.) Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274477. Richard 2019-08-14 Richard Sandiford Kugan Vivekanandarajah gcc/ * config/aarch64/aarch64-sve.md (*cond__2): New pattern. (*cond__any): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/cond_unary_1.c: Add tests for floating-point types. * gcc.target/aarch64/sve/cond_unary_2.c: Likewise. * gcc.target/aarch64/sve/cond_unary_3.c: Likewise. * gcc.target/aarch64/sve/cond_unary_4.c: Likewise. Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 11:48:45.114792555 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 11:51:07.537753363 +0100 @@ -1624,6 +1624,62 @@ (define_insn "*2" "\t%0., %1/m, %2." ) +;; Predicated floating-point unary arithmetic, merging with the first input. +(define_insn_and_rewrite "*cond__2" + [(set (match_operand:SVE_F 0 "register_operand" "=w, ?&w") + (unspec:SVE_F + [(match_operand: 1 "register_operand" "Upl, Upl") + (unspec:SVE_F +[(match_operand 3) + (match_operand:SI 4 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 2 "register_operand" "0, w")] +SVE_COND_FP_UNARY) + (match_dup 2)] + UNSPEC_SEL))] + "TARGET_SVE && aarch64_sve_pred_dominates_p (&operands[3], operands[1])" + "@ + \t%0., %1/m, %0. + movprfx\t%0, %2\;\t%0., %1/m, %2." + "&& !rtx_equal_p (operands[1], operands[3])" + { +operands[3] = copy_rtx (operands[1]); + } + [(set_attr "movprfx" "*,yes")] +) + +;; Predicated floating-point unary arithmetic, merging with an independent +;; value. +;; +;; The earlyclobber isn't needed for the first alternative, but omitting +;; it would only help the case in which operands 2 and 3 are the same, +;; which is handled above rather than here. Marking all the alternatives +;; as earlyclobber helps to make the instruction more regular to the +;; register allocator. +(define_insn_and_rewrite "*cond__any" + [(set (match_operand:SVE_F 0 "register_operand" "=&w, ?&w, ?&w") + (unspec:SVE_F + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") + (unspec:SVE_F +[(match_operand 4) + (match_operand:SI 5 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 2 "register_operand" "w, w, w")] +SVE_COND_FP_UNARY) + (match_operand:SVE_F 3 "aarch64_simd_reg_or_zero" "0, Dz, w")] + UNSPEC_SEL))] + "TARGET_SVE + && !rtx_equal_p (operands[2], operands[3]) + && aarch64_sve_pred_dominates_p (&operands[4], operands[1])" + "@ + \t%0., %1/m, %2. + movprfx\t%0., %1/z, %2.\;\t%0., %1/m, %2. + movprfx\t%0, %3\;\t%0., %1/m, %2." + "&& !rtx_equal_p (operands[1], operands[4])" + { +operands[4] = copy_rtx (operands[1]); + } + [(set_attr "movprfx" "*,yes,yes")] +) + ;; - ;; [PRED] Inverse ;; - Index: gcc/testsuite/gcc.target/aarch64/sve/cond_unary_1.c === --- gcc/testsuite/gcc.target/aarch64/sve/cond_unary_1.c 2019-08-14 11:48:45.114792555 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_unary_1.c 2019-08-14 11:51:07.537753363 +0100 @@ -15,15 +15,22 @@ #define DEF_LOOP(TYPE, OP) \ r[i] = pred[i] ? 
OP (a[i]) : a[i]; \ } -#define TEST_TYPE(T, TYPE) \ +#define TEST_INT_TYPE(T, TYPE) \ T (TYPE, abs) \ T (TYPE, neg) +#define TEST_FLOAT_TYPE(T, TYPE, SUFFIX) \ + T (TYPE, __builtin_fabs##SUFFIX) \ + T (TYPE, neg) + #define TEST_ALL(T) \ - TEST_TYPE (T, int8_t) \ - TEST_TYPE (T, int16_t) \ - TEST_TYPE (T, int32_t) \ - TEST_TYPE (T, int64_t) + TEST_INT_TYPE (T, int8_t) \ + TEST_INT_TYPE (T, int16_t) \ + TEST_INT_TYPE (T, int32_t) \ + TEST_INT_TYPE (T, int64_t) \ + TEST_FLOAT_TYPE (T, _Float16, f16) \ + TEST_FLOAT_TYPE (T, float, f) \ + TEST_FLOAT_TYPE (T, double, ) TEST_ALL (DEF_LOOP) @@ -37,6 +44,14 @@ TEST_ALL (DEF_LOOP) /* { dg-final { scan-assembler-times {\tneg\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ /* { dg-final { scan-assembler-times {\tneg\tz[0-9]+\.d, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.h, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.s, p[0-7]/m,} 1 } } */ +/* { dg-final { scan-assembler-times
[committed][AArch64] Add SVE conditional conversion patterns
This patch adds patterns to match conditional conversions between integers and like-sized floats. The patterns are actually more general than that, but the other combinations can only be tested via the ACLE. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274478. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/aarch64-sve.md (*cond__nontrunc) (*cond__nonextend): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_convert_1.c: New test. * gcc.target/aarch64/sve/cond_convert_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_convert_2.c: Likewise. * gcc.target/aarch64/sve/cond_convert_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_convert_3.c: Likewise. * gcc.target/aarch64/sve/cond_convert_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_convert_4.c: Likewise. * gcc.target/aarch64/sve/cond_convert_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_convert_5.c: Likewise. * gcc.target/aarch64/sve/cond_convert_5_run.c: Likewise. * gcc.target/aarch64/sve/cond_convert_6.c: Likewise. * gcc.target/aarch64/sve/cond_convert_6_run.c: Likewise. Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 11:53:04.636898923 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 11:55:33.251813494 +0100 @@ -4071,6 +4071,39 @@ (define_insn "*aarch64_sve__trunc "fcvtz\t%0., %1/m, %2." ) +;; Predicated float-to-integer conversion with merging, either to the same +;; width or wider. +;; +;; The first alternative doesn't need the earlyclobber, but the only case +;; it would help is the uninteresting one in which operands 2 and 3 are +;; the same register (despite having different modes). Making all the +;; alternatives earlyclobber makes things more consistent for the +;; register allocator. +(define_insn_and_rewrite "*cond__nontrunc" + [(set (match_operand:SVE_HSDI 0 "register_operand" "=&w, &w, ?&w") + (unspec:SVE_HSDI + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") + (unspec:SVE_HSDI +[(match_operand 4) + (match_operand:SI 5 "aarch64_sve_gp_strictness") + (match_operand:SVE_F 2 "register_operand" "w, w, w")] +SVE_COND_FCVTI) + (match_operand:SVE_HSDI 3 "aarch64_simd_reg_or_zero" "0, Dz, w")] + UNSPEC_SEL))] + "TARGET_SVE + && >= + && aarch64_sve_pred_dominates_p (&operands[4], operands[1])" + "@ + fcvtz\t%0., %1/m, %2. + movprfx\t%0., %1/z, %2.\;fcvtz\t%0., %1/m, %2. + movprfx\t%0, %3\;fcvtz\t%0., %1/m, %2." + "&& !rtx_equal_p (operands[1], operands[4])" + { +operands[4] = copy_rtx (operands[1]); + } + [(set_attr "movprfx" "*,yes,yes")] +) + ;; - ;; [INT<-FP] Packs ;; - @@ -4155,6 +4188,39 @@ (define_insn "aarch64_sve__extend "cvtf\t%0., %1/m, %2." ) +;; Predicated integer-to-float conversion with merging, either to the same +;; width or narrower. +;; +;; The first alternative doesn't need the earlyclobber, but the only case +;; it would help is the uninteresting one in which operands 2 and 3 are +;; the same register (despite having different modes). Making all the +;; alternatives earlyclobber makes things more consistent for the +;; register allocator. 
+(define_insn_and_rewrite "*cond__nonextend" + [(set (match_operand:SVE_F 0 "register_operand" "=&w, &w, ?&w") + (unspec:SVE_F + [(match_operand: 1 "register_operand" "Upl, Upl, Upl") + (unspec:SVE_F +[(match_operand 4) + (match_operand:SI 5 "aarch64_sve_gp_strictness") + (match_operand:SVE_HSDI 2 "register_operand" "w, w, w")] +SVE_COND_ICVTF) + (match_operand:SVE_F 3 "aarch64_simd_reg_or_zero" "0, Dz, w")] + UNSPEC_SEL))] + "TARGET_SVE + && >= + && aarch64_sve_pred_dominates_p (&operands[4], operands[1])" + "@ + cvtf\t%0., %1/m, %2. + movprfx\t%0., %1/z, %2.\;cvtf\t%0., %1/m, %2. + movprfx\t%0, %3\;cvtf\t%0., %1/m, %2." + "&& !rtx_equal_p (operands[1], operands[4])" + { +operands[4] = copy_rtx (operands[1]); + } + [(set_attr "movprfx" "*,yes,yes")] +) + ;; - ;; [FP<-INT] Packs ;; - Index: gcc/testsuite/gcc.target/aarch64/sve/cond_convert_1.c === --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_convert_1.c 2019-08-14 11:55:33.251813494 +0100 @@ -0,0 +1,37 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize -fno-
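A sketch of the kind of source loop the new float-to-integer pattern is aimed at (not one of the committed tests; assumed options -O2 -ftree-vectorize on an SVE target):

    #include <stdint.h>

    // Conditional same-width conversion merging with another vector;
    // expected to become a predicated fcvtzs after combine.
    void
    cond_fcvt (int32_t *__restrict r, float *__restrict a,
               int32_t *__restrict b, int32_t *__restrict pred, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? (int32_t) a[i] : b[i];
    }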
[committed][AArch64] Use SVE UXT[BHW] as a form of predicated AND
UXTB, UXTH and UXTW are equivalent to predicated ANDs with the constants 0xff, 0xffff and 0xffffffff respectively. This patch uses them in the patterns for IFN_COND_AND. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274479. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/aarch64.c (aarch64_print_operand): Allow %e to take the equivalent mask, as well as a bit count. * config/aarch64/predicates.md (aarch64_sve_uxtb_immediate) (aarch64_sve_uxth_immediate, aarch64_sve_uxt_immediate) (aarch64_sve_pred_and_operand): New predicates. * config/aarch64/iterators.md (sve_pred_int_rhs2_operand): New code attribute. * config/aarch64/aarch64-sve.md (cond_): Use it. (*cond_uxt_2, *cond_uxt_any): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_uxt_1.c: New test. * gcc.target/aarch64/sve/cond_uxt_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_2.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_3.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_4.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_4_run.c: Likewise. Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c 2019-08-14 10:18:10.642319210 +0100 +++ gcc/config/aarch64/aarch64.c 2019-08-14 12:00:03.209840337 +0100 @@ -8328,7 +8328,8 @@ sizetochar (int size) 'D': Take the duplicated element in a vector constant and print it as an unsigned integer, in decimal. 'e': Print the sign/zero-extend size as a character 8->b, - 16->h, 32->w. + 16->h, 32->w. Can also be used for masks: + 0xff->b, 0xffff->h, 0xffffffff->w. 'I': If the operand is a duplicated vector constant, replace it with the duplicated scalar. If the operand is then a floating-point constant, replace @@ -8399,27 +8400,22 @@ aarch64_print_operand (FILE *f, rtx x, i case 'e': { - int n; - - if (!CONST_INT_P (x) - || (n = exact_log2 (INTVAL (x) & ~7)) <= 0) + x = unwrap_const_vec_duplicate (x); + if (!CONST_INT_P (x)) { output_operand_lossage ("invalid operand for '%%%c'", code); return; } - switch (n) + HOST_WIDE_INT val = INTVAL (x); + if ((val & ~7) == 8 || val == 0xff) + fputc ('b', f); + else if ((val & ~7) == 16 || val == 0xffff) + fputc ('h', f); + else if ((val & ~7) == 32 || val == 0xffffffff) + fputc ('w', f); + else { - case 3: - fputc ('b', f); - break; - case 4: - fputc ('h', f); - break; - case 5: - fputc ('w', f); - break; - default: output_operand_lossage ("invalid operand for '%%%c'", code); return; } Index: gcc/config/aarch64/predicates.md === --- gcc/config/aarch64/predicates.md 2019-08-14 10:18:10.642319210 +0100 +++ gcc/config/aarch64/predicates.md 2019-08-14 12:00:03.209840337 +0100 @@ -606,11 +606,26 @@ (define_predicate "aarch64_sve_inc_dec_i (and (match_code "const,const_vector") (match_test "aarch64_sve_inc_dec_immediate_p (op)"))) +(define_predicate "aarch64_sve_uxtb_immediate" + (and (match_code "const_vector") + (match_test "GET_MODE_UNIT_BITSIZE (GET_MODE (op)) > 8") + (match_test "aarch64_const_vec_all_same_int_p (op, 0xff)"))) + +(define_predicate "aarch64_sve_uxth_immediate" + (and (match_code "const_vector") + (match_test "GET_MODE_UNIT_BITSIZE (GET_MODE (op)) > 16") + (match_test "aarch64_const_vec_all_same_int_p (op, 0xffff)"))) + (define_predicate "aarch64_sve_uxtw_immediate" (and (match_code "const_vector") (match_test "GET_MODE_UNIT_BITSIZE (GET_MODE (op)) > 32") (match_test "aarch64_const_vec_all_same_int_p (op, 0xffffffff)"))) +(define_predicate "aarch64_sve_uxt_immediate" + (ior (match_operand 0 "aarch64_sve_uxtb_immediate")
(match_operand 0 "aarch64_sve_uxth_immediate") + (match_operand 0 "aarch64_sve_uxtw_immediate"))) + (define_predicate "aarch64_sve_logical_immediate" (and (match_code "const,const_vector") (match_test "aarch64_sve_bitmask_immediate_p (op)"))) @@ -670,6 +685,10 @@ (define_predicate "aarch64_sve_add_opera (match_operand 0 "aarch64_sve_sub_arith_immediate") (match_operand 0 "aarch64_sve_inc_dec_immediate"))) +(define_predicate "aarch64_sve_pred_and_operand" + (ior (match_operand 0 "register_opera
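In source terms the transformation looks like this (a sketch, assumed options -O2 -ftree-vectorize on an SVE target):

    #include <stdint.h>

    // The conditional AND with 0xff can now be emitted as a predicated
    // UXTB on .s elements; predicated AND has no immediate form, so this
    // avoids materialising the mask in a register.
    void
    cond_uxtb (uint32_t *__restrict r, uint32_t *__restrict a,
               uint32_t *__restrict b, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = a[i] < 20 ? b[i] & 0xff : b[i];
    }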
Re: [PATCH 1/2] PR c++/91436 fix C++ dialect for std::make_unique fix-it hint
On 13/08/19 16:07 -0400, Jason Merrill wrote: On 8/13/19 9:32 AM, Jonathan Wakely wrote: * g++.dg/lookup/missing-std-include-6.C: Don't check make_unique in test that runs for C++11. I'm not comfortable removing this test coverage entirely. Doesn't it give a useful diagnostic in C++11 mode as well? It does: mu.cc:3:15: error: 'make_unique' is not a member of 'std' 3 | auto p = std::make_unique(); | ^~~ mu.cc:3:15: note: 'std::make_unique' is only available from C++14 onwards mu.cc:3:27: error: expected primary-expression before 'int' 3 | auto p = std::make_unique(); | ^~~ So we can add it to g++.dg/lookup/missing-std-include-8.C instead, which runs for c++98_only and checks for the "is only available for" cases. Here's a patch doing that. Tested x86_64-linux. OK for trunk? OK for gcc-9-branch and gcc-8-branch too, since PR c++/91436 affects those branches? commit 5ad7b3202e4818f2d6d84e22e7e489b39a65c851 Author: Jonathan Wakely Date: Tue Aug 13 13:25:39 2019 +0100 PR c++/91436 fix C++ dialect for std::make_unique fix-it hint The std::make_unique function wasn't added until C++14, and neither was the std::complex_literals namespace. gcc/cp: PR c++/91436 * name-lookup.c (get_std_name_hint): Fix min_dialect field for complex_literals and make_unique entries. gcc/testsuite: PR c++/91436 * g++.dg/lookup/missing-std-include-5.C: Limit test to C++14 and up. * g++.dg/lookup/missing-std-include-6.C: Don't check make_unique in test that runs for C++11. * g++.dg/lookup/missing-std-include-8.C: Check make_unique here. diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c index d5e491e9072..16c74287bb1 100644 --- a/gcc/cp/name-lookup.c +++ b/gcc/cp/name-lookup.c @@ -5559,7 +5559,7 @@ get_std_name_hint (const char *name) {"bitset", "", cxx11}, /* . */ {"complex", "", cxx98}, -{"complex_literals", "", cxx98}, +{"complex_literals", "", cxx14}, /* . 
*/ {"condition_variable", "", cxx11}, {"condition_variable_any", "", cxx11}, @@ -5632,7 +5632,7 @@ get_std_name_hint (const char *name) {"allocator", "", cxx98}, {"allocator_traits", "", cxx11}, {"make_shared", "", cxx11}, -{"make_unique", "", cxx11}, +{"make_unique", "", cxx14}, {"shared_ptr", "", cxx11}, {"unique_ptr", "", cxx11}, {"weak_ptr", "", cxx11}, diff --git a/gcc/testsuite/g++.dg/lookup/missing-std-include-5.C b/gcc/testsuite/g++.dg/lookup/missing-std-include-5.C index fe880a6263b..3ec9abd9316 100644 --- a/gcc/testsuite/g++.dg/lookup/missing-std-include-5.C +++ b/gcc/testsuite/g++.dg/lookup/missing-std-include-5.C @@ -1,2 +1,3 @@ +// { dg-do compile { target c++14 } } using namespace std::complex_literals; // { dg-error "" } // { dg-message "#include " "" { target *-*-* } .-1 } diff --git a/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C b/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C index d9eeb4284e8..a8f27473e6d 100644 --- a/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C +++ b/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C @@ -11,15 +11,6 @@ void test_make_shared () // { dg-error "expected primary-expression before '\\)' token" "" { target *-*-* } .-3 } } -template -void test_make_unique () -{ - auto p = std::make_unique(); // { dg-error "'make_unique' is not a member of 'std'" } - // { dg-message "'#include '" "" { target *-*-* } .-1 } - // { dg-error "expected primary-expression before '>' token" "" { target *-*-* } .-2 } - // { dg-error "expected primary-expression before '\\)' token" "" { target *-*-* } .-3 } -} - std::shared_ptr test_shared_ptr; // { dg-error "'shared_ptr' in namespace 'std' does not name a template type" } // { dg-message "'#include '" "" { target *-*-* } .-1 } diff --git a/gcc/testsuite/g++.dg/lookup/missing-std-include-8.C b/gcc/testsuite/g++.dg/lookup/missing-std-include-8.C index 68b208299f2..73532c82968 100644 --- a/gcc/testsuite/g++.dg/lookup/missing-std-include-8.C +++ b/gcc/testsuite/g++.dg/lookup/missing-std-include-8.C @@ -13,6 +13,15 @@ void test_make_shared () // { dg-error "expected primary-expression before '\\)' token" "" { target *-*-* } .-3 } } +template +void test_make_unique () +{ + std::make_unique(); // { dg-error "'make_unique' is not a member of 'std'" } + // { dg-message "'std::make_unique' is only available from C\\+\\+14 onwards" "" { target *-*-* } .-1 } + // { dg-error "expected primary-expression before '>' token" "" { target *-*-* } .-2 } + // { dg-error "expected primary-expression before '\\)' token" "" { target *-*-* } .-3 } +} + void test_array () { std::array a; // { dg-error "'array' is not a member of 'std'" }
[committed][AArch64] Use SVE BIC for conditional arithmetic
This patch uses BIC to pattern-match conditional AND with an inverted third input. It also adds extra tests for AND, ORR and EOR. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274480. Richard 2019-08-14 Richard Sandiford Kugan Vivekanandarajah gcc/ * config/aarch64/aarch64-sve.md (*cond_bic_2) (*cond_bic_any): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_logical_1.c: New test. * gcc.target/aarch64/sve/cond_logical_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_logical_2.c: Likewise. * gcc.target/aarch64/sve/cond_logical_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_logical_3.c: Likewise. * gcc.target/aarch64/sve/cond_logical_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_logical_4.c: Likewise. * gcc.target/aarch64/sve/cond_logical_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_logical_5.c: Likewise. * gcc.target/aarch64/sve/cond_logical_5_run.c: Likewise. Index: gcc/config/aarch64/aarch64-sve.md === --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 12:00:23.761690128 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 12:02:29.540770835 +0100 @@ -2274,6 +2274,50 @@ (define_insn_and_rewrite "*bic3" } ) +;; Predicated integer BIC, merging with the first input. +(define_insn "*cond_bic_2" + [(set (match_operand:SVE_I 0 "register_operand" "=w, ?&w") + (unspec:SVE_I + [(match_operand: 1 "register_operand" "Upl, Upl") + (and:SVE_I +(not:SVE_I (match_operand:SVE_I 3 "register_operand" "w, w")) +(match_operand:SVE_I 2 "register_operand" "0, w")) + (match_dup 2)] + UNSPEC_SEL))] + "TARGET_SVE" + "@ + bic\t%0., %1/m, %0., %3. + movprfx\t%0, %2\;bic\t%0., %1/m, %0., %3." + [(set_attr "movprfx" "*,yes")] +) + +;; Predicated integer BIC, merging with an independent value. +(define_insn_and_rewrite "*cond_bic_any" + [(set (match_operand:SVE_I 0 "register_operand" "=&w, &w, &w, ?&w") + (unspec:SVE_I + [(match_operand: 1 "register_operand" "Upl, Upl, Upl, Upl") + (and:SVE_I +(not:SVE_I (match_operand:SVE_I 3 "register_operand" "w, w, w, w")) +(match_operand:SVE_I 2 "register_operand" "0, w, w, w")) + (match_operand:SVE_I 4 "aarch64_simd_reg_or_zero" "Dz, Dz, 0, w")] + UNSPEC_SEL))] + "TARGET_SVE && !rtx_equal_p (operands[2], operands[4])" + "@ + movprfx\t%0., %1/z, %0.\;bic\t%0., %1/m, %0., %3. + movprfx\t%0., %1/z, %2.\;bic\t%0., %1/m, %0., %3. + movprfx\t%0., %1/m, %2.\;bic\t%0., %1/m, %0., %3. + #" + "&& reload_completed + && register_operand (operands[4], mode) + && !rtx_equal_p (operands[0], operands[4])" + { +emit_insn (gen_vcond_mask_ (operands[0], operands[2], +operands[4], operands[1])); +operands[4] = operands[2] = operands[0]; + } + [(set_attr "movprfx" "yes")] +) + ;; - ;; [INT] Shifts ;; - Index: gcc/testsuite/gcc.target/aarch64/sve/cond_logical_1.c === --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/cond_logical_1.c 2019-08-14 12:02:29.540770835 +0100 @@ -0,0 +1,62 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define bit_and(A, B) ((A) & (B)) +#define bit_or(A, B) ((A) | (B)) +#define bit_xor(A, B) ((A) ^ (B)) +#define bit_bic(A, B) ((A) & ~(B)) + +#define DEF_LOOP(TYPE, OP) \ + void __attribute__ ((noinline, noclone)) \ + test_##TYPE##_##OP (TYPE *__restrict r, \ + TYPE *__restrict a, \ + TYPE *__restrict b, \ + TYPE *__restrict c, int n)\ + {\ +for (int i = 0; i < n; ++i)\ + r[i] = a[i] < 20 ? 
OP (b[i], c[i]) : b[i]; \ + } + +#define TEST_TYPE(T, TYPE) \ + T (TYPE, bit_and) \ + T (TYPE, bit_or) \ + T (TYPE, bit_xor) \ + T (TYPE, bit_bic) + +#define TEST_ALL(T) \ + TEST_TYPE (T, int8_t) \ + TEST_TYPE (T, uint8_t) \ + TEST_TYPE (T, int16_t) \ + TEST_TYPE (T, uint16_t) \ + TEST_TYPE (T, int32_t) \ + TEST_TYPE (T, uint32_t) \ + TEST_TYPE (T, int64_t) \ + TEST_TYPE (T, uint64_t) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.b, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.h, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.s, p[0-7]/m,} 2 } } */ +/* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.d
Re: [PATCHv3] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On Fri, 2 Aug 2019, Bernd Edlinger wrote: > On 8/2/19 3:11 PM, Richard Biener wrote: > > On Tue, 30 Jul 2019, Bernd Edlinger wrote: > > > >> > >> I have no test coverage for the movmisalign optab though, so I > >> rely on your code review for that part. > > > > It looks OK. I tried to make it trigger on the following on > > i?86 with -msse2: > > > > typedef int v4si __attribute__((vector_size (16))); > > > > struct S { v4si v; } __attribute__((packed)); > > > > v4si foo (struct S s) > > { > > return s.v; > > } > > > > Hmm, the entry_parm need to be a MEM_P and an unaligned one. > So the test case could be made to trigger it this way: > > typedef int v4si __attribute__((vector_size (16))); > > struct S { v4si v; } __attribute__((packed)); > > int t; > v4si foo (struct S a, struct S b, struct S c, struct S d, > struct S e, struct S f, struct S g, struct S h, > int i, int j, int k, int l, int m, int n, > int o, struct S s) > { > t = o; > return s.v; > } > > However the code path is still not reached, since targetm.slow_ualigned_access > is always FALSE, which is probably a flaw in my patch. > > So I think, > > + else if (MEM_P (data->entry_parm) > + && GET_MODE_ALIGNMENT (promoted_nominal_mode) > + > MEM_ALIGN (data->entry_parm) > + && targetm.slow_unaligned_access (promoted_nominal_mode, > +MEM_ALIGN (data->entry_parm))) > > should probably better be > > + else if (MEM_P (data->entry_parm) > + && GET_MODE_ALIGNMENT (promoted_nominal_mode) > + > MEM_ALIGN (data->entry_parm) > +&& (((icode = optab_handler (movmisalign_optab, > promoted_nominal_mode)) > + != CODE_FOR_nothing) > +|| targetm.slow_unaligned_access (promoted_nominal_mode, > + MEM_ALIGN (data->entry_parm > > Right? Ah, yes. So it's really the presence of a movmisalign optab makes it a must for unaligned moves and if it is not present then targetm.slow_unaligned_access tells whether we need to use the bitfield extraction/insertion code. > Then the modified test case would use the movmisalign optab. > However nothing changes in the end, since the i386 back-end is used to work > around the middle end not using movmisalign optab when it should do so. Yeah, in the past it would have failed though. I wonder if movmisalign is still needed for x86... > I wonder if I should try to add a gcc_checking_assert to the mov expand > patterns that the memory is properly aligned ? I suppose gen* could add asserts that there is no movmisalign_optab that would match when expanding a mov. Eventually it's enough to guard the mov_optab use in emit_move_insn_1 that way? Or even try movmisalign there... > > > but nowadays x86 seems to be happy with regular moves operating on > > unaligned memory, using unaligned moves where necessary. > > > > (insn 5 2 8 2 (set (reg:V4SI 82 [ _2 ]) > > (mem/c:V4SI (reg/f:SI 16 argp) [2 s.v+0 S16 A32])) "t.c":7:11 1229 > > {movv4si_internal} > > (nil)) > > > > and with GCC 4.8 we ended up with the following expansion which is > > also correct. > > > > (insn 2 4 3 2 (set (subreg:V16QI (reg/v:V4SI 61 [ s ]) 0) > > (unspec:V16QI [ > > (mem/c:V16QI (reg/f:SI 16 argp) [0 s+0 S16 A32]) > > ] UNSPEC_LOADU)) t.c:6 1164 {sse2_loaddqu} > > (nil)) > > > > So it seems it has been too long and I don't remember what is > > special with arm that it doesn't work... it possibly simply > > trusts GET_MODE_ALIGNMENT, never looking at MEM_ALIGN which > > I think is OK-ish? > > > > Yes, that is what Richard said as well. 
> > > Similarly the very same issue should exist on x86_64 which is > > !STRICT_ALIGNMENT, it's just the ABI seems to provide the appropriate > > alignment on the caller side. So the STRICT_ALIGNMENT check is > > a wrong one. > > > > I may be plain wrong here, but I thought that !STRICT_ALIGNMENT targets > just use MEM_ALIGN to select the right instructions. MEM_ALIGN > is always 32-bit align on the DImode memory. The x86_64 vector > instructions > would look at MEM_ALIGN and do the right thing, yes? > >>> > >>> No, they need to use the movmisalign optab and end up with UNSPECs > >>> for example. > >> Ah, thanks, now I see. > >> > It seems to be the definition of STRICT_ALIGNMENT targets that all RTL > instructions need to have MEM_ALIGN >= GET_MODE_ALIGNMENT, so the target > does not even have to look at MEM_ALIGN except in the mov_misalign_optab, > right? > >>> > >>> Yes, I think we never losened that. Note that RTL expansion has to > >>> fix this up for them. Note that strictly speaking SLOW_UNALIGNED_ACCESS > >>> specifies that x86 is strict-align wrt vector modes. > >>> > >> > >> Yes I agree, the code would be incorrect for x86 as well when the > >> movmisalign_op
Re: [PATCH] PR libstdc++/90361 add missing macro definition
On 12/08/19 17:41 +0100, Jonathan Wakely wrote: The src/c++17/string-inst.cc file needs to override the default string ABI so that it still contains the expected symbols even when the library is configured with --with-default-libstdcxx-abi=gcc4-compatible. PR libstdc++/90361 * src/c++17/string-inst.cc: Use _GLIBCXX_USE_CXX11_ABI=1 by default. Tested x86_64-linux, committed to trunk. This documents the bug in the gcc-9 release notes. Committed to CVS. Index: htdocs/gcc-9/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v retrieving revision 1.74 diff -u -r1.74 changes.html --- htdocs/gcc-9/changes.html 12 Aug 2019 07:31:04 - 1.74 +++ htdocs/gcc-9/changes.html 14 Aug 2019 11:17:34 - @@ -70,8 +70,18 @@ definition of std::rotate is not used. - The automatic template instantiation at link time (-frepo: https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gcc/C_002b_002b-Dialect-Options.html#index-frepo) has been deprecated and -will be removed in a future release. +The automatic template instantiation at link time +(-frepo: https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gcc/C_002b_002b-Dialect-Options.html#index-frepo) +has been deprecated and will be removed in a future release. + + +The --with-default-libstdcxx-abi=gcc4-compatible configure +option is broken in the 9.1 and 9.2 releases, producing a shared library +with missing symbols +(see bug 90361: https://gcc.gnu.org/PR90361). +As a workaround, configure without that option and build GCC as normal, +then edit the installed headers +to define the _GLIBCXX_USE_CXX11_ABI macro to 0.
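The header edit described above boils down to changing the installed default (the file is bits/c++config.h; its exact install path varies by configuration):

    // A build configured without the broken option defaults this to 1;
    // flipping it to 0 restores the gcc4-compatible default for user code.
    #define _GLIBCXX_USE_CXX11_ABI 0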
[PATCH 0/2] Fix dangling pointer in next_nested.
Hi. The first patch adds verification of cgraph_node::origin/nested/next_nested. The verification can find the issue in the Ada run-time library on x86_64 even without bootstrap. The second patch is the fix, where we need to clean up the field. The patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin Martin Liska (2): Add ::verify for cgraph_node::origin/nested/next_nested. Clean next_nested properly. gcc/cgraph.c | 35 +++ 1 file changed, 31 insertions(+), 4 deletions(-) -- 2.22.0
[PATCH 2/2] Clean next_nested properly.
gcc/ChangeLog: 2019-08-14 Martin Liska PR ipa/91438 * cgraph.c (cgraph_node::remove): When setting n->origin = NULL for all nested functions, reset also next_nested. --- gcc/cgraph.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/gcc/cgraph.c b/gcc/cgraph.c index eb38b905879..ea8ab38d806 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -1767,8 +1767,6 @@ cgraph_node::release_body (bool keep_arguments) void cgraph_node::remove (void) { - cgraph_node *n; - if (symtab->ipa_clones_dump_file && symtab->cloned_nodes.contains (this)) fprintf (symtab->ipa_clones_dump_file, "Callgraph removal;%s;%d;%s;%d;%d\n", asm_name (), order, @@ -1785,8 +1783,13 @@ cgraph_node::remove (void) */ force_output = false; forced_by_abi = false; - for (n = nested; n; n = n->next_nested) + cgraph_node *next = nested; + for (cgraph_node *n = nested; n; n = next) + { +next = n->next_nested; n->origin = NULL; +n->next_nested = NULL; + } nested = NULL; if (origin) { @@ -1840,7 +1843,7 @@ cgraph_node::remove (void) */ if (symtab->state != LTO_STREAMING) { - n = cgraph_node::get (decl); + cgraph_node *n = cgraph_node::get (decl); if (!n || (!n->clones && !n->clone_of && !n->global.inlined_to && ((symtab->global_info_ready || in_lto_p)
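For reference, the list manipulation being fixed, in isolation (a simplified sketch; the real fields live on cgraph_node in gcc/cgraph.h):

    // Each nested function records its parent in `origin' and its next
    // sibling in `next_nested'; the parent heads the list via `nested'.
    struct node
    {
      node *origin;
      node *nested;
      node *next_nested;
    };

    // Unlink every child, caching the successor before clearing it.
    // Resetting next_nested as well as origin is the point of the fix:
    // previously next_nested could be left dangling at a freed sibling.
    static void
    unlink_children (node *parent)
    {
      node *next;
      for (node *n = parent->nested; n; n = next)
        {
          next = n->next_nested;
          n->origin = nullptr;
          n->next_nested = nullptr;
        }
      parent->nested = nullptr;
    }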
[PATCH 1/2] Add ::verify for cgraph_node::origin/nested/next_nested.
gcc/ChangeLog: 2019-08-14 Martin Liska * cgraph.c (cgraph_node::verify_node): Verify origin, nested and next_nested. --- gcc/cgraph.c | 24 1 file changed, 24 insertions(+) diff --git a/gcc/cgraph.c b/gcc/cgraph.c index ed46d81a513..eb38b905879 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -3464,6 +3464,30 @@ cgraph_node::verify_node (void) e->aux = 0; } } + + if (nested != NULL) +{ + for (cgraph_node *n = nested; n != NULL; n = n->next_nested) + { + if (n->origin == NULL) + { + error ("missing origin for a node in a nested list"); + error_found = true; + } + else if (n->origin != this) + { + error ("origin points to a different parent"); + error_found = true; + break; + } + } +} + if (next_nested != NULL && origin == NULL) +{ + error ("missing origin for a node in a nested list"); + error_found = true; +} + if (error_found) { dump (stderr);
Re: [PATCHv4] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On Thu, 8 Aug 2019, Bernd Edlinger wrote: > On 8/2/19 9:01 PM, Bernd Edlinger wrote: > > On 8/2/19 3:11 PM, Richard Biener wrote: > >> On Tue, 30 Jul 2019, Bernd Edlinger wrote: > >> > >>> > >>> I have no test coverage for the movmisalign optab though, so I > >>> rely on your code review for that part. > >> > >> It looks OK. I tried to make it trigger on the following on > >> i?86 with -msse2: > >> > >> typedef int v4si __attribute__((vector_size (16))); > >> > >> struct S { v4si v; } __attribute__((packed)); > >> > >> v4si foo (struct S s) > >> { > >> return s.v; > >> } > >> > > > > Hmm, the entry_parm need to be a MEM_P and an unaligned one. > > So the test case could be made to trigger it this way: > > > > typedef int v4si __attribute__((vector_size (16))); > > > > struct S { v4si v; } __attribute__((packed)); > > > > int t; > > v4si foo (struct S a, struct S b, struct S c, struct S d, > > struct S e, struct S f, struct S g, struct S h, > > int i, int j, int k, int l, int m, int n, > > int o, struct S s) > > { > > t = o; > > return s.v; > > } > > > > Ah, I realized that there are already a couple of very similar > test cases: gcc.target/i386/pr35767-1.c, gcc.target/i386/pr35767-1d.c, > gcc.target/i386/pr35767-1i.c and gcc.target/i386/pr39445.c, > which also manage to execute the movmisalign code with the latest patch > version. So I thought that it is not necessary to add another one. > > > However the code path is still not reached, since > > targetm.slow_ualigned_access > > is always FALSE, which is probably a flaw in my patch. > > > > So I think, > > > > + else if (MEM_P (data->entry_parm) > > + && GET_MODE_ALIGNMENT (promoted_nominal_mode) > > + > MEM_ALIGN (data->entry_parm) > > + && targetm.slow_unaligned_access (promoted_nominal_mode, > > +MEM_ALIGN (data->entry_parm))) > > > > should probably better be > > > > + else if (MEM_P (data->entry_parm) > > + && GET_MODE_ALIGNMENT (promoted_nominal_mode) > > + > MEM_ALIGN (data->entry_parm) > > +&& (((icode = optab_handler (movmisalign_optab, > > promoted_nominal_mode)) > > + != CODE_FOR_nothing) > > +|| targetm.slow_unaligned_access (promoted_nominal_mode, > > + MEM_ALIGN > > (data->entry_parm > > > > Right? > > > > Then the modified test case would use the movmisalign optab. > > However nothing changes in the end, since the i386 back-end is used to work > > around the middle end not using movmisalign optab when it should do so. > > > > I prefer the second form of the check, as it offers more test coverage, > and is probably more correct than the former. > > Note there are more variations of this misalign check in expr.c, > some are somehow odd, like expansion of MEM_REF and VIEW_CONVERT_EXPR: > > && mode != BLKmode > && align < GET_MODE_ALIGNMENT (mode)) > { > if ((icode = optab_handler (movmisalign_optab, mode)) > != CODE_FOR_nothing) > [...] > else if (targetm.slow_unaligned_access (mode, align)) > temp = extract_bit_field (temp, GET_MODE_BITSIZE (mode), > 0, TYPE_UNSIGNED (TREE_TYPE (exp)), > (modifier == EXPAND_STACK_PARM > ? NULL_RTX : target), > mode, mode, false, alt_rtl); > > I wonder if they are correct this way, why shouldn't we use the movmisalign > optab if it exists, regardless of TARGET_SLOW_UNALIGNED_ACCESSS ? Doesn't the code do exactly this? Prefer movmisalign over extrct_bit_field? > > > I wonder if I should try to add a gcc_checking_assert to the mov > > expand > > patterns that the memory is properly aligned ? > > > > Wow, that was a really exciting bug-hunt with those assertions around... 
:) > >> @@ -3292,6 +3306,23 @@ assign_parm_setup_reg (struct assign_parm_data_all > >> > >>did_conversion = true; > >> } > >> + else if (MEM_P (data->entry_parm) > >> + && GET_MODE_ALIGNMENT (promoted_nominal_mode) > >> + > MEM_ALIGN (data->entry_parm) > >> > >> we arrive here by-passing > >> > >> else if (need_conversion) > >> { > >> /* We did not have an insn to convert directly, or the sequence > >> generated appeared unsafe. We must first copy the parm to a > >> pseudo reg, and save the conversion until after all > >> parameters have been moved. */ > >> > >> int save_tree_used; > >> rtx tempreg = gen_reg_rtx (GET_MODE (data->entry_parm)); > >> > >> emit_move_insn (tempreg, validated_mem); > >> > >> but this move instruction is invalid in the same way as the case > >> you fix, no? So wouldn't it be better to do > >> > > > > We could do that, but I s
Re: [PATCH] Add generic support for "noinit" attribute
Sorry for the slow response, I'd missed that there was an updated patch... Christophe Lyon writes: > 2019-07-04 Christophe Lyon > > * lib/target-supports.exp (check_effective_target_noinit): New > proc. > * gcc.c-torture/execute/noinit-attribute.c: New test. Second line should be indented by tabs rather than spaces. > @@ -2224,6 +2234,54 @@ handle_weak_attribute (tree *node, tree name, >return NULL_TREE; > } > > +/* Handle a "noinit" attribute; arguments as in struct > + attribute_spec.handler. Check whether the attribute is allowed > + here and add the attribute to the variable decl tree or otherwise > + issue a diagnostic. This function checks NODE is of the expected > + type and issues diagnostics otherwise using NAME. If it is not of > + the expected type *NO_ADD_ATTRS will be set to true. */ > + > +static tree > +handle_noinit_attribute (tree * node, > + tree name, > + tree args, > + intflags ATTRIBUTE_UNUSED, > + bool *no_add_attrs) > +{ > + const char *message = NULL; > + > + gcc_assert (DECL_P (*node)); > + gcc_assert (args == NULL); > + > + if (TREE_CODE (*node) != VAR_DECL) > +message = G_("%qE attribute only applies to variables"); > + > + /* Check that it's possible for the variable to have a section. */ > + else if ((TREE_STATIC (*node) || DECL_EXTERNAL (*node) || in_lto_p) > +&& DECL_SECTION_NAME (*node)) > +message = G_("%qE attribute cannot be applied to variables " > + "with specific sections"); > + > + if (!targetm.have_switchable_bss_sections) > +message = G_("%qE attribute is specific to ELF targets"); Maybe make this an else if too? Or make the VAR_DECL an else if if you think the ELF one should win. Either way, it seems odd to have the mixture between else if and not. > + if (message) > +{ > + warning (OPT_Wattributes, message, name); > + *no_add_attrs = true; > +} > + else > + /* If this var is thought to be common, then change this. Common > + variables are assigned to sections before the backend has a > + chance to process them. Do this only if the attribute is > + valid. */ Comment should be indented two spaces more. > +if (DECL_COMMON (*node)) > + DECL_COMMON (*node) = 0; > + > + return NULL_TREE; > +} > + > + > /* Handle a "noplt" attribute; arguments as in > struct attribute_spec.handler. */ > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index f2619e1..f1af1dc 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -7129,6 +7129,14 @@ The @code{visibility} attribute is described in > The @code{weak} attribute is described in > @ref{Common Function Attributes}. > > +@item noinit > +@cindex @code{noinit} variable attribute > +Any data with the @code{noinit} attribute will not be initialized by > +the C runtime startup code, or the program loader. Not initializing > +data in this way can reduce program startup times. Specific to ELF > +targets, this attribute relies on the linker to place such data in the > +right location. Maybe: This attribute is specific to ELF targets and relies on the linker to place such data in the right location. > diff --git a/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c > b/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c > new file mode 100644 > index 000..ffcf8c6 > --- /dev/null > +++ b/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c > @@ -0,0 +1,59 @@ > +/* { dg-do run } */ > +/* { dg-require-effective-target noinit */ > +/* { dg-options "-O2" } */ > + > +/* This test checks that noinit data is handled correctly. 
*/ > + > +extern void _start (void) __attribute__ ((noreturn)); > +extern void abort (void) __attribute__ ((noreturn)); > +extern void exit (int) __attribute__ ((noreturn)); > + > +int var_common; > +int var_zero = 0; > +int var_one = 1; > +int __attribute__((noinit)) var_noinit; > +int var_init = 2; > + > +int __attribute__((noinit)) func(); /* { dg-warning "attribute only applies > to variables" } */ > +int __attribute__((section ("mysection"), noinit)) var_section1; /* { > dg-warning "because it conflicts with attribute" } */ > +int __attribute__((noinit, section ("mysection"))) var_section2; /* { > dg-warning "because it conflicts with attribute" } */ > + > + > +int > +main (void) > +{ > + /* Make sure that the C startup code has correctly initialized the > ordinary variables. */ > + if (var_common != 0) > +abort (); > + > + /* Initialized variables are not re-initialized during startup, so > + check their original values only during the first run of this > + test. */ > + if (var_init == 2) > +if (var_zero != 0 || var_one != 1) > + abort (); > + > + switch (var_init) > +{ > +case 2: > + /* First time through - change all the values. */ > + var_common = var_zero = var_one = var_noinit = var_init = 3
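In user terms the attribute under review behaves like this (a sketch; the linker script must provide a .noinit section for it to take effect):

    // Keeps whatever the RAM held before startup, so it can carry state
    // across a warm reset on embedded ELF targets.
    int boot_count __attribute__ ((noinit));

    // Ordinary globals are still zeroed or initialised by the startup code.
    int zeroed_as_usual;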
Add IFN_COND functions for shifting
This patch adds support for IFN_COND shifts left and shifts right. This is mostly mechanical, but since we try to handle conditional operations in the same way as unconditional operations in match.pd, we need to support IFN_COND shifts by scalars as well as vectors. E.g.: IFN_COND_SHL (cond, a, { 1, 1, ... }, fallback) and: IFN_COND_SHL (cond, a, 1, fallback) are the same operation, with: (for shiftrotate (lrotate rrotate lshift rshift) ... /* Prefer vector1 << scalar to vector1 << vector2 if vector2 is uniform. */ (for vec (VECTOR_CST CONSTRUCTOR) (simplify (shiftrotate @0 vec@1) (with { tree tem = uniform_vector_p (@1); } (if (tem) (shiftrotate @0 { tem; })) preferring the latter. The patch copes with this by extending create_convert_operand_from to handle scalar-to-vector conversions. Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf and x86_64-linux-gnu. OK for the generic bits? Richard 2019-08-14 Richard Sandiford Prathamesh Kulkarni gcc/ * internal-fn.def (IFN_COND_SHL, IFN_COND_SHR): New internal functions. * internal-fn.c (FOR_EACH_CODE_MAPPING): Handle shifts. * match.pd (UNCOND_BINARY, COND_BINARY): Likewise. * optabs.def (cond_ashl_optab, cond_ashr_optab, cond_lshr_optab): New optabs. * optabs.h (create_convert_operand_from): Expand comment. * optabs.c (maybe_legitimize_operand): Allow implicit broadcasts when mapping scalar rtxes to vector operands. * config/aarch64/iterators.md (SVE_INT_BINARY): Add ashift, ashiftrt and lshiftrt. (sve_int_op, sve_int_op_rev, sve_pred_int_rhs2_operand): Handle them. * config/aarch64/aarch64-sve.md (*cond__2_const) (*cond__any_const): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_shift_1.c: New test. * gcc.target/aarch64/sve/cond_shift_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_2.c: Likewise. * gcc.target/aarch64/sve/cond_shift_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_3.c: Likewise. * gcc.target/aarch64/sve/cond_shift_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_4.c: Likewise. * gcc.target/aarch64/sve/cond_shift_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_5.c: Likewise. * gcc.target/aarch64/sve/cond_shift_5_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_6.c: Likewise. * gcc.target/aarch64/sve/cond_shift_6_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_7.c: Likewise. * gcc.target/aarch64/sve/cond_shift_7_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_8.c: Likewise. * gcc.target/aarch64/sve/cond_shift_8_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_9.c: Likewise. * gcc.target/aarch64/sve/cond_shift_9_run.c: Likewise. 
Index: gcc/internal-fn.def === --- gcc/internal-fn.def 2019-06-18 09:35:54.921869466 +0100 +++ gcc/internal-fn.def 2019-08-14 13:22:08.625843346 +0100 @@ -167,6 +167,10 @@ DEF_INTERNAL_OPTAB_FN (COND_IOR, ECF_CON cond_ior, cond_binary) DEF_INTERNAL_OPTAB_FN (COND_XOR, ECF_CONST | ECF_NOTHROW, cond_xor, cond_binary) +DEF_INTERNAL_OPTAB_FN (COND_SHL, ECF_CONST | ECF_NOTHROW, + cond_ashl, cond_binary) +DEF_INTERNAL_SIGNED_OPTAB_FN (COND_SHR, ECF_CONST | ECF_NOTHROW, first, + cond_ashr, cond_lshr, cond_binary) DEF_INTERNAL_OPTAB_FN (COND_FMA, ECF_CONST, cond_fma, cond_ternary) DEF_INTERNAL_OPTAB_FN (COND_FMS, ECF_CONST, cond_fms, cond_ternary) Index: gcc/internal-fn.c === --- gcc/internal-fn.c 2019-07-10 19:41:21.623936245 +0100 +++ gcc/internal-fn.c 2019-08-14 13:22:08.625843346 +0100 @@ -3286,7 +3286,9 @@ #define FOR_EACH_CODE_MAPPING(T) \ T (MAX_EXPR, IFN_COND_MAX) \ T (BIT_AND_EXPR, IFN_COND_AND) \ T (BIT_IOR_EXPR, IFN_COND_IOR) \ - T (BIT_XOR_EXPR, IFN_COND_XOR) + T (BIT_XOR_EXPR, IFN_COND_XOR) \ + T (LSHIFT_EXPR, IFN_COND_SHL) \ + T (RSHIFT_EXPR, IFN_COND_SHR) /* Return a function that only performs CODE when a certain condition is met and that uses a given fallback value otherwise. For example, if CODE is Index: gcc/match.pd === --- gcc/match.pd2019-07-29 09:39:48.690173827 +0100 +++ gcc/match.pd2019-08-14 13:22:08.625843346 +0100 @@ -83,12 +83,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) plus minus mult trunc_div trunc_mod rdiv min max - bit_and bit_ior bit_xor) + bit_and bit_ior bit_xor + lshift rshift) (define_operator_list COND_BINARY IFN_COND_ADD IFN_COND_SUB IFN_COND_MUL IFN_COND_DIV IFN_COND_MOD IFN_COND_RDIV IFN_COND_MIN IFN_COND_MAX - IFN_COND_AND IFN_CO
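A sketch of the motivating source shape (assumed options -O2 -ftree-vectorize on an SVE target): the shift amount below is a uniform scalar, so after the match.pd canonicalisation quoted above the conditional form must accept a scalar second operand, which is what the create_convert_operand_from change provides.

    #include <stdint.h>

    // Expected to become IFN_COND_SHL with a scalar shift amount,
    // broadcast to a vector when the optab is expanded.
    void
    cond_shl (int32_t *__restrict r, int32_t *__restrict a,
              int32_t *__restrict pred, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? a[i] << 4 : a[i];
    }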
Re: [PATCH 5/9] Come up with an abstraction.
On Mon, Aug 12, 2019 at 3:56 PM Martin Liška wrote: > > On 8/12/19 2:43 PM, Richard Biener wrote: > > On Mon, Aug 12, 2019 at 1:49 PM Martin Liška wrote: > >> > >> On 8/12/19 1:40 PM, Richard Biener wrote: > >>> On Mon, Aug 12, 2019 at 1:19 PM Martin Liška wrote: > > On 8/8/19 5:55 PM, Michael Matz wrote: > > Hi, > > > > On Mon, 10 Jun 2019, Martin Liska wrote: > > > >> 2019-07-24 Martin Liska > >> > >> * fold-const.c (operand_equal_p): Rename to ... > >> (operand_compare::operand_equal_p): ... this. > >> (add_expr): Rename to ... > >> (operand_compare::hash_operand): ... this. > >> (operand_compare::operand_equal_valueize): Likewise. > >> (operand_compare::hash_operand_valueize): Likewise. > >> * fold-const.h (operand_equal_p): Set default > >> value for last argument. > >> (class operand_compare): New. > > > > Hmpf. A class without any data? That doesn't sound like a good design. > > Yes, the base class (current operand_equal_p) does not have a data. > But the ICF derive class has a data and e.g. > func_checker::operand_equal_valueize > will use m_label_bb_map.get (t1). Which are member data of class > func_checker. > > > You seem to need it only to have the possibility of virtual functions, > > i.e. fancy callbacks. AFAICS you only have one derived class, i.e. a > > simple distinction of two cases. What do you think about encoding the > > additional new (ICF) case in the (existing) 'flags' argument to > > operand_equal_p (and in case the ICF flag is set simply call the > > "callback" directly)? > > That's possible. I can add two more callbacks to the operand_equal_p > function > (hash_operand_valueize and operand_equal_valueize). > > Is Richi also supporting this approach? > >>> > >>> I still see no value in the abstraction since you invoke none of the > >>> (virtual) methods from the base class operand_equal_p. > >> > >> I call operand_equal_valueize (and hash_operand) from operand_equal_p. > >> These are then used in IPA ICF (patch 6/9). > > > > Ugh. I see you call that after > > > > if (TREE_CODE (arg0) != TREE_CODE (arg1)) > > { > > ... > > } > > else > > return false; > > } > > > > and also after > > > > /* Check equality of integer constants before bailing out due to > > precision differences. */ > > if (TREE_CODE (arg0) == INTEGER_CST && TREE_CODE (arg1) == INTEGER_CST) > > > > which means for arg0 == SSA_NAME and arg1 == INTEGER_CST you return false > > instead of valueizing arg0 to the possibly same or same "lose" value > > and returning true. > > Yes. ICF does not allow to have anything where TREE_CODEs do not match. > > > > > Also > > > > + int val = operand_equal_valueize (arg0, arg1, flags); > > + if (val == 1) > > +return 1; > > + if (val == 0) > > +return 0; > > > > suggests that you pass in arbirtrary trees for "valueization" but it > > isn't actually > > valueization that is performed but instead it should do an alternate > > comparison > > of arg0 and arg1 with valueization. Why's this done this way instead of > > sth like > > > > if (TREE_CODE (arg0) == SSA_NAME) > >arg0 = operand_equal_valueize (arg0, flags); > > if (TREE_CODE (arg1) == SSA_NAME) > >arg1 = operand_equal_valueize (arg1, flags); > > Because I want to be given a pair of trees about which the function > operand_equal_valueize returns match/no-match/dunno. > > > > > and why's this done with virtual functions rather than a callback that we > > can > > cheaply check for NULLness in the default implementation? > > I can transform it into a hook. But as mentioned I'll need two hooks. 
> > > > > So - what does ICF want to make "equal" that isn't equal normally and how's > > that "valueization"? > > E.g. for a FUNCTION_DECL, ICF always return true because it can only calls > the operand_equal_p after callgraph is compared. Similarly for LABEL_DECLs, > we have a map (m_label_bb_map). Please take a look at patch 6/9 in this > series. Hmm, ok, so you basically replace recursive calls to operand_equal_p with operand_equal_valueize (t1, t2, 0) || operand_equal_p (t1, t2, 0) no? But the same could be achieved by actually making t1 and t2 equal according to operand_equal_p rules via the valueization hook? So replace FUNCTION_DECLs with their prevailing ones, LABEL_DECLs with theirs, etc. As given your abstraction is quite awkward to use, say, from value-numbering which knows how to "valueize" a single tree but doesn't compare things. To make it work for your case you'd valueize not only SSA names but also all DECL_P I guess. After all your operand_equal_valueize only does something for "leafs" but is called for all intermediate expressions as well. Richard. > Thanks, > Martin > > > > > Thanks, > > Richard. > > > >> Martin > >> > >>>
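For concreteness, the interface being debated has roughly this shape (simplified from patch 5/9; not the final form):

    // The base class implements today's operand_equal_p but consults a
    // tri-state hook first; IPA ICF derives from it and answers for
    // FUNCTION_DECLs, LABEL_DECLs etc. out of its own maps.
    class operand_compare
    {
    public:
      virtual int operand_equal_p (const_tree arg0, const_tree arg1,
                                   unsigned int flags);
    protected:
      // 1 = equal, 0 = not equal, -1 = no opinion, fall through to the
      // default structural comparison.
      virtual int operand_equal_valueize (const_tree, const_tree,
                                          unsigned int)
      { return -1; }
    };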
Re: C++ PATCH for c++/91391 - bogus -Wcomma-subscript warning
Ping. On Wed, Aug 07, 2019 at 04:05:53PM -0400, Marek Polacek wrote: > When implementing -Wcomma-subscript I failed to realize that a comma in > a template-argument-list shouldn't be warned about. > > But we can't simply ignore any commas inside < ... > because the following > needs to be caught: > > a[b < c, b > c]; > > This patch from Jakub fixes it by moving the warning to cp_parser_expression > where we can better detect top-level commas (and avoid saving tokens). > > I've extended the patch to revert the cp_parser_skip_to_closing_square_bracket > changes I made in r274121 -- they are no longer needed. > > Apologies for the thinko. > > Bootstrapped/regtested on x86_64-linux, ok for trunk? > > 2019-08-07 Jakub Jelinek > Marek Polacek > > PR c++/91391 - bogus -Wcomma-subscript warning. > * parser.c (cp_parser_postfix_open_square_expression): Don't warn about > a deprecated comma here. Pass warn_comma_subscript down to > cp_parser_expression. > (cp_parser_expression): New bool parameter. Warn about uses of a comma > operator within a subscripting expression. > (cp_parser_skip_to_closing_square_bracket): Revert to pre-r274121 state. > (cp_parser_skip_to_closing_square_bracket_1): Remove. > > * g++.dg/cpp2a/comma5.C: New test. > > diff --git gcc/cp/parser.c gcc/cp/parser.c > index 14b724095c4..eccc3749fd0 100644 > --- gcc/cp/parser.c > +++ gcc/cp/parser.c > @@ -2102,7 +2102,7 @@ static cp_expr cp_parser_assignment_expression > static enum tree_code cp_parser_assignment_operator_opt >(cp_parser *); > static cp_expr cp_parser_expression > - (cp_parser *, cp_id_kind * = NULL, bool = false, bool = false); > + (cp_parser *, cp_id_kind * = NULL, bool = false, bool = false, bool = > false); > static cp_expr cp_parser_constant_expression >(cp_parser *, bool = false, bool * = NULL, bool = false); > static cp_expr cp_parser_builtin_offsetof > @@ -2669,8 +2669,6 @@ static bool cp_parser_init_statement_p >(cp_parser *); > static bool cp_parser_skip_to_closing_square_bracket >(cp_parser *); > -static int cp_parser_skip_to_closing_square_bracket_1 > - (cp_parser *, enum cpp_ttype); > > /* Concept-related syntactic transformations */ > > @@ -7524,33 +7522,9 @@ cp_parser_postfix_open_square_expression (cp_parser > *parser, > index = cp_parser_braced_list (parser, &expr_nonconst_p); > } >else > - { > - /* [depr.comma.subscript]: A comma expression appearing as > - the expr-or-braced-init-list of a subscripting expression > - is deprecated. A parenthesized comma expression is not > - deprecated. */ > - if (warn_comma_subscript) > - { > - /* Save tokens so that we can put them back. */ > - cp_lexer_save_tokens (parser->lexer); > - > - /* Look for ',' that is not nested in () or {}. */ > - if (cp_parser_skip_to_closing_square_bracket_1 (parser, > - CPP_COMMA) == -1) > - { > - auto_diagnostic_group d; > - warning_at (cp_lexer_peek_token (parser->lexer)->location, > - OPT_Wcomma_subscript, > - "top-level comma expression in array subscript " > - "is deprecated"); > - } > - > - /* Roll back the tokens we skipped. */ > - cp_lexer_rollback_tokens (parser->lexer); > - } > - > - index = cp_parser_expression (parser); > - } > + index = cp_parser_expression (parser, NULL, /*cast_p=*/false, > + /*decltype_p=*/false, > + /*warn_comma_p=*/warn_comma_subscript); > } > >parser->greater_than_is_operator_p = saved_greater_than_is_operator_p; > @@ -9932,12 +9906,13 @@ cp_parser_assignment_operator_opt (cp_parser* parser) > CAST_P is true if this expression is the target of a cast. 
> DECLTYPE_P is true if this expression is the immediate operand of > decltype, > except possibly parenthesized or on the RHS of a comma (N3276). > + WARN_COMMA_P is true if a comma should be diagnosed. > > Returns a representation of the expression. */ > > static cp_expr > cp_parser_expression (cp_parser* parser, cp_id_kind * pidk, > - bool cast_p, bool decltype_p) > + bool cast_p, bool decltype_p, bool warn_comma_p) > { >cp_expr expression = NULL_TREE; >location_t loc = UNKNOWN_LOCATION; > @@ -9984,6 +9959,17 @@ cp_parser_expression (cp_parser* parser, cp_id_kind * > pidk, > break; >/* Consume the `,'. */ >loc = cp_lexer_peek_token (parser->lexer)->location; > + if (warn_comma_p) > + { > + /* [depr.comma.subscript]: A comma expression appearing as > + the expr-or-braced-init-list of a subscripting expression > +
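The cases the reworked check has to separate look like this (a sketch, assuming a dialect in which -Wcomma-subscript is active):

    template<typename T, typename U> int f ();

    void
    g (int *a, int b, int c)
    {
      a[b, c];            // warned: top-level comma expression
      a[(b, c)];          // not warned: parenthesized comma expression
      a[f<int, int> ()];  // not warned: comma in a template-argument-list
      a[b < c, b > c];    // warned: comparisons joined by a real comma
    }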
Re: [PATCH] Add generic support for "noinit" attribute
On Wed, 14 Aug 2019 at 14:14, Richard Sandiford wrote: > > Sorry for the slow response, I'd missed that there was an updated patch... > > Christophe Lyon writes: > > 2019-07-04 Christophe Lyon > > > > * lib/target-supports.exp (check_effective_target_noinit): New > > proc. > > * gcc.c-torture/execute/noinit-attribute.c: New test. > > Second line should be indented by tabs rather than spaces. > > > @@ -2224,6 +2234,54 @@ handle_weak_attribute (tree *node, tree name, > >return NULL_TREE; > > } > > > > +/* Handle a "noinit" attribute; arguments as in struct > > + attribute_spec.handler. Check whether the attribute is allowed > > + here and add the attribute to the variable decl tree or otherwise > > + issue a diagnostic. This function checks NODE is of the expected > > + type and issues diagnostics otherwise using NAME. If it is not of > > + the expected type *NO_ADD_ATTRS will be set to true. */ > > + > > +static tree > > +handle_noinit_attribute (tree * node, > > + tree name, > > + tree args, > > + intflags ATTRIBUTE_UNUSED, > > + bool *no_add_attrs) > > +{ > > + const char *message = NULL; > > + > > + gcc_assert (DECL_P (*node)); > > + gcc_assert (args == NULL); > > + > > + if (TREE_CODE (*node) != VAR_DECL) > > +message = G_("%qE attribute only applies to variables"); > > + > > + /* Check that it's possible for the variable to have a section. */ > > + else if ((TREE_STATIC (*node) || DECL_EXTERNAL (*node) || in_lto_p) > > +&& DECL_SECTION_NAME (*node)) > > +message = G_("%qE attribute cannot be applied to variables " > > + "with specific sections"); > > + > > + if (!targetm.have_switchable_bss_sections) > > +message = G_("%qE attribute is specific to ELF targets"); > > Maybe make this an else if too? Or make the VAR_DECL an else if > if you think the ELF one should win. Either way, it seems odd to > have the mixture between else if and not. > Right, I changed this into an else if. > > + if (message) > > +{ > > + warning (OPT_Wattributes, message, name); > > + *no_add_attrs = true; > > +} > > + else > > + /* If this var is thought to be common, then change this. Common > > + variables are assigned to sections before the backend has a > > + chance to process them. Do this only if the attribute is > > + valid. */ > > Comment should be indented two spaces more. > > > +if (DECL_COMMON (*node)) > > + DECL_COMMON (*node) = 0; > > + > > + return NULL_TREE; > > +} > > + > > + > > /* Handle a "noplt" attribute; arguments as in > > struct attribute_spec.handler. */ > > > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > > index f2619e1..f1af1dc 100644 > > --- a/gcc/doc/extend.texi > > +++ b/gcc/doc/extend.texi > > @@ -7129,6 +7129,14 @@ The @code{visibility} attribute is described in > > The @code{weak} attribute is described in > > @ref{Common Function Attributes}. > > > > +@item noinit > > +@cindex @code{noinit} variable attribute > > +Any data with the @code{noinit} attribute will not be initialized by > > +the C runtime startup code, or the program loader. Not initializing > > +data in this way can reduce program startup times. Specific to ELF > > +targets, this attribute relies on the linker to place such data in the > > +right location. > > Maybe: > >This attribute is specific to ELF targets and relies on the linker to >place such data in the right location. 
> Thanks, I thought I had chosen a nice turn of phrase :-) > > diff --git a/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c > > b/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c > > new file mode 100644 > > index 000..ffcf8c6 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c > > @@ -0,0 +1,59 @@ > > +/* { dg-do run } */ > > +/* { dg-require-effective-target noinit */ > > +/* { dg-options "-O2" } */ > > + > > +/* This test checks that noinit data is handled correctly. */ > > + > > +extern void _start (void) __attribute__ ((noreturn)); > > +extern void abort (void) __attribute__ ((noreturn)); > > +extern void exit (int) __attribute__ ((noreturn)); > > + > > +int var_common; > > +int var_zero = 0; > > +int var_one = 1; > > +int __attribute__((noinit)) var_noinit; > > +int var_init = 2; > > + > > +int __attribute__((noinit)) func(); /* { dg-warning "attribute only > > applies to variables" } */ > > +int __attribute__((section ("mysection"), noinit)) var_section1; /* { > > dg-warning "because it conflicts with attribute" } */ > > +int __attribute__((noinit, section ("mysection"))) var_section2; /* { > > dg-warning "because it conflicts with attribute" } */ > > + > > + > > +int > > +main (void) > > +{ > > + /* Make sure that the C startup code has correctly initialized the > > ordinary variables. */ > > + if (var_common != 0) > > +abort (); > > + > > + /* Initialized
Re: [PATCH 5/9] Come up with an abstraction.
On 8/14/19 3:04 PM, Richard Biener wrote: > On Mon, Aug 12, 2019 at 3:56 PM Martin Liška wrote: >> >> On 8/12/19 2:43 PM, Richard Biener wrote: >>> On Mon, Aug 12, 2019 at 1:49 PM Martin Liška wrote: On 8/12/19 1:40 PM, Richard Biener wrote: > On Mon, Aug 12, 2019 at 1:19 PM Martin Liška wrote: >> >> On 8/8/19 5:55 PM, Michael Matz wrote: >>> Hi, >>> >>> On Mon, 10 Jun 2019, Martin Liska wrote: >>> 2019-07-24 Martin Liska * fold-const.c (operand_equal_p): Rename to ... (operand_compare::operand_equal_p): ... this. (add_expr): Rename to ... (operand_compare::hash_operand): ... this. (operand_compare::operand_equal_valueize): Likewise. (operand_compare::hash_operand_valueize): Likewise. * fold-const.h (operand_equal_p): Set default value for last argument. (class operand_compare): New. >>> >>> Hmpf. A class without any data? That doesn't sound like a good design. >> >> Yes, the base class (current operand_equal_p) does not have a data. >> But the ICF derive class has a data and e.g. >> func_checker::operand_equal_valueize >> will use m_label_bb_map.get (t1). Which are member data of class >> func_checker. >> >>> You seem to need it only to have the possibility of virtual functions, >>> i.e. fancy callbacks. AFAICS you only have one derived class, i.e. a >>> simple distinction of two cases. What do you think about encoding the >>> additional new (ICF) case in the (existing) 'flags' argument to >>> operand_equal_p (and in case the ICF flag is set simply call the >>> "callback" directly)? >> >> That's possible. I can add two more callbacks to the operand_equal_p >> function >> (hash_operand_valueize and operand_equal_valueize). >> >> Is Richi also supporting this approach? > > I still see no value in the abstraction since you invoke none of the > (virtual) methods from the base class operand_equal_p. I call operand_equal_valueize (and hash_operand) from operand_equal_p. These are then used in IPA ICF (patch 6/9). >>> >>> Ugh. I see you call that after >>> >>> if (TREE_CODE (arg0) != TREE_CODE (arg1)) >>> { >>> ... >>> } >>> else >>> return false; >>> } >>> >>> and also after >>> >>> /* Check equality of integer constants before bailing out due to >>> precision differences. */ >>> if (TREE_CODE (arg0) == INTEGER_CST && TREE_CODE (arg1) == INTEGER_CST) >>> >>> which means for arg0 == SSA_NAME and arg1 == INTEGER_CST you return false >>> instead of valueizing arg0 to the possibly same or same "lose" value >>> and returning true. >> >> Yes. ICF does not allow to have anything where TREE_CODEs do not match. >> >>> >>> Also >>> >>> + int val = operand_equal_valueize (arg0, arg1, flags); >>> + if (val == 1) >>> +return 1; >>> + if (val == 0) >>> +return 0; >>> >>> suggests that you pass in arbirtrary trees for "valueization" but it >>> isn't actually >>> valueization that is performed but instead it should do an alternate >>> comparison >>> of arg0 and arg1 with valueization. Why's this done this way instead of >>> sth like >>> >>> if (TREE_CODE (arg0) == SSA_NAME) >>>arg0 = operand_equal_valueize (arg0, flags); >>> if (TREE_CODE (arg1) == SSA_NAME) >>>arg1 = operand_equal_valueize (arg1, flags); >> >> Because I want to be given a pair of trees about which the function >> operand_equal_valueize returns match/no-match/dunno. >> >>> >>> and why's this done with virtual functions rather than a callback that we >>> can >>> cheaply check for NULLness in the default implementation? >> >> I can transform it into a hook. But as mentioned I'll need two hooks. 
>> >>> >>> So - what does ICF want to make "equal" that isn't equal normally and how's >>> that "valueization"? >> >> E.g. for a FUNCTION_DECL, ICF always return true because it can only calls >> the operand_equal_p after callgraph is compared. Similarly for LABEL_DECLs, >> we have a map (m_label_bb_map). Please take a look at patch 6/9 in this >> series. > > Hmm, ok, so you basically replace recursive calls to operand_equal_p with > > operand_equal_valueize (t1, t2, 0) > || operand_equal_p (t1, t2, 0) > > no? This is not going to work .. > But the same could be achieved by actually making t1 and t2 equal > according to operand_equal_p rules via the valueization hook? So replace > FUNCTION_DECLs with their prevailing ones, LABEL_DECLs with theirs, etc. > > As given your abstraction is quite awkward to use, say, from value-numbering > which knows how to "valueize" a single tree but doesn't compare things. > > To make it work for your case you'd valueize not only SSA names but also > all DECL_P I guess. After all your operand_equal_valueize only does > something for "leafs" but is called for
[PATCH] Make GIMPLE forwprop DCE dead stmts
The following patch makes forwprop DCE the stmts that become dead because of propagation of copies and constants. For this to work we actually have to do that reliably rather than relying on fold_stmt doing this for us. This hits fortran/trans-intrinsic.c in a way that we do "interesting" jump threading exposing a bogus uninit warning. I'll open a PR for this with an (unreduced) testcase after committing. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. I've done this when seeing the number of copyprop passes we have and knowing the expense of the SSA propagation machinery so eventually forwprop (in a cheaper mode, not folding all stmts) could replace copyprop. Richard. 2019-08-14 Richard Biener * tree-ssa-forwprop.c (pass_forwprop::execute): Fully propagate lattice, DCE stmts that became dead because of that. fortran/ * trans-intrinsic.c (gfc_conv_intrinsic_findloc): Initialize forward_branch to avoid bogus uninitialized warning. * gcc.dg/tree-ssa/forwprop-31.c: Adjust. Index: gcc/tree-ssa-forwprop.c === --- gcc/tree-ssa-forwprop.c (revision 274422) +++ gcc/tree-ssa-forwprop.c (working copy) @@ -2299,13 +2299,14 @@ pass_forwprop::execute (function *fun) int postorder_num = pre_and_rev_post_order_compute_fn (cfun, NULL, postorder, false); auto_vec to_fixup; + auto_vec to_remove; to_purge = BITMAP_ALLOC (NULL); for (int i = 0; i < postorder_num; ++i) { gimple_stmt_iterator gsi; basic_block bb = BASIC_BLOCK_FOR_FN (fun, postorder[i]); - /* Propagate into PHIs and record degenerate ones in the lattice. */ + /* Record degenerate PHIs in the lattice. */ for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si)) { @@ -2321,17 +2322,20 @@ pass_forwprop::execute (function *fun) FOR_EACH_PHI_ARG (use_p, phi, it, SSA_OP_USE) { tree use = USE_FROM_PTR (use_p); - tree tem = fwprop_ssa_val (use); if (! first) - first = tem; - else if (! operand_equal_p (first, tem, 0)) - all_same = false; - if (tem != use - && may_propagate_copy (use, tem)) - propagate_value (use_p, tem); + first = use; + else if (! operand_equal_p (first, use, 0)) + { + all_same = false; + break; + } } if (all_same) - fwprop_set_lattice_val (res, first); + { + if (may_propagate_copy (res, first)) + to_remove.safe_push (phi); + fwprop_set_lattice_val (res, first); + } } /* Apply forward propagation to all stmts in the basic-block. @@ -2648,148 +2652,227 @@ pass_forwprop::execute (function *fun) /* Combine stmts with the stmts defining their operands. Note we update GSI within the loop as necessary. */ - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);) + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) { gimple *stmt = gsi_stmt (gsi); - gimple *orig_stmt = stmt; - bool changed = false; - bool was_noreturn = (is_gimple_call (stmt) - && gimple_call_noreturn_p (stmt)); /* Mark stmt as potentially needing revisiting. */ gimple_set_plf (stmt, GF_PLF_1, false); - if (fold_stmt (&gsi, fwprop_ssa_val)) - { - changed = true; - stmt = gsi_stmt (gsi); - if (maybe_clean_or_replace_eh_stmt (orig_stmt, stmt)) - bitmap_set_bit (to_purge, bb->index); - if (!was_noreturn - && is_gimple_call (stmt) && gimple_call_noreturn_p (stmt)) - to_fixup.safe_push (stmt); - /* Cleanup the CFG if we simplified a condition to -true or false. 
*/ - if (gcond *cond = dyn_cast (stmt)) - if (gimple_cond_true_p (cond) - || gimple_cond_false_p (cond)) - cfg_changed = true; - update_stmt (stmt); - } - - switch (gimple_code (stmt)) - { - case GIMPLE_ASSIGN: - { - tree rhs1 = gimple_assign_rhs1 (stmt); - enum tree_code code = gimple_assign_rhs_code (stmt); + /* Substitute from our lattice. We need to do so only once. */ + bool substituted_p = false; + use_operand_p usep; + ssa_op_iter iter; + FOR_EACH_SSA_USE_OPERAND (usep, stmt, iter, SSA_OP_USE) + { + tree use = USE_FROM_PTR (usep); + tree val = fwprop_ssa_val (use); + if (val && val != use && may_propagate_cop
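The removal side of this scheme can be sketched compactly. The following is a minimal sketch only, not the committed code: it assumes the stmts queued in to_remove above and the standard GIMPLE helpers (gimple_get_lhs, has_zero_uses, gsi_for_stmt, remove_phi_node, gsi_remove, release_defs).

static void
dce_propagated_stmts (auto_vec<gimple *> &to_remove)
{
  while (!to_remove.is_empty ())
    {
      gimple *stmt = to_remove.pop ();
      tree lhs = gimple_get_lhs (stmt);
      /* Only stmts whose result was fully propagated are dead.  */
      if (!lhs || TREE_CODE (lhs) != SSA_NAME || !has_zero_uses (lhs))
	continue;
      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
      if (gimple_code (stmt) == GIMPLE_PHI)
	remove_phi_node (&gsi, true);
      else
	{
	  gsi_remove (&gsi, true);
	  release_defs (stmt);
	}
    }
}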
Re: types for VR_VARYING
On 8/13/19 8:39 PM, Aldy Hernandez wrote: Yes, it was 2X. I noticed that Richi made some changes to the lattice handling for VARYING while the discussion was on-going. I missed these, and had failed to adapt the patch for it. I would appreciate a final review of the attached patch, especially the vr-values.c changes, which I have modified to play nice with current trunk. I also noticed that Andrew's patch was setting num_vr_values to num_ssa_names + num_ssa_names / 10. I think he meant num_vr_values + num_vr_values / 10. Please verify the current incantation makes sense. No, I meant num_ssa_names. We are resizing the vector because num_vr_values is out of date (and smaller than num_ssa_names is now), so we need to resize the vector to be at least the number of ssa-names... and I added 10% just in case we aren't done adding new ones. If num_vr_values is 100, and we've added 200 ssa-names, num_ssa_names would now be 300. If you resize based on num_vr_values, you could still go off the end of the vector. Andrew
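A quick arithmetic restatement of the failure mode, using the numbers from the example above (illustrative only):

  unsigned num_vr_values = 100;  /* size when the vector was allocated */
  unsigned num_ssa_names = 300;  /* 200 names have been added since then */
  /* Growing from the stale counter still under-allocates: */
  unsigned wrong = num_vr_values + num_vr_values / 10;  /* 110 < 300 */
  /* Growing from the live counter covers every name, with slack: */
  unsigned right = num_ssa_names + num_ssa_names / 10;  /* 330 >= 300 */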
Re: [PATCH 5/9] Come up with an abstraction.
On Wed, Aug 14, 2019 at 3:19 PM Martin Liška wrote: > > On 8/14/19 3:04 PM, Richard Biener wrote: > > On Mon, Aug 12, 2019 at 3:56 PM Martin Liška wrote: > >> > >> On 8/12/19 2:43 PM, Richard Biener wrote: > >>> On Mon, Aug 12, 2019 at 1:49 PM Martin Liška wrote: > > On 8/12/19 1:40 PM, Richard Biener wrote: > > On Mon, Aug 12, 2019 at 1:19 PM Martin Liška wrote: > >> > >> On 8/8/19 5:55 PM, Michael Matz wrote: > >>> Hi, > >>> > >>> On Mon, 10 Jun 2019, Martin Liska wrote: > >>> > 2019-07-24 Martin Liska > > * fold-const.c (operand_equal_p): Rename to ... > (operand_compare::operand_equal_p): ... this. > (add_expr): Rename to ... > (operand_compare::hash_operand): ... this. > (operand_compare::operand_equal_valueize): Likewise. > (operand_compare::hash_operand_valueize): Likewise. > * fold-const.h (operand_equal_p): Set default > value for last argument. > (class operand_compare): New. > >>> > >>> Hmpf. A class without any data? That doesn't sound like a good > >>> design. > >> > >> Yes, the base class (current operand_equal_p) does not have a data. > >> But the ICF derive class has a data and e.g. > >> func_checker::operand_equal_valueize > >> will use m_label_bb_map.get (t1). Which are member data of class > >> func_checker. > >> > >>> You seem to need it only to have the possibility of virtual functions, > >>> i.e. fancy callbacks. AFAICS you only have one derived class, i.e. a > >>> simple distinction of two cases. What do you think about encoding the > >>> additional new (ICF) case in the (existing) 'flags' argument to > >>> operand_equal_p (and in case the ICF flag is set simply call the > >>> "callback" directly)? > >> > >> That's possible. I can add two more callbacks to the operand_equal_p > >> function > >> (hash_operand_valueize and operand_equal_valueize). > >> > >> Is Richi also supporting this approach? > > > > I still see no value in the abstraction since you invoke none of the > > (virtual) methods from the base class operand_equal_p. > > I call operand_equal_valueize (and hash_operand) from operand_equal_p. > These are then used in IPA ICF (patch 6/9). > >>> > >>> Ugh. I see you call that after > >>> > >>> if (TREE_CODE (arg0) != TREE_CODE (arg1)) > >>> { > >>> ... > >>> } > >>> else > >>> return false; > >>> } > >>> > >>> and also after > >>> > >>> /* Check equality of integer constants before bailing out due to > >>> precision differences. */ > >>> if (TREE_CODE (arg0) == INTEGER_CST && TREE_CODE (arg1) == INTEGER_CST) > >>> > >>> which means for arg0 == SSA_NAME and arg1 == INTEGER_CST you return false > >>> instead of valueizing arg0 to the possibly same or same "lose" value > >>> and returning true. > >> > >> Yes. ICF does not allow to have anything where TREE_CODEs do not match. > >> > >>> > >>> Also > >>> > >>> + int val = operand_equal_valueize (arg0, arg1, flags); > >>> + if (val == 1) > >>> +return 1; > >>> + if (val == 0) > >>> +return 0; > >>> > >>> suggests that you pass in arbirtrary trees for "valueization" but it > >>> isn't actually > >>> valueization that is performed but instead it should do an alternate > >>> comparison > >>> of arg0 and arg1 with valueization. Why's this done this way instead of > >>> sth like > >>> > >>> if (TREE_CODE (arg0) == SSA_NAME) > >>>arg0 = operand_equal_valueize (arg0, flags); > >>> if (TREE_CODE (arg1) == SSA_NAME) > >>>arg1 = operand_equal_valueize (arg1, flags); > >> > >> Because I want to be given a pair of trees about which the function > >> operand_equal_valueize returns match/no-match/dunno. 
> >> > >>> > >>> and why's this done with virtual functions rather than a callback that we > >>> can > >>> cheaply check for NULLness in the default implementation? > >> > >> I can transform it into a hook. But as mentioned I'll need two hooks. > >> > >>> > >>> So - what does ICF want to make "equal" that isn't equal normally and > >>> how's > >>> that "valueization"? > >> > >> E.g. for a FUNCTION_DECL, ICF always returns true because it can only call > >> operand_equal_p after the callgraph is compared. Similarly for LABEL_DECLs, > >> we have a map (m_label_bb_map). Please take a look at patch 6/9 in this > >> series. > > > > Hmm, ok, so you basically replace recursive calls to operand_equal_p with _recursive calls_ > > > > operand_equal_valueize (t1, t2, 0) > > || operand_equal_p (t1, t2, 0) > > > > no? > > This is not going to work .. I wonder if class base { virtual operand_equal_p (tree a, tree b, int f); }; base::operand_equal_p (tree a, tree b, int f) { as-is now, recursing to virtual operand_equal_p } class deriv : public base { virtual operand_equa
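Spelling out the sketch above in compilable form (a sketch under assumptions: signatures simplified, GCC's internal tree type assumed, real bodies elided):

class base
{
public:
  /* Behaves like today's fold-const.c operand_equal_p, except every
     recursive comparison goes through this virtual entry point.  */
  virtual bool
  operand_equal_p (tree a, tree b, int flags)
  {
    /* ... the current body would go here, with its recursive calls
       dispatching through the vtable; placeholder for the sketch: */
    return a == b && flags == 0;
  }
};

class deriv : public base
{
public:
  /* ICF: handle leaf decls itself (FUNCTION_DECLs already matched via
     the callgraph, LABEL_DECLs via m_label_bb_map), defer the rest.  */
  virtual bool
  operand_equal_p (tree a, tree b, int flags)
  {
    /* ... ICF-specific leaf cases first ... */
    return base::operand_equal_p (a, b, flags);
  }
};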
Re: types for VR_VARYING
On 8/14/19 9:50 AM, Andrew MacLeod wrote: On 8/13/19 8:39 PM, Aldy Hernandez wrote: Yes, it was 2X. I noticed that Richi made some changes to the lattice handling for VARYING while the discussion was on-going. I missed these, and had failed to adapt the patch for it. I would appreciate a final review of the attached patch, especially the vr-values.c changes, which I have modified to play nice with current trunk. I also noticed that Andrew's patch was setting num_vr_values to num_ssa_names + num_ssa_names / 10. I think he meant num_vr_values + num_vr_values / 10. Please verify the current incantation makes sense. No, I meant num_ssa_names. We are resizing the vector because num_vr_values is out of date (and smaller than num_ssa_names is now), so we need to resize the vector to be at least the number of ssa-names... and I added 10% just in case we aren't done adding new ones. If num_vr_values is 100, and we've added 200 ssa-names, num_ssa_names would now be 300. If you resize based on num_vr_values, you could still go off the end of the vector. OK, I've changed the resize to allocate 2X as well. So now we'll have: + unsigned int old_sz = num_vr_values; + num_vr_values = num_ssa_names * 2; + vr_value = XRESIZEVEC (value_range *, vr_value, num_vr_values); etc And the original allocation will also be 2X. Aldy
Re: [PATCH 1/2] PR c++/91436 fix C++ dialect for std::make_unique fix-it hint
On Wed, Aug 14, 2019 at 7:02 AM Jonathan Wakely wrote: > > On 13/08/19 16:07 -0400, Jason Merrill wrote: > >On 8/13/19 9:32 AM, Jonathan Wakely wrote: > >> * g++.dg/lookup/missing-std-include-6.C: Don't check make_unique in > >> test that runs for C++11. > > > >I'm not comfortable removing this test coverage entirely. Doesn't it > >give a useful diagnostic in C++11 mode as well? > > It does: > > mu.cc:3:15: error: 'make_unique' is not a member of 'std' > 3 | auto p = std::make_unique<int>(); > | ^~~ > mu.cc:3:15: note: 'std::make_unique' is only available from C++14 onwards > mu.cc:3:27: error: expected primary-expression before 'int' > 3 | auto p = std::make_unique<int>(); > | ^~~ > > So we can add it to g++.dg/lookup/missing-std-include-8.C instead, > which runs for c++98_only and checks for the "is only available for" > cases. Here's a patch doing that. > > Tested x86_64-linux. > > OK for trunk? > > OK for gcc-9-branch and gcc-8-branch too, since PR c++/91436 affects > those branches? OK. Jason
Re: [PATCH 1/2] PR c++/91436 fix C++ dialect for std::make_unique fix-it hint
On Wed, 2019-08-14 at 12:02 +0100, Jonathan Wakely wrote: > On 13/08/19 16:07 -0400, Jason Merrill wrote: > > On 8/13/19 9:32 AM, Jonathan Wakely wrote: > > > * g++.dg/lookup/missing-std-include-6.C: Don't check > > > make_unique in > > > test that runs for C++11. > > > > I'm not comfortable removing this test coverage entirely. Doesn't > > it > > give a useful diagnostic in C++11 mode as well? > > It does: > > mu.cc:3:15: error: 'make_unique' is not a member of 'std' > 3 | auto p = std::make_unique<int>(); > | ^~~ > mu.cc:3:15: note: 'std::make_unique' is only available from C++14 > onwards > mu.cc:3:27: error: expected primary-expression before 'int' > 3 | auto p = std::make_unique<int>(); > | ^~~ > > So we can add it to g++.dg/lookup/missing-std-include-8.C instead, > which runs for c++98_only and checks for the "is only available for" > cases. Here's a patch doing that. FWIW this eliminates the testing that when we do have C++14 onwards, that including <memory> is suggested. Maybe we need a C++14-onwards missing-std-include-* test, and to move the existing test there? (and to add the new test for before-C++-14) > Tested x86_64-linux. > > OK for trunk? > > OK for gcc-9-branch and gcc-8-branch too, since PR c++/91436 affects > those branches? >
Re: C++ PATCH for c++/91391 - bogus -Wcomma-subscript warning
On 8/14/19 9:15 AM, Marek Polacek wrote: Ping. On Wed, Aug 07, 2019 at 04:05:53PM -0400, Marek Polacek wrote: When implementing -Wcomma-subscript I failed to realize that a comma in a template-argument-list shouldn't be warned about. But we can't simply ignore any commas inside < ... > because the following needs to be caught: a[b < c, b > c]; This patch from Jakub fixes it by moving the warning to cp_parser_expression where we can better detect top-level commas (and avoid saving tokens). I've extended the patch to revert the cp_parser_skip_to_closing_square_bracket changes I made in r274121 -- they are no longer needed. Apologies for the thinko. Bootstrapped/regtested on x86_64-linux, ok for trunk? OK. Jason
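For readers following along, the a[b < c, b > c] case from the mail can be made concrete (illustrative example, not taken from the patch):

  int a[10];
  int b = 1, c = 2;
  /* '<' and '>' here are comparisons, not template brackets, so the
     ',' really is a comma operator inside a subscript: C++20 deprecates
     it, and -Wcomma-subscript must warn.  The index is (b > c), i.e. 0.  */
  int x = a[b < c, b > c];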
Re: [PATCH 1/2] PR c++/91436 fix C++ dialect for std::make_unique fix-it hint
On 8/14/19 10:39 AM, David Malcolm wrote: On Wed, 2019-08-14 at 12:02 +0100, Jonathan Wakely wrote: On 13/08/19 16:07 -0400, Jason Merrill wrote: On 8/13/19 9:32 AM, Jonathan Wakely wrote: * g++.dg/lookup/missing-std-include-6.C: Don't check make_unique in test that runs for C++11. I'm not comfortable removing this test coverage entirely. Doesn't it give a useful diagnostic in C++11 mode as well? It does: mu.cc:3:15: error: 'make_unique' is not a member of 'std' 3 | auto p = std::make_unique<int>(); | ^~~ mu.cc:3:15: note: 'std::make_unique' is only available from C++14 onwards mu.cc:3:27: error: expected primary-expression before 'int' 3 | auto p = std::make_unique<int>(); | ^~~ So we can add it to g++.dg/lookup/missing-std-include-8.C instead, which runs for c++98_only and checks for the "is only available for" cases. Here's a patch doing that. FWIW this eliminates the testing that when we do have C++14 onwards, that including <memory> is suggested. Maybe we need a C++14-onwards missing-std-include-* test, and to move the existing test there? (and to add the new test for before-C++-14) We can also check for different messages in different std modes, i.e. { dg-message "one" "" { target c++11_down } .-1 } { dg-message "two" "" { target c++14 } .-2 } Jason
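Concretely, such a combined test might look like this (a hypothetical sketch; the diagnostic strings are paraphrased from the output quoted above, not copied from the patch):

  auto p = std::make_unique<int>(); // { dg-error "'make_unique' is not a member of 'std'" }
  // { dg-message "only available from C\\+\\+14 onwards" "" { target c++11_down } .-1 }
  // { dg-message "did you forget to '#include <memory>'" "" { target c++14 } .-2 }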
[SVE] PR86753
Hi, The attached patch tries to fix PR86753. For the following test: void f1 (int *restrict x, int *restrict y, int *restrict z) { for (int i = 0; i < 100; ++i) x[i] = y[i] ? z[i] : 10; } vect dump shows: vect_cst__42 = { 0, ... }; vect_cst__48 = { 0, ... }; vect__4.7_41 = .MASK_LOAD (vectp_y.5_38, 4B, loop_mask_40); _4 = *_3; _5 = z_12(D) + _2; mask__35.8_43 = vect__4.7_41 != vect_cst__42; _35 = _4 != 0; vec_mask_and_46 = mask__35.8_43 & loop_mask_40; vect_iftmp.11_47 = .MASK_LOAD (vectp_z.9_44, 4B, vec_mask_and_46); iftmp.0_13 = 0; vect_iftmp.12_50 = VEC_COND_EXPR <vect__4.7_41 != vect_cst__48, vect_iftmp.11_47, vect_cst__49>; and the following code-gen: L2: ld1w z0.s, p2/z, [x1, x3, lsl 2] cmpne p1.s, p3/z, z0.s, #0 cmpne p0.s, p2/z, z0.s, #0 ld1w z0.s, p0/z, [x2, x3, lsl 2] sel z0.s, p1, z0.s, z1.s We could reuse vec_mask_and_46 in the vec_cond_expr since the conditions vect__4.7_41 != vect_cst__48 and vect__4.7_41 != vect_cst__42 are equivalent, and vect_iftmp.11_47 depends on vect__4.7_41 != vect_cst__48. I suppose in general for vec_cond_expr <C, T, E>, if T comes from a masked load which is conditional on C, then we could reuse the mask used in the load in the vec_cond_expr? The patch maintains a hash_map cond_to_vec_mask from <cond, loop_mask> to vec_mask (with loop predicate applied). In prepare_load_store_mask, we record <cond, loop_mask> -> vec_mask & loop_mask, and in vectorizable_condition, we check if <cond, loop_mask> exists in cond_to_vec_mask and, if found, the corresponding vec_mask is used as the 1st operand of the vec_cond_expr. <cond, loop_mask> is represented with cond_vmask_key, and the patch adds tree_cond_ops to represent the condition operator and operands coming either from a cond_expr or a gimple comparison stmt. If the stmt is not a comparison, it returns <NE_EXPR, lhs, 0> and inserts that into cond_to_vec_mask. With the patch, the redundant p1 is eliminated and sel uses p0 for the above test. For the following test: void f2 (int *restrict x, int *restrict y, int *restrict z, int fallback) { for (int i = 0; i < 100; ++i) x[i] = y[i] ? z[i] : fallback; } the input to the vectorizer has operands swapped in the cond_expr: _36 = _4 != 0; iftmp.0_14 = .MASK_LOAD (_5, 32B, _36); iftmp.0_8 = _4 == 0 ? fallback_12(D) : iftmp.0_14; So we need to check for the inverted condition in cond_to_vec_mask, and swap the operands. Does the patch look OK so far? One major issue remaining with the patch is value numbering. Currently, it does value numbering for the entire function using sccvn during the start of the vect pass, which is too expensive since we only need block-based VN. I am looking into that. Thanks, Prathamesh diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index b0cbbac0cb5..bf54f80dd8b 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -8608,6 +8608,7 @@ vect_transform_loop (loop_vec_info loop_vinfo) { basic_block bb = bbs[i]; stmt_vec_info stmt_info; + loop_vinfo->cond_to_vec_mask = new cond_vmask_map_type (8); for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si)) @@ -8717,6 +8718,9 @@ vect_transform_loop (loop_vec_info loop_vinfo) } } } + + delete loop_vinfo->cond_to_vec_mask; + loop_vinfo->cond_to_vec_mask = 0; } /* BBs in loop */ /* The vectorization factor is always > 1, so if we use an IV increment of 1.
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index 1e2dfe5d22d..862206b3256 100644 --- a/gcc/tree-vect-stmts.c +++ b/gcc/tree-vect-stmts.c @@ -1989,17 +1989,31 @@ check_load_store_masking (loop_vec_info loop_vinfo, tree vectype, static tree prepare_load_store_mask (tree mask_type, tree loop_mask, tree vec_mask, - gimple_stmt_iterator *gsi) + gimple_stmt_iterator *gsi, tree mask, + cond_vmask_map_type *cond_to_vec_mask) { gcc_assert (useless_type_conversion_p (mask_type, TREE_TYPE (vec_mask))); if (!loop_mask) return vec_mask; gcc_assert (TREE_TYPE (loop_mask) == mask_type); + + tree *slot = 0; + if (cond_to_vec_mask) +{ + cond_vmask_key cond (mask, loop_mask); + slot = &cond_to_vec_mask->get_or_insert (cond); + if (*slot) + return *slot; +} + tree and_res = make_temp_ssa_name (mask_type, NULL, "vec_mask_and"); gimple *and_stmt = gimple_build_assign (and_res, BIT_AND_EXPR, vec_mask, loop_mask); gsi_insert_before (gsi, and_stmt, GSI_SAME_STMT); + + if (slot) +*slot = and_res; return and_res; } @@ -3514,8 +3528,10 @@ vectorizable_call (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, gcc_assert (ncopies == 1); tree mask = vect_get_loop_mask (gsi, masks, vec_num, vectype_out, i); + tree scalar_mask = gimple_call_arg (gsi_stmt (*gsi), mask_opno); vargs[mask_opno] = prepare_load_store_mask - (TREE_TYPE (mask), mask, vargs[mask_opno], gsi); + (TREE_TYPE (mask), mask, vargs[mask_opno], gsi, + scalar_mask, vinfo->cond_to_vec_mask); } gcall *call; @@ -3564,9 +3580,11 @@ vectorizable_call (stmt_vec_info stmt_info, gimple_stmt_ite
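The caching step of the patch can be abstracted into a short sketch (simplified: the real code keys a cond_vmask_map_type on cond_vmask_key pairs of <cond, loop_mask>, not on a bare tree; GCC's internal hash_map and GIMPLE APIs are assumed):

static tree
get_or_make_vec_mask (hash_map<tree, tree> &cache, tree cond,
		      tree loop_mask, tree vec_mask, tree mask_type,
		      gimple_stmt_iterator *gsi)
{
  bool existed;
  tree &slot = cache.get_or_insert (cond, &existed);
  /* Reuse the mask already computed for this condition: this is what
     lets the VEC_COND_EXPR pick up the MASK_LOAD's vec_mask_and_N.  */
  if (existed)
    return slot;
  tree and_res = make_temp_ssa_name (mask_type, NULL, "vec_mask_and");
  gimple *and_stmt = gimple_build_assign (and_res, BIT_AND_EXPR,
					  vec_mask, loop_mask);
  gsi_insert_before (gsi, and_stmt, GSI_SAME_STMT);
  slot = and_res;
  return and_res;
}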
Re: [PATCH 2/3] C++20 constexpr lib part 2/3 - swappish functions.
On 8/13/19 7:14 AM, Jonathan Wakely wrote: On 01/08/19 13:16 -0400, Ed Smith-Rowland via libstdc++ wrote: Greetings, Here is a patch for C++20 p0879 - Constexpr for swap and swap related functions. This essentially constexprifies the rest of <algorithm>. Built and tested with C++20 (and pre-c++20) on x86_64-linux. Ok? Regards, Ed Smith-Rowland 2019-08-01  Edward Smith-Rowland <3dw...@verizon.net> Implement C++20 p0879 - Constexpr for swap and swap related functions. * include/bits/algorithmfwd.h (__cpp_lib_constexpr_swap_algorithms): New macro. (iter_swap, make_heap, next_permutation, partial_sort_copy, There should be a newline after "New macro." and before the next parenthesized list of identifiers. The parenthesized lists should not span multiple lines, so close and reopen the parens, i.e. Implement C++20 p0879 - Constexpr for swap and swap related functions. * include/bits/algorithmfwd.h (__cpp_lib_constexpr_swap_algorithms): New macro. (iter_swap, make_heap, next_permutation, partial_sort_copy, pop_heap) (prev_permutation, push_heap, reverse, rotate, sort_heap, swap) (swap_ranges, nth_element, partial_sort, sort): Add constexpr. @@ -193,6 +193,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #if __cplusplus > 201703L # define __cpp_lib_constexpr_algorithms 201711L +# define __cpp_lib_constexpr_swap_algorithms 201712L Should this value be 201806L? Indeed. The new macro also needs to be added to <version>. Done. Is this OK after it passes testing? Ed 2019-08-14  Edward Smith-Rowland <3dw...@verizon.net> Implement C++20 p0879 - Constexpr for swap and swap related functions. * include/bits/algorithmfwd.h (__cpp_lib_constexpr_swap_algorithms): New macro. * include/std/version: Ditto. (iter_swap, make_heap, next_permutation, partial_sort_copy, pop_heap) (prev_permutation, push_heap, reverse, rotate, sort_heap, swap) (swap_ranges, nth_element, partial_sort, sort): Add constexpr. * include/bits/move.h (swap): Add constexpr. * include/bits/stl_algo.h (__move_median_to_first, __reverse, reverse) (__gcd, __rotate, rotate, __partition, __heap_select) (__partial_sort_copy, partial_sort_copy, __unguarded_partition) (__unguarded_partition_pivot, __partial_sort, __introsort_loop, __sort) (__introselect, __chunk_insertion_sort, next_permutation) (prev_permutation, partition, partial_sort, nth_element, sort) (__iter_swap::iter_swap, iter_swap, swap_ranges): Add constexpr. * include/bits/stl_algobase.h (__iter_swap::iter_swap, iter_swap) (swap_ranges): Add constexpr. * include/bits/stl_heap.h (__push_heap, push_heap, __adjust_heap, __pop_heap, pop_heap, __make_heap, make_heap, __sort_heap, sort_heap): Add constexpr. * include/std/type_traits (swap): Add constexpr. * testsuite/25_algorithms/headers/algorithm/synopsis.cc: Add constexpr. * testsuite/25_algorithms/iter_swap/constexpr.cc: New test. * testsuite/25_algorithms/make_heap/constexpr.cc: New test. * testsuite/25_algorithms/next_permutation/constexpr.cc: New test. * testsuite/25_algorithms/nth_element/constexpr.cc: New test. * testsuite/25_algorithms/partial_sort/constexpr.cc: New test. * testsuite/25_algorithms/partial_sort_copy/constexpr.cc: New test. * testsuite/25_algorithms/partition/constexpr.cc: New test. * testsuite/25_algorithms/pop_heap/constexpr.cc: New test. * testsuite/25_algorithms/prev_permutation/constexpr.cc: New test. * testsuite/25_algorithms/push_heap/constexpr.cc: New test. * testsuite/25_algorithms/reverse/constexpr.cc: New test. * testsuite/25_algorithms/rotate/constexpr.cc: New test. * testsuite/25_algorithms/sort/constexpr.cc: New test.
* testsuite/25_algorithms/sort_heap/constexpr.cc: New test. * testsuite/25_algorithms/swap/constexpr.cc: New test. * testsuite/25_algorithms/swap_ranges/constexpr.cc: New test. Index: include/bits/algorithmfwd.h === --- include/bits/algorithmfwd.h (revision 274411) +++ include/bits/algorithmfwd.h (working copy) @@ -193,6 +193,7 @@ #if __cplusplus > 201703L # define __cpp_lib_constexpr_algorithms 201711L +# define __cpp_lib_constexpr_swap_algorithms 201806L #endif #if __cplusplus >= 201103L @@ -377,6 +378,7 @@ #endif template +_GLIBCXX20_CONSTEXPR void iter_swap(_FIter1, _FIter2); @@ -391,10 +393,12 @@ lower_bound(_FIter, _FIter, const _Tp&, _Compare); template +_GLIBCXX20_CONSTEXPR void make_heap(_RAIter, _RAIter); template +_GLIBCXX20_CONSTEXPR void make_heap(_RAIter, _RAIter, _Compare); @@ -478,10 +482,12 @@ // mismatch template +_GLIBCXX2
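For reference, the new constexpr.cc tests presumably follow the usual libstdc++ pattern of exercising the algorithm inside a constant expression; a minimal sketch of the assumed shape (not copied from the patch; needs -std=gnu++2a):

  #include <utility>

  constexpr bool
  test_swap ()
  {
    int a = 1, b = 2;
    std::swap (a, b); // usable in constant evaluation as of P0879
    return a == 2 && b == 1;
  }
  static_assert (test_swap (), "constexpr std::swap");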
Re: [PATCH 1/2] PR c++/91436 fix C++ dialect for std::make_unique fix-it hint
On 14/08/19 10:39 -0400, David Malcolm wrote: On Wed, 2019-08-14 at 12:02 +0100, Jonathan Wakely wrote: On 13/08/19 16:07 -0400, Jason Merrill wrote: > On 8/13/19 9:32 AM, Jonathan Wakely wrote: > > * g++.dg/lookup/missing-std-include-6.C: Don't check > > make_unique in > > test that runs for C++11. > > I'm not comfortable removing this test coverage entirely. Doesn't > it > give a useful diagnostic in C++11 mode as well? It does: mu.cc:3:15: error: 'make_unique' is not a member of 'std' 3 | auto p = std::make_unique<int>(); | ^~~ mu.cc:3:15: note: 'std::make_unique' is only available from C++14 onwards mu.cc:3:27: error: expected primary-expression before 'int' 3 | auto p = std::make_unique<int>(); | ^~~ So we can add it to g++.dg/lookup/missing-std-include-8.C instead, which runs for c++98_only and checks for the "is only available for" cases. Here's a patch doing that. FWIW this eliminates the testing that when we do have C++14 onwards, that including <memory> is suggested. Do we really care? Are we testing that *every* entry in the array gives the right answer for both missing-header and bad-std-option, or are we just testing a subset of them to be sure the logic works as expected? Because if we're testing every entry then: 1) we're missing LOTS of tests, and 2) we're just as likely to test the wrong thing and not actually catch bugs (as was already happening for both make_unique and complex_literals). Maybe we need a C++14-onwards missing-std-include-* test, and to move the existing test there? (and to add the new test for before-C++-14) We could, but is it worth it?
RE: [PATCH] Add generic support for "noinit" attribute
Hi Christoph, The noinit testcase is currently failing on x86_64. Is the test supposed to be running there? Thanks, Tamar -Original Message- From: gcc-patches-ow...@gcc.gnu.org On Behalf Of Christophe Lyon Sent: Wednesday, August 14, 2019 2:18 PM To: Christophe Lyon ; Martin Sebor ; gcc Patches ; Richard Earnshaw ; ni...@redhat.com; Jozef Lawrynowicz ; Richard Sandiford Subject: Re: [PATCH] Add generic support for "noinit" attribute On Wed, 14 Aug 2019 at 14:14, Richard Sandiford wrote: > > Sorry for the slow response, I'd missed that there was an updated patch... > > Christophe Lyon writes: > > 2019-07-04 Christophe Lyon > > > > * lib/target-supports.exp (check_effective_target_noinit): New > > proc. > > * gcc.c-torture/execute/noinit-attribute.c: New test. > > Second line should be indented by tabs rather than spaces. > > > @@ -2224,6 +2234,54 @@ handle_weak_attribute (tree *node, tree name, > >return NULL_TREE; > > } > > > > +/* Handle a "noinit" attribute; arguments as in struct > > + attribute_spec.handler. Check whether the attribute is allowed > > + here and add the attribute to the variable decl tree or otherwise > > + issue a diagnostic. This function checks NODE is of the expected > > + type and issues diagnostics otherwise using NAME. If it is not of > > + the expected type *NO_ADD_ATTRS will be set to true. */ > > + > > +static tree > > +handle_noinit_attribute (tree * node, > > + tree name, > > + tree args, > > + intflags ATTRIBUTE_UNUSED, > > + bool *no_add_attrs) > > +{ > > + const char *message = NULL; > > + > > + gcc_assert (DECL_P (*node)); > > + gcc_assert (args == NULL); > > + > > + if (TREE_CODE (*node) != VAR_DECL) > > +message = G_("%qE attribute only applies to variables"); > > + > > + /* Check that it's possible for the variable to have a section. > > + */ else if ((TREE_STATIC (*node) || DECL_EXTERNAL (*node) || in_lto_p) > > +&& DECL_SECTION_NAME (*node)) > > +message = G_("%qE attribute cannot be applied to variables " > > + "with specific sections"); > > + > > + if (!targetm.have_switchable_bss_sections) > > +message = G_("%qE attribute is specific to ELF targets"); > > Maybe make this an else if too? Or make the VAR_DECL an else if if > you think the ELF one should win. Either way, it seems odd to have > the mixture between else if and not. > Right, I changed this into an else if. > > + if (message) > > +{ > > + warning (OPT_Wattributes, message, name); > > + *no_add_attrs = true; > > +} > > + else > > + /* If this var is thought to be common, then change this. Common > > + variables are assigned to sections before the backend has a > > + chance to process them. Do this only if the attribute is > > + valid. */ > > Comment should be indented two spaces more. > > > +if (DECL_COMMON (*node)) > > + DECL_COMMON (*node) = 0; > > + > > + return NULL_TREE; > > +} > > + > > + > > /* Handle a "noplt" attribute; arguments as in > > struct attribute_spec.handler. */ > > > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index > > f2619e1..f1af1dc 100644 > > --- a/gcc/doc/extend.texi > > +++ b/gcc/doc/extend.texi > > @@ -7129,6 +7129,14 @@ The @code{visibility} attribute is described > > in The @code{weak} attribute is described in @ref{Common Function > > Attributes}. > > > > +@item noinit > > +@cindex @code{noinit} variable attribute Any data with the > > +@code{noinit} attribute will not be initialized by the C runtime > > +startup code, or the program loader. Not initializing data in this > > +way can reduce program startup times. 
Specific to ELF targets, > > +this attribute relies on the linker to place such data in the right > > +location. > > Maybe: > >This attribute is specific to ELF targets and relies on the linker to >place such data in the right location. > Thanks, I thought I had chosen a nice turn of phrase :-) > > diff --git a/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c > > b/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c > > new file mode 100644 > > index 000..ffcf8c6 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c > > @@ -0,0 +1,59 @@ > > +/* { dg-do run } */ > > +/* { dg-require-effective-target noinit */ > > +/* { dg-options "-O2" } */ > > + > > +/* This test checks that noinit data is handled correctly. */ > > + > > +extern void _start (void) __attribute__ ((noreturn)); extern void > > +abort (void) __attribute__ ((noreturn)); extern void exit (int) > > +__attribute__ ((noreturn)); > > + > > +int var_common; > > +int var_zero = 0; > > +int var_one = 1; > > +int __attribute__((noinit)) var_noinit; int var_init = 2; > > + > > +int __attribute__((noinit)) func(); /* { dg-warning "attribute only > > +applies to variables" } */ int __attribute__((sectio
Re: [PATCH 2/3] C++20 constexpr lib part 2/3 - swappish functions.
On 14/08/19 11:06 -0400, Ed Smith-Rowland wrote: Is this OK after it passes testing? Ed 2019-08-14  Edward Smith-Rowland <3dw...@verizon.net> Implement C++20 p0879 - Constexpr for swap and swap related functions. * include/bits/algorithmfwd.h (__cpp_lib_constexpr_swap_algorithms): New macro. * include/std/version: Ditto. It looks like this line was inserted in the wrong place, as the lines that follow it are not part of <version>. The entry for include/std/version should be after include/std/type_traits. (iter_swap, make_heap, next_permutation, partial_sort_copy, pop_heap) (prev_permutation, push_heap, reverse, rotate, sort_heap, swap) (swap_ranges, nth_element, partial_sort, sort): Add constexpr. * include/bits/move.h (swap): Add constexpr. * include/bits/stl_algo.h (__move_median_to_first, __reverse, reverse) (__gcd, __rotate, rotate, __partition, __heap_select) (__partial_sort_copy, partial_sort_copy, __unguarded_partition) (__unguarded_partition_pivot, __partial_sort, __introsort_loop, __sort) (__introselect, __chunk_insertion_sort, next_permutation) (prev_permutation, partition, partial_sort, nth_element, sort) (__iter_swap::iter_swap, iter_swap, swap_ranges): Add constexpr. * include/bits/stl_algobase.h (__iter_swap::iter_swap, iter_swap) (swap_ranges): Add constexpr. * include/bits/stl_heap.h (__push_heap, push_heap, __adjust_heap, __pop_heap, pop_heap, __make_heap, make_heap, __sort_heap, sort_heap): Add constexpr. * include/std/type_traits (swap): Add constexpr. i.e. here. OK for trunk with that change if testing passes. Thanks!
Re: [PATCH] fold more string comparison with known result (PR 90879)
On 8/12/19 7:40 AM, Michael Matz wrote: Hi, On Fri, 9 Aug 2019, Martin Sebor wrote: The solution introduced in C99 is a flexible array. C++ compilers usually support it as well. Those that don't are likely to support the zero-length array (even Visual C++ does). If there's a chance that some don't support either do you really think it's safe to assume they will do something sane with the [1] hack? As the [1] "hack" is the traditional pre-C99 (and C++) idiom to implement flexible trailing char arrays, yes, I do expect all existing (and not any more existing) compilers to do the obvious and sane thing with it. IOW: it's more portable in practice than our documented zero-length extension. And that's what matters for the things compiled by the host compiler. Without requiring C99 (which would be a different discussion) and a non-existing C++ standard we can't write this code (in this form) in a standard conforming way, no matter what we wish for. Hence it seems prudent to use the most portable variant of all the non-standard ways, the trailing [1] array. There are a few reasons why these legacy C idioms should be replaced with better/newer/safer alternatives. First, with two C revisions since C99 and with support for superior alternatives widely available, pre-C99 idioms have less and less relevance. Second, since most of GCC requires a C++98 compiler to compile, ancient C code needs to adjust to the more strict C++ requirements. As C++ evolves, dependencies on legacy extensions like this one make it increasingly difficult to upgrade to newer revisions of the standard. C++ 11 already requires compilers to reject undefined behavior in constexpr contexts, including accesses to arrays outside of their bounds. Once GCC adopts C++ 11 it won't be able to make use of constexpr with code that relies on the hack. Third, the safest and most secure approach to dealing with past- the-end accesses is to diagnose and prevent them. Accommodating code that disregards the array bounds compromises this goal. This is evident from the gaps in _FORTIFY_SOURCE and -Wstringop-overflow that other compilers like Clang and ICC don't suffer from(*). It's in everyone's best interest to proactively drive them to extinction and replace them by safer alternatives that let compilers distinguish the intentional accesses from accidental ones. It not only makes it easier to find bugs but also emit more efficient object code. Martin PS Unlike GCC, both Clang and ICC diagnose past-the-end accesses to trailing arrays with more than one element. They do recognize the struct hack even in C++ and, outside constexpr contexts, avoid diagnosing past-the-end accesses to trailing one-element arrays. This isn't so much an issue today because neither allows statically initializing struct objects with such arrays to more elements than the bound specifies. But it will likely change when the C++ proposal for constexpr functions to use new expressions is adopted (P0784R1).
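For readers less familiar with the idioms being debated, the three trailing-array variants look like this (illustrative only):

  #include <stdlib.h>

  struct hack { int len; char data[1]; }; /* pre-C99 "struct hack" */
  struct zlen { int len; char data[0]; }; /* GNU zero-length extension */
  struct flex { int len; char data[]; };  /* C99 flexible array member */

  /* All three are over-allocated the same way; what differs is the
     declared bound, and therefore what diagnostics and hardening such
     as -Wstringop-overflow and _FORTIFY_SOURCE can reason about.  */
  struct flex *p = (struct flex *) malloc (sizeof (struct flex) + 8);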
Re: [PATCH 0/3] Libsanitizer: merge from trunk
On 8/14/19 2:50 AM, Martin Liška wrote: > On 8/13/19 5:02 PM, Jeff Law wrote: >> On 8/13/19 7:07 AM, Martin Liska wrote: >>> Hi. >>> >>> For this year, I decided to make a first merge now and the >>> next (much smaller) at the end of October. >>> >>> The biggest change is rename of many files from .cc to .cpp. >>> >>> I bootstrapped the patch set on x86_64-linux-gnu and run >>> asan/ubsan/tsan tests on x86_64, ppc64le (power8) and >>> aarch64. >>> >>> Libasan SONAME has been already bumped compared to GCC 9. >>> >>> For other libraries, I don't see a reason for library bumping: >>> >>> $ abidiff /usr/lib64/libubsan.so.1.0.0 >>> ./x86_64-pc-linux-gnu/libsanitizer/ubsan/.libs/libubsan.so.1.0.0 --stat >>> Functions changes summary: 0 Removed, 0 Changed, 4 Added functions >>> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable >>> Function symbols changes summary: 3 Removed, 0 Added function symbols not >>> referenced by debug info >>> Variable symbols changes summary: 0 Removed, 0 Added variable symbol not >>> referenced by debug info >>> >>> $ abidiff /usr/lib64/libtsan.so.0.0.0 >>> ./x86_64-pc-linux-gnu/libsanitizer/tsan/.libs/libtsan.so.0.0.0 --stat >>> Functions changes summary: 0 Removed, 0 Changed, 47 Added functions >>> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable >>> Function symbols changes summary: 1 Removed, 2 Added function symbols not >>> referenced by debug info >>> Variable symbols changes summary: 0 Removed, 0 Added variable symbol not >>> referenced by debug info >>> >>> Ready to be installed? >> ISTM that a sanitizer merge during stage1 should be able to move forward >> without ACKs. Similarly for other runtimes where we pull from some >> upstream master. > > Good then. I've just installed the patch and also the refresh of > LOCAL_PATCHES. Sounds good. My tester will spin them on a variety of platforms over the next couple days. I won't be at all surprised if the MIPS bits are still flakey. Jeff
Re: [PATCH] fix and improve strlen conditional handling of merged stores (PR 91183, 91294, 91315)
On 8/12/19 1:57 PM, Jeff Law wrote: On 8/9/19 5:42 PM, Martin Sebor wrote: @@ -3408,7 +3457,13 @@ static bool } gimple *stmt = SSA_NAME_DEF_STMT (exp); - if (gimple_code (stmt) != GIMPLE_PHI) + if (gimple_assign_single_p (stmt)) + { + tree rhs = gimple_assign_rhs1 (stmt); + return count_nonzero_bytes (rhs, offset, nbytes, lenrange, nulterm, + allnul, allnonnul, snlim); + } + else if (gimple_code (stmt) != GIMPLE_PHI) return false; What cases are you handling here? Are there any cases where a single operand expression on the RHS affects the result. For example, if we've got a NOP_EXPR which zero extends RHS? Does that change the nonzero bytes in a way that is significant? I'm not opposed to handling single operand objects here, I'm just concerned that we're being very lenient in just stripping away the operator and looking at the underlying object. I remember adding the code because of a test failure but not the specifics anymore. No tests fail with it removed so it may not be needed. As you know, I've been juggling a few enhancements in this area and copying code between them as I need it so it's possible that I copied too much, or that some other change has obviated it, or also that the test failed somewhere else and I forgot to copy the test along with the code I'll remove it until it's needed. Let's pull it for now. If we come across the need again, we can obviously revisit with a testcase. @@ -3795,7 +3824,14 @@ handle_store (gimple_stmt_iterator *gsi) } else si->nonzero_chars = build_int_cst (size_type_node, offset); - si->full_string_p = full_string_p; + + /* Set FULL_STRING_P only if the length of the strings being + written is the same, and clear it if the strings have + different lengths. In the latter case the length stored + in si->NONZERO_CHARS becomes the lower bound. + FIXME: Handle the upper bound of the length if possible. */ + si->full_string_p = full_string_p && lenrange[0] == lenrange[1]; So there seems to be a disconnect between the comment and the code. The comment indicates you care about the lengths of the two strings being the same. But what you're really comparing when the lenrange[0] == lenrange[1] test is that the min and max of RHS are the same. The comment tries to make clear that all the arrays on the RHS of the assignment must have the same length in order to set FULL_STRING_P. Like here where LENRANGE = { 4, 4, 4 }: void f (char *s) { if (__builtin_strlen (s) != 2) return; *(int*)a = i ? 0x : 0x; } but not here where LENRANGE = { 1, 4, 4 }: *(int*)a = i < 0 ? 0x : i ? 0x0022 : 0x3300; If the bounds of the range of lengths of all the strings on the RHS are the same they're all the same length. I'm open to phrasing it better. Oh, I think I see what I was missing. In the case where RHS is a conditional (or perhaps a SSA_NAME which was set from a PHI) LENRANGE will have the min/max/# bytes for the RHS was a whole, not just a single component of the RHS. It generally looks reasonable, so I think we just need to reach a conclusion on the gimple_assign_single_p cases we're trying to handle and the possible mismatch between the comment and the code. Do you want me to post another revision with the gimple_assign_single_p test removed? I think remove that hunk, bootstrap, test, commit and post for archival purposes. I do not think another round of review is necessary. Done in r274486 (also attached). 
Martin Index: gcc/tree-ssa-strlen.c === --- gcc/tree-ssa-strlen.c (revision 274485) +++ gcc/tree-ssa-strlen.c (revision 274486) @@ -1195,14 +1195,13 @@ adjust_last_stmt (strinfo *si, gimple *stmt, bool to constants. */ tree -set_strlen_range (tree lhs, wide_int max, tree bound /* = NULL_TREE */) +set_strlen_range (tree lhs, wide_int min, wide_int max, + tree bound /* = NULL_TREE */) { if (TREE_CODE (lhs) != SSA_NAME || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))) return NULL_TREE; - wide_int min = wi::zero (max.get_precision ()); - if (bound) { /* For strnlen, adjust MIN and MAX as necessary. If the bound @@ -1312,7 +1311,8 @@ maybe_set_strlen_range (tree lhs, tree src, tree b } } - return set_strlen_range (lhs, max, bound); + wide_int min = wi::zero (max.get_precision ()); + return set_strlen_range (lhs, min, max, bound); } /* Handle a strlen call. If strlen of the argument is known, replace @@ -1434,6 +1434,12 @@ handle_builtin_strlen (gimple_stmt_iterator *gsi) tree adj = fold_build2_loc (loc, MINUS_EXPR, TREE_TYPE (lhs), lhs, old); adjust_related_strinfos (loc, si, adj); + /* Use the constant minimim length as the lower bound + of the non-constant length. */ + wide_int min =
Re: [PATCH] fix and improve strlen conditional handling of merged stores (PR 91183, 91294, 91315)
Do you want me to post another revision with the gimple_assign_single_p test removed? I think remove that hunk, bootstrap, test, commit and post for archival purposes. I do not think another round of review is necessary. Done in r274486 (also attached). I should add that the early store merging on loosely aligned targets makes the strlen tests prone to failures on strictly aligned targets (or even ILP32 targets). The handle_store function can now deal with all sorts of MEM_REF assignments that result from the store merging, but because the handle_builtin_memcpy function lacks the same support, the tests that expect the assignments to be folded fail when they are not merged. For example, the strlen call below is folded on i386: const char a4[32] = "0123"; const char b4[32] = "3210"; void f (int i) { char a[32]; memcpy (a, i ? a4 + 1 : b4, 8); // copy just 8 bytes if (strlen (a) < 3) abort (); } but the equivalent call below is not: void g (int i) { char a[32]; memcpy (a, i ? a4 + 1 : b4, 16); // copy 16 bytes if (strlen (a) < 3) abort (); } This pattern may not be very common in the wild but having the pass behave consistently without these target dependencies would be helpful in avoiding these test failures. I will try to remember to extend the same enhancement as in handle_store to handle_builtin_memcpy (ideally by factoring the code out of handle_store into a helper and letting both functions call it to do the folding). Martin
Re: [SVE] PR86753
On Wed, Aug 14, 2019 at 5:06 PM Prathamesh Kulkarni wrote: > > Hi, > The attached patch tries to fix PR86753. > > For following test: > void > f1 (int *restrict x, int *restrict y, int *restrict z) > { > for (int i = 0; i < 100; ++i) > x[i] = y[i] ? z[i] : 10; > } > > vect dump shows: > vect_cst__42 = { 0, ... }; > vect_cst__48 = { 0, ... }; > > vect__4.7_41 = .MASK_LOAD (vectp_y.5_38, 4B, loop_mask_40); > _4 = *_3; > _5 = z_12(D) + _2; > mask__35.8_43 = vect__4.7_41 != vect_cst__42; > _35 = _4 != 0; > vec_mask_and_46 = mask__35.8_43 & loop_mask_40; > vect_iftmp.11_47 = .MASK_LOAD (vectp_z.9_44, 4B, vec_mask_and_46); > iftmp.0_13 = 0; > vect_iftmp.12_50 = VEC_COND_EXPR vect_iftmp.11_47, vect_cst__49>; > > and following code-gen: > L2: > ld1wz0.s, p2/z, [x1, x3, lsl 2] > cmpne p1.s, p3/z, z0.s, #0 > cmpne p0.s, p2/z, z0.s, #0 > ld1wz0.s, p0/z, [x2, x3, lsl 2] > sel z0.s, p1, z0.s, z1.s > > We could reuse vec_mask_and_46 in vec_cond_expr since the conditions > vect__4.7_41 != vect_cst__48 and vect__4.7_41 != vect_cst__42 > are equivalent, and vect_iftmp.11_47 depends on vect__4.7_41 != vect_cst__48. > > I suppose in general for vec_cond_expr if T comes from masked load, > which is conditional on C, then we could reuse the mask used in load, > in vec_cond_expr ? > > The patch maintains a hash_map cond_to_vec_mask > from vec_mask (with loop predicate applied). > In prepare_load_store_mask, we record -> vec_mask & > loop_mask, > and in vectorizable_condition, we check if exists in > cond_to_vec_mask > and if found, the corresponding vec_mask is used as 1st operand of > vec_cond_expr. > > is represented with cond_vmask_key, and the patch > adds tree_cond_ops to represent condition operator and operands coming > either from cond_expr > or a gimple comparison stmt. If the stmt is not comparison, it returns > and inserts that into cond_to_vec_mask. > > With patch, the redundant p1 is eliminated and sel uses p0 for above test. > > For following test: > void > f2 (int *restrict x, int *restrict y, int *restrict z, int fallback) > { > for (int i = 0; i < 100; ++i) > x[i] = y[i] ? z[i] : fallback; > } > > input to vectorizer has operands swapped in cond_expr: > _36 = _4 != 0; > iftmp.0_14 = .MASK_LOAD (_5, 32B, _36); > iftmp.0_8 = _4 == 0 ? fallback_12(D) : iftmp.0_14; > > So we need to check for inverted condition in cond_to_vec_mask, > and swap the operands. > Does the patch look OK so far ? > > One major issue remaining with the patch is value numbering. > Currently, it does value numbering for entire function using sccvn > during start of vect pass, which is too expensive since we only need > block based VN. I am looking into that. Why do you need it at all? We run VN on the if-converted loop bodies btw. Richard. > > Thanks, > Prathamesh
Re: [SVE] PR86753
On Wed, Aug 14, 2019 at 6:49 PM Richard Biener wrote: > > On Wed, Aug 14, 2019 at 5:06 PM Prathamesh Kulkarni > wrote: > > > > Hi, > > The attached patch tries to fix PR86753. > > > > For following test: > > void > > f1 (int *restrict x, int *restrict y, int *restrict z) > > { > > for (int i = 0; i < 100; ++i) > > x[i] = y[i] ? z[i] : 10; > > } > > > > vect dump shows: > > vect_cst__42 = { 0, ... }; > > vect_cst__48 = { 0, ... }; > > > > vect__4.7_41 = .MASK_LOAD (vectp_y.5_38, 4B, loop_mask_40); > > _4 = *_3; > > _5 = z_12(D) + _2; > > mask__35.8_43 = vect__4.7_41 != vect_cst__42; > > _35 = _4 != 0; > > vec_mask_and_46 = mask__35.8_43 & loop_mask_40; > > vect_iftmp.11_47 = .MASK_LOAD (vectp_z.9_44, 4B, vec_mask_and_46); > > iftmp.0_13 = 0; > > vect_iftmp.12_50 = VEC_COND_EXPR > vect_iftmp.11_47, vect_cst__49>; > > > > and following code-gen: > > L2: > > ld1wz0.s, p2/z, [x1, x3, lsl 2] > > cmpne p1.s, p3/z, z0.s, #0 > > cmpne p0.s, p2/z, z0.s, #0 > > ld1wz0.s, p0/z, [x2, x3, lsl 2] > > sel z0.s, p1, z0.s, z1.s > > > > We could reuse vec_mask_and_46 in vec_cond_expr since the conditions > > vect__4.7_41 != vect_cst__48 and vect__4.7_41 != vect_cst__42 > > are equivalent, and vect_iftmp.11_47 depends on vect__4.7_41 != > > vect_cst__48. > > > > I suppose in general for vec_cond_expr if T comes from masked > > load, > > which is conditional on C, then we could reuse the mask used in load, > > in vec_cond_expr ? > > > > The patch maintains a hash_map cond_to_vec_mask > > from vec_mask (with loop predicate applied). > > In prepare_load_store_mask, we record -> vec_mask & > > loop_mask, > > and in vectorizable_condition, we check if exists in > > cond_to_vec_mask > > and if found, the corresponding vec_mask is used as 1st operand of > > vec_cond_expr. > > > > is represented with cond_vmask_key, and the patch > > adds tree_cond_ops to represent condition operator and operands coming > > either from cond_expr > > or a gimple comparison stmt. If the stmt is not comparison, it returns > > and inserts that into cond_to_vec_mask. > > > > With patch, the redundant p1 is eliminated and sel uses p0 for above test. > > > > For following test: > > void > > f2 (int *restrict x, int *restrict y, int *restrict z, int fallback) > > { > > for (int i = 0; i < 100; ++i) > > x[i] = y[i] ? z[i] : fallback; > > } > > > > input to vectorizer has operands swapped in cond_expr: > > _36 = _4 != 0; > > iftmp.0_14 = .MASK_LOAD (_5, 32B, _36); > > iftmp.0_8 = _4 == 0 ? fallback_12(D) : iftmp.0_14; > > > > So we need to check for inverted condition in cond_to_vec_mask, > > and swap the operands. > > Does the patch look OK so far ? > > > > One major issue remaining with the patch is value numbering. > > Currently, it does value numbering for entire function using sccvn > > during start of vect pass, which is too expensive since we only need > > block based VN. I am looking into that. > > Why do you need it at all? We run VN on the if-converted loop bodies btw. Also I can't trivially see the equality of the masks and probably so can't VN. Is it that we just don't bother to apply loop_mask to VEC_COND but there's no harm if we do? Richard. > Richard. > > > > > Thanks, > > Prathamesh
Re: types for VR_VARYING
On 8/14/19 8:15 AM, Aldy Hernandez wrote: > > > On 8/14/19 9:50 AM, Andrew MacLeod wrote: >> On 8/13/19 8:39 PM, Aldy Hernandez wrote: >>> >>> >>> Yes, it was 2X. >>> >>> I noticed that Richi made some changes to the lattice handling for >>> VARYING while the discussion was on-going. I missed these, and had >>> failed to adapt the patch for it. I would appreciate a final review >>> of the attached patch, especially the vr-values.c changes, which I >>> have modified to play nice with current trunk. >>> >>> I also noticed that Andrew's patch was setting num_vr_values to >>> num_ssa_names + num_ssa_names / 10. I think he meant num_vr_values + >>> num_vr_values / 10. Please verify the current incantation makes sense. >>> >> no, I meant num_ssa_names. We are resizing the vector because >> num_vr_values is out of date (and smaller than num_ssa_names is now), >> so we need to resize the vector to be at least the number of >> ssa-names... and I added 10% just in case we arent done adding new ones. >> >> >> if num_vr_values is 100, and we've added 200 ssa-names, num_ssa_names >> would now be 300. if you resize based on num_vr_values, you could >> still go off the end of the vector. > > OK, I've changed the resize to allocate 2X as well. So now we'll have: > > + unsigned int old_sz = num_vr_values; > + num_vr_values = num_ssa_names * 2; > + vr_value = XRESIZEVEC (value_range *, vr_value, num_vr_values); > etc > > And the original allocation will also be 2X. I don't think we want the resize to be 2X, we've tried to get away from those kinds of growth patterns. The 10% from Andrew's patch seems like a better choice for the resize. jeff
Re: [PATCH] Make GIMPLE forwprop DCE dead stmts
On 8/14/19 7:36 AM, Richard Biener wrote:
>
> The following patch makes forwprop DCE the stmts that become dead
> because of propagation of copies and constants.  For this to work
> we actually have to do that reliably rather than relying on
> fold_stmt doing this for us.
>
> This hits fortran/trans-intrinsic.c in a way that we do "interesting"
> jump threading exposing a bogus uninit warning.  I'll open a PR
> for this with an (unreduced) testcase after committing.

Feel free to mark it as a regression, if for no other reason than that it
guarantees I'll look at it during stage3/stage4.  I can adjust the marker
at that time based on what I find.

jeff
Re: [PATCH 1/2] Add ::verify for cgraph_node::origin/nested/next_nested.
On 8/14/19 5:15 AM, Martin Liska wrote:
>
> gcc/ChangeLog:
>
> 2019-08-14  Martin Liska
>
> 	* cgraph.c (cgraph_node::verify_node): Verify origin, nested
> 	and next_nested.
> ---
>  gcc/cgraph.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
>
OK.

Jeff
Re: [PATCH 2/2] Clean next_nested properly.
On 8/14/19 5:17 AM, Martin Liska wrote:
>
> gcc/ChangeLog:
>
> 2019-08-14  Martin Liska
>
> 	PR ipa/91438
> 	* cgraph.c (cgraph_node::remove): When setting
> 	n->origin = NULL for all nested functions, reset
> 	also next_nested.
> ---
>  gcc/cgraph.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
OK

jeff
Re: [PATCH] Add generic support for "noinit" attribute
On Wed, 14 Aug 2019 at 17:59, Tamar Christina wrote:
>
> Hi Christoph,
>
> The noinit testcase is currently failing on x86_64.
>
> Is the test supposed to be running there?
>

No, there's an effective-target to skip it.  But I notice a typo:

  +/* { dg-require-effective-target noinit */

(missing closing brace)

Could it explain why it's failing on x86_64 ?

> Thanks,
> Tamar
>
> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org  On Behalf Of Christophe Lyon
> Sent: Wednesday, August 14, 2019 2:18 PM
> To: Christophe Lyon ; Martin Sebor ; gcc Patches ;
> Richard Earnshaw ; ni...@redhat.com; Jozef Lawrynowicz ;
> Richard Sandiford
> Subject: Re: [PATCH] Add generic support for "noinit" attribute
>
> On Wed, 14 Aug 2019 at 14:14, Richard Sandiford wrote:
> >
> > Sorry for the slow response, I'd missed that there was an updated patch...
> >
> > Christophe Lyon writes:
> > > 2019-07-04  Christophe Lyon
> > >
> > >	* lib/target-supports.exp (check_effective_target_noinit): New
> > >	proc.
> > >	* gcc.c-torture/execute/noinit-attribute.c: New test.
> >
> > Second line should be indented by tabs rather than spaces.
> >
> > > @@ -2224,6 +2234,54 @@ handle_weak_attribute (tree *node, tree name,
> > >    return NULL_TREE;
> > >  }
> > >
> > > +/* Handle a "noinit" attribute; arguments as in struct
> > > +   attribute_spec.handler.  Check whether the attribute is allowed
> > > +   here and add the attribute to the variable decl tree or otherwise
> > > +   issue a diagnostic.  This function checks NODE is of the expected
> > > +   type and issues diagnostics otherwise using NAME.  If it is not of
> > > +   the expected type *NO_ADD_ATTRS will be set to true.  */
> > > +
> > > +static tree
> > > +handle_noinit_attribute (tree * node,
> > > +			 tree name,
> > > +			 tree args,
> > > +			 int flags ATTRIBUTE_UNUSED,
> > > +			 bool *no_add_attrs)
> > > +{
> > > +  const char *message = NULL;
> > > +
> > > +  gcc_assert (DECL_P (*node));
> > > +  gcc_assert (args == NULL);
> > > +
> > > +  if (TREE_CODE (*node) != VAR_DECL)
> > > +    message = G_("%qE attribute only applies to variables");
> > > +
> > > +  /* Check that it's possible for the variable to have a section.  */
> > > +  else if ((TREE_STATIC (*node) || DECL_EXTERNAL (*node) || in_lto_p)
> > > +	   && DECL_SECTION_NAME (*node))
> > > +    message = G_("%qE attribute cannot be applied to variables "
> > > +		 "with specific sections");
> > > +
> > > +  if (!targetm.have_switchable_bss_sections)
> > > +    message = G_("%qE attribute is specific to ELF targets");
> >
> > Maybe make this an else if too?  Or make the VAR_DECL an else if if
> > you think the ELF one should win.  Either way, it seems odd to have
> > the mixture between else if and not.
>
> Right, I changed this into an else if.
>
> > > +  if (message)
> > > +    {
> > > +      warning (OPT_Wattributes, message, name);
> > > +      *no_add_attrs = true;
> > > +    }
> > > +  else
> > > +    /* If this var is thought to be common, then change this.  Common
> > > +       variables are assigned to sections before the backend has a
> > > +       chance to process them.  Do this only if the attribute is
> > > +       valid.  */
> >
> > Comment should be indented two spaces more.
> >
> > > +    if (DECL_COMMON (*node))
> > > +      DECL_COMMON (*node) = 0;
> > > +
> > > +  return NULL_TREE;
> > > +}
> > > +
> > > +
> > >  /* Handle a "noplt" attribute; arguments as in
> > >     struct attribute_spec.handler.  */
> > >
> > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > > index f2619e1..f1af1dc 100644
> > > --- a/gcc/doc/extend.texi
> > > +++ b/gcc/doc/extend.texi
> > > @@ -7129,6 +7129,14 @@ The @code{visibility} attribute is described in
> > >  The @code{weak} attribute is described in
> > >  @ref{Common Function Attributes}.
> > >
> > > +@item noinit
> > > +@cindex @code{noinit} variable attribute
> > > +Any data with the @code{noinit} attribute will not be initialized by
> > > +the C runtime startup code, or the program loader.  Not initializing
> > > +data in this way can reduce program startup times.  Specific to ELF
> > > +targets, this attribute relies on the linker to place such data in
> > > +the right location.
> >
> > Maybe:
> >
> >   This attribute is specific to ELF targets and relies on the linker to
> >   place such data in the right location.
>
> Thanks, I thought I had chosen a nice turn of phrase :-)
>
> > > diff --git a/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c
> > > b/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c
> > > new file mode 100644
> > > index 000..ffcf8c6
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c
> > > @@ -0,0 +1,59 @@
> > > +/* { dg-do run } */
> > > +/* { dg-require-effective-target noinit */
> > > +/* { dg-options "-O2" } */
> > > +
> > > +/* This test checks that noinit dat
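As a concrete illustration of the semantics documented in the extend.texi
hunk above, a minimal use of the attribute would look like the sketch below.
The variable name is hypothetical, and whether the value actually survives a
reset depends on the target's linker script and runtime honoring the
attribute.

  #include <stdint.h>

  /* Placed by the linker in a non-initialized section: neither zeroed nor
     copied from a data image at startup.  The value is indeterminate on
     first power-up, so real code must validate it before trusting it.  */
  __attribute__ ((noinit)) uint32_t boot_count;

  int
  main (void)
  {
    boot_count++;  /* e.g. counts warm resets on a supporting target */
    return 0;
  }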
Re: [PATCH] Automatics in equivalence statements
On 8/14/19 2:45 AM, Mark Eggleston wrote:
> I now have commit access.
>
> gcc/fortran
>
> Jeff Law
> Mark Eggleston
>
> 	* gfortran.h: Add gfc_check_conflict declaration.
> 	* symbol.c (check_conflict): Rename to gfc_check_conflict and
> 	remove static.
> 	* symbol.c (gfc_check_conflict): Remove automatic in equivalence
> 	conflict check.
> 	* symbol.c (save_symbol): Add check for in equivalence to stop the
> 	save attribute being added.
> 	* trans-common.c (build_equiv_decl): Add is_auto parameter and
> 	add !is_auto to condition where TREE_STATIC (decl) is set.
> 	* trans-common.c (build_equiv_decl): Add local variable is_auto,
> 	set it true if an automatic attribute is encountered in the
> 	variable list.  Call build_equiv_decl with is_auto as an
> 	additional parameter.
> 	* trans-common.c (accumulate_equivalence_attributes): New
> 	subroutine.
> 	* trans-common.c (find_equivalence): New local variable
> 	dummy_symbol, accumulate equivalence attributes from each symbol
> 	then check for conflicts.
>
> gcc/testsuite
>
> Mark Eggleston
>
> 	* gfortran.dg/auto_in_equiv_1.f90: New test.
> 	* gfortran.dg/auto_in_equiv_2.f90: New test.
> 	* gfortran.dg/auto_in_equiv_3.f90: New test.
>
> OK to commit?
>
> How do I know that I have approval to commit?

Yes, this is OK to commit.  Steve acked it in a private message to me.

Normally you'll get an ACK/OK on the public list.  But private ACKs or
ACKs on IRC also count as approval :-)

jeff
Re: Rewrite some jump.c routines to use flags
On Fri, 12 Jul 2019, Richard Sandiford wrote: > At least AIUI, __builtin_isunordered etc. don't raise an exception even > for signalling NaNs. __builtin_isunordered should raise "invalid" for signaling NaNs. (isunordered is the IEEE 754 operation compareQuietUnordered, and IEEE 754 specifies for comparisons that "Invalid operation is the only exception that a comparison predicate can signal. All predicates signal the invalid operation exception on signaling NaN operands. The predicates named Quiet shall not signal any exception, unless an operand is a signaling NaN. The predicates named Signaling shall signal the invalid operation exception on quiet NaN operands.".) Note that __builtin_isunordered (x, x) is thus not the same as __builtin_isnan (x), because isnan binds to isNaN and isNaN is a non-computational operation for which IEEE 754 specifies "Implementations shall provide the following non-computational operations for all supported arithmetic formats and should provide them for all supported interchange formats. They are never exceptional, even for signaling NaNs.". -- Joseph S. Myers jos...@codesourcery.com
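The distinction Joseph draws can be probed with a small fenv experiment.
This is a sketch, not a conformance test: whether FE_INVALID is actually
raised depends on the target, the libm, and the optimization level (GCC may
fold the comparison at compile time, and it does not fully implement
FENV_ACCESS), so compile at -O0 on a target with signaling-NaN support.

  #include <fenv.h>
  #include <stdio.h>

  int
  main (void)
  {
    volatile double snan = __builtin_nans ("");

    /* compareQuietUnordered: per IEEE 754, all comparison predicates
       signal "invalid" on signaling NaN operands.  */
    feclearexcept (FE_INVALID);
    int unord = __builtin_isunordered (snan, snan);
    printf ("isunordered=%d raised_invalid=%d\n",
            unord, fetestexcept (FE_INVALID) != 0);

    /* isNaN is a non-computational operation: never exceptional,
       even for signaling NaNs.  */
    feclearexcept (FE_INVALID);
    int is_nan = __builtin_isnan (snan);
    printf ("isnan=%d raised_invalid=%d\n",
            is_nan, fetestexcept (FE_INVALID) != 0);
    return 0;
  }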
[PATCH] Simplify and generalize rust-demangle's unescaping logic.
Previously, rust-demangle.c was special-casing a fixed number of '$uXY$'
escapes, but 'XY' can technically be any hex value, representing some
Unicode codepoint.

This patch adds more general support for '$u...$' escapes, similar to
https://github.com/alexcrichton/rustc-demangle/pull/29, but only for the
ASCII subset.  More complete Unicode support may come at a later time, but
right now I want to keep it simple.

Escapes that decode to ASCII control codes are considered invalid, as the
Rust compiler should never emit them, and to avoid any undesirable effects
from accidentally outputting a control code.

Additionally, the switch statements, which had one case for each
alphanumeric character, were replaced with if-else chains.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

2019-08-14  Eduard-Mihai Burtescu

libiberty/ChangeLog:
	* rust-demangle.c (unescape): Remove.
	(parse_lower_hex_nibble): New function.
	(parse_legacy_escape): New function.
	(is_prefixed_hash): Use parse_lower_hex_nibble.
	(looks_like_rust): Use parse_legacy_escape.
	(rust_demangle_sym): Use parse_legacy_escape.
	* testsuite/rust-demangle-expected: Add 'llv$u6d$' test.

diff --git a/libiberty/rust-demangle.c b/libiberty/rust-demangle.c
index 2302db45b6f..da591902db1 100644
--- a/libiberty/rust-demangle.c
+++ b/libiberty/rust-demangle.c
@@ -50,7 +50,7 @@ extern void *memset(void *s, int c, size_t n);
 
 #include "rust-demangle.h"
 
-/* Mangled Rust symbols look like this:
+/* Mangled (legacy) Rust symbols look like this:
      _$LT$std..sys..fd..FileDesc$u20$as$u20$core..ops..Drop$GT$::drop::hc68340e1baa4987a
 
    The original symbol is:
@@ -74,16 +74,7 @@ extern void *memset(void *s, int c, size_t n);
       ">" => $GT$
       "(" => $LP$
       ")" => $RP$
-      " " => $u20$
-      "\"" => $u22$
-      "'" => $u27$
-      "+" => $u2b$
-      ";" => $u3b$
-      "[" => $u5b$
-      "]" => $u5d$
-      "{" => $u7b$
-      "}" => $u7d$
-      "~" => $u7e$
+      "\u{XY}" => $uXY$
 
    A double ".." means "::" and a single "." means "-".
@@ -95,7 +86,8 @@ static const size_t hash_len = 16;
 
 static int is_prefixed_hash (const char *start);
 static int looks_like_rust (const char *sym, size_t len);
-static int unescape (const char **in, char **out, const char *seq, char value);
+static int parse_lower_hex_nibble (char nibble);
+static char parse_legacy_escape (const char **in);
 
 /* INPUT: sym: symbol that has been through C++ (gnu v3) demangling
@@ -149,7 +141,7 @@ is_prefixed_hash (const char *str)
   const char *end;
   char seen[16];
   size_t i;
-  int count;
+  int count, nibble;
 
   if (strncmp (str, hash_prefix, hash_prefix_len))
     return 0;
@@ -157,12 +149,12 @@ is_prefixed_hash (const char *str)
   memset (seen, 0, sizeof(seen));
   for (end = str + hash_len; str < end; str++)
-    if (*str >= '0' && *str <= '9')
-      seen[*str - '0'] = 1;
-    else if (*str >= 'a' && *str <= 'f')
-      seen[*str - 'a' + 10] = 1;
-    else
-      return 0;
+    {
+      nibble = parse_lower_hex_nibble (*str);
+      if (nibble < 0)
+        return 0;
+      seen[nibble] = 1;
+    }
 
   /* Count how many distinct digits seen */
   count = 0;
@@ -179,57 +171,17 @@ looks_like_rust (const char *str, size_t len)
   const char *end = str + len;
 
   while (str < end)
-    switch (*str)
-      {
-      case '$':
-	if (!strncmp (str, "$C$", 3))
-	  str += 3;
-	else if (!strncmp (str, "$SP$", 4)
-		 || !strncmp (str, "$BP$", 4)
-		 || !strncmp (str, "$RF$", 4)
-		 || !strncmp (str, "$LT$", 4)
-		 || !strncmp (str, "$GT$", 4)
-		 || !strncmp (str, "$LP$", 4)
-		 || !strncmp (str, "$RP$", 4))
-	  str += 4;
-	else if (!strncmp (str, "$u20$", 5)
-		 || !strncmp (str, "$u22$", 5)
-		 || !strncmp (str, "$u27$", 5)
-		 || !strncmp (str, "$u2b$", 5)
-		 || !strncmp (str, "$u3b$", 5)
-		 || !strncmp (str, "$u5b$", 5)
-		 || !strncmp (str, "$u5d$", 5)
-		 || !strncmp (str, "$u7b$", 5)
-		 || !strncmp (str, "$u7d$", 5)
-		 || !strncmp (str, "$u7e$", 5))
-	  str += 5;
-	else
-	  return 0;
-	break;
-      case '.':
-	/* Do not allow three or more consecutive dots */
-	if (!strncmp (str, "...", 3))
-	  return 0;
-	/* Fall through */
-      case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
-      case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
-      case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
-      case 's': case 't': case 'u': case 'v': case 'w': case 'x':
-      case 'y': case 'z':
-      case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
-      case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
-      case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
-      case 'S': c
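Since the hunk containing the new helper's body is cut off above, here is a
plausible reading of parse_lower_hex_nibble, inferred from its uses in
is_prefixed_hash (a negative return means "not a lowercase hex digit"); the
actual committed body may differ in detail.

  static int
  parse_lower_hex_nibble (char nibble)
  {
    /* Map '0'..'9' to 0..9 and 'a'..'f' to 10..15; anything else,
       including uppercase hex, is rejected.  */
    if ('0' <= nibble && nibble <= '9')
      return nibble - '0';
    if ('a' <= nibble && nibble <= 'f')
      return 0xa + (nibble - 'a');
    return -1;
  }

With the generalized '$u...$' handling, an escape such as 'llv$u6d$' decodes
0x6d to 'm', giving 'llvm', which matches the new testsuite entry in the
ChangeLog above.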
Re: types for VR_VARYING
On 8/13/19 6:39 PM, Aldy Hernandez wrote:
>
> On 8/12/19 7:46 PM, Jeff Law wrote:
>> On 8/12/19 12:43 PM, Aldy Hernandez wrote:
>>> This is a fresh re-post of:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2019-07/msg6.html
>>>
>>> Andrew gave me some feedback a week ago, and I obviously don't remember
>>> what it was because I was about to leave on PTO.  However, I do remember
>>> I addressed his concerns before getting drunk on rum in tropical islands.
>>>
>> FWIW found a great coffee infused rum while in Kauai last week.  I'm not
>> a coffee fan, but it was wonderful.  The one bottle we brought back
>> isn't going to last until Cauldron and I don't think I can get a special
>> order filled before I leave :(
>
> You must bring some to Cauldron before we believe you. :)

That's the problem.  The nearest place I can get it is in Vegas and
there's no distributor in Montreal.  I can special order it in our state
run stores, but it won't be here in time.  Of course, I don't mind if you
don't believe me.  More for me in that case...

>> Is the supports_type_p stuff there to placate the calls from ipa-cp?  I
>> can live with it in the short term, but it really feels like there
>> should be something in the ipa-cp client that avoids this silliness.
>
> I am not happy with this either, but there are various places where
> statements that are !stmt_interesting_for_vrp() are still setting a
> range of VARYING, which is then being ignored at a later time.
>
> For example, vrp_initialize:
>
>   if (!stmt_interesting_for_vrp (phi))
>     {
>       tree lhs = PHI_RESULT (phi);
>       set_def_to_varying (lhs);
>       prop_set_simulate_again (phi, false);
>     }
>
> Also in evrp_range_analyzer::record_ranges_from_stmt(), where, if the
> statement is interesting for VRP but extract_range_from_stmt() does not
> produce a useful range, we also set a varying for a range we will never
> use.  Similarly for a statement that is not interesting in this hunk.

Ugh.  One could perhaps argue that setting any kind of range in these
circumstances is silly.  But I suspect it's necessary due to the
optimistic handling of VR_UNDEFINED in value_range_base::union_helper.
It's all coming back to me now...

> Then there is vrp_prop::visit_stmt() where we also set VARYING for types
> that VRP will never handle:
>
>   case IFN_ADD_OVERFLOW:
>   case IFN_SUB_OVERFLOW:
>   case IFN_MUL_OVERFLOW:
>   case IFN_ATOMIC_COMPARE_EXCHANGE:
>     /* These internal calls return _Complex integer type,
>        which VRP does not track, but the immediate uses
>        thereof might be interesting.  */
>     if (lhs && TREE_CODE (lhs) == SSA_NAME)
>       {
>         imm_use_iterator iter;
>         use_operand_p use_p;
>         enum ssa_prop_result res = SSA_PROP_VARYING;
>
>         set_def_to_varying (lhs);
>
> I've adjusted the patch so that set_def_to_varying will set the range to
> VR_UNDEFINED if !supports_type_p.  This is a fail safe, as we can't
> really do anything with a nonsensical range.  I just don't want to leave
> the range in an indeterminate state.

I think VR_UNDEFINED is unsafe due to value_range_base::union_helper.
And that's more general than this patch.  VR_UNDEFINED is _not_ a safe
range to set something to if we can't handle it.  We have to use
VR_VARYING.  Why?  See the beginning of value_range_base::union_helper:

  /* VR0 has the resulting range if VR1 is undefined or VR0 is varying.  */
  if (vr1->undefined_p ()
      || vr0->varying_p ())
    return *vr0;

  /* VR1 has the resulting range if VR0 is undefined or VR1 is varying.  */
  if (vr0->undefined_p ()
      || vr1->varying_p ())
    return *vr1;

This can get called for something like

  a = <cond> ? name1 : name2;

If name1 was set to VR_UNDEFINED thinking that VR_UNDEFINED was a safe
value for something we can't handle, then we'll incorrectly return the
range for name2.

VR_UNDEFINED can only be used for the ranges of objects we haven't
processed.  If we can't produce a range for an object because the
statement is something we don't handle or just doesn't produce anything
useful, then the right result is VR_VARYING.

This may be worth commenting at the definition site for VR_*.

> I also noticed that Andrew's patch was setting num_vr_values to
> num_ssa_names + num_ssa_names / 10.  I think he meant num_vr_values +
> num_vr_values / 10.  Please verify the current incantation makes sense.

Going to assume this will be adjusted per the other messages in this
thread.

> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
> index 39ea22f0554..663dd6e2398 100644
> --- a/gcc/tree-ssa-threadedge.c
> +++ b/gcc/tree-ssa-threadedge.c
> @@ -182,8 +182,10 @@ record_temporary_equivalences_from_phis (edge e,
>  	new_vr->deep_copy (vr_values->get_value_range (src));
>        else if (TREE_CODE (src) == INTEGER_CST)
>  	new_vr->set (src);
> +      else if (value_range_base::supports_
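Jeff's point can be modeled in a few lines.  The sketch below mirrors the
union_helper rules quoted above, but it is only a model: the enum and
function mimic GCC's names without being the real value_range API.

  #include <stdio.h>

  enum vr_kind { VR_UNDEFINED, VR_RANGE, VR_VARYING };

  static enum vr_kind
  union_kind (enum vr_kind vr0, enum vr_kind vr1)
  {
    /* VR0 has the resulting range if VR1 is undefined or VR0 is varying.  */
    if (vr1 == VR_UNDEFINED || vr0 == VR_VARYING)
      return vr0;
    /* VR1 has the resulting range if VR0 is undefined or VR1 is varying.  */
    if (vr0 == VR_UNDEFINED || vr1 == VR_VARYING)
      return vr1;
    return VR_RANGE;  /* Both are real ranges; merge them.  */
  }

  int
  main (void)
  {
    /* name1 "unhandled" as UNDEFINED: the PHI result silently claims
       name2's range, dropping name1's unknown contribution.  */
    printf ("%d\n", union_kind (VR_UNDEFINED, VR_RANGE));  /* 1: wrong */

    /* name1 "unhandled" as VARYING: the PHI result is correctly VARYING.  */
    printf ("%d\n", union_kind (VR_VARYING, VR_RANGE));    /* 2: safe  */
    return 0;
  }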
[COMMITTED] Set memory alignment in expand_builtin_init_descriptor
Committed as r274487 with approval in
https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00974.html

Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 274486)
+++ gcc/builtins.c	(revision 274487)
@@ -5756,6 +5756,7 @@ expand_builtin_init_descriptor (tree exp)
   r_descr = expand_normal (t_descr);
   m_descr = gen_rtx_MEM (BLKmode, r_descr);
   MEM_NOTRAP_P (m_descr) = 1;
+  set_mem_align (m_descr, GET_MODE_ALIGNMENT (ptr_mode));
 
   r_func = expand_normal (t_func);
   r_chain = expand_normal (t_chain);

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 274486)
+++ gcc/ChangeLog	(revision 274487)
@@ -1,3 +1,7 @@
+2019-08-14  Bernd Edlinger
+
+	* builtins.c (expand_builtin_init_descriptor): Set memory alignment.
+
 2019-08-14  Martin Sebor
 
 	PR tree-optimization/91294

Thanks
Bernd.
Re: [PATCH 1/2] PR c++/91436 fix C++ dialect for std::make_unique fix-it hint
On Wed, 2019-08-14 at 16:53 +0100, Jonathan Wakely wrote:
> On 14/08/19 10:39 -0400, David Malcolm wrote:
> > On Wed, 2019-08-14 at 12:02 +0100, Jonathan Wakely wrote:
> > > On 13/08/19 16:07 -0400, Jason Merrill wrote:
> > > > On 8/13/19 9:32 AM, Jonathan Wakely wrote:
> > > > > 	* g++.dg/lookup/missing-std-include-6.C: Don't check
> > > > > 	make_unique in test that runs for C++11.
> > > >
> > > > I'm not comfortable removing this test coverage entirely.  Doesn't
> > > > it give a useful diagnostic in C++11 mode as well?
> > >
> > > It does:
> > >
> > > mu.cc:3:15: error: 'make_unique' is not a member of 'std'
> > >     3 |  auto p = std::make_unique<int>();
> > >       |               ^~~~~~~~~~~
> > > mu.cc:3:15: note: 'std::make_unique' is only available from C++14 onwards
> > > mu.cc:3:27: error: expected primary-expression before 'int'
> > >     3 |  auto p = std::make_unique<int>();
> > >       |                           ^~~
> > >
> > > So we can add it to g++.dg/lookup/missing-std-include-8.C instead,
> > > which runs for c++98_only and checks for the "is only available for"
> > > cases.  Here's a patch doing that.
> >
> > FWIW this eliminates the testing that when we do have C++14 onwards,
> > including <memory> is suggested.
> > Do we really care?
>
> Are we testing that *every* entry in the array gives the right answer
> for both missing-header and bad-std-option, or are we just testing a
> subset of them to be sure the logic works as expected?
>
> Because if we're testing every entry then:
>
> 1) we're missing LOTS of tests, and
>
> 2) we're just as likely to test the wrong thing and not actually catch
>    bugs (as was already happening for both make_unique and
>    complex_literals).
>
> > Maybe we need a C++14-onwards missing-std-include-* test, and to move
> > the existing test there?  (and to add the new test for before-C++-14)
>
> We could, but is it worth it?

Fair enough.

Dave
Re: enforce canonicalization of value_range's
On 8/13/19 6:51 PM, Aldy Hernandez wrote:
>> Presumably this was better than moving the implementation earlier.
>
> Actually, it was for ease of review.  I made some changes to the
> function, and I didn't want the reviewer to miss them because I had
> moved the function wholesale.  I can move the function earlier, after we
> agree on the changes (see below).

Either works for me.  I think there was an informal effort to avoid these
kinds of forward decls eons ago because our inliner sucked, but in the IPA
world order in the source file really shouldn't matter.

>> If we weren't on a path to kill VRP I'd probably suggest a distinct
>> effort to constify this code.  Some of the changes were a bit confusing
>> when it looked like we'd dropped a call to set the range of an object.
>> But those were just local copies, so setting the type/min/max directly
>> was actually fine.  constification would make this a bit clearer.  But
>> again, I don't think it's worth the effort given the long term
>> trajectory for tree-vrp.c.
>
> I shouldn't be introducing any new confusion.  Did I add any new methods
> that should've been const that aren't?  I can't see any.  I'm happy to
> fix anything I introduced.

IIRC we had an incoming range object passed by value, which we locally
modified and called the setter.  I spotted the dropped call to the setter
and was going to call it out as possibly broken.  But in investigating
further I realized the object was passed by value, so dropping the setter
wasn't really a problem.

The funny thing was we were doing this on source operands rather than the
destination operand.  Arguably the ranges for the source operands should
be constant, which would have flagged that code as fishy from its
inception, and I'm sure the code would have been restructured
appropriately and would have avoided the confusion.

So in summary, you didn't break anything.  It was a safe change you made,
but it wasn't immediately obvious it was safe.  If we had a constified
codebase the intent of the code would have been more obvious.

>> So where does the handle_pointers stuff matter?  I'm a bit surprised we
>> have to do anything special for them.
>
> I've learned to touch as little of VRP as is necessary, as changing
> anything to be more consistent breaks things in unexpected ways ;-).
>
> In this particular case, TYPE_MIN_VALUE and TYPE_MAX_VALUE are not
> defined for pointers, and I didn't want to change the meaning of
> vrp_val_{min,max} throughout.  I was trying to minimize the changes to
> existing behavior.  If it bothers you too much, we could remove it as a
> follow up when we are sure there are no expected side-effects from the
> rest of the patch.

I don't mind exploring this as a follow-up.  I guess that a min/max
doesn't really have significant meaning for pointers.  I think rather
than digging too deep into this, let's table it for now.  I think the
time to revisit will be as we work through removal of tree-vrp at some
point in the future.

>> OK.  I don't expect the answers to the minor questions above will
>> ultimately change anything.
>
> I could appreciate a final nod before I commit.  And even then, I will
> wait until the other patch is approved and commit them simultaneously.
> They are a bit intertwined.

I'm nodding :-)

jeff