Re: [PATCH][RFC][x86] Fix PR91154, add SImode smax, allow SImode add in SSE regs

2019-08-03 Thread Richard Biener

On Thu, 1 Aug 2019, Uros Bizjak wrote:


On Thu, Aug 1, 2019 at 11:28 AM Richard Biener  wrote:


So you unconditionally add a smaxdi3 pattern - indeed this looks
necessary even when going the STV route.  The actual regression
for the testcase could also be solved by turing the smaxsi3
back into a compare and jump rather than a conditional move sequence.
So I wonder how you'd do that given that there's pass_if_after_reload
after pass_split_after_reload and I'm not sure we can split
as late as pass_split_before_sched2 (there's also a split _after_
sched2 on x86 it seems).

So how would you go implement {s,u}{min,max}{si,di}3 for the
case STV doesn't end up doing any transform?


If STV doesn't transform the insn, then a pre-reload splitter splits
the insn back to compare+cmove.


OK, that would work.  But there's no way to force a jumpy sequence then
which we know is faster than compare+cmove because later RTL
if-conversion passes happily re-discover the smax (or conditional move)
sequence.


However, considering the SImode move
from/to int/xmm register is relatively cheap, the cost function should
be tuned so that STV always converts smaxsi3 pattern.


Note that on both Zen and even more so bdverN the int/xmm transition
makes it no longer profitable but a _lot_ slower than the cmp/cmov
sequence... (for the loop in hmmer which is the only one I see
any effect of any of my patches).  So identifying chains that
start/end in memory is important for cost reasons.


Please note that the cost function also considers the cost of move
from/to xmm. So, the cost of the whole chain would disable the
transformation.


So I think the splitting has to happen after the last if-conversion
pass (and thus we may need to allocate a scratch register for this
purpose?)


I really hope that the underlying issue will be solved by a machine
dependant pass inserted somewhere after the pre-reload split. This
way, we can split unconverted smax to the cmove, and this later pass
would handle jcc and cmove instructions. Until then... yes your
proposed approach is one of the ways to avoid unwanted if-conversion,
although sometimes we would like to split to cmove instead.


So the following makes STV also consider SImode chains, re-using the
DImode chain code.  I've kept a simple incomplete smaxsi3 pattern
and also did not alter the {SI,DI}mode chain cost function - it's
quite off for TARGET_64BIT.  With this I get the expected conversion
for the testcase derived from hmmer.

No further testing sofar.

Is it OK to re-use the DImode chain code this way?  I'll clean things
up some more of course.

Still need help with the actual patterns for minmax and how the splitters
should look like.

Richard.

Index: gcc/config/i386/i386-features.c
===
--- gcc/config/i386/i386-features.c (revision 274037)
+++ gcc/config/i386/i386-features.c (working copy)
@@ -276,8 +276,11 @@

 /* Initialize new chain.  */

-scalar_chain::scalar_chain ()
+scalar_chain::scalar_chain (enum machine_mode smode_, enum machine_mode vmode_)
 {
+  smode = smode_;
+  vmode = vmode_;
+
   chain_id = ++max_id;

if (dump_file)
@@ -473,7 +476,7 @@
 {
   gcc_assert (CONST_INT_P (exp));

-  if (standard_sse_constant_p (exp, V2DImode))
+  if (standard_sse_constant_p (exp, vmode))
 return COSTS_N_INSNS (1);
   return ix86_cost->sse_load[1];
 }
@@ -534,6 +537,9 @@
   else if (GET_CODE (src) == NEG
   || GET_CODE (src) == NOT)
gain += ix86_cost->add - COSTS_N_INSNS (1);
+  else if (GET_CODE (src) == SMAX
+  || GET_CODE (src) == SMIN)
+   gain += COSTS_N_INSNS (3);
   else if (GET_CODE (src) == COMPARE)
{
  /* Assume comparison cost is the same.  */
@@ -573,7 +579,7 @@
 dimode_scalar_chain::replace_with_subreg (rtx x, rtx reg, rtx new_reg)
 {
   if (x == reg)
-return gen_rtx_SUBREG (V2DImode, new_reg, 0);
+return gen_rtx_SUBREG (vmode, new_reg, 0);

   const char *fmt = GET_RTX_FORMAT (GET_CODE (x));
   int i, j;
@@ -707,7 +713,7 @@
   bitmap_copy (conv, insns);

   if (scalar_copy)
-scopy = gen_reg_rtx (DImode);
+scopy = gen_reg_rtx (smode);

   for (ref = DF_REG_DEF_CHAIN (regno); ref; ref = DF_REF_NEXT_REG (ref))
 {
@@ -750,6 +756,10 @@
  gen_rtx_VEC_SELECT (SImode,
  gen_rtx_SUBREG (V4SImode, reg, 0), tmp)));
}
+ else if (smode == SImode)
+   {
+ emit_move_insn (scopy, gen_rtx_SUBREG (SImode, reg, 0));
+   }
  else
{
  rtx vcopy = gen_reg_rtx (V2DImode);
@@ -816,14 +826,14 @@
   if (GET_CODE (*op) == NOT)
 {
   convert_op (&XEXP (*op, 0), insn);
-  PUT_MODE (*op, V2DImode);
+  PUT_MODE (*op, vmode);
 }
   else if (MEM_P (*op))
 {
-  rtx tmp = gen_reg_rtx (DImode);
+  rtx tmp = gen_reg_rtx (GET_MODE (*op));

   emit_insn_before (gen_move_insn (tmp, *op), insn

Re: [PATCH V5, rs6000] Support vrotr3 for int vector types

2019-08-03 Thread Segher Boessenkool
Hi!

I somehow lost track of this email, sorry.

On Fri, Aug 02, 2019 at 04:59:44PM +0800, Kewen.Lin wrote:
> As to the predicate name and usage, I checked the current vector shifts, 
> they don't need to check const_vector specially (like right to left 
> conversion), excepting for the one "vec_shr_", but it checks for
> scalar const int.

I don't understand why we want to expand rotate-by-vector-of-immediates
if we have no insns for that?  If you just use vint_operand, what happens
then?

> Btw, I've changed the 
>   +  rtx imm_vec = 
>   +   simplify_const_unary_operation
> back to 
>   +  rtx imm_vec
>   +   = simplify_const_unary_operation
> Otherwise check_GNU_style will report "Trailing operator" error.  :(

Yeah I got it the wrong way around.  Either way is ugly.  Oh well.

> +/* { dg-options "-O3" } */
> +/* { dg-require-effective-target powerpc_altivec_ok } */

If you use altivec_ok, you need to use -maltivec in the options, too.
This test should probably work with -O2 as well; use that, if possible.

> +/* { dg-require-effective-target powerpc_p8vector_ok } */

I don't think we need this anymore?  Not sure.


Segher


C++ PATCH for c++/91338 - P1161R3: Deprecate a[b,c]

2019-08-03 Thread Marek Polacek
This patch implements P1161R3, Deprecate uses of the comma operator in
subscripting expressions:

which made its way to C++20.  New [depr.comma.subscript] shows:

  void f(int *a, int b, int c) {
a[b,c]; // deprecated
a[(b,c)];   // OK
  }

I've added a new option, -Wcomma-subscript, so that this warning can be
selectively turned off in c++2a, where it is enabled by default, and turned
on in lower standards, to check if your codebase has any occurrences of it,
without needing to change the -std=.

As for implementation, I've extended cp_parser_skip_to_closing_square_bracket_1
so that we can check for a non-nested token inside [ ].

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-08-03  Marek Polacek  

PR c++/91338 - Implement P1161R3: Deprecate a[b,c].
* c-opts.c (c_common_post_options): Enable -Wcomma-subscript by
default for C++2a.
* c.opt (Wcomma-subscript): New warning.

* parser.c (cp_parser_postfix_open_square_expression): Warn about uses
of a comma operator within a subscripting expression.
(cp_parser_skip_to_closing_square_bracket_1): New function, made out
of...
(cp_parser_skip_to_closing_square_bracket): ...this.

* doc/invoke.texi: Document -Wcomma-subscript.

* g++.dg/cpp2a/comma1.C: New test.
* g++.dg/cpp2a/comma2.C: New test.
* g++.dg/cpp2a/comma3.C: New test.

diff --git gcc/c-family/c-opts.c gcc/c-family/c-opts.c
index e97bbdf5c6f..e43172598f3 100644
--- gcc/c-family/c-opts.c
+++ gcc/c-family/c-opts.c
@@ -916,6 +916,10 @@ c_common_post_options (const char **pfilename)
   if (!global_options_set.x_warn_register)
 warn_register = cxx_dialect >= cxx17;
 
+  /* -Wcomma-subscript is enabled by default in C++20.  */
+  if (!global_options_set.x_warn_comma_subscript)
+warn_comma_subscript = cxx_dialect >= cxx2a;
+
   /* Declone C++ 'structors if -Os.  */
   if (flag_declone_ctor_dtor == -1)
 flag_declone_ctor_dtor = optimize_size;
diff --git gcc/c-family/c.opt gcc/c-family/c.opt
index 4c8b0026000..257cadfa5f1 100644
--- gcc/c-family/c.opt
+++ gcc/c-family/c.opt
@@ -428,6 +428,10 @@ Wclobbered
 C ObjC C++ ObjC++ Var(warn_clobbered) Warning EnabledBy(Wextra)
 Warn about variables that might be changed by \"longjmp\" or \"vfork\".
 
+Wcomma-subscript
+C++ ObjC++ Var(warn_comma_subscript) Warning
+Warn about uses of a comma operator within a subscripting expression.
+
 Wcomment
 C ObjC C++ ObjC++ CPP(warn_comments) CppReason(CPP_W_COMMENTS) 
Var(cpp_warn_comment) Init(0) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
 Warn about possibly nested block comments, and C++ comments spanning more than 
one physical line.
diff --git gcc/cp/parser.c gcc/cp/parser.c
index ebeffdb775f..1a5ae147b84 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -2669,6 +2669,8 @@ static bool cp_parser_init_statement_p
   (cp_parser *);
 static bool cp_parser_skip_to_closing_square_bracket
   (cp_parser *);
+static int cp_parser_skip_to_closing_square_bracket_1
+  (cp_parser *, enum cpp_ttype);
 
 /* Concept-related syntactic transformations */
 
@@ -7522,7 +7524,33 @@ cp_parser_postfix_open_square_expression (cp_parser 
*parser,
  index = cp_parser_braced_list (parser, &expr_nonconst_p);
}
   else
-   index = cp_parser_expression (parser);
+   {
+ /* [depr.comma.subscript]: A comma expression appearing as
+the expr-or-braced-init-list of a subscripting expression
+is deprecated.  A parenthesized comma expression is not
+deprecated.  */
+ if (warn_comma_subscript)
+   {
+ /* Save tokens so that we can put them back.  */
+ cp_lexer_save_tokens (parser->lexer);
+
+ /* Look for ',' that is not nested in () or {}.  */
+ if (cp_parser_skip_to_closing_square_bracket_1 (parser,
+ CPP_COMMA) == -1)
+   {
+ auto_diagnostic_group d;
+ warning_at (cp_lexer_peek_token (parser->lexer)->location,
+ OPT_Wcomma_subscript,
+ "top-level comma expression in array subscript "
+ "is deprecated");
+   }
+
+ /* Roll back the tokens we skipped.  */
+ cp_lexer_rollback_tokens (parser->lexer);
+   }
+
+ index = cp_parser_expression (parser);
+   }
 }
 
   parser->greater_than_is_operator_p = saved_greater_than_is_operator_p;
@@ -22857,16 +22885,25 @@ cp_parser_braced_list (cp_parser* parser, bool* 
non_constant_p)
 }
 
 /* Consume tokens up to, and including, the next non-nested closing `]'.
-   Returns true iff we found a closing `]'.  */
+   Returns 1 iff we found a closing `]'.  Returns -1 if OR_TTYPE is not
+   CPP_EOF and we found an

Re: [PATCH] PR fortran/88227 -- Revenge of the BOZ

2019-08-03 Thread Steve Kargl
Last call.  T-12.

-- 
steve

On Thu, Aug 01, 2019 at 02:13:12PM -0700, Steve Kargl wrote:
> Ping.
> 
> -- 
> steve
> 
> On Sun, Jul 28, 2019 at 04:41:02PM -0700, Steve Kargl wrote:
> > The attach patch fixes a problem with the conversion of a
> > BOZ literal constant to a REAL where the size of the REAL
> > exceeds the size of the largest INTEGER.  The problem can
> > be seen on 32-bit targets that provide support for REAL(10)
> > and/or REAL(16), or it can be seen with a multilib target
> > when using -m32 and REAL(10) and/or REAL(16).
> > 
> > If needed, the patch converts an octal or hexidecimal string
> > to the equivalent binary string, and then converts the binary
> > string to a REAL.  In principle, bin2real() can convert to 
> > REAL(4), REAL(8), REAL(10), and REAL(16), but I have elected
> > to use the old conversion method if the size of the largest
> > INTEGER exceeds the size the REAL(XXX) of interest.  A future
> > patch may remove the old method and make this new approach the
> > only way to convert a BOZ.
> > 
> > I have attached a short test program.  There is no testcase
> > for testsuite.
> > 
> > PLEASE TEST.
> > 
> > 2019-07-28  Steven G. Kargl  
> > 
> > PR fortran/88227
> > * check.c (oct2bin):  New function.  Convert octal string to binary.
> > (hex2bin): New function.  Convert hexidecimal string to binary.
> > (bin2real): New function.  Convert binary string to REAL.  Use
> > oct2bin and hex2bin.
> > (gfc_boz2real):  Use fallback conversion bin2real.
> > 
> > -- 
> > Steve
> 
> > Index: gcc/fortran/check.c
> > ===
> > --- gcc/fortran/check.c (revision 273766)
> > +++ gcc/fortran/check.c (working copy)
> > @@ -55,6 +55,7 @@ gfc_invalid_boz (const char *msg, locus *loc)
> >  
> >  
> >  /* Issue an error for an illegal BOZ argument.  */
> > +
> >  static bool
> >  illegal_boz_arg (gfc_expr *x)
> >  {
> > @@ -101,6 +102,167 @@ is_boz_constant (gfc_expr *a)
> >  }
> >  
> >  
> > +/* Convert a octal string into a binary string.  This is used in the
> > +   fallback conversion of an octal string to a REAL.  */
> > +
> > +static char *
> > +oct2bin(int nbits, char *oct)
> > +{
> > +  const char bits[8][5] = {
> > +"000", "001", "010", "011", "100", "101", "110", "111"};
> > +
> > +  char *buf, *bufp;
> > +  int i, j, n;
> > +
> > +  j = nbits + 1;
> > +  if (nbits == 64) j++;
> > +
> > +  bufp = buf = XCNEWVEC (char, j + 1);
> > +  memset (bufp, 0, j + 1);
> > +
> > +  n = strlen (oct);
> > +  for (i = 0; i < n; i++, oct++)
> > +{
> > +  j = *oct - 48;
> > +  strcpy (bufp, &bits[j][0]);
> > +  bufp += 3;
> > +}
> > +
> > +  bufp = XCNEWVEC (char, nbits + 1);
> > +  if (nbits == 64)
> > +strcpy (bufp, buf + 2);
> > +  else
> > +strcpy (bufp, buf + 1);
> > +
> > +  free (buf);
> > +
> > +  return bufp;
> > +}
> > +
> > +
> > +/* Convert a hexidecimal string into a binary string.  This is used in the
> > +   fallback conversion of a hexidecimal string to a REAL.  */
> > +
> > +static char *
> > +hex2bin(int nbits, char *hex)
> > +{
> > +  const char bits[16][5] = {
> > +"", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
> > +"1000", "1001", "1010", "1011", "1100", "1101", "1110", ""};
> > +
> > +  char *buf, *bufp;
> > +  int i, j, n;
> > +
> > +  bufp = buf = XCNEWVEC (char, nbits + 1);
> > +  memset (bufp, 0, nbits + 1);
> > +
> > +  n = strlen (hex);
> > +  for (i = 0; i < n; i++, hex++)
> > +{
> > +  j = *hex;
> > +  if (j > 47 && j < 58)
> > + j -= 48;
> > +  else if (j > 64 && j < 71)
> > + j -= 55;
> > +  else if (j > 96 && j < 103)
> > + j -= 87;
> > +  else
> > + gcc_unreachable ();
> > +
> > +  strcpy (bufp, &bits[j][0]);
> > +  bufp += 4;
> > +   }
> > +
> > +   return buf;
> > +}
> > +
> > +
> > +/* Fallback conversion of a BOZ string to REAL.  */
> > +
> > +static void
> > +bin2real (gfc_expr *x, int kind)
> > +{
> > +  char buf[114], *sp;
> > +  int b, i, ie, t, w;
> > +  bool sgn;
> > +  mpz_t em;
> > +
> > +  i = gfc_validate_kind (BT_REAL, kind, false);
> > +  t = gfc_real_kinds[i].digits - 1;
> > +
> > +  /* Number of bits in the exponent.  */
> > +  if (gfc_real_kinds[i].max_exponent == 16384)
> > +w = 15;
> > +  else if (gfc_real_kinds[i].max_exponent == 1024)
> > +w = 11;
> > +  else
> > +w = 8;
> > +
> > +  if (x->boz.rdx == 16)
> > +sp = hex2bin (gfc_real_kinds[i].mode_precision, x->boz.str);
> > +  else if (x->boz.rdx == 8)
> > +sp = oct2bin (gfc_real_kinds[i].mode_precision, x->boz.str);
> > +  else
> > +sp = x->boz.str;
> > +
> > +  /* Extract sign bit. */
> > +  sgn = *sp != '0';
> > +
> > +  /* Extract biased exponent. */
> > +  memset (buf, 0, 114);
> > +  strncpy (buf, ++sp, w);
> > +  mpz_init (em);
> > +  mpz_set_str (em, buf, 2);
> > +  ie = mpz_get_si (em);
> > +
> > +  mpfr_init2 (x->value.real, t + 1);
> > +  x->ts.typ