Re: [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694]

2022-02-16 Thread HAO CHEN GUI via Gcc-patches
Hi,

On 15/2/2022 10:56 PM, Segher Boessenkool wrote:
> On Tue, Feb 15, 2022 at 11:01:03AM +0800, HAO CHEN GUI wrote:
> Hi!
> 
>> On 15/2/2022 5:36 AM, Segher Boessenkool wrote:
>>> On Wed, Feb 09, 2022 at 10:43:17AM +0800, HAO CHEN GUI wrote:
>>> All that are arguments for expanding to split form, not for removing
>>> TImode from the iterator.  And you leave PTImode, which *always* is in
>>> GPRs!
>> From my understanding, PTImode has the limitation that it needs to be assigned
>> an even/odd register pair. So it can't be split before the reload pass.
> 
> TImode is put in an even/odd pair always as well.  What is special about
> PTImode here?
TI is allowed in any GPRs. TI can be placed in r3/r4 or r4/r5 (both odd/even
and even/odd), while PTI can only be placed in r4/r5 (even/odd). So if we
split PTI before reload, the constraint is lost and PTI can then be placed in
any GPRs, I think.
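
To make the distinction concrete, here is a toy model in C of the claim in this reply. It is not rs6000 code: `toy_mode`, `start_reg_ok` and the rule itself are assumptions taken from the paragraph above (TI may start at any GPR, PTI only at an even one), and the thread itself questions whether TImode really is this unconstrained.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two 128-bit integer modes discussed above.  */
enum toy_mode { TOY_TI, TOY_PTI };

/* Return true if a 128-bit value of MODE may start at GPR number REGNO,
   following the claim in this mail: PTI needs an even first register,
   TI does not.  The value occupies REGNO and REGNO + 1.  */
static bool
start_reg_ok (enum toy_mode mode, int regno)
{
  assert (regno >= 0 && regno < 31);   /* REGNO + 1 must be a GPR too.  */
  if (mode == TOY_PTI)
    return (regno % 2) == 0;
  return true;
}
```

Under this model r3/r4 is acceptable for TI but not for PTI, which is the property that would be lost if the value were split into two independent DImode halves before register allocation.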
> 
>> Currently it is split after reload.
> 
> This prevents almost all optimisations.  Splits after reload should be a
> last resort thing.  They almost always cause bigger problems than what
> they are meant to solve.  There aren't many splitters that *have* to run
> after RA!
> 
>>> (You'll also have to show it is *correct*, you need to prove (or show it
>>> really likely :-) ) that after this change there are no TImode things
>>> generated anywhere (anywhere!) that are no longer handled now).
>>>
>> Yes, TImode may be generated after the expand pass and cause ICEs. So how about
>> creating two mode iterators? One for expand, which doesn't include TImode, and
>> another for the split, which includes TImode so that TImode is split
>> as early as possible?
> 
> You can also have the expanders fail for TImode?  That gives you a good
> place to put in a code comment as well ;-)
> 
Yes, I will take it.
> 
> Segher


[committed] openmp: For min/max omp atomic compare forms verify arg types with build_binary_op [PR104531]

2022-02-16 Thread Jakub Jelinek via Gcc-patches
Hi!

The MIN_EXPR/MAX_EXPR handling in *build_binary_op is minimal (especially
for C FE), because min/max aren't expressions the languages contain directly.
I'm using those for the
  #pragma omp atomic
  x = x < y ? y : x;
forms, but e.g. for the attached testcase we normally reject _Complex int vs.
int comparisons; in C++ due to MIN/MAX_EXPR we were diagnosing it as invalid
types for <unknown>, while in C we accepted it and ICEd later on.

The following patch will try build_binary_op with LT_EXPR on the operands first
to get needed diagnostics and fail if it returns error_mark_node.
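
For reference, a valid use of the max form looks like the sketch below. The names `shared_max`, `update_max` and `max_after` are made up for illustration; when built without OpenMP support the pragma is ignored and the statement is an ordinary conditional assignment, so the example only shows the accepted shape of the expression, not the atomicity.

```c
#include <assert.h>

/* Shared value updated atomically when compiled with OpenMP support that
   understands the compare clause; otherwise the pragma is ignored.  */
static int shared_max;

/* Store the larger of shared_max and Y back into shared_max.  */
static void
update_max (int y)
{
  #pragma omp atomic compare
  shared_max = shared_max < y ? y : shared_max;
}

/* Single-threaded helper: start from A, fold in B, return the result.  */
static int
max_after (int a, int b)
{
  shared_max = a;
  update_max (b);
  assert (shared_max >= a && shared_max >= b);
  return shared_max;
}
```

Both operands here are plain `int`, so the LT_EXPR check described above has nothing to reject; the new test instead passes a `_Complex int`, for which no ordering comparison exists.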

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2022-02-16  Jakub Jelinek  

PR c/104531
* c-omp.cc (c_finish_omp_atomic): For MIN_EXPR/MAX_EXPR, try first
build_binary_op with LT_EXPR and only if that doesn't return
error_mark_node call build_modify_expr.

* c-c++-common/gomp/atomic-31.c: New test.

--- gcc/c-family/c-omp.cc.jj   2022-02-11 00:19:22.107067674 +0100
+++ gcc/c-family/c-omp.cc   2022-02-15 16:06:24.173311609 +0100
@@ -353,8 +353,13 @@ c_finish_omp_atomic (location_t loc, enu
 }
   bool save = in_late_binary_op;
   in_late_binary_op = true;
-  x = build_modify_expr (loc, blhs ? blhs : lhs, NULL_TREE, opcode,
-loc, rhs, NULL_TREE);
+  if ((opcode == MIN_EXPR || opcode == MAX_EXPR)
+  && build_binary_op (loc, LT_EXPR, blhs ? blhs : lhs, rhs,
+ true) == error_mark_node)
+x = error_mark_node;
+  else
+x = build_modify_expr (loc, blhs ? blhs : lhs, NULL_TREE, opcode,
+  loc, rhs, NULL_TREE);
   in_late_binary_op = save;
   if (x == error_mark_node)
 return error_mark_node;
--- gcc/testsuite/c-c++-common/gomp/atomic-31.c.jj  2022-02-15 17:02:17.938486108 +0100
+++ gcc/testsuite/c-c++-common/gomp/atomic-31.c 2022-02-15 17:04:23.811729201 +0100
@@ -0,0 +1,11 @@
+/* c/104531 */
+/* { dg-do compile } */
+
+int x;
+
+void
+foo (_Complex int y)
+{
+  #pragma omp atomic compare   /* { dg-error "invalid operands" } */
+  x = x > y ? y : x;
+}

Jakub



[PATCH v2, rs6000] Enable absolute jump table for PPC AIX and Linux

2022-02-16 Thread HAO CHEN GUI via Gcc-patches
Hi,
   This patch enables absolute jump tables on PPC AIX and Linux. For AIX, the
jump table is placed in the data section. For Linux, it is placed in the RELRO
section when relocation is needed.

   Bootstrapped and tested on AIX, Linux BE and LE with no regressions. Is this
okay for trunk? Any recommendations? Thanks a lot.
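
As background, the difference between the two table forms can be sketched with a toy model in C. Everything here is an assumption made for illustration (code addresses are plain integers, and `label_offset`, `target_relative`, `relocate_absolute` and `target_absolute` are hypothetical names); it is not how the rs6000 back end emits tables.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: code addresses are plain integers.  The case labels sit at
   fixed offsets from wherever the image is loaded.  */
#define N_CASES 3
static const uintptr_t label_offset[N_CASES] = { 0x40, 0x58, 0x7c };

/* Relative table: entries are offsets, so they do not depend on the load
   address, but every dispatch needs an extra add.  */
static uintptr_t
target_relative (uintptr_t load_base, unsigned idx)
{
  assert (idx < N_CASES);
  return load_base + label_offset[idx];
}

/* Absolute table: entries are final addresses.  Each entry must be fixed
   up (relocated) once the load address is known.  */
static void
relocate_absolute (uintptr_t table[N_CASES], uintptr_t load_base)
{
  for (unsigned i = 0; i < N_CASES; i++)
    table[i] = load_base + label_offset[i];
}

/* With an absolute table, dispatch is a single load.  */
static uintptr_t
target_absolute (const uintptr_t table[N_CASES], unsigned idx)
{
  assert (idx < N_CASES);
  return table[idx];
}
```

The fix-up step is why an absolute table cannot live in a section that is read-only at load time when relocation is needed: the loader has to write the entries once, which matches the data/RELRO placement described above.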

ChangeLog
2022-02-16 Haochen Gui 

gcc/
* config/rs6000/aix.h (JUMP_TABLES_IN_TEXT_SECTION): Define.
* config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION): Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Enable
absolute jump tables for AIX and Linux.
(rs6000_xcoff_function_rodata_section): Implement.
* config/rs6000/xcoff.h (TARGET_ASM_FUNCTION_RODATA_SECTION): Define.

patch.diff
diff --git a/gcc/config/rs6000/aix.h b/gcc/config/rs6000/aix.h
index ad3238bf09a..b52208c2ee7 100644
--- a/gcc/config/rs6000/aix.h
+++ b/gcc/config/rs6000/aix.h
@@ -253,7 +253,7 @@

 /* Indicate that jump tables go in the text section.  */

-#define JUMP_TABLES_IN_TEXT_SECTION 1
+#define JUMP_TABLES_IN_TEXT_SECTION 0

 /* Define any extra SPECS that the compiler needs to generate.  */
 #undef  SUBTARGET_EXTRA_SPECS
diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index b2a7afabc73..16df9ef167f 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -239,7 +239,7 @@ extern int dot_symbols;

 /* Indicate that jump tables go in the text section.  */
 #undef  JUMP_TABLES_IN_TEXT_SECTION
-#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT
+#define JUMP_TABLES_IN_TEXT_SECTION 0

 /* The linux ppc64 ABI isn't explicit on whether aggregates smaller
than a doubleword should be padded upward or downward.  You could
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index bc3ef0721a4..e9c5552c082 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -4954,6 +4954,10 @@ rs6000_option_override_internal (bool global_init_p)
 warning (0, "%qs is deprecated and not recommended in any circumstances",
 "-mno-speculate-indirect-jumps");

+  /* Enable absolute jump tables for AIX and Linux.  */
+  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
+rs6000_relative_jumptables = 0;
+
   return ret;
 }

@@ -28751,6 +28755,15 @@ constant_generates_xxspltidp (vec_const_128bit_type *vsx_const)
   return sf_value;
 }

+section * rs6000_xcoff_function_rodata_section (tree decl ATTRIBUTE_UNUSED,
+   bool relocatable)
+{
+  if (relocatable)
+return data_section;
+  else
+return readonly_data_section;
+}
+
 
 struct gcc_target targetm = TARGET_INITIALIZER;

diff --git a/gcc/config/rs6000/xcoff.h b/gcc/config/rs6000/xcoff.h
index cd0f99cb9c6..0dacd86eed9 100644
--- a/gcc/config/rs6000/xcoff.h
+++ b/gcc/config/rs6000/xcoff.h
@@ -98,7 +98,7 @@
 #define TARGET_ASM_SELECT_SECTION  rs6000_xcoff_select_section
 #define TARGET_ASM_SELECT_RTX_SECTION  rs6000_xcoff_select_rtx_section
 #define TARGET_ASM_UNIQUE_SECTION  rs6000_xcoff_unique_section
-#define TARGET_ASM_FUNCTION_RODATA_SECTION default_no_function_rodata_section
+#define TARGET_ASM_FUNCTION_RODATA_SECTION rs6000_xcoff_function_rodata_section
 #define TARGET_STRIP_NAME_ENCODING  rs6000_xcoff_strip_name_encoding
 #define TARGET_SECTION_TYPE_FLAGS  rs6000_xcoff_section_type_flags
 #ifdef HAVE_AS_TLS


[PATCH] combine: Fix up -fcompare-debug issue in the combiner [PR104544]

2022-02-16 Thread Jakub Jelinek via Gcc-patches
Hi!

On the following testcase on aarch64-linux, we behave differently
with -g and -g0.

The problem is that on:
(insn 10011 10010 10012 2 (set (reg:CC 66 cc)
(compare:CC (reg:DI 105)
(const_int 0 [0]))) "pr104544.c":18:3 407 {cmpdi}
 (expr_list:REG_DEAD (reg:DI 105)
(nil)))
(insn 10012 10011 10013 2 (set (reg:SI 109)
(eq:SI (reg:CC 66 cc)
(const_int 0 [0]))) "pr104544.c":18:3 444 {aarch64_cstoresi}
 (expr_list:REG_DEAD (reg:CC 66 cc)
(nil)))
(insn 10013 10012 10016 2 (set (reg:DI 110)
(zero_extend:DI (reg:SI 109))) "pr104544.c":18:3 111 {*zero_extendsidi2_aarch64}
 (expr_list:REG_DEAD (reg:SI 109)
(nil)))
(insn 10016 10013 10017 2 (parallel [
(set (reg:CC 66 cc)
(compare:CC (const_int 0 [0])
(reg:DI 110)))
(set (reg:DI 111)
(neg:DI (reg:DI 110)))
]) "pr104544.c":18:3 281 {negdi_carryout}
 (expr_list:REG_DEAD (reg:DI 110)
(nil)))
...
(debug_insn 6 5 7 2 (var_location:SI y (debug_expr:SI D#5)) "pr104544.c":18:3 -1
 (nil))
(debug_insn 7 6 10033 2 (debug_marker) "pr104544.c":11:3 -1
 (nil))
(insn 10033 7 10034 2 (set (reg:DI 117 [ _14 ])
(ior:DI (reg:DI 111)
(reg:DI 112))) "pr104544.c":11:6 496 {iordi3}
 (expr_list:REG_DEAD (reg:DI 112)
(expr_list:REG_DEAD (reg:DI 111)
(nil))))
we successfully split 3 insns into two:

Trying 10011, 10013 -> 10016:
 10011: cc:CC=cmp(r105:DI,0)
  REG_DEAD r105:DI
 10013: r110:DI=cc:CC==0
  REG_DEAD cc:CC
 10016: {cc:CC=cmp(0,r110:DI);r111:DI=-r110:DI;}
  REG_DEAD r110:DI
Failed to match this instruction:
(parallel [
(set (reg:CC 66 cc)
(compare:CC (reg:DI 105)
(const_int 0 [0])))
(set (reg:DI 111)
(neg:DI (eq:DI (reg:DI 105)
(const_int 0 [0]))))
])
Failed to match this instruction:
(parallel [
(set (reg:CC 66 cc)
(compare:CC (reg:DI 105)
(const_int 0 [0])))
(set (reg:DI 111)
(neg:DI (eq:DI (reg:DI 105)
(const_int 0 [0]))))
])
Successfully matched this instruction:
(set (reg:DI 111)
(neg:DI (eq:DI (reg:DI 105)
(const_int 0 [0]))))
Successfully matched this instruction:
(set (reg:CC 66 cc)
(compare:CC (reg:DI 105)
(const_int 0 [0])))
Successfully matched this instruction:
(set (reg:DI 112)
(neg:DI (eq:DI (reg:CC 66 cc)
(const_int 0 [0]))))
allowing combination of insns 10011, 10013 and 10016
original costs 4 + 4 + 4 = 16
replacement costs 4 + 4 = 12
deferring deletion of insn with uid = 10011.

but the code that searches forward for insns to update their log
links (before the change there is a link from insn 10033 to insn 10016
for pseudo 111) only finds insn 10033 and updates the log link if
-g isn't enabled, otherwise it stops earlier because there are debug insns
in between.  So, with -g LOG_LINKS of 10033 isn't updated, points eventually
to NOTE_INSN_DELETED and so we do not attempt to combine 10033 with other
insns, while with -g0 we do.

The following patch fixes that by instead ignoring debug insns during the
searching.  We can still check BLOCK_FOR_INSN (insn) on those, because
if we notice DEBUG_INSN in a following basic block, necessarily there won't
be any further normal insns in the current block after it.
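
The effect of the change can be sketched outside of GCC with a simplified scan in C. The types and names below (`toy_insn`, `find_link_user`, the `skip_debug` switch) are hypothetical and only model the loop's stopping condition; the real loop walks the insn chain and updates LOG_LINKS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified instruction record.  */
enum toy_kind { TOY_INSN, TOY_DEBUG_INSN, TOY_NOTE };

struct toy_insn
{
  enum toy_kind kind;
  int block;          /* Basic block the insn belongs to.  */
  int links_to_uid;   /* UID this insn has a log link to, or -1.  */
};

/* Scan forward from index START for an insn in BLOCK that links to UID.
   If SKIP_DEBUG is false, a debug insn ends the scan (the old behaviour);
   if true, debug insns are stepped over (the behaviour after the patch).
   Return the index found, or -1.  */
static int
find_link_user (const struct toy_insn *insns, size_t n, size_t start,
                int block, int uid, bool skip_debug)
{
  assert (insns != NULL || n == 0);
  for (size_t i = start; i < n; i++)
    {
      const struct toy_insn *insn = &insns[i];
      if (insn->kind == TOY_NOTE || insn->block != block)
        break;
      if (insn->kind == TOY_DEBUG_INSN)
        {
          if (skip_debug)
            continue;
          break;
        }
      if (insn->links_to_uid == uid)
        return (int) i;
    }
  return -1;
}
```

With two debug insns in front of the user of pseudo 111, the old stopping rule finds nothing while the new one finds the same insn as a stream without debug insns, which is what -fcompare-debug requires.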

Bootstrapped/regtested on x86_64-linux and i686-linux, bootstrapped
on aarch64-linux, regtest on aarch64-linux still pending, ok for trunk
if it succeeds?

2022-02-16  Jakub Jelinek  

PR rtl-optimization/104544
* combine.cc (try_combine): When looking for insn whose links
should be updated from i3 to i2, don't stop on debug insns, instead
skip over them.

* gcc.dg/pr104544.c: New test.

--- gcc/combine.cc.jj   2022-02-11 13:51:56.294928090 +0100
+++ gcc/combine.cc  2022-02-15 14:15:41.663012950 +0100
@@ -4223,10 +4223,12 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
  for (rtx_insn *insn = NEXT_INSN (i3);
   !done
   && insn
-  && NONDEBUG_INSN_P (insn)
+  && INSN_P (insn)
   && BLOCK_FOR_INSN (insn) == this_basic_block;
   insn = NEXT_INSN (insn))
{
+ if (DEBUG_INSN_P (insn))
+   continue;
  struct insn_link *link;
  FOR_EACH_LOG_LINK (link, insn)
if (link->insn == i3 && link->regno == regno)
--- gcc/testsuite/gcc.dg/pr104544.c.jj  2022-02-15 14:17:50.154221461 +0100
+++ gcc/testsuite/gcc.dg/pr104544.c 2022-02-15 14:17:34.441440536 +0100
@@ -0,0 +1,19 @@
+/* PR rtl-optimization/104544 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -fcompare-debug" } */
+
+int m, n;
+__int128 q;
+
+void
+bar (unsigned __int128 x, int y)
+{
+  if (x)
+q += y;
+}
+
+void
+foo (void)
+{
+  bar (!!q - 1, (m += m ? m : 1) 

[PATCH] Restrict the two sources of vect_recog_cond_expr_convert_pattern to be of the same type when convert is extension.

2022-02-16 Thread liuhongt via Gcc-patches
> > +(match (cond_expr_convert_p @0 @2 @3 @6)
> > + (cond (simple_comparison@6 @0 @1) (convert@4 @2) (convert@5 @3))
> > +  (if (types_match (TREE_TYPE (@2), TREE_TYPE (@3))
>
> But in principle @2 or @3 could safely differ in sign, you'd then need to
> ensure to insert sign conversions to @2/@3 to the signedness of @4/@5.
>
It turns out that differing in sign is not suitable for extension (but OK for
truncation), because it's zero_extend vs. sign_extend.

The patch adds a types_match check when the convert is an extension.
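
A small C illustration of why the sign matters only in one direction. The helper names are made up for this example and use fixed-width types so that the result does not depend on whether plain `char` is signed.

```c
#include <assert.h>
#include <stdint.h>

/* Widening an 8-bit value to 32 bits depends on the source signedness:
   unsigned sources are zero-extended, signed ones sign-extended.  */
static uint32_t
widen_unsigned (uint8_t v)
{
  return (uint32_t) v;
}

static uint32_t
widen_signed (int8_t v)
{
  return (uint32_t) v;
}

/* Narrowing 32 bits to 8 bits keeps the low byte either way, so the
   signedness of the source does not matter.  */
static uint8_t
narrow_unsigned (uint32_t v)
{
  return (uint8_t) v;
}

static uint8_t
narrow_signed (int32_t v)
{
  return (uint8_t) v;
}

/* The same 8-bit pattern 0x80 widens to different values, while the same
   32-bit pattern narrows to one value.  */
static void
check_extension_vs_truncation (void)
{
  assert (widen_unsigned (0x80) != widen_signed (-128));
  assert (narrow_unsigned (0xFFFFFF80u) == narrow_signed (-128));
}
```

So a single vector conversion cannot stand in for both arms of the conditional when they are extended from types of different signedness, whereas for truncation it can.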

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
And native Bootstrapped and regtested on CLX.

Ok for trunk?

gcc/ChangeLog:

PR tree-optimization/104551
PR tree-optimization/103771
* match.pd (cond_expr_convert_p): Add types_match check when
convert is extension.
* tree-vect-patterns.cc
(gimple_cond_expr_convert_p): Adjust comments.
(vect_recog_cond_expr_convert_pattern): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr104551.c: New test.
---
 gcc/match.pd |  8 +---
 gcc/testsuite/gcc.target/i386/pr104551.c | 24 
 gcc/tree-vect-patterns.cc|  6 --
 3 files changed, 33 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr104551.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 05a10ab6bfd..8e80b9f1576 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7692,11 +7692,13 @@ and,
   (if (INTEGRAL_TYPE_P (type)
&& INTEGRAL_TYPE_P (TREE_TYPE (@2))
&& INTEGRAL_TYPE_P (TREE_TYPE (@0))
-   && INTEGRAL_TYPE_P (TREE_TYPE (@3))
&& TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (@0))
&& TYPE_PRECISION (TREE_TYPE (@0))
  == TYPE_PRECISION (TREE_TYPE (@2))
-   && TYPE_PRECISION (TREE_TYPE (@0))
- == TYPE_PRECISION (TREE_TYPE (@3))
+   && (types_match (TREE_TYPE (@2), TREE_TYPE (@3))
+  || ((TYPE_PRECISION (TREE_TYPE (@0))
+   == TYPE_PRECISION (TREE_TYPE (@3)))
+  && INTEGRAL_TYPE_P (TREE_TYPE (@3))
+  && TYPE_PRECISION (TREE_TYPE (@3)) > TYPE_PRECISION (type)))
&& single_use (@4)
&& single_use (@5
diff --git a/gcc/testsuite/gcc.target/i386/pr104551.c b/gcc/testsuite/gcc.target/i386/pr104551.c
new file mode 100644
index 000..6300f25c0d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104551.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mavx2" } */
+/* { dg-require-effective-target avx2 } */
+
+unsigned int
+__attribute__((noipa))
+test(unsigned int a, unsigned char p[16]) {
+  unsigned int res = 0;
+  for (unsigned b = 0; b < a; b += 1)
+res = p[b] ? p[b] : (char) b;
+  return res;
+}
+
+int main ()
+{
+  unsigned int a = 16U;
+  unsigned char p[16];
+  for (int i = 0; i != 16; i++)
+p[i] = (unsigned char)128;
+  unsigned int res = test (a, p);
+  if (res != 128)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index a8f96d59643..217bdfd7045 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -929,8 +929,10 @@ vect_reassociating_reduction_p (vec_info *vinfo,
with conditions:
1) @1, @2, c, d, a, b are all integral type.
2) There's single_use for both @1 and @2.
-   3) a, c and d have same precision.
+   3) a, c have same precision.
4) c and @1 have different precision.
+   5) c, d are the same type or they can differ in sign when convert is
+   truncation.
 
record a and c and d and @3.  */
 
@@ -952,7 +954,7 @@ extern bool gimple_cond_expr_convert_p (tree, tree*, tree (*)(tree));
TYPE_PRECISION (TYPE_E) != TYPE_PRECISION (TYPE_CD);
TYPE_PRECISION (TYPE_AB) == TYPE_PRECISION (TYPE_CD);
single_use of op_true and op_false.
-   TYPE_AB could differ in sign.
+   TYPE_AB could differ in sign when (TYPE_E) A is a truncation.
 
Input:
 
-- 
2.18.1



Re: [PATCH] rs6000: Retry tbegin. instructions that can fail intermittently

2022-02-16 Thread Segher Boessenkool
On Tue, Feb 15, 2022 at 04:59:45PM -0600, Peter Bergner wrote:
> > That is the way any HTM code should be written in the first place
> > (except for rollback-only transactions, but let's not go there --
> > besides, it is normal for those to fail as well, and there needs to be a
> > fallback there as well :-) )
> 
> Agreed and I'm not sure why I didn't write it that way to begin with.
> Maybe I thought it was so simple that the likelihood of it failing was
> so small we'd never see it?  Anyway, we do now, so...

Yeah, and you perhaps were misled by not seeing it fail in any testing
(it fails only .02% of the time you said).  For that reason it helps to
make testcases fail *more* often.  That isn't very trivial to do with
HTM of course.  Since we don't do HTM anymore it will all fade away, and
let's not bother, why am I typing still :-)


Segher


[pushed] aarch64: Extend PR100056 patterns to +

2022-02-16 Thread Richard Sandiford via Gcc-patches
pr100056.c contains things like:

int
or_shift_u3a (unsigned i)
{
  i &= 7;
  return i | (i << 11);
}

After g:96146e61cd7aee62c21c2845916ec42152918ab7, the preferred
gimple representation of this is a multiplication:

  i_2 = i_1(D) & 7;
  _5 = i_2 * 2049;

Expand then open-codes the multiplication back to individual shifts,
but (of course) it uses + rather than | to combine the shifts.
This means that we end up with the RTL equivalent of:

  i + (i << 11)

I wondered about canonicalising the + to | (*back* to | in this case)
when the operands have no set bits in common and when one of the
operands is &, | or ^, but that didn't seem to be a popular idea when
I asked on IRC.  The feeling seemed to be that + is inherently simpler
than |, so we shouldn't be “simplifying” the other way.

This patch therefore adjusts the PR100056 patterns to handle +
as well as |, in cases where the operands are provably disjoint.
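
The equivalence the patch relies on can be checked directly in C. This is only an illustration of the arithmetic; the function names are invented here, and `disjoint_p` stands in for whatever the back end proves about the operands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if A and B have no set bits in common.  */
static bool
disjoint_p (uint32_t a, uint32_t b)
{
  return (a & b) == 0;
}

/* The or_shift_u3a computation from above written three ways.  After
   masking to 3 bits, I and I << 11 cannot overlap, so no carries occur
   and all three forms agree.  */
static uint32_t
via_ior (uint32_t i)
{
  i &= 7;
  return i | (i << 11);
}

static uint32_t
via_plus (uint32_t i)
{
  i &= 7;
  assert (disjoint_p (i, i << 11));
  return i + (i << 11);
}

static uint32_t
via_mult (uint32_t i)
{
  i &= 7;
  return i * 2049;
}
```

When the operands do share a set bit the sum produces a carry and the two operators diverge, which is why the split is only applied to plus when disjointness can be shown.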

For:

int
or_shift_u8 (unsigned char i)
{
  return i | (i << 11);
}

the instructions:

2: r95:SI=zero_extend(x0:QI)
  REG_DEAD x0:QI
7: r98:SI=r95:SI<<0xb

are combined into:

(parallel [
(set (reg:SI 98)
 (and:SI (ashift:SI (reg:SI 0 x0 [ i ])
(const_int 11 [0xb]))
 (const_int 522240 [0x7f800])))
(set (reg/v:SI 95 [ i ])
 (zero_extend:SI (reg:QI 0 x0 [ i ])))
])

which fails to match, but which is then split into its individual
(independent) sets.  Later the zero_extend is combined with the add
to get an ADD UXTB:

(set (reg:SI 99)
 (plus:SI (zero_extend:SI (reg:QI 0 x0 [ i ]))
  (reg:SI 98)))

This means that there is never a 3-insn combo to match the split
against.  The end result is therefore:

ubfiz   w1, w0, 11, 8
add w0, w1, w0, uxtb

This is a bit redundant, since it's doing the zero_extend twice.
It is at least 2 instructions though, rather than the 3 that we
had before the original patch for PR100056.  or_shift_u8_asm is
affected similarly.

The net effect is that we do still have 2 UBFIZs, but we're at
least back down to 2 instructions per function, as for GCC 11.
I think that's good enough for now.

There are probably other instructions that should be extended
to support + as well as | (e.g. the EXTR ones), but those aren't
regressions and so are GCC 13 material.

Tested on aarch64-linux-gnu & pushed.

Richard


gcc/
PR target/100056
* config/aarch64/iterators.md (LOGICAL_OR_PLUS): New iterator.
* config/aarch64/aarch64.md: Extend the PR100056 patterns
to handle plus in the same way as ior, if the operands have
no set bits in common.

gcc/testsuite/
PR target/100056
* gcc.target/aarch64/pr100056.c: XFAIL the original UBFIZ test
and instead expect two UBFIZs + two ADD UXTBs.
---
 gcc/config/aarch64/aarch64.md   | 33 ++---
 gcc/config/aarch64/iterators.md |  3 ++
 gcc/testsuite/gcc.target/aarch64/pr100056.c |  4 ++-
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 64cc21d5802..590918464b8 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4558,7 +4558,7 @@ (define_insn "*_3"
 
 (define_split
   [(set (match_operand:GPI 0 "register_operand")
-   (LOGICAL:GPI
+   (LOGICAL_OR_PLUS:GPI
  (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
   (match_operand:QI 2 "aarch64_shift_imm_"))
   (match_operand:GPI 3 "const_int_operand"))
@@ -4571,16 +4571,23 @@ (define_split
   && REGNO (operands[1]) == REGNO (operands[4])))
&& (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[4]))
   << INTVAL (operands[2]), mode)
-   == INTVAL (operands[3]))"
+   == INTVAL (operands[3]))
+   && ( != PLUS
+   || (GET_MODE_MASK (GET_MODE (operands[4]))
+  & INTVAL (operands[3])) == 0)"
   [(set (match_dup 5) (zero_extend:GPI (match_dup 4)))
-   (set (match_dup 0) (LOGICAL:GPI (ashift:GPI (match_dup 5) (match_dup 2))
-  (match_dup 5)))]
-  "operands[5] = gen_reg_rtx (mode);"
+   (set (match_dup 0) (match_dup 6))]
+  {
+operands[5] = gen_reg_rtx (mode);
+rtx shift = gen_rtx_ASHIFT (mode, operands[5], operands[2]);
+rtx_code new_code = ( == PLUS ? IOR : );
+operands[6] = gen_rtx_fmt_ee (new_code, mode, shift, operands[5]);
+  }
 )
 
 (define_split
   [(set (match_operand:GPI 0 "register_operand")
-   (LOGICAL:GPI
+   (LOGICAL_OR_PLUS:GPI
  (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
   (match_operand:QI 2 "aarch64_shift_imm_"))
   (match_operand:GPI 4 "const_int_operand"))
@@ -4589,11 +4596,17 @@ (define_split
&& pow2_or_zerop (UINTVAL (operands[3]

[pushed] aarch64: Remove XFAIL for bic-bitmask-1.c

2022-02-16 Thread Richard Sandiford via Gcc-patches
bic-bitmask-1.c is now passing, so remove the XFAIL.

Tested on aarch64-linux-gnu & pushed.

Richard


gcc/testsuite/
* gcc.target/aarch64/bic-bitmask-1.c: Remove XFAIL.
---
 gcc/testsuite/gcc.target/aarch64/bic-bitmask-1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/bic-bitmask-1.c b/gcc/testsuite/gcc.target/aarch64/bic-bitmask-1.c
index 568c1ffc8bc..bcb9cdd494e 100644
--- a/gcc/testsuite/gcc.target/aarch64/bic-bitmask-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/bic-bitmask-1.c
@@ -10,4 +10,4 @@ uint32x4_t foo (int32x4_t a)
   return vceqq_s32 (vbicq_s32 (a, cst), zero);
 }
 
-/* { dg-final { scan-assembler-not {\tbic\t} { xfail { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-not {\tbic\t} } } */
-- 
2.25.1


[pushed] aarch64: Tweak atomic-inst-cas.c options

2022-02-16 Thread Richard Sandiford via Gcc-patches
atomic-inst-cas.c has code to skip __atomic_compare_exchange_n
calls for invalid memory orderings, but -Winvalid-memory-model
applies before the dead code is removed (which is the right
behaviour IMO).  This patch therefore suppresses the warning
for this test.

Tested on aarch64-linux-gnu & pushed.

Richard


gcc/testsuite/
* gcc.target/aarch64/atomic-inst-cas.c: Add
-Wno-invalid-memory-model.
---
 gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c b/gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c
index f6f28922319..0b4533adade 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv8-a+lse" } */
+/* -Winvalid-memory-model warnings are issued before the dead invalid calls
+   are removed.  */
+/* { dg-options "-O2 -march=armv8-a+lse -Wno-invalid-memory-model" } */
 
 /* Test ARMv8.1-A CAS instruction.  */
 
-- 
2.25.1


Re: [PATCH], PR target/99708 - Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-02-16 Thread Segher Boessenkool
On Tue, Feb 15, 2022 at 06:05:06PM -0500, Michael Meissner wrote:
> On Tue, Feb 15, 2022 at 04:05:11PM -0600, Segher Boessenkool wrote:
> > On all older compilers these macros will not be defined, but the types
> > often are.  If you are willing to not support older compilers properly
> > anyway, you could just *always* use the types, which will work with most
> > very old compilers as well (and the approach using these proposed
> > predefines will *not*!)
> 
> The types are not defined on older systems.

They are defined since GCC 6.  Using new macros to see if the types
exist will miss all of GCC 6, 7, 8, 9, 10, 11.

> Both __ibm128 (ibm128_float_type_node) and __float128
> (ieee128_float_type_node) are only defined if TARGET_FLOAT128_TYPE is true.
> 
> TARGET_FLOAT128_TYPE is only true if both TARGET_FLOAT128_ENABLE_TYPE and
> TARGET_VSX are true.
> 
> TARGET_FLOAT128_ENABLE_TYPE is only true on linux64 systems.
> 
> Now, the code to set __SIZEOF_IBM128__ and __SIZEOF_FLOAT128__  is in the code
> that also defines __FLOAT128__.  This code checks whether the __float128 and
> __ibm128 keywords are allowed.  These keywords are only set if
> TARGET_FLOAT128_TYPE is true, and if the user did not use the -mno-float128
> option.  In the GCC 7 time frame, we did not set this by default, but in the
> modern compilers, it is always set by default on Linux 64-bit systems.

__SIZEOF_IBM128__ should be defined based on what we have for the
__ibm128 type, not on what we have for the __float128 type.  Yes I know
we *currently* define those under the same conditions, but why write
code that is more fragile than needed?  Please don't.  It is easy to do
it correctly, so it is no real hassle for the writer of the code, and it
is much better for the reader.
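
From the user's side, the point can be sketched as follows: each macro is tested on its own and only guards the type it describes. The macros are the ones proposed in the patch under discussion; the helper names are made up, and a return value of 0 means only that the macro is absent, which (as noted above for GCC 6 to 11) does not prove the type is missing.

```c
#include <assert.h>
#include <stddef.h>

/* Size of __ibm128 if the compiler advertises it, else 0.  */
static size_t
ibm128_size (void)
{
#ifdef __SIZEOF_IBM128__
  assert (sizeof (__ibm128) == __SIZEOF_IBM128__);
  return sizeof (__ibm128);
#else
  return 0;   /* Macro absent: the type may or may not exist.  */
#endif
}

/* Size of __float128 if the compiler advertises it, else 0.  */
static size_t
float128_size (void)
{
#ifdef __SIZEOF_FLOAT128__
  assert (sizeof (__float128) == __SIZEOF_FLOAT128__);
  return sizeof (__float128);
#else
  return 0;   /* Macro absent: the type may or may not exist.  */
#endif
}
```

Neither function mentions the other type, so the code keeps working if the two macros ever stop being defined under the same conditions.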


Segher


Re: [PATCH] [gfortran] Add support for allocate clause (OpenMP 5.0).

2022-02-16 Thread Hafiz Abid Qadeer
On 05/02/2022 19:09, Hafiz Abid Qadeer wrote:
> On 04/02/2022 11:25, Hafiz Abid Qadeer wrote:
>> On 04/02/2022 09:46, Thomas Schwinge wrote:
>>
>>>
>>> Abid, are you going to address these?  I think it does make sense if the
>>> C/C++ and Fortran test cases match as much as feasible.
>>>
>> Sure. I will do that.
> 
> The attached patch addresses those issues apart from removing the pool_size trait.

Is this change ok to commit?

Thanks,
-- 
Hafiz Abid Qadeer



[PATCH] selftest: Move C-specific tests to c_family

2022-02-16 Thread Arthur Cohen

When trying to make use of the selftest framework over on the rust
frontend, we ran into issues where rust1 was expected to produce errors
containing C-like type names such as `int`.

I had gotten in contact with David Malcolm on the gcc mailing list [1],
who advised moving some test functions to a better location. The
offending functions have also been renamed in order to better fit the C
family of tests, and are thus not called when performing general
selftests anymore.

Kindly,

[1]: https://gcc.gnu.org/pipermail/gcc/2021-November/237703.html

2022-02-16 Arthur Cohen 

* diagnostic.cc (diagnostic_cc_tests): Rename to...
(c_diagnostic_cc_tests): ...this.
* opt-problem.cc (opt_problem_cc_tests): Rename to...
(c_opt_problem_cc_tests): ...this.
---
 gcc/c-family/c-common.cc  | 2 ++
 gcc/c-family/c-common.h   | 2 ++
 gcc/diagnostic.cc | 2 +-
 gcc/opt-problem.cc| 2 +-
 gcc/selftest-run-tests.cc | 2 --
 gcc/selftest.h| 2 --
 6 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 7203d761df1..d034837bb5b 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -9120,6 +9120,8 @@ c_family_tests (void)
   c_indentation_cc_tests ();
   c_pretty_print_cc_tests ();
   c_spellcheck_cc_tests ();
+  c_diagnostic_cc_tests ();
+  c_opt_problem_cc_tests ();
 }

 } // namespace selftest
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index a8d6f82bb2c..ed20c5837be 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1513,8 +1513,10 @@ extern tree braced_lists_to_strings (tree, tree);
 namespace selftest {
   /* Declarations for specific families of tests within c-family,
  by source file, in alphabetical order.  */
+  extern void c_diagnostic_cc_tests (void);
   extern void c_format_cc_tests (void);
   extern void c_indentation_cc_tests (void);
+  extern void c_opt_problem_cc_tests (void);
   extern void c_pretty_print_cc_tests (void);
   extern void c_spellcheck_cc_tests (void);

diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index 87eb473d2f3..73324a728fe 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -2472,7 +2472,7 @@ test_num_digits ()
 /* Run all of the selftests within this file.  */

 void
-diagnostic_cc_tests ()
+c_diagnostic_cc_tests ()
 {
   test_print_escaped_string ();
   test_print_parseable_fixits_none ();
diff --git a/gcc/opt-problem.cc b/gcc/opt-problem.cc
index e45d14e94b6..11fec57d679 100644
--- a/gcc/opt-problem.cc
+++ b/gcc/opt-problem.cc
@@ -324,7 +324,7 @@ test_opt_result_failure_at (const line_table_case &case_)

 /* Run all of the selftests within this file.  */

 void
-opt_problem_cc_tests ()
+c_opt_problem_cc_tests ()
 {
   test_opt_result_success ();
   for_each_line_table_case (test_opt_result_failure_at);
diff --git a/gcc/selftest-run-tests.cc b/gcc/selftest-run-tests.cc
index 99c35423253..d59e0aeddee 100644
--- a/gcc/selftest-run-tests.cc
+++ b/gcc/selftest-run-tests.cc
@@ -76,7 +76,6 @@ selftest::run_tests ()
   json_cc_tests ();
   cgraph_cc_tests ();
   optinfo_emit_json_cc_tests ();
-  opt_problem_cc_tests ();
   ordered_hash_map_tests_cc_tests ();
   splay_tree_cc_tests ();

@@ -95,7 +94,6 @@ selftest::run_tests ()
   /* Higher-level tests, or for components that other selftests don't
  rely on.  */
   diagnostic_show_locus_cc_tests ();
-  diagnostic_cc_tests ();
   diagnostic_format_json_cc_tests ();
   edit_context_cc_tests ();
   fold_const_cc_tests ();
diff --git a/gcc/selftest.h b/gcc/selftest.h
index 7a715631c62..7568a6d24d4 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -222,7 +222,6 @@ extern void attribs_cc_tests ();
 extern void bitmap_cc_tests ();
 extern void cgraph_cc_tests ();
 extern void convert_cc_tests ();
-extern void diagnostic_cc_tests ();
 extern void diagnostic_format_json_cc_tests ();
 extern void diagnostic_show_locus_cc_tests ();
 extern void digraph_cc_tests ();
@@ -238,7 +237,6 @@ extern void hash_map_tests_cc_tests ();
 extern void hash_set_tests_cc_tests ();
 extern void input_cc_tests ();
 extern void json_cc_tests ();
-extern void opt_problem_cc_tests ();
 extern void optinfo_emit_json_cc_tests ();
 extern void opts_cc_tests ();
 extern void ordered_hash_map_tests_cc_tests ();
--
2.35.1



--
Arthur Cohen 

Toolchain Engineer

Embecosm GmbH

Geschäftsführer: Jeremy Bennett
Niederlassung: Nürnberg
Handelsregister: HR-B 36368
www.embecosm.de

Fürther Str. 27
90429 Nürnberg


Tel.: 091 - 128 707 040
Fax: 091 - 128 707 077




Re: [PATCH] combine: Fix up -fcompare-debug issue in the combiner [PR104544]

2022-02-16 Thread Segher Boessenkool
Hi!

On Wed, Feb 16, 2022 at 09:53:34AM +0100, Jakub Jelinek wrote:
> On the following testcase on aarch64-linux, we behave differently
> with -g and -g0.

[ huge snip ]

> The following patch fixes that by instead ignoring debug insns during the
> searching.  We can still check BLOCK_FOR_INSN (insn) on those, because
> if we notice DEBUG_INSN in a following basic block, necessarily there won't
> be any further normal insns in the current block after it.

> --- gcc/combine.cc.jj 2022-02-11 13:51:56.294928090 +0100
> +++ gcc/combine.cc2022-02-15 14:15:41.663012950 +0100
> @@ -4223,10 +4223,12 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
> for (rtx_insn *insn = NEXT_INSN (i3);
>  !done
>  && insn
> -&& NONDEBUG_INSN_P (insn)
> +&& INSN_P (insn)
>  && BLOCK_FOR_INSN (insn) == this_basic_block;
>  insn = NEXT_INSN (insn))
>   {
> +   if (DEBUG_INSN_P (insn))
> + continue;
> struct insn_link *link;
> FOR_EACH_LOG_LINK (link, insn)
>   if (link->insn == i3 && link->regno == regno)

About half of the similar loops in combine.c are still broken this way,
from a quick sampling :-(

Okay for trunk and all backports you may want.  Thanks!


Segher


Re: [PATCH] combine: Fix up -fcompare-debug issue in the combiner [PR104544]

2022-02-16 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 16, 2022 at 04:44:58AM -0600, Segher Boessenkool wrote:
> > --- gcc/combine.cc.jj   2022-02-11 13:51:56.294928090 +0100
> > +++ gcc/combine.cc  2022-02-15 14:15:41.663012950 +0100
> > @@ -4223,10 +4223,12 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
> >   for (rtx_insn *insn = NEXT_INSN (i3);
> >!done
> >&& insn
> > -  && NONDEBUG_INSN_P (insn)
> > +  && INSN_P (insn)
> >&& BLOCK_FOR_INSN (insn) == this_basic_block;
> >insn = NEXT_INSN (insn))
> > {
> > + if (DEBUG_INSN_P (insn))
> > +   continue;
> >   struct insn_link *link;
> >   FOR_EACH_LOG_LINK (link, insn)
> > if (link->insn == i3 && link->regno == regno)
> 
> About half of the similar loops in combine.c are still broken this way,
> from a quick sampling :-(

Looking for just NONDEBUG_INSN_P, I don't see any other than this.

> Okay for trunk and all backports you may want.  Thanks!

Thanks.

Jakub



[PATCH][gcc][middle-end] PR104498: Fix comparing symbol reference

2022-02-16 Thread Andre Vieira (lists) via Gcc-patches

Hi,

As reported in PR104498, the issue here is that
compare_base_symbol_refs swaps x and y but doesn't take that into
account when computing the distance.
This patch makes sure that if x and y are swapped, we correct the
distance computation by multiplying it by -1, so that we end up with the
expected result of the original Y_BASE - X_BASE.
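
The sign problem can be shown with a minimal standalone sketch. This is not the alias.cc code: the Sym struct, its fields and both function names are hypothetical stand-ins, chosen only to model "swap the operands, then subtract".

```cpp
// Standalone model of the sign problem; not the GCC implementation.
// Sym and its fields are hypothetical stand-ins for SYMBOL_REF data.
struct Sym {
  bool has_decl;      // models SYMBOL_REF_DECL being non-null
  long block_offset;  // models SYMBOL_REF_BLOCK_OFFSET
};

// Without the fix: after the swap, "y - x" is really
// "original x - original y", so the sign is inverted.
constexpr long distance_without_fix(Sym x, Sym y) {
  if (!x.has_decl) {
    Sym tmp = x;
    x = y;
    y = tmp;
  }
  return y.block_offset - x.block_offset;
}

// With the fix: remember that a swap happened and negate the difference,
// so the caller always gets original Y - original X.
constexpr long distance_with_fix(Sym x, Sym y) {
  bool swapped = false;
  if (!x.has_decl) {
    swapped = true;
    Sym tmp = x;
    x = y;
    y = tmp;
  }
  return (swapped ? -1 : 1) * (y.block_offset - x.block_offset);
}
```

With x at offset 8 (no decl) and y at offset 24, the caller expects 24 - 8; only the second function returns that value when the swap is taken.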


Bootstrapped and regression tested on aarch64-none-linux.

OK for trunk?

gcc/ChangeLog:

    PR middle-end/104498
    * alias.cc (compare_base_symbol_refs): Correct distance computation
    when swapping x and y.
diff --git a/gcc/alias.cc b/gcc/alias.cc
index 3fd71cff2e2b488bc39fcf7d937e118b96f491ab..8c08452e0acfcbf1bfd8fd2e8cd420b5b929d6b4 100644
--- a/gcc/alias.cc
+++ b/gcc/alias.cc
@@ -2195,6 +2195,7 @@ compare_base_symbol_refs (const_rtx x_base, const_rtx y_base,
   tree x_decl = SYMBOL_REF_DECL (x_base);
   tree y_decl = SYMBOL_REF_DECL (y_base);
   bool binds_def = true;
+  bool swap = false;
 
   if (XSTR (x_base, 0) == XSTR (y_base, 0))
 return 1;
@@ -2204,6 +2205,7 @@ compare_base_symbol_refs (const_rtx x_base, const_rtx y_base,
 {
   if (!x_decl)
{
+ swap = true;
  std::swap (x_decl, y_decl);
  std::swap (x_base, y_base);
}
@@ -2238,8 +2240,8 @@ compare_base_symbol_refs (const_rtx x_base, const_rtx y_base,
   if (SYMBOL_REF_BLOCK (x_base) != SYMBOL_REF_BLOCK (y_base))
return 0;
   if (distance)
-   *distance += (SYMBOL_REF_BLOCK_OFFSET (y_base)
- - SYMBOL_REF_BLOCK_OFFSET (x_base));
+   *distance += (swap ? -1 : 1) * (SYMBOL_REF_BLOCK_OFFSET (y_base)
+   - SYMBOL_REF_BLOCK_OFFSET (x_base));
   return binds_def ? 1 : -1;
 }
   /* Either the symbols are equal (via aliasing) or they refer to


[wwwdocs PATCH v2] gcc-12: Mention -mno-direct-extern-access

2022-02-16 Thread H.J. Lu via Gcc-patches
---
 htdocs/gcc-12/changes.html | 4 
 1 file changed, 4 insertions(+)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index b6341fda..7d253f29 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -399,6 +399,10 @@ a work-in-progress.
   Add CS prefix to call and jmp to indirect thunk with branch target
   in r8-r15 registers via -mindirect-branch-cs-prefix.
   
+  Always use global offset table (GOT) to access external data and
+  function symbols when the new -mno-direct-extern-access
+  command-line option is specified.
+  
 
 
 
-- 
2.35.1



Re: [wwwdocs PATCH] gcc-12: Mention -mno-direct-extern-access

2022-02-16 Thread H.J. Lu via Gcc-patches
On Sat, Feb 12, 2022 at 2:27 PM Gerald Pfeifer  wrote:
>
> On Sat, 12 Feb 2022, H.J. Lu via Gcc-patches wrote:
> > +  Always use GOT to access external data and function symbols via
> > +  -mno-direct-extern-access.
>
> Maybe say "global offset table (GOT)"?

Fixed.

> And at first I was confused reading this, so I suggest something like
>
>   "...when the new -mno-direct-extern-access command-line
>   option is specified"

Fixed.

> or
>
>   "New command-line option ... that ..." ?
>
> Gerald

Fixed in the v2 patch.

Thanks.

-- 
H.J.


Re: [GCC 11 PATCH 0/5] x86: Backport straight-line-speculation mitigation

2022-02-16 Thread H.J. Lu via Gcc-patches
On Tue, Feb 15, 2022 at 10:52 PM Hongtao Liu  wrote:
>
> On Tue, Feb 1, 2022 at 2:55 AM H.J. Lu via Gcc-patches
>  wrote:
> >
> > Backport -mindirect-branch-cs-prefix:
> >
> > commit 48a4ae26c225eb018ecb59f131e2c4fd4f3cf89a
> > Author: H.J. Lu 
> > Date:   Wed Oct 27 06:27:15 2021 -0700
> >
> > x86: Add -mindirect-branch-cs-prefix
> >
> > Add -mindirect-branch-cs-prefix to add CS prefix to call and jmp to
> > indirect thunk with branch target in r8-r15 registers so that the call
> > and jmp instruction length is 6 bytes to allow them to be replaced with
> > "lfence; call *%r8-r15" or "lfence; jmp *%r8-r15" at run-time.
> >
> > commit 63738e176726d31953deb03f7e32cf8b760735ac
> > Author: H.J. Lu 
> > Date:   Wed Oct 27 07:48:54 2021 -0700
> >
> > x86: Add -mharden-sls=[none|all|return|indirect-branch]
> >
> > Add -mharden-sls= to mitigate against straight line speculation (SLS)
> > for function return and indirect branch by adding an INT3 instruction
> > after function return and indirect branch.
> >
> > and followup commits to support Linux kernel commits:
> >
> > commit e463a09af2f0677b9485a7e8e4e70b396b2ffb6f
> > Author: Peter Zijlstra 
> > Date:   Sat Dec 4 14:43:44 2021 +0100
> >
> > x86: Add straight-line-speculation mitigation
> >
> > commit 68cf4f2a72ef8786e6b7af6fd9a89f27ac0f520d
> > Author: Peter Zijlstra 
> > Date:   Fri Nov 19 17:50:25 2021 +0100
> >
> > x86: Use -mindirect-branch-cs-prefix for RETPOLINE builds
> >
> > H.J. Lu (5):
> >   x86: Remove "%!" before ret
> >   x86: Add -mharden-sls=[none|all|return|indirect-branch]
> >   x86: Add -mindirect-branch-cs-prefix
> >   x86: Rename -harden-sls=indirect-branch to -harden-sls=indirect-jmp
> >   x86: Generate INT3 for __builtin_eh_return
> The patch LGTM.

I am pushing this patch set into GCC 11 branch.

Thanks.

> >
> >  gcc/config/i386/i386-opts.h   |  7 
> >  gcc/config/i386/i386.c| 38 +--
> >  gcc/config/i386/i386.md   |  2 +-
> >  gcc/config/i386/i386.opt  | 24 
> >  gcc/doc/invoke.texi   | 18 -
> >  gcc/testsuite/gcc.target/i386/harden-sls-1.c  | 14 +++
> >  gcc/testsuite/gcc.target/i386/harden-sls-2.c  | 14 +++
> >  gcc/testsuite/gcc.target/i386/harden-sls-3.c  | 14 +++
> >  gcc/testsuite/gcc.target/i386/harden-sls-4.c  | 16 
> >  gcc/testsuite/gcc.target/i386/harden-sls-5.c  | 17 +
> >  gcc/testsuite/gcc.target/i386/harden-sls-6.c  | 18 +
> >  .../i386/indirect-thunk-cs-prefix-1.c | 14 +++
> >  .../i386/indirect-thunk-cs-prefix-2.c | 15 
> >  13 files changed, 198 insertions(+), 13 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-5.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-6.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-2.c
> >
> > --
> > 2.34.1
> >
>
>
> --
> BR,
> Hongtao



-- 
H.J.


Re: [PATCH] Restrict the two sources of vect_recog_cond_expr_convert_pattern to be of the same type when convert is extension.

2022-02-16 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 16, 2022 at 05:03:09PM +0800, liuhongt via Gcc-patches wrote:
> > > +(match (cond_expr_convert_p @0 @2 @3 @6)
> > > + (cond (simple_comparison@6 @0 @1) (convert@4 @2) (convert@5 @3))
> > > +  (if (types_match (TREE_TYPE (@2), TREE_TYPE (@3))
> >
> > But in principle @2 or @3 could safely differ in sign, you'd then need to
> > ensure to insert sign conversions to @2/@3 to the signedness of @4/@5.
> >
> It turns out that differing in sign is not suitable for extension (but OK
> for truncation), because it's zero_extend vs sign_extend.
> 
> The patch adds a types_match check when the convert is an extension.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> And native Bootstrapped and regtested on CLX.
> 
> Ok for trunk?
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/104551
>   PR tree-optimization/103771
>   * match.pd (cond_expr_convert_p): Add types_match check when
>   convert is extension.
>   * tree-vect-patterns.cc
>   (gimple_cond_expr_convert_p): Adjust comments.
>   (vect_recog_cond_expr_convert_pattern): Ditto.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/i386/pr104551.c: New test.
> ---
>  gcc/match.pd |  8 +---
>  gcc/testsuite/gcc.target/i386/pr104551.c | 24 
>  gcc/tree-vect-patterns.cc|  6 --
>  3 files changed, 33 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr104551.c
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 05a10ab6bfd..8e80b9f1576 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -7692,11 +7692,13 @@ and,
>(if (INTEGRAL_TYPE_P (type)
> && INTEGRAL_TYPE_P (TREE_TYPE (@2))
> && INTEGRAL_TYPE_P (TREE_TYPE (@0))
> -   && INTEGRAL_TYPE_P (TREE_TYPE (@3))
> && TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (@0))
> && TYPE_PRECISION (TREE_TYPE (@0))
> == TYPE_PRECISION (TREE_TYPE (@2))
> -   && TYPE_PRECISION (TREE_TYPE (@0))
> -   == TYPE_PRECISION (TREE_TYPE (@3))
> +   && (types_match (TREE_TYPE (@2), TREE_TYPE (@3))
> +|| ((TYPE_PRECISION (TREE_TYPE (@0))
> + == TYPE_PRECISION (TREE_TYPE (@3)))
> +&& INTEGRAL_TYPE_P (TREE_TYPE (@3))
> +&& TYPE_PRECISION (TREE_TYPE (@3)) > TYPE_PRECISION (type)))
> && single_use (@4)
> && single_use (@5

I find this quite unreadable; it looks as if @2 and @3 are treated
differently.  I think keeping the old 3 lines and just adding
  && (TYPE_PRECISION (TREE_TYPE (@0)) >= TYPE_PRECISION (type)
      || (TYPE_UNSIGNED (TREE_TYPE (@2))
          == TYPE_UNSIGNED (TREE_TYPE (@3))))
after them, ideally with a comment explaining why, would be better.
Note, if the precision of @0 and type is the same, I think signedness can
still differ, no?
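
The zero_extend vs sign_extend concern can be illustrated with a small standalone sketch. It does not use match.pd or GIMPLE; the function names are hypothetical, and the fixed-width integer types merely stand in for the operand types of the pattern. Conversions to unsigned types are used for the narrowing case so that the behaviour is well defined.

```cpp
#include <cstdint>

// Widen each arm separately: the signed arm is sign-extended and the
// unsigned arm is zero-extended, as in the original source expression.
constexpr std::int64_t widen_then_select(bool c, std::int32_t s, std::uint32_t u) {
  return c ? static_cast<std::int64_t>(s) : static_cast<std::int64_t>(u);
}

// Select in the narrow type first, then widen once. Only one kind of
// extension (here zero-extension) can be applied to the selected value.
constexpr std::int64_t select_then_widen(bool c, std::int32_t s, std::uint32_t u) {
  return static_cast<std::int64_t>(c ? static_cast<std::uint32_t>(s) : u);
}

// Truncation only keeps the low bits, so the order does not matter.
constexpr std::uint16_t narrow_then_select(bool c, std::int32_t s, std::uint32_t u) {
  return c ? static_cast<std::uint16_t>(s) : static_cast<std::uint16_t>(u);
}

constexpr std::uint16_t select_then_narrow(bool c, std::int32_t s, std::uint32_t u) {
  return static_cast<std::uint16_t>(c ? static_cast<std::uint32_t>(s) : u);
}
```

For a negative signed arm the two widening variants disagree, while the two narrowing variants always produce the same low bits.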

Jakub



[wwwdocs PATCH] gcc-11.3: Mention -mharden-sls= and -mindirect-branch-cs-prefix

2022-02-16 Thread H.J. Lu via Gcc-patches
---
 htdocs/gcc-11/changes.html | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index fbd1b8ba..8e6d4ec8 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -1129,6 +1129,13 @@ are not listed here).
 no longer changes how they are passed nor returned.  This ABI change
 is now diagnosed with -Wpsabi.
   
+  Mitigation against straight line speculation (SLS) for function
+  return and indirect jump is supported via
+  -mharden-sls=[none|all|return|indirect-jmp].
+  
+  Add CS prefix to call and jmp to indirect thunk with branch target
+  in r8-r15 registers via -mindirect-branch-cs-prefix.
+  
 
 
 
-- 
2.35.1



Re: [PATCH] combine: Fix up -fcompare-debug issue in the combiner [PR104544]

2022-02-16 Thread Segher Boessenkool
On Wed, Feb 16, 2022 at 11:55:23AM +0100, Jakub Jelinek wrote:
> On Wed, Feb 16, 2022 at 04:44:58AM -0600, Segher Boessenkool wrote:
> > About half of the similar loops in combine.c are still broken this way,
> > from a quick sampling :-(
> 
> Looking for just NONDEBUG_INSN_P, I don't see any other than this.

Ah yes, I was confused by !NONDEBUG_INSN.  Too many inversions make my
head spin (NONDEBUG_INSN really means RTX_INSN && !DEBUG_INSN).

So everything looks fine here now.  Thanks for double checking!
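
To illustrate why terminating the scan at a debug insn makes the result depend on -g, here is a minimal standalone sketch. It is not the combine.cc code: the Insn struct, the Kind enum and the helper names are hypothetical simplifications (the real INSN_P also covers jump and call insns).

```cpp
#include <cstddef>

// Simplified instruction stream; all names here are illustrative only.
enum class Kind { Insn, DebugInsn, Note };

struct Insn {
  Kind kind;
  int block;         // basic block index
  bool links_to_i3;  // models a matching LOG_LINK entry
};

constexpr bool insn_p(const Insn& i) {
  return i.kind == Kind::Insn || i.kind == Kind::DebugInsn;
}
constexpr bool debug_insn_p(const Insn& i) { return i.kind == Kind::DebugInsn; }
// "Non-debug insn" means: is an insn and is not a debug insn.
constexpr bool nondebug_insn_p(const Insn& i) {
  return insn_p(i) && !debug_insn_p(i);
}

// Old loop shape: the first debug insn ends the search.
constexpr bool find_link_stopping(const Insn* insns, std::size_t n, int block) {
  for (std::size_t k = 0;
       k < n && nondebug_insn_p(insns[k]) && insns[k].block == block; ++k)
    if (insns[k].links_to_i3)
      return true;
  return false;
}

// New loop shape: debug insns are skipped, the search continues.
constexpr bool find_link_skipping(const Insn* insns, std::size_t n, int block) {
  for (std::size_t k = 0;
       k < n && insn_p(insns[k]) && insns[k].block == block; ++k) {
    if (debug_insn_p(insns[k]))
      continue;
    if (insns[k].links_to_i3)
      return true;
  }
  return false;
}

// The same code as it might look with and without debug info.
constexpr Insn with_debug[] = {{Kind::DebugInsn, 1, false}, {Kind::Insn, 1, true}};
constexpr Insn without_debug[] = {{Kind::Insn, 1, true}};
```

Only the skipping variant gives the same answer for both streams, which is the property -fcompare-debug checks.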


Segher


Re: [PATCH] [PATCH,v5,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack

2022-02-16 Thread Dan Li via Gcc-patches




On 2/15/22 10:02, Richard Sandiford wrote:
> Dan Li writes:
>> Shadow Call Stack can be used to protect the return address of a
>> function at runtime, and clang already supports this feature[1].
>
> Looks good, thanks.  However, when I bootstrap it on
> aarch64-linux-gnu I get:
>
> .../gcc/ubsan.cc: In function ‘bool ubsan_expand_null_ifn(gimple_stmt_iterator*)’:
> .../gcc/ubsan.cc:835:50: error: enumerated and non-enumerated type in conditional expression [-Werror=extra]
>    835 | = (flag_sanitize_recover & ((check_align ? SANITIZE_ALIGNMENT : 0)
>        |  ^~~~
> .../gcc/ubsan.cc:836:51: error: enumerated and non-enumerated type in conditional expression [-Werror=extra]
>    836 | | (check_null ? SANITIZE_NULL : 0)))
>        |~~~^~~
>
> I think this is because you're taking the last available bit
> of the enum :-)
>
> A hacky fix is to add "+ 0" to SANITIZE_ALIGNMENT and SANITIZE_NULL
> in the code quoted above (i.e. the code in the error messages).
> That seems slightly more robust than a cast to unsigned int (say),
> since "+ 0" will work even if the values become 64-bit quantities
> in future.
>
> Richard



Ah, I apologize for my mistake! I specified --disable-werror in
./configure from the beginning, so I didn't see this problem before.

As you suggested, I used the following patch:

diff --git a/gcc/ubsan.cc b/gcc/ubsan.cc
index 5641d3cc3be..a858994c841 100644
--- a/gcc/ubsan.cc
+++ b/gcc/ubsan.cc
@@ -832,8 +832,8 @@ ubsan_expand_null_ifn (gimple_stmt_iterator *gsip)
   else
 {
   enum built_in_function bcode
-   = (flag_sanitize_recover & ((check_align ? SANITIZE_ALIGNMENT : 0)
-   | (check_null ? SANITIZE_NULL : 0)))
+   = (flag_sanitize_recover & ((check_align ? SANITIZE_ALIGNMENT + 0 : 0)
+   | (check_null ? SANITIZE_NULL + 0 : 0)))
  ? BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1
  : BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_V1_ABORT;
   tree fn = builtin_decl_implicit (bcode);


It tested fine when compiling natively for x86_64; I will change it in the
next version.
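
The "+ 0" workaround can be illustrated in isolation with the following sketch. The enum and its values are hypothetical (the real sanitizer enum in GCC has many more members); the only property carried over from the discussion is that one enumerator uses the top bit.

```cpp
// Hypothetical flag enum; names and values are illustrative only.
enum sketch_sanitize_flags {
  SKETCH_SANITIZE_NULL = 1u << 0,
  SKETCH_SANITIZE_ALIGNMENT = 1u << 31  // uses the top bit of a 32-bit value
};

// Writing "cond ? ENUMERATOR : 0" mixes an enumeration type with a plain
// integer in one conditional expression, which is what the quoted
// -Werror=extra diagnostic complains about.  Adding "+ 0" turns the
// enumerator into an ordinary integer expression first, so both arms of
// the conditional have integer type, whatever width the enum ends up with.
constexpr unsigned recover_mask(bool check_align, bool check_null) {
  return (check_align ? SKETCH_SANITIZE_ALIGNMENT + 0 : 0)
         | (check_null ? SKETCH_SANITIZE_NULL + 0 : 0);
}
```

Unlike a cast to unsigned int, this form names no target type, which is the robustness argument made in the review.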

BTW:
The platform I'm using is x86-64, so I'm trying to find a way to
reproduce this issue when cross-compiling for aarch64, which I
haven't found so far; the issue only seems to happen with native
compilation.

But most of the code changes are for the aarch64 platform.
Is it enough for me to do the following tests before submitting
the patch?
1) A full build of gcc on the x86_64 platform
(make; make install; make bootstrap;)
2) Run all testsuites in an aarch64 cross-compile environment
(make -k check)


Thanks,
Dan


[committed] testsuite: Add testcase for already fixed PR [PR104448]

2022-02-16 Thread Jakub Jelinek via Gcc-patches
Hi!

This PR has been fixed with r12-7147-g2f9ab267e725ddf2.

Tested on x86_64-linux -m32/-m64, committed to trunk as obvious.

2022-02-16  Jakub Jelinek  

PR target/104448
* gcc.target/i386/pr104448.c: New test.

--- gcc/testsuite/gcc.target/i386/pr104448.c.jj 2022-02-16 17:02:45.172189326 +0100
+++ gcc/testsuite/gcc.target/i386/pr104448.c  2022-02-16 17:01:50.481951141 +0100
@@ -0,0 +1,9 @@
+/* PR target/104448 */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
+/* { dg-options "-mavx5124vnniw -mno-xsave -mabi=ms" } */
+
+int
+main ()
+{
+  return 0;
+}

Jakub



Re: [PATCH v7] c++: Add diagnostic when operator= is used as truth cond [PR25689]

2022-02-16 Thread Jason Merrill via Gcc-patches

On 2/16/22 02:16, Zhao Wei Liew wrote:

On Wed Feb 16, 2022 at 4:06 AM +08, Jason Merrill wrote:

Ah, I see. I found it a bit odd that gcc-commit-mklog auto-generated a
subject with "c:",
but I just went with it as I didn't know any better. Unfortunately, I
can't change it now on the current thread.


That came from this line in the testcase:

  > +/* PR c/25689 */

The PR should be c++/25689.  Also, sometimes the bugzilla component
isn't the same as the area of the compiler you're changing; the latter
is what you want in the patch subject, so that the right people know to
review it.


Oh, I see. Thanks for the explanation. I've fixed the line.


Ah, I didn't notice that. Sorry about that! I'm kinda new to the whole
mailing list setup so there are some kinks I have to iron out.


FWIW it's often easier to send the patch as an attachment.


Alright, I'll send patches as attachments instead. I originally sent
them as text as it is easier to comment on them.


It is a bit more of a hassle in this case because your mail sender 
doesn't mark the patch as text, but rather application/mbox or 
application/x-patch, so my mail reader for patch review (Thunderbird) 
doesn't display it inline.  I tried sending myself a patch through the 
gmail web interface, and it used text/x-patch, which is OK; what are you 
using to send?


Maybe renaming the file to .txt before sending would help?


+/* Test non-empty class */
+void f2(B b1, B b2)
+{
+ if (b1 = 0); /* { dg-warning "suggest parentheses" } */
+ if (b1 = 0.); /* { dg-warning "suggest parentheses" } */
+ if (b1 = b2); /* { dg-warning "suggest parentheses" } */
+ if (b1.operator= (0));
+
+ /* Ideally, we wouldn't warn for non-empty classes using trivial
+  operator= (below), but we currently do as it is a MODIFY_EXPR. */
+ // if (b1.operator= (b2));


You can avoid it by calling suppress_warning on that MODIFY_EXPR in
build_over_call.


Unfortunately, that also affects the warning for if (b1 = b2) just 5
lines above. Both expressions seem to generate the same tree structure.


True, you would need to put the call to suppress_warning in build_new_op 
around where CALL_EXPR_OPERATOR_SYNTAX is set.


Jason



Re: [pushed] aarch64: Tweak atomic-inst-cas.c options

2022-02-16 Thread Andrew Pinski via Gcc-patches
On Wed, Feb 16, 2022 at 2:25 AM Richard Sandiford via Gcc-patches
 wrote:
>
> atomic-inst-cas.c has code to skip __atomic_compare_exchange_n
> calls for invalid memory orderings, but -Winvalid-memory-model
> applies before the dead code is removed (which is the right
> behaviour IMO).  This patch therefore suppresses the warning
> for this test.

It is a bit more complex than that; see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104200#c3 for the reduced
testcase.
The arguments to __atomic_compare_exchange_n are only known to be
invalid after constant propagation, which is not done at -O0,
though the warning code does it itself.
So the warning does constant propagation of the arguments but does not
take into account whether the call is conditionally executed.
Most likely waccess should do something similar to what was done for the
uninitialized warnings in
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/589983.html .

Thanks,
Andrew Pinski

>
> Tested on aarch64-linux-gnu & pushed.
>
> Richard
>
>
> gcc/testsuite/
> * gcc.target/aarch64/atomic-inst-cas.c: Add
> -Wno-invalid-memory-model.
> ---
>  gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c b/gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c
> index f6f28922319..0b4533adade 100644
> --- a/gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c
> +++ b/gcc/testsuite/gcc.target/aarch64/atomic-inst-cas.c
> @@ -1,5 +1,7 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -march=armv8-a+lse" } */
> +/* -Winvalid-memory-model warnings are issued before the dead invalid calls
> +   are removed.  */
> +/* { dg-options "-O2 -march=armv8-a+lse -Wno-invalid-memory-model" } */
>
>  /* Test ARMv8.1-A CAS instruction.  */
>
> --
> 2.25.1


[PATCH] rs6000: Workaround for new ifcvt behavior [PR104335]

2022-02-16 Thread Robin Dapp via Gcc-patches
Hi,

since r12-6747-gaa8cfe785953a0 ifcvt not only passes real comparisons
but also "cc comparisons" (i.e. the representation of the result of a
comparison) to the backend.  rs6000_emit_int_cmove () is not prepared to
handle this.  Therefore, this patch makes it return false in such a case
in order to avoid an ICE.
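
As a rough standalone sketch of the bail-out described above (not the rs6000.cc code; the types and the function name below are hypothetical), the expander simply reports "cannot handle this" when the first operand of the comparison is already a condition-code value, leaving the caller to fall back to another strategy:

```cpp
// Illustrative model only; these types do not exist in GCC.
enum class ModeClass { Int, CC };

struct Operand {
  ModeClass mode_class;
};

struct Comparison {
  Operand first;
  Operand second;
};

// Returns true if a conditional move could be emitted, false if the
// caller has to use something else.
constexpr bool emit_int_cmove_sketch(const Comparison& op) {
  // A CC-mode first operand means the compare has already been done;
  // the code after this point expects a real comparison, so give up early.
  if (op.first.mode_class == ModeClass::CC)
    return false;
  // A real expander would emit the compare and the conditional move here.
  return true;
}
```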

I bootstrapped (with --enable-languages=all on P10,
--enable-languages="c, c++, fortran, go, lto, objc, obj-c++" otherwise)
and regtested on Power7, Power8, Power9 and Power10.

Testsuite is unchanged on P7 and P9.  On P8 I hit some different FAILs
vs master but they look unrelated and seem to be caused by "spawn
failed" i.e. out of memory or so.
On P10 I compared the testsuite of the last commit before the breaking
one (r12-6746-ge9ebb86799fd77, but commenting out a line that would
still result in a "-Wformat-diag" bootstrap error then) vs. the patched
master:  No regressions.

Is it OK?

Regards
 Robin

--

PR target/104335

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_int_cmove): Return false
if the expected comparison's first operand is of mode MODE_CC.

From b9f053bf266bd1518e0eac36509ebde57266 Mon Sep 17 00:00:00 2001
From: Robin Dapp 
Date: Thu, 10 Feb 2022 09:01:51 -0600
Subject: [PATCH] rs6000: Workaround for new ifcvt behavior [PR104335].

Since r12-6747-gaa8cfe785953a0 ifcvt passes a "cc comparison"
i.e. the representation of the result of a comparison to the
backend.  rs6000_emit_int_cmove () is not prepared to handle this.
Therefore, this patch makes it return false in such a case.

	PR target/104335

gcc/ChangeLog:

	* config/rs6000/rs6000.cc (rs6000_emit_int_cmove): Return false
	if the expected comparison's first operand is of mode MODE_CC.
---
 gcc/config/rs6000/rs6000.cc | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index eaba9a2d698..ebc5b0cefdc 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16175,6 +16175,12 @@ rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   if (mode != SImode && (!TARGET_POWERPC64 || mode != DImode))
 return false;
 
+  /* PR104335: We now need to expect CC-mode "comparisons"
+ coming from ifcvt.  The following code expects proper
+ comparisons so better abort here.  */
+  if (XEXP (op, 0) && GET_MODE_CLASS (GET_MODE (XEXP (op, 0))) == MODE_CC)
+return false;
+
   /* We still have to do the compare, because isel doesn't do a
  compare, it just looks at the CRx bits set by a previous compare
  instruction.  */
-- 
2.31.1



libbacktrace patch committed: Initialize DWARF 5 fields of unit

2022-02-16 Thread Ian Lance Taylor via Gcc-patches
When I added the DWARF 5 support to libbacktrace on 2019-12-13 I
forgot to initialize the new fields of the unit data structure.
Whoops.  Fixed with this patch.  Bootstrapped and ran libbacktrace and
Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian

* dwarf.c (build_address_map): Initialize DWARF 5 fields of unit.
ab59cb2055658a72fdccba0be76eeadd222ffef6
diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index c0bae0e501e..2158bc14065 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -2221,6 +2221,9 @@ build_address_map (struct backtrace_state *state, uintptr_t base_address,
   u->comp_dir = NULL;
   u->abs_filename = NULL;
   u->lineoff = 0;
+  u->str_offsets_base = 0;
+  u->addr_base = 0;
+  u->rnglists_base = 0;
 
   /* The actual line number mappings will be read as needed.  */
   u->lines = NULL;


Re: [PATCH] c++: NON_DEPENDENT_EXPR is not potentially constant [PR104507]

2022-02-16 Thread Patrick Palka via Gcc-patches
On Tue, 15 Feb 2022, Jason Merrill wrote:

> On 2/15/22 17:00, Patrick Palka wrote:
> > On Tue, 15 Feb 2022, Jason Merrill wrote:
> > 
> > > On 2/15/22 15:13, Patrick Palka wrote:
> > > > On Tue, 15 Feb 2022, Patrick Palka wrote:
> > > > 
> > > > > Here we're crashing from potential_constant_expression because it
> > > > > tries
> > > > > to perform trial evaluation of the first operand '(bool)__r' of the
> > > > > conjunction (which is overall wrapped in a NON_DEPENDENT_EXPR), but
> > > > > cxx_eval_constant_expression ICEs on unhandled trees (of which
> > > > > CAST_EXPR
> > > > > is one).
> > > > > 
> > > > > Since cxx_eval_constant_expression always treats NON_DEPENDENT_EXPR
> > > > > as non-constant, and since NON_DEPENDENT_EXPR is also opaque to
> > > > > instantiate_non_dependent_expr, it seems futile to have p_c_e_1 ever
> > > > > return true for NON_DEPENDENT_EXPR, so let's just instead return false
> > > > > and avoid recursing.
> > > 
> > > Well, in a template we use pce1 to decide whether to complain about
> > > something
> > > that needs to be constant but can't be.  We aren't trying to get a value
> > > yet.
> > 
> > Makes sense.. though for NON_DEPENDENT_EXPR in particular, ISTM this
> > tree is always used in a context where a constant expression isn't
> > required, e.g. in the build_x_* functions.
> 
> Fair enough.  The patch is OK with a comment to that effect.

Thanks, I committed the following as r12-7264:

-- >8 --

Subject: [PATCH] c++: treat NON_DEPENDENT_EXPR as not potentially constant
 [PR104507]

Here we're crashing from potential_constant_expression because it tries
to perform trial evaluation of the first operand '(bool)__r' of the
conjunction (which is overall wrapped in a NON_DEPENDENT_EXPR), but
cxx_eval_constant_expression ICEs on unsupported trees (of which CAST_EXPR
is one).  The sequence of events is:

  1. build_non_dependent_expr for the array subscript yields
 NON_DEPENDENT_EXPR<<<(bool)__r && __s>>> ? 1 : 2
  2. cp_build_array_ref calls fold_non_dependent_expr on this subscript
 (after this point, processing_template_decl is cleared)
  3. during which, the COND_EXPR case of tsubst_copy_and_build calls
 fold_non_dependent_expr on the first operand
  4. during which, we crash from p_c_e_1 because it attempts trial
 evaluation of the CAST_EXPR '(bool)__r'.

Note that even if this crash didn't happen, fold_non_dependent_expr
from cp_build_array_ref would still ultimately be one big no-op here
since neither constexpr evaluation nor tsubst handle NON_DEPENDENT_EXPR.

In light of this and of the observation that we should never see
NON_DEPENDENT_EXPR in a context where a constant expression is needed
(it's used primarily in the build_x_* family of functions), it seems
futile for p_c_e_1 to ever return true for NON_DEPENDENT_EXPR.  And the
otherwise inconsistent handling of NON_DEPENDENT_EXPR between p_c_e_1,
cxx_eval_constant_expression and tsubst apparently leads to weird
bugs such as this one.

PR c++/104507

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1)
<case NON_DEPENDENT_EXPR>: Return false instead of recursing.
Assert tf_error isn't set.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent21.C: New test.
---
 gcc/cp/constexpr.cc | 9 -
 gcc/testsuite/g++.dg/template/non-dependent21.C | 9 +
 2 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/template/non-dependent21.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 7274c3b760e..4716694cb71 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -9065,6 +9065,14 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
 case BIND_EXPR:
   return RECUR (BIND_EXPR_BODY (t), want_rval);
 
+case NON_DEPENDENT_EXPR:
+  /* Treat NON_DEPENDENT_EXPR as non-constant: it's not handled by
+constexpr evaluation or tsubst, so fold_non_dependent_expr can't
+do anything useful with it.  And we shouldn't see it in a context
+where a constant expression is strictly required, hence the assert.  */
+  gcc_checking_assert (!(flags & tf_error));
+  return false;
+
 case CLEANUP_POINT_EXPR:
 case MUST_NOT_THROW_EXPR:
 case TRY_CATCH_EXPR:
@@ -9072,7 +9080,6 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
 case EH_SPEC_BLOCK:
 case EXPR_STMT:
 case PAREN_EXPR:
-case NON_DEPENDENT_EXPR:
   /* For convenience.  */
 case LOOP_EXPR:
 case EXIT_EXPR:
diff --git a/gcc/testsuite/g++.dg/template/non-dependent21.C b/gcc/testsuite/g++.dg/template/non-dependent21.C
new file mode 100644
index 000..89900837b8b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/non-dependent21.C
@@ -0,0 +1,9 @@
+// PR c++/104507
+
+extern const char *_k_errmsg[];
+
+template
+const char* DoFoo(int __r, int __s) {
+  const char* n = _k_errmsg[(bool)__r && 

Re: [PATCH] c++: return-type-req in constraint using only outer tparms [PR104527]

2022-02-16 Thread Patrick Palka via Gcc-patches
On Tue, 15 Feb 2022, Jason Merrill wrote:

> On 2/14/22 11:32, Patrick Palka wrote:
> > Here the template context for the atomic constraint has two levels of
> > template arguments, but since it depends only on the innermost argument
> > T we use a single-level argument vector during substitution into the
> > constraint (built by get_mapped_args).  We eventually pass this vector
> > to do_auto_deduction as part of checking the return-type-requirement
> > inside the atom, but do_auto_deduction expects outer_targs to be a full
> > set of arguments for sake of satisfaction.
> 
> Could we note the current number of levels in the map and use that in
> get_mapped_args instead of the highest level parameter we happened to use?

Ah yeah, that seems to work nicely.  IIUC it should suffice to remember
whether the atomic constraint expression came from a concept definition.
If it did, then the depth of the argument vector returned by
get_mapped_args must be one, otherwise (as in the testcase) it must be
the same as the template depth of the constrained entity, which is the
depth of ARGS.

How does the following look?  Bootstrapped and regtested on
x86_64-pc-linux-gnu and also on cmcstl2 and range-v3.

-- >8 --

Subject: [PATCH] c++: return-type-req in constraint using only outer tparms
 [PR104527]

Here the template context for the atomic constraint has two levels of
template parameters, but since it depends only on the innermost parameter
T we use a single-level argument vector (built by get_mapped_args) during
substitution into the atom.  We eventually pass this vector to
do_auto_deduction as part of checking the return-type-requirement within
the atom, but do_auto_deduction expects outer_targs to be a full set of
arguments for sake of satisfaction.

This patch fixes this by making get_mapped_args always return an
argument vector whose depth corresponds to the template depth of the
context in which the atomic constraint expression was written, instead
of the highest parameter level that the expression happens to use.

PR c++/104527

gcc/cp/ChangeLog:

* constraint.cc (normalize_atom): Set
ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P appropriately.
(get_mapped_args):  Make static, adjust parameters.  Always
return a vector whose depth corresponds to the template depth of
the context of the atomic constraint expression.  Micro-optimize
by passing false as exact to safe_grow_cleared and by collapsing
a multi-level depth-one argument vector.
(satisfy_atom): Adjust call to get_mapped_args and
diagnose_atomic_constraint.
(diagnose_atomic_constraint): Replace map parameter with an args
parameter.
* cp-tree.h (ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P): Define.
(get_mapped_args): Remove declaration.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-return-req4.C: New test.
---
 gcc/cp/constraint.cc  | 64 +++
 gcc/cp/cp-tree.h  |  7 +-
 .../g++.dg/cpp2a/concepts-return-req4.C   | 24 +++
 3 files changed, 69 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-return-req4.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 12db7e5cf14..306e28955c6 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -764,6 +764,8 @@ normalize_atom (tree t, tree args, norm_info info)
   tree ci = build_tree_list (t, info.context);
 
   tree atom = build1 (ATOMIC_CONSTR, ci, map);
+  if (info.in_decl && concept_definition_p (info.in_decl))
+ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P (atom) = true;
   if (!info.generate_diagnostics ())
 {
   /* Cache the ATOMIC_CONSTRs that we return, so that sat_hasher::equal
@@ -2826,33 +2828,37 @@ satisfaction_value (tree t)
 return boolean_true_node;
 }
 
-/* Build a new template argument list with template arguments corresponding
-   to the parameters used in an atomic constraint.  */
+/* Build a new template argument vector according to the parameter
+   mapping of the atomic constraint T, using arguments from ARGS.  */
 
-tree
-get_mapped_args (tree map)
+static tree
+get_mapped_args (tree t, tree args)
 {
+  tree map = ATOMIC_CONSTR_MAP (t);
+
   /* No map, no arguments.  */
   if (!map)
 return NULL_TREE;
 
-  /* Find the mapped parameter with the highest level.  */
-  int count = 0;
-  for (tree p = map; p; p = TREE_CHAIN (p))
-{
-  int level;
-  int index;
-  template_parm_level_and_index (TREE_VALUE (p), &level, &index);
-  if (level > count)
-count = level;
-}
+  /* Determine the depth of the resulting argument vector.  */
+  int depth;
+  if (ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P (t))
+/* The expression of this atomic constraint comes from a concept definition,
+   whose template depth is always one, so the resulting argument vector
+   will also have depth one.  */
+depth = 1;
+  else
+/* Otherwise, the e

Re: libgo patch committed: Update to Go1.18beta2 release

2022-02-16 Thread Ian Lance Taylor via Gcc-patches
On Tue, Feb 15, 2022 at 1:19 AM Eric Botcazou  wrote:
>
> > I've committed a change to update libgo to the Go1.18beta2 release.
>
> This apparently broke the build on SPARC/Solaris 11.3:

I've committed this patch to fix these problems.  Bootstrapped and ran
Go testsuite on x86_64-pc-linux-gnu and x86_64-solaris.

Ian

p
24ca97325cab7bc454c785d55f37120fe7ea6f74
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 745132a3d9d..3742414c828 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-0af68c0552341a44f1fb12301f9eff954b9dde88
+3742e8a154bfec805054b4ebf0809f12dc7694da
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/net/fcntl_libc_test.go b/libgo/go/net/fcntl_libc_test.go
index f59a1aa33ba..c935c4540cf 100644
--- a/libgo/go/net/fcntl_libc_test.go
+++ b/libgo/go/net/fcntl_libc_test.go
@@ -6,7 +6,10 @@
 
 package net
 
-import "syscall"
+import (
+   "syscall"
+   _ "unsafe"
+)
 
 // Use a helper function to call fcntl.  This is defined in C in
 // libgo/runtime.
diff --git a/libgo/go/os/signal/internal/pty/pty.go b/libgo/go/os/signal/internal/pty/pty.go
index e5ee3f6dc01..01c3908becf 100644
--- a/libgo/go/os/signal/internal/pty/pty.go
+++ b/libgo/go/os/signal/internal/pty/pty.go
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-//go:build (aix || darwin || dragonfly || freebsd || hurd || (linux && !android) || netbsd || openbsd) && cgo
+//go:build (aix || darwin || dragonfly || freebsd || hurd || (linux && !android) || netbsd || openbsd || solaris) && cgo
 
 // Package pty is a simple pseudo-terminal package for Unix systems,
 // implemented by calling C functions via cgo.
diff --git a/libgo/go/runtime/os3_solaris.go b/libgo/go/runtime/os3_solaris.go
index ec23ce2cc0c..6c825746fbc 100644
--- a/libgo/go/runtime/os3_solaris.go
+++ b/libgo/go/runtime/os3_solaris.go
@@ -36,6 +36,14 @@ func solarisExecutablePath() string {
return executablePath
 }
 
+func setProcessCPUProfiler(hz int32) {
+   setProcessCPUProfilerTimer(hz)
+}
+
+func setThreadCPUProfiler(hz int32) {
+   setThreadCPUProfilerHz(hz)
+}
+
 //go:nosplit
 func validSIGPROF(mp *m, c *sigctxt) bool {
return true
diff --git a/libgo/go/runtime/stubs2.go b/libgo/go/runtime/stubs2.go
index 0b9e60587e1..587109209d1 100644
--- a/libgo/go/runtime/stubs2.go
+++ b/libgo/go/runtime/stubs2.go
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-//go:build !aix && !darwin && !js && !openbsd && !plan9 && !solaris && !windows
+//go:build !js && !plan9 && !windows
 
 package runtime
 
diff --git a/libgo/go/syscall/exec_bsd.go b/libgo/go/syscall/exec_bsd.go
index c05ae138811..ff88bc45366 100644
--- a/libgo/go/syscall/exec_bsd.go
+++ b/libgo/go/syscall/exec_bsd.go
@@ -143,13 +143,13 @@ func forkAndExecInChild(argv0 *byte, argv, envv []*byte, chroot, dir *byte, attr
// User and groups
if cred := sys.Credential; cred != nil {
ngroups := len(cred.Groups)
-   var groups *Gid_t
+   var groups unsafe.Pointer
if ngroups > 0 {
gids := make([]Gid_t, ngroups)
for i, v := range cred.Groups {
gids[i] = Gid_t(v)
}
-   groups = &gids[0]
+   groups = unsafe.Pointer(&gids[0])
}
if !cred.NoSetGroups {
err1 = raw_setgroups(ngroups, groups)
diff --git a/libgo/go/syscall/export_unix_test.go b/libgo/go/syscall/export_unix_test.go
index 184eb84c0b1..bd904c70f36 100644
--- a/libgo/go/syscall/export_unix_test.go
+++ b/libgo/go/syscall/export_unix_test.go
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-//go:build dragonfly || freebsd || hurd || linux || netbsd || openbsd
+//go:build dragonfly || freebsd || hurd || linux || netbsd || openbsd || solaris
 
 package syscall
 
diff --git a/libgo/go/syscall/syscall_solaris.go b/libgo/go/syscall/syscall_solaris.go
index 13c60a493d9..673ba8223fc 100644
--- a/libgo/go/syscall/syscall_solaris.go
+++ b/libgo/go/syscall/syscall_solaris.go
@@ -6,8 +6,6 @@ package syscall
 
 import "unsafe"
 
-const _F_DUP2FD_CLOEXEC = F_DUP2FD_CLOEXEC
-
 func (ts *Timestruc) Unix() (sec int64, nsec int64) {
return int64(ts.Sec), int64(ts.Nsec)
 }


[PATCH] c++: double non-dep folding from finish_compound_literal [PR104565]

2022-02-16 Thread Patrick Palka via Gcc-patches
In finish_compound_literal, we perform non-dependent expr folding before
calling check_narrowing (ever since r9-5973).  But ever since r10-7096,
check_narrowing also performs non-dependent expr folding of its own.
This double folding causes tsubst to see non-templated trees during the
second folding, which causes a spurious error in the below testcase.

This patch removes this first folding operation; it now seems obviated
by the second one.
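
For readers without the testcase at hand, here is a minimal sketch of the kind of code involved: a scalar compound literal inside a template whose operand is a non-dependent call. It mirrors the new test further down, with two assumptions: the template parameter list (`typename T`) is a guess, and `call_f` is a hypothetical helper added only to force an instantiation.

```cpp
#include <cassert>

struct apa {
  constexpr int n() const { return 3; }
};

// Assumption: the parameter list below is illustrative only.
template<typename T>
int f() {
  apa foo;              // non-dependent: the type does not involve T
  return int{foo.n()};  // scalar compound literal, checked for narrowing
}

// Hypothetical helper that instantiates f so its body gets substituted.
inline int call_f() {
  return f<int>();
}
```

With the fix applied this is expected to compile cleanly, and `call_f()` evaluates to 3.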

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/10/11?

PR c++/104565

gcc/cp/ChangeLog:

* semantics.cc (finish_compound_literal): Don't perform
non-dependent expr folding before calling check_narrowing.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent22.C: New test.
---
 gcc/cp/semantics.cc | 10 +++---
 gcc/testsuite/g++.dg/template/non-dependent22.C | 12 
 2 files changed, 15 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/non-dependent22.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 0cb17a6a8ab..114baa48710 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -3203,13 +3203,9 @@ finish_compound_literal (tree type, tree compound_literal,
 return error_mark_node;
   compound_literal = reshape_init (type, compound_literal, complain);
   if (SCALAR_TYPE_P (type)
-  && !BRACE_ENCLOSED_INITIALIZER_P (compound_literal))
-{
-  tree t = instantiate_non_dependent_expr_sfinae (compound_literal,
- complain);
-  if (!check_narrowing (type, t, complain))
-   return error_mark_node;
-}
+  && !BRACE_ENCLOSED_INITIALIZER_P (compound_literal)
+  && !check_narrowing (type, compound_literal, complain))
+return error_mark_node;
   if (TREE_CODE (type) == ARRAY_TYPE
   && TYPE_DOMAIN (type) == NULL_TREE)
 {
diff --git a/gcc/testsuite/g++.dg/template/non-dependent22.C b/gcc/testsuite/g++.dg/template/non-dependent22.C
new file mode 100644
index 000..83a6a13f15b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/non-dependent22.C
@@ -0,0 +1,12 @@
+// PR c++/104565
+// { dg-do compile { target c++11 } }
+
+struct apa {
+  constexpr int n() const { return 3; }
+};
+
+template
+int f() {
+  apa foo;
+  return int{foo.n()};  // no matching function for call to 'apa::n(apa*)'
+}
-- 
2.35.1.129.gb80121027d



[PATCH] libgomp : OMPD implementation

2022-02-16 Thread Mohamed Atef via Gcc-patches
Hi,
I am sorry that the previous patch was buggy.
This patch contains the header files and source files for the functions
specified in sections 5.1, 5.2, 5.3, 5.4, 5.5.1, and 5.5.2 of the OpenMP
Application Programming Interface specification.  The functions were tested
using the gdb plugin, and the results are correct.
Please review this patch and reply to us.

2022-02-16  Mohamed Atef  

* Makefile.am (toolexeclib_LTLIBRARIES): Add libgompd.la.
(libgompd_la_LDFLAGS, libgompd_la_DEPENDENCIES, libgompd_la_LINK,
libgompd_la_SOURCES, libgompd_version_dep, libgompd_version_script,
libgompd.ver-sun, libgompd.ver, libgompd_version_info): Defined.
* Makefile.in: Regenerate.
* aclocal.m4: Regenerate.
* config/darwin/plugin-suffix.h: Removed ().
* config/hpux/plugin-suffix.h: Removed ().
* config/posix/plugin-suffix.h: Removed ().
* configure: Regenerate.
* env.c: (#include "ompd-support.h") : Added.
(initialize_env) : Call ompd_load().
* parallel.c:(#include "ompd-support.h"): Added.
(GOMP_parallel) : Call ompd_bp_parallel_begin and
ompd_bp_parallel_end.
* libgomp.map: Add OMP_5.0.3 symbol versions.
* libgompd.map: New file.
* omp-tools.h.in : New file.
* omp-types.h.in : New file.
* ompd-support.h : New file.
* ompd-support.c : New file.
* ompd-helper.h : New file.
* ompd-helper.c: New file.
* ompd-init.c: New file.
* testsuite/Makefile.in: Regenerate.
diff --git a/configure b/configure
index 9c2d7df1bb2..c270ea34098 100755
--- a/configure
+++ b/configure
@@ -766,6 +766,7 @@ infodir
 docdir
 oldincludedir
 includedir
+runstatedir
 localstatedir
 sharedstatedir
 sysconfdir
@@ -936,6 +937,7 @@ datadir='${datarootdir}'
 sysconfdir='${prefix}/etc'
 sharedstatedir='${prefix}/com'
 localstatedir='${prefix}/var'
+runstatedir='${localstatedir}/run'
 includedir='${prefix}/include'
 oldincludedir='/usr/include'
 docdir='${datarootdir}/doc/${PACKAGE}'
@@ -1188,6 +1190,15 @@ do
   | -silent | --silent | --silen | --sile | --sil)
 silent=yes ;;
 
+  -runstatedir | --runstatedir | --runstatedi | --runstated \
+  | --runstate | --runstat | --runsta | --runst | --runs \
+  | --run | --ru | --r)
+ac_prev=runstatedir ;;
+  -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
+  | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
+  | --run=* | --ru=* | --r=*)
+runstatedir=$ac_optarg ;;
+
   -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
 ac_prev=sbindir ;;
   -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
@@ -1325,7 +1336,7 @@ fi
 for ac_var in  exec_prefix prefix bindir sbindir libexecdir datarootdir \
datadir sysconfdir sharedstatedir localstatedir includedir \
oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
-   libdir localedir mandir
+   libdir localedir mandir runstatedir
 do
   eval ac_val=\$$ac_var
   # Remove trailing slashes.
@@ -1485,6 +1496,7 @@ Fine tuning of the installation directories:
   --sysconfdir=DIRread-only single-machine data [PREFIX/etc]
   --sharedstatedir=DIRmodifiable architecture-independent data [PREFIX/com]
   --localstatedir=DIR modifiable single-machine data [PREFIX/var]
+  --runstatedir=DIR   modifiable per-process data [LOCALSTATEDIR/run]
   --libdir=DIRobject code libraries [EPREFIX/lib]
   --includedir=DIRC header files [PREFIX/include]
   --oldincludedir=DIR C header files for non-gcc [/usr/include]
 
* testsuite/libgomp.fortran/allocate-1.f90: Remove spurious
diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index f8b2a06d63e..22a27df105e 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -20,7 +20,7 @@ AM_CPPFLAGS = $(addprefix -I, $(search_path))
 AM_CFLAGS = $(XCFLAGS)
 AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)
 
-toolexeclib_LTLIBRARIES = libgomp.la
+toolexeclib_LTLIBRARIES = libgomp.la libgompd.la
 nodist_toolexeclib_HEADERS = libgomp.spec
 
 if LIBGOMP_BUILD_VERSIONED_SHLIB
@@ -32,13 +32,21 @@ libgomp.ver: $(top_srcdir)/libgomp.map
$(EGREP) -v '#(#| |$$)' $< | \
  $(PREPROCESS) -P -include config.h - > $@ || (rm -f $@ ; exit 1)
 
+libgompd.ver: $(top_srcdir)/libgompd.map
+   $(EGREP) -v '#(#| |$$)' $< | \
+   $(PREPROCESS) -P -include config.h - > $@ || (rm -f $@ ; exit 1)
+
 if LIBGOMP_BUILD_VERSIONED_SHLIB_GNU
 libgomp_version_script = -Wl,--version-script,libgomp.ver
+libgompd_version_script = -Wl,--version-script,libgompd.ver
 libgomp_version_dep = libgomp.ver
+libgompd_version_dep = libgompd.ver
 endif
 if LIBGOMP_BUILD_VERSIONED_SHLIB_SUN
 libgomp_version_script = -Wl,-M,libgomp.ver-sun
+libgompd_version_script = -Wl,-M,libgompd.ver-sun
 libgomp_version_dep = libgomp.ver-sun
+libgompd_version_dep = libgompd.ver-sun
 libgomp.ver-sun : libgomp.

Re: [PATCH] libgomp : OMPD implementation

2022-02-16 Thread Mohamed Atef via Gcc-patches
Sorry, I forgot to uncomment two lines;
here is the patch again.

Thanks
Mohamed

On Wed, Feb 16, 2022 at 10:54 PM Mohamed Atef 
wrote:

> HI,
> I am sorry that the previous patch was buggy.
> This patch contains the header files and source files of functions that
> are specified in OpenMP Application ProgrammingInterface book from sections
> (5.1, 5.2, 5.3, 5.4, 5.5.1, 5.5.2) the functions are tested using the gdb
> plugin and the results are correct.
> Please Review this Patch and reply to us.
>
> 2022-02-16  Mohamed Atef  
>
> * Makefile.am (toolexeclib_LTLIBRARIES): Add libgompd.la.
> (libgompd_la_LDFLAGS, libgompd_la_DEPENDENCIES, libgompd_la_LINK,
> libgompd_la_SOURCES, libgompd_version_dep, libgompd_version_script,
> libgompd.ver-sun, libgompd.ver, libgompd_version_info): Defined.
> * Makefile.in: Regenerate.
> * aclocal.m4: Regenerate.
> * config/darwin/plugin-suffix.h: Removed ().
> * config/hpux/plugin-suffix.h: Removed ().
> * config/posix/plugin-suffix.h: Removed ().
> * configure: Regenerate.
> * env.c: (#include "ompd-support.h") : Added.
> (initialize_env) : Call ompd_load().
> * parallel.c:(#include "ompd-support.h"): Added.
> (GOMP_parallel) : Call ompd_bp_parallel_begin and
> ompd_bp_parallel_end.
> * libgomp.map: Add OMP_5.0.3 symobl versions.
> * libgompd.map: New file.
> * omp-tools.h.in : New file.
> * omp-types.h.in : New file.
> * ompd-support.h : New file.
> * ompd-support.c : New file.
> * ompd-helper.h : New file.
> * ompd-helper.c: New file.
> * ompd-init.c: New file.
> * testsuite/Makfile.in: Regenerate.
>
>
>
>
diff --git a/configure b/configure
index 9c2d7df1bb2..c270ea34098 100755
--- a/configure
+++ b/configure
@@ -766,6 +766,7 @@ infodir
 docdir
 oldincludedir
 includedir
+runstatedir
 localstatedir
 sharedstatedir
 sysconfdir
@@ -936,6 +937,7 @@ datadir='${datarootdir}'
 sysconfdir='${prefix}/etc'
 sharedstatedir='${prefix}/com'
 localstatedir='${prefix}/var'
+runstatedir='${localstatedir}/run'
 includedir='${prefix}/include'
 oldincludedir='/usr/include'
 docdir='${datarootdir}/doc/${PACKAGE}'
@@ -1188,6 +1190,15 @@ do
   | -silent | --silent | --silen | --sile | --sil)
 silent=yes ;;
 
+  -runstatedir | --runstatedir | --runstatedi | --runstated \
+  | --runstate | --runstat | --runsta | --runst | --runs \
+  | --run | --ru | --r)
+ac_prev=runstatedir ;;
+  -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
+  | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
+  | --run=* | --ru=* | --r=*)
+runstatedir=$ac_optarg ;;
+
   -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
 ac_prev=sbindir ;;
   -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
@@ -1325,7 +1336,7 @@ fi
 for ac_var in  exec_prefix prefix bindir sbindir libexecdir datarootdir \
datadir sysconfdir sharedstatedir localstatedir includedir \
oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
-   libdir localedir mandir
+   libdir localedir mandir runstatedir
 do
   eval ac_val=\$$ac_var
   # Remove trailing slashes.
@@ -1485,6 +1496,7 @@ Fine tuning of the installation directories:
   --sysconfdir=DIRread-only single-machine data [PREFIX/etc]
   --sharedstatedir=DIRmodifiable architecture-independent data [PREFIX/com]
   --localstatedir=DIR modifiable single-machine data [PREFIX/var]
+  --runstatedir=DIR   modifiable per-process data [LOCALSTATEDIR/run]
   --libdir=DIRobject code libraries [EPREFIX/lib]
   --includedir=DIRC header files [PREFIX/include]
   --oldincludedir=DIR C header files for non-gcc [/usr/include]
diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 7905565c420..be4ce1dbe12 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,30 @@
+2022-02-16  Mohamed Atef  
+
+* Makefile.am (toolexeclib_LTLIBRARIES): Add libgompd.la.
+(libgompd_la_LDFLAGS, libgompd_la_DEPENDENCIES, libgompd_la_LINK,
+libgompd_la_SOURCES, libgompd_version_dep, libgompd_version_script,
+libgompd.ver-sun, libgompd.ver, libgompd_version_info): Defined. 
+* Makefile.in: Regenerate.
+* aclocal.m4: Regenerate.
+* config/darwin/plugin-suffix.h: Removed ().
+* config/hpux/plugin-suffix.h: Removed ().
+* config/posix/plugin-suffix.h: Removed ().
+* configure: Regenerate.
+* env.c: (#include "ompd-support.h") : Added.
+(initialize_env) : Call ompd_load().
+* parallel.c:(#include "ompd-support.h"): Added.
+(GOMP_parallel) : Call ompd_bp_parallel_begin and ompd_bp_parallel_end.
+* libgomp.map: Add OMP_5.0.3 symobl versions.
+* libgompd.map: New file.
+* omp-tools.h.in : New file.
+* omp-types.h.in : New fi

[PATCH] PR fortran/104573 - ICE in resolve_structure_cons, at fortran/resolve.cc:1299

2022-02-16 Thread Harald Anlauf via Gcc-patches
Dear Fortranners,

while we detect invalid uses of type(*), we may run into other issues
later when the declared variable is used, leading to an ICE due to a
NULL pointer dereference.  This is demonstrated by Gerhard's testcase.
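
The shape of the fix is an ordinary null guard before walking a component list. A minimal sketch of that pattern follows; the types and names (`component`, `derived_type`, `expr_info`, `count_components`) are hypothetical stand-ins, not gfortran's real data structures.

```cpp
#include <cassert>

// Hypothetical stand-ins for the front-end structures.
struct component {
  component *next;
};

struct derived_type {
  component *components;
};

struct expr_info {
  derived_type *derived;  // may be null after an invalid declaration
};

// Returns false instead of dereferencing a null derived type;
// otherwise stores the number of components in COUNT.
inline bool count_components(const expr_info &e, int &count) {
  if (!e.derived)
    return false;  // error recovery: nothing valid to resolve
  count = 0;
  for (component *c = e.derived->components; c; c = c->next)
    ++count;
  return true;
}
```

The caller can then treat a false return as "already diagnosed" and stop, which is roughly what the early `return false` in the patch does.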

Steve and I came to rather similar fixes, see PR.  Mine is attached.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From 01d629506edca711f02912e2cc124f8894cfa389 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 16 Feb 2022 22:13:02 +0100
Subject: [PATCH] Fortran: error recovery after invalid assumed type
 declaration

gcc/fortran/ChangeLog:

	PR fortran/104573
	* resolve.cc (resolve_structure_cons): Avoid NULL pointer
	dereference when there is no valid component.

gcc/testsuite/ChangeLog:

	PR fortran/104573
	* gfortran.dg/assumed_type_14.f90: New test.
---
 gcc/fortran/resolve.cc|  8 +---
 gcc/testsuite/gfortran.dg/assumed_type_14.f90 | 12 
 2 files changed, 17 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/assumed_type_14.f90

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 266e41e25b1..2fa1acdbd6d 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -1288,15 +1288,17 @@ resolve_structure_cons (gfc_expr *expr, int init)
 	}
 }

-  cons = gfc_constructor_first (expr->value.constructor);
-
   /* A constructor may have references if it is the result of substituting a
  parameter variable.  In this case we just pull out the component we
  want.  */
   if (expr->ref)
 comp = expr->ref->u.c.sym->components;
-  else
+  else if (expr->ts.u.derived)
 comp = expr->ts.u.derived->components;
+  else
+return false;
+
+  cons = gfc_constructor_first (expr->value.constructor);

   for (; comp && cons; comp = comp->next, cons = gfc_constructor_next (cons))
 {
diff --git a/gcc/testsuite/gfortran.dg/assumed_type_14.f90 b/gcc/testsuite/gfortran.dg/assumed_type_14.f90
new file mode 100644
index 000..6cfe2e4fb73
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/assumed_type_14.f90
@@ -0,0 +1,12 @@
+! { dg-do compile }
+! PR fortran/104573 - ICE in resolve_structure_cons
+! Contributed by G.Steinmetz
+
+program p
+  type t
+  end type
+  type(*), parameter :: x = t() ! { dg-error "Assumed type of variable" }
+  print *, x
+end
+
+! { dg-prune-output "Cannot convert" }
--
2.34.1



Re: [committed] d: Merge upstream dmd 52844d4b1, druntime dbd0c874, phobos 896b1d0e1.

2022-02-16 Thread Rainer Orth
Hi Iain,

> This patch merges the D front-end implementation with upstream dmd
> 52844d4b1, as well as the D runtime libraries with druntime dbd0c874,
> and phobos 896b1d0e1, including the latest features and bug-fixes ahead of 
> the 2.099.0-beta1 release.

this patch broke Solaris bootstrap:

/vol/gcc/src/hg/master/local/libphobos/libdruntime/core/sys/posix/sys/ipc.d:193:5: error: static assert:  "Unsupported platform"
  193 | static assert(false, "Unsupported platform");
  | ^

The attached patch fixes this.  Tested on i386-pc-solaris2.11 and
sparc-sun-solaris2.11.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


diff --git a/libphobos/libdruntime/core/sys/posix/sys/ipc.d b/libphobos/libdruntime/core/sys/posix/sys/ipc.d
--- a/libphobos/libdruntime/core/sys/posix/sys/ipc.d
+++ b/libphobos/libdruntime/core/sys/posix/sys/ipc.d
@@ -188,6 +188,31 @@ else version (DragonFlyBSD)
 enum IPC_SET= 1;
 enum IPC_STAT   = 2;
 }
+else version (Solaris)
+{
+struct ipc_perm
+{
+	uid_t   uid;
+	gid_t   gid;
+	uid_t   cuid;
+	gid_t   cgid;
+	mode_t  mode;
+	uint    seq;
+	key_t   key;
+version (D_LP64) {} else
+	int[4] pad;
+}
+
+enum IPC_CREAT  = 0x0200;
+enum IPC_EXCL   = 0x0400;
+enum IPC_NOWAIT = 0x0800;
+
+enum key_t IPC_PRIVATE = 0;
+
+enum IPC_RMID   = 10;
+enum IPC_SET= 11;
+enum IPC_STAT   = 12;
+}
 else
 {
 static assert(false, "Unsupported platform");
@@ -233,6 +258,10 @@ else version (CRuntime_UClibc)
 {
 key_t ftok(const scope char*, int);
 }
+else version (Solaris)
+{
+key_t ftok(const scope char*, int);
+}
 else
 {
 static assert(false, "Unsupported platform");


[PATCH, V3] PR target/99708- Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-02-16 Thread Michael Meissner via Gcc-patches
[PATCH, V3] Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__.

Define the sizes of the PowerPC specific types __float128 and __ibm128 if those
types are enabled.

This patch will define __SIZEOF_IBM128__ and __SIZEOF_FLOAT128__ if their
respective types are created in the compiler.  Currently, this means both of
these will be defined if float128 support is enabled.  But at some point in
the future, __ibm128 could be enabled without enabling float128 support and
__SIZEOF_IBM128__ would be defined.
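
As an illustration of how user code might consume these macros, here is a small sketch that tests for each type independently instead of inferring one from the other. The helper names `float128_size` and `ibm128_size` are made up for this example; only the two `__SIZEOF_*__` macros come from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Size of __float128 in bytes, or 0 if the compiler does not provide it.
constexpr std::size_t float128_size() {
#ifdef __SIZEOF_FLOAT128__
  return __SIZEOF_FLOAT128__;
#else
  return 0;
#endif
}

// Size of __ibm128 in bytes, or 0 if the compiler does not provide it.
constexpr std::size_t ibm128_size() {
#ifdef __SIZEOF_IBM128__
  return __SIZEOF_IBM128__;
#else
  return 0;
#endif
}

#ifdef __SIZEOF_FLOAT128__
// Only name the type when the macro says it exists.
static_assert(float128_size() == sizeof(__float128),
              "macro and type size should agree");
#endif
```

On a target without either type both helpers return 0, so the same source builds everywhere.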

I have tested this on a little endian power9 system and there were no
regressions.  I did verify by hand that if I compile with -mno-vsx, that
__SIZEOF_IBM128__ is not defined.  Can I check this into the master branch?
Ideally, it should also be backported to GCC 11 and 10.

2022-02-16  Michael Meissner  

gcc/
PR target/99708
* config/rs6000/rs6000-c.cc (rs6000_cpu_cpp_builtins): Define
__SIZEOF_IBM128__ if the IBM 128-bit long double type is created.
Define __SIZEOF_FLOAT128__ if we have float128 support.

gcc/testsuite/
PR target/99708
* gcc.target/powerpc/pr99708.c: New test.
---
 gcc/config/rs6000/rs6000-c.cc  |  7 ++-
 gcc/testsuite/gcc.target/powerpc/pr99708.c | 21 +
 2 files changed, 27 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99708.c

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 15251efc209..ec4e5c3f53a 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -622,8 +622,13 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfile)
 builtin_define ("__RSQRTE__");
   if (TARGET_FRSQRTES)
 builtin_define ("__RSQRTEF__");
+  if (ibm128_float_type_node)
+builtin_define ("__SIZEOF_IBM128__=16");
   if (TARGET_FLOAT128_TYPE)
-builtin_define ("__FLOAT128_TYPE__");
+{
+  builtin_define ("__FLOAT128_TYPE__");
+  builtin_define ("__SIZEOF_FLOAT128__=16");
+}
 #ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
   builtin_define ("__BUILTIN_CPU_SUPPORTS__");
 #endif
diff --git a/gcc/testsuite/gcc.target/powerpc/pr99708.c b/gcc/testsuite/gcc.target/powerpc/pr99708.c
new file mode 100644
index 000..d478f7bc4c0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr99708.c
@@ -0,0 +1,21 @@
+/* { dg-do run } */
+/* { require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mvsx -mfloat128" } */
+
+/*
+ * PR target/99708
+ *
+ * Verify that __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__ are properly defined.
+ */
+
+#include <stdlib.h>
+
+int main (void)
+{
+  if (__SIZEOF_FLOAT128__ != sizeof (__float128)
+  || __SIZEOF_IBM128__ != sizeof (__ibm128))
+abort ();
+
+  return 0;
+}
+
-- 
2.35.1


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH] rs6000: Workaround for new ifcvt behavior [PR104335]

2022-02-16 Thread Segher Boessenkool
Hi!

On Wed, Feb 16, 2022 at 08:11:17PM +0100, Robin Dapp wrote:
> since r12-6747-gaa8cfe785953a0 ifcvt not only passes real comparisons
> but also "cc comparisons" (i.e. the representation of the result of a
> comparison) to the backend.  rs6000_emit_int_cmove () is not prepared to
> handle this.  Therefore, this patch makes it return false in such a case
> in order to avoid an ICE.

> On P10 I compared the testsuite of the last commit before the breaking
> one (r12-6746-ge9ebb86799fd77, but commenting out a line that would
> still result in a "-Wformat-diag" bootstrap error then)

I have used --disable-werror for weeks already :-(

>   PR target/104335
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_emit_int_cmove): Return false
>   if the expected comparison's first operand is of mode MODE_CC.

Please send patches as plain text, not as base64.

> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -16175,6 +16175,12 @@ rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>if (mode != SImode && (!TARGET_POWERPC64 || mode != DImode))
>  return false;
>  
> +  /* PR104335: We now need to expect CC-mode "comparisons"
> + coming from ifcvt.  The following code expects proper
> + comparisons so better abort here.  */
> +  if (XEXP (op, 0) && GET_MODE_CLASS (GET_MODE (XEXP (op, 0))) == MODE_CC)
> +return false;

Why that first test?  XEXP (op, 0) is required to not be nil.

The patch is okay without that (if it passes testing of course :-) )
Thanks!


Segher


Re: [PATCH, V3] PR target/99708- Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-02-16 Thread Segher Boessenkool
Hi!

On Wed, Feb 16, 2022 at 06:03:53PM -0500, Michael Meissner wrote:
> [PATCH, V3] Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__.
> 
> Define the sizes of the PowerPC specific types __float128 and __ibm128 if those
> types are enabled.
> 
> This patch will define __SIZEOF_IBM128__ and __SIZEOF_FLOAT128__ if their
> respective types are created in the compiler.

> gcc/
>   PR target/99708
>   * config/rs6000/rs6000-c.cc (rs6000_cpu_cpp_builtins): Define
>   __SIZEOF_IBM128__ if the IBM 128-bit long double type is created.
>   Define __SIZEOF_FLOAT128__ if we have float128 support.

> --- a/gcc/config/rs6000/rs6000-c.cc
> +++ b/gcc/config/rs6000/rs6000-c.cc
> @@ -622,8 +622,13 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfile)
>  builtin_define ("__RSQRTE__");
>if (TARGET_FRSQRTES)
>  builtin_define ("__RSQRTEF__");
> +  if (ibm128_float_type_node)
> +builtin_define ("__SIZEOF_IBM128__=16");
>if (TARGET_FLOAT128_TYPE)
> -builtin_define ("__FLOAT128_TYPE__");
> +{
> +  builtin_define ("__FLOAT128_TYPE__");
> +  builtin_define ("__SIZEOF_FLOAT128__=16");
> +}

  if (TARGET_FLOAT128_TYPE)
builtin_define ("__FLOAT128_TYPE__");

  if (float128_type_node)
builtin_define ("__SIZEOF_FLOAT128__=16");
  if (ibm128_float_type_node)
builtin_define ("__SIZEOF_IBM128__=16");

Okay like that.  Thanks!


Segher


[committed] analyzer: fixes to free of non-heap detection [PR104560]

2022-02-16 Thread David Malcolm via Gcc-patches
PR analyzer/104560 reports various false positives from
-Wanalyzer-free-of-non-heap seen with rdma-core, on what's
effectively:

  free (&ptr->field)

where in this case "field" is the first element of its struct, and thus
&ptr->field == ptr, and could be on the heap.
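
To make the aliasing concrete, here is a small sketch showing that the address of a first member equals the address of the enclosing heap object, so freeing through it is valid. The struct `item` and the function `first_field_aliases_object` are hypothetical names used only for illustration; they do not come from rdma-core.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct item {
  int field;  // first member, so it sits at offset 0
  int other;
};

static_assert(offsetof(item, field) == 0, "field must be the first member");

// Allocates an item, compares &ptr->field with ptr, then releases the
// block through the member address.  Returns whether the two matched.
inline bool first_field_aliases_object() {
  item *ptr = static_cast<item *>(std::malloc(sizeof(item)));
  if (!ptr)
    return false;
  void *whole = ptr;
  void *first = &ptr->field;
  bool same = (whole == first);
  std::free(first);  // same address malloc returned, so this is a valid free
  return same;
}
```

A checker that marks every `&EXPR` result as non-heap would flag the `free` call here, which is the false positive described above.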

The root cause is due to malloc_state_machine::on_stmt making
  "LHS = &EXPR;"
transition LHS from start to non_heap when EXPR is not a MEM_REF;
this assumption doesn't hold for the above case.

This patch eliminates that state transition, instead relying on
malloc_state_machine::get_default_state to detect regions known to
not be on the heap.
Doing so fixes the false positive, but eliminates some events relating
to free-of-alloca identifying the alloca, so the patch also reworks
free_of_non_heap to capture which region has been freed, adding
region creation events to diagnostic paths, so that the alloca calls
can be identified, and using the memory space of the region for more
precise wording of the diagnostic.
The improvement to malloc_state_machine::get_default_state also
means we now detect attempts to free VLAs, functions and code labels.

In doing so I spotted that I wasn't adding region creation events for
regions for global variables, and for cases where an allocation is the
last stmt within its basic block, so the patch also fixes these issues.

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-7268-ga61aaee63848d422e8443e17bbec3257ee59d5d8.

gcc/analyzer/ChangeLog:
PR analyzer/104560
* diagnostic-manager.cc (diagnostic_manager::build_emission_path):
Add region creation events for globals of interest.
(null_assignment_sm_context::get_old_program_state): New.
(diagnostic_manager::add_events_for_eedge): Move check for
changing dynamic extents from PK_BEFORE_STMT case to after the
switch on the dst_point's kind so that we can emit them for the
final stmt in a basic block.
* engine.cc (impl_sm_context::get_old_program_state): New.
* sm-malloc.cc (malloc_state_machine::get_default_state): Rewrite
detection of m_non_heap to use get_memory_space.
(free_of_non_heap::free_of_non_heap): Add freed_reg param.
(free_of_non_heap::subclass_equal_p): Update for changes to
fields.
(free_of_non_heap::emit): Drop m_kind in favor of
get_memory_space.
(free_of_non_heap::describe_state_change): Remove logic for
detecting alloca.
(free_of_non_heap::mark_interesting_stuff): Add region-creation of
m_freed_reg.
(free_of_non_heap::get_memory_space): New.
(free_of_non_heap::kind): Drop enum.
(free_of_non_heap::m_freed_reg): New field.
(free_of_non_heap::m_kind): Drop field.
(malloc_state_machine::on_stmt): Drop transition to m_non_heap.
(malloc_state_machine::handle_free_of_non_heap): New function,
split out from on_deallocator_call and on_realloc_call, adding
detection of the freed region.
(malloc_state_machine::on_deallocator_call): Use it.
(malloc_state_machine::on_realloc_call): Likewise.
* sm.h (sm_context::get_old_program_state): New vfunc.

gcc/testsuite/ChangeLog:
PR analyzer/104560
* g++.dg/analyzer/placement-new.C: Update expected wording.
* g++.dg/analyzer/pr100244.C: Likewise.
* gcc.dg/analyzer/attr-malloc-1.c (test_7): Likewise.
* gcc.dg/analyzer/malloc-1.c (test_24): Likewise.
(test_25): Likewise.
(test_26): Likewise.
(test_50a, test_50b, test_50c): New.
* gcc.dg/analyzer/malloc-callbacks.c (test_5): Update expected
wording.
* gcc.dg/analyzer/malloc-paths-8.c: Likewise.
* gcc.dg/analyzer/pr104560-1.c: New test.
* gcc.dg/analyzer/pr104560-2.c: New test.
* gcc.dg/analyzer/realloc-1.c (test_7): Updated expected wording.
* gcc.dg/analyzer/vla-1.c (test_2): New.  Prune output from
-Wfree-nonheap-object.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/diagnostic-manager.cc| 105 +-
 gcc/analyzer/engine.cc|   5 +
 gcc/analyzer/sm-malloc.cc | 134 +-
 gcc/analyzer/sm.h |   4 +
 gcc/testsuite/g++.dg/analyzer/placement-new.C |   4 +-
 gcc/testsuite/g++.dg/analyzer/pr100244.C  |   2 +-
 gcc/testsuite/gcc.dg/analyzer/attr-malloc-1.c |   2 +-
 gcc/testsuite/gcc.dg/analyzer/malloc-1.c  |  32 -
 .../gcc.dg/analyzer/malloc-callbacks.c|   5 +-
 .../gcc.dg/analyzer/malloc-paths-8.c  |   4 +-
 gcc/testsuite/gcc.dg/analyzer/pr104560-1.c|  43 ++
 gcc/testsuite/gcc.dg/analyzer/pr104560-2.c|  26 
 gcc/testsuite/gcc.dg/analyzer/realloc-1.c |   4 +-
 gcc/testsuite/gcc.dg/analyzer/vla-1.c |   9 ++
 14 files changed, 262 insertions(+), 117 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr1

Re: Merge from trunk to gccgo branch

2022-02-16 Thread Ian Lance Taylor via Gcc-patches
I merged trunk revision 24ca97325cab7bc454c785d55f37120fe7ea6f74 to
the gccgo branch.

Ian


Re: [PATCH] c++: double non-dep folding from finish_compound_literal [PR104565]

2022-02-16 Thread Jason Merrill via Gcc-patches

On 2/16/22 15:26, Patrick Palka wrote:

In finish_compound_literal, we perform non-dependent expr folding before
calling check_narrowing (ever since r9-5973).  But ever since r10-7096,
check_narrowing also performs non-dependent expr folding of its own.
This double folding causes tsubst to see non-templated trees during the
second folding, which causes a spurious error in the below testcase.

This patch removes this first folding operation; it now seems obviated
by the second one.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/10/11?


OK for trunk now, release branches in about a month.


PR c++/104565

gcc/cp/ChangeLog:

* semantics.cc (finish_compound_literal): Don't perform
non-dependent expr folding before calling check_narrowing.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent22.C: New test.
---
  gcc/cp/semantics.cc | 10 +++---
  gcc/testsuite/g++.dg/template/non-dependent22.C | 12 
  2 files changed, 15 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/non-dependent22.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 0cb17a6a8ab..114baa48710 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -3203,13 +3203,9 @@ finish_compound_literal (tree type, tree 
compound_literal,
  return error_mark_node;
compound_literal = reshape_init (type, compound_literal, complain);
if (SCALAR_TYPE_P (type)
-  && !BRACE_ENCLOSED_INITIALIZER_P (compound_literal))
-{
-  tree t = instantiate_non_dependent_expr_sfinae (compound_literal,
- complain);
-  if (!check_narrowing (type, t, complain))
-   return error_mark_node;
-}
+  && !BRACE_ENCLOSED_INITIALIZER_P (compound_literal)
+  && !check_narrowing (type, compound_literal, complain))
+return error_mark_node;
if (TREE_CODE (type) == ARRAY_TYPE
&& TYPE_DOMAIN (type) == NULL_TREE)
  {
diff --git a/gcc/testsuite/g++.dg/template/non-dependent22.C 
b/gcc/testsuite/g++.dg/template/non-dependent22.C
new file mode 100644
index 000..83a6a13f15b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/non-dependent22.C
@@ -0,0 +1,12 @@
+// PR c++/104565
+// { dg-do compile { target c++11 } }
+
+struct apa {
+  constexpr int n() const { return 3; }
+};
+
+template
+int f() {
+  apa foo;
+  return int{foo.n()};  // no matching function for call to 'apa::n(apa*)'
+}




Re: [PATCH] Restrict the two sources of vect_recog_cond_expr_convert_pattern to be of the same type when convert is extension.

2022-02-16 Thread Hongtao Liu via Gcc-patches
On Wed, Feb 16, 2022 at 10:17 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Wed, Feb 16, 2022 at 05:03:09PM +0800, liuhongt via Gcc-patches wrote:
> > > > +(match (cond_expr_convert_p @0 @2 @3 @6)
> > > > + (cond (simple_comparison@6 @0 @1) (convert@4 @2) (convert@5 @3))
> > > > +  (if (types_match (TREE_TYPE (@2), TREE_TYPE (@3))
> > >
> > > But in principle @2 or @3 could safely differ in sign, you'd then need to
> > > ensure to insert sign conversions to @2/@3 to the signedness of @4/@5.
> > >
> > It turns out that differing in sign is not suitable for extension (but OK
> > for truncation), because it's zero_extend vs sign_extend.
> >
> > The patch adds a types_match check when the convert is an extension.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > And native Bootstrapped and regtested on CLX.
> >
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> >   PR tree-optimization/104551
> >   PR tree-optimization/103771
> >   * match.pd (cond_expr_convert_p): Add types_match check when
> >   convert is extension.
> >   * tree-vect-patterns.cc
> >   (gimple_cond_expr_convert_p): Adjust comments.
> >   (vect_recog_cond_expr_convert_pattern): Ditto.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/i386/pr104551.c: New test.
> > ---
> >  gcc/match.pd |  8 +---
> >  gcc/testsuite/gcc.target/i386/pr104551.c | 24 
> >  gcc/tree-vect-patterns.cc|  6 --
> >  3 files changed, 33 insertions(+), 5 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr104551.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 05a10ab6bfd..8e80b9f1576 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -7692,11 +7692,13 @@ and,
> >(if (INTEGRAL_TYPE_P (type)
> > && INTEGRAL_TYPE_P (TREE_TYPE (@2))
> > && INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > -   && INTEGRAL_TYPE_P (TREE_TYPE (@3))
> > && TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (@0))
> > && TYPE_PRECISION (TREE_TYPE (@0))
> > == TYPE_PRECISION (TREE_TYPE (@2))
> > -   && TYPE_PRECISION (TREE_TYPE (@0))
> > -   == TYPE_PRECISION (TREE_TYPE (@3))
> > +   && (types_match (TREE_TYPE (@2), TREE_TYPE (@3))
> > +|| ((TYPE_PRECISION (TREE_TYPE (@0))
> > + == TYPE_PRECISION (TREE_TYPE (@3)))
> > +&& INTEGRAL_TYPE_P (TREE_TYPE (@3))
> > +&& TYPE_PRECISION (TREE_TYPE (@3)) > TYPE_PRECISION (type)))
> > && single_use (@4)
> > && single_use (@5
>
> I find this quite unreadable, it looks as if @2 and @3 are treated
> differently.  I think keeping the old 3 lines and just adding
>   && (TYPE_PRECISION (TREE_TYPE (@0)) >= TYPE_PRECISION (type)
>   || (TYPE_UNSIGNED (TREE_TYPE (@2))
>   == TYPE_UNSIGNED (TREE_TYPE (@3
Yes, good idea.
> after it ideally with a comment why would be better.
> Note, if the precision of @0 and type is the same, I think signedness can
> still differ, no?
We have TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (@0)).
>
> Jakub
>


-- 
BR,
Hongtao


[committed] analyzer: const functions have no side effects [PR104576]

2022-02-16 Thread David Malcolm via Gcc-patches
PR analyzer/104576 tracks that we issue a false positive from
-Wanalyzer-use-of-uninitialized-value for the reproducers of PR 63311
when optimization is disabled.

The root cause is that the analyzer was considering that a call to
__builtin_sinf could have side-effects.

This patch fixes things by generalizing the handling for "pure"
functions to also consider "const" functions.
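
To illustrate the shape of the generalized check, here is a minimal
standalone sketch.  It is not the analyzer's code: the names
DEMO_ECF_CONST, DEMO_ECF_PURE, demo_unknown_side_effects and
demo_square are hypothetical stand-ins, and the flag values are
assumptions chosen for the example (GCC defines the real ECF_* values
internally).

```c
/* Hypothetical stand-ins for GCC's ECF_CONST / ECF_PURE call flags;
   the real values are defined inside GCC and may differ.  */
enum demo_call_flags
{
  DEMO_ECF_CONST = 1 << 0,  /* result depends only on the arguments */
  DEMO_ECF_PURE  = 1 << 1   /* may read memory, but writes nothing */
};

/* Mirror the shape of the patched test: a callee is assumed to have
   unknown side effects only if it is neither const nor pure.  */
static int
demo_unknown_side_effects (int callee_flags)
{
  return !(callee_flags & (DEMO_ECF_CONST | DEMO_ECF_PURE));
}

/* A "const" function in the GCC attribute sense: no reads of global
   state, no writes, result determined by the argument alone.  */
__attribute__ ((const)) static int
demo_square (int x)
{
  return x * x;
}
```

With the old DECL_PURE_P-only test, a const-but-not-pure callee would
presumably have been treated like the first case below (flags == 0);
with the combined mask it is treated as side-effect free.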

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-7270-g5fbcbcaff7248604e04b39464f4fbd64fbf6e43b.

gcc/analyzer/ChangeLog:
PR analyzer/104576
* region-model.cc: Include "calls.h".
(region_model::on_call_pre): Use flags_from_decl_or_type to
generalize check for DECL_PURE_P to also check for ECF_CONST.

gcc/testsuite/ChangeLog:
PR analyzer/104576
* gcc.dg/analyzer/torture/uninit-pr63311.c: New test.
* gcc.dg/analyzer/uninit-pr104576.c: New test.
* gfortran.dg/analyzer/uninit-pr63311.f90: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model.cc  |   6 +-
 .../gcc.dg/analyzer/torture/uninit-pr63311.c  | 134 ++
 .../gcc.dg/analyzer/uninit-pr104576.c |  16 +++
 .../gfortran.dg/analyzer/uninit-pr63311.f90   |  39 +
 4 files changed, 193 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/torture/uninit-pr63311.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/uninit-pr104576.c
 create mode 100644 gcc/testsuite/gfortran.dg/analyzer/uninit-pr63311.f90

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 69e8fa7d1e3..d4d7816e0d5 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -72,6 +72,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-phinodes.h"
 #include "tree-ssa-operands.h"
 #include "ssa-iterators.h"
+#include "calls.h"
 
 #if ENABLE_ANALYZER
 
@@ -1271,13 +1272,14 @@ region_model::on_call_pre (const gcall *call, 
region_model_context *ctxt,
 in region-model-impl-calls.cc.
 Having them split out into separate functions makes it easier
 to put breakpoints on the handling of specific functions.  */
+  int callee_fndecl_flags = flags_from_decl_or_type (callee_fndecl);
 
   if (fndecl_built_in_p (callee_fndecl, BUILT_IN_NORMAL)
  && gimple_builtin_call_types_compatible_p (call, callee_fndecl))
switch (DECL_UNCHECKED_FUNCTION_CODE (callee_fndecl))
  {
  default:
-   if (!DECL_PURE_P (callee_fndecl))
+   if (!(callee_fndecl_flags & (ECF_CONST | ECF_PURE)))
  unknown_side_effects = true;
break;
  case BUILT_IN_ALLOCA:
@@ -1433,7 +1435,7 @@ region_model::on_call_pre (const gcall *call, 
region_model_context *ctxt,
  /* Handle in "on_call_post".  */
}
   else if (!fndecl_has_gimple_body_p (callee_fndecl)
-  && !DECL_PURE_P (callee_fndecl)
+  && (!(callee_fndecl_flags & (ECF_CONST | ECF_PURE)))
   && !fndecl_built_in_p (callee_fndecl))
unknown_side_effects = true;
 }
diff --git a/gcc/testsuite/gcc.dg/analyzer/torture/uninit-pr63311.c 
b/gcc/testsuite/gcc.dg/analyzer/torture/uninit-pr63311.c
new file mode 100644
index 000..a73289cb83f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/torture/uninit-pr63311.c
@@ -0,0 +1,134 @@
+/* { dg-additional-options "-Wno-analyzer-too-complex" } */
+
+int foo ()
+{
+  static volatile int v = 42;
+  int __result_foo;
+
+  __result_foo = (int) v;
+  return __result_foo;
+}
+
+void test (int * restrict n, int * restrict flag)
+{
+  int i;
+  int j;
+  int k;
+  double t;
+  int tt;
+  double v;
+
+  if (*flag)
+{
+  t = 4.2e+1;
+  tt = foo ();
+}
+  L_1: ;
+  v = 0.0;
+  {
+int D_3353;
+
+D_3353 = *n;
+i = 1;
+if (i <= D_3353)
+  {
+while (1)
+  {
+{
+  int D_3369;
+
+  v = 0.0;
+  if (*flag)
+{
+  if (tt == i)
+{
+  {
+double M_0;
+
+M_0 = v;
+if (t > M_0 || (int) (M_0 != M_0))
+  {
+M_0 = t;
+  }
+v = M_0;
+  }
+}
+  L_5:;
+}
+  L_4:;
+  {
+int D_3359;
+
+D_3359 = *n;
+j = 1;
+if (j <= D_3359)
+  {
+while (1)
+  {
+{
+  int D_3368;
+
+  {
+int D_3362;
+
+D_3362 = *n;
+k = 1;
+if (k <= D_3362)
+  {
+  

[PATCH] [i386] Clean up MPX-related bit_{MPX,BNDREGS,BNDCSR}.

2022-02-16 Thread liuhongt via Gcc-patches
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

* config/i386/cpuid.h (bit_MPX): Removed.
(bit_BNDREGS): Ditto.
(bit_BNDCSR): Ditto.
---
 gcc/config/i386/cpuid.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index ed6113009bb..8b3dc2b1dde 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -86,7 +86,6 @@
 #define bit_AVX2   (1 << 5)
 #define bit_BMI2   (1 << 8)
 #define bit_RTM(1 << 11)
-#define bit_MPX(1 << 14)
 #define bit_AVX512F(1 << 16)
 #define bit_AVX512DQ   (1 << 17)
 #define bit_RDSEED (1 << 18)
@@ -136,10 +135,6 @@
 #define bit_AMX_TILE(1 << 24)
 #define bit_AMX_INT8(1 << 25)
 
-/* XFEATURE_ENABLED_MASK register bits (%eax == 0xd, %ecx == 0) */
-#define bit_BNDREGS (1 << 3)
-#define bit_BNDCSR  (1 << 4)
-
 /* Extended State Enumeration Sub-leaf (%eax == 0xd, %ecx == 1) */
 #define bit_XSAVEOPT   (1 << 0)
 #define bit_XSAVEC (1 << 1)
-- 
2.18.1



Re: [PATCH] [i386] Clean up MPX-related bit_{MPX,BNDREGS,BNDCSR}.

2022-02-16 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 17, 2022 at 12:00 PM liuhongt  wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/cpuid.h (bit_MPX): Removed.
> (bit_BNDREGS): Ditto.
> (bit_BNDCSR): Ditto.
> ---
>  gcc/config/i386/cpuid.h | 5 -
>  1 file changed, 5 deletions(-)
>
> diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
> index ed6113009bb..8b3dc2b1dde 100644
> --- a/gcc/config/i386/cpuid.h
> +++ b/gcc/config/i386/cpuid.h
> @@ -86,7 +86,6 @@
>  #define bit_AVX2   (1 << 5)
>  #define bit_BMI2   (1 << 8)
>  #define bit_RTM(1 << 11)
> -#define bit_MPX(1 << 14)
>  #define bit_AVX512F(1 << 16)
>  #define bit_AVX512DQ   (1 << 17)
>  #define bit_RDSEED (1 << 18)
> @@ -136,10 +135,6 @@
>  #define bit_AMX_TILE(1 << 24)
>  #define bit_AMX_INT8(1 << 25)
>
> -/* XFEATURE_ENABLED_MASK register bits (%eax == 0xd, %ecx == 0) */
> -#define bit_BNDREGS (1 << 3)
> -#define bit_BNDCSR  (1 << 4)
> -
>  /* Extended State Enumeration Sub-leaf (%eax == 0xd, %ecx == 1) */
>  #define bit_XSAVEOPT   (1 << 0)
>  #define bit_XSAVEC (1 << 1)
> --
> 2.18.1
>


-- 
BR,
Hongtao


libbacktrace patch committed: Handle skeleton units

2022-02-16 Thread Ian Lance Taylor via Gcc-patches
This libbacktrace patch handles DWARF 5 skeleton units, which are used
when part of the DWARF information is stored in a separate file.  This
doesn't actually look in the separate file, as the line number
information, which is all that we care about, is normally kept in the
main executable because it needs relocations.  Bootstrapped and ran
the libbacktrace and Go testsuites on x86_64-pc-linux-gnu.  Committed
to mainline.
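
The recurring test in the patch can be summarized by a small
standalone sketch.  The helper name demo_is_unit_tag and the DEMO_
prefixed macros are hypothetical and used only for illustration; the
numeric tag values are the ones assigned by the DWARF 5 specification.
The patch itself writes the comparison out at each site rather than
using a helper.

```c
/* DWARF tag values as assigned by the DWARF 5 specification.  */
#define DEMO_DW_TAG_compile_unit  0x11
#define DEMO_DW_TAG_subprogram    0x2e
#define DEMO_DW_TAG_skeleton_unit 0x4a

/* Hypothetical helper: nonzero if TAG names a unit-level DIE whose
   attributes (DW_AT_stmt_list, DW_AT_name, DW_AT_comp_dir, ...)
   describe the unit as a whole.  */
static int
demo_is_unit_tag (unsigned int tag)
{
  return tag == DEMO_DW_TAG_compile_unit
         || tag == DEMO_DW_TAG_skeleton_unit;
}
```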

Ian

* dwarf.c (find_address_ranges): Handle skeleton units.
(read_function_entry): Likewise.
3c16999f983331301384f51fc1cdc04f7d51ef6c
diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index 2158bc14065..45cc9e77e40 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -1989,14 +1989,16 @@ find_address_ranges (struct backtrace_state *state, 
uintptr_t base_address,
  break;
 
case DW_AT_stmt_list:
- if (abbrev->tag == DW_TAG_compile_unit
+ if ((abbrev->tag == DW_TAG_compile_unit
+  || abbrev->tag == DW_TAG_skeleton_unit)
  && (val.encoding == ATTR_VAL_UINT
  || val.encoding == ATTR_VAL_REF_SECTION))
u->lineoff = val.u.uint;
  break;
 
case DW_AT_name:
- if (abbrev->tag == DW_TAG_compile_unit)
+ if (abbrev->tag == DW_TAG_compile_unit
+ || abbrev->tag == DW_TAG_skeleton_unit)
{
  name_val = val;
  have_name_val = 1;
@@ -2004,7 +2006,8 @@ find_address_ranges (struct backtrace_state *state, 
uintptr_t base_address,
  break;
 
case DW_AT_comp_dir:
- if (abbrev->tag == DW_TAG_compile_unit)
+ if (abbrev->tag == DW_TAG_compile_unit
+ || abbrev->tag == DW_TAG_skeleton_unit)
{
  comp_dir_val = val;
  have_comp_dir_val = 1;
@@ -2012,19 +2015,22 @@ find_address_ranges (struct backtrace_state *state, 
uintptr_t base_address,
  break;
 
case DW_AT_str_offsets_base:
- if (abbrev->tag == DW_TAG_compile_unit
+ if ((abbrev->tag == DW_TAG_compile_unit
+  || abbrev->tag == DW_TAG_skeleton_unit)
  && val.encoding == ATTR_VAL_REF_SECTION)
u->str_offsets_base = val.u.uint;
  break;
 
case DW_AT_addr_base:
- if (abbrev->tag == DW_TAG_compile_unit
+ if ((abbrev->tag == DW_TAG_compile_unit
+  || abbrev->tag == DW_TAG_skeleton_unit)
  && val.encoding == ATTR_VAL_REF_SECTION)
u->addr_base = val.u.uint;
  break;
 
case DW_AT_rnglists_base:
- if (abbrev->tag == DW_TAG_compile_unit
+ if ((abbrev->tag == DW_TAG_compile_unit
+  || abbrev->tag == DW_TAG_skeleton_unit)
  && val.encoding == ATTR_VAL_REF_SECTION)
u->rnglists_base = val.u.uint;
  break;
@@ -2052,7 +2058,8 @@ find_address_ranges (struct backtrace_state *state, 
uintptr_t base_address,
}
 
   if (abbrev->tag == DW_TAG_compile_unit
- || abbrev->tag == DW_TAG_subprogram)
+ || abbrev->tag == DW_TAG_subprogram
+ || abbrev->tag == DW_TAG_skeleton_unit)
{
  if (!add_ranges (state, dwarf_sections, base_address,
   is_bigendian, u, pcrange.lowpc, &pcrange,
@@ -2060,9 +2067,10 @@ find_address_ranges (struct backtrace_state *state, 
uintptr_t base_address,
   (void *) addrs))
return 0;
 
- /* If we found the PC range in the DW_TAG_compile_unit, we
-can stop now.  */
- if (abbrev->tag == DW_TAG_compile_unit
+ /* If we found the PC range in the DW_TAG_compile_unit or
+DW_TAG_skeleton_unit, we can stop now.  */
+ if ((abbrev->tag == DW_TAG_compile_unit
+  || abbrev->tag == DW_TAG_skeleton_unit)
  && (pcrange.have_ranges
  || (pcrange.have_lowpc && pcrange.have_highpc)))
return 1;
@@ -3274,7 +3282,8 @@ read_function_entry (struct backtrace_state *state, 
struct dwarf_data *ddata,
 
  /* The compile unit sets the base address for any address
 ranges in the function entries.  */
- if (abbrev->tag == DW_TAG_compile_unit
+ if ((abbrev->tag == DW_TAG_compile_unit
+  || abbrev->tag == DW_TAG_skeleton_unit)
  && abbrev->attrs[i].name == DW_AT_low_pc)
{
  if (val.encoding == ATTR_VAL_ADDRESS)


[PATCH] x86: Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER

2022-02-16 Thread H.J. Lu via Gcc-patches
Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge,
Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
transition penalty.  Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to
generate vzeroupper instruction after loading all-zero YMM/ZMM registers
and enable it by default.
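
As a rough model of the intended behaviour for a single insn that
loads a YMM/ZMM register, consider the sketch below.  It is a
simplification, not the code in ix86_avx_u128_mode_needed: the enum
and function names are hypothetical, and it assumes that when the
tuning flag is set such a load ends up in DIRTY mode through the
generic "256bit or 512bit AVX register is referenced" check.

```c
/* Hypothetical stand-ins for the AVX_U128_* mode values.  */
enum demo_avx_u128_mode
{
  DEMO_AVX_U128_DIRTY,  /* upper bits may be in use: vzeroupper needed */
  DEMO_AVX_U128_ANY     /* no requirement from this insn */
};

/* NEED_VZEROUPPER models the new tuning flag; SRC_IS_ZERO says whether
   the value loaded into the YMM/ZMM register is the all-zero constant.  */
static enum demo_avx_u128_mode
demo_mode_for_upper_load (int need_vzeroupper, int src_is_zero)
{
  /* The optimization applies only when the target does not need
     vzeroupper after reading all-zero registers.  */
  if (!need_vzeroupper && src_is_zero)
    return DEMO_AVX_U128_ANY;
  return DEMO_AVX_U128_DIRTY;
}
```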

gcc/

PR target/101456
* config/i386/i386.cc (ix86_avx_u128_mode_needed): Skip the
vzeroupper optimization if target needs vzeroupper after reading
all-zero YMM/ZMM registers.
* config/i386/i386.h (TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER):
New.
* config/i386/x86-tune.def
(X86_TUNE_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER): New.

gcc/testsuite/

PR target/101456
* gcc.target/i386/pr101456-1.c (dg-options): Add
-mtune-ctrl=^read_zero_ymm_zmm_need_vzeroupper.
* gcc.target/i386/pr101456-2.c: Likewise.
* gcc.target/i386/pr101456-3.c: New test.
* gcc.target/i386/pr101456-4.c: Likewise.
---
 gcc/config/i386/i386.cc| 51 --
 gcc/config/i386/i386.h |  2 +
 gcc/config/i386/x86-tune.def   |  5 +++
 gcc/testsuite/gcc.target/i386/pr101456-1.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr101456-2.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr101456-3.c | 33 ++
 gcc/testsuite/gcc.target/i386/pr101456-4.c | 33 ++
 7 files changed, 103 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-4.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index cf246e74e57..1f8b4caf24c 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -14502,33 +14502,38 @@ ix86_avx_u128_mode_needed (rtx_insn *insn)
 
   subrtx_iterator::array_type array;
 
-  rtx set = single_set (insn);
-  if (set)
+  if (!TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER)
 {
-  rtx dest = SET_DEST (set);
-  rtx src = SET_SRC (set);
-  if (ix86_check_avx_upper_register (dest))
+  /* Perform this vzeroupper optimization if target doesn't need
+vzeroupper after reading all-zero YMM/YMM registers.  */
+  rtx set = single_set (insn);
+  if (set)
{
- /* This is an YMM/ZMM load.  Return AVX_U128_DIRTY if the
-source isn't zero.  */
- if (standard_sse_constant_p (src, GET_MODE (dest)) != 1)
-   return AVX_U128_DIRTY;
+ rtx dest = SET_DEST (set);
+ rtx src = SET_SRC (set);
+ if (ix86_check_avx_upper_register (dest))
+   {
+ /* This is an YMM/ZMM load.  Return AVX_U128_DIRTY if the
+source isn't zero.  */
+ if (standard_sse_constant_p (src, GET_MODE (dest)) != 1)
+   return AVX_U128_DIRTY;
+ else
+   return AVX_U128_ANY;
+   }
  else
-   return AVX_U128_ANY;
-   }
-  else
-   {
- FOR_EACH_SUBRTX (iter, array, src, NONCONST)
-   if (ix86_check_avx_upper_register (*iter))
- {
-   int status = ix86_avx_u128_mode_source (insn, *iter);
-   if (status == AVX_U128_DIRTY)
- return status;
- }
-   }
+   {
+ FOR_EACH_SUBRTX (iter, array, src, NONCONST)
+   if (ix86_check_avx_upper_register (*iter))
+ {
+   int status = ix86_avx_u128_mode_source (insn, *iter);
+   if (status == AVX_U128_DIRTY)
+ return status;
+ }
+   }
 
-  /* This isn't YMM/ZMM load/store.  */
-  return AVX_U128_ANY;
+ /* This isn't YMM/ZMM load/store.  */
+ return AVX_U128_ANY;
+   }
 }
 
   /* Require DIRTY mode if a 256bit or 512bit AVX register is referenced.
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index f41e0908250..98c2e200027 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -425,6 +425,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 #define TARGET_AVOID_MFENCE ix86_tune_features[X86_TUNE_AVOID_MFENCE]
 #define TARGET_EMIT_VZEROUPPER \
ix86_tune_features[X86_TUNE_EMIT_VZEROUPPER]
+#define TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER \
+   ix86_tune_features[X86_TUNE_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER]
 #define TARGET_EXPAND_ABS \
ix86_tune_features[X86_TUNE_EXPAND_ABS]
 #define TARGET_V2DF_REDUCTION_PREFER_HADDPD \
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 82ca0ae63ac..0a068c09202 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -649,3 +649,8 @@ DEF_TUNE (X86_TUNE_PROMOTE_QI_REGS, "promote_qi_regs", 
m_NONE)
 /* X86_TUNE_EMIT_VZEROUPPER: This enables vzeroupper instruction insertion
before a transfer of control flow out of the function.  */
 DEF_TUNE (X86_TUNE_EMIT_VZEROUPPER, "emit_vzeroupper", ~m_KNL)

Re: Merge from trunk to gccgo branch

2022-02-16 Thread Ian Lance Taylor via Gcc-patches
And another one: I merged trunk revision
837eb12629dd8a8a45fac9b8db57b29ecda46f14 to the gccgo branch.

Ian


Re: [PATCH] x86: Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER

2022-02-16 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
 wrote:
>
> Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge,
> Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
> transition penalty.  Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to
> generate vzeroupper instruction after loading all-zero YMM/ZMM registers
> and enable it by default.
Wouldn't TARGET_READ_ZERO_YMM_ZMM_NONEED_VZEROUPPER sound a bit smoother?
Originally we needed to add vzeroupper for all avx<->sse cases; now it's a
tune to indicate that we don't need to add it in some cases.
>
> gcc/
>
> PR target/101456
> * config/i386/i386.cc (ix86_avx_u128_mode_needed): Skip the
> vzeroupper optimization if target needs vzeroupper after reading
> all-zero YMM/ZMM registers.
> * config/i386/i386.h (TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER):
> New.
> * config/i386/x86-tune.def
> (X86_TUNE_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER): New.
>
> gcc/testsuite/
>
> PR target/101456
> * gcc.target/i386/pr101456-1.c (dg-options): Add
> -mtune-ctrl=^read_zero_ymm_zmm_need_vzeroupper.
> * gcc.target/i386/pr101456-2.c: Likewise.
> * gcc.target/i386/pr101456-3.c: New test.
> * gcc.target/i386/pr101456-4.c: Likewise.
> ---
>  gcc/config/i386/i386.cc| 51 --
>  gcc/config/i386/i386.h |  2 +
>  gcc/config/i386/x86-tune.def   |  5 +++
>  gcc/testsuite/gcc.target/i386/pr101456-1.c |  2 +-
>  gcc/testsuite/gcc.target/i386/pr101456-2.c |  2 +-
>  gcc/testsuite/gcc.target/i386/pr101456-3.c | 33 ++
>  gcc/testsuite/gcc.target/i386/pr101456-4.c | 33 ++
>  7 files changed, 103 insertions(+), 25 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-4.c
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index cf246e74e57..1f8b4caf24c 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -14502,33 +14502,38 @@ ix86_avx_u128_mode_needed (rtx_insn *insn)
>
>subrtx_iterator::array_type array;
>
> -  rtx set = single_set (insn);
> -  if (set)
> +  if (!TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER)
>  {
> -  rtx dest = SET_DEST (set);
> -  rtx src = SET_SRC (set);
> -  if (ix86_check_avx_upper_register (dest))
> +  /* Perform this vzeroupper optimization if target doesn't need
> +vzeroupper after reading all-zero YMM/YMM registers.  */
> +  rtx set = single_set (insn);
> +  if (set)
> {
> - /* This is an YMM/ZMM load.  Return AVX_U128_DIRTY if the
> -source isn't zero.  */
> - if (standard_sse_constant_p (src, GET_MODE (dest)) != 1)
> -   return AVX_U128_DIRTY;
> + rtx dest = SET_DEST (set);
> + rtx src = SET_SRC (set);
> + if (ix86_check_avx_upper_register (dest))
> +   {
> + /* This is an YMM/ZMM load.  Return AVX_U128_DIRTY if the
> +source isn't zero.  */
> + if (standard_sse_constant_p (src, GET_MODE (dest)) != 1)
> +   return AVX_U128_DIRTY;
> + else
> +   return AVX_U128_ANY;
> +   }
>   else
> -   return AVX_U128_ANY;
> -   }
> -  else
> -   {
> - FOR_EACH_SUBRTX (iter, array, src, NONCONST)
> -   if (ix86_check_avx_upper_register (*iter))
> - {
> -   int status = ix86_avx_u128_mode_source (insn, *iter);
> -   if (status == AVX_U128_DIRTY)
> - return status;
> - }
> -   }
> +   {
> + FOR_EACH_SUBRTX (iter, array, src, NONCONST)
> +   if (ix86_check_avx_upper_register (*iter))
> + {
> +   int status = ix86_avx_u128_mode_source (insn, *iter);
> +   if (status == AVX_U128_DIRTY)
> + return status;
> + }
> +   }
>
> -  /* This isn't YMM/ZMM load/store.  */
> -  return AVX_U128_ANY;
> + /* This isn't YMM/ZMM load/store.  */
> + return AVX_U128_ANY;
> +   }
>  }
>
>/* Require DIRTY mode if a 256bit or 512bit AVX register is referenced.
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index f41e0908250..98c2e200027 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -425,6 +425,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
>  #define TARGET_AVOID_MFENCE ix86_tune_features[X86_TUNE_AVOID_MFENCE]
>  #define TARGET_EMIT_VZEROUPPER \
> ix86_tune_features[X86_TUNE_EMIT_VZEROUPPER]
> +#define TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER \
> +   ix86_tune_features[X86_TUNE_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER]
>  #define TARGET_EXPAND_ABS \
> ix86_tune_features[X86_TUNE_EXPAND_ABS]
>  #define T

[PATCH V2] Restrict the two sources of vect_recog_cond_expr_convert_pattern to be of the same type when convert is extension.

2022-02-16 Thread liuhongt via Gcc-patches
> I find this quite unreadable, it looks like if @2 and @3 are treated
> differently.  I think keeping the old 3 lines and just adding
>   && (TYPE_PRECISION (TREE_TYPE (@0)) >= TYPE_PRECISION (type)
>   || (TYPE_UNSIGNED (TREE_TYPE (@2))
>   == TYPE_UNSIGNED (TREE_TYPE (@3
> after it ideally with a comment why would be better.
Updated patch.
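
The reason signedness matters for extension but not for truncation
can be seen in a small standalone C sketch.  The function names are
hypothetical and chosen only for this example; they are not part of
the patch or the testcase.

```c
#include <stdint.h>

/* Widen an 8-bit value to 32 bits from an unsigned source type:
   this is a zero extension.  */
static uint32_t
demo_widen_unsigned (uint8_t v)
{
  return (uint32_t) v;
}

/* Widen the same bit pattern from a signed source type:
   this is a sign extension.  */
static uint32_t
demo_widen_signed (int8_t v)
{
  return (uint32_t) v;
}

/* Truncation keeps only the low bits, so the signedness of the
   source does not change the result.  */
static uint8_t
demo_truncate_unsigned (uint32_t v)
{
  return (uint8_t) v;
}

static uint8_t
demo_truncate_signed (int32_t v)
{
  return (uint8_t) v;
}
```

For the bit pattern 0x80, the two widening functions give 0x00000080
and 0xffffff80 respectively, which is why both arms of the cond must
agree in sign when the convert is an extension.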

gcc/ChangeLog:

PR tree-optimization/104551
PR tree-optimization/103771
* match.pd (cond_expr_convert_p): Add types_match check when
convert is extension.
* tree-vect-patterns.cc
(gimple_cond_expr_convert_p): Adjust comments.
(vect_recog_cond_expr_convert_pattern): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr104551.c: New test.
---
 gcc/match.pd |  6 ++
 gcc/testsuite/gcc.target/i386/pr104551.c | 24 
 gcc/tree-vect-patterns.cc|  6 --
 3 files changed, 34 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr104551.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 05a10ab6bfd..8b6f22f1065 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7698,5 +7698,11 @@ and,
  == TYPE_PRECISION (TREE_TYPE (@2))
&& TYPE_PRECISION (TREE_TYPE (@0))
  == TYPE_PRECISION (TREE_TYPE (@3))
+   /* For vect_recog_cond_expr_convert_pattern, @2 and @3 can differ in
+ signedness when convert is truncation, but not ok for extension since
+ it's sign_extend vs zero_extend.  */
+   && (TYPE_PRECISION (TREE_TYPE (@0)) > TYPE_PRECISION (type)
+  || (TYPE_UNSIGNED (TREE_TYPE (@2))
+  == TYPE_UNSIGNED (TREE_TYPE (@3
&& single_use (@4)
&& single_use (@5
diff --git a/gcc/testsuite/gcc.target/i386/pr104551.c 
b/gcc/testsuite/gcc.target/i386/pr104551.c
new file mode 100644
index 000..6300f25c0d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104551.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mavx2" } */
+/* { dg-require-effective-target avx2 } */
+
+unsigned int
+__attribute__((noipa))
+test(unsigned int a, unsigned char p[16]) {
+  unsigned int res = 0;
+  for (unsigned b = 0; b < a; b += 1)
+res = p[b] ? p[b] : (char) b;
+  return res;
+}
+
+int main ()
+{
+  unsigned int a = 16U;
+  unsigned char p[16];
+  for (int i = 0; i != 16; i++)
+p[i] = (unsigned char)128;
+  unsigned int res = test (a, p);
+  if (res != 128)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index a8f96d59643..217bdfd7045 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -929,8 +929,10 @@ vect_reassociating_reduction_p (vec_info *vinfo,
with conditions:
1) @1, @2, c, d, a, b are all integral type.
2) There's single_use for both @1 and @2.
-   3) a, c and d have same precision.
+   3) a, c have same precision.
4) c and @1 have different precision.
+   5) c, d are the same type or they can differ in sign when convert is
+   truncation.
 
record a and c and d and @3.  */
 
@@ -952,7 +954,7 @@ extern bool gimple_cond_expr_convert_p (tree, tree*, tree 
(*)(tree));
TYPE_PRECISION (TYPE_E) != TYPE_PRECISION (TYPE_CD);
TYPE_PRECISION (TYPE_AB) == TYPE_PRECISION (TYPE_CD);
single_use of op_true and op_false.
-   TYPE_AB could differ in sign.
+   TYPE_AB could differ in sign when (TYPE_E) A is a truncation.
 
Input:
 
-- 
2.18.1



Re: [PATCH] tree-optimization/96881 - CD-DCE and CLOBBERs

2022-02-16 Thread Richard Biener via Gcc-patches
On Tue, 15 Feb 2022, Jan Hubicka wrote:

> > @@ -1272,7 +1275,7 @@ maybe_optimize_arith_overflow (gimple_stmt_iterator 
> > *gsi,
> > contributes nothing to the program, and can be deleted.  */
> >  
> >  static bool
> > -eliminate_unnecessary_stmts (void)
> > +eliminate_unnecessary_stmts (bool aggressive)
> >  {
> >bool something_changed = false;
> >basic_block bb;
> > @@ -1366,7 +1369,9 @@ eliminate_unnecessary_stmts (void)
> >   break;
> > }
> > }
> > - if (!dead)
> > + if (!dead
> > + && (!aggressive
> > + || bitmap_bit_p (visited_control_parents, bb->index)))
> 
> It seems to me that it may be worth considering the case where
> visited_control_parents is 0 while all basic blocks in the CD relation
> are live for different reasons.  I suppose this can happen in more
> complex CFGs when the other arms of conditionals are live...

It's a bit difficult to do in this place though since we might already
have altered those blocks (and we need to check not for the block being
live but for its control stmt).  I suppose we could use the
last_stmt_necessary bitmap.  I'll do some statistics to see whether
this helps.

Richard.


Re: [PATCH] [i386] Clean up MPX-related bit_{MPX,BNDREGS,BNDCSR}.

2022-02-16 Thread Uros Bizjak via Gcc-patches
On Thu, Feb 17, 2022 at 5:01 AM Hongtao Liu  wrote:
>
> On Thu, Feb 17, 2022 at 12:00 PM liuhongt  wrote:
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > * config/i386/cpuid.h (bit_MPX): Removed.
> > (bit_BNDREGS): Ditto.
> > (bit_BNDCSR): Ditto.

OK.

Thanks,
Uros.

> > ---
> >  gcc/config/i386/cpuid.h | 5 -
> >  1 file changed, 5 deletions(-)
> >
> > diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
> > index ed6113009bb..8b3dc2b1dde 100644
> > --- a/gcc/config/i386/cpuid.h
> > +++ b/gcc/config/i386/cpuid.h
> > @@ -86,7 +86,6 @@
> >  #define bit_AVX2   (1 << 5)
> >  #define bit_BMI2   (1 << 8)
> >  #define bit_RTM(1 << 11)
> > -#define bit_MPX(1 << 14)
> >  #define bit_AVX512F(1 << 16)
> >  #define bit_AVX512DQ   (1 << 17)
> >  #define bit_RDSEED (1 << 18)
> > @@ -136,10 +135,6 @@
> >  #define bit_AMX_TILE(1 << 24)
> >  #define bit_AMX_INT8(1 << 25)
> >
> > -/* XFEATURE_ENABLED_MASK register bits (%eax == 0xd, %ecx == 0) */
> > -#define bit_BNDREGS (1 << 3)
> > -#define bit_BNDCSR  (1 << 4)
> > -
> >  /* Extended State Enumeration Sub-leaf (%eax == 0xd, %ecx == 1) */
> >  #define bit_XSAVEOPT   (1 << 0)
> >  #define bit_XSAVEC (1 << 1)
> > --
> > 2.18.1
> >
>
>
> --
> BR,
> Hongtao


Re: [PATCH] x86: Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER

2022-02-16 Thread Uros Bizjak via Gcc-patches
On Thu, Feb 17, 2022 at 6:25 AM Hongtao Liu via Gcc-patches
 wrote:
>
> On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
>  wrote:
> >
> > Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge,
> > Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
> > transition penalty.  Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to
> > generate vzeroupper instruction after loading all-zero YMM/ZMM registers
> > and enable it by default.
> Wouldn't TARGET_READ_ZERO_YMM_ZMM_NONEED_VZEROUPPER sound a bit smoother?
> Because originally we needed to add vzeroupper to all avx<->sse cases,
> now it's a tune to indicate that we don't need to add it in some

Perhaps we should go from the other side and use
X86_TUNE_OPTIMIZE_AVX_READ for new processors?

Uros.

> cases.
> >
> > gcc/
> >
> > PR target/101456
> > * config/i386/i386.cc (ix86_avx_u128_mode_needed): Skip the
> > vzeroupper optimization if target needs vzeroupper after reading
> > all-zero YMM/ZMM registers.
> > * config/i386/i386.h (TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER):
> > New.
> > * config/i386/x86-tune.def
> > (X86_TUNE_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER): New.
> >
> > gcc/testsuite/
> >
> > PR target/101456
> > * gcc.target/i386/pr101456-1.c (dg-options): Add
> > -mtune-ctrl=^read_zero_ymm_zmm_need_vzeroupper.
> > * gcc.target/i386/pr101456-2.c: Likewise.
> > * gcc.target/i386/pr101456-3.c: New test.
> > * gcc.target/i386/pr101456-4.c: Likewise.
> > ---
> >  gcc/config/i386/i386.cc| 51 --
> >  gcc/config/i386/i386.h |  2 +
> >  gcc/config/i386/x86-tune.def   |  5 +++
> >  gcc/testsuite/gcc.target/i386/pr101456-1.c |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr101456-2.c |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr101456-3.c | 33 ++
> >  gcc/testsuite/gcc.target/i386/pr101456-4.c | 33 ++
> >  7 files changed, 103 insertions(+), 25 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-4.c
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index cf246e74e57..1f8b4caf24c 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -14502,33 +14502,38 @@ ix86_avx_u128_mode_needed (rtx_insn *insn)
> >
> >subrtx_iterator::array_type array;
> >
> > -  rtx set = single_set (insn);
> > -  if (set)
> > +  if (!TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER)
> >  {
> > -  rtx dest = SET_DEST (set);
> > -  rtx src = SET_SRC (set);
> > -  if (ix86_check_avx_upper_register (dest))
> > +  /* Perform this vzeroupper optimization if target doesn't need
> > +vzeroupper after reading all-zero YMM/ZMM registers.  */
> > +  rtx set = single_set (insn);
> > +  if (set)
> > {
> > - /* This is an YMM/ZMM load.  Return AVX_U128_DIRTY if the
> > -source isn't zero.  */
> > - if (standard_sse_constant_p (src, GET_MODE (dest)) != 1)
> > -   return AVX_U128_DIRTY;
> > + rtx dest = SET_DEST (set);
> > + rtx src = SET_SRC (set);
> > + if (ix86_check_avx_upper_register (dest))
> > +   {
> > + /* This is an YMM/ZMM load.  Return AVX_U128_DIRTY if the
> > +source isn't zero.  */
> > + if (standard_sse_constant_p (src, GET_MODE (dest)) != 1)
> > +   return AVX_U128_DIRTY;
> > + else
> > +   return AVX_U128_ANY;
> > +   }
> >   else
> > -   return AVX_U128_ANY;
> > -   }
> > -  else
> > -   {
> > - FOR_EACH_SUBRTX (iter, array, src, NONCONST)
> > -   if (ix86_check_avx_upper_register (*iter))
> > - {
> > -   int status = ix86_avx_u128_mode_source (insn, *iter);
> > -   if (status == AVX_U128_DIRTY)
> > - return status;
> > - }
> > -   }
> > +   {
> > + FOR_EACH_SUBRTX (iter, array, src, NONCONST)
> > +   if (ix86_check_avx_upper_register (*iter))
> > + {
> > +   int status = ix86_avx_u128_mode_source (insn, *iter);
> > +   if (status == AVX_U128_DIRTY)
> > + return status;
> > + }
> > +   }
> >
> > -  /* This isn't YMM/ZMM load/store.  */
> > -  return AVX_U128_ANY;
> > + /* This isn't YMM/ZMM load/store.  */
> > + return AVX_U128_ANY;
> > +   }
> >  }
> >
> >/* Require DIRTY mode if a 256bit or 512bit AVX register is referenced.
> > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > index f41e0908250..98c2e200027 100644
> > --- a/gcc/config/i386/i386.h
> > +++ b/gcc/config/i386/i386.h
> > @@ -425,6 +425,8 @@ extern unsigned char ix86_tune_feat