date:20210114

On Thu, Jan 14, 2021 at 08:43:54AM +0100, Richard Biener wrote:
> >  But for
> > diagnostics that is what the user actually wants to see IMHO.
> > So on the attached testcase, instead of printing what is in left column
> > it prints what is in right column:
> > ((int*)t) + 3   t.u.b
> > ((int*)t) + 6   t.u.e.i
> > ((int*)t) + 8   t.v
> > s + 1   s[1]
> 
> so while that's "nice" in general, for TBAA diagnostics it might actually
> be misleading.
> 
> I wonder whether we absolutely need to print a C expression here.

I'm afraid yes, because it is not a toplevel routine, but something called
from the c-family pretty-printers, so it can be in the middle of arbitrary
C/C++ expressions.  And printing
(3 * (access to a memory object of type 'int' at offset 12 bytes from 't') + 31) * 42
would be just weird.

> We could print, instead of *((int *)t + 3), "access to a memory
> object of type 'int' at offset 12 bytes from 't'", thus explain
> in plain english.
> 
> That said, *((int *)t + 3) is exactly what the access is,

*((int *)&t + 3) actually, the code I haven't touched has multiple bugs.

The user generally doesn't know the exact layout of the structures,
and especially with C++ templates it is extremely hard to figure that out,
so even when we could print verbose text it would be helpful to give a hint
(in your text something like (which falls into 't.u.b')).
I don't see how we can print both the MEM_REF type and TBAA type in a way
that would be understandable to the user.

Could we print
t.u.b
if the TBAA type is compatible with the type of the reference and perhaps
*(int*)&t.u.b
if it is incompatible?
From the aliasing perspective that is still different, but we don't print
the TBAA type anyway.
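
To make the two spellings concrete, here is a minimal sketch. The struct T below is a hypothetical layout chosen only to reproduce the int-sized offsets quoted above; it is not the testcase attached to the patch. The helpers int_index_of and slot are illustrative names, not GCC functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout (an assumption for illustration): every member is
   int-sized, so no padding is expected between them.  */
struct E { int pad[3]; int i; int j; };
struct T
{
  int a[3];                        /* int slots 0..2 */
  union { int b; struct E e; } u;  /* int slots 3..7 */
  int v;                           /* int slot 8 */
};

/* Convert a byte offset inside struct T into the N of "(int *)&t + N".  */
static size_t
int_index_of (size_t byte_offset)
{
  assert (byte_offset % sizeof (int) == 0);
  return byte_offset / sizeof (int);
}

/* Address of the N-th int-sized slot of *T, computed through char * so
   the arithmetic stays inside the object.  */
static int *
slot (struct T *t, size_t n)
{
  assert (n < sizeof (struct T) / sizeof (int));
  return (int *) ((char *) t + n * sizeof (int));
}
```

With this layout, slot 3 is the address of t.u.b, slot 6 is t.u.e.i and slot 8 is t.v, which is the mapping a user would otherwise have to work out from the raw pointer form.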

> In the light of Martin's patch this is probably reasonable but still
> the general direction is wrong (which is why I didn't approve Martin's
> original patch).  I'm also somewhat disappointed we're breaking this
> so late in the cycle.

So am I.

> c_fold_indirect_ref_for_warn doesn't look like it is especially
> careful about error recovery issues (error_mark_node in random
> places of the trees).  Maybe that never happens.

I've created it by copying and adjusting the C++ cxx_fold_indirect_ref_1,
which had those error_mark_node checks in there (I haven't verified whether
they are strictly necessary or not).  The diagnostic code isn't used solely
in the middle-end but also in the FEs, and I remember several cases where
the types had error marks within them.
The function seemed to be too short and after the changes too different
from cxx_fold_indirect_ref_1, which contains some very C++ specific parts,
handling of the active union member or empty bases (there is a pending PR
for it) etc.

Jakub

Re: [PATCH] c-family: Improve MEM_REF printing for diagnostics [PR98597]

On Thu, Jan 14, 2021 at 09:28:31AM +0100, Jakub Jelinek via Gcc-patches wrote:
> I'm afraid yes, because it is not a toplevel routine, but something called
> from the c-family pretty-printers, so it can be in the middle of arbitrary
> C/C++ expressions.  And printing
> (3 * (access to a memory object of type 'int' at offset 12 bytes from 't') + 31) * 42
> would be just weird.
> 
> > We could print, instead of *((int *)t + 3), "access to a memory
> > object of type 'int' at offset 12 bytes from 't'", thus explain
> > in plain english.
> > 
> > That said, *((int *)t + 3) is exactly what the access is,
> 
> *((int *)&t + 3) actually, the code I haven't touched has multiple bugs.
> 
> The user generally doesn't know the exact layout of the structures,
> and especially with C++ templates it is extremely hard to figure that out,
> so even when we could print verbose text it would be helpful to give a hint
> (in your text something like (which falls into 't.u.b')).
> I don't see how we can print both the MEM_REF type and TBAA type in a way
> that would be understandable to the user.
> 
> Could we print
> t.u.b
> if the TBAA type is compatible with the type of the reference and perhaps
> *(int*)&t.u.b
> if it is incompatible?
> From the aliasing perspective that is still different, but we don't print
> the TBAA type anyway.

There is another option I forgot about, but perhaps it is too verbose.
Print
*(int*)((char*)&t + offsetof (struct T, u.b))
so like
*(int*)((char*)&t + 12)
but print the offset in a more user-friendly way.
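
As a rough illustration of why the offsetof spelling and the numeric spelling describe the same access, the sketch below reads an int at a byte offset from the start of an object. The struct T here is again a hypothetical layout (not the patch testcase), and read_int_at is an illustrative helper; memcpy is used so the sketch itself does not depend on any aliasing rules.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout, assumed only for this illustration.  */
struct T
{
  int a[3];
  union { int b; float f; } u;
  int v;
};

/* Read the int stored BYTE_OFFSET bytes from the start of *T, i.e. the
   value of *(int*)((char*)&t + BYTE_OFFSET).  */
static int
read_int_at (const struct T *t, size_t byte_offset)
{
  int value;
  assert (byte_offset + sizeof (int) <= sizeof (struct T));
  memcpy (&value, (const char *) t + byte_offset, sizeof value);
  return value;
}
```

Calling read_int_at (&t, offsetof (struct T, u.b)) returns whatever was stored in t.u.b; with a 4-byte int the offset is the literal 12 from the numeric form.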

Jakub

[PATCH] aarch64: Reimplement vmovn_high_* intrinsics using builtins

Hi all,

The vmovn_high* intrinsics are supposed to map to XTN2 instructions that narrow
their source vector and insert it into the top half of the destination vector.
This patch reimplements them, moving from inline assembly to an RTL builtin
that performs a vec_concat with a truncate.
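
For readers unfamiliar with the operation, a scalar model of the unsigned 16-bit variant might look like the following. This is only a sketch in portable C using plain arrays; model_vmovn_high_u16 is a hypothetical name, it is not the arm_neon.h implementation, and it ignores lane-ordering questions on big-endian targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model: OUT[0..7] keeps the existing low half, OUT[8..15]
   receives each 16-bit element of SRC truncated to 8 bits.  */
static void
model_vmovn_high_u16 (const uint8_t low[8], const uint16_t src[8],
		      uint8_t out[16])
{
  assert (low != NULL && src != NULL && out != NULL);
  for (size_t i = 0; i < 8; i++)
    {
      out[i] = low[i];                /* low half: concatenated as is */
      out[i + 8] = (uint8_t) src[i];  /* high half: truncate */
    }
}
```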

Bootstrapped and tested on aarch64-none-linux-gnu. Also tested on
aarch64_be-none-elf.

Pushing to trunk.
Thanks,
Kyrill

gcc/
* config/aarch64/aarch64-simd.md (aarch64_xtn2_le): Define.
(aarch64_xtn2_be): Likewise.
(aarch64_xtn2): Likewise.
* config/aarch64/aarch64-simd-builtins.def (xtn2): Define builtins.
* config/aarch64/arm_neon.h (vmovn_high_s16): Reimplement using
builtins.
(vmovn_high_s32): Likewise.
(vmovn_high_s64): Likewise.
(vmovn_high_u16): Likewise.
(vmovn_high_u32): Likewise.
(vmovn_high_u64): Likewise.

gcc/testsuite/
* gcc.target/aarch64/narrow_high-intrinsics.c: Adjust
scan-assembler-times for xtn2.


movn-hi.patch

[PATCH] aarch64: Reimplement vmovn/vmovl intrinsics with builtins instead

Hi all,

Turns out __builtin_convertvector is not as good a fit for the widening and
narrowing intrinsics as I had hoped.
During the veclower phase we lower most of it to bitfield operations and hope
DCE cleans it back up into vector pack/unpack and extend operations. I received
reports that in more complex cases GCC fails to do that and we're left with
many vector extract operations that clutter the output.

I think veclower can be improved on that front, but for GCC 10 I'd like to just
implement these builtins with a good old RTL builtin rather than inline asm.
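
As a reminder of what the widening and narrowing moves compute per element, here is a scalar sketch in portable C. The model_* names are hypothetical and the code is not taken from arm_neon.h; it only mirrors the sign-extend, zero-extend and truncate semantics element by element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Widen with sign extension, as vmovl_s8 does per element.  */
static void
model_vmovl_s8 (const int8_t src[8], int16_t out[8])
{
  assert (src != NULL && out != NULL);
  for (size_t i = 0; i < 8; i++)
    out[i] = src[i];
}

/* Widen with zero extension, as vmovl_u8 does per element.  */
static void
model_vmovl_u8 (const uint8_t src[8], uint16_t out[8])
{
  assert (src != NULL && out != NULL);
  for (size_t i = 0; i < 8; i++)
    out[i] = src[i];
}

/* Narrow by truncation, as vmovn_u16 does per element.  */
static void
model_vmovn_u16 (const uint16_t src[8], uint8_t out[8])
{
  assert (src != NULL && out != NULL);
  for (size_t i = 0; i < 8; i++)
    out[i] = (uint8_t) src[i];
}
```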

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill

gcc/
* config/aarch64/aarch64-simd.md (aarch64_xtl): Define.
(aarch64_xtn): Likewise.
* config/aarch64/aarch64-simd-builtins.def (sxtl, uxtl, xtn): Define
builtins.
* config/aarch64/arm_neon.h (vmovl_s8): Reimplement using
builtin.
(vmovl_s16): Likewise.
(vmovl_s32): Likewise.
(vmovl_u8): Likewise.
(vmovl_u16): Likewise.
(vmovl_u32): Likewise.
(vmovn_s16): Likewise.
(vmovn_s32): Likewise.
(vmovn_s64): Likewise.
(vmovn_u16): Likewise.
(vmovn_u32): Likewise.
(vmovn_u64): Likewise.


vmovnl.patch

[PATCH] aarch64: reimplement vqmovn_high* intrinsics using builtins

Hi all,

This patch reimplements the saturating-truncate-and-insert-into-high intrinsics
using the appropriate RTL codes and builtins.
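
The difference from the plain narrowing move is that out-of-range values clamp instead of wrapping, which is what the ss_truncate and us_truncate RTL codes express. A scalar sketch, with hypothetical helper names and plain arrays standing in for vectors:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Signed saturating narrow of one element (the ss_truncate behaviour).  */
static int8_t
sat_s16_to_s8 (int16_t x)
{
  if (x > INT8_MAX)
    return INT8_MAX;
  if (x < INT8_MIN)
    return INT8_MIN;
  return (int8_t) x;
}

/* Unsigned saturating narrow of one element (the us_truncate behaviour).  */
static uint8_t
sat_u16_to_u8 (uint16_t x)
{
  return x > UINT8_MAX ? UINT8_MAX : (uint8_t) x;
}

/* Scalar model of vqmovn_high_s16: keep LOW in OUT[0..7] and place the
   saturated elements of SRC in OUT[8..15].  */
static void
model_vqmovn_high_s16 (const int8_t low[8], const int16_t src[8],
		       int8_t out[16])
{
  assert (low != NULL && src != NULL && out != NULL);
  for (size_t i = 0; i < 8; i++)
    {
      out[i] = low[i];
      out[i + 8] = sat_s16_to_s8 (src[i]);
    }
}
```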

Bootstrapped on aarch64-none-linux-gnu and tested on aarch64_be-none-elf too.

Pushing to trunk.
Thanks,
Kyrill

gcc/
* config/aarch64/aarch64-simd.md (aarch64_qxtn2_le): Define.
(aarch64_qxtn2_be): Likewise.
(aarch64_qxtn2): Likewise.
* config/aarch64/aarch64-simd-builtins.def (sqxtn2, uqxtn2): Define 
builtins.
* config/aarch64/iterators.md (SAT_TRUNC): Define code_iterator.
(su): Handle ss_truncate and us_truncate.
* config/aarch64/arm_neon.h (vqmovn_high_s16): Reimplement using 
builtin.
(vqmovn_high_s32): Likewise.
(vqmovn_high_s64): Likewise.
(vqmovn_high_u16): Likewise.
(vqmovn_high_u32): Likewise.
(vqmovn_high_u64): Likewise.

gcc/testsuite/
* gcc.target/aarch64/narrow_high-intrinsics.c: Update uqxtn2 and sqxtn2
scan-assembler-times.


qmovn-hi.patch

Re: [PATCH] Add a new pattern in 4-insn combine

2021-01-14 Thread HAO CHEN GUI via Gcc-patches


Segher,

    Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-November/560573.html

Thanks a lot.

On 4/1/2021 10:03 AM, HAO CHEN GUI wrote:

Segher,

    Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-November/560573.html

Thanks a lot.


On 11/12/2020 10:14 AM, HAO CHEN GUI wrote:

Segher,

    Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-November/560573.html

Re: [PATCH] Add pytest for a GCOV test-case

On 1/13/21 2:38 PM, Rainer Orth wrote:

Hi Martin,

On 1/6/21 12:36 AM, Jeff Law wrote:

unresolved "could not find python interpreter $testcase" in
run-gcov-pytest if you find the right magic in the output of your spawn.

Achieved that with the updated patch.

Ready for master?

unfortunately, your patch has a large number of problems:

Hello.

Thank you for the investigation.

* On targets where run-gcov-pytest decides that pytest isn't available
(incorrectly in some cases), mail-report.log is cluttered with

UNRESOLVED: could not find Python interpreter and (or) pytest module for
pr98273.C

I fear you've been misled by David and Jeff here: UNRESOLVED isn't
appropriate for cases like this. Please read the DejaGnu manual for
the semantics of the various test outcomes. If anything (we often
just silently skip testcases that cannot be run on some target), use
UNSUPPORTED instead.

Shame on me, I misread what was suggested.

* Besides, the test outcomes are not generic message facilities but are
supposed to follow a common format:

: []

with the pathname to the test relative to (in this case)
gcc/testsuite. In this case, this might be something like

UNSUPPORTED: g++.dg/gcov/pr98273.C run-gcov-pytest

Currently, you don't have the pathname in run-gcov-pytest, though.

All right, now one will see:

UNSUPPORTED: g++.dg/gcov/pr98273.C run-gcov-pytest could not find Python
interpreter and (or) pytest module

* If we now have an (even optional) dependency on python/pytest, this
(with the exact versions and use) needs to be documented in
install.texi.

Done that.

* Speaking of documenting, the new run-gcov-pytest needs to be
documented in sourcebuild.texi.

Likewise here.

* On to the implementation: your test for the presence of pytest is
wrong:

set result [remote_exec host "pytest -m pytest --version"]

has nothing to do with what you actually use later: on all of Fedora
29, Ubuntu 20.04, and Solaris 11.4 (with a caveat) pytest is Python
2.7 based, but you don't check that. It is well possible that pytest
for 2.7 is installed, but pytest for Python 3.x isn't.

Besides, while Solaris 11.4 does bundle pytest, they don't deliver
pytest, but only py.test due to a conflict with a different pytest from
logilab-common, cf. https://github.com/pytest-dev/pytest/issues/1833.

This is immaterial, however, since what you actually run is

spawn -noecho python3 -m pytest --color=no -rA -s --tb=no
$srcdir/$subdir/$pytest_script

So you should just run python3 -m pytest --version instead to check
for the presence of the version you're going to use.

Btw., there's a mess with pytest on Fedora 29: running the above gives

I must confirm this is a mess. I definitely don't want to support Python 2, and
I think the best way would be to use 'env python3'; I hope it's portable enough.
@David: What do you think?

[...]
pluggy.PluginValidationError: Plugin 'benchmark' could not be loaded: (pytest
3.6.4 (/usr/lib/python3.7/site-packages), Requirement.parse('pytest>=3.8'))!

Seems the packagers have broken things there.

On top of all this, I wonder why you insist on a particular Python
version here: I tried your single testcase and it PASSes just as well
with Python 2.7!? One reason I'm asking is that Solaris 11.3 bundles
both Python 2.7 and 3.4, but (unlike Linux and Solaris 11.4) doesn't
have /usr/bin/python3, just python (which is 2.7), python2.7, and
python3.4. Not that it matters too much, but you should be aware of
the issue.

When running the test on Solaris 11.4 (with the bundled pytest 4.4.0),
I get

= test session starts ==
platform sunos5 -- Python 3.7.9, pytest-4.4.0, py-1.8.0, pluggy-0.9.0
rootdir: /vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov
collected 2 items

../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py
..

=== 2 passed in 0.04 seconds ===

while 4.6.9 on Linux gives

= test session starts ==
platform linux -- Python 3.8.2, pytest-4.6.9, py-1.8.1, pluggy-0.13.0
rootdir: /vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov
collected 2 items

../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py
..

=== short test summary info
PASSED
../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py::test_basics
PASSED
../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py::test_lines
=== 2 passed in 0.17 seconds ===

Obviously pytest -rA was introduced only after 4.4.0 and the 'A' is
silently ignored. Fortunately, I can ju

Re: [stage1][PATCH] Change semantics of -frecord-gcc-switches and add -frecord-gcc-switches-format.


PING^2

On 1/6/21 3:21 PM, Martin Liška wrote:

PING

On 12/4/20 2:30 PM, Martin Liška wrote:

On 12/4/20 10:03 AM, Richard Biener wrote:

Otherwise 0001- looks good to me.


Pushed that to master.


As said I'd like to see opinions from others on the
driver / backend communication for 0002.


To be honest, we moved back to the original implementation which used
a temporary file. There hasn't been any opinion for the last 8 months :(

Martin

Re: [PATCH v2] Add --ld-path= to specify an arbitrary executable as the linker


PING^3

On 1/6/21 3:22 PM, Martin Liška wrote:

PING^2

On 12/4/20 2:45 PM, Martin Liška wrote:

PING

May I please ping the patch, it's waiting here for a review
for quite some time.

Thanks,
Martin

On 7/23/20 12:17 PM, Martin Liška wrote:

On 7/21/20 6:07 AM, Fangrui Song wrote:

If the value does not contain any path component separator (e.g. a
slash), the linker will be searched for using COMPILER_PATH followed by
PATH. Otherwise, it is either an absolute path or a path relative to the
current working directory.

--ld-path= complements and overrides -fuse-ld={bfd,gold,lld}. If in the
future, we want to make different linker option decisions we can let
-fuse-ld= represent the linker flavor and --ld-path= the linker path.
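
The lookup rule in the first paragraph above can be sketched as a small classifier. This is a simplification for illustration only: the patch itself tests lbasename (ld_path) == ld_path, while the hypothetical classify_ld_path below only looks for a '/' and so does not cover other separators on non-POSIX hosts.

```c
#include <assert.h>
#include <string.h>

/* How a --ld-path= value would be resolved under the described rule.  */
enum ld_lookup
{
  LD_SEARCH_PATHS,  /* bare name: search COMPILER_PATH, then PATH */
  LD_USE_AS_GIVEN   /* has a separator: absolute or relative to the cwd */
};

/* Classify VALUE, the text following "--ld-path=".  Only '/' is treated
   as a path component separator in this sketch.  */
static enum ld_lookup
classify_ld_path (const char *value)
{
  assert (value != NULL);
  return strchr (value, '/') == NULL ? LD_SEARCH_PATHS : LD_USE_AS_GIVEN;
}
```

So --ld-path=ld.lld would be searched for, while --ld-path=./ld.lld or --ld-path=/usr/bin/ld.lld would be used as written.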


Hello.

I have just a few nits:

=== ERROR type #3: trailing operator (1 error(s)) ===
gcc/collect2.c:1155:14:    ld_file_name =



PR driver/93645
* common.opt (--ld-path=): Add --ld-path=
* opts.c (common_handle_option): Handle OPT__ld_path_.
* gcc.c (driver_handle_option): Likewise.
* collect2.c (main): Likewise.
* doc/invoke.texi: Document --ld-path=.

---
Changes in v2:
* Renamed -fld-path= to --ld-path= (clang 12.0.0 new option).
   The option does not affect code generation and is not a language feature,
   -f* is not suitable. Additionally, clang has other similar --*-path=
   options, e.g. --cuda-path=.
---
  gcc/collect2.c  | 63 +++--
  gcc/common.opt  |  4 +++
  gcc/doc/invoke.texi |  9 +++
  gcc/gcc.c   |  2 +-
  gcc/opts.c  |  1 +
  5 files changed, 64 insertions(+), 15 deletions(-)

diff --git a/gcc/collect2.c b/gcc/collect2.c
index f8a5ce45994..caa1b96ab52 100644
--- a/gcc/collect2.c
+++ b/gcc/collect2.c
@@ -844,6 +844,7 @@ main (int argc, char **argv)
    const char **ld1;
    bool use_plugin = false;
    bool use_collect_ld = false;
+  const char *ld_path = NULL;
    /* The kinds of symbols we will have to consider when scanning the
   outcome of a first pass link.  This is ALL to start with, then might
@@ -961,12 +962,21 @@ main (int argc, char **argv)
  if (selected_linker == USE_DEFAULT_LD)
    selected_linker = USE_PLUGIN_LD;
    }
-    else if (strcmp (argv[i], "-fuse-ld=bfd") == 0)
-  selected_linker = USE_BFD_LD;
-    else if (strcmp (argv[i], "-fuse-ld=gold") == 0)
-  selected_linker = USE_GOLD_LD;
-    else if (strcmp (argv[i], "-fuse-ld=lld") == 0)
-  selected_linker = USE_LLD_LD;
+    else if (strncmp (argv[i], "-fuse-ld=", 9) == 0
+ && selected_linker != USE_LD_MAX)
+  {
+    if (strcmp (argv[i] + 9, "bfd") == 0)
+  selected_linker = USE_BFD_LD;
+    else if (strcmp (argv[i] + 9, "gold") == 0)
+  selected_linker = USE_GOLD_LD;
+    else if (strcmp (argv[i] + 9, "lld") == 0)
+  selected_linker = USE_LLD_LD;
+  }
+    else if (strncmp (argv[i], "--ld-path=", 10) == 0)
+  {
+    ld_path = argv[i] + 10;
+    selected_linker = USE_LD_MAX;
+  }
  else if (strncmp (argv[i], "-o", 2) == 0)
    {
  /* Parse the output filename if it's given so that we can make
@@ -1117,14 +1127,34 @@ main (int argc, char **argv)
    ld_file_name = find_a_file (&cpath, collect_ld_suffix, X_OK);
    use_collect_ld = ld_file_name != 0;
  }
-  /* Search the compiler directories for `ld'.  We have protection against
- recursive calls in find_a_file.  */
-  if (ld_file_name == 0)
-    ld_file_name = find_a_file (&cpath, ld_suffixes[selected_linker], X_OK);
-  /* Search the ordinary system bin directories
- for `ld' (if native linking) or `TARGET-ld' (if cross).  */
-  if (ld_file_name == 0)
-    ld_file_name = find_a_file (&path, full_ld_suffixes[selected_linker], X_OK);
+  if (selected_linker == USE_LD_MAX)
+    {
+  /* If --ld-path= does not contain a path component separator, search for
+ the command using cpath, then using path.  Otherwise find the linker
+ relative to the current working directory.  */
+  if (lbasename (ld_path) == ld_path)
+    {
+  ld_file_name = find_a_file (&cpath, ld_path, X_OK);
+  if (ld_file_name == 0)
+    ld_file_name = find_a_file (&path, ld_path, X_OK);
+    }
+  else if (file_exists (ld_path))
+    {


^^^ these braces are not needed.


+  ld_file_name = ld_path;
+    }
+    }
+  else
+    {
+  /* Search the compiler directories for `ld'.  We have protection against
+ recursive calls in find_a_file.  */
+  if (ld_file_name == 0)


I would prefer '== NULL'.


+    ld_file_name = find_a_file (&cpath, ld_suffixes[selected_linker], X_OK);
+  /* Search the ordinary system bin directories
+ for `ld' (if native linking) or `TARGET-ld' (if cross).  */
+  if (ld_file_name == 0)
+    ld_file_name =
+  find_a_file (&path, full_ld_suffixes[selected_linker], X_OK);
+    }
  #ifdef REAL_NM_FILE_NAME
    nm_file_name = find_a_file (&path, REAL_NM_FILE_NAME, X_OK);
@@ -1461,6 +1

[PATCH] match.pd: Optimize ~(X >> Y) to ~X >> Y if ~X can be simplified [PR96688]

Hi!

This patch optimizes two GIMPLE operations into just one.
As mentioned in the PR, there is some risk this might create more expensive
constants, but sometimes it will make them on the other side less expensive,
it really depends on the exact value.
And if it is an important issue, we should do it in md or during expansion.
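
The identity behind the transformation can be checked in plain C. The sketch below uses hypothetical helper names and assumes that >> on a negative int is an arithmetic shift; that is implementation-defined in ISO C but is what GCC documents. The unsigned variant mirrors the second branch of the pattern: it applies only when the MSB of the operand is known to be clear, and it switches to an arithmetic shift in the signed type.

```c
#include <assert.h>
#include <limits.h>

/* Signed operand: the original form, ~(C >> Y).  */
static int
not_of_shift (int c, int y)
{
  assert (y >= 0 && y < (int) (sizeof (int) * CHAR_BIT));
  return ~(c >> y);
}

/* Signed operand: the rewritten form, ~C >> Y.  */
static int
shift_of_not (int c, int y)
{
  assert (y >= 0 && y < (int) (sizeof (int) * CHAR_BIT));
  return ~c >> y;
}

/* Unsigned operand: the original form with a logical shift.  */
static unsigned
not_of_logical_shift (unsigned c, int y)
{
  assert (y >= 0 && y < (int) (sizeof (unsigned) * CHAR_BIT));
  return ~(c >> y);
}

/* Unsigned operand, MSB clear: convert to signed, invert, shift
   arithmetically, convert back.  */
static unsigned
arith_shift_of_not (unsigned c, int y)
{
  assert (c <= (unsigned) INT_MAX);
  assert (y >= 0 && y < (int) (sizeof (int) * CHAR_BIT));
  return (unsigned) (~(int) c >> y);
}
```

With a constant operand such as 123, ~C folds to -124 at compile time, so only the shift remains at run time.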

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-01-13  Jakub Jelinek  

PR tree-optimization/96688
* match.pd (~(X >> Y) -> ~X >> Y): New simplification if
~X can be simplified.

* gcc.dg/tree-ssa/pr96688.c: New test.
* gcc.dg/tree-ssa/reassoc-37.c: Adjust scan-tree-dump regex.
* gcc.target/i386/pr66821.c: Likewise.

--- gcc/match.pd.jj	2021-01-13 15:27:13.843788907 +0100
+++ gcc/match.pd	2021-01-13 18:01:09.706568135 +0100
@@ -1109,6 +1109,18 @@ (define_operator_list COND_TERNARY
&& wi::to_wide (@1) != wi::min_value (TYPE_PRECISION (type),
  SIGNED))
 (minus (plus @1 { build_minus_one_cst (type); }) @0
+
+/* ~(X >> Y) -> ~X >> Y if ~X can be simplified.  */
+(simplify
+ (bit_not (rshift:s @0 @1))
+  (if (!TYPE_UNSIGNED (TREE_TYPE (@0)))
+   (rshift (bit_not! @0) @1)
+   /* For logical right shifts, this is possible only if @0 doesn't
+  have MSB set and the logical right shift is changed into
+  arithmetic shift.  */
+   (if (!wi::neg_p (tree_nonzero_bits (@0)))
+(with { tree stype = signed_type_for (TREE_TYPE (@0)); }
+ (convert (rshift (bit_not! (convert:stype @0)) @1))))))
 #endif
 
 /* x + (x & 1) -> (x + 1) & ~1 */
--- gcc/testsuite/gcc.dg/tree-ssa/pr96688.c.jj	2021-01-13 19:12:26.396363212 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr96688.c	2021-01-13 19:13:32.146628538 +0100
@@ -0,0 +1,24 @@
+/* PR tree-optimization/96688 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times " = -124 >> " 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " >> " 3 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " = ~" 1 "optimized" } } */
+
+int
+foo (int x)
+{
+  return ~(123 >> x);
+}
+
+unsigned
+bar (int x)
+{
+  return ~(123U >> x);
+}
+
+unsigned
+baz (int x)
+{
+  return ~(~123U >> x);
+}
--- gcc/testsuite/gcc.dg/tree-ssa/reassoc-37.c.jj	2020-01-12 11:54:37.609395365 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/reassoc-37.c	2021-01-14 10:04:03.100243196 +0100
@@ -12,5 +12,5 @@ foo (int x)
 }
 
 /* Check if the tests have been folded into a bit test.  */
-/* { dg-final { scan-tree-dump "(8784908|0x0*860c0c)" "optimized" { target i?86-*-* x86_64-*-* } } } */
+/* { dg-final { scan-tree-dump "(8784908|-8784909|0x0*860c0c)" "optimized" { target i?86-*-* x86_64-*-* } } } */
 /* { dg-final { scan-tree-dump "(<<|>>)" "optimized" { target i?86-*-* x86_64-*-* } } } */
--- gcc/testsuite/gcc.target/i386/pr66821.c.jj	2020-01-12 11:54:37.969389933 +0100
+++ gcc/testsuite/gcc.target/i386/pr66821.c	2021-01-14 10:04:41.210807013 +0100
@@ -11,5 +11,5 @@ foo (int x)
 }
 
 /* Check if the tests have been folded into a bit test.  */
-/* { dg-final { scan-tree-dump "(8784908|0x0*860c0c)" "optimized" } } */
+/* { dg-final { scan-tree-dump "(8784908|-8784909|0x0*860c0c)" "optimized" } } */
 /* { dg-final { scan-tree-dump "(<<|>>)" "optimized" } } */

Jakub

Re: [PATCH] PR target/96307: Fix KASAN option checking.

2021-01-14 Thread Kito Cheng via Gcc-patches

Is it OK for GCC 10? I just forgot to backport that...

On Fri, Nov 6, 2020 at 11:24 AM Kito Cheng  wrote:
>
> Committed, thanks :)
>
> On Fri, Nov 6, 2020 at 6:21 AM Jeff Law  wrote:
>
> >
> > On 10/16/20 3:01 AM, Martin Liška wrote:
> > > On 10/16/20 9:41 AM, Kito Cheng wrote:
> > >> I think it is still useful for other targets which are not supporting
> > >> libsanitizer yet, so in this patch I also moved related testcases
> > >> from gcc.target to gcc.dg.
> > >
> > > All right, I can't approve the patch, but I support it.
> >
> > Well, that's good enough for me :-)  Approved.
> >
> >
> > jeff
> >
> >
> >

Re: [PATCH] c-family: Improve MEM_REF printing for diagnostics [PR98597]

On Thu, 14 Jan 2021, Jakub Jelinek wrote:

> On Thu, Jan 14, 2021 at 09:28:31AM +0100, Jakub Jelinek via Gcc-patches wrote:
> > I'm afraid yes, because it is not a toplevel routine, but something called
> > from the c-family pretty-printers, so it can be in the middle of arbitrary
> > C/C++ expressions.  And printing
> > (3 * (access to a memory object of type 'int' at offset 12 bytes from 't') + 31) * 42
> > would be just weird.
> > 
> > > We could print, instead of *((int *)t + 3), "access to a memory
> > > object of type 'int' at offset 12 bytes from 't'", thus explain
> > > in plain english.
> > > 
> > > That said, *((int *)t + 3) is exactly what the access is,
> > 
> > *((int *)&t + 3) actually, the code I haven't touched has multiple bugs.
> > 
> > The user generally doesn't know the exact layout of the structures,
> > and especially with C++ templates it is extremely hard to figure that out,
> > so even when we could print verbose text it would be helpful to give a hint
> > (in your text something like (which falls into 't.u.b')).
> > I don't see how we can print both the MEM_REF type and TBAA type in a way
> > that would be understandable to the user.
> > 
> > Could we print
> > t.u.b
> > if the TBAA type is compatible with the type of the reference and perhaps
> > *(int*)&t.u.b
> > if it is incompatible?
> > From the aliasing perspective that is still different, but we don't print
> > the TBAA type anyway.

True.  As said we could simply add a GCC extension to write a MEM_REF
in source and print that syntax ... then it would be valid (GCC) C/C++.

> There is another option I forgot about, but perhaps it is too verbose.
> Print
> *(int*)((char*)&t + offsetof (struct T, u.b))

or rather offsetof (struct T, u) to not single out a specific union
member?

Richard.

> so like
> *(int*)((char*)&t + 12)
> but print the offset in a more user-friendly way.

Re: [PATCH v2] Add --ld-path= to specify an arbitrary executable as the linker

On Thu, 14 Jan 2021, Martin Liška wrote:

> PING^3

I see no particular reason to allow arbitrary garbage to be used as
linker.  It just asks for users to shoot themselves in the foot and
for strange bugreports to pop up.

Richard.

> On 1/6/21 3:22 PM, Martin Liška wrote:
> > PING^2
> > 
> > On 12/4/20 2:45 PM, Martin Liška wrote:
> >> PING
> >>
> >> May I please ping the patch, it's waiting here for a review
> >> for quite some time.
> >>
> >> Thanks,
> >> Martin
> >>
> >> On 7/23/20 12:17 PM, Martin Liška wrote:
> >>> On 7/21/20 6:07 AM, Fangrui Song wrote:
>  If the value does not contain any path component separator (e.g. a
>  slash), the linker will be searched for using COMPILER_PATH followed by
>  PATH. Otherwise, it is either an absolute path or a path relative to the
>  current working directory.
> 
>  --ld-path= complements and overrides -fuse-ld={bfd,gold,lld}. If in the
>  future, we want to make different linker option decisions we can let
>  -fuse-ld= represent the linker flavor and --ld-path= the linker path.
> >>>
> >>> Hello.
> >>>
> >>> I have just a few nits:
> >>>
> >>> === ERROR type #3: trailing operator (1 error(s)) ===
> >>> gcc/collect2.c:1155:14:    ld_file_name =
> >>>
> 
>  PR driver/93645
>  * common.opt (--ld-path=): Add --ld-path=
>  * opts.c (common_handle_option): Handle OPT__ld_path_.
>  * gcc.c (driver_handle_option): Likewise.
>  * collect2.c (main): Likewise.
>  * doc/invoke.texi: Document --ld-path=.
> 
>  ---
>  Changes in v2:
>  * Renamed -fld-path= to --ld-path= (clang 12.0.0 new option).
>     The option does not affect code generation and is not a language
>  feature,
>     -f* is not suitable. Additionally, clang has other similar --*-path=
>     options, e.g. --cuda-path=.
>  ---
>    gcc/collect2.c  | 63 +++--
>    gcc/common.opt  |  4 +++
>    gcc/doc/invoke.texi |  9 +++
>    gcc/gcc.c   |  2 +-
>    gcc/opts.c  |  1 +
>    5 files changed, 64 insertions(+), 15 deletions(-)
> 
>  diff --git a/gcc/collect2.c b/gcc/collect2.c
>  index f8a5ce45994..caa1b96ab52 100644
>  --- a/gcc/collect2.c
>  +++ b/gcc/collect2.c
>  @@ -844,6 +844,7 @@ main (int argc, char **argv)
>      const char **ld1;
>      bool use_plugin = false;
>      bool use_collect_ld = false;
>  +  const char *ld_path = NULL;
>      /* The kinds of symbols we will have to consider when scanning the
>     outcome of a first pass link.  This is ALL to start with, then
>  might
>  @@ -961,12 +962,21 @@ main (int argc, char **argv)
>    if (selected_linker == USE_DEFAULT_LD)
>      selected_linker = USE_PLUGIN_LD;
>      }
>  -    else if (strcmp (argv[i], "-fuse-ld=bfd") == 0)
>  -  selected_linker = USE_BFD_LD;
>  -    else if (strcmp (argv[i], "-fuse-ld=gold") == 0)
>  -  selected_linker = USE_GOLD_LD;
>  -    else if (strcmp (argv[i], "-fuse-ld=lld") == 0)
>  -  selected_linker = USE_LLD_LD;
>  +    else if (strncmp (argv[i], "-fuse-ld=", 9) == 0
>  + && selected_linker != USE_LD_MAX)
>  +  {
>  +    if (strcmp (argv[i] + 9, "bfd") == 0)
>  +  selected_linker = USE_BFD_LD;
>  +    else if (strcmp (argv[i] + 9, "gold") == 0)
>  +  selected_linker = USE_GOLD_LD;
>  +    else if (strcmp (argv[i] + 9, "lld") == 0)
>  +  selected_linker = USE_LLD_LD;
>  +  }
>  +    else if (strncmp (argv[i], "--ld-path=", 10) == 0)
>  +  {
>  +    ld_path = argv[i] + 10;
>  +    selected_linker = USE_LD_MAX;
>  +  }
>    else if (strncmp (argv[i], "-o", 2) == 0)
>      {
>    /* Parse the output filename if it's given so that we can make
>  @@ -1117,14 +1127,34 @@ main (int argc, char **argv)
>      ld_file_name = find_a_file (&cpath, collect_ld_suffix, X_OK);
>      use_collect_ld = ld_file_name != 0;
>    }
>  -  /* Search the compiler directories for `ld'.  We have protection
>  against
>  - recursive calls in find_a_file.  */
>  -  if (ld_file_name == 0)
>  -    ld_file_name = find_a_file (&cpath, ld_suffixes[selected_linker],
>  X_OK);
>  -  /* Search the ordinary system bin directories
>  - for `ld' (if native linking) or `TARGET-ld' (if cross).  */
>  -  if (ld_file_name == 0)
>  -    ld_file_name = find_a_file (&path,
>  full_ld_suffixes[selected_linker], X_OK);
>  +  if (selected_linker == USE_LD_MAX)
>  +    {
>  +  /* If --ld-path= does not contain a path component separator,
>  search for
>  + the command using cpath, then using path.  Otherwise find the
>  linker
>  + relative to the current working dire

Re: [PATCH] c-family: Improve MEM_REF printing for diagnostics [PR98597]

On Thu, Jan 14, 2021 at 11:05:40AM +0100, Richard Biener wrote:
> > > Could we print
> > > t.u.b
> > > if the TBAA type is compatible with the type of the reference and perhaps
> > > *(int*)&t.u.b
> > > if it is incompatible?
> > > From the aliasing perspective that is still different, but we don't print
> > > the TBAA type anyway.
> 
> True.  As said we could simply add a GCC extension to write a MEM_REF
> in source and print that syntax ... then it would be valid (GCC) C/C++.

But even if we do that unless people are familiar with that extension they
wouldn't know what it means (and they didn't write it in that way in their
source).

> > There is another option I forgot about, but perhaps it is too verbose.
> > Print
> > *(int*)((char*)&t + offsetof (struct T, u.b))
> 
> or rather offsetof (struct T, u) to not single out a specific union
> member?

Sure, I can just get rid of the UNION_TYPE handling from the function,
or use it only if the TBAA access type is compatible.
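
The two offsetof spellings discussed here can be compared directly. In the sketch below struct T is once more a hypothetical layout and offset_within_union is an illustrative helper. Direct members of the union all start at the union's own offset, so naming one of them adds nothing; a member nested more deeply (like the thread's t.u.e.i) does carry an extra offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout, assumed only for this illustration.  */
struct T
{
  int a[3];
  union
  {
    int b;
    float f;
    struct { int x[3]; int i; } e;
  } u;
};

/* Distance in bytes from the start of t.u to MEMBER_OFFSET, where
   MEMBER_OFFSET is an offsetof value for something inside t.u.  */
static size_t
offset_within_union (size_t member_offset)
{
  assert (member_offset >= offsetof (struct T, u));
  return member_offset - offsetof (struct T, u);
}
```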

Jakub

Re: [PATCH] match.pd: Optimize ~(X >> Y) to ~X >> Y if ~X can be simplified [PR96688]

On Thu, 14 Jan 2021, Jakub Jelinek wrote:

> Hi!
> 
> This patch optimizes two GIMPLE operations into just one.
> As mentioned in the PR, there is some risk this might create more expensive
> constants, but sometimes it will make them on the other side less expensive,
> it really depends on the exact value.
> And if it is an important issue, we should do it in md or during expansion.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2021-01-13  Jakub Jelinek  
> 
>   PR tree-optimization/96688
>   * match.pd (~(X >> Y) -> ~X >> Y): New simplification if
>   ~X can be simplified.
> 
>   * gcc.dg/tree-ssa/pr96688.c: New test.
>   * gcc.dg/tree-ssa/reassoc-37.c: Adjust scan-tree-dump regex.
>   * gcc.target/i386/pr66821.c: Likewise.
> 
> --- gcc/match.pd.jj   2021-01-13 15:27:13.843788907 +0100
> +++ gcc/match.pd  2021-01-13 18:01:09.706568135 +0100
> @@ -1109,6 +1109,18 @@ (define_operator_list COND_TERNARY
>   && wi::to_wide (@1) != wi::min_value (TYPE_PRECISION (type),
> SIGNED))
>  (minus (plus @1 { build_minus_one_cst (type); }) @0
> +
> +/* ~(X >> Y) -> ~X >> Y if ~X can be simplified.  */
> +(simplify
> + (bit_not (rshift:s @0 @1))
> +  (if (!TYPE_UNSIGNED (TREE_TYPE (@0)))
> +   (rshift (bit_not! @0) @1)
> +   /* For logical right shifts, this is possible only if @0 doesn't
> +  have MSB set and the logical right shift is changed into
> +  arithmetic shift.  */
> +   (if (!wi::neg_p (tree_nonzero_bits (@0)))
> +(with { tree stype = signed_type_for (TREE_TYPE (@0)); }
> + (convert (rshift (bit_not! (convert:stype @0)) @1))
>  #endif
>  
>  /* x + (x & 1) -> (x + 1) & ~1 */
> --- gcc/testsuite/gcc.dg/tree-ssa/pr96688.c.jj2021-01-13 
> 19:12:26.396363212 +0100
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr96688.c   2021-01-13 19:13:32.146628538 
> +0100
> @@ -0,0 +1,24 @@
> +/* PR tree-optimization/96688 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times " = -124 >> " 2 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " >> " 3 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " = ~" 1 "optimized" } } */
> +
> +int
> +foo (int x)
> +{
> +  return ~(123 >> x);
> +}
> +
> +unsigned
> +bar (int x)
> +{
> +  return ~(123U >> x);
> +}
> +
> +unsigned
> +baz (int x)
> +{
> +  return ~(~123U >> x);
> +}
> --- gcc/testsuite/gcc.dg/tree-ssa/reassoc-37.c.jj 2020-01-12 11:54:37.609395365 +0100
> +++ gcc/testsuite/gcc.dg/tree-ssa/reassoc-37.c  2021-01-14 10:04:03.100243196 +0100
> @@ -12,5 +12,5 @@ foo (int x)
>  }
>  
>  /* Check if the tests have been folded into a bit test.  */
> -/* { dg-final { scan-tree-dump "(8784908|0x0*860c0c)" "optimized" { target i?86-*-* x86_64-*-* } } } */
> +/* { dg-final { scan-tree-dump "(8784908|-8784909|0x0*860c0c)" "optimized" { target i?86-*-* x86_64-*-* } } } */
>  /* { dg-final { scan-tree-dump "(<<|>>)" "optimized" { target i?86-*-* x86_64-*-* } } } */
> --- gcc/testsuite/gcc.target/i386/pr66821.c.jj  2020-01-12 11:54:37.969389933 +0100
> +++ gcc/testsuite/gcc.target/i386/pr66821.c   2021-01-14 10:04:41.210807013 +0100
> @@ -11,5 +11,5 @@ foo (int x)
>  }
>  
>  /* Check if the tests have been folded into a bit test.  */
> -/* { dg-final { scan-tree-dump "(8784908|0x0*860c0c)" "optimized" } } */
> +/* { dg-final { scan-tree-dump "(8784908|-8784909|0x0*860c0c)" "optimized" } } */
>  /* { dg-final { scan-tree-dump "(<<|>>)" "optimized" } } */
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
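
For readers following the match.pd change above, the identity behind
it can be checked in plain C.  This is only a minimal sketch and not
part of the patch: the function names are made up for illustration,
and it assumes that >> on a negative signed int is an arithmetic
shift, which is implementation-defined in ISO C but is what GCC does.

```c
#include <assert.h>
#include <limits.h>

/* Original form: shift first, then invert.  */
static int
not_of_shift (int x, int y)
{
  return ~(x >> y);
}

/* Rewritten form: invert first, then shift.  Assumes >> on a negative
   int is an arithmetic shift (true for GCC).  */
static int
shift_of_not (int x, int y)
{
  return ~x >> y;
}

/* Unsigned variant: the logical shift becomes an arithmetic shift on
   the signed type.  Only valid when the MSB of X is clear, which is
   what the tree_nonzero_bits check in the pattern guards.  */
static unsigned
shift_of_not_unsigned (unsigned x, int y)
{
  assert ((x >> (sizeof x * CHAR_BIT - 1)) == 0);
  return (unsigned) (~(int) x >> y);
}

/* Return 1 if both signed forms agree for every valid shift count.  */
static int
forms_agree (int x)
{
  for (int y = 0; y < (int) (sizeof x * CHAR_BIT); y++)
    if (not_of_shift (x, y) != shift_of_not (x, y))
      return 0;
  return 1;
}
```

With a constant such as 123 the rewritten form lets ~123 fold to -124
at compile time, which is why the new test scans for " = -124 >> ".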

Re: [PATCH] c-family: Improve MEM_REF printing for diagnostics [PR98597]

On Thu, 14 Jan 2021, Jakub Jelinek wrote:

> On Thu, Jan 14, 2021 at 11:05:40AM +0100, Richard Biener wrote:
> > > > Could we print
> > > > t.u.b
> > > > if the TBAA type is compatible with the type of the reference and perhaps
> > > > *(int*)&t.u.b
> > > > if it is incompatible?
> > > > From the aliasing perspective that is still different, but we don't print
> > > > the TBAA type anyway.
> > 
> > True.  As said we could simply add a GCC extension to write a MEM_REF
> > in source and print that syntax ... then it would be valid (GCC) C/C++.
> 
> But even if we do that unless people are familiar with that extension they
> wouldn't know what it means (and they didn't write it in that way in their
> source).

I'd just use it in case we can't express it in the C/C++ way (thus when
the TBAA type differs).  We already print {ref-all} in type dumping
for example, which isn't C/C++ either.

> > > There is another option I forgot about, but perhaps it is too verbose.
> > > Print
> > > *(int*)((char*)&t + offsetof (struct T, u.b))
> > 
> > or rather offsetof (struct T, u) to not single out a specific union
> > member?
> 
> Sure, I can just get rid of the UNION_TYPE handling from the function,
> or use it only if the TBAA access type is compatible.

Sure.

I wonder if we can use some pp flags to tell whether we're being
called from FE or middle-end diagnostics and thus whether we
should try to produce source-like expressions or can expect
weird GIMPLE IL derived expressions.

Richard.
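
To make the offsetof spelling discussed above concrete, here is a
minimal sketch in C.  The struct layout is hypothetical (it is not the
PR98597 testcase, which is not shown in this thread) and the helper
names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout, chosen only so that u.b sits at a non-zero
   offset inside struct T.  */
struct E { int i; };
struct T
{
  int a[3];
  union { int b; struct E e; } u;
  int v;
};

/* The verbose form from the thread:
   *(int*)((char*)&t + offsetof (struct T, u.b)).  */
static int
read_via_member_offset (struct T *t)
{
  return *(int *) ((char *) t + offsetof (struct T, u.b));
}

/* The variant that names only the union, so that no particular union
   member is singled out.  Every union member starts at offset 0 of
   the union, so the address is the same.  */
static int
read_via_union_offset (struct T *t)
{
  return *(int *) ((char *) t + offsetof (struct T, u));
}
```

Both helpers read the same int; the difference is only in how much of
the layout the printed expression claims to know.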

[PATCH]middle-end slp: elide intermediate nodes for complex add and avoid truncate

2021-01-14 Thread Tamar Christina via Gcc-patches

Hi All,

This applies the same feedback received for MUL and the rest to
ADD, which was already committed.  In short, it elides the intermediate
nodes vec and avoids the use of truncate on the SLP child.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* tree-vect-slp-patterns.c (complex_add_pattern::build):

--- inline copy of patch -- 
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index be066b08310b72320fdbeb88a6b2969151f73cdc..e9f70958fdc32427ab0e1cceadfed41dfa091b47 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -645,23 +645,21 @@ class complex_add_pattern : public complex_pattern
 void
 complex_add_pattern::build (vec_info *vinfo)
 {
-  auto_vec nodes;
+  SLP_TREE_CHILDREN (*this->m_node).reserve_exact (2);
+
   slp_tree node = this->m_ops[0];
   vec children = SLP_TREE_CHILDREN (node);
 
   /* First re-arrange the children.  */
-  nodes.create (children.length ());
-  nodes.quick_push (children[0]);
-  nodes.quick_push (vect_build_swap_evenodd_node (children[1]));
+  SLP_TREE_CHILDREN (*this->m_node)[0] = children[0];
+  SLP_TREE_CHILDREN (*this->m_node)[1] =
+vect_build_swap_evenodd_node (children[1]);
 
-  SLP_TREE_REF_COUNT (nodes[0])++;
-  SLP_TREE_REF_COUNT (nodes[1])++;
+  SLP_TREE_REF_COUNT (SLP_TREE_CHILDREN (*this->m_node)[0])++;
+  SLP_TREE_REF_COUNT (SLP_TREE_CHILDREN (*this->m_node)[1])++;
   vect_free_slp_tree (this->m_ops[0]);
   vect_free_slp_tree (this->m_ops[1]);
 
-  SLP_TREE_CHILDREN (*this->m_node).truncate (0);
-  SLP_TREE_CHILDREN (*this->m_node).safe_splice (nodes);
-
   complex_pattern::build (vinfo);
 }
 



[wwwdocs] Add "porting to" notes for libstdc++ in GCC 11

Pushed to wwwdocs.

commit b1448ab2ec847fd9a8283881f620d3ace0aea8ed
Author: Jonathan Wakely 
Date:   Thu Jan 14 10:40:47 2021 +

Add "porting to" notes for libstdc++ in GCC 11

diff --git a/htdocs/gcc-11/porting_to.html b/htdocs/gcc-11/porting_to.html
index 4187dd8e..83227c74 100644
--- a/htdocs/gcc-11/porting_to.html
+++ b/htdocs/gcc-11/porting_to.html
@@ -141,6 +141,35 @@ change the code to not include the 
 header,
 so that only std::bind is declared.
 
 
+Enable multithreading to use std::thread
+
+Programs must be linked to libpthread in order for std::thread
+to create new threads of execution.
+It is not sufficient to use dlopen to dynamically load
+libpthread.so at run-time.
+
+
+Do not undefine __STRICT_ANSI__
+
+The __STRICT_ANSI__ macro is defined by the compiler to
+inform the C and C++ standard library headers when a strict language dialect
+is being used, e.g. -std=c++17 or -std=c11 rather
+than -std=gnu++17 or -std=gnu11.
+
+
+If you undefine the __STRICT_ANSI__ macro then you create an
+inconsistent state where the compiler is using a strict dialect but the
+standard library headers think that GNU extensions are enabled.
+The libstdc++ headers in GCC 11 cannot be used in this state and are likely
+to produce compilation errors.
+
+
+If you don't want the macro to be defined, don't use a -std
+option that causes it to be defined.
+Simply use a -std=gnu++NN option instead of
+-std=c++NN.
+
+

[wwwdocs] Document addition of C++ header in C++11

Pushed to wwwdocs.

commit 768ecbe7af00a219cfdff9fba96874a0fe9c94bb
Author: Jonathan Wakely 
Date:   Thu Jan 14 10:41:43 2021 +

Document addition of  C++ header in C++11

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index e044d710..3527428f 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -307,7 +307,8 @@ a work-in-progress.
   std::bit_cast
   std::source_location
   Atomic wait and notify operations.
-   and 
+  , ,
+and 
   
   Efficient access to basic_stringbuf's buffer.

Re: [wwwdocs] Document addition of C++ header in C++11


On 14/01/21 10:42 +, Jonathan Wakely wrote:

Pushed to wwwdocs.




commit 768ecbe7af00a219cfdff9fba96874a0fe9c94bb
Author: Jonathan Wakely 
Date:   Thu Jan 14 10:41:43 2021 +

   Document addition of  C++ header in C++11


Doh, that was meant to say GCC 11 not C++11.

[PATCH] i386: Fix the pmovzx SSE4.1 define_insn_and_split patterns [PR98670]

Hi!

I've made two mistakes in the *sse4_1_zero_extend* define_insn_and_split
patterns.  One is that when it uses vector_operand, it should use Bm rather
than m constraint, and the other one is that because it is a post-reload
splitter it needs isa attribute to select which alternatives are valid for
which ISAs.  Sorry for messing this up.

Ok for trunk if it passes bootstrap/regtest?

2021-01-14  Jakub Jelinek  

PR target/98670
* config/i386/sse.md (*sse4_1_zero_extendv8qiv8hi2_3,
*sse4_1_zero_extendv4hiv4si2_3, *sse4_1_zero_extendv2siv2di2_3):
Use Bm instead of m for non-avx.  Add isa attribute.

* gcc.target/i386/pr98670.c: New test.

--- gcc/config/i386/sse.md.jj   2021-01-13 11:36:27.07291 +0100
+++ gcc/config/i386/sse.md  2021-01-14 10:30:26.952146198 +0100
@@ -17721,7 +17721,7 @@ (define_insn_and_split "*sse4_1_zero_ext
   [(set (match_operand:V16QI 0 "register_operand" "=Yr,*x,v")
(vec_select:V16QI
  (vec_concat:V32QI
-   (match_operand:V16QI 1 "vector_operand" "Yrm,*xm,vm")
+   (match_operand:V16QI 1 "vector_operand" "YrBm,*xBm,vm")
(match_operand:V16QI 2 "const0_operand" "C,C,C"))
  (match_parallel 3 "pmovzx_parallel"
[(match_operand 4 "const_int_operand" "n,n,n")])))]
@@ -17745,7 +17745,8 @@ (define_insn_and_split "*sse4_1_zero_ext
   emit_insn (gen_rtx_SET (operands[0], operands[1]));
   DONE;
 }
-})
+}
+  [(set_attr "isa" "noavx,noavx,avx")])
 
 (define_expand "v8qiv8hi2"
   [(set (match_operand:V8HI 0 "register_operand")
@@ -18031,7 +18032,7 @@ (define_insn_and_split "*sse4_1_zero_ext
   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
(vec_select:V8HI
  (vec_concat:V16HI
-   (match_operand:V8HI 1 "vector_operand" "Yrm,*xm,vm")
+   (match_operand:V8HI 1 "vector_operand" "YrBm,*xBm,vm")
(match_operand:V8HI 2 "const0_operand" "C,C,C"))
  (match_parallel 3 "pmovzx_parallel"
[(match_operand 4 "const_int_operand" "n,n,n")])))]
@@ -18053,7 +18054,8 @@ (define_insn_and_split "*sse4_1_zero_ext
   emit_insn (gen_rtx_SET (operands[0], operands[1]));
   DONE;
 }
-})
+}
+  [(set_attr "isa" "noavx,noavx,avx")])
 
 (define_insn "avx512f_v8qiv8di2"
   [(set (match_operand:V8DI 0 "register_operand" "=v")
@@ -18447,7 +18449,7 @@ (define_insn_and_split "*sse4_1_zero_ext
   [(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v")
(vec_select:V4SI
  (vec_concat:V8SI
-   (match_operand:V4SI 1 "vector_operand" "Yrm,*xm,vm")
+   (match_operand:V4SI 1 "vector_operand" "YrBm,*xBm,vm")
(match_operand:V4SI 2 "const0_operand" "C,C,C"))
  (match_parallel 3 "pmovzx_parallel"
[(match_operand 4 "const_int_operand" "n,n,n")])))]
@@ -18467,7 +18469,8 @@ (define_insn_and_split "*sse4_1_zero_ext
   emit_insn (gen_rtx_SET (operands[0], operands[1]));
   DONE;
 }
-})
+}
+  [(set_attr "isa" "noavx,noavx,avx")])
 
 (define_expand "v2siv2di2"
   [(set (match_operand:V2DI 0 "register_operand")
--- gcc/testsuite/gcc.target/i386/pr98670.c.jj  2021-01-14 10:40:37.208180135 +0100
+++ gcc/testsuite/gcc.target/i386/pr98670.c 2021-01-14 10:40:07.340521064 +0100
@@ -0,0 +1,16 @@
+/* PR target/98670 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4.1" } */
+
+#include 
+
+void foo (__m128i);
+int a[6];
+
+void
+bar (void)
+{
+  __m128i d = *(__m128i *) (a + 2);
+  __m128i e = _mm_unpacklo_epi16 (d, (__m128i) {});
+  foo (e);
+}

Jakub

Re: [PATCH] i386: Fix the pmovzx SSE4.1 define_insn_and_split patterns [PR98670]

On Thu, Jan 14, 2021 at 11:44 AM Jakub Jelinek  wrote:
>
> Hi!
>
> I've made two mistakes in the *sse4_1_zero_extend* define_insn_and_split
> patterns.  One is that when it uses vector_operand, it should use Bm rather
> than m constraint, and the other one is that because it is a post-reload
> splitter it needs isa attribute to select which alternatives are valid for
> which ISAs.  Sorry for messing this up.
>
> Ok for trunk if it passes bootstrap/regtest?
>
> 2021-01-14  Jakub Jelinek  
>
> PR target/98670
> * config/i386/sse.md (*sse4_1_zero_extendv8qiv8hi2_3,
> *sse4_1_zero_extendv4hiv4si2_3, *sse4_1_zero_extendv2siv2di2_3):
> Use Bm instead of m for non-avx.  Add isa attribute.
>
> * gcc.target/i386/pr98670.c: New test.

OK.

Thanks,
Uros.

>
> --- gcc/config/i386/sse.md.jj   2021-01-13 11:36:27.07291 +0100
> +++ gcc/config/i386/sse.md  2021-01-14 10:30:26.952146198 +0100
> @@ -17721,7 +17721,7 @@ (define_insn_and_split "*sse4_1_zero_ext
>[(set (match_operand:V16QI 0 "register_operand" "=Yr,*x,v")
> (vec_select:V16QI
>   (vec_concat:V32QI
> -   (match_operand:V16QI 1 "vector_operand" "Yrm,*xm,vm")
> +   (match_operand:V16QI 1 "vector_operand" "YrBm,*xBm,vm")
> (match_operand:V16QI 2 "const0_operand" "C,C,C"))
>   (match_parallel 3 "pmovzx_parallel"
> [(match_operand 4 "const_int_operand" "n,n,n")])))]
> @@ -17745,7 +17745,8 @@ (define_insn_and_split "*sse4_1_zero_ext
>emit_insn (gen_rtx_SET (operands[0], operands[1]));
>DONE;
>  }
> -})
> +}
> +  [(set_attr "isa" "noavx,noavx,avx")])
>
>  (define_expand "v8qiv8hi2"
>[(set (match_operand:V8HI 0 "register_operand")
> @@ -18031,7 +18032,7 @@ (define_insn_and_split "*sse4_1_zero_ext
>[(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> (vec_select:V8HI
>   (vec_concat:V16HI
> -   (match_operand:V8HI 1 "vector_operand" "Yrm,*xm,vm")
> +   (match_operand:V8HI 1 "vector_operand" "YrBm,*xBm,vm")
> (match_operand:V8HI 2 "const0_operand" "C,C,C"))
>   (match_parallel 3 "pmovzx_parallel"
> [(match_operand 4 "const_int_operand" "n,n,n")])))]
> @@ -18053,7 +18054,8 @@ (define_insn_and_split "*sse4_1_zero_ext
>emit_insn (gen_rtx_SET (operands[0], operands[1]));
>DONE;
>  }
> -})
> +}
> +  [(set_attr "isa" "noavx,noavx,avx")])
>
>  (define_insn "avx512f_v8qiv8di2"
>[(set (match_operand:V8DI 0 "register_operand" "=v")
> @@ -18447,7 +18449,7 @@ (define_insn_and_split "*sse4_1_zero_ext
>[(set (match_operand:V4SI 0 "register_operand" "=Yr,*x,v")
> (vec_select:V4SI
>   (vec_concat:V8SI
> -   (match_operand:V4SI 1 "vector_operand" "Yrm,*xm,vm")
> +   (match_operand:V4SI 1 "vector_operand" "YrBm,*xBm,vm")
> (match_operand:V4SI 2 "const0_operand" "C,C,C"))
>   (match_parallel 3 "pmovzx_parallel"
> [(match_operand 4 "const_int_operand" "n,n,n")])))]
> @@ -18467,7 +18469,8 @@ (define_insn_and_split "*sse4_1_zero_ext
>emit_insn (gen_rtx_SET (operands[0], operands[1]));
>DONE;
>  }
> -})
> +}
> +  [(set_attr "isa" "noavx,noavx,avx")])
>
>  (define_expand "v2siv2di2"
>[(set (match_operand:V2DI 0 "register_operand")
> --- gcc/testsuite/gcc.target/i386/pr98670.c.jj  2021-01-14 10:40:37.208180135 +0100
> +++ gcc/testsuite/gcc.target/i386/pr98670.c 2021-01-14 10:40:07.340521064 +0100
> @@ -0,0 +1,16 @@
> +/* PR target/98670 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse4.1" } */
> +
> +#include 
> +
> +void foo (__m128i);
> +int a[6];
> +
> +void
> +bar (void)
> +{
> +  __m128i d = *(__m128i *) (a + 2);
> +  __m128i e = _mm_unpacklo_epi16 (d, (__m128i) {});
> +  foo (e);
> +}
>
> Jakub
>

[PATCH] vect: Account for unused IFN_LOAD_LANES results

2021-01-14 Thread Richard Sandiford via Gcc-patches

At the moment, if we use only one vector of an LD4 result,
we'll treat the LD4 as having the cost of a single load.
But all 4 loads and any associated permutes take place
regardless of which results are actually used.

This patch therefore counts the cost of unused LOAD_LANES
results against the first statement in a group.  An alternative
would be to multiply the ncopies of the first stmt by the group
size and treat other stmts in the group as having zero cost,
but I thought that might be more surprising when reading dumps.

Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.
OK to install?

Richard


gcc/
* tree-vect-stmts.c (vect_model_load_cost): Account for unused
IFN_LOAD_LANES results.

gcc/testsuite/
* gcc.target/aarch64/sve/cost_model_11.c: New test.
* gcc.target/aarch64/sve/mask_struct_load_5.c: Use
-fno-vect-cost-model.
---
 .../gcc.target/aarch64/sve/cost_model_11.c| 12 ++
 .../aarch64/sve/mask_struct_load_5.c  |  2 +-
 gcc/tree-vect-stmts.c | 24 +++
 3 files changed, 37 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cost_model_11.c

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cost_model_11.c b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_11.c
new file mode 100644
index 000..d9f4ccc76de
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_11.c
@@ -0,0 +1,12 @@
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=128" } */
+
+long
+f (long *x, long *y, long *z, long n)
+{
+  long res = 0;
+  for (long i = 0; i < n; ++i)
+z[i] = x[i * 4] + y[i * 4];
+  return res;
+}
+
+/* { dg-final { scan-assembler-not {\tld4d\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_5.c b/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_5.c
index da367e4fd79..2a33ee81d1a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_5.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_5.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -ffast-math --param aarch64-sve-compare-costs=0" } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math -fno-vect-cost-model" } */
 
 #include 
 
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 068e4982303..4d72c4db2f7 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1120,6 +1120,30 @@ vect_model_load_cost (vec_info *vinfo,
  once per group anyhow.  */
   bool first_stmt_p = (first_stmt_info == stmt_info);
 
+  /* An IFN_LOAD_LANES will load all its vector results, regardless of which
+ ones we actually need.  Account for the cost of unused results.  */
+  if (first_stmt_p && !slp_node && memory_access_type == VMAT_LOAD_STORE_LANES)
+{
+  unsigned int gaps = DR_GROUP_SIZE (first_stmt_info);
+  stmt_vec_info next_stmt_info = first_stmt_info;
+  do
+   {
+ gaps -= 1;
+ next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
+   }
+  while (next_stmt_info);
+  if (gaps)
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_NOTE, vect_location,
+"vect_model_load_cost: %d unused vectors.\n",
+gaps);
+ vect_get_load_cost (vinfo, stmt_info, ncopies * gaps, false,
+ &inside_cost, &prologue_cost,
+ cost_vec, cost_vec, true);
+   }
+}
+
   /* We assume that the cost of a single load-lanes instruction is
  equivalent to the cost of DR_GROUP_SIZE separate loads.  If a grouped
  access is instead being provided by a load-and-permute operation,
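
The gap counting in the hunk above can be sketched outside the
vectorizer as follows.  This is a simplified stand-alone model, not
GCC code: struct group_stmt, unused_lanes and extra_load_copies are
hypothetical names standing in for stmt_vec_info, the DR_GROUP_SIZE /
DR_GROUP_NEXT_ELEMENT walk, and the ncopies * gaps argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one statement in an interleaved load
   group; NEXT plays the role of DR_GROUP_NEXT_ELEMENT.  */
struct group_stmt
{
  struct group_stmt *next;
};

/* Number of vectors a load-lanes instruction produces that no
   statement in the group uses: the group size minus the number of
   statements actually present in the chain starting at FIRST.  */
static unsigned
unused_lanes (unsigned group_size, const struct group_stmt *first)
{
  unsigned gaps = group_size;
  assert (first != NULL);
  do
    {
      gaps -= 1;
      first = first->next;
    }
  while (first);
  return gaps;
}

/* Extra vector loads to charge to the first statement.  */
static unsigned
extra_load_copies (unsigned ncopies, unsigned gaps)
{
  return ncopies * gaps;
}
```

For the x[i * 4] access in cost_model_11.c the group size is 4 but
only one element is used, so under this model three unused vectors
would be costed per copy.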

Re: [PATCH] vect: Account for unused IFN_LOAD_LANES results

On Thu, 14 Jan 2021, Richard Sandiford wrote:

> At the moment, if we use only one vector of an LD4 result,
> we'll treat the LD4 as having the cost of a single load.
> But all 4 loads and any associated permutes take place
> regardless of which results are actually used.
> 
> This patch therefore counts the cost of unused LOAD_LANES
> results against the first statement in a group.  An alternative
> would be to multiply the ncopies of the first stmt by the group
> size and treat other stmts in the group as having zero cost,
> but I thought that might be more surprising when reading dumps.
> 
> Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.
> OK to install?

OK.

Richard.

> Richard
> 
> 
> gcc/
>   * tree-vect-stmts.c (vect_model_load_cost): Account for unused
>   IFN_LOAD_LANES results.
> 
> gcc/testsuite/
>   * gcc.target/aarch64/sve/cost_model_11.c: New test.
>   * gcc.target/aarch64/sve/mask_struct_load_5.c: Use
>   -fno-vect-cost-model.
> ---
>  .../gcc.target/aarch64/sve/cost_model_11.c| 12 ++
>  .../aarch64/sve/mask_struct_load_5.c  |  2 +-
>  gcc/tree-vect-stmts.c | 24 +++
>  3 files changed, 37 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cost_model_11.c
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cost_model_11.c b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_11.c
> new file mode 100644
> index 000..d9f4ccc76de
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/cost_model_11.c
> @@ -0,0 +1,12 @@
> +/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=128" } */
> +
> +long
> +f (long *x, long *y, long *z, long n)
> +{
> +  long res = 0;
> +  for (long i = 0; i < n; ++i)
> +z[i] = x[i * 4] + y[i * 4];
> +  return res;
> +}
> +
> +/* { dg-final { scan-assembler-not {\tld4d\t} } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_5.c b/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_5.c
> index da367e4fd79..2a33ee81d1a 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_5.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_5.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -ftree-vectorize -ffast-math --param aarch64-sve-compare-costs=0" } */
> +/* { dg-options "-O2 -ftree-vectorize -ffast-math -fno-vect-cost-model" } */
>  
>  #include 
>  
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 068e4982303..4d72c4db2f7 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -1120,6 +1120,30 @@ vect_model_load_cost (vec_info *vinfo,
>   once per group anyhow.  */
>bool first_stmt_p = (first_stmt_info == stmt_info);
>  
> +  /* An IFN_LOAD_LANES will load all its vector results, regardless of which
> + ones we actually need.  Account for the cost of unused results.  */
> +  if (first_stmt_p && !slp_node && memory_access_type == VMAT_LOAD_STORE_LANES)
> +{
> +  unsigned int gaps = DR_GROUP_SIZE (first_stmt_info);
> +  stmt_vec_info next_stmt_info = first_stmt_info;
> +  do
> + {
> +   gaps -= 1;
> +   next_stmt_info = DR_GROUP_NEXT_ELEMENT (next_stmt_info);
> + }
> +  while (next_stmt_info);
> +  if (gaps)
> + {
> +   if (dump_enabled_p ())
> + dump_printf_loc (MSG_NOTE, vect_location,
> +  "vect_model_load_cost: %d unused vectors.\n",
> +  gaps);
> +   vect_get_load_cost (vinfo, stmt_info, ncopies * gaps, false,
> +   &inside_cost, &prologue_cost,
> +   cost_vec, cost_vec, true);
> + }
> +}
> +
>/* We assume that the cost of a single load-lanes instruction is
>   equivalent to the cost of DR_GROUP_SIZE separate loads.  If a grouped
>   access is instead being provided by a load-and-permute operation,
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [PATCH] [AVX512] Fix ICE: Convert integer mask to vector in ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp [PR98537]

2021-01-14 Thread Hongtao Liu via Gcc-patches

ping.

On Thu, Jan 7, 2021 at 1:22 PM Hongtao Liu  wrote:
>
> On Wed, Jan 6, 2021 at 10:39 PM Jakub Jelinek  wrote:
> >
> > On Wed, Jan 06, 2021 at 02:49:13PM +0800, Hongtao Liu wrote:
> > >   ix86_expand_fp_vec_cmp/ix86_expand_int_vec_cmp are used by vec_cmpmn
> > > for vector comparison to vector mask, but ix86_expand_sse_cmp(which is
> > > called in upper 2 functions.) may return integer mask whenever integer
> > > mask is available, so convert integer mask back to vector mask if
> > > needed.
> > >
> > > gcc/ChangeLog:
> > >
> > > PR target/98537
> > > * config/i386/i386-expand.c (ix86_expand_fp_vec_cmp):
> > > When cmp is integer mask, convert it to vector.
> > > (ix86_expand_int_vec_cmp): Ditto.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR target/98537
> > > * g++.target/i386/avx512bw-pr98537-1.C: New test.
> > > * g++.target/i386/avx512vl-pr98537-1.C: New test.
> > > * g++.target/i386/avx512vl-pr98537-2.C: New test.
> >
> > Do we optimize it then to an AVX/AVX2 comparison if possible?
> >
> When I looked at the code, I found other places that
> require the comparison dest to be a vector (i.e. ix86_expand_sse_unpack,
> ix86_expand_mul_widen_evenodd). That is a potential bug.
> So I fixed this bug in another way, which won't generate an integer mask
> when the comparison dest is required to be a vector mask.
>
> Update patch:
>   ix86_expand_sse_cmp/ix86_expand_int_sse_cmp is used for vector
> comparison, considering that avx512 introduces integer mask, but some
> original callers require the dest of comparison to be a vector.
> So add a new parameter vector_mask_p to control the result
> of vector comparison to be vector or not.
>   regtested/bootstrapped on x86_64-linux-gnu{-m32,}.
>
> gcc/ChangeLog:
>
> PR target/98537
> * config/i386/i386-expand.c (ix86_expand_sse_cmp): Add a new
> parameter vector_mask_p to control whether the comparison
> result should be a vector or not.
> (ix86_expand_int_sse_cmp): Ditto.
> (ix86_expand_sse_movcc): cmpmode should be MODE_INT.
> (ix86_expand_fp_movcc): Allow vector comparison dest as
> integer mask.
> (ix86_expand_fp_vcond): Ditto.
> (ix86_expand_int_vcond): Ditto.
> (ix86_expand_fp_vec_cmp): Require vector comparison dest as
> vector.
> (ix86_expand_int_vec_cmp): Ditto.
> (ix86_expand_sse_unpack): Ditto.
> (ix86_expand_mul_widen_evenodd): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> PR target/98537
> * g++.target/i386/avx512bw-pr98537-1.C: New test.
> * g++.target/i386/avx512vl-pr98537-1.C: New test.
> * g++.target/i386/avx512vl-pr98537-2.C: New test.
>
>
> > @@ -4024,8 +4025,18 @@ ix86_expand_fp_vec_cmp (rtx operands[])
> >  cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
> >operands[1], operands[2]);
> >
> > -  if (operands[0] != cmp)
> > -emit_move_insn (operands[0], cmp);
> > +if (operands[0] != cmp)
> > +{
> >
> > The indentation of the if above looks wrong.
> > Otherwise LGTM.
> >
> > Jakub
> >
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao

GCC 11.0.0 Status Report (2021-01-14), Stage 3 closes Jan 16th



Status
======

GCC trunk, which will eventually become GCC 11, is nearing the
end of Stage 3.  Stage 4 (regression and documentation fixes only)
starts on Jan 17th.

We have accumulated quite a number of regressions, where
P1 classified regressions should be fixed before the release
or analyzed up to a point where they no longer classify as P1.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               67   + 30
P2              331   + 74
P3               34   - 60
P4              190   +  6
P5               24
--------        ---   -----------------------
Total P1-P3     432   + 44
Total           646   + 50


Previous Report
===============

https://gcc.gnu.org/pipermail/gcc/2020-November/234246.html

[PATCH] i386: Remove redundant assignment in i386-options.c [PR98671]

Also rename x86_prefetch_sse to ix86_prefetch_sse.

2021-01-14  Uroš Bizjak  

gcc/
PR target/98671
* config/i386/i386-options.c (ix86_function_specific_save):
Remove redundant assignment to opts->x_ix86_branch_cost.
* config/i386/i386.c (ix86_prefetch_sse):
Rename from x86_prefetch_sse.  Update all uses.
* config/i386/i386.h: Update for rename.
* config/i386/i386-options.h: Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu.

Pushed to mainline.

Uros.
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 6afd7a9b1f2..4e0165ff32c 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -641,7 +641,7 @@ ix86_function_specific_save (struct cl_target_option *ptr,
 {
   ptr->arch = ix86_arch;
   ptr->schedule = ix86_schedule;
-  ptr->prefetch_sse = x86_prefetch_sse;
+  ptr->prefetch_sse = ix86_prefetch_sse;
   ptr->tune = ix86_tune;
   ptr->branch_cost = ix86_branch_cost;
   ptr->tune_defaulted = ix86_tune_defaulted;
@@ -773,8 +773,7 @@ ix86_function_specific_restore (struct gcc_options *opts,
   ix86_arch = (enum processor_type) ptr->arch;
   ix86_schedule = (enum attr_cpu) ptr->schedule;
   ix86_tune = (enum processor_type) ptr->tune;
-  x86_prefetch_sse = ptr->prefetch_sse;
-  opts->x_ix86_branch_cost = ptr->branch_cost;
+  ix86_prefetch_sse = ptr->prefetch_sse;
   ix86_tune_defaulted = ptr->tune_defaulted;
   ix86_arch_specified = ptr->arch_specified;
   opts->x_ix86_isa_flags_explicit = ptr->x_ix86_isa_flags_explicit;
@@ -2348,7 +2347,7 @@ ix86_option_override_internal (bool main_args_p,
 
if ((processor_alias_table[i].flags
   & (PTA_PREFETCH_SSE | PTA_SSE)) != 0)
- x86_prefetch_sse = true;
+ ix86_prefetch_sse = true;
if (((processor_alias_table[i].flags & PTA_MWAITX) != 0)
&& !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA2_MWAITX))
  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MWAITX;
@@ -2446,7 +2445,7 @@ ix86_option_override_internal (bool main_args_p,
if (TARGET_CMOV
&& ((processor_alias_table[i].flags
  & (PTA_PREFETCH_SSE | PTA_SSE)) != 0))
- x86_prefetch_sse = true;
+ ix86_prefetch_sse = true;
break;
   }
 
@@ -2589,7 +2588,7 @@ ix86_option_override_internal (bool main_args_p,
   || (TARGET_PRFCHW_P (opts->x_ix86_isa_flags)
  && !TARGET_3DNOW_P (opts->x_ix86_isa_flags))
   || TARGET_PREFETCHWT1_P (opts->x_ix86_isa_flags))
-x86_prefetch_sse = true;
+ix86_prefetch_sse = true;
 
   /* Enable popcnt instruction for -msse4.2 or -mabm.  */
   if (TARGET_SSE4_2_P (opts->x_ix86_isa_flags)
diff --git a/gcc/config/i386/i386-options.h b/gcc/config/i386/i386-options.h
index 67bc5efcd8d..cdaca2644f4 100644
--- a/gcc/config/i386/i386-options.h
+++ b/gcc/config/i386/i386-options.h
@@ -33,7 +33,7 @@ extern enum attr_cpu ix86_schedule;
 
 extern enum processor_type ix86_tune;
 extern enum processor_type ix86_arch;
-extern unsigned char x86_prefetch_sse;
+extern unsigned char ix86_prefetch_sse;
 extern const struct processor_costs *ix86_tune_cost;
 
 extern int ix86_tune_defaulted;
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d35af37a49c..48f9aa0d731 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -347,7 +347,7 @@ enum processor_type ix86_tune;
 enum processor_type ix86_arch;
 
 /* True if processor has SSE prefetch instruction.  */
-unsigned char x86_prefetch_sse;
+unsigned char ix86_prefetch_sse;
 
 /* Preferred alignment for stack boundary in bits.  */
 unsigned int ix86_preferred_stack_boundary;
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 8dd0354309e..f032746d222 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -654,8 +654,8 @@ extern unsigned char ix86_arch_features[X86_ARCH_LAST];
 
 #define TARGET_FISTTP  (TARGET_SSE3 && TARGET_80387)
 
-extern unsigned char x86_prefetch_sse;
-#define TARGET_PREFETCH_SSE    x86_prefetch_sse
+extern unsigned char ix86_prefetch_sse;
+#define TARGET_PREFETCH_SSE    ix86_prefetch_sse
 
 #define ASSEMBLER_DIALECT  (ix86_asm_dialect)

Re: calibrate intervals to avoid zero in futures poll test


On 05/01/21 04:44 -0300, Alexandre Oliva wrote:


We get occasional failures of 30_threads/future/members/poll.cc
on some platforms whose high resolution clock doesn't have such a high
resolution; wait_for_0 ends up as 0, and then some asserts fail as
intervals measured as longer than zero are tested for less than
several times zero.

This patch adds some calibration in the iteration count to set a
measurable base time interval with some additional margin.

Regstrapped on x86_64-linux-gnu, and also tested on
x-arm-wrs-vxworks7r2.  Ok to install?


for  libstdc++-v3/ChangeLog

* testsuite/30_threads/future/members/poll.cc: Calibrate
iteration count.
---
.../testsuite/30_threads/future/members/poll.cc|   33 +++-
1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/30_threads/future/members/poll.cc b/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
index fff9bea899c90..7b41411a54386 100644
--- a/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
+++ b/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
@@ -25,7 +25,7 @@
#include 
#include 

-const int iterations = 200;
+int iterations = 200;

using namespace std;

@@ -45,10 +45,41 @@ int main()
  promise p;
  future f = p.get_future();

+ start_over:
  auto start = chrono::high_resolution_clock::now();
  for(int i = 0; i < iterations; i++)
f.wait_for(chrono::seconds(0));
  auto stop = chrono::high_resolution_clock::now();
+
+  /* We've run too few iterations for the clock resolution.
+ Attempt to calibrate it.  */
+  if (start == stop)
+{
+  /* Loop until the clock advances, so that start is right after a
+time increment.  */
+  do
+   start = chrono::high_resolution_clock::now();
+  while (start == stop);
+  int i = 0;
+  /* Now until the clock advances again, so that stop is right
+after another time increment.  */
+  do
+   {
+ f.wait_for(chrono::seconds(0));
+ stop = chrono::high_resolution_clock::now();
+ i++;
+   }
+  while (start == stop);
+  /* Got for some 10 cycles, but we're already past that and still


I can't parse "Got for some 10 cycles". If that's just a typo that I'm
failing to spot ("good for"?) please fix and push the patch.

The patch is fine apart from me being unable to understand this
comment.


+get into the calibration loop, double the iteration count and
+try again.  */
+  if (iterations < i * 10)
+   iterations = i * 10;
+  else
+   iterations *= 2;
+  goto start_over;
+}
+
  double wait_for_0 = print("wait_for(0s)", stop - start);

  start = chrono::high_resolution_clock::now();


--
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
  Free Software Activist GNU Toolchain Engineer
   Vim, Vi, Voltei pro Emacs -- GNUlius Caesar

Re: Add dg-require-wchars to libstdc++ testsuite


On 13/01/21 14:29 -0300, Alexandre Oliva wrote:

On Dec 28, 2020, François Dumont wrote:


On 22/12/20 10:12 pm, Alexandre Oliva wrote:

Some tests use structures from libstdc++ that are present only if
the target has a wchar.h header.  However, those tests do not check
that the target supports those constructs before executing the tests.



Looks like those tests should be in some sub-folder containing
'wchar_t' to be considered as UNSUP.



Maybe Jonathan will prefer them to be moved even if your approach
seems more convenient to me.


I'd be glad to make such changes, but I'd appreciate stronger guidance
as to the preferences and the way to go before doing so.  Jonathan,
would you please share your wisdom WRT this patch and the other
wchar_t-related libstdc++ testsuite one?

https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562435.html
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562438.html


I'll look at them today.


The function dg-require-wchars checks that "_GLIBCXX_USE_WCHAR_T" is
defined by the configure of the libstdc++.  If it is not the case, the
test is not executed.



This check_v3_target_wchars looks like a good candidate to leverage
on: v3_check_preprocessor_condition.


Nice!, thanks for the tip, I was not aware of this proc.


It's only been there a few weeks.

c++: Fix erroneous parm comparison logic [PR 98372]

2021-01-14 Thread Nathan Sidwell


I flubbed an application of De Morgan's law.  Let's just express the
logic directly and let the compiler figure it out.  This bug made it
look like pr52830 was fixed, but it is not.
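
To illustrate the slip, here is a small sketch under stated assumptions: the integer arguments are hypothetical stand-ins for DECL_CONTEXT (t1), DECL_CONTEXT (t2), map_context_from and map_context_to, and the function names are invented for this example. Negating a conjunction must turn the && into ||; keeping the && gives a different predicate.

```cpp
#include <cassert>

// What was meant: "not (both contexts form the mapped pair)".
inline bool intended(int c1, int c2, int from, int to)
{
  return !(c1 == from && c2 == to);
}

// The flubbed rewrite: the comparisons were negated, but the &&
// was kept, so the condition is false whenever either side matches.
inline bool flubbed(int c1, int c2, int from, int to)
{
  return c1 != from && c2 != to;
}

// The correct De Morgan form of intended(): && becomes ||.
inline bool de_morgan(int c1, int c2, int from, int to)
{
  return c1 != from || c2 != to;
}
```

The two forms disagree exactly when one context matches and the other does not, which is why writing the negated conjunction directly, as the patch does, is the safer choice.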


PR c++/98372
gcc/cp/
* tree.c (cp_tree_equal): Correct map_context logic.
gcc/testsuite/
* g++.dg/cpp0x/constexpr-52830.C: Restore dg-ice.
* g++.dg/template/pr98372.C: New.


--
Nathan Sidwell
diff --git c/gcc/cp/tree.c w/gcc/cp/tree.c
index d339036e88e..3a9a86de34a 100644
--- c/gcc/cp/tree.c
+++ w/gcc/cp/tree.c
@@ -3841,8 +3841,8 @@ cp_tree_equal (tree t1, tree t2)
 	  /* Module duplicate checking can have t1 = new, t2 =
 	 existing, and they should be considered matching at this
 	 point.  */
-	  && (DECL_CONTEXT (t1) != map_context_from
-	  && DECL_CONTEXT (t2) != map_context_to))
+	  && !(DECL_CONTEXT (t1) == map_context_from
+	   && DECL_CONTEXT (t2) == map_context_to))
 	/* When comparing hash table entries, only an exact match is
 	   good enough; we don't want to replace 'this' with the
 	   version from another function.  But be more flexible
diff --git c/gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C w/gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C
index 04f039fac43..2c9d2f9b329 100644
--- c/gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C
+++ w/gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C
@@ -1,5 +1,6 @@
 // PR c++/52830
 // { dg-do compile { target c++11 } }
+// { dg-ice "comptypes" }
 
 template struct eif { typedef void type; };
 template<>   struct eif {};
diff --git c/gcc/testsuite/g++.dg/template/pr98372.C w/gcc/testsuite/g++.dg/template/pr98372.C
new file mode 100644
index 000..f1e8b0f3323
--- /dev/null
+++ w/gcc/testsuite/g++.dg/template/pr98372.C
@@ -0,0 +1,28 @@
+// PR 98372 ICE due to incorrect type compare
+// { dg-do compile { target c++11 } }
+
+template  using remove_pointer_t = typename _Tp ::type;
+template  struct enable_if;
+template 
+using enable_if_t = typename enable_if<_Cond>::type;
+template  bool is_convertible_v;
+template  class Span;
+template  class Span {
+  using element_type = T;
+  template 
+  Span(element_type (&arr)[N],
+   enable_if_t>,
+   decltype(nullptr)>);
+};
+template  class Span {
+  using element_type = T;
+  template 
+  Span(element_type (&arr)[N],
+   enable_if_t>,
+   decltype(nullptr)>);
+};
+
+struct aaa
+{
+  Span data0;
+};

Re: [PATCH v2] Add --ld-path= to specify an arbitrary executable as the linker


On 1/14/21 11:07 AM, Richard Biener wrote:

I see no particular reason to allow arbitrary garbage to be used as
linker.  It just asks for users to shoot themselves in the foot and
for strange bugreports to pop up.


Well, if a strange bug report does come in, we'll eventually see that the
--ld-path= option was used.

I find it handy when developing an ld feature to be able to point at a built ld
(without needing to build GCC with it). Yes, one can use --save-temps --verbose
and invoke the built linker by hand, but that's not convenient.

Martin

Re: [PATCH] Add pytest for a GCOV test-case

2021-01-14 Thread Rainer Orth

Hi Martin,

>> * Besides, the test outcomes are not generic message facilities but are
>>supposed to follow a common format:
>>:  []
>>with  the pathname to the test relative to (in this case)
>>gcc/testsuite.  In this case, this might be something like
>>UNSUPPORTED: g++.dg/gcov/pr98273.C run-gcov-pytest
>>Currently, you don't have the pathname in run-gcov-pytest, though.
>
> All right, now one will see:
>
> UNSUPPORTED: g++.dg/gcov/pr98273.C run-gcov-pytest could not find Python
> interpreter and (or) pytest module

please shorten this quite a bit: maybe

... run-gcov-pytest python3 pytest missing

>> * If we now have an (even optional) dependency on python/pytest, this
>>(with the exact versions and use) needs to be documented in
>>install.texi.
>
> Done that.

+be installed. Some optional tests also require Python3 and pytest module.

It would be better to be more specific here.  Or would Python 3.0 and
pytest 2.0.0 do ;-)

>> * On to the implementation: your test for the presence of pytest is
>>wrong:
>>  set result [remote_exec host "pytest -m pytest --version"]
>>has nothing to do with what you actually use later: on all of Fedora
>>29, Ubuntu 20.04, and Solaris 11.4 (with a caveat) pytest is Python
>>2.7 based, but you don't check that.  It is well possible that pytest
>>for 2.7 is installed, but pytest for Python 3.x isn't.
>>Besides, while Solaris 11.4 does bundle pytest, they don't deliver
>>pytest, but only py.test due to a conflict with a different pytest from
>>logilab-common, cf. https://github.com/pytest-dev/pytest/issues/1833.
>>This is immaterial, however, since what you actually run is
>>  spawn -noecho python3 -m pytest --color=no -rA -s --tb=no
>> $srcdir/$subdir/$pytest_script
>>So you should just run python3 -m pytest --version instead to check
>>for the presence of the version you're going to use.
>>Btw., there's a mess with pytest on Fedora 29: running the above gives
>
> I must confirm this is a mess. I definitely don't want to support Python2 and
> I think
> the best way would be to use 'env python3', hope it's portable enough.
> @David: What do you think?

As I mentioned, it's not: Solaris 11.3 has no python3, only (for the 3.x
series) python3.4.

However, I don't understand what you expect to gain from running

$ env python3

rather than just

$ python3

(or a suitable Python 3.x version by any name)?

I just had a quick look and the autoconf-archive has AX_PYTHON which
claims to do that:

https://www.gnu.org/software/autoconf-archive/ax_python.html

Unfortunately, it doesn't know about Python 3.8+ yet.

>>When running the test on Solaris 11.4 (with the bundled pytest 4.4.0),
>>I get
>> = test session starts
>> ==
>> platform sunos5 -- Python 3.7.9, pytest-4.4.0, py-1.8.0, pluggy-0.9.0
>> rootdir: /vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov
>> collected 2 items
>> ../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py
>> ..
>> === 2 passed in 0.04 seconds
>> ===
>> while 4.6.9 on Linux gives
>> = test session starts
>> ==
>> platform linux -- Python 3.8.2, pytest-4.6.9, py-1.8.1, pluggy-0.13.0
>> rootdir: /vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov
>> collected 2 items
>> ../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py
>> ..
>> === short test summary info
>> 
>> PASSED
>> ../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py::test_basics
>> PASSED
>> ../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py::test_lines
>> === 2 passed in 0.17 seconds
>> ===
>>Obviously pytest -rA was introduced only after 4.4.0 and the 'A' is
>>silently ignored.  Fortunately, I can just use -rap instead which
>>works with both versions.
>
> This will be fixed by this:
> env python3 -m pytest --color=no -rA -s --tb=no --version

No, as I already wrote: pytest 4.4.0 silently ignores -rA and doesn't
print the PASSED (or FAILED) lines.  With both versions, pytest -rap
worked for me instead.

>>After this has been processed by gcov.exp, g++.sum contains
>> PASS:
>> ../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py::test_basic
>> PASS:
>> ../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py::test_line
>>which is again completely wrong in light of what I wrote above on the
>>format of test names: it lacks the testname part completely and
>>contains absolute pathnames which makes it impossible to compare
>>testresults fr

Re: [PATCH] Add pytest for a GCOV test-case

2021-01-14 Thread Rainer Orth

Hi Martin,

>>> * If we now have an (even optional) dependency on python/pytest, this
>>>(with the exact versions and use) needs to be documented in
>>>install.texi.
>>
>> Done that.
>
> +be installed. Some optional tests also require Python3 and pytest module.
>
> It would be better to be more specific here.  Or would Python 3.0 and
> pytest 2.0.0 do ;-)

and a nit I just noticed: two spaces after full stop.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Add pytest for a GCOV test-case


On 1/14/21 2:22 PM, Rainer Orth wrote:

Hi Martin,


* Besides, the test outcomes are not generic message facilities but are
supposed to follow a common format:
:  []
with  the pathname to the test relative to (in this case)
gcc/testsuite.  In this case, this might be something like
UNSUPPORTED: g++.dg/gcov/pr98273.C run-gcov-pytest
Currently, you don't have the pathname in run-gcov-pytest, though.


All right, now one will see:

UNSUPPORTED: g++.dg/gcov/pr98273.C run-gcov-pytest could not find Python
interpreter and (or) pytest module


Hello.



please shorten this quite a bit: maybe

... run-gcov-pytest python3 pytest missing


Sure, done.




* If we now have an (even optional) dependency on python/pytest, this
(with the exact versions and use) needs to be documented in
install.texi.


Done that.


+be installed. Some optional tests also require Python3 and pytest module.

It would be better to be more specific here.  Or would Python 3.0 and
pytest 2.0.0 do ;-)


I would leave it as it is. Python3 is a well established term. About pytest:
I don't know how to investigate a minimal version right now.




* On to the implementation: your test for the presence of pytest is
wrong:
  set result [remote_exec host "pytest -m pytest --version"]
has nothing to do with what you actually use later: on all of Fedora
29, Ubuntu 20.04, and Solaris 11.4 (with a caveat) pytest is Python
2.7 based, but you don't check that.  It is well possible that pytest
for 2.7 is installed, but pytest for Python 3.x isn't.
Besides, while Solaris 11.4 does bundle pytest, they don't deliver
pytest, but only py.test due to a conflict with a different pytest from
logilab-common, cf. https://github.com/pytest-dev/pytest/issues/1833.
This is immaterial, however, since what you actually run is
  spawn -noecho python3 -m pytest --color=no -rA -s --tb=no
$srcdir/$subdir/$pytest_script
So you should just run python3 -m pytest --version instead to check
for the presence of the version you're going to use.
Btw., there's a mess with pytest on Fedora 29: running the above gives


I must confirm this is a mess. I definitely don't want to support Python2 and
I think
the best way would be to use 'env python3', hope it's portable enough.
@David: What do you think?


As I mentioned, it's not: Solaris 11.3 has no python3, only (for the 3.x
series) python3.4.

However, I don't understand what you expect to gain from running

$ env python3

rather than just

$ python3

(or a suitable Python 3.x version by any name)?


All right, let's replace it just with 'python3'.



I just had a quick look and the autoconf-archive has AX_PYTHON which
claims to do that:

https://www.gnu.org/software/autoconf-archive/ax_python.html

Unfortunately, it doesn't know about Python 3.8+ yet.


When running the test on Solaris 11.4 (with the bundled pytest 4.4.0),
I get
= test session starts
==
platform sunos5 -- Python 3.7.9, pytest-4.4.0, py-1.8.0, pluggy-0.9.0
rootdir: /vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov
collected 2 items
../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py
..
=== 2 passed in 0.04 seconds
===
while 4.6.9 on Linux gives
= test session starts
==
platform linux -- Python 3.8.2, pytest-4.6.9, py-1.8.1, pluggy-0.13.0
rootdir: /vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov
collected 2 items
../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py
..
=== short test summary info

PASSED
../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py::test_basics
PASSED
../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py::test_lines
=== 2 passed in 0.17 seconds
===
Obviously pytest -rA was introduced only after 4.4.0 and the 'A' is
silently ignored.  Fortunately, I can just use -rap instead which
works with both versions.


This will be fixed by this:
env python3 -m pytest --color=no -rA -s --tb=no --version


No, as I already wrote: pytest 4.4.0 silently ignores -rA and doesn't
print the PASSED (or FAILED) lines.  With both versions, pytest -rap
worked for me instead.


Ah, all right, I fixed that.




After this has been processed by gcov.exp, g++.sum contains
PASS:
../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py::test_basic
PASS:
../../../../../../../../../../vol/gcc/src/hg/master/local/gcc/testsuite/g++.dg/gcov/test-pr98273.py::test_line
which is again completely wrong in light of what I wrote above on the
format of test names: it lacks

[PATCH] tree-optimization/98674 - improve dependence analysis

This improves dependence analysis on refs that access the same
array but with different typed but same sized accesses.  That's
obviously safe for the case of types that cannot have any
access function based off them.  For the testcase this is
signed short vs. unsigned short.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

2021-01-14  Richard Biener  

PR tree-optimization/98674
* tree-data-ref.c (base_supports_access_fn_components_p): New.
(initialize_data_dependence_relation): For two bases without
possible access fns resort to type size equality when determining
shape compatibility.

* gcc.dg/vect/pr98674.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr98674.c | 16 
 gcc/tree-data-ref.c | 26 --
 2 files changed, 40 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr98674.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr98674.c 
b/gcc/testsuite/gcc.dg/vect/pr98674.c
new file mode 100644
index 000..0f1b6cb060b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr98674.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-msse2" { target sse2 } } */
+
+void swap(short *p, int cnt)
+{
+  while (cnt-- > 0)
+{
+  *p = ((*p << 8) & 0xFF00) | ((*p >> 8) & 0x00FF);
+  ++p;
+}
+}
+
+/* Dependence analysis should not fail.  */
+/* { dg-final { scan-tree-dump "dependence distance == 0" "vect" } } */
+/* On x86 with SSE2 we can vectorize this with psllw/psrlw.  */
+/* { dg-final { scan-tree-dump "loop vectorized" "vect" { target sse2 } } } */
diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index 394470af757..65fe6d5da91 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -1291,6 +1291,23 @@ access_fn_component_p (tree op)
 }
 }
 
+/* Returns whether BASE can have an access_fn_component_p with BASE
+   as base.  */
+
+static bool
+base_supports_access_fn_components_p (tree base)
+{
+  switch (TREE_CODE (TREE_TYPE (base)))
+{
+case COMPLEX_TYPE:
+case ARRAY_TYPE:
+case RECORD_TYPE:
+  return true;
+default:
+  return false;
+}
+}
+
 /* Determines the base object and the list of indices of memory reference
DR, analyzed in LOOP and instantiated before NEST.  */
 
@@ -3272,8 +3289,13 @@ initialize_data_dependence_relation (struct 
data_reference *a,
  && full_seq.start_b + full_seq.length == num_dimensions_b
  && DR_UNCONSTRAINED_BASE (a) == DR_UNCONSTRAINED_BASE (b)
  && operand_equal_p (base_a, base_b, OEP_ADDRESS_OF)
- && types_compatible_p (TREE_TYPE (base_a),
-TREE_TYPE (base_b))
+ && (types_compatible_p (TREE_TYPE (base_a),
+ TREE_TYPE (base_b))
+ || (!base_supports_access_fn_components_p (base_a)
+ && !base_supports_access_fn_components_p (base_b)
+ && operand_equal_p
+  (TYPE_SIZE (TREE_TYPE (base_a)),
+   TYPE_SIZE (TREE_TYPE (base_b)), 0)))
  && (!loop_nest.exists ()
  || (object_address_invariant_in_loop_p
  (loop_nest[0], base_a))));
-- 
2.26.2

Re: Add dg-require-wchars to libstdc++ testsuite


On 22/12/20 18:12 -0300, Alexandre Oliva wrote:

--- a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_utf16/79980.cc
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_utf16/79980.cc
@@ -16,6 +16,7 @@
// .

// { dg-do run { target c++11 } }
+// { dg-require-wchars "" }


This test doesn't use wchar_t, so it shouldn't depend on
_GLIBCXX_USE_WCHAR_T being defined.

The problem is that  uses wchar_t in default
template arguments:

#ifdef _GLIBCXX_USE_WCHAR_T

_GLIBCXX_BEGIN_NAMESPACE_CXX11

  /// String conversions
  template
Is it the case that the wchar_t type is defined on this target, it's
just that libc doesn't have support for wcslen etc?  Because we should
probably audit all our uses of _GLIBCXX_USE_WCHAR_T and find which
ones actually need libc support and which just need the wchar_t type
to exist. Some things really do need the libc support, but I suspect
many others don't.

It seems wrong that we can provide full support for char16_t and
char32_t but not wchar_t, just because the former two don't depend on
anything being present in libc. Why can't we just implement the same
functionality for wchar_t without using libc?

In fact, if we just define std::char_traits generically
without using any libc functions (or just using them as optimisations)
we might be able to support std::basic_string and iostream
classes with almost no work. But that's something to consider in the
future.
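
As a rough sketch of what "generically, without using any libc functions" could look like, the helpers below implement length, compare and find with plain loops. They are hypothetical illustrations, not libstdc++ code (the library has its own generic __gnu_cxx::char_traits), and they work for any character-like type, including wchar_t.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical libc-free counterpart of wcslen.
template <typename CharT>
std::size_t generic_length(const CharT* s)
{
  std::size_t n = 0;
  while (s[n] != CharT())
    ++n;
  return n;
}

// Hypothetical libc-free counterpart of wmemcmp: negative, zero or
// positive depending on the first differing element.
template <typename CharT>
int generic_compare(const CharT* s1, const CharT* s2, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    {
      if (s1[i] < s2[i])
        return -1;
      if (s2[i] < s1[i])
        return 1;
    }
  return 0;
}

// Hypothetical libc-free counterpart of wmemchr: pointer to the first
// occurrence of a in the first n elements, or nullptr.
template <typename CharT>
const CharT* generic_find(const CharT* s, std::size_t n, CharT a)
{
  for (std::size_t i = 0; i < n; ++i)
    if (s[i] == a)
      return s + i;
  return nullptr;
}
```

A libc routine such as wcslen could still be used as an optimisation where it is available, with loops like these as the fallback.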



diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp 
b/libstdc++-v3/testsuite/lib/libstdc++.exp
index b7d7b906de41c..2c22bcc0f0c94 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -702,6 +702,53 @@ proc v3-build_support { } {
}
}

+proc check_v3_target_wchars { } {
+global et_wchars
+global et_wchars_target_name
+global tool
+
+if { ![info exists et_wchars_target_name] } {
+   set et_wchars_target_name ""
+}
+
+# If the target has changed since we set the cached value, clear it.
+set current_target [current_target_name]
+if { $current_target != $et_wchars_target_name } {
+   verbose "check_v3_target_wchars: `$et_wchars_target_name'" 2
+   set et_wchars_target_name $current_target
+   if [info exists et_wchars] {
+   verbose "check_v3_target_wchars: removing cached result" 2
+   unset et_wchars
+   }
+}
+
+if [info exists et_wchars] {
+   verbose "check_v3_target_wchars: using cached result" 2
+} else {
+   set et_wchars 0
+
+   # Set up and preprocess a C++ test program that depends
+   # on wchars support being configured in the libstdc++.
+   set src wchars[pid].cc
+
+   set f [open $src "w"]
+   puts $f "#ifndef _GLIBCXX_USE_WCHAR_T"
+   puts $f "#  error No wchar header."


As François said, this could use the new proc. I'd also prefer if it
was defined as an effective-target keyword so we can use:

// { dg-require-effective-target wchars }

instead of the old fashioned { dg-require-wchars "" } form. I've
recently added effective-target keywords for several of the
dg-require-FOO directives, so we can move away from the old form. I
think new directives should be done as effective-target keywords. See
the recent changes to libstdc++-v3/testsuite/lib/libstdc++.exp for
examples, e.g. 10ee46adf44ae731fc4f9e9fdc25ad60c9d43a9c

But we might not even need this new proc if the codecvt tests can be
made to work using the attached patch.


diff --git a/libstdc++-v3/include/bits/locale_conv.h b/libstdc++-v3/include/bits/locale_conv.h
index 0e409da9876..d8a4d0851f4 100644
--- a/libstdc++-v3/include/bits/locale_conv.h
+++ b/libstdc++-v3/include/bits/locale_conv.h
@@ -222,11 +222,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif  // _GLIBCXX_USE_CHAR8_T
 
 #ifdef _GLIBCXX_USE_WCHAR_T
+# define _GLIBCXX_WCHAR_DEFAULT_TEMPL_ARG = wchar_t
+#else
+// wstring_convert and wbuffer_convert are still defined for targets without
+// wchar_t support, but the second template argument must be given explicitly.
+# define _GLIBCXX_WCHAR_DEFAULT_TEMPL_ARG
+#endif
 
 _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
   /// String conversions
-  template,
 	   typename _Byte_alloc = allocator>
 class wstring_convert
@@ -382,7 +388,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 _GLIBCXX_END_NAMESPACE_CXX11
 
   /// Buffer conversions
-  template>
 class wbuffer_convert : public basic_streambuf<_Elem, _Tr>
 {
@@ -606,8 +612,6 @@ _GLIBCXX_END_NAMESPACE_CXX11
   bool			_M_always_noconv;
 };
 
-#endif  // _GLIBCXX_USE_WCHAR_T
-
   /// @} group locales
 
 _GLIBCXX_END_NAMESPACE_VERSION

Re: Add dg-require-wchars to libstdc++ testsuite


On 14/01/21 13:41 +, Jonathan Wakely wrote:


Is it the case that the wchar_t type is defined on this target, it's
just that libc doesn't have support for wcslen etc?  Because we should
probably audit all our uses of _GLIBCXX_USE_WCHAR_T and find which
ones actually need libc support and which just need the wchar_t type
to exist. Some things really do need the libc support, but I suspect
many others don't.

It seems wrong that we can provide full support for char16_t and
char32_t but not wchar_t, just because the former two don't depend on
anything being present in libc. Why can't we just implement the same
functionality for wchar_t without using libc?

In fact, if we just define std::char_traits generically
without using any libc functions (or just using them as optimisations)
we might be able to support std::basic_string and iostream
classes with almost no work. But that's something to consider in the
future.



Oops, I considered it already.

This untested patch should define std::char_traits so it is
available if wchar_t is defined by the front end (which I assume is
always true, is that right?), only using optimized libc routines if
available.

This would be the first step to enabling std::wstring etc for targets
with no wchar_t support in libc.


diff --git a/libstdc++-v3/include/bits/char_traits.h b/libstdc++-v3/include/bits/char_traits.h
index ea1e036f721..3a60478ea32 100644
--- a/libstdc++-v3/include/bits/char_traits.h
+++ b/libstdc++-v3/include/bits/char_traits.h
@@ -438,7 +438,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   };
 
 
-#ifdef _GLIBCXX_USE_WCHAR_T
+#ifdef __SIZEOF_WCHAR_T__
   /// 21.1.3.2  char_traits specializations
   template<>
 struct char_traits
@@ -469,23 +469,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
 	if (__n == 0)
 	  return 0;
-#if __cplusplus >= 201703L
-	if (__builtin_constant_p(__n)
-	&& __constant_char_array_p(__s1, __n)
-	&& __constant_char_array_p(__s2, __n))
-	  return __gnu_cxx::char_traits::compare(__s1, __s2, __n);
+#ifdef _GLIBCXX_USE_WCHAR_T
+	if (!__builtin_is_constant_evaluated())
+	  return wmemcmp(__s1, __s2, __n);
 #endif
-	return wmemcmp(__s1, __s2, __n);
+	return __gnu_cxx::char_traits::compare(__s1, __s2, __n);
   }
 
   static _GLIBCXX17_CONSTEXPR size_t
   length(const char_type* __s)
   {
-#if __cplusplus >= 201703L
-	if (__constant_string_p(__s))
-	  return __gnu_cxx::char_traits::length(__s);
+#ifdef _GLIBCXX_USE_WCHAR_T
+	if (!__builtin_is_constant_evaluated())
+	  return wcslen(__s);
 #endif
-	return wcslen(__s);
+	return __gnu_cxx::char_traits::length(__s);
   }
 
   static _GLIBCXX17_CONSTEXPR const char_type*
@@ -493,13 +491,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
 	if (__n == 0)
 	  return 0;
-#if __cplusplus >= 201703L
-	if (__builtin_constant_p(__n)
-	&& __builtin_constant_p(__a)
-	&& __constant_char_array_p(__s, __n))
-	  return __gnu_cxx::char_traits::find(__s, __n, __a);
+#ifdef _GLIBCXX_USE_WCHAR_T
+	if (!__builtin_is_constant_evaluated())
+	  return wmemchr(__s, __a, __n);
 #endif
-	return wmemchr(__s, __a, __n);
+	return __gnu_cxx::char_traits::find(__s, __n, __a);
   }
 
   static _GLIBCXX20_CONSTEXPR char_type*
@@ -507,11 +503,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
 	if (__n == 0)
 	  return __s1;
-#ifdef __cpp_lib_is_constant_evaluated
-	if (std::is_constant_evaluated())
-	  return __gnu_cxx::char_traits::move(__s1, __s2, __n);
+#ifdef _GLIBCXX_USE_WCHAR_T
+	if (!__builtin_is_constant_evaluated())
+	  return wmemmove(__s1, __s2, __n);
 #endif
-	return wmemmove(__s1, __s2, __n);
+	return __gnu_cxx::char_traits::move(__s1, __s2, __n);
   }
 
   static _GLIBCXX20_CONSTEXPR char_type*
@@ -519,11 +515,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
 	if (__n == 0)
 	  return __s1;
-#ifdef __cpp_lib_is_constant_evaluated
-	if (std::is_constant_evaluated())
-	  return __gnu_cxx::char_traits::copy(__s1, __s2, __n);
+#ifdef _GLIBCXX_USE_WCHAR_T
+	if (!__builtin_is_constant_evaluated())
+	  return wmemcpy(__s1, __s2, __n);
 #endif
-	return wmemcpy(__s1, __s2, __n);
+	return __gnu_cxx::char_traits::copy(__s1, __s2, __n);
   }
 
   static _GLIBCXX20_CONSTEXPR char_type*
@@ -531,11 +527,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
 	if (__n == 0)
 	  return __s;
-#ifdef __cpp_lib_is_constant_evaluated
-	if (std::is_constant_evaluated())
-	  return __gnu_cxx::char_traits::assign(__s, __n, __a);
+#ifdef _GLIBCXX_USE_WCHAR_T
+	if (!__builtin_is_constant_evaluated())
+	  return wmemset(__s, __a, __n);
 #endif
-	return wmemset(__s, __a, __n);
+	return __gnu_cxx::char_traits::assign(__s, __n, __a);
   }
 
   static _GLIBCXX_CONSTEXPR char_type
@@ -558,7 +554,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   not_eof(const int_type& __c) _GLIBCXX_NOEXCEPT
   { return eq_int_type(__c, eof()) ? 0 : __c; }
   };
-#endif //_GLIBCXX_USE_WCHAR_T
+#endif // __SIZEOF_WCHAR_T__
 
 #ifdef _GLIBCXX_USE_CHAR8_T
   template<>

Re: Split wchars tests from the normal variant


On 22/12/20 18:27 -0300, Alexandre Oliva wrote:


This change splits out the wchar-specific parts of the character
conversion tests so that those parts can be made conditional on actual
wchar support, while the rest applies more generally.

This turned out useful during our work on the libstdc++ support
for VxWorks, to expose the problematic areas more precisely.

Regstrapped on x86_64-linux-gnu, and tested with -x-arm-wrs-vxworks7r2.
Ok to install?  (dg-requires-wchars is added by another patch by
Corentin, that I posted a few minutes ago)

While updating Corentin's patch for mainline, I brought over to the
split-out test even the preprocessor conditional that is present in the
current version of the test, but required/implied by dg-requires-wchars.
Maybe that's excessive.  Maybe the whole patch is excessive, given that
conditional, but I didn't want to just drop it without asking for
others' thoughts.


I do think this is excessive. The point of the test is only to verify
that calling from_chars with wchar_t gives an error. I don't think we
need to make that conditional on whether wchar_t is supported or not.
Adding a whole new test and checking the dg-requires... condition adds
non-zero overhead to the testsuite.

Following the theme of my other replies, maybe _GLIBCXX_USE_WCHAR_T
isn't even the right thing to check here. We don't require any
support for wchar_t in this test, we only require the type to be
defined. Simply changing _GLIBCXX_USE_WCHAR_T to __SIZEOF_WCHAR_T__
seems like a better fix. That will mean that we use the type if it's
defined, and not otherwise. We don't care if the library actually
supports wchar_t specializations for std::char_traits etc. because we
are expecting to get an error anyway.
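
A minimal sketch of that suggestion, assuming a compiler that predefines __SIZEOF_WCHAR_T__ whenever the front end provides wchar_t (GCC and Clang do). The alias name wide_arg is hypothetical; the existing test declares the variable wc directly inside test01.

```cpp
#include <cassert>
#include <type_traits>

// Select the type on whether the wchar_t type itself exists, not on
// whether libc provides wide-character support.
#ifdef __SIZEOF_WCHAR_T__
typedef wchar_t wide_arg;   // the type exists, so use it
#else
enum W { };                 // placeholder, as in the existing test
typedef W wide_arg;
#endif

// Either way the selected type is not a plain integer type, so a call
// such as std::from_chars(first, last, wc) with a wide_arg wc is still
// expected to be rejected with "no matching" function.
```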

Re: Split wchars tests from the normal variant


On 28/12/20 19:39 +0100, François Dumont via Libstdc++ wrote:

On 22/12/20 10:27 pm, Alexandre Oliva wrote:

This change splits out the wchar-specific parts of the character
conversion tests so that those parts can be made conditional on actual
wchar support, while the rest applies more generally.

This turned out useful during our work on the libstdc++ support
for VxWorks, to expose the problematic areas more precisely.

Regstrapped on x86_64-linux-gnu, and tested with -x-arm-wrs-vxworks7r2.
Ok to install?  (dg-requires-wchars is added by another patch by
Corentin, that I posted a few minutes ago)

While updating Corentin's patch for mainline, I brought over to the
split-out test even the preprocessor conditional that is present in the
current version of the test, but required/implied by dg-requires-wchars.
Maybe that's excessive.  Maybe the whole patch is excessive, given that
conditional, but I didn't want to just drop it without asking for
others' thoughts.


from Corentin Gay 
for  libstdc++-v3/ChangeLog

* testsuite/20_util/from_chars/1_neg.cc: Split wchar specific
part into...
* testsuite/20_util/from_chars/1_neg_wchar.cc: ... new file.
---
 libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc |8 -
 .../testsuite/20_util/from_chars/1_neg_wchar.cc|   35 
 2 files changed, 35 insertions(+), 8 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/20_util/from_chars/1_neg_wchar.cc

diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
index 0d2fe2b3e6594..a84b0f5efb075 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
@@ -23,14 +23,6 @@
 void
 test01(const char* first, const char* last)
 {
-#if _GLIBCXX_USE_WCHAR_T
-  wchar_t wc;
-#else
-  enum W { } wc;
-#endif
-  std::from_chars(first, last, wc); // { dg-error "no matching" }
-  std::from_chars(first, last, wc, 10); // { dg-error "no matching" }
-
   char16_t c16;
   std::from_chars(first, last, c16); // { dg-error "no matching" }
   std::from_chars(first, last, c16, 10); // { dg-error "no matching" }
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_neg_wchar.cc b/libstdc++-v3/testsuite/20_util/from_chars/1_neg_wchar.cc
new file mode 100644
index 0..2d736a28a2da7
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/from_chars/1_neg_wchar.cc


AFAIK _neg should be last.


Yup.


Using wchar_t_neg.cc should even make the dg-require-wchars useless here.


Indeed, but I don't think we want this change anyway.
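
For readers wondering why the calls in this test are expected to fail: std::from_chars has no overload taking wchar_t or char16_t, so the dg-error "no matching" lines rely on overload resolution failing. A minimal sketch of the same property as a compile-time trait follows. The has_from_chars name is hypothetical and not part of libstdc++ or its testsuite, and the sketch assumes a C++17 compiler.

```cpp
#include <cassert>
#include <charconv>
#include <type_traits>
#include <utility>

// Hypothetical detection trait (an assumption for illustration only):
// true when std::from_chars (first, last, value) is a valid call for T.
template <typename T, typename = void>
struct has_from_chars : std::false_type {};

template <typename T>
struct has_from_chars<
    T, std::void_t<decltype (std::from_chars (std::declval<const char *> (),
                                              std::declval<const char *> (),
                                              std::declval<T &> ()))>>
    : std::true_type {};

// Integer types are accepted; wide and UTF-16 character types are not.
static_assert (has_from_chars<int>::value, "int is supported");
static_assert (!has_from_chars<char16_t>::value, "no char16_t overload");
static_assert (!has_from_chars<wchar_t>::value, "no wchar_t overload");
```

Unlike the dg-error approach, a trait like this reports the missing overload without producing a diagnostic, which is why the real test uses dg-error instead: it checks the user-visible failure.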

Re: [PATCH] gimple UIDs, LTO and -fanalyzer [PR98599]

2021-01-14 Thread Jan Hubicka

> On Wed, Jan 13, 2021 at 11:04 PM David Malcolm via Gcc-patches
>  wrote:
> >
> > gimple.h has this comment for gimple's uid field:
> >
> >   /* UID of this statement.  This is used by passes that want to
> >  assign IDs to statements.  It must be assigned and used by each
> >  pass.  By default it should be assumed to contain garbage.  */
> >   unsigned uid;
> >
> > and gimple_set_uid has:
> >
> >Please note that this UID property is supposed to be undefined at
> >pass boundaries.  This means that a given pass should not assume it
> >contains any useful value when the pass starts and thus can set it
> >to any value it sees fit.
> >
> > which suggests that any pass can use the uid field as an arbitrary
> > scratch space.
> >
> > PR analyzer/98599 reports a case where this error occurs in LTO mode:
> >   fatal error: Cgraph edge statement index out of range
> > on certain inputs with -fanalyzer.
> >
> > The error occurs in the LTRANS phase after -fanalyzer runs in the
> > WPA phase.  The analyzer pass writes to the uid fields of all stmts.
> >
> > The error occurs when LTRANS is streaming callgraph edges back in.
> > If I'm reading things correctly, the LTO format uses stmt uids to
> > associate call stmts with callgraph edges between WPA and LTRANS.
> > For example, in lto-cgraph.c, lto_output_edge writes out the
> > gimple_uid, and input_edge reads it back in.
> >
> > Hence IPA passes that touch the uids in WPA need to restore them,
> > or the stream-in at LTRANS will fail.
> >
> > Is it intended that the LTO machinery relies on the value of the uid
> > field being preserved during WPA (or, at least, needs to be saved and
> > restored by passes that touch it)?
> 
> I believe this is solely at the cgraph stream out to stream in boundary but
> this may be a blurred area since while we materialize the whole cgraph
> at once the function bodies are streamed in on demand.
> 
> Honza can probably clarify things.

Well, the uids are used to associate cgraph edges with statements.  At
the WPA stage you do not have function bodies, and thus the uids serve the
role of pointers to the statements.  If you load the body in (via
get_body), the uids are replaced by pointers, and when you stream out the
uids are recomputed again.

When do you touch the uids?  At WPA time or from a small IPA pass in
ltrans?

Honza
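
To make the save-and-restore idea under discussion concrete, here is a minimal standalone sketch. The class and method names (saved_uids, make_uid_unique, restore_uids) are taken from the ChangeLog entry in the quoted patch, but the bodies and the stmt stand-in type are assumptions for illustration, not the code from the patch.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Stand-in for GCC's gimple statement; only the scratch uid field is
// modelled here (an assumption for this sketch).
struct stmt
{
  unsigned uid;
};

// Hypothetical helper: records the uid a statement had before a pass
// overwrote it, so the original values can be put back afterwards.
class saved_uids
{
public:
  // Remember the statement's current uid, then overwrite it with a
  // fresh value that is unique among the statements seen so far.
  void make_uid_unique (stmt *s)
  {
    unsigned next_uid = static_cast<unsigned> (m_old_stmt_uids.size ());
    m_old_stmt_uids.push_back (std::make_pair (s, s->uid));
    s->uid = next_uid;
  }

  // Put back every uid recorded by make_uid_unique.
  void restore_uids () const
  {
    for (const auto &entry : m_old_stmt_uids)
      entry.first->uid = entry.second;
  }

private:
  std::vector<std::pair<stmt *, unsigned>> m_old_stmt_uids;
};
```

A pass would call make_uid_unique on each statement while it builds its own data structures and restore_uids when it is done, so that whatever the LTO streamer stored in the field is intact again at stream-out time.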
> 
> Note LTO uses this exactly because of this comment to avoid allocating
> extra memory for an 'index' but it could of course leave gimple_uid alone
> at some extra expense (eventually paid for in generic cgraph data structures
> and thus for not only the streaming time).
> 
> > On the assumption that this is the case, this patch updates the comments
> > in gimple.h referring to passes being able to set uid to any value to
> > note the caveat for IPA passes, and it updates the analyzer to save
> > and restore the UIDs, fixing the error.
> >
> > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> > OK for master?
> 
> The analyzer bits are OK, let's see how Honza can clarify the situation.
> 
> Thanks,
> Richard.
> 
> > gcc/analyzer/ChangeLog:
> > PR analyzer/98599
> > * supergraph.cc (saved_uids::make_uid_unique): New.
> > (saved_uids::restore_uids): New.
> > (supergraph::supergraph): Replace assignments to stmt->uid with
> > calls to m_stmt_uids.make_uid_unique.
> > (supergraph::~supergraph): New.
> > * supergraph.h (class saved_uids): New.
> > (supergraph::~supergraph): New decl.
> > (supergraph::m_stmt_uids): New field.
> >
> > gcc/ChangeLog:
> > PR analyzer/98599
> > * doc/gimple.texi: Document that UIDs must not change during IPA
> > passes.
> > * gimple.h (gimple::uid): Likewise.
> > (gimple_set_uid): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> > PR analyzer/98599
> > * gcc.dg/analyzer/pr98599-a.c: New test.
> > * gcc.dg/analyzer/pr98599-b.c: New test.
> > ---
> >  gcc/analyzer/supergraph.cc| 53 +--
> >  gcc/analyzer/supergraph.h | 15 +++
> >  gcc/doc/gimple.texi   |  6 +++
> >  gcc/gimple.h  | 13 +-
> >  gcc/testsuite/gcc.dg/analyzer/pr98599-a.c |  8 
> >  gcc/testsuite/gcc.dg/analyzer/pr98599-b.c |  1 +
> >  6 files changed, 90 insertions(+), 6 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr98599-a.c
> >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr98599-b.c
> >
> > diff --git a/gcc/analyzer/supergraph.cc b/gcc/analyzer/supergraph.cc
> > index 419f6424f76..40acfbd16a8 100644
> > --- a/gcc/analyzer/supergraph.cc
> > +++ b/gcc/analyzer/supergraph.cc
> > @@ -87,6 +87,46 @@ supergraph_call_edge (function *fun, gimple *stmt)
> >return edge;
> >  }
> >
> > +/* class saved_uids.
> > +
> > +   In order to ensure consistent results without relying on the ordering
> > +   of pointer values we assign a uid to each gimple

[PR66791][ARM] Replace __builtin_neon_vcge* with >= and <= for vcge and vcle intrinsics

2021-01-14 Thread Prathamesh Kulkarni via Gcc-patches

Hi,
The attached patch replaces the __builtin_neon_vcge* functions with the >= and
<= operators for the vcge and vcle intrinsics respectively.
Cross tested on arm*-*-*.
OK for trunk ?

Thanks,
Prathamesh


vcge-1.diff
Description: Binary data

[PATCH] x86: Error on -fcf-protection with incompatible target

-fcf-protection with CF_BRANCH inserts ENDBR32 at function entries.
ENDBR32 is NOP only on 64-bit processors and 32-bit TARGET_CMOVE
processors.  Issue an error for -fcf-protection with CF_BRANCH when
compiling for 32-bit non-TARGET_CMOVE targets.

gcc/

PR target/98667
* config/i386/i386-options.c (ix86_option_override_internal):
Issue an error for -fcf-protection with CF_BRANCH when compiling
for 32-bit non-TARGET_CMOVE targets.

gcc/testsuite/

PR target/98667
* gcc.target/i386/pr98667-1.c: New file.
* gcc.target/i386/pr98667-2.c: Likewise.
* gcc.target/i386/pr98667-3.c: Likewise.
---
 gcc/config/i386/i386-options.c| 9 -
 gcc/testsuite/gcc.target/i386/pr98667-1.c | 9 +
 gcc/testsuite/gcc.target/i386/pr98667-2.c | 9 +
 gcc/testsuite/gcc.target/i386/pr98667-3.c | 7 +++
 4 files changed, 33 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-3.c

diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 4e0165ff32c..1489871b36f 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -3016,8 +3016,15 @@ ix86_option_override_internal (bool main_args_p,
 }
 
   if (opts->x_flag_cf_protection != CF_NONE)
-opts->x_flag_cf_protection
+{
+  if ((opts->x_flag_cf_protection & CF_BRANCH) == CF_BRANCH
+ && !TARGET_64BIT
+ && !TARGET_CMOVE)
+   error ("%<-fcf-protection%> is not compatible with this target");
+
+  opts->x_flag_cf_protection
   = (cf_protection_level) (opts->x_flag_cf_protection | CF_SET);
+}
 
   if (ix86_tune_features [X86_TUNE_AVOID_256FMA_CHAINS])
 SET_OPTION_IF_UNSET (opts, opts_set, param_avoid_fma_max_bits, 256);
diff --git a/gcc/testsuite/gcc.target/i386/pr98667-1.c b/gcc/testsuite/gcc.target/i386/pr98667-1.c
new file mode 100644
index 000..5bf0c9285a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr98667-1.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -fcf-protection -march=i486" } */
+
+void
+test (void)
+{
+}
+
+/* { dg-error "'-fcf-protection' is not compatible with this target" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/i386/pr98667-2.c b/gcc/testsuite/gcc.target/i386/pr98667-2.c
new file mode 100644
index 000..bc3a78c9641
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr98667-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -fcf-protection=branch -march=i486" } */
+
+void
+test (void)
+{
+}
+
+/* { dg-error "'-fcf-protection' is not compatible with this target" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/i386/pr98667-3.c b/gcc/testsuite/gcc.target/i386/pr98667-3.c
new file mode 100644
index 000..a6ea6d04331
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr98667-3.c
@@ -0,0 +1,7 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -fcf-protection=return -march=i486" } */
+
+void
+test (void)
+{
+}
-- 
2.29.2
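
The condition added by the patch above can be sketched in isolation as follows. This is a minimal illustration, not the GCC code: the helper name cf_protection_rejected is hypothetical, the enum values are assumed to mirror GCC's cf_protection_level bits, and the two booleans stand in for the target macros tested in the patch.

```cpp
#include <cassert>

// Bit values assumed to mirror GCC's cf_protection_level enum; only
// CF_BRANCH matters for the check sketched here.
enum cf_protection_level : unsigned
{
  CF_NONE = 0,
  CF_BRANCH = 1u << 0,
  CF_RETURN = 1u << 1,
  CF_FULL = CF_BRANCH | CF_RETURN
};

// Hypothetical helper: true when the option combination would be
// rejected, i.e. branch protection is requested for a 32-bit target
// without CMOV, where ENDBR32 is not guaranteed to behave as a NOP.
bool
cf_protection_rejected (unsigned flags, bool target_64bit, bool target_cmov)
{
  return (flags & CF_BRANCH) == CF_BRANCH && !target_64bit && !target_cmov;
}
```

With -march=i486 (32-bit, no CMOV), CF_FULL and CF_BRANCH are rejected while CF_RETURN is accepted, which matches the expectations of the three new tests: pr98667-1.c and pr98667-2.c expect the error, pr98667-3.c does not.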

Re: [PATCH]middle-end slp: elide intermediate nodes for complex add and avoid truncate

On Thu, 14 Jan 2021, Tamar Christina wrote:

> Hi All,
> 
> This applies the same feedback received for MUL and the rest to
> ADD which was already committed.  In short it elides the intermediate
> nodes vec and avoids the use of truncate on the SLP child.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK.

Thanks,
Richard.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * tree-vect-slp-patterns.c (complex_add_pattern::build):
> 
> --- inline copy of patch -- 
> diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
> index be066b08310b72320fdbeb88a6b2969151f73cdc..e9f70958fdc32427ab0e1cceadfed41dfa091b47 100644
> --- a/gcc/tree-vect-slp-patterns.c
> +++ b/gcc/tree-vect-slp-patterns.c
> @@ -645,23 +645,21 @@ class complex_add_pattern : public complex_pattern
>  void
>  complex_add_pattern::build (vec_info *vinfo)
>  {
> -  auto_vec<slp_tree> nodes;
> +  SLP_TREE_CHILDREN (*this->m_node).reserve_exact (2);
> +
>slp_tree node = this->m_ops[0];
>vec<slp_tree> children = SLP_TREE_CHILDREN (node);
>  
>/* First re-arrange the children.  */
> -  nodes.create (children.length ());
> -  nodes.quick_push (children[0]);
> -  nodes.quick_push (vect_build_swap_evenodd_node (children[1]));
> +  SLP_TREE_CHILDREN (*this->m_node)[0] = children[0];
> +  SLP_TREE_CHILDREN (*this->m_node)[1] =
> +vect_build_swap_evenodd_node (children[1]);
>  
> -  SLP_TREE_REF_COUNT (nodes[0])++;
> -  SLP_TREE_REF_COUNT (nodes[1])++;
> +  SLP_TREE_REF_COUNT (SLP_TREE_CHILDREN (*this->m_node)[0])++;
> +  SLP_TREE_REF_COUNT (SLP_TREE_CHILDREN (*this->m_node)[1])++;
>vect_free_slp_tree (this->m_ops[0]);
>vect_free_slp_tree (this->m_ops[1]);
>  
> -  SLP_TREE_CHILDREN (*this->m_node).truncate (0);
> -  SLP_TREE_CHILDREN (*this->m_node).safe_splice (nodes);
> -
>complex_pattern::build (vinfo);
>  }
>  
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

RE: [PR66791][ARM] Replace __builtin_neon_vcge* with >= and <= for vcge and vcle intrinsics

Hi Prathamesh,

> -----Original Message-----
> From: Prathamesh Kulkarni 
> Sent: 14 January 2021 14:04
> To: gcc Patches ; Kyrylo Tkachov
> 
> Subject: [PR66791][ARM] Replace __builtin_neon_vcge* with >= and <= for
> vcge and vcle intrinsics
> 
> Hi,
> The attached patch replaces the __builtin_neon_vcge* functions with the >= and
> <= operators for the vcge and vcle intrinsics respectively.
> Cross tested on arm*-*-*.
> OK for trunk ?

Looks like it's the same way we do it on aarch64.
So ok.
Thanks,
Kyrill

> 
> Thanks,
> Prathamesh

Re: [PATCH] libstdc++: Add support for C++20 barriers


On 07/01/21 12:56 -0800, Thomas Rodgers via Libstdc++ wrote:


Tested x86_64-pc-linux-gnu, committed to master.


The copyright years need updating. Pushed to master.


commit 194a9d67be45568d81bb8c17e9102e31c1309e5f
Author: Jonathan Wakely 
Date:   Thu Jan 14 14:25:05 2021

libstdc++: Update copyright dates on new files

The patch adding these files was approved in 2020 but it wasn't
committed until 2021, so the copyright years were not updated along with
the years in all the existing files.

libstdc++-v3/ChangeLog:

* include/std/barrier: Update copyright years. Fix whitespace.
* include/std/version: Fix whitespace.
* testsuite/30_threads/barrier/1.cc: Update copyright years.
* testsuite/30_threads/barrier/2.cc: Likewise.
* testsuite/30_threads/barrier/arrive.cc: Likewise.
* testsuite/30_threads/barrier/arrive_and_drop.cc: Likewise.
* testsuite/30_threads/barrier/arrive_and_wait.cc: Likewise.
* testsuite/30_threads/barrier/completion.cc: Likewise.

diff --git a/libstdc++-v3/include/std/barrier b/libstdc++-v3/include/std/barrier
index f1143da89b4..e09212dfcb9 100644
--- a/libstdc++-v3/include/std/barrier
+++ b/libstdc++-v3/include/std/barrier
@@ -1,6 +1,6 @@
 //  -*- C++ -*-
 
-// Copyright (C) 2020 Free Software Foundation, Inc.
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -103,7 +103,7 @@ It looks different from literature pseudocode for two main reasons:
 			   static_cast<__barrier_phase_t>(__old_phase_val + 2);
 
 	size_t __current_expected = _M_expected;
-	std::hash<std::thread::id>__hasher;
+	std::hash<std::thread::id> __hasher;
 	size_t __current = __hasher(std::this_thread::get_id())
 	  % ((_M_expected + 1) >> 1);
 
diff --git a/libstdc++-v3/include/std/version b/libstdc++-v3/include/std/version
index 9516558d8b4..e3d52b88c21 100644
--- a/libstdc++-v3/include/std/version
+++ b/libstdc++-v3/include/std/version
@@ -200,8 +200,8 @@
 #if defined _GLIBCXX_HAS_GTHREADS || defined _GLIBCXX_HAVE_LINUX_FUTEX
 # define __cpp_lib_atomic_wait 201907L
 # if __cpp_aligned_new
-# define __cpp_lib_barrier 201907L
-#endif
+#  define __cpp_lib_barrier 201907L
+# endif
 #endif
 #define __cpp_lib_bind_front 201907L
 #if __has_builtin(__builtin_bit_cast)
diff --git a/libstdc++-v3/testsuite/30_threads/barrier/1.cc b/libstdc++-v3/testsuite/30_threads/barrier/1.cc
index 4c15deb1398..a21fae32127 100644
--- a/libstdc++-v3/testsuite/30_threads/barrier/1.cc
+++ b/libstdc++-v3/testsuite/30_threads/barrier/1.cc
@@ -1,4 +1,4 @@
-// Copyright (C) 2020 Free Software Foundation, Inc.
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
diff --git a/libstdc++-v3/testsuite/30_threads/barrier/2.cc b/libstdc++-v3/testsuite/30_threads/barrier/2.cc
index 0fac1ef3f3c..94e37d739da 100644
--- a/libstdc++-v3/testsuite/30_threads/barrier/2.cc
+++ b/libstdc++-v3/testsuite/30_threads/barrier/2.cc
@@ -1,4 +1,4 @@
-// Copyright (C) 2019-2020 Free Software Foundation, Inc.
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
diff --git a/libstdc++-v3/testsuite/30_threads/barrier/arrive.cc b/libstdc++-v3/testsuite/30_threads/barrier/arrive.cc
index 6e64e378cb0..fb0f56292c0 100644
--- a/libstdc++-v3/testsuite/30_threads/barrier/arrive.cc
+++ b/libstdc++-v3/testsuite/30_threads/barrier/arrive.cc
@@ -3,7 +3,7 @@
 // { dg-require-gthreads "" }
 // { dg-additional-options "-pthread" { target pthread } }
 
-// Copyright (C) 2020 Free Software Foundation, Inc.
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
diff --git a/libstdc++-v3/testsuite/30_threads/barrier/arrive_and_drop.cc b/libstdc++-v3/testsuite/30_threads/barrier/arrive_and_drop.cc
index 55f40e17062..22b40200c80 100644
--- a/libstdc++-v3/testsuite/30_threads/barrier/arrive_and_drop.cc
+++ b/libstdc++-v3/testsuite/30_threads/barrier/arrive_and_drop.cc
@@ -3,7 +3,7 @@
 // { dg-require-gthreads "" }
 // { dg-additional-options "-pthread" { target pthread } }
 
-// Copyright (C) 2020 Free Software Foundation, Inc.
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
diff --git a/libstdc++-v3/testsuite/30_threads/barrier/arrive_and_wait.cc b/libstdc++-v3/testsuite/30_threads/barrier/arrive_and_wait.cc
index 2a3a69ae3c0..f9b4fa29063 100644
--- a/libstdc++-v3/testsu

Re: [PATCH 1/4] Remove build dependence on HSA run-time

Hi!

I'm raising here an issue with HSA libgomp plugin code changes from a
while ago.  While HSA is now no longer relevant for GCC master branch,
the same code has also been copied into the GCN libgomp plugin.

This is commit b8d89b03db5f212919e4571671ebb4f5f8b1e19d (r242749) "Remove
build dependence on HSA run-time":

On 2016-11-22T14:27:44+0100, Martin Jambor  wrote:
> --- a/libgomp/plugin/configfrag.ac
> +++ b/libgomp/plugin/configfrag.ac

> @@ -195,8 +183,8 @@ if test x"$enable_offload_targets" != x; then
>   tgt_name=hsa
>   PLUGIN_HSA=$tgt
>   PLUGIN_HSA_CPPFLAGS=$HSA_RUNTIME_CPPFLAGS
> - PLUGIN_HSA_LDFLAGS="$HSA_RUNTIME_LDFLAGS $HSA_KMT_LDFLAGS"
> - PLUGIN_HSA_LIBS="-lhsa-runtime64 -lhsakmt"
> + PLUGIN_HSA_LDFLAGS="$HSA_RUNTIME_LDFLAGS"
> + PLUGIN_HSA_LIBS="-ldl"

So this switched from directly linking against 'libhsa-runtime64.so' to a
'libdl'-based runtime linking variant.

Previously, 'libhsa-runtime64.so' would've been found at run time via the
standard search paths.

> +if test "$HSA_RUNTIME_LIB" != ""; then
> +  HSA_RUNTIME_LIB="$HSA_RUNTIME_LIB/"
> +fi
> +
> +AC_DEFINE_UNQUOTED([HSA_RUNTIME_LIB], ["$HSA_RUNTIME_LIB"],
> +  [Define path to HSA runtime.])

That's new, to propagate '--with-hsa-runtime'/'--with-hsa-runtime-lib'
into the HSA plugin source code.

> --- a/libgomp/plugin/plugin-hsa.c
> +++ b/libgomp/plugin/plugin-hsa.c

> +static const char *hsa_runtime_lib;

>  static void
>  init_enviroment_variables (void)
>  {

> +  hsa_runtime_lib = secure_getenv ("HSA_RUNTIME_LIB");

Unless overridden via the 'HSA_RUNTIME_LIB' environment variable...

> +  if (hsa_runtime_lib == NULL)
> +hsa_runtime_lib = HSA_RUNTIME_LIB "libhsa-runtime64.so";

... we now default to '[HSA_RUNTIME_LIB]/libhsa-runtime64.so' (note
'HSA_RUNTIME_LIB' prefix!)...

> +static bool
> +init_hsa_runtime_functions (void)
> +{
> +  void *handle = dlopen (hsa_runtime_lib, RTLD_LAZY);

..., which is then 'dlopen'ed here.

That means, contrary to before, the GCC configure-time
'--with-hsa-runtime' (by definition only valid for GCC configure/build as
well as build-tree testing) leaks into the installed HSA libgomp plugin.
That's a problem if your GCC build system (and build-tree testing)
requires '--with-hsa-runtime' to specify a non-standard location (not in
the default search paths) but that location is not valid on your GCC
deployment system.  Because the location has leaked into the HSA libgomp
plugin, 'libhsa-runtime64.so' is then no longer found via the standard
search paths (unless overridden via the 'HSA_RUNTIME_LIB' environment
variable), due to the 'HSA_RUNTIME_LIB' prefix passed into 'dlopen'.

Per my understanding this cannot be intentional, so I suggest restoring
the previous behavior as per the attached "libgomp HSA/GCN plugins: don't
prepend the 'HSA_RUNTIME_LIB' path to 'libhsa-runtime64.so'".  OK to push
such changes?  I was tempted to push "as obvious", but maybe I fail to
see the rationale behind this change?

For avoidance of doubt, this change doesn't affect (build-tree) testsuite
usage, where we have:

libgomp/testsuite/libgomp-test-support.exp.in:set hsa_runtime_lib "@HSA_RUNTIME_LIB@"

libgomp/testsuite/lib/libgomp.exp:  append always_ld_library_path ":$hsa_runtime_lib"

And, another data point:

gcc/config/gcn/gcn-run.c:#define HSA_RUNTIME_LIB "libhsa-runtime64.so.1"
[...]
gcc/config/gcn/gcn-run.c:  void *handle = dlopen (HSA_RUNTIME_LIB, RTLD_LAZY);

Here, 'libhsa-runtime64.so.1' is 'dlopen'ed without prefix, and thus
found via the standard search paths (as expected).

Regards
 Thomas
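
The effect described in the message above follows from how dlopen treats its argument: per dlopen(3), a name containing a slash is opened as a pathname, while a bare name is looked up via the dynamic linker's search paths. A minimal sketch of that distinction follows. Both helper names and the example prefix used in the note below are hypothetical, and the sketch only models the lookup rule, it does not call dlopen.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: compose the name handed to dlopen from a
// configure-time prefix (possibly empty) and the library's file name,
// in the way the plugin concatenates HSA_RUNTIME_LIB with the soname.
std::string
runtime_lib_name (const std::string &prefix, const std::string &soname)
{
  return prefix + soname;
}

// Models the dlopen(3) rule: only a name without any slash is resolved
// through the standard search paths (LD_LIBRARY_PATH, ld.so cache, ...).
bool
found_via_search_paths (const std::string &name)
{
  return name.find ('/') == std::string::npos;
}
```

With an empty prefix, "libhsa-runtime64.so" goes through the standard search; with a prefix such as "/build/only/lib/" (a made-up path), the resulting name is a fixed pathname that must exist on the deployment system.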

From 936e7ee10349a6be2bd0a6a2198f70239a8e1ec1 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 25 Jun 2020 11:59:42 +0200
Subject: [PATCH] libgomp HSA/GCN plugins: don't prepend the 'HSA_RUNTIME_LIB'
 path to 'libhsa-runtime64.so'

For unknown reasons, this had gotten added for the libgomp HSA plugin in commit
b8d89b03db5f212919e4571671ebb4f5f8b1e19d (r242749) "Remove build dependence on
HSA run-time", and later propagated into the GCN plugin.

	libgomp/
	* plugin/plugin-hsa.c (init_enviroment_variables): Don't prepend
	the 'HSA_RUNTIME_LIB' path to 'libhsa-runtime64.so'.
	* plugin/plugin-gcn.c (init_environment_variables): Likewise.
	* plugin/configfrag.ac (HSA_RUNTIME_LIB): Clean up.
	* configure: Regenerate.
---
 libgomp/configure| 10 --
 libgomp/plugin/configfrag.ac |  7 ---
 libgomp/plugin/plugin-gcn.c  |  2 +-
 libgomp/plugin/plugin-hsa.c  |  2 +-
 4 files changed, 2 insertions(+), 19 deletions(-)

diff --git a/libgomp/configure b/libgomp/configure
index d8d98f182d4..9765a9068fe 100755
--- a/libgomp/configure
+++ b/libgomp/configure
@@ -15483,16 +15483,6 @@ cat >>confdefs.h <<_ACEOF
 _

Re: [PATCH] x86: Error on -fcf-protection with incompatible target

On Thu, Jan 14, 2021 at 3:05 PM H.J. Lu  wrote:
>
> -fcf-protection with CF_BRANCH inserts ENDBR32 at function entries.
> ENDBR32 is NOP only on 64-bit processors and 32-bit TARGET_CMOVE
> processors.  Issue an error for -fcf-protection with CF_BRANCH when
> compiling for 32-bit non-TARGET_CMOVE targets.
>
> gcc/
>
> PR target/98667
> * config/i386/i386-options.c (ix86_option_override_internal):
> Issue an error for -fcf-protection with CF_BRANCH when compiling
> for 32-bit non-TARGET_CMOVE targets.
>
> gcc/testsuite/
>
> PR target/98667
> * gcc.target/i386/pr98667-1.c: New file.
> * gcc.target/i386/pr98667-2.c: Likewise.
> * gcc.target/i386/pr98667-3.c: Likewise.
> ---
>  gcc/config/i386/i386-options.c| 9 -
>  gcc/testsuite/gcc.target/i386/pr98667-1.c | 9 +
>  gcc/testsuite/gcc.target/i386/pr98667-2.c | 9 +
>  gcc/testsuite/gcc.target/i386/pr98667-3.c | 7 +++
>  4 files changed, 33 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-3.c
>
> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index 4e0165ff32c..1489871b36f 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -3016,8 +3016,15 @@ ix86_option_override_internal (bool main_args_p,
>  }
>
>if (opts->x_flag_cf_protection != CF_NONE)
> -opts->x_flag_cf_protection
> +{
> +  if ((opts->x_flag_cf_protection & CF_BRANCH) == CF_BRANCH
> + && !TARGET_64BIT
> + && !TARGET_CMOVE)

You need TARGET_CMOV (note, no E) here. Also, please put both tests on one line.

LGTM with the above change.

Thanks,
Uros.

> +   error ("%<-fcf-protection%> is not compatible with this target");
> +
> +  opts->x_flag_cf_protection
>= (cf_protection_level) (opts->x_flag_cf_protection | CF_SET);
> +}
>
>if (ix86_tune_features [X86_TUNE_AVOID_256FMA_CHAINS])
>  SET_OPTION_IF_UNSET (opts, opts_set, param_avoid_fma_max_bits, 256);
> diff --git a/gcc/testsuite/gcc.target/i386/pr98667-1.c b/gcc/testsuite/gcc.target/i386/pr98667-1.c
> new file mode 100644
> index 000..5bf0c9285a8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr98667-1.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -fcf-protection -march=i486" } */
> +
> +void
> +test (void)
> +{
> +}
> +
> +/* { dg-error "'-fcf-protection' is not compatible with this target" "" { target *-*-* } 0 } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr98667-2.c b/gcc/testsuite/gcc.target/i386/pr98667-2.c
> new file mode 100644
> index 000..bc3a78c9641
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr98667-2.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -fcf-protection=branch -march=i486" } */
> +
> +void
> +test (void)
> +{
> +}
> +
> +/* { dg-error "'-fcf-protection' is not compatible with this target" "" { target *-*-* } 0 } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr98667-3.c b/gcc/testsuite/gcc.target/i386/pr98667-3.c
> new file mode 100644
> index 000..a6ea6d04331
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr98667-3.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -fcf-protection=return -march=i486" } */
> +
> +void
> +test (void)
> +{
> +}
> --
> 2.29.2
>

Re: [PATCH] match.pd: Add ~(X - Y) -> ~X + Y simplification [PR96685]

2021-01-14 Thread Maciej W. Rozycki

On Tue, 12 Jan 2021, Richard Biener wrote:

> >>> The point of the match.pd changes is to canonicalize GIMPLE on some form
> >>> when there are several from GIMPLE POV equivalent or better forms of
> >>> writing the same thing.  The advantage of having one canonical way is
> >>> that ICF, SCCVN etc. optimizations can then understand the different
> >>> forms are equivalent.
> >> Fair enough, though in cases like this I think it is unclear which of the
> >> two forms is going to be ultimately better, especially as it may depend on
> >> the exact form of the operands used, e.g. values of any immediates, so I
> >> think a way to make the reverse transformation (whether to undo one made
> >> here or genuinely) needs to be available at a later compilation stage.
> >> One size doesn't fit all.
> >>
> >>  With this in mind...
> > So in this case the number of operations are the same before/after and
> > parallelism is the same before/after, register lifetimes, etc.   I doubt
> > either form is particularly better suited for CSE or gives better VRP
> > data, etc.   The fact that we can't always do ~(X +C) -> ~X + -C
> > probably argues against that form ever so slightly.

 FWIW I agree with Jakub here, that having one canonical form for the 
middle end to operate on is advantageous.  It is just that when we 
eventually get to the backend we may want to do the reverse transformation 
in some cases, which may be specific immediate operand values or whatever 
the backend may see fit.

> >>> If another form is then better for a particular machine, it should be done
> >>> either during expansion (being able to produce both RTLs and computing
> >>> their costs), or during combine with either combine splitters or
> >>> define_insn_and_split in the backend, or, if it can't be done in RTL,
> >>> during the isel pass.
> >> Hmm, maybe it has been discussed before, so please forgive me if I write
> >> something silly, but it seems to me like this should be done in a generic
> >> way like match.pd so that all the targets do not have to track the changes
> >> made there and then perhaps repeat the same or similar code each.  So I
> >> think it would make sense to make a change like this include that reverse
> >> transformation as well, so that ultimately both forms are tried with RTL,
> >> as there is no clear advantage to either here.
> > The idea we've kicked around in the past was to use the same syntax as
> > match.pd, but have it be target dependent to reform expressions in ways
> > that are beneficial to the target and have it run at the end of the
> > gimple/ssa pipeline.  Nobody's implemented this though.

 Hmm, but why does it have to be target dependent?  For match.pd we do 
things unconditionally, to have a uniform intermediate representation, 
however here we wouldn't have to, as we can check the costs respectively 
of the original and the transformed expression and choose the cheaper of 
the two.  Would that be so we don't waste cycles with targets we know 
beforehand a given transformation won't buy anything?

 In that case however no code quality regression would happen anyway, so I 
think it would be more productive if we still had all transformations 
defined in a generic manner and then possibly the hopeless ones excluded 
by hand for targets listed.  This way if anything is omitted by chance, 
i.e. not excluded for a given target, then good code will still be 
produced and only some compilation performance lost.

 While if we require all port maintainers to qualify individual 
transformations by hand as they are added by someone to their pet target, 
we'll end up with a lot of duplicate effort and missed bits.  Of course 
some very exotic transformations that match unique target machine 
instructions may indeed best be added to a single target only.

> Yes.  And while a gimple-to-gimple transform is indeed quite simple
> to make eventually a match.pd-like gimple-to-RTL would be more
> useful in the end.  Note RTL can eventually be emulated close enough
> via the use of internal functions mapping to optabs.  But still
> complex combined instructions will need expander help unless we
> expose all named instruction RTL patterns as target specific
> internal functions to use from that .pd file.

 Hmm, why aren't the standard named patterns we already have going to be 
sufficient?

  Maciej
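
As a side note for readers following the thread, the equivalence behind the ~(X - Y) -> ~X + Y canonicalization can be checked directly in wrapping (unsigned) arithmetic, since ~a equals -a - 1 modulo 2^N. The sketch below only demonstrates that identity and the related ~(X + C) -> ~X + -C form; the function names are hypothetical, and it says nothing about which form is cheaper on a given target.

```cpp
#include <cassert>
#include <cstdint>

// Original form: ~(x - y), evaluated modulo 2^32.
constexpr std::uint32_t
not_of_difference (std::uint32_t x, std::uint32_t y)
{
  return ~(x - y);
}

// Canonicalized form: ~x + y, evaluated modulo 2^32.
constexpr std::uint32_t
not_plus (std::uint32_t x, std::uint32_t y)
{
  return ~x + y;
}

// Related form mentioned in the thread: ~(x + c) versus ~x + -c.
constexpr std::uint32_t
not_of_sum (std::uint32_t x, std::uint32_t c)
{
  return ~(x + c);
}

constexpr std::uint32_t
not_minus (std::uint32_t x, std::uint32_t c)
{
  return ~x + (0u - c);
}

static_assert (not_of_difference (5u, 3u) == not_plus (5u, 3u), "");
static_assert (not_of_difference (0u, 1u) == not_plus (0u, 1u), "");
static_assert (not_of_sum (7u, 9u) == not_minus (7u, 9u), "");
```

In unsigned arithmetic both rewrites always hold. The caveat raised in the thread about ~(X + C) concerns signed types, where negating C can overflow.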

[PATCH] i386: Update PR target/95021 tests

Also pass -mpreferred-stack-boundary=4 -mno-stackrealign to avoid
disabling STV by:

  /* Disable STV if -mpreferred-stack-boundary={2,3} or
 -mincoming-stack-boundary={2,3} or -mstackrealign - the needed
 stack realignment will be extra cost the pass doesn't take into
 account and the pass can't realign the stack.  */
  if (ix86_preferred_stack_boundary < 128
  || ix86_incoming_stack_boundary < 128
  || opts->x_ix86_force_align_arg_pointer)
opts->x_target_flags &= ~MASK_STV;

PR target/98676
* gcc.target/i386/pr95021-1.c: Add -mpreferred-stack-boundary=4
-mno-stackrealign.
* gcc.target/i386/pr95021-3.c: Likewise.
---
 gcc/testsuite/gcc.target/i386/pr95021-1.c | 2 +-
 gcc/testsuite/gcc.target/i386/pr95021-3.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr95021-1.c b/gcc/testsuite/gcc.target/i386/pr95021-1.c
index a0b9a262a87..ec58596959c 100644
--- a/gcc/testsuite/gcc.target/i386/pr95021-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr95021-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target ia32 } } */
-/* { dg-options "-O2 -msse2 -mstv -W" } */
+/* { dg-options "-O2 -msse2 -mstv -mpreferred-stack-boundary=4 -mno-stackrealign -W" } */
 /* { dg-final { scan-assembler "movq\[ \t\]%xmm\[0-9\]+, \\(%esp\\)" } } */
 /* { dg-final { scan-assembler-not "psrlq" } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr95021-3.c b/gcc/testsuite/gcc.target/i386/pr95021-3.c
index 52f9e4569b3..0f16b16f793 100644
--- a/gcc/testsuite/gcc.target/i386/pr95021-3.c
+++ b/gcc/testsuite/gcc.target/i386/pr95021-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target ia32 } } */
-/* { dg-options "-O2 -msse2 -mstv -mregparm=3 -W -mtune=generic" } */
+/* { dg-options "-O2 -msse2 -mstv -mpreferred-stack-boundary=4 -mno-stackrealign -mregparm=3 -W -mtune=generic" } */
 /* { dg-final { scan-assembler "movq\[ \t\]+\[^\n\]*, %xmm" } } */
 
 #include "pr95021-1.c"
-- 
2.29.2
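
The connection between the option values in the patch above and the 128 in the quoted check may not be obvious: -mpreferred-stack-boundary=N and -mincoming-stack-boundary=N request 2^N bytes of alignment, while the check compares bit counts, so N=4 gives exactly 128 bits. A minimal sketch of that relationship follows; the helper names are hypothetical and the function merely mirrors the quoted condition rather than reproducing GCC's option handling.

```cpp
#include <cassert>

// Convert an -m{preferred,incoming}-stack-boundary=N argument to the
// alignment in bits: 2^N bytes, 8 bits per byte.
constexpr int
stack_boundary_bits (int n)
{
  return (1 << n) * 8;
}

// Hypothetical helper mirroring the quoted condition: STV stays enabled
// only when both boundaries are at least 128 bits and the argument
// pointer is not forcibly realigned (-mstackrealign).
constexpr bool
stv_stays_enabled (int preferred_n, int incoming_n, bool stackrealign)
{
  return !(stack_boundary_bits (preferred_n) < 128
	   || stack_boundary_bits (incoming_n) < 128
	   || stackrealign);
}

static_assert (stack_boundary_bits (4) == 128, "2^4 bytes is 128 bits");
```

This is why the updated tests pass both -mpreferred-stack-boundary=4 and -mno-stackrealign: a test configuration that defaults to a smaller boundary or to realignment would otherwise turn STV off and make the scan-assembler patterns fail.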

Re: [PATCH] match.pd: Add ~(X - Y) -> ~X + Y simplification [PR96685]

On Thu, 14 Jan 2021, Maciej W. Rozycki wrote:

> On Tue, 12 Jan 2021, Richard Biener wrote:
> 
> > >>> The point of the match.pd changes is to canonicalize GIMPLE on some form
> > >>> when there are several from GIMPLE POV equivalent or better forms of
> > >>> writing the same thing.  The advantage of having one canonical way is
> > >>> that ICF, SCCVN etc. optimizations can then understand the different
> > >>> forms are equivalent.
> > >> Fair enough, though in cases like this I think it is unclear which of the
> > >> two forms is going to be ultimately better, especially as it may depend
> > >> on the exact form of the operands used, e.g. values of any immediates,
> > >> so I think a way to make the reverse transformation (whether to undo one
> > >> made here or genuinely) needs to be available at a later compilation
> > >> stage.  One size doesn't fit all.
> > >>
> > >>  With this in mind...
> > > So in this case the number of operations are the same before/after and
> > > parallelism is the same before/after, register lifetimes, etc.   I doubt
> > > either form is particularly better suited for CSE or gives better VRP
> > > data, etc.   The fact that we can't always do ~(X +C) -> ~X + -C
> > > probably argues against that form ever so slightly.
> 
>  FWIW I agree with Jakub here, that having one canonical form for the 
> middle end to operate on is advantageous.  It is just that when we 
> eventually get to the backend we may want to do the reverse transformation 
> in some cases, which may be specific immediate operand values or whatever 
> the backend may see fit.
> 
> > >>> If another form is then better for a particular machine, it should be 
> > >>> done
> > >>> either during expansion (being able to produce both RTLs and computing
> > >>> their
> > >>> costs), or during combine with either combine splitters or
> > >>> define_insn_and_split in the backend, or, if it can't be done in RTL,
> > >>> during
> > >>> the isel pass.
> > >> Hmm, maybe it has been discussed before, so please forgive me if I write
> > >> something silly, but it seems to me like this should be done in a generic
> > >> way like match.pd so that all the targets do not have to track the 
> > >> changes
> > >> made there and then perhaps repeat the same or similar code each.  So I
> > >> think it would make sense to make a change like this include that reverse
> > >> transformation as well, so that ultimately both forms are tried with RTL,
> > >> as there is no clear advantage to either here.
> > > The idea we've kicked around in the past was to use the same syntax as
> > > match.pd, but have it be target dependent to reform expressions in ways
> > > that are beneficial to the target and have it run at the end of the
> > > gimple/ssa pipeline.  Nobody's implemented this though.
> 
>  Hmm, but why does it have to be target dependent?  For match.pd we do 
> things unconditionally, to have a uniform intermediate representation, 
> however here we wouldn't have to, as we can check the costs respectively 
> of the original and the transformed expression and choose the cheaper of 
> the two.  Would that be so we don't waste cycles with targets we know 
> beforehand a given transformation won't buy anything?
> 
>  In that case however no code quality regression would happen anyway, so I 
> think it would be more productive if we still had all transformations 
> defined in a generic manner and then possibly the hopeless ones excluded 
> by hand for targets listed.  This way if anything is omitted by chance, 
> i.e. not excluded for a given target, then good code will still be 
> produced and only some compilation performance lost.
> 
>  While if we require all port maintainers to qualify individual 
> transformations by hand as they are added by someone to their pet target, 
> we'll end up with a lot of duplicate effort and missed bits.  Of course 
> some very exotic transformations that match unique target machine 
> instructions may indeed best be added to a single target only.
> 
> > Yes.  And while a gimple-to-gimple transform is indeed quite simple
> > to make eventually a match.pd-like gimple-to-RTL would be more
> > useful in the end.  Note RTL can eventually be emulated close enough
> > via the use of internal functions mapping to optabs.  But still
> > complex combined instructions will need expander help unless we
> > expose all named instruction RTL patterns as target specific
> > internal functions to use from that .pd file.
> 
>  Hmm, why aren't the standard named patterns we already have going to be 
> sufficient?

Because two standard name pattern expansions can later be combined
to a non-standard define-insn which may have lower cost and we'd
of course like to see such instruction selection opportunities here.

Richard.
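
For reference, the identity named in the subject line can be checked in
unsigned (wrapping) arithmetic.  Since ~A equals -A - 1 modulo 2^n, we get
~(X - Y) = -(X - Y) - 1 = (-X - 1) + Y = ~X + Y.  The following is only a
sketch; the helper names are made up for illustration and are not part of
match.pd or GCC.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the simplification: ~(X - Y).
constexpr std::uint32_t not_of_difference(std::uint32_t x, std::uint32_t y)
{
  return ~(x - y);
}

// Right-hand side of the simplification: ~X + Y.
constexpr std::uint32_t not_plus(std::uint32_t x, std::uint32_t y)
{
  return ~x + y;
}

// A few spot checks evaluated at compile time, including wrap-around cases.
static_assert(not_of_difference(0u, 0u) == not_plus(0u, 0u), "");
static_assert(not_of_difference(5u, 7u) == not_plus(5u, 7u), "");
static_assert(not_of_difference(0xffffffffu, 1u) == not_plus(0xffffffffu, 1u), "");
```

Both forms use two operations, which is why the thread treats the choice
between them as a matter of canonicalization rather than cost.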

Re: [PATCH] i386: Update PR target/95021 tests

On Thu, Jan 14, 2021 at 4:00 PM H.J. Lu  wrote:
>
> Also pass -mpreferred-stack-boundary=4 -mno-stackrealign to avoid
> disabling STV by:
>
>   /* Disable STV if -mpreferred-stack-boundary={2,3} or
>  -mincoming-stack-boundary={2,3} or -mstackrealign - the needed
>  stack realignment will be extra cost the pass doesn't take into
>  account and the pass can't realign the stack.  */
>   if (ix86_preferred_stack_boundary < 128
>   || ix86_incoming_stack_boundary < 128
>   || opts->x_ix86_force_align_arg_pointer)
> opts->x_target_flags &= ~MASK_STV;
>
> PR target/98676
> * gcc.target/i386/pr95021-1.c: Add -mpreferred-stack-boundary=4
> -mno-stackrealign.
> * gcc.target/i386/pr95021-3.c: Likewise.

OK.

Thanks,
Uros.

> ---
>  gcc/testsuite/gcc.target/i386/pr95021-1.c | 2 +-
>  gcc/testsuite/gcc.target/i386/pr95021-3.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr95021-1.c 
> b/gcc/testsuite/gcc.target/i386/pr95021-1.c
> index a0b9a262a87..ec58596959c 100644
> --- a/gcc/testsuite/gcc.target/i386/pr95021-1.c
> +++ b/gcc/testsuite/gcc.target/i386/pr95021-1.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile { target ia32 } } */
> -/* { dg-options "-O2 -msse2 -mstv -W" } */
> +/* { dg-options "-O2 -msse2 -mstv -mpreferred-stack-boundary=4 
> -mno-stackrealign -W" } */
>  /* { dg-final { scan-assembler "movq\[ \t\]%xmm\[0-9\]+, \\(%esp\\)" } } */
>  /* { dg-final { scan-assembler-not "psrlq" } } */
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr95021-3.c 
> b/gcc/testsuite/gcc.target/i386/pr95021-3.c
> index 52f9e4569b3..0f16b16f793 100644
> --- a/gcc/testsuite/gcc.target/i386/pr95021-3.c
> +++ b/gcc/testsuite/gcc.target/i386/pr95021-3.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile { target ia32 } } */
> -/* { dg-options "-O2 -msse2 -mstv -mregparm=3 -W -mtune=generic" } */
> +/* { dg-options "-O2 -msse2 -mstv -mpreferred-stack-boundary=4 
> -mno-stackrealign -mregparm=3 -W -mtune=generic" } */
>  /* { dg-final { scan-assembler "movq\[ \t\]+\[^\n\]*, %xmm" } } */
>
>  #include "pr95021-1.c"
> --
> 2.29.2
>
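
The condition quoted at the top of this message can be modelled as a small
predicate.  This is a sketch only: the struct and function names below are
hypothetical, and it assumes that the -mpreferred-stack-boundary=N and
-mincoming-stack-boundary=N arguments mean a boundary of 2^N bytes, which is
compared against 128 bits.

```cpp
#include <cassert>

// Hypothetical model of the options involved; not GCC's data structures.
struct StackOptions
{
  int preferred_boundary_log2;  // N from -mpreferred-stack-boundary=N
  int incoming_boundary_log2;   // N from -mincoming-stack-boundary=N
  bool force_realign;           // true when -mstackrealign is in effect
};

// Convert the option value (log2 of bytes) to a boundary in bits.
constexpr int boundary_in_bits(int log2_bytes)
{
  return (1 << log2_bytes) * 8;
}

// True if the quoted condition would leave STV enabled.
constexpr bool stv_stays_enabled(const StackOptions &o)
{
  return !(boundary_in_bits(o.preferred_boundary_log2) < 128
           || boundary_in_bits(o.incoming_boundary_log2) < 128
           || o.force_realign);
}
```

Under this model, passing -mpreferred-stack-boundary=4 -mno-stackrealign in
dg-options pins two of the three inputs, so a test run that supplies other
stack options should no longer switch STV off.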

Re: [PATCH] x86: Error on -fcf-protection with incompatible target

On Thu, Jan 14, 2021 at 6:51 AM Uros Bizjak  wrote:
>
> On Thu, Jan 14, 2021 at 3:05 PM H.J. Lu  wrote:
> >
> > -fcf-protection with CF_BRANCH inserts ENDBR32 at function entries.
> > ENDBR32 is NOP only on 64-bit processors and 32-bit TARGET_CMOVE
> > processors.  Issue an error for -fcf-protection with CF_BRANCH when
> > compiling for 32-bit non-TARGET_CMOVE targets.
> >
> > gcc/
> >
> > PR target/98667
> > * config/i386/i386-options.c (ix86_option_override_internal):
> > Issue an error for -fcf-protection with CF_BRANCH when compiling
> > for 32-bit non-TARGET_CMOVE targets.
> >
> > gcc/testsuite/
> >
> > PR target/98667
> > * gcc.target/i386/pr98667-1.c: New file.
> > * gcc.target/i386/pr98667-2.c: Likewise.
> > * gcc.target/i386/pr98667-3.c: Likewise.
> > ---
> >  gcc/config/i386/i386-options.c| 9 -
> >  gcc/testsuite/gcc.target/i386/pr98667-1.c | 9 +
> >  gcc/testsuite/gcc.target/i386/pr98667-2.c | 9 +
> >  gcc/testsuite/gcc.target/i386/pr98667-3.c | 7 +++
> >  4 files changed, 33 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-3.c
> >
> > diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> > index 4e0165ff32c..1489871b36f 100644
> > --- a/gcc/config/i386/i386-options.c
> > +++ b/gcc/config/i386/i386-options.c
> > @@ -3016,8 +3016,15 @@ ix86_option_override_internal (bool main_args_p,
> >  }
> >
> >if (opts->x_flag_cf_protection != CF_NONE)
> > -opts->x_flag_cf_protection
> > +{
> > +  if ((opts->x_flag_cf_protection & CF_BRANCH) == CF_BRANCH
> > + && !TARGET_64BIT
> > + && !TARGET_CMOVE)
>
> You need TARGET_CMOV (note, no E) here. Also, please put both tests on one 
> line.
>
> LGTM with the above change.

This is the patch I am checking in.

Thanks.

-- 
H.J.
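
As a reading aid for the patch below, the new check amounts to the following
predicate.  This is a sketch under stated assumptions: the enumerator values
are illustrative rather than copied from GCC, and the Target struct is a
hypothetical stand-in for the TARGET_64BIT and TARGET_CMOV macros.

```cpp
#include <cassert>

// Bit flags in the spirit of GCC's cf_protection_level.  The numeric values
// are assumptions made for this example.
enum cf_level : unsigned
{
  CF_NONE = 0,
  CF_BRANCH = 1u << 0,
  CF_RETURN = 1u << 1,
  CF_FULL = CF_BRANCH | CF_RETURN
};

// Hypothetical stand-ins for TARGET_64BIT and TARGET_CMOV.
struct Target
{
  bool is_64bit;
  bool has_cmov;
};

// True when the patch would report
// "'-fcf-protection' is not compatible with this target".
constexpr bool cf_protection_rejected(unsigned level, Target t)
{
  return (level & CF_BRANCH) == CF_BRANCH && !t.is_64bit && !t.has_cmov;
}
```

With -march=i486 on ia32 neither target condition holds, so both
-fcf-protection and -fcf-protection=branch are rejected, which is what the
new tests expect.
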
From c5ba570aeb0985f99ba2f723a4bc3f01801cf555 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 14 Jan 2021 05:56:46 -0800
Subject: [PATCH] x86: Error on -fcf-protection with incompatible target

-fcf-protection with CF_BRANCH inserts ENDBR32 at function entries.
ENDBR32 is NOP only on 64-bit processors and 32-bit TARGET_CMOV
processors.  Issue an error for -fcf-protection with CF_BRANCH when
compiling for 32-bit non-TARGET_CMOV targets.

gcc/

	PR target/98667
	* config/i386/i386-options.c (ix86_option_override_internal):
	Issue an error for -fcf-protection with CF_BRANCH when compiling
	for 32-bit non-TARGET_CMOV targets.

gcc/testsuite/

	PR target/98667
	* gcc.target/i386/pr98667-1.c: New file.
	* gcc.target/i386/pr98667-2.c: Likewise.
	* gcc.target/i386/pr98667-3.c: Likewise.
---
 gcc/config/i386/i386-options.c| 8 +++-
 gcc/testsuite/gcc.target/i386/pr98667-1.c | 9 +
 gcc/testsuite/gcc.target/i386/pr98667-2.c | 9 +
 gcc/testsuite/gcc.target/i386/pr98667-3.c | 7 +++
 4 files changed, 32 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr98667-3.c

diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 4e0165ff32c..d62cd6f661a 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -3016,8 +3016,14 @@ ix86_option_override_internal (bool main_args_p,
 }
 
   if (opts->x_flag_cf_protection != CF_NONE)
-opts->x_flag_cf_protection
+{
+  if ((opts->x_flag_cf_protection & CF_BRANCH) == CF_BRANCH
+	  && !TARGET_64BIT && !TARGET_CMOV)
+	error ("%<-fcf-protection%> is not compatible with this target");
+
+  opts->x_flag_cf_protection
   = (cf_protection_level) (opts->x_flag_cf_protection | CF_SET);
+}
 
   if (ix86_tune_features [X86_TUNE_AVOID_256FMA_CHAINS])
 SET_OPTION_IF_UNSET (opts, opts_set, param_avoid_fma_max_bits, 256);
diff --git a/gcc/testsuite/gcc.target/i386/pr98667-1.c b/gcc/testsuite/gcc.target/i386/pr98667-1.c
new file mode 100644
index 000..5bf0c9285a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr98667-1.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -fcf-protection -march=i486" } */
+
+void
+test (void)
+{
+}
+
+/* { dg-error "'-fcf-protection' is not compatible with this target" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/i386/pr98667-2.c b/gcc/testsuite/gcc.target/i386/pr98667-2.c
new file mode 100644
index 000..bc3a78c9641
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr98667-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -fcf-protection=branch -march=i486" } */
+
+void
+test (void)
+{
+}
+
+/* { dg-error "'-fcf-protection' is not compatible with this target" "" { target *-*-* } 0 } */
diff --git a/gcc

[PATCH] i386: Resolve variable shadowing in i386-options.c [PR98671]

Also change global variable pta_size to unsigned.

2021-01-14  Uroš Bizjak  

gcc/
PR target/98671
* config/i386/i386-options.c (ix86_valid_target_attribute_inner_p):
Remove declaration and initialization of shadow variable "ret".
(ix86_option_override_internal): Remove declaration of
shadow variable "i".  Redeclare shadowed variable to unsigned.
* common/config/i386/i386-common.c (pta_size): Redeclare to unsigned.
* config/i386/i386-builtins.c (get_builtin_code_for_version):
Update for redeclaration.
* config/i386/i386.h (pta_size): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu{,-m32}.

Pushed to mainline.

Uros.
diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 00843d4bd93..eea8af12f48 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -2051,7 +2051,7 @@ const pta processor_alias_table[] =
 };
 
 /* NB: processor_alias_table stops at the "generic" entry.  */
-int const pta_size = ARRAY_SIZE (processor_alias_table) - 6;
+unsigned int const pta_size = ARRAY_SIZE (processor_alias_table) - 6;
 unsigned int const num_arch_names = ARRAY_SIZE (processor_alias_table);
 
 /* Provide valid option values for -march and -mtune options.  */
diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
index 5b37fc7b75e..4fcdf4b89ee 100644
--- a/gcc/config/i386/i386-builtins.c
+++ b/gcc/config/i386/i386-builtins.c
@@ -1888,7 +1888,7 @@ get_builtin_code_for_version (tree decl, tree *predicate_list)
   gcc_assert (new_target);
   
   if (new_target->arch_specified && new_target->arch > 0)
-   for (i = 0; i < (unsigned int) pta_size; i++)
+   for (i = 0; i < pta_size; i++)
  if (processor_alias_table[i].processor == new_target->arch)
{
  const pta *arch_info = &processor_alias_table[i];
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 4e0165ff32c..6819a042389 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -1088,8 +1088,6 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
   /* If this is a list, recurse to get the options.  */
   if (TREE_CODE (args) == TREE_LIST)
 {
-  bool ret = true;
-
   for (; args; args = TREE_CHAIN (args))
if (TREE_VALUE (args)
&& !ix86_valid_target_attribute_inner_p (fndecl, TREE_VALUE (args),
@@ -1782,7 +1780,7 @@ ix86_option_override_internal (bool main_args_p,
   struct gcc_options *opts,
   struct gcc_options *opts_set)
 {
-  int i;
+  unsigned int i;
   unsigned HOST_WIDE_INT ix86_arch_mask;
   const bool ix86_tune_specified = (opts->x_ix86_tune_string != NULL);
 
@@ -2852,7 +2850,7 @@ ix86_option_override_internal (bool main_args_p,
 {
   char *p = ASTRDUP (opts->x_ix86_recip_name);
   char *q;
-  unsigned int mask, i;
+  unsigned int mask;
   bool invert;
 
   while ((q = strtok (p, ",")) != NULL)
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index f032746d222..272b1957b47 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2562,7 +2562,7 @@ public:
 };
 
 extern const pta processor_alias_table[];
-extern int const pta_size;
+extern unsigned int const pta_size;
 extern unsigned int const num_arch_names;
 #endif

[PATCH] RTEMS: Fix Ada build for riscv

2021-01-14 Sebastian Huber

gcc/ada/

PR ada/98595
Makefile.rtl (LIBGNAT_TARGET_PAIRS) : Use
wraplf version of Aux_Long_Long_Float.
---
 gcc/ada/Makefile.rtl | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index 81df1e8babc..35faf13ea46 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2168,11 +2168,16 @@ ifeq ($(strip $(filter-out rtems%,$(target_os))),)
   s-tpopsp.adb

Re: [PATCH] RTEMS: Fix Ada build for riscv

2021-01-14 Arnaud Charlet

This is OK, thanks.

> gcc/ada/
> 
>   PR ada/98595
>   Makefile.rtl (LIBGNAT_TARGET_PAIRS) : Use
>   wraplf version of Aux_Long_Long_Float.
> ---
>  gcc/ada/Makefile.rtl | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
> index 81df1e8babc..35faf13ea46 100644
> --- a/gcc/ada/Makefile.rtl
> +++ b/gcc/ada/Makefile.rtl
> @@ -2168,11 +2168,16 @@ ifeq ($(strip $(filter-out rtems%,$(target_os))),)
>s-tpopsp.adbs-stchop.adbs-interr.adb +
>ifeq ($(strip $(filter-out arm%, $(target_cpu))),)
>  EH_MECHANISM=-arm
>else
>  EH_MECHANISM=-gcc
>endif
> +
> +  ifeq ($(strip $(filter-out riscv%,$(target_cpu))),)
> +LIBGNAT_TARGET_PAIRS += a-nallfl.ads +  endif
>  endif
>  
>  # PikeOS
> -- 
> 2.26.2
>

[PATCH][pushed] mklog: skip unsupported files


This fixes an infinite loop one could see for:
git show b87ec922c40 | ./contrib/mklog.py

contrib/ChangeLog:

* mklog.py: Fix infinite loop for unsupported files.
---
 contrib/mklog.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index bf51e56337e..a70536a6849 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -133,6 +133,9 @@ def generate_changelog(data, no_functions=False, fill_pr_titles=False):
     diff = PatchSet(data)
 
     for file in diff:
+        # skip files that can't be parsed
+        if file.path == '/dev/null':
+            continue
         changelog = find_changelog(file.path)
         if changelog not in changelogs:
             changelogs[changelog] = []
--
2.29.2

[PATCH][pushed] gcov: add one more pytest


gcc/testsuite/ChangeLog:

* g++.dg/gcov/gcov-17.C: New test.
* g++.dg/gcov/test-gcov-17.py: New test.
---
 gcc/testsuite/g++.dg/gcov/gcov-17.C   | 40 +++
 gcc/testsuite/g++.dg/gcov/test-gcov-17.py | 37 +
 2 files changed, 77 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/gcov/gcov-17.C
 create mode 100644 gcc/testsuite/g++.dg/gcov/test-gcov-17.py

diff --git a/gcc/testsuite/g++.dg/gcov/gcov-17.C b/gcc/testsuite/g++.dg/gcov/gcov-17.C
new file mode 100644
index 000..d11883cfd39
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gcov/gcov-17.C
@@ -0,0 +1,40 @@
+/* { dg-options "--coverage -std=c++11" } */
+/* { dg-do run { target native } } */
+
+template <class T> class Foo
+{
+public:
+  Foo () : b (1000) {}
+
+  void inc () { b++; }
+
+private:
+  int b;
+};
+
+template class Foo<int>;
+template class Foo<char>;
+
+int
+main (void)
+{
+  int i, total;
+  Foo<int> counter;
+
+  counter.inc ();
+  counter.inc ();
+  total = 0;
+
+  for (i = 0; i < 10; i++)
+total += i;
+
+  int v = total > 100 ? 1 : 2;
+
+  if (total != 45)
+__builtin_printf ("Failure\n");
+  else
+__builtin_printf ("Success\n");
+  return 0;
+}
+
+/* { dg-final { run-gcov-pytest gcov-17.C "test-gcov-17.py" } } */
diff --git a/gcc/testsuite/g++.dg/gcov/test-gcov-17.py b/gcc/testsuite/g++.dg/gcov/test-gcov-17.py
new file mode 100644
index 000..ec5df3dec03
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gcov/test-gcov-17.py
@@ -0,0 +1,37 @@
+from gcov import gcov_from_env
+
+import pytest
+
+
+@pytest.fixture(scope='function', autouse=True)
+def gcov():
+    return gcov_from_env()
+
+
+def test_basics(gcov):
+    files = gcov['files']
+    assert len(files) == 1
+    functions = files[0]['functions']
+    assert len(functions) == 5
+
+
+def test_lines(gcov):
+    lines = gcov['files'][0]['lines']
+    linesdict = {}
+    for line in lines:
+        lineno = int(line['line_number'])
+        linesdict.setdefault(lineno, [])
+        linesdict[lineno].append(line)
+
+    line9 = linesdict[9]
+    assert len(line9) == 2
+    assert line9[0]['function_name'] == '_ZN3FooIcE3incEv'
+    assert line9[1]['function_name'] == '_ZN3FooIiE3incEv'
+    assert line9[0]['count'] == 0
+    assert line9[1]['count'] == 2
+    assert line9[0]['unexecuted_block']
+    assert not line9[1]['unexecuted_block']
+    assert linesdict[31][0]['unexecuted_block']
+    assert linesdict[34][0]['unexecuted_block']
+    assert not linesdict[37][0]['unexecuted_block']
+    assert 32 not in linesdict
--
2.29.2

[Patch, fortran] PR64290 - [F03] No finalization at deallocation of LHS

2021-01-14 Paul Richard Thomas via Gcc-patches

Hi All,

This patch was triggered by a thread on clf. Some years ago Tobias and I
discussed the remaining conditions where finalization should be triggered
and is not. Intrinsic assignment was one of the glaring omissions for which
implementation looked like a heavy lift job. As it happens, it wasn't too
bad :-)

Most of the work was suppressing partial finalization, as a prelude to
reallocation on assignment, and ensuring that finalization happened in the
right circumstances. gfc_assignment_finalizer_call does the work for
intrinsic assignment and is straightforward. Care has to be taken to place
the result between evaluation of the rhs and any reallocation of the lhs
that might occur.

I thought it to be a good idea to squeeze this in before Stage 4 and so the
testcase is not yet finished. I will post it separately once complete and
before pushing the patch. The process is a bit tedious since it involves
checking that the finalization is occurring at the correct point in the
assignment, that the results are consistent with my understanding of
7.5.6.3 and that another brand gives the same results.

Regtests on FC33/x86_64 - OK for master? It occurs to me that this should
also be backported to the 10-branch at very least.

Paul

Fortran: Implement finalization on intrinsic assignment [PR64290]

2021-01-14  Paul Thomas  

gcc/fortran
PR fortran/64290
* resolve.c (resolve_where, gfc_resolve_where_code_in_forall,
gfc_resolve_forall_body, gfc_resolve_code): Check that the op
code is still EXEC_ASSIGN. If it is set lhs to must finalize.
* trans-array.c (structure_alloc_comps): Add boolean argument
to suppress finalization and use it for calls from
gfc_deallocate_alloc_comp_no_caf. Otherwise it defaults to
false.
(gfc_alloc_allocatable_for_assignment): Suppress finalization
by setting new arg in call to gfc_deallocate_alloc_comp_no_caf.
* trans-array.h : Add the new boolean argument to the prototype
of gfc_deallocate_alloc_comp_no_caf with a default of false.
* trans-expr.c (gfc_trans_scalar_assign): Suppress finalization
by setting new arg in call to gfc_deallocate_alloc_comp_no_caf.
(gfc_assignment_finalizer_call): New function to provide
finalization on intrinsic assignment.
(gfc_trans_assignment_1): Call it and add the block between the
rhs evaluation and any reallocation on assignment that there
might be.

gcc/testsuite/
PR fortran/64290
* gfortran.dg/finalize_38.f90 : New test.
* gfortran.dg/allocate_with_source_16.f90 : The number of final
calls goes down from 6 to 4.
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index f243bd185b0..05f52185b8b 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -10415,6 +10415,10 @@ resolve_where (gfc_code *code, gfc_expr *mask)
 	  if (e && !resolve_where_shape (cnext->expr1, e))
 	   gfc_error ("WHERE assignment target at %L has "
 			  "inconsistent shape", &cnext->expr1->where);
+
+	  if (cnext->op == EXEC_ASSIGN)
+		cnext->expr1->must_finalize = 1;
+
 	  break;
 
 
@@ -10502,6 +10506,10 @@ gfc_resolve_where_code_in_forall (gfc_code *code, int nvar,
 	/* WHERE assignment statement */
 	case EXEC_ASSIGN:
 	  gfc_resolve_assign_in_forall (cnext, nvar, var_expr);
+
+	  if (cnext->op == EXEC_ASSIGN)
+		cnext->expr1->must_finalize = 1;
+
 	  break;
 
 	/* WHERE operator assignment statement */
@@ -10548,6 +10556,10 @@ gfc_resolve_forall_body (gfc_code *code, int nvar, gfc_expr **var_expr)
 	case EXEC_ASSIGN:
 	case EXEC_POINTER_ASSIGN:
 	  gfc_resolve_assign_in_forall (c, nvar, var_expr);
+
+	  if (c->op == EXEC_ASSIGN)
+	c->expr1->must_finalize = 1;
+
 	  break;
 
 	case EXEC_ASSIGN_CALL:
@@ -11947,6 +11959,9 @@ start:
 	  && code->expr1->ts.u.derived->attr.defined_assign_comp)
 	generate_component_assignments (&code, ns);
 
+	  if (code->op == EXEC_ASSIGN)
+	code->expr1->must_finalize = 1;
+
 	  break;
 
 	case EXEC_LABEL_ASSIGN:
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 4bd4db877bd..8ac6b9e88fb 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -8661,7 +8661,7 @@ static gfc_actual_arglist *pdt_param_list;
 static tree
 structure_alloc_comps (gfc_symbol * der_type, tree decl,
 		   tree dest, int rank, int purpose, int caf_mode,
-		   gfc_co_subroutines_args *args)
+		   gfc_co_subroutines_args *args, bool no_finalization)
 {
   gfc_component *c;
   gfc_loopinfo loop;
@@ -8749,11 +8749,12 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
 	 gfc_conv_array_data (dest));
 	  dref = gfc_build_array_ref (tmp, index, NULL);
 	  tmp = structure_alloc_comps (der_type, vref, dref, rank,
-   COPY_ALLOC_COMP, caf_mode, args);
+   COPY_ALLOC_COMP, caf_mode, args,
+   no_finalization);
 	}
   else
 	tmp = structure_alloc_comps (der_type, vref, NULL_TREE, rank, purpose,
- caf_mode, args);
+ caf_mode, args, no_finalization);
 
   gfc_add_expr_to_block (&loopbody, tmp);
 
@@ -

[committed] libstdc++: Define function to throw filesystem_error [PR 98471]

Fix ordering problem on Windows targets where filesystem_error was used
before being defined.

libstdc++-v3/ChangeLog:

PR libstdc++/98471
* include/bits/fs_path.h (__throw_conversion_error): New
function to throw or abort on character conversion errors.
(__wstr_from_utf8): Move definition after filesystem_error has
been defined. Use __throw_conversion_error.
(path::_S_convert<_EcharT>): Use __throw_conversion_error.
(path::_S_str_convert<_CharT, _Traits, _Allocator>): Likewise.
(path::u8string): Likewise.

Tested x86_64-linux and x86_64-mingw64. Committed to trunk.
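
The shape of the fix can be sketched outside libstdc++ as follows: define the
exception class first, then one [[noreturn]] helper that constructs it, and
have every conversion routine call the helper instead of naming the exception
type.  All names here are hypothetical, and unlike the real code, which uses
_GLIBCXX_THROW_OR_ABORT, this sketch always throws.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for filesystem_error: it has to be a complete type
// before anything constructs it.
class conversion_error : public std::runtime_error
{
public:
  explicit conversion_error(const std::string &what)
  : std::runtime_error(what) {}
};

// Defined once, after the exception class is complete.
[[noreturn]] inline void throw_conversion_error()
{
  throw conversion_error("Cannot convert character sequence");
}

// A conversion routine that only needs the helper, not the exception type.
// Toy rule for illustration: bytes >= 0x80 count as unconvertible.
template<typename String>
std::wstring widen_ascii(const String &str)
{
  std::wstring out;
  for (unsigned char c : str)
    {
      if (c >= 0x80)
        throw_conversion_error();
      out.push_back(static_cast<wchar_t>(c));
    }
  return out;
}
```

Funnelling every failure path through one helper also removes the repeated
three-line throw expression, as the diff below does in four places.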

commit 57a4f5e4eacfbbbd0ca5f1e3f946c27d63e2b533
Author: Jonathan Wakely 
Date:   Thu Jan 14 14:26:19 2021

libstdc++: Define function to throw filesystem_error [PR 98471]

Fix ordering problem on Windows targets where filesystem_error was used
before being defined.

libstdc++-v3/ChangeLog:

PR libstdc++/98471
* include/bits/fs_path.h (__throw_conversion_error): New
function to throw or abort on character conversion errors.
(__wstr_from_utf8): Move definition after filesystem_error has
been defined. Use __throw_conversion_error.
(path::_S_convert<_EcharT>): Use __throw_conversion_error.
(path::_S_str_convert<_CharT, _Traits, _Allocator>): Likewise.
(path::u8string): Likewise.

diff --git a/libstdc++-v3/include/bits/fs_path.h b/libstdc++-v3/include/bits/fs_path.h
index 2897134c4c1..1645c53cf53 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -238,24 +238,6 @@ namespace __detail
return basic_string<_EcharT>(__first, __last);
 }
 
-#ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
-  template<typename _Tp>
-inline std::wstring
-__wstr_from_utf8(const _Tp& __str)
-{
-  static_assert(std::is_same_v<typename _Tp::value_type, char>);
-  std::wstring __wstr;
-  // XXX This assumes native wide encoding is UTF-16.
-  std::codecvt_utf8_utf16<wchar_t> __wcvt;
-  const auto __p = __str.data();
-  if (!__str_codecvt_in_all(__p, __p + __str.size(), __wstr, __wcvt))
-   _GLIBCXX_THROW_OR_ABORT(filesystem_error(
- "Cannot convert character sequence",
- std::make_error_code(errc::illegal_byte_sequence)));
-  return __wstr;
-}
-#endif
-
 } // namespace __detail
   /// @endcond
 
@@ -743,6 +725,37 @@ namespace __detail
 std::__shared_ptr<const _Impl> _M_impl;
   };
 
+  /// @cond undocumented
+namespace __detail
+{
+  [[noreturn]] inline void
+  __throw_conversion_error()
+  {
+_GLIBCXX_THROW_OR_ABORT(filesystem_error(
+"Cannot convert character sequence",
+std::make_error_code(errc::illegal_byte_sequence)));
+  }
+
+#ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
+  template<typename _Tp>
+inline std::wstring
+__wstr_from_utf8(const _Tp& __str)
+{
+  static_assert(std::is_same_v<typename _Tp::value_type, char>);
+  std::wstring __wstr;
+  // XXX This assumes native wide encoding is UTF-16.
+  std::codecvt_utf8_utf16<wchar_t> __wcvt;
+  const auto __p = __str.data();
+  if (!__str_codecvt_in_all(__p, __p + __str.size(), __wstr, __wcvt))
+   __detail::__throw_conversion_error();
+  return __wstr;
+}
+#endif
+
+} // namespace __detail
+  /// @endcond
+
+
   /** Create a path from a UTF-8-encoded sequence of char
*
* @relates std::filesystem::path
@@ -846,9 +859,7 @@ namespace __detail
  if (__str_codecvt_out_all(__f, __l, __str, __cvt))
return __str;
 #endif
- _GLIBCXX_THROW_OR_ABORT(filesystem_error(
-   "Cannot convert character sequence",
-   std::make_error_code(errc::illegal_byte_sequence)));
+ __detail::__throw_conversion_error();
}
 }
 
@@ -1058,9 +1069,7 @@ namespace __detail
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
   } }
 #endif
-  _GLIBCXX_THROW_OR_ABORT(filesystem_error(
-   "Cannot convert character sequence",
-   std::make_error_code(errc::illegal_byte_sequence)));
+  __detail::__throw_conversion_error();
 }
   /// @endcond
 
@@ -1097,9 +1106,7 @@ namespace __detail
 const value_type* __last = __first + _M_pathname.size();
 if (__str_codecvt_out_all(__first, __last, __str, __cvt))
   return __str;
-_GLIBCXX_THROW_OR_ABORT(filesystem_error(
- "Cannot convert character sequence",
- std::make_error_code(errc::illegal_byte_sequence)));
+__detail::__throw_conversion_error();
 #else
 return _M_pathname;
 #endif

[PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

From 78435dee177447080434cdc08fc76b1029c7f576 Mon Sep 17 00:00:00 2001
From: Michael Meissner 
Date: Wed, 13 Jan 2021 21:47:03 -0500
Subject: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

This patch replaces patches previously submitted:

September 24th, 2020:
Message-ID: <20200924203159.ga31...@ibm-toto.the-meissners.org>

October 9th, 2020:
Message-ID: <20201009043543.ga11...@ibm-toto.the-meissners.org>

October 24th, 2020:
Message-ID: <2020100346.ga8...@ibm-toto.the-meissners.org>

November 19th, 2020:
Message-ID: <20201119235814.ga...@ibm-toto.the-meissners.org>

This patch maps the built-in functions that take or return long double
arguments on systems where long double is IEEE 128-bit.

If long double is IEEE 128-bit, this patch goes through the built-in functions
and changes the name of the math, scanf, and printf built-in functions to use
the functions that GLIBC provides when long double uses the IEEE 128-bit
representation.

In addition, changing the name in GCC allows the Fortran compiler to
automatically use the correct name.

To map the math functions, typically this patch changes <name>l to
__<name>ieee128.  However there are some exceptions that are handled with this
patch.

To map the printf functions, <name> is mapped to __<name>ieee128.

To map the scanf functions, <name> is mapped to __isoc99_<name>ieee128.
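
The general naming rule can be sketched with three small helpers.  This reads
the rule as: strip the trailing 'l' from a math function name and surround the
rest with "__" and "ieee128" (matching names such as __acoshieee128 in the
patch), add the same prefix and suffix to printf-family names, and use the
"__isoc99_" prefix for scanf-family names.  The helper names are hypothetical,
and the exceptions handled by the patch are deliberately not modelled.

```cpp
#include <cassert>
#include <string>

// Math functions: drop the trailing 'l' and wrap in "__" ... "ieee128".
// Assumes NAME is a long double function name such as "acoshl".
inline std::string ieee128_math_name(const std::string &name)
{
  std::string base = name;
  if (!base.empty() && base.back() == 'l')
    base.pop_back();
  return "__" + base + "ieee128";
}

// printf family: keep the name and wrap it in "__" ... "ieee128".
inline std::string ieee128_printf_name(const std::string &name)
{
  return "__" + name + "ieee128";
}

// scanf family: same suffix, but with the "__isoc99_" prefix.
inline std::string ieee128_scanf_name(const std::string &name)
{
  return "__isoc99_" + name + "ieee128";
}
```

The patch itself uses an explicit switch over built-in function codes rather
than string manipulation, which is how it accommodates the exceptions.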

I have tested this patch by doing builds, bootstraps, and make check with 3
builds on a power9 little endian server:

*   Build one used the default long double being IBM 128-bit;
*   Build two set the long double default to IEEE 128-bit; (and)
*   Build three set the long double default to 64-bit.

The compilers built fine providing I recompiled gmp, mpc, and mpfr with the
appropriate long double options.  There were a few differences in the test
suite runs that will be addressed in later patches, but overall it works
well.  This patch is required to be able to build a toolchain where the default
long double is IEEE 128-bit.  Can I check this patch into the master branch for
GCC 11?

gcc/
2021-01-14  Michael Meissner  

* config/rs6000/rs6000.c (ieee128_builtin_name): New function.
(built_in_uses_long_double): New function.
(identifier_ends_in_suffix): New function.
(rs6000_mangle_decl_assembler_name): Update support for mapping built-in
function names for long double built-in functions if long double is
IEEE 128-bit to catch all of the built-in functions that take or
return long double arguments.

gcc/testsuite/
2021-01-14  Michael Meissner  

* gcc.target/powerpc/float128-longdouble-math.c: New test.
* gcc.target/powerpc/float128-longdouble-stdio.c: New test.
* gcc.target/powerpc/float128-math.c: Adjust test for new name
being generated.  Add support for running test on power10.  Add
support for running if long double defaults to 64-bits.
---
 gcc/config/rs6000/rs6000.c| 239 --
 .../powerpc/float128-longdouble-math.c| 442 ++
 .../powerpc/float128-longdouble-stdio.c   |  36 ++
 .../gcc.target/powerpc/float128-math.c|  16 +-
 4 files changed, 694 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-longdouble-math.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-longdouble-stdio.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6f48dd6566d..282703b9715 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -27100,6 +27100,172 @@ rs6000_globalize_decl_name (FILE * stream, tree decl)
 #endif

+/* If long double uses the IEEE 128-bit representation, return the name used
+   within GLIBC for the IEEE 128-bit long double built-in, instead of the
+   default IBM 128-bit long double built-in.  Or return NULL if the built-in
+   function does not use long double.  */
+
+static const char *
+ieee128_builtin_name (built_in_function fn)
+{
+  switch (fn)
+{
+default:   return NULL;
+case BUILT_IN_ACOSHL:  return "__acoshieee128";
+case BUILT_IN_ACOSL:   return "__acosieee128";
+case BUILT_IN_ASINHL:  return "__asinhieee128";
+case BUILT_IN_ASINL:   return "__asinieee128";
+case BUILT_IN_ATAN2L:  return "__atan2ieee128";
+case BUILT_IN_ATANHL:  return "__atanhieee128";
+case BUILT_IN_ATANL:   return "__atanieee128";
+case BUILT_IN_CABSL:   return "__cabsieee128";
+case BUILT_IN_CACOSHL: return "__cacoshieee128";
+case BUILT_IN_CACOSL:  return "__cacosieee128";
+case BUILT_IN_CARGL:   return "__cargieee128";
+case BUILT_IN_CASINHL: return "__casinhieee128";
+case BUILT_IN_CASINL:  return "__casinieee128";
+case BUILT_IN_CATANHL: return "__catanhieee128";
+case BUILT_IN_CATANL:  return "__catanieee128";
+case BUILT_IN_CBRTL:   return "__cbrtieee128";
+case BUILT_IN_CCOSHL:  return "

Re: [PATCH] libstdc++: c++2b, implement WG21 P1679R3


On 13/01/21 01:21 +, Paul Fee via Libstdc++ wrote:

Add contains member function to basic_string_view and basic_string.

The new method is enabled for -std=gnu++20, gnu++2b and c++2b.  This allows
users to access the method as a GNU extension to C++20.  The conditional
test may be reduced to "__cplusplus > 202011L" once GCC has a c++2b switch.
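
As a rough illustration of the behaviour being proposed (a sketch, not the libstdc++ code), each of the three overloads reduces to a find() != npos test. The free function name contains_sketch is hypothetical; it is used here only so the example builds with any C++17 compiler, without the new member.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical free-function equivalents of the three proposed overloads.
// Each one is just "find() did not return npos".
constexpr bool
contains_sketch(std::string_view s, std::string_view x) noexcept
{ return s.find(x) != std::string_view::npos; }

constexpr bool
contains_sketch(std::string_view s, char c) noexcept
{ return s.find(c) != std::string_view::npos; }

constexpr bool
contains_sketch(std::string_view s, const char* p)
{ return s.find(p) != std::string_view::npos; }
```

With the patch applied, the same checks would presumably be spelled s.contains("doub"), s.contains('x') and so on.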


Thanks for the patch.

A few comments below.


diff --git a/libstdc++-v3/include/bits/basic_string.h
b/libstdc++-v3/include/bits/basic_string.h
index e272d332934..a569ecd8c08 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -3073,6 +3073,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
  { return __sv_type(this->data(), this->size()).ends_with(__x); }
#endif // C++20

+#if __cplusplus > 202011L || \
+  (__cplusplus == 202002L && !defined __STRICT_ANSI__)


Please put the line break before the binary operator i.e.

#if __cplusplus > 202011L \
  || (__cplusplus == 202002L && !defined __STRICT_ANSI__)

This has the advantage that the operator is at a predictable place,
so it's easier to see at a glance whether this is an && or ||
condition.


+  bool
+  contains(basic_string_view<_CharT, _Traits> __x) const noexcept
+  { return __sv_type(this->data(), this->size()).contains(__x); }
+
+  bool
+  contains(_CharT __x) const noexcept
+  { return __sv_type(this->data(), this->size()).contains(__x); }
+
+  bool
+  contains(const _CharT* __x) const noexcept
+  { return __sv_type(this->data(), this->size()).contains(__x); }
+#endif // C++23
+
  // Allow basic_stringbuf::__xfer_bufptrs to call _M_length:
  template<typename, typename, typename> friend class basic_stringbuf;
};
@@ -5998,6 +6013,21 @@ _GLIBCXX_END_NAMESPACE_CXX11
  { return __sv_type(this->data(), this->size()).ends_with(__x); }
#endif // C++20

+#if __cplusplus > 202011L || \
+  (__cplusplus == 202002L && !defined __STRICT_ANSI__)
+  bool
+  contains(basic_string_view<_CharT, _Traits> __x) const noexcept
+  { return __sv_type(this->data(), this->size()).contains(__x); }
+
+  bool
+  contains(_CharT __x) const noexcept
+  { return __sv_type(this->data(), this->size()).contains(__x); }
+
+  bool
+  contains(const _CharT* __x) const noexcept
+  { return __sv_type(this->data(), this->size()).contains(__x); }
+#endif // C++23
+
# ifdef _GLIBCXX_TM_TS_INTERNAL
  friend void
  ::_txnal_cow_string_C1_for_exceptions(void* that, const char* s,
diff --git a/libstdc++-v3/include/std/string_view
b/libstdc++-v3/include/std/string_view
index e33e1bc4b79..2f47ef6ed12 100644
--- a/libstdc++-v3/include/std/string_view
+++ b/libstdc++-v3/include/std/string_view
@@ -352,6 +352,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  { return this->ends_with(basic_string_view(__x)); }
#endif // C++20

+#if __cplusplus > 202011L || \
+  (__cplusplus == 202002L && !defined __STRICT_ANSI__)
+#define __cpp_lib_string_contains 202011L


This macro should also be defined in <version>, and should depend on
the same conditions.

There should also be tests that the macro is defined by <string_view>
and by <version>.

You can add this to the top of any one of the new tests for
string_view::contains:

#ifndef __cpp_lib_string_contains
# error "Feature-test macro for contains missing in <string_view>"
#elif __cpp_lib_string_contains != 202011L
# error "Feature-test macro for contains has wrong value in <string_view>"
#endif

And then add a new version.cc test adjacent to that file, which
includes <version> (and nothing else) and tests the same conditions:

#include <version>

#ifndef __cpp_lib_string_contains
# error "Feature-test macro for contains missing in <version>"
#elif __cpp_lib_string_contains != 202011L
# error "Feature-test macro for contains has wrong value in <version>"
#endif




+  constexpr bool
+  contains(basic_string_view __x) const noexcept
+  { return this->find(__x) != npos; }
+
+  constexpr bool
+  contains(_CharT __x) const noexcept
+  { return this->find(__x) != npos; }
+
+  constexpr bool
+  contains(const _CharT* __x) const noexcept
+  { return this->find(__x) != npos; }
+#endif // C++23
+
  // [string.view.find], searching

  constexpr size_type
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/operations/contains/char/1.cc
b/libstdc++-v3/testsuite/21_strings/basic_string/operations/contains/char/1.cc
new file mode 100644
index 000..5d81dcee0ad
--- /dev/null
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string/operations/contains/char/1.cc
@@ -0,0 +1,65 @@
+// { dg-options "-std=gnu++2b" }
+// { dg-do run { target c++2b } }
+
+// Copyright (C) 2018-2020 Free Software Foundation, Inc.


These dates should be 2021 only.



diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/operations/contains/wchar_t/1.cc
b/libstdc++-v3/testsuite/21_strings/basic_string/operations/contains/wchar_t/1.cc
new file mode 100644
index 000..21196add4dc
--- /dev/null
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string/operations/contains/wchar_t/1

[PATCH] PowerPC: Add float128/Decimal conversions.

This patch replaces the following two patches:

September 24th, 2020:
Message-ID: <20200924203545.gd31...@ibm-toto.the-meissners.org>

October 22nd, 2020:
Message-ID: <2020100603.ga11...@ibm-toto.the-meissners.org>

This patch rewrites those patches.  In order to run with older GLIBC versions, this
patch uses weak references to the IEEE 128-bit conversions to/from string that
are found in GLIBC 2.32.

If the user uses GLIBC 2.32 or later, the Decimal <-> Float128 conversions will
call the functions in that library.  This isn't ideal, as IEEE 128-bit has more
exponent range than IBM 128-bit.

If an older library is used, these patches will convert IEEE 128-bit to IBM
128-bit and do the conversion with IBM 128-bit.  I have tested this with a
compiler configured to use an older library, and it worked for the conversion
if the number could be represented in the IBM 128-bit format.
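
The weak-reference approach described above can be sketched roughly as follows. This is only an illustration of the technique on ELF/GNU toolchains, not the code in the patch: the symbol name hypothetical_ieee128_to_string and the fallback are made up, and double stands in for the 128-bit types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical symbol standing in for a conversion routine that only a
// newer C library provides.  It is declared weak and never defined here,
// so on ELF targets its address is null when no library supplies it.
extern "C" int hypothetical_ieee128_to_string(char*, std::size_t, double)
  __attribute__((weak));

// Fallback used when the weak symbol is absent (the "older library" path).
inline int
fallback_to_string(char* buf, std::size_t n, double v)
{ return std::snprintf(buf, n, "%.17g", v); }

// Call the library routine if it was resolved, otherwise fall back.
inline int
to_string_sketch(char* buf, std::size_t n, double v)
{
  if (hypothetical_ieee128_to_string)
    return hypothetical_ieee128_to_string(buf, n, v);
  return fallback_to_string(buf, n, v);
}
```

Because nothing defines the hypothetical symbol, a program linking this sketch always takes the fallback path.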

While most of the Decimal <-> Long double tests now pass when long doubles are
IEEE 128-bit, there are two tests that fail:

*   c-c++-common/dfp/convert-bfp-6.c
*   c-c++-common/dfp/convert-bfp-11.c

I have patches for the bfp-11 test (which requires that long double be IBM
128-bit).  I have not looked at the bfp-6 test but I will shortly.

I have tested this patch by doing builds, bootstraps, and make check with 3
builds on a power9 little endian server:

*   Build one used the default long double being IBM 128-bit;
*   Build two set the long double default to IEEE 128-bit; (and)
*   Build three set the long double default to 64-bit.

The compilers built fine providing I recompiled gmp, mpc, and mpfr with the
appropriate long double options.  There were a few differences in the test
suite runs that will be addressed in later patches, but overall it works
well.  This patch is required to be able to build a toolchain where the default
long double is IEEE 128-bit.  Can I check this patch into the master branch for
GCC 11?

libgcc/
2021-01-14  Michael Meissner  

* config/rs6000/_dd_to_kf.c: New file.
* config/rs6000/_kf_to_dd.c: New file.
* config/rs6000/_kf_to_sd.c: New file.
* config/rs6000/_kf_to_td.c: New file.
* config/rs6000/_sd_to_kf.c: New file.
* config/rs6000/_sprintfkf.c: New file.
* config/rs6000/_sprintfkf.h: New file.
* config/rs6000/_strtokf.h: New file.
* config/rs6000/_strtokf.c: New file.
* config/rs6000/_td_to_kf.c: New file.
* config/rs6000/quad-float128.h: Add new declarations.
* config/rs6000/t-float128 (fp128_dec_funcs): New macro.
(fp128_decstr_funcs): New macro.
(ibm128_dec_funcs): New macro.
(fp128_ppc_funcs): Add the new conversions.
(fp128_dec_objs): Force Decimal <-> __float128 conversions to be
compiled with -mabi=ieeelongdouble.
(fp128_decstr_objs): Force __float128 <-> string conversions to be
compiled with -mabi=ibmlongdouble.
(ibm128_dec_objs): Force Decimal <-> __float128 conversions to be
compiled with -mabi=ieeelongdouble.
(FP128_CFLAGS_DECIMAL): New macro.
(IBM128_CFLAGS_DECIMAL): New macro.
* dfp-bit.c (DFP_TO_BFP): Add PowerPC _Float128 support.
(BFP_TO_DFP): Add PowerPC _Float128 support.
* dfp-bit.h (BFP_KIND): Add new binary floating point kind for
IEEE 128-bit floating point.
(DFP_TO_BFP): Add PowerPC _Float128 support.
(BFP_TO_DFP): Add PowerPC _Float128 support.
(BFP_SPRINTF): New macro.
---
 libgcc/config/rs6000/_dd_to_kf.c | 37 ++
 libgcc/config/rs6000/_kf_to_dd.c | 37 ++
 libgcc/config/rs6000/_kf_to_sd.c | 37 ++
 libgcc/config/rs6000/_kf_to_td.c | 37 ++
 libgcc/config/rs6000/_sd_to_kf.c | 37 ++
 libgcc/config/rs6000/_sprintfkf.c| 57 
 libgcc/config/rs6000/_sprintfkf.h| 28 ++
 libgcc/config/rs6000/_strtokf.c  | 56 +++
 libgcc/config/rs6000/_strtokf.h  | 27 +
 libgcc/config/rs6000/_td_to_kf.c | 37 ++
 libgcc/config/rs6000/quad-float128.h |  8 
 libgcc/config/rs6000/t-float128  | 37 +-
 libgcc/dfp-bit.c | 12 +-
 libgcc/dfp-bit.h | 26 +
 14 files changed, 470 insertions(+), 3 deletions(-)
 create mode 100644 libgcc/config/rs6000/_dd_to_kf.c
 create mode 100644 libgcc/config/rs6000/_kf_to_dd.c
 create mode 100644 libgcc/config/rs6000/_kf_to_sd.c
 create mode 100644 libgcc/config/rs6000/_kf_to_td.c
 create mode 100644 libgcc/config/rs6000/_sd_to_kf.c
 create mode 100644 libgcc/config/rs6000/_sprintfkf.c
 create mode 100644 libgcc/config/rs6000/_sprintfkf.h
 create mode 100644 libgcc/config/rs6000/_strtokf.c
 create mode 100644 libgcc/config/rs6000/_strtokf.h
 create mode 100644 libgcc/config/rs6000/_

Re: [PATCH] libstdc++/98466 Fix _GLIBCXX_DEBUG N3644 integration


On 01/01/21 18:51 +0100, François Dumont via Libstdc++ wrote:
I think the PR is not limited to unordered container iterators; it
affects all _GLIBCXX_DEBUG iterators.


However unordered containers local_iterator was more complicated to 
handle. Because of c++/65816 I prefer to review _Node_iterator_default 
constructor to set _M_cur to nullptr even if in principle it is not 
necessary except for the _Local_iterator_base constructor when hash 
code is not cached.


    libstdc++: Implement N3644 for _GLIBCXX_DEBUG iterators

    libstdc++-v3/ChangeLog

            PR libstdc++/98466
            * include/bits/hashtable_policy.h (_Node_iterator_base()): Set
            _M_cur to nullptr.
            (_Node_iterator()): Make default.
            (_Node_const_iterator()): Make default.
            * include/debug/macros.h (__glibcxx_check_erase_range_after): Add
            _M_singular iterator checks.
            * include/debug/safe_iterator.h
            (_GLIBCXX_DEBUG_VERIFY_OPERANDS): Accept if both iterators are
            value-initialized.
            * include/debug/safe_local_iterator.h
            (_GLIBCXX_DEBUG_VERIFY_OPERANDS): Likewise.
            * include/debug/safe_iterator.tcc
            (_Safe_iterator<>::_M_valid_range): Add _M_singular checks on
            input iterators.
            * src/c++11/debug.cc (_Safe_iterator_base::_M_can_compare): Remove
            _M_singular checks.
            * testsuite/23_containers/deque/debug/98466.cc: New test.
            * testsuite/23_containers/unordered_map/debug/98466.cc: New test.


Tested under Linux x86_64 normal and debug mode.

Ok to commit ?


Yes, thanks.

One question about the deque test ...



diff --git a/libstdc++-v3/testsuite/23_containers/deque/debug/98466.cc 
b/libstdc++-v3/testsuite/23_containers/deque/debug/98466.cc
new file mode 100644
index 000..720977e5622
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/deque/debug/98466.cc
@@ -0,0 +1,38 @@
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++11 } }


Does this need to be limited to c++11 and later? Could it just use
{ dg-do run } instead?

OK to commit anyway, thanks.

Re: [PATCH] libstdc++/98466 Fix _GLIBCXX_DEBUG N3644 integration


On 14/01/21 17:10 +, Jonathan Wakely wrote:

On 01/01/21 18:51 +0100, François Dumont via Libstdc++ wrote:
I think the PR is not limited to unordered container iterators; it
affects all _GLIBCXX_DEBUG iterators.


However unordered containers local_iterator was more complicated to 
handle. Because of c++/65816 I prefer to review 
_Node_iterator_default constructor to set _M_cur to nullptr even if 
in principle it is not necessary except for the _Local_iterator_base 
constructor when hash code is not cached.


    libstdc++: Implement N3644 for _GLIBCXX_DEBUG iterators

    libstdc++-v3/ChangeLog

            PR libstdc++/98466
            * include/bits/hashtable_policy.h (_Node_iterator_base()): Set
            _M_cur to nullptr.
            (_Node_iterator()): Make default.
            (_Node_const_iterator()): Make default.
            * include/debug/macros.h (__glibcxx_check_erase_range_after): Add
            _M_singular iterator checks.
            * include/debug/safe_iterator.h
            (_GLIBCXX_DEBUG_VERIFY_OPERANDS): Accept if both iterators are
            value-initialized.
            * include/debug/safe_local_iterator.h
            (_GLIBCXX_DEBUG_VERIFY_OPERANDS): Likewise.
            * include/debug/safe_iterator.tcc
            (_Safe_iterator<>::_M_valid_range): Add _M_singular checks on
            input iterators.
            * src/c++11/debug.cc (_Safe_iterator_base::_M_can_compare): Remove
            _M_singular checks.
            * testsuite/23_containers/deque/debug/98466.cc: New test.
            * testsuite/23_containers/unordered_map/debug/98466.cc: New test.


Tested under Linux x86_64 normal and debug mode.

Ok to commit ?


Yes, thanks.


I've just realised that this C++14 change used to be noted in the
C++14 status table:


  
  
  N3644 (http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2013/n3644.pdf)
  Null Forward Iterators
  Partial
  Only affects Debug Mode
 

But I removed that last year when replacing the list of proposals with
the Table of Contents taken from the standard, in commit
57ede05c6a0b443943e312bf205cb79233c9396f (oops!)

For the branches we should either document that missing feature in a
note, or backport your fix in a few weeks.
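
For readers unfamiliar with N3644, the guarantee in question is that value-initialized forward iterators of the same type compare equal. A minimal sketch of that property follows (this is not the test from the patch, and the function name is made up for illustration):

```cpp
#include <cassert>
#include <deque>
#include <unordered_map>

// N3644 (C++14): two value-initialized forward iterators of the same type
// may be compared, and they compare equal.  Per the PR, this is the kind
// of comparison that Debug Mode did not handle correctly before the fix.
inline bool
value_initialized_iterators_equal()
{
  std::deque<int>::iterator d1{}, d2{};
  std::unordered_map<int, int>::iterator u1{}, u2{};
  return d1 == d2 && u1 == u2;
}
```

In normal mode this is expected to return true.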

[PATCH 0/2] PowerPC: PR 97791: Fix attribute problems

There are many issues with the current implementation of GNU attributes to mark
functions that use long double.  The idea of .gnu_attributes was to mark
objects with ABI requirements.

The two patches in this thread fix a small subset of the GNU attributes
problems.  They do not fix all of the problems.

The first patch fixes the problem that we set the GNU attribute for long double
if a type were passed or returned that uses the same representation as the long
double type (i.e. passing explicit __float128/_Float128 when long double is
IEEE 128-bit or passing __ibm128 when long double is IBM 128-bit).

The second patch eliminates the code in rs6000_emit_move to set the long double
attribute.  This eliminates the false positives that occur when a type uses a
mode that is the same mode as the long double type.  However, to catch the
cases where long double is used without being passed or returned, we would need
to implement a GIMPLE pass that looked for explicit long double usage.

This patch also changes the 3 tests that tested this move support to be
'XFAIL' until we flag long double usage correctly.

The current problems with GNU attributes are:

1) Probably the most annoying bug is that they apply at an object level.
   So for example libgcc_s.so.1 is marked as using IBM long double causing
   warnings when linking against code that uses 64-bit long doubles even
   though such code won't use any of the libgcc 128-bit long double
   functions.

   a) Versions of ld prior to 2.35 did not check shared library
  .gnu_attributes, a bug that might allow a user to link a
  soft-float shared library with hard-float code.

2) The original implementation Alan Modra wrote in 2016 to mark relocatable
   object files with attributes had, and still has, bugs.

   a) It is possible for an object to be marked as using IBM long
  double when a function has a long double parameter that is not
  used.

   b) It is possible for an object to not be marked as using IBM long
  double when it does.  For example, a function with a pointer to
  long double parameter is not recognized as using long double.
  This is conceptually difficult to fix.  Does merely passing a
  pointer to another function constitute a use?  What about a
  pointer to a union containing a long double?

   c) An object that defines a global long double variable is not
  marked.

   d) Variable argument functions that process a long double in an
  argument corresponding to the ellipsis are not marked.

3) One of the problems with GNU attributes is that it would signal a long
   double was used when in reality an alternate type was returned or passed,
   such as passing __ibm128 or __float128 that just happens to use the same
   representation as the current long double type.  This is the bug being fixed
   in this patch.

4) In an attempt to fix some of these problems, Mike Meissner applied a patch
   to rs6000_emit_move that set the long double attribute whenever a move to or
   from a register involved a long double mode, but that has bugs too.

   a) With -mlong-double-64 an object that moves doubles to or from a
  register would be marked as using 64-bit long double.

   b) Functions that only use long double internally would wrongly
  cause their object to be marked with the long double attribute.

   c) 2c is not fixed by this patch, unless code in the object uses
  the global variable.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

[PATCH 1/2] PowerPC: PR 97791: Fix an issue with gnu attributes.

This patch fixes the problem that we set the GNU attribute for long
double if a type were passed or returned that uses the same representation as
the long double type (i.e. passing explicit __float128/_Float128 when long
double is IEEE 128-bit or passing __ibm128 when long double is IBM 128-bit).
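
The change in the condition can be modelled roughly as below. This is a simplified sketch, not GCC code: the Mode and Type enums and both function names are hypothetical stand-ins for machine modes, type nodes and the tests in rs6000-call.c.

```cpp
#include <cassert>

// Hypothetical stand-ins for the compiler's notions of "mode" and "type".
enum class Mode { DF, IBM128, IEEE128 };
enum class Type { Double, LongDouble, Ibm128, Float128 };

// Before the patch: any 128-bit floating-point mode set the long double
// attribute, as did an actual long double type.
constexpr bool
marks_long_double_old(Mode m, Type t)
{ return m == Mode::IBM128 || m == Mode::IEEE128 || t == Type::LongDouble; }

// After the patch: only a value whose type really is long double does.
constexpr bool
marks_long_double_new(Mode, Type t)
{ return t == Type::LongDouble; }
```

Under this model, passing __float128 in the IEEE 128-bit mode used to set the attribute and no longer does, while a real long double still sets it whatever its mode.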

I have tested this patch by doing builds, bootstraps, and make check with 3
builds on a power9 little endian server:

*   Build one used the default long double being IBM 128-bit;
*   Build two set the long double default to IEEE 128-bit; (and)
*   Build three set the long double default to 64-bit.

The compilers built fine providing I recompiled gmp, mpc, and mpfr with the
appropriate long double options.  There were a few differences in the test
suite runs that will be addressed in later patches, but overall it works well.
This patch is highly desirable to be able to build a toolchain where the
default long double is IEEE 128-bit.  Can I check this patch into the master
branch for GCC 11?

gcc/
2021-01-14  Michael Meissner  
Alan Modra  

PR gcc/97791
* config/rs6000/rs6000-call.c (init_cumulative_args): Only set
that long double was returned if the type is actually long
double.
(rs6000_function_arg_advance_1): Only set that long double was
passed if the type is actually long double.
---
 gcc/config/rs6000/rs6000-call.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 2308cc8b4a2..519313bc0d6 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -6554,12 +6554,14 @@ init_cumulative_args (CUMULATIVE_ARGS *cum, tree fntype,
  if (SCALAR_FLOAT_MODE_P (return_mode))
{
  rs6000_passes_float = true;
+
+ /* If GNU attributes are enabled, mark if the function returns
+long double.  We do not mark if the function returns a type
+such as __ibm128 that uses the same modes as the current long
+double type, only if an actual long double type was used.  */
  if ((HAVE_LD_PPC_GNU_ATTR_LONG_DOUBLE || TARGET_64BIT)
- && (FLOAT128_IBM_P (return_mode)
- || FLOAT128_IEEE_P (return_mode)
- || (return_type != NULL
- && (TYPE_MAIN_VARIANT (return_type)
- == long_double_type_node
+ && return_type != NULL
+ && TYPE_MAIN_VARIANT (return_type) == long_double_type_node)
rs6000_passes_long_double = true;
 
  /* Note if we passed or return a IEEE 128-bit type.  We changed
@@ -6994,11 +6996,14 @@ rs6000_function_arg_advance_1 (CUMULATIVE_ARGS *cum, 
machine_mode mode,
   if (SCALAR_FLOAT_MODE_P (mode))
{
  rs6000_passes_float = true;
+
+ /* If GNU attributes are enabled, mark if the function passes long
+double.  We do not mark if the function returns a type such as
+__ibm128 that uses the same modes as the current long double type,
+only if an actual long double type was used.  */
  if ((HAVE_LD_PPC_GNU_ATTR_LONG_DOUBLE || TARGET_64BIT)
- && (FLOAT128_IBM_P (mode)
- || FLOAT128_IEEE_P (mode)
- || (type != NULL
- && TYPE_MAIN_VARIANT (type) == long_double_type_node)))
+ && type != NULL
+ && TYPE_MAIN_VARIANT (type) == long_double_type_node)
rs6000_passes_long_double = true;
 
  /* Note if we passed or return a IEEE 128-bit type.  We changed the
-- 
2.22.0


[PATCH 2/2] PowerPC: PR 97791: Do not set gnu attributes on moves

>From 84ae44abc7a79b9c2e6d9f18a30516d3e8f65b1f Mon Sep 17 00:00:00 2001
From: Michael Meissner 
Date: Wed, 13 Jan 2021 21:45:20 -0500
Subject: [PATCH 2/2] PowerPC: PR 97791: Do not set gnu attributes on moves

This patch eliminates the code in rs6000_emit_move to set the long double
attribute.  This eliminates the false positives that occur when a type uses a
mode that is the same mode as the long double type.  However, to catch the
cases where long double is used without being passed or returned, we would need
to implement a GIMPLE pass that looked for explicit long double usage.

This patch also changes the 3 tests that tested this move support to be
'XFAIL' until we flag long double usage correctly.

I have tested this patch by doing builds, bootstraps, and make check with 3
builds on a power9 little endian server:

*   Build one used the default long double being IBM 128-bit;
*   Build two set the long double default to IEEE 128-bit; (and)
*   Build three set the long double default to 64-bit.

The compilers built fine providing I recompiled gmp, mpc, and mpfr with the
appropriate long double options.  There were a few differences in the test
suite runs that will be addressed in later patches, but overall it works well.
This patch is highly desirable to be able to build a toolchain where the
default long double is IEEE 128-bit.  Can I check this patch into the master
branch for GCC 11?

gcc/
2021-01-14  Michael Meissner  

PR gcc/97791
* config/rs6000/rs6000.c (rs6000_emit_move): Delete code that sets
whether long double was passed based on the modes used in moves.

gcc/testsuite/
2021-01-14  Michael Meissner  

PR target/97791
* gcc.target/powerpc/gnuattr1.c: Mark as XFAIL.
* gcc.target/powerpc/gnuattr2.c: Mark as XFAIL.
* gcc.target/powerpc/gnuattr3.c: Mark as XFAIL.
---
 gcc/config/rs6000/rs6000.c  | 17 -
 gcc/testsuite/gcc.target/powerpc/gnuattr1.c |  9 +++--
 gcc/testsuite/gcc.target/powerpc/gnuattr2.c |  9 +++--
 gcc/testsuite/gcc.target/powerpc/gnuattr3.c |  9 +++--
 4 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 47a56912e27..6f48dd6566d 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -10077,23 +10077,6 @@ rs6000_emit_move (rtx dest, rtx source, machine_mode 
mode)
   && GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT)
 gcc_unreachable ();
 
-#ifdef HAVE_AS_GNU_ATTRIBUTE
-  /* If we use a long double type, set the flags in .gnu_attribute that say
- what the long double type is.  This is to allow the linker's warning
- message for the wrong long double to be useful, even if the function does
- not do a call (for example, doing a 128-bit add on power9 if the long
- double type is IEEE 128-bit.  Do not set this if __ibm128 or __floa128 are
- used if they aren't the default long dobule type.  */
-  if (rs6000_gnu_attr && (HAVE_LD_PPC_GNU_ATTR_LONG_DOUBLE || TARGET_64BIT))
-{
-  if (TARGET_LONG_DOUBLE_128 && (mode == TFmode || mode == TCmode))
-   rs6000_passes_float = rs6000_passes_long_double = true;
-
-  else if (!TARGET_LONG_DOUBLE_128 && (mode == DFmode || mode == DCmode))
-   rs6000_passes_float = rs6000_passes_long_double = true;
-}
-#endif
-
   /* See if we need to special case SImode/SFmode SUBREG moves.  */
   if ((mode == SImode || mode == SFmode) && SUBREG_P (source)
   && rs6000_emit_move_si_sf_subreg (dest, source, mode))
diff --git a/gcc/testsuite/gcc.target/powerpc/gnuattr1.c 
b/gcc/testsuite/gcc.target/powerpc/gnuattr1.c
index cf46777849a..9c7680aabae 100644
--- a/gcc/testsuite/gcc.target/powerpc/gnuattr1.c
+++ b/gcc/testsuite/gcc.target/powerpc/gnuattr1.c
@@ -1,11 +1,16 @@
 /* { dg-do compile { target { powerpc*-linux-* } } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O2 -mvsx -mlong-double-64" } */
-/* { dg-final { scan-assembler "gnu_attribute 4, 9" } } */
+/* { dg-final { scan-assembler "gnu_attribute 4, 9" {xfail *-*-*} } } */
 
 /* Check that if we can do the long double operation without doing an emulator
call, such as with 64-bit long double support, that we still set the
-   appropriate .gnu_attribute.  */
+   appropriate .gnu_attribute.
+
+   However, the code that did this in rs6000_emit_move has been removed because
+   it could not differentiate between long double and another type that uses
+   the same mode.  This test is marked as xfail until a gimple pass is added to
+   track the use of long double types.  */
 
 long double a;
 
diff --git a/gcc/testsuite/gcc.target/powerpc/gnuattr2.c 
b/gcc/testsuite/gcc.target/powerpc/gnuattr2.c
index 32a4ba255a8..4ca60d63261 100644
--- a/gcc/testsuite/gcc.target/powerpc/gnuattr2.c
+++ b/gcc/testsuite/gcc.target/powerpc/gnuattr2.c
@@ -1,13 +1,18 @@
 /* { dg-do compile { target { powerpc*-linux-* && lp64 } } } */
 /*

Re: [PATCH v5] rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2021-01-14 Thread Segher Boessenkool

On Thu, Jan 07, 2021 at 11:20:39AM -0300, Raoni Fassina Firmino wrote:
> On Wed, Nov 18, 2020 at 06:38:22AM -0600, Segher Boessenkool wrote:
> > We can handle the constants issue similarly to what we do for
> > __builtin_fpclassify, too.
> 
> I think that if we must safeguard against future or unforeseen libc
> implementations, doing what __builtin_fpclassify does is the way to go.
> I don't know what the GCC policy is here, but IMHO we should not add
> complexity before it is needed in this case.  And looking at
> __builtin_fpclassify, it seems like a lot: IIUC this solution needs
> fixincludes to work, which seems to me too much added maintenance for
> something that is not needed yet, because SPARC doesn't have these
> expanders, and no other target does for now.

This way the compiler does not need to know the values of the macros
*at all*, that is the whole point!  You simply pass all the standard
values to the builtin as extra arguments.  This may seem inconvenient
to use, but you put the whole thing in a header file anyway, all is
hidden.
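
For reference, this is how the existing __builtin_fpclassify (a GCC/Clang builtin) takes the libc-specific constants as leading arguments; the wrapper name classify_sketch is made up for this illustration. A header-level wrapper for the new builtins could presumably follow the same pattern.

```cpp
#include <cassert>
#include <cmath>

// The five classification values come from the C library's headers and are
// passed in as arguments, so the compiler never needs to know them itself.
// The last argument is the value to classify.
inline int
classify_sketch(double x)
{
  return __builtin_fpclassify(FP_NAN, FP_INFINITE, FP_NORMAL,
                              FP_SUBNORMAL, FP_ZERO, x);
}
```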


Segher

Re: [PATCH] c-family: Improve MEM_REF printing for diagnostics [PR98597]

2021-01-14 Thread Martin Sebor via Gcc-patches


On 1/14/21 12:43 AM, Richard Biener wrote:

On Wed, 13 Jan 2021, Jakub Jelinek wrote:


Hi!

The following patch doesn't actually fix the print_mem_ref bugs, I've kept
it for now as broken as it was, but at least improves the cases where
we can unambiguously map back MEM[&something + off] into some particular
reference (e.g. something.foo[1].bar etc.).
In the distant past I think we were folding such MEM_REFs back to
COMPONENT_REFs and ARRAY_REFs, but we've stopped doing that.


Yeah, because it has different semantics - *((int *)t + 3)
accesses an int object while t.u.b accesses a 't' object from the TBAA
perspective.


  But for
diagnostics that is what the user actually wants to see IMHO.
So on the attached testcase, instead of printing what is in left column
it prints what is in right column:
((int*)t) + 3   t.u.b
((int*)t) + 6   t.u.e.i
((int*)t) + 8   t.v
s + 1   s[1]


so while that's "nice" in general, for TBAA diagnostics it might actually
be misleading.

I wonder whether we absolutely need to print a C expression here.
We could print, instead of *((int *)t + 3), "access to a memory
object of type 'int' at offset 12 bytes from 't'", thus explain
in plain English.

That said, *((int *)t + 3) is exactly what the access is,
semantically.  There's the case of a mismatch of the access type
and the TBAA type which we cannot write down in C terms but maybe
we want to have a builtin for this?  __builtin_access (ptr, lvalue-type,
tbaa-type)?
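
To make the two spellings concrete, here is a small sketch with a hypothetical layout. The structs U and T below are invented so that u.b lands three ints into T; they are not the types from the attached testcase.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout, chosen only so that u.b sits three ints into T.
struct U { int a; int b; };
struct T { int x; int y; U u; int v; };

// Byte offset divided by sizeof(int) gives the index that appears in the
// "(int *)&t + 3" spelling.
inline std::size_t
int_index_of_u_b()
{ return offsetof(T, u.b) / sizeof(int); }

// The raw-offset spelling and the member spelling name the same address;
// they differ only in the type used for the access (the TBAA point above).
inline bool
same_address(T& t)
{
  char* base = reinterpret_cast<char*>(&t);
  return reinterpret_cast<int*>(base + offsetof(T, u.b)) == &t.u.b;
}
```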


Of course, print_mem_ref needs to be also fixed to avoid printing the
nonsense it is printing right now, t is a structure type, so it can't be
cast to int* in C and in C++ only using some user operator, and
the result of what it printed is a pointer, while the uninitialized reads
are int.

I was hoping Martin would fix that, but given his comment in the PR I think
I'll fix it myself tomorrow.

Anyway, this patch is useful on its own.  Bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk?


In the light of Martin's patch this is probably reasonable but still
the general direction is wrong (which is why I didn't approve Martin's
original patch).  I'm also somewhat disappointed we're breaking this
so late in the cycle.


So am I.  I didn't test this change as exhaustively as I could and
(in light of the poor test coverage) should have.  That's my bad.
FWIW, I did do it for the first patch (by instrumenting GCC and
formatting every MEM_REF it came across), but it didn't occur to
me to do it this time around.  I have now completed this testing
(it found one more ICE elsewhere that I'll fix soon).

That said, as I mentioned in the patch submission, most middle end
warnings don't format MEM_REFs.  -Wuninitialized is an outlier here.
Most middle end warnings about invalid accesses print the decl or
allocation call with either a type or size of the access, followed
by either an array index or a byte offset into the target of
the access.  For consistency I'd like to converge most middle end
warnings on the same style (formatted by the same code).  When
that happens, the MEM_REF format should become much less important,
so I wouldn't invest too much effort into perfecting it.

Martin

libgo patch committed: Update hurd support

2021-01-14 Thread Ian Lance Taylor via Gcc-patches

This libgo patch by Svante Signell updates the hurd support.  This
fixes GCC PR 98496.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
ab5bf5f728be354427a5b06784f34011fea555bc
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index cd95c3d0755..8cfc63248a7 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-255657dc8d61ab26121ca68f124412ef37599166
+fd5396b1af389a55d1e3612702cfdad6755084e9
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/Makefile.am b/libgo/Makefile.am
index 2e8c37e170d..6188725f73b 100644
--- a/libgo/Makefile.am
+++ b/libgo/Makefile.am
@@ -881,7 +881,7 @@ GOBENCH =
 CHECK = \
GC="$(GOC) $(GOCFLAGS) $($(subst /,_,$@)_GOCFLAGS) -L `${PWD_COMMAND}` 
-L `${PWD_COMMAND}`/.libs"; \
export GC; \
-   GOLIBS="$(extra_check_libs_$(subst .,_,$(subst /,_,$(@D 
$(MATH_LIBS) $(NET_LIBS) $(LIBS)"; \
+   GOLIBS="$(extra_check_libs_$(subst .,_,$(subst /,_,$(@D 
$(PTHREAD_LIBS) $(MATH_LIBS) $(NET_LIBS) $(LIBS)"; \
export GOLIBS; \
RUNTESTFLAGS="$(RUNTESTFLAGS)"; \
export RUNTESTFLAGS; \
diff --git a/libgo/Makefile.in b/libgo/Makefile.in
index 34b0e1d0056..daae4f842d7 100644
--- a/libgo/Makefile.in
+++ b/libgo/Makefile.in
@@ -1029,7 +1029,7 @@ GOBENCH =
 CHECK = \
GC="$(GOC) $(GOCFLAGS) $($(subst /,_,$@)_GOCFLAGS) -L `${PWD_COMMAND}` 
-L `${PWD_COMMAND}`/.libs"; \
export GC; \
-   GOLIBS="$(extra_check_libs_$(subst .,_,$(subst /,_,$(@D 
$(MATH_LIBS) $(NET_LIBS) $(LIBS)"; \
+   GOLIBS="$(extra_check_libs_$(subst .,_,$(subst /,_,$(@D 
$(PTHREAD_LIBS) $(MATH_LIBS) $(NET_LIBS) $(LIBS)"; \
export GOLIBS; \
RUNTESTFLAGS="$(RUNTESTFLAGS)"; \
export RUNTESTFLAGS; \
diff --git a/libgo/go/crypto/x509/root_hurd.go 
b/libgo/go/crypto/x509/root_hurd.go
index 59e9ff0c81b..a25b8a1bc08 100644
--- a/libgo/go/crypto/x509/root_hurd.go
+++ b/libgo/go/crypto/x509/root_hurd.go
@@ -9,3 +9,9 @@ package x509
 var certFiles = []string{
"/etc/ssl/certs/ca-certificates.crt", // Debian/Ubuntu/Gentoo etc.
 }
+
+// Possible directories with certificate files; stop after successfully
+// reading at least one file from a directory.
+var certDirectories = []string{
+   "/etc/ssl/certs", // SLES10/SLES11, https://golang.org/issue/12139
+}
diff --git a/libgo/go/runtime/export_pipe2_test.go 
b/libgo/go/runtime/export_pipe2_test.go
index 9d580d33134..209c6b14a11 100644
--- a/libgo/go/runtime/export_pipe2_test.go
+++ b/libgo/go/runtime/export_pipe2_test.go
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-// +build freebsd linux netbsd openbsd solaris
+// +build freebsd hurd linux netbsd openbsd solaris
 
 package runtime
 
diff --git a/libgo/go/runtime/nbpipe_test.go b/libgo/go/runtime/nbpipe_test.go
index 981143ec27b..d7c5d45c854 100644
--- a/libgo/go/runtime/nbpipe_test.go
+++ b/libgo/go/runtime/nbpipe_test.go
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-// +build aix darwin dragonfly freebsd linux netbsd openbsd solaris
+// +build aix darwin dragonfly freebsd hurd linux netbsd openbsd solaris
 
 package runtime_test
 
diff --git a/libgo/go/runtime/os_hurd.go b/libgo/go/runtime/os_hurd.go
index 1613b410e2c..8bde23edb81 100644
--- a/libgo/go/runtime/os_hurd.go
+++ b/libgo/go/runtime/os_hurd.go
@@ -27,19 +27,19 @@ func libc_malloc(uintptr) unsafe.Pointer
 
 //go:noescape
 //extern sem_init
-func sem_init(sem *_sem_t, pshared int32, value uint32) int32
+func sem_init(sem *semt, pshared int32, value uint32) int32
 
 //go:noescape
 //extern sem_wait
-func sem_wait(sem *_sem_t) int32
+func sem_wait(sem *semt) int32
 
 //go:noescape
 //extern sem_post
-func sem_post(sem *_sem_t) int32
+func sem_post(sem *semt) int32
 
 //go:noescape
 //extern sem_timedwait
-func sem_timedwait(sem *_sem_t, timeout *timespec) int32
+func sem_timedwait(sem *semt, timeout *timespec) int32
 
 //go:noescape
 //extern clock_gettime
@@ -51,12 +51,12 @@ func semacreate(mp *m) {
return
}
 
-   var sem *_sem_t
+   var sem *semt
 
// Call libc's malloc rather than malloc. This will
// allocate space on the C heap. We can't call malloc
// here because it could cause a deadlock.
-   sem = (*_sem_t)(libc_malloc(unsafe.Sizeof(*sem)))
+   sem = (*semt)(libc_malloc(unsafe.Sizeof(*sem)))
if sem_init(sem, 0, 0) != 0 {
throw("sem_init")
}
@@ -86,7 +86,7 @@ func semasleep(ns int64) int32 {
ts.tv_sec = timespec_sec_t(sec)
ts.tv_nsec = timespec_nsec_t(nsec)
 
-   if sem_timedwait((*_sem_t)(unsafe.Pointer(_m_.waitsema)), &ts) 
!= 0 {
+   if sem_timedwait((*semt)(unsafe.Pointer(_m_.waitsema)), &ts) !=

Re: [PATCH] PR fortran/98661 - valgrind issues with error recovery

2021-01-14 Thread Paul Richard Thomas via Gcc-patches

Hi Harald,

That's OK for master.

Thanks

Paul


On Wed, 13 Jan 2021 at 21:25, Harald Anlauf via Fortran 
wrote:

> Dear all,
>
> the former Fortran testcase charlen_03.f90, which some time ago used to
> ICE, could still display issues during error recovery.  As Dominique
> pointed out, this required either an instrumented compiler, or valgrind.
>
> The issue turned out to not have anything to do with CHARACTER, but
> with an invalid attempt to resolve an invalid array specification.
>
> Regtested on x86_64-pc-linux-gnu, and checked for the testcase with
> valgrind.
>
> OK for master?
>
> Thanks,
> Harald
>
>
> PR fortran/98661 - valgrind issues with error recovery
>
> During error recovery after an invalid derived type specification it was
> possible to try to resolve an invalid array specification.  We now skip
> this if the component has the ALLOCATABLE or POINTER attribute and the
> shape is not deferred.
>
> gcc/fortran/ChangeLog:
>
> PR fortran/98661
> * resolve.c (resolve_component): Derived type components with
> ALLOCATABLE or POINTER attribute shall have a deferred shape.
>
> gcc/testsuite/ChangeLog:
>
> PR fortran/98661
> * gfortran.dg/pr98661.f90: New test.
>
>

-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein

Re: [Patch, RFC] PR fortran/93340 - [8/9/10/11 Regression] fix missed substring simplifications

2021-01-14 Thread Paul Richard Thomas via Gcc-patches

Hi Harald,

It looks OK to me. I can see why you are asking about the implementation
but cannot offer a better solution.

OK for master.

Thanks

Paul


On Tue, 12 Jan 2021 at 22:03, Harald Anlauf via Fortran 
wrote:

> Dear all,
>
> when playing around with the issues exposed by PR93340, particularly
> visible in the tree dump, I tried to find ways to simplify substrings in
> those cases where they are eligible as designator, which is required e.g.
> in DATA statements.
>
> Given my limited understanding, I finally arrived at a potential solution
> which does that simplification near the end of match_string_constant in
> primary.c.  I couldn't find a better place, but I am open to better
> suggestions.
>
> The simplification below does an even better job at detecting invalid
> substring starting or ending indices than HEAD, and regtests cleanly on
> x86_64-pc-linux-gnu.
>
> Feedback appreciated.  Is this potentially ok for master, or should this
> be done differently?
>
> Thanks,
> Harald
>
>
> PR fortran/93340 - fix missed substring simplifications
>
> Substrings were not reduced early enough for use in initializations,
> such as DATA statements.  Add an early simplification for substrings
> with constant starting and ending points.
>
> gcc/fortran/ChangeLog:
>
> * gfortran.h (gfc_resolve_substring): Add prototype.
> * primary.c (match_string_constant): Simplify substrings with
> constant starting and ending points.
> * resolve.c: Rename resolve_substring to gfc_resolve_substring.
> (gfc_resolve_ref): Use renamed function gfc_resolve_substring.
>
> gcc/testsuite/ChangeLog:
>
> * substr_10.f90: New test.
> * substr_9.f90: New test.
>
>

-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein

[nvptx libgomp plugin] Build only in supported configurations (was: [nvptx offloading] Only 64-bit configurations are currently supported)

Hi!

On 2015-07-08T17:03:02+0200, I wrote:
> On Wed, 18 Feb 2015 09:50:15 +0100, I wrote:
>> So far, we have concentrated only on the 64-bit x86_64 configuration;
>> 32-bit has several known issues to be resolved.
>>  filed.

(This still holds, and is unlikely to ever get addressed.)

> I have committed the following patch in r225560.  This gets us rid of the
> lots of "expected FAILs" in the 32-bit part of
> RUNTESTFLAGS='--target_board=unix\{-m64,-m32\}' testing, for example.
>
> commit fe265ad3c9624da88f43be349137696449148f4f
> Author: tschwinge 
> Date:   Wed Jul 8 14:59:59 2015 +
>
> [nvptx offloading] Only 64-bit configurations are currently supported
>
>   PR libgomp/65099
>   gcc/
>   * config/nvptx/mkoffload.c (main): Create an offload image only in
>   64-bit configurations.

(That remains in place.)

>   libgomp/
>   * plugin/plugin-nvptx.c (nvptx_get_num_devices): Return 0 if not
>   in a 64-bit configuration.

I've now refined that, for the reasons given in the commit log: I pushed
"[nvptx libgomp plugin] Build only in supported configurations" to master
branch in commit 6106dfb9f73a33c87108ad5b2dcd4842bdd7828e, and
cherry-picked it into releases/gcc-10 branch in commit
1e56a7c9a6631b217299b2ddcd5c4d497bb3445e, releases/gcc-9 branch in commit
0f1e1069a753e912b058f0d4bf599f0edde28408, and releases/gcc-8 branch in commit
f9267925c648f2ccd9e4680b699e581003125bcf; see attached.


Grüße
 Thomas


From 6106dfb9f73a33c87108ad5b2dcd4842bdd7828e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 30 Nov 2020 15:15:20 +0100
Subject: [PATCH] [nvptx libgomp plugin] Build only in supported configurations

As recently again discussed in  "[nvptx] -m32
support", nvptx offloading other than for 64-bit host has never been
implemented, tested, supported.  So we simply shouldn't build the nvptx libgomp
plugin in this case.

This avoids build problems if, for example, in a (standard) bi-arch
x86_64-pc-linux-gnu '-m64'/'-m32' build, libcuda is available only in a 64-bit
variant but not in a 32-bit one, which, for example, is the case if you build
GCC against the CUDA toolkit's 'stubs/libcuda.so' (see
).

This amends PR65099 commit a92defdab79a1268f4b9dcf42b937e4002a4cf15 (r225560)
"[nvptx offloading] Only 64-bit configurations are currently supported" to
match the way we're doing this for the HSA/GCN plugins.

	libgomp/
	PR libgomp/65099
	* plugin/configfrag.ac (PLUGIN_NVPTX): Restrict to supported
	configurations.
	* configure: Regenerate.
	* plugin/plugin-nvptx.c (nvptx_get_num_devices): Remove 64-bit
	check.
---
 libgomp/configure | 86 +++-
 libgomp/plugin/configfrag.ac  | 92 ---
 libgomp/plugin/plugin-nvptx.c |  9 
 3 files changed, 105 insertions(+), 82 deletions(-)

diff --git a/libgomp/configure b/libgomp/configure
index 89c17c571b7..48bf8e4a72c 100755
--- a/libgomp/configure
+++ b/libgomp/configure
@@ -15272,21 +15272,30 @@ if test x"$enable_offload_targets" != x; then
 	tgt_plugin=intelmic
 	;;
   nvptx*)
-	tgt_plugin=nvptx
-	PLUGIN_NVPTX=$tgt
-	if test "x$CUDA_DRIVER_LIB" != xno \
-	   && test "x$CUDA_DRIVER_LIB" != xno; then
-	  PLUGIN_NVPTX_CPPFLAGS=$CUDA_DRIVER_CPPFLAGS
-	  PLUGIN_NVPTX_LDFLAGS=$CUDA_DRIVER_LDFLAGS
-	  PLUGIN_NVPTX_LIBS='-lcuda'
-
-	  PLUGIN_NVPTX_save_CPPFLAGS=$CPPFLAGS
-	  CPPFLAGS="$PLUGIN_NVPTX_CPPFLAGS $CPPFLAGS"
-	  PLUGIN_NVPTX_save_LDFLAGS=$LDFLAGS
-	  LDFLAGS="$PLUGIN_NVPTX_LDFLAGS $LDFLAGS"
-	  PLUGIN_NVPTX_save_LIBS=$LIBS
-	  LIBS="$PLUGIN_NVPTX_LIBS $LIBS"
-	  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+	case "${target}" in
+	  aarch64*-*-* | powerpc64le-*-* | x86_64-*-*)
+	case " ${CC} ${CFLAGS} " in
+	  *" -m32 "* | *" -mx32 "*)
+		# PR libgomp/65099: Currently, we only support offloading in
+		# 64-bit configurations.
+		PLUGIN_NVPTX=0
+		;;
+	  *)
+		tgt_plugin=nvptx
+		PLUGIN_NVPTX=$tgt
+		if test "x$CUDA_DRIVER_LIB" != xno \
+		   && test "x$CUDA_DRIVER_LIB" != xno; then
+		  PLUGIN_NVPTX_CPPFLAGS=$CUDA_DRIVER_CPPFLAGS
+		  PLUGIN_NVPTX_LDFLAGS=$CUDA_DRIVER_LDFLAGS
+		  PLUGIN_NVPTX_LIBS='-lcuda'
+
+		  PLUGIN_NVPTX_save_CPPFLAGS=$CPPFLAGS
+		  CPPFLAGS="$PLUGIN_NVPTX_CPPFLAGS $CPPFLAGS"
+		  PLUGIN_NVPTX_save_LDFLAGS=$LDFLAGS
+		  LDFLAGS="$PLUGIN_NVPTX_LDFLAGS $LDFLAGS"
+		  PLUGIN_NVPTX_save_LIBS=$LIBS
+		  LIBS="$PLUGIN_NVPTX_LIBS $LIBS"
+		  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 #include "cuda.h"
 int
@@ -15302,28 +15311,35 @@ if ac_fn_c_try_link "$LINENO"; then :
 fi
 rm -f core conftest.err conftest.$ac_objext \
 conftest$ac_exeext conftest.$ac_ext
-	  CPPFLAGS=$PLUGIN_NVPTX_save_CPPFLAGS
-	  LDFLAGS=$PLUGIN_NVPTX_save_LDFLAGS
-

[PATCH] c-family, v2: Improve MEM_REF printing for diagnostics [PR98597]

On Thu, Jan 14, 2021 at 10:49:42AM -0700, Martin Sebor wrote:
> > In the light of Martins patch this is probably reasonable but still
> > the general direction is wrong (which is why I didn't approve Martins
> > original patch).  I'm also somewhat disappointed we're breaking this
> > so late in the cycle.
> 
> So am I.  I didn't test this change as exhaustively as I could and
> (in light of the poor test coverage) should have.  That's my bad.
> FWIW, I did do it for the first patch (by instrumenting GCC and
> formatting every MEM_REF it came across), but it didn't occur to
> me to do it this time around.  I have now completed this testing
> (it found one more ICE elsewhere that I'll fix soon).

Ok, here is an updated patch which fixes what I found, and implements what
has been discussed on the mailing list and on IRC, i.e. if the types
are compatible as well as alias sets are same, then it prints
what c_fold_indirect_ref_for_warn managed to create, otherwise it uses
that info for printing offsets using offsetof (except when it starts
with ARRAY_REFs, because one can't have offsetof (struct T[2][2], [1][0].x.y)).
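
To make the offsetof spelling concrete, here is a minimal sketch of how a
byte offset from the start of an object corresponds to a nested member.  The
Outer/Inner layout and the helper names are hypothetical, chosen only for
illustration; they are not the testcase from the PR and not code from the
patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical layout, used only to illustrate the offset arithmetic.
struct Inner { int p; int q; };
struct Outer { int x; int y; Inner u; };

// Byte offset of the nested member, i.e. what offsetof (Outer, u.q) denotes.
std::size_t offset_of_u_q() { return offsetof(Outer, u.q); }

// Read the int stored byte_offset bytes from the start of o.  memcpy keeps
// the read well defined regardless of the aliasing rules.
int read_int_at(const Outer &o, std::size_t byte_offset) {
  assert(byte_offset + sizeof(int) <= sizeof(Outer));
  int value;
  std::memcpy(&value, reinterpret_cast<const char *>(&o) + byte_offset,
              sizeof value);
  return value;
}
```

With this layout all members are ints, so u.q sits three ints past the start
of the object, and reading at that offset yields the same value as o.u.q.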

The uninit-38.c test (which was the only one I believe which had tests on the
exact spelling of MEM_REF printing) contains mainly changes to have space
before * for pointer types (as that is how the C pretty-printers normally
print types, int * rather than int*), plus what might be considered a
regression from what Martin printed, but it is actually a correctness fix.

When the arg is a pointer with type pointer to VLA with char element type
(let's say the pointer is p), which is what happens in several of the
uninit-38.c tests, omitting the (char *) cast is incorrect, as p + 1
is not the 1 byte after p, but pointer to the end of the VLA.
It only happened to work because of the hacks (which I don't like at all
and are dangerous, DECL_ARTIFICIAL var names with dot inside can be pretty
much anything, e.g. a lot of passes construct their helper vars from some
prefix that designates intended use of the var plus numeric suffix), where
the a.1 pointer to VLA is printed as a which if one is lucky happens to be
a variable with VLA type (rather than pointer to it), and for such vars
a + 1 is indeed &a[0] + 1 rather than &a + 1.  But if we want to do this
reliably, we'd need to make sure it comes from VLA (e.g. verify that the
SSA_NAME is defined to __builtin_alloca_with_align and that there exists
a corresponding VAR_DECL with DECL_VALUE_EXPR that has the a.1 variable
in it).
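
The difference between p + 1 and a + 1 described above can be sketched as
follows.  C++ has no VLAs, so this assumes a fixed-size char array standing in
for the VLA; the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Bytes between p and p + 1 when p is a pointer to a whole array of N chars:
// the step is the size of the array, so p + 1 points just past its end.
template <std::size_t N>
std::ptrdiff_t step_of_pointer_to_array(char (*p)[N]) {
  assert(p != nullptr);
  return reinterpret_cast<char *>(p + 1) - reinterpret_cast<char *>(p);
}

// Bytes between a and a + 1 when a is the array itself: a decays to
// &a[0], so the step is a single element.
template <std::size_t N>
std::ptrdiff_t step_of_array(char (&a)[N]) {
  return (a + 1) - a;
}
```

For a char buf[8], the first helper called with &buf reports 8 and the second
called with buf reports 1, which is why dropping the (char *) cast changes the
meaning when the operand is a pointer to the array rather than the array.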

Is this ok for trunk if it passes bootstrap/regtest?

2021-01-14  Jakub Jelinek  

PR tree-optimization/98597
* c-pretty-print.c (c_fold_indirect_ref_for_warn): New function.
(print_mem_ref): Use it.  If it returns something that has compatible
type and is TBAA compatible with zero offset, print it and return,
otherwise print it using offsetof syntax or array ref syntax.  Fix up
printing if MEM_REFs first operand is ADDR_EXPR, or when the first
argument has pointer to array type.  Print pointers using the standard
formatting.

* gcc.dg/uninit-38.c: Expect a space in between type name and asterisk.
Expect for now a (char *) cast for VLAs.
* gcc.dg/uninit-40.c: New test.

--- gcc/c-family/c-pretty-print.c.jj2021-01-13 15:27:09.822834600 +0100
+++ gcc/c-family/c-pretty-print.c   2021-01-14 19:02:21.299138891 +0100
@@ -1809,6 +1809,113 @@ pp_c_call_argument_list (c_pretty_printe
   pp_c_right_paren (pp);
 }

+/* Try to fold *(type *)&op into op.fld.fld2[1] if possible.
+   Only used for printing expressions.  Should punt if ambiguous
+   (e.g. in unions).  */
+
+static tree
+c_fold_indirect_ref_for_warn (location_t loc, tree type, tree op,
+ offset_int &off)
+{
+  tree optype = TREE_TYPE (op);
+  if (off == 0)
+{
+  if (lang_hooks.types_compatible_p (optype, type))
+   return op;
+  /* *(foo *)&complexfoo => __real__ complexfoo */
+  else if (TREE_CODE (optype) == COMPLEX_TYPE
+  && lang_hooks.types_compatible_p (type, TREE_TYPE (optype)))
+   return build1_loc (loc, REALPART_EXPR, type, op);
+}
+  /* ((foo*)&complexfoo)[1] => __imag__ complexfoo */
+  else if (TREE_CODE (optype) == COMPLEX_TYPE
+  && lang_hooks.types_compatible_p (type, TREE_TYPE (optype))
+  && tree_to_uhwi (TYPE_SIZE_UNIT (type)) == off)
+{
+  off = 0;
+  return build1_loc (loc, IMAGPART_EXPR, type, op);
+}
+  /* ((foo *)&fooarray)[x] => fooarray[x] */
+  if (TREE_CODE (optype) == ARRAY_TYPE
+  && TYPE_SIZE_UNIT (TREE_TYPE (optype))
+  && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (optype))) == INTEGER_CST
+  && !integer_zerop (TYPE_SIZE_UNIT (TREE_TYPE (optype
+{
+  tree type_domain = TYPE_DOMAIN (optype);
+  tree min_val = size_zero_node;
+  if (type_domain && TYPE_MIN_VALUE (type_domain))
+   min_val = TYPE_MIN_VALUE (type_domain);
+  offse

Re: [PATCH] libstdc++/98466 Fix _GLIBCXX_DEBUG N3644 integration

2021-01-14 Thread François Dumont via Gcc-patches


On 14/01/21 6:10 pm, Jonathan Wakely wrote:

On 01/01/21 18:51 +0100, François Dumont via Libstdc++ wrote:
I think the PR is not limited to unordered containers iterator, it 
impacts all _GLIBCXX_DEBUG iterators.


However unordered containers local_iterator was more complicated to 
handle. Because of c++/65816 I prefer to review 
_Node_iterator_default constructor to set _M_cur to nullptr even if 
in principle it is not necessary except for the _Local_iterator_base 
constructor when hash code is not cached.


    libstdc++: Implement N3644 for _GLIBCXX_DEBUG iterators

    libstdc++-v3/ChangeLog

        PR libstdc++/98466
        * include/bits/hashtable_policy.h (_Node_iterator_base()): Set
        _M_cur to nullptr.
        (_Node_iterator()): Make default.
        (_Node_const_iterator()): Make default.
        * include/debug/macros.h (__glibcxx_check_erase_range_after): Add
        _M_singular iterator checks.
        * include/debug/safe_iterator.h (_GLIBCXX_DEBUG_VERIFY_OPERANDS):
        Accept if both iterators are value initialized.
        * include/debug/safe_local_iterator.h
        (_GLIBCXX_DEBUG_VERIFY_OPERANDS): Likewise.
        * include/debug/safe_iterator.tcc
        (_Safe_iterator<>::_M_valid_range): Add _M_singular checks on
        input iterators.
        * src/c++11/debug.cc (_Safe_iterator_base::_M_can_compare): Remove
        _M_singular checks.
        * testsuite/23_containers/deque/debug/98466.cc: New test.
        * testsuite/23_containers/unordered_map/debug/98466.cc: New test.


Tested under Linux x86_64 normal and debug mode.

Ok to commit ?


Yes, thanks.

One question about the deque test ...


diff --git 
a/libstdc++-v3/testsuite/23_containers/deque/debug/98466.cc 
b/libstdc++-v3/testsuite/23_containers/deque/debug/98466.cc

new file mode 100644
index 000..720977e5622
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/deque/debug/98466.cc
@@ -0,0 +1,38 @@
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License 
along

+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do run { target c++11 } }


Does this need to be limited to c++11 and later? Could it just use
{ dg-do run } instead?


Good point, a bad copy/paste from the unordered test I guess.

But when I tried removing it, the compiler complained about invalid '{}'
syntax in C++98 for the iterator value initialization. I tried replacing it
with '()', but that is ambiguous with a function declaration, so I gave up!

As N3644 talks about value initialization, limiting the test to C++11 doesn't
sound that bad; isn't it a C++11 concept?
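
For what it's worth, the '{}' versus '()' problem can be sketched like this.
It is only an illustration with made-up function names, not the test from the
patch; and note that the guarantee that value-initialized forward iterators
compare equal comes from N3644, i.e. C++14, so in older modes the comparison
only happens to work with a given implementation.

```cpp
#include <cassert>
#include <deque>

// "std::deque<int>::iterator it();" would declare a function named it,
// and "std::deque<int>::iterator it{};" needs C++11.  Copy-initializing
// from a value-initialized temporary is a spelling C++98 also accepts.
bool value_initialized_iterators_compare_equal() {
  typedef std::deque<int>::iterator iterator;
  iterator a = iterator();
  iterator b = iterator();
  return a == b;
}

// Usage: compiled without _GLIBCXX_DEBUG this is expected to hold.
void check_value_initialized_iterators() {
  assert(value_initialized_iterators_compare_equal());
}
```

Comparing such iterators is exactly what the debug-mode checks discussed in
this thread had been rejecting.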




OK to commit anyway, thanks.

Re: [PATCH] libstdc++/98466 Fix _GLIBCXX_DEBUG N3644 integration

2021-01-14 Thread François Dumont via Gcc-patches


On 14/01/21 6:15 pm, Jonathan Wakely wrote:

On 14/01/21 17:10 +, Jonathan Wakely wrote:

On 01/01/21 18:51 +0100, François Dumont via Libstdc++ wrote:
I think the PR is not limited to unordered containers iterator, it 
impacts all _GLIBCXX_DEBUG iterators.


However unordered containers local_iterator was more complicated to 
handle. Because of c++/65816 I prefer to review 
_Node_iterator_default constructor to set _M_cur to nullptr even if 
in principle it is not necessary except for the _Local_iterator_base 
constructor when hash code is not cached.


    libstdc++: Implement N3644 for _GLIBCXX_DEBUG iterators

    libstdc++-v3/ChangeLog

        PR libstdc++/98466
        * include/bits/hashtable_policy.h (_Node_iterator_base()): Set
        _M_cur to nullptr.
        (_Node_iterator()): Make default.
        (_Node_const_iterator()): Make default.
        * include/debug/macros.h (__glibcxx_check_erase_range_after): Add
        _M_singular iterator checks.
        * include/debug/safe_iterator.h (_GLIBCXX_DEBUG_VERIFY_OPERANDS):
        Accept if both iterators are value initialized.
        * include/debug/safe_local_iterator.h
        (_GLIBCXX_DEBUG_VERIFY_OPERANDS): Likewise.
        * include/debug/safe_iterator.tcc
        (_Safe_iterator<>::_M_valid_range): Add _M_singular checks on
        input iterators.
        * src/c++11/debug.cc (_Safe_iterator_base::_M_can_compare): Remove
        _M_singular checks.
        * testsuite/23_containers/deque/debug/98466.cc: New test.
        * testsuite/23_containers/unordered_map/debug/98466.cc: New test.


Tested under Linux x86_64 normal and debug mode.

Ok to commit ?


Yes, thanks.


I've just realised that this C++14 change used to be noted in the
C++14 status table:

    
  
  
  N3644 (http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2013/n3644.pdf)
  Null Forward Iterators
  Partial
  Only affects Debug Mode
 

But I removed that last year when replacing the list of proposals with
the Table of Contents taken from the standard, in commit
57ede05c6a0b443943e312bf205cb79233c9396f (oops!)

For the branches we should either document that missing feature in a
note, or backport your fix in a few weeks.




I'll keep the backport in my TODOs.

[r11-6672 Regression] Failed to bootstrap on Linux/x86_64

2021-01-14 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

77d372abec0fbf2cfe922e3140ee3410248f979e is the first bad commit
commit 77d372abec0fbf2cfe922e3140ee3410248f979e
Author: H.J. Lu 
Date:   Thu Jan 14 05:56:46 2021 -0800

x86: Error on -fcf-protection with incompatible target

caused build failure when configured with:

../gcc/configure --with-arch=skylake-avx512 --with-cpu=skylake-avx512  
--enable-clocale=gnu --with-system-zlib --enable-shared --enable-cet 
--with-demangler-in-ld --enable-libmpx --with-fpmath=sse 

Build log(last 100 lines):

config.status: creating src/c++17/Makefile
Adding multilib support to src/c++17/Makefile in ../../../../../gcc/libstdc++-v3
with_multisubdir=32
config.status: creating src/c++20/Makefile
Adding multilib support to src/c++20/Makefile in ../../../../../gcc/libstdc++-v3
with_multisubdir=32
config.status: creating src/filesystem/Makefile
Adding multilib support to src/filesystem/Makefile in 
../../../../../gcc/libstdc++-v3
with_multisubdir=32
config.status: creating doc/Makefile
Adding multilib support to doc/Makefile in ../../../../../gcc/libstdc++-v3
with_multisubdir=32
config.status: creating po/Makefile
Adding multilib support to po/Makefile in ../../../../../gcc/libstdc++-v3
with_multisubdir=32
config.status: creating testsuite/Makefile
Adding multilib support to testsuite/Makefile in ../../../../../gcc/libstdc++-v3
with_multisubdir=32
config.status: creating python/Makefile
Adding multilib support to python/Makefile in ../../../../../gcc/libstdc++-v3
with_multisubdir=32
config.status: creating config.h
config.status: executing default-1 commands
Adding multilib support to Makefile in ../../../../../gcc/libstdc++-v3
with_multisubdir=32
config.status: executing libtool commands
config.status: executing include/gstdint.h commands
config.status: executing generate-headers commands
make[3]: Entering directory 
'/local/skpandey/gccwork/toolwork/gcc-bisect-build-master/master/r11-6672/bld/x86_64-linux/32/libstdc++-v3/include'
echo timestamp > stamp-pb
echo timestamp > stamp-host
make[3]: [Makefile:1754: x86_64-linux/bits/largefile-config.h] Error 1 (ignored)
echo 0 > stamp-namespace-version
echo 1 > stamp-visibility
echo 1 > stamp-extern-template
echo 1 > stamp-dual-abi
echo 1 > stamp-cxx11-abi
echo 1 > stamp-allocator-new
echo 'define _GLIBCXX_USE_FLOAT128 1' > stamp-float128
sed -e '/^#pragma/b' \
-e 
'/^#/s/\([ABCDEFGHIJKLMNOPQRSTUVWXYZ_][ABCDEFGHIJKLMNOPQRSTUVWXYZ_]*\)/_GLIBCXX_\1/g'
 \
-e 's/_GLIBCXX_SUPPORTS_WEAK/__GXX_WEAK__/g' \
-e 's/_GLIBCXX___MINGW32_GLIBCXX___/__MINGW32__/g' \
-e 's,^#include "\(.*\)",#include ,g' \
< 
/local/skpandey/gccwork/toolwork/gcc-bisect-build-master/master/gcc/libstdc++-v3/../libgcc/gthr.h
 > x86_64-linux/bits/gthr.h
sed -e 's/\(UNUSED\)/_GLIBCXX_\1/g' \
-e 's/\(GCC[ABCDEFGHIJKLMNOPQRSTUVWXYZ_]*_H\)/_GLIBCXX_\1/g' \
< 
/local/skpandey/gccwork/toolwork/gcc-bisect-build-master/master/gcc/libstdc++-v3/../libgcc/gthr-single.h
 > x86_64-linux/bits/gthr-single.h
sed -e 's/\(UNUSED\)/_GLIBCXX_\1/g' \
-e 's/\(GCC[ABCDEFGHIJKLMNOPQRSTUVWXYZ_]*_H\)/_GLIBCXX_\1/g' \
-e 's/SUPPORTS_WEAK/__GXX_WEAK__/g' \
-e 's/\([ABCDEFGHIJKLMNOPQRSTUVWXYZ_]*USE_WEAK\)/_GLIBCXX_\1/g' \
< 
/local/skpandey/gccwork/toolwork/gcc-bisect-build-master/master/gcc/libstdc++-v3/../libgcc/gthr-posix.h
 > x86_64-linux/bits/gthr-posix.h
sed -e 's/\(UNUSED\)/_GLIBCXX_\1/g' \
-e 's/\(GCC[ABCDEFGHIJKLMNOPQRSTUVWXYZ_]*_H\)/_GLIBCXX_\1/g' \
-e 's/SUPPORTS_WEAK/__GXX_WEAK__/g' \
-e 's/\([ABCDEFGHIJKLMNOPQRSTUVWXYZ_]*USE_WEAK\)/_GLIBCXX_\1/g' \
-e 's,^#include "\(.*\)",#include ,g' \
< 
/local/skpandey/gccwork/toolwork/gcc-bisect-build-master/master/gcc/libstdc++-v3/../libgcc/gthr-posix.h
 > x86_64-linux/bits/gthr-default.h
make[3]: Leaving directory 
'/local/skpandey/gccwork/toolwork/gcc-bisect-build-master/master/r11-6672/bld/x86_64-linux/32/libstdc++-v3/include'
config.status: executing libtool commands
config.status: executing include/gstdint.h commands
config.status: executing generate-headers commands
make[3]: Entering directory 
'/local/skpandey/gccwork/toolwork/gcc-bisect-build-master/master/r11-6672/bld/x86_64-linux/libstdc++-v3/include'
echo timestamp > stamp-pb
echo timestamp > stamp-host
make[3]: [Makefile:1753: x86_64-linux/bits/largefile-config.h] Error 1 (ignored)
make[3]: [Makefile:1754: x86_64-linux/bits/largefile-config.h] Error 1 (ignored)
echo 0 > stamp-namespace-version
echo 1 > stamp-visibility
echo 1 > stamp-extern-template
echo 1 > stamp-dual-abi
echo 1 > stamp-cxx11-abi
echo 1 > stamp-allocator-new
echo 'define _GLIBCXX_USE_FLOAT128 1' > stamp-float128
sed -e '/^#pragma/b' \
-e 
'/^#/s/\([ABCDEFGHIJKLMNOPQRSTUVWXYZ_][ABCDEFGHIJKLMNOPQRSTUVWXYZ_]*\)/_GLIBCXX_\1/g'
 \
-e 's/_GLIBCXX_SUPPORTS_WEAK/__GXX_WEAK__/g' \
-e 's/_GLIBCXX___MINGW32_GLIBCXX___/__MINGW32__/g' \
-e 's,^#include "\(.*\)",#include ,g' \
< 
/local/skpandey/gccwork/toolwork/gcc-bisect-build-master/master/gcc/libstdc++-v3/../libgcc/gthr.h

Re: calibrate intervals to avoid zero in futures poll test

2021-01-14 Thread Alexandre Oliva

On Jan 14, 2021, Jonathan Wakely  wrote:

>> +  /* Got for some 10 cycles, but we're already past that and still

> I can't parse "Got for some 10 cycles". If that's just a typo

Yeah, I meant "Go for ... but if ..." and managed to double-mangle it.
Thanks for spotting it.  Here's the patch I'm installing, with the typos
fixed.  Thanks!


calibrate intervals to avoid zero in futures poll test

From: Alexandre Oliva 

We get occasional failures of 30_threads/future/members/poll.cc
on some platforms whose high resolution clock doesn't have such a high
resolution; wait_for_0 ends up as 0, and then some asserts fail as
intervals measured as longer than zero are tested for less than
several times zero.

This patch adds some calibration in the iteration count to set a
measurable base time interval with some additional margin.
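
The calibration idea can be sketched on its own, outside the test, roughly as
below.  This is a simplified illustration under my own assumptions: the
helper name calls_per_tick and the Work callback are hypothetical, and the
sketch only counts calls within one clock tick rather than reproducing the
retry logic of the actual change.

```cpp
#include <cassert>
#include <chrono>

// Count how many calls of work() fit between two consecutive increments of
// Clock.  On a coarse clock this is large; on a fine one it may be 1.
template <typename Clock, typename Work>
int calls_per_tick(Work work) {
  auto stop = Clock::now();
  auto start = stop;
  // Spin until the clock advances, so that start is right after a tick.
  do
    start = Clock::now();
  while (start == stop);
  int calls = 0;
  // Keep calling until the clock advances again.
  do {
    work();
    stop = Clock::now();
    ++calls;
  } while (start == stop);
  return calls;
}

// Usage: at least one call always happens before the loop can exit.
void check_calls_per_tick() {
  assert(calls_per_tick<std::chrono::steady_clock>([] {}) >= 1);
}
```

Multiplying the returned count by some margin gives an iteration count whose
total duration the clock can actually measure.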


for  libstdc++-v3/ChangeLog

* testsuite/30_threads/future/members/poll.cc: Calibrate
iteration count.
---
 .../testsuite/30_threads/future/members/poll.cc|   33 +++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/30_threads/future/members/poll.cc 
b/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
index 91f685b172d73..133dae15ac471 100644
--- a/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
+++ b/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
@@ -25,7 +25,7 @@
 #include 
 #include 
 
-const int iterations = 200;
+int iterations = 200;
 
 using namespace std;
 
@@ -45,10 +45,41 @@ int main()
   promise p;
   future f = p.get_future();
 
+ start_over:
   auto start = chrono::high_resolution_clock::now();
   for(int i = 0; i < iterations; i++)
 f.wait_for(chrono::seconds(0));
   auto stop = chrono::high_resolution_clock::now();
+
+  /* We've run too few iterations for the clock resolution.
+ Attempt to calibrate it.  */
+  if (start == stop)
+{
+  /* Loop until the clock advances, so that start is right after a
+time increment.  */
+  do
+   start = chrono::high_resolution_clock::now();
+  while (start == stop);
+  int i = 0;
+  /* Now until the clock advances again, so that stop is right
+after another time increment.  */
+  do
+   {
+ f.wait_for(chrono::seconds(0));
+ stop = chrono::high_resolution_clock::now();
+ i++;
+   }
+  while (start == stop);
+  /* Go for some 10 cycles, but if we're already past that and
+still get into the calibration loop, double the iteration
+count and try again.  */
+  if (iterations < i * 10)
+   iterations = i * 10;
+  else
+   iterations *= 2;
+  goto start_over;
+}
+
   double wait_for_0 = print("wait_for(0s)", stop - start);
 
   start = chrono::high_resolution_clock::now();


-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar

[PATCH] keep scope blocks for all inlined functions (PR 98664)

2021-01-14 Thread Martin Sebor via Gcc-patches


One aspect of PR 98465 - Bogus warning stringop-overread for std::string
is the inconsistency between -g and -g0 which turns out to be due to
GCC eliminating apparently unused scope blocks from inlined functions
that aren't explicitly declared inline and artificial.  PR 98664 tracks
just this part of PR 98465.

To resolve just the PR 98664 subset the attached change has
the tree-ssa-live.c pass preserve these blocks for all inlined
functions, not just artificial ones.  Besides avoiding the interaction
between -g and warnings it also seems to improve the inlining context
by including more inlined call sites.  This can be seen in the adjusted
tests.  (Its effect on PR 98465 is that the false positive is issued
consistently, regardless of -g.  Avoiding the false positive is my
next step.)

Jakub, you raised a concern yesterday in PR 98465 c#13 about the memory
footprint of this change.  Can you please comment on whether it's in
line with what you were suggesting?

Martin
PR middle-end/98664 - inconsistent -Wfree-nonheap-object for inlined calls to system headers

gcc/ChangeLog:

	PR middle-end/98664
	* tree-ssa-live.c (remove_unused_scope_block_p): Keep scopes for
	all functions, even if they're not declared artificial or inline.
	* tree.c (tree_inlined_location): Use macro expansion location
	only if scope traversal fails to expose one.

gcc/testsuite/ChangeLog:

	PR middle-end/98664
	* gcc.dg/Wvla-larger-than-4.c: Adjust expected output.
	* gcc.dg/plugin/diagnostic-test-inlining-3.c: Same.
	* g++.dg/warn/Wfree-nonheap-object-5.C: New test.
	* gcc.dg/Wfree-nonheap-object-4.c: New test.

diff --git a/gcc/testsuite/g++.dg/warn/Wfree-nonheap-object-5.C b/gcc/testsuite/g++.dg/warn/Wfree-nonheap-object-5.C
new file mode 100644
index 000..742dba0cf58
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wfree-nonheap-object-5.C
@@ -0,0 +1,129 @@
+/* PR middle-end/98664 - inconsistent --Wfree-nonheap-object for inlined
+   calls to system headers
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+# 7 "Wfree-nonheap-object-5.h" 1 3
+
+struct A0
+{
+  void *p;
+
+  void f0 (void *q) { p = q; }
+  void g0 (void) {
+__builtin_free (p);   // { dg-warning "\\\[-Wfree-nonheap-object" }
+  }
+};
+
+struct A1
+{
+  void *p;
+
+  void f0 (void *q) { p = q; }
+  void f1 (void *q) { f0 (q); }
+
+  void g0 (void) {
+__builtin_free (p);   // { dg-warning "\\\[-Wfree-nonheap-object" }
+  }
+  void g1 (void) { g0 (); }
+};
+
+struct A2
+{
+  void *p;
+
+  void f0 (void *q) { p = q; }
+  void f1 (void *q) { f0 (q); }
+  void f2 (void *q) { f1 (q); }
+
+  void g0 (void) {
+__builtin_free (p);   // { dg-warning "\\\[-Wfree-nonheap-object" }
+  }
+  void g1 (void) { g0 (); }
+  void g2 (void) { g1 (); }
+};
+
+# 47 "Wfree-nonheap-object-5.C"
+
+#define NOIPA __attribute__ ((noipa))
+
+extern int array[];
+
+/* Verify the warning is issued even for calls in a system header inlined
+   into a function outside the header.  */
+
+NOIPA void warn_g0 (struct A0 *p)
+{
+  int *q = array + 1;
+
+  p->f0 (q);
+  p->g0 ();
+}
+
+// { dg-message "inlined from 'void warn_g0\\(A0\\*\\)'" "" { target *-*-* } 0 }
+
+
+/* Also verify the warning can be suppressed.  */
+
+NOIPA void nowarn_g0 (struct A0 *p)
+{
+  int *q = array + 2;
+
+  p->f0 (q);
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wfree-nonheap-object"
+  p->g0 ();
+#pragma GCC diagnostic pop
+}
+
+
+NOIPA void warn_g1 (struct A1 *p)
+{
+  int *q = array + 3;
+
+  p->f1 (q);
+  p->g1 ();
+}
+
+// { dg-message "inlined from 'void A1::g1\\(\\)'" "" { target *-*-* } 0 }
+// { dg-message "inlined from 'void warn_g1\\(A1\\*\\)'" "" { target *-*-* } 0 }
+
+
+NOIPA void nowarn_g1 (struct A2 *p)
+{
+  int *q = array + 4;
+
+  p->f1 (q);
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wfree-nonheap-object"
+  p->g1 ();
+#pragma GCC diagnostic pop
+}
+
+
+NOIPA void warn_g2 (struct A2 *p)
+{
+  int *q = array + 5;
+
+  p->f2 (q);
+  p->g2 ();
+}
+
+// { dg-message "inlined from 'void A2::g1\\(\\)'" "" { target *-*-* } 0 }
+// { dg-message "inlined from 'void A2::g2\\(\\)'" "" { target *-*-* } 0 }
+// { dg-message "inlined from 'void warn_g2\\(A2\\*\\)'" "" { target *-*-* } 0 }
+
+
+NOIPA void nowarn_g2 (struct A2 *p)
+{
+  int *q = array + 6;
+
+  p->f2 (q);
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wfree-nonheap-object"
+  p->g2 ();
+#pragma GCC diagnostic pop
+}
diff --git a/gcc/testsuite/gcc.dg/Wfree-nonheap-object-4.c b/gcc/testsuite/gcc.dg/Wfree-nonheap-object-4.c
new file mode 100644
index 000..a7d921248c4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wfree-nonheap-object-4.c
@@ -0,0 +1,107 @@
+/* PR middle-end/98664 - inconsistent --Wfree-nonheap-object for inlined
+   calls to system headers
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+# 7 "Wfree-nonheap-object-4.h" 1 3
+
+struct A
+{
+  void *p;
+};
+
+void f0 (struct A *p, void *q) { p->p = q; }
+void f1 (struct A *p, void *q) { f0

[PATCH 1/2] nios2: Add -mcustom-fpu-cfg=fph2

2021-01-14 Thread Sebastian Huber

The new -mcustom-fpu-cfg=fph2 option variant is useful to build a
multilib for the "Nios II Floating Point Hardware 2 Component":

https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ug/ug_nios2_custom_instruction.pdf

Directly using the corresponding -mcustom-insn=N options for this
floating-point unit leads to a combinatorial explosion in the potential
count of multilibs which may break the build.

The following instructions supported by this component are not enabled
by this option:

* -mcustom-fmins
* -mcustom-fmaxs
* -mcustom-round

The reason is that these instructions are only in effect in combination
with other options. If they are not set, then a build error occurs in
libatomic since -Werror is used for building this library:

cc1: error: switch '-mcustom-fmins' has no effect unless '-ffinite-math-only' is specified [-Werror]
cc1: error: switch '-mcustom-fmaxs' has no effect unless '-ffinite-math-only' is specified [-Werror]
cc1: error: switch '-mcustom-round' has no effect unless '-fno-math-errno' is specified [-Werror]

gcc/

* config/nios2/nios2.c (NIOS2_FPU_CONFIG_NUM): Adjust value.
(nios2_init_fpu_configs): Provide register values for new
-mcustom-fpu-cfg=fph2 option variant.
* doc/invoke.texi (-mcustom-fpu-cfg=fph2): Document new option
variant.
---
 gcc/config/nios2/nios2.c | 20 +++-
 gcc/doc/invoke.texi  | 25 +
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/gcc/config/nios2/nios2.c b/gcc/config/nios2/nios2.c
index 3bffabe9856..fc9c8b70807 100644
--- a/gcc/config/nios2/nios2.c
+++ b/gcc/config/nios2/nios2.c
@@ -1236,7 +1236,7 @@ struct nios2_fpu_config
   int code[n2fpu_code_num];
 };
 
-#define NIOS2_FPU_CONFIG_NUM 3
+#define NIOS2_FPU_CONFIG_NUM 4
 static struct nios2_fpu_config custom_fpu_config[NIOS2_FPU_CONFIG_NUM];
 
 static void
@@ -1280,6 +1280,24 @@ nios2_init_fpu_configs (void)
   cfg->code[n2fpu_fsubs]   = 254;
   cfg->code[n2fpu_fdivs]   = 255;
 
+  NEXT_FPU_CONFIG;
+  cfg->name = "fph2";
+  cfg->code[n2fpu_fabss]   = 224;
+  cfg->code[n2fpu_fnegs]   = 225;
+  cfg->code[n2fpu_fcmpnes] = 226;
+  cfg->code[n2fpu_fcmpeqs] = 227;
+  cfg->code[n2fpu_fcmpges] = 228;
+  cfg->code[n2fpu_fcmpgts] = 229;
+  cfg->code[n2fpu_fcmples] = 230;
+  cfg->code[n2fpu_fcmplts] = 231;
+  cfg->code[n2fpu_fixsi]   = 249;
+  cfg->code[n2fpu_floatis] = 250;
+  cfg->code[n2fpu_fsqrts]  = 251;
+  cfg->code[n2fpu_fmuls]   = 252;
+  cfg->code[n2fpu_fadds]   = 253;
+  cfg->code[n2fpu_fsubs]   = 254;
+  cfg->code[n2fpu_fdivs]   = 255;
+
 #undef NEXT_FPU_CONFIG
   gcc_assert (i == NIOS2_FPU_CONFIG_NUM);
 }
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 298f1f873e3..91fd980550f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -26121,6 +26121,23 @@ Currently, the following sets are defined:
 -mcustom-fdivs=255 @gol
 -fsingle-precision-constant}
 
+@option{-mcustom-fpu-cfg=fph2} is equivalent to:
+@gccoptlist{-mcustom-fabss=224 @gol
+-mcustom-fnegs=225 @gol
+-mcustom-fcmpnes=226 @gol
+-mcustom-fcmpeqs=227 @gol
+-mcustom-fcmpges=228 @gol
+-mcustom-fcmpgts=229 @gol
+-mcustom-fcmples=230 @gol
+-mcustom-fcmplts=231 @gol
+-mcustom-fixsi=249 @gol
+-mcustom-floatis=250 @gol
+-mcustom-fsqrts=251 @gol
+-mcustom-fmuls=252 @gol
+-mcustom-fadds=253 @gol
+-mcustom-fsubs=254 @gol
+-mcustom-fdivs=255 @gol}
+
 Custom instruction assignments given by individual
 @option{-mcustom-@var{insn}=} options override those given by
 @option{-mcustom-fpu-cfg=}, regardless of the
@@ -26131,6 +26148,14 @@ configuration by using the @code{target("custom-fpu-cfg=@var{name}")}
 function attribute (@pxref{Function Attributes})
 or pragma (@pxref{Function Specific Option Pragmas}).
 
+The name @var{fph2} is an abbreviation for @emph{Nios II Floating Point
+Hardware 2 Component}.  This component supports also the custom instructions
+@option{-mcustom-fmins=233}, @option{-mcustom-fmaxs=234}, and
+@option{-mcustom-round=248}.  These options are not enabled by
+@option{-mcustom-fpu-cfg=fph2} since they are only in effect if other options
+are enabled.  In contrast to the other configurations,
+@option{-fsingle-precision-constant} is not set.
+
 @end table
 
 These additional @samp{-m} options are available for the Altera Nios II
-- 
2.26.2

[PATCH 2/2] RTEMS: Add -mcustom-fpu-cfg=fph2 multilib

2021-01-14 Thread Sebastian Huber

This multilib supports Nios II configurations with the "Nios II Floating
Point Hardware 2 Component".

gcc/

* config/nios2/t-rtems: Reset all MULTILIB_* variables.  Shorten
multilib directory names.  Use MULTILIB_REQUIRED instead of
MULTILIB_EXCEPTIONS.  Add -mhw-mul -mhw-mulx -mhw-div
-mcustom-fpu-cfg=fph2 multilib.
---
 gcc/config/nios2/t-rtems | 146 +--
 1 file changed, 18 insertions(+), 128 deletions(-)

diff --git a/gcc/config/nios2/t-rtems b/gcc/config/nios2/t-rtems
index f95fa3c4717..beda8328bd2 100644
--- a/gcc/config/nios2/t-rtems
+++ b/gcc/config/nios2/t-rtems
@@ -1,133 +1,23 @@
 # Custom RTEMS multilibs
 
-MULTILIB_OPTIONS = mhw-mul mhw-mulx mhw-div mcustom-fadds=253 mcustom-fdivs=255 mcustom-fmuls=252 mcustom-fsubs=254
+# Reset all MULTILIB variables
+
+MULTILIB_OPTIONS   =
+MULTILIB_DIRNAMES  =
+MULTILIB_EXCEPTIONS=
+MULTILIB_REUSE =
+MULTILIB_MATCHES   =
+MULTILIB_REQUIRED  =
 
 # Enumeration of multilibs
 
-# MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fsubs=254
-# MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fdivs=255/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fdivs=255/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fdivs=255
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fdivs=255
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fdivs=255/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fdivs=255/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fdivs=255
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fmuls=252
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += mhw-mul/mhw-div
-MULTILIB_EXCEPTIONS += mhw-mul/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
-MULTILIB_EXCEPTIONS += 
mhw

[gcn offloading] Only supported in 64-bit configurations (was: [PATCH 7/7 libgomp,amdgcn] GCN Libgomp Plugin)

Hi!

On 2019-11-12T13:29:16+, Andrew Stubbs  wrote:
> This patch contributes the GCN libgomp plugin, with the various
> configure and make bits to go with it.

> --- a/libgomp/plugin/configfrag.ac
> +++ b/libgomp/plugin/configfrag.ac

> +  amdgcn*)
> + case "${target}" in
> +   x86_64-*-*)
> + case " ${CC} ${CFLAGS} " in
> +   *" -m32 "*)
> + PLUGIN_GCN=0

That means, for good reasons, the GCN libgomp plugin is only built in
64-bit configurations.

However, in a (standard) bi-arch x86_64-pc-linux-gnu '-m64'/'-m32' build,
the compiler will still attempt 32-bit GCN offloading code generation,
which will often fail horribly (several classes of ICEs), is untested,
and not intended to be supported, as Andrew confirmed to me months ago.
So, we shouldn't try to do that; similar to nvptx offloading, see PR65099
"nvptx offloading: hard-coded 64-bit assumptions".

As obvious, I've just pushed "[gcn offloading] Only supported in 64-bit
configurations" to master branch in commit
505caa7295b93ecdec8ac9b31595eb34dbd48c9f, and cherry-picked into
releases/gcc-10 branch in commit
d697bf91a5457dfb06b4112b89dec2e43f472830, see attached.


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
From 505caa7295b93ecdec8ac9b31595eb34dbd48c9f Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 28 Apr 2020 20:43:38 +0200
Subject: [PATCH] [gcn offloading] Only supported in 64-bit configurations

Similar to nvptx offloading, see PR65099 "nvptx offloading: hard-coded 64-bit
assumptions".

	gcc/
	* config/gcn/mkoffload.c (main): Create an offload image only in
	64-bit configurations.
---
 gcc/config/gcn/mkoffload.c | 260 +++--
 1 file changed, 134 insertions(+), 126 deletions(-)

diff --git a/gcc/config/gcn/mkoffload.c b/gcc/config/gcn/mkoffload.c
index 7d00aaf507e..eb1c717e6e9 100644
--- a/gcc/config/gcn/mkoffload.c
+++ b/gcc/config/gcn/mkoffload.c
@@ -755,11 +755,6 @@ main (int argc, char **argv)
   FILE *cfile = stdout;
   const char *outname = 0;
 
-  const char *gcn_s1_name;
-  const char *gcn_s2_name;
-  const char *gcn_o_name;
-  const char *gcn_cfile_name;
-
   progname = "mkoffload";
   diagnostic_initialize (global_dc, 0);
 
@@ -905,145 +900,158 @@ main (int argc, char **argv)
   if (!dumppfx)
 dumppfx = outname;
 
-  const char *mko_dumpbase = concat (dumppfx, ".mkoffload", NULL);
-  const char *hsaco_dumpbase = concat (dumppfx, ".mkoffload.hsaco", NULL);
   gcn_dumpbase = concat (dumppfx, ".c", NULL);
 
+  const char *gcn_cfile_name;
   if (save_temps)
-{
-  gcn_s1_name = concat (mko_dumpbase, ".1.s", NULL);
-  gcn_s2_name = concat (mko_dumpbase, ".2.s", NULL);
-  gcn_o_name = hsaco_dumpbase;
-  gcn_cfile_name = gcn_dumpbase;
-}
+gcn_cfile_name = gcn_dumpbase;
   else
-{
-  gcn_s1_name = make_temp_file (".mkoffload.1.s");
-  gcn_s2_name = make_temp_file (".mkoffload.2.s");
-  gcn_o_name = make_temp_file (".mkoffload.hsaco");
-  gcn_cfile_name = make_temp_file (".c");
-}
-  obstack_ptr_grow (&files_to_cleanup, gcn_s1_name);
-  obstack_ptr_grow (&files_to_cleanup, gcn_s2_name);
-  obstack_ptr_grow (&files_to_cleanup, gcn_o_name);
+gcn_cfile_name = make_temp_file (".c");
   obstack_ptr_grow (&files_to_cleanup, gcn_cfile_name);
 
-  obstack_ptr_grow (&cc_argv_obstack, "-dumpdir");
-  obstack_ptr_grow (&cc_argv_obstack, "");
-  obstack_ptr_grow (&cc_argv_obstack, "-dumpbase");
-  obstack_ptr_grow (&cc_argv_obstack, mko_dumpbase);
-  obstack_ptr_grow (&cc_argv_obstack, "-dumpbase-ext");
-  obstack_ptr_grow (&cc_argv_obstack, "");
-
-  obstack_ptr_grow (&cc_argv_obstack, "-o");
-  obstack_ptr_grow (&cc_argv_obstack, gcn_s1_name);
-  obstack_ptr_grow (&cc_argv_obstack, NULL);
-  const char **cc_argv = XOBFINISH (&cc_argv_obstack, const char **);
-
-  /* Build arguments for assemble/link pass.  */
-  struct obstack ld_argv_obstack;
-  obstack_init (&ld_argv_obstack);
-  obstack_ptr_grow (&ld_argv_obstack, driver);
-
-  /* Extract early-debug information from the input objects.
- This loop finds all the inputs that end ".o" and aren't the output.  */
-  int dbgcount = 0;
-  for (int ix = 1; ix != argc; ix++)
+  cfile = fopen (gcn_cfile_name, "w");
+  if (!cfile)
+fatal_error (input_location, "cannot open '%s'", gcn_cfile_name);
+
+  /* Currently, we only support offloading in 64-bit configurations.  */
+  if (offload_abi == OFFLOAD_ABI_LP64)
 {
-  if (!strcmp (argv[ix], "-o") && ix + 1 != argc)
-	++ix;
+  const char *mko_dumpbase = concat (dumppfx, ".mkoffload", NULL);
+  const char *hsaco_dumpbase = concat (dumppfx, ".mkoffload.hsaco", NULL);
+
+  const char *gcn_s1_name;
+  const char *gcn_s2_name;
+  const char *gcn_o_name;
+  if (save_temps)
+	{
+	  gcn_s1_name = concat (mko_dumpbase, ".1.s", NULL);
+	  gcn

Re: [PATCH] libstdc++/98466 Fix _GLIBCXX_DEBUG N3644 integration


On 14/01/21 19:33 +0100, François Dumont wrote:

On 14/01/21 6:10 pm, Jonathan Wakely wrote:

On 01/01/21 18:51 +0100, François Dumont via Libstdc++ wrote:
I think the PR is not limited to unordered container iterators; it 
impacts all _GLIBCXX_DEBUG iterators.


However unordered containers local_iterator was more complicated 
to handle. Because of c++/65816 I prefer to review 
_Node_iterator_default constructor to set _M_cur to nullptr even 
if in principle it is not necessary except for the 
_Local_iterator_base constructor when hash code is not cached.


    libstdc++: Implement N3644 for _GLIBCXX_DEBUG iterators

    libstdc++-v3/ChangeLog

            PR libstdc++/98466
            * include/bits/hashtable_policy.h (_Node_iterator_base()): Set
            _M_cur to nullptr.
            (_Node_iterator()): Make default.
            (_Node_const_iterator()): Make default.
            * include/debug/macros.h (__glibcxx_check_erae_range_after): Add
            _M_singular iterator checks.
            * include/debug/safe_iterator.h
            (_GLIBCXX_DEBUG_VERIFY_OPERANDS): Accept if both iterator are
            value initialized.
            * include/debug/safe_local_iterator.h
            (_GLIBCXX_DEBUG_VERIFY_OPERANDS): Likewise.
            * include/debug/safe_iterator.tcc
            (_Safe_iterator<>::_M_valid_range): Add _M_singular checks on
            input iterators.
            * src/c++11/debug.cc (_Safe_iterator_base::_M_can_compare):
            Remove _M_singular checks.
            * testsuite/23_containers/deque/debug/98466.cc: New test.
            * testsuite/23_containers/unordered_map/debug/98466.cc: New test.


Tested under Linux x86_64 normal and debug mode.

Ok to commit ?


Yes, thanks.

One question about the deque test ...


diff --git a/libstdc++-v3/testsuite/23_containers/deque/debug/98466.cc b/libstdc++-v3/testsuite/23_containers/deque/debug/98466.cc

new file mode 100644
index 000..720977e5622
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/deque/debug/98466.cc
@@ -0,0 +1,38 @@
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do run { target c++11 } }


Does this need to be limited to c++11 and later? Could it just use
{ dg-do run } instead?


Good point, a bad copy/paste from the unordered test I guess.

But I tried to remove it and it complained about invalid '{}' syntax in 
C++98 for the iterator value initialization. I tried replacing it with 
'()', but that is ambiguous with a function declaration, so I gave up!


As N3644 is talking about value initialization it doesn't sound that 
bad to limit it to C++11, isn't it a C++11 concept ?


No, it was added in C++03.

You can do this to avoid it being treated as a function declaration:

__gnu_debug::deque::iterator it = __gnu_debug::deque::iterator();


OK to commit with that change if it passes in C++98 mode.
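
A minimal sketch of the suggestion above, assuming plain std::deque<int> instead of the __gnu_debug container (the helper names are purely illustrative and not part of the test): copy-initializing from a value-initialized temporary avoids both the C++11-only '{}' syntax and the function-declaration ambiguity of '()'.

```cpp
#include <cassert>
#include <deque>

// Value-initialize an iterator with syntax that is also valid in C++98.
// Writing `std::deque<int>::iterator it();` would declare a function
// named `it` instead of an object.
inline std::deque<int>::iterator make_value_initialized()
{
  std::deque<int>::iterator it = std::deque<int>::iterator();
  return it;
}

// N3644 (adopted for C++14): value-initialized forward iterators of the
// same type compare equal to each other.
inline bool value_initialized_iterators_equal()
{
  std::deque<int>::iterator a = make_value_initialized();
  std::deque<int>::iterator b = make_value_initialized();
  return a == b;
}

// Illustrative self-check (hypothetical helper, not from the patch).
inline void check_n3644()
{
  assert(value_initialized_iterators_equal());
}
```

The comparison is only guaranteed by the standard from C++14 on; whether it also holds in C++98 mode depends on the library implementation, which is what the new test would exercise.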

Re: [PATCH] libgomp_g.h: Include stdint.h instead of gstdint.h

Hi!

On 2019-09-30T00:03:00-0700, Frederik Harwath  wrote:
> The patch changes libgomp/libgomp_g.h to include stdint.h instead of the 
> internal gstdint.h. The inclusion of gstdint.h has been
> introduced by GCC trunk r265930, presumably because this introduced uses of 
> uintptr_t. Since gstdint.h is not part of GCC's
> installation, several libgomp test cases fail to compile when running the 
> tests with the installed GCC.

This got into Subversion trunk in time for GCC 10, but is also necessary
for GCC 9; I've thus just pushed to releases/gcc-9 branch "libgomp_g.h:
Include stdint.h instead of gstdint.h" in commit
8d491db06a606f45d7c46e219fc075a3fea22e32, see attached.


Grüße
 Thomas


From 8d491db06a606f45d7c46e219fc075a3fea22e32 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Mon, 30 Sep 2019 14:16:34 +
Subject: [PATCH] libgomp_g.h: Include stdint.h instead of gstdint.h.

2019-09-30  Kwok Cheung Yeung  

	* libgomp_g.h: Include stdint.h instead of gstdint.h.

(cherry picked from commit d7f9ee981f32bdbc6916cb8c6b9435cfc06f88a0, r276301)
---
 libgomp/libgomp_g.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgomp/libgomp_g.h b/libgomp/libgomp_g.h
index 32a9d8aade9..dfb55fb66dc 100644
--- a/libgomp/libgomp_g.h
+++ b/libgomp/libgomp_g.h
@@ -31,7 +31,7 @@
 
 #include 
 #include 
-#include "gstdint.h"
+#include <stdint.h>
 
 /* barrier.c */
 
-- 
2.17.1

[pushed] c++: Tweak g++.dg/template/pr98372.C.

2021-01-14 Thread Marek Polacek via Gcc-patches

This test was failing in C++11 because variable templates are only
available in C++14.
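
For illustration only, a small sketch of the language feature involved (the names fits_in_pointer and fits_in_pointer_t are made up here and do not come from pr98372.C): a variable template needs C++14, while C++11 code has to fall back to a class template with a static member.

```cpp
#include <cassert>

#if __cplusplus >= 201402L
// Variable template: accepted only from C++14 onward.
template <typename T>
constexpr bool fits_in_pointer = sizeof(T) <= sizeof(void *);

inline bool char_fits_in_pointer() { return fits_in_pointer<char>; }
#else
// C++11 fallback: a class template carrying the value as a static member.
template <typename T>
struct fits_in_pointer_t
{
  static constexpr bool value = sizeof(T) <= sizeof(void *);
};

inline bool char_fits_in_pointer() { return fits_in_pointer_t<char>::value; }
#endif

// Illustrative self-check (hypothetical helper).
inline void check_variable_template()
{
  assert(char_fits_in_pointer());
}
```

In a testsuite file the same restriction is expressed with the { target c++14 } selector rather than a preprocessor guard.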

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/testsuite/ChangeLog:

* g++.dg/template/pr98372.C: Only run in C++14 and up.
---
 gcc/testsuite/g++.dg/template/pr98372.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/template/pr98372.C 
b/gcc/testsuite/g++.dg/template/pr98372.C
index f1e8b0f3323..054c94d9edb 100644
--- a/gcc/testsuite/g++.dg/template/pr98372.C
+++ b/gcc/testsuite/g++.dg/template/pr98372.C
@@ -1,5 +1,5 @@
 // PR 98372 ICE due to incorrect type compare
-// { dg-do compile { target c++11 } }
+// { dg-do compile { target c++14 } }
 
 template  using remove_pointer_t = typename _Tp ::type;
 template  struct enable_if;

base-commit: bdd1b1f55529da00b867ef05a53a08fbfc3d1c2e
-- 
2.29.2

Re: [PATCH v2] Add --ld-path= to specify an arbitrary executable as the linker

2021-01-14 Thread Fangrui Song via Gcc-patches


On 2021-01-14, Martin Liška wrote:

On 1/14/21 11:07 AM, Richard Biener wrote:

I see no particular reason to allow arbitrary garbage to be used as
linker.  It just asks for users to shoot themselves in the foot and
for strange bugreports to pop up.


Well, for a strange bug report, we'll see eventually usage of the --ld-path= 
option.

I see it handy when developing a ld feature to be able to point to a built ld
(without need to build GCC with it). Yes, one can use --save-temps --verbose
and invoke the built linker, but it's not handy.

Martin



I did this when I worked on some GNU ld features.
clang --ld-path=/path/to/binutils/out/debug/ld/ld-new
or debugging some Linux kernel issues related to ld.

Having --ld-path= in GCC will be handy.

Re: [r11-6672 Regression] Failed to bootstrap on Linux/x86_64

On Thu, Jan 14, 2021 at 10:52:24AM -0800, sunil.k.pandey via Gcc-patches wrote:
> On Linux/x86_64,

It breaks x86_64-linux build pretty much everywhere.
libatomic (but as well libgomp and libitm) uses -march=i486 in certain cases.
While for --with-arch or --with-arch_32 configured compilers it wouldn't be
that hard to just check if the provided arch isn't i386 only, we shouldn't
stop supporting compiler built with the defaults and I'm afraid that means
-march=i386 by default for 32-bit code.

Jakub

Re: [PATCH v2] Add --ld-path= to specify an arbitrary executable as the linker

On Thu, Jan 14, 2021 at 12:21:05PM -0800, Fangrui Song wrote:
> On 2021-01-14, Martin Liška wrote:
> > On 1/14/21 11:07 AM, Richard Biener wrote:
> > > I see no particular reason to allow arbitrary garbage to be used as
> > > linker.  It just asks for users to shoot themselves in the foot and
> > > for strange bugreports to pop up.
> > 
> > Well, for a strange bug report, we'll see eventually usage of the 
> > --ld-path= option.
> > 
> > I see it handy when developing a ld feature to be able to point to a built 
> > ld
> > (without need to build GCC with it). Yes, one can use --save-temps --verbose
> > and invoke the built linker, but it's not handy.
> > 
> > Martin
> > 
> 
> I did this when I worked on some GNU ld features.
> clang --ld-path=/path/to/binutils/out/debug/ld/ld-new
> or debugging some Linux kernel issues related to ld.
> 
> Having --ld-path= in GCC will be handy.

If the linker is called ld and there isn't random unrelated stuff in the
same directory, one can always just use -B path/to/ld/

Jakub

Re: [r11-6672 Regression] Failed to bootstrap on Linux/x86_64

On Thu, Jan 14, 2021 at 12:33 PM Jakub Jelinek  wrote:
>
> On Thu, Jan 14, 2021 at 10:52:24AM -0800, sunil.k.pandey via Gcc-patches 
> wrote:
> > On Linux/x86_64,
>
> It breaks x86_64-linux build pretty much everywhere.
> libatomic (but as well libgomp and libitm) uses -march=i486 in certain cases.
> While for --with-arch or --with-arch_32 configured compilers it wouldn't be
> that hard to just check if the provided arch isn't i386 only, we shouldn't
> stop supporting compiler built with the defaults and I'm afraid that means
> -march=i386 by default for 32-bit code.
>
> Jakub

This is an old bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70454

I am reviving my old patches now.

-- 
H.J.

Re: [PATCH] combine: zeroing cost for new copies

2021-01-14 Thread Segher Boessenkool

Hi!

On Wed, Dec 09, 2020 at 05:49:53PM +0800, Kewen.Lin wrote:
> This patch is to treat those new pseudo-to-pseudo copies
> after hard-reg-to-pseudo-copy as zero costs.  The
> justification is that these new copies are closely after
> the corresponding hard-reg-to-pseudo-copy insns, register
> allocation should be able to coalesce them and get them
> eliminated.

Costing things that are not free as cost zero is very problematic.
Cost zero is problematic in combine anyway (it means unknown cost, not
no cost).

> Now these copies follow the normal costing scheme, the
> below case dump shows the unexpected combination:
> 
> ``` dump
> 
> Trying 3, 2 -> 13:
> 3: r119:DI=r132:DI
>   REG_DEAD r132:DI
> 2: r118:DI=r131:DI
>   REG_DEAD r131:DI
>13: r128:DI=r118:DI&0x|r119:DI<<0x20
>   REG_DEAD r119:DI
>   REG_DEAD r118:DI

This should not combine if 2+13 and 3+13 do not combine already.  Why
did those not combine?

> Failed to match this instruction:
> (set (reg:DI 128)
> (ior:DI (ashift:DI (reg:DI 132)
> (const_int 32 [0x20]))
> (reg:DI 131)))

Likely because it results in this, and this insn isn't recognised.  So
this can be fixed by adding a pattern for it (it needs to make sure all
but the bottom 32 bits of reg 131 are zero; it can use nonzero_bits for
that).
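
The arithmetic behind that condition can be sketched in plain C++ (a hedged illustration with hypothetical helper names, not the RTL itself): when every possibly-nonzero bit of the low operand lies below the shift count, masking it is a no-op, and IOR, PLUS and XOR of the two parts all give the same result.

```cpp
#include <cassert>
#include <cstdint>

// True when all possibly-set bits of the low operand lie below bit `shift`.
// This mirrors a condition of the form nonzero_bits < 1 << shift.
inline bool fits_below_shift(std::uint64_t low_nonzero_bits, unsigned shift)
{
  return shift < 64 && low_nonzero_bits < (std::uint64_t{1} << shift);
}

// The form without the AND: (high << shift) | low.
inline std::uint64_t insert_without_mask(std::uint64_t high,
                                         std::uint64_t low, unsigned shift)
{
  return (high << shift) | low;
}

// The form with an explicit AND of the low part against the low-bit mask.
inline std::uint64_t insert_with_mask(std::uint64_t high,
                                      std::uint64_t low, unsigned shift)
{
  const std::uint64_t mask = (std::uint64_t{1} << shift) - 1;
  return (low & mask) | (high << shift);
}

// With disjoint bit ranges, +, | and ^ are interchangeable.
inline bool plus_ior_xor_agree(std::uint64_t high, std::uint64_t low,
                               unsigned shift)
{
  const std::uint64_t h = high << shift;
  return (h | low) == (h + low) && (h | low) == (h ^ low);
}

// Illustrative self-check (hypothetical helper).
inline void check_insert_forms()
{
  assert(fits_below_shift(0x9abcdef0ULL, 32));
  assert(insert_with_mask(0x12345678ULL, 0x9abcdef0ULL, 32)
         == insert_without_mask(0x12345678ULL, 0x9abcdef0ULL, 32));
}
```

If the low operand can have bits at or above the shift count, the two forms differ, which is why the guard is needed.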

Long ago I had the following patch for this.  Not sure why I never
submitted it, maybe there is something wrong with it?

Segher

=
From 04c44ad71941310c84b376744cfbcc87c93a8d68 Mon Sep 17 00:00:00 2001
Message-Id: 
<04c44ad71941310c84b376744cfbcc87c93a8d68.1528751010.git.seg...@kernel.crashing.org>
From: Segher Boessenkool 
Date: Mon, 11 Jun 2018 20:46:31 +
Subject: [PATCH] rs6000: Add a splitter for a rl*imi case

An rl*imi is usually written as an IOR of an ASHIFT or similar, and an
AND of a register with a constant mask.  In some cases combine knows
that that AND doesn't do anything (because all zero bits in that mask
correspond to bits known to be already zero), and then no pattern
matches.  This patch adds a define_split for such cases.  It uses
nonzero_bits in the condition of the splitter, but does not need it
afterwards for the instruction to be recognised.  This is necessary
because later passes can see fewer nonzero_bits.

Because it is a splitter, combine will only use it when starting with
three insns (or more), even though the result is just one.  This isn't
a huge problem in practice, but some possible combinations still won't
happen.

---
 gcc/config/rs6000/rs6000.md | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 38555f5..bc9781b 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -3920,6 +3920,24 @@ (define_insn "*rotl<mode>3_insert_3"
 }
   [(set_attr "type" "insert")])

+(define_code_iterator plus_ior_xor [plus ior xor])
+
+(define_split
+  [(set (match_operand:GPR 0 "gpc_reg_operand")
+   (plus_ior_xor:GPR (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand")
+ (match_operand:SI 2 "const_int_operand"))
+ (match_operand:GPR 3 "gpc_reg_operand")))]
+  "nonzero_bits (operands[3], <MODE>mode)
+   < HOST_WIDE_INT_1U << INTVAL (operands[2])"
+  [(set (match_dup 0)
+   (ior:GPR (and:GPR (match_dup 3)
+ (match_dup 4))
+(ashift:GPR (match_dup 1)
+(match_dup 2))))]
+{
+  operands[4] = GEN_INT ((HOST_WIDE_INT_1U << INTVAL (operands[2])) - 1);
+})
+
 (define_insn "*rotl<mode>3_insert_4"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
(ior:GPR (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
-- 
1.8.3.1

[committed] analyzer: const fixes [PR98679]

2021-01-14 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-6689-g8a18261afd923151b8d2a37f667e4673b27acd3f

gcc/analyzer/ChangeLog:
PR analyzer/98679
* analyzer.h (region_offset::operator==): Make const.
* pending-diagnostic.h (pending_diagnostic::equal_p): Likewise.
* store.h (binding_cluster::for_each_value): Likewise.
(binding_cluster::for_each_binding): Likewise.
---
 gcc/analyzer/analyzer.h   | 2 +-
 gcc/analyzer/pending-diagnostic.h | 2 +-
 gcc/analyzer/store.h  | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
index f603802c0d6..6996092717c 100644
--- a/gcc/analyzer/analyzer.h
+++ b/gcc/analyzer/analyzer.h
@@ -169,7 +169,7 @@ public:
 return m_offset;
   }
 
-  bool operator== (const region_offset &other)
+  bool operator== (const region_offset &other) const
   {
 return (m_base_region == other.m_base_region
&& m_offset == other.m_offset
diff --git a/gcc/analyzer/pending-diagnostic.h 
b/gcc/analyzer/pending-diagnostic.h
index 79dc83edc32..571fc1b56b9 100644
--- a/gcc/analyzer/pending-diagnostic.h
+++ b/gcc/analyzer/pending-diagnostic.h
@@ -157,7 +157,7 @@ class pending_diagnostic
   /* Compare for equality with OTHER, which might be of a different
  subclass.  */
 
-  bool equal_p (const pending_diagnostic &other)
+  bool equal_p (const pending_diagnostic &other) const
   {
 /* Check for pointer equality on the IDs from get_kind.  */
 if (get_kind () != other.get_kind ())
diff --git a/gcc/analyzer/store.h b/gcc/analyzer/store.h
index 366439ce2dd..2bcef6c398a 100644
--- a/gcc/analyzer/store.h
+++ b/gcc/analyzer/store.h
@@ -425,7 +425,7 @@ public:
 
  template <typename T>
   void for_each_value (void (*cb) (const svalue *sval, T user_data),
-  T user_data)
+  T user_data) const
   {
 for (map_t::iterator iter = m_map.begin (); iter != m_map.end (); ++iter)
   cb ((*iter).second, user_data);
@@ -459,7 +459,7 @@ public:
   const svalue *maybe_get_simple_value (store_manager *mgr) const;
 
  template <typename BindingVisitor>
-  void for_each_binding (BindingVisitor &v)
+  void for_each_binding (BindingVisitor &v) const
   {
 for (map_t::iterator iter = m_map.begin (); iter != m_map.end (); ++iter)
   {
-- 
2.26.2

[committed] analyzer: fixes to -fdump-analyzer-json

2021-01-14 Thread David Malcolm via Gcc-patches

I've been implementing a PyGTK viewer for the output of
-fdump-analyzer-json, to help me debug analyzer issues:
  https://github.com/davidmalcolm/gcc-analyzer-viewer
The viewer is very much just a work in progress.

This patch adds some fields that were missing from the dump, and
fixes some mistakes I spotted whilst working on the viewer.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as dea4a32b24fb888532c47f3920f6910b3c94a8a0.

gcc/analyzer/ChangeLog:
* engine.cc (strongly_connected_components::to_json): New.
(worklist::to_json): New.
(exploded_graph::to_json): JSON-ify the worklist.
* exploded-graph.h (strongly_connected_components::to_json): New
decl.
(worklist::to_json): New decl.
* store.cc (store::to_json): Fix comment.
* supergraph.cc (supernode::to_json): Fix reference to
"returning_call" in comment.  Add optional "fun" to JSON.
(edge_kind_to_string): New.
(superedge::to_json): Add "kind" to JSON.
---
 gcc/analyzer/engine.cc| 29 -
 gcc/analyzer/exploded-graph.h |  4 
 gcc/analyzer/store.cc |  2 +-
 gcc/analyzer/supergraph.cc| 29 +++--
 4 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 8bc9adf5ee6..fc81e7523fb 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -1772,6 +1772,17 @@ strongly_connected_components::dump () const
 }
 }
 
+/* Return a new json::array of per-snode SCC ids.  */
+
+json::array *
+strongly_connected_components::to_json () const
+{
+  json::array *scc_arr = new json::array ();
+  for (int i = 0; i < m_sg.num_nodes (); i++)
+scc_arr->append (new json::integer_number (get_scc_id (i)));
+  return scc_arr;
+}
+
 /* Subroutine of strongly_connected_components's ctor, part of Tarjan's
SCC algorithm.  */
 
@@ -1968,6 +1979,22 @@ worklist::key_t::cmp (const worklist::key_t &ka, const 
worklist::key_t &kb)
   return ka.m_enode->m_index - kb.m_enode->m_index;
 }
 
+/* Return a new json::object of the form
+   {"scc" : [per-snode-IDs]},  */
+
+json::object *
+worklist::to_json () const
+{
+  json::object *worklist_obj = new json::object ();
+
+  worklist_obj->set ("scc", m_scc.to_json ());
+
+  /* The following field isn't yet being JSONified:
+ queue_t m_queue;  */
+
+  return worklist_obj;
+}
+
 /* exploded_graph's ctor.  */
 
 exploded_graph::exploded_graph (const supergraph &sg, logger *logger,
@@ -3315,10 +3342,10 @@ exploded_graph::to_json () const
   /* m_sg is JSONified at the top-level.  */
 
   egraph_obj->set ("ext_state", m_ext_state.to_json ());
+  egraph_obj->set ("worklist", m_worklist.to_json ());
   egraph_obj->set ("diagnostic_manager", m_diagnostic_manager.to_json ());
 
   /* The following fields aren't yet being JSONified:
- worklist m_worklist;
  const state_purge_map *const m_purge_map;
  const analysis_plan &m_plan;
  stats m_global_stats;
diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-graph.h
index 84f8862fb96..7ce1e85800d 100644
--- a/gcc/analyzer/exploded-graph.h
+++ b/gcc/analyzer/exploded-graph.h
@@ -622,6 +622,8 @@ public:
 
   void dump () const;
 
+  json::array *to_json () const;
+
 private:
   struct per_node_data
   {
@@ -664,6 +666,8 @@ public:
 return m_scc.get_scc_id (snode.m_index);
   }
 
+  json::object *to_json () const;
+
 private:
   class key_t
   {
diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index bbd2e7c2d40..abdb336da91 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -1740,7 +1740,7 @@ store::dump (bool simple) const
{PARENT_REGION_DESC: {BASE_REGION_DESC: object for binding_map,
 ... for each cluster within parent region},
 ...for each parent region,
-"called_unknown_function": true/false}.  */
+"called_unknown_fn": true/false}.  */
 
 json::object *
 store::to_json () const
diff --git a/gcc/analyzer/supergraph.cc b/gcc/analyzer/supergraph.cc
index 40acfbd16a8..0c69f139334 100644
--- a/gcc/analyzer/supergraph.cc
+++ b/gcc/analyzer/supergraph.cc
@@ -679,8 +679,9 @@ supernode::dump_dot_id (pretty_printer *pp) const
 
 /* Return a new json::object of the form
{"idx": int,
+"fun": optional str
 "bb_idx": int,
-"m_returning_call": optional str,
+"returning_call": optional str,
 "phis": [str],
 "stmts" : [str]}.  */
 
@@ -691,6 +692,8 @@ supernode::to_json () const
 
   snode_obj->set ("idx", new json::integer_number (m_index));
   snode_obj->set ("bb_idx", new json::integer_number (m_bb->index));
+  if (function *fun = get_function ())
+snode_obj->set ("fun", new json::string (function_name (fun)));
 
   if (m_returning_call)
 {
@@ -798,6 +801,26 @@ supernode::get_stmt_index (const gimple *stmt) const
   gcc_unreachable ();
 }
 
+/* Get a string for PK.  */
+
+static const char *
+edge_kind_to_string (enu

RE: [EXTERNAL] Re: [PATCH][tree-optimization]Optimize combination of comparisons to dec+compare

2021-01-14 Thread Eugene Rozenfeld via Gcc-patches

I got more feedback for the patch from Gabriel Ravier and Jakub Jelinek in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96674 and re-worked it accordingly.

The changes from the previous patch are:
1. Switched the tests to use __attribute__((noipa)) instead of 
__attribute__((noinline)).
2. Fixed a typo in the pattern comment.
3. Added :c for top-level bit_ior expression.
4. Added :s for the subexpressions.
5. Added a pattern for the negated expression:
x >= y && y != XXX_MIN --> x > y - 1
and the corresponding tests.
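
As a sanity check of the two identities, here is a minimal sketch
(standalone C++, not part of the patch; the function names are invented
for illustration) that compares both sides for every pair of 8-bit
unsigned values.  It assumes the first pattern in the form used by the
f() example in the original submission, (b == 0) | (a < b), and relies
on unsigned wrap-around for y - 1, which is what TYPE_OVERFLOW_WRAPS
guarantees.

```cpp
#include <cstdint>

// x < y || y == XXX_MIN, with XXX_MIN == 0 for an unsigned type.
bool or_form (std::uint8_t x, std::uint8_t y)
{
  return x < y || y == 0;
}

// x <= y - 1, where y - 1 wraps to 255 when y == 0.
bool dec_le_form (std::uint8_t x, std::uint8_t y)
{
  return x <= static_cast<std::uint8_t> (y - 1);
}

// The negated pattern: x >= y && y != XXX_MIN.
bool and_form (std::uint8_t x, std::uint8_t y)
{
  return x >= y && y != 0;
}

// x > y - 1, again with wrapping subtraction.
bool dec_gt_form (std::uint8_t x, std::uint8_t y)
{
  return x > static_cast<std::uint8_t> (y - 1);
}

// Exhaustively compare both pairs of forms over all 256 * 256 inputs.
bool all_pairs_agree ()
{
  for (int x = 0; x < 256; x++)
    for (int y = 0; y < 256; y++)
      {
        std::uint8_t a = static_cast<std::uint8_t> (x);
        std::uint8_t b = static_cast<std::uint8_t> (y);
        if (or_form (a, b) != dec_le_form (a, b))
          return false;
        if (and_form (a, b) != dec_gt_form (a, b))
          return false;
      }
  return true;
}
```

The interesting case is y == 0: y - 1 wraps to the maximum value, so
x <= y - 1 is always true and x > y - 1 is always false, matching the
y == XXX_MIN and y != XXX_MIN terms respectively.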

The new patch is attached.

Eugene

-Original Message-
From: Richard Biener  
Sent: Tuesday, January 5, 2021 4:21 AM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH][tree-optimization]Optimize combination of 
comparisons to dec+compare

On Mon, Jan 4, 2021 at 9:50 PM Eugene Rozenfeld 
 wrote:
>
> Ping.
>
> -Original Message-
> From: Eugene Rozenfeld
> Sent: Tuesday, December 22, 2020 3:01 PM
> To: Richard Biener ; 
> gcc-patches@gcc.gnu.org
> Subject: RE: Optimize combination of comparisons to dec+compare
>
> Re-sending my question and re-attaching the patch.
>
> Richard, can you please clarify your feedback?

Hmm, OK.

The patch is OK.

Thanks,
Richard.


> Thanks,
>
> Eugene
>
> -Original Message-
> From: Gcc-patches  On Behalf Of 
> Eugene Rozenfeld via Gcc-patches
> Sent: Tuesday, December 15, 2020 2:06 PM
> To: Richard Biener 
> Cc: gcc-patches@gcc.gnu.org
> Subject: [EXTERNAL] Re: Optimize combination of comparisons to 
> dec+compare
>
> Richard,
>
> > Do we already handle x < y || x <= CST to x <= y - CST?
>
> That is an invalid transformation: e.g., consider x=3, y=4, CST=2.
> Can you please clarify?
>
> Thanks,
>
> Eugene
>
> -Original Message-
> From: Richard Biener 
> Sent: Thursday, December 10, 2020 12:21 AM
> To: Eugene Rozenfeld 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: Optimize combination of comparisons to dec+compare
>
> On Thu, Dec 10, 2020 at 1:52 AM Eugene Rozenfeld via Gcc-patches 
>  wrote:
> >
> > This patch adds a pattern for optimizing x < y || x == XXX_MIN to x 
> > <=
> > y-1 if y is an integer with TYPE_OVERFLOW_WRAPS.
>
> Do we already handle x < y || x <= CST to x <= y - CST?
> That is, the XXX_MIN case is just a special-case of generic anti-range 
> testing?  For anti-range testing with signed types we pun to unsigned when 
> possible.
>
> > This fixes pr96674.
> >
> > Tested on x86_64-pc-linux-gnu.
> >
> > For this function
> >
> > bool f(unsigned a, unsigned b)
> > {
> > return (b == 0) | (a < b);
> > }
> >
> > the code without the patch is
> >
> > test   esi,esi
> > sete   al
> > cmp    esi,edi
> > seta   dl
> > or     eax,edx
> > ret
> >
> > the code with the patch is
> >
> > sub    esi,0x1
> > cmp    esi,edi
> > setae  al
> > ret
> >
> > Eugene
> >
> > gcc/
> > PR tree-optimization/96674
> > * match.pd: New pattern x < y || x == XXX_MIN --> x <= y - 1
> >
> > gcc/testsuite
> > * gcc.dg/pr96674.c: New test.
> >


0002-Optimize-combination-of-comparisons-to-dec-compare.patch
Description: 0002-Optimize-combination-of-comparisons-to-dec-compare.patch

[PATCH 1/3] Build x86 libitm with -march=i486 or better

If x86 libitm isn't compiled with -march=i486 or better, append
-march=i486 to XCFLAGS for the x86 libitm build.

PR target/70454
* configure.tgt (XCFLAGS): Append -march=i486 to compile x86
libitm if needed.
---
 libitm/configure.tgt | 39 +++
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/libitm/configure.tgt b/libitm/configure.tgt
index 6ac206f1005..316896c1b31 100644
--- a/libitm/configure.tgt
+++ b/libitm/configure.tgt
@@ -59,16 +59,25 @@ case "${target_cpu}" in
 
   arm*)ARCH=arm ;;
 
-  i[3456]86)
-   case " ${CC} ${CFLAGS} " in
- *" -m64 "*|*" -mx32 "*)
-   ;;
- *)
-   if test -z "$with_arch"; then
- XCFLAGS="${XCFLAGS} -march=i486 -mtune=${target_cpu}"
- XCFLAGS="${XCFLAGS} -fomit-frame-pointer"
-   fi
-   esac
+  i[3456]86 | x86_64)
+   # Need i486 or better.
+   cat > conftestx.c < /dev/null 2>&1; then
+ if test "${target_cpu}" = x86_64; then
+   XCFLAGS="${XCFLAGS} -march=i486 -mtune=generic"
+   XCFLAGS="${XCFLAGS} -fomit-frame-pointer"
+ else
+   XCFLAGS="${XCFLAGS} -march=i486 -mtune=${target_cpu}"
+   XCFLAGS="${XCFLAGS} -fomit-frame-pointer"
+ fi
+   fi
+   rm -f conftestx.c conftestx.o
XCFLAGS="${XCFLAGS} -mrtm"
ARCH=x86
;;
@@ -103,16 +112,6 @@ case "${target_cpu}" in
ARCH=sparc
;;
 
-  x86_64)
-   case " ${CC} ${CFLAGS} " in
- *" -m32 "*)
-   XCFLAGS="${XCFLAGS} -march=i486 -mtune=generic"
-   XCFLAGS="${XCFLAGS} -fomit-frame-pointer"
-   ;;
-   esac
-   XCFLAGS="${XCFLAGS} -mrtm"
-   ARCH=x86
-   ;;
   s390|s390x)
XCFLAGS="${XCFLAGS} -mzarch -mhtm"
ARCH=s390
-- 
2.29.2

[PATCH 2/3] Build x86 libgomp with -march=i486 or better

If x86 libgomp isn't compiled with -march=i486 or better, append
-march=i486 to XCFLAGS for the x86 libgomp build.

PR target/70454
* configure.tgt (XCFLAGS): Append -march=i486 to compile x86
libgomp if needed.
---
 libgomp/configure.tgt | 36 
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/libgomp/configure.tgt b/libgomp/configure.tgt
index 1863287fa0d..83b5f92727d 100644
--- a/libgomp/configure.tgt
+++ b/libgomp/configure.tgt
@@ -73,28 +73,24 @@ if test x$enable_linux_futex = xyes; then
;;
 
 # Note that bare i386 is not included here.  We need cmpxchg.
-i[456]86-*-linux*)
+i[456]86-*-linux* | x86_64-*-linux*)
config_path="linux/x86 linux posix"
-   case " ${CC} ${CFLAGS} " in
- *" -m64 "*|*" -mx32 "*)
-   ;;
- *)
-   if test -z "$with_arch"; then
- XCFLAGS="${XCFLAGS} -march=i486 -mtune=${target_cpu}"
+   # Need i486 or better.
+   cat > conftestx.c < /dev/null 2>&1; then
+   if test "${target_cpu}" = x86_64; then
+   XCFLAGS="${XCFLAGS} -march=i486 -mtune=generic"
+   else
+   XCFLAGS="${XCFLAGS} -march=i486 -mtune=${target_cpu}"
fi
-   esac
-   ;;
-
-# Similar jiggery-pokery for x86_64 multilibs, except here we
-# can't rely on the --with-arch configure option, since that
-# applies to the 64-bit side.
-x86_64-*-linux*)
-   config_path="linux/x86 linux posix"
-   case " ${CC} ${CFLAGS} " in
- *" -m32 "*)
-   XCFLAGS="${XCFLAGS} -march=i486 -mtune=generic"
-   ;;
-   esac
+   fi
+   rm -f conftestx.c conftestx.o
;;
 
 # Note that sparcv7 and sparcv8 is not included here.  We need cas.
-- 
2.29.2

[PATCH 0/3] Build x86 libitm/libgomp/libatomic with -march=i486 or better

Starting from

commit 77d372abec0fbf2cfe922e3140ee3410248f979e
Author: H.J. Lu 
Date:   Thu Jan 14 05:56:46 2021 -0800

x86: Error on -fcf-protection with incompatible target

GCC issues an error on -fcf-protection with incompatible target.  CET
is enabled in run-time libraries on x86 when GCC is configured with

--with-arch=XXX

where XXX enables SSE2.  But libitm/libgomp/libatomic are hardcoded to
compile with -march=i486, which is incompatible with CET.  We should
compile libitm/libgomp/libatomic with -march=i486 only if the default
-march= is lower than i486.
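
One way to picture the "is the default -march= lower than i486" test is
as a check on a compiler-predefined macro.  The sketch below is an
assumption for illustration only: the probe program used by the patches
is not visible in this message, and needs_march_i486 is a made-up name.
It uses __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4, which GCC predefines when
the selected target has a native 4-byte compare-and-swap (cmpxchg on
x86, available from i486 on).

```cpp
// Sketch only: the real decision is made by a shell probe in configure.tgt.
// Record whether the compiler, with its current default flags, advertises
// a native 4-byte compare-and-swap.
#if defined (__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
constexpr bool have_cas4 = true;
#else
constexpr bool have_cas4 = false;
#endif

// Hypothetical helper: -march=i486 would only need to be appended when
// the default target lacks the 4-byte compare-and-swap.
constexpr bool needs_march_i486 ()
{
  return !have_cas4;
}
```

With a default such as --with-arch=XXX where XXX enables SSE2, the macro
is already defined, so no -march=i486 override is needed and the
configured architecture stays in effect.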

H.J. Lu (3):
  Build x86 libitm with -march=i486 or better
  Build x86 libgomp with -march=i486 or better
  Build x86 libatomic with -march=i486 or better

 libatomic/configure.tgt | 73 -
 libgomp/configure.tgt   | 36 +---
 libitm/configure.tgt| 39 +++---
 3 files changed, 85 insertions(+), 63 deletions(-)

-- 
2.29.2

[PATCH 3/3] Build x86 libatomic with -march=i486 or better