[PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Jakub Jelinek via Gcc-patches
Hi!

This is the final patch of the series started with
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566139.html
and continued with
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566356.html
This time, I went through all the remaining instructions marked
by gas as requiring both AVX512BW and AVX512VL and for each checked
tmp-mddump.md, figured out if it ever could be a problem (e.g. instructions
that require AVX512BW+AVX512VL, but didn't exist before AVX512F are usually
fine, the patterns have the right conditions, the bugs are typically on
pre-AVX512F patterns where we have just blindly added v while they actually
can't access those unless AVX512BW+AVX512VL), added test where possible
(the test doesn't cover MMX though) and fixed md bugs.

I believe the mmx pextr[bw]/pinsr[bw] patterns are just conservatively
correct, I think the EVEX encoded instructions just require AVX512BW and
not AVX512VL, but I've just changed Yv (require VL) to Yw (require VL+BW)
and so didn't make it worse.
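
To make the Yv/Yw distinction concrete, here is a minimal Python sketch of how a constraint gates access to %xmm16-%xmm31. The function name and the string flags are invented for illustration; the real definitions live in GCC's constraints.md and are only paraphrased here from the description above.

```python
# Toy model of the two register constraints discussed above.
# Assumption: registers 0-15 are reachable without EVEX encoding,
# while 16-31 always need an EVEX encoded instruction.

def allowed_xmm_regs(constraint, isa_flags):
    """Return the set of xmm register numbers the constraint permits."""
    low = set(range(16))        # %xmm0-%xmm15
    all_regs = set(range(32))   # %xmm0-%xmm31
    flags = set(isa_flags)
    if constraint == "Yv":
        # Yv: extended registers as soon as AVX512VL is enabled.
        return all_regs if "avx512vl" in flags else low
    if constraint == "Yw":
        # Yw: extended registers only with AVX512VL and AVX512BW together.
        return all_regs if {"avx512vl", "avx512bw"} <= flags else low
    raise ValueError(f"unknown constraint {constraint!r}")


# -mavx512vl -mno-avx512bw: Yv would hand out %xmm16, Yw would not.
only_vl = {"avx512vl"}
print(16 in allowed_xmm_regs("Yv", only_vl))
print(16 in allowed_xmm_regs("Yw", only_vl))
```

With only AVX512VL enabled, a Yv operand may land in %xmm16 and produce an instruction gas rejects, which is the failure mode the series fixes.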

There are some other interesting details, e.g. most of the 8 interleave
patterns (vpunpck[hl]{bw,wd}) had correctly
&&  && 
in the conditions because for masking it needs to be always EVEX encoded
and then it needs both VL+BW, but 2 of those 8 had just
&& 
and so again would run into the -mavx512vl -mno-avx512bw problems.

Another problem, different from the others, was the mmx eq/gt comparisons,
which were using Yv constraints and so would happily accept %xmm16+
registers for -mavx512vl, but there actually are no such EVEX encoded
instructions, as AVX512 comparisons work with %k* registers instead.

The newly added testcase without the patch fails with:
/tmp/ccVROLo2.s: Assembler messages:
/tmp/ccVROLo2.s:9: Error: unsupported instruction `vpabsb'
/tmp/ccVROLo2.s:20: Error: unsupported instruction `vpabsb'
/tmp/ccVROLo2.s:31: Error: unsupported instruction `vpabsw'
/tmp/ccVROLo2.s:42: Error: unsupported instruction `vpabsw'
/tmp/ccVROLo2.s:53: Error: unsupported instruction `vpaddsb'
/tmp/ccVROLo2.s:64: Error: unsupported instruction `vpaddsb'
/tmp/ccVROLo2.s:75: Error: unsupported instruction `vpaddsw'
/tmp/ccVROLo2.s:86: Error: unsupported instruction `vpaddsw'
/tmp/ccVROLo2.s:97: Error: unsupported instruction `vpsubsb'
/tmp/ccVROLo2.s:108: Error: unsupported instruction `vpsubsb'
/tmp/ccVROLo2.s:119: Error: unsupported instruction `vpsubsw'
/tmp/ccVROLo2.s:130: Error: unsupported instruction `vpsubsw'
/tmp/ccVROLo2.s:141: Error: unsupported instruction `vpaddusb'
/tmp/ccVROLo2.s:152: Error: unsupported instruction `vpaddusb'
/tmp/ccVROLo2.s:163: Error: unsupported instruction `vpaddusw'
/tmp/ccVROLo2.s:174: Error: unsupported instruction `vpaddusw'
/tmp/ccVROLo2.s:185: Error: unsupported instruction `vpsubusb'
/tmp/ccVROLo2.s:196: Error: unsupported instruction `vpsubusb'
/tmp/ccVROLo2.s:207: Error: unsupported instruction `vpsubusw'
/tmp/ccVROLo2.s:218: Error: unsupported instruction `vpsubusw'
/tmp/ccVROLo2.s:258: Error: unsupported instruction `vpaddusw'
/tmp/ccVROLo2.s:269: Error: unsupported instruction `vpavgb'
/tmp/ccVROLo2.s:280: Error: unsupported instruction `vpavgb'
/tmp/ccVROLo2.s:291: Error: unsupported instruction `vpavgw'
/tmp/ccVROLo2.s:302: Error: unsupported instruction `vpavgw'
/tmp/ccVROLo2.s:475: Error: unsupported instruction `vpmovsxbw'
/tmp/ccVROLo2.s:486: Error: unsupported instruction `vpmovsxbw'
/tmp/ccVROLo2.s:497: Error: unsupported instruction `vpmovzxbw'
/tmp/ccVROLo2.s:508: Error: unsupported instruction `vpmovzxbw'
/tmp/ccVROLo2.s:548: Error: unsupported instruction `vpmulhuw'
/tmp/ccVROLo2.s:559: Error: unsupported instruction `vpmulhuw'
/tmp/ccVROLo2.s:570: Error: unsupported instruction `vpmulhw'
/tmp/ccVROLo2.s:581: Error: unsupported instruction `vpmulhw'
/tmp/ccVROLo2.s:592: Error: unsupported instruction `vpsadbw'
/tmp/ccVROLo2.s:603: Error: unsupported instruction `vpsadbw'
/tmp/ccVROLo2.s:643: Error: unsupported instruction `vpshufhw'
/tmp/ccVROLo2.s:654: Error: unsupported instruction `vpshufhw'
/tmp/ccVROLo2.s:665: Error: unsupported instruction `vpshuflw'
/tmp/ccVROLo2.s:676: Error: unsupported instruction `vpshuflw'
/tmp/ccVROLo2.s:687: Error: unsupported instruction `vpslldq'
/tmp/ccVROLo2.s:698: Error: unsupported instruction `vpslldq'
/tmp/ccVROLo2.s:709: Error: unsupported instruction `vpsrldq'
/tmp/ccVROLo2.s:720: Error: unsupported instruction `vpsrldq'
/tmp/ccVROLo2.s:899: Error: unsupported instruction `vpunpckhbw'
/tmp/ccVROLo2.s:910: Error: unsupported instruction `vpunpckhbw'
/tmp/ccVROLo2.s:921: Error: unsupported instruction `vpunpckhwd'
/tmp/ccVROLo2.s:932: Error: unsupported instruction `vpunpckhwd'
/tmp/ccVROLo2.s:943: Error: unsupported instruction `vpunpcklbw'
/tmp/ccVROLo2.s:954: Error: unsupported instruction `vpunpcklbw'
/tmp/ccVROLo2.s:965: Error: unsupported instruction `vpunpcklwd'
/tmp/ccVROLo2.s:976: Error: unsupported instruction `vpunpcklwd'

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-12  Jakub Jelinek  

PR ta

[PATCH] dwarf2out: Fix up ranges for -gdwarf-5 -gsplit-dwarf [PR99490]

2021-03-12 Thread Jakub Jelinek via Gcc-patches
Hi!

For -gdwarf-4 -gsplit-dwarf we used to emit .debug_ranges section
(so in the binaries/shared libraries) with DW_AT_ranges from skeleton
units as well as .debug_info.dwo pointing to it through DW_FORM_sec_offset
(and DW_AT_GNU_ranges_base pointing into section, not sure for what
reason exactly).
When DWARF5 support was being added, we've started using .debug_rnglists
section, added DW_AT_rnglists_base to the DW_TAG_skeleton_unit, kept
DW_AT_ranges with DW_FORM_sec_offset in the skeleton and switched
over to DW_FORM_rnglistx for DW_AT_ranges in .debug_info.dwo.
But the DWARF5 spec actually means for the ranges section (at least
everything for those DW_AT_ranges in .debug_info.dwo) to sit
in .debug_rnglists.dwo section next to the .debug_info.dwo, rather than
having consumers look it up in the binary/shared library instead.
Based on some discussions in the DWARF discuss mailing list:
http://lists.dwarfstd.org/pipermail/dwarf-discuss-dwarfstd.org/2021-March/thread.html#4765
this patch mostly follows what LLVM emits for that right now:
1) small .debug_rnglists section (when needed) just to cover the
   skeleton DW_AT_ranges (if present); the content of the section
   uses the Split DWARFy DW_RLE_* codes with addrx encodings where
   possible
2) DW_AT_ranges in the skeleton uses DW_FORM_sec_offset (difference
   from LLVM which uses DW_FORM_rnglistx, which makes it larger
   and ambiguous)
3) DW_AT_rnglists_base attribute is gone from the skeleton (again,
   unlike LLVM where it is just confusing what exactly it means because
   it is inherited; it would make sense if we emitted DW_FORM_rnglistx
   in non-split DWARF, but unless ranges are shared, I'm afraid we'd
   make DWARF larger with fewer relocations by that)
4) usually big .debug_rnglists.dwo section again with using DW_RLE_*x*
   where possible
5) DW_AT_ranges with DW_FORM_rnglistx from .debug_info.dwo referring to
   that .debug_rnglists.dwo ranges
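
The layout described in the list above can be summarized with a small Python sketch. It only restates the prose; the function name and its arguments are hypothetical and nothing here is taken from dwarf2out.c.

```python
def ranges_placement(dwarf_version, in_skeleton):
    """Section and form used for DW_AT_ranges under -gsplit-dwarf.

    Simplified restatement of the description above.
    """
    if dwarf_version < 5:
        # DWARF 4 split: skeleton and .dwo both point into .debug_ranges.
        return (".debug_ranges", "DW_FORM_sec_offset")
    if in_skeleton:
        # Items 1 and 2: a small section that stays in the binary.
        return (".debug_rnglists", "DW_FORM_sec_offset")
    # Items 4 and 5: the bulk of the ranges travel with the .dwo file.
    return (".debug_rnglists.dwo", "DW_FORM_rnglistx")


for version, skeleton in [(4, True), (4, False), (5, True), (5, False)]:
    print(version, skeleton, ranges_placement(version, skeleton))
```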

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-12  Jakub Jelinek  

PR debug/99490
* dwarf2out.c (debug_ranges_dwo_section): New variable.
(DW_RANGES_IDX_SKELETON): Define.
(struct dw_ranges): Add begin_entry and end_entry members.
(DEBUG_DWO_RNGLISTS_SECTION): Define.
(add_ranges_num): Adjust r initializer for addition of *_entry
members.
(add_ranges_by_labels): For -gsplit-dwarf and force_direct,
set idx to DW_RANGES_IDX_SKELETON.
(index_rnglists): Don't set r->idx if it is equal to
DW_RANGES_IDX_SKELETON.  Initialize r->begin_entry and
r->end_entry for -gsplit-dwarf if those will be needed by
output_rnglists.
(output_rnglists): Add DWO argument.  If true, switch to
debug_ranges_dwo_section rather than debug_ranges_section.
Adjust l1/l2 label indexes.  Only output the offset table when
dwo is true and don't include in there the skeleton range
entry if present.  For -gsplit-dwarf, skip ranges that belong
to the other rnglists section.  Change return type from void
to bool and return true if there are any range entries for
the other section.  For dwarf_split_debug_info use
DW_RLE_startx_endx, DW_RLE_startx_length and DW_RLE_base_addressx
entries instead of DW_RLE_start_end, DW_RLE_start_length and
DW_RLE_base_address.
(init_sections_and_labels): Initialize debug_ranges_dwo_section
if -gsplit-dwarf and DWARF >= 5.  Adjust ranges_section_label
and range_base_label indexes.
(dwarf2out_finish): Call index_rnglists earlier before finalizing
.debug_addr.  Never emit DW_AT_rnglists_base attribute.  For
-gsplit-dwarf and DWARF >= 5 call output_rnglists up to twice
with different dwo arguments.
(dwarf2out_c_finalize): Clear debug_ranges_dwo_section.

--- gcc/dwarf2out.c.jj  2021-03-10 17:36:37.037537129 +0100
+++ gcc/dwarf2out.c 2021-03-11 12:50:00.402418873 +0100
@@ -171,6 +171,7 @@ static GTY(()) section *debug_line_str_s
 static GTY(()) section *debug_str_dwo_section;
 static GTY(()) section *debug_str_offsets_section;
 static GTY(()) section *debug_ranges_section;
+static GTY(()) section *debug_ranges_dwo_section;
 static GTY(()) section *debug_frame_section;
 
 /* Maximum size (in bytes) of an artificially generated label.  */
@@ -3152,11 +3153,17 @@ struct GTY(()) dw_ranges {
   /* If this is positive, it's a block number, otherwise it's a
  bitwise-negated index into dw_ranges_by_label.  */
   int num;
+  /* If idx is equal to DW_RANGES_IDX_SKELETON, it should be emitted
+ into .debug_rnglists section rather than .debug_rnglists.dwo
+ for -gsplit-dwarf and DWARF >= 5.  */
+#define DW_RANGES_IDX_SKELETON ((1U << 31) - 1)
   /* Index for the range list for DW_FORM_rnglistx.  */
   unsigned int idx : 31;
   /* True if this range might be possibly in a different section
  from previous entry.  */
   unsigned

[PATCH][pushed] gcc-changelog: allow ChangeLog deletion in a commit

2021-03-12 Thread Martin Liška

That would be needed for removal of components which contain a ChangeLog.
Pushed to master.

Martin

contrib/ChangeLog:

* gcc-changelog/git_commit.py: Allow deletion of ChangeLog
files.
* gcc-changelog/setup.cfg: Set line limit to 120 characters.
* gcc-changelog/test_email.py: Add test.
* gcc-changelog/test_patches.txt: Likewise.
* gcc-changelog/git_email.py: Fix parsing of deleted files.
---
 contrib/gcc-changelog/git_commit.py|  3 ++-
 contrib/gcc-changelog/git_email.py |  2 +-
 contrib/gcc-changelog/setup.cfg|  3 +++
 contrib/gcc-changelog/test_email.py|  4 
 contrib/gcc-changelog/test_patches.txt | 30 ++
 5 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index e9dae0a838d..43fa7f40e4e 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -314,8 +314,9 @@ class GitCommit:
 if self.revert_commit:
 self.info = self.commit_to_info_hook(self.revert_commit)
 
+# Allow complete deletion of ChangeLog files in a commit
 project_files = [f for f in self.info.modified_files
- if self.is_changelog_filename(f[0], allow_suffix=True)
+ if (self.is_changelog_filename(f[0], allow_suffix=True) and f[1] != 'D')
  or f[0] in misc_files]
 ignored_files = [f for f in self.info.modified_files
  if self.in_ignored_location(f[0])]
diff --git a/contrib/gcc-changelog/git_email.py b/contrib/gcc-changelog/git_email.py
index 00ad00458f4..b0547b363aa 100755
--- a/contrib/gcc-changelog/git_email.py
+++ b/contrib/gcc-changelog/git_email.py
@@ -66,7 +66,7 @@ class GitEmail(GitCommit):
 t = 'A'
 else:
 t = 'M'
-modified_files.append((target, t))
+modified_files.append((target if t != 'D' else source, t))
 git_info = GitInfo(None, date, author, body, modified_files)
 super().__init__(git_info, strict=strict,
  commit_to_info_hook=lambda x: None)
diff --git a/contrib/gcc-changelog/setup.cfg b/contrib/gcc-changelog/setup.cfg
index 9e4a0f6479c..efc313f6d52 100644
--- a/contrib/gcc-changelog/setup.cfg
+++ b/contrib/gcc-changelog/setup.cfg
@@ -1,2 +1,5 @@
+[flake8]
+max-line-length = 120
+
 [tool:pytest]
 addopts = -vv --flake8
diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py
index b81548f2033..9d052e06467 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -416,3 +416,7 @@ class TestGccChangelog(unittest.TestCase):
 def test_multiline_bad_parentheses(self):
 email = self.from_patch_glob('0002-Wrong-macro-changelog.patch')
 assert email.errors[0].message == 'bad parentheses wrapping'
+
+def test_changelog_removal(self):
+email = self.from_patch_glob('0001-ChangeLog-removal.patch', strict=True)
+assert not email.errors
diff --git a/contrib/gcc-changelog/test_patches.txt b/contrib/gcc-changelog/test_patches.txt
index 9de28972556..012573b4845 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -3481,3 +3481,33 @@ index ac63591b33f..ff4d61764e7 100644
 +
 --
 2.29.2
+
+=== 0001-ChangeLog-removal.patch ===
+From b39fadf9df1a9510afcab0a391182da7dc68de24 Mon Sep 17 00:00:00 2001
+From: Martin Liska 
+Date: Fri, 12 Mar 2021 09:10:55 +0100
+Subject: [PATCH] Test ChangeLog removal.
+
+gcc/ChangeLog:
+
+   * ipa-icf.c (make_pass_ipa_icf): Add line.
+---
+diff --git a/gcc/analyzer/ChangeLog b/gcc/analyzer/ChangeLog
+deleted file mode 100644
+index 94e87f6bcde..000
+--- a/gcc/analyzer/ChangeLog
++++ /dev/null
+@@ -1,1 +0,0 @@
+- foo
+diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
+index 5dd33a75c3a..c4ce432cb98 100644
+--- a/gcc/ipa-icf.c
++++ b/gcc/ipa-icf.c
+@@ -3655,3 +3655,4 @@ make_pass_ipa_icf (gcc::context *ctxt)
+ {
+   return new ipa_icf::pass_ipa_icf (ctxt);
+ }
++
+--
+2.30.1
+
--
2.30.1
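
For readers who want to try the filtering logic outside the checker, here is a standalone Python sketch of the two changed expressions. The helper names are made up for this example and `is_changelog_filename` is a simplified stand-in, not the method from git_commit.py; the `/dev/null` remark is an inference from the test patch above, where the deleted file's new side is `/dev/null`.

```python
import os


def is_changelog_filename(path, allow_suffix=False):
    # Simplified stand-in for the real method of the same name.
    base = os.path.basename(path)
    return base == "ChangeLog" or (allow_suffix and base.startswith("ChangeLog"))


def project_changelog_edits(modified_files, misc_files=()):
    """Filter (path, status) tuples the way the patched expression does.

    status is 'A' (added), 'M' (modified) or 'D' (deleted); deleted
    ChangeLog files no longer match.
    """
    return [f for f in modified_files
            if (is_changelog_filename(f[0], allow_suffix=True) and f[1] != "D")
            or f[0] in misc_files]


def parsed_entry(source, target, status):
    # For a deletion the new side of the diff is /dev/null (see the test
    # patch), so the source path is the one worth recording.
    return (target if status != "D" else source, status)


files = [("gcc/analyzer/ChangeLog", "D"),
         ("gcc/ipa-icf.c", "M"),
         ("gcc/ChangeLog", "M")]
print(project_changelog_edits(files))
print(parsed_entry("gcc/analyzer/ChangeLog", "/dev/null", "D"))
```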



Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 8:59 AM Jakub Jelinek  wrote:
>
> Hi!
>
> This is the final patch of the series started with
> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566139.html
> and continued with
> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566356.html
> This time, I went through all the remaining instructions marked
> by gas as requiring both AVX512BW and AVX512VL and for each checked
> tmp-mddump.md, figured out if it ever could be a problem (e.g. instructions
> that require AVX512BW+AVX512VL, but didn't exist before AVX512F are usually
> fine, the patterns have the right conditions, the bugs are typically on
> pre-AVX512F patterns where we have just blindly added v while they actually
> can't access those unless AVX512BW+AVX512VL), added test where possible
> (the test doesn't cover MMX though) and fixed md bugs.
>
> I believe the mmx pextr[bw]/pinsr[bw] patterns are just conservatively
> correct, I think the EVEX encoded instructions just require AVX512BW and
> not AVX512VL, but I've just changed Yv (require VL) to Yw (require VL+BW)
> and so didn't make it worse.

Perhaps we can introduce another Y... constraint for AVX512BW and use
it here. I think they will be used in other places, too.

> There are some other interesting details, e.g. most of the 8 interleave
> patterns (vpunpck[hl]{bw,wd}) had correctly
> &&  && 
> in the conditions because for masking it needs to be always EVEX encoded
> and then it needs both VL+BW, but 2 of those 8 had just
> && 
> and so again would run into the -mavx512vl -mno-avx512bw problems.
>
> Another problem different from others was mmx eq/gt comparisons, that was
> using Yv constraints, so would happily accept %xmm16+ registers for
> -mavx512vl, but there actually are no such EVEX encoded instructions,
> as AVX512 comparisons work with %k* registers instead.
>
> The newly added testcase without the patch fails with:
> /tmp/ccVROLo2.s: Assembler messages:
> /tmp/ccVROLo2.s:9: Error: unsupported instruction `vpabsb'
> /tmp/ccVROLo2.s:20: Error: unsupported instruction `vpabsb'
> /tmp/ccVROLo2.s:31: Error: unsupported instruction `vpabsw'
> /tmp/ccVROLo2.s:42: Error: unsupported instruction `vpabsw'
> /tmp/ccVROLo2.s:53: Error: unsupported instruction `vpaddsb'
> /tmp/ccVROLo2.s:64: Error: unsupported instruction `vpaddsb'
> /tmp/ccVROLo2.s:75: Error: unsupported instruction `vpaddsw'
> /tmp/ccVROLo2.s:86: Error: unsupported instruction `vpaddsw'
> /tmp/ccVROLo2.s:97: Error: unsupported instruction `vpsubsb'
> /tmp/ccVROLo2.s:108: Error: unsupported instruction `vpsubsb'
> /tmp/ccVROLo2.s:119: Error: unsupported instruction `vpsubsw'
> /tmp/ccVROLo2.s:130: Error: unsupported instruction `vpsubsw'
> /tmp/ccVROLo2.s:141: Error: unsupported instruction `vpaddusb'
> /tmp/ccVROLo2.s:152: Error: unsupported instruction `vpaddusb'
> /tmp/ccVROLo2.s:163: Error: unsupported instruction `vpaddusw'
> /tmp/ccVROLo2.s:174: Error: unsupported instruction `vpaddusw'
> /tmp/ccVROLo2.s:185: Error: unsupported instruction `vpsubusb'
> /tmp/ccVROLo2.s:196: Error: unsupported instruction `vpsubusb'
> /tmp/ccVROLo2.s:207: Error: unsupported instruction `vpsubusw'
> /tmp/ccVROLo2.s:218: Error: unsupported instruction `vpsubusw'
> /tmp/ccVROLo2.s:258: Error: unsupported instruction `vpaddusw'
> /tmp/ccVROLo2.s:269: Error: unsupported instruction `vpavgb'
> /tmp/ccVROLo2.s:280: Error: unsupported instruction `vpavgb'
> /tmp/ccVROLo2.s:291: Error: unsupported instruction `vpavgw'
> /tmp/ccVROLo2.s:302: Error: unsupported instruction `vpavgw'
> /tmp/ccVROLo2.s:475: Error: unsupported instruction `vpmovsxbw'
> /tmp/ccVROLo2.s:486: Error: unsupported instruction `vpmovsxbw'
> /tmp/ccVROLo2.s:497: Error: unsupported instruction `vpmovzxbw'
> /tmp/ccVROLo2.s:508: Error: unsupported instruction `vpmovzxbw'
> /tmp/ccVROLo2.s:548: Error: unsupported instruction `vpmulhuw'
> /tmp/ccVROLo2.s:559: Error: unsupported instruction `vpmulhuw'
> /tmp/ccVROLo2.s:570: Error: unsupported instruction `vpmulhw'
> /tmp/ccVROLo2.s:581: Error: unsupported instruction `vpmulhw'
> /tmp/ccVROLo2.s:592: Error: unsupported instruction `vpsadbw'
> /tmp/ccVROLo2.s:603: Error: unsupported instruction `vpsadbw'
> /tmp/ccVROLo2.s:643: Error: unsupported instruction `vpshufhw'
> /tmp/ccVROLo2.s:654: Error: unsupported instruction `vpshufhw'
> /tmp/ccVROLo2.s:665: Error: unsupported instruction `vpshuflw'
> /tmp/ccVROLo2.s:676: Error: unsupported instruction `vpshuflw'
> /tmp/ccVROLo2.s:687: Error: unsupported instruction `vpslldq'
> /tmp/ccVROLo2.s:698: Error: unsupported instruction `vpslldq'
> /tmp/ccVROLo2.s:709: Error: unsupported instruction `vpsrldq'
> /tmp/ccVROLo2.s:720: Error: unsupported instruction `vpsrldq'
> /tmp/ccVROLo2.s:899: Error: unsupported instruction `vpunpckhbw'
> /tmp/ccVROLo2.s:910: Error: unsupported instruction `vpunpckhbw'
> /tmp/ccVROLo2.s:921: Error: unsupported instruction `vpunpckhwd'
> /tmp/ccVROLo2.s:932: Error: unsupported instruction `vpunpckhwd'
> /tmp/ccVROLo2.s:943: Error: u

Re: [PATCH v2] c: don't drop typedef information in casts

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 12, 2021 at 04:08:17AM +0100, David Lamparter wrote:
> 
> The TYPE_MAIN_VARIANT() here was, for casts to a typedef'd type name,
> resulting in all information about the typedef's involvement getting
> lost.  This drops necessary information for warnings and can make them
> confusing or even misleading.  It also makes specialized warnings for
> unspecified-size system types (pid_t, uid_t, ...) impossible.

I don't think you can/should do the lookup_name calls in build_c_cast,
that just can't be right.  The lookups need to be done only when parsing
the typedefs.  Because, one certainly can have e.g.
typedef ... A;
typedef A ... B; // with some extra qualifiers or whatever
void
foo (void)
{
  typedef ... A; // Override in local scope A but not B
  ... = (B) ...; // build_c_cast called
}
You don't want to find the overridden A, you need the original one.
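
The scoping problem can be reproduced with a toy symbol table. This Python sketch is only an illustration of the argument: the `Scope` class and the dictionaries standing in for declarations are hypothetical and unrelated to the C front end's data structures.

```python
class Scope:
    """Minimal nested scope with name lookup through parent scopes."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}

    def declare(self, name, decl):
        self.names[name] = decl

    def lookup(self, name):
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise KeyError(name)


file_scope = Scope()
outer_a = {"name": "A", "where": "file scope"}
file_scope.declare("A", outer_a)

# B records the declaration of A that was visible when B was parsed.
typedef_b = {"name": "B", "based_on": file_scope.lookup("A")}
file_scope.declare("B", typedef_b)

# Inside foo, a local typedef overrides A but not B.
func_scope = Scope(parent=file_scope)
inner_a = {"name": "A", "where": "foo"}
func_scope.declare("A", inner_a)

# Looking "A" up by name at the point of the cast finds the local one,
# while the link stored when B was parsed still names the original.
late = func_scope.lookup("A")
early = func_scope.lookup("B")["based_on"]
print(late["where"], early["where"])
```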

> 2021-03-09  David Lamparter  
> 
>   PR c/99526
>   * c-typeck.c (build_c_cast): retain (unqualified) typedefs in
> casts rather than stripping down to basic type.

Jakub



[PATCH][pushed] analyzer: document new param

2021-03-12 Thread Martin Liška

Identified by my check that compares documentation of params
with content of --help=param output.

Pushed as obvious.
Martin

gcc/ChangeLog:

* doc/invoke.texi: Add missing param documentation.
---
 gcc/doc/invoke.texi | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4a3c1e2fa0f..7a368959e5e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14362,6 +14362,10 @@ recurse deeper.
 The maximum depth of a symbolic value, before approximating
 the value as unknown.
 
+@item analyzer-max-infeasible-edges

+The maximum number of infeasible edges to reject before declaring
+a diagnostic as infeasible.
+
 @item gimple-fe-computed-hot-bb-threshold
 The number of executions of a basic block which is considered hot.
 The parameter is used only in GIMPLE FE.
--
2.30.1
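
A check of the kind mentioned at the top of this message can be sketched as follows. This is not Martin's script: the function names are invented, and the two regular expressions encode assumptions about the input, namely that each documented param appears on an `@item <name>` line and that each param in the help output starts a line as `--param=<name>=`.

```python
import re


def documented_params(texi_text):
    # Assumption: every param is introduced by a line "@item <name>".
    return set(re.findall(r"^@item[ \t]+([a-z0-9-]+)[ \t]*$", texi_text, re.M))


def listed_params(help_text):
    # Assumption: every line of interest starts with "--param=<name>=".
    return set(re.findall(r"^[ \t]*--param=([a-z0-9-]+)=", help_text, re.M))


def undocumented(texi_text, help_text):
    """Params present in the help output but missing from the manual."""
    return sorted(listed_params(help_text) - documented_params(texi_text))


texi = "@item analyzer-max-svalue-depth\nThe maximum depth of a symbolic value.\n"
help_out = ("  --param=analyzer-max-svalue-depth=  The maximum depth.\n"
            "  --param=analyzer-max-infeasible-edges=  The maximum number.\n")
print(undocumented(texi, help_out))
```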



Re: [Ada] Fix PR ada/99264

2021-03-12 Thread Matthias Klose
Jakub reported that for glibc 2.34 (trunk, unreleased), Richard said it was
working for glibc 2.33 (latest release), your commit says "Fix build breakage
with latest glibc release" (which is 2.33). What is correct?

The change also caused CI test failures in Debian and Ubuntu as seen at
https://ci.debian.net/data/autopkgtest/unstable/amd64/a/ahven/10908933/log.gz
fail libgnat-10 10.2.1-21 (experimental)
https://ci.debian.net/data/autopkgtest/unstable/amd64/a/ahven/10908997/log.gz
pass libgnat-10 10.2.1-6 (unstable)

Citing Nicolas,

"""
The symptom
"/usr/lib/x86_64-linux-gnu/ada/adalib/ahven/ahven-framework.ali" is obsolete and
read-only
looks exactly like the scenario described by the Debian Ada Policy.
https://people.debian.org/~lbrenta/debian-ada-policy.html

Example with a simple dependency chain:
  libahven depends on libgnat
  ahven-tests depend on libahven
libahven contains, in ahven-framework.ali, a checksum of each source
it depends upon, including some libgnat sources.
When libgnat changes, libahven must be rebuilt before ahven-tests.
Else…
The error above is reported when building ahven-tests.
It mentions ahven-framework.ali from the libahven-dev package.
It actually originates in a change in libgnat.

The Debian Ada policy requires that such dependencies are encoded in
-dev package names so that dpkg later automatically prevents
inconsistent sets of .ali files and related cryptic messages.

In the special case of libgnat, built by GCC, there is only one
libgnatMAJOR package, containing the files usually expected in
libgnatSO and libgnatALI-dev. The sources are not expected to change
for a given MAJOR inside unstable.
"""


Assuming that the change is only required for glibc trunk (2.34), I'll revert
that for Debian's package builds for the gcc branches. I'll see what to do if I
still need gnat-10 when glibc 2.34 is in use.  Otoh, the patch could be
conditional on the glibc version detected.

Matthias
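
The staleness scenario Nicolas describes can be modelled in a few lines of Python. This is a sketch under loud assumptions: CRC32 is used as a stand-in checksum, the dictionary standing in for an .ali file does not reproduce the real format, and the source name and contents are placeholders.

```python
import zlib


def checksum(source_text):
    # Stand-in checksum; the real .ali format is not reproduced here.
    return zlib.crc32(source_text.encode("utf-8"))


def record_ali(dependencies):
    """Snapshot the checksum of every source a unit depends on."""
    return {name: checksum(text) for name, text in dependencies.items()}


def obsolete_sources(ali, current_sources):
    """Names whose current checksum no longer matches the recorded one."""
    return sorted(name for name, recorded in ali.items()
                  if checksum(current_sources[name]) != recorded)


# Placeholder for a libgnat source before and after a change.
before = {"libgnat-source.ads": "contents, version 1"}
after = {"libgnat-source.ads": "contents, version 2"}

ali = record_ali(before)              # libahven built against "before"
print(obsolete_sources(ali, before))  # nothing stale yet
print(obsolete_sources(ali, after))   # libgnat changed, libahven not rebuilt
```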


On 3/5/21 12:45 PM, Eric Botcazou wrote:
> This fixes the build breakage introduced by the latest glibc release.
> 
> Tested on x86-64/Linux, applied on mainline, 10 and 9 branches.
> 
> 
> 2021-03-05  Eric Botcazou  
> 
>   PR ada/99264
>   * init.c (__gnat_alternate_sta) [Linux]: Remove preprocessor test on
>   MINSIGSTKSZ and bump size to 32KB.
>   * libgnarl/s-osinte__linux.ads (Alternate_Stack_Size): Bump to 32KB.
> 



Re: [PATCH] MIPS: R6: load/store can process unaligned address

2021-03-12 Thread YunQiang Su
ping.

On Sun, Feb 28, 2021 at 3:17 PM YunQiang Su wrote:
>
> MIPS Release 6 requires that lw/ld/sw/sd work with
> unaligned addresses; this can be implemented fully in
> hardware or by trap&emulate.
>
> Since it may be done fully by hardware, we add an
> option -m(no-)unaligned-access; the kernel may need it.
>
> gcc/ChangeLog:
>
> * config/mips/mips.h (ISA_HAS_UNALIGNED_ACCESS):
> (STRICT_ALIGNMENT): R6 can unaligned access.
> * config/mips/mips.md (movmisalign): Likewise.
> * config/mips/mips.opt: add -m(no-)unaligned-access
> * doc/invoke.texi: Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/mips/mips.exp: add unaligned-access
> * gcc.target/mips/unaligned-2.c: New test.
> * gcc.target/mips/unaligned-3.c: New test.
> ---
>  gcc/config/mips/mips.h  |  6 ++-
>  gcc/config/mips/mips.md | 10 
>  gcc/config/mips/mips.opt|  4 ++
>  gcc/doc/invoke.texi | 10 
>  gcc/testsuite/gcc.target/mips/mips.exp  |  1 +
>  gcc/testsuite/gcc.target/mips/unaligned-2.c | 53 +
>  gcc/testsuite/gcc.target/mips/unaligned-3.c | 53 +
>  7 files changed, 136 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/mips/unaligned-2.c
>  create mode 100644 gcc/testsuite/gcc.target/mips/unaligned-3.c
>
> diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
> index b4a60a55d80..38c39f79ee2 100644
> --- a/gcc/config/mips/mips.h
> +++ b/gcc/config/mips/mips.h
> @@ -226,6 +226,10 @@ struct mips_cpu_info {
>  && (mips_isa_rev >= 6 \
>  || ISA_HAS_MSA))
>
> +/* ISA load/store instructions can handle unaligned address */
> +#define ISA_HAS_UNALIGNED_ACCESS (TARGET_UNALIGNED_ACCESS \
> +&& (mips_isa_rev >= 6))
> +
>  /* The ISA compression flags that are currently in effect.  */
>  #define TARGET_COMPRESSION (target_flags & (MASK_MIPS16 | MASK_MICROMIPS))
>
> @@ -1665,7 +1669,7 @@ FP_ASM_SPEC "\
>(ISA_HAS_MSA ? BITS_PER_MSA_REG : LONG_DOUBLE_TYPE_SIZE)
>
>  /* All accesses must be aligned.  */
> -#define STRICT_ALIGNMENT 1
> +#define STRICT_ALIGNMENT (!ISA_HAS_UNALIGNED_ACCESS)
>
>  /* Define this if you wish to imitate the way many other C compilers
> handle alignment of bitfields and the structures that contain
> diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
> index eef3cfd50a8..40e29c60432 100644
> --- a/gcc/config/mips/mips.md
> +++ b/gcc/config/mips/mips.md
> @@ -4459,6 +4459,16 @@ (define_insn "mov_r"
>[(set_attr "move_type" "store")
> (set_attr "mode" "")])
>
> +;; Unaligned direct access
> +(define_expand "movmisalign"
> +  [(set (match_operand:JOIN_MODE 0)
> +   (match_operand:JOIN_MODE 1))]
> +  "ISA_HAS_UNALIGNED_ACCESS"
> +{
> +  if (mips_legitimize_move (mode, operands[0], operands[1]))
> +DONE;
> +})
> +
>  ;; An instruction to calculate the high part of a 64-bit SYMBOL_ABSOLUTE.
>  ;; The required value is:
>  ;;
> diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
> index 6af8037e9bd..ebb4c616401 100644
> --- a/gcc/config/mips/mips.opt
> +++ b/gcc/config/mips/mips.opt
> @@ -404,6 +404,10 @@ mtune=
>  Target RejectNegative Joined Var(mips_tune_option) ToLower Enum(mips_arch_opt_value)
>  -mtune=PROCESSOR   Optimize the output for PROCESSOR.
>
> +munaligned-access
> +Target Var(TARGET_UNALIGNED_ACCESS) Init(1)
> +Generate code with unaligned load store, valid for MIPS R6.
> +
>  muninit-const-in-rodata
>  Target Var(TARGET_UNINIT_CONST_IN_RODATA)
>  Put uninitialized constants in ROM (needs -membedded-data).
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 546e95453c1..27730d1a0de 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1059,6 +1059,7 @@ Objective-C and Objective-C++ Dialects}.
>  -mcheck-zero-division  -mno-check-zero-division @gol
>  -mdivide-traps  -mdivide-breaks @gol
>  -mload-store-pairs  -mno-load-store-pairs @gol
> +-munaligned-access  -mno-unaligned-access @gol
>  -mmemcpy  -mno-memcpy  -mlong-calls  -mno-long-calls @gol
>  -mmad  -mno-mad  -mimadd  -mno-imadd  -mfused-madd  -mno-fused-madd  -nocpp @gol
>  -mfix-24k  -mno-fix-24k @gol
> @@ -24967,6 +24968,15 @@ instructions to enable load/store bonding.  This option is enabled by
>  default but only takes effect when the selected architecture is known
>  to support bonding.
>
> +@item -munaligned-access
> +@itemx -mno-unaligned-access
> +@opindex munaligned-access
> +@opindex mno-unaligned-access
> +Enable (disable) direct unaligned access for MIPS Release 6.
> +MIPSr6 requires load/store unaligned-access support,
> +by hardware or trap&emulate.
> +So @option{-mno-unaligned-access} may be needed by kernel.
> +
>  @item -mmemcpy
>  @itemx -mno-memcpy
>  @opindex mmemcpy
> diff --git a/gcc/testsuite/gcc.target/mips/mips.exp 
> b/gcc/testsuit
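
The option gating in the quoted mips.h hunk boils down to two boolean expressions. The Python sketch below mirrors them for quick experimentation; the function names are lower-cased versions of the macros and the arguments stand in for the compiler's internal state.

```python
def isa_has_unaligned_access(target_unaligned_access, mips_isa_rev):
    # Mirrors ISA_HAS_UNALIGNED_ACCESS: the option must be on and the
    # ISA revision must be at least 6.
    return bool(target_unaligned_access) and mips_isa_rev >= 6


def strict_alignment(target_unaligned_access, mips_isa_rev):
    # Mirrors the patched STRICT_ALIGNMENT definition.
    return not isa_has_unaligned_access(target_unaligned_access, mips_isa_rev)


# The option defaults to enabled (Init(1) in mips.opt), so R6 targets
# stop being strict-alignment unless -mno-unaligned-access is given.
for rev in (5, 6):
    for opt in (True, False):
        print(rev, opt, strict_alignment(opt, rev))
```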

Re: [Ada] Fix PR ada/99264

2021-03-12 Thread Eric Botcazou
> Jakub reported that for glibc 2.34 (trunk, unreleased), Richard said it was
> working for glibc 2.33 (latest release), your commit says "Fix build
> breakage with latest glibc release" (which is 2.33). What is correct?

Both I guess, mine is just a bit more forward-looking. ;-)

> The symptom
> "/usr/lib/x86_64-linux-gnu/ada/adalib/ahven/ahven-framework.ali" is obsolete
> and read-only
> looks exactly like the scenario described by the Debian Ada Policy.
> https://people.debian.org/~lbrenta/debian-ada-policy.html
> 
> Example with a simple dependency chain:
>   libahven depends on libgnat
>   ahven-tests depend on libahven
> libahven contains, in ahven-framework.ali, a checksum of each source
> it depends upon, including some libgnat sources.
> When libgnat changes, libahven must be rebuilt before ahven-tests.
> Else…
> The error above is reported when building ahven-tests.
> It mentions ahven-framework.ali from the libahven-dev package.
> It actually originates in a change in libgnat.
> 
> The Debian Ada policy requires that such dependencies are encoded in
> -dev package names so that dpkg later automatically prevents
> inconsistent sets of .ali files and related cryptic messages.
> 
> In the special case of libgnat, built by GCC, there is only one
> libgnatMAJOR package, containing the files usually expected in
> libgnatSO and libgnatALI-dev. The sources are not expected to change
> for a given MAJOR inside unstable.

The change is supposed to be binary compatible so not sure what this means.

> Assuming that the change is only required for glibc trunk (2.34), I'll
> revert that for Debian's package builds for the gcc branches. I'll see what
> to do if I still need gnat-10 when glibc 2.34 is in use.  Otoh, the patch
> could be conditional on the glibc version detected.

Too much hassle.  Please pester the glibc folks if you have any complaint.

-- 
Eric Botcazou




Re: [Patch] Fortran/OpenMP: Fix use_device_{ptr,addr} with assumed-size array [PR98858]

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Mar 11, 2021 at 02:44:38PM +0100, Tobias Burnus wrote:
> Fortran/OpenMP: Fix use_device_{ptr,addr} with assumed-size array [PR98858]
> 
> gcc/ChangeLog:
> 
>   PR fortran/98858
>   * gimplify.c (omp_add_variable): Handle NULL_TREE as size
>   occurring for assumed-size arrays in use_device_{ptr,addr}.
> 
> libgomp/ChangeLog:
> 
>   PR fortran/98858
>   * testsuite/libgomp.fortran/use_device_ptr-3.f90: New test.

Ok, thanks.

Jakub



Re: [Patch] Fortran/OpenMP: Accept implicit-save DATA vars for threadprivate [PR99514]

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 10, 2021 at 03:20:51PM +0100, Tobias Burnus wrote:
> gcc/fortran/ChangeLog:
> 
>   PR fortran/99514
>   * resolve.c (resolve_symbol): Accept vars which are in DATA
>   and hence (either) implicit SAVE (or in common).
> 
> gcc/testsuite/ChangeLog:
> 
>   PR fortran/99514
>   * gfortran.dg/gomp/threadprivate-1.f90: New test.

Ok, thanks.

Jakub



Re: [Patch] OpenMP: Fix 'omp declare target' handling for vars [PR99509]

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 10, 2021 at 03:20:42PM +0100, Tobias Burnus wrote:
> The C/C++ FE sets for an 'omp declare target' ... 'omp end declare target'
> the attribute 'omp declare target implicit'.
> 
> That's later processed (for C++) in decl.c - which removes that attribute
> and either keeps an explicit 'omp declare target' or 'omp declare target link'
> attribute.
> 
> Unfortunately, adding 'omp declare target' comes too late as the varpool
> has been generated.

Looking up attributes on every get rather than just once is IMHO too expensive.
For explicit declare target, we use:
  symtab_node *node = symtab_node::get (t);
  if (node != NULL)
    {
      node->offloadable = 1;
      if (ENABLE_OFFLOADING)
        {
          g->have_offload = true;
          if (is_a <varpool_node *> (node))
            vec_safe_push (offload_vars, t);
        }
    }
(e.g. from cp/parser.c).
So, I think it would be better to do the same thing when we turn
an "omp declare target implicit" into "omp declare target", i.e. in
cp/decl.c (cp_finish_decl) and in c/c-decl.c (finish_decl) - right
after adding the "omp declare target" attribute in there.
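
A minimal sketch of the trade-off, using a toy stand-in rather than GCC's
real symtab_node (ToyNode, offloadable_by_lookup and
promote_implicit_declare_target are hypothetical names, and attributes are
modelled as plain strings): the flag is set once at the point where the
attribute is rewritten, so later queries are a field read instead of a walk
over the attribute list.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a symbol table node; this is not GCC's symtab_node.
struct ToyNode
{
  std::vector<std::string> attributes;
  bool offloadable = false;
};

// Expensive variant: walk the attribute list on every query.
bool
offloadable_by_lookup (const ToyNode &node)
{
  for (const std::string &attr : node.attributes)
    if (attr == "omp declare target")
      return true;
  return false;
}

// Cheap variant: when "omp declare target implicit" is turned into
// "omp declare target", set the flag once, right where the attribute
// is rewritten.
void
promote_implicit_declare_target (ToyNode &node)
{
  for (std::string &attr : node.attributes)
    if (attr == "omp declare target implicit")
      {
        attr = "omp declare target";
        node.offloadable = true;
      }
}
```

After the promotion both queries agree, but only the lookup variant pays for
the list walk each time it is asked.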

Jakub



Re: [PATCH] Change semantics of -frecord-gcc-switches and add -frecord-gcc-switches-format.

2021-03-12 Thread Martin Liška

PING^2

On 3/1/21 1:07 PM, Martin Liška wrote:

PING

On 2/18/21 10:18 AM, Martin Liška wrote:

On 2/16/21 10:17 PM, Qing Zhao wrote:

Hello,

What’s the status of this patch now? Is there any major technical issue with 
the patch?

Our company has been waiting for this patch for almost one year, we need it for 
our important application.


Hello.

You are right, we've been waiting for quite some time.



Could this one be approved and committed to gcc11?


@richi: ?

Thanks,
Martin



Thanks.

Qing


On Feb 5, 2021, at 3:34 AM, Martin Liška  wrote:

Hello.

Based on discussion with Richi, I'm re-sending the patch. Note that the patch
has been waiting for a review for almost one year and I would like to see
it in GCC 11.1.

Thank you,
Martin
<0001-Change-semantics-of-frecord-gcc-switches-and-add-fre.patch>


[PATCH 1/2] ipa-sra: Introduce a mini-DCE to tree-inline.c (PR 93385)

2021-03-12 Thread Martin Jambor
Hi,

PR 93385 reveals that if the user explicitly disables DCE, IPA-SRA
can leave behind statements which are useless because their results
are eventually not used but can have problematic side effects,
especially since their inputs are bogus now that useless parameters
were removed.

This patch fixes the problem by doing a def-use walk when
materializing clones, marking which statements should not be copied
and which SSA_NAMEs do not need to be computed because eventually they
would be DCEd.  Then tree-inline.c code simply does not copy them.
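
A rough sketch of such a def-use walk on a toy IR, not on GCC's gimple
(ToyStmt, DeadInfo and mark_dead are hypothetical names; the real code also
has to deal with PHI nodes and debug statements, which are left out here).
Starting from the removed parameter, every non-call statement that uses a
dead name is marked as not to be copied, and its result becomes dead too.
Calls are skipped because they are handled by call redirection.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy statement: defines SSA name `lhs` from the names in `uses`.
// is_call marks statements that are left for call redirection.
struct ToyStmt
{
  int lhs;
  std::vector<int> uses;
  bool is_call;
};

struct DeadInfo
{
  std::set<int> dead_ssas;           // names that need not be computed
  std::set<std::size_t> dead_stmts;  // indices of statements not to copy
};

DeadInfo
mark_dead (const std::vector<ToyStmt> &body, int removed_param)
{
  DeadInfo info;
  std::vector<int> worklist { removed_param };
  info.dead_ssas.insert (removed_param);
  while (!worklist.empty ())
    {
      int name = worklist.back ();
      worklist.pop_back ();
      for (std::size_t i = 0; i < body.size (); i++)
        {
          const ToyStmt &stmt = body[i];
          bool uses_name = false;
          for (int u : stmt.uses)
            if (u == name)
              uses_name = true;
          if (!uses_name || stmt.is_call)
            continue;
          info.dead_stmts.insert (i);
          // Queue the result only the first time it is seen.
          if (info.dead_ssas.insert (stmt.lhs).second)
            worklist.push_back (stmt.lhs);
        }
    }
  return info;
}
```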

See the follow-up patch for what this means for debug statements.

There is one exception to the above non-copying rule: If such an
SSA_NAME appears as an argument of a call, it needs to be removed by
call redirection and not as part of clone materialization.  So to have
something valid there until that time, this patch uses the VAR_DECL that
is already created to serve as a base of non-default-def SSA_NAMEs of
the removed parameter.  This is technically incorrect because the
argument may be only causally related to the removed parameter, but it
would be only there until call redirection and this avoids creating
another VAR_DECL, which was the major objection to this patch in summer.

I think I have managed to design a way to avoid it and generally
make the call statements a bit less clunky in between clone
materialization and call redirection. It involves storing essential
information in call summaries (also moving information from
clone_info::performed_splits there) but the patch is big even though I
have not yet finished writing it and it is not stage 4 material.

So, if we want to fix the issue in GCC 11 (and 10?), I would propose
this one - almost all of it is needed for the sophisticated fix too.  It
has passed bootstrap, LTO bootstrap and profiled LTO bootstrap and
testing on x86_64-linux and normal bootstrap and testing on
aarch64-linux.  Alternatively, we can just mitigate all known
manifestations of the issue with
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93385#c11 and only include
the sophisticated fix using call summaries in GCC 12.  (Or we can of
course decide to do nothing for 11.1 and wait for the final patch in GCC
12.)

Thanks,

Martin



gcc/ChangeLog:

2021-03-10  Martin Jambor  

PR ipa/93385
* ipa-param-manipulation.h (class ipa_param_body_adjustments): New
members m_dead_stmts, m_dead_ssas, mark_dead_statements and
modify_call_argument.
* ipa-param-manipulation.c (phi_arg_will_live_p): New function.
(ipa_param_body_adjustments::mark_dead_statements): New method.
(ipa_param_body_adjustments::common_initialization): Call it.
(ipa_param_body_adjustments::ipa_param_body_adjustments): Initialize
new members.
(ipa_param_body_adjustments::modify_call_argument): New.
(ipa_param_body_adjustments::modify_call_stmt): Use
modify_call_argument.
* tree-inline.c (remap_gimple_stmt): Do not copy dead statements,
reset dead debug statements.
(copy_phis_for_bb): Do not copy dead PHI nodes.

gcc/testsuite/ChangeLog:

2020-05-14  Martin Jambor  

PR ipa/93385
* gcc.dg/ipa/pr93385.c: New test.
* gcc.dg/ipa/ipa-sra-23.c: Likewise.
---
 gcc/ipa-param-manipulation.c  | 144 --
 gcc/ipa-param-manipulation.h  |   9 ++
 gcc/testsuite/gcc.dg/ipa/ipa-sra-23.c |  24 +
 gcc/testsuite/gcc.dg/ipa/pr93385.c|  27 +
 gcc/tree-inline.c |  18 +++-
 5 files changed, 208 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/ipa-sra-23.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr93385.c

diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index 132bb24f76a..f6b569a8a8b 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -970,6 +970,97 @@ ipa_param_body_adjustments::carry_over_param (tree t)
   return new_parm;
 }
 
+/* Return true if BLOCKS_TO_COPY is NULL or if PHI has an argument ARG in
+   position that corresponds to an edge that is coming from a block that has
+   the corresponding bit set in BLOCKS_TO_COPY.  */
+
+static bool
+phi_arg_will_live_p (gphi *phi, bitmap blocks_to_copy, tree arg)
+{
+  bool arg_will_survive = false;
+  if (!blocks_to_copy)
+    arg_will_survive = true;
+  else
+    for (unsigned i = 0; i < gimple_phi_num_args (phi); i++)
+      if (gimple_phi_arg_def (phi, i) == arg
+          && bitmap_bit_p (blocks_to_copy,
+                           gimple_phi_arg_edge (phi, i)->src->index))
+        {
+          arg_will_survive = true;
+          break;
+        }
+  return arg_will_survive;
+}
+
+/* Populate m_dead_stmts given that DEAD_PARAM is going to be removed without
+   any replacement or splitting.  REPL is the replacement VAR_SECL to base any
+   remaining uses of a removed parameter on.  */
+
+void
+ipa_param_body_adjustments::mark_dead_statements (tree dead_param

[PATCH 2/2] ipa-sra: Improve debug info for removed parameters (PR 93385)

2021-03-12 Thread Martin Jambor
Hi,

the previous patch fixed issues with actual code left behind after
IPA-SRA removed a parameter but it might cause debug info regressions as
the patch only reset all affected debug bind statements.  This one
updates them with expressions which can allow the debugger to print the
removed value - see the added test-case.

Even though I originally did not want to create DEBUG_EXPR_DECLs for
intermediate values, I ended up doing so, because otherwise the code
started creating statements like

   # DEBUG __aD.198693 => &MEM[(const struct _Alloc_nodeD.171110 *)D#195]._M_tD.184726->_M_implD.171154

which is not only a bit scary but gimple-fold also ICEs on it.
Therefore I decided they are probably quite necessary and have them.

The patch simply notes each removed SSA name present in a debug
statement and then works from it backwards, looking if it can
reconstruct the expression it represents (which can fail if a
non-degenerate PHI node is in the way).  If it can, it populates two
hash maps with those expressions so that 1) removed assignments are
replaced with a debug bind defining a new intermediate debug_decl_expr
and 2) existing debug binds that refer to SSA names that are being
removed now refer to corresponding debug_decl_exprs.
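
The backwards walk can be sketched roughly as follows, on a toy
representation rather than real gimple (ToyDef and reconstruct are
hypothetical names, and expressions are built as strings purely for
illustration).  A name with no removed definition is treated as a leaf, and
hitting a PHI node makes the whole reconstruction fail.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy definition of a removed SSA name: either a PHI or `op` applied to
// operands.  Not a GCC data structure.
struct ToyDef
{
  bool is_phi;
  std::string op;
  std::vector<int> operands;
};

// Try to rebuild the expression NAME stands for.  Returns false if a PHI
// is in the way; on success OUT holds the expression text.
bool
reconstruct (int name, const std::map<int, ToyDef> &defs, std::string &out)
{
  std::map<int, ToyDef>::const_iterator it = defs.find (name);
  if (it == defs.end ())
    {
      // No removed definition: treat the name as a leaf.
      out = "_" + std::to_string (name);
      return true;
    }
  const ToyDef &def = it->second;
  if (def.is_phi)
    return false;
  std::string result = def.op + "(";
  for (std::size_t i = 0; i < def.operands.size (); i++)
    {
      std::string sub;
      if (!reconstruct (def.operands[i], defs, sub))
        return false;
      if (i != 0)
        result += ", ";
      result += sub;
    }
  out = result + ")";
  return true;
}
```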

If a removed parameter is passed to another function, the debugging
information still cannot describe its value there - see the xfailed
test in the testcase.  Solving the problem requires tracking
information about the respective debug decls on the side somehow
between clone materialization and edge redirection phases (since
inserting them directly to the IL was not met with much enthusiasm in
the summer) and so I will only attempt this time.

Note that I have not actually observed a debug info regression, so the
patch may be only stage1 material, your call.  It has passed bootstrap,
and (profiled) LTO bootstrap and testing on x86_64-linux.

Thanks,

Martin


gcc/ChangeLog:

2021-03-10  Martin Jambor  

PR ipa/93385
* ipa-param-manipulation.h (class ipa_param_body_adjustments): New
members remap_with_debug_expressions, m_dead_ssa_debug_equiv,
m_dead_stmt_debug_equiv and prepare_debug_expressions.  Added
parameter to mark_dead_statements.
* ipa-param-manipulation.c: Include tree-phinodes.h and cfgexpand.h.
(ipa_param_body_adjustments::mark_dead_statements): New parameter
debugstack, push into it all SSA names used in debug statements,
produce m_dead_ssa_debug_equiv mapping for the removed param.
(replace_with_mapped_expr): New function.
(ipa_param_body_adjustments::remap_with_debug_expressions): Likewise.
(ipa_param_body_adjustments::prepare_debug_expressions): Likewise.
(ipa_param_body_adjustments::common_initialization): Gather and
process SSA which will be removed but are in debug statements.  Avoid
processing debug info of parameters removed in previous clones.
Simplify.
(ipa_param_body_adjustments::ipa_param_body_adjustments): Initialize
new members.
* tree-inline.c (remap_gimple_stmt): Create a debug bind when possible
when avoiding a copy of an unnecessary statement.  Remap removed SSA
names in existing debug statements.
(tree_function_versioning): Do not create DEBUG_EXPR_DECL for removed
parameters if we have already done so.

gcc/testsuite/ChangeLog:

2021-03-10  Martin Jambor  

PR ipa/93385
* gcc.dg/guality/ipa-sra-1.c: New test.
---
 gcc/ipa-param-manipulation.c | 268 ++-
 gcc/ipa-param-manipulation.h |  13 +-
 gcc/testsuite/gcc.dg/guality/ipa-sra-1.c |  45 
 gcc/tree-inline.c|  45 ++--
 4 files changed, 297 insertions(+), 74 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/guality/ipa-sra-1.c

diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index f6b569a8a8b..6f812ecfdec 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -43,7 +43,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "alloc-pool.h"
 #include "symbol-summary.h"
 #include "symtab-clones.h"
-
+#include "tree-phinodes.h"
+#include "cfgexpand.h"
 
 /* Actual prefixes of different newly synthetized parameters.  Keep in sync
with IPA_PARAM_PREFIX_* defines.  */
@@ -994,10 +995,12 @@ phi_arg_will_live_p (gphi *phi, bitmap blocks_to_copy, tree arg)
 
 /* Populate m_dead_stmts given that DEAD_PARAM is going to be removed without
any replacement or splitting.  REPL is the replacement VAR_SECL to base any
-   remaining uses of a removed parameter on.  */
+   remaining uses of a removed parameter on.  Add any SSA names identified to
+   be dead that are used in debug statements to DEBUGSTACK.  */
 
 void
-ipa_param_body_adjustments::mark_dead_statements (tree dead_param, tree repl)
+ipa_param_body_adjustments::mark_dead_statements (tree dead_param, tree repl,
+ 

Re: [PATCH] x86: Update 'P' operand modifier for -fno-plt

2021-03-12 Thread H.J. Lu via Gcc-patches
On Thu, Mar 11, 2021 at 11:27 PM Uros Bizjak  wrote:
>
> On Thu, Mar 11, 2021 at 11:22 PM H.J. Lu  wrote:
> >
> > Update 'P' operand modifier for -fno-plt to support inline assembly
> > statements.  In 64-bit, we can always load function address with
> > @GOTPCREL.  In 32-bit, we load function address with @GOT only for
> > non-PIC since PIC register may not be available at call site.
> >
> > gcc/
> >
> > PR target/99504
> > * config/i386/i386.c (ix86_print_operand): Update 'P' handling
> > for -fno-plt.
> >
> > gcc/testsuite/
> >
> > PR target/99504
> > * gcc.target/i386/pr99530-1.c: New test.
> > * gcc.target/i386/pr99530-2.c: Likewise.
> > * gcc.target/i386/pr99530-3.c: Likewise.
> > * gcc.target/i386/pr99530-4.c: Likewise.
> > * gcc.target/i386/pr99530-5.c: Likewise.
> > * gcc.target/i386/pr99530-6.c: Likewise.
> > ---
> >  gcc/config/i386/i386.c| 33 +--
> >  gcc/testsuite/gcc.target/i386/pr99530-1.c | 11 
> >  gcc/testsuite/gcc.target/i386/pr99530-2.c | 11 
> >  gcc/testsuite/gcc.target/i386/pr99530-3.c | 11 
> >  gcc/testsuite/gcc.target/i386/pr99530-4.c | 11 
> >  gcc/testsuite/gcc.target/i386/pr99530-5.c | 11 
> >  gcc/testsuite/gcc.target/i386/pr99530-6.c | 11 
> >  7 files changed, 97 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-5.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-6.c
> >
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index 260f87b..8733fcecf65 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -12701,7 +12701,8 @@ print_reg (rtx x, int code, FILE *file)
> > y -- print "st(0)" instead of "st" as a register.
> > d -- print duplicated register operand for AVX instruction.
> > D -- print condition for SSE cmp instruction.
> > -   P -- if PIC, print an @PLT suffix.
> > +   P -- if PIC, print an @PLT suffix.  For -fno-plt, load function
> > +   address from GOT.
> > p -- print raw symbol name.
> > X -- don't print any sort of PIC '@' suffix for a symbol.
> > & -- print some in-use local-dynamic symbol name.
> > @@ -13445,7 +13446,35 @@ ix86_print_operand (FILE *file, rtx x, int code)
> >   x = const0_rtx;
> > }
> >
> > -  if (code != 'P' && code != 'p')
> > +  if (code == 'P')
> > +   {
> > + if (current_output_insn == NULL_RTX
> > + && (TARGET_64BIT || (!flag_pic && HAVE_AS_IX86_GOT32X))
> > + && !TARGET_PECOFF
> > + && !TARGET_MACHO
> > + && ix86_cmodel != CM_LARGE
> > + && ix86_cmodel != CM_LARGE_PIC
> > + && GET_CODE (x) == SYMBOL_REF
> > + && SYMBOL_REF_FUNCTION_P (x)
> > + && (!flag_plt
> > + || (SYMBOL_REF_DECL (x)
> > + && lookup_attribute ("noplt",
> > +  DECL_ATTRIBUTES (SYMBOL_REF_DECL (x)))))
> > + && !SYMBOL_REF_LOCAL_P (x))
>
> You can use ix86_force_load_from_GOT_p instead.
>

bool
ix86_force_load_from_GOT_p (rtx x)
{
  return ((TARGET_64BIT || HAVE_AS_IX86_GOT32X)
          && !TARGET_PECOFF && !TARGET_MACHO
          && !flag_pic    <--- This doesn't work with -fPIC/-fPIE.
          && ix86_cmodel != CM_LARGE
          && GET_CODE (x) == SYMBOL_REF
          && SYMBOL_REF_FUNCTION_P (x)
          && (!flag_plt
              || (SYMBOL_REF_DECL (x)
                  && lookup_attribute ("noplt",
                                       DECL_ATTRIBUTES (SYMBOL_REF_DECL (x)))))
          && !SYMBOL_REF_LOCAL_P (x));
}
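
To make the difference concrete, here is a small sketch that models only the
flag-related parts of the two conditions quoted above (ToyTarget,
helper_allows_got and patch_allows_got are made-up names; the symbol checks,
code model checks and PECOFF/Mach-O exclusions are left out):

```cpp
#include <cassert>

// Toy model of the target flags involved; names are illustrative only.
struct ToyTarget
{
  bool is_64bit;
  bool flag_pic;
  bool has_got32x;   // stands in for HAVE_AS_IX86_GOT32X
};

// Mirrors only the flag-related part of ix86_force_load_from_GOT_p.
bool
helper_allows_got (const ToyTarget &t)
{
  return (t.is_64bit || t.has_got32x) && !t.flag_pic;
}

// Mirrors only the flag-related part of the condition in the patch.
bool
patch_allows_got (const ToyTarget &t)
{
  return t.is_64bit || (!t.flag_pic && t.has_got32x);
}
```

For a 64-bit PIC target the helper says no while the patch condition says
yes; for 32-bit the two agree.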

-- 
H.J.


[GCC 10] aarch64: Set AARCH64_EXTRA_TUNE_PREFER_ADVSIMD_AUTOVEC for Neoverse N2

2021-03-12 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

This patch tweaks the Neoverse N2 tuning on the GCC 10 branch to have it in 
line with GCC 8 and 9 to prefer AdvancedSIMD over SVE for auto-vectorisation.

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to the branch.

Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64.c (neoversen2_tunings): Set
AARCH64_EXTRA_TUNE_PREFER_ADVSIMD_AUTOVEC tune_flags.


asimd-vec-n2-10.patch
Description: asimd-vec-n2-10.patch


Re: [PATCH] x86: Update 'P' operand modifier for -fno-plt

2021-03-12 Thread H.J. Lu via Gcc-patches
On Thu, Mar 11, 2021 at 11:21 PM Uros Bizjak  wrote:
>
> On Thu, Mar 11, 2021 at 11:22 PM H.J. Lu  wrote:
> >
> > Update 'P' operand modifier for -fno-plt to support inline assembly
> > statements.  In 64-bit, we can always load function address with
> > @GOTPCREL.  In 32-bit, we load function address with @GOT only for
> > non-PIC since PIC register may not be available at call site.
> >
> > gcc/
> >
> > PR target/99504
> > * config/i386/i386.c (ix86_print_operand): Update 'P' handling
> > for -fno-plt.
> >
> > gcc/testsuite/
> >
> > PR target/99504
> > * gcc.target/i386/pr99530-1.c: New test.
> > * gcc.target/i386/pr99530-2.c: Likewise.
> > * gcc.target/i386/pr99530-3.c: Likewise.
> > * gcc.target/i386/pr99530-4.c: Likewise.
> > * gcc.target/i386/pr99530-5.c: Likewise.
> > * gcc.target/i386/pr99530-6.c: Likewise.
> > ---
> >  gcc/config/i386/i386.c| 33 +--
> >  gcc/testsuite/gcc.target/i386/pr99530-1.c | 11 
> >  gcc/testsuite/gcc.target/i386/pr99530-2.c | 11 
> >  gcc/testsuite/gcc.target/i386/pr99530-3.c | 11 
> >  gcc/testsuite/gcc.target/i386/pr99530-4.c | 11 
> >  gcc/testsuite/gcc.target/i386/pr99530-5.c | 11 
> >  gcc/testsuite/gcc.target/i386/pr99530-6.c | 11 
> >  7 files changed, 97 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-5.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-6.c
> >
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index 260f87b..8733fcecf65 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -12701,7 +12701,8 @@ print_reg (rtx x, int code, FILE *file)
> > y -- print "st(0)" instead of "st" as a register.
> > d -- print duplicated register operand for AVX instruction.
> > D -- print condition for SSE cmp instruction.
> > -   P -- if PIC, print an @PLT suffix.
> > +   P -- if PIC, print an @PLT suffix.  For -fno-plt, load function
> > +   address from GOT.
> > p -- print raw symbol name.
> > X -- don't print any sort of PIC '@' suffix for a symbol.
> > & -- print some in-use local-dynamic symbol name.
> > @@ -13445,7 +13446,35 @@ ix86_print_operand (FILE *file, rtx x, int code)
> >   x = const0_rtx;
> > }
> >
> > -  if (code != 'P' && code != 'p')
> > +  if (code == 'P')
> > +   {
> > + if (current_output_insn == NULL_RTX
> > + && (TARGET_64BIT || (!flag_pic && HAVE_AS_IX86_GOT32X))
> > + && !TARGET_PECOFF
> > + && !TARGET_MACHO
> > + && ix86_cmodel != CM_LARGE
> > + && ix86_cmodel != CM_LARGE_PIC
> > + && GET_CODE (x) == SYMBOL_REF
> > + && SYMBOL_REF_FUNCTION_P (x)
> > + && (!flag_plt
> > + || (SYMBOL_REF_DECL (x)
> > + && lookup_attribute ("noplt",
> > +  DECL_ATTRIBUTES (SYMBOL_REF_DECL (x)))))
> > + && !SYMBOL_REF_LOCAL_P (x))
> > +   {
> > + /* For inline assembly statement, load function address
> > +from GOT with 'P' operand modifier to avoid PLT.
> > +NB: This works only with call or jmp.  */
> > + const char *xasm;
> > + if (TARGET_64BIT)
> > +   xasm = "{*%p0@GOTPCREL(%%rip)|[QWORD PTR %p0@GOTPCREL[rip]]}";
> > + else
> > +   xasm = "{*%p0@GOT|[DWORD PTR %p0@GOT]}";
> > + output_asm_insn (xasm, &x);
> > + return;
>
> This should be handled in output_pic_addr_const.
>

call/jmp are special and are handled by ix86_output_call_insn,
not output_pic_addr_const.

H.J.


Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Mon, Mar 08, 2021 at 07:37:46PM -0700, Martin Sebor via Gcc-patches wrote:
> Accesses to zero-length arrays continue to be diagnosed (except for
> trailing arrays of unknown objects), as are nonempty accesses to empty
> types.
> 
> The warning message for (3) remains unchanged, i.e., for the following:
> 
>   struct S { } a[3];
> 
>   void g (int n)
>   {
> ((int*)a)[0] = 0;
>   }
> 
> it's:
> 
>   warning: array subscript 0 is outside array bounds of ‘struct S[3]’
> [-Warray-bounds]

As I tried to explain several times, this is completely unacceptable to me.
We want to warn, I agree with that, but we don't care that we emit a completely
nonsensical warning?
The user will be just confused by that, will (rightly) think it is a bug in
the compiler and might not fix the actual bug.
Array subscript 0 is not outside of those array bounds.

If you don't want a shortcut that for arrays with zero sized elements
and for non-array objects with zero size uses a different code path
that would just warn
"access outside of zero sized object %qD"
(which is what I'd prefer, any non-zero sized access to object of zero size
unless it is flexible array member or decl with flexible array member
is undefined, whatever offset you use)
and want to reuse the current code, at least please change reftype
to build_printable_array_type (TREE_TYPE (ref), 0);
so that it prints
warning: array subscript 0 is outside array bounds of ‘int[0]’ [-Warray-bounds]
You'll need to redo the:
      || !COMPLETE_TYPE_P (reftype)
      || TREE_CODE (TYPE_SIZE_UNIT (reftype)) != INTEGER_CST)
    return false;
check on the new reftype.
It should be done not just where you currently do:
  nelts = integer_zero_node;
  eltsize = 1;
but also somewhere in:
  eltsize = 1;
  tree size = TYPE_SIZE_UNIT (reftype);
  if (VAR_P (arg))
    if (tree initsize = DECL_SIZE_UNIT (arg))
      if (tree_int_cst_lt (size, initsize))
        size = initsize;

  arrbounds[1] = wi::to_offset (size);
below that for the case where integer_zerop (size) && TYPE_EMPTY_P (reftype).
There should be also:

  struct S { } a;

  void g (int n)
  {
((int*)&a)[0] = 0;
  }

testcase that covers that.
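
What I have in mind for the shortcut is roughly the following, written as a
standalone toy and not as a patch against the array-bounds code (ToyVerdict
and classify_access are hypothetical names, sizes and offsets are in bytes):
a non-empty access to a zero-sized object gets its own verdict whatever the
offset, unless a flexible array member is involved, and everything else goes
through the ordinary range check.

```cpp
#include <cassert>

// Toy bounds check; this is not the code in GCC's array-bounds pass.
enum ToyVerdict { IN_BOUNDS, OUT_OF_BOUNDS, ZERO_SIZED_OBJECT };

ToyVerdict
classify_access (unsigned long object_size, long offset,
                 unsigned long access_size, bool has_flexible_array_member)
{
  // Empty accesses are not diagnosed in this sketch.
  if (access_size == 0)
    return IN_BOUNDS;
  if (object_size == 0)
    // Flexible array members are exempt, their size is simply unknown.
    return has_flexible_array_member ? IN_BOUNDS : ZERO_SIZED_OBJECT;
  if (offset < 0)
    return OUT_OF_BOUNDS;
  unsigned long start = static_cast<unsigned long> (offset);
  if (start > object_size || access_size > object_size - start)
    return OUT_OF_BOUNDS;
  return IN_BOUNDS;
}
```

The ZERO_SIZED_OBJECT verdict is what would map to the separate
"access outside of zero sized object" wording.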

BTW, what is the reason behind the POINTER_TYPE_P check in:
  /* The type of the object being referred to.  It can be an array,
 string literal, or a non-array type when the MEM_REF represents
 a reference/subscript via a pointer to an object that is not
 an element of an array.  Incomplete types are excluded as well
 because their size is not known.  */
  reftype = TREE_TYPE (arg);
  if (POINTER_TYPE_P (reftype)
?
It disables the second warning on:
  __UINTPTR_TYPE__ a;
  void *b;

  void g (int n)
  {
((int*)&a)[4] = 0;
((int*)&b)[4] = 0;
  }

I really don't see what is different on vars with pointer type vs. vars with
non-pointer type of the same size for this warning.

Jakub



Re: use -mno-strict-align for strlenopt-80.c on powerpc

2021-03-12 Thread Iain Sandoe via Gcc-patches

Segher Boessenkool  wrote:


On Wed, Mar 10, 2021 at 02:41:50AM -0300, Alexandre Oliva wrote:

ppc configurations that have -mstrict-align enabled by default fail
gcc.dg/strlenopt-80.c, because some memcpy calls don't get turned into
MEM_REFs, which defeats the tested-for strlen optimization.

This was regstrapped on x86_64-linux-gnu, tested with a cross to a
ppc64-vxworks7r2 configured with -mstrict-align enabled by default,
and I'm now also regstrapping on ppc64-linux-gnu just to be sure.
Ok to install?


The -mstrict-align option is defined in sysv4.opt, which is not used in
all configurations (like, powerpc64-darwin*, and the AIX configs).


this works for me on powerpc-darwin, I haven’t tried to test AIX.

Iain

diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c b/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
index 554cd0c..6d51ea8 100644
--- a/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
@@ -1,7 +1,9 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_prefixed_addr } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mfloat128 " } */
+/* { dg-additional-options "-mno-strict-align" { target { ! { *-*-darwin* *-*-aix* } } } } */
+/* { dg-prune-output ".-mfloat128. option may not be fully supported" } */

 /* Tests whether we generate a prefixed load/store operation for addresses that
don't meet DS/DQ offset constraints.  64-bit is needed for testing the use



Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 12, 2021 at 09:35:00AM +0100, Uros Bizjak via Gcc-patches wrote:
> Perhaps we can introduce another Y... constraint for AVX512BW and use
> it here. I think they will be used in other places, too.

Ok, added YW constraint and used that for those mmx*{ins,ext}* as well
as _psadbw.
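
The difference between the existing Yw constraint and the new YW one can be
sketched like this; it is a toy model of the two register-class conditions
from constraints.md, not GCC code (ToyIsa, ToyRegClass, constraint_Yw and
constraint_YW are made-up names):

```cpp
#include <cassert>

// Toy register classes and ISA flags; illustrative only.
enum ToyRegClass { NO_REGS, SSE_REGS, ALL_SSE_REGS };

struct ToyIsa
{
  bool sse;
  bool avx512bw;
  bool avx512vl;
};

// Yw: %xmm16+ only when both AVX512BW and AVX512VL are enabled.
ToyRegClass
constraint_Yw (const ToyIsa &isa)
{
  return (isa.avx512bw && isa.avx512vl) ? ALL_SSE_REGS
         : isa.sse ? SSE_REGS : NO_REGS;
}

// YW: %xmm16+ as soon as AVX512BW is enabled, VL is not required.
ToyRegClass
constraint_YW (const ToyIsa &isa)
{
  return isa.avx512bw ? ALL_SSE_REGS
         : isa.sse ? SSE_REGS : NO_REGS;
}
```

With -mavx512vl -mno-avx512bw both fall back to SSE_REGS, which is the case
the PR is about; they only differ when AVX512BW is on without AVX512VL.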

Here is what I've committed to trunk after another bootstrap/regtest
on x86_64-linux and i686-linux:

2021-03-12  Jakub Jelinek  

PR target/99321
* config/i386/constraints.md (YW): New internal constraint.
* config/i386/sse.md (v_Yw): Add V4TI, V2TI, V1TI and TI cases.
(*_3,
*_uavg3, *abs2,
*mul3_highpart): Use  instead of v in
constraints.
(_psadbw): Use YW instead of v in constraints.
(*avx2_pmaddwd, *sse2_pmaddwd, *v8hi3, *v16qi3,
avx2_pmaddubsw256, ssse3_pmaddubsw128): Merge last two alternatives
into one, use Yw instead of former x,v.
(ashr3, 3): Use  instead of x in constraints of
the last alternative.
(_packsswb, _packssdw,
_packuswb, _packusdw,
*_pmulhrsw3, _palignr,
_pshufb3): Merge last two alternatives
into one, use  instead of former x,v.
(avx2_interleave_highv32qi,
vec_interleave_highv16qi): Use Yw instead of v in
constraints.  Add &&  to condition.
(avx2_interleave_lowv32qi,
vec_interleave_lowv16qi,
avx2_interleave_highv16hi,
vec_interleave_highv8hi,
avx2_interleave_lowv16hi, vec_interleave_lowv8hi,
avx2_pshuflw_1, sse2_pshuflw_1,
avx2_pshufhw_1, sse2_pshufhw_1,
avx2_v16qiv16hi2, sse4_1_v8qiv8hi2,
*sse4_1_v8qiv8hi2_1, _3): Use
Yw instead of v in constraints.
* config/i386/mmx.md (Yv_Yw): New define_mode_attr.
(*mmx_3, mmx_ashr3, mmx_3): Use 
instead of Yv in constraints.
(*mmx_3, *mmx_mulv4hi3, *mmx_smulv4hi3_highpart,
*mmx_umulv4hi3_highpart, *mmx_pmaddwd, *mmx_v4hi3,
*mmx_v8qi3, mmx_packswb, mmx_packssdw,
mmx_punpckhbw, mmx_punpcklbw, mmx_punpckhwd, mmx_punpcklwd,
*mmx_uavgv8qi3, *mmx_uavgv4hi3, mmx_psadbw): Use Yw instead of Yv in
constraints.
(*mmx_pinsrw, *mmx_pinsrb, *mmx_pextrw, *mmx_pextrw_zext, *mmx_pextrb,
*mmx_pextrb_zext): Use YW instead of Yv in constraints.
(*mmx_eq3, mmx_gt3): Use x instead of Yv in constraints.
(mmx_andnot3, *mmx_3): Split last alternative into
two, one with just x, another isa avx512vl with v.

* gcc.target/i386/avx512vl-pr99321-2.c: New test.

--- gcc/config/i386/constraints.md.jj   2021-03-07 10:27:21.655892748 +0100
+++ gcc/config/i386/constraints.md  2021-03-12 09:49:31.272583239 +0100
@@ -111,6 +111,8 @@ (define_register_constraint "v" "TARGET_
 ;; otherwise any SSE register
 ;;  w  any EVEX encodable SSE register for AVX512BW with TARGET_AVX512VL
 ;; target, otherwise any SSE register.
+;;  W   any EVEX encodable SSE register for AVX512BW target,
+;; otherwise any SSE register.
 
 (define_register_constraint "Yz" "TARGET_SSE ? SSE_FIRST_REG : NO_REGS"
  "First SSE register (@code{%xmm0}).")
@@ -151,6 +153,10 @@ (define_register_constraint "Yw"
  "TARGET_AVX512BW && TARGET_AVX512VL ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : NO_REGS"
  "@internal Any EVEX encodable SSE register (@code{%xmm0-%xmm31}) for AVX512BW with TARGET_AVX512VL target, otherwise any SSE register.")
 
+(define_register_constraint "YW"
+ "TARGET_AVX512BW ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : NO_REGS"
+ "@internal Any EVEX encodable SSE register (@code{%xmm0-%xmm31}) for AVX512BW target, otherwise any SSE register.")
+
 ;; We use the B prefix to denote any number of internal operands:
 ;;  f  FLAGS_REG
 ;;  g  GOT memory operand.
--- gcc/config/i386/sse.md.jj   2021-03-12 09:43:33.269556778 +0100
+++ gcc/config/i386/sse.md  2021-03-12 09:59:01.483271700 +0100
@@ -566,7 +566,8 @@ (define_mode_attr v_Yw
(V4SI "v") (V8SI "v") (V16SI "v")
(V2DI "v") (V4DI "v") (V8DI "v")
(V4SF "v") (V8SF "v") (V16SF "v")
-   (V2DF "v") (V4DF "v") (V8DF "v")])
+   (V2DF "v") (V4DF "v") (V8DF "v")
+   (TI "Yw") (V1TI "Yw") (V2TI "Yw") (V4TI "v")])
 
 (define_mode_attr sse2_avx_avx512f
   [(V16QI "sse2") (V32QI "avx") (V64QI "avx512f")
@@ -11736,10 +11737,10 @@ (define_expand "_
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
 (define_insn "*_3"
-  [(set (match_operand:VI12_AVX2_AVX512BW 0 "register_operand" "=x,v")
+  [(set (match_operand:VI12_AVX2_AVX512BW 0 "register_operand" "=x,")
(sat_plusminus:VI12_AVX2_AVX512BW
- (match_operand:VI12_AVX2_AVX512BW 1 "vector_operand" "0,v")
- (match_operand:VI12_AVX2_AVX512BW 2 "vector_operand" "xBm,vm")))]
+ (match_operand:VI12_AVX2_AVX512BW 1 "vector_operand" "0,")
+ (match_operand:VI12_AVX2_AVX512BW 2 "vector_operand" "xBm,m")))]
   "TARGET_SSE2 &&  && 
&& ix86_binary_operator_ok (, mode, operands)"
   "@
@@ -11827,14 +11828,

Re: [PATCH] avoid assuming gimple_call_alloc_size argument is a call (PR 99489)

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Tue, Mar 09, 2021 at 03:07:38PM -0700, Martin Sebor via Gcc-patches wrote:
> The gimple_call_alloc_size() function is documented to "return null
> when STMT is not a call to a valid allocation function" but the code
> assumes STMT is a call statement, causing the function to ICE when
> it isn't.
> 
> The attached patch changes the function to fulfill its contract and
> return null also when STMT isn't a call.  The fix seems obvious to
> me but I'll wait some time before committing it in case it's not
> to someone else.

I think the name of the function suggests that it should be called on calls,
not random stmts.  Currently the function has 3 callers, two of them
already verify is_gimple_call before calling it and only one doesn't,
and the stmt will never be NULL.
So I'd say it would be better to remove the if (!stmt) return NULL_TREE;
from the start of the function and add is_gimple_call (stmt) &&
in tree-ssa-strlen.c.

Jakub



Re: [PATCH][pushed] analyzer: document new param

2021-03-12 Thread David Malcolm via Gcc-patches
On Fri, 2021-03-12 at 09:45 +0100, Martin Liška wrote:
> Identified by my check that compares documentation of params
> with content of --help=param output.
> 
> Pushed as obvious.
> Martin

Thanks.

Which check is this, BTW?  (presumably this is something I should add
to my patch testing workflow)

Dave

> gcc/ChangeLog:
> 
> * doc/invoke.texi: Add missing param documentation.
> ---
>   gcc/doc/invoke.texi | 4 
>   1 file changed, 4 insertions(+)
> 
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 4a3c1e2fa0f..7a368959e5e 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -14362,6 +14362,10 @@ recurse deeper.
>   The maximum depth of a symbolic value, before approximating
>   the value as unknown.
>   
> +@item analyzer-max-infeasible-edges
> +The maximum number of infeasible edges to reject before declaring
> +a diagnostic as infeasible.
> +
>   @item gimple-fe-computed-hot-bb-threshold
>   The number of executions of a basic block which is considered hot.
>   The parameter is used only in GIMPLE FE.




Re: [PATCH][pushed] analyzer: document new param

2021-03-12 Thread Martin Liška

On 3/12/21 2:56 PM, David Malcolm wrote:

On Fri, 2021-03-12 at 09:45 +0100, Martin Liška wrote:

Identified by my check that compares documentation of params
with content of --help=param output.

Pushed as obvious.
Martin


Thanks.

Which check is this, BTW?  (presumably this is something I should add
to my patch testing workflow)


./gcc/xgcc -Bgcc --help=param &>params.txt
../contrib/check-params-in-docs.py ../gcc/doc/invoke.texi params.txt

Cheers,
Martin



Dave


gcc/ChangeLog:

 * doc/invoke.texi: Add missing param documentation.
---
   gcc/doc/invoke.texi | 4 
   1 file changed, 4 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4a3c1e2fa0f..7a368959e5e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14362,6 +14362,10 @@ recurse deeper.
   The maximum depth of a symbolic value, before approximating
   the value as unknown.
   
+@item analyzer-max-infeasible-edges

+The maximum number of infeasible edges to reject before declaring
+a diagnostic as infeasible.
+
   @item gimple-fe-computed-hot-bb-threshold
   The number of executions of a basic block which is considered hot.
   The parameter is used only in GIMPLE FE.







Re: [PATCH] avoid assuming gimple_call_alloc_size argument is a call (PR 99489)

2021-03-12 Thread David Malcolm via Gcc-patches
On Fri, 2021-03-12 at 14:52 +0100, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Mar 09, 2021 at 03:07:38PM -0700, Martin Sebor via Gcc-
> patches wrote:
> > The gimple_call_alloc_size() function is documented to "return null
> > when STMT is not a call to a valid allocation function" but the
> > code
> > assumes STMT is a call statement, causing the function to ICE when
> > it isn't.
> > 
> > The attached patch changes the function to fulfill its contract and
> > return null also when STMT isn't a call.  The fix seems obvious to
> > me but I'll wait some time before committing it in case it's not
> > to someone else.
> 
> I think the name of the function suggests that it should be called on
> calls,
> not random stmts.  Currently the function has 3 callers, two of them
> already verify is_gimple_call before calling it and only one doesn't,
> and the stmt will never be NULL.
> So I'd say it would be better to remove the if (!stmt) return
> NULL_TREE;
> from the start of the function and add is_gimple_call (stmt) &&
> in tree-ssa-strlen.c.

Maybe even convert it to take a "const gcall *", so those

  if (is_gimple_call (stmt))
    {
      ...
      if (gimple_call_alloc_size (stmt, ...))
        {
        }
    }

become:

  if (const gcall *call = dyn_cast <const gcall *> (stmt))
    {
      ...
      if (gimple_call_alloc_size (call, ...))
        {
        }
    }

so that the compiler can enforce this requirement via the type system?
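
To make the idea concrete, here is a minimal standalone sketch of the pattern. It does not use GCC's real classes: stmt_base, call_stmt, assign_stmt, as_call, call_alloc_size and alloc_size_or_zero are all hypothetical stand-ins, and as_call only imitates what dyn_cast does for gimple statements.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, heavily simplified stand-ins for a statement hierarchy.
struct stmt_base
{
  enum kind { CALL, ASSIGN } code;
  explicit stmt_base (kind k) : code (k) {}
};

struct call_stmt : stmt_base
{
  std::size_t alloc_size;
  explicit call_stmt (std::size_t n) : stmt_base (CALL), alloc_size (n) {}
};

struct assign_stmt : stmt_base
{
  assign_stmt () : stmt_base (ASSIGN) {}
};

// Checked downcast in the spirit of dyn_cast: returns null when S is not
// a call statement.
inline const call_stmt *
as_call (const stmt_base *s)
{
  return s->code == stmt_base::CALL
	 ? static_cast<const call_stmt *> (s) : nullptr;
}

// Because the parameter has the derived type, passing an arbitrary
// statement without going through as_call is a compile-time error.
inline std::size_t
call_alloc_size (const call_stmt *call)
{
  assert (call != nullptr);
  return call->alloc_size;
}

// Caller pattern: test and downcast in a single step.
inline std::size_t
alloc_size_or_zero (const stmt_base *s)
{
  if (const call_stmt *call = as_call (s))
    return call_alloc_size (call);
  return 0;
}
```

With this shape, a caller holding only a stmt_base pointer cannot reach call_alloc_size without first performing the checked cast.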

Hope this is constructive
Dave



Re: [PATCH] avoid assuming gimple_call_alloc_size argument is a call (PR 99489)

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 12, 2021 at 09:04:33AM -0500, David Malcolm wrote:
> > from the start of the function and add is_gimple_call (stmt) &&
> > in tree-ssa-strlen.c.
> 
> Maybe even convert it to take a "const gcall *", so those
> 
>   if (is_gimple_call (stmt))
>     {
>       ...
>       if (gimple_call_alloc_size (stmt, ...))
>         {
>         }
>     }
> 
> become:
> 
>   if (const gcall *call = dyn_cast <const gcall *> (stmt))
>     {
>       ...
>       if (gimple_call_alloc_size (call, ...))
>         {
>         }
>     }

I'm not a big fan of that; to me it means too much typing/clutter, and I
think the runtime checking we have is sufficient, but I could live with
that.

Jakub



Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 2:38 PM Jakub Jelinek  wrote:
>
> On Fri, Mar 12, 2021 at 09:35:00AM +0100, Uros Bizjak via Gcc-patches wrote:
> > Perhaps we can introduce another Y... constraint for AVX512BW and use
> > it here. I think they will be used in other places, too.
>
> Ok, added YW constraint and used that for those mmx*{ins,ext}* as well
> as _psadbw.
>
> Here is what I've committed to trunk after another bootstrap/regtest
> on x86_64-linux and i686-linux:
>
> 2021-03-12  Jakub Jelinek  
>
> PR target/99321
> * config/i386/constraints.md (YW): New internal constraint.
> * config/i386/sse.md (v_Yw): Add V4TI, V2TI, V1TI and TI cases.
> (*_3,
> *_uavg3, *abs2,
> *mul3_highpart): Use  instead of v in
> constraints.
> (_psadbw): Use YW instead of v in constraints.
> (*avx2_pmaddwd, *sse2_pmaddwd, *v8hi3, *v16qi3,
> avx2_pmaddubsw256, ssse3_pmaddubsw128): Merge last two alternatives
> into one, use Yw instead of former x,v.
> (ashr3, 3): Use  instead of x in constraints
> of the last alternative.
> (_packsswb, _packssdw,
> _packuswb, _packusdw,
> *_pmulhrsw3, _palignr,
> _pshufb3): Merge last two alternatives
> into one, use  instead of former x,v.
> (avx2_interleave_highv32qi,
> vec_interleave_highv16qi): Use Yw instead of v in
> constraints.  Add &&  to condition.
> (avx2_interleave_lowv32qi,
> vec_interleave_lowv16qi,
> avx2_interleave_highv16hi,
> vec_interleave_highv8hi,
> avx2_interleave_lowv16hi, 
> vec_interleave_lowv8hi,
> avx2_pshuflw_1, sse2_pshuflw_1,
> avx2_pshufhw_1, sse2_pshufhw_1,
> avx2_v16qiv16hi2, sse4_1_v8qiv8hi2,
> *sse4_1_v8qiv8hi2_1, _3): Use
> Yw instead of v in constraints.
> * config/i386/mmx.md (Yv_Yw): New define_mode_attr.
> (*mmx_3, mmx_ashr3, mmx_3): Use 
> instead of Yv in constraints.
> (*mmx_3, *mmx_mulv4hi3, *mmx_smulv4hi3_highpart,
> *mmx_umulv4hi3_highpart, *mmx_pmaddwd, *mmx_v4hi3,
> *mmx_v8qi3, mmx_packswb, mmx_packssdw,
> mmx_punpckhbw, mmx_punpcklbw, mmx_punpckhwd, mmx_punpcklwd,
> *mmx_uavgv8qi3, *mmx_uavgv4hi3, mmx_psadbw): Use Yw instead of Yv in
> constraints.
> (*mmx_pinsrw, *mmx_pinsrb, *mmx_pextrw, *mmx_pextrw_zext, *mmx_pextrb,
> *mmx_pextrb_zext): Use YW instead of Yv in constraints.
> (*mmx_eq3, mmx_gt3): Use x instead of Yv in constraints.
> (mmx_andnot3, *mmx_3): Split last alternative into
> two, one with just x, another isa avx512vl with v.
>
> * gcc.target/i386/avx512vl-pr99321-2.c: New test.
>
> --- gcc/config/i386/constraints.md.jj   2021-03-07 10:27:21.655892748 +0100
> +++ gcc/config/i386/constraints.md  2021-03-12 09:49:31.272583239 +0100
> @@ -111,6 +111,8 @@ (define_register_constraint "v" "TARGET_
>  ;; otherwise any SSE register
>  ;;  w  any EVEX encodable SSE register for AVX512BW with TARGET_AVX512VL
>  ;; target, otherwise any SSE register.
> +;;  W   any EVEX encodable SSE register for AVX512BW target,
> +;; otherwise any SSE register.
>
>  (define_register_constraint "Yz" "TARGET_SSE ? SSE_FIRST_REG : NO_REGS"
>   "First SSE register (@code{%xmm0}).")
> @@ -151,6 +153,10 @@ (define_register_constraint "Yw"
>   "TARGET_AVX512BW && TARGET_AVX512VL ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : NO_REGS"
>   "@internal Any EVEX encodable SSE register (@code{%xmm0-%xmm31}) for AVX512BW with TARGET_AVX512VL target, otherwise any SSE register.")
>
> +(define_register_constraint "YW"
> + "TARGET_AVX512BW ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : NO_REGS"
> + "@internal Any EVEX encodable SSE register (@code{%xmm0-%xmm31}) for AVX512BW target, otherwise any SSE register.")
> +
>  ;; We use the B prefix to denote any number of internal operands:
>  ;;  f  FLAGS_REG
>  ;;  g  GOT memory operand.
> --- gcc/config/i386/sse.md.jj   2021-03-12 09:43:33.269556778 +0100
> +++ gcc/config/i386/sse.md  2021-03-12 09:59:01.483271700 +0100
> @@ -566,7 +566,8 @@ (define_mode_attr v_Yw
> (V4SI "v") (V8SI "v") (V16SI "v")
> (V2DI "v") (V4DI "v") (V8DI "v")
> (V4SF "v") (V8SF "v") (V16SF "v")
> -   (V2DF "v") (V4DF "v") (V8DF "v")])
> +   (V2DF "v") (V4DF "v") (V8DF "v")
> +   (TI "Yw") (V1TI "Yw") (V2TI "Yw") (V4TI "v")])
>
>  (define_mode_attr sse2_avx_avx512f
>[(V16QI "sse2") (V32QI "avx") (V64QI "avx512f")
> @@ -11736,10 +11737,10 @@ (define_expand "_
>"ix86_fixup_binary_operands_no_copy (, mode, operands);")
>
>  (define_insn "*_3"
> -  [(set (match_operand:VI12_AVX2_AVX512BW 0 "register_operand" "=x,v")
> +  [(set (match_operand:VI12_AVX2_AVX512BW 0 "register_operand" "=x,")
> (sat_plusminus:VI12_AVX2_AVX512BW
> - (match_operand:VI12_AVX2_AVX512BW 1 "vector_operand" "0,v")
> - (match_operand:VI12_AVX2_AVX512BW 2 "vector_operand

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 12, 2021 at 03:34:09PM +0100, Uros Bizjak wrote:
> >  (define_insn "*avx2_pmaddwd"
> > -  [(set (match_operand:V8SI 0 "register_operand" "=x,v")
> > +  [(set (match_operand:V8SI 0 "register_operand" "=Yw")
> 
> I'm not sure contraction like this is correct. The problem is with vex
> vs. evex prefix, used in length calculation. When XMM16+ is used in
> the insn, and evex prefix is used, the unpatched version returns vex
> for alternative 0 due to (x,x,x) and evex for alternative 1, since one
> of registers satisfies only "v".
> 
> Patched version now always emits vex, which is wrong for XMM16+ registers.

That is true, but we have many thousands of those cases, where we just
use vex or maybe_vex or maybe_evex prefix with v or Yv or Yw or YW etc.
Adding extra alternatives for that would mean changing pretty much all of
sse.md.
I think it is much easier to imply length_evex when prefix is vex/maybe_vex when
any of the operands is EXT_REX_SSE_REG_P, subreg thereof,
MASK_REG_P, or subreg thereof.

The default definition of the "length" attribute has:
	 (ior (eq_attr "prefix" "evex")
	      (and (ior (eq_attr "prefix" "maybe_evex")
			(eq_attr "prefix" "maybe_vex"))
		   (match_test "TARGET_AVX512F")))
	   (plus (attr "length_evex")
		 (plus (attr "length_immediate")
		       (plus (attr "modrm")
			     (attr "length_address"))))
	 (ior (eq_attr "prefix" "vex")
	      (and (ior (eq_attr "prefix" "maybe_vex")
			(eq_attr "prefix" "maybe_evex"))
		   (match_test "TARGET_AVX")))
	   (plus (attr "length_vex")
		 (plus (attr "length_immediate")
		       (plus (attr "modrm")
			     (attr "length_address"))))]
That is just an extremely rough guess. Assuming that all insns with an
evex/maybe_evex/maybe_vex prefix will be EVEX encoded when TARGET_AVX512F
is clearly wrong; that is only true for instructions that don't have
a VEX counterpart (e.g. if they have a mnemonic that is EVEX only). Otherwise
it depends on whether the operands will be 64-byte (we can perhaps
use the mode attribute for that, at least by default) or whether any of the
operands is %[xy]mm16+ or %k*.
So (but I think this must be GCC 12 material) I'd say we should throw away
the maybe_evex prefix altogether (replace it with maybe_vex or vex),
use evex for the cases where the insn has EVEX-only mnemonics and otherwise
call some function to look at the operands and mode attribute.
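
As a rough illustration of what such a function might look at, here is a standalone C++ model, not GCC code. The types operand, reg_kind and encoding and the functions choose_encoding and prefix_length are hypothetical names invented for this sketch; the byte counts are the ones mentioned in this message (5 for EVEX, 2 + 1 or 3 + 1 for VEX). It deliberately ignores other reasons an insn may need EVEX, such as masking.

```cpp
#include <cassert>
#include <vector>

// Hypothetical operand description for this sketch only.
enum class reg_kind { sse, mask, other };

struct operand
{
  reg_kind kind;
  int regno;	// index within its class: 0..31 for SSE, 0..7 for mask
};

enum class encoding { vex, evex };

// EVEX is required when the mnemonic exists only in EVEX form, when the
// vector is 64 bytes wide, or when any operand is %xmm16+ or a %k register.
inline encoding
choose_encoding (const std::vector<operand> &ops, int vector_bytes,
		 bool evex_only_mnemonic)
{
  if (evex_only_mnemonic || vector_bytes == 64)
    return encoding::evex;
  for (const operand &op : ops)
    if (op.kind == reg_kind::mask
	|| (op.kind == reg_kind::sse && op.regno >= 16))
      return encoding::evex;
  return encoding::vex;
}

// Prefix plus opcode byte: 5 for EVEX, 3 + 1 or 2 + 1 for VEX.
inline int
prefix_length (encoding enc, bool vex_needs_3_bytes)
{
  if (enc == encoding::evex)
    return 5;
  return vex_needs_3_bytes ? 3 + 1 : 2 + 1;
}
```

The point of the sketch is only that the decision is a property of the concrete operands and vector width, not of the target flags alone.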

The maybe_vex decision on TARGET_AVX vs. !TARGET_AVX, on the other hand, is
correct; we have all those %v etc. to make sure we emit VEX/EVEX encoded
instructions with -mavx rather than the old SSE* encodings.

I think some years ago I used scripts and -dP etc. to verify the length
computations for vex; for evex it has unfortunately been known for years
that they are inaccurate.

Maybe it is conservatively correct?  length_evex is hardcoded to 5,
while length_vex is 3 + 1 or 2 + 1.

In that case, perhaps I should for now change the vex on that particular
instruction to maybe_vex.  But I still see several hundred instructions
with the vex prefix attribute and v etc. in constraints.
So, if we want a conservatively correct fix now, I'd say we should handle
even "vex" in the "length_evex" condition, so (untested):

2021-03-12  Jakub Jelinek  

* config/i386/i386.md (length): For TARGET_AVX512F treat
also vex prefix conservatively as length_evex.

--- gcc/config/i386/i386.md.jj  2021-03-05 21:51:33.675350463 +0100
+++ gcc/config/i386/i386.md 2021-03-12 16:24:49.302919436 +0100
@@ -686,6 +686,7 @@
   (attr "length_address")))
 (ior (eq_attr "prefix" "evex")
  (and (ior (eq_attr "prefix" "maybe_evex")
+   (eq_attr "prefix" "vex")
(eq_attr "prefix" "maybe_vex"))
   (match_test "TARGET_AVX512F")))
   (plus (attr "length_evex")

and if we eventually do something more accurate, perhaps vex could
stand for "do the operand >= %xmm16 or %k? check" (VEX can't ever encode
those), but perhaps without the mode attribute check?

Jakub



Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 4:28 PM Jakub Jelinek  wrote:
>
> On Fri, Mar 12, 2021 at 03:34:09PM +0100, Uros Bizjak wrote:
> > >  (define_insn "*avx2_pmaddwd"
> > > -  [(set (match_operand:V8SI 0 "register_operand" "=x,v")
> > > +  [(set (match_operand:V8SI 0 "register_operand" "=Yw")
> >
> > I'm not sure contraction like this is correct. The problem is with vex
> > vs. evex prefix, used in length calculation. When XMM16+ is used in
> > the insn, and evex prefix is used, the unpatched version returns vex
> > for alternative 0 due to (x,x,x) and evex for alternative 1, since one
> > of registers satisfies only "v".
> >
> > Patched version now always emits vex, which is wrong for XMM16+ registers.
>
> That is true, but we have many thousands of those cases, where we just
> use vex or maybe_vex or maybe_evex prefix with v or Yv or Yw or YW etc.
> Adding extra alternatives for that would mean changing pretty much all of
> sse.md.
> I think it is much easier to imply length_evex when prefix is vex/maybe_vex 
> when
> any of the operands is EXT_REX_SSE_REG_P, subreg thereof,
> MASK_REG_P, or subreg thereof.
>
> The default definition of the "length" attribute has:
>  (ior (eq_attr "prefix" "evex")
>       (and (ior (eq_attr "prefix" "maybe_evex")
>                 (eq_attr "prefix" "maybe_vex"))
>            (match_test "TARGET_AVX512F")))
>    (plus (attr "length_evex")
>          (plus (attr "length_immediate")
>                (plus (attr "modrm")
>                      (attr "length_address"))))
>  (ior (eq_attr "prefix" "vex")
>       (and (ior (eq_attr "prefix" "maybe_vex")
>                 (eq_attr "prefix" "maybe_evex"))
>            (match_test "TARGET_AVX")))
>    (plus (attr "length_vex")
>          (plus (attr "length_immediate")
>                (plus (attr "modrm")
>                      (attr "length_address"))))]
> That is just extremely rough guess, assuming all insns with
> evex/maybe_evex/maybe_vex prefix will be EVEX encoded when TARGET_AVX512F
> is clearly wrong, that is only true for instructions that don't have
> a VEX counterpart (e.g. if they have mnemonics that is EVEX only), otherwise
> it depends on whether either the operands will be 64-byte (we can perhaps
> use for that the mode attribute at least by default) or whether any of the
> operands is %[xy]mm16+ or %k*.
> So (but I think this must be GCC 12 material) I'd say we should throw away
> maybe_evex prefix altogether (replace with maybe_vex or vex),
> use evex for the cases where it has EVEX only mnemonics and otherwise
> call some function to look at the operands and mode attribute.

Yes, I'm aware that while great care was taken to handle vex
attribute, evex handling is quite sloppy, and I fully agree with your
findings. I have noticed the issue when I tried to utilize newly
introduced YW constraint some more, e.g.:

(define_insn "*vec_extract"
  [(set (match_operand: 0 "register_sse4nonimm_operand"
"=r,m,r,m")
(vec_select:
  (match_operand:PEXTR_MODE12 1 "register_operand" "x,x,v,v")
  (parallel
[(match_operand:SI 2 "const_0_to__operand")])))]
  "TARGET_SSE2"
  "@
   %vpextr\t{%2, %1, %k0|%k0, %1, %2}
   %vpextr\t{%2, %1, %0|%0, %1, %2}
   vpextr\t{%2, %1, %k0|%k0, %1, %2}
   vpextr\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "isa" "*,sse4,avx512bw,avx512bw")

where alternatives 0/2 and 1/3 can now be merged together using YW
register constraint (plus a couple of other places, just grep for
avx512bw isa attribute). I was not sure if using maybe_vex is the
correct selection for the prefix attribute in this case.
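
For reference, the difference between the Yw and YW constraints can be modelled in a few lines of standalone C++. This is a sketch that mirrors the conditions in the define_register_constraint entries quoted earlier in the thread; target_flags, reg_class, constraint_Yw, constraint_YW and class_contains_xmm are hypothetical names, and the model assumes a 64-bit target where the plain SSE class covers %xmm0-%xmm15.

```cpp
#include <cassert>

// Hypothetical model of the relevant target flags.
struct target_flags
{
  bool sse;
  bool avx512bw;
  bool avx512vl;
};

enum class reg_class { no_regs, sse_regs, all_sse_regs };

// "Yw": all 32 registers only with both AVX512BW and AVX512VL.
inline reg_class
constraint_Yw (const target_flags &t)
{
  if (t.avx512bw && t.avx512vl)
    return reg_class::all_sse_regs;
  return t.sse ? reg_class::sse_regs : reg_class::no_regs;
}

// "YW": all 32 registers as soon as AVX512BW is enabled.
inline reg_class
constraint_YW (const target_flags &t)
{
  if (t.avx512bw)
    return reg_class::all_sse_regs;
  return t.sse ? reg_class::sse_regs : reg_class::no_regs;
}

// Whether %xmm<regno> belongs to the class (assumes a 64-bit target).
inline bool
class_contains_xmm (reg_class rc, int regno)
{
  assert (regno >= 0 && regno < 32);
  if (rc == reg_class::all_sse_regs)
    return true;
  if (rc == reg_class::sse_regs)
    return regno < 16;
  return false;
}
```

Under this model a single YW alternative covers both the old "x" alternative (no AVX512BW) and the old "v" alternative guarded by the avx512bw isa attribute.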

Uros.

> The maybe_vex decision on TARGET_AVX vs. !TARGET_AVX is on the other side
> correct, we have all those %v etc. to make sure we emit VEX/EVEX encoded
> instructions with -mavx rather than the old SSE* encodings.
>
> I think some years ago I've used scripts and -dP etc. to verify the length
> computations for vex, for evex it is known for years that it is inaccurate
> unfortunately.
>
> Maybe it is conservatively correct?  length_evex is hardcoded to 5,
> while length_vex is 3 + 1 or 2 + 1.
>
> In that case, perhaps I should for now change the vex on that particular
> instruction to maybe_vex.  But I see still several hundreds of instructions
> with vex prefix attribute and v etc. in constraints.
> So, if we want a conservatively correct fix now, I'd say we should treat
> even "vex" in the "length_evex" condition, so (untested):
>
> 2021-03-12  Jakub Jelinek  
>
> * config/i386/i386.md (length): For TARGET_AVX512F treat
> also vex prefix conservatively as length_evex.
>
> --- gcc/config/i386/i386.md.jj  2021-03-05 21:51:33.675350463 +0100
> +++ gcc/config/i386/i386.md 2021-03-12 16:24:49.302919436 +0100
> @@ -686,6 +686,7 @@
>(attr "length_address")))
>  (ior (eq_attr "prefix" "evex")
>   (and (ior (eq_attr "pre

Re: c++: Macros need to be GTY-reachable [PR 99023]

2021-03-12 Thread Mike Stump via Gcc-patches
On Feb 18, 2021, at 6:15 AM, Jakub Jelinek via Gcc-patches 
 wrote:
> 
> On Wed, Feb 17, 2021 at 01:46:37PM -0500, Nathan Sidwell wrote:
>> I'd missed that macros were allocated from GC storage, and that they can
>> become unattached from an identifier, and therefore not GC-reachable.
>> And then bad things happen. Fixed by making the module machinery's
>> reference vector a GC root.
>> 
>>  PR c++/99023
>>gcc/cp/
>>* module.cc (struct macro_export): Add GTY markers.
>>(macro_exports): Likewise, use a va_gc vector.
>>gcc/testsuite/
>>* g++.dg/modules/pr99023_a.H: New.
>>* g++.dg/modules/pr99023_b.H: New.
> 
> I must say I don't know much about modules, but seeing the second set
> of > 100KB g++.dg/modules/ testcases

Indeed, a test case that shows a memory error is likely misguided at any size,
as we don't have a system to reliably detect when we are wandering outside the
GTY sandbox.

Re: add rv64im{,c,fc} multilibs

2021-03-12 Thread Mike Stump via Gcc-patches
On Feb 24, 2021, at 1:10 AM, Alexandre Oliva  wrote:
> 
> On Feb 23, 2021, Jim Wilson  wrote:
>> If we add default multilibs for you
> 
> *nod*, it's a very familiar issue to me, I know where that's coming
> from, no worries.

So, what I'd do is if you have a triplet component that isn't used much, say, 
the middle one, you can embed the vendor there, and use the vendor to trigger 
which multilib set you want.

i386-unknown-linux

becomes

i386-telcovendor2-linux

and then telcovendor2 selects the new multilib set.  The generic port selects the
base multilib set, and no one other than a vendor build would select the
vendor set.
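
A small standalone sketch of that selection logic, written in C++ purely for illustration (the real decision would live in the configure machinery). The functions triplet_vendor and multilib_set_for and the set names "telcovendor2-multilibs" and "base-multilibs" are hypothetical.

```cpp
#include <cassert>
#include <string>

// Return the vendor (second) component of a cpu-vendor-os triplet, or an
// empty string when the triplet has fewer than three components.
inline std::string
triplet_vendor (const std::string &triplet)
{
  const std::string::size_type first = triplet.find ('-');
  if (first == std::string::npos)
    return "";
  const std::string::size_type second = triplet.find ('-', first + 1);
  if (second == std::string::npos)
    return "";
  return triplet.substr (first + 1, second - first - 1);
}

// Pick a multilib set name from the vendor field; the set names are
// placeholders invented for this sketch.
inline std::string
multilib_set_for (const std::string &triplet)
{
  if (triplet_vendor (triplet) == "telcovendor2")
    return "telcovendor2-multilibs";
  return "base-multilibs";
}
```

Any triplet whose vendor field is not recognized falls through to the base set, which matches the idea that only a vendor build opts in.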

That's one solution, there are others.  For example, on a system, you can smell 
the previously installed multilib set, and default to building those.

[PATCH] cprop_hardreg: Ensure replacement reg has compatible mode [PR99221]

2021-03-12 Thread Stefan Schulze Frielinghaus via Gcc-patches
In addition to the existing check, also ask the target whether a
replacement register may be accessed in a mode different from the one it
was set in.

Bootstrapped and regtested on IBM Z.  Ok for mainline?
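
The combined condition can be modelled outside of GCC as follows. This is only a sketch of the boolean logic in the patch below: mode_query and replacement_allowed are hypothetical names, and the target hook's answer is passed in as a plain flag.

```cpp
#include <cassert>

// Hypothetical inputs describing one candidate replacement.
struct mode_query
{
  bool same_mode;	  // requested mode equals the mode the value was set in
  unsigned nregs_wanted;  // hard registers needed for the requested access
  unsigned nregs_set;	  // hard registers covered by the original set
  bool can_change_mode;	  // the target's answer for this mode change
};

// A replacement is rejected when the modes differ and either the access
// needs more hard registers than were set, or the target disallows
// accessing the register in the new mode.
inline bool
replacement_allowed (const mode_query &q)
{
  if (q.same_mode)
    return true;
  if (q.nregs_wanted > q.nregs_set)
    return false;
  return q.can_change_mode;
}
```

The second test is the new part: previously a mode change that fit within the same number of hard registers was always accepted.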

gcc/ChangeLog:

* regcprop.c (find_oldest_value_reg): Ask target whether
  different mode is fine for replacement register.
---
 gcc/regcprop.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index e1342f56bd1..02753a12510 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -474,7 +474,8 @@ find_oldest_value_reg (enum reg_class cl, rtx reg, struct value_data *vd)
 	(set (...) (reg:DI r9))
      Replacing r9 with r11 is invalid.  */
   if (mode != vd->e[regno].mode
-      && REG_NREGS (reg) > hard_regno_nregs (regno, vd->e[regno].mode))
+      && (REG_NREGS (reg) > hard_regno_nregs (regno, vd->e[regno].mode)
+	  || !REG_CAN_CHANGE_MODE_P (regno, mode, vd->e[regno].mode)))
     return NULL_RTX;
 
   for (i = vd->e[regno].oldest_regno; i != regno; i = vd->e[i].next_regno)
-- 
2.23.0



Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 5:11 PM Uros Bizjak  wrote:
>
> On Fri, Mar 12, 2021 at 4:28 PM Jakub Jelinek  wrote:
> >
> > On Fri, Mar 12, 2021 at 03:34:09PM +0100, Uros Bizjak wrote:
> > > >  (define_insn "*avx2_pmaddwd"
> > > > -  [(set (match_operand:V8SI 0 "register_operand" "=x,v")
> > > > +  [(set (match_operand:V8SI 0 "register_operand" "=Yw")
> > >
> > > > I'm not sure contraction like this is correct. The problem is with vex
> > > vs. evex prefix, used in length calculation. When XMM16+ is used in
> > > the insn, and evex prefix is used, the unpatched version returns vex
> > > for alternative 0 due to (x,x,x) and evex for alternative 1, since one
> > > of registers satisfies only "v".
> > >
> > > Patched version now always emits vex, which is wrong for XMM16+ registers.
> >
> > That is true, but we have many thousands of those cases, where we just
> > use vex or maybe_vex or maybe_evex prefix with v or Yv or Yw or YW etc.
> > Adding extra alternatives for that would mean changing pretty much all of
> > sse.md.
> > I think it is much easier to imply length_evex when prefix is vex/maybe_vex 
> > when
> > any of the operands is EXT_REX_SSE_REG_P, subreg thereof,
> > MASK_REG_P, or subreg thereof.
> >
> > The default definition of the "length" attribute has:
> >  (ior (eq_attr "prefix" "evex")
> >       (and (ior (eq_attr "prefix" "maybe_evex")
> >                 (eq_attr "prefix" "maybe_vex"))
> >            (match_test "TARGET_AVX512F")))
> >    (plus (attr "length_evex")
> >          (plus (attr "length_immediate")
> >                (plus (attr "modrm")
> >                      (attr "length_address"))))
> >  (ior (eq_attr "prefix" "vex")
> >       (and (ior (eq_attr "prefix" "maybe_vex")
> >                 (eq_attr "prefix" "maybe_evex"))
> >            (match_test "TARGET_AVX")))
> >    (plus (attr "length_vex")
> >          (plus (attr "length_immediate")
> >                (plus (attr "modrm")
> >                      (attr "length_address"))))]
> > That is just extremely rough guess, assuming all insns with
> > evex/maybe_evex/maybe_vex prefix will be EVEX encoded when TARGET_AVX512F
> > is clearly wrong, that is only true for instructions that don't have
> > a VEX counterpart (e.g. if they have mnemonics that is EVEX only), otherwise
> > it depends on whether either the operands will be 64-byte (we can perhaps
> > use for that the mode attribute at least by default) or whether any of the
> > operands is %[xy]mm16+ or %k*.
> > So (but I think this must be GCC 12 material) I'd say we should throw away
> > maybe_evex prefix altogether (replace with maybe_vex or vex),
> > use evex for the cases where it has EVEX only mnemonics and otherwise
> > call some function to look at the operands and mode attribute.
>
> Yes, I'm aware that while great care was taken to handle vex
> attribute, evex handling is quite sloppy, and I fully agree with your
> findings. I have noticed the issue when I tried to utilize newly
> introduced YW constraint some more, e.g.:
>
> (define_insn "*vec_extract"
>   [(set (match_operand: 0 "register_sse4nonimm_operand"
> "=r,m,r,m")
> (vec_select:
>   (match_operand:PEXTR_MODE12 1 "register_operand" "x,x,v,v")
>   (parallel
> [(match_operand:SI 2 "const_0_to__operand")])))]
>   "TARGET_SSE2"
>   "@
>%vpextr\t{%2, %1, %k0|%k0, %1, %2}
>%vpextr\t{%2, %1, %0|%0, %1, %2}
>vpextr\t{%2, %1, %k0|%k0, %1, %2}
>vpextr\t{%2, %1, %0|%0, %1, %2}"
>   [(set_attr "isa" "*,sse4,avx512bw,avx512bw")
>
> where alternatives 0/2 and 1/3 can now be merged together using YW
> register constraint (plus a couple of other places, just grep for
> avx512bw isa attribute). I was not sure if using maybe_vex is the
> correct selection for the prefix attribute in this case.

Untested patch that introduces YW to some remaining pextr
instructions, fixes one case of 128bit vpsrldq and 128bit vpalignr w/o
AVX512VL.

Uros.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 2cd8e04b913..43e4d57ec6a 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15483,18 +15483,16 @@
   [(V16QI "TARGET_SSE4_1") V8HI])
 
 (define_insn "*vec_extract"
-  [(set (match_operand: 0 "register_sse4nonimm_operand" "=r,m,r,m")
+  [(set (match_operand: 0 "register_sse4nonimm_operand" "=r,m")
(vec_select:
- (match_operand:PEXTR_MODE12 1 "register_operand" "x,x,v,v")
+ (match_operand:PEXTR_MODE12 1 "register_operand" "YW,YW")
  (parallel
[(match_operand:SI 2 "const_0_to__operand")])))]
   "TARGET_SSE2"
   "@
%vpextr\t{%2, %1, %k0|%k0, %1, %2}
-   %vpextr\t{%2, %1, %0|%0, %1, %2}
-   vpextr\t{%2, %1, %k0|%k0, %1, %2}
-   vpextr\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "*,sse4,avx512bw,avx512bw")
+   %vpextr\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "*,sse4")
(set_attr "type" "sselog1")
(

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 12, 2021 at 06:05:34PM +0100, Uros Bizjak wrote:
> Untested patch that introduces YW to some remaining pextr
> instructions, fixes one case of 128bit vpsrldq and 128bit vpalignr w/o
> AVX512VL.

Not sure I can find the vpsrldq change in there.

> @@ -21599,11 +21590,11 @@
> (set_attr "mode" "")])
>  
>  (define_insn "*ssse3_palignr_perm"
> -  [(set (match_operand:V_128 0 "register_operand" "=x,x,v")
> +  [(set (match_operand:V_128 0 "register_operand" "=x,Yw")
>(vec_select:V_128
> - (match_operand:V_128 1 "register_operand" "0,x,v")
> + (match_operand:V_128 1 "register_operand" "0,Yw")
>   (match_parallel 2 "palignr_operand"
> -   [(match_operand 3 "const_int_operand" "n,n,n")])))]
> +   [(match_operand 3 "const_int_operand" "n,n")])))]
>"TARGET_SSSE3"

and I believe I had exactly this change in an earlier version of my patch
and it didn't work (broke
+FAIL: gcc.target/i386/avx512vl-vpalignr-4.c scan-assembler-not vpalignr[^\\n\\r]*\$8[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16
), which is why I've reverted it.
It could use YW instead of Yw though and then it should work.

>  {
>operands[2] = (GEN_INT (INTVAL (operands[3])
> @@ -21614,19 +21605,18 @@
>  case 0:
>return "palignr\t{%2, %1, %0|%0, %1, %2}";
>  case 1:
> -case 2:
>return "vpalignr\t{%2, %1, %1, %0|%0, %1, %1, %2}";
>  default:
>gcc_unreachable ();
>  }
>  }
> -  [(set_attr "isa" "noavx,avx,avx512bw")
> +  [(set_attr "isa" "noavx,avx")
> (set_attr "type" "sseishft")
> (set_attr "atom_unit" "sishuf")
> -   (set_attr "prefix_data16" "1,*,*")
> +   (set_attr "prefix_data16" "1,*")
> (set_attr "prefix_extra" "1")
> (set_attr "length_immediate" "1")
> -   (set_attr "prefix" "orig,vex,evex")])
> +   (set_attr "prefix" "orig,maybe_evex")])
>  
>  (define_expand "avx512vl_vinsert"
>[(match_operand:VI48F_256 0 "register_operand")


Jakub



[PATCH 1/2] libstdc++: Revert to old std::call_once implementation [PR 99341]

2021-03-12 Thread Jonathan Wakely via Gcc-patches
The new std::call_once implementation is not backwards compatible,
contrary to my intention. Because std::once_flag::_M_activate() doesn't
write glibc's "fork generation" into the pthread_once_t object, it's
possible for glibc and libstdc++ to run two active executions
concurrently. This violates the primary invariant of the feature!

This patch reverts std::once_flag and std::call_once to the old
implementation that uses pthread_once. This means PR 66146 is a problem
again, but glibc has been changed to solve that. A new API similar to
pthread_once but supporting failure and resetting the pthread_once_t
will be proposed for inclusion in glibc and other C libraries.

This change doesn't simply revert r11-4691 because I want to retain the
new implementation for non-gthreads targets (which didn't previously
support std::call_once at all, so there's no backwards compatibility
concern). This also leaves the new std::once_flag::_M_activate() and
std::once_flag::_M_finish(bool) symbols present in libstdc++.so.6 so
that code already compiled against GCC 11 can still use them. Those
symbols will be removed in a subsequent commit (which distros can choose
to temporarily revert if needed).
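
For readers less familiar with the feature, here is a minimal user-level sketch of the invariant at stake: however often std::call_once is invoked on the same std::once_flag, the callable completes successfully only once. The lazy_resource type and its members are hypothetical, and the sketch does not touch the library internals discussed above. Depending on the toolchain it may need to be built with -pthread.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical wrapper that counts how often its initializer really ran.
struct lazy_resource
{
  std::once_flag flag;
  int init_count = 0;

  void
  ensure_initialized ()
  {
    // Only the first call runs the lambda; later calls return immediately
    // once the first execution has completed.
    std::call_once (flag, [this] { ++init_count; });
  }
};
```

The same guarantee has to hold when the calls come from different threads, which is why two implementations disagreeing about the state of the flag is a correctness problem and not only an ABI detail.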

libstdc++-v3/ChangeLog:

PR libstdc++/99341
* include/std/mutex [_GLIBCXX_HAVE_LINUX_FUTEX] (once_flag):
Revert to pthread_once_t implementation.
[_GLIBCXX_HAVE_LINUX_FUTEX] (call_once): Likewise.
* src/c++11/mutex.cc [_GLIBCXX_HAVE_LINUX_FUTEX]
(struct __once_flag_compat): New type matching the reverted
implementation of once_flag using futexes.
(once_flag::_M_activate): Remove, replace with ...
(_ZNSt9once_flag11_M_activateEv): ... alias symbol.
(once_flag::_M_finish): Remove, replace with ...
(_ZNSt9once_flag9_M_finishEb): ... alias symbol.
* testsuite/30_threads/call_once/66146.cc: Removed.

Tested x86_64-linux, powerpc64-linux and powerpc64le-linux, but not yet
committed (it's not really appropriate for a last-minute Friday
change!)


commit c5ae3f030cc24ec73311f1ff85b8ebdf8551479c
Author: Jonathan Wakely 
Date:   Fri Mar 12 11:47:20 2021

libstdc++: Revert to old std::call_once implementation [PR 99341]

The new std::call_once implementation is not backwards compatible,
contrary to my intention. Because std::once_flag::_M_activate() doesn't
write glibc's "fork generation" into the pthread_once_t object, it's
possible for glibc and libstdc++ to run two active executions
concurrently. This violates the primary invariant of the feature!

This patch reverts std::once_flag and std::call_once to the old
implementation that uses pthread_once. This means PR 66146 is a problem
again, but glibc has been changed to solve that. A new API similar to
pthread_once but supporting failure and resetting the pthread_once_t
will be proposed for inclusion in glibc and other C libraries.

This change doesn't simply revert r11-4691 because I want to retain the
new implementation for non-gthreads targets (which didn't previously
support std::call_once at all, so there's no backwards compatibility
concern). This also leaves the new std::once_flag::_M_activate() and
std::once_flag::_M_finish(bool) symbols present in libstdc++.so.6 so
that code already compiled against GCC 11 can still use them. Those
symbols will be removed in a subsequent commit (which distros can choose
to temporarily revert if needed).

libstdc++-v3/ChangeLog:

PR libstdc++/99341
* include/std/mutex [_GLIBCXX_HAVE_LINUX_FUTEX] (once_flag):
Revert to pthread_once_t implementation.
[_GLIBCXX_HAVE_LINUX_FUTEX] (call_once): Likewise.
* src/c++11/mutex.cc [_GLIBCXX_HAVE_LINUX_FUTEX]
(struct __once_flag_compat): New type matching the reverted
implementation of once_flag using futexes.
(once_flag::_M_activate): Remove, replace with ...
(_ZNSt9once_flag11_M_activateEv): ... alias symbol.
(once_flag::_M_finish): Remove, replace with ...
(_ZNSt9once_flag9_M_finishEb): ... alias symbol.
* testsuite/30_threads/call_once/66146.cc: Removed.

diff --git a/libstdc++-v3/include/std/mutex b/libstdc++-v3/include/std/mutex
index f96c48e88ec..b6a595237bf 100644
--- a/libstdc++-v3/include/std/mutex
+++ b/libstdc++-v3/include/std/mutex
@@ -669,6 +669,123 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 #endif // C++17
 
+#ifdef _GLIBCXX_HAS_GTHREADS
+  /// Flag type used by std::call_once
+  struct once_flag
+  {
+constexpr once_flag() noexcept = default;
+
+/// Deleted copy constructor
+once_flag(const once_flag&) = delete;
+/// Deleted assignment operator
+once_flag& operator=(const once_flag&) = delete;
+
+  private:
+// For gthreads targets a pthread_once_t is used with pthread_once, but
+// for most targets this doesn't work correctly for excep

[PATCH 2/2] libstdc++: Remove symbols for new std::call_once implementation [PR 99341]

2021-03-12 Thread Jonathan Wakely via Gcc-patches

This removes the new symbols added for the new futex-based
std::call_once implementation. These symbols were new on trunk, so not
in any released version. However, they are already present in some
beta distro releases (Fedora Linux 34) and in Fedora Linux rawhide. This
change can be locally reverted by distros that need to keep the symbols
present until affected packages have been rebuilt.

libstdc++-v3/ChangeLog:

PR libstdc++/99341
* config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Remove
std::once_flag symbols.
* config/abi/post/ia64-linux-gnu/baseline_symbols.txt: Likewise.
* config/abi/post/m68k-linux-gnu/baseline_symbols.txt: Likewise.
* config/abi/post/riscv64-linux-gnu/baseline_symbols.txt:
Likewise.
* config/abi/pre/gnu.ver: Likewise.
* src/c++11/mutex.cc [_GLIBCXX_HAVE_LINUX_FUTEX]
(struct __once_flag_compat): Remove.
(_ZNSt9once_flag11_M_activateEv): Remove.
(_ZNSt9once_flag9_M_finishEb): Remove.

Tested x86_64-linux, powerpc64-linux and powerpc64le-linux, but not yet
committed (it's not really appropriate for a last-minute Friday
change!)

commit 621a5b5bcb27b0806af020beb1be8e75e050d308
Author: Jonathan Wakely 
Date:   Fri Mar 12 11:47:20 2021

libstdc++: Remove symbols for new std::call_once implementation [PR 99341]

This removes the new symbols added for the new futex-based
std::call_once implementation. These symbols were new on trunk, so not
in any released version. However, they are already present in some
beta distro releases (Fedora Linux 34) and in Fedora Linux rawhide. This
change can be locally reverted by distros that need to keep the symbols
present until affected packages have been rebuilt.

libstdc++-v3/ChangeLog:

PR libstdc++/99341
* config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Remove
std::once_flag symbols.
* config/abi/post/ia64-linux-gnu/baseline_symbols.txt: Likewise.
* config/abi/post/m68k-linux-gnu/baseline_symbols.txt: Likewise.
* config/abi/post/riscv64-linux-gnu/baseline_symbols.txt:
Likewise.
* config/abi/pre/gnu.ver: Likewise.
* src/c++11/mutex.cc [_GLIBCXX_HAVE_LINUX_FUTEX]
(struct __once_flag_compat): Remove.
(_ZNSt9once_flag11_M_activateEv): Remove.
(_ZNSt9once_flag9_M_finishEb): Remove.

diff --git a/libstdc++-v3/config/abi/post/aarch64-linux-gnu/baseline_symbols.txt b/libstdc++-v3/config/abi/post/aarch64-linux-gnu/baseline_symbols.txt
index 45f1540ca11..898c8e1e895 100644
--- a/libstdc++-v3/config/abi/post/aarch64-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/aarch64-linux-gnu/baseline_symbols.txt
@@ -4086,8 +4086,6 @@ FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEEC2Em@@GLIBCXX
 FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEED0Ev@@GLIBCXX_3.4
 FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEED1Ev@@GLIBCXX_3.4
 FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEED2Ev@@GLIBCXX_3.4
-FUNC:_ZNSt9once_flag11_M_activateEv@@GLIBCXX_3.4.29
-FUNC:_ZNSt9once_flag9_M_finishEb@@GLIBCXX_3.4.29
 FUNC:_ZNSt9strstream3strEv@@GLIBCXX_3.4
 FUNC:_ZNSt9strstream6freezeEb@@GLIBCXX_3.4
 FUNC:_ZNSt9strstreamC1EPciSt13_Ios_Openmode@@GLIBCXX_3.4
diff --git a/libstdc++-v3/config/abi/post/ia64-linux-gnu/baseline_symbols.txt b/libstdc++-v3/config/abi/post/ia64-linux-gnu/baseline_symbols.txt
index 62d28d2cd04..92288ccb8a6 100644
--- a/libstdc++-v3/config/abi/post/ia64-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/ia64-linux-gnu/baseline_symbols.txt
@@ -4086,8 +4086,6 @@ FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEEC2Em@@GLIBCXX
 FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEED0Ev@@GLIBCXX_3.4
 FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEED1Ev@@GLIBCXX_3.4
 FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEED2Ev@@GLIBCXX_3.4
-FUNC:_ZNSt9once_flag11_M_activateEv@@GLIBCXX_3.4.29
-FUNC:_ZNSt9once_flag9_M_finishEb@@GLIBCXX_3.4.29
 FUNC:_ZNSt9strstream3strEv@@GLIBCXX_3.4
 FUNC:_ZNSt9strstream6freezeEb@@GLIBCXX_3.4
 FUNC:_ZNSt9strstreamC1EPciSt13_Ios_Openmode@@GLIBCXX_3.4
diff --git a/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt b/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
index 45992aef3ec..af5dc020e00 100644
--- a/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
@@ -4086,8 +4086,6 @@ FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEEC2Ej@@GLIBCXX
 FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEED0Ev@@GLIBCXX_3.4
 FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEEED1Ev@@GLIBCXX_3.4
 FUNC:_ZNSt9money_putIwSt19ostreambuf_iteratorIwSt1

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 6:32 PM Jakub Jelinek  wrote:
>
> On Fri, Mar 12, 2021 at 06:05:34PM +0100, Uros Bizjak wrote:
> > Untested patch that introduces YW to some remaining pextr
> > instructions, fixes one case of 128bit vpsrldq and 128bit vpalignr w/o
> > AVX512VL.
>
> Not sure I can find the vpsrldq change in there.

It is hidden in *vec_extractv4si pattern:

 (define_insn "*vec_extractv4si"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,x,Yv")
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,Yw")
 (vec_select:SI
-  (match_operand:V4SI 1 "register_operand" "x,v,0,0,x,v")
+  (match_operand:V4SI 1 "register_operand" "  x, v, 0, 0,Yw")
   (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))]
   "TARGET_SSE4_1"
 {
@@ -15668,7 +15660,6 @@
   return "psrldq\t{%2, %0|%0, %2}";

 case 4:
-case 5:
   operands[2] = GEN_INT (INTVAL (operands[2]) * 4);
   return "vpsrldq\t{%2, %1, %0|%0, %1, %2}";




>
> > @@ -21599,11 +21590,11 @@
> > (set_attr "mode" "")])
> >
> >  (define_insn "*ssse3_palignr_perm"
> > -  [(set (match_operand:V_128 0 "register_operand" "=x,x,v")
> > +  [(set (match_operand:V_128 0 "register_operand" "=x,Yw")
> >(vec_select:V_128
> > - (match_operand:V_128 1 "register_operand" "0,x,v")
> > + (match_operand:V_128 1 "register_operand" "0,Yw")
> >   (match_parallel 2 "palignr_operand"
> > -   [(match_operand 3 "const_int_operand" "n,n,n")])))]
> > +   [(match_operand 3 "const_int_operand" "n,n")])))]
> >"TARGET_SSSE3"
>
> and I believe I had exactly this change in an earlier version of my patch
> and it didn't work (broke
> +FAIL: gcc.target/i386/avx512vl-vpalignr-4.c scan-assembler-not 
> vpalignr[^\\n\\r]*\$8[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16
> ), which is why I've reverted it.
> It could use YW instead of Yw though and then it should work.

My copy of x86 ISA says that vpalignr with xmm operands needs AVX512VL
and AVX512BW. Is the testcase correct?

[uros@localhost test]$ cat palignr.s
   vpalignr $2, %xmm22, %xmm23, %xmm24
[uros@localhost test]$ as -march=+noavx512vl palignr.s
palignr.s: Assembler messages:
palignr.s:1: Error: unsupported instruction `vpalignr'
[uros@localhost test]$ as -march=+noavx512bw palignr.s
palignr.s: Assembler messages:
palignr.s:1: Error: unsupported instruction `vpalignr'


[uros@localhost test]$ cat palignr.s
   vpalignr $2, %xmm2, %xmm3, %xmm4
[uros@localhost test]$ as -march=+noavx512vl+noavx512bw palignr.s

Uros.


[SPARC] Fix PR target/99422

2021-03-12 Thread Eric Botcazou
This is a bug in the SPARC back-end exposed by the recent LRA changes, whereby
the T constraint fails to behave properly when LRA is enabled (unlike when
reload is enabled; thanks to Vladimir for pinpointing it).  The patch also gets
rid of the awkward W constraint, which is strictly equivalent to m in 64-bit
mode, and as a result renames the w constraint to W.

Bootstrapped/regtested on SPARC/Solaris and SPARC64/Linux, applied on the 
mainline.


2021-03-12  Eric Botcazou  

PR target/99422
* config/sparc/constraints.md (w): Rename to...
(W): ... this and ditch previous implementation.
* config/sparc/sparc.md (*movdi_insn_sp64): Replace W with m.
(*movdf_insn_sp64): Likewise.
(*mov_insn_sp64): Likewise.
* config/sparc/sync.md (*atomic_compare_and_swap_1): Replace
w with W.
(atomic_compare_and_swap_leon3_1): Likewise.
(*atomic_compare_and_swapdi_v8plus): Likewise.
* config/sparc/sparc.c (memory_ok_for_ldd): Remove useless test on
architecture and add missing address validity check during LRA.

-- 
Eric Botcazou
diff --git a/gcc/config/sparc/constraints.md b/gcc/config/sparc/constraints.md
index 82bbba90457..7ddf014596d 100644
--- a/gcc/config/sparc/constraints.md
+++ b/gcc/config/sparc/constraints.md
@@ -19,7 +19,7 @@
 
 ;;; Unused letters:
 ;;; B
-;;;ajkluv xyz
+;;;ajkluvwxyz
 
 
 ;; Register constraints
@@ -190,14 +190,7 @@
   (match_test "TARGET_ARCH32")
   (match_test "register_ok_for_ldd (op)")))
 
-;; Equivalent to 'T' but in 64-bit mode without alignment requirement
 (define_memory_constraint "W"
- "Memory reference for 'e' constraint floating-point register"
- (and (match_code "mem")
-  (match_test "TARGET_ARCH64")
-  (match_test "memory_ok_for_ldd (op)")))
-
-(define_memory_constraint "w"
   "A memory with only a base register"
   (match_operand 0 "mem_noofs_operand"))
 
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index f1504172022..42ba415255c 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -9223,14 +9223,17 @@ register_ok_for_ldd (rtx reg)
 int
 memory_ok_for_ldd (rtx op)
 {
-  /* In 64-bit mode, we assume that the address is word-aligned.  */
-  if (TARGET_ARCH32 && !mem_min_alignment (op, 8))
+  if (!mem_min_alignment (op, 8))
 return 0;
 
-  if (! can_create_pseudo_p ()
+  /* We need to perform the job of a memory constraint.  */
+  if ((reload_in_progress || reload_completed)
   && !strict_memory_address_p (Pmode, XEXP (op, 0)))
 return 0;
 
+  if (lra_in_progress && !memory_address_p (Pmode, XEXP (op, 0)))
+return 0;
+
   return 1;
 }
 
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 02b7c8d2fdd..c5d369626cc 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -1869,8 +1869,8 @@ visl")
(set_attr "lra" "*,*,disabled,disabled,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*")])
 
 (define_insn "*movdi_insn_sp64"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r, m, r,*e,?*e,?*e,  W,b,b")
-(match_operand:DI 1 "input_operand""rI,N,m,rJ,*e, r, *e,  W,?*e,J,P"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r, m, r,*e,?*e,?*e,  m,b,b")
+(match_operand:DI 1 "input_operand""rI,N,m,rJ,*e, r, *e,  m,?*e,J,P"))]
   "TARGET_ARCH64
&& (register_operand (operands[0], DImode)
|| register_or_zero_or_all_ones_operand (operands[1], DImode))"
@@ -2498,8 +2498,8 @@ visl")
(set_attr "lra" "*,*,*,*,*,*,*,*,*,*,disabled,disabled,*,*,*,*,*")])
 
 (define_insn "*movdf_insn_sp64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=b,b,e,*r, e,  e,W, *r,*r,  m,*r")
-	(match_operand:DF 1 "input_operand" "G,C,e, e,*r,W#F,e,*rG, m,*rG, F"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=b,b,e,*r, e,  e,m, *r,*r,  m,*r")
+	(match_operand:DF 1 "input_operand" "G,C,e, e,*r,m#F,e,*rG, m,*rG, F"))]
   "TARGET_ARCH64
&& (register_operand (operands[0], DFmode)
|| register_or_zero_or_all_ones_operand (operands[1], DFmode))"
@@ -8467,8 +8467,8 @@ visl")
(set_attr "cpu_feature" "vis,vis,vis,*,*,*,*,*,*,vis3,vis3")])
 
 (define_insn "*mov_insn_sp64"
-  [(set (match_operand:VM64 0 "nonimmediate_operand" "=e,e,e,e,W,m,*r, m,*r, e,*r")
-	(match_operand:VM64 1 "input_operand" "Y,Z,e,W,e,Y, m,*r, e,*r,*r"))]
+  [(set (match_operand:VM64 0 "nonimmediate_operand" "=e,e,e,e,m,m,*r, m,*r, e,*r")
+	(match_operand:VM64 1 "input_operand" "Y,Z,e,m,e,Y, m,*r, e,*r,*r"))]
   "TARGET_VIS
&& TARGET_ARCH64
&& (register_operand (operands[0], mode)
diff --git a/gcc/config/sparc/sync.md b/gcc/config/sparc/sync.md
index c578e95811e..c0a20ef8937 100644
--- a/gcc/config/sparc/sync.md
+++ b/gcc/config/sparc/sync.md
@@ -202,7 +202,7 @@
 
 (define_insn "*atomic_compare_and_swap_1"
   [(set (match_operand:I48MODE 0 "register_operand" "=r")
-	(match_operand:I48MODE 1 "mem_noofs_

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 12, 2021 at 06:48:57PM +0100, Uros Bizjak wrote:
> It is hidden in *vec_extractv4si pattern:
> 
>  (define_insn "*vec_extractv4si"
> -  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,x,Yv")
> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,Yw")
>  (vec_select:SI
> -  (match_operand:V4SI 1 "register_operand" "x,v,0,0,x,v")
> +  (match_operand:V4SI 1 "register_operand" "  x, v, 0, 0,Yw")
>(parallel [(match_operand:SI 2 "const_0_to_3_operand")])))]
>"TARGET_SSE4_1"
>  {
> @@ -15668,7 +15660,6 @@
>return "psrldq\t{%2, %0|%0, %2}";
> 
>  case 4:
> -case 5:
>operands[2] = GEN_INT (INTVAL (operands[2]) * 4);
>return "vpsrldq\t{%2, %1, %0|%0, %1, %2}";

Ah, ok, thanks.

> > and I believe I had exactly this change in an earlier version of my patch
> > and it didn't work (broke
> > +FAIL: gcc.target/i386/avx512vl-vpalignr-4.c scan-assembler-not 
> > vpalignr[^\\n\\r]*\$8[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16
> > ), which is why I've reverted it.
> > It could use YW instead of Yw though and then it should work.
> 
> My copy of x86 ISA says that vpalignr with xmm operands needs AVX512VL
> and AVX512BW. Is the testcase correct?
> 
> [uros@localhost test]$ cat palignr.s
>vpalignr $2, %xmm22, %xmm23, %xmm24
> [uros@localhost test]$ as -march=+noavx512vl palignr.s
> palignr.s: Assembler messages:
> palignr.s:1: Error: unsupported instruction `vpalignr'
> [uros@localhost test]$ as -march=+noavx512bw palignr.s
> palignr.s: Assembler messages:
> palignr.s:1: Error: unsupported instruction `vpalignr'

That is indeed the case, but often AVX512VL is implied
already from the operand mode.
The pattern uses V_128 iterator, so:
  [V16QI V8HI V4SI V2DI V4SF (V2DF "TARGET_SSE2")])
For all these modes, ix86_hard_regno_mode_ok
will return false for %xmm16+ when -mavx512vl is not true.
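
To make the register-eligibility argument concrete, here is a minimal sketch in C. It is a simplified model, not GCC code: the names `constraint_kind`, `isa_flags` and `xmm_reg_allowed` are hypothetical, and the rules are only the ones described in this thread (x never reaches %xmm16+, Yv needs AVX512VL for %xmm16+, Yw needs both AVX512VL and AVX512BW).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of three x86 operand constraints for 128-bit
   operands.  This is an illustration, not code from GCC.  */
enum constraint_kind { CONSTRAINT_X, CONSTRAINT_YV, CONSTRAINT_YW };

/* Hypothetical stand-in for the -mavx512vl / -mavx512bw settings.  */
struct isa_flags {
    bool avx512vl;
    bool avx512bw;
};

/* Return true if %xmm<regno> may be allocated for an operand that
   uses constraint C under the given ISA flags.  */
bool xmm_reg_allowed(enum constraint_kind c, struct isa_flags isa, int regno)
{
    assert(regno >= 0 && regno < 32);

    /* %xmm0-%xmm15 are reachable without EVEX encoding.  */
    if (regno < 16)
        return true;

    /* %xmm16-%xmm31 need an EVEX-encoded instruction.  */
    switch (c) {
    case CONSTRAINT_X:
        return false;
    case CONSTRAINT_YV:
        return isa.avx512vl;
    case CONSTRAINT_YW:
        return isa.avx512vl && isa.avx512bw;
    }
    return false;
}
```

With -mavx512vl -mno-avx512bw, this model accepts %xmm16 for Yv but rejects it for Yw, which is the case the series is about.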

Anyway, sorry, I was wrong, I actually had multiple versions
of the patch and this pattern I've changed more than once,
first used the Yw in there and then after seeing bugs in other
patterns changed that to  and that was the patch that
actually failed the gcc.target/i386/avx512vl-vpalignr-4.c test
and I've just removed those hunks and rebootstrapped/retested
afterwards.
 instead of Yw was needed for numerous patterns, usually
when they include both 16-byte, 32-byte and 64-byte vectors
in the same define_insn*, so that the 64-byte which typically
had TARGET_AVX512BW in the iterator's condition uses v and doesn't need
AVX512VL.
But is not a good fit for *ssse3_palignr_perm,
because that pattern uses various non-QI/HI modes for which
it still wants Yw/YW.

So, your patch looks good to me.

I can test it on avx512{bw,vl,dq} hw tonight if you want.

Jakub



c++: ICE with using-decl [PR 99238]

2021-03-12 Thread Nathan Sidwell


This ICE was caused by a stray TREE_VISITED marker.  The lookup
machinery was leaving it there due to the way I'd arranged for it to be
cleared.  That presumed the name_lookup::value field didn't change,
which wasn't always true in the using-decl processing.  I took the
opportunity to break out a helper, and now call it immediately after
lookups, rather than waiting until destructor time.  Added some asserts to
the module machinery to catch further cases of this.
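
The idea behind the helper can be sketched in plain C as follows. This is only an illustration under assumptions: `struct node`, `struct lookup`, `lookup_mark` and `lookup_dedup` here are hypothetical stand-ins for the tree list, the name_lookup object and its members, not the real implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a list of decls carrying a visited flag.  */
struct node {
    bool visited;
    struct node *next;
};

/* Hypothetical stand-in for the lookup state.  */
struct lookup {
    struct node *value;
    bool deduping;
};

/* Set or clear the visited flag on every element of LIST.  */
void lookup_mark(struct node *list, bool state)
{
    for (; list != NULL; list = list->next)
        list->visited = state;
}

/* Turn deduping mode on or off.  The flag and the markers change
   together, so they cannot get out of sync.  */
void lookup_dedup(struct lookup *l, bool state)
{
    if (l->deduping != state) {
        l->deduping = state;
        lookup_mark(l->value, state);
    }
}

/* Replace the current value.  Markers are cleared on the old value
   first, so none are left behind once the field changes.  */
void lookup_replace_value(struct lookup *l, struct node *new_value)
{
    lookup_dedup(l, false);
    l->value = new_value;
}
```

Clearing through one helper right after each lookup means the destructor only has to assert that nothing is still marked.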


PR c++/99238
gcc/cp/
* module.cc (depset::hash::add_binding_entity): Assert not
visited.
(depset::hash::add_specializations): Likewise.
* name-lookup.c (name_lookup::dedup): New.
(name_lookup::~name_lookup): Assert not deduping.
(name_lookup::restore_state): Likewise.
(name_lookup::add_overload): Replace outlined code with dedup
call.
(name_lookup::add_value): Likewise.
(name_lookup::search_namespace_only): Likewise.
(name_lookup::adl_namespace_fns): Likewise.
(name_lookup::adl_class_fns): Likewise.
(name_lookup::search_adl): Likewise.  Add clearing dedup call.
(name_lookup::search_qualified): Likewise.
(name_lookup::search_unqualified): Likewise.
gcc/testsuite/
* g++.dg/modules/pr99238.h: New.
* g++.dg/modules/pr99238_a.H: New.
* g++.dg/modules/pr99238_b.H: New.

--
Nathan Sidwell
diff --git c/gcc/cp/module.cc w/gcc/cp/module.cc
index 03359db28e1..19bdfc7cb21 100644
--- c/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -12706,6 +12706,9 @@ depset::hash::add_binding_entity (tree decl, WMB_Flags flags, void *data_)
 	  *slot = data->binding;
 	}
 
+  /* Make sure nobody left a tree visited lying about.  */
+  gcc_checking_assert (!TREE_VISITED (decl));
+
   if (flags & WMB_Using)
 	{
 	  decl = ovl_make (decl, NULL_TREE);
@@ -13000,6 +13003,8 @@ depset::hash::add_specializations (bool decl_p)
 have_spec:;
 #endif
 
+  /* Make sure nobody left a tree visited lying about.  */
+  gcc_checking_assert (!TREE_VISITED (spec));
   depset *dep = make_dependency (spec, depset::EK_SPECIALIZATION);
   if (dep->is_special ())
 	{
diff --git c/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c
index d8839e29fe5..9382a47e195 100644
--- c/gcc/cp/name-lookup.c
+++ w/gcc/cp/name-lookup.c
@@ -464,6 +464,7 @@ public:
   }
   ~name_lookup ()
   {
+gcc_checking_assert (!deduping);
 restore_state ();
   }
 
@@ -471,6 +472,17 @@ private: /* Uncopyable, unmovable, unassignable. I am a rock. */
   name_lookup (const name_lookup &);
   name_lookup &operator= (const name_lookup &);
 
+ public:
+  /* Turn on or off deduping mode.  */
+  void dedup (bool state)
+  {
+if (deduping != state)
+  {
+	deduping = state;
+	lookup_mark (value, state);
+  }
+  }
+
 protected:
   static bool seen_p (tree scope)
   {
@@ -605,8 +617,7 @@ name_lookup::preserve_state ()
 void
 name_lookup::restore_state ()
 {
-  if (deduping)
-lookup_mark (value, false);
+  gcc_checking_assert (!deduping);
 
   /* Unmark and empty this lookup's scope stack.  */
   for (unsigned ix = vec_safe_length (scopes); ix--;)
@@ -703,12 +714,9 @@ name_lookup::add_overload (tree fns)
 	probe = ovl_skip_hidden (probe);
   if (probe && TREE_CODE (probe) == OVERLOAD
 	  && OVL_DEDUP_P (probe))
-	{
-	  /* We're about to add something found by multiple paths, so
-	 need to engage deduping mode.  */
-	  lookup_mark (value, true);
-	  deduping = true;
-	}
+	/* We're about to add something found by multiple paths, so need to
+	   engage deduping mode.  */
+	dedup (true);
 }
 
   value = lookup_maybe_add (fns, value, deduping);
@@ -737,12 +745,8 @@ name_lookup::add_value (tree new_val)
 value = ORIGINAL_NAMESPACE (value);
   else
 {
-  if (deduping)
-	{
-	  /* Disengage deduping mode.  */
-	  lookup_mark (value, false);
-	  deduping = false;
-	}
+  /* Disengage deduping mode.  */
+  dedup (false);
   value = ambiguous (new_val, value);
 }
 }
@@ -951,10 +955,7 @@ name_lookup::search_namespace_only (tree scope)
 			if ((hit & 1 && BINDING_VECTOR_GLOBAL_DUPS_P (val))
 || (hit & 2
 && BINDING_VECTOR_PARTITION_DUPS_P (val)))
-			  {
-lookup_mark (value, true);
-deduping = true;
-			  }
+			  dedup (true);
 			  }
 			dup_detect |= dup;
 		  }
@@ -1076,6 +1077,8 @@ name_lookup::search_qualified (tree scope, bool usings)
 	found = search_usings (scope);
 }
 
+  dedup (false);
+
   return found;
 }
 
@@ -1177,6 +1180,8 @@ name_lookup::search_unqualified (tree scope, cp_binding_level *level)
 	break;
 }
 
+  dedup (false);
+
   /* Restore to incoming length.  */
   vec_safe_truncate (queue, length);
 
@@ -1284,15 +1289,10 @@ name_lookup::adl_namespace_fns (tree scope, bitmap imports)
 			else if (MODULE_BINDING_PARTITION_P (bind))
 			  dup = 2;
 			if (unsigned hit = dup_detect & dup)
-			  {
-			if ((hit & 1 && BINDING_VECTOR

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 7:19 PM Jakub Jelinek wrote:
>
> On Fri, Mar 12, 2021 at 06:48:57PM +0100, Uros Bizjak wrote:
> > It is hidden in *vec_extractv4si pattern:
> >
> >  (define_insn "*vec_extractv4si"
> > -  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,x,Yv")
> > +  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,Yw")
> >  (vec_select:SI
> > -  (match_operand:V4SI 1 "register_operand" "x,v,0,0,x,v")
> > +  (match_operand:V4SI 1 "register_operand" "  x, v, 0, 0,Yw")
> >(parallel [(match_operand:SI 2 "const_0_to_3_operand")])))]
> >"TARGET_SSE4_1"
> >  {
> > @@ -15668,7 +15660,6 @@
> >return "psrldq\t{%2, %0|%0, %2}";
> >
> >  case 4:
> > -case 5:
> >operands[2] = GEN_INT (INTVAL (operands[2]) * 4);
> >return "vpsrldq\t{%2, %1, %0|%0, %1, %2}";
>
> Ah, ok, thanks.
>
> > > and I believe I had exactly this change in an earlier version of my patch
> > > and it didn't work (broke
> > > +FAIL: gcc.target/i386/avx512vl-vpalignr-4.c scan-assembler-not 
> > > vpalignr[^\\n\\r]*\$8[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16
> > > ), which is why I've reverted it.
> > > It could use YW instead of Yw though and then it should work.
> >
> > My copy of x86 ISA says that vpalignr with xmm operands needs AVX512VL
> > and AVX512BW. Is the testcase correct?
> >
> > [uros@localhost test]$ cat palignr.s
> >vpalignr $2, %xmm22, %xmm23, %xmm24
> > [uros@localhost test]$ as -march=+noavx512vl palignr.s
> > palignr.s: Assembler messages:
> > palignr.s:1: Error: unsupported instruction `vpalignr'
> > [uros@localhost test]$ as -march=+noavx512bw palignr.s
> > palignr.s: Assembler messages:
> > palignr.s:1: Error: unsupported instruction `vpalignr'
>
> That is indeed the case, but often AVX512VL is implied
> already from the operand mode.
> The pattern uses V_128 iterator, so:
>   [V16QI V8HI V4SI V2DI V4SF (V2DF "TARGET_SSE2")])
> For all these modes, ix86_hard_regno_mode_ok
> will return false for %xmm16+ when -mavx512vl is not true.
>
> Anyway, sorry, I was wrong, I actually had multiple versions
> of the patch and this pattern I've changed more than once,
> first used the Yw in there and then after seeing bugs in other
> patterns changed that to  and that was the patch that
> actually failed the gcc.target/i386/avx512vl-vpalignr-4.c test
> and I've just removed those hunks and rebootstrapped/retested
> afterwards.
>  instead of Yw was needed for numerous patterns, usually
> when they include both 16-byte, 32-byte and 64-byte vectors
> in the same define_insn*, so that the 64-byte which typically
> had TARGET_AVX512BW in the iterator's condition uses v and doesn't need
> AVX512VL.
> But is not a good fit for *ssse3_palignr_perm,
> because that pattern uses various non-QI/HI modes for which
> it still wants Yw/YW.
>
> So, your patch looks good to me.
>
> I can test it on avx512{bw,vl,dq} hw tonight if you want.

I'm testing the patch on avx2 hw, which is not representative of this
change. So if you can spare a few cycles, that would be awesome.

Thanks,
Uros.


[PATCH] PR fortran/99112 - [11 Regression] ICE with runtime diagnostics for SIZE intrinsic function

2021-03-12 Thread Harald Anlauf via Gcc-patches
Dear all,

the addition of runtime checks for the SIZE intrinsic created a regression
that showed up for certain CLASS arguments to procedures.  Paul did most of
the work (~ 99%), but asked me to dig into an issue with an inappropriately
selected error message.  This actually turned out to be a simple one-liner
on top of Paul's patch.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

P.S.: I couldn't find a Changelog entry that uses co-authors.  Is the version
below correct?


PR fortran/99112 - ICE with runtime diagnostics for SIZE intrinsic function

Add/fix handling of runtime checks for CLASS arguments with ALLOCATABLE
or POINTER attribute.

gcc/fortran/ChangeLog:

* trans-expr.c (gfc_conv_procedure_call): Fix runtime checks for
CLASS arguments.
* trans-intrinsic.c (gfc_conv_intrinsic_size): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr99112.f90: New test.

Co-authored-by: Paul Thomas  

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 85c16d7f4c3..53c47e18dfd 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -6662,6 +6662,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 	  symbol_attribute attr;
 	  char *msg;
 	  tree cond;
+	  tree temp;

 	  if (e->expr_type == EXPR_VARIABLE || e->expr_type == EXPR_FUNCTION)
 	attr = gfc_expr_attr (e);
@@ -6732,16 +6733,25 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 	  else
 		goto end_pointer_check;

-	  tmp = parmse.expr;
+	  if (fsym && fsym->ts.type == BT_CLASS)
+		{
+		  temp = build_fold_indirect_ref_loc (input_location,
+		  parmse.expr);
+		  temp = gfc_class_data_get (temp);
+		  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (temp)))
+		temp = gfc_conv_descriptor_data_get (temp);
+		}
+	  else
+		temp = parmse.expr;

 	  /* If the argument is passed by value, we need to strip the
 		 INDIRECT_REF.  */
-	  if (!POINTER_TYPE_P (TREE_TYPE (parmse.expr)))
-		tmp = gfc_build_addr_expr (NULL_TREE, tmp);
+	  if (!POINTER_TYPE_P (TREE_TYPE (temp)))
+		temp = gfc_build_addr_expr (NULL_TREE, temp);

 	  cond = fold_build2_loc (input_location, EQ_EXPR,
-  logical_type_node, tmp,
-  fold_convert (TREE_TYPE (tmp),
+  logical_type_node, temp,
+  fold_convert (TREE_TYPE (temp),
 		null_pointer_node));
 	}

diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index 9cf3642f694..5e53d1162fa 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -8006,8 +8006,10 @@ gfc_conv_intrinsic_size (gfc_se * se, gfc_expr * expr)
 {
   symbol_attribute attr;
   char *msg;
+  tree temp;
+  tree cond;

-  attr = gfc_expr_attr (e);
+  attr = sym ? sym->attr : gfc_expr_attr (e);
   if (attr.allocatable)
 	msg = xasprintf ("Allocatable argument '%s' is not allocated",
 			 e->symtree->n.sym->name);
@@ -8017,14 +8019,24 @@ gfc_conv_intrinsic_size (gfc_se * se, gfc_expr * expr)
   else
 	goto end_arg_check;

-  argse.descriptor_only = 1;
-  gfc_conv_expr_descriptor (&argse, actual->expr);
-  tree temp = gfc_conv_descriptor_data_get (argse.expr);
-  tree cond = fold_build2_loc (input_location, EQ_EXPR,
-   logical_type_node, temp,
-   fold_convert (TREE_TYPE (temp),
-		 null_pointer_node));
+  if (sym)
+	{
+	  temp = gfc_class_data_get (sym->backend_decl);
+	  temp = gfc_conv_descriptor_data_get (temp);
+	}
+  else
+	{
+	  argse.descriptor_only = 1;
+	  gfc_conv_expr_descriptor (&argse, actual->expr);
+	  temp = gfc_conv_descriptor_data_get (argse.expr);
+	}
+
+  cond = fold_build2_loc (input_location, EQ_EXPR,
+			  logical_type_node, temp,
+			  fold_convert (TREE_TYPE (temp),
+	null_pointer_node));
   gfc_trans_runtime_check (true, false, cond, &argse.pre, &e->where, msg);
+
   free (msg);
 }
  end_arg_check:
diff --git a/gcc/testsuite/gfortran.dg/pr99112.f90 b/gcc/testsuite/gfortran.dg/pr99112.f90
new file mode 100644
index 000..94010615b83
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr99112.f90
@@ -0,0 +1,27 @@
+! { dg-do compile }
+! { dg-options "-fcheck=pointer -fdump-tree-original" }
+! PR99112 - ICE with runtime diagnostics for SIZE intrinsic function
+
+module m
+  type t
+  end type
+contains
+  function f (x, y) result(z)
+class(t) :: x(:)
+class(t) :: y(size(x))
+type(t)  :: z(size(x))
+  end
+  function g (x) result(z)
+class(*) :: x(:)
+type(t)  :: z(size(x))
+  end
+  subroutine s ()
+class(t), allocatable :: a(:), b(:), c(:), d(:)
+class(t), pointer :: p(:)
+c = f (a, b)
+d = g (p)
+  end
+end
+! { dg-final { scan-tree-dump-times "_gfortran_runtime_error_at" 3 "original" } }
+! { dg-final { scan-tree-dump-times "Allocatable actual argument" 2 "original" } }
+! { dg-final { scan-tree-dump-times "Pointer actual argument" 1 "original" } }


Re: [RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-03-12 Thread Qing Zhao via Gcc-patches



> On Mar 11, 2021, at 6:46 PM, Kees Cook  wrote:
> 
> On Thu, Mar 11, 2021 at 03:47:17PM -0600, Qing Zhao wrote:
>> Hi, Kees,
>> 
>> Sorry for the late reply (I have been busy with other work recently).
>> 
>> Currently, I am working on the issue of flexible length array as the last 
>> field of the structure.
>> 
>> In order to fix it correctly, I have the following question:
>> 
>> 
>>> On Feb 26, 2021, at 3:42 PM, Kees Cook  wrote:
>>> 
>>> On Thu, Feb 25, 2021 at 05:56:38PM -0600, Qing Zhao wrote:
>>>> Just noticed that you didn’t add -fauto-var-init-approach=D to the command
>>>> line.
>>> 
>>> Ah-ha! I didn't realize that was needed; thanks. However, now some of the 
>>> sources crash in a different way. Here's the reproducer:
>>> 
>>> $ cat poc.i
>>> struct a {
>>> int b;
>>> int array[];
>>> };
>>> void c() {
>>> struct a d;
>>> }
>>> 
>> 
>> For such a variable length array as the last field of the structure, static
>> initialization is not allowed; the user needs to explicitly allocate memory
>> and initialize the allocated array manually in the source code.
>>
>> So, if the compiler has to initialize this structure when requested by
>> -ftrivial-auto-var-init, I think that only the fields before the last
>> field need to be initialized. Is this the correct behavior you expected?
> 
> Right, that would be my expectation as well. Putting such a struct on
> the stack tends to be nonsensical, but maybe happens if part of a union,
> which would get initialized correctly, etc:
> 
> union {
>   struct a {
>   int b;
>   int array[];
>   };
>   char buf[32];
> };
> 

Okay, thanks. This issue has been fixed in my local repository.
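
For reference, the behavior agreed on above can be sketched like this. It is a minimal illustration, not the code in the patch: `fixed_part_size` and `init_fixed_part` are hypothetical helpers, and `memset` stands in for whatever the compiler would emit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The struct from the reproducer: a trailing array with no bound.  */
struct a {
    int b;
    int array[];
};

/* Number of bytes occupied by the fields before the trailing array.  */
size_t fixed_part_size(void)
{
    return offsetof(struct a, array);
}

/* Fill only the fixed part of *P with PATTERN.  The trailing array has
   no storage in an automatic variable, so it is left alone.  */
void init_fixed_part(struct a *p, unsigned char pattern)
{
    assert(p != NULL);
    memset(p, pattern, fixed_part_size());
}
```

Calling `init_fixed_part(&d, 0)` on an automatic `struct a d` zeroes `d.b` and never writes past the fixed fields.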

Qing
> -- 
> Kees Cook



Re: [RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-03-12 Thread Qing Zhao via Gcc-patches
Hi, Kees,

I am looking at the structure padding initialization issue, and I also have some
questions:


> On Feb 24, 2021, at 10:41 PM, Kees Cook  wrote:
> 
> It looks like there are still some issues with padding and pre-case
> switch variables. Here's the test output, FWIW:
> 
> 
> test_stackinit: small_hole_static_all FAIL (uninit bytes: 3)
> test_stackinit: big_hole_static_all FAIL (uninit bytes: 61)
> test_stackinit: trailing_hole_static_all FAIL (uninit bytes: 7)
> test_stackinit: small_hole_dynamic_all FAIL (uninit bytes: 3)
> test_stackinit: big_hole_dynamic_all FAIL (uninit bytes: 61)
> test_stackinit: trailing_hole_dynamic_all FAIL (uninit bytes: 7)
> 
> test_stackinit: switch_1_none FAIL (uninit bytes: 8)
> test_stackinit: switch_2_none FAIL (uninit bytes: 8)
> test_stackinit: failures: 8
> 
> 
> /* Simple structure with padding likely to be covered by compiler. */
> struct test_small_hole {
>   size_t one;
>   char two;
>   /* 3 byte padding hole here. */
>   int three;
>   unsigned long four;
> };
> 
> /* Try to trigger unhandled padding in a structure. */
> struct test_aligned {
>   u32 internal1;
>   u64 internal2;
> } __aligned(64);
> 
> struct test_big_hole {
>   u8 one;
>   u8 two;
>   u8 three;
>   /* 61 byte padding hole here. */
>   struct test_aligned four;
> } __aligned(64);
> 
> struct test_trailing_hole {
>   char *one;
>   char *two;
>   char *three;
>   char four;
>   /* "sizeof(unsigned long) - 1" byte padding hole here. */
> };
> 
> They fail when they're statically initialized (either fully or
> partially),

So, when the structure is not statically initialized,  the compiler 
initialization is good?

For the failing cases, what’s the behavior of the LLVM -ftrivial-auto-var-init?

From the LLVM patch (https://reviews.llvm.org/D54604):


To keep the patch simple, only some undef is removed for now, see
replaceUndef. The padding-related infoleaks are therefore not all gone yet.
This will be addressed in a follow-up, mainly because addressing padding-related
leaks should be a stand-alone option which is implied by variable
initialization.


Yes, in GCC’s implementation, I think that fixing all padding-related leaks
also requires a separate patch.

Qing

> for example:
> 
> struct test_..._hole instance = { .two = ..., };
> 
> or
> 
> struct test_..._hole instance = { .one = ...,
> .two = ...,
> .three = ...,
> .four = ...,
>   };
> 
> The last case is for switch variables outside of case statements, like
> "var" here:
> 
>   switch (path) {
>   unsigned long var;
> 
>   case ..:
>   ...
>   case ..:
>   ...
>   ...
>   }
> 
> 
> I'm really looking forward to having this available. Thanks again!
> 
> -Kees
> 
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/test_stackinit.c
> 
> -- 
> Kees Cook



Re: [RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-03-12 Thread Kees Cook via Gcc-patches
On Fri, Mar 12, 2021 at 03:35:28PM -0600, Qing Zhao wrote:
> Hi, Kees,
> 
> I am looking at the structure padding initialization issue, and I also have
> some questions:
> 
> 
> > On Feb 24, 2021, at 10:41 PM, Kees Cook  wrote:
> > 
> > It looks like there are still some issues with padding and pre-case
> > switch variables. Here's the test output, FWIW:
> > 
> > 
> > test_stackinit: small_hole_static_all FAIL (uninit bytes: 3)
> > test_stackinit: big_hole_static_all FAIL (uninit bytes: 61)
> > test_stackinit: trailing_hole_static_all FAIL (uninit bytes: 7)
> > test_stackinit: small_hole_dynamic_all FAIL (uninit bytes: 3)
> > test_stackinit: big_hole_dynamic_all FAIL (uninit bytes: 61)
> > test_stackinit: trailing_hole_dynamic_all FAIL (uninit bytes: 7)
> > 
> > test_stackinit: switch_1_none FAIL (uninit bytes: 8)
> > test_stackinit: switch_2_none FAIL (uninit bytes: 8)
> > test_stackinit: failures: 8
> > 
> > 
> > /* Simple structure with padding likely to be covered by compiler. */
> > struct test_small_hole {
> > size_t one;
> > char two;
> > /* 3 byte padding hole here. */
> > int three;
> > unsigned long four;
> > };
> > 
> > /* Try to trigger unhandled padding in a structure. */
> > struct test_aligned {
> > u32 internal1;
> > u64 internal2;
> > } __aligned(64);
> > 
> > struct test_big_hole {
> > u8 one;
> > u8 two;
> > u8 three;
> > /* 61 byte padding hole here. */
> > struct test_aligned four;
> > } __aligned(64);
> > 
> > struct test_trailing_hole {
> > char *one;
> > char *two;
> > char *three;
> > char four;
> > /* "sizeof(unsigned long) - 1" byte padding hole here. */
> > };
> > 
> > They fail when they're statically initialized (either fully or
> > partially),
> 
> So, when the structure is not statically initialized,  the compiler 
> initialization is good?
> 
> For the failing cases, what’s the behavior of the LLVM 
> -ftrivial-auto-var-init?
> 
> From the LLVM patch (https://reviews.llvm.org/D54604):
> 
> 
> To keep the patch simple, only some undef is removed for now, see
> replaceUndef. The padding-related infoleaks are therefore not all gone yet.
> This will be addressed in a follow-up, mainly because addressing
> padding-related leaks should be a stand-alone option which is implied
> by variable initialization.
> 

Right, padding init happened in:
https://github.com/llvm/llvm-project/commit/4f7bc0eee7e6099b1abd57dac3c83529944ab23c

And it was further clarified that, IIUC, padding _must be zero_ regardless
of pattern-vs-zero in:
https://github.com/llvm/llvm-project/commit/d39fbc7e20d84364e409ce59724ce20625637062

> Yes, in GCC’s implementation, I think that fixing all padding-related
> leaks also requires a separate patch.

That's fine -- but it'll need to be tied to -ftrivial-auto-var-init,
since otherwise the memory isn't actually fully initialized. :)
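
To make the hole sizes concrete, here is a rough sketch that computes them
with offsetof. The struct layouts are copied from the quoted test, but the
kernel types u8/u32/u64 and __aligned(64) are replaced with <stdint.h>
types and the GCC/Clang aligned attribute, and the *_hole_bytes helpers
are hypothetical names, not part of test_stackinit.c.

```c
#include <stddef.h>
#include <stdint.h>

struct test_small_hole {
    size_t one;
    char two;
    /* padding hole here so that "three" is int-aligned */
    int three;
    unsigned long four;
};

struct test_aligned {
    uint32_t internal1;
    uint64_t internal2;
} __attribute__((aligned(64)));

struct test_big_hole {
    uint8_t one;
    uint8_t two;
    uint8_t three;
    /* padding hole here so that "four" is 64-byte aligned */
    struct test_aligned four;
} __attribute__((aligned(64)));

struct test_trailing_hole {
    char *one;
    char *two;
    char *three;
    char four;
    /* trailing padding up to the struct's alignment */
};

/* Bytes between the end of "two" and the start of "three". */
size_t small_hole_bytes(void)
{
    return offsetof(struct test_small_hole, three)
           - (offsetof(struct test_small_hole, two) + sizeof(char));
}

/* Bytes between the end of "three" and the start of "four". */
size_t big_hole_bytes(void)
{
    return offsetof(struct test_big_hole, four)
           - (offsetof(struct test_big_hole, three) + sizeof(uint8_t));
}

/* Bytes between the end of "four" and the end of the struct. */
size_t trailing_hole_bytes(void)
{
    return sizeof(struct test_trailing_hole)
           - (offsetof(struct test_trailing_hole, four) + sizeof(char));
}
```

On a typical LP64 target such as x86_64 these come out as 3, 61 and 7,
matching the "uninit bytes" counts in the failing tests above. None of
these bytes belong to a named member, so member-wise initialization alone
does not touch them.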

-Kees

-- 
Kees Cook


Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 12, 2021 at 07:52:16PM +0100, Uros Bizjak via Gcc-patches wrote:
> > I can test it on avx512{bw,vl,dq} hw tonight if you want.
> 
> I'm testing the patch on avx2 hw, which is not representative of this
> change. So if you can spare a few cycles, that would be awesome.

Passed bootstrap/regtest on both x86_64-linux and i686-linux on i9-7960X.

Jakub



Re: [PATCH v1] libstdc++-v3: Update VTV vars for libtool link commands [PR99172]

2021-03-12 Thread Caroline Tice via Gcc-patches
I have updated the patch as you suggested, to filter the
"-fvtable-verify=std" out of AM_CXXFLAGS, and I have verified that it
passes the testsuite with no regressions and fixes the reported bug.

Is this OK to commit now?

-- Caroline Tice
cmt...@google.com

libstdc++-v3/ChangeLog

2021-03-12  Caroline Tice  

PR libstdc++/99172
* src/Makefile.am (AM_CXXFLAGS_PRE, AM_CXXFLAGS): Add
AM_CXXFLAGS_PRE with the old definition of AM_CXXFLAGS; make
AM_CXXFLAGS be AM_CXXFLAGS_PRE with '-fvtable-verify=std'
filtered out.
* src/Makefile.in: Regenerate.





On Thu, Mar 11, 2021 at 9:10 AM Jonathan Wakely  wrote:
>
> On 11/03/21 17:46 +0100, Jakub Jelinek via Libstdc++ wrote:
> >On Thu, Mar 11, 2021 at 04:31:51PM +, Jonathan Wakely via Gcc-patches 
> >wrote:
> >> On 11/03/21 16:27 +, Jonathan Wakely wrote:
> >> > That seems cleaner to me, rather than adding another variable with
> >> > minor differences from the existing AM_CXXFLAGS.
> >>
> >> My specific concern is that AM_CXXFLAGS and AM_CXXFLAGS_LT will get
> >> out of sync, i.e. we'll add something to the former and forget to add
> >> it to the latter.
> >>
> >> If we keep using AM_CXXFLAGS but cancel out the -fvtable-verify=std
> >> option, then there aren't two separate variables that can diverge.
> >>
> >> But I think it's too late in the gcc-11 process for that kind of
> >> refactoring.
> >
> >I think $(filter-out -fvtable-verify=std,$(AM_CXXFLAGS)) should be fairly
> >simple thing if that is all that needs to be done.
>
> Yes, we could do that now, and then in stage 1 look at the other
> changes (like moving -Wl,-u options to the link flags not cxxflags).
>
> Using filter-out does assume that no target is going to add anything
> different that should also be filtered out, but that's true as of
> today.
>


v2-0001-libstdc-v3-Update-VTV-vars-for-libtool-link-comma.patch
Description: Binary data


[COMMITTED] MAINTAINERS: Add myself for write after approval

2021-03-12 Thread Eugene Rozenfeld via Gcc-patches
ChangeLog:

2021-03-12  Eugene Rozenfeld  

* MAINTAINERS (Write After Approval): Add myself.


0001-MAINTAINERS-Add-myself-for-write-after-approval.patch
Description: 0001-MAINTAINERS-Add-myself-for-write-after-approval.patch


libgo patch committed: Don't use == for string equality in C code

2021-03-12 Thread Ian Lance Taylor via Gcc-patches
This patch to libgo avoids using == for string equality in C code.
This is a backport of https://golang.org/cl/300993.  This is for GCC
PR 99553.  Bootstrapped and ran gotools testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.
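
As background on why == is the wrong tool here: in C, comparing the result
of getenv() with "" compares two pointers, not the characters they point
to, so it says nothing about whether the variable is empty. The sketch
below shows the two checks one might actually want. The helper names are
hypothetical and are not part of libgo; the patch itself only uses the
NULL test.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* True if the variable is not present in the environment at all.
 * This is the check the patch switches to. */
bool env_is_unset(const char *name)
{
    return getenv(name) == NULL;
}

/* True if the variable is unset or set to the empty string.
 * Contents are compared with strcmp(), never with ==. */
bool env_is_unset_or_empty(const char *name)
{
    const char *val = getenv(name);
    return val == NULL || strcmp(val, "") == 0;
}
```

For example, after setenv("GCCGO", "", 1) the first helper returns false
and the second returns true.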

Ian
eb7eb29c37b1326bd4c8af7e5c35439a5d4521d4
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 58c881a7872..e5756c6662c 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-bf35249a7c752836741b1cab43a312f87916fcb0
+2f281eb24ef256a2d3bb9fc1a7ef964d82b40182
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/misc/cgo/testcarchive/testdata/main_unix.c b/libgo/misc/cgo/testcarchive/testdata/main_unix.c
index b23ac1c2428..bd00f9d2339 100644
--- a/libgo/misc/cgo/testcarchive/testdata/main_unix.c
+++ b/libgo/misc/cgo/testcarchive/testdata/main_unix.c
@@ -36,7 +36,7 @@ int install_handler() {
 		return 2;
 	}
 	// gccgo does not set SA_ONSTACK for SIGSEGV.
-	if (getenv("GCCGO") == "" && (osa.sa_flags&SA_ONSTACK) == 0) {
+	if (getenv("GCCGO") == NULL && (osa.sa_flags&SA_ONSTACK) == 0) {
 		fprintf(stderr, "Go runtime did not install signal handler\n");
 		return 2;
 	}