date:20210513

Re: [PATCH 1/2] vect: Add costing_for_scalar parameter to init_cost hook

2021-05-13 Thread Kewen.Lin via Gcc-patches

Hi!

>>> But in the end the vector code shouldn't end up worse than the
>>> scalar code with respect to IVs - the cases where it would should
>>> be already costed.  So I wonder if you have specific examples
>>> where things go worse enough for the heuristic to trigger?
>>>
>>
>> One typical case that I worked on to reuse this density check is the
>> function mat_times_vec of src file block_solver.fppized.f of SPEC2017
>> 503.bwaves_r; the density with the existing heuristic is 83 (which
>> unluckily doesn't exceed the threshold).  The interesting loop is the
>> innermost one, with the option set "-O2 -mcpu=power8 -ffast-math -ftree-vectorize".
>> We have verified that this loop isn't profitable to be vectorized at
>> O2 (without loop-interchange).
> 
> Yeah, but that's because the loop only runs 5 iterations, not because
> of some "density" (which suggests AGU overloading or some such)?
> Because if you modify it so it iterates more then with keeping the
> "density" measurement constant you suddenly become profitable?
> 

Yes, I agree this isn't a perfect example of how the density check
matters, though it is what led me to this check.  I ran the SPEC2017
benchmarks with and without this density heuristic hoping to catch some
"expected" case, but unluckily found none.  It may be worth trying some
more option sets, or even testing with the previous SPEC suites later.

I hacked the innermost loop iteration count from 5 to 20, but the baseline
run didn't finish (I killed it after more than 7 hours); I suspect it
became endless because of some garbage (out-of-bounds) data.

But the current cost modeling for this loop on Power is still bad: the
minimum profitable iteration count (both static and run time) is evaluated
as 2, while in reality even 5 iterations aren't profitable.
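
To make the "min profitable iteration" mismatch concrete, here is a simplified break-even search.  It is only a sketch under stated assumptions: it is not the formula the vectorizer actually uses, and the function name, parameters and cost units are all hypothetical.

```cpp
#include <cassert>

// Simplified break-even search; NOT the vectorizer's real formula.
// All costs are abstract units supplied by the caller (hypothetical).
// Returns the smallest iteration count for which the vector version is
// strictly cheaper than the scalar one, or -1 if none is found up to LIMIT.
int min_profitable_iters (int scalar_iter_cost, int vec_iter_cost,
                          int vf, int vec_outside_cost, int limit = 1024)
{
  assert (vf > 0);
  for (int n = 1; n <= limit; ++n)
    {
      int scalar_total = n * scalar_iter_cost;
      // Full vector iterations plus a scalar epilogue for the remainder.
      int vector_total = vec_outside_cost
                         + (n / vf) * vec_iter_cost
                         + (n % vf) * scalar_iter_cost;
      if (vector_total < scalar_total)
        return n;
    }
  return -1;
}
```

With a scalar cost of 4, a vector cost of 6, VF 2 and no outside cost this yields 2; if the modelled vector body cost is too optimistic (for example because stalls from strided loads are not accounted for), the real break-even point lies much higher.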


> The loop does have quite many memory streams so optimizing
> the (few) arithmetic ops by vectorizing them might not be worth
> the trouble, esp. since most of the loads are "strided" (composed
> from scalars) when no interchange is performed.  So it's probably
> more a "density" of # memory streams vs. # arithmetic ops, and
> esp. with any non-consecutive vector loads this balance being
> worse in the vector case?
> 

Yeah, these many scalar "strided" loads make things worse.  The vector
CTORs they feed have to wait until all of their required loads are ready,
and these vector CTORs are in turn required by further multiplications.

I posted one patch[1] on this, which tries to model it with
some counts: nload (total load number), nload_ctor (strided
load number fed into CTOR) and nctor_strided (CTOR number fed
by strided load).

The penalization is restricted by considering some factors:
  1) vect density ratio: if there are many vector instructions,
     the stalls from loads easily impact the subsequent
     computation.
  2) total load number: if nload is small, it's unlikely to
     bother the load/store units much.
  3) pct. of strided loads fed into CTOR: if a high portion of
     the strided loads feed CTORs, they are very likely to block
     the CTOR and its subsequent chain.
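
The three factors above could be combined roughly as follows.  This is a minimal sketch, not the posted patch: the struct, the function name and every threshold value are placeholders chosen for illustration; only the counter names come from this mail.

```cpp
#include <cassert>

// Counter names follow the mail; the struct itself is hypothetical.
struct load_stats
{
  unsigned nload;          // total number of loads
  unsigned nload_ctor;     // strided loads fed into a CTOR
  unsigned nctor_strided;  // CTORs fed by strided loads
};

// All threshold defaults are placeholders, not values from the patch.
bool should_penalize (const load_stats &s, unsigned density_pct,
                      unsigned density_threshold = 45,
                      unsigned nload_threshold = 20,
                      unsigned load_ctor_pct_threshold = 70)
{
  if (s.nload == 0)
    return false;
  unsigned load_ctor_pct = s.nload_ctor * 100 / s.nload;
  return density_pct > density_threshold          // factor 1
         && s.nload > nload_threshold             // factor 2
         && load_ctor_pct > load_ctor_pct_threshold;  // factor 3
}
```

All three conditions must hold, so a loop with few loads, or with loads that mostly do not feed CTORs, is left alone even when its density is high.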

btw, following your previous comments on add_stmt_cost, the load/strided/ctor
statistics should be gathered there instead, like:

  if (!data->costing_for_scalar && data->loop_info && where == vect_body)
    {
      if (kind == scalar_load || kind == vector_load || kind == unaligned_load
          || kind == vector_gather_load)
        data->nload += count;
      if (stmt_info && STMT_VINFO_STRIDED_P (stmt_info))
        {
          if (kind == scalar_load || kind == unaligned_load)
            data->nload_ctor += count;
          else if (kind == vec_construct)
            data->nctor_strided += count;
        }
    }

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569791.html

> The x86 add_stmt_cost has
> 
>   /* If we do elementwise loads into a vector then we are bound by
>      latency and execution resources for the many scalar loads
>      (AGU and load ports).  Try to account for this by scaling the
>      construction cost by the number of elements involved.  */
>   if ((kind == vec_construct || kind == vec_to_scalar)
>       && stmt_info
>       && (STMT_VINFO_TYPE (stmt_info) == load_vec_info_type
>           || STMT_VINFO_TYPE (stmt_info) == store_vec_info_type)
>       && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_ELEMENTWISE
>       && TREE_CODE (DR_STEP (STMT_VINFO_DATA_REF (stmt_info))) != INTEGER_CST)
>     {
>       stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>       stmt_cost *= (TYPE_VECTOR_SUBPARTS (vectype) + 1);
>     }
> 
> so it penalizes VMAT_ELEMENTWISE for variable step for both loads and stores.
> The above materialized over PRs 84037, 85491 and 87561, so not specifically
> for the bwaves case.  IIRC on x86 bwaves at -O2 is slower with vectorization
> as well.
> 
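
The scaling in the quoted x86 hook can be illustrated in isolation.  A minimal sketch: base_cost stands in for whatever ix86_builtin_vectorization_cost returns, and the function name and boolean flag are hypothetical simplifications of the condition quoted above.

```cpp
#include <cassert>

// base_cost: stand-in for the target's per-statement cost (hypothetical).
// nunits: number of vector elements, as TYPE_VECTOR_SUBPARTS would give.
// elementwise_variable_step: true when the access is elementwise with a
// non-constant step, i.e. the case the quoted condition selects.
int scaled_construct_cost (int base_cost, int nunits,
                           bool elementwise_variable_step)
{
  assert (nunits > 0);
  if (elementwise_variable_step)
    return base_cost * (nunits + 1);
  return base_cost;
}
```

For a four-element vector the construction cost is multiplied by 5, so each individual vec_construct or vec_to_scalar gets more expensive as the vector gets wider.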

Thanks for the pointer!  rs6000 probably can follow this way instead.
IIUC, this cost adjustment is for each individual vec_construct/vec_to_scalar,
is it better to use the way

[PATCH][OBVIOUS] testsuite: prune new LTO warning

2021-05-13 Thread Martin Liška


Pushed as obvious.

Martin

libgomp/ChangeLog:

PR testsuite/100569
* testsuite/libgomp.c/omp-nested-3.c: Prune new LTO warning.
* testsuite/libgomp.c/pr46032-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-clauses-kernels-ipa-pta.c: 
Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-clauses-parallel-ipa-pta.c: 
Likewise.

gcc/testsuite/ChangeLog:

PR testsuite/100569
* gcc.dg/atomic/c11-atomic-exec-2.c: Prune new LTO warning.
* gcc.dg/torture/pr94947-1.c: Likewise.
---
 gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-2.c  | 1 +
 gcc/testsuite/gcc.dg/torture/pr94947-1.c | 1 +
 libgomp/testsuite/libgomp.c/omp-nested-3.c   | 1 +
 libgomp/testsuite/libgomp.c/pr46032-2.c  | 1 +
 .../libgomp.oacc-c-c++-common/data-clauses-kernels-ipa-pta.c | 1 +
 .../libgomp.oacc-c-c++-common/data-clauses-parallel-ipa-pta.c| 1 +
 6 files changed, 6 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-2.c 
b/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-2.c
index 9ee56b60193..3e75096243e 100644
--- a/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-2.c
+++ b/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-2.c
@@ -2,6 +2,7 @@
assignment.  */
 /* { dg-do run } */
 /* { dg-options "-std=c11 -pedantic-errors" } */
+/* { dg-prune-output "warning: using serial compilation" } */
 
 extern void abort (void);

 extern void exit (int);
diff --git a/gcc/testsuite/gcc.dg/torture/pr94947-1.c 
b/gcc/testsuite/gcc.dg/torture/pr94947-1.c
index ab8b488c6fc..832e40db118 100644
--- a/gcc/testsuite/gcc.dg/torture/pr94947-1.c
+++ b/gcc/testsuite/gcc.dg/torture/pr94947-1.c
@@ -1,6 +1,7 @@
 /* { dg-do run } */
 /* { dg-additional-sources "pr94947-2.c" } */
 /* { dg-additional-options "-fipa-pta -flto-partition=1to1" } */
+/* { dg-prune-output "warning: using serial compilation" } */
 
 extern void abort ();

 extern void baz ();
diff --git a/libgomp/testsuite/libgomp.c/omp-nested-3.c 
b/libgomp/testsuite/libgomp.c/omp-nested-3.c
index 7790c58d515..446e6bd386a 100644
--- a/libgomp/testsuite/libgomp.c/omp-nested-3.c
+++ b/libgomp/testsuite/libgomp.c/omp-nested-3.c
@@ -1,4 +1,5 @@
 // { dg-do run { target lto } }
 // { dg-additional-options "-fipa-pta -flto -flto-partition=max" }
+// { dg-prune-output "warning: using serial compilation" }
 
 #include "omp-nested-1.c"

diff --git a/libgomp/testsuite/libgomp.c/pr46032-2.c 
b/libgomp/testsuite/libgomp.c/pr46032-2.c
index 1125f6ec2b2..36f37301abe 100644
--- a/libgomp/testsuite/libgomp.c/pr46032-2.c
+++ b/libgomp/testsuite/libgomp.c/pr46032-2.c
@@ -1,4 +1,5 @@
 /* { dg-do run { target lto } } */
 /* { dg-options "-O2 -ftree-vectorize -std=c99 -fipa-pta -flto 
-flto-partition=max" } */
+/* { dg-prune-output "warning: using serial compilation" } */
 
 #include "pr46032.c"

diff --git 
a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-clauses-kernels-ipa-pta.c 
b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-clauses-kernels-ipa-pta.c
index 2cd98bd9d78..49c11acd933 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-clauses-kernels-ipa-pta.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-clauses-kernels-ipa-pta.c
@@ -1,4 +1,5 @@
 /* { dg-do run { target lto } } */
 /* { dg-additional-options "-fipa-pta -flto -flto-partition=max" } */
+/* { dg-prune-output "warning: using serial compilation" } */
 
 #include "data-clauses-kernels.c"

diff --git 
a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-clauses-parallel-ipa-pta.c 
b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-clauses-parallel-ipa-pta.c
index ddcf4e389cb..4d61d847c61 100644
--- 
a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-clauses-parallel-ipa-pta.c
+++ 
b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-clauses-parallel-ipa-pta.c
@@ -1,4 +1,5 @@
 /* { dg-do run { target lto } } */
 /* { dg-additional-options "-fipa-pta -flto -flto-partition=max" } */
+/* { dg-prune-output "warning: using serial compilation" } */
 
 #include "data-clauses-parallel.c"

--
2.31.1

Re: [PATCH] tsan: fix false positive for pthread_cond_clockwait

2021-05-13 Thread Martin Liška


On 5/7/21 7:07 PM, Michael de Lang via Gcc-patches wrote:

pthread_cond_clockwait isn't added to TSAN_INTERCEPTORS which leads to
false positives regarding double locking of a mutex. This was
uncovered by a user reporting an issue to the google sanitizer github:
https://github.com/google/sanitizers/issues/1259

This patch copies code from the fix made in llvm:
https://github.com/llvm/llvm-project/commit/16eb853ffdd1a1ad7c95455b7795c5f004402e46


Hello.

Thank you for looking into this.



However, because the tsan related source code hasn't been kept in sync
with llvm, I had to make some modifications.
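
The diff further down reworks cond_wait to take a single-argument callback and packs the condition variable, mutex, deadline and clock into CondMutexUnlockCtx.  The pattern can be sketched on its own as follows; every name here is a hypothetical stand-in, with plain ints replacing the pthread objects.

```cpp
#include <cassert>

// Hypothetical context: bundles every argument any wait variant may need.
struct wait_ctx
{
  int *c;       // stand-in for the condition variable
  int *m;       // stand-in for the mutex
  int abstime;  // stand-in for the deadline
};

// One callback shape for all wait variants.
using wait_fn = int (*) (void *arg);

// Stand-in for the helper that runs the blocking call.
int call_with_cleanup (wait_fn fn, void *arg)
{
  return fn (arg);
}

static int real_wait (int *c, int *m) { return *c + *m; }
static int real_timedwait (int *c, int *m, int abstime)
{
  return *c + *m + abstime;
}

int do_wait (wait_ctx &ctx, bool timed)
{
  // Captureless lambdas convert implicitly to plain function pointers.
  if (timed)
    return call_with_cleanup ([] (void *a) {
      wait_ctx *arg = static_cast<wait_ctx *> (a);
      return real_timedwait (arg->c, arg->m, arg->abstime);
    }, &ctx);
  return call_with_cleanup ([] (void *a) {
    wait_ctx *arg = static_cast<wait_ctx *> (a);
    return real_wait (arg->c, arg->m);
  }, &ctx);
}
```

Adding a further variant with a different argument list, as the patch does for the clock-based wait, then only needs another field in the context and another lambda.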


We merge from master roughly twice a year. I've just merged LLVM upstream to
our master.



Given that this is my first contribution to gcc, let me know if I've
missed anything.


Please take a look at the following steps:
https://gcc.gnu.org/contribute.html

We still want your test-case, can you please resend the patch on the current 
master?

Thanks!
Cheers,
Martin



Met vriendelijke groet,
Michael de Lang

+++ b/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C
@@ -0,0 +1,31 @@
+// Test pthread_cond_clockwait not generating false positives with tsan
+// { dg-do run { target { { *-*-linux* *-*-gnu* *-*-uclinux* } && pthread } } }
+// { dg-options "-fsanitize=thread -lpthread" }
+
+#include <pthread.h>
+
+pthread_cond_t cv;
+pthread_mutex_t mtx;
+
+void *fn(void *vp) {
+pthread_mutex_lock(&mtx);
+pthread_cond_signal(&cv);
+pthread_mutex_unlock(&mtx);
+return NULL;
+}
+
+int main() {
+pthread_mutex_lock(&mtx);
+
+pthread_t tid;
+pthread_create(&tid, NULL, fn, NULL);
+
+struct timespec ts;
+clock_gettime(CLOCK_MONOTONIC, &ts);
+ts.tv_sec += 10;
+pthread_cond_clockwait(&cv, &mtx, CLOCK_MONOTONIC, &ts);
+pthread_mutex_unlock(&mtx);
+
+pthread_join(tid, NULL);
+return 0;
+}
diff --git a/libsanitizer/tsan/tsan_interceptors_posix.cpp
b/libsanitizer/tsan/tsan_interceptors_posix.cpp
index aa04d8dfb67..7b3d0a917de 100644
--- a/libsanitizer/tsan/tsan_interceptors_posix.cpp
+++ b/libsanitizer/tsan/tsan_interceptors_posix.cpp
@@ -1126,7 +1126,10 @@ struct CondMutexUnlockCtx {
ScopedInterceptor *si;
ThreadState *thr;
uptr pc;
+  void *c;
void *m;
+  void *abstime;
+  __sanitizer_clockid_t clock;
  };

  static void cond_mutex_unlock(CondMutexUnlockCtx *arg) {
@@ -1152,19 +1155,18 @@ INTERCEPTOR(int, pthread_cond_init, void *c, void *a) {
  }

  static int cond_wait(ThreadState *thr, uptr pc, ScopedInterceptor *si,
- int (*fn)(void *c, void *m, void *abstime), void *c,
- void *m, void *t) {
+ int (*fn)(void *arg), void *c,
+ void *m, void *t, __sanitizer_clockid_t clock) {
MemoryAccessRange(thr, pc, (uptr)c, sizeof(uptr), false);
MutexUnlock(thr, pc, (uptr)m);
-  CondMutexUnlockCtx arg = {si, thr, pc, m};
+  CondMutexUnlockCtx arg = {si, thr, pc, c, m, t, clock};
int res = 0;
// This ensures that we handle mutex lock even in case of pthread_cancel.
// See test/tsan/cond_cancel.cpp.
{
  // Enable signal delivery while the thread is blocked.
  BlockingCall bc(thr);
-res = call_pthread_cancel_with_cleanup(
-fn, c, m, t, (void (*)(void *arg))cond_mutex_unlock, &arg);
+res = call_pthread_cancel_with_cleanup(fn, (void (*)(void
*arg))cond_mutex_unlock, &arg);
}
if (res == errno_EOWNERDEAD) MutexRepair(thr, pc, (uptr)m);
MutexPostLock(thr, pc, (uptr)m, MutexFlagDoPreLockOnPostLock);
@@ -1174,25 +1176,34 @@ static int cond_wait(ThreadState *thr, uptr
pc, ScopedInterceptor *si,
  INTERCEPTOR(int, pthread_cond_wait, void *c, void *m) {
void *cond = init_cond(c);
SCOPED_TSAN_INTERCEPTOR(pthread_cond_wait, cond, m);
-  return cond_wait(thr, pc, &si, (int (*)(void *c, void *m, void
*abstime))REAL(
- pthread_cond_wait),
-   cond, m, 0);
+  return cond_wait(thr, pc, &si, [](void *a) { CondMutexUnlockCtx
*arg = (CondMutexUnlockCtx*)a; return REAL(pthread_cond_wait)(arg->c,
arg->m); },
+   cond, m, 0, 0);
  }

  INTERCEPTOR(int, pthread_cond_timedwait, void *c, void *m, void *abstime) {
void *cond = init_cond(c);
SCOPED_TSAN_INTERCEPTOR(pthread_cond_timedwait, cond, m, abstime);
-  return cond_wait(thr, pc, &si, REAL(pthread_cond_timedwait), cond, m,
-   abstime);
+  return cond_wait(thr, pc, &si, [](void *a) { CondMutexUnlockCtx
*arg = (CondMutexUnlockCtx*)a; return
REAL(pthread_cond_timedwait)(arg->c, arg->m, arg->abstime); }, cond,
m,
+   abstime, 0);
  }

+#if SANITIZER_LINUX
+INTERCEPTOR(int, pthread_cond_clockwait, void *c, void *m,
__sanitizer_clockid_t clock, void *abstime) {
+  void *cond = init_cond(c);
+  SCOPED_TSAN_INTERCEPTOR(pthread_cond_clockwait, cond, m, clock, abstime);
+  return cond_wait(thr, pc, &si,
+   [](void *a) { CondMutexUnlockCtx *arg =
(CondMutexUnlockCtx*)a; return REAL(pthread

Re: [PATCH][OBVIOUS] testsuite: prune new LTO warning

2021-05-13 Thread Eric Botcazou

> gcc/testsuite/ChangeLog:
> 
>   PR testsuite/100569
>   * gcc.dg/atomic/c11-atomic-exec-2.c: Prune new LTO warning.
>   * gcc.dg/torture/pr94947-1.c: Likewise.

Another one:

PR testsuite/100569
* gnat.dg/lto21.adb: Prune new LTO warning.

-- 
Eric Botcazou
diff --git a/gcc/testsuite/gnat.dg/lto21.adb b/gcc/testsuite/gnat.dg/lto21.adb
index fe6fb2734b5..f6266978ee0 100644
--- a/gcc/testsuite/gnat.dg/lto21.adb
+++ b/gcc/testsuite/gnat.dg/lto21.adb
@@ -1,5 +1,6 @@
 -- { dg-do run }
 -- { dg-options "-O3 -flto" { target lto } }
+-- { dg-prune-output "warning: using serial compilation" }
 
 with Lto21_Pkg1;
 with Lto21_Pkg2; use Lto21_Pkg2;

[PATCH] Port gnat-style to Sphinx.

2021-05-13 Thread Martin Liška


Hello.

Right now, 2/3 of Ada manuals are both in .texi and .rst. I would like to port
a small gnat-style manual based on [1].

Ready for master?
Thanks,
Martin

[1] https://github.com/davidmalcolm/texi2rst

gcc/ada/ChangeLog:

* doc/Makefile: Add gnat-style target.
* doc/share/conf.py: Likewise.
* doc/gnat-style.rst: New file.
---
 gcc/ada/doc/Makefile   |   2 +-
 gcc/ada/doc/gnat-style.rst | 691 +
 gcc/ada/doc/share/conf.py  |   4 +-
 3 files changed, 695 insertions(+), 2 deletions(-)
 create mode 100644 gcc/ada/doc/gnat-style.rst

diff --git a/gcc/ada/doc/Makefile b/gcc/ada/doc/Makefile
index 9a435ebbb1f..4adfd368cc8 100644
--- a/gcc/ada/doc/Makefile
+++ b/gcc/ada/doc/Makefile
@@ -14,7 +14,7 @@ ALLSPHINXOPTS   = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) \
  -c $(SOURCEDIR)/share \
  -d $(BUILDDIR)/$*/doctrees \
  $(SOURCEDIR)
-DOC_LIST=gnat_rm gnat_ugn
+DOC_LIST=gnat_rm gnat_ugn gnat-style
 FMT_LIST=html pdf txt info
 
 .PHONY: help clean

diff --git a/gcc/ada/doc/gnat-style.rst b/gcc/ada/doc/gnat-style.rst
new file mode 100644
index 000..527e7ba2a66
--- /dev/null
+++ b/gcc/ada/doc/gnat-style.rst
@@ -0,0 +1,691 @@
+GNAT Coding Style: A Guide for GNAT Developers
+==============================================
+
+General
+-------
+
+Most of GNAT is written in Ada using a consistent style to ensure
+readability of the code.  This document has been written to help
+maintain this consistent style, while having a large group of developers
+work on the compiler.
+
+For the coding style in the C parts of the compiler and run time,
+see the GNU Coding Guidelines.
+
+This document is structured after the Ada Reference Manual.
+Those familiar with that document should be able to quickly
+lookup style rules for particular constructs.
+
+Lexical Elements
+----------------
+
+Character Set and Separators
+****************************
+
+.. index:: Character set
+.. index:: ASCII
+.. index:: Separators
+.. index:: End-of-line
+.. index:: Line length
+.. index:: Indentation
+
+* The character set used should be plain 7-bit ASCII.
+  The only separators allowed are space and the end-of-line sequence.
+  No other control character or format effector (such as ``HT``,
+  ``VT``, ``FF`` )
+  should be used.
+  The normal end-of-line sequence is used, which may be
+  ``LF``, ``CR/LF`` or ``CR``,
+  depending on the host system.  An optional ``SUB``
+  ( ``16#1A#`` ) may be present as the
+  last character in the file on hosts using that character as file terminator.
+
+* Files that are checked in or distributed should be in host format.
+
+* A line should never be longer than 79 characters, not counting the line
+  separator.
+
+* Lines must not have trailing blanks.
+
+* Indentation is 3 characters per level for ``if`` statements, loops, and
+  ``case`` statements.
+  For exact information on required spacing between lexical
+  elements, see file style.adb.
+
+  .. index:: style.adb file
+
+Identifiers
+***********
+
+* Identifiers will start with an upper case letter, and each letter following
+  an underscore will be upper case.
+
+  .. index:: Casing (for identifiers)
+
+  Short acronyms may be all upper case.
+  All other letters are lower case.
+  An exception is for identifiers matching a foreign language.  In particular,
+  we use all lower case where appropriate for C.
+
+* Use underscores to separate words in an identifier.
+
+  .. index:: Underscores
+
+* Try to limit your use of abbreviations in identifiers.
+  It is ok to make a few abbreviations, explain what they mean, and then
+  use them frequently, but don't use lots of obscure abbreviations.  An
+  example is the ``ALI`` word which stands for Ada Library
+  Information and is by convention always written in upper-case when
+  used in entity names.
+
+  .. code-block:: ada
+
+   procedure Find_ALI_Files;
+
+* Don't use the variable name ``I``, use ``J`` instead; ``I`` is too
+  easily confused with ``1`` in some fonts.  Similarly don't use the
+  variable ``O``, which is too easily mistaken for the number ``0``.
+
+Numeric Literals
+****************
+
+* Numeric literals should include underscores where helpful for
+  readability.
+
+  .. index:: Underscores
+
+  .. code-block:: ada
+
+      1_000_000
+      16#8000_0000#
+      3.14159_26535_89793_23846
+
+Reserved Words
+**************
+
+* Reserved words use all lower case.
+
+  .. index:: Casing (for reserved words)
+
+  .. code-block:: ada
+
+   return else
+
+* The words ``Access``, ``Delta`` and ``Digits`` are
+  capitalized when used as attribute_designator.
+
+Comments
+********
+
+* A comment starts with ``--`` followed by two spaces.
+  The only exception to this rule (i.e. one space is tolerated) is when the
+  comment ends with a single space followed by ``--``.
+  It is also acceptable to have only one space between ``--`` and the start
+  of the comment

Re: [PATCH] libgccjit: add some reflection functions in the jit C api

2021-05-13 Thread Martin Liška


@David: PING

On 11/3/20 11:13 PM, Antoni Boucher via Gcc-patches wrote:

I was missing a check in gcc_jit_struct_get_field, I added it in this new patch.

On Thu, Oct 15, 2020 at 05:52:33PM -0400, David Malcolm wrote:

On Thu, 2020-10-15 at 13:39 -0400, Antoni Boucher wrote:

Thanks. I updated the patch with these changes.


Thanks for the patch; review below.  Sorry if it seems excessively nitpicky
in places.


2020-09-1  Antoni Boucher  

    gcc/jit/
    PR target/96889
    * docs/topics/compatibility.rst (LIBGCCJIT_ABI_14): New ABI tag.


15 now.


    * docs/topics/functions.rst: Add documentation for the
    functions gcc_jit_function_get_return_type and
    gcc_jit_function_get_param_count
    * libgccjit.c: New functions:
  * gcc_jit_function_get_return_type;
  * gcc_jit_function_get_param_count;
  * gcc_jit_function_type_get_return_type;
  * gcc_jit_function_type_get_param_count;
  * gcc_jit_function_type_get_param_type;
  * gcc_jit_type_unqualified;
  * gcc_jit_type_is_array;
  * gcc_jit_type_is_bool;
  * gcc_jit_type_is_function_ptr_type;
  * gcc_jit_type_is_int;
  * gcc_jit_type_is_pointer;
  * gcc_jit_type_is_vector;
  * gcc_jit_vector_type_get_element_type;
  * gcc_jit_vector_type_get_num_units;
  * gcc_jit_struct_get_field;
  * gcc_jit_type_is_struct;
  * gcc_jit_struct_get_field_count;


This isn't valid ChangeLog format; it will fail the git hooks.


    * libgccjit.h


Likewise.


    * jit-recording.h: New functions (is_struct and is_vector)
    * libgccjit.map (LIBGCCJIT_ABI_14): New ABI tag.


15 now.



    gcc/testsuite/
    PR target/96889
    * jit.dg/all-non-failing-tests.h: Add test-reflection.c.
    * jit.dg/test-reflection.c: New test.


[...]



diff --git a/gcc/jit/docs/topics/functions.rst 
b/gcc/jit/docs/topics/functions.rst
index eb40d64010e..9819c28cda2 100644
--- a/gcc/jit/docs/topics/functions.rst
+++ b/gcc/jit/docs/topics/functions.rst
@@ -171,6 +171,16 @@ Functions
    underlying string, so it is valid to pass in a pointer to an on-stack
    buffer.

+.. function::  size_t \
+   gcc_jit_function_get_param_count (gcc_jit_function *func)
+
+   Get the number of parameters of the function.
+
+.. function::  gcc_jit_type \*
+   gcc_jit_function_get_return_type (gcc_jit_function *func)
+
+   Get the return type of the function.
+


The documentation part of the patch is incomplete: it hasn't been
updated to add all the new entrypoints.
Also, the return type of gcc_jit_function_get_param_count is
inconsistent (size_t above, but ssize_t below).



diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
index 30e37aff387..525b8bc921d 100644
--- a/gcc/jit/jit-recording.h
+++ b/gcc/jit/jit-recording.h
@@ -538,7 +538,9 @@ public:
   virtual bool is_bool () const = 0;
   virtual type *is_pointer () = 0;
   virtual type *is_array () = 0;
+  virtual struct_ *is_struct () { return NULL; }


Can't you use dyn_cast_struct for this?
Or is this about looking through decorated_type? e.g. for const and
volatile variants?

I guess my question is, what is the purpose of gcc_jit_type_is_struct?


   virtual bool is_void () const { return false; }
+  virtual vector_type *is_vector () { return NULL; }


Likewise, can't you use dyn_cast_vector_type for this?


   virtual bool has_known_size () const { return true; }

   bool is_numeric () const
@@ -595,6 +597,8 @@ public:
   bool is_bool () const FINAL OVERRIDE;
   type *is_pointer () FINAL OVERRIDE { return dereference (); }
   type *is_array () FINAL OVERRIDE { return NULL; }
+  vector_type *is_vector () FINAL OVERRIDE { return NULL; }
+  struct_ *is_struct () FINAL OVERRIDE { return NULL; }


Likewise, and this is redundant, as it's merely copying the base class
implementation.


   bool is_void () const FINAL OVERRIDE { return m_kind == GCC_JIT_TYPE_VOID; }

 public:
@@ -629,6 +633,8 @@ public:
   bool is_bool () const FINAL OVERRIDE { return false; }
   type *is_pointer () FINAL OVERRIDE { return m_other_type; }
   type *is_array () FINAL OVERRIDE { return NULL; }
+  vector_type *is_vector () FINAL OVERRIDE { return NULL; }
+  struct_ *is_struct () FINAL OVERRIDE { return NULL; }


Likewise.



@@ -655,6 +661,7 @@ public:
   bool is_bool () const FINAL OVERRIDE { return m_other_type->is_bool (); }
   type *is_pointer () FINAL OVERRIDE { return m_other_type->is_pointer (); }
   type *is_array () FINAL OVERRIDE { return m_other_type->is_array (); }
+  struct_ *is_struct () FINAL OVERRIDE { return m_other_type->is_struct (); }


Aha: with a decorated type you look through the decoration.
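
The look-through behaviour noted here can be sketched with a stripped-down class hierarchy.  This is an illustration under assumptions, not the libgccjit code: the classes carry the same names as in the diff but contain only the members needed to show the forwarding.

```cpp
#include <cassert>

struct struct_;

// Base class: by default a type is not a struct.
struct type
{
  virtual ~type () {}
  virtual struct_ *is_struct () { return nullptr; }
};

// A struct type reports itself.
struct struct_ : type
{
  struct_ *is_struct () override { return this; }
};

// const/volatile-style wrapper: forwards the query to the wrapped type,
// so the decoration is looked through.
struct decorated_type : type
{
  explicit decorated_type (type *other) : m_other_type (other) {}
  struct_ *is_struct () override { return m_other_type->is_struct (); }

protected:
  type *m_other_type;
};
```

A query on a wrapper around a struct_ returns the underlying struct_, even through several layers of decoration, whereas a plain cast of the wrapper object itself would not find one.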


 protected:
   type *m_other_type;
@@ -737,6 +744,8 @@ public:

   void replay_into (replayer *) FINAL OVERRIDE;

+  vector_type *is

Re: [PATCH] libgccjit: Handle truncation and extension for casts [PR 95498]

2021-05-13 Thread Martin Liška


@David: PING

On 2/20/21 11:17 PM, Antoni Boucher via Gcc-patches wrote:

Hi.
Thanks for your feedback!

See answers below:

On Sat, Feb 20, 2021 at 11:20:35AM -0700, Tom Tromey wrote:

"Antoni" == Antoni Boucher via Gcc-patches  writes:


Antoni> gcc/jit/
Antoni> PR target/95498
Antoni> * jit-playback.c: Add support to handle truncation and extension
Antoni> in the convert function.

Antoni> +  switch (dst_code)
Antoni> +    {
Antoni> +    case INTEGER_TYPE:
Antoni> +    case ENUMERAL_TYPE:
Antoni> +  t_ret = convert_to_integer (dst_type, expr);
Antoni> +  goto maybe_fold;
Antoni> +
Antoni> +    default:
Antoni> +  gcc_assert (gcc::jit::active_playback_ctxt);
Antoni> +  gcc::jit::active_playback_ctxt->add_error (NULL, "unhandled conversion");
Antoni> +  fprintf (stderr, "input expression:\n");
Antoni> +  debug_tree (expr);
Antoni> +  fprintf (stderr, "requested type:\n");
Antoni> +  debug_tree (dst_type);
Antoni> +  return error_mark_node;
Antoni> +
Antoni> +    maybe_fold:
Antoni> +  if (TREE_CODE (t_ret) != C_MAYBE_CONST_EXPR)
Antoni> +    t_ret = fold (t_ret);
Antoni> +  return t_ret;

It seems weird to have a single 'goto' to maybe_fold, especially inside
a switch like this.

If you think the maybe_fold code won't be reused, then it should just be
hoisted up and the 'goto' removed.


This actually depends on how support for casts between integers and pointers
will be implemented (see below).
If we support truncating pointers (does that even make sense? and I guess
we cannot extend a pointer unless we add support for uint128_t), that label
will be reused for that case.
Otherwise, it might not be reused.

So, please tell me which option to choose and I'll update my patch.


On the other hand, if the maybe_fold code might be reused for some other
case, then I suppose I would have the case end with 'break' and then
have this code outside the switch.
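
The second suggestion (cases end with 'break', shared folding after the switch) might look like the sketch below.  Everything here is a stand-in: expr replaces the tree, the 16-bit mask replaces convert_to_integer, and the error flag replaces add_error plus error_mark_node.

```cpp
#include <cassert>

enum type_code { INTEGER_TYPE, ENUMERAL_TYPE, POINTER_TYPE };

// Hypothetical stand-in for a tree expression.
struct expr
{
  unsigned long value;
  bool folded;
  bool error;
};

// Stand-in for fold ().
static expr fold (expr e)
{
  e.folded = true;
  return e;
}

expr convert (type_code dst_code, expr e)
{
  expr ret = e;
  switch (dst_code)
    {
    case INTEGER_TYPE:
    case ENUMERAL_TYPE:
      ret.value = e.value & 0xffff;  // stand-in truncation
      break;

    default:
      ret.error = true;              // unhandled conversion
      return ret;
    }

  // Shared tail: any future case that ends with 'break' gets folded too.
  return fold (ret);
}
```

This keeps the folding reusable for later cases (such as a possible pointer conversion) without a label or a 'goto'.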


In another message, you wrote:

Antoni> For your question, the current code already works with boolean and
Antoni> reals and casts between integers and pointers is currently not
Antoni> supported.

I am curious why this wasn't supported.  It seems like something that
one might want to do.


I have no idea, as this is my first contribution to gcc.
But this would indeed be very useful and I opened an issue about this:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95438


thanks,
Tom


Thanks!

Re: [PATCH 3/3] Use startswith in targets.

2021-05-13 Thread Martin Liška


May I please ping this?

Martin

On 3/19/21 10:21 AM, Martin Liska wrote:


gcc/ChangeLog:

* common/config/aarch64/aarch64-common.c (aarch64_parse_extension):
Use startswith function instead of strncmp.
* common/config/bfin/bfin-common.c (bfin_handle_option): Likewise.
* common/config/riscv/riscv-common.c (riscv_subset_list::parse): 
Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.cc (parse_type): Likewise.
* config/aarch64/aarch64.c (aarch64_process_one_target_attr): Likewise.
* config/alpha/alpha.c (alpha_elf_section_type_flags): Likewise.
* config/arm/aarch-common.c (arm_md_asm_adjust): Likewise.
* config/arm/arm.c (arm_file_start): Likewise.
(arm_valid_target_attribute_rec): Likewise.
(thumb1_md_asm_adjust): Likewise.
* config/arm/driver-arm.c (host_detect_local_cpu): Likewise.
* config/avr/avr.c (STR_PREFIX_P): Likewise.
(avr_set_current_function): Likewise.
(avr_handle_addr_attribute): Likewise.
(avr_asm_output_aligned_decl_common): Likewise.
(avr_asm_named_section): Likewise.
(avr_section_type_flags): Likewise.
(avr_asm_select_section): Likewise.
* config/c6x/c6x.c (c6x_in_small_data_p): Likewise.
(c6x_section_type_flags): Likewise.
* config/darwin-c.c (darwin_cfstring_ref_p): Likewise.
(darwin_objc_declare_unresolved_class_reference): Likewise.
(darwin_objc_declare_class_definition): Likewise.
* config/darwin.c (indirect_data): Likewise.
(darwin_encode_section_info): Likewise.
(darwin_objc2_section): Likewise.
(darwin_objc1_section): Likewise.
(machopic_select_section): Likewise.
(darwin_globalize_label): Likewise.
(darwin_label_is_anonymous_local_objc_name): Likewise.
(darwin_asm_named_section): Likewise.
(darwin_asm_output_dwarf_offset): Likewise.
* config/frv/frv.c (frv_string_begins_with): Likewise.
(frv_in_small_data_p): Likewise.
* config/gcn/mkoffload.c (STR): Likewise.
(main): Likewise.
* config/i386/i386-builtins.c (get_builtin_code_for_version): Likewise.
* config/i386/i386-options.c (ix86_option_override_internal): Likewise.
* config/i386/i386.c (x86_64_elf_section_type_flags): Likewise.
(ix86_md_asm_adjust): Likewise.
* config/i386/intelmic-mkoffload.c (STR): Likewise.
* config/i386/winnt.c (i386_pe_asm_named_section): Likewise.
(i386_pe_file_end): Likewise.
* config/ia64/ia64.c (ia64_in_small_data_p): Likewise.
(ia64_section_type_flags): Likewise.
* config/mips/driver-native.c (host_detect_local_cpu): Likewise.
* config/mips/mips.c (mips_handle_interrupt_attr): Likewise.
(mips16_stub_function_p): Likewise.
(mips_function_rodata_section): Likewise.
* config/msp430/msp430.c (msp430_mcu_name): Likewise.
(msp430_function_section): Likewise.
(msp430_section_type_flags): Likewise.
(msp430_expand_helper): Likewise.
* config/nios2/nios2.c (nios2_small_section_name_p): Likewise.
(nios2_valid_target_attribute_rec): Likewise.
* config/nvptx/mkoffload.c (process): Likewise.
(STR): Likewise.
* config/pa/som.h: Likewise.
* config/pdp11/pdp11.c (pdp11_output_ident): Likewise.
* config/riscv/riscv.c (riscv_elf_select_rtx_section): Likewise.
* config/rs6000/rs6000.c (VTABLE_NAME_P): Likewise.
(rs6000_inner_target_options): Likewise.
* config/s390/driver-native.c (s390_host_detect_local_cpu): Likewise.
* config/sparc/driver-sparc.c (host_detect_local_cpu): Likewise.
* config/vax/vax.c (vax_output_int_move): Likewise.
* config/vms/vms-ld.c (startswith): Likewise.
(process_args): Likewise.
(main): Likewise.
* config/vms/vms.c: Likewise.
---
  gcc/common/config/aarch64/aarch64-common.c|   2 +-
  gcc/common/config/bfin/bfin-common.c  |   2 +-
  gcc/common/config/riscv/riscv-common.c|   4 +-
  .../aarch64/aarch64-sve-builtins-shapes.cc|   4 +-
  gcc/config/aarch64/aarch64.c  |   2 +-
  gcc/config/alpha/alpha.c  |   8 +-
  gcc/config/arm/aarch-common.c |   2 +-
  gcc/config/arm/arm.c  |   8 +-
  gcc/config/arm/driver-arm.c   |   4 +-
  gcc/config/avr/avr.c  |  25 ++--
  gcc/config/c6x/c6x.c  |  14 +-
  gcc/config/darwin-c.c |   9 +-
  gcc/config/darwin.c   | 141 +-
  gcc/config/frv/frv.c  |  16 +-
  gcc/config/gcn/mkoffload.c|  10 +-
  gcc/config/i386/i386-builtins.c   |   2 +-
  gcc/config/i386/i386-options.c|   2 +-
  gcc/config/i386/i386.c
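
For readers skimming the ChangeLog above: the conversion replaces open-coded strncmp prefix checks with a startswith helper.  A minimal sketch, assuming the helper has the usual strncmp-based definition; the two section-name predicates are hypothetical and not taken from the patch.

```cpp
#include <cassert>
#include <cstring>

// Assumed definition of the helper: true if STR begins with PREFIX.
static inline bool startswith (const char *str, const char *prefix)
{
  return std::strncmp (str, prefix, std::strlen (prefix)) == 0;
}

// Before: the length literal must be kept in sync with the string by hand.
bool is_text_subsection_old (const char *name)
{
  return std::strncmp (name, ".text.", 6) == 0;
}

// After: the prefix length is computed from the prefix itself.
bool is_text_subsection_new (const char *name)
{
  return startswith (name, ".text.");
}
```

Both predicates agree on every input; the second simply removes the chance of a mismatched length constant.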

[PATCH] Remove unused variable.

2021-05-13 Thread Martin Liška


Addresses the following clang warning:
gcc/tree-ssa-dom.c:652:33: warning: private field 'm_simplifier' is not used 
[-Wunused-private-field]

Ready for master?
Thanks

gcc/ChangeLog:

* tree-ssa-dom.c: Remove m_simplifier.
---
 gcc/tree-ssa-dom.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 11b86b2a326..075b1ccb9de 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -649,7 +649,6 @@ private:
 
   void test_for_singularity (gimple *, avail_exprs_stack *);
 
-  dom_jump_threader_simplifier *m_simplifier;

   jump_threader *m_threader;
   evrp_range_analyzer *m_evrp_range_analyzer;
 };
--
2.31.1

[i386] Fix ICE [PR target/100549]

2021-05-13 Thread Hongtao Liu via Gcc-patches

Hi:
  When arg0 is the same as arg1 in __builtin_ia32_pcmpgtw,
gimple_build (&stmts, GT_EXPR, cmp_type, arg0, arg1) simplifies the
comparison to the vector constant 0 and no stmts are generated, which causes
an ICE in gsi_insert_before (gsi, stmts, GSI_SAME_STMT). So don't insert
stmts when it's NULL.
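
The failure mode and the guard can be modelled outside the compiler.  This is only a sketch with hypothetical stand-ins: strings replace gimple statements, build_gt mimics a builder that folds x > x to a constant without emitting anything, and insert_before rejects an empty sequence where the real function ICEs.

```cpp
#include <cassert>
#include <string>
#include <vector>

using stmt_seq = std::vector<std::string>;  // stand-in for gimple_seq

// Stand-in builder: returns the name of the comparison result.  When both
// operands are identical the result folds to a constant and no statement
// is appended to STMTS.
std::string build_gt (stmt_seq &stmts, const std::string &a,
                      const std::string &b)
{
  if (a == b)
    return "{ 0, ... }";
  stmts.push_back ("tmp = " + a + " > " + b);
  return "tmp";
}

// Stand-in for the insertion routine; it refuses an empty sequence.
bool insert_before (stmt_seq &body, const stmt_seq &stmts)
{
  if (stmts.empty ())
    return false;
  body.insert (body.end (), stmts.begin (), stmts.end ());
  return true;
}

stmt_seq fold_pcmpgt (const std::string &a, const std::string &b)
{
  stmt_seq body;
  stmt_seq stmts;
  std::string cmp = build_gt (stmts, a, b);
  if (!stmts.empty ())  // the guard: only insert when something was built
    insert_before (body, stmts);
  body.push_back ("lhs = VEC_COND_EXPR <" + cmp + ", -1, 0>");
  return body;
}
```

With identical operands the result contains only the final assignment; with distinct operands the comparison statement precedes it.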

  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}
  Ok for trunk?

gcc/ChangeLog:

PR target/100549
* config/i386/i386.c (ix86_gimple_fold_builtin): Insert gimple
stmts if stmts is not NULL.

gcc/testsuite/ChangeLog:

PR target/100549
* gcc.target/i386/pr100549.c: New test.

-- 
BR,
Hongtao
From 21eeafce731ec28f3d378b5c1cc94f505a677121 Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Thu, 13 May 2021 13:08:16 +0800
Subject: [PATCH] [i386] Fix ICE [PR target/100549]

When arg0 is same as arg1 in __builtin_ia32_pcmpgtw,
gimple_build (&stmts, GT_EXPR, cmp_type, arg0, arg1) will simplify the
comparison to vector constant 0, no stmts is generated, which causes
ICE in gsi_insert_before (gsi, stmts, GSI_SAME_STMT). So don't insert
stmts when it's NULL.

gcc/ChangeLog:

	PR target/100549
	* config/i386/i386.c (ix86_gimple_fold_builtin): Insert gimple
	stmts if stmts is not NULL.

gcc/testsuite/ChangeLog:

	PR target/100549
	* gcc.target/i386/pr100549.c: New test.
---
 gcc/config/i386/i386.c   |   5 +-
 gcc/testsuite/gcc.target/i386/pr100549.c | 108 +++
 2 files changed, 111 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100549.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 780da108a7c..245044e0186 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -17981,8 +17981,9 @@ ix86_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 	tree cmp_type = truth_type_for (type);
 	gimple_seq stmts = NULL;
 	tree cmp = gimple_build (&stmts, tcode, cmp_type, arg0, arg1);
-	gsi_insert_before (gsi, stmts, GSI_SAME_STMT);
-	gimple *g = gimple_build_assign (gimple_call_lhs (stmt),
+	if (stmts)
+	  gsi_insert_before (gsi, stmts, GSI_SAME_STMT);
+	gimple* g = gimple_build_assign (gimple_call_lhs (stmt),
 	 VEC_COND_EXPR, cmp,
 	 minus_one_vec, zero_vec);
 	gimple_set_location (g, loc);
diff --git a/gcc/testsuite/gcc.target/i386/pr100549.c b/gcc/testsuite/gcc.target/i386/pr100549.c
new file mode 100644
index 000..83bba3cfd0d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100549.c
@@ -0,0 +1,108 @@
+/* PR target/100549  */
+/* { dg-do compile } */
+/* { dg-options "-O -mavx2" } */
+
+typedef char v16qi __attribute__ ((vector_size (16)));
+typedef char v32qi __attribute__ ((vector_size (32)));
+typedef short v8hi __attribute__ ((vector_size (16)));
+typedef short v16hi __attribute__ ((vector_size (32)));
+typedef int v4si __attribute__ ((vector_size (16)));
+typedef int v8si __attribute__ ((vector_size (32)));
+typedef long long v2di __attribute__ ((vector_size (16)));
+typedef long long v4di __attribute__ ((vector_size (32)));
+
+v16qi
+f1 (v16qi a)
+{
+  return __builtin_ia32_pcmpeqb128 (a, a);
+}
+
+v8hi
+f2 (v8hi a)
+{
+  return __builtin_ia32_pcmpeqw128 (a, a);
+}
+
+v4si
+f3 (v4si a)
+{
+  return __builtin_ia32_pcmpeqd128 (a, a);
+}
+
+v2di
+f4 (v2di a)
+{
+  return __builtin_ia32_pcmpeqq (a, a);
+}
+
+v16qi
+f5 (v16qi a)
+{
+  return __builtin_ia32_pcmpgtb128 (a, a);
+}
+
+v8hi
+f6 (v8hi a)
+{
+  return __builtin_ia32_pcmpgtw128 (a, a);
+}
+
+v4si
+f7 (v4si a)
+{
+  return __builtin_ia32_pcmpgtd128 (a, a);
+}
+
+v2di
+f8 (v2di a)
+{
+  return __builtin_ia32_pcmpgtq (a, a);
+}
+
+v32qi
+f9 (v32qi a)
+{
+  return __builtin_ia32_pcmpeqb256 (a, a);
+}
+
+v16hi
+f10 (v16hi a)
+{
+  return __builtin_ia32_pcmpeqw256 (a, a);
+}
+
+v8si
+f11 (v8si a)
+{
+  return __builtin_ia32_pcmpeqd256 (a, a);
+}
+
+v4di
+f12 (v4di a)
+{
+  return __builtin_ia32_pcmpeqq256 (a, a);
+}
+
+v32qi
+f13 (v32qi a)
+{
+  return __builtin_ia32_pcmpgtb256 (a, a);
+}
+
+v16hi
+f14 (v16hi a)
+{
+  return __builtin_ia32_pcmpgtw256 (a, a);
+}
+
+v8si
+f15 (v8si a)
+{
+  return __builtin_ia32_pcmpgtd256 (a, a);
+}
+
+v4di
+f16 (v4di a)
+{
+  return __builtin_ia32_pcmpgtq256 (a, a);
+}
-- 
2.18.1

[PATCH] i386: Fix up V2SFmode vcond* with -mxop [PR100581]

2021-05-13 Thread Uros Bizjak via Gcc-patches

ix86_expand_sse_movcc has special TARGET_XOP handling and the recent
addition of support of v*cond* patterns for V2SFmode results in
ICEs because the expected pattern doesn't exist.  We can handle it
using 128-bit vpcmov (if we ignore the upper 64 bits like we ignore in
other TARGET_MMX_WITH_SSE support).
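
As a rough illustration of why ignoring the upper 64 bits is fine, here
is a toy model in plain C (not GCC code).  It assumes vpcmov is a purely
bitwise select (take bits of the first source where the selector bit is
1, else bits of the second); struct xmm and model_pcmov are hypothetical
names for this sketch.  Because every result bit depends only on the
same bit position of the inputs, whatever sits in the upper half cannot
affect the low 64 bits that hold the V2SF value.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit XMM register modelled as two 64-bit halves.  A V2SF value
   occupies only the low half; the high half is "don't care".  */
struct xmm
{
  uint64_t lo;
  uint64_t hi;
};

/* Bitwise conditional move: for each bit, take A where SEL is 1 and
   B where SEL is 0.  */
static struct xmm
model_pcmov (struct xmm a, struct xmm b, struct xmm sel)
{
  struct xmm r;
  r.lo = (a.lo & sel.lo) | (b.lo & ~sel.lo);
  r.hi = (a.hi & sel.hi) | (b.hi & ~sel.hi);
  return r;
}
```

Calling model_pcmov twice with the same low halves but different high
halves gives the same r.lo.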

2021-05-13  Uroš Bizjak  

gcc/
PR target/100581
* config/i386/i386-expand.c (ix86_expand_sse_movcc): Force mode
sizes < 16 to a register when constructing vpcmov pattern.
* config/i386/mmx.md (*xop_pcmov_): Use MMXMODE124 mode.

gcc/testsuite/

PR target/100581
* g++.target/i386/pr100581.C: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index dd230081b16..92488b869ec 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3661,7 +3661,8 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
 {
   op_true = force_reg (mode, op_true);
 
-  if (!nonimmediate_operand (op_false, mode))
+  if (GET_MODE_SIZE (mode) < 16
+ || !nonimmediate_operand (op_false, mode))
op_false = force_reg (mode, op_false);
 
   emit_insn (gen_rtx_SET (dest, gen_rtx_IF_THEN_ELSE (mode, cmp,
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index d433c524652..7fc2e5d781c 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1816,11 +1816,11 @@ (define_insn "mmx_pblendvb"
 
 ;; XOP parallel XMM conditional moves
 (define_insn "*xop_pcmov_"
-  [(set (match_operand:MMXMODEI 0 "register_operand" "=x")
-(if_then_else:MMXMODEI
-  (match_operand:MMXMODEI 3 "register_operand" "x")
-  (match_operand:MMXMODEI 1 "register_operand" "x")
-  (match_operand:MMXMODEI 2 "register_operand" "x")))]
+  [(set (match_operand:MMXMODE124 0 "register_operand" "=x")
+(if_then_else:MMXMODE124
+  (match_operand:MMXMODE124 3 "register_operand" "x")
+  (match_operand:MMXMODE124 1 "register_operand" "x")
+  (match_operand:MMXMODE124 2 "register_operand" "x")))]
   "TARGET_XOP && TARGET_MMX_WITH_SSE"
   "vpcmov\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "sse4arg")])
diff --git a/gcc/testsuite/g++.target/i386/pr100581.C b/gcc/testsuite/g++.target/i386/pr100581.C
new file mode 100644
index 000..37cc9f11f18
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr100581.C
@@ -0,0 +1,9 @@
+/* PR target/100581 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mxop" } */
+
+typedef float __attribute__((__vector_size__(8))) v64f32;
+
+v64f32 af, bf, ff_a, ff_b;
+
+v64f32 f() { return ff_a > ff_b ? af : bf; }

[PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]

2021-05-13 Thread Hongtao Liu via Gcc-patches

Hi:
  When __builtin_ia32_vzeroupper is called explicitly, the corresponding
vzeroupper pattern does not carry any CLOBBERs or SETs before LRA,
which leads to incorrect optimization in pass_reload.
To solve this problem, this patch introduces a pre_reload splitter
which adds CLOBBERs to vzeroupper's pattern; this fixes the problem
in the PR.

At the same time, in order to optimize the low 128 bits in
post_reload CSE, this patch also transforms those CLOBBERs to SETs in
pass_vzeroupper.

This works fine except for TARGET_64BIT_MS_ABI, under which xmm6-xmm15
are callee-saved: even if there are no other uses of xmm6-xmm15 in the
function, vzeroupper's pattern makes pro_epilog save and restore those
registers, which is obviously redundant.  To eliminate this redundancy,
a post_reload splitter is introduced which drops those SETs until an
epilogue_completed splitter adds them back.  This looks to be safe
since there's no CSE between post_reload split2 and epilogue_completed
split3 (is that right?).  Also, frame info needs to be updated in
pro_epilog so that it saves and restores xmm6-xmm15 only if they are
used by something other than the explicit vzeroupper pattern.

  Bootstrapped and regtested on X86_64-linux-gnu{-m32,}
  Ok for trunk?

gcc/ChangeLog:

PR target/82735
* config/i386/i386-expand.c (ix86_expand_builtin): Count
number of __builtin_ia32_vzeroupper.
* config/i386/i386-features.c (ix86_add_reg_usage_to_vzerouppers):
Transform CLOBBERs to SETs for explicit vzeroupper pattern so
that CSE can optimize lower 128 bits.
* config/i386/i386.c (ix86_handle_explicit_vzeroupper_in_pro_epilog):
New.
(ix86_save_reg): If there's no use of xmm6~xmm15 other than
explicit vzeroupper under TARGET_64BIT_MS_ABI, no need to save
REGNO.
(ix86_finalize_stack_frame_flags): Recompute frame layout if
there's explicit vzeroupper under TARGET_64BIT_MS_ABI.
* config/i386/i386.h (struct machine_function): Change type of
has_explicit_vzeroupper from BOOL_BITFIELD to unsigned int.
* config/i386/sse.md (*avx_vzeroupper_2): New post-reload
splitter which will drop all SETs for explicit vzeroupper
patterns.
(*avx_vzeroupper_1): Generate SET reg to reg instead of
CLOBBER, and add pre-reload splitter after it.

gcc/testsuite/ChangeLog:

PR target/82735
* gcc.target/i386/pr82735-1.c: New test.
* gcc.target/i386/pr82735-2.c: New test.
* gcc.target/i386/pr82735-3.c: New test.
* gcc.target/i386/pr82735-4.c: New test.
* gcc.target/i386/pr82735-5.c: New test.


-- 
BR,
Hongtao
From d53b0c6934ea499c9f87df963661b627e7e977bf Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Wed, 12 May 2021 14:20:54 +0800
Subject: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper
 will kill sse registers.

When __builtin_ia32_vzeroupper is called explicitly, the corresponding
vzeroupper pattern does not carry any CLOBBERS or SETs before LRA,
which leads to incorrect optimization in pass_reload.
In order to solve this problem, this patch introduces a pre_reload
splitter which adds CLOBBERS to vzeroupper's pattern, it can solve the
problem in pr.

At the same time, in order to optimize the low 128 bits in
post_reload CSE, this patch also transforms those CLOBBERS to SETs in
pass_vzeroupper.

It works fine except for TARGET_64BIT_MS_ABI, under which xmm6-xmm15
are callee-saved, so even if there're no other uses of xmm6-xmm15 in the
function, because of vzeroupper's pattern, pro_epilog will save and
restore those registers, which is obviously redundant. In order to
eliminate this redundancy, a post_reload splitter is introduced, which
drops those SETs, until epilogue_completed splitter adds those SETs
back, it looks to be safe since there's no CSE between post_reload
split2 and epilogue_completed split3??? Also frame info needs to be
updated in pro_epilog, which saves and restores xmm6-xmm15 only if
there's usage other than explicit vzeroupper pattern.

gcc/ChangeLog:

	PR target/82735
	* config/i386/i386-expand.c (ix86_expand_builtin): Count
	number of __builtin_ia32_vzeroupper.
	* config/i386/i386-features.c (ix86_add_reg_usage_to_vzerouppers):
	Transform CLOBBERs to SETs for explicit vzeroupper pattern so
	that CSE can optimize lower 128 bits.
	* config/i386/i386.c (ix86_handle_explicit_vzeroupper_in_pro_epilog):
	New.
	(ix86_save_reg): If there's no use of xmm6~xmm15 other than
	explicit vzeroupper under TARGET_64BIT_MS_ABI, no need to save
	REGNO.
	(ix86_finalize_stack_frame_flags): Recompute frame layout if
	there's explicit vzeroupper under TARGET_64BIT_MS_ABI.
	* config/i386/i386.h (struct machine_function): Change type of
	has_explicit_vzeroupper from BOOL_BITFIELD to unsigned int.
	* config/i386/sse.md (*avx_vzeroupper_2): New post-reload
	splitter which will drop all SETs for explicit vzeroupper
	patterns.
	(*avx_vzeroupper_1): Generate SET reg to

Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]

2021-05-13 Thread Uros Bizjak via Gcc-patches

On Thu, May 13, 2021 at 11:18 AM Hongtao Liu  wrote:
>
> Hi:
>   When __builtin_ia32_vzeroupper is called explicitly, the corresponding
> vzeroupper pattern does not carry any CLOBBERS or SETs before LRA,
> which leads to incorrect optimization in pass_reload.
> In order to solve this problem, this patch introduces a pre_reload
> splitter which adds CLOBBERS to vzeroupper's pattern, it can solve the
> problem in pr.
>
> At the same time, in order to optimize the low 128 bits in
> post_reload CSE, this patch also transforms those CLOBBERS to SETs in
> pass_vzeroupper.
>
> It works fine except for TARGET_64BIT_MS_ABI, under which xmm6-xmm15
> are callee-saved, so even if there're no other uses of xmm6-xmm15 in the
> function, because of vzeroupper's pattern, pro_epilog will save and
> restore those registers, which is obviously redundant. In order to
> eliminate this redundancy, a post_reload splitter is introduced, which
> drops those SETs, until epilogue_completed splitter adds those SETs
> back, it looks to be safe since there's no CSE between post_reload
> split2 and epilogue_completed split3??? Also frame info needs to be
> updated in pro_epilog, which saves and restores xmm6-xmm15 only if
> there's usage other than explicit vzeroupper pattern.
>
>   Bootstrapped and regtested on X86_64-linux-gnu{-m32,}
>   Ok for trunk?

Some time ago, support for the CLOBBER_HIGH RTX was added (and later
removed for some reason).  Perhaps we could resurrect the patch for the
purpose of ferrying 128-bit modes via the vzeroupper RTX?

+(define_split
+  [(match_parallel 0 "vzeroupper_pattern"
+ [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)])]
+  "TARGET_AVX && ix86_pre_reload_split ()"
+  [(match_dup 0)]
+{
+  /* When vzeroupper is explicitly used, for LRA purpose, make it clear
+ the instruction kills sse registers.  */
+  gcc_assert (cfun->machine->has_explicit_vzeroupper);
+  unsigned int nregs = TARGET_64BIT ? 16 : 8;
+  rtvec vec = rtvec_alloc (nregs + 1);
+  RTVEC_ELT (vec, 0) = gen_rtx_UNSPEC_VOLATILE (VOIDmode,
+gen_rtvec (1, const1_rtx),
+UNSPECV_VZEROUPPER);
+  for (unsigned int i = 0; i < nregs; ++i)
+{
+  unsigned int regno = GET_SSE_REGNO (i);
+  rtx reg = gen_rtx_REG (V2DImode, regno);
+  RTVEC_ELT (vec, i + 1) = gen_rtx_CLOBBER (VOIDmode, reg);
+}
+  operands[0] = gen_rtx_PARALLEL (VOIDmode, vec);
+})

Wouldn't this also kill the lower 128-bit values, which are not touched
by vzeroupper?  A CLOBBER_HIGH would be more appropriate here.
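
To make the concern concrete, here is a toy model in plain C (not GCC
code; all names are hypothetical).  It separates what the instruction
does to a register from what each RTL description lets the compiler
assume afterwards: a plain CLOBBER of the V2DImode register discards
knowledge of the low 128 bits, while a CLOBBER_HIGH-style description
would keep it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One YMM register: lane[0..1] are the low 128 bits (the XMM part),
   lane[2..3] are the upper 128 bits.  low_known records whether the
   compiler may still rely on the low half (illustrative flag only).  */
struct ymm_model
{
  uint64_t lane[4];
  bool low_known;
};

/* What the instruction does: zero the upper half, keep the low half.  */
static void
hw_vzeroupper (struct ymm_model *r)
{
  r->lane[2] = 0;
  r->lane[3] = 0;
}

/* What (clobber (reg:V2DI xmmN)) says: the low 128 bits are unknown.  */
static void
describe_as_clobber (struct ymm_model *r)
{
  r->low_known = false;
}

/* What a CLOBBER_HIGH-style description would say: only the part above
   the low 128 bits changes, so knowledge of the low half survives.  */
static void
describe_as_clobber_high (struct ymm_model *r)
{
  (void) r;
}
```

In this model the hardware effect never touches lane[0] and lane[1];
only the plain CLOBBER description makes the compiler forget them.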

Uros.


> gcc/ChangeLog:
>
> PR target/82735
> * config/i386/i386-expand.c (ix86_expand_builtin): Count
> number of __builtin_ia32_vzeroupper.
> * config/i386/i386-features.c (ix86_add_reg_usage_to_vzerouppers):
> Transform CLOBBERs to SETs for explicit vzeroupper pattern so
> that CSE can optimize lower 128 bits.
> * config/i386/i386.c (ix86_handle_explicit_vzeroupper_in_pro_epilog):
> New.
> (ix86_save_reg): If there's no use of xmm6~xmm15 other than
> explicit vzeroupper under TARGET_64BIT_MS_ABI, no need to save
> REGNO.
> (ix86_finalize_stack_frame_flags): Recompute frame layout if
> there's explicit vzeroupper under TARGET_64BIT_MS_ABI.
> * config/i386/i386.h (struct machine_function): Change type of
> has_explicit_vzeroupper from BOOL_BITFIELD to unsigned int.
> * config/i386/sse.md (*avx_vzeroupper_2): New post-reload
> splitter which will drop all SETs for explicit vzeroupper
> patterns.
> (*avx_vzeroupper_1): Generate SET reg to reg instead of
> CLOBBER, and add pre-reload splitter after it.
>
> gcc/testsuite/ChangeLog:
>
> PR target/82735
> * gcc.target/i386/pr82735-1.c: New test.
> * gcc.target/i386/pr82735-2.c: New test.
> * gcc.target/i386/pr82735-3.c: New test.
> * gcc.target/i386/pr82735-4.c: New test.
> * gcc.target/i386/pr82735-5.c: New test.
>
>
> --
> BR,
> Hongtao

Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]

2021-05-13 Thread Uros Bizjak via Gcc-patches

On Thu, May 13, 2021 at 11:40 AM Uros Bizjak  wrote:
>
> On Thu, May 13, 2021 at 11:18 AM Hongtao Liu  wrote:
> >
> > Hi:
> >   When __builtin_ia32_vzeroupper is called explicitly, the corresponding
> > vzeroupper pattern does not carry any CLOBBERS or SETs before LRA,
> > which leads to incorrect optimization in pass_reload.
> > In order to solve this problem, this patch introduces a pre_reload
> > splitter which adds CLOBBERS to vzeroupper's pattern, it can solve the
> > problem in pr.
> >
> > At the same time, in order to optimize the low 128 bits in
> > post_reload CSE, this patch also transforms those CLOBBERS to SETs in
> > pass_vzeroupper.
> >
> > It works fine except for TARGET_64BIT_MS_ABI, under which xmm6-xmm15
> > are callee-saved, so even if there're no other uses of xmm6-xmm15 in the
> > function, because of vzeroupper's pattern, pro_epilog will save and
> > restore those registers, which is obviously redundant. In order to
> > eliminate this redundancy, a post_reload splitter is introduced, which
> > drops those SETs, until epilogue_completed splitter adds those SETs
> > back, it looks to be safe since there's no CSE between post_reload
> > split2 and epilogue_completed split3??? Also frame info needs to be
> > updated in pro_epilog, which saves and restores xmm6-xmm15 only if
> > there's usage other than explicit vzeroupper pattern.
> >
> >   Bootstrapped and regtested on X86_64-linux-gnu{-m32,}
> >   Ok for trunk?
>
> Some time ago a support for CLOBBER_HIGH RTX was added (and later
> removed for some reason). Perhaps we could resurrect the patch for the
> purpose of ferrying 128bit modes via vzeroupper RTX?

https://gcc.gnu.org/legacy-ml/gcc-patches/2017-11/msg01325.html

Uros.

>
> +(define_split
> +  [(match_parallel 0 "vzeroupper_pattern"
> + [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)])]
> +  "TARGET_AVX && ix86_pre_reload_split ()"
> +  [(match_dup 0)]
> +{
> +  /* When vzeroupper is explicitly used, for LRA purpose, make it clear
> + the instruction kills sse registers.  */
> +  gcc_assert (cfun->machine->has_explicit_vzeroupper);
> +  unsigned int nregs = TARGET_64BIT ? 16 : 8;
> +  rtvec vec = rtvec_alloc (nregs + 1);
> +  RTVEC_ELT (vec, 0) = gen_rtx_UNSPEC_VOLATILE (VOIDmode,
> +gen_rtvec (1, const1_rtx),
> +UNSPECV_VZEROUPPER);
> +  for (unsigned int i = 0; i < nregs; ++i)
> +{
> +  unsigned int regno = GET_SSE_REGNO (i);
> +  rtx reg = gen_rtx_REG (V2DImode, regno);
> +  RTVEC_ELT (vec, i + 1) = gen_rtx_CLOBBER (VOIDmode, reg);
> +}
> +  operands[0] = gen_rtx_PARALLEL (VOIDmode, vec);
> +})
>
> Wouldn't this also kill lower 128bit values that are not touched by
> vzeroupper? A CLOBBER_HIGH would be more appropriate here.
>
> Uros.
>
>
> > gcc/ChangeLog:
> >
> > PR target/82735
> > * config/i386/i386-expand.c (ix86_expand_builtin): Count
> > number of __builtin_ia32_vzeroupper.
> > * config/i386/i386-features.c (ix86_add_reg_usage_to_vzerouppers):
> > Transform CLOBBERs to SETs for explicit vzeroupper pattern so
> > that CSE can optimize lower 128 bits.
> > * config/i386/i386.c 
> > (ix86_handle_explicit_vzeroupper_in_pro_epilog):
> > New.
> > (ix86_save_reg): If there's no use of xmm6~xmm15 other than
> > explicit vzeroupper under TARGET_64BIT_MS_ABI, no need to save
> > REGNO.
> > (ix86_finalize_stack_frame_flags): Recompute frame layout if
> > there's explicit vzeroupper under TARGET_64BIT_MS_ABI.
> > * config/i386/i386.h (struct machine_function): Change type of
> > has_explicit_vzeroupper from BOOL_BITFIELD to unsigned int.
> > * config/i386/sse.md (*avx_vzeroupper_2): New post-reload
> > splitter which will drop all SETs for explicit vzeroupper
> > patterns.
> > (*avx_vzeroupper_1): Generate SET reg to reg instead of
> > CLOBBER, and add pre-reload splitter after it.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/82735
> > * gcc.target/i386/pr82735-1.c: New test.
> > * gcc.target/i386/pr82735-2.c: New test.
> > * gcc.target/i386/pr82735-3.c: New test.
> > * gcc.target/i386/pr82735-4.c: New test.
> > * gcc.target/i386/pr82735-5.c: New test.
> >
> >
> > --
> > BR,
> > Hongtao

Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]

2021-05-13 Thread Jakub Jelinek via Gcc-patches

On Thu, May 13, 2021 at 11:43:19AM +0200, Uros Bizjak wrote:
> > >   Bootstrapped and regtested on X86_64-linux-gnu{-m32,}
> > >   Ok for trunk?
> >
> > Some time ago a support for CLOBBER_HIGH RTX was added (and later
> > removed for some reason). Perhaps we could resurrect the patch for the
> > purpose of ferrying 128bit modes via vzeroupper RTX?
> 
> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-11/msg01325.html

https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01468.html
is where it got removed, CCing Richard.

> > +(define_split
> > +  [(match_parallel 0 "vzeroupper_pattern"
> > + [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)])]
> > +  "TARGET_AVX && ix86_pre_reload_split ()"
> > +  [(match_dup 0)]
> > +{
> > +  /* When vzeroupper is explicitly used, for LRA purpose, make it clear
> > + the instruction kills sse registers.  */
> > +  gcc_assert (cfun->machine->has_explicit_vzeroupper);
> > +  unsigned int nregs = TARGET_64BIT ? 16 : 8;
> > +  rtvec vec = rtvec_alloc (nregs + 1);
> > +  RTVEC_ELT (vec, 0) = gen_rtx_UNSPEC_VOLATILE (VOIDmode,
> > +gen_rtvec (1, const1_rtx),
> > +UNSPECV_VZEROUPPER);
> > +  for (unsigned int i = 0; i < nregs; ++i)
> > +{
> > +  unsigned int regno = GET_SSE_REGNO (i);
> > +  rtx reg = gen_rtx_REG (V2DImode, regno);
> > +  RTVEC_ELT (vec, i + 1) = gen_rtx_CLOBBER (VOIDmode, reg);
> > +}
> > +  operands[0] = gen_rtx_PARALLEL (VOIDmode, vec);
> > +})
> >
> > Wouldn't this also kill lower 128bit values that are not touched by
> > vzeroupper? A CLOBBER_HIGH would be more appropriate here.

Yes, it would.  But normally the only xmm* hard regs live across the
explicit user vzeroupper would be local and global register variables,
I think the 1st scheduler etc. shouldn't extend lifetime of the
xmm hard regs across UNSPEC_VOLATILE.

Jakub

Re: [PATCH] testsuite/arm: Improve unsigned-float.c

2021-05-13 Thread Richard Earnshaw via Gcc-patches





On 22/04/2021 14:32, Christophe Lyon via Gcc-patches wrote:

The test requires an FPU, so use -march=armv7-a+fp -mfpu=auto instead
of -march=armv7-a.

We also remove dg-require-effective-target arm_fp_ok, but keep
dg-add-options arm_fp: this enables the test to pass on arm-eabi
configured with default cpu/fpu/mode.

dg-require-effective-target arm_fp_ok fails on such a configuration
for lack of FPU, since dg-options are not taken into account by
dg-require-effective-target.

Adding -march=armv7-a+fp -mfpu=auto is sufficient for arm_fp options to
be acceptable.

This enables the test to pass on all the arm-eabi configurations I'm
testing, as well as arm-linux-gnueabi when forcing -march=armv5t.

2021-04-22  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/unsigned-float.c: Remove arm_fp_ok, adjust
dg-options.
---
  gcc/testsuite/gcc.target/arm/unsigned-float.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/unsigned-float.c b/gcc/testsuite/gcc.target/arm/unsigned-float.c
index ad589d9..ea3abc7 100644
--- a/gcc/testsuite/gcc.target/arm/unsigned-float.c
+++ b/gcc/testsuite/gcc.target/arm/unsigned-float.c
@@ -1,8 +1,8 @@
  /* { dg-do compile } */
-/* { dg-require-effective-target arm_fp_ok } */
-/* { dg-skip-if "need fp instructions" { *-*-* } { "-mfloat-abi=soft" } { "" } } */
  /* { dg-skip-if "-mpure-code supports M-profile only" { *-*-* } { "-mpure-code" } } */
-/* { dg-options "-march=armv7-a -O1" } */
+/* { dg-options "-march=armv7-a+fp -mfpu=auto -O1" } */
+/* Do not require arm_ok effective target to avoid skipping on arm-eabi with
+   default configure options.  */
  /* { dg-add-options arm_fp } */
  
  


OK.
R.

Re: RFC: Sphinx for GCC documentation

2021-05-13 Thread Martin Liška


On 4/7/21 7:40 PM, Joseph Myers wrote:

On Wed, 7 Apr 2021, Michael Matz wrote:


Random snippet for what I mean: the .texi contains:

"The @code{access} attribute specifies that a function to whose
by-reference arguments the attribute applies accesses the referenced
object according to @var{access-mode}.  The @var{access-mode} argument is
required and must be"

the .rst has:

"The ``access`` attribute specifies that a function to whose by-reference
arguments the attribute applies accesses the referenced object according
to :samp:`{access-mode}`.  The :samp:`{access-mode}` argument is required
and must be"

So, @code{}/@var{} vs. `` `` / :samp:``.  Keeping in mind that


@var in Texinfo is orthogonal to whether something is literal code.  It
looks like Sphinx's equivalent is {} inside :samp:`` (so not supporting
the use case of @var outside literal code)?


Hello.

Yes, as Joseph says, it's equivalent to {var} within a :samp: role, as
documented here:
https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#role-samp

To be honest, if we really want, we can easily come up with even more roles.
But I don't think we would benefit from it.

...

One other practical concern: it seems there's a one-to-one correspondence
of .rst files and (web)page.  Do we really want to maintain hundreds (or
how many?) .rst files, instead of 60 .texi files?


Well, based on what I know about RST and Sphinx, it's pretty natural that
one HTML page corresponds to a single RST file.

Looking at well-known users of Sphinx, I can see the following stats:

linux/Documentation> find . -name '*.rst' | wc -l
2807

godot-docs> find . -name '*.rst' | wc -l
1030


Martin

[committed] arm: correctly handle inequality comparisons against max constants [PR100563]

2021-05-13 Thread Richard Earnshaw via Gcc-patches


Normally we expect the gimple optimizers to fold away comparisons that
are always true, but at some lower optimization levels this is not
always the case, so the back-end has to be able to generate correct
code in these cases.

In this example, we have a comparison of the form

  (unsigned long long) op <= ~0ULL

which, of course, is always true.

Normally, in the arm back-end we handle these expansions where the
immediate cannot be handled directly by adding 1 to the constant and
then adjusting the comparison operator:

  (unsigned long long) op < CONST + 1

but we cannot do that when the constant is already the largest value.

Fortunately, we observe that the comparisons we need to handle this
way are either always true or always false, so instead of forming a
comparison against the maximum value, we can replace it with a
comparison against the minimum value (which just happens to also be a
constant we can handle).  So

  op1 <= ~0ULL -> op1 >= 0U
  op1 > ~0ULL -> op1 < 0U

  op1 <= LONG_LONG_INT_MAX -> op1 >= (-LONG_LONG_INT_MAX - 1)
  op1 > LONG_LONG_INT_MAX -> op1 < (-LONG_LONG_INT_MAX - 1)

gcc:
PR target/100563
* config/arm/arm.c (arm_canonicalize_comparison): Correctly
canonicalize DImode inequality comparisons against the
maximum integral value.

gcc/testsuite:
* gcc.dg/pr100563.c: New test.
---
 gcc/config/arm/arm.c| 29 +
 gcc/testsuite/gcc.dg/pr100563.c |  9 +
 2 files changed, 34 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr100563.c

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 2962071adfd..d0c0c50be97 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -5563,9 +5563,20 @@ arm_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
 			return;
 		  *op1 = GEN_INT (i + 1);
 		  *code = *code == GT ? GE : LT;
-		  return;
 		}
-		  break;
+		  else
+		{
+		  /* GT maxval is always false, LE maxval is always true.
+			 We can't fold that away here as we must make a
+			 comparison, but we can fold them to comparisons
+			 with the same result that can be handled:
+			   op0 GT maxval -> op0 LT minval
+			   op0 LE maxval -> op0 GE minval
+			 where minval = (-maxval - 1).  */
+		  *op1 = GEN_INT (-maxval - 1);
+		  *code = *code == GT ? LT : GE;
+		}
+		  return;
 
 		case GTU:
 		case LEU:
@@ -5578,9 +5589,19 @@ arm_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
 			return;
 		  *op1 = GEN_INT (i + 1);
 		  *code = *code == GTU ? GEU : LTU;
-		  return;
 		}
-		  break;
+		  else
+		{
+		  /* GTU ~0 is always false, LEU ~0 is always true.
+			 We can't fold that away here as we must make a
+			 comparison, but we can fold them to comparisons
+			 with the same result that can be handled:
+			   op0 GTU ~0 -> op0 LTU 0
+			   op0 LEU ~0 -> op0 GEU 0.  */
+		  *op1 = const0_rtx;
+		  *code = *code == GTU ? LTU : GEU;
+		}
+		  return;
 
 		default:
 		  gcc_unreachable ();
diff --git a/gcc/testsuite/gcc.dg/pr100563.c b/gcc/testsuite/gcc.dg/pr100563.c
new file mode 100644
index 000..812eb9e6ae2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr100563.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-Og" } */
+unsigned long long e(void);
+void f(int);
+void a() {
+  short b = -1, c = (int)&b;
+  unsigned long long d = e();
+  f(b >= d);
+}

Re: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]

2021-05-13 Thread Segher Boessenkool

Hi!

On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote:
> The vsel instruction is a bit-wise select instruction.  Using an
> IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
> being generated in the combine pass.  Per element selection is a
> subset of per bit-wise selection, with the patch the pattern is
> written using bit operations.  But there are 8 different patterns
> to define "op0 := (op1 & ~op3) | (op2 & op3)":
> 
> (~op3&op1) | (op3&op2),
> (~op3&op1) | (op2&op3),
> (op3&op2) | (~op3&op1),
> (op2&op3) | (~op3&op1),
> (op1&~op3) | (op3&op2),
> (op1&~op3) | (op2&op3),
> (op3&op2) | (op1&~op3),
> (op2&op3) | (op1&~op3),
> 
> Combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
> canonical, which could reduce it to the FIRST 4 patterns, but it won't
> swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this patch
> handles it with two patterns with different NOT op3 position and check
> equality inside it.

Yup, that latter case does not have canonicalisation rules.  Btw, not
only combine does this canonicalisation: everything should,
non-canonical RTL is invalid RTL (in the instruction stream, you can do
everything in temporary code of course, as long as the RTL isn't
malformed).
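
For reference, the difference between a bit-wise select and a
per-element IF_THEN_ELSE can be shown with a small standalone C sketch
on one 32-bit element.  The function names are made up for this
illustration, and element_select is only a simplified stand-in for the
old if_then_else pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-wise select: op0 := (op1 & ~op3) | (op2 & op3).  */
static uint32_t
bitwise_select (uint32_t op1, uint32_t op2, uint32_t op3)
{
  return (op1 & ~op3) | (op2 & op3);
}

/* One of the other seven spellings of the same operation.  */
static uint32_t
bitwise_select_swapped (uint32_t op1, uint32_t op2, uint32_t op3)
{
  return (op3 & op2) | (~op3 & op1);
}

/* Whole-element select: pick op2 if the mask element is nonzero,
   else op1.  */
static uint32_t
element_select (uint32_t op1, uint32_t op2, uint32_t op3)
{
  return op3 != 0 ? op2 : op1;
}
```

The two selects agree when the mask element is all ones or all zeros,
but differ for a partial mask such as 0x0000FFFF, which is the case
where the if_then_else form misdescribes the instruction.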

> -(define_insn "*altivec_vsel"
> +(define_insn "altivec_vsel"
>[(set (match_operand:VM 0 "altivec_register_operand" "=v")
> - (if_then_else:VM
> -  (ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
> - (match_operand:VM 4 "zero_constant" ""))
> -  (match_operand:VM 2 "altivec_register_operand" "v")
> -  (match_operand:VM 3 "altivec_register_operand" "v")))]
> -  "VECTOR_MEM_ALTIVEC_P (mode)"
> -  "vsel %0,%3,%2,%1"
> + (ior:VM
> +  (and:VM
> +   (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
> +   (match_operand:VM 1 "altivec_register_operand" "v"))
> +  (and:VM
> +   (match_operand:VM 2 "altivec_register_operand" "v")
> +   (match_operand:VM 4 "altivec_register_operand" "v"))))]
> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
> +  && (rtx_equal_p (operands[2], operands[3])
> +  || rtx_equal_p (operands[4], operands[3]))"
> +  {
> +if (rtx_equal_p (operands[2], operands[3]))
> +  return "vsel %0,%1,%4,%3";
> +else
> +  return "vsel %0,%1,%2,%3";
> +  }
>[(set_attr "type" "vecmove")])

That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I
think.  So please write this as two patterns (and keep the expand if
that helps).

> +(define_insn "altivec_vsel2"

(same here of course).

>  ;; Fused multiply add.
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index f5676255387..d65bdc01055 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -3362,11 +3362,11 @@ const struct altivec_builtin_types 
> altivec_overloaded_builtins[] = {
>  RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 
> RS6000_BTI_unsigned_V2DI },
>{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
>  RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI },
> -  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
> +  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,

Are the _uns things still used for anything?  But, let's not change
this until Bill's stuff is in :-)

Why do you want to change this here, btw?  I don't understand.

> +  if (target == 0
> +  || GET_MODE (target) != tmode
> +  || ! (*insn_data[icode].operand[0].predicate) (target, tmode))

No space after ! and other unary operators (except for casts and other
operators you write with alphanumerics, like "sizeof").  I know you
copied this code, but :-)

> @@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, 
> rtx op_false,
>  case GEU:
>  case LTU:
>  case LEU:
> -  /* Mark unsigned tests with CCUNSmode.  */
> -  cc_mode = CCUNSmode;
>  
>/* Invert condition to avoid compound test if necessary.  */
>if (rcode == GEU || rcode == LEU)

So this is related to the _uns thing.  Could you split off that change?
Probably as an earlier patch (but either works for me).

> @@ -15629,6 +15625,9 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, 
> rtx op_false,
>if (!mask)
>  return 0;
>  
> +  if (mask_mode != dest_mode)
> +  mask = simplify_gen_subreg (dest_mode, mask, mask_mode, 0);

Indent just two characters please: line continuations (usually) align,
but indents do not.


Can you fold vsel and xxsel together completely?  They have exactly the
same semantics!  This does not have to be in this patch of course.
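
For reference, the operation both instructions perform is a bitwise select. Below is a minimal scalar sketch, assuming 32-bit words in place of vector registers; the function names are made up for illustration and are not part of the rs6000 backend. It also shows why the posted pattern accepts the mask in either position of the second AND: the two operand orders compute the same value.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise select: take the bits of B where MASK is 1 and the bits of A
   where MASK is 0.  */
static uint32_t
bit_select (uint32_t a, uint32_t b, uint32_t mask)
{
  return (a & ~mask) | (b & mask);
}

/* The same operation with the operands of each AND swapped.  AND is
   commutative, so the result is identical.  */
static uint32_t
bit_select_swapped (uint32_t a, uint32_t b, uint32_t mask)
{
  return (~mask & a) | (mask & b);
}

/* Check for one set of inputs that both operand orders agree.  */
static void
check_orders_agree (uint32_t a, uint32_t b, uint32_t mask)
{
  assert (bit_select (a, b, mask) == bit_select_swapped (a, b, mask));
}
```

Two separate patterns, one per operand order, would express the same equivalence without the rtx_equal_p test in the insn condition.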

Thanks,


Segher

Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]

2021-05-13 Thread Richard Sandiford via Gcc-patches

Jakub Jelinek  writes:
> On Thu, May 13, 2021 at 11:43:19AM +0200, Uros Bizjak wrote:
>> > >   Bootstrapped and regtested on X86_64-linux-gnu{-m32,}
>> > >   Ok for trunk?
>> >
>> > Some time ago a support for CLOBBER_HIGH RTX was added (and later
>> > removed for some reason). Perhaps we could resurrect the patch for the
>> > purpose of ferrying 128bit modes via vzeroupper RTX?
>> 
>> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-11/msg01325.html
>
> https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01468.html
> is where it got removed, CCing Richard.

Yeah.  Initially clobber_high seemed like the best approach for
handling the tlsdesc thing, but in practice it was too difficult
to shoe-horn the concept in after the fact, when so much rtl
infrastructure wasn't prepared to deal with it.  The old support
didn't handle all cases and passes correctly, and handled others
suboptimally.

I think it would be worth using the same approach as
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01466.html for
vzeroupper: represent the instructions as call_insns in which the
call has a special vzeroupper ABI.  I think that's likely to lead
to better code than clobber_high would (or at least, it did for tlsdesc).

Thanks,
Richard

Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]

2021-05-13 Thread Jakub Jelinek via Gcc-patches

On Thu, May 13, 2021 at 12:32:26PM +0100, Richard Sandiford wrote:
> Jakub Jelinek  writes:
> > On Thu, May 13, 2021 at 11:43:19AM +0200, Uros Bizjak wrote:
> >> > >   Bootstrapped and regtested on X86_64-linux-gnu{-m32,}
> >> > >   Ok for trunk?
> >> >
> >> > Some time ago a support for CLOBBER_HIGH RTX was added (and later
> >> > removed for some reason). Perhaps we could resurrect the patch for the
> >> > purpose of ferrying 128bit modes via vzeroupper RTX?
> >> 
> >> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-11/msg01325.html
> >
> > https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01468.html
> > is where it got removed, CCing Richard.
> 
> Yeah.  Initially clobber_high seemed like the best approach for
> handling the tlsdesc thing, but in practice it was too difficult
> to shoe-horn the concept in after the fact, when so much rtl
> infrastructure wasn't prepared to deal with it.  The old support
> didn't handle all cases and passes correctly, and handled others
> suboptimally.
> 
> I think it would be worth using the same approach as
> https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01466.html for
> vzeroupper: represent the instructions as call_insns in which the
> call has a special vzeroupper ABI.  I think that's likely to lead
> to better code than clobber_high would (or at least, it did for tlsdesc).

Perhaps a magic call_insn that is split post-reload into a normal insn
with the sets then?

Jakub

Re: RFC: Sphinx for GCC documentation

2021-05-13 Thread Martin Liška


On 4/1/21 3:30 PM, Martin Liška wrote:

That said, I'm asking the GCC community for a green light before I invest
more time in it.


Hello.

So far I've received only a little feedback about the transition, most of
it positive.

May I take that as a green light for the transition?

Thanks,
Martin

[1] https://splichal.eu/scripts/sphinx/

[PATCH] LTO: merge -flto=foo both from IL and linker cmdline

2021-05-13 Thread Martin Liška


Hello.

In g:3835aa0eb90292d652dd6b200f302f3cac7e643f, I changed the logic so that
the output -flto=foo argument is taken from the IL file command lines.
However, it should also be merged with the linker command line: one can
use -flto for compilation and -flto=16 for linking.
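
The precedence that the merge applies can be sketched as a standalone helper. This is only an illustration of the rules in the patch below, not code from lto-wrapper.c: pick_flto_arg is a hypothetical name, and the case where no -flto= option exists yet (the new one is simply appended) is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of lto-wrapper.c): return whichever of
   two -flto= arguments wins under the merge rules of the patch.  */
static const char *
pick_flto_arg (const char *existing, const char *incoming)
{
  assert (existing != NULL && incoming != NULL);

  /* Identical arguments, or an existing "auto": keep what we have.  */
  if (strcmp (existing, incoming) == 0 || strcmp (existing, "auto") == 0)
    return existing;

  /* An incoming "auto" or "jobserver" replaces the existing value.  */
  if (strcmp (incoming, "auto") == 0 || strcmp (incoming, "jobserver") == 0)
    return incoming;

  /* Two job counts: the larger one wins.  An existing "jobserver" is
     never replaced by a plain number.  */
  if (strcmp (existing, "jobserver") != 0
      && atoi (incoming) > atoi (existing))
    return incoming;

  return existing;
}
```

With these rules, merging "4" from the IL files with "16" from the linker command line yields "16", while an existing "auto" is never overridden.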

Ready to be installed once testing finishes?
Thanks,
Martin

gcc/ChangeLog:

* lto-wrapper.c (merge_flto_options): Factor out a new function.
(merge_and_complain): Use it.
(run_gcc): Merge also linker command line -flto=foo argument
with IL files.
---
 gcc/lto-wrapper.c | 118 +-
 1 file changed, 65 insertions(+), 53 deletions(-)

diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index a71d6147152..1c2643984f9 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -189,6 +189,37 @@ find_option (vec &options, 
cl_decoded_option *option)
   return find_option (options, option->opt_index);
 }
 
+/* Merge -flto FOPTION into vector of DECODED_OPTIONS.  */
+
+static void
+merge_flto_options (vec<cl_decoded_option> &decoded_options,
+                    cl_decoded_option *foption)
+{
+  int existing_opt = find_option (decoded_options, foption);
+  if (existing_opt == -1)
+    decoded_options.safe_push (*foption);
+  else
+    {
+      if (strcmp (foption->arg, decoded_options[existing_opt].arg) != 0)
+        {
+          /* -flto=auto is preferred.  */
+          if (strcmp (decoded_options[existing_opt].arg, "auto") == 0)
+            ;
+          else if (strcmp (foption->arg, "auto") == 0
+                   || strcmp (foption->arg, "jobserver") == 0)
+            decoded_options[existing_opt].arg = foption->arg;
+          else if (strcmp (decoded_options[existing_opt].arg,
+                           "jobserver") != 0)
+            {
+              int n = atoi (foption->arg);
+              int original_n = atoi (decoded_options[existing_opt].arg);
+              if (n > original_n)
+                decoded_options[existing_opt].arg = foption->arg;
+            }
+        }
+    }
+}
+
 /* Try to merge and complain about options FDECODED_OPTIONS when applied
ontop of DECODED_OPTIONS.  */
 
@@ -427,28 +458,7 @@ merge_and_complain (vec decoded_options,

  break;
 
 	case OPT_flto_:

- if (existing_opt == -1)
-   decoded_options.safe_push (*foption);
- else
-   {
- if (strcmp (foption->arg, decoded_options[existing_opt].arg) != 0)
-   {
- /* -flto=auto is preferred.  */
- if (strcmp (decoded_options[existing_opt].arg, "auto") == 0)
-   ;
- else if (strcmp (foption->arg, "auto") == 0
-  || strcmp (foption->arg, "jobserver") == 0)
-   decoded_options[existing_opt].arg = foption->arg;
- else if (strcmp (decoded_options[existing_opt].arg,
-  "jobserver") != 0)
-   {
- int n = atoi (foption->arg);
- int original_n = atoi (decoded_options[existing_opt].arg);
- if (n > original_n)
-   decoded_options[existing_opt].arg = foption->arg;
-   }
-   }
-   }
+ merge_flto_options (decoded_options, foption);
  break;
}
 }
@@ -1515,37 +1525,6 @@ run_gcc (unsigned argc, char *argv[])
   append_compiler_options (&argv_obstack, fdecoded_options);
   append_linker_options (&argv_obstack, decoded_options);
 
-  /* Process LTO-related options on merged options.  */

-  for (j = 1; j < fdecoded_options.length (); ++j)
-{
-  cl_decoded_option *option = &fdecoded_options[j];
-  switch (option->opt_index)
-   {
-   case OPT_flto_:
- if (strcmp (option->arg, "jobserver") == 0)
-   {
- parallel = 1;
- jobserver = 1;
-   }
- else if (strcmp (option->arg, "auto") == 0)
-   {
- parallel = 1;
- auto_parallel = 1;
-   }
- else
-   {
- parallel = atoi (option->arg);
- if (parallel <= 1)
-   parallel = 0;
-   }
- /* Fallthru.  */
-
-   case OPT_flto:
- lto_mode = LTO_MODE_WHOPR;
- break;
-   }
-}
-
   /* Scan linker driver arguments for things that are of relevance to us.  */
   for (j = 1; j < decoded_options.length (); ++j)
 {
@@ -1574,6 +1553,8 @@ run_gcc (unsigned argc, char *argv[])
  break;
 
 	case OPT_flto_:

+ /* Merge linker -flto= option with what we have in IL files.  */
+ merge_flto_options (fdecoded_options, option);
  if (strcmp (option->arg, "jobserver") == 0)
jobserver_requested = true;
  break;
@@ -1596,6 +1577,37 @@ run_gcc (unsigned argc, char *argv[])
}
 }
 
+  /* Process LTO-related options on merged options.  */

+  for (j = 1; j < fdecoded_options.length (); ++j)
+{
+  cl_de

Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]

2021-05-13 Thread Richard Sandiford via Gcc-patches

Jakub Jelinek  writes:
> On Thu, May 13, 2021 at 12:32:26PM +0100, Richard Sandiford wrote:
>> Jakub Jelinek  writes:
>> > On Thu, May 13, 2021 at 11:43:19AM +0200, Uros Bizjak wrote:
>> >> > >   Bootstrapped and regtested on X86_64-linux-gnu{-m32,}
>> >> > >   Ok for trunk?
>> >> >
>> >> > Some time ago a support for CLOBBER_HIGH RTX was added (and later
>> >> > removed for some reason). Perhaps we could resurrect the patch for the
>> >> > purpose of ferrying 128bit modes via vzeroupper RTX?
>> >> 
>> >> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-11/msg01325.html
>> >
>> > https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01468.html
>> > is where it got removed, CCing Richard.
>> 
>> Yeah.  Initially clobber_high seemed like the best approach for
>> handling the tlsdesc thing, but in practice it was too difficult
>> to shoe-horn the concept in after the fact, when so much rtl
>> infrastructure wasn't prepared to deal with it.  The old support
>> didn't handle all cases and passes correctly, and handled others
>> suboptimally.
>> 
>> I think it would be worth using the same approach as
>> https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01466.html for
>> vzeroupper: represent the instructions as call_insns in which the
>> call has a special vzeroupper ABI.  I think that's likely to lead
>> to better code than clobber_high would (or at least, it did for tlsdesc).
>
> Perhaps a magic call_insn that is split post-reload into a normal insn
> with the sets then?

I'd be tempted to treat it as a call_insn throughout.  The unspec_volatile
means that we can't move the instruction, so converting a call_insn to an
insn isn't likely to help from that point of view.  The sets are also
likely to be handled suboptimally compared to the more accurate register
information attached to the call: all code that handles calls has to be
prepared to deal with partial clobbers, whereas most code dealing with
sets will assume that the set does useful work, and that the rhs of the
set is live.

Thanks,
Richard

[PATCH] attributes: target_clone expects a string argument

2021-05-13 Thread Martin Liška


Hello.

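The change improves error handling: target_clones now reports an error
when its argument is not a string constant (for example target_clones(0)).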

Ready to be installed?
Thanks,
Martin

PR middle-end/100504

gcc/c-family/ChangeLog:

* c-attribs.c (handle_target_clones_attribute): Expect a string
argument to the target_clones attribute.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr100504.c: New test.
---
 gcc/c-family/c-attribs.c | 6 ++
 gcc/testsuite/gcc.target/i386/pr100504.c | 7 +++
 2 files changed, 13 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr100504.c

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index c1f652d1dc9..9905ee56947 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -5300,6 +5300,12 @@ handle_target_clones_attribute (tree *node, tree name, 
tree ARG_UNUSED (args),
   "with %qs attribute", name, "target");
  *no_add_attrs = true;
}
+  else if (TREE_CODE (TREE_VALUE (args)) != STRING_CST)
+   {
+ error ("%qE attribute argument not a string constant", name);
+ *no_add_attrs = true;
+   }
+
   else
   /* Do not inline functions with multiple clone targets.  */
DECL_UNINLINABLE (*node) = 1;
diff --git a/gcc/testsuite/gcc.target/i386/pr100504.c 
b/gcc/testsuite/gcc.target/i386/pr100504.c
new file mode 100644
index 000..2910dfb948b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100504.c
@@ -0,0 +1,7 @@
+/* PR middle-end/100504 */
+/* { dg-do compile } */
+
+__attribute__((target_clones(0)))
+foo()
+{ /* { dg-error ".target_clones. attribute argument not a string constant" } */
+}
--
2.31.1

[PATCH] tree-sra: Avoid refreshing into const base decls (PR 100453)

2021-05-13 Thread Martin Jambor

Hi,

When SRA transforms an assignment where the RHS is an aggregate decl
that it creates replacements for, the (least efficient) fallback
method of dealing with them is to store all the replacements back into
the original decl and then let the original assignment take its
course.

That of course should not need to be done for TREE_READONLY bases,
which cannot change contents.  The SRA code handled this situation in
only one of the two necessary places, and only for DECL_IN_CONSTANT_POOL
const decls; this patch modifies both to check TREE_READONLY.

Bootstrapped and tested on aarch64-linux, OK for trunk?

Thanks,

Martin



gcc/ChangeLog:

2021-05-12  Martin Jambor  

PR tree-optimization/100453
* tree-sra.c (sra_modify_assign): All const base accesses do not
need refreshing, not just those from decl_pool.
(sra_modify_assign): Do not refresh into a const base decl.

gcc/testsuite/ChangeLog:

2021-05-12  Martin Jambor  

PR tree-optimization/100453
* gcc.dg/tree-ssa/pr100453.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr100453.c | 18 ++
 gcc/tree-sra.c   |  4 ++--
 2 files changed, 20 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr100453.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr100453.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr100453.c
new file mode 100644
index 000..0cf0ad23815
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr100453.c
@@ -0,0 +1,18 @@
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+struct a {
+  int b : 4;
+} d;
+static int c, e;
+static const struct a f;
+static void g(const struct a h) {
+  for (; c < 1; c++)
+d = h;
+  e = h.b;
+  c = h.b;
+}
+int main() {
+  g(f);
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 8dfc923ed7e..186cd62b476 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -4244,7 +4244,7 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator 
*gsi)
   || stmt_ends_bb_p (stmt))
 {
   /* No need to copy into a constant-pool, it comes pre-initialized.  */
-  if (access_has_children_p (racc) && !constant_decl_p (racc->base))
+  if (access_has_children_p (racc) && !TREE_READONLY (racc->base))
generate_subtree_copies (racc->first_child, rhs, racc->offset, 0, 0,
 gsi, false, false, loc);
   if (access_has_children_p (lacc))
@@ -4333,7 +4333,7 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator 
*gsi)
}
  /* Restore the aggregate RHS from its components so the
 prevailing aggregate copy does the right thing.  */
- if (access_has_children_p (racc))
+ if (access_has_children_p (racc) && !TREE_READONLY (racc->base))
generate_subtree_copies (racc->first_child, rhs, racc->offset, 0, 0,
 gsi, false, false, loc);
  /* Re-load the components of the aggregate copy destination.
-- 
2.31.1

[PATCH][pushed] mklog: Put detected PR entries before ChangeLogs

2021-05-13 Thread Martin Liška


contrib/ChangeLog:

* mklog.py: Put PR entries before all ChangeLog entries
(will be added to all ChangeLog locations by Daily bump script).
* test_mklog.py: Test the new behavior.
---
 contrib/mklog.py  | 10 --
 contrib/test_mklog.py |  7 +--
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 1604f0516d0..5c93c707128 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -169,13 +169,19 @@ def generate_changelog(data, no_functions=False, 
fill_pr_titles=False):
 if fill_pr_titles:
 out += get_pr_titles(prs)
 
+# print list of PR entries before ChangeLog entries

+if prs:
+if not out:
+out += '\n'
+for pr in prs:
+out += '\t%s\n' % pr
+out += '\n'
+
 # sort ChangeLog so that 'testsuite' is at the end
 for changelog in sorted(changelog_list, key=lambda x: 'testsuite' in x):
 files = changelogs[changelog]
 out += '%s:\n' % os.path.join(changelog, 'ChangeLog')
 out += '\n'
-for pr in prs:
-out += '\t%s\n' % pr
 # new and deleted files should be at the end
 for file in sorted(files, key=sort_changelog_files):
 assert file.path.startswith(changelog)
diff --git a/contrib/test_mklog.py b/contrib/test_mklog.py
index 7e95ec1a2ab..a0670dac119 100755
--- a/contrib/test_mklog.py
+++ b/contrib/test_mklog.py
@@ -317,9 +317,10 @@ index 000..dcc8999c446
 EXPECTED5 = '''\
 PR target/95046 - Vectorize V2SFmode operations
 
+	PR target/95046

+
 gcc/testsuite/ChangeLog:
 
-	PR target/95046

* gcc.target/i386/pr95046-6.c: New test.
 
 '''

@@ -377,9 +378,11 @@ index 000..f3d6d11e61e
 '''
 
 EXPECTED7 = '''\

-gcc/testsuite/ChangeLog:
 
 	DR 2237

+
+gcc/testsuite/ChangeLog:
+
* g++.dg/DRs/dr2237.C: New test.
 
 '''

--
2.31.1

[PATCH] Warn for excessive argument alignment in main

2021-05-13 Thread H.J. Lu via Gcc-patches

Warn for excessive argument alignment in main instead of ICE.

gcc/

PR c/100575
* cfgexpand.c (expand_stack_alignment): Add a bool argument for
expanding main.  Warn for excessive argument alignment in main.
(pass_expand::execute): Pass true to expand_stack_alignment when
expanding main.

gcc/testsuite/

PR c/100575
* c-c++-common/pr100575.c: New test.
---
 gcc/cfgexpand.c   | 26 --
 gcc/testsuite/c-c++-common/pr100575.c | 11 +++
 2 files changed, 31 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/pr100575.c

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index e3814ee9d06..50ccb720e6c 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -6363,7 +6363,7 @@ discover_nonconstant_array_refs (void)
virtual_incoming_args_rtx with the virtual register.  */
 
 static void
-expand_stack_alignment (void)
+expand_stack_alignment (bool expanding_main)
 {
   rtx drap_rtx;
   unsigned int preferred_stack_boundary;
@@ -6385,9 +6385,18 @@ expand_stack_alignment (void)
   if (targetm.calls.update_stack_boundary)
 targetm.calls.update_stack_boundary ();
 
-  /* The incoming stack frame has to be aligned at least at
- parm_stack_boundary.  */
-  gcc_assert (crtl->parm_stack_boundary <= INCOMING_STACK_BOUNDARY);
+  if (crtl->parm_stack_boundary > INCOMING_STACK_BOUNDARY)
+    {
+      /* The incoming stack frame has to be aligned at least at
+         parm_stack_boundary.  NB: The incoming stack frame alignment
+         for main is fixed.  */
+      if (expanding_main)
+        warning_at (DECL_SOURCE_LOCATION (current_function_decl),
+                    OPT_Wmain, "argument alignment of %q+D is too large",
+                    current_function_decl);
+      else
+        gcc_unreachable ();
+    }
 
   /* Update crtl->stack_alignment_estimated and use it later to align
  stack.  We check PREFERRED_STACK_BOUNDARY if there may be non-call
@@ -6699,12 +6708,17 @@ pass_expand::execute (function *fun)
}
 }
 
+  bool expanding_main = false;
+
   /* If this function is `main', emit a call to `__main'
  to run global initializers, etc.  */
   if (DECL_NAME (current_function_decl)
   && MAIN_NAME_P (DECL_NAME (current_function_decl))
   && DECL_FILE_SCOPE_P (current_function_decl))
-expand_main_function ();
+{
+  expanding_main = true;
+  expand_main_function ();
+}
 
   /* Initialize the stack_protect_guard field.  This must happen after the
  call to __main (if any) so that the external decl is initialized.  */
@@ -6847,7 +6861,7 @@ pass_expand::execute (function *fun)
 
   /* Call expand_stack_alignment after finishing all
  updates to crtl->preferred_stack_boundary.  */
-  expand_stack_alignment ();
+  expand_stack_alignment (expanding_main);
 
   /* Fixup REG_EQUIV notes in the prologue if there are tailcalls in this
  function.  */
diff --git a/gcc/testsuite/c-c++-common/pr100575.c 
b/gcc/testsuite/c-c++-common/pr100575.c
new file mode 100644
index 000..e7366a8fe7f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pr100575.c
@@ -0,0 +1,11 @@
+/* { dg-do run } */
+/* { dg-options "-Wall -Wno-psabi" } */
+
+int
+main (int __attribute__((vector_size(1 << 29))) argc,
+  char **argv)
+/* { dg-warning "first argument of" "" { target *-*-* } .-2 } */
+/* { dg-warning "argument alignment of" "" { target *-*-* } .-3 } */
+{
+  return 0;
+}
-- 
2.31.1

Re: [PATCH] Remove unused variable.

2021-05-13 Thread Jeff Law via Gcc-patches




On 5/13/2021 3:06 AM, Martin Liška wrote:

Addresses the following clang warning:
gcc/tree-ssa-dom.c:652:33: warning: private field 'm_simplifier' is not used [-Wunused-private-field]


Ready for master?
Thanks

gcc/ChangeLog:

* tree-ssa-dom.c: Remove m_simplifier.


I wonder if Aldy's refactor accidentally dropped the ephemeral 
simplifications based on the threading path equivalences which is what I 
think this is supposed to be supporting.  Given that the refactor didn't 
cause any regressions, if that capability did get dropped, it couldn't 
be too important anymore.



OK for the trunk,

Jeff

[committed] testsuite: suppress cast warnings in pr100563.c [PR100563]

2021-05-13 Thread Richard Earnshaw via Gcc-patches


Fix a warning when building on machines that don't have 32-bit pointers.

gcc/testsuite:

PR target/100563
* gcc.dg/pr100563.c (dg-options): Add -Wno-pointer-to-int-cast.
---
 gcc/testsuite/gcc.dg/pr100563.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr100563.c b/gcc/testsuite/gcc.dg/pr100563.c
index 812eb9e6ae2..f6a5fcd3a47 100644
--- a/gcc/testsuite/gcc.dg/pr100563.c
+++ b/gcc/testsuite/gcc.dg/pr100563.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Og" } */
+/* { dg-options "-Og -Wno-pointer-to-int-cast" } */
 unsigned long long e(void);
 void f(int);
 void a() {

Re: [GCC-11 backport][PATCH] arm: Remove duplicate definitions from arm_mve.h (pr100419).

2021-05-13 Thread Richard Earnshaw via Gcc-patches





On 12/05/2021 10:56, Srinath Parvathaneni via Gcc-patches wrote:

Hi,

This is a backport to the GCC-11 branch; the patch applied cleanly on the
branch.

This patch removes several duplicated intrinsic definitions from
arm_mve.h mentioned in PR100419 and also fixes the wrong arguments
in a few of the polymorphic intrinsic variants.
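
The corrected macros in the diff below copy the pointer argument p0 into a local and then select the type-specific intrinsic from that local with _Generic. Here is a minimal sketch of that shape, assuming GCC's statement-expression and __typeof extensions; store_s32, store_u32 and store_any are hypothetical stand-ins, not arm_mve.h names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type-specific stores.  The return value only records
   which variant ran.  */
static int
store_s32 (int32_t *base, int32_t value)
{
  assert (base != NULL);
  *base = value;
  return 1;
}

static int
store_u32 (uint32_t *base, uint32_t value)
{
  assert (base != NULL);
  *base = value;
  return 2;
}

/* Polymorphic wrapper: P0 is evaluated exactly once into a local, and
   both the type selection and the call use that local.  */
#define store_any(p0, p1) \
  ({ __typeof (p0) p0_ = (p0); \
     _Generic ((p0_), \
               int32_t *: store_s32, \
               uint32_t *: store_u32) (p0_, (p1)); })
```

Because the selection and the call share one copy of the argument, an expression with side effects such as q++ is evaluated once, and the variant chosen always matches the pointer that is actually passed.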

Ok for GCC-11 branch?

gcc/ChangeLog:

2021-05-04  Srinath Parvathaneni  

PR target/100419
* config/arm/arm_mve.h (__arm_vstrwq_scatter_offset): Fix wrong 
arguments.
(__arm_vcmpneq): Remove duplicate definition.
(__arm_vstrwq_scatter_offset_p): Likewise.
(__arm_vmaxq_x): Likewise.
(__arm_vmlsdavaq): Likewise.
(__arm_vmlsdavaxq): Likewise.
(__arm_vmlsdavq_p): Likewise.
(__arm_vmlsdavxq_p): Likewise.
(__arm_vrmlaldavhaq): Likewise.
(__arm_vstrbq_p): Likewise.
(__arm_vstrbq_scatter_offset): Likewise.
(__arm_vstrbq_scatter_offset_p): Likewise.
(__arm_vstrdq_scatter_offset): Likewise.
(__arm_vstrdq_scatter_offset_p): Likewise.
(__arm_vstrdq_scatter_shifted_offset): Likewise.
(__arm_vstrdq_scatter_shifted_offset_p): Likewise.


OK.

R.



Co-authored-by: Joe Ramsay  
(cherry picked from commit 9b905ba9ebba8d2cc805c26351225e7f74c02333)


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
3a40c6e68161b64319b071f57a5b0d8393303cfd..dc1d874a6366eb5fe755a70c72ed371c915bd04b
 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -37808,33 +37808,19 @@ extern void *__ARM_undef;
int (*)[__ARM_mve_type_uint32x4_t]: __arm_vstrwq_scatter_base_p_u32(p0, p1, 
__ARM_mve_coerce(__p2, uint32x4_t), p3), \
int (*)[__ARM_mve_type_float32x4_t]: __arm_vstrwq_scatter_base_p_f32(p0, 
p1, __ARM_mve_coerce(__p2, float32x4_t), p3));})
  
-#define __arm_vstrwq_scatter_offset(p0,p1,p2) ({ __typeof(p1) __p1 = (p1); \

+#define __arm_vstrwq_scatter_offset(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
__typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(p0)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_int32_t_ptr][__ARM_mve_type_int32x4_t]: 
__arm_vstrwq_scatter_offset_s32 (__ARM_mve_coerce(p0, int32_t *), __p1, 
__ARM_mve_coerce(__p2, int32x4_t)), \
-  int (*)[__ARM_mve_type_uint32_t_ptr][__ARM_mve_type_uint32x4_t]: 
__arm_vstrwq_scatter_offset_u32 (__ARM_mve_coerce(p0, uint32_t *), __p1, 
__ARM_mve_coerce(__p2, uint32x4_t)), \
-  int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4_t]: 
__arm_vstrwq_scatter_offset_f32 (__ARM_mve_coerce(p0, float32_t *), __p1, 
__ARM_mve_coerce(__p2, float32x4_t)));})
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p2)])0, \
+  int (*)[__ARM_mve_type_int32_t_ptr][__ARM_mve_type_int32x4_t]: 
__arm_vstrwq_scatter_offset_s32 (__ARM_mve_coerce(__p0, int32_t *), p1, 
__ARM_mve_coerce(__p2, int32x4_t)), \
+  int (*)[__ARM_mve_type_uint32_t_ptr][__ARM_mve_type_uint32x4_t]: 
__arm_vstrwq_scatter_offset_u32 (__ARM_mve_coerce(__p0, uint32_t *), p1, 
__ARM_mve_coerce(__p2, uint32x4_t)), \
+  int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4_t]: 
__arm_vstrwq_scatter_offset_f32 (__ARM_mve_coerce(__p0, float32_t *), p1, 
__ARM_mve_coerce(__p2, float32x4_t)));})
  
-#define __arm_vstrwq_scatter_offset_p(p0,p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \

+#define __arm_vstrwq_scatter_offset_p(p0,p1,p2,p3) ({ __typeof(p0) __p0 = 
(p0); \
__typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(p0)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_int32_t_ptr][__ARM_mve_type_int32x4_t]: 
__arm_vstrwq_scatter_offset_p_s32 (__ARM_mve_coerce(p0, int32_t *), __p1, 
__ARM_mve_coerce(__p2, int32x4_t), p3), \
-  int (*)[__ARM_mve_type_uint32_t_ptr][__ARM_mve_type_uint32x4_t]: 
__arm_vstrwq_scatter_offset_p_u32 (__ARM_mve_coerce(p0, uint32_t *), __p1, 
__ARM_mve_coerce(__p2, uint32x4_t), p3), \
-  int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4_t]: 
__arm_vstrwq_scatter_offset_p_f32 (__ARM_mve_coerce(p0, float32_t *), __p1, 
__ARM_mve_coerce(__p2, float32x4_t), p3));})
-
-#define __arm_vstrwq_scatter_offset_p(p0,p1,p2,p3) ({ __typeof(p1) __p1 = 
(p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(p0)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_int32_t_ptr][__ARM_mve_type_int32x4_t]: 
__arm_vstrwq_scatter_offset_p_s32 (__ARM_mve_coerce(p0, int32_t *), __p1, 
__ARM_mve_coerce(__p2, int32x4_t), p3), \
-  int (*)[__ARM_mve_type_uint32_t_ptr][__ARM_mve_type_uint32x4_t]: 
__arm_vstrwq_scatter_offset_p_u32 (__ARM_mve_coerce(p0, uint32_t *), __p1, 
__ARM_mve_coerce(__p2, uint32x4_t), p3), \
-  int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4_t]: 
__arm_vstrwq_scatter_offset_p_f32 (__ARM_mve_coerce(p0, float32_t *), __p1, 
__ARM_mve_coerce(__p2, float32x4_t), p3));})
-
-#define __arm_vstrwq_scatter_offset(p0

Re: [GCC-10 backport][PATCH] arm: Remove duplicate definitions from arm_mve.h (pr100419).

2021-05-13 Thread Richard Earnshaw via Gcc-patches





On 12/05/2021 10:56, Srinath Parvathaneni via Gcc-patches wrote:

Hi,

This is a backport to the GCC-10 branch; the patch applied cleanly on the
branch.

This patch removes several duplicated intrinsic definitions from
arm_mve.h mentioned in PR100419 and also fixes the wrong arguments
in a few of the polymorphic intrinsic variants.

Ok for GCC-10 branch?

gcc/ChangeLog:

2021-05-04  Srinath Parvathaneni  

PR target/100419
* config/arm/arm_mve.h (__arm_vstrwq_scatter_offset): Fix wrong 
arguments.
(__arm_vcmpneq): Remove duplicate definition.
(__arm_vstrwq_scatter_offset_p): Likewise.
(__arm_vmaxq_x): Likewise.
(__arm_vmlsdavaq): Likewise.
(__arm_vmlsdavaxq): Likewise.
(__arm_vmlsdavq_p): Likewise.
(__arm_vmlsdavxq_p): Likewise.
(__arm_vrmlaldavhaq): Likewise.
(__arm_vstrbq_p): Likewise.
(__arm_vstrbq_scatter_offset): Likewise.
(__arm_vstrbq_scatter_offset_p): Likewise.
(__arm_vstrdq_scatter_offset): Likewise.
(__arm_vstrdq_scatter_offset_p): Likewise.
(__arm_vstrdq_scatter_shifted_offset): Likewise.
(__arm_vstrdq_scatter_shifted_offset_p): Likewise.

Co-authored-by: Joe Ramsay  
(cherry picked from commit 9b905ba9ebba8d2cc805c26351225e7f74c02333)



OK.




### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
449219e90fccd6344c725404366147f5932e8660..1132c7cf87d217a380cf26dd6f110130ea7bf175
 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -37802,33 +37802,19 @@ extern void *__ARM_undef;
int (*)[__ARM_mve_type_uint32x4_t]: __arm_vstrwq_scatter_base_p_u32(p0, p1, 
__ARM_mve_coerce(__p2, uint32x4_t), p3), \
int (*)[__ARM_mve_type_float32x4_t]: __arm_vstrwq_scatter_base_p_f32(p0, 
p1, __ARM_mve_coerce(__p2, float32x4_t), p3));})
  
-#define __arm_vstrwq_scatter_offset(p0,p1,p2) ({ __typeof(p1) __p1 = (p1); \

+#define __arm_vstrwq_scatter_offset(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
__typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(p0)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_int32_t_ptr][__ARM_mve_type_int32x4_t]: 
__arm_vstrwq_scatter_offset_s32 (__ARM_mve_coerce(p0, int32_t *), __p1, 
__ARM_mve_coerce(__p2, int32x4_t)), \
-  int (*)[__ARM_mve_type_uint32_t_ptr][__ARM_mve_type_uint32x4_t]: 
__arm_vstrwq_scatter_offset_u32 (__ARM_mve_coerce(p0, uint32_t *), __p1, 
__ARM_mve_coerce(__p2, uint32x4_t)), \
-  int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4_t]: 
__arm_vstrwq_scatter_offset_f32 (__ARM_mve_coerce(p0, float32_t *), __p1, 
__ARM_mve_coerce(__p2, float32x4_t)));})
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p2)])0, \
+  int (*)[__ARM_mve_type_int32_t_ptr][__ARM_mve_type_int32x4_t]: 
__arm_vstrwq_scatter_offset_s32 (__ARM_mve_coerce(__p0, int32_t *), p1, 
__ARM_mve_coerce(__p2, int32x4_t)), \
+  int (*)[__ARM_mve_type_uint32_t_ptr][__ARM_mve_type_uint32x4_t]: 
__arm_vstrwq_scatter_offset_u32 (__ARM_mve_coerce(__p0, uint32_t *), p1, 
__ARM_mve_coerce(__p2, uint32x4_t)), \
+  int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4_t]: 
__arm_vstrwq_scatter_offset_f32 (__ARM_mve_coerce(__p0, float32_t *), p1, 
__ARM_mve_coerce(__p2, float32x4_t)));})
  
-#define __arm_vstrwq_scatter_offset_p(p0,p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \

+#define __arm_vstrwq_scatter_offset_p(p0,p1,p2,p3) ({ __typeof(p0) __p0 = 
(p0); \
__typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(p0)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_int32_t_ptr][__ARM_mve_type_int32x4_t]: 
__arm_vstrwq_scatter_offset_p_s32 (__ARM_mve_coerce(p0, int32_t *), __p1, 
__ARM_mve_coerce(__p2, int32x4_t), p3), \
-  int (*)[__ARM_mve_type_uint32_t_ptr][__ARM_mve_type_uint32x4_t]: 
__arm_vstrwq_scatter_offset_p_u32 (__ARM_mve_coerce(p0, uint32_t *), __p1, 
__ARM_mve_coerce(__p2, uint32x4_t), p3), \
-  int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4_t]: 
__arm_vstrwq_scatter_offset_p_f32 (__ARM_mve_coerce(p0, float32_t *), __p1, 
__ARM_mve_coerce(__p2, float32x4_t), p3));})
-
-#define __arm_vstrwq_scatter_offset_p(p0,p1,p2,p3) ({ __typeof(p1) __p1 = 
(p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(p0)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_int32_t_ptr][__ARM_mve_type_int32x4_t]: 
__arm_vstrwq_scatter_offset_p_s32 (__ARM_mve_coerce(p0, int32_t *), __p1, 
__ARM_mve_coerce(__p2, int32x4_t), p3), \
-  int (*)[__ARM_mve_type_uint32_t_ptr][__ARM_mve_type_uint32x4_t]: 
__arm_vstrwq_scatter_offset_p_u32 (__ARM_mve_coerce(p0, uint32_t *), __p1, 
__ARM_mve_coerce(__p2, uint32x4_t), p3), \
-  int (*)[__ARM_mve_type_float32_t_ptr][__ARM_mve_type_float32x4_t]: 
__arm_vstrwq_scatter_offset_p_f32 (__ARM_mve_coerce(p0, float32_t *), __p1, 
__ARM_mve_coerce(__p2, float32x4_t), p3));})
-
-#define __arm_vstrwq_scatter_offset(p0,p1

Re: [PATCH] attributes: target_clone expects a string argument

2021-05-13 Thread Martin Sebor via Gcc-patches


On 5/13/21 6:05 AM, Martin Liška wrote:

Hello.

The change is about error handling.

Ready to be installed?
Thanks,
Martin

 PR middle-end/100504

gcc/c-family/ChangeLog:

 * c-attribs.c (handle_target_clones_attribute): Expect a string
 argument to the target_clones attribute.

gcc/testsuite/ChangeLog:

 * gcc.target/i386/pr100504.c: New test.
---
  gcc/c-family/c-attribs.c | 6 ++
  gcc/testsuite/gcc.target/i386/pr100504.c | 7 +++
  2 files changed, 13 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/i386/pr100504.c

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index c1f652d1dc9..9905ee56947 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -5300,6 +5300,12 @@ handle_target_clones_attribute (tree *node, tree 
name, tree ARG_UNUSED (args),

     "with %qs attribute", name, "target");
    *no_add_attrs = true;
  }
+  else if (TREE_CODE (TREE_VALUE (args)) != STRING_CST)
+    {
+  error ("%qE attribute argument not a string constant", name);
+  *no_add_attrs = true;
+    }


Since errors are higher priority than warnings, I'd suggest making
this the first check, before the warnings above (and adding a test
to verify that that's how it works).

Martin


+
    else
    /* Do not inline functions with multiple clone targets.  */
  DECL_UNINLINABLE (*node) = 1;
diff --git a/gcc/testsuite/gcc.target/i386/pr100504.c 
b/gcc/testsuite/gcc.target/i386/pr100504.c

new file mode 100644
index 000..2910dfb948b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100504.c
@@ -0,0 +1,7 @@
+/* PR middle-end/100504 */
+/* { dg-do compile } */
+
+__attribute__((target_clones(0)))
+foo()
+{ /* { dg-error ".target_clones. attribute argument not a string 
constant" } */

+}
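
The ordering suggested in the review above (test for the hard error before
the warning-only conflicts) can be sketched in isolation.  This is a toy
model, not GCC code: `enum arg_kind`, `enum diag`, `struct decl_info` and
`check_target_clones` are hypothetical names, and the two flags merely stand
in for the conditions the real handler warns about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the tree codes and diagnostics involved.  */
enum arg_kind { ARG_STRING, ARG_INTEGER };
enum diag { DIAG_NONE, DIAG_WARNING, DIAG_ERROR };

struct decl_info
{
  bool has_always_inline;   /* conflicting attribute: warning only */
  bool has_target;          /* conflicting attribute: warning only */
};

/* Validate a target_clones-like attribute.  The malformed-argument error
   is tested first, so it is reported even when one of the warning-only
   conflicts is also present.  *ADD_ATTR tells the caller whether to keep
   the attribute.  */
enum diag
check_target_clones (const struct decl_info *decl, enum arg_kind arg,
                     bool *add_attr)
{
  assert (decl != NULL && add_attr != NULL);
  *add_attr = false;

  if (arg != ARG_STRING)
    return DIAG_ERROR;
  if (decl->has_always_inline || decl->has_target)
    return DIAG_WARNING;

  *add_attr = true;
  return DIAG_NONE;
}
```

With this ordering, a declaration that both carries a conflicting attribute
and passes a non-string argument gets the error rather than only a warning.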

Re: [PATCH] tree-sra: Avoid refreshing into const base decls (PR 100453)

2021-05-13 Thread Jeff Law via Gcc-patches




On 5/13/2021 6:23 AM, Martin Jambor wrote:

Hi,

When SRA transforms an assignment where the RHS is an aggregate decl
that it creates replacements for, the (least efficient) fallback
method of dealing with them is to store all the replacements back into
the original decl and then let the original assignment take its
course.

That of course should not need to be done for TREE_READONLY bases
which cannot change contents.  The SRA code handled this situation in
one of two necessary places but only for DECL_IN_CONSTANT_POOL const
decls; this patch modifies both to check TREE_READONLY.

Bootstrapped and tested on aarch64-linux, OK for trunk?

Thanks,

Martin



gcc/ChangeLog:

2021-05-12  Martin Jambor  

PR tree-optimization/100453
* tree-sra.c (sra_modify_assign): All const base accesses do not
need refreshing, not just those from decl_pool.
(sra_modify_assign): Do not refresh into a const base decl.

gcc/testsuite/ChangeLog:

2021-05-12  Martin Jambor  

PR tree-optimization/100453
* gcc.dg/tree-ssa/pr100453.c: New test.


OK

jeff

Re: [PATCH] attributes: target_clone expects a string argument

2021-05-13 Thread Jeff Law via Gcc-patches




On 5/13/2021 6:05 AM, Martin Liška wrote:

Hello.

The change is about error handling.

Ready to be installed?
Thanks,
Martin

PR middle-end/100504

gcc/c-family/ChangeLog:

* c-attribs.c (handle_target_clones_attribute): Expect a string
argument to the target_clones attribute.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr100504.c: New test.


OK

jeff

[committed] openmp: Add testcases to verify OpenMP 5.0 2.14 and OpenMP 5.1 2.17 rules [PR99928]

2021-05-13 Thread Jakub Jelinek via Gcc-patches

Hi!

In preparation for PR99928 patch review, I've prepared testcases with clauses
that need more interesting handling on combined/composite constructs,
in particular firstprivate, lastprivate, firstprivate+lastprivate, linear
(explicit on non-iv, explicit on simd iv, implicit on simd iv, implicit on
simd iv declared in the construct), reduction (scalars, array sections of
array variables, array sections with pointer bases) and in_reduction.

OpenMP 5.0 had broken wording for reduction: the intended rule to use
map(tofrom:) on target when combined with it was bound only to the presence
of the inscan modifier, which makes no sense because inscan may not be used
in that case.  This has been fixed in 5.1 and I'm just assuming the 5.1
wording for that.

There are various cases where, e.g. for historical or optimization reasons,
GCC slightly deviates from the rules, but in most cases it is something
that shouldn't be really observable, e.g. whether
  #pragma omp parallel for firstprivate(x)
is handled as
  #pragma omp parallel shared(x)
  #pragma omp for firstprivate(x)
or
  #pragma omp parallel firstprivate(x)
  #pragma omp for
shouldn't be possible to distinguish in user code.  I've added FIXMEs
in the testcases about that, but maybe we just should keep it as is
(alternative would be to do it in standard compliant way and transform
into whatever we like after gimplification (e.g. early during omplower)).
For historical reasons we implement some cases by putting clauses on
constructs that the standard does not accept them on, and then handle
those specially in omp lowering/expansion, in particular e.g.
  #pragma omp parallel for firstprivate(x) lastprivate(x)
we treat as
  #pragma omp parallel firstprivate(x) lastprivate(x)
  #pragma omp for
even when lastprivate is not valid on parallel.  Maybe one day we
could change that if we make sure we don't regress generated code quality.
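
The claim that the two decompositions of `parallel for firstprivate(x)`
cannot be told apart in user code can be tried with a small sketch.  The
function names and the reduction used to collect a result are assumptions
made for illustration; they are not taken from the pr99928 testcases.  Built
without -fopenmp the pragmas are ignored and the functions run serially.

```c
#include <assert.h>

/* Combined form: firstprivate (x) written on "parallel for".  */
int
sum_combined (int x, int n)
{
  assert (n >= 0);
  int total = 0;
  #pragma omp parallel for firstprivate (x) reduction (+:total)
  for (int i = 0; i < n; i++)
    total += x + i;
  return total;
}

/* Split form 1: x is shared on parallel and firstprivate on for.  */
int
sum_split_on_for (int x, int n)
{
  assert (n >= 0);
  int total = 0;
  #pragma omp parallel shared (x) reduction (+:total)
  {
    #pragma omp for firstprivate (x)
    for (int i = 0; i < n; i++)
      total += x + i;
  }
  return total;
}

/* Split form 2: x is firstprivate on parallel, nothing on for.  */
int
sum_split_on_parallel (int x, int n)
{
  assert (n >= 0);
  int total = 0;
  #pragma omp parallel firstprivate (x) reduction (+:total)
  {
    #pragma omp for
    for (int i = 0; i < n; i++)
      total += x + i;
  }
  return total;
}
```

All three functions compute n*x + n*(n-1)/2, whichever construct ends up
owning the private copy of x.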

I've also found a bug in OpenMP 5.0/5.1,
  #pragma omp parallel sections firstprivate(x) lastprivate(x)
incorrectly says that it should be handled as
  #pragma omp parallel firstprivate(x)
  #pragma omp sections lastprivate(x)
which when written that way results in error; filed as
https://github.com/OpenMP/spec/issues/2758
to be fixed in OpenMP 5.2.  GCC handles it the way it used to do
and users expect, so nothing to fix on the GCC side.

Also, we don't yet support the in_reduction clause on the target construct,
which means the -11.c testcase can't include any tests about in_reduction
handling on all the composite constructs that include target.

The work found two kinds of bugs on the GCC side.  One is the known issue
that we still implement the 4.5 behavior and don't mark the list item as
map(tofrom:) for lastprivate/linear/reduction, as mentioned in PR99928.
These cases are xfailed in the tests.

And another one is with r21 and r28 in -{8,9,10}.c tests - we don't add
reduction clause on teams for
  #pragma omp {target ,}teams distribute simd reduction(+:r)
even when the spec says that teams shouldn't receive reduction only
when combined with loop construct.

In
make check-gcc check-g++ RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} gomp.exp=pr99928*'
testing this shows:

  # of expected passes  5648
  # of expected failures  872

and with Tobias' patch applied:

  # of expected passes  5648
  # of unexpected successes 384
  # of expected failures  488

Committed to trunk.

2021-05-13  Jakub Jelinek  

PR middle-end/99928
* c-c++-common/gomp/pr99928-1.c: New test.
* c-c++-common/gomp/pr99928-2.c: New test.
* c-c++-common/gomp/pr99928-3.c: New test.
* c-c++-common/gomp/pr99928-4.c: New test.
* c-c++-common/gomp/pr99928-5.c: New test.
* c-c++-common/gomp/pr99928-6.c: New test.
* c-c++-common/gomp/pr99928-7.c: New test.
* c-c++-common/gomp/pr99928-8.c: New test.
* c-c++-common/gomp/pr99928-9.c: New test.
* c-c++-common/gomp/pr99928-10.c: New test.
* c-c++-common/gomp/pr99928-11.c: New test.

--- gcc/testsuite/c-c++-common/gomp/pr99928-1.c.jj  2021-05-13 12:32:49.240146205 +0200
+++ gcc/testsuite/c-c++-common/gomp/pr99928-1.c 2021-05-13 12:32:49.240146205 +0200
@@ -0,0 +1,206 @@
+/* PR middle-end/99928 */
+/* { dg-do compile } */
+/* { dg-options "-fopenmp -fdump-tree-gimple" } */
+
+int f00, f01, f02, f03, f04, f05, f06, f07, f08, f09;
+int f12, f13, f14, f15, f16, f17, f18, f19;
+int f20, f21, f22, f23, f24, f25, f26, f27, f28, f29;
+
+void
+foo (void)
+{
+  /* { dg-final { scan-tree-dump "omp 
distribute\[^\n\r]*firstprivate\\(f00\\)" "gimple" } } */
+  /* { dg-final { scan-tree-dump "omp parallel\[^\n\r]*firstprivate\\(f00\\)" 
"gimple" } } *//* FIXME: This should be on for instead.  */
+  /* { dg-final { scan-tree-dump-not "omp for\[^\n\r]*firstprivate\\(f00\\)" 
"gimple" } } *//* FIXME.  */
+  #pragma omp distribute parallel for firstprivate (f00)
+  for (int i = 0; i < 64; i++)
+f00++;
+  /* { dg-final { scan-tree-dump

Re: [PATCH] config: delete unused sim macros

2021-05-13 Thread Jeff Law via Gcc-patches




On 5/11/2021 10:28 PM, Mike Frysinger via Gcc-patches wrote:

Nothing in gcc or binutils or gdb or anything anywhere uses these.

config/

* acinclude.m4 (CYG_AC_PATH_SIM, CYG_AC_PATH_DEVO): Delete.


"DEVO", yea, that's old.  I had a slight concern CYG might refer to 
Cygwin rather than Cygnus, but the comments reference old Cygnus 
projects (Foundry).  However, just to be sure, I checked the 
newlib-cygwin repo which has no references other than a clone of the 
code you're removing.


OK for the trunk.

jeff

Re: [PATCH] regcprop: Fix another cprop_hardreg bug [PR100342]

2021-05-13 Thread Jakub Jelinek via Gcc-patches

On Tue, May 11, 2021 at 11:59:24AM +0100, Richard Sandiford via Gcc-patches 
wrote:
> > I wrote the following patch (originally against 10 branch because that is
> > where Uros has been debugging it) and bootstrapped/regtested it on 11
> > branch successfully.
> > It effectively implements your (2) above; I'm not sure if
> > REG_CAN_CHANGE_MODE_P is needed there, because it is already tested in
> > find_oldest_value_reg -> maybe_mode_change -> mode_change_ok.
> 
> The REG_CAN_CHANGE_MODE_P test would in this case be for
> vd->e[dr].mode → vd->e[sr].mode, rather than oldest_regno's mode.
> I'm just worried that:
> 
>(set (reg:HI R1) (reg:HI R0))
>(set (reg:SI R2) (reg:SI R1))
> 
> isn't equivalent to:
> 
>(set (reg:HI R1) (reg:HI R0))
>(set (reg:HI R2) (reg:HI R1))
> 
> if REG_CAN_CHANGE_MODE_P is false for either the R2 or R1 change.
> If we pretend that it is when building the chain then there's a
> risk of GIGO when using it in find_oldest_value_reg.
> 
> (Although in this case SI and HI are both valid for R1,
> REG_CAN_CHANGE_MODE_P might still be false if the HI bits are
> not in the low 16 bits of the SI.  That's unlikely in this case,
> but a similar thing can happen for vector modes or multi-register modes.)
> 
> I'm not saying the patch is wrong.  I just wanted to clarify
> why I thought the check might be needed.

So, do you want something like the patch below (I've deleted the old
comment as I think the new one is enough, but am open to keeping both),
where if REG_CAN_CHANGE_MODE_P is false, we punt (return), otherwise call
set_value_regno?
I am not sure if those REG_CAN_CHANGE_MODE_P arguments are what you want
though.
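
The hazard being discussed, a narrow value copied in a wider mode, can be
modelled outside of GCC.  The sketch below is a toy with hypothetical names
(`struct entry`, `set_fresh`, `copy_value`, `copy_is_redundant`); it records
only how many low bits of a value are known to be defined and does not model
REG_CAN_CHANGE_MODE_P or real machine modes.

```c
#include <assert.h>
#include <stdbool.h>

enum reg { R14, SI, BP, DX, NREGS };

struct entry
{
  int value_id;            /* which value the register currently holds */
  unsigned defined_bits;   /* how many low bits of it are defined */
};

struct value_data
{
  struct entry e[NREGS];
  int next_id;
};

/* REG is set from something unrelated (memory, a constant): it starts a
   new value with BITS defined bits.  */
void
set_fresh (struct value_data *vd, enum reg reg, unsigned bits)
{
  vd->e[reg].value_id = vd->next_id++;
  vd->e[reg].defined_bits = bits;
}

/* Give every register its own unknown value.  */
void
init_value_data (struct value_data *vd)
{
  vd->next_id = 0;
  for (int r = 0; r < NREGS; r++)
    set_fresh (vd, (enum reg) r, 0);
}

/* Model (set (reg:<BITS> DR) (reg:<BITS> SR)).  If SR only has a narrower
   value recorded, DR is recorded with that narrower width too, because
   the upper bits of what was copied are undefined.  */
void
copy_value (struct value_data *vd, enum reg dr, enum reg sr, unsigned bits)
{
  assert (dr != sr);
  unsigned known = vd->e[sr].defined_bits;
  vd->e[dr].value_id = vd->e[sr].value_id;
  vd->e[dr].defined_bits = bits < known ? bits : known;
}

/* A BITS-wide copy from SR to DR is redundant only if both registers
   hold the same value and both have at least BITS defined bits.  */
bool
copy_is_redundant (const struct value_data *vd, enum reg dr, enum reg sr,
                   unsigned bits)
{
  return (vd->e[dr].value_id == vd->e[sr].value_id
          && vd->e[dr].defined_bits >= bits
          && vd->e[sr].defined_bits >= bits);
}
```

Replaying the sequence from the new comment in the patch (r14 loaded, si
copied in QI, bp copied in DI, dx copied from si in DI), the final DI copy
from bp to dx is not reported as redundant, while an 8-bit copy would be.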

--- gcc/regcprop.c.jj   2021-03-23 10:21:07.176447920 +0100
+++ gcc/regcprop.c  2021-05-13 17:31:39.940519855 +0200
@@ -358,34 +358,25 @@ copy_value (rtx dest, rtx src, struct va
   else if (sn > hard_regno_nregs (sr, vd->e[sr].mode))
 return;
 
-  /* It is not safe to link DEST into the chain if SRC was defined in some
- narrower mode M and if M is also narrower than the mode of the first
- register in the chain.  For example:
- (set (reg:DI r1) (reg:DI r0))
- (set (reg:HI r2) (reg:HI r1))
- (set (reg:SI r3) (reg:SI r2)) //Should be a new chain start at r3
- (set (reg:SI r4) (reg:SI r1))
- (set (reg:SI r5) (reg:SI r4))
-
- the upper part of r3 is undefined.  If we added it to the chain,
- it may be used to replace r5, which has defined upper bits.
- See PR98694 for details.
-
- [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
- [B] partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode)
- Condition B is added to to catch optimization opportunities of
-
- (set (reg:HI R1) (reg:HI R0))
- (set (reg:SI R2) (reg:SI R1)) // [A]
- (set (reg:DI R3) (reg:DI R2)) // [A]
- (set (reg:SI R4) (reg:SI R[0-3]))
- (set (reg:HI R5) (reg:HI R[0-4]))
-
- in which all registers have only 16 defined bits.  */
-  else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
-  && partial_subreg_p (vd->e[sr].mode,
-   vd->e[vd->e[sr].oldest_regno].mode))
-return;
+  /* If a narrower value is copied using wider mode, the upper bits
+ are undefined (could be e.g. a former paradoxical subreg).  Signal
+ in that case we've only copied value using the narrower mode.
+ Consider:
+ (set (reg:DI r14) (mem:DI ...))
+ (set (reg:QI si) (reg:QI r14))
+ (set (reg:DI bp) (reg:DI r14))
+ (set (reg:DI r14) (const_int ...))
+ (set (reg:DI dx) (reg:DI si))
+ (set (reg:DI si) (const_int ...))
+ (set (reg:DI dx) (reg:DI bp))
+ The last set is not redundant, while the low 8 bits of dx are already
+ equal to low 8 bits of bp, the other bits are undefined.  */
+  else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src)))
+{
+  if (REG_CAN_CHANGE_MODE_P (sr, GET_MODE (src), vd->e[sr].mode))
+   return;
+  set_value_regno (dr, vd->e[sr].mode, vd);
+}
 
   /* Link DR at the end of the value chain used by SR.  */
 


Jakub

Re: [PATCH] libsanitizer: merge from master

2021-05-13 Thread H.J. Lu via Gcc-patches

On Thu, May 13, 2021 at 09:28:01AM +0200, Martin Liška wrote:
> I'm planning to do a merge from master twice a year.
> This merge was tested on x86_64-linux-gnu and ppc64le-linux-gnu
> and survives regression tests.
> 
> Pushed to master.
> Thanks,
> Martin
> 
> Merged revision: f58e0513dd95944b81ce7a6e7b49ba656de7d75f

On Linux/x86-64, I got

../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp: In function ‘void __sanitizer::InitTlsSize()’:
../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:209:55: error: invalid conversion from ‘__sanitizer::uptr*’ {aka ‘long unsigned int*’} to ‘size_t*’ {aka ‘unsigned int*’} [-fpermissive]
  209 |   ((void (*)(size_t *, size_t *))get_tls_static_info)(&g_tls_size, &tls_align);
      |                                                       ^~~
      |                                                       |
      |                                                       __sanitizer::uptr* {aka long unsigned int*}


H.J.
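
The diagnostic above comes from passing the address of an `unsigned long`
variable where a `size_t *` is expected, on a configuration where `size_t`
is `unsigned int`.  One type-safe way to bridge such a mismatch is to go
through temporaries of the callee's type.  This is only a sketch of that
pattern, not necessarily what was done upstream; `uptr_like`,
`get_static_info` and `init_tls_size` are hypothetical stand-ins, and the
values written by the callee are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pointer-sized unsigned type.  */
typedef unsigned long uptr_like;

/* Hypothetical callee with a size_t-based signature.  */
void
get_static_info (size_t *size, size_t *align)
{
  assert (size != NULL && align != NULL);
  *size = 4096;   /* made-up values for the sketch */
  *align = 64;
}

/* Receive the results in size_t temporaries and convert by value, so
   that no pointer to one integer type is ever passed as a pointer to
   another.  */
void
init_tls_size (uptr_like *tls_size, uptr_like *tls_align)
{
  size_t size = 0, align = 0;
  get_static_info (&size, &align);
  *tls_size = size;
  *tls_align = align;
}
```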

Re: [PATCH] Warn for excessive argument alignment in main

2021-05-13 Thread Bernd Edlinger

On 5/13/21 3:37 PM, H.J. Lu via Gcc-patches wrote:
> Warn for excessive argument alignment in main instead of ICE.
> 
> gcc/
> 
>   PR c/100575
>   * cfgexpand.c (expand_stack_alignment): Add a bool argument for
>   expanding main.  Warn for excessive argument alignment in main.
>   (pass_expand::execute): Pass true to expand_stack_alignment when
>   expanding main.
> 
> gcc/testsuite/
> 
>   PR c/100575
>   * c-c++-common/pr100575.c: New test.
> ---
>  gcc/cfgexpand.c   | 26 --
>  gcc/testsuite/c-c++-common/pr100575.c | 11 +++
>  2 files changed, 31 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/pr100575.c
> 
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index e3814ee9d06..50ccb720e6c 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -6363,7 +6363,7 @@ discover_nonconstant_array_refs (void)
> virtual_incoming_args_rtx with the virtual register.  */
>  
>  static void
> -expand_stack_alignment (void)
> +expand_stack_alignment (bool expanding_main)
>  {
>rtx drap_rtx;
>unsigned int preferred_stack_boundary;
> @@ -6385,9 +6385,18 @@ expand_stack_alignment (void)
>if (targetm.calls.update_stack_boundary)
>  targetm.calls.update_stack_boundary ();
>  
> -  /* The incoming stack frame has to be aligned at least at
> - parm_stack_boundary.  */
> -  gcc_assert (crtl->parm_stack_boundary <= INCOMING_STACK_BOUNDARY);
> +  if (crtl->parm_stack_boundary > INCOMING_STACK_BOUNDARY)
> +{
> +  /* The incoming stack frame has to be aligned at least at
> +  parm_stack_boundary.  NB: The incoming stack frame alignment
> +  for main is fixed.  */
> +  if (expanding_main)
> + warning_at (DECL_SOURCE_LOCATION (current_function_decl),
> + OPT_Wmain, "argument alignment of %q+D is too large",
> + current_function_decl);
> +  else
> + gcc_unreachable ();
> +}

Could you do this instead in ix86_minimum_incoming_stack_boundary

  /* The incoming stack frame has to be aligned at least at
 parm_stack_boundary.  */
  if (incoming_stack_boundary < crtl->parm_stack_boundary)
incoming_stack_boundary = crtl->parm_stack_boundary;

  /* Stack at entrance of main is aligned by runtime.  We use the
 smallest incoming stack boundary. */
  if (incoming_stack_boundary > MAIN_STACK_BOUNDARY
  && DECL_NAME (current_function_decl)
  && MAIN_NAME_P (DECL_NAME (current_function_decl))
  && DECL_FILE_SCOPE_P (current_function_decl))
incoming_stack_boundary = MAIN_STACK_BOUNDARY;


maybe just repeat this after incoming_stack_boundary is set to
MAIN_STACK_BOUNDARY:

  /* The incoming stack frame has to be aligned at least at
 parm_stack_boundary.  */
  if (incoming_stack_boundary < crtl->parm_stack_boundary)
incoming_stack_boundary = crtl->parm_stack_boundary;

and print the warning here?


Thanks
Bernd.
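
The alternative proposed above can be sketched as a pure function: raise the
boundary to the parameter boundary, clamp it for main, then raise it again
and note that the warning should be issued.  The names below
(`struct boundary_result`, `minimum_incoming_boundary`) are hypothetical and
the boundaries are plain bit counts; this is not the i386 back-end code.

```c
#include <assert.h>
#include <stdbool.h>

struct boundary_result
{
  unsigned int incoming;       /* resulting incoming stack boundary */
  bool warn_main_alignment;    /* main's arguments need more alignment
                                  than the runtime provides */
};

struct boundary_result
minimum_incoming_boundary (unsigned int incoming, unsigned int parm_boundary,
                           unsigned int main_boundary, bool is_main)
{
  assert (main_boundary > 0);
  struct boundary_result res = { incoming, false };

  /* The incoming stack frame has to be aligned at least at the
     parameter boundary.  */
  if (res.incoming < parm_boundary)
    res.incoming = parm_boundary;

  /* The stack at the entrance of main is aligned by the runtime, so use
     the smallest incoming boundary there.  */
  if (is_main && res.incoming > main_boundary)
    res.incoming = main_boundary;

  /* Repeat the first check.  It can only trigger after the clamp for
     main, which is where the warning would be printed.  */
  if (res.incoming < parm_boundary)
    {
      res.incoming = parm_boundary;
      res.warn_main_alignment = true;
    }

  return res;
}
```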

Re: [PATCH] regcprop: Fix another cprop_hardreg bug [PR100342]

2021-05-13 Thread Jakub Jelinek via Gcc-patches

On Thu, May 13, 2021 at 05:37:36PM +0200, Jakub Jelinek wrote:
> So, do you want something like the patch below (I've deleted the old
> comment as I think the new one is enough, but am open to keeping both),
> where if REG_CAN_CHANGE_MODE_P is false, we punt (return), otherwise call
> set_value_regno?
> I am not sure if those REG_CAN_CHANGE_MODE_P arguments are what you want
> though.

Oops, I missed a !; I meant the following, which works on the 11 branch for
the testcase:

2021-05-13  Jakub Jelinek  

PR rtl-optimization/100342
* regcprop.c (copy_value): When copying a source reg in a wider
mode than it has recorded for the value, adjust recorded destination
mode too or punt if !REG_CAN_CHANGE_MODE_P.

* gcc.target/i386/pr100342.c: New test.

--- gcc/regcprop.c.jj   2021-03-23 10:21:07.176447920 +0100
+++ gcc/regcprop.c  2021-05-13 17:36:46.443192451 +0200
@@ -358,34 +358,25 @@ copy_value (rtx dest, rtx src, struct va
   else if (sn > hard_regno_nregs (sr, vd->e[sr].mode))
 return;
 
-  /* It is not safe to link DEST into the chain if SRC was defined in some
- narrower mode M and if M is also narrower than the mode of the first
- register in the chain.  For example:
- (set (reg:DI r1) (reg:DI r0))
- (set (reg:HI r2) (reg:HI r1))
- (set (reg:SI r3) (reg:SI r2)) //Should be a new chain start at r3
- (set (reg:SI r4) (reg:SI r1))
- (set (reg:SI r5) (reg:SI r4))
-
- the upper part of r3 is undefined.  If we added it to the chain,
- it may be used to replace r5, which has defined upper bits.
- See PR98694 for details.
-
- [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
- [B] partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mode)
- Condition B is added to to catch optimization opportunities of
-
- (set (reg:HI R1) (reg:HI R0))
- (set (reg:SI R2) (reg:SI R1)) // [A]
- (set (reg:DI R3) (reg:DI R2)) // [A]
- (set (reg:SI R4) (reg:SI R[0-3]))
- (set (reg:HI R5) (reg:HI R[0-4]))
-
- in which all registers have only 16 defined bits.  */
-  else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
-  && partial_subreg_p (vd->e[sr].mode,
-   vd->e[vd->e[sr].oldest_regno].mode))
-return;
+  /* If a narrower value is copied using wider mode, the upper bits
+ are undefined (could be e.g. a former paradoxical subreg).  Signal
+ in that case we've only copied value using the narrower mode.
+ Consider:
+ (set (reg:DI r14) (mem:DI ...))
+ (set (reg:QI si) (reg:QI r14))
+ (set (reg:DI bp) (reg:DI r14))
+ (set (reg:DI r14) (const_int ...))
+ (set (reg:DI dx) (reg:DI si))
+ (set (reg:DI si) (const_int ...))
+ (set (reg:DI dx) (reg:DI bp))
+ The last set is not redundant, while the low 8 bits of dx are already
+ equal to low 8 bits of bp, the other bits are undefined.  */
+  else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src)))
+{
+  if (!REG_CAN_CHANGE_MODE_P (sr, GET_MODE (src), vd->e[sr].mode))
+   return;
+  set_value_regno (dr, vd->e[sr].mode, vd);
+}
 
   /* Link DR at the end of the value chain used by SR.  */
 
--- gcc/testsuite/gcc.target/i386/pr100342.c.jj 2021-05-13 17:28:41.181460465 +0200
+++ gcc/testsuite/gcc.target/i386/pr100342.c 2021-05-13 17:28:41.181460465 +0200
@@ -0,0 +1,70 @@
+/* PR rtl-optimization/100342 */
+/* { dg-do run { target int128 } } */
+/* { dg-options "-O2 -fno-dse -fno-forward-propagate -Wno-psabi -mno-sse2" } */
+
+#define SHL(x, y) ((x) << ((y) & (sizeof(x) * 8 - 1)))
+#define SHR(x, y) ((x) >> ((y) & (sizeof(x) * 8 - 1)))
+#define ROR(x, y) (SHR(x, y)) | (SHL(x, (sizeof(x) * 8 - (y
+#define SHLV(x, y) ((x) << ((y) & (sizeof((x)[0]) * 8 - 1)))
+#define SHLSV(x, y) ((x) << ((y) & (sizeof((y)[0]) * 8 - 1)))
+typedef unsigned char A;
+typedef unsigned char __attribute__((__vector_size__ (8))) B;
+typedef unsigned char __attribute__((__vector_size__ (16))) C;
+typedef unsigned char __attribute__((__vector_size__ (32))) D;
+typedef unsigned char __attribute__((__vector_size__ (64))) E;
+typedef unsigned short F;
+typedef unsigned short __attribute__((__vector_size__ (16))) G;
+typedef unsigned int H;
+typedef unsigned int __attribute__((__vector_size__ (32))) I;
+typedef unsigned long long J;
+typedef unsigned long long __attribute__((__vector_size__ (8))) K;
+typedef unsigned long long __attribute__((__vector_size__ (32))) L;
+typedef unsigned long long __attribute__((__vector_size__ (64))) M;
+typedef unsigned __int128 N;
+typedef unsigned __int128 __attribute__((__vector_size__ (16))) O;
+typedef unsigned __int128 __attribute__((__vector_size__ (32))) P;
+typedef unsigned __int128 __attribute__((__vector_size__ (64))) Q;
+B v1;
+D v2;
+L v3;
+K v4;
+I v5;
+O v6;
+
+B
+foo (A a, C b, E c, F d, G e, H f, J g, M h, N i, P j, Q k)
+{
+  b &= (A) f;
+  k += a;
+  G l = e;
+  D m = v2 >= (A) (J) v1;
+  J r = a + g;
+  L n = v3 <= f;
+  k -= i / f;
+  l -=

Re: [PATCH] tsan: fix false positive for pthread_cond_clockwait

2021-05-13 Thread Michael de Lang via Gcc-patches

Thanks for updating LLVM to upstream. I've added the rebased patch below.

Met vriendelijke groet,
Michael de Lang

gcc/ChangeLog

* g++.dg/tsan/pthread_cond_clockwait.C: new testcase


diff --git a/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C
b/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C
new file mode 100644
index 000..82d6a5c8329
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C
@@ -0,0 +1,31 @@
+// Test pthread_cond_clockwait not generating false positives with tsan
+// { dg-do run { target { { *-*-linux* *-*-gnu* *-*-uclinux* } && pthread } } }
+// { dg-options "-fsanitize=thread -lpthread" }
+
+#include <pthread.h>
+
+pthread_cond_t cv;
+pthread_mutex_t mtx;
+
+void *fn(void *vp) {
+pthread_mutex_lock(&mtx);
+pthread_cond_signal(&cv);
+pthread_mutex_unlock(&mtx);
+return NULL;
+}
+
+int main() {
+pthread_mutex_lock(&mtx);
+
+pthread_t tid;
+pthread_create(&tid, NULL, fn, NULL);
+
+struct timespec ts;
+clock_gettime(CLOCK_MONOTONIC, &ts);
+ts.tv_sec += 10;
+pthread_cond_clockwait(&cv, &mtx, CLOCK_MONOTONIC, &ts);
+pthread_mutex_unlock(&mtx);
+
+pthread_join(tid, NULL);
+return 0;
+}

On Thu, 13 May 2021 at 09:33, Martin Liška  wrote:
>
> On 5/7/21 7:07 PM, Michael de Lang via Gcc-patches wrote:
> > pthread_cond_clockwait isn't added to TSAN_INTERCEPTORS which leads to
> > false positives regarding double locking of a mutex. This was
> > uncovered by a user reporting an issue to the google sanitizer github:
> > https://github.com/google/sanitizers/issues/1259
> >
> > This patch copies code from the fix made in llvm:
> > https://github.com/llvm/llvm-project/commit/16eb853ffdd1a1ad7c95455b7795c5f004402e46
>
> Hello.
>
> Thank you for looking into this.
>
> >
> > However, because the tsan related source code hasn't been kept in sync
> > with llvm, I had to make some modifications.
>
> We merge from master roughly twice a year. I've just merged LLVM upstream to 
> our master.
>
> >
> > Given that this is my first contribution to gcc, let me know if I've
> > missed anything.
>
> Please take a look at the following steps:
> https://gcc.gnu.org/contribute.html
>
> We still want your test-case, can you please resend the patch on the current 
> master?
>
> Thanks!
> Cheers,
> Martin
>
> >
> > Met vriendelijke groet,
> > Michael de Lang
> >
> > +++ b/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C
> > @@ -0,0 +1,31 @@
> > +// Test pthread_cond_clockwait not generating false positives with tsan
> > +// { dg-do run { target { { *-*-linux* *-*-gnu* *-*-uclinux* } && pthread 
> > } } }
> > +// { dg-options "-fsanitize=thread -lpthread" }
> > +
> > +#include <pthread.h>
> > +
> > +pthread_cond_t cv;
> > +pthread_mutex_t mtx;
> > +
> > +void *fn(void *vp) {
> > +pthread_mutex_lock(&mtx);
> > +pthread_cond_signal(&cv);
> > +pthread_mutex_unlock(&mtx);
> > +return NULL;
> > +}
> > +
> > +int main() {
> > +pthread_mutex_lock(&mtx);
> > +
> > +pthread_t tid;
> > +pthread_create(&tid, NULL, fn, NULL);
> > +
> > +struct timespec ts;
> > +clock_gettime(CLOCK_MONOTONIC, &ts);
> > +ts.tv_sec += 10;
> > +pthread_cond_clockwait(&cv, &mtx, CLOCK_MONOTONIC, &ts);
> > +pthread_mutex_unlock(&mtx);
> > +
> > +pthread_join(tid, NULL);
> > +return 0;
> > +}
> > diff --git a/libsanitizer/tsan/tsan_interceptors_posix.cpp
> > b/libsanitizer/tsan/tsan_interceptors_posix.cpp
> > index aa04d8dfb67..7b3d0a917de 100644
> > --- a/libsanitizer/tsan/tsan_interceptors_posix.cpp
> > +++ b/libsanitizer/tsan/tsan_interceptors_posix.cpp
> > @@ -1126,7 +1126,10 @@ struct CondMutexUnlockCtx {
> > ScopedInterceptor *si;
> > ThreadState *thr;
> > uptr pc;
> > +  void *c;
> > void *m;
> > +  void *abstime;
> > +  __sanitizer_clockid_t clock;
> >   };
> >
> >   static void cond_mutex_unlock(CondMutexUnlockCtx *arg) {
> > @@ -1152,19 +1155,18 @@ INTERCEPTOR(int, pthread_cond_init, void *c, void 
> > *a) {
> >   }
> >
> >   static int cond_wait(ThreadState *thr, uptr pc, ScopedInterceptor *si,
> > - int (*fn)(void *c, void *m, void *abstime), void *c,
> > - void *m, void *t) {
> > + int (*fn)(void *arg), void *c,
> > + void *m, void *t, __sanitizer_clockid_t clock) {
> > MemoryAccessRange(thr, pc, (uptr)c, sizeof(uptr), false);
> > MutexUnlock(thr, pc, (uptr)m);
> > -  CondMutexUnlockCtx arg = {si, thr, pc, m};
> > +  CondMutexUnlockCtx arg = {si, thr, pc, c, m, t, clock};
> > int res = 0;
> > // This ensures that we handle mutex lock even in case of 
> > pthread_cancel.
> > // See test/tsan/cond_cancel.cpp.
> > {
> >   // Enable signal delivery while the thread is blocked.
> >   BlockingCall bc(thr);
> > -res = call_pthread_cancel_with_cleanup(
> > -fn, c, m, t, (void (*)(void *arg))cond_mutex_unlock, &arg);
> > +res = call_pthread_cancel_with_cleanup(fn, (void (*)(void
> > *arg))cond_m

Re: [PATCH] avoid using an incompletely populated struct (PR 100574)

2021-05-13 Thread Bernd Edlinger

On 5/13/21 3:55 AM, Martin Sebor via Gcc-patches wrote:
> A logic bug in the handling of PHI arguments in compute_objsize
> that are all null pointers lets an incompletely populated struct
> be used in a way that triggers an assertion causing an ICE.
> 
> The attached patch corrects that by having compute_objsize fail
> when the struct isn't fully populated (when all of the PHI's
> arguments are null).
> 
> Martin

Martin,

I'm getting test failures with your patch here:

Running target unix/-m32
FAIL: g++.dg/pr100574.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/pr100574.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/pr100574.C  -std=gnu++2a (test for excess errors)

/home/ed/gnu/gcc-trunk/gcc/testsuite/g++.dg/pr100574.C:6:7: error: 'operator new' takes type 'size_t' ('unsigned int') as first parameter [-fpermissive]
compiler exited with status 1


Bernd.
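
The fix described in the quoted text (fail instead of handing back a struct
that no PHI argument has filled in) follows a common pattern, sketched here
in isolation.  `struct obj_ref` and `merge_phi_sizes` are hypothetical names
and the size bookkeeping is deliberately simplified; this is not the
compute_objsize implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct obj_ref
{
  bool populated;    /* set once any argument has contributed */
  size_t min_size;
  size_t max_size;
};

/* Merge the sizes of the objects that the PHI arguments point to.
   ARGS[i] is NULL for a null-pointer argument, which contributes
   nothing.  Return false and leave *OUT untouched when every argument
   is null, so the caller never sees a half-initialized result.  */
bool
merge_phi_sizes (const size_t *const *args, size_t nargs,
                 struct obj_ref *out)
{
  assert (out != NULL);
  struct obj_ref res = { false, 0, 0 };

  for (size_t i = 0; i < nargs; i++)
    {
      if (args[i] == NULL)
        continue;

      size_t size = *args[i];
      if (!res.populated)
        {
          res.populated = true;
          res.min_size = res.max_size = size;
        }
      else
        {
          if (size < res.min_size)
            res.min_size = size;
          if (size > res.max_size)
            res.max_size = size;
        }
    }

  if (!res.populated)
    return false;

  *out = res;
  return true;
}
```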

[committed] libphobos: Fix static asserts on NetBSD, FreeBSD, DragonflyBSD

2021-05-13 Thread Iain Buclaw via Gcc-patches

Hi,

This patch fixes a number of static asserts that were failing on NetBSD,
and the same would have been the case for FreeBSD and DragonFlyBSD as
well.  The function declarations were updated to use `const scope', but
the static asserts were not.

Bootstrapped and regression tested on x86_64-linux-gnu and
x86_64-netbsd, committed to mainline and backported to releases/gcc-11.

Regards,
Iain.
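
The failure mode fixed here, a declaration updated without the matching
signature assertion, is not specific to D.  As a rough C11 analogue (an
illustration only, unrelated to the druntime sources), a `static_assert`
over `_Generic` pins the type of a declaration so that a later prototype
change fails at compile time.  `my_dlopen` is a hypothetical function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical declaration standing in for a binding such as dlopen.  */
void *my_dlopen (const char *path, int mode);

/* Compile-time check that the declaration has the expected type.  If
   the prototype above changes and this line is not updated to match,
   the build fails, which is the situation the patch repairs on the D
   side.  */
static_assert (_Generic (&my_dlopen,
                         void *(*) (const char *, int): 1,
                         default: 0),
               "my_dlopen signature changed; update this assertion");

/* Dummy definition so the sketch links: returns a non-null handle for
   any non-null path.  */
void *
my_dlopen (const char *path, int mode)
{
  static int handle;
  (void) mode;
  return path != NULL ? &handle : NULL;
}
```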

---
libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime 98c6ff0c.
---
 libphobos/libdruntime/MERGE   |  2 +-
 .../libdruntime/core/sys/dragonflybsd/dlfcn.d | 15 ++-
 libphobos/libdruntime/core/sys/freebsd/dlfcn.d|  4 ++--
 libphobos/libdruntime/core/sys/netbsd/dlfcn.d | 15 ++-
 libphobos/libdruntime/core/sys/posix/dlfcn.d  |  4 ++--
 5 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/libphobos/libdruntime/MERGE b/libphobos/libdruntime/MERGE
index 25cbb955ba2..0d554e07098 100644
--- a/libphobos/libdruntime/MERGE
+++ b/libphobos/libdruntime/MERGE
@@ -1,4 +1,4 @@
-89f870b76710a4cfa96f711bb5b14a7439c5c2a7
+98c6ff0cf1241a0cfac196bf8a0523b1d4ecd3ac
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/druntime repository.
diff --git a/libphobos/libdruntime/core/sys/dragonflybsd/dlfcn.d 
b/libphobos/libdruntime/core/sys/dragonflybsd/dlfcn.d
index 1d3812fc55b..2c5d8d79c22 100644
--- a/libphobos/libdruntime/core/sys/dragonflybsd/dlfcn.d
+++ b/libphobos/libdruntime/core/sys/dragonflybsd/dlfcn.d
@@ -82,16 +82,13 @@ struct __dlfunc_arg {
 
 alias dlfunc_t = void function(__dlfunc_arg);
 
-private template __externC(RT, P...)
-{
-alias __externC = extern(C) RT function(P) nothrow @nogc @system;
-}
-
 /* XSI functions first. */
-static assert(is(typeof(&dlclose) == __externC!(int, void*)));
-static assert(is(typeof(&dlerror) == __externC!(char*)));
-static assert(is(typeof(&dlopen)  == __externC!(void*, const char*, int)));
-static assert(is(typeof(&dlsym)   == __externC!(void*, void*, const char*)));
+extern(C) {
+static assert(is(typeof(&dlclose) == int function(void*)));
+static assert(is(typeof(&dlerror) == char* function()));
+static assert(is(typeof(&dlopen)  == void* function(const scope char*, 
int)));
+static assert(is(typeof(&dlsym)   == void* function(void*, const scope 
char*)));
+}
 
 void*fdlopen(int, int);
 int  dladdr(const(void)*, Dl_info*);
diff --git a/libphobos/libdruntime/core/sys/freebsd/dlfcn.d 
b/libphobos/libdruntime/core/sys/freebsd/dlfcn.d
index fad91418e6d..7baacfeeb7b 100644
--- a/libphobos/libdruntime/core/sys/freebsd/dlfcn.d
+++ b/libphobos/libdruntime/core/sys/freebsd/dlfcn.d
@@ -90,8 +90,8 @@ static if (__BSD_VISIBLE)
 extern(C) {
 static assert(is(typeof(&dlclose) == int function(void*)));
 static assert(is(typeof(&dlerror) == char* function()));
-static assert(is(typeof(&dlopen)  == void* function(in char*, int)));
-static assert(is(typeof(&dlsym)   == void* function(void*, in char*)));
+static assert(is(typeof(&dlopen)  == void* function(const scope char*, 
int)));
+static assert(is(typeof(&dlsym)   == void* function(void*, const scope 
char*)));
 }
 
 static if (__BSD_VISIBLE)
diff --git a/libphobos/libdruntime/core/sys/netbsd/dlfcn.d 
b/libphobos/libdruntime/core/sys/netbsd/dlfcn.d
index dbbcc7638fd..468ffbfe435 100644
--- a/libphobos/libdruntime/core/sys/netbsd/dlfcn.d
+++ b/libphobos/libdruntime/core/sys/netbsd/dlfcn.d
@@ -87,16 +87,13 @@ static if (__BSD_VISIBLE)
 }
 }
 
-private template __externC(RT, P...)
-{
-alias __externC = extern(C) RT function(P) nothrow @nogc;
-}
-
 /* XSI functions first. */
-static assert(is(typeof(&dlclose) == __externC!(int, void*)));
-static assert(is(typeof(&dlerror) == __externC!(char*)));
-static assert(is(typeof(&dlopen)  == __externC!(void*, const char*, int)));
-static assert(is(typeof(&dlsym)   == __externC!(void*, void*, const char*)));
+extern(C) {
+static assert(is(typeof(&dlclose) == int function(void*)));
+static assert(is(typeof(&dlerror) == char* function()));
+static assert(is(typeof(&dlopen)  == void* function(const scope char*, 
int)));
+static assert(is(typeof(&dlsym)   == void* function(void*, const scope 
char*)));
+}
 
 static if (__BSD_VISIBLE)
 {
diff --git a/libphobos/libdruntime/core/sys/posix/dlfcn.d 
b/libphobos/libdruntime/core/sys/posix/dlfcn.d
index 2477e26dc53..f6476ec3106 100644
--- a/libphobos/libdruntime/core/sys/posix/dlfcn.d
+++ b/libphobos/libdruntime/core/sys/posix/dlfcn.d
@@ -158,8 +158,8 @@ else version (FreeBSD)
 
 int   dlclose(void*);
 char* dlerror();
-void* dlopen(in char*, int);
-void* dlsym(void*, in char*);
+void* dlopen(const scope char*, int);
+void* dlsym(void*, const scope char*);
 int   dladdr(const(void)* addr, Dl_info* info);
 
 struct Dl_info
-- 
2.27.0

Re: [PATCH] libsanitizer: merge from master

2021-05-13 Thread Martin Liška


On 5/13/21 5:54 PM, H.J. Lu wrote:

On Thu, May 13, 2021 at 09:28:01AM +0200, Martin Liška wrote:

I'm planning to do a merge from master twice a year.
This merge was tested on x86_64-linux-gnu and ppc64le-linux-gnu
and survives regression tests.

Pushed to master.
Thanks,
Martin

Merged revision: f58e0513dd95944b81ce7a6e7b49ba656de7d75f


On Linux/x86-64, I got

../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp: In function 'void __sanitizer::InitTlsSize()':
../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:209:55: error: invalid conversion from '__sanitizer::uptr*' {aka 'long unsigned int*'} to 'size_t*' {aka 'unsigned int*'} [-fpermissive]
   209 |   ((void (*)(size_t *, size_t *))get_tls_static_info)(&g_tls_size, &tls_align);
       |                                                       ^~~
       |                                                       |
       |                                                       __sanitizer::uptr* {aka long unsigned int*}


H.J.



Hm, I can't reproduce it:

/dev/shm/objdir/./gcc/xgcc -shared-libgcc -B/dev/shm/objdir/./gcc -nostdinc++ 
-L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/src 
-L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs 
-L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs 
-B/home/marxin/bin/gcc/x86_64-pc-linux-gnu/bin/ 
-B/home/marxin/bin/gcc/x86_64-pc-linux-gnu/lib/ -isystem 
/home/marxin/bin/gcc/x86_64-pc-linux-gnu/include -isystem 
/home/marxin/bin/gcc/x86_64-pc-linux-gnu/sys-include -D_GNU_SOURCE -D_DEBUG 
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 
-DHAVE_RPC_XDR_H=0 -DHAVE_TIRPC_RPC_XDR_H=0 -I. 
-I/home/marxin/Programming/gcc/libsanitizer/sanitizer_common -I.. -I 
/home/marxin/Programming/gcc/libsanitizer/include -I 
/home/marxin/Programming/gcc/libsanitizer -isystem 
/home/marxin/Programming/gcc/libsanitizer/include/system -Wall -W 
-Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC 
-fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer -funwind-tables 
-fvisibility=hidden -Wno-variadic-macros -I../../libstdc++-v3/include 
-I../../libstdc++-v3/include/x86_64-pc-linux-gnu 
-I/home/marxin/Programming/gcc/libsanitizer/../libstdc++-v3/libsupc++ 
-std=gnu++14 -fcf-protection -mshstk -DSANITIZER_LIBBACKTRACE 
-DSANITIZER_CP_DEMANGLE -I 
/home/marxin/Programming/gcc/libsanitizer/../libbacktrace -I ../libbacktrace -I 
/home/marxin/Programming/gcc/libsanitizer/../include -include 
/home/marxin/Programming/gcc/libsanitizer/libbacktrace/backtrace-rename.h -g 
-O2 -D_GNU_SOURCE -MT sanitizer_linux_libcdep.lo -MD -MP -MF 
.deps/sanitizer_linux_libcdep.Tpo -c 
/home/marxin/Programming/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
  -fPIC -DPIC -o .libs/sanitizer_linux_libcdep.o

Can you please show the full command line? And please attach a pre-processed
source file.
Thanks,
Martin

Re: [PATCH] attributes: target_clone expects a string argument

2021-05-13 Thread Martin Liška


On 5/13/21 5:12 PM, Martin Sebor wrote:

Since errors are higher priority than warnings I'd suggest making
this the first check, before the warnings above, (and adding a test
to verify that that's how it works).


Good idea. I've just pushed such a change.

Martin

Re: [PATCH] avoid using an incompletely populated struct (PR 100574)

2021-05-13 Thread Martin Sebor via Gcc-patches


On 5/13/21 11:20 AM, Bernd Edlinger wrote:

On 5/13/21 3:55 AM, Martin Sebor via Gcc-patches wrote:

A logic bug in the handling of PHI arguments in compute_objsize
that are all null pointers lets an incompletely populated struct
be used in a way that triggers an assertion causing an ICE.

The attached patch corrects that by having compute_objsize fail
when the struct isn't fully populated (when all of the PHI's
arguments are null).

Martin


Martin,

I'm getting test failures with your patch here:

Running target unix/-m32
FAIL: g++.dg/pr100574.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/pr100574.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/pr100574.C  -std=gnu++2a (test for excess errors)

/home/ed/gnu/gcc-trunk/gcc/testsuite/g++.dg/pr100574.C:6:7: error: 'operator new' takes type 'size_t' ('unsigned int') as first parameter [-fpermissive]
compiler exited with status 1


Thanks, I've just fixed it.

Martin




Bernd.

Re: [PUSHED] Skip out on processing __builtin_clz when varying.

2021-05-13 Thread Aldy Hernandez via Gcc-patches




On 5/12/21 5:08 PM, Jakub Jelinek wrote:

On Wed, May 12, 2021 at 05:01:00PM -0400, Aldy Hernandez via Gcc-patches wrote:


PR c/100521
* gimple-range.cc (range_of_builtin_call): Skip out on
  processing __builtin_clz when varying.
---
  gcc/gimple-range.cc | 2 +-
  gcc/testsuite/gcc.dg/pr100521.c | 8 
  2 files changed, 9 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.dg/pr100521.c

--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr100521.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+__builtin_clz (int a)


Is this intentional?  People shouldn't be redefining builtins...


Ughhh.  I don't think that's intentional.  For that matter, neither the 
current nor the old code is designed to deal with this, especially in this 
case when the builtin is being redefined with incompatible arguments.  That 
is, the above "builtin" has a signed integer as an argument, whereas the 
original builtin had an unsigned one.


In looking at the original vr-values code, I think this could use a 
cleanup.  First, ranges from range_of_expr are always numeric so we 
should adjust.  Also, the checks for non-zero were assuming the argument 
was unsigned, which in the above redefinition it clearly is not.  I've 
cleaned this up, so that it works either way, though perhaps we should _also_ 
bail on non-builtins. I don't know...this is before my time.
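
To make the signedness point concrete, here is a minimal sketch (not the GCC
code) of why testing only the lower bound against zero is sound for unsigned
arguments but not for signed ones.  SimpleRange and both helper functions are
hypothetical stand-ins for irange and range_includes_zero_p, assuming a single
closed interval [lo, hi]:

```cpp
#include <cassert>

// Hypothetical stand-in for a single-interval range [lo, hi].
struct SimpleRange {
  long lo;
  long hi;
};

// Old-style test: only sound when the type is unsigned, because then lo is
// the smallest possible value and lo != 0 implies every value is non-zero.
inline bool nonzero_assuming_unsigned(const SimpleRange &r) {
  return r.lo != 0;
}

// Sign-agnostic test: zero is excluded only if it lies outside [lo, hi].
inline bool excludes_zero(const SimpleRange &r) {
  return r.lo > 0 || r.hi < 0;
}
```

For the signed range [-5, 5] the first test reports "non-zero" even though 0
is in the range, while the second test does not.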


BTW, I've removed the following annoying idiom:

- int newmini = prec - 1 - wi::floor_log2 (r.upper_bound ());
- if (newmini == prec)

This is really a check for r.upper_bound() == 0, as floor_log2(0) 
returns -1.  It's confusing.
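
A small sketch of why the removed comparison is equivalent to a zero check.
floor_log2_u32 below is a hypothetical 32-bit model of wi::floor_log2 (assumed
here, as stated above, to return -1 for 0); it is not the wide-int
implementation:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of floor_log2 on a 32-bit value; returns -1 for 0.
inline int floor_log2_u32(std::uint32_t x) {
  int result = -1;
  while (x != 0) {
    x >>= 1;
    ++result;
  }
  return result;
}

// The expression from the removed idiom: for x > 0 this is the number of
// leading zero bits, and for x == 0 it evaluates to prec - 1 - (-1) == prec.
inline int clz_bound(std::uint32_t x, int prec = 32) {
  return prec - 1 - floor_log2_u32(x);
}
```

Since clz_bound can only equal prec when x is 0, "newmini == prec" and
"upper_bound == 0" test the same condition; the latter just says so directly.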


How does this look?  For reference, the original code where this all 
came from is 82b6d25d289195.


Thanks for pointing this out.
Aldy
>From f8a958e8028ed129558f9ad7ccf423c834d377bd Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Thu, 13 May 2021 13:47:41 -0400
Subject: [PATCH] Cleanup clz and ctz code in range_of_builtin_call.

gcc/ChangeLog:

	* gimple-range.cc (range_of_builtin_call): Cleanup clz and ctz
	code.
---
 gcc/gimple-range.cc | 43 ---
 1 file changed, 20 insertions(+), 23 deletions(-)

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 5b288d8e6a7..b33ba1c8099 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -736,33 +736,29 @@ range_of_builtin_call (range_query &query, irange &r, gcall *call)
 	}
 
   query.range_of_expr (r, arg, call);
-  // From clz of minimum we can compute result maximum.
-  if (r.constant_p () && !r.varying_p ())
+  if (!r.undefined_p ())
 	{
-	  int newmaxi = prec - 1 - wi::floor_log2 (r.lower_bound ());
-	  // Argument is unsigned, so do nothing if it is [0, ...] range.
-	  if (newmaxi != prec)
+	  // From clz of minimum we can compute result maximum.
+	  if (wi::gt_p (r.lower_bound (), 0, TYPE_SIGN (r.type ())))
+	{
+	  maxi = prec - 1 - wi::floor_log2 (r.lower_bound ());
+	  if (mini == -2)
+		mini = 0;
+	}
+	  else if (!range_includes_zero_p (&r))
 	{
 	  mini = 0;
-	  maxi = newmaxi;
+	  maxi = prec - 1;
 	}
-	}
-  else if (!range_includes_zero_p (&r))
-	{
-	  maxi = prec - 1;
-	  mini = 0;
-	}
-  if (mini == -2)
-	break;
-  // From clz of maximum we can compute result minimum.
-  if (r.constant_p ())
-	{
-	  int newmini = prec - 1 - wi::floor_log2 (r.upper_bound ());
-	  if (newmini == prec)
+	  if (mini == -2)
+	break;
+	  // From clz of maximum we can compute result minimum.
+	  wide_int max = r.upper_bound ();
+	  int newmini = prec - 1 - wi::floor_log2 (max);
+	  if (max == 0)
 	{
-	  // Argument range is [0, 0].  If CLZ_DEFINED_VALUE_AT_ZERO
-	  // is 2 with VALUE of prec, return [prec, prec], otherwise
-	  // ignore the range.
+	  // If CLZ_DEFINED_VALUE_AT_ZERO is 2 with VALUE of prec,
+	  // return [prec, prec], otherwise ignore the range.
 	  if (maxi == prec)
 		mini = prec;
 	}
@@ -803,7 +799,8 @@ range_of_builtin_call (range_query &query, irange &r, gcall *call)
   query.range_of_expr (r, arg, call);
   if (!r.undefined_p ())
 	{
-	  if (r.lower_bound () != 0)
+	  // If arg is non-zero, then use [0, prec - 1].
+	  if (!range_includes_zero_p (&r))
 	{
 	  mini = 0;
 	  maxi = prec - 1;
-- 
2.31.1

Re: [PATCH] avoid using an incompletely populated struct (PR 100574)

2021-05-13 Thread Jeff Law via Gcc-patches




On 5/12/2021 7:55 PM, Martin Sebor via Gcc-patches wrote:

A logic bug in the handling of PHI arguments in compute_objsize
that are all null pointers lets an incompletely populated struct
be used in a way that triggers an assertion causing an ICE.

The attached patch corrects that by having compute_objsize fail
when the struct isn't fully populated (when all of the PHI's
arguments are null).

Martin

gcc-100574.diff

PR middle-end/100574 - ICE: in size_remaining, at builtins.c

gcc/ChangeLog:

PR middle-end/100574
* builtins.c (access_ref::get_ref): Improve detection of PHIs with
all null arguments.

gcc/testsuite/ChangeLog:

PR middle-end/100574
* g++.dg/pr100574.C: New test.


OK

jeff

Re: [PATCH] IBM Z: Avoid bash-specific substitution in configure

2021-05-13 Thread Jeff Law via Gcc-patches




On 5/12/2021 9:12 AM, Marius Hillenbrand via Gcc-patches wrote:

Tested configure runs on NetBSD x86-64, Linux on x86-64 (with target s390x), and
on s390x.

Is the patch ok for master, and for gcc-11?

8<--8<--8<-

Fix a bootstrap error observed on NetBSD.

2021-05-12  Marius Hillenbrand  

gcc/ChangeLog:

PR bootstrap/100552
* configure.ac: Replace pattern substitution with call to sed.
* configure: Regenerate (tbd at time of commit).


OK

Jeff

Re: [PATCH] avoid a couple of missing -Wuninitialized (PR 98583, 93100)

2021-05-13 Thread Jeff Law via Gcc-patches




On 5/11/2021 1:49 PM, Martin Sebor via Gcc-patches wrote:

The attached change teaches the uninitialized pass about
__builtin_stack_restore and __builtin___asan_mark to avoid two
classes of -Wuninitialized false negatives.

Richard, you already approved the __builtin_stack_restore change
in the bug but I figured I'd submit a patch with both changes for
approval since they affect the same piece of code.

Martin

gcc-93100.diff

Avoid -Wuninitialized false negatives with sanitization and VLAs.

Resolves:
PR tree-optimization/93100 - gcc -fsanitize=address inhibits -Wuninitialized
PR middle-end/98583 - missing -Wuninitialized reading from a second VLA in its own block

gcc/ChangeLog:

PR tree-optimization/93100
PR middle-end/98583
* tree-ssa-uninit.c (check_defs):

gcc/testsuite/ChangeLog:

PR tree-optimization/93100
PR middle-end/98583
* g++.dg/warn/uninit-pr93100.C: New test.
* gcc.dg/uninit-pr93100.c: New test.
* gcc.dg/uninit-pr98583.c: New test.


OK.  I wonder if it would make sense to describe this property when we 
construct the builtin and check that property rather than each builtin 
we find over time.  Your call on whether or not to explore that.



Jeff

Re: [PATCH] Fix awk substr invocation in libgo buildsystem

2021-05-13 Thread Jeff Law via Gcc-patches




On 5/10/2021 3:32 AM, Christoph Höger wrote:

The awk script used a zero-based index which worked on surprisingly
many platforms. According to the man page, however, the function
expects one-based indexing.

For reference see this bug in the go git repository:

https://github.com/golang/go/issues/45843

Signed-off-by: Christoph Höger 
---
  ChangeLog | 4 
  libgo/mklinknames.awk | 2 +-
  2 files changed, 5 insertions(+), 1 deletion(-)


I believe this file is managed by the upstream golang project. So the 
submission should go to them.  The guidelines from the README file:


Contributing


To contribute patches to the files in this directory, please see
http://golang.org/doc/gccgo_contribute.html .

The master copy of these files is hosted at
http://code.google.com/p/gofrontend .  Changes to these files require
signing a Google contributor license agreement.  If you are the
copyright holder, you will need to agree to the individual contributor
license agreement at
http://code.google.com/legal/individual-cla-v1.0.html.  This agreement
can be completed online.

If your organization is the copyright holder, the organization will
need to agree to the corporate contributor license agreement at
http://code.google.com/legal/corporate-cla-v1.0.html.

If the copyright holder for your code has already completed the
agreement in connection with another Google open source project, it
does not need to be completed again.

--


Thanks,

Jeff

Re: [PATCH RFA] tree-iterator: C++11 range-for and tree_stmt_iterator

2021-05-13 Thread Jason Merrill via Gcc-patches


Ping.

On 5/1/21 12:29 PM, Jason Merrill wrote:

Like my recent patch to add ovl_range and lkp_range in the C++ front end,
this patch adds the tsi_range adaptor for using C++11 range-based 'for' with
a STATEMENT_LIST, e.g.

   for (tree stmt : tsi_range (stmt_list)) { ... }

This also involves adding some operators to tree_stmt_iterator that are
needed for range-for iterators, and should also be useful in code that uses
the iterators directly.

The patch updates the suitable loops in the C++ front end, but does not
touch any loops elsewhere in the compiler.
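
For readers unfamiliar with the pattern, here is a self-contained sketch of
the same idea outside GCC: an iterator that carries (node pointer, container)
and a small range class whose end() is the iterator with a null node pointer.
Node, List, NodeIterator, ListRange and sum_statements are hypothetical
stand-ins for tree_statement_list_node, STATEMENT_LIST, tree_stmt_iterator and
tsi_range; only the shape mirrors the patch.

```cpp
#include <cassert>

// Hypothetical doubly-linked statement node and list.
struct Node {
  int stmt;
  Node *prev;
  Node *next;
};

struct List {
  Node *head;
};

// Iterator with just the operators that range-for needs (plus operator--).
struct NodeIterator {
  Node *ptr;
  List *container;

  bool operator==(NodeIterator b) const {
    return b.ptr == ptr && b.container == container;
  }
  bool operator!=(NodeIterator b) const { return !(*this == b); }
  NodeIterator &operator++() { ptr = ptr->next; return *this; }
  NodeIterator &operator--() { ptr = ptr->prev; return *this; }
  int &operator*() { return ptr->stmt; }
};

// Range adaptor: begin() is the first node, end() is the null sentinel.
class ListRange {
  List *l;
 public:
  explicit ListRange(List *l) : l(l) {}
  NodeIterator begin() { return {l->head, l}; }
  NodeIterator end() { return {nullptr, l}; }
};

// Example consumer: walks the list with a range-based for loop.
inline int sum_statements(List *l) {
  int total = 0;
  for (int stmt : ListRange(l))
    total += stmt;
  return total;
}
```

Because the end iterator is simply "null pointer in the same container", an
empty list (null head) makes begin() == end() and the loop body never runs.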

gcc/ChangeLog:

* tree-iterator.h (struct tree_stmt_iterator): Add operator++,
operator--, operator*, operator==, and operator!=.
(class tsi_range): New.

gcc/cp/ChangeLog:

* constexpr.c (build_data_member_initialization): Use tsi_range.
(build_constexpr_constructor_member_initializers): Likewise.
(constexpr_fn_retval, cxx_eval_statement_list): Likewise.
(potential_constant_expression_1): Likewise.
* coroutines.cc (await_statement_expander): Likewise.
(await_statement_walker): Likewise.
* module.cc (trees_out::core_vals): Likewise.
* pt.c (tsubst_expr): Likewise.
* semantics.c (set_cleanup_locs): Likewise.
---
  gcc/tree-iterator.h  | 28 +++-
  gcc/cp/constexpr.c   | 42 ++
  gcc/cp/coroutines.cc | 10 --
  gcc/cp/module.cc |  5 ++---
  gcc/cp/pt.c  |  5 ++---
  gcc/cp/semantics.c   |  5 ++---
  6 files changed, 47 insertions(+), 48 deletions(-)

diff --git a/gcc/tree-iterator.h b/gcc/tree-iterator.h
index 076fff8644c..f57456bb473 100644
--- a/gcc/tree-iterator.h
+++ b/gcc/tree-iterator.h
@@ -1,4 +1,4 @@
-/* Iterator routines for manipulating GENERIC tree statement list.
+/* Iterator routines for manipulating GENERIC tree statement list. -*- C++ -*-
 Copyright (C) 2003-2021 Free Software Foundation, Inc.
 Contributed by Andrew MacLeod  
  
@@ -32,6 +32,13 @@ along with GCC; see the file COPYING3.  If not see

  struct tree_stmt_iterator {
struct tree_statement_list_node *ptr;
tree container;
+
+  bool operator== (tree_stmt_iterator b) const
+{ return b.ptr == ptr && b.container == container; }
+  bool operator!= (tree_stmt_iterator b) const { return !(*this == b); }
+  tree_stmt_iterator &operator++ () { ptr = ptr->next; return *this; }
+  tree_stmt_iterator &operator-- () { ptr = ptr->prev; return *this; }
+  tree &operator* () { return ptr->stmt; }
  };
  
  static inline tree_stmt_iterator

@@ -71,27 +78,38 @@ tsi_one_before_end_p (tree_stmt_iterator i)
  static inline void
  tsi_next (tree_stmt_iterator *i)
  {
-  i->ptr = i->ptr->next;
+  ++(*i);
  }
  
  static inline void

  tsi_prev (tree_stmt_iterator *i)
  {
-  i->ptr = i->ptr->prev;
+  --(*i);
  }
  
  static inline tree *

  tsi_stmt_ptr (tree_stmt_iterator i)
  {
-  return &i.ptr->stmt;
+  return &(*i);
  }
  
  static inline tree

  tsi_stmt (tree_stmt_iterator i)
  {
-  return i.ptr->stmt;
+  return *i;
  }
  
+/* Make tree_stmt_iterator work as a C++ range, e.g.

+   for (tree stmt : tsi_range (stmt_list)) { ... }  */
+class tsi_range
+{
+  tree t;
+ public:
+  tsi_range (tree t): t(t) { }
+  tree_stmt_iterator begin() { return tsi_start (t); }
+  tree_stmt_iterator end() { return { nullptr, t }; }
+};
+
  enum tsi_iterator_update
  {
TSI_NEW_STMT,   /* Only valid when single statement is added, 
move
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 9481a5bfd3c..260b0122f59 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -330,12 +330,9 @@ build_data_member_initialization (tree t, 
vec **vec)
  return false;
if (TREE_CODE (t) == STATEMENT_LIST)
  {
-  tree_stmt_iterator i;
-  for (i = tsi_start (t); !tsi_end_p (i); tsi_next (&i))
-   {
- if (! build_data_member_initialization (tsi_stmt (i), vec))
-   return false;
-   }
+  for (tree stmt : tsi_range (t))
+   if (! build_data_member_initialization (stmt, vec))
+ return false;
return true;
  }
if (TREE_CODE (t) == CLEANUP_STMT)
@@ -577,10 +574,9 @@ build_constexpr_constructor_member_initializers (tree 
type, tree body)
break;
  
case STATEMENT_LIST:

-   for (tree_stmt_iterator i = tsi_start (body);
-!tsi_end_p (i); tsi_next (&i))
+   for (tree stmt : tsi_range (body))
  {
-   body = tsi_stmt (i);
+   body = stmt;
if (TREE_CODE (body) == BIND_EXPR)
  break;
  }
@@ -617,10 +613,9 @@ build_constexpr_constructor_member_initializers (tree 
type, tree body)
  }
else if (TREE_CODE (body) == STATEMENT_LIST)
  {
-  tree_stmt_iterator i;
-  for (i = tsi_start (body); !tsi_end_p (i); tsi_next (&i))
+  for (tree stmt : tsi_range (body))
{
- ok = build_data_member_initialization (tsi_stmt (i), &vec);
+ ok

Re: [PATCH RFA (diagnostic)] c++: -Wdeprecated-copy and #pragma diagnostic [PR94492]

2021-05-13 Thread Jason Merrill via Gcc-patches


Ping.

On 4/28/21 9:32 AM, Jason Merrill wrote:

  -Wdeprecated-copy was depending only on the state of the warning at the
point where we call the function, making it hard to use #pragma diagnostic
to suppress the warning for a particular implicitly declared function.

But checking whether the warning is enabled at the location of the implicit
declaration turned out to be a bit complicated; option_enabled only tests
whether it was enabled at the start of compilation, the actual test only
existed in the middle of diagnostic_report_diagnostic.  So this patch
factors it out and adds a new warning_enabled function to diagnostic.h.
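
As an illustration of the use case (a sketch, not the depr-copy4.C test from
the patch): the pragma wraps the class definition, which is where the
implicitly declared copy constructor lives, rather than the place where the
copy happens.  Legacy and copy_value are hypothetical names, and whether a
given compiler version actually honors the pragma at this location is exactly
what the patch is about.

```cpp
#include <cassert>

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-copy"
struct Legacy {
  int value = 0;
  // A user-provided copy assignment operator makes the implicitly declared
  // copy constructor deprecated, which is what -Wdeprecated-copy reports.
  Legacy &operator=(const Legacy &other) {
    value = other.value;
    return *this;
  }
};
#pragma GCC diagnostic pop

// The copy itself happens here, outside the pragma region.
inline int copy_value(const Legacy &src) {
  Legacy dst = src;  // uses the implicitly declared copy constructor
  return dst.value;
}
```

With the warning state checked only at the call site, the pragma around
Legacy would have no effect on the copy in copy_value.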

Tested x86_64-pc-linux-gnu, OK for trunk?

gcc/ChangeLog:

PR c++/94492
* diagnostic.h (warning_enabled): Declare.
* diagnostic.c (diagnostic_enabled): Factor out from...
(diagnostic_report_diagnostic): ...here.
(warning_enabled): New.

gcc/cp/ChangeLog:

PR c++/94492
* decl2.c (cp_warn_deprecated_use): Check warning_enabled.

gcc/testsuite/ChangeLog:

PR c++/94492
* g++.dg/cpp0x/depr-copy4.C: New test.
---
  gcc/diagnostic.h|  2 +
  gcc/cp/decl2.c  |  8 +--
  gcc/diagnostic.c| 85 +
  gcc/testsuite/g++.dg/cpp0x/depr-copy4.C | 16 +
  4 files changed, 80 insertions(+), 31 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/depr-copy4.C

diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 9a6eefcf918..caa97da2df9 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -515,4 +515,6 @@ extern int num_digits (int);
  extern json::value *json_from_expanded_location (diagnostic_context *context,
 location_t loc);
  
+extern bool warning_enabled (int, location_t = input_location);

+
  #endif /* ! GCC_DIAGNOSTIC_H */
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index a82960fb39c..03b7a68aba2 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -5495,10 +5495,10 @@ cp_warn_deprecated_use (tree decl, tsubst_flags_t 
complain)
&& DECL_NONSTATIC_MEMBER_FUNCTION_P (decl)
&& copy_fn_p (decl))
  {
-  if (warn_deprecated_copy
- /* Don't warn about system library classes (c++/86342).  */
- && (!DECL_IN_SYSTEM_HEADER (decl)
- || global_dc->dc_warn_system_headers))
+  /* Don't warn if the flag was disabled around the class definition
+(c++/94492).  */
+  if (warning_enabled (OPT_Wdeprecated_copy,
+  DECL_SOURCE_LOCATION (decl)))
{
  auto_diagnostic_group d;
  tree ctx = DECL_CONTEXT (decl);
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 246d75256cf..278ec8b706f 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -1122,6 +1122,62 @@ print_option_information (diagnostic_context *context,
  }
  }
  
+/* Returns whether a DIAGNOSTIC should be printed, and adjusts diagnostic->kind

+   as appropriate.  */
+
+static bool
+diagnostic_enabled (diagnostic_context *context,
+   diagnostic_info *diagnostic)
+{
+  /* Diagnostics with no option or -fpermissive are always enabled.  */
+  if (!diagnostic->option_index
+  || diagnostic->option_index == permissive_error_option (context))
+return true;
+
+  /* This tests if the user provided the appropriate -Wfoo or
+ -Wno-foo option.  */
+  if (! context->option_enabled (diagnostic->option_index,
+context->lang_mask,
+context->option_state))
+return false;
+
+  /* This tests for #pragma diagnostic changes.  */
+  diagnostic_t diag_class
+= update_effective_level_from_pragmas (context, diagnostic);
+
+  /* This tests if the user provided the appropriate -Werror=foo
+ option.  */
+  if (diag_class == DK_UNSPECIFIED
+  && (context->classify_diagnostic[diagnostic->option_index]
+ != DK_UNSPECIFIED))
+diagnostic->kind
+  = context->classify_diagnostic[diagnostic->option_index];
+
+  /* This allows for future extensions, like temporarily disabling
+ warnings for ranges of source code.  */
+  if (diagnostic->kind == DK_IGNORED)
+return false;
+
+  return true;
+}
+
+/* Returns whether warning OPT is enabled at LOC.  */
+
+bool
+warning_enabled (int opt, location_t loc)
+{
+  if (!diagnostic_report_warnings_p (global_dc, loc))
+return false;
+
+  rich_location richloc (line_table, loc);
+  diagnostic_info diagnostic = {};
+  diagnostic.option_index = opt;
+  diagnostic.richloc = &richloc;
+  diagnostic.message.m_richloc = &richloc;
+  diagnostic.kind = DK_WARNING;
+  return diagnostic_enabled (global_dc, &diagnostic);
+}
+
  /* Report a diagnostic message (an error or a warning) as specified by
 DC.  This function is *the* subroutine in terms of which front-ends
 should implement their specific diagnostic handling modules.  The
@@ -1172,33 +1228,8 @@ diagnostic_report_diagnosti

Re: [PATCH] Remove unused variable.

2021-05-13 Thread Aldy Hernandez via Gcc-patches

On Thu, May 13, 2021 at 9:43 AM Jeff Law via Gcc-patches
 wrote:
>
>
> On 5/13/2021 3:06 AM, Martin Liška wrote:
> > Addresses the following clang warning:
> > gcc/tree-ssa-dom.c:652:33: warning: private field 'm_simplifier' is
> > not used [-Wunused-private-field]
> >
> > Ready for master?
> > Thanks
> >
> > gcc/ChangeLog:
> >
> > * tree-ssa-dom.c: Remove m_simplifier.
>
> I wonder if Aldy's refactor accidentally dropped the ephemeral
> simplifications based on the threading path equivalences which is what I
> think this is supposed to be supporting.  Given that the refactor didn't
> cause any regressions, if that capability did get dropped, it couldn't
> be too important anymore.

The dom_opt_dom_walker is instantiated with a threader, which contains
a simplifier.  So it got shuffled around, but the functionality should
still be there.

Thanks for catching this Martin.

Aldy

Re: [PATCH] libsanitizer: merge from master

2021-05-13 Thread H.J. Lu via Gcc-patches

On Thu, May 13, 2021 at 10:27 AM Martin Liška  wrote:
>
> On 5/13/21 5:54 PM, H.J. Lu wrote:
> > On Thu, May 13, 2021 at 09:28:01AM +0200, Martin Liška wrote:
> >> I'm planning to do merge from master twice a year.
> >> This merge was tested on x86_64-linux-gnu and ppc64le-linux-gnu
> >> and survives regression tests.
> >>
> >> Pushed to master.
> >> Thanks,
> >> Martin
> >>
> >> Merged revision: f58e0513dd95944b81ce7a6e7b49ba656de7d75f
> >
> > On Linux/x86-64, I got
> >
> > ../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:
> >  In function 'void __sanitizer::InitTlsSize()':
> > ../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:209:55:
> >  error: invalid conversion from '__sanitizer::uptr*' {aka 'long unsigned 
> > int*'} to 'size_t*' {aka 'unsigned int*'} [-fpermissive]
> >209 |   ((void (*)(size_t *, size_t *))get_tls_static_info)(&g_tls_size, 
> > &tls_align);
> >|   ^~~
> >|   |
> >|   
> > __sanitizer::uptr* {aka long unsigned int*}
> >
> >
> > H.J.
> >
>
> Hm, I can't reproduce it:
>
> /dev/shm/objdir/./gcc/xgcc -shared-libgcc -B/dev/shm/objdir/./gcc -nostdinc++ 
> -L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/src 
> -L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs 
> -L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs 
> -B/home/marxin/bin/gcc/x86_64-pc-linux-gnu/bin/ 
> -B/home/marxin/bin/gcc/x86_64-pc-linux-gnu/lib/ -isystem 
> /home/marxin/bin/gcc/x86_64-pc-linux-gnu/include -isystem 
> /home/marxin/bin/gcc/x86_64-pc-linux-gnu/sys-include -D_GNU_SOURCE -D_DEBUG 
> -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 
> -DHAVE_RPC_XDR_H=0 -DHAVE_TIRPC_RPC_XDR_H=0 -I. 
> -I/home/marxin/Programming/gcc/libsanitizer/sanitizer_common -I.. -I 
> /home/marxin/Programming/gcc/libsanitizer/include -I 
> /home/marxin/Programming/gcc/libsanitizer -isystem 
> /home/marxin/Programming/gcc/libsanitizer/include/system -Wall -W 
> -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC 
> -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer -funwind-tables 
> -fvisibility=hidden -Wno-variadic-macros -I../../libstdc++-v3/include 
> -I../../libstdc++-v3/include/x86_64-pc-linux-gnu 
> -I/home/marxin/Programming/gcc/libsanitizer/../libstdc++-v3/libsupc++ 
> -std=gnu++14 -fcf-protection -mshstk -DSANITIZER_LIBBACKTRACE 
> -DSANITIZER_CP_DEMANGLE -I 
> /home/marxin/Programming/gcc/libsanitizer/../libbacktrace -I ../libbacktrace 
> -I /home/marxin/Programming/gcc/libsanitizer/../include -include 
> /home/marxin/Programming/gcc/libsanitizer/libbacktrace/backtrace-rename.h -g 
> -O2 -D_GNU_SOURCE -MT sanitizer_linux_libcdep.lo -MD -MP -MF 
> .deps/sanitizer_linux_libcdep.Tpo -c 
> /home/marxin/Programming/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
>   -fPIC -DPIC -o .libs/sanitizer_linux_libcdep.o
>
> Can you please show full command line? And please attach a pre-processed 
> source file.
> Thanks,
> Martin

The problem is -mx32 where size_t == unsigned int, not unsigned long int.

-- 
H.J.
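
A minimal sketch of the mismatch and one portable way around it, assuming
(as in the error message above) that uptr is unsigned long while size_t is
unsigned int on the affected target.  The names uptr, fake_tls_static_info
and query_tls_info are hypothetical; this is not the libsanitizer code or
necessarily the fix that was applied upstream.

```cpp
#include <cassert>
#include <cstddef>

using uptr = unsigned long;  // hypothetical stand-in for __sanitizer::uptr

// Hypothetical callee with the same shape as the function reached through
// the cast in the failing line: it writes through two size_t pointers.
inline void fake_tls_static_info(std::size_t *size, std::size_t *align) {
  *size = 4096;
  *align = 64;
}

// Passing uptr* where size_t* is expected is ill-formed whenever the two are
// distinct types, so go through size_t temporaries and convert the values.
inline void query_tls_info(uptr *size_out, uptr *align_out) {
  std::size_t size = 0;
  std::size_t align = 0;
  fake_tls_static_info(&size, &align);
  *size_out = static_cast<uptr>(size);
  *align_out = static_cast<uptr>(align);
}
```

The value conversion compiles regardless of whether size_t and uptr happen to
be the same type on the target, which the direct pointer pass does not.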

Re: [PATCH] libsanitizer: merge from master

2021-05-13 Thread H.J. Lu via Gcc-patches

On Thu, May 13, 2021 at 1:01 PM H.J. Lu  wrote:
>
> On Thu, May 13, 2021 at 10:27 AM Martin Liška  wrote:
> >
> > On 5/13/21 5:54 PM, H.J. Lu wrote:
> > > On Thu, May 13, 2021 at 09:28:01AM +0200, Martin Liška wrote:
> > >> I'm planning to do merge from master twice a year.
> > >> This merge was tested on x86_64-linux-gnu and ppc64le-linux-gnu
> > >> and survives regression tests.
> > >>
> > >> Pushed to master.
> > >> Thanks,
> > >> Martin
> > >>
> > >> Merged revision: f58e0513dd95944b81ce7a6e7b49ba656de7d75f
> > >
> > > On Linux/x86-64, I got
> > >
> > > ../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:
> > > >  In function 'void __sanitizer::InitTlsSize()':
> > > > ../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:209:55:
> > > >  error: invalid conversion from '__sanitizer::uptr*' {aka 'long 
> > > > unsigned int*'} to 'size_t*' {aka 'unsigned int*'} [-fpermissive]
> > >209 |   ((void (*)(size_t *, size_t 
> > > *))get_tls_static_info)(&g_tls_size, &tls_align);
> > >|   ^~~
> > >|   |
> > >|   
> > > __sanitizer::uptr* {aka long unsigned int*}
> > >
> > >
> > > H.J.
> > >
> >
> > Hm, I can't reproduce it:
> >
> > /dev/shm/objdir/./gcc/xgcc -shared-libgcc -B/dev/shm/objdir/./gcc 
> > -nostdinc++ -L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/src 
> > -L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs 
> > -L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs 
> > -B/home/marxin/bin/gcc/x86_64-pc-linux-gnu/bin/ 
> > -B/home/marxin/bin/gcc/x86_64-pc-linux-gnu/lib/ -isystem 
> > /home/marxin/bin/gcc/x86_64-pc-linux-gnu/include -isystem 
> > /home/marxin/bin/gcc/x86_64-pc-linux-gnu/sys-include -D_GNU_SOURCE -D_DEBUG 
> > -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 
> > -DHAVE_RPC_XDR_H=0 -DHAVE_TIRPC_RPC_XDR_H=0 -I. 
> > -I/home/marxin/Programming/gcc/libsanitizer/sanitizer_common -I.. -I 
> > /home/marxin/Programming/gcc/libsanitizer/include -I 
> > /home/marxin/Programming/gcc/libsanitizer -isystem 
> > /home/marxin/Programming/gcc/libsanitizer/include/system -Wall -W 
> > -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC 
> > -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer -funwind-tables 
> > -fvisibility=hidden -Wno-variadic-macros -I../../libstdc++-v3/include 
> > -I../../libstdc++-v3/include/x86_64-pc-linux-gnu 
> > -I/home/marxin/Programming/gcc/libsanitizer/../libstdc++-v3/libsupc++ 
> > -std=gnu++14 -fcf-protection -mshstk -DSANITIZER_LIBBACKTRACE 
> > -DSANITIZER_CP_DEMANGLE -I 
> > /home/marxin/Programming/gcc/libsanitizer/../libbacktrace -I 
> > ../libbacktrace -I /home/marxin/Programming/gcc/libsanitizer/../include 
> > -include 
> > /home/marxin/Programming/gcc/libsanitizer/libbacktrace/backtrace-rename.h 
> > -g -O2 -D_GNU_SOURCE -MT sanitizer_linux_libcdep.lo -MD -MP -MF 
> > .deps/sanitizer_linux_libcdep.Tpo -c 
> > /home/marxin/Programming/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
> >   -fPIC -DPIC -o .libs/sanitizer_linux_libcdep.o
> >
> > Can you please show full command line? And please attach a pre-processed 
> > source file.
> > Thanks,
> > Martin
>
> The problem is -mx32 where size_t == unsigned int, not unsigned long int.
>

I am testing this patch:

diff --git a/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
b/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
index da19d3d2ceb..4f9577a97e2 100644
--- a/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
+++ b/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
@@ -197,7 +197,7 @@ __attribute__((unused)) static bool
GetLibcVersion(int *major, int *minor,
 __attribute__((unused)) static int g_use_dlpi_tls_data;

 #if SANITIZER_GLIBC && !SANITIZER_GO
-__attribute__((unused)) static uptr g_tls_size;
+__attribute__((unused)) static size_t g_tls_size;
 void InitTlsSize() {
   int major, minor, patch;
   g_use_dlpi_tls_data =


-- 
H.J.

[committed] libgcc: pru: Place mpyll into its own section

2021-05-13 Thread Dimitar Dimitrov

Pushed as obvious.

This should help LD's --gc-sections feature to reduce final ELF size.

libgcc/ChangeLog:

* config/pru/mpyll.S (__pruabi_mpyll): Place into own section.

Signed-off-by: Dimitar Dimitrov 
---
 libgcc/config/pru/mpyll.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libgcc/config/pru/mpyll.S b/libgcc/config/pru/mpyll.S
index 1aa12a63b2c..cd093bb6b72 100644
--- a/libgcc/config/pru/mpyll.S
+++ b/libgcc/config/pru/mpyll.S
@@ -29,6 +29,8 @@
 
 #include "pru-asm.h"
 
+   .section .text.__pruabi_mpyll, "ax"
+
.global SYM(__pruabi_mpyll)
FUNC(__pruabi_mpyll)
 SYM(__pruabi_mpyll):
-- 
2.20.1
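
For readers more used to compiled code than hand-written assembly, the same idea can be sketched in C++. This assumes an ELF target and the GCC/Clang attribute syntax; `mul64_sketch` and its section name are hypothetical and are not part of libgcc. Compilers do this per function when `-ffunction-sections` is given, but an assembly file has to name the section by hand, as the patch above does.

```cpp
// Place one function into its own named section so that a linker run with
// --gc-sections can drop it when nothing references it.
extern "C" __attribute__((section(".text.mul64_sketch"), noinline))
unsigned long long mul64_sketch(unsigned long long a, unsigned long long b) {
  return a * b;
}
```

The function behaves as usual when called; only its placement in the object file changes.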

[PATCH] Bail in bounds_of_var_in_loop if scev returns NULL.

2021-05-13 Thread Aldy Hernandez via Gcc-patches

Both initial_condition_in_loop_num and evolution_part_in_loop_num
can return NULL.  This patch exits if either one is NULL.  Presumably
this didn't happen before, because adjust_range_with_scev was called
far less frequently than in ranger, which can call it for every PHI.

OK pending tests?

gcc/ChangeLog:

PR tree-optimization/100349
* vr-values.c (bounds_of_var_in_loop): Bail if scev returns
  NULL.

gcc/testsuite/ChangeLog:

* gcc.dg/pr100349.c: New test.
---
 gcc/testsuite/gcc.dg/pr100349.c | 16 
 gcc/vr-values.c |  3 +++
 2 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr100349.c

diff --git a/gcc/testsuite/gcc.dg/pr100349.c b/gcc/testsuite/gcc.dg/pr100349.c
new file mode 100644
index 000..dd7977ac0f9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr100349.c
@@ -0,0 +1,16 @@
+// { dg-do compile }
+// { dg-options "-O2 -w" }
+
+#include 
+uint8_t a;
+b(int8_t c) {
+  int d;
+e:
+  uint32_t f;
+  for (;;)
+for (c = 10; c; c++)
+  if (0 > (a = c) ?: d) {
+f = a;
+goto e;
+  }
+}
diff --git a/gcc/vr-values.c b/gcc/vr-values.c
index 08b237b2632..b1bf53af9e0 100644
--- a/gcc/vr-values.c
+++ b/gcc/vr-values.c
@@ -1650,6 +1650,9 @@ bounds_of_var_in_loop (tree *min, tree *max, range_query 
*query,
   init = initial_condition_in_loop_num (chrec, loop->num);
   step = evolution_part_in_loop_num (chrec, loop->num);
 
+  if (!init || !step)
+return false;
+
   /* If INIT is an SSA with a singleton range, set INIT to said
  singleton, otherwise leave INIT alone.  */
   if (TREE_CODE (init) == SSA_NAME)
-- 
2.31.1
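
The shape of the fix, an early exit when either helper yields no answer, can be sketched independently of GCC internals. Everything below is hypothetical: `Node`, `initial_condition`, `evolution_part` and `bounds_sketch` are stand-ins invented for illustration and do not mirror the real scev API.

```cpp
// Hypothetical stand-in for a tree node.
struct Node {
  int value;
};

// Hypothetical analysis helpers: either may return nullptr when it cannot
// compute an answer, like the two scev functions named in the message.
inline const Node *initial_condition(const Node *chrec) { return chrec; }

inline const Node *evolution_part(const Node *chrec) {
  return (chrec && chrec->value != 0) ? chrec : nullptr;
}

// Computes [min, max] from the initial value and one step.
// Returns false without touching the outputs if either input is missing.
inline bool bounds_sketch(const Node *chrec, int *min, int *max) {
  const Node *init = initial_condition(chrec);
  const Node *step = evolution_part(chrec);

  // Bail before the first dereference, as the patch does.
  if (!init || !step)
    return false;

  *min = init->value;
  *max = init->value + step->value;
  return true;
}
```

The check sits directly after the two calls so that no later code path can dereference a null result.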

[pushed] libsanitizer, Darwin : Handle missing __builtin_os_log_format.

2021-05-13 Thread Iain Sandoe

Hi,

The update to libsanitizer broke bootstrap on Darwin, since the upstream
sources assume building with clang on Darwin and make use of a header
depending on a builtin that GCC does not currently implement.

This patch fixes the build.  Whether to implement the missing builtin is
something to consider another day.

bootstrapped on Darwin16 and smoke-tested the sanitizer,
pushed to master,
thanks
Iain

I pushed also r12-781-g1f6fc2826d19136bb5ab97a4bdac07e6736b6869 which
adds this patch to libsanitizer LOCAL_PATCHES.

=


GCC does not, currently, define __builtin_os_log_format, which
is needed by os/log.h.  Do not include that header unless the
builtin is defined (since the header errors out on the same
condition).  Provide a work-around solution to the missing API
provided via the header.

libsanitizer/ChangeLog:

* sanitizer_common/sanitizer_mac.cpp : Check for the
availability of __builtin_os_log_format before trying to
include a header depending on it.
(OS_LOG_DEFAULT): New.
(os_log_error): Define to a fall-back using an older API.
---
 libsanitizer/sanitizer_common/sanitizer_mac.cpp | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_mac.cpp 
b/libsanitizer/sanitizer_common/sanitizer_mac.cpp
index f455856c85d..30a94fcba14 100644
--- a/libsanitizer/sanitizer_common/sanitizer_mac.cpp
+++ b/libsanitizer/sanitizer_common/sanitizer_mac.cpp
@@ -70,7 +70,15 @@ extern "C" {
 #include 
 #include 
 #include 
-#include 
+#if defined(__has_builtin) && __has_builtin(__builtin_os_log_format)
+# include 
+#else
+   /* Without support for __builtin_os_log_format, fall back to the older
+  method.  */
+# define OS_LOG_DEFAULT 0
+# define os_log_error(A,B,C) \
+  asl_log(nullptr, nullptr, ASL_LEVEL_ERR, "%s", (C));
+#endif
 #include 
 #include 
 #include 
-- 
2.24.1

[COMMIT] wwwdocs: Document devel/omp/gcc-11

2021-05-13 Thread Kwok Cheung Yeung


Hello

I have pushed the devel/omp/gcc-11 branch to the git repo as the development 
branch for new OpenMP, OpenACC and offloading functionality, based on the GCC 11 
branch.


I have committed this patch to update the git doc page to point to the new 
branch as the active OMP development branch, and have moved devel/omp/gcc-10 to 
the list of inactive branches.


Kwok
commit 8a006e10264a471a8f9ece2ce3720eff0910f77d
Author: Kwok Cheung Yeung 
Date:   Thu May 13 22:09:36 2021 +0100

Document devel/omp/gcc-11 branch

This also moves the old devel/omp/gcc-10 branch to the inactive branches
section next to devel/omp/gcc-9.

diff --git a/htdocs/git.html b/htdocs/git.html
index 8edde126..2bbfc334 100644
--- a/htdocs/git.html
+++ b/htdocs/git.html
@@ -280,15 +280,15 @@ in Git.
   Makarov mailto:vmaka...@redhat.com";>vmaka...@redhat.com.
   
 
-  https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-10";>devel/omp/gcc-10
+  https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-11";>devel/omp/gcc-11
   This branch is for collaborative development of
   https://gcc.gnu.org/wiki/OpenACC";>OpenACC and
   https://gcc.gnu.org/wiki/openmp";>OpenMP support and related
   functionality, such
   as https://gcc.gnu.org/wiki/Offloading";>offloading support (OMP:
   offloading and multi processing).
-  The branch is based on releases/gcc-10.
-  Please send patch emails with a short-hand [og10] tag in the
+  The branch is based on releases/gcc-11.
+  Please send patch emails with a short-hand [og11] tag in the
   subject line, and use ChangeLog.omp files.
 
   unified-autovect
@@ -949,13 +949,14 @@ merged.
   respectively.
 
   https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-9";>devel/omp/gcc-9
-  This branch was used for collaborative development of
+  https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-10";>devel/omp/gcc-10
+  These branches were used for collaborative development of
   https://gcc.gnu.org/wiki/OpenACC";>OpenACC and
   https://gcc.gnu.org/wiki/openmp";>OpenMP support and related
-  functionality as the successor to openacc-gcc-9-branch after the move to
+  functionality as the successors to openacc-gcc-9-branch after the move to
   Git.
-  The branch was based on releases/gcc-9.
-  Development has now moved to the devel/omp/gcc-10 branch.
+  The branches were based on releases/gcc-9 and releases/gcc-10 respectively.
+  Development has now moved to the devel/omp/gcc-11 branch.
 
   hammer-3_3-branch
   The goal of this branch was to have a stable compiler based on GCC 3.3

Re: [GOVERNANCE] Where to file complaints re project-maintainers?

2021-05-13 Thread abebeos via Gcc-patches

On Sat, 8 May 2021 at 18:49, abebeos 
wrote:

> (failed to join gcc, so posting here)
>
> Is there any private email where one can file complaints re
> project-maintainers (or "those who are supervising the maintainers") ?
>
> Is there any information about the process for such complaints?
>
> Related Issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100480
>
> (please note that this complaint will most possibly escalate up to the
> person(s) who are responsible for policies/rules)
>

This topic is "closed" for me (for now).

"
Now, the headline would be:

"Physik FU-Berlin, Microchip, Google, RedHat, IBM and more to Support
Abuse, Discrimination and even 'IT-fascism' via/on GCC/GNU/FSF
Project-Resources".

See, nobody cares, until a valid(!) headline gets some visibility.

But for now I'll stop here, as I don't want to open/activate accounts in
order to publish. And in the end, I would analyze GCC/GNU/FSF weaknesses
and threads without getting payed.

Just one last thing:

John Paul Adrian Glaubitz, you have attacked my professional reputation in
public, saying more or less that I claimed the bounty without having done
any work for it.

But the indisputable fact is that any person that declares "assessment,
validation, integration and general reuse of existent results" as "copying"
should simply stay away from OSS software.

And persons which abuse their (position of) power to brute-force violate
voting procedures (or to not intervene), are just some (more or less worse)
form  of IT-fascists.

People like you should be kicked out immediately from OSS projects.

Well, at least in a perfect world.

Cu around, clowns.
"
source: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92729#c61

Re: [PATCH] libgccjit: add some reflection functions in the jit C api

2021-05-13 Thread David Malcolm via Gcc-patches

On Tue, 2020-11-03 at 17:13 -0500, Antoni Boucher wrote:
> I was missing a check in gcc_jit_struct_get_field, I added it in this
> new patch.
> 

Sorry about the long delay in reviewing this patch.

The main high-level points are:
- currently the get_*_count functions return "ssize_t" - why?  Only
unsigned values are meaningful; shouldn't they return "size_t" instead?

- the various "lookup by index" functions take "int" i.e. signed, but
only >= 0 is meaningful.  I think it makes sense to make them take
size_t instead.
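
To make the two points concrete, here is a small sketch of an unsigned count plus an unsigned, bounds-checked index lookup. It is not libgccjit code: `FunctionSketch`, `get_param_count` and `get_param` are hypothetical names. One consequence worth noting is that an unsigned count can no longer return -1 on error, so errors would have to be reported some other way.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a function object that owns its parameters.
struct FunctionSketch {
  std::vector<std::string> params;
};

// A count is never negative, so it is returned as size_t.
inline std::size_t get_param_count(const FunctionSketch &func) {
  return func.params.size();
}

// The index is unsigned too, so only the upper bound needs checking.
// Returns nullptr when the index is out of range.
inline const std::string *get_param(const FunctionSketch &func,
                                    std::size_t index) {
  if (index >= func.params.size())
    return nullptr;
  return &func.params[index];
}
```

With a signed index, the lookup would need a second check for negative values; with `size_t` a single comparison covers every invalid input.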

Sorry if we covered that before in the review, it's been a while.

Various nitpicks inline below...

[...snip...]
 
> diff --git a/gcc/jit/docs/topics/compatibility.rst 
> b/gcc/jit/docs/topics/compatibility.rst
> index 6bfa101ed71..236e5c72d81 100644
> --- a/gcc/jit/docs/topics/compatibility.rst
> +++ b/gcc/jit/docs/topics/compatibility.rst
> @@ -226,3 +226,44 @@ entrypoints:
>  
>  ``LIBGCCJIT_ABI_14`` covers the addition of
>  :func:`gcc_jit_global_set_initializer`
> +
> +.. _LIBGCCJIT_ABI_15:
> +
> +``LIBGCCJIT_ABI_15``
> +
> +``LIBGCCJIT_ABI_15`` covers the addition of reflection functions via API
> +entrypoints:

This needs updating, as I used LIBGCCJIT_ABI_15 for inline asm.

[...snip...]

> diff --git a/gcc/jit/docs/topics/functions.rst 
> b/gcc/jit/docs/topics/functions.rst
> index eb40d64010e..aa6de87282d 100644
> --- a/gcc/jit/docs/topics/functions.rst
> +++ b/gcc/jit/docs/topics/functions.rst
> @@ -171,6 +171,16 @@ Functions
> underlying string, so it is valid to pass in a pointer to an on-stack
> buffer.
>  
> +.. function::  ssize_t \
> +   gcc_jit_function_get_param_count (gcc_jit_function *func)
> +
> +   Get the number of parameters of the function.
> +
> +.. function::  gcc_jit_type \*
> +   gcc_jit_function_get_return_type (gcc_jit_function *func)
> +
> +   Get the return type of the function.

As noted before, this doesn't yet document all the new entrypoints; I
think you wanted to hold off until all the details were thrashed out,
but hopefully we're close.

The documentation for an entrypoint should specify which ABI it was
added in.

[...snip...]

> +/* Public entrypoint.  See description in libgccjit.h.
> +
> +   After error-checking, the real work is done by the
> +   gcc::jit::recording::type::is_struct method, in
> +   jit-recording.c.  */
> +
> +gcc_jit_struct *
> +gcc_jit_type_is_struct (gcc_jit_type *type)
> +{
> +  RETURN_NULL_IF_FAIL (type, NULL, NULL, "NULL type");
> +  gcc::jit::recording::struct_ *struct_type = type->is_struct ();
> +  return (gcc_jit_struct *)struct_type;
> +}
> +
> +/* Public entrypoint.  See description in libgccjit.h.
> +
> +   After error-checking, the real work is done by the
> +   gcc::jit::recording::vector_type::get_num_units method, in
> +   jit-recording.c.  */
> +
> +ssize_t
> +gcc_jit_vector_type_get_num_units (gcc_jit_vector_type *vector_type)
> +{
> +  RETURN_VAL_IF_FAIL (vector_type, -1, NULL, NULL, "NULL vector_type");
> +  return vector_type->get_num_units ();
> +}
> +
> +/* Public entrypoint.  See description in libgccjit.h.
> +
> +   After error-checking, the real work is done by the
> +   gcc::jit::recording::vector_type::get_element_type method, in
> +   jit-recording.c.  */
> +
> +gcc_jit_type *
> +gcc_jit_vector_type_get_element_type (gcc_jit_vector_type *vector_type)
> +{
> +  RETURN_NULL_IF_FAIL (vector_type, NULL, NULL, "NULL vector_type");
> +  return (gcc_jit_type *)vector_type->get_element_type ();
> +}
> +
> +/* Public entrypoint.  See description in libgccjit.h.
> +
> +   After error-checking, the real work is done by the
> +   gcc::jit::recording::type::unqualified method, in
> +   jit-recording.c.  */
> +
> +gcc_jit_type *
> +gcc_jit_type_unqualified (gcc_jit_type *type)
> +{
> +  RETURN_NULL_IF_FAIL (type, NULL, NULL, "NULL type");
> +
> +  return (gcc_jit_type *)type->unqualified ();
> +}
> +
> +/* Public entrypoint.  See description in libgccjit.h.
> +
> +   After error-checking, the real work is done by the
> +   gcc::jit::recording::type::dyn_cast_function_type method, in
> +   jit-recording.c.  */
> +
> +gcc_jit_function_type *
> +gcc_jit_type_is_function_ptr_type (gcc_jit_type *type)
> +{
> +  RETURN_NULL_IF_FAIL (type, NULL, NULL, "NULL type");
> +  gcc::jit::recording::type *func_ptr_type = type->dereference ();
> +  RETURN_NULL_IF_FAIL (func_ptr_type, NULL, NULL, "NULL type");
> +  gcc::jit::recording::function_type *func_type =
> +func_ptr_type->dyn_cast_function_type ();
> +  RETURN_NULL_IF_FAIL (func_type, NULL, NULL, "NULL type");

I notice that the other new "*_is_*" functions don't fail if the
dyncast returns NULL, whereas this one does.

RETURN_NULL_IF_FAIL calls jit_error; do we really want that?  It seems
more consistent to have it return NULL without an error for the case
where "type" isn't a function ptr type.
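
The behaviour suggested here can be sketched with an ordinary class hierarchy. None of these names exist in libgccjit: `Type`, `FunctionType`, `IntType`, `last_error` and `type_is_function` are hypothetical, and `last_error` merely stands in for whatever jit_error records. A NULL argument is treated as API misuse and reported; a type of the wrong kind simply yields a null result.

```cpp
#include <string>

// Hypothetical type hierarchy.
struct Type {
  virtual ~Type() = default;
};
struct FunctionType : Type {};
struct IntType : Type {};

// Hypothetical error sink standing in for the context's error reporting.
inline std::string &last_error() {
  static std::string error;
  return error;
}

// "is_*" style query: an error only for a NULL argument, and a silent
// nullptr when the type is valid but is not a function type.
inline FunctionType *type_is_function(Type *type) {
  if (!type) {
    last_error() = "NULL type";
    return nullptr;
  }
  return dynamic_cast<FunctionType *>(type);
}
```

Callers can then use the query as a plain test without first checking whether an error was recorded.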

> +
> +  return (gcc_jit_function_type *)func_type;
> +}
> +
> +/* Public entrypoint.  See description

[wwwdocs, patch] gcc-12/changes.html: Document -mptx for nvptx

2021-05-13 Thread Tobias Burnus


Document this new flag, added in
https://gcc.gnu.org/g:2a1586401a21dcd43e0f904bb6eec26c8b2f366b
+ https://gcc.gnu.org/onlinedocs/gcc/Nvidia-PTX-Options.html#index-mptx

Any wording suggestions?

Tobias

PS: Some background remarks:

(PTX ISA 3.1 has been supported since NVidia's CUDA 5, while 6.3 has been
supported since CUDA 10.0 and adds very useful new features; the current
version is PTX ISA 7.3 (CUDA 11.3).* On the PTX side, 6.3 adds a lot, while
versions newer than 6.3 add only a few features, which we still may want to
support sometime in the future.)

(The new flag paves the way for additional -misa= flags
(i.e. newer hardware, relevant for enabling PTX instructions which only
newer GPUs support) and newer GPU-hardware-independent PTX ISA features,
which either permit better code generation or can be used to fix bugs.
While this will change during GCC 12, currently, the generated code is
effectively the same with either -mptx= value.)

(Regarding the produced instructions, the installed CUDA will JIT
(and then cache) the GCC-generated nvptx in the binary at startup,
optimizing for the available hardware - i.e. the chosen -mptx and
available -misa do not restrict the hardware ability; it is just that
PTX instructions which are only available in newer PTX versions or for
newer hardware may not be generated.)

(* Cf. 
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#release-notes__ptx-release-history
 )

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
gcc-12/changes.html: Document -mptx for nvptx

* htdocs/gcc-12/changes.html (nvptx): Document new -mptx flag.

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 23f71411..6541cf4e 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -101,8 +101,13 @@ a work-in-progress.
 
 
 
-
-
+NVPTX
+
+  The -mptx flag has been added to specify the PTX ISA version
+  for the generated code; permitted values are 3.1
+  (default and as used previous GCC versions) and 6.3.
+  
+

Re: [PATCH v2] c++: Check attributes on friend declarations [PR99032]

2021-05-13 Thread Marek Polacek via Gcc-patches

On Wed, May 12, 2021 at 08:27:18PM -0400, Jason Merrill wrote:
> On 5/12/21 8:03 PM, Marek Polacek wrote:
> > diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
> > index 89f874a32cc..2bcefb619aa 100644
> > --- a/gcc/cp/decl2.c
> > +++ b/gcc/cp/decl2.c
> > @@ -1331,6 +1331,20 @@ any_dependent_type_attributes_p (tree attrs)
> > return false;
> >   }
> > +/* True if ATTRS contains any attribute that requires a type.  */
> 
> Let's invert this to check if ATTRS contains any attribute that does *not*
> require a type, and would therefore apply to the decl.

Sounds good, done.  Now I don't need to check *attrlist.
I've also fixed up the xfail thing in my new test.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This patch implements [dcl.attr.grammar]/5: "If an attribute-specifier-seq
appertains to a friend declaration ([class.friend]), that declaration shall
be a definition."

This restriction applies to C++11-style attributes as well as GNU
attributes, with the exception that we allow GNU attributes that require
a type, such as vector_size, so that we continue accepting code as in
attrib63.C.  There are various forms of friend declarations: we have
friend templates, C++11 extended friend declarations, and so on.  In some
cases we already ignore the attribute and warn that it was ignored.  But
certain cases weren't diagnosed, and with this patch we'll give a hard
error.  I tried hard not to emit both a warning and an error, and I think
it worked out.
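
A small illustration of the rule quoted from [dcl.attr.grammar]/5 follows. The class `Account` and the function `peek` are hypothetical and are not taken from the new tests; `[[maybe_unused]]` is used only as a convenient standard attribute and assumes C++17. The ill-formed form is left as a comment so that the sketch compiles.

```cpp
struct Account {
  explicit Account(int initial) : balance(initial) {}

  // Ill-formed under the quoted rule: the attribute appertains to a friend
  // declaration that is not a definition.
  //   [[maybe_unused]] friend int peek(const Account &);

  // Fine: the friend declaration is also a definition.
  [[maybe_unused]] friend int peek(const Account &account) {
    return account.balance;
  }

 private:
  int balance;
};
```

The friend defined inside the class is found through argument-dependent lookup, so `peek(some_account)` works without a separate declaration.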

Jason provided the cp_parser_decl_specifier_seq hunk to detect using
standard attributes in the middle of decl-specifiers, which is invalid.

Co-authored-by: Jason Merrill 

gcc/cp/ChangeLog:

PR c++/99032
* cp-tree.h (any_non_type_attribute_p): Declare.
* decl.c (grokdeclarator): Diagnose when an attribute appertains to
a friend declaration that is not a definition.
* decl2.c (any_non_type_attribute_p): New.
* parser.c (cp_parser_decl_specifier_seq): Diagnose standard attributes
in the middle of decl-specifiers.
(cp_parser_elaborated_type_specifier): Diagnose when an attribute
appertains to a friend declaration that is not a definition.
(cp_parser_member_declaration): Likewise.

gcc/testsuite/ChangeLog:

PR c++/99032
* g++.dg/cpp0x/friend7.C: New test.
* g++.dg/cpp0x/gen-attrs-4.C: Add dg-error.
* g++.dg/cpp0x/gen-attrs-39-1.C: Likewise.
* g++.dg/cpp0x/gen-attrs-74.C: New test.
* g++.dg/ext/attrib63.C: New test.
---
 gcc/cp/cp-tree.h|  1 +
 gcc/cp/decl.c   |  5 +++
 gcc/cp/decl2.c  | 14 
 gcc/cp/parser.c | 23 +++-
 gcc/testsuite/g++.dg/cpp0x/friend7.C| 40 +
 gcc/testsuite/g++.dg/cpp0x/gen-attrs-39-1.C |  3 +-
 gcc/testsuite/g++.dg/cpp0x/gen-attrs-4.C|  3 +-
 gcc/testsuite/g++.dg/cpp0x/gen-attrs-74.C   | 10 ++
 gcc/testsuite/g++.dg/ext/attrib63.C | 16 +
 9 files changed, 112 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/friend7.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/gen-attrs-74.C
 create mode 100644 gcc/testsuite/g++.dg/ext/attrib63.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 122dadf976f..580db914d40 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6763,6 +6763,7 @@ extern tree grokbitfield (const cp_declarator *, 
cp_decl_specifier_seq *,
  tree, tree, tree);
 extern tree splice_template_attributes (tree *, tree);
 extern bool any_dependent_type_attributes_p(tree);
+extern bool any_non_type_attribute_p   (tree);
 extern tree cp_reconstruct_complex_type(tree, tree);
 extern bool attributes_naming_typedef_ok   (tree);
 extern void cplus_decl_attributes  (tree *, tree, int);
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index bc3928d7f85..17511f09e79 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -13741,6 +13741,11 @@ grokdeclarator (const cp_declarator *declarator,

if (friendp)
  {
+   if (attrlist && !funcdef_flag
+   /* Hack to allow attributes like vector_size on a friend.  */
+   && any_non_type_attribute_p (*attrlist))
+ error_at (id_loc, "attribute appertains to a friend "
+   "declaration that is not a definition");
/* Friends are treated specially.  */
if (ctype == current_class_type)
  ;  /* We already issued a permerror.  */
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 89f874a32cc..8e4dd6b544a 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1331,6 +1331,20 @@ any_dependent_type_attributes_p (tree attrs)
   return false;
 }

+/* True if ATTRS contains any attribute that does not require a type.  */
+
+bool
+any_non_type_attribute_p (tree attrs)
+{
+  for (tree a = attrs; a; a = TRE

[PATCH] c++: Prune dead functions.

2021-05-13 Thread Marek Polacek via Gcc-patches

[ Repost from GCC 11 stage 3.  Rebased onto current trunk. ]

I was looking at the LCOV coverage report for the C++ FE and
found a bunch of unused functions that I think we can remove.
Obviously, I left alone various dump_* and debug_* routines.
I haven't removed cp_build_function_call although it is also
currently unused.

* lambda_return_type: was used in parser.c in GCC 7, unused since r255950,
* classtype_has_non_deleted_copy_ctor: appeared in GCC 10, its usage
  was removed in c++/95350,
* contains_wildcard_p: used in GCC 9, unused since r276764,
* get_template_head_requirements: seems to never have been used,
* check_constrained_friend: seems to never have been used,
* subsumes_constraints: unused since r276764,
* push_void_library_fn: usage removed in r248328,
* get_template_parms_at_level: unused since r157857,
* get_pattern_parm: unused since r275387.

(Some of the seemingly unused functions, such as set_global_friend, are
actually used in libcc1.)

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

* class.c (classtype_has_non_deleted_copy_ctor): Remove.
* constraint.cc (contains_wildcard_p): Likewise.
(get_template_head_requirements): Likewise.
(check_constrained_friend): Likewise.
(subsumes_constraints): Likewise.
* cp-tree.h (classtype_has_non_deleted_copy_ctor): Likewise.
(push_void_library_fn): Likewise.
(get_pattern_parm): Likewise.
(get_template_parms_at_level): Likewise.
(lambda_return_type): Likewise.
(get_template_head_requirements): Likewise.
(check_constrained_friend): Likewise.
(subsumes_constraints): Likewise.
* decl.c (push_void_library_fn): Likewise.
* lambda.c (lambda_return_type): Likewise.
* pt.c (get_template_parms_at_level): Likewise.
(get_pattern_parm): Likewise.
---
 gcc/cp/class.c   | 13 --
 gcc/cp/constraint.cc | 62 
 gcc/cp/cp-tree.h |  8 --
 gcc/cp/decl.c| 10 ---
 gcc/cp/lambda.c  | 18 -
 gcc/cp/pt.c  | 49 --
 6 files changed, 160 deletions(-)

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 66bc1eea682..354addde773 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -5604,19 +5604,6 @@ classtype_has_non_deleted_move_ctor (tree t)
   return false;
 }
 
-/* True iff T has a copy constructor that is not deleted.  */
-
-bool
-classtype_has_non_deleted_copy_ctor (tree t)
-{
-  if (CLASSTYPE_LAZY_COPY_CTOR (t))
-lazily_declare_fn (sfk_copy_constructor, t);
-  for (ovl_iterator iter (CLASSTYPE_CONSTRUCTORS (t)); iter; ++iter)
-if (copy_fn_p (*iter) && !DECL_DELETED_FN (*iter))
-  return true;
-  return false;
-}
-
 /* If T, a class, has a user-provided copy constructor, copy assignment
operator, or destructor, returns that function.  Otherwise, null.  */
 
diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 30fccc46678..03ce8eb9ff2 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -278,21 +278,6 @@ get_concept_check_template (tree t)
   return tmpl;
 }
 
-/* Returns true if any of the arguments in the template argument list is
-   a wildcard or wildcard pack.  */
-
-bool
-contains_wildcard_p (tree args)
-{
-  for (int i = 0; i < TREE_VEC_LENGTH (args); ++i)
-{
-  tree arg = TREE_VEC_ELT (args, i);
-  if (TREE_CODE (arg) == WILDCARD_DECL)
-   return true;
-}
-  return false;
-}
-
 /*---
 Resolution of qualified concept names
 ---*/
@@ -1310,18 +1295,6 @@ maybe_substitute_reqs_for (tree reqs, const_tree decl_)
   return reqs;
 }
 
-/* Returns the template-head requires clause for the template
-   declaration T or NULL_TREE if none.  */
-
-tree
-get_template_head_requirements (tree t)
-{
-  tree ci = get_constraints (t);
-  if (!ci)
-return NULL_TREE;
-  return CI_TEMPLATE_REQS (ci);
-}
-
 /* Returns the trailing requires clause of the declarator of
a template declaration T or NULL_TREE if none.  */
 
@@ -3469,31 +3442,6 @@ check_function_concept (tree fn)
   return NULL_TREE;
 }
 
-
-// Check that a constrained friend declaration function declaration,
-// FN, is admissible. This is the case only when the declaration depends
-// on template parameters and does not declare a specialization.
-void
-check_constrained_friend (tree fn, tree reqs)
-{
-  if (fn == error_mark_node)
-return;
-  gcc_assert (TREE_CODE (fn) == FUNCTION_DECL);
-
-  // If there are not constraints, this cannot be an error.
-  if (!reqs)
-return;
-
-  // Constrained friend functions that don't depend on template
-  // arguments are effectively meaningless.
-  if (!uses_template_parms (TREE_TYPE (fn)))
-{
-  error_at (location_of (fn),
-   "constrained friend

Re: [PATCH] libgccjit: Handle truncation and extension for casts [PR 95498]

2021-05-13 Thread David Malcolm via Gcc-patches

On Sat, 2021-02-20 at 17:17 -0500, Antoni Boucher via Gcc-patches
wrote:
> Hi.
> Thanks for your feedback!
> 

Sorry about the delay in responding.

In the past I was hesitant about adding more cast support to libgccjit
since I felt that the user could always just create a union to do the
cast.  Then I tried actually using the libgccjit API to do this, and
realized how much work it adds, so I now think we do want to support
casting more types.
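
As a reminder of what truncation and extension mean for integer casts, here is a plain C++ sketch. It deliberately does not use the libgccjit API; the three helper names are hypothetical and only show the value semantics that such casts are expected to have.

```cpp
#include <cstdint>

// Truncation: keep only the low 8 bits of a wider unsigned value.
inline std::uint8_t truncate_to_u8(std::uint32_t value) {
  return static_cast<std::uint8_t>(value);
}

// Sign extension: widening a signed value preserves its numeric value.
inline std::int32_t sign_extend(std::int8_t value) {
  return static_cast<std::int32_t>(value);
}

// Zero extension: widening an unsigned value fills the new bits with zero.
inline std::uint32_t zero_extend(std::uint8_t value) {
  return static_cast<std::uint32_t>(value);
}
```

Doing the same through a union would reinterpret the stored bytes rather than convert the value, which is part of why a direct cast is more convenient for these cases.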


> See answers below:
> 
> On Sat, Feb 20, 2021 at 11:20:35AM -0700, Tom Tromey wrote:
> > > > > > > "Antoni" == Antoni Boucher via Gcc-patches <   
> > > > > > > gcc-patches@gcc.gnu.org> writes:
> > 
> > Antoni> gcc/jit/
> > Antoni> PR target/95498
> > Antoni> * jit-playback.c: Add support to handle truncation
> > and extension
> > Antoni> in the convert function.
> > 
> > Antoni> +  switch (dst_code)
> > Antoni> +    {
> > Antoni> +    case INTEGER_TYPE:
> > Antoni> +    case ENUMERAL_TYPE:
> > Antoni> +  t_ret = convert_to_integer (dst_type, expr);
> > Antoni> +  goto maybe_fold;
> > Antoni> +
> > Antoni> +    default:
> > Antoni> +  gcc_assert (gcc::jit::active_playback_ctxt);
> > Antoni> +  gcc::jit::active_playback_ctxt->add_error (NULL,
> > "unhandled conversion");
> > Antoni> +  fprintf (stderr, "input expression:\n");
> > Antoni> +  debug_tree (expr);
> > Antoni> +  fprintf (stderr, "requested type:\n");
> > Antoni> +  debug_tree (dst_type);
> > Antoni> +  return error_mark_node;
> > Antoni> +
> > Antoni> +    maybe_fold:
> > Antoni> +  if (TREE_CODE (t_ret) != C_MAYBE_CONST_EXPR)

Do we even get C_MAYBE_CONST_EXPR in libgccjit?  That tree code is
defined in c-family/c-common.def; how can nodes of that kind be created
outside of the c-family?

> > Antoni> +   t_ret = fold (t_ret);
> > Antoni> +  return t_ret;
> > 
> > It seems weird to have a single 'goto' to maybe_fold, especially
> > inside
> > a switch like this.
> > 
> > If you think the maybe_fold code won't be reused, then it should just
> > be
> > hoisted up and the 'goto' removed.
> 
> This actually depends on how the support for cast between integers and 
> pointers will be implemented (see below).
> If we will support truncating pointers (does that even make sense? and
> I 
> guess we cannot extend a pointer unless we add the support for 
> uint128_t), that label will be reused for that case.
> Otherwise, it might not be reused.
> 
> So, please tell me which option to choose and I'll update my patch.

FWIW I don't think we'll want to support truncating or extending
pointers.

> 
> > On the other hand, if the maybe_fold code might be reused for some
> > other
> > case, then I suppose I would have the case end with 'break' and then
> > have this code outside the switch.
> > 
> > 
> > In another message, you wrote:
> > 
> > Antoni> For your question, the current code already works with
> > boolean and
> > Antoni> reals and casts between integers and pointers is currently
> > not
> > Antoni> supported.
> > 
> > I am curious why this wasn't supported.  It seems like something that
> > one might want to do.
> 
> I have no idea as this is my first contribution to gcc.
> But this would be indeed very useful and I opened an issue about this: 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95438
> 
> > thanks,
> > Tom
> 
> Thanks!
>

Re: [PATCH] avoid a couple of missing -Wuninitialized (PR 98583, 93100)

2021-05-13 Thread Martin Sebor via Gcc-patches


On 5/13/21 1:03 PM, Jeff Law wrote:


On 5/11/2021 1:49 PM, Martin Sebor via Gcc-patches wrote:

The attached change teaches the uninitialized pass about
__builtin_stack_restore and __builtin___asan_mark to avoid two
classes of -Wuninitialized false negatives.

Richard, you already approved the __builtin_stack_restore change
in the bug but I figured I'd submit a patch with both changes for
approval since they affect the same piece of code.

Martin

gcc-93100.diff

Avoid -Wuninitialized false negatives with sanitization and VLAs.

Resolves:
PR tree-optimization/93100 - gcc -fsanitize=address inhibits -Wuninitialized
PR middle-end/98583 - missing -Wuninitialized reading from a second VLA in its own block

gcc/ChangeLog:

PR tree-optimization/93100
PR middle-end/98583
* tree-ssa-uninit.c (check_defs):

gcc/testsuite/ChangeLog:

PR tree-optimization/93100
PR middle-end/98583
* g++.dg/warn/uninit-pr93100.C: New test.
* gcc.dg/uninit-pr93100.c: New test.
* gcc.dg/uninit-pr98583.c: New test.


OK.  I wonder if it would make sense to describe this property when we 
construct the builtin and check that property rather than each builtin 
we find over time.  Your call on whether or not to explore that.


I like the idea.  Adding attribute access to the built-ins would
be one way.  Attribute fn spec might be able to express the same
thing although there I'm not sure if it would apply to
the sanitizer functions.  Either way it seems worth looking into.

Thanks
Martin




Jeff

Re: [PATCH] avoid using an incompletely populated struct (PR 100574)

2021-05-13 Thread Martin Sebor via Gcc-patches


On 5/13/21 11:36 AM, Martin Sebor wrote:

On 5/13/21 11:20 AM, Bernd Edlinger wrote:

On 5/13/21 3:55 AM, Martin Sebor via Gcc-patches wrote:

A logic bug in the handling of PHI arguments in compute_objsize
that are all null pointers lets an incompletely populated struct
be used in a way that triggers an assertion causing an ICE.

The attached patch corrects that by having compute_objsize fail
when the struct isn't fully populated (when all of the PHI's
arguments are null).

Martin


Martin,

I'm getting test failures with your patch here:

Running target unix/-m32
FAIL: g++.dg/pr100574.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/pr100574.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/pr100574.C  -std=gnu++2a (test for excess errors)

/home/ed/gnu/gcc-trunk/gcc/testsuite/g++.dg/pr100574.C:6:7: error: 
'operator new' takes type 'size_t' ('unsigned int') as first parameter 
[-fpermissive]

compiler exited with status 1


Thanks, I've just fixed it.


I hadn't checked in the patch yet.  I'm only now about to do it and
see I inadvertently committed the test in response to your email
about the failures.  I didn't realize you were testing the patch
I had posted for review, before I committed it.

Martin



Martin




Bernd.

Re: [PATCH RFA] tree-iterator: C++11 range-for and tree_stmt_iterator

2021-05-13 Thread Martin Sebor via Gcc-patches


On 5/13/21 1:26 PM, Jason Merrill via Gcc-patches wrote:

Ping.

On 5/1/21 12:29 PM, Jason Merrill wrote:

Like my recent patch to add ovl_range and lkp_range in the C++ front end,
this patch adds the tsi_range adaptor for using C++11 range-based 'for' with
a STATEMENT_LIST, e.g.

   for (tree stmt : tsi_range (stmt_list)) { ... }

This also involves adding some operators to tree_stmt_iterator that are
needed for range-for iterators, and should also be useful in code that uses
the iterators directly.

The patch updates the suitable loops in the C++ front end, but does not
touch any loops elsewhere in the compiler.
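
A minimal, self-contained sketch of the range-adaptor idea described above.
It assumes simplified stand-in types (node, stmt_list, stmt_iterator and
stmt_range are hypothetical names, not GCC's tree_statement_list_node, tree,
tree_stmt_iterator or tsi_range) and only illustrates which operators
range-based 'for' needs from an iterator:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for GCC's statement-list node and list; these are
// NOT the real GCC types.
struct node
{
  node *prev;
  node *next;
  int stmt;
};

struct stmt_list
{
  node *head;
};

// Iterator with the operators that range-based 'for' requires:
// operator!=, prefix operator++ and operator*.
struct stmt_iterator
{
  node *ptr;
  const stmt_list *container;

  bool operator== (stmt_iterator b) const
  { return b.ptr == ptr && b.container == container; }
  bool operator!= (stmt_iterator b) const { return !(*this == b); }
  stmt_iterator &operator++ () { ptr = ptr->next; return *this; }
  stmt_iterator &operator-- () { ptr = ptr->prev; return *this; }
  int &operator* () const { return ptr->stmt; }
};

// Range adaptor: begin () points at the first node, end () is the null
// sentinel for the same container.
class stmt_range
{
  const stmt_list *l;
public:
  explicit stmt_range (const stmt_list &list) : l (&list) { }
  stmt_iterator begin () const { return { l->head, l }; }
  stmt_iterator end () const { return { nullptr, l }; }
};

// Collect the statements in order using range-based 'for'.
inline std::vector<int>
collect (const stmt_list &list)
{
  std::vector<int> out;
  for (int stmt : stmt_range (list))
    out.push_back (stmt);
  return out;
}

// Build a three-node list 1 <-> 2 <-> 3 and walk it.
inline std::vector<int>
demo ()
{
  node a = { nullptr, nullptr, 1 };
  node b = { &a, nullptr, 2 };
  node c = { &b, nullptr, 3 };
  a.next = &b;
  b.next = &c;
  stmt_list list = { &a };
  std::vector<int> out = collect (list);
  assert (out.size () == 3);
  return out;
}
```

The end iterator here is just a null node pointer paired with the container,
so the loop stops when operator++ walks off the last node.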


I like the modernization of the loops.

I can't find anything terribly wrong with the iterator but let me
at least pick on some nits ;)



gcc/ChangeLog:

* tree-iterator.h (struct tree_stmt_iterator): Add operator++,
operator--, operator*, operator==, and operator!=.
(class tsi_range): New.

gcc/cp/ChangeLog:

* constexpr.c (build_data_member_initialization): Use tsi_range.
(build_constexpr_constructor_member_initializers): Likewise.
(constexpr_fn_retval, cxx_eval_statement_list): Likewise.
(potential_constant_expression_1): Likewise.
* coroutines.cc (await_statement_expander): Likewise.
(await_statement_walker): Likewise.
* module.cc (trees_out::core_vals): Likewise.
* pt.c (tsubst_expr): Likewise.
* semantics.c (set_cleanup_locs): Likewise.
---
  gcc/tree-iterator.h  | 28 +++-
  gcc/cp/constexpr.c   | 42 ++
  gcc/cp/coroutines.cc | 10 --
  gcc/cp/module.cc |  5 ++---
  gcc/cp/pt.c  |  5 ++---
  gcc/cp/semantics.c   |  5 ++---
  6 files changed, 47 insertions(+), 48 deletions(-)

diff --git a/gcc/tree-iterator.h b/gcc/tree-iterator.h
index 076fff8644c..f57456bb473 100644
--- a/gcc/tree-iterator.h
+++ b/gcc/tree-iterator.h
@@ -1,4 +1,4 @@
-/* Iterator routines for manipulating GENERIC tree statement list.
+/* Iterator routines for manipulating GENERIC tree statement list. 
-*- C++ -*-

 Copyright (C) 2003-2021 Free Software Foundation, Inc.
 Contributed by Andrew MacLeod  
@@ -32,6 +32,13 @@ along with GCC; see the file COPYING3.  If not see
  struct tree_stmt_iterator {
    struct tree_statement_list_node *ptr;
    tree container;


I assume the absence of ctors is intentional.  If so, I suggest
adding a comment explaining why.  Otherwise, I would provide one
(or as many as needed).


+
+  bool operator== (tree_stmt_iterator b) const
+    { return b.ptr == ptr && b.container == container; }
+  bool operator!= (tree_stmt_iterator b) const { return !(*this == b); }
+  tree_stmt_iterator &operator++ () { ptr = ptr->next; return *this; }
+  tree_stmt_iterator &operator-- () { ptr = ptr->prev; return *this; }


I would suggest adding postincrement and postdecrement.


+  tree &operator* () { return ptr->stmt; }


Given the pervasive lack of const-safety in GCC and the by-value
semantics of the iterator this probably isn't worth it but maybe
add a const overload.  operator-> would probably never be used.


  };
  static inline tree_stmt_iterator
@@ -71,27 +78,38 @@ tsi_one_before_end_p (tree_stmt_iterator i)
  static inline void
  tsi_next (tree_stmt_iterator *i)
  {
-  i->ptr = i->ptr->next;
+  ++(*i);
  }
  static inline void
  tsi_prev (tree_stmt_iterator *i)
  {
-  i->ptr = i->ptr->prev;
+  --(*i);
  }
  static inline tree *
  tsi_stmt_ptr (tree_stmt_iterator i)
  {
-  return &i.ptr->stmt;
+  return &(*i);
  }
  static inline tree
  tsi_stmt (tree_stmt_iterator i)
  {
-  return i.ptr->stmt;
+  return *i;
  }
+/* Make tree_stmt_iterator work as a C++ range, e.g.
+   for (tree stmt : tsi_range (stmt_list)) { ... }  */
+class tsi_range
+{
+  tree t;
+ public:
+  tsi_range (tree t): t(t) { }
+  tree_stmt_iterator begin() { return tsi_start (t); }
+  tree_stmt_iterator end() { return { nullptr, t }; }


Those member functions could be made const.

Martin


+};
+
  enum tsi_iterator_update
  {
    TSI_NEW_STMT,    /* Only valid when single statement is added, 
move

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 9481a5bfd3c..260b0122f59 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -330,12 +330,9 @@ build_data_member_initialization (tree t, 
vec **vec)

  return false;
    if (TREE_CODE (t) == STATEMENT_LIST)
  {
-  tree_stmt_iterator i;
-  for (i = tsi_start (t); !tsi_end_p (i); tsi_next (&i))
-    {
-  if (! build_data_member_initialization (tsi_stmt (i), vec))
-    return false;
-    }
+  for (tree stmt : tsi_range (t))
+    if (! build_data_member_initialization (stmt, vec))
+  return false;
    return true;
  }
    if (TREE_CODE (t) == CLEANUP_STMT)
@@ -577,10 +574,9 @@ build_constexpr_constructor_member_initializers 
(tree type, tree body)

  break;
    case STATEMENT_LIST:
-    for (tree_stmt_iterator i = tsi_start (body);
- !tsi_end_p (i); tsi_next (&i))
+    for (tre

Re: [PATCH] libgccjit: Handle truncation and extension for casts [PR 95498]

2021-05-13 Thread Antoni Boucher via Gcc-patches

Thanks for your answer.

See my answers below:

On Thursday, 13 May 2021 at 18:13 -0400, David Malcolm wrote:
> On Sat, 2021-02-20 at 17:17 -0500, Antoni Boucher via Gcc-patches
> wrote:
> > Hi.
> > Thanks for your feedback!
> > 
> 
> Sorry about the delay in responding.
> 
> In the past I was hesitant about adding more cast support to libgccjit
> since I felt that the user could always just create a union to do the
> cast.  Then I tried actually using the libgccjit API to do this, and
> realized how much work it adds, so I now think we do want to support
> casting more types.
> 
> 
> > See answers below:
> > 
> > On Sat, Feb 20, 2021 at 11:20:35AM -0700, Tom Tromey wrote:
> > > > > > > > "Antoni" == Antoni Boucher via Gcc-patches <   
> > > > > > > > gcc-patches@gcc.gnu.org> writes:
> > > 
> > > Antoni> gcc/jit/
> > > Antoni> PR target/95498
> > > Antoni> * jit-playback.c: Add support to handle truncation
> > > and extension
> > > Antoni> in the convert function.
> > > 
> > > Antoni> +  switch (dst_code)
> > > Antoni> +    {
> > > Antoni> +    case INTEGER_TYPE:
> > > Antoni> +    case ENUMERAL_TYPE:
> > > Antoni> +  t_ret = convert_to_integer (dst_type, expr);
> > > Antoni> +  goto maybe_fold;
> > > Antoni> +
> > > Antoni> +    default:
> > > Antoni> +  gcc_assert (gcc::jit::active_playback_ctxt);
> > > Antoni> +  gcc::jit::active_playback_ctxt->add_error (NULL,
> > > "unhandled conversion");
> > > Antoni> +  fprintf (stderr, "input expression:\n");
> > > Antoni> +  debug_tree (expr);
> > > Antoni> +  fprintf (stderr, "requested type:\n");
> > > Antoni> +  debug_tree (dst_type);
> > > Antoni> +  return error_mark_node;
> > > Antoni> +
> > > Antoni> +    maybe_fold:
> > > Antoni> +  if (TREE_CODE (t_ret) != C_MAYBE_CONST_EXPR)
> 
> Do we even get C_MAYBE_CONST_EXPR in libgccjit?  That tree code is
> defined in c-family/c-common.def; how can nodes of that kind be created
> outside of the c-family?

I am not sure, but it does indeed seem to be created only in c-family.
However, we do use it in libgccjit here:

https://github.com/gcc-mirror/gcc/blob/master/gcc/jit/jit-playback.c#L1180

> 
> > > Antoni> +   t_ret = fold (t_ret);
> > > Antoni> +  return t_ret;
> > > 
> > > It seems weird to have a single 'goto' to maybe_fold, especially
> > > inside
> > > a switch like this.
> > > 
> > > If you think the maybe_fold code won't be reused, then it should
> > > just
> > > be
> > > hoisted up and the 'goto' removed.
> > 
> > This actually depends on how the support for cast between integers
> > and 
> > pointers will be implemented (see below).
> > If we will support truncating pointers (does that even make sense?
> > and
> > I 
> > guess we cannot extend a pointer unless we add the support for 
> > uint128_t), that label will be reused for that case.
> > Otherwise, it might not be reused.
> > 
> > So, please tell me which option to choose and I'll update my patch.
> 
> FWIW I don't think we'll want to support truncating or extending
> pointers.

Ok, but do you think we'll want to support casts between integers and
pointers?
I opened an issue about this
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95438) and would be
willing to do a patch for it eventually.

> > 
> > > On the other hand, if the maybe_fold code might be reused for some
> > > other
> > > case, then I suppose I would have the case end with 'break' and
> > > then
> > > have this code outside the switch.
> > > 
> > > 
> > > In another message, you wrote:
> > > 
> > > Antoni> For your question, the current code already works with
> > > boolean and
> > > Antoni> reals and casts between integers and pointers is currently
> > > not
> > > Antoni> supported.
> > > 
> > > I am curious why this wasn't supported.  It seems like something
> > > that
> > > one might want to do.
> > 
> > I have no idea as this is my first contribution to gcc.
> > But this would be indeed very useful and I opened an issue about
> > this: 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95438
> > 
> > > thanks,
> > > Tom
> > 
> > Thanks!
> > 
> 
>

[PATCH] libstdc++: Fix wrong thread waking on notify [PR100334]

2021-05-13 Thread Thomas Rodgers

From: Thomas Rodgers 

libstdc++/ChangeLog:
* include/bits/atomic_wait.h (__waiter::_M_do_wait_v): loop
until value change observed.
(__waiter_base::_M_a): Renamed member from _M_addr, changed
type to uintptr_t.
(__waiter_base::_S_wait_addr): Change return type to uintptr_t,
sets LSB if 'laundering' the wait address.
(__waiter_base::_M_addr): New member, returns wait address,
masking off LSB of _M_a.
(__waiter_base::_M_laundered): New member, returns true if
LSB of _M_a is set.
(__waiter_base::_M_notify): Call _M_addr(), check _M_laundered()
to determine whether to wake one or all.
(__waiter_base::_M_do_spin_v): Call _M_addr().
(__waiter_base::_M_do_spin): Likewise.
(__waiter::_M_do_wait_v): Likewise.
(__waiter::_M_do_wait): Likewise.
(__detail::__atomic_compare): Return true if call to
__builtin_memcmp() == 0.
(__waiter_base::_S_do_spin_v): Adjust predicate.
* testsuite/29_atomics/atomic/wait_notify/100334.cc: New
test.
* include/bits/atomic_timed_wait.h
(__timed_waiter::_M_do_wait_until_v): Call _M_addr().
(__timed_waiter::_M_do_wait_until): Likewise.
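
The ChangeLog above describes storing the wait address in a uintptr_t and
using its least significant bit as a flag.  The following is only a rough
sketch of that general tagging technique, not the libstdc++ code: wait_word,
pack, address_of and is_proxied are hypothetical names, and the sketch
assumes the pointed-to type is at least 2-byte aligned so the low bit of a
valid address is always zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the platform wait type; assumed to have an
// alignment of at least 2 so the low bit of its address is always zero.
using wait_word = std::int32_t;

// Pack a pointer and a one-bit flag into a uintptr_t.
inline std::uintptr_t
pack (wait_word *p, bool proxied)
{
  auto bits = reinterpret_cast<std::uintptr_t> (p);
  assert ((bits & 0x1) == 0);   // relies on the alignment assumption
  return proxied ? (bits | 0x1) : bits;
}

// Recover the pointer by masking off the low bit.
inline wait_word *
address_of (std::uintptr_t a)
{
  return reinterpret_cast<wait_word *> (a & ~std::uintptr_t (1));
}

// Recover the flag.
inline bool
is_proxied (std::uintptr_t a)
{
  return (a & 0x1) != 0;
}

// Round-trip check for both the tagged and the untagged form.
inline bool
round_trips ()
{
  static wait_word w = 0;
  std::uintptr_t plain = pack (&w, false);
  std::uintptr_t tagged = pack (&w, true);
  return address_of (plain) == &w && !is_proxied (plain)
         && address_of (tagged) == &w && is_proxied (tagged);
}
```

The sketch builds the mask as ~std::uintptr_t (1) so that it is computed
entirely in an unsigned type of pointer width.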
---
 libstdc++-v3/include/bits/atomic_timed_wait.h |  6 +-
 libstdc++-v3/include/bits/atomic_wait.h   | 49 ++
 .../29_atomics/atomic/wait_notify/100334.cc   | 94 +++
 3 files changed, 129 insertions(+), 20 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc

diff --git a/libstdc++-v3/include/bits/atomic_timed_wait.h 
b/libstdc++-v3/include/bits/atomic_timed_wait.h
index ec7ff51cdbc..5fe64fa2219 100644
--- a/libstdc++-v3/include/bits/atomic_timed_wait.h
+++ b/libstdc++-v3/include/bits/atomic_timed_wait.h
@@ -289,7 +289,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
if (_M_do_spin(__old, std::move(__vfn), __val,
   __timed_backoff_spin_policy(__atime)))
  return true;
-   return __base_type::_M_w._M_do_wait_until(__base_type::_M_addr, 
__val, __atime);
+   return __base_type::_M_w._M_do_wait_until(__base_type::_M_addr(), 
__val, __atime);
  }
 
// returns true if wait ended before timeout
@@ -304,7 +304,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __now = _Clock::now())
  {
if (__base_type::_M_w._M_do_wait_until(
- __base_type::_M_addr, __val, __atime)
+ __base_type::_M_addr(), __val, __atime)
&& __pred())
  return true;
 
@@ -347,7 +347,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
auto __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);
 
return __base_type::_M_w._M_do_wait_until(
- __base_type::_M_addr,
+ __base_type::_M_addr(),
  __val,
  chrono::steady_clock::now() + 
__reltime);
  }
diff --git a/libstdc++-v3/include/bits/atomic_wait.h 
b/libstdc++-v3/include/bits/atomic_wait.h
index 984ed70f16c..06ebcc7bce3 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -181,11 +181,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return false;
   }
 
+// return true if equal
 template<typename _Tp>
   bool __atomic_compare(const _Tp& __a, const _Tp& __b)
   {
// TODO make this do the correct padding bit ignoring comparison
-   return __builtin_memcmp(&__a, &__b, sizeof(_Tp)) != 0;
+   return __builtin_memcmp(&__a, &__b, sizeof(_Tp)) == 0;
   }
 
 struct __waiter_pool_base
@@ -276,16 +277,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
using __waiter_type = _Tp;
 
__waiter_type& _M_w;
-   __platform_wait_t* _M_addr;
+   uintptr_t  _M_a;
 
 template<typename _Up>
- static __platform_wait_t*
+ static uintptr_t
  _S_wait_addr(const _Up* __a, __platform_wait_t* __b)
  {
if constexpr (__platform_wait_uses_type<_Up>)
- return reinterpret_cast<__platform_wait_t*>(const_cast<_Up*>(__a));
+ return reinterpret_cast<uintptr_t>(const_cast<_Up*>(__a));
else
- return __b;
+ return reinterpret_cast<uintptr_t>(__b) | 0x1;
  }
 
static __waiter_type&
@@ -299,16 +300,25 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 template<typename _Up>
  explicit __waiter_base(const _Up* __addr) noexcept
: _M_w(_S_for(__addr))
-   , _M_addr(_S_wait_addr(__addr, &_M_w._M_ver))
- {
- }
+   , _M_a(_S_wait_addr(__addr, &_M_w._M_ver))
+ { }
+
+   __platform_wait_t*
+   _M_addr() const noexcept
+   { return reinterpret_cast<__platform_wait_t*>(_M_a & (-1 << 1)); }
+
+   bool
+   _M_laundered() const
+   { return _M_a & 0x1;

Re: [PATCH RFA (diagnostic)] c++: -Wdeprecated-copy and #pragma diagnostic [PR94492]

2021-05-13 Thread Martin Sebor via Gcc-patches


On 5/13/21 1:28 PM, Jason Merrill via Gcc-patches wrote:

Ping.

On 4/28/21 9:32 AM, Jason Merrill wrote:

-Wdeprecated-copy was depending only on the state of the warning at the
point where we call the function, making it hard to use #pragma diagnostic
to suppress the warning for a particular implicitly declared function.

But checking whether the warning is enabled at the location of the implicit
declaration turned out to be a bit complicated; option_enabled only tests
whether it was enabled at the start of compilation, the actual test only
existed in the middle of diagnostic_report_diagnostic.  So this patch
factors it out and adds a new warning_enabled function to diagnostic.h.


There is a bit of overlap in this patch with my work here:
  https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563862.html
but nothing that concerns me, for whatever it's worth.

In my ongoing work to extend TREE_NO_WARNING to more than one bit
I've been thinking of introducing a function (actually a pair of
them) similar to warning_enabled().  The overloads will take
a tree and gimple* rather than a location so that they can consider
the inlining context.  This is only useful in the middle end so
front ends can still use location_t.

Just one suggestion: since warning_at() takes location_t and int
for the option in that order, I would recommend doing the same
for warning_enabled(), just to reduce the risk of confusion.
(It would be nice if location_t could be something other than
an arithmetic type).
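
To illustrate the risk of confusion mentioned above, here is a small sketch
under stated assumptions: loc_t, enabled_weak, strong_loc and enabled_strong
are hypothetical names invented for this example and are not GCC interfaces.
It shows that when both parameters are arithmetic types a call with swapped
arguments still compiles, whereas a distinct wrapper type rejects it.

```cpp
#include <cassert>

// Hypothetical stand-in for an arithmetic location type; not GCC's.
using loc_t = unsigned int;

// Returns the option it was asked about so the demo can observe which
// argument landed in which parameter.
inline int
enabled_weak (int opt, loc_t /*loc*/)
{
  return opt;
}

// Wrapping the location in a distinct type makes a swapped call ill-formed.
struct strong_loc
{
  explicit strong_loc (loc_t v) : value (v) { }
  loc_t value;
};

inline int
enabled_strong (strong_loc /*loc*/, int opt)
{
  return opt;
}

inline void
demo ()
{
  const int opt = 7;
  const loc_t loc = 1234;

  // Intended call.
  assert (enabled_weak (opt, loc) == 7);
  // Swapped arguments still compile: both convert implicitly.
  assert (enabled_weak (loc, opt) == 1234);

  // With the wrapper the intended call works ...
  assert (enabled_strong (strong_loc (loc), opt) == 7);
  // ... and enabled_strong (opt, strong_loc (loc)) would not compile.
}
```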

Martin



Tested x86_64-pc-linux-gnu, OK for trunk?

gcc/ChangeLog:

PR c++/94492
* diagnostic.h (warning_enabled): Declare.
* diagnostic.c (diagnostic_enabled): Factor out from...
(diagnostic_report_diagnostic): ...here.
(warning_enabled): New.

gcc/cp/ChangeLog:

PR c++/94492
* decl2.c (cp_warn_deprecated_use): Check warning_enabled.

gcc/testsuite/ChangeLog:

PR c++/94492
* g++.dg/cpp0x/depr-copy4.C: New test.
---
  gcc/diagnostic.h    |  2 +
  gcc/cp/decl2.c  |  8 +--
  gcc/diagnostic.c    | 85 +
  gcc/testsuite/g++.dg/cpp0x/depr-copy4.C | 16 +
  4 files changed, 80 insertions(+), 31 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/depr-copy4.C

diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 9a6eefcf918..caa97da2df9 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -515,4 +515,6 @@ extern int num_digits (int);
  extern json::value *json_from_expanded_location (diagnostic_context 
*context,

   location_t loc);
+extern bool warning_enabled (int, location_t = input_location);
+
  #endif /* ! GCC_DIAGNOSTIC_H */
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index a82960fb39c..03b7a68aba2 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -5495,10 +5495,10 @@ cp_warn_deprecated_use (tree decl, 
tsubst_flags_t complain)

    && DECL_NONSTATIC_MEMBER_FUNCTION_P (decl)
    && copy_fn_p (decl))
  {
-  if (warn_deprecated_copy
-  /* Don't warn about system library classes (c++/86342).  */
-  && (!DECL_IN_SYSTEM_HEADER (decl)
-  || global_dc->dc_warn_system_headers))
+  /* Don't warn if the flag was disabled around the class definition
+ (c++/94492).  */
+  if (warning_enabled (OPT_Wdeprecated_copy,
+   DECL_SOURCE_LOCATION (decl)))
  {
    auto_diagnostic_group d;
    tree ctx = DECL_CONTEXT (decl);
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 246d75256cf..278ec8b706f 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -1122,6 +1122,62 @@ print_option_information (diagnostic_context 
*context,

  }
  }
+/* Returns whether a DIAGNOSTIC should be printed, and adjusts 
diagnostic->kind

+   as appropriate.  */
+
+static bool
+diagnostic_enabled (diagnostic_context *context,
+    diagnostic_info *diagnostic)
+{
+  /* Diagnostics with no option or -fpermissive are always enabled.  */
+  if (!diagnostic->option_index
+  || diagnostic->option_index == permissive_error_option (context))
+    return true;
+
+  /* This tests if the user provided the appropriate -Wfoo or
+ -Wno-foo option.  */
+  if (! context->option_enabled (diagnostic->option_index,
+ context->lang_mask,
+ context->option_state))
+    return false;
+
+  /* This tests for #pragma diagnostic changes.  */
+  diagnostic_t diag_class
+    = update_effective_level_from_pragmas (context, diagnostic);
+
+  /* This tests if the user provided the appropriate -Werror=foo
+ option.  */
+  if (diag_class == DK_UNSPECIFIED
+  && (context->classify_diagnostic[diagnostic->option_index]
+  != DK_UNSPECIFIED))
+    diagnostic->kind
+  = context->classify_diagnostic[diagnostic->option_index];
+
+  /* This allows for future extensions, like temporarily disabling
+ warnings for ranges of source code.  */
+  if (diagnostic->kind == DK_IGNORED)
+    return fal

Re: [PATCH] libgccjit: Handle truncation and extension for casts [PR 95498]

2021-05-13 Thread David Malcolm via Gcc-patches

On Thu, 2021-05-13 at 19:31 -0400, Antoni Boucher wrote:
> Thanks for your answer.
> 
> See my answers below:
> 
> On Thursday, 13 May 2021 at 18:13 -0400, David Malcolm wrote:
> > On Sat, 2021-02-20 at 17:17 -0500, Antoni Boucher via Gcc-patches
> > wrote:
> > > Hi.
> > > Thanks for your feedback!
> > > 
> > 
> > Sorry about the delay in responding.
> > 
> > In the past I was hesitant about adding more cast support to
> > libgccjit
> > since I felt that the user could always just create a union to do
> > the
> > cast.  Then I tried actually using the libgccjit API to do this,
> > and
> > realized how much work it adds, so I now think we do want to
> > support
> > casting more types.
> > 
> > 
> > > See answers below:
> > > 
> > > On Sat, Feb 20, 2021 at 11:20:35AM -0700, Tom Tromey wrote:
> > > > > > > > > "Antoni" == Antoni Boucher via Gcc-patches <   
> > > > > > > > > gcc-patches@gcc.gnu.org> writes:
> > > > 
> > > > Antoni> gcc/jit/
> > > > Antoni> PR target/95498
> > > > Antoni> * jit-playback.c: Add support to handle
> > > > truncation
> > > > and extension
> > > > Antoni> in the convert function.
> > > > 
> > > > Antoni> +  switch (dst_code)
> > > > Antoni> +    {
> > > > Antoni> +    case INTEGER_TYPE:
> > > > Antoni> +    case ENUMERAL_TYPE:
> > > > Antoni> +  t_ret = convert_to_integer (dst_type, expr);
> > > > Antoni> +  goto maybe_fold;
> > > > Antoni> +
> > > > Antoni> +    default:
> > > > Antoni> +  gcc_assert (gcc::jit::active_playback_ctxt);
> > > > Antoni> +  gcc::jit::active_playback_ctxt->add_error (NULL,
> > > > "unhandled conversion");
> > > > Antoni> +  fprintf (stderr, "input expression:\n");
> > > > Antoni> +  debug_tree (expr);
> > > > Antoni> +  fprintf (stderr, "requested type:\n");
> > > > Antoni> +  debug_tree (dst_type);
> > > > Antoni> +  return error_mark_node;
> > > > Antoni> +
> > > > Antoni> +    maybe_fold:
> > > > Antoni> +  if (TREE_CODE (t_ret) != C_MAYBE_CONST_EXPR)
> > 
> > Do we even get C_MAYBE_CONST_EXPR in libgccjit?  That tree code is
> > defined in c-family/c-common.def; how can nodes of that kind be
> > created
> > outside of the c-family?
> 
> I am not sure, but that seems like it's only created in c-family
> indeed.
> However, we do use it in libgccjit here:
> 
> https://github.com/gcc-mirror/gcc/blob/master/gcc/jit/jit-playback.c#L1180
> 
> > 
> > > > Antoni> +   t_ret = fold (t_ret);
> > > > Antoni> +  return t_ret;
> > > > 
> > > > It seems weird to have a single 'goto' to maybe_fold,
> > > > especially
> > > > inside
> > > > a switch like this.
> > > > 
> > > > If you think the maybe_fold code won't be reused, then it
> > > > should
> > > > just
> > > > be
> > > > hoisted up and the 'goto' removed.
> > > 
> > > This actually depends on how the support for cast between
> > > integers
> > > and 
> > > pointers will be implemented (see below).
> > > If we will support truncating pointers (does that even make
> > > sense?
> > > and
> > > I 
> > > guess we cannot extend a pointer unless we add the support for 
> > > uint128_t), that label will be reused for that case.
> > > Otherwise, it might not be reused.
> > > 
> > > So, please tell me which option to choose and I'll update my
> > > patch.
> > 
> > FWIW I don't think we'll want to support truncating or extending
> > pointers.
> 
> Ok, but do you think we'll want to support casts between integers and
> pointers?

Yes, though we probably want to reject truncating a pointer into a
smaller integer type.
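
As a rough sketch of that rule only (not libgccjit code): kind, type_desc and
cast_allowed are hypothetical names, and the choice to accept every
integer-to-pointer and pointer-to-pointer cast is an assumption made purely
for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified type descriptor; not a libgccjit or GCC type.
enum class kind { integer, pointer };

struct type_desc
{
  kind k;
  std::size_t bits;
};

// Returns true if a cast from SRC to DST is accepted under the rule
// sketched here: integer sources may be truncated or extended, but a
// pointer may only be cast to an integer at least as wide as the pointer.
inline bool
cast_allowed (type_desc src, type_desc dst)
{
  switch (src.k)
    {
    case kind::integer:
      // Assumption: any cast from an integer is accepted in this sketch.
      return true;
    case kind::pointer:
      if (dst.k == kind::integer)
        return dst.bits >= src.bits;   // reject truncating a pointer
      return true;
    }
  return false;
}

inline void
demo ()
{
  assert (cast_allowed ({ kind::integer, 64 }, { kind::integer, 8 }));
  assert (cast_allowed ({ kind::pointer, 64 }, { kind::integer, 64 }));
  assert (!cast_allowed ({ kind::pointer, 64 }, { kind::integer, 32 }));
}
```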

> I opened an issue about this
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95438) and would be
> willing to do a patch for it eventually.
> 
> > > 
> > > > On the other hand, if the maybe_fold code might be reused for
> > > > some
> > > > other
> > > > case, then I suppose I would have the case end with 'break' and
> > > > then
> > > > have this code outside the switch.
> > > > 
> > > > 
> > > > In another message, you wrote:
> > > > 
> > > > Antoni> For your question, the current code already works with
> > > > boolean and
> > > > Antoni> reals and casts between integers and pointers is
> > > > currently
> > > > not
> > > > Antoni> supported.
> > > > 
> > > > I am curious why this wasn't supported.  It seems like
> > > > something
> > > > that
> > > > one might want to do.
> > > 
> > > I have no idea as this is my first contribution to gcc.
> > > But this would be indeed very useful and I opened an issue about
> > > this: 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95438
> > > 
> > > > thanks,
> > > > Tom
> > > 
> > > Thanks!
> > > 
> > 
> > 
> 
>

Re: [PATCH v2] c++: Check attributes on friend declarations [PR99032]

2021-05-13 Thread Jason Merrill via Gcc-patches


On 5/13/21 6:08 PM, Marek Polacek wrote:

On Wed, May 12, 2021 at 08:27:18PM -0400, Jason Merrill wrote:

On 5/12/21 8:03 PM, Marek Polacek wrote:

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 89f874a32cc..2bcefb619aa 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1331,6 +1331,20 @@ any_dependent_type_attributes_p (tree attrs)
 return false;
   }
+/* True if ATTRS contains any attribute that requires a type.  */


Let's invert this to check if ATTRS contains any attribute that does *not*
require a type, and would therefore apply to the decl.


Sounds good, done.  Now I don't need to check *attrlist.
I've also fixed up the xfail thing in my new test.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
This patch implements [dcl.attr.grammar]/5: "If an attribute-specifier-seq
appertains to a friend declaration ([class.friend]), that declaration shall
be a definition."

This restriction applies to C++11-style attributes as well as GNU
attributes with the exception that we allow GNU attributes that require
a type, such as vector_size to continue accepting code as in attrib63.C.
There are various forms of friend declarations, we have friend
templates, C++11 extended friend declarations, and so on.  In some cases
we already ignore the attribute and warn that it was ignored.  But
certain cases weren't diagnosed, and with this patch we'll give a hard
error.  I tried hard not to emit both a warning and error and I think it
worked out.

Jason provided the cp_parser_decl_specifier_seq hunk to detect using
standard attributes in the middle of decl-specifiers, which is invalid.
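
A short illustration of the [dcl.attr.grammar]/5 rule quoted above, written
as a sketch rather than taken from the new tests: S, value_of and other are
hypothetical names, and the ill-formed declaration is left in a comment so
the example still compiles.

```cpp
#include <cassert>

struct S
{
  int v;

  // OK: the attribute appertains to a friend declaration that is also a
  // definition.
  [[nodiscard]] friend int value_of (const S &s) { return s.v; }

  // Ill-formed under [dcl.attr.grammar]/5, because this friend declaration
  // is not a definition:
  //   [[nodiscard]] friend int other (const S &);
};

inline int
demo ()
{
  S s = { 42 };
  int v = value_of (s);   // found by argument-dependent lookup
  assert (v == s.v);
  return v;
}
```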

Co-authored-by: Jason Merrill 

gcc/cp/ChangeLog:

PR c++/99032
* cp-tree.h (any_non_type_attribute_p): Declare.
* decl.c (grokdeclarator): Diagnose when an attribute appertains to
a friend declaration that is not a definition.
* decl2.c (any_non_type_attribute_p): New.
* parser.c (cp_parser_decl_specifier_seq): Diagnose standard attributes
in the middle of decl-specifiers.
(cp_parser_elaborated_type_specifier): Diagnose when an attribute
appertains to a friend declaration that is not a definition.
(cp_parser_member_declaration): Likewise.

gcc/testsuite/ChangeLog:

PR c++/99032
* g++.dg/cpp0x/friend7.C: New test.
* g++.dg/cpp0x/gen-attrs-4.C: Add dg-error.
* g++.dg/cpp0x/gen-attrs-39-1.C: Likewise.
* g++.dg/cpp0x/gen-attrs-74.C: New test.
* g++.dg/ext/attrib63.C: New test.
---
  gcc/cp/cp-tree.h|  1 +
  gcc/cp/decl.c   |  5 +++
  gcc/cp/decl2.c  | 14 
  gcc/cp/parser.c | 23 +++-
  gcc/testsuite/g++.dg/cpp0x/friend7.C| 40 +
  gcc/testsuite/g++.dg/cpp0x/gen-attrs-39-1.C |  3 +-
  gcc/testsuite/g++.dg/cpp0x/gen-attrs-4.C|  3 +-
  gcc/testsuite/g++.dg/cpp0x/gen-attrs-74.C   | 10 ++
  gcc/testsuite/g++.dg/ext/attrib63.C | 16 +
  9 files changed, 112 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/friend7.C
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/gen-attrs-74.C
  create mode 100644 gcc/testsuite/g++.dg/ext/attrib63.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 122dadf976f..580db914d40 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6763,6 +6763,7 @@ extern tree grokbitfield (const cp_declarator *, 
cp_decl_specifier_seq *,
  tree, tree, tree);
  extern tree splice_template_attributes(tree *, tree);
  extern bool any_dependent_type_attributes_p   (tree);
+extern bool any_non_type_attribute_p   (tree);
  extern tree cp_reconstruct_complex_type   (tree, tree);
  extern bool attributes_naming_typedef_ok  (tree);
  extern void cplus_decl_attributes (tree *, tree, int);
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index bc3928d7f85..17511f09e79 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -13741,6 +13741,11 @@ grokdeclarator (const cp_declarator *declarator,
  
  	if (friendp)

  {
+   if (attrlist && !funcdef_flag
+   /* Hack to allow attributes like vector_size on a friend.  */
+   && any_non_type_attribute_p (*attrlist))
+ error_at (id_loc, "attribute appertains to a friend "
+   "declaration that is not a definition");
/* Friends are treated specially.  */
if (ctype == current_class_type)
  ;  /* We already issued a permerror.  */
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 89f874a32cc..8e4dd6b544a 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1331,6 +1331,20 @@ any_dependent_type_attributes_p (tree attrs)
return false;
  }
  
+/* True if ATTRS contains any attribute that does not require a type.  */

+
+bool
+any_non_type_attribute_p (tree attrs)

Re: [PATCH RFA (diagnostic)] c++: -Wdeprecated-copy and #pragma diagnostic [PR94492]

2021-05-13 Thread Jason Merrill via Gcc-patches


On 5/13/21 7:38 PM, Martin Sebor wrote:

On 5/13/21 1:28 PM, Jason Merrill via Gcc-patches wrote:

Ping.

On 4/28/21 9:32 AM, Jason Merrill wrote:
-Wdeprecated-copy was depending only on the state of the warning at the
point where we call the function, making it hard to use #pragma diagnostic
to suppress the warning for a particular implicitly declared function.

But checking whether the warning is enabled at the location of the implicit
declaration turned out to be a bit complicated; option_enabled only tests
whether it was enabled at the start of compilation, the actual test only
existed in the middle of diagnostic_report_diagnostic.  So this patch
factors it out and adds a new warning_enabled function to diagnostic.h.


There is a bit of overlap in this patch with my work here:
  https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563862.html
but nothing that concerns me, for whatever it's worth.

In my ongoing work to extend TREE_NO_WARNING to more than one bit
I've been thinking of introducing a function (actually a pair of
them) similar to warning_enabled().  The overloads will take
a tree and gimple* rather than a location so that they can consider
the inlining context.  This is only useful in the middle end so
front ends can still use location_t.

Just one suggestion: since warning_at() takes location_t and int
for the option in that order, I would recommend doing the same
for warning_enabled(), just to reduce the risk of confusion.
(It would be nice if location_t could be something other than
an arithmetic type).


Sure.  I'd probably rename it to warning_enabled_at, in that case, and 
drop the default argument.



Tested x86_64-pc-linux-gnu, OK for trunk?

gcc/ChangeLog:

PR c++/94492
* diagnostic.h (warning_enabled): Declare.
* diagnostic.c (diagnostic_enabled): Factor out from...
(diagnostic_report_diagnostic): ...here.
(warning_enabled): New.

gcc/cp/ChangeLog:

PR c++/94492
* decl2.c (cp_warn_deprecated_use): Check warning_enabled.

gcc/testsuite/ChangeLog:

PR c++/94492
* g++.dg/cpp0x/depr-copy4.C: New test.
---
  gcc/diagnostic.h    |  2 +
  gcc/cp/decl2.c  |  8 +--
  gcc/diagnostic.c    | 85 +
  gcc/testsuite/g++.dg/cpp0x/depr-copy4.C | 16 +
  4 files changed, 80 insertions(+), 31 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/depr-copy4.C

diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 9a6eefcf918..caa97da2df9 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -515,4 +515,6 @@ extern int num_digits (int);
  extern json::value *json_from_expanded_location (diagnostic_context 
*context,

   location_t loc);
+extern bool warning_enabled (int, location_t = input_location);
+
  #endif /* ! GCC_DIAGNOSTIC_H */
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index a82960fb39c..03b7a68aba2 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -5495,10 +5495,10 @@ cp_warn_deprecated_use (tree decl, 
tsubst_flags_t complain)

    && DECL_NONSTATIC_MEMBER_FUNCTION_P (decl)
    && copy_fn_p (decl))
  {
-  if (warn_deprecated_copy
-  /* Don't warn about system library classes (c++/86342).  */
-  && (!DECL_IN_SYSTEM_HEADER (decl)
-  || global_dc->dc_warn_system_headers))
+  /* Don't warn if the flag was disabled around the class 
definition

+ (c++/94492).  */
+  if (warning_enabled (OPT_Wdeprecated_copy,
+   DECL_SOURCE_LOCATION (decl)))
  {
    auto_diagnostic_group d;
    tree ctx = DECL_CONTEXT (decl);
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 246d75256cf..278ec8b706f 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -1122,6 +1122,62 @@ print_option_information (diagnostic_context 
*context,

  }
  }
+/* Returns whether a DIAGNOSTIC should be printed, and adjusts 
diagnostic->kind

+   as appropriate.  */
+
+static bool
+diagnostic_enabled (diagnostic_context *context,
+    diagnostic_info *diagnostic)
+{
+  /* Diagnostics with no option or -fpermissive are always enabled.  */
+  if (!diagnostic->option_index
+  || diagnostic->option_index == permissive_error_option (context))
+    return true;
+
+  /* This tests if the user provided the appropriate -Wfoo or
+ -Wno-foo option.  */
+  if (! context->option_enabled (diagnostic->option_index,
+ context->lang_mask,
+ context->option_state))
+    return false;
+
+  /* This tests for #pragma diagnostic changes.  */
+  diagnostic_t diag_class
+    = update_effective_level_from_pragmas (context, diagnostic);
+
+  /* This tests if the user provided the appropriate -Werror=foo
+ option.  */
+  if (diag_class == DK_UNSPECIFIED
+  && (context->classify_diagnostic[diagnostic->option_index]
+  != DK_UNSPECIFIED))
+    diagnostic->kind
+  = context->classify_diagnostic[diagnostic->option_index];
+
+  /* This allows for future ex

[PATCH] libsanitizer: cherry-pick from upstream

2021-05-13 Thread H.J. Lu via Gcc-patches

On Thu, May 13, 2021 at 1:11 PM H.J. Lu  wrote:
>
> On Thu, May 13, 2021 at 1:01 PM H.J. Lu  wrote:
> >
> > On Thu, May 13, 2021 at 10:27 AM Martin Liška  wrote:
> > >
> > > On 5/13/21 5:54 PM, H.J. Lu wrote:
> > > > On Thu, May 13, 2021 at 09:28:01AM +0200, Martin Liška wrote:
> > > >> I'm planning to do merge from master twice a year.
> > > >> This merge was tested on x86_64-linux-gnu and ppc64le-linux-gnu
> > > >> and survives regression tests.
> > > >>
> > > >> Pushed to master.
> > > >> Thanks,
> > > >> Martin
> > > >>
> > > >> Merged revision: f58e0513dd95944b81ce7a6e7b49ba656de7d75f
> > > >
> > > > On Linux/x86-64, I got
> > > >
> > > > ../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:
> > > > >  In function 'void __sanitizer::InitTlsSize()':
> > > > > ../../../../../src-master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:209:55:
> > > > >  error: invalid conversion from '__sanitizer::uptr*' {aka 'long unsigned int*'} to 'size_t*' {aka 'unsigned int*'} [-fpermissive]
> > > > >   209 |   ((void (*)(size_t *, size_t *))get_tls_static_info)(&g_tls_size, &tls_align);
> > > > >       |                                                       ^~~
> > > > >       |                                                       |
> > > > >       |                                                       __sanitizer::uptr* {aka long unsigned int*}
> > > >
> > > >
> > > > H.J.
> > > >
> > >
> > > Hm, I can't reproduce it:
> > >
> > > /dev/shm/objdir/./gcc/xgcc -shared-libgcc -B/dev/shm/objdir/./gcc 
> > > -nostdinc++ -L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/src 
> > > -L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs 
> > > -L/dev/shm/objdir/x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs 
> > > -B/home/marxin/bin/gcc/x86_64-pc-linux-gnu/bin/ 
> > > -B/home/marxin/bin/gcc/x86_64-pc-linux-gnu/lib/ -isystem 
> > > /home/marxin/bin/gcc/x86_64-pc-linux-gnu/include -isystem 
> > > /home/marxin/bin/gcc/x86_64-pc-linux-gnu/sys-include -D_GNU_SOURCE 
> > > -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
> > > -D__STDC_LIMIT_MACROS -DHAVE_RPC_XDR_H=0 -DHAVE_TIRPC_RPC_XDR_H=0 -I. 
> > > -I/home/marxin/Programming/gcc/libsanitizer/sanitizer_common -I.. -I 
> > > /home/marxin/Programming/gcc/libsanitizer/include -I 
> > > /home/marxin/Programming/gcc/libsanitizer -isystem 
> > > /home/marxin/Programming/gcc/libsanitizer/include/system -Wall -W 
> > > -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC 
> > > -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer 
> > > -funwind-tables -fvisibility=hidden -Wno-variadic-macros 
> > > -I../../libstdc++-v3/include 
> > > -I../../libstdc++-v3/include/x86_64-pc-linux-gnu 
> > > -I/home/marxin/Programming/gcc/libsanitizer/../libstdc++-v3/libsupc++ 
> > > -std=gnu++14 -fcf-protection -mshstk -DSANITIZER_LIBBACKTRACE 
> > > -DSANITIZER_CP_DEMANGLE -I 
> > > /home/marxin/Programming/gcc/libsanitizer/../libbacktrace -I 
> > > ../libbacktrace -I /home/marxin/Programming/gcc/libsanitizer/../include 
> > > -include 
> > > /home/marxin/Programming/gcc/libsanitizer/libbacktrace/backtrace-rename.h 
> > > -g -O2 -D_GNU_SOURCE -MT sanitizer_linux_libcdep.lo -MD -MP -MF 
> > > .deps/sanitizer_linux_libcdep.Tpo -c 
> > > /home/marxin/Programming/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
> > >   -fPIC -DPIC -o .libs/sanitizer_linux_libcdep.o
> > >
> > > Can you please show full command line? And please attach a pre-processed 
> > > source file.
> > > Thanks,
> > > Martin
> >
> > The problem is -mx32 where size_t == unsigned int, not unsigned long int.
> >
>
> I am testing this patch:
>
> diff --git a/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
> b/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
> index da19d3d2ceb..4f9577a97e2 100644
> --- a/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
> +++ b/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
> @@ -197,7 +197,7 @@ __attribute__((unused)) static bool
> GetLibcVersion(int *major, int *minor,
>  __attribute__((unused)) static int g_use_dlpi_tls_data;
>
>  #if SANITIZER_GLIBC && !SANITIZER_GO
> -__attribute__((unused)) static uptr g_tls_size;
> +__attribute__((unused)) static size_t g_tls_size;
>  void InitTlsSize() {
>int major, minor, patch;
>g_use_dlpi_tls_data =

This is what I checked in.
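
A minimal sketch of why the type of the variable matters, assuming
stand-in names throughout (uptr_sketch and get_tls_static_info_sketch are
hypothetical and are not the sanitizer or glibc declarations).  The error
message above shows uptr as 'long unsigned int'; on an ABI where size_t is
'unsigned int', a pointer to one is not convertible to a pointer to the other,
even if both happen to have the same width.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Stand-in for a pointer-sized unsigned type spelled 'unsigned long'.
typedef unsigned long uptr_sketch;

// Stand-in for a callee that writes through size_t pointers.
inline void get_tls_static_info_sketch (std::size_t *size, std::size_t *align)
{
  *size = 128;
  *align = 16;
}

// True only on ABIs where the two typedefs name the same type; where this
// is false, passing a uptr_sketch* for a size_t* is ill-formed.
constexpr bool uptr_matches_size_t
  = std::is_same<uptr_sketch, std::size_t>::value;

inline std::size_t init_tls_size_sketch ()
{
  // Declaring the variables as size_t keeps the call well-formed on
  // every ABI, which mirrors the idea of the one-line fix.
  std::size_t tls_size = 0, tls_align = 0;
  get_tls_static_info_sketch (&tls_size, &tls_align);
  return tls_size;
}
```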

-- 
H.J.
From f3b1516d9dfd969d7cc1ca6f26dec13478a1c458 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 13 May 2021 18:23:55 -0700
Subject: [PATCH] libsanitizer: cherry-pick from upstream

cherry-pick:

72797dedb720 [sanitizer] Use size_t on g_tls_size to fix build on x32
---
 libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp b/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
index da19d3d2ceb..4f9577a97e2 10064

Re: [PATCH RFA] tree-iterator: C++11 range-for and tree_stmt_iterator

2021-05-13 Thread Jason Merrill via Gcc-patches


On 5/13/21 7:21 PM, Martin Sebor wrote:

On 5/13/21 1:26 PM, Jason Merrill via Gcc-patches wrote:

Ping.

On 5/1/21 12:29 PM, Jason Merrill wrote:
Like my recent patch to add ovl_range and lkp_range in the C++ front end,
this patch adds the tsi_range adaptor for using C++11 range-based 'for' with
a STATEMENT_LIST, e.g.

   for (tree stmt : tsi_range (stmt_list)) { ... }

This also involves adding some operators to tree_stmt_iterator that are
needed for range-for iterators, and should also be useful in code that uses
the iterators directly.

The patch updates the suitable loops in the C++ front end, but does not
touch any loops elsewhere in the compiler.
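
As a self-contained illustration of the pattern, here is a sketch that
mirrors the shape of the iterator and range adaptor on a toy linked list.
All of the *_sketch names are hypothetical stand-ins; none of this is GCC's
tree or STATEMENT_LIST machinery.  It only shows the minimum a range-based
'for' needs: begin(), end(), operator!=, prefix operator++ and operator*.

```cpp
#include <cassert>

// Toy doubly linked node and container (stand-ins for illustration).
struct node_sketch
{
  int stmt;
  node_sketch *next;
  node_sketch *prev;
};

struct list_sketch
{
  node_sketch *head;
};

// An aggregate iterator: no constructors, compared by pointer and container.
struct iter_sketch
{
  node_sketch *ptr;
  list_sketch *container;

  bool operator== (iter_sketch b) const
    { return b.ptr == ptr && b.container == container; }
  bool operator!= (iter_sketch b) const { return !(*this == b); }
  iter_sketch &operator++ () { ptr = ptr->next; return *this; }
  iter_sketch &operator-- () { ptr = ptr->prev; return *this; }
  int &operator* () { return ptr->stmt; }
};

// The range adaptor: end() is a null node pointer in the same container.
class range_sketch
{
  list_sketch *t;
 public:
  explicit range_sketch (list_sketch *t) : t (t) { }
  iter_sketch begin () const { return { t->head, t }; }
  iter_sketch end () const { return { nullptr, t }; }
};

// Example client: the loop body sees each element directly.
inline int sum_statements (list_sketch *l)
{
  int total = 0;
  for (int stmt : range_sketch (l))
    total += stmt;
  return total;
}
```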


I like the modernization of the loops.

I can't find anything terribly wrong with the iterator but let me
at least pick on some nits ;)



gcc/ChangeLog:

* tree-iterator.h (struct tree_stmt_iterator): Add operator++,
operator--, operator*, operator==, and operator!=.
(class tsi_range): New.

gcc/cp/ChangeLog:

* constexpr.c (build_data_member_initialization): Use tsi_range.
(build_constexpr_constructor_member_initializers): Likewise.
(constexpr_fn_retval, cxx_eval_statement_list): Likewise.
(potential_constant_expression_1): Likewise.
* coroutines.cc (await_statement_expander): Likewise.
(await_statement_walker): Likewise.
* module.cc (trees_out::core_vals): Likewise.
* pt.c (tsubst_expr): Likewise.
* semantics.c (set_cleanup_locs): Likewise.
---
  gcc/tree-iterator.h  | 28 +++-
  gcc/cp/constexpr.c   | 42 ++
  gcc/cp/coroutines.cc | 10 --
  gcc/cp/module.cc |  5 ++---
  gcc/cp/pt.c  |  5 ++---
  gcc/cp/semantics.c   |  5 ++---
  6 files changed, 47 insertions(+), 48 deletions(-)

diff --git a/gcc/tree-iterator.h b/gcc/tree-iterator.h
index 076fff8644c..f57456bb473 100644
--- a/gcc/tree-iterator.h
+++ b/gcc/tree-iterator.h
@@ -1,4 +1,4 @@
-/* Iterator routines for manipulating GENERIC tree statement list.
+/* Iterator routines for manipulating GENERIC tree statement list. 
-*- C++ -*-

 Copyright (C) 2003-2021 Free Software Foundation, Inc.
 Contributed by Andrew MacLeod  
@@ -32,6 +32,13 @@ along with GCC; see the file COPYING3.  If not see
  struct tree_stmt_iterator {
    struct tree_statement_list_node *ptr;
    tree container;


I assume the absence of ctors is intentional.  If so, I suggest
adding a comment explaining why.  Otherwise, I would provide one
(or as many as needed).


+
+  bool operator== (tree_stmt_iterator b) const
+    { return b.ptr == ptr && b.container == container; }
+  bool operator!= (tree_stmt_iterator b) const { return !(*this == 
b); }

+  tree_stmt_iterator &operator++ () { ptr = ptr->next; return *this; }
+  tree_stmt_iterator &operator-- () { ptr = ptr->prev; return *this; }


I would suggest adding postincrement and postdecrement.


+  tree &operator* () { return ptr->stmt; }


Given the pervasive lack of const-safety in GCC and the by-value
semantics of the iterator this probably isn't worth it but maybe
add a const overload.  operator-> would probably never be used.


  };
  static inline tree_stmt_iterator
@@ -71,27 +78,38 @@ tsi_one_before_end_p (tree_stmt_iterator i)
  static inline void
  tsi_next (tree_stmt_iterator *i)
  {
-  i->ptr = i->ptr->next;
+  ++(*i);
  }
  static inline void
  tsi_prev (tree_stmt_iterator *i)
  {
-  i->ptr = i->ptr->prev;
+  --(*i);
  }
  static inline tree *
  tsi_stmt_ptr (tree_stmt_iterator i)
  {
-  return &i.ptr->stmt;
+  return &(*i);
  }
  static inline tree
  tsi_stmt (tree_stmt_iterator i)
  {
-  return i.ptr->stmt;
+  return *i;
  }
+/* Make tree_stmt_iterator work as a C++ range, e.g.
+   for (tree stmt : tsi_range (stmt_list)) { ... }  */
+class tsi_range
+{
+  tree t;
+ public:
+  tsi_range (tree t): t(t) { }
+  tree_stmt_iterator begin() { return tsi_start (t); }
+  tree_stmt_iterator end() { return { nullptr, t }; }


Those member functions could be made const.


Sure:


+};
+
  enum tsi_iterator_update
  {
    TSI_NEW_STMT,    /* Only valid when single statement is 
added, move

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 9481a5bfd3c..260b0122f59 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -330,12 +330,9 @@ build_data_member_initialization (tree t, 
vec **vec)

  return false;
    if (TREE_CODE (t) == STATEMENT_LIST)
  {
-  tree_stmt_iterator i;
-  for (i = tsi_start (t); !tsi_end_p (i); tsi_next (&i))
-    {
-  if (! build_data_member_initialization (tsi_stmt (i), vec))
-    return false;
-    }
+  for (tree stmt : tsi_range (t))
+    if (! build_data_member_initialization (stmt, vec))
+  return false;
    return true;
  }
    if (TREE_CODE (t) == CLEANUP_STMT)
@@ -577,10 +574,9 @@ build_constexpr_constructor_member_initializers 
(tree type, tree body)

  break;
    case STATEMENT_LIST:
-    for (tree_stmt_iterator i = tsi_start (body);
- !

[PATCH] libstdc++: Fix wrong thread waking on notify [PR100334]

2021-05-13 Thread Thomas Rodgers

From: Thomas Rodgers 

Please ignore the previous patch. This one removes the need to carry any
extra state in the case of a 'laundered' atomic wait.

libstdc++/ChangeLog:
* include/bits/atomic_wait.h (__waiter::_M_do_wait_v): loop
until value change observed.
(__waiter_base::_M_laundered): New member function.
(__waiter_base::_M_notify): Check _M_laundered() to determine
whether to wake one or all.
(__detail::__atomic_compare): Return true if call to
__builtin_memcmp() == 0.
(__waiter_base::_S_do_spin_v): Adjust predicate.
* testsuite/29_atomics/atomic/wait_notify/100334.cc: New
test.
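
To make the two behavioural changes in the ChangeLog easier to follow, here
is a single-threaded sketch.  The names are hypothetical and the blocking
wait is replaced by a callback, so this is not the libstdc++ implementation;
it only models a comparison helper that returns true when the values are
equal, and a wait that re-checks after every wake-up and returns only once the
observed value differs from the old one.

```cpp
#include <cassert>
#include <cstring>

// Returns true if the two objects are bytewise equal (padding included).
template <typename T>
bool bytes_equal_sketch (const T &a, const T &b)
{
  return std::memcmp (&a, &b, sizeof (T)) == 0;
}

// `block` stands in for the platform wait and `load` for reading the
// atomic.  A wake-up that finds the value unchanged (for example one meant
// for another waiter sharing the same wait slot) just loops again.
template <typename T, typename Load, typename Block>
int wait_until_changed_sketch (const T &old, Load load, Block block)
{
  int wakeups = 0;
  do
    {
      block ();
      ++wakeups;
    }
  while (bytes_equal_sketch (old, load ()));
  return wakeups;
}
```

With a sequence of observed values 1, 1, 2 and an old value of 1, the
function returns after the third wake-up.
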
---
 libstdc++-v3/include/bits/atomic_wait.h   | 28 --
 .../29_atomics/atomic/wait_notify/100334.cc   | 94 +++
 2 files changed, 114 insertions(+), 8 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc

diff --git a/libstdc++-v3/include/bits/atomic_wait.h 
b/libstdc++-v3/include/bits/atomic_wait.h
index 984ed70f16c..07bb744d822 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -181,11 +181,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return false;
   }
 
+// return true if equal
 template
   bool __atomic_compare(const _Tp& __a, const _Tp& __b)
   {
// TODO make this do the correct padding bit ignoring comparison
-   return __builtin_memcmp(&__a, &__b, sizeof(_Tp)) != 0;
+   return __builtin_memcmp(&__a, &__b, sizeof(_Tp)) == 0;
   }
 
 struct __waiter_pool_base
@@ -300,14 +301,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  explicit __waiter_base(const _Up* __addr) noexcept
: _M_w(_S_for(__addr))
, _M_addr(_S_wait_addr(__addr, &_M_w._M_ver))
- {
- }
+ { }
+
+   bool
+   _M_laundered() const
+   { return _M_addr == &_M_w._M_ver; }
 
void
_M_notify(bool __all, bool __bare = false)
{
- if (_M_addr == &_M_w._M_ver)
-   __atomic_fetch_add(_M_addr, 1, __ATOMIC_ACQ_REL);
+ if (_M_laundered())
+   {
+ __atomic_fetch_add(_M_addr, 1, __ATOMIC_ACQ_REL);
+ __all = true;
+   }
  _M_w._M_notify(_M_addr, __all, __bare);
}
 
@@ -320,7 +327,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _Spin __spin = _Spin{ })
  {
auto const __pred = [=]
- { return __detail::__atomic_compare(__old, __vfn()); };
+ { return !__detail::__atomic_compare(__old, __vfn()); };
 
if constexpr (__platform_wait_uses_type<_Up>)
  {
@@ -387,7 +394,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__platform_wait_t __val;
if (__base_type::_M_do_spin_v(__old, __vfn, __val))
  return;
-   __base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
+
+   do
+ {
+   __base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
+ }
+   while (__detail::__atomic_compare(__old, __vfn()));
  }
 
template
@@ -452,7 +464,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __atomic_notify_address(const _Tp* __addr, bool __all) noexcept
 {
   __detail::__bare_wait __w(__addr);
-  __w._M_notify(__all, true);
+  __w._M_notify(__all);
 }
 
   // This call is to be used by atomic types which track contention externally
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc
new file mode 100644
index 000..3e63eca42fa
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc
@@ -0,0 +1,94 @@
+// { dg-options "-std=gnu++2a" }
+// { dg-do run { target c++2a } }
+// { dg-require-gthreads "" }
+// { dg-additional-options "-pthread" { target pthread } }
+// { dg-add-options libatomic }
+
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+#include 
+#include 
+
+#include 
+
+template 
+struct atomics_sharing_same_waiter
+{
+   std::atomic tmp[49 * 4] = {};
+   std::atomic* a[4] = {
+  { &tmp[0] },
+  { &tmp[16 * 4] },
+  { &tmp[32 * 4] },
+  { &tmp[48 * 4] }

Re: [PATCH] libgccjit: add some reflection functions in the jit C api

2021-05-13 Thread Antoni Boucher via Gcc-patches

Thanks for your reviews.

I attached the new patch to this email.

See answers below:

On Thursday, 13 May 2021 at 17:30 -0400, David Malcolm wrote:
> On Tue, 2020-11-03 at 17:13 -0500, Antoni Boucher wrote:
> > I was missing a check in gcc_jit_struct_get_field, I added it in
> > this
> > new patch.
> > 
> 
> Sorry about the long delay in reviewing this patch.
> 
> The main high-level points are:
> - currently the get_*_count functions return "ssize_t" - why?  Only
> unsigned values are meaningful; shouldn't they return "size_t"
> instead?

For those, I had this question in a previous email:

It seemed off to return NULL from the functions returning a
size_t to indicate an error, so I changed them to return -1 (and the
return type to ssize_t). Is that the proper way to indicate an error?

Once I know the answer for this error handling question, I'll fix the
types.
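
For reference, a small sketch of the two shapes under discussion.  The
struct and function names are hypothetical and are not libgccjit entrypoints.
The first function is the signed-return style with -1 as the error value; the
second keeps an unsigned count and reports failure separately, which is only
one possible alternative and not something proposed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/types.h>   // ssize_t (POSIX)

// Toy object with a parameter count (stand-in for illustration).
struct function_sketch
{
  std::size_t num_params;
};

// Signed return: -1 signals an error such as a NULL argument.
inline ssize_t get_param_count_signed (const function_sketch *func)
{
  if (!func)
    return -1;
  return static_cast<ssize_t> (func->num_params);
}

// Unsigned count with the error reported through the return value.
inline bool get_param_count_checked (const function_sketch *func,
				     std::size_t *out)
{
  if (!func || !out)
    return false;
  *out = func->num_params;
  return true;
}
```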

> - the various "lookup by index" functions take "int" i.e. signed, but
> only >= 0 is meaningful.  I think it makes sense to make them take
> size_t instead.

That's fixed in the new patch.

> Sorry if we covered that before in the review, it's been a while.
> 
> Various nitpicks inline below...
> 
> [...snip...]
>  
> > diff --git a/gcc/jit/docs/topics/compatibility.rst
> > b/gcc/jit/docs/topics/compatibility.rst
> > index 6bfa101ed71..236e5c72d81 100644
> > --- a/gcc/jit/docs/topics/compatibility.rst
> > +++ b/gcc/jit/docs/topics/compatibility.rst
> > @@ -226,3 +226,44 @@ entrypoints:
> >  
> >  ``LIBGCCJIT_ABI_14`` covers the addition of
> >  :func:`gcc_jit_global_set_initializer`
> > +
> > +.. _LIBGCCJIT_ABI_15:
> > +
> > +``LIBGCCJIT_ABI_15``
> > +
> > +``LIBGCCJIT_ABI_15`` covers the addition of reflection functions via
> > API
> > +entrypoints:
> 
> This needs updating, as I used LIBGCCJIT_ABI_15 for inline asm.

This was updated for the new patch.

> [...snip...]
> 
> > diff --git a/gcc/jit/docs/topics/functions.rst
> > b/gcc/jit/docs/topics/functions.rst
> > index eb40d64010e..aa6de87282d 100644
> > --- a/gcc/jit/docs/topics/functions.rst
> > +++ b/gcc/jit/docs/topics/functions.rst
> > @@ -171,6 +171,16 @@ Functions
> >     underlying string, so it is valid to pass in a pointer to an on-
> > stack
> >     buffer.
> >  
> > +.. function::  ssize_t \
> > +   gcc_jit_function_get_param_count (gcc_jit_function
> > *func)
> > +
> > +   Get the number of parameters of the function.
> > +
> > +.. function::  gcc_jit_type \*
> > +   gcc_jit_function_get_return_type (gcc_jit_function
> > *func)
> > +
> > +   Get the return type of the function.
> 
> As noted before, this doesn't yet document all the new entrypoints; I
> think you wanted to hold off until all the details were thrashed out,
> but hopefully we're close.
> 
> The documentation for an entrypoint should specify which ABI it was
> added in.

The documentation was added in the new patch.

> [...snip...]
> 
> > +/* Public entrypoint.  See description in libgccjit.h.
> > +
> > +   After error-checking, the real work is done by the
> > +   gcc::jit::recording::type::is_struct method, in
> > +   jit-recording.c.  */
> > +
> > +gcc_jit_struct *
> > +gcc_jit_type_is_struct (gcc_jit_type *type)
> > +{
> > +  RETURN_NULL_IF_FAIL (type, NULL, NULL, "NULL type");
> > +  gcc::jit::recording::struct_ *struct_type = type->is_struct ();
> > +  return (gcc_jit_struct *)struct_type;
> > +}
> > +
> > +/* Public entrypoint.  See description in libgccjit.h.
> > +
> > +   After error-checking, the real work is done by the
> > +   gcc::jit::recording::vector_type::get_num_units method, in
> > +   jit-recording.c.  */
> > +
> > +ssize_t
> > +gcc_jit_vector_type_get_num_units (gcc_jit_vector_type
> > *vector_type)
> > +{
> > +  RETURN_VAL_IF_FAIL (vector_type, -1, NULL, NULL, "NULL
> > vector_type");
> > +  return vector_type->get_num_units ();
> > +}
> > +
> > +/* Public entrypoint.  See description in libgccjit.h.
> > +
> > +   After error-checking, the real work is done by the
> > +   gcc::jit::recording::vector_type::get_element_type method, in
> > +   jit-recording.c.  */
> > +
> > +gcc_jit_type *
> > +gcc_jit_vector_type_get_element_type (gcc_jit_vector_type
> > *vector_type)
> > +{
> > +  RETURN_NULL_IF_FAIL (vector_type, NULL, NULL, "NULL
> > vector_type");
> > +  return (gcc_jit_type *)vector_type->get_element_type ();
> > +}
> > +
> > +/* Public entrypoint.  See description in libgccjit.h.
> > +
> > +   After error-checking, the real work is done by the
> > +   gcc::jit::recording::type::unqualified method, in
> > +   jit-recording.c.  */
> > +
> > +gcc_jit_type *
> > +gcc_jit_type_unqualified (gcc_jit_type *type)
> > +{
> > +  RETURN_NULL_IF_FAIL (type, NULL, NULL, "NULL type");
> > +
> > +  return (gcc_jit_type *)type->unqualified ();
> > +}
> > +
> > +/* Public entrypoint.  See description in libgccjit.h.
> > +
> > +   After error-checking, the real work is done by the
> > +   gcc::jit::recording::type::dyn_cast_function_type method, in
>

Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]

2021-05-13 Thread Hongtao Liu via Gcc-patches

On Thu, May 13, 2021 at 7:52 PM Richard Sandiford
 wrote:
>
> Jakub Jelinek  writes:
> > On Thu, May 13, 2021 at 12:32:26PM +0100, Richard Sandiford wrote:
> >> Jakub Jelinek  writes:
> >> > On Thu, May 13, 2021 at 11:43:19AM +0200, Uros Bizjak wrote:
> >> >> > >   Bootstrapped and regtested on X86_64-linux-gnu{-m32,}
> >> >> > >   Ok for trunk?
> >> >> >
> >> >> > Some time ago a support for CLOBBER_HIGH RTX was added (and later
> >> >> > removed for some reason). Perhaps we could resurrect the patch for the
> >> >> > purpose of ferrying 128bit modes via vzeroupper RTX?
> >> >>
> >> >> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-11/msg01325.html
> >> >
> >> > https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01468.html
> >> > is where it got removed, CCing Richard.
> >>
> >> Yeah.  Initially clobber_high seemed like the best approach for
> >> handling the tlsdesc thing, but in practice it was too difficult
> >> to shoe-horn the concept in after the fact, when so much rtl
> >> infrastructure wasn't prepared to deal with it.  The old support
> >> didn't handle all cases and passes correctly, and handled others
> >> suboptimally.
> >>
> >> I think it would be worth using the same approach as
> >> https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01466.html for
> >> vzeroupper: represent the instructions as call_insns in which the
> >> call has a special vzeroupper ABI.  I think that's likely to lead
> >> to better code than clobber_high would (or at least, it did for tlsdesc).

From an implementation perspective, I guess you mean we should
implement TARGET_INSN_CALLEE_ABI and TARGET_FNTYPE_ABI in the i386
backend.

> >
> > Perhaps a magic call_insn that is split post-reload into a normal insn
> > with the sets then?
>
> I'd be tempted to treat it as a call_insn throughout.  The unspec_volatile
> means that we can't move the instruction, so converting a call_insn to an
> insn isn't likely to help from that point of view.  The sets are also
> likely to be handled suboptimally compared to the more accurate register
> information attached to the call: all code that handles calls has to be
> prepared to deal with partial clobbers, whereas most code dealing with
> sets will assume that the set does useful work, and that the rhs of the
> set is live.
>
> Thanks,
> Richard
>


-- 
BR,
Hongtao

[PATCH] go/100537 - Bootstrap-O3 and bootstrap-debug fail

2021-05-13 Thread Jiufu Guo via Gcc-patches

As discussed in the PR, Richard described how to find which VAR
was not marked TREE_ADDRESSABLE and so caused this failure.  The
culprit is address_expression, which builds an addr_expr (via
build_fold_addr_expr_loc) without setting TREE_ADDRESSABLE.

I drafted this patch based on Richard's comments in the PR, though
I'm not sure whether anything more needs to be done.  Please review,
thanks!

Bootstrap and regtest pass on ppc64le. Is this ok for trunk?

Jiufu Guo.

2021-05-14  Richard Biener  
Jiufu Guo 

PR go/100537
* go-gcc.cc
(Gcc_backend::address_expression): Set TREE_ADDRESSABLE.

---
 gcc/go/go-gcc.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/go/go-gcc.cc b/gcc/go/go-gcc.cc
index 5d9dbb5d068..8ed20a3b479 100644
--- a/gcc/go/go-gcc.cc
+++ b/gcc/go/go-gcc.cc
@@ -1680,6 +1680,7 @@ Gcc_backend::address_expression(Bexpression* bexpr, 
Location location)
   if (expr == error_mark_node)
 return this->error_expression();
 
+  TREE_ADDRESSABLE (expr) = 1;
   tree ret = build_fold_addr_expr_loc(location.gcc_location(), expr);
   return this->make_expression(ret);
 }
-- 
2.17.1

[PATCHv2 0/4] ROP support

2021-05-13 Thread Bill Schmidt via Gcc-patches

This is version 2 of the ROP support patch, addressing comments by
Will Schmidt and Segher Boessenkool.  I've attempted to implement
all of your excellent suggestions; otherwise the series is unchanged.
I decided to repost the whole series rather than just the patches
needing further approval, since all have changed.

Add POWER10 support for hashst[p] and hashchk[p] operations.  When
the -mrop-protect option is selected, any function that loads the link
register from memory before returning must have protection in the
prologue and epilogue to ensure the link register save location has
not been compromised.  If -mprivileged is also specified, the
protection instructions generated require supervisor privilege.

The patches are broken up into logical chunks:
 - Option handling
 - Instruction generation
 - Predefined macro handling
 - Test cases
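
As a rough illustration of how user code might consume the predefined
macro, here is a sketch.  It assumes that __ROP_PROTECT__ (the name comes from
the title of patch 3/4) is defined exactly when -mrop-protect is in effect;
the helper function name is hypothetical.

```cpp
#include <cassert>

// Reports whether this translation unit was compiled with the macro
// defined.  On compilers or targets without the option it is simply false.
inline bool built_with_rop_protect ()
{
#ifdef __ROP_PROTECT__
  return true;
#else
  return false;
#endif
}
```

Code that wants to require the protection could turn the #else branch into
an #error instead.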

Bootstrapped and tested on a POWER10 system with no regressions.
Tests on a kernel that enables user-space ROP mitigation were
successful.  Is this series ok for trunk?  I would also like to
later backport these patches to GCC for the 11.2 release.

Thanks!
Bill

Bill Schmidt (4):
  rs6000: Add -mrop-protect and -mprivileged flags
  rs6000: Emit ROP-mitigation instructions in prologue and epilogue
  rs6000: Conditionally define __ROP_PROTECT__
  rs6000: Add ROP tests

 gcc/config/rs6000/rs6000-c.c |  3 +
 gcc/config/rs6000/rs6000-internal.h  |  2 +
 gcc/config/rs6000/rs6000-logue.c | 74 +---
 gcc/config/rs6000/rs6000.c   |  4 ++
 gcc/config/rs6000/rs6000.md  | 47 +++
 gcc/config/rs6000/rs6000.opt |  8 +++
 gcc/doc/invoke.texi  | 20 ++-
 gcc/testsuite/gcc.target/powerpc/rop-1.c | 17 ++
 gcc/testsuite/gcc.target/powerpc/rop-2.c | 17 ++
 gcc/testsuite/gcc.target/powerpc/rop-3.c | 18 ++
 gcc/testsuite/gcc.target/powerpc/rop-4.c | 15 +
 gcc/testsuite/gcc.target/powerpc/rop-5.c | 13 +
 12 files changed, 229 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/rop-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/rop-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/rop-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/rop-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/rop-5.c

-- 
2.27.0

[PATCH 1/4] rs6000: Add -mrop-protect and -mprivileged flags

2021-05-13 Thread Bill Schmidt via Gcc-patches

2021-05-13  Bill Schmidt  

gcc/
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Disable shrink wrap when inserting ROP-protect instructions.
* config/rs6000/rs6000.opt (mrop-protect): New option.
(mprivileged): Likewise.
* doc/invoke.texi: Document mrop-protect and mprivileged.
---
 gcc/config/rs6000/rs6000.c   |  4 
 gcc/config/rs6000/rs6000.opt |  8 
 gcc/doc/invoke.texi  | 20 ++--
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index d1b76f6ec41..53a9f5411c7 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4040,6 +4040,10 @@ rs6000_option_override_internal (bool global_init_p)
   && ((rs6000_isa_flags_explicit & OPTION_MASK_QUAD_MEMORY_ATOMIC) == 0))
 rs6000_isa_flags |= OPTION_MASK_QUAD_MEMORY_ATOMIC;
 
+  /* If we are inserting ROP-protect instructions, disable shrink wrap.  */
+  if (rs6000_rop_protect)
+flag_shrink_wrap = 0;
+
   /* If we can shrink-wrap the TOC register save separately, then use
  -msave-toc-indirect unless explicitly disabled.  */
   if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 0dbdf753673..f66ef20a102 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -619,3 +619,11 @@ Generate (do not generate) MMA instructions.
 
 mrelative-jumptables
 Target Undocumented Var(rs6000_relative_jumptables) Init(1) Save
+
+mrop-protect
+Target Var(rs6000_rop_protect) Init(0)
+Enable instructions that guard against return-oriented programming attacks.
+
+mprivileged
+Target Var(rs6000_privileged) Init(0)
+Enable generation of instructions that require privileged state.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 519881509a6..92549524583 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1240,7 +1240,8 @@ See RS/6000 and PowerPC Options.
 -mgnu-attribute  -mno-gnu-attribute @gol
 -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{reg} @gol
 -mstack-protector-guard-offset=@var{offset} -mprefixed -mno-prefixed @gol
--mpcrel -mno-pcrel -mmma -mno-mmma}
+-mpcrel -mno-pcrel -mmma -mno-mmma -mrop-protect -mno-rop-protect @gol
+-mprivileged -mno-privileged}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
@@ -27029,7 +27030,8 @@ following options:
 -mmulhw  -mdlmzb  -mmfpgpr  -mvsx @gol
 -mcrypto  -mhtm  -mpower8-fusion  -mpower8-vector @gol
 -mquad-memory  -mquad-memory-atomic  -mfloat128 @gol
--mfloat128-hardware -mprefixed -mpcrel -mmma}
+-mfloat128-hardware -mprefixed -mpcrel -mmma @gol
+-mrop-protect}
 
 The particular options set for any particular CPU varies between
 compiler versions, depending on what setting seems to produce optimal
@@ -28034,6 +28036,20 @@ store instructions when the option 
@option{-mcpu=future} is used.
 Generate (do not generate) the MMA instructions when the option
 @option{-mcpu=future} is used.
 
+@item -mrop-protect
+@itemx -mno-rop-protect
+@opindex mrop-protect
+@opindex mno-rop-protect
+Generate (do not generate) ROP protection instructions when the target
+processor supports them.  Currently this option disables the shrink-wrap
+optimization (@option{-fshrink-wrap}).
+
+@item -mprivileged
+@itemx -mno-privileged
+@opindex mprivileged
+@opindex mno-privileged
+Generate (do not generate) instructions for privileged state.
+
 @item -mblock-ops-unaligned-vsx
 @itemx -mno-block-ops-unaligned-vsx
 @opindex block-ops-unaligned-vsx
-- 
2.27.0

[PATCH 2/4] rs6000: Emit ROP-mitigation instructions in prologue and epilogue

2021-05-13 Thread Bill Schmidt via Gcc-patches

2021-05-13  Bill Schmidt  

gcc/
* config/rs6000/rs6000-internal.h (rs6000_stack): Add
rop_hash_save_offset and rop_hash_size.
* config/rs6000/rs6000-logue.c (rs6000_stack_info): Compute
rop_hash_size and rop_hash_save_offset.
(debug_stack_info): Dump rop_hash_save_offset and rop_hash_size.
(rs6000_emit_prologue): Emit hashst[p] in prologue.
(rs6000_emit_epilogue): Emit hashchk[p] in epilogue.
* config/rs6000/rs6000.md (unspec): Add UNSPEC_HASHST and
UNSPEC_HASHCHK.
(hashst): New define_insn.
(hashchk): Likewise.
---
 gcc/config/rs6000/rs6000-internal.h |  2 +
 gcc/config/rs6000/rs6000-logue.c| 74 ++---
 gcc/config/rs6000/rs6000.md | 47 ++
 3 files changed, 116 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-internal.h b/gcc/config/rs6000/rs6000-internal.h
index 428a7861a98..88cf9bd5692 100644
--- a/gcc/config/rs6000/rs6000-internal.h
+++ b/gcc/config/rs6000/rs6000-internal.h
@@ -39,6 +39,7 @@ typedef struct rs6000_stack {
   int gp_save_offset;  /* offset to save GP regs from initial SP */
   int fp_save_offset;  /* offset to save FP regs from initial SP */
   int altivec_save_offset; /* offset to save AltiVec regs from initial SP */
+  int rop_hash_save_offset;/* offset to save ROP hash from initial SP */
   int lr_save_offset;  /* offset to save LR from initial SP */
   int cr_save_offset;  /* offset to save CR from initial SP */
   int vrsave_save_offset;  /* offset to save VRSAVE from initial SP */
@@ -53,6 +54,7 @@ typedef struct rs6000_stack {
   int gp_size; /* size of saved GP registers */
   int fp_size; /* size of saved FP registers */
   int altivec_size;/* size of saved AltiVec registers */
+  int rop_hash_size;   /* size of ROP hash slot */
   int cr_size; /* size to hold CR if not in fixed area */
   int vrsave_size; /* size to hold VRSAVE */
   int altivec_padding_size;/* size of altivec alignment padding */
diff --git a/gcc/config/rs6000/rs6000-logue.c b/gcc/config/rs6000/rs6000-logue.c
index b0ac183ceff..13c00e740d6 100644
--- a/gcc/config/rs6000/rs6000-logue.c
+++ b/gcc/config/rs6000/rs6000-logue.c
@@ -595,19 +595,21 @@ rs6000_savres_strategy (rs6000_stack_t *info,
         +---------------------------------------+
         | Parameter save area (+padding*) (P)   |  32
         +---------------------------------------+
-        | Alloca space (A)                      |  32+P
+        | Optional ROP hash slot (R)            |  32+P
         +---------------------------------------+
-        | Local variable space (L)              |  32+P+A
+        | Alloca space (A)                      |  32+P+R
         +---------------------------------------+
-        | Save area for AltiVec registers (W)   |  32+P+A+L
+        | Local variable space (L)              |  32+P+R+A
         +---------------------------------------+
-        | AltiVec alignment padding (Y)         |  32+P+A+L+W
+        | Save area for AltiVec registers (W)   |  32+P+R+A+L
         +---------------------------------------+
-        | Save area for GP registers (G)        |  32+P+A+L+W+Y
+        | AltiVec alignment padding (Y)         |  32+P+R+A+L+W
         +---------------------------------------+
-        | Save area for FP registers (F)        |  32+P+A+L+W+Y+G
+        | Save area for GP registers (G)        |  32+P+R+A+L+W+Y
         +---------------------------------------+
-old SP->| back chain to caller's caller         |  32+P+A+L+W+Y+G+F
+        | Save area for FP registers (F)        |  32+P+R+A+L+W+Y+G
+        +---------------------------------------+
+old SP->| back chain to caller's caller         |  32+P+R+A+L+W+Y+G+F
         +---------------------------------------+
 
  * If the alloca area is present, the parameter save area is
@@ -716,6 +718,19 @@ rs6000_stack_info (void)
 
   /* Does this function call anything (apart from sibling calls)?  */
   info->calls_p = (!crtl->is_leaf || cfun->machine->ra_needs_full_frame);
+  info->rop_hash_size = 0;
+
+  if (TARGET_POWER10
+  && info->calls_p
+  && DEFAULT_ABI == ABI_ELFv2
+  && rs6000_rop_protect)
+info->rop_hash_size = 8;
+  else if (rs6000_rop_protect && DEFAULT_ABI != ABI_ELFv2)
+{
+  /* We can't check this in rs6000_option_override_internal since
+DEFAULT_ABI isn't established yet.  */
+  error ("%qs requires the ELFv2 ABI", "-mrop-protect");
+}
 
   /* Determine if we need to save the condition code registers.  */
   if (save_reg_p (CR2_REGNO)
@@ -808,6 +823,11 @@ rs6000_stack_info (void)
 
  /* Adjust for AltiVec case.  */
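
As a rough illustration of the rs6000_stack_info change above (a sketch
only, not the GCC code): the ROP hash slot is 8 bytes when all four
conditions in the patch hold and 0 otherwise, and the areas that follow
it in the layout comment shift by that amount.  The struct and function
names below are hypothetical stand-ins invented for this example.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for TARGET_POWER10, info->calls_p,
   DEFAULT_ABI == ABI_ELFv2 and rs6000_rop_protect.  */
struct rop_inputs
{
  bool power10;
  bool calls_p;
  bool elfv2;
  bool rop_protect;
};

/* Size of the optional ROP hash slot (R): 8 bytes when every condition
   from the patch holds, otherwise 0.  The real code also reports an
   error for -mrop-protect without ELFv2; that is omitted here.  */
static inline int
rop_hash_size (struct rop_inputs in)
{
  return (in.power10 && in.calls_p && in.elfv2 && in.rop_protect) ? 8 : 0;
}

/* Offsets as annotated in the layout comment: the ROP slot sits at
   32+P and the alloca space moves from 32+P to 32+P+R.  */
static inline int
rop_slot_offset (int p)
{
  return 32 + p;
}

static inline int
alloca_offset (int p, int r)
{
  return 32 + p + r;
}
```

With a 64-byte parameter save area, a function that makes calls would
get the alloca space at 32+64+8, while a leaf function (R = 0) keeps it
at 32+64.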

[PATCH 3/4] rs6000: Conditionally define __ROP_PROTECT__

2021-05-13 Thread Bill Schmidt via Gcc-patches

2021-05-13  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define
__ROP_PROTECT__ if -mrop-protect is selected.
---
 gcc/config/rs6000/rs6000-c.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 0f8a629ff5a..afcb5bb6e39 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -602,6 +602,9 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags,
   /* Whether pc-relative code is being generated.  */
   if ((flags & OPTION_MASK_PCREL) != 0)
 rs6000_define_or_undefine_macro (define_p, "__PCREL__");
+  /* Tell the user -mrop-protect is in play.  */
+  if (rs6000_rop_protect)
+rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
 }
 
 void
-- 
2.27.0
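
For illustration, user code could key off the new predefined macro
roughly like this.  This is a minimal sketch that assumes only what the
patch states (__ROP_PROTECT__ is defined when -mrop-protect is
selected); the function name is made up for the example.

```c
/* Returns 1 when this translation unit was compiled with -mrop-protect
   (which, per the patch, predefines __ROP_PROTECT__), and 0 otherwise.
   rop_protect_enabled is a hypothetical helper, not part of GCC.  */
static inline int
rop_protect_enabled (void)
{
#ifdef __ROP_PROTECT__
  return 1;
#else
  return 0;
#endif
}
```

On a compiler or target without the option the macro is simply
undefined, so the helper returns 0.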

[PATCH 4/4] rs6000: Add ROP tests

2021-05-13 Thread Bill Schmidt via Gcc-patches

2021-05-13  Bill Schmidt  

gcc/testsuite/
* gcc.target/powerpc/rop-1.c: New.
* gcc.target/powerpc/rop-2.c: New.
* gcc.target/powerpc/rop-3.c: New.
* gcc.target/powerpc/rop-4.c: New.
* gcc.target/powerpc/rop-5.c: New.
---
 gcc/testsuite/gcc.target/powerpc/rop-1.c | 17 +
 gcc/testsuite/gcc.target/powerpc/rop-2.c | 17 +
 gcc/testsuite/gcc.target/powerpc/rop-3.c | 18 ++
 gcc/testsuite/gcc.target/powerpc/rop-4.c | 15 +++
 gcc/testsuite/gcc.target/powerpc/rop-5.c | 13 +
 5 files changed, 80 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/rop-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/rop-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/rop-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/rop-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/rop-5.c

diff --git a/gcc/testsuite/gcc.target/powerpc/rop-1.c b/gcc/testsuite/gcc.target/powerpc/rop-1.c
new file mode 100644
index 000..8cedcb6668a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/rop-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect" } */
+/* { dg-require-effective-target powerpc_elfv2 } */
+
+/* Verify that ROP-protect instructions are inserted when a
+   call is present.  */
+
+extern void foo (void);
+
+int bar ()
+{
+  foo ();
+  return 5;
+}
+
+/* { dg-final { scan-assembler-times {\mhashst\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mhashchk\M} 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/rop-2.c b/gcc/testsuite/gcc.target/powerpc/rop-2.c
new file mode 100644
index 000..c556952aec1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/rop-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect -mprivileged" } */
+/* { dg-require-effective-target powerpc_elfv2 } */
+
+/* Verify that privileged ROP-protect instructions are inserted when a
+   call is present.  */
+
+extern void foo (void);
+
+int bar ()
+{
+  foo ();
+  return 5;
+}
+
+/* { dg-final { scan-assembler-times {\mhashstp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mhashchkp\M} 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/rop-3.c b/gcc/testsuite/gcc.target/powerpc/rop-3.c
new file mode 100644
index 000..8d03792e3e5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/rop-3.c
@@ -0,0 +1,18 @@
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-require-effective-target powerpc_elfv2 } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect" } */
+
+/* Verify that ROP-protect instructions execute correctly when a
+   call is present.  */
+
+void __attribute__((noipa)) foo ()
+{
+  asm ("");
+}
+
+int main ()
+{
+  foo ();
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/rop-4.c b/gcc/testsuite/gcc.target/powerpc/rop-4.c
new file mode 100644
index 000..dcf47c63fb7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/rop-4.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect" } */
+/* { dg-require-effective-target powerpc_elfv2 } */
+
+/* Verify that no ROP-protect instructions are inserted when no
+   call is present.  */
+
+
+int bar ()
+{
+  return 5;
+}
+
+/* { dg-final { scan-assembler-not {\mhashst\M} } } */
+/* { dg-final { scan-assembler-not {\mhashchk\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/rop-5.c b/gcc/testsuite/gcc.target/powerpc/rop-5.c
new file mode 100644
index 000..cf04ea90eeb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/rop-5.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect" } */
+
+/* Verify that __ROP_PROTECT__ is predefined for -mrop-protect.  */
+
+int foo ()
+{
+#ifndef __ROP_PROTECT__
+  __ROP_PROTECT__ macro is not defined when it should be
+#endif
+  return 0;
+}
+
-- 
2.27.0

Re: [PR66791][ARM] Replace __builtin_neon_vtst*

2021-05-13 Thread Prathamesh Kulkarni via Gcc-patches

On Wed, 12 May 2021 at 20:33, Richard Earnshaw
 wrote:
>
> On 12/05/2021 12:05, Prathamesh Kulkarni via Gcc-patches wrote:
> > On Wed, 12 May 2021 at 16:02, Richard Earnshaw
> >  wrote:
> >>
> >>
> >>
> >> On 12/05/2021 08:46, Prathamesh Kulkarni via Gcc-patches wrote:
> >>> On Mon, 10 May 2021 at 19:55, Richard Earnshaw
> >>>  wrote:
> 
> 
> 
>  On 06/05/2021 01:14, Prathamesh Kulkarni via Gcc-patches wrote:
> > Hi,
> > The attached patch replaces __builtin_neon_vtst* (a, b) with (a & b) != 0.
> > Bootstrapped and tested on arm-linux-gnueabihf and cross-tested on arm*-*-*.
> > OK to commit ?
> >
> > Thanks,
> > Prathamesh
> >
> 
>  You're missing the ChangeLog details.
> 
>  Also, if you're removing these, do we still need the neon_vtst
>  patterns in neon.md?  They generate unspecs, so if they're no-longer
>  needed for these expansions, they can likely be dropped entirely.
> >>> Hi Richard,
> >>> Thanks for the suggestions.
> >>> Does the attached patch look OK if bootstrap+test passes ?
> >>
> >> You're still missing the changelog description.
> > Oh, I had attached it separately.
>
> Sorry, I hadn't noticed that.  The attachments were in
> application/octet-stream format so don't show up when viewing the
> message - I have to download them and then load them into a separate
> editor or whatever in order to view them.  Please don't use octet-stream
> attachments for plain text files.
Oops, sorry about that.
>
> 2021-05-12  Prathamesh Kulkarni  
>
> PR target/66791
> * config/arm/arm_neon.h: Replace calls to __builtin_neon_vtst* (a, b) with
> (a & b) != 0.
>
> For a ChangeLog description it's usually better to avoid the exact
> expression.  I think it would be better to write:
>
> Replace calls to __builtin_neon_vtst* (a, b) with boolean logic equivalent.
>
> Also, pedantically, you should list every function you've changed (just
> in case somebody searches the ChangeLog for changes that affect, say,
> vtstq_p8).  There aren't that many affected functions in this specific
> patch that that is an unreasonable thing to do
>
> * config/arm/arm_neon_builtins.def: Remove entry for vtst.
> * config/arm/neon.md (neon_vtst): Remove pattern.
>
> OK with those changes.
Is the attached version OK to commit ?
Bootstrapped + tested on arm-linux-gnueabihf and cross-tested on arm*-*-*.

Thanks,
Prathamesh
>
> R.
>
> > Does this version look OK ?
> >
> > Thanks,
> > Prathamesh
> >>
> >> R.
> >>>
> >>> Thanks,
> >>> Prathamesh
> 
>  R.
>
2021-05-14  Prathamesh Kulkarni  

PR target/66791
* config/arm/arm_neon.h (vtst_s8): Replace call to vtst builtin with its
boolean logic equivalent.
(vtst_s16): Likewise.
(vtst_s32): Likewise.
(vtst_u8): Likewise.
(vtst_u16): Likewise.
(vtst_u32): Likewise.
(vtst_p8): Likewise.
(vtst_p16): Likewise.
(vtstq_s8): Likewise.
(vtstq_s16): Likewise.
(vtstq_s32): Likewise.
(vtstq_u8): Likewise.
(vtstq_u16): Likewise.
(vtstq_u32): Likewise.
(vtstq_p8): Likewise.
(vtstq_p16): Likewise.

* config/arm/arm_neon_builtins.def: Remove entry for vtst.
* config/arm/neon.md (neon_vtst): Remove pattern.
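
For readers without an Arm toolchain at hand, the replacement expression
can be tried with GCC's generic vector extensions.  This is a sketch
under assumptions: the typedefs and function names below are
hypothetical stand-ins for int8x8_t, uint8x8_t and vtst_s8, not the
arm_neon.h definitions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for int8x8_t / uint8x8_t using GCC's generic
   vector extension, so the sketch compiles without arm_neon.h.  */
typedef int8_t  v8i8 __attribute__ ((vector_size (8)));
typedef uint8_t v8u8 __attribute__ ((vector_size (8)));

/* Same expression as the new vtst_s8 body: each lane becomes all-ones
   when (a & b) is nonzero in that lane, and zero otherwise.  */
static inline v8u8
vtst_like (v8i8 a, v8i8 b)
{
  return (v8u8) ((a & b) != 0);
}

/* Illustrative helper: apply vtst_like to two plain arrays.  */
static inline void
vtst_like_array (const int8_t a[8], const int8_t b[8], uint8_t out[8])
{
  v8i8 va, vb;
  v8u8 vr;
  memcpy (&va, a, sizeof va);
  memcpy (&vb, b, sizeof vb);
  vr = vtst_like (va, vb);
  memcpy (out, &vr, sizeof vr);
}
```

A lane holding 2 tested against 1 yields 0x00, while a lane holding -1
tested against 16 yields 0xFF, since the vector comparison produces -1
for true lanes.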

diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index dc28b92b5af..dcd533fd003 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -2919,112 +2919,112 @@ __extension__ extern __inline uint8x8_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vtst_s8 (int8x8_t __a, int8x8_t __b)
 {
-  return (uint8x8_t)__builtin_neon_vtstv8qi (__a, __b);
+  return (uint8x8_t) ((__a & __b) != 0);
 }
 
 __extension__ extern __inline uint16x4_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vtst_s16 (int16x4_t __a, int16x4_t __b)
 {
-  return (uint16x4_t)__builtin_neon_vtstv4hi (__a, __b);
+  return (uint16x4_t) ((__a & __b) != 0);
 }
 
 __extension__ extern __inline uint32x2_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vtst_s32 (int32x2_t __a, int32x2_t __b)
 {
-  return (uint32x2_t)__builtin_neon_vtstv2si (__a, __b);
+  return (uint32x2_t) ((__a & __b) != 0);
 }
 
 __extension__ extern __inline uint8x8_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vtst_u8 (uint8x8_t __a, uint8x8_t __b)
 {
-  return (uint8x8_t)__builtin_neon_vtstv8qi ((int8x8_t) __a, (int8x8_t) __b);
+  return (uint8x8_t) ((__a & __b) != 0);
 }
 
 __extension__ extern __inline uint16x4_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vtst_u16 (uint16x4_t __a, uint16x4_t __b)
 {
-  return (uint16x4_t)__builtin_neon_vtstv4hi ((int16x4_t) __a, (int16x4_t) __b);
+  return (uint16x4_t) ((__a & __b) != 0);
 }
 
 __extension__ extern __inline uint32x2_t
 __attribute__  ((__always_inline__, __gnu_inline

Re: [PATCH] avoid using an incompletely populated struct (PR 100574)

2021-05-13 Thread Bernd Edlinger

On 5/14/21 12:35 AM, Martin Sebor wrote:
> On 5/13/21 11:36 AM, Martin Sebor wrote:
>> On 5/13/21 11:20 AM, Bernd Edlinger wrote:
>>> On 5/13/21 3:55 AM, Martin Sebor via Gcc-patches wrote:
 A logic bug in the handling of PHI arguments in compute_objsize
 that are all null pointers lets an incompletely populated struct
 be used in a way that triggers an assertion causing an ICE.

 The attached patch corrects that by having compute_objsize fail
 when the struct isn't fully populated (when all of the PHI's
 arguments are null).

 Martin
>>>
>>> Martin,
>>>
>>> I'm getting test failures with your patch here:
>>>
>>> Running target unix/-m32
>>> FAIL: g++.dg/pr100574.C  -std=gnu++14 (test for excess errors)
>>> FAIL: g++.dg/pr100574.C  -std=gnu++17 (test for excess errors)
>>> FAIL: g++.dg/pr100574.C  -std=gnu++2a (test for excess errors)
>>>
>>> /home/ed/gnu/gcc-trunk/gcc/testsuite/g++.dg/pr100574.C:6:7: error: 'operator new' takes type 'size_t' ('unsigned int') as first parameter [-fpermissive]
>>> compiler exited with status 1
>>
>> Thanks, I've just fixed it.
> 
> I hadn't checked in the patch yet.  I'm only now about to do it and
> see I inadvertently committed the test in response to your email
> about the failures.  I didn't realize you were testing the patch
> I had posted for review, before I committed it.
> 

Oh, well.

I just wanted to help, since you didn't say how you tested the patch.


Bernd.

> Martin
> 
>>
>> Martin
>>
>>>
>>>
>>> Bernd.
>>>
>>
>

[PATCH, LIBPHOBOS] Cleanup temp files in libphobos unittest at src/std/process.d

2021-05-13 Thread Bernd Edlinger

Hi,

I've noticed that a couple of temp files are leaked after each full
GCC testsuite run.

I'd like to fix that with the following patch.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
From 6ad9e8552646e6ff54981cf7102ffcb311b6860f Mon Sep 17 00:00:00 2001
From: Bernd Edlinger 
Date: Fri, 14 May 2021 07:10:59 +0200
Subject: [PATCH] Cleanup temp files in libphobos unittest at src/std/process.d

2021-05-14  Bernd Edlinger  

	* src/std/process.d (unittest): Remove tmpname on exit.
---
 libphobos/src/std/process.d | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libphobos/src/std/process.d b/libphobos/src/std/process.d
index 63ec493..1e977aa 100644
--- a/libphobos/src/std/process.d
+++ b/libphobos/src/std/process.d
@@ -2581,6 +2581,7 @@ private auto executeImpl(alias pipeFunc, Cmd, ExtraPipeFuncArgs...)(
 
 ReturnType!executeShell r;
 auto tmpname = uniqueTempPath;
+scope(exit) if (exists(tmpname)) remove(tmpname);
 auto t = stderr;
 // Open a new scope to minimize code ran with stderr redirected.
 {
-- 
1.9.1

Re: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]

2021-05-13 Thread Xionghu Luo via Gcc-patches


Hi,

On 2021/5/13 18:49, Segher Boessenkool wrote:

Hi!

On Fri, Apr 30, 2021 at 01:32:58AM -0500, Xionghu Luo wrote:

The vsel instruction is a bit-wise select instruction.  Using an
IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code
being generated in the combine pass.  Per-element selection is a
subset of bit-wise selection; with this patch the pattern is
written using bit operations.  But there are 8 different patterns
to define "op0 := (op1 & ~op3) | (op2 & op3)":

(~op3&op1) | (op3&op2),
(~op3&op1) | (op2&op3),
(op3&op2) | (~op3&op1),
(op2&op3) | (~op3&op1),
(op1&~op3) | (op3&op2),
(op1&~op3) | (op2&op3),
(op3&op2) | (op1&~op3),
(op2&op3) | (op1&~op3),

The combine pass will swap (op1&~op3) to (~op3&op1) due to commutative
canonicalization, which reduces these to the FIRST 4 patterns, but it won't
swap (op2&op3) | (~op3&op1) to (~op3&op1) | (op2&op3), so this patch
handles that with two patterns that place the NOT op3 differently and
check operand equality inside them.


Yup, that latter case does not have canonicalisation rules.  Btw, not
only combine does this canonicalisation: everything should,
non-canonical RTL is invalid RTL (in the instruction stream, you can do
everything in temporary code of course, as long as the RTL isn't
malformed).
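
To make the bit-wise versus per-element distinction concrete, here is a
small scalar sketch on one 32-bit element.  The function names are
invented for illustration, and the per-element version is only one
reading of what the old IF_THEN_ELSE form expressed.

```c
#include <stdint.h>

/* Bit-wise select on one 32-bit element: for every bit, take the bit
   from op2 where the mask op3 is 1 and from op1 where it is 0.  */
static inline uint32_t
bitwise_select (uint32_t op1, uint32_t op2, uint32_t op3)
{
  return (op1 & ~op3) | (op2 & op3);
}

/* Element-wise select: the whole element comes from op2 if the mask
   element is nonzero, else from op1.  */
static inline uint32_t
elementwise_select (uint32_t op1, uint32_t op2, uint32_t op3)
{
  return op3 != 0 ? op2 : op1;
}
```

The two agree when the mask element is all-zeros or all-ones, but
differ for a partial mask such as 0x0000FFFF, which is the case where an
IF_THEN_ELSE description no longer matches what vsel does.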


-(define_insn "*altivec_vsel"
+(define_insn "altivec_vsel"
[(set (match_operand:VM 0 "altivec_register_operand" "=v")
-   (if_then_else:VM
-(ne:CC (match_operand:VM 1 "altivec_register_operand" "v")
-   (match_operand:VM 4 "zero_constant" ""))
-(match_operand:VM 2 "altivec_register_operand" "v")
-(match_operand:VM 3 "altivec_register_operand" "v")))]
-  "VECTOR_MEM_ALTIVEC_P (mode)"
-  "vsel %0,%3,%2,%1"
+   (ior:VM
+(and:VM
+ (not:VM (match_operand:VM 3 "altivec_register_operand" "v"))
+ (match_operand:VM 1 "altivec_register_operand" "v"))
+(and:VM
+ (match_operand:VM 2 "altivec_register_operand" "v")
+ (match_operand:VM 4 "altivec_register_operand" "v"))))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
+  && (rtx_equal_p (operands[2], operands[3])
+  || rtx_equal_p (operands[4], operands[3]))"
+  {
+if (rtx_equal_p (operands[2], operands[3]))
+  return "vsel %0,%1,%4,%3";
+else
+  return "vsel %0,%1,%2,%3";
+  }
[(set_attr "type" "vecmove")])


That rtx_equal_p stuff is nice and tricky, but it is a bit too tricky I
think.  So please write this as two patterns (and keep the expand if
that helps).


I was a bit concerned that there would be a lot of duplicate code if we
write two patterns for each vsel: 4 similar patterns in altivec.md plus
another 4 in vsx.md would be difficult to maintain.  However, I have
updated it since you prefer this way; as you pointed out, the xxsel in
vsx.md could be folded by a later patch.




+(define_insn "altivec_vsel2"


(same here of course).


  ;; Fused multiply add.
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index f5676255387..d65bdc01055 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -3362,11 +3362,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
  RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI },
{ ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
  RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI },
-  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI,
+  { ALTIVEC_BUILTIN_VEC_SEL, ALTIVEC_BUILTIN_VSEL_2DI_UNS,


Are the _uns things still used for anything?  But, let's not change
this until Bill's stuff is in :-)

Why do you want to change this here, btw?  I don't understand.


OK, they are actually "unsigned type" overloaded builtin functions.  Whether
or not this is changed causes no functional issue so far, so I will revert
this change in the updated patch.




+  if (target == 0
+  || GET_MODE (target) != tmode
+  || ! (*insn_data[icode].operand[0].predicate) (target, tmode))


No space after ! and other unary operators (except for casts and other
operators you write with alphanumerics, like "sizeof").  I know you
copied this code, but :-)


OK, thanks.




@@ -15608,8 +15606,6 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
  case GEU:
  case LTU:
  case LEU:
-  /* Mark unsigned tests with CCUNSmode.  */
-  cc_mode = CCUNSmode;
  
/* Invert condition to avoid compound test if necessary.  */

if (rcode == GEU || rcode == LEU)


So this is related to the _uns thing.  Could you split off that change?
Probably as an earlier patch (but either works for me).


It is not related to the ALTIVEC_BUILTIN_VSEL_2DI_UNS things.  Previously
cc_mode was a parameter used to generate the condition for the IF_THEN_ELSE
instruction; now we don't need it any more since we use the IOR (AND... AND...)
style, so it is removed to avoid a build error.


-  cond2 = gen_rtx_fmt_ee (NE, cc_mode, gen_lowpart (dest_mode, mask),
- CONST0_RTX (d
