date:20231102

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-02 Thread waffl3x

The problem with operators is fixed; now a long period of testing
starts. It's been so long since I've gotten to this part that I almost
forgot that I have to do it :'). When/if regtests and bootstrap all
pass I will format the patch and submit it.

--- Original Message ---
On Wednesday, November 1st, 2023 at 5:15 PM, waffl3x wrote:

> 
> 
> Just want to quickly check, when is the cutoff for GCC14 exactly? I
> want to know how much time I have left to try to tackle this bug with
> subscript. I'm going to be crunching out final stuff this week and I'll
> try to get a (probably non-final) patch for you to review today.
> 
> Alex

Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-02 Thread Jakub Jelinek

On Wed, Nov 01, 2023 at 11:15:56PM +, waffl3x wrote:
> Just want to quickly check, when is the cutoff for GCC14 exactly? I

Nov 18th EOD.  But generally that is the cutoff for posting new feature
patches that could be accepted; patches posted before the deadline can
still make it in if they are reviewed soon after the deadline.

Jakub

[RE] [7/7] riscv: Add basic extension support for XTheadFmv and XTheadInt

2023-11-02 Thread Jin Ma

Hi, I see that XTheadInt is not implemented in the compiler. Is there any plan 
here?
If there is no patch for it, can I try to implement it with you?

Thanks

Jin

Re: [PATCH] Reduce false positives for -Wnonnull for VLA parameters [PR98541]

2023-11-02 Thread Richard Biener

On Tue, Oct 31, 2023 at 8:05 PM Martin Uecker  wrote:
>
>
> This is a revised part of a previously posted patch which
> I split up. The C FE changes, which fix another false positive,
> were already merged, but I still need approval for this
> middle-end change.  It would be nice to get this in,
> because it fixes some rather annoying (for me at least)
> false positive warnings with no easy workaround.
>
> In the following example,
>
> int foo(int n, float matrix[n], float opt[n]);
> foo(n, matrix, NULL);
>
> GCC warns about NULL iff n > 0.  This is problematic for
> several reasons:
> 1. It causes false positives (and I turn off -Wnonnull
> in one of my projects for this reason)
> 2. It is inconsistent with regular arrays where there is no
> warning in this case.
> 3. The size parameter is sometimes shared (as in this example)
> so passing zero to avoid the warning is only possible by
> making the code more complex.
> 4. Passing zero as a workaround is technically UB.
>
>
> (The original author of the warning code, Martin S seemed to
> agree with this change according to this discussion in Bugzilla.)

OK.

>
>
> Reduce false positives for -Wnonnull for VLA parameters [PR98541]
>
> This patch limits the warning about NULL arguments to VLA
> parameters declared [static n].
>
> PR c/98541
>
> gcc/
> * gimple-ssa-warn-access.cc
> (pass_waccess::maybe_check_access_sizes): For VLA bounds
> in parameters, only warn about null pointers with 'static'.
>
> gcc/testsuite:
> * gcc.dg/Wnonnull-4.c: Adapt test.
> * gcc.dg/Wstringop-overflow-40.c: Adapt test.
>
> diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
> index e439d1b9b68..8b734295f09 100644
> --- a/gcc/gimple-ssa-warn-access.cc
> +++ b/gcc/gimple-ssa-warn-access.cc
> @@ -3477,27 +3477,14 @@ pass_waccess::maybe_check_access_sizes (rdwr_map 
> *rwm, tree fndecl, tree fntype,
>
>if (integer_zerop (ptr))
> {
> - if (sizidx >= 0 && tree_int_cst_sgn (sizrng[0]) > 0)
> + if (!access.second.internal_p
> + && sizidx >= 0 && tree_int_cst_sgn (sizrng[0]) > 0)
> {
>   /* Warn about null pointers with positive sizes.  This is
>  different from also declaring the pointer argument with
>  attribute nonnull when the function accepts null pointers
>  only when the corresponding size is zero.  */
> - if (access.second.internal_p)
> -   {
> - const std::string argtypestr
> -   = access.second.array_as_string (ptrtype);
> -
> - if (warning_at (loc, OPT_Wnonnull,
> - "argument %i of variable length "
> - "array %s is null but "
> - "the corresponding bound argument "
> - "%i value is %s",
> - ptridx + 1, argtypestr.c_str (),
> - sizidx + 1, sizstr))
> -   arg_warned = OPT_Wnonnull;
> -   }
> - else if (warning_at (loc, OPT_Wnonnull,
> + if (warning_at (loc, OPT_Wnonnull,
>"argument %i is null but "
>"the corresponding size argument "
>"%i value is %s",
> diff --git a/gcc/testsuite/gcc.dg/Wnonnull-4.c 
> b/gcc/testsuite/gcc.dg/Wnonnull-4.c
> index 2c1c45a9856..1f14fbba45d 100644
> --- a/gcc/testsuite/gcc.dg/Wnonnull-4.c
> +++ b/gcc/testsuite/gcc.dg/Wnonnull-4.c
> @@ -27,9 +27,9 @@ void test_fca_n (int r_m1)
>T (  0);
>
>// Verify positive bounds.
> -  T (  1);  // { dg-warning "argument 2 of variable length array 
> 'char\\\[n]' is null but the corresponding bound argument 1 value is 1" }
> -  T (  9);  // { dg-warning "argument 2 of variable length array 
> 'char\\\[n]' is null but the corresponding bound argument 1 value is 9" }
> -  T (max);  // { dg-warning "argument 2 of variable length array 
> 'char\\\[n]' is null but the corresponding bound argument 1 value is \\d+" }
> +  T (  1);  // { dg-bogus "argument 2 of variable length array 
> 'char\\\[n]' is null but the corresponding bound argument 1 value is 1" }
> +  T (  9);  // { dg-bogus "argument 2 of variable length array 
> 'char\\\[n]' is null but the corresponding bound argument 1 value is 9" }
> +  T (max);  // { dg-bogus "argument 2 of variable length array 
> 'char\\\[n]' is null but the corresponding bound argument 1 value is \\d+" }
>  }
>
>
> @@ -55,9 +55,9 @@ void test_fsa_x_n (int r_m1)
>T (  0);
>
>// Verify positive bounds.
> -  T (  1);  // { dg-warning "argument 2 of variable length array 
> 'short int\\\[]\\\[n]' is null but the corresponding bound argument 1 value 
> is 1" }
> -  T (  9);  // { d

Re: [RE] [7/7] riscv: Add basic extension support for XTheadFmv and XTheadInt

2023-11-02 Thread Christoph Müllner

On Thu, Nov 2, 2023, 08:32 Jin Ma  wrote:

> Hi, I see that XTheadInt is not implemented in the compiler. Is there any
> plan here?
> If there is no patch for it, can I try to implement it with you?
>

Yes, sounds good.
Let me know if you have any questions.
We don't have any plans to work on this at the moment.

BR
Christoph

> Thanks
>
> Jin
>

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-11-02 Thread Richard Biener

On Tue, 31 Oct 2023, Robin Dapp wrote:

> >> +int
> >> +internal_fn_else_index (internal_fn fn)
> > 
> > The function needs a comment, maybe:
> > 
> > > /* If FN is an IFN_COND_* or IFN_COND_LEN_* function, return the index of the
> > >    argument that is used when the condition is false.  Return -1 otherwise.  */
> > 
> > OK for the internal-fn* and tree-if-conv.cc bits (which were the
> > parts I commented on earlier).  I'll look at cleaning up the
> > definition of conditional internal functions separately, so that
> > the list of functions isn't necessary.
> 
> Thank you, added the comment (shouldn't have forgotten it in the
> first place...).  So there's the vectorizer part left that is not
> yet OK'd.  

The vectorizer part is OK.

Richard.

[PATCH] MIPS: Use -mnan value for -mabs if not specified

2023-11-02 Thread YunQiang Su

On most hardware, FCSR.ABS2008 is set to the same value as FCSR.NAN2008.
Let's use this behavior by default in GCC, i.e.
gcc -mnan=2008 -c fabs.c
will imply `-mabs=2008`.

And of course, `gcc -mnan=2008 -mabs=legacy` continues to work
as before.

gcc/ChangeLog:

* config/mips/mips.cc (mips_option_override): Set mips_abs to
2008, if mips_abs is default and mips_nan is 2008.
* testsuite/gcc.target/mips/fabs-nan2008.c: New test.
* testsuite/gcc.target/mips/fabsf-nan2008.c: New test.
---
 gcc/config/mips/mips.cc   |  2 ++
 gcc/testsuite/gcc.target/mips/fabs-nan2008.c  | 10 ++
 gcc/testsuite/gcc.target/mips/fabsf-nan2008.c | 10 ++
 3 files changed, 22 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/mips/fabs-nan2008.c
 create mode 100644 gcc/testsuite/gcc.target/mips/fabsf-nan2008.c

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index f9861020902..7fd54503660 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -20549,6 +20549,8 @@ mips_option_override (void)
   /* Set NaN and ABS defaults.  */
   if (mips_nan == MIPS_IEEE_754_DEFAULT && !ISA_HAS_IEEE_754_LEGACY)
 mips_nan = MIPS_IEEE_754_2008;
+  if (mips_abs == MIPS_IEEE_754_DEFAULT && mips_nan == MIPS_IEEE_754_2008)
+mips_abs = MIPS_IEEE_754_2008;
   if (mips_abs == MIPS_IEEE_754_DEFAULT && !ISA_HAS_IEEE_754_LEGACY)
 mips_abs = MIPS_IEEE_754_2008;
 
diff --git a/gcc/testsuite/gcc.target/mips/fabs-nan2008.c 
b/gcc/testsuite/gcc.target/mips/fabs-nan2008.c
new file mode 100644
index 000..9e2719bbf36
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/fabs-nan2008.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-mnan=2008" } */
+
+NOMIPS16 double
+fabs_2008 (double d)
+{
+  return __builtin_fabs (d);
+}
+
+/* { dg-final { scan-assembler "\tabs\\.d\t" } } */
diff --git a/gcc/testsuite/gcc.target/mips/fabsf-nan2008.c 
b/gcc/testsuite/gcc.target/mips/fabsf-nan2008.c
new file mode 100644
index 000..11c423429d6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/fabsf-nan2008.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-mnan=2008" } */
+
+NOMIPS16 float
+fabsf_2008 (float f)
+{
+  return __builtin_fabsf (f);
+}
+
+/* { dg-final { scan-assembler "\tabs\\.s\t" } } */
-- 
2.39.2
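
The defaulting order in the mips.cc hunk above can be modeled in plain C. This is only a sketch: the enum and the function `resolve_abs` are hypothetical stand-ins rather than GCC code, and the model assumes nothing beyond the three checks visible in the diff. What GCC does with a setting that is still at its default afterwards is outside the sketch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the MIPS_IEEE_754_* values used in the diff.  */
enum ieee754_setting
{
  SETTING_DEFAULT,
  SETTING_LEGACY,
  SETTING_2008
};

/* Model of the defaulting order in mips_option_override after the patch:
   return the effective -mabs setting.  ISA_HAS_LEGACY stands in for
   ISA_HAS_IEEE_754_LEGACY and must be 0 or 1.  */
enum ieee754_setting
resolve_abs (enum ieee754_setting abs_opt, enum ieee754_setting nan_opt,
             int isa_has_legacy)
{
  assert (isa_has_legacy == 0 || isa_has_legacy == 1);

  /* Existing check: an unspecified -mnan becomes 2008 without legacy support.  */
  if (nan_opt == SETTING_DEFAULT && !isa_has_legacy)
    nan_opt = SETTING_2008;
  /* New in the patch: an unspecified -mabs follows -mnan=2008.  */
  if (abs_opt == SETTING_DEFAULT && nan_opt == SETTING_2008)
    abs_opt = SETTING_2008;
  /* Existing check: an unspecified -mabs becomes 2008 without legacy support.  */
  if (abs_opt == SETTING_DEFAULT && !isa_has_legacy)
    abs_opt = SETTING_2008;
  return abs_opt;
}
```

In this model, `resolve_abs (SETTING_DEFAULT, SETTING_2008, 1)` yields `SETTING_2008`, while an explicit `SETTING_LEGACY` for -mabs is left untouched, matching the two cases described in the commit message.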

Re: [PATCH] c++: Implement C++26 P1854R4 - Making non-encodable string literals ill-formed [PR110341]

2023-11-02 Thread Jakub Jelinek

On Fri, Oct 27, 2023 at 07:05:34PM -0400, Jason Merrill wrote:
> > --- gcc/testsuite/g++.dg/cpp26/literals1.C.jj   2023-08-25 
> > 17:23:06.662878355 +0200
> > +++ gcc/testsuite/g++.dg/cpp26/literals1.C  2023-08-25 17:37:03.085132304 
> > +0200
> > @@ -0,0 +1,65 @@
> > +// C++26 P1854R4 - Making non-encodable string literals ill-formed
> > +// { dg-do compile { target c++11 } }
> > +// { dg-require-effective-target int32 }
> > +// { dg-options "-pedantic-errors -finput-charset=UTF-8 
> > -fexec-charset=UTF-8" }
> > +
> > +int d = '😁';   // { dg-error 
> > "character too large for character literal type" }
> ...
> > +char16_t m = u'😁'; // { dg-error 
> > "character constant too long for its type" }
> 
> Why are these different diagnostics?  Why doesn't the first line already hit
> the existing diagnostic that the second gets?
> 
> Both could be clearer that the problem is that the single source character
> can't be encoded as a single execution character.

The first diagnostic is the one newly added in the patch, which takes precedence
over the existing diagnostic (and wouldn't actually trigger without the
patch).  Sure, I could make that new diagnostic more specific, but all
I generally know is that (str2.len / nbwc) c-chars are encodable in str.len
execution character set code units.
So, would you like 2 different messages, one for str2.len / nbwc == 1
"single character not encodable in a single execution character code unit"
and otherwise
"%d characters need %d execution character code units"
or
"at least one character not encodable in a single execution character code unit"
or something different?

Everything else (i.e. the u8 case in narrow_str_to_charconst and the L, u and U
cases in wide_str_to_charconst) is already covered by an existing diagnostic
which has the "character constant too long for its type"
wording and covers, for both C and C++, both the cases where there is more
than one c-char in the literal (allowed in the L case for < C++23) and
where one c-char encodes to more than one code unit (but this time
it isn't the execution character set, but the UTF-8 character set for u8,
the wide execution character set for L, UTF-16 for u and
UTF-32 for U).
Plus the same "character constant too long for its type" diagnostic
is emitted if a normal narrow literal has several c-chars, all encodable as
single execution character code units, but more than can fit into int.

So, do you want to change just the new diagnostic (and what is your
preferred wording), or use the old diagnostic's wording also for the
new one, or do you want to change the preexisting diagnostic as well
and e.g. differentiate there between the single c-char cases which need
more than one code unit and different wording for more than one c-char?
Note, if we differentiate between those, we'd need to count how many
c-chars we have even for the u8, L, u and U cases if we see more than
one code unit, similarly to how the patch does that (and also the follow-up
patch tweaks).

Jakub
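
To make the "code units" terminology in this message concrete, here is a minimal sketch in C. The helpers `utf8_code_units` and `utf16_code_units` are hypothetical and are not part of GCC or libcpp; they only count how many code units a single Unicode scalar value needs in each encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of UTF-8 code units (bytes) needed to
   encode the Unicode scalar value CP.  */
int
utf8_code_units (uint32_t cp)
{
  /* Surrogates and values above U+10FFFF are not scalar values.  */
  assert (cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
  if (cp < 0x80)
    return 1;
  if (cp < 0x800)
    return 2;
  if (cp < 0x10000)
    return 3;
  return 4;
}

/* Hypothetical helper: number of UTF-16 code units needed to encode CP.
   Anything outside the Basic Multilingual Plane needs a surrogate pair.  */
int
utf16_code_units (uint32_t cp)
{
  assert (cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
  return cp < 0x10000 ? 1 : 2;
}
```

For U+1F601, the emoji used in the quoted test, these return 4 and 2 respectively, so the character fits neither a single UTF-8 code unit nor a single char16_t, which is the situation both diagnostics describe.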

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Richard Biener

On Wed, Nov 1, 2023 at 3:47 PM Qing Zhao  wrote:
>
>
>
> > On Oct 31, 2023, at 6:14 PM, Joseph Myers  wrote:
> >
> > On Tue, 31 Oct 2023, Qing Zhao wrote:
> >
> >> 2.3 A new semantic requirement in the user documentation of "counted_by"
> >>
> >> For the following structure including a FAM with a counted_by attribute:
> >>
> >>  struct A
> >>  {
> >>   size_t size;
> >>   char buf[] __attribute__((counted_by(size)));
> >>  };
> >>
> >> for any object with such type:
> >>
> >>  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
> >>
> >> The setting to the size field should be done before the first reference
> >> to the FAM field.
> >>
> >> Such requirement to the user will guarantee that the first reference to
> >> the FAM knows the size of the FAM.
> >>
> >> We need to add this additional requirement to the user document.
> >
> > Make sure the manual is very specific about exactly when size is
> > considered to be an accurate representation of the space available for buf
> > (given that, after malloc or realloc, it's going to be temporarily
> > inaccurate).  If the intent is that inaccurate size at such a time means
> > undefined behavior, say so explicitly.
>
> Yes, good point. We need to define this clearly in the beginning.
> We need to explicitly say that
>
> the size of the FAM is defined by the latest “counted_by” value, and it is
> undefined behavior when the size field is not defined when the FAM is
> referenced.
>
> Is the above good enough?
>
>
> >
> >> 2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE
> >>
> >> In C FE:
> >>
> >> for every reference to a FAM, for example, "obj->buf" in the small example,
> >>  check whether the corresponding FIELD_DECL has a "counted_by" attribute?
> >>  if YES, replace the reference to "obj->buf" with a call to
> >>  .ACCESS_WITH_SIZE (obj->buf, obj->size, -1);
> >
> > This seems plausible - but you should also consider the case of static
> > initializers - remember the GNU extension for statically allocated objects
> > with flexible array members (unless you're not allowing it with
> > counted_by).
> >
> > static struct A x = { sizeof "hello", "hello" };
> > static char *y = &x.buf;
> >
> > I'd expect that to be valid - and unless you say such a usage is invalid,
>
> At this moment, I think that this should be valid.
>
> I.e., the following:
>
> struct A
> {
>  size_t size;
>  char buf[] __attribute__((counted_by(size)));
> };
>
> static struct A x = {sizeof "hello", "hello"};
>
> Should be valid, and x.size represents the number of elements of x.buf.
> Both x.size and x.buf are initialized statically.
>
> > you should avoid the replacement in such a static initializer context when
> > the FAM reference is to an object with a constant address (if
> > .ACCESS_WITH_SIZE would not act as an lvalue whose address is a constant
> > expression; if it works fine as a constant-address lvalue, then the
> > replacement would be OK).
>
> Then if such usage for the “counted_by” is valid, we need to replace the FAM
> reference by a call to  .ACCESS_WITH_SIZE as well.
> Otherwise the “counted_by” relationship will be lost to the Middle end.
>
> With the current definition of .ACCESS_WITH_SIZE
>
> PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)
>
> Isn’t the PTR (return value of the call) a LVALUE?

You probably want to specify that when a pointer to the array is taken the
pointer has to be to the first array element (or do we want to mangle the
'size' accordingly for the instrumentation?).  You also want to specify that
the 'size' associated with such pointer is assumed to be unchanging and
after changing the size such pointer has to be re-obtained.  Plus that
changes to the allocated object/size have to be performed through an
lvalue where the containing type and thus the 'counted_by' attribute is
visible.  That is,

size_t *s = &a.size;
*s = 1;

is invoking undefined behavior, likewise modifying 'buf' (makes it a bit
awkward since for example that wouldn't support using posix_memalign
for allocation, though aligned_alloc would be fine).

Richard.
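
The ordering rule under discussion (set the count through an lvalue of the containing type before the first reference to the flexible array member) can be sketched as follows. This assumes the attribute ends up spelled `counted_by` as in the RFC; the `COUNTED_BY` macro and the `make_a` helper are hypothetical names, and the macro expands to nothing on compilers that do not know the attribute.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: the attribute is spelled counted_by as in the RFC.  Fall back
   to an empty macro when the compiler does not support it.  */
#if defined(__has_attribute)
# if __has_attribute (counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct A
{
  size_t size;
  char buf[] COUNTED_BY (size);
};

/* Hypothetical helper: allocate a struct A with room for N bytes in buf.
   Returns NULL on allocation failure.  */
struct A *
make_a (size_t n)
{
  struct A *obj = malloc (sizeof (struct A) + n * sizeof (char));
  if (obj == NULL)
    return NULL;

  /* Set the count first, through the containing struct, so that the
     attribute relationship is visible at the store.  */
  obj->size = n;

  /* Only now reference the flexible array member.  */
  memset (obj->buf, 0, n);
  return obj;
}
```

Writing the count through a separate `size_t *` alias, as in the two-line example above, is exactly what this helper avoids.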

> Qing
> >
> > --
> > Joseph S. Myers
> > jos...@codesourcery.com
>

Re: Ping: [PATCH v3] libiberty: Use posix_spawn in pex-unix when available.

2023-11-02 Thread Richard Biener

On Wed, Nov 1, 2023 at 7:16 PM Brendan Shanks  wrote:
>
> Polite ping on this.

OK.

Thanks,
Richard.
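
For readers unfamiliar with the API the approved patch switches to, here is a minimal sketch of spawning a child with posix_spawnp and waiting for it. It is not the libiberty implementation: `run_and_wait` is a hypothetical helper, it passes no file actions or spawn attributes, and it does not retry waitpid on EINTR.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run ARGV[0] (looked up via PATH) with the current
   environment and return its exit status, or -1 on any failure.  */
int
run_and_wait (char *const argv[])
{
  pid_t pid;
  int status;

  assert (argv != NULL && argv[0] != NULL);

  /* posix_spawnp reports failure through its return value (an error
     number), not through errno.  */
  if (posix_spawnp (&pid, argv[0], NULL, NULL, argv, environ) != 0)
    return -1;

  if (waitpid (pid, &status, 0) < 0)
    return -1;

  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

A single call replaces the fork or vfork, the child-side setup, and the exec, which is the simplification the patch description points to.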

> > On Oct 4, 2023, at 11:28 AM, Brendan Shanks  wrote:
> >
> > Hi,
> >
> > This patch implements pex_unix_exec_child using posix_spawn when
> > available.
> >
> > This should especially benefit recent macOS (where vfork just calls
> > fork), but should have equivalent or faster performance on all
> > platforms.
> > In addition, the implementation is substantially simpler than the
> > vfork+exec code path.
> >
> > Tested on x86_64-linux.
> >
> > v2: Fix error handling (previously the function would be run twice in
> > case of error), and don't use a macro that changes control flow.
> >
> > v3: Match file style for error-handling blocks, don't close
> > in/out/errdes on error, and check close() for errors.
> >
> > libiberty/
> > * configure.ac (AC_CHECK_HEADERS): Add spawn.h.
> > (checkfuncs): Add posix_spawn, posix_spawnp.
> > (AC_CHECK_FUNCS): Add posix_spawn, posix_spawnp.
> > * configure, config.in: Rebuild.
> > * pex-unix.c [HAVE_POSIX_SPAWN] (pex_unix_exec_child): New function.
> >
> > Signed-off-by: Brendan Shanks 
> > ---
> > libiberty/configure.ac |   8 +-
> > libiberty/pex-unix.c   | 168 +
> > 2 files changed, 173 insertions(+), 3 deletions(-)
> >
> > diff --git a/libiberty/configure.ac b/libiberty/configure.ac
> > index 0748c592704..2488b031bc8 100644
> > --- a/libiberty/configure.ac
> > +++ b/libiberty/configure.ac
> > @@ -289,7 +289,7 @@ AC_SUBST_FILE(host_makefile_frag)
> > # It's OK to check for header files.  Although the compiler may not be
> > # able to link anything, it had better be able to at least compile
> > # something.
> > -AC_CHECK_HEADERS(sys/file.h sys/param.h limits.h stdlib.h malloc.h 
> > string.h unistd.h strings.h sys/time.h time.h sys/resource.h sys/stat.h 
> > sys/mman.h fcntl.h alloca.h sys/pstat.h sys/sysmp.h sys/sysinfo.h 
> > machine/hal_sysinfo.h sys/table.h sys/sysctl.h sys/systemcfg.h stdint.h 
> > stdio_ext.h process.h sys/prctl.h)
> > +AC_CHECK_HEADERS(sys/file.h sys/param.h limits.h stdlib.h malloc.h 
> > string.h unistd.h strings.h sys/time.h time.h sys/resource.h sys/stat.h 
> > sys/mman.h fcntl.h alloca.h sys/pstat.h sys/sysmp.h sys/sysinfo.h 
> > machine/hal_sysinfo.h sys/table.h sys/sysctl.h sys/systemcfg.h stdint.h 
> > stdio_ext.h process.h sys/prctl.h spawn.h)
> > AC_HEADER_SYS_WAIT
> > AC_HEADER_TIME
> >
> > @@ -412,7 +412,8 @@ funcs="$funcs setproctitle"
> > vars="sys_errlist sys_nerr sys_siglist"
> >
> > checkfuncs="__fsetlocking canonicalize_file_name dup3 getrlimit getrusage \
> > - getsysinfo gettimeofday on_exit pipe2 psignal pstat_getdynamic 
> > pstat_getstatic \
> > + getsysinfo gettimeofday on_exit pipe2 posix_spawn posix_spawnp psignal \
> > + pstat_getdynamic pstat_getstatic \
> >  realpath setrlimit spawnve spawnvpe strerror strsignal sysconf sysctl \
> >  sysmp table times wait3 wait4"
> >
> > @@ -435,7 +436,8 @@ if test "x" = "y"; then
> > index insque \
> > memchr memcmp memcpy memmem memmove memset mkstemps \
> > on_exit \
> > -pipe2 psignal pstat_getdynamic pstat_getstatic putenv \
> > +pipe2 posix_spawn posix_spawnp psignal \
> > +pstat_getdynamic pstat_getstatic putenv \
> > random realpath rename rindex \
> > sbrk setenv setproctitle setrlimit sigsetmask snprintf spawnve spawnvpe 
> > \
> >  stpcpy stpncpy strcasecmp strchr strdup \
> > diff --git a/libiberty/pex-unix.c b/libiberty/pex-unix.c
> > index 33b5bce31c2..336799d1125 100644
> > --- a/libiberty/pex-unix.c
> > +++ b/libiberty/pex-unix.c
> > @@ -58,6 +58,9 @@ extern int errno;
> > #ifdef HAVE_PROCESS_H
> > #include <process.h>
> > #endif
> > +#ifdef HAVE_SPAWN_H
> > +#include <spawn.h>
> > +#endif
> >
> > #ifdef vfork /* Autoconf may define this to fork for us. */
> > # define VFORK_STRING "fork"
> > @@ -559,6 +562,171 @@ pex_unix_exec_child (struct pex_obj *obj 
> > ATTRIBUTE_UNUSED,
> >   return (pid_t) -1;
> > }
> >
> > +#elif defined(HAVE_POSIX_SPAWN) && defined(HAVE_POSIX_SPAWNP)
> > +/* Implementation of pex->exec_child using posix_spawn.*/
> > +
> > +static pid_t
> > +pex_unix_exec_child (struct pex_obj *obj ATTRIBUTE_UNUSED,
> > + int flags, const char *executable,
> > + char * const * argv, char * const * env,
> > + int in, int out, int errdes,
> > + int toclose, const char **errmsg, int *err)
> > +{
> > +  int ret;
> > +  pid_t pid = -1;
> > +  posix_spawnattr_t attr;
> > +  posix_spawn_file_actions_t actions;
> > +  int attr_initialized = 0, actions_initialized = 0;
> > +
> > +  *err = 0;
> > +
> > +  ret = posix_spawnattr_init (&attr);
> > +  if (ret)
> > +{
> > +  *err = ret;
> > +  *errmsg = "posix_spawnattr_init";
> > +  goto exit;
> > +}
> > +  attr_initialized = 1;
> > +
> > +  /* Use vfork() on glibc <=2.24. */
> > +#ifdef POSIX_SPAWN_USEVFORK
> > +  ret = posix_spawnattr_setflags (&attr, POSIX_SPAWN_USEVFORK);
> > +  if (ret)
> > +{
> > +  *err =

Re: [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720]

2023-11-02 Thread Richard Biener

On Thu, Nov 2, 2023 at 4:15 AM  wrote:
>
> From: Pan Li 
>
> extract_low_bits only tries the scalar mode if the bitsizes of
> mode and src_mode are not equal. When a vector mode is given
> from get_stored_val in DSE, it will always fail and return NULL_RTX.
>
> This patch allows the vector mode in extract_low_bits
> if and only if the size of mode is less than or equal to the size of
> src_mode.
>
> Given the example code below with --param=riscv-autovec-preference=fixed-vlmax:
>
> vuint8m1_t test () {
>   uint8_t arr[32] = {
> 1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
> 1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>   };
>
>   return __riscv_vle8_v_u8m1(arr, 32);
> }
>
> Before this patch:
>
> test:
>   lui a5,%hi(.LANCHOR0)
>   addi    sp,sp,-32
>   addi    a5,a5,%lo(.LANCHOR0)
>   li  a3,32
>   vl2re64.v   v2,0(a5)
>   vsetvli zero,a3,e8,m1,ta,ma
>   vs2r.v  v2,0(sp) <== Unnecessary store to stack
>   vle8.v  v1,0(sp) <== Ditto
>   vs1r.v  v1,0(a0)
>   addi    sp,sp,32
>   jr  ra
>
> After this patch:
>
> test:
>   lui a5,%hi(.LANCHOR0)
>   addi    a5,a5,%lo(.LANCHOR0)
>   li  a4,32
>   addi    sp,sp,-32
>   vsetvli zero,a4,e8,m1,ta,ma
>   vle8.v  v1,0(a5)
>   vs1r.v  v1,0(a0)
>   addi    sp,sp,32
>   jr  ra
>
> The tests below pass with this patch:
>
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> * The risc-v regression test.
>
> PR target/111720
>
> gcc/ChangeLog:
>
> * expmed.cc (extract_low_bits): Allow vector mode if the
> mode size is less than or equal to src_mode.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-8.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-9.c: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/expmed.cc | 44 ---
>  .../gcc.target/riscv/rvv/base/pr111720-0.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-1.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-10.c   | 18 
>  .../gcc.target/riscv/rvv/base/pr111720-2.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-3.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-4.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-5.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-6.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-7.c| 21 +
>  .../gcc.target/riscv/rvv/base/pr111720-8.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-9.c| 15 +++
>  12 files changed, 227 insertions(+), 15 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-8.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-9.c
>
> diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> index b294eabb08d..5db83fe638c 100644
> --- a/gcc/expmed.cc
> +++ b/gcc/expmed.cc
> @@ -2403,8 +2403,6 @@ extract_split_bit_field (rtx op0, opt_scalar_int_mode 
> op0_mode,
>  rtx
>  extract_low_bits (machine_mode mode, machine_mode src_mode, rtx src)
>  {
> -  scalar_int_mode int_mode, src_int_mode;
> -
>if (mode == src_mode)
>  return src;
>
> @@ -2437,22 +2435,38 @@ extract_low_bits (machine_mode mode, machine_mode 
> src_mode, rtx src)
>  return x;
>  }
>
> -  if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
> -  || !int_mode_for_mode (mode).exists (&int_mode))
> -return NULL_RTX;
> +  if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (src_mode))

When there are integer modes for the vector modes you now take a different path;
a little less "regressing" would be to write it as

   if (int_mode_for_mode (src_mode).exists (&src_int_mode)
   && int_mode_for_mode

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Jakub Jelinek

On Thu, Nov 02, 2023 at 08:57:36AM +0100, Richard Biener wrote:
> You probably want to specify that when a pointer to the array is taken the
> pointer has to be to the first array element (or do we want to mangle the
> 'size' accordingly for the instrumentation?).  You also want to specify that
> the 'size' associated with such pointer is assumed to be unchanging and
> after changing the size such pointer has to be re-obtained.  Plus that
> changes to the allocated object/size have to be performed through an
> lvalue where the containing type and thus the 'counted_by' attribute is
> visible.  That is,
> 
> size_t *s = &a.size;
> *s = 1;
> 
> is invoking undefined behavior, likewise modifying 'buf' (makes it a bit
> awkward since for example that wouldn't support using posix_memalign
> for allocation, though aligned_alloc would be fine).

Depends on what behavior we want to guarantee and what kind of price we want
to pay for it.  If the size is an .ACCESS_WITH_SIZE operand, the size used in
__bdos will be whatever counted_by size the array had upon taking the address of
the array, wherever that happens in the program.  And while we can CSE
the calls, they'd be CSEd only if they have the same size.

Or, if we want to pay a further price, .ACCESS_WITH_SIZE could take as one of
the arguments not the size value, but its address.  Then at __bdos time
we would dereference that pointer to get the size.
So,
struct S { int a; char b __attribute__((counted_by (a))) []; };
struct S s;
s.a = 5;
char *p = &s.b[2];
int i1 = __builtin_dynamic_object_size (p, 0);
s.a = 3;
int i2 = __builtin_dynamic_object_size (p, 0);
would then yield 3 and 1 rather than 3 and 3.  But I don't know whether we
would need to drop the leaf attribute from __bdos to make that work; that would be,
I think, a significant case against doing it, because while in all the
current plans one just pays a code performance price when using the counted_by
attribute, even when not using __bdos for it, if we had to make __bdos
non-leaf we'd pay an extra price even when nobody is using that attribute,
just in -D_FORTIFY_SOURCE=3 / -fhardened compilations, which is how
several distros build basically everything.

Jakub
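
The difference between the two designs discussed in this message can be illustrated with a toy model in plain C. This is not how __builtin_dynamic_object_size or .ACCESS_WITH_SIZE are implemented; `remaining_by_value` and `remaining_by_address` are hypothetical functions that only model when the count is read.

```c
#include <assert.h>
#include <stddef.h>

/* By-value model: the count was captured when the address of the array was
   taken, so later stores to the count field are not observed.  OFFSET is
   the element index the pointer refers to.  */
size_t
remaining_by_value (size_t count_at_address_taken, size_t offset)
{
  return offset < count_at_address_taken
	 ? count_at_address_taken - offset : 0;
}

/* By-address model: the count field is re-read at query time, so the
   result follows later stores to it.  */
size_t
remaining_by_address (const int *count, size_t offset)
{
  size_t n;

  assert (count != NULL);
  n = *count < 0 ? 0 : (size_t) *count;
  return offset < n ? n - offset : 0;
}
```

With a count of 5 when `&s.b[2]` is taken and a later store of 3, the by-value model answers 3 for both queries, while the by-address model answers 3 and then 1, which is the "3 and 1 rather than 3 and 3" outcome described above.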

[PATCH 2/4] maintainer-scripts/gcc_release: create index between snapshots <-> commits

2023-11-02 Thread Sam James

Create and maintain a known_snapshots.txt index with space-separated format
BRANCH-DATE COMMIT.

For example:
8-20210107 5114ee0676e432493ada968e34071f02fb08114f
8-20210114 f9267925c648f2ccd9e4680b699e581003125bcf
...

This is helpful for bisects and quickly looking up the information from bug
reports.

maintainer-scripts/
* gcc_release: Create known_snapshots.txt as an index between snapshots
and commits.

Signed-off-by: Sam James 
---
Note that there are a few different approaches we can take here. I've gone
for the simpler one of having it still fetch from the remote site and parse
because it's obviously hard for me to test a part which runs on the remote
machine.

We can skip this patch for now if desired. I have mixed feelings about
complicating the contrib/generate_snapshot_index.py script to take a URL / path,
as I'd ideally like it to still be easily usable locally.

We could have it be generated locally and then uploaded as well, as another
option.

 maintainer-scripts/gcc_release | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/maintainer-scripts/gcc_release b/maintainer-scripts/gcc_release
index 962b8efe99a7..4cd1fa799660 100755
--- a/maintainer-scripts/gcc_release
+++ b/maintainer-scripts/gcc_release
@@ -448,6 +448,9 @@ announce_snapshot() {
   SNAPSHOT_INDEX=${RELEASE}/index.html
 
   changedir "${SNAPSHOTS_DIR}"
+  # Create an index if it doesn't already exist and populate it before we add
+  # the new snapshot.
+  ${PYTHON} ${SOURCE_DIRECTORY}/contrib/generate_snapshot_index.py || error 
"Failed to generate snapshot index"
   echo \
 "Snapshot gcc-"${RELEASE}" is now available on
   https://gcc.gnu.org/pub/gcc/snapshots/"${RELEASE}"/
@@ -514,6 +517,9 @@ Last modified "${TEXT_DATE}"
   rm -f LATEST-${BRANCH}
   ln -s ${RELEASE} LATEST-${BRANCH}
 
+  # Add the snapshot we just made to the index
+  printf "${RELEASE} ${GITREV}\n" >> known_snapshots.txt
+
   inform "Sending mail"
 
   export QMAILHOST=gcc.gnu.org
@@ -617,6 +623,7 @@ GZIP="${GZIP:-gzip --best}"
 SCP="${SCP:-scp -p}"
 SSH="${SSH:-ssh}"
 TAR="${TAR:-tar}"
+PYTHON="${PYTHON:-python3}"
 
 
 # Command Line Processing
-- 
2.42.0
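
As an illustration of the "BRANCH-DATE COMMIT" line format described in the commit message, here is a minimal C sketch of a consumer that splits and validates one line of known_snapshots.txt. The function `parse_snapshot_line` and its buffer sizes are hypothetical, and the sketch assumes the commit is a full 40-digit lowercase SHA-1 as in the examples.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: split one "BRANCH-DATE COMMIT" line into SNAPSHOT
   (at least 32 bytes) and COMMIT (at least 64 bytes).  Returns 1 if the
   line has both fields and the commit looks like a full SHA-1, else 0.  */
int
parse_snapshot_line (const char *line, char *snapshot, char *commit)
{
  assert (line != NULL && snapshot != NULL && commit != NULL);

  /* Field widths leave room for the terminating NUL in each buffer.  */
  if (sscanf (line, "%31s %63s", snapshot, commit) != 2)
    return 0;

  /* Assumption: commits are 40 lowercase hexadecimal digits.  */
  return strlen (commit) == 40
	 && strspn (commit, "0123456789abcdef") == 40;
}
```

A bisect helper could call this on each line and look up the commit for a snapshot name such as 8-20210107.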

[PATCH 1/4] contrib: add generate_snapshot_index.py

2023-11-02 Thread Sam James

Script to create a map between weekly snapshots and the commit they're based on
with space-separated format BRANCH-DATE COMMIT.

For example:
8-20210107 5114ee0676e432493ada968e34071f02fb08114f
8-20210114 f9267925c648f2ccd9e4680b699e581003125bcf
...

This is helpful for bisects and quickly looking up the information from bug
reports.

contrib/:
* generate_snapshot_index.py: New file.

Signed-off-by: Sam James 
---
 contrib/generate_snapshot_index.py | 79 ++
 1 file changed, 79 insertions(+)
 create mode 100755 contrib/generate_snapshot_index.py

diff --git a/contrib/generate_snapshot_index.py 
b/contrib/generate_snapshot_index.py
new file mode 100755
index ..80fc14b2cf1e
--- /dev/null
+++ b/contrib/generate_snapshot_index.py
@@ -0,0 +1,79 @@
+#!/usr/bin/env python3
+#
+# Copyright (C) 2023 Free Software Foundation, Inc.
+# Contributed by Sam James.
+#
+# This script is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# Script to create a map between weekly snapshots and the commit they're based on.
+# Creates known_snapshots.txt with space-separated format: BRANCH-DATE COMMIT
+# For example:
+# 8-20210107 5114ee0676e432493ada968e34071f02fb08114f
+# 8-20210114 f9267925c648f2ccd9e4680b699e581003125bcf
+
+import os
+import re
+import urllib.request
+
+MIRROR = "https://mirrorservice.org/sites/sourceware.org/pub/gcc/snapshots/"
+
+
+def get_remote_snapshot_list() -> list[str]:
+    # Parse the HTML index for links to snapshots
+    with urllib.request.urlopen(MIRROR) as index_response:
+        html = index_response.read().decode("utf-8")
+        snapshots = re.findall(r'href="([0-9]+-.*)"', html)
+
+    return snapshots
+
+
+def load_cached_entries() -> dict[str, str]:
+    local_snapshots = {}
+
+    with open("known_snapshots.txt", encoding="utf-8") as local_entries:
+        for entry in local_entries.readlines():
+            if not entry:
+                continue
+
+            date, commit = entry.strip().split(" ")
+            local_snapshots[date] = commit
+
+    return local_snapshots
+
+
+remote_snapshots = get_remote_snapshot_list()
+try:
+    known_snapshots = load_cached_entries()
+except FileNotFoundError:
+    # No cache available
+    known_snapshots = {}
+
+# This would give us chronological order (as in by creation)
+# snapshots.sort(reverse=False, key=lambda x: x.split('-')[1])
+# snapshots.sort(reverse=True, key=lambda x: x.split('-')[0])
+
+for snapshot in remote_snapshots:
+    # 8-20210107/ -> 8-20210107
+    snapshot = snapshot.strip("/")
+
+    # Don't fetch entries we already have stored.
+    if snapshot in known_snapshots:
+        continue
+
+    # The READMEs are plain text with several lines, one of which is:
+    # "with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-8 revision e4e5ad2304db534957c4af612aa288cb6ef51f25"
+    # We match after 'revision ' to grab the commit used.
+    with urllib.request.urlopen(f"{MIRROR}/{snapshot}/README") as readme_response:
+        data = readme_response.read().decode("utf-8")
+        parsed_commit = re.findall(r"revision (.*)", data)[0]
+        known_snapshots[snapshot] = parsed_commit
+
+# Dump it all back out to disk.
+with open("known_snapshots.txt.tmp", "w", encoding="utf-8") as known_entries:
+    for name, stored_commit in known_snapshots.items():
+        known_entries.write(f"{name} {stored_commit}\n")
+
+os.rename("known_snapshots.txt.tmp", "known_snapshots.txt")
-- 
2.42.0
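
The README parsing step in the script above can be tried offline. The following is a minimal sketch under stated assumptions: `parse_revision` is a hypothetical helper, the sample README text is assembled from the announcement wording and the example line quoted in the script's comment, and the regex is a stricter variation (hex digits only, returning None instead of raising) rather than the pattern the patch uses.

```python
import re
from typing import Optional

# Sample README text; assembled for illustration, not fetched from the mirror.
README = """\
Snapshot gcc-8-20210107 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/8-20210107/
This snapshot has been generated from the GCC 8 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-8 revision e4e5ad2304db534957c4af612aa288cb6ef51f25
"""


def parse_revision(readme: str) -> Optional[str]:
    """Return the commit hash following 'revision ', or None if absent."""
    match = re.search(r"revision ([0-9a-f]{7,40})", readme)
    return match.group(1) if match else None


print(parse_revision(README))
```

Returning None for a README without a revision line lets a caller skip that snapshot instead of aborting the whole run.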

[PATCH 3/4] maintainer-scripts/gcc_release: use HTTPS for links

2023-11-02 Thread Sam James

maintainer-scripts/
* gcc_release: Use HTTPS for links.

Signed-off-by: Sam James 
---
 maintainer-scripts/gcc_release | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/maintainer-scripts/gcc_release b/maintainer-scripts/gcc_release
index 4cd1fa799660..cf6a5731c609 100755
--- a/maintainer-scripts/gcc_release
+++ b/maintainer-scripts/gcc_release
@@ -25,7 +25,7 @@
 #
 # You should have received a copy of the GNU General Public License
 # along with GCC; see the file COPYING3.  If not see
-# <http://www.gnu.org/licenses/>.
+# <https://www.gnu.org/licenses/>.
 #
 
 
@@ -454,7 +454,7 @@ announce_snapshot() {
   echo \
 "Snapshot gcc-"${RELEASE}" is now available on
   https://gcc.gnu.org/pub/gcc/snapshots/"${RELEASE}"/
-and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.
+and on various mirrors, see https://gcc.gnu.org/mirrors.html for details.
 
 This snapshot has been generated from the GCC "${BRANCH}" git branch
 with the following options: "git://gcc.gnu.org/git/gcc.git branch ${GITBRANCH} revision ${GITREV}"
@@ -472,7 +472,7 @@ You'll find:
 
 GCC "${RELEASE}" Snapshot
 
-The <a href=\"http://gcc.gnu.org/\">GCC Project</a> makes
+The <a href=\"https://gcc.gnu.org/\">GCC Project</a> makes
 periodic snapshots of the GCC source tree available to the public
 for testing purposes.

-- 
2.42.0

[PATCH 4/4] maintainer-scripts/gcc_release: cleanup whitespace

2023-11-02 Thread Sam James

maintainer-scripts/
* gcc_release: Cleanup whitespace.

Signed-off-by: Sam James 
---
 maintainer-scripts/gcc_release | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/maintainer-scripts/gcc_release b/maintainer-scripts/gcc_release
index cf6a5731c609..965163b65b74 100755
--- a/maintainer-scripts/gcc_release
+++ b/maintainer-scripts/gcc_release
@@ -153,7 +153,7 @@ build_sources() {
   # Update this ChangeLog file only if it does not yet contain the
   # entry we are going to add.  (This is a safety net for repeated
   # runs of this script for the same release.)
-  if ! grep "GCC ${RELEASE} released." ${SOURCE_DIRECTORY}/${x} > 
/dev/null ; then   
+  if ! grep "GCC ${RELEASE} released." ${SOURCE_DIRECTORY}/${x} > 
/dev/null ; then
cat - ${SOURCE_DIRECTORY}/${x} > ${SOURCE_DIRECTORY}/${x}.new GCC Project makes
 periodic snapshots of the GCC source tree available to the public
 for testing purposes.
-   
+
 If you are planning to download and use one of our snapshots, then
 we highly recommend you join the GCC developers list.  Details for
 how to sign up can be found on the GCC project home page.
@@ -484,7 +484,7 @@ how to sign up can be found on the GCC project home page.
 with the following options: "git://gcc.gnu.org/git/gcc.git branch ${GITBRANCH} revision ${GITREV}"
 
 " > ${SNAPSHOT_INDEX}
-   
+
   snapshot_print gcc-${RELEASE}.tar.xz "Complete GCC"
 
   echo \
@@ -554,7 +554,7 @@ FTP_PATH=/var/ftp/pub/gcc
 # The directory in which snapshots will be placed.
 SNAPSHOTS_DIR=${FTP_PATH}/snapshots
 
-# The major number for the release.  For release `3.0.2' this would be 
+# The major number for the release.  For release `3.0.2' this would be
 # `3'
 RELEASE_MAJOR=""
 # The minor number for the release.  For release `3.0.2' this would be
@@ -566,7 +566,7 @@ RELEASE_REVISION=""
 # The complete name of the release.
 RELEASE=""
 
-# The name of the branch from which the release should be made, in a 
+# The name of the branch from which the release should be made, in a
 # user-friendly form.
 BRANCH=""
 
-- 
2.42.0

[PATCH v3 1/2] intl: remove, in favor of out-of-tree gettext

2023-11-02 Thread Arsen Arsenović

ChangeLog:

* intl/*: Remove.
---
 intl/ChangeLog   |  306 --
 intl/Makefile.in |  264 --
 intl/README  |   21 -
 intl/VERSION |1 -
 intl/aclocal.m4  |   33 -
 intl/bindtextdom.c   |  374 --
 intl/config.h.in |  280 --
 intl/config.intl.in  |   12 -
 intl/configure   | 8288 --
 intl/configure.ac|  108 -
 intl/dcgettext.c |   59 -
 intl/dcigettext.c| 1238 ---
 intl/dcngettext.c|   60 -
 intl/dgettext.c  |   60 -
 intl/dngettext.c |   62 -
 intl/eval-plural.h   |  114 -
 intl/explodename.c   |  192 -
 intl/finddomain.c|  195 -
 intl/gettext.c   |   64 -
 intl/gettextP.h  |  224 --
 intl/gmo.h   |  148 -
 intl/hash-string.h   |   59 -
 intl/intl-compat.c   |  151 -
 intl/l10nflist.c |  453 ---
 intl/libgnuintl.h|  341 --
 intl/loadinfo.h  |  156 -
 intl/loadmsgcat.c| 1322 ---
 intl/localcharset.c  |  398 --
 intl/localcharset.h  |   42 -
 intl/locale.alias|   78 -
 intl/localealias.c   |  419 ---
 intl/localename.c|  772 
 intl/log.c   |  104 -
 intl/ngettext.c  |   68 -
 intl/osdep.c |   24 -
 intl/plural-config.h |1 -
 intl/plural-exp.c|  156 -
 intl/plural-exp.h|  132 -
 intl/plural.c| 1540 
 intl/plural.y|  434 ---
 intl/relocatable.c   |  439 ---
 intl/relocatable.h   |   67 -
 intl/textdomain.c|  142 -
 43 files changed, 19401 deletions(-)
 delete mode 100644 intl/ChangeLog
 delete mode 100644 intl/Makefile.in
 delete mode 100644 intl/README
 delete mode 100644 intl/VERSION
 delete mode 100644 intl/aclocal.m4
 delete mode 100644 intl/bindtextdom.c
 delete mode 100644 intl/config.h.in
 delete mode 100644 intl/config.intl.in
 delete mode 100755 intl/configure
 delete mode 100644 intl/configure.ac
 delete mode 100644 intl/dcgettext.c
 delete mode 100644 intl/dcigettext.c
 delete mode 100644 intl/dcngettext.c
 delete mode 100644 intl/dgettext.c
 delete mode 100644 intl/dngettext.c
 delete mode 100644 intl/eval-plural.h
 delete mode 100644 intl/explodename.c
 delete mode 100644 intl/finddomain.c
 delete mode 100644 intl/gettext.c
 delete mode 100644 intl/gettextP.h
 delete mode 100644 intl/gmo.h
 delete mode 100644 intl/hash-string.h
 delete mode 100644 intl/intl-compat.c
 delete mode 100644 intl/l10nflist.c
 delete mode 100644 intl/libgnuintl.h
 delete mode 100644 intl/loadinfo.h
 delete mode 100644 intl/loadmsgcat.c
 delete mode 100644 intl/localcharset.c
 delete mode 100644 intl/localcharset.h
 delete mode 100644 intl/locale.alias
 delete mode 100644 intl/localealias.c
 delete mode 100644 intl/localename.c
 delete mode 100644 intl/log.c
 delete mode 100644 intl/ngettext.c
 delete mode 100644 intl/osdep.c
 delete mode 100644 intl/plural-config.h
 delete mode 100644 intl/plural-exp.c
 delete mode 100644 intl/plural-exp.h
 delete mode 100644 intl/plural.c
 delete mode 100644 intl/plural.y
 delete mode 100644 intl/relocatable.c
 delete mode 100644 intl/relocatable.h
 delete mode 100644 intl/textdomain.c

patch body dropped - it is just removals

[PATCH v3 0/2] Replace intl/ with out-of-tree GNU gettext

2023-11-02 Thread Arsen Arsenović

Morning!

This patch is a rebase and slight wording tweak of
https://inbox.sourceware.org/20231006140501.3370874-1-ar...@aarsen.me

Changes since v2:
- Elaborate on the libintl requirement on non-glibc hosts, per Andrew's
  request

Range diff since v2 (since it seems sufficiently readable here):
@@ gcc/doc/install.texi: which lets GCC output diagnostics in languages
  English.  Native Language Support is enabled by default if not doing a
  canadian cross build.  The @option{--disable-nls} option disables NLS@.
  
++Note that this functionality requires either libintl (provided by GNU
++gettext) or a C standard library that contains support for gettext (such
++as the GNU C Library).
++@xref{with-included-gettext,,--with-included-gettext} for more
++information on the conditions required to get gettext support.
++
 +@item --with-libintl-prefix=@var{dir}
 +@itemx --without-libintl-prefix
 +Searches for libintl in @file{@var{dir}/include} and
@@ gcc/doc/install.texi: which lets GCC output diagnostics in languages
 +Specifies the type of library to search for when looking for libintl.
 +@var{type} can be one of @code{auto}, @code{static} or @code{shared}.
 +
++@anchor{with-included-gettext}
  @item --with-included-gettext


OK for trunk?  (granted that a regstrap + hand-test for working
localization passes - build's ongoing asynchronously)

Thanks in advance, have a lovely day.

Arsen Arsenović (2):
  intl: remove, in favor of out-of-tree gettext
  *: add modern gettext

 .gitignore |1 +
 Makefile.def   |   72 +-
 Makefile.in| 1612 +++
 config/gettext-sister.m4   |   35 +-
 config/gettext.m4  |  357 +-
 config/iconv.m4|  313 +-
 config/intlmacosx.m4   |   69 +
 configure  |   44 +-
 configure.ac   |   44 +-
 contrib/download_prerequisites |2 +
 contrib/prerequisites.md5  |1 +
 contrib/prerequisites.sha512   |1 +
 gcc/Makefile.in|8 +-
 gcc/aclocal.m4 |4 +
 gcc/configure  | 2001 +++-
 gcc/doc/install.texi   |   72 +-
 intl/ChangeLog |  306 --
 intl/Makefile.in   |  264 -
 intl/README|   21 -
 intl/VERSION   |1 -
 intl/aclocal.m4|   33 -
 intl/bindtextdom.c |  374 --
 intl/config.h.in   |  280 --
 intl/config.intl.in|   12 -
 intl/configure | 8288 
 intl/configure.ac  |  108 -
 intl/dcgettext.c   |   59 -
 intl/dcigettext.c  | 1238 -
 intl/dcngettext.c  |   60 -
 intl/dgettext.c|   60 -
 intl/dngettext.c   |   62 -
 intl/eval-plural.h |  114 -
 intl/explodename.c |  192 -
 intl/finddomain.c  |  195 -
 intl/gettext.c |   64 -
 intl/gettextP.h|  224 -
 intl/gmo.h |  148 -
 intl/hash-string.h |   59 -
 intl/intl-compat.c |  151 -
 intl/l10nflist.c   |  453 --
 intl/libgnuintl.h  |  341 --
 intl/loadinfo.h|  156 -
 intl/loadmsgcat.c  | 1322 -
 intl/localcharset.c|  398 --
 intl/localcharset.h|   42 -
 intl/locale.alias  |   78 -
 intl/localealias.c |  419 --
 intl/localename.c  |  772 ---
 intl/log.c |  104 -
 intl/ngettext.c|   68 -
 intl/osdep.c   |   24 -
 intl/plural-config.h   |1 -
 intl/plural-exp.c  |  156 -
 intl/plural-exp.h  |  132 -
 intl/plural.c  | 1540 --
 intl/plural.y  |  434 --
 intl/relocatable.c |  439 --
 intl/relocatable.h |   67 -
 intl/textdomain.c  |  142 -
 libcpp/aclocal.m4  |5 +
 libcpp/configure   | 2139 -
 libstdc++-v3/configure |  727 +--
 62 files changed, 5467 insertions(+), 21441 deletions(-)
 create mode 100644 config/intlmacosx.m4
 delete mode 100644 intl/ChangeLog
 delete mode 100644 intl/Makefile.in
 delete mode 100644 intl/README
 delete mode 100644 intl/VERSION
 delete mode 100644 intl/aclocal.m4
 delete mode 100644 intl/bindtextdom.c
 delete mode 100644 intl/config.h.in
 delete mode 100644 intl/config.intl.in
 delete mode 100755 intl/configure
 delete mode 100644 intl/configure.ac
 delete mode 100644 intl/dcgettext.c
 delete mode 100644 intl/dcigettext.c
 delete mode 100644 intl/dcngettext.c
 delete mode 100644 intl/dgettext.c
 delete mode 100644 intl/dngettext.c
 delete mode 100644 intl/eval-plural.h
 delete mode 100644 intl/explodename.c
 delete mode 100644 intl/finddomain.c
 delete mode 100644 intl/gettext.c
 delete mode 100644 intl/gettextP.h
 delete mode 100644 intl/gmo.h
 delete mode 1006

[PATCH v3 2/2] *: add modern gettext

2023-11-02 Thread Arsen Arsenović

This patch updates gettext.m4 and related .m4 files and adds
gettext-runtime as a gmp/mpfr/... style host library, allowing newer
libintl to be used.

This patch /does not/ add build-time tools required for
internationalizing (msgfmt et al), instead, it just updates the runtime
library.  The result should be a distribution that acts exactly the same
when a copy of gettext is present, and disables internationalization
otherwise.

There should be no changes in behavior when gettext is included in-tree.
When gettext is not included in tree, nor available on the system, the
programs will be built without localization.

ChangeLog:

PR bootstrap/12596
* .gitignore: Add '/gettext*'.
* configure.ac (host_libs): Replace intl with gettext.
(hbaseargs, bbaseargs, baseargs): Split baseargs into
{h,b}baseargs.
(skip_barg): New flag.  Skips appending current flag to
bbaseargs.
: Exempt --with-libintl-{type,prefix} from
target and build machine argument passing.
* configure: Regenerate.
* Makefile.def (host_modules): Replace intl module with gettext
module.
(configure-ld): Depend on configure-gettext.
* Makefile.in: Regenerate.

config/ChangeLog:

* intlmacosx.m4: Import from gettext-0.22 (serial 8).
* gettext.m4: Sync with gettext-0.22 (serial 77).
* gettext-sister.m4 (ZW_GNU_GETTEXT_SISTER_DIR): Load gettext's
uninstalled-config.sh, or call AM_GNU_GETTEXT if missing.
* iconv.m4: Sync with gettext-0.22 (serial 26).

contrib/ChangeLog:

* prerequisites.sha512: Add gettext.
* prerequisites.md5: Add gettext.
* download_prerequisites: Add gettext.

gcc/ChangeLog:

* configure: Regenerate.
* aclocal.m4: Regenerate.
* Makefile.in (LIBDEPS): Remove (potential) ./ prefix from
LIBINTL_DEP.
* doc/install.texi: Document new (notable) flags added by the
optional gettext tree and by AM_GNU_GETTEXT.  Document libintl/libc
with gettext dependency.

libcpp/ChangeLog:

* configure: Regenerate.
* aclocal.m4: Regenerate.
---
 .gitignore |1 +
 Makefile.def   |   72 +-
 Makefile.in| 1612 
 config/gettext-sister.m4   |   35 +-
 config/gettext.m4  |  357 +++---
 config/iconv.m4|  313 +++--
 config/intlmacosx.m4   |   69 ++
 configure  |   44 +-
 configure.ac   |   44 +-
 contrib/download_prerequisites |2 +
 contrib/prerequisites.md5  |1 +
 contrib/prerequisites.sha512   |1 +
 gcc/Makefile.in|8 +-
 gcc/aclocal.m4 |4 +
 gcc/configure  | 2001 +++---
 gcc/doc/install.texi   |   72 +-
 libcpp/aclocal.m4  |5 +
 libcpp/configure   | 2139 
 libstdc++-v3/configure |  727 ---
 19 files changed, 5467 insertions(+), 2040 deletions(-)
 create mode 100644 config/intlmacosx.m4

diff --git a/.gitignore b/.gitignore
index 5cc4a0fdfa61..93a16b0b950c 100644
--- a/.gitignore
+++ b/.gitignore
@@ -69,3 +69,4 @@ stamp-*
 /mpc*
 /gmp*
 /isl*
+/gettext*
diff --git a/Makefile.def b/Makefile.def
index 15c068e4ac40..792f81447e1b 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -74,8 +74,14 @@ host_modules= { module= isl; lib_path=.libs; bootstrap=true;
 host_modules= { module= gold; bootstrap=true; };
 host_modules= { module= gprof; };
 host_modules= { module= gprofng; };
-// intl acts on 'host_shared' directly, and does not support --with-pic.
-host_modules= { module= intl; bootstrap=true; };
+host_modules= { module= gettext; bootstrap=true; no_install=true;
+module_srcdir= "gettext/gettext-runtime";
+   // We always build gettext with pic, because some packages (e.g. gdbserver)
+   // need it in some configurations, which is determined via nontrivial tests.
+   // Always enabling pic seems to make sense for something tied to
+   // user-facing output.
+extra_configure_flags='--disable-shared --disable-java --disable-csharp --with-pic';
+lib_path=intl/.libs; };
 host_modules= { module= tcl;
 missing=mostlyclean; };
 host_modules= { module= itcl; };
@@ -345,7 +351,7 @@ dependencies = { module=all-build-fixincludes; on=all-build-libiberty; };
 dependencies = { module=all-build-libcpp; on=all-build-libiberty; };
 
 // Host modules specific to gcc.
-dependencies = { module=configure-gcc; on=configure-intl; };
+dependencies = { module=configure-gcc; on=configure-gettext; };
 dependencies = { module=configure-gcc; on=all-gmp; };
 dependencies = { module=configure-gcc; on=all-mpfr; };
 dependencies = { module=configure-gcc; on=all-mpc; };
@@ -357,7 +363,7 @@ dependencies = { module=confi

Re: [PATCH 2/2] tree-optimization/111131 - SLP for non-IFN gathers

2023-11-02 Thread Thomas Schwinge

Hi!

On 2023-10-31T13:10:24+, Richard Biener  wrote:
> On Tue, 31 Oct 2023, Thomas Schwinge wrote:
>> On 2023-10-19T11:47:14+, Richard Biener  wrote:
>> > The following implements SLP vectorization support for gathers
>> > without relying on IFNs being pattern detected (and supported by
>> > the target).  That includes support for emulated gathers but also
>> > the legacy x86 builtin path.
>> >
>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, will push.
>>
>> For GCN (tested '-march=gfx90a'), I see:
>>
>>  PASS: gcc.dg/vect/vect-gather-2.c (test for excess errors)
>> +FAIL: gcc.dg/vect/vect-gather-2.c scan-tree-dump vect "different gather base"
>> +FAIL: gcc.dg/vect/vect-gather-2.c scan-tree-dump vect "different gather scale"
>> +PASS: gcc.dg/vect/vect-gather-2.c scan-tree-dump-not vect "Loop contains only SLP stmts"
>
> Ah, for gather IFNs pattern matched it will instead have
>
> Build SLP failed: different calls in patt_55 = .GATHER_LOAD ((sizetype)
> x2_29(D), _15, 4, 0);
>
> but then I have put in
>
> /* { dg-final { scan-tree-dump "different gather base" vect { target { !
> vect_gather_load_ifn } } } } */
> /* { dg-final { scan-tree-dump "different gather scale" vect { target { !
> vect_gather_load_ifn } } } } */
>
> and expected gcn to have vect_gather_load_ifn ... but that is
>
> proc check_effective_target_vect_gather_load_ifn { } {
> return [expr { [check_effective_target_aarch64_sve]
>|| [check_effective_target_riscv_v] }]
> }
>
> probably add
>
>|| [istarget amdgcn*-*-*]
>
> there?  Can you do that (after checking it doesn't break other
> tests)?

Oh indeed, thanks!  Pushed to master branch
commit 36a26298ec7dfca615d4ba411a3508d1287d6ce5
"Make GCN target effective-target 'vect_gather_load_ifn'", see attached.


Grüße
 Thomas


From 36a26298ec7dfca615d4ba411a3508d1287d6ce5 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 31 Oct 2023 14:31:37 +0100
Subject: [PATCH] Make GCN target effective-target 'vect_gather_load_ifn'

This fixes:

 PASS: gcc.dg/vect/vect-gather-2.c (test for excess errors)
-FAIL: gcc.dg/vect/vect-gather-2.c scan-tree-dump vect "different gather base"
-FAIL: gcc.dg/vect/vect-gather-2.c scan-tree-dump vect "different gather scale"
 PASS: gcc.dg/vect/vect-gather-2.c scan-tree-dump-not vect "Loop contains only SLP stmts"

..., and enables other test cases.

	gcc/testsuite/
	* lib/target-supports.exp
	(check_effective_target_vect_gather_load_ifn): True for GCN
	target.
---
 gcc/testsuite/lib/target-supports.exp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index a5f393e1c10..bc93f6e158b 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8773,6 +8773,7 @@ proc check_effective_target_vect_masked_store { } {
 
 proc check_effective_target_vect_gather_load_ifn { } {
 return [expr { [check_effective_target_aarch64_sve]
+		   || [istarget amdgcn*-*-*]
 		   || [check_effective_target_riscv_v] }]
 }
 
-- 
2.34.1

[PATCH] doc: explicitly say 'lifetime' for DCE

2023-11-02 Thread Sam James

Say 'memory lifetime' rather than 'memory life' as lifetime is the more
standard term nowadays (indeed we have e.g. -fno-lifetime-dse).

It's also easier to grep for if someone is looking for the documentation on
where we do that.

gcc/ChangeLog:
* doc/passes.texi (Dead code elimination): Explicitly say 'lifetime'
as this has become the standard term for what we're doing here.

Signed-off-by: Sam James 
---
 gcc/doc/passes.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index eb2bb6062834..470ac498a132 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -543,7 +543,7 @@ and is defined by @code{pass_early_warn_uninitialized} and
 @item Dead code elimination
 
 This pass scans the function for statements without side effects whose
-result is unused.  It does not do memory life analysis, so any value
+result is unused.  It does not do memory lifetime analysis, so any value
 that is stored in memory is considered used.  The pass is run multiple
 times throughout the optimization process.  It is located in
 @file{tree-ssa-dce.cc} and is described by @code{pass_dce}.
-- 
2.42.0

Re: [PATCH 2/4] maintainer-scripts/gcc_release: create index between snapshots <-> commits

2023-11-02 Thread Jonathan Wakely


On 02/11/23 08:39 +, Sam James wrote:

Create and maintain a known_snapshots.txt index with space-separated format
BRANCH-DATE COMMIT.

For example:
8-20210107 5114ee0676e432493ada968e34071f02fb08114f
8-20210114 f9267925c648f2ccd9e4680b699e581003125bcf
...

This is helpful for bisects and quickly looking up the information from bug
reports.



Is there any reason we don't just use git tags for this?

We could run a one-off job to create all the historical tags (setting
GIT_COMMITTER_DATE and GIT_AUTHOR_DATE so the tags are backdated), and
add a tagging step to the snapshot creation.

Git tags are cheap, but I can imagine a concern about hundreds of new
tags "littering" the output of 'git tag -l'. I don't _think_ you can
put tags under an alternative ref that isn't fetched by default (as we
do with refs/users and refs/vendor). I think tags have to go under
refs/tags. But grep -v could be used to filter out snapshot tags
easily.

We could use https://git-scm.com/docs/gitnamespaces for this though,
so that git --namespace=snapshots tag -l would show the snapshot tags.
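
A minimal sketch of what the one-off backdating job could compute for each index entry follows. It only builds the command line and environment and does not run git. The helper name `backdated_tag_command`, the `snapshots/` tag prefix, the tag message, and the choice of midnight UTC on the snapshot date are all assumptions made for illustration. An annotated tag is used because a lightweight tag carries no date of its own; both date variables are set only to mirror the suggestion above.

```python
from datetime import datetime


def backdated_tag_command(snapshot: str, commit: str):
    """Build (argv, extra_env) for tagging `commit` as a backdated snapshot."""
    # "8-20210107" -> branch "8", date "20210107"
    _branch, date_part = snapshot.rsplit("-", 1)
    day = datetime.strptime(date_part, "%Y%m%d")
    # Assumption: backdate to midnight UTC on the snapshot date.
    stamp = day.strftime("%Y-%m-%d 00:00:00 +0000")
    argv = [
        "git", "tag", "-a",
        "-m", f"Snapshot {snapshot}",
        f"snapshots/{snapshot}",  # hypothetical tag naming scheme
        commit,
    ]
    env = {"GIT_COMMITTER_DATE": stamp, "GIT_AUTHOR_DATE": stamp}
    return argv, env


argv, env = backdated_tag_command(
    "8-20210107", "5114ee0676e432493ada968e34071f02fb08114f")
print(" ".join(argv))
print(env["GIT_COMMITTER_DATE"])
```

To actually create the tag, the pair could be passed to `subprocess.run(argv, env={**os.environ, **env}, check=True)` from inside a GCC checkout.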

Re: [PATCH v4] libgfortran: Replace mutex with rwlock

2023-11-02 Thread Bernhard Reutner-Fischer

[CCing Ian as libgcc maintainer]

On Wed, 1 Nov 2023 10:14:37 +
"Zhu, Lipeng"  wrote:

> > >
> > > Hi Lipeng,
> > >  
> > > > >>> Sure, as your comments, in the patch V6, I added 3 test cases with
> > > > >>> OpenMP to test different cases in concurrency respectively:
> > > > >>> 1. find and create unit very frequently to stress read lock and write
> > > > >>> lock.
> > > > >>> 2. only access the unit which exist in cache to stress read lock.
> > > > >>> 3. access the same unit in concurrency.
> > > > >>> For the third test case, it also help to find a bug:  When unit
> > > > >>> can't be found in cache nor unit list in read phase, then threads
> > > > >>> will try to acquire write lock to insert the same unit, this will
> > > > >>> cause duplicate key error.
> > > > >>> To fix this bug, I get the unit from unit list once again before
> > > > >>> insert in write lock.
> > > > >>> More details you can refer the patch v6.
> > > > >>>
> > > >>
> > > >> Could you help to review this update? I really appreciate your 
> > > >> assistance.
> > > >>  
> > >  
> > > > Could you help to review this update?  Any concern will be appreciated. 
> > > >  
> > >
> > > Fortran parts are OK (I think I wrote that already), we need somebody
> > > for the non-Fortran parts.
> > >  
> > Hi Thomas,
> > 
> > Thanks for your response. Very appreciate for your patience and help.
> >   
> > > Jakub, could you maybe take a look?
> > >
> > > Best regards
> > >
> > >   Thomas  
> > 
> > Hi Jakub,
> > 
> > Can you help to take a look at the change for libgcc part that added several
> > rwlock macros in libgcc/gthr-posix.h?
> >   
> 
> Hi Jakub,
> 
> Could you help to review this, any comment will be greatly appreciated.

Latest version is at
https://inbox.sourceware.org/gcc-patches/20230818031818.2161842-1-lipeng@intel.com/
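
For readers skimming the thread, the duplicate-key fix described in the quoted message (look the unit up again after acquiring the write lock, before inserting) can be sketched as below. This is a toy model and not libgfortran code: Python's standard library has no reader-writer lock, so an unlocked dictionary lookup stands in for the read phase and a plain `threading.Lock` stands in for the write lock; `UnitTable` and `find_or_create` are hypothetical names.

```python
import threading


class UnitTable:
    """Toy stand-in for the unit cache/list; not libgfortran's data structure."""

    def __init__(self):
        self._write_lock = threading.Lock()
        self._units = {}
        self.created = 0  # how many units were actually inserted

    def find_or_create(self, number):
        # Read phase: look the unit up without taking the write lock.
        unit = self._units.get(number)
        if unit is not None:
            return unit
        with self._write_lock:
            # Another thread may have inserted the unit between our failed
            # lookup and acquiring the write lock, so look again first.
            unit = self._units.get(number)
            if unit is None:
                unit = {"number": number}
                self._units[number] = unit
                self.created += 1
            return unit


table = UnitTable()
results = []


def worker():
    results.append(table.find_or_create(10))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(table.created)
```

Without the second lookup inside the locked region, two threads that both missed in the read phase would each try to insert unit 10.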

> 
> > Best Regards,
> > Lipeng Zhu  
>

[COMMITTED] i386: Move stack protector patterns above mov $0 -> xor peephole

2023-11-02 Thread Uros Bizjak

Move stack protector patterns above mov $0,%reg -> xor %reg,%reg
so the latter won't interfere with stack protector peephole2s.

gcc/ChangeLog:

* config/i386/i386.md: Move stack protector patterns
above mov $0,%reg -> xor %reg,%reg peephole2 pattern.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 75c75f610c2..0528b8379bf 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -24281,6 +24281,141 @@ (define_expand "restore_stack_nonlocal"
   DONE;
 })
 
+(define_expand "stack_protect_set"
+  [(match_operand 0 "memory_operand")
+   (match_operand 1 "memory_operand")]
+  ""
+{
+  rtx scratch = gen_reg_rtx (word_mode);
+
+  emit_insn (gen_stack_protect_set_1
+(ptr_mode, word_mode, operands[0], operands[1], scratch));
+  DONE;
+})
+
+(define_insn "@stack_protect_set_1__"
+  [(set (match_operand:PTR 0 "memory_operand" "=m")
+   (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")]
+   UNSPEC_SP_SET))
+   (set (match_operand:SWI48 2 "register_operand" "=&r") (const_int 0))
+   (clobber (reg:CC FLAGS_REG))]
+  ""
+{
+  output_asm_insn ("mov{}\t{%1, %2|%2, %1}",
+  operands);
+  output_asm_insn ("mov{}\t{%2, %0|%0, %2}",
+  operands);
+  return "xor{l}\t%k2, %k2";
+}
+  [(set_attr "type" "multi")])
+
+;; Patterns and peephole2s to optimize stack_protect_set_1_
+;; immediately followed by *mov{s,d}i_internal, where we can avoid
+;; the xor{l} above.  We don't split this, so that scheduling or
+;; anything else doesn't separate the *stack_protect_set* pattern from
+;; the set of the register that overwrites the register with a new value.
+
+(define_peephole2
+  [(parallel [(set (match_operand:PTR 0 "memory_operand")
+  (unspec:PTR [(match_operand:PTR 1 "memory_operand")]
+  UNSPEC_SP_SET))
+ (set (match_operand:W 2 "general_reg_operand") (const_int 0))
+ (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_operand:SWI48 3 "general_reg_operand")
+  (match_operand:SWI48 4 "const0_operand"))
+ (clobber (reg:CC FLAGS_REG))])]
+  "peep2_reg_dead_p (0, operands[3])
+   && peep2_reg_dead_p (1, operands[2])"
+  [(parallel [(set (match_dup 0)
+  (unspec:PTR [(match_dup 1)] UNSPEC_SP_SET))
+ (set (match_dup 3) (const_int 0))
+ (clobber (reg:CC FLAGS_REG))])])
+
+(define_insn "*stack_protect_set_2__si"
+  [(set (match_operand:PTR 0 "memory_operand" "=m")
+   (unspec:PTR [(match_operand:PTR 3 "memory_operand" "m")]
+   UNSPEC_SP_SET))
+   (set (match_operand:SI 1 "register_operand" "=&r")
+   (match_operand:SI 2 "general_operand" "g"))]
+  "reload_completed"
+{
+  output_asm_insn ("mov{}\t{%3, %1|%1, %3}", operands);
+  output_asm_insn ("mov{}\t{%1, %0|%0, %1}", operands);
+  if (pic_32bit_operand (operands[2], SImode)
+  || ix86_use_lea_for_mov (insn, operands + 1))
+return "lea{l}\t{%E2, %1|%1, %E2}";
+  else
+return "mov{l}\t{%2, %1|%1, %2}";
+}
+  [(set_attr "type" "multi")
+   (set_attr "length" "24")])
+
+(define_insn "*stack_protect_set_2__di"
+  [(set (match_operand:PTR 0 "memory_operand" "=m,m,m")
+   (unspec:PTR [(match_operand:PTR 3 "memory_operand" "m,m,m")]
+   UNSPEC_SP_SET))
+   (set (match_operand:DI 1 "register_operand" "=&r,&r,&r")
+   (match_operand:DI 2 "general_operand" "Z,rem,i"))]
+  "TARGET_64BIT && reload_completed"
+{
+  output_asm_insn ("mov{}\t{%3, %1|%1, %3}", operands);
+  output_asm_insn ("mov{}\t{%1, %0|%0, %1}", operands);
+  if (pic_32bit_operand (operands[2], DImode))
+return "lea{q}\t{%E2, %1|%1, %E2}";
+  else if (which_alternative == 0)
+return "mov{l}\t{%k2, %k1|%k1, %k2}";
+  else if (which_alternative == 2)
+return "movabs{q}\t{%2, %1|%1, %2}";
+  else if (ix86_use_lea_for_mov (insn, operands + 1))
+return "lea{q}\t{%E2, %1|%1, %E2}";
+  else
+return "mov{q}\t{%2, %1|%1, %2}";
+}
+  [(set_attr "type" "multi")
+   (set_attr "length" "24")])
+
+(define_peephole2
+  [(parallel [(set (match_operand:PTR 0 "memory_operand")
+  (unspec:PTR [(match_operand:PTR 1 "memory_operand")]
+  UNSPEC_SP_SET))
+ (set (match_operand:W 2 "general_reg_operand") (const_int 0))
+ (clobber (reg:CC FLAGS_REG))])
+   (set (match_operand:SWI48 3 "general_reg_operand")
+   (match_operand:SWI48 4 "general_gr_operand"))]
+  "peep2_reg_dead_p (0, operands[3])
+   && peep2_reg_dead_p (1, operands[2])"
+  [(parallel [(set (match_dup 0)
+  (unspec:PTR [(match_dup 1)] UNSPEC_SP_SET))
+ (set (match_dup 3) (match_dup 4))])])
+
+(define_expand "stack_protect_test"
+  [(match_operand 0 "memory_operand")
+   (match_operand 1 "memory_operand")
+   (match_operand 2)]
+  ""
+{
+  rtx flags = gen_rtx_REG (CCZmode, FLAGS_REG);
+
+  emit_i

Re: [PATCH 2/4] maintainer-scripts/gcc_release: create index between snapshots <-> commits

2023-11-02 Thread Andreas Schwab

On Nov 02 2023, Jonathan Wakely wrote:

> Git tags are cheap, but I can imagine a concern about hundreds of new
> tags "littering" the output of 'git tag -l'. I don't _think_ you can
> put tags under an alternative ref that isn't fetched by default (as we
> do with refs/users and refs/vendor). I think tags have to go under
> refs/tags. But grep -v could be used to filter out snapshot tags
> easily.

There is no inherent limitation on publishing tags outside of refs/tags,
to make them invisible by git tag.  There are already existing examples
of tags residing under various refs/users and refs/vendors namespaces.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Richard Biener

On Thu, Nov 2, 2023 at 9:27 AM Jakub Jelinek  wrote:
>
> On Thu, Nov 02, 2023 at 08:57:36AM +0100, Richard Biener wrote:
> > You probably want to specify that when a pointer to the array is taken the
> > pointer has to be to the first array element (or do we want to mangle the
> > 'size' accordingly for the instrumentation?).  You also want to specify that
> > the 'size' associated with such pointer is assumed to be unchanging and
> > after changing the size such pointer has to be re-obtained.  Plus that
> > changes to the allocated object/size have to be performed through an
> > lvalue where the containing type and thus the 'counted_by' attribute is
> > visible.  That is,
> >
> > size_t *s = &a.size;
> > *s = 1;
> >
> > is invoking undefined behavior, likewise modifying 'buf' (makes it a bit
> > awkward since for example that wouldn't support using posix_memalign
> > for allocation, though aligned_alloc would be fine).
>
> Depends on what behavior we want to guarantee and what kind of price we want
> to pay for it.  If the size is .ACCESS_WITH_SIZE operand, the size used in
> __bdos will be whatever counted_by size an array had upon taking address of
> the array, wherever that happens in the program.  And while we can CSE
> the calls, they'd be CSEd only if they have the same size.
>
> Or, if we want to pay further price, .ACCESS_WITH_SIZE could take as one of
> the arguments not the size value, but its address.  Then at __bdos time
> we would dereference that pointer to get the size.
> So,
> struct S { int a; char b __attribute__((counted_by (a))) []; };
> struct S s;
> s.a = 5;
> char *p = &s.b[2];
> int i1 = __builtin_dynamic_object_size (p, 0);
> s.a = 3;
> int i2 = __builtin_dynamic_object_size (p, 0);
> would then yield 3 and 1 rather than 3 and 3.

I fail to see how we can get the __builtin_dynamic_object_size call
data dependent on s.a, thus avoid re-ordering or even DSE of the
store.

Basically the model is that __builtin_dynamic_object_size will get
you the size at the point 'p' was formed from something that "last"
had the container with the counted_by attribute visible (plus adjustments
to 'p' in between that we are able to track).

s.a = 5;
char *p = &s.b[0];

will get you '5' as size,

char *p = &s.b[0];
s.a = 7;

will get you whatever was in 's.a' at the point of the address taking;
s.a = 7 will _not_ be honored for __builtin_dynamic_object_size
calls on 'p'.
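
As a rough illustration of the two models discussed in this thread, here is a plain C sketch. It is not GCC code and does not use counted_by or __builtin_dynamic_object_size; every name in it (struct ptr_by_value, take_by_addr, and so on) is hypothetical, and a fixed-size array stands in for the flexible array member. The "value" model snapshots the count when the pointer is formed; the "address" model remembers where the count lives and re-reads it at query time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a struct whose array is counted by 'a'.
   A fixed-size array keeps the sketch free of allocation.  */
struct S { int a; char b[8]; };

/* Value model: the count is read once, when the pointer is formed.  */
struct ptr_by_value { char *p; size_t remaining; };

/* Address model: the pointer remembers where the count is stored.  */
struct ptr_by_addr { char *p; const int *count; size_t offset; };

struct ptr_by_value
take_by_value (struct S *s, size_t idx)
{
  assert (idx <= (size_t) s->a);
  struct ptr_by_value r = { &s->b[idx], (size_t) s->a - idx };
  return r;
}

struct ptr_by_addr
take_by_addr (struct S *s, size_t idx)
{
  struct ptr_by_addr r = { &s->b[idx], &s->a, idx };
  return r;
}

/* Size query in the value model: later stores to s->a are not seen.  */
size_t
size_by_value (struct ptr_by_value v)
{
  return v.remaining;
}

/* Size query in the address model: the count is re-read on every call.  */
size_t
size_by_addr (struct ptr_by_addr v)
{
  size_t n = (size_t) *v.count;
  return n > v.offset ? n - v.offset : 0;
}
```

With s.a = 5 and a pointer to element 2, both queries give 3; after s.a = 3 the value model still gives 3 while the address model gives 1, which matches the "3 and 1 rather than 3 and 3" case quoted above.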

>  But dunno if we wouldn't
> need to drop leaf attribute from __bdos to make that work, that would be
> I think a significant case against doing that, because while in all the
> current plans one just pay code performance price when using counted_by
> attribute, even when not using __bdos for it, if we had to make __bdos
> non-leaf we'd pay extra price even when nobody is using that attribute
> just in -D_FORTIFY_SOURCE=3 / -fhardened compilations, which is how
> several distros build basically everything.
>
> Jakub
>

Re: [PATCH] doc: explicitly say 'lifetime' for DCE

2023-11-02 Thread Richard Biener

On Thu, Nov 2, 2023 at 10:03 AM Sam James  wrote:
>
> Say 'memory lifetime' rather than 'memory life' as lifetime is the more
> standard term nowadays (indeed we have e.g. -fno-lifetime-dse).
>
> It's also easier to grep for if someone is looking for the documentation on
> where we do that.

OK

> gcc/ChangeLog:
> * doc/passes.texi (Dead code elimination): Explicitly say 'lifetime'
> as this has become the standard term for what we're doing here.
>
> Signed-off-by: Sam James 
> ---
>  gcc/doc/passes.texi | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
> index eb2bb6062834..470ac498a132 100644
> --- a/gcc/doc/passes.texi
> +++ b/gcc/doc/passes.texi
> @@ -543,7 +543,7 @@ and is defined by @code{pass_early_warn_uninitialized} and
>  @item Dead code elimination
>
>  This pass scans the function for statements without side effects whose
> -result is unused.  It does not do memory life analysis, so any value
> +result is unused.  It does not do memory lifetime analysis, so any value
>  that is stored in memory is considered used.  The pass is run multiple
>  times throughout the optimization process.  It is located in
>  @file{tree-ssa-dce.cc} and is described by @code{pass_dce}.
> --
> 2.42.0
>

Re: [PATCH 2/4] maintainer-scripts/gcc_release: create index between snapshots <-> commits

2023-11-02 Thread Jonathan Wakely

On Thu, 2 Nov 2023 at 10:23, Andreas Schwab wrote:
>
> On Nov 02 2023, Jonathan Wakely wrote:
>
> > Git tags are cheap, but I can imagine a concern about hundreds of new
> > tags "littering" the output of 'git tag -l'. I don't _think_ you can
> > put tags under an alternative ref that isn't fetched by default (as we
> > do with refs/users and refs/vendor). I think tags have to go under
> > refs/tags. But grep -v could be used to filter out snapshot tags
> > easily.
>
> There is no inherent limitation on publishing tags outside of refs/tags,
> to make them invisible by git tag.  There are already existing examples
> of tags residing under various refs/users and refs/vendors namespaces.


Ah, good to know, thanks.

So then there's no reason that snapshots would have to clutter up the
list of default tags for anybody who isn't interested in them.

Re: [PATCH] doc: explicitly say 'lifetime' for DCE

2023-11-02 Thread Sam James



Richard Biener  writes:

> On Thu, Nov 2, 2023 at 10:03 AM Sam James  wrote:
>>
>> Say 'memory lifetime' rather than 'memory life' as lifetime is the more
>> standard term nowadays (indeed we have e.g. -fno-lifetime-dse).
>>
>> It's also easier to grep for if someone is looking for the documentation on
>> where we do that.
>
> OK

Could you push for me please? I have a sw account but no gcc access
(yet).

cheers

>
>> gcc/ChangeLog:
>> * doc/passes.texi (Dead code elimination): Explicitly say 'lifetime'
>> as this has become the standard term for what we're doing here.
>>
>> Signed-off-by: Sam James 
>> ---
>>  gcc/doc/passes.texi | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
>> index eb2bb6062834..470ac498a132 100644
>> --- a/gcc/doc/passes.texi
>> +++ b/gcc/doc/passes.texi
>> @@ -543,7 +543,7 @@ and is defined by @code{pass_early_warn_uninitialized} 
>> and
>>  @item Dead code elimination
>>
>>  This pass scans the function for statements without side effects whose
>> -result is unused.  It does not do memory life analysis, so any value
>> +result is unused.  It does not do memory lifetime analysis, so any value
>>  that is stored in memory is considered used.  The pass is run multiple
>>  times throughout the optimization process.  It is located in
>>  @file{tree-ssa-dce.cc} and is described by @code{pass_dce}.
>> --
>> 2.42.0
>>

[PATCH] tree-optimization/112320 - bogus debug IL after SCCP

2023-11-02 Thread Richard Biener

The following addresses wrong debug IL created by SCCP rewriting stmts
to defined overflow.  I also addressed another inefficiency there, which
required adjusting the API of rewrite_to_defined_overflow: it now takes
a stmt iterator for in-place operation and a stmt for producing a
sequence, because gsi_for_stmt doesn't work for stmts not in the IL.
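
For readers unfamiliar with the transformation itself, here is a source-level C sketch of what "rewriting to defined overflow" means: the arithmetic is carried out in the corresponding unsigned type and the result is converted back. This only illustrates the idea and is not the GIMPLE code in gimple-fold.cc; the function names are made up for the example, and the conversion back to int is implementation-defined in ISO C (GCC documents it as wrapping modulo 2^N).

```c
#include <assert.h>
#include <limits.h>

/* Signed addition without undefined overflow: add in unsigned,
   then convert the result back to the signed type.  */
int
add_defined_overflow (int a, int b)
{
  unsigned int ua = (unsigned int) a;
  unsigned int ub = (unsigned int) b;
  return (int) (ua + ub);
}

/* Small self-check; the last assertion relies on GCC's documented
   modulo behavior for out-of-range unsigned-to-signed conversion.  */
void
check_add_defined_overflow (void)
{
  assert (add_defined_overflow (2, 3) == 5);
  assert (add_defined_overflow (-1, 1) == 0);
  assert (add_defined_overflow (INT_MAX, 1) == INT_MIN);
}
```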

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/112320
* gimple-fold.h (rewrite_to_defined_overflow): New overload
for in-place operation.
* gimple-fold.cc (rewrite_to_defined_overflow): Add stmt
iterator argument to worker, define separate API for
in-place and not in-place operation.
* tree-if-conv.cc (predicate_statements): Simplify.
* tree-scalar-evolution.cc (final_value_replacement_loop):
Likewise.
* tree-ssa-ifcombine.cc (pass_tree_ifcombine::execute): Adjust.
* tree-ssa-reassoc.cc (update_range_test): Likewise.

* gcc.dg/pr112320.c: New testcase.
---
 gcc/gimple-fold.cc  | 25 ++---
 gcc/gimple-fold.h   |  3 ++-
 gcc/testsuite/gcc.dg/pr112320.c | 14 ++
 gcc/tree-if-conv.cc | 19 +--
 gcc/tree-scalar-evolution.cc| 15 ---
 gcc/tree-ssa-ifcombine.cc   |  2 +-
 gcc/tree-ssa-reassoc.cc |  2 +-
 7 files changed, 41 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr112320.c

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 853edd9e5d4..a5be2ee048b 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -8769,12 +8769,14 @@ arith_code_with_undefined_signed_overflow (tree_code code)
its operand, carrying out the operation in the corresponding unsigned
type and converting the result back to the original type.
 
-   If IN_PLACE is true, adjust the stmt in place and return NULL.
+   If IN_PLACE is true, *GSI points to STMT, adjust the stmt in place and
+   return NULL.
Otherwise returns a sequence of statements that replace STMT and also
contain a modified form of STMT itself.  */
 
-gimple_seq
-rewrite_to_defined_overflow (gimple *stmt, bool in_place /* = false */)
+static gimple_seq
+rewrite_to_defined_overflow (gimple_stmt_iterator *gsi, gimple *stmt,
+bool in_place)
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
 {
@@ -8801,9 +8803,8 @@ rewrite_to_defined_overflow (gimple *stmt, bool in_place /* = false */)
   gimple_set_modified (stmt, true);
   if (in_place)
 {
-  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
   if (stmts)
-   gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
+   gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
   stmts = NULL;
 }
   else
@@ -8811,8 +8812,7 @@ rewrite_to_defined_overflow (gimple *stmt, bool in_place /* = false */)
   gimple *cvt = gimple_build_assign (lhs, NOP_EXPR, gimple_assign_lhs (stmt));
   if (in_place)
 {
-  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
-  gsi_insert_after (&gsi, cvt, GSI_SAME_STMT);
+  gsi_insert_after (gsi, cvt, GSI_SAME_STMT);
   update_stmt (stmt);
 }
   else
@@ -8821,6 +8821,17 @@ rewrite_to_defined_overflow (gimple *stmt, bool in_place /* = false */)
   return stmts;
 }
 
+void
+rewrite_to_defined_overflow (gimple_stmt_iterator *gsi)
+{
+  rewrite_to_defined_overflow (gsi, gsi_stmt (*gsi), true);
+}
+
+gimple_seq
+rewrite_to_defined_overflow (gimple *stmt)
+{
+  return rewrite_to_defined_overflow (nullptr, stmt, false);
+}
 
 /* The valueization hook we use for the gimple_build API simplification.
This makes us match fold_buildN behavior by only combining with
diff --git a/gcc/gimple-fold.h b/gcc/gimple-fold.h
index 2fd58db9a2e..f69bcc7d3e4 100644
--- a/gcc/gimple-fold.h
+++ b/gcc/gimple-fold.h
@@ -60,7 +60,8 @@ extern tree gimple_fold_indirect_ref (tree);
 extern bool gimple_fold_builtin_sprintf (gimple_stmt_iterator *);
 extern bool gimple_fold_builtin_snprintf (gimple_stmt_iterator *);
 extern bool arith_code_with_undefined_signed_overflow (tree_code);
-extern gimple_seq rewrite_to_defined_overflow (gimple *, bool = false);
+extern void rewrite_to_defined_overflow (gimple_stmt_iterator *);
+extern gimple_seq rewrite_to_defined_overflow (gimple *);
 extern void replace_call_with_value (gimple_stmt_iterator *, tree);
 extern tree tree_vec_extract (gimple_stmt_iterator *, tree, tree, tree, tree);
 extern void gsi_replace_with_seq_vops (gimple_stmt_iterator *, gimple_seq);
diff --git a/gcc/testsuite/gcc.dg/pr112320.c b/gcc/testsuite/gcc.dg/pr112320.c
new file mode 100644
index 000..15cf39f898c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr112320.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O -g" } */
+
+unsigned void0_effective_addr2;
+int void0_i, void0_m, void0_p2;
+void void0()
+{
+  void0_m = 800 - (void0_effective_addr2 & 5);
+  int b1;
+  void0_i = 0;
+  for (; void0_i < void0

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Jakub Jelinek

On Thu, Nov 02, 2023 at 11:18:09AM +0100, Richard Biener wrote:
> > Or, if we want to pay further price, .ACCESS_WITH_SIZE could take as one of
> > the arguments not the size value, but its address.  Then at __bdos time
> > we would dereference that pointer to get the size.
> > So,
> > struct S { int a; char b __attribute__((counted_by (a))) []; };
> > struct S s;
> > s.a = 5;
> > char *p = &s.b[2];
> > int i1 = __builtin_dynamic_object_size (p, 0);
> > s.a = 3;
> > int i2 = __builtin_dynamic_object_size (p, 0);
> > would then yield 3 and 1 rather than 3 and 3.
> 
> I fail to see how we can get the __builtin_dynamic_object_size call
> data dependent on s.a, thus avoid re-ordering or even DSE of the
> store.

If &s.b[2] is lowered as
sz_1 = s.a;
tmp_2 = .ACCESS_WITH_SIZE (&s.b[0], sz_1);
p_3 = &tmp_2[2];
then sure, there is no way, you get the size from that point.
tree-object-size.cc tracking then determines that in a particular
case the pointer is size associated with sz_1 and uses that value
as the size (with the usual adjustments for pointer arithmetic and the
like).

What I meant is to emit
tmp_4 = .ACCESS_WITH_SIZE (&s.b[0], &s.a, (typeof (&s.a)) 0);
p_5 = &tmp_4[2];
i.e. don't associate the pointer with a value of the size, but with
an address where to find the size (plus how large it is), basically escape
pointer to the size at that point.  And __builtin_dynamic_object_size is pure,
so supposedly it can depend on what the escaped pointer points to.
We'd see that a particular pointer is size associated with &s.a address
and would use that address cast to the type of the third argument (to
preserve the exact pointer type on INTEGER_CST, though not sure, wouldn't
VN CSE it anyway if one has say
union U { struct S { int a; char b __attribute__((counted_by (a))) []; } s;
  struct T { char c, d, e, f; char g __attribute__((counted_by (c))) []; } t; };
and
.ACCESS_WITH_SIZE (&v.s.b[0], &v.s.a, (int *) 0);
...
.ACCESS_WITH_SIZE (&v.t.g[0], &v.t.c, (int *) 0);
?

It would mean though that counted_by wouldn't be allowed to be a
bit-field...

Jakub

Re: [committed] libstdc++: Minor update to installation docs

2023-11-02 Thread Jonathan Wakely

On Wed, 1 Nov 2023 at 22:11, Gerald Pfeifer  wrote:
>
> On Mon, 18 Sep 2023, Jonathan Wakely via Gcc-patches wrote:
> > @@ -103,8 +103,10 @@ ln -s libiconv-1.16 libiconv
> >   
> > If GCC 3.1.0 or later on is being used on GNU/Linux, an attempt
> > will be made to use "C" library functionality necessary for
> > -   C++ named locale support.  For GCC 4.6.0 and later, this
> > -   means that glibc 2.3 or later is required.
> > +   C++ named locale support, e.g. the newlocale
> > +   and uselocale functions.
> > +   For GCC 4.6.0 and later,
> > +   this means that glibc 2.3 or later is required.
>
> Do we still need to provide those details on GCC 3.1+ and GCC 4.6+?
>
> Would it make sense to simply require glibc 2.3 (or higher)?

Yes that probably makes sense now.

Re: [tree-optimization/111721] VECT: Support SLP for MASK_LEN_GATHER_LOAD with dummy mask

2023-11-02 Thread Richard Biener

On Thu, 2 Nov 2023, Juzhe-Zhong wrote:

> This patch fixes following FAILs for RVV:
> FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects  scan-tree-dump 
> vect "Loop contains only SLP stmts"
> FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP 
> stmts"
> 
> Bootstrap on X86 and regtest passed.
> 
> Tested on aarch64 passed.
> 
> Ok for trunk ?
> 
> PR tree-optimization/111721
> 
> gcc/ChangeLog:
> 
> * tree-vect-slp.cc (vect_get_and_check_slp_defs): Support SLP for 
> dummy mask -1.
> * tree-vect-stmts.cc (vectorizable_load): Ditto.
> 
> ---
>  gcc/tree-vect-slp.cc   | 14 --
>  gcc/tree-vect-stmts.cc |  8 +++-
>  2 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 43d742e3c92..23ca0318e31 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -756,8 +756,7 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned 
> char swap,
>   {
> tree type = TREE_TYPE (oprnd);
> dt = dts[i];
> -   if ((dt == vect_constant_def
> -|| dt == vect_external_def)
> +   if (dt == vect_external_def
> && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
> && (TREE_CODE (type) == BOOLEAN_TYPE
> || !can_duplicate_and_interleave_p (vinfo, stmts.length (),
> @@ -769,6 +768,17 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned 
> char swap,
>"for variable-length SLP %T\n", oprnd);
> return -1;
>   }
> +   if (dt == vect_constant_def
> +   && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
> +   && !can_duplicate_and_interleave_p (vinfo, stmts.length (), type))
> + {
> +   if (dump_enabled_p ())
> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +  "Build SLP failed: invalid type of def "
> +  "for variable-length SLP %T\n",
> +  oprnd);
> +   return -1;
> + }

I don't think that's quite correct.  can_duplicate_and_interleave_p
doesn't get enough info here and IIRC even materializing arbitrary
constants isn't possible with VLA vectors.  The very first thing
the function does is

  tree base_vector_type = get_vectype_for_scalar_type (vinfo, elt_type, count);
  if (!base_vector_type || !VECTOR_MODE_P (TYPE_MODE (base_vector_type)))
return false;

but for masks that's not going to get us the correct vector type.
While I don't understand why we have that 'BOOLEAN_TYPE' special
case (maybe the intent was to identify 'mask' operands that way?),
we might want to require that we can materialize both all-zero
and all-ones constant 'mask's.  But then 'mask' operands should
be properly identified here.

Maybe we can also simply delay the check to the point where we know
whether we're facing a uniform constant or not (note for 'first',
we cannot really special-case vect_constant_def as the second
SLP lane might demote that to vect_external_def).  It's always
a balance of whether to reject something at SLP build time (possibly
allowing operand swapping to do magic) or to delay checks
to stmt analysis time.  That might also explain why you
do not see fallout of the "wrong" change (the later checking
will catch it anyway).

There's probably testsuite coverage for SVE here.

That said, a "correct" patch might be to simply change

  && (TREE_CODE (type) == BOOLEAN_TYPE
      || !can_duplicate_and_interleave_p (vinfo, stmts.length (),
                                          type)))

to

  && TREE_CODE (type) != BOOLEAN_TYPE
  && !can_duplicate_and_interleave_p (vinfo, stmts.length (),
                                      type)

thus delay 'mask' operand validation here.
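
To make the effect of that condition change concrete, here is a small C truth-table sketch. It is not vectorizer code: the three booleans are hypothetical stand-ins for "the vector mode size is not constant", "TREE_CODE (type) == BOOLEAN_TYPE" and the result of can_duplicate_and_interleave_p, and the shared dt test is left out because it is the same in both forms.

```c
#include <assert.h>
#include <stdbool.h>

/* Existing form: a boolean type is rejected outright for VLA modes.  */
bool
reject_existing (bool vla, bool is_boolean, bool can_dup)
{
  return vla && (is_boolean || !can_dup);
}

/* Suggested form: boolean (mask) operands are never rejected here,
   so their validation is left to later analysis.  */
bool
reject_suggested (bool vla, bool is_boolean, bool can_dup)
{
  return vla && !is_boolean && !can_dup;
}

/* The two forms disagree exactly when the mode is variable-length
   and the operand type is boolean.  */
void
check_condition_difference (void)
{
  for (int v = 0; v < 2; v++)
    for (int b = 0; b < 2; b++)
      for (int c = 0; c < 2; c++)
        {
          bool differ = reject_existing (v, b, c) != reject_suggested (v, b, c);
          assert (differ == (v && b));
        }
}
```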

Note I still think we should improve TREE_CODE (type) == BOOLEAN_TYPE
to identify internal function mask operands only.

Richard.

>  
> /* For the swapping logic below force vect_reduction_def
>for the reduction op in a SLP reduction group.  */
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 6ce4868d3e1..6c47121e158 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -9859,10 +9859,16 @@ vectorizable_load (vec_info *vinfo,
>mask_index = internal_fn_mask_index (ifn);
>if (mask_index >= 0 && slp_node)
>   mask_index = vect_slp_child_index_for_operand (call, mask_index);
> +  slp_tree slp_op = NULL;
>if (mask_index >= 0
> && !vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_index,
> -   &mask, NULL, &mask_dt, &mask_vectype))
> +   &mask, &slp_op, &mask_dt, &mask_vectype))
>   return false;
> +  /* MASK_LEN_GATHER_LOAD dummy mask -1 should always match the
> +  MASK_VECTYPE.  */
>

[PATCH] RISC-V: Fix bug of AVL propagation PASS

2023-11-02 Thread Juzhe-Zhong

A run FAIL suddenly showed up for me today:
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c execution test

that I didn't have before.

After investigation, I realized that there is a bug in the AVL propagation PASS.

gcc/ChangeLog:

* config/riscv/riscv-avlprop.cc
(pass_avlprop::get_vlmax_ta_preferred_avl): Don't allow non-real insn AVL
propagation.

---
 gcc/config/riscv/riscv-avlprop.cc | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/config/riscv/riscv-avlprop.cc b/gcc/config/riscv/riscv-avlprop.cc
index bec1e3c715a..1dfaa8742da 100644
--- a/gcc/config/riscv/riscv-avlprop.cc
+++ b/gcc/config/riscv/riscv-avlprop.cc
@@ -308,6 +308,13 @@ pass_avlprop::get_vlmax_ta_preferred_avl (insn_info *insn) const
  def_info *def2 = dl.prev_def (use_insn);
  if (!def1 || !def2 || def1 != def2)
return NULL_RTX;
+ /* For vectorized codes, we always use SELECT_VL/MIN_EXPR to
+calculate the loop len at the header of the loop.
+We only allow AVL propagation for real instruction for now.
+TODO: We may enhance it for intrinsic codes if it is necessary.
+ */
+ if (!def1->insn ()->is_real ())
+   return NULL_RTX;
 
  /* FIXME: We only all AVL propation within a block which should
 be totally enough for vectorized codes.
-- 
2.36.3

Re: [PATCH] RISC-V: Fix bug of AVL propagation PASS

2023-11-02 Thread Robin Dapp

LGTM.

Regards
 Robin

RE: [PATCH] RISC-V: Fix bug of AVL propagation PASS

2023-11-02 Thread Li, Pan2

Committed, thanks Robin.

Pan

-Original Message-
From: Robin Dapp  
Sent: Thursday, November 2, 2023 7:34 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; kito.ch...@gmail.com; kito.ch...@sifive.com; 
jeffreya...@gmail.com
Subject: Re: [PATCH] RISC-V: Fix bug of AVL propagation PASS

LGTM.

Regards
 Robin

[PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

2023-11-02 Thread pan2 . li

From: Pan Li 

The previous rounding APIs starting with i/l/ll only work on types of
the same mode size, for example the conversions below, and the iterator
was arranged similarly to fcvt.

* SF => SI
* DF => DI

Now that this limitation has been lifted in the middle-end, these APIs
can also be vectorized with different type sizes, namely:

* HF => SI, HF => DI
* SF => DI, SF => SI
* DF => SI, DF => DI

The existing iterator cannot handle this simply, so this patch
rearranges it into two iterators:

* V_VLS_F_CONVERT_SI: handle (HF, SF, DF) => SI
* V_VLS_F_CONVERT_DI: handle (HF, SF, DF) => DI

The related mode_attrs are updated to match the new iterators.

gcc/ChangeLog:

* config/riscv/autovec.md (lrint2): Remove.
(lround2): Ditto.
(lceil2): Ditto.
(lfloor2): Ditto.
(lrint2): New pattern for cvt from
FP to SI.
(lround2): Ditto.
(lceil2): Ditto.
(lfloor2): Ditto.
(lrint2): New pattern for cvt from
FP to DI.
(lround2): Ditto.
(lceil2): Ditto.
(lfloor2): Ditto.
* config/riscv/vector-iterators.md: Renew iterators for both
the SI and DI.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md  |  72 +++---
 gcc/config/riscv/vector-iterators.md | 199 ---
 2 files changed, 237 insertions(+), 34 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index f5e3e347ace..81acb1a815b 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2395,42 +2395,82 @@ (define_expand "roundeven2"
   }
 )
 
-(define_expand "lrint2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
+(define_expand "lrint2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
   "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
-riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
 DONE;
   }
 )
 
-(define_expand "lround2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
+(define_expand "lrint2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
   "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
-riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
 DONE;
   }
 )
 
-(define_expand "lceil2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
+(define_expand "lround2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
   "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
-riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
 DONE;
   }
 )
 
-(define_expand "lfloor2"
-  [(match_operand:0 "register_operand")
-   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
+(define_expand "lround2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
   "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
-riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lround (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
+
+(define_expand "lceil2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
+
+(define_expand "lceil2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lceil (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
+
+(define_expand "lfloor2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_SI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, 
mode);
+DONE;
+  }
+)
+
+(define_expand "lfloor2"
+  [(match_operand:   0 "register_operand")
+   (match_operand:V_VLS_F_CONVERT_DI 1 "register_operand")]
+  "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
+  {
+riscv_vector::expand_vec_lfloor (operands[0], operands[1], mode, 
mode);
 DONE;
   }
 )
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index d9b5dec5edb..f2d9f60b631 100644
--- a/gcc/config/riscv/vector-iter

Re: [PATCH] doc: explicitly say 'lifetime' for DCE

2023-11-02 Thread Richard Biener

On Thu, Nov 2, 2023 at 11:25 AM Sam James  wrote:
>
>
> Richard Biener  writes:
>
> > On Thu, Nov 2, 2023 at 10:03 AM Sam James  wrote:
> >>
> >> Say 'memory lifetime' rather than 'memory life' as lifetime is the more
> >> standard term nowadays (indeed we have e.g. -fno-lifetime-dse).
> >>
> >> It's also easier to grep for if someone is looking for the documentation on
> >> where we do that.
> >
> > OK
>
> Could you push for me please? I have a sw account but no gcc access
> (yet).

Done after fixing ChangeLog format.

Richard.

> cheers
>
> >
> >> gcc/ChangeLog:
> >> * doc/passes.texi (Dead code elimination): Explicitly say 'lifetime'
> >> as this has become the standard term for what we're doing here.
> >>
> >> Signed-off-by: Sam James 
> >> ---
> >>  gcc/doc/passes.texi | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
> >> index eb2bb6062834..470ac498a132 100644
> >> --- a/gcc/doc/passes.texi
> >> +++ b/gcc/doc/passes.texi
> >> @@ -543,7 +543,7 @@ and is defined by @code{pass_early_warn_uninitialized} 
> >> and
> >>  @item Dead code elimination
> >>
> >>  This pass scans the function for statements without side effects whose
> >> -result is unused.  It does not do memory life analysis, so any value
> >> +result is unused.  It does not do memory lifetime analysis, so any value
> >>  that is stored in memory is considered used.  The pass is run multiple
> >>  times throughout the optimization process.  It is located in
> >>  @file{tree-ssa-dce.cc} and is described by @code{pass_dce}.
> >> --
> >> 2.42.0
> >>
>

[AVR PATCH] Optimize (X>>C)&1 for C in [1, 4, 8, 16, 24] in *insv.any_shift..

2023-11-02 Thread Roger Sayle


This patch optimizes a few special cases in avr.md's *insv.any_shift.
instruction.  This template handles tests for a single bit, where the
result has only a single (different) bit set.  Currently this always
requires a three-instruction sequence of a BST, a CLR and a BLD (plus any
additional CLR instructions to clear the rest of the result bytes).
The special cases considered here are those that can be done with only two
instructions (plus CLRs); an ANDI preceded by either a MOV, a SHIFT or a
SWAP.

Hence for C=1 in HImode, GCC with -O2 currently generates:

bst r24,1
clr r24
clr r25
bld r24,0

with this patch, we now generate:

lsr r24
andi r24,1
clr r25

Likewise, HImode C=4 now becomes:

swap r24
andi r24,1
clr r25

and SImode C=8 now becomes:

mov r22,r23
andi r22,1
clr r23
clr r24
clr r25
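
For reference, here is a C sketch of the kind of source expression behind the three sequences above, together with byte-level models of the shorter sequences. The function names are made up for this example, the models only mirror the instruction sequences at the level of plain C, and fixed-width types are used so the sketch behaves the same on a host as on AVR (where HImode is 16 bits and SImode is 32 bits).

```c
#include <assert.h>
#include <stdint.h>

/* Source-level forms: (X >> C) & 1.  */
uint16_t bit1_hi (uint16_t x) { return (x >> 1) & 1; }  /* HImode, C=1 */
uint16_t bit4_hi (uint16_t x) { return (x >> 4) & 1; }  /* HImode, C=4 */
uint32_t bit8_si (uint32_t x) { return (x >> 8) & 1; }  /* SImode, C=8 */

/* Model of SWAP: exchange the two nibbles of a byte.  */
uint8_t
swap_nibbles (uint8_t b)
{
  return (uint8_t) ((b << 4) | (b >> 4));
}

/* lsr r24; andi r24,1; clr r25  */
uint16_t
bit1_hi_model (uint16_t x)
{
  uint8_t lo = (uint8_t) x;
  return (uint8_t) (lo >> 1) & 1;
}

/* swap r24; andi r24,1; clr r25  */
uint16_t
bit4_hi_model (uint16_t x)
{
  uint8_t lo = (uint8_t) x;
  return swap_nibbles (lo) & 1;
}

/* mov r22,r23; andi r22,1; clr r23; clr r24; clr r25  */
uint32_t
bit8_si_model (uint32_t x)
{
  uint8_t byte1 = (uint8_t) (x >> 8);
  return byte1 & 1;
}

/* Exhaustive check over all 16-bit inputs.  */
void
check_models (void)
{
  for (uint32_t x = 0; x <= 0xFFFFu; x++)
    {
      assert (bit1_hi_model ((uint16_t) x) == bit1_hi ((uint16_t) x));
      assert (bit4_hi_model ((uint16_t) x) == bit4_hi ((uint16_t) x));
      assert (bit8_si_model (x) == bit8_si (x));
    }
}
```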


I've not attempted to model the instruction length accurately for these
special cases; the logic would be ugly, but it's safe to use the current
(1 insn longer) length.

This patch has been (partially) tested with a cross-compiler to avr-elf
hosted on x86_64, without a simulator, where the compile-only tests in
the gcc testsuite show no regressions.  If someone could test this more
thoroughly that would be great.


2023-11-02  Roger Sayle  

gcc/ChangeLog
* config/avr/avr.md (*insv.any_shift.): Optimize special
cases of *insv.any_shift that save one instruction by using
ANDI with either a MOV, a SHIFT or a SWAP.

gcc/testsuite/ChangeLog
* gcc.target/avr/insvhi-1.c: New HImode test case.
* gcc.target/avr/insvhi-2.c: Likewise.
* gcc.target/avr/insvhi-3.c: Likewise.
* gcc.target/avr/insvhi-4.c: Likewise.
* gcc.target/avr/insvhi-5.c: Likewise.
* gcc.target/avr/insvqi-1.c: New QImode test case.
* gcc.target/avr/insvqi-2.c: Likewise.
* gcc.target/avr/insvqi-3.c: Likewise.
* gcc.target/avr/insvqi-4.c: Likewise.
* gcc.target/avr/insvsi-1.c: New SImode test case.
* gcc.target/avr/insvsi-2.c: Likewise.
* gcc.target/avr/insvsi-3.c: Likewise.
* gcc.target/avr/insvsi-4.c: Likewise.
* gcc.target/avr/insvsi-5.c: Likewise.
* gcc.target/avr/insvsi-6.c: Likewise.


Thanks in advance,
Roger
--

diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
index 83dd15040b07..c2a1931733f8 100644
--- a/gcc/config/avr/avr.md
+++ b/gcc/config/avr/avr.md
@@ -9840,6 +9840,7 @@
(clobber (reg:CC REG_CC))]
   "reload_completed"
   {
+int ldi_ok = test_hard_reg_class (LD_REGS, operands[0]);
 int shift =  == ASHIFT ? INTVAL (operands[2]) : -INTVAL 
(operands[2]);
 int mask = GET_MODE_MASK (mode) & INTVAL (operands[3]);
 // Position of the output / input bit, respectively.
@@ -9850,6 +9851,217 @@
 operands[3] = GEN_INT (obit);
 operands[2] = GEN_INT (ibit);
 
+/* Special cases requiring MOV to low byte and ANDI.  */
+if ((shift & 7) == 0 && ldi_ok)
+  {
+   if (IN_RANGE (obit, 0, 7))
+ {
+   if (shift == -8)
+ {
+   if ( == 2)
+ return "mov %A0,%B1\;andi %A0,lo8(1<<%3)\;clr %B0";
+   if ( == 3)
+ return "mov %A0,%B1\;andi %A0,lo8(1<<%3)\;clr %B0\;clr %C0";
+   if ( == 4 && !AVR_HAVE_MOVW)
+ return "mov %A0,%B1\;andi %A0,lo8(1<<%3)\;"
+"clr %B0\;clr %C0\;clr %D0";
+ }
+   else if (shift == -16)
+ {
+   if ( == 3)
+ return "mov %A0,%C1\;andi %A0,lo8(1<<%3)\;clr %B0\;clr %C0";
+   if ( == 4 && !AVR_HAVE_MOVW)
+ return "mov %A0,%C1\;andi %A0,lo8(1<<%3)\;"
+"clr %B0\;clr %C0\;clr %D0";
+ }
+   else if (shift == -24 && !AVR_HAVE_MOVW)
+ return "mov %A0,%D1\;andi %A0,lo8(1<<%3)\;"
+"clr %B0\;clr %C0\;clr %D0";
+ }
+
+   /* Special cases requiring MOV and ANDI.  */
+   else if (IN_RANGE (obit, 8, 15))
+ {
+   if (shift == 8)
+ {
+   if ( == 2)
+ return "mov %B0,%A1\;andi %B0,lo8(1<<(%3-8))\;clr %A0";
+   if ( == 3)
+ return "mov %B0,%A1\;andi %B0,lo8(1<<(%3-8))\;"
+"clr %A0\;clr %C0";
+   if ( == 4 && !AVR_HAVE_MOVW)
+ return "mov %B0,%A1\;andi %B0,lo8(1<<(%3-8))\;"
+"clr %A0\;clr %C0\;clr %D0";
+ }
+   else if (shift == -8)
+ {
+   if ( == 3)
+ return "mov %B0,%C1\;andi %B0,lo8(1<<(%3-8))\;"
+"clr %A0\;clr %C0";
+   if ( == 4 && !AVR_HAVE_MOVW)
+ return "mov %B0,%C1\;andi %B0,lo8(1<<(%3-8))\;"
+"clr %B0\;clr %C0\;clr %D0";
+

[AVR PATCH] Improvements to SImode and PSImode shifts by constants.

2023-11-02 Thread Roger Sayle


This patch provides non-looping implementations for more SImode (32-bit)
and PSImode (24-bit) shifts on AVR.  For most cases, these are shorter
and faster than using a loop, but for a few (controlled by optimize_size)
they are a little larger but significantly faster.  The approach is to
perform byte-based shifts by 1, 2 or 3 bytes, followed by bit-based shifts
(effectively in a narrower type) for the remaining bits, beyond 8, 16 or 24.

For example, the simple test case below (inspired by PR 112268):

unsigned long foo(unsigned long x)
{
  return x >> 26;
}

gcc -O2 currently generates:

foo:    ldi r18,26
1:      lsr r25
        ror r24
        ror r23
        ror r22
        dec r18
        brne 1b
        ret

which is 8 instructions, and takes ~158 cycles.
With this patch, we now generate:

foo:    mov r22,r25
        clr r23
        clr r24
        clr r25
        lsr r22
        lsr r22
        ret

which is 7 instructions, and takes ~7 cycles.
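
The decomposition behind the shorter sequence can be sketched in C as follows. This is a model of the idea only, not the code in avr.cc, and the function names are invented for the example: a shift by 26 becomes a move of the top byte into the low byte (a shift by 24) followed by two single-bit shifts on that byte, and it is compared against a bit-at-a-time loop like the one currently generated.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: shift right one bit at a time, 26 times.  */
uint32_t
lshr26_loop (uint32_t x)
{
  for (int i = 0; i < 26; i++)
    x >>= 1;
  return x;
}

/* Byte move plus narrow shift:
   mov r22,r25; clr r23; clr r24; clr r25; lsr r22; lsr r22  */
uint32_t
lshr26_bytewise (uint32_t x)
{
  uint8_t top = (uint8_t) (x >> 24);  /* byte-based shift by 3 bytes */
  top = (uint8_t) (top >> 1);         /* remaining 2 bits, done on  */
  top = (uint8_t) (top >> 1);         /* a single byte              */
  return top;                         /* upper three bytes are zero */
}

/* Spot-check a few inputs against the reference loop.  */
void
check_lshr26 (void)
{
  static const uint32_t inputs[] = {
    0u, 1u, 0x03FFFFFFu, 0x04000000u, 0x80000000u, 0xDEADBEEFu, 0xFFFFFFFFu
  };
  for (unsigned i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
    assert (lshr26_bytewise (inputs[i]) == lshr26_loop (inputs[i]));
}
```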

One complication is that the modified functions sometimes use spaces instead
of TABs, with occasional mistakes in GNU-style formatting, so I've fixed
these indentation/whitespace issues.  There's no change in the code for the
cases previously handled/special-cased, with the exception of ashrqi3 reg,5
where with -Os a (4-instruction) loop is shorter than the five single-bit
shifts of a fully unrolled implementation.

This patch has been (partially) tested with a cross-compiler to avr-elf
hosted on x86_64, without a simulator, where the compile-only tests in
the gcc testsuite show no regressions.  If someone could test this more
thoroughly that would be great.


2023-11-02  Roger Sayle  

gcc/ChangeLog
* config/avr/avr.cc (ashlqi3_out): Fix indentation whitespace.
(ashlhi3_out): Likewise.
(avr_out_ashlpsi3): Likewise.  Handle shifts by 9 and 17-22.
(ashlsi3_out): Fix formatting.  Handle shifts by 9 and 25-30.
(ashrqi3_out): Use loop for shifts by 5 when optimizing for size.
Fix indentation whitespace.
(ashrhi3_out): Likewise.
(avr_out_ashrpsi3): Likewise.  Handle shifts by 17.
(ashrsi3_out): Fix indentation.  Handle shifts by 17 and 25.
(lshrqi3_out): Fix whitespace.
(lshrhi3_out): Likewise.
(avr_out_lshrpsi3): Likewise.  Handle shifts by 9 and 17-22.
(lshrsi3_out): Fix indentation.  Handle shifts by 9,17,18 and 25-30.

gcc/testsuite/ChangeLog
* gcc.target/avr/ashlsi-1.c: New test case.
* gcc.target/avr/ashlsi-2.c: Likewise.
* gcc.target/avr/ashrsi-1.c: Likewise.
* gcc.target/avr/ashrsi-2.c: Likewise.
* gcc.target/avr/lshrsi-1.c: Likewise.
* gcc.target/avr/lshrsi-2.c: Likewise.


Thanks in advance,
Roger
--

diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index 5e0217de36fc..706599b4aa6a 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -6715,7 +6715,7 @@ ashlqi3_out (rtx_insn *insn, rtx operands[], int *len)
 fatal_insn ("internal compiler error.  Incorrect shift:", insn);
 
   out_shift_with_cnt ("lsl %0",
-  insn, operands, len, 1);
+ insn, operands, len, 1);
   return "";
 }
 
@@ -6728,8 +6728,8 @@ ashlhi3_out (rtx_insn *insn, rtx operands[], int *len)
   if (CONST_INT_P (operands[2]))
 {
   int scratch = (GET_CODE (PATTERN (insn)) == PARALLEL
- && XVECLEN (PATTERN (insn), 0) == 3
- && REG_P (operands[3]));
+&& XVECLEN (PATTERN (insn), 0) == 3
+&& REG_P (operands[3]));
   int ldi_ok = test_hard_reg_class (LD_REGS, operands[0]);
   int k;
   int *t = len;
@@ -6826,8 +6826,9 @@ ashlhi3_out (rtx_insn *insn, rtx operands[], int *len)
  "ror %A0");
 
case 8:
- return *len = 2, ("mov %B0,%A1" CR_TAB
-   "clr %A0");
+ *len = 2;
+ return ("mov %B0,%A1" CR_TAB
+ "clr %A0");
 
case 9:
  *len = 3;
@@ -6974,7 +6975,7 @@ ashlhi3_out (rtx_insn *insn, rtx operands[], int *len)
   len = t;
 }
   out_shift_with_cnt ("lsl %A0" CR_TAB
-  "rol %B0", insn, operands, len, 2);
+ "rol %B0", insn, operands, len, 2);
   return "";
 }
 
@@ -6990,54 +6991,126 @@ avr_out_ashlpsi3 (rtx_insn *insn, rtx *op, int *plen)
   if (CONST_INT_P (op[2]))
 {
   switch (INTVAL (op[2]))
-{
-default:
-  if (INTVAL (op[2]) < 24)
-break;
+   {
+   default:
+ if (INTVAL (op[2]) < 24)
+   break;
 
-  return avr_asm_len ("clr %A0" CR_TAB
-  "clr %B0" CR_TAB
-  "clr %C0", op, plen, 3);
+ return avr_asm_len ("clr %A0" CR_TAB
+ "clr %B0" CR_TAB
+ "clr %C0", op, plen, 3);
 
-case 8:
-  {
-int reg0 = REGNO (op[0]);
-

Re: Re: [tree-optimization/111721] VECT: Support SLP for MASK_LEN_GATHER_LOAD with dummy mask

2023-11-02 Thread juzhe.zh...@rivai.ai

Thanks Richi.

The following is the V2 patch:
Testing on x86 and aarch64 is running.

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 43d742e3c92..e7f7f976f11 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -760,7 +760,7 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap,
   || dt == vect_external_def)
  && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
  && (TREE_CODE (type) == BOOLEAN_TYPE
- || !can_duplicate_and_interleave_p (vinfo, stmts.length (),
+ && !can_duplicate_and_interleave_p (vinfo, stmts.length (),
  type)))
{
  if (dump_enabled_p ())
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 6ce4868d3e1..6c47121e158 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -9859,10 +9859,16 @@ vectorizable_load (vec_info *vinfo,
   mask_index = internal_fn_mask_index (ifn);
   if (mask_index >= 0 && slp_node)
mask_index = vect_slp_child_index_for_operand (call, mask_index);
+  slp_tree slp_op = NULL;
   if (mask_index >= 0
  && !vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_index,
- &mask, NULL, &mask_dt, &mask_vectype))
+ &mask, &slp_op, &mask_dt, &mask_vectype))
return false;
+  /* MASK_LEN_GATHER_LOAD dummy mask -1 should always match the
+MASK_VECTYPE.  */
+  if (mask_index >= 0 && slp_node && mask_dt == vect_constant_def
+ && !vect_maybe_update_slp_op_vectype (slp_op, mask_vectype))
+   gcc_unreachable ();
 }




juzhe.zh...@rivai.ai
 
From: Richard Biener
Date: 2023-11-02 19:11
To: Juzhe-Zhong
CC: gcc-patches; richard.sandiford
Subject: Re: [tree-optimization/111721] VECT: Support SLP for 
MASK_LEN_GATHER_LOAD with dummy mask
On Thu, 2 Nov 2023, Juzhe-Zhong wrote:
 
> This patch fixes following FAILs for RVV:
> FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects  scan-tree-dump 
> vect "Loop contains only SLP stmts"
> FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP 
> stmts"
> 
> Bootstrap on X86 and regtest passed.
> 
> Tested on aarch64 passed.
> 
> Ok for trunk ?
> 
> PR tree-optimization/111721
> 
> gcc/ChangeLog:
> 
> * tree-vect-slp.cc (vect_get_and_check_slp_defs): Support SLP for 
> dummy mask -1.
> * tree-vect-stmts.cc (vectorizable_load): Ditto.
> 
> ---
>  gcc/tree-vect-slp.cc   | 14 --
>  gcc/tree-vect-stmts.cc |  8 +++-
>  2 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 43d742e3c92..23ca0318e31 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -756,8 +756,7 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap,
>  {
>tree type = TREE_TYPE (oprnd);
>dt = dts[i];
> -   if ((dt == vect_constant_def
> -|| dt == vect_external_def)
> +   if (dt == vect_external_def
>&& !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
>&& (TREE_CODE (type) == BOOLEAN_TYPE
>|| !can_duplicate_and_interleave_p (vinfo, stmts.length (),
> @@ -769,6 +768,17 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap,
>  "for variable-length SLP %T\n", oprnd);
>return -1;
>  }
> +   if (dt == vect_constant_def
> +   && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
> +   && !can_duplicate_and_interleave_p (vinfo, stmts.length (), type))
> + {
> +   if (dump_enabled_p ())
> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> + "Build SLP failed: invalid type of def "
> + "for variable-length SLP %T\n",
> + oprnd);
> +   return -1;
> + }
 
I don't think that's quite correct.  can_duplicate_and_interleave_p
doesn't get enough info here and IIRC even materializing arbitrary
constants isn't possible with VLA vectors.  The very first thing
the function does is
 
  tree base_vector_type = get_vectype_for_scalar_type (vinfo, elt_type, count);
  if (!base_vector_type || !VECTOR_MODE_P (TYPE_MODE (base_vector_type)))
return false;
 
but for masks that's not going to get us the correct vector type.
While I don't understand why we have that 'BOOLEAN_TYPE' special
case (maybe the intent was to identify 'mask' operands that way?),
we might want to require that we can materialize both all-zero
and all-ones constant 'mask's.  But then 'mask' operands should
be properly identified here.
 
Maybe we can also simply delay the check to the point we know
whether we're facing a uniform constant or not (note for 'first',
we cannot really special-case vect_constant_def as the second
SLP lane might demote that to vect_external_def).  It's always
a balance of whether to reject sth at SLP build time (possibly
allowing operand swapping to do magic) or to delay checks
to stmt analysis

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Richard Biener

On Thu, Nov 2, 2023 at 11:40 AM Jakub Jelinek  wrote:
>
> On Thu, Nov 02, 2023 at 11:18:09AM +0100, Richard Biener wrote:
> > > Or, if we want to pay further price, .ACCESS_WITH_SIZE could take as one 
> > > of
> > > the arguments not the size value, but its address.  Then at __bdos time
> > > we would dereference that pointer to get the size.
> > > So,
> > > struct S { int a; char b __attribute__((counted_by (a))) []; };
> > > struct S s;
> > > s.a = 5;
> > > char *p = &s.b[2];
> > > int i1 = __builtin_dynamic_object_size (p, 0);
> > > s.a = 3;
> > > int i2 = __builtin_dynamic_object_size (p, 0);
> > > would then yield 3 and 1 rather than 3 and 3.
> >
> > I fail to see how we can get the __builtin_dynamic_object_size call
> > data dependent on s.a, thus avoid re-ordering or even DSE of the
> > store.
>
> If &s.b[2] is lowered as
> sz_1 = s.a;
> tmp_2 = .ACCESS_WITH_SIZE (&s.b[0], sz_1);
> p_3 = &tmp_2[2];
> then sure, there is no way, you get the size from that point.
> tree-object-size.cc tracking then determines that in a particular
> case the pointer is size associated with sz_1 and use that value
> as the size (with the usual adjustments for pointer arithmetics and the
> like).
>
> What I meant is to emit
> tmp_4 = .ACCESS_WITH_SIZE (&s.b[0], &s.a, (typeof (&s.a)) 0);
> p_5 = &tmp_4[2];
> i.e. don't associate the pointer with a value of the size, but with
> an address where to find the size (plus how large it is), basically escape
> pointer to the size at that point.  And __builtin_dynamic_object_size is pure,
> so supposedly it can depend on what the escaped pointer points to.

Well, yeah - that would work but depend on .ACCESS_WITH_SIZE being an
escape point (quite bad IMHO) and __builtin_dynamic_object_size being
non-const (that's probably not too bad).

> We'd see that a particular pointer is size associated with &s.a address
> and would use that address cast to the type of the third argument (to
> preserve the exact pointer type on INTEGER_CST, though not sure, wouldn't
> VN CSE it anyway if one has say
> union U { struct S { int a; char b __attribute__((counted_by (a))) []; } s;
>   struct T { char c, d, e, f; char g __attribute__((counted_by (c))) 
> []; } t; };
> and
> .ACCESS_WITH_SIZE (&v.s.b[0], &v.s.a, (int *) 0);
> ...
> .ACCESS_WITH_SIZE (&v.t.g[0], &v.t.c, (int *) 0);
> ?

We'd probably CSE that - the usual issue of address-with-same-value.

> It would mean though that counted_by wouldn't be allowed to be a
> bit-field...

Yup.  We could also pass a pointer to the container though, that's good enough
for the escape, and pass the size by value in addition to that.

> Jakub
>

Re: [PATCH] doc: explicitly say 'lifetime' for DCE

2023-11-02 Thread Sam James



Richard Biener  writes:

> On Thu, Nov 2, 2023 at 11:25 AM Sam James  wrote:
>>
>>
>> Richard Biener  writes:
>>
>> > On Thu, Nov 2, 2023 at 10:03 AM Sam James  wrote:
>> >>
>> >> Say 'memory lifetime' rather than 'memory life' as lifetime is the more
>> >> standard term nowadays (indeed we have e.g. -fno-lifetime-dse).
>> >>
>> >> It's also easier to grep for if someone is looking for the documentation 
>> >> on
>> >> where we do that.
>> >
>> > OK
>>
>> Could you push for me please? I have a sw account but no gcc access
>> (yet).
>
> Done after fixing ChangeLog format.
>
> Richard.

Thanks!

>
>> cheers
>>
>> >
>> >> gcc/ChangeLog:
>> >> * doc/passes.texi (Dead code elimination): Explicitly say 'lifetime'
>> >> as this has become the standard term for what we're doing here.
>> >>
>> >> Signed-off-by: Sam James 
>> >> ---
>> >>  gcc/doc/passes.texi | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
>> >> index eb2bb6062834..470ac498a132 100644
>> >> --- a/gcc/doc/passes.texi
>> >> +++ b/gcc/doc/passes.texi
>> >> @@ -543,7 +543,7 @@ and is defined by 
>> >> @code{pass_early_warn_uninitialized} and
>> >>  @item Dead code elimination
>> >>
>> >>  This pass scans the function for statements without side effects whose
>> >> -result is unused.  It does not do memory life analysis, so any value
>> >> +result is unused.  It does not do memory lifetime analysis, so any value
>> >>  that is stored in memory is considered used.  The pass is run multiple
>> >>  times throughout the optimization process.  It is located in
>> >>  @file{tree-ssa-dce.cc} and is described by @code{pass_dce}.
>> >> --
>> >> 2.42.0
>> >>
>>

Re: [PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

2023-11-02 Thread juzhe.zhong

lgtm

 Replied Message 
From: pan2...@intel.com
Date: 11/02/2023 19:48
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai, pan2...@intel.com, yanzhang.w...@intel.com, kito.ch...@gmail.com
Subject: [PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Jakub Jelinek

On Thu, Nov 02, 2023 at 12:52:50PM +0100, Richard Biener wrote:
> > What I meant is to emit
> > tmp_4 = .ACCESS_WITH_SIZE (&s.b[0], &s.a, (typeof (&s.a)) 0);
> > p_5 = &tmp_4[2];
> > i.e. don't associate the pointer with a value of the size, but with
> > an address where to find the size (plus how large it is), basically escape
> > pointer to the size at that point.  And __builtin_dynamic_object_size is 
> > pure,
> > so supposedly it can depend on what the escaped pointer points to.
> 
> Well, yeah - that would work but depend on .ACCESS_WITH_SIZE being an
> escape point (quite bad IMHO)

That is why I've said we need to decide what cost we want to suffer because
of that.

> and __builtin_dynamic_object_size being
> non-const (that's probably not too bad).

It is already pure,leaf,nothrow (unlike __builtin_object_size which is obviously
const,leaf,nothrow).  Because under the hood, it can read memory when
expanded.

> > We'd see that a particular pointer is size associated with &s.a address
> > and would use that address cast to the type of the third argument (to
> > preserve the exact pointer type on INTEGER_CST, though not sure, wouldn't
> > VN CSE it anyway if one has say
> > union U { struct S { int a; char b __attribute__((counted_by (a))) []; } s;
> >   struct T { char c, d, e, f; char g __attribute__((counted_by 
> > (c))) []; } t; };
> > and
> > .ACCESS_WITH_SIZE (&v.s.b[0], &v.s.a, (int *) 0);
> > ...
> > .ACCESS_WITH_SIZE (&v.t.g[0], &v.t.c, (int *) 0);
> > ?
> 
> We'd probably CSE that - the usual issue of address-with-same-value.
> 
> > It would mean though that counted_by wouldn't be allowed to be a
> > bit-field...
> 
> Yup.  We could also pass a pointer to the container though, that's good enough
> for the escape, and pass the size by value in addition to that.

I was wondering about stuff like _BitInt.  But sure, counted_by is just an
extension, we can just refuse counting by _BitInt in addition to counting by
floating point, pointers, aggregates, bit-fields, or we could somehow encode
all the needed type's properties numerically into an integral constant.
Similarly for alias set (unless it uses 0 for reads).

Jakub
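
To make the "3 and 1 rather than 3 and 3" example from this thread concrete,
here is a plain C model of the two designs.  It is only a sketch under stated
assumptions: it does not use counted_by, .ACCESS_WITH_SIZE or
__builtin_dynamic_object_size, the flexible array is replaced by a fixed-size
one, and all type and function names below are hypothetical.

```c
#include <stddef.h>

/* Fixed-size stand-in for the flexible array member of struct S.  */
struct S { int a; char b[8]; };

/* Hypothetical models of the two designs discussed above.  */
struct by_value   { const char *base; int size; };        /* size copied at access time */
struct by_address { const char *base; const int *size; }; /* size re-read at query time */

static struct by_value
access_by_value (const struct S *s)
{
  struct by_value h = { s->b, s->a };
  return h;
}

static struct by_address
access_by_address (const struct S *s)
{
  struct by_address h = { s->b, &s->a };
  return h;
}

/* Bytes remaining from P to the end of the counted array.  */
static ptrdiff_t
remaining_by_value (struct by_value h, const char *p)
{
  return h.size - (p - h.base);
}

static ptrdiff_t
remaining_by_address (struct by_address h, const char *p)
{
  return *h.size - (p - h.base);
}
```

With s.a = 5 and p = &s.b[2], both queries give 3.  After s.a = 3, the
by-value handle still gives 3 while the by-address handle gives 1, which is
the difference the address-passing variant is meant to capture.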

RE: [PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

2023-11-02 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zhong 
Sent: Thursday, November 2, 2023 8:04 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; Li, Pan2 ; Wang, Yanzhang 
; kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec 
iterator

lgtm

 Replied Message 
From: pan2...@intel.com
Date: 11/02/2023 19:48
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai, pan2...@intel.com, yanzhang.w...@intel.com, kito.ch...@gmail.com
Subject: [PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

RE: [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720]

2023-11-02 Thread Li, Pan2

Thanks Richard B for comments.

> when there are integer modes for the vector modes you now go a different path,
> a little less "regressing" would be to write it as
> 
>   if (int_mode_for_mode (src_mode).exists (&src_int_mode)
>&& int_mode_for_mode (mode).exists (&int_mode))
>  {
> ... old code ...
>  }
>   else if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (src_mode))
>  {
> ... new code ...
>}
>   else
>  return NULL_RTX;

That makes sense to me, will update it in V2.

> so you're really expecting to generate a subreg here?  Given "vector
> register layout"
> isn't something that's very well defined I fear it's going to be
> difficult to guarantee
> the desired semantics of this function.  IIRC powerpc64le has big-endian lane
> order for example.

This could indeed be a problem; I need to think more about how this behaves
on different backends.

Pan


-Original Message-
From: Richard Biener  
Sent: Thursday, November 2, 2023 4:20 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com; jeffreya...@gmail.com; 
richard.sandif...@arm.com
Subject: Re: [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits 
[PR111720]

On Thu, Nov 2, 2023 at 4:15 AM  wrote:
>
> From: Pan Li 
>
> The extract_low_bits only try the scalar mode if the bitsize of
> the mode and src_mode is not equal. When vector mode is given
> from get_stored_val in DSE, it will always fail and return NULL_RTX.
>
> This patch would like to allow the vector mode in the extract_low_bits
> if and only if the size of mode is less than or equals to the size of
> the src_mode.
>
> Given below example code with --param=riscv-autovec-preference=fixed-vlmax.
>
> vuint8m1_t test () {
>   uint8_t arr[32] = {
> 1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
> 1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>   };
>
>   return __riscv_vle8_v_u8m1(arr, 32);
> }
>
> Before this patch:
>
> test:
>   lui     a5,%hi(.LANCHOR0)
>   addi    sp,sp,-32
>   addi    a5,a5,%lo(.LANCHOR0)
>   li      a3,32
>   vl2re64.v   v2,0(a5)
>   vsetvli zero,a3,e8,m1,ta,ma
>   vs2r.v  v2,0(sp) <== Unnecessary store to stack
>   vle8.v  v1,0(sp) <== Ditto
>   vs1r.v  v1,0(a0)
>   addi    sp,sp,32
>   jr      ra
>
> After this patch:
>
> test:
>   lui     a5,%hi(.LANCHOR0)
>   addi    a5,a5,%lo(.LANCHOR0)
>   li      a4,32
>   addi    sp,sp,-32
>   vsetvli zero,a4,e8,m1,ta,ma
>   vle8.v  v1,0(a5)
>   vs1r.v  v1,0(a0)
>   addi    sp,sp,32
>   jr      ra
>
> Below tests are passed within this patch:
>
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> * The risc-v regression test.
>
> PR target/111720
>
> gcc/ChangeLog:
>
> * expmed.cc (extract_low_bits): Allow vector mode if the
> mode size is less than or equal to src_mode.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-8.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-9.c: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/expmed.cc | 44 ---
>  .../gcc.target/riscv/rvv/base/pr111720-0.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-1.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-10.c   | 18 
>  .../gcc.target/riscv/rvv/base/pr111720-2.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-3.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-4.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-5.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-6.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-7.c| 21 +
>  .../gcc.target/riscv/rvv/base/pr111720-8.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-9.c| 15 +++
>  12 files changed, 227 insertions(+), 15 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-10.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-5.c
>  create mode

[pushed] analyzer: fix clang warnings [PR112317]

2023-11-02 Thread David Malcolm

No functional change intended.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-5080-gc71028c979d55f.

gcc/analyzer/ChangeLog:
PR analyzer/112317
* access-diagram.cc (class x_aligned_x_ruler_widget): Eliminate
unused field "m_col_widths".
(access_diagram_impl::add_valid_vs_invalid_ruler): Update for
above change.
* region-model.cc
(check_one_function_attr_null_terminated_string_arg): Remove
unused variables "cd_unchecked", "strlen_sval", and
"limited_sval".
* region-model.h (region_model_context_decorator::warn): Add
missing "override".
---
 gcc/analyzer/access-diagram.cc |  9 +++--
 gcc/analyzer/region-model.cc   | 21 +
 gcc/analyzer/region-model.h|  2 +-
 3 files changed, 9 insertions(+), 23 deletions(-)

diff --git a/gcc/analyzer/access-diagram.cc b/gcc/analyzer/access-diagram.cc
index c7d190e3188..fb8c0282e75 100644
--- a/gcc/analyzer/access-diagram.cc
+++ b/gcc/analyzer/access-diagram.cc
@@ -919,11 +919,9 @@ class x_aligned_x_ruler_widget : public leaf_widget
 {
 public:
   x_aligned_x_ruler_widget (const access_diagram_impl &dia_impl,
-   const theme &theme,
-   table_dimension_sizes &col_widths)
+   const theme &theme)
   : m_dia_impl (dia_impl),
-m_theme (theme),
-m_col_widths (col_widths)
+m_theme (theme)
   {
   }
 
@@ -973,7 +971,6 @@ private:
 
   const access_diagram_impl &m_dia_impl;
   const theme &m_theme;
-  table_dimension_sizes &m_col_widths;
   std::vector m_labels;
 };
 
@@ -2361,7 +2358,7 @@ private:
 LOG_SCOPE (m_logger);
 
 x_aligned_x_ruler_widget *w
-  = new x_aligned_x_ruler_widget (*this, m_theme, *m_col_widths);
+  = new x_aligned_x_ruler_widget (*this, m_theme);
 
 access_range invalid_before_bits;
 if (m_op.maybe_get_invalid_before_bits (&invalid_before_bits))
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 9479bcf380c..dc834406520 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -1877,23 +1877,13 @@ check_one_function_attr_null_terminated_string_arg (const gcall *call,
 || access->mode == access_read_write)
&& access->sizarg != UINT_MAX)
   {
-   /* First, check for a null-terminated string *without*
-  emitting warnings (via a null context), to get an
-  svalue for the strlen of the buffer (possibly
-  nullptr if there would be an issue).  */
-   call_details cd_unchecked (call, this, nullptr);
-   const svalue *strlen_sval
- = check_for_null_terminated_string_arg (cd_unchecked,
- arg_idx);
-
-   /* Get svalue for the size limit argument.  */
call_details cd_checked (call, this, ctxt);
const svalue *limit_sval
  = cd_checked.get_arg_svalue (access->sizarg);
const svalue *ptr_sval
  = cd_checked.get_arg_svalue (arg_idx);
/* Try reading all of the bytes expressed by the size param,
-  but without checking (via a null context).  */
+  but without emitting warnings (via a null context).  */
const svalue *limited_sval
  = read_bytes (deref_rvalue (ptr_sval, NULL_TREE, nullptr),
NULL_TREE,
@@ -1912,11 +1902,10 @@ check_one_function_attr_null_terminated_string_arg (const gcall *call,
  {
/* Reading up to the truncation limit seems OK; repeat
   the read, but with checking enabled.  */
-   const svalue *limited_sval
- = read_bytes (deref_rvalue (ptr_sval, NULL_TREE, ctxt),
-   NULL_TREE,
-   limit_sval,
-   ctxt);
+   read_bytes (deref_rvalue (ptr_sval, NULL_TREE, ctxt),
+   NULL_TREE,
+   limit_sval,
+   ctxt);
  }
return;
   }
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 8bfb06880ff..4d8480df141 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -890,7 +890,7 @@ class region_model_context_decorator : public region_model_context
 {
  public:
   bool warn (std::unique_ptr d,
-const stmt_finder *custom_finder)
+const stmt_finder *custom_finder) override
   {
 if (m_inner)
   return m_inner->warn (std::move (d), custom_finder);
-- 
2.26.3

[PATCH] Format gotools.sum closer to what DejaGnu does

2023-11-02 Thread Maxim Kuvyrkov

... to restore compatibility with validate_failures.py.
The testsuite script validate_failures.py expects
"Running  ..." to extract  values,
and gotools.sum provided "Running ".

Note that libgo.sum, which also uses Makefile logic to generate
DejaGnu-like output, already has "..." suffix.
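
As an illustration of why the " ..." suffix matters to a consumer of the .sum
file, here is a small C sketch of a strict matcher for such header lines.  It
is not the logic of validate_failures.py (which is a Python script and is not
reproduced here); the function name parse_running_line and the exact matching
rule are assumptions made for this example.

```c
#include <string.h>

/* Hypothetical helper: if LINE has the form "Running NAME ...", copy NAME
   into OUT (of size OUTSZ) and return 1; otherwise return 0.  */
static int
parse_running_line (const char *line, char *out, size_t outsz)
{
  static const char prefix[] = "Running ";
  static const char suffix[] = " ...";
  size_t plen = sizeof prefix - 1;
  size_t slen = sizeof suffix - 1;
  size_t len = strlen (line);

  /* Require the prefix, the suffix, and a non-empty name in between.  */
  if (len <= plen + slen
      || strncmp (line, prefix, plen) != 0
      || strcmp (line + len - slen, suffix) != 0)
    return 0;

  size_t nlen = len - plen - slen;
  if (nlen >= outsz)
    return 0;

  memcpy (out, line + plen, nlen);
  out[nlen] = '\0';
  return 1;
}
```

A line in the old gotools.sum format, without the trailing " ...", is rejected
by such a matcher, while the new format yields the test name.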

gotools/ChangeLog:

* Makefile.am: Update "Running  ..." output
* Makefile.in: Regenerate.
---
 gotools/Makefile.am | 4 ++--
 gotools/Makefile.in | 5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/gotools/Makefile.am b/gotools/Makefile.am
index 7b5302990f8..d2376b9c25b 100644
--- a/gotools/Makefile.am
+++ b/gotools/Makefile.am
@@ -332,8 +332,8 @@ check: check-head check-go-tool check-runtime check-cgo-test check-carchive-test
@cp gotools.sum gotools.log
@for file in cmd_go-testlog runtime-testlog cgo-testlog carchive-testlog cmd_vet-testlog embed-testlog; do \
  testname=`echo $${file} | sed -e 's/-testlog//' -e 's|_|/|'`; \
- echo "Running $${testname}" >> gotools.sum; \
- echo "Running $${testname}" >> gotools.log; \
+ echo "Running $${testname} ..." >> gotools.sum; \
+ echo "Running $${testname} ..." >> gotools.log; \
  sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' < $${file} >> gotools.log; \
  grep '^--- ' $${file} | sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' -e 's/SKIP/UNTESTED/' | sort -k 2 >> gotools.sum; \
done
diff --git a/gotools/Makefile.in b/gotools/Makefile.in
index 2783b91ef4b..9cc238e748d 100644
--- a/gotools/Makefile.in
+++ b/gotools/Makefile.in
@@ -317,6 +317,7 @@ pdfdir = @pdfdir@
 prefix = @prefix@
 program_transform_name = @program_transform_name@
 psdir = @psdir@
+runstatedir = @runstatedir@
 sbindir = @sbindir@
 sharedstatedir = @sharedstatedir@
 srcdir = @srcdir@
@@ -1003,8 +1004,8 @@ mostlyclean-local:
 @NATIVE_TRUE@  @cp gotools.sum gotools.log
 @NATIVE_TRUE@  @for file in cmd_go-testlog runtime-testlog cgo-testlog carchive-testlog cmd_vet-testlog embed-testlog; do \
 @NATIVE_TRUE@testname=`echo $${file} | sed -e 's/-testlog//' -e 's|_|/|'`; \
-@NATIVE_TRUE@echo "Running $${testname}" >> gotools.sum; \
-@NATIVE_TRUE@echo "Running $${testname}" >> gotools.log; \
+@NATIVE_TRUE@echo "Running $${testname} ..." >> gotools.sum; \
+@NATIVE_TRUE@echo "Running $${testname} ..." >> gotools.log; \
 @NATIVE_TRUE@sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' < $${file} >> gotools.log; \
 @NATIVE_TRUE@grep '^--- ' $${file} | sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' -e 's/SKIP/UNTESTED/' | sort -k 2 >> gotools.sum; \
 @NATIVE_TRUE@  done
-- 
2.34.1

Re: [PATCH] A new copy propagation and PHI elimination pass

2023-11-02 Thread Filip Kastl

Hi,

thanks for the guidance.  I'm going to post a new version of the patch with the
testcase modified so that it searches for 'return 9;' instead of '= 9;'.

Filip Kastl


On Fri 2023-10-27 13:55:37, Jeff Law wrote:
> 
> 
> On 10/20/23 07:52, Filip Kastl wrote:
> > On Fri 2023-10-20 15:50:25, Filip Kastl wrote:
> > > Bootstrapped and tested* on x86_64-pc-linux-gnu.
> > > 
> > > * One testcase (pr79691.c) did regress. However that is because the test 
> > > is
> > > dependent on a certain variable not being copy propagated. I will go into 
> > > more
> > > detail about this in a reply to this mail.
> > 
> > This testcase checks for the string '= 9' being present in the 
> > tree-optimized
> > gimple dump ({ dg-final { scan-tree-dump " = 9;" "optimized" } }). This is 
> > how
> > the relevant place in the dump looks like without my patch:
> > 
> > int f4 (int i)
> > {
> >int _6;
> > 
> > [local count: 1073741824]:
> >_6 = 9;
> >return _6;
> > 
> > }
> > 
> > Note that '= 9' is indeed present but there is an opportunity for copy
> > propagation. With my patch, the copy propagation happens:
> > 
> > int f4 (int i)
> > {
> >int _6;
> > 
> > [local count: 1073741824]:
> >return 9;
> > 
> > }
> > 
> > Which means no '= 9' is present and therefore the test fails.
> > 
> > What should I do? I don't suppose that changing the testcase to search for 
> > just
> > '9' would be wise since the dump may contain other '9's. I could change it 
> > to
> > search for 'return 9'. That would make it dependent on some copy propagation
> > being run late enough. However it is currently dependent on *no* copy
> > propagation being run late in the compilation. Also, if the test would 
> > search
> > for 'return 9', it would search for the most optimized version of the 
> > function
> > f4.
> > 
> > Or maybe searching for '9;' would work.
> So in general you have to go back and try to assess the original intent of
> the test.  Once you have the original intent, the path forward is often
> clear.
> 
> In this specific case the source is:
> +/* Verify -fprintf-return-value results used for constant propagation.  */
> +int f4 (int i)
> +{
> +  int n1 = __builtin_snprintf (0, 0, "%i", 1234);
> +  int n2 = __builtin_snprintf (0, 0, "%i", 12345);
> +  return n1 + n2;
> +}
> 
> And the intent of the test is to verify that we get constants from the
> snprintf calls and that they in turn simplify to a constant.
> 
> That is certainly still the case after your patch, just the form of the
> output is different (the constant is propagated further).  So I think
> testing for "return 9" would be the right approach here.
> 
> jeff

[PATCH v2] A new copy propagation and PHI elimination pass

2023-11-02 Thread Filip Kastl

> Hi,
> 
> this is a patch that I submitted two months ago as an RFC. I added some polish
> since.
> 
> It is a new lightweight pass that removes redundant PHI functions and as a
> bonus does basic copy propagation. With Jan Hubička we measured that it is 
> able
> to remove usually more than 5% of all PHI functions when run among early 
> passes
> (sometimes even 13% or more). Those are mostly PHI functions that would be
> later optimized away but with this pass it is possible to remove them early
> enough so that they don't get streamed when running LTO (and also potentially
> inlined at multiple places). It is also able to remove some redundant PHIs
> that otherwise would still be present during RTL expansion.
> 
> Jakub Jelínek was concerned about debug info coverage so I compiled cc1plus
> with and without this patch. These are the sizes of .debug_info and
> .debug_loclists
> 
> .debug_info without patch      181694311
> .debug_info with patch         181692320
> +0.0011% change
> 
> .debug_loclists without patch  47934753
> .debug_loclists with patch     47934966
> -0.0004% change
> 
> I wanted to use dwlocstat to compare debug coverages but didn't manage to get
> the program working on my machine sadly. Hope this suffices. Seems to me that
> my patch doesn't have a significant impact on debug info.
> 
> Bootstrapped and tested* on x86_64-pc-linux-gnu.
> 
> * One testcase (pr79691.c) did regress. However that is because the test is
> dependent on a certain variable not being copy propagated. I will go into more
> detail about this in a reply to this mail.
> 
> Ok to commit?

This is a second version of the patch.  In this version, I modified the
pr79691.c testcase so that it works as intended with other changes from the
patch.

The pr79691.c testcase checks that we get constants from snprintf calls and
that they simplify into a single constant.  The testcase doesn't account for
the fact that this constant may be further copy propagated which is exactly
what happens with this patch applied.
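
For reference, the constant involved can be reproduced outside the testsuite.
The sketch below uses the standard snprintf instead of __builtin_snprintf and
a made-up function name (f4_lengths); with a null buffer and size 0, snprintf
returns the number of characters the formatted output would need.

```c
#include <stdio.h>

/* Same computation as f4 in pr79691.c, written with the library
   function rather than the builtin.  */
static int
f4_lengths (void)
{
  int n1 = snprintf (NULL, 0, "%i", 1234);   /* "1234" needs 4 characters */
  int n2 = snprintf (NULL, 0, "%i", 12345);  /* "12345" needs 5 characters */
  return n1 + n2;
}
```

The sum is 9, which is the constant the testcase now looks for in the form
'return 9;'.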

Bootstrapped and tested on x86_64-pc-linux-gnu.

Ok to commit?

Filip Kastl

-- >8 --

This patch adds the strongly-connected copy propagation (SCCOPY) pass.
It is a lightweight GIMPLE copy propagation pass that also removes some
redundant PHI statements. It handles degenerate PHIs, e.g.:

_5 = PHI <_1>;
_6 = PHI <_6, _6, _1, _1>;
_7 = PHI <16, _7>;
// Replaces occurrences of _5 and _6 by _1 and _7 by 16

It also handles more complicated situations, e.g.:

_8 = PHI <_9, _10>;
_9 = PHI <_8, _10>;
_10 = PHI <_8, _9, _1>;
// Replaces occurrences of _8, _9 and _10 by _1
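
The two examples above can be reproduced with a small model. The following is only a sketch in Python, not the code in tree-ssa-sccopy.cc: the dictionary representation of PHIs and the function names are made up for illustration, and it handles only the simple case where a whole strongly connected component of PHIs has exactly one value coming from outside.

```python
def find_sccs(phis):
    """Tarjan's algorithm over PHI -> PHI-argument edges.

    `phis` maps a PHI result name to its list of arguments.  SCCs are
    returned dependencies-first (an SCC comes after the SCCs it uses).
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in phis[v]:
            if w not in phis:
                continue  # constants and non-PHI names are not graph nodes
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in phis:
        if v not in index:
            visit(v)
    return sccs


def redundant_phi_replacements(phis):
    """Map each redundant PHI to the single value it can be replaced by."""
    replace = {}
    for scc in find_sccs(phis):
        outside = set()
        for phi in scc:
            for arg in phis[phi]:
                arg = replace.get(arg, arg)  # look through earlier replacements
                if arg not in scc:
                    outside.add(arg)
        # Exactly one value flows in from outside: every PHI in the SCC
        # is just a copy of that value.
        if len(outside) == 1:
            value = outside.pop()
            for phi in scc:
                replace[phi] = value
    return replace
```

Calling `redundant_phi_replacements` on the first example maps _5 and _6 to _1 and _7 to 16; on the second it maps _8, _9 and _10 to _1. SCCs with several outside values are simply left alone in this sketch.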

gcc/ChangeLog:

* Makefile.in: Added sccopy pass.
* passes.def: Added sccopy pass before LTO streaming and before
  RTL expansion.
* tree-pass.h (make_pass_sccopy): Added sccopy pass.
* tree-ssa-sccopy.cc: New file.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr79691.c: Updated scan-tree-dump to account
  for additional copy propagation this patch adds.
* gcc.dg/sccopy-1.c: New test.

Signed-off-by: Filip Kastl 
---
 gcc/Makefile.in |   1 +
 gcc/passes.def  |   3 +
 gcc/testsuite/gcc.dg/sccopy-1.c |  78 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr79691.c |   2 +-
 gcc/tree-pass.h |   1 +
 gcc/tree-ssa-sccopy.cc  | 867 
 6 files changed, 951 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/sccopy-1.c
 create mode 100644 gcc/tree-ssa-sccopy.cc

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a25a1e32fbc..2bd5a015676 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1736,6 +1736,7 @@ OBJS = \
tree-ssa-pre.o \
tree-ssa-propagate.o \
tree-ssa-reassoc.o \
+   tree-ssa-sccopy.o \
tree-ssa-sccvn.o \
tree-ssa-scopedtables.o \
tree-ssa-sink.o \
diff --git a/gcc/passes.def b/gcc/passes.def
index 1e1950bdb39..fa6c5a2c9fa 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -100,6 +100,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_if_to_switch);
  NEXT_PASS (pass_convert_switch);
  NEXT_PASS (pass_cleanup_eh);
+ NEXT_PASS (pass_sccopy);
  NEXT_PASS (pass_profile);
  NEXT_PASS (pass_local_pure_const);
  NEXT_PASS (pass_modref);
@@ -368,6 +369,7 @@ along with GCC; see the file COPYING3.  If not see
 However, this also causes us to misdiagnose cases that should be
 real warnings (e.g., testsuite/gcc.dg/pr18501.c).  */
   NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
+  NEXT_PASS (pass_sccopy);
   NEXT_PASS (pass_tail_calls);
   /* Split critical edges before late uninit warning to reduce the
  number of false positives from it.  */
@@ -409,6 +411,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_sancov);
   NEXT_PASS (pass_asan);
   NEXT_PASS (pass_tsan);

Re: Re: [tree-optimization/111721] VECT: Support SLP for MASK_LEN_GATHER_LOAD with dummy mask

2023-11-02 Thread Richard Biener

On Thu, 2 Nov 2023, juzhe.zh...@rivai.ai wrote:

> Thanks Richi.
> 
> The following is the V2 patch:
> Testing on x86 and aarch64 is running.
> 
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 43d742e3c92..e7f7f976f11 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -760,7 +760,7 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap,
>|| dt == vect_external_def)
>   && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
>   && (TREE_CODE (type) == BOOLEAN_TYPE
> - || !can_duplicate_and_interleave_p (vinfo, stmts.length (),
> + && !can_duplicate_and_interleave_p (vinfo, stmts.length (),
>   type)))

That's not what I wrote.  I wrote to let == BOOLEAN_TYPE pass without
check here, thus

 - && (TREE_CODE (type) == BOOLEAN_TYPE
 + && TREE_CODE (type) != BOOLEAN_TYPE
   && !can_duplicate...
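
To make the difference between the three variants of that last clause explicit, here is a tiny truth-table sketch in Python. It models only the tail of the condition; the function names are invented, `is_bool` stands for TREE_CODE (type) == BOOLEAN_TYPE, `can_dup` stands for the result of can_duplicate_and_interleave_p, and a True result means the SLP build is rejected.

```python
from itertools import product


def original(is_bool, can_dup):
    # && (TREE_CODE (type) == BOOLEAN_TYPE || !can_duplicate...)
    return is_bool or not can_dup


def v2_patch(is_bool, can_dup):
    # && (TREE_CODE (type) == BOOLEAN_TYPE && !can_duplicate...)
    return is_bool and not can_dup


def suggested(is_bool, can_dup):
    # && TREE_CODE (type) != BOOLEAN_TYPE && !can_duplicate...
    return (not is_bool) and not can_dup


def truth_table():
    """One row per (is_bool, can_dup) combination with all three results."""
    return [
        (b, c, original(b, c), v2_patch(b, c), suggested(b, c))
        for b, c in product([False, True], repeat=2)
    ]
```

The interesting rows are the boolean ones: the suggested form never rejects a BOOLEAN_TYPE operand, while the V2 form still rejects it when duplication is not possible and stops rejecting non-boolean operands altogether.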

> {
>   if (dump_enabled_p ())
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 6ce4868d3e1..6c47121e158 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -9859,10 +9859,16 @@ vectorizable_load (vec_info *vinfo,
>mask_index = internal_fn_mask_index (ifn);
>if (mask_index >= 0 && slp_node)
> mask_index = vect_slp_child_index_for_operand (call, mask_index);
> +  slp_tree slp_op = NULL;
>if (mask_index >= 0
>   && !vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_index,
> - &mask, NULL, &mask_dt, &mask_vectype))
> + &mask, &slp_op, &mask_dt, &mask_vectype))
> return false;
> +  /* MASK_LEN_GATHER_LOAD dummy mask -1 should always match the
> +MASK_VECTYPE.  */
> +  if (mask_index >= 0 && slp_node && mask_dt == vect_constant_def
> + && !vect_maybe_update_slp_op_vectype (slp_op, mask_vectype))
> +   gcc_unreachable ();

You shouldn't do this here.  There's code in if (costing_p) that
would need to be updated if you (correctly) want to track slp_op here.

>  }
> 
> 
> 
> 
> juzhe.zh...@rivai.ai
>  
> From: Richard Biener
> Date: 2023-11-02 19:11
> To: Juzhe-Zhong
> CC: gcc-patches; richard.sandiford
> Subject: Re: [tree-optimization/111721] VECT: Support SLP for MASK_LEN_GATHER_LOAD with dummy mask
> On Thu, 2 Nov 2023, Juzhe-Zhong wrote:
>  
> > This patch fixes following FAILs for RVV:
> > FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects  scan-tree-dump vect "Loop contains only SLP stmts"
> > FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP stmts"
> > 
> > Bootstrap on X86 and regtest passed.
> > 
> > Tested on aarch64 passed.
> > 
> > Ok for trunk ?
> > 
> > PR tree-optimization/111721
> > 
> > gcc/ChangeLog:
> > 
> > * tree-vect-slp.cc (vect_get_and_check_slp_defs): Support SLP for 
> > dummy mask -1.
> > * tree-vect-stmts.cc (vectorizable_load): Ditto.
> > 
> > ---
> >  gcc/tree-vect-slp.cc   | 14 --
> >  gcc/tree-vect-stmts.cc |  8 +++-
> >  2 files changed, 19 insertions(+), 3 deletions(-)
> > 
> > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > index 43d742e3c92..23ca0318e31 100644
> > --- a/gcc/tree-vect-slp.cc
> > +++ b/gcc/tree-vect-slp.cc
> > @@ -756,8 +756,7 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap,
> >  {
> >tree type = TREE_TYPE (oprnd);
> >dt = dts[i];
> > -   if ((dt == vect_constant_def
> > -|| dt == vect_external_def)
> > +   if (dt == vect_external_def
> >&& !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
> >&& (TREE_CODE (type) == BOOLEAN_TYPE
> >|| !can_duplicate_and_interleave_p (vinfo, stmts.length (),
> > @@ -769,6 +768,17 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap,
> >  "for variable-length SLP %T\n", oprnd);
> >return -1;
> >  }
> > +   if (dt == vect_constant_def
> > +   && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
> > +   && !can_duplicate_and_interleave_p (vinfo, stmts.length (), type))
> > + {
> > +   if (dump_enabled_p ())
> > + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > + "Build SLP failed: invalid type of def "
> > + "for variable-length SLP %T\n",
> > + oprnd);
> > +   return -1;
> > + }
>  
> I don't think that's quite correct.  can_duplicate_and_interleave_p
> doesn't get enough info here and IIRC even materializing arbitrary
> constants isn't possible with VLA vectors.  The very first thing
> the function does is
>  
>   tree base_vector_type = get_vectype_for_scalar_type (vinfo, elt_type, count);
>   if (!base_vector_type || !VECTOR_MODE_P (TYPE_MODE (base_vector_type)))
> return false;
>  
> but for masks that's not going to get us the correct vector type.
> While I don't under

[PATCH/RFC 0/4] C/C++/diagnostics: various UX improvements

2023-11-02 Thread David Malcolm

The following patch kit implements the:
  #pragma GCC show_layout (struct foo)
idea I mentioned in my Cauldron talk (in patch 2),  and the other
patches implement various related user experience changes I came
across when implementing it.

Patch 1 reworks how c-pragma.cc parses pragmas, and experiments with
adding links to documentation to the diagnostics messages (on a
suitably capable terminal).

Patch 2 implements the new "show_layout" pragma

Patch 3 adds a new mechanism to the diagnostics subsystem for
automatically adding documentation links to messages, with enough
data to handle the pragmas from patch 1.

Patch 4 attempts to automatically populate the URL data for our docs by
parsing the results of "make html".

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.

I'd like to go ahead with patch 1 and patch 3; patch 2 and patch 4 may
need more work, but posting here for feedback.

Thoughts?

David Malcolm (4):
  c/c++: rework pragma parsing
  c: add #pragma GCC show_layout
  diagnostics: add automatic URL-ification within messages
  RFC: add contrib/regenerate-index-urls.py

 contrib/regenerate-index-urls.py  |  245 ++
 gcc/Makefile.in   |3 +-
 gcc/analyzer/record-layout.cc |  235 ++
 gcc/analyzer/record-layout.h  |4 +
 gcc/c-family/c-pragma.cc  |  641 -
 gcc/c-family/c-pragma.h   |5 +-
 gcc/diagnostic.cc |8 +-
 gcc/diagnostic.h  |4 +
 gcc/doc/extend.texi   |   49 +
 gcc/gcc-urlifier.cc   |  159 ++
 gcc/gcc-urlifier.def  | 2532 +
 gcc/gcc-urlifier.h|   26 +
 gcc/gcc.cc|2 +
 gcc/pretty-print-urlifier.h   |   33 +
 gcc/pretty-print.cc   |  242 +-
 gcc/pretty-print.h|5 +-
 gcc/selftest-run-tests.cc |1 +
 gcc/selftest.h|1 +
 gcc/stor-layout.h |3 +
 .../c-c++-common/pragma-message-parsing.c |   21 +
 .../c-c++-common/pragma-optimize-parsing.c|   16 +
 .../c-c++-common/pragma-pack-parsing-1.c  |   19 +
 .../c-c++-common/pragma-pack-parsing-2.c  |4 +
 .../pragma-redefine_extname-parsing.c |9 +
 .../c-c++-common/pragma-target-parsing.c  |   14 +
 .../c-c++-common/pragma-visibility-parsing.c  |   13 +
 .../c-c++-common/pragma-weak-parsing.c|   24 +
 gcc/testsuite/gcc.dg/bad-pragma-locations.c   |   22 +-
 .../gcc.dg/parsing-pragma-show_layout.c   |   15 +
 .../pragma-scalar_storate_order-parsing.c |8 +
 gcc/testsuite/gcc.dg/pragma-show_layout-1.c   |   12 +
 gcc/testsuite/gcc.dg/pragma-show_layout-2.c   |  184 ++
 ...agma-show_layout-infoleak-CVE-2017-18550.c |  175 ++
 gcc/testsuite/gcc.dg/sso-6.c  |2 +-
 gcc/toplev.cc |2 +
 35 files changed, 4589 insertions(+), 149 deletions(-)
 create mode 100755 contrib/regenerate-index-urls.py
 create mode 100644 gcc/gcc-urlifier.cc
 create mode 100644 gcc/gcc-urlifier.def
 create mode 100644 gcc/gcc-urlifier.h
 create mode 100644 gcc/pretty-print-urlifier.h
 create mode 100644 gcc/testsuite/c-c++-common/pragma-message-parsing.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-optimize-parsing.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-pack-parsing-1.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-pack-parsing-2.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-redefine_extname-parsing.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-target-parsing.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-visibility-parsing.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-weak-parsing.c
 create mode 100644 gcc/testsuite/gcc.dg/parsing-pragma-show_layout.c
 create mode 100644 gcc/testsuite/gcc.dg/pragma-scalar_storate_order-parsing.c
 create mode 100644 gcc/testsuite/gcc.dg/pragma-show_layout-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pragma-show_layout-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pragma-show_layout-infoleak-CVE-2017-18550.c

-- 
2.26.3

[PATCH 3/4] diagnostics: add automatic URL-ification within messages

2023-11-02 Thread David Malcolm

In r10-3781-gd26082357676a3 GCC's pretty-print framework gained
the ability to embed URLs via escape sequences
for marking up text output.

In r10-3783-gb4c7ca2ef3915a GCC started using this for the
[-Wname-of-option] emitted at the end of each diagnostic so that it
becomes a hyperlink to the documentation for that option on the GCC
website.

This makes it much more convenient for the user to locate pertinent
documentation when a diagnostic is emitted.

The above involved special-casing in one specific place, but there is
plenty of quoted text throughout GCC's diagnostic messages that could
usefully have a documentation URL: references to options, pragmas, etc.

This patch adds a new optional "urlifier" parameter to pp_format.
The idea is that a urlifier object has responsibility for mapping from
quoted strings in diagnostic messages to URLs, and pp_format has the
ability to automatically add URL escapes for strings that the urlifier
gives it URLs for.

For example, given the format string:

  "%<#pragma pack%> has no effect with %<-fpack-struct%>"

with this patch GCC is able to automatically linkify the "#pragma pack"
text to
  https://gcc.gnu.org/onlinedocs/gcc/Structure-Layout-Pragmas.html
and the "-fpack-struct" text to:
  https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html#index-fpack-struct

and we don't have to modify the format string itself.
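
As a rough illustration of the idea (not the pp_format implementation), the following Python sketch rewrites each %<...%> span of a format string into quoted text and, when a lookup callback returns a URL, wraps the text in OSC 8 terminal hyperlink escapes. The names `urlify_quotes` and `DOC_URLS` are invented for this example, and placing the escapes inside the quote marks is an assumption.

```python
import re

# OSC 8 hyperlink escapes: ESC ] 8 ; ; URL ST ... ESC ] 8 ; ; ST
OSC8_OPEN = "\033]8;;{url}\033\\"
OSC8_CLOSE = "\033]8;;\033\\"

# Hypothetical lookup data, taken from the two URLs quoted above.
DOC_URLS = {
    "#pragma pack":
        "https://gcc.gnu.org/onlinedocs/gcc/Structure-Layout-Pragmas.html",
    "-fpack-struct":
        "https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html#index-fpack-struct",
}


def urlify_quotes(fmt, lookup):
    """Replace %<text%> by 'text', hyperlinked if lookup(text) gives a URL."""
    def render(match):
        text = match.group(1)
        url = lookup(text)
        if url is None:
            return "'" + text + "'"          # plain quoting, no link
        return "'" + OSC8_OPEN.format(url=url) + text + OSC8_CLOSE + "'"
    return re.sub(r"%<(.*?)%>", render, fmt)


message = urlify_quotes(
    "%<#pragma pack%> has no effect with %<-fpack-struct%>", DOC_URLS.get)
```

The format string itself stays untouched; all knowledge about documentation lives behind the lookup callback, which is the role the urlifier object plays in the patch.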

This is only done for the pp_format within diagnostic_report_diagnostic
i.e. just for the primary message in each diagnostic, and not for other
places within GCC that use pp_format internally.

"urlifier" is an abstract base class, with a GCC-specific subclass
implementing the logic for generating URLs into GCC's HTML
documentation via binary search in a data table.  This patch implements
the gcc_urlifier with a small table generated by hand; the data table in
this patch only covers enough pragmas and options to allow undoing some
of the hardcoding from the previous pragma-parsing patch.
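
The table lookup can be pictured with a short Python sketch. This is not the contents of gcc-urlifier.def; the table entries, `DOC_ROOT` and the function name `get_url_for_quoted_text` are assumptions made for the example, using only URLs that appear elsewhere in this thread.

```python
from bisect import bisect_left

DOC_ROOT = "https://gcc.gnu.org/onlinedocs/"

# Hypothetical hand-written table: (quoted text, path below DOC_ROOT).
# It must be sorted by quoted text for the binary search to be valid.
URL_TABLE = sorted([
    ("#pragma pack", "gcc/Structure-Layout-Pragmas.html"),
    ("#pragma weak", "gcc/Weak-Pragmas.html"),
    ("-fpack-struct", "gcc/Code-Gen-Options.html#index-fpack-struct"),
])
KEYS = [key for key, _ in URL_TABLE]


def get_url_for_quoted_text(text):
    """Return the documentation URL for `text`, or None if it is unknown."""
    i = bisect_left(KEYS, text)          # binary search in the sorted keys
    if i < len(KEYS) and KEYS[i] == text:
        return DOC_ROOT + URL_TABLE[i][1]
    return None
```

Keeping the data in a sorted table means a generated table (such as one scraped from the HTML docs) can replace the hand-written one without touching the lookup code.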

I have a followup patch that scripts the creation of this data by
directly scraping the output of "make html", thus automating all this,
and (I hope) minimizing the work of ensuring that documentation URLs
emitted by GCC match the generated documentation.

gcc/ChangeLog:
* Makefile.in (GCC_OBJS): Add gcc-urlifier.o.
(OBJS): Likewise.

gcc/c-family/ChangeLog:
* c-pragma.cc: Eliminate uses of %{ and %} and get_doc_url
in all places where it's just the name of the pragma (or of an
option).
(handle_pragma_push_options): Fix missing "GCC" in name of pragma
in "junk" message.
(handle_pragma_pop_options): Likewise.

gcc/ChangeLog:
* diagnostic.cc: Include "pretty-print-urlifier.h".
(diagnostic_initialize): Initialize m_urlifier.
(diagnostic_finish): Clean up m_urlifier.
(diagnostic_report_diagnostic): Pass context->m_urlifier to
pp_format.
* diagnostic.h (diagnostic_context::m_urlifier): New field.
* gcc-urlifier.cc: New file.
* gcc-urlifier.def: New file.
* gcc-urlifier.h: New file.
* gcc.cc: Include "gcc-urlifier.h".
(driver::global_initializations): Initialize global_dc->m_urlifier.
* pretty-print-urlifier.h: New file.
* pretty-print.cc: Include "pretty-print-urlifier.h".
(obstack_append_string): New.
(urlify_quoted_string): New.
(pp_format): Add "urlifier" param and use it to implement optional
urlification of quoted text strings.
(pp_output_formatted_text): Make buffer a const pointer.
(selftest::pp_printf_with_urlifier): New.
(selftest::test_urlification): New.
(selftest::pretty_print_cc_tests): Call it.
* pretty-print.h (class urlifier): New forward declaration.
(pp_format): Add optional urlifier param.
* selftest-run-tests.cc (selftest::run_tests): Call
selftest::gcc_urlifier_cc_tests.
* selftest.h (selftest::gcc_urlifier_cc_tests): New decl.
* toplev.cc: Include "gcc-urlifier.h".
(general_init): Initialize global_dc->m_urlifier.
---
 gcc/Makefile.in |   3 +-
 gcc/c-family/c-pragma.cc|  73 ---
 gcc/diagnostic.cc   |   8 +-
 gcc/diagnostic.h|   4 +
 gcc/gcc-urlifier.cc | 159 +++
 gcc/gcc-urlifier.def|  20 +++
 gcc/gcc-urlifier.h  |  26 
 gcc/gcc.cc  |   2 +
 gcc/pretty-print-urlifier.h |  33 +
 gcc/pretty-print.cc | 242 +++-
 gcc/pretty-print.h  |   5 +-
 gcc/selftest-run-tests.cc   |   1 +
 gcc/selftest.h  |   1 +
 gcc/toplev.cc   |   2 +
 14 files changed, 520 insertions(+), 59 deletions(-)
 create mode 100644 gcc/gcc-urlifier.cc
 create mode 100644 gcc/gcc-urlifier.def
 create mode 100644 gcc/gcc-urlifier.h
 create mode 100644 gcc/pretty-print-urlifier.h

diff --git a/gcc/Makefile.in b/gcc/Ma

[PATCH 2/4] c: add #pragma GCC show_layout

2023-11-02 Thread David Malcolm

This patch adds a new pragma to the C frontend that will
make it emit a human-readable diagram of a struct's layout.

For example, given this contrived usage:

struct example {
  char foo : 7;
  char bar;
  char visible : 1;
  char active  : 1;
};

the compiler will emit output similar to the following:

note: 'sizeof(struct example)' == 3; layout:

  
  ┌───┬┬───┬─┬─┬───┐
  │Offsets│Byte│   0   │  1  │2 │   3   │
  ├───┼┼─┬─┬─┬─┬─┬─┬─┬─┼─┬─┬──┬──┬──┬──┬──┬──┼───┬───┬──┬──┬──┬──┬──┬──┼──┬──┬──┬──┬──┬──┬──┬──┤
  │ Byte  │Bit │0│1│2│3│4│5│6│7│8│9│10│11│12│13│14│15│16 │17 │18│19│20│21│22│23│24│25│26│27│28│29│30│31│
  ├───┼┼─┴─┴─┴─┴─┴─┴─┼─┼─┴─┴──┴──┴──┴──┴──┴──┼───┼───┼──┴──┴──┴──┴──┴──┼──┴──┴──┴──┴──┴──┴──┴──┘
  │   0   │ 0  │'foo'│*│'bar'│(1)│(2)│ padding │
  └───┴┴─┴─┴─┴───┴───┴─┘
  *: padding
  (1): 'visible'
  (2): 'active'

The output is intended for humans, rather than scripts, and is
subject to change.
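
For readers who want to check the diagram by hand, here is a small Python sketch that computes the same bit ranges for the example struct. It is not the layout_diagram code from the patch; it assumes char-sized storage units, little-endian bit numbering as on x86_64, and that a bit-field never straddles a char boundary. The function name `layout_bits` is invented.

```python
CHAR_BITS = 8


def layout_bits(fields):
    """Lay out char members and char bit-fields.

    `fields` is a list of (name, width) pairs; width is the bit-field
    width, or None for a plain (non-bit-field) char member.
    Returns (sizeof_in_bytes, [(label, first_bit, last_bit), ...]).
    """
    cells = []
    pos = 0
    for name, width in fields:
        if width is None:
            # A plain char starts on the next byte boundary.
            aligned = -(-pos // CHAR_BITS) * CHAR_BITS
            if aligned > pos:
                cells.append(("padding", pos, aligned - 1))
            pos = aligned
            width = CHAR_BITS
        else:
            # A bit-field must fit in the remainder of the current char.
            room = CHAR_BITS - pos % CHAR_BITS
            if width > room:
                cells.append(("padding", pos, pos + room - 1))
                pos += room
        cells.append((name, pos, pos + width - 1))
        pos += width
    size = -(-pos // CHAR_BITS)              # round up to whole bytes
    if pos < size * CHAR_BITS:
        cells.append(("padding", pos, size * CHAR_BITS - 1))
    return size, cells


size, cells = layout_bits(
    [("foo", 7), ("bar", None), ("visible", 1), ("active", 1)])
```

Under these assumptions the result is a size of 3 bytes, with 'foo' in bits 0-6, one padding bit, 'bar' in bits 8-15, 'visible' and 'active' in bits 16 and 17, and trailing padding up to bit 23, which agrees with the note and the cells in the diagram.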

One wart is that it uses some analyzer internals, and thus requires
GCC to have been configured without disabling the analyzer.

Caveat: only tested on x86_64, and probably has some endianness and
packing assumptions in the testcases.

Thoughts?

gcc/analyzer/ChangeLog:
* record-layout.cc: Define INCLUDE_ALGORITHM and
INCLUDE_VECTOR.  Include "intl.h", "text-art/table.h",
"text-art/widget.h", and "diagnostic-diagram.h".
(class layout_diagram): New.
(layout_diagram::layout_diagram): New.
(layout_diagram::bit_to_table_coord): New.
(layout_diagram::ensure_table_rows): New.
(layout_diagram::get_string_for_item): New.
(impl_show_record_layout): New.
(show_record_layout): New.
* record-layout.h (class layout_diagram): New forward decl.
(class record_layout): Add friend class layout_diagram.

gcc/c-family/ChangeLog:
* c-pragma.cc: Include "stor-layout.h".
(class pragma_parser_show_layout): New.
(handle_pragma_show_layout): New.
(init_pragma): Register it.

gcc/ChangeLog:
* doc/extend.texi (Other Pragmas): New subsection,
with '#pragma GCC show_layout'.
* stor-layout.h (show_record_layout): New decl.

gcc/testsuite/ChangeLog:
* gcc.dg/parsing-pragma-show_layout.c: New test.
* gcc.dg/pragma-show_layout-1.c: New test.
* gcc.dg/pragma-show_layout-2.c: New test.
* gcc.dg/pragma-show_layout-infoleak-CVE-2017-18550.c: New test.
---
 gcc/analyzer/record-layout.cc | 235 ++
 gcc/analyzer/record-layout.h  |   4 +
 gcc/c-family/c-pragma.cc  |  95 +++
 gcc/doc/extend.texi   |  49 
 gcc/stor-layout.h |   3 +
 .../gcc.dg/parsing-pragma-show_layout.c   |  15 ++
 gcc/testsuite/gcc.dg/pragma-show_layout-1.c   |  12 +
 gcc/testsuite/gcc.dg/pragma-show_layout-2.c   | 184 ++
 ...agma-show_layout-infoleak-CVE-2017-18550.c | 175 +
 9 files changed, 772 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/parsing-pragma-show_layout.c
 create mode 100644 gcc/testsuite/gcc.dg/pragma-show_layout-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pragma-show_layout-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pragma-show_layout-infoleak-CVE-2017-18550.c

diff --git a/gcc/analyzer/record-layout.cc b/gcc/analyzer/record-layout.cc
index 1369bfb5eff..242a9895309 100644
--- a/gcc/analyzer/record-layout.cc
+++ b/gcc/analyzer/record-layout.cc
@@ -19,7 +19,9 @@ along with GCC; see the file COPYING3.  If not see
 .  */
 
 #include "config.h"
+#define INCLUDE_ALGORITHM
 #define INCLUDE_MEMORY
+#define INCLUDE_VECTOR
 #include "system.h"
 #include "coretypes.h"
 #include "tree.h"
@@ -28,8 +30,13 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple.h"
 #include "diagnostic.h"
 #include "tree-diagnostic.h"
+#include "intl.h"
+#include "make-unique.h"
 #include "analyzer/analyzer.h"
 #include "analyzer/record-layout.h"
+#include "text-art/table.h"
+#include "text-art/widget.h"
+#include "diagnostic-diagram.h"
 
 #if ENABLE_ANALYZER
 
@@ -120,6 +127,234 @@ record_layout::maybe_pad_to (bit_offset_t next_offset)
 }
 }
 
+class layout_diagram : public text_art::vbox_widget
+{
+public:
+  layout_diagram (const ana::record_layout &layout,
+ text_art::style_manager &sm,
+ const text_art::theme &theme);
+
+private:
+  text_art::table::coord_t bit_to_table_coord (ana::bit_offset_t bit);
+
+  void ensure_table_rows (text_art::style_manager &sm,
+ text_art::table &table,
+ int table_y);
+
+  text_art::styled_string
+  get_string_for_item (const ana::rec

[PATCH 1/4] c/c++: rework pragma parsing

2023-11-02 Thread David Malcolm

This patch reworks pragma parsing in c-pragma.cc, with the
following improvements:

- it replaces the GCC_BAD* macros (that contained "return") in favor
of helper classes and functions for emitting diagnostics, making control
flow more explicit

- the -Wpragmas diagnostics are reworded from the form e.g.:
  DESCRIPTION OF PROBLEM; ignored
to:
  ignoring malformed '#pragma FOO': DESCRIPTION OF PROBLEM

- the locations of the warnings are fixed to more accurately
reflect the location of the problem

- the names of the pragmas are URLified into links to the
documentation for the pragma.  For example, in:

  warning: ignoring malformed '#pragma weak': expected name [-Wpragmas]

in a suitable terminal, the "#pragma weak" within quotes is a link
to https://gcc.gnu.org/onlinedocs/gcc/Weak-Pragmas.html; similarly with

  warning: '#pragma pack' has no effect with '-fpack-struct' - ignored [-Wpragmas]

the "#pragma pack" text is linkified to
  https://gcc.gnu.org/onlinedocs/gcc/Structure-Layout-Pragmas.html
and the "-fpack-struct" text is linkified to:
  https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html#index-fpack-struct

I have a more general and maintainable approach to adding URLs to
diagnostics which is in a followup.

gcc/c-family/ChangeLog:
* c-pragma.cc (GCC_BAD): Delete.
(GCC_BAD2): Delete.
(GCC_BAD_AT): Delete.
(GCC_BAD2_AT): Delete.
(get_doc_url): New.
(class pragma_parser): New.
(handle_pragma_pack): Delete redundant forward decl.
(pop_alignment): Add param "p" and use it to get doc urls.
(enum class pack_action): Move here from within
handle_pragma_pack.
(class pragma_pack_parser): New.
(handle_pragma_pack): Rewrite using pragma_pack_parser
and enum class pack_action, eliminating uses of GCC_BAD*,
rewording diagnostics.
(handle_pragma_weak): Rewrite using pragma_parser, eliminating
uses of GCC_BAD*, rewording diagnostics.
(class pragma_scalar_storage_order_parser): New.
(handle_pragma_scalar_storage_order): Rewrite using above,
eliminating uses of GCC_BAD*, rewording diagnostics.
(handle_pragma_redefine_extname): Rewrite using pragma_parser,
eliminating uses of GCC_BAD*, rewording diagnostics.  Fix overlong
line.
(handle_pragma_visibility): Remove redundant forward decl.
(push_visibility): Add "const pragma_parser *" param.  Rewrite to
eliminate uses of GCC_BAD*.  Add note that warning was ignored.
(handle_pragma_visibility): Rewrite using pragma_parser,
eliminating uses of GCC_BAD*, rewording diagnostics.
(handle_pragma_target): Fix name of pragma in "error".  Eliminate
uses of GCC_BAD*.
(handle_pragma_optimize): Eliminate uses of GCC_BAD.
(handle_pragma_message): Rewrite using pragma_parser, eliminating
uses of GCC_BAD*, rewording diagnostics.
* c-pragma.h (class pragma_parser): New forward decl.
(push_visibility): Add optional "const pragma_parser *" param.

gcc/testsuite/ChangeLog:
* c-c++-common/pragma-message-parsing.c: New test.
* c-c++-common/pragma-optimize-parsing.c: New test.
* c-c++-common/pragma-pack-parsing-1.c: New test.
* c-c++-common/pragma-pack-parsing-2.c: New test.
* c-c++-common/pragma-redefine_extname-parsing.c: New test.
* c-c++-common/pragma-target-parsing.c: New test.
* c-c++-common/pragma-visibility-parsing.c: New test.
* c-c++-common/pragma-weak-parsing.c: New test.
* gcc.dg/bad-pragma-locations.c: Update for changes to wording and
location of -Wpragmas.
* gcc.dg/pragma-scalar_storate_order-parsing.c: New test.
* gcc.dg/sso-6.c: Update for changes to wording of -Wpragmas.
---
 gcc/c-family/c-pragma.cc  | 569 ++
 gcc/c-family/c-pragma.h   |   5 +-
 .../c-c++-common/pragma-message-parsing.c |  21 +
 .../c-c++-common/pragma-optimize-parsing.c|  16 +
 .../c-c++-common/pragma-pack-parsing-1.c  |  19 +
 .../c-c++-common/pragma-pack-parsing-2.c  |   4 +
 .../pragma-redefine_extname-parsing.c |   9 +
 .../c-c++-common/pragma-target-parsing.c  |  14 +
 .../c-c++-common/pragma-visibility-parsing.c  |  13 +
 .../c-c++-common/pragma-weak-parsing.c|  24 +
 gcc/testsuite/gcc.dg/bad-pragma-locations.c   |  22 +-
 .../pragma-scalar_storate_order-parsing.c |   8 +
 gcc/testsuite/gcc.dg/sso-6.c  |   2 +-
 13 files changed, 588 insertions(+), 138 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/pragma-message-parsing.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-optimize-parsing.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-pack-parsing-1.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-pack-parsing-2.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-redefine_extname-parsi

Re: [PATCH 01/12] [contrib] validate_failures.py: Avoid testsuite aliasing

2023-11-02 Thread Maxim Kuvyrkov

Patch proposed at 
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635000.html
--
Maxim Kuvyrkov
https://www.linaro.org
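
For readers following the quoted discussion below, the mismatch between "Running foo" and "Running foo ..." is easy to demonstrate. The Python sketch below uses two made-up patterns; neither is claimed to be the regex in validate_failures.py, and the lenient variant is shown only as an illustration, since the preferred fix described below is to make go's testsuite print the trailing "...".

```python
import re

# Hypothetical patterns, for illustration only.
STRICT = re.compile(r"^Running (?P<exp>.*) \.\.\.$")
LENIENT = re.compile(r"^Running (?P<exp>.*?)(?: \.\.\.)?$")


def exp_name(line, pattern):
    """Return the name after 'Running', or None if the line does not match."""
    match = pattern.match(line.rstrip("\n"))
    return match.group("exp") if match else None
```

With the strict pattern a DejaGnu-style line such as "Running gcc/foo.exp ..." yields gcc/foo.exp, while the gotools-style line "Running cmd/go" yields None, which corresponds to the exp="None" seen in the error message quoted below.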

> On Sep 27, 2023, at 18:47, Maxim Kuvyrkov  wrote:
> 
> Hi Bernhard,
> 
> Thanks, I meant to fix this, but forgot.
> 
> The underlying problem here is that we want to detect which sub-testsuites 
> had failures.  Current regex doesn't match go's case because there is no 
> "..." at the end: "Running foo" vs "Running foo ..." .
> 
> My preferred way of fixing this is to make go's testsuite print out "..." .  
> We have a similar patch for glibc [1].
> 
> [1] https://sourceware.org/pipermail/libc-alpha/2023-June/148702.html
> 
> --
> Maxim Kuvyrkov
> https://www.linaro.org
> 
>> On Sep 26, 2023, at 19:46, Bernhard Reutner-Fischer wrote:
>> 
>> Hi Maxim!
>> 
>> On Mon, 5 Jun 2023 18:06:25 +0400
>> Maxim Kuvyrkov via Gcc-patches  wrote:
>> 
>>>> On Jun 3, 2023, at 19:17, Jeff Law  wrote:
>>>> 
>>>> On 6/2/23 09:20, Maxim Kuvyrkov via Gcc-patches wrote:
>>>>> This patch adds tracking of current testsuite "tool" and "exp"
>>>>> to the processing of .sum files.  This avoids aliasing between
>>>>> tests from different testsuites with same name+description.
>>>>> E.g., this is necessary for testsuite/c-c++-common, which is run
>>>>> for both gcc and g++ "tools".
>>>>> This patch changes manifest format from ...
>>>>> 
>>>>> FAIL: gcc_test
>>>>> FAIL: g++_test
>>>>> 
>>>>> ... to ...
>>>>> 
>>>>> === gcc tests ===
>>>>> Running gcc/foo.exp ...
>>>>> FAIL: gcc_test
>>>>> === gcc Summary ==
>>>>> === g++ tests ===
>>>>> Running g++/bar.exp ...
>>>>> FAIL: g++_test
>>>>> === g++ Summary ==
>>>>> .
>>>>> The new format uses same formatting as DejaGnu's .sum files
>>>>> to specify which "tool" and "exp" the test belongs to.
>>>> I think the series is fine.  You're not likely to hear from Diego or Doug
>>>> I suspect, I don't think either are involved in GNU stuff anymore.
>>>> 
>>> 
>>> Thanks, Jeff.  I'll wait for a couple of days and will merge if there are 
>>> no new comments.
>> 
>> Maxim, may I ask you to have a look at the following problem, please?
>> 
>> ISTM that your exp code does not work as expected for go, maybe you
>> forgot to test the changes with go enabled?
>> 
>> Ever since your changes in summer I see the following:
>> 
>> gcc-14.mine$ /scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py --clean_build ../gcc-14.orig/
>> Getting actual results from build directory .
>> ./gcc/testsuite/go/go.sum
>> ./gcc/testsuite/gcc/gcc.sum
>> ./gcc/testsuite/objc/objc.sum
>> ./gcc/testsuite/jit/jit.sum
>> ./gcc/testsuite/gdc/gdc.sum
>> ./gcc/testsuite/gnat/gnat.sum
>> ./gcc/testsuite/ada/acats/acats.sum
>> ./gcc/testsuite/g++/g++.sum
>> ./gcc/testsuite/obj-c++/obj-c++.sum
>> ./gcc/testsuite/rust/rust.sum
>> ./gcc/testsuite/gfortran/gfortran.sum
>> ./x86_64-pc-linux-gnu/libgomp/testsuite/libgomp.sum
>> ./x86_64-pc-linux-gnu/libphobos/testsuite/libphobos.sum
>> ./x86_64-pc-linux-gnu/libstdc++-v3/testsuite/libstdc++.sum
>> ./x86_64-pc-linux-gnu/libffi/testsuite/libffi.sum
>> ./x86_64-pc-linux-gnu/libitm/testsuite/libitm.sum
>> ./x86_64-pc-linux-gnu/libgo/libgo.sum
>> ./x86_64-pc-linux-gnu/libatomic/testsuite/libatomic.sum
>> ./gotools/gotools.sum
>> .sum file seems to be broken: tool="gotools", exp="None", summary_line="FAIL: TestScript"
>> Traceback (most recent call last):
>>   File "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", line 732, in <module>
>>     retval = Main(sys.argv)
>>   File "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", line 721, in Main
>>     retval = CompareBuilds()
>>   File "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", line 622, in CompareBuilds
>>     actual = GetResults(sum_files)
>>   File "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", line 466, in GetResults
>>     build_results.update(ParseSummary(sum_fname))
>>   File "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", line 405, in ParseSummary
>>     result = result_set.MakeTestResult(line, ordinal)
>>   File "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", line 239, in MakeTestResult
>>     return TestResult(summary_line, ordinal,
>>   File "/scratch/src/gcc-14.mine/contrib/testsuite-management/validate_failures.py", line 151, in __init__
>>     raise
>> RuntimeError: No active exception to reraise
>> 
>> 
>> The problem seems to be that gotools.sum does not mention any ".exp"
>> files.
>> 
>> $ grep "Running " gotools/gotools.sum 
>> Running cmd/go
>> Running runtime
>> Running cgo
>> Running carchive
>> Running cmd/vet
>> Running embed
>> $ grep -c "\.exp" gotools/gotools.sum 
>> 0
>> 
>> The .sum file looks like this:
>> ---8<---
>> Test Run By foo on Tue Sep 26 14:46:48 CEST 2023
>> Native configuration is x86_64-foo-linux-gnu
>> 
>>   === gotools tests

[committed] Improve H8 sequences for single bit sign extractions

2023-11-02 Thread Jeff Law

Spurred by Roger's recent work on ARC, this patch improves the code we 
generate for single bit sign extractions.


The basic idea is to get the bit we want into C, then use a 
subx;ext.w;ext.l sequence to sign extend it in a GPR.


For bits 0..15 we can use a bld instruction to get the bit we want into 
C.  For bits 16..31, we can move the high word into the low word, then 
use bld.  There are a couple of special cases where we can shift the bit we 
want from the high word into C, which is one instruction smaller.
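
The arithmetic behind the sequence can be checked with a small Python model. This is only a sketch of the value computation, not of the actual H8 instructions: it assumes that subx on a register with itself yields r - r - C in the low byte, that ext.w and ext.l sign-extend byte to word and word to long, and it abstracts the bld and shift variants into "the chosen bit ends up in C". The function names are invented.

```python
def to_signed32(x):
    """Interpret the low 32 bits of x as a two's complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def reference(value, position):
    """Sign-extracting a 1-bit field gives 0 or -1."""
    return -((value >> position) & 1)


def carry_sequence(value, position):
    carry = (value >> position) & 1          # chosen bit is moved into C
    low_byte = (0 - carry) & 0xFF            # subx r,r: r - r - C, i.e. 0x00 or 0xFF
    word = low_byte | (0xFF00 if low_byte & 0x80 else 0)        # ext.w
    longword = word | (0xFFFF0000 if word & 0x8000 else 0)      # ext.l
    return to_signed32(longword)


def and_neg_sequence(value):
    """Position 0 special case: and.l #1 followed by neg.l."""
    return to_signed32(-(value & 1))
```

Both models agree with `reference` for every bit position, which is the property the new patterns rely on.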


Not surprisingly most cases seen in newlib and the test suite are 
extractions from the low byte, HImode sign bit and top two bits of SImode.


Regression tested on the H8 with no regressions.  Installing on the trunk.

Jeff

commit 0f9f3fc885a1f830ff09a095e8c14919c2796a9d
Author: Jeff Law 
Date:   Thu Nov 2 07:25:39 2023 -0600

[committed] Improve H8 sequences for single bit sign extractions

Spurred by Roger's recent work on ARC, this patch improves the code we
generate for single bit sign extractions.

The basic idea is to get the bit we want into C, then use a subx;ext.w;ext.l
sequence to sign extend it in a GPR.

For bits 0..15 we can use a bld instruction to get the bit we want into C.
For bits 16..31, we can move the high word into the low word, then use bld.
There are a couple of special cases where we can shift the bit we want from
the high word into C, which is one instruction smaller.

Not surprisingly, most cases seen in newlib and the test suite are extractions
from the low byte, HImode sign bit and top two bits of SImode.

Regression tested on the H8 with no regressions.  Installing on the trunk.

gcc/
* config/h8300/combiner.md: Add new patterns for single bit
sign extractions.

diff --git a/gcc/config/h8300/combiner.md b/gcc/config/h8300/combiner.md
index fd5cf2f4af4..2f7faf77c93 100644
--- a/gcc/config/h8300/combiner.md
+++ b/gcc/config/h8300/combiner.md
@@ -1268,3 +1268,94 @@ (define_insn ""
 ;;   (label_ref (match_dup 1))
 ;;   (pc)))]
 ;;   "")
+
+;; Various ways to extract a single bit bitfield and sign extend it
+;;
+;; Testing showed this only triggering with SImode, probably because
+;; of how insv/extv are defined.
+(define_insn_and_split ""
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (sign_extract:SI (match_operand:QHSI 1 "register_operand" "0")
+(const_int 1)
+(match_operand 2 "immediate_operand")))]
+  ""
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 0)
+  (sign_extract:SI (match_dup 1) (const_int 1) (match_dup 2)))
+ (clobber (reg:CC CC_REG))])])
+
+(define_insn ""
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (sign_extract:SI (match_operand:QHSI 1 "register_operand" "0")
+(const_int 1)
+(match_operand 2 "immediate_operand")))
+   (clobber (reg:CC CC_REG))]
+  ""
+{
+  int position = INTVAL (operands[2]);
+
+  /* For bit position 31, 30, left shift the bit we want into C.  */
+  bool bit_in_c = false;
+  if (position == 31)
+{
+  output_asm_insn ("shll.l\t%0", operands);
+  bit_in_c = true;
+}
+  else if (position == 30 && TARGET_H8300S)
+{
+  output_asm_insn ("shll.l\t#2,%0", operands);
+  bit_in_c = true;
+}
+
+  /* Similar for positions 16, 17, but with a right shift into C.  */
+  else if (position == 16)
+{
+  output_asm_insn ("shlr.w\t%e0", operands);
+  bit_in_c = true;
+}
+  else if (position == 17 && TARGET_H8300S)
+{
+  output_asm_insn ("shlr.w\t#2,%e0", operands);
+  bit_in_c = true;
+}
+
+
+  /* For all the other cases in the upper 16 bits, move the upper 16
+ bits into the lower 16 bits, then use the standard sequence for
+ extracting one of the low 16 bits.  */
+  else if (position >= 16)
+{
+  output_asm_insn ("mov.w\t%e1,%f0", operands);
+
+  /* We'll use the standard sequences for the low word now.  */
+  position %= 16;
+}
+
+  /* Same size/speed as the general sequence, but slightly faster
+ to simulate.  */
+  if (position == 0)
+return "and.l\t#1,%0\;neg.l\t%0";
+
+  rtx xoperands[3];
+  xoperands[0] = operands[0];
+  xoperands[1] = operands[1];
+  xoperands[2] = GEN_INT (position);
+
+  /* If the bit we want is not already in C, get it there  */
+  if (!bit_in_c)
+{
+  if (position >= 8)
+   {
+ xoperands[2] = GEN_INT (position % 8);
+ output_asm_insn ("bld\t%2,%t1", xoperands);
+   }
+  else
+   output_asm_insn ("bld\t%2,%s1", xoperands);
+}
+
+  /* Now the bit we want is in C, emit the generalized sequence
+ to get that bit into the destination, properly extended.  */
+  return "subx\t%s0,%s0\;exts.w %T0\;exts.l %0";
+}
+  [(set_attr "length" "10")])
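
For readers without the H8 instruction set in their head, here is a minimal C sketch of the value the new pattern is meant to produce. It is a reference model, not the emitted code. The name sign_extract_bit is made up for illustration, and the local variable carry only stands in for the C flag that bld or the shifts would set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model of a single-bit sign extraction.
   Returns 0 if bit POS of X is clear and -1 (all ones) if it is set.  */
static int32_t
sign_extract_bit (uint32_t x, unsigned pos)
{
  assert (pos < 32);
  /* Stand-in for getting the wanted bit into the carry flag.  */
  uint32_t carry = (x >> pos) & 1u;
  /* Stand-in for the subtract-with-carry plus sign extensions:
     0 - carry, widened to the full 32 bits.  */
  return -(int32_t) carry;
}
```

The position 0 special case in the patch (and.l #1 followed by neg.l) computes the same value by masking and negating directly, which matches this model for pos == 0.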

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-02 Thread Richard Biener

On Thu, 26 Oct 2023, Robin Dapp wrote:

> Ok, next try.  Now without dubious pattern and with direct optab
> but still dedicated expander function.
> 
> This will cause one riscv regression in cond_widen_reduc-2.c that
> we can deal with later.  It is just a missed optimization where
> we do not combine something that we used to because of the
> now-present length masking.
> 
> I'd also like to postpone handling vcond_mask_len simplifications
> via stripping the length and falling back to vec_cond and its fold
> patterns to a later time.  As is, this helps us avoid execution
> failures in at least five test cases.
> 
> Bootstrap et al. running on x86, aarch64 and power10.

Looks reasonable overall.  The new match patterns are 1:1 the
same as the COND_ ones.  That's a bit awkward, but I don't see
a good way to "macroize" stuff further there.  Can you at least
interleave the COND_LEN_* ones with the other ones instead of
putting them all at the end?

Thanks,
Richard.


> Regards
>  Robin
> 
> From 7acdebb5b13b71331621af08da6649fe08476fe8 Mon Sep 17 00:00:00 2001
> From: Robin Dapp 
> Date: Wed, 25 Oct 2023 22:19:43 +0200
> Subject: [PATCH v3] internal-fn: Add VCOND_MASK_LEN.
> 
> In order to prevent simplification of a COND_OP with degenerate mask
> (all true or all zero) into just an OP in the presence of length
> masking this patch introduces a length-masked analog to VEC_COND_EXPR:
> IFN_VCOND_MASK_LEN.
> 
> It also adds new match patterns that allow the combination of
> unconditional unary, binary and ternary operations with the
> VCOND_MASK_LEN into a conditional operation if the target supports it.
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/111760
> 
>   * config/riscv/autovec.md (vcond_mask_len_<mode>): Add
>   expander.
>   * config/riscv/riscv-protos.h (enum insn_type): Add.
>   * doc/md.texi: Add vcond_mask_len.
>   * gimple-match-exports.cc (maybe_resimplify_conditional_op):
>   Create VCOND_MASK_LEN when
>   length masking.
>   * gimple-match.h (gimple_match_op::gimple_match_op): Allow
>   matching of 6 and 7 parameters.
>   (gimple_match_op::set_op): Ditto.
>   (gimple_match_op::gimple_match_op): Always initialize len and
>   bias.
>   * internal-fn.cc (vec_cond_mask_len_direct): Add.
>   (expand_vec_cond_mask_len_optab_fn): Add.
>   (direct_vec_cond_mask_len_optab_supported_p): Add.
>   (internal_fn_len_index): Add VCOND_MASK_LEN.
>   (internal_fn_mask_index): Ditto.
>   * internal-fn.def (VCOND_MASK_LEN): New internal function.
>   * match.pd: Combine unconditional unary, binary and ternary
>   operations into the respective COND_LEN operations.
>   * optabs.def (OPTAB_D): Add vcond_mask_len optab.
> ---
>  gcc/config/riscv/autovec.md | 37 
>  gcc/config/riscv/riscv-protos.h |  5 +++
>  gcc/doc/md.texi |  9 
>  gcc/gimple-match-exports.cc | 13 --
>  gcc/gimple-match.h  | 78 -
>  gcc/internal-fn.cc  | 42 ++
>  gcc/internal-fn.def |  2 +
>  gcc/match.pd| 61 ++
>  gcc/optabs.def  |  1 +
>  9 files changed, 243 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 80910ba3cc2..dadb71c1165 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -565,6 +565,43 @@ (define_insn_and_split "vcond_mask_"
>[(set_attr "type" "vector")]
>  )
>  
> +(define_expand "vcond_mask_len_<mode>"
> +  [(match_operand:V_VLS 0 "register_operand")
> +(match_operand:<VM> 3 "nonmemory_operand")
> +(match_operand:V_VLS 1 "nonmemory_operand")
> +(match_operand:V_VLS 2 "autovec_else_operand")
> +(match_operand 4 "autovec_length_operand")
> +(match_operand 5 "const_0_operand")]
> +  "TARGET_VECTOR"
> +  {
> +if (satisfies_constraint_Wc1 (operands[3]))
> +  {
> + rtx ops[] = {operands[0], operands[2], operands[1]};
> + riscv_vector::emit_nonvlmax_insn (code_for_pred_mov (<MODE>mode),
> +   riscv_vector::UNARY_OP_TUMA,
> +   ops, operands[4]);
> +  }
> +else if (satisfies_constraint_Wc0 (operands[3]))
> +  {
> + rtx ops[] = {operands[0], operands[2], operands[2]};
> + riscv_vector::emit_nonvlmax_insn (code_for_pred_mov (<MODE>mode),
> +   riscv_vector::UNARY_OP_TUMA,
> +   ops, operands[4]);
> +  }
> +else
> +  {
> + /* The order of vcond_mask is opposite to pred_merge.  */
> + rtx ops[] = {operands[0], operands[2], operands[2], operands[1],
> +  operands[3]};
> + riscv_vector::emit_nonvlmax_insn (code_for_pred_merge (<MODE>mode),
> +   riscv_vector::MERGE_OP_TUMA,
> +   ops, operands[4
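
As a reading aid for the thread above, here is a minimal scalar sketch of what a length-masked select is understood to compute. It assumes that element i takes the "then" value only when its mask bit is set and i is below len + bias, and the "else" value otherwise. That reading is an assumption based on the commit message, and the function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a length-masked vector select.  */
static void
vcond_mask_len_model (int32_t *dest, const bool *mask,
                      const int32_t *if_true, const int32_t *if_false,
                      size_t nunits, size_t len, ptrdiff_t bias)
{
  assert (dest && mask && if_true && if_false);
  /* Number of leading elements that the length leaves active.  */
  ptrdiff_t active = (ptrdiff_t) len + bias;
  for (size_t i = 0; i < nunits; i++)
    dest[i] = ((ptrdiff_t) i < active && mask[i]) ? if_true[i] : if_false[i];
}
```

With an all-true mask and len smaller than nunits, the tail still comes from the "else" operand, which is why a degenerate mask alone does not allow folding the select away when a length is present.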

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-02 Thread Robin Dapp

> Looks reasonable overall.  The new match patterns are 1:1 the
> same as the COND_ ones.  That's a bit awkward, but I don't see
> a good way to "macroize" stuff further there.  Can you at least
> interleave the COND_LEN_* ones with the other ones instead of
> putting them all at the end?

Yes, no problem.  It's supposed to be only temporary anyway (FWIW)
as I didn't manage with the "stripping _LEN" way on the first few tries.
Still on the todo list but unlikely to be done before stage 1 closes.

I believe Richard "kind of" LGTM'ed the rest minus the spurious
pattern (which is gone now) but there is still the direct optab change
that he didn't comment on so I think we should wait for his remarks
still.

Regards
 Robin

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Qing Zhao



> On Nov 2, 2023, at 3:57 AM, Richard Biener  wrote:
> 
> On Wed, Nov 1, 2023 at 3:47 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Oct 31, 2023, at 6:14 PM, Joseph Myers  wrote:
>>> 
>>> On Tue, 31 Oct 2023, Qing Zhao wrote:
>>> 
>>>> 2.3 A new semantic requirement in the user documentation of "counted_by"
>>>>
>>>> For the following structure including a FAM with a counted_by attribute:
>>>>
>>>> struct A
>>>> {
>>>>  size_t size;
>>>>  char buf[] __attribute__((counted_by(size)));
>>>> };
>>>>
>>>> for any object with such type:
>>>>
>>>> struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>>>
>>>> The setting to the size field should be done before the first reference
>>>> to the FAM field.
>>>>
>>>> Such requirement to the user will guarantee that the first reference to
>>>> the FAM knows the size of the FAM.
>>>>
>>>> We need to add this additional requirement to the user document.
>>> 
>>> Make sure the manual is very specific about exactly when size is
>>> considered to be an accurate representation of the space available for buf
>>> (given that, after malloc or realloc, it's going to be temporarily
>>> inaccurate).  If the intent is that inaccurate size at such a time means
>>> undefined behavior, say so explicitly.
>> 
>> Yes, good point. We need to define this clearly in the beginning.
>> We need to explicitly say that
>> 
>> the size of the FAM is defined by the latest “counted_by” value, and that it is
>> undefined behavior if the size field has not been set when the FAM is
>> referenced.
>> 
>> Is the above good enough?
>> 
>> 
>>> 
>>>> 2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE
>>>>
>>>> In C FE:
>>>>
>>>> for every reference to a FAM, for example, "obj->buf" in the small example,
>>>> check whether the corresponding FIELD_DECL has a "counted_by" attribute?
>>>> if YES, replace the reference to "obj->buf" with a call to
>>>> .ACCESS_WITH_SIZE (obj->buf, obj->size, -1);
>>> 
>>> This seems plausible - but you should also consider the case of static
>>> initializers - remember the GNU extension for statically allocated objects
>>> with flexible array members (unless you're not allowing it with
>>> counted_by).
>>> 
>>> static struct A x = { sizeof "hello", "hello" };
>>> static char *y = &x.buf;
>>> 
>>> I'd expect that to be valid - and unless you say such a usage is invalid,
>> 
>> At this moment, I think that this should be valid.
>> 
>> I.e., the following:
>> 
>> struct A
>> {
>> size_t size;
>> char buf[] __attribute__((counted_by(size)));
>> };
>> 
>> static struct A x = {sizeof "hello", "hello"};
>> 
>> Should be valid, and x.size represents the number of elements of x.buf.
>> Both x.size and x.buf are initialized statically.
>> 
>>> you should avoid the replacement in such a static initializer context when
>>> the FAM reference is to an object with a constant address (if
>>> .ACCESS_WITH_SIZE would not act as an lvalue whose address is a constant
>>> expression; if it works fine as a constant-address lvalue, then the
>>> replacement would be OK).
>> 
>> Then if such usage for the “counted_by” is valid, we need to replace the FAM
>> reference by a call to  .ACCESS_WITH_SIZE as well.
>> Otherwise the “counted_by” relationship will be lost to the Middle end.
>> 
>> With the current definition of .ACCESS_WITH_SIZE
>> 
>> PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)
>> 
>> Isn’t the PTR (return value of the call) a LVALUE?
> 
> You probably want to specify that when a pointer to the array is taken the
> pointer has to be to the first array element (or do we want to mangle the
> 'size' accordingly for the instrumentation?).

Yes. Will add this into the user documentation.

>  You also want to specify that
> the 'size' associated with such pointer is assumed to be unchanging and
> after changing the size such pointer has to be re-obtained.

What do you mean by “re-obtained”? 

>  Plus that
> changes to the allocated object/size have to be performed through an
> lvalue where the containing type and thus the 'counted_by' attribute is
> visible.

Through an lvalue with the containing type?

Yes, will add this too. 


>  That is,
> 
> size_t *s = &a.size;
> *s = 1;
> 
> is invoking undefined behavior,

right.

> likewise modifying 'buf' (makes it a bit
> awkward since for example that wouldn't support using posix_memalign
> for allocation, though aligned_alloc would be fine).
Is there a small example for the undefined behavior for this?

Qing
> 
> Richard.
> 
>> Qing
>>> 
>>> --
>>> Joseph S. Myers
>>> jos...@codesourcery.com
>>

Re: [PATCH V2] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN

2023-11-02 Thread Richard Biener

On Tue, 31 Oct 2023, Juzhe-Zhong wrote:

> As Richard suggested previously, we should support strided load/store in
> the loop vectorizer instead of hacking the RISC-V backend.
> 
> This patch adds MASK_LEN_STRIDED LOAD/STORE OPTABS/IFN.
> 
> The GIMPLE IR is the same as mask_len_gather_load/mask_len_scatter_store, but
> with the vector offset changed into a scalar stride.

I see that it follows gather/scatter.  I'll note that when introducing
those we failed to add a specifier for TBAA and alignment info for the
data access.  That means we have to use alias-set zero for the accesses
(I see existing targets use UNSPECs with some not elaborate MEM anyway,
but TBAA info might have been the "easy" and obvious property to 
preserve).  For alignment we either have to assume unaligned or reject
vectorization of accesses that do not have their original scalar accesses
naturally aligned (aligned according to their mode).  We don't seem
to check that though.

It might be fine to go forward with this since gather/scatter are broken
in a similar way.

Do we really need to have two modes for the optab though or could we
simply require the target to support arbitrary offset modes (given it
is implicitly constrained to ptr_mode for the base already)?  Or
properly extend/truncate the offset at expansion time, say to ptr_mode
or to the mode of sizetype.

Thanks,
Richard.
 
> We don't have strided_load/strided_store and
> mask_strided_load/mask_strided_store since
> it's unlikely RVV will have such optabs and we can't add patterns that we
> can't test.
>
> 
> gcc/ChangeLog:
> 
>   * doc/md.texi: Add mask_len_strided_load/mask_len_strided_store.
>   * internal-fn.cc (internal_load_fn_p): Ditto.
>   (internal_strided_fn_p): Ditto.
>   (internal_fn_len_index): Ditto.
>   (internal_fn_mask_index): Ditto.
>   (internal_fn_stored_value_index): Ditto.
>   (internal_strided_fn_supported_p): Ditto.
>   * internal-fn.def (MASK_LEN_STRIDED_LOAD): Ditto.
>   (MASK_LEN_STRIDED_STORE): Ditto.
>   * internal-fn.h (internal_strided_fn_p): Ditto.
>   (internal_strided_fn_supported_p): Ditto.
>   * optabs.def (OPTAB_CD): Ditto.
> 
> ---
>  gcc/doc/md.texi | 51 +
>  gcc/internal-fn.cc  | 44 ++
>  gcc/internal-fn.def |  4 
>  gcc/internal-fn.h   |  2 ++
>  gcc/optabs.def  |  2 ++
>  5 files changed, 103 insertions(+)
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index fab2513105a..5bac713a0dd 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5094,6 +5094,32 @@ Bit @var{i} of the mask is set if element @var{i} of the result should
>  be loaded from memory and clear if element @var{i} of the result should be undefined.
>  Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
>  
> +@cindex @code{mask_len_strided_load@var{m}@var{n}} instruction pattern
> +@item @samp{mask_len_strided_load@var{m}@var{n}}
> +Load several separate memory locations into a destination vector of mode @var{m}.
> +Operand 0 is a destination vector of mode @var{m}.
> +Operand 1 is a scalar base address and operand 2 is a scalar stride of mode @var{n}.
> +The instruction can be seen as a special case of @code{mask_len_gather_load@var{m}@var{n}}
> +with an offset vector that is a @code{vec_series} with operand 1 as base and operand 2 as step.
> +For each element index i:
> +
> +@itemize @bullet
> +@item
> +extend the stride to address width, using zero
> +extension if operand 3 is 1 and sign extension if operand 3 is zero;
> +@item
> +multiply the extended stride by operand 4;
> +@item
> +add the result to the base; and
> +@item
> +load the value at that address (operand 1 + @var{i} * multiplied and extended stride) into element @var{i} of operand 0.
> +@end itemize
> +
> +Similar to mask_len_load, the instruction loads at most (operand 6 + operand 7) elements from memory.
> +Bit @var{i} of the mask is set if element @var{i} of the result should
> +be loaded from memory and clear if element @var{i} of the result should be undefined.
> +Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
> +
>  @cindex @code{scatter_store@var{m}@var{n}} instruction pattern
>  @item @samp{scatter_store@var{m}@var{n}}
>  Store a vector of mode @var{m} into several distinct memory locations.
> @@ -5131,6 +5157,31 @@ at most (operand 6 + operand 7) elements of (operand 4) to memory.
>  Bit @var{i} of the mask is set if element @var{i} of (operand 4) should be stored.
>  Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
>  
> +@cindex @code{mask_len_strided_store@var{m}@var{n}} instruction pattern
> +@item @samp{mask_len_strided_store@var{m}@var{n}}
> +Store a vector of mode m into several distinct memory locations.
> +Operand 0 is a scalar base address and operand 1 is scalar stride of mode @var{n}.
> +Operand 2 is the vector of values that should b
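
The itemized description of mask_len_strided_load quoted above can be sketched as a scalar loop. This is only a model under stated assumptions: the stride is taken as already extended to address width, the element type is fixed to int32_t, and elements that are masked off or beyond len + bias are left untouched here although the documentation calls them undefined. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a masked, length-limited strided load.  */
static void
mask_len_strided_load_model (int32_t *dest, size_t nunits,
                             const unsigned char *base, ptrdiff_t stride,
                             ptrdiff_t scale, const bool *mask,
                             size_t len, ptrdiff_t bias)
{
  assert (dest && base && mask);
  ptrdiff_t active = (ptrdiff_t) len + bias;  /* elements the length allows */
  ptrdiff_t step = stride * scale;            /* byte distance between loads */
  for (size_t i = 0; i < nunits; i++)
    if ((ptrdiff_t) i < active && mask[i])
      /* Load element i from base + i * step; memcpy avoids alignment traps.  */
      memcpy (&dest[i], base + (ptrdiff_t) i * step, sizeof dest[i]);
}
```

A gather with an offset vector of { 0, step, 2 * step, ... } would read the same addresses, which is the sense in which the strided form is a special case of the gather.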

[committed] libstdc++: Fix warning during configure

2023-11-02 Thread Jonathan Wakely

Tested x86_64-linux. Pushed to trunk.

-- >8 --

The checks for snprintf give a -Wformat warning due to a missing
argument.

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_ENABLE_C99): Fix snprintf checks.
* configure: Regenerate.
---
 libstdc++-v3/acinclude.m4 | 4 ++--
 libstdc++-v3/configure| 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index d8f0ba1c3e2..654b99e92d7 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -997,7 +997,7 @@ AC_DEFUN([GLIBCXX_ENABLE_C99], [
vscanf("%i", args);
vsnprintf(fmt, 0, "%i", args);
vsscanf(fmt, "%i", args);
-   snprintf(fmt, 0, "%i");
+   snprintf(fmt, 0, "%i", 1);
  }], [],
 [glibcxx_cv_c99_stdio_cxx98=yes], [glibcxx_cv_c99_stdio_cxx98=no])
 ])
@@ -1578,7 +1578,7 @@ AC_DEFUN([GLIBCXX_ENABLE_C99], [
vscanf("%i", args);
vsnprintf(fmt, 0, "%i", args);
vsscanf(fmt, "%i", args);
-   snprintf(fmt, 0, "%i");
+   snprintf(fmt, 0, "%i", 1);
  }], [],
 [glibcxx_cv_c99_stdio_cxx11=yes], [glibcxx_cv_c99_stdio_cxx11=no])
 ])
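
For context, a small sketch of the kind of call the configure check exercises, assuming a hosted C99 library. Each conversion in the format string needs a matching argument, and with a size of 0 snprintf stores nothing and reports the length it would have written. The helper name formatted_length is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: number of characters needed to print VALUE
   with the "%i" conversion, not counting the terminating null.  */
static int
formatted_length (int value)
{
  /* With a size of 0 nothing is stored, so the buffer may be null.  */
  int n = snprintf (NULL, 0, "%i", value);
  assert (n >= 0);
  return n;
}
```

Dropping the last argument, as the old check did, leaves "%i" without a value to format, which is what -Wformat complains about.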

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Richard Biener

On Thu, Nov 2, 2023 at 2:50 PM Qing Zhao  wrote:
>
>
>
> > On Nov 2, 2023, at 3:57 AM, Richard Biener  
> > wrote:
> >
> > On Wed, Nov 1, 2023 at 3:47 PM Qing Zhao  wrote:
> >>
> >>
> >>
> >>> On Oct 31, 2023, at 6:14 PM, Joseph Myers  wrote:
> >>>
> >>> On Tue, 31 Oct 2023, Qing Zhao wrote:
> >>>
> >>>> 2.3 A new semantic requirement in the user documentation of "counted_by"
> >>>>
> >>>> For the following structure including a FAM with a counted_by attribute:
> >>>>
> >>>> struct A
> >>>> {
> >>>>  size_t size;
> >>>>  char buf[] __attribute__((counted_by(size)));
> >>>> };
> >>>>
> >>>> for any object with such type:
> >>>>
> >>>> struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
> >>>>
> >>>> The setting to the size field should be done before the first reference
> >>>> to the FAM field.
> >>>>
> >>>> Such requirement to the user will guarantee that the first reference to
> >>>> the FAM knows the size of the FAM.
> >>>>
> >>>> We need to add this additional requirement to the user document.
> >>>
> >>> Make sure the manual is very specific about exactly when size is
> >>> considered to be an accurate representation of the space available for buf
> >>> (given that, after malloc or realloc, it's going to be temporarily
> >>> inaccurate).  If the intent is that inaccurate size at such a time means
> >>> undefined behavior, say so explicitly.
> >>
> >> Yes, good point. We need to define this clearly in the beginning.
> >> We need to explicitly say that
> >>
> >> the size of the FAM is defined by the latest “counted_by” value, and that it
> >> is undefined behavior if the size field has not been set when the FAM is
> >> referenced.
> >>
> >> Is the above good enough?
> >>
> >>
> >>>
> >>>> 2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE
> >>>>
> >>>> In C FE:
> >>>>
> >>>> for every reference to a FAM, for example, "obj->buf" in the small example,
> >>>> check whether the corresponding FIELD_DECL has a "counted_by" attribute?
> >>>> if YES, replace the reference to "obj->buf" with a call to
> >>>> .ACCESS_WITH_SIZE (obj->buf, obj->size, -1);
> >>>
> >>> This seems plausible - but you should also consider the case of static
> >>> initializers - remember the GNU extension for statically allocated objects
> >>> with flexible array members (unless you're not allowing it with
> >>> counted_by).
> >>>
> >>> static struct A x = { sizeof "hello", "hello" };
> >>> static char *y = &x.buf;
> >>>
> >>> I'd expect that to be valid - and unless you say such a usage is invalid,
> >>
> >> At this moment, I think that this should be valid.
> >>
> >> I.e., the following:
> >>
> >> struct A
> >> {
> >> size_t size;
> >> char buf[] __attribute__((counted_by(size)));
> >> };
> >>
> >> static struct A x = {sizeof "hello", "hello"};
> >>
> >> Should be valid, and x.size represents the number of elements of x.buf.
> >> Both x.size and x.buf are initialized statically.
> >>
> >>> you should avoid the replacement in such a static initializer context when
> >>> the FAM reference is to an object with a constant address (if
> >>> .ACCESS_WITH_SIZE would not act as an lvalue whose address is a constant
> >>> expression; if it works fine as a constant-address lvalue, then the
> >>> replacement would be OK).
> >>
> >> Then if such usage for the “counted_by” is valid, we need to replace the 
> >> FAM
> >> reference by a call to  .ACCESS_WITH_SIZE as well.
> >> Otherwise the “counted_by” relationship will be lost to the Middle end.
> >>
> >> With the current definition of .ACCESS_WITH_SIZE
> >>
> >> PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)
> >>
> >> Isn’t the PTR (return value of the call) a LVALUE?
> >
> > You probably want to specify that when a pointer to the array is taken the
> > pointer has to be to the first array element (or do we want to mangle the
> > 'size' accordingly for the instrumentation?).
>
> Yes. Will add this into the user documentation.
>
> >  You also want to specify that
> > the 'size' associated with such pointer is assumed to be unchanging and
> > after changing the size such pointer has to be re-obtained.
>
> What do you mean by “re-obtained”?

Do

p = &container.array[0];

again after any adjustments to 'array' or 'len', and base further accesses
on the new 'p'.
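
A minimal sketch of that rule in plain C, assuming the struct A from earlier in the thread. The COUNTED_BY macro, make_a and grow_a are hypothetical names, and the macro expands to nothing on compilers that do not know the attribute, so the sketch builds either way. The count is written before buf is touched, and callers are expected to take a fresh pointer to the first element after every resize.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Use the attribute only where the compiler knows it (assumption: it is
   spelled counted_by); otherwise expand to nothing.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct A
{
  size_t size;
  char buf[] COUNTED_BY(size);
};

/* Allocate an object with N zeroed elements in buf.  */
static struct A *
make_a (size_t n)
{
  struct A *obj = malloc (sizeof (struct A) + n * sizeof (char));
  if (!obj)
    return NULL;
  obj->size = n;            /* set the count before the first use of buf */
  memset (obj->buf, 0, n);
  return obj;
}

/* Grow OBJ to N elements (N >= obj->size).  Returns the new object, or
   NULL on failure, in which case OBJ is still valid.  */
static struct A *
grow_a (struct A *obj, size_t n)
{
  assert (obj != NULL && n >= obj->size);
  struct A *tmp = realloc (obj, sizeof (struct A) + n * sizeof (char));
  if (!tmp)
    return NULL;
  size_t old = tmp->size;
  tmp->size = n;            /* update the count through the containing type */
  memset (tmp->buf + old, 0, n - old);
  return tmp;
}
```

A caller that holds char *p = &obj->buf[0]; would repeat that assignment on the object returned by grow_a and base all further accesses on the new p.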

> >  Plus that
> > changes to the allocated object/size have to be performed through an
> > lvalue where the containing type and thus the 'counted_by' attribute is
> > visible.
>
> Through an lvalue with the containing type?
>
> Yes, will add this too.
>
>
> >  That is,
> >
> > size_t *s = &a.size;
> > *s = 1;
> >
> > is invoking undefined behavior,
>
> right.
>
> > likewise modifying 'buf' (makes it a bit
> > awkward since for example that wouldn't support using posix_memalign
> > for allocation, though aligned_alloc would be fine).
> Is there a small example for the undefined behavior for this?

a.len = len;
posix_memalign (&a.buf, 16, len);

we would probably have to somehow instrument this.

[committed] d: Merge upstream dmd, druntime 643b1261bb, phobos 1c98326e7

2023-11-02 Thread Iain Buclaw

Hi,

This patch merges the D front-end and runtime library with upstream dmd
643b1261bb, and standard library with phobos 1c98326e7.

Synchronizing with the v2.106.0-beta.1 release.

This is being done a little earlier than usual as a lot of code is being
moved around internally upstream at the moment, to reduce both the
extern(C++) surface area and the cyclic dependencies between the D
modules that implement the compiler. It is done now to keep the diff
below the 400kb threshold enforced on the mailing list.

D front-end changes:

- Suggested preview switches now give gdc flags (PR109681).
- `new S[10]' is now lowered to `_d_newarrayT!S(10)'.

D runtime changes:

- Runtime compiler library functions `_d_newarrayU', `_d_newarrayT',
  `_d_newarrayiT' have been converted to templates.

Phobos changes:

- Add new `std.traits.Unshared' template.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32, committed
to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 643b1261bb.
* d-attribs.cc (build_attributes): Update for new front-end interface.
* d-lang.cc (d_post_options): Likewise.
* decl.cc (layout_class_initializer): Likewise.

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime 643b1261bb.
* libdruntime/Makefile.am (DRUNTIME_DSOURCES_FREEBSD): Add
core/sys/freebsd/ifaddrs.d, core/sys/freebsd/net/if_dl.d,
core/sys/freebsd/sys/socket.d, core/sys/freebsd/sys/types.d.
(DRUNTIME_DSOURCES_LINUX): Add core/sys/linux/linux/if_arp.d,
core/sys/linux/linux/if_packet.d.
* libdruntime/Makefile.in: Regenerate.
* src/MERGE: Merge upstream phobos 1c98326e7.
---
 gcc/d/d-attribs.cc|   2 +-
 gcc/d/d-lang.cc   |   1 -
 gcc/d/decl.cc |   2 +-
 gcc/d/dmd/MERGE   |   2 +-
 gcc/d/dmd/aggregate.d | 184 +++---
 gcc/d/dmd/attrib.d|   6 +-
 gcc/d/dmd/cond.d  |   1 +
 gcc/d/dmd/constfold.d |  24 +-
 gcc/d/dmd/cparse.d|   1 +
 gcc/d/dmd/dcast.d |   3 +-
 gcc/d/dmd/dclass.d|   2 +-
 gcc/d/dmd/declaration.d   |  50 +-
 gcc/d/dmd/dinterpret.d|   3 +-
 gcc/d/dmd/dmangle.d   |   1 +
 gcc/d/dmd/doc.d   |   2 +-
 gcc/d/dmd/dstruct.d   |   2 +-
 gcc/d/dmd/dsymbol.d   |  74 ++-
 gcc/d/dmd/dsymbolsem.d|  11 +-
 gcc/d/dmd/dtemplate.d |  15 +-
 gcc/d/dmd/expression.d| 546 +-
 gcc/d/dmd/expression.h|  20 +-
 gcc/d/dmd/expressionsem.d | 511 +++-
 gcc/d/dmd/func.d  |   1 +
 gcc/d/dmd/globals.h   |   1 -
 gcc/d/dmd/gluelayer.d |   5 -
 gcc/d/dmd/initsem.d   |   1 +
 gcc/d/dmd/lexer.d |   1 -
 gcc/d/dmd/mtype.d |  25 +-
 gcc/d/dmd/mtype.h |   2 +-
 gcc/d/dmd/optimize.d  |   1 +
 gcc/d/dmd/parse.d |  22 +-
 gcc/d/dmd/semantic3.d |   7 +-
 gcc/d/dmd/statementsem.d  |   5 +-
 gcc/d/dmd/staticcond.d|   1 +
 gcc/d/dmd/templateparamsem.d  |   1 +
 gcc/d/dmd/traits.d|   1 +
 gcc/d/dmd/typesem.d   |   2 +
 gcc/d/dmd/typinf.d|  30 +-
 gcc/d/dmd/typinf.h|  22 +
 gcc/testsuite/gdc.test/compilable/dbitfield.d |  13 +
 .../gdc.test/compilable/deprecate14283.d  |   8 +-
 .../gdc.test/compilable/named_arguments.d |  18 +-
 gcc/testsuite/gdc.test/compilable/test20039.d |   2 +-
 .../gdc.test/fail_compilation/b23686.d|  42 ++
 .../gdc.test/fail_compilation/diag4596.d  |   4 +-
 .../gdc.test/fail_compilation/fail13116.d |   2 +-
 .../gdc.test/fail_compilation/fail24208.d |  20 +
 .../gdc.test/fail_compilation/fail24212.d |  30 +
 .../gdc.test/fail_compilation/fail24213.d |  17 +
 .../gdc.test/fail_compilation/ice23865.d  |  32 +
 .../gdc.test/fail_compilation/ice24188.d  |  14 +
 .../fail_compilation/ice24188_a/ice24188_c.d  |   0
 .../gdc.test/fail_compilation/test18480.d |   1 +
 .../gdc.test/fail_compilation/test24157.d |  28 +
 libphobos/libdruntime/MERGE   |   2 +-
 libphobos/libdruntime/Makefile.am |   7 +-
 libphobos/libdruntime/Makefile.in |  34 +-
 .../libdruntime/core/sys/linux/linux/if_arp.d | 136 +

[committed] c: Add missing conditions in Walloc-size to avoid ICEs [PR112347]

2023-11-02 Thread Uecker, Martin


I forgot to guard against some more cases.

Committed as obvious.


Martin


c: Add missing conditions in Walloc-size to avoid ICEs [PR112347]

Fix ICE because of forgotten checks for pointers to void
and incomplete arrays.

Committed as obvious.

PR c/112347

gcc/c:
* c-typeck.cc (convert_for_assignment): Add missing check.

gcc/testsuite:

* gcc.dg/Walloc-size-3.c: New test.

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 16fadfb5468..bdd57aae3ff 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -7367,6 +7367,7 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
idx = TREE_INT_CST_LOW (TREE_VALUE (TREE_CHAIN (args))) - 1;
  tree arg = CALL_EXPR_ARG (rhs, idx);
  if (TREE_CODE (arg) == INTEGER_CST
+ && !VOID_TYPE_P (ttl) && TYPE_SIZE_UNIT (ttl)
  && INTEGER_CST == TREE_CODE (TYPE_SIZE_UNIT (ttl))
  && tree_int_cst_lt (arg, TYPE_SIZE_UNIT (ttl)))
 warning_at (location, OPT_Walloc_size, "allocation of "
diff --git a/gcc/testsuite/gcc.dg/Walloc-size-3.c b/gcc/testsuite/gcc.dg/Walloc-size-3.c
new file mode 100644
index 000..b95e04a8d99
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Walloc-size-3.c
@@ -0,0 +1,15 @@
+/* PR 112347 
+   { dg-do compile }
+   { dg-options "-Walloc-size" }
+ * */
+
+// Test that various types without size do not crash with -Walloc-size
+
+int * mallocx(unsigned long) __attribute__((malloc)) __attribute__((alloc_size(1)));
+void test_oom(void) { void *a_ = mallocx(1); }
+
+void parse_args(char (**child_args_ptr_ptr)[]) {
+  *child_args_ptr_ptr = __builtin_calloc(1, sizeof(char));
+}
+
+

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Martin Uecker

On Thursday, 02.11.2023 at 13:50 +, Qing Zhao wrote:
> 
> > On Nov 2, 2023, at 3:57 AM, Richard Biener  
> > wrote:
> > 
> > On Wed, Nov 1, 2023 at 3:47 PM Qing Zhao  wrote:
> > > 
> > > 
> > > 
> > > > On Oct 31, 2023, at 6:14 PM, Joseph Myers  
> > > > wrote:
> > > > 
> > > > On Tue, 31 Oct 2023, Qing Zhao wrote:
> > > > 
> > > > > 2.3 A new semantic requirement in the user documentation of 
> > > > > "counted_by"
> > > > > 
> > > > > For the following structure including a FAM with a counted_by 
> > > > > attribute:
> > > > > 
> > > > > struct A
> > > > > {
> > > > >  size_t size;
> > > > >  char buf[] __attribute__((counted_by(size)));
> > > > > };
> > > > > 
> > > > > for any object with such type:
> > > > > 
> > > > > struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > > > sizeof(char));
> > > > > 
> > > > > The setting to the size field should be done before the first 
> > > > > reference
> > > > > to the FAM field.
> > > > > 
> > > > > Such requirement to the user will guarantee that the first reference 
> > > > > to
> > > > > the FAM knows the size of the FAM.
> > > > > 
> > > > > We need to add this additional requirement to the user document.
> > > > 
> > > > Make sure the manual is very specific about exactly when size is
> > > > considered to be an accurate representation of the space available for 
> > > > buf
> > > > (given that, after malloc or realloc, it's going to be temporarily
> > > > inaccurate).  If the intent is that inaccurate size at such a time means
> > > > undefined behavior, say so explicitly.
> > > 
> > > Yes, good point. We need to define this clearly in the beginning.
> > > We need to explicitly say that
> > > 
> > > the size of the FAM is defined by the latest “counted_by” value, and that
> > > it is undefined behavior if the size field has not been set when the FAM
> > > is referenced.
> > > 
> > > Is the above good enough?
> > > 
> > > 
> > > > 
> > > > > 2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE
> > > > > 
> > > > > In C FE:
> > > > > 
> > > > > for every reference to a FAM, for example, "obj->buf" in the small 
> > > > > example,
> > > > > check whether the corresponding FIELD_DECL has a "counted_by" 
> > > > > attribute?
> > > > > if YES, replace the reference to "obj->buf" with a call to
> > > > > .ACCESS_WITH_SIZE (obj->buf, obj->size, -1);
> > > > 
> > > > This seems plausible - but you should also consider the case of static
> > > > initializers - remember the GNU extension for statically allocated 
> > > > objects
> > > > with flexible array members (unless you're not allowing it with
> > > > counted_by).
> > > > 
> > > > static struct A x = { sizeof "hello", "hello" };
> > > > static char *y = &x.buf;
> > > > 
> > > > I'd expect that to be valid - and unless you say such a usage is 
> > > > invalid,
> > > 
> > > At this moment, I think that this should be valid.
> > > 
> > > I,e, the following:
> > > 
> > > struct A
> > > {
> > > size_t size;
> > > char buf[] __attribute__((counted_by(size)));
> > > };
> > > 
> > > > static struct A x = {sizeof "hello", "hello"};
> > > 
> > > Should be valid, and x.size represents the number of elements of x.buf.
> > > Both x.size and x.buf are initialized statically.
> > > 
> > > > you should avoid the replacement in such a static initializer context 
> > > > when
> > > > the FAM reference is to an object with a constant address (if
> > > > .ACCESS_WITH_SIZE would not act as an lvalue whose address is a constant
> > > > expression; if it works fine as a constant-address lvalue, then the
> > > > replacement would be OK).
> > > 
> > > Then if such usage for the “counted_by” is valid, we need to replace the 
> > > FAM
> > > reference by a call to  .ACCESS_WITH_SIZE as well.
> > > Otherwise the “counted_by” relationship will be lost to the Middle end.
> > > 
> > > With the current definition of .ACCESS_WITH_SIZE
> > > 
> > > PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)
> > > 
> > > Isn’t the PTR (return value of the call) a LVALUE?
> > 
> > You probably want to specify that when a pointer to the array is taken the
> > pointer has to be to the first array element (or do we want to mangle the
> > 'size' accordingly for the instrumentation?).
> 
> Yes. Will add this into the user documentation.

This shouldn't be necessary. The object-size pass
can track pointer arithmetic if it comes after
inserting the .ACCESS_WITH_SIZE.

https://godbolt.org/z/fvc3aoPfd

> 
> >  You also want to specify that
> > the 'size' associated with such pointer is assumed to be unchanging and
> > after changing the size such pointer has to be re-obtained.
> 
> What do you mean by “re-obtained”? 
> 
> >  Plus that
> > changes to the allocated object/size have to be performed through an
> > lvalue where the containing type and thus the 'counted_by' attribute is
> > visible.
> 
> Through an lvalue with the containing type?
> 
> Yes, will add this too. 

I do not understand this.  It shouldn't matter how
it is

Re: Re: [PATCH V2] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN

2023-11-02 Thread 钟居哲

OK. So we drop 'scale' and keep the signed/unsigned argument, is that right?
Also, should I create the stride_type using size_type_node or
ptrdiff_type_node?
Which is preferable?

Thanks.



juzhe.zh...@rivai.ai
 
From: Richard Biener
Date: 2023-11-02 22:27
To: 钟居哲
CC: gcc-patches; Jeff Law; richard.sandiford; rdapp.gcc
Subject: Re: Re: [PATCH V2] OPTABS/IFN: Add 
mask_len_strided_load/mask_len_strided_store OPTABS/IFN
On Thu, 2 Nov 2023, 钟居哲 wrote:
 
> Hi, Richi.
> 
> >> Do we really need to have two modes for the optab though or could we
> >> simply require the target to support arbitrary offset modes (given it
> >> is implicitly constrained to ptr_mode for the base already)?  Or
> >> properly extend/truncate the offset at expansion time, say to ptr_mode
> >> or to the mode of sizetype.
> 
> For RVV, it's OK to set the stride type to ptr_mode/size_type by default.
> Is it OK if I define strided load/store as a single-mode optab with Pmode as
> the default mode of the stride operand?
> What about the scale and signed/unsigned operands?
> It seems the scale operand can be removed, since we can pass DR_STEP directly
> as the stride argument.
> But I think we can't remove the signed/unsigned operand, since for stride mode =
> SI mode the unsigned maximum stride = 2^31 whereas the signed maximum is 2^30.
 
On the GIMPLE side I think we want to have a sizetype operand and
indeed drop 'scale', the sizetype operand should be readily available.
 
> 
> 
> 
> juzhe.zh...@rivai.ai
>  
> From: Richard Biener
> Date: 2023-11-02 21:52
> To: Juzhe-Zhong
> CC: gcc-patches; jeffreyalaw; richard.sandiford; rdapp.gcc
> Subject: Re: [PATCH V2] OPTABS/IFN: Add 
> mask_len_strided_load/mask_len_strided_store OPTABS/IFN
> On Tue, 31 Oct 2023, Juzhe-Zhong wrote:
>  
> > As Richard suggested previously, we should support strided load/store in
> > the loop vectorizer instead of hacking the RISC-V backend.
> > 
> > This patch adds MASK_LEN_STRIDED LOAD/STORE OPTABS/IFN.
> > 
> > The GIMPLE IR is the same as mask_len_gather_load/mask_len_scatter_store,
> > but with the vector offset changed into a scalar stride.
>  
> I see that it follows gather/scatter.  I'll note that when introducing
> those we failed to add a specifier for TBAA and alignment info for the
> data access.  That means we have to use alias-set zero for the accesses
> (I see existing targets use UNSPECs with some not elaborate MEM anyway,
> but TBAA info might have been the "easy" and obvious property to 
> preserve).  For alignment we either have to assume unaligned or reject
> vectorization of accesses that do not have their original scalar accesses
> naturally aligned (aligned according to their mode).  We don't seem
> to check that though.
>  
> It might be fine to go forward with this since gather/scatter are broken
> in a similar way.
>  
> Do we really need to have two modes for the optab though or could we
> simply require the target to support arbitrary offset modes (given it
> is implicitly constrained to ptr_mode for the base already)?  Or
> properly extend/truncate the offset at expansion time, say to ptr_mode
> or to the mode of sizetype.
>  
> Thanks,
> Richard.
> > We don't have strided_load/strided_store and 
> > mask_strided_load/mask_strided_store since
> > it's unlikely RVV will have such optabs and we can't add patterns that 
> > we can't test.
> >
> > 
> > gcc/ChangeLog:
> > 
> > * doc/md.texi: Add mask_len_strided_load/mask_len_strided_store.
> > * internal-fn.cc (internal_load_fn_p): Ditto.
> > (internal_strided_fn_p): Ditto.
> > (internal_fn_len_index): Ditto.
> > (internal_fn_mask_index): Ditto.
> > (internal_fn_stored_value_index): Ditto.
> > (internal_strided_fn_supported_p): Ditto.
> > * internal-fn.def (MASK_LEN_STRIDED_LOAD): Ditto.
> > (MASK_LEN_STRIDED_STORE): Ditto.
> > * internal-fn.h (internal_strided_fn_p): Ditto.
> > (internal_strided_fn_supported_p): Ditto.
> > * optabs.def (OPTAB_CD): Ditto.
> > 
> > ---
> >  gcc/doc/md.texi | 51 +
> >  gcc/internal-fn.cc  | 44 ++
> >  gcc/internal-fn.def |  4 
> >  gcc/internal-fn.h   |  2 ++
> >  gcc/optabs.def  |  2 ++
> >  5 files changed, 103 insertions(+)
> > 
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > index fab2513105a..5bac713a0dd 100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -5094,6 +5094,32 @@ Bit @var{i} of the mask is set if element @var{i} of 
> > the result should
> >  be loaded from memory and clear if element @var{i} of the result should be 
> > undefined.
> >  Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
> >  
> > +@cindex @code{mask_len_strided_load@var{m}@var{n}} instruction pattern
> > +@item @samp{mask_len_strided_load@var{m}@var{n}}
> > +Load several separate memory locations into a destination vector of mode 
> > @var{m}.
> > +Operand 0 is a destination vector of mode @var{m}.
> > +Operand 1 is a scalar base address and operand 2 is a scalar stride of 
> >

Re: Re: [PATCH V2] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN

2023-11-02 Thread 钟居哲

Hi, Richi.

>> Do we really need to have two modes for the optab though or could we
>> simply require the target to support arbitrary offset modes (given it
>> is implicitly constrained to ptr_mode for the base already)?  Or
>> properly extend/truncate the offset at expansion time, say to ptr_mode
>> or to the mode of sizetype.

For RVV, it's OK to set the stride type to ptr_mode/size_type by default.
Is it OK if I define strided load/store as a single-mode optab with Pmode as
the default mode of the stride operand?
What about the scale and signed/unsigned operands?
It seems the scale operand can be removed, since we can pass DR_STEP directly
as the stride argument.
But I think we can't remove the signed/unsigned operand, since for stride mode =
SI mode the unsigned maximum stride = 2^31 whereas the signed maximum is 2^30.
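
To make the signed/unsigned question concrete, here is a minimal C sketch
(not part of the patch) of how a 32-bit stride would widen to a 64-bit
address offset under each interpretation.  The helper name extend_stride and
its is_unsigned flag are made up for illustration, and the narrowing cast
assumes the usual two's-complement behaviour of GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 32-bit stride to 64 bits.  `is_unsigned` is a hypothetical
   stand-in for the signed/unsigned operand discussed above.  */
static int64_t
extend_stride (uint32_t raw, int is_unsigned)
{
  assert (is_unsigned == 0 || is_unsigned == 1);
  if (is_unsigned)
    return (int64_t) raw;           /* zero extension */
  /* Reinterpret the bits as signed, then sign-extend.  */
  return (int64_t) (int32_t) raw;
}
```

With the same bit pattern 0xFFFFFFFC, sign extension gives a step of -4
while zero extension gives 4294967292, so the two interpretations only
differ once the top bit of the narrow stride is set.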




juzhe.zh...@rivai.ai
 
From: Richard Biener
Date: 2023-11-02 21:52
To: Juzhe-Zhong
CC: gcc-patches; jeffreyalaw; richard.sandiford; rdapp.gcc
Subject: Re: [PATCH V2] OPTABS/IFN: Add 
mask_len_strided_load/mask_len_strided_store OPTABS/IFN
On Tue, 31 Oct 2023, Juzhe-Zhong wrote:
 
> As Richard suggested previously, we should support strided load/store in
> the loop vectorizer instead of hacking the RISC-V backend.
> 
> This patch adds MASK_LEN_STRIDED LOAD/STORE OPTABS/IFN.
> 
> The GIMPLE IR is the same as mask_len_gather_load/mask_len_scatter_store,
> but with the vector offset changed into a scalar stride.
 
I see that it follows gather/scatter.  I'll note that when introducing
those we failed to add a specifier for TBAA and alignment info for the
data access.  That means we have to use alias-set zero for the accesses
(I see existing targets use UNSPECs with some not elaborate MEM anyway,
but TBAA info might have been the "easy" and obvious property to 
preserve).  For alignment we either have to assume unaligned or reject
vectorization of accesses that do not have their original scalar accesses
naturally aligned (aligned according to their mode).  We don't seem
to check that though.
 
It might be fine to go forward with this since gather/scatter are broken
in a similar way.
 
Do we really need to have two modes for the optab though or could we
simply require the target to support arbitrary offset modes (given it
is implicitly constrained to ptr_mode for the base already)?  Or
properly extend/truncate the offset at expansion time, say to ptr_mode
or to the mode of sizetype.
 
Thanks,
Richard.
> We don't have strided_load/strided_store and 
> mask_strided_load/mask_strided_store since
> it's unlikely RVV will have such optabs and we can't add patterns that we 
> can't test.
>
> 
> gcc/ChangeLog:
> 
> * doc/md.texi: Add mask_len_strided_load/mask_len_strided_store.
> * internal-fn.cc (internal_load_fn_p): Ditto.
> (internal_strided_fn_p): Ditto.
> (internal_fn_len_index): Ditto.
> (internal_fn_mask_index): Ditto.
> (internal_fn_stored_value_index): Ditto.
> (internal_strided_fn_supported_p): Ditto.
> * internal-fn.def (MASK_LEN_STRIDED_LOAD): Ditto.
> (MASK_LEN_STRIDED_STORE): Ditto.
> * internal-fn.h (internal_strided_fn_p): Ditto.
> (internal_strided_fn_supported_p): Ditto.
> * optabs.def (OPTAB_CD): Ditto.
> 
> ---
>  gcc/doc/md.texi | 51 +
>  gcc/internal-fn.cc  | 44 ++
>  gcc/internal-fn.def |  4 
>  gcc/internal-fn.h   |  2 ++
>  gcc/optabs.def  |  2 ++
>  5 files changed, 103 insertions(+)
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index fab2513105a..5bac713a0dd 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5094,6 +5094,32 @@ Bit @var{i} of the mask is set if element @var{i} of 
> the result should
>  be loaded from memory and clear if element @var{i} of the result should be 
> undefined.
>  Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
>  
> +@cindex @code{mask_len_strided_load@var{m}@var{n}} instruction pattern
> +@item @samp{mask_len_strided_load@var{m}@var{n}}
> +Load several separate memory locations into a destination vector of mode 
> @var{m}.
> +Operand 0 is a destination vector of mode @var{m}.
> +Operand 1 is a scalar base address and operand 2 is a scalar stride of mode 
> @var{n}.
> +The instruction can be seen as a special case of 
> @code{mask_len_gather_load@var{m}@var{n}}
> +with an offset vector that is a @code{vec_series} with operand 1 as base and 
> operand 2 as step.
> +For each element index i:
> +
> +@itemize @bullet
> +@item
> +extend the stride to address width, using zero
> +extension if operand 3 is 1 and sign extension if operand 3 is zero;
> +@item
> +multiply the extended stride by operand 4;
> +@item
> +add the result to the base; and
> +@item
> +load the value at that address (operand 1 + @var{i} * multiplied and 
> extended stride) into element @var{i} of operand 0.
> +@end itemize
> +
> +Similar to mask_len_load, the instruction loads at most (operand 6 + operand 
> 7) elements from memory.
> +Bit @var{i} of the mask is set if element @var{i} o

Re: Re: [PATCH V2] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN

2023-11-02 Thread Richard Biener

On Thu, 2 Nov 2023, 钟居哲 wrote:

> OK. So we drop 'scale' and keep the signed/unsigned argument, is that right?

I don't think we need signed/unsigned.  RTL expansion has the signedness
of the offset argument there and can just extend to the appropriate mode
to offset a pointer.

> Also, should I create the stride_type using size_type_node or
> ptrdiff_type_node?
> Which is preferable?

'sizetype' - that's the type we require to be used for 
the POINTER_PLUS_EXPR offset operand.
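
As a rough scalar model of why an unsigned sizetype stride is enough (a
sketch only, not the proposed IFN or any GCC API): the function name
strided_load_model, the byte-granular stride and the per-element byte mask
are all assumptions made for illustration, and it assumes uintptr_t and
size_t have the same width.  Inactive elements are simply left untouched
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a masked, length-limited strided load of
   32-bit elements.  The stride is a byte count carried in an unsigned
   size_t, mirroring a sizetype offset; a negative step is represented by
   its wrapped (two's-complement) value.  */
static void
strided_load_model (int32_t *dst, const void *base, size_t stride,
                    const unsigned char *mask, size_t len)
{
  assert (dst != NULL && base != NULL && mask != NULL);
  uintptr_t addr = (uintptr_t) base;
  for (size_t i = 0; i < len; i++)
    {
      if (mask[i])
        memcpy (&dst[i], (const void *) addr, sizeof dst[i]);
      /* Unsigned arithmetic wraps, so adding (size_t) -4 steps back
         four bytes without needing a separate signedness flag.  */
      addr += stride;
    }
}
```

Calling it with a stride of (size_t) 0 - sizeof (int32_t) walks the source
backwards, which is the case a separate signed/unsigned operand would
otherwise have to describe.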


> Thanks.
> 
> 
> 
> juzhe.zh...@rivai.ai
>  
> From: Richard Biener
> Date: 2023-11-02 22:27
> To: 钟居哲
> CC: gcc-patches; Jeff Law; richard.sandiford; rdapp.gcc
> Subject: Re: Re: [PATCH V2] OPTABS/IFN: Add 
> mask_len_strided_load/mask_len_strided_store OPTABS/IFN
> On Thu, 2 Nov 2023, 钟居哲 wrote:
>  
> > Hi, Richi.
> > 
> > >> Do we really need to have two modes for the optab though or could we
> > >> simply require the target to support arbitrary offset modes (given it
> > >> is implicitly constrained to ptr_mode for the base already)?  Or
> > >> properly extend/truncate the offset at expansion time, say to ptr_mode
> > >> or to the mode of sizetype.
> > 
> > For RVV, it's OK to set the stride type to ptr_mode/size_type by default.
> > Is it OK if I define strided load/store as a single-mode optab with Pmode as
> > the default mode of the stride operand?
> > What about the scale and signed/unsigned operands?
> > It seems the scale operand can be removed, since we can pass DR_STEP directly
> > as the stride argument.
> > But I think we can't remove the signed/unsigned operand, since for stride mode
> > = SI mode the unsigned maximum stride = 2^31 whereas the signed maximum is 2^30.
>  
> On the GIMPLE side I think we want to have a sizetype operand and
> indeed drop 'scale', the sizetype operand should be readily available.
>  
> > 
> > 
> > 
> > juzhe.zh...@rivai.ai
> >  
> > From: Richard Biener
> > Date: 2023-11-02 21:52
> > To: Juzhe-Zhong
> > CC: gcc-patches; jeffreyalaw; richard.sandiford; rdapp.gcc
> > Subject: Re: [PATCH V2] OPTABS/IFN: Add 
> > mask_len_strided_load/mask_len_strided_store OPTABS/IFN
> > On Tue, 31 Oct 2023, Juzhe-Zhong wrote:
> >  
> > > As Richard suggested previously, we should support strided load/store in
> > > the loop vectorizer instead of hacking the RISC-V backend.
> > > 
> > > This patch adds MASK_LEN_STRIDED LOAD/STORE OPTABS/IFN.
> > > 
> > > The GIMPLE IR is the same as mask_len_gather_load/mask_len_scatter_store,
> > > but with the vector offset changed into a scalar stride.
> >  
> > I see that it follows gather/scatter.  I'll note that when introducing
> > those we failed to add a specifier for TBAA and alignment info for the
> > data access.  That means we have to use alias-set zero for the accesses
> > (I see existing targets use UNSPECs with some not elaborate MEM anyway,
> > but TBAA info might have been the "easy" and obvious property to 
> > preserve).  For alignment we either have to assume unaligned or reject
> > vectorization of accesses that do not have their original scalar accesses
> > naturally aligned (aligned according to their mode).  We don't seem
> > to check that though.
> >  
> > It might be fine to go forward with this since gather/scatter are broken
> > in a similar way.
> >  
> > Do we really need to have two modes for the optab though or could we
> > simply require the target to support arbitrary offset modes (given it
> > is implicitly constrained to ptr_mode for the base already)?  Or
> > properly extend/truncate the offset at expansion time, say to ptr_mode
> > or to the mode of sizetype.
> >  
> > Thanks,
> > Richard.
> > > We don't have strided_load/strided_store and 
> > > mask_strided_load/mask_strided_store since
> > > it's unlikely RVV will have such optabs and we can't add patterns 
> > > that we can't test.
> > >
> > > 
> > > gcc/ChangeLog:
> > > 
> > > * doc/md.texi: Add mask_len_strided_load/mask_len_strided_store.
> > > * internal-fn.cc (internal_load_fn_p): Ditto.
> > > (internal_strided_fn_p): Ditto.
> > > (internal_fn_len_index): Ditto.
> > > (internal_fn_mask_index): Ditto.
> > > (internal_fn_stored_value_index): Ditto.
> > > (internal_strided_fn_supported_p): Ditto.
> > > * internal-fn.def (MASK_LEN_STRIDED_LOAD): Ditto.
> > > (MASK_LEN_STRIDED_STORE): Ditto.
> > > * internal-fn.h (internal_strided_fn_p): Ditto.
> > > (internal_strided_fn_supported_p): Ditto.
> > > * optabs.def (OPTAB_CD): Ditto.
> > > 
> > > ---
> > >  gcc/doc/md.texi | 51 +
> > >  gcc/internal-fn.cc  | 44 ++
> > >  gcc/internal-fn.def |  4 
> > >  gcc/internal-fn.h   |  2 ++
> > >  gcc/optabs.def  |  2 ++
> > >  5 files changed, 103 insertions(+)
> > > 
> > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > > index fab2513105a..5bac713a0dd 100644
> > > --- a/gcc/doc/md.texi
> > > +++ b/gcc/doc/md.texi
> > > @@ -5094,6 +5094,32 @@ Bit @var{i} of the mask is set if element @var{i} 
> > > of the result should
> >

Re: Re: [PATCH V2] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN

2023-11-02 Thread Richard Biener

On Thu, 2 Nov 2023, 钟居哲 wrote:

> Hi, Richi.
> 
> >> Do we really need to have two modes for the optab though or could we
> >> simply require the target to support arbitrary offset modes (given it
> >> is implicitly constrained to ptr_mode for the base already)?  Or
> >> properly extend/truncate the offset at expansion time, say to ptr_mode
> >> or to the mode of sizetype.
> 
> For RVV, it's OK to set the stride type to ptr_mode/size_type by default.
> Is it OK if I define strided load/store as a single-mode optab with Pmode as
> the default mode of the stride operand?
> What about the scale and signed/unsigned operands?
> It seems the scale operand can be removed, since we can pass DR_STEP directly
> as the stride argument.
> But I think we can't remove the signed/unsigned operand, since for stride mode =
> SI mode the unsigned maximum stride = 2^31 whereas the signed maximum is 2^30.

On the GIMPLE side I think we want to have a sizetype operand and
indeed drop 'scale', the sizetype operand should be readily available.

> 
> 
> 
> juzhe.zh...@rivai.ai
>  
> From: Richard Biener
> Date: 2023-11-02 21:52
> To: Juzhe-Zhong
> CC: gcc-patches; jeffreyalaw; richard.sandiford; rdapp.gcc
> Subject: Re: [PATCH V2] OPTABS/IFN: Add 
> mask_len_strided_load/mask_len_strided_store OPTABS/IFN
> On Tue, 31 Oct 2023, Juzhe-Zhong wrote:
>  
> > As Richard suggested previously, we should support strided load/store in
> > the loop vectorizer instead of hacking the RISC-V backend.
> > 
> > This patch adds MASK_LEN_STRIDED LOAD/STORE OPTABS/IFN.
> > 
> > The GIMPLE IR is the same as mask_len_gather_load/mask_len_scatter_store,
> > but with the vector offset changed into a scalar stride.
>  
> I see that it follows gather/scatter.  I'll note that when introducing
> those we failed to add a specifier for TBAA and alignment info for the
> data access.  That means we have to use alias-set zero for the accesses
> (I see existing targets use UNSPECs with some not elaborate MEM anyway,
> but TBAA info might have been the "easy" and obvious property to 
> preserve).  For alignment we either have to assume unaligned or reject
> vectorization of accesses that do not have their original scalar accesses
> naturally aligned (aligned according to their mode).  We don't seem
> to check that though.
>  
> It might be fine to go forward with this since gather/scatter are broken
> in a similar way.
>  
> Do we really need to have two modes for the optab though or could we
> simply require the target to support arbitrary offset modes (given it
> is implicitly constrained to ptr_mode for the base already)?  Or
> properly extend/truncate the offset at expansion time, say to ptr_mode
> or to the mode of sizetype.
>  
> Thanks,
> Richard.
> > We don't have strided_load/strided_store and 
> > mask_strided_load/mask_strided_store since
> > it's unlikely RVV will have such optabs and we can't add patterns that 
> > we can't test.
> >
> > 
> > gcc/ChangeLog:
> > 
> > * doc/md.texi: Add mask_len_strided_load/mask_len_strided_store.
> > * internal-fn.cc (internal_load_fn_p): Ditto.
> > (internal_strided_fn_p): Ditto.
> > (internal_fn_len_index): Ditto.
> > (internal_fn_mask_index): Ditto.
> > (internal_fn_stored_value_index): Ditto.
> > (internal_strided_fn_supported_p): Ditto.
> > * internal-fn.def (MASK_LEN_STRIDED_LOAD): Ditto.
> > (MASK_LEN_STRIDED_STORE): Ditto.
> > * internal-fn.h (internal_strided_fn_p): Ditto.
> > (internal_strided_fn_supported_p): Ditto.
> > * optabs.def (OPTAB_CD): Ditto.
> > 
> > ---
> >  gcc/doc/md.texi | 51 +
> >  gcc/internal-fn.cc  | 44 ++
> >  gcc/internal-fn.def |  4 
> >  gcc/internal-fn.h   |  2 ++
> >  gcc/optabs.def  |  2 ++
> >  5 files changed, 103 insertions(+)
> > 
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > index fab2513105a..5bac713a0dd 100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -5094,6 +5094,32 @@ Bit @var{i} of the mask is set if element @var{i} of 
> > the result should
> >  be loaded from memory and clear if element @var{i} of the result should be 
> > undefined.
> >  Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
> >  
> > +@cindex @code{mask_len_strided_load@var{m}@var{n}} instruction pattern
> > +@item @samp{mask_len_strided_load@var{m}@var{n}}
> > +Load several separate memory locations into a destination vector of mode 
> > @var{m}.
> > +Operand 0 is a destination vector of mode @var{m}.
> > +Operand 1 is a scalar base address and operand 2 is a scalar stride of 
> > mode @var{n}.
> > +The instruction can be seen as a special case of 
> > @code{mask_len_gather_load@var{m}@var{n}}
> > +with an offset vector that is a @code{vec_series} with operand 1 as base 
> > and operand 2 as step.
> > +For each element index i:
> > +
> > +@itemize @bullet
> > +@item
> > +extend the stride to address width, using zero
> > +extension if operand 3 is 1 and sign extension if operand 3 is zero;
> > +@item
>

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Qing Zhao



> On Nov 2, 2023, at 9:54 AM, Richard Biener  wrote:
> 
> On Thu, Nov 2, 2023 at 2:50 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Nov 2, 2023, at 3:57 AM, Richard Biener  
>>> wrote:
>>> 
>>> On Wed, Nov 1, 2023 at 3:47 PM Qing Zhao  wrote:
 
 
 
>>>>> On Oct 31, 2023, at 6:14 PM, Joseph Myers wrote:
>>>>> 
>>>>> On Tue, 31 Oct 2023, Qing Zhao wrote:
>>>>> 
>>>>>> 2.3 A new semantic requirement in the user documentation of "counted_by"
>>>>>> 
>>>>>> For the following structure including a FAM with a counted_by attribute:
>>>>>> 
>>>>>> struct A
>>>>>> {
>>>>>> size_t size;
>>>>>> char buf[] __attribute__((counted_by(size)));
>>>>>> };
>>>>>> 
>>>>>> for any object with such type:
>>>>>> 
>>>>>> struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>>>>> 
>>>>>> The setting to the size field should be done before the first reference
>>>>>> to the FAM field.
>>>>>> 
>>>>>> Such requirement to the user will guarantee that the first reference to
>>>>>> the FAM knows the size of the FAM.
>>>>>> 
>>>>>> We need to add this additional requirement to the user document.
>>>>> 
>>>>> Make sure the manual is very specific about exactly when size is
>>>>> considered to be an accurate representation of the space available for buf
>>>>> (given that, after malloc or realloc, it's going to be temporarily
>>>>> inaccurate).  If the intent is that inaccurate size at such a time means
>>>>> undefined behavior, say so explicitly.
>>>> 
>>>> Yes, good point. We need to define this clearly in the beginning.
>>>> We need to explicitly say that
>>>> 
>>>> the size of the FAM is defined by the latest “counted_by” value. And it’s 
>>>> an undefined behavior when the size field is not defined when the FAM is 
>>>> referenced.
>>>> 
>>>> Is the above good enough?
>>>> 
>>>> 
>>>>> 
>>>>>> 2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE
>>>>>> 
>>>>>> In C FE:
>>>>>> 
>>>>>> for every reference to a FAM, for example, "obj->buf" in the small 
>>>>>> example,
>>>>>> check whether the corresponding FIELD_DECL has a "counted_by" attribute?
>>>>>> if YES, replace the reference to "obj->buf" with a call to
>>>>>> .ACCESS_WITH_SIZE (obj->buf, obj->size, -1);
>>>>> 
>>>>> This seems plausible - but you should also consider the case of static
>>>>> initializers - remember the GNU extension for statically allocated objects
>>>>> with flexible array members (unless you're not allowing it with
>>>>> counted_by).
>>>>> 
>>>>> static struct A x = { sizeof "hello", "hello" };
>>>>> static char *y = &x.buf;
>>>>> 
>>>>> I'd expect that to be valid - and unless you say such a usage is invalid,
>>>> 
>>>> At this moment, I think that this should be valid.
>>>> 
>>>> I.e., the following:
>>>> 
>>>> struct A
>>>> {
>>>> size_t size;
>>>> char buf[] __attribute__((counted_by(size)));
>>>> };
>>>> 
>>>> static struct A x = {sizeof "hello", "hello"};
>>>> 
>>>> Should be valid, and x.size represents the number of elements of x.buf.
>>>> Both x.size and x.buf are initialized statically.
>>>> 
>>>>> you should avoid the replacement in such a static initializer context when
>>>>> the FAM reference is to an object with a constant address (if
>>>>> .ACCESS_WITH_SIZE would not act as an lvalue whose address is a constant
>>>>> expression; if it works fine as a constant-address lvalue, then the
>>>>> replacement would be OK).
>>>> 
>>>> Then if such usage for the “counted_by” is valid, we need to replace the 
>>>> FAM
>>>> reference by a call to  .ACCESS_WITH_SIZE as well.
>>>> Otherwise the “counted_by” relationship will be lost to the Middle end.
>>>> 
>>>> With the current definition of .ACCESS_WITH_SIZE
>>>> 
>>>> PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)
>>>> 
>>>> Isn’t the PTR (return value of the call) a LVALUE?
>>> 
>>> You probably want to specify that when a pointer to the array is taken the
>>> pointer has to be to the first array element (or do we want to mangle the
>>> 'size' accordingly for the instrumentation?).
>> 
>> Yes. Will add this into the user documentation.
>> 
>>> You also want to specify that
>>> the 'size' associated with such pointer is assumed to be unchanging and
>>> after changing the size such pointer has to be re-obtained.
>> 
>> What do you mean by “re-obtained”?
> 
> do
> 
> p = &container.array[0];
> 
> after any adjustments to 'array' or 'len' again and base further accesses on
> the new 'p'.
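
In code, that re-obtaining rule could look roughly like the sketch below.
struct foo, fill and the unannotated buf member are assumptions for
illustration only; the counted_by attribute itself is left out so the
snippet builds with any C99 compiler, and whether this stricter form is
really required is exactly the open question in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct foo
{
  size_t count;
  char buf[];   /* would carry counted_by (count) under the proposal */
};

/* Copy up to MAX bytes from DATA (N bytes long) into a fresh object,
   growing COUNT one element at a time.  Returns NULL on allocation
   failure; the caller frees the result.  */
static struct foo *
fill (size_t max, const char *data, size_t n)
{
  assert (data != NULL);
  struct foo *f = malloc (sizeof *f + max);
  if (f == NULL)
    return NULL;
  f->count = 0;
  for (size_t i = 0; i < n && i < max; i++)
    {
      f->count++;             /* adjust the count first ...  */
      char *p = &f->buf[0];   /* ... then re-obtain the pointer ...  */
      p[i] = data[i];         /* ... and base the access on the new p.  */
    }
  return f;
}
```

The only difference from the single "p = f->buf" taken before the loop is
that p is derived again after every change to count.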


Then for the following example from Kees:

struct foo *f;
char *p;
int i;

f = alloc(maximum_possible);
f->count = 0;
p = f->buf;

for (i = 0; data_is_available() && i < maximum_possible; i++) {
    f->count++;
    p[i] = next_data_item();
}

Will not work?

We have to change it as:

struct foo *f;
char *p;
int i;

f = alloc(maximum_possible);
f->count = 0;
p = f->buf;

for (i = 0; data_is_available() && i < maximum_possible; i++) {

[PATCH] libstdc++: Improve static assert messages for monadic operations

2023-11-02 Thread Jonathan Wakely

Any objections or suggestions for better wording?

Tested x86_64-linux.

-- >8 --

The monadic operations for std::optional and std::expected make use of
internal helper traits __is_optional and __is_expected, which are not
very user-friendly when shown in diagnostics. Add messages to the
assertions explaining the problem more clearly.

libstdc++-v3/ChangeLog:

* include/std/expected (expected::and_then, expected::or_else):
Add string literals to static assertions.
* include/std/optional (optional::and_then, optional::or_else):
Likewise.
---
 libstdc++-v3/include/std/expected | 64 +++
 libstdc++-v3/include/std/optional | 24 +---
 2 files changed, 66 insertions(+), 22 deletions(-)

diff --git a/libstdc++-v3/include/std/expected 
b/libstdc++-v3/include/std/expected
index a796f0b6f27..a176d4c3a78 100644
--- a/libstdc++-v3/include/std/expected
+++ b/libstdc++-v3/include/std/expected
@@ -843,8 +843,12 @@ namespace __expected
and_then(_Fn&& __f) &
{
  using _Up = __expected::__result<_Fn, _Tp&>;
- static_assert(__expected::__is_expected<_Up>);
- static_assert(is_same_v<typename _Up::error_type, _Er>);
+ static_assert(__expected::__is_expected<_Up>,
+   "the function passed to std::expected::and_then "
+   "must return a std::expected");
+ static_assert(is_same_v<typename _Up::error_type, _Er>,
+   "the function passed to std::expected::and_then "
+   "must return a std::expected with the same error_type");
 
  if (has_value())
return std::__invoke(std::forward<_Fn>(__f), _M_val);
@@ -857,8 +861,12 @@ namespace __expected
and_then(_Fn&& __f) const &
{
  using _Up = __expected::__result<_Fn, const _Tp&>;
- static_assert(__expected::__is_expected<_Up>);
- static_assert(is_same_v<typename _Up::error_type, _Er>);
+ static_assert(__expected::__is_expected<_Up>,
+   "the function passed to std::expected::and_then "
+   "must return a std::expected");
+ static_assert(is_same_v<typename _Up::error_type, _Er>,
+   "the function passed to std::expected::and_then "
+   "must return a std::expected with the same error_type");
 
  if (has_value())
return std::__invoke(std::forward<_Fn>(__f), _M_val);
@@ -871,8 +879,12 @@ namespace __expected
and_then(_Fn&& __f) &&
{
  using _Up = __expected::__result<_Fn, _Tp&&>;
- static_assert(__expected::__is_expected<_Up>);
- static_assert(is_same_v<typename _Up::error_type, _Er>);
+ static_assert(__expected::__is_expected<_Up>,
+   "the function passed to std::expected::and_then "
+   "must return a std::expected");
+ static_assert(is_same_v<typename _Up::error_type, _Er>,
+   "the function passed to std::expected::and_then "
+   "must return a std::expected with the same error_type");
 
  if (has_value())
return std::__invoke(std::forward<_Fn>(__f), std::move(_M_val));
@@ -886,8 +898,12 @@ namespace __expected
and_then(_Fn&& __f) const &&
{
  using _Up = __expected::__result<_Fn, const _Tp&&>;
- static_assert(__expected::__is_expected<_Up>);
- static_assert(is_same_v<typename _Up::error_type, _Er>);
+ static_assert(__expected::__is_expected<_Up>,
+   "the function passed to std::expected::and_then "
+   "must return a std::expected");
+ static_assert(is_same_v<typename _Up::error_type, _Er>,
+   "the function passed to std::expected::and_then "
+   "must return a std::expected with the same error_type");
 
  if (has_value())
return std::__invoke(std::forward<_Fn>(__f), std::move(_M_val));
@@ -900,8 +916,12 @@ namespace __expected
or_else(_Fn&& __f) &
{
  using _Gr = __expected::__result<_Fn, _Er&>;
- static_assert(__expected::__is_expected<_Gr>);
- static_assert(is_same_v<typename _Gr::value_type, _Tp>);
+ static_assert(__expected::__is_expected<_Gr>,
+   "the function passed to std::expected::or_else "
+   "must return a std::expected");
+ static_assert(is_same_v<typename _Gr::value_type, _Tp>,
+   "the function passed to std::expected::or_else "
+   "must return a std::expected with the same value_type");
 
  if (has_value())
return _Gr(in_place, _M_val);
@@ -914,8 +934,12 @@ namespace __expected
or_else(_Fn&& __f) const &
{
  using _Gr = __expected::__result<_Fn, const _Er&>;
- static_assert(__expected::__is_expected<_Gr>);
- static_assert(is_same_v<typename _Gr::value_type, _Tp>);
+ static_assert(__expected::__is_expected<_Gr>,
+   "the function passed to std::expected::or_else "
+   "must return a std::expected");
+ static_assert(is_same_v<typename _Gr::value_type, _Tp>,
+   "the function passed to st

[committed] libstdc++: Add assertion to std::string_view::remove_suffix [PR112314]

2023-11-02 Thread Jonathan Wakely

Tested x86_64-linux. Pushed to trunk.

Backports seem reasonable.

-- >8 --

libstdc++-v3/ChangeLog:

PR libstdc++/112314
* include/std/string_view (string_view::remove_suffix): Add
debug assertion.
* 
testsuite/21_strings/basic_string_view/modifiers/remove_prefix/debug.cc:
New test.
* 
testsuite/21_strings/basic_string_view/modifiers/remove_suffix/debug.cc:
New test.
---
 libstdc++-v3/include/std/string_view   |  5 -
 .../modifiers/remove_prefix/debug.cc   | 14 ++
 .../modifiers/remove_suffix/debug.cc   | 14 ++
 3 files changed, 32 insertions(+), 1 deletion(-)
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string_view/modifiers/remove_prefix/debug.cc
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string_view/modifiers/remove_suffix/debug.cc

diff --git a/libstdc++-v3/include/std/string_view 
b/libstdc++-v3/include/std/string_view
index d103abda668..9deae25f712 100644
--- a/libstdc++-v3/include/std/string_view
+++ b/libstdc++-v3/include/std/string_view
@@ -301,7 +301,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   constexpr void
   remove_suffix(size_type __n) noexcept
-  { this->_M_len -= __n; }
+  {
+   __glibcxx_assert(this->_M_len >= __n);
+   this->_M_len -= __n;
+  }
 
   constexpr void
   swap(basic_string_view& __sv) noexcept
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string_view/modifiers/remove_prefix/debug.cc
 
b/libstdc++-v3/testsuite/21_strings/basic_string_view/modifiers/remove_prefix/debug.cc
new file mode 100644
index 000..37204583b71
--- /dev/null
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string_view/modifiers/remove_prefix/debug.cc
@@ -0,0 +1,14 @@
+// { dg-do compile { target c++17 } }
+
+#include <string_view>
+
+constexpr bool
+check_remove_prefix()
+{
+  std::string_view sv("123");
+  sv.remove_prefix(4);
+  // { dg-error "not a constant expression" "" { target *-*-* } 0 }
+  return true;
+}
+
+constexpr bool test = check_remove_prefix();
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string_view/modifiers/remove_suffix/debug.cc
 
b/libstdc++-v3/testsuite/21_strings/basic_string_view/modifiers/remove_suffix/debug.cc
new file mode 100644
index 000..a549e4c2471
--- /dev/null
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string_view/modifiers/remove_suffix/debug.cc
@@ -0,0 +1,14 @@
+// { dg-do compile { target c++17 } }
+
+#include <string_view>
+
+constexpr bool
+check_remove_suffix()
+{
+  std::string_view sv("123");
+  sv.remove_suffix(4);
+  // { dg-error "not a constant expression" "" { target *-*-* } 0 }
+  return true;
+}
+
+constexpr bool test = check_remove_suffix();
-- 
2.41.0

[PATCH] vect: allow using inbranch simdclones for masked loops

2023-11-02 Thread Andre Vieira (lists)


Hi,

In a previous patch I did most of the work for this, but forgot to 
change the check for number of arguments matching between call and 
simdclone.  This check should accept calls without a mask to be matched 
against simdclones with mask arguments.  I also added tests to verify 
this feature actually works.
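
Conceptually, the relaxed check relies on an unmasked call being equivalent to a masked call whose mask is all-true.  A scalar sketch of that idea follows; `foo_masked_clone', `call_unmasked_via_masked' and the fixed `VF' are hypothetical names for illustration only and do not model the vectorizer's actual data structures:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t VF = 4;  // Assumed vectorization factor for the sketch.

// The scalar function being cloned.
int
foo (int a)
{
  return a + 1;
}

// Hypothetical model of an inbranch (masked) clone: only lanes whose mask
// element is true are computed; inactive lanes are left as 0 here.
std::array<int, VF>
foo_masked_clone (const std::array<int, VF> &a,
                  const std::array<bool, VF> &mask)
{
  std::array<int, VF> r{};
  for (std::size_t i = 0; i < VF; i++)
    r[i] = mask[i] ? foo (a[i]) : 0;
  return r;
}

// A call that carries no mask can still be served by the masked clone
// by supplying an all-true mask.
std::array<int, VF>
call_unmasked_via_masked (const std::array<int, VF> &a)
{
  std::array<bool, VF> all_true;
  all_true.fill (true);
  return foo_masked_clone (a, all_true);
}
```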



For the simd-builtins tests I decided to remove the sin (double) 
simdclone, which would now be used because it is inbranch and this patch 
enables inbranch clones for calls that are not in a branch.  Given the 
nature of the test, removing it made more sense, but that's not a strong 
opinion; happy to change.


Bootstrapped and regression tested on aarch64-unknown-linux-gnu and 
x86_64-pc-linux-gnu.


OK for trunk?

PS: I'll be away for two weeks from tomorrow, it would be really nice if 
this can go in for gcc-14, otherwise the previous work I did for this 
won't have any actual visible effect :(



gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_simd_clone_call): Allow unmasked
calls to use masked simdclones.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-simd-clone-20.c: New file.
* gfortran.dg/simd-builtins-1.h: Adapt.
* gfortran.dg/simd-builtins-6.f90: Adapt.

diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-20.c 
b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-20.c
new file mode 100644
index 
..9f51a68f3a0c8851af2cd26bd8235c771b851d7d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-20.c
@@ -0,0 +1,87 @@
+/* { dg-require-effective-target vect_simd_clones } */
+/* { dg-additional-options "-fopenmp-simd --param vect-epilogues-nomask=0" } */
+/* { dg-additional-options "-mavx" { target avx_runtime } } */
+
+/* Test that simd inbranch clones work correctly.  */
+
+#ifndef TYPE
+#define TYPE int
+#endif
+
+/* A simple function that will be cloned.  */
+#pragma omp declare simd inbranch
+TYPE __attribute__((noinline))
+foo (TYPE a)
+{
+  return a + 1;
+}
+
+/* Check that "inbranch" clones are called correctly.  */
+
+void __attribute__((noipa))
+masked (TYPE * __restrict a, TYPE * __restrict b, int size)
+{
+  #pragma omp simd
+  for (int i = 0; i < size; i++)
+b[i] = foo(a[i]);
+}
+
+/* Check that "inbranch" works when there might be unrolling.  */
+
+void __attribute__((noipa))
+masked_fixed (TYPE * __restrict a, TYPE * __restrict b)
+{
+  #pragma omp simd
+  for (int i = 0; i < 128; i++)
+b[i] = foo(a[i]);
+}
+
+/* Validate the outputs.  */
+
+void
+check_masked (TYPE *b, int size)
+{
+  for (int i = 0; i < size; i++)
+if (b[i] != (TYPE)(i + 1))
+  {
+   __builtin_printf ("error at %d\n", i);
+   __builtin_exit (1);
+  }
+}
+
+int
+main ()
+{
+  TYPE a[1024];
+  TYPE b[1024];
+
+  for (int i = 0; i < 1024; i++)
+a[i] = i;
+
+  masked_fixed (a, b);
+  check_masked (b, 128);
+
+  /* Test various sizes to cover machines with different vectorization
+ factors.  */
+  for (int size = 8; size <= 1024; size *= 2)
+{
+  masked (a, b, size);
+  check_masked (b, size);
+}
+
+  /* Test sizes that might exercise the partial vector code-path.  */
+  for (int size = 8; size <= 1024; size *= 2)
+{
+  masked (a, b, size-4);
+  check_masked (b, size-4);
+}
+
+  return 0;
+}
+
+/* Ensure that the in-branch simd clones are used on targets that support them. 
 */
+/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" 
{ target { aarch64*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 4 "vect" 
{ target { x86_64*-*-* } } } } */
+
+/* The LTO test produces two dump files and we scan the wrong one.  */
+/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
diff --git a/gcc/testsuite/gfortran.dg/simd-builtins-1.h 
b/gcc/testsuite/gfortran.dg/simd-builtins-1.h
index 
88d555cf41ad065ea525a63d7c05d15d3e5b54ed..08b73514a67d5791d35203530d039741946e9dcc
 100644
--- a/gcc/testsuite/gfortran.dg/simd-builtins-1.h
+++ b/gcc/testsuite/gfortran.dg/simd-builtins-1.h
@@ -1,4 +1,3 @@
-!GCC$ builtin (sin) attributes simd (inbranch)
 !GCC$ builtin (sinf) attributes simd (notinbranch)
 !GCC$ builtin (cosf) attributes simd
 !GCC$ builtin (cosf) attributes simd (notinbranch)
diff --git a/gcc/testsuite/gfortran.dg/simd-builtins-6.f90 
b/gcc/testsuite/gfortran.dg/simd-builtins-6.f90
index 
60bcac78f3e0cc492930f3eb73cf97065312dc1c..2c68f9f1818a35674a0aef15793aa312a48199a8
 100644
--- a/gcc/testsuite/gfortran.dg/simd-builtins-6.f90
+++ b/gcc/testsuite/gfortran.dg/simd-builtins-6.f90
@@ -2,7 +2,6 @@
 ! { dg-additional-options "-nostdinc -Ofast -fdump-tree-optimized" }
 ! { dg-additional-options "-msse2 -mno-avx" { target i?86-*-linux* 
x86_64-*-linux* } }
 
-!GCC$ builtin (sin) attributes simd (inbranch)
 !GCC$ builtin (sinf) attributes simd (notinbranch)
 !GCC$ builtin (cosf) attributes simd
 !GCC$ builtin (cosf) attributes simd (notinbranch)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 
a9200767f67a4c9a8e106259be97a7bc7cd7e9dc.

Re: [PATCH v3] c++: implement P2564, consteval needs to propagate up [PR107687]

2023-11-02 Thread Marek Polacek

On Thu, Nov 02, 2023 at 11:28:43AM -0400, Marek Polacek wrote:
> On Sat, Oct 14, 2023 at 12:56:11AM -0400, Jason Merrill wrote:
> > As discussed above, we probably don't want to exclude implicitly-declared
> > special member functions.
> 
> Done (but removing the DECL_ARTIFICIAL check).

s/but/by/

[PATCH v3] c++: implement P2564, consteval needs to propagate up [PR107687]

2023-11-02 Thread Marek Polacek

On Sat, Oct 14, 2023 at 12:56:11AM -0400, Jason Merrill wrote:
> On 10/10/23 13:20, Marek Polacek wrote:
> > Thanks for looking into this.  It's kept me occupied for quite a while.
> 
> Thanks, nice work.
> 
> > On Tue, Aug 29, 2023 at 03:26:46PM -0400, Jason Merrill wrote:
> > > On 8/23/23 15:49, Marek Polacek wrote:
> > > > +struct A {
> > > > +  int x;
> > > > +  int y = id(x);
> > > > +};
> > > > +
> > > > +template
> > > > +constexpr int k(int) {  // k is not an immediate function 
> > > > because A(42) is a
> > > > +  return A(42).y;   // constant expression and thus not 
> > > > immediate-escalating
> > > > +}
> > > 
> > > Needs use(s) of k to test the comment.
> > 
> > True, and that revealed what I think is a bug in the standard.
> > In the test I'm saying:
> > 
> > // ??? [expr.const]#example-9 says:
> > //   k is not an immediate function because A(42) is a
> > //   constant expression and thus not immediate-escalating
> > // But I think the call to id(x) is *not* a constant expression and thus
> > // it is an immediate-escalating expression.  Therefore k *is*
> > // an immediate function.  So we get the error below.  clang++ agrees.
> > id(x) is not a constant expression because x isn't constant.
> 
> Not when considering id(x) by itself, but in the evaluation of A(42), the
> member x has just been initialized to constant 42.  And A(42) is
> constant-evaluated because "An aggregate initialization is an immediate
> invocation if it evaluates a default member initializer that has a
> subexpression that is an immediate-escalating expression."
> 
> I assume clang doesn't handle this passage properly yet because it was added
> during core review of the paper, for parity between aggregate initialization
> and constructor escalation.
> 
> This can be a follow-up patch.

I see.  So the fix will be to, for the aggregate initialization case, pass
the whole A(42).y thing to cxx_constant_eval, not just id(x).
 
> > So.  I think we want to refrain from instantiating things early
> > given how many problems that caused.  On the other hand, stashing
> > all the immediate-escalating decls into immediate_escalating_decls
> > and walking their bodies isn't going to be cheap.  I've checked
> > how big the vectors can get, but our testsuite isn't the best litmus
> > test because it's mostly smallish testcases without many #includes.
> > The worst offender is uninit-pr105562.C with
> > 
> > (gdb) p immediate_escalating_decls->length()
> > $2 = 2204
> > (gdb) p deferred_escalating_exprs->length()
> > $3 = 501
> > 
> > Compiling uninit-pr105562.C with g++13 and g++14 with this patch:
> > real 7.51 real 7.67
> > user 7.32 user 7.49
> > sys 0.15  sys 0.14
> > 
> > I've made sure not to walk the same bodies twice.  But there's room
> > for further optimization; I suppose we could escalate instantiated
> > functions right away rather than putting them into
> > immediate_escalating_decls and waiting till later.
> 
> Absolutely; if we see a call to a known consteval function, we should
> escalate...immediately.  As the patch seems to do already?

Right, I'm not sure what I was thinking.
 
> > I'm not certain
> > if I can just look at DECL_TEMPLATE_INSTANTIATED.
> 
> I'm not sure what you mean, but a constexpr function being instantiated
> doesn't necessarily imply that everything it calls has been instantiated, so
> we might not know yet if it needs to escalate.

I was pondering exactly that but you are of course correct here.
 
> > I suppose some
> > functions cannot possibly be promoted because they don't contain
> > any CALL_EXPRs.  So we may be able to rule them out while doing
> > cp_fold_r early.
> 
> Yes.  Or, the only immediate-escalating functions referenced have already
> been checked.
> 
> We can also do some escalation during constexpr evaluation: all the
> functions involved need to be instantiated for the evaluation, and if we
> encounter an immediate-escalating expression while evaluating a call to an
> immediate-escalating function, we can promote it then.  Though we can't
> necessarily mark it as not needing promotion, as there might be i-e exprs in
> branches that the particular evaluation doesn't take.

I've tried but I didn't get anywhere.  The patch was basically

--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -2983,7 +2983,13 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
   } fb (new_call.bindings);

   if (*non_constant_p)
-return t;
+{
+  if (cxx_dialect >= cxx20
+ && ctx->manifestly_const_eval == mce_false
+ && DECL_IMMEDIATE_FUNCTION_P (fun))
+   maybe_promote_function_to_consteval (current_function_decl);
+  return t;
+}

   /* We can't defer instantiating the function any longer.  */
   if (!DECL_INITIAL (fun)

but since I have to check mce_false, it didn't do anything useful
in practice (that is, it wouldn't escalate anything in my tests).

> > If a function is t

Re: [PATCH V2] RISC-V: Fix redundant vsetvl in fixed-vlmax vectorized codes[PR112326]

2023-11-02 Thread Robin Dapp

Hi Juzhe,

in principle this LGTM.  It could use some function comments, though ;)
> +imm_avl_p (machine_mode mode)
>  {
>poly_uint64 nuints = GET_MODE_NUNITS (mode);
>  
>return nuints.is_constant ()
> -/* The vsetivli can only hold register 0~31.  */
> -? (IN_RANGE (nuints.to_constant (), 0, 31))
> -/* Only allowed in VLS-VLMAX mode.  */
> -: false;
> +/* The vsetivli can only hold register 0~31.  */
> +? (IN_RANGE (nuints.to_constant (), 0, 31))
> +/* Only allowed in VLS-VLMAX mode.  */
> +: false;
>  }

Please replace nuints (or untis) with nunits here everywhere.

> +;; The index of operand[] represents the machine mode of the instruction.
> +(define_attr "mode_idx" ""
> + (cond [(eq_attr "type" 
> "vlde,vste,vldm,vstm,vlds,vsts,vldux,vldox,vldff,vldr,vstr,\
> + 
> vlsegde,vlsegds,vlsegdux,vlsegdox,vlsegdff,vialu,vext,vicalu,\
> + 
> vshift,vicmp,viminmax,vimul,vidiv,vimuladd,vimerge,vimov,\
> + 
> vsalu,vaalu,vsmul,vsshift,vfalu,vfmul,vfdiv,vfmuladd,vfsqrt,vfrecp,\
> + vfcmp,vfminmax,vfsgnj,vfclass,vfmerge,vfmov,\
> + 
> vfcvtitof,vfncvtitof,vfncvtftoi,vfncvtftof,vmalu,vmiota,vmidx,\
> + 
> vimovxv,vfmovfv,vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,\
> + vgather,vcompress,vmov")
> +(const_int 0)
> +
> +(eq_attr "type" "vimovvx,vfmovvf")
> +(const_int 1)
> +
> +(eq_attr "type" "vssegte,vnshift,vmpop,vmffs")
> +(const_int 2)   

I'm not that fond of the growing number of necessary indices even though I
realize that it's the most painless way for now.  Why is vnshift "2" and
not "0", though?

"4" for vnclip also looks dubious.  I didn't go through all of them.

Regards
 Robin

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Siddhesh Poyarekar


On 2023-11-02 10:12, Martin Uecker wrote:

This shouldn't be necessary. The object-size pass
can track pointer arithmetic if it comes after
inserting the .ACCESS_WITH_SIZE.

https://godbolt.org/z/fvc3aoPfd


The problem is dependency tracking through the pointer arithmetic, which 
Jakub suggested to work around by passing a reference to the size in 
.ACCESS_WITH_SIZE to avoid DCE/reordering.


Thanks,
Sid

[PATCH V3 2/6] aarch64: Add support for aarch64-sys-regs.def

2023-11-02 Thread Victor Do Nascimento

This patch defines the structure of a new .def file used for
representing the aarch64 system registers, what information it should
hold and the basic framework in GCC to process this file.

Entries in the aarch64-sys-regs.def file should be as follows:

  SYSREG (NAME, CPENC (sn,op1,cn,cm,op2), FLAG1 | ... | FLAGn, ARCH)

Where the arguments to SYSREG correspond to:
  - NAME:  The system register name, as used in the assembly language.
  - CPENC: The system register encoding, mapping to:

   s<sn>_<op1>_c<cn>_c<cm>_<op2>

  - FLAG: The entries in the FLAGS field are bitwise-OR'd together to
  encode extra information required to ensure proper use of
  the system register.  For example, a read-only system
  register will have the flag F_REG_READ, while write-only
  registers will be labeled F_REG_WRITE.  Such flags are
  tested against at compile-time.
  - ARCH: The architectural features the system register is associated
  with.  This is encoded via one of three possible macros:
  1. When a system register is universally implemented, we say
  it has no feature requirements, so we tag it with the
  AARCH64_NO_FEATURES macro.
  2. When a register is only implemented for a single
  architectural extension EXT, the AARCH64_FEATURE (EXT), is
  used.
  3. When a given system register is made available by any of N
  possible architectural extensions, the AARCH64_FEATURES(N, ...)
  macro is used to combine them accordingly.

In order to enable proper interpretation of the SYSREG entries by the
compiler, flags defining system register behavior such as `F_REG_READ'
and `F_REG_WRITE' are also defined here, so they can later be used for
the validation of system register properties.
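
To make the mechanism concrete, here is a minimal sketch of the X-macro pattern described above.  It rests on several assumptions: `DEMO_SYSREG_LIST' stands in for the #include of the .def file, the `DEMO_FL_*' feature bits and the `demo_ro_el1' register are invented for illustration, and the stringifying `CPENC' definition is only one plausible way to obtain the stringified encoding; none of this is claimed to be the code in the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Property flags, with the values used in this patch.
#define F_REG_READ  (1 << 2)
#define F_REG_WRITE (1 << 3)
#define F_ARCHEXT   (1 << 4)

// Hypothetical feature bits; GCC's real AARCH64_FL_* values differ.
#define DEMO_FL_V8_1A (1u << 0)
#define AARCH64_FEATURE(FEAT) DEMO_FL_##FEAT
#define AARCH64_NO_FEATURES   0

// Assumed definition: stringify the five fields into one literal.
#define CPENC(SN, OP1, CN, CM, OP2) \
  "s" #SN "_" #OP1 "_c" #CN "_c" #CM "_" #OP2

struct demo_sysreg
{
  const char *name;
  const char *encoding;
  unsigned properties;
  unsigned arch_reqs;
};

// Stand-in for the contents of the .def file (last entry is invented).
#define DEMO_SYSREG_LIST \
  SYSREG ("actlr_el1",   CPENC (3,0,1,0,1), 0,          AARCH64_NO_FEATURES) \
  SYSREG ("afsr0_el12",  CPENC (3,5,5,1,0), F_ARCHEXT,  AARCH64_FEATURE (V8_1A)) \
  SYSREG ("demo_ro_el1", CPENC (3,0,0,0,0), F_REG_READ, AARCH64_NO_FEATURES)

// Define SYSREG so that each entry expands to one array initializer.
#define SYSREG(NAME, ENC, FLAGS, ARCH) { NAME, ENC, FLAGS, ARCH },
const demo_sysreg demo_sysregs[] = { DEMO_SYSREG_LIST };
#undef SYSREG

const std::size_t demo_nsysreg = sizeof demo_sysregs / sizeof demo_sysregs[0];
```

The point of the pattern is that the consumer decides what SYSREG means, so the same entry list can feed a table here and something entirely different in the assembler.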

Finally, any architectural feature flags from Binutils missing from GCC
have appropriate aliases defined here so as to ensure
cross-compatibility of SYSREG entries across the toolchain.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (sysreg_t): New.
(sysreg_structs): Likewise.
(nsysreg): Likewise.
(AARCH64_FEATURE): Likewise.
(AARCH64_FEATURES): Likewise.
(AARCH64_NO_FEATURES): Likewise.
* config/aarch64/aarch64.h (AARCH64_ISA_V8A): Add missing
ISA flag.
(AARCH64_ISA_V8_1A): Likewise.
(AARCH64_ISA_V8_7A): Likewise.
(AARCH64_ISA_V8_8A): Likewise.
(AARCH64_NO_FEATURES): Likewise.
(AARCH64_FL_RAS): New ISA flag alias.
(AARCH64_FL_LOR): Likewise.
(AARCH64_FL_PAN): Likewise.
(AARCH64_FL_AMU): Likewise.
(AARCH64_FL_SCXTNUM): Likewise.
(AARCH64_FL_ID_PFR2): Likewise.
(F_DEPRECATED): New.
(F_REG_READ): Likewise.
(F_REG_WRITE): Likewise.
(F_ARCHEXT): Likewise.
(F_REG_ALIAS): Likewise.
---
 gcc/config/aarch64/aarch64.cc | 53 +++
 gcc/config/aarch64/aarch64.h  | 22 +++
 2 files changed, 75 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 5fd7063663c..a4a9e2e51ea 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -2806,6 +2806,59 @@ static const struct processor all_cores[] =
feature_deps::V8A ().enable, &generic_tunings},
   {NULL, aarch64_none, aarch64_none, aarch64_no_arch, 0, NULL}
 };
+/* Internal representation of system registers.  */
+typedef struct {
+  const char *name;
+  /* Stringified sysreg encoding values, represented as
+ s<sn>_<op1>_c<cn>_c<cm>_<op2>.  */
+  const char *encoding;
+  /* Flags affecting sysreg usage, such as read/write-only.  */
+  unsigned properties;
+  /* Architectural features implied by sysreg.  */
+  aarch64_feature_flags arch_reqs;
+} sysreg_t;
+
+/* An aarch64_feature_set initializer for a single feature,
+   AARCH64_FEATURE_<FEAT>.  */
+#define AARCH64_FEATURE(FEAT) AARCH64_FL_##FEAT
+
+/* Used by AARCH64_FEATURES.  */
+#define AARCH64_OR_FEATURES_1(X, F1) \
+  AARCH64_FEATURE (F1)
+#define AARCH64_OR_FEATURES_2(X, F1, F2) \
+  (AARCH64_FEATURE (F1) | AARCH64_OR_FEATURES_1 (X, F2))
+#define AARCH64_OR_FEATURES_3(X, F1, ...) \
+  (AARCH64_FEATURE (F1) | AARCH64_OR_FEATURES_2 (X, __VA_ARGS__))
+
+/* An aarch64_feature_set initializer for the N features listed in "...".  */
+#define AARCH64_FEATURES(N, ...) \
+  AARCH64_OR_FEATURES_##N (0, __VA_ARGS__)
+
+#define AARCH64_NO_FEATURES   0
+
+/* Flags associated with the properties of system registers.  It mainly serves
+   to mark particular registers as read or write only.  */
+#define F_DEPRECATED  (1 << 1)
+#define F_REG_READ(1 << 2)
+#define F_REG_WRITE   (1 << 3)
+#define F_ARCHEXT (1 << 4)
+/* Flag indicating register name is alias for another system register.  */
+#define F_REG_ALIAS   (1 << 5)
+
+/* Database of system registers, their encodings and architectural
+   requirements.  */
+const sysreg_

[PATCH V3 1/6] aarch64: Sync system register information with Binutils

2023-11-02 Thread Victor Do Nascimento

This patch adds the `aarch64-sys-regs.def' file, originally written
for Binutils, to GCC. In so doing, it provides GCC with the necessary
information for teaching the compiler about system registers known to
the assembler and how these can be used.

By aligning the representation of data common to different parts of
the toolchain we can greatly reduce the duplication of work,
facilitating the maintenance of the aarch64 back-end across different
parts of the toolchain.  By keeping both copies of the file in sync,
any `SYSREG (...)' that is added in one project is automatically added
to its counterpart.  This being the case, no change should be made in
the GCC copy of the file.  Any modifications should first be made in
Binutils and the resulting file copied over to GCC.

GCC does not implement the full range of ISA flags present in
Binutils.  Where this is the case, aliases must be added to aarch64.h
with the unknown architectural extension being mapped to its
associated base architecture, such that any flag present in Binutils
and used in system register definitions is understood in GCC.  Again,
this is done such that flags can be used interchangeably between
projects making use of the aarch64-sys-regs.def file.  This is done
in the next patch in the series.

`.arch' directives missing from the emitted assembly files as a
consequence of this aliasing are accounted for by the compiler using
the S encoding of system registers when
issuing mrs/msr instructions.  This design choice ensures the
assembler will accept anything that was deemed acceptable by the
compiler.

gcc/ChangeLog:

* config/aarch64/aarch64-sys-regs.def: New.
---
 gcc/config/aarch64/aarch64-sys-regs.def | 1064 +++
 1 file changed, 1064 insertions(+)
 create mode 100644 gcc/config/aarch64/aarch64-sys-regs.def

diff --git a/gcc/config/aarch64/aarch64-sys-regs.def 
b/gcc/config/aarch64/aarch64-sys-regs.def
new file mode 100644
index 000..d24a2455503
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-sys-regs.def
@@ -0,0 +1,1064 @@
+/* aarch64-system-regs.def -- AArch64 opcode support.
+   Copyright (C) 2009-2023 Free Software Foundation, Inc.
+   Contributed by ARM Ltd.
+
+   This file is part of the GNU opcodes library.
+
+   This library is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   It is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; see the file COPYING3.  If not,
+   see .  */
+
+/* Array of system registers and their associated arch features.
+
+   This file is also used by GCC.  Where necessary, any updates should
+   be made in Binutils and the updated file copied across to GCC, such
+   that the two projects are kept in sync at all times.
+
+   Before using #include to read this file, define a macro:
+
+ SYSREG (name, encoding, flags, features)
+
+  The NAME is the system register name, as recognized by the
+  assembler.  ENCODING provides the necessary information for the binary
+  encoding of the system register.  The FLAGS field is a bitmask of
+  relevant behavior information pertaining to the particular register.
+  For example: is it read/write-only? does it alias another register?
+  The FEATURES field maps onto ISA flags and specifies the architectural
+  feature requirements of the system register.  */
+
+  SYSREG ("accdata_el1",   CPENC (3,0,13,0,5), 0,  
AARCH64_NO_FEATURES)
+  SYSREG ("actlr_el1", CPENC (3,0,1,0,1),  0,  
AARCH64_NO_FEATURES)
+  SYSREG ("actlr_el2", CPENC (3,4,1,0,1),  0,  
AARCH64_NO_FEATURES)
+  SYSREG ("actlr_el3", CPENC (3,6,1,0,1),  0,  
AARCH64_NO_FEATURES)
+  SYSREG ("afsr0_el1", CPENC (3,0,5,1,0),  0,  
AARCH64_NO_FEATURES)
+  SYSREG ("afsr0_el12",CPENC (3,5,5,1,0),  F_ARCHEXT,  
AARCH64_FEATURE (V8_1A))
+  SYSREG ("afsr0_el2", CPENC (3,4,5,1,0),  0,  
AARCH64_NO_FEATURES)
+  SYSREG ("afsr0_el3", CPENC (3,6,5,1,0),  0,  
AARCH64_NO_FEATURES)
+  SYSREG ("afsr1_el1", CPENC (3,0,5,1,1),  0,  
AARCH64_NO_FEATURES)
+  SYSREG ("afsr1_el12",CPENC (3,5,5,1,1),  F_ARCHEXT,  
AARCH64_FEATURE (V8_1A))
+  SYSREG ("afsr1_el2", CPENC (3,4,5,1,1),  0,  
AARCH64_NO_FEATURES)
+  SYSREG ("afsr1_el3", CPENC (3,6,5,1,1),  0,

[PATCH V3 3/6] aarch64: Implement system register validation tools

2023-11-02 Thread Victor Do Nascimento

Given the implementation of a mechanism of encoding system registers
into GCC, this patch provides the mechanism of validating their use by
the compiler.  In particular, this involves:

  1. Ensuring a supplied string corresponds to a known system
 register name.  System registers can be accessed either via their
 name (e.g. `SPSR_EL1') or their encoding (e.g. `S3_0_C4_C0_0').
 Register names are validated using a hash map, mapping known
 system register names to their corresponding `sysreg_t' structs,
 which is populated from the `aarch64-sys-regs.def' file.
 Register name validation is done via `aarch64_lookup_sysreg_map', while
 the encoding naming convention is validated via a parser
 implemented in this patch - `is_implem_def_reg'.
  2. Once a given register name is deemed to be valid, it is checked
 against a further 2 criteria:
   a. Is the referenced register implemented in the target
  architecture?  This is achieved by comparing the ARCH field
  in the relevant SYSREG entry from `aarch64-sys-regs.def'
  against `aarch64_feature_flags' flags set at compile-time.
   b. Is the register being used correctly?  Check the requested
  operation against the FLAGS specified in SYSREG.
  This prevents operations like writing to a read-only system
  register.
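
The two validation steps above can be sketched roughly as follows.  This is an illustration under stated assumptions, not the code in this patch: the `demo_*' names and table entries are hypothetical, std::unordered_map stands in for GCC's hash_map, the last two field ranges of the encoded form (C[0-15] and [0-7]) are assumed, and a register listing several features is assumed to be available when any one of them is enabled:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>

#define F_REG_READ  (1 << 2)   /* Read-only register.  */
#define F_REG_WRITE (1 << 3)   /* Write-only register.  */

struct demo_sysreg
{
  const char *encoding;
  unsigned properties;
  unsigned arch_reqs;   // Hypothetical feature bitmask.
};

// Hypothetical name -> metadata table.
static const std::unordered_map<std::string, demo_sysreg> demo_map = {
  {"demo_rw_el1",  {"s3_0_c1_c0_1", 0, 0}},
  {"demo_ro_el1",  {"s3_0_c0_c0_0", F_REG_READ, 0}},
  {"demo_ext_el1", {"s3_0_c2_c0_0", 0, 1u << 0}},
};

// Consume a decimal number in [0, max] at s[pos]; advance pos.
static bool
parse_field (const std::string &s, std::size_t &pos, unsigned max)
{
  if (pos >= s.size () || !std::isdigit ((unsigned char) s[pos]))
    return false;
  unsigned v = 0;
  while (pos < s.size () && std::isdigit ((unsigned char) s[pos]))
    {
      v = v * 10 + (s[pos++] - '0');
      if (v > max)
        return false;
    }
  return true;
}

// Consume the character c, ignoring case.
static bool
expect (const std::string &s, std::size_t &pos, char c)
{
  if (pos < s.size () && std::tolower ((unsigned char) s[pos]) == c)
    {
      pos++;
      return true;
    }
  return false;
}

// Step 1, encoded form: S[0-3]_[0-7]_C[0-15]_C[0-15]_[0-7].
bool
demo_is_encoded_name (const std::string &s)
{
  std::size_t pos = 0;
  return expect (s, pos, 's') && parse_field (s, pos, 3)
         && expect (s, pos, '_') && parse_field (s, pos, 7)
         && expect (s, pos, '_') && expect (s, pos, 'c')
         && parse_field (s, pos, 15)
         && expect (s, pos, '_') && expect (s, pos, 'c')
         && parse_field (s, pos, 15)
         && expect (s, pos, '_') && parse_field (s, pos, 7)
         && pos == s.size ();
}

// Steps 1 (named form), 2a and 2b: return the encoding to emit, or
// nullptr when the access must be rejected.
const char *
demo_retrieve_sysreg (const std::string &name, bool write_p,
                      unsigned enabled_features)
{
  auto it = demo_map.find (name);
  if (it == demo_map.end ())
    return nullptr;                          // Unknown register name.
  const demo_sysreg &reg = it->second;
  if (reg.arch_reqs != 0 && (reg.arch_reqs & enabled_features) == 0)
    return nullptr;                          // 2a: not implemented.
  if (write_p && (reg.properties & F_REG_READ))
    return nullptr;                          // 2b: write to read-only.
  if (!write_p && (reg.properties & F_REG_WRITE))
    return nullptr;                          // 2b: read of write-only.
  return reg.encoding;
}
```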

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_valid_sysreg_name_p): New.
(aarch64_retrieve_sysreg): Likewise.
* config/aarch64/aarch64.cc (is_implem_def_reg): Likewise.
(aarch64_valid_sysreg_name_p): Likewise.
(aarch64_retrieve_sysreg): Likewise.
(aarch64_register_sysreg): Likewise.
(aarch64_init_sysregs): Likewise.
(aarch64_lookup_sysreg_map): Likewise.
* config/aarch64/predicates.md (aarch64_sysreg_string): New.
---
 gcc/config/aarch64/aarch64-protos.h |   2 +
 gcc/config/aarch64/aarch64.cc   | 147 
 gcc/config/aarch64/predicates.md|   4 +
 3 files changed, 153 insertions(+)

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 60a55f4bc19..5d6a1e75700 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -830,6 +830,8 @@ bool aarch64_simd_shift_imm_p (rtx, machine_mode, bool);
 bool aarch64_sve_ptrue_svpattern_p (rtx, struct simd_immediate_info *);
 bool aarch64_simd_valid_immediate (rtx, struct simd_immediate_info *,
enum simd_immediate_check w = AARCH64_CHECK_MOV);
+bool aarch64_valid_sysreg_name_p (const char *);
+const char *aarch64_retrieve_sysreg (const char *, bool);
 rtx aarch64_check_zero_based_sve_index_immediate (rtx);
 bool aarch64_sve_index_immediate_p (rtx);
 bool aarch64_sve_arith_immediate_p (machine_mode, rtx, bool);
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index a4a9e2e51ea..eaeab0be436 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -85,6 +85,7 @@
 #include "config/arm/aarch-common.h"
 #include "config/arm/aarch-common-protos.h"
 #include "ssa.h"
+#include "hash-map.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -2860,6 +2861,51 @@ const sysreg_t sysreg_structs[] =
 
 const unsigned nsysreg = ARRAY_SIZE (sysreg_structs);
 
+using sysreg_map_t = hash_map;
+static sysreg_map_t *sysreg_map = nullptr;
+
+/* Map system register names to their hardware metadata: encoding,
+   feature flags and architectural feature requirements, all of which
+   are encoded in a sysreg_t struct.  */
+void
+aarch64_register_sysreg (const char *name, const sysreg_t *metadata)
+{
+  bool dup = sysreg_map->put (name, metadata);
+  gcc_checking_assert (!dup);
+}
+
+/* Lazily initialize hash table for system register validation,
+   checking the validity of supplied register name and returning
+   register's associated metadata.  */
+static void
+aarch64_init_sysregs (void)
+{
+  gcc_assert (!sysreg_map);
+  sysreg_map = new sysreg_map_t;
+
+  for (unsigned i = 0; i < nsysreg; i++)
+{
+  const sysreg_t *reg = sysreg_structs + i;
+  aarch64_register_sysreg (reg->name, reg);
+}
+}
+
+/* No direct access to the sysreg hash-map should be made.  Doing so
+   risks trying to access an uninitialized hash-map and dereferencing the
+   returned double pointer without due care risks dereferencing a
+   null-pointer.  */
+const sysreg_t *
+aarch64_lookup_sysreg_map (const char *regname)
+{
+  if (!sysreg_map)
+aarch64_init_sysregs ();
+
+  const sysreg_t **sysreg_entry = sysreg_map->get (regname);
+  if (sysreg_entry != NULL)
+return *sysreg_entry;
+  return NULL;
+}
+
 /* The current tuning set.  */
 struct tune_params aarch64_tune_params = generic_tunings;
 
@@ -28116,6 +28162,107 @@ aarch64_pars_overlap_p (rtx par1, rtx par2)
   return false;
 }
 
+/* Parse an implementation-defined system register name of
+   the form S[0-3]_[0-7]_C[0-15]_C[0-

[PATCH V3 0/6] aarch64: Add support for __arm_rsr and __arm_wsr ACLE function family

2023-11-02 Thread Victor Do Nascimento

Implement changes resulting from upstream discussion about the
implementation as presented in V2 of this patch:

https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633458.html

Note that patch 4/7 of the previous iteration of this series (Add
basic target_print_operand support for CONST_STRING) was resubmitted
and upstreamed separately due to its use in other work which had since
been submitted.

---

This patch series adds support for reading and writing to and from
system registers via the relevant ACLE-defined builtins [1].

The patch series makes a series of additions to the aarch64-specific
areas of the compiler to make this possible.

Firstly, a mechanism for defining system registers is established via a
new .def file and the new SYSREG macro.  This macro is the same as is
used in Binutils and system register entries are compatible with
either code-base.

Given the information contained in this system register definition
file, a compile-time validation mechanism is implemented, such that any
system register name passed as a string literal argument to these
builtins can be checked against known system registers and its use
for a given target architecture validated.

Finally, patterns for each of these builtins are added to the back-end
such that, if all validation criteria are met, the correct assembly is
emitted.

Thus, the following example of system register access is now valid for
GCC:

long long old = __arm_rsr("trcseqstr");
__arm_wsr("trcseqstr", new);

Testing:
 - Bootstrap/regtest on aarch64-linux-gnu done.

[1] https://arm-software.github.io/acle/main/acle.html

Victor Do Nascimento (6):
  aarch64: Sync system register information with Binutils
  aarch64: Add support for aarch64-sys-regs.def
  aarch64: Implement system register validation tools
  aarch64: Implement system register r/w arm ACLE intrinsic functions
  aarch64: Add front-end argument type checking for target builtins
  aarch64: Add system register duplication check selftest

 gcc/config/aarch64/aarch64-builtins.cc|  222 
 gcc/config/aarch64/aarch64-c.cc   |4 +-
 gcc/config/aarch64/aarch64-protos.h   |6 +
 gcc/config/aarch64/aarch64-sys-regs.def   | 1064 +
 gcc/config/aarch64/aarch64.cc |  244 
 gcc/config/aarch64/aarch64.h  |   22 +
 gcc/config/aarch64/aarch64.md |   18 +
 gcc/config/aarch64/arm_acle.h |   30 +
 gcc/config/aarch64/predicates.md  |4 +
 gcc/testsuite/gcc.dg/pch/rwsr-pch.c   |7 +
 gcc/testsuite/gcc.dg/pch/rwsr-pch.hs  |   10 +
 .../gcc.target/aarch64/acle/rwsr-1.c  |   29 +
 .../gcc.target/aarch64/acle/rwsr-2.c  |   25 +
 .../gcc.target/aarch64/acle/rwsr-3.c  |   18 +
 gcc/testsuite/gcc.target/aarch64/acle/rwsr.c  |  144 +++
 15 files changed, 1845 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-sys-regs.def
 create mode 100644 gcc/testsuite/gcc.dg/pch/rwsr-pch.c
 create mode 100644 gcc/testsuite/gcc.dg/pch/rwsr-pch.hs
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/rwsr-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/rwsr-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/rwsr-3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/rwsr.c

-- 
2.41.0

[PATCH V3 6/6] aarch64: Add system register duplication check selftest

2023-11-02 Thread Victor Do Nascimento

Add a build-time test to check whether system register data, as
imported from `aarch64-sys-regs.def', has any duplicate entries.

Duplicate entries are defined as any two SYSREG entries in the .def
file which share the same encoding values (as specified by their `CPENC'
fields) and where the relationship between the two does not fit into
one of the following categories:

* Simple aliasing: In some cases, it is observed that one
register name serves as an alias to another.  One example of
this is where TRCEXTINSELR aliases TRCEXTINSELR0.
* Expressing intent: It is possible that when a given register
serves two distinct functions depending on how it is used, it
is given two distinct names whose use should match the context
under which it is being used.  Example:  Debug Data Transfer
Register. When used to receive data, it should be accessed as
DBGDTRRX_EL0 while when transmitting data it should be
accessed via DBGDTRTX_EL0.
* Register deprecation: Some register names have been
deprecated and should no longer be used, but backwards-
compatibility requires that such names continue to be
recognized, as is the case for the SPSR_EL1 register, whose
access via the SPSR_SVC name is now deprecated.
* Same encoding different target: Some encodings are given
different meaning depending on the target architecture and, as
such, are given different names in each of these contexts.
We see an example of this for CPENC(3,4,2,0,0), which
corresponds to TTBR0_EL2 for Armv8-A targets and VSCTLR_EL2
in Armv8-R targets.

A consequence of these observations is that `CPENC' duplication is
acceptable iff at least one of the `properties' or `arch_reqs' fields
of the `sysreg_t' structs associated with the two registers in
question differ and it's this condition that is checked by the new
`aarch64_test_sysreg_encoding_clashes' function.
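
For readers without the GCC tree at hand, the rule above can be sketched
in standalone C++.  This is only an illustration under stated
assumptions: `SysReg' is a hypothetical stand-in for `sysreg_t' that
keeps just the fields the check reads, std::unordered_map replaces GCC's
hash_map, and the function returns a bool rather than using ASSERT_TRUE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for GCC's sysreg_t; only the fields the check reads.
struct SysReg {
  std::string name;
  unsigned encoding;    // value that CPENC would produce
  unsigned properties;  // flag bits (illustrative values only)
  unsigned arch_reqs;   // required architecture features (illustrative)
};

// Returns true when every pair of entries sharing an encoding differs in
// at least one of `properties' or `arch_reqs'.
bool encodings_are_consistent(const std::vector<SysReg>& regs) {
  // 1) Group entries by encoding.
  std::unordered_map<unsigned, std::vector<const SysReg*>> by_encoding;
  for (const SysReg& reg : regs)
    by_encoding[reg.encoding].push_back(&reg);

  // 2) Compare every pair inside each group.
  for (const auto& entry : by_encoding) {
    const std::vector<const SysReg*>& group = entry.second;
    for (std::size_t i = 0; i < group.size(); ++i)
      for (std::size_t j = i + 1; j < group.size(); ++j)
        if (group[i]->properties == group[j]->properties
            && group[i]->arch_reqs == group[j]->arch_reqs)
          return false;  // a genuine clash: nothing distinguishes the two
  }
  return true;
}
```

Two entries with the same encoding pass as long as one of the two fields
separates them, which is how aliases and deprecated names are tolerated.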

gcc/ChangeLog:

* config/aarch64/aarch64.cc
(aarch64_test_sysreg_encoding_clashes): New.
(aarch64_run_selftests): Add call to
aarch64_test_sysreg_encoding_clashes selftest.
---
 gcc/config/aarch64/aarch64.cc | 44 +++
 1 file changed, 44 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index eaeab0be436..c0d75f167be 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -22,6 +22,7 @@
 
 #define INCLUDE_STRING
 #define INCLUDE_ALGORITHM
+#define INCLUDE_VECTOR
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -28390,6 +28391,48 @@ aarch64_test_fractional_cost ()
   ASSERT_EQ (cf (1, 2).as_double (), 0.5);
 }
 
+/* Calculate whether our system register data, as imported from
+   `aarch64-sys-reg.def' has any duplicate entries.  */
+static void
+aarch64_test_sysreg_encoding_clashes (void)
+{
+  using dup_instances_t = hash_map>;
+
+  dup_instances_t duplicate_instances;
+
+  /* Every time an encoding is established to come up more than once
+ we add it to a "clash-analysis queue", which is then used to extract
+ necessary information from our hash map when establishing whether
+ repeated encodings are valid.  */
+
+  /* 1) Collect recurrence information.  */
+  for (unsigned i = 0; i < nsysreg; i++)
+{
+  const sysreg_t *reg = sysreg_structs + i;
+
+  std::vector<const sysreg_t *> *tmp
+   = &duplicate_instances.get_or_insert (reg->encoding);
+
+  tmp->push_back (reg);
+}
+
+  /* 2) Carry out analysis on collected data.  */
+  for (auto instance : duplicate_instances)
+{
+  unsigned nrep = instance.second.size ();
+  if (nrep > 1)
+   for (unsigned i = 0; i < nrep; i++)
+ for (unsigned j = i + 1; j < nrep; j++)
+   {
+ const sysreg_t *a = instance.second[i];
+ const sysreg_t *b = instance.second[j];
+ ASSERT_TRUE ((a->properties != b->properties)
+  || (a->arch_reqs != b->arch_reqs));
+   }
+}
+}
+
 /* Run all target-specific selftests.  */
 
 static void
@@ -28397,6 +28440,7 @@ aarch64_run_selftests (void)
 {
   aarch64_test_loading_full_dump ();
   aarch64_test_fractional_cost ();
+  aarch64_test_sysreg_encoding_clashes ();
 }
 
 } // namespace selftest
-- 
2.41.0

[PATCH V3 4/6] aarch64: Implement system register r/w arm ACLE intrinsic functions

2023-11-02 Thread Victor Do Nascimento

Implement the aarch64 intrinsics for reading and writing system
registers with the following signatures:

uint32_t __arm_rsr(const char *special_register);
uint64_t __arm_rsr64(const char *special_register);
void* __arm_rsrp(const char *special_register);
float __arm_rsrf(const char *special_register);
double __arm_rsrf64(const char *special_register);
void __arm_wsr(const char *special_register, uint32_t value);
void __arm_wsr64(const char *special_register, uint64_t value);
void __arm_wsrp(const char *special_register, const void *value);
void __arm_wsrf(const char *special_register, float value);
void __arm_wsrf64(const char *special_register, double value);

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc (enum aarch64_builtins):
Add enums for new builtins.
(aarch64_init_rwsr_builtins): New.
(aarch64_general_init_builtins): Call aarch64_init_rwsr_builtins.
(aarch64_expand_rwsr_builtin): New.
(aarch64_general_expand_builtin): Call aarch64_expand_rwsr_builtin.
* config/aarch64/aarch64.md (read_sysregdi): New insn_and_split.
(write_sysregdi): Likewise.
* config/aarch64/arm_acle.h (__arm_rsr): New.
(__arm_rsrp): Likewise.
(__arm_rsr64): Likewise.
(__arm_rsrf): Likewise.
(__arm_rsrf64): Likewise.
(__arm_wsr): Likewise.
(__arm_wsrp): Likewise.
(__arm_wsr64): Likewise.
(__arm_wsrf): Likewise.
(__arm_wsrf64): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/rwsr.c: New.
* gcc.target/aarch64/acle/rwsr-1.c: Likewise.
* gcc.target/aarch64/acle/rwsr-2.c: Likewise.
* gcc.dg/pch/rwsr-pch.c: Likewise.
* gcc.dg/pch/rwsr-pch.hs: Likewise.
---
 gcc/config/aarch64/aarch64-builtins.cc| 191 ++
 gcc/config/aarch64/aarch64.md |  18 ++
 gcc/config/aarch64/arm_acle.h |  30 +++
 gcc/testsuite/gcc.dg/pch/rwsr-pch.c   |   7 +
 gcc/testsuite/gcc.dg/pch/rwsr-pch.hs  |  10 +
 .../gcc.target/aarch64/acle/rwsr-1.c  |  29 +++
 .../gcc.target/aarch64/acle/rwsr-2.c  |  25 +++
 gcc/testsuite/gcc.target/aarch64/acle/rwsr.c  | 144 +
 8 files changed, 454 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pch/rwsr-pch.c
 create mode 100644 gcc/testsuite/gcc.dg/pch/rwsr-pch.hs
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/rwsr-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/rwsr-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/rwsr.c

diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
b/gcc/config/aarch64/aarch64-builtins.cc
index 04f59fd9a54..dd76cca611b 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -47,6 +47,7 @@
 #include "stringpool.h"
 #include "attribs.h"
 #include "gimple-fold.h"
+#include "builtins.h"
 
 #define v8qi_UP  E_V8QImode
 #define v8di_UP  E_V8DImode
@@ -808,6 +809,17 @@ enum aarch64_builtins
   AARCH64_RBIT,
   AARCH64_RBITL,
   AARCH64_RBITLL,
+  /* System register builtins.  */
+  AARCH64_RSR,
+  AARCH64_RSRP,
+  AARCH64_RSR64,
+  AARCH64_RSRF,
+  AARCH64_RSRF64,
+  AARCH64_WSR,
+  AARCH64_WSRP,
+  AARCH64_WSR64,
+  AARCH64_WSRF,
+  AARCH64_WSRF64,
   AARCH64_BUILTIN_MAX
 };
 
@@ -1798,6 +1810,65 @@ aarch64_init_rng_builtins (void)
   AARCH64_BUILTIN_RNG_RNDRRS);
 }
 
+/* Add builtins for reading system register.  */
+static void
+aarch64_init_rwsr_builtins (void)
+{
+  tree fntype = NULL;
+  tree const_char_ptr_type
+= build_pointer_type (build_type_variant (char_type_node, true, false));
+
+#define AARCH64_INIT_RWSR_BUILTINS_DECL(F, N, T) \
+  aarch64_builtin_decls[AARCH64_##F] \
+= aarch64_general_add_builtin ("__builtin_aarch64_"#N, T, AARCH64_##F);
+
+  fntype
+= build_function_type_list (uint32_type_node, const_char_ptr_type, NULL);
+  AARCH64_INIT_RWSR_BUILTINS_DECL (RSR, rsr, fntype);
+
+  fntype
+= build_function_type_list (ptr_type_node, const_char_ptr_type, NULL);
+  AARCH64_INIT_RWSR_BUILTINS_DECL (RSRP, rsrp, fntype);
+
+  fntype
+= build_function_type_list (uint64_type_node, const_char_ptr_type, NULL);
+  AARCH64_INIT_RWSR_BUILTINS_DECL (RSR64, rsr64, fntype);
+
+  fntype
+= build_function_type_list (float_type_node, const_char_ptr_type, NULL);
+  AARCH64_INIT_RWSR_BUILTINS_DECL (RSRF, rsrf, fntype);
+
+  fntype
+= build_function_type_list (double_type_node, const_char_ptr_type, NULL);
+  AARCH64_INIT_RWSR_BUILTINS_DECL (RSRF64, rsrf64, fntype);
+
+  fntype
+= build_function_type_list (void_type_node, const_char_ptr_type,
+   uint32_type_node, NULL);
+
+  AARCH64_INIT_RWSR_BUILTINS_DECL (WSR, wsr, fntype);
+
+  fntype
+= build_function_type_list (void_type_node, const_char_ptr_type,
+   const_ptr_type_node, NU

Re: [PATCH] Format gotools.sum closer to what DejaGnu does

2023-11-02 Thread rep . dot . nop

Hi Maxim!

Many thanks for the patch! Quick question below..

On 2 November 2023 13:48:55 CET, Maxim Kuvyrkov  
wrote:
>... to restore compatibility with validate_failures.py.
>The testsuite script validate_failures.py expects
>"Running  ..." to extract  values,
>and gotools.sum provided "Running ".
>
>Note that libgo.sum, which also uses Makefile logic to generate
>DejaGnu-like output, already has "..." suffix.
>
>gotools/ChangeLog:
>
>   * Makefile.am: Update "Running  ..." output
>   * Makefile.in: Regenerate.
>---
> gotools/Makefile.am | 4 ++--
> gotools/Makefile.in | 5 +++--
> 2 files changed, 5 insertions(+), 4 deletions(-)
>
>diff --git a/gotools/Makefile.am b/gotools/Makefile.am
>index 7b5302990f8..d2376b9c25b 100644
>--- a/gotools/Makefile.am
>+++ b/gotools/Makefile.am
>@@ -332,8 +332,8 @@ check: check-head check-go-tool check-runtime 
>check-cgo-test check-carchive-test
>   @cp gotools.sum gotools.log
>   @for file in cmd_go-testlog runtime-testlog cgo-testlog 
> carchive-testlog cmd_vet-testlog embed-testlog; do \
> testname=`echo $${file} | sed -e 's/-testlog//' -e 's|_|/|'`; \
>-echo "Running $${testname}" >> gotools.sum; \
>-echo "Running $${testname}" >> gotools.log; \
>+echo "Running $${testname} ..." >> gotools.sum; \
>+echo "Running $${testname} ..." >> gotools.log; \
> sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' < $${file} >> gotools.log; \
> grep '^--- ' $${file} | sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' -e 
> 's/SKIP/UNTESTED/' | sort -k 2 >> gotools.sum; \
>   done
>diff --git a/gotools/Makefile.in b/gotools/Makefile.in
>index 2783b91ef4b..9cc238e748d 100644
>--- a/gotools/Makefile.in
>+++ b/gotools/Makefile.in
>@@ -317,6 +317,7 @@ pdfdir = @pdfdir@
> prefix = @prefix@
> program_transform_name = @program_transform_name@
> psdir = @psdir@
>+runstatedir = @runstatedir@

Are you sure you used the correct version of automake?

thanks

> sbindir = @sbindir@
> sharedstatedir = @sharedstatedir@
> srcdir = @srcdir@
>@@ -1003,8 +1004,8 @@ mostlyclean-local:
> @NATIVE_TRUE@ @cp gotools.sum gotools.log
> @NATIVE_TRUE@ @for file in cmd_go-testlog runtime-testlog cgo-testlog 
> carchive-testlog cmd_vet-testlog embed-testlog; do \
> @NATIVE_TRUE@   testname=`echo $${file} | sed -e 's/-testlog//' -e 's|_|/|'`; 
> \
>-@NATIVE_TRUE@   echo "Running $${testname}" >> gotools.sum; \
>-@NATIVE_TRUE@   echo "Running $${testname}" >> gotools.log; \
>+@NATIVE_TRUE@   echo "Running $${testname} ..." >> gotools.sum; \
>+@NATIVE_TRUE@   echo "Running $${testname} ..." >> gotools.log; \
> @NATIVE_TRUE@   sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' < $${file} >> 
> gotools.log; \
> @NATIVE_TRUE@   grep '^--- ' $${file} | sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' 
> -e 's/SKIP/UNTESTED/' | sort -k 2 >> gotools.sum; \
> @NATIVE_TRUE@ done

Re: [PATCH] Format gotools.sum closer to what DejaGnu does

2023-11-02 Thread Maxim Kuvyrkov

> On Nov 2, 2023, at 21:02, rep.dot@gmail.com wrote:
> 
> Hi Maxim!
> 
> Many thanks for the patch! Quick question below..
> 
> On 2 November 2023 13:48:55 CET, Maxim Kuvyrkov  
> wrote:
>> ... to restore compatibility with validate_failures.py.
>> The testsuite script validate_failures.py expects
>> "Running  ..." to extract  values,
>> and gotools.sum provided "Running ".
>> 
>> Note that libgo.sum, which also uses Makefile logic to generate
>> DejaGnu-like output, already has "..." suffix.
>> 
>> gotools/ChangeLog:
>> 
>> * Makefile.am: Update "Running  ..." output
>> * Makefile.in: Regenerate.
>> ---
>> gotools/Makefile.am | 4 ++--
>> gotools/Makefile.in | 5 +++--
>> 2 files changed, 5 insertions(+), 4 deletions(-)
>> 
>> diff --git a/gotools/Makefile.am b/gotools/Makefile.am
>> index 7b5302990f8..d2376b9c25b 100644
>> --- a/gotools/Makefile.am
>> +++ b/gotools/Makefile.am
>> @@ -332,8 +332,8 @@ check: check-head check-go-tool check-runtime 
>> check-cgo-test check-carchive-test
>> @cp gotools.sum gotools.log
>> @for file in cmd_go-testlog runtime-testlog cgo-testlog carchive-testlog 
>> cmd_vet-testlog embed-testlog; do \
>>   testname=`echo $${file} | sed -e 's/-testlog//' -e 's|_|/|'`; \
>> -   echo "Running $${testname}" >> gotools.sum; \
>> -   echo "Running $${testname}" >> gotools.log; \
>> +   echo "Running $${testname} ..." >> gotools.sum; \
>> +   echo "Running $${testname} ..." >> gotools.log; \
>>   sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' < $${file} >> gotools.log; \
>>   grep '^--- ' $${file} | sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' -e 
>> 's/SKIP/UNTESTED/' | sort -k 2 >> gotools.sum; \
>> done
>> diff --git a/gotools/Makefile.in b/gotools/Makefile.in
>> index 2783b91ef4b..9cc238e748d 100644
>> --- a/gotools/Makefile.in
>> +++ b/gotools/Makefile.in
>> @@ -317,6 +317,7 @@ pdfdir = @pdfdir@
>> prefix = @prefix@
>> program_transform_name = @program_transform_name@
>> psdir = @psdir@
>> +runstatedir = @runstatedir@
> 
> Are you sure you used the correct version of automake?

I used automake 1.15.1 (from Ubuntu 20.04 automake-1.15 package), and I 
double-checked after getting the runstatedir update.

I would appreciate someone checking on their side to make sure I don't have 
something weird going on in my setup.

--
Maxim Kuvyrkov
https://www.linaro.org

[PATCH V3 5/6] aarch64: Add front-end argument type checking for target builtins

2023-11-02 Thread Victor Do Nascimento

In implementing the ACLE read/write system register builtins it was
observed that leaving argument type checking to be done at expand-time
meant that poorly-formed function calls were being "fixed" by certain
optimization passes, meaning bad code wasn't being properly picked up
in checking.

Example:

  const char *regname = "amcgcr_el0";
  long long a = __builtin_aarch64_rsr64 (regname);

is reduced by the ccp1 pass to

  long long a = __builtin_aarch64_rsr64 ("amcgcr_el0");

As these functions require an argument of STRING_CST type, there needs
to be a check carried out by the front-end capable of picking this up.

The introduced `aarch64_check_general_builtin_call' function is called by
the TARGET_CHECK_BUILTIN_CALL hook whenever a call to a builtin
belonging to the AARCH64_BUILTIN_GENERAL category is encountered,
carrying out any appropriate checks associated with a particular
builtin function code.
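
To make the ordering problem concrete, here is a toy model in C++.  It
is a sketch only: `Kind', `Arg', `first_arg_is_literal' and
`propagate_constant' are hypothetical names and do not correspond to GCC
tree codes or passes.  It shows that a "must be a string literal" check
gives different answers before and after a constant-propagation step,
which is why the check has to run in the front end.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy expression kinds; illustrative only, not GCC tree codes.
enum class Kind { StringLiteral, Variable };

struct Arg {
  Kind kind;
  std::string text;  // literal contents, or the variable's name
};

// Front-end style check: accept the call only if the register name is
// spelled as a string literal at the call site.
bool first_arg_is_literal(const std::vector<Arg>& args) {
  return !args.empty() && args[0].kind == Kind::StringLiteral;
}

// Stand-in for constant propagation: a variable whose value is known is
// replaced by the literal it holds; anything else is returned unchanged.
Arg propagate_constant(const Arg& arg, const std::string& known_value) {
  if (arg.kind == Kind::Variable)
    return Arg{Kind::StringLiteral, known_value};
  return arg;
}
```

Running the check on the call as written rejects `regname'; running it
after propagate_constant accepts the same call, so an expand-time check
would miss the ill-formed source.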

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc
(aarch64_check_general_builtin_call): New.
* config/aarch64/aarch64-c.cc (aarch64_check_builtin_call):
Add aarch64_check_general_builtin_call call.
* config/aarch64/aarch64-protos.h
(aarch64_check_general_builtin_call): New.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/rwsr-3.c: New.
---
 gcc/config/aarch64/aarch64-builtins.cc| 31 +++
 gcc/config/aarch64/aarch64-c.cc   |  4 +--
 gcc/config/aarch64/aarch64-protos.h   |  4 +++
 .../gcc.target/aarch64/acle/rwsr-3.c  | 18 +++
 4 files changed, 55 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/rwsr-3.c

diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
b/gcc/config/aarch64/aarch64-builtins.cc
index dd76cca611b..c5f20f68bca 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -2127,6 +2127,37 @@ aarch64_general_builtin_decl (unsigned code, bool)
   return aarch64_builtin_decls[code];
 }
 
+bool
+aarch64_check_general_builtin_call (location_t location, vec<location_t>,
+   unsigned int code, tree fndecl,
+   unsigned int nargs ATTRIBUTE_UNUSED, tree *args)
+{
+  switch (code)
+{
+case AARCH64_RSR:
+case AARCH64_RSRP:
+case AARCH64_RSR64:
+case AARCH64_RSRF:
+case AARCH64_RSRF64:
+case AARCH64_WSR:
+case AARCH64_WSRP:
+case AARCH64_WSR64:
+case AARCH64_WSRF:
+case AARCH64_WSRF64:
+  if (TREE_CODE (args[0]) != NOP_EXPR
+ || TREE_CODE (TREE_TYPE (args[0])) != POINTER_TYPE
+ || (TREE_CODE (TREE_OPERAND (TREE_OPERAND (args[0], 0) , 0))
+ != STRING_CST))
+   {
+ error_at (location, "first argument to %qD must be a string literal",
+   fndecl);
+ return false;
+   }
+}
+  /* Default behavior.  */
+  return true;
+}
+
 typedef enum
 {
   SIMD_ARG_COPY_TO_REG,
diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc
index ab8844f6049..be8b7236cf9 100644
--- a/gcc/config/aarch64/aarch64-c.cc
+++ b/gcc/config/aarch64/aarch64-c.cc
@@ -339,8 +339,8 @@ aarch64_check_builtin_call (location_t loc, vec<location_t> arg_loc,
   switch (code & AARCH64_BUILTIN_CLASS)
 {
 case AARCH64_BUILTIN_GENERAL:
-  return true;
-
+  return aarch64_check_general_builtin_call (loc, arg_loc, subcode,
+orig_fndecl, nargs, args);
 case AARCH64_BUILTIN_SVE:
   return aarch64_sve::check_builtin_call (loc, arg_loc, subcode,
  orig_fndecl, nargs, args);
diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 5d6a1e75700..dbd486cfea4 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -990,6 +990,10 @@ tree aarch64_general_builtin_rsqrt (unsigned int);
 void handle_arm_acle_h (void);
 void handle_arm_neon_h (void);
 
+bool aarch64_check_general_builtin_call (location_t, vec<location_t>,
+unsigned int, tree, unsigned int,
+tree *);
+
 namespace aarch64_sve {
   void init_builtins ();
   void handle_arm_sve_h ();
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/rwsr-3.c 
b/gcc/testsuite/gcc.target/aarch64/acle/rwsr-3.c
new file mode 100644
index 000..17038fefbf6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/rwsr-3.c
@@ -0,0 +1,18 @@
+/* Test the __arm_[r,w]sr ACLE intrinsics family.  */
+/* Ensure that illegal behavior is rejected by the compiler.  */
+
+/* { dg-do compile } */
+/* { dg-options "-std=c2x -O3 -march=armv8.4-a" } */
+
+#include <arm_acle.h>
+
+void
+test_non_const_sysreg_name ()
+{
+  const char *regname = "trcseqstr";
+  long long a = __arm_rsr64 (regname); /* { dg-error "first argument to 
'__builtin_aarch64_rsr64' must be a string literal" } */
+  __arm_wsr64 (regname, a); /* { dg-error "first argument to 
'__builtin_aar

Re: [PATCH] Format gotools.sum closer to what DejaGnu does

2023-11-02 Thread Rainer Orth

rep.dot@gmail.com writes:

> On 2 November 2023 18:06:54 CET, Maxim Kuvyrkov 
> wrote:
>>> On Nov 2, 2023, at 21:02, rep.dot@gmail.com wrote:
>>> 
>>> Hi Maxim!
>>> 
>>> Many thanks for the patch! Quick question below..
>>> 
>>> On 2 November 2023 13:48:55 CET, Maxim Kuvyrkov
>>>  wrote:
[...]
>>> Are you sure you used the correct version of automake?
>>
>>I used automake 1.15.1 (from Ubuntu 20.04 automake-1.15 package), and I
>> double-checked after getting the runstatedir update.
>
> I think that runstatedir is a Debian (and derivatives) addition, would
> probably suffice to just drop that line manually..

One needs to use the exact version of the autotools as documented on
https://gcc.gnu.org/install/prerequisites.html.  Since distros often
apply local patches, it's best to use a self-built version to guard
against those.  Manually dropping parts of the regenerated files is
heavily fraught with error, especially since you usually don't know what
to drop.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [Patch, fortran] PR98498 - Interp request: defined operators and unlimited polymorphic

2023-11-02 Thread Paul Richard Thomas

Hi Harald,

I was overthinking the problem. The rejected cases led me to a fix that can
only be described as a considerable simplification compared with the first
patch!

The testcase now reflects the requirements of the standard and
regtests without failures.

OK for mainline?

Thanks

Paul

Fortran: Defined operators with unlimited polymorphic args [PR98498]

2023-11-02  Paul Thomas  

gcc/fortran
PR fortran/98498
* interface.cc (upoly_ok): Defined operators using unlimited
polymorphic formal arguments must not override the intrinsic
operator use.

gcc/testsuite/
PR fortran/98498
* gfortran.dg/interface_50.f90: New test.


On Wed, 1 Nov 2023 at 20:12, Harald Anlauf  wrote:

> Hi Paul,
>
> Am 01.11.23 um 19:02 schrieb Paul Richard Thomas:
> > The interpretation request came in a long time ago but I only just got
> > around to implementing it.
> >
> > The updated text from the standard is in the comment. Now I am writing
> > this, I think that I should perhaps use switch(op)/case rather than using
> > if/else if and depending on the order of the gfc_intrinsic_op enum being
> > maintained. Thoughts?
>
> the logic is likely harder to parse with if/else than with
> switch(op)/case.  However, I do not think that the order of
> the enum will ever be changed, as the module format relies
> on that very order.
>
> > The testcase runs fine with both mainline and nagfor. I think that
> > compile-only with counts of star-eq and star_not should suffice.
>
> I found other cases that are rejected even with your patch,
> but which are accepted by nagfor.  Example:
>
> print *, ('a' == c)
>
> Nagfor prints F at runtime as expected, as it correctly resolves
> this to star_eq.  Further examples can be easily constructed.
>
> Can you have a look?
>
> Thanks,
> Harald
>
> > Regtests with no regressions. OK for mainline?
> >
> > Paul
> >
> > Fortran: Defined operators with unlimited polymorphic args [PR98498]
> >
> > 2023-11-01  Paul Thomas  
> >
> > gcc/fortran
> > PR fortran/98498
> > * interface.cc (upoly_ok): New function.
> > (gfc_extend_expr): Use new function to ensure that defined
> > operators using unlimited polymorphic formal arguments do not
> > override their intrinsic uses.
> >
> > gcc/testsuite/
> > PR fortran/98498
> > * gfortran.dg/interface_50.f90: New test.
> >
>
>
diff --git a/gcc/fortran/interface.cc b/gcc/fortran/interface.cc
index 8c4571e0aa6..fc4fe662eab 100644
--- a/gcc/fortran/interface.cc
+++ b/gcc/fortran/interface.cc
@@ -4737,6 +4737,17 @@ gfc_extend_expr (gfc_expr *e)
 	  if (sym != NULL)
 	break;
 	}
+
+  /* F2018(15.4.3.4.2) requires that the use of unlimited polymorphic
+	 formal arguments does not override the intrinsic uses.  */
+  gfc_push_suppress_errors ();
+  if (sym
+	  && (UNLIMITED_POLY (sym->formal->sym)
+	  || (sym->formal->next
+		  && UNLIMITED_POLY (sym->formal->next->sym)))
+	  && !gfc_check_operator_interface (sym, e->value.op.op, e->where))
+	sym = NULL;
+  gfc_pop_suppress_errors ();
 }
 
   /* TODO: Do an ambiguity-check and error if multiple matching interfaces are
! { dg-do compile }
! { dg-options "-fdump-tree-original" }
!
! Tests the fix for PR98498, which was subject to an interpretation request
! as to whether or not the interface operator overrode the intrinsic use.
! (See PR for correspondence)
!
! Contributed by Paul Thomas  
!
MODULE mytypes
  IMPLICIT none

  TYPE pvar
 character(len=20) :: name
 integer   :: level
  end TYPE pvar

  interface operator (==)
 module procedure star_eq
  end interface

  interface operator (.not.)
 module procedure star_not
  end interface

contains
  function star_eq(a, b)
implicit none
class(*), intent(in) :: a, b
logical :: star_eq
select type (a)
  type is (pvar)
  select type (b)
type is (pvar)
  if((a%level .eq. b%level) .and. (a%name .eq. b%name)) then
star_eq = .true.
  else
star_eq = .false.
  end if
type is (integer)
  star_eq = (a%level == b)
  end select
  class default
star_eq = .false.
end select
  end function star_eq

  function star_not (a)
implicit none
class(*), intent(in) :: a
type(pvar) :: star_not
select type (a)
  type is (pvar)
star_not = a
star_not%level = -star_not%level
  type is (real)
star_not = pvar ("real", -int(a))
  class default
star_not = pvar ("noname", 0)
end select
  end function

end MODULE mytypes

program test_eq
   use mytypes
   implicit none

   type(pvar) x, y
   integer :: i = 4
   real :: r = 2.0
   character(len = 4, kind =4) :: c = "abcd"
! Check that intrinsic use of .not. and == is not overridden.
   if (.not.(i == 2*int (r))) stop 1
   if (r == 1.0) stop 2

! Test defined operator ==
   x = pvar('test 1', 100)
   y = pvar('test 1', 100)
   if (.not.(x == y)) stop 3
   y = pvar('test 2', 100)
   if (x == y) stop 4
   if (x == r) stop 5! cla

Re: [PATCH] Format gotools.sum closer to what DejaGnu does

2023-11-02 Thread rep . dot . nop

On 2 November 2023 18:06:54 CET, Maxim Kuvyrkov  
wrote:
>> On Nov 2, 2023, at 21:02, rep.dot@gmail.com wrote:
>> 
>> Hi Maxim!
>> 
>> Many thanks for the patch! Quick question below..
>> 
>> On 2 November 2023 13:48:55 CET, Maxim Kuvyrkov  
>> wrote:
>>> ... to restore compatibility with validate_failures.py.
>>> The testsuite script validate_failures.py expects
>>> "Running  ..." to extract  values,
>>> and gotools.sum provided "Running ".
>>> 
>>> Note that libgo.sum, which also uses Makefile logic to generate
>>> DejaGnu-like output, already has "..." suffix.
>>> 
>>> gotools/ChangeLog:
>>> 
>>> * Makefile.am: Update "Running  ..." output
>>> * Makefile.in: Regenerate.
>>> ---
>>> gotools/Makefile.am | 4 ++--
>>> gotools/Makefile.in | 5 +++--
>>> 2 files changed, 5 insertions(+), 4 deletions(-)
>>> 
>>> diff --git a/gotools/Makefile.am b/gotools/Makefile.am
>>> index 7b5302990f8..d2376b9c25b 100644
>>> --- a/gotools/Makefile.am
>>> +++ b/gotools/Makefile.am
>>> @@ -332,8 +332,8 @@ check: check-head check-go-tool check-runtime 
>>> check-cgo-test check-carchive-test
>>> @cp gotools.sum gotools.log
>>> @for file in cmd_go-testlog runtime-testlog cgo-testlog carchive-testlog 
>>> cmd_vet-testlog embed-testlog; do \
>>>   testname=`echo $${file} | sed -e 's/-testlog//' -e 's|_|/|'`; \
>>> -   echo "Running $${testname}" >> gotools.sum; \
>>> -   echo "Running $${testname}" >> gotools.log; \
>>> +   echo "Running $${testname} ..." >> gotools.sum; \
>>> +   echo "Running $${testname} ..." >> gotools.log; \
>>>   sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' < $${file} >> gotools.log; \
>>>   grep '^--- ' $${file} | sed -e 's/^--- \(.*\) ([^)]*)$$/\1/' -e 
>>> 's/SKIP/UNTESTED/' | sort -k 2 >> gotools.sum; \
>>> done
>>> diff --git a/gotools/Makefile.in b/gotools/Makefile.in
>>> index 2783b91ef4b..9cc238e748d 100644
>>> --- a/gotools/Makefile.in
>>> +++ b/gotools/Makefile.in
>>> @@ -317,6 +317,7 @@ pdfdir = @pdfdir@
>>> prefix = @prefix@
>>> program_transform_name = @program_transform_name@
>>> psdir = @psdir@
>>> +runstatedir = @runstatedir@
>> 
>> Are you sure you used the correct version of automake?
>
>I used automake 1.15.1 (from Ubuntu 20.04 automake-1.15 package), and I 
>double-checked after getting the runstatedir update.

I think that runstatedir is a Debian (and derivatives) addition, would probably 
suffice to just drop that line manually..

The patch itself looks like it would be ok, probably even obvious, but I can 
not approve it.

I'm a bit surprised that you don't need to have "exp" != None for 
validate-failures to work after your exp addition, but I take it you checked 
that aspect :-)

thanks, again!

>
>I would appreciate someone checking on their side to make sure I don't have 
>something weird going on in my setup.
>
>--
>Maxim Kuvyrkov
>https://www.linaro.org
>

Re: [PATCH 3/4] maintainer-scripts/gcc_release: use HTTPS for links

2023-11-02 Thread Joseph Myers

On Thu, 2 Nov 2023, Sam James wrote:

> maintainer-scripts/
>   * gcc_release: Use HTTPS for links.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 4/4] maintainer-scripts/gcc_release: cleanup whitespace

2023-11-02 Thread Joseph Myers

On Thu, 2 Nov 2023, Sam James wrote:

> maintainer-scripts/
>   * gcc_release: Cleanup whitespace.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] gfortran: Rely on dg-do-what-default to avoid running pr85853.f90, pr107254.f90 and vect-alias-check-1.F90 on non-vector targets

2023-11-02 Thread Patrick O'Neill

Testcases in gfortran.dg/vect/vect.exp rely on
check_vect_support_and_set_flags to set dg-do-what-default and avoid
running vector tests on non-vector targets. The three testcases in this
patch overwrite the default with dg-do run which causes issues
for non-vector targets.

Removing the dg-do run directive resolves this issue for non-vector
targets (while still running the tests on vector targets).

gcc/testsuite/ChangeLog:

* gfortran.dg/vect/pr107254.f90: Remove dg-do run directive.
* gfortran.dg/vect/pr85853.f90: Ditto.
* gfortran.dg/vect/vect-alias-check-1.F90: Ditto.

Signed-off-by: Patrick O'Neill 
---
Tested using rv64gc & rv64gcv to make sure the testcases compile/run
as expected.

These files haven't been changed in a long time so I'm not sure why (or
if) this hasn't been run into by other people before.
---
 gcc/testsuite/gfortran.dg/vect/pr107254.f90   | 2 --
 gcc/testsuite/gfortran.dg/vect/pr85853.f90| 1 -
 gcc/testsuite/gfortran.dg/vect/vect-alias-check-1.F90 | 1 -
 3 files changed, 4 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/vect/pr107254.f90 
b/gcc/testsuite/gfortran.dg/vect/pr107254.f90
index 85bcb5f3fa2..adce6bedc30 100644
--- a/gcc/testsuite/gfortran.dg/vect/pr107254.f90
+++ b/gcc/testsuite/gfortran.dg/vect/pr107254.f90
@@ -1,5 +1,3 @@
-! { dg-do run }
-
 subroutine dlartg( f, g, s, r )
   implicit none
   double precision :: f, g, r, s
diff --git a/gcc/testsuite/gfortran.dg/vect/pr85853.f90 
b/gcc/testsuite/gfortran.dg/vect/pr85853.f90
index 68f4a004324..4c0e3b81a09 100644
--- a/gcc/testsuite/gfortran.dg/vect/pr85853.f90
+++ b/gcc/testsuite/gfortran.dg/vect/pr85853.f90
@@ -1,5 +1,4 @@
 ! Taken from execute/where_2.f90, but with special flags.
-! { dg-do run }
 ! { dg-additional-options "-fno-tree-loop-vectorize" }
 
 ! Program to test the WHERE constructs
diff --git a/gcc/testsuite/gfortran.dg/vect/vect-alias-check-1.F90 
b/gcc/testsuite/gfortran.dg/vect/vect-alias-check-1.F90
index 3014ff9f3b6..85ae9b151e3 100644
--- a/gcc/testsuite/gfortran.dg/vect/vect-alias-check-1.F90
+++ b/gcc/testsuite/gfortran.dg/vect/vect-alias-check-1.F90
@@ -1,4 +1,3 @@
-! { dg-do run }
 ! { dg-additional-options "-fno-inline" }
 
 #define N 200
-- 
2.34.1

[PATCH] libstdc++: avoid uninitialized read in basic_string constructor

2023-11-02 Thread Ben Sherman

Tested on x86_64-pc-linux-gnu, please let me know if there's anything
else needed. I haven't contributed before and don't have write access, so
apologies if I've missed anything.

-- >8 --

The basic_string input iterator constructor incrementally reads data and
allocates the internal buffer as-needed. When _M_dispose() is called, there
is a check for whether the local buffer is being used - if it is, there is
an additional check guarding __builtin_unreachable() for the value of
_M_string_length. The constructor does not initialize _M_string_length
until all data has been read, so the first re-allocation out of the local
buffer will have an uninitialized read.

This updates the basic_string input iterator constructor to properly set
_M_string_length as data is being read.  It additionally introduces a new
_M_assign_terminator() function to assign the null-terminator based on the
currently-stored _M_string_length.
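
A small standalone program that goes through the affected constructor
might look like the sketch below.  It assumes nothing beyond the
standard library; `from_input_iterators' is a hypothetical helper name.
The observable result should be the same with and without the patch,
since the uninitialized read described above would only be expected to
show up under a tool such as valgrind or a sanitizer.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Build a string through the input-iterator constructor.
// std::istreambuf_iterator is a single-pass input iterator, so the
// final length is not known up front and the string has to grow as
// characters are read.
std::string from_input_iterators(const std::string& text) {
  std::istringstream in(text);
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}
```

Passing an input longer than the local (small-string) buffer, for
example 100 characters, forces at least one reallocation out of that
buffer, which is the step the description above is concerned with.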

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (_M_assign_terminator()): New
  function.
  (_M_set_length()): Use _M_assign_terminator().
* include/bits/basic_string.tcc (_M_construct(InIter, InIter,
  input_iterator_tag)): Set length incrementally, use
  _M_assign_terminator().

diff --git a/libstdc++-v3/include/bits/basic_string.h 
b/libstdc++-v3/include/bits/basic_string.h
index 0fa32afeb..ba02d8f0f 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -258,12 +258,17 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   _M_capacity(size_type __capacity)
   { _M_allocated_capacity = __capacity; }

+  _GLIBCXX20_CONSTEXPR
+  void
+  _M_assign_terminator()
+  { traits_type::assign(_M_data()[_M_string_length], _CharT()); }
+
   _GLIBCXX20_CONSTEXPR
   void
   _M_set_length(size_type __n)
   {
_M_length(__n);
-   traits_type::assign(_M_data()[__n], _CharT());
+   _M_assign_terminator();
   }

   _GLIBCXX20_CONSTEXPR
diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
index f0a44e5e8..84366a44a 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -182,6 +182,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
++__beg;
  }

+   _M_length(__len);
+
struct _Guard
{
  _GLIBCXX20_CONSTEXPR
@@ -206,12 +208,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_capacity(__capacity);
  }
traits_type::assign(_M_data()[__len++], *__beg);
+   _M_length(__len);
++__beg;
  }

__guard._M_guarded = 0;

-   _M_set_length(__len);
+   _M_assign_terminator();
   }

   template
--
2.21.0








[pushed] c++: retval dtor on rethrow [PR112301]

2023-11-02 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

In r12-6333 for PR33799, I fixed the example in [except.ctor]/2.  In that
testcase, the exception is caught and the function returns again,
successfully.

In this testcase, however, the exception is rethrown, and hits two separate
cleanups: one in the try block and the other in the function body.  So we
destroy twice an object that was only constructed once.

Fortunately, the fix for the normal case is easy: we just need to clear the
"return value constructed by return" flag when we do it the first time.

This gets more complicated with the named return value optimization, since
we don't want to destroy the return value while the NRV variable is still in
scope.
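
As a rough illustration of the rethrow scenario described above (not the actual return1.C testcase; X, ThrowingDtor, Counts and run_scenario are names made up for this sketch), the following counts constructions and destructions of the return value when a local's destructor throws after the return value has been built and the handler rethrows. A conforming implementation is expected to report one construction and one destruction; a compiler with the bug described above may report two destructions.

```cpp
#include <cassert>

// Bookkeeping for the sketch; all names are hypothetical.
struct Counts {
  int constructed = 0;
  int destroyed = 0;
  bool rethrown = false;
};

Counts& counts() {
  static Counts c;
  return c;
}

struct X {
  X() { ++counts().constructed; }
  X(const X&) { ++counts().constructed; }
  ~X() { ++counts().destroyed; }
};

struct ThrowingDtor {
  ~ThrowingDtor() noexcept(false) { throw 1; }
};

X returns_then_rethrows() {
  try {
    ThrowingDtor y;
    return X();  // the return value is constructed first...
  }              // ...then y's destructor throws on scope exit
  catch (...) {
    throw;       // rethrow: the exception also leaves the function body
  }
}

Counts run_scenario() {
  counts() = Counts{};
  assert(counts().constructed == 0);
  try {
    X x = returns_then_rethrows();
    (void)x;
  } catch (int) {
    counts().rethrown = true;
  }
  return counts();
}
```

Comparing constructed and destroyed in the result of run_scenario() is one way to check whether a given compiler destroys the return value exactly once.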

PR c++/112301
PR c++/102191
PR c++/33799

gcc/cp/ChangeLog:

* except.cc (maybe_splice_retval_cleanup): Clear
current_retval_sentinel when destroying retval.
* semantics.cc (nrv_data): Add in_nrv_cleanup.
(finalize_nrv): Set it.
(finalize_nrv_r): Fix handling of throwing cleanups.

gcc/testsuite/ChangeLog:

* g++.dg/eh/return1.C: Add more cases.
---
 gcc/cp/except.cc  | 18 ++-
 gcc/cp/semantics.cc   | 47 +-
 gcc/testsuite/g++.dg/eh/return1.C | 81 ++-
 3 files changed, 142 insertions(+), 4 deletions(-)

diff --git a/gcc/cp/except.cc b/gcc/cp/except.cc
index e32efb30457..d966725db9b 100644
--- a/gcc/cp/except.cc
+++ b/gcc/cp/except.cc
@@ -1284,7 +1284,15 @@ build_noexcept_spec (tree expr, tsubst_flags_t complain)
current_retval_sentinel so that we know that the return value needs to be
destroyed on throw.  Do the same if the current function might use the
named return value optimization, so we don't destroy it on return.
-   Otherwise, returns NULL_TREE.  */
+   Otherwise, returns NULL_TREE.
+
+   The sentinel is set to indicate that we're in the process of returning, and
+   therefore should destroy a normal return value on throw, and shouldn't
+   destroy a named return value variable on normal scope exit.  It is set on
+   return, and cleared either by maybe_splice_retval_cleanup, or when an
+   exception reaches the NRV scope (finalize_nrv_r).  Note that once return
+   passes the NRV scope, it's effectively a normal return value, so cleanup
+   past that point is handled by maybe_splice_retval_cleanup. */
 
 tree
 maybe_set_retval_sentinel ()
@@ -1361,6 +1369,14 @@ maybe_splice_retval_cleanup (tree compound_stmt, bool is_try)
  tsi_delink (&iter);
}
   tree dtor = build_cleanup (retval);
+  if (!function_body)
+   {
+ /* Clear the sentinel so we don't try to destroy the retval again on
+rethrow (c++/112301).  */
+ tree clear = build2 (MODIFY_EXPR, boolean_type_node,
+  current_retval_sentinel, boolean_false_node);
+ dtor = build2 (COMPOUND_EXPR, void_type_node, clear, dtor);
+   }
   tree cond = build3 (COND_EXPR, void_type_node, current_retval_sentinel,
  dtor, void_node);
   tree cleanup = build_stmt (loc, CLEANUP_STMT,
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 52044be7af8..a0f2edcf117 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -4982,6 +4982,7 @@ public:
   tree result;
   hash_table<nofree_ptr_hash <tree_node> > visited;
   bool simple;
+  bool in_nrv_cleanup;
 };
 
 /* Helper function for walk_tree, used by finalize_nrv below.  */
@@ -4997,7 +4998,7 @@ finalize_nrv_r (tree* tp, int* walk_subtrees, void* data)
   if (TYPE_P (*tp))
 *walk_subtrees = 0;
   /* If there's a label, we might need to destroy the NRV on goto (92407).  */
-  else if (TREE_CODE (*tp) == LABEL_EXPR)
+  else if (TREE_CODE (*tp) == LABEL_EXPR && !dp->in_nrv_cleanup)
 dp->simple = false;
   /* Change NRV returns to just refer to the RESULT_DECL; this is a nop,
  but differs from using NULL_TREE in that it indicates that we care
@@ -5016,16 +5017,59 @@ finalize_nrv_r (tree* tp, int* walk_subtrees, void* data)
   else if (TREE_CODE (*tp) == CLEANUP_STMT
   && CLEANUP_DECL (*tp) == dp->var)
 {
+  dp->in_nrv_cleanup = true;
+  cp_walk_tree (&CLEANUP_BODY (*tp), finalize_nrv_r, data, 0);
+  dp->in_nrv_cleanup = false;
+  cp_walk_tree (&CLEANUP_EXPR (*tp), finalize_nrv_r, data, 0);
+  *walk_subtrees = 0;
+
   if (dp->simple)
+   /* For a simple NRV, just run it on the EH path.  */
CLEANUP_EH_ONLY (*tp) = true;
   else
{
+ /* Not simple, we need to check current_retval_sentinel to decide
+whether to run it.  If it's set, we're returning normally and
+don't want to destroy the NRV.  If the sentinel is not set, we're
+leaving scope some other way, either by flowing off the end of its
+scope or throwing an exception.  */
  tree cond = build3 (COND_EXPR, void_type_node,

[pushed] c++: use hash_set in nrv_data

2023-11-02 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

I noticed we were using a hash_table directly here instead of the simpler
hash_set interface.  Also, let's check for the variable itself and repeats
earlier, since they should happen more often than any of the other cases.
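
The simplification can be pictured with a stand-in built on std::unordered_set. This is a sketch only: VisitedSet and count_walked are hypothetical, and the add() below is modelled on the usage in the patch, where the call returns true when the element was already present, so one call replaces the find-slot, test and store sequence.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a "visited" set keyed by node id.
struct VisitedSet {
  std::unordered_set<int> seen;

  // Returns true if `id` was already present, false if it was just added.
  bool add(int id) { return !seen.insert(id).second; }
};

// Counts how many nodes would actually be walked into.
int count_walked(const std::vector<int>& ids) {
  VisitedSet visited;
  int walked = 0;
  for (int id : ids) {
    if (visited.add(id)) continue;  // seen before: skip the subtree
    ++walked;
  }
  assert(walked <= static_cast<int>(ids.size()));
  return walked;
}
```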

gcc/cp/ChangeLog:

* semantics.cc (nrv_data): Change visited to hash_set.
(finalize_nrv_r): Reorganize.
---
 gcc/cp/semantics.cc | 26 --
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index a0f2edcf117..37bffca8e55 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -4980,7 +4980,7 @@ public:
 
   tree var;
   tree result;
-  hash_table<nofree_ptr_hash <tree_node> > visited;
+  hash_set<tree> visited;
   bool simple;
   bool in_nrv_cleanup;
 };
@@ -4991,12 +4991,22 @@ static tree
 finalize_nrv_r (tree* tp, int* walk_subtrees, void* data)
 {
   class nrv_data *dp = (class nrv_data *)data;
-  tree_node **slot;
 
   /* No need to walk into types.  There wouldn't be any need to walk into
  non-statements, except that we have to consider STMT_EXPRs.  */
   if (TYPE_P (*tp))
 *walk_subtrees = 0;
+
+  /* Replace all uses of the NRV with the RESULT_DECL.  */
+  else if (*tp == dp->var)
+*tp = dp->result;
+
+  /* Avoid walking into the same tree more than once.  Unfortunately, we
+ can't just use walk_tree_without duplicates because it would only call
+ us for the first occurrence of dp->var in the function body.  */
+  else if (dp->visited.add (*tp))
+*walk_subtrees = 0;
+
   /* If there's a label, we might need to destroy the NRV on goto (92407).  */
   else if (TREE_CODE (*tp) == LABEL_EXPR && !dp->in_nrv_cleanup)
 dp->simple = false;
@@ -5086,18 +5096,6 @@ finalize_nrv_r (tree* tp, int* walk_subtrees, void* data)
   SET_EXPR_LOCATION (init, EXPR_LOCATION (*tp));
   *tp = init;
 }
-  /* And replace all uses of the NRV with the RESULT_DECL.  */
-  else if (*tp == dp->var)
-*tp = dp->result;
-
-  /* Avoid walking into the same tree more than once.  Unfortunately, we
- can't just use walk_tree_without duplicates because it would only call
- us for the first occurrence of dp->var in the function body.  */
-  slot = dp->visited.find_slot (*tp, INSERT);
-  if (*slot)
-*walk_subtrees = 0;
-  else
-*slot = *tp;
 
   /* Keep iterating.  */
   return NULL_TREE;

base-commit: 36a26298ec7dfca615d4ba411a3508d1287d6ce5
-- 
2.39.3

Re: [Patch, fortran] PR98498 - Interp request: defined operators and unlimited polymorphic

2023-11-02 Thread Harald Anlauf


Hi Paul,

On 02.11.23 at 19:18, Paul Richard Thomas wrote:

Hi Harald,

I was overthinking the problem. The rejected cases led me to a fix that can
only be described as a considerable simplification compared with the first
patch!


this patch is *much* simpler, makes more sense, and works here. :-)


The testcase now reflects the requirements of the standard and
regtests without failures.

OK for mainline?


Yes, OK for mainline.

Thanks,
Harald


Thanks

Paul

Fortran: Defined operators with unlimited polymorphic args [PR98498]

2023-11-02  Paul Thomas  

gcc/fortran
PR fortran/98498
* interface.cc (upoly_ok): Defined operators using unlimited
polymorphic formal arguments must not override the intrinsic
operator use.

gcc/testsuite/
PR fortran/98498
* gfortran.dg/interface_50.f90: New test.


On Wed, 1 Nov 2023 at 20:12, Harald Anlauf  wrote:


Hi Paul,

On 01.11.23 at 19:02, Paul Richard Thomas wrote:

The interpretation request came in a long time ago but I only just got
around to implementing it.

The updated text from the standard is in the comment. Now I am writing
this, I think that I should perhaps use switch(op)/case rather than using
if/else if and depending on the order of the gfc_intrinsic_op enum being
maintained. Thoughts?


the logic is likely harder to parse with if/else than with
switch(op)/case.  However, I do not think that the order of
the enum will ever be changed, as the module format relies
on that very order.
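
To make the trade-off concrete, here is a small sketch with a made-up enum (Op is hypothetical and is not gfc_intrinsic_op). The range test is correct only while the enumerators keep their relative order; the switch names each case and so does not depend on it.

```cpp
#include <cassert>

// Hypothetical operator enum, used only for this illustration.
enum class Op { Plus, Minus, Eq, Ne, Lt, Concat };

// Relies on Eq, Ne, Lt being contiguous and in this order.
bool is_comparison_by_range(Op op) {
  return op >= Op::Eq && op <= Op::Lt;
}

// Independent of enumerator order: each case is named explicitly.
bool is_comparison_by_switch(Op op) {
  switch (op) {
    case Op::Eq:
    case Op::Ne:
    case Op::Lt:
      return true;
    default:
      return false;
  }
}

// Checks that both classifiers agree for every enumerator.
bool classifiers_agree() {
  const Op all[] = {Op::Plus, Op::Minus, Op::Eq, Op::Ne, Op::Lt, Op::Concat};
  for (Op op : all)
    if (is_comparison_by_range(op) != is_comparison_by_switch(op))
      return false;
  return true;
}
```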


The testcase runs fine with both mainline and nagfor. I think that
compile-only with counts of star_eq and star_not should suffice.


I found other cases that are rejected even with your patch,
but which are accepted by nagfor.  Example:

 print *, ('a' == c)

Nagfor prints F at runtime as expected, as it correctly resolves
this to star_eq.  Further examples can be easily constructed.

Can you have a look?

Thanks,
Harald


Regtests with no regressions. OK for mainline?

Paul

Fortran: Defined operators with unlimited polymorphic args [PR98498]

2023-11-01  Paul Thomas  

gcc/fortran
PR fortran/98498
* interface.cc (upoly_ok): New function.
(gfc_extend_expr): Use new function to ensure that defined
operators using unlimited polymorphic formal arguments do not
override their intrinsic uses.

gcc/testsuite/
PR fortran/98498
* gfortran.dg/interface_50.f90: New test.

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Qing Zhao

Thanks a lot for raising these issues. 

If I understand correctly, the major question we need to answer is:

For the following example (Jakub mentioned this in an earlier message):

  1 struct S { int a; char b __attribute__((counted_by (a))) []; };
  2 struct S s;
  3 s.a = 5;
  4 char *p = &s.b[2];
  5 int i1 = __builtin_dynamic_object_size (p, 0);
  6 s.a = 3;
  7 int i2 = __builtin_dynamic_object_size (p, 0);

Should the 2nd __bdos call (line 7) get
A. the latest value of s.a (line 6) for its size?
Or B. the value when s.b was referenced (line 3, line 4)?

A should be more convenient for the user to use the dynamic array feature.
With B, the user has to modify the source code (to add code to “re-obtain” 
the pointer after the size was adjusted at line 6) as mentioned by Richard. 

This depends on how we design the new internal function .ACCESS_WITH_SIZE

1. Size is passed by value to .ACCESS_WITH_SIZE as we currently designed. 

PTR = .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)

2. Size is passed by reference to .ACCESS_WITH_SIZE as Jakub suggested.

PTR = .ACCESS_WITH_SIZE(PTR, &SIZE, TYPEOFSIZE, ACCESS_MODE)

With 1, we can only provide B; the user needs to modify the source code to get
the full dynamic array feature.
With 2, we can provide A; the user gets full support for the dynamic array
without restrictions in the source code.
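
The difference between the two designs can be modelled in plain C++, without the attribute or the internal function. This is a sketch under assumptions: ByValue, ByRef, bytes_left and the fixed-size array standing in for the flexible array member are all hypothetical. Capturing the size by value freezes it at the point of access (behaviour B), while keeping the address of the size field lets a later query see the updated count (behaviour A).

```cpp
#include <cassert>
#include <cstddef>

struct S {
  int a;      // element count
  char b[8];  // fixed-size stand-in for the flexible array member
};

// Design 1: the size is captured by value at the point of access.
struct ByValue { const char* base; std::size_t size; };
// Design 2: only the address of the size field is recorded.
struct ByRef { const char* base; const int* size_addr; };

// Bytes remaining from p to the end of an array of `size` bytes at `base`.
std::size_t bytes_left(const char* base, std::size_t size, const char* p) {
  assert(p >= base);
  const std::size_t offset = static_cast<std::size_t>(p - base);
  return offset < size ? size - offset : 0;
}

struct Sizes { std::size_t i1; std::size_t i2; };

Sizes run_by_value() {
  S s{};
  s.a = 5;
  ByValue acc{s.b, static_cast<std::size_t>(s.a)};
  const char* p = acc.base + 2;
  Sizes out{};
  out.i1 = bytes_left(acc.base, acc.size, p);
  s.a = 3;  // not observed: the size was copied earlier
  out.i2 = bytes_left(acc.base, acc.size, p);
  return out;
}

Sizes run_by_reference() {
  S s{};
  s.a = 5;
  ByRef acc{s.b, &s.a};
  const char* p = acc.base + 2;
  Sizes out{};
  out.i1 = bytes_left(acc.base, static_cast<std::size_t>(*acc.size_addr), p);
  s.a = 3;  // observed: the size is re-read through its address
  out.i2 = bytes_left(acc.base, static_cast<std::size_t>(*acc.size_addr), p);
  return out;
}
```

In this model run_by_value() gives 3 and 3, and run_by_reference() gives 3 and 1, matching the two outcomes discussed in this thread.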

However, we have to pay an additional cost to support A by using 2, which
includes:

1. .ACCESS_WITH_SIZE will become an escape point, which will further impact the
IPA optimizations and add runtime overhead.
Then .ACCESS_WITH_SIZE will not be CONST, right? But will it still be PURE?

2. __builtin_dynamic_object_size will NOT be LEAF anymore.  This will also
impact some IPA optimizations and add runtime overhead.

I think the following factors drive the decision:

1. How big is the performance impact?
2. How important is the dynamic array feature? Is adding some user restrictions,
as Richard mentioned, a feasible way to support this feature?

Maybe we can implement 1 first and, if full support for the dynamic array is
needed, add 2 later?
Or we can implement both, compare the performance difference, and then decide?

Qing

> On Nov 2, 2023, at 8:09 AM, Jakub Jelinek  wrote:
> 
> On Thu, Nov 02, 2023 at 12:52:50PM +0100, Richard Biener wrote:
>>> What I meant is to emit
>>> tmp_4 = .ACCESS_WITH_SIZE (&s.b[0], &s.a, (typeof (&s.a)) 0);
>>> p_5 = &tmp_4[2];
>>> i.e. don't associate the pointer with a value of the size, but with
>>> an address where to find the size (plus how large it is), basically escape
>>> pointer to the size at that point.  And __builtin_dynamic_object_size is 
>>> pure,
>>> so supposedly it can depend on what the escaped pointer points to.
>> 
>> Well, yeah - that would work but depend on .ACCESS_WITH_SIZE being an
>> escape point (quite bad IMHO)
> 
> That is why I've said we need to decide what cost we want to suffer because
> of that.
> 
>> and __builtin_dynamic_object_size being
>> non-const (that's probably not too bad).
> 
> It is already pure,leaf,nothrow (unlike __builtin_object_size which is 
> obviously
> const,leaf,nothrow).  Because under the hood, it can read memory when
> expanded.
> 
>>> We'd see that a particular pointer is size associated with &s.a address
>>> and would use that address cast to the type of the third argument (to
>>> preserve the exact pointer type on INTEGER_CST, though not sure, wouldn't
>>> VN CSE it anyway if one has say
>>> union U { struct S { int a; char b __attribute__((counted_by (a))) []; } s;
>>>  struct T { char c, d, e, f; char g __attribute__((counted_by (c))) 
>>> []; } t; };
>>> and
>>> .ACCESS_WITH_SIZE (&v.s.b[0], &v.s.a, (int *) 0);
>>> ...
>>> .ACCESS_WITH_SIZE (&v.t.g[0], &v.t.c, (int *) 0);
>>> ?
>> 
>> We'd probably CSE that - the usual issue of address-with-same-value.
>> 
>>> It would mean though that counted_by wouldn't be allowed to be a
>>> bit-field...
>> 
>> Yup.  We could also pass a pointer to the container though, that's good 
>> enough
>> for the escape, and pass the size by value in addition to that.
> 
> I was wondering about stuff like _BitInt.  But sure, counted_by is just an
> extension, we can just refuse counting by _BitInt in addition to counting by
> floating point, pointers, aggregates, bit-fields, or we could somehow encode
> all the needed type's properties numerically into an integral constant.
> Similarly for alias set (unless it uses 0 for reads).
> 
>   Jakub
>

Re: RFC: the proposal to resolve the missing dependency issue for counted_by attribute

2023-11-02 Thread Qing Zhao



> On Nov 2, 2023, at 7:52 AM, Richard Biener  wrote:
> 
> On Thu, Nov 2, 2023 at 11:40 AM Jakub Jelinek  wrote:
>> 
>> On Thu, Nov 02, 2023 at 11:18:09AM +0100, Richard Biener wrote:
>>>> Or, if we want to pay further price, .ACCESS_WITH_SIZE could take as one of
>>>> the arguments not the size value, but its address.  Then at __bdos time
>>>> we would dereference that pointer to get the size.
>>>> So,
>>>> struct S { int a; char b __attribute__((counted_by (a))) []; };
>>>> struct S s;
>>>> s.a = 5;
>>>> char *p = &s.b[2];
>>>> int i1 = __builtin_dynamic_object_size (p, 0);
>>>> s.a = 3;
>>>> int i2 = __builtin_dynamic_object_size (p, 0);
>>>> would then yield 3 and 1 rather than 3 and 3.
>>> 
>>> I fail to see how we can get the __builtin_dynamic_object_size call
>>> data dependent on s.a, thus avoid re-ordering or even DSE of the
>>> store.
>> 
>> If &s.b[2] is lowered as
>> sz_1 = s.a;
>> tmp_2 = .ACCESS_WITH_SIZE (&s.b[0], sz_1);
>> p_3 = &tmp_2[2];
>> then sure, there is no way, you get the size from that point.
>> tree-object-size.cc tracking then determines that in a particular
>> case the pointer is size associated with sz_1 and use that value
>> as the size (with the usual adjustments for pointer arithmetics and the
>> like).
>> 
>> What I meant is to emit
>> tmp_4 = .ACCESS_WITH_SIZE (&s.b[0], &s.a, (typeof (&s.a)) 0);
>> p_5 = &tmp_4[2];
>> i.e. don't associate the pointer with a value of the size, but with
>> an address where to find the size (plus how large it is), basically escape
>> pointer to the size at that point.  And __builtin_dynamic_object_size is 
>> pure,
>> so supposedly it can depend on what the escaped pointer points to.
> 
> Well, yeah - that would work but depend on .ACCESS_WITH_SIZE being an
> escape point (quite bad IMHO) and __builtin_dynamic_object_size being
> non-const (that's probably not too bad).
> 
>> We'd see that a particular pointer is size associated with &s.a address
>> and would use that address cast to the type of the third argument (to
>> preserve the exact pointer type on INTEGER_CST, though not sure, wouldn't
>> VN CSE it anyway if one has say
>> union U { struct S { int a; char b __attribute__((counted_by (a))) []; } s;
>>  struct T { char c, d, e, f; char g __attribute__((counted_by (c))) 
>> []; } t; };
>> and
>> .ACCESS_WITH_SIZE (&v.s.b[0], &v.s.a, (int *) 0);
>> ...
>> .ACCESS_WITH_SIZE (&v.t.g[0], &v.t.c, (int *) 0);
>> ?
> 
> We'd probably CSE that - the usual issue of address-with-same-value.
> 
>> It would mean though that counted_by wouldn't be allowed to be a
>> bit-field...
> 
> Yup.  We could also pass a pointer to the container though, that's good enough
> for the escape, and pass the size by value in addition to that.
Could you explain a little bit more here? Then the .ACCESS_WITH_SIZE will become

PTR = .ACCESS_WITH_SIZE (PTR, &PTR’s Container, SIZE, ACCESS_MODE)

??

> 
>>Jakub
>>
