Re: configure adds -std=gnu++11 to CXX variable

2024-05-27 Thread Florian Weimer via Gcc
* Paul Eggert:

> diff --git a/NEWS b/NEWS
> index 20dbc173..4ba8f3fe 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -16,6 +16,10 @@ GNU Autoconf NEWS - User visible changes.
>    C11 and later.  Programs can use AC_C_VARARRAYS and __STDC_NO_VLA__
>    to use VLAs if available.
>  
> +*** AC_PROG_CXX now prefers C++23, C++20, C++17, C++14 if available.
> +  Older code may need to be updated, as some older features of C++ are
> +  removed in later standards.
> +

Does this turn on experimental language modes by default?  That's
probably not what we want.

It would be better to have an option to raise the C++ mode to at least a
certain revision, and otherwise use the default.  The default is more
likely to be supported by system libraries, and it's a bit less likely
that programmers use experimental, known-to-be-buggy features (language
and library) by accident.

Thanks,
Florian



Re: Question about the SLP vectorizer failed to perform automatic vectorization in one case

2024-05-27 Thread Richard Biener via Gcc
On Sat, May 25, 2024 at 3:08 PM Hanke Zhang via Gcc  wrote:
>
> Hi,
> I'm trying to study the automatic vectorization optimization in GCC,
> but I found one case where the SLP vectorizer failed to do so.
>
> Here is the sample code: (also a simplified version of a function
> from the 625/525.x264 source code in SPEC CPU 2017)
>
> void pixel_sub_wxh(int16_t *diff, uint8_t *pix1, uint8_t *pix2) {
>   for (int y = 0; y < 4; y++) {
>     for (int x = 0; x < 4; x++)
>       diff[x + y * 4] = pix1[x] - pix2[x];
>     pix1 += 16;
>     pix2 += 32;

The issue is these increments: with only four uint8_t elements accessed,
we still want to fill up a vector's worth of them.

In the end we succeed with v4hi / v8qi but also peel for gaps even though
we handle the half-load case fine.

>   }
> }
>
> When I compiled with `-O3 -mavx2/-msse4.2`, SLP vectorizer failed to
> vectorize it, and I got the following message when adding
> `-fopt-info-vec-all`. (The inner loop will be unrolled)
>
> :6:21: optimized: loop vectorized using 8 byte vectors
> :6:21: optimized:  loop versioned for vectorization because of
> possible aliasing
> :5:6: note: vectorized 1 loops in function.

^^^

so you do see the vectorization as outlined above.

> :5:6: note: * Analysis failed with vector mode V8SI
> :5:6: note: * The result for vector mode V32QI would be the same
> :5:6: note: * Re-trying analysis with vector mode V16QI
> :5:6: note: * Analysis failed with vector mode V16QI
> :5:6: note: * Re-trying analysis with vector mode V8QI
> :5:6: note: * Analysis failed with vector mode V8QI
> :5:6: note: * Re-trying analysis with vector mode V4QI
> :5:6: note: * Analysis failed with vector mode V4QI
>
> If I manually use the type declaration provided by `immintrin.h` to
> rewrite the code, the code is as follows (which is what I hope the SLP
> vectorizer will be able to do):
>
> void pixel_sub_wxh_vec(int16_t *diff, uint8_t *pix1, uint8_t *pix2) {
>   for (int y = 0; y < 4; y++) {
>     __v4hi pix1_v = {pix1[0], pix1[1], pix1[2], pix1[3]};
>     __v4hi pix2_v = {pix2[0], pix2[1], pix2[2], pix2[3]};
>     __v4hi diff_v = pix1_v - pix2_v;
>     *(long long *)(diff + y * 4) = (long long)diff_v;

We kind-of do it this way, just

__v8qi pix1_v = {pix1[0], pix1[1], pix1[2], pix1[3], 0, 0, 0, 0};
...

and then unpack __v8qi low to v4hi.

And unfortunately the last two outer iterations are scalar because of the
gap issue.  There are some PRs about this, and I did start to work on improving
it, but I'm not sure this exact case is covered, so can you open a new bug report?

>     pix1 += 16;
>     pix2 += 32;
>   }
> }
>
> What I want to know is why the SLP vectorizer can't vectorize the code
> here, and what changes I need to make to the SLP vectorizer or the
> source code if I want it to do so.
>
> Thanks
> Hanke Zhang
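
For readers following along, here is a minimal sketch of the strategy Richard describes above: load only the four accessed bytes into a zero-padded 8-byte vector, widen the low half to 16-bit lanes, then subtract. It assumes the GCC/Clang vector extensions. The typedefs `v8qi` and `v4hi` are declared locally rather than taken from `immintrin.h`, and `load4_widen` and `versions_agree` are hypothetical helper names. This only illustrates the shape of the transformation; it is not what the vectorizer actually emits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// GCC/Clang vector extensions (an assumption; this is not standard C++).
typedef std::uint8_t v8qi __attribute__((vector_size(8)));  // 8 x uint8_t
typedef std::int16_t v4hi __attribute__((vector_size(8)));  // 4 x int16_t

// Scalar reference, as posted in the thread (const added to the inputs).
void pixel_sub_wxh(std::int16_t *diff, const std::uint8_t *pix1,
                   const std::uint8_t *pix2) {
  for (int y = 0; y < 4; y++) {
    for (int x = 0; x < 4; x++)
      diff[x + y * 4] = pix1[x] - pix2[x];
    pix1 += 16;
    pix2 += 32;
  }
}

// Hypothetical helper: "half load" of four bytes into a zeroed v8qi,
// then widen the low four lanes to int16_t. Never reads past p[3].
static v4hi load4_widen(const std::uint8_t *p) {
  v8qi bytes = {0, 0, 0, 0, 0, 0, 0, 0};
  std::memcpy(&bytes, p, 4);
  v4hi wide = {static_cast<std::int16_t>(bytes[0]),
               static_cast<std::int16_t>(bytes[1]),
               static_cast<std::int16_t>(bytes[2]),
               static_cast<std::int16_t>(bytes[3])};
  return wide;
}

// Hand-vectorized version: one 4-lane subtract and one 8-byte store per row.
void pixel_sub_wxh_vec(std::int16_t *diff, const std::uint8_t *pix1,
                       const std::uint8_t *pix2) {
  assert(diff && pix1 && pix2);
  for (int y = 0; y < 4; y++) {
    v4hi d = load4_widen(pix1) - load4_widen(pix2);
    std::memcpy(diff + y * 4, &d, sizeof d);
    pix1 += 16;
    pix2 += 32;
  }
}

// Hypothetical check: both versions agree on one deterministic input.
bool versions_agree() {
  std::uint8_t pix1[4 * 16], pix2[4 * 32];
  for (int i = 0; i < 4 * 16; i++) pix1[i] = static_cast<std::uint8_t>(i * 7 + 3);
  for (int i = 0; i < 4 * 32; i++) pix2[i] = static_cast<std::uint8_t>(i * 13 + 5);
  std::int16_t a[16], b[16];
  pixel_sub_wxh(a, pix1, pix2);
  pixel_sub_wxh_vec(b, pix1, pix2);
  return std::memcmp(a, b, sizeof a) == 0;
}
```

Because each row load touches exactly four bytes, no iteration needs to be peeled for gaps in this hand-written form, which is the behaviour the bug report asks the vectorizer to reach by itself.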


Re: configure adds -std=gnu++11 to CXX variable

2024-05-27 Thread Paul Eggert

On 2024-05-27 03:35, Florian Weimer wrote:

> Does this turn on experimental language modes by default?  That's
> probably not what we want.


What do C++ developers want these days? Autoconf should have a 
reasonable default, and C++11 is surely not a good default anymore.


It would be easy to discourage use of C++23 in the near future by using 
a stricter test, such as the attached patch (which I've not installed). 
Even GCC 14.1 fails the test in the new patch, so 'configure' will fall 
back on C++20. I hope GCC 15 will succeed on it but of course there's no 
guarantee. Although this new test covers a DR and is not specific to 
C++23 (and there seems to be some reluctance to implement the DR, I 
assume because it invalidates some older code), I expect any compiler 
passing both this and the __cplusplus>=202302 check would be good enough.


Would this patch be preferable to the current Autoconf master?



> It would be better to have an option to raise the C++ mode to at least a
> certain revision, and otherwise use the default.


That option is already available. For example, a builder who doesn't 
want C++23 can use './configure ac_cv_prog_cxx_cxx23=no', and a 
developer can discourage C++23 by putting ': ${ac_cv_prog_cxx_cxx23=no}' 
early in configure.ac.


As I mentioned earlier, I volunteered to document this sort of thing if 
Zack doesn't come up with something nicer soon.

From b2f28ce66ea1618b50e14085059ce512d7245300 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 27 May 2024 11:56:06 -0700
Subject: [PATCH] Add P1787R6 test to AC_PROG_CXX C++23 check

* lib/autoconf/c.m4 (_AC_CXX_CXX23_TEST_PROGRAM): Check more
carefully for C++23 support, by checking for P1787R6, which even
GCC 14.1 and Clang 18.1 have not implemented.
---
 lib/autoconf/c.m4 | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/lib/autoconf/c.m4 b/lib/autoconf/c.m4
index a0a2b487..157dcb12 100644
--- a/lib/autoconf/c.m4
+++ b/lib/autoconf/c.m4
@@ -2856,6 +2856,19 @@ AC_DEFUN([_AC_CXX_CXX23_TEST_PROGRAM],
 # error "Compiler does not advertise C++23 conformance"
 #endif
 
+/* Check support for P1787R6: Declarations and where to find them
+   .
+   See .  */
+template  struct A {
+  void f(int);
+  template  void f(U);
+};
+template  struct B {
+  template  struct C { };
+};
+template  class TT = T::C> struct E { };
+E > db;
+
 int
 main ()
 {
-- 
2.45.1



Re: configure adds -std=gnu++11 to CXX variable

2024-05-27 Thread Jakub Jelinek via Gcc
On Mon, May 27, 2024 at 12:04:40PM -0700, Paul Eggert wrote:
> On 2024-05-27 03:35, Florian Weimer wrote:
> > Does this turn on experimental language modes by default?  That's
> > probably not what we want.
> 
> What do C++ developers want these days? Autoconf should have a reasonable
> default, and C++11 is surely not a good default anymore.

Maybe respect the carefully chosen compiler default (unless explicitly
overridden in configure.ac)?

Jakub



Re: Question about the SLP vectorizer failed to perform automatic vectorization in one case

2024-05-27 Thread Hanke Zhang via Gcc
Hi Biener,

Thanks for your help!

I have already opened a bug report here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115252.

Thanks
Hanke Zhang



Re: configure adds -std=gnu++11 to CXX variable

2024-05-27 Thread Paul Eggert

On 2024-05-27 12:18, Jakub Jelinek wrote:

> Maybe respect the carefully chosen compiler default (unless explicitly
> overridden in configure.ac)?


Autoconf gave up on that idea long ago, as we had bad experiences with 
compiler defaults being chosen for the convenience of distro maintainers 
rather than for application developers and builders. Compilers were 
still defaulting to K&R long after that made little sense. It was a bit 
like what we're still experiencing with _FILE_OFFSET_BITS defaulting to 
32 on x86 GNU/Linux.


Using the compiler default puts you at the mercy of the distro. It's not 
always wrong to do that - which is why developers and builders should 
have an option - but it's not always right either.




Archaeology time: Help me identify these ancient OSes and vendors

2024-05-27 Thread Zack Weinberg via Gcc
I've been trying to fill in as many gaps as possible in the config.sub
test suite (and finding a whole bunch of actual bugs in the process).
I have a short list of inputs where the actual code to handle them is
incomplete or broken, there's nothing in config.guess to use as a clue,
and I don't know what the correct canonical system name should be.
gcc@ mailing list cc:ed because I know some of you have long memories.

These are probably all either vendor or OS names from the late 1980s or
early 1990s.  Can anyone help me fill out the following list of things
that ought to appear in testsuite/config-sub.data, if I knew what to
put in place of the question marks?

???-pc533        ???-pc533-???
???-sim          ???-sim-???
???-ultra        ???-ultra-???
???-unicom       ???-unicom-???
???-acis         ???-???-aos
???-triton       ???-???-sysv3
???-oss          ???-???-sysv3
???-storm-chaos  ???-???-???

n.b. "storm-chaos" is extra troublesome because (a) it's too generic
to search for, and (b) it's being treated as a $os value but it's got
a dash in the middle, i.e. the code to handle it never got updated
for four-part canonical system names.

Thanks for any hints you can provide.
zw


Re: configure adds -std=gnu++11 to CXX variable

2024-05-27 Thread Florian Weimer via Gcc
* Paul Eggert:

> On 2024-05-27 03:35, Florian Weimer wrote:
>> Does this turn on experimental language modes by default?  That's
>> probably not what we want.
>
> What do C++ developers want these days? Autoconf should have a
> reasonable default, and C++11 is surely not a good default anymore.

It's still a good default for GCC 5.

GCC developers will correct me, but I think the default C++ dialect is
updated to a newer version once the implementation is reasonably
complete and bugs have been ironed out.

This is different from the C front end, where it took close to 40 years
(from the introduction of void * into C) to activate type checking for
pointer types by default.

>> It would be better to have an option to raise the C++ mode to at least a
>> certain revision, and otherwise use the default.
>
> That option is already available. For example, a builder who doesn't
> want C++23 can use './configure ac_cv_prog_cxx_cxx23=no', and a
> developer can discourage C++23 by putting ':
> ${ac_cv_prog_cxx_cxx23=no}' early in configure.ac.

But that is not the same thing.  If a project uses C++14 constructs,
wouldn't it make sense to tell configure to try to get (likely
experimental) support for it if the compiler does not enable C++14 by
default?  And if the system is already at C++17, leave it at that?

Setting C++14 unconditionally could be incompatible with used system
libraries, which assume C++17 support because the distribution is aware
that the system compiler supports C++17.

Thanks,
Florian
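
As an illustration only, the "raise to at least a certain revision, otherwise keep the default" policy Florian argues for can be sketched as a small decision function. The `__cplusplus` values are the ones fixed by the published standards. The names `effective_dialect` and `needs_std_flag` are hypothetical and do not correspond to anything in Autoconf.

```cpp
#include <cassert>

// Values of __cplusplus defined by each published C++ standard.
enum : long {
  CXX11 = 201103L,
  CXX14 = 201402L,
  CXX17 = 201703L,
  CXX20 = 202002L,
  CXX23 = 202302L
};

// Hypothetical policy helper (not Autoconf code): the dialect a build
// would end up with. The compiler default is kept unless it is older
// than the minimum revision the project asks for.
long effective_dialect(long compiler_default, long project_floor) {
  assert(compiler_default > 0 && project_floor > 0);
  return compiler_default >= project_floor ? compiler_default : project_floor;
}

// True only when a -std= option would have to be added at all.
bool needs_std_flag(long compiler_default, long project_floor) {
  return compiler_default < project_floor;
}
```

Under this policy a project requiring C++14 leaves a C++17-default compiler untouched, so it stays compatible with system libraries built for that default, and only an older compiler gets an explicit dialect option.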