Re: [PATCH] openmp: Fix up gomp_affinity_init_numa_domains
Hi Jakub! On 2022-03-17T17:16:04+0100, Jakub Jelinek wrote: > On Thu, Nov 11, 2021 at 02:14:05PM +0100, Thomas Schwinge wrote: >> There appears to be yet another issue: there still are quite a number of >> 'FAIL: libgomp.c/places-10.c execution test' reports on >> . Also in my testing testing, on a system >> where '/sys/devices/system/node/online' contains '0-1', I get a FAIL: >> >> [...] >> OPENMP DISPLAY ENVIRONMENT BEGIN >> _OPENMP = '201511' >> OMP_DYNAMIC = 'FALSE' >> OMP_NESTED = 'FALSE' >> OMP_NUM_THREADS = '8' >> OMP_SCHEDULE = 'DYNAMIC' >> OMP_PROC_BIND = 'TRUE' >> OMP_PLACES = '{0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30},{FAIL: >> libgomp.c/places-10.c execution test > > I've finally managed to debug this (by dumping used /sys/ files from > an affected system in Fedora build system, replacing /sys/ with /tmp/ > in gcc sources and populating there those files), I think following patch > ought to fix it. > I'll test it tonight in my bootstrap/regtest (but that's a non-numa box), > can somebody with a box where places-10.c fails reliably test this too? Thanks, confirmed to resolve all the 'libgomp.c/places-10.c' FAILs that I've seen on different systems. Grüße Thomas > 2022-03-17 Jakub Jelinek > > * config/linux/affinity.c (gomp_affinity_init_numa_domains): Move seen > variable next to pl variable. > > --- libgomp/config/linux/affinity.c.jj2022-01-11 23:11:23.887269117 > +0100 > +++ libgomp/config/linux/affinity.c 2022-03-17 17:05:38.129008653 +0100 > @@ -411,11 +411,11 @@ gomp_affinity_init_numa_domains (unsigne > { > char *p = line; > void *pl = NULL; > + bool seen = false; > > while (*p && *p != '\n') > { > unsigned long first, last; > - bool seen = false; > > errno = 0; > first = strtoul (p, &end, 10); > > > Jakub - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Re: [PATCH] x86: Correct march=sapphirerapids to base on icelake server
On Fri, Mar 18, 2022 at 11:32 AM Cui,Lili wrote: > > Hi Hongtao, > > This patch is to correct march=sapphirerapids to base on icelake server. > and update sapphirerapids in the documentation. > > OK for master and backport to GCC 11? Ok. > > > gcc/Changelog: > > PR target/104963 > * config/i386/i386.h (PTA_SAPPHIRERAPIDS): change it to base on ICX. > * doc/invoke.texi: Update documents for Intel sapphirerapids. > > gcc/testsuite/ChangeLog > > PR target/104963 > * gcc.target/i386/pr104963.c: New test case. > --- > gcc/config/i386/i386.h | 5 +++-- > gcc/doc/invoke.texi | 11 ++- > gcc/testsuite/gcc.target/i386/pr104963.c | 12 > 3 files changed, 21 insertions(+), 7 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr104963.c > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h > index 37b523cea4f..b92955177fe 100644 > --- a/gcc/config/i386/i386.h > +++ b/gcc/config/i386/i386.h > @@ -2323,10 +2323,11 @@ constexpr wide_int_bitmask PTA_ICELAKE_SERVER = > PTA_ICELAKE_CLIENT >| PTA_PCONFIG | PTA_WBNOINVD | PTA_CLWB; > constexpr wide_int_bitmask PTA_TIGERLAKE = PTA_ICELAKE_CLIENT | PTA_MOVDIRI >| PTA_MOVDIR64B | PTA_CLWB | PTA_AVX512VP2INTERSECT | PTA_KL | PTA_WIDEKL; > -constexpr wide_int_bitmask PTA_SAPPHIRERAPIDS = PTA_COOPERLAKE | PTA_MOVDIRI > +constexpr wide_int_bitmask PTA_SAPPHIRERAPIDS = PTA_ICELAKE_SERVER | > PTA_MOVDIRI >| PTA_MOVDIR64B | PTA_AVX512VP2INTERSECT | PTA_ENQCMD | PTA_CLDEMOTE >| PTA_PTWRITE | PTA_WAITPKG | PTA_SERIALIZE | PTA_TSXLDTRK | PTA_AMX_TILE > - | PTA_AMX_INT8 | PTA_AMX_BF16 | PTA_UINTR | PTA_AVXVNNI | PTA_AVX512FP16; > + | PTA_AMX_INT8 | PTA_AMX_BF16 | PTA_UINTR | PTA_AVXVNNI | PTA_AVX512FP16 > + | PTA_AVX512BF16; > constexpr wide_int_bitmask PTA_KNL = PTA_BROADWELL | PTA_AVX512PF >| PTA_AVX512ER | PTA_AVX512F | PTA_AVX512CD | PTA_PREFETCHWT1; > constexpr wide_int_bitmask PTA_BONNELL = PTA_CORE2 | PTA_MOVBE; > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index d65979bba3f..59baa5e5747 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -31288,11 +31288,12 @@ AVX512VP2INTERSECT and KEYLOCKER instruction set > support. > Intel sapphirerapids CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, > SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF, FXSR, AVX, XSAVE, PCLMUL, > FSGSBASE, > RDRND, F16C, AVX2, BMI, BMI2, LZCNT, FMA, MOVBE, HLE, RDSEED, ADCX, > PREFETCHW, > -AES, CLFLUSHOPT, XSAVEC, XSAVES, SGX, AVX512F, CLWB, AVX512VL, AVX512BW, > -AVX512DQ, AVX512CD, AVX512VNNI, AVX512BF16 MOVDIRI, MOVDIR64B, > -AVX512VP2INTERSECT, ENQCMD, CLDEMOTE, PTWRITE, WAITPKG, SERIALIZE, TSXLDTRK, > -UINTR, AMX-BF16, AMX-TILE, AMX-INT8, AVX-VNNI and AVX512FP16 instruction set > -support. > +AES, CLFLUSHOPT, XSAVEC, XSAVES, SGX, AVX512F, AVX512VL, AVX512BW, AVX512DQ, > +AVX512CD, PKU, AVX512VBMI, AVX512IFMA, SHA, AVX512VNNI, GFNI, VAES, > AVX512VBMI2 > +VPCLMULQDQ, AVX512BITALG, RDPID, AVX512VPOPCNTDQ, PCONFIG, WBNOINVD, CLWB, > +MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, ENQCMD, CLDEMOTE, PTWRITE, WAITPKG, > +SERIALIZE, TSXLDTRK, UINTR, AMX-BF16, AMX-TILE, AMX-INT8, AVX-VNNI and > +AVX512FP16 instruction set support. > > @item alderlake > Intel Alderlake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, > SSSE3, > diff --git a/gcc/testsuite/gcc.target/i386/pr104963.c > b/gcc/testsuite/gcc.target/i386/pr104963.c > new file mode 100644 > index 000..19000671ebf > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr104963.c > @@ -0,0 +1,12 @@ > +/* PR target/104963 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -march=sapphirerapids" } */ > + > +#include > + > +__m512i > +foo (__m512i a, __m512i b) > +{ > +return _mm512_permutexvar_epi8(a, b); > +} > + > -- > 2.17.1 > > Thanks. -- BR, Hongtao
RE: [PATCH] Ignore (possible) signed zeros in operands of FP comparisons.
Hi Jeff/Andrew, > If you're going to do more work in this space, you might want to reach out to > Aldy and Andrew to see if there's space for collaboration. One (clever?) suggestion that I do have for ranger would be to add support for an additional value_range_kind, VR_NONZEROBITS, which would be a variant of VR_RANGE (for unsigned types?) and require very few changes to the existing code. Just like VR_RANGE all values would lie in [MIN, MAX], so by default treating this value_range_kind identically to VR_RANGE there should be no visible changes, but the change in semantics is that MIN has the minimum bits set, and MAX, the maximum bits set [equivalent to the RVAL and RMASK pairs from CCP's bit_value_{bin,un}op]. Hence, the VR_NONZEROBITS range [2,7] would represent the possible values {2, 3, 6, 7} rather than {2, 3, 4, 5, 6, 7}. For a small number of bits, int_range can already handle this with multiple irange spans, but adding this representation would allow the unification of the bit-based propagation performed in tree-ssa-ccp with the range-value based propagation performed in EVRP/ranger, allowing the clever forwards/backwards functionality. As Andrew's recent (partial) review points out, tracking the effect of operations like BIT_XOR_EXPR on VR_RANGE is much more complicated than on the proposed VR_NONZEROBITS. Alas, I'm not the sort of contributor to make large infrastructure changes myself, but if the above functionality were in place, I/the compiler would be able to make use of it. Cheers, Roger -- > -Original Message- > From: Jeff Law > Sent: 17 March 2022 23:28 > To: Roger Sayle ; 'Richard Biener' > > Cc: 'GCC Patches' > Subject: Re: [PATCH] Ignore (possible) signed zeros in operands of FP > comparisons. > > > On 3/15/2022 2:03 AM, Roger Sayle wrote: > >> -Original Message- > >> From: Richard Biener > >> Sent: 15 March 2022 07:29 > >> To: Roger Sayle > >> Cc: GCC Patches > >> Subject: Re: [PATCH] Ignore (possible) signed zeros in operands of FP > >> comparisons. > >> > >> On Mon, Mar 14, 2022 at 8:26 PM Roger Sayle > >> wrote: > >>> > >>> I've been wondering about the possible > >>> performance/missed-optimization impact of my patch for PR > >>> middle-end/98420 and similar IEEE correctness fixes that disable > >>> constant folding optimizations when worrying > >> about -0.0. > >>> In the common situation where the floating point result is used by a > >>> FP comparison, there's no distinction between +0.0 and -0.0, so some > >>> HONOR_SIGNED_ZEROS optimizations that we'd usually disable, are safe. > >>> > >>> Consider the following interesting example: > >>> > >>> int foo(int x, double y) { > >>> return (x * 0.0) < y; > >>> } > >>> > >>> Although we know that x (when converted to double) can't be NaN or > >>> Inf, we still worry that for negative values of x that (x * 0.0) may > >>> be -0.0 and so perform the multiplication at run-time. But in this > >>> case, the result of the comparison (-0.0 < y) will be exactly the > >>> same as (+0.0 < y) for any y, hence the above may be safely constant > >>> folded to "0.0 < > >> y" > >>> avoiding the multiplication at run-time. > >>> > >>> This patch has been tested on x86_64-pc-linux-gnu with make > >>> bootstrap and make -k check with no new failures, and allows GCC to > >>> continue to optimize cases that we optimized in GCC 11 (without regard to > correctness). > >>> Ok for mainline? > >> Isn't that something that gimple-ssa-backprop.c is designed to > >> handle? I wonder if you can see whether the signed zero speciality can be > retrofitted there? > >> It currently tracks "sign does not matter", so possibly another > >> state, "sign of zero does not matter" could be introduced there. > > Two questions. Would adding tracking of "sign of zero does not matter" > > to gimple-ssa-backprop.c be suitable for stage4? Secondly, even if > > gimple-ssa-backprop.c performed this kind of optimization, would that > > be a reason not to support these transformations in match.pd? Perhaps > > someone could open a missed optimization PR for backprop in Bugzilla, > > but the above patch still needs to be reviewed on its own merits. > > Can't see how it's appropriate for stage4, but definitely interesting for > gcc-13. > > It'd fit well into some of the Ranger plans too -- Aldy and Andrew have been > talking about tracking the special FP values in Ranger. This is related, > though > not exactly the same since rather than tracking the special value, you're > tracking > if those special values actually matter. If you're going to do more work in > this > space, you might want to reach out to Aldy and Andrew to see if there's space > for collaboration. > > > > > > Speaking of tree-ssa passes that could be improved, I was wondering > > whether you could review my EVRP patch to fix regression PR/102950. Pretty > please? > > https://gcc.gnu.org/pipermail/gcc-patches/2022-February/589569.html > > I forwarded this to Al
[PATCH] [avx512fp16] Refine HImode movement for "v" to "v".
Set attr from HImode to HFmode which uses vmovsh instead of vmovw for movment between sse registers. Bootstrapped and regstested on x86_64-pc-linux-gnu{-m32,}. Ok for main trunk? gcc/ChangeLog: PR target/104974 * config/i386/i386.md (*movhi_internal): Set attr type from HI to HF for alternative 12 under TARGET_AVX512FP16. gcc/testsuite/ChangeLog: * gcc.target/i386/pr104974.c: New test. --- gcc/config/i386/i386.md | 2 +- gcc/testsuite/gcc.target/i386/pr104974.c | 12 2 files changed, 13 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr104974.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index c8fbf605e41..42aa0e1d998 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2658,7 +2658,7 @@ (define_insn "*movhi_internal" (const_string "TI")) (eq_attr "alternative" "12") (cond [(match_test "TARGET_AVX512FP16") - (const_string "HI") + (const_string "HF") (match_test "TARGET_AVX") (const_string "TI") (ior (not (match_test "TARGET_SSE2")) diff --git a/gcc/testsuite/gcc.target/i386/pr104974.c b/gcc/testsuite/gcc.target/i386/pr104974.c new file mode 100644 index 000..d1f2b1a9722 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr104974.c @@ -0,0 +1,12 @@ +/* { dg-do assemble { target avx512fp16 } } */ +/* { dg-options "-mavx512fp16 -O2" } */ + +short +foo (short a) +{ + register short b __asm ("%xmm1") = a; + asm volatile ("": "+v" (b)); + register short c __asm ("%xmm2") = b; + asm volatile ("": "+v" (c)); + return a; +} -- 2.18.1
[PATCH] Avoid a warning of overflow
This patch avoid a warning of "c-ada-spec.cc:1660:34: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=]" when build GCC. gcc/c-family/ * c-ada-spec.cc: Change array length --- gcc/c-family/c-ada-spec.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/c-family/c-ada-spec.cc b/gcc/c-family/c-ada-spec.cc index 149d336ee96..aeb429136b6 100644 --- a/gcc/c-family/c-ada-spec.cc +++ b/gcc/c-family/c-ada-spec.cc @@ -1579,7 +1579,7 @@ dump_ada_function_declaration (pretty_printer *buffer, tree func, tree type = TREE_TYPE (func); tree arg = TYPE_ARG_TYPES (type); tree t; - char buf[17]; + char buf[18]; int num, num_args = 0, have_args = true, have_ellipsis = false; /* Compute number of arguments. */ -- 2.18.1
Re: [PATCH 3/3] RISC-V:Cache Management Operation instructions testcases
On Fri, Mar 18, 2022 at 7:58 AM Kito Cheng wrote: > I would suggest rename those __builtin_riscv_* to > __builtin_riscv_cmo_*, that's less confusing, __builtin_riscv_zero > just seems like it will return a zero value. > You meant cbo_zero, right? CMO was only the task-group name, but the extensions ended up having "cbo" in their name… On Fri, Mar 4, 2022 at 10:52 AM wrote: > > > > From: yulong-plct > > > > This commit adds testcases about CMO instructions. > > 7 > > 8 gcc/testsuite/ChangeLog: > > 9 > > 10 * gcc.target/riscv/cmo-zicbom-1.c: New test. > > 11 * gcc.target/riscv/cmo-zicbom-2.c: New test. > > 12 * gcc.target/riscv/cmo-zicbop-1.c: New test. > > 13 * gcc.target/riscv/cmo-zicbop-2.c: New test. > > 14 * gcc.target/riscv/cmo-zicboz-1.c: New test. > > 15 * gcc.target/riscv/cmo-zicboz-2.c: New test. > > > > --- > > gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c | 21 + > > gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c | 21 + > > gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c | 23 +++ > > gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c | 23 +++ > > gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c | 9 > > gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c | 9 > > 6 files changed, 106 insertions(+) > > create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c > > > > diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c > b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c > > new file mode 100644 > > index 000..16935ff3d31 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c > > @@ -0,0 +1,21 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-march=rv64gc_zicbom -mabi=lp64" } */ > > + > > +int foo1() > > +{ > > +return __builtin_riscv_clean(); > > +} > > + > > +int foo2() > > +{ > > +return __builtin_riscv_flush(); > > +} > > + > > +int foo3() > > +{ > > +return __builtin_riscv_inval(); > > +} > > + > > +/* { dg-final { scan-assembler-times "cbo.clean" 1 } } */ > > +/* { dg-final { scan-assembler-times "cbo.flush" 1 } } */ > > +/* { dg-final { scan-assembler-times "cbo.inval" 1 } } */ > > \ No newline at end of file > > diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c > b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c > > new file mode 100644 > > index 000..fc14f2b9c2b > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c > > @@ -0,0 +1,21 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-march=rv32gc_zicbom -mabi=ilp32" } */ > > + > > +int foo1() > > +{ > > +return __builtin_riscv_clean(); > > +} > > + > > +int foo2() > > +{ > > +return __builtin_riscv_flush(); > > +} > > + > > +int foo3() > > +{ > > +return __builtin_riscv_inval(); > > +} > > + > > +/* { dg-final { scan-assembler-times "cbo.clean" 1 } } */ > > +/* { dg-final { scan-assembler-times "cbo.flush" 1 } } */ > > +/* { dg-final { scan-assembler-times "cbo.inval" 1 } } */ > > \ No newline at end of file > > diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c > b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c > > new file mode 100644 > > index 000..b8bac2e8c51 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c > > @@ -0,0 +1,23 @@ > > +/* { dg-do compile target { { rv64-*-*}}} */ > > +/* { dg-options "-march=rv64gc_zicbop -mabi=lp64" } */ > > + > > +void foo (char *p) > > +{ > > + __builtin_prefetch (p, 0, 0); > > + __builtin_prefetch (p, 0, 1); > > + __builtin_prefetch (p, 0, 2); > > + __builtin_prefetch (p, 0, 3); > > + __builtin_prefetch (p, 1, 0); > > + __builtin_prefetch (p, 1, 1); > > + __builtin_prefetch (p, 1, 2); > > + __builtin_prefetch (p, 1, 3); > > +} > > + > > +int foo1() > > +{ > > + return __builtin_riscv_prefetchi(1); > > +} > > + > > +/* { dg-final { scan-assembler-times "prefetch.i" 1 } } */ > > +/* { dg-final { scan-assembler-times "prefetch.r" 4 } } */ > > +/* { dg-final { scan-assembler-times "prefetch.w" 4 } } */ > > \ No newline at end of file > > diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c > b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c > > new file mode 100644 > > index 000..5ace6e2b349 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c > > @@ -0,0 +1,23 @@ > > +/* { dg-do compile target { { rv32-*-*}}} */ > > +/* { dg-options "-march=rv32gc_zicbop -mabi=ilp32" } */ > > + > > +void foo (char *p) > > +{ > > + __builtin_prefetch (p, 0, 0); > > + __builtin_prefetch (p, 0, 1); > > + __builtin_prefetch (p, 0, 2); > > + __builtin_prefetc
Re: [PATCH] libstdc++: Reduce header dependencies from PSTL headers [PR92546]
On Thu, 17 Mar 2022 at 20:44, Thomas Rodgers wrote: > > Looks ok to me. I just am curious, does the change to src/c++17/fs_path.cc > need to be part of this change (It's not obvious to me that it is related to > the other changes in the patch). It's related. fs_path.cc uses std::array but was not including directly, it was accidentally relying on it being there via . After this change, doesn't include , so doesn't include and so fs_path.cc needs to be fixed.
Re: [PATCH] [avx512fp16] Refine HImode movement for "v" to "v".
On Fri, Mar 18, 2022 at 9:18 AM liuhongt wrote: > > Set attr from HImode to HFmode which uses vmovsh instead of vmovw for > movment between sse registers. > > Bootstrapped and regstested on x86_64-pc-linux-gnu{-m32,}. > Ok for main trunk? > > gcc/ChangeLog: > > PR target/104974 > * config/i386/i386.md (*movhi_internal): Set attr type from HI > to HF for alternative 12 under TARGET_AVX512FP16. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr104974.c: New test. OK. Thanks, Uros. > --- > gcc/config/i386/i386.md | 2 +- > gcc/testsuite/gcc.target/i386/pr104974.c | 12 > 2 files changed, 13 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr104974.c > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > index c8fbf605e41..42aa0e1d998 100644 > --- a/gcc/config/i386/i386.md > +++ b/gcc/config/i386/i386.md > @@ -2658,7 +2658,7 @@ (define_insn "*movhi_internal" > (const_string "TI")) > (eq_attr "alternative" "12") > (cond [(match_test "TARGET_AVX512FP16") > - (const_string "HI") > + (const_string "HF") > (match_test "TARGET_AVX") >(const_string "TI") > (ior (not (match_test "TARGET_SSE2")) > diff --git a/gcc/testsuite/gcc.target/i386/pr104974.c > b/gcc/testsuite/gcc.target/i386/pr104974.c > new file mode 100644 > index 000..d1f2b1a9722 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr104974.c > @@ -0,0 +1,12 @@ > +/* { dg-do assemble { target avx512fp16 } } */ > +/* { dg-options "-mavx512fp16 -O2" } */ > + > +short > +foo (short a) > +{ > + register short b __asm ("%xmm1") = a; > + asm volatile ("": "+v" (b)); > + register short c __asm ("%xmm2") = b; > + asm volatile ("": "+v" (c)); > + return a; > +} > -- > 2.18.1 >
Re: [PATCH 3/3] RISC-V:Cache Management Operation instructions testcases
> You meant cbo_zero, right? > CMO was only the task-group name, but the extensions ended up having "cbo" > in their name… Yeah, named with an extension name makes more sense, thank you for pointing that out. Either __builtin_riscv_cbo_zero or __builtin_riscv_zicboz_cbo_zero is fine to me since I expect those builtin are used for compiler internal only, and used to implement intrinsics in future.
Re: [PATCH] libstdc++: Reduce header dependencies from PSTL headers [PR92546]
On Fri, 18 Mar 2022 at 10:16, Jonathan Wakely wrote: > > On Thu, 17 Mar 2022 at 20:44, Thomas Rodgers wrote: > > > > Looks ok to me. I just am curious, does the change to src/c++17/fs_path.cc > > need to be part of this change (It's not obvious to me that it is related > > to the other changes in the patch). > > It's related. fs_path.cc uses std::array but was not including > directly, it was accidentally relying on it being there via > . After this change, doesn't include > , so doesn't include and so fs_path.cc needs to be > fixed. Pushed to trunk now.
Re: [committed] libstdc++: Fix symbol versioning for Solaris 11.3 [PR103407]
Hi Jonathan, > Tested x86_64-linux and sparc-sun-solaris2.11 (but 11.3 only). > > Pushed to trunk. > > Rainer, this should allow you to continue omitting the > _ZSt10from_charsPKcS0_R[def]St12chars_format symbols from the baseline, > without the current FAIL. Please check on your other Solaris targets. I've run bootstraps on i386-pc-solaris2.11 (Solaris 11.3 and 11.4) and sparc-sun-solaris2.11 (Solaris 11.4) last night: the abi_check failures are gone everywhere. Thanks a lot for your patience with this issue. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [committed] libstdc++: Fix symbol versioning for Solaris 11.3 [PR103407]
On Fri, 18 Mar 2022 at 11:49, Rainer Orth wrote: > > Hi Jonathan, > > > Tested x86_64-linux and sparc-sun-solaris2.11 (but 11.3 only). > > > > Pushed to trunk. > > > > Rainer, this should allow you to continue omitting the > > _ZSt10from_charsPKcS0_R[def]St12chars_format symbols from the baseline, > > without the current FAIL. Please check on your other Solaris targets. > > I've run bootstraps on i386-pc-solaris2.11 (Solaris 11.3 and 11.4) and > sparc-sun-solaris2.11 (Solaris 11.4) last night: the abi_check failures > are gone everywhere. Hooray! I did some very brief testing and it seemed like a program linked to the Solaris 11.3 libstdc++.so.6.0.30 (with from_chars@GLIBCXX_3.4.30) can still run against libstdc++.so.6.0.30 with from_chars@GLIBCXX_3.4.29 (which should match what you get on Solaris 11.4 if I correctly fiddled with the versioning). So I don't understand how the Solaris runtime linker handles symbol versions, but it seems like there's no backwards compatibility problem for the Solaris 11.4 build of libstdc++.so.6.0.30.
[PATCH][openmp] Set location for taskloop stmts
Hi, The test-case included in this patch contains: ... #pragma omp taskloop simd shared(a) lastprivate(myId) ... This is translated to 3 taskloop statements in gimple, visible with -fdump-tree-gimple: ... #pragma omp taskloop private(D.2124) #pragma omp taskloop shared(a) shared(myId) private(i.0) firstprivate(a_h) #pragma omp taskloop lastprivate(myId) ... But when exposing the gimple statement locations using -fdump-tree-gimple-lineno, we find that only the first one has location information. Fix this by adding the missing location information. Tested gomp.exp on x86_64. Tested libgomp testsuite on x86_64 with nvptx accelerator. OK for trunk? Thanks, - Tom [openmp] Set location for taskloop stmts gcc/ChangeLog: 2022-03-18 Tom de Vries * gimplify.cc (gimplify_omp_for): Set taskloop location. gcc/testsuite/ChangeLog: 2022-03-18 Tom de Vries * c-c++-common/gomp/pr104968.c: New test. --- gcc/gimplify.cc| 2 ++ gcc/testsuite/c-c++-common/gomp/pr104968.c | 14 ++ 2 files changed, 16 insertions(+) diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 139a0de6100..c46589639e4 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -13178,6 +13178,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) gfor = gimple_build_omp_for (for_body, kind, OMP_FOR_CLAUSES (orig_for_stmt), TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)), for_pre_body); + gimple_set_location (gfor, EXPR_LOCATION (*expr_p)); if (orig_for_stmt != for_stmt) gimple_omp_for_set_combined_p (gfor, true); if (gimplify_omp_ctxp @@ -13361,6 +13362,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) g = gimple_build_bind (NULL_TREE, gfor, NULL_TREE); g = gimple_build_omp_task (g, task_clauses, NULL_TREE, NULL_TREE, NULL_TREE, NULL_TREE, NULL_TREE); + gimple_set_location (g, EXPR_LOCATION (*expr_p)); gimple_omp_task_set_taskloop_p (g, true); g = gimple_build_bind (NULL_TREE, g, NULL_TREE); gomp_for *gforo diff --git a/gcc/testsuite/c-c++-common/gomp/pr104968.c b/gcc/testsuite/c-c++-common/gomp/pr104968.c new file mode 100644 index 000..2977db2f433 --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/pr104968.c @@ -0,0 +1,14 @@ +/* { dg-additional-options "-fdump-tree-gimple-lineno" } */ + +int +main (void) +{ + double a[10], a_h[10]; + int myId = -1; +#pragma omp target map(tofrom:a) +#pragma omp taskloop simd shared(a) lastprivate(myId) /* { dg-line here } */ +for(int i = 0 ; i < 10; i++) if (a[i] != a_h[i]) { } +} + +/* { dg-final { scan-tree-dump-times "#pragma omp taskloop" 3 "gimple" } } */ +/* { dg-final { scan-tree-dump-times "(?n)\\\[.*pr104968.c:[get-absolute-line '' here]:.*\\\] #pragma omp taskloop" 3 "gimple" } } */
Re: [committed] libstdc++: Fix symbol versioning for Solaris 11.3 [PR103407]
Hi Jonathan, > I did some very brief testing and it seemed like a program linked to > the Solaris 11.3 libstdc++.so.6.0.30 (with from_chars@GLIBCXX_3.4.30) > can still run against libstdc++.so.6.0.30 with > from_chars@GLIBCXX_3.4.29 (which should match what you get on Solaris > 11.4 if I correctly fiddled with the versioning). So I don't indeed. You can observe the symbols provided and consumed by shared objects and executables with pvs. E.g. on 11.4: $ pvs -dsvo libstdc++.so|grep _ZSt10from_charsPKcS0_R libstdc++.so - GLIBCXX_3.4.29: _ZSt10from_charsPKcS0_RfSt12chars_format; libstdc++.so - GLIBCXX_3.4.29: _ZSt10from_charsPKcS0_ReSt12chars_format; libstdc++.so - GLIBCXX_3.4.29: _ZSt10from_charsPKcS0_RdSt12chars_format; for the provider side vs. 11.3: $ pvs -dsvo libstdc++.so|grep _ZSt10from_charsPKcS0_R libstdc++.so - GLIBCXX_3.4.30: _ZSt10from_charsPKcS0_ReSt12chars_format; libstdc++.so - GLIBCXX_3.4.30: _ZSt10from_charsPKcS0_RdSt12chars_format; libstdc++.so - GLIBCXX_3.4.30: _ZSt10from_charsPKcS0_RfSt12chars_format; pvs -r shows symbols and versions required by an executable. > understand how the Solaris runtime linker handles symbol versions, but > it seems like there's no backwards compatibility problem for the > Solaris 11.4 build of libstdc++.so.6.0.30. You can observe this at runtime with LD_DEBUG=versions or versions,detail. LD_DEBUG=help gives the full info. IIRC Solaris ld.so.1 just checks if the versions required by an executable are provided by a shared object, but doesn't look into individual symbols in advance. It may well be that some checks have been relaxed in the 11.4 timeframe, though. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [committed] libstdc++: Fix symbol versioning for Solaris 11.3 [PR103407]
On Fri, 18 Mar 2022 at 12:47, Rainer Orth wrote: > > Hi Jonathan, > > > I did some very brief testing and it seemed like a program linked to > > the Solaris 11.3 libstdc++.so.6.0.30 (with from_chars@GLIBCXX_3.4.30) > > can still run against libstdc++.so.6.0.30 with > > from_chars@GLIBCXX_3.4.29 (which should match what you get on Solaris > > 11.4 if I correctly fiddled with the versioning). So I don't > > indeed. You can observe the symbols provided and consumed by shared > objects and executables with pvs. > > E.g. on 11.4: > > $ pvs -dsvo libstdc++.so|grep _ZSt10from_charsPKcS0_R > libstdc++.so - GLIBCXX_3.4.29: _ZSt10from_charsPKcS0_RfSt12chars_format; > libstdc++.so - GLIBCXX_3.4.29: _ZSt10from_charsPKcS0_ReSt12chars_format; > libstdc++.so - GLIBCXX_3.4.29: _ZSt10from_charsPKcS0_RdSt12chars_format; > > for the provider side vs. 11.3: > > $ pvs -dsvo libstdc++.so|grep _ZSt10from_charsPKcS0_R > libstdc++.so - GLIBCXX_3.4.30: _ZSt10from_charsPKcS0_ReSt12chars_format; > libstdc++.so - GLIBCXX_3.4.30: _ZSt10from_charsPKcS0_RdSt12chars_format; > libstdc++.so - GLIBCXX_3.4.30: _ZSt10from_charsPKcS0_RfSt12chars_format; > > pvs -r shows symbols and versions required by an executable. > > > understand how the Solaris runtime linker handles symbol versions, but > > it seems like there's no backwards compatibility problem for the > > Solaris 11.4 build of libstdc++.so.6.0.30. > > You can observe this at runtime with LD_DEBUG=versions or > versions,detail. LD_DEBUG=help gives the full info. > > IIRC Solaris ld.so.1 just checks if the versions required by an > executable are provided by a shared object, but doesn't look into > individual symbols in advance. It may well be that some checks have > been relaxed in the 11.4 timeframe, though. Ah that would explain it. The libstdc++.so built on Solaris 11.4 has the std::from_chars symbols and it has version GLIBCXX_3.4.30, so that satisfies the requirements for a program linked against the libstdc++.so on Solaris 11.3.
Re: [PATCH][openmp] Set location for taskloop stmts
On Fri, Mar 18, 2022 at 01:44:00PM +0100, Tom de Vries wrote: > The test-case included in this patch contains: > ... > #pragma omp taskloop simd shared(a) lastprivate(myId) > ... > > This is translated to 3 taskloop statements in gimple, visible with > -fdump-tree-gimple: > ... > #pragma omp taskloop private(D.2124) > #pragma omp taskloop shared(a) shared(myId) private(i.0) firstprivate(a_h) > #pragma omp taskloop lastprivate(myId) > ... > > But when exposing the gimple statement locations using > -fdump-tree-gimple-lineno, we find that only the first one has location > information. > > Fix this by adding the missing location information. > > Tested gomp.exp on x86_64. > > Tested libgomp testsuite on x86_64 with nvptx accelerator. And for NVPTX we somehow lower the taskloop into GIMPLE_ASM or how we end up ICEing? No objection against doing that, but if we do it, we should probably do it for all or at least most gimple_build_omp_* calls, not just these 2. So in gimplify_omp_parallel, gimplify_omp_task, another spot in gimplify_omp_for beyond these 2, gimplify_omp_workshare (ideally just in one spot for all the cases), gimplify_omp_target_update, gimplify_omp_atomic, gimplify_omp_ordered, gimplify_expr's case OMP_* that call gimple_build_omp_*. Or is it normally handled using if (!gimple_seq_empty_p (internal_post)) { annotate_all_with_location (internal_post, input_location); gimplify_seq_add_seq (pre_p, internal_post); } and we just need to catch the cases where we gimplify something into multiple nested stmts because annotate_all_with_location doesn't walk into gimple_omp_body? Jakub
Re: [PATCH] Ignore (possible) signed zeros in operands of FP comparisons.
On 3/18/22 03:43, Roger Sayle wrote: Hi Jeff/Andrew, If you're going to do more work in this space, you might want to reach out to Aldy and Andrew to see if there's space for collaboration. One (clever?) suggestion that I do have for ranger would be to add support for an additional value_range_kind, VR_NONZEROBITS, which would be a variant of VR_RANGE (for unsigned types?) and require very few changes to the existing I think were ahead of you here.. Tracking known zero and one bits within irange as an adjunct has been in plan for awhile, just priorities haven't allowed us to get to it until recently... Theres a bunch of stuff already in the hopper for the next stage1 that Aldy has been working with... he can expound upon it, but we plan to use both masks and ranges together as appropriate. code. Just like VR_RANGE all values would lie in [MIN, MAX], so by default treating this value_range_kind identically to VR_RANGE there should be no visible changes, but the change in semantics is that MIN has the minimum bits set, and MAX, the maximum bits set [equivalent to the RVAL and RMASK pairs from CCP's bit_value_{bin,un}op]. Hence, the VR_NONZEROBITS range [2,7] would represent the possible values {2, 3, 6, 7} rather than {2, 3, 4, 5, 6, 7}. For a small number of bits, int_range can already handle this with multiple irange spans, but adding this representation would allow the unification of the bit-based propagation performed in tree-ssa-ccp with the range-value based propagation performed in EVRP/ranger, allowing the clever forwards/backwards functionality. As Andrew's recent (partial) review points out, tracking the effect of operations like BIT_XOR_EXPR on VR_RANGE is much more complicated than on the proposed VR_NONZEROBITS. Alas, I'm not the sort of contributor to make large infrastructure changes myself, but if the above functionality were in place, I/the compiler would be able to make use of it. And this is exactly what we are hoping.. we provide the structure and someone who is better at the underlying detail interaction can flesh out whatever specifics they find interesting. Andrew
Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]
> > If we have to go this way, I think it’s better to make the change you > suggested above, > and then also update the documentation, both internal documentation on > how to define > the hook and the user level documentation on what the user might > expect when using > this option (i.e, it’s possible that the compiler might clear more > registers than the user > requests on some targets due to the implementation limitation). > > I can make this change if we decide to do this. I'd vote for this. Richard? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
[Patch] Fortran/OpenMP: Improve associate-name diagnostic [PR103039]
This patch addresses a side issue found when looking at PR103039. Namely instead of printing: 55 | !$omp parallel firstprivate(tt) | 1 Error: ASSOCIATE name ‘__tmp_INTEGER_4’ in FIRSTPRIVATE clause at (1) With the patch, the error is: Error: Associate name ‘tt’ in FIRSTPRIVATE clause at (1) That is: It prints the proper name and it uses 'associate name' matching the Fortran standard – and takes into account that an associate name not only used with ASSOCIATE but also with SELECT TYPE, SELECT RANK, and (untested) CHANGE TEAMS. OK for mainline? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 Fortran/OpenMP: Improve associate-name diagnostic [PR103039] gcc/fortran/ChangeLog: PR fortran/103039 * openmp.cc (resolve_omp_clauses): Improve associate-name diagnostic for select type/rank. gcc/testsuite/ChangeLog: PR fortran/103039 * gfortran.dg/gomp/associate1.f90: Update dg-error. * gfortran.dg/gomp/associate2.f90: New test. gcc/fortran/openmp.cc | 12 +++-- gcc/testsuite/gfortran.dg/gomp/associate1.f90 | 40 +++--- gcc/testsuite/gfortran.dg/gomp/associate2.f90 | 76 +++ 3 files changed, 104 insertions(+), 24 deletions(-) diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index 16cd03a3d67..714148138c2 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -6782,8 +6782,10 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, gfc_error ("Cray pointee %qs in SHARED clause at %L", n->sym->name, &n->where); if (n->sym->attr.associate_var) - gfc_error ("ASSOCIATE name %qs in SHARED clause at %L", - n->sym->name, &n->where); + gfc_error ("Associate name %qs in SHARED clause at %L", + n->sym->attr.select_type_temporary + ? n->sym->assoc->target->symtree->n.sym->name + : n->sym->name, &n->where); if (omp_clauses->detach && n->sym == omp_clauses->detach->symtree->n.sym) gfc_error ("DETACH event handle %qs in SHARED clause at %L", @@ -7163,8 +7165,10 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, gfc_error ("Cray pointee %qs in %s clause at %L", n->sym->name, name, &n->where); if (n->sym->attr.associate_var) - gfc_error ("ASSOCIATE name %qs in %s clause at %L", - n->sym->name, name, &n->where); + gfc_error ("Associate name %qs in %s clause at %L", + n->sym->attr.select_type_temporary + ? n->sym->assoc->target->symtree->n.sym->name + : n->sym->name, name, &n->where); if (list != OMP_LIST_PRIVATE && is_reduction) { if (n->sym->attr.proc_pointer) diff --git a/gcc/testsuite/gfortran.dg/gomp/associate1.f90 b/gcc/testsuite/gfortran.dg/gomp/associate1.f90 index abc5ae95a0d..a44099e005f 100644 --- a/gcc/testsuite/gfortran.dg/gomp/associate1.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/associate1.f90 @@ -16,65 +16,65 @@ program associate1 j = 2 associate(k => v, l => a(i, j), m => a(i, :)) associate(n => b(j)%c(:, :)%i, o => a, p => b) -!$omp parallel shared (l) ! { dg-error "ASSOCIATE name" } +!$omp parallel shared (l) ! { dg-error "Associate name" } !$omp end parallel -!$omp parallel firstprivate (m) ! { dg-error "ASSOCIATE name" } +!$omp parallel firstprivate (m) ! { dg-error "Associate name" } !$omp end parallel -!$omp parallel reduction (+: k) ! { dg-error "ASSOCIATE name" } +!$omp parallel reduction (+: k) ! { dg-error "Associate name" } !$omp end parallel -!$omp parallel do firstprivate (k) ! { dg-error "ASSOCIATE name" } +!$omp parallel do firstprivate (k) ! { dg-error "Associate name" } do i = 1, 10 end do -!$omp parallel do lastprivate (n) ! { dg-error "ASSOCIATE name" } +!$omp parallel do lastprivate (n) ! { dg-error "Associate name" } do i = 1, 10 end do -!$omp parallel do private (o) ! { dg-error "ASSOCIATE name" } +!$omp parallel do private (o) ! { dg-error "Associate name" } do i = 1, 10 end do -!$omp parallel do shared (p) ! { dg-error "ASSOCIATE name" } +!$omp parallel do shared (p) ! { dg-error "Associate name" } do i = 1, 10 end do -!$omp task private (k) ! { dg-error "ASSOCIATE name" } +!$omp task private (k) ! { dg-error "Associate name" } !$omp end task -!$omp task shared (l) ! { dg-error "ASSOCIATE name" } +!$omp task shared (l) ! { dg-error "Associate name" } !$omp end task -!$omp task firstprivate (m) ! { dg-error "ASSOCIATE name" } +!$omp task firstprivate (m) ! { dg-error "Associate name" } !$omp end task -!$omp do private (l) ! { dg-error "ASSOCIATE name" } +!$omp do private (l) ! { dg-error "Associate name" } do i = 1, 10 end do -!$omp do reduction (*: k) ! { dg-error "ASSOCIATE name" } +!$omp do reduction (*: k) ! { dg-error "Associate name
Re: [PATCH] Ignore (possible) signed zeros in operands of FP comparisons.
On 3/17/22 19:27, Jeff Law via Gcc-patches wrote: On 3/15/2022 2:03 AM, Roger Sayle wrote: -Original Message- From: Richard Biener Sent: 15 March 2022 07:29 To: Roger Sayle Cc: GCC Patches Subject: Re: [PATCH] Ignore (possible) signed zeros in operands of FP comparisons. On Mon, Mar 14, 2022 at 8:26 PM Roger Sayle wrote: I've been wondering about the possible performance/missed-optimization impact of my patch for PR middle-end/98420 and similar IEEE correctness fixes that disable constant folding optimizations when worrying about -0.0. In the common situation where the floating point result is used by a FP comparison, there's no distinction between +0.0 and -0.0, so some HONOR_SIGNED_ZEROS optimizations that we'd usually disable, are safe. Consider the following interesting example: int foo(int x, double y) { return (x * 0.0) < y; } Although we know that x (when converted to double) can't be NaN or Inf, we still worry that for negative values of x that (x * 0.0) may be -0.0 and so perform the multiplication at run-time. But in this case, the result of the comparison (-0.0 < y) will be exactly the same as (+0.0 < y) for any y, hence the above may be safely constant folded to "0.0 < y" avoiding the multiplication at run-time. I'm going to hazard a guess that this can be handled in the upcoming floating point range support? there was a start of a conversation in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24021 about a month ago. I know *zero* about floating point, but It seems like when we track floating point ranges, we will naturally be able to integrate _2 = _1 * 0.0; _3 = _2 < y_5(D); that _2 evaluates to +/- 0.0 and when we do look at (_2 < y_5) that VRP can simplify that to 0.0 < y? (or any patch which uses simplification and ranges). It seems like it should be straightforward anyway. Andrew
Re: [Patch] Fortran/OpenMP: Improve associate-name diagnostic [PR103039]
On Fri, Mar 18, 2022 at 02:15:11PM +0100, Tobias Burnus wrote: > This patch addresses a side issue found when looking at PR103039. > > Namely instead of printing: > >55 | !$omp parallel firstprivate(tt) > | 1 > Error: ASSOCIATE name ‘__tmp_INTEGER_4’ in FIRSTPRIVATE clause at (1) > > With the patch, the error is: > > Error: Associate name ‘tt’ in FIRSTPRIVATE clause at (1) > > That is: It prints the proper name and it uses 'associate name' > matching the Fortran standard – and takes into account that an > associate name not only used with ASSOCIATE but also with > SELECT TYPE, SELECT RANK, and (untested) CHANGE TEAMS. > > OK for mainline? LGTM, thanks. > Fortran/OpenMP: Improve associate-name diagnostic [PR103039] > > gcc/fortran/ChangeLog: > > PR fortran/103039 > * openmp.cc (resolve_omp_clauses): Improve associate-name diagnostic > for select type/rank. > > gcc/testsuite/ChangeLog: > > PR fortran/103039 > * gfortran.dg/gomp/associate1.f90: Update dg-error. > * gfortran.dg/gomp/associate2.f90: New test. > > gcc/fortran/openmp.cc | 12 +++-- > gcc/testsuite/gfortran.dg/gomp/associate1.f90 | 40 +++--- > gcc/testsuite/gfortran.dg/gomp/associate2.f90 | 76 > +++ > 3 files changed, 104 insertions(+), 24 deletions(-) Jakub
Re: [PATCH][openmp] Set location for taskloop stmts
On 3/18/22 14:01, Jakub Jelinek wrote: On Fri, Mar 18, 2022 at 01:44:00PM +0100, Tom de Vries wrote: The test-case included in this patch contains: ... #pragma omp taskloop simd shared(a) lastprivate(myId) ... This is translated to 3 taskloop statements in gimple, visible with -fdump-tree-gimple: ... #pragma omp taskloop private(D.2124) #pragma omp taskloop shared(a) shared(myId) private(i.0) firstprivate(a_h) #pragma omp taskloop lastprivate(myId) ... But when exposing the gimple statement locations using -fdump-tree-gimple-lineno, we find that only the first one has location information. Fix this by adding the missing location information. Tested gomp.exp on x86_64. Tested libgomp testsuite on x86_64 with nvptx accelerator. And for NVPTX we somehow lower the taskloop into GIMPLE_ASM or how we end up ICEing? In the nvptx backend, gen_comment (triggering not very frequently atm) uses gen_rtx_ASM_INPUT_loc with as location argument DECL_SOURCE_LOCATION (cfun->decl). If this location is UNKNOWN_LOCATION, we run into an ICE, which is fixed by the proposed patch "[final] Handle compiler-generated asm insn" ( https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590721.html ). As for the openmp test-case, we end up lowering at least one of those taskloops into an outlined function, and if its location is UNKNOWN_LOCATION and gen_comment is triggered in the body, we run into the ICE. [ My preferred solution is to have "[final] Handle compiler-generated asm insn" approved and committed, but no response sofar, maybe ignored for not being stage-4 material, I'm not sure. Alternatively, if there's a better way to get some random valid location than DECL_SOURCE_LOCATION (cfun->decl), that would also work for me. ] No objection against doing that, but if we do it, we should probably do it for all or at least most gimple_build_omp_* calls, not just these 2. So in gimplify_omp_parallel, gimplify_omp_task, another spot in gimplify_omp_for beyond these 2, gimplify_omp_workshare (ideally just in one spot for all the cases), gimplify_omp_target_update, gimplify_omp_atomic, gimplify_omp_ordered, gimplify_expr's case OMP_* that call gimple_build_omp_*. Or is it normally handled using if (!gimple_seq_empty_p (internal_post)) { annotate_all_with_location (internal_post, input_location); gimplify_seq_add_seq (pre_p, internal_post); } and we just need to catch the cases where we gimplify something into multiple nested stmts because annotate_all_with_location doesn't walk into gimple_omp_body? I can try to update the patch to take care of these additional cases. I reckon answering the questions that you're asking requires writing test-cases for all of these. Thanks, - Tom
[committed][openmp] Fix SIMT reduction using TRUTH_{AND,OR}IF_EXPR
Hi, Consider test-case pr104952-1.c, included in this commit, containing: ... #pragma omp target map(tofrom:result) map(to:arr) #pragma omp simd reduction(||: result) ... When run on x86_64 with nvptx accelerator, the test-case either aborts or hangs. The reduction clause is translated by the SIMT code (active for nvptx) as a butterfly reduction loop with this butterfly shuffle / update pair: ... D.2163 = D.2163 || .GOMP_SIMT_XCHG_BFLY (D.2163, D.2164) ... in the loop body. The problem is that the butterfly shuffle is possibly not executed, while it needs to be executed unconditionally. Fix this by translating instead as: ... D.tmp_bfly = .GOMP_SIMT_XCHG_BFLY (D.2163, D.2164) D.2163 = D.2163 || D.tmp_bfly ... Tested on x86_64-linux with nvptx accelerator. Committed to trunk. Thanks, - Tom [openmp] Fix SIMT reduction using TRUTH_{AND,OR}IF_EXPR gcc/ChangeLog: 2022-03-17 Tom de Vries PR target/104952 * omp-low.cc (lower_rec_input_clauses): Make sure GOMP_SIMT_XCHG_BFLY is executed unconditionally. libgomp/ChangeLog: 2022-03-17 Tom de Vries PR target/104952 * testsuite/libgomp.c/pr104952-1.c: New test. * testsuite/libgomp.c/pr104952-2.c: New test. --- gcc/omp-low.cc | 5 - libgomp/testsuite/libgomp.c/pr104952-1.c | 24 libgomp/testsuite/libgomp.c/pr104952-2.c | 22 ++ 3 files changed, 50 insertions(+), 1 deletion(-) diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index cfc63d6a104..392bb18bc5d 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -6743,7 +6743,10 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, x = build_call_expr_internal_loc (UNKNOWN_LOCATION, IFN_GOMP_SIMT_XCHG_BFLY, TREE_TYPE (ivar), 2, ivar, simt_lane); - x = build2 (code, TREE_TYPE (ivar), ivar, x); + /* Make sure x is evaluated unconditionally. */ + tree bfly_var = create_tmp_var (TREE_TYPE (ivar)); + gimplify_assign (bfly_var, x, &llist[2]); + x = build2 (code, TREE_TYPE (ivar), ivar, bfly_var); gimplify_assign (ivar, x, &llist[2]); } tree ivar2 = ivar; diff --git a/libgomp/testsuite/libgomp.c/pr104952-1.c b/libgomp/testsuite/libgomp.c/pr104952-1.c new file mode 100644 index 000..a3bfb1e77df --- /dev/null +++ b/libgomp/testsuite/libgomp.c/pr104952-1.c @@ -0,0 +1,24 @@ +#define N 32 + +static char arr[N]; + +int +main (void) +{ + unsigned int result = 0; + + for (unsigned int i = 0; i < N; ++i) +arr[i] = 0; + + arr[5] = 42; + +#pragma omp target map(tofrom:result) map(to:arr) +#pragma omp simd reduction(||: result) + for (unsigned int i = 0; i < N; ++i) +result = result || arr[i]; + + if (result != 1) +__builtin_abort (); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.c/pr104952-2.c b/libgomp/testsuite/libgomp.c/pr104952-2.c new file mode 100644 index 000..7ab4bcdb8af --- /dev/null +++ b/libgomp/testsuite/libgomp.c/pr104952-2.c @@ -0,0 +1,22 @@ +#define N 32 + +static char arr[N]; + +int +main (void) +{ + unsigned int result = 2; + + for (unsigned int i = 0; i < N; ++i) +arr[i] = i + 1; + +#pragma omp target map(tofrom:result) map(to:arr) +#pragma omp simd reduction(&&: result) + for (unsigned int i = 0; i < N; ++i) +result = result && arr[i]; + + if (result != 1) +__builtin_abort (); + + return 0; +}
Re: [PATCH][openmp] Set location for taskloop stmts
On Fri, Mar 18, 2022 at 03:42:48PM +0100, Tom de Vries wrote: > > And for NVPTX we somehow lower the taskloop into GIMPLE_ASM > > or how we end up ICEing? > > > > In the nvptx backend, gen_comment (triggering not very frequently atm) uses > gen_rtx_ASM_INPUT_loc with as location argument DECL_SOURCE_LOCATION > (cfun->decl). Ok. > Alternatively, if there's a better way to get some random valid location > than DECL_SOURCE_LOCATION (cfun->decl), that would also work for me. ] > > > No objection against doing that, but if we do it, we should probably do it > > for all or at least most gimple_build_omp_* calls, not just these 2. > > So in gimplify_omp_parallel, gimplify_omp_task, another spot in > > gimplify_omp_for beyond these 2, gimplify_omp_workshare (ideally just > > in one spot for all the cases), gimplify_omp_target_update, > > gimplify_omp_atomic, gimplify_omp_ordered, gimplify_expr's > > case OMP_* that call gimple_build_omp_*. > > Or is it normally handled using > >if (!gimple_seq_empty_p (internal_post)) > > { > >annotate_all_with_location (internal_post, input_location); > >gimplify_seq_add_seq (pre_p, internal_post); > > } > > and we just need to catch the cases where we gimplify something into > > multiple nested stmts because annotate_all_with_location doesn't > > walk into gimple_omp_body? > > I can try to update the patch to take care of these additional cases. > > I reckon answering the questions that you're asking requires writing > test-cases for all of these. Actually, in the light of annotate_all_with_location annotating the newly generated sequence except for the stmts in nested contexts I think only the two spots you have in your patch is what needs adjusting. But I'd do it only when actually dealing with a OMP_TASKLOOP, so both in the spot of your second hunk and for consistency with the annotate_all_with_location do there (pseudo patch): + gimple_set_location (gfor, input_location); g = gimple_build_bind (NULL_TREE, gfor, NULL_TREE); g = gimple_build_omp_task (g, task_clauses, NULL_TREE, NULL_TREE, NULL_TREE, NULL_TREE, NULL_TREE); gimple_omp_task_set_taskloop_p (g, true); + gimple_set_location (g, input_location); g = gimple_build_bind (NULL_TREE, g, NULL_TREE); gomp_for *gforo = gimple_build_omp_for (g, GF_OMP_FOR_KIND_TASKLOOP, outer_for_clauses, gimple_omp_for_collapse (gfor), gimple_omp_for_pre_body (gfor)); gimple_omp_for_set_pre_body (gfor, NULL); gimple_omp_for_set_combined_p (gforo, true); gimple_omp_for_set_combined_into_p (gfor, true); In theory we could do it for the gimple_build_bind results too, but we don't do that in other spots where we gimple_build_bind in OpenMP/OpenACC related gimplification. Ok for trunk with those tweaks. Jakub
Re: [PATCH][openmp] Set location for taskloop stmts
On 3/18/22 15:56, Jakub Jelinek wrote: On Fri, Mar 18, 2022 at 03:42:48PM +0100, Tom de Vries wrote: And for NVPTX we somehow lower the taskloop into GIMPLE_ASM or how we end up ICEing? In the nvptx backend, gen_comment (triggering not very frequently atm) uses gen_rtx_ASM_INPUT_loc with as location argument DECL_SOURCE_LOCATION (cfun->decl). Ok. Alternatively, if there's a better way to get some random valid location than DECL_SOURCE_LOCATION (cfun->decl), that would also work for me. ] No objection against doing that, but if we do it, we should probably do it for all or at least most gimple_build_omp_* calls, not just these 2. So in gimplify_omp_parallel, gimplify_omp_task, another spot in gimplify_omp_for beyond these 2, gimplify_omp_workshare (ideally just in one spot for all the cases), gimplify_omp_target_update, gimplify_omp_atomic, gimplify_omp_ordered, gimplify_expr's case OMP_* that call gimple_build_omp_*. Or is it normally handled using if (!gimple_seq_empty_p (internal_post)) { annotate_all_with_location (internal_post, input_location); gimplify_seq_add_seq (pre_p, internal_post); } and we just need to catch the cases where we gimplify something into multiple nested stmts because annotate_all_with_location doesn't walk into gimple_omp_body? I can try to update the patch to take care of these additional cases. I reckon answering the questions that you're asking requires writing test-cases for all of these. Actually, in the light of annotate_all_with_location annotating the newly generated sequence except for the stmts in nested contexts I think only the two spots you have in your patch is what needs adjusting. But I'd do it only when actually dealing with a OMP_TASKLOOP, so both in the spot of your second hunk and for consistency with the annotate_all_with_location do there (pseudo patch): + gimple_set_location (gfor, input_location); g = gimple_build_bind (NULL_TREE, gfor, NULL_TREE); g = gimple_build_omp_task (g, task_clauses, NULL_TREE, NULL_TREE, NULL_TREE, NULL_TREE, NULL_TREE); gimple_omp_task_set_taskloop_p (g, true); + gimple_set_location (g, input_location); g = gimple_build_bind (NULL_TREE, g, NULL_TREE); gomp_for *gforo = gimple_build_omp_for (g, GF_OMP_FOR_KIND_TASKLOOP, outer_for_clauses, gimple_omp_for_collapse (gfor), gimple_omp_for_pre_body (gfor)); gimple_omp_for_set_pre_body (gfor, NULL); gimple_omp_for_set_combined_p (gforo, true); gimple_omp_for_set_combined_into_p (gfor, true); In theory we could do it for the gimple_build_bind results too, but we don't do that in other spots where we gimple_build_bind in OpenMP/OpenACC related gimplification. Ok for trunk with those tweaks. Ack, committed (in two steps though, I accidentally first committed the old patch). Thanks, - Tom
[pushed] testsuite, modules, Darwin: Adjust expected output for older OS versions.
Darwin versions <= 10 (macOS 10.6) emit different diagnostics for the failure case being tested by bad-mapper-1.C. Adjust the dg- expressions to reflect this. tested on powerpc,i686-darwin9, x86-64-darwin10,17,20 powerpc64le,powerpc64,x86_64-linux-gnu, pushed to master, thanks Iain Signed-off-by: Iain Sandoe gcc/testsuite/ChangeLog: * g++.dg/modules/bad-mapper-1.C: Make dg- expressions that match the diagnostics output by earlier Darwin too. --- gcc/testsuite/g++.dg/modules/bad-mapper-1.C | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/g++.dg/modules/bad-mapper-1.C b/gcc/testsuite/g++.dg/modules/bad-mapper-1.C index 7ed75b824b4..6d0ed4b5895 100644 --- a/gcc/testsuite/g++.dg/modules/bad-mapper-1.C +++ b/gcc/testsuite/g++.dg/modules/bad-mapper-1.C @@ -1,6 +1,9 @@ // { dg-additional-options "-fmodules-ts -fmodule-mapper=|this-will-not-work" } import unique1.bob; -// { dg-error "-:failed exec.*mapper.* .*this-will-not-work" "" { target *-*-* } 0 } +// { dg-error "-:failed exec.*mapper.* .*this-will-not-work" "" { target { ! { *-*-darwin[89]* *-*-darwin10* } } } 0 } // { dg-prune-output "fatal error:" } // { dg-prune-output "failed to read" } // { dg-prune-output "compilation terminated" } +// { dg-error "-:failed mapper handshake communication" "" { target { *-*-darwin[89]* *-*-darwin10* } } 0 } +// { dg-prune-output "trying to exec .this-will-not-work." } +// { dg-prune-output "unknown Compiled Module Interface" } -- 2.24.3 (Apple Git-128)
[committed] libstdc++: Simplify constraints for std::any construction [PR104242]
Tested powerpc64le-linux, pushed to trunk, -- >8 -- Partially revert r12-4190-g6da36b7d0e43b6f9281c65c19a025d4888a25b2d because using __and_<..., is_copy_constructible> when T is incomplete results in an error about deriving from is_copy_constructible when that is incomplete. I don't know how to fix that, so this simply restores the previous constraint which worked in this case (even though I think it's technically undefined to use is_copy_constructible with incomplete T). This doesn't restore exactly what we had before, but uses the is_copy_constructible_v and __is_in_place_type_v variable templates instead of the ::value member. libstdc++-v3/ChangeLog: PR libstdc++/104242 * include/std/any (any(T&&)): Revert change to constraints. * testsuite/20_util/any/cons/104242.cc: New test. --- libstdc++-v3/include/std/any | 4 ++-- libstdc++-v3/testsuite/20_util/any/cons/104242.cc | 12 2 files changed, 14 insertions(+), 2 deletions(-) create mode 100644 libstdc++-v3/testsuite/20_util/any/cons/104242.cc diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any index bdad76239b4..a6770e8f235 100644 --- a/libstdc++-v3/include/std/any +++ b/libstdc++-v3/include/std/any @@ -185,8 +185,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION /// Construct with a copy of @p __value as the contained object. template , typename _Mgr = _Manager<_VTp>, - typename = _Require<__not_<__is_in_place_type<_VTp>>, - is_copy_constructible<_VTp>>> + enable_if_t + && !__is_in_place_type_v<_VTp>, bool> = true> any(_Tp&& __value) : _M_manager(&_Mgr::_S_manage) { diff --git a/libstdc++-v3/testsuite/20_util/any/cons/104242.cc b/libstdc++-v3/testsuite/20_util/any/cons/104242.cc new file mode 100644 index 000..8d5868b7ff9 --- /dev/null +++ b/libstdc++-v3/testsuite/20_util/any/cons/104242.cc @@ -0,0 +1,12 @@ +// { dg-do compile { target c++17 } } + +// PR libstdc++/104242 - Class with constructor from std::any is not copyable + +#include +#include + +struct A { +A(const A&) = default; +explicit A(std::any value); +}; +static_assert(std::is_copy_constructible_v); -- 2.34.1
Re: [committed] libstdc++: Simplify constraints for std::any construction [PR104242]
Oops, I meant to change the commit summary line to Un-simplify, since it was reverting the "Simplify constaints ..." commit. On Fri, 18 Mar 2022 at 15:43, Jonathan Wakely via Libstdc++ wrote: > > Tested powerpc64le-linux, pushed to trunk, > > -- >8 -- > > Partially revert r12-4190-g6da36b7d0e43b6f9281c65c19a025d4888a25b2d > because using __and_<..., is_copy_constructible> when T is incomplete > results in an error about deriving from is_copy_constructible when > that is incomplete. I don't know how to fix that, so this simply > restores the previous constraint which worked in this case (even though > I think it's technically undefined to use is_copy_constructible with > incomplete T). This doesn't restore exactly what we had before, but uses > the is_copy_constructible_v and __is_in_place_type_v variable templates > instead of the ::value member. > > libstdc++-v3/ChangeLog: > > PR libstdc++/104242 > * include/std/any (any(T&&)): Revert change to constraints. > * testsuite/20_util/any/cons/104242.cc: New test. > --- > libstdc++-v3/include/std/any | 4 ++-- > libstdc++-v3/testsuite/20_util/any/cons/104242.cc | 12 > 2 files changed, 14 insertions(+), 2 deletions(-) > create mode 100644 libstdc++-v3/testsuite/20_util/any/cons/104242.cc > > diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any > index bdad76239b4..a6770e8f235 100644 > --- a/libstdc++-v3/include/std/any > +++ b/libstdc++-v3/include/std/any > @@ -185,8 +185,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > /// Construct with a copy of @p __value as the contained object. > template , > typename _Mgr = _Manager<_VTp>, > - typename = _Require<__not_<__is_in_place_type<_VTp>>, > - is_copy_constructible<_VTp>>> > + enable_if_t > + && !__is_in_place_type_v<_VTp>, bool> = true> >any(_Tp&& __value) >: _M_manager(&_Mgr::_S_manage) >{ > diff --git a/libstdc++-v3/testsuite/20_util/any/cons/104242.cc > b/libstdc++-v3/testsuite/20_util/any/cons/104242.cc > new file mode 100644 > index 000..8d5868b7ff9 > --- /dev/null > +++ b/libstdc++-v3/testsuite/20_util/any/cons/104242.cc > @@ -0,0 +1,12 @@ > +// { dg-do compile { target c++17 } } > + > +// PR libstdc++/104242 - Class with constructor from std::any is not copyable > + > +#include > +#include > + > +struct A { > +A(const A&) = default; > +explicit A(std::any value); > +}; > +static_assert(std::is_copy_constructible_v); > -- > 2.34.1 >
Re: [PATCH] Ignore (possible) signed zeros in operands of FP comparisons.
On 3/18/2022 7:16 AM, Andrew MacLeod wrote: On 3/17/22 19:27, Jeff Law via Gcc-patches wrote: On 3/15/2022 2:03 AM, Roger Sayle wrote: -Original Message- From: Richard Biener Sent: 15 March 2022 07:29 To: Roger Sayle Cc: GCC Patches Subject: Re: [PATCH] Ignore (possible) signed zeros in operands of FP comparisons. On Mon, Mar 14, 2022 at 8:26 PM Roger Sayle wrote: I've been wondering about the possible performance/missed-optimization impact of my patch for PR middle-end/98420 and similar IEEE correctness fixes that disable constant folding optimizations when worrying about -0.0. In the common situation where the floating point result is used by a FP comparison, there's no distinction between +0.0 and -0.0, so some HONOR_SIGNED_ZEROS optimizations that we'd usually disable, are safe. Consider the following interesting example: int foo(int x, double y) { return (x * 0.0) < y; } Although we know that x (when converted to double) can't be NaN or Inf, we still worry that for negative values of x that (x * 0.0) may be -0.0 and so perform the multiplication at run-time. But in this case, the result of the comparison (-0.0 < y) will be exactly the same as (+0.0 < y) for any y, hence the above may be safely constant folded to "0.0 < y" avoiding the multiplication at run-time. I'm going to hazard a guess that this can be handled in the upcoming floating point range support? there was a start of a conversation in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24021 about a month ago. I know *zero* about floating point, but It seems like when we track floating point ranges, we will naturally be able to integrate _2 = _1 * 0.0; _3 = _2 < y_5(D); that _2 evaluates to +/- 0.0 and when we do look at (_2 < y_5) that VRP can simplify that to 0.0 < y? (or any patch which uses simplification and ranges). It seems like it should be straightforward anyway. Yea, I guess we'd pick it up that way and that's probably cleaner than what I was originally thinking in this space. We realize that 2 is +-0.0. Then we realize that for the comparison, we can constant propagate +0.0 for _2 since +0.0 and -0.0 behave the same way. Ideally that removes the last reference to _2 and we DCE away the multiplication. jeff
Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]
Xi Ruoyao writes: >> >> If we have to go this way, I think it’s better to make the change you >> suggested above, >> and then also update the documentation, both internal documentation on >> how to define >> the hook and the user level documentation on what the user might >> expect when using >> this option (i.e, it’s possible that the compiler might clear more >> registers than the user >> requests on some targets due to the implementation limitation). >> >> I can make this change if we decide to do this. > > I'd vote for this. Richard? Fine by me too, although I don't think this should be mentioned in the user documentation. E.g. used-arg means that non-argument, non-return registers can have any value on return from the function; the compiler doesn't make any guarantees. If the compiler happens to use a temporary register in the implementation of the option, and if that temporary register happens to still be zero on return, then that's OK. It's just an internal implementation detail. The same thing could happen for any part of the epilogue. Thanks, Richard
Re: [PATCH] [GIMPLE FE] allow to unit test loop passes
Richard Biener writes: > The following arranges for the GIMPLE frontend to parse an > additional loops(...) specification, currently consisting of > 'normal' and 'lcssa'. The intent is to allow unit testing > of passes inside the GIMPLE loop optimization pipeline where > we keep the IL in loop-closed SSA and have SCEV initialized. > > The GIMPLE parser side is only half of the story - the rest > of the patch makes sure we do not destroy loop state when > moving the IL through the skipped portion of the pass pipeline > which shows we are not careful with IPA passes here and those > tend to call loop_optimizer_init which destroys state. The > patch arranges for IPA passes to honor fn->pass_startwith and > if set, refrain from considering the function. > > Since the SCEV state is global we cannot initialize it during > GIMPLE parsing but we have to arrange for that to happen before > the pass we want to start with. The patch heuristically > enables the loop-init pass based on the fact whether the IL > is in loop-closed SSA state and makes that aware of GIMPLE > testcases already arriving with properly initialized loops. > > That's all quite some hacks but I think it's worth the ability > to unit test loop passes. > > I've tried to do this before but PR104931 now triggered me to > try again (I have still to see whether that's enough to make > a GIMPLE testcase trigger that bug). I've skipped existing > GIMPLE testcases for -flto since when starting after IPA > using LTO doesn't make much sense and my IPA mangling ends up > crashing in the LTO writing. There's possibly some better > way to "hide" the late to be started functions from IPA > (but we would still need to stream the body). > > Bootstrapped and tested on x86_64-unknown-linux-gnu. Don't think I'm really qualified to comment, but it seems OK to me FWIW. Given that you've already found 4 places that need: (DECL_STRUCT_FUNCTION (node->decl) && DECL_STRUCT_FUNCTION (node->decl)->pass_startwith) I guess it might be worth having a helper for it. Thanks, Richard > > Any comments? > > Thanks, > Richard. > > 2022-03-17 Richard Biener > > gcc/c/ > * c-tree.h (c_declspecs::loops_state): Add. > * c-parser.cc (c_parser_declaration_or_fndef): Pass down > loops_state to c_parser_parse_gimple_body. > * gimple-parser.h (c_parser_parse_gimple_body): Adjust. > * gimple-parser.cc (c_parser_parse_gimple_body): Get > and initialize loops state according to specification. > (c_parser_gimple_or_rtl_pass_list): Parse loops[(spec...)] > with supported spec 'normal' and 'lcssa'. > > gcc/ > * passes.cc (should_skip_pass_p): When in LC SSA do not > skip loopinit. > * tree-ssa-loop.cc (pass_tree_loop_init::execute): Handle > functions already in LC SSA. > * ipa-cp.cc (ipcp_cloning_candidate_p): Skip functions > with pending startwith pass. > * ipa-fnsummary.c (ipa_fn_summary_generate): Likewise. > * ipa-inline.cc (inline_small_functions): Likewise. > (ipa_inline): Likewise. > * ipa-modref.cc (modref_generate): Likewise. > > gcc/testsuite/ > * gcc.dg/gimplefe-50.c: New testcase. > * gcc.dg/torture/pr89595.c: Skip -flto. > * gcc.dg/vect/bb-slp-48.c: Likewise. > * gcc.dg/vect/slp-reduc-10a.c: Likewise. > * gcc.dg/vect/slp-reduc-10b.c: Likewise. > * gcc.dg/vect/slp-reduc-10c.c: Likewise. > * gcc.dg/vect/slp-reduc-10d.c: Likewise. > * gcc.dg/vect/slp-reduc-10e.c: Likewise. > * gcc.dg/vect/vect-cond-arith-2.c: Likewise. > --- > gcc/c/c-parser.cc | 1 + > gcc/c/c-tree.h| 3 ++ > gcc/c/gimple-parser.cc| 37 +- > gcc/c/gimple-parser.h | 1 + > gcc/ipa-cp.cc | 4 +- > gcc/ipa-fnsummary.cc | 4 +- > gcc/ipa-inline.cc | 8 +++- > gcc/ipa-modref.cc | 2 +- > gcc/passes.cc | 7 +++ > gcc/testsuite/gcc.dg/gimplefe-50.c| 48 +++ > gcc/testsuite/gcc.dg/torture/pr89595.c| 1 + > gcc/testsuite/gcc.dg/vect/bb-slp-48.c | 1 + > gcc/testsuite/gcc.dg/vect/slp-reduc-10a.c | 1 + > gcc/testsuite/gcc.dg/vect/slp-reduc-10b.c | 1 + > gcc/testsuite/gcc.dg/vect/slp-reduc-10c.c | 1 + > gcc/testsuite/gcc.dg/vect/slp-reduc-10d.c | 1 + > gcc/testsuite/gcc.dg/vect/slp-reduc-10e.c | 1 + > gcc/testsuite/gcc.dg/vect/vect-cond-arith-2.c | 1 + > gcc/tree-ssa-loop.cc | 11 +++-- > 19 files changed, 125 insertions(+), 9 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/gimplefe-50.c > > diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc > index 129dd727ef3..80091d81bb4 100644 > --- a/gcc/c/c-parser.cc > +++ b/gcc/c/c-parser.cc > @@ -2537,6 +2537,7 @@ c_parser_declar
[PATCH v2 00/11] OpenMP 5.0: C & C++ "declare mapper" support (plus struct rework, etc.)
Hi Jakub, This is a new version of the series posted here: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590582.html Again, this isn't for committing now (it's definitely stage 1 material) but I'm posting now for comments on the general approach (to any of the contained parts) and to avoid duplicating effort, etc.. Relative to the previously-posted version, this version of the series makes changes to the "address inspector" code and its call sites in order to hopefully clarify the logic used to create pointer mapping nodes and so forth, and implements "declare mapper" support for C as well as C++. Further commentary on individual patches. This version of the series has been tested (offloading to NVPTX) as a whole, for now. Thanks, Julian Julian Brown (11): OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer) Remove omp_target_reorder_clauses OpenMP/OpenACC struct sibling list gimplification extension and rework OpenMP/OpenACC: Add inspector class to unify mapped address analysis OpenMP: Handle reference-typed struct members OpenMP: lvalue parsing for map clauses (C++) OpenMP: lvalue parsing for map clauses (C) Use OMP_ARRAY_SECTION instead of TREE_LIST in C++ FE OpenMP 5.0 "declare mapper" support for C++ OpenMP: Use OMP_ARRAY_SECTION instead of TREE_LIST for array sections in C FE OpenMP: Support OpenMP 5.0 "declare mapper" directives for C gcc/c-family/c-common.h | 58 + gcc/c-family/c-omp.cc | 583 gcc/c/c-decl.cc | 169 + gcc/c/c-objc-common.h | 12 + gcc/c/c-parser.cc | 463 ++- gcc/c/c-tree.h|9 + gcc/c/c-typeck.cc | 392 +-- gcc/cp/cp-gimplify.cc |6 + gcc/cp/cp-objcp-common.h |9 + gcc/cp/cp-tree.h | 13 + gcc/cp/decl.cc| 18 +- gcc/cp/error.cc |9 + gcc/cp/mangle.cc |5 +- gcc/cp/name-lookup.cc |3 +- gcc/cp/parser.cc | 543 ++- gcc/cp/parser.h |3 + gcc/cp/pt.cc | 144 +- gcc/cp/semantics.cc | 728 +++-- gcc/fortran/parse.cc |3 + gcc/fortran/trans-openmp.cc | 20 +- gcc/gimplify.cc | 2905 - gcc/langhooks-def.h | 13 + gcc/langhooks.cc | 35 + gcc/langhooks.h | 16 + gcc/omp-general.h | 84 + gcc/omp-low.cc| 23 +- .../c-c++-common/gomp/declare-mapper-12.c | 22 + .../c-c++-common/gomp/declare-mapper-3.c | 30 + .../c-c++-common/gomp/declare-mapper-4.c | 78 + .../c-c++-common/gomp/declare-mapper-5.c | 26 + .../c-c++-common/gomp/declare-mapper-6.c | 24 + .../c-c++-common/gomp/declare-mapper-7.c | 30 + .../c-c++-common/gomp/declare-mapper-8.c | 43 + .../c-c++-common/gomp/declare-mapper-9.c | 34 + gcc/testsuite/c-c++-common/gomp/map-1.c |3 +- gcc/testsuite/c-c++-common/gomp/map-6.c | 12 +- gcc/testsuite/g++.dg/goacc/member-array-acc.C | 13 + gcc/testsuite/g++.dg/gomp/declare-mapper-1.C | 58 + gcc/testsuite/g++.dg/gomp/declare-mapper-2.C | 30 + gcc/testsuite/g++.dg/gomp/ind-base-3.C| 38 + gcc/testsuite/g++.dg/gomp/map-assignment-1.C | 12 + gcc/testsuite/g++.dg/gomp/map-inc-1.C | 10 + gcc/testsuite/g++.dg/gomp/map-lvalue-ref-1.C | 19 + gcc/testsuite/g++.dg/gomp/map-ptrmem-1.C | 37 + gcc/testsuite/g++.dg/gomp/map-ptrmem-2.C | 40 + .../g++.dg/gomp/map-static-cast-lvalue-1.C| 17 + gcc/testsuite/g++.dg/gomp/map-ternary-1.C | 20 + gcc/testsuite/g++.dg/gomp/member-array-2.C| 92 + gcc/testsuite/g++.dg/gomp/member-array-omp.C | 13 + gcc/testsuite/g++.dg/gomp/pr67522.C |2 +- gcc/testsuite/g++.dg/gomp/target-3.C |4 +- gcc/testsuite/g++.dg/gomp/target-lambda-1.C |6 +- gcc/testsuite/g++.dg/gomp/target-this-2.C |2 +- gcc/testsuite/g++.dg/gomp/target-this-3.C |4 +- gcc/testsuite/g++.dg/gomp/target-this-4.C |4 +- .../g++.dg/gomp/unmappable-component-1.C | 21 + gcc/testsuite/gcc.dg/gomp/declare-mapper-10.c | 61 + gcc/testsuite/gcc.dg/gomp/declare-mapper-11.c | 33 + gcc/tree-core.h |4 + gcc/tree-pretty-print.cc | 56 + gcc/tree.cc |2 + gcc/tree.def | 10 + gcc/tree.h
[PATCH v2 02/11] Remove omp_target_reorder_clauses
This patch has been split out from the previous one to avoid a confusingly-interleaved diff. The two patches should probably be committed squashed together. 2021-10-01 Julian Brown gcc/ * gimplify.c (omp_target_reorder_clauses): Delete. --- gcc/gimplify.cc | 207 1 file changed, 207 deletions(-) diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 968cbd263f5..b667012a118 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -8738,213 +8738,6 @@ extract_base_bit_offset (tree base, tree *base_ref, poly_int64 *bitposp, return base; } -#if 0 -/* Returns true if EXPR is or contains (as a sub-component) BASE_PTR. */ - -static bool -is_or_contains_p (tree expr, tree base_ptr) -{ - if ((TREE_CODE (expr) == INDIRECT_REF && TREE_CODE (base_ptr) == MEM_REF) - || (TREE_CODE (expr) == MEM_REF && TREE_CODE (base_ptr) == INDIRECT_REF)) -return operand_equal_p (TREE_OPERAND (expr, 0), - TREE_OPERAND (base_ptr, 0)); - while (!operand_equal_p (expr, base_ptr)) -{ - if (TREE_CODE (base_ptr) == COMPOUND_EXPR) - base_ptr = TREE_OPERAND (base_ptr, 1); - if (TREE_CODE (base_ptr) == COMPONENT_REF - || TREE_CODE (base_ptr) == POINTER_PLUS_EXPR - || TREE_CODE (base_ptr) == SAVE_EXPR) - base_ptr = TREE_OPERAND (base_ptr, 0); - else - break; -} - return operand_equal_p (expr, base_ptr); -} - - -/* Implement OpenMP 5.x map ordering rules for target directives. There are - several rules, and with some level of ambiguity, hopefully we can at least - collect the complexity here in one place. */ - -static void -omp_target_reorder_clauses (tree *list_p) -{ - /* Collect refs to alloc/release/delete maps. */ - auto_vec ard; - tree *cp = list_p; - while (*cp != NULL_TREE) -if (OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP - && (OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ALLOC - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_RELEASE - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_DELETE)) - { - /* Unlink cp and push to ard. */ - tree c = *cp; - tree nc = OMP_CLAUSE_CHAIN (c); - *cp = nc; - ard.safe_push (c); - - /* Any associated pointer type maps should also move along. */ - while (*cp != NULL_TREE - && OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP - && (OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_FIRSTPRIVATE_REFERENCE - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_FIRSTPRIVATE_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ATTACH_DETACH - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ALWAYS_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_TO_PSET)) - { - c = *cp; - nc = OMP_CLAUSE_CHAIN (c); - *cp = nc; - ard.safe_push (c); - } - } -else - cp = &OMP_CLAUSE_CHAIN (*cp); - - /* Link alloc/release/delete maps to the end of list. */ - for (unsigned int i = 0; i < ard.length (); i++) -{ - *cp = ard[i]; - cp = &OMP_CLAUSE_CHAIN (ard[i]); -} - *cp = NULL_TREE; - - /* OpenMP 5.0 requires that pointer variables are mapped before - its use as a base-pointer. */ - auto_vec atf; - for (tree *cp = list_p; *cp; cp = &OMP_CLAUSE_CHAIN (*cp)) -if (OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP) - { - /* Collect alloc, to, from, to/from clause tree pointers. */ - gomp_map_kind k = OMP_CLAUSE_MAP_KIND (*cp); - if (k == GOMP_MAP_ALLOC - || k == GOMP_MAP_TO - || k == GOMP_MAP_FROM - || k == GOMP_MAP_TOFROM - || k == GOMP_MAP_ALWAYS_TO - || k == GOMP_MAP_ALWAYS_FROM - || k == GOMP_MAP_ALWAYS_TOFROM) - atf.safe_push (cp); - } - - for (unsigned int i = 0; i < atf.length (); i++) -if (atf[i]) - { - tree *cp = atf[i]; - tree decl = OMP_CLAUSE_DECL (*cp); - if (TREE_CODE (decl) == INDIRECT_REF || TREE_CODE (decl) == MEM_REF) - { - tree base_ptr = TREE_OPERAND (decl, 0); - STRIP_TYPE_NOPS (base_ptr); - for (unsigned int j = i + 1; j < atf.length (); j++) - if (atf[j]) - { - tree *cp2 = atf[j]; - tree decl2 = OMP_CLAUSE_DECL (*cp2); - - decl2 = OMP_CLAUSE_DECL (*cp2); - if (is_or_contains_p (decl2, base_ptr)) - { - /* Move *cp2 to before *cp. */ - tree c = *cp2; - *cp2 = OMP_CLAUSE_CHAIN (c); - OMP_CLAUSE_CHAIN (c) = *cp; - *cp = c; - - if (*cp2 != NULL_TREE - && OMP_CLAUSE_CODE (*cp2) == OMP_CLAUSE_MAP - && OMP_CLAUSE_MAP_KIND (*cp2) == GOMP_MAP_ALWAYS_POINTER) -
[PATCH v2 01/11] OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer)
This patch reimplements the omp_target_reorder_clauses function in anticipation of supporting "deeper" struct mappings (that is, with several structure dereference operators, or similar). The idea is that in place of the (possibly quadratic) algorithm in omp_target_reorder_clauses that greedily moves clauses containing addresses that are subexpressions of other addresses before those other addresses, we employ a topological sort algorithm to calculate a proper order for map clauses. This should run in linear time, and hopefully handles degenerate cases where multiple "levels" of indirect accesses are present on a given directive. The new method also takes care to keep clause groups together, addressing the concerns raised in: https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570501.html To figure out if some given clause depends on a base pointer in another clause, we strip off the outer layers of the address expression, and check (via a tree_operand_hash hash table we have built) if the result is a "base pointer" as defined in OpenMP 5.0 (1.2.6 Data Terminology). There are some subtleties involved, however: - We must treat MEM_REF with zero offset the same as INDIRECT_REF. This should probably be fixed in the front ends instead so we always use a canonical form (probably INDIRECT_REF). The following patch shows one instance of the problem, but there may be others: https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571382.html - Mapping a whole struct implies mapping each of that struct's elements, which may be base pointers. Because those base pointers aren't necessarily explicitly referenced in the directive in question, we treat the whole-struct mapping as a dependency instead. This version of the patch has been moved to the front of the patch queue, thus isn't dependent on any of the following struct-rework patches. 2021-11-23 Julian Brown gcc/ * gimplify.c (is_or_contains_p, omp_target_reorder_clauses): Delete functions. (omp_tsort_mark): Add enum. (omp_mapping_group): Add struct. (debug_mapping_group, omp_get_base_pointer, omp_get_attachment, omp_group_last, omp_gather_mapping_groups, omp_group_base, omp_index_mapping_groups, omp_containing_struct, omp_tsort_mapping_groups_1, omp_tsort_mapping_groups, omp_segregate_mapping_groups, omp_reorder_mapping_groups): New functions. (gimplify_scan_omp_clauses): Call above functions instead of omp_target_reorder_clauses, unless we've seen an error. * omp-low.c (scan_sharing_clauses): Avoid strict test if we haven't sorted mapping groups. gcc/testsuite/ * g++.dg/gomp/target-lambda-1.C: Adjust expected output. * g++.dg/gomp/target-this-3.C: Likewise. * g++.dg/gomp/target-this-4.C: Likewise. --- gcc/gimplify.cc | 785 +++- gcc/omp-low.cc | 7 +- gcc/testsuite/g++.dg/gomp/target-lambda-1.C | 7 +- gcc/testsuite/g++.dg/gomp/target-this-3.C | 4 +- gcc/testsuite/g++.dg/gomp/target-this-4.C | 4 +- 5 files changed, 793 insertions(+), 14 deletions(-) diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 875b115d02d..968cbd263f5 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -8738,6 +8738,7 @@ extract_base_bit_offset (tree base, tree *base_ref, poly_int64 *bitposp, return base; } +#if 0 /* Returns true if EXPR is or contains (as a sub-component) BASE_PTR. */ static bool @@ -8761,6 +8762,7 @@ is_or_contains_p (tree expr, tree base_ptr) return operand_equal_p (expr, base_ptr); } + /* Implement OpenMP 5.x map ordering rules for target directives. There are several rules, and with some level of ambiguity, hopefully we can at least collect the complexity here in one place. */ @@ -8940,6 +8942,761 @@ omp_target_reorder_clauses (tree *list_p) } } } +#endif + + +enum omp_tsort_mark { + UNVISITED, + TEMPORARY, + PERMANENT +}; + +struct omp_mapping_group { + tree *grp_start; + tree grp_end; + omp_tsort_mark mark; + struct omp_mapping_group *sibling; + struct omp_mapping_group *next; +}; + +__attribute__((used)) static void +debug_mapping_group (omp_mapping_group *grp) +{ + tree tmp = OMP_CLAUSE_CHAIN (grp->grp_end); + OMP_CLAUSE_CHAIN (grp->grp_end) = NULL; + debug_generic_expr (*grp->grp_start); + OMP_CLAUSE_CHAIN (grp->grp_end) = tmp; +} + +/* Return the OpenMP "base pointer" of an expression EXPR, or NULL if there + isn't one. This needs improvement. */ + +static tree +omp_get_base_pointer (tree expr) +{ + while (TREE_CODE (expr) == ARRAY_REF) +expr = TREE_OPERAND (expr, 0); + + while (TREE_CODE (expr) == COMPONENT_REF +&& (DECL_P (TREE_OPERAND (expr, 0)) +|| (TREE_CODE (TREE_OPERAND (expr, 0)) == COMPONENT_REF) +|| TREE_CODE (TREE_OPERAND (expr, 0)) == INDIRECT_REF +|| (TREE_CODE (TREE_OPERAND (expr
[PATCH v2 03/11] OpenMP/OpenACC struct sibling list gimplification extension and rework
This patch is a combination of several previously-posted patches, rebased and squashed together, and with a couple of additional bugfixes: "Rewrite GOMP_MAP_ATTACH_DETACH mappings unconditionally" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585440.html "OpenMP/OpenACC: Move array_ref/indirect_ref handling code out of extract_base_bit_offset" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585441.html "OpenACC/OpenMP: Refactor struct lowering in gimplify.c" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585442.html "OpenACC: Rework indirect struct handling in gimplify.c" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585443.html "Remove base_ind/base_ref handling from extract_base_bit_offset" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585445.html "OpenMP/OpenACC: Hoist struct sibling list handling in gimplification" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585448.html "OpenACC: Make deep-copy-arrayofstruct.c a libgomp/runtime test" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585452.html Unlike the previously-posted patch, this version does *not* contain the following changes, which have been pulled out into separate patches again or merged with other patches in this series: "OpenMP: Allow array ref components for C & C++" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585449.html "OpenMP: Fix non-zero attach/detach bias for struct dereferences" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585451.html "OpenMP: Handle reference-typed struct members" https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585453.html This brings the patch closer to being "just a refactor" than the previously-posted version (hopefully easing review), though several behavioural changes still remain. 2022-03-17 Julian Brown gcc/fortran/ * trans-openmp.cc (gfc_trans_omp_clauses): Don't create GOMP_MAP_TO_PSET mappings for class metadata, nor GOMP_MAP_POINTER mappings for POINTER_TYPE_P decls. gcc/ * gimplify.c (gimplify_omp_var_data): Remove GOVD_MAP_HAS_ATTACHMENTS. (insert_struct_comp_map): Refactor function into... (build_struct_comp_nodes): This new function. Remove list handling and improve self-documentation. (extract_base_bit_offset): Remove BASE_REF, OFFSETP parameters. Move code to strip outer parts of address out of function, but strip no-op conversions. (omp_mapping_group): Add DELETED field for use during reindexing. (strip_components_and_deref, strip_indirections): New functions. (omp_group_last, omp_group_base): Add GOMP_MAP_STRUCT handling. (omp_gather_mapping_groups): Initialise DELETED field for new groups. (omp_index_mapping_groups): Notice DELETED groups when (re)indexing. (insert_node_after, move_node_after, move_nodes_after, move_concat_nodes_after): New helper functions. (accumulate_sibling_list): New function to build up GOMP_MAP_STRUCT node groups for sibling lists. Outlined from gimplify_scan_omp_clauses. (omp_build_struct_sibling_lists): New function. (gimplify_scan_omp_clauses): Remove struct_map_to_clause, struct_seen_clause, struct_deref_set. Call omp_build_struct_sibling_lists as pre-pass instead of handling sibling lists in the function's main processing loop. (gimplify_adjust_omp_clauses_1): Remove GOVD_MAP_HAS_ATTACHMENTS handling, unused now. * omp-low.cc (scan_sharing_clauses): Handle pointer-type indirect struct references, and references to pointers to structs also. gcc/testsuite/ * g++.dg/goacc/member-array-acc.C: New test. * g++.dg/gomp/member-array-omp.C: New test. * g++.dg/gomp/target-3.C: Update expected output. * g++.dg/gomp/target-lambda-1.C: Likewise. * g++.dg/gomp/target-this-2.C: Likewise. * c-c++-common/goacc/deep-copy-arrayofstruct.c: Move test from here. libgomp/ * testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test. * testsuite/libgomp.oacc-c++/deep-copy-17.C: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-arrayofstruct.c: Move test to here, make "run" test. --- gcc/fortran/trans-openmp.cc | 20 +- gcc/gimplify.cc | 1512 ++--- gcc/omp-low.cc| 16 +- gcc/testsuite/g++.dg/goacc/member-array-acc.C | 13 + gcc/testsuite/g++.dg/gomp/member-array-omp.C | 13 + gcc/testsuite/g++.dg/gomp/target-3.C |4 +- gcc/testsuite/g++.dg/gomp/target-lambda-1.C |3 +- gcc/testsuite/g++.dg/gomp/target-this-2.C |2 +- .../testsuite/libgomp.oacc-c++/deep-copy-17.C | 101 ++ .../libgomp.oacc-c-c++-common/deep-copy-15.c | 68 + .../libgomp.oacc-c-c++-common/deep-copy-
[PATCH v2 04/11] OpenMP/OpenACC: Add inspector class to unify mapped address analysis
Several places in the C and C++ front-ends dig through OpenMP and OpenACC addresses from "map" clauses (etc.) in order to determine whether they are component accesses that need "attach" operations, check duplicate mapping clauses, and so on. When we're extending support for more kinds of lvalues in map clauses for OpenMP, it seems helpful to bring these all into one place in order to keep all the analyses in sync, and to make it easier to reason about which kinds of expressions are supported. This patch introduces an "address inspector" class for that purpose, and adjusts the C and C++ front-ends to use it. Relative to the previous posted version, this patch heavily reworks the internals of the "address inspector" class and its call sites in the C and C++ front-ends in order to clarify the logic used to elaborate "map" clause nodes, which had become somewhat convoluted. It also now implements the functionality of the "c_omp_decompose_attachable_address" function from earlier versions of this patch series. 2022-03-17 Julian Brown gcc/c-family/ * c-common.h (c_omp_address_inspector): New class. * c-omp.c (c_omp_address_inspector::get_deref_origin, c_omp_address_inspector::component_access_p, c_omp_address_inspector::check_clause, c_omp_address_inspector::get_root_term, c_omp_address_inspector::map_supported_p, c_omp_address_inspector::mappable_type, c_omp_address_inspector::get_origin, c_omp_address_inspector::peel_components, c_omp_address_inspector::maybe_peel_ref, c_omp_address_inspector::maybe_zero_length_array_section, c_omp_address_inspector::get_base_pointer, c_omp_address_inspector::get_base_pointer_tgt, c_omp_address_inspector::get_attachment_point): New methods. gcc/c/ * c-typeck.c (handle_omp_array_sections_1, handle_omp_array_sections, c_finish_omp_clauses): Use c_omp_address_inspector class. gcc/cp/ * semantics.c (cp_omp_address_inspector): New class, derived from c_omp_address_inspector. (handle_omp_array_sections_1, handle_omp_array_sections, finish_omp_clauses): Use cp_omp_address_inspector class to analyze OpenMP map clause expressions. gcc/testsuite/ * g++.dg/gomp/unmappable-component-1.C: New test. libgomp/ * testsuite/libgomp.c++/class-array-1.C: New test. * testsuite/libgomp.c-c++-common/baseptrs-1.c: New test. * testsuite/libgomp.c-c++-common/baseptrs-2.c: New test. --- gcc/c-family/c-common.h | 55 +++ gcc/c-family/c-omp.cc | 268 +++ gcc/c/c-typeck.cc | 305 +--- gcc/cp/semantics.cc | 440 -- .../g++.dg/gomp/unmappable-component-1.C | 21 + libgomp/testsuite/libgomp.c++/class-array-1.C | 59 +++ .../libgomp.c-c++-common/baseptrs-1.c | 50 ++ .../libgomp.c-c++-common/baseptrs-2.c | 70 +++ 8 files changed, 814 insertions(+), 454 deletions(-) create mode 100644 gcc/testsuite/g++.dg/gomp/unmappable-component-1.C create mode 100644 libgomp/testsuite/libgomp.c++/class-array-1.C create mode 100644 libgomp/testsuite/libgomp.c-c++-common/baseptrs-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/baseptrs-2.c diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index a8d6f82bb2c..e592e7fd368 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -1253,6 +1253,61 @@ extern void c_omp_mark_declare_variant (location_t, tree, tree); extern const char *c_omp_map_clause_name (tree, bool); extern void c_omp_adjust_map_clauses (tree, bool); +class c_omp_address_inspector +{ + location_t loc; + tree root_term; + bool indirections; + int map_supported; + +protected: + tree orig; + +public: + c_omp_address_inspector (location_t loc, tree t) +: loc (loc), root_term (NULL_TREE), indirections (false), + map_supported (-1), orig (t) + { } + + ~c_omp_address_inspector () {} + + virtual bool processing_template_decl_p () { return false; } + virtual bool mappable_type (tree t); + virtual void emit_unmappable_type_notes (tree) { } + + bool check_clause (tree); + tree get_root_term (bool); + + tree get_address () { return orig; } + tree get_deref_origin (); + bool component_access_p (); + + bool has_indirections_p () +{ + if (!root_term) + get_root_term (false); + return indirections; +} + + bool indir_component_ref_p () +{ + return component_access_p () && has_indirections_p (); +} + + bool map_supported_p (); + + static tree get_origin (tree); + static tree peel_components (tree); + static tree maybe_peel_ref (tree); + static tree get_base_pointer (tree); + tree get_base_pointer () { return get_base_pointer (orig); } + static tree get_base_pointer_tgt (tree); + tree get_base_pointer_tgt () { return get_base_pointer_tgt (o
[PATCH v2 05/11] OpenMP: Handle reference-typed struct members
This patch relates to OpenMP mapping clauses containing struct members of reference type, e.g. "mystruct.myref.myptr[:N]". To be able to access the array slice through the reference in the middle, we need to perform an attach action for that reference, since it is represented internally as a pointer. I don't think the spec allows for this case explicitly. The closest clause is (OpenMP 5.0, "2.19.7.1 map Clause"): "If the type of a list item is a reference to a type T then the reference in the device data environment is initialized to refer to the object in the device data environment that corresponds to the object referenced by the list item. If mapping occurs, it occurs as though the object were mapped through a pointer with an array section of type T and length one." The patch as is allows the mapping to work with just "mystruct.myref.myptr[:N]", without an explicit "mystruct.myref" mapping also (because, would that refer to the hidden pointer used by the reference, or the automatically-dereferenced data itself?). An attach/detach operation is thus synthesised for the reference. Reworking the previous "address inspector" patch, it occurred to me that this patch might be better implemented in the frontend (or frontends). I've added a note to that effect, but not actually made that change for the time being. --- gcc/gimplify.cc| 59 - libgomp/testsuite/libgomp.c++/baseptrs-3.C | 275 + 2 files changed, 327 insertions(+), 7 deletions(-) create mode 100644 libgomp/testsuite/libgomp.c++/baseptrs-3.C diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 598c65eb430..0b8c221e4c8 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -9813,7 +9813,10 @@ accumulate_sibling_list (enum omp_region_type region_type, enum tree_code code, /* FIXME: If we're not mapping the base pointer in some other clause on this directive, I think we want to create ALLOC/RELEASE here -- i.e. not early-exit. */ - if (openmp && attach_detach) + if (openmp + && attach_detach + && !(TREE_CODE (TREE_TYPE (ocd)) == REFERENCE_TYPE + && TREE_CODE (TREE_TYPE (TREE_TYPE (ocd))) != POINTER_TYPE)) return NULL; if (!struct_map_to_clause || struct_map_to_clause->get (base) == NULL) @@ -9862,9 +9865,37 @@ accumulate_sibling_list (enum omp_region_type region_type, enum tree_code code, tree noind = strip_indirections (base); - if (!openmp + /* TODO: the following two stanzas handling reference-typed struct +members (for OpenMP) and nested base pointers (for OpenACC) could +probably both be better handled in the frontends. Doing that would +avoid this late manipulation of the clause list. */ + + if (openmp + && TREE_CODE (TREE_TYPE (noind)) == REFERENCE_TYPE && (region_type & ORT_TARGET) && TREE_CODE (noind) == COMPONENT_REF) + { + tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (grp_end), + OMP_CLAUSE_MAP); + OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_TO); + OMP_CLAUSE_DECL (c2) = unshare_expr (base); + OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (TREE_TYPE (noind)); + + tree c3 = build_omp_clause (OMP_CLAUSE_LOCATION (grp_end), + OMP_CLAUSE_MAP); + OMP_CLAUSE_SET_MAP_KIND (c3, GOMP_MAP_ATTACH_DETACH); + OMP_CLAUSE_DECL (c3) = unshare_expr (noind); + OMP_CLAUSE_SIZE (c3) = size_zero_node; + + OMP_CLAUSE_CHAIN (c2) = c3; + OMP_CLAUSE_CHAIN (c3) = NULL_TREE; + + *inner = c2; + return NULL; + } + else if (!openmp + && (region_type & ORT_TARGET) + && TREE_CODE (noind) == COMPONENT_REF) { /* The base for this component access is a struct component access itself. Insert a node to be processed on the next iteration of @@ -9878,13 +9909,28 @@ accumulate_sibling_list (enum omp_region_type region_type, enum tree_code code, OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FORCE_PRESENT); OMP_CLAUSE_DECL (c2) = unshare_expr (noind); OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (TREE_TYPE (noind)); + OMP_CLAUSE_CHAIN (c2) = NULL_TREE; *inner = c2; return NULL; } - tree sdecl = strip_components_and_deref (base); + tree sdecl = base; + if (TREE_CODE (sdecl) == INDIRECT_REF + || TREE_CODE (sdecl) == MEM_REF) + { + sdecl = TREE_OPERAND (sdecl, 0); + if (TREE_CODE (sdecl) == INDIRECT_REF + && (TREE_CODE (TREE_TYPE (TREE_OPERAND (sdecl, 0))) + == REFERENCE_TYPE)) + sdecl = TREE_OPERAND (sdecl, 0); + } - if (POINTER_TYPE_P (TREE_TYPE (sdecl)) && (region_type & ORT_TARGET)) + while (TREE_CODE (sdecl) == POINTER_PLUS_EXPR) + sdecl = TREE_OPERAND (sdecl, 0); + + if (DECL_P (sdecl) +
[PATCH v2 06/11] OpenMP: lvalue parsing for map clauses (C++)
This patch changes parsing for OpenMP map clauses in C++ to use the generic expression parser, hence adds support for parsing general lvalues (as required by OpenMP 5.0+). So far only a few new types of expression are actually supported throughout compilation (including everything in the testsuite of course, and newly-added tests), and we attempt to reject unsupported expressions in order to avoid surprises for the user. This version of the patch adds code to recalculate the bias for ATTACH operations on struct bases which are not DECL_P (edit: oops, and leaves in a commented-out line). 2022-03-17 Julian Brown gcc/c-family/ * c-omp.cc (c_omp_address_inspector::map_supported_p): Support OMP_ARRAY_SECTION. gcc/cp/ * error.cc (dump_expr): Handle OMP_ARRAY_SECTION. * parser.cc (cp_parser_new): Initialize parser->omp_array_section_p. (cp_parser_postfix_open_square_expression): Support OMP_ARRAY_SECTION parsing. (cp_parser_omp_var_list_no_open): Remove ALLOW_DEREF parameter, add MAP_LVALUE in its place. Supported generalised lvalue parsing for map clauses. (cp_parser_omp_var_list): Remove ALLOW_DEREF parameter, add MAP_LVALUE. Pass to cp_parser_omp_var_list_no_open. (cp_parser_oacc_data_clause, cp_parser_omp_all_clauses): Update calls to cp_parser_omp_var_list. * parser.h (cp_parser): Add omp_array_section_p field. * semantics.cc (handle_omp_array_sections_1): Handle more types of map expression. (handle_omp_array_section): Disallow pointer-to-member mappings. (finish_omp_clauses): Check for supported types of expression. gcc/ * gimplify.cc (omp_containing_struct): Handle POINTER_PLUS_EXPR. (accumulate_sibling_list): Fix support for non-DECL_P struct bases. Don't create firstprivate pointers for struct bases that are explicitly mapped TO elsewhere in the clause list. Add support for adjusting struct ATTACH bias. (omp_ptrmem_p): New function. (omp_build_struct_sibling_lists): Support length-two group for synthesized inner struct mapping. Explicitly disallow pointers to members. Recalculate bias for struct ATTACH nodes. * tree-pretty-print.c (dump_generic_node): Support OMP_ARRAY_SECTION. * tree.def (OMP_ARRAY_SECTION): New tree code. gcc/testsuite/ * c-c++-common/gomp/map-6.c: Update expected output. * g++.dg/gomp/pr67522.C: Likewise. * g++.dg/gomp/ind-base-3.C: New test. * g++.dg/gomp/map-assignment-1.C: New test. * g++.dg/gomp/map-inc-1.C: New test. * g++.dg/gomp/map-lvalue-ref-1.C: New test. * g++.dg/gomp/map-ptrmem-1.C: New test. * g++.dg/gomp/map-ptrmem-2.C: New test. * g++.dg/gomp/map-static-cast-lvalue-1.C: New test. * g++.dg/gomp/map-ternary-1.C: New test. * g++.dg/gomp/member-array-2.C: New test. libgomp/ * testsuite/libgomp.c++/ind-base-1.C: New test. * testsuite/libgomp.c++/ind-base-2.C: New test. * testsuite/libgomp.c++/map-comma-1.C: New test. * testsuite/libgomp.c++/map-rvalue-ref-1.C: New test. * testsuite/libgomp.c++/struct-ref-1.C: New test. * testsuite/libgomp.c-c++-common/array-field-1.c: New test. * testsuite/libgomp.c-c++-common/array-of-struct-1.c: New test. * testsuite/libgomp.c-c++-common/array-of-struct-2.c: New test. --- gcc/c-family/c-omp.cc | 1 + gcc/cp/error.cc | 9 + gcc/cp/parser.cc | 141 +-- gcc/cp/parser.h | 3 + gcc/cp/semantics.cc | 59 ++- gcc/gimplify.cc | 119 +++-- gcc/testsuite/c-c++-common/gomp/map-6.c | 4 +- gcc/testsuite/g++.dg/gomp/ind-base-3.C| 38 gcc/testsuite/g++.dg/gomp/map-assignment-1.C | 12 ++ gcc/testsuite/g++.dg/gomp/map-inc-1.C | 10 ++ gcc/testsuite/g++.dg/gomp/map-lvalue-ref-1.C | 19 ++ gcc/testsuite/g++.dg/gomp/map-ptrmem-1.C | 37 gcc/testsuite/g++.dg/gomp/map-ptrmem-2.C | 40 + .../g++.dg/gomp/map-static-cast-lvalue-1.C| 17 ++ gcc/testsuite/g++.dg/gomp/map-ternary-1.C | 20 +++ gcc/testsuite/g++.dg/gomp/member-array-2.C| 92 ++ gcc/testsuite/g++.dg/gomp/pr67522.C | 2 +- gcc/tree-pretty-print.cc | 14 ++ gcc/tree.def | 3 + libgomp/testsuite/libgomp.c++/ind-base-1.C| 162 ++ libgomp/testsuite/libgomp.c++/ind-base-2.C| 49 ++ libgomp/testsuite/libgomp.c++/map-comma-1.C | 15 ++ .../testsuite/libgomp.c++/map-rvalue-ref-1.C | 22 +++ libgomp/testsuite/libgomp.c++/struct-ref-1.C | 97 +++ .../libgomp.c-c++-common/array-field-1.c | 35 .../libgomp.c-c++-common/array
[PATCH v2 07/11] OpenMP: lvalue parsing for map clauses (C)
This patch adds support for parsing general lvalues for OpenMP "map" clauses to the C front-end, similar to the previous patch for C++. This version of the patch has been adjusted for changes to the address inspector patch, but is otherwise the same as the last posted version. 2022-03-17 Julian Brown gcc/c/ * c-parser.c (c_parser_postfix_expression_after_primary): Add support for OpenMP array section parsing. (c_parser_omp_variable_list): Change ALLOW_DEREF parameter to MAP_LVALUE. Support parsing of general lvalues in "map" clauses. (c_parser_omp_var_list_parens): Change ALLOW_DEREF parameter to MAP_LVALUE. Update call to c_parser_omp_variable_list. (c_parser_oacc_data_clause, c_parser_omp_clause_to, c_parser_omp_clause_from): Update calls to c_parser_omp_var_list_parens. * c-tree.h (c_omp_array_section_p): Add extern declaration. * c-typeck.c (c_omp_array_section_p): Add flag. (mark_exp_read): Support OMP_ARRAY_SECTION. (handle_omp_array_sections_1): Handle more kinds of expressions. (c_finish_omp_clauses): Check for supported expression types. Support non-DECL_P root term for map clauses. gcc/testsuite/ * c-c++-common/gomp/map-1.c: Adjust expected output. * c-c++-common/gomp/map-6.c: Likewise. libgomp/ * testsuite/libgomp.c-c++-common/ind-base-4.c: New test. * testsuite/libgomp.c-c++-common/unary-ptr-1.c: New test. --- gcc/c/c-parser.cc | 150 +++--- gcc/c/c-tree.h| 1 + gcc/c/c-typeck.cc | 38 - gcc/testsuite/c-c++-common/gomp/map-1.c | 3 +- gcc/testsuite/c-c++-common/gomp/map-6.c | 2 + .../libgomp.c-c++-common/ind-base-4.c | 50 ++ .../libgomp.c-c++-common/unary-ptr-1.c| 16 ++ 7 files changed, 236 insertions(+), 24 deletions(-) create mode 100644 libgomp/testsuite/libgomp.c-c++-common/ind-base-4.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/unary-ptr-1.c diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index e9086c58524..cc590e56e75 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -10486,7 +10486,7 @@ c_parser_postfix_expression_after_primary (c_parser *parser, struct c_expr expr) { struct c_expr orig_expr; - tree ident, idx; + tree ident, idx, len; location_t sizeof_arg_loc[3], comp_loc; tree sizeof_arg[3]; unsigned int literal_zero_mask; @@ -10505,15 +10505,44 @@ c_parser_postfix_expression_after_primary (c_parser *parser, case CPP_OPEN_SQUARE: /* Array reference. */ c_parser_consume_token (parser); - idx = c_parser_expression (parser).value; - c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE, -"expected %<]%>"); - start = expr.get_start (); - finish = parser->tokens_buf[0].location; - expr.value = build_array_ref (op_loc, expr.value, idx); - set_c_expr_source_range (&expr, start, finish); - expr.original_code = ERROR_MARK; - expr.original_type = NULL; + idx = len = NULL_TREE; + if (!c_omp_array_section_p + || c_parser_next_token_is_not (parser, CPP_COLON)) + idx = c_parser_expression (parser).value; + + if (c_omp_array_section_p + && c_parser_next_token_is (parser, CPP_COLON)) + { + c_parser_consume_token (parser); + if (c_parser_next_token_is_not (parser, CPP_CLOSE_SQUARE)) + len = c_parser_expression (parser).value; + + c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE, +"expected %<]%>"); + +/* NOTE: We are reusing using the type of the whole array as the + type of the array section here, which isn't necessarily + entirely correct. Might need revisiting. */ + start = expr.get_start (); + finish = parser->tokens_buf[0].location; + expr.value = build3_loc (op_loc, OMP_ARRAY_SECTION, + TREE_TYPE (expr.value), expr.value, + idx, len); + set_c_expr_source_range (&expr, start, finish); + expr.original_code = ERROR_MARK; + expr.original_type = NULL; + } + else + { + c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE, +"expected %<]%>"); + start = expr.get_start (); + finish = parser->tokens_buf[0].location; + expr.value = build_array_ref (op_loc, expr.value, idx); + set_c_expr_source_range (&expr, start, finish); + expr.original_code = ERROR_MARK; + expr.original_type = NULL; + }
[PATCH v2 08/11] Use OMP_ARRAY_SECTION instead of TREE_LIST in C++ FE
This patch changes the representation of OMP array sections in the C++ front end to use the new OMP_ARRAY_SECTION tree code instead of a TREE_LIST. This is important for "declare mapper" support, because the array section representation may stick around longer (in "declare mapper" definitions), and special-case handling TREE_LIST becomes necessary in more places, which starts to become unwieldy. 2022-02-18 Julian Brown gcc/c-family/ * c-omp.cc (c_omp_split_clauses): Support OMP_ARRAY_SECTION. gcc/cp/ * parser.cc (cp_parser_omp_var_list_no_open): Use OMP_ARRAY_SECTION code instead of TREE_LIST to represent OpenMP array sections. * pt.cc (tsubst_copy, tsubst_omp_clause_decl, tsubst_copy_and_build): Add OMP_ARRAY_SECTION support. * semantics.cc (handle_omp_array_sections_1, handle_omp_array_sections, cp_oacc_check_attachments, finish_omp_clauses): Use OMP_ARRAY_SECTION instead of TREE_LIST where appropriate. * gimplify.cc (gimplify_expr): Ensure OMP_ARRAY_SECTION has been processed out before gimplification. --- gcc/c-family/c-omp.cc | 14 +++ gcc/cp/parser.cc | 15 gcc/cp/pt.cc | 52 gcc/cp/semantics.cc | 56 ++- gcc/gimplify.cc | 3 +++ 5 files changed, 109 insertions(+), 31 deletions(-) diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc index 421cc76..77255dd587a 100644 --- a/gcc/c-family/c-omp.cc +++ b/gcc/c-family/c-omp.cc @@ -2662,6 +2662,9 @@ c_omp_split_clauses (location_t loc, enum tree_code code, } else if (TREE_CODE (OMP_CLAUSE_DECL (c)) == TREE_LIST) { + /* TODO: This can go away once we transition all uses of +TREE_LIST for representing OMP array sections to +OMP_ARRAY_SECTION. */ tree t; for (t = OMP_CLAUSE_DECL (c); TREE_CODE (t) == TREE_LIST; t = TREE_CHAIN (t)) @@ -2670,6 +2673,17 @@ c_omp_split_clauses (location_t loc, enum tree_code code, bitmap_clear_bit (&allocate_head, DECL_UID (t)); break; } + else if (TREE_CODE (OMP_CLAUSE_DECL (c)) == OMP_ARRAY_SECTION) + { + tree t; + for (t = OMP_CLAUSE_DECL (c); + TREE_CODE (t) == OMP_ARRAY_SECTION; + t = TREE_OPERAND (t, 0)) + ; + if (DECL_P (t)) + bitmap_clear_bit (&allocate_head, DECL_UID (t)); + break; + } /* FALLTHRU */ case OMP_CLAUSE_PRIVATE: case OMP_CLAUSE_FIRSTPRIVATE: diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index efb65543e11..8d5ae9c44d0 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -36639,11 +36639,14 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, decl = TREE_OPERAND (decl, 0); for (int i = dims.length () - 1; i >= 0; i--) - decl = tree_cons (dims[i].low_bound, dims[i].length, decl); + decl = build3_loc (input_location, OMP_ARRAY_SECTION, + TREE_TYPE (decl), decl, dims[i].low_bound, + dims[i].length); } else if (TREE_CODE (decl) == INDIRECT_REF) { bool ref_p = REFERENCE_REF_P (decl); + tree type = TREE_TYPE (decl); /* Turn *foo into the representation previously used for foo[0]. */ @@ -36653,7 +36656,8 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, /* ...but don't add the [0:1] representation for references (because they have special handling elsewhere). */ if (!ref_p) - decl = tree_cons (integer_zero_node, integer_one_node, decl); + decl = build3_loc (input_location, OMP_ARRAY_SECTION, type, + decl, integer_zero_node, integer_one_node); } else if (TREE_CODE (decl) == ARRAY_REF) { @@ -36662,7 +3,8 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, decl = TREE_OPERAND (decl, 0); STRIP_NOPS (decl); - decl = tree_cons (idx, integer_one_node, decl); + decl = build3_loc (input_location, OMP_ARRAY_SECTION, +TREE_TYPE (decl), decl, idx, integer_one_node); } else if (TREE_CODE (decl) == NON_LVALUE_EXPR || CONVERT_EXPR_P (decl)) @@ -36837,7 +36842,9 @@ cp_parser_omp_var_list_no_open (cp_parser
[Patch] Fortran/OpenMP: Improve associate-name diagnostic [PR103039]
SELECT TYPE, SELECT RANK and ASSOCIATE have associate-name => selector and create a pointer to the selector. GCC was fixed to handle CLASS properly in class(t) :: var !$omp ... firstprivate(var) As a side effect, firstprivate(assoc_name) now also gets handled that way, effectively trying to firstprivate(selector) which should be shared... While firstprivate(var) does not appear explicitly, it gets added via gfc_omp_predetermined_sharing. I went for the simple solution and handle it only in gfortran's ctor/dtor. An alternative would be to set OMP_CLAUSE_FIRSTPRIVATE_NO_REFERENCE, which is currently only set for C++'s __for_end / __for_range and then later process it in ctor/dtor. I am not sure whether that's really best and what's the best way to propagate it. One way would be to create and use OMP_CLAUSE_DEFAULT_FIRSTPRIVATE_NO_REFERENCE. OK as is (simple version) – or is a fuller version better. If so, suggestion how to do this best? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 Fortran/OpenMP: Fix privatization of associated names gfc_omp_predetermined_sharing cases the associate-name pointer variable to be OMP_CLAUSE_DEFAULT_FIRSTPRIVATE, which is fine. However, the associated selector is shared. Thus, the target of associate-name pointer should not get copied. (It was before but because of gfc_omp_privatize_by_reference returning false, the selector was not only wrongly copied but this was also not done properly.) gcc/fortran/ChangeLog: PR fortran/103039 * trans-openmp.cc (gfc_omp_clause_copy_ctor, gfc_omp_clause_dtor): Only privatize pointer for associate names. libgomp/ChangeLog: PR fortran/103039 * testsuite/libgomp.fortran/associate4.f90: New test. gcc/fortran/trans-openmp.cc | 10 +++ libgomp/testsuite/libgomp.fortran/associate4.f90 | 92 2 files changed, 102 insertions(+) diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index fad76a4791f..101924f5e76 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -808,6 +808,11 @@ gfc_omp_clause_copy_ctor (tree clause, tree dest, tree src) gcc_assert (OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_FIRSTPRIVATE || OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_LINEAR); + /* Privatize pointer, only; cf. gfc_omp_predetermined_sharing. */ + if (DECL_P (OMP_CLAUSE_DECL (clause)) + && GFC_DECL_ASSOCIATE_VAR_P (OMP_CLAUSE_DECL (clause))) +return build2 (MODIFY_EXPR, TREE_TYPE (dest), dest, src); + if (DECL_ARTIFICIAL (OMP_CLAUSE_DECL (clause)) && DECL_LANG_SPECIFIC (OMP_CLAUSE_DECL (clause)) && GFC_DECL_SAVED_DESCRIPTOR (OMP_CLAUSE_DECL (clause))) @@ -1321,6 +1326,11 @@ gfc_omp_clause_dtor (tree clause, tree decl) tree type = TREE_TYPE (decl), tem; tree decl_type = TREE_TYPE (OMP_CLAUSE_DECL (clause)); + /* Only pointer was privatized; cf. gfc_omp_clause_copy_ctor. */ + if (DECL_P (OMP_CLAUSE_DECL (clause)) + && GFC_DECL_ASSOCIATE_VAR_P (OMP_CLAUSE_DECL (clause))) +return NULL_TREE; + if (DECL_ARTIFICIAL (OMP_CLAUSE_DECL (clause)) && DECL_LANG_SPECIFIC (OMP_CLAUSE_DECL (clause)) && GFC_DECL_SAVED_DESCRIPTOR (OMP_CLAUSE_DECL (clause))) diff --git a/libgomp/testsuite/libgomp.fortran/associate4.f90 b/libgomp/testsuite/libgomp.fortran/associate4.f90 new file mode 100644 index 000..f0949b5530d --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/associate4.f90 @@ -0,0 +1,92 @@ +! PR fortran/51722 + +module m +implicit none + +contains + +subroutine seltype + +type :: a + integer :: p = 2 +end type a + +type, extends(a) :: b + integer :: cnt = 0 +end type b + +integer :: k, s +class(a), pointer :: x + +allocate(a :: x) +s = 0 +select type (y => x) +class is (a) +!$omp parallel do default(shared) private(k) reduction(+:s) + do k = 1,10 + s = s + k*y%p + end do +!$omp end parallel do +end select + +if (s /= 110) error stop +deallocate(x) +allocate(b :: x) + +s = 0 +select type (y => x) +class is (b) +!$omp parallel do default(shared) private(k) reduction(+:s) + do k = 1,10 + s = s + k*y%p +!$omp atomic update + y%cnt = y%cnt + 2 + end do +!$omp end parallel do +if (s /= 110) error stop +if (y%p /= 2) error stop +if (y%cnt /= 10*2) error stop +end select + +deallocate(x) + +end subroutine seltype + +subroutine assoc + +type :: b + integer :: r = 3 +end type b + +type :: a + integer :: p = 2 + class(b), pointer :: u => null() +end type a + +integer :: k, s +class(a), pointer :: x + +s = 0 +allocate(a :: x) +allocate(b :: x%u) + +associate(f => x%u) +!$omp parallel do default(shared) private(k) reduction(+:s) + do k = 1,10 + s = s + k*f%r + end do +!$omp end parallel do +end associate + +deallocate(x%u) +deallocate(x) + +if (s /
[PATCH v2 09/11] OpenMP 5.0 "declare mapper" support for C++
This patch implements OpenMP 5.0 "declare mapper" support for C++ -- except for arrays of structs with mappers, which are TBD. I've taken cues from the existing "declare reduction" support where appropriate, though obviously the details of implementation differ somewhat (in particular, "declare mappers" must survive longer, until gimplification time). Both named and unnamed (default) mappers are supported, and both explicitly-mapped structures and implicitly-mapped struct-typed variables used within an offload region are supported. For the latter, we need a way to communicate to the middle end which mapper (visible, in-scope) is the right one to use -- for that, we scan the target body in the front end for uses of structure (etc.) types, and create artificial "mapper binding" clauses to associate types with visible mappers. (It doesn't matter too much if we create these mapper bindings a bit over-eagerly, since they will only be used if needed later during gimplification.) Another difficulty concerns the order in which an OpenMP offload region body's clauses are processed relative to its body: in order to add clauses for instantiated mappers, we need to have processed the body already in order to find which variables have been used, but at present the region's clauses are processed strictly before the body. So, this patch adds a second clause-processing step after gimplification of the body in order to add clauses resulting from instantiated mappers. This version of the patch improves detection of explicitly-mapped struct accesses which inhibit implicitly-triggered user-defined mappers for a target region. 2022-03-17 Julian Brown gcc/cp/ * cp-gimplify.cc (cxx_omp_finish_mapper_clauses): New function. * cp-objcp-common.h (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES): Add new langhook. * cp-tree.h (lang_decl_fn): Add omp_declare_mapper_p field. (DECL_OMP_DECLARE_MAPPER_P): New function. (omp_mapper_id, cp_check_omp_declare_mapper, omp_instantiate_mappers, cxx_omp_finish_mapper_clauses): Add prototypes. * decl.cc (duplicate_decls): Support DECL_OMP_DECLARE_MAPPER_P functions. (finish_function): Likewise. * mangle.cc (decl_mangling_context): Likewise. * name-lookup.cc (set_decl_context_in_fn): Likewise. * parser.cc (cp_parser_class_specifier_1): Likewise. (cp_parser_omp_declare_mapper, cp_parser_omp_declare_mapper_maplist): New functions. (cp_parser_late_parsing_for_member): Support DECL_OMP_DECLARE_MAPPER_P functions. (cp_parser_omp_clause_map): Add KIND parameter. Support "mapper" modifier. (cp_parser_omp_all_clauses): Add KIND argument to cp_parser_omp_clause_map call. (cp_parser_omp_target): Call omp_instantiate_mappers before finish_omp_clauses. (cp_parser_omp_declare): Add "declare mapper" support. * pt.cc (instantiate_class_template_1, tsubst_function_decl): Support DECL_OMP_DECLARE_MAPPER_P functions. (tsubst_omp_clauses): Call omp_instantiate_mappers before finish_omp_clauses, for target regions. (tsubst_expr): Support DECL_OMP_DECLARE_MAPPER_P functions. (tsubst_omp_udm): New function. (instantiate_body): Support DECL_OMP_DECLARE_MAPPER_P functions. * semantics.cc (gimplify.h): Include. (expand_or_defer_fn_1): Support DECL_OMP_DECLARE_MAPPER_P functions. (omp_mapper_id, omp_mapper_lookup, omp_extract_mapper_directive, cp_check_omp_declare_mapper): New functions. (remap_mapper_decl_info): New struct. (remap_mapper_decl_1, omp_instantiate_mapper, omp_instantiate_mappers): New functions. (finish_omp_clauses): Delete GOMP_MAP_PUSH_MAPPER_NAME and GOMP_MAP_POP_MAPPER_NAME artificial clauses. (mapper_list): New struct. (find_nested_mappers): New function. (omp_target_walk_data): Add MAPPERS field. (finish_omp_target_clauses_r): Scan for uses of struct/union/class type variables. (finish_omp_target_clauses): Create artificial mapper binding clauses for used structs/unions/classes in offload region. gcc/fortran/ * parse.cc (tree.h, fold-const.h, tree-hash-traits.h): Add includes (for additions to omp-general.h). gcc/ * gimplify.cc (gimplify_omp_ctx): Add IMPLICIT_MAPPERS field. (new_omp_context): Initialise IMPLICIT_MAPPERS hash map. (delete_omp_context): Delete IMPLICIT_MAPPERS hash map. (instantiate_mapper_info, remap_mapper_decl_info): New structs. (remap_mapper_decl_1, omp_instantiate_mapper, omp_find_explicitly_mapped_structs, omp_instantiate_implicit_mappers, new_omp_context_for_scan): New functions. (gimplify_scan_omp_clauses): Add optional USE_MAPPERS parameter. Instantiate implicit mappers if true. Handle artificial mapper_binding
[PATCH v2 10/11] OpenMP: Use OMP_ARRAY_SECTION instead of TREE_LIST for array sections in C FE
This patch uses the new OMP_ARRAY_SECTION tree code to represent OpenMP array sections, rather than TREE_LIST. As for the C++ FE, using TREE_LIST becomes unwieldy when the array-section representation sticks around for longer due to adding "declare mapper" support. We must be a little careful here because OMP_CLAUSE_DEPEND and OMP_CLAUSE_AFFINITY also use TREE_LIST for their own purposes, and we're not changing those ones. No behavioural changes should be introduced by this patch. 2022-03-04 Julian Brown gcc/c/ * c-parser.cc (c_parser_omp_variable_list): Use OMP_ARRAY_SECTION instead of TREE_LIST for array sections. (c_parser_omp_clause_reduction): Likewise. * c-typeck.cc (handle_omp_array_sections_1, handle_omp_array_sections, c_oacc_check_attachments, c_finish_omp_clauses): Likewise. --- gcc/c/c-parser.cc | 21 +++-- gcc/c/c-typeck.cc | 37 +++-- 2 files changed, 34 insertions(+), 24 deletions(-) diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index cc590e56e75..1ca03b6a632 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -13165,16 +13165,20 @@ c_parser_omp_variable_list (c_parser *parser, } for (int i = dims.length () - 1; i >= 0; i--) - decl = tree_cons (dims[i].low_bound, dims[i].length, decl); + decl = build3_loc (loc, OMP_ARRAY_SECTION, TREE_TYPE (decl), + decl, dims[i].low_bound, dims[i].length); } else if (TREE_CODE (decl) == INDIRECT_REF) { + tree type = TREE_TYPE (decl); + /* Turn *foo into the representation previously used for foo[0]. */ decl = TREE_OPERAND (decl, 0); STRIP_NOPS (decl); - decl = tree_cons (integer_zero_node, integer_one_node, decl); + decl = build3_loc (loc, OMP_ARRAY_SECTION, type, decl, +integer_zero_node, integer_one_node); } else if (TREE_CODE (decl) == ARRAY_REF) { @@ -13183,7 +13187,8 @@ c_parser_omp_variable_list (c_parser *parser, decl = TREE_OPERAND (decl, 0); STRIP_NOPS (decl); - decl = tree_cons (idx, integer_one_node, decl); + decl = build3_loc (loc, OMP_ARRAY_SECTION, TREE_TYPE (decl), +decl, idx, integer_one_node); } else if (TREE_CODE (decl) == NON_LVALUE_EXPR || CONVERT_EXPR_P (decl)) @@ -13345,7 +13350,9 @@ c_parser_omp_variable_list (c_parser *parser, } else for (unsigned i = 0; i < dims.length (); i++) - t = tree_cons (dims[i].low_bound, dims[i].length, t); + t = build3_loc (clause_loc, OMP_ARRAY_SECTION, + TREE_TYPE (t), t, dims[i].low_bound, + dims[i].length); } if ((kind == OMP_CLAUSE_DEPEND || kind == OMP_CLAUSE_AFFINITY) @@ -15061,13 +15068,15 @@ c_parser_omp_clause_reduction (c_parser *parser, enum omp_clause_code kind, for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c)) { tree d = OMP_CLAUSE_DECL (c), type; - if (TREE_CODE (d) != TREE_LIST) + if (TREE_CODE (d) != OMP_ARRAY_SECTION) type = TREE_TYPE (d); else { int cnt = 0; tree t; - for (t = d; TREE_CODE (t) == TREE_LIST; t = TREE_CHAIN (t)) + for (t = d; + TREE_CODE (t) == OMP_ARRAY_SECTION; + t = TREE_OPERAND (t, 0)) cnt++; type = TREE_TYPE (t); while (cnt > 0) diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index 162ed0a18a2..98212c6b7f5 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -13218,7 +13218,7 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, enum c_omp_region_type ort) { tree ret, low_bound, length, type; - if (TREE_CODE (t) != TREE_LIST) + if (TREE_CODE (t) != OMP_ARRAY_SECTION) { if (error_operand_p (t)) return error_mark_node; @@ -13293,14 +13293,14 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, return ret; } - ret = handle_omp_array_sections_1 (c, TREE_CHAIN (t), types, + ret = handle_omp_array_sections_1 (c, TREE_OPERAND (t, 0), types, maybe_zero_len, first_non_one, ort); if (ret == error_mark_node || ret == NULL_TREE) return ret; type = TREE_TYPE (ret); - low_bound = TREE_PURPOSE (t); - length = TREE_VALUE (t); + low_bound = TREE_OPERAND (t, 1); + length = TREE_OPERAND (t, 2); if (low_bound == error_mark_node || length == error_mark_node)
[PATCH v2 11/11] OpenMP: Support OpenMP 5.0 "declare mapper" directives for C
This patch adds support for "declare mapper" directives (and the "mapper" modifier on "map" clauses) for C. As for C++, arrays of custom-mapped objects are not supported yet. I've taken hints from the existing C support for "declare reduction" directives: this works a little differently from C++ for things such as looking up user-defined reductions (or user-defined mappers, in our case). Some support functions have been pulled out of the C++ FE and shared with the C implementation: several language hooks have been added to facilitate that, given the above differences. (Fortran FE support is TBD.) 2022-03-17 Julian Brown gcc/c-family/ * c-common.h (omp_mapper_list, c_omp_find_nested_mappers, c_omp_instantiate_mappers): Add forward declarations/prototypes. * c-omp.cc (c_omp_find_nested_mappers): New function. (remap_mapper_decl_info): New struct. (remap_mapper_decl_1, omp_instantiate_mapper, c_omp_instantiate_mappers): Add functions. * c-decl.cc (c_omp_mapper_id, c_omp_mapper_decl, c_omp_mapper_lookup, c_omp_extract_mapper_directive, c_omp_map_array_section, c_omp_scan_mapper_bindings_r, c_omp_scan_mapper_bindings): New functions. gcc/c/ * c-objc-common.h (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES, LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE, LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define langhooks for C. * c-parser.cc (c_parser_omp_clause_map): Add KIND parameter. Handle mapper modifier. (c_parser_omp_all_clauses): Update call to c_parser_omp_clause_map with new kind argument. (c_parser_omp_target): Instantiate explicit mappers and record bindings for implicit mappers. (c_parser_omp_declare_mapper): Parse "declare mapper" directives. (c_parser_omp_declare): Support "declare mapper". * c-tree.h (c_omp_finish_mapper_clauses, c_omp_mapper_lookup, c_omp_extract_mapper_directive, c_omp_map_array_section, c_omp_mapper_id, c_omp_mapper_decl, c_omp_scan_mapper_bindings, c_omp_instantiate_mappers): Add prototypes. * c-typeck.cc (c_finish_omp_clauses): Handle GOMP_MAP_PUSH_MAPPER_NAME and GOMP_MAP_POP_MAPPER_NAME. (c_omp_finish_mapper_clauses): New function (langhook). gcc/cp/ * cp-objcp-common.h (LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE, LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define langhooks for C++. * cp-tree.h (cxx_omp_mapper_lookup, cxx_omp_extract_mapper_directive, cxx_omp_map_array_section): Add prototypes. * parser.cc (cp_parser_omp_target): Use new name for c_omp_instantiate_mappers. * pt.cc (tsubst_omp_clauses): Use new name for c_omp_instantiate_mappers. (omp_mapper_lookup): Rename to... (cxx_omp_mapper_lookup): This. (omp_extract_mapper_directive): Rename to... (cxx_omp_extract_mapper_directive): This. (cxx_omp_map_array_section): New function. (remap_mapper_decl_info, remap_mapper_decl_1, omp_instantiate_mapper, omp_instantiate_mappers, mapper_list, find_nested_mappers): Remove. (omp_target_walk_data): Rename mapper_list to omp_mapper_list. (finish_omp_target_clauses_r): Likewise. Use renamed cxx_omp_mapper_lookup, cxx_omp_extract_mapper_directive and c_omp_find_nested_mappers. (finish_omp_target_clauses): Likewise. gcc/ * gimplify.cc (omp_instantiate_mapper): Use omp_map_array_section langhook to handle (singleton only, for now) array sections. Diagnose attempts to use length >1 array sections with custom mappers. (gimplify_scan_omp_clauses): Use omp_extract_mapper_directive langhook. * langhooks-def.h (lhd_omp_mapper_lookup, lhd_omp_extract_mapper_directive, lhd_omp_map_array_section): Add prototypes. (LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE, LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define lang hooks. (LANG_HOOKS_DECLS): Add LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE, LANG_HOOKS_OMP_MAP_ARRAY_SECTION. * langhooks.cc (lhd_omp_mapper_lookup, lhd_omp_extract_mapper_directive, lhd_omp_map_array_section): New default definitions of langhooks. * langhooks.h (lang_hooks_for_decls): Add omp_mapper_lookup, omp_extract_mapper_directive, omp_map_array_section. * omp-general.h (omp_mapper_list): New. gcc/testsuite/ * g++.dg/gomp/declare-mapper-3.C: Remove from here. * c-c++-common/gomp/declare-mapper-3.c: Move test here, make C-compatible. * g++.dg/gomp/declare-mapper-4.C: Remove from here. * c-c++-common/gomp/declare-mapper-4.c: Move test here, make C-compatible. * c-c++-common/gomp/declare-mapper-5.c: New test. * c-c++-common
Re: [Patch] Fortran/OpenMP: Improve associate-name diagnostic [PR103039]
On Fri, Mar 18, 2022 at 05:27:02PM +0100, Tobias Burnus wrote: > SELECT TYPE, SELECT RANK and ASSOCIATE have > associate-name => selector > and create a pointer to the selector. > > GCC was fixed to handle CLASS properly in > class(t) :: var > !$omp ... firstprivate(var) > As a side effect, firstprivate(assoc_name) now also gets > handled that way, effectively trying to firstprivate(selector) > which should be shared... > > While firstprivate(var) does not appear explicitly, it gets > added via gfc_omp_predetermined_sharing. > > I went for the simple solution and handle it only > in gfortran's ctor/dtor. > > An alternative would be to set OMP_CLAUSE_FIRSTPRIVATE_NO_REFERENCE, > which is currently only set for C++'s __for_end / __for_range > and then later process it in ctor/dtor. I am not sure whether that's > really best and what's the best way to propagate it. One way would > be to create and use OMP_CLAUSE_DEFAULT_FIRSTPRIVATE_NO_REFERENCE. > > OK as is (simple version) – or is a fuller version better. If so, > suggestion how to do this best? LGTM. > Fortran/OpenMP: Fix privatization of associated names > > gfc_omp_predetermined_sharing cases the associate-name pointer variable > to be OMP_CLAUSE_DEFAULT_FIRSTPRIVATE, which is fine. However, the associated > selector is shared. Thus, the target of associate-name pointer should not get > copied. (It was before but because of gfc_omp_privatize_by_reference returning > false, the selector was not only wrongly copied but this was also not done > properly.) > > gcc/fortran/ChangeLog: > > PR fortran/103039 > * trans-openmp.cc (gfc_omp_clause_copy_ctor, gfc_omp_clause_dtor): > Only privatize pointer for associate names. > > libgomp/ChangeLog: > > PR fortran/103039 > * testsuite/libgomp.fortran/associate4.f90: New test. > > gcc/fortran/trans-openmp.cc | 10 +++ > libgomp/testsuite/libgomp.fortran/associate4.f90 | 92 > > 2 files changed, 102 insertions(+) > Jakub
Re: [PATCH] c++: alias template and empty parameter packs [PR104008]
On 3/16/22 17:18, Marek Polacek wrote: Zero-length pack expansions are treated as if no list were provided at all, that is, with template struct S { }; template void g() { S...>; } g will result in S<>. In the following test we have something similar: template using IsOneOf = disjunction...>; and then we have "IsOneOf..." where OtherHolders is an empty pack. Since r11-7931, we strip_typedefs in TYPE_PACK_EXPANSION. In this test that results in "IsOneOf" being turned into "disjunction<>". So the whole expansion is now "disjunction<>...". But then we error in make_pack_expansion because find_parameter_packs_r won't find the pack OtherHolders. We strip the alias template because dependent_alias_template_spec_p says it's not dependent. It it not dependent because this alias is not TEMPLATE_DECL_COMPLEX_ALIAS_P. My understanding is that currently we consider an alias complex if it 1) expands a pack from the enclosing class, as in template typename... TT> struct S { template using X = P...>; }; where the alias expands TT; or 2) the expansion does *not* name all the template parameters, as in template struct R; template using U = R...>; where T is not named in the expansion. But IsOneOf is neither. And it can't know how it's going to be used. Therefore I think we cannot make it complex (and in turn dependent) to fix this bug. After much gnashing of teeth, I think we simply want to avoid stripping the alias if the new pattern doesn't have any parameter packs to expand. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11? PR c++/104008 gcc/cp/ChangeLog: * tree.cc (strip_typedefs): Don't strip an alias template when doing so would result in losing a parameter pack. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/variadic-alias3.C: New test. * g++.dg/cpp0x/variadic-alias4.C: New test. --- gcc/cp/tree.cc | 13 +- gcc/testsuite/g++.dg/cpp0x/variadic-alias3.C | 45 ++ gcc/testsuite/g++.dg/cpp0x/variadic-alias4.C | 48 3 files changed, 105 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-alias3.C create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-alias4.C diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc index 6e9be713c51..eb59e56610b 100644 --- a/gcc/cp/tree.cc +++ b/gcc/cp/tree.cc @@ -1778,7 +1778,18 @@ strip_typedefs (tree t, bool *remove_attributes, unsigned int flags) if (TYPE_P (pat)) { type = strip_typedefs (pat, remove_attributes, flags); - if (type != pat) + /* Empty packs can thwart our efforts here. Consider + + template + using IsOneOf = disjunction...>; + + where IsOneOf seemingly uses all of its template parameters in + its expansion (and does not expand a pack from the enclosing + class), so the alias is not marked as complex. However, it may + be used as in "IsOneOf", where Ts is an empty parameter pack, + and stripping it down into "disjunction<>" here would exclude the + Ts pack, resulting in an error. */ + if (type != pat && uses_parameter_packs (type)) I wonder if we want to verify that the result of uses_parameter_packs matches PACK_EXPANSION_PARAMETER_PACKS, not just that it's non-null? But I can't think how to write a testcase where those two questions have different answers. The patch is OK, thanks. { result = copy_node (t); PACK_EXPANSION_PATTERN (result) = type; diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-alias3.C b/gcc/testsuite/g++.dg/cpp0x/variadic-alias3.C new file mode 100644 index 000..6b6dd9f4c85 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/variadic-alias3.C @@ -0,0 +1,45 @@ +// PR c++/104008 +// { dg-do compile { target c++11 } } + +template struct conjunction; +template struct disjunction; +template struct is_same; +template struct enable_if; +template using enable_if_t = typename enable_if<_Cond>::type; +struct B; +struct __uniq_ptr_impl { + struct _Ptr { +using type = B *; + }; + using pointer = _Ptr::type; +}; +struct unique_ptr { + using pointer = __uniq_ptr_impl::pointer; + unique_ptr(pointer); +}; +template +using IsOneOf = disjunction...>; + +template struct any_badge; + +struct badge { + badge(any_badge<>); + badge(); +}; + +template struct any_badge { + template ...>::value>> + any_badge(); +}; + +template unique_ptr make_unique(_Args... __args); + +struct B { + B(badge); + unique_ptr b_ = make_unique(badge{}); +}; + +template unique_ptr make_unique(_Args... __args) { + return new B(__args...); +} diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-alias4.C b/gcc/testsuite/g++.dg/cpp0x/variadic-alias4.C new file mode 100644 index 000..896a4725627 --- /dev/n
Re: [PATCH] c++, v2: Fix up constexpr evaluation of new with zero sized types [PR104568]
On 3/16/22 12:55, Jakub Jelinek wrote: On Tue, Mar 15, 2022 at 04:19:05PM -0400, Jason Merrill wrote: But if you strongly prefer it that way, I can do that. Note, probably not 3 new args but 4, depends on whether we could turn all those cases where the tree arg0 = CALL_EXPR_ARG (oldop, 0); is done but var_size_adjusted is false into assertion failures. I'm worried that with the zero size of element we could end up with a variable number of elements which when multiplied by 0 gives constant 0, though hopefully that would be rejected earlier during constant evaluation. Or we could move all the adjustment into a separate function and only ever pass the number of elements to build_new_constexpr_heap_type? So like this? 2022-03-16 Jakub Jelinek PR c++/104568 * init.cc (build_new_constexpr_heap_type): Remove FULL_SIZE argument and its handling, instead add ITYPE2 argument. Only support COOKIE_SIZE != NULL. (build_new_1): If size is 0, change it to 0 * outer_nelts if outer_nelts is non-NULL. Pass type rather than elt_type to maybe_wrap_new_for_constexpr. * constexpr.cc (build_new_constexpr_heap_type): New function. (cxx_eval_constant_expression) : If elt_size is zero sized type, try to recover outer_nelts from the size argument to operator new/new[] and pass that as arg_size to build_new_constexpr_heap_type. Pass ctx, non_constant_p and overflow_p to that call too. * g++.dg/cpp2a/constexpr-new22.C: New test. --- gcc/cp/constexpr.cc.jj 2022-03-16 15:25:26.294551244 +0100 +++ gcc/cp/constexpr.cc 2022-03-16 17:27:08.202184961 +0100 @@ -6422,6 +6422,84 @@ maybe_warn_about_constant_value (locatio } } +/* For element type ELT_TYPE, return the appropriate type of the heap object + containing such element(s). COOKIE_SIZE is NULL or the size of cookie + in bytes. If COOKIE_SIZE is NULL, return array type + ELT_TYPE[FULL_SIZE / sizeof(ELT_TYPE)], otherwise return + struct { size_t[COOKIE_SIZE/sizeof(size_t)]; ELT_TYPE[N]; } + where N is is computed such that the size of the struct fits into FULL_SIZE. + If ARG_SIZE is non-NULL, it is the first argument to the new operator. + It should be passed if ELT_TYPE is zero sized type in which case FULL_SIZE + will be also 0 and so it is not possible to determine the actual array + size. CTX, NON_CONSTANT_P and OVERFLOW_P are used during constant + expression evaluation of subexpressions of ARG_SIZE. */ + +tree Let's make this static. +build_new_constexpr_heap_type (const constexpr_ctx *ctx, tree elt_type, + tree cookie_size, tree full_size, tree arg_size, + bool *non_constant_p, bool *overflow_p) +{ + gcc_assert (cookie_size == NULL_TREE || tree_fits_uhwi_p (cookie_size)); + gcc_assert (tree_fits_uhwi_p (full_size)); + unsigned HOST_WIDE_INT csz = cookie_size ? tree_to_uhwi (cookie_size) : 0; + if (arg_size) +{ + STRIP_NOPS (arg_size); + if (cookie_size) + { + if (TREE_CODE (arg_size) != PLUS_EXPR) + arg_size = NULL_TREE; + else if (TREE_CODE (TREE_OPERAND (arg_size, 0)) == INTEGER_CST + && tree_int_cst_equal (cookie_size, + TREE_OPERAND (arg_size, 0))) + { + arg_size = TREE_OPERAND (arg_size, 1); + STRIP_NOPS (arg_size); + } + else if (TREE_CODE (TREE_OPERAND (arg_size, 1)) == INTEGER_CST + && tree_int_cst_equal (cookie_size, + TREE_OPERAND (arg_size, 1))) + { + arg_size = TREE_OPERAND (arg_size, 0); + STRIP_NOPS (arg_size); + } + else + arg_size = NULL_TREE; + } + if (arg_size && TREE_CODE (arg_size) == MULT_EXPR) + { + tree op0 = TREE_OPERAND (arg_size, 0); + tree op1 = TREE_OPERAND (arg_size, 1); + if (integer_zerop (op0)) + arg_size + = cxx_eval_constant_expression (ctx, op1, false, non_constant_p, + overflow_p); + else if (integer_zerop (op1)) + arg_size + = cxx_eval_constant_expression (ctx, op0, false, non_constant_p, + overflow_p); + else + arg_size = NULL_TREE; + } + else + arg_size = NULL_TREE; +} + + unsigned HOST_WIDE_INT fsz = tree_to_uhwi (arg_size ? arg_size : full_size); + unsigned HOST_WIDE_INT esz = int_size_in_bytes (elt_type); esz could move inside the if. OK with those changes, thanks. + if (!arg_size) +{ + gcc_assert (fsz >= csz); + fsz -= csz; + if (esz) + fsz /= esz; +} + tree itype2 = build_index_type (size_int (fsz - 1)); + if (!cookie_size) +return build_cplus_array_type (elt_type, itype2); + re
[PATCH] Allow (void *) 0xdeadbeef accesses without warnings [PR99578]
Hi! Starting with GCC11 we keep emitting false positive -Warray-bounds or -Wstringop-overflow etc. warnings on widely used *(type *)0x12345000 style accesses (or memory/string routines to such pointers). This is a standard programming style supported by all C/C++ compilers I've ever tried, used mostly in kernel or DSP programming, but sometimes also together with mmap MAP_FIXED when certain things, often I/O registers but could be anything else too are known to be present at fixed addresses. Such INTEGER_CST addresses can appear in code either because a user used it like that (in which case it is fine) or because somebody used pointer arithmetics (including &((struct whatever *)NULL)->field) on a NULL pointer. The middle-end warning code wrongly assumes that the latter case is what is very likely, while the former is unlikely and users should change their code. The following patch attempts to differentiate between the two, but because we are in stage4, does it in a temporary compromise way. For GCC 13, the aim is to represent in the IL the user (type *)0x12345000 style addresses as INTEGER_CSTs as before, and represent the addresses derived from pointer arithmetics on NULL pointer as &MEM_REF[(void*)0B + offset]. When we see say POINTER_PLUS of NULL and non-zero offset, I think we should fold it to this form. We IMNSHO need to support the poor man's offsetof I'm afraid still in wide use, so we should IMHO fold casts of such addresses to integers into INTEGER_CST offset. SCCVN IMHO shouldn't the &MEM_REF[(void*)0B + offset] and (void*)offset forms as equivalent (yes, it will be a small missed optimization but RTL will see them the same), but e.g. the patch already does ==/!= folding between the two forms. As doing this fully seems to be more like a stage1 project, the following patch keeps doing what GCC11/GCC12 did until now for pointers smaller than a parameter, by default equal to 4KB, which means that PR102037 will remain unfixed for now (unless users lower the new param manually) and for the most common cases nothing changes right now, but for higher addresses we'll assume that such INTEGER_CSTs are valid direct address accesses and use the &MEM_REF[(void*)0B + offset] forms for the invalid code on which we'll warn. Yet another alternative would be just the params.opt and pointer-query.cc (compute_objsize_r) hunks which will stop warning if we go more than 4KB from a NULL pointer altogether and doing the full set of changes only in GCC 13 (I've bootstrapped/regtested such patch last night on x86_64-linux and i686-linux). So far tested with make check-gcc/check-g++ on x86_64-linux, furthermore also tested with the parameter's default changed from 4096 to 16. Ok for trunk if it passes full bootstrap/regtest? 2022-03-18 Jakub Jelinek PR middle-end/99578 PR middle-end/100680 PR tree-optimization/100834 * params.opt (--param=min-pagesize=): New parameter. * gimple-expr.cc (is_gimple_invariant_address): Allow ADDR_EXPR of MEM_REF with integer_zerop as first operand as invariant. (is_gimple_ip_invariant_address): Likewise. * match.pd (&MEM[0 + off] == off): New simplification. * gimple-ssa-warn-restrict.cc (builtin_memref::set_base_and_offset): Ignore MEM_REFs with integer_zerop as address. * gimple-fold.cc (maybe_canonicalize_mem_ref_addr): Don't fold &MEM[0 + off] into off if off is larger than min-pagesize. * gimple-array-bounds.cc (array_bounds_checker::check_mem_ref): Return false for MEM_REFs with integer_zerop as address. * pointer-query.cc (handle_mem_ref): Handle MEM_REFs with integer_zerop as address specially. (compute_objsize_r) : Formatting fix. (compute_objsize_r) : Use maximum object size instead of zero for pointer constants equal or larger than min-pagesize. * fold-const.cc (build_fold_addr_expr_with_type_loc): * gcc.dg/tree-ssa/pr99578-1.c: New test. * gcc.dg/pr99578-1.c: New test. * gcc.dg/pr99578-2.c: New test. * gcc.dg/pr99578-3.c: New test. * gcc.dg/pr100680.c: New test. * gcc.dg/pr100834.c: New test. --- gcc/params.opt.jj 2022-03-17 18:48:06.723406664 +0100 +++ gcc/params.opt 2022-03-18 11:59:31.518452714 +0100 @@ -613,6 +613,10 @@ The maximum number of insns in loop head Common Joined UInteger Var(param_max_modulo_backtrack_attempts) Init(40) Param Optimization The maximum number of backtrack attempts the scheduler should make when modulo scheduling a loop. +-param=min-pagesize= +Common Joined UInteger Var(param_min_pagesize) Init(4096) Param Optimization +Minimum page size for warning purposes. + -param=max-partial-antic-length= Common Joined UInteger Var(param_max_partial_antic_length) Init(100) Param Optimization Maximum length of partial antic set when performing tree pre optimization. --- gcc/gimple-expr.cc.jj 2022-03-17 18:48:06.69140710
[r12-7699 Regression] FAIL: g++.dg/torture/pr104601.C -Os (test for excess errors) on Linux/x86_64
On Linux/x86_64, ac73c944eac88f37db2767aa4acc7ff6f4983f21 is the first bad commit commit ac73c944eac88f37db2767aa4acc7ff6f4983f21 Author: Jonathan Wakely Date: Thu Mar 17 16:45:43 2022 + libstdc++: Reduce header dependencies from PSTL headers [PR92546] caused FAIL: g++.dg/torture/pr104601.C -O0 (test for excess errors) FAIL: g++.dg/torture/pr104601.C -O1 (test for excess errors) FAIL: g++.dg/torture/pr104601.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: g++.dg/torture/pr104601.C -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) FAIL: g++.dg/torture/pr104601.C -O2 (test for excess errors) FAIL: g++.dg/torture/pr104601.C -O3 -g (test for excess errors) FAIL: g++.dg/torture/pr104601.C -Os (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-7699/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr104601.C --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr104601.C --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr104601.C --target_board='unix{-m64}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr104601.C --target_board='unix{-m64\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at skpgkp2 at gmail dot com)
Re: [PATCH] Fix PR 101515 (ICE in pp_cxx_unqualified_id, at cp/cxx-pretty-print.c:128)
On 3/15/22 12:06, Jakub Jelinek wrote: On Tue, Mar 15, 2022 at 11:57:22AM -0400, Jason Merrill wrote: The intent of r11-6729 is that it prints something that helps user to figure out what exactly is being accessed. When we find a unique non-static data member that is being accessed, even when we can't fold it nicely, IMNSHO it is better to print ((sometype *)&var)->field or (*(sometype *)&var).field instead of *(fieldtype *)((char *)&var + 56) because the user doesn't know what is at offset 56, we shouldn't ask user to decipher structure layout etc. The problem is that the reference is *not* to any non-static data member, it's to the PMF as a whole. But c_fold_indirect_ref_for_warn wrongly turns it into a reference to the first non-static data member. We asked c_fold_indirect_ref_warn to fold a MEM_REF with RECORD_TYPE, and it gave us back a COMPONENT_REF with POINTER_TYPE. That seems clearly wrong. That is not what I see on the testcase. I see the outer c_fold_indirect_ref_for_warn call with type ptrmemfunc which is a 64-bit RECORD_TYPE containing a single ptr member which has pointer to function type, and op which is the x VAR_DECL with sp type which is 128-bit RECORD_TYPE containing 64-bit __pfn member and 64-bit __delta member. Exactly: TYPE is RECORD_TYPE, TREE_TYPE (field) is POINTER_TYPE. As all the bits of the ptrmemfunc RECORD_TYPE fit within the __pfn member (they are equal size), it wants to print (cast)(something.__pfn). I disagree that this is what we want; we asked to fold an expression with class type, it seems unlikely that we want to get back an expression with pointer type. Jason
Re: [PATCH] Allow (void *) 0xdeadbeef accesses without warnings [PR99578]
> Am 18.03.2022 um 18:01 schrieb Jakub Jelinek via Gcc-patches > : > > Hi! > > Starting with GCC11 we keep emitting false positive -Warray-bounds or > -Wstringop-overflow etc. warnings on widely used *(type *)0x12345000 > style accesses (or memory/string routines to such pointers). > This is a standard programming style supported by all C/C++ compilers > I've ever tried, used mostly in kernel or DSP programming, but sometimes > also together with mmap MAP_FIXED when certain things, often I/O registers > but could be anything else too are known to be present at fixed > addresses. > > Such INTEGER_CST addresses can appear in code either because a user > used it like that (in which case it is fine) or because somebody used > pointer arithmetics (including &((struct whatever *)NULL)->field) on > a NULL pointer. The middle-end warning code wrongly assumes that the > latter case is what is very likely, while the former is unlikely and > users should change their code. > > The following patch attempts to differentiate between the two, but > because we are in stage4, does it in a temporary compromise way. > For GCC 13, the aim is to represent in the IL the user (type *)0x12345000 > style addresses as INTEGER_CSTs as before, and represent the addresses > derived from pointer arithmetics on NULL pointer as > &MEM_REF[(void*)0B + offset]. When we see say POINTER_PLUS of > NULL and non-zero offset, I think we should fold it to this form. > We IMNSHO need to support the poor man's offsetof I'm afraid still > in wide use, so we should IMHO fold casts of such addresses to integers > into INTEGER_CST offset. SCCVN IMHO shouldn't the > &MEM_REF[(void*)0B + offset] and (void*)offset forms as equivalent (yes, > it will be a small missed optimization but RTL will see them the same), > but e.g. the patch already does ==/!= folding between the two forms. > As doing this fully seems to be more like a stage1 project, the following > patch keeps doing what GCC11/GCC12 did until now for pointers smaller than > a parameter, by default equal to 4KB, which means that PR102037 will > remain unfixed for now (unless users lower the new param manually) and for > the most common cases nothing changes right now, but for higher > addresses we'll assume that such INTEGER_CSTs are valid direct address > accesses and use the &MEM_REF[(void*)0B + offset] forms for the invalid > code on which we'll warn. > > Yet another alternative would be just the params.opt and > pointer-query.cc (compute_objsize_r) hunks which will stop warning > if we go more than 4KB from a NULL pointer altogether and doing > the full set of changes only in GCC 13 (I've bootstrapped/regtested > such patch last night on x86_64-linux and i686-linux). I think I’d prefer this simpler approach at this point. Thus that one is OK. Richard. > > So far tested with make check-gcc/check-g++ on x86_64-linux, > furthermore also tested with the parameter's default changed from 4096 > to 16. > > Ok for trunk if it passes full bootstrap/regtest? > > 2022-03-18 Jakub Jelinek > >PR middle-end/99578 >PR middle-end/100680 >PR tree-optimization/100834 >* params.opt (--param=min-pagesize=): New parameter. >* gimple-expr.cc (is_gimple_invariant_address): Allow ADDR_EXPR of >MEM_REF with integer_zerop as first operand as invariant. >(is_gimple_ip_invariant_address): Likewise. >* match.pd (&MEM[0 + off] == off): New simplification. >* gimple-ssa-warn-restrict.cc (builtin_memref::set_base_and_offset): >Ignore MEM_REFs with integer_zerop as address. >* gimple-fold.cc (maybe_canonicalize_mem_ref_addr): Don't fold >&MEM[0 + off] into off if off is larger than min-pagesize. >* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref): Return >false for MEM_REFs with integer_zerop as address. >* pointer-query.cc (handle_mem_ref): Handle MEM_REFs with integer_zerop >as address specially. >(compute_objsize_r) : Formatting fix. >(compute_objsize_r) : Use maximum object size instead >of zero for pointer constants equal or larger than min-pagesize. >* fold-const.cc (build_fold_addr_expr_with_type_loc): > >* gcc.dg/tree-ssa/pr99578-1.c: New test. >* gcc.dg/pr99578-1.c: New test. >* gcc.dg/pr99578-2.c: New test. >* gcc.dg/pr99578-3.c: New test. >* gcc.dg/pr100680.c: New test. >* gcc.dg/pr100834.c: New test. > > --- gcc/params.opt.jj2022-03-17 18:48:06.723406664 +0100 > +++ gcc/params.opt2022-03-18 11:59:31.518452714 +0100 > @@ -613,6 +613,10 @@ The maximum number of insns in loop head > Common Joined UInteger Var(param_max_modulo_backtrack_attempts) Init(40) > Param Optimization > The maximum number of backtrack attempts the scheduler should make when > modulo scheduling a loop. > > +-param=min-pagesize= > +Common Joined UInteger Var(param_min_pagesize) Init(4096) Param Optimization > +Minimum page size for warning purposes. > + > -param=max-partial-antic-length= > Comm
[committed] testsuite: Add missing header to test
This fixes a new FAIL caused by my r12-7699 change to libstdc++ headers. Pushed to trunk as obvious. -- >8 -- gcc/testsuite/ChangeLog: * g++.dg/torture/pr104601.C: Include . --- gcc/testsuite/g++.dg/torture/pr104601.C | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/testsuite/g++.dg/torture/pr104601.C b/gcc/testsuite/g++.dg/torture/pr104601.C index bb56479c745..92ad73d47e9 100644 --- a/gcc/testsuite/g++.dg/torture/pr104601.C +++ b/gcc/testsuite/g++.dg/torture/pr104601.C @@ -4,6 +4,7 @@ #include #include +#include inline std::optional foo (std::vector::iterator b, std::vector::iterator c, -- 2.34.1
[PATCH] Remove --with-gmp-dir and --with-mpfr-dir
The top-level configure options --with-gmp-dir and --with-mpfr-dir were obsoleted and marked as "REMOVED" back in 2006. I think that's long enough ago for everyone to have updated their scripts, so this patch removes them entirely. While doing this, I also found one other leftover that wasn't removed by the earlier patch. This is also removed here. ChangeLog 2022-03-18 Tom Tromey * configure.ac: Remove --with-mpfr-dir and --with-gmp-dir. * configure: Rebuild. --- ChangeLog| 5 + configure| 22 +- configure.ac | 14 ++ 3 files changed, 8 insertions(+), 33 deletions(-) diff --git a/configure b/configure index f7e0fa46c9c..69d8fc046c3 100755 --- a/configure +++ b/configure @@ -811,11 +811,9 @@ enable_pgo_build with_mpc with_mpc_include with_mpc_lib -with_mpfr_dir with_mpfr with_mpfr_include with_mpfr_lib -with_gmp_dir with_gmp with_gmp_include with_gmp_lib @@ -1588,14 +1586,12 @@ Optional Packages: --with-mpc-lib=PATH/lib --with-mpc-include=PATH specify directory for installed MPC include files --with-mpc-lib=PATH specify directory for the installed MPC library - --with-mpfr-dir=PATHthis option has been REMOVED --with-mpfr=PATHspecify prefix directory for installed MPFR package. Equivalent to --with-mpfr-include=PATH/include plus --with-mpfr-lib=PATH/lib --with-mpfr-include=PATH specify directory for installed MPFR include files --with-mpfr-lib=PATHspecify directory for the installed MPFR library - --with-gmp-dir=PATH this option has been REMOVED --with-gmp=PATH specify prefix directory for the installed GMP package. Equivalent to --with-gmp-include=PATH/include plus @@ -3768,7 +3764,7 @@ case "${target}" in *-*-dragonfly*) ;; *-*-freebsd*) -if test "x$with_gmp" = x && test "x$with_gmp_dir" = x \ +if test "x$with_gmp" = x \ && ! test -d ${srcdir}/gmp \ && test -f /usr/local/include/gmp.h; then with_gmp=/usr/local @@ -7999,14 +7995,6 @@ fi # Specify a location for mpfr # check for this first so it ends up on the link line before gmp. -# Check whether --with-mpfr-dir was given. -if test "${with_mpfr_dir+set}" = set; then : - withval=$with_mpfr_dir; as_fn_error $? "The --with-mpfr-dir=PATH option has been removed. -Use --with-mpfr=PATH or --with-mpfr-include=PATH plus --with-mpfr-lib=PATH" "$LINENO" 5 -fi - - - # Check whether --with-mpfr was given. if test "${with_mpfr+set}" = set; then : withval=$with_mpfr; @@ -8052,14 +8040,6 @@ fi # Specify a location for gmp -# Check whether --with-gmp-dir was given. -if test "${with_gmp_dir+set}" = set; then : - withval=$with_gmp_dir; as_fn_error $? "The --with-gmp-dir=PATH option has been removed. -Use --with-gmp=PATH or --with-gmp-include=PATH plus --with-gmp-lib=PATH" "$LINENO" 5 -fi - - - # Check whether --with-gmp was given. if test "${with_gmp+set}" = set; then : withval=$with_gmp; diff --git a/configure.ac b/configure.ac index 434b1a267a4..d0f6d215b99 100644 --- a/configure.ac +++ b/configure.ac @@ -1,6 +1,6 @@ # Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, # 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, -# 2014, 2015, 2016, 2019 Free Software Foundation, Inc. +# 2014, 2015, 2016, 2019, 2022 Free Software Foundation, Inc. # # This file is free software; you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by @@ -1021,7 +1021,7 @@ case "${target}" in *-*-dragonfly*) ;; *-*-freebsd*) -if test "x$with_gmp" = x && test "x$with_gmp_dir" = x \ +if test "x$with_gmp" = x \ && ! test -d ${srcdir}/gmp \ && test -f /usr/local/include/gmp.h; then with_gmp=/usr/local @@ -1568,11 +1568,6 @@ fi # Specify a location for mpfr # check for this first so it ends up on the link line before gmp. -AC_ARG_WITH(mpfr-dir, -[AS_HELP_STRING([--with-mpfr-dir=PATH], [this option has been REMOVED])], -[AC_MSG_ERROR([The --with-mpfr-dir=PATH option has been removed. -Use --with-mpfr=PATH or --with-mpfr-include=PATH plus --with-mpfr-lib=PATH])]) - AC_ARG_WITH(mpfr, [AS_HELP_STRING([--with-mpfr=PATH], [specify prefix directory for installed MPFR package. @@ -1612,11 +1607,6 @@ Building GCC with MPFR in the source tree is only handled for MPFR 3.1.0+.]) fi # Specify a location for gmp -AC_ARG_WITH(gmp-dir, -[AS_HELP_STRING([--with-gmp-dir=PATH], [this option has been REMOVED])], -[AC_MSG_ERROR([The --with-gmp-dir=PATH option has been removed. -Use --with-gmp=PATH or --with-gmp-include=PATH plus --with-gmp-lib=PATH])]) - AC_ARG_WITH(gmp, [AS_HELP_STRING([--with-gmp=PATH], [specify prefix directory for the installed GMP package. -- 2.34.1
Re: [PATCH] Ignore (possible) signed zeros in operands of FP comparisons.
On Fri, Mar 18, 2022 at 2:07 PM Andrew MacLeod wrote: > > On 3/18/22 03:43, Roger Sayle wrote: > > Hi Jeff/Andrew, > >> If you're going to do more work in this space, you might want to reach out > >> to > >> Aldy and Andrew to see if there's space for collaboration. > > One (clever?) suggestion that I do have for ranger would be to add support > > for > > an additional value_range_kind, VR_NONZEROBITS, which would be a variant of > > VR_RANGE (for unsigned types?) and require very few changes to the existing > > > I think were ahead of you here.. Tracking known zero and one bits within > irange as an adjunct has been in plan for awhile, just priorities > haven't allowed us to get to it until recently... > > Theres a bunch of stuff already in the hopper for the next stage1 that > Aldy has been working with... he can expound upon it, but we plan to use > both masks and ranges together as appropriate. Yes, for the next stage1 I have patches queued up to provide nonzero bits within the irange object. In working on providing a global range storage (an irange replacement for SSA_NAME_RANGE_INFO) for the next release, it became clear that nonzero and ranges are probably best suited to live in the same space. The current global range mechanism already does this-- by refining nonzero bits with each change to the range for an SSA name. However, we sometimes pessimize global ranges, presumably because we couldn´t properly represent an intersection with our value_range anti-range business. Also, this nonzero+range symbiosis only currently works for global ranges (not iranges or value_ranges). In the upcoming work, the nonzero bits will live in irange, and work in tandem-- for example, sometimes refining the range when a new nonzero mask is available (0xff is really [0,255]) etc. I have the core infrastructure done. I could probably use range-op help from the relevant experts when the time comes. And yes, as Roger points out, the mask tracking bits in CCP could play very well with the ranger world. For that matter, while I was doing the above work I noticed that many of the bitmasks that CCP could determine, would eventually be figured out by evrp, just in the form of ranges. Having nonzero bits in irange opens up all sorts of possibilities. > > > > code. Just like VR_RANGE all values would lie in [MIN, MAX], so by default > > treating this value_range_kind identically to VR_RANGE there should be no > > visible changes, but the change in semantics is that MIN has the minimum > > bits > > set, and MAX, the maximum bits set [equivalent to the RVAL and RMASK pairs > > from CCP's bit_value_{bin,un}op]. Hence, the VR_NONZEROBITS range [2,7] > > would represent the possible values {2, 3, 6, 7} rather than {2, 3, 4, 5, > > 6, 7}. > > For a small number of bits, int_range can already handle this with multiple > > irange spans, but adding this representation would allow the unification of > > the > > bit-based propagation performed in tree-ssa-ccp with the range-value based > > propagation performed in EVRP/ranger, allowing the clever forwards/backwards > > functionality. > > > > As Andrew's recent (partial) review points out, tracking the effect of > > operations > > like BIT_XOR_EXPR on VR_RANGE is much more complicated than on the > > proposed VR_NONZEROBITS. > > > > Alas, I'm not the sort of contributor to make large infrastructure changes > > myself, but if the above functionality were in place, I/the compiler would > > be able to make use of it. > > > > And this is exactly what we are hoping.. we provide the structure and > someone who is better at the underlying detail interaction can flesh out > whatever specifics they find interesting. Yup. We´re on the other side of the spectrum, the ability to provide infrastructure with little expertise in the fine details (range ops, floats, etc). :-) Aldy > > > Andrew >
[pushed] c++: using lookup within class defn [PR104476]
The problem in both PR92918 and PR104476 is overloading of base member functions brought in by 'using' with direct member functions during parsing of the class body. To this point they've had a troublesome coexistence which was resolved by set_class_bindings when the class is complete, but we also need to handle lookup within the class body, such as in a trailing return type. The problem was that push_class_level_binding would either clobber the using-decl with the direct members or vice-versa. In older versions of GCC we only pushed dependent usings, and preferring the dependent using made sense, as it expresses a type-dependent overload set that we can't do anything useful with. But when we started keeping non-dependent usings around, push_class_level_binding in particular wasn't adjusted accordingly. This patch makes that adjustment, and pushes the functions imported by a non-dependent using immediately from finish_member_declaration. This made diagnosing redundant using-decls a bit awkward, since we no longer push the using-decl itself; I handle that by noticing when we try to add the same function again and searching TYPE_FIELDS for the previous using-decl. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/92918 PR c++/104476 gcc/cp/ChangeLog: * class.cc (add_method): Avoid adding the same used function twice. (handle_using_decl): Don't add_method. (finish_struct): Don't using op= if we have one already. (maybe_push_used_methods): New. * semantics.cc (finish_member_declaration): Call it. * name-lookup.cc (diagnose_name_conflict): No longer static. (push_class_level_binding): Revert 92918 patch, limit to dependent using. * cp-tree.h: Adjust. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/pr85070.C: Remove expected error. * g++.dg/lookup/using66a.C: New test. * g++.dg/lookup/using67.C: New test. --- gcc/cp/cp-tree.h | 2 + gcc/cp/class.cc| 136 +++-- gcc/cp/name-lookup.cc | 16 ++- gcc/cp/semantics.cc| 5 +- gcc/testsuite/g++.dg/cpp0x/pr85070.C | 4 +- gcc/testsuite/g++.dg/lookup/using66a.C | 22 gcc/testsuite/g++.dg/lookup/using67.C | 20 7 files changed, 142 insertions(+), 63 deletions(-) create mode 100644 gcc/testsuite/g++.dg/lookup/using66a.C create mode 100644 gcc/testsuite/g++.dg/lookup/using67.C diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index a4bc48a4a20..1bd7bc6fca2 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -6677,6 +6677,7 @@ extern tree build_vfn_ref (tree, tree); extern tree get_vtable_decl(tree, int); extern bool add_method (tree, tree, bool); extern tree declared_access(tree); +extern bool maybe_push_used_methods(tree); extern tree currently_open_class (tree); extern tree currently_open_derived_class (tree); extern tree outermost_open_class (void); @@ -8241,6 +8242,7 @@ extern tree fold_builtin_source_location (location_t); /* in name-lookup.cc */ extern tree strip_using_decl(tree); +extern void diagnose_name_conflict (tree, tree); /* Tell the binding oracle what kind of binding we are looking for. */ diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc index 696155714e8..40e17140db5 100644 --- a/gcc/cp/class.cc +++ b/gcc/cp/class.cc @@ -1049,7 +1049,12 @@ add_method (tree type, tree method, bool via_using) if (via_using && iter.using_p () /* Except handle inherited constructors specially. */ && ! DECL_CONSTRUCTOR_P (fn)) - continue; + { + if (fn == method) + /* Don't add the same one twice. */ + return false; + continue; + } /* [over.load] Member function declarations with the same name and the same parameter types cannot be @@ -1212,10 +1217,11 @@ add_method (tree type, tree method, bool via_using) if (via_using) /* Defer to the local function. */ return false; - else if (flag_new_inheriting_ctors - && DECL_INHERITED_CTOR (fn)) + else if (iter.using_p () + || (flag_new_inheriting_ctors + && DECL_INHERITED_CTOR (fn))) { - /* Remove the inherited constructor. */ + /* Remove the inherited function. */ current_fns = iter.remove_node (current_fns); continue; } @@ -1299,21 +1305,65 @@ declared_access (tree decl) : access_public_node); } +/* If DECL is a non-dependent using of non-ctor function members, push them + and return true, otherwise return false. Called from + finish_member_declaration. */ + +bool +maybe_push_used_methods (tree decl) +{ + if (TREE_CODE (decl) != USING_DECL) +return false; + tree used
Re: [PATCH] Fix PR 101515 (ICE in pp_cxx_unqualified_id, at cp/cxx-pretty-print.c:128)
On Fri, Mar 18, 2022 at 01:35:53PM -0400, Jason Merrill wrote: > On 3/15/22 12:06, Jakub Jelinek wrote: > > On Tue, Mar 15, 2022 at 11:57:22AM -0400, Jason Merrill wrote: > > > > The intent of r11-6729 is that it prints something that helps user to > > > > figure > > > > out what exactly is being accessed. > > > > When we find a unique non-static data member that is being accessed, > > > > even > > > > when we can't fold it nicely, IMNSHO it is better to print > > > > ((sometype *)&var)->field > > > > or > > > > (*(sometype *)&var).field > > > > instead of > > > > *(fieldtype *)((char *)&var + 56) > > > > because the user doesn't know what is at offset 56, we shouldn't ask > > > > user > > > > to decipher structure layout etc. > > > > > > The problem is that the reference is *not* to any non-static data member, > > > it's to the PMF as a whole. But c_fold_indirect_ref_for_warn wrongly > > > turns > > > it into a reference to the first non-static data member. > > > > > > We asked c_fold_indirect_ref_warn to fold a MEM_REF with RECORD_TYPE, and > > > it > > > gave us back a COMPONENT_REF with POINTER_TYPE. That seems clearly wrong. > > > > That is not what I see on the testcase. > > I see the outer c_fold_indirect_ref_for_warn call with type ptrmemfunc > > which is a 64-bit RECORD_TYPE containing a single ptr member which has > > pointer to function type, and op which is the x VAR_DECL with sp type which > > is 128-bit RECORD_TYPE containing 64-bit __pfn member and 64-bit __delta > > member. > > Exactly: TYPE is RECORD_TYPE, TREE_TYPE (field) is POINTER_TYPE. > > > As all the bits of the ptrmemfunc RECORD_TYPE fit within the __pfn member > > (they are equal size), it wants to print (cast)(something.__pfn). > > I disagree that this is what we want; we asked to fold an expression with > class type, it seems unlikely that we want to get back an expression with > pointer type. That doesn't matter. What c_fold_indirect_ref_warn returns is something that can help the user understand what is actually being accessed. Consider slightly modified testcase (which doesn't use the PMFs so that we don't ICE on those too): // PR c++/101515 // { dg-do compile } // { dg-options "-O1 -Wuninitialized" } struct S { int j; }; struct T : public S { virtual void h () {} }; struct U { char buf[32]; void (*ptr) (); }; struct sp { char a[14]; char b[50]; void (*pfn) (); long delta; }; int main () { T t; sp x; U *xp = (U *) &x.b[18]; if (xp->ptr != ((void (*) ()) (sizeof (void * return 1; } Trunk emits: pr101515-2.C: In function ‘int main()’: pr101515-2.C:16:11: warning: ‘*(U*)((char*)&x + offsetof(sp, sp::b[18])).U::ptr’ is used uninitialized [-Wuninitialized] 16 | if (xp->ptr != ((void (*) ()) (sizeof (void * | ^~~ pr101515-2.C:14:6: note: ‘x’ declared here 14 | sp x; | ^ here, which is indeed quite long expression, but valid C++ and helps the user to narrow down what exactly is being accessed. If I comment out the c_fold_indirect_ref_warn RECORD_TYPE handling so that it punts on it, it prints: pr101515-2.C: In function ‘int main()’: pr101515-2.C:16:11: warning: ‘*(U*)((char*)&x + 32).U::ptr’ is used uninitialized [-Wuninitialized] 16 | if (xp->ptr != ((void (*) ()) (sizeof (void * | ^~~ pr101515-2.C:14:6: note: ‘x’ declared here 14 | sp x; | ^ That is also correct C++ expression, but user probably has no idea what is present at offset 32 into the variable. Of course if there is a type match and not any kind of type punning, it will try to print much shorter and more readable expressions. Jakub
[committed] [PR104961] LRA: split hard reg for reload pseudo with clobber
The following patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104961 The patch was successfully bootstrapped and tested on x86-64. commit 4e2291789a8b31c550271405782356e8aeddcee3 Author: Vladimir N. Makarov Date: Fri Mar 18 14:23:40 2022 -0400 [PR104961] LRA: split hard reg for reload pseudo with clobber. Splitting hard register live range did not work for subreg of a multi-reg reload pseudo. Reload insns for such pseudo contain clobber of the pseudo and splitting did not take this into account. The patch fixes it. gcc/ChangeLog: PR rtl-optimization/104961 * lra-assigns.cc (find_reload_regno_insns): Process reload pseudo clobber. gcc/testsuite/ChangeLog: PR rtl-optimization/104961 * gcc.target/i386/pr104961.c: New. diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc index ab3a6e6e9cc..af30a673142 100644 --- a/gcc/lra-assigns.cc +++ b/gcc/lra-assigns.cc @@ -1706,7 +1706,8 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish) { unsigned int uid; bitmap_iterator bi; - int n = 0; + int insns_num = 0; + bool clobber_p = false; rtx_insn *prev_insn, *next_insn; rtx_insn *start_insn = NULL, *first_insn = NULL, *second_insn = NULL; @@ -1714,28 +1715,32 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish) { if (start_insn == NULL) start_insn = lra_insn_recog_data[uid]->insn; - n++; + if (GET_CODE (PATTERN (lra_insn_recog_data[uid]->insn)) == CLOBBER) + clobber_p = true; + else + insns_num++; } - /* For reload pseudo we should have at most 3 insns referring for + /* For reload pseudo we should have at most 3 insns besides clobber referring for it: input/output reload insns and the original insn. */ - if (n > 3) + if (insns_num > 3) return false; - if (n > 1) + if (clobber_p) +insns_num++; + if (insns_num > 1) { for (prev_insn = PREV_INSN (start_insn), next_insn = NEXT_INSN (start_insn); - n != 1 && (prev_insn != NULL || next_insn != NULL); ) + insns_num != 1 && (prev_insn != NULL || next_insn != NULL); ) { - if (prev_insn != NULL && first_insn == NULL) + if (prev_insn != NULL) { - if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, - INSN_UID (prev_insn))) - prev_insn = PREV_INSN (prev_insn); - else + if (bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, +INSN_UID (prev_insn))) { first_insn = prev_insn; - n--; + insns_num--; } + prev_insn = PREV_INSN (prev_insn); } if (next_insn != NULL && second_insn == NULL) { @@ -1745,11 +1750,11 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish) else { second_insn = next_insn; - n--; + insns_num--; } } } - if (n > 1) + if (insns_num > 1) return false; } start = first_insn != NULL ? first_insn : start_insn; diff --git a/gcc/testsuite/gcc.target/i386/pr104961.c b/gcc/testsuite/gcc.target/i386/pr104961.c new file mode 100644 index 000..11ea95afe44 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr104961.c @@ -0,0 +1,15 @@ +/* { dg-do compile { target int128 } } */ +/* { dg-options "-Og" } */ + +__int128 i; + +void bar (int); + +void +foo (int a, char b, _Complex unsigned char c) +{ + __int128 j = i * i; + c -= 1; + bar (j); + bar (__imag__ c); +}
Re: [PATCH] Fix PR 101515 (ICE in pp_cxx_unqualified_id, at cp/cxx-pretty-print.c:128)
On 3/18/22 14:20, Jakub Jelinek wrote: On Fri, Mar 18, 2022 at 01:35:53PM -0400, Jason Merrill wrote: On 3/15/22 12:06, Jakub Jelinek wrote: On Tue, Mar 15, 2022 at 11:57:22AM -0400, Jason Merrill wrote: The intent of r11-6729 is that it prints something that helps user to figure out what exactly is being accessed. When we find a unique non-static data member that is being accessed, even when we can't fold it nicely, IMNSHO it is better to print ((sometype *)&var)->field or (*(sometype *)&var).field instead of *(fieldtype *)((char *)&var + 56) because the user doesn't know what is at offset 56, we shouldn't ask user to decipher structure layout etc. The problem is that the reference is *not* to any non-static data member, it's to the PMF as a whole. But c_fold_indirect_ref_for_warn wrongly turns it into a reference to the first non-static data member. We asked c_fold_indirect_ref_warn to fold a MEM_REF with RECORD_TYPE, and it gave us back a COMPONENT_REF with POINTER_TYPE. That seems clearly wrong. That is not what I see on the testcase. I see the outer c_fold_indirect_ref_for_warn call with type ptrmemfunc which is a 64-bit RECORD_TYPE containing a single ptr member which has pointer to function type, and op which is the x VAR_DECL with sp type which is 128-bit RECORD_TYPE containing 64-bit __pfn member and 64-bit __delta member. Exactly: TYPE is RECORD_TYPE, TREE_TYPE (field) is POINTER_TYPE. As all the bits of the ptrmemfunc RECORD_TYPE fit within the __pfn member (they are equal size), it wants to print (cast)(something.__pfn). I disagree that this is what we want; we asked to fold an expression with class type, it seems unlikely that we want to get back an expression with pointer type. That doesn't matter. What c_fold_indirect_ref_warn returns is something that can help the user understand what is actually being accessed. Consider slightly modified testcase (which doesn't use the PMFs so that we don't ICE on those too): // PR c++/101515 // { dg-do compile } // { dg-options "-O1 -Wuninitialized" } struct S { int j; }; struct T : public S { virtual void h () {} }; struct U { char buf[32]; void (*ptr) (); }; struct sp { char a[14]; char b[50]; void (*pfn) (); long delta; }; int main () { T t; sp x; U *xp = (U *) &x.b[18]; if (xp->ptr != ((void (*) ()) (sizeof (void * return 1; } Trunk emits: pr101515-2.C: In function ‘int main()’: pr101515-2.C:16:11: warning: ‘*(U*)((char*)&x + offsetof(sp, sp::b[18])).U::ptr’ is used uninitialized [-Wuninitialized] 16 | if (xp->ptr != ((void (*) ()) (sizeof (void * | ^~~ pr101515-2.C:14:6: note: ‘x’ declared here 14 | sp x; | ^ here, which is indeed quite long expression, but valid C++ and helps the user to narrow down what exactly is being accessed. If I comment out the c_fold_indirect_ref_warn RECORD_TYPE handling so that it punts on it, it prints: pr101515-2.C: In function ‘int main()’: pr101515-2.C:16:11: warning: ‘*(U*)((char*)&x + 32).U::ptr’ is used uninitialized [-Wuninitialized] 16 | if (xp->ptr != ((void (*) ()) (sizeof (void * | ^~~ pr101515-2.C:14:6: note: ‘x’ declared here 14 | sp x; | ^ That is also correct C++ expression, but user probably has no idea what is present at offset 32 into the variable. Of course if there is a type match and not any kind of type punning, it will try to print much shorter and more readable expressions. One important point about the OP's testcase is that we're dealing with offset 0, which corresponds to both the PMF as a whole and its first element. Since we're looking for a RECORD_TYPE, choosing to interpret the desired object as the (RECORD_TYPE) PMF as a whole seems like the better choice; being more flexible probably does make sense at non-0 offsets. Jason
Re: [PATCH] Ignore (possible) signed zeros in operands of FP comparisons.
On Fri, Mar 18, 2022 at 5:01 PM Jeff Law wrote: > > > > On 3/18/2022 7:16 AM, Andrew MacLeod wrote: > > On 3/17/22 19:27, Jeff Law via Gcc-patches wrote: > >> > >> On 3/15/2022 2:03 AM, Roger Sayle wrote: > -Original Message- > From: Richard Biener > Sent: 15 March 2022 07:29 > To: Roger Sayle > Cc: GCC Patches > Subject: Re: [PATCH] Ignore (possible) signed zeros in operands of FP > comparisons. > > On Mon, Mar 14, 2022 at 8:26 PM Roger Sayle > wrote: > > > > I've been wondering about the possible > > performance/missed-optimization > > impact of my patch for PR middle-end/98420 and similar IEEE > > correctness fixes that disable constant folding optimizations when > > worrying > about -0.0. > > In the common situation where the floating point result is used by a > > FP comparison, there's no distinction between +0.0 and -0.0, so some > > HONOR_SIGNED_ZEROS optimizations that we'd usually disable, are safe. > > > > Consider the following interesting example: > > > > int foo(int x, double y) { > > return (x * 0.0) < y; > > } > > > > Although we know that x (when converted to double) can't be NaN or > > Inf, we still worry that for negative values of x that (x * 0.0) may > > be -0.0 and so perform the multiplication at run-time. But in this > > case, the result of the comparison (-0.0 < y) will be exactly the > > same > > as (+0.0 < y) for any y, hence the above may be safely constant > > folded to "0.0 < > y" > > avoiding the multiplication at run-time. > > > > I'm going to hazard a guess that this can be handled in the upcoming > > floating point range support? there was a start of a conversation in > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24021 about a month ago. > > > > I know *zero* about floating point, but It seems like when we track > > floating point ranges, we will naturally be able to integrate > > > > _2 = _1 * 0.0; > > _3 = _2 < y_5(D); > > > > that _2 evaluates to +/- 0.0 and when we do look at (_2 < y_5) > > that VRP can simplify that to 0.0 < y? (or any patch which uses > > simplification and ranges).It seems like it should be > > straightforward anyway. Yes, definitely. Ranger is really a propagation engine. The plan all along was to make it type agnostic with a generic vrange class that provides core functionality that each implementation overrides (union, intersect, varying, undefined, etc). Right now we have plans for float ranges, pointer tracking, and eventually string tracking. The generic vrange abstraction is done and queued up for the next release. I have a core float implementation that handles relationals and end points, but the plan is to add bits to track nonzero, infs, nans, etc. Since ranger will just be a propagation engine, range-ops, union, intersection, etc, could propagate not NaN and non Inf for the cast operator above, plus +-0.0 for _2. The float range-op entries would be able to do all sorts of magic by querying the range for _2. Also the simplify_using_ranges class vrp uses could simplify the _2 < y_5 with appropriate smarts. It really does seem pretty straightforward once the vrange and frange classes are in place. All this should be doable for the next stage1, but we could use help with the nuances of frange when the time comes. > Yea, I guess we'd pick it up that way and that's probably cleaner than > what I was originally thinking in this space. > > We realize that 2 is +-0.0. Then we realize that for the comparison, we > can constant propagate +0.0 for _2 since +0.0 and -0.0 behave the same > way. Ideally that removes the last reference to _2 and we DCE away the > multiplication. Yup yup. Aldy > > > jeff >
Re: [PATCH] Fix PR 101515 (ICE in pp_cxx_unqualified_id, at cp/cxx-pretty-print.c:128)
On Fri, Mar 18, 2022 at 02:27:30PM -0400, Jason Merrill wrote: > > That is also correct C++ expression, but user probably has no idea what is > > present at offset 32 into the variable. > > Of course if there is a type match and not any kind of type punning, > > it will try to print much shorter and more readable expressions. > > One important point about the OP's testcase is that we're dealing with > offset 0, which corresponds to both the PMF as a whole and its first > element. Since we're looking for a RECORD_TYPE, choosing to interpret the > desired object as the (RECORD_TYPE) PMF as a whole seems like the better > choice; being more flexible probably does make sense at non-0 offsets. Even if we punt for 0 offsets in RECORD_TYPEs, we'd still run into the ICE if we tried to access something overlapping the __delta member. For 0 offsets and type puning, the question is what will help user more, it is true we don't need to print those + offsetof(...) parts because it is the start of it, on the other side the user might not know what is the first member of the struct and printing that might sometimes help. Jakub
Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]
> On Mar 18, 2022, at 11:09 AM, Richard Sandiford > wrote: > > Xi Ruoyao writes: >>> >>> If we have to go this way, I think it’s better to make the change you >>> suggested above, >>> and then also update the documentation, both internal documentation on >>> how to define >>> the hook and the user level documentation on what the user might >>> expect when using >>> this option (i.e, it’s possible that the compiler might clear more >>> registers than the user >>> requests on some targets due to the implementation limitation). >>> >>> I can make this change if we decide to do this. >> >> I'd vote for this. Richard? > > Fine by me too, although I don't think this should be mentioned > in the user documentation. E.g. used-arg means that non-argument, > non-return registers can have any value on return from the function; > the compiler doesn't make any guarantees. If the compiler happens to > use a temporary register in the implementation of the option, and if > that temporary register happens to still be zero on return, then > that's OK. It's just an internal implementation detail. The same > thing could happen for any part of the epilogue. This makes good sense to me. I agree. Okay, will just add an extra argument and update the internal documentation. thanks. Qing > > Thanks, > Richard
Re: [PATCH] rs6000: Allow using -mlong-double-64 after -mabi={ibm,ieee}longdouble [PR104208, PR87496]
On 3/16/22 7:33 AM, Segher Boessenkool wrote: > On Tue, Mar 15, 2022 at 02:49:39PM -0500, Peter Bergner wrote: >> The trunk patch has been confirmed to fix the glibc build errors and no >> issues >> with the patch has surfaced, so ok for the GCC11 and GCC10 release branches? > > If you can confirm it fixes the glibc build error on 11 and 10 as well, > then yes please. Thanks! I confirmed I see the same glibc issue with unpatched gcc11 and gcc10 compilers and that the patched gcc11 and gcc10 fix the issue so I went ahead and pushed the backports. Thanks! Peter
[PATCH v2] c++: ICE with template code in constexpr [PR104284]
On Fri, Mar 11, 2022 at 06:46:42PM -0500, Jason Merrill wrote: > On 3/10/22 18:04, Marek Polacek wrote: > > Since r9-6073 cxx_eval_store_expression preevaluates the value to > > be stored, and that revealed a crash where a template code (here, > > code=IMPLICIT_CONV_EXPR) leaks into cxx_eval*. > > > > It happens because we're performing build_vec_init while processing > > a template > > Hmm, that seems like the bug. Where's that call coming from? >From build_aggr_init. So we're handling e.g. template constexpr void g () { constexpr S s2[]{{'a'}}; } cp_finish_decl (decl=s2, init={{'a'}}) sees we're in processing_template_decl, but also that we have a constexpr var which is not dependent, nor is its initializer: else if (init && (init_const_expr_p || DECL_DECLARED_CONSTEXPR_P (decl)) && !TYPE_REF_P (type) && decl_maybe_constant_var_p (decl) && !(dep_init = value_dependent_init_p (init))) { /* This variable seems to be a non-dependent constant, so process its initializer. If check_initializer returns non-null the initialization wasn't constant after all. */ tree init_code; cleanups = make_tree_vector (); init_code = check_initializer (decl, init, flags, &cleanups); so we call check_initializer, where we go down this path: init_code = build_aggr_init_full_exprs (decl, init, flags); build_aggr_init sees that the type of 's2' is ARRAY_TYPE, so it calls build_vec_init. I now recall that we've discussed build_vec_init in a template in the past, for example in the context of c++/93676. So I agree we ought to make an effort to avoid calling build_vec_init in a template. Perhaps like this: use an INIT_EXPR. With that, we should call build_vec_init if needed while instantiating. Does that make any sense? Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- Since r9-6073 cxx_eval_store_expression preevaluates the value to be stored, and that revealed a crash where a template code (here, code=IMPLICIT_CONV_EXPR) leaks into cxx_eval*. It happens because we're performing build_vec_init while processing a template, which calls get_temp_regvar which creates an INIT_EXPR. This INIT_EXPR's RHS contains an rvalue conversion so we create an IMPLICIT_CONV_EXPR. Its operand is not type-dependent and the whole INIT_EXPR is not type-dependent. So we call build_non_dependent_expr which, with -fchecking=2, calls fold_non_dependent_expr. At this point the expression still has an IMPLICIT_CONV_EXPR, which ought to be handled in instantiate_non_dependent_expr_internal. However, tsubst_copy_and_build doesn't handle INIT_EXPR; it will just call tsubst_copy which does nothing when args is null. So we fail to replace the IMPLICIT_CONV_EXPR and ICE. The problem is that we call build_vec_init in a template in the first place. It should work to create an INIT_EXPR in a template and only perform build_vec_init when instantiating. PR c++/104284 gcc/cp/ChangeLog: * init.cc (build_aggr_init): Don't call build_vec_init in a template, create an INIT_EXPR instead. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/constexpr-104284-1.C: New test. * g++.dg/cpp1y/constexpr-104284-2.C: New test. * g++.dg/cpp1y/constexpr-104284-3.C: New test. * g++.dg/cpp1y/constexpr-104284-4.C: New test. --- gcc/cp/init.cc| 11 +++--- .../g++.dg/cpp1y/constexpr-104284-1.C | 34 ++ .../g++.dg/cpp1y/constexpr-104284-2.C | 33 + .../g++.dg/cpp1y/constexpr-104284-3.C | 33 + .../g++.dg/cpp1y/constexpr-104284-4.C | 35 +++ 5 files changed, 142 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-104284-1.C create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-104284-2.C create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-104284-3.C create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-104284-4.C diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc index 7575597c8fd..58e66adbfe1 100644 --- a/gcc/cp/init.cc +++ b/gcc/cp/init.cc @@ -2006,10 +2006,13 @@ build_aggr_init (tree exp, tree init, int flags, tsubst_flags_t complain) } } - stmt_expr = build_vec_init (exp, NULL_TREE, init, - /*explicit_value_init_p=*/false, - from_array, - complain); + /* build_vec_init is not meant to be used in templates. */ + if (processing_template_decl) + stmt_expr = build2 (INIT_EXPR, itype, exp, init); + else + stmt_expr = build_vec_init (exp, NULL_TREE, init, + /*explicit_value_init_p=*/false, + from_array, complain); TREE_READONLY (exp) = was_const; TREE_THIS_
Re: [PATCH] Remove --with-gmp-dir and --with-mpfr-dir
On 3/18/2022 12:06 PM, Tom Tromey via Gcc-patches wrote: The top-level configure options --with-gmp-dir and --with-mpfr-dir were obsoleted and marked as "REMOVED" back in 2006. I think that's long enough ago for everyone to have updated their scripts, so this patch removes them entirely. While doing this, I also found one other leftover that wasn't removed by the earlier patch. This is also removed here. ChangeLog 2022-03-18 Tom Tromey * configure.ac: Remove --with-mpfr-dir and --with-gmp-dir. * configure: Rebuild. OK jeff
[committed] analyzer: add tests of boxed values [PR104943]
This patch adds various regression tests as preparatory work for purging irrelevant local decls from state (PR analyzer/104943) Tested on x86_64-pc-linux-gnu. Pushed to trunk as r12-7717-g1c1daca1cdf7bc0156d57bb2b9083ee70c66b000. gcc/testsuite/ChangeLog: PR analyzer/104943 * gcc.dg/analyzer/boxed-malloc-1-29.c: New test. * gcc.dg/analyzer/boxed-malloc-1.c: New test. * gcc.dg/analyzer/taint-alloc-5.c: New test. * gcc.dg/analyzer/torture/boxed-int-1.c: New test. * gcc.dg/analyzer/torture/boxed-ptr-1.c: New test. Signed-off-by: David Malcolm --- .../gcc.dg/analyzer/boxed-malloc-1-29.c | 36 ++ .../gcc.dg/analyzer/boxed-malloc-1.c | 476 ++ gcc/testsuite/gcc.dg/analyzer/taint-alloc-5.c | 21 + .../gcc.dg/analyzer/torture/boxed-int-1.c | 170 +++ .../gcc.dg/analyzer/torture/boxed-ptr-1.c | 82 +++ 5 files changed, 785 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/analyzer/boxed-malloc-1-29.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/boxed-malloc-1.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-alloc-5.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/torture/boxed-int-1.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/torture/boxed-ptr-1.c diff --git a/gcc/testsuite/gcc.dg/analyzer/boxed-malloc-1-29.c b/gcc/testsuite/gcc.dg/analyzer/boxed-malloc-1-29.c new file mode 100644 index 000..9e38f97fc8e --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/boxed-malloc-1-29.c @@ -0,0 +1,36 @@ +/* Isolating this false positive from boxed-malloc-1.c since it's + reported within boxed_malloc. */ + +#include + +typedef struct boxed_ptr { void *value; } boxed_ptr; + +boxed_ptr +boxed_malloc (size_t sz) +{ + boxed_ptr result; + result.value = malloc (sz); + return result; /* { dg-bogus "leak" "leak false +ve (PR analyzer/104979)" { xfail *-*-* } } */ +} + +boxed_ptr +boxed_free (boxed_ptr ptr) +{ + free (ptr.value); +} + +const boxed_ptr boxed_null = {NULL}; + +struct link +{ + boxed_ptr m_ptr; +}; + +boxed_ptr test_29 (void) +{ + boxed_ptr res = boxed_malloc (sizeof (struct link)); + if (!res.value) +return boxed_null; + ((struct link *)res.value)->m_ptr = boxed_malloc (sizeof (struct link)); + return res; +} diff --git a/gcc/testsuite/gcc.dg/analyzer/boxed-malloc-1.c b/gcc/testsuite/gcc.dg/analyzer/boxed-malloc-1.c new file mode 100644 index 000..5428f2baf49 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/boxed-malloc-1.c @@ -0,0 +1,476 @@ +/* Adapted from malloc-1.c, but wrapping the pointers in a struct. */ + +/* { dg-require-effective-target alloca } */ + +#include + +extern int foo (void); +extern int bar (void); +extern void could_free (void *); +extern void cant_free (const void *); /* since it's a const void *. */ + +typedef struct boxed_ptr { void *value; } boxed_ptr; + +boxed_ptr +boxed_malloc (size_t sz) +{ + boxed_ptr result; + result.value = malloc (sz); + return result; +} + +boxed_ptr +boxed_free (boxed_ptr ptr) +{ + free (ptr.value); +} + +const boxed_ptr boxed_null = {NULL}; + +void test_1 (void) +{ + boxed_ptr ptr; + ptr.value = malloc (1024); + free (ptr.value); + free (ptr.value); /* { dg-warning "double-'free' of 'ptr.value'" } */ +} + +void test_2 (boxed_ptr ptr) +{ + free (ptr.value); + free (ptr.value); /* { dg-warning "double-'free' of 'ptr.value'" } */ +} + +boxed_ptr +test_3 (void) +{ + boxed_ptr ptr; + ptr.value = malloc (sizeof (int)); + *(int *)ptr.value = 42; /* { dg-warning "dereference of possibly-NULL 'ptr.value' \\\[CWE-690\\\]" } */ + return ptr; +} + +boxed_ptr +test_4 (void) +{ + boxed_ptr ptr; + ptr.value = malloc (sizeof (int)); + int *iptr = (int *)ptr.value; + if (iptr) +*iptr = 42; + else +*iptr = 43; /* { dg-warning "dereference of NULL 'iptr' \\\[CWE-476\\\]" } */ + return ptr; +} + +int test_5 (boxed_ptr ptr) +{ + free (ptr.value); + return *(int *)ptr.value; /* { dg-warning "use after 'free' of 'ptr.value'" } */ +} + +void test_6 (void *ptr) +{ + boxed_ptr q; + q.value = ptr; + free (ptr); + free (q.value); /* { dg-warning "double-'free' of 'ptr'" } */ +} + +void test_6a (boxed_ptr ptr) +{ + boxed_ptr q; + q = ptr; + boxed_free (ptr); + free (q.value); /* { dg-warning "double-'free' of 'ptr.value'" } */ +} + +void test_7 (void) +{ + boxed_ptr ptr = boxed_malloc(4096); + if (!ptr.value) +return; + __builtin_memset(ptr.value, 0, 4096); + boxed_free(ptr); +} + +boxed_ptr test_8 (void) +{ + boxed_ptr ptr = boxed_malloc(4096); + if (!ptr.value) +return boxed_null; + __builtin_memset(ptr.value, 0, 4096); + return ptr; +} + +void test_9 (void) +{ + boxed_ptr ptr = boxed_malloc (1024); + + int i; + for (i = 0; i < 1024; i++) +free (ptr.value); /* { dg-warning "double-'free' of 'ptr.value'" } */ +} + +void test_10 (void) +{ + boxed_ptr ptr = boxed_malloc (1024); + + int i; + for (i = 0; i < 1024; i++) +foo (); + + free (ptr.value); + free (ptr.valu
[committed] analyzer: extend state-purging to locals [PR104943]
The existing analyzer code attempts to purge the state of SSA names where it can in order to minimize the size of program_state instances, and to increase the chances of being able to reuse exploded_node instances whilst exploring the user's code. PR analyzer/104943 identifies that we fail to purge state of local variables, based on behavior seen in PR analyzer/104954 when attempting to profile slow performance of -fanalyzer on a particular file in the Linux kernel, where that testcase has many temporary "boxed" values of structs containing ints, which are never cleaned up, leading to bloat of the program_state instances (specifically, of the store objects). This patch generalizes the state purging from just being on SSA names to also work on local variables. Doing so requires that we detect where addresses to a local variable (or within them) are taken; we assume that once a pointer has been taken, it's not longer safe to purge the value of that decl at any successor point within the function. Doing so speeds up the PR analyzer/104954 Linux kernel analyzer testcase from taking 254 seconds to "just" 186 seconds (and I have a followup patch in development that seems to further reduce this to 37 seconds). The patch may also help with scaling up taint-detection so that it can eventually be turned on by default, but we're not quite there (this is PR analyzer/103533). Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r12-7718-gfaacafd2306ad7ece721a79dedbb6e44e0d65bdb. gcc/analyzer/ChangeLog: PR analyzer/104943 PR analyzer/104954 PR analyzer/103533 * analyzer.h (class state_purge_per_decl): New forward decl. * engine.cc (impl_run_checkers): Pass region_model_manager to state_purge_map ctor. * program-point.cc (function_point::final_stmt_p): New. (function_point::get_next): New. * program-point.h (function_point::final_stmt_p): New decl. (function_point::get_next): New decl. * program-state.cc (program_state::prune_for_point): Generalize to purge local decls as well as SSA names. (program_state::can_purge_base_region_p): New. * program-state.h (program_state::can_purge_base_region_p): New decl. * region-model.cc (struct append_ssa_names_cb_data): Rename to... (struct append_regions_cb_data): ...this. (region_model::get_ssa_name_regions_for_current_frame): Rename to... (region_model::get_regions_for_current_frame): ...this, updating for other renamings. (region_model::append_ssa_names_cb): Rename to... (region_model::append_regions_cb): ...this, and drop the requirement that the subregion be a SSA name. * region-model.h (struct append_ssa_names_cb_data): Rename decl to... (struct append_regions_cb_data): ...this. (region_model::get_ssa_name_regions_for_current_frame): Rename decl to... (region_model::get_regions_for_current_frame): ...this. (region_model::append_ssa_names_cb): Rename decl to... (region_model::append_regions_cb): ...this. * state-purge.cc: Include "tristate.h", "selftest.h", "analyzer/store.h", "analyzer/region-model.h", and "gimple-walk.h". (get_candidate_for_purging): New. (class gimple_op_visitor): New. (my_load_cb): New. (my_store_cb): New. (my_addr_cb): New. (state_purge_map::state_purge_map): Add "mgr" param. Update for renamings. Find uses of local variables. (state_purge_map::~state_purge_map): Update for renaming of m_map to m_ssa_map. Clean up m_decl_map. (state_purge_map::get_or_create_data_for_decl): New. (state_purge_per_ssa_name::state_purge_per_ssa_name): Update for inheriting from state_purge_per_tree. (state_purge_per_ssa_name::add_to_worklist): Likewise. (state_purge_per_decl::state_purge_per_decl): New. (state_purge_per_decl::add_needed_at): New. (state_purge_per_decl::add_pointed_to_at): New. (state_purge_per_decl::process_worklists): New. (state_purge_per_decl::add_to_worklist): New. (same_binding_p): New. (fully_overwrites_p): New. (state_purge_per_decl::process_point_backwards): New. (state_purge_per_decl::process_point_forwards): New. (state_purge_per_decl::needed_at_point_p): New. (state_purge_annotator::print_needed): Generalize to print local decls as well as SSA names. * state-purge.h (class state_purge_map): Update leading comment. (state_purge_map::map_t): Rename to... (state_purge_map::ssa_map_t): ...this. (state_purge_map::iterator): Rename to... (state_purge_map::ssa_iterator): ...this. (state_purge_map::decl_map_t): New typedef. (state_purge_map::decl_iterator): New typedef. (state_purge_ma
[PATCH] AVX512FP16: Fix masm=intel output for vfc?(madd|mul)csh [PR 104977]
Hi, This patch fixes typo in subst for scalar complex mask_round operand. Bootstraped/regtested on x86_64-pc-linux-gnu{-m32,} and sde. Ok for master? gcc/ChangeLog: PR target/104977 * config/i386/sse.md (avx512fp16_fmash_v8hf): Correct round operand for intel dialect. gcc/testsuite/ChangeLog: PR target/104977 * gcc.target/i386/pr104977.c: New test. --- gcc/config/i386/sse.md | 2 +- gcc/testsuite/gcc.target/i386/pr104977.c | 13 + 2 files changed, 14 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr104977.c diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index ed98120be59..21bf3c55c95 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -6723,7 +6723,7 @@ (define_insn "avx512fp16_fma_sh_v8hfsh\t{%2, %1, %0|%0, %1, %2}" + "vsh\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "ssemuladd") (set_attr "mode" "V8HF")]) diff --git a/gcc/testsuite/gcc.target/i386/pr104977.c b/gcc/testsuite/gcc.target/i386/pr104977.c new file mode 100644 index 000..9faa4db3b0d --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr104977.c @@ -0,0 +1,13 @@ +/* PR target/104977 */ +/* { dg-do assemble } */ +/* { dg-options "-O2 -mavx512fp16 -masm=intel" } */ +/* { dg-require-effective-target avx512fp16 } */ +/* { dg-require-effective-target masm_intel } */ + +#include + +__m128h +foo (__m128h a, __m128h b, __m128h c, __mmask8 m) +{ + return _mm_fcmadd_round_sch (a, b, c, 8); +} -- 2.18.1
[PATCH] AVX512FP16: Fix wrong code for _mm_mask_f[c]madd.*sch [PR 104978]
Hi, For complex scalar intrinsic like _mm_mask_fcmadd_sch, the mask should be and by 1 to ensure the mask is bind to lowest byte. Bootstraped/regtested on x86_64-pc-linux-gnu{-m32,} and sde. Ok for master? gcc/ChangeLog: PR target/104978 * config/i386/sse.md (avx512fp16_fmaddcsh_v8hf_mask1" (match_operand:QI 4 "register_operand")] "TARGET_AVX512FP16 && " { - rtx op0, op1; + rtx op0, op1, mask; if () emit_insn (gen_avx512fp16_fmaddcsh_v8hf_mask ( @@ -6590,11 +6590,13 @@ (define_expand "avx512fp16_fmaddcsh_v8hf_mask1" { op0 = lowpart_subreg (V4SFmode, operands[0], V8HFmode); op1 = lowpart_subreg (V4SFmode, operands[1], V8HFmode); -emit_insn (gen_avx512vl_loadv4sf_mask (op0, op0, op1, operands[4])); +mask = gen_reg_rtx (QImode); +emit_insn (gen_andqi3 (mask, operands[4], GEN_INT (1))); +emit_insn (gen_avx512vl_loadv4sf_mask (op0, op0, op1, mask)); } else { -rtx mask, tmp, vec_mask; +rtx tmp, vec_mask; mask = lowpart_subreg (SImode, operands[4], QImode), tmp = gen_reg_rtx (SImode); emit_insn (gen_ashlsi3 (tmp, mask, GEN_INT (31))); @@ -6631,7 +6633,7 @@ (define_expand "avx512fp16_fcmaddcsh_v8hf_mask1" (match_operand:QI 4 "register_operand")] "TARGET_AVX512FP16 && " { - rtx op0, op1; + rtx op0, op1, mask; if () emit_insn (gen_avx512fp16_fcmaddcsh_v8hf_mask ( @@ -6645,11 +6647,13 @@ (define_expand "avx512fp16_fcmaddcsh_v8hf_mask1" { op0 = lowpart_subreg (V4SFmode, operands[0], V8HFmode); op1 = lowpart_subreg (V4SFmode, operands[1], V8HFmode); -emit_insn (gen_avx512vl_loadv4sf_mask (op0, op0, op1, operands[4])); +mask = gen_reg_rtx (QImode); +emit_insn (gen_andqi3 (mask, operands[4], GEN_INT (1))); +emit_insn (gen_avx512vl_loadv4sf_mask (op0, op0, op1, mask)); } else { -rtx mask, tmp, vec_mask; +rtx tmp, vec_mask; mask = lowpart_subreg (SImode, operands[4], QImode), tmp = gen_reg_rtx (SImode); emit_insn (gen_ashlsi3 (tmp, mask, GEN_INT (31))); diff --git a/gcc/testsuite/gcc.target/i386/pr104978.c b/gcc/testsuite/gcc.target/i386/pr104978.c new file mode 100644 index 000..fd22a6c3f43 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr104978.c @@ -0,0 +1,18 @@ +/* PR target/104978 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -mavx512fp16 -mavx512vl" } */ +/* { dg-final { scan-assembler-times "and\[^\\n\\r\]*\\\$1" 2 } } */ + +#include + +__m128h +foo (__m128h a, __m128h b, __m128h c, __mmask8 m) +{ + return _mm_mask_fmadd_round_sch (a, m, b, c, 8); +} + +__m128h +foo2 (__m128h a, __m128h b, __m128h c, __mmask8 m) +{ + return _mm_mask_fcmadd_round_sch (a, m, b, c, 8); +} -- 2.18.1
[r12-7710 Regression] FAIL: g++.dg/cpp0x/variadic-alias3.C -std=c++17 (test for excess errors) on Linux/x86_64
On Linux/x86_64, c7a6a32739d62deab03266e2b5449fce261b1ecb is the first bad commit commit c7a6a32739d62deab03266e2b5449fce261b1ecb Author: Marek Polacek Date: Wed Mar 16 09:34:34 2022 -0400 c++: alias template and empty parameter packs [PR104008] caused FAIL: g++.dg/cpp0x/variadic-alias3.C -std=c++17 (internal compiler error: in hashtab_chk_error, at hash-table.cc:137) FAIL: g++.dg/cpp0x/variadic-alias3.C -std=c++17 (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-7710/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/cpp0x/variadic-alias3.C --target_board='unix{-m64}'" (Please do not reply to this email, for question about this report, contact me at skpgkp2 at gmail dot com)
Re: [PATCH] Fix PR 101515 (ICE in pp_cxx_unqualified_id, at cp/cxx-pretty-print.c:128)
On 3/18/22 14:47, Jakub Jelinek wrote: On Fri, Mar 18, 2022 at 02:27:30PM -0400, Jason Merrill wrote: That is also correct C++ expression, but user probably has no idea what is present at offset 32 into the variable. Of course if there is a type match and not any kind of type punning, it will try to print much shorter and more readable expressions. One important point about the OP's testcase is that we're dealing with offset 0, which corresponds to both the PMF as a whole and its first element. Since we're looking for a RECORD_TYPE, choosing to interpret the desired object as the (RECORD_TYPE) PMF as a whole seems like the better choice; being more flexible probably does make sense at non-0 offsets. Even if we punt for 0 offsets in RECORD_TYPEs, we'd still run into the ICE if we tried to access something overlapping the __delta member. Good point. Your patch is OK, then. For 0 offsets and type puning, the question is what will help user more, it is true we don't need to print those + offsetof(...) parts because it is the start of it, on the other side the user might not know what is the first member of the struct and printing that might sometimes help. Or it might not, as in this case. Going with the enclosing class if none of the types match seems like a better default to me, but I guess let's not worry about it now. Jason