Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On Wed, 26 Jan 2022 at 08:53, Dan Li wrote: > > Hi, all, > > Sorry for bothering. > > I'm trying to commit aarch64 scs code to the gcc and there is an issue > that I'm not sure about, could someone give me some suggestions? > (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) > > When clang enables scs, the following instructions are usually generated: > > str x30, [x18], 8 > ldp x29, x30, [sp], 16 > .. > ldp x29, x30, [sp], 16 ## x30 pop > ldr x30, [x18, -8]! ## x30 pop again > ret > > The x30 register is popped twice here, Richard suggested that we can > omit the first x30 pop here. > > AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm > missing something with the kernel, could someone give some suggestions? > > The previous discussion can be found here [1]. > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html > As was pointed out in the discussion, binary patching is in fact a concern for the Linux kernel. E.g., Android uses generic binary builds, but we would like to be able to switch between pointer authentication and shadow call stack at boot time, rather than always support both, and take the SCS performance hit on systems that implement PAC as well. However, it seems more straight-forward to patch PACIASP and AUTIASP instructions into SCS push/pop instructions rather than the other way around, as we can force the use of these exact opcodes [in the NOP space]), as well as rely on existing unwind annotations to locate any such instruction in the binary. So omitting the load of X30 from the ordinary stack seems fine to me. > On 1/25/22 22:51, Dan Li wrote: > > > > > > On 1/25/22 02:19, Richard Sandiford wrote: > >> Dan Li writes: > > + > > if (flag_stack_usage_info) > > current_function_static_stack_size = constant_lower_bound > > (frame_size); > > @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall) > > RTX_FRAME_RELATED_P (insn) = 1; > > } > > + /* Pop return address from shadow call stack. */ > > + if (aarch64_shadow_call_stack_enabled ()) > > +emit_insn (gen_scs_pop ()); > > + > > This looks correct, but following on from the above, I guess we continue > to restore x30 from the frame in the traditional way first, and then > overwrite it with the SCS value. I think the output code would be > slightly neater if we suppressed the first restore of x30. > > >>> Yes, the current epilogue is like: > >>> ... > >>> ldp x29, x30, [sp], #16 > >>> + ldr x30, [x18, #-8]! > >>> > >>> I think may be we can keep the two x30 pops here, for two reasons: > >>> 1) The implementation of scs in clang is to pop x30 twice, it might be > >>> better to be consistent with clang here[1]. > >> > >> This doesn't seem a very compelling reason on its own, unless it's > >> explicitly meant to be observable behaviour. (But then, observed how?) > >> > > > > Well, probably sticking to pop x30 twice is not a good idea. > > AFAICT, there doesn't seem to be an explicit requirement that > > this behavior must be followed. > > > > BTW: > > Do we still need to emit the .cfi_restore 30 directive here if we > > want to save a pop x30? (Sorry I'm not familiar enough with DWARF.) > > > > Since the aarch64 linux kernel always enables -fno-omit-frame-pointer, > > the generated assembly code may be as follows: > > > > str x30, [x18], 8 > > ldp x29, x30, [sp], 16 > > .. > > ldr x29, [sp], 16 > > ## Do we still need a .cfi_restore 30 here > > .cfi_restore 29 > > .cfi_def_cfa_offset 0 > > ldr x30, [x18, -8]! > > ## Or may be a non-CFA based directive here > > ret > > > >>> 2) If we keep the first restore of x30, it may provide some flexibility > >>> for other scenarios. For example, we can directly patch the scs_push/pop > >>> insns in the binary to turn it into a binary without scs; > >> > >> Is that a supported (and ideally documented) use case? If it is, > >> then it becomes important for correctness that the compiler emits > >> exactly the opcodes in the original patch. If it isn't, then the > >> compiler can emit other instructions that have the same effect. > >> > > > > Oh, no, this is just a little trick that can be used for compatibility. > > (Maybe some scenarios such as our company sometimes need to be > > compatible with some non-open source binaries from different > > manufacturers, two pops could make life easier :). ) > > > > If this is not a consideration for our community, please ignore > > this request. > > > >> I'd like a definitive ruling on this from the kernel folks before > >> the patch goes in. > >> > > > > Ok, I'll cc some kernel folks to make sure I didn't miss something. > > > >> If binary patching is supposed to be possible then scs_push and > >> scs_pop *do* need to be separate define_in
Re: [PATCH v3 00/15] ARM/MVE use vectors of boolean for predicates
Ping? As discussed elsewhere with André, I'll drop patch #15 from this series, since his patch is a better fix. Since v2 of this series had been approved, I think only patches 4,7,8,9,12 and 13 need proper review. Thanks, Christophe On Fri, Jan 14, 2022 at 3:22 PM Kyrylo Tkachov wrote: > Hi Christophe, Richard, > > > -Original Message- > > From: Gcc-patches > bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Richard > > Biener via Gcc-patches > > Sent: Friday, January 14, 2022 1:33 PM > > To: Christophe Lyon > > Cc: GCC Patches > > Subject: Re: [PATCH v3 00/15] ARM/MVE use vectors of boolean for > > predicates > > > > On Fri, Jan 14, 2022 at 2:18 PM Christophe Lyon via Gcc-patches > > wrote: > > > > > > Hi, > > > > > > I hadn't realized we are moving to stage 4 this week-end :-( > > > > > > The PRs I'm fixing are P3, but without these fixes MVE support is badly > > > broken, so I think I would be really good to fix that before the buggy > > > version becomes part of an actual release. > > > Anyway I posted v1 of the patches during stage1, so it should still be > OK > > > if they are accepted as-is ? > > > > In the end it's up to the target maintainers to weight the risk of > breakage > > vs. the risk of not usefulness ;) But stage3 is where the "was posted > > during stage1" > > rule can easily apply - at some point we have to stop with such general > ruling. > > > > Thanks, that's in line with my interpretation. > These patches resolve some nasty brokenness in the MVE support that I'm > keen to see fixed and from what I can tell the patches shouldn't have a > large effect on non-MVE code. > So the risk vs reward balance for the arm port as a whole looks good to me. > Andre has kindly agreed to help review the patches and I'll also try to > get to them today and next week so that we can get them in early stage4. > > Thanks, > Kyrill > > > Richard. > > > > > Thanks, > > > > > > Christophe > > > > > > On Thu, Jan 13, 2022 at 3:58 PM Christophe Lyon via Gcc-patches < > > > gcc-patches@gcc.gnu.org> wrote: > > > > > > > > > > > This is v3 of this patch series, fixing issues I discovered before > > > > committing v2 (which had been approved). > > > > > > > > Thanks a lot to Richard Sandiford for his help. > > > > > > > > The changes v2 -> v3 are: > > > > > > > > Patch 4: Fix arm_hard_regno_nregs and CLASS_MAX_NREGS to support > > VPR. > > > > > > > > Patch 7: Changes to the underlying representation of vectors of > > > > booleans to account for the different expectations between > AArch64/SVE > > > > and Arm/MVE. > > > > > > > > Patch 8: Re-use and extend existing thumb2_movhi* patterns instead of > > > > duplicating them in mve_mov. This requires the introduction of > a > > > > new constraint to match a constant vector of booleans. Add a new RTL > > > > test. > > > > > > > > Patch 9: Introduce check_effective_target_arm_mve and skip > > > > gcc.dg/signbit-2.c, because with MVE there is no fallback > architecture > > > > unlike SVE or AVX512. > > > > > > > > Patch 12: Update less load/store MVE builtins > > > > (mve_vldrdq_gather_base_z_v2di, > > > > mve_vldrdq_gather_offset_z_v2di, > > > > mve_vldrdq_gather_shifted_offset_z_v2di, > > > > mve_vstrdq_scatter_base_p_v2di, > > > > mve_vstrdq_scatter_offset_p_v2di, > > > > mve_vstrdq_scatter_offset_p_v2di_insn, > > > > mve_vstrdq_scatter_shifted_offset_p_v2di, > > > > mve_vstrdq_scatter_shifted_offset_p_v2di_insn, > > > > mve_vstrdq_scatter_base_wb_p_v2di, > > > > mve_vldrdq_gather_base_wb_z_v2di, > > > > mve_vldrdq_gather_base_nowb_z_v2di, > > > > mve_vldrdq_gather_base_wb_z_v2di_insn) for which we keep HI > > mode > > > > for vpr_register_operand. > > > > > > > > Patch 13: No need to update > > > > gcc.target/arm/acle/cde-mve-full-assembly.c anymore since we re-use > > > > the mov pattern that emits '@ movhi' in the assembly. > > > > > > > > Patch 15: This is a new patch to fix a problem I noticed during this > > > > v2->v3 update. > > > > > > > > > > > > > > > > I'll squash patch 2 with patch 9 and patch 3 with patch 8. > > > > > > > > Original text: > > > > > > > > This patch series addresses PR 100757 and 101325 by representing > > > > vectors of predicates (MVE VPR.P0 register) as vectors of booleans > > > > rather than using HImode. > > > > > > > > As this implies a lot of mostly mechanical changes, I have tried to > > > > split the patches in a way that should help reviewers, but the split > > > > is a bit artificial. > > > > > > > > Patches 1-3 add new tests. > > > > > > > > Patches 4-6 are small independent improvements. > > > > > > > > Patch 7 implements the predicate qualifier, but does not change any > > > > builtin yet. > > > > > > > > Patch 8 is the first of the two main patches, and uses the new > > > > qualifier to describe the vcmp and vpsel builtins that are useful for > > > > auto-vectorization of comparisons. > > > > > > > > Patch 9 is the second main patch, which fixes the vcond_mask > expander. > > > > > > > > Patch
[PATCH] c++: Fix up handling of vector CONSTRUCTORs with vectors in it in constexpr.cc [PR104226]
Hi! The middle-end uses sometimes VECTOR_TYPE CONSTRUCTORs that contain some other VECTOR_TYPE elements in it (should be with compatible element size and smaller number of elements, e.g. a V8SImode vector can be constructed as { V4SImode_var_1, V4SImode_var_2 }), and expansion of __builtin_shufflevector emits these early, so constexpr.cc can see those too. constexpr.cc already has special cases for NULL index which is typical for VECTOR_TYPE CONSTRUCTORs, and for VECTOR_TYPE CONSTRUCTORs that contain just scalar elts that works just fine - init_subob_ctx just returns on non-aggregate elts and get_or_insert_ctor_field has if (TREE_CODE (type) == VECTOR_TYPE && index == NULL_TREE) { CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (ctor), index, NULL_TREE); return &CONSTRUCTOR_ELTS (ctor)->last(); } handling for it. But for the vector in vector case init_subob_ctx would try to create a sub-CONSTRUCTOR and even didn't handle the NULL index case well, so instead of creating the sub-CONSTRUCTOR after the elts already in it overwrote the first one. So (V8SImode) { { 0, 0, 0, 0 }, { 0, 0, 0, 0 } } became (V8SImode) { 0, 0, 0, 0 } The following patch fixes it by not forcing a sub-CONSTRUCTOR for this vector in vector case. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2022-01-26 Jakub Jelinek PR c++/104226 * constexpr.cc (init_subob_ctx): For vector ctors containing vector elements, ensure appending to the same ctor instead of creating another one. * g++.dg/cpp0x/constexpr-104226.C: New test. --- gcc/cp/constexpr.cc.jj 2022-01-19 00:42:11.0 +0100 +++ gcc/cp/constexpr.cc 2022-01-25 21:44:28.459208756 +0100 @@ -4658,6 +4658,13 @@ init_subob_ctx (const constexpr_ctx *ctx if (!AGGREGATE_TYPE_P (type) && !VECTOR_TYPE_P (type)) /* A non-aggregate member doesn't get its own CONSTRUCTOR. */ return; + if (VECTOR_TYPE_P (type) + && VECTOR_TYPE_P (TREE_TYPE (ctx->ctor)) + && index == NULL_TREE) +/* A vector inside of a vector CONSTRUCTOR, e.g. when a larger + vector is constructed from smaller vectors, doesn't get its own + CONSTRUCTOR either. */ +return; /* The sub-aggregate initializer might contain a placeholder; update object to refer to the subobject and ctor to refer to --- gcc/testsuite/g++.dg/cpp0x/constexpr-104226.C.jj2022-01-25 21:50:34.977031244 +0100 +++ gcc/testsuite/g++.dg/cpp0x/constexpr-104226.C 2022-01-25 21:51:41.851086559 +0100 @@ -0,0 +1,15 @@ +// PR c++/104226 +// { dg-do compile } +// { dg-options "-Wno-psabi" } + +typedef unsigned short __attribute__((__vector_size__(16))) U; +typedef unsigned int __attribute__((__vector_size__(16))) V; +typedef unsigned int __attribute__((__vector_size__(32))) W; + +U +foo (void) +{ + return __builtin_convertvector (__builtin_shufflevector ((V){}, (W){}, + 0, 0, 1, 0, + 5, 5, 0, 2), U); +} Jakub
Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
Thanks, Ard, On 1/26/22 00:10, Ard Biesheuvel wrote: On Wed, 26 Jan 2022 at 08:53, Dan Li wrote: Hi, all, Sorry for bothering. I'm trying to commit aarch64 scs code to the gcc and there is an issue that I'm not sure about, could someone give me some suggestions? (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) When clang enables scs, the following instructions are usually generated: str x30, [x18], 8 ldp x29, x30, [sp], 16 .. ldp x29, x30, [sp], 16 ## x30 pop ldr x30, [x18, -8]! ## x30 pop again ret The x30 register is popped twice here, Richard suggested that we can omit the first x30 pop here. AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm missing something with the kernel, could someone give some suggestions? The previous discussion can be found here [1]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html As was pointed out in the discussion, binary patching is in fact a concern for the Linux kernel. E.g., Android uses generic binary builds, but we would like to be able to switch between pointer authentication and shadow call stack at boot time, rather than always support both, and take the SCS performance hit on systems that implement PAC as well. However, it seems more straight-forward to patch PACIASP and AUTIASP instructions into SCS push/pop instructions rather than the other way around, as we can force the use of these exact opcodes [in the NOP space]), as well as rely on existing unwind annotations to locate any such instruction in the binary. Well, then I think I don't need to submit a kernel patch to enable SCS for gcc :) BTW: Do we have a plan to submit patches of dynamic patch PAC into the kernel recently? So omitting the load of X30 from the ordinary stack seems fine to me. On 1/25/22 22:51, Dan Li wrote: On 1/25/22 02:19, Richard Sandiford wrote: Well, probably sticking to pop x30 twice is not a good idea. AFAICT, there doesn't seem to be an explicit requirement that Ok, I'll cc some kernel folks to make sure I didn't miss something. To Richard: Sorry for my mistake. Due to binary compatibility issues, SCS related code may not be directly merged into libgcc/glibc, do we still need to add this patch into GCC? (I'd like to finish it if that makes sense). Thanks all for your time! Dan If binary patching is supposed to be possible then scs_push and scs_pop *do* need to be separate define_insns. But they also need to have some magic unspec that differentiates them from normal pushes and pops, e.g.: (set ...mem... (unspec:DI [...reg...] UNSPEC_SCS_PUSH)) so that there is no chance that the pattern would be treated as a normal move and optimised accordingly. Yeah, this template looks more appropriate if it is to be treated as a special directive. Thanks for your suggestions, Dan
Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On Wed, 26 Jan 2022 at 11:40, Dan Li wrote: > > Thanks, Ard, > > On 1/26/22 00:10, Ard Biesheuvel wrote: > > On Wed, 26 Jan 2022 at 08:53, Dan Li wrote: > >> > >> Hi, all, > >> > >> Sorry for bothering. > >> > >> I'm trying to commit aarch64 scs code to the gcc and there is an issue > >> that I'm not sure about, could someone give me some suggestions? > >> (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) > >> > >> When clang enables scs, the following instructions are usually generated: > >> > >> str x30, [x18], 8 > >> ldp x29, x30, [sp], 16 > >> .. > >> ldp x29, x30, [sp], 16 ## x30 pop > >> ldr x30, [x18, -8]! ## x30 pop again > >> ret > >> > >> The x30 register is popped twice here, Richard suggested that we can > >> omit the first x30 pop here. > >> > >> AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm > >> missing something with the kernel, could someone give some suggestions? > >> > >> The previous discussion can be found here [1]. > >> > >> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html > >> > > > > As was pointed out in the discussion, binary patching is in fact a > > concern for the Linux kernel. E.g., Android uses generic binary > > builds, but we would like to be able to switch between pointer > > authentication and shadow call stack at boot time, rather than always > > support both, and take the SCS performance hit on systems that > > implement PAC as well. > > > > However, it seems more straight-forward to patch PACIASP and AUTIASP > > instructions into SCS push/pop instructions rather than the other way > > around, as we can force the use of these exact opcodes [in the NOP > > space]), as well as rely on existing unwind annotations to locate any > > such instruction in the binary. > > > > Well, then I think I don't need to submit a kernel patch to > enable SCS for gcc :) > Not entirely. > BTW: > Do we have a plan to submit patches of dynamic patch PAC into > the kernel recently? > At the moment, there are just some ideas floating around. I did implement a proof of concept that uses unwind data, but it hit some issues with cfi_negate_ra_state being emitted imprecisely (GCC) or not at all (Clang) in some cases. Using objtool would be another possibility. So in summary, getting SCS support proper into GCC is definitely worth the effort. -- Ard.
Re: [PATCH v2] x86: Also check mode of memory broadcast in bcst_mem_operand
On Sun, Jan 23, 2022 at 04:39:34PM -0800, H.J. Lu via Gcc-patches wrote: > On Sun, Jan 23, 2022 at 4:35 PM Hongtao Liu wrote: > > > > On Sun, Jan 23, 2022 at 8:28 PM H.J. Lu via Gcc-patches > > wrote: > > > > > > Return false for invalid mode on memory broadcast in bcst_mem_operand: > > > > > > (vec_duplicate:V16SF (mem/j:V4SF (reg/v/f:DI 109 [ b ]))) > > > > > Yes, thanks. > > I will also backport it to GCC 11 branch. On i686-linux this new testcase FAILs with: cc1: warning: SSE instruction set disabled, using 387 arithmetics FAIL: gcc.target/i386/pr104188.c (test for excess errors) Excess errors: cc1: warning: SSE instruction set disabled, using 387 arithmetics This is because it uses -mfpmath=sse, but -msse2 isn't on. Fixed by adding -msse2 to dg-options and requiring sse2_runtime effective target. Tested on x86_64-linux and i686-linux, committed as obvious to trunk/11: 2022-01-26 Jakub Jelinek PR target/104188 * gcc.target/i386/pr104188.c: Add dg-require-effective-target sse2_runtime. Add -msse2 to dg-options. --- gcc/testsuite/gcc.target/i386/pr104188.c.jj 2022-01-24 10:18:21.174512441 +0100 +++ gcc/testsuite/gcc.target/i386/pr104188.c2022-01-26 11:54:58.025950692 +0100 @@ -1,5 +1,6 @@ /* { dg-do run { target avx512f } } */ -/* { dg-options "-O2 -mfpmath=sse" } */ +/* { dg-require-effective-target sse2_runtime } */ +/* { dg-options "-O2 -msse2 -mfpmath=sse" } */ #include Jakub
[PATCH] Improve profile handling in switch lowering.
Hello. Right now, switch lowering does not update basic_block::count values so that they are uninitiliazed. Moreover, I've updated probability scaling when a more complex expansion happens. There are still some situations where the profile is a bit imprecise, but the patch improves rapidly the current situation. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin PR tree-optimization/101301 PR tree-optimization/103680 gcc/ChangeLog: * tree-switch-conversion.cc (bit_test_cluster::emit): Handle correctly remaining probability. (switch_decision_tree::try_switch_expansion): Fix BB's count where a cluster expansion happens. (switch_decision_tree::emit_cmp_and_jump_insns): Fill up also BB count. (switch_decision_tree::do_jump_if_equal): Likewise. (switch_decision_tree::emit_case_nodes): Handle special case for BT expansion which can also fallback to a default BB. * tree-switch-conversion.h (cluster::cluster): Add m_default_prob probability. --- gcc/tree-switch-conversion.cc | 51 --- gcc/tree-switch-conversion.h | 8 +- 2 files changed, 43 insertions(+), 16 deletions(-) diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc index 670397c87e4..d6679e8dee3 100644 --- a/gcc/tree-switch-conversion.cc +++ b/gcc/tree-switch-conversion.cc @@ -1538,10 +1538,12 @@ bit_test_cluster::emit (tree index_expr, tree index_type, test[k].target_bb = n->m_case_bb; test[k].label = n->m_case_label_expr; test[k].bits = 0; + test[k].prob = profile_probability::never (); count++; } test[k].bits += n->get_range (n->get_low (), n->get_high ()); + test[k].prob += n->m_prob; lo = tree_to_uhwi (int_const_binop (MINUS_EXPR, n->get_low (), minval)); if (n->get_high () == NULL_TREE) @@ -1629,6 +1631,11 @@ bit_test_cluster::emit (tree index_expr, tree index_type, /*simple=*/true, NULL_TREE, /*before=*/true, GSI_SAME_STMT); + profile_probability subtree_prob = m_subtree_prob; + profile_probability default_prob = m_default_prob; + if (!default_prob.initialized_p ()) +default_prob = m_subtree_prob.invert (); + if (m_handles_entire_switch && entry_test_needed) { tree range = int_const_binop (MINUS_EXPR, maxval, minval); @@ -1639,9 +1646,10 @@ bit_test_cluster::emit (tree index_expr, tree index_type, /*simple=*/true, NULL_TREE, /*before=*/true, GSI_SAME_STMT); tmp = fold_build2 (GT_EXPR, boolean_type_node, idx, range); + default_prob = default_prob.apply_scale (1, 2); basic_block new_bb = hoist_edge_and_branch_if_true (&gsi, tmp, default_bb, -profile_probability::unlikely ()); +default_prob); gsi = gsi_last_bb (new_bb); } @@ -1662,14 +1670,12 @@ bit_test_cluster::emit (tree index_expr, tree index_type, else csui = tmp; - profile_probability prob = profile_probability::always (); - /* for each unique set of cases: if (const & csui) goto target */ for (k = 0; k < count; k++) { - prob = profile_probability::always ().apply_scale (test[k].bits, -bt_range); + profile_probability prob = test[k].prob / (subtree_prob + default_prob); + subtree_prob -= test[k].prob; bt_range -= test[k].bits; tmp = wide_int_to_tree (word_type_node, test[k].mask); tmp = fold_build2 (BIT_AND_EXPR, word_type_node, csui, tmp); @@ -1908,9 +1914,13 @@ switch_decision_tree::try_switch_expansion (vec &clusters) /* Emit cluster-specific switch handling. */ for (unsigned i = 0; i < clusters.length (); i++) if (clusters[i]->get_type () != SIMPLE_CASE) - clusters[i]->emit (index_expr, index_type, -gimple_switch_default_label (m_switch), -m_default_bb, gimple_location (m_switch)); + { + edge e = single_pred_edge (clusters[i]->m_case_bb); + e->dest->count = e->src->count.apply_probability (e->probability); + clusters[i]->emit (index_expr, index_type, + gimple_switch_default_label (m_switch), + m_default_bb, gimple_location (m_switch)); + } } fix_phi_operands_for_edges (); @@ -2162,6 +2172,7 @@ switch_decision_tree::emit_cmp_and_jump_insns (basic_block bb, tree op0, edge false_edge = split_block (bb, cond); false_edge->flags = EDGE_FALSE_VALUE; false_edge->probability = prob.invert (); + false_edge->dest->count = bb->count.apply_probability (prob.invert ()); ed
[PATCH] tree-optimization/100499 - niter analysis and multiple_of_p
niter analysis uses multiple_of_p which currently assumes operations like MULT_EXPR do not wrap. We've got to rely on this for optimizing size expressions like those in DECL_SIZE and those generally use unsigned arithmetic with no indication that they are not expected to wrap. To preserve that the following adds a parameter to multiple_of_p, defaulted to true, indicating that the TOP expression is not expected to wrap for outer computations in TYPE. This mostly follows a patch proposed by Bin last year with the conversion behavior added. Applying to all users the new effect is that upon type conversions in the TOP expression the behavior will switch to honor TYPE_OVERFLOW_UNDEFINED for the converted sub-expressions. The patch also changes the occurance in niter analysis that we know is problematic and we have testcases for to pass false to multiple_of_p. The patch also contains a change to the PR72817 fix from Bin to avoid regressing gcc.dg/tree-ssa/loop-42.c. The intent for stage1 is to introduce a size_multiple_of_p and internalize the added parameter so all multiple_of_p users will honor TYPE_OVERFLOW_UNDEFINED and users dealing with size expressions need to be switched to size_multiple_of_p. Bootstrapped and tested on x86_64-unknown-linux-gnu with all languages and {,-m32} testing. The patch applies ontop of the three earlier posted ones that touch multiple_of_p but have not yet been reviewed/pushed. OK? Thanks, Richard. 2022-01-26 Richard Biener PR tree-optimization/100499 * fold-const.h (multiple_of_p): Add nowrap parameter, defaulted to true. * fold-const.cc (multiple_of_p): Likewise. Honor it for MULT_EXPR, PLUS_EXPR and MINUS_EXPR and pass it along, switching to false for conversions. * tree-ssa-loop-niter.cc (number_of_iterations_ne): Do not claim the outermost expression does not wrap when calling multiple_of_p. Refactor the check done to check the original IV, avoiding a bias that might wrap. * gcc.dg/torture/pr100499-1.c: New testcase. * gcc.dg/torture/pr100499-2.c: Likewise. * gcc.dg/torture/pr100499-3.c: Likewise. Co-authored-by: Bin Cheng --- gcc/fold-const.cc | 81 +++ gcc/fold-const.h | 2 +- gcc/testsuite/gcc.dg/torture/pr100499-1.c | 27 gcc/testsuite/gcc.dg/torture/pr100499-2.c | 16 + gcc/testsuite/gcc.dg/torture/pr100499-3.c | 14 gcc/tree-ssa-loop-niter.cc| 52 ++- 6 files changed, 131 insertions(+), 61 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-1.c create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-2.c create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-3.c diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index a0a4913c45e..7c204fb6265 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -14062,10 +14062,16 @@ fold_binary_initializer_loc (location_t loc, tree_code code, tree type, SAVE_EXPR (I) * SAVE_EXPR (J) (where the same SAVE_EXPR (J) is used in the original and the - transformed version). */ + transformed version). + + NOWRAP specifies whether all outer operations in TYPE should + be considered not wrapping. Any type conversion within TOP acts + as a barrier and we will fall back to NOWRAP being false. + NOWRAP is mostly used to treat expressions in TYPE_SIZE and friends + as not wrapping even though they are generally using unsigned arithmetic. */ int -multiple_of_p (tree type, const_tree top, const_tree bottom) +multiple_of_p (tree type, const_tree top, const_tree bottom, bool nowrap) { gimple *stmt; tree op1, op2; @@ -14083,10 +14089,17 @@ multiple_of_p (tree type, const_tree top, const_tree bottom) a multiple of BOTTOM then TOP is a multiple of BOTTOM. */ if (!integer_pow2p (bottom)) return 0; - return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom) - || multiple_of_p (type, TREE_OPERAND (top, 0), bottom)); + return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom, nowrap) + || multiple_of_p (type, TREE_OPERAND (top, 0), bottom, nowrap)); case MULT_EXPR: + /* If the multiplication can wrap we cannot recurse further unless +the second operand is a power of two which is where wrapping +does not matter. */ + if (!nowrap + && !TYPE_OVERFLOW_UNDEFINED (type) + && !integer_pow2p (TREE_OPERAND (top, 1))) + return 0; if (TREE_CODE (bottom) == INTEGER_CST) { op1 = TREE_OPERAND (top, 0); @@ -14095,24 +14108,24 @@ multiple_of_p (tree type, const_tree top, const_tree bottom) std::swap (op1, op2); if (TREE_CODE (op2) == INTEGER_CST) { - if (multiple_of_p (type, op2, bottom)) + if (multiple_of_p (type, op2, bottom, nowrap)) return 1;
Re: [PING^3][PATCH,v2,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 1/26/22 03:09, Ard Biesheuvel wrote: On Wed, 26 Jan 2022 at 11:40, Dan Li wrote: Thanks, Ard, On 1/26/22 00:10, Ard Biesheuvel wrote: On Wed, 26 Jan 2022 at 08:53, Dan Li wrote: Hi, all, Sorry for bothering. I'm trying to commit aarch64 scs code to the gcc and there is an issue that I'm not sure about, could someone give me some suggestions? (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) When clang enables scs, the following instructions are usually generated: str x30, [x18], 8 ldp x29, x30, [sp], 16 .. ldp x29, x30, [sp], 16 ## x30 pop ldr x30, [x18, -8]! ## x30 pop again ret The x30 register is popped twice here, Richard suggested that we can omit the first x30 pop here. AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm missing something with the kernel, could someone give some suggestions? The previous discussion can be found here [1]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html As was pointed out in the discussion, binary patching is in fact a concern for the Linux kernel. E.g., Android uses generic binary builds, but we would like to be able to switch between pointer authentication and shadow call stack at boot time, rather than always support both, and take the SCS performance hit on systems that implement PAC as well. However, it seems more straight-forward to patch PACIASP and AUTIASP instructions into SCS push/pop instructions rather than the other way around, as we can force the use of these exact opcodes [in the NOP space]), as well as rely on existing unwind annotations to locate any such instruction in the binary. Well, then I think I don't need to submit a kernel patch to enable SCS for gcc :) Not entirely. BTW: Do we have a plan to submit patches of dynamic patch PAC into the kernel recently? At the moment, there are just some ideas floating around. I did implement a proof of concept that uses unwind data, but it hit some issues with cfi_negate_ra_state being emitted imprecisely (GCC) or not at all (Clang) in some cases. Using objtool would be another possibility. So in summary, getting SCS support proper into GCC is definitely worth the effort. Got it! And thanks for the suggestion, Ard :)
Re: [PATCH] openmp: Add support for target_device selector set in metadirectives
Hello Just noticed a bug in the ISA checking in the nvptx plugin - the minor version should only be compared if the major version is equal, otherwise it would reject an isa of sm_35 if the card is capable of supporting sm_52, for example. This patch fixes the issue. Thanks Kwokdiff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index 7427677e69d..86a12c3fcfd 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -2047,7 +2047,8 @@ GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args) /* TODO: Implement GOMP_OFFLOAD_async_run. */ #define CHECK_ISA(major, minor) \ - if (device->compute_major >= major && device->compute_minor >= minor \ + if (((device->compute_major == major && device->compute_minor >= minor) \ + || device->compute_major > major) \ && strcmp (isa, "sm_"#major#minor) == 0) \ return true
Re: [PATCH][GCC13?] RISC-V: Replace `smin'/`smax' RTL patterns with `fmin'/`fmax'
On Mon, 24 Jan 2022, Joseph Myers wrote: > > I think we have consensus that we can ignore pre-r2.2 hardware. What we > > actually support is `-misa-spec=<2.2|20190608|20191213>', so we can assume > > r2.2 semantics as the absolute minimum, matching the description in my > > submission. > > Where, to repeat the point about possibly confusing version numbers, > that's saying we can ignore hardware from before *ISA* revision 2.2 (which > contained 'F' and 'D' extension version 2.0) - not that we can ignore > hardware from before 'F' and 'D' extension version 2.2 (which changed the > semantics of the FMIN and FMAX instructions). OK, I'll try to make it clear in the change description of v2. > > Then once we have IEEE 754-2019 support in place, which will require new > > RTL standard pattern names, say `fminimum'/`fmaximum', we will provide > > them iff (!HONOR_SNANS || riscv_isa_spec >= ISA_SPEC_CLASS_20190608). It > > may be a good idea to start adding IEEE 754-2019 support, including the > > relevant `__builtin_fminimum' and `__builtin_fmaximum' intrinsics, once > > the GCC 13 development cycle has started. > > The newer instruction semantics correspond to the functions fminimum_num > and fmaximum_num, not fminimum and fmaximum (which are functions that > treat both quiet and signaling NaNs like most libm functions do - any > argument a NaN means the result is NaN). I got these wrong then, sorry. I got lost too: what is the difference between `fmin'/`fmax' and `fminimum'/`fmaximum' then? It looks to me like we have a zoo of selection functions now with very subtle differences between each other. Maciej
[PATCH] IPA mod-ref: fix usage of --param names in dump messages.
The patch fixed bad --param param=foo names. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin gcc/ChangeLog: * ipa-modref-tree.cc (modref_access_node::update): Remove "--param param=foo" with "--param foo". (modref_access_node::insert): Likewise. (modref_access_node::insert_kill): Likewise. * ipa-modref-tree.h (struct modref_ref_node): Likewise. (struct modref_base_node): Likewise. (struct modref_tree): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/modref-7.c: Update scanned patterns. * gcc.dg/tree-ssa/modref-8.c: Likewise. --- gcc/ipa-modref-tree.cc | 10 -- gcc/ipa-modref-tree.h| 9 - gcc/testsuite/gcc.dg/tree-ssa/modref-7.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/modref-8.c | 4 ++-- 4 files changed, 11 insertions(+), 14 deletions(-) diff --git a/gcc/ipa-modref-tree.cc b/gcc/ipa-modref-tree.cc index 97e497accf2..828994f3536 100644 --- a/gcc/ipa-modref-tree.cc +++ b/gcc/ipa-modref-tree.cc @@ -130,8 +130,7 @@ modref_access_node::update (poly_int64 parm_offset1, else { if (dump_file) - fprintf (dump_file, -"--param param=modref-max-adjustments limit reached:"); + fprintf (dump_file, "--param modref-max-adjustments limit reached:"); if (!known_eq (parm_offset, parm_offset1)) { if (dump_file) @@ -594,11 +593,11 @@ modref_access_node::insert (vec *&accesses, return -1; if (dump_file && best2 >= 0) fprintf (dump_file, -"--param param=modref-max-accesses limit reached;" +"--param modref-max-accesses limit reached;" " merging %i and %i\n", best1, best2); else if (dump_file) fprintf (dump_file, -"--param param=modref-max-accesses limit reached;" +"--param modref-max-accesses limit reached;" " merging with %i\n", best1); modref_access_node::try_merge_with (accesses, best1); if (best2 >= 0) @@ -825,8 +824,7 @@ modref_access_node::insert_kill (vec &kills, if ((int)kills.length () >= param_modref_max_accesses) { if (dump_file) - fprintf (dump_file, -"--param param=modref-max-accesses limit reached:"); + fprintf (dump_file, "--param modref-max-accesses limit reached:"); return false; } a.adjustments = 0; diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h index edb3b49374a..fdaa9612e9a 100644 --- a/gcc/ipa-modref-tree.h +++ b/gcc/ipa-modref-tree.h @@ -197,8 +197,7 @@ struct GTY((user)) modref_ref_node { if (dump_file) fprintf (dump_file, - "--param param=modref-max-accesses limit reached;" - " collapsing\n"); + "--param modref-max-accesses limit reached; collapsing\n"); collapse (); } return ret != 0; @@ -252,7 +251,7 @@ struct GTY((user)) modref_base_node if (ref && refs && refs->length () >= max_refs) { if (dump_file) - fprintf (dump_file, "--param param=modref-max-refs limit reached;" + fprintf (dump_file, "--param modref-max-refs limit reached;" " using 0\n"); ref = 0; ref_node = search (ref); @@ -344,12 +343,12 @@ struct GTY((user)) modref_tree if (base_node) { if (dump_file) - fprintf (dump_file, "--param param=modref-max-bases" + fprintf (dump_file, "--param modref-max-bases" " limit reached; using ref\n"); return base_node; } if (dump_file) - fprintf (dump_file, "--param param=modref-max-bases" + fprintf (dump_file, "--param modref-max-bases" " limit reached; using 0\n"); base = 0; base_node = search (base); diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c index 53ffa1c394c..b55d7066b22 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c @@ -10,4 +10,4 @@ int test(struct a *a, int p) a->array[0] = 1; } /* All three accesses combine to one bigger access. */ -/* { dg-final { scan-tree-dump-not "param=modref-max-accesses" "modref1" } } */ +/* { dg-final { scan-tree-dump-not "--param modref-max-accesses" "modref1" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c index 4a18e34cd16..c51590ff0ba 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c @@ -17,8 +17,8 @@ recurse (char *p, int n) if (n) recurse (p+1,n-1); } -/* { dg-final { scan-tree-dump-not "param=modref-max-accesses" "modref1" } } */ -/* { dg-final { scan-tree-dump "param=modref-max-adjustments" "modr
[committed] analyzer: fix sense in range::add_bound [PR94362]
On Sun, 2022-01-23 at 17:34 +0100, Mikael Morin wrote: > Hello, > > Le 21/01/2022 à 00:59, David Malcolm via Gcc-patches a écrit : > > diff --git a/gcc/analyzer/constraint-manager.cc > > b/gcc/analyzer/constraint-manager.cc > > index 568e7150ea7..7c4a85bbb24 100644 > > --- a/gcc/analyzer/constraint-manager.cc > > +++ b/gcc/analyzer/constraint-manager.cc > > @@ -301,6 +301,80 @@ range::above_upper_bound (tree rhs_const) > > const > > m_upper_bound.m_constant).is_true (); > > } > > > > +/* Attempt to add B to the bound of the given kind of this range. > > + Return true if feasible; false if infeasible. */ > > + > > +bool > > +range::add_bound (bound b, enum bound_kind bound_kind) > > +{ > > + b.ensure_closed (bound_kind); > > + > > + switch (bound_kind) > > +{ > > +default: > > + gcc_unreachable (); > > +case BK_LOWER: > > + /* Discard redundant bounds. */ > > + if (m_lower_bound.m_constant) > > + { > > + m_lower_bound.ensure_closed (BK_LOWER); > > + if (!tree_int_cst_lt (b.m_constant, > > + m_lower_bound.m_constant)) > > + return true; > > isn’t this condition reversed? > > > + } > > + m_lower_bound = b; > > + break; > > +case BK_UPPER: > > + /* Discard redundant bounds. */ > > + if (m_upper_bound.m_constant) > > + { > > + m_upper_bound.ensure_closed (BK_UPPER); > > + if (tree_int_cst_le (b.m_constant, > > + m_upper_bound.m_constant)) > > + return true; > > same here. > > All the tests added have just one lower and one upper bound, so they > don’t use the short-circuit code, but amending one of them as follows > makes the problem appear as the test starts to fails. It should > continue to work, shouldn’t it? > > > diff --git a/gcc/analyzer/constraint-manager.cc > b/gcc/analyzer/constraint-manager.cc > index 7c4a85bbb24..3f38b857722 100644 > --- a/gcc/analyzer/constraint-manager.cc > +++ b/gcc/analyzer/constraint-manager.cc > @@ -3697,6 +3697,7 @@ test_constant_comparisons () > region_model_manager mgr; > { > region_model model (&mgr); > + ADD_SAT_CONSTRAINT (model, int_1, LT_EXPR, a); > ADD_SAT_CONSTRAINT (model, int_3, LT_EXPR, a); > ADD_UNSAT_CONSTRAINT (model, a, LT_EXPR, int_4); > } Good catch, thanks. Fixed as follows, which also moves the rejection of contradictory constraints in range::add_bound to earlier, so that this code can be self-tested. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r12-6875-ge966a508e03fe28bfca65a1e60e579fa90355ea6. gcc/analyzer/ChangeLog: PR analyzer/94362 * constraint-manager.cc (range::add_bound): Fix tests for discarding redundant constraints. Perform test for rejecting unsatisfiable constraints earlier so that they don't update the object on failure. (selftest::test_range): New. (selftest::test_constant_comparisons): Add test coverage for existing constraints becoming narrower until they are unsatisfiable. (selftest::run_constraint_manager_tests): Call test_range. Signed-off-by: David Malcolm --- gcc/analyzer/constraint-manager.cc | 93 +- 1 file changed, 79 insertions(+), 14 deletions(-) diff --git a/gcc/analyzer/constraint-manager.cc b/gcc/analyzer/constraint-manager.cc index 7c4a85bbb24..88b0988513a 100644 --- a/gcc/analyzer/constraint-manager.cc +++ b/gcc/analyzer/constraint-manager.cc @@ -318,35 +318,42 @@ range::add_bound (bound b, enum bound_kind bound_kind) if (m_lower_bound.m_constant) { m_lower_bound.ensure_closed (BK_LOWER); - if (!tree_int_cst_lt (b.m_constant, - m_lower_bound.m_constant)) + if (tree_int_cst_le (b.m_constant, + m_lower_bound.m_constant)) return true; } + if (m_upper_bound.m_constant) + { + m_upper_bound.ensure_closed (BK_UPPER); + /* Reject B <= V <= UPPER when B > UPPER. */ + if (!tree_int_cst_le (b.m_constant, + m_upper_bound.m_constant)) + return false; + } m_lower_bound = b; break; + case BK_UPPER: /* Discard redundant bounds. */ if (m_upper_bound.m_constant) { m_upper_bound.ensure_closed (BK_UPPER); - if (tree_int_cst_le (b.m_constant, - m_upper_bound.m_constant)) + if (!tree_int_cst_lt (b.m_constant, + m_upper_bound.m_constant)) return true; } + if (m_lower_bound.m_constant) + { + m_lower_bound.ensure_closed (BK_LOWER); + /* Reject LOWER <= V <= B when LOWER > B. */ + if (!tree_int_cst_le (m_lower_bound.m_constant, +
[committed] analyzer: fix missing uninit warning on args to stdio builtins [PR104224]
We were failing to check for uninitialized arguments to stdio builtins, such as when passing local "go" to the call to "printf" in "main" in the testcase. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r12-6876-g9ff3e2368d86c1bf7d1e8876a14e58c0d6617ffe. gcc/analyzer/ChangeLog: PR analyzer/104224 * region-model.cc (region_model::check_call_args): New. (region_model::on_call_pre): Call it when ignoring stdio builtins. * region-model.h (region_model::check_call_args): New decl gcc/testsuite/ChangeLog: PR analyzer/104224 * gcc.dg/analyzer/pr104224.c: New test. Signed-off-by: David Malcolm --- gcc/analyzer/region-model.cc | 11 +++ gcc/analyzer/region-model.h | 2 + gcc/testsuite/gcc.dg/analyzer/pr104224.c | 106 +++ 3 files changed, 119 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr104224.c diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc index f6b7f986a39..a559bc84eb0 100644 --- a/gcc/analyzer/region-model.cc +++ b/gcc/analyzer/region-model.cc @@ -1044,6 +1044,16 @@ region_model::on_stmt_pre (const gimple *stmt, } } +/* Ensure that all arguments at the call described by CD are checked + for poisoned values, by calling get_rvalue on each argument. */ + +void +region_model::check_call_args (const call_details &cd) const +{ + for (unsigned arg_idx = 0; arg_idx < cd.num_args (); arg_idx++) +cd.get_arg_svalue (arg_idx); +} + /* Update this model for the CALL stmt, using CTXT to report any diagnostics - the first half. @@ -1173,6 +1183,7 @@ region_model::on_call_pre (const gcall *call, region_model_context *ctxt, /* These stdio builtins have external effects that are out of scope for the analyzer: we only want to model the effects on the return value. */ + check_call_args (cd); break; } else if (is_named_call_p (callee_fndecl, "malloc", call, 1)) diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h index c78efe8f215..40958842bce 100644 --- a/gcc/analyzer/region-model.h +++ b/gcc/analyzer/region-model.h @@ -832,6 +832,8 @@ class region_model void check_region_for_read (const region *src_reg, region_model_context *ctxt) const; + void check_call_args (const call_details &cd) const; + /* Storing this here to avoid passing it around everywhere. */ region_model_manager *const m_mgr; diff --git a/gcc/testsuite/gcc.dg/analyzer/pr104224.c b/gcc/testsuite/gcc.dg/analyzer/pr104224.c new file mode 100644 index 000..8f69d72befa --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/pr104224.c @@ -0,0 +1,106 @@ +#include + +struct test { +int one; +int two; +}; + +void func2(const struct test *t) +{ +if (t->one == 0) +printf("init func2\n"); + +if (t->two == 0) /* { dg-warning "uninitialized" } */ +printf("uninit func2\n"); +} + +void func1(struct test *t) +{ +t->one = 1; +func2(t); +} + +int func3(int num) +{ +if (num) +return num; +else +return 0; +} + +void func4(int *a, int max) +{ +int i; +// skip the first +for (i=1; i
[PATCH v1] rtl: builtins: Fix builtins feclearexcept and feraiseexcept operand check [PR94193]
Tested on top of master (e0b8716f53ed6455e9f18931940141692793068d) using --enable-checking=yes,rtl, on the following plataforms with no regression: - powerpc64le-linux-gnu (Power 9) - powerpc64le-linux-gnu (Power 8) - powerpc64-linux-gnu (Power 9, with 32 and 64 bits tests) I did not include a testcase because I could not figure out one that works without --enable-checking=rtl yet. 8< Commit 4343f5e25679 ("rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]") broke gcc bootstra when building with --enable-checking=rtl[1]. The function expand_builtin_feclear_feraise_except was failing to proper validate op0 predicate before emit_insn leading to the mismatch type failure. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589186.html 2022-01-26 Raoni Fassina Firmino gcc/ PR target/94193 * builtins.cc (expand_builtin_feclear_feraise_except): Add op0 predicate check. Signed-off-by: Raoni Fassina Firmino --- gcc/builtins.cc | 3 +++ 1 file changed, 3 insertions(+) diff --git a/gcc/builtins.cc b/gcc/builtins.cc index e84208035dab..d784a57c2b81 100644 --- a/gcc/builtins.cc +++ b/gcc/builtins.cc @@ -2598,6 +2598,9 @@ expand_builtin_feclear_feraise_except (tree exp, rtx target, if (icode == CODE_FOR_nothing) return NULL_RTX; + if (!(*insn_data[icode].operand[1].predicate) (op0, GET_MODE (op0))) +return NULL_RTX; + if (target == 0 || GET_MODE (target) != target_mode || !(*insn_data[icode].operand[0].predicate) (target, target_mode)) -- 2.34.1
Re: [PATCH] c++: Fix up handling of vector CONSTRUCTORs with vectors in it in constexpr.cc [PR104226]
On 1/26/22 04:33, Jakub Jelinek wrote: Hi! The middle-end uses sometimes VECTOR_TYPE CONSTRUCTORs that contain some other VECTOR_TYPE elements in it (should be with compatible element size and smaller number of elements, e.g. a V8SImode vector can be constructed as { V4SImode_var_1, V4SImode_var_2 }), and expansion of __builtin_shufflevector emits these early, so constexpr.cc can see those too. constexpr.cc already has special cases for NULL index which is typical for VECTOR_TYPE CONSTRUCTORs, and for VECTOR_TYPE CONSTRUCTORs that contain just scalar elts that works just fine - init_subob_ctx just returns on non-aggregate elts and get_or_insert_ctor_field has if (TREE_CODE (type) == VECTOR_TYPE && index == NULL_TREE) { CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (ctor), index, NULL_TREE); return &CONSTRUCTOR_ELTS (ctor)->last(); } handling for it. But for the vector in vector case init_subob_ctx would try to create a sub-CONSTRUCTOR and even didn't handle the NULL index case well, so instead of creating the sub-CONSTRUCTOR after the elts already in it overwrote the first one. So (V8SImode) { { 0, 0, 0, 0 }, { 0, 0, 0, 0 } } became (V8SImode) { 0, 0, 0, 0 } The following patch fixes it by not forcing a sub-CONSTRUCTOR for this vector in vector case. OK. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2022-01-26 Jakub Jelinek PR c++/104226 * constexpr.cc (init_subob_ctx): For vector ctors containing vector elements, ensure appending to the same ctor instead of creating another one. * g++.dg/cpp0x/constexpr-104226.C: New test. --- gcc/cp/constexpr.cc.jj 2022-01-19 00:42:11.0 +0100 +++ gcc/cp/constexpr.cc 2022-01-25 21:44:28.459208756 +0100 @@ -4658,6 +4658,13 @@ init_subob_ctx (const constexpr_ctx *ctx if (!AGGREGATE_TYPE_P (type) && !VECTOR_TYPE_P (type)) /* A non-aggregate member doesn't get its own CONSTRUCTOR. */ return; + if (VECTOR_TYPE_P (type) + && VECTOR_TYPE_P (TREE_TYPE (ctx->ctor)) + && index == NULL_TREE) +/* A vector inside of a vector CONSTRUCTOR, e.g. when a larger + vector is constructed from smaller vectors, doesn't get its own + CONSTRUCTOR either. */ +return; /* The sub-aggregate initializer might contain a placeholder; update object to refer to the subobject and ctor to refer to --- gcc/testsuite/g++.dg/cpp0x/constexpr-104226.C.jj2022-01-25 21:50:34.977031244 +0100 +++ gcc/testsuite/g++.dg/cpp0x/constexpr-104226.C 2022-01-25 21:51:41.851086559 +0100 @@ -0,0 +1,15 @@ +// PR c++/104226 +// { dg-do compile } +// { dg-options "-Wno-psabi" } + +typedef unsigned short __attribute__((__vector_size__(16))) U; +typedef unsigned int __attribute__((__vector_size__(16))) V; +typedef unsigned int __attribute__((__vector_size__(32))) W; + +U +foo (void) +{ + return __builtin_convertvector (__builtin_shufflevector ((V){}, (W){}, + 0, 0, 1, 0, + 5, 5, 0, 2), U); +} Jakub
Re: [PATCH] warn-access: Prevent -Wuse-after-free on ARM [PR104213]
On 1/25/22 18:33, Marek Polacek wrote: Here, -Wuse-after-free warns about using 'this' which, on ARM, cdtors return, as mandated by the EABI. To be entirely correct, it only requires it for C1 and C2 ctors and D2 and D1 dtors, but I don't feel like changing that now and possibly running into issues later on. This patch uses suppress_warning on 'this' for certain cdtor_returns_this cases in the C++ FE, and then warn_invalid_pointer makes use of this information and doesn't warn. In my first attempt I tried suppress_warning the MODIFY_EXPR or RETURN_EXPR we build in build_delete_destructor_body, but the complication is that the suppress_warning bits don't always survive gimplification; see e.g. gimplify_modify_expr where we do 6130 if (COMPARISON_CLASS_P (*from_p)) 6131 copy_warning (assign, *from_p); but here we're not dealing with a comparison. Removing that check regresses uninit-pr74762.C. Adding copy_warning (assign, *expr_p) regresses c-c++-common/uninit-17.c. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? OK if Martin doesn't have any suggestions. PR target/104213 gcc/cp/ChangeLog: * decl.cc (finish_constructor_body): Suppress -Wuse-after-free. (finish_destructor_body): Likewise. * optimize.cc (build_delete_destructor_body): Likewise. gcc/ChangeLog: * gimple-ssa-warn-access.cc (pass_waccess::warn_invalid_pointer): Don't warn when the SSA_NAME_VAR of REF has supressed -Wuse-after-free. gcc/testsuite/ChangeLog: * g++.dg/warn/Wuse-after-free2.C: New test. --- gcc/cp/decl.cc | 2 ++ gcc/cp/optimize.cc | 1 + gcc/gimple-ssa-warn-access.cc| 14 +++--- gcc/testsuite/g++.dg/warn/Wuse-after-free2.C | 10 ++ 4 files changed, 24 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/g++.dg/warn/Wuse-after-free2.C diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 22d3dd1e2ad..6534a7fd320 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -17315,6 +17315,7 @@ finish_constructor_body (void) add_stmt (build_stmt (input_location, LABEL_EXPR, cdtor_label)); val = DECL_ARGUMENTS (current_function_decl); + suppress_warning (val, OPT_Wuse_after_free); val = build2 (MODIFY_EXPR, TREE_TYPE (val), DECL_RESULT (current_function_decl), val); /* Return the address of the object. */ @@ -17408,6 +17409,7 @@ finish_destructor_body (void) tree val; val = DECL_ARGUMENTS (current_function_decl); + suppress_warning (val, OPT_Wuse_after_free); val = build2 (MODIFY_EXPR, TREE_TYPE (val), DECL_RESULT (current_function_decl), val); /* Return the address of the object. */ diff --git a/gcc/cp/optimize.cc b/gcc/cp/optimize.cc index 4ad3f1dc9aa..13ab8b7361e 100644 --- a/gcc/cp/optimize.cc +++ b/gcc/cp/optimize.cc @@ -166,6 +166,7 @@ build_delete_destructor_body (tree delete_dtor, tree complete_dtor) if (targetm.cxx.cdtor_returns_this ()) { tree val = DECL_ARGUMENTS (delete_dtor); + suppress_warning (val, OPT_Wuse_after_free); val = build2 (MODIFY_EXPR, TREE_TYPE (val), DECL_RESULT (delete_dtor), val); add_stmt (build_stmt (0, RETURN_EXPR, val)); diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc index 8bc33eeb6fa..484bcd34c25 100644 --- a/gcc/gimple-ssa-warn-access.cc +++ b/gcc/gimple-ssa-warn-access.cc @@ -3880,9 +3880,17 @@ pass_waccess::warn_invalid_pointer (tree ref, gimple *use_stmt, bool maybe, bool equality /* = false */) { /* Avoid printing the unhelpful "" in the diagnostics. */ - if (ref && TREE_CODE (ref) == SSA_NAME - && (!SSA_NAME_VAR (ref) || DECL_ARTIFICIAL (SSA_NAME_VAR (ref -ref = NULL_TREE; + if (ref && TREE_CODE (ref) == SSA_NAME) +{ + tree var = SSA_NAME_VAR (ref); + if (!var) + ref = NULL_TREE; + /* Don't warn for cases like when a cdtor returns 'this' on ARM. */ + else if (warning_suppressed_p (var, OPT_Wuse_after_free)) + return; + else if (DECL_ARTIFICIAL (var)) + ref = NULL_TREE; +} location_t use_loc = gimple_location (use_stmt); if (use_loc == UNKNOWN_LOCATION) diff --git a/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C b/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C new file mode 100644 index 000..6d5f2bf01b5 --- /dev/null +++ b/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C @@ -0,0 +1,10 @@ +// PR target/104213 +// { dg-do compile } +// { dg-options "-Wuse-after-free" } + +class C +{ +virtual ~C(); +}; + +C::~C() {} // { dg-bogus "used after" } base-commit: aeac414923aa1e87986c7fc6f9b921d89a9b86cf
RFA: libiberty: Fix infinite recursion in rust demangler (PRs 98886 and 99935)
Hi Guys, I would like to propose the patch below to fix a couple of sources of infinite recursion in libiberty's rust demangling code. This patch is based upon the one submitted for PR 99935, but extended to cope with the case presented in PR 98886 and also fixed so that the "uint" type is not used. Tested with a patched version of the binutils sources on an x86-pc-linux-gnu target. Cheers Nick 2022-01-26 Nick Clifton * rust-demangle.c (struct rust_demangler): Add a recursion counter. (demangle_path): Increment/decrement the recursion counter upon entry and exit. Fail if the counter exceeds a fixed limit. (demangle_type): Likewise. (rust_demangle_callback): Initialise the recursion counter, disabling if requested by the option flags. diff --git a/libiberty/rust-demangle.c b/libiberty/rust-demangle.c index 18c760491bd..3b24d63892a 100644 --- a/libiberty/rust-demangle.c +++ b/libiberty/rust-demangle.c @@ -74,6 +74,12 @@ struct rust_demangler /* Rust mangling version, with legacy mangling being -1. */ int version; + /* Recursion depth. */ + unsigned int recursion; + /* Maximum number of times demangle_path may be called recursively. */ +#define RUST_MAX_RECURSION_COUNT 1024 +#define RUST_NO_RECURSION_LIMIT ((unsigned int) -1) + uint64_t bound_lifetime_depth; }; @@ -671,6 +677,15 @@ demangle_path (struct rust_demangler *rdm, int in_value) if (rdm->errored) return; + if (rdm->recursion != RUST_NO_RECURSION_LIMIT) +{ + ++ rdm->recursion; + if (rdm->recursion > RUST_MAX_RECURSION_COUNT) + /* FIXME: There ought to be a way to report + that the recursion limit has been reached. */ + goto fail_return; +} + switch (tag = next (rdm)) { case 'C': @@ -688,10 +703,7 @@ demangle_path (struct rust_demangler *rdm, int in_value) case 'N': ns = next (rdm); if (!ISLOWER (ns) && !ISUPPER (ns)) -{ - rdm->errored = 1; - return; -} + goto fail_return; demangle_path (rdm, in_value); @@ -776,9 +788,15 @@ demangle_path (struct rust_demangler *rdm, int in_value) } break; default: - rdm->errored = 1; - return; + goto fail_return; } + goto pass_return; + + fail_return: + rdm->errored = 1; + pass_return: + if (rdm->recursion != RUST_NO_RECURSION_LIMIT) +-- rdm->recursion; } static void @@ -870,6 +888,19 @@ demangle_type (struct rust_demangler *rdm) return; } + if (rdm->recursion != RUST_NO_RECURSION_LIMIT) +{ + ++ rdm->recursion; + if (rdm->recursion > RUST_MAX_RECURSION_COUNT) + /* FIXME: There ought to be a way to report + that the recursion limit has been reached. */ + { + rdm->errored = 1; + -- rdm->recursion; + return; + } +} + switch (tag) { case 'R': @@ -1030,6 +1061,9 @@ demangle_type (struct rust_demangler *rdm) rdm->next--; demangle_path (rdm, 0); } + + if (rdm->recursion != RUST_NO_RECURSION_LIMIT) +-- rdm->recursion; } /* A trait in a trait object may have some "existential projections" @@ -1320,6 +1354,7 @@ rust_demangle_callback (const char *mangled, int options, rdm.skipping_printing = 0; rdm.verbose = (options & DMGL_VERBOSE) != 0; rdm.version = 0; + rdm.recursion = (options & DMGL_NO_RECURSE_LIMIT) ? RUST_NO_RECURSION_LIMIT : 0; rdm.bound_lifetime_depth = 0; /* Rust symbols always start with _R (v0) or _ZN (legacy). */
Re: [PATCH] IPA mod-ref: fix usage of --param names in dump messages.
On Wed, Jan 26, 2022 at 3:49 PM Martin Liška wrote: > > The patch fixed bad --param param=foo names. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? OK. > Thanks, > Martin > > gcc/ChangeLog: > > * ipa-modref-tree.cc (modref_access_node::update): > Remove "--param param=foo" with "--param foo". > (modref_access_node::insert): Likewise. > (modref_access_node::insert_kill): Likewise. > * ipa-modref-tree.h (struct modref_ref_node): Likewise. > (struct modref_base_node): Likewise. > (struct modref_tree): Likewise. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/modref-7.c: Update scanned patterns. > * gcc.dg/tree-ssa/modref-8.c: Likewise. > --- > gcc/ipa-modref-tree.cc | 10 -- > gcc/ipa-modref-tree.h| 9 - > gcc/testsuite/gcc.dg/tree-ssa/modref-7.c | 2 +- > gcc/testsuite/gcc.dg/tree-ssa/modref-8.c | 4 ++-- > 4 files changed, 11 insertions(+), 14 deletions(-) > > diff --git a/gcc/ipa-modref-tree.cc b/gcc/ipa-modref-tree.cc > index 97e497accf2..828994f3536 100644 > --- a/gcc/ipa-modref-tree.cc > +++ b/gcc/ipa-modref-tree.cc > @@ -130,8 +130,7 @@ modref_access_node::update (poly_int64 parm_offset1, > else > { > if (dump_file) > - fprintf (dump_file, > -"--param param=modref-max-adjustments limit reached:"); > + fprintf (dump_file, "--param modref-max-adjustments limit reached:"); > if (!known_eq (parm_offset, parm_offset1)) > { > if (dump_file) > @@ -594,11 +593,11 @@ modref_access_node::insert (vec va_gc> *&accesses, > return -1; > if (dump_file && best2 >= 0) > fprintf (dump_file, > -"--param param=modref-max-accesses limit reached;" > +"--param modref-max-accesses limit reached;" > " merging %i and %i\n", best1, best2); > else if (dump_file) > fprintf (dump_file, > -"--param param=modref-max-accesses limit reached;" > +"--param modref-max-accesses limit reached;" > " merging with %i\n", best1); > modref_access_node::try_merge_with (accesses, best1); > if (best2 >= 0) > @@ -825,8 +824,7 @@ modref_access_node::insert_kill (vec > &kills, > if ((int)kills.length () >= param_modref_max_accesses) > { > if (dump_file) > - fprintf (dump_file, > -"--param param=modref-max-accesses limit reached:"); > + fprintf (dump_file, "--param modref-max-accesses limit reached:"); > return false; > } > a.adjustments = 0; > diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h > index edb3b49374a..fdaa9612e9a 100644 > --- a/gcc/ipa-modref-tree.h > +++ b/gcc/ipa-modref-tree.h > @@ -197,8 +197,7 @@ struct GTY((user)) modref_ref_node > { > if (dump_file) > fprintf (dump_file, > - "--param param=modref-max-accesses limit reached;" > - " collapsing\n"); > + "--param modref-max-accesses limit reached; collapsing\n"); > collapse (); > } > return ret != 0; > @@ -252,7 +251,7 @@ struct GTY((user)) modref_base_node > if (ref && refs && refs->length () >= max_refs) > { > if (dump_file) > - fprintf (dump_file, "--param param=modref-max-refs limit reached;" > + fprintf (dump_file, "--param modref-max-refs limit reached;" >" using 0\n"); > ref = 0; > ref_node = search (ref); > @@ -344,12 +343,12 @@ struct GTY((user)) modref_tree > if (base_node) > { > if (dump_file) > - fprintf (dump_file, "--param param=modref-max-bases" > + fprintf (dump_file, "--param modref-max-bases" >" limit reached; using ref\n"); > return base_node; > } > if (dump_file) > - fprintf (dump_file, "--param param=modref-max-bases" > + fprintf (dump_file, "--param modref-max-bases" >" limit reached; using 0\n"); > base = 0; > base_node = search (base); > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c > b/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c > index 53ffa1c394c..b55d7066b22 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-7.c > @@ -10,4 +10,4 @@ int test(struct a *a, int p) > a->array[0] = 1; > } > /* All three accesses combine to one bigger access. */ > -/* { dg-final { scan-tree-dump-not "param=modref-max-accesses" "modref1" } } > */ > +/* { dg-final { scan-tree-dump-not "--param modref-max-accesses" "modref1" } > } */ > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c > b/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c > index 4a18e34cd16..c51590ff0ba 100644 > ---
Re: [PATCH v1] rtl: builtins: Fix builtins feclearexcept and feraiseexcept operand check [PR94193]
On Wed, Jan 26, 2022 at 12:05:54PM -0300, Raoni Fassina Firmino wrote: > Commit 4343f5e25679 ("rtl: builtins: (not just) rs6000: Add builtins > for fegetround, feclearexcept and feraiseexcept [PR94193]") broke gcc > bootstra when building with --enable-checking=rtl[1]. > > The function expand_builtin_feclear_feraise_except was failing to > proper validate op0 predicate before emit_insn leading to the mismatch > type failure. > gcc/ > PR target/94193 > * builtins.cc (expand_builtin_feclear_feraise_except): Add op0 > predicate check. Perfect, and pushed. Thank you! Segher
[PATCH] tree-optimization/104162 - CSE of &MEM[ptr].a[i] and ptr + CST
This adds the capability to value-numbering of treating complex address expressions where the offset becomes invariant as equal to a POINTER_PLUS_EXPR. This restores CSE that is now prevented by early lowering of &MEM[ptr + CST] to a POINTER_PLUS_EXPR. Unfortunately this regresses gcc.dg/asan/pr99673.c again. Bootstrapped and tested on x86_64-unknown-linux-gnu - any opinions? Richard. 2022-01-26 Richard Biener PR tree-optimization/104162 * tree-ssa-sccvn.cc (vn_reference_lookup): Handle &MEM[_1 + 5].a[i] like a POINTER_PLUS_EXPR if the offset becomes invariant. (vn_reference_insert): Likewise. * gcc.dg/tree-ssa/ssa-fre-99.c: New testcase. --- gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-99.c | 27 ++ gcc/tree-ssa-sccvn.cc | 62 +- 2 files changed, 88 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-99.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-99.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-99.c new file mode 100644 index 000..101d0d63f7a --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-99.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* Disable FRE1 because that for the sake of __builtin_object_size + will not consider the equality but still valueize 'i', defeating + the purpose of the check. */ +/* { dg-options "-O -fdump-tree-fre3 -fdisable-tree-fre1" } */ + +struct S { int a[4]; }; + +int i; +int bar (struct S *p) +{ + char *q = (char *)p + 4; + i = 1; + int *r = &((struct S *)p)->a[i]; + return q == (char *)r; +} +int baz (struct S *p) +{ + i = 1; + int *r = &((struct S *)p)->a[i]; + char *q = (char *)p + 4; + return q == (char *)r; +} + +/* Verify FRE can handle valueizing &p->a[i] and value-numbering it + equal to a POINTER_PLUS_EXPR. */ +/* { dg-final { scan-tree-dump-times "return 1;" 2 "fre3" } } */ diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc index a03f0aae924..482cf372686 100644 --- a/gcc/tree-ssa-sccvn.cc +++ b/gcc/tree-ssa-sccvn.cc @@ -3669,6 +3669,36 @@ vn_reference_lookup (tree op, tree vuse, vn_lookup_kind kind, vr1.vuse = vuse_ssa_val (vuse); vr1.operands = operands = valueize_shared_reference_ops_from_ref (op, &valueized_anything); + + /* Handle &MEM[ptr + 5].b[1].c as POINTER_PLUS_EXPR. */ + if (operands[0].opcode == ADDR_EXPR + && operands.last ().opcode == SSA_NAME) +{ + poly_int64 off = 0; + vn_reference_op_t vro; + unsigned i; + for (i = 1; operands.iterate (i, &vro); ++i) + { + if (vro->opcode == SSA_NAME) + break; + else if (known_eq (vro->off, -1)) + break; + off += vro->off; + } + if (i == operands.length () - 1) + { + gcc_assert (operands[i-1].opcode == MEM_REF); + tree ops[2]; + ops[0] = operands[i].op0; + ops[1] = wide_int_to_tree (sizetype, off); + tree res = vn_nary_op_lookup_pieces (2, POINTER_PLUS_EXPR, + TREE_TYPE (op), ops, NULL); + if (res) + return res; + return NULL_TREE; + } +} + vr1.type = TREE_TYPE (op); ao_ref op_ref; ao_ref_init (&op_ref, op); @@ -3760,13 +3790,43 @@ vn_reference_insert (tree op, tree result, tree vuse, tree vdef) vn_reference_t vr1; bool tem; + vec operands += valueize_shared_reference_ops_from_ref (op, &tem); + /* Handle &MEM[ptr + 5].b[1].c as POINTER_PLUS_EXPR. */ + if (operands[0].opcode == ADDR_EXPR + && operands.last ().opcode == SSA_NAME) +{ + poly_int64 off = 0; + vn_reference_op_t vro; + unsigned i; + for (i = 1; operands.iterate (i, &vro); ++i) + { + if (vro->opcode == SSA_NAME) + break; + else if (known_eq (vro->off, -1)) + break; + off += vro->off; + } + if (i == operands.length () - 1) + { + gcc_assert (operands[i-1].opcode == MEM_REF); + tree ops[2]; + ops[0] = operands[i].op0; + ops[1] = wide_int_to_tree (sizetype, off); + vn_nary_op_insert_pieces (2, POINTER_PLUS_EXPR, + TREE_TYPE (op), ops, result, + VN_INFO (result)->value_id); + return; + } +} + vr1 = XOBNEW (&vn_tables_obstack, vn_reference_s); if (TREE_CODE (result) == SSA_NAME) vr1->value_id = VN_INFO (result)->value_id; else vr1->value_id = get_or_alloc_constant_value_id (result); vr1->vuse = vuse_ssa_val (vuse); - vr1->operands = valueize_shared_reference_ops_from_ref (op, &tem).copy (); + vr1->operands = operands.copy (); vr1->type = TREE_TYPE (op); vr1->punned = false; ao_ref op_ref; -- 2.31.1
Re: [PATCH] tree-optimization/104162 - CSE of &MEM[ptr].a[i] and ptr + CST
On 1/26/2022 8:56 AM, Richard Biener via Gcc-patches wrote: This adds the capability to value-numbering of treating complex address expressions where the offset becomes invariant as equal to a POINTER_PLUS_EXPR. This restores CSE that is now prevented by early lowering of &MEM[ptr + CST] to a POINTER_PLUS_EXPR. Unfortunately this regresses gcc.dg/asan/pr99673.c again. Bootstrapped and tested on x86_64-unknown-linux-gnu - any opinions? Richard. 2022-01-26 Richard Biener PR tree-optimization/104162 * tree-ssa-sccvn.cc (vn_reference_lookup): Handle &MEM[_1 + 5].a[i] like a POINTER_PLUS_EXPR if the offset becomes invariant. (vn_reference_insert): Likewise. * gcc.dg/tree-ssa/ssa-fre-99.c: New testcase. IIRC there's a handful of old regressions this might help. More than once I've been tempted to do something similar in DOM -- with this addition it's just one more reason we should be looking to drop DOM and replace it with FRE going forward. jeff
[pushed] c++: ->template and using-decl [PR104235]
cp_parser_template_id wasn't prepared to handle getting a USING_DECL back from cp_parser_template_name. Let's defer that case to instantiation time, as well. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/104235 gcc/cp/ChangeLog: * parser.cc (cp_parser_template_name): Repeat lookup of USING_DECL. gcc/testsuite/ChangeLog: * g++.dg/parse/template-keyword2.C: New test. --- gcc/cp/parser.cc | 3 ++- gcc/testsuite/g++.dg/parse/template-keyword2.C | 8 2 files changed, 10 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/parse/template-keyword2.C diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index ed219d79dc9..8b38165020f 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -18680,7 +18680,8 @@ cp_parser_template_name (cp_parser* parser, cp_parser_error (parser, "expected template-name"); return error_mark_node; } - else if (!DECL_P (decl) && !is_overloaded_fn (decl)) + else if ((!DECL_P (decl) && !is_overloaded_fn (decl)) + || TREE_CODE (decl) == USING_DECL) /* Repeat the lookup at instantiation time. */ decl = identifier; } diff --git a/gcc/testsuite/g++.dg/parse/template-keyword2.C b/gcc/testsuite/g++.dg/parse/template-keyword2.C new file mode 100644 index 000..ecd066787bc --- /dev/null +++ b/gcc/testsuite/g++.dg/parse/template-keyword2.C @@ -0,0 +1,8 @@ +// PR c++/104235 + +template +struct L: M { + using M::a; + void a(); + void p() { this->template a<>(); } +}; base-commit: 1bc00a489086b0bdde89ccb11dfa4f50b6c4e8f0 prerequisite-patch-id: c7ab4056fcbd782232c8dc091597602ecd4a7a48 -- 2.27.0
Re: [PATCH] warn-access: Prevent -Wuse-after-free on ARM [PR104213]
On 1/26/22 08:24, Jason Merrill via Gcc-patches wrote: On 1/25/22 18:33, Marek Polacek wrote: Here, -Wuse-after-free warns about using 'this' which, on ARM, cdtors return, as mandated by the EABI. To be entirely correct, it only requires it for C1 and C2 ctors and D2 and D1 dtors, but I don't feel like changing that now and possibly running into issues later on. This patch uses suppress_warning on 'this' for certain cdtor_returns_this cases in the C++ FE, and then warn_invalid_pointer makes use of this information and doesn't warn. In my first attempt I tried suppress_warning the MODIFY_EXPR or RETURN_EXPR we build in build_delete_destructor_body, but the complication is that the suppress_warning bits don't always survive gimplification; see e.g. gimplify_modify_expr where we do 6130 if (COMPARISON_CLASS_P (*from_p)) 6131 copy_warning (assign, *from_p); but here we're not dealing with a comparison. Removing that check regresses uninit-pr74762.C. Adding copy_warning (assign, *expr_p) regresses c-c++-common/uninit-17.c. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? OK if Martin doesn't have any suggestions. Nothing further from me. Thanks Martin PR target/104213 gcc/cp/ChangeLog: * decl.cc (finish_constructor_body): Suppress -Wuse-after-free. (finish_destructor_body): Likewise. * optimize.cc (build_delete_destructor_body): Likewise. gcc/ChangeLog: * gimple-ssa-warn-access.cc (pass_waccess::warn_invalid_pointer): Don't warn when the SSA_NAME_VAR of REF has supressed -Wuse-after-free. gcc/testsuite/ChangeLog: * g++.dg/warn/Wuse-after-free2.C: New test. --- gcc/cp/decl.cc | 2 ++ gcc/cp/optimize.cc | 1 + gcc/gimple-ssa-warn-access.cc | 14 +++--- gcc/testsuite/g++.dg/warn/Wuse-after-free2.C | 10 ++ 4 files changed, 24 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/g++.dg/warn/Wuse-after-free2.C diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 22d3dd1e2ad..6534a7fd320 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -17315,6 +17315,7 @@ finish_constructor_body (void) add_stmt (build_stmt (input_location, LABEL_EXPR, cdtor_label)); val = DECL_ARGUMENTS (current_function_decl); + suppress_warning (val, OPT_Wuse_after_free); val = build2 (MODIFY_EXPR, TREE_TYPE (val), DECL_RESULT (current_function_decl), val); /* Return the address of the object. */ @@ -17408,6 +17409,7 @@ finish_destructor_body (void) tree val; val = DECL_ARGUMENTS (current_function_decl); + suppress_warning (val, OPT_Wuse_after_free); val = build2 (MODIFY_EXPR, TREE_TYPE (val), DECL_RESULT (current_function_decl), val); /* Return the address of the object. */ diff --git a/gcc/cp/optimize.cc b/gcc/cp/optimize.cc index 4ad3f1dc9aa..13ab8b7361e 100644 --- a/gcc/cp/optimize.cc +++ b/gcc/cp/optimize.cc @@ -166,6 +166,7 @@ build_delete_destructor_body (tree delete_dtor, tree complete_dtor) if (targetm.cxx.cdtor_returns_this ()) { tree val = DECL_ARGUMENTS (delete_dtor); + suppress_warning (val, OPT_Wuse_after_free); val = build2 (MODIFY_EXPR, TREE_TYPE (val), DECL_RESULT (delete_dtor), val); add_stmt (build_stmt (0, RETURN_EXPR, val)); diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc index 8bc33eeb6fa..484bcd34c25 100644 --- a/gcc/gimple-ssa-warn-access.cc +++ b/gcc/gimple-ssa-warn-access.cc @@ -3880,9 +3880,17 @@ pass_waccess::warn_invalid_pointer (tree ref, gimple *use_stmt, bool maybe, bool equality /* = false */) { /* Avoid printing the unhelpful "" in the diagnostics. */ - if (ref && TREE_CODE (ref) == SSA_NAME - && (!SSA_NAME_VAR (ref) || DECL_ARTIFICIAL (SSA_NAME_VAR (ref - ref = NULL_TREE; + if (ref && TREE_CODE (ref) == SSA_NAME) + { + tree var = SSA_NAME_VAR (ref); + if (!var) + ref = NULL_TREE; + /* Don't warn for cases like when a cdtor returns 'this' on ARM. */ + else if (warning_suppressed_p (var, OPT_Wuse_after_free)) + return; + else if (DECL_ARTIFICIAL (var)) + ref = NULL_TREE; + } location_t use_loc = gimple_location (use_stmt); if (use_loc == UNKNOWN_LOCATION) diff --git a/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C b/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C new file mode 100644 index 000..6d5f2bf01b5 --- /dev/null +++ b/gcc/testsuite/g++.dg/warn/Wuse-after-free2.C @@ -0,0 +1,10 @@ +// PR target/104213 +// { dg-do compile } +// { dg-options "-Wuse-after-free" } + +class C +{ + virtual ~C(); +}; + +C::~C() {} // { dg-bogus "used after" } base-commit: aeac414923aa1e87986c7fc6f9b921d89a9b86cf
[PATCH v2] warn-access: Prevent -Wuse-after-free on ARM [PR104213]
On Wed, Jan 26, 2022 at 09:39:46AM -0700, Martin Sebor wrote: > On 1/26/22 08:24, Jason Merrill via Gcc-patches wrote: > > On 1/25/22 18:33, Marek Polacek wrote: > > > Here, -Wuse-after-free warns about using 'this' which, on ARM, cdtors > > > return, as mandated by the EABI. To be entirely correct, it only > > > requires it for C1 and C2 ctors and D2 and D1 dtors, but I don't feel > > > like changing that now and possibly running into issues later on. > > > > > > This patch uses suppress_warning on 'this' for certain cdtor_returns_this > > > cases in the C++ FE, and then warn_invalid_pointer makes use of this > > > information and doesn't warn. > > > > > > In my first attempt I tried suppress_warning the MODIFY_EXPR or > > > RETURN_EXPR > > > we build in build_delete_destructor_body, but the complication is that > > > the suppress_warning bits don't always survive gimplification; see e.g. > > > gimplify_modify_expr where we do > > > > > > 6130 if (COMPARISON_CLASS_P (*from_p)) > > > 6131 copy_warning (assign, *from_p); > > > > > > but here we're not dealing with a comparison. Removing that check > > > regresses uninit-pr74762.C. Adding copy_warning (assign, *expr_p) > > > regresses c-c++-common/uninit-17.c. > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > OK if Martin doesn't have any suggestions. > > Nothing further from me. Here's what I've pushed then (it adds the testcase Martin mentioned earlier). Thanks, -- >8 -- Here, -Wuse-after-free warns about using 'this' which, on ARM, cdtors return, as mandated by the EABI. To be entirely correct, it only requires it for C1 and C2 ctors and D2 and D1 dtors, but I don't feel like changing that now and possibly running into issues later on. This patch uses suppress_warning on 'this' for certain cdtor_returns_this cases in the C++ FE, and then warn_invalid_pointer makes use of this information and doesn't warn. In my first attempt I tried suppress_warning the MODIFY_EXPR or RETURN_EXPR we build in build_delete_destructor_body, but the complication is that the suppress_warning bits don't always survive gimplification; see e.g. gimplify_modify_expr where we do 6130 if (COMPARISON_CLASS_P (*from_p)) 6131 copy_warning (assign, *from_p); but here we're not dealing with a comparison. Removing that check regresses uninit-pr74762.C. Adding copy_warning (assign, *expr_p) regresses c-c++-common/uninit-17.c. PR target/104213 gcc/cp/ChangeLog: * decl.cc (finish_constructor_body): Suppress -Wuse-after-free. (finish_destructor_body): Likewise. * optimize.cc (build_delete_destructor_body): Likewise. gcc/ChangeLog: * gimple-ssa-warn-access.cc (pass_waccess::warn_invalid_pointer): Don't warn when the SSA_NAME_VAR of REF has supressed -Wuse-after-free. gcc/testsuite/ChangeLog: * g++.dg/warn/Wuse-after-free2.C: New test. * g++.dg/warn/Wuse-after-free3.C: New test. --- gcc/cp/decl.cc | 2 ++ gcc/cp/optimize.cc | 1 + gcc/gimple-ssa-warn-access.cc| 14 +++--- gcc/testsuite/g++.dg/warn/Wuse-after-free2.C | 10 ++ gcc/testsuite/g++.dg/warn/Wuse-after-free3.C | 16 5 files changed, 40 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/g++.dg/warn/Wuse-after-free2.C create mode 100644 gcc/testsuite/g++.dg/warn/Wuse-after-free3.C diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 22d3dd1e2ad..6534a7fd320 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -17315,6 +17315,7 @@ finish_constructor_body (void) add_stmt (build_stmt (input_location, LABEL_EXPR, cdtor_label)); val = DECL_ARGUMENTS (current_function_decl); + suppress_warning (val, OPT_Wuse_after_free); val = build2 (MODIFY_EXPR, TREE_TYPE (val), DECL_RESULT (current_function_decl), val); /* Return the address of the object. */ @@ -17408,6 +17409,7 @@ finish_destructor_body (void) tree val; val = DECL_ARGUMENTS (current_function_decl); + suppress_warning (val, OPT_Wuse_after_free); val = build2 (MODIFY_EXPR, TREE_TYPE (val), DECL_RESULT (current_function_decl), val); /* Return the address of the object. */ diff --git a/gcc/cp/optimize.cc b/gcc/cp/optimize.cc index 4ad3f1dc9aa..13ab8b7361e 100644 --- a/gcc/cp/optimize.cc +++ b/gcc/cp/optimize.cc @@ -166,6 +166,7 @@ build_delete_destructor_body (tree delete_dtor, tree complete_dtor) if (targetm.cxx.cdtor_returns_this ()) { tree val = DECL_ARGUMENTS (delete_dtor); + suppress_warning (val, OPT_Wuse_after_free); val = build2 (MODIFY_EXPR, TREE_TYPE (val), DECL_RESULT (delete_dtor), val); add_stmt (build_stmt (0, RETURN_EXPR, val)); diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc index afcf0f71bec..3dcaf4230b8 100644 --
[PATCH] c++: constrained partial spec using qualified name [PR92944]
In the nested_name_specifier branch within cp_parser_class_head, we need to update TYPE with the result of maybe_process_partial_specialization just like we do in the template_id_p branch. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? PR c++/92944 gcc/cp/ChangeLog: * parser.cc (cp_parser_class_head): Update TYPE with the result of maybe_process_partial_specialization in the nested_name_specifier branch too. Refactor nearby code to accomodate that maybe_process_partial_specialization returns a _TYPE, not a TYPE_DECL, and eliminate local variable CLASS_TYPE. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-partial-spec10.C: New test. --- gcc/cp/parser.cc | 18 +++--- .../g++.dg/cpp2a/concepts-partial-spec10.C | 6 ++ 2 files changed, 13 insertions(+), 11 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index ed219d79dc9..aa1768ffab1 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -26534,7 +26534,7 @@ cp_parser_class_head (cp_parser* parser, } else if (nested_name_specifier) { - tree class_type; + type = TREE_TYPE (type); /* Given: @@ -26544,31 +26544,27 @@ cp_parser_class_head (cp_parser* parser, we will get a TYPENAME_TYPE when processing the definition of `S::T'. We need to resolve it to the actual type before we try to define it. */ - if (TREE_CODE (TREE_TYPE (type)) == TYPENAME_TYPE) + if (TREE_CODE (type) == TYPENAME_TYPE) { - class_type = resolve_typename_type (TREE_TYPE (type), - /*only_current_p=*/false); - if (TREE_CODE (class_type) != TYPENAME_TYPE) - type = TYPE_NAME (class_type); - else + type = resolve_typename_type (type, /*only_current_p=*/false); + if (TREE_CODE (type) == TYPENAME_TYPE) { cp_parser_error (parser, "could not resolve typename type"); type = error_mark_node; } } - if (maybe_process_partial_specialization (TREE_TYPE (type)) - == error_mark_node) + type = maybe_process_partial_specialization (type); + if (type == error_mark_node) { type = NULL_TREE; goto done; } - class_type = current_class_type; /* Enter the scope indicated by the nested-name-specifier. */ pushed_scope = push_scope (nested_name_specifier); /* Get the canonical version of this type. */ - type = TYPE_MAIN_DECL (TREE_TYPE (type)); + type = TYPE_MAIN_DECL (type); /* Call push_template_decl if it seems like we should be defining a template either from the template headers or the type we're defining, so that we diagnose both extra and missing headers. */ diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C b/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C new file mode 100644 index 000..8504a055ac5 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C @@ -0,0 +1,6 @@ +// PR c++/92944 +// { dg-do compile { target c++20 } } + +namespace ns { template struct A { }; } +template requires true struct ns::A { }; +template requires false struct ns::A { }; -- 2.35.0
[PATCH] aarch64: [PR101529] Fix vector shuffle insertion expansion
From: Andrew Pinski The function aarch64_evpc_ins would reuse the target even though it might be the same register as the two inputs. Instead of checking to see if we can reuse the target, just use the original input directly. Committed as approved after bootstrapped and tested on aarch64-linux-gnu with no regressions. Note the testcases are not backported as __builtin_shufflevector does not exist in GCC 11. PR target/101529 gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_evpc_ins): Don't use target as an input, use original one. (cherry picked from commit 52fa771758635d9c53cddb9116e5a66fae592230) --- gcc/config/aarch64/aarch64.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index bbcf5ed4a61..b58a379759d 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -23026,11 +23026,10 @@ aarch64_evpc_ins (struct expand_vec_perm_d *d) } gcc_assert (extractindex < nelt); - emit_move_insn (d->target, insv); insn_code icode = code_for_aarch64_simd_vec_copy_lane (mode); expand_operand ops[5]; create_output_operand (&ops[0], d->target, mode); - create_input_operand (&ops[1], d->target, mode); + create_input_operand (&ops[1], insv, mode); create_integer_operand (&ops[2], 1 << idx); create_input_operand (&ops[3], extractv, mode); create_integer_operand (&ops[4], extractindex); -- 2.17.1
Re: [PATCH] rs6000: Fix constraint v with rs6000_constraints[RS6000_CONSTRAINT_v]
Hi! On Wed, Jan 26, 2022 at 10:26:45AM +0800, Kewen.Lin wrote: > on 2022/1/14 上午12:31, David Edelsohn wrote: > Yeah, but IMHO it still can confuse new comers at first glance. Yes, or at least cause to read (well, grep) the whole backend and scratch heads. > >> 2) Bootstrapped and tested one below patch to remove all the code using > >> RS6000_CONSTRAINT_v on powerpc64le-linux-gnu P10 and P9, > >> powerpc64-linux-gnu P8 and P7 with no regressions. > > I would prefer that we not make gratuitous changes to this code, but > > maybe Segher has a different opinion. > > Thanks David for the comments! > > Hi Segher, what's your preference on this? I like your original patch better. But for stage 1, sorry. Indeed using ALTIVEC_REGS directly in the define_regiater_constraint works fine, but it isn't as clear as it could be that is correct. Segher
Re: [PATCH] MIPS: use 8bit for IPL in Cause register
On Wed, 26 Jan 2022, YunQiang Su wrote: > Since MIPS r2, the IPL section in Cause register has been expand > to 8bit instead of 6bit. Hmm, I cannot see it in my copy of the architecture manual I'm afraid. The interpretation may have changed, but the field is still 6-bit (not counting the software interrupts). Now the MCU ASE does expand the IPL field, but we can't rely on that here, not at least unconditionally, and then MCU is MIPSr3+. What problem are you trying to solve anyway? Maciej
[pushed] c++: vector compound literal [PR104206]
My patch for PR101072 removed the specific VECTOR_TYPE handling here, which broke pr72747-2.c on PPC; this patch restores it. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/104206 PR c++/101072 gcc/cp/ChangeLog: * semantics.cc (finish_compound_literal): Restore VECTOR_TYPE check. --- gcc/cp/semantics.cc | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index 07c2b3393be..466d6b56871 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -3272,8 +3272,9 @@ finish_compound_literal (tree type, tree compound_literal, /* Represent other compound literals with TARGET_EXPR so we produce a prvalue, and can elide copies. */ - if (TREE_CODE (compound_literal) == CONSTRUCTOR - || TREE_CODE (compound_literal) == VEC_INIT_EXPR) + if (!VECTOR_TYPE_P (type) + && (TREE_CODE (compound_literal) == CONSTRUCTOR + || TREE_CODE (compound_literal) == VEC_INIT_EXPR)) { /* The CONSTRUCTOR is now an initializer, not a compound literal. */ if (TREE_CODE (compound_literal) == CONSTRUCTOR) base-commit: 00d8321124123daf41f7c51526355a5a610cdeb8 prerequisite-patch-id: c7ab4056fcbd782232c8dc091597602ecd4a7a48 -- 2.27.0
Re: [PATCH][GCC13?] RISC-V: Replace `smin'/`smax' RTL patterns with `fmin'/`fmax'
On Wed, 26 Jan 2022, Maciej W. Rozycki wrote: > > The newer instruction semantics correspond to the functions fminimum_num > > and fmaximum_num, not fminimum and fmaximum (which are functions that > > treat both quiet and signaling NaNs like most libm functions do - any > > argument a NaN means the result is NaN). > > I got these wrong then, sorry. I got lost too: what is the difference > between `fmin'/`fmax' and `fminimum'/`fmaximum' then? It looks to me like > we have a zoo of selection functions now with very subtle differences > between each other. fmin and fmax: * Treat quiet NaN as missing data and return the other argument (if a number). * Treat signaling NaN like most functions (raise invalid, return quiet NaN). fminimum and fmaximum: * Treat quiet NaN like most functions (return quiet NaN, no exceptions raised unless the other argument is signaling NaN). * Treat signaling NaN like most functions (raise invalid, return quiet NaN). fminimum_num and fmaximum_num: * Treat both quiet and signaling NaN as missing data and return the other argument (if a number). "invalid" is still raised if one argument is a signaling NaN (contrary to the normal rule that raising "invalid" means also returning a quiet NaN). -- Joseph S. Myers jos...@codesourcery.com
[PATCH v2][GCC13] RISC-V: Provide `fmin'/`fmax' RTL patterns
As at r2.2 of the RISC-V ISA specification[1] the FMIN and FMAX machine instructions fully match our requirement for the `fminM3' and `fmaxM3' standard RTL patterns: "For FMIN and FMAX, if at least one input is a signaling NaN, or if both inputs are quiet NaNs, the result is the canonical NaN. If one operand is a quiet NaN and the other is not a NaN, the result is the non-NaN operand." suitably for the IEEE 754-2008 `minNum' and `maxNum' operations. However we only define `sminM3' and `smaxM3' standard RTL patterns to produce the FMIN and FMAX machine instructions, which in turn causes the `__builtin_fmin' and `__builtin_fmax' family of intrinsics to emit the corresponding libcalls rather than the relevant machine instructions. This is according to earlier revisions of the RISC-V ISA specification, which we however do not support anymore, as from commit 4b81528241ca ("RISC-V: Support version controling for ISA standard extensions"). As from r20190608 of the RISC-V ISA specification the definition of the FMIN and FMAX machine instructions has been updated[2]: "Defined the signed-zero behavior of FMIN.fmt and FMAX.fmt, and changed their behavior on signaling-NaN inputs to conform to the minimumNumber and maximumNumber operations in the proposed IEEE 754-201x specification." and specifically[3]: "Floating-point minimum-number and maximum-number instructions FMIN.S and FMAX.S write, respectively, the smaller or larger of rs1 and rs2 to rd. For the purposes of these instructions only, the value -0.0 is considered to be less than the value +0.0. If both inputs are NaNs, the result is the canonical NaN. If only one operand is a NaN, the result is the non-NaN operand. Signaling NaN inputs set the invalid operation exception flag, even when the result is not NaN." Consequently for forwards compatibility with r20190608+ hardware we cannot use the FMIN and FMAX machine instructions unconditionally even where the ISA level of r2.2 has been specified with the `-misa-spec=2.2' option where operation would be different between ISA revisions, that is the handling of signaling NaN inputs. Therefore provide new `fmin3' and `fmax3' patterns removing the need to emit libcalls with the `__builtin_fmin' and `__builtin_fmax' family of intrinsics, however limit them to where `-fno-signaling-nans' is in effect, deferring to the existing `smin3' and `smax3' patterns otherwise. For clarity and maintenance error resistance add an explicit condition to the latter patterns rather than relying on source code ordering for where the respective RTL insns are matched by their operation rather than being explicitly referred to by their names. References: [1] "The RISC-V Instruction Set Manual, Volume I: User-Level ISA", Document Version 2.2, May 7, 2017, Section 8.3 "NaN Generation and Propagation", p. 48 [1] "The RISC-V Instruction Set Manual, Volume I: Unprivileged ISA", Document Version 20190608-Base-Ratified, June 8, 2019, "Preface", p. ii [2] same, Section 11.6 "Single-Precision Floating-Point Computational Instructions", p. 66 gcc/ * config/riscv/riscv.md (fmin3, fmax3): New insns. (smin3, smax3): Only enable if `HONOR_SNANS'. --- Hi, This updated version has passed full GCC regression testing (including the D frontend this time) with the `riscv64-linux-gnu' target using the HiFive Unmatched (U74 CPU) target board. Any further questions or comments? Otherwise OK once GCC 13 has opened? Maciej Changes from v1: - Adjust heading from "RISC-V: Replace `smin'/`smax' RTL patterns with `fmin'/`fmax'". - Do not remove `smin'/`smax' patterns; instead make them available if `HONOR_SNANS (mode)'. - Make `fmin'/`fmax' patterns available if `!HONOR_SNANS (mode)' only. - Update description accordingly; refer r2.2 and r20190608 ISA specs. --- gcc/config/riscv/riscv.md | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) gcc-riscv-fmin-fmax.diff Index: gcc/gcc/config/riscv/riscv.md === --- gcc.orig/gcc/config/riscv/riscv.md +++ gcc/gcc/config/riscv/riscv.md @@ -1214,11 +1214,29 @@ ;; ;; +(define_insn "fmin3" + [(set (match_operand:ANYF0 "register_operand" "=f") + (smin:ANYF (match_operand:ANYF 1 "register_operand" " f") + (match_operand:ANYF 2 "register_operand" " f")))] + "TARGET_HARD_FLOAT && !HONOR_SNANS (mode)" + "fmin.\t%0,%1,%2" + [(set_attr "type" "fmove") + (set_attr "mode" "")]) + +(define_insn "fmax3" + [(set (match_operand:ANYF0 "register_operand" "=f") + (smax:ANYF (match_operand:ANYF 1 "register_operand" " f") + (match_operand:ANYF 2 "register_operand" " f")))] + "TARGET_HARD_FLOAT && !HONOR_SNANS (mode)" + "fmax.\t%0,%1,%2" + [(set_attr "type" "fmove") + (set_attr "mode" "")]) + (define_insn "smin3" [(set (match_opera
Re: [PATCH][GCC13?] RISC-V: Replace `smin'/`smax' RTL patterns with `fmin'/`fmax'
On Wed, 26 Jan 2022, Joseph Myers wrote: > fmin and fmax: > > * Treat quiet NaN as missing data and return the other argument (if a > number). > > * Treat signaling NaN like most functions (raise invalid, return quiet > NaN). > > fminimum and fmaximum: > > * Treat quiet NaN like most functions (return quiet NaN, no exceptions > raised unless the other argument is signaling NaN). > > * Treat signaling NaN like most functions (raise invalid, return quiet > NaN). > > fminimum_num and fmaximum_num: > > * Treat both quiet and signaling NaN as missing data and return the other > argument (if a number). "invalid" is still raised if one argument is a > signaling NaN (contrary to the normal rule that raising "invalid" means > also returning a quiet NaN). Thanks! Maciej
[PATCH] rs6000: Fix up #include or [PR104239]
Hi! r12-4717-g7d37abedf58d66 added immintrin.h and x86gprintrin.h headers to rs6000, these headers are on x86 standalone headers that various programs include directly rather than including them through . Unfortunately, for that change the bmiintrin.h and bmi2intrin.h headers haven't been adjusted, so the effect is that if one includes them (without including also x86intrin.h first) #error will trigger. Furthermore, when including such headers conditionally as some real-world packages do, this means a regression. The following patch fixes it and matches what the x86 bmi{,2}intrin.h headers do. Bootstrapped/regtested on powerpc64le-linux, ok for trunk? 2022-01-26 Jakub Jelinek PR target/104239 * config/rs6000/bmiintrin.h: Test _X86GPRINTRIN_H_INCLUDED instead of _X86INTRIN_H_INCLUDED and adjust #error wording. * config/rs6000/bmi2intrin.h: Likewise. * gcc.target/powerpc/pr104239-1.c: New test. * gcc.target/powerpc/pr104239-2.c: New test. --- gcc/config/rs6000/bmiintrin.h.jj2022-01-11 23:11:21.936296534 +0100 +++ gcc/config/rs6000/bmiintrin.h 2022-01-26 13:35:08.705945170 +0100 @@ -29,8 +29,8 @@ standard C or GNU C extensions, which are more portable and better optimized across multiple targets. */ -#if !defined _X86INTRIN_H_INCLUDED -# error "Never use directly; include instead." +#if !defined _X86GPRINTRIN_H_INCLUDED +# error "Never use directly; include instead." #endif #ifndef _BMIINTRIN_H_INCLUDED --- gcc/config/rs6000/bmi2intrin.h.jj 2022-01-11 23:11:21.936296534 +0100 +++ gcc/config/rs6000/bmi2intrin.h 2022-01-26 13:34:53.373162122 +0100 @@ -29,8 +29,8 @@ standard C or GNU C extensions, which are more portable and better optimized across multiple targets. */ -#if !defined _X86INTRIN_H_INCLUDED -# error "Never use directly; include instead." +#if !defined _X86GPRINTRIN_H_INCLUDED +# error "Never use directly; include instead." #endif #ifndef _BMI2INTRIN_H_INCLUDED --- gcc/testsuite/gcc.target/powerpc/pr104239-1.c.jj2022-01-26 13:42:34.103643030 +0100 +++ gcc/testsuite/gcc.target/powerpc/pr104239-1.c 2022-01-26 13:42:23.101798701 +0100 @@ -0,0 +1,9 @@ +/* PR target/104239 */ +/* { dg-do compile } */ +/* { dg-options "-DNO_WARN_X86_INTRINSICS" } */ + +#if __has_include() +#include +#endif + +int i; --- gcc/testsuite/gcc.target/powerpc/pr104239-2.c.jj2022-01-26 13:42:42.279527345 +0100 +++ gcc/testsuite/gcc.target/powerpc/pr104239-2.c 2022-01-26 13:42:23.101798701 +0100 @@ -0,0 +1,9 @@ +/* PR target/104239 */ +/* { dg-do compile } */ +/* { dg-options "-DNO_WARN_X86_INTRINSICS" } */ + +#if __has_include() +#include +#endif + +int i; Jakub
Re: [PATCH] rs6000: Fix up #include or [PR104239]
On Wed, Jan 26, 2022 at 3:45 PM Jakub Jelinek wrote: > > Hi! > > r12-4717-g7d37abedf58d66 added immintrin.h and x86gprintrin.h headers > to rs6000, these headers are on x86 standalone headers that various > programs include directly rather than including them through > . > Unfortunately, for that change the bmiintrin.h and bmi2intrin.h > headers haven't been adjusted, so the effect is that if one includes them > (without including also x86intrin.h first) #error will trigger. > Furthermore, when including such headers conditionally as some real-world > packages do, this means a regression. > > The following patch fixes it and matches what the x86 bmi{,2}intrin.h > headers do. > > Bootstrapped/regtested on powerpc64le-linux, ok for trunk? Okay. Thanks for catching this. - David > > 2022-01-26 Jakub Jelinek > > PR target/104239 > * config/rs6000/bmiintrin.h: Test _X86GPRINTRIN_H_INCLUDED instead of > _X86INTRIN_H_INCLUDED and adjust #error wording. > * config/rs6000/bmi2intrin.h: Likewise. > > * gcc.target/powerpc/pr104239-1.c: New test. > * gcc.target/powerpc/pr104239-2.c: New test. > > --- gcc/config/rs6000/bmiintrin.h.jj2022-01-11 23:11:21.936296534 +0100 > +++ gcc/config/rs6000/bmiintrin.h 2022-01-26 13:35:08.705945170 +0100 > @@ -29,8 +29,8 @@ > standard C or GNU C extensions, which are more portable and better > optimized across multiple targets. */ > > -#if !defined _X86INTRIN_H_INCLUDED > -# error "Never use directly; include instead." > +#if !defined _X86GPRINTRIN_H_INCLUDED > +# error "Never use directly; include instead." > #endif > > #ifndef _BMIINTRIN_H_INCLUDED > --- gcc/config/rs6000/bmi2intrin.h.jj 2022-01-11 23:11:21.936296534 +0100 > +++ gcc/config/rs6000/bmi2intrin.h 2022-01-26 13:34:53.373162122 +0100 > @@ -29,8 +29,8 @@ > standard C or GNU C extensions, which are more portable and better > optimized across multiple targets. */ > > -#if !defined _X86INTRIN_H_INCLUDED > -# error "Never use directly; include instead." > +#if !defined _X86GPRINTRIN_H_INCLUDED > +# error "Never use directly; include > instead." > #endif > > #ifndef _BMI2INTRIN_H_INCLUDED > --- gcc/testsuite/gcc.target/powerpc/pr104239-1.c.jj2022-01-26 > 13:42:34.103643030 +0100 > +++ gcc/testsuite/gcc.target/powerpc/pr104239-1.c 2022-01-26 > 13:42:23.101798701 +0100 > @@ -0,0 +1,9 @@ > +/* PR target/104239 */ > +/* { dg-do compile } */ > +/* { dg-options "-DNO_WARN_X86_INTRINSICS" } */ > + > +#if __has_include() > +#include > +#endif > + > +int i; > --- gcc/testsuite/gcc.target/powerpc/pr104239-2.c.jj2022-01-26 > 13:42:42.279527345 +0100 > +++ gcc/testsuite/gcc.target/powerpc/pr104239-2.c 2022-01-26 > 13:42:23.101798701 +0100 > @@ -0,0 +1,9 @@ > +/* PR target/104239 */ > +/* { dg-do compile } */ > +/* { dg-options "-DNO_WARN_X86_INTRINSICS" } */ > + > +#if __has_include() > +#include > +#endif > + > +int i; > > Jakub >
[PATCH] PR fortran/84784 - ICEs: verify_gimple failed with -fdefault-integer-8
Dear Fortranners, the use of -fdefault-integer-8 exhibits several cases where we missed to convert the result of an intrinsic from the declared to the effective resulting type. The attached obvious patch fixes this for IMAGE_STATUS, TEAM_NUMBER, and POPCNT/POPPAR. OK for mainline if regtesting passes on x86_64-pc-linux-gnu? Thanks, Harald From af5cb1f0ec1cacb47acc8c2b0c0629cf3808e1af Mon Sep 17 00:00:00 2001 From: Harald Anlauf Date: Wed, 26 Jan 2022 21:50:41 +0100 Subject: [PATCH] Fortran: add missing conversions for result of intrinsics to result type gcc/fortran/ChangeLog: PR fortran/84784 * trans-intrinsic.cc (conv_intrinsic_image_status): Convert result to resulting (default) integer type. (conv_intrinsic_team_number): Likewise. (gfc_conv_intrinsic_popcnt_poppar): Likewise. gcc/testsuite/ChangeLog: PR fortran/84784 * gfortran.dg/pr84784.f90: New test. --- gcc/fortran/trans-intrinsic.cc| 13 +++-- gcc/testsuite/gfortran.dg/pr84784.f90 | 22 ++ 2 files changed, 29 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/pr84784.f90 diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index fccf0a9b229..da854fad89d 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -2620,7 +2620,7 @@ conv_intrinsic_image_status (gfc_se *se, gfc_expr *expr) else gcc_unreachable (); - se->expr = tmp; + se->expr = fold_convert (gfc_get_int_type (gfc_default_integer_kind), tmp); } static void @@ -2662,7 +2662,7 @@ conv_intrinsic_team_number (gfc_se *se, gfc_expr *expr) else gcc_unreachable (); - se->expr = tmp; + se->expr = fold_convert (gfc_get_int_type (gfc_default_integer_kind), tmp); } @@ -7255,12 +7255,13 @@ gfc_conv_intrinsic_popcnt_poppar (gfc_se * se, gfc_expr *expr, int parity) /* Combine the results. */ if (parity) - se->expr = fold_build2_loc (input_location, BIT_XOR_EXPR, result_type, -call1, call2); + se->expr = fold_build2_loc (input_location, BIT_XOR_EXPR, +integer_type_node, call1, call2); else - se->expr = fold_build2_loc (input_location, PLUS_EXPR, result_type, -call1, call2); + se->expr = fold_build2_loc (input_location, PLUS_EXPR, +integer_type_node, call1, call2); + se->expr = convert (result_type, se->expr); return; } diff --git a/gcc/testsuite/gfortran.dg/pr84784.f90 b/gcc/testsuite/gfortran.dg/pr84784.f90 new file mode 100644 index 000..48dd4dd4b0a --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr84784.f90 @@ -0,0 +1,22 @@ +! { dg-do compile } +! { dg-options "-fcoarray=lib -fdefault-integer-8" } +! { dg-require-effective-target fortran_integer_16 } +! PR fortran/84784 - ICEs: verify_gimple failed with -fdefault-integer-8 + + use iso_fortran_env, only : team_type, STAT_FAILED_IMAGE + implicit none + type(team_type) :: team + integer :: new_team + new_team = mod(this_image(),2)+1 + form team (new_team,team) +change team (team) +if (team_number() /= new_team) STOP 1 + end team + if (image_status (1) == STAT_FAILED_IMAGE) ERROR STOP "cannot recover" + if (runtime_popcnt(0_16) /= 0) STOP 2 +contains + integer function runtime_popcnt (i) +integer(kind=16), intent(in) :: i +runtime_popcnt = popcnt(i) + end function +end -- 2.31.1
[PATCH] rs6000: Fix up *intrin.h for C89 [PR104239]
Hi! When writing testcases for the previously posted patch, I have noticed that 3 of the headers aren't valid C89 (I didn't have any dg-options so -ansi -pedantic-errors was implied and these errors were reported). The following patch fixes those, ok for trunk? Note, as can be seen even in this patch, seems older rs6000/*intrin.h headers uglify not just argument names (__A instead of A etc.), but also automatic variable names and other local identifiers, while e.g. emmintrin.h or bmi2intrin.h clearly uglify only the argument names and not local variables. I think that should be fixed but don't have time for that myself (libstdc++ or e.g. the x86 headers uglify everything; this is so that one can #define result a + b #include etc.). 2022-01-26 Jakub Jelinek PR target/104239 * config/rs6000/emmintrin.h (_mm_sad_epu8): Use __asm__ instead of asm. * config/rs6000/smmintrin.h (_mm_minpos_epu16): Declare iterator before for loop instead of for init clause. * config/rs6000/bmi2intrin.h (_pext_u64): Likewise. * gcc.target/powerpc/pr104239-3.c: New test. --- gcc/config/rs6000/emmintrin.h.jj2022-01-11 22:56:21.316686838 + +++ gcc/config/rs6000/emmintrin.h 2022-01-26 20:47:26.319336232 + @@ -2215,7 +2215,7 @@ _mm_sad_epu8 (__m128i __A, __m128i __B) vsum = (__vector signed int) vec_sum4s (vabsdiff, zero); #ifdef __LITTLE_ENDIAN__ /* Sum across four integers with two integer results. */ - asm ("vsum2sws %0,%1,%2" : "=v" (result) : "v" (vsum), "v" (zero)); + __asm__ ("vsum2sws %0,%1,%2" : "=v" (result) : "v" (vsum), "v" (zero)); /* Note: vec_sum2s could be used here, but on little-endian, vector shifts are added that are not needed for this use-case. A vector shift to correctly position the 32-bit integer results --- gcc/config/rs6000/smmintrin.h.jj2022-01-19 10:27:58.529911366 + +++ gcc/config/rs6000/smmintrin.h 2022-01-26 20:48:05.720348812 + @@ -687,7 +687,8 @@ _mm_minpos_epu16 (__m128i __A) union __u __u = { .__m = __A }, __r = { .__m = {0} }; unsigned short __ridx = 0; unsigned short __rmin = __u.__uh[__ridx]; - for (unsigned long __i = 1; __i < 8; __i++) + unsigned long __i; + for (__i = 1; __i < 8; __i++) { if (__u.__uh[__i] < __rmin) { --- gcc/config/rs6000/bmi2intrin.h.jj 2022-01-26 20:42:53.132315506 + +++ gcc/config/rs6000/bmi2intrin.h 2022-01-26 20:46:33.687983641 + @@ -115,10 +115,11 @@ _pext_u64 (unsigned long long __X, unsig the Power8 Bit permute instruction. */ if (__builtin_constant_p (__M) && (__builtin_popcountl (__M) <= 8)) { + long i; /* Also if the pext mask is constant, then the popcount is constant, we can evaluate the following loop at compile time and use a constant bit permute vector. */ - for (long i = 0; i < __builtin_popcountl (__M); i++) + for (i = 0; i < __builtin_popcountl (__M); i++) { c = __builtin_clzl (m); p = (p << 8) | c; --- gcc/testsuite/gcc.target/powerpc/pr104239-3.c.jj2022-01-26 20:52:42.987474394 + +++ gcc/testsuite/gcc.target/powerpc/pr104239-3.c 2022-01-26 20:52:36.547308886 + @@ -0,0 +1,7 @@ +/* PR target/104239 */ +/* { dg-options "-O2 -mdejagnu-cpu=power8 -DNO_WARN_X86_INTRINSICS -std=c89" } */ +/* { dg-require-effective-target powerpc_p8vector_ok } */ + +#include + +int i; Jakub
Re: [PATCH][GCC13?] RISC-V: Replace `smin'/`smax' RTL patterns with `fmin'/`fmax'
On Wed, 26 Jan 2022, Joseph Myers wrote: > fmin and fmax: Also: * If the arguments are +0 and -0 in some order, it's unspecified what sign the result has. > fminimum and fmaximum: > fminimum_num and fmaximum_num: Also: * These four functions are required to treat -0 as less than +0. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] rs6000: Fix up *intrin.h for C89 [PR104239]
Hi! On Wed, Jan 26, 2022 at 10:03:36PM +0100, Jakub Jelinek wrote: > When writing testcases for the previously posted patch, I have noticed > that 3 of the headers aren't valid C89 (I didn't have any dg-options > so -ansi -pedantic-errors was implied and these errors were reported). Hrm. Do they even work like that? > Note, as can be seen even in this patch, seems older rs6000/*intrin.h > headers uglify not just argument names (__A instead of A etc.), but also > automatic variable names and other local identifiers, while e.g. emmintrin.h > or bmi2intrin.h clearly uglify only the argument names and not local > variables. I think that should be fixed but don't have time for that myself > (libstdc++ or e.g. the x86 headers uglify everything; this is so that one > can > #define result a + b > #include > etc.). Heh, I argued for this before, but mistakenly for arguments as well. Not that I can remember right now how function arguments can mess up either? > --- gcc/config/rs6000/bmi2intrin.h.jj 2022-01-26 20:42:53.132315506 + > +++ gcc/config/rs6000/bmi2intrin.h2022-01-26 20:46:33.687983641 + > @@ -115,10 +115,11 @@ _pext_u64 (unsigned long long __X, unsig > the Power8 Bit permute instruction. */ >if (__builtin_constant_p (__M) && (__builtin_popcountl (__M) <= 8)) > { > + long i; >/* Also if the pext mask is constant, then the popcount is > constant, we can evaluate the following loop at compile > time and use a constant bit permute vector. */ > - for (long i = 0; i < __builtin_popcountl (__M); i++) > + for (i = 0; i < __builtin_popcountl (__M); i++) Please put the declaration immediately before it is used, not at the start of the function? Okay for trunk like that. Thanks! Also any and all backports you want are fine. Segher
Re: [committed] libstdc++: Avoid symlink race in filesystem::remove_all [PR104161]
On Tue, Jan 25, 2022 at 09:09:51PM +, Jonathan Wakely via Gcc-patches wrote: > Tested x86_64-linux, pushed to trunk. Backports to follow. > > > This adds a new internal flag to the filesystem::directory_iterator > constructor that makes it fail if the path is a symlink that resolves to > a directory. This prevents filesystem::remove_all from following a > symlink to a directory, rather than deleting the symlink itself. > > We can also use that new flag in recursive_directory_iterator to ensure > that we don't follow symlinks if the follow_directory_symlink option is > not set. > > This also moves an error check in filesystem::remove_all after the while > loop, so that errors from the directory_iterator constructor are > reproted, instead of continuing to the filesystem::remove call below. > > libstdc++-v3/ChangeLog: > > PR libstdc++/104161 > * acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for > fdopendir. > * config.h.in: Regenerate. > * configure: Regenerate. > * src/c++17/fs_dir.cc (_Dir): Add nofollow flag to constructor > and pass it to base class constructor. > (directory_iterator): Pass nofollow flag to _Dir constructor. > (fs::recursive_directory_iterator::increment): Likewise. > * src/c++17/fs_ops.cc (do_remove_all): Use nofollow option for > directory_iterator constructor. Move error check outside loop. > * src/filesystem/dir-common.h (_Dir_base): Add nofollow flag to > constructor and when it's set use ::open with O_NOFOLLOW and > O_DIRECTORY. > * src/filesystem/dir.cc (_Dir): Add nofollow flag to constructor > and pass it to base class constructor. > (directory_iterator): Pass nofollow flag to _Dir constructor. > (fs::recursive_directory_iterator::increment): Likewise. > * src/filesystem/ops.cc (remove_all): Use nofollow option for > directory_iterator constructor. Move error check outside loop. > --- > libstdc++-v3/acinclude.m4| 12 ++ > libstdc++-v3/config.h.in | 3 ++ > libstdc++-v3/configure | 55 > libstdc++-v3/src/c++17/fs_dir.cc | 13 -- > libstdc++-v3/src/c++17/fs_ops.cc | 12 +++--- > libstdc++-v3/src/filesystem/dir-common.h | 48 - > libstdc++-v3/src/filesystem/dir.cc | 13 -- > libstdc++-v3/src/filesystem/ops.cc | 6 +-- > 8 files changed, 134 insertions(+), 28 deletions(-) > > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 > index d996477254c..7b6b807114a 100644 > --- a/libstdc++-v3/acinclude.m4 > +++ b/libstdc++-v3/acinclude.m4 > @@ -4735,6 +4735,18 @@ dnl >if test $glibcxx_cv_truncate = yes; then > AC_DEFINE(HAVE_TRUNCATE, 1, [Define if truncate is available in > .]) >fi > +dnl > + AC_CACHE_CHECK([for fdopendir], > +glibcxx_cv_fdopendir, [dnl > +GCC_TRY_COMPILE_OR_LINK( > + [#include ], > + [::fdopendir(1);], > + [glibcxx_cv_fdopendir=yes], > + [glibcxx_cv_fdopendir=no]) > + ]) > + if test $glibcxx_cv_truncate = yes; then This is a typo. Should check glibcxx_cv_fdopendir. Regards, Dimitar
Re: [committed] libstdc++: Avoid symlink race in filesystem::remove_all [PR104161]
On Wed, 26 Jan 2022 at 22:08, Dimitar Dimitrov wrote: > > On Tue, Jan 25, 2022 at 09:09:51PM +, Jonathan Wakely via Gcc-patches > wrote: > > Tested x86_64-linux, pushed to trunk. Backports to follow. > > > > > > This adds a new internal flag to the filesystem::directory_iterator > > constructor that makes it fail if the path is a symlink that resolves to > > a directory. This prevents filesystem::remove_all from following a > > symlink to a directory, rather than deleting the symlink itself. > > > > We can also use that new flag in recursive_directory_iterator to ensure > > that we don't follow symlinks if the follow_directory_symlink option is > > not set. > > > > This also moves an error check in filesystem::remove_all after the while > > loop, so that errors from the directory_iterator constructor are > > reproted, instead of continuing to the filesystem::remove call below. > > > > libstdc++-v3/ChangeLog: > > > > PR libstdc++/104161 > > * acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for > > fdopendir. > > * config.h.in: Regenerate. > > * configure: Regenerate. > > * src/c++17/fs_dir.cc (_Dir): Add nofollow flag to constructor > > and pass it to base class constructor. > > (directory_iterator): Pass nofollow flag to _Dir constructor. > > (fs::recursive_directory_iterator::increment): Likewise. > > * src/c++17/fs_ops.cc (do_remove_all): Use nofollow option for > > directory_iterator constructor. Move error check outside loop. > > * src/filesystem/dir-common.h (_Dir_base): Add nofollow flag to > > constructor and when it's set use ::open with O_NOFOLLOW and > > O_DIRECTORY. > > * src/filesystem/dir.cc (_Dir): Add nofollow flag to constructor > > and pass it to base class constructor. > > (directory_iterator): Pass nofollow flag to _Dir constructor. > > (fs::recursive_directory_iterator::increment): Likewise. > > * src/filesystem/ops.cc (remove_all): Use nofollow option for > > directory_iterator constructor. Move error check outside loop. > > --- > > libstdc++-v3/acinclude.m4| 12 ++ > > libstdc++-v3/config.h.in | 3 ++ > > libstdc++-v3/configure | 55 > > libstdc++-v3/src/c++17/fs_dir.cc | 13 -- > > libstdc++-v3/src/c++17/fs_ops.cc | 12 +++--- > > libstdc++-v3/src/filesystem/dir-common.h | 48 - > > libstdc++-v3/src/filesystem/dir.cc | 13 -- > > libstdc++-v3/src/filesystem/ops.cc | 6 +-- > > 8 files changed, 134 insertions(+), 28 deletions(-) > > > > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 > > index d996477254c..7b6b807114a 100644 > > --- a/libstdc++-v3/acinclude.m4 > > +++ b/libstdc++-v3/acinclude.m4 > > @@ -4735,6 +4735,18 @@ dnl > >if test $glibcxx_cv_truncate = yes; then > > AC_DEFINE(HAVE_TRUNCATE, 1, [Define if truncate is available in > > .]) > >fi > > +dnl > > + AC_CACHE_CHECK([for fdopendir], > > +glibcxx_cv_fdopendir, [dnl > > +GCC_TRY_COMPILE_OR_LINK( > > + [#include ], > > + [::fdopendir(1);], > > + [glibcxx_cv_fdopendir=yes], > > + [glibcxx_cv_fdopendir=no]) > > + ]) > > + if test $glibcxx_cv_truncate = yes; then > > This is a typo. Should check glibcxx_cv_fdopendir. Oops, thanks! Copy&pasto.
[Patch] libgomp.texi: Update OpenMP implementation status
]I saw that the following now is implemented: * requires: dynamic_allocators is now also works (by assuming that it is supported, no longer giving 'sorry') https://gcc.gnu.org/r12-6641-g450c85b81f4dd67bf6211d307afdc0f3c07ef44f * in_reduction on target: Now also supported for Fortran https://gcc.gnu.org/r12-4574-gd98626bf451dea6a28a42d953f7d0bd7659ad4d5 (I kept the "P" because of "nowait") I also note – but did not change: * omp_display_env runtime routine P Not inside target regions I observe that OpenMP 5.2 added: "When called from within a target region the effect is unspecified." Thus, regarding this as bug fix, it could also be "Y". Thoughts? [Some output modifications are also part of the pending device-specific ICV patch, https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588728.html thus, as that patch are not yet it, one can also argue for "P" in terms of the output.] Otherwise: OK? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 libgomp.texi: Update OpenMP implementation status libgomp/ * libgomp.texi (OpenMP 5.0): Update implementation status. diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi index 3be9de51f11..73d9f4f76cb 100644 --- a/libgomp/libgomp.texi +++ b/libgomp/libgomp.texi @@ -187,7 +187,8 @@ The OpenMP 4.5 specification is fully supported. env variable @tab Y @tab @item Nested-parallel changes to @emph{max-active-levels-var} ICV @tab Y @tab @item @code{requires} directive @tab P - @tab Only fulfillable requirement is @code{atomic_default_mem_order} + @tab Only fulfillable requirement are @code{atomic_default_mem_order} + and @code{dynamic_allocators} @item @code{teams} construct outside an enclosing target region @tab Y @tab @item Non-rectangular loop nests @tab P @tab Only C/C++ @item @code{!=} as relational-op in canonical loop form for C/C++ @tab Y @tab @@ -203,7 +204,7 @@ The OpenMP 4.5 specification is fully supported. @code{reduction} clause @tab Y @tab @item @code{in_reduction} clause on @code{task} constructs @tab Y @tab @item @code{in_reduction} clause on @code{target} constructs @tab P - @tab Only C/C++, @code{nowait} only stub + @tab @code{nowait} only stub @item @code{task_reduction} clause with @code{taskgroup} @tab Y @tab @item @code{task} modifier to @code{reduction} clause @tab Y @tab @item @code{affinity} clause to @code{task} construct @tab Y @tab Stub only
Re: [Patch] libgomp.texi: Update OpenMP implementation status
On Wed, Jan 26, 2022 at 11:14:54PM +0100, Tobias Burnus wrote: > ]I saw that the following now is implemented: > > * requires: dynamic_allocators is now also works > (by assuming that it is supported, no longer giving 'sorry') > https://gcc.gnu.org/r12-6641-g450c85b81f4dd67bf6211d307afdc0f3c07ef44f > > * in_reduction on target: Now also supported for Fortran > https://gcc.gnu.org/r12-4574-gd98626bf451dea6a28a42d953f7d0bd7659ad4d5 > (I kept the "P" because of "nowait") > > > I also note – but did not change: > > * omp_display_env runtime routine P Not inside target regions > > I observe that OpenMP 5.2 added: > "When called from within a target region the effect is unspecified." > Thus, regarding this as bug fix, it could also be "Y". > Thoughts? Yes, we should say Y instead of P for it. > Otherwise: OK? Yes. Jakub
Re: [PATCH] rs6000: Disable MMA if no P9 VECTOR support [PR103627]
Hi! On Thu, Dec 23, 2021 at 10:09:27AM +0800, Kewen.Lin wrote: > As PR103627 shows, there is an unexpected case where !TARGET_VSX > and TARGET_MMA co-exist. As ISA3.1 claims, SIMD is a requirement > for MMA. By looking into the ICE, I noticed that the current > MMA implementation depends on vector pairs load/store, but since > we don't have a separated option to control Power10 vector, this > patch is to check for Power9 vector instead. > > Bootstrapped and regtested on powerpc64le-linux-gnu P9 and > powerpc64-linux-gnu P8. > > Is it ok for trunk? No, sorry. > + /* MMA requires SIMD support as ISA 3.1 claims and our implementation > + such as "*movoo" uses vector pair access which are only supported > + from ISA 3.1. But since we don't have one separated option to > + control Power10 vector, check for Power9 vector instead. */ > + if (TARGET_MMA && !TARGET_P9_VECTOR) > +{ > + if ((rs6000_isa_flags_explicit & OPTION_MASK_MMA) != 0) > + error ("%qs requires %qs", "-mmma", "-mpower9-vector"); > + rs6000_isa_flags &= ~OPTION_MASK_MMA; > +} -mpower9-vector is a workaround that should go away. TARGET_MMA should require ISA 3.1 (or POWER10) directly, and require VSX. > +/* { dg-options "-mdejagnu-cpu=power10 -mno-power9-vector" } */ It should be impossible to select this at all. Either you have vectors or you don't, but it should be impossible to selectively disable part of vector support. That way madness lies. We can allow no VSRs, only VRs, or all VSRs. There is precedent for that (see -msoft-float for example, which means "do not use FPRs") -- when compiling certain code we want to disallow whole register banks. But disallowing or allowing some instructions separately is not a good idea: it doesn't give any useful extra functionality, it is hard to use, and it is a lot of extra work for us (it is impossible to test, already, too many combinations). Segher
Re: [PATCH] c++: non-dependent immediate member fn call [PR99895]
On Wed, Jan 19, 2022 at 2:19 PM Jason Merrill wrote: > > On 1/19/22 11:15, Patrick Palka wrote: > > Here we're emitting a bogus error during ahead of time evaluation of a > > non-dependent immediate member function call such as a.f(args) because > > the defacto templated form for such a call is (a.f)(args) but we're > > trying to evaluate it using the intermediate CALL_EXPR built by > > build_over_call, which has the non-member form f(a, args). The defacto > > member form is built in build_new_method_call, so it seems we should > > handle the immediate call there instead. > > Hmm, there's already a bunch of code in build_over_call to try to fix up > the object argument, and there seem to be many places other than > build_new_method_call that call build_over_call for member functions; I > think it's probably better to build the needed COMPONENT_REF in > build_over_call. Ah, but in build_over_call the arguments (including the implicit object argument) are potentially wrapped in a NON_DEPENDENT_EXPR. So even if we built a COMPONENT_REF in build_over_call, we'd have to keep rebuilding the entire CALL_EXPR (including the COMPONENT_REF) in terms of the original arguments in build_new_method_call, IIUC. On a related note, the NON_DEPENDENT_EXPR wrapping seems to be a problem for non-member functions too, because the constexpr engine punts on them: struct fixed_string { }; static consteval void size(fixed_string) { } template void VerifyHash(fixed_string s) { size(s); // error: expression ‘s’ is not a constant expression (because it's wrapped in NON_DEPENDENT_EXPR) } I wonder if this means we should be evaluating non-dependent non-member immediate calls elsewhere, e.g. in finish_call_expr? Or perhaps we should stop punting on NON_DEPENDENT_EXPR during constexpr evaluation? > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for > > trunk and perhaps 11? > > > > PR c++/99895 > > > > gcc/cp/ChangeLog: > > > > * call.cc (build_over_call): Don't evaluate non-dependent > > immediate member function calls here. > > (build_new_method_call): Instead evaluate them here. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/cpp2a/consteval-memfn1.C: New test. > > * g++.dg/cpp2a/consteval-memfn2.C: New test. > > --- > > gcc/cp/call.cc| 9 - > > gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C | 15 > > gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C | 34 +++ > > 3 files changed, 57 insertions(+), 1 deletion(-) > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C > > > > diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc > > index d4a07a7a9b3..0583cc0083b 100644 > > --- a/gcc/cp/call.cc > > +++ b/gcc/cp/call.cc > > @@ -9241,7 +9241,10 @@ build_over_call (struct z_candidate *cand, int > > flags, tsubst_flags_t complain) > > addr, nargs, argarray); > > if (TREE_THIS_VOLATILE (fn) && cfun) > > current_function_returns_abnormally = 1; > > - if (immediate_invocation_p (fn, nargs)) > > + if (!DECL_FUNCTION_MEMBER_P (fn) > > + /* Non-dependent immediate member function calls are evaluated in > > + build_new_method_call. */ > > + && immediate_invocation_p (fn, nargs)) > > { > > tree obj_arg = NULL_TREE, exprimm = expr; > > if (DECL_CONSTRUCTOR_P (fn)) > > @@ -11227,6 +11230,10 @@ skip_prune: > > call = convert_from_reference (call); > > if (cast_to_void) > > call = build_nop (void_type_node, call); > > + > > + if (immediate_invocation_p (fn, vec_safe_length (orig_args))) > > + fold_non_dependent_expr (call, complain, > > + /*manifestly_const_eval=*/true); > > } > > > >/* Free all the conversions we allocated. */ > > diff --git a/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C > > b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C > > new file mode 100644 > > index 000..d2df2e9b5ae > > --- /dev/null > > +++ b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C > > @@ -0,0 +1,15 @@ > > +// PR c++/99895 > > +// { dg-do compile { target c++20 } } > > + > > +struct fixed_string { > > + consteval int size(int n) const { > > +if (n < 0) throw; // { dg-error "not a constant" } > > +return n; > > + } > > +}; > > + > > +template > > +void VerifyHash(fixed_string s) { > > + s.size(0); // { dg-bogus "" } > > + s.size(-1); // { dg-message "expansion of" } > > +} > > diff --git a/gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C > > b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C > > new file mode 100644 > > index 000..71748f46b13 > > --- /dev/null > > +++ b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C > > @@ -0,0 +1,34 @@ > > +// PR c++/99895 > > +// { dg-do compile { target c++20 } } > > + > > +static constexpr unsigned hash(const char*
Re: [PATCH] rs6000: Move the hunk affecting VSX/ALTIVEC ahead [PR103627]
Hi! On Thu, Dec 23, 2021 at 10:12:19AM +0800, Kewen.Lin wrote: > PR target/103627 > * config/rs6000/rs6000.c (rs6000_option_override_internal): Move the > hunk affecting VSX and ALTIVEC to the appropriate place. > > gcc/testsuite/ChangeLog: > > PR target/103627 > * gcc.target/powerpc/pr103627-3.c: New test. > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -3955,6 +3955,15 @@ rs6000_option_override_internal (bool global_init_p) >else if (TARGET_ALTIVEC) > rs6000_isa_flags |= (OPTION_MASK_PPC_GFXOPT & ~ignore_masks); > > + /* Disable VSX and Altivec silently if the user switched cpus to power7 in > a > + target attribute or pragma which automatically enables both options, > + unless the altivec ABI was set. This is set by default for 64-bit, but > + not for 32-bit. Don't move this before the above code using > ignore_masks, > + since it can reset the cleared VSX/ALTIVEC flag again. */ > + if (main_target_opt != NULL && !main_target_opt->x_rs6000_altivec_abi) > +rs6000_isa_flags &= ~((OPTION_MASK_VSX | OPTION_MASK_ALTIVEC) > + & ~rs6000_isa_flags_explicit); Could you at the same time get rid of the != NULL please? if (bla != NULL) is sillier than if (bla != 0) which is about the same as if (!!bla) but that is certainly better than if (bla != 0 != 0) although I am not sure about the more stylish if (bla != 0 != 0 != 0 != 0 != 0) but what is wrong with if (bla) ? :-) > +/* There are no error messages for either LE or BE 64bit. */ > +/* { dg-require-effective-target be }*/ (space before */) > +/* { dg-require-effective-target ilp32 } */ > +/* We don't have one powerpc.*_ok for Power6, use altivec_ok conservatively. > */ > +/* { dg-require-effective-target powerpc_altivec_ok } */ > +/* { dg-options "-mdejagnu-cpu=power6" } */ It is okay always, no _ok at all please. Okay for trunk with those things (but do test of course). Thanks! Segher
Re: [PATCH] c++: non-dependent immediate member fn call [PR99895]
On Wed, 26 Jan 2022, Patrick Palka wrote: > On Wed, Jan 19, 2022 at 2:19 PM Jason Merrill wrote: > > > > On 1/19/22 11:15, Patrick Palka wrote: > > > Here we're emitting a bogus error during ahead of time evaluation of a > > > non-dependent immediate member function call such as a.f(args) because > > > the defacto templated form for such a call is (a.f)(args) but we're > > > trying to evaluate it using the intermediate CALL_EXPR built by > > > build_over_call, which has the non-member form f(a, args). The defacto > > > member form is built in build_new_method_call, so it seems we should > > > handle the immediate call there instead. > > > > Hmm, there's already a bunch of code in build_over_call to try to fix up > > the object argument, and there seem to be many places other than > > build_new_method_call that call build_over_call for member functions; I > > think it's probably better to build the needed COMPONENT_REF in > > build_over_call. > > Ah, but in build_over_call the arguments (including the implicit > object argument) are potentially wrapped in a NON_DEPENDENT_EXPR. So > even if we built a COMPONENT_REF in build_over_call, we'd have to > keep rebuilding the entire CALL_EXPR (including the COMPONENT_REF) in > terms of the original arguments in build_new_method_call, IIUC. > > On a related note, the NON_DEPENDENT_EXPR wrapping seems to be a > problem for non-member functions too, because the constexpr engine > punts on them: > > struct fixed_string { }; > > static consteval void size(fixed_string) { } > > template > void VerifyHash(fixed_string s) { > size(s); // error: expression ‘s’ is not a constant expression > (because it's wrapped in NON_DEPENDENT_EXPR) > } > > I wonder if this means we should be evaluating non-dependent > non-member immediate calls elsewhere, e.g. in finish_call_expr? Or > perhaps we should stop punting on NON_DEPENDENT_EXPR during constexpr > evaluation? Actually, for that particular example, we probably should just avoid wrapping PARM_DECL in a NON_DEPENDENT_EXPR... > > > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for > > > trunk and perhaps 11? > > > > > > PR c++/99895 > > > > > > gcc/cp/ChangeLog: > > > > > > * call.cc (build_over_call): Don't evaluate non-dependent > > > immediate member function calls here. > > > (build_new_method_call): Instead evaluate them here. > > > > > > gcc/testsuite/ChangeLog: > > > > > > * g++.dg/cpp2a/consteval-memfn1.C: New test. > > > * g++.dg/cpp2a/consteval-memfn2.C: New test. > > > --- > > > gcc/cp/call.cc| 9 - > > > gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C | 15 > > > gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C | 34 +++ > > > 3 files changed, 57 insertions(+), 1 deletion(-) > > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C > > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C > > > > > > diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc > > > index d4a07a7a9b3..0583cc0083b 100644 > > > --- a/gcc/cp/call.cc > > > +++ b/gcc/cp/call.cc > > > @@ -9241,7 +9241,10 @@ build_over_call (struct z_candidate *cand, int > > > flags, tsubst_flags_t complain) > > > addr, nargs, argarray); > > > if (TREE_THIS_VOLATILE (fn) && cfun) > > > current_function_returns_abnormally = 1; > > > - if (immediate_invocation_p (fn, nargs)) > > > + if (!DECL_FUNCTION_MEMBER_P (fn) > > > + /* Non-dependent immediate member function calls are evaluated in > > > + build_new_method_call. */ > > > + && immediate_invocation_p (fn, nargs)) > > > { > > > tree obj_arg = NULL_TREE, exprimm = expr; > > > if (DECL_CONSTRUCTOR_P (fn)) > > > @@ -11227,6 +11230,10 @@ skip_prune: > > > call = convert_from_reference (call); > > > if (cast_to_void) > > > call = build_nop (void_type_node, call); > > > + > > > + if (immediate_invocation_p (fn, vec_safe_length (orig_args))) > > > + fold_non_dependent_expr (call, complain, > > > + /*manifestly_const_eval=*/true); > > > } > > > > > >/* Free all the conversions we allocated. */ > > > diff --git a/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C > > > b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C > > > new file mode 100644 > > > index 000..d2df2e9b5ae > > > --- /dev/null > > > +++ b/gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C > > > @@ -0,0 +1,15 @@ > > > +// PR c++/99895 > > > +// { dg-do compile { target c++20 } } > > > + > > > +struct fixed_string { > > > + consteval int size(int n) const { > > > +if (n < 0) throw; // { dg-error "not a constant" } > > > +return n; > > > + } > > > +}; > > > + > > > +template > > > +void VerifyHash(fixed_string s) { > > > + s.size(0); // { dg-bogus "" } > > > + s.size(-1); // { dg-message "expansion of" } > > >
[PATCH RFA (tree)] c++: lambda in template default argument [PR103186]
The problem with this testcase was that since my patch for PR97900 we weren't preserving DECL_UID identity for parameters of instantiations of templated functions, so using those parameters as the keys for the defarg_inst map broke. I think this was always fragile given the possibility of redeclarations, so instead of reverting that change let's switch to keying off the function. Memory use compiling stdc++.h is not noticeably different. Tested x86_64-pc-linux-gnu. Is the tree.{h,cc} change, just moving the class definition from one to the other, OK for trunk? PR c++/103186 gcc/ChangeLog: * tree.h (struct tree_vec_map_cache_hasher): Move from... * tree.cc (struct tree_vec_map_cache_hasher): ...here. gcc/cp/ChangeLog: * pt.cc (defarg_inst): Use tree_vec_map_cache_hasher. (defarg_inst_for): New. (tsubst_default_argument): Adjust. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/lambda/lambda-defarg10.C: New test. --- gcc/tree.h| 17 gcc/cp/pt.cc | 43 +++ .../g++.dg/cpp0x/lambda/lambda-defarg10.C | 21 + gcc/tree.cc | 17 4 files changed, 72 insertions(+), 26 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg10.C diff --git a/gcc/tree.h b/gcc/tree.h index 30bc53c2996..c5617fbcc61 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -5559,6 +5559,23 @@ struct tree_decl_map_cache_hasher : ggc_cache_ptr_hash #define tree_vec_map_hash tree_decl_map_hash #define tree_vec_map_marked_p tree_map_base_marked_p +struct tree_vec_map_cache_hasher : ggc_cache_ptr_hash +{ + static hashval_t hash (tree_vec_map *m) { return DECL_UID (m->base.from); } + + static bool + equal (tree_vec_map *a, tree_vec_map *b) + { +return a->base.from == b->base.from; + } + + static int + keep_cache_entry (tree_vec_map *&m) + { +return ggc_marked_p (m->base.from); + } +}; + /* Hasher for tree decls. Pointer equality is enough here, but the DECL_UID is a better hash than the pointer value and gives a predictable traversal order. Additionally it can be used across PCH save/restore. */ diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 6fbda676527..db99b988fc3 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -13666,7 +13666,36 @@ tsubst_aggr_type (tree t, } } -static GTY((cache)) decl_tree_cache_map *defarg_inst; +/* Map from a FUNCTION_DECL to a vec of default argument instantiations. */ + +static GTY((cache)) hash_table *defarg_inst; + +/* Return a reference to the slot for the defarg inst of PARM. */ + +tree & +defarg_inst_for (tree parm) +{ + if (!defarg_inst) +defarg_inst = hash_table::create_ggc (13); + tree fn = DECL_CONTEXT (parm); + tree_vec_map in = { fn, nullptr }; + tree_vec_map **slot += defarg_inst->find_slot_with_hash (&in, DECL_UID (fn), INSERT); + if (!*slot) +{ + *slot = ggc_alloc (); + **slot = in; +} + + /* Index in reverse order to avoid allocating space for initial parameters + that don't have default arguments. */ + unsigned ridx = list_length (parm); + + vec *&defs = (*slot)->to; + if (vec_safe_length (defs) < ridx) +vec_safe_grow_cleared (defs, ridx); + return (*defs)[ridx - 1]; +} /* Substitute into the default argument ARG (a default argument for FN), which has the indicated TYPE. */ @@ -13696,9 +13725,9 @@ tsubst_default_argument (tree fn, int parmnum, tree type, tree arg, gcc_assert (same_type_ignoring_top_level_qualifiers_p (type, parmtype)); - tree *slot; - if (defarg_inst && (slot = defarg_inst->get (parm))) -return *slot; + tree &inst = defarg_inst_for (parm); + if (inst) +return inst; /* This default argument came from a template. Instantiate the default argument here, not in tsubst. In the case of @@ -13743,11 +13772,7 @@ tsubst_default_argument (tree fn, int parmnum, tree type, tree arg, pop_from_top_level (); if (arg != error_mark_node && !cp_unevaluated_operand) -{ - if (!defarg_inst) - defarg_inst = decl_tree_cache_map::create_ggc (37); - defarg_inst->put (parm, arg); -} +inst = arg; return arg; } diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg10.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg10.C new file mode 100644 index 000..e08eb4f5e56 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg10.C @@ -0,0 +1,21 @@ +// PR c++/103186 +// { dg-do compile { target c++11 } } + +struct f +{ + template + f(const T1&){} +}; + + +template class A { +public: +void foo(A a, const f& fn = [](){}) { } +void bar(A a) { foo(a); } +}; +int main() { +A a; +a.foo(a); +a.bar(a); +return 0; +} diff --git a/gcc/tree.cc b/gcc/tree.cc index c4b36619e6a..6bd3b14c93c 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -242,23 +242,6 @@ static GTY ((cache)) static GTY ((cache))
[PATCH] c++: new-expr of array of deduced class tmpl [PR101988]
In r12-1933 I attempted to implement DR2397 aka allowing int a[3]; auto (&r)[3] = a; by removing the type_uses_auto check in create_array_type_for_decl. That may have gone too far, because it also allows arrays of CLASS_PLACEHOLDER_TEMPLATE and it looks like [dcl.type.class.deduct] prohibits that: "...the declared type of the variable shall be cv T, where T is the placeholder." However, in /2 it explicitly states that "A placeholder for a deduced class type can also be used in the type-specifier-seq in the new-type-id or type-id of a new-expression." In this PR, it manifested by making us accept invalid template struct A { A(T); }; auto p = new A[]{1}; [expr.new]/2 says that such a construct is treated as an invented declaration of the form A x[]{1}; but, I think, that ought to be ill-formed as per above. So this patch sort of restores the create_array_type_for_decl check. I should mention that the difference between [] and [1] is due to cp_parser_new_type_id: if (*nelts == NULL_TREE) /* Leave [] in the declarator. */; and groktypename returning different types based on that. Does this make sense? Bootstrapped/regtested on x86_64-pc-linux-gnu. PR c++/101988 gcc/cp/ChangeLog: * decl.cc (create_array_type_for_decl): Reject forming an array of placeholder for a deduced class type. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/class-deduction-new1.C: New test. * g++.dg/cpp23/auto-array2.C: New test. --- gcc/cp/decl.cc | 12 .../g++.dg/cpp1z/class-deduction-new1.C | 16 gcc/testsuite/g++.dg/cpp23/auto-array2.C | 11 +++ 3 files changed, 39 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-array2.C diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 6534a7fd320..10e6956117e 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -11087,6 +11087,18 @@ create_array_type_for_decl (tree name, tree type, tree size, location_t loc) if (type == error_mark_node || size == error_mark_node) return error_mark_node; + /* [dcl.type.class.deduct] prohibits forming an array of placeholder + for a deduced class type. */ + if (is_auto (type) && CLASS_PLACEHOLDER_TEMPLATE (type)) +{ + if (name) + error_at (loc, "%qD declared as array of template placeholder " + "type %qT", name, type); + else + error ("creating array of template placeholder type %qT", type); + return error_mark_node; +} + /* If there are some types which cannot be array elements, issue an error-message and return. */ switch (TREE_CODE (type)) diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C new file mode 100644 index 000..70283353619 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C @@ -0,0 +1,16 @@ +// PR c++/101988 +// { dg-do compile { target c++17 } } + +template +struct A { + A(T); + A(); +}; +auto p1 = new A[]{1}; // { dg-error "creating array of template placeholder type" } +auto p2 = new A[1]{1}; // { dg-error "invalid use of placeholder" } +auto p3 = new A[]{1}; +auto p4 = new A[1]{1}; +auto p5 = new A[]{1, 2}; // { dg-error "creating array of template placeholder type" } +auto p6 = new A[]{1, 2}; +auto p7 = new A[]{A(1), A(1)}; +auto p8 = new A[2]{A(1), A(1)}; diff --git a/gcc/testsuite/g++.dg/cpp23/auto-array2.C b/gcc/testsuite/g++.dg/cpp23/auto-array2.C new file mode 100644 index 000..06431685b30 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp23/auto-array2.C @@ -0,0 +1,11 @@ +// PR c++/101988 +// { dg-do compile { target c++17 } } + +template struct A { A(); }; +A a[3]; +auto (*p)[3] = &a; +A (*p2)[3] = &a; +A (*p3)[3] = &a; // { dg-error "template placeholder type" } +auto (&r)[3] = a; +A (&r2)[3] = a; +A (&r3)[3] = a; // { dg-error "template placeholder type" } base-commit: fd5b0488ad5e4f29b65238e06a2d65b7de120235 -- 2.34.1
Re: [PATCH] c++: new-expr of array of deduced class tmpl [PR101988]
On 1/26/22 18:55, Marek Polacek wrote: In r12-1933 I attempted to implement DR2397 aka allowing int a[3]; auto (&r)[3] = a; by removing the type_uses_auto check in create_array_type_for_decl. That may have gone too far, because it also allows arrays of CLASS_PLACEHOLDER_TEMPLATE and it looks like [dcl.type.class.deduct] prohibits that: "...the declared type of the variable shall be cv T, where T is the placeholder." However, in /2 it explicitly states that "A placeholder for a deduced class type can also be used in the type-specifier-seq in the new-type-id or type-id of a new-expression." In this PR, it manifested by making us accept invalid template struct A { A(T); }; auto p = new A[]{1}; [expr.new]/2 says that such a construct is treated as an invented declaration of the form A x[]{1}; but, I think, that ought to be ill-formed as per above. So this patch sort of restores the create_array_type_for_decl check. I should mention that the difference between [] and [1] is due to cp_parser_new_type_id: if (*nelts == NULL_TREE) /* Leave [] in the declarator. */; and groktypename returning different types based on that. Does this make sense? Bootstrapped/regtested on x86_64-pc-linux-gnu. OK. PR c++/101988 gcc/cp/ChangeLog: * decl.cc (create_array_type_for_decl): Reject forming an array of placeholder for a deduced class type. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/class-deduction-new1.C: New test. * g++.dg/cpp23/auto-array2.C: New test. --- gcc/cp/decl.cc | 12 .../g++.dg/cpp1z/class-deduction-new1.C | 16 gcc/testsuite/g++.dg/cpp23/auto-array2.C | 11 +++ 3 files changed, 39 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-array2.C diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 6534a7fd320..10e6956117e 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -11087,6 +11087,18 @@ create_array_type_for_decl (tree name, tree type, tree size, location_t loc) if (type == error_mark_node || size == error_mark_node) return error_mark_node; + /* [dcl.type.class.deduct] prohibits forming an array of placeholder + for a deduced class type. */ + if (is_auto (type) && CLASS_PLACEHOLDER_TEMPLATE (type)) +{ + if (name) + error_at (loc, "%qD declared as array of template placeholder " + "type %qT", name, type); + else + error ("creating array of template placeholder type %qT", type); + return error_mark_node; +} + /* If there are some types which cannot be array elements, issue an error-message and return. */ switch (TREE_CODE (type)) diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C new file mode 100644 index 000..70283353619 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction-new1.C @@ -0,0 +1,16 @@ +// PR c++/101988 +// { dg-do compile { target c++17 } } + +template +struct A { + A(T); + A(); +}; +auto p1 = new A[]{1}; // { dg-error "creating array of template placeholder type" } +auto p2 = new A[1]{1}; // { dg-error "invalid use of placeholder" } +auto p3 = new A[]{1}; +auto p4 = new A[1]{1}; +auto p5 = new A[]{1, 2}; // { dg-error "creating array of template placeholder type" } +auto p6 = new A[]{1, 2}; +auto p7 = new A[]{A(1), A(1)}; +auto p8 = new A[2]{A(1), A(1)}; diff --git a/gcc/testsuite/g++.dg/cpp23/auto-array2.C b/gcc/testsuite/g++.dg/cpp23/auto-array2.C new file mode 100644 index 000..06431685b30 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp23/auto-array2.C @@ -0,0 +1,11 @@ +// PR c++/101988 +// { dg-do compile { target c++17 } } + +template struct A { A(); }; +A a[3]; +auto (*p)[3] = &a; +A (*p2)[3] = &a; +A (*p3)[3] = &a; // { dg-error "template placeholder type" } +auto (&r)[3] = a; +A (&r2)[3] = a; +A (&r3)[3] = a; // { dg-error "template placeholder type" } base-commit: fd5b0488ad5e4f29b65238e06a2d65b7de120235
[PATCH] testsuite/70230 - fix failures with default SSP
Configuring with --enable-default-ssp triggers various testsuite failures. These contain asm statements that are not compatible with -fstack-protector. Adding -fno-stack-protector to dg-options to work around this issue. Tested on x86_64-linux. 2022-01-26 Allan McRae PR testsuite/70230 * gcc.dg/asan/use-after-scope-4.c (dg-options): Add -fno-stack-protector. * gcc.dg/stack-usage-1.c: Likewise * gcc.dg/superblock.c: Likewise * gcc.target/i386/avx-vzeroupper-17.c: Likewise * gcc.target/i386/cleanup-1.c: Likewise * gcc.target/i386/cleanup-2.c: Likewise * gcc.target/i386/interrupt-redzone-1.c: Likewise * gcc.target/i386/interrupt-redzone-2.c: Likewise * gcc.target/i386/pr79793-1.c: Likewise * gcc.target/i386/pr79793-2.c: Likewise * gcc.target/i386/shrink_wrap_1.c: Likewise * gcc.target/i386/stack-check-11.c: Likewise * gcc.target/i386/stack-check-18.c: Likewise * gcc.target/i386/stack-check-19.c: Likewise * gcc.target/i386/stackalign/pr88483-1.c: Likewise * gcc.target/i386/stackalign/pr88483-2.c: Likewise * gcc.target/i386/sw-1.c: Likewise --- gcc/testsuite/gcc.dg/asan/use-after-scope-4.c| 1 + gcc/testsuite/gcc.dg/stack-usage-1.c | 2 +- gcc/testsuite/gcc.dg/superblock.c| 2 +- gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c| 2 +- gcc/testsuite/gcc.target/i386/cleanup-1.c| 2 +- gcc/testsuite/gcc.target/i386/cleanup-2.c| 2 +- gcc/testsuite/gcc.target/i386/interrupt-redzone-1.c | 2 +- gcc/testsuite/gcc.target/i386/interrupt-redzone-2.c | 2 +- gcc/testsuite/gcc.target/i386/pr79793-1.c| 2 +- gcc/testsuite/gcc.target/i386/pr79793-2.c| 2 +- gcc/testsuite/gcc.target/i386/shrink_wrap_1.c| 2 +- gcc/testsuite/gcc.target/i386/stack-check-11.c | 2 +- gcc/testsuite/gcc.target/i386/stack-check-18.c | 2 +- gcc/testsuite/gcc.target/i386/stack-check-19.c | 2 +- gcc/testsuite/gcc.target/i386/stackalign/pr88483-1.c | 2 +- gcc/testsuite/gcc.target/i386/stackalign/pr88483-2.c | 2 +- gcc/testsuite/gcc.target/i386/sw-1.c | 2 +- 17 files changed, 17 insertions(+), 16 deletions(-) diff --git a/gcc/testsuite/gcc.dg/asan/use-after-scope-4.c b/gcc/testsuite/gcc.dg/asan/use-after-scope-4.c index 44dc79535d2..e1486e75045 100644 --- a/gcc/testsuite/gcc.dg/asan/use-after-scope-4.c +++ b/gcc/testsuite/gcc.dg/asan/use-after-scope-4.c @@ -1,4 +1,5 @@ // { dg-do run } +/* { dg-options "-fno-stack-protector" } */ #define FN(NAME) \ NAME (void) \ diff --git a/gcc/testsuite/gcc.dg/stack-usage-1.c b/gcc/testsuite/gcc.dg/stack-usage-1.c index 93cfe7c0163..1d7d1fee435 100644 --- a/gcc/testsuite/gcc.dg/stack-usage-1.c +++ b/gcc/testsuite/gcc.dg/stack-usage-1.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-fstack-usage" } */ +/* { dg-options "-fstack-usage -fno-stack-protector" } */ /* nvptx doesn't have a reg allocator, and hence no stack usage data. */ /* { dg-skip-if "" { nvptx-*-* } } */ diff --git a/gcc/testsuite/gcc.dg/superblock.c b/gcc/testsuite/gcc.dg/superblock.c index 2b2fa9e154f..6b4419adaf5 100644 --- a/gcc/testsuite/gcc.dg/superblock.c +++ b/gcc/testsuite/gcc.dg/superblock.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fno-asynchronous-unwind-tables -fsched2-use-superblocks -fdump-rtl-sched2 -fdump-rtl-bbro" } */ +/* { dg-options "-O2 -fno-asynchronous-unwind-tables -fsched2-use-superblocks -fdump-rtl-sched2 -fdump-rtl-bbro -fno-stack-protector" } */ /* { dg-require-effective-target scheduling } */ typedef int aligned __attribute__ ((aligned (64))); diff --git a/gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c b/gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c index 6dc0dc05321..d677e6f10e0 100644 --- a/gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c +++ b/gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c @@ -1,5 +1,5 @@ /* { dg-do compile { target lp64 } } */ -/* { dg-options "-O2 -mavx -mabi=ms -dp" } */ +/* { dg-options "-O2 -mavx -mabi=ms -dp -fno-stack-protector" } */ typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); diff --git a/gcc/testsuite/gcc.target/i386/cleanup-1.c b/gcc/testsuite/gcc.target/i386/cleanup-1.c index dcfcc4edb5f..6e7544c6b7a 100644 --- a/gcc/testsuite/gcc.target/i386/cleanup-1.c +++ b/gcc/testsuite/gcc.target/i386/cleanup-1.c @@ -1,5 +1,5 @@ /* { dg-do run { target *-*-linux* *-*-gnu* } } */ -/* { dg-options "-fexceptions -fnon-call-exceptions -fasynchronous-unwind-tables -O2" } */ +/* { dg-options "-fexceptions -fnon-call-exceptions -fasynchronous-unwind-tables -O2 -fno-stack-protector" } */ /* Test complex CFA value expressions. */ #include diff --git a/gcc/testsuite/gcc.target/i386/cleanup-2.c b/gcc/testsuite/gcc.target/i386/cleanup-2.c index 7e60323373b..a24daba73da 100644 --- a/gcc/testsuite/gcc.tar