Re: [PATCH] Do not use DW_OP_not for TRUTH_NOT_EXPR in conditional expressions
> But this doesn't fix > > case TRUTH_NOT_EXPR: > case BIT_NOT_EXPR: > op = DW_OP_not; > goto do_unop; Revised patch attached, using Jakub's suggestion. The original (buggy) DWARF procedure for the Ada testcase I previously posted is: .uleb128 0x8# (DIE (0x5b) DW_TAG_dwarf_procedure) .uleb128 0xc# DW_AT_location .byte 0x12# DW_OP_dup .byte 0x20# DW_OP_not .byte 0x28# DW_OP_bra .value 0x4 .byte 0x34# DW_OP_lit4 .byte 0x2f# DW_OP_skip .value 0x1 .byte 0x30# DW_OP_lit0 .byte 0x16# DW_OP_swap .byte 0x13# DW_OP_drop With Jakub's fix: .uleb128 0x8# (DIE (0x5b) DW_TAG_dwarf_procedure) .uleb128 0xd# DW_AT_location .byte 0x12# DW_OP_dup .byte 0x30# DW_OP_lit0 .byte 0x29# DW_OP_eq .byte 0x28# DW_OP_bra .value 0x4 .byte 0x34# DW_OP_lit4 .byte 0x2f# DW_OP_skip .value 0x1 .byte 0x30# DW_OP_lit0 .byte 0x16# DW_OP_swap .byte 0x13# DW_OP_drop With mine: .uleb128 0x8# (DIE (0x5b) DW_TAG_dwarf_procedure) .uleb128 0xb# DW_AT_location .byte 0x12# DW_OP_dup .byte 0x28# DW_OP_bra .value 0x4 .byte 0x30# DW_OP_lit0 .byte 0x2f# DW_OP_skip .value 0x1 .byte 0x34# DW_OP_lit4 .byte 0x16# DW_OP_swap .byte 0x13# DW_OP_drop * dwarf2out.c (loc_list_from_tree_1) : Do a logical instead of a bitwise negation. : Swap the operands if the condition is TRUTH_NOT_EXPR. -- Eric Botcazou diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc index 5681b01749a..c333c595026 100644 --- a/gcc/dwarf2out.cc +++ b/gcc/dwarf2out.cc @@ -19449,6 +19449,14 @@ loc_list_from_tree_1 (tree loc, int want_address, break; case TRUTH_NOT_EXPR: + list_ret = loc_list_from_tree_1 (TREE_OPERAND (loc, 0), 0, context); + if (list_ret == 0) + return 0; + + add_loc_descr_to_each (list_ret, new_loc_descr (DW_OP_lit0, 0, 0)); + add_loc_descr_to_each (list_ret, new_loc_descr (DW_OP_eq, 0, 0)); + break; + case BIT_NOT_EXPR: op = DW_OP_not; goto do_unop; @@ -19497,6 +19505,15 @@ loc_list_from_tree_1 (tree loc, int want_address, list_ret = loc_list_from_tree_1 (TREE_OPERAND (TREE_OPERAND (loc, 0), 0), 0, context); + /* Likewise, swap the operands for a logically negated condition. */ + else if (TREE_CODE (TREE_OPERAND (loc, 0)) == TRUTH_NOT_EXPR) + { + lhs = loc_descriptor_from_tree (TREE_OPERAND (loc, 2), 0, context); + rhs = loc_list_from_tree_1 (TREE_OPERAND (loc, 1), 0, context); + list_ret + = loc_list_from_tree_1 (TREE_OPERAND (TREE_OPERAND (loc, 0), 0), + 0, context); + } else list_ret = loc_list_from_tree_1 (TREE_OPERAND (loc, 0), 0, context); if (list_ret == 0 || lhs == 0 || rhs == 0)
Re: [PATCH] Do not use DW_OP_not for TRUTH_NOT_EXPR in conditional expressions
On Mon, May 16, 2022 at 9:06 AM Eric Botcazou wrote: > > > But this doesn't fix > > > > case TRUTH_NOT_EXPR: > > case BIT_NOT_EXPR: > > op = DW_OP_not; > > goto do_unop; > > Revised patch attached, using Jakub's suggestion. The original (buggy) DWARF > procedure for the Ada testcase I previously posted is: > > .uleb128 0x8# (DIE (0x5b) DW_TAG_dwarf_procedure) > .uleb128 0xc# DW_AT_location > .byte 0x12# DW_OP_dup > .byte 0x20# DW_OP_not > .byte 0x28# DW_OP_bra > .value 0x4 > .byte 0x34# DW_OP_lit4 > .byte 0x2f# DW_OP_skip > .value 0x1 > .byte 0x30# DW_OP_lit0 > .byte 0x16# DW_OP_swap > .byte 0x13# DW_OP_drop > > With Jakub's fix: > > .uleb128 0x8# (DIE (0x5b) DW_TAG_dwarf_procedure) > .uleb128 0xd# DW_AT_location > .byte 0x12# DW_OP_dup > .byte 0x30# DW_OP_lit0 > .byte 0x29# DW_OP_eq > .byte 0x28# DW_OP_bra > .value 0x4 > .byte 0x34# DW_OP_lit4 > .byte 0x2f# DW_OP_skip > .value 0x1 > .byte 0x30# DW_OP_lit0 > .byte 0x16# DW_OP_swap > .byte 0x13# DW_OP_drop > > With mine: > > .uleb128 0x8# (DIE (0x5b) DW_TAG_dwarf_procedure) > .uleb128 0xb# DW_AT_location > .byte 0x12# DW_OP_dup > .byte 0x28# DW_OP_bra > .value 0x4 > .byte 0x30# DW_OP_lit0 > .byte 0x2f# DW_OP_skip > .value 0x1 > .byte 0x34# DW_OP_lit4 > .byte 0x16# DW_OP_swap > .byte 0x13# DW_OP_drop > > > * dwarf2out.c (loc_list_from_tree_1) : Do a logical > instead of a bitwise negation. > : Swap the operands if the condition is TRUTH_NOT_EXPR. LGTM. Thanks, Richard. > -- > Eric Botcazou
[PATCH][pushed] Fix ubsan error in opts-global.cc
Fixes: opts-global.cc:75:15: runtime error: store to address 0x0bc9be70 with insufficient space for an object of type 'char' which happens when mask == 0, len == 0 and we allocate zero elements. Eventually, result[0] is called which triggers the UBSAN. It's newly discovered after the Siddhesh's recent patch. Cheers, Martin gcc/ChangeLog: * opts-global.cc (write_langs): Allocate at least one byte. --- gcc/opts-global.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/opts-global.cc b/gcc/opts-global.cc index a18c76940f9..4f5f8cdcb98 100644 --- a/gcc/opts-global.cc +++ b/gcc/opts-global.cc @@ -61,7 +61,7 @@ write_langs (unsigned int mask) if (mask & (1U << n)) len += strlen (lang_name) + 1; - result = XNEWVEC (char, len); + result = XNEWVEC (char, MAX (1, len)); len = 0; for (n = 0; (lang_name = lang_names[n]) != 0; n++) if (mask & (1U << n)) -- 2.36.1
Re: [PATCH] Do not use DW_OP_not for TRUTH_NOT_EXPR in conditional expressions
On Mon, May 16, 2022 at 09:45:18AM +0200, Richard Biener via Gcc-patches wrote: > > * dwarf2out.c (loc_list_from_tree_1) : Do a logical > > instead of a bitwise negation. > > : Swap the operands if the condition is TRUTH_NOT_EXPR. > > LGTM. It won't work for types larger than size of address, it would need to use dwarf_OP (DW_OP_const_type) instead of DW_OP_lit0 in that case. But maybe TRUTH_NOT_EXPR will be never seen for such types and after all, even the loc_list_from_tree_1 INTEGER_CST case doesn't handle that (the RTL case does). So I think at least for now it is ok. Jakub
RE: [PATCH 1/3]middle-end: Add the ability to let the target decide the method of argument promotions.
> -Original Message- > From: Richard Biener > Sent: Monday, May 16, 2022 7:31 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; jeffreya...@gmail.com; > Richard Sandiford > Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the target decide > the method of argument promotions. > > On Fri, 13 May 2022, Tamar Christina wrote: > > > Hi All, > > > > Some targets require function parameters to be promoted to a different > > type on expand time because the target may not have native > > instructions to work on such types. As an example the AArch64 port > > does not have native instructions working on integer 8- or 16-bit > > values. As such it promotes every parameter of these types to 32-bits. > > > > This promotion could be done by a target for two reasons: > > > > 1. For correctness. This may be an APCS requirement for instance. > > 2. For efficiency. By promoting the argument at expansion time we don't > have > >to keep promoting the type back and forth after each operation on it. > >i.e. the promotion simplies the RTL. > > > > This patch adds the ability for a target to decide whether during the > > expansion to use an extend to handle promotion or to use a paradoxical > subreg. > > > > A pradoxical subreg can be used when there's no correctness issues and > > when you still want the RTL efficiency of not doing the promotion > constantly. > > > > This also allows the target to not need to generate any code when the > > top bits are not significant. > > > > An example is in AArch64 the following extend is unneeded: > > > > uint8_t fd2 (uint8_t xr){ > > return xr + 1; > > } > > > > currently generates: > > > > fd2: > > and w0, w0, 255 > > add w0, w0, 1 > > ret > > > > instead of > > > > fd2: > > add w0, w0, #1 > > ret > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Bootstrapped on x86_64-pc-linux-gnu and no issues > > > > Ok for master? > > Why do we need a target hook for this? Why doesn't the targets > function_arg(?) hook just return (subreg:SI (reg:QI 0)) here instead when no > zero-extension is required and (reg:QI 0) when it is? > Because I'm not sure it's allowed to. From what I can tell function_arg expects return registers to be hardregs. i.e. the actual APCS determined register for the argument. And since it's a hardreg can't make it a paradoxical subreg, it'll just resize the register to the new mode. assign_parm_setup_reg gets around this by creating a new pseudo and then assigning the resulting rtl to the param set_parm_rtl (parm, rtl); which as far as I can tell changes what expand will expand the parm to without actually changing the actual APCS register. Which is why I think the current code does the promotions there. > That said, an extra hook looks like a bad design to me, the existing > cummulative args way of querying the target ABI shoud be enough, and if > not, should be extended in a less hackish way. I could perhaps modify function_arg_info to have a field for the subreg extension which I can then use in function_arg for the promotion. However I'll have a problem with the asserts in insert_value_copy_on_edge and set_rtl both of which check that either the in/out types match, or the out type is a promoted in type. set_rtl I can extend with enough information to determine whether the subreg was intentional but insert_value_copy_on_edge doesn't give me access to enough information.. Which is probably why the current code re-queries the target.. Can I get to the parm information Here? I also initially tried to extend PROMOTE_MODE itself, however this is used by promote_mode, which is used in many other places and I am not sure some of these usages are safe to change, such as promote_ssa_mode. The new hook I know only interacts with parms. Any thoughts appreciated 😊 Cheers, Tamar > > But of course I am not familiar at all with the current state, but since you > specifially CCed me ... > > Richard. > > > Thanks, > > Tamar > > > > gcc/ChangeLog: > > > > * cfgexpand.cc (set_rtl): Check for function promotion. > > * tree-outof-ssa.cc (insert_value_copy_on_edge): Likewise. > > * function.cc (assign_parm_setup_reg): Likewise. > > * hooks.cc (hook_bool_mode_mode_int_tree_false, > > hook_bool_mode_mode_int_tree_true): New. > > * hooks.h (hook_bool_mode_mode_int_tree_false, > > hook_bool_mode_mode_int_tree_true): New. > > * target.def (promote_function_args_subreg_p): New. > > * doc/tm.texi: Document it. > > * doc/tm.texi.in: Likewise. > > > > --- inline copy of patch -- > > diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc index > > > d3cc77d2ca98f620b29623fc5696410bad9bc184..df95184cfa185312c2a46cb92da > a > > 051718d9f4f3 100644 > > --- a/gcc/cfgexpand.cc > > +++ b/gcc/cfgexpand.cc > > @@ -206,14 +206,20 @@ set_rtl (tree t, rtx x) > > have to compute it ourselves. For RESULT_DECLs, we accept
Re: [PATCH] lto-plugin: add support for feature detection
On Mon, May 16, 2022 at 2:38 PM Alexander Monakov wrote: > > On Mon, 16 May 2022, Rui Ueyama wrote: > > > If it is a guaranteed behavior that GCC of all versions that support > > only get_symbols_v2 don't leave a temporary file behind if it is > > suddenly disrupted during get_symbols_v2 execution, then yes, mold can > > restart itself when get_symbols_v2 is called for the first time. > > > > Is this what you want? I'm fine with that approach if it is guaranteed > > to work by GCC developers. > > I cannot answer that, hopefully someone in Cc: will. This sub-thread started > with Richard proposing an alternative solution for API level discovery [1] > (invoking onload twice, first with only the _v3 entrypoint in the "transfer > vector"), and then suggesting an onload_v2 variant that would allow to > discover which entrypoints the plugin is going to use [2]. > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594058.html > [2] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594059.html > > ... at which point I butted in, because the whole _v3 thing was shrouded in > mystery. Hopefully, now it makes more sense. > > > From my side I want to add that thread-safety remains a major unsolved point. > Compiler driver _always_ adds the plugin to linker command line, so I expect > that if you add a mutex around your claim_file hook invocation, you'll find > that it serializes the linker too much. Have you given some thought to that? > Will you be needing a plugin API upgrade to discover thread-safe entrypoints, > or do you have another solution in mind? >From my linker's point of view, the easiest way to handle the situation is to implement a logic like this: if (gcc_version >= 12.2) assume that claim_file_hook is thread safe else assume that claim_file_hook is not thread safe And that is also _very_ useful to identify the existence of get_symbols_v3, because we can do the same thing with s/12.2/12.1/.
[PATCH] Mitigate -Wmaybe-uninitialized in expmed.cc.
It's the warning I see every time I build GCC: In file included from /home/marxin/Programming/gcc/gcc/coretypes.h:478, from /home/marxin/Programming/gcc/gcc/expmed.cc:26: In function ‘poly_uint16 mode_to_bytes(machine_mode)’, inlined from ‘typename if_nonpoly::type GET_MODE_SIZE(const T&) [with T = scalar_int_mode]’ at /home/marxin/Programming/gcc/gcc/machmode.h:647:24, inlined from ‘rtx_def* emit_store_flag_1(rtx, rtx_code, rtx, rtx, machine_mode, int, int, machine_mode)’ at /home/marxin/Programming/gcc/gcc/expmed.cc:5728:56: /home/marxin/Programming/gcc/gcc/machmode.h:550:49: warning: ‘*(unsigned int*)((char*)&int_mode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))’ may be used uninitialized [-Wmaybe-uninitialized] 550 | ? mode_size_inline (mode) : mode_size[mode]); | ^~~~ /home/marxin/Programming/gcc/gcc/expmed.cc: In function ‘rtx_def* emit_store_flag_1(rtx, rtx_code, rtx, rtx, machine_mode, int, int, machine_mode)’: /home/marxin/Programming/gcc/gcc/expmed.cc:5657:19: note: ‘*(unsigned int*)((char*)&int_mode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))’ was declared here 5657 | scalar_int_mode int_mode; | ^~~~ Can we please mitigate it? gcc/ChangeLog: * expmed.cc (emit_store_flag_1): Mitigate -Wmaybe-uninitialized warning. --- gcc/expmed.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/expmed.cc b/gcc/expmed.cc index 41738c1efe9..f23d63471ea 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -5654,7 +5654,7 @@ emit_store_flag_1 (rtx target, enum rtx_code code, rtx op0, rtx op1, /* If we are comparing a double-word integer with zero or -1, we can convert the comparison into one involving a single word. */ - scalar_int_mode int_mode; + scalar_int_mode int_mode = {}; if (is_int_mode (mode, &int_mode) && GET_MODE_BITSIZE (int_mode) == BITS_PER_WORD * 2 && (!MEM_P (op0) || ! MEM_VOLATILE_P (op0))) -- 2.36.1
Re: [PATCH] Use more ARRAY_SIZE.
On 5/11/22 10:17, Martin Liška wrote: > On 5/9/22 14:03, Richard Biener wrote: >> On Thu, May 5, 2022 at 4:30 PM Martin Liška wrote: >>> >>> On 5/5/22 14:58, Iain Buclaw wrote: This D front-end change doesn't look right to me, besides the slight >>> >>> Hello. >>> >>> Sorry, I've re-read the patch and fixed some places where the macro usage >>> was wrong. >>> >>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. >> >> The middle-end parts are OK. I'd say in files where ARRAY_SIZE is already >> used it's OK to introduce more uses. Otherwise I defer to the more specific >> maintainers if they like this or not. > > All right, CCing the following maintainers for other parts: > > - David for JIT and Analyzer > - Tobias for Fortran part > - Jason for C-family part There are not further comments from the remaining C-family part so I'm going to install the patch. Martin > > Thanks, > Martin > >> >> Richard. >> >>> Martin >
[Ada] Remove duplicated code for detecting enabled pragmas
Routines Is_Enabled and Is_Enabled_Pragma are identical (except for comments); remove this duplication. Cleanup related to handling of volatile refinement aspects in SPARK; behaviour is unaffected. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_util.adb (Is_Enabled): Remove; use Is_Enabled_Pragma instead.diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb --- a/gcc/ada/sem_util.adb +++ b/gcc/ada/sem_util.adb @@ -12632,44 +12632,6 @@ package body Sem_Util is function Type_Or_Variable_Has_Enabled_Property (Item_Id : Entity_Id) return Boolean is - function Is_Enabled (Prag : Node_Id) return Boolean; - -- Determine whether property pragma Prag (if present) denotes an - -- enabled property. - - - -- Is_Enabled -- - - - function Is_Enabled (Prag : Node_Id) return Boolean is -Arg1 : Node_Id; - - begin -if Present (Prag) then - Arg1 := First (Pragma_Argument_Associations (Prag)); - - -- The pragma has an optional Boolean expression, the related - -- property is enabled only when the expression evaluates to - -- True. - - if Present (Arg1) then - return Is_True (Expr_Value (Get_Pragma_Arg (Arg1))); - - -- Otherwise the lack of expression enables the property by - -- default. - - else - return True; - end if; - --- The property was never set in the first place - -else - return False; -end if; - end Is_Enabled; - - -- Local variables - AR : constant Node_Id := Get_Pragma (Item_Id, Pragma_Async_Readers); AW : constant Node_Id := @@ -12683,8 +12645,6 @@ package body Sem_Util is Is_Derived_Type (Item_Id) and then Is_Effectively_Volatile (Etype (Base_Type (Item_Id))); - -- Start of processing for Type_Or_Variable_Has_Enabled_Property - begin -- A non-effectively volatile object can never possess external -- properties. @@ -12699,16 +12659,16 @@ package body Sem_Util is -- missing altogether. elsif Property = Name_Async_Readersand then Present (AR) then -return Is_Enabled (AR); +return Is_Enabled_Pragma (AR); elsif Property = Name_Async_Writersand then Present (AW) then -return Is_Enabled (AW); +return Is_Enabled_Pragma (AW); elsif Property = Name_Effective_Reads and then Present (ER) then -return Is_Enabled (ER); +return Is_Enabled_Pragma (ER); elsif Property = Name_Effective_Writes and then Present (EW) then -return Is_Enabled (EW); +return Is_Enabled_Pragma (EW); -- If other properties are set explicitly, then this one is set -- implicitly to False, except in the case of a derived type
[Ada] Clarify code for detecting volatile refinement properties
Routine Type_Or_Variable_Has_Enabled_Property handles either types or objects; replace negation with an explicit positive condition. Cleanup related to handling of volatile refinement aspects in SPARK; behaviour is unaffected. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_util.adb (Type_Or_Variable_Has_Enabled_Property): Clarify.diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb --- a/gcc/ada/sem_util.adb +++ b/gcc/ada/sem_util.adb @@ -12696,10 +12696,10 @@ package body Sem_Util is return Type_Or_Variable_Has_Enabled_Property (First_Subtype (Etype (Base_Type (Item_Id; - -- If not specified explicitly for an object and the type + -- If not specified explicitly for an object and its type -- is effectively volatile, then take result from the type. - elsif not Is_Type (Item_Id) + elsif Is_Object (Item_Id) and then Is_Effectively_Volatile (Etype (Item_Id)) then return Has_Enabled_Property (Etype (Item_Id), Property);
[Ada] Pick volatile refinement property of a subtype from its base type
Volatile refinement properties (e.g. Async_Writers), which refine the Volatile aspect in SPARK, are inherited by subtypes from their base types. In particular, this patch fixes handling of those properties for subtypes of private types. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_util.adb (Type_Or_Variable_Has_Enabled_Property): Given a subtype recurse into its base type.diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb --- a/gcc/ada/sem_util.adb +++ b/gcc/ada/sem_util.adb @@ -12683,7 +12683,8 @@ package body Sem_Util is then return False; - -- For a private type, may need to look at the full view + -- For a private type (including subtype of a private types), look at + -- the full view. elsif Is_Private_Type (Item_Id) and then Present (Full_View (Item_Id)) then @@ -12696,6 +12697,13 @@ package body Sem_Util is return Type_Or_Variable_Has_Enabled_Property (First_Subtype (Etype (Base_Type (Item_Id; + -- For a subtype, the property will be inherited from its base type. + + elsif Is_Type (Item_Id) + and then not Is_Base_Type (Item_Id) + then +return Type_Or_Variable_Has_Enabled_Property (Etype (Item_Id)); + -- If not specified explicitly for an object and its type -- is effectively volatile, then take result from the type.
[Ada] Couple of small consistency tweaks
This aligns Analyze_Negation and Analyze_Unary_Op with the other similar procedures in Sem_Ch4. No functional changes. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_ch4.adb (Analyze_Negation): Minor tweak. (Analyze_Unary_Op): Likewise.diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb --- a/gcc/ada/sem_ch4.adb +++ b/gcc/ada/sem_ch4.adb @@ -3461,8 +3461,9 @@ package body Sem_Ch4 is -- procedure Analyze_Negation (N : Node_Id) is - R : constant Node_Id := Right_Opnd (N); - Op_Id : Entity_Id := Entity (N); + R : constant Node_Id := Right_Opnd (N); + + Op_Id : Entity_Id; begin Set_Etype (N, Any_Type); @@ -3470,7 +3471,15 @@ package body Sem_Ch4 is Analyze_Expression (R); - if Present (Op_Id) then + -- If the entity is already set, the node is the instantiation of a + -- generic node with a non-local reference, or was manufactured by a + -- call to Make_Op_xxx. In either case the entity is known to be valid, + -- and we do not need to collect interpretations, instead we just get + -- the single possible interpretation. + + if Present (Entity (N)) then + Op_Id := Entity (N); + if Ekind (Op_Id) = E_Operator then Find_Negation_Types (R, Op_Id, N); else @@ -6067,8 +6076,9 @@ package body Sem_Ch4 is -- procedure Analyze_Unary_Op (N : Node_Id) is - R : constant Node_Id := Right_Opnd (N); - Op_Id : Entity_Id := Entity (N); + R : constant Node_Id := Right_Opnd (N); + + Op_Id : Entity_Id; begin Set_Etype (N, Any_Type); @@ -6076,7 +6086,15 @@ package body Sem_Ch4 is Analyze_Expression (R); - if Present (Op_Id) then + -- If the entity is already set, the node is the instantiation of a + -- generic node with a non-local reference, or was manufactured by a + -- call to Make_Op_xxx. In either case the entity is known to be valid, + -- and we do not need to collect interpretations, instead we just get + -- the single possible interpretation. + + if Present (Entity (N)) then + Op_Id := Entity (N); + if Ekind (Op_Id) = E_Operator then Find_Unary_Types (R, Op_Id, N); else
[Ada] Map gnatlib-shared to gnatlib-shared-dual for aarch64-vx7r2
This is an incremental change towards supporting shared libraries for VxWorks on aarch64. The aarch64-vx7r2 compiler supports compilation with -fpic/PIC. This change adds aarch64 to the list of CPUs for which GNATLIB_SHARED maps to gnatlib-shared-dual for vxworks7r2, so "make gnatlib-shared" actually builds a shared lib. While other adjustments will be needed to get the runtime tests to pass, this one is a necessary step and doesn't impair the rest. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * Makefile.rtl: Add aarch64 to the list of CPUs for which GNATLIB_SHARED maps to gnatlib-shared-dual for vxworks7r2.diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl --- a/gcc/ada/Makefile.rtl +++ b/gcc/ada/Makefile.rtl @@ -2854,7 +2854,7 @@ endif # Turn on shared gnatlib for specific vx7r2 targets for RTP runtimes. Once # all targets are ported the target_cpu selector can be removed. -ifeq ($(strip $(filter-out vxworks7r2 powerpc64 x86_64 rtp rtp-smp, $(target_os) $(target_cpu) $(THREAD_KIND))),) +ifeq ($(strip $(filter-out vxworks7r2 powerpc64 x86_64 aarch64 rtp rtp-smp, $(target_os) $(target_cpu) $(THREAD_KIND))),) GNATLIB_SHARED = gnatlib-shared-dual LIBRARY_VERSION := $(LIB_VERSION) endif
[Ada] Fix spurious error on limited view with incomplete type
The problem is that Install_Limited_With_Clause does not fully implement AI05-0129, in the case where a regular with clause is processed before a limited_with clause of the same package: the visible "shadow" entity is that of the incomplete type, instead of that of the full type per the AI. This requires adjusting Remove_Limited_With_Unit to match the change in Install_Limited_With_Clause and also Build_Incomplete_Type_Declaration, which is responsible for synthesizing incomplete types out of full type declarations for self-referential types. A small tweak is also needed in Analyze_Subprogram_Body_Helper to align it with an equivalent processing for CW types in Find_Type_Name. And the patch also changes the Incomplete_View field in full type declarations to point to the entity of the view instead of its declaration. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * exp_ch3.adb (Build_Assignment): Adjust to the new definition of Incomplete_View field. * sem_ch10.ads (Decorate_Type): Declare. * sem_ch10.adb (Decorate_Type): Move to library level. (Install_Limited_With_Clause): In the already analyzed case, also deal with incomplete type declarations present in the sources and simplify the replacement code. (Build_Shadow_Entity): Deal with swapped views in package body. (Restore_Chain_For_Shadow): Deal with incomplete type declarations present in the sources. * sem_ch3.adb (Analyze_Full_Type_Declaration): Adjust to the new definition of Incomplete_View field. (Build_Incomplete_Type_Declaration): Small consistency tweak. Set the incomplete type as the Incomplete_View of the full type. If the scope is a package with a limited view, build a shadow entity for the incomplete type. * sem_ch6.adb (Analyze_Subprogram_Body_Helper): When replacing the limited view of a CW type as designated type of an anonymous access return type, get to the CW type of the incomplete view of the tagged type, if any. (Collect_Primitive_Operations): Adjust to the new definition of Incomplete_View field. * sinfo.ads (Incomplete_View): Denote the entity itself instead of its declaration. * sem_util.adb: Remove call to Defining_Entity.diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb --- a/gcc/ada/exp_ch3.adb +++ b/gcc/ada/exp_ch3.adb @@ -2100,8 +2100,7 @@ package body Exp_Ch3 is and then Present (Incomplete_View (Parent (Rec_Type))) then Append_Elmt ( -N => Defining_Identifier -(Incomplete_View (Parent (Rec_Type))), +N => Incomplete_View (Parent (Rec_Type)), To => Map); Append_Elmt ( N => Defining_Identifier diff --git a/gcc/ada/sem_ch10.adb b/gcc/ada/sem_ch10.adb --- a/gcc/ada/sem_ch10.adb +++ b/gcc/ada/sem_ch10.adb @@ -3107,6 +3107,72 @@ package body Sem_Ch10 is end if; end Check_Stub_Level; + --- + -- Decorate_Type -- + --- + + procedure Decorate_Type + (Ent : Entity_Id; + Scop: Entity_Id; + Is_Tagged : Boolean := False; + Materialize : Boolean := False) + is + CW_Typ : Entity_Id; + + begin + -- An unanalyzed type or a shadow entity of a type is treated as an + -- incomplete type, and carries the corresponding attributes. + + Mutate_Ekind (Ent, E_Incomplete_Type); + Set_Etype (Ent, Ent); + Set_Full_View (Ent, Empty); + Set_Is_First_Subtype (Ent); + Set_Scope (Ent, Scop); + Set_Stored_Constraint (Ent, No_Elist); + Reinit_Size_Align (Ent); + + if From_Limited_With (Ent) then + Set_Private_Dependents (Ent, New_Elmt_List); + end if; + + -- A tagged type and its corresponding shadow entity share one common + -- class-wide type. The list of primitive operations for the shadow + -- entity is empty. + + if Is_Tagged then + Set_Is_Tagged_Type (Ent); + Set_Direct_Primitive_Operations (Ent, New_Elmt_List); + + CW_Typ := + New_External_Entity + (E_Void, Scope (Ent), Sloc (Ent), Ent, 'C', 0, 'T'); + + Set_Class_Wide_Type (Ent, CW_Typ); + + -- Set parent to be the same as the parent of the tagged type. + -- We need a parent field set, and it is supposed to point to + -- the declaration of the type. The tagged type declaration + -- essentially declares two separate types, the tagged type + -- itself and the corresponding class-wide type, so it is + -- reasonable for the parent fields to point to the declaration + -- in both cases. + + Set_Parent (CW_Typ, Parent (Ent)); + + Mutate_Ekind
[Ada] Improve building of untagged equality
When checking components of a record type for their own user-defined equality function it is enough to find just one such a component. Cleanup related to handling of user-defined equality in GNATprove. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * exp_ch3.adb (Build_Untagged_Equality): Exit early when the outcome of a loop is already known.diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb --- a/gcc/ada/exp_ch3.adb +++ b/gcc/ada/exp_ch3.adb @@ -4541,6 +4541,7 @@ package body Exp_Ch3 is and then Present (User_Defined_Eq (Etype (Comp))) then Build_Eq := True; +exit; end if; Next_Component (Comp);
[Ada] Remove duplicated detection of user-defined equality
Cleanup related to handling of user-defined equality in GNATprove. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * exp_ch3.adb (User_Defined_Eq): Replace duplicated code with a call to Get_User_Defined_Eq.diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb --- a/gcc/ada/exp_ch3.adb +++ b/gcc/ada/exp_ch3.adb @@ -4500,32 +4500,14 @@ package body Exp_Ch3 is - function User_Defined_Eq (T : Entity_Id) return Entity_Id is - Prim : Elmt_Id; - Op : Entity_Id; + Op : constant Entity_Id := TSS (T, TSS_Composite_Equality); begin - Op := TSS (T, TSS_Composite_Equality); - if Present (Op) then return Op; + else +return Get_User_Defined_Eq (T); end if; - - Prim := First_Elmt (Collect_Primitive_Operations (T)); - while Present (Prim) loop -Op := Node (Prim); - -if Chars (Op) = Name_Op_Eq - and then Etype (Op) = Standard_Boolean - and then Etype (First_Formal (Op)) = T - and then Etype (Next_Formal (First_Formal (Op))) = T -then - return Op; -end if; - -Next_Elmt (Prim); - end loop; - - return Empty; end User_Defined_Eq; -- Start of processing for Build_Untagged_Equality
[Ada] Implement component finalization ordering rules for type extensions
Finalization of a record object is required to finalize any components that have an access discriminant constrained by a per-object expression before other components. This includes the case of a type extension; "early finalization" components of the parent type are required to be finalized before non-early-finalization extension components. This is implemented in the extension type's finalization procedure by placing the call to the parent type's finalization procedure between the finalization of the "early finalization" extension components and the finalization of the other extension components. Previously that call was executed after finalizing all of the extension conponents. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * exp_ch7.adb (Build_Finalize_Statements): Add Last_POC_Call variable to keep track of the last "early finalization" call generated for type extension's finalization procedure. If non-empty, then this will indicate the point at which to insert the call to the parent type's finalization procedure. Modify nested function Process_Component_List_For_Finalize to set this variable (and avoid setting it during a recursive call). If Last_POC_Call is empty, then insert the parent finalization call before, rather than after, the finalization code for the extension components.diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb --- a/gcc/ada/exp_ch7.adb +++ b/gcc/ada/exp_ch7.adb @@ -8273,19 +8273,23 @@ package body Exp_Ch7 is Counter: Nat := 0; Finalizer_Data : Finalization_Exception_Data; + Last_POC_Call : Node_Id := Empty; function Process_Component_List_For_Finalize - (Comps : Node_Id) return List_Id; + (Comps : Node_Id; +In_Variant_Part : Boolean := False) return List_Id; -- Build all necessary finalization statements for a single component -- list. The statements may include a jump circuitry if flag Is_Local - -- is enabled. + -- is enabled. In_Variant_Part indicates whether this is a recursive + -- call. - -- Process_Component_List_For_Finalize -- - function Process_Component_List_For_Finalize - (Comps : Node_Id) return List_Id + (Comps : Node_Id; +In_Variant_Part : Boolean := False) return List_Id is procedure Process_Component_For_Finalize (Decl : Node_Id; @@ -8467,7 +8471,8 @@ package body Exp_Ch7 is New_Copy_List (Discrete_Choices (Var)), Statements => Process_Component_List_For_Finalize ( - Component_List (Var; + Component_List (Var), + In_Variant_Part => True))); Next_Non_Pragma (Var); end loop; @@ -8534,6 +8539,12 @@ package body Exp_Ch7 is end loop; end if; +if not In_Variant_Part then + Last_POC_Call := Last (Stmts); + -- In the case of a type extension, the deep-finalize call + -- for the _Parent component will be inserted here. +end if; + -- Process the rest of the components in reverse order Decl := Last_Non_Pragma (Component_Items (Comps)); @@ -8749,7 +8760,38 @@ package body Exp_Ch7 is (Finalizer_Data; end if; - Append_To (Bod_Stmts, Fin_Stmt); + -- The intended component finalization order is + --1) POC components of extension + --2) _Parent component + --3) non-POC components of extension. + -- + -- With this "finalize the parent part in the middle" + -- ordering, we can avoid the need for making two + -- calls to the parent's subprogram in the way that + -- is necessary for Init_Procs. This does have the + -- peculiar (but legal) consequence that the parent's + -- non-POC components are finalized before the + -- non-POC extension components. This violates the + -- usual "finalize in reverse declaration order" + -- principle, but that's ok (see Ada RM 7.6.1(9)). + -- + -- Last_POC_Call should be non-empty if the extension + -- has at least one POC. Interactions with variant + -- parts are incorrectly ignored. + + if Pr
[Ada] Fix implementation issues with equality for untagged record types
This moves the implementation of AI12-0101 + AI05-0123 from the expander to the semantic analyzer and completes the implementation of AI12-0413, which are both binding interpretations in Ada 2012, fixing a few bugs in the process and removing a fair amount of duplicated code throughout. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * einfo-utils.adb (Remove_Entity): Fix couple of oversights. * exp_ch3.adb (Is_User_Defined_Equality): Delete. (User_Defined_Eq): Call Get_User_Defined_Equality. (Make_Eq_Body): Likewise. (Predefined_Primitive_Eq_Body): Call Is_User_Defined_Equality. * exp_ch4.adb (Build_Eq_Call): Call Get_User_Defined_Equality. (Is_Equality): Delete. (User_Defined_Primitive_Equality_Op): Likewise. (Find_Aliased_Equality): Call Is_User_Defined_Equality. (Expand_N_Op_Eq): Call Underlying_Type unconditionally. Do not implement AI12-0101 + AI05-0123 here. (Expand_Set_Membership): Call Resolve_Membership_Equality. * exp_ch6.adb (Expand_Call_Helper): Remove obsolete code. * sem_aux.ads (Is_Record_Or_Limited_Type): Delete. * sem_aux.adb (Is_Record_Or_Limited_Type): Likewise. * sem_ch4.ads (Nondispatching_Call_To_Abstract_Operation): Declare. * sem_ch4.adb (Analyze_Call): Call Call_Abstract_Operation. (Analyze_Membership_Op): Call Resolve_Membership_Equality. (Nondispatching_Call_To_Abstract_Operation): New procedure. (Remove_Abstract_Operations): Call it. * sem_ch6.adb (Check_Untagged_Equality): Remove obsolete error and call Is_User_Defined_Equality. * sem_ch7.adb (Inspect_Untagged_Record_Completion): New procedure implementing AI12-0101 + AI05-0123. (Analyze_Package_Specification): Call it. (Declare_Inherited_Private_Subprograms): Minor tweak. (Uninstall_Declarations): Likewise. * sem_disp.adb (Check_Direct_Call): Adjust to new implementation of Is_User_Defined_Equality. * sem_res.ads (Resolve_Membership_Equality): Declare. * sem_res.adb (Resolve): Replace direct error handling with call to Nondispatching_Call_To_Abstract_Operation (Resolve_Call): Likewise. (Resolve_Equality_Op): Likewise. mplement AI12-0413. (Resolve_Membership_Equality): New procedure. (Resolve_Membership_Op): Call Get_User_Defined_Equality. * sem_util.ads (Get_User_Defined_Eq): Rename into... (Get_User_Defined_Equality): ...this. * sem_util.adb (Get_User_Defined_Eq): Rename into... (Get_User_Defined_Equality): ...this. Call Is_User_Defined_Equality. (Is_User_Defined_Equality): Also check the profile but remove tests on Comes_From_Source and Parent. * sinfo.ads (Generic_Parent_Type): Adjust field description. * uintp.ads (Ubool): Invoke user-defined equality in predicate. patch.diff.gz Description: application/gzip
[Ada] Fix internal error on predicate aspect with iterator
The semantic analysis of predicates involves a fair amount of tree copying because of both semantic and implementation considerations, and there is a difficulty with quantified expressions since they declare a new entity that cannot be shared between the various copies of the tree. This change implements a specific processing for it in New_Copy_Tree that subsumes a couple of fixes made earlier for variants of the issue. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_util.ads (Is_Entity_Of_Quantified_Expression): Declare. * sem_util.adb (Is_Entity_Of_Quantified_Expression): New predicate. (New_Copy_Tree): Deal with all entities of quantified expressions. * sem_ch13.adb (Build_Predicate_Functions): Get rid of superfluous tree copying and remove obsolete code. * sem_ch6.adb (Fully_Conformant_Expressions): Deal with all entities of quantified expressions.diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb --- a/gcc/ada/sem_ch13.adb +++ b/gcc/ada/sem_ch13.adb @@ -10231,16 +10231,13 @@ package body Sem_Ch13 is Set_SCO_Pragma_Enabled (Sloc (Prag)); --- Extract the arguments of the pragma. The expression itself --- is copied for use in the predicate function, to preserve the --- original version for ASIS use. --- Is this still needed??? +-- Extract the arguments of the pragma Arg1 := First (Pragma_Argument_Associations (Prag)); Arg2 := Next (Arg1); Arg1 := Get_Pragma_Arg (Arg1); -Arg2 := New_Copy_Tree (Get_Pragma_Arg (Arg2)); +Arg2 := Get_Pragma_Arg (Arg2); -- When the predicate pragma applies to the current type or its -- full view, replace all occurrences of the subtype name with @@ -10455,45 +10452,12 @@ package body Sem_Ch13 is if Raise_Expression_Present then declare - function Reset_Loop_Variable - (N : Node_Id) return Traverse_Result; - - procedure Reset_Loop_Variables is - new Traverse_Proc (Reset_Loop_Variable); - - - -- Reset_Loop_Variable -- - - - function Reset_Loop_Variable - (N : Node_Id) return Traverse_Result - is - begin - if Nkind (N) = N_Iterator_Specification then - Set_Defining_Identifier (N, - Make_Defining_Identifier - (Sloc (N), Chars (Defining_Identifier (N; - end if; - - return OK; - end Reset_Loop_Variable; - - -- Local variables - Map : constant Elist_Id := New_Elmt_List; begin Append_Elmt (Object_Entity, Map); Append_Elmt (Object_Entity_M, Map); Expr_M := New_Copy_Tree (Expr, Map => Map); - - -- The unanalyzed expression will be copied and appear in - -- both functions. Normally expressions do not declare new - -- entities, but quantified expressions do, so we need to - -- create new entities for their bound variables, to prevent - -- multiple definitions in gigi. - - Reset_Loop_Variables (Expr_M); end; end if; diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb --- a/gcc/ada/sem_ch6.adb +++ b/gcc/ada/sem_ch6.adb @@ -10106,14 +10106,13 @@ package body Sem_Ch6 is and then Discriminal_Link (Entity (E1)) = Discriminal_Link (Entity (E2))) - -- AI12-050: The loop variables of quantified expressions match - -- if they have the same identifier, even though they may have - -- different entities. + -- AI12-050: The entities of quantified expressions match if they + -- have the same identifier, even if they may be distinct nodes. or else (Chars (Entity (E1)) = Chars (Entity (E2)) - and then Ekind (Entity (E1)) = E_Loop_Parameter - and then Ekind (Entity (E2)) = E_Loop_Parameter) + and then Is_Entity_Of_Quantified_Expression (Entity (E1)) + and then Is_Entity_Of_Quantified_Expression (Entity (E2))) -- A call to an instantiation of Unchecked_Conversion is -- rewritten with the name of the generated function created for diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb --- a/gcc/ada/sem_util.adb +++ b/gcc/ada/sem_util.adb @@ -17624,6 +17624,21 @@ package body Sem_Util is end if; end Is_Effectively_Volatile_Object_Shared; + + -- Is_Entity_Of_Quan
[Ada] Fix internal error on mix of controlled and protected types
The key is that the protected type is a (limited) private type, which fools a test in Cleanup_Scopes. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * inline.adb (Cleanup_Scopes): Test the underlying type.diff --git a/gcc/ada/inline.adb b/gcc/ada/inline.adb --- a/gcc/ada/inline.adb +++ b/gcc/ada/inline.adb @@ -2773,7 +2773,7 @@ package body Inline is Scop := Protected_Body_Subprogram (Scop); elsif Is_Subprogram (Scop) - and then Is_Protected_Type (Scope (Scop)) + and then Is_Protected_Type (Underlying_Type (Scope (Scop))) and then Present (Protected_Body_Subprogram (Scop)) then -- If a protected operation contains an instance, its cleanup
[Ada] Accept calls to abstract subprograms in class-wide pre/post-conditions
Fix a regression in the support for Ada 2022's treatment of calls to abstract subprograms in pre/post-conditions (thanks to Javier Miranda for producing this patch). Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_disp.adb (Check_Dispatching_Context): When checking to see whether an expression occurs in a class-wide pre/post-condition, also check for the possibility that it occurs in a class-wide preconditions subprogram that was introduced as part of expansion. Without this fix, some legal calls occuring in class-wide preconditions may be incorrectly flagged as violating the "a call to an abstract subprogram must be dispatching" rule.diff --git a/gcc/ada/sem_disp.adb b/gcc/ada/sem_disp.adb --- a/gcc/ada/sem_disp.adb +++ b/gcc/ada/sem_disp.adb @@ -751,14 +751,22 @@ package body Sem_Disp is elsif Is_Subprogram (Scop) and then not Is_Tag_Indeterminate (N) - and then In_Pre_Post_Condition (Call, Class_Wide_Only => True) + and then + -- The context is an internally built helper or an indirect + -- call wrapper that handles class-wide preconditions + (Present (Class_Preconditions_Subprogram (Scop)) - -- The tagged type associated with the called subprogram must be - -- the same as that of the subprogram with a class-wide aspect. + -- ... or the context is a class-wide pre/postcondition. + or else + (In_Pre_Post_Condition (Call, Class_Wide_Only => True) - and then Is_Dispatching_Operation (Scop) - and then -Find_Dispatching_Type (Subp) = Find_Dispatching_Type (Scop) + -- The tagged type associated with the called + -- subprogram must be the same as that of the + -- subprogram with a class-wide aspect. + + and then Is_Dispatching_Operation (Scop) + and then Find_Dispatching_Type (Subp) + = Find_Dispatching_Type (Scop))) then null;
[Ada] Fix internal error on iterated array aggregate
The front-end drops the declaration of a temporary on the floor because Insert_Actions fails to climb up out of an N_Iterated_Component_Association when the temporary is created during the analysis of its Discrete_Choices. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * exp_util.adb (Insert_Actions) : Climb up out of the node if the actions come from Discrete_Choices.diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb --- a/gcc/ada/exp_util.adb +++ b/gcc/ada/exp_util.adb @@ -7751,12 +7751,17 @@ package body Exp_Util is -- We must not climb up out of an N_Iterated_xxx_Association -- because the actions might contain references to the loop - -- parameter. But it turns out that setting the Loop_Actions - -- attribute in the case of an N_Component_Association - -- when the attribute was not already set can lead to - -- (as yet not understood) bugboxes (gcc failures that are - -- presumably due to malformed trees). So we don't do that. - + -- parameter, except if we come from the Discrete_Choices of + -- N_Iterated_Component_Association which cannot contain any. + -- But it turns out that setting the Loop_Actions field in + -- the case of an N_Component_Association when the field was + -- not already set can lead to gigi assertion failures that + -- are presumably due to malformed trees, so don't do that. + + and then (Nkind (P) /= N_Iterated_Component_Association +or else not Is_List_Member (N) +or else + List_Containing (N) /= Discrete_Choices (P)) and then (Nkind (P) /= N_Component_Association or else Present (Loop_Actions (P))) then
[Ada] Fix iterated element association loop var escaping loop scope
Fix the escaping of the loop variable from the loop scope in both forms of iterated element associations (i.e. "for J in ..." and "for J of ..."). Create a dedicated scope around the analyses of both loops. Also create a copy of the Loop_Parameter_Specification instead of analyzing (and modifying) the original Tree as it will be reanalyzed later. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_aggr.adb (Resolve_Iterated_Association): Create scope around N_Iterated_Element_Association handling. Analyze a copy of the Loop_Parameter_Specification. Call Analyze instead Analyze_* to be more homogeneous. (Sem_Ch5): Remove now unused package.diff --git a/gcc/ada/sem_aggr.adb b/gcc/ada/sem_aggr.adb --- a/gcc/ada/sem_aggr.adb +++ b/gcc/ada/sem_aggr.adb @@ -51,7 +51,6 @@ with Sem_Aux;use Sem_Aux; with Sem_Case; use Sem_Case; with Sem_Cat;use Sem_Cat; with Sem_Ch3;use Sem_Ch3; -with Sem_Ch5;use Sem_Ch5; with Sem_Ch8;use Sem_Ch8; with Sem_Ch13; use Sem_Ch13; with Sem_Dim;use Sem_Dim; @@ -2890,12 +2889,12 @@ package body Sem_Aggr is is Loc : constant Source_Ptr := Sloc (N); Choice : Node_Id; + Copy : Node_Id; Ent : Entity_Id; Expr : Node_Id; Key_Expr : Node_Id; Id : Entity_Id; Id_Name : Name_Id; - Iter : Node_Id; Typ : Entity_Id := Empty; begin @@ -2906,15 +2905,29 @@ package body Sem_Aggr is -- is present. In both cases a Key_Expression is present. if Nkind (Comp) = N_Iterated_Element_Association then + +-- Create a temporary scope to avoid some modifications from +-- escaping the Analyze call below. The original Tree will be +-- reanalyzed later. + +Ent := New_Internal_Entity + (E_Loop, Current_Scope, Sloc (Comp), 'L'); +Set_Etype (Ent, Standard_Void_Type); +Set_Parent (Ent, Parent (Comp)); +Push_Scope (Ent); + if Present (Loop_Parameter_Specification (Comp)) then - Analyze_Loop_Parameter_Specification - (Loop_Parameter_Specification (Comp)); + Copy := Copy_Separate_Tree (Comp); + + Analyze + (Loop_Parameter_Specification (Copy)); + Id_Name := Chars (Defining_Identifier (Loop_Parameter_Specification (Comp))); else - Iter := Copy_Separate_Tree (Iterator_Specification (Comp)); - Analyze (Iter); - Typ := Etype (Defining_Identifier (Iter)); + Copy := Copy_Separate_Tree (Iterator_Specification (Comp)); + Analyze (Copy); + Id_Name := Chars (Defining_Identifier (Iterator_Specification (Comp))); end if; @@ -2926,12 +2939,14 @@ package body Sem_Aggr is Key_Expr := Key_Expression (Comp); Analyze_And_Resolve (New_Copy_Tree (Key_Expr), Key_Type); +End_Scope; elsif Present (Iterator_Specification (Comp)) then -Iter:= Copy_Separate_Tree (Iterator_Specification (Comp)); +Copy:= Copy_Separate_Tree (Iterator_Specification (Comp)); Id_Name := Chars (Defining_Identifier (Comp)); -Analyze (Iter); -Typ := Etype (Defining_Identifier (Iter)); + +Analyze (Copy); +Typ := Etype (Defining_Identifier (Copy)); else Choice := First (Discrete_Choices (Comp)); @@ -2965,7 +2980,8 @@ package body Sem_Aggr is -- analysis. Id := Make_Defining_Identifier (Sloc (Comp), Id_Name); - Ent := New_Internal_Entity (E_Loop, Current_Scope, Sloc (Comp), 'L'); + Ent := New_Internal_Entity (E_Loop, + Current_Scope, Sloc (Comp), 'L'); Set_Etype (Ent, Standard_Void_Type); Set_Parent (Ent, Parent (Comp)); Push_Scope (Ent); @@ -3000,6 +3016,8 @@ package body Sem_Aggr is end Resolve_Iterated_Association; + -- Start of processing for Resolve_Container_Aggregate + begin pragma Assert (Nkind (Asp) = N_Aggregate);
[Ada] replace call to bzero in terminals.c by call to memset
bzero is marked as legacy in POSIX.1-2001, and using it triggers a deprecation warnings on some systems such as QNX. This change adjusts the one place where we use it in terminals.c to use memset instead. This, in turns, allows us to get rid of a hack for HP/UX and Solaris. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * terminals.c: Remove bzero #define on HP/UX or Solaris platforms. (child_setup_tty): Replace bzero call by equivalent call to memset.diff --git a/gcc/ada/terminals.c b/gcc/ada/terminals.c --- a/gcc/ada/terminals.c +++ b/gcc/ada/terminals.c @@ -1123,12 +1123,6 @@ __gnat_setup_winsize (void *desc ATTRIBUTE_UNUSED, #define CDISABLE _POSIX_VDISABLE -/* On HP-UX and Sun system, there is a bzero function but with a different - signature. Use memset instead */ -#if defined (__hpux__) || defined (__sun__) || defined (_AIX) -# define bzero(s,n) memset (s,0,n) -#endif - /* POSIX does not specify how to open the master side of a terminal.Several methods are available (system specific): 1- using a cloning device (USE_CLONE_DEVICE) @@ -1289,8 +1283,15 @@ child_setup_tty (int fd) struct termios s; intstatus; - /* ensure that s is filled with 0 */ - bzero (&s, sizeof (s)); + /* Ensure that s is filled with 0. + + Note that we avoid using bzero for a few reasons: + - On HP-UX and Sun system, there is a bzero function but with + a different signature, thus making the use of bzero more + complicated on these platforms (we used to define a bzero + macro that rewrote calls to bzero into calls to memset); + - bzero is deprecated (marked as LEGACY in POSIX.1-2001). */ + memset (&s, 0, sizeof (s)); /* Get the current terminal settings */ status = tcgetattr (fd, &s);
[Ada] Revise Storage_Model_Support operations to do checks and take objects and types
The functions in subpackage Storage_Model_Support (apart from the Has_*_Aspect functions) are revised to have assertions that will fail when passed a parameter that doesn't specify the appropriate aspect (either aspect Storage_Model_Type or Designated_Storage_Model), instead of returning Empty for bad arguments. Also, various of the functions now allow either a type with aspect Storage_Model_Type or an object of such a type. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_util.ads (Storage_Model_Support): Revise comments on most operations within this nested package to reflect that they can now be passed either a type that has aspect Storage_Model_Type or an object of such a type. Change the names of the relevant formals to SM_Obj_Or_Type. Also, add more precise semantic descriptions in some cases, and declare the subprograms in a more logical order. * sem_util.adb (Storage_Model_Support.Storage_Model_Object): Add an assertion that the type must specify aspect Designated_Storage_Model, rather than returning Empty when it doesn't specify that aspect. (Storage_Model_Support.Storage_Model_Type): Add an assertion that formal must be an object whose type specifies aspect Storage_Model_Type, rather than returning Empty for when it doesn't have such a type (and test Has_Storage_Model_Type_Aspect rather than Find_Value_Of_Aspect). (Storage_Model_Support.Get_Storage_Model_Type_Entity): Allow both objects and types, and add an assertion that the type (or the type of the object) has a value for aspect Storage_Model_Type.diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb --- a/gcc/ada/sem_util.adb +++ b/gcc/ada/sem_util.adb @@ -32302,47 +32302,6 @@ package body Sem_Util is package body Storage_Model_Support is - --- - -- Get_Storage_Model_Type_Entity -- - --- - - function Get_Storage_Model_Type_Entity -(Typ : Entity_Id; - Nam : Name_Id) return Entity_Id - is - pragma Assert - (Is_Type (Typ) -and then - Nam in Name_Address_Type - | Name_Null_Address - | Name_Allocate - | Name_Deallocate - | Name_Copy_From - | Name_Copy_To - | Name_Storage_Size); - - SMT_Aspect_Value : constant Node_Id := - Find_Value_Of_Aspect (Typ, Aspect_Storage_Model_Type); - Assoc: Node_Id; - - begin - if No (SMT_Aspect_Value) then -return Empty; - - else -Assoc := First (Component_Associations (SMT_Aspect_Value)); -while Present (Assoc) loop - if Chars (First (Choices (Assoc))) = Nam then - return Entity (Expression (Assoc)); - end if; - - Next (Assoc); -end loop; - -return Empty; - end if; - end Get_Storage_Model_Type_Entity; - - -- Has_Designated_Storage_Model_Aspect -- - @@ -32370,13 +32329,11 @@ package body Sem_Util is function Storage_Model_Object (Typ : Entity_Id) return Entity_Id is begin - if Has_Designated_Storage_Model_Aspect (Typ) then -return - Entity -(Find_Value_Of_Aspect (Typ, Aspect_Designated_Storage_Model)); - else -return Empty; - end if; + pragma Assert (Has_Designated_Storage_Model_Aspect (Typ)); + + return + Entity + (Find_Value_Of_Aspect (Typ, Aspect_Designated_Storage_Model)); end Storage_Model_Object; @@ -32385,76 +32342,132 @@ package body Sem_Util is function Storage_Model_Type (Obj : Entity_Id) return Entity_Id is begin - if Present - (Find_Value_Of_Aspect (Etype (Obj), Aspect_Storage_Model_Type)) - then -return Etype (Obj); - else -return Empty; - end if; + pragma Assert (Has_Storage_Model_Type_Aspect (Etype (Obj))); + + return Etype (Obj); end Storage_Model_Type; + --- + -- Get_Storage_Model_Type_Entity -- + --- + + function Get_Storage_Model_Type_Entity +(SM_Obj_Or_Type : Entity_Id; + Nam: Name_Id) return Entity_Id + is + Typ : constant Entity_Id := (if Is_Object (SM_Obj_Or_Type) then + Storage_Model_Type (SM_Obj_Or_Type) + else + SM_Obj_Or_Type); + pragma Assert +
[Ada] Add #include in cstreams.c
When building the GNAT runtime for QNX, we get the following warning: | cstreams.c: In function '__gnat_full_name': | cstreams.c:209:5: warning: implicit declaration of function 'realpath' | [-Wimplicit-function-declaration] | 209 | realpath (nam, buffer); | | ^~~~ This commit fixes the warning by adding the corresponding #include of Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * cstreams.c: Add #include.diff --git a/gcc/ada/cstreams.c b/gcc/ada/cstreams.c --- a/gcc/ada/cstreams.c +++ b/gcc/ada/cstreams.c @@ -46,6 +46,7 @@ #include #include #include +#include #ifdef _AIX /* needed to avoid conflicting declarations */
[Ada] Fix thinko in QNX's implementation of __gnat_install_handler
On QNX, the sigaction handler is incorrectly installed via the sa_handler field of struct sigaction, rather than the sa_sigaction field. This triggers a compilation warning due to a mismatch between the function's signature and the field's type. | init.c:2614:18: warning: assignment to 'void (*)(int)' | from incompatible pointer type 'void (*)(int, siginfo_t *, void *)' | {aka 'void (*)(int, struct _siginfo *, void *)'} | [-Wincompatible-pointer-types] In practice, using the sa_handler field actually works, but only because those two fields are inside a union: From target/qnx7/usr/include/signal.h: | union { \ | __handler_type _sa_handler; \ | __action_type _sa_sigaction; \ | } __sa_un; \ This commit fixes this. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * init.c (__gnat_install_handler) [__QNX__]: Set act.sa_sigaction rather than act.sa_handler.diff --git a/gcc/ada/init.c b/gcc/ada/init.c --- a/gcc/ada/init.c +++ b/gcc/ada/init.c @@ -2611,7 +2611,7 @@ __gnat_install_handler (void) struct sigaction act; int err; - act.sa_handler = __gnat_error_handler; + act.sa_sigaction = __gnat_error_handler; act.sa_flags = SA_NODEFER | SA_SIGINFO; sigemptyset (&act.sa_mask);
[Ada] sigaction result not properly checked in __gnat_install_handler (QNX)
The QNX version of __gnat_install_handler calls sigaction for a number of signals, and then prints an error message when the the call failed. But unfortunately, except for the first call, we forgot to store sigaction's return value, so the check that ensues uses a potentially uninitialized variable, which the compiler does detect (this is how we found this issue). This change fixes this by make sure we store the result of each sigaction call before checking it. While at it, we noticed a thinko in the error messages all alerting about the SIGFPE signal, rather than the signal it just tested. Most likely a copy/paste thinko. Fixed by this change as well. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * init.c (__gnat_install_handler) [__QNX__]: Save sigaction's return value in err before checking err's value. Fix incorrect signal names in perror messages.diff --git a/gcc/ada/init.c b/gcc/ada/init.c --- a/gcc/ada/init.c +++ b/gcc/ada/init.c @@ -2625,26 +2625,26 @@ __gnat_install_handler (void) } } if (__gnat_get_interrupt_state (SIGILL) != 's') { -sigaction (SIGILL, &act, NULL); +err = sigaction (SIGILL, &act, NULL); if (err == -1) { err = errno; - perror ("error while attaching SIGFPE"); + perror ("error while attaching SIGILL"); perror (strerror (err)); } } if (__gnat_get_interrupt_state (SIGSEGV) != 's') { -sigaction (SIGSEGV, &act, NULL); +err = sigaction (SIGSEGV, &act, NULL); if (err == -1) { err = errno; - perror ("error while attaching SIGFPE"); + perror ("error while attaching SIGSEGV"); perror (strerror (err)); } } if (__gnat_get_interrupt_state (SIGBUS) != 's') { -sigaction (SIGBUS, &act, NULL); +err = sigaction (SIGBUS, &act, NULL); if (err == -1) { err = errno; - perror ("error while attaching SIGFPE"); + perror ("error while attaching SIGBUS"); perror (strerror (err)); } }
[Ada] GNAT.Debug_Pools: Improve documentation of the Stack_Trace_Depth parameter
Setting this parameter to zero when calling the Configure procedure has the effect of disabling completely the tracking of the biggest memory users, which wasn't clear from the current documentation. So this patch enhances the documentation of both the Configure procedure as well as the Dump procedure to make that explicit. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * libgnat/g-debpoo.ads: Improve documentation of the Stack_Trace_Depth parameter.diff --git a/gcc/ada/libgnat/g-debpoo.ads b/gcc/ada/libgnat/g-debpoo.ads --- a/gcc/ada/libgnat/g-debpoo.ads +++ b/gcc/ada/libgnat/g-debpoo.ads @@ -123,7 +123,8 @@ package GNAT.Debug_Pools is --traces that are output to indicate locations of actions for error --conditions such as bad allocations. If set to zero, the debug pool --will not try to compute backtraces. This is more efficient but gives - --less information on problem locations + --less information on problem locations (and in particular, this + --disables the tracking of the biggest users of memory). -- --Maximum_Logically_Freed_Memory: maximum amount of memory (bytes) --that should be kept before starting to physically deallocate some. @@ -275,8 +276,12 @@ package GNAT.Debug_Pools is Size : Positive; Report : Report_Type := All_Reports); -- Dump information about memory usage. - -- Size is the number of the biggest memory users we want to show. Report - -- indicates which sorting order is used in the report. + -- Size is the number of the biggest memory users we want to show + -- (requires that the Debug_Pool has been configured with Stack_Trace_Depth + -- greater than zero). Also, for efficiency reasons, tracebacks with + -- a memory allocation below 1_000 bytes are not shown in the "biggest + -- memory users" part of the report. + -- Report indicates which sorting order is used in the report. procedure Dump_Stdout (Pool : Debug_Pool;
[Ada] Don't crash on ghost packages when emitting CUDA symbols in ALI files
Before this commit, a GNAT compiled with assertions would crash when attempting to emit CUDA symbols in ALI files for spark_mode/ghost packages, whose content is a single null statement. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * lib-writ.adb (Output_CUDA_Symbols): Check for null packages.diff --git a/gcc/ada/lib-writ.adb b/gcc/ada/lib-writ.adb --- a/gcc/ada/lib-writ.adb +++ b/gcc/ada/lib-writ.adb @@ -403,7 +403,9 @@ package body Lib.Writ is Kernel_Elm : Elmt_Id; Kernel : Entity_Id; begin - if not Enable_CUDA_Expansion then + if not Enable_CUDA_Expansion + or else Nkind (Unit_Id) = N_Null_Statement + then return; end if; Spec_Id := (if Nkind (Unit_Id) = N_Package_Body
[Ada] Fix proof of double arithmetic units
Proof of an assertion is not automatic anymore. Add two assertions before it to guide the prover. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * libgnat/s-aridou.adb (Double_Divide): Add intermediate assertions.diff --git a/gcc/ada/libgnat/s-aridou.adb b/gcc/ada/libgnat/s-aridou.adb --- a/gcc/ada/libgnat/s-aridou.adb +++ b/gcc/ada/libgnat/s-aridou.adb @@ -941,11 +941,13 @@ is else T2 := Yhi * Zlo; pragma Assert (Big (T2) = Big (Double_Uns'(Yhi * Zlo))); +pragma Assert (Big_0 = Big (Double_Uns'(Ylo * Zhi))); end if; else T2 := Ylo * Zhi; pragma Assert (Big (T2) = Big (Double_Uns'(Ylo * Zhi))); + pragma Assert (Big_0 = Big (Double_Uns'(Yhi * Zlo))); end if; T1 := Ylo * Zlo;
[Ada] Freeze target type on qualified expression expansion
An object declaration (other than a deferred constant declaration) causes freezing where it occurs (13.14(6)), which means every name occurring within it causes freezing (13.14(4/1)), and when the name in a subtype_mark causes freezing, the denoted subtype is frozen (13.14(11)). Hence, one needs to freeze the target type when expanding a qualified expression. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * exp_ch4.adb (Expand_N_Qualified_Expression): Freeze Target_Type.diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb --- a/gcc/ada/exp_ch4.adb +++ b/gcc/ada/exp_ch4.adb @@ -10776,6 +10776,8 @@ package body Exp_Ch4 is Ensure_Valid (Operand); end if; + Freeze_Before (Operand, Target_Type); + -- Apply possible constraint check Apply_Constraint_Check (Operand, Target_Type, No_Sliding => True);
[Ada] Type invariant or postcondition may cause uninitialized memory reads
This patch corrects an error in the compiler whereby a function requiring the generation of a postconditions procedure may cause an uninitialized memory read when the return type Has_Unconstrained_Elements or is an unconstrained array. The error occurs because evaluation of postconditions happens within the "at end" handler when the temporary result object may go out of scope. The patch modifies expansion in the above cases to evaluate postconditions at the point of return instead - in order to guarantee the result object is valid. Note that these changes have the side effect of introducing a semantic bug such that functions returning types with unconstrained elements will have their postconditions/return type invariants evaluated before finalization. Work is currently being done to introduce wrappers which will solve this problem and remove technical debt in this area. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * exp_ch7.adb (Build_Finalizer): Disable late evaluation of postconditions for functions returning types which where Has_Unconstrained_Elements is true or are unconstrained arrays.diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb --- a/gcc/ada/exp_ch7.adb +++ b/gcc/ada/exp_ch7.adb @@ -4247,14 +4247,33 @@ package body Exp_Ch7 is -- --Postcond_Enable := False; - Append_To (Top_Decls, - Make_Assignment_Statement (Loc, - Name => - New_Occurrence_Of - (Get_Postcond_Enabled (Def_Ent), Loc), - Expression => - New_Occurrence_Of - (Standard_False, Loc))); + -- Note that we do not disable early evaluation of postconditions + -- for return types that are unconstrained or have unconstrained + -- elements since the temporary result object could get allocated on + -- the stack and be out of scope at the point where we perform late + -- evaluation of postconditions - leading to uninitialized memory + -- reads. + + -- This disabling of early evaluation can lead to incorrect run-time + -- semantics where functions with unconstrained elements will + -- have their corresponding postconditions evaluated before + -- finalization. The proper solution here is to generate a wrapper + -- to capture the result instead of using multiple flags and playing + -- with flags which does not even work in all cases ??? + + if not Has_Unconstrained_Elements (Etype (Def_Ent)) + or else (Is_Array_Type (Etype (Def_Ent)) + and then not Is_Constrained (Etype (Def_Ent))) + then +Append_To (Top_Decls, + Make_Assignment_Statement (Loc, +Name => + New_Occurrence_Of +(Get_Postcond_Enabled (Def_Ent), Loc), +Expression => + New_Occurrence_Of +(Standard_False, Loc))); + end if; -- Add the subprogram to the list of declarations an analyze it
[Ada] Remove useless code related to current value propagation
The current value propagation applies only to assignable objects and doesn't make sense for subprogram entities. This was a mistake introduced when extending the current value propagation years ago. Cleanup related to fixing interference between expansion of attribute Loop_Entry and current value propagation. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_attr.adb (Address_Checks): Remove call to Kill_Current_Values for subprogram entities, because this routine only does something for object entities.diff --git a/gcc/ada/sem_attr.adb b/gcc/ada/sem_attr.adb --- a/gcc/ada/sem_attr.adb +++ b/gcc/ada/sem_attr.adb @@ -504,7 +504,6 @@ package body Sem_Attr is begin if Is_Subprogram (Ent) then Set_Address_Taken (Ent); - Kill_Current_Values (Ent); -- An Address attribute is accepted when generated by the -- compiler for dispatching operation, and an error is
[Ada] Fix expansion of attribute Loop_Entry wrt value propagation
When expanding attribute Loop_Entry we create constant object declarations and put them just before the loop. The current values of variables at the point of Loop_Entry attribute must not be used when analysing the initialization expressions of these constants, because they might be different from the values at the loop entry itself. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * exp_attr.adb (Expand_Loop_Entry_Attribute): Disable value propagation when analysing the constant that holds the Loop_Entry prefix value.diff --git a/gcc/ada/exp_attr.adb b/gcc/ada/exp_attr.adb --- a/gcc/ada/exp_attr.adb +++ b/gcc/ada/exp_attr.adb @@ -26,6 +26,7 @@ with Aspects;use Aspects; with Atree; use Atree; with Checks; use Checks; +with Debug; use Debug; with Einfo; use Einfo; with Einfo.Entities; use Einfo.Entities; with Einfo.Utils;use Einfo.Utils; @@ -1792,23 +1793,30 @@ package body Exp_Attr is Push_Scope (Scope (Loop_Id)); end if; - -- The analysis of the conditional block takes care of the constant - -- declaration. + -- Analyze constant declaration with simple value propagation disabled, + -- because the values at the loop entry might be different than the + -- values at the occurrence of Loop_Entry attribute. - if Present (Result) then - Rewrite (Loop_Stmt, Result); - Analyze (Loop_Stmt); - - -- The conditional block was analyzed when a previous 'Loop_Entry was - -- expanded. There is no point in reanalyzing the block, simply analyze - -- the declaration of the constant. + declare + Save_Debug_Flag_MM : constant Boolean := Debug_Flag_MM; + begin + Debug_Flag_MM := True; - else if Present (Aux_Decl) then Analyze (Aux_Decl); end if; Analyze (Temp_Decl); + + Debug_Flag_MM := Save_Debug_Flag_MM; + end; + + -- If the conditional block has just been created, then analyze it; + -- otherwise it was analyzed when a previous 'Loop_Entry was expanded. + + if Present (Result) then + Rewrite (Loop_Stmt, Result); + Analyze (Loop_Stmt); end if; Rewrite (N, New_Occurrence_Of (Temp_Id, Loc));
[Ada] Fix fallout of change in equality for untagged record types
The problem is that the resolution of expanded names implicitly assumes that the visible and private homonyms in a given scope are segregated on the homonym chain, and this was no longer the case for equality operators in the specific case at stake. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_ch7.adb (Inspect_Untagged_Record_Completion): Also move the equality operator on the homonym chain if there is another equality operator in the private part.diff --git a/gcc/ada/sem_ch7.adb b/gcc/ada/sem_ch7.adb --- a/gcc/ada/sem_ch7.adb +++ b/gcc/ada/sem_ch7.adb @@ -1491,6 +1491,7 @@ package body Sem_Ch7 is Prim_List : constant Elist_Id := Collect_Primitive_Operations (Defining_Identifier (Decl)); + E : Entity_Id; Ne_Id : Entity_Id; Op_Decl : Node_Id; Op_Id : Entity_Id; @@ -1517,16 +1518,39 @@ package body Sem_Ch7 is pragma Assert (Ekind (Ne_Id) = E_Function and then Corresponding_Equality (Ne_Id) = Op_Id); +E := First_Private_Entity (Id); + -- Move them from the private part of the entity list -- up to the end of the visible part of the same list. Remove_Entity (Op_Id); Remove_Entity (Ne_Id); -Link_Entities - (Prev_Entity (First_Private_Entity (Id)), Op_Id); +Link_Entities (Prev_Entity (E), Op_Id); Link_Entities (Op_Id, Ne_Id); -Link_Entities (Ne_Id, First_Private_Entity (Id)); +Link_Entities (Ne_Id, E); + +-- And if the private part contains another equality +-- operator, move the equality operator to after it +-- in the homonym chain, so that all its next homonyms +-- in the same scope, if any, also are in the visible +-- part. This is relied upon to resolve expanded names +-- in Collect_Interps for example. + +while Present (E) loop + exit when Ekind (E) = E_Function + and then Chars (E) = Name_Op_Eq; + + Next_Entity (E); +end loop; + +if Present (E) then + Remove_Homonym (Op_Id); + + Set_Homonym (Op_Id, Homonym (E)); + Set_Homonym (E, Op_Id); +end if; + exit; end if;
Re: [PATCH] Do not use DW_OP_not for TRUTH_NOT_EXPR in conditional expressions
> It won't work for types larger than size of address, it would need to use > dwarf_OP (DW_OP_const_type) instead of DW_OP_lit0 in that case. > But maybe TRUTH_NOT_EXPR will be never seen for such types and after all, > even the loc_list_from_tree_1 INTEGER_CST case doesn't handle that > (the RTL case does). > So I think at least for now it is ok. Thanks. Any objection to me installing it on the 12 branch as well? -- Eric Botcazou
Re: [PATCH] Do not use DW_OP_not for TRUTH_NOT_EXPR in conditional expressions
On Mon, May 16, 2022 at 10:47:53AM +0200, Eric Botcazou wrote: > > It won't work for types larger than size of address, it would need to use > > dwarf_OP (DW_OP_const_type) instead of DW_OP_lit0 in that case. > > But maybe TRUTH_NOT_EXPR will be never seen for such types and after all, > > even the loc_list_from_tree_1 INTEGER_CST case doesn't handle that > > (the RTL case does). > > So I think at least for now it is ok. > > Thanks. Any objection to me installing it on the 12 branch as well? Nope. It is ok. Jakub
Re: [PATCH] Mitigate -Wmaybe-uninitialized in expmed.cc.
Martin Liška writes: > It's the warning I see every time I build GCC: > > In file included from /home/marxin/Programming/gcc/gcc/coretypes.h:478, > from /home/marxin/Programming/gcc/gcc/expmed.cc:26: > In function ‘poly_uint16 mode_to_bytes(machine_mode)’, > inlined from ‘typename if_nonpoly::type > GET_MODE_SIZE(const T&) [with T = scalar_int_mode]’ at > /home/marxin/Programming/gcc/gcc/machmode.h:647:24, > inlined from ‘rtx_def* emit_store_flag_1(rtx, rtx_code, rtx, rtx, > machine_mode, int, int, machine_mode)’ at > /home/marxin/Programming/gcc/gcc/expmed.cc:5728:56: > /home/marxin/Programming/gcc/gcc/machmode.h:550:49: warning: ‘*(unsigned > int*)((char*)&int_mode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))’ > may be used uninitialized [-Wmaybe-uninitialized] > 550 | ? mode_size_inline (mode) : mode_size[mode]); > | ^~~~ > /home/marxin/Programming/gcc/gcc/expmed.cc: In function ‘rtx_def* > emit_store_flag_1(rtx, rtx_code, rtx, rtx, machine_mode, int, int, > machine_mode)’: > /home/marxin/Programming/gcc/gcc/expmed.cc:5657:19: note: ‘*(unsigned > int*)((char*)&int_mode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))’ > was declared here > 5657 | scalar_int_mode int_mode; > | ^~~~ > > Can we please mitigate it? > > gcc/ChangeLog: > > * expmed.cc (emit_store_flag_1): Mitigate -Wmaybe-uninitialized > warning. Not a strong objection, but TBH I'd rather we didn't work around false positives like this. Richard > --- > gcc/expmed.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc > index 41738c1efe9..f23d63471ea 100644 > --- a/gcc/expmed.cc > +++ b/gcc/expmed.cc > @@ -5654,7 +5654,7 @@ emit_store_flag_1 (rtx target, enum rtx_code code, rtx > op0, rtx op1, > >/* If we are comparing a double-word integer with zero or -1, we can > convert the comparison into one involving a single word. */ > - scalar_int_mode int_mode; > + scalar_int_mode int_mode = {}; >if (is_int_mode (mode, &int_mode) >&& GET_MODE_BITSIZE (int_mode) == BITS_PER_WORD * 2 >&& (!MEM_P (op0) || ! MEM_VOLATILE_P (op0)))
Re: [PATCH][pushed] Fix ubsan error in opts-global.cc
On Mon, May 16, 2022 at 9:53 AM Martin Liška wrote: > > Fixes: > opts-global.cc:75:15: runtime error: store to address 0x0bc9be70 with > insufficient space for an object of type 'char' > which happens when mask == 0, len == 0 and we allocate zero elements. > Eventually, result[0] is called which triggers the UBSAN. > > It's newly discovered after the Siddhesh's recent patch. > > Cheers, > Martin > > gcc/ChangeLog: > > * opts-global.cc (write_langs): Allocate at least one byte. > --- > gcc/opts-global.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/opts-global.cc b/gcc/opts-global.cc > index a18c76940f9..4f5f8cdcb98 100644 > --- a/gcc/opts-global.cc > +++ b/gcc/opts-global.cc > @@ -61,7 +61,7 @@ write_langs (unsigned int mask) > if (mask & (1U << n)) >len += strlen (lang_name) + 1; > > - result = XNEWVEC (char, len); > + result = XNEWVEC (char, MAX (1, len)); Does it not fail to allocate space for the '\0' it terminates the list with even when there's a language? Ah, it "re-uses" the byte it allocates for the '/' of the first element. Can you add a comment? OK with that change. Richard. >len = 0; >for (n = 0; (lang_name = lang_names[n]) != 0; n++) > if (mask & (1U << n)) > -- > 2.36.1 >
Re: [PATCH] Expand __builtin_memcmp_eq with ptest for OImode.
On Sat, May 7, 2022 at 7:05 AM liuhongt wrote: > > This is adjusted patch only for OImode. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/104610 > * config/i386/i386-expand.cc (ix86_expand_branch): Use ptest > for QImode when code is EQ or NE. > * config/i386/sse.md (cbranch4): Extend to OImode. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr104610.c: New test. > --- > gcc/config/i386/i386-expand.cc | 10 +- > gcc/config/i386/sse.md | 8 ++-- > gcc/testsuite/gcc.target/i386/pr104610.c | 15 +++ > 3 files changed, 30 insertions(+), 3 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr104610.c > > diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc > index bc806ffa283..c2f8776102c 100644 > --- a/gcc/config/i386/i386-expand.cc > +++ b/gcc/config/i386/i386-expand.cc > @@ -2267,11 +2267,19 @@ ix86_expand_branch (enum rtx_code code, rtx op0, rtx > op1, rtx label) > >/* Handle special case - vector comparsion with boolean result, transform > it using ptest instruction. */ > - if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT) > + if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT > + || (mode == OImode && (code == EQ || code == NE))) No need for the code check here. You have an assert in the code below. > { >rtx flag = gen_rtx_REG (CCZmode, FLAGS_REG); >machine_mode p_mode = GET_MODE_SIZE (mode) == 32 ? V4DImode : V2DImode; > > + if (mode == OImode) > + { > + op0 = lowpart_subreg (p_mode, force_reg (mode, op0), mode); > + op1 = lowpart_subreg (p_mode, force_reg (mode, op1), mode); > + mode = p_mode; > + } > + >gcc_assert (code == EQ || code == NE); Please put the above hunk after the assert. >/* Generate XOR since we can't check that one operand is zero vector. > */ >tmp = gen_reg_rtx (mode); > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > index 7b791def542..9514b8e0234 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -26034,10 +26034,14 @@ (define_expand > "maskstore" > (match_operand: 2 "register_operand")))] >"TARGET_AVX512BW") > > +(define_mode_iterator VI48_OI_AVX > + [(V8SI "TARGET_AVX") (V4DI "TARGET_AVX") (OI "TARGET_AVX") > + V4SI V2DI]) > + > (define_expand "cbranch4" >[(set (reg:CC FLAGS_REG) > - (compare:CC (match_operand:VI48_AVX 1 "register_operand") > - (match_operand:VI48_AVX 2 "nonimmediate_operand"))) > + (compare:CC (match_operand:VI48_OI_AVX 1 "register_operand") > + (match_operand:VI48_OI_AVX 2 "nonimmediate_operand"))) > (set (pc) (if_then_else >(match_operator 0 "bt_comparison_operator" > [(reg:CC FLAGS_REG) (const_int 0)]) Please rather put the new cbranchoi4 expander in i386.md. > diff --git a/gcc/testsuite/gcc.target/i386/pr104610.c > b/gcc/testsuite/gcc.target/i386/pr104610.c > new file mode 100644 > index 000..00866238bd7 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr104610.c > @@ -0,0 +1,15 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mmove-max=256 -mstore-max=256" } */ > +/* { dg-final { scan-assembler-times {(?n)vptest.*ymm} 1 } } */ > +/* { dg-final { scan-assembler-times {sete} 1 } } */ > +/* { dg-final { scan-assembler-not {(?n)je.*L[0-9]} } } */ > +/* { dg-final { scan-assembler-not {(?n)jne.*L[0-9]} } } */ > + > + > +#include > +__attribute__((target("avx"))) > +bool f256(char *a) Use _Bool istead and simply pass -mavx to dg-options. Uros. > +{ > + char t[] = "0123456789012345678901234567890"; > + return __builtin_memcmp(a, &t[0], sizeof(t)) == 0; > +} > -- > 2.18.1 >
Re: [PATCH] lto-plugin: add support for feature detection
On Mon, May 16, 2022 at 10:37 AM Rui Ueyama via Gcc-patches wrote: > > On Mon, May 16, 2022 at 2:38 PM Alexander Monakov wrote: > > > > On Mon, 16 May 2022, Rui Ueyama wrote: > > > > > If it is a guaranteed behavior that GCC of all versions that support > > > only get_symbols_v2 don't leave a temporary file behind if it is > > > suddenly disrupted during get_symbols_v2 execution, then yes, mold can > > > restart itself when get_symbols_v2 is called for the first time. > > > > > > Is this what you want? I'm fine with that approach if it is guaranteed > > > to work by GCC developers. > > > > I cannot answer that, hopefully someone in Cc: will. This sub-thread started > > with Richard proposing an alternative solution for API level discovery [1] > > (invoking onload twice, first with only the _v3 entrypoint in the "transfer > > vector"), and then suggesting an onload_v2 variant that would allow to > > discover which entrypoints the plugin is going to use [2]. > > > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594058.html > > [2] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594059.html > > > > ... at which point I butted in, because the whole _v3 thing was shrouded in > > mystery. Hopefully, now it makes more sense. > > > > > > From my side I want to add that thread-safety remains a major unsolved > > point. > > Compiler driver _always_ adds the plugin to linker command line, so I expect > > that if you add a mutex around your claim_file hook invocation, you'll find > > that it serializes the linker too much. Have you given some thought to that? > > Will you be needing a plugin API upgrade to discover thread-safe > > entrypoints, > > or do you have another solution in mind? > > From my linker's point of view, the easiest way to handle the > situation is to implement a logic like this: > > if (gcc_version >= 12.2) > assume that claim_file_hook is thread safe > else > assume that claim_file_hook is not thread safe > > And that is also _very_ useful to identify the existence of > get_symbols_v3, because we can do the same thing with s/12.2/12.1/. Sure having a 'plugin was compiled from sources of the GCC N.M compiler' is useful if bugs are discovered in old versions that you by definition cannot fix but can apply workarounds to. Note the actual compiler used might still differ. Note that still isn't clean API documentation / extension of the plugin API itself. As of thread safety we can either add a claim_file_v2_hook or try to do broader-level versioning of the API with a new api_version hook which could also specify that with say version 2 the plugin will not use get_symbols_v2 but only newer, etc. The linker would announce the API version it likes to use and the plugin would return the (closest) version it can handle. A v2 could then also specify that claim_file needs to be thread safe, or that cleanup should allow a followup onload again with a state identical to after dlopen, etc. Is there an API document besides the header itself somewhere? Richard.
Re: [PATCH] lto-plugin: add support for feature detection
On Mon, 16 May 2022, Richard Biener wrote: > Is there an API document besides the header itself somewhere? It's on the wiki: https://gcc.gnu.org/wiki/whopr/driver (sadly the v3 entrypoint was added there without documentation) Alexander
[PATCH] middle-end/105604 - snprintf dianostics and non-constant sizes/offsets
The following tries to correct get_origin_and_offset_r not handling non-constant sizes of array elements in ARRAY_REFs and non-constant offsets of COMPONENT_REFs. It isn't exactly clear how such failures should be treated in this API and existing handling isn't consistent here either. The following applies two different variants, treating non-constant array sizes like non-constant array indices and treating non-constant offsets of COMPONENT_REFs by terminating the recursion (not sure what that means to the callers). Basically the code failed to use component_ref_field_offset and array_ref_element_size and instead relies on inappropriate helpers (that shouldn't exist in the first place ...). The code is also not safe-guarded against overflows in the final offset/size computations but I'm not trying to rectify that. Martin - can you comment on how the API should handle such situations? Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk and branches? Thanks, Richard. 2022-05-16 Richard Biener PR middle-end/105604 * gimple-ssa-sprintf.cc (get_origin_and_offset_r): Handle non-constant ARRAY_REF element size and non-constant COMPONENT_REF field offset. * gcc.dg/torture/pr105604.c: New testcase. --- gcc/gimple-ssa-sprintf.cc | 14 +++--- gcc/testsuite/gcc.dg/torture/pr105604.c | 24 2 files changed, 35 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr105604.c diff --git a/gcc/gimple-ssa-sprintf.cc b/gcc/gimple-ssa-sprintf.cc index c93f12f90b5..14e215ce69c 100644 --- a/gcc/gimple-ssa-sprintf.cc +++ b/gcc/gimple-ssa-sprintf.cc @@ -2312,14 +2312,16 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *fldsize, HOST_WIDE_INT idx = (tree_fits_uhwi_p (offset) ? tree_to_uhwi (offset) : HOST_WIDE_INT_MAX); + tree elsz = array_ref_element_size (x); tree eltype = TREE_TYPE (x); if (TREE_CODE (eltype) == INTEGER_TYPE) { if (off) *off = idx; } - else if (idx < HOST_WIDE_INT_MAX) - *fldoff += idx * int_size_in_bytes (eltype); + else if (idx < HOST_WIDE_INT_MAX +&& tree_fits_shwi_p (elsz)) + *fldoff += idx * tree_to_shwi (elsz); else *fldoff = idx; @@ -2350,8 +2352,14 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *fldsize, case COMPONENT_REF: { + tree foff = component_ref_field_offset (x); tree fld = TREE_OPERAND (x, 1); - *fldoff += int_byte_position (fld); + if (!tree_fits_shwi_p (foff) + || !tree_fits_shwi_p (DECL_FIELD_BIT_OFFSET (fld))) + return x; + *fldoff += (tree_to_shwi (foff) + + (tree_to_shwi (DECL_FIELD_BIT_OFFSET (fld)) + / BITS_PER_UNIT)); get_origin_and_offset_r (fld, fldoff, fldsize, off); x = TREE_OPERAND (x, 0); diff --git a/gcc/testsuite/gcc.dg/torture/pr105604.c b/gcc/testsuite/gcc.dg/torture/pr105604.c new file mode 100644 index 000..b002251df10 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr105604.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-Wall" } */ + +struct { + long users; + long size; + char *data; +} * main_trans; +void *main___trans_tmp_1; +int sprintf(char *, char *, ...); +int main() { + int users = 0; + struct { +long users; +long size; +char *data; +int links[users]; +char buf[]; + } *trans = trans; + trans->data = trans->buf; + main___trans_tmp_1 = trans; + main_trans = main___trans_tmp_1; + sprintf(main_trans->data, "test"); +} -- 2.35.3
Re: [PATCH] lto-plugin: add support for feature detection
> > Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > is useful if bugs are discovered in old versions that you by definition cannot > fix but can apply workarounds to. Note the actual compiler used might still > differ. Note that still isn't clean API documentation / extension of the > plugin > API itself. As of thread safety we can either add a claim_file_v2_hook > or try to do broader-level versioning of the API with a new api_version > hook which could also specify that with say version 2 the plugin will > not use get_symbols_v2 but only newer, etc. The linker would announce > the API version it likes to use and the plugin would return the > (closest) version > it can handle. A v2 could then also specify that claim_file needs to be Yep, I think having the API version handshake is quite reasonable way to go as the _v2,_v3 symbols seems already getting bit out of hand and we essentially have 3 revisions of API to think of (first adding LDPR_PREVAILING_DEF_IRONLY_EXP, second adding GET_SYMBOLS_V3 and now we plan third adding thread safety and solving the file handler problems) > thread safe, or that cleanup should allow a followup onload again with > a state identical to after dlopen, etc. > > Is there an API document besides the header itself somewhere? There is https://gcc.gnu.org/wiki/whopr/driver I wonder if this can't be moved into more official looking place since it looks like it is internal to GCC WHOPR implementation this way. Honza > > Richard.
Re: [PATCH] Extend --with-zstd documentation
On 5/12/22 09:00, Richard Biener via Gcc-patches wrote: > On Wed, May 11, 2022 at 5:10 PM Bruno Haible wrote: >> >> The patch that was so far added for documenting --with-zstd is pretty >> minimal: >> - it refers to undocumented options --with-zstd-include and >> --with-zstd-lib; >> - it suggests that --with-zstd can be used without an argument; >> - it does not clarify how this option applies to cross-compilation. >> >> How about adding the same details as for the --with-isl, >> --with-isl-include, --with-isl-lib options, mutatis mutandis? This patch >> does that. > > Sounds good! > > OK. Bruno, are you planning committing the change? Or should I do it on your behalf? Cheers, Martin > > Thanks, > Richard. > >> PR other/105527 >> >> gcc/ChangeLog: >> >> * doc/install.texi (Configuration): Add more details about >> --with-zstd. >> Document --with-zstd-include and --with-zstd-lib >> --- >> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi >> index 042241e9fad..ed0d1d882c3 100644 >> --- a/gcc/doc/install.texi >> +++ b/gcc/doc/install.texi >> @@ -2360,10 +2360,20 @@ default is derived from glibc's behavior. When glibc >> clamps float_t to double, >> GCC follows and enables the option. For other cross compiles, the default is >> disabled. >> >> -@item --with-zstd >> -Specify prefix directory for installed zstd library. >> -Equivalent to @option{--with-zstd-include=PATH/include} plus >> -@option{--with-zstd-lib=PATH/lib}. >> +@item --with-zstd=@var{pathname} >> +@itemx --with-zstd-include=@var{pathname} >> +@itemx --with-zstd-lib=@var{pathname} >> +If you do not have the @code{zstd} library installed in a standard >> +location and you want to build GCC, you can explicitly specify the >> +directory where it is installed (@samp{--with-zstd=@/@var{zstdinstalldir}}). >> +The @option{--with-zstd=@/@var{zstdinstalldir}} option is shorthand for >> +@option{--with-zstd-lib=@/@var{zstdinstalldir}/lib} and >> +@option{--with-zstd-include=@/@var{zstdinstalldir}/include}. If this >> +shorthand assumption is not correct, you can use the explicit >> +include and lib options directly. >> + >> +These flags are applicable to the host platform only. When building >> +a cross compiler, they will not be used to configure target libraries. >> @end table >> >> @subheading Cross-Compiler-Specific Options >> >> >>
Re: [PATCH] lto-plugin: add support for feature detection
On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote: >> >> Sure having a 'plugin was compiled from sources of the GCC N.M compiler' >> is useful if bugs are discovered in old versions that you by definition >> cannot >> fix but can apply workarounds to. Note the actual compiler used might still >> differ. Note that still isn't clean API documentation / extension of the >> plugin >> API itself. As of thread safety we can either add a claim_file_v2_hook >> or try to do broader-level versioning of the API with a new api_version >> hook which could also specify that with say version 2 the plugin will >> not use get_symbols_v2 but only newer, etc. The linker would announce >> the API version it likes to use and the plugin would return the >> (closest) version >> it can handle. A v2 could then also specify that claim_file needs to be > > Yep, I think having the API version handshake is quite reasonable way to > go as the _v2,_v3 symbols seems already getting bit out of hand and we > essentially have 3 revisions of API to think of > (first adding LDPR_PREVAILING_DEF_IRONLY_EXP, second adding > GET_SYMBOLS_V3 and now we plan third adding thread safety and solving > the file handler problems) How should be design such a version handshake? > >> thread safe, or that cleanup should allow a followup onload again with >> a state identical to after dlopen, etc. >> >> Is there an API document besides the header itself somewhere? > > There is https://gcc.gnu.org/wiki/whopr/driver > I wonder if this can't be moved into more official looking place since > it looks like it is internal to GCC WHOPR implementation this way. We can likely add it here: https://gcc.gnu.org/onlinedocs/gccint/LTO.html#LTO What do you think? I can port it. Martin > > Honza >> >> Richard.
[PATCH] Clamp vec_perm_expr index in simplify_bitfield_ref to avoid ICE.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} Ok for trunk? gcc/ChangeLog: PR tree-optimization/105591 * tree-ssa-forwprop.cc (simplify_bitfield_ref): Clamp vec_perm_expr index. gcc/testsuite/ChangeLog: * gcc.dg/pr105591.c: New test. --- gcc/testsuite/gcc.dg/pr105591.c | 12 gcc/tree-ssa-forwprop.cc| 13 - 2 files changed, 20 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/pr105591.c diff --git a/gcc/testsuite/gcc.dg/pr105591.c b/gcc/testsuite/gcc.dg/pr105591.c new file mode 100644 index 000..9554c42e2f4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr105591.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-Wno-psabi -O" } */ +/* { dg-additional-options "-mavx" { target x86_64-*-* i?86-*-* } } */ +typedef unsigned long long __attribute__((__vector_size__ (16))) U; +typedef unsigned long long __attribute__((__vector_size__ (32))) V; + +V +foo (U u) +{ + U x = __builtin_shuffle (u, (U) { 0xBE2ED0AB630B33FE }); + return __builtin_shufflevector (u, x, 2, 1, 0, 3); +} diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc index 48cab5844e0..7da3f80af10 100644 --- a/gcc/tree-ssa-forwprop.cc +++ b/gcc/tree-ssa-forwprop.cc @@ -2381,23 +2381,26 @@ simplify_bitfield_ref (gimple_stmt_iterator *gsi) /* One element. */ if (known_eq (size, elem_size)) -idx = TREE_INT_CST_LOW (VECTOR_CST_ELT (m, idx)); +idx = TREE_INT_CST_LOW (VECTOR_CST_ELT (m, idx)) % (2 * nelts); else { unsigned HOST_WIDE_INT nelts_op; if (!constant_multiple_p (size, elem_size, &nelts_op) || !pow2p_hwi (nelts_op)) return false; - unsigned start = TREE_INT_CST_LOW (vector_cst_elt (m, idx)); - unsigned end = TREE_INT_CST_LOW (vector_cst_elt (m, idx + nelts_op - 1)); + /* Clamp vec_perm_expr index. */ + unsigned start = TREE_INT_CST_LOW (vector_cst_elt (m, idx)) % (2 * nelts); + unsigned end = TREE_INT_CST_LOW (vector_cst_elt (m, idx + nelts_op - 1)) +% (2 * nelts); /* Be in the same vector. */ if ((start < nelts) != (end < nelts)) return false; for (unsigned HOST_WIDE_INT i = 1; i != nelts_op; i++) { /* Continuous area. */ - if (TREE_INT_CST_LOW (vector_cst_elt (m, idx + i)) - 1 - != TREE_INT_CST_LOW (vector_cst_elt (m, idx + i - 1))) + if (TREE_INT_CST_LOW (vector_cst_elt (m, idx + i)) % (2 * nelts) - 1 + != TREE_INT_CST_LOW (vector_cst_elt (m, idx + i - 1)) +% (2 * nelts)) return false; } /* Alignment not worse than before. */ -- 2.18.1
Re: [wwwdocs][Patch] Add OpenMP by-GCC-version implementation status
Hi all, small update (interdiff): s/s/S/ for consistency, missed one GCC 13 commit, and improved wording of the enter/exit change. (New wording better captures the effect; I was thinking too much of the changed spec wording not of the effective result.) Plus added some cross-ref hyperlinks to make finding the documentation easier: --- a/htdocs/projects/gomp/openmp-status.html +++ b/htdocs/projects/gomp/openmp-status.html @@ -25,0 +26 @@ + GOMP project page @@ -61 +61 @@ -hint clause on the atomic constructGCC 9stub only +hint clause on the atomic constructGCC 9Stub only @@ -141,0 +142 @@ +omp_get_mapped_ptr runtime routineGCC 13 @@ -159 +159,0 @@> -omp_get_mapped_ptr runtime routineN @@ -195 +195 @@ -Extended map-type handling of target enter/exit dataN +Default map type for map clause in target enter/exit dataN --- a/htdocs/projects/index.html +++ b/htdocs/projects/index.html @@ -32,0 +33 @@ help develop GCC: +Implementing missing OpenMP features. I will incorporate the 5.2 wording change in my libgomp.texi patch. (I think the s/stub/Stub/ change is already there and I know the N→Y change is in the .texi.) Otherwise, the same boilerplate applies: On 14.05.22 23:27, Tobias Burnus wrote: Jakub and I discussed the other day that it would be useful to have a page similar to https://gcc.gnu.org/projects/cxx-status.html to provide by-GCC-version information of the which OpenMP are supported. The list is based on * https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Implementation-Status.html * Some revision archeology by Jakub * Plus added 5.2 list. Comments regarding the file, where it is placed and how it is linked to? I found it simpler to have a single very long line rather than trying to break it into 80-character lines, but I can also do this reformatting. Tobias PS: Full patch attached. PPS: I think we should consider to further cleanup/consolidate both the generic gomp landing page and the http://gcc.gnu.org/wiki/openmp wiki. Current the information is too far spread and too difficult to find. - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 htdocs/projects/gomp/index.html | 15 ++- htdocs/projects/gomp/openmp-status.html | 203 htdocs/projects/index.html | 1 + 3 files changed, 213 insertions(+), 6 deletions(-) diff --git a/htdocs/projects/gomp/index.html b/htdocs/projects/gomp/index.html index 59697c10..81096578 100644 --- a/htdocs/projects/gomp/index.html +++ b/htdocs/projects/gomp/index.html @@ -54,18 +54,21 @@ projects. extensions to target language parsers. A long-term goal is the generation of efficient and small code for OpenMP applications. +Documentation + + OpenMP Implementation status + https://gcc.gnu.org/onlinedocs/libgomp/";>Online + documentation of libgomp, the GOMP support library + + Contributing We encourage everyone to contribute changes and help test GOMP. GOMP has been merged into mainline GCC. Reporting Bugs -Please add "openmp" to the keywords field when filing a bug report. - -Documentation -libgomp, the GOMP support library, has -https://gcc.gnu.org/onlinedocs/libgomp/";>online documentation -available. +Please add openmp to the keywords field when filling a +https://gcc.gnu.org/bugzilla/";>bug report. Status diff --git a/htdocs/projects/gomp/openmp-status.html b/htdocs/projects/gomp/openmp-status.html new file mode 100644 index ..58e55c89 --- /dev/null +++ b/htdocs/projects/gomp/openmp-status.html @@ -0,0 +1,203 @@ + + + + + +OpenMP Implementation Status in GCC +https://gcc.gnu.org/gcc.css"; /> + + + +OpenMP Implementation Status in GCC + +Disclaimer: A feature might be only fully supported in a later GCC version +than listed, depending on resolved corner cases and optimizations. + +Features added by OpenMP version + + OpenMP 4.5 + OpenMP 5.0 + OpenMP 5.1 + OpenMP 5.2 + + +OpenMP implementation status in the online documentation: + + GOMP project page + https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Implementation-Status.html"; +>Mainline (GCC 13) + https://gcc.gnu.org/onlinedocs/gcc-12.1.0/libgomp/OpenMP-Implementation-Status.html"; +>GCC 12 + + + +OpenMP 4.5 + + + C/C++: Supported since GCC 6 + Fortran: Partial support since GCC 7, full support since GCC 11 + + + +OpenMP 5.0 + + + +FeatureGCC VersionComments + + +in_reduction clause on task constructsGCC 9 +Supporting C++'s range-based for loopGCC 9 +IteratorsGCC 9 +!= as relational-op in canonical loop form for C/C++GCC 9 +C/C++'s lvalue expressions in depend clausesGCC 9 +mutexinoutset dependence-type for depend clauseGCC 9 +depobj construct and depend objects GCC 9 +depend clause on taskwaitGCC 9 +teams construct outside an enclosing target regionGCC 9 +Clauses if, nontemporal and order
Re: [PATCH] lto-plugin: add support for feature detection
> On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote: > >> > >> Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > >> is useful if bugs are discovered in old versions that you by definition > >> cannot > >> fix but can apply workarounds to. Note the actual compiler used might > >> still > >> differ. Note that still isn't clean API documentation / extension of the > >> plugin > >> API itself. As of thread safety we can either add a claim_file_v2_hook > >> or try to do broader-level versioning of the API with a new api_version > >> hook which could also specify that with say version 2 the plugin will > >> not use get_symbols_v2 but only newer, etc. The linker would announce > >> the API version it likes to use and the plugin would return the > >> (closest) version > >> it can handle. A v2 could then also specify that claim_file needs to be > > > > Yep, I think having the API version handshake is quite reasonable way to > > go as the _v2,_v3 symbols seems already getting bit out of hand and we > > essentially have 3 revisions of API to think of > > (first adding LDPR_PREVAILING_DEF_IRONLY_EXP, second adding > > GET_SYMBOLS_V3 and now we plan third adding thread safety and solving > > the file handler problems) > > How should be design such a version handshake? As richi outlined. We need to assign version number of the API (with 1, 2, 3 defind to match existing revisions) and define version 4 which will have have way for plugin to announce maximal api version supported and linker telling plugin what API version it wants to work with. That shold be min(linker_api, plugin_api) Since revisions happens relatively rarely, I think we should be able to stay with single number determining it. > > > > >> thread safe, or that cleanup should allow a followup onload again with > >> a state identical to after dlopen, etc. > >> > >> Is there an API document besides the header itself somewhere? > > > > There is https://gcc.gnu.org/wiki/whopr/driver > > I wonder if this can't be moved into more official looking place since > > it looks like it is internal to GCC WHOPR implementation this way. > > We can likely add it here: > https://gcc.gnu.org/onlinedocs/gccint/LTO.html#LTO > > What do you think? I can port it. I am bit worried that it is place LLVM developers will not look at since name "GCC internals" suggests it is internal to GCC. It is not. However perhaps we can link it from binutils docs, plugin header and gcc plugin source to make it hopefully obvious enough. Maybe binutils docs would be an alternative. Yet we want other linkers support it: I am really happy mold now supports plugin API, but if lld and MacOS linker had it it would make our life easier. Honza > > Martin > > > > > Honza > >> > >> Richard. >
Re: [PATCH] lto-plugin: add support for feature detection
Version handshaking is doable, but it feels like we are over-designing an API, given that the real providers of this plugin API are only GCC and LLVM and the users of the API are BFD ld, gold and mold. It is unlikely that we'll have dozens of more compilers or linkers in the near future. So, I personally prefer the following style if (!strcmp(plugin_compiler_name, "gcc") && plugin_major >= 12) than versioning various API-provided functions. It'll just work and be easy to understand. Besides that, even if we version GCC-provided plugin API functions, we still need a logic similar to the above to distinguish GCC from LLVM, as they behave slightly differently in various corner cases. We can't get rid of the heuristic version detection logic from the linker anyways. So, from the linker's point of view, exporting a compiler name and version numbers is enough. On Mon, May 16, 2022 at 5:38 PM Martin Liška wrote: > > On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote: > >> > >> Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > >> is useful if bugs are discovered in old versions that you by definition > >> cannot > >> fix but can apply workarounds to. Note the actual compiler used might > >> still > >> differ. Note that still isn't clean API documentation / extension of the > >> plugin > >> API itself. As of thread safety we can either add a claim_file_v2_hook > >> or try to do broader-level versioning of the API with a new api_version > >> hook which could also specify that with say version 2 the plugin will > >> not use get_symbols_v2 but only newer, etc. The linker would announce > >> the API version it likes to use and the plugin would return the > >> (closest) version > >> it can handle. A v2 could then also specify that claim_file needs to be > > > > Yep, I think having the API version handshake is quite reasonable way to > > go as the _v2,_v3 symbols seems already getting bit out of hand and we > > essentially have 3 revisions of API to think of > > (first adding LDPR_PREVAILING_DEF_IRONLY_EXP, second adding > > GET_SYMBOLS_V3 and now we plan third adding thread safety and solving > > the file handler problems) > > How should be design such a version handshake? > > > > >> thread safe, or that cleanup should allow a followup onload again with > >> a state identical to after dlopen, etc. > >> > >> Is there an API document besides the header itself somewhere? > > > > There is https://gcc.gnu.org/wiki/whopr/driver > > I wonder if this can't be moved into more official looking place since > > it looks like it is internal to GCC WHOPR implementation this way. > > We can likely add it here: > https://gcc.gnu.org/onlinedocs/gccint/LTO.html#LTO > > What do you think? I can port it. > > Martin > > > > > Honza > >> > >> Richard. >
Re: [wwwdocs][Patch] Add OpenMP by-GCC-version implementation status
Hi Tobias, On Sat, 14 May 2022, Tobias Burnus wrote: > Jakub and I discussed the other day that it would be useful > to have a page similar to > https://gcc.gnu.org/projects/cxx-status.html > to provide by-GCC-version information of the which OpenMP are supported. this looks like a great idea, thanks! > Comments regarding the file, where it is placed and how it is linked to? > I found it simpler to have a single very long line rather than trying > to break it into 80-character lines, but I can also do this reformatting. Yes, in cases like this with a small and fixed set of contributors (to that table) ... whatever serves you best. Regarding the name projects/gomp/openmp-status.html feels a bit long? How about projects/gomp/status.html ? Or maybe even make this part of the main index.html page, so projects/gomp/ in the browser? The former is more of a stronger recommendation, the latter just a thought. Minor tweak: it's "to file a bug report", so "filing a bug" not "filling a bug", though indeed when filing a bug report one needs to fill a form. :-) Gerald
[PATCH] rtl-optimization/105577 - testcase for the PR
Tested on x86_64-unknown-linux-gnu, pushed. 2022-05-16 Richard Biener PR rtl-optimization/105577 * g++.dg/torture/pr105577.C: New testcase. --- gcc/testsuite/g++.dg/torture/pr105577.C | 156 1 file changed, 156 insertions(+) create mode 100644 gcc/testsuite/g++.dg/torture/pr105577.C diff --git a/gcc/testsuite/g++.dg/torture/pr105577.C b/gcc/testsuite/g++.dg/torture/pr105577.C new file mode 100644 index 000..52f16a54136 --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/pr105577.C @@ -0,0 +1,156 @@ +// { dg-do compile } +// { dg-additional-options "-fexceptions -fnon-call-exceptions" } + +namespace { +typedef __SIZE_TYPE__ size_t; +} +typedef __UINT8_TYPE__ uint8_t; +typedef __UINT64_TYPE__ uint64_t; +namespace { +template struct integral_constant { + static constexpr _Tp value = __v; +}; +template using __bool_constant = integral_constant; +template struct __conditional { + template using type = _Tp; +}; +template +using __conditional_t = typename __conditional<_Cond>::type<_If, _Else>; +template struct __and_; +template +struct __and_<_B1, _B2> : __conditional_t<_B1::value, _B2, _B1> {}; +template struct __not_ : __bool_constant {}; +template +struct __is_constructible_impl : __bool_constant<__is_constructible(_Tp)> {}; +template +struct is_default_constructible : __is_constructible_impl<_Tp> {}; +template struct remove_extent { typedef _Tp type; }; +template struct enable_if; +} // namespace +namespace std { +template struct allocator_traits { using pointer = _Tp; }; +template struct __alloc_traits : allocator_traits<_Alloc> {}; +template struct _Vector_base { + typedef typename __alloc_traits<_Alloc>::pointer pointer; + struct { +pointer _M_finish; +pointer _M_end_of_storage; + }; +}; +template +class vector : _Vector_base<_Tp, _Alloc> { +public: + _Tp value_type; + typedef size_t size_type; +}; +template class __uniq_ptr_impl { + template struct _Ptr { using type = _Up *; }; + +public: + using _DeleterConstraint = + enable_if<__and_<__not_<_Dp>, is_default_constructible<_Dp>>::value>; + using pointer = typename _Ptr<_Tp, _Dp>::type; +}; +template class unique_ptr { +public: + using pointer = typename __uniq_ptr_impl<_Tp, _Dp>::pointer; + pointer operator->(); +}; +enum _Lock_policy { _S_atomic } const __default_lock_policy = _S_atomic; +template <_Lock_policy = __default_lock_policy> class _Sp_counted_base; +template class __shared_ptr; +template <_Lock_policy> class __shared_count { _Sp_counted_base<> *_M_pi; }; +template class __shared_ptr { + using element_type = typename remove_extent<_Tp>::type; + element_type *_M_ptr; + __shared_count<_Lp> _M_refcount; +}; +template class shared_ptr : __shared_ptr<_Tp> { +public: + shared_ptr() noexcept : __shared_ptr<_Tp>() {} +}; +enum CompressionType : char; +class SliceTransform; +enum Temperature : uint8_t; +struct MutableCFOptions { + MutableCFOptions() + : soft_pending_compaction_bytes_limit(), + hard_pending_compaction_bytes_limit(level0_file_num_compaction_trigger), +level0_slowdown_writes_trigger(level0_stop_writes_trigger), +max_compaction_bytes(target_file_size_base), +target_file_size_multiplier(max_bytes_for_level_base), +max_bytes_for_level_multiplier(ttl), compaction_options_fifo(), +min_blob_size(blob_file_size), blob_compression_type(), +enable_blob_garbage_collection(blob_garbage_collection_age_cutoff), +max_sequential_skip_in_iterations(check_flush_compaction_key_order), +paranoid_file_checks(bottommost_compression), bottommost_temperature(), +sample_for_compression() {} + shared_ptr prefix_extractor; + uint64_t soft_pending_compaction_bytes_limit; + uint64_t hard_pending_compaction_bytes_limit; + int level0_file_num_compaction_trigger; + int level0_slowdown_writes_trigger; + int level0_stop_writes_trigger; + uint64_t max_compaction_bytes; + uint64_t target_file_size_base; + int target_file_size_multiplier; + uint64_t max_bytes_for_level_base; + double max_bytes_for_level_multiplier; + uint64_t ttl; + vector compaction_options_fifo; + uint64_t min_blob_size; + uint64_t blob_file_size; + CompressionType blob_compression_type; + bool enable_blob_garbage_collection; + double blob_garbage_collection_age_cutoff; + uint64_t max_sequential_skip_in_iterations; + bool check_flush_compaction_key_order; + bool paranoid_file_checks; + CompressionType bottommost_compression; + Temperature bottommost_temperature; + uint64_t sample_for_compression; +}; +template class autovector { + using value_type = T; + using size_type = typename vector::size_type; + size_type buf_[kSize * sizeof(value_type)]; +}; +class MemTable; +class ColumnFamilyData; +struct SuperVersion { + MutableCFOptions write_stall_condition; + autovector to_delete; +}; +class ColumnFamilySet { +public: + class iterator { + public: +iterator operator++(); +bool ope
Re: [PATCH] lto-plugin: add support for feature detection
On Mon, May 16, 2022 at 11:50 AM Jan Hubicka wrote: > > > On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote: > > >> > > >> Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > > >> is useful if bugs are discovered in old versions that you by definition > > >> cannot > > >> fix but can apply workarounds to. Note the actual compiler used might > > >> still > > >> differ. Note that still isn't clean API documentation / extension of > > >> the plugin > > >> API itself. As of thread safety we can either add a claim_file_v2_hook > > >> or try to do broader-level versioning of the API with a new api_version > > >> hook which could also specify that with say version 2 the plugin will > > >> not use get_symbols_v2 but only newer, etc. The linker would announce > > >> the API version it likes to use and the plugin would return the > > >> (closest) version > > >> it can handle. A v2 could then also specify that claim_file needs to be > > > > > > Yep, I think having the API version handshake is quite reasonable way to > > > go as the _v2,_v3 symbols seems already getting bit out of hand and we > > > essentially have 3 revisions of API to think of > > > (first adding LDPR_PREVAILING_DEF_IRONLY_EXP, second adding > > > GET_SYMBOLS_V3 and now we plan third adding thread safety and solving > > > the file handler problems) > > > > How should be design such a version handshake? > > As richi outlined. We need to assign version number of the API (with 1, > 2, 3 defind to match existing revisions) and define version 4 which will > have have way for plugin to announce maximal api version supported and > linker telling plugin what API version it wants to work with. That > shold be min(linker_api, plugin_api) > > Since revisions happens relatively rarely, I think we should be able to > stay with single number determining it. > > > > > > > >> thread safe, or that cleanup should allow a followup onload again with > > >> a state identical to after dlopen, etc. > > >> > > >> Is there an API document besides the header itself somewhere? > > > > > > There is https://gcc.gnu.org/wiki/whopr/driver > > > I wonder if this can't be moved into more official looking place since > > > it looks like it is internal to GCC WHOPR implementation this way. > > > > We can likely add it here: > > https://gcc.gnu.org/onlinedocs/gccint/LTO.html#LTO > > > > What do you think? I can port it. > > I am bit worried that it is place LLVM developers will not look at > since name "GCC internals" suggests it is internal to GCC. It is > not. However perhaps we can link it from binutils docs, plugin header > and gcc plugin source to make it hopefully obvious enough. Maybe > binutils docs would be an alternative. Yet we want other linkers support > it: I am really happy mold now supports plugin API, but if lld and MacOS > linker had it it would make our life easier. Yep. Or if the docs are not too extensive pasting them into plugin-api.h itself might be not the worst idea either ... basically provide markup that doxygen or similar tools can re-create sth like the wiki page? Richard. > Honza > > > > Martin > > > > > > > > Honza > > >> > > >> Richard. > >
Re: [RESEND][committed v4] RISC-V: Provide `fmin'/`fmax' RTL patterns
On Sat, 14 May 2022, Palmer Dabbelt wrote: > > Hmm, should we? We only support `-misa-spec=<2.2|20190608|20191213>' > > already and this update is fine for r.2.2+. If someone has pre-r.2.2 hw, > > then it's been already unsupported even before this change (as from GCC 11 > > AFAICS). Have I missed anything? > > I have no idea, but either you did or I did... > > IIUC this actually changed for version 2.2 of the F extension, which happened > well after version 2.2 of the ISA manual. I see eb78171 ("F/D extensions to > v2.2") both changing the version and noting the change, with cd20cee > ("FMIN/FMAX now implement minimumNumber/maximumNumber, not minNum/maxNum") > actually making the change and also adding some notation about this being a > draft of version 2.3 of the ISA manual (which was presumably never released as > 2.3 but instead one of those other tags). It also calls this out as F 2.0, > but I'm assuming that's non-canonical because this is a draft. So the only actual concern with F/D 2.0 vs F/D 2.2 is the change of the treatment of signalling NaNs, which is why the new `fmin'/`fmax' patterns are keyed with !HONOR_SNANS so that code produces the same results from ISA r.2.2 onwards. I am sorry if that got buried in the elaborate change description. Once we've got the generic bits available for minimumNumber/maximumNumber we can add suitable `*min'/`*max' patterns that implement these operations and key them with (riscv_isa_spec >= ISA_SPEC_CLASS_20190608). Does this explanation clear your concern? Maciej
[Patch] gcn/t-omp-device: Add 'amdgcn' as 'arch' [PR105602]
While 'vendor' and 'kind' is well defined, 'arch' and 'isa' isn't. When looking at an 'metadirective' testcase (which oddly uses 'arch(amd)'), I noticed that LLVM uses 'arch(amdgcn)' while we use 'gcn', cf. e.g. 'clang/lib/Headers/openmp_wrappers/math.h'. (Side note: we use the target triplet amdgcn-amdhsa and LLVM uses amdgcn-amd-amdhsa.) Given the target triplet, the LLVM choice seems to make more sense; 'gcn' only shows up as directory name (under gcc/config + libgomp/). Thus, I think we have two options: * Either also change to (only) 'amdgcn' - given that the supported arch() values are neither documented not yet widely used. * Or add 'amdgcn' alongside 'gcn' - which is what the attached patch does. Which option do you prefer? Other thoughts? Tobias PS: I think during the GCC 13 development, we should start adding more offload-target documentation, including documenting the used/permitted vendor/arch/isa context selectors. - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 gcn/t-omp-device: Add 'amdgcn' as 'arch' [PR105602] Improve cross-compiler handling. gcc/ChangeLog: PR target/105602 * config/gcn/t-omp-device (arch): Add 'amdgcn' besides existing 'gcn'. diff --git a/gcc/config/gcn/t-omp-device b/gcc/config/gcn/t-omp-device index cd56e2f8a68..e1d9e0d2a1e 100644 --- a/gcc/config/gcn/t-omp-device +++ b/gcc/config/gcn/t-omp-device @@ -1,4 +1,4 @@ omp-device-properties-gcn: $(srcdir)/config/gcn/gcn.cc echo kind: gpu > $@ - echo arch: gcn >> $@ + echo arch: amdgcn gcn >> $@ echo isa: fiji gfx900 gfx906 gfx908 >> $@
Re: [PATCH] lto-plugin: add support for feature detection
On Mon, May 16, 2022 at 11:58 AM Rui Ueyama wrote: > > Version handshaking is doable, but it feels like we are over-designing > an API, given that the real providers of this plugin API are only GCC > and LLVM and the users of the API are BFD ld, gold and mold. It is > unlikely that we'll have dozens of more compilers or linkers in the > near future. So, I personally prefer the following style > > if (!strcmp(plugin_compiler_name, "gcc") && plugin_major >= 12) > > than versioning various API-provided functions. It'll just work and be > easy to understand. > > Besides that, even if we version GCC-provided plugin API functions, we > still need a logic similar to the above to distinguish GCC from LLVM, > as they behave slightly differently in various corner cases. We can't > get rid of the heuristic version detection logic from the linker > anyways. > > So, from the linker's point of view, exporting a compiler name and > version numbers is enough. I agree that this might be convenient enough but the different behaviors are because of inadequate documentation of the API - that's something we should fix. And for this I think a plugin API version might help as we can this way also handle your case of the need of querying whether v2 will be used or not. I wouldn't go to enumerate past API versions - the version handshake hook requirement alone makes that not so useful. Instead I'd require everybody implementing the handshare hook implementing also all other hooks defined at that point in time and set the version to 1. I'd eventually keep version 2 to indicate thread safety (of a part of the API). That said, I'm not opposed to add a "GCC 12.1" provider, maybe the version handshake should be int api_version (int linker, const char **identifier); where the linker specifies the desired API version and passes an identifier identifying itself ("mold 1.0") and it will get back the API version the plugin intends to use plus an identifier of the plugin ("GCC 12.1"). Richard. > > > On Mon, May 16, 2022 at 5:38 PM Martin Liška wrote: > > > > On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote: > > >> > > >> Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > > >> is useful if bugs are discovered in old versions that you by definition > > >> cannot > > >> fix but can apply workarounds to. Note the actual compiler used might > > >> still > > >> differ. Note that still isn't clean API documentation / extension of > > >> the plugin > > >> API itself. As of thread safety we can either add a claim_file_v2_hook > > >> or try to do broader-level versioning of the API with a new api_version > > >> hook which could also specify that with say version 2 the plugin will > > >> not use get_symbols_v2 but only newer, etc. The linker would announce > > >> the API version it likes to use and the plugin would return the > > >> (closest) version > > >> it can handle. A v2 could then also specify that claim_file needs to be > > > > > > Yep, I think having the API version handshake is quite reasonable way to > > > go as the _v2,_v3 symbols seems already getting bit out of hand and we > > > essentially have 3 revisions of API to think of > > > (first adding LDPR_PREVAILING_DEF_IRONLY_EXP, second adding > > > GET_SYMBOLS_V3 and now we plan third adding thread safety and solving > > > the file handler problems) > > > > How should be design such a version handshake? > > > > > > > >> thread safe, or that cleanup should allow a followup onload again with > > >> a state identical to after dlopen, etc. > > >> > > >> Is there an API document besides the header itself somewhere? > > > > > > There is https://gcc.gnu.org/wiki/whopr/driver > > > I wonder if this can't be moved into more official looking place since > > > it looks like it is internal to GCC WHOPR implementation this way. > > > > We can likely add it here: > > https://gcc.gnu.org/onlinedocs/gccint/LTO.html#LTO > > > > What do you think? I can port it. > > > > Martin > > > > > > > > Honza > > >> > > >> Richard. > >
Re: [PATCH] Mitigate -Wmaybe-uninitialized in expmed.cc.
On Mon, May 16, 2022 at 11:18 AM Richard Sandiford via Gcc-patches wrote: > > Martin Liška writes: > > It's the warning I see every time I build GCC: > > > > In file included from /home/marxin/Programming/gcc/gcc/coretypes.h:478, > > from /home/marxin/Programming/gcc/gcc/expmed.cc:26: > > In function ‘poly_uint16 mode_to_bytes(machine_mode)’, > > inlined from ‘typename if_nonpoly::type > > GET_MODE_SIZE(const T&) [with T = scalar_int_mode]’ at > > /home/marxin/Programming/gcc/gcc/machmode.h:647:24, > > inlined from ‘rtx_def* emit_store_flag_1(rtx, rtx_code, rtx, rtx, > > machine_mode, int, int, machine_mode)’ at > > /home/marxin/Programming/gcc/gcc/expmed.cc:5728:56: > > /home/marxin/Programming/gcc/gcc/machmode.h:550:49: warning: ‘*(unsigned > > int*)((char*)&int_mode + offsetof(scalar_int_mode, > > scalar_int_mode::m_mode))’ may be used uninitialized [-Wmaybe-uninitialized] > > 550 | ? mode_size_inline (mode) : mode_size[mode]); > > | ^~~~ > > /home/marxin/Programming/gcc/gcc/expmed.cc: In function ‘rtx_def* > > emit_store_flag_1(rtx, rtx_code, rtx, rtx, machine_mode, int, int, > > machine_mode)’: > > /home/marxin/Programming/gcc/gcc/expmed.cc:5657:19: note: ‘*(unsigned > > int*)((char*)&int_mode + offsetof(scalar_int_mode, > > scalar_int_mode::m_mode))’ was declared here > > 5657 | scalar_int_mode int_mode; > > | ^~~~ > > > > Can we please mitigate it? > > > > gcc/ChangeLog: > > > > * expmed.cc (emit_store_flag_1): Mitigate -Wmaybe-uninitialized > > warning. > > Not a strong objection, but TBH I'd rather we didn't work around false > positives like this. It only seems to happen with your host compiler though? The set of # These files are to have specific diagnostics suppressed, or are not to # be subject to -Werror: # flex output may yield harmless "no previous prototype" warnings build/gengtype-lex.o-warn = -Wno-error gengtype-lex.o-warn = -Wno-error libgcov-util.o-warn = -Wno-error libgcov-driver-tool.o-warn = -Wno-error libgcov-merge-tool.o-warn = -Wno-error gimple-match.o-warn = -Wno-unused generic-match.o-warn = -Wno-unused dfp.o-warn = -Wno-strict-aliasing doesn't include expmed.o at least. > Richard > > > --- > > gcc/expmed.cc | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc > > index 41738c1efe9..f23d63471ea 100644 > > --- a/gcc/expmed.cc > > +++ b/gcc/expmed.cc > > @@ -5654,7 +5654,7 @@ emit_store_flag_1 (rtx target, enum rtx_code code, > > rtx op0, rtx op1, > > > >/* If we are comparing a double-word integer with zero or -1, we can > > convert the comparison into one involving a single word. */ > > - scalar_int_mode int_mode; > > + scalar_int_mode int_mode = {}; > >if (is_int_mode (mode, &int_mode) > >&& GET_MODE_BITSIZE (int_mode) == BITS_PER_WORD * 2 > >&& (!MEM_P (op0) || ! MEM_VOLATILE_P (op0)))
Re: [PATCH v2 08/10] testsuite: Add C++ unwinding tests with Decimal Floating-Point
Christophe Lyon via Gcc-patches writes: > These tests exercise exception handling with Decimal Floating-Point > type. > > dfp-1.C and dfp-2.C check that thrown objects of such types are > properly caught, whether when using C++ classes (decimalXX) or via GCC > mode attributes. > > dfp-saves-aarch64.C checks that such objects are properly restored, > and has to use the mode attribute trick because objects of decimalXX > class type cannot be assigned to a register variable. > > 2022-05-03 Christophe Lyon > > gcc/testsuite/ > * g++.dg/eh/dfp-1.C: New test. > * g++.dg/eh/dfp-2.C: New test. > * g++.dg/eh/dfp-saves-aarch64.C: New test. OK, thanks. Richard > --- > gcc/testsuite/g++.dg/eh/dfp-1.C | 54 + > gcc/testsuite/g++.dg/eh/dfp-2.C | 54 + > gcc/testsuite/g++.dg/eh/dfp-saves-aarch64.C | 49 +++ > 3 files changed, 157 insertions(+) > create mode 100644 gcc/testsuite/g++.dg/eh/dfp-1.C > create mode 100644 gcc/testsuite/g++.dg/eh/dfp-2.C > create mode 100644 gcc/testsuite/g++.dg/eh/dfp-saves-aarch64.C > > diff --git a/gcc/testsuite/g++.dg/eh/dfp-1.C b/gcc/testsuite/g++.dg/eh/dfp-1.C > new file mode 100644 > index 000..b0da13a4cc5 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/eh/dfp-1.C > @@ -0,0 +1,54 @@ > +// { dg-do run } > +// { dg-require-effective-target dfp } > + > +extern "C" void abort (); > + > +#include > + > +using namespace std::decimal; > + > +int > +foo (double fp) > +{ > + if (fp < 32.0) > +throw (decimal32)32; > + if (fp < 64.0) > +throw (decimal64)64; > + if (fp < 128.0) > +throw (decimal128)128; > + return 0; > +} > + > +int bar (double fp) > +{ > + try > +{ > + foo (fp); > + abort (); > +} > + catch (decimal32 df) > +{ > + if (df != (decimal32)32) > + abort (); > +} > + catch (decimal64 dd) > +{ > + if (dd != (decimal64)64) > + abort (); > +} > + catch (decimal128 dl) > +{ > + if (dl != (decimal128)128) > + abort (); > +} > + return 0; > +} > + > +int > +main () > +{ > + bar (10.0); > + bar (20.0); > + bar (100.0); > + return 0; > +} > diff --git a/gcc/testsuite/g++.dg/eh/dfp-2.C b/gcc/testsuite/g++.dg/eh/dfp-2.C > new file mode 100644 > index 000..aff0e03d1d9 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/eh/dfp-2.C > @@ -0,0 +1,54 @@ > +// { dg-do run } > +// { dg-require-effective-target dfp } > + > +extern "C" void abort (); > + > +typedef float dec32 __attribute__((mode(SD))); > +typedef float dec64 __attribute__((mode(DD))); > +typedef float dec128 __attribute__((mode(TD))); > + > +int > +foo (double fp) > +{ > + if (fp < 32.0) > +throw (dec32)32; > + if (fp < 64.0) > +throw (dec64)64; > + if (fp < 128.0) > +throw (dec128)128; > + return 0; > +} > + > +int bar (double fp) > +{ > + try > +{ > + foo (fp); > + abort (); > +} > + catch (dec32 df) > +{ > + if (df != (dec32)32) > + abort (); > +} > + catch (dec64 dd) > +{ > + if (dd != (dec64)64) > + abort (); > +} > + catch (dec128 dl) > +{ > + if (dl != (dec128)128) > + abort (); > +} > + return 0; > +} > + > +int > +main () > +{ > + bar (10.0); > + bar (20.0); > + bar (100.0); > + return 0; > +} > diff --git a/gcc/testsuite/g++.dg/eh/dfp-saves-aarch64.C > b/gcc/testsuite/g++.dg/eh/dfp-saves-aarch64.C > new file mode 100644 > index 000..06203410500 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/eh/dfp-saves-aarch64.C > @@ -0,0 +1,49 @@ > +// { dg-do run { target aarch64*-*-* } } > +// { dg-require-effective-target dfp } > + > +/* Test unwinding of AArch64 register saves. */ > +/* We cannot use #include because it defines > + decimal* types as classes, which cannot be assigned to register > + variables. Hence the use the mode attribute trick. */ > + > +#ifdef __aarch64__ > + > +typedef float dec64 __attribute__((mode(DD))); > + > +extern "C" void abort (void); > +extern "C" void exit (int); > + > +void > +foo (void) > +{ > + register dec64 v10 asm("v10") = 0; > + register dec64 v11 asm("v11") = 1; > + register dec64 v12 asm("v12") = 2; > + register dec64 v13 asm("v13") = 3; > + asm volatile ("" : "+w" (v10), "+w" (v11), "+w" (v12), "+w" (v13)); > + throw ""; > +} > + > +int > +main (void) > +{ > + register dec64 v10 asm("v10") = 10; > + register dec64 v11 asm("v11") = 11; > + register dec64 v12 asm("v12") = 12; > + register dec64 v13 asm("v13") = 13; > + asm volatile ("" : "+w" (v10), "+w" (v11), "+w" (v12), "+w" (v13)); > + try { > +foo (); > + } catch (...) { > +asm volatile ("" : "+w" (v10), "+w" (v11), "+w" (v12), "+w" (v13)); > +if (v10 != 10 || v11 != 11 || v12 != 12 || v13 != 13) > + abort (); > + } > + exit (0); > +} > +#else > +int > +main (void) > +{ > +} > +#endif
[PATCH] Move code_helper to tree.h
tree.h already contains combined_fn handling at the top and moving code_helper away from gimple-match.h makes improving the gimple_build API easier. Bootstrapped on x86_64-unknown-linux-gnu. Will push this if there are no comments when I've finished enhancing the gimple_build API (and moving pieces out from gimple-match.h). Richard. 2022-05-16 Richard Biener * gimple-match.h (code_helper): Move class ... * tree.h (code_helper): ... here. --- gcc/gimple-match.h | 49 -- gcc/tree.h | 49 ++ 2 files changed, 49 insertions(+), 49 deletions(-) diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h index d7b0b676059..e96c31ed09f 100644 --- a/gcc/gimple-match.h +++ b/gcc/gimple-match.h @@ -23,55 +23,6 @@ along with GCC; see the file COPYING3. If not see #define GCC_GIMPLE_MATCH_H -/* Helper to transparently allow tree codes and builtin function codes - exist in one storage entity. */ -class code_helper -{ -public: - code_helper () {} - code_helper (tree_code code) : rep ((int) code) {} - code_helper (combined_fn fn) : rep (-(int) fn) {} - code_helper (internal_fn fn) : rep (-(int) as_combined_fn (fn)) {} - explicit operator tree_code () const { return (tree_code) rep; } - explicit operator combined_fn () const { return (combined_fn) -rep; } - explicit operator internal_fn () const; - explicit operator built_in_function () const; - bool is_tree_code () const { return rep > 0; } - bool is_fn_code () const { return rep < 0; } - bool is_internal_fn () const; - bool is_builtin_fn () const; - int get_rep () const { return rep; } - bool operator== (const code_helper &other) { return rep == other.rep; } - bool operator!= (const code_helper &other) { return rep != other.rep; } - bool operator== (tree_code c) { return rep == code_helper (c).rep; } - bool operator!= (tree_code c) { return rep != code_helper (c).rep; } - -private: - int rep; -}; - -inline code_helper::operator internal_fn () const -{ - return as_internal_fn (combined_fn (*this)); -} - -inline code_helper::operator built_in_function () const -{ - return as_builtin_fn (combined_fn (*this)); -} - -inline bool -code_helper::is_internal_fn () const -{ - return is_fn_code () && internal_fn_p (combined_fn (*this)); -} - -inline bool -code_helper::is_builtin_fn () const -{ - return is_fn_code () && builtin_fn_p (combined_fn (*this)); -} - /* Represents the condition under which an operation should happen, and the value to use otherwise. The condition applies elementwise (as for VEC_COND_EXPR) if the values are vectors. */ diff --git a/gcc/tree.h b/gcc/tree.h index 5f4b03dc5df..c92c5bf344b 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -75,6 +75,55 @@ as_internal_fn (combined_fn code) return internal_fn (int (code) - int (END_BUILTINS)); } +/* Helper to transparently allow tree codes and builtin function codes + exist in one storage entity. */ +class code_helper +{ +public: + code_helper () {} + code_helper (tree_code code) : rep ((int) code) {} + code_helper (combined_fn fn) : rep (-(int) fn) {} + code_helper (internal_fn fn) : rep (-(int) as_combined_fn (fn)) {} + explicit operator tree_code () const { return (tree_code) rep; } + explicit operator combined_fn () const { return (combined_fn) -rep; } + explicit operator internal_fn () const; + explicit operator built_in_function () const; + bool is_tree_code () const { return rep > 0; } + bool is_fn_code () const { return rep < 0; } + bool is_internal_fn () const; + bool is_builtin_fn () const; + int get_rep () const { return rep; } + bool operator== (const code_helper &other) { return rep == other.rep; } + bool operator!= (const code_helper &other) { return rep != other.rep; } + bool operator== (tree_code c) { return rep == code_helper (c).rep; } + bool operator!= (tree_code c) { return rep != code_helper (c).rep; } + +private: + int rep; +}; + +inline code_helper::operator internal_fn () const +{ + return as_internal_fn (combined_fn (*this)); +} + +inline code_helper::operator built_in_function () const +{ + return as_builtin_fn (combined_fn (*this)); +} + +inline bool +code_helper::is_internal_fn () const +{ + return is_fn_code () && internal_fn_p (combined_fn (*this)); +} + +inline bool +code_helper::is_builtin_fn () const +{ + return is_fn_code () && builtin_fn_p (combined_fn (*this)); +} + /* Macros for initializing `tree_contains_struct'. */ #define MARK_TS_BASE(C)\ (tree_contains_struct[C][TS_BASE] = true) -- 2.35.3
Re: [PATCH] lto-plugin: add support for feature detection
On Mon, May 16, 2022 at 6:28 PM Richard Biener wrote: > > On Mon, May 16, 2022 at 11:58 AM Rui Ueyama wrote: > > > > Version handshaking is doable, but it feels like we are over-designing > > an API, given that the real providers of this plugin API are only GCC > > and LLVM and the users of the API are BFD ld, gold and mold. It is > > unlikely that we'll have dozens of more compilers or linkers in the > > near future. So, I personally prefer the following style > > > > if (!strcmp(plugin_compiler_name, "gcc") && plugin_major >= 12) > > > > than versioning various API-provided functions. It'll just work and be > > easy to understand. > > > > Besides that, even if we version GCC-provided plugin API functions, we > > still need a logic similar to the above to distinguish GCC from LLVM, > > as they behave slightly differently in various corner cases. We can't > > get rid of the heuristic version detection logic from the linker > > anyways. > > > > So, from the linker's point of view, exporting a compiler name and > > version numbers is enough. > > I agree that this might be convenient enough but the different behaviors > are because of inadequate documentation of the API - that's something > we should fix. And for this I think a plugin API version might help > as we can this way also handle your case of the need of querying > whether v2 will be used or not. > > I wouldn't go to enumerate past API versions - the version handshake > hook requirement alone makes that not so useful. Instead I'd require > everybody implementing the handshare hook implementing also all > other hooks defined at that point in time and set the version to 1. > > I'd eventually keep version 2 to indicate thread safety (of a part of the > API). > > That said, I'm not opposed to add a "GCC 12.1" provider, maybe the > version handshake should be > > int api_version (int linker, const char **identifier); > > where the linker specifies the desired API version and passes an > identifier identifying itself ("mold 1.0") and it will get back the API > version the plugin intends to use plus an identifier of the plugin > ("GCC 12.1"). void api_version(char *linker_identifier, const char **compiler_identifier, int *compiler_version); might be a bit better, where compiler_identifier is something like "gcc" or "clang" and comipler_version is 12001000 for 12.1.0. In the longer term, it feels to me that gcc should migrate to LLVM's libLTO-compatible API (https://llvm.org/docs/LinkTimeOptimization.html). It has resolved various problems of GCC's plugin API. A few notable examples are: - libLTO API separates a function to read a symbol table from an IR object from adding that object to the LTO final result - libLTO API functions don't depend on a global state of the plugin API, while GCC LTO plugin saves its internal state to a global variable (so we can't have two linker instances in a single process with GCC LTO, for example) - libLTO API doesn't use callbacks. It looks much more straightforward than GCC's plugin API. > Richard. > > > > > > > On Mon, May 16, 2022 at 5:38 PM Martin Liška wrote: > > > > > > On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote: > > > >> > > > >> Sure having a 'plugin was compiled from sources of the GCC N.M > > > >> compiler' > > > >> is useful if bugs are discovered in old versions that you by > > > >> definition cannot > > > >> fix but can apply workarounds to. Note the actual compiler used might > > > >> still > > > >> differ. Note that still isn't clean API documentation / extension of > > > >> the plugin > > > >> API itself. As of thread safety we can either add a claim_file_v2_hook > > > >> or try to do broader-level versioning of the API with a new api_version > > > >> hook which could also specify that with say version 2 the plugin will > > > >> not use get_symbols_v2 but only newer, etc. The linker would announce > > > >> the API version it likes to use and the plugin would return the > > > >> (closest) version > > > >> it can handle. A v2 could then also specify that claim_file needs to > > > >> be > > > > > > > > Yep, I think having the API version handshake is quite reasonable way to > > > > go as the _v2,_v3 symbols seems already getting bit out of hand and we > > > > essentially have 3 revisions of API to think of > > > > (first adding LDPR_PREVAILING_DEF_IRONLY_EXP, second adding > > > > GET_SYMBOLS_V3 and now we plan third adding thread safety and solving > > > > the file handler problems) > > > > > > How should be design such a version handshake? > > > > > > > > > > >> thread safe, or that cleanup should allow a followup onload again with > > > >> a state identical to after dlopen, etc. > > > >> > > > >> Is there an API document besides the header itself somewhere? > > > > > > > > There is https://gcc.gnu.org/wiki/whopr/driver > > > > I wonder if this can't be moved into more official looking place since > > > > it looks like it is internal to GCC WHOPR implementation this way. >
Re: [PATCH] Move code_helper to tree.h
Richard Biener writes: > tree.h already contains combined_fn handling at the top and moving > code_helper away from gimple-match.h makes improving the gimple_build > API easier. Nice. Thanks for doing this. Richard > > Bootstrapped on x86_64-unknown-linux-gnu. > > Will push this if there are no comments when I've finished > enhancing the gimple_build API (and moving pieces out from > gimple-match.h). > > Richard. > > 2022-05-16 Richard Biener > > * gimple-match.h (code_helper): Move class ... > * tree.h (code_helper): ... here. > --- > gcc/gimple-match.h | 49 -- > gcc/tree.h | 49 ++ > 2 files changed, 49 insertions(+), 49 deletions(-) > > diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h > index d7b0b676059..e96c31ed09f 100644 > --- a/gcc/gimple-match.h > +++ b/gcc/gimple-match.h > @@ -23,55 +23,6 @@ along with GCC; see the file COPYING3. If not see > #define GCC_GIMPLE_MATCH_H > > > -/* Helper to transparently allow tree codes and builtin function codes > - exist in one storage entity. */ > -class code_helper > -{ > -public: > - code_helper () {} > - code_helper (tree_code code) : rep ((int) code) {} > - code_helper (combined_fn fn) : rep (-(int) fn) {} > - code_helper (internal_fn fn) : rep (-(int) as_combined_fn (fn)) {} > - explicit operator tree_code () const { return (tree_code) rep; } > - explicit operator combined_fn () const { return (combined_fn) -rep; } > - explicit operator internal_fn () const; > - explicit operator built_in_function () const; > - bool is_tree_code () const { return rep > 0; } > - bool is_fn_code () const { return rep < 0; } > - bool is_internal_fn () const; > - bool is_builtin_fn () const; > - int get_rep () const { return rep; } > - bool operator== (const code_helper &other) { return rep == other.rep; } > - bool operator!= (const code_helper &other) { return rep != other.rep; } > - bool operator== (tree_code c) { return rep == code_helper (c).rep; } > - bool operator!= (tree_code c) { return rep != code_helper (c).rep; } > - > -private: > - int rep; > -}; > - > -inline code_helper::operator internal_fn () const > -{ > - return as_internal_fn (combined_fn (*this)); > -} > - > -inline code_helper::operator built_in_function () const > -{ > - return as_builtin_fn (combined_fn (*this)); > -} > - > -inline bool > -code_helper::is_internal_fn () const > -{ > - return is_fn_code () && internal_fn_p (combined_fn (*this)); > -} > - > -inline bool > -code_helper::is_builtin_fn () const > -{ > - return is_fn_code () && builtin_fn_p (combined_fn (*this)); > -} > - > /* Represents the condition under which an operation should happen, > and the value to use otherwise. The condition applies elementwise > (as for VEC_COND_EXPR) if the values are vectors. */ > diff --git a/gcc/tree.h b/gcc/tree.h > index 5f4b03dc5df..c92c5bf344b 100644 > --- a/gcc/tree.h > +++ b/gcc/tree.h > @@ -75,6 +75,55 @@ as_internal_fn (combined_fn code) >return internal_fn (int (code) - int (END_BUILTINS)); > } > > +/* Helper to transparently allow tree codes and builtin function codes > + exist in one storage entity. */ > +class code_helper > +{ > +public: > + code_helper () {} > + code_helper (tree_code code) : rep ((int) code) {} > + code_helper (combined_fn fn) : rep (-(int) fn) {} > + code_helper (internal_fn fn) : rep (-(int) as_combined_fn (fn)) {} > + explicit operator tree_code () const { return (tree_code) rep; } > + explicit operator combined_fn () const { return (combined_fn) -rep; } > + explicit operator internal_fn () const; > + explicit operator built_in_function () const; > + bool is_tree_code () const { return rep > 0; } > + bool is_fn_code () const { return rep < 0; } > + bool is_internal_fn () const; > + bool is_builtin_fn () const; > + int get_rep () const { return rep; } > + bool operator== (const code_helper &other) { return rep == other.rep; } > + bool operator!= (const code_helper &other) { return rep != other.rep; } > + bool operator== (tree_code c) { return rep == code_helper (c).rep; } > + bool operator!= (tree_code c) { return rep != code_helper (c).rep; } > + > +private: > + int rep; > +}; > + > +inline code_helper::operator internal_fn () const > +{ > + return as_internal_fn (combined_fn (*this)); > +} > + > +inline code_helper::operator built_in_function () const > +{ > + return as_builtin_fn (combined_fn (*this)); > +} > + > +inline bool > +code_helper::is_internal_fn () const > +{ > + return is_fn_code () && internal_fn_p (combined_fn (*this)); > +} > + > +inline bool > +code_helper::is_builtin_fn () const > +{ > + return is_fn_code () && builtin_fn_p (combined_fn (*this)); > +} > + > /* Macros for initializing `tree_contains_struct'. */ > #define MARK_TS_BASE(C) \ >(tree_contains_struct[C][TS_BASE] = true)
Graphite Patch Ping
Hi all, I would like to ping the following patches from Frederik's "[PATCH 00/40] OpenACC "kernels" Improvements" series https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586901.html patch set thread link: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/thread.html#586901 (A) Simpler patches [PATCH 15/40] graphite: Extend SCoP detection dump output https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586915.html "Extend dump output to make understanding why Graphite rejects to include a loop in a SCoP easier (for GCC developers)." [PATCH 16/40] graphite: Rename isl_id_for_ssa_name" https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586917.html (consistency renaming of static function/var name) [PATCH 17/40] graphite: Fix minor mistakes in comments https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586918.html Comment-typo fixes (B) User-visible change (new flag) - prep + actual new-flag patch [PATCH 18/40] Move compute_alias_check_pairs to tree-data-ref.c https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586919.html "Move this function from tree-loop-distribution.c to tree-data-ref.c and make it non-static to enable its use from other parts of GCC." [PATCH 19/40] graphite: Add runtime alias checking https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586920.html 'Graphite rejects a SCoP if it contains a pair of data references for which it cannot determine statically if they may alias. This happens very often, for instance in C code which does not use explicit "restrict". This commit adds the possibility to analyze a SCoP nevertheless and perform an alias check at runtime.' Thanks, Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Re: [Patch] gcn/t-omp-device: Add 'amdgcn' as 'arch' [PR105602]
On 16/05/2022 11:28, Tobias Burnus wrote: While 'vendor' and 'kind' is well defined, 'arch' and 'isa' isn't. When looking at an 'metadirective' testcase (which oddly uses 'arch(amd)'), I noticed that LLVM uses 'arch(amdgcn)' while we use 'gcn', cf. e.g. 'clang/lib/Headers/openmp_wrappers/math.h'. (Side note: we use the target triplet amdgcn-amdhsa and LLVM uses amdgcn-amd-amdhsa.) Given the target triplet, the LLVM choice seems to make more sense; 'gcn' only shows up as directory name (under gcc/config + libgomp/). Thus, I think we have two options: * Either also change to (only) 'amdgcn' - given that the supported arch() values are neither documented not yet widely used. * Or add 'amdgcn' alongside 'gcn' - which is what the attached patch does. Which option do you prefer? Other thoughts? As long as this remains backwards compatible and fixes your problem, accepting everything that might occur is fine with me. I don't think trying to pick one is realistic; whatever we choose, there'll always be some other toolchain doing something different. Andrew
Re: [PATCH] [12/11/10] Fix invalid format warnings on Windows
On 5/11/22 18:43, Joseph Myers wrote: There are various coding style issues in the patch; at least missing space before '(' and '&&' at end of line (should be at start of line). It will also need to be updated for .c files having been renamed to .cc in the GCC source tree. Thanks, I've fixed the formatting issue and updated the patch for master, 12, 11 and 10. In addition to the renaming of .c to .cc files, there was also a change in the first argument of check_function_format. I've also removed a duplicated check for whether fndecl was null and fixed indentation. I've updated the patches for each version to also note that in c51f1e7427e6a5ae2a6d82b5a790df77a3adc99a gcc: Add `ll` and `L` length modifiers for `ms_printf` the ms_printf format has been taught to support (not warn about) printing the 64-bit integers using the "%ll" modifier (currently GCC 11 and newer). However, I assume there may be other differences between the ms_printf and gnu_printf formats people might run into, so it still makes sense to fix this not only in GCC 10, but also in newer versions. Furthermore, the attached patch is still needed (GCC 11, GCC 12, master) to get rid of duplicate warnings for an incorrect format (e.g. "%lu" used to print "unsigned long long"), when both ms_printf and gnu_printf formats are violated (PR 92292). Best Tomas From 7d72c9cc395df56ce044f4a0b0b77c151bfe2bf6 Mon Sep 17 00:00:00 2001 From: Tomas Kalibera Date: Mon, 16 May 2022 06:43:09 -0400 Subject: [PATCH] c-family: Let stdio.h override built in printf format [PR95130,PR92292] Mingw32 targets use ms_printf format for printf, but mingw-w64 when configured for UCRT uses gnu_format (via stdio.h). GCC checks both formats, which means that one gets a warning twice if the format string violates both formats: printf("Hello %lu\n", (long long unsigned) x); Fixed by disabling the built in format in case there are additional ones. This fixes also prevents issues if the print formats disagree. In the past it was the case when printing 64-bit integers, but GCC ms_printf format since c51f1e7427e6a5ae2a6d82b5a790df77a3adc99 supports %llu. gcc/c-family/ChangeLog: PR c/95130 PR c/92292 * c-format.cc (check_function_format): For builtin functions with a built in format and at least one more, do not check the first one. --- gcc/c-family/c-format.cc | 29 - 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/gcc/c-family/c-format.cc b/gcc/c-family/c-format.cc index ea57fde801c..7590b467e81 100644 --- a/gcc/c-family/c-format.cc +++ b/gcc/c-family/c-format.cc @@ -1157,9 +1157,10 @@ void check_function_format (const_tree fn, tree attrs, int nargs, tree *argarray, vec *arglocs) { - tree a; + tree a, aa; tree atname = get_identifier ("format"); + bool skipped_default_format = false; /* See if this function has any format attributes. */ for (a = attrs; a; a = TREE_CHAIN (a)) @@ -1170,6 +1171,32 @@ check_function_format (const_tree fn, tree attrs, int nargs, function_format_info info; decode_format_attr (fn, atname, TREE_VALUE (a), &info, /*validated=*/true); + + /* Mingw32 targets have traditionally used ms_printf format for the + printf function, and this format is built in GCC. But nowadays, + if mingw-w64 is configured to target UCRT, the printf function + uses the gnu_printf format (specified in the stdio.h header). This + causes GCC to check both formats, which means that GCC would + warn twice about the same issue when both formats are violated, + e.g. for %lu used to print long long unsigned. + + Hence, if there are multiple format specifiers, we skip the first + one. See PR 95130 (but note that GCC ms_printf already supports + %llu) and PR 92292. */ + + if (!skipped_default_format && fn && TREE_CODE (fn) == FUNCTION_DECL) + { + for(aa = TREE_CHAIN (a); aa; aa = TREE_CHAIN (aa)) + if (is_attribute_p ("format", get_attribute_name (aa)) + && fndecl_built_in_p (fn, BUILT_IN_NORMAL)) + { + skipped_default_format = true; + break; + } + if (skipped_default_format) + continue; + } + if (warn_format) { /* FIXME: Rewrite all the internal functions in this file -- 2.25.1 From d64309760bc7f61db10a7f28baf3308d871ef1ed Mon Sep 17 00:00:00 2001 From: Tomas Kalibera Date: Mon, 16 May 2022 06:16:55 -0400 Subject: [PATCH] c-family: Let stdio.h override built in printf format [PR95130,PR92292] Mingw32 targets use ms_printf format for printf, but mingw-w64 when configured for UCRT uses gnu_format (via stdio.h). GCC checks both formats, which means that one gets a warning twice if the format string violates both formats: printf("Hello %lu\n", (long long unsigned) x); Fixed by disabling the built in format in case there are additional ones. This fixes also prevents issues if the print formats disagree. In the past it was the case when printi
Re: Graphite Patch Ping
On Mon, 16 May 2022, Tobias Burnus wrote: > Hi all, > > I would like to ping the following patches from Frederik's > "[PATCH 00/40] OpenACC "kernels" Improvements" series > https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586901.html > patch set thread link: > https://gcc.gnu.org/pipermail/gcc-patches/2021-December/thread.html#586901 Can we get them re-based to after the .c -> .cc renaming please? Richard. > (A) Simpler patches > > [PATCH 15/40] graphite: Extend SCoP detection dump output >https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586915.html >"Extend dump output to make understanding why Graphite rejects to > include a loop in a SCoP easier (for GCC developers)." > > [PATCH 16/40] graphite: Rename isl_id_for_ssa_name" > https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586917.html > (consistency renaming of static function/var name) > > [PATCH 17/40] graphite: Fix minor mistakes in comments > https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586918.html > Comment-typo fixes > > > (B) User-visible change (new flag) - prep + actual new-flag patch > > [PATCH 18/40] Move compute_alias_check_pairs to tree-data-ref.c > https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586919.html > "Move this function from tree-loop-distribution.c to tree-data-ref.c >and make it non-static to enable its use from other parts of GCC." > > [PATCH 19/40] graphite: Add runtime alias checking > https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586920.html > 'Graphite rejects a SCoP if it contains a pair of data references for >which it cannot determine statically if they may alias. This happens >very often, for instance in C code which does not use explicit >"restrict". This commit adds the possibility to analyze a SCoP >nevertheless and perform an alias check at runtime.' > > Thanks, > > Tobias > > - > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstra?e 201, 80634 > M?nchen; Gesellschaft mit beschr?nkter Haftung; Gesch?ftsf?hrer: Thomas > Heurung, Frank Th?rauf; Sitz der Gesellschaft: M?nchen; Registergericht > M?nchen, HRB 106955 > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
Re: [PATCH 1/3]middle-end: Add the ability to let the target decide the method of argument promotions.
Tamar Christina writes: > Hi All, > > Some targets require function parameters to be promoted to a different > type on expand time because the target may not have native instructions > to work on such types. As an example the AArch64 port does not have native > instructions working on integer 8- or 16-bit values. As such it promotes > every parameter of these types to 32-bits. This doesn't seem specific to parameters though. It applies to any 8- or 16-bit variable. E.g.: #include uint8_t foo(uint32_t x, uint32_t y) { uint8_t z = x != 0 ? x : y; return z + 1; } generates: foo: cmp w0, 0 and w1, w1, 255 and w0, w0, 255 cselw0, w1, w0, eq add w0, w0, 1 ret So I think the new behaviour is really a modification of the PROMOTE_MODE behaviour rather than the PROMOTE_FUNCTION_MODE behaviour. FWIW, I agree with Richard that it would be better not to add a new hook. I think we're really making PROMOTE_MODE choose between SIGN_EXTEND, ZERO_EXTEND or SUBREG (what LLVM would call “any extend”) rather than the current SIGN_EXTEND vs. ZERO_EXTEND choice. Thanks, Richard
[PATCH] Finish gimple_build API enhancement
This finishes the remaining parts of the gimple_build API enhancement, converting the remaining workers to receive a gimple_stmt_iterator, direction and update argument. It also moves the code_helper receiving functions from gimple-match.h to gimple-fold.h. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. 2022-05-16 Richard Biener * gimple-match.h (gimple_build): Move code_helper overloads ... * gimple-fold.h (gimple_build): ... here. (gimple_build): Transition to new worker API. Provide overloads from sequence-based API. (gimple_convert): Likewise. (gimple_convert_to_ptrofftype): Likewise. (gimple_build_vector_from_val): Likewise. (gimple_build_vector): Likewise. (gimple_build_round_up): Likewise. * gimple-fold.cc (gimple_build_insert_seq): New helper. (gimple_build): Use it. Transition combined_fn and code_helper API parts. (gimple_convert): Transition to new worker API. (gimple_convert_to_ptrofftype): Likewise. (gimple_build_vector_from_val): Likewise. (gimple_build_vector): Likewise. (gimple_build_round_up): Likewise. --- gcc/gimple-fold.cc | 232 + gcc/gimple-fold.h | 145 gcc/gimple-match.h | 26 - 3 files changed, 254 insertions(+), 149 deletions(-) diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 4d4ef43cb36..f61bc87da63 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -8665,6 +8665,29 @@ gimple_build_valueize (tree op) return NULL_TREE; } +/* Helper for gimple_build to perform the final insertion of stmts on SEQ. */ + +static inline void +gimple_build_insert_seq (gimple_stmt_iterator *gsi, +bool before, gsi_iterator_update update, +gimple_seq seq) +{ + if (before) +{ + if (gsi->bb) + gsi_insert_seq_before (gsi, seq, update); + else + gsi_insert_seq_before_without_update (gsi, seq, update); +} + else +{ + if (gsi->bb) + gsi_insert_seq_after (gsi, seq, update); + else + gsi_insert_seq_after_without_update (gsi, seq, update); +} +} + /* Build the expression CODE OP0 of type TYPE with location LOC, simplifying it first if possible. Returns the built expression value and inserts statements possibly defining it @@ -8697,27 +8720,14 @@ gimple_build (gimple_stmt_iterator *gsi, gimple_set_location (stmt, loc); gimple_seq_add_stmt_without_update (&seq, stmt); } - if (before) -{ - if (gsi->bb) - gsi_insert_seq_before (gsi, seq, update); - else - gsi_insert_seq_before_without_update (gsi, seq, update); -} - else -{ - if (gsi->bb) - gsi_insert_seq_after (gsi, seq, update); - else - gsi_insert_seq_after_without_update (gsi, seq, update); -} + gimple_build_insert_seq (gsi, before, update, seq); return res; } /* Build the expression OP0 CODE OP1 of type TYPE with location LOC, simplifying it first if possible. Returns the built - expression value and appends statements possibly defining it - to SEQ. */ + expression value inserting any new statements at GSI honoring BEFORE + and UPDATE. */ tree gimple_build (gimple_stmt_iterator *gsi, @@ -8736,27 +8746,14 @@ gimple_build (gimple_stmt_iterator *gsi, gimple_set_location (stmt, loc); gimple_seq_add_stmt_without_update (&seq, stmt); } - if (before) -{ - if (gsi->bb) - gsi_insert_seq_before (gsi, seq, update); - else - gsi_insert_seq_before_without_update (gsi, seq, update); -} - else -{ - if (gsi->bb) - gsi_insert_seq_after (gsi, seq, update); - else - gsi_insert_seq_after_without_update (gsi, seq, update); -} + gimple_build_insert_seq (gsi, before, update, seq); return res; } /* Build the expression (CODE OP0 OP1 OP2) of type TYPE with location LOC, simplifying it first if possible. Returns the built - expression value and appends statements possibly defining it - to SEQ. */ + expression value inserting any new statements at GSI honoring BEFORE + and UPDATE. */ tree gimple_build (gimple_stmt_iterator *gsi, @@ -8781,31 +8778,22 @@ gimple_build (gimple_stmt_iterator *gsi, gimple_set_location (stmt, loc); gimple_seq_add_stmt_without_update (&seq, stmt); } - if (before) -{ - if (gsi->bb) - gsi_insert_seq_before (gsi, seq, update); - else - gsi_insert_seq_before_without_update (gsi, seq, update); -} - else -{ - if (gsi->bb) - gsi_insert_seq_after (gsi, seq, update); - else - gsi_insert_seq_after_without_update (gsi, seq, update); -} + gimple_build_insert_seq (gsi, before, update, seq); return res; } /* Build the call FN () with a result of type TYPE (or no result if TYPE is void) with a loc
Re: [AArch64] PR105162: emit barrier for __sync and __atomic builtins on CPUs without LSE
"Pop, Sebastian" writes: > Please see attached the patch back-ported to branches 12, 11, 10, and 9. > Tested on aarch64-linux with bootstrap and regression test. > Ok to commit to the GCC active branches? OK, thanks. Only very safe patches are supposed to be going into GCC 9 at this stage, and I guess this one is a bit on the edge. But I think it's worth applying anyway because it's fixing a non-determinstic wrong-code regression. Richard
RE: [PATCH 1/3]middle-end: Add the ability to let the target decide the method of argument promotions.
> -Original Message- > From: Richard Sandiford > Sent: Monday, May 16, 2022 12:36 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; > jeffreya...@gmail.com > Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the target decide > the method of argument promotions. > > Tamar Christina writes: > > Hi All, > > > > Some targets require function parameters to be promoted to a different > > type on expand time because the target may not have native > > instructions to work on such types. As an example the AArch64 port > > does not have native instructions working on integer 8- or 16-bit > > values. As such it promotes every parameter of these types to 32-bits. > > This doesn't seem specific to parameters though. It applies to any > 8- or 16-bit variable. E.g.: > > #include > uint8_t foo(uint32_t x, uint32_t y) { > uint8_t z = x != 0 ? x : y; > return z + 1; > } > > generates: > > foo: > cmp w0, 0 > and w1, w1, 255 > and w0, w0, 255 > cselw0, w1, w0, eq > add w0, w0, 1 > ret > > So I think the new behaviour is really a modification of the PROMOTE_MODE > behaviour rather than the PROMOTE_FUNCTION_MODE behaviour. > > FWIW, I agree with Richard that it would be better not to add a new hook. > I think we're really making PROMOTE_MODE choose between > SIGN_EXTEND, ZERO_EXTEND or SUBREG (what LLVM would call “any > extend”) rather than the current SIGN_EXTEND vs. ZERO_EXTEND choice. Ah, I hadn't realized this also applied to locals.. ok I can modify PROMOTE_MODE then, but I also need the actual SSA_NAME and not just the type so will have to pass this along. From a practical point of view.. the actual hook however is implemented by 34 targets, would I need to CC maintainers for each of them or would global maintainer approval suffice for these mostly mechanical changes? Cheers, Tamar > > Thanks, > Richard
Re: [PATCH] lto-plugin: add support for feature detection
On 5/16/22 12:28, Richard Biener wrote: > On Mon, May 16, 2022 at 11:58 AM Rui Ueyama wrote: >> >> Version handshaking is doable, but it feels like we are over-designing >> an API, given that the real providers of this plugin API are only GCC >> and LLVM and the users of the API are BFD ld, gold and mold. It is >> unlikely that we'll have dozens of more compilers or linkers in the >> near future. So, I personally prefer the following style >> >> if (!strcmp(plugin_compiler_name, "gcc") && plugin_major >= 12) >> >> than versioning various API-provided functions. It'll just work and be >> easy to understand. >> >> Besides that, even if we version GCC-provided plugin API functions, we >> still need a logic similar to the above to distinguish GCC from LLVM, >> as they behave slightly differently in various corner cases. We can't >> get rid of the heuristic version detection logic from the linker >> anyways. >> >> So, from the linker's point of view, exporting a compiler name and >> version numbers is enough. > > I agree that this might be convenient enough but the different behaviors > are because of inadequate documentation of the API - that's something > we should fix. And for this I think a plugin API version might help > as we can this way also handle your case of the need of querying > whether v2 will be used or not. > > I wouldn't go to enumerate past API versions - the version handshake > hook requirement alone makes that not so useful. Instead I'd require > everybody implementing the handshare hook implementing also all > other hooks defined at that point in time and set the version to 1. > > I'd eventually keep version 2 to indicate thread safety (of a part of the > API). That seems reasonable to me. > > That said, I'm not opposed to add a "GCC 12.1" provider, maybe the > version handshake should be > > int api_version (int linker, const char **identifier); > > where the linker specifies the desired API version and passes an > identifier identifying itself ("mold 1.0") and it will get back the API > version the plugin intends to use plus an identifier of the plugin > ("GCC 12.1"). I've implemented first version of the patch, please take a look. Note there's a bit name collision of api_version with: /* The version of the API specification. */ enum ld_plugin_api_version { LD_PLUGIN_API_VERSION = 1 }; @Rui: Am I correct that you're interested in thread-safe claim_file? Is there any other function being called paralely? Thoughts? > > Richard. > >> >> >> On Mon, May 16, 2022 at 5:38 PM Martin Liška wrote: >>> >>> On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote: > > Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > is useful if bugs are discovered in old versions that you by definition > cannot > fix but can apply workarounds to. Note the actual compiler used might > still > differ. Note that still isn't clean API documentation / extension of the > plugin > API itself. As of thread safety we can either add a claim_file_v2_hook > or try to do broader-level versioning of the API with a new api_version > hook which could also specify that with say version 2 the plugin will > not use get_symbols_v2 but only newer, etc. The linker would announce > the API version it likes to use and the plugin would return the > (closest) version > it can handle. A v2 could then also specify that claim_file needs to be Yep, I think having the API version handshake is quite reasonable way to go as the _v2,_v3 symbols seems already getting bit out of hand and we essentially have 3 revisions of API to think of (first adding LDPR_PREVAILING_DEF_IRONLY_EXP, second adding GET_SYMBOLS_V3 and now we plan third adding thread safety and solving the file handler problems) >>> >>> How should be design such a version handshake? >>> > thread safe, or that cleanup should allow a followup onload again with > a state identical to after dlopen, etc. > > Is there an API document besides the header itself somewhere? There is https://gcc.gnu.org/wiki/whopr/driver I wonder if this can't be moved into more official looking place since it looks like it is internal to GCC WHOPR implementation this way. >>> >>> We can likely add it here: >>> https://gcc.gnu.org/onlinedocs/gccint/LTO.html#LTO >>> >>> What do you think? I can port it. >>> >>> Martin >>> Honza > > Richard. >>> From 0ef5f678144b8ff3a1d247a992aa2e14128b82d1 Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Mon, 16 May 2022 14:01:52 +0200 Subject: [PATCH] Implement LDPT_REGISTER_GET_API_VERSION. --- include/plugin-api.h| 14 ++ lto-plugin/lto-plugin.c | 38 ++ 2 files changed, 52 insertions(+) diff --git a/include/plugin-api.h b/include/plugin-api.h index 8aebe2ff267..f01f4301fc9 100644 --- a/include/plugin-api.h +++ b/inc
Re: [PATCH 1/3]middle-end: Add the ability to let the target decide the method of argument promotions.
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Monday, May 16, 2022 12:36 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; >> jeffreya...@gmail.com >> Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the target decide >> the method of argument promotions. >> >> Tamar Christina writes: >> > Hi All, >> > >> > Some targets require function parameters to be promoted to a different >> > type on expand time because the target may not have native >> > instructions to work on such types. As an example the AArch64 port >> > does not have native instructions working on integer 8- or 16-bit >> > values. As such it promotes every parameter of these types to 32-bits. >> >> This doesn't seem specific to parameters though. It applies to any >> 8- or 16-bit variable. E.g.: >> >> #include >> uint8_t foo(uint32_t x, uint32_t y) { >> uint8_t z = x != 0 ? x : y; >> return z + 1; >> } >> >> generates: >> >> foo: >> cmp w0, 0 >> and w1, w1, 255 >> and w0, w0, 255 >> cselw0, w1, w0, eq >> add w0, w0, 1 >> ret >> >> So I think the new behaviour is really a modification of the PROMOTE_MODE >> behaviour rather than the PROMOTE_FUNCTION_MODE behaviour. >> >> FWIW, I agree with Richard that it would be better not to add a new hook. >> I think we're really making PROMOTE_MODE choose between >> SIGN_EXTEND, ZERO_EXTEND or SUBREG (what LLVM would call “any >> extend”) rather than the current SIGN_EXTEND vs. ZERO_EXTEND choice. > > Ah, I hadn't realized this also applied to locals.. ok I can modify > PROMOTE_MODE then, > but I also need the actual SSA_NAME and not just the type so will have to > pass this along. > > From a practical point of view.. the actual hook however is implemented by 34 > targets, > would I need to CC maintainers for each of them or would global maintainer > approval > suffice for these mostly mechanical changes? Yeah, single approval should be enough mechanical changes. It would be good to do the interface change and mechanical target changes as a separate prepatch if possible though. I'm not sure about passing the SSA name to the target though, or the way that the aarch64 hook uses the info. It looks like a single cold comparison could defeat the optimisation for hot code. If we do try to make the decision based on uses at expand time, it might be better for the analysis to be in target-independent code, with help from the target to decide where extensions are cheap. It still feels a bit hacky though. What stops us from forming cbz/cbnz when the extension is done close to the comparison (from the comment in 2/3)? If we can solve that, could we simply do an any-extend all the time, and treat removing redundant extensions as a global availability problem? What kind of code do we emit when do an extension just before an operation? If the DECL_RTL is (subreg:QI (reg:SI R) 0), say, then it should be safe to do the extension directly into R: (set (reg:SI X) (zero_extend:SI (subreg:QI (reg:SI X which avoids the problem of having two values live at once (the zero-extended value and the any-extended value). Having identical instructions for each extension would also make it easier for any global pass to move them and remove redundancies. Thanks, Richard
Re: [PATCH v2 02/10] aarch64: Add backend support for DFP
On 5/13/22 18:35, Richard Sandiford wrote: Christophe Lyon via Gcc-patches writes: @@ -19352,7 +19363,9 @@ aarch64_legitimate_constant_p (machine_mode mode, rtx x) { /* Support CSE and rematerialization of common constants. */ if (CONST_INT_P (x) - || (CONST_DOUBLE_P (x) && GET_MODE_CLASS (mode) == MODE_FLOAT)) + || (CONST_DOUBLE_P (x) + && (GET_MODE_CLASS (mode) == MODE_FLOAT + || GET_MODE_CLASS (mode) == MODE_DECIMAL_FLOAT))) return true; /* Only accept variable-length vector constants if they can be I'd prefer to remove the mode check here, rather than extend it. If we see a CONST_DOUBLE that doesn't have a scalar float mode then something has gone wrong. (No need to assert for that though. It's a fundamental assumption when TARGET_SUPPORTS_WIDE_INT.) For consistency… @@ -4,11 +6,16 @@ aarch64_float_const_rtx_p (rtx x) return false; } -/* Return TRUE if rtx X is immediate constant 0.0 */ +/* Return TRUE if rtx X is immediate constant 0.0 (but not in Decimal + Floating Point). */ bool aarch64_float_const_zero_rtx_p (rtx x) { - if (GET_MODE (x) == VOIDmode) + if (GET_MODE (x) == VOIDmode + /* 0.0 in Decimal Floating Point cannot be represented by #0 or +zr as our callers expect, so no need to check the actual +value if X is of Decimal Floating Point type. */ + || GET_MODE_CLASS (GET_MODE (x)) == MODE_DECIMAL_FLOAT) …it would be good to drop the VOIDmode check here at the same time. OK with those changes, thanks. Thanks, here is what I will commit once patch 9/10 is approved. RichardFrom 8dbef6d38700b252551309bfd657c0402a7e339a Mon Sep 17 00:00:00 2001 From: Christophe Lyon Date: Fri, 11 Mar 2022 16:21:02 + Subject: [PATCH v3 02/10] aarch64: Add backend support for DFP This patch updates the aarch64 backend as needed to support DFP modes (SD, DD and TD). Changes v1->v2: * Drop support for DFP modes in aarch64_gen_{load||store}[wb]_pair as these are only used in prologue/epilogue where DFP modes are not used. Drop the changes to the corresponding patterns in aarch64.md, and useless GPF_PAIR iterator. * In aarch64_reinterpret_float_as_int, handle DDmode the same way as DFmode (needed in case the representation of the floating-point value can be loaded using mov/movk. * In aarch64_float_const_zero_rtx_p, reject constants with DFP mode: when X is zero, the callers want to emit either '0' or 'zr' depending on the context, which is not the way 0.0 is represented in DFP mode (in particular fmov d0, #0 is not right for DFP). * In aarch64_legitimate_constant_p, accept DFP 2022-03-31 Christophe Lyon gcc/ * config/aarch64/aarch64.cc (aarch64_split_128bit_move): Handle DFP modes. (aarch64_mode_valid_for_sched_fusion_p): Likewise. (aarch64_classify_address): Likewise. (aarch64_legitimize_address_displacement): Likewise. (aarch64_reinterpret_float_as_int): Likewise. (aarch64_float_const_zero_rtx_p): Likewise. (aarch64_can_const_movi_rtx_p): Likewise. (aarch64_anchor_offset): Likewise. (aarch64_secondary_reload): Likewise. (aarch64_rtx_costs): Likewise. (aarch64_legitimate_constant_p): Likewise. (aarch64_gimplify_va_arg_expr): Likewise. (aapcs_vfp_sub_candidate): Likewise. (aarch64_vfp_is_call_or_return_candidate): Likewise. (aarch64_output_scalar_simd_mov_immediate): Likewise. (aarch64_gen_adjusted_ldpstp): Likewise. (aarch64_scalar_mode_supported_p): Accept DFP modes if enabled. * config/aarch64/aarch64.md (movsf_aarch64): Use SFD iterator and rename into mov_aarch64. (movdf_aarch64): Use DFD iterator and rename into mov_aarch64. (movtf_aarch64): Use TFD iterator and rename into mov_aarch64. (split pattern for move TF mode): Use TFD iterator. * config/aarch64/aarch64/iterators.md (GPF_TF_F16_MOV): Add DFP modes. (SFD, DFD, TFD): New iterators. (GPF_TF): Add DFP modes. (TX, DX, DX2): Likewise. --- gcc/config/aarch64/aarch64.cc | 82 ++--- gcc/config/aarch64/aarch64.md | 34 +++--- gcc/config/aarch64/iterators.md | 24 +++--- 3 files changed, 89 insertions(+), 51 deletions(-) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index bd855758778..9489c2fcaf9 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -4828,7 +4828,7 @@ aarch64_split_128bit_move (rtx dst, rtx src) machine_mode mode = GET_MODE (dst); - gcc_assert (mode == TImode || mode == TFmode); + gcc_assert (mode == TImode || mode == TFmode || mode == TDmode); gcc_assert (!(side_effects_p (src) || side_effects_p (dst))); gcc_assert (mode == GET_MODE (src) || GET_MODE (src) == VOIDmode); @@ -10568,6 +10568,7 @@ aarch64_mode_valid_for_sched_fusion_p (machine_mode mode) { return mode == SImode || mode == DImode || mode == SFmode || mode == DFmode + || mode == SDmode || mode == DDmode || (aarch64_vector_mode_supported_p (mode) && (known_e
Re: [PATCH 1/3]middle-end: Add the ability to let the target decide the method of argument promotions.
Richard Sandiford via Gcc-patches writes: > Tamar Christina writes: >>> -Original Message- >>> From: Richard Sandiford >>> Sent: Monday, May 16, 2022 12:36 PM >>> To: Tamar Christina >>> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; >>> jeffreya...@gmail.com >>> Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the target decide >>> the method of argument promotions. >>> >>> Tamar Christina writes: >>> > Hi All, >>> > >>> > Some targets require function parameters to be promoted to a different >>> > type on expand time because the target may not have native >>> > instructions to work on such types. As an example the AArch64 port >>> > does not have native instructions working on integer 8- or 16-bit >>> > values. As such it promotes every parameter of these types to 32-bits. >>> >>> This doesn't seem specific to parameters though. It applies to any >>> 8- or 16-bit variable. E.g.: >>> >>> #include >>> uint8_t foo(uint32_t x, uint32_t y) { >>> uint8_t z = x != 0 ? x : y; >>> return z + 1; >>> } >>> >>> generates: >>> >>> foo: >>> cmp w0, 0 >>> and w1, w1, 255 >>> and w0, w0, 255 >>> cselw0, w1, w0, eq >>> add w0, w0, 1 >>> ret >>> >>> So I think the new behaviour is really a modification of the PROMOTE_MODE >>> behaviour rather than the PROMOTE_FUNCTION_MODE behaviour. >>> >>> FWIW, I agree with Richard that it would be better not to add a new hook. >>> I think we're really making PROMOTE_MODE choose between >>> SIGN_EXTEND, ZERO_EXTEND or SUBREG (what LLVM would call “any >>> extend”) rather than the current SIGN_EXTEND vs. ZERO_EXTEND choice. >> >> Ah, I hadn't realized this also applied to locals.. ok I can modify >> PROMOTE_MODE then, >> but I also need the actual SSA_NAME and not just the type so will have to >> pass this along. >> >> From a practical point of view.. the actual hook however is implemented by >> 34 targets, >> would I need to CC maintainers for each of them or would global maintainer >> approval >> suffice for these mostly mechanical changes? > > Yeah, single approval should be enough mechanical changes. It would be > good to do the interface change and mechanical target changes as a > separate prepatch if possible though. > > I'm not sure about passing the SSA name to the target though, or the > way that the aarch64 hook uses the info. It looks like a single cold > comparison could defeat the optimisation for hot code. > > If we do try to make the decision based on uses at expand time, it might > be better for the analysis to be in target-independent code, with help > from the target to decide where extensions are cheap. It still feels > a bit hacky though. > > What stops us from forming cbz/cbnz when the extension is done close to > the comparison (from the comment in 2/3)? If we can solve that, could > we simply do an any-extend all the time, and treat removing redundant > extensions as a global availability problem? (which would run after combine) > > What kind of code do we emit when do an extension just before > an operation? If the DECL_RTL is (subreg:QI (reg:SI R) 0), say, > then it should be safe to do the extension directly into R: > > (set (reg:SI X) (zero_extend:SI (subreg:QI (reg:SI X Oops, that should of course be: (set (reg:SI R) (zero_extend:SI (subreg:QI (reg:SI R > which avoids the problem of having two values live at once > (the zero-extended value and the any-extended value). > > Having identical instructions for each extension would also make > it easier for any global pass to move them and remove redundancies. > > Thanks, > Richard
[PATCH 1/2] Force the selection operand of a GIMPLE COND_EXPR to be a register
This goes away with the selection operand allowed to be a GENERIC tcc_comparison tree. It keeps those for vectorizer pattern recog, those are short lived and removing this instance is a bigger task. The patch doesn't yet remove dead code and functionality, that's left for a followup. Instead the patch makes sure to produce valid GIMPLE IL and continue to optimize COND_EXPRs where the previous IL allowed and the new IL showed regressions in the testsuite. Bootstrapped and tested on x86_64-unknown-linux-gnu ontop of https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594753.html which awaits review. When that's approved I plan to go ahead with this so I have the rest of the week to fixup issues that appear. Richard. 2022-05-16 Richard Biener * gimple-expr.cc (is_gimple_condexpr): Equate to is_gimple_val. * gimplify.cc (gimplify_pure_cond_expr): Gimplify the condition as is_gimple_val. * gimple-fold.cc (valid_gimple_rhs_p): Simplify. * tree-cfg.cc (verify_gimple_assign_ternary): Likewise. * gimple-loop-interchange.cc (loop_cand::undo_simple_reduction): Build the condition of the COND_EXPR separately. * tree-ssa-loop-im.cc (move_computations_worker): Likewise. * tree-vect-generic.cc (expand_vector_condition): Likewise. * tree-vect-loop.cc (vect_create_epilog_for_reduction): Likewise. * vr-values.cc (simplify_using_ranges::simplify): Likewise. * tree-vect-patterns.cc: Add comment indicating we are building invalid COND_EXPRs and why. * omp-expand.cc (expand_omp_simd): Gimplify the condition to the COND_EXPR separately. (expand_omp_atomic_cas): Note part that should be unreachable now. * tree-ssa-forwprop.cc (forward_propagate_into_cond): Adjust condition for valid replacements. * tree-if-conv.cc (predicate_bbs): Simulate previous re-folding of the condition in folded COND_EXPRs which is necessary because of unfolded GIMPLE_CONDs in the IL as in for example gcc.dg/fold-bopcond-1.c. * gimple-range-gori.cc (gori_compute::condexpr_adjust): Handle that the comparison is now in the def stmt of the select operand. Required by gcc.dg/pr104526.c. * gcc.dg/gimplefe-27.c: Adjust. * gcc.dg/gimplefe-45.c: Likewise. * gcc.dg/pr101145-2.c: Likewise. * gcc.dg/pr98211.c: Likewise. * gcc.dg/torture/pr89595.c: Likewise. * gcc.dg/tree-ssa/divide-7.c: Likewise. * gcc.dg/tree-ssa/ssa-lim-12.c: Likewise. --- gcc/gimple-expr.cc | 2 +- gcc/gimple-fold.cc | 4 +-- gcc/gimple-loop-interchange.cc | 4 ++- gcc/gimple-range-gori.cc | 20 +-- gcc/gimplify.cc| 6 ++--- gcc/omp-expand.cc | 7 +++--- gcc/testsuite/gcc.dg/gimplefe-27.c | 4 ++- gcc/testsuite/gcc.dg/gimplefe-45.c | 4 ++- gcc/testsuite/gcc.dg/pr101145-2.c | 4 ++- gcc/testsuite/gcc.dg/pr98211.c | 8 -- gcc/testsuite/gcc.dg/torture/pr89595.c | 4 ++- gcc/testsuite/gcc.dg/tree-ssa/divide-7.c | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-12.c | 2 +- gcc/tree-cfg.cc| 12 + gcc/tree-if-conv.cc| 29 +++--- gcc/tree-ssa-forwprop.cc | 2 +- gcc/tree-ssa-loop-im.cc| 7 -- gcc/tree-vect-generic.cc | 6 ++--- gcc/tree-vect-loop.cc | 27 ++-- gcc/tree-vect-patterns.cc | 6 + gcc/vr-values.cc | 5 +++- 21 files changed, 104 insertions(+), 62 deletions(-) diff --git a/gcc/gimple-expr.cc b/gcc/gimple-expr.cc index 5d10c24ed1b..09eac80ae1c 100644 --- a/gcc/gimple-expr.cc +++ b/gcc/gimple-expr.cc @@ -622,7 +622,7 @@ is_gimple_condexpr (tree t) { /* Always split out _Complex type compares since complex lowering doesn't handle this case. */ - return is_gimple_condexpr_1 (t, true, false); + return is_gimple_val (t); } /* Like is_gimple_condexpr, but does not allow T to trap. */ diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 8555a2bca89..f61bc87da63 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -418,9 +418,7 @@ valid_gimple_rhs_p (tree expr) default: if (get_gimple_rhs_class (code) == GIMPLE_TERNARY_RHS) { - if ((code == COND_EXPR - ? !is_gimple_condexpr (TREE_OPERAND (expr, 0)) - : !is_gimple_val (TREE_OPERAND (expr, 0))) + if (!is_gimple_val (TREE_OPERAND (expr, 0)) || !is_gimple_val (TREE_OPERAND (expr, 1)) || !is_gimple_val (TREE_OPERAND (expr, 2))) return false; diff --git a/gcc/gimple-loop-interchange.cc b/gcc/g
[PATCH 2/2] Remove is_gimple_condexpr
This removes is_gimple_condexpr, note the vectorizer via patterns still creates COND_EXPRs with embedded GENERIC conditions and has a reference to the function in comments. Otherwise is_gimple_condexpr is now equal to is_gimple_val. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2022-05-16 Richard Biener * gimple-expr.cc (is_gimple_condexpr): Remove. * gimple-expr.h (is_gimple_condexpr): Likewise. * gimplify.cc (gimplify_expr): Remove is_gimple_condexpr usage. * tree-if-conv.cc (set_bb_predicate): Likewie. (add_to_predicate_list): Likewise. (gen_phi_arg_condition): Likewise. (predicate_scalar_phi): Likewise. (predicate_statements): Likewise. --- gcc/gimple-expr.cc | 11 --- gcc/gimple-expr.h | 1 - gcc/gimplify.cc | 1 - gcc/tree-if-conv.cc | 33 ++--- 4 files changed, 14 insertions(+), 32 deletions(-) diff --git a/gcc/gimple-expr.cc b/gcc/gimple-expr.cc index 09eac80ae1c..c9c7285efbc 100644 --- a/gcc/gimple-expr.cc +++ b/gcc/gimple-expr.cc @@ -614,17 +614,6 @@ is_gimple_condexpr_1 (tree t, bool allow_traps, bool allow_cplx) && is_gimple_val (TREE_OPERAND (t, 1; } -/* Return true if T is a condition operand in a GIMPLE assignment - with a COND_EXPR RHS. */ - -bool -is_gimple_condexpr (tree t) -{ - /* Always split out _Complex type compares since complex lowering - doesn't handle this case. */ - return is_gimple_val (t); -} - /* Like is_gimple_condexpr, but does not allow T to trap. */ bool diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h index ba53b808437..0c3ac096ed7 100644 --- a/gcc/gimple-expr.h +++ b/gcc/gimple-expr.h @@ -40,7 +40,6 @@ extern void extract_ops_from_tree (tree, enum tree_code *, tree *, tree *, extern void gimple_cond_get_ops_from_tree (tree, enum tree_code *, tree *, tree *); extern bool is_gimple_lvalue (tree); -extern bool is_gimple_condexpr (tree); extern bool is_gimple_condexpr_for_cond (tree); extern bool is_gimple_address (const_tree); extern bool is_gimple_invariant_address (const_tree); diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index 38e9d62535a..d4a0366a549 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -14951,7 +14951,6 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, gcc_assert (fallback & (fb_rvalue | fb_lvalue)); else if (gimple_test_f == is_gimple_val || gimple_test_f == is_gimple_call_addr - || gimple_test_f == is_gimple_condexpr || gimple_test_f == is_gimple_condexpr_for_cond || gimple_test_f == is_gimple_mem_rhs || gimple_test_f == is_gimple_mem_rhs_or_call diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc index 5b884aae3d6..2245b6bfd7c 100644 --- a/gcc/tree-if-conv.cc +++ b/gcc/tree-if-conv.cc @@ -244,8 +244,8 @@ static inline void set_bb_predicate (basic_block bb, tree cond) { gcc_assert ((TREE_CODE (cond) == TRUTH_NOT_EXPR - && is_gimple_condexpr (TREE_OPERAND (cond, 0))) - || is_gimple_condexpr (cond)); + && is_gimple_val (TREE_OPERAND (cond, 0))) + || is_gimple_val (cond)); ((struct bb_predicate *) bb->aux)->predicate = cond; } @@ -544,10 +544,10 @@ add_to_predicate_list (class loop *loop, basic_block bb, tree nc) tp = &TREE_OPERAND (bc, 0); else tp = &bc; - if (!is_gimple_condexpr (*tp)) + if (!is_gimple_val (*tp)) { gimple_seq stmts; - *tp = force_gimple_operand_1 (*tp, &stmts, is_gimple_condexpr, NULL_TREE); + *tp = force_gimple_operand (*tp, &stmts, true, NULL_TREE); add_bb_predicate_gimplified_stmts (bb, stmts); } set_bb_predicate (bb, bc); @@ -1884,16 +1884,14 @@ gen_phi_arg_condition (gphi *phi, vec *occur, cond = c; break; } - c = force_gimple_operand_gsi_1 (gsi, unshare_expr (c), - is_gimple_condexpr, NULL_TREE, - true, GSI_SAME_STMT); + c = force_gimple_operand_gsi (gsi, unshare_expr (c), + true, NULL_TREE, true, GSI_SAME_STMT); if (cond != NULL_TREE) { /* Must build OR expression. */ cond = fold_or_predicates (EXPR_LOCATION (c), c, cond); - cond = force_gimple_operand_gsi_1 (gsi, unshare_expr (cond), -is_gimple_condexpr, NULL_TREE, -true, GSI_SAME_STMT); + cond = force_gimple_operand_gsi (gsi, unshare_expr (cond), true, + NULL_TREE, true, GSI_SAME_STMT); } else cond = c; @@ -1973,9 +1971,8 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi) else cond = bb_predicate (first_edge->src); /* Gimplify the condition to a valid cond-expr conditonal
Re: Supporting RISC-V Vendor Extensions in the GNU Toolchain
A generous [snip], as this has been getting a bit long. On Sun, 15 May 2022 at 03:21, Palmer Dabbelt wrote: > I am worried about bad > actors leveraging any policy to make a bunch of noise, as that's a > pretty persistent problem in RISC-V land and it looks like things are > going to get worse before they get better. > I don't follow. Maybe you can walk me through the "bad actors" comment next time we talk… > > We today have: > > - Tag_RISCV_arch (see > > > https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-elf.adoc#tag_riscv_arch-5-ntbssubarch > ) > > - ifunc support > > > > Admittedly, there's some loose ends in the end-to-end story (e.g. > > Unified Discovery -> DTB -> glibc ifunc initialisation): we know just > > too well how this plays out as there are optimised string/memory > > functions (Zbb, Zicboz, cache-block-length, …) in our pipeline as well > > as OpenSSL support for Zbb and Zbc. However, this is a known gap and > > will be fully addressed in the future. > > > > Is there something specific beyond this that you'd be looking for? > > I might be forgetting something, but at least: > > * Tag_RISCV_arch attributes are really fundamentally based around > compatible extension sets and just don't work when faced with the > realities of what RISC-V is today -- that's true even for standard > extensions, but it's going to be way worse with vendor extensions. > * Some scheme that allows relocations from multiple vendors to be linked > together. There's been some proposals here, but nothing that the > psABI folks seem to like (and also might not play well with dynamic > relocations). > I would recommend deferring solving the vendor-defined relocations to a later time. All vendor-defined extension proposals already on the table for upstream inclusion (X-Ventana-CondOps, X-THead-CMO) don't require custom relocation. I don't expect anything requiring these shortly — and whoever submits it will have to provide a proposal for vendor-defined relocations that finds some consensus. > * There's a lot between device tree and ifunc (not to mention ACPI). > Kito had a proposal for how to get this up to userspace, there's an > earlier version from Plumbers last year but there's a lot of work that > needs to be done to turn that into reality. > Agreed. Our team is looking into this already as Zbb and Zicboz are useful in GLIBC. > * Some use cases won't be met by ifunc, there's a whole lot of > techniques available and we at least want to allow those to function. > In the long run binary compatibility is going to be a losing battle, > but we can at least try to keep things sane so the folks in charge at > the foundation have a chance to understand what a hole we're in with > enough time left to fix it. > > I know it's a lot more work to give users the option of compatibility, > but once that's gone it'll never come back so I'm willing to at least > try -- though of course that'll put a burden on everyone, even those > outside the RISC-V ports, so everyone needs to be on board. > I have been discussing "fat binaries" on and off in the context of reconciling the vector fragmentation. This is a follow-on topic to getting things enabled and ensuring that no accidental interworking occurs — once the basic support is mature enough, I hope there will be takers for fat-binary support. I hope this further clarifies my thinking: I would like to roll support for vendor-defined extensions out in an incremental manner: starting with rolling up some extensions into the development tools (assembler, linker, and compiler); and only then improving runtime detection and library usage. For vendor-defined relocations, I would build consensus once we first encounter the need for them. Philipp.
Re: [PATCH] Clamp vec_perm_expr index in simplify_bitfield_ref to avoid ICE.
On Mon, 16 May 2022, liuhongt wrote: > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} > Ok for trunk? OK. Thanks, Richard. > gcc/ChangeLog: > > PR tree-optimization/105591 > * tree-ssa-forwprop.cc (simplify_bitfield_ref): Clamp > vec_perm_expr index. > > gcc/testsuite/ChangeLog: > > * gcc.dg/pr105591.c: New test. > --- > gcc/testsuite/gcc.dg/pr105591.c | 12 > gcc/tree-ssa-forwprop.cc| 13 - > 2 files changed, 20 insertions(+), 5 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/pr105591.c > > diff --git a/gcc/testsuite/gcc.dg/pr105591.c b/gcc/testsuite/gcc.dg/pr105591.c > new file mode 100644 > index 000..9554c42e2f4 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pr105591.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-Wno-psabi -O" } */ > +/* { dg-additional-options "-mavx" { target x86_64-*-* i?86-*-* } } */ > +typedef unsigned long long __attribute__((__vector_size__ (16))) U; > +typedef unsigned long long __attribute__((__vector_size__ (32))) V; > + > +V > +foo (U u) > +{ > + U x = __builtin_shuffle (u, (U) { 0xBE2ED0AB630B33FE }); > + return __builtin_shufflevector (u, x, 2, 1, 0, 3); > +} > diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc > index 48cab5844e0..7da3f80af10 100644 > --- a/gcc/tree-ssa-forwprop.cc > +++ b/gcc/tree-ssa-forwprop.cc > @@ -2381,23 +2381,26 @@ simplify_bitfield_ref (gimple_stmt_iterator *gsi) > >/* One element. */ >if (known_eq (size, elem_size)) > -idx = TREE_INT_CST_LOW (VECTOR_CST_ELT (m, idx)); > +idx = TREE_INT_CST_LOW (VECTOR_CST_ELT (m, idx)) % (2 * nelts); >else > { >unsigned HOST_WIDE_INT nelts_op; >if (!constant_multiple_p (size, elem_size, &nelts_op) > || !pow2p_hwi (nelts_op)) > return false; > - unsigned start = TREE_INT_CST_LOW (vector_cst_elt (m, idx)); > - unsigned end = TREE_INT_CST_LOW (vector_cst_elt (m, idx + nelts_op - > 1)); > + /* Clamp vec_perm_expr index. */ > + unsigned start = TREE_INT_CST_LOW (vector_cst_elt (m, idx)) % (2 * > nelts); > + unsigned end = TREE_INT_CST_LOW (vector_cst_elt (m, idx + nelts_op - > 1)) > + % (2 * nelts); >/* Be in the same vector. */ >if ((start < nelts) != (end < nelts)) > return false; >for (unsigned HOST_WIDE_INT i = 1; i != nelts_op; i++) > { > /* Continuous area. */ > - if (TREE_INT_CST_LOW (vector_cst_elt (m, idx + i)) - 1 > - != TREE_INT_CST_LOW (vector_cst_elt (m, idx + i - 1))) > + if (TREE_INT_CST_LOW (vector_cst_elt (m, idx + i)) % (2 * nelts) - 1 > + != TREE_INT_CST_LOW (vector_cst_elt (m, idx + i - 1)) > + % (2 * nelts)) > return false; > } >/* Alignment not worse than before. */ > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)
Re: [PATCH 15/40] graphite: Extend SCoP detection dump output
As requested by Richard: Rediffed patch. Changes: s/.c/.cc/ + some whitespace changes. (At least in my email reader, some were lost. I also fixed too-long line issues.) In addition, FOR_EACH_LOOP was replaced by 'for (auto loop : ...' (macro was removed late in GCC 12 development → r12-2605-ge41ba804ba5f5c) Otherwise, it should be identical to Frederik's patch, earlier in this thread. On 15.12.21 16:54, Frederik Harwath wrote: Extend dump output to make understanding why Graphite rejects to include a loop in a SCoP easier (for GCC developers). OK for mainline? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 graphite: Extend SCoP detection dump output Extend dump output to make understanding why Graphite rejects to include a loop in a SCoP easier (for GCC developers). ChangeLog: * graphite-scop-detection.cc (scop_detection::can_represent_loop): Output reason for failure to dump file. (scop_detection::harmful_loop_in_region): Likewise. (scop_detection::graphite_can_represent_expr): Likewise. (scop_detection::stmt_has_simple_data_refs_p): Likewise. (scop_detection::stmt_simple_for_scop_p): Likewise. (print_sese_loop_numbers): New function. (scop_detection::add_scop): Use from here to print loops in rejected SCoP. gcc/graphite-scop-detection.cc | 182 - 1 file changed, 161 insertions(+), 21 deletions(-) diff --git a/gcc/graphite-scop-detection.cc b/gcc/graphite-scop-detection.cc index 8c0ee997557..075aa3010c8 100644 --- a/gcc/graphite-scop-detection.cc +++ b/gcc/graphite-scop-detection.cc @@ -69,12 +69,27 @@ public: fprintf (output.dump_file, "%d", i); return output; } + friend debug_printer & operator<< (debug_printer &output, const char *s) { fprintf (output.dump_file, "%s", s); return output; } + + friend debug_printer & + operator<< (debug_printer &output, gimple* stmt) + { +print_gimple_stmt (output.dump_file, stmt, 0, TDF_VOPS | TDF_MEMSYMS); +return output; + } + + friend debug_printer & + operator<< (debug_printer &output, tree t) + { +print_generic_expr (output.dump_file, t, TDF_SLIM); +return output; + } } dp; #define DEBUG_PRINT(args) do \ @@ -506,6 +521,23 @@ scop_detection::merge_sese (sese_l first, sese_l second) const return combined; } +/* Print the loop numbers of the loops contained + in SESE to FILE. */ + +static void +print_sese_loop_numbers (FILE *file, sese_l sese) +{ + bool printed = false; + for (auto loop : loops_list (cfun, 0)) +{ + if (loop_in_sese_p (loop, sese)) + fprintf (file, "%d, ", loop->num); + printed = true; +} + if (printed) +fprintf (file, "\b\b"); +} + /* Build scop outer->inner if possible. */ void @@ -519,6 +551,10 @@ scop_detection::build_scop_depth (loop_p loop) if (! next || harmful_loop_in_region (next)) { + if (next) + DEBUG_PRINT (dp << "[scop-detection] Discarding SCoP on loops "; + print_sese_loop_numbers (dump_file, next); + dp << " because of harmful loops\n"); if (s) add_scop (s); build_scop_depth (loop); @@ -560,14 +596,63 @@ scop_detection::can_represent_loop (loop_p loop, sese_l scop) || !single_pred_p (loop->latch) || exit->src != single_pred (loop->latch) || !empty_block_p (loop->latch)) -return false; +{ + DEBUG_PRINT (dp << "[can_represent_loop-fail] Loop shape unsupported.\n"); + return false; +} + + bool edge_irreducible = (loop_preheader_edge (loop)->flags + & EDGE_IRREDUCIBLE_LOOP); + if (edge_irreducible) +{ + DEBUG_PRINT (dp << "[can_represent_loop-fail] " + "Loop is not a natural loop.\n"); + return false; +} + + bool niter_is_unconditional = number_of_iterations_exit (loop, + single_exit (loop), + &niter_desc, false); + + if (!niter_is_unconditional) +{ + DEBUG_PRINT (dp << "[can_represent_loop-fail] " + "Loop niter not unconditional.\n" + "Condition: " << niter_desc.assumptions << "\n"); + return false; +} + + niter = number_of_latch_executions (loop); + if (!niter) +{ + DEBUG_PRINT (dp << "[can_represent_loop-fail] Loop niter unknown.\n"); + return false; +} + if (!niter_desc.control.no_overflow) +{ + DEBUG_PRINT (dp << "[can_represent_loop-fail] Loop niter can overflow.\n"); + return false; +} - return !(loop_preheader_edge (loop)->flags & EDGE_IRREDUCIBLE_LOOP) -&& number_of_iterations_exit (loop, single_exit (loop), &niter_desc, false) -&& niter_desc.control.no_overflow -&& (niter = number_of_latch_executions (loop)) -&& !chrec_contains_undetermined (niter) -&& graphite_can_represent_expr (scop, loop, niter); + bool undetermined_coefficients =
Re: [PATCH 16/40] graphite: Rename isl_id_for_ssa_name
Rediffed Frederik's patch. Actual change is just s/.c/.cc/ but also a missing space → tab. On 15.12.21 16:54, Frederik Harwath wrote: The SSA names for which this function gets used are always SCoP parameters and hence "isl_id_for_parameter" is a better name. It also explains the prefix "P_" for those names in the ISL representation. OK for mainline? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 graphite: Rename isl_id_for_ssa_name The SSA names for which this function gets used are always SCoP parameters and hence "isl_id_for_parameter" is a better name. It also explains the prefix "P_" for those names in the ISL representation. gcc/ChangeLog: * graphite-sese-to-poly.cc (isl_id_for_ssa_name): Rename to ... (isl_id_for_parameter): ... this new function name. (build_scop_context): Adjust function use. gcc/graphite-sese-to-poly.cc | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/gcc/graphite-sese-to-poly.cc b/gcc/graphite-sese-to-poly.cc index 5a6d779052c..ea67b267e1c 100644 --- a/gcc/graphite-sese-to-poly.cc +++ b/gcc/graphite-sese-to-poly.cc @@ -100,14 +100,15 @@ extract_affine_mul (scop_p s, tree e, __isl_take isl_space *space) return isl_pw_aff_mul (lhs, rhs); } -/* Return an isl identifier from the name of the ssa_name E. */ +/* Return an isl identifier for the parameter P. */ static isl_id * -isl_id_for_ssa_name (scop_p s, tree e) +isl_id_for_parameter (scop_p s, tree p) { - char name1[14]; - snprintf (name1, sizeof (name1), "P_%d", SSA_NAME_VERSION (e)); - return isl_id_alloc (s->isl_context, name1, e); + gcc_checking_assert (TREE_CODE (p) == SSA_NAME); + char name[14]; + snprintf (name, sizeof (name), "P_%d", SSA_NAME_VERSION (p)); + return isl_id_alloc (s->isl_context, name, p); } /* Return an isl identifier for the data reference DR. Data references and @@ -898,15 +899,15 @@ build_scop_context (scop_p scop) isl_space *space = isl_space_set_alloc (scop->isl_context, nbp, 0); unsigned i; - tree e; - FOR_EACH_VEC_ELT (region->params, i, e) + tree p; + FOR_EACH_VEC_ELT (region->params, i, p) space = isl_space_set_dim_id (space, isl_dim_param, i, - isl_id_for_ssa_name (scop, e)); + isl_id_for_parameter (scop, p)); scop->param_context = isl_set_universe (space); - FOR_EACH_VEC_ELT (region->params, i, e) -add_param_constraints (scop, i, e); + FOR_EACH_VEC_ELT (region->params, i, p) +add_param_constraints (scop, i, p); } /* Return true when loop A is nested in loop B. */
Re: [PATCH 17/40] graphite: Fix minor mistakes in comments
Another comment-only change. Otherwise, just re-diffed Frederik's patch. Mostly s/.c/.cc/, but I added one '. ' that got lost. On 15.12.21 16:54, Frederik Harwath wrote: * graphite-sese-to-poly.c (build_poly_sr_1): Fix a typo and a reference to a variable which does not exist. * graphite-isl-ast-to-gimple.c (gsi_insert_earliest): Fix typo in comment. OK for mainline? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 graphite: Fix minor mistakes in comments gcc/ChangeLog: * graphite-sese-to-poly.cc (build_poly_sr_1): Fix a typo and a reference to a variable which does not exist. * graphite-isl-ast-to-gimple.cc (gsi_insert_earliest): Fix typo in comment. gcc/graphite-isl-ast-to-gimple.cc | 2 +- gcc/graphite-sese-to-poly.cc | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/graphite-isl-ast-to-gimple.cc b/gcc/graphite-isl-ast-to-gimple.cc index 45ed7704807..844b6d4e2b5 100644 --- a/gcc/graphite-isl-ast-to-gimple.cc +++ b/gcc/graphite-isl-ast-to-gimple.cc @@ -1014,7 +1014,7 @@ gsi_insert_earliest (gimple_seq seq) basic_block begin_bb = get_entry_bb (codegen_region); /* Inserting the gimple statements in a vector because gimple_seq behave - in strage ways when inserting the stmts from it into different basic + in strange ways when inserting the stmts from it into different basic blocks one at a time. */ auto_vec stmts; for (gimple_stmt_iterator gsi = gsi_start (seq); !gsi_end_p (gsi); diff --git a/gcc/graphite-sese-to-poly.cc b/gcc/graphite-sese-to-poly.cc index ea67b267e1c..51ba3af204f 100644 --- a/gcc/graphite-sese-to-poly.cc +++ b/gcc/graphite-sese-to-poly.cc @@ -649,14 +649,14 @@ build_poly_sr_1 (poly_bb_p pbb, gimple *stmt, tree var, enum poly_dr_type kind, isl_map *acc, isl_set *subscript_sizes) { scop_p scop = PBB_SCOP (pbb); - /* Each scalar variables has a unique alias set number starting from + /* Each scalar variable has a unique alias set number starting from the maximum alias set assigned to a dr. */ int alias_set = scop->max_alias_set + SSA_NAME_VERSION (var); subscript_sizes = isl_set_fix_si (subscript_sizes, isl_dim_set, 0, alias_set); /* Add a constrain to the ACCESSES polyhedron for the alias set of - data reference DR. */ + the reference. */ isl_constraint *c = isl_equality_alloc (isl_local_space_from_space (isl_map_get_space (acc))); c = isl_constraint_set_constant_si (c, -alias_set);
RE: [PATCH 1/3]middle-end: Add the ability to let the target decide the method of argument promotions.
> -Original Message- > From: Richard Sandiford > Sent: Monday, May 16, 2022 1:18 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; > jeffreya...@gmail.com > Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the target decide > the method of argument promotions. > > Richard Sandiford via Gcc-patches writes: > > Tamar Christina writes: > >>> -Original Message- > >>> From: Richard Sandiford > >>> Sent: Monday, May 16, 2022 12:36 PM > >>> To: Tamar Christina > >>> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; > >>> jeffreya...@gmail.com > >>> Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the > >>> target decide the method of argument promotions. > >>> > >>> Tamar Christina writes: > >>> > Hi All, > >>> > > >>> > Some targets require function parameters to be promoted to a > >>> > different type on expand time because the target may not have > >>> > native instructions to work on such types. As an example the > >>> > AArch64 port does not have native instructions working on integer > >>> > 8- or 16-bit values. As such it promotes every parameter of these > types to 32-bits. > >>> > >>> This doesn't seem specific to parameters though. It applies to any > >>> 8- or 16-bit variable. E.g.: > >>> > >>> #include > >>> uint8_t foo(uint32_t x, uint32_t y) { > >>> uint8_t z = x != 0 ? x : y; > >>> return z + 1; > >>> } > >>> > >>> generates: > >>> > >>> foo: > >>> cmp w0, 0 > >>> and w1, w1, 255 > >>> and w0, w0, 255 > >>> cselw0, w1, w0, eq > >>> add w0, w0, 1 > >>> ret > >>> > >>> So I think the new behaviour is really a modification of the > >>> PROMOTE_MODE behaviour rather than the > PROMOTE_FUNCTION_MODE behaviour. > >>> > >>> FWIW, I agree with Richard that it would be better not to add a new > hook. > >>> I think we're really making PROMOTE_MODE choose between > SIGN_EXTEND, > >>> ZERO_EXTEND or SUBREG (what LLVM would call “any > >>> extend”) rather than the current SIGN_EXTEND vs. ZERO_EXTEND > choice. > >> > >> Ah, I hadn't realized this also applied to locals.. ok I can modify > >> PROMOTE_MODE then, but I also need the actual SSA_NAME and not just > the type so will have to pass this along. > >> > >> From a practical point of view.. the actual hook however is > >> implemented by 34 targets, would I need to CC maintainers for each of > >> them or would global maintainer approval suffice for these mostly > mechanical changes? > > > > Yeah, single approval should be enough mechanical changes. It would > > be good to do the interface change and mechanical target changes as a > > separate prepatch if possible though. > > > > I'm not sure about passing the SSA name to the target though, or the > > way that the aarch64 hook uses the info. It looks like a single cold > > comparison could defeat the optimisation for hot code. I'm not sure I follow why the likelihood of the comparison matters in this case.. I'll expand on it below.. > > > > If we do try to make the decision based on uses at expand time, it > > might be better for the analysis to be in target-independent code, > > with help from the target to decide where extensions are cheap. It > > still feels a bit hacky though. I thought about it but can't see most target having this problem. I did go with an optimistic heuristics. There are of course various ways to defeat it but looking through the corpus of code I don't see any but the simple cases in practice. (more below). > > > > What stops us from forming cbz/cbnz when the extension is done close > > to the comparison (from the comment in 2/3)? If we can solve that, > > could we simply do an any-extend all the time, and treat removing > > redundant extensions as a global availability problem? > In such cases there's no gain from doing the extension at all, e.g. and w0, w0, 255 cmp w0, 0 b.eq .Lfoo will be optimized to tst w0, 0xff b.ne .Lfoo already. In RTL the problem occurs when you have nested control flow like nested if and switch statements The example in 2/3 was intended to show that before what we'd do is and w22, w0, 255 cbz w22, .Lfoo1 cbz w22, .Lfoo2 If we have a single comparison we already sink the zero_extend today. Now if we instead any-extend w0 we end up with: mov w22, w0 tst w22, 0xff b.ne .Lfoo1 tst w22, 0xff b.ne .Lfoo2 So you get an additional tst before each branch. You also can't perform the tst higher since CC is clobbered. The cbz is nice because the zero extend doesn't use CC of course and so having the value live allows us to optimize The branch. I don't think branch likeliness matters here as you must keep w22 live in both cases somehow. In the SPECCPU 2017 Benchmark perlbench (which uses a lot of nested switches) this pattern is responsible for an extra 0.3% codesize increase which the approach in 2/3 prevents. > (which would run after combine) > > > > > What kind o
Re: [PATCH] lto-plugin: add support for feature detection
On Mon, May 16, 2022 at 8:04 PM Martin Liška wrote: > > On 5/16/22 12:28, Richard Biener wrote: > > On Mon, May 16, 2022 at 11:58 AM Rui Ueyama wrote: > >> > >> Version handshaking is doable, but it feels like we are over-designing > >> an API, given that the real providers of this plugin API are only GCC > >> and LLVM and the users of the API are BFD ld, gold and mold. It is > >> unlikely that we'll have dozens of more compilers or linkers in the > >> near future. So, I personally prefer the following style > >> > >> if (!strcmp(plugin_compiler_name, "gcc") && plugin_major >= 12) > >> > >> than versioning various API-provided functions. It'll just work and be > >> easy to understand. > >> > >> Besides that, even if we version GCC-provided plugin API functions, we > >> still need a logic similar to the above to distinguish GCC from LLVM, > >> as they behave slightly differently in various corner cases. We can't > >> get rid of the heuristic version detection logic from the linker > >> anyways. > >> > >> So, from the linker's point of view, exporting a compiler name and > >> version numbers is enough. > > > > I agree that this might be convenient enough but the different behaviors > > are because of inadequate documentation of the API - that's something > > we should fix. And for this I think a plugin API version might help > > as we can this way also handle your case of the need of querying > > whether v2 will be used or not. > > > > I wouldn't go to enumerate past API versions - the version handshake > > hook requirement alone makes that not so useful. Instead I'd require > > everybody implementing the handshare hook implementing also all > > other hooks defined at that point in time and set the version to 1. > > > > I'd eventually keep version 2 to indicate thread safety (of a part of the > > API). > > That seems reasonable to me. > > > > > That said, I'm not opposed to add a "GCC 12.1" provider, maybe the > > version handshake should be > > > > int api_version (int linker, const char **identifier); > > > > where the linker specifies the desired API version and passes an > > identifier identifying itself ("mold 1.0") and it will get back the API > > version the plugin intends to use plus an identifier of the plugin > > ("GCC 12.1"). > > I've implemented first version of the patch, please take a look. > > Note there's a bit name collision of api_version with: > > /* The version of the API specification. */ > > enum ld_plugin_api_version > { > LD_PLUGIN_API_VERSION = 1 > }; > > @Rui: Am I correct that you're interested in thread-safe claim_file? Is there > any > other function being called paralely? Yes, I want a thread-safe claim_file. And that function seems to be the only function in mold that is called in parallel. > Thoughts? > > > > > Richard. > > > >> > >> > >> On Mon, May 16, 2022 at 5:38 PM Martin Liška wrote: > >>> > >>> On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote: > > > > Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > > is useful if bugs are discovered in old versions that you by definition > > cannot > > fix but can apply workarounds to. Note the actual compiler used might > > still > > differ. Note that still isn't clean API documentation / extension of > > the plugin > > API itself. As of thread safety we can either add a claim_file_v2_hook > > or try to do broader-level versioning of the API with a new api_version > > hook which could also specify that with say version 2 the plugin will > > not use get_symbols_v2 but only newer, etc. The linker would announce > > the API version it likes to use and the plugin would return the > > (closest) version > > it can handle. A v2 could then also specify that claim_file needs to be > > Yep, I think having the API version handshake is quite reasonable way to > go as the _v2,_v3 symbols seems already getting bit out of hand and we > essentially have 3 revisions of API to think of > (first adding LDPR_PREVAILING_DEF_IRONLY_EXP, second adding > GET_SYMBOLS_V3 and now we plan third adding thread safety and solving > the file handler problems) > >>> > >>> How should be design such a version handshake? > >>> > > > thread safe, or that cleanup should allow a followup onload again with > > a state identical to after dlopen, etc. > > > > Is there an API document besides the header itself somewhere? > > There is https://gcc.gnu.org/wiki/whopr/driver > I wonder if this can't be moved into more official looking place since > it looks like it is internal to GCC WHOPR implementation this way. > >>> > >>> We can likely add it here: > >>> https://gcc.gnu.org/onlinedocs/gccint/LTO.html#LTO > >>> > >>> What do you think? I can port it. > >>> > >>> Martin > >>> > > Honza > > > > Richard. > >>>
Re: [PATCH 1/3]middle-end: Add the ability to let the target decide the method of argument promotions.
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Monday, May 16, 2022 1:18 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; >> jeffreya...@gmail.com >> Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the target decide >> the method of argument promotions. >> >> Richard Sandiford via Gcc-patches writes: >> > Tamar Christina writes: >> >>> -Original Message- >> >>> From: Richard Sandiford >> >>> Sent: Monday, May 16, 2022 12:36 PM >> >>> To: Tamar Christina >> >>> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; >> >>> jeffreya...@gmail.com >> >>> Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the >> >>> target decide the method of argument promotions. >> >>> >> >>> Tamar Christina writes: >> >>> > Hi All, >> >>> > >> >>> > Some targets require function parameters to be promoted to a >> >>> > different type on expand time because the target may not have >> >>> > native instructions to work on such types. As an example the >> >>> > AArch64 port does not have native instructions working on integer >> >>> > 8- or 16-bit values. As such it promotes every parameter of these >> types to 32-bits. >> >>> >> >>> This doesn't seem specific to parameters though. It applies to any >> >>> 8- or 16-bit variable. E.g.: >> >>> >> >>> #include >> >>> uint8_t foo(uint32_t x, uint32_t y) { >> >>> uint8_t z = x != 0 ? x : y; >> >>> return z + 1; >> >>> } >> >>> >> >>> generates: >> >>> >> >>> foo: >> >>> cmp w0, 0 >> >>> and w1, w1, 255 >> >>> and w0, w0, 255 >> >>> cselw0, w1, w0, eq >> >>> add w0, w0, 1 >> >>> ret >> >>> >> >>> So I think the new behaviour is really a modification of the >> >>> PROMOTE_MODE behaviour rather than the >> PROMOTE_FUNCTION_MODE behaviour. >> >>> >> >>> FWIW, I agree with Richard that it would be better not to add a new >> hook. >> >>> I think we're really making PROMOTE_MODE choose between >> SIGN_EXTEND, >> >>> ZERO_EXTEND or SUBREG (what LLVM would call “any >> >>> extend”) rather than the current SIGN_EXTEND vs. ZERO_EXTEND >> choice. >> >> >> >> Ah, I hadn't realized this also applied to locals.. ok I can modify >> >> PROMOTE_MODE then, but I also need the actual SSA_NAME and not just >> the type so will have to pass this along. >> >> >> >> From a practical point of view.. the actual hook however is >> >> implemented by 34 targets, would I need to CC maintainers for each of >> >> them or would global maintainer approval suffice for these mostly >> mechanical changes? >> > >> > Yeah, single approval should be enough mechanical changes. It would >> > be good to do the interface change and mechanical target changes as a >> > separate prepatch if possible though. >> > >> > I'm not sure about passing the SSA name to the target though, or the >> > way that the aarch64 hook uses the info. It looks like a single cold >> > comparison could defeat the optimisation for hot code. > > I'm not sure I follow why the likelihood of the comparison matters in this > case.. > I'll expand on it below.. I meant the likelihood that the comparison is executed at all, not which outcome is more likely. E.g. suppose the only comparison occurs on a failure path that eventually calls abort, and that there are other paths (without comparisons of the same value) that would benefit from the any-extend optimisation. We'd prioritise the cold comparison over optimising the other (hot) code. I'm just suspicious of heuristics along the lines of “don't do X if there is a single instance of Y”. :-) >> > If we do try to make the decision based on uses at expand time, it >> > might be better for the analysis to be in target-independent code, >> > with help from the target to decide where extensions are cheap. It >> > still feels a bit hacky though. > > I thought about it but can't see most target having this problem. I did go > with an optimistic heuristics. There are of course various ways to defeat it > but looking through the corpus of code I don't see any but the simple cases > in practice. (more below). > >> > >> > What stops us from forming cbz/cbnz when the extension is done close >> > to the comparison (from the comment in 2/3)? If we can solve that, >> > could we simply do an any-extend all the time, and treat removing >> > redundant extensions as a global availability problem? >> > > In such cases there's no gain from doing the extension at all, e.g. > and w0, w0, 255 > cmp w0, 0 > b.eq .Lfoo > > will be optimized to > > tst w0, 0xff > b.ne .Lfoo > > already. > > In RTL the problem occurs when you have nested control flow like nested if > and switch statements > The example in 2/3 was intended to show that before what we'd do is > > and w22, w0, 255 > > > cbz w22, .Lfoo1 > > > cbz w22, .Lfoo2 > > If we have a single comparison we already sink the zero_extend today. > > Now if we instead any-extend w0 we
Re: [PATCH] lto-plugin: add support for feature detection
On Mon, 16 May 2022, Rui Ueyama wrote: > > @Rui: Am I correct that you're interested in thread-safe claim_file? Is > > there any > > other function being called paralely? > > Yes, I want a thread-safe claim_file. And that function seems to be > the only function in mold that is called in parallel. But note that you'll have to provide a guarantee that all entrypoints that the plugin may invoke when multithreaded (i.e. add_symbols, which is called from claim_file) are also thread-safe. Alexander
[PATCH v3 09/10] libgcc: Add support for HF mode (aka _Float16) in libbid
This patch adds support for trunc and extend operations between HF mode (_Float16) and Decimal Floating Point formats (_Decimal32, _Decimal64 and _Decimal128). For simplicity we rely on the implicit conversions inserted by the compiler between HF and SD/DF/TF modes. The existing bid*_to_binary* and binary*_to_bid* functions are non-trivial and at this stage it is not clear if there is a performance-critical use case involving _Float16 and _Decimal* formats. The patch also adds two executable tests, to make sure the right functions are called, available (link phase) and functional. Tested on aarch64 and x86_86. The number of symbol matches in the testcases includes the .global XXX to avoid having to match different call instructions for different targets. 2022-05-04 Christophe Lyon libgcc/ChangeLog: * Makefile.in (D32PBIT_FUNCS): Add _hf_to_sd and _sd_to_hf. (D64PBIT_FUNCS): Add _hf_to_dd and _dd_to_hf. (D128PBIT_FUNCS): Add _hf_to_td _td_to_hf. libgcc/config/libbid/ChangeLog: * bid_gcc_intrinsics.h (LIBGCC2_HAS_HF_MODE): Define according to __LIBGCC_HAS_HF_MODE__. (BID_HAS_HF_MODE): Define. (HFtype): Define. (__bid_extendhfsd): New prototype. (__bid_extendhfdd): Likewise. (__bid_extendhftd): Likewise. (__bid_truncsdhf): Likewise. (__bid_truncddhf): Likewise. (__bid_trunctdhf): Likewise. * _dd_to_hf.c: New file. * _hf_to_dd.c: New file. * _hf_to_sd.c: New file. * _hf_to_td.c: New file. * _sd_to_hf.c: New file. * _td_to_hf.c: New file. gcc/testsuite/ChangeLog: * gcc.dg/torture/convert-dfp-2.c: New test. * gcc.dg/torture/convert-dfp.c: New test. --- gcc/testsuite/gcc.dg/torture/convert-dfp-2.c | 45 ++ gcc/testsuite/gcc.dg/torture/convert-dfp.c | 63 libgcc/Makefile.in | 9 ++- libgcc/config/libbid/_dd_to_hf.c | 38 libgcc/config/libbid/_hf_to_dd.c | 36 +++ libgcc/config/libbid/_hf_to_sd.c | 36 +++ libgcc/config/libbid/_hf_to_td.c | 36 +++ libgcc/config/libbid/_sd_to_hf.c | 38 libgcc/config/libbid/_td_to_hf.c | 38 libgcc/config/libbid/bid_gcc_intrinsics.h| 30 +- 10 files changed, 364 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/convert-dfp-2.c create mode 100644 gcc/testsuite/gcc.dg/torture/convert-dfp.c create mode 100644 libgcc/config/libbid/_dd_to_hf.c create mode 100644 libgcc/config/libbid/_hf_to_dd.c create mode 100644 libgcc/config/libbid/_hf_to_sd.c create mode 100644 libgcc/config/libbid/_hf_to_td.c create mode 100644 libgcc/config/libbid/_sd_to_hf.c create mode 100644 libgcc/config/libbid/_td_to_hf.c diff --git a/gcc/testsuite/gcc.dg/torture/convert-dfp-2.c b/gcc/testsuite/gcc.dg/torture/convert-dfp-2.c new file mode 100644 index 000..3e4ecb57ba6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/convert-dfp-2.c @@ -0,0 +1,45 @@ +/* { dg-do run } */ +/* { dg-require-effective-target float16_runtime } */ +/* { dg-require-effective-target dfprt } */ +/* { dg-options "-save-temps" } */ +/* { dg-add-options float16 } */ + +/* Test conversions from DFP to smaller types. */ + +_Decimal32 var32; +_Decimal64 var64; +_Decimal128 var128; +_Float16 var16; + +void __attribute__ ((__noinline__)) foo32 (_Decimal32 param32) +{ + var16 = param32; +} + +void __attribute__ ((__noinline__)) foo64 (_Decimal64 param64) +{ + var16 = param64; + var32 = param64; +} + +void __attribute__ ((__noinline__)) foo128 (_Decimal128 param128) +{ + var16 = param128; + var32 = param128; + var64 = param128; +} + +int main () +{ + foo32 (var32); + foo64 (var64); + foo128 (var128); + return 0; +} + +/* { dg-final { scan-assembler-times {\t__bid_truncsdhf} 2 { target { dfp_bid } } } } */ +/* { dg-final { scan-assembler-times {\t__bid_truncddhf} 2 { target { dfp_bid } } } } */ +/* { dg-final { scan-assembler-times {\t__bid_truncddsd2} 2 { target { dfp_bid } } } } */ +/* { dg-final { scan-assembler-times {\t__bid_trunctdhf} 2 { target { dfp_bid } } } } */ +/* { dg-final { scan-assembler-times {\t__bid_trunctdsd2} 2 { target { dfp_bid } } } } */ +/* { dg-final { scan-assembler-times {\t__bid_trunctddd2} 2 { target { dfp_bid } } } } */ diff --git a/gcc/testsuite/gcc.dg/torture/convert-dfp.c b/gcc/testsuite/gcc.dg/torture/convert-dfp.c new file mode 100644 index 000..ec136896ca7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/convert-dfp.c @@ -0,0 +1,63 @@ +/* { dg-do run } */ +/* { dg-require-effective-target float16_runtime } */ +/* { dg-require-effective-target dfprt } */ +/* { dg-options "-save-temps" } */ +/* { dg-add-options float16 } */ + +/* Test conversions to/from DFP values. */ + +extern void abort (); + +_Decimal32 var32 = 1.2df; + +int __attribute__ ((__noinline__)
[committed 1/4] libstdc++: Fix status docs for support
Pushed to trunk. Backports to all branches needed. -- >8 -- libstdc++-v3/ChangeLog: * doc/html/manual/status.html: Regenerate. * doc/xml/manual/status_cxx2020.xml: Fix supported version for C++20 bit operations. --- libstdc++-v3/doc/html/manual/status.html | 2 +- libstdc++-v3/doc/xml/manual/status_cxx2020.xml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml index cde88634126..7611d174233 100644 --- a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml +++ b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml @@ -1394,7 +1394,7 @@ or any notes about the implementation. http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0553r4.html";> P0553R4 - 10.1 + 9.1 __cpp_lib_bitops >= 201907L (since 9.4, see Note 1) -- 2.34.3
[committed 2/4] libstdc++: Add C++23 status docs
Pushed to trunk. Backports to gcc-11 and gcc-12 to follow. -- >8 -- These are the C++23 proposals already supported in the gcc-11 branch. libstdc++-v3/ChangeLog: * doc/xml/manual/intro.xml: Include new chapter. * doc/xml/manual/status_cxx2020.xml: Tweak release numbers. * doc/xml/manual/status_cxx2023.xml: New file. * doc/html/*: Regenerate. --- libstdc++-v3/doc/html/index.html | 2 +- libstdc++-v3/doc/html/manual/index.html | 4 +- libstdc++-v3/doc/html/manual/intro.html | 2 +- libstdc++-v3/doc/html/manual/status.html | 109 +++- libstdc++-v3/doc/xml/manual/intro.xml | 10 +- .../doc/xml/manual/status_cxx2020.xml | 16 +- .../doc/xml/manual/status_cxx2023.xml | 232 ++ 7 files changed, 347 insertions(+), 28 deletions(-) create mode 100644 libstdc++-v3/doc/xml/manual/status_cxx2023.xml diff --git a/libstdc++-v3/doc/xml/manual/intro.xml b/libstdc++-v3/doc/xml/manual/intro.xml index 548b632b6e4..290e5d3a74e 100644 --- a/libstdc++-v3/doc/xml/manual/intro.xml +++ b/libstdc++-v3/doc/xml/manual/intro.xml @@ -47,15 +47,19 @@ http://www.w3.org/2001/XInclude"; parse="xml" href="status_cxx2020.xml"> - + +http://www.w3.org/2001/XInclude"; parse="xml" href="status_cxx2023.xml"> + + + http://www.w3.org/2001/XInclude"; parse="xml" href="status_cxxtr1.xml"> - + http://www.w3.org/2001/XInclude"; parse="xml" href="status_cxxtr24733.xml"> - + http://www.w3.org/2001/XInclude"; parse="xml" href="status_cxxis29124.xml"> diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml index 7611d174233..d979db06b11 100644 --- a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml +++ b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml @@ -25,8 +25,8 @@ not in any particular release. -The following table lists new library features that have been accepted into -the C++20 working draft. The "Proposal" column provides a link to the +The following table lists new library features that are included in +the C++20 standard. The "Proposal" column provides a link to the ISO C++ committee proposal that describes the feature, while the "Status" column indicates the first version of GCC that contains an implementation of this feature (if it has been implemented). @@ -359,7 +359,7 @@ or any notes about the implementation. http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0053r7.pdf";> P0053R7 - 11 + 11.1 __cpp_lib_syncbuf >= 201711L @@ -369,7 +369,7 @@ or any notes about the implementation. http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0753r2.pdf";> P0753R2 - 11 + 11.1 __cpp_lib_syncbuf >= 201803L @@ -643,7 +643,7 @@ or any notes about the implementation. http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0966r1.html";> P0966R1 - 11 + 11.1 @@ -1039,7 +1039,7 @@ or any notes about the implementation. http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0476r2.html";> P0476R2 - 11 + 11.1 __cpp_lib_bit_cast >= 201806L @@ -1424,7 +1424,7 @@ or any notes about the implementation. http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1208r6.pdf";> P1208R6 - 11 + 11.1 __cpp_lib_source_location >= 201907L @@ -1436,7 +1436,7 @@ or any notes about the implementation. http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0408r7.pdf";> P0408R7 - 11 + 11.1 diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2023.xml b/libstdc++-v3/doc/xml/manual/status_cxx2023.xml new file mode 100644 index 000..f1244da9dc8 --- /dev/null +++ b/libstdc++-v3/doc/xml/manual/status_cxx2023.xml @@ -0,0 +1,232 @@ +http://docbook.org/ns/docbook"; version="5.0" + xml:id="status.iso.2023" xreflabel="Status C++ 2023"> + + +C++ 2023 + +ISO C++ +2023 + + + + +In this implementation the -std=gnu++23 or +-std=c++23 flag must be used to enable language +and library +features. See dialect +options. The pre-defined symbol +__cplusplus is used to check for the +presence of the required flag. + + + +This section describes the C++23 and library TS support in mainline GCC, +not in any particular release. + + + +The following table lists new library features that have been accepted into +the C++23 working draft. The "Proposal" column provides a link to the +ISO C++ committe
[committed 4/4] libstdc++: Fix hyperlink in docs
Pushed to trunk. Backports to follow. -- >8 -- libstdc++-v3/ChangeLog: * doc/xml/manual/prerequisites.xml: Fix attributes for external hyperlink. * doc/html/manual/setup.html: Regenerate. --- libstdc++-v3/doc/html/manual/setup.html | 2 +- libstdc++-v3/doc/xml/manual/prerequisites.xml | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/libstdc++-v3/doc/xml/manual/prerequisites.xml b/libstdc++-v3/doc/xml/manual/prerequisites.xml index 8799487c821..f00979a1afa 100644 --- a/libstdc++-v3/doc/xml/manual/prerequisites.xml +++ b/libstdc++-v3/doc/xml/manual/prerequisites.xml @@ -71,7 +71,8 @@ - https://www.gnu.org/software/libiconv/#downloading";> + http://www.w3.org/1999/xlink"; + xlink:href="https://www.gnu.org/software/libiconv/#downloading";> Download the libiconv sources and extract them into the top level of the GCC source tree, e.g., -- 2.34.3
[committed 3/4] libstdc++: Update C++23 status docs
Pushed to trunk. Backport to gcc-12 to follow. -- >8 -- These are the C++23 proposals supported in the gcc-12 branch. libstdc++-v3/ChangeLog: * doc/xml/manual/status_cxx2023.xml: Update with gcc-12 support. * doc/html/*: Regenerate. --- libstdc++-v3/doc/html/manual/status.html | 132 ++- .../doc/xml/manual/status_cxx2023.xml | 341 ++ 2 files changed, 468 insertions(+), 5 deletions(-) diff --git a/libstdc++-v3/doc/html/manual/status.html b/libstdc++-v3/doc/html/manual/status.html index 118ce30ad54..8706d54b07a 100644 --- a/libstdc++-v3/doc/html/manual/status.html +++ b/libstdc++-v3/doc/html/manual/status.html @@ -1757,7 +1757,31 @@ or any notes about the implementation. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2251r1.pdf"; target="_top"> P2251R1 - Yes + Yes Repairing input range adaptors and counted_iterator +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2259r1.html"; target="_top"> +P2259R1 + + 12.1 Superior String Splitting +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2210r2.html"; target="_top"> +P2210R2 + + 12.1 What is a view? +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2415r2.html"; target="_top"> +P2415R2 + + 12.1 __cpp_lib_ranges >= 202110L Fix istream_view +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2432r1.pdf"; target="_top"> +P2432R1 + + 12.1 starts_with and ends_with +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1659r3.html"; target="_top"> +P1659R3 + +__cpp_lib_ranges_starts_ends_with >= 202106L zip +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2321r2.html"; target="_top"> +P2321R2 + +__cpp_lib_ranges_zip >= 202110L Compile-time programming A proposal for a type trait to detect scoped enumerations http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1048r1.pdf"; target="_top"> @@ -1767,13 +1791,67 @@ or any notes about the implementation. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1682r3.html"; target="_top"> P1682R3 - 11.1 __cpp_lib_to_underlying >= 202102L + 11.1 __cpp_lib_to_underlying >= 202102L Missing constexpr in std::optional and std::variant +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2231r1.html"; target="_top"> +P2231R1 + + +11.3 (optional) 12.1 (variant) + +__cpp_lib_constexpr_optional >= 202106L __cpp_lib_variant >= 202106L + Making std::unique_ptr constexpr +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2273r3.pdf"; target="_top"> +P2273R3 + + 12.1 __cpp_lib_constexpr_memory >= 202202L Making std::type_info::operator== constexpr +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1328r1.html"; target="_top"> +P1328R1 + + 12.1 __cpp_lib_constexpr_typeinfo >= 202106L constexpr forand +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p0533r9.pdf"; target="_top"> +P0533R9 + +__cpp_lib_constexpr_cmath >= 202202L Deprecate std::aligned_storage and std::aligned_union +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1413r3.pdf"; target="_top"> +P1413R3 + + A type trait to detect reference binding to temporary +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2255r2.html"; target="_top"> +P2255R2 + +__cpp_lib_reference_from_temporary >= 202202L Strings and text string contains function http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1679r3.html"; target="_top"> P1679R3 - 11.1 __cpp_lib_string_contains >= 202011L + 11.1 __cpp_lib_string_contains >= 202011L Prohibit std::basic_string and std::basic_string_view construction from nullptr +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2166r1.html"; target="_top"> +P2166R1 + + 12.1 basic_string::resize_and_overwrite +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1072r10.html"; target="_top"> +P1072R10 + + 12.1 __cpp_lib_resize_and_overwrite >= 202110L A strstream replacement using span as buffer +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p0448r4.pdf"; target="_top"> +P0448R4 + + 12.1 __cpp_lib_spanstream >= 202106L +Containers + Iterator pair constructors for stack and queue +http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1425r4.pdf"; target="_top"> +P1425R4 + + 12.1 __cpp_lib
Re: [committed 3/4] libstdc++: Update C++23 status docs
This documents the partial support for P2231 in the gcc-11 branch, pushed to that branch. commit 5d418194ccb39346d2ad022c5b143fe00b2340ac Author: Jonathan Wakely Date: Mon May 16 15:33:06 2022 libstdc++: Document support for constexpr optional (P2231R1) The changes for std::variant are not on the gcc-11 branch though. libstdc++-v3/ChangeLog: * doc/xml/manual/status_cxx2023.xml: Update status. * doc/html/manual/status.html: Regenerate. diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2023.xml b/libstdc++-v3/doc/xml/manual/status_cxx2023.xml index f1244da9dc8..75f31f55aa9 100644 --- a/libstdc++-v3/doc/xml/manual/status_cxx2023.xml +++ b/libstdc++-v3/doc/xml/manual/status_cxx2023.xml @@ -158,6 +158,23 @@ or any notes about the implementation. __cpp_lib_to_underlying >= 202102L + + + Missing constexpr in std::optional and std::variant + +http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2231r1.html";> +P2231R1 + + + 11.3 (optional only) + + + __cpp_lib_constexpr_optional >= 202106L + __cpp_lib_variant >= 202106L + + + + Strings and text
[Patch] OpenMP: Skip target-nesting warning for reverse offload
A warning about target-nesting inside target makes sense, but not if the inner target is one for reverse offload ("device(ancestor:1)"). Thus, silence the warning in this case. OK for mainline? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 OpenMP: Skip target-nesting warning for reverse offload gcc/ChangeLog: * omp-low.cc (check_omp_nesting_restrictions): Skip warning for target inside target if inner is reverse offload. gcc/testsuite/ChangeLog: * c-c++-common/gomp/target-device-ancestor-5.c: New test. omp-low.cc | 10 ++ testsuite/c-c++-common/gomp/target-device-ancestor-5.c | 28 2 files changed, 38 insertions(+) diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index 8aebaeecd42..f7f44fca450 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -3883,6 +3883,16 @@ check_omp_nesting_restrictions (gimple *stmt, omp_context *ctx) } else { + if ((gimple_omp_target_kind (ctx->stmt) + == GF_OMP_TARGET_KIND_REGION) + && (gimple_omp_target_kind (stmt) + == GF_OMP_TARGET_KIND_REGION) + && ((c = omp_find_clause ( + gimple_omp_target_clauses (stmt), + OMP_CLAUSE_DEVICE)) + != NULL_TREE) + && OMP_CLAUSE_DEVICE_ANCESTOR (c)) + break; warning_at (gimple_location (stmt), 0, "%qs construct inside of %qs region", stmt_name, ctx_stmt_name); diff --git a/gcc/testsuite/c-c++-common/gomp/target-device-ancestor-5.c b/gcc/testsuite/c-c++-common/gomp/target-device-ancestor-5.c new file mode 100644 index 000..b6ff84bcdab --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/target-device-ancestor-5.c @@ -0,0 +1,28 @@ +#pragma omp requires reverse_offload /* { dg-message "sorry, unimplemented: 'reverse_offload' clause on 'requires' directive not supported yet" } */ + +void +foo () +{ + /* Good nesting - as reverse offload */ + #pragma omp target + #pragma omp target device(ancestor:1) /* valid -> no warning */ /* { dg-bogus "'target' construct inside of 'target' region" } */ +{ } + + /* Bad nesting */ + #pragma omp target + #pragma omp target /* { dg-warning "'target' construct inside of 'target' region" } */ + #pragma omp target /* { dg-warning "'target' construct inside of 'target' region" } */ +{ } + + /* Good nesting - as reverse offload */ + #pragma omp target + #pragma omp target /* { dg-warning "'target' construct inside of 'target' region" } */ + #pragma omp target device(ancestor:1) /* valid -> no warning */ /* { dg-bogus "'target' construct inside of 'target' region" } */ + { } + + #pragma omp target + #pragma omp target device(ancestor:1) /* valid -> no warning */ /* { dg-bogus "'target' construct inside of 'target' region" } */ + #pragma omp target device(ancestor:1) /* { dg-error "OpenMP constructs are not allowed in target region with 'ancestor'" } */ + { } + +}
PING#2 Re: [PATCH RFA] attribs: fix typedefs in generic code [PR105492]
Ping. On 5/10/22 16:48, Jason Merrill wrote: Ping? On 5/5/22 14:07, Jason Merrill wrote: In my patch for PR100545 I added an assert to check for broken typedefs in set_underlying_type, and it found one in this case: rs6000_handle_altivec_attribute had the same problem as handle_mode_attribute. So let's move the fixup into decl_attributes. Tested that this fixes the ICE on a cross compiler, regression tested x86_64-pc-linux-gnu, OK for trunk? PR c/105492 gcc/ChangeLog: * attribs.cc (decl_attributes): Fix broken typedefs here. gcc/c-family/ChangeLog: * c-attribs.cc (handle_mode_attribute): Don't fix broken typedefs here. --- gcc/attribs.cc | 15 +++ gcc/c-family/c-attribs.cc | 10 -- 2 files changed, 15 insertions(+), 10 deletions(-) diff --git a/gcc/attribs.cc b/gcc/attribs.cc index b219f878042..0648391f0c6 100644 --- a/gcc/attribs.cc +++ b/gcc/attribs.cc @@ -872,6 +872,21 @@ decl_attributes (tree *node, tree attributes, int flags, tree ret = (spec->handler) (cur_and_last_decl, name, args, flags|cxx11_flag, &no_add_attrs); + /* Fix up typedefs clobbered by attribute handlers. */ + if (TREE_CODE (*node) == TYPE_DECL + && anode == &TREE_TYPE (*node) + && DECL_ORIGINAL_TYPE (*node) + && TYPE_NAME (*anode) == *node + && TYPE_NAME (cur_and_last_decl[0]) != *node) + { + tree t = cur_and_last_decl[0]; + DECL_ORIGINAL_TYPE (*node) = t; + tree tt = build_variant_type_copy (t); + cur_and_last_decl[0] = tt; + TREE_TYPE (*node) = tt; + TYPE_NAME (tt) = *node; + } + *anode = cur_and_last_decl[0]; if (ret == error_mark_node) { diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc index b1953a45f9b..a280987c111 100644 --- a/gcc/c-family/c-attribs.cc +++ b/gcc/c-family/c-attribs.cc @@ -2204,16 +2204,6 @@ handle_mode_attribute (tree *node, tree name, tree args, TYPE_QUALS (type)); if (TYPE_USER_ALIGN (type)) *node = build_aligned_type (*node, TYPE_ALIGN (type)); - - tree decl = node[2]; - if (decl && TYPE_NAME (type) == decl) - { - /* Set up the typedef all over again. */ - DECL_ORIGINAL_TYPE (decl) = NULL_TREE; - TREE_TYPE (decl) = *node; - set_underlying_type (decl); - *node = TREE_TYPE (decl); - } } return NULL_TREE; base-commit: 000f4480005035d0811e009a7cb25b42721f0a6e
Re: [PATCH] lto-plugin: add support for feature detection
On Mon, 16 May 2022, Martin Liška wrote: > I've implemented first version of the patch, please take a look. I'll comment on the patch, feel free to inform me when I should back off with forcing my opinion in this thread :) > --- a/include/plugin-api.h > +++ b/include/plugin-api.h > @@ -483,6 +483,18 @@ enum ld_plugin_level >LDPL_FATAL > }; > > +/* The linker's interface for API version negotiation. */ > + > +typedef > +int (*ld_plugin_get_api_version) (char *linker_identifier, int > linker_version, > + int preferred_linker_api, > + const char **compiler_identifier, > + int *compiler_version); > + > +typedef > +enum ld_plugin_status > +(*ld_plugin_register_get_api_version) (ld_plugin_get_api_version handler); > + > /* Values for the tv_tag field of the transfer vector. */ > > enum ld_plugin_tag > @@ -521,6 +533,7 @@ enum ld_plugin_tag >LDPT_REGISTER_NEW_INPUT_HOOK, >LDPT_GET_WRAP_SYMBOLS, >LDPT_ADD_SYMBOLS_V2, > + LDPT_REGISTER_GET_API_VERSION, > }; > > /* The plugin transfer vector. */ > @@ -556,6 +569,7 @@ struct ld_plugin_tv > ld_plugin_get_input_section_size tv_get_input_section_size; > ld_plugin_register_new_input tv_register_new_input; > ld_plugin_get_wrap_symbols tv_get_wrap_symbols; > +ld_plugin_register_get_api_version tv_register_get_api_version; >} tv_u; > }; Here I disagree with the overall design. Rui already pointed out how plugin API seems to consist of callbacks-that-register-callbacks, and I'm with him on that, let's not make that worse. On a more serious note, this pattern: * the linker provides register_get_api_version entrypoint * the plugin registers its get_api_version implementation * the linker uses the provided function pointer is problematic because the plugin doesn't know when the linker is going to invoke its callback (or maybe the linker won't do that at all). I'd recommend to reduce the level of indirection, remove the register_ callback, and simply require that if LDPT_GET_API_VERSION is provided, the plugin MUST invoke it before returning from onload, i.e.: * the linker invokes onload with LDPT_GET_API_VERSION in 'transfer vector' * the plugin iterates over the transfer vector and notes if LDPT_GET_API_VERSION is seen * if not, the plugin knows the linker is predates its introduction * if yes, the plugin invokes it before returning from onload * the linker now knows the plugin version (either one provided via LDPT_GET_API_VERSION, or 'old' if the callback wasn't invoked). > diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c > index 00b760636dc..49484decd89 100644 > --- a/lto-plugin/lto-plugin.c > +++ b/lto-plugin/lto-plugin.c > @@ -69,6 +69,7 @@ along with this program; see the file COPYING3. If not see > #include "../gcc/lto/common.h" > #include "simple-object.h" > #include "plugin-api.h" > +#include "ansidecl.h" > > /* We need to use I64 instead of ll width-specifier on native Windows. > The reason for this is that older MS-runtimes don't support the ll. */ > @@ -166,6 +167,10 @@ static ld_plugin_add_input_file add_input_file; > static ld_plugin_add_input_library add_input_library; > static ld_plugin_message message; > static ld_plugin_add_symbols add_symbols, add_symbols_v2; > +static ld_plugin_register_get_api_version register_get_api_version; > + > +/* By default, use version 1 if there is not negotiation. */ > +static int used_api_version = 1; > > static struct plugin_file_info *claimed_files = NULL; > static unsigned int num_claimed_files = 0; > @@ -1407,6 +1412,29 @@ process_option (const char *option) >verbose = verbose || debug; > } > > +static int > +get_api_version (char *linker_identifier, int linker_version, > + int preferred_linker_api, The 'preferred' qualifier seems vague. If you go with my suggestion above, I'd suggest to pass lowest and highest supported version number, and the linker can check if that intersects with its range of supported versions, and error out if the intersection is empty (and otherwise return the highest version they both support as the 'negotiated' one). > + const char **compiler_identifier, > + int *compiler_version) > +{ > + *compiler_identifier = "GCC"; > + *compiler_version = GCC_VERSION; Note that I'm chiming in here because I worked on a tool that used LTO plugin API to discover symbol/section dependencies with high accuracy. So I'd like to remind that implementors/consumers of this API are not necessarily compilers! Alexander
RE: [PATCH 1/3]middle-end: Add the ability to let the target decide the method of argument promotions.
> -Original Message- > From: Richard Sandiford > Sent: Monday, May 16, 2022 2:24 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; > jeffreya...@gmail.com > Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the target decide > the method of argument promotions. > > Tamar Christina writes: > >> -Original Message- > >> From: Richard Sandiford > >> Sent: Monday, May 16, 2022 1:18 PM > >> To: Tamar Christina > >> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; > >> jeffreya...@gmail.com > >> Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the target > >> decide the method of argument promotions. > >> > >> Richard Sandiford via Gcc-patches writes: > >> > Tamar Christina writes: > >> >>> -Original Message- > >> >>> From: Richard Sandiford > >> >>> Sent: Monday, May 16, 2022 12:36 PM > >> >>> To: Tamar Christina > >> >>> Cc: gcc-patches@gcc.gnu.org; nd ; > rguent...@suse.de; > >> >>> jeffreya...@gmail.com > >> >>> Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the > >> >>> target decide the method of argument promotions. > >> >>> > >> >>> Tamar Christina writes: > >> >>> > Hi All, > >> >>> > > >> >>> > Some targets require function parameters to be promoted to a > >> >>> > different type on expand time because the target may not have > >> >>> > native instructions to work on such types. As an example the > >> >>> > AArch64 port does not have native instructions working on > >> >>> > integer > >> >>> > 8- or 16-bit values. As such it promotes every parameter of > >> >>> > these > >> types to 32-bits. > >> >>> > >> >>> This doesn't seem specific to parameters though. It applies to > >> >>> any > >> >>> 8- or 16-bit variable. E.g.: > >> >>> > >> >>> #include > >> >>> uint8_t foo(uint32_t x, uint32_t y) { > >> >>> uint8_t z = x != 0 ? x : y; > >> >>> return z + 1; > >> >>> } > >> >>> > >> >>> generates: > >> >>> > >> >>> foo: > >> >>> cmp w0, 0 > >> >>> and w1, w1, 255 > >> >>> and w0, w0, 255 > >> >>> cselw0, w1, w0, eq > >> >>> add w0, w0, 1 > >> >>> ret > >> >>> > >> >>> So I think the new behaviour is really a modification of the > >> >>> PROMOTE_MODE behaviour rather than the > >> PROMOTE_FUNCTION_MODE behaviour. > >> >>> > >> >>> FWIW, I agree with Richard that it would be better not to add a > >> >>> new > >> hook. > >> >>> I think we're really making PROMOTE_MODE choose between > >> SIGN_EXTEND, > >> >>> ZERO_EXTEND or SUBREG (what LLVM would call “any > >> >>> extend”) rather than the current SIGN_EXTEND vs. ZERO_EXTEND > >> choice. > >> >> > >> >> Ah, I hadn't realized this also applied to locals.. ok I can > >> >> modify PROMOTE_MODE then, but I also need the actual SSA_NAME > and > >> >> not just > >> the type so will have to pass this along. > >> >> > >> >> From a practical point of view.. the actual hook however is > >> >> implemented by 34 targets, would I need to CC maintainers for each > >> >> of them or would global maintainer approval suffice for these > >> >> mostly > >> mechanical changes? > >> > > >> > Yeah, single approval should be enough mechanical changes. It > >> > would be good to do the interface change and mechanical target > >> > changes as a separate prepatch if possible though. > >> > > >> > I'm not sure about passing the SSA name to the target though, or > >> > the way that the aarch64 hook uses the info. It looks like a > >> > single cold comparison could defeat the optimisation for hot code. > > > > I'm not sure I follow why the likelihood of the comparison matters in this > case.. > > I'll expand on it below.. > > I meant the likelihood that the comparison is executed at all, not which > outcome is more likely. E.g. suppose the only comparison occurs on a failure > path that eventually calls abort, and that there are other paths (without > comparisons of the same value) that would benefit from the any-extend > optimisation. We'd prioritise the cold comparison over optimising the other > (hot) code. > > I'm just suspicious of heuristics along the lines of “don't do X if there is a > single instance of Y”. :-) I'm probably very dense here sorry.. but if there's 1 use: the zero extend gets pushed down into the branch which needs it. i.e. in: extern void foo (); extern void bar (); uint8_t f (uint8_t a, uint8_t b) { if (b) { if (a) foo (); else return f (a, b); } else { bar (); } return b; } The zero extend of a is only done in the true branch for if (b). Secondly the zero extended form is the basis for all other patterns we form, such as ands, which is the combination of the zero extend and compare. 2 uses, both live: extern void foo (); extern void bar (uint8_t); uint8_t f (uint8_t a, uint8_t b) { if (b) { if (a) foo (); else return f (a, b); } else { bar (a); } return b; } In which case the extend of a is done before
[PATCH v5] c++: ICE with temporary of class type in DMI [PR100252]
On Sat, May 14, 2022 at 11:13:28PM -0400, Jason Merrill wrote: > On 5/13/22 19:41, Marek Polacek wrote: > > --- a/gcc/cp/typeck2.cc > > +++ b/gcc/cp/typeck2.cc > > @@ -1371,6 +1371,70 @@ digest_init_flags (tree type, tree init, int flags, > > tsubst_flags_t complain) > > return digest_init_r (type, init, 0, flags, complain); > > } > > +/* Return true if a prvalue is used as an initializer rather than for > > + temporary materialization. For instance: > > I might say "if SUBOB initializes the same object as FULL_EXPR"; the > full-expression could still end up initializing a temporary. Fixed. > > + A a = A{}; // initializer > > + A a = (A{});// initializer > > + A a = (1, A{}); // initializer > > + A a = true ? A{} : A{}; // initializer > > + auto x = A{}.x; // temporary materialization > > + auto x = foo(A{}); // temporary materialization > > + > > + FULL_EXPR is the whole expression, SUBOB is its TARGET_EXPR subobject. > > */ > > + > > +static bool > > +potential_prvalue_result_of (tree subob, tree full_expr) > > +{ > > + if (subob == full_expr) > > +return true; > > + else if (TREE_CODE (full_expr) == TARGET_EXPR) > > +{ > > + tree init = TARGET_EXPR_INITIAL (full_expr); > > + if (TREE_CODE (init) == COND_EXPR) > > + return (potential_prvalue_result_of (subob, TREE_OPERAND (init, 1)) > > + || potential_prvalue_result_of (subob, TREE_OPERAND (init, 2))); > > + else if (TREE_CODE (init) == COMPOUND_EXPR) > > + return (potential_prvalue_result_of (subob, TREE_OPERAND (init, 0)) > > We shouldn't recurse into the LHS of the comma, only the RHS. Fixed. > > + || potential_prvalue_result_of (subob, TREE_OPERAND (init, 1))); > > + /* ??? I don't know if this can be hit. If so, look inside the ( ) > > +instead of the assert. */ > > + else if (TREE_CODE (init) == PAREN_EXPR) > > + gcc_checking_assert (false); > > It seems trivial enough to recurse after the assert, in case it does happen > in the wild. OK, adjusted. Thanks! Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- Consider struct A { int x; int y = x; }; struct B { int x = 0; int y = A{x}.y; // #1 }; where for #1 we end up with {.x=(&)->x, .y=(&)->x} that is, two PLACEHOLDER_EXPRs for different types on the same level in a {}. This crashes because our CONSTRUCTOR_PLACEHOLDER_BOUNDARY mechanism to avoid replacing unrelated PLACEHOLDER_EXPRs cannot deal with it. Here's why we wound up with those PLACEHOLDER_EXPRs: When we're performing cp_parser_late_parsing_nsdmi for "int y = A{x}.y;" we use finish_compound_literal on type=A, compound_literal={((struct B *) this)->x}. When digesting this initializer, we call get_nsdmi which creates a PLACEHOLDER_EXPR for A -- we don't have any object to refer to yet. After digesting, we have {.x=((struct B *) this)->x, .y=(&)->x} and since we've created a PLACEHOLDER_EXPR inside it, we marked the whole ctor CONSTRUCTOR_PLACEHOLDER_BOUNDARY. f_c_l creates a TARGET_EXPR and returns TARGET_EXPR x, .y=(&)->x}> Then we get to B b = {}; and call store_init_value, which digests the {}, which produces {.x=NON_LVALUE_EXPR <0>, .y=(TARGET_EXPR )->x, .y=(&)->x}>).y} lookup_placeholder in constexpr won't find an object to replace the PLACEHOLDER_EXPR for B, because ctx->object will be D.2395 of type A, and we cannot search outward from D.2395 to find 'b'. The call to replace_placeholders in store_init_value will not do anything: we've marked the inner { } CONSTRUCTOR_PLACEHOLDER_BOUNDARY, and it's only a sub-expression, so replace_placeholders does nothing, so the stays even though now is the perfect time to replace it because we have an object for it: 'b'. Later, in cp_gimplify_init_expr the *expr_p is D.2395 = {.x=(&)->x, .y=(&)->x} where D.2395 is of type A, but we crash because we hit , which has a different type. My idea was to replace with D.2384 after creating the TARGET_EXPR because that means we have an object we can refer to. Then clear CONSTRUCTOR_PLACEHOLDER_BOUNDARY because we no longer have a PLACEHOLDER_EXPR in the {}. Then store_init_value will be able to replace with 'b', and we should be good to go. We must be careful not to break guaranteed copy elision, so this replacement happens in digest_nsdmi_init where we can see the whole initializer, and avoid replacing any placeholders in TARGET_EXPRs used in the context of initialization/copy elision. This is achieved via the new function called potential_prvalue_result_of. While fixing this problem, I found PR105550, thus the FIXMEs in the tests. PR c++/100252 gcc/cp/ChangeLog: * typeck2.cc (potential_prvalue_result_of): New. (replace_placeholders_for_class_temp_r): New. (digest_nsdmi_init): Call it. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/nsdmi-aggr14.C: New test.
Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]
On Mon, 9 May 2022, Jan Hubicka wrote: > > On second thought, it might be better to keep the assert, and place the loop > > under 'if (optimize)'? > > The problem is that at IPA level it does not make sense to check > optimize flag as it is function specific. (shlib is OK to check it > anywhere since it is global.) > > So I think we really want to run the code only at the WPA time > (symtab_state>=IPA_SSA) and we want to see what is optimization flag of > those function referring the variable since that is what decided codegen > we will produce. Perhaps I misunderstood the issue. Are you saying that there might be no -O option on lto1 command line, because lto1 is supposed to take optimization level from function summaries, but during pass_ipa_whole_program_visibility there's no "current function" so 'optimize' is at its default value (zero)? And the solution is to iterate over referring functions to see if at least one of them satisfies 'opt_for_fn (decl, optimize) > 0'? Alexander
Re: [PATCH 1/3]middle-end: Add the ability to let the target decide the method of argument promotions.
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Monday, May 16, 2022 2:24 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; >> jeffreya...@gmail.com >> Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the target decide >> the method of argument promotions. >> >> Tamar Christina writes: >> >> -Original Message- >> >> From: Richard Sandiford >> >> Sent: Monday, May 16, 2022 1:18 PM >> >> To: Tamar Christina >> >> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; >> >> jeffreya...@gmail.com >> >> Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the target >> >> decide the method of argument promotions. >> >> >> >> Richard Sandiford via Gcc-patches writes: >> >> > Tamar Christina writes: >> >> >>> -Original Message- >> >> >>> From: Richard Sandiford >> >> >>> Sent: Monday, May 16, 2022 12:36 PM >> >> >>> To: Tamar Christina >> >> >>> Cc: gcc-patches@gcc.gnu.org; nd ; >> rguent...@suse.de; >> >> >>> jeffreya...@gmail.com >> >> >>> Subject: Re: [PATCH 1/3]middle-end: Add the ability to let the >> >> >>> target decide the method of argument promotions. >> >> >>> >> >> >>> Tamar Christina writes: >> >> >>> > Hi All, >> >> >>> > >> >> >>> > Some targets require function parameters to be promoted to a >> >> >>> > different type on expand time because the target may not have >> >> >>> > native instructions to work on such types. As an example the >> >> >>> > AArch64 port does not have native instructions working on >> >> >>> > integer >> >> >>> > 8- or 16-bit values. As such it promotes every parameter of >> >> >>> > these >> >> types to 32-bits. >> >> >>> >> >> >>> This doesn't seem specific to parameters though. It applies to >> >> >>> any >> >> >>> 8- or 16-bit variable. E.g.: >> >> >>> >> >> >>> #include >> >> >>> uint8_t foo(uint32_t x, uint32_t y) { >> >> >>> uint8_t z = x != 0 ? x : y; >> >> >>> return z + 1; >> >> >>> } >> >> >>> >> >> >>> generates: >> >> >>> >> >> >>> foo: >> >> >>> cmp w0, 0 >> >> >>> and w1, w1, 255 >> >> >>> and w0, w0, 255 >> >> >>> cselw0, w1, w0, eq >> >> >>> add w0, w0, 1 >> >> >>> ret >> >> >>> >> >> >>> So I think the new behaviour is really a modification of the >> >> >>> PROMOTE_MODE behaviour rather than the >> >> PROMOTE_FUNCTION_MODE behaviour. >> >> >>> >> >> >>> FWIW, I agree with Richard that it would be better not to add a >> >> >>> new >> >> hook. >> >> >>> I think we're really making PROMOTE_MODE choose between >> >> SIGN_EXTEND, >> >> >>> ZERO_EXTEND or SUBREG (what LLVM would call “any >> >> >>> extend”) rather than the current SIGN_EXTEND vs. ZERO_EXTEND >> >> choice. >> >> >> >> >> >> Ah, I hadn't realized this also applied to locals.. ok I can >> >> >> modify PROMOTE_MODE then, but I also need the actual SSA_NAME >> and >> >> >> not just >> >> the type so will have to pass this along. >> >> >> >> >> >> From a practical point of view.. the actual hook however is >> >> >> implemented by 34 targets, would I need to CC maintainers for each >> >> >> of them or would global maintainer approval suffice for these >> >> >> mostly >> >> mechanical changes? >> >> > >> >> > Yeah, single approval should be enough mechanical changes. It >> >> > would be good to do the interface change and mechanical target >> >> > changes as a separate prepatch if possible though. >> >> > >> >> > I'm not sure about passing the SSA name to the target though, or >> >> > the way that the aarch64 hook uses the info. It looks like a >> >> > single cold comparison could defeat the optimisation for hot code. >> > >> > I'm not sure I follow why the likelihood of the comparison matters in this >> case.. >> > I'll expand on it below.. >> >> I meant the likelihood that the comparison is executed at all, not which >> outcome is more likely. E.g. suppose the only comparison occurs on a failure >> path that eventually calls abort, and that there are other paths (without >> comparisons of the same value) that would benefit from the any-extend >> optimisation. We'd prioritise the cold comparison over optimising the other >> (hot) code. >> >> I'm just suspicious of heuristics along the lines of “don't do X if there is >> a >> single instance of Y”. :-) > > I'm probably very dense here sorry.. but if there's > > 1 use: the zero extend gets pushed down into the branch which needs it. > > i.e. in: > > extern void foo (); > extern void bar (); > > uint8_t f (uint8_t a, uint8_t b) > { > if (b) { > if (a) > foo (); > else > return f (a, b); > } else { > bar (); > } > return b; > } > > The zero extend of a is only done in the true branch for if (b). Secondly > the zero > extended form is the basis for all other patterns we form, such as ands, > which is > the combination of the zero extend and compare. > > 2 uses, both live: > > extern void foo (); > extern void bar (uint8_t
[committed] d: Merge upstream dmd 60bfa0ee7, druntime 94bd5bcb, phobos 3a1cd9a01.
Hi, Upstream dmd has now released v2.100.0, this patch merges in the latest bug fixes since the last sync-up of the release branch, as well as all new feature changes on development branch. D front-end changes: - Import dmd v2.100.0. - Add bit fields to D, enabled via the -fpreview=bitfields switch. - Removed the -ftransition=markdown and -frevert=markdown switches. - Added new trait `__traits(classInstanceAlignment)' to provide the required data alignment for classes. - The check for `pragma(crt_constructor)' and `pragma(crt_destructor)' linkage has been relaxed to allow all `void()' signatures. - ImportC parser now recognizes the `typeof(...)' operator. D runtime changes: - Import druntime v2.100.0. Phobos changes: - Import phobos v2.100.0. - To comply with dip1000, `std.socket.Socket` methods now accept only `scope' arrays. - The `fill', `alignSize', `align2', and `align4' methods of `std.outbuffer.OutBuffer' have been extended to allow specifying a custom value when pre-filling or padding the buffer. Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and committed to mainline. Regards, Iain. --- gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd 60bfa0ee7. * dmd/VERSION: Update version to v2.100.0. * d-builtins.cc (d_init_versions): Update for new front-end interface. * d-codegen.cc (d_decl_context): Use resolvedLinkage to get declaration linkage. (build_struct_literal): Track offset in bits. * d-gimplify.cc (d_gimplify_modify_expr): Check both operands for a bit-field reference. * d-lang.cc (d_handle_option): Handle -fpreview=bitfields, remove -frevert=markdown and -ftransition=vmarkdown. (d_post_options): Set flag_rtti and flag_exceptions if -fno-druntime was seen on command-line. (d_parse_file): Update for new front-end interface. (d_type_promotes_to): Use resolvedLinkage to get declaration linkage. * decl.cc (make_thunk): Likewise. * expr.cc (ExprVisitor::visit (CatAssignExp *)): Remove lowering for appending of an element or array to another array. * lang.opt (fpreview=bitfields): New option. (frevert=markdown): Remove. (ftransition=vmarkdown): Remove. * types.cc (layout_aggregate_members): Ignore anonymous fields in total count. libphobos/ChangeLog: * libdruntime/MERGE: Merge upstream druntime 94bd5bcb. * libdruntime/Makefile.am (ALL_DRUNTIME_INSTALL_DSOURCES): Add $(DRUNTIME_DSOURCES_ELF). (ALL_DRUNTIME_SOURCES): Likewise. (DRUNTIME_DSOURCES_ELF): New variable. * libdruntime/Makefile.in: Regenerate. * src/MERGE: Merge upstream phobos 3a1cd9a01. --- 0001-d-Merge-upstream-dmd-60bfa0ee7-druntime-94bd5bcb-pho.patch.xz Description: application/xz
[PATCH] c++: suppress -Waddress warnings with *_cast [PR105569]
dynamic_cast can legally return nullptr, so I don't think it's helpful for -Waddress to warn for if (dynamic_cast(&ref)) // ... More generally, it's likely not useful to warn for the artificial POINTER_PLUS_EXPRs created by build_base_path. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? PR c++/105569 gcc/cp/ChangeLog: * class.cc (build_base_path): Suppress -Waddress warning. gcc/testsuite/ChangeLog: * g++.dg/warn/Waddress-9.C: New test. --- gcc/cp/class.cc| 2 ++ gcc/testsuite/g++.dg/warn/Waddress-9.C | 34 ++ 2 files changed, 36 insertions(+) create mode 100644 gcc/testsuite/g++.dg/warn/Waddress-9.C diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc index 3c195b35396..06167380423 100644 --- a/gcc/cp/class.cc +++ b/gcc/cp/class.cc @@ -518,6 +518,8 @@ build_base_path (enum tree_code code, expr = build1 (NOP_EXPR, ptr_target_type, expr); + suppress_warning (expr, OPT_Waddress); + indout: if (!want_pointer) { diff --git a/gcc/testsuite/g++.dg/warn/Waddress-9.C b/gcc/testsuite/g++.dg/warn/Waddress-9.C new file mode 100644 index 000..2ec41949ccf --- /dev/null +++ b/gcc/testsuite/g++.dg/warn/Waddress-9.C @@ -0,0 +1,34 @@ +// PR c++/105569 +// { dg-do compile { target c++11 } } +// { dg-options -Waddress } + +class A {}; + +class B : public virtual A {}; + +class C : public A {}; + +int main() { +B* object = new B(); +B &ref = *object; + +// -Waddress warns here +bool b = nullptr == dynamic_cast(&ref); // { dg-bogus "-Waddress" } +bool b4 = nullptr == static_cast(&ref); // { dg-bogus "-Waddress" } +if (dynamic_cast(&ref)) // { dg-bogus "-Waddress" } + { + } +if (static_cast(&ref)) // { dg-bogus "-Waddress" } + { + } + +// -Waddress doesn't warn anymore +auto ptr = dynamic_cast(&ref); +bool b2 = ptr == nullptr; + +C* cobject = new C(); +C &cref = *cobject; + +// -Waddress also doesn't warn anymore +bool b3 = nullptr == dynamic_cast(&cref); +} base-commit: 682e587f1021241758f7dfe0b22651008622a312 -- 2.36.1