[PATCH] c: Fix -Wunused-but-set-* warning with _Generic [PR96571]
Hi! The following testcase shows various problems with -Wunused-but-set* warnings and _Generic construct. I think it is best to treat the selector and the ignored expressions as (potentially) read, because when they are parsed, the vars in there are already marked as TREE_USED. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2020-08-14 Jakub Jelinek PR c/96571 * c-parser.c (c_parser_generic_selection): Change match_found from bool to int, holding index of the match. Call mark_exp_read on the selector expression and on expressions other than the selected one. * gcc.dg/Wunused-var-4.c: New test. --- gcc/c/c-parser.c.jj 2020-07-28 15:39:09.672760847 +0200 +++ gcc/c/c-parser.c2020-08-13 15:41:09.476861020 +0200 @@ -8686,7 +8686,7 @@ c_parser_generic_selection (c_parser *pa struct c_expr selector, error_expr; tree selector_type; struct c_generic_association matched_assoc; - bool match_found = false; + int match_found = -1; location_t generic_loc, selector_loc; error_expr.original_code = ERROR_MARK; @@ -8721,6 +8721,7 @@ c_parser_generic_selection (c_parser *pa c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL); return selector; } + mark_exp_read (selector.value); selector_type = TREE_TYPE (selector.value); /* In ISO C terms, rvalues (including the controlling expression of _Generic) do not have qualified types. */ @@ -8820,18 +8821,18 @@ c_parser_generic_selection (c_parser *pa if (assoc.type == NULL_TREE) { - if (!match_found) + if (match_found < 0) { matched_assoc = assoc; - match_found = true; + match_found = associations.length (); } } else if (comptypes (assoc.type, selector_type)) { - if (!match_found || matched_assoc.type == NULL_TREE) + if (match_found < 0 || matched_assoc.type == NULL_TREE) { matched_assoc = assoc; - match_found = true; + match_found = associations.length (); } else { @@ -8849,13 +8850,19 @@ c_parser_generic_selection (c_parser *pa c_parser_consume_token (parser); } + unsigned int ix; + struct c_generic_association *iter; + FOR_EACH_VEC_ELT (associations, ix, iter) +if (ix != match_found) + mark_exp_read (iter->expression.value); + if (!parens.require_close (parser)) { c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL); return error_expr; } - if (!match_found) + if (match_found < 0) { error_at (selector_loc, "%<_Generic%> selector of type %qT is not " "compatible with any association", --- gcc/testsuite/gcc.dg/Wunused-var-4.c.jj 2020-08-13 18:36:35.608060608 +0200 +++ gcc/testsuite/gcc.dg/Wunused-var-4.c2020-08-13 18:36:48.866880017 +0200 @@ -0,0 +1,33 @@ +/* PR c/96571 */ +/* { dg-do compile } */ +/* { dg-options "-std=c99 -O2 -Wunused-but-set-variable" } */ + +enum E { V }; + +int +foo (void) +{ + enum E v;/* { dg-bogus "set but not used" } */ + return _Generic (v, enum E : 0); +} + +int +bar (void) +{ + int a = 0; /* { dg-bogus "set but not used" } */ + return _Generic (0, int : a); +} + +int +baz (void) +{ + int a; /* { dg-bogus "set but not used" } */ + return _Generic (0, long long : a, int : 0); +} + +int +qux (void) +{ + int a; /* { dg-bogus "set but not used" } */ + return _Generic (0, long long : a, default: 0); +} Jakub
Re: [PATCH] C-SKY: Fix assembling error with -mfloat-abi=hard.
Hi Jojo, gcc/ChangeLog: * gcc/config/csky/csky-elf.h (ASM_SPEC): Use mfloat-abi. * gcc/config/csky/csky-linux-elf.h (ASM_SPEC): mfloat-abi. I have committed it to trunk. But there two points you shoud pay attention to, 1. line should start with a tab not spaces 2. The path of changed file should not include the 'gcc' Thanks, Xianmiao
Re: [PATCH] vec: Fix bootstrap on i686-linux, 32-bit darwin and AIX
On August 14, 2020 8:58:55 AM GMT+02:00, Jakub Jelinek wrote: >Hi! > >As mentioned earlier, embedded_size is broken on vecs of long long, >double >etc. on some platforms, which breaks bootstrap. >E.g. on i686-linux, the problem is mostly with older GCC versions being >used >as stage1 compiler (GCC 4.8 to 7.* in particular), because alignas >(long long) >makes U 64-bit aligned, while when long long m_vecdata[1]; is in vec, >it is >only 32-bit aligned. We've tried various ways and the following one >seems >to work, use the old way (offsetof (vec, m_vecdata)) for non-class >types >as well as standard layout class types, i.e. whenever offsetof is >guaranteed >to work, and for others use the new day (in that case we don't run into >problems with long long or other scalar types and for the structure >layout >there is just a struct with a given alignment. > >Bootstrapped/regtested on x86_64-linux and i686-linux, AFAIK Iain and >David >are bootstrapping/regtesting it on 32-bit darwin and AIX and it got >through >the spots where it broke before, ok for trunk? OK. Thanks, Richard. >2020-08-14 Jakub Jelinek > Jonathan Wakely > > * system.h: Include type_traits. > * vec.h (vec::embedded_size): Use offsetof and asserts > on vec_stdlayout, which is conditionally a vec (for standard layout T) > and otherwise vec_embedded. > >--- gcc/system.h.jj2020-07-28 15:39:09.984756558 +0200 >+++ gcc/system.h 2020-08-13 16:33:48.330642026 +0200 >@@ -235,6 +235,7 @@ extern int errno; > # include > # include > # include >+# include > #endif > > /* Some of glibc's string inlines cause warnings. Plus we'd rather >--- gcc/vec.h.jj 2020-08-13 15:41:17.221755055 +0200 >+++ gcc/vec.h 2020-08-13 16:31:42.190367975 +0200 >@@ -1283,9 +1283,11 @@ vec::embedded_size (unsi > { > struct alignas (T) U { char data[sizeof (T)]; }; > typedef vec vec_embedded; >- static_assert (sizeof (vec_embedded) == sizeof(vec), ""); >- static_assert (alignof (vec_embedded) == alignof(vec), ""); >- return offsetof (vec_embedded, m_vecdata) + alloc * sizeof (T); >+ typedef typename std::conditional::value, >+ vec, vec_embedded>::type vec_stdlayout; >+ static_assert (sizeof (vec_stdlayout) == sizeof (vec), ""); >+ static_assert (alignof (vec_stdlayout) == alignof (vec), ""); >+ return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T); > } > > > > Jakub
[PATCH] Add support for putting jump table into relocation read-only section
Hi, This patch adds a section selection for jump tables. The jump tables can be put into read-only data section or relocation read-only data section by the relocation flags. When the PIC flag is set and jump table is non-relative, the jump table is put into relocation read-only section. section * select_jump_table_section (tree decl) { int reloc; bool section_reloc; reloc = (! CASE_VECTOR_PC_RELATIVE && flag_pic && ! targetm.asm_out.generate_pic_addr_diff_vec ()) ? 1 : 0; section_reloc = (reloc & targetm.asm_out.reloc_rw_mask ()); return targetm.asm_out.function_rodata_section (decl, section_reloc); } The attachments are the patch diff file and change log file. Bootstrapped and tested on powerpc64le-linux-gnu with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. * final.c (select_jump_table_section): Implement a function to select the section of jump tables by reloc_rw_mask and other flags. * output.h (default_function_rodata_section, default_no_function_rodata_section): Add the second argument to the declarations. * target.def (function_rodata_section): Change the doc and add the second argument. * doc/tm.texi: Regenerate. * varasm.c (default_function_rodata_section, default_no_function_rodata_section, function_mergeable_rodata_prefix): Add the second argument in default_function_rodata_section. It indicates the section should be read-only or relocation read-only. Add the second argument in default_function_rodata_section. Set the second argument to false when default_function_rodata_section calls function_rodata_section. * config/mips/mips.c (mips_function_rodata_section): Add the second arugment and set it to false when it calls function_rodata_section. * config/s390/s390.c (targetm.asm_out.function_rodata_section): Set the second argument to false. * config/s390/s390.md Likewise.diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 6e7d9dc54a9..3ff7527f2cc 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -7712,6 +7712,15 @@ if function is in @code{.text.name}, and the normal readonly-data section otherwise. @end deftypefn +@deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_RELRO_SECTION (tree @var{decl}) +Return the relro section associated with +@samp{DECL_SECTION_NAME (@var{decl})}. +The default version of this function selects @code{.gnu.linkonce.r.name} if +the function's section is @code{.gnu.linkonce.t.name}, +@code{.data.rel.ro.name} if function is in @code{.text.name}, +and the normal data relro section otherwise. +@end deftypefn + @deftypevr {Target Hook} {const char *} TARGET_ASM_MERGEABLE_RODATA_PREFIX Usually, the compiler uses the prefix @code{".rodata"} to construct section names for mergeable constant data. Define this macro to override diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 3be984bbd5c..0ac1488c837 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -5007,6 +5007,8 @@ it is unlikely to be called. @hook TARGET_ASM_FUNCTION_RODATA_SECTION +@hook TARGET_ASM_FUNCTION_RELRO_SECTION + @hook TARGET_ASM_MERGEABLE_RODATA_PREFIX @hook TARGET_ASM_TM_CLONE_TABLE_SECTION diff --git a/gcc/final.c b/gcc/final.c index a3601964a8d..ba198182663 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -107,6 +107,10 @@ along with GCC; see the file COPYING3. If not see #define JUMP_TABLES_IN_TEXT_SECTION 0 #endif +#ifndef JUMP_TABLES_IN_RELRO +#define JUMP_TABLES_IN_RELRO 0 +#endif + /* Bitflags used by final_scan_insn. */ #define SEEN_NOTE 1 #define SEEN_EMITTED 2 @@ -2154,6 +2158,23 @@ asm_show_source (const char *filename, int linenum) fputc ('\n', asm_out_file); } +/* Select sections for jump table. + If the jump table need to be relocated, + select relro sections. Otherwise in readonly section */ + +section * +select_jump_table_section (tree decl) +{ + int relco; + + relco = JUMP_TABLES_IN_RELRO; + + if (relco & targetm.asm_out.reloc_rw_mask ()) +return targetm.asm_out.function_relro_section (decl); + else +return targetm.asm_out.function_rodata_section (decl); +} + /* The final scan for one insn, INSN. Args are same as in `final', except that INSN is the insn being scanned. @@ -2492,7 +2513,7 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED, { int log_align; - switch_to_section (targetm.asm_out.function_rodata_section + switch_to_section (select_jump_table_section (current_function_decl)); #ifdef ADDR_VEC_ALIGN @@ -2571,7 +2592,7 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED, #endif if (! JUMP_TABLES_IN_TEXT_SECTION) - swit
[PATCH] Add support for putting jump table into relocation read-only section
Hi, This patch adds a section selection for jump tables. The jump tables can be put into read-only data section or relocation read-only data section by the relocation flags. When the PIC flag is set and jump table is non-relative, the jump table is put into relocation read-only section. section * select_jump_table_section (tree decl) { int reloc; bool section_reloc; reloc = (! CASE_VECTOR_PC_RELATIVE && flag_pic && ! targetm.asm_out.generate_pic_addr_diff_vec ()) ? 1 : 0; section_reloc = (reloc & targetm.asm_out.reloc_rw_mask ()); return targetm.asm_out.function_rodata_section (decl, section_reloc); } The attachments are the patch diff file and change log file. Bootstrapped and tested on powerpc64le-linux-gnu with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. * final.c (select_jump_table_section): Implement a function to select the section of jump tables by reloc_rw_mask and other flags. * output.h (default_function_rodata_section, default_no_function_rodata_section): Add the second argument to the declarations. * target.def (function_rodata_section): Change the doc and add the second argument. * doc/tm.texi: Regenerate. * varasm.c (default_function_rodata_section, default_no_function_rodata_section, function_mergeable_rodata_prefix): Add the second argument in default_function_rodata_section. It indicates the section should be read-only or relocation read-only. Add the second argument in default_function_rodata_section. Set the second argument to false when default_function_rodata_section calls function_rodata_section. * config/mips/mips.c (mips_function_rodata_section): Add the second arugment and set it to false when it calls function_rodata_section. * config/s390/s390.c (targetm.asm_out.function_rodata_section): Set the second argument to false. * config/s390/s390.md Likewise.diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 6e7d9dc54a9..3ff7527f2cc 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -7712,6 +7712,15 @@ if function is in @code{.text.name}, and the normal readonly-data section otherwise. @end deftypefn +@deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_RELRO_SECTION (tree @var{decl}) +Return the relro section associated with +@samp{DECL_SECTION_NAME (@var{decl})}. +The default version of this function selects @code{.gnu.linkonce.r.name} if +the function's section is @code{.gnu.linkonce.t.name}, +@code{.data.rel.ro.name} if function is in @code{.text.name}, +and the normal data relro section otherwise. +@end deftypefn + @deftypevr {Target Hook} {const char *} TARGET_ASM_MERGEABLE_RODATA_PREFIX Usually, the compiler uses the prefix @code{".rodata"} to construct section names for mergeable constant data. Define this macro to override diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 3be984bbd5c..0ac1488c837 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -5007,6 +5007,8 @@ it is unlikely to be called. @hook TARGET_ASM_FUNCTION_RODATA_SECTION +@hook TARGET_ASM_FUNCTION_RELRO_SECTION + @hook TARGET_ASM_MERGEABLE_RODATA_PREFIX @hook TARGET_ASM_TM_CLONE_TABLE_SECTION diff --git a/gcc/final.c b/gcc/final.c index a3601964a8d..ba198182663 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -107,6 +107,10 @@ along with GCC; see the file COPYING3. If not see #define JUMP_TABLES_IN_TEXT_SECTION 0 #endif +#ifndef JUMP_TABLES_IN_RELRO +#define JUMP_TABLES_IN_RELRO 0 +#endif + /* Bitflags used by final_scan_insn. */ #define SEEN_NOTE 1 #define SEEN_EMITTED 2 @@ -2154,6 +2158,23 @@ asm_show_source (const char *filename, int linenum) fputc ('\n', asm_out_file); } +/* Select sections for jump table. + If the jump table need to be relocated, + select relro sections. Otherwise in readonly section */ + +section * +select_jump_table_section (tree decl) +{ + int relco; + + relco = JUMP_TABLES_IN_RELRO; + + if (relco & targetm.asm_out.reloc_rw_mask ()) +return targetm.asm_out.function_relro_section (decl); + else +return targetm.asm_out.function_rodata_section (decl); +} + /* The final scan for one insn, INSN. Args are same as in `final', except that INSN is the insn being scanned. @@ -2492,7 +2513,7 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED, { int log_align; - switch_to_section (targetm.asm_out.function_rodata_section + switch_to_section (select_jump_table_section (current_function_decl)); #ifdef ADDR_VEC_ALIGN @@ -2571,7 +2592,7 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED, #endif if (! JUMP_TABLES_IN_TEXT_SECTION) - swit
Re: [committed] analyzer: rewrite of region and value-handling
On 8/13/20 10:58 PM, David Malcolm via Gcc-patches wrote: PR analyzer/93032 (missing leak diagnostic for zlib/contrib/minizip/mztools.c) PR analyzer/93938 (ICE in analyzer) PR analyzer/94011 (ICE in analyzer) PR analyzer/94099 (ICE in analyzer) PR analyzer/94399 (leak false positive with __attribute__((cleanup( PR analyzer/94458 (leak false positive) PR analyzer/94503 (ICE on C++ return-value-optimization) PR analyzer/94640 (leak false positive) PR analyzer/94688 (ICE in analyzer) PR analyzer/94689 ("arrays of functions are not meaningful" error) PR analyzer/94839 (leak false positive) PR analyzer/95026 (leak false positive) PR analyzer/95042 (ICE merging const and non-const C++ object instances) PR analyzer/95240 (leak false positive) Hello David. Unfortunately, this format is not recognized by gcc-changelog script and so the corresponding PR entries were not added to the generated ChangeLog entries. The currently supported regex is: pr_regex = re.compile(r'\tPR (?P[a-z+-]+\/)?([0-9]+)$') which prevents parsing an entries not being standalone. Anyway, I updated gcc/analyzer/ChangeLog manually. Thanks, Martin
Re: [committed] analyzer: rewrite of region and value-handling
On 8/14/20 9:22 AM, Martin Liška wrote: On 8/13/20 10:58 PM, David Malcolm via Gcc-patches wrote: PR analyzer/93032 (missing leak diagnostic for zlib/contrib/minizip/mztools.c) PR analyzer/93938 (ICE in analyzer) PR analyzer/94011 (ICE in analyzer) PR analyzer/94099 (ICE in analyzer) PR analyzer/94399 (leak false positive with __attribute__((cleanup( PR analyzer/94458 (leak false positive) PR analyzer/94503 (ICE on C++ return-value-optimization) PR analyzer/94640 (leak false positive) PR analyzer/94688 (ICE in analyzer) PR analyzer/94689 ("arrays of functions are not meaningful" error) PR analyzer/94839 (leak false positive) PR analyzer/95026 (leak false positive) PR analyzer/95042 (ICE merging const and non-const C++ object instances) PR analyzer/95240 (leak false positive) Hello David. Unfortunately, this format is not recognized by gcc-changelog script and so the corresponding PR entries were not added to the generated ChangeLog entries. The currently supported regex is: pr_regex = re.compile(r'\tPR (?P[a-z+-]+\/)?([0-9]+)$') which prevents parsing an entries not being standalone. Anyway, I updated gcc/analyzer/ChangeLog manually. Thanks, Martin ... and I bet for similar reasons gcc-bugs emails were not send to various PRs mentioned in the commit. Martin
[PATCH, rs6000] Add non-relative jump table support on Power Linux
Hi, This patch adds non-relative jump table support on Power Linux. It implements ASM_OUTPUT_ADDR_VEC_ELT and adds four new expansions for non-relative jump table in rs6000.md. It also defines a rs6000 option(mrelative-jumptables). If it's set to false, the non-relative jump table is picked up even PIC flag is set. The attachments are the patch diff file and change log file. Bootstrapped and tested on powerpc64le-linux-gnu with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. * config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION, JUMP_TABLES_IN_RELRO): Define. * config/rs6000/rs6000-protos.h (rs6000_output_addr_vec_elt): Declare. * config/rs6000/rs6000.c (TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC, rs6000_gen_pic_addr_diff_vec, rs6000_output_addr_vec_elt): TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC define. rs6000_gen_pic_addr_diff_vec, rs6000_output_addr_vec_elt implement. * config/rs6000/rs6000.h (rs6000_relative_jumptables, CASE_VECTOR_PC_RELATIVE, CASE_VECTOR_MODE): Define. * config/rs6000/rs6000.md (nonrelative_tablejumpsi, nonrelative_tablejumpsi_nospec, nonrelative_tablejumpdi, nonrelative_tablejumpdi_nospec): Add four new expansions. * config/rs6000/rs6000.opt (mrelative-jumptables): A new option.diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h index 34776c8421e..626f2201d96 100644 --- a/gcc/config/rs6000/linux64.h +++ b/gcc/config/rs6000/linux64.h @@ -324,7 +324,10 @@ extern int dot_symbols; /* Indicate that jump tables go in the text section. */ #undef JUMP_TABLES_IN_TEXT_SECTION -#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT +#define JUMP_TABLES_IN_TEXT_SECTION rs6000_relative_jumptables + +#undef JUMP_TABLES_IN_RELRO +#define JUMP_TABLES_IN_RELRO !rs6000_relative_jumptables /* The linux ppc64 ABI isn't explicit on whether aggregates smaller than a doubleword should be padded upward or downward. You could diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h index 5508484ba19..820943cdde0 100644 --- a/gcc/config/rs6000/rs6000-protos.h +++ b/gcc/config/rs6000/rs6000-protos.h @@ -155,6 +155,8 @@ extern void rs6000_split_logical (rtx [], enum rtx_code, bool, bool, bool); extern bool rs6000_pcrel_p (struct function *); extern bool rs6000_fndecl_pcrel_p (const_tree); +extern void rs6000_output_addr_vec_elt(FILE *, int); + /* Different PowerPC instruction formats that are used by GCC. There are various other instruction formats used by the PowerPC hardware, but these formats are not currently used by GCC. */ diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 58f5d780603..0a149e3e99c 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -1369,6 +1369,9 @@ static const struct attribute_spec rs6000_attribute_table[] = #undef TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA #define TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA rs6000_output_addr_const_extra +#undef TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC +#define TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC rs6000_gen_pic_addr_diff_vec + #undef TARGET_LEGITIMIZE_ADDRESS #define TARGET_LEGITIMIZE_ADDRESS rs6000_legitimize_address @@ -26494,6 +26497,26 @@ rs6000_cannot_substitute_mem_equiv_p (rtx mem) return false; } +/* Implement TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC. Return true if rs6000_relative_jumptables is set. */ + +static bool +rs6000_gen_pic_addr_diff_vec (void) +{ + return rs6000_relative_jumptables; +} + +void +rs6000_output_addr_vec_elt (FILE *file, int value) +{ + const char *directive = TARGET_64BIT ? DOUBLE_INT_ASM_OP : "\t.long\t"; + char buf[100]; + + fprintf(file, "%s", directive); + ASM_GENERATE_INTERNAL_LABEL (buf, "L", value); + assemble_name (file, buf); + fprintf(file, "\n"); +} + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-rs6000.h" diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h index 1209a33173e..9254911836d 100644 --- a/gcc/config/rs6000/rs6000.h +++ b/gcc/config/rs6000/rs6000.h @@ -1752,15 +1752,25 @@ typedef struct rs6000_args /* #define LEGITIMATE_PIC_OPERAND_P (X) */ -/* Specify the machine mode that this machine uses - for the index in the tablejump instruction. */ -#define CASE_VECTOR_MODE SImode + +/* Enable non-relative jump tables for linux ELF. */ +#ifdef TARGET_ELF +#if !TARGET_AIX +#undef rs6000_relative_jumptables +#define rs6000_relative_jumptables 0 +#endif +#endif /* Define as C expression which evaluates to nonzero if the tablejump instruction expects the table to contain offsets from the address of the table. Do not define this if the table should contain absolute addresses. */ -#define CASE_VECTOR_PC_RELATIVE 1 +#define CASE_VECTOR_PC_RELATIVE 0 + +/* Specify the machine mode that this machine uses + for the index in the tableju
[PATCH 1/4][PR target/88808]Enable bitwise operator for AVX512 masks.
Hi: First, since avx512 masks involve both vector isa and general part, so i add both maintainers to the maillist. I'm doing this in 4 steps: 1 - Add cost model for operation of mask registers. 2 - Introduce new cover class INT_MASK_REGS, this will enable direct move between gpr and mask registers in pass_reload by consideration of cost model, this is similar as INT_SSE_REGS. 3 - Tune cost model. 4 - Enable operator or/xor/and/andn/not for mask register. kxnor is not enabled since there's no corresponding instruction for general registers, 64bit mask op is not enabled for 32bit target. kadd/kshift/ktest are not merged into general versionsadd/ashl/test since i think it would be odd to use mask register for those operations. Bootstrap is ok, regression test is ok for i386/x86-64 result. There's some improvement for performance of SPEC2017 tested on SKL, i observe there're many spills from integer to mask registers instead of memory which is the reason for the improvement. 500.perlbench_r 0.65% 502.gcc_r 0.32% 505.mcf_r -0.75% 520.omnetpp_r 2.15% 523.xalancbmk_r -0.95% 525.x264_r 1.83% 531.deepsjeng_r 0.00% 541.leela_r 0.66% 548.exchange2_r -0.45% 557.xz_r -0.94% INT geomean 0.25% 503.bwaves_r 0.00% 507.cactuBSSN_r 0.78% 508.namd_r 0.34% 510.parest_r 0.16% 511.povray_r 1.37% 519.lbm_r 1.33% 521.wrf_r 0.04% 526.blender_r 0.15% 527.cam4_r 0.69% 538.imagick_r 0.51% 544.nab_r 0.27% 549.fotonik3d_r 2.00% 554.roms_r 4.35% FP geomean 0.99% -- BR, Hongtao From 7a8393426bbd54b0906846a9f29833cf068732d5 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Tue, 3 Sep 2019 14:41:02 -0700 Subject: [PATCH 1/4] x86: Add cost model for operation of mask registers. gcc/ PR target/71453 * config/i386/i386.h (struct processor_costs): Add member mask_to_integer, integer_to_mask, mask_load[3], mask_store[3], mask_move. * config/i386/x86-tune-costs.h (ix86_size_cost, i386_cost, i386_cost, pentium_cost, lakemont_cost, pentiumpro_cost, geode_cost, k6_cost, athlon_cost, k8_cost, amdfam10_cost, bdver_cost, znver1_cost, znver2_cost, skylake_cost, btver1_cost, btver2_cost, pentium4_cost, nocona_cost, atom_cost, slm_cost, intel_cost, generic_cost, core_cost): Initialize mask_load[3], mask_store[3], mask_move, integer_to_mask, mask_to_integer for all target costs. * config/i386/i386.c (ix86_register_move_cost): Using cost model of mask registers. (inline_memory_move_cost): Ditto. (ix86_register_move_cost): Ditto. --- gcc/config/i386/i386.c | 33 +++ gcc/config/i386/i386.h | 7 ++ gcc/config/i386/x86-tune-costs.h | 144 +++ 3 files changed, 184 insertions(+) diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 8ea6a4d7ea7..156df77166b 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -18769,6 +18769,28 @@ inline_memory_move_cost (machine_mode mode, enum reg_class regclass, int in) return in ? ix86_cost->hard_register.sse_load [index] : ix86_cost->hard_register.sse_store [index]; } + if (MASK_CLASS_P (regclass)) +{ + int index; + switch (GET_MODE_SIZE (mode)) + { + case 1: + index = 0; + break; + case 2: + index = 1; + break; + default: + index = 3; + break; + } + + if (in == 2) + return MAX (ix86_cost->hard_register.mask_load[index], + ix86_cost->hard_register.mask_store[index]); + return in ? ix86_cost->hard_register.mask_load[2] + : ix86_cost->hard_register.mask_store[2]; +} if (MMX_CLASS_P (regclass)) { int index; @@ -18894,6 +18916,17 @@ ix86_register_move_cost (machine_mode mode, reg_class_t class1_i, ? ix86_cost->hard_register.sse_to_integer : ix86_cost->hard_register.integer_to_sse); + /* Moves between mask register and GPR. */ + if (MASK_CLASS_P (class1) != MASK_CLASS_P (class2)) +{ + return (MASK_CLASS_P (class1) + ? ix86_cost->hard_register.mask_to_integer + : ix86_cost->hard_register.integer_to_mask); +} + /* Moving between mask registers. */ + if (MASK_CLASS_P (class1) && MASK_CLASS_P (class2)) +return ix86_cost->hard_register.mask_move; + if (MAYBE_FLOAT_CLASS_P (class1)) return ix86_cost->hard_register.fp_move; if (MAYBE_SSE_CLASS_P (class1)) diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 114967e49a3..e0af87450b8 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -279,6 +279,13 @@ struct processor_costs { in SImode, DImode and TImode. */ const int sse_to_integer; /* cost of moving SSE register to integer. */ const int integer_to_sse; /* cost of moving integer register to SSE. */ + const int mask_to_integer; /* cost of moving mask register to integer. */ + const int integer_to_mask; /* cost of moving integer register to mask. */ + const int mask_load[3]; /* cost of loading mask registers + in QImode, HImode and SImode. */ + const int mask_store[3]; /* cost of storing mask register +
[PATCH 2/4][PR target/88808]Enable bitwise operator for AVX512 masks.
Enable direct move between masks and gprs in pass_reload with consideration of cost model. Changelog gcc/ * config/i386/i386.c (inline_secondary_memory_needed): No memory is needed between mask regs and gpr. (ix86_hard_regno_mode_ok): Add condition TARGET_AVX512F for mask regno. * config/i386/i386.h (enum reg_class): Add INT_MASK_REGS. (REG_CLASS_NAMES): Ditto. (REG_CLASS_CONTENTS): Ditto. * config/i386/i386.md: Exclude mask register in define_peephole2 which is available only for gpr. gcc/testsuites/ * gcc.target/i386/pr71453-1.c: New tests. * gcc.target/i386/pr71453-2.c: Ditto. * gcc.target/i386/pr71453-3.c: Ditto. * gcc.target/i386/pr71453-4.c: Ditto. -- BR, Hongtao From 131388217cc5c52947fdad43216f77aa6ff090ab Mon Sep 17 00:00:00 2001 From: liuhongt Date: Thu, 6 Aug 2020 13:48:38 +0800 Subject: [PATCH 2/4] Enable direct movement between gpr and mask registers in pass_reload. Changelog gcc/ * config/i386/i386.c (inline_secondary_memory_needed): No memory is needed between mask regs and gpr. (ix86_hard_regno_mode_ok): Add condition TARGET_AVX512F for mask regno. * config/i386/i386.h (enum reg_class): Add INT_MASK_REGS. (REG_CLASS_NAMES): Ditto. (REG_CLASS_CONTENTS): Ditto. * config/i386/i386.md: Exclude mask register in define_peephole2 which is avaiable only for gpr. gcc/testsuites/ * gcc.target/i386/pr71453-1.c: New tests. * gcc.target/i386/pr71453-2.c: Ditto. * gcc.target/i386/pr71453-3.c: Ditto. * gcc.target/i386/pr71453-4.c: Ditto. Delete mspill2mask option. Fix bugs. --- gcc/config/i386/i386.c| 6 +- gcc/config/i386/i386.h| 3 + gcc/config/i386/i386.md | 3 +- gcc/testsuite/gcc.target/i386/pr71453-1.c | 86 +++ gcc/testsuite/gcc.target/i386/pr71453-2.c | 6 ++ gcc/testsuite/gcc.target/i386/pr71453-3.c | 7 ++ 6 files changed, 106 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr71453-1.c create mode 100644 gcc/testsuite/gcc.target/i386/pr71453-2.c create mode 100644 gcc/testsuite/gcc.target/i386/pr71453-3.c diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 156df77166b..c9bd3ab06a2 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -18571,9 +18571,7 @@ inline_secondary_memory_needed (machine_mode mode, reg_class_t class1, || MAYBE_SSE_CLASS_P (class1) != SSE_CLASS_P (class1) || MAYBE_SSE_CLASS_P (class2) != SSE_CLASS_P (class2) || MAYBE_MMX_CLASS_P (class1) != MMX_CLASS_P (class1) - || MAYBE_MMX_CLASS_P (class2) != MMX_CLASS_P (class2) - || MAYBE_MASK_CLASS_P (class1) != MASK_CLASS_P (class1) - || MAYBE_MASK_CLASS_P (class2) != MASK_CLASS_P (class2)) + || MAYBE_MMX_CLASS_P (class2) != MMX_CLASS_P (class2)) { gcc_assert (!strict || lra_in_progress); return true; @@ -18999,7 +18997,7 @@ ix86_hard_regno_mode_ok (unsigned int regno, machine_mode mode) if ((mode == P2QImode || mode == P2HImode)) return MASK_PAIR_REGNO_P(regno); - return (VALID_MASK_REG_MODE (mode) + return ((TARGET_AVX512F && VALID_MASK_REG_MODE (mode)) || (TARGET_AVX512BW && VALID_MASK_AVX512BW_MODE (mode))); } diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index e0af87450b8..852dd017aa4 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -1418,6 +1418,7 @@ enum reg_class FLOAT_INT_SSE_REGS, MASK_REGS, ALL_MASK_REGS, + INT_MASK_REGS, ALL_REGS, LIM_REG_CLASSES }; @@ -1477,6 +1478,7 @@ enum reg_class "FLOAT_INT_SSE_REGS", \ "MASK_REGS",\ "ALL_MASK_REGS", \ + "INT_MASK_REGS", \ "ALL_REGS" } /* Define which registers fit in which classes. This is an initializer @@ -1515,6 +1517,7 @@ enum reg_class { 0xff9, 0xfff0, 0xf }, /* FLOAT_INT_SSE_REGS */ \ { 0x0,0x0, 0xfe0 }, /* MASK_REGS */ \ { 0x0,0x0, 0xff0 }, /* ALL_MASK_REGS */ \ + { 0x900ff, 0xff0, 0xff0 }, /* INT_MASK_REGS */ \ { 0x, 0x, 0xfff } /* ALL_REGS */ \ } diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index b24a4557871..74d207c3711 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -15051,7 +15051,7 @@ (parallel [(set (reg:CC FLAGS_REG) (unspec:CC [(match_dup 0)] UNSPEC_PARITY)) (clobber (match_dup 0))])] - "" + "!MASK_REGNO_P (REGNO (operands[0]))" [(set (reg:CC FLAGS_REG) (unspec:CC [(match_dup 1)] UNSPEC_PARITY))]) @@ -15072,6 +15072,7 @@ (label_ref (match_operand 5)) (pc)))] "REGNO (operands[2]) == REGNO (operands[3]) + && !MASK_REGNO_P (REGNO (operands[1])) && peep2_reg_dead_p (3, operands[0]) && peep2_reg_dead_p (3, operands[2]) && peep2_regno_dead_p (4, FLAGS_REG)" diff --git a/gcc/testsuite/gcc.target/i386/pr71453-1.c b/gcc/testsuite/gcc.target/i386/pr71453-
[PATCH 3/4][PR target/88808]Enable bitwise operator for AVX512 masks.
1. Set cost of movement inside mask registers a bit higher than gpr's. 2. Set cost of movement between mask register and gpr much higher than movement inside gpr, but still less equal than load/store. 3. Set cost of mask register load/store a bit higher than gpr load/store. -- BR, Hongtao From 5342006fea6b9f2b863590b400e73340c5dff21a Mon Sep 17 00:00:00 2001 From: liuhongt Date: Thu, 24 Oct 2019 11:13:00 +0800 Subject: [PATCH 3/4] According to instruction_tables.pdf 1. Set cost of movement inside mask registers a bit higher than gpr's. 2. Set cost of movement between mask register and gpr much higher than movement inside gpr, but still less equal than load/store. 3. Set cost of mask register load/store a bit higher than gpr load/store. --- gcc/config/i386/x86-tune-costs.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h index 256c84e364e..a782a9dd9e3 100644 --- a/gcc/config/i386/x86-tune-costs.h +++ b/gcc/config/i386/x86-tune-costs.h @@ -1727,12 +1727,12 @@ struct processor_costs skylake_cost = { {8, 8, 8, 12, 24}, /* cost of storing SSE registers in 32,64,128,256 and 512-bit */ 6, 6, /* SSE->integer and integer->SSE moves */ - 2, 2,/* mask->integer and integer->mask moves */ - {4, 4, 4},/* cost of loading mask register + 4, 6,/* mask->integer and integer->mask moves */ + {6, 6, 6},/* cost of loading mask register in QImode, HImode, SImode. */ - {6, 6, 6},/* cost if storing mask register + {8, 8, 8},/* cost if storing mask register in QImode, HImode, SImode. */ - 2, /* cost of moving mask register. */ + 3, /* cost of moving mask register. */ /* End of register allocator costs. */ }, -- 2.18.1
[PATCH 4/4][PR target/88808]Enable bitwise operator for AVX512 masks.
Enable operator or/xor/and/andn/not for mask register, kxnor is not enabled since there's no corresponding instruction for general registers. gcc/ PR target/88808 * config/i386/i386.md: (*movsi_internal): Adjust constraints for mask registers. (*movhi_internal): Ditto. (*movqi_internal): Ditto. (*anddi_1): Support mask register operations (*and_1): Ditto. (*andqi_1): Ditto. (*andn_1): Ditto. (*_1): Ditto. (*qi_1): Ditto. (*one_cmpl2_1): Ditto. (*one_cmplsi2_1_zext): Ditto. (*one_cmplqi2_1): Ditto. gcc/testsuite/ * gcc.target/i386/bitwise_mask_op-1.c: New test. * gcc.target/i386/bitwise_mask_op-2.c: New test. * gcc.target/i386/avx512bw-kunpckwd-1.c: Adjust testcase. * gcc.target/i386/avx512bw-kunpckwd-3.c: Ditto. * gcc.target/i386/avx512dq-kmovb-5.c: Ditto. * gcc.target/i386/avx512f-kmovw-5.c: Ditto. -- BR, Hongtao From df816952e6e76e3dccd53b6384075c41eed1a0f9 Mon Sep 17 00:00:00 2001 From: liuhongt Date: Thu, 13 Aug 2020 14:20:43 +0800 Subject: [PATCH 4/4] Enable bitwise operation for type mask. Enable operator or/xor/and/andn/not for mask register, kxnor is not enabled since there's no corresponding instruction for general registers. gcc/ PR target/88808 * config/i386/i386.md: (*movsi_internal): Adjust constraints for mask registers. (*movhi_internal): Ditto. (*movqi_internal): Ditto. (*anddi_1): Support mask register operations (*and_1): Ditto. (*andqi_1): Ditto. (*andn_1): Ditto. (*_1): Ditto. (*qi_1): Ditto. (*one_cmpl2_1): Ditto. (*one_cmplsi2_1_zext): Ditto. (*one_cmplqi2_1): Ditto. gcc/testsuite/ * gcc.target/i386/bitwise_mask_op-1.c: New test. * gcc.target/i386/bitwise_mask_op-2.c: New test. * gcc.target/i386/avx512bw-kunpckwd-1.c: Adjust testcase. * gcc.target/i386/avx512bw-kunpckwd-3.c: Ditto. * gcc.target/i386/avx512dq-kmovb-5.c: Ditto. * gcc.target/i386/avx512f-kmovw-5.c: Ditto. --- gcc/config/i386/i386.md | 262 +- .../gcc.target/i386/avx512bw-kunpckwd-1.c | 2 +- .../gcc.target/i386/avx512bw-kunpckwd-3.c | 2 +- .../gcc.target/i386/avx512dq-kmovb-5.c| 2 +- .../gcc.target/i386/avx512f-kmovw-5.c | 2 +- .../gcc.target/i386/bitwise_mask_op-1.c | 177 .../gcc.target/i386/bitwise_mask_op-2.c | 7 + 7 files changed, 377 insertions(+), 77 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/bitwise_mask_op-1.c create mode 100644 gcc/testsuite/gcc.target/i386/bitwise_mask_op-2.c diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 74d207c3711..e8ad79d1b0a 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2294,7 +2294,7 @@ (define_insn "*movsi_internal" [(set (match_operand:SI 0 "nonimmediate_operand" -"=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*rm,*k") +"=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*rm,k") (match_operand:SI 1 "general_operand" "g ,re,C ,*y,m ,*y,*y,r ,C ,*v,m ,*v,*v,r ,*r,*km,*k ,CBC"))] "!(MEM_P (operands[0]) && MEM_P (operands[1]))" @@ -2403,8 +2403,8 @@ (symbol_ref "true")))]) (define_insn "*movhi_internal" - [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r ,r ,m ,k,k ,r,m,k") - (match_operand:HI 1 "general_operand" "r ,rn,rm,rn,r,km,k,k,CBC"))] + [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r ,r ,m ,*k,*k ,*r,*m,k") + (match_operand:HI 1 "general_operand" "r ,rn,rm,rn,*r,*km,*k,*k,CBC"))] "!(MEM_P (operands[0]) && MEM_P (operands[1]))" { switch (get_attr_type (insn)) @@ -2491,9 +2491,9 @@ (define_insn "*movqi_internal" [(set (match_operand:QI 0 "nonimmediate_operand" - "=Q,R,r,q,q,r,r ,?r,m ,k,k,r,m,k,k,k") + "=Q,R,r,q,q,r,r ,?r,m ,*k,*k,*r,*m,*k,k,k") (match_operand:QI 1 "general_operand" - "Q ,R,r,n,m,q,rn, m,qn,r,k,k,k,m,C,BC"))] + "Q ,R,r,n,m,q,rn, m,qn,*r,*k,*k,*k,*m,C,BC"))] "!(MEM_P (operands[0]) && MEM_P (operands[1]))" { char buf[128]; @@ -9044,19 +9044,21 @@ }) (define_insn "*anddi_1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r") + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,k") (and:DI - (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,qm") - (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,L"))) + (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,qm,k") + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,m,L,k"))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands)" "@ and{l}\t{%k2, %k0|%k0, %k2} and{q}\t{%2, %0|%0, %2} and{q}\t{%2, %0|%0, %2} - #" - [(set_attr "type" "alu,alu,alu,imovx") - (set_attr "length_immediate" "*,*,*,0") + and{q}\t{%2, %0|%0, %2} + kandq\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "x64,x64,x64,x64,avx512bw") + (set_attr "type" "alu,alu,alu,imovx,msklog") + (set_a
Re: Backporting streaming and enum changes
Hi, On Thu, 6 Aug 2020 at 16:39, Richard Biener wrote: > > On Thu, 6 Aug 2020, Jan Hubicka wrote: > > > Hello, > > as discussed some time ago, I would like to discuss possibility to > > backport the straming and enum improvements. The motivation is that > > this brings quite noticeable improvements to builds of very large > > projects where we currently have nonlinearity problem with anonymous > > namespaces (which is solved by first set of patches) and also there is > > quite noticeable overhead of streaming of enums that I noticed too late > > for gcc 10.1. This is the second combine dpatch. > > > > There is also noticeable reduction of .o files (especially before > > compression as hit to WPA->ltrans streaming) and some memory use > > benefits. > > > > This is an optional thing to do, but I believe it may be helpful for > > distro builds and those using LTO for large projects. > > > > For firefox the reduction in global stream (that is slowest part of WPA) > > is from 25678391 tree bodies to 20821629, 11160520 SCC hash collisions > > to 6002. 392382523 overal section size to 287891470 (both is > > compressed). > > > > For Firefox streaming is under control, but other projects like Chromium > > hits bigger issues. The reason is that Firefox has "unified build" that > > #includes multiple cpp sources to one, so it consists of only about 8k > > source files, while chromium is over 25k and it was tested on project > > with over 250k sources. More smaller sources one gets, the more > > noticeable bottleneck streaming become. > > > > The patches are not completely trivial, but they affect code that is > > heavily executed during streaming and was in mainline for several > > months, so I hope they are safe. > > So we've built the core of openSUSE (~3000 packages) on x86_64 > and i586 with these backported and sofar found no issues. > > I'm fine with backporting but I'll give Jakub the chance to > object. > > Honza - please make sure to bump the LTO stream version minor > together with the streaming change (I think the enum change > doesn't require bumping). > Since this was backported as r10-8623-g0d96c3424bbb5e5f994b78c8f65d8704d215be54, I've noticed ICEs on arm and aarch64: gcc.dg/pr34457-1.c (internal compiler error) gcc.dg/torture/pr92088-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error) gcc.dg/torture/pr92088-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (internal compiler error) I can see: Excess errors: during IPA pass: cp lto1: internal compiler error: in operator[], at vec.h:878 0xa0a5d7 vec::operator[](unsigned int) /gcc/vec.h:878 0xa0a5d7 vec::operator[](unsigned int) /gcc/vec.h:1444 0xa194d3 vec::operator[](unsigned int) /gcc/tree.h:3408 0xa194d3 lto_symtab_encoder_deref /gcc/lto-streamer.h:1173 0xa194d3 ipa_prop_read_section /gcc/ipa-prop.c:5060 0xa194d3 ipa_prop_read_jump_functions() /gcc/ipa-prop.c:5089 0xb6ba71 ipa_read_summaries_1 /gcc/passes.c:2837 0x6bc4b5 read_cgraph_and_symbols(unsigned int, char const**) /gcc/lto/lto-common.c:2921 0x69deb2 lto_main() /gcc/lto/lto.c:625 The tests pass on trunk. Christophe > Thanks, > Richard. > > > Honza > > > > -- > Richard Biener > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
Re: [committed] analyzer: rewrite of region and value-handling
Hi David, On Thu, 13 Aug 2020 at 22:58, David Malcolm via Gcc-patches wrote: > > This large patch reimplements how the analyzer tracks regions and > values. > > Elimination of region_id and svalue_id > ** > > The patch eliminates region_id and svalue_id in favor of simply > using pointers. I'd hoped that the ID classes would make it easier > to compare states, avoiding having to compare long hexadecimal addresses > in favor of small integers. Unfortunately it added lots of complexity, > with the need to remap IDs when comparing or purging states, and the > need to "canonicalize" when comparing states. > > Various "state explosion" bugs in the old implementation were due to > failures in canonicalization, where two states that ought to be equal > were non-equal due to differences in ID ordering. I spent a lot of > time trying to fix canonicalization bugs, and there always seemed to > be one more bug. By eliminating IDs in this new implementation, lots > of tricky canonicalization goes away and no ID remapping should be > needed; almost all of the old validation code becomes redundant. > There's still some canonicalization in the new implementation, mostly > in constraint_manager, but much less than before. > > Ownership of regions and svalues > > > In the old implementation, each region_model had its own copies of > regions and svalues, so there was heap bloat and churn as lots of > little objects were cloned when copying program_state instances. In the > new implementation the regions and svalues are immutable and are shared > thoughout the analysis, rather than being per region_model. They are > owned by a manager class, and are effectively singletons. Region and > svalue instances can now be compared by pointer rather than by comparing > their fields (the manager class takes care of uniqueness). > > This is a huge simplification, and (I hope) will avoid lots > of heap churn as states are copied; all mutable state from regions and > svalues is now stored in a "store" class in the region_model. > > Changes to the meaning of a "region" > > > Region subclasses no longer represent internal structure, but instead > represent how the regions are reached. So e.g. a global "struct coord > c;" is now a decl_region, rather than a struct_region. > > In the old implementation, the values for each region were stored in the > region instances, but in the new implementation the regions are immutable. > Memory is now modeled in a new "store" class: a mapping from keys to > svalues, where the keys are both concrete bit-offsets from the start of > a "base region", and "symbolic" keys (thus hopefully making unions, > casts, aliasing etc easier to deal with). So e.g. for assignments to > the fields of a struct, it records the mapping from bit-offsets of e.g. > field to the values; if that memory is cast to another type and written > to, the appropriate clobbering of the bound values can happen. > > The concept of "what the current stack is" moves from the regions to > being a field within the region_model ("m_current_frame"). > > Bugs fixed by this patch > > > PR analyzer/93032 (missing leak diagnostic for zlib/contrib/minizip/mztools.c) > PR analyzer/93938 (ICE in analyzer) > PR analyzer/94011 (ICE in analyzer) > PR analyzer/94099 (ICE in analyzer) > PR analyzer/94399 (leak false positive with __attribute__((cleanup( > PR analyzer/94458 (leak false positive) > PR analyzer/94503 (ICE on C++ return-value-optimization) > PR analyzer/94640 (leak false positive) > PR analyzer/94688 (ICE in analyzer) > PR analyzer/94689 ("arrays of functions are not meaningful" error) > PR analyzer/94839 (leak false positive) > PR analyzer/95026 (leak false positive) > PR analyzer/95042 (ICE merging const and non-const C++ object instances) > PR analyzer/95240 (leak false positive) > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. > Pushed to master as 808f4dfeb3a95f50f15e71148e5c1067f90a126d. > Some of the new tests fail on arm and aarch64. On arm: gcc.dg/analyzer/casts-1.c (test for warnings, line 19) gcc.dg/analyzer/casts-1.c (test for warnings, line 20) gcc.dg/analyzer/casts-1.c (test for warnings, line 21) gcc.dg/analyzer/casts-1.c (test for warnings, line 22) gcc.dg/analyzer/casts-1.c (test for warnings, line 23) gcc.dg/analyzer/casts-1.c (test for warnings, line 24) gcc.dg/analyzer/casts-1.c (test for warnings, line 25) gcc.dg/analyzer/casts-1.c (test for warnings, line 26) gcc.dg/analyzer/casts-1.c (test for warnings, line 36) gcc.dg/analyzer/casts-1.c (test for warnings, line 37) gcc.dg/analyzer/casts-1.c (test for warnings, line 38) gcc.dg/analyzer/casts-1.c (test for warnings, line 39) gcc.dg/analyzer/casts-1.c (test for excess errors) gcc.dg/analyzer/init.c (test for warnings, line 100) gcc.dg/analyz
Re: [PATCH] ipa-inline: Improve growth accumulation for recursive calls
Hi, On 2020/8/13 20:52, Jan Hubicka wrote: >> Since there are no other callers outside of these specialized nodes, the >> guessed profile count should be same equal? Perf tool shows that even >> each specialized node is called only once, none of them take same time for >> each call: >> >>40.65% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __brute_force_MOD_digits_2.constprop.4 >>16.31% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __brute_force_MOD_digits_2.constprop.3 >>10.91% exchange2_gcc.o libgfortran.so.5.0.0 [.] >> _gfortran_mminloc0_4_i4 >> 5.41% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __brute_force_MOD_digits_2.constprop.6 >> 4.68% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __logic_MOD_new_solver >> 3.76% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __brute_force_MOD_digits_2.constprop.5 >> 1.07% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __brute_force_MOD_digits_2.constprop.7 >> 0.84% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __brute_force_MOD_brute.constprop.0 >> 0.47% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __brute_force_MOD_digits_2.constprop.2 >> 0.24% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __brute_force_MOD_digits_2.constprop.1 >> 0.24% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __brute_force_MOD_covered.constprop.0 >> 0.11% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __brute_force_MOD_reflected.constprop.0 >> 0.00% exchange2_gcc.o exchange2_gcc.orig.slow [.] >> __brute_force_MOD_brute.constprop.1 >> >> >> digits_2.constprop.4 & digits_2.constprop.3 takes most of the execution time, >> So profile count and frequency seem not very helpful for this case? > Yep, you can not really determine the time spent on each of recursion > levels from the recursive edge probability since you can not assume it > to be the same on each level of recursion (here we know it is 0 at level > 10 and it is slowly dropping as the recursion increases because there > are fewer posiblities to fill up the partly filled sudoku:). > Especially you can not assume it after the ipa-cp did the work to figure > out that there is recursion depth counter that affect the function. > > Thinking of the ipa-cp profile updating changes, I did not consider safe > the approach of adding extra weight to the recursive call. The problem > is that the recursive call frequency estimate, when comming from static > profile stimation, is likely completely off, just like static profile > estimator can not be used to predict realistically the number of > iterations of loops. However even with profile feedback this is hard to > interpret. > > I was wondering if we can not simply detect this scenario and avoid > using the recursive edge probability in the profile updating. > We could simply scale by the number of copies. > This means that if we produce 10 clones, we could just set each clone to > 1/10th of the frequency. This implies that the hottest spot will not be > underestimated expontentially much as can happen now, but just 10 times > at worst, wich is probably still OK. We play similar games in loop > optimizers: static profile estimators expect every loop to trip around 3 > times so unroller/vectorizer/etc would make no sense on them. > > By scaling down according to number of copies we will keep the > frequencies of calls to function called from clones "sane". We will > still run into problems when we inline one clone to another (sine its > proflie will get scaled by the recursive edge frequency), but perhaps > that can be special cases in profiler or make ipa-cp to adjust the > recursive edges to get frequency close to 1 as a hack. > > This is not pretty, but at least it looks safer to me than the original > profile updating patch adding extra weight to the frequency. Really appreciate for your detailed explanation. BTW, My previous patch for PGO build on exchange2 takes this similar method by setting each cloned node to 1/10th of the frequency several month agao :) https://gcc.gnu.org/pipermail/gcc-patches/2020-June/546926.html Xionghu
Re: Backporting streaming and enum changes
Hi, > > Since this was backported as > r10-8623-g0d96c3424bbb5e5f994b78c8f65d8704d215be54, Yes, after discussion with Jakub on IRC. > I've noticed ICEs on arm and aarch64: > gcc.dg/pr34457-1.c (internal compiler error) > gcc.dg/torture/pr92088-1.c -O2 -flto -fno-use-linker-plugin > -flto-partition=none (internal compiler error) > gcc.dg/torture/pr92088-1.c -O2 -flto -fuse-linker-plugin > -fno-fat-lto-objects (internal compiler error) > > I can see: > Excess errors: > during IPA pass: cp > lto1: internal compiler error: in operator[], at vec.h:878 > 0xa0a5d7 vec::operator[](unsigned int) > /gcc/vec.h:878 > 0xa0a5d7 vec::operator[](unsigned int) > /gcc/vec.h:1444 > 0xa194d3 vec::operator[](unsigned int) > /gcc/tree.h:3408 > 0xa194d3 lto_symtab_encoder_deref > /gcc/lto-streamer.h:1173 > 0xa194d3 ipa_prop_read_section > /gcc/ipa-prop.c:5060 > 0xa194d3 ipa_prop_read_jump_functions() > /gcc/ipa-prop.c:5089 > 0xb6ba71 ipa_read_summaries_1 > /gcc/passes.c:2837 > 0x6bc4b5 read_cgraph_and_symbols(unsigned int, char const**) > /gcc/lto/lto-common.c:2921 > 0x69deb2 lto_main() > /gcc/lto/lto.c:625 > > The tests pass on trunk. I will check this out. I seem to remember that we fixed the issue on mainline. It was extra zero byte streaming, I will dig it out. Honza > > Christophe > > Thanks, > > Richard. > > > > > Honza > > > > > > > -- > > Richard Biener > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
Re: Backporting streaming and enum changes
> Hi, > > > > Since this was backported as > > r10-8623-g0d96c3424bbb5e5f994b78c8f65d8704d215be54, > > Yes, after discussion with Jakub on IRC. > > I've noticed ICEs on arm and aarch64: > > gcc.dg/pr34457-1.c (internal compiler error) > > gcc.dg/torture/pr92088-1.c -O2 -flto -fno-use-linker-plugin > > -flto-partition=none (internal compiler error) > > gcc.dg/torture/pr92088-1.c -O2 -flto -fuse-linker-plugin > > -fno-fat-lto-objects (internal compiler error) > > > > I can see: > > Excess errors: > > during IPA pass: cp > > lto1: internal compiler error: in operator[], at vec.h:878 > > 0xa0a5d7 vec::operator[](unsigned int) > > /gcc/vec.h:878 > > 0xa0a5d7 vec::operator[](unsigned int) > > /gcc/vec.h:1444 > > 0xa194d3 vec::operator[](unsigned int) > > /gcc/tree.h:3408 > > 0xa194d3 lto_symtab_encoder_deref > > /gcc/lto-streamer.h:1173 > > 0xa194d3 ipa_prop_read_section > > /gcc/ipa-prop.c:5060 > > 0xa194d3 ipa_prop_read_jump_functions() > > /gcc/ipa-prop.c:5089 > > 0xb6ba71 ipa_read_summaries_1 > > /gcc/passes.c:2837 > > 0x6bc4b5 read_cgraph_and_symbols(unsigned int, char const**) > > /gcc/lto/lto-common.c:2921 > > 0x69deb2 lto_main() > > /gcc/lto/lto.c:625 > > > > The tests pass on trunk. > > I will check this out. I seem to remember that we fixed the issue on > mainline. It was extra zero byte streaming, I will dig it out. Found it, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95362 I will backport the patch shortly. Sorry for the breakage - it is quite impressive that we do not have non-trivial SCC streamed in the distro bootstrap. Honza > > Honza > > > > Christophe > > > Thanks, > > > Richard. > > > > > > > Honza > > > > > > > > > > -- > > > Richard Biener > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
Re: [PATCH] C-SKY: Fix assembling error with -mfloat-abi=hard.
Hi, Ok & thanks. Jojo 在 2020年8月14日 +0800 PM3:02,Xianmiao Qu ,写道: > Hi Jojo, > > > gcc/ChangeLog: > > * gcc/config/csky/csky-elf.h (ASM_SPEC): Use mfloat-abi. > > * gcc/config/csky/csky-linux-elf.h (ASM_SPEC): mfloat-abi. > > > I have committed it to trunk. But there two points you shoud pay > attention to, > > 1. line should start with a tab not spaces > > 2. The path of changed file should not include the 'gcc' > > > Thanks, > > Xianmiao
Re: Backporting streaming and enum changes
On Fri, 14 Aug 2020 at 11:21, Jan Hubicka wrote: > > > Hi, > > > > > > Since this was backported as > > > r10-8623-g0d96c3424bbb5e5f994b78c8f65d8704d215be54, > > > > Yes, after discussion with Jakub on IRC. > > > I've noticed ICEs on arm and aarch64: > > > gcc.dg/pr34457-1.c (internal compiler error) > > > gcc.dg/torture/pr92088-1.c -O2 -flto -fno-use-linker-plugin > > > -flto-partition=none (internal compiler error) > > > gcc.dg/torture/pr92088-1.c -O2 -flto -fuse-linker-plugin > > > -fno-fat-lto-objects (internal compiler error) > > > > > > I can see: > > > Excess errors: > > > during IPA pass: cp > > > lto1: internal compiler error: in operator[], at vec.h:878 > > > 0xa0a5d7 vec::operator[](unsigned > > > int) > > > /gcc/vec.h:878 > > > 0xa0a5d7 vec::operator[](unsigned int) > > > /gcc/vec.h:1444 > > > 0xa194d3 vec::operator[](unsigned int) > > > /gcc/tree.h:3408 > > > 0xa194d3 lto_symtab_encoder_deref > > > /gcc/lto-streamer.h:1173 > > > 0xa194d3 ipa_prop_read_section > > > /gcc/ipa-prop.c:5060 > > > 0xa194d3 ipa_prop_read_jump_functions() > > > /gcc/ipa-prop.c:5089 > > > 0xb6ba71 ipa_read_summaries_1 > > > /gcc/passes.c:2837 > > > 0x6bc4b5 read_cgraph_and_symbols(unsigned int, char const**) > > > /gcc/lto/lto-common.c:2921 > > > 0x69deb2 lto_main() > > > /gcc/lto/lto.c:625 > > > > > > The tests pass on trunk. > > > > I will check this out. I seem to remember that we fixed the issue on > > mainline. It was extra zero byte streaming, I will dig it out. > Found it, > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95362 > I will backport the patch shortly. Sorry for the breakage - it is quite > impressive that we do not have non-trivial SCC streamed in the distro > bootstrap. > Indeed! Thanks for the quick fix. > Honza > > > > Honza > > > > > > Christophe > > > > Thanks, > > > > Richard. > > > > > > > > > Honza > > > > > > > > > > > > > -- > > > > Richard Biener > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
[PING][PATCH] PR target/96347: non-delegitimized UNSPEC UNSPEC_TP (19) found in variable location
Ping. Though I wonder if there's any point adding a check at all over just swapping the order that mem_loc_descriptor and tls_mem_loc_descriptor are called in. Iain. On 07/08/2020 13:33, Iain Buclaw wrote: > Hi, > > On x86, a memory reference reference to a TLS address can be broken out > and stored in a register, for instance: > > movq %fs:8+testYearsBC@tpoff, %rdx > > Subsequently becomes: > > pushq %rbp > leaq 8+testYearsBC@tpoff, %rbp > // later... > movq %fs:0(%rbp), %rdx > > When it comes to representing this in Dwarf, mem_loc_descriptor is left > with the following RTL, which ix86_delegitimize_tls_address is unable to > handle as the symbol_ref has been swapped out with the temporary. > > (plus:DI (unspec:DI [ > (const_int 0 [0]) > ] UNSPEC_TP) > (reg/f:DI 6 bp [90])) > > The UNSPEC_TP expression is checked, ix86_const_not_ok_for_debug_p > returns false as it only lets through UNSPEC_GOTOFF, and finally > const_ok_for_output is either not smart enough or does not have the > information needed to recognize that UNSPEC_TP is a TLS UNSPEC that > should be ignored. This results in informational warnings being fired > under -fchecking. > > The entrypoint that triggers this warning to occur is that MEM codes are > first lowered with mem_loc_descriptor, with tls_mem_loc_descriptor only > being used if that failed. This patch switches that around so that TLS > memory expressions first call tls_mem_loc_descriptor, and only use > mem_loc_descriptor if that fails (which may result in UNSPEC warnings). > > Bootstrapped and regression tested on x86_64-linux-gnu, I do also have a > sparc64-linux-gnu build at hand, but have not yet got the results to > check for a before/after comparison. > > OK for mainline? > > Regards > Iain > > --- > gcc/ChangeLog: > > PR target/96347 > * dwarf2out.c (is_tls_mem_loc): New. > (mem_loc_descriptor): Call tls_mem_loc_descriptor on TLS memory > expressions, fallback to mem_loc_descriptor if unsuccessful. > (loc_descriptor): Likewise. > > gcc/testsuite/ChangeLog: > > PR target/96347 > * g++.dg/other/pr96347.C: New test. > --- > gcc/dwarf2out.c | 30 ++ > gcc/testsuite/g++.dg/other/pr96347.C | 61 > 2 files changed, 84 insertions(+), 7 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/other/pr96347.C > > diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c > index 9deca031fc2..093ceb23c7a 100644 > --- a/gcc/dwarf2out.c > +++ b/gcc/dwarf2out.c > @@ -14435,6 +14435,20 @@ is_based_loc (const_rtx rtl) > && CONST_INT_P (XEXP (rtl, 1); > } > > +/* Return true if this MEM expression is for a TLS variable. */ > + > +static bool > +is_tls_mem_loc (rtx mem) > +{ > + tree base; > + > + if (MEM_EXPR (mem) == NULL_TREE || !MEM_OFFSET_KNOWN_P (mem)) > +return false; > + > + base = get_base_address (MEM_EXPR (mem)); > + return (base && VAR_P (base) && DECL_THREAD_LOCAL_P (base)); > +} > + > /* Try to handle TLS MEMs, for which mem_loc_descriptor on XEXP (mem, 0) > failed. */ > > @@ -15671,11 +15685,12 @@ mem_loc_descriptor (rtx rtl, machine_mode mode, > return mem_loc_result; > } >} > - mem_loc_result = mem_loc_descriptor (XEXP (rtl, 0), > -get_address_mode (rtl), mode, > -VAR_INIT_STATUS_INITIALIZED); > - if (mem_loc_result == NULL) > + if (is_tls_mem_loc (rtl)) > mem_loc_result = tls_mem_loc_descriptor (rtl); > + if (mem_loc_result == NULL) > + mem_loc_result = mem_loc_descriptor (XEXP (rtl, 0), > + get_address_mode (rtl), mode, > + VAR_INIT_STATUS_INITIALIZED); >if (mem_loc_result != NULL) > { > if (!is_a (mode, &int_mode) > @@ -16631,10 +16646,11 @@ loc_descriptor (rtx rtl, machine_mode mode, >break; > > case MEM: > - loc_result = mem_loc_descriptor (XEXP (rtl, 0), get_address_mode (rtl), > -GET_MODE (rtl), initialized); > - if (loc_result == NULL) > + if (is_tls_mem_loc (rtl)) > loc_result = tls_mem_loc_descriptor (rtl); > + if (loc_result == NULL) > + loc_result = mem_loc_descriptor (XEXP (rtl, 0), get_address_mode (rtl), > + GET_MODE (rtl), initialized); >if (loc_result == NULL) > { > rtx new_rtl = avoid_constant_pool_reference (rtl); > diff --git a/gcc/testsuite/g++.dg/other/pr96347.C > b/gcc/testsuite/g++.dg/other/pr96347.C > new file mode 100644 > index 000..414d10c73de > --- /dev/null > +++ b/gcc/testsuite/g++.dg/other/pr96347.C > @@ -0,0 +1,61 @@ > +/* { dg-do compile { target c++11 } } */ > +/* { dg-require-effective-target tls } */ > +/* { dg-options "-O2 -g -fcheckin
[PING][PATCH] libiberty: Add support for 'in' and 'in ref' storage classes.
Ping. Though I will hold back on supporting 'in ref' until it has been formalized. Current discussions are around what value might it have over 'ref const scope' (so far, none). Iain. On 07/08/2020 13:32, Iain Buclaw wrote: > Hi, > > This patch adds support for 'in' as a first-class storage class with its > own mangle symbol, of which also permits 'in ref'. Previously, 'in' was > an alias to 'const [scope]', which is a type constructor. > > The mangle symbol repurposed for this is 'I', which was originally used > by identifier types. However, while TypeIdentifier is part of the > grammar, it must be resolved to some other entity during the semantic > passes, and so shouldn't appear anywhere in the mangled name. > > OK for mainline? > > Regards > Iain. > > --- > libiberty/ChangeLog: > > * d-demangle.c (dlang_function_args): Handle 'in' and 'in ref' > parameter storage classes. > (dlang_type): Remove identifier type. > * testsuite/d-demangle-expected: Update tests. > --- > libiberty/d-demangle.c | 10 +- > libiberty/testsuite/d-demangle-expected | 16 > 2 files changed, 17 insertions(+), 9 deletions(-) > > diff --git a/libiberty/d-demangle.c b/libiberty/d-demangle.c > index f2d6946ecad..2e52eff7617 100644 > --- a/libiberty/d-demangle.c > +++ b/libiberty/d-demangle.c > @@ -699,6 +699,15 @@ dlang_function_args (string *decl, const char *mangled, > struct dlang_info *info) > >switch (*mangled) > { > + case 'I': /* in(T) */ > + mangled++; > + string_append (decl, "in "); > + if (*mangled == 'K') /* in ref(T) */ > + { > + mangled++; > + string_append (decl, "ref "); > + } > + break; > case 'J': /* out(T) */ > mangled++; > string_append (decl, "out "); > @@ -826,7 +835,6 @@ dlang_type (string *decl, const char *mangled, struct > dlang_info *info) >mangled = dlang_function_type (decl, mangled, info); >string_append (decl, "function"); >return mangled; > -case 'I': /* ident T */ > case 'C': /* class T */ > case 'S': /* struct T */ > case 'E': /* enum T */ > diff --git a/libiberty/testsuite/d-demangle-expected > b/libiberty/testsuite/d-demangle-expected > index e3f32e31d7b..ffb317f23d4 100644 > --- a/libiberty/testsuite/d-demangle-expected > +++ b/libiberty/testsuite/d-demangle-expected > @@ -274,14 +274,6 @@ _D8demangle4testFNhG4dZv > demangle.test(__vector(double[4])) > # > --format=dlang > -_D8demangle4testFI5identZv > -demangle.test(ident) > -# > ---format=dlang > -_D8demangle4testFI5ident4testZv > -demangle.test(ident.test) > -# > ---format=dlang > _D8demangle4testFC5classZv > demangle.test(class) > # > @@ -314,6 +306,14 @@ _D8demangle4testFT7typedef4testZv > demangle.test(typedef.test) > # > --format=dlang > +_D8demangle4testFIaZv > +demangle.test(in char) > +# > +--format=dlang > +_D8demangle4testFIKaZv > +demangle.test(in ref char) > +# > +--format=dlang > _D8demangle4testFJaZv > demangle.test(out char) > # >
Re: [PATCH]Don't use pinsr for struct initialization.
On Fri, Aug 14, 2020 at 8:03 AM Hongtao Liu wrote: > > Hi: > For struct initialization, when it fits in a TImode, gcc will use > pinsr insn which causes poor codegen described in PR93897 and PR96562. You should probably remove TImode handling also from ix86_expand_pextr. Uros. > Bootstrap is ok, regression test is ok for i386/x86-64 backend. > Ok for trunk? > > ChangeLog > gcc/ > PR target/96562 > PR target/93897 > * config/i386/i386-expand.c (ix86_expand_pinsr): Don't use > pinsr for TImode. > > gcc/testsuite/ > * gcc.target/i386/pr96562-1.c: New test. > > -- > BR, > Hongtao
Re: [PATCH] [PR target/96350]Force ENDBR immediate into memory to avoid fake ENDBR opcode.
On Fri, Aug 14, 2020 at 6:54 AM Hongtao Liu wrote: > > On Tue, Aug 11, 2020 at 5:56 PM Uros Bizjak wrote: > > > > On Tue, Aug 11, 2020 at 11:36 AM Hongtao Liu wrote: > > > > > > On Tue, Aug 11, 2020 at 4:38 PM Uros Bizjak wrote: > > > > > > > > On Tue, Aug 11, 2020 at 5:30 AM Hongtao Liu wrote: > > > > > > > > > > Hi: > > > > > The issue is described in the bugzilla. > > > > > Bootstrap is ok, regression test for i386/x86-64 backend is ok. > > > > > Ok for trunk? > > > > > > > > > > ChangeLog > > > > > gcc/ > > > > > PR target/96350 > > > > > * config/i386/i386.c (ix86_legitimate_constant_p): Return > > > > > false for ENDBR immediate. > > > > > (ix86_legitimate_address_p): Ditto. > > > > > * config/i386/predicated.md > > > > > (x86_64_immediate_operand): Exclude ENDBR immediate. > > > > > (x86_64_zext_immediate_operand): Ditto. > > > > > (x86_64_dwzext_immediate_operand): Ditto. > > > > > (ix86_not_endbr_immediate_operand): New predicate. > > > > > > > > > > gcc/testsuite > > > > > * gcc.target/i386/endbr_immediate.c: New test. > > > > > > > > +;; Return true if VALUE isn't an ENDBR opcode in immediate field. > > > > +(define_predicate "ix86_not_endbr_immediate_operand" > > > > + (match_test "1") > > > > > > > > Please reverse the above logic to introduce > > > > ix86_endbr_immediate_operand, that returns true for unwanted > > > > immediate. Something like: > > > > > > > > (define_predicate "ix86_endbr_immediate_operand" > > > > (match_code "const_int") > > > > ... > > > > > > > > And you will be able to use it like: > > > > > > > > if (ix86_endbr_immediate_operand (x, VOIDmode) > > > > return false; > > > > > > > > > > Changed. > > > > No, it is not. > > > > + if ((flag_cf_protection & CF_BRANCH) > > + && CONST_INT_P (op)) > > > > You don't need to check for const ints here. > > > > And please rewrite the body of the function to something like (untested): > > > > { > > unsigned HOST_WIDE_INT val = TARGET_64BIT ? 0xfa1e0ff3 : 0xfb1e0ff3; > > > > if (x == val) > > return 1; > > > > if (TARGET_64BIT) > > for (; x >= val; x >>= 8) > > if (x == val) > > return 1; > > > > return 0; > > } > > > > so it will at least *look* like some thoughts have been spent on this. > > I don't plan to review the code where it is obvious from the first > > look that it was thrown together in a hurry. Please get some internal > > company signoff first. Ping me in a week for a review. > > > > Sorry for the hurry, i know your time is precious. > > > Uros. > > > > > > >/* Otherwise we handle everything else in the move patterns. */ > > > > - return true; > > > > + return ix86_not_endbr_immediate_operand (x, VOIDmode); > > > > } > > > > > > > > Please handle this in CASE_CONST_SCALAR_INT: part. > > > > > > > > + if (disp && !ix86_not_endbr_immediate_operand (disp, VOIDmode)) > > > > +return false; > > > > > > > > And this in: > > > > > > > > /* Validate displacement. */ > > > > if (disp) > > > > { > > > > > > > > > > Changed. > > > > A better place for these new special cases is at the beginning of the > > part I referred, not at the end. > > > > Yes. > > > Uros. > > Update patch. OK with two nits below. Thanks, Uros. + if (flag_cf_protection & CF_BRANCH) + { + unsigned HOST_WIDE_INT imm = INTVAL (op); UINTVAL, just for the consistency. + unsigned HOST_WIDE_INT val = TARGET_64BIT ? 0xfa1e0ff3 : 0xfb1e0ff3; @@ -374,6 +402,8 @@ (define_predicate "x86_64_dwzext_immediate_operand" (match_code "const_int,const_wide_int") { + if (ix86_endbr_immediate_operand (op, VOIDmode)) +return false; vertical space here. switch (GET_CODE (op)) > > -- > BR, > Hongtao
[committed] [OG10] Backport fixes for PR94690
Hello I have backported Tobias' patches for PR94690 from master to devel/omp/gcc-10. These are: [Fortran] OpenMP - permit lastprivate in distribute + SIMD fixes (PR94690) (commit f884bef2105d748fd7869cd641cbb4f6b6bb) [Fortran] OpenMP 5 – permit more sharing clauses for SIMD (PR94690) (commit 0ec52417fd9b3bef5227cdc9a18ff4f0247b0ea4) Kwok From b39d180f83a15eaf0b1a42afd4990378be8229e8 Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Fri, 14 Aug 2020 05:20:39 -0700 Subject: [PATCH 2/2] =?UTF-8?q?[Fortran]=20OpenMP=205=20=E2=80=93=20permit?= =?UTF-8?q?=20more=20sharing=20clauses=20for=20SIMD=20(PR94690)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This is a backport from master of commit 0ec52417fd9b3bef5227cdc9a18ff4f0247b0ea4. gcc/fortran/ PR fortran/94690 * openmp.c (resolve_omp_do): Permit more clauses for SIMD iteration variables. gcc/testsuite/ PR fortran/94690 * gfortran.dg/gomp/openmp-simd-4.f90: New test. --- gcc/fortran/ChangeLog.omp| 9 gcc/fortran/openmp.c | 17 +++ gcc/testsuite/ChangeLog.omp | 8 +++ gcc/testsuite/gfortran.dg/gomp/openmp-simd-4.f90 | 65 4 files changed, 88 insertions(+), 11 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/gomp/openmp-simd-4.f90 diff --git a/gcc/fortran/ChangeLog.omp b/gcc/fortran/ChangeLog.omp index 340a4dc..00c6be0 100644 --- a/gcc/fortran/ChangeLog.omp +++ b/gcc/fortran/ChangeLog.omp @@ -1,6 +1,15 @@ 2020-08-14 Kwok Cheung Yeung Backport from mainline + 2020-05-15 Tobias Burnus + + PR fortran/94690 + * openmp.c (resolve_omp_do): Permit more clauses for SIMD + iteration variables. + +2020-08-14 Kwok Cheung Yeung + + Backport from mainline 2020-05-13 Tobias Burnus PR fortran/94690 diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c index 220b1a9..5c384ca 100644 --- a/gcc/fortran/openmp.c +++ b/gcc/fortran/openmp.c @@ -5868,26 +5868,21 @@ resolve_omp_do (gfc_code *code) "at %L", name, &do_code->loc); if (code->ext.omp_clauses) for (list = 0; list < OMP_LIST_NUM; list++) - if (!is_simd + if (!is_simd || code->ext.omp_clauses->collapse > 1 ? (list != OMP_LIST_PRIVATE && list != OMP_LIST_LASTPRIVATE) - : code->ext.omp_clauses->collapse > 1 - ? (list != OMP_LIST_LASTPRIVATE) - : (list != OMP_LIST_LINEAR)) + : (list != OMP_LIST_PRIVATE && list != OMP_LIST_LASTPRIVATE +&& list != OMP_LIST_LINEAR)) for (n = code->ext.omp_clauses->lists[list]; n; n = n->next) if (dovar == n->sym) { - if (!is_simd) + if (!is_simd || code->ext.omp_clauses->collapse > 1) gfc_error ("%s iteration variable present on clause " "other than PRIVATE or LASTPRIVATE at %L", name, &do_code->loc); - else if (code->ext.omp_clauses->collapse > 1) - gfc_error ("%s iteration variable present on clause " - "other than LASTPRIVATE at %L", - name, &do_code->loc); else gfc_error ("%s iteration variable present on clause " - "other than LINEAR at %L", - name, &do_code->loc); + "other than PRIVATE, LASTPRIVATE or " + "LINEAR at %L", name, &do_code->loc); break; } if (i > 1) diff --git a/gcc/testsuite/ChangeLog.omp b/gcc/testsuite/ChangeLog.omp index aba7d39..8f652f4 100644 --- a/gcc/testsuite/ChangeLog.omp +++ b/gcc/testsuite/ChangeLog.omp @@ -1,3 +1,11 @@ +2020-08-14 Kwok Cheung Yeung + + Backport from mainline + 2020-05-15 Tobias Burnus + + PR fortran/94690 + * gfortran.dg/gomp/openmp-simd-4.f90: New test. + 2020-08-12 Kwok Cheung Yeung * gcc.target/nvptx/sync.c: New. diff --git a/gcc/testsuite/gfortran.dg/gomp/openmp-simd-4.f90 b/gcc/testsuite/gfortran.dg/gomp/openmp-simd-4.f90 new file mode 100644 index 000..4a17fb9 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/openmp-simd-4.f90 @@ -0,0 +1,65 @@ +! { dg-do compile } + +integer :: i, j, k +integer :: x(5), y(2,5) + +!$omp parallel do private(i) +do i = 1, 5 + x(i) = 42 +end do + +!$omp parallel do lastprivate(i) +do i = 1, 5 + x(i) = 42 +end do + + +!$omp simd private(i) +do i = 1, 5 + x(i) = 42 +end do + +!$omp simd linear(i) +do i = 1, 5 + x(i) = 42 +end do + +!$omp simd lastprivate(i) +do i = 1, 5 + x(i) = 42 +end do + + +!$omp simd private(i) lastprivate(j) collapse(2) +do i = 1, 5 + do j = 1, 2 +y(j, i) = 52 + end do +end do
Re: [committed] analyzer: rewrite of region and value-handling
On Fri, 2020-08-14 at 09:22 +0200, Martin Liška wrote: > On 8/13/20 10:58 PM, David Malcolm via Gcc-patches wrote: > > PR analyzer/93032 (missing leak diagnostic for > > zlib/contrib/minizip/mztools.c) > > PR analyzer/93938 (ICE in analyzer) > > PR analyzer/94011 (ICE in analyzer) > > PR analyzer/94099 (ICE in analyzer) > > PR analyzer/94399 (leak false positive with > > __attribute__((cleanup( > > PR analyzer/94458 (leak false positive) > > PR analyzer/94503 (ICE on C++ return-value-optimization) > > PR analyzer/94640 (leak false positive) > > PR analyzer/94688 (ICE in analyzer) > > PR analyzer/94689 ("arrays of functions are not meaningful" error) > > PR analyzer/94839 (leak false positive) > > PR analyzer/95026 (leak false positive) > > PR analyzer/95042 (ICE merging const and non-const C++ object > > instances) > > PR analyzer/95240 (leak false positive) > > Hello David. > > Unfortunately, this format is not recognized by gcc-changelog script > and > so the corresponding PR entries were not added to the generated > ChangeLog > entries. The currently supported regex is: > > pr_regex = re.compile(r'\tPR (?P[a-z+-]+\/)?([0-9]+)$') > > which prevents parsing an entries not being standalone. > Anyway, I updated gcc/analyzer/ChangeLog manually. Thanks
Re: [committed] analyzer: rewrite of region and value-handling
On Fri, 2020-08-14 at 09:26 +0200, Martin Liška wrote: > On 8/14/20 9:22 AM, Martin Liška wrote: > > On 8/13/20 10:58 PM, David Malcolm via Gcc-patches wrote: > > > PR analyzer/93032 (missing leak diagnostic for > > > zlib/contrib/minizip/mztools.c) > > > PR analyzer/93938 (ICE in analyzer) > > > PR analyzer/94011 (ICE in analyzer) > > > PR analyzer/94099 (ICE in analyzer) > > > PR analyzer/94399 (leak false positive with > > > __attribute__((cleanup( > > > PR analyzer/94458 (leak false positive) > > > PR analyzer/94503 (ICE on C++ return-value-optimization) > > > PR analyzer/94640 (leak false positive) > > > PR analyzer/94688 (ICE in analyzer) > > > PR analyzer/94689 ("arrays of functions are not meaningful" > > > error) > > > PR analyzer/94839 (leak false positive) > > > PR analyzer/95026 (leak false positive) > > > PR analyzer/95042 (ICE merging const and non-const C++ object > > > instances) > > > PR analyzer/95240 (leak false positive) > > > > Hello David. > > > > Unfortunately, this format is not recognized by gcc-changelog > > script and > > so the corresponding PR entries were not added to the generated > > ChangeLog > > entries. The currently supported regex is: > > > > pr_regex = re.compile(r'\tPR (?P[a-z+-]+\/)?([0-9]+)$') > > > > which prevents parsing an entries not being standalone. > > Anyway, I updated gcc/analyzer/ChangeLog manually. > > > > Thanks, > > Martin BTW, the entries you quoted above (with their per-bug descriptions) are in the leading text of the commit message, without indentation, and they are also in the section labeled "gcc/analyzer/ChangeLog with TAB indentation, and without descriptions. In my initial attempts to push the patch, the latter had the same text as the former, each line indented with a tab, and it caused the commit to fail the push hook. I removed the trailing descriptive text from each (TAB)PR analyzer/N line in the ChangeLog part of the message, and the hook passed and let me push the patch. (I wanted to quote the logs here, but I don't seem to have them anymore, sadly) > ... and I bet for similar reasons gcc-bugs emails were not send to > various PRs > mentioned in the commit. I wondered about that too. Given that the hook rejected it, and then accepted the revised version, my theory is that the commit exceeded some size limit for the bugzilla integration (the "git show --no-patch" blurb and ChangeLog from the commit is 1652 lines long and approaching 100k [1]) Looking at the "Daily bump." commit b3cb56060bcdc1cf4d38aa30b5017b802822f8c0 I see that the ChangeLog entries did make it into the various ChangeLog files. Dave [1] writing the ChangeLog took about 3 days, which part of me resents as tedious busywork, but I did find and fix several bugs whilst doing it, including one serious one-liner mistake that was significantly slowed down the code, so there is some merit in poring over a candidate change line-by-line, I guess.
Intel AMX programming model discussion
Hi, Intel Advanced Matrix Extensions (Intel AMX) is a new programming paradigm consisting of two components: a set of 2-dimensional registers (tiles) representing sub-arrays from a larger 2-dimensional memory image, and accelerators able to operate on tiles. Capability of Intel AMX implementation is enumerated by palettes. Two palettes are supported: palette 0 represents the initialized state and palette 1 consists of 8 tile registers of up to 1 KB size, which is controlled by a tile control register. The instruction manual is posted at https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html. The AMX abi proposal is posted at https://groups.google.com/g/x86-64-abi/c/NRejFm7pwb4. This email is to discuss the programming model for AMX. Florian has introduced the matrix type and intrinsics in LLVM community. We'd like to adopt some ideas from it. We propose for the AMX programming model at http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html. Comments are welcome. Thanks Yuanke
Re: [PATCH] diagnostics: Add new option -fdiagnostics-plain-output
On Wed, Aug 12, 2020 at 12:54 PM Richard Sandiford wrote: > > Lewis Hyatt writes: > > Hello- > > > > Attached is the patch I mentioned in another discussion here: > > https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551442.html > > > > This adds a new option -fdiagnostics-plain-output that currently means the > > same thing as: > > -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers > > -fdiagnostics-color=never -fdiagnostics-urls=never > > > > The idea is that over time, if diagnostics output changes to get more bells > > and whistles by default (such as the UTF-8 output I suggested in the above > > discussion), -fdiagnostics-plain-output will adjust to turn that back off, > > so that the testsuite needs only the one option and doesn't need to get > > updated every time something new is added. It seems to me that when > > diagnostics change, it's otherwise a bit hard to update the testsuite > > correctly, especially for compat.exp that is often not run by default. I > > think this would also be useful for utilities that want to parse the > > diagnostics (but aren't using -fdiagnostics-output-format=json). > > > > BTW, I considered avoiding adding a new switch by having this option take > > the form -fdiagnostics-output-format=plain instead, but this seems to have > > problematic semantics when multiple related options are specified. Given > > that > > this option needs to be expanded early in the parsing process, so that it > > can be compatible with the special handling for -fdiagnostics-color, it > > seemed best to just make it a simple option with no negated form. > > > > I hope this may be useful, please let me know if you'd like me to push > > it. bootstrap and regtest were done for all languages on x86-64 Linux, all > > tests the same before and after, and same for the compat.exp with > > alternate compiler GCC 8.2. > > Thanks for doing this. LGTM except for a couple of very minor things: > > > […] > > @@ -981,6 +982,42 @@ decode_cmdline_options_to_array (unsigned int argc, > > const char **argv, > > argv[++i] = replacement; > > } > > > > + /* Expand -fdiagnostics-plain-output to its constituents. This needs > > + to happen here so that prune_options can handle -fdiagnostics-color > > + specially. */ > > + if (!strcmp (opt, "-fdiagnostics-plain-output")) > > + { > > + /* If you have changed the default diagnostics output, and this new > > + output is not appropriately "plain" (e.g., the change needs to be > > + undone in order for the testsuite to work properly), then please do > > + the following: > > With GCC formatting, the paragraph should be indented under “If you…”. > > > + 1. Add the necessary option to undo the new behavior to > > + the array below. > > + 2. Update the documentation for -fdiagnostics-plain-output > > + in invoke.texi. > > + */ > > …and this should be on the previous line (“. */”). > > > + const char *const expanded_args[] = { > > + "-fno-diagnostics-show-caret", > > + "-fno-diagnostics-show-line-numbers", > > + "-fdiagnostics-color=never", > > + "-fdiagnostics-urls=never", > > + }; > > Hadn't expected it to be this complicated :-) But I agree with your > reasoning: it looks like this is the correct way given the special > handling of -fdiagnostic-color (and potentially other -fdiagnostic > options in future). > > > + const int num_expanded > > + = sizeof expanded_args / sizeof (*expanded_args); > > Simplifies to “ARRAY_SIZE (expanded_args)”. > > > + opt_array_len += num_expanded - 1; > > + opt_array = XRESIZEVEC (struct cl_decoded_option, > > + opt_array, opt_array_len); > > + for (int j = 0; j < num_expanded;) > > + { > > + j += decode_cmdline_option (expanded_args + j, lang_mask, > > + &opt_array[num_decoded_options]); > > Might be worth using the same "for" loop structure as the outer loop, > assigning the number of decoded options to “n”. Neither's better than > the other, but it makes it clearer that there's nothing special going on. > > > + num_decoded_options++; > > + } > > + > > + n = 1; > > + continue; > > + } > > + > >n = decode_cmdline_option (argv + i, lang_mask, > >&opt_array[num_decoded_options]); > >num_decoded_options++; > > diff --git a/gcc/testsuite/lib/c-compat.exp b/gcc/testsuite/lib/c-compat.exp > > index 9493c214aea..5f43c5a6a57 100644 > > --- a/gcc/testsuite/lib/c-compat.exp > > +++ b/gcc/testsuite/lib/c-compat.exp > > @@ -36,24 +36,34 @@ load_lib target-libpath.exp > > proc compat-use-alt-compiler { } { > > global GCC_UNDER_TEST ALT_CC_UNDER_TEST > > global compat_same_alt compat_alt_caret compat_alt_color > > compat_no_line_no > > -gl
Re: [PATCH] diagnostics: Add new option -fdiagnostics-plain-output
On Fri, 2020-08-14 at 10:01 -0400, Lewis Hyatt wrote: > On Wed, Aug 12, 2020 at 12:54 PM Richard Sandiford > wrote: [...] > > OK with those changes in 24 hrs if noone objects or asks for a > > delay. > > > > Thanks, > > Richard > > Thanks for the review, and sorry about the formatting glitches. I > pushed it just now with your fixes. Thanks! (both to you and to Richard)
[committed] i386: Improve LWP builtin expanders.
Use parameterized pattern names to simplify calling of named patterns. 2020-08-14 Uroš Bizjak gcc/ChangeLog: * config/i386/i386-builtin.def (__builtin_ia32_llwpcb) (__builtin_ia32_slwpcb, __builtin_ia32_lwpval32) (__builtin_ia32_lwpval64, __builtin_ia32_lwpins32) (__builtin_ia32_lwpins64): Use CODE_FOR_nothing. * config/i386/i386.md (@lwp_llwpcb): Implement as parametrized name pattern. (@lwp_slwpcb): Ditto. (@lwp_lwpval): Ditto. (@lwp_lwpins): Ditto. * config/i386/i386-expand.c (ix86_expand_special_args_builtin) [case VOID_FTYPE_UINT_UINT_UINT, case VOID_FTYPE_UINT64_UINT_UINT] [case UCHAR_FTYPE_UINT_UINT_UINT, case UCHAR_FTYPE_UINT64_UINT_UINT]: Remove. (ix86_expand_builtin) [ case IX86_BUILTIN_LLWPCB, case IX86_BUILTIN_LLWPCB]: Update for parameterized name patterns. [case IX86_BUILTIN_LWPVAL32, case IX86_BUILTIN_LWPVAL64] [case IX86_BUILTIN_LWPINS32, case IX86_BUILTIN_LWPINS64]: Expand here. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Uros. diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def index 25b80868bd3..3b6c4a85579 100644 --- a/gcc/config/i386/i386-builtin.def +++ b/gcc/config/i386/i386-builtin.def @@ -260,12 +260,12 @@ BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_loadsf_mask, "__builtin_ia32 BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_storedf_mask, "__builtin_ia32_storesd_mask", IX86_BUILTIN_STORESD_MASK, UNKNOWN, (int) VOID_FTYPE_PDOUBLE_V2DF_UQI) BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_avx512f_storesf_mask, "__builtin_ia32_storess_mask", IX86_BUILTIN_STORESS_MASK, UNKNOWN, (int) VOID_FTYPE_PFLOAT_V4SF_UQI) -BDESC (OPTION_MASK_ISA_LWP, 0, CODE_FOR_lwp_llwpcb, "__builtin_ia32_llwpcb", IX86_BUILTIN_LLWPCB, UNKNOWN, (int) VOID_FTYPE_PVOID) -BDESC (OPTION_MASK_ISA_LWP, 0, CODE_FOR_lwp_slwpcb, "__builtin_ia32_slwpcb", IX86_BUILTIN_SLWPCB, UNKNOWN, (int) PVOID_FTYPE_VOID) -BDESC (OPTION_MASK_ISA_LWP, 0, CODE_FOR_lwp_lwpvalsi3, "__builtin_ia32_lwpval32", IX86_BUILTIN_LWPVAL32, UNKNOWN, (int) VOID_FTYPE_UINT_UINT_UINT) -BDESC (OPTION_MASK_ISA_LWP | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_lwp_lwpvaldi3, "__builtin_ia32_lwpval64", IX86_BUILTIN_LWPVAL64, UNKNOWN, (int) VOID_FTYPE_UINT64_UINT_UINT) -BDESC (OPTION_MASK_ISA_LWP, 0, CODE_FOR_lwp_lwpinssi3, "__builtin_ia32_lwpins32", IX86_BUILTIN_LWPINS32, UNKNOWN, (int) UCHAR_FTYPE_UINT_UINT_UINT) -BDESC (OPTION_MASK_ISA_LWP | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_lwp_lwpinsdi3, "__builtin_ia32_lwpins64", IX86_BUILTIN_LWPINS64, UNKNOWN, (int) UCHAR_FTYPE_UINT64_UINT_UINT) +BDESC (OPTION_MASK_ISA_LWP, 0, CODE_FOR_nothing, "__builtin_ia32_llwpcb", IX86_BUILTIN_LLWPCB, UNKNOWN, (int) VOID_FTYPE_PVOID) +BDESC (OPTION_MASK_ISA_LWP, 0, CODE_FOR_nothing, "__builtin_ia32_slwpcb", IX86_BUILTIN_SLWPCB, UNKNOWN, (int) PVOID_FTYPE_VOID) +BDESC (OPTION_MASK_ISA_LWP, 0, CODE_FOR_nothing, "__builtin_ia32_lwpval32", IX86_BUILTIN_LWPVAL32, UNKNOWN, (int) VOID_FTYPE_UINT_UINT_UINT) +BDESC (OPTION_MASK_ISA_LWP | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_nothing, "__builtin_ia32_lwpval64", IX86_BUILTIN_LWPVAL64, UNKNOWN, (int) VOID_FTYPE_UINT64_UINT_UINT) +BDESC (OPTION_MASK_ISA_LWP, 0, CODE_FOR_nothing, "__builtin_ia32_lwpins32", IX86_BUILTIN_LWPINS32, UNKNOWN, (int) UCHAR_FTYPE_UINT_UINT_UINT) +BDESC (OPTION_MASK_ISA_LWP | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_nothing, "__builtin_ia32_lwpins64", IX86_BUILTIN_LWPINS64, UNKNOWN, (int) UCHAR_FTYPE_UINT64_UINT_UINT) /* FSGSBASE */ BDESC (OPTION_MASK_ISA_FSGSBASE | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_rdfsbasesi, "__builtin_ia32_rdfsbase32", IX86_BUILTIN_RDFSBASE32, UNKNOWN, (int) UNSIGNED_FTYPE_VOID) diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index aec894bbdb4..9de6f5029b9 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -10665,15 +10665,6 @@ ix86_expand_special_args_builtin (const struct builtin_description *d, klass = load; memory = 0; break; -case VOID_FTYPE_UINT_UINT_UINT: -case VOID_FTYPE_UINT64_UINT_UINT: -case UCHAR_FTYPE_UINT_UINT_UINT: -case UCHAR_FTYPE_UINT64_UINT_UINT: - nargs = 3; - klass = load; - memory = ARRAY_SIZE (args); - last_arg_constant = true; - break; default: gcc_unreachable (); } @@ -10728,13 +10719,7 @@ ix86_expand_special_args_builtin (const struct builtin_description *d, { if (!match) { - if (icode == CODE_FOR_lwp_lwpvalsi3 - || icode == CODE_FOR_lwp_lwpinssi3 - || icode == CODE_FOR_lwp_lwpvaldi3 - || icode == CODE_FOR_lwp_lwpinsdi3) - error ("the last argument must be a 32-bit immediate"); - else - error ("the last argument must be an 8-bit immediate"); + error ("the last argument must be an 8-bit immediate"); return const0_rtx; }
c++: More simplification of name_lookup api
Continuing fixing name lookup's API we have two parameters saying what we'd like to find 'prefer_type', which is a tri-valued boolan with meaning 'don't care', 'type or namespace', 'type or death'. And we have a second parameter 'namespaces_only', which means 'namespace or death'. There are only 4 states, because the latter one has priority. Firstly 'prefer_type' isn't really the right name -- it's not a preference, it's a requirement. Name lookup maps those two parameters into 2 LOOKUP_ bits. We can simply have callers express that desire directly. So this adds another enum class, LOOK_want, which expresses all those options in 2 bits. Most of this patch is then the expected fallout from such a change. The parser was mapping its internal state into a prefer_type value, which was then mapped into the LOOKUP_ bits. So this saves a conversion there. Also the parser's conversion routine had an 'is_template' flag, which was only ever true in one place, where the parser also had to deal with other nuances of the flags to pass. So just drop that parm and deal with it at the call site too. I've left LOOKUP_HIDDEN alone for the moment. That'll be next. pushed gcc/cp/ * cp-tree.h (LOOKUP_PREFER_TYPES, LOOKUP_PREFER_NAMESPACES) (LOOKUP_NAMESPACES_ONLY, LOOKUP_TYPES_ONLY) (LOOKUP_QUALIFIERS_ONL): Delete. (LOOKUP_HIDDEN): Adjust. * name-lookup.h (enum class LOOK_want): New. (operator|, operator&): Overloads for it. (lookup_name_real): Replace prefer_type & namespaces_only with LOOK_want parm. (lookup_qualified_name): Replace prefer_type with LOOK_want. (lookup_name_prefer_type): Replace with ... (lookup_name): ... this. New overload with LOOK_want parm. * name-lookup.c (struct name_lookup): Replace flags with want and hidden fields. Adjust constructors. (name_lookyp::add_overload): Correct hidden stripping test. Update for new LOOK_want type. (name_lookup::process_binding): Likewise. (name_lookup::search_unqualified): Use hidden flag. (identifier_type_value_1): Adjust lookup_name_real call. (set_decl_namespace): Adjust name_lookup ctor. (lookup_flags): Delete. (qualify_lookup): Add LOOK_want parm, adjust. (lookup_qualified_name): Replace prefer_type parm with LOOK_want. (lookup_name_real_1): Replace prefer_type and namespaces_only with LOOK_want parm. (lookup_name_real): Likewise. (lookup_name_nonclass, lookup_name): Adjust lookup_name_real call. (lookup_name_prefer_type): Rename to ... (lookup_name): ... here. New overload with LOOK_want parm. (lookup_type_scope_1): Adjust qualify_lookup calls. * call.c (build_operator_new_call) (add_operator_candidates): Adjust lookup_name_real calls. * coroutines.cc (find_coro_traits_template_decl) (find_coro_handle_template_decl, morph_fn_to_coro): Adjust lookup_qualified_name calls. * cp-objcp-common.c (identifier_global_tag): Likewise. * decl.c (get_tuple_size, get_tuple_decomp_init): Likewise. (lookup_and_check_tag): Use lookup_name overload. * parser.c (cp_parser_userdef_numeric_literal): Adjust lookup_qualified_name call. (prefer_arg_type): Drop template_mem_access parm, return LOOK_want value. (cp_parser_lookup_name): Adjust lookup_member, lookup_name_real calls. * pt.c (check_explicit_specialization): Adjust lookup_qualified_name call. (tsubst_copy_and_build, tsubst_qualified_name): Likewise (deduction_guides_for): Likewise. (tsubst_friend_class): Adjust lookup_name_real call. (lookup_init_capture, tsubst_expr): Likewise. * rtti.c (emit_support_tinfos): Adjust lookup_qualified_name call. * semantics.c (omp_reduction_lookup): Likewise. (capture_decltype): Adjust lookup_name_real call. libcc1/ * libcp1plugin.cc (plugin_build_dependent_expr): Adjust lookup_name_real & lookup_qualified_name calls. -- Nathan Sidwell diff --git c/ChangeLog w/ChangeLog index 56ee7d467d7..21943e20e54 100644 --- c/ChangeLog +++ w/ChangeLog @@ -1,3 +1,62 @@ +2020-08-14 Nathan Sidwell + + gcc/cp/ + * cp-tree.h (LOOKUP_PREFER_TYPES, LOOKUP_PREFER_NAMESPACES) + (LOOKUP_NAMESPACES_ONLY, LOOKUP_TYPES_ONLY) + (LOOKUP_QUALIFIERS_ONL): Delete. + (LOOKUP_HIDDEN): Adjust. + * name-lookup.h (enum class LOOK_want): New. + (operator|, operator&): Overloads for it. + (lookup_name_real): Replace prefer_type & namespaces_only with + LOOK_want parm. + (lookup_qualified_name): Replace prefer_type with LOOK_want. + (lookup_name_prefer_type): Replace with ... + (lookup_name): ... this. New overload with LOOK_want parm. + * name-lookup.c (struct name_lookup): Replace flags with want and + hidden fields. Adjust constructors. + (name_lookyp::add_o
[PATCH] modules: c++: Fix push_namespace ICE with modules
Hello! Attached is a patch that fixes an ICE on the devel/c++-modules branch caused by a slot invalidation edge case in push_namespace. It's been sitting around for a while and I wasn't sure if I should use the original date or not. Feel free to adjust that to the commit date if that's what it should be instead. 2020-05-15 Jeff Chapman II gcc/cp/ * name-lookup.c (push_namespace): Fix slot invalidation related ICE when compiling with modules enabled. gcc/testsuite/ * g++.dg/modules/string-view1.C: New test. * g++.dg/modules/string-view2.C: Ditto. Please let me know if there's anything more I can do, Jeff Chapman II From d33a239c187cb6cef1c39953c5f014bd266492de Mon Sep 17 00:00:00 2001 From: Jeff Chapman II Date: Fri, 15 May 2020 06:37:41 -0400 Subject: [PATCH 1/1] c++: Fix push_namespace ICE with modules push_namespace was reusing an earlier find_namespace_slot result after it was invalidated by pushdecl in corner cases. 2020-05-15 Jeff Chapman II gcc/cp/ * name-lookup.c (push_namespace): Fix slot invalidation related ICE when compiling with modules enabled. gcc/testsuite/ * g++.dg/modules/string-view1.C: New test. * g++.dg/modules/string-view2.C: Ditto. --- gcc/cp/name-lookup.c| 2 + gcc/testsuite/g++.dg/modules/string-view1.C | 6 +++ gcc/testsuite/g++.dg/modules/string-view2.C | 53 + 3 files changed, 61 insertions(+) create mode 100644 gcc/testsuite/g++.dg/modules/string-view1.C create mode 100644 gcc/testsuite/g++.dg/modules/string-view2.C diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c index e9495d2a282..462f028617c 100644 --- a/gcc/cp/name-lookup.c +++ b/gcc/cp/name-lookup.c @@ -8920,6 +8920,8 @@ push_namespace (tree name, bool make_inline) { /* finish up making the namespace. */ add_decl_to_level (NAMESPACE_LEVEL (current_namespace), ns); + /* pushdecl may invalidate slot, find name again. */ + slot = find_namespace_slot (current_namespace, name, true); make_namespace_finish (ns, slot); /* Add the anon using-directive here, we don't do it in diff --git a/gcc/testsuite/g++.dg/modules/string-view1.C b/gcc/testsuite/g++.dg/modules/string-view1.C new file mode 100644 index 000..dabc16a8b01 --- /dev/null +++ b/gcc/testsuite/g++.dg/modules/string-view1.C @@ -0,0 +1,6 @@ +// { dg-additional-options "-fmodules-ts" } +module; +#include +#include +export module foo; + diff --git a/gcc/testsuite/g++.dg/modules/string-view2.C b/gcc/testsuite/g++.dg/modules/string-view2.C new file mode 100644 index 000..2e389eacd8f --- /dev/null +++ b/gcc/testsuite/g++.dg/modules/string-view2.C @@ -0,0 +1,53 @@ +// { dg-additional-options "-fmodules-ts" } +// reduced from string-view1 through cvise. Broken under c++2a too. +namespace std { +typedef int a; +int b; +decltype(nullptr) c; +namespace xyz {} +__builtin_va_list d; +int n; +int e; +int f; +int g; +int h; +int i; +int j; +int k; +typedef struct l m; +typedef struct aa w; +typedef struct x o; +typedef x p; +long q; +long r; +typedef l s; +extern p ab; +void t(); +void v(); +extern p ac; +void ad(); +int ae; +int af; +extern p ag; +extern p ah; +void ai(); +void y(); +int aj; +int ak; +int al; +char am; +int an; +a ao; +int ap; +int aq; +void z(); +int ar; +int as; +void at(); +void au(); +void av(); +void aw(); +int u; +namespace zz { +} +} -- 2.27.0
[PATCH 1/2] modules: c++: Fix cross module member redecl/add long distance friendship warning
Hello again, This is part one of a patchset to add an optional warning for long distance (cross module) friendship when the friendship has no useful/sensical meaning. Attached is a patch to error when trying to define a member function of a type owned by a different module. This also fixes an issue where a friend decl lets a user define a function owned by a different module. 2020-08-14 Jeff Chapman II gcc/cp/ * decl.c (duplicate_decls): Return original decl when attempting to redeclare a function owned by another module instead of clobbering the original decl. (grokfndecl): Error on member declarations of types owned by another module. gcc/testsuite/ * g++.dg/modules/redecl-1_[ab].C: New test. * g++.dg/modules/redecl-2_[ab].C: Ditto. * g++.dg/modules/redecl-3_[ab].C: Ditto. Please let me know if there are any questions, Jeff Chapman II From fc95c562ec87520f66388a35009649c4b020a843 Mon Sep 17 00:00:00 2001 From: Jeff Chapman II Date: Thu, 19 Dec 2019 09:43:16 -0500 Subject: [PATCH 1/2] c++: Fix cross module member redecl 2020-08-14 Jeff Chapman II gcc/cp/ * decl.c (duplicate_decls): Return original decl when attempting to redeclare a function owned by another module instead of clobbering the original decl. (grokfndecl): Error on member declarations of types owned by another module. gcc/testsuite/ * g++.dg/modules/redecl-1_[ab].C: New test. * g++.dg/modules/redecl-2_[ab].C: Ditto. * g++.dg/modules/redecl-3_[ab].C: Ditto. --- gcc/cp/decl.c | 14 ++ gcc/testsuite/g++.dg/modules/redecl-1_a.C | 9 + gcc/testsuite/g++.dg/modules/redecl-1_b.C | 8 gcc/testsuite/g++.dg/modules/redecl-2_a.C | 10 ++ gcc/testsuite/g++.dg/modules/redecl-2_b.C | 7 +++ gcc/testsuite/g++.dg/modules/redecl-3_a.C | 6 ++ gcc/testsuite/g++.dg/modules/redecl-3_b.C | 13 + 7 files changed, 67 insertions(+) create mode 100644 gcc/testsuite/g++.dg/modules/redecl-1_a.C create mode 100644 gcc/testsuite/g++.dg/modules/redecl-1_b.C create mode 100644 gcc/testsuite/g++.dg/modules/redecl-2_a.C create mode 100644 gcc/testsuite/g++.dg/modules/redecl-2_b.C create mode 100644 gcc/testsuite/g++.dg/modules/redecl-3_a.C create mode 100644 gcc/testsuite/g++.dg/modules/redecl-3_b.C diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index decd58791ae..e8637dac7eb 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -2031,6 +2031,12 @@ duplicate_decls (tree newdecl, tree olddecl, bool newdecl_is_friend) } } } + /* FIXME Is there a better place to handle this? Without this, we end up + * potentially merging a decl owned by a separate module with a friend decl + * injected in the current module (see modules/redecl-3). Returning olddecl + * relies on code higher up handling the issue. */ + else if (modules_p () && !module_may_redeclare (olddecl)) +return olddecl; /* We have committed to returning OLDDECL at this point. */ @@ -9517,6 +9523,14 @@ grokfndecl (tree ctype, if (TREE_CODE (type) == METHOD_TYPE) { + if (modules_p() + && !module_may_redeclare (TYPE_NAME (ctype))) + { + error_at (location, "declaration conflicts with import"); + inform (location_of (ctype), "import declared %q#T here", ctype); + return NULL_TREE; + } + tree parm = build_this_parm (decl, type, quals); DECL_CHAIN (parm) = parms; parms = parm; diff --git a/gcc/testsuite/g++.dg/modules/redecl-1_a.C b/gcc/testsuite/g++.dg/modules/redecl-1_a.C new file mode 100644 index 000..2646adf8327 --- /dev/null +++ b/gcc/testsuite/g++.dg/modules/redecl-1_a.C @@ -0,0 +1,9 @@ +// { dg-additional-options -fmodules-ts } +export module foo; +// { dg-module-cmi foo } + +export struct Foo +{ + int foo(); +}; + diff --git a/gcc/testsuite/g++.dg/modules/redecl-1_b.C b/gcc/testsuite/g++.dg/modules/redecl-1_b.C new file mode 100644 index 000..b980bfe290c --- /dev/null +++ b/gcc/testsuite/g++.dg/modules/redecl-1_b.C @@ -0,0 +1,8 @@ +// { dg-additional-options -fmodules-ts } +import foo; + +int Foo::foo() // { dg-error "conflicts with import" } +{ + return 1; +} + diff --git a/gcc/testsuite/g++.dg/modules/redecl-2_a.C b/gcc/testsuite/g++.dg/modules/redecl-2_a.C new file mode 100644 index 000..eea99f2024e --- /dev/null +++ b/gcc/testsuite/g++.dg/modules/redecl-2_a.C @@ -0,0 +1,10 @@ +// { dg-additional-options -fmodules-ts } +export module foo; +// { dg-module-cmi foo } + +export struct Foo +{ + int foo(); + friend class Bar; +}; + diff --git a/gcc/testsuite/g++.dg/modules/redecl-2_b.C b/gcc/testsuite/g++.dg/modules/redecl-2_b.C new file mode 100644 index 000..9c3b545f6d2 --- /dev/null +++ b/gcc/testsuite/g++.dg/modules/redecl-2_b.C @@ -0,0 +1,7 @@ +// { dg-additional-options -fmodules-ts } +import foo; + +struct Bar // { dg-error "cannot declare.*in a different module" } +{ +}; + diff --git a/gcc/testsuite/g++.dg/modules/redecl-3_a.C
[PATCH 2/2] modules: c++: Fix cross module member redecl/add long distance friendship warning
Attached is a patch adding a -Wlong-distance-friends flag to diagnose long distance (cross module) friendship. 2020-08-14 Jeff Chapman II gcc/c-family/ * c.opt (Wlong-distance-friends): New. gcc/cp/ * cp-tree.h (module_friendship_compatible): New. * friend.c (add_friend): Warn when befriending a function owned by a different module. (make_friend_class): Warn when befriending a class owned by a different module. * module.cc (module_friendship_compatible): New function. Returns true if a decl may be befriended by another without issuing a long distance friend warning. gcc/testsuite/ * g++.dg/modules/long-distance-friend-1_[ab].C: New test. * g++.dg/modules/long-distance-friend-2_a.[Ch]: Ditto. * g++.dg/modules/long-distance-friend-3_[ab].[Ch]: Ditto. * g++.dg/modules/redecl-3_b.C: Add case for -Wlong-distance-friends. Thanks, Jeff Chapman II From 3d229111dcbe4bc3438207a500452c4420fba527 Mon Sep 17 00:00:00 2001 From: Jeff Chapman II Date: Fri, 20 Dec 2019 11:02:20 -0500 Subject: [PATCH 2/2] c++: Add warning flag for cross module friendship 2020-08-14 Jeff Chapman II gcc/c-family/ * c.opt (Wlong-distance-friends): New. gcc/cp/ * cp-tree.h (module_friendship_compatible): New. * friend.c (add_friend): Warn when befriending a function owned by a different module. (make_friend_class): Warn when befriending a class owned by a different module. * module.cc (module_friendship_compatible): New function. Returns true if a decl may be befriended by another without issuing a long distance friend warning. gcc/testsuite/ * g++.dg/modules/long-distance-friend-1_[ab].C: New test. * g++.dg/modules/long-distance-friend-2_a.[Ch]: Ditto. * g++.dg/modules/long-distance-friend-3_[ab].[Ch]: Ditto. * g++.dg/modules/redecl-3_b.C: Add case for -Wlong-distance-friends. --- gcc/c-family/c.opt| 4 ++ gcc/cp/cp-tree.h | 1 + gcc/cp/friend.c | 24 gcc/cp/module.cc | 38 +++ .../g++.dg/modules/long-distance-friend-1_a.C | 17 + .../g++.dg/modules/long-distance-friend-1_b.C | 24 .../g++.dg/modules/long-distance-friend-2_a.C | 18 + .../g++.dg/modules/long-distance-friend-2_a.h | 7 .../g++.dg/modules/long-distance-friend-3_a.C | 9 + .../g++.dg/modules/long-distance-friend-3_b.C | 9 + .../g++.dg/modules/long-distance-friend-3_b.h | 15 gcc/testsuite/g++.dg/modules/redecl-3_b.C | 4 +- 12 files changed, 168 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/modules/long-distance-friend-1_a.C create mode 100644 gcc/testsuite/g++.dg/modules/long-distance-friend-1_b.C create mode 100644 gcc/testsuite/g++.dg/modules/long-distance-friend-2_a.C create mode 100644 gcc/testsuite/g++.dg/modules/long-distance-friend-2_a.h create mode 100644 gcc/testsuite/g++.dg/modules/long-distance-friend-3_a.C create mode 100644 gcc/testsuite/g++.dg/modules/long-distance-friend-3_b.C create mode 100644 gcc/testsuite/g++.dg/modules/long-distance-friend-3_b.h diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 506d091e5f2..2d39ac5f974 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -739,6 +739,10 @@ Wlogical-not-parentheses C ObjC C++ ObjC++ Var(warn_logical_not_paren) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) Warn when logical not is used on the left hand side operand of a comparison. +Wlong-distance-friends +C++ Var(warn_long_distance_friends) Warning LangEnabledBy(C++,Wall) +Warn when friends are declared across module boundaries. + Wlong-long C ObjC C++ ObjC++ CPP(cpp_warn_long_long) CppReason(CPP_W_LONG_LONG) Var(warn_long_long) Init(-1) Warning LangEnabledBy(C ObjC,Wc90-c99-compat) Do not warn about using \"long long\" when -pedantic. diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index d633b4d5c70..9299625f33b 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -7071,6 +7071,7 @@ inline bool module_exporting_p () extern module_state *get_module (tree name, module_state *parent = NULL, bool partition = false); extern bool module_may_redeclare (tree decl); +extern bool module_friendship_compatible (tree decl1, tree decl2); extern int module_initializer_kind (); extern void module_add_import_initializers (); diff --git a/gcc/cp/friend.c b/gcc/cp/friend.c index 9b27d7d340a..1e6e746a3a5 100644 --- a/gcc/cp/friend.c +++ b/gcc/cp/friend.c @@ -173,6 +173,17 @@ add_friend (tree type, tree decl, bool complain) if (decl == error_mark_node) return; + if (modules_p () + && complain + && warn_long_distance_friends + && !module_friendship_compatible (TYPE_NAME (type), decl)) +{ + warning (OPT_Wlong_distance_friends, + "%q#T is not visible to befriended declaration %q#D", + type, decl); + inform (DECL_SOURCE_LOCA
Re: [PATCH] introduce attribute exalias
Ping? In case there isn't immediate approval for the patch proper (I suppose different parts will require review by different subsystem maintainers), I'd appreciate at least community and language lawyers buy-in (or turn-down) for the new feature hereby proposed for C-family languages, namely, attribute exalias("symbol_name") as a means to have symbol_name output as a same-linkage alias for functions, variables, and for C++ class types' RTTI symbols. Thanks in advance, On Aug 7, 2020, Alexandre Oliva wrote: > Since last week's patchlet, I've delayed the creation of the exalias > decls, improved the merging of attributes, minimizing > interface/visibility updates, found a better way to assign exaliases to > nested explicit instantiations, even after enabling aliases to > already-defined types, so now I'm reasonably happy with the patch. > This patch introduces an attribute to add extra aliases to a symbol > when its definition is output. The main goal is to ease interfacing > C++ with Ada, as C++ mangled names have to be named, and in some cases > (e.g. when using stdint.h typedefs in function arguments) the symbol > names may vary across platforms. > The attribute is usable in C and C++, presumably in all C-family > languages. It can be attached to global variables and functions. In > C++, it can also be attached to namespace-scoped variables and > functions, static data members, member functions, explicit > instantiations and specializations of template functions, members and > classes. When applied to constructors or destructor, additional > exaliases with _Base and _Del suffixes are defined for variants other > than complete-object ones. > Applying the attribute to class types is only valid in C++, and the > effect is to attach the alias to the RTTI object associated with the > class type. > While working on this, I noticed C++ didn't merge attributes of extern > local declarations with those of the namespace-scoped declaration. > I've added code to merge the attributes if there is a namespace-scoped > declaration, but if there isn't one, there won't be any merging, and > the effects are noticeable, as in the added attr-weak-1.C. I'm also > slightly concerned that an earlier local decl would go out of sync if > a subsequent local decl, say within the same or even in another > function, introduces additional attributes in the global decl. > Regstrapped on x86_64-linux-gnu. Ok to install? > (The newly-introduced attr-weak-1.c passes in C, but is marked as XFAIL > for C++, so it gets an XPASS in C; I could move it to some C++-only > subtree, or drop it altogether and file a PR instead) > for gcc/ChangeLog > * attribs.c: Include cgraph.h. > (decl_attributes): Allow late introduction of exalias in > types. > (create_exalias_decl, create_exalias_decls): New. > * attribs.h: Declare them. > (FOR_EACH_EXALIAS): New macro. > * cgraph.c (cgraph_node::create): Create exalias decls. > * varpool.c (varpool_node::get_create): Create exalias decls. > * cgraph.h (symtab_node::remap_exalias_target): New. > * symtab.c (symtab_node::remap_exalias_target): Define. > * cgraphunit.c (cgraph_node::analyze): Create alias_target > node if needed. > (analyze_functions): Fixup visibility of implicit alias only > after its node is analyzed. > * doc/extend.texi (exalias): Document for variables, functions > and types. > for gcc/ada/ChangeLog > * doc/gnat_rm/interfacing_to_other_languages.rst: Mention > attribute exalias to give RTTI symbols mnemonic names. > * doc/gnat_ugn/the_gnat_compilation_model.rst: Mention > attribute exalias. Fix incorrect ref to C1 ctor variant. > for gcc/c-family/ChangeLog > * c-ada-spec.c (pp_asm_name): Use first exalias if available. > * c-attribs.c (handle_exalias_attribute): New. > (c_common_attribute_table): Add exalias. > (handle_copy_attribute): Do not copy exalias. > * c-decl.c (duplicate_decls): Remap exalias target. > for gcc/cp/ChangeLog > * class.c (copy_fndecl_with_name): Move/adjust exalias to > cdtor variants. > (build_cdtor_clones): Drop exalias from primary variant. > * cp-tree.h (update_exalias_interface, update_tinfo_exalias): > Declare. > * decl.c (duplicate_decls): Remap exalias target. > (grokfndecl): Tentatively create exalias decls after adding > attributes in e.g. a template member function explicit > instantiation. > * decl2.c (cplus_decl_attributes): Update tinfo exalias. > (copy_interface, update_exalias_interface): New. > (determine_visibility): Update exalias interface. > (tentative_decl_linkage, import_export_decl): Likewise. > * name-lookup.c: Include target.h and cgraph.h. > (set_local_extern_decl_linkage): Merge attributes with a > namespace-scoped decl if one is found. Remap exalias > targe
Re: PING: Fwd: [PATCH 1/2] Add statement context to get_value_range.
On 8/11/20 7:53 AM, Aldy Hernandez via Gcc-patches wrote: -- Forwarded message - From: Aldy Hernandez Date: Tue, Aug 4, 2020, 13:55 Subject: [PATCH 1/2] Add statement context to get_value_range. To: Cc: , Aldy Hernandez This is in line with the statement context that we have for get_value() in the substitute_and_fold_engine class. --- gcc/vr-values.c | 64 ++--- gcc/vr-values.h | 14 +-- 2 files changed, 41 insertions(+), 37 deletions(-) diff --git a/gcc/vr-values.c b/gcc/vr-values.c index 511342f2f13..9002d87c14b 100644 --- a/gcc/vr-values.c +++ b/gcc/vr-values.c @@ -147,7 +147,8 @@ vr_values::get_lattice_entry (const_tree var) return NULL. Otherwise create an empty range if none existed for VAR. */ const value_range_equiv * -vr_values::get_value_range (const_tree var) +vr_values::get_value_range (const_tree var, + gimple *stmt ATTRIBUTE_UNUSED) { /* If we have no recorded ranges, then return NULL. */ if (!vr_value) @@ -450,7 +451,7 @@ simplify_using_ranges::op_with_boolean_value_range_p (tree op) /* ?? Errr, this should probably check for [0,0] and [1,1] as well as [0,1]. */ - const value_range *vr = get_value_range (op); + const value_range *vr = get_value_range (op, NULL); return *vr == value_range (build_zero_cst (TREE_TYPE (op)), build_one_cst (TREE_TYPE (op))); } I think if we are adding "gimple *stmt" as a parameter, we should make if default to NULL... Then we won't have to change all the callers that don't have a need for it. I get that it helped us find all the places where stmts were available/needed originally, but I think that need is no longer relevant and we can revert to making it a default parameter now. further more, I don't think it should be a ATTRIBUTE_UNUSED, and then pass a NULL further down :) we should be able to pass stmt. @@ -972,12 +973,13 @@ vr_values::extract_range_from_cond_expr (value_range_equiv *vr, gassign *stmt) void vr_values::extract_range_from_comparison (value_range_equiv *vr, + gimple *stmt, enum tree_code code, tree type, tree op0, tree op1) Now that we are passing stmt in, and there is only one use of this function, I think you can kill the final 4 parameters and just get them in the function itself... { bool sop; tree val -= simplifier.vrp_evaluate_conditional_warnv_with_ops (code, op0, op1, += simplifier.vrp_evaluate_conditional_warnv_with_ops (stmt, code, op0, op1, false, &sop, NULL); if (val) { @@ -1008,14 +1010,14 @@ check_for_binary_op_overflow (vr_values *store, { value_range vr0, vr1; if (TREE_CODE (op0) == SSA_NAME) -vr0 = *store->get_value_range (op0); +vr0 = *store->get_value_range (op0, NULL); else if (TREE_CODE (op0) == INTEGER_CST) vr0.set (op0); else vr0.set_varying (TREE_TYPE (op0)); if (TREE_CODE (op1) == SSA_NAME) -vr1 = *store->get_value_range (op1); +vr1 = *store->get_value_range (op1, NULL); else if (TREE_CODE (op1) == INTEGER_CST) vr1.set (op1); else @@ -1472,7 +1474,7 @@ vr_values::extract_range_from_assignment (value_range_equiv *vr, gassign *stmt) else if (code == COND_EXPR) extract_range_from_cond_expr (vr, stmt); else if (TREE_CODE_CLASS (code) == tcc_comparison) -extract_range_from_comparison (vr, gimple_assign_rhs_code (stmt), +extract_range_from_comparison (vr, stmt, gimple_assign_rhs_code (stmt), gimple_expr_type (stmt), gimple_assign_rhs1 (stmt), gimple_assign_rhs2 (stmt)); @@ -1805,7 +1807,7 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, if (TREE_CODE (step) == INTEGER_CST && is_gimple_val (init) && (TREE_CODE (init) != SSA_NAME - || get_value_range (init)->kind () == VR_RANGE)) + || get_value_range (init, stmt)->kind () == VR_RANGE)) { widest_int nit; @@ -1838,7 +1840,7 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, value_range initvr; if (TREE_CODE (init) == SSA_NAME) - initvr = *(get_value_range (init)); + initvr = *(get_value_range (init, stmt)); else if (is_gimple_min_invariant (init)) initvr.set (init); else @@ -2090,7 +2092,7 @@ const value_range_equiv * simplify_using_ranges::get_vr_for_comparison (int i, value_range_equiv *tem) { /* Shallow-copy equiv bitmap. */ - const value_range_equiv *vr = get_value_range (ssa_name (i)); + const value_range_equiv *vr = get_value_r
Re: PING: Fwd: [PATCH 2/2] Decouple adjust_range_from_scev from vr_values and value_range_equiv.
I made some minor changes to the function comments. gcc/ChangeLog: * vr-values.c (check_for_binary_op_overflow): Change type of store to range_query. (vr_values::adjust_range_with_scev): Abstract most of the code... (range_of_var_in_loop): ...here. Remove value_range_equiv uses. (simplify_using_ranges::simplify_using_ranges): Change type of store to range_query. * vr-values.h (class range_query): New. (class simplify_using_ranges): Use range_query. (class vr_values): Add OVERRIDE to get_value_range. (range_of_var_in_loop): New. --- gcc/vr-values.c | 150 ++-- gcc/vr-values.h | 23 ++-- 2 files changed, 88 insertions(+), 85 deletions(-) diff --git a/gcc/vr-values.c b/gcc/vr-values.c index 9002d87c14b..5b7bae3bfb7 100644 --- a/gcc/vr-values.c +++ b/gcc/vr-values.c @@ -1004,7 +1004,7 @@ vr_values::extract_range_from_comparison (value_range_equiv *vr, overflow. */ static bool -check_for_binary_op_overflow (vr_values *store, +check_for_binary_op_overflow (range_query *store, enum tree_code subcode, tree type, tree op0, tree op1, bool *ovf) { @@ -1737,22 +1737,18 @@ compare_range_with_value (enum tree_code comp, const value_range *vr, gcc_unreachable (); } -/* Given a range VR, a LOOP and a variable VAR, determine whether it - would be profitable to adjust VR using scalar evolution information - for VAR. If so, update VR with the new limits. */ + +/* Given a VAR in STMT within LOOP, determine the range of the + variable and store it in VR. If no range can be determined, the + resulting range will be set to VARYING. */ void -vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, - gimple *stmt, tree var) +range_of_var_in_loop (irange *vr, range_query *query, + class loop *loop, gimple *stmt, tree var) { - tree init, step, chrec, tmin, tmax, min, max, type, tem; + tree init, step, chrec, tmin, tmax, min, max, type; enum ev_direction dir; - /* TODO. Don't adjust anti-ranges. An anti-range may provide - better opportunities than a regular range, but I'm not sure. */ - if (vr->kind () == VR_ANTI_RANGE) -return; - chrec = instantiate_parameters (loop, analyze_scalar_evolution (loop, var)); /* Like in PR19590, scev can return a constant function. */ @@ -1763,16 +1759,17 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, } if (TREE_CODE (chrec) != POLYNOMIAL_CHREC) -return; +{ + vr->set_varying (TREE_TYPE (var)); + return; +} init = initial_condition_in_loop_num (chrec, loop->num); - tem = op_with_constant_singleton_value_range (init); - if (tem) -init = tem; + if (TREE_CODE (init) == SSA_NAME) +query->get_value_range (init, stmt)->singleton_p (&init); step = evolution_part_in_loop_num (chrec, loop->num); - tem = op_with_constant_singleton_value_range (step); - if (tem) -step = tem; + if (TREE_CODE (step) == SSA_NAME) +query->get_value_range (step, stmt)->singleton_p (&step); /* If STEP is symbolic, we can't know whether INIT will be the minimum or maximum value in the range. Also, unless INIT is @@ -1781,7 +1778,10 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, if (step == NULL_TREE || !is_gimple_min_invariant (step) || !valid_value_p (init)) -return; +{ + vr->set_varying (TREE_TYPE (var)); + return; +} dir = scev_direction (chrec); if (/* Do not adjust ranges if we do not know whether the iv increases @@ -1790,7 +1790,10 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, /* ... or if it may wrap. */ || scev_probably_wraps_p (NULL_TREE, init, step, stmt, get_chrec_loop (chrec), true)) -return; +{ + vr->set_varying (TREE_TYPE (var)); + return; +} type = TREE_TYPE (var); if (POINTER_TYPE_P (type) || !TYPE_MIN_VALUE (type)) @@ -1807,7 +1810,7 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, if (TREE_CODE (step) == INTEGER_CST && is_gimple_val (init) && (TREE_CODE (init) != SSA_NAME - || get_value_range (init, stmt)->kind () == VR_RANGE)) + || query->get_value_range (init, stmt)->kind () == VR_RANGE)) { widest_int nit; @@ -1830,21 +1833,32 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, && (sgn == UNSIGNED || wi::gts_p (wtmp, 0) == wi::gts_p (wi::to_wide (step), 0))) { - value_range_equiv maxvr; - tem = wide_int_to_tree (TREE_TYPE (init), wtmp); - extract_range_from_binary_expr (&maxvr, PLUS_EXPR, -
Re: [PATCH] introduce attribute exalias
On 8/14/20 11:39 AM, Alexandre Oliva wrote: Ping? In case there isn't immediate approval for the patch proper (I suppose different parts will require review by different subsystem maintainers), I'd appreciate at least community and language lawyers buy-in (or turn-down) for the new feature hereby proposed for C-family languages, namely, attribute exalias("symbol_name") as a means to have symbol_name output as a same-linkage alias for functions, variables, and for C++ class types' RTTI symbols. This seems a useful feature. I don;t think it needs language lawyering -- it's an extension, right? By 'same-linkage', do you mean same linkage as the *symbol* of the thing it is aliasing, or same linkage as the language entity it is aliasing? I suspect you mean the former. I'm sure we can bikeshed the name 'exalias' doesn't seem very mnemonic to me. 'symbol_alias' or something? nathan Thanks in advance, On Aug 7, 2020, Alexandre Oliva wrote: Since last week's patchlet, I've delayed the creation of the exalias decls, improved the merging of attributes, minimizing interface/visibility updates, found a better way to assign exaliases to nested explicit instantiations, even after enabling aliases to already-defined types, so now I'm reasonably happy with the patch. This patch introduces an attribute to add extra aliases to a symbol when its definition is output. The main goal is to ease interfacing C++ with Ada, as C++ mangled names have to be named, and in some cases (e.g. when using stdint.h typedefs in function arguments) the symbol names may vary across platforms. The attribute is usable in C and C++, presumably in all C-family languages. It can be attached to global variables and functions. In C++, it can also be attached to namespace-scoped variables and functions, static data members, member functions, explicit instantiations and specializations of template functions, members and classes. When applied to constructors or destructor, additional exaliases with _Base and _Del suffixes are defined for variants other than complete-object ones. Applying the attribute to class types is only valid in C++, and the effect is to attach the alias to the RTTI object associated with the class type. While working on this, I noticed C++ didn't merge attributes of extern local declarations with those of the namespace-scoped declaration. I've added code to merge the attributes if there is a namespace-scoped declaration, but if there isn't one, there won't be any merging, and the effects are noticeable, as in the added attr-weak-1.C. I'm also slightly concerned that an earlier local decl would go out of sync if a subsequent local decl, say within the same or even in another function, introduces additional attributes in the global decl. Regstrapped on x86_64-linux-gnu. Ok to install? (The newly-introduced attr-weak-1.c passes in C, but is marked as XFAIL for C++, so it gets an XPASS in C; I could move it to some C++-only subtree, or drop it altogether and file a PR instead) for gcc/ChangeLog * attribs.c: Include cgraph.h. (decl_attributes): Allow late introduction of exalias in types. (create_exalias_decl, create_exalias_decls): New. * attribs.h: Declare them. (FOR_EACH_EXALIAS): New macro. * cgraph.c (cgraph_node::create): Create exalias decls. * varpool.c (varpool_node::get_create): Create exalias decls. * cgraph.h (symtab_node::remap_exalias_target): New. * symtab.c (symtab_node::remap_exalias_target): Define. * cgraphunit.c (cgraph_node::analyze): Create alias_target node if needed. (analyze_functions): Fixup visibility of implicit alias only after its node is analyzed. * doc/extend.texi (exalias): Document for variables, functions and types. for gcc/ada/ChangeLog * doc/gnat_rm/interfacing_to_other_languages.rst: Mention attribute exalias to give RTTI symbols mnemonic names. * doc/gnat_ugn/the_gnat_compilation_model.rst: Mention attribute exalias. Fix incorrect ref to C1 ctor variant. for gcc/c-family/ChangeLog * c-ada-spec.c (pp_asm_name): Use first exalias if available. * c-attribs.c (handle_exalias_attribute): New. (c_common_attribute_table): Add exalias. (handle_copy_attribute): Do not copy exalias. * c-decl.c (duplicate_decls): Remap exalias target. for gcc/cp/ChangeLog * class.c (copy_fndecl_with_name): Move/adjust exalias to cdtor variants. (build_cdtor_clones): Drop exalias from primary variant. * cp-tree.h (update_exalias_interface, update_tinfo_exalias): Declare. * decl.c (duplicate_decls): Remap exalias target. (grokfndecl): Tentatively create exalias decls after adding attributes in e.g. a template member function explicit instantiation. * decl2.c (cplus_decl_attrib
[pushed] c++: Copy elision and [[no_unique_address]]. [PR93711]
We don't elide a copy from a function returning a class by value into a base because that can overwrite data laid out in the tail padding of the base class; we need to handle [[no_unique_address]] fields the same way, or we ICE when the middle-end wants to create a temporary object of a TYPE_NEEDS_CONSTRUCTING type. This means that we can't always express initialization of a field with INIT_EXPR from a TARGET_EXPR the way we usually do, so I needed to change several places that were assuming that was sufficient. This also fixes 90254, the same problem with C++17 aggregate initialization of a base. Tested x86_64-pc-linux-gnu, applying to trunk. gcc/cp/ChangeLog: PR c++/90254 PR c++/93711 * cp-tree.h (unsafe_return_slot_p): Declare. * call.c (is_base_field_ref): Rename to unsafe_return_slot_p. (build_over_call): Check unsafe_return_slot_p. (build_special_member_call): Likewise. * init.c (expand_default_init): Likewise. * typeck2.c (split_nonconstant_init_1): Likewise. gcc/testsuite/ChangeLog: PR c++/90254 PR c++/93711 * g++.dg/cpp1z/aggr-base10.C: New test. * g++.dg/cpp2a/no_unique_address7.C: New test. * g++.dg/cpp2a/no_unique_address7a.C: New test. --- gcc/cp/cp-tree.h | 1 + gcc/cp/call.c | 45 --- gcc/cp/init.c | 3 +- gcc/cp/typeck2.c | 12 - gcc/testsuite/g++.dg/cpp1z/aggr-base10.C | 16 +++ .../g++.dg/cpp2a/no_unique_address7.C | 13 ++ .../g++.dg/cpp2a/no_unique_address7a.C| 14 ++ 7 files changed, 85 insertions(+), 19 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp1z/aggr-base10.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/no_unique_address7.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/no_unique_address7a.C diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index fc54e6bb9bd..921bf954647 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -6372,6 +6372,7 @@ extern bool is_std_init_list (tree); extern bool is_list_ctor (tree); extern void validate_conversion_obstack(void); extern void mark_versions_used (tree); +extern bool unsafe_return_slot_p (tree); extern bool cp_warn_deprecated_use (tree, tsubst_flags_t = tf_warning_or_error); extern void cp_warn_deprecated_use_scopes (tree); extern tree get_function_version_dispatcher(tree); diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 47a368d069d..e6c14084127 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -8347,24 +8347,34 @@ call_copy_ctor (tree a, tsubst_flags_t complain) return r; } -/* Return true iff T refers to a base field. */ +/* Return true iff T refers to a base or potentially-overlapping field, which + cannot be used for return by invisible reference. We avoid doing C++17 + mandatory copy elision when this is true. -static bool -is_base_field_ref (tree t) + This returns true even if the type of T has no tail padding that other data + could be allocated into, because that depends on the particular ABI. + unsafe_copy_elision_p, below, does consider whether there is padding. */ + +bool +unsafe_return_slot_p (tree t) { STRIP_NOPS (t); if (TREE_CODE (t) == ADDR_EXPR) t = TREE_OPERAND (t, 0); if (TREE_CODE (t) == COMPONENT_REF) t = TREE_OPERAND (t, 1); - if (TREE_CODE (t) == FIELD_DECL) -return DECL_FIELD_IS_BASE (t); - return false; + if (TREE_CODE (t) != FIELD_DECL) +return false; + if (!CLASS_TYPE_P (TREE_TYPE (t))) +/* The middle-end will do the right thing for scalar types. */ +return false; + return (DECL_FIELD_IS_BASE (t) + || lookup_attribute ("no_unique_address", DECL_ATTRIBUTES (t))); } -/* We can't elide a copy from a function returning by value to a base - subobject, as the callee might clobber tail padding. Return true iff this - could be that case. */ +/* We can't elide a copy from a function returning by value to a + potentially-overlapping subobject, as the callee might clobber tail padding. + Return true iff this could be that case. */ static bool unsafe_copy_elision_p (tree target, tree exp) @@ -8374,10 +8384,11 @@ unsafe_copy_elision_p (tree target, tree exp) return false; tree type = TYPE_MAIN_VARIANT (TREE_TYPE (exp)); /* It's safe to elide the copy for a class with no tail padding. */ - if (tree_int_cst_equal (TYPE_SIZE (type), CLASSTYPE_SIZE (type))) + if (!is_empty_class (type) + && tree_int_cst_equal (TYPE_SIZE (type), CLASSTYPE_SIZE (type))) return false; /* It's safe to elide the copy if we aren't initializing a base object. */ - if (!is_base_field_ref (target)) + if (!unsafe_return_slot_p (target)) return false; tree init = TARGET_EXPR_INITIAL (exp); /* build_compound_expr p
Re: [Patch 3/5] rs6000, Add TI to TD (128-bit DFP) and TD to TI support
On Tue, 2020-08-11 at 12:22 -0700, Carl Love wrote: > Segher, Will: > > Path 3 adds support for converting to/from 128-bit integers and 128-bit > decimal floating point formats. > > Carl Love > Some cosmetic comments below. overall lgtm. Thanks, -Will > > > Add TI to TD (128-bit DFP) and TD to TI support > > gcc/ChangeLog > > 2020-08-10 Carl Love > * config/rs6000/dfp.md (floattitd2, fixtdti2): New define_insns. > > gcc/testsuite/ChangeLog > > 2020-08-10 Carl Love > * gcc.target/powerpc/int_128bit-runnable.c: Add tests. Update test. (This test already exists). > --- > gcc/config/rs6000/dfp.md | 15 + > .../gcc.target/powerpc/int_128bit-runnable.c | 64 +++ nit - Path to testcase looks strange? > 2 files changed, 79 insertions(+) > > diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md > index 8f822732bac..ac9fe189f3e 100644 > --- a/gcc/config/rs6000/dfp.md > +++ b/gcc/config/rs6000/dfp.md > @@ -222,6 +222,13 @@ >"dcffixq %0,%1" >[(set_attr "type" "dfp")]) > > +(define_insn "floattitd2" > + [(set (match_operand:TD 0 "gpc_reg_operand" "=d") > + (float:TD (match_operand:TI 1 "gpc_reg_operand" "v")))] > + "TARGET_TI_VECTOR_OPS" > + "dcffixqq %0,%1" > + [(set_attr "type" "dfp")]) > + > ;; Convert a decimal64/128 to a decimal64/128 whose value is an integer. > ;; This is the first stage of converting it to an integer type. > Compared to some existing define_insn entries, this matches the style, looks reasonable. > @@ -241,6 +248,14 @@ >"TARGET_DFP" >"dctfix %0,%1" >[(set_attr "type" "dfp")]) > + > + ;; carll Fix comment. > +(define_insn "fixtdti2" > + [(set (match_operand:TI 0 "gpc_reg_operand" "=v") > + (fix:TI (match_operand:TD 1 "gpc_reg_operand" "d")))] > + "TARGET_TI_VECTOR_OPS" > + "dctfixqq %0,%1" > + [(set_attr "type" "dfp")]) > looks reasonable. > ;; Decimal builtin support > > diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c > b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c > index c84494fc28d..d1e69cea021 100644 > --- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c > +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c > @@ -38,6 +38,7 @@ > #if DEBUG > #include > #include > +#include > > > void print_i128(__int128_t val) > @@ -59,6 +60,13 @@ int main () >__int128_t arg1, result; >__uint128_t uarg2; > > + _Decimal128 arg1_dfp128, result_dfp128, expected_result_dfp128; > + > + struct conv_t { > +__uint128_t u128; > +_Decimal128 d128; > + } conv, conv2; > + >vector signed long long int vec_arg1_di, vec_arg2_di; >vector unsigned long long int vec_uarg1_di, vec_uarg2_di, vec_uarg3_di; >vector unsigned long long int vec_uresult_di; > @@ -2249,6 +2257,62 @@ int main () > abort(); > #endif >} > + > + /* DFP to __int128 and __int128 to DFP conversions */ > + /* Can't get printing of DFP values to work. Print the DFP value as an > + unsigned int so we can see the bit patterns. */ > +#if 1 I'd recommend dropping the #if 1 and matching #endif. > + conv.u128 = 0x2208ULL; > + conv.u128 = (conv.u128 << 64) | 0x4ULL; //DFP bit pattern for integer 4 > + expected_result_dfp128 = conv.d128; > + > + arg1 = 4; > + > + conv.d128 = (_Decimal128) arg1; > + > + result_dfp128 = (_Decimal128) arg1; > + if (((conv.u128 >>64) != 0x2208ULL) && > + ((conv.u128 & 0x) != 0x4ULL)) { > +#if DEBUG > +printf("ERROR: convert int128 value "); > +print_i128 (arg1); > +conv.d128 = result_dfp128; > +printf("\nto DFP value 0x%llx %llx (printed as hex bit string) ", > +(unsigned long long)((conv.u128) >>64), > +(unsigned long long)((conv.u128) & 0x)); > + > +conv.d128 = expected_result_dfp128; > +printf("\ndoes not match expected_result = 0x%llx %llx\n\n", > +(unsigned long long) (conv.u128>>64), > +(unsigned long long) (conv.u128 & 0x)); > +#else > +abort(); > +#endif > + } > +#endif > + > + expected_result = 4; > > + conv.u128 = 0x2208ULL; > + conv.u128 = (conv.u128 << 64) | 0x4ULL; // 4 as DFP > + arg1_dfp128 = conv.d128; > + > + result = (__int128_t) arg1_dfp128; > + > + if (result != expected_result) { > +#if DEBUG > +printf("ERROR: convert DFP value "); > +printf("0x%llx %llx (printed as hex bit string) ", > +(unsigned long long)(conv.u128>>64), > +(unsigned long long)(conv.u128 & 0x)); > +printf("to __int128 value = "); > +print_i128 (result); > +printf("\ndoes not match expected_result = "); > +print_i128 (expected_result); > +printf("\n"); > +#else > +abort(); > +#endif > + } >return 0; > }
Re: PING: Fwd: [PATCH 2/2] Decouple adjust_range_from_scev from vr_values and value_range_equiv.
On 8/14/20 12:05 PM, Aldy Hernandez wrote: I made some minor changes to the function comments. gcc/ChangeLog: * vr-values.c (check_for_binary_op_overflow): Change type of store to range_query. (vr_values::adjust_range_with_scev): Abstract most of the code... (range_of_var_in_loop): ...here. Remove value_range_equiv uses. (simplify_using_ranges::simplify_using_ranges): Change type of store to range_query. * vr-values.h (class range_query): New. (class simplify_using_ranges): Use range_query. (class vr_values): Add OVERRIDE to get_value_range. (range_of_var_in_loop): New. --- gcc/vr-values.c | 150 ++-- gcc/vr-values.h | 23 ++-- 2 files changed, 88 insertions(+), 85 deletions(-) diff --git a/gcc/vr-values.c b/gcc/vr-values.c index 9002d87c14b..5b7bae3bfb7 100644 --- a/gcc/vr-values.c +++ b/gcc/vr-values.c @@ -1004,7 +1004,7 @@ vr_values::extract_range_from_comparison (value_range_equiv *vr, overflow. */ static bool -check_for_binary_op_overflow (vr_values *store, +check_for_binary_op_overflow (range_query *store, enum tree_code subcode, tree type, tree op0, tree op1, bool *ovf) { @@ -1737,22 +1737,18 @@ compare_range_with_value (enum tree_code comp, const value_range *vr, gcc_unreachable (); } -/* Given a range VR, a LOOP and a variable VAR, determine whether it - would be profitable to adjust VR using scalar evolution information - for VAR. If so, update VR with the new limits. */ + +/* Given a VAR in STMT within LOOP, determine the range of the + variable and store it in VR. If no range can be determined, the + resulting range will be set to VARYING. */ void -vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, - gimple *stmt, tree var) +range_of_var_in_loop (irange *vr, range_query *query, + class loop *loop, gimple *stmt, tree var) { - tree init, step, chrec, tmin, tmax, min, max, type, tem; + tree init, step, chrec, tmin, tmax, min, max, type; enum ev_direction dir; - /* TODO. Don't adjust anti-ranges. An anti-range may provide - better opportunities than a regular range, but I'm not sure. */ - if (vr->kind () == VR_ANTI_RANGE) - return; - IIUC, you've switched to using the new API, so the bounds calls will basically turn and ANTI range into a varying , making [lbound,ubound] will be [MIN, MAX] ? so its effectively a no-op, except we will not punt on getting a range when VR is an anti range anymore.. so that goodness... chrec = instantiate_parameters (loop, analyze_scalar_evolution (loop, var)); /* Like in PR19590, scev can return a constant function. */ @@ -1763,16 +1759,17 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, } if (TREE_CODE (chrec) != POLYNOMIAL_CHREC) - return; + { + vr->set_varying (TREE_TYPE (var)); + return; + } Im seeing a lot of this pattern... Maybe we should set vr to varying upon entry to the function as the default return value.. then we can just return like it did before in all those places. Better yet, since this routine doesn't "update" anymore and simply returns a range, maybe it could instead return a boolean if it finds a range rather than the current behaviour... then those simply become + return false; We won't have to intersect at the caller if we don't need to, and its useful information at other points to know a range was calculated without having to see if varying_p () came back from the call. ie, we'd the usage pattern would then be value_range_equiv r; if (range_of_var_in_loop (&r, this, loop, stmt, var)) vr->intersect (&r); This is the pattern we use throughout the ranger. init = initial_condition_in_loop_num (chrec, loop->num); - tem = op_with_constant_singleton_value_range (init); - if (tem) - init = tem; + if (TREE_CODE (init) == SSA_NAME) + query->get_value_range (init, stmt)->singleton_p (&init); step = evolution_part_in_loop_num (chrec, loop->num); - tem = op_with_constant_singleton_value_range (step); - if (tem) - step = tem; + if (TREE_CODE (step) == SSA_NAME) + query->get_value_range (step, stmt)->singleton_p (&step); If I read this correctly, we get values for init and step... and if they are SSA_NAMES, then we query ranges, otherwise use what we got back.. So that would seem to be the same behaviour as before then.. Perhaps a comment is warranted? I had to read it a few times :-) /* If STEP is symbolic, we can't know whether INIT will be the minimum or maximum value in the range. Also, unless INIT is @@ -1781,7 +1778,10 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, if (step == NULL_TREE || !is_gimple_min_invariant (step) || !valid_value_p (init)) - return; + { + vr->set_varying (TREE_TYPE (var)); + return; +
Re: [Patch 4/5] rs6000, Test 128-bit shifts for just the int128 type.
On Tue, 2020-08-11 at 12:23 -0700, Carl Love wrote: > Segher, Will: > > Patch 4 adds 128-bit integer shift instruction support. I suggest having a few more words here to better describe what this patch is doing. i.e. This is adding the VEC_I128 iterator which contains the V1TI and TI types, and modifying existing define_insns to add handling for the VEC_I128 iterator. > > Carl Love > > - > Test 128-bit shifts for just the int128 type. > > gcc/ChangeLog > > 2020-08-10 Carl Love > * config/rs6000/altivec.md (altivec_vslq, altivec_vsrq): Add mode > VEC_I128. Maybe also rename to altivec_vslq_ and altivec_vsrq_. > * config/rs6000/vector.md (VEC_I128): New mode iterator. ok > (vashlv1ti3): Change to vashl3, mode VEC_I128. > (vlshrv1ti3): Change to vlshr3, mode VEC_I128. > * config/rs6000/vsx.md (UNSPEC_XXSWAPD_V1TI): Change to > UNSPEC_XXSWAPD_VEC_I128. s/Change/Rename/ > (xxswapd_v1ti): Change to xxswapd_, mode VEC_I128. > > gcc/testsuite/ChangeLog > > 2020-08-10 Carl Love > gcc.target/powerpc/int_128bit-runnable.c: Add shift_right, shift_left > tests. > --- > gcc/config/rs6000/altivec.md | 16 +-- > gcc/config/rs6000/vector.md | 27 ++- > gcc/config/rs6000/vsx.md | 14 +- > .../gcc.target/powerpc/int_128bit-runnable.c | 24 +++-- > 4 files changed, 52 insertions(+), 29 deletions(-) > > diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md > index 2763d920828..cba39852070 100644 > --- a/gcc/config/rs6000/altivec.md > +++ b/gcc/config/rs6000/altivec.md > @@ -2219,10 +2219,10 @@ >"vsl %0,%1,%2" >[(set_attr "type" "vecsimple")]) > > -(define_insn "altivec_vslq" > - [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") > - (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") > - (match_operand:V1TI 2 "vsx_register_operand" "v")))] > +(define_insn "altivec_vslq_" > + [(set (match_operand:VEC_I128 0 "vsx_register_operand" "=v") > + (ashift:VEC_I128 (match_operand:VEC_I128 1 "vsx_register_operand" "v") > + (match_operand:VEC_I128 2 "vsx_register_operand" "v")))] >"TARGET_TI_VECTOR_OPS" >/* Shift amount in needs to be in bits[57:63] of 128-bit operand. */ >"vslq %0,%1,%2" ok > @@ -2236,10 +2236,10 @@ >"vsr %0,%1,%2" >[(set_attr "type" "vecsimple")]) > > -(define_insn "altivec_vsrq" > - [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") > - (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") > -(match_operand:V1TI 2 "vsx_register_operand" "v")))] > +(define_insn "altivec_vsrq_" > + [(set (match_operand:VEC_I128 0 "vsx_register_operand" "=v") > + (lshiftrt:VEC_I128 (match_operand:VEC_I128 1 "vsx_register_operand" "v") > +(match_operand:VEC_I128 2 "vsx_register_operand" > "v")))] >"TARGET_TI_VECTOR_OPS" >/* Shift amount in needs to be in bits[57:63] of 128-bit operand. */ >"vsrq %0,%1,%2" ok > diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md > index 2deff282076..682aabc4657 100644 > --- a/gcc/config/rs6000/vector.md > +++ b/gcc/config/rs6000/vector.md > @@ -26,6 +26,9 @@ > ;; Vector int modes > (define_mode_iterator VEC_I [V16QI V8HI V4SI V2DI]) > > +;; 128-bit int modes > +(define_mode_iterator VEC_I128 [V1TI TI]) > + > ;; Vector int modes for parity > (define_mode_iterator VEC_IP [V8HI > V4SI > @@ -1635,17 +1638,17 @@ >"") > > ;; No immediate version of this 128-bit instruction > -(define_expand "vashlv1ti3" > - [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") > - (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") > - (match_operand:V1TI 2 "vsx_register_operand" "v")))] > +(define_expand "vashl3" > + [(set (match_operand:VEC_I128 0 "vsx_register_operand" "=v") > + (ashift:VEC_I128 (match_operand:VEC_I128 1 "vsx_register_operand") > + (match_operand:VEC_I128 2 "vsx_register_operand")))] >"TARGET_TI_VECTOR_OPS" > { >/* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */ > - rtx tmp = gen_reg_rtx (V1TImode); > + rtx tmp = gen_reg_rtx (mode); > >emit_insn(gen_xxswapd_v1ti (tmp, operands[2])); > - emit_insn(gen_altivec_vslq (operands[0], operands[1], tmp)); > + emit_insn(gen_altivec_vslq_ (operands[0], operands[1], tmp)); >DONE; > }) > > @@ -1658,17 +1661,17 @@ >"") > > ;; No immediate version of this 128-bit instruction > -(define_expand "vlshrv1ti3" > - [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") > - (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") > -(match_operand:V1TI 2 "vsx_register_operand" "v")))] > +(define_expand "vlshr3" > + [(se
c++: Yet more name-lookup api simplification
This patch deals with LOOKUP_HIDDEN, which originally meant 'find hidden friends', but it's being pressed into service for not ignoring lambda-relevant internals. However these two functions are different. (a) hidden friends can occur in block scope (very uncommon) and (b) it had the semantics of stopping after the innermost enclosing namepspace. That's really suspect for the lambda case, but not relevant there because we never get to namespace scope (I think). Anyway, I've split the flag into two and adjusted the lambda callers to just search block scope. These two flags are added to the LOOK_want enum class, which allows dropping another parameter from the name lookup routines. The remaining LOOKUP_$FOO flags in cp-tree.h are, I think, now all related to features of overload resolution, conversion operators and reference binding. Nothing to do with /name/ lookup. pushing gcc/cp/ * cp-tree.h (LOOKUP_HIDDEN): Delete. (LOOKUP_PREFER_RVALUE): Adjust initializer. * name-lookup.h (enum class LOOK_want): Add HIDDEN_FRIEND and HIDDEN_LAMBDA flags. (lookup_name_real): Drop flags parm. (lookup_qualified_name): Drop find_hidden parm. * name-lookup.c (class name_lookup): Drop hidden field, adjust ctors. (name_lookup::add_overload): Check want for hiddenness. (name_lookup::process_binding): Likewise. (name_lookup::search_unqualified): Likewise. (identifier_type_value_1): Adjust lookup_name_real call. (set_decl_namespace): Adjust name_lookup ctor. (qualify_lookup): Drop flags parm, use want for hiddenness. (lookup_qualified_name): Drop find_hidden parm. (lookup_name_real_1): Drop flags parm, adjust qualify_lookup calls. (lookup_name_real): Drop flags parm. (lookup_name_nonclass, lookup_name): Adjust lookup_name_real calls. (lookup_type_scope_1): Adjust qualify_lookup calls. * call.c (build_operator_new_call): Adjust lookup_name_real call. (add_operator_candidates): Likewise. * coroutines.cc (morph_fn_to_coro): Adjust lookup_qualified_name call. * parser.c (cp_parser_lookup_name): Adjust lookup_name_real calls. * pt.c (check_explicit_specialization): Adjust lookup_qualified_name call. (deduction_guides_for): Likewise. (tsubst_friend_class): Adjust lookup_name_real call. (lookup_init_capture_pack): Likewise. (tsubst_expr): Likewise, don't look in namespaces. * semantics.c (capture_decltype): Adjust lookup_name_real. Don't look in namespaces. libcc1/ * libcp1plugin.cc (plugin_build_dependent_exp): Adjust lookup_name_real call. -- Nathan Sidwell diff --git c/gcc/cp/call.c w/gcc/cp/call.c index d4935bd0ce1..7b46d70c2cf 100644 --- c/gcc/cp/call.c +++ w/gcc/cp/call.c @@ -4704,7 +4704,7 @@ build_operator_new_call (tree fnname, vec **args, up in the global scope. we disregard block-scope declarations of "operator new". */ - fns = lookup_name_real (fnname, LOOK_where::NAMESPACE, LOOK_want::NORMAL, 0); + fns = lookup_name_real (fnname, LOOK_where::NAMESPACE, LOOK_want::NORMAL); fns = lookup_arg_dependent (fnname, fns, *args); if (align_arg) @@ -5983,7 +5983,7 @@ add_operator_candidates (z_candidate **candidates, if (!memonly) { tree fns = lookup_name_real (fnname, LOOK_where::BLOCK_NAMESPACE, - LOOK_want::NORMAL, 0); + LOOK_want::NORMAL); fns = lookup_arg_dependent (fnname, fns, arglist); add_candidates (fns, NULL_TREE, arglist, NULL_TREE, NULL_TREE, false, NULL_TREE, NULL_TREE, diff --git c/gcc/cp/coroutines.cc w/gcc/cp/coroutines.cc index 22ba81c8025..898b88b7075 100644 --- c/gcc/cp/coroutines.cc +++ w/gcc/cp/coroutines.cc @@ -4311,7 +4311,7 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer) tree std_nt = lookup_qualified_name (std_node, get_identifier ("nothrow"), LOOK_want::NORMAL, - /*complain=*/true, false); + /*complain=*/true); if (!std_nt || std_nt == error_mark_node) error_at (fn_start, "%qE is provided by %qT but % " "cannot be found", grooaf, promise_type); diff --git c/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h index 82834e65a53..72c57320c9a 100644 --- c/gcc/cp/cp-tree.h +++ w/gcc/cp/cp-tree.h @@ -5567,14 +5567,10 @@ enum overload_flags { NO_SPECIAL = 0, DTOR_FLAG, TYPENAME_FLAG }; #define LOOKUP_DESTRUCTOR (1 << 5) /* Do not permit references to bind to temporaries. */ #define LOOKUP_NO_TEMP_BIND (1 << 6) -/* Return friend declarations and un-declared builtin functions. - (Normally, these entities are registered in the symbol table, but - not found by lookup.) */ -#define LOOKUP_HIDDEN (1 << 7) /* We're trying to treat an lvalue as an rvalue. */ /* FIXME remove when we extend the P1825 semantics to all standard modes, the C++20 ap
Re: [Patch 5/5] rs6000, Conversions between 128-bit integer and floating point values.
On Tue, 2020-08-11 at 12:23 -0700, Carl Love wrote: > Segher, Will: > > Patch 5 adds the 128-bit integer to/from 128-floating point > conversions. This patch has to invoke the routines to use the 128-bit > hardware instructions if on Power 10 or use software routines if > running on a pre Power 10 system via the resolve function. > > Carl > > --- > Conversions between 128-bit integer and floating point values. > > gcc/ChangeLog > > 2020-08-10 Carl Love > config/rs6000/rs6000.md (floatunsti2, > fix_truncti2, fixuns_truncti2): Add > define_insn for mode IEEE 128. s/Add/Update/ missing floatti2 > libgcc/config/rs6000/fixkfi-sw.c: New file. > libgcc/config/rs6000/fixkfi.c: Remove file. > libgcc/config/rs6000/fixunskfi-sw.c: New file. > libgcc/config/rs6000/fixunskfi.c: Remove file. ... rename to ... ? > libgcc/config/rs6000/float128-hw.c (__floattikf_hw, > __floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw): > New functions. > libgcc/config/rs6000/float128-ifunc.c (SW_OR_HW_ISA3_1): > New macro. > (__floattikf_resolve, __floatuntikf_resolve, __fixkfti_resolve, > __fixunskfti_resolve): Add resolve functions. > (__floattikf, __floatuntikf, __fixkfti, __fixunskfti): New > functions. > libgcc/config/rs6000/float128-sed (floattitf, __floatuntitf, > __fixtfti, __fixunstfti): Add editor commands to change > names. > libgcc/config/rs6000/float128-sed-hw (__floattitf, > __floatuntitf, __fixtfti, __fixunstfti): Add editor commands > to change names. > libgcc/config/rs6000/floattikf-sw.c: New file. > libgcc/config/rs6000/floattikf.c: Remove file. > libgcc/config/rs6000/floatuntikf-sw.c: New file. > libgcc/config/rs6000/floatuntikf.c: Remove file. > libgcc/config/rs6000/floatuntikf-sw.c: New file. floatuntikf-sw was so good, it was added twice. ... rename to ... ? > libgcc/config/rs6000/quaad-float128.h (__floattikf_sw, One 'a' in quad. > __floatuntikf_sw, __fixkfti_sw, __fixunskfti_sw, __floattikf_hw, > __floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw, __floattikf, > __floatuntikf, __fixkfti, __fixunskfti):New extern declarations. Tab in there that should not be. > libgcc/config/rs6000/t-float128 (floattikf, floatuntikf, > fixkfti, fixunskfti): Remove file names from fp128_ppc_funcs. > (floattikf-sw, floatuntikf-sw, fixkfti-sw, fixunskfti-sw): Add > file names to fp128_ppc_funcs. Perhaps libgcc/config/rs6000/t-float128 (floattikf, floatuntikf, fixkfti, fixunskfti): Rename to (floattikf-sw, floatuntikf-sw, fixkfti-sw, fixunskfti-sw) > > gcc/testsuite/ChangeLog > > 2020-08-10 Carl Love > gcc.target/powerpc/fl128_conversions.c: New file. New test. or just New. > --- > gcc/config/rs6000/rs6000.md | 36 +++ > .../gcc.target/powerpc/fp128_conversions.c| 287 ++ > .../config/rs6000/{fixkfti.c => fixkfti-sw.c} | 4 +- > .../rs6000/{fixunskfti.c => fixunskfti-sw.c} | 4 +- > libgcc/config/rs6000/float128-hw.c| 24 ++ > libgcc/config/rs6000/float128-ifunc.c | 44 ++- > libgcc/config/rs6000/float128-sed | 4 + > libgcc/config/rs6000/float128-sed-hw | 4 + > .../rs6000/{floattikf.c => floattikf-sw.c}| 4 +- > .../{floatuntikf.c => floatuntikf-sw.c} | 4 +- > libgcc/config/rs6000/quad-float128.h | 17 +- > libgcc/config/rs6000/t-float128 | 3 +- > 12 files changed, 415 insertions(+), 20 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/powerpc/fp128_conversions.c > rename libgcc/config/rs6000/{fixkfti.c => fixkfti-sw.c} (96%) > rename libgcc/config/rs6000/{fixunskfti.c => fixunskfti-sw.c} (96%) > rename libgcc/config/rs6000/{floattikf.c => floattikf-sw.c} (96%) > rename libgcc/config/rs6000/{floatuntikf.c => floatuntikf-sw.c} (96%) > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index 43b620ae1c0..3853ebd4195 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -6390,6 +6390,42 @@ > xscvsxddp %x0,%x1" >[(set_attr "type" "fp")]) > > +(define_insn "floatti2" > + [(set (match_operand:IEEE128 0 "vsx_register_operand" "=v") > + (float:IEEE128 (match_operand:TI 1 "vsx_register_operand" "v")))] > + "TARGET_POWER10" > +{ > + return "xscvsqqp %0,%1"; > +} > + [(set_attr "type" "fp")]) > + > +(define_insn "floatunsti2" > + [(set (match_operand:IEEE128 0 "vsx_register_operand" "=v") > + (unsigned_float:IEEE128 (match_operand:TI 1 "vsx_register_operand" > "v")))] > + "TARGET_POWER10" > +{ > + return "xscvuqqp %0,%1"; > +} > + [(set_attr "type" "fp")]) > + > +(define_insn "fix_truncti2" > + [(set (match_operand:TI 0 "vsx_register_operand" "=v
Re: [PATCH] introduce attribute exalias
On Aug 14, 2020, Nathan Sidwell wrote: > This seems a useful feature. I don;t think it needs language > lawyering -- it's an extension, right? Well, yeah, but I think it's usually good for even extensions to be sound language-wise. > By 'same-linkage', do you mean same linkage as the *symbol* of the > thing it is aliasing, or same linkage as the language entity it is > aliasing? > I suspect you mean the former. Yeah, ultimately the symbol declared as exalias gets the same object-level linkage and visibility properties as those of the primary symbol emitted for the language entity. Conceptually, the entity introduced by the attribute is not even visible or accessible in the standard language; it can only be referenced by alias attributes and by Ada import declarations, but conceptually, in as much as you conceive of it as a separate entity, I suppose it makes some sense to say it gets the same linkage as the entity it refers to. > I'm sure we can bikeshed the name 'exalias' doesn't seem very mnemonic > to me. 'symbol_alias' or something? I don't like symbol_alias; that this feature names a symbol is not a distinguishing feature from the preexisting alias attribute. 'ex' can be read as both extra, exported, external, and all of these sort of make sense, at least for entities that have linkage. Even for exclusively internal uses, say to introduce a mnemonic symbol for another alias-attributed declaration to refer to, the "ex" prefix, that means the opposite of "in", fitting in well with the functionality of "ex"posing the symbol through a name that other alias declarations can take *in*, *im*port. Another possible spelling for this proposed attribute that might be more mnemonic is "aka"; unfortunately, that's pretty much a synonym to alias, so it might be mistaken as such, rather than as a complementary feature, akin to the other end of a power extension cable: whereas alias does the job of a plug, *ex*alias provides a socket/*out*let. -- Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer
Re: [committed] analyzer: rewrite of region and value-handling
On Fri, 2020-08-14 at 10:49 +0200, Christophe Lyon wrote: > Hi David, > > > On Thu, 13 Aug 2020 at 22:58, David Malcolm via Gcc-patches > wrote: > > This large patch reimplements how the analyzer tracks regions and > > values. [...] > > Pushed to master as 808f4dfeb3a95f50f15e71148e5c1067f90a126d. > > > > Some of the new tests fail on arm and aarch64. Sorry about the failures. The failures were additionally reported as failing on powerpc64 to bugzilla as PR testsuite/96609 and PR analyzer/96616; those reports both mentioned the init.c and the pr93032-mztools.c failures. The init.c failures and the casts-1.c failures on arm seems to be due to the analyzer not knowing about constant pools. I'm testing a fix for that now. I'll look at the pr93032-mztools.c next. > > HTH > > Christophe Thanks for the report; sorry for the breakage Dave
[committed] analyzer: document how to get gimple dump from an ICE
gcc/ChangeLog: * doc/analyzer.texi (Overview): Add tip about how to get a gimple dump if the analyzer ICEs. --- gcc/doc/analyzer.texi | 5 + 1 file changed, 5 insertions(+) diff --git a/gcc/doc/analyzer.texi b/gcc/doc/analyzer.texi index b5d6d0f393e..92c12e19401 100644 --- a/gcc/doc/analyzer.texi +++ b/gcc/doc/analyzer.texi @@ -29,6 +29,11 @@ The implementation is read-only: it doesn't attempt to change anything, just emit warnings. The gimple representation can be seen using @option{-fdump-ipa-analyzer}. +@quotation Tip +If the analyzer ICEs before this is written out, one workaround is to use +@option{--param=analyzer-bb-explosion-factor=0} to force the analyzer +to bail out after analyzing the first basic block. +@end quotation First, we build a @code{supergraph} which combines the callgraph and all of the CFGs into a single directed graph, with both interprocedural and -- 2.26.2
[committed] analyzer: fix ICE on escaped unknown pointers [PR96611]
PR analyzer/96611 reports an ICE within the handling for unknown functions, when passing a pointer to something accessed via a global pointer, after an unknown function has already been called. The first unknown function leads to the store being flagged, so the access to the global pointer leads to (*unknown_svalue) for the base region of the argument to the 2nd function, and thus *unknown_svalue being reachable by the 2nd unknown function, triggering an assertion failure. Handle this case by rejecting attempts to get a cluster for the unknown pointer, fixing the ICE. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to master as r11-2707-gee88b536069db8f870c444c441182a9c76ec5bba. gcc/analyzer/ChangeLog: PR analyzer/96611 * store.cc (store::mark_as_escaped): Reject attempts to get a cluster for an unknown pointer. gcc/testsuite/ChangeLog: PR analyzer/96611 * gcc.dg/analyzer/pr96611.c: New test. --- gcc/analyzer/store.cc | 3 +++ gcc/testsuite/gcc.dg/analyzer/pr96611.c | 14 ++ 2 files changed, 17 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr96611.c diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc index 5fef27c8bd2..950a7784542 100644 --- a/gcc/analyzer/store.cc +++ b/gcc/analyzer/store.cc @@ -1691,6 +1691,9 @@ store::mark_as_escaped (const region *base_reg) gcc_assert (base_reg); gcc_assert (base_reg->get_base_region () == base_reg); + if (base_reg->symbolic_for_unknown_ptr_p ()) +return; + binding_cluster *cluster = get_or_create_cluster (base_reg); cluster->mark_as_escaped (); } diff --git a/gcc/testsuite/gcc.dg/analyzer/pr96611.c b/gcc/testsuite/gcc.dg/analyzer/pr96611.c new file mode 100644 index 000..4f7502361cb --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/pr96611.c @@ -0,0 +1,14 @@ +struct s { int a; } *ptr; +void unknown_int_ptr (int *); +void unknown_void (void); + +void test_1 () +{ + unknown_int_ptr (&ptr->a); +} + +void test_2 () +{ + unknown_void (); + unknown_int_ptr (&ptr->a); +} -- 2.26.2
Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Hi Carl, On Thu, Aug 13, 2020 at 09:12:48AM -0700, Carl Love wrote: > The macro expansion for the bfloat convert intrinsics XVCVBF16SP and > XVCVSPBF16 need to be restricted to P10. > The following patch creates new macro expansions BU_P10V_VSX_# and > BU_P10V_AV_# for the VSX and Altivec instructions respectively. The > new names are consistent with the P8 and P9 naming convention for the > VSX and Altivec instructions. So _vsx if it is for all VSRs, but _altivec for just the VRs? > The macro expansion for XVCVBF16SP and XVCVSPBF16 is changed from > BU_VSX_1 to BU_P10V_VSX_1 to restrict it to P10 and beyond. Also MISC > is changed to CONST in the macro expansion call. The spelling of the xvcvbf16sp name will probably change, fwiw. So you might want to wait before committing this (but changing it later is fine as well). > The side effect of creating the macro expansions for VSX and Altivec is > it changes all of the expanded names. The patch fixes all the uses of > the expanded names as needed for the new VSX and Altivec macros. Joy :-) Do the names agree with the (future) documentation now? > -#define BU_P10V_1(ENUM, NAME, ATTR, ICODE) \ > - RS6000_BUILTIN_1 (P10_BUILTIN_ ## ENUM,/* ENUM */ \ > - "__builtin_altivec_" NAME, /* NAME */ \ > +#define BU_P10V_VSX_1(ENUM, NAME, ATTR, ICODE) > \ Hrm, this now says "V" (for vector) as well as "VSX" (for all 64 vector regs allowed). One of those is superfluous. > + RS6000_BUILTIN_1 (P10V_BUILTIN_ ## ENUM, /* ENUM */ \ So this enum name doesn't say it allows all 64 registers? > + "__builtin_vsx_" NAME, /* NAME */ \ > /* Builtins for vector instructions added in ISA 3.1 (power10). */ > -BU_P10V_2 (VCLRLB, "vclrlb", CONST, vclrlb) > +BU_P10V_AV_2 (VCLRLB, "vclrlb", CONST, vclrlb) Maybe you should just keep "V" for insns using only the VRs (which you call V_AV now), and do "VS" for those working on all VSRs (which you call V_VSX here)? Segher
Re: [PATCH] options: Make --help= to emit values post-overrided
Hi! On Fri, Aug 14, 2020 at 01:42:24PM +0800, Kewen.Lin wrote: > > I think personally I'd prefer an option (3): call > > target_option_override_hook directly in decode_options, > > if help_option_arguments is nonempty. Like you say, > > decode_options appears to be the only caller of print_help. > > Good idea! The related patch is attached, different from opts_alt{1,2} > it could still call target_option_override_hook even if we won't call > print_specific_help eventually for some special cases like lang_mask is > CL_DRIVER or include_flags is empty. But I think it's fine. > --- a/gcc/opts-global.c > +++ b/gcc/opts-global.c > @@ -327,8 +327,14 @@ decode_options (struct gcc_options *opts, struct > gcc_options *opts_set, >unsigned i; >const char *arg; > > - FOR_EACH_VEC_ELT (help_option_arguments, i, arg) > -print_help (opts, lang_mask, arg); > + if (!help_option_arguments.is_empty ()) > +{ > + /* Consider post-overrided values for --help=*. */ I'd say /* Make sure --help=* see the overridden values. */ > + target_option_override_hook (); > + > + FOR_EACH_VEC_ELT (help_option_arguments, i, arg) > + print_help (opts, lang_mask, arg); > +} > } The patch looks just fine to me. But, not my call :-) Segher
Re: [PATCH] introduce attribute exalias
On 8/14/20 3:24 PM, Alexandre Oliva wrote: On Aug 14, 2020, Nathan Sidwell wrote: This seems a useful feature. I don;t think it needs language lawyering -- it's an extension, right? Well, yeah, but I think it's usually good for even extensions to be sound language-wise. By 'same-linkage', do you mean same linkage as the *symbol* of the thing it is aliasing, or same linkage as the language entity it is aliasing? I suspect you mean the former. Yeah, ultimately the symbol declared as exalias gets the same object-level linkage and visibility properties as those of the primary symbol emitted for the language entity. Conceptually, the entity introduced by the attribute is not even visible or accessible in the standard language; it can only be referenced by alias attributes and by Ada import declarations, but conceptually, in as much as you conceive of it as a separate entity, I suppose it makes some sense to say it gets the same linkage as the entity it refers to. thanks for the discussion. I should have said, 'exalias' sounds either like a used-to-be alias, it is an ex alias, it has ceased to be, gone to join the choir invisible. or it sounds like exa-lias, making me wonder what a 'lia' is, and why I want 10^18 of them I'm sure we can bikeshed the name 'exalias' doesn't seem very mnemonic to me. 'symbol_alias' or something? I don't like symbol_alias; that this feature names a symbol is not a distinguishing feature from the preexisting alias attribute. right, I realize this is different to the existing alias. It's always struck me that the existing semantics are not c++ friendly. Perhaps alias is not the right name at all. You're emitting an alternative symbol, for use in interfacing to a foreign language/system. Perhaps 'xenoname'? 'ex' can be read as both extra, exported, external, and all of these sort of make sense, at least for entities that have linkage. Even for exclusively internal uses, say to introduce a mnemonic symbol for another alias-attributed declaration to refer to, the "ex" prefix, that means the opposite of "in", fitting in well with the functionality of "ex"posing the symbol through a name that other alias declarations can take *in*, *im*port. Another possible spelling for this proposed attribute that might be more mnemonic is "aka"; unfortunately, that's pretty much a synonym to alias, so it might be mistaken as such, rather than as a complementary feature, akin to the other end of a power extension cable: whereas alias does the job of a plug, *ex*alias provides a socket/*out*let. nathan -- Nathan Sidwell
RE: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On Fri, 2020-08-14 at 16:33 -0500, Segher Boessenkool wrote: > Hi Carl, > > On Thu, Aug 13, 2020 at 09:12:48AM -0700, Carl Love wrote: > > The macro expansion for the bfloat convert intrinsics XVCVBF16SP > > and > > XVCVSPBF16 need to be restricted to P10. > > The following patch creates new macro expansions BU_P10V_VSX_# and > > BU_P10V_AV_# for the VSX and Altivec instructions > > respectively. The > > new names are consistent with the P8 and P9 naming convention for > > the > > VSX and Altivec instructions. > > So _vsx if it is for all VSRs, but _altivec for just the VRs? Yes, I worked off the rule that Altivec registers are designated with 5-bits and VSR registers are designated with 6-bits. > > > The macro expansion for XVCVBF16SP and XVCVSPBF16 is changed from > > BU_VSX_1 to BU_P10V_VSX_1 to restrict it to P10 and beyond. Also > > MISC > > is changed to CONST in the macro expansion call. > > The spelling of the xvcvbf16sp name will probably change, fwiw. So > you > might want to wait before committing this (but changing it later is > fine > as well). OK, not sure of the urgency to get this patch in. Bill noted the error recently in the expansion. Will check with him to see if we should wait and go with the new name or go with the old name for now. > > Do the names agree with the (future) documentation now? Did not double check on the documentation. > > > -#define BU_P10V_1(ENUM, NAME, ATTR, ICODE) > > \ > > - RS6000_BUILTIN_1 (P10_BUILTIN_ ## ENUM, /* ENUM */ \ > > - "__builtin_altivec_" NAME, /* NAME */ > > \ > > +#define BU_P10V_VSX_1(ENUM, NAME, ATTR, ICODE) > > \ > > Hrm, this now says "V" (for vector) as well as "VSX" (for all 64 > vector > regs allowed). One of those is superfluous. > > > + RS6000_BUILTIN_1 (P10V_BUILTIN_ ## ENUM, /* ENUM */ \ > > So this enum name doesn't say it allows all 64 registers? > > > + "__builtin_vsx_" NAME, /* NAME */ \ > > > > /* Builtins for vector instructions added in ISA 3.1 > > (power10). */ > > -BU_P10V_2 (VCLRLB, "vclrlb", CONST, vclrlb) > > +BU_P10V_AV_2 (VCLRLB, "vclrlb", CONST, vclrlb) > > Maybe you should just keep "V" for insns using only the VRs (which > you > call V_AV now), and do "VS" for those working on all VSRs (which you > call > V_VSX here)? The names BU_P10V_AV_# and BU_P10V_VSX_# were used for consistency with the Power 8 and Power 9 naming, per my discussion with Peter. The following already exist in rs6000-builtin.def for similar uses. #define BU_P8V_AV_1(ENUM, NAME, ATTR, ICODE) #define BU_P8V_AV_2(ENUM, NAME, ATTR, ICODE) #define BU_P8V_AV_3(ENUM, NAME, ATTR, ICODE) #define BU_P9V_AV_1(ENUM, NAME, ATTR, ICODE) #define BU_P9V_AV_2(ENUM, NAME, ATTR, ICODE) #define BU_P9V_AV_3(ENUM, NAME, ATTR, ICODE) #define BU_P8V_VSX_1(ENUM, NAME, ATTR, ICODE) #define BU_P8V_VSX_2(ENUM, NAME, ATTR, ICODE) #define BU_P9V_VSX_1(ENUM, NAME, ATTR, ICODE) #define BU_P9V_VSX_2(ENUM, NAME, ATTR, ICODE) #define BU_P9V_VSX_3(ENUM, NAME, ATTR, ICODE) I agree the V seems redundant given the VSX, AV in the name. The overload macros: #define BU_P8V_OVERLOAD_1(ENUM, NAME) #define BU_P8V_OVERLOAD_2(ENUM, NAME) #define BU_P8V_OVERLOAD_3(ENUM, NAME) #define BU_P9_OVERLOAD_2(ENUM, NAME) #define BU_P9V_OVERLOAD_1(ENUM, NAME) #define BU_P9V_OVERLOAD_2(ENUM, NAME) #define BU_P9V_OVERLOAD_3(ENUM, NAME) Looks like BU_P9_OVERLOAD_2 and BU_P9V_OVERLOAD_2 are identical definitions. That should probably be fixed. Seems like you need V in the processor name for the OVERLOAD definitions to make it clear they are vector definitions. I don't believe the OVERLOAD cares if the builtin is for vsx or altivec registers. It kinda makes sense to leave the V in BU_P8V_VSX_#, BU_P9V_VSX_#, BU_P8V_AV_#, BU_P9V_AV_# definitions for consistency with the OVERLOAD macro name? Thoughts? I will change to whatever everyone agrees on. Carl
[committed] analyzer: fix initialization from constant pool [PR96609, PR96616]
PR testsuite/96609 and PR analyzer/96616 report various testsuite failures seen on powerpc64, aarch64, and arm in new tests added by r11-2694-g808f4dfeb3a95f50f15e71148e5c1067f90a126d. Some of these failures (in gcc.dg/analyzer/init.c, and on arm in gcc.dg/analyzer/casts-1.c) relate to initializations from var_decls in the constant pool. I wrote the tests assuming that the gimplified stmts would initialize the locals via a gassign of code CONSTRUCTOR, whereas on these targets some of the initializations are gassign from a VAR_DECL e.g.: c = *.LC0; where "*.LC0" is a var_decl with DECL_IN_CONSTANT_POOL set. For example, in test_7: struct coord c[2] = {{3, 4}, {5, 6}}; __analyzer_eval (c[0].x == 3); /* { dg-warning "TRUE" } */ after the initialization, the store was simply recording: cluster for: c: INIT_VAL(*.LC0) when I was expecting the cluster for c to have: cluster for: c key: {kind: direct, start: 0, size: 32, next: 32} value: 'int' {(int)3} key: {kind: direct, start: 32, size: 32, next: 64} value: 'int' {(int)4} key: {kind: direct, start: 64, size: 32, next: 96} value: 'int' {(int)5} key: {kind: direct, start: 96, size: 32, next: 128} value: 'int' {(int)6} The test for c[0].x == 3 would then generate: cluster for: _2: (SUB(SUB(INIT_VAL(*.LC0), c[(int)0]), c[(int)0].x)==(int)3) which is UNKNOWN, leading to the test failing. This patch fixes the init.c and casts-1.c failures by special-casing reads from a var_decl with DECL_IN_CONSTANT_POOL set, so that they build a compound_svalue containing the bindings implied by the CONSTRUCTOR node for DECL_INITIAL. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Manually verified the fixes to init.c and casts-1.c on aarch64-unknown-linux-gnu, arm-unknown-eabi, and powerpc64-linux-gnu (-m32 and -m64). Pushed to master as r11-2708-g2867118ddda9b56d991c16022f7d3d634ed08313. This doesn't address the bogus -Wanalyzer-too-complex messages for pr93032-mztools.c reported in the bugs, which seem to be a separate issue that I'm now investigating. gcc/analyzer/ChangeLog: PR testsuite/96609 PR analyzer/96616 * region-model.cc (region_model::get_store_value): Call maybe_get_constant_value on decl_regions first. * region-model.h (decl_region::maybe_get_constant_value): New decl. * region.cc (decl_region::get_stack_depth): Likewise. (decl_region::maybe_get_constant_value): New. * store.cc (get_subregion_within_ctor): New. (binding_map::apply_ctor_to_region): New. * store.h (binding_map::apply_ctor_to_region): New decl. --- gcc/analyzer/region-model.cc | 5 +++ gcc/analyzer/region-model.h | 2 ++ gcc/analyzer/region.cc | 27 + gcc/analyzer/store.cc| 59 gcc/analyzer/store.h | 3 ++ 5 files changed, 96 insertions(+) diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc index 649e20438e4..3c7ea40e8d8 100644 --- a/gcc/analyzer/region-model.cc +++ b/gcc/analyzer/region-model.cc @@ -1192,6 +1192,11 @@ region_model::get_rvalue (tree expr, region_model_context *ctxt) const svalue * region_model::get_store_value (const region *reg) const { + /* Special-case: handle var_decls in the constant pool. */ + if (const decl_region *decl_reg = reg->dyn_cast_decl_region ()) +if (const svalue *sval = decl_reg->maybe_get_constant_value (m_mgr)) + return sval; + const svalue *sval = m_store.get_any_binding (m_mgr->get_store_manager (), reg); if (sval) diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h index 33aa3461611..3d044bf8d6c 100644 --- a/gcc/analyzer/region-model.h +++ b/gcc/analyzer/region-model.h @@ -1869,6 +1869,8 @@ public: tree get_decl () const { return m_decl; } int get_stack_depth () const; + const svalue *maybe_get_constant_value (region_model_manager *mgr) const; + private: tree m_decl; }; diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc index f3f577c43de..afe416b001b 100644 --- a/gcc/analyzer/region.cc +++ b/gcc/analyzer/region.cc @@ -874,6 +874,33 @@ decl_region::get_stack_depth () const return 0; } +/* If the underlying decl is in the global constant pool, + return an svalue representing the constant value. + Otherwise return NULL. */ + +const svalue * +decl_region::maybe_get_constant_value (region_model_manager *mgr) const +{ + if (TREE_CODE (m_decl) == VAR_DECL + && DECL_IN_CONSTANT_POOL (m_decl) + && DECL_INITIAL (m_decl) + && TREE_CODE (DECL_INITIAL (m_decl)) == CONSTRUCTOR) +{ + tree ctor = DECL_INITIAL (m_decl); + gcc_assert (!TREE_CLOBBER_P (ctor)); + + /* Create a binding map, applying ctor to it, using this +decl_region as the base region when building child regions +for offset calculations. */ + binding_map map; + map.apply_ctor_to_region (this, ctor, mgr); + + /* Retur
[PATCH] rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]
Hi All, I sent an initial RFC of this patch[1] some time ago. I believe it is now in a complete state, with added the test cases for the builtins. The builtin optimizations presented here were originally in glibc, but were removed and suggested that they were a good fit as gcc builtins[2]. So, this patch adds new rs6000 expand optimizations for fegetround and for some calls to feclearexcept and feraiseexcept. All of them C99 functions from fenv.h I replicated the optimizations semantics almost as-is as the glibc version, with the notable exception that for feclearexcept and feraiseexcept, the glibc builtin was not filtering only the valid flags, so it had and undefined behavior for values out of range. For the gcc builtin I thought was best to explicitly filter the valid ones, as the builtin does not return any error(same as the original) and there is only 4 valid flags. To check the FE_* flags used in feclearexcept and feraiseexcept expands I decided copy verbatim the definitions from glibc instead of using the macros, which would means including fenv.h somewhere to get them. Still on feclearexcept and feraiseexcept I, I am not sure I used exact_log2_cint_operand correctly because on my tests it kept accepting feclearexcept(0) and it should not. In any case, because I decided to test for all valid flags, this is not a problem for correct generation, but I thought I should mention it. tested on top of master (808f4dfeb3a95f50f15e71148e5c1067f90a126d) on the following plataforms with no regression: powerpc64le-linux-gnu (Power 9) powerpc64le-linux-gnu (Power 8) [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548998.html [2] https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00047.html https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00080.html 8< This optimizations were originally in glibc, but was removed and sugested that they were a good fit as gcc builtins[1]. The associated bugreport: PR target/94193 [1] https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00047.html https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00080.html 2020-08-13 Raoni Fassina Firmino gcc/ChangeLog: * builtins.c (expand_builtin_fegetround): New function. (expand_builtin_feclear_feraise_except): New function. (expand_builtin): Add cases for BUILT_IN_FEGETROUND, BUILT_IN_FECLEAREXCEPT and BUILT_IN_FERAISEEXCEPT * config/rs6000/rs6000.md (fegetroundsi): New pattern. (feclearexceptsi): New Pattern. (feraiseexceptsi): New Pattern. * optabs.def (fegetround_optab): New optab. (feclearexcept_optab): New optab. (feraiseexcept_optab): New optab. gcc/testsuite/ChangeLog: * gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-1.c: New test. * gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-2.c: New test. * gcc.target/powerpc/builtin-fegetround.c: New test. Signed-off-by: Raoni Fassina Firmino --- gcc/builtins.c| 75 ++ gcc/config/rs6000/rs6000.md | 82 +++ gcc/optabs.def| 4 + .../builtin-feclearexcept-feraiseexcept-1.c | 64 + .../builtin-feclearexcept-feraiseexcept-2.c | 130 ++ .../gcc.target/powerpc/builtin-fegetround.c | 30 6 files changed, 385 insertions(+) create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-1.c create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-2.c create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-fegetround.c diff --git a/gcc/builtins.c b/gcc/builtins.c index beb56e06d8a..de6f34e0225 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -115,6 +115,8 @@ static rtx expand_builtin_mathfn_3 (tree, rtx, rtx); static rtx expand_builtin_mathfn_ternary (tree, rtx, rtx); static rtx expand_builtin_interclass_mathfn (tree, rtx); static rtx expand_builtin_sincos (tree); +static rtx expand_builtin_fegetround (tree, rtx, machine_mode); +static rtx expand_builtin_feclear_feraise_except (tree, rtx, machine_mode, optab); static rtx expand_builtin_cexpi (tree, rtx); static rtx expand_builtin_int_roundingfn (tree, rtx); static rtx expand_builtin_int_roundingfn_2 (tree, rtx); @@ -2577,6 +2579,59 @@ expand_builtin_sincos (tree exp) return const0_rtx; } +/* Expand call EXP to the fegetround builtin (from C99 venv.h), returning the + result and setting it in TARGET. Otherwise return NULL_RTX on failure. */ +static rtx +expand_builtin_fegetround (tree exp, rtx target, machine_mode target_mode) +{ + if (!validate_arglist (exp, VOID_TYPE)) +return NULL_RTX; + + insn_code icode = direct_optab_handler (fegetround_optab, SImode); + if (icode == CODE_FOR_nothing) +return NULL_RTX; + + if (target == 0 + || GET_MODE (target) != target_mode + || ! (*insn_data[icode].operand[0].predicate) (target, target_
[PATCH] rs6000: unaligned VSX in memcpy/memmove expansion
This patch adds a few new instructions to inline expansion of memcpy/memmove. Generation of all these is controlled by the option -mblock-ops-unaligned-vsx which is set on by default if the target has TARGET_EFFICIENT_UNALIGNED_VSX. * unaligned vsx load/store (V2DImode) * unaligned vsx pair load/store (POImode) which is also controlled by -mblock-ops-vector-pair in case it is not wanted at some point. The default for this option is also for it to be on if the target has TARGET_EFFICIENT_UNALIGNED_VSX. * unaligned vsx lxvl/stxvl but generally only to do the remainder of a copy/move we stated with some vsx loads/stores, and also prefer to use lb/lh/lw/ld if the remainder is 1/2/4/8 bytes. Testing of this is actually accomplished by gcc.dg/memcmp-1.c which does two memcpy() for each memcmp(). If the memcpy() calls don't do the right thing then the memcmp() will fail unexpectedly. Regstrap passed on ppc64le power9 and the memcmp-1.c test passes on power10 simulator, ok for trunk? Thanks! Aaron gcc/ChangeLog: * config/rs6000/rs6000-string.c (gen_lxvl_stxvl_move): Helper function. (expand_block_move): Add lxvl/stxvl, vector pair, and unaligned VSX. * config/rs6000/rs6000.c (rs6000_option_override_internal): Default value for -mblock-ops-vector-pair. * config/rs6000/rs6000.opt: Add -mblock-ops-vector-pair. --- gcc/config/rs6000/rs6000-string.c | 105 ++ gcc/config/rs6000/rs6000.c| 14 +++- gcc/config/rs6000/rs6000.opt | 4 ++ 3 files changed, 107 insertions(+), 16 deletions(-) diff --git a/gcc/config/rs6000/rs6000-string.c b/gcc/config/rs6000/rs6000-string.c index c35d93180ca..ce6db2ba14d 100644 --- a/gcc/config/rs6000/rs6000-string.c +++ b/gcc/config/rs6000/rs6000-string.c @@ -2708,6 +2708,36 @@ gen_lvx_v4si_move (rtx dest, rtx src) return gen_altivec_lvx_v4si_internal (dest, src); } +static rtx +gen_lxvl_stxvl_move (rtx dest, rtx src, int length) +{ + gcc_assert (MEM_P (dest) ^ MEM_P (src)); + gcc_assert (GET_MODE (dest) == V16QImode && GET_MODE (src) == V16QImode); + gcc_assert (length <= 16); + + bool is_store = MEM_P (dest); + + /* If the address form is not a simple register, make it so. */ + if (is_store) +{ + dest = XEXP (dest, 0); + if (!REG_P (dest)) + dest = force_reg (Pmode, dest); +} + else +{ + src = XEXP (src, 0); + if (!REG_P (src)) + src = force_reg (Pmode, src); +} + + rtx len = force_reg (DImode, gen_int_mode (length, DImode)); + if (is_store) +return gen_stxvl (src, dest, len); + else +return gen_lxvl (dest, src, len); +} + /* Expand a block move operation, and return 1 if successful. Return 0 if we should let the compiler generate normal code. @@ -2750,18 +2780,57 @@ expand_block_move (rtx operands[], bool might_overlap) if (bytes > rs6000_block_move_inline_limit) return 0; + int orig_bytes = bytes; for (offset = 0; bytes > 0; offset += move_bytes, bytes -= move_bytes) { union { - rtx (*movmemsi) (rtx, rtx, rtx, rtx); rtx (*mov) (rtx, rtx); + rtx (*movlen) (rtx, rtx, int); } gen_func; machine_mode mode = BLKmode; rtx src, dest; - - /* Altivec first, since it will be faster than a string move -when it applies, and usually not significantly larger. */ - if (TARGET_ALTIVEC && bytes >= 16 && align >= 128) + bool move_with_length = false; + + /* Use POImode for paired vsx load/store. Use V2DI for single +unaligned vsx load/store, for consistency with what other +expansions (compare) already do, and so we can use lxvd2x on +p8. Order is VSX pair unaligned, VSX unaligned, Altivec, vsx +with length < 16 (if allowed), then smaller gpr +load/store. */ + + if (TARGET_MMA && TARGET_BLOCK_OPS_UNALIGNED_VSX + && TARGET_BLOCK_OPS_VECTOR_PAIR + && bytes >= 32 + && (align >= 256 || !STRICT_ALIGNMENT)) + { + move_bytes = 32; + mode = POImode; + gen_func.mov = gen_movpoi; + } + else if (TARGET_POWERPC64 && TARGET_BLOCK_OPS_UNALIGNED_VSX + && VECTOR_MEM_VSX_P (V2DImode) + && bytes >= 16 && (align >= 128 || !STRICT_ALIGNMENT)) + { + move_bytes = 16; + mode = V2DImode; + gen_func.mov = gen_vsx_movv2di_64bit; + } + else if (TARGET_BLOCK_OPS_UNALIGNED_VSX + && TARGET_POWER10 && bytes < 16 + && orig_bytes > 16 + && !(bytes == 1 || bytes == 2 + || bytes == 4 || bytes == 8) + && (align >= 128 || !STRICT_ALIGNMENT)) + { + /* Only use lxvl/stxvl if it could replace multiple ordinary +loads+stores. Also don't use it unless we likely already +did one vsx copy so we aren't mixing gpr and vsx. */ + move_bytes = bytes; +
Re: [PATCH] improve memcmp and memchr constant folding (PR 78257)
On 8/13/20 11:44 AM, Martin Sebor wrote: On 8/13/20 10:21 AM, Jeff Law wrote: On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote: The folders for these functions (and some others) call c_getsr which relies on string_constant to return the representation of constant strings. Because the function doesn't handle constants of other types, including aggregates, memcmp or memchr calls involving those are not folded when they could be. The attached patch extends the algorithm used by string_constant to also handle constant aggregates involving elements or members of the same types as native_encode_expr. (The change restores the empty initializer optimization inadvertently disabled in the fix for pr96058.) To avoid accidentally misusing either string_constant or c_getstr with non-strings I have introduced a pair of new functions to get the representation of those: byte_representation and getbyterep. Tested on x86_64-linux. Martin PR tree-optimization/78257 - missing memcmp optimization with constant arrays gcc/ChangeLog: PR middle-end/78257 * builtins.c (expand_builtin_memory_copy_args): Rename called function. (expand_builtin_stpcpy_1): Remove argument from call. (expand_builtin_memcmp): Rename called function. (inline_expand_builtin_bytecmp): Same. * expr.c (convert_to_bytes): New function. (constant_byte_string): New function (formerly string_constant). (string_constant): Call constant_byte_string. (byte_representation): New function. * expr.h (byte_representation): Declare. * fold-const-call.c (fold_const_call): Rename called function. * fold-const.c (c_getstr): Remove an argument. (getbyterep): Define a new function. * fold-const.h (c_getstr): Remove an argument. (getbyterep): Declare a new function. * gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee. (gimple_fold_builtin_string_compare): Same. (gimple_fold_builtin_memchr): Same. gcc/testsuite/ChangeLog: PR middle-end/78257 * gcc.dg/memchr.c: New test. * gcc.dg/memcmp-2.c: New test. * gcc.dg/memcmp-3.c: New test. * gcc.dg/memcmp-4.c: New test. diff --git a/gcc/expr.c b/gcc/expr.c index a150fa0d3b5..a124df54655 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -11594,15 +11594,103 @@ is_aligning_offset (const_tree offset, const_tree exp) /* This must now be the address of EXP. */ return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset, 0) == exp; } - -/* Return the tree node if an ARG corresponds to a string constant or zero - if it doesn't. If we return nonzero, set *PTR_OFFSET to the (possibly - non-constant) offset in bytes within the string that ARG is accessing. - If MEM_SIZE is non-zero the storage size of the memory is returned. - If DECL is non-zero the constant declaration is returned if available. */ -tree -string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl) +/* If EXPR is a constant initializer (either an expression or CONSTRUCTOR), + attempt to obtain its native representation as an array of nonzero BYTES. + Return true on success and false on failure (the latter without modifying + BYTES). */ + +static bool +convert_to_bytes (tree type, tree expr, vec *bytes) +{ + if (TREE_CODE (expr) == CONSTRUCTOR) + { + /* Set to the size of the CONSTRUCTOR elements. */ + unsigned HOST_WIDE_INT ctor_size = bytes->length (); + + if (TREE_CODE (type) == ARRAY_TYPE) + { + tree val, idx; + tree eltype = TREE_TYPE (type); + unsigned HOST_WIDE_INT elsize = + tree_to_uhwi (TYPE_SIZE_UNIT (eltype)); + unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U; + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val) + { + /* Append zeros for elements with no initializers. */ + if (!tree_fits_uhwi_p (idx)) + return false; + unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx); + if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1)) + { + size = size * elsize + bytes->length (); + bytes->safe_grow_cleared (size); ^^^ + } + + if (!convert_to_bytes (eltype, val, bytes)) + return false; + + last_idx = cur_idx; + } + } + else if (TREE_CODE (type) == RECORD_TYPE) + { + tree val, fld; + unsigned HOST_WIDE_INT i; + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val) + { + /* Append zeros for members with no initializers and + any padding. */ + unsigned HOST_WIDE_INT cur_off = int_byte_position (fld); + if (bytes->length () < cur_off) + bytes->safe_grow_cleared (cur_off); ^^ + + if (!convert_to_bytes (TREE_TYPE (val), val, bytes)) + return false; + } + } + else + return false;
Re: [PATCH] Add support for putting jump table into relocation read-only section
Hi! On Fri, Aug 14, 2020 at 03:20:03PM +0800, HAO CHEN GUI wrote: > section * > select_jump_table_section (tree decl) > { > int reloc; > bool section_reloc; > > reloc = (! CASE_VECTOR_PC_RELATIVE && flag_pic && > ! targetm.asm_out.generate_pic_addr_diff_vec ()) ? 1 : > 0; (Modern style is no space after "!", and your mailer wrapped the code). > section_reloc = (reloc & targetm.asm_out.reloc_rw_mask ()); (No space after & either). > return targetm.asm_out.function_rodata_section (decl, section_reloc); So you are saying it should not go into relro if either there is no pic flag, the targets are pc relative, or pic uses some address diff? This seems too complicated, and/or not the right abstraction. > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in > index 3be984bbd5c..0ac1488c837 100644 > --- a/gcc/doc/tm.texi.in > +++ b/gcc/doc/tm.texi.in > @@ -5007,6 +5007,8 @@ it is unlikely to be called. > > @hook TARGET_ASM_FUNCTION_RODATA_SECTION > > +@hook TARGET_ASM_FUNCTION_RELRO_SECTION I would expect people to just use TARGET_ASM_FUNCTION_RODATA_SECTION to get the .data.rel.ro? With another argument then. > +section * > +select_jump_table_section (tree decl) > +{ > + int relco; "reloc"? > + relco = JUMP_TABLES_IN_RELRO; Just put that on the same line as the declaration? > + if (relco & targetm.asm_out.reloc_rw_mask ()) > +return targetm.asm_out.function_relro_section (decl); > + else > +return targetm.asm_out.function_rodata_section (decl); > +} (trailing spaces btw) > @@ -2492,7 +2513,7 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int > optimize_p ATTRIBUTE_UNUSED, > { > int log_align; > > - switch_to_section (targetm.asm_out.function_rodata_section > + switch_to_section (select_jump_table_section >(current_function_decl)); section *section = select_jump_table_section (current_function_decl)); switch_to_section (section); (but it would be better if we didn't indent so deeply here). I think this should be split into a function just selecting the relro section (either directly, or from the rodata selection function), and then separately the jumptable section thing. There is a lot of stuff here that seems confused, and a lot of that already was there :-( Segher
c++: Final bit of name-lookup api simplification
We no longer need to give name_lookup_real not name_lookup_nonclass different names to the name_lookup functions. This renames the lookup functions thusly. gcc/cp/ * name-lookup.h (lookup_name_real, lookup_name_nonclass): Rename to ... (lookup_name): ... these new overloads. * name-lookup.c (identifier_type_value_1): Rename lookup_name_real call. (lookup_name_real_1): Rename to ... (lookup_name_1): ... here. (lookup_name_real): Rename to ... (lookup_name): ... here. Rename lookup_name_real_1 call. (lookup_name_nonclass): Delete. * call.c (build_operator_new_call): Rename lookup_name_real call. (add_operator_candidates): Likewise. (build_op_delete_call): Rename lookup_name_nonclass call. * parser.c (cp_parser_lookup_name): Likewise. * pt.c (tsubst_friend_class, lookup_init_capture_pack): Likewise. (tsubst_expr): Likewise. * semantics.c (capture_decltype): Likewise. libcc1/ * libcp1plugin.cc (plugin_build_dependent_expr): Rename lookup_name_real call. -- Nathan Sidwell diff --git i/gcc/cp/call.c w/gcc/cp/call.c index 8c03fcaf74b..c62d9e227d3 100644 --- i/gcc/cp/call.c +++ w/gcc/cp/call.c @@ -4704,7 +4704,7 @@ build_operator_new_call (tree fnname, vec **args, up in the global scope. we disregard block-scope declarations of "operator new". */ - fns = lookup_name_real (fnname, LOOK_where::NAMESPACE, LOOK_want::NORMAL); + fns = lookup_name (fnname, LOOK_where::NAMESPACE); fns = lookup_arg_dependent (fnname, fns, *args); if (align_arg) @@ -5982,8 +5982,7 @@ add_operator_candidates (z_candidate **candidates, consider. */ if (!memonly) { - tree fns = lookup_name_real (fnname, LOOK_where::BLOCK_NAMESPACE, - LOOK_want::NORMAL); + tree fns = lookup_name (fnname, LOOK_where::BLOCK_NAMESPACE); fns = lookup_arg_dependent (fnname, fns, arglist); add_candidates (fns, NULL_TREE, arglist, NULL_TREE, NULL_TREE, false, NULL_TREE, NULL_TREE, @@ -6812,7 +6811,7 @@ build_op_delete_call (enum tree_code code, tree addr, tree size, fns = NULL_TREE; if (fns == NULL_TREE) -fns = lookup_name_nonclass (fnname); +fns = lookup_name (fnname, LOOK_where::BLOCK_NAMESPACE); /* Strip const and volatile from addr. */ tree oaddr = addr; diff --git i/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c index 4753dc529e6..ad9c92da254 100644 --- i/gcc/cp/name-lookup.c +++ w/gcc/cp/name-lookup.c @@ -3743,7 +3743,7 @@ identifier_type_value_1 (tree id) return REAL_IDENTIFIER_TYPE_VALUE (id); /* Have to search for it. It must be on the global level, now. Ask lookup_name not to return non-types. */ - id = lookup_name_real (id, LOOK_where::BLOCK_NAMESPACE, LOOK_want::TYPE); + id = lookup_name (id, LOOK_where::BLOCK_NAMESPACE, LOOK_want::TYPE); if (id) return TREE_TYPE (id); return NULL_TREE; @@ -5213,8 +5213,7 @@ cp_namespace_decls (tree ns) } /* Given a lookup that returned VAL, use FLAGS to decide if we want to - ignore it or not. Subroutine of lookup_name_real and - lookup_type_scope. */ + ignore it or not. Subroutine of lookup_name_1 and lookup_type_scope. */ static bool qualify_lookup (tree val, LOOK_want want) @@ -5987,7 +5986,7 @@ suggest_alternative_in_scoped_enum (tree name, tree scoped_enum) /* Look up NAME (an IDENTIFIER_NODE) in SCOPE (either a NAMESPACE_DECL or a class TYPE). - WANT as for lookup_name_real_1. + WANT as for lookup_name_1. Returns a DECL (or OVERLOAD, or BASELINK) representing the declaration found. If no suitable declaration can be found, @@ -6420,10 +6419,13 @@ innermost_non_namespace_value (tree name) LOOK_want::NORMAL for normal lookup (implicit typedefs can be hidden). LOOK_want::TYPE for only TYPE_DECLS, LOOK_want::NAMESPACE for only NAMESPACE_DECLS. These two can be bit-ored to find - namespace or type. */ + namespace or type. + + WANT can also have LOOK_want::HIDDEN_FRIEND or + LOOK_want::HIDDEN_LAMBDa added to it. */ static tree -lookup_name_real_1 (tree name, LOOK_where where, LOOK_want want) +lookup_name_1 (tree name, LOOK_where where, LOOK_want want) { tree val = NULL_TREE; @@ -6560,33 +6562,21 @@ lookup_name_real_1 (tree name, LOOK_where where, LOOK_want want) return val; } -/* Wrapper for lookup_name_real_1. */ +/* Wrapper for lookup_name_1. */ tree -lookup_name_real (tree name, LOOK_where where, LOOK_want want) +lookup_name (tree name, LOOK_where where, LOOK_want want) { bool subtime = timevar_cond_start (TV_NAME_LOOKUP); - tree ret = lookup_name_real_1 (name, where, want); + tree ret = lookup_name_1 (name, where, want); timevar_cond_stop (TV_NAME_LOOKUP, subtime); return ret; } -tree -lookup_name_nonclass (tree name) -{ - return lookup_name_real (name, LOOK_where::BLOCK_NAMESPACE, LOOK_want::NORMAL); -} - tree look
Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Hi! On Fri, Aug 14, 2020 at 03:32:47PM -0700, Carl Love wrote: > On Fri, 2020-08-14 at 16:33 -0500, Segher Boessenkool wrote: > > So _vsx if it is for all VSRs, but _altivec for just the VRs? > > Yes, I worked off the rule that Altivec registers are designated with > 5-bits and VSR registers are designated with 6-bits. Excellent :-) > > Do the names agree with the (future) documentation now? > > Did not double check on the documentation. Someone should... > > > -#define BU_P10V_1(ENUM, NAME, ATTR, ICODE) > > > \ > > > - RS6000_BUILTIN_1 (P10_BUILTIN_ ## ENUM,/* ENUM */ > > > \ > > > - "__builtin_altivec_" NAME, /* NAME */ > > > \ > > > +#define BU_P10V_VSX_1(ENUM, NAME, ATTR, ICODE) > > > \ > > > > Hrm, this now says "V" (for vector) as well as "VSX" (for all 64 > > vector > > regs allowed). One of those is superfluous. > > > > > + RS6000_BUILTIN_1 (P10V_BUILTIN_ ## ENUM, /* ENUM */ > > > \ > > > > So this enum name doesn't say it allows all 64 registers? > > > > > + "__builtin_vsx_" NAME, /* NAME */ \ > > > > > > > /* Builtins for vector instructions added in ISA 3.1 > > > (power10). */ > > > -BU_P10V_2 (VCLRLB, "vclrlb", CONST, vclrlb) > > > +BU_P10V_AV_2 (VCLRLB, "vclrlb", CONST, vclrlb) > > > > Maybe you should just keep "V" for insns using only the VRs (which > > you > > call V_AV now), and do "VS" for those working on all VSRs (which you > > call > > V_VSX here)? > > The names BU_P10V_AV_# and BU_P10V_VSX_# were used for consistency with > the Power 8 and Power 9 naming, per my discussion with Peter. Ah, okay, yes let's keep it all consistent then, to make the rewrite easier :-) > The overload macros: > > #define BU_P8V_OVERLOAD_1(ENUM, NAME) > #define BU_P8V_OVERLOAD_2(ENUM, NAME) > #define BU_P8V_OVERLOAD_3(ENUM, NAME) > > #define BU_P9_OVERLOAD_2(ENUM, NAME) > > #define BU_P9V_OVERLOAD_1(ENUM, NAME) > #define BU_P9V_OVERLOAD_2(ENUM, NAME) > #define BU_P9V_OVERLOAD_3(ENUM, NAME) > > Looks like BU_P9_OVERLOAD_2 and BU_P9V_OVERLOAD_2 are identical > definitions. That should probably be fixed. It's __builtin_ vs. __builtin_vec_. > Thoughts? I will change to whatever everyone agrees on. I think your current code is fine; I hadn't considered Bill's upcoming rewrite. It is more important to make that go smoother than to fix some aesthetics right now. Please check if the names for the builtins match the (future) documentation, and then, approved for trunk. Thank you! Segher
Re: [PATCH] introduce attribute exalias
On Aug 14, 2020, Nathan Sidwell wrote: > 'exalias' sounds either like a used-to-be alias *nod* > or it sounds like exa-lias, making me wonder what a 'lia' is, and why > I want 10^18 of them heh >>> I'm sure we can bikeshed the name 'exalias' doesn't seem very mnemonic >>> to me. 'symbol_alias' or something? >> I don't like symbol_alias; that this feature names a symbol is not a >> distinguishing feature from the preexisting alias attribute. > right, I realize this is different to the existing alias. The point was that the existing alias already takes a symbol name. > It's always struck me that the existing semantics are not c++ > friendly. Indeed, avoiding the need for using mangled symbol names to refer to language entities is the very issue I've set out to solve. It helps with aliases in C++ as much as it helps with imports in Ada. > Perhaps alias is not the right name at all. I kind of like the explicit present of "alias" because, well, what we get is an alias, to the point that, if asm aliases aren't available, it won't work. And, if they are, you can use the so-assigned name as an alias target, so it's a good thing if they're typographically related. One could even argue that this new attribute is more deserving of the term alias than the existing one, and that the existing one should be renamed to "aliased_to" or so. But I'm not seriously suggesting us to rename a long-available attribute while assigning uses thereof a different semantics, that would be preposterous. Since you don't seem to have liked 'aka' either, how about 'nickname', or 'nicknamed'? A more convenient name to refer to an entity is exactly what this is about, eh? -- Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer