[PATCH] xtensa: Turn on -fsplit-wide-types-early by default
Since GCC10, the "subreg2" optimization pass was no longer tied to enabling
"subreg1" unless -fsplit-wide-types-early was turned on (PR88233). However
on the Xtensa port, the lack of "subreg2" can degrade the quality of the
output code, especially for those that produce many D[FC]mode pseudos.

This patch turns on -fsplit-wide-types-early by default in order to restore
the previous behavior.

gcc/ChangeLog:

	* common/config/xtensa/xtensa-common.cc
	(xtensa_option_optimization_table): Add OPT_fsplit_wide_types_early
	for OPT_LEVELS_ALL in order to restore pre-GCC10 behavior.
---
 gcc/common/config/xtensa/xtensa-common.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/common/config/xtensa/xtensa-common.cc b/gcc/common/config/xtensa/xtensa-common.cc
index fbbe9b0aad7..0f27763aa71 100644
--- a/gcc/common/config/xtensa/xtensa-common.cc
+++ b/gcc/common/config/xtensa/xtensa-common.cc
@@ -34,6 +34,8 @@ static const struct default_options xtensa_option_optimization_table[] =
      assembler, so GCC cannot do a good job of reordering blocks.
      Do not enable reordering unless it is explicitly requested.  */
   { OPT_LEVELS_ALL, OPT_freorder_blocks, NULL, 0 },
+  /* Split multi-word types early (pre-GCC10 behavior).  */
+  { OPT_LEVELS_ALL, OPT_fsplit_wide_types_early, NULL, 1 },
   { OPT_LEVELS_NONE, 0, NULL, 0 }
 };
 
-- 
2.20.1
Re: [PATCH] MIPS: improve -march=native arch detection
On Tue, Aug 02, 2022 at 11:10:09AM, YunQiang Su wrote:
> If we cannot get info from options and cpuinfo, we try to get from:
>   1. getauxval(AT_BASE_PLATFORM), introduced since Linux 5.7
>   2. _MIPS_ARCH from host compiler.
>
> This can fix the wrong loader usage on r5/r6 platform with
> -march=native.
>

ping...

> gcc/ChangeLog:
> 	* config/mips/driver-native.cc (host_detect_local_cpu):
> 	try getauxval(AT_BASE_PLATFORM) and _MIPS_ARCH, too.
> ---
>  gcc/config/mips/driver-native.cc | 22 +++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/mips/driver-native.cc
> b/gcc/config/mips/driver-native.cc
> index 47627f85ce1..9aa7044c0b8 100644
> --- a/gcc/config/mips/driver-native.cc
> +++ b/gcc/config/mips/driver-native.cc
> @@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
>
>  #define IN_TARGET_CODE 1
>
> +#include <sys/auxv.h>
>  #include "config.h"
>  #include "system.h"
>  #include "coretypes.h"
> @@ -46,15 +47,15 @@ host_detect_local_cpu (int argc, const char **argv)
>    bool arch;
>
>    if (argc < 1)
> -    return NULL;
> +    goto fallback_cpu;
>
>    arch = strcmp (argv[0], "arch") == 0;
>    if (!arch && strcmp (argv[0], "tune"))
> -    return NULL;
> +    goto fallback_cpu;
>
>    f = fopen ("/proc/cpuinfo", "r");
>    if (f == NULL)
> -    return NULL;
> +    goto fallback_cpu;
>
>    while (fgets (buf, sizeof (buf), f) != NULL)
>      if (startswith (buf, "cpu model"))
> @@ -84,8 +85,23 @@ host_detect_local_cpu (int argc, const char **argv)
>
>    fclose (f);
>
> +fallback_cpu:
> +/*FIXME: how about other OSes, like FreeBSD? */
> +#ifdef __linux__
> +  /*Note: getauxval may return NULL as:
> +   * AT_BASE_PLATFORM is supported since Linux 5.7
> +   * Or from older version of qemu-user
> +   * */
> +  if (cpu == NULL)
> +    cpu = (const char *) getauxval (AT_BASE_PLATFORM);
> +#endif
> +
>    if (cpu == NULL)
> +#if defined (_MIPS_ARCH)
> +    cpu = _MIPS_ARCH;
> +#else
>      return NULL;
> +#endif
>
>    return concat ("-m", argv[0], "=", cpu, NULL);
>  }
> --
> 2.30.2
>
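For readers following along, the fallback order described in the quoted patch (auxiliary vector first, then the host compiler's _MIPS_ARCH) can be sketched outside of GCC roughly as below. This is only an illustration under assumptions, not the driver code: fallback_cpu_name and native_flag are hypothetical names, std::string stands in for GCC's concat, and the /proc/cpuinfo parsing that precedes the fallback is left out.

```cpp
#include <cassert>
#include <string>
#if defined(__linux__)
#include <sys/auxv.h>  // getauxval; AT_BASE_PLATFORM comes in via <elf.h>
#endif

// Hypothetical helper: try the auxiliary vector, then the compile-time
// _MIPS_ARCH macro of the host compiler. Returns nullptr if neither helps.
const char *fallback_cpu_name() {
  const char *cpu = nullptr;
#if defined(__linux__) && defined(AT_BASE_PLATFORM)
  // getauxval returns 0 when the kernel (or an emulator) does not provide
  // the entry, so a null result has to be tolerated here.
  cpu = reinterpret_cast<const char *>(getauxval(AT_BASE_PLATFORM));
#endif
#if defined(_MIPS_ARCH)
  if (cpu == nullptr)
    cpu = _MIPS_ARCH;  // string literal predefined by MIPS compilers
#endif
  return cpu;
}

// Hypothetical helper: build "-m<kind>=<cpu>", or an empty string when no
// CPU name could be determined. 'kind' is expected to be "arch" or "tune".
std::string native_flag(const char *kind) {
  assert(kind != nullptr);
  const char *cpu = fallback_cpu_name();
  if (cpu == nullptr)
    return std::string();
  return std::string("-m") + kind + "=" + cpu;
}
```

On hosts where the kernel does not export AT_BASE_PLATFORM and _MIPS_ARCH is not defined, native_flag("arch") simply yields an empty string, which mirrors the "return NULL" path in the patch.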
[PATCH][_GLIBCXX_DEBUG] Add basic_string::starts_with/ends_with checks
I think we can add those checks.

I wondered whether they were needed, since in basic_string_view I see
usages of __attribute__((__nonnull__)). But when running the tests I saw no
effect from it, even after I tried applying this attribute to the
starts_with/ends_with methods themselves.

Also note that several checks like the ones I am adding here are XFAILs
under 'make check' because of a segfault rather than a proper debug-mode
check. Would you prefer that I add dg-require-debug-mode to those?

    libstdc++: [_GLIBCXX_DEBUG] Add basic_string::starts_with/ends_with checks

    Add simple checks on C string parameters which should not be null.

    Review null string checks to show:
    _String != nullptr

    rather than:
    _String != 0

    libstdc++-v3/ChangeLog:

            * include/bits/basic_string.h (starts_with, ends_with): Add
            __glibcxx_check_string.
            * include/bits/cow_string.h (starts_with, ends_with): Likewise.
            * include/debug/debug.h: Use nullptr rather than '0' in checks
            in C++11.
            * include/debug/string: Likewise.
            * testsuite/21_strings/basic_string/operations/ends_with/char.cc:
            Use __gnu_test::string.
            * testsuite/21_strings/basic_string/operations/ends_with/wchar_t.cc:
            Use __gnu_test::wstring.
            * testsuite/21_strings/basic_string/operations/starts_with/wchar_t.cc:
            Use __gnu_test::wstring.
            * testsuite/21_strings/basic_string/operations/starts_with/char.cc:
            Use __gnu_test::string.
            * testsuite/21_strings/basic_string/operations/ends_with/char_neg.cc:
            New test.
            * testsuite/21_strings/basic_string/operations/ends_with/wchar_t_neg.cc:
            New test.
            * testsuite/21_strings/basic_string/operations/starts_with/char_neg.cc:
            New test.
            * testsuite/21_strings/basic_string/operations/starts_with/wchar_t_neg.cc:
            New test.

Tested under linux normal and debug modes.

François
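To make the intent of the check concrete, here is a minimal standalone sketch of "reject a null C string before touching it". It is not the libstdc++ implementation: checked_starts_with, checked_ends_with and require_non_null are hypothetical names, and throwing an exception is an assumption made so the behaviour can be observed in a test (as far as I understand, a failed _GLIBCXX_DEBUG check reports the error and aborts instead).

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a debug-mode precondition on a C string.
inline void require_non_null(const char *s, const char *what) {
  if (s == nullptr)
    throw std::invalid_argument(what);
}

// Prefix test that validates the pointer before std::strlen dereferences it.
bool checked_starts_with(const std::string &str, const char *prefix) {
  require_non_null(prefix, "starts_with: null C string");
  const std::size_t n = std::strlen(prefix);
  return str.size() >= n && str.compare(0, n, prefix) == 0;
}

// Suffix test with the same precondition.
bool checked_ends_with(const std::string &str, const char *suffix) {
  require_non_null(suffix, "ends_with: null C string");
  const std::size_t n = std::strlen(suffix);
  return str.size() >= n && str.compare(str.size() - n, n, suffix) == 0;
}

// Returns true if a null argument is rejected instead of being dereferenced.
bool null_is_rejected() {
  try {
    checked_starts_with("abc", nullptr);
  } catch (const std::invalid_argument &) {
    return true;
  }
  return false;
}

// Small usage demo of the two helpers.
void demo() {
  assert(checked_starts_with("basic_string", "basic"));
  assert(checked_ends_with("basic_string", "string"));
}
```

Without such a check, passing a null pointer typically ends in a segfault inside the length computation, which is the failure mode the XFAIL remark above refers to.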
[RFC]rs6000: split complicated constant to memory
Hi,

This patch tries to put the constant into the constant pool if building the
constant requires 3 or more instructions.

But there is a concern: I'm wondering whether this patch is really profitable.

Because, as I tested:
1. for a simple case, if the instructions are not run in parallel, loading
   the constant from memory may be faster; but
2. if some instructions can run in parallel, loading the constant from
   memory is not a win compared with building the constant.

See the examples below. For f1.c and f3.c, 'loading' the constant would be
acceptable in terms of runtime; for f2.c and f4.c, 'loading' the constant is
visibly slower.

For real-world cases, both kinds of code sequences exist.

So, I'm not sure if we need to push this patch.

Each function below was run many times (10) to check the runtime.

f1.c:
long foo (long *arg, long*, long *)
{
  *arg = 0x1234567800000000;
}
asm building constant:
	lis 10,0x1234
	ori 10,10,0x5678
	sldi 10,10,32
vs. asm loading
	addis 10,2,.LC0@toc@ha
	ld 10,.LC0@toc@l(10)
The runtimes of 'building' and 'loading' are similar: sometimes 'building'
is faster; sometimes 'loading' is faster. And the difference is slight.

f2.c
long foo (long *arg, long *arg2, long *arg3)
{
  *arg = 0x1234567800000000;
  *arg2 = 0x7965234700000000;
  *arg3 = 0x4689123700000000;
}
asm building constant:
	lis 7,0x1234
	lis 10,0x7965
	lis 9,0x4689
	ori 7,7,0x5678
	ori 10,10,0x2347
	ori 9,9,0x1237
	sldi 7,7,32
	sldi 10,10,32
	sldi 9,9,32
vs. loading
	addis 7,2,.LC0@toc@ha
	addis 10,2,.LC1@toc@ha
	addis 9,2,.LC2@toc@ha
	ld 7,.LC0@toc@l(7)
	ld 10,.LC1@toc@l(10)
	ld 9,.LC2@toc@l(9)
For this case, 'loading' is always slower than 'building' (>15%).

f3.c
long foo (long *arg, long *, long *)
{
  *arg = 384307168202282325;
}
	lis 10,0x555
	ori 10,10,0x5555
	sldi 10,10,32
	oris 10,10,0x5555
	ori 10,10,0x5555
For this case, 'building' (through 5 instructions) is slower, and 'loading'
is faster by ~5%.

f4.c
long foo (long *arg, long *arg2, long *arg3)
{
  *arg = 384307168202282325;
  *arg2 = -6148914691236517205;
  *arg3 = 768614336404564651;
}
	lis 7,0x555
	lis 10,0xaaaa
	lis 9,0xaaa
	ori 7,7,0x5555
	ori 10,10,0xaaaa
	ori 9,9,0xaaaa
	sldi 7,7,32
	sldi 10,10,32
	sldi 9,9,32
	oris 7,7,0x5555
	oris 10,10,0xaaaa
	oris 9,9,0xaaaa
	ori 7,7,0x5555
	ori 10,10,0xaaab
	ori 9,9,0xaaab
For this case, since the 'building' sequences run in parallel, 'loading' is
slower by ~8%. On p10, 'loading' (through 'pld') is also slower, by >4%.

BR,
Jeff(Jiufu)

---
 gcc/config/rs6000/rs6000.cc                | 14 ++
 gcc/testsuite/gcc.target/powerpc/pr63281.c | 11 +++
 2 files changed, 25 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr63281.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 4b727d2a500..3798e11bdbc 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10098,6 +10098,20 @@ rs6000_emit_set_const (rtx dest, rtx source)
           c = ((c & 0xffffffff) ^ 0x80000000) - 0x80000000;
           emit_move_insn (lo, GEN_INT (c));
         }
+      else if (base_reg_operand (dest, mode)
+               && num_insns_constant (source, mode) > 2)
+        {
+          rtx sym = force_const_mem (mode, source);
+          if (TARGET_TOC && SYMBOL_REF_P (XEXP (sym, 0))
+              && use_toc_relative_ref (XEXP (sym, 0), mode))
+            {
+              rtx toc = create_TOC_reference (XEXP (sym, 0), copy_rtx (dest));
+              sym = gen_const_mem (mode, toc);
+              set_mem_alias_set (sym, get_TOC_alias_set ());
+            }
+
+          emit_insn (gen_rtx_SET (dest, sym));
+        }
       else
         rs6000_emit_set_long_const (dest, c);
       break;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr63281.c b/gcc/testsuite/gcc.target/powerpc/pr63281.c
new file mode 100644
index 00000000000..469a8f64400
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr63281.c
@@ -0,0 +1,11 @@
+/* PR target/63281 */
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2 -std=c99" } */
+
+void
+foo (unsigned long long *a)
+{
+  *a = 0x020805006106003;
+}
+
+/* { dg-final { scan-assembler-times {\mp?ld\M} 1 } } */
-- 
2.17.1
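As a sanity check on the instruction counts discussed in this RFC, the 'building' sequences can be modelled with plain integer arithmetic. The sketch below is an illustration only: lis, ori, oris and sldi are hypothetical helper functions that mimic the register effect of the corresponding PowerPC instructions on a 64-bit value, and the immediates are the ones that reproduce the decimal constants of f3.c and f4.c and the shifted constant of f1.c.

```cpp
#include <cassert>
#include <cstdint>

// lis: place a 16-bit immediate in bits 16..31 and sign-extend to 64 bits.
constexpr std::uint64_t lis(std::uint16_t imm) {
  return (imm & 0x8000u)
             ? ((static_cast<std::uint64_t>(imm) << 16) | 0xffffffff00000000ULL)
             : (static_cast<std::uint64_t>(imm) << 16);
}

// ori: OR a zero-extended 16-bit immediate into the low halfword.
constexpr std::uint64_t ori(std::uint64_t r, std::uint16_t imm) {
  return r | imm;
}

// oris: OR a 16-bit immediate shifted left by 16.
constexpr std::uint64_t oris(std::uint64_t r, std::uint16_t imm) {
  return r | (static_cast<std::uint64_t>(imm) << 16);
}

// sldi: shift left by n bits, discarding bits shifted out of 64.
constexpr std::uint64_t sldi(std::uint64_t r, unsigned n) {
  return r << n;
}

// Three-instruction form (lis, ori, sldi 32), as in f1.c and f2.c.
constexpr std::uint64_t build3(std::uint16_t a, std::uint16_t b) {
  return sldi(ori(lis(a), b), 32);
}

// Five-instruction form (lis, ori, sldi 32, oris, ori), as in f3.c and f4.c.
constexpr std::uint64_t build5(std::uint16_t a, std::uint16_t b,
                               std::uint16_t c, std::uint16_t d) {
  return ori(oris(sldi(ori(lis(a), b), 32), c), d);
}

// Checks that the modelled sequences produce the constants from the examples.
void verify_sequences() {
  assert(build3(0x1234, 0x5678) == 0x1234567800000000ULL);
  assert(build5(0x555, 0x5555, 0x5555, 0x5555) == 384307168202282325ULL);
  assert(build5(0xaaaa, 0xaaaa, 0xaaaa, 0xaaab) ==
         static_cast<std::uint64_t>(-6148914691236517205LL));
  assert(build5(0xaaa, 0xaaaa, 0xaaaa, 0xaaab) == 768614336404564651ULL);
}
```

Each step in build5 depends on the previous one, so a single constant is a serial chain of five instructions, while the three chains in f4.c are independent of each other. That is consistent with the observation that 'building' wins when several constants can be built in parallel.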
Re: [PATCH] i386 testsuite: cope with --enable-default-pie
On Wed, Aug 10, 2022 at 1:42 PM Alexandre Oliva via Gcc-patches wrote:
>
> On Aug 9, 2022, Alexandre Oliva wrote:
>
> > Ping?
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598276.html
>
> Oops, sorry, I linked to the wrong patch. This is the one I meant to ping:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598874.html

Patch LGTM.

>
> > On Jul 27, 2022, Alexandre Oliva wrote:
>
> >> for gcc/testsuite/ChangeLog
>
> >> * g++.dg/abi/anon1.C: Disable pie on ia32.
> >> * g++.dg/abi/anon4.C: Likewise.
> >> * g++.dg/cpp0x/initlist-const1.C: Likewise.
> >> * g++.dg/no-stack-protector-attr-3.C: Likewise.
> >> * g++.dg/stackprotectexplicit2.C: Likewise.
> >> * g++.dg/pr71694.C: Likewise.
> >> * gcc.dg/pr102892-1.c: Likewise.
> >> * gcc.dg/sibcall-11.c: Likewise.
> >> * gcc.dg/torture/builtin-self.c: Likewise.
> >> * gcc.target/i386/avx2-dest-false-dep-for-glc.c: Likewise.
> >> * gcc.target/i386/avx512bf16-cvtsbh2ss-1.c: Likewise.
> >> * gcc.target/i386/avx512f-broadcast-pr87767-1.c: Likewise.
> >> * gcc.target/i386/avx512f-broadcast-pr87767-3.c: Likewise.
> >> * gcc.target/i386/avx512f-broadcast-pr87767-5.c: Likewise.
> >> * gcc.target/i386/avx512f-broadcast-pr87767-7.c: Likewise.
> >> * gcc.target/i386/avx512fp16-broadcast-1.c: Likewise.
> >> * gcc.target/i386/avx512fp16-pr101846.c: Likewise.
> >> * gcc.target/i386/avx512vl-broadcast-pr87767-1.c: Likewise.
> >> * gcc.target/i386/avx512vl-broadcast-pr87767-3.c: Likewise.
> >> * gcc.target/i386/avx512vl-broadcast-pr87767-5.c: Likewise.
> >> * gcc.target/i386/pr100865-2.c: Likewise.
> >> * gcc.target/i386/pr100865-3.c: Likewise.
> >> * gcc.target/i386/pr100865-4a.c: Likewise.
> >> * gcc.target/i386/pr100865-4b.c: Likewise.
> >> * gcc.target/i386/pr100865-5a.c: Likewise.
> >> * gcc.target/i386/pr100865-5b.c: Likewise.
> >> * gcc.target/i386/pr100865-6a.c: Likewise.
> >> * gcc.target/i386/pr100865-6b.c: Likewise.
> >> * gcc.target/i386/pr100865-6c.c: Likewise.
> >> * gcc.target/i386/pr100865-7b.c: Likewise.
> >> * gcc.target/i386/pr101796-1.c: Likewise.
> >> * gcc.target/i386/pr101846-2.c: Likewise.
> >> * gcc.target/i386/pr101989-broadcast-1.c: Likewise.
> >> * gcc.target/i386/pr102021.c: Likewise.
> >> * gcc.target/i386/pr90773-17.c: Likewise.
> >> * gcc.target/i386/pr54855-3.c: Likewise.
> >> * gcc.target/i386/pr54855-7.c: Likewise.
> >> * gcc.target/i386/pr15184-1.c: Likewise.
> >> * gcc.target/i386/pr15184-2.c: Likewise.
> >> * gcc.target/i386/pr27971.c: Likewise.
> >> * gcc.target/i386/pr70263-2.c: Likewise.
> >> * gcc.target/i386/pr78035.c: Likewise.
> >> * gcc.target/i386/pr81736-5.c: Likewise.
> >> * gcc.target/i386/pr81736-7.c: Likewise.
> >> * gcc.target/i386/pr85620-6.c: Likewise.
> >> * gcc.target/i386/pr85667-6.c: Likewise.
> >> * gcc.target/i386/pr93492-5.c: Likewise.
> >> * gcc.target/i386/pr96539.c: Likewise.
> >> PR target/81708 (%gs:my_guard)
> >> * gcc.target/i386/stack-prot-sym.c: Likewise.
> >> * g++.dg/init/static-cdtor1.C: Add alternate patterns for PIC.
> >> * gcc.target/i386/avx512fp16-vcvtsh2si-1a.c: Extend patterns
> >> for PIC/PIE register allocation.
> >> * gcc.target/i386/pr100704-3.c: Likewise.
> >> * gcc.target/i386/avx512fp16-vcvtsh2usi-1a.c: Likewise.
> >> * gcc.target/i386/avx512fp16-vcvttsh2si-1a.c: Likewise.
> >> * gcc.target/i386/avx512fp16-vcvttsh2usi-1a.c: Likewise.
> >> * gcc.target/i386/avx512fp16-vmovsh-1a.c: Likewise.
> >> * gcc.target/i386/interrupt-11.c: Likewise, allowing for
> >> preservation of the PIC register.
> >> * gcc.target/i386/interrupt-12.c: Likewise.
> >> * gcc.target/i386/interrupt-13.c: Likewise.
> >> * gcc.target/i386/interrupt-15.c: Likewise.
> >> * gcc.target/i386/interrupt-16.c: Likewise.
> >> * gcc.target/i386/interrupt-17.c: Likewise.
> >> * gcc.target/i386/interrupt-8.c: Likewise.
> >> * gcc.target/i386/cet-sjlj-6a.c: Combine patterns from
> >> previous change.
> >> * gcc.target/i386/cet-sjlj-6b.c: Likewise.
> >> * gcc.target/i386/pad-10.c: Accept insns in get_pc_thunk.
> >> * gcc.target/i386/pr70321.c: Likewise.
> >> * gcc.target/i386/pr81563.c: Likewise.
> >> * gcc.target/i386/pr84278.c: Likewise.
> >> * gcc.target/i386/pr90773-2.c: Likewise, plus extra loads from
> >> the GOT.
> >> * gcc.target/i386/pr90773-3.c: Likewise.
> >> * gcc.target/i386/pr94913-2.c: Accept additional PIC insns.
> >> * gcc.target/i386/stack-check-17.c: Likewise.
> >> * gcc.target/i386/stack-check-12.c: Do not require dummy stack
> >> probing obviated with PIC.
> >> * gcc.target/i386/pr95126-m32-1.c: Expect missed optimization
> >> with PIC.
> >> * gcc.target/i386/pr95126-m32-2.c: Likewise.
> >> * gcc.target/i386/pr95852-2.c: Accept different optimization
> >> with PIC.
> >> * gcc.target/i386/pr95852-4.c: Likewise.
>
> --
> Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
>    Free Software Activist                       GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about
[PATCH] riscv: elf-multilib: add rv32iafc to defaults
rv32iafc-ilp32 is compatible with rv32iac-ilp32 for library
implementation, so add a reuse rule allowing the default configuration
to support rv32iafc.

-IAFC is an unusual configuration (much less common than -IMAFC), but
multilib reuse has essentially no cost: this change is useful to users
of platforms that support hardware floating-point but cannot use
hardware multiply/divide for any reason. To avoid generating a new set
of libraries this is limited to the soft-float ABI.

Tested by verifying that `gcc -march=rv32iafc -mabi=ilp32
--print-search-dirs` refers to the rv32iac/ilp32 library directory as
expected, rather than just the root library directory as occurs when an
unsupported target is selected (for instance, rv32id).

Signed-off-by: Peter Marheine
---
 gcc/config/riscv/t-elf-multilib | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/t-elf-multilib b/gcc/config/riscv/t-elf-multilib
index 19f9434616c..6e74b1811be 100644
--- a/gcc/config/riscv/t-elf-multilib
+++ b/gcc/config/riscv/t-elf-multilib
@@ -1,11 +1,12 @@
 # This file was generated by multilib-generator with the command:
-# ./multilib-generator rv32i-ilp32--c rv32im-ilp32--c rv32iac-ilp32-- rv32imac-ilp32-- rv32imafc-ilp32f-rv32imafdc- rv64imac-lp64-- rv64imafdc-lp64d--
-MULTILIB_OPTIONS = march=rv32i/march=rv32ic/march=rv32im/march=rv32imc/march=rv32iac/march=rv32imac/march=rv32imafc/march=rv32imafdc/march=rv32gc/march=rv64imac/march=rv64imafdc/march=rv64gc mabi=ilp32/mabi=ilp32f/mabi=lp64/mabi=lp64d
+# ./multilib-generator rv32i-ilp32--c rv32im-ilp32--c rv32iac-ilp32--f rv32imac-ilp32-- rv32imafc-ilp32f-rv32imafdc- rv64imac-lp64-- rv64imafdc-lp64d--
+MULTILIB_OPTIONS = march=rv32i/march=rv32ic/march=rv32im/march=rv32imc/march=rv32iac/march=rv32iafc/march=rv32imac/march=rv32imafc/march=rv32imafdc/march=rv32gc/march=rv64imac/march=rv64imafdc/march=rv64gc mabi=ilp32/mabi=ilp32f/mabi=lp64/mabi=lp64d
 MULTILIB_DIRNAMES = rv32i \
 rv32ic \
 rv32im \
 rv32imc \
 rv32iac \
+rv32iafc \
 rv32imac \
 rv32imafc \
 rv32imafdc \
@@ -25,6 +26,7 @@ march=rv64imac/mabi=lp64 \
 march=rv64imafdc/mabi=lp64d
 MULTILIB_REUSE = march.rv32i/mabi.ilp32=march.rv32ic/mabi.ilp32 \
 march.rv32im/mabi.ilp32=march.rv32imc/mabi.ilp32 \
+march.rv32iac/mabi.ilp32=march.rv32iafc/mabi.ilp32 \
 march.rv32imafc/mabi.ilp32f=march.rv32imafdc/mabi.ilp32f \
 march.rv32imafc/mabi.ilp32f=march.rv32gc/mabi.ilp32f \
 march.rv64imafdc/mabi.lp64d=march.rv64gc/mabi.lp64d
-- 
2.37.1.595.g718a3a8f04-goog