Re: [PATCH v2] rs6000: Fix .machine cpu selection w/ altivec [PR97367]
Hi Peter, > On Jul 12, 2024, at 23:48, Peter Bergner wrote: > > René's patch seems to have stalled, so here is an updated version of the > patch with the requested changes to his patch. Yes, sorry. Our Linux distribution grew so much in popularity the last month that I had hard weeks keeping up with all the new developments. Your changes looks good to me, too. The question is do the maintainers agree and wether they like this in one or two commits. Signed-of-by: René Rebe > I'll note I have added an additional code change, which is to also emit a > ".machine altivec" if Altivec is enabled. The problem this fixes is for > cpus like the G5, which is basically a power4 plus an Altivec unit, its > ".machine power4" doesn't enable the assembler to recognize Altivec insns. > That isn't a problem if you use gcc -mcpu=G5 to assemble the assembler file, > since gcc passes -maltivec to the assembler. However, if you try to assemble > the assembler file with as by hand, you'll get "unrecognized opcode" errors. > I did not do the same for VSX, since all ".machine " for cpus that > support VSX already enable VSX insn recognition, so it's not needed. > > > rs6000: Fix .machine cpu selection w/ altivec [PR97367] > > There are various non-IBM CPUs with altivec, so we cannot use that > flag to determine which .machine cpu to use, so ignore it. > Emit an additional ".machine altivec" if Altivec is enabled so > that the assembler doesn't require an explicit -maltivec option > to assemble any Altivec instructions for those targets where > the ".machine cpu" is insufficient to enable Altivec. For example, > -mcpu=G5 emits a ".machine power4". > > This passed bootstrap and regtesting on powrpc64-linux (running the testsuite > in both 32-bit and 64-bit modes) with no regressions. > > Ok for trunk and the release branches after some trunk burn-in time? > > Peter > > > 2024-07-12 René Rebe > Peter Bergner > > gcc/ > PR target/97367 > * config/rs6000/rs6000.c (rs6000_machine_from_flags): Do not consider > OPTION_MASK_ALTIVEC. > (emit_asm_machine): For Altivec compiles, emit a ".machine altivec". > > gcc/testsuite/ > PR target/97367 > * gcc.target/powerpc/pr97367.c: New test. > > Signed-of-by: René Rebe > --- > gcc/config/rs6000/rs6000.cc| 5 - > gcc/testsuite/gcc.target/powerpc/pr97367.c | 13 + > 2 files changed, 17 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97367.c > > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index 2cbea6ea2d7..2cb8f35739b 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -5888,7 +5888,8 @@ rs6000_machine_from_flags (void) > HOST_WIDE_INT flags = rs6000_isa_flags; > > /* Disable the flags that should never influence the .machine selection. */ > - flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | > OPTION_MASK_ISEL); > + flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | > OPTION_MASK_ISEL > + | OPTION_MASK_ALTIVEC); > > if ((flags & (ISA_3_1_MASKS_SERVER & ~ISA_3_0_MASKS_SERVER)) != 0) > return "power10"; > @@ -5913,6 +5914,8 @@ void > emit_asm_machine (void) > { > fprintf (asm_out_file, "\t.machine %s\n", rs6000_machine); > + if (TARGET_ALTIVEC) > +fprintf (asm_out_file, "\t.machine altivec\n"); > } > #endif > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr97367.c > b/gcc/testsuite/gcc.target/powerpc/pr97367.c > new file mode 100644 > index 000..f9118dbcdec > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr97367.c > @@ -0,0 +1,13 @@ > +/* PR target/97367 */ > +/* { dg-options "-mdejagnu-cpu=G5" } */ > + > +/* Verify we emit a ".machine power4" and ".machine altivec" rather > + than a ".machine power7". */ > + > +int dummy (void) > +{ > + return 0; > +} > + > +/* { dg-final { scan-assembler {\.\mmachine power4\M} } } */ > +/* { dg-final { scan-assembler {\.\mmachine altivec\M} } } */ > -- > 2.45.2 > -- ExactCODE GmbH, Lietzenburger Str. 42, DE-10789 Berlin http://exactcode.com | http://exactscan.com | http://ocrkit.com
Re: [PATCH v2] fix PowerPC < 7 w/ Altivec not to default to power7
Hi! > On Jun 12, 2024, at 00:15, Segher Boessenkool > wrote: > > Hi! > > What does "powerpc < 7" mean? Something before POWER ISA 2.06? PowerPC ISA level 7 or whatever you like to call it. > On Tue, Jun 11, 2024 at 04:22:54PM +0200, Rene Rebe wrote: >> Glibc uses .machine to determine assembler optimizations to use. > > What does this mean? > > .machine is an *output* for glibc; nothing in glibc reads source code. The glibc build with gcc since 2019 with -mcpu=g5, cell or anything before power7 w/ altiven will use assembly optimizations with instructions not supported by the CPU. I found out the hard way because the resultings binaries threw SIGILL. > Nothing the ".machine" directive does has anything to do with > optimisations. Instead, it simply changes what architecture level is > used for the following code. what specific instructions are supported > mainly. I could probably go grep the glibc sources again 4 years later for you. >> --- a/gcc/testsuite/gcc.target/powerpc/pr97367.c.vanilla 2024-05-30 >> 18:26:29.839784279 +0200 >> +++ b/gcc/testsuite/gcc.target/powerpc/pr97367.c 2024-10-06 >> 18:20:34.873818482 +0200 >> @@ -0,0 +1,9 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-mdejagnu-cpu=G5" } */ >> + >> +int dummy () >> +{ >> + return 0; >> +} >> + >> +/* { dg-final { scan-assembler "power4" } } */ > > Please explain (in the testcase, not here!) what this is meant to test! > > You probably want to say {\mpower4\M} instead, btw. Unless you want to > match ".machine spower436" as well? That sounds indeed reasonable. I guess we can make it match .machine, too. Updated test-case welcome ;-) René -- ExactCODE GmbH, Lietzenburger Str. 42, DE-10789 Berlin http://exactcode.com | http://exactscan.com | http://ocrkit.com
Re: [PATCH v2] fix PowerPC < 7 w/ Altivec not to default to power7
Hey, > On Jun 12, 2024, at 00:27, René Rebe wrote: > > Hi! > >> On Jun 12, 2024, at 00:15, Segher Boessenkool >> wrote: >> >> Hi! >> >> What does "powerpc < 7" mean? Something before POWER ISA 2.06? > > PowerPC ISA level 7 or whatever you like to call it. > >> On Tue, Jun 11, 2024 at 04:22:54PM +0200, Rene Rebe wrote: >>> Glibc uses .machine to determine assembler optimizations to use. >> >> What does this mean? >> >> .machine is an *output* for glibc; nothing in glibc reads source code. > > The glibc build with gcc since 2019 with -mcpu=g5, cell or anything before > power7 w/ altiven will use assembly optimizations with instructions not > supported by the CPU. I found out the hard way because the resultings > binaries threw SIGILL. Thankfully to total recall I actually debugged this live, 4 years ago on YouTube: https://www.youtube.com/watch?v=0gU5n3XhGOw It is actually in glibc’s preconfigure explicitly grep’ing for it to choose the submachine assembler optimizations: preconfigure:case "${machine}:${submachine}" in preconfigure: | grep -E "mcpu=|.machine" -m 1 \ preconfigure: | sed -e "s/.*machine //" -e "s/.*mcpu=\(.*\)\"/\1/“` While we could argue that the glibc configure code is also not particularly stelar, gcc should define the correct .machine ISA level like it did before the quoted change in 2019 and my patch submitted nearly 4 years ago fixes that ;-) You can also support the work I’m doing daily over at: https://patreon.com/renerebe Thank you so much, René >> Nothing the ".machine" directive does has anything to do with >> optimisations. Instead, it simply changes what architecture level is >> used for the following code. what specific instructions are supported >> mainly. > > I could probably go grep the glibc sources again 4 years later for you. > >>> --- a/gcc/testsuite/gcc.target/powerpc/pr97367.c.vanilla 2024-05-30 >>> 18:26:29.839784279 +0200 >>> +++ b/gcc/testsuite/gcc.target/powerpc/pr97367.c 2024-10-06 >>> 18:20:34.873818482 +0200 >>> @@ -0,0 +1,9 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-mdejagnu-cpu=G5" } */ >>> + >>> +int dummy () >>> +{ >>> + return 0; >>> +} >>> + >>> +/* { dg-final { scan-assembler "power4" } } */ >> >> Please explain (in the testcase, not here!) what this is meant to test! >> >> You probably want to say {\mpower4\M} instead, btw. Unless you want to >> match ".machine spower436" as well? > > That sounds indeed reasonable. I guess we can make it match .machine, too. > Updated test-case welcome ;-) -- ExactCODE GmbH, Lietzenburger Str. 42, DE-10789 Berlin http://exactcode.com | http://exactscan.com | http://ocrkit.com
Re: [PATCH 2/3] Enabled LRA for ia64.
Hi, > On Jun 12, 2024, at 13:01, Richard Biener wrote: > > On Wed, 12 Jun 2024, Rene Rebe wrote: >> >> gcc/ >>* config/ia64/ia64.cc: Enable LRA for ia64. >>* config/ia64/ia64.md: Likewise. >>* config/ia64/predicates.md: Likewise. > > That looks simple enough. I cannot find any copyright assignment on > file with the FSF so you probably want to contribute to GCC under > the DCO (see https://gcc.gnu.org/dco.html), in that case please post > patches with Signed-off-by: tags. If it helps for the future, I can apply for copyright assignment, too. > For this patch please state how you tested it, I assume you > bootstrapped GCC natively on ia64-linux and ran the testsuite. > I can find two gcc-testresult postings, one appearantly with LRA > and one without? Both from May: > > https://sourceware.org/pipermail/gcc-testresults/2024-May/816422.html > https://sourceware.org/pipermail/gcc-testresults/2024-May/816346.html Yes, that are the two I quoted in the patch cover letter. https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654321.html > somehow for example libstdc++ summaries were not merged, it might > be you do not have recent python installed on the system? Or you > didn't use contrib/test_summary to create those mails. It would be > nice to see the difference between LRA and not LRA in the testresults, > can you quote that? We usually cross-compile gcc, but also ran natively for the testsuite. Given the tests run quite long natively on the hardware we currently have, I summed the results them up in the cover letter. I would assume that shoudl be enough to include with a note the resulting kernel and user-space world was booted and worked without issues? If so, I’ll just resend with the additional information added. Thank you so much, René > Thanks, > Richard. > >> --- >> gcc/config/ia64/ia64.cc | 7 ++- >> gcc/config/ia64/ia64.md | 4 ++-- >> gcc/config/ia64/predicates.md | 2 +- >> 3 files changed, 5 insertions(+), 8 deletions(-) >> >> diff --git a/gcc/config/ia64/ia64.cc b/gcc/config/ia64/ia64.cc >> index ac3d56073ac..d189bfb2cb4 100644 >> --- a/gcc/config/ia64/ia64.cc >> +++ b/gcc/config/ia64/ia64.cc >> @@ -618,9 +618,6 @@ static const scoped_attribute_specs *const >> ia64_attribute_table[] = >> #undef TARGET_LEGITIMATE_ADDRESS_P >> #define TARGET_LEGITIMATE_ADDRESS_P ia64_legitimate_address_p >> >> -#undef TARGET_LRA_P >> -#define TARGET_LRA_P hook_bool_void_false >> - >> #undef TARGET_CANNOT_FORCE_CONST_MEM >> #define TARGET_CANNOT_FORCE_CONST_MEM ia64_cannot_force_const_mem >> >> @@ -1329,7 +1326,7 @@ ia64_expand_move (rtx op0, rtx op1) >> { >> machine_mode mode = GET_MODE (op0); >> >> - if (!reload_in_progress && !reload_completed && !ia64_move_ok (op0, op1)) >> + if (!lra_in_progress && !reload_completed && !ia64_move_ok (op0, op1)) >> op1 = force_reg (mode, op1); >> >> if ((mode == Pmode || mode == ptr_mode) && symbolic_operand (op1, >> VOIDmode)) >> @@ -1776,7 +1773,7 @@ ia64_expand_movxf_movrf (machine_mode mode, rtx >> operands[]) >> } >> } >> >> - if (!reload_in_progress && !reload_completed) >> + if (!lra_in_progress && !reload_completed) >> { >> operands[1] = spill_xfmode_rfmode_operand (operands[1], 0, mode); >> >> diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md >> index 698e302081e..d485acc0ea8 100644 >> --- a/gcc/config/ia64/ia64.md >> +++ b/gcc/config/ia64/ia64.md >> @@ -2318,7 +2318,7 @@ >> (match_operand:DI 3 "register_operand" "f")) >> (match_operand:DI 4 "nonmemory_operand" "rI"))) >>(clobber (match_scratch:DI 5 "=f"))] >> - "reload_in_progress" >> + "lra_in_progress" >> "#" >> [(set_attr "itanium_class" "unknown")]) >> >> @@ -3407,7 +3407,7 @@ >> (match_operand:DI 2 "shladd_operand" "n")) >> (match_operand:DI 3 "nonmemory_operand" "r")) >> (match_operand:DI 4 "nonmemory_operand" "rI")))] >> - "reload_in_progress" >> + "lra_in_progress" >> "* gcc_unreachable ();" >> "reload_completed" >> [(set (match_dup 0) (plus:DI (mult:DI (match_dup 1) (match_dup 2)) >> diff --git a/gcc/config/ia64/predicates.md b/gcc/config/ia64/predicates.md >> index 01a4effd339..85f5380e734 100644 >> --- a/gcc/config/ia64/predicates.md >> +++ b/gcc/config/ia64/predicates.md >> @@ -347,7 +347,7 @@ >> allows reload the opportunity to avoid spilling addresses to >> the stack, and instead simply substitute in the value from a >> REG_EQUIV. We'll split this up again when splitting the insn. */ >> - if (reload_in_progress || reload_completed) >> + if (lra_in_progress || reload_completed) >> return true; >> >> /* Some symbol types we allow to use with any offset. */ >> > > -- > Richard Biener > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg) -- ExactCODE GmbH, Lietzenburger Str. 42, DE-10789 Berlin http://exactcode.com | http://exactscan.com | htt
Re: [PATCH 2/3] Enabled LRA for ia64.
On Jun 12, 2024, at 15:00, Richard Biener wrote: > > On Wed, 12 Jun 2024, René Rebe wrote: > >> Hi, >> >>> On Jun 12, 2024, at 13:01, Richard Biener wrote: >>> >>> On Wed, 12 Jun 2024, Rene Rebe wrote: >>>> >>>> gcc/ >>>> * config/ia64/ia64.cc: Enable LRA for ia64. >>>> * config/ia64/ia64.md: Likewise. >>>> * config/ia64/predicates.md: Likewise. >>> >>> That looks simple enough. I cannot find any copyright assignment on >>> file with the FSF so you probably want to contribute to GCC under >>> the DCO (see https://gcc.gnu.org/dco.html), in that case please post >>> patches with Signed-off-by: tags. >> >> If it helps for the future, I can apply for copyright assignment, too. > > It's not a requirement - you as contributor get the choice under > which legal framework you contribute to GCC, for the DCO there's > the formal requirement of Signed-off-by: tags. > >>> For this patch please state how you tested it, I assume you >>> bootstrapped GCC natively on ia64-linux and ran the testsuite. >>> I can find two gcc-testresult postings, one appearantly with LRA >>> and one without? Both from May: >>> >>> https://sourceware.org/pipermail/gcc-testresults/2024-May/816422.html >>> https://sourceware.org/pipermail/gcc-testresults/2024-May/816346.html >> >> Yes, that are the two I quoted in the patch cover letter. >> >> https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654321.html >> >>> somehow for example libstdc++ summaries were not merged, it might >>> be you do not have recent python installed on the system? Or you >>> didn't use contrib/test_summary to create those mails. It would be >>> nice to see the difference between LRA and not LRA in the testresults, >>> can you quote that? >> >> We usually cross-compile gcc, but also ran natively for the testsuite. >> Given the tests run quite long natively on the hardware we currently >> have, I summed the results them up in the cover letter. I would assume >> that shoudl be enough to include with a note the resulting kernel and >> user-space world was booted and worked without issues? > > I specifically wondered if bootstrap with LRA enabled succeeds. > That needs either native or emulated hardware. I think we consider > ia64-linux a host platform and not only a cross compiler target. With “also ran” I meant to say we did both, our T2 framework usually boot-straps everything cross compiled, but also supports native in-system compilation. Frank also manually natively bootstrapped gcc and ran the testsuite natively on ia64. >> If so, I’ll just resend with the additional information added. > > For the LRA enablement patch the requirement is that patches should > state how they were tested - usually you'll see sth like > > Boostrapped and tested on x86_64-unknown-linux-gnu. > > In your case it was > > Cross-built from x86_64-linux(?) to ia64-linux, natively tested OK - I include the details in v2. > not sure how you exactly did this though? I've never tried > testing of a canadian-cross tree - did you copy the whole build > tree over from the x86 to the ia64 machine? IIRC the testsuite did not just work copying the canadian-cross. I did run the testsuite from the cross-compiled gcc using a ssh based dejagnu board config, but Frank also did fully bootstrap and ran the testsuite natively. Thanks, René > Thanks, > Richard. > >> Thank you so much, >> René >> >>> Thanks, >>> Richard. >>> >>>> --- >>>> gcc/config/ia64/ia64.cc | 7 ++- >>>> gcc/config/ia64/ia64.md | 4 ++-- >>>> gcc/config/ia64/predicates.md | 2 +- >>>> 3 files changed, 5 insertions(+), 8 deletions(-) >>>> >>>> diff --git a/gcc/config/ia64/ia64.cc b/gcc/config/ia64/ia64.cc >>>> index ac3d56073ac..d189bfb2cb4 100644 >>>> --- a/gcc/config/ia64/ia64.cc >>>> +++ b/gcc/config/ia64/ia64.cc >>>> @@ -618,9 +618,6 @@ static const scoped_attribute_specs *const >>>> ia64_attribute_table[] = >>>> #undef TARGET_LEGITIMATE_ADDRESS_P >>>> #define TARGET_LEGITIMATE_ADDRESS_P ia64_legitimate_address_p >>>> >>>> -#undef TARGET_LRA_P >>>> -#define TARGET_LRA_P hook_bool_void_false >>>> - >>>> #undef TARGET_CANNOT_FORCE_CONST_MEM >>>> #define TARGET_CANNOT_FORCE_CONST_MEM ia64_cannot_force_const_mem >>>> >>>
[BUG/PATCH] ppc64 g5 and cell optimizations result in .machine power7
Hi, Since reworking the rs6000 .machine output selection in commit e154242724b084380e3221df7c08fcdbd8460674 22 May 2019, compiling glibc with either G5 or cell results in power7 assembly optimizations to be chosen, which obviously crash with illegal instructions. This is because gcc's .machine output was accidentally changed due to OPTION_MASK_ALTIVEC only otherwise present in IBM CPUs since power7. Bug & patch already at: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97367 Best regards, René Rebe -- ExactCODE GmbH, Lietzenburger Str. 42, DE-10789 Berlin, https://exactcode.com https://exactscan.com | https://ocrkit.com | https://t2sde.org | https://rene.rebe.de