RE:Look for the diamond tools 2018/6/7 8:03:13
| Hi dear customer, Have a great day. My name is Peter. We are the facotory in China, produce stone abrasive and diamond tools. Our main products as below: 1. Cutting tools. 2. Grinding & polishing tools. 3. Other tools to process stone, concrete, ceramics etc. We also can make tools as your requirement. Do not hesitate to contact us, if you need our products. Thanks and best regards, Peter Ye | | 2018/6/7 8:03:19 | -- Quanzhou Danar Diamond Tools & Abrasive Co.,Ltd Email: i...@danartool.com Mobile Ph:+8613645919851(WhatsApp&Wechat) Skype:892721...@qq.com Website: www.danartool.com
[PING][PATCH, rs6000, C/C++] Fix PR target/86324: divkc3-1.c FAILs when compiling with -mabi=ieeelongdouble
I'd like to PING: https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01713.html I've included the entire patch below, since I missed the test cases in the original submission and Segher asked for some updated text for the hook documentation which I've included below. Peter gcc/ PR target/86324 * target.def (translate_mode_attribute): New hook. * targhooks.h (default_translate_mode_attribute): Declare. * targhooks.c (default_translate_mode_attribute): New function. * doc/tm.texi.in (TARGET_TRANSLATE_MODE_ATTRIBUTE): New hook. * doc/tm.texi: Regenerate. * config/rs6000/rs6000.c (TARGET_TRANSLATE_MODE_ATTRIBUTE): Define. (rs6000_translate_mode_attribute): New function. gcc/c-family/ PR target/86324 * c-attribs.c (handle_mode_attribute): Call new translate_mode_attribute target hook. gcc/testsuite/ PR target/86324 gcc.target/powerpc/pr86324-1.c: New test. gcc.target/powerpc/pr86324-2.c: Likewise. Index: gcc/target.def === --- gcc/target.def (revision 262159) +++ gcc/target.def (working copy) @@ -3310,6 +3310,16 @@ constants can be done inline. The funct HOST_WIDE_INT, (const_tree constant, HOST_WIDE_INT basic_align), default_constant_alignment) +DEFHOOK +(translate_mode_attribute, + "Define this hook if during mode attribute processing, the port should\n\ +translate machine_mode @var{mode} to another mode. For example, rs6000's\n\ +@code{KFmode}, when it is the same as @code{TFmode}.\n\ +\n\ +The default version of the hook returns that mode that was passed in.", + machine_mode, (machine_mode mode), + default_translate_mode_attribute) + /* True if MODE is valid for the target. By "valid", we mean able to be manipulated in non-trivial ways. In particular, this means all the arithmetic is supported. */ Index: gcc/targhooks.c === --- gcc/targhooks.c (revision 262159) +++ gcc/targhooks.c (working copy) @@ -393,6 +393,14 @@ default_mangle_assembler_name (const cha return get_identifier (stripped); } +/* The default implementation of TARGET_TRANSLATE_MODE_ATTRIBUTE. */ + +machine_mode +default_translate_mode_attribute (machine_mode mode) +{ + return mode; +} + /* True if MODE is valid for the target. By "valid", we mean able to be manipulated in non-trivial ways. In particular, this means all the arithmetic is supported. Index: gcc/targhooks.h === --- gcc/targhooks.h (revision 262159) +++ gcc/targhooks.h (working copy) @@ -72,6 +72,7 @@ extern void default_print_operand_addres extern bool default_print_operand_punct_valid_p (unsigned char); extern tree default_mangle_assembler_name (const char *); +extern machine_mode default_translate_mode_attribute (machine_mode); extern bool default_scalar_mode_supported_p (scalar_mode); extern bool default_libgcc_floating_mode_supported_p (scalar_float_mode); extern opt_scalar_float_mode default_floatn_mode (int, bool); Index: gcc/doc/tm.texi.in === --- gcc/doc/tm.texi.in (revision 262159) +++ gcc/doc/tm.texi.in (working copy) @@ -3336,6 +3336,8 @@ stack. @hook TARGET_REF_MAY_ALIAS_ERRNO +@hook TARGET_TRANSLATE_MODE_ATTRIBUTE + @hook TARGET_SCALAR_MODE_SUPPORTED_P @hook TARGET_VECTOR_MODE_SUPPORTED_P Index: gcc/doc/tm.texi === --- gcc/doc/tm.texi (revision 262159) +++ gcc/doc/tm.texi (working copy) @@ -4255,6 +4255,14 @@ hook returns true for both @code{ptr_mod Define this to return nonzero if the memory reference @var{ref} may alias with the system C library errno location. The default version of this hook assumes the system C library errno location is either a declaration of type int or accessed by dereferencing a pointer to int. @end deftypefn +@deftypefn {Target Hook} machine_mode TARGET_TRANSLATE_MODE_ATTRIBUTE (machine_mode @var{mode}) +Define this hook if during mode attribute processing, the port should +translate machine_mode @var{mode} to another mode. For example, rs6000's +@code{KFmode}, when it is the same as @code{TFmode}. + +The default version of the hook returns that mode that was passed in. +@end deftypefn + @deftypefn {Target Hook} bool TARGET_SCALAR_MODE_SUPPORTED_P (scalar_mode @var{mode}) Define this to return nonzero if the port is prepared to handle insns involving scalar mode @var{mode}. For a scalar mode to be Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 262159) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -1802,6 +1802,9 @@ static const struct attribute_s
Re: [PING][PATCH, rs6000, C/C++] Fix PR target/86324: divkc3-1.c FAILs when compiling with -mabi=ieeelongdouble
On 7/5/18 2:36 PM, Jeff Law wrote: > On 07/02/2018 03:50 PM, Peter Bergner wrote: >> I'd like to PING: >> >> https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01713.html >> >> I've included the entire patch below, since I missed the test cases in >> the original submission and Segher asked for some updated text for the >> hook documentation which I've included below. > > OK. Is that an ok for GCC 8 as well which I asked for in the initial patch submission? Peter
Re: [PATCHv2] Change the library search path when using --with-advance-toolchain
On 10/5/19 12:20 PM, Segher Boessenkool wrote: > On Fri, Oct 04, 2019 at 06:31:34PM -0300, Tulio Magno Quites Machado Filho > wrote: >> Remove all -L directories from LINK_OS_EXTRA_SPEC32 and >> LINK_OS_EXTRA_SPEC64 so that user directories specified at >> build time have higher preference over the advance toolchain libraries. >> >> Set MD_STARTFILE_PREFIX to $prefix/lib/ and MD_STARTFILE_PREFIX_1 to >> $at/lib/ so that a compiler library has preference over the Advance >> Toolchain libraries. > > This is fine, approved for all branches. Thank you! And thanks to Mike > for the testing. I've committed the back ports to the FSF 7, 8 and 9 branches now after clean bootstraps and regtesting. Peter
[PATCH, rs6000][committed] Fix PR92090: Allow MODE_PARTIAL_INT modes for integer constant input operands.
Before, LRA, we have an insn that sets a TImode pseudo with an integer constant and a following insn that copies that TImode pseudo to a PTImode pseudo. During LRA spilling, we generate a new insn that sets a PTImode pseudo to that constant directly and we ICE because we do not recognize that as a valid insn. The fix below fixes the ICE reported in PR92090 by modifying our input_operand predicate to allow MODE_PARTIAL_INT modes for integer constant input operands. This patch (preapproved by Segher) passed bootstrap and regtesting with no errors. Committed. Peter gcc/ PR other/92090 * config/rs6000/predicates.md (input_operand): Allow MODE_PARTIAL_INT modes for integer constants. gcc/testsuite/ PR other/92090 * gcc.target/powerpc/pr92090.c: New test. Index: gcc/config/rs6000/predicates.md === --- gcc/config/rs6000/predicates.md (revision 277861) +++ gcc/config/rs6000/predicates.md (working copy) @@ -1047,8 +1047,7 @@ (define_predicate "input_operand" return 1; /* Allow any integer constant. */ - if (GET_MODE_CLASS (mode) == MODE_INT - && CONST_SCALAR_INT_P (op)) + if (SCALAR_INT_MODE_P (mode) && CONST_SCALAR_INT_P (op)) return 1; /* Allow easy vector constants. */ Index: gcc/testsuite/gcc.target/powerpc/pr92090.c === --- gcc/testsuite/gcc.target/powerpc/pr92090.c (nonexistent) +++ gcc/testsuite/gcc.target/powerpc/pr92090.c (working copy) @@ -0,0 +1,42 @@ +/* { dg-do compile } */ +/* { dg-options "-mdejagnu-cpu=power8 -Os -mbig" } */ + +/* Verify that we don't ICE. */ + +_Atomic int a; +_Atomic long double b, c; +int j; +void foo (void); +void bar (int, int, int, int); + +void +bug (void) +{ + b = 1; + int d, e, f, g; + while (a) +; + for (int h = 0; h < 1; h++) +{ + double i = b /= 3; + foo (); + if (i) + { + if (i == 1) + d++; + e++; + b = 0; + } + else + { + if (i == 2) + f++; + g++; + b = 1; + } +} + bar (d, e, f, g); + c = 1; + for (int h; h; h++) +j = 0; +}
[PATCH] rs6000: Fix PR93136, gcc.dg/vmx/ops.c and several other test break after r279772
My fix to PR92923 seems to have caused the vmx/ops.c and vsx-vector-6.p*.c test failures. The ops.c issue is we need a new option to quiet a warning we didn't see when we were emitting VIEW_CONVERT_EXPRs. The other test cases just need a slight adjustment to some of their counts. However, we were seeing double/triple/... counting due to using "xxland" instead of {\mxxland\M} for our regex, so that was also counting xxlandc too. I adjusted all of the insn regexs to use \m and \M to fix that. I must say that the vsx-vector-6.p*.c tests are fragile! They're so big and reusing source operands, that the compiler can sometimes optimize several builtin calls together, meaning we don't see as many vsx instructions as we have calls to builtins. I didn't bother trying to fix that, since that is a lot more work! I just wanted to vent! :-) I've confirmed the updated test cases now pass on both BE and LE. Ok for trunk? Peter PR target/92923 PR target/93136 * gcc.dg/vmx/ops.c: Add -flax-vector-conversions to dg-options. * gcc.target/powerpc/vsx-vector-6.p7.c: Adjust scan-assembler-times regex directives. Adjust expected instruction counts. * gcc.target/powerpc/vsx-vector-6.p8.c: Likewise. * gcc.target/powerpc/vsx-vector-6.p9.c: Likewise. Index: gcc/testsuite/gcc.dg/vmx/ops.c === --- gcc/testsuite/gcc.dg/vmx/ops.c (revision 279852) +++ gcc/testsuite/gcc.dg/vmx/ops.c (working copy) @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mno-vsx -Wno-deprecated" } */ +/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mno-vsx -Wno-deprecated -flax-vector-conversions" } */ #include #include extern char * *var_char_ptr; Index: gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p7.c === --- gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p7.c (revision 279852) +++ gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p7.c (working copy) @@ -5,37 +5,38 @@ /* Expected instruction counts for Power 7 */ -/* { dg-final { scan-assembler-times "xvabsdp" 1 } } */ -/* { dg-final { scan-assembler-times "xvadddp" 1 } } */ -/* { dg-final { scan-assembler-times "xxlnor" 5 } } */ +/* { dg-final { scan-assembler-times {\mxvabsdp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxvadddp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxxlnor\M} 5 } } */ /* { dg-final { scan-assembler-times {\mxvcmpeqdp\s} 1 } } */ /* { dg-final { scan-assembler-times {\mxvcmpeqdp\.\s} 5 } } */ /* { dg-final { scan-assembler-times {\mxvcmpgtdp\s} 2 } } */ /* { dg-final { scan-assembler-times {\mxvcmpgtdp\.\s} 5 } } */ /* { dg-final { scan-assembler-times {\mxvcmpgedp\s} 1 } } */ /* { dg-final { scan-assembler-times {\mxvcmpgedp\.\s} 6 } } */ -/* { dg-final { scan-assembler-times "xvrdpim" 1 } } */ -/* { dg-final { scan-assembler-times "xvmaddadp" 1 } } */ -/* { dg-final { scan-assembler-times "xvmsubadp" 1 } } */ -/* { dg-final { scan-assembler-times "xvsubdp" 1 } } */ -/* { dg-final { scan-assembler-times "xvmaxdp" 1 } } */ -/* { dg-final { scan-assembler-times "xvmindp" 1 } } */ -/* { dg-final { scan-assembler-times "xvmuldp" 1 } } */ -/* { dg-final { scan-assembler-times "vperm" 2 } } */ -/* { dg-final { scan-assembler-times "xvrdpic" 2 } } */ -/* { dg-final { scan-assembler-times "xvsqrtdp" 1 } } */ -/* { dg-final { scan-assembler-times "xvrdpiz" 1 } } */ -/* { dg-final { scan-assembler-times "xvmsubasp" 1 } } */ -/* { dg-final { scan-assembler-times "xvnmaddasp" 1 } } */ -/* { dg-final { scan-assembler-times "xvnmaddadp" 1 } } */ -/* { dg-final { scan-assembler-times "xvnmsubadp" 1 } } */ -/* { dg-final { scan-assembler-times "vmsumshs" 2 } } */ -/* { dg-final { scan-assembler-times "xxland" 13 } } */ -/* { dg-final { scan-assembler-times "xxlxor" 2 } } */ -/* { dg-final { scan-assembler-times "xxsel" 4 } } */ -/* { dg-final { scan-assembler-times "xvrdpip" 1 } } */ -/* { dg-final { scan-assembler-times "xvdivdp" 1 } } */ -/* { dg-final { scan-assembler-times "xvrdpi" 7 } } */ +/* { dg-final { scan-assembler-times {\mxvrdpim\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxvmaddadp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxvmsubadp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxvsubdp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxvmaxdp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxvmindp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxvmuldp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvperm\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxv
Re: [PATCH] rs6000: Fix PR93136, gcc.dg/vmx/ops.c and several other test break after r279772
On 1/9/20 4:51 PM, Segher Boessenkool wrote: > On Thu, Jan 09, 2020 at 01:44:59PM -0600, Peter Bergner wrote: >> The other test cases just need a slight adjustment to some of their >> counts. > > What were the changes? Or, I'll just trust you looked at it and it is > okay :-) I did look. Most of the changes brought us closer to one vsx instruction for each builtin than what we had before. We still have some optimization across builtin functions occurring, so we still don't get a one-to-one mapping of builtin call to vsx instruction. >> I must say that the vsx-vector-6.p*.c tests are fragile! They're so big and >> reusing source operands, that the compiler can sometimes optimize several >> builtin calls together, meaning we don't see as many vsx instructions as >> we have calls to builtins. I didn't bother trying to fix that, since that >> is a lot more work! I just wanted to vent! :-) > > Splitting out separate functions in the testcase shouldn't be so much > work? Or am I too optimistic :-) > > This should make the test a good deal less prone to random changes in > output caused by the lunar cycle. Ok, let me take a stab at rewriting the tests to be more similar to the pr92923-[12].c and see how much work that is. I do agree that it would be nice not having the insn counts be so fragile. Peter
Re: [PATCH] rs6000: Fix PR93136, gcc.dg/vmx/ops.c and several other test break after r279772
On 1/9/20 6:29 PM, Peter Bergner wrote: > On 1/9/20 4:51 PM, Segher Boessenkool wrote: >> Splitting out separate functions in the testcase shouldn't be so much >> work? Or am I too optimistic :-) >> >> This should make the test a good deal less prone to random changes in >> output caused by the lunar cycle. > > Ok, let me take a stab at rewriting the tests to be more similar to the > pr92923-[12].c and see how much work that is. I do agree that it would > be nice not having the insn counts be so fragile. Sorry for taking so long to get back to this. I split the functions into smaller chunks, but didn't need to go all the way to one function per builtin call. Does this look better? The POWER7 and POWER8 files are identical, except for their -mcpu= option. The POWER9 file has some slight differences to the other files, because of new instructions added that we make sure of. I've confirmed the updated test cases now pass on both LE and BE. Ok for trunk? Peter PR target/93136 * gcc.dg/vmx/ops.c: Add -flax-vector-conversions to dg-options. * gcc.target/powerpc/vsx-vector-6.h: Split tests into smaller functions. * gcc.target/powerpc/vsx-vector-6.p7.c: Adjust scan-assembler-times regex directives. Adjust expected instruction counts. * gcc.target/powerpc/vsx-vector-6.p8.c: Likewise. * gcc.target/powerpc/vsx-vector-6.p9.c: Likewise. diff --git a/gcc/testsuite/gcc.dg/vmx/ops.c b/gcc/testsuite/gcc.dg/vmx/ops.c index 6aff80bbd1a..4a0650c06bd 100644 --- a/gcc/testsuite/gcc.dg/vmx/ops.c +++ b/gcc/testsuite/gcc.dg/vmx/ops.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mno-vsx -Wno-deprecated" } */ +/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mno-vsx -Wno-deprecated -flax-vector-conversions" } */ #include #include extern char * *var_char_ptr; diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6.h b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6.h index a891b64e6fa..0106e8d2901 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6.h +++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6.h @@ -1,167 +1,154 @@ -/* This test code is included into vsx-vector-6-be.c and vsx-vector-6-le.c. - The two files have the tests for the number of instructions generated for - LE versus BE. */ +/* This test code is included into vsx-vector-6.p7.c, vsx-vector-6.p8.c + and vsx-vector-6.p9.c. The .c files have the tests for the number + of instructions generated for each cpu type. */ #include -void foo (vector double *out, vector double *in, vector long *p_l, vector bool long *p_b, - vector unsigned char *p_uc, int *i, vector float *p_f, - vector bool char *outbc, vector bool int *outbi, - vector bool short *outbsi, vector int *outsi, - vector unsigned int *outui, vector signed char *outsc, - vector unsigned char *outuc) +typedef struct { + vector double d; + vector float f; + vector long sl; + vector int si; + vector short ss; + vector char sc; + vector unsigned int ui; + vector unsigned short int us; + vector unsigned char uc; + vector bool long long bll; + vector bool long bl; + vector bool int bi; + vector bool short bs; + vector bool char bc; +} opnd_t; + +void +func_1op (opnd_t *dst, opnd_t *src) { - vector double in0 = in[0]; - vector double in1 = in[1]; - vector double in2 = in[2]; - vector long inl = *p_l; - vector bool long inb = *p_b; - vector bool long long inbl0; - vector bool long long inbl1; - vector unsigned char uc = *p_uc; - vector float inf0; - vector float inf1; - vector float inf2; - vector char inc0; - vector char inc1; - vector bool char inbc0; - vector bool char inbc1; - vector bool short inbs0; - vector bool short inbs1; - vector bool int inbi0; - vector bool int inbi1; - vector signed short int inssi0, inssi1; - vector unsigned short int inusi0, inusi1; - vector signed int insi0, insi1; - vector unsigned int inui0, inui1; - vector unsigned char inuc0, inuc1; - - *out++ = vec_abs (in0); - *out++ = vec_add (in0, in1); - *out++ = vec_and (in0, in1); - *out++ = vec_and (in0, inb); - *out++ = vec_and (inb, in0); - *out++ = vec_andc (in0, in1); - *out++ = vec_andc (in0, inb); - *out++ = vec_andc (inb, in0); - *out++ = vec_andc (inbl0, in0); - *out++ = vec_andc (in0, inbl0); - - *out++ = vec_ceil (in0); - *p_b++ = vec_cmpeq (in0, in1); - *p_b++ = vec_cmpgt (in0, in1); - *p_b++ = vec_cmpge (in0, in1); - *p_b++ = vec_cmplt (in0, in1); - *p_b++ = vec_cmple (in0, in1); - *out++ = vec_div (in0, in1); - *out++ = vec_floor (in0); - *out++ = vec_madd (in0, in1, in2); - *out++ = vec_msub (in0, in1, in2); - *out++ = vec_max (in0, in1); - *out++ = vec_min (in0, in1); - *out++ = vec_msub (in0, in1, in2); - *out++ = vec_mul (in0, in1); - *out++ = vec_nearbyint (in0); - *out++ = vec_nmadd
Re: [PATCH] rs6000: Fix PR93136, gcc.dg/vmx/ops.c and several other test break after r279772
On 2/8/20 11:13 AM, Segher Boessenkool wrote: >> +/* { dg-final { scan-assembler-times {\mvperm[r]?\M} 1 } } */ > > You can write this without the square brackets, fwiw. > > Okay for trunk. Thank you! Ok, I pushed this change with your suggestion to remove the square brackets on the above regex. Thanks. Peter
Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins
On 12/4/19 1:16 PM, Segher Boessenkool wrote: > For future patches: it is much easier to review if you make the big, > mechanical move a separate (earlier) patch. Will do. >> I have also >> included a small patch to disable running the powerpc/dfp/ tests even for >> powerpc*-linux when --disable-decimal-float is used. > > What is the reason for that? [snip] > This isn't run from powerpc.exp, so it needs to still do that first check. > And it's up to the Darwin maintainers whether they want that second part > (there are many more tests and testsuites that disable *-darwin* while > that isn't really necessary). > > Why do you need/want the check_effective_target_dfp test? If for example > this is to prevent ICEs, that is no good, that is *hiding* problems. > > I suspect it is to stop the testsuite from complaining if you configure > with --disable-decimal-float. What is different there then, compared to > targets that do actually not support decimal float? Well, yes. I saw those tests being run for my --disable-decimal-float runs, which resulted in FAILs for all of those tests. They had ICE's on unpatched trunk and FAILed gracefully using my patch, but they all still FAILed, since these are DFP tests and DFP is disabled. There's no sense in running these tests when DFP is disabled, either manually due to --disable-decimal-float or implicitly because of the specific target. Why isn't just testing check_effective_target_dfp enough to disable the tests on Darwin, --disable-decimal-float, etc.? That would seem to imply that one of those targets we're testing against enables DFP, but somehow we don't want to run the tests or they all still FAIL for some reason??? > Okay for trunk minus the changes to powerpc-dfp.exp (we can iterate on > that). Thanks! Ok, I committed that part of the patch. Thanks! Peter
Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins
On 12/4/19 2:47 PM, Iain Sandoe wrote: > Peter Bergner wrote: >> >> Why isn't just testing check_effective_target_dfp enough to disable the >> tests on Darwin, --disable-decimal-float, etc.? > > … It should be a better solution - I will confirm this. Thanks for checking. The nice thing about this solution (if it works for Darwin and the other targets) is that if you eventually add DFP support, then these tests will just automatically start being run for you. Otherwise, you'll have to hunt through the DFP tests looking for a dg-skip-if darwin test. Peter
Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins
On 12/4/19 2:50 PM, Segher Boessenkool wrote: > It would be nice to keep *some* dfp test(s) to make sure we don't ICE. > If we disabled all such tests already, like with this patch, we wouldn't > have ICEd or seen this problem. That can be a separate test of course > (and could be outside gcc.target/powerpc/dfp/). Sure, I can add a test in gcc.target/powerpc/ that uses both a builtin and an overloaded builtin to make sure we don't ICE when DFP is disabled. > OTOH, if you add this check, we can lose the > > /* { dg-require-effective-target powerpc_p9vector_ok } */ > /* { dg-skip-if "" { powerpc*-*-aix* } } */ > > from all the dtstsfi-* tests, etc. (Well, no, need to keep the p9 part). Agreed on not needing the dg-skip-if tests. Not only on the powerpc/dfp/ tests, but we should be able to remove them from the DFP tests that are in gcc.target/powerpc/ too. > Making such changes to the testsuite needs testing on *all* subtargets :-( Right. I'll come up with a patch and hopefully Iain and David can test on Darwin and AIX and I can test on Linux with --enable-decimal-float and --disable-decimal-float. That should cover the major subtargets and if it works there, I'd expect it to work on the minor subtargets too. >> Why isn't just testing check_effective_target_dfp enough to disable the >> tests on Darwin, --disable-decimal-float, etc.? That would seem to imply >> that one of those targets we're testing against enables DFP, but somehow >> we don't want to run the tests or they all still FAIL for some reason??? > > It should be enough. Currently we just directly skip all tests on OSes > that do not support DFP, but that is not as nice as the effective target. Right, if using the effective target test, any target that adds DFP support in the future will automatically get these tests runs for it, which is what we want. Peter
Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins
On 12/5/19 2:44 AM, Iain Sandoe wrote: > Segher Boessenkool wrote: >> On Wed, Dec 04, 2019 at 03:53:30PM -0600, Peter Bergner wrote: >>> Sure, I can add a test in gcc.target/powerpc/ that uses both a builtin >>> and an overloaded builtin to make sure we don't ICE when DFP is disabled. >> >> Great, thanks! > > so that would be a/some dg-do compile test(s), then? > (presumably, gated on effective_target_dfp … see below) Yes, only a dg-do compile test. For DFP enabled targets, they shouldn't see any errors at all and for DFP disabled targets, I'd insert some dg-error checks gated on ![check_effective_target_dfp] looking for the "error: decimal floating-point not supported for this target" errors. Any ICE would flag a real test case error. > (on r278957, with the system assembler which doesn’t support DFP insns > and gcc.target dfp.exp not yet renamed) > > # Skip these tests for targets that don't support this extension. > if { ![check_effective_target_dfp] } { > return; > } > > Works on Darwin to skip the entire set (not too much of a surprise, since > it’s used for the GCC and G++ cases already). Great, thanks for checking. > It’s possible (even likely) that Darwin can be built with an assembler that > supports DFP instructions, even tho it has no OS support. Usually, my policy > is that we would do compile tests then, since that excercises more code. My --disable-decimal-float run also worked and I had an assembler that supports all of the DFP insns, so I'd expect it to work for you too. The check_effective_target_dfp tests really is checking for whether the DFP modes exist in the compiler and that isn't gated on the assembler... at least fully. > ..but then we need to gate run tests on the availability of runtime support. There exists check_effective_target_dfprt for that. > I see that the GCC and G++ testsuites have also…. > > # If the decimal float is supported in the compiler but not yet in the > # runtime, treat all tests as compile-only. > global dg-do-what-default > set save-dg-do-what-default ${dg-do-what-default} > if { ![check_effective_target_dfprt] } { > verbose "dfp.exp: runtime support for decimal float does not exist" 2 > set dg-do-what-default compile > } else { > verbose "dfp.exp: runtime support for decimal float exists, use it" 2 > set dg-do-what-default run > } > verbose "dfp.exp: dg-do-what-default is ${dg-do-what-default}” 2 > > Do you think there’s any merit in doing something like that here too? Adding this to powerpc/dfp/dfp.exp won't do anything, since all of the powerpc/dfp/ tests are dg-do compile tests. > (I guess a finer-grained alternative is check_effective_target_dfprt in any > run-tests) > > .. or I can just force a false return from effective_target_dfp as we > do for other cases where assembler support does not imply system > support. I've already verified that if decimal float is disabled, but you do have an assembler that supports dfp insns, check_effective_target_dfp still works correctly. Peter
Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins
On 12/6/19 5:12 PM, Segher Boessenkool wrote: > On Thu, Dec 05, 2019 at 08:44:57AM +, Iain Sandoe wrote: >> .. or I can just force a false return from effective_target_dfp as we >> do for other cases where assembler support does not imply system >> support. > > That's what I would do, yes. I'm not sure that's necessary. DFP enablement isn't triggered by assembler support. Just the gcc/configure fragment (ignoring manually using --enable-decimal-float): case $target in powerpc*-*-linux* | i?86*-*-linux* | x86_64*-*-linux* | s390*-*-linux* | \ i?86*-*-elfiamcu | i?86*-*-gnu* | x86_64*-*-gnu* | \ i?86*-*-mingw* | x86_64*-*-mingw* | \ i?86*-*-cygwin* | x86_64*-*-cygwin*) enable_decimal_float=yes ;; *) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: decimal float is not supported for this target, ignored" >&5 $as_echo "$as_me: WARNING: decimal float is not supported for this target, ignored" >&2;} enable_decimal_float=no ;; So I don't think there is anything to do wrt Darwin here. Peter
Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins
On 12/4/19 5:03 PM, Segher Boessenkool wrote: > On Wed, Dec 04, 2019 at 03:53:30PM -0600, Peter Bergner wrote: >> Right. I'll come up with a patch and hopefully Iain and David can test >> on Darwin and AIX and I can test on Linux with --enable-decimal-float >> and --disable-decimal-float. That should cover the major subtargets and >> if it works there, I'd expect it to work on the minor subtargets too. Ok, how about the patch below? If Iain and David could test this on Darwin and AIX respectively, that would be great. I'm currently testing this on powerpc64le-linux, with and without --disable-decimal-float. The pr92661.c test case is the DFP test case you wanted added to make sure we do not ICE, even when DFP is disabled. The dfp-[dt]d*.c changes are due to me seeing them being run (and FAILing) on my --disable-decimal-float runs. Clearly, they shouldn't be run when DFP is disabled. All of the powerpc/dfp/* tests had powerpc*-*-* target tests, but that is covered by the dfp.exp target tests, so I removed them along with the now unneeded dg-skip-if AIX tests. If dg-do compile is the default, do we want to just remove that whole line? How is this looking? Peter PR bootstrap/92661 * gcc.target/powerpc/pr92661.c: New test. * gcc.target/powerpc/dfp-dd.c: Add dg-require-effective-target dfp_hw. Remove unneeded powerpc_fprs test. * gcc.target/powerpc/dfp-td.c: Likewise. * gcc.target/powerpc/dfp-dd-2.c: Add dg-require-effective-target dfp. * gcc.target/powerpc/dfp-td-2.c: Likewise. * gcc.target/powerpc/dfp-td-3.c: Likewise. * gcc.target/powerpc/dfp/dfp.exp: Remove rs6000-*-* and powerpc*-*-darwin* target tests. Add check_effective_target_dfp test. * gcc.target/powerpc/dfp/dtstsfi-0.c: Remove unneeded target test. Remove unneeded dg-skip-if. * gcc.target/powerpc/dfp/dtstsfi-1.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-10.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-11.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-12.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-13.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-14.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-15.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-16.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-17.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-18.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-19.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-2.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-20.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-21.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-22.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-23.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-24.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-25.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-26.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-27.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-28.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-29.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-3.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-30.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-31.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-32.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-33.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-34.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-35.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-36.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-37.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-38.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-39.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-4.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-40.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-41.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-42.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-43.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-44.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-45.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-46.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-47.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-48.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-49.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-5.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-50.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-51.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-52.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-53.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-54.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-55.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-56.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-57.c: Likewise. * gcc.target/powerpc/dfp/dtstsfi-58.c: Likewise
Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins
On 12/10/19 12:27 PM, Peter Bergner wrote: > Ok, how about the patch below? If Iain and David could test this on Darwin > and AIX respectively, that would be great. I'm currently testing this on > powerpc64le-linux, with and without --disable-decimal-float. So my --enable-decimal-float builds showed no regressions or differences in the testsuite results (as expected). My --disable-decimal-float runs showed no new regressions. The only differences in the testsuite results were due to all of the FAILs on the base compiler build that are fixed with the patch. Peter
[PATCH] rs6000: Fix PR92923, __builtin_vec_xor() causes subregs to be used when not using V4SImode vectors
PR92923 shows a problem where builtin function usage is causing performance problems due to unneeded subreg usage. These are being caused by the front- end emitting unneeded VIEW_CONVERT_EXPR's on the builtin functions operands. These in tern where caused by a lack of overloaded builtin entries in the rs6000 backend. The following patch adds just enough new definitions to match what our vector documentation says we must support. I have also added new test cases so that we will catch any regressions in this area. This passed bootstrap and regression testing with no errors. Ok for trunk? This is a bug on the release branches too, but given how big this patch ended up being, I don't think we want to backport this. Peter gcc/ PR target/92923 * config/rs6000/rs6000-builtin.def (VAND, VANDC, VNOR, VOR, VXOR): Delete. (EQV_V16QI_UNS, EQV_V8HI_UNS, EQV_V4SI_UNS, EQV_V2DI_UNS, EQV_V1TI_UNS, NAND_V16QI_UNS, NAND_V8HI_UNS, NAND_V4SI_UNS, NAND_V2DI_UNS, NAND_V1TI_UNS, ORC_V16QI_UNS, ORC_V8HI_UNS, ORC_V4SI_UNS, ORC_V2DI_UNS, ORC_V1TI_UNS, VAND_V16QI_UNS, VAND_V16QI, VAND_V8HI_UNS, VAND_V8HI, VAND_V4SI_UNS, VAND_V4SI, VAND_V2DI_UNS, VAND_V2DI, VAND_V4SF, VAND_V2DF, VANDC_V16QI_UNS, VANDC_V16QI, VANDC_V8HI_UNS, VANDC_V8HI, VANDC_V4SI_UNS, VANDC_V4SI, VANDC_V2DI_UNS, VANDC_V2DI, VANDC_V4SF, VANDC_V2DF, VNOR_V16QI_UNS, VNOR_V16QI, VNOR_V8HI_UNS, VNOR_V8HI, VNOR_V4SI_UNS, VNOR_V4SI, VNOR_V2DI_UNS, VNOR_V2DI, VNOR_V4SF, VNOR_V2DF, VOR_V16QI_UNS, VOR_V16QI, VOR_V8HI_UNS, VOR_V8HI, VOR_V4SI_UNS, VOR_V4SI, VOR_V2DI_UNS, VOR_V2DI, VOR_V4SF, VOR_V2DF, VXOR_V16QI_UNS, VXOR_V16QI, VXOR_V8HI_UNS, VXOR_V8HI, VXOR_V4SI_UNS, VXOR_V4SI, VXOR_V2DI_UNS, VXOR_V2DI, VXOR_V4SF, VXOR_V2DF): Add definitions. * config/rs6000/rs6000-call.c (altivec_overloaded_builtins) : Remove. : Add definitions. : Change unsigned usages to use the new *_UNS definition names. (rs6000_gimple_fold_builtin) : Use new definition names. (builtin_function_type) : Handle unsigned builtins. gcc/testsuite/ PR target/92923 * gcc.target/powerpc/pr92923-1.c: New test. * gcc.target/powerpc/pr92923-2.c: Likewise. Index: gcc/config/rs6000/rs6000-builtin.def === --- gcc/config/rs6000/rs6000-builtin.def(revision 279479) +++ gcc/config/rs6000/rs6000-builtin.def(working copy) @@ -1000,8 +1000,26 @@ BU_ALTIVEC_2 (VADDUHS, "vadduhs", BU_ALTIVEC_2 (VADDSHS, "vaddshs",CONST, altivec_vaddshs) BU_ALTIVEC_2 (VADDUWS, "vadduws",CONST, altivec_vadduws) BU_ALTIVEC_2 (VADDSWS, "vaddsws",CONST, altivec_vaddsws) -BU_ALTIVEC_2 (VAND, "vand", CONST, andv4si3) -BU_ALTIVEC_2 (VANDC, "vandc", CONST, andcv4si3) +BU_ALTIVEC_2 (VAND_V16QI_UNS, "vand_v16qi_uns",CONST, andv16qi3) +BU_ALTIVEC_2 (VAND_V16QI, "vand_v16qi",CONST, andv16qi3) +BU_ALTIVEC_2 (VAND_V8HI_UNS, "vand_v8hi_uns", CONST, andv8hi3) +BU_ALTIVEC_2 (VAND_V8HI, "vand_v8hi", CONST, andv8hi3) +BU_ALTIVEC_2 (VAND_V4SI_UNS, "vand_v4si_uns", CONST, andv4si3) +BU_ALTIVEC_2 (VAND_V4SI, "vand_v4si", CONST, andv4si3) +BU_ALTIVEC_2 (VAND_V2DI_UNS, "vand_v2di_uns", CONST, andv2di3) +BU_ALTIVEC_2 (VAND_V2DI, "vand_v2di", CONST, andv2di3) +BU_ALTIVEC_2 (VAND_V4SF, "vand_v4sf", CONST, andv4sf3) +BU_ALTIVEC_2 (VAND_V2DF, "vand_v2df", CONST, andv2df3) +BU_ALTIVEC_2 (VANDC_V16QI_UNS,"vandc_v16qi_uns",CONST, andcv16qi3) +BU_ALTIVEC_2 (VANDC_V16QI,"vandc_v16qi", CONST, andcv16qi3) +BU_ALTIVEC_2 (VANDC_V8HI_UNS, "vandc_v8hi_uns",CONST, andcv8hi3) +BU_ALTIVEC_2 (VANDC_V8HI, "vandc_v8hi",CONST, andcv8hi3) +BU_ALTIVEC_2 (VANDC_V4SI_UNS, "vandc_v4si_uns",CONST, andcv4si3) +BU_ALTIVEC_2 (VANDC_V4SI, "vandc_v4si",CONST, andcv4si3) +BU_ALTIVEC_2 (VANDC_V2DI_UNS, "vandc_v2di_uns",CONST, andcv2di3) +BU_ALTIVEC_2 (VANDC_V2DI, "vandc_v2di",CONST, andcv2di3) +BU_ALTIVEC_2 (VANDC_V4SF, "vandc_v4sf",CONST, andcv4sf3) +BU_ALTIVEC_2 (VANDC_V2DF, "vandc_v2df",CONST, andcv2df3) BU_ALTIVEC_2 (VAVGUB,"vavgub", CONST, uavgv16qi3_ceil) BU_ALTIVEC_2 (VAVGSB,"vavgsb", CONST, avgv16qi3_ceil) BU_ALTIVEC_2 (VAVGUH,"vavguh", CONST, uavgv8hi3_ceil) @@ -1057,8 +1075,27 @@ BU_ALTIVEC_2 (VMULOUH, "vmulouh", BU_ALTIVEC_2 (VMULOSH,
Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins
On 12/18/19 8:15 AM, Segher Boessenkool wrote: >> +/* { dg-do compile { target { powerpc*-*-* } } } */ >> +/* { dg-options "-w -O2 -mdejagnu-cpu=power9" } */ > > You don't need that target clause in gcc.target/powerpc (and dg-do compile > is the default, but having it explicit is also fine of course). I think leaving the bare dg-do compile (ie, no target) is nice, for newbies who don't know that dg-do compile is the default. >> --- gcc/testsuite/gcc.target/powerpc/dfp-dd.c(revision 278980) >> +++ gcc/testsuite/gcc.target/powerpc/dfp-dd.c(working copy) >> @@ -1,6 +1,7 @@ >> /* Test generation of DFP instructions for POWER6. */ >> /* Origin: Janis Johnson */ >> -/* { dg-do compile { target { powerpc*-*-linux* && powerpc_fprs } } } */ >> +/* { dg-do compile { target { powerpc*-*-linux* } } } */ >> +/* { dg-require-effective-target dfp_hw } */ > > You can remove powerpc_fprs now because it became redundant? Cool. Right, hard dfp support requires we have hard float support. > But dfp_hw is the wrong conditions for a dg-do compile test. Ok, yes. Looking closer, that dfp_hw is a runtime test and not what we want. I'll change this to using "hard_dfp" which is a compile time test. > Nice cleanups! Please fix that dfp_hw thing, and then, okay for trunk, > Thanks! Will do, thanks. I'll commit this after making these changes and rerunning the updated test cases. Peter
Re: [PATCH] rs6000: Fix PR92923, __builtin_vec_xor() causes subregs to be used when not using V4SImode vectors
On 12/20/19 9:35 AM, Segher Boessenkool wrote: > Something automated to verify what we implement is what we have documented > would be neat to have. Maybe this becomes feasible with the rewrite of > the builtin stuff :-) Agreed! >> This passed bootstrap and regression testing with no errors. Ok for trunk? > > On what kind of system did you test? > > I'd like to see this tested on both BE and LE, and various processor > generations -- but we'll see if it regresses anyway, and it is still > stage 3. So, okay for trunk, just please keep an eye out for > regressions (in January that is :-) ) It was an LE test. I fixed up the problems below and have kicked off a BE run now and will run the testsuite in both 32-bit and 64-bit modes. I leave on vacation tomorrow, so I'll wait until I return next week before committing so I can watch for any regressions. >> * config/rs6000/rs6000-builtin.def (VAND, VANDC, VNOR, VOR, >> VXOR): Delete. > > You can end a changelog line in a colon just fine, fwiw. Ok. >> --- gcc/testsuite/gcc.target/powerpc/pr92923-1.c (nonexistent) >> +++ gcc/testsuite/gcc.target/powerpc/pr92923-1.c (working copy) >> @@ -0,0 +1,454 @@ >> +/* { dg-do compile { target { powerpc*-*-* } } } */ > > You don't need this target clause, everything in gc.target/powerpc has it > already. > >> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ > > I'm not sure if this is necessary, or just cargo culting :-) I think maybe both of those were cut/paste issues. I've removed them both. >> +/* { dg-final { scan-tree-dump-times "VIEW_CONVERT_EXPR" 0 "gimple" } } */ > > That's scan-tree-dump-not. Ahh, that is better. Fixed. > Same comments for the p8 test of course. Okay with or without those > adjusted (they aren't wrong, just weird style). Fixed too. Peter
Re: [PATCH] rs6000: Fix PR92923, __builtin_vec_xor() causes subregs to be used when not using V4SImode vectors
On 12/20/19 12:20 PM, Peter Bergner wrote: >> On what kind of system did you test? >> >> I'd like to see this tested on both BE and LE, and various processor >> generations -- but we'll see if it regresses anyway, and it is still >> stage 3. So, okay for trunk, just please keep an eye out for >> regressions (in January that is :-) ) > > It was an LE test. I fixed up the problems below and have kicked off > a BE run now and will run the testsuite in both 32-bit and 64-bit modes. > I leave on vacation tomorrow, so I'll wait until I return next week > before committing so I can watch for any regressions. The BE run was clean and I'm back from vacation, so I committed the updated patch. Thanks. Peter
Re: [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries
On Sat, Nov 13, 2021 at 03:37:24PM -0500, David Malcolm wrote: > This approach is much less expressive that the custom addres space > approach; it would only cover the trust boundary aspect; it wouldn't > cover any differences between generic pointers and __user, vs __iomem, > __percpu, and __rcu which I admit I only dimly understand. __iomem would point at device memory, which can have curious side effects or is yet another trust boundary, depending on device and usage. __percpu is an address space that denotes a per-cpu variable's relative offset, it needs be combined with a per-cpu offset to get a 'real' pointer, on x86_64 %gs segment offset is used for this purpose, other architectures are less fortunate. The whole per_cpu()/this_cpu_*() family of APIs accepts such pointers. __rcu is the regular kernel address space, but denotes that the object pointed to has RCU lifetime management. The attribute is laundered through rcu_dereference() to remove the __rcu qualifier. > Possibly silly question: is it always a bug for the value of a kernel > pointer to leak into user space? i.e. should I be complaining about an > infoleak if the value of a trusted_ptr itself is written to > *untrusted_ptr? e.g. Yes, always. Leaking kernel pointers is unconditionally bad.
Re: [PATCH 2/6] Add returns_zero_on_success/failure attributes
On Mon, Nov 15, 2021 at 12:33:16PM +0530, Prathamesh Kulkarni wrote: > On Sun, 14 Nov 2021 at 02:07, David Malcolm via Gcc-patches > > +/* Handle "returns_zero_on_failure" and "returns_zero_on_success" > > attributes; > > + arguments as in struct attribute_spec.handler. */ > > + > > +static tree > > +handle_returns_zero_on_attributes (tree *node, tree name, tree, int, > > + bool *no_add_attrs) > > +{ > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (*node))) > > +{ > > + error ("%qE attribute on a function not returning an integral type", > > +name); > > + *no_add_attrs = true; > > +} > > + return NULL_TREE; > Hi David, > Just curious if a warning should be emitted if the function is marked > with the attribute but it's return value isn't actually 0 ? > > There are other constants like -1 or 1 that are often used to indicate > error, so maybe tweak the attribute to > take the integer as an argument ? > Sth like returns_int_on_success(cst) / returns_int_on_failure(cst) ? > > Also, would it make sense to extend it for pointers too for returning > NULL on success / failure ? Please also consider that in Linux we use the 'last' page for error code returns. That is, a function returning a pointer could return '(void *)-EFAULT' also see linux/err.h
Re: [PATCH V3] rs6000: Don't ICE when compiling the __builtin_vsx_splat_2di built-in [PR113950]
On 3/6/24 3:27 AM, Kewen.Lin wrote: > on 2024/3/4 02:55, jeevitha wrote: >> The following patch has been bootstrapped and regtested on powerpc64le-linux. >> >> When we expand the __builtin_vsx_splat_2di function, we were allowing >> immediate >> value for second operand which causes an unrecognizable insn ICE. Even though >> the immediate value was forced into a register, it wasn't correctly assigned >> to the second operand. So corrected the assignment of op1 to operands[1]. [snip] > As the discussions in the thread of the previous versions, I think > Segher agreed this approach, so OK for trunk with the above nits > tweaked, thanks! The bogus vsx_splat_ code goes all the way back to GCC 8, so we should backport this fix. Segher and Ke Wen, can we get an approval to backport this to all the open release branches (GCC 13, 12, 11)? Thanks. Jeevitha, once we get approval, please perform the backports. Peter
Re: [PATCH] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]
On 3/18/24 9:36 AM, Segher Boessenkool wrote: > Hi! > > On Fri, Feb 23, 2024 at 03:04:13PM +0530, jeevitha wrote: >> PTImode attribute assists in generating even/odd register pairs on 128 bits. > > It is a mode, not an attribute. Attributes are on declarations, while > modes are on a much more fundamental level (every value has a mode, in > GCC!) > >> When the user specifies PTImode as an attribute, it breaks because there is >> no >> internal type to handle this mode . We have created a tree node with dummy >> type >> to handle PTImode. > > You are talking about the mode attribute. Aha. Correct. >> We are not documenting this dummy type since users are not >> allowed to use this type externally. > > Not sure what this is meant to mean? What does "allowed to" mean, even? > We do not forbid people from doing anything (we can discourage them > though). Or dso you mean something just doesn't work? The use case is the Linux kernel will use the mode attribute as shown in the test case. The type she created was needed for the code to "work" internally, but we don't want to advertise it to users, so that's why we're not documenting it. Yes, pesky users who go digging through the GCC source could find it and use it, but we don't want to encourage them. :-) >> PR target/106895 >> * config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add fields >> to hold PTImode type. > > An enum does not have fields. What do you actually mean? Yeah, as per your follow-on comment, I think a simple "Add RS6000_BTI_INTPTI." should suffice. Peter
Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]
so same problem with the comment, but the following change... > - else if (align_words < GP_ARG_NUM_REG) > + else if (align_words < GP_ARG_NUM_REG > +|| (cum->hidden_string_length > +&& cum->actual_parm_length <= GP_ARG_NUM_REG)) { if (TARGET_32BIT && TARGET_POWERPC64) return rs6000_mixed_function_arg (mode, type, align_words); return gen_rtx_REG (mode, GP_ARG_MIN_REG + align_words); } else return NULL_RTX; The old code for the unused hidden parameter (which was the 9th param) would fall thru to the "return NULL_RTX;" which would make the callee assume there was a parameter save area allocated. Now instead, we'll return a reg rtx, probably of r11 (r3 thru r10 are our param regs) and I'm guessing we'll now see a copy of r11 into a pseudo like we do for the other param regs. Is that a problem? Given it's an unused parameter, it'll probably get deleted as dead code, but could it cause any issues? What if we have more than one unused hidden parameter and we return r12 and r13 which have specific uses in our ABIs (eg, r13 is our TCB pointer), so it may not actually look dead. Have you verified what the callee RTL looks like after expand for these unused hidden parameters? Is there a rtx we can return that isn't a NULL_RTX which triggers the assumption of a parameter save area, but isn't a reg rtx which might lead to some rtl being generated? Would a (const_int 0) or something else work? > + /* Actual parameter length ignoring hidden parameter. > + This is done to C++ wrapper calling fortran procedures > + which has hidden parameter that are not used. */ I think a simpler comment will suffice: /* Actual parameter count ignoring unused hidden parameters. */ Peter
Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]
On 3/23/24 4:33 AM, Ajit Agarwal wrote: >>> - else if (align_words < GP_ARG_NUM_REG) >>> + else if (align_words < GP_ARG_NUM_REG >>> + || (cum->hidden_string_length >>> + && cum->actual_parm_length <= GP_ARG_NUM_REG)) >> { >> if (TARGET_32BIT && TARGET_POWERPC64) >> return rs6000_mixed_function_arg (mode, type, align_words); >> >> return gen_rtx_REG (mode, GP_ARG_MIN_REG + align_words); >> } >> else >> return NULL_RTX; >> >> The old code for the unused hidden parameter (which was the 9th param) would >> fall thru to the "return NULL_RTX;" which would make the callee assume there >> was a parameter save area allocated. Now instead, we'll return a reg rtx, >> probably of r11 (r3 thru r10 are our param regs) and I'm guessing we'll now >> see a copy of r11 into a pseudo like we do for the other param regs. >> Is that a problem? Given it's an unused parameter, it'll probably get deleted >> as dead code, but could it cause any issues? What if we have more than one >> unused hidden parameter and we return r12 and r13 which have specific uses >> in our ABIs (eg, r13 is our TCB pointer), so it may not actually look dead. >> Have you verified what the callee RTL looks like after expand for these >> unused hidden parameters? Is there a rtx we can return that isn't a NULL_RTX >> which triggers the assumption of a parameter save area, but isn't a reg rtx >> which might lead to some rtl being generated? Would a (const_int 0) or >> something else work? >> >> > For the above use case it will return > > (reg:DI 5 %r5) and below check entry_parm = > (reg:DI 5 %r5) and the following check will not return TRUE and hence >parameter save area will not be allocated. Why r5?!?! The 8th (integer) param would return r10, so I'd assume if the next param was a hidden param, then it'd get the next gpr, so r11. How does it jump back to r5 which may have been used by the 3rd param? > It will not generate any rtx in the callee rtl code but it just used to > check whether to allocate parameter save area or not when number of args > 8. > > /* If there is no incoming register, we need a stack. */ > entry_parm = rs6000_function_arg (args_so_far, arg); > if (entry_parm == NULL) > return true; > > /* Likewise if we need to pass both in registers and on the stack. */ > if (GET_CODE (entry_parm) == PARALLEL > && XEXP (XVECEXP (entry_parm, 0, 0), 0) == NULL_RTX) > return true; Yes, this code in rs6000_parm_needs_stack() uses the rs6000_function_arg() return value as a boolean to tell us whether a parameter save area is required so what we return is unimportant other than to know it's not NULL_RTX. I'm more concerned about the use of the target hook targetm.calls.function_arg used in the generic parts of the compiler. What will that code do differently now that we return a reg rtx rather than NULL_RTX? Might that code use the reg rtx to emit something? I'd feel better if you could verify what happens in that code when we return a reg rtx for that 9th hidden param which isn't really being passed in a register. Peter
[PATCH] lto: Don't assume a posix shell is usable on windows [PR110710]
lto-wrapper generates Makefiles that use the following: touch -r file file.tmp && mv file.tmp file to truncate files. If there is no suitable "touch" or "mv" available, then this errors with "The system cannot find the file specified". The solution here is to check if sh -c true works, before trying to use it in the Makefile. If it doesn't, then fall back to "copy /y nul file" instead. The fallback doesn't work exactly the same (the modified time of the file gets updated), but this doesn't seem to matter in practice. I tested this both in environments both with and without sh present, and observed no issues. Signed-off-by: Peter Damianov --- gcc/lto-wrapper.cc | 35 --- 1 file changed, 32 insertions(+), 3 deletions(-) diff --git a/gcc/lto-wrapper.cc b/gcc/lto-wrapper.cc index 5186d040ce0..8dee0aaa2d8 100644 --- a/gcc/lto-wrapper.cc +++ b/gcc/lto-wrapper.cc @@ -1389,6 +1389,27 @@ make_exists (void) return errmsg == NULL && exit_status == 0 && err == 0; } +/* Test that an sh command is present and working, return true if so. + This is only relevant for windows hosts, where a /bin/sh shell cannot + be assumed to exist. */ + +static bool +sh_exists (void) +{ + const char *sh = "sh"; + const char *sh_args[] = {sh, "-c", "true", NULL}; +#ifdef _WIN32 + int exit_status = 0; + int err = 0; + const char *errmsg += pex_one (PEX_SEARCH, sh_args[0], CONST_CAST (char **, sh_args), + "sh", NULL, NULL, &exit_status, &err); + return errmsg == NULL && exit_status == 0 && err == 0; +#else + return true; +#endif +} + /* Execute gcc. ARGC is the number of arguments. ARGV contains the arguments. */ static void @@ -1402,6 +1423,7 @@ run_gcc (unsigned argc, char *argv[]) const char *collect_gcc; char *collect_gcc_options; int parallel = 0; + bool have_sh = sh_exists (); int jobserver = 0; bool jobserver_requested = false; int auto_parallel = 0; @@ -2016,6 +2038,7 @@ cont: argv_ptr[5] = NULL; if (parallel) { + fprintf (mstream, "SHELL=%s\n", have_sh ? "sh" : "cmd"); fprintf (mstream, "%s:\n\t@%s ", output_name, new_argv[0]); for (j = 1; new_argv[j] != NULL; ++j) fprintf (mstream, " '%s'", new_argv[j]); @@ -2024,9 +2047,15 @@ cont: truncate them as soon as we have processed it. This reduces temporary disk-space usage. */ if (! save_temps) - fprintf (mstream, "\t@-touch -r \"%s\" \"%s.tem\" > /dev/null " -"2>&1 && mv \"%s.tem\" \"%s\"\n", -input_name, input_name, input_name, input_name); + { + fprintf (mstream, + have_sh + ? "\t@-touch -r \"%s\" \"%s.tem\" > /dev/null " +"2>&1 && mv \"%s.tem\" \"%s\"\n" + : "\t@-copy /y nul \"%s\" > NUL " +"2>&1\n", + input_name, input_name, input_name, input_name); + } } else { -- 2.39.2
Re: [PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers
On 4/3/24 7:40 AM, H.J. Lu wrote: > We can't profile indirect calls to IFUNC resolvers nor their callees as > it requires TLS which hasn't been set up yet when the dynamic linker is > resolving IFUNC symbols. > > Add an IFUNC resolver caller marker to cgraph_node and set it if the > function is called by an IFUNC resolver. Disable indirect call profiling > for IFUNC resolvers and their callees. The IFUNC resolvers on Power do not use TLS, so isn't this a little too conservative? Should this be triggered via a target hook so architectures that don't use TLS in their IFUNC resolvers could still profile them? Peter
Re: [PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers
On 4/3/24 10:45 AM, Andrew Pinski wrote: > On Wed, Apr 3, 2024 at 8:32 AM Peter Bergner wrote: > I think you misunderstood the patch/situtation. Most ifunc resolves > don't use TLS at all; what is happening here is that the profiler > (-fprofile-generate) is adding TLS usage to the ifunc resolver which > then causes issues. And the use of TLS causes a PLT call to be inside > the ifun which causes all the fun stuff. > > This is not about ifunc resolves using TLS directly in code but rather > indirectly via -fprofile-generate. Ah, ok. Thanks to you and H.J. for clarifying. Peter
[PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]
This is a cleanup patch in preparation to fixing the real bug in PR101865. TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that. Also replace all usages of OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR and delete the now dead mask. This passed bootstrap and retesting on powerpc64le-linux with no regressions. Ok for trunk? Eventually we'll want to backport this along with the follow-on patch that actually fixes PR101865. Peter gcc/ PR target/101865 * config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define. * config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR. Delete redundant OPTION_MASK_DIRECT_MOVE usage. Delete TARGET_DIRECT_MOVE dead code. (rs6000_opt_masks): Neuter the "direct-move" option. * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR. Delete useless comment. * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete OPTION_MASK_DIRECT_MOVE. (OTHER_VSX_VECTOR_MASKS): Likewise. (POWERPC_MASKS): Likewise. * config/rs6000/rs6000.opt (mno-direct-move): New. (mdirect-move): Remove Mask and Var. diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h index 68bc45d65ba..77d045c9f6e 100644 --- a/gcc/config/rs6000/rs6000.h +++ b/gcc/config/rs6000/rs6000.h @@ -471,6 +471,8 @@ extern int rs6000_vector_align[]; #define TARGET_EXTSWSLI(TARGET_MODULO && TARGET_POWERPC64) #define TARGET_MADDLD TARGET_MODULO +/* TARGET_DIRECT_MOVE is redundant to TARGET_P8_VECTOR, so alias it to that. */ +#define TARGET_DIRECT_MOVE TARGET_P8_VECTOR #define TARGET_XSCVDPSPN (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR) #define TARGET_XSCVSPDPN (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR) #define TARGET_VADDUQM (TARGET_P8_VECTOR && TARGET_POWERPC64) diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 6ba9df4f02e..c241371147c 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -3811,7 +3811,7 @@ rs6000_option_override_internal (bool global_init_p) Testing for direct_move matches power8 and later. */ if (!BYTES_BIG_ENDIAN && !(processor_target_table[tune_index].target_enable - & OPTION_MASK_DIRECT_MOVE)) + & OPTION_MASK_P8_VECTOR)) rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN; /* Add some warnings for VSX. */ @@ -3853,8 +3853,7 @@ rs6000_option_override_internal (bool global_init_p) && (rs6000_isa_flags_explicit & (OPTION_MASK_SOFT_FLOAT | OPTION_MASK_ALTIVEC | OPTION_MASK_VSX)) != 0) -rs6000_isa_flags &= ~((OPTION_MASK_P8_VECTOR | OPTION_MASK_CRYPTO - | OPTION_MASK_DIRECT_MOVE) +rs6000_isa_flags &= ~((OPTION_MASK_P8_VECTOR | OPTION_MASK_CRYPTO) & ~rs6000_isa_flags_explicit); if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET) @@ -3939,13 +3938,6 @@ rs6000_option_override_internal (bool global_init_p) rs6000_isa_flags &= ~OPTION_MASK_FPRND; } - if (TARGET_DIRECT_MOVE && !TARGET_VSX) -{ - if (rs6000_isa_flags_explicit & OPTION_MASK_DIRECT_MOVE) - error ("%qs requires %qs", "-mdirect-move", "-mvsx"); - rs6000_isa_flags &= ~OPTION_MASK_DIRECT_MOVE; -} - if (TARGET_P8_VECTOR && !TARGET_ALTIVEC) rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR; @@ -24429,7 +24421,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] = false, true }, { "cmpb",OPTION_MASK_CMPB, false, true }, { "crypto", OPTION_MASK_CRYPTO, false, true }, - { "direct-move", OPTION_MASK_DIRECT_MOVE,false, true }, + { "direct-move", 0, false, true }, { "dlmzb", OPTION_MASK_DLMZB, false, true }, { "efficient-unaligned-vsx", OPTION_MASK_EFFICIENT_UNALIGNED_VSX, false, true }, diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc index ce0b14a8d37..647f20de7f2 100644 --- a/gcc/config/rs6000/rs6000-c.cc +++ b/gcc/config/rs6000/rs6000-c.cc @@ -429,19 +429,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags) rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6"); if ((flags & OPTION_MASK_POPCNTD) != 0) rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7"); - /* Not
Re: [PATCH, rs6000] Split TARGET_POWER8 from TARGET_DIRECT_MOVE [PR101865] (2/2)
I'm picking up Will's patches for this bug. As an FYI, this is the bug where _ARCH_PWR8 is conditional on TARGET_DIRECT_MOVE which can be disabled with -mno-vsx which is bad. I already posted the cleanup patch that the updated patch for this bug will rely on, that removed the OPTION_MASK_DIRECT_MOVE because it is fully redundant with OPTION_MASK_P8_VECTOR. I've also incorporated some of Ke Wen's review comments on Will's original patch. I have a couple of comments on your review though... On 10/17/22 1:08 PM, Segher Boessenkool wrote: > On Mon, Sep 19, 2022 at 11:13:20AM -0500, will schmidt wrote: >> @@ -24046,10 +24045,11 @@ static struct rs6000_opt_mask const >> rs6000_opt_masks[] = >>{ "block-ops-vector-pair",OPTION_MASK_BLOCK_OPS_VECTOR_PAIR, >> false, true }, >>{ "cmpb", OPTION_MASK_CMPB, false, true }, >>{ "crypto", OPTION_MASK_CRYPTO, false, >> true }, >>{ "direct-move", OPTION_MASK_DIRECT_MOVE,false, true }, >> + { "power8", OPTION_MASK_POWER8, false, >> true }, > > Why would we want a #pragma power8 ? Agreed, we don't want that. We have target attribute cpu=power8 for that. >> +mpower8 >> +Target Mask(POWER8) Var(rs6000_isa_flags) >> +Use instructions added in ISA 2.07 (power8). > > There should not be such an option. It is set by -mcpu=power8 and > later, but can never be enabled or disabled direfctly by the user. So we need an OPTION_MASK_POWER8 to be created for use in rs6000_isa_flags, but the only way I see that we can do that is to create an option in rs6000.opt. Did I miss that there is another way? Otherwise, I was thinking of creating a dummy option that is WarnRemoved from the start ala: +;; This option exists only for its MASK. It is not intended for users. +mpower8 +Target Mask(POWER8) Var(rs6000_isa_flags) WarnRemoved + Is there a better way? The problem is P8 created lots of new instructions, but they were basically all vector and htm instructions. There were no general GPR or FPR instructions (ie, what we'd think of as base architecture) added, so there's no other OPTION_MASK_*/TARGET_* we can use as a P8 base architecture test. I'll note I tried just a bare "Target Mask(POWER8) Var(rs6000_isa_flags)" with no option name mentioned at all, but that didn't work, as no OPTION_MASK_POWER8 was created. Peter
Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]
On 4/8/24 3:55 AM, Kewen.Lin wrote: > on 2024/4/6 06:28, Peter Bergner wrote: >> +mno-direct-move >> +Target Undocumented WarnRemoved >> + >> mdirect-move >> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved >> +Target Undocumented WarnRemoved > > When reviewing my previous patch to "neuter option -mpower{8,9}-vector", > Segher mentioned that we don't need to keep such option warning all the > time and can drop it like in a release later as users should be aware of > this information then, I agreed and considering that patch disabling > -m[no-]direct-move was r8-7845-g57f108f5a1e1b2, I think we can just remove > m[no-]direct-move here? What do you think? I'm fine with that if that is what we want. So something like the following? +;; This option existed in the past, but now is always silently ignored. mdirect-move -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved +Target Undocumented Ignore The above seems to silently ignore both -mdirect-move and -mno-direct-move which I think is what we want. That said, it's not what we've done with other options, but maybe those just need to be changed too? Peter ;; This option existed in the past, but now is always off. mno-mfpgpr Target RejectNegative Undocumented Ignore mmfpgpr Target RejectNegative Undocumented WarnRemoved [snip other examples]
Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]
On 4/8/24 9:37 PM, Kewen.Lin wrote: > on 2024/4/8 21:21, Peter Bergner wrote: > I prefer to remove it completely, that is: > >> -mdirect-move >> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved > > The reason why you still kept it is to keep a historical record here? I believe we've never completely removed an option before. I think the thought was, if some software package explicitly used the option, then they shouldn't see an 'unrecognized command-line option' error, but rather either a warning that the option was removed or just silently ignore it. Ie, we don't want to make a package that used to build with an old compiler now break its build because the option doesn't exist anymore. > Segher pointed out to me that this kind of option complete removal should be > stage 1 stuff, so let's defer to make it in a separated patch next release > (including some other options like mfpgpr you showed below etc.). :) If we're going to completely remove it, then for sure, it's a stage1 thing. I'd like to hear Segher's thoughts on whether we should completely remove it or just silently ignore it. > For the original patch, > >> +mno-direct-move >> +Target Undocumented WarnRemoved > > s/WarnRemoved/Ignore/ to match some other existing practice, there is no > warning now if specifying -mno-direct-move and it would be good to keep > the same behavior for users. If we want to silently ignore -mdirect-move and -mno-direct-move, then we just need to do: mdirect-move -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved +Target Undocumented Ignore There's no need to mention -mno-direct-move at all then. It was only in the case I thought we wanted to warn against it's use that I added -mno-direct-move. >> That said, it's not what we've done with >> other options, but maybe those just need to be changed too? > > Yes, I think they need to be changed too (next release). If that's the consensus with Segher, sure, we can plan on that in stage1. Peter
Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]
On 4/9/24 12:37 AM, Kewen.Lin wrote: > Since removing it completely is a stage1 thing, I prefer to keep mdirect-move > and -mno-direct-move handlings as before, WarnRemoved and Ignore separately. Ok, current trunk ignores -mno-direct-move and warns on -mdirect-move, so to keep the same behavior for GCC 14 (before removing in stage1), we want just: mdirect-move -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved +Target Undocumented WarnRemoved which is what I'll commit and push after one last round of testing. Thanks. Peter
Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]
On 4/9/24 3:19 PM, Peter Bergner wrote: > Ok, current trunk ignores -mno-direct-move and warns on -mdirect-move, so to > keep the same behavior for GCC 14 (before removing in stage1), we want just: > > mdirect-move > -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved > +Target Undocumented WarnRemoved > > which is what I'll commit and push after one last round of testing. Testing was clean as expected, so I pushed the commit. Thanks. Peter
[PATCH] rs6000: Add OPTION_MASK_POWER8 [PR101865]
FYI: This patch is an update to Will Schmidt's patches to fix PR101865: https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601825.html https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601823.html ...taking into consideration patch reviews received than. I also found a few more locations that needed patching, as well as simplifying the testsuite test cases by removing the need to scan for the predefined macros. The bug in PR101865 is the _ARCH_PWR8 predefined macro is conditional upon TARGET_DIRECT_MOVE, which can be false for some -mcpu=power8 compiles if the -mno-altivec or -mno-vsx options are used. The solution here is to create a new OPTION_MASK_POWER8 mask that is true for -mcpu=power8, regardless of Altivec or VSX enablement. Unfortunately, the only way to create an OPTION_MASK_* mask is to create a new option, which we have done here, but marked it as WarnRemoved since we do not want users using it. For stage1, we will look into how we can create ISA mask flags for use in the compiler without the need for explicit options. The passed bootstrap and regtest on powerpc64le-linux. Ok for trunk? This is also broken on the release branches, so ok for backports after some burn-in time on trunk? Peter 2024-04-11 Will Schmidt Peter Bergner gcc/ PR target/101865 * config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use TARGET_POWER8. * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Use OPTION_MASK_POWER8. * config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add OPTION_MASK_POWER8. (ISA_2_7_MASKS_SERVER): Likewise. * config/rs6000/rs6000.cc (rs6000_option_override_internal): Update comment. Use OPTION_MASK_POWER8 and TARGET_POWER8. * config/rs6000/rs6000.h (TARGET_SYNC_HI_QI): Use TARGET_POWER8. * config/rs6000/rs6000.md (define_attr "isa"): Add p8. (define_attr "enabled"): Handle it. (define_insn "prefetch"): Use TARGET_POWER8. * config/rs6000/rs6000.opt (mdo-not-use-this-option): New. gcc/testsuite/ PR target/101865 * gcc.target/powerpc/predefined-p7-novsx.c: New test. * gcc.target/powerpc/predefined-p8-noaltivec-novsx.c: New test. * gcc.target/powerpc/predefined-p8-noaltivec.c: New test. * gcc.target/powerpc/predefined-p8-novsx.c: New test. * gcc.target/powerpc/predefined-p8-pragma-vsx.c: New test. * gcc.target/powerpc/predefined-p9-novsx.c: New test. diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc index e7d6204074c..320affd79e3 100644 --- a/gcc/config/rs6000/rs6000-builtin.cc +++ b/gcc/config/rs6000/rs6000-builtin.cc @@ -165,7 +165,7 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins fncode) case ENB_P7_64: return TARGET_POPCNTD && TARGET_POWERPC64; case ENB_P8: - return TARGET_DIRECT_MOVE; + return TARGET_POWER8; case ENB_P8V: return TARGET_P8_VECTOR; case ENB_P9: diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc index 647f20de7f2..bd493ab87c5 100644 --- a/gcc/config/rs6000/rs6000-c.cc +++ b/gcc/config/rs6000/rs6000-c.cc @@ -429,7 +429,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags) rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6"); if ((flags & OPTION_MASK_POPCNTD) != 0) rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7"); - if ((flags & OPTION_MASK_P8_VECTOR) != 0) + if ((flags & OPTION_MASK_POWER8) != 0) rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8"); if ((flags & OPTION_MASK_MODULO) != 0) rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9"); diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def index 45dd5a85901..6ee678e69c3 100644 --- a/gcc/config/rs6000/rs6000-cpus.def +++ b/gcc/config/rs6000/rs6000-cpus.def @@ -47,6 +47,7 @@ fusion here, instead set it in rs6000.cc if we are tuning for a power8 system. */ #define ISA_2_7_MASKS_SERVER (ISA_2_6_MASKS_SERVER \ +| OPTION_MASK_POWER8 \ | OPTION_MASK_P8_VECTOR\ | OPTION_MASK_CRYPTO \ | OPTION_MASK_EFFICIENT_UNALIGNED_VSX \ @@ -130,6 +131,7 @@ | OPTION_MASK_MODULO \ | OPTION_MASK_MULHW\ | OPTION_MASK_NO_UPDATE\ +| OPTION_MASK_POWER8 \ | OPTION_MASK_P8_FUSION\ | OPTION_MASK_P8_VECTOR\
[PATCH] Fortran: fix passing array component to polymorphic argument [PR105658]
Dear all, The attached patch fixes PR105658 by forcing an array temporary to be created. This is required when passing an array component, but this didn't happen if the dummy argument was an unlimited polymorphic type. The problem bit of code is in `gfc_conv_expr_descriptor`, near L7828: subref_array_target = (is_subref_array (expr) && (se->direct_byref || expr->ts.type == BT_CHARACTER)); need_tmp = (gfc_ref_needs_temporary_p (expr->ref) && !subref_array_target); where `need_tmp` is being evaluated to 0. The logic here isn't clear to me, and this function is used in several places, which is why I went with setting `parmse.force_tmp = 1` in `gfc_conv_procedure_call` and using the same conditional as the later branch for the non-polymorphic case (near the call to `gfc_conv_subref_array_arg`) If this patch is ok, please could someone commit it for me? This is my first patch for GCC, so apologies in advance if the commit message is missing something. Tested on x86_64-pc-linux-gnu. The bug is present in gfortran back to 4.9, so should it also be backported? Cheers, Peter PR fortran/105658 gcc/fortran/ChangeLog * trans-expr.cc (gfc_conv_procedure_call): When passing an array component reference of intrinsic type to a procedure with an unlimited polymorphic dummy argument, a temporary should be created. gcc/testsuite/ChangeLog * gfortran.dg/PR105658.f90: New test. --- gcc/fortran/trans-expr.cc | 8 gcc/testsuite/gfortran.dg/PR105658.f90 | 25 + 2 files changed, 33 insertions(+) create mode 100644 gcc/testsuite/gfortran.dg/PR105658.f90 diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index a0593b76f18..7fd3047c4e9 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -6439,6 +6439,14 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym, CLASS object for the unlimited polymorphic formal. */ gfc_find_vtab (&e->ts); gfc_init_se (&parmse, se); + /* The actual argument is a component reference to an array + of derived types, so we need to force creation of a + temporary */ + if (e->expr_type == EXPR_VARIABLE + && is_subref_array (e) + && !(fsym && fsym->attr.pointer)) + parmse.force_tmp = 1; + gfc_conv_intrinsic_to_class (&parmse, e, fsym->ts); } diff --git a/gcc/testsuite/gfortran.dg/PR105658.f90 b/gcc/testsuite/gfortran.dg/PR105658.f90 new file mode 100644 index 000..407ee25f77c --- /dev/null +++ b/gcc/testsuite/gfortran.dg/PR105658.f90 @@ -0,0 +1,25 @@ +! { dg-do compile } +! { dg-options "-Warray-temporaries" } +! Test fix for incorrectly passing array component to unlimited polymorphic procedure + +module test_PR105658_mod + implicit none + type :: foo +integer :: member1 +integer :: member2 + end type foo +contains + subroutine print_poly(array) +class(*), dimension(:), intent(in) :: array +select type(array) +type is (integer) + print*, array +end select + end subroutine print_poly + + subroutine do_print(thing) +type(foo), dimension(3), intent(in) :: thing +call print_poly(thing%member1) ! { dg-warning "array temporary" } + end subroutine do_print + +end module test_PR105658_mod -- 2.43.0
Re: [PATCH] Fortran: fix passing array component to polymorphic argument [PR105658]
Hi Harald, Thanks for your help, please see the updated and signed-off patch below. > (I am not entirely sure whether we need to exclude pointer and > allocatable attributes here explicitly, given the constraints > in F2023:15.5.2.6, but other may have an opinion, too. > The above should be safe anyway.) I've included them in the patch here, but it does seem to work fine without checking those attributes here -- and invalid code is still caught with that change. It also occurred to me that array temporaries aren't _required_ here (for arrays of derived type components), but in the general case with a type with differently sized components, the stride wouldn't be a multiple of the component's type's size. Is it possible in principle to have an arbitrary stride? Cheers, Peter >From 907a104facfc7f35f48ebcfa9ef5f8f5430d4d3c Mon Sep 17 00:00:00 2001 From: Peter Hill Date: Thu, 15 Feb 2024 16:58:33 + Subject: [PATCH] Fortran: fix passing array component ref to polymorphic procedures PR fortran/105658 gcc/fortran/ChangeLog * trans-expr.cc (gfc_conv_intrinsic_to_class): When passing an array component reference of intrinsic type to a procedure with an unlimited polymorphic dummy argument, a temporary should be created. gcc/testsuite/ChangeLog * gfortran.dg/PR105658.f90: New test. Signed-off-by: Peter Hill --- gcc/fortran/trans-expr.cc | 9 + gcc/testsuite/gfortran.dg/PR105658.f90 | 50 ++ 2 files changed, 59 insertions(+) create mode 100644 gcc/testsuite/gfortran.dg/PR105658.f90 diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index a0593b76f18..004081aa6c3 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -1019,6 +1019,14 @@ gfc_conv_intrinsic_to_class (gfc_se *parmse, gfc_expr *e, tmp = gfc_typenode_for_spec (&class_ts); var = gfc_create_var (tmp, "class"); + /* Force a temporary for component or substring references */ + if (unlimited_poly + && class_ts.u.derived->components->attr.dimension + && !class_ts.u.derived->components->attr.allocatable + && !class_ts.u.derived->components->attr.class_pointer + && is_subref_array (e)) +parmse->force_tmp = 1; + /* Set the vptr. */ ctree = gfc_class_vptr_get (var); @@ -6439,6 +6447,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym, CLASS object for the unlimited polymorphic formal. */ gfc_find_vtab (&e->ts); gfc_init_se (&parmse, se); + gfc_conv_intrinsic_to_class (&parmse, e, fsym->ts); } diff --git a/gcc/testsuite/gfortran.dg/PR105658.f90 b/gcc/testsuite/gfortran.dg/PR105658.f90 new file mode 100644 index 000..8aacecf806e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/PR105658.f90 @@ -0,0 +1,50 @@ +! { dg-do compile } +! { dg-options "-Warray-temporaries" } +! Test fix for incorrectly passing array component to unlimited polymorphic procedure + +module test_PR105658_mod + implicit none + type :: foo + integer :: member1 + integer :: member2 + end type foo +contains + subroutine print_poly(array) + class(*), dimension(:), intent(in) :: array + select type(array) + type is (integer) + print*, array + type is (character(*)) + print *, array + end select + end subroutine print_poly + + subroutine do_print(thing) + type(foo), dimension(3), intent(in) :: thing + type(foo), parameter :: y(3) = [foo(1,2),foo(3,4),foo(5,6)] + integer :: i, j, uu(5,6) + + call print_poly(thing%member1) ! { dg-warning "array temporary" } + call print_poly(y%member2) ! { dg-warning "array temporary" } + call print_poly(y(1::2)%member2) ! { dg-warning "array temporary" } + + ! The following array sections work without temporaries + uu = reshape([(((10*i+j),i=1,5),j=1,6)],[5,6]) + print *, uu(2,2::2) + call print_poly (uu(2,2::2)) ! no temp needed! + print *, uu(1::2,6) + call print_poly (uu(1::2,6)) ! no temp needed! + end subroutine do_print + + subroutine do_print2(thing2) + class(foo), dimension(:), intent(in) :: thing2 + call print_poly (thing2% member2) ! { dg-warning "array temporary" } + end subroutine do_print2 + + subroutine do_print3 () + character(3) :: c(3) = ["abc","def","ghi"] + call print_poly (c(1::2)) ! no temp needed! + call print_poly (c(1::2)(2:3)) ! { dg-warning "array temporary" } + end subroutine do_print3 + +end module test_PR105658_mod -- 2.43.0
[PATCH] rs6000: Update instruction counts due to combine changes [PR112103]
rs6000: Update instruction counts due to combine changes [PR112103] The PR91865 combine fix changed instruction counts slightly for rlwinm-0.c. Adjust expected instruction counts accordingly. This passed on both powerpc64le-linux and powerpc64-linux running the testsuite in both 32-bit and 64-bit modes. Ok for trunk? FYI, I will open a new bug to track the removing of the superfluous insns detected in PR112103. Peter gcc/testsuite/ PR target/112103 * gcc.target/powerpc/rlwinm-0.c: Adjust expected instruction counts. diff --git a/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c b/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c index 4f4fca2d8ef..a10d9174306 100644 --- a/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c +++ b/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c @@ -4,10 +4,10 @@ /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 6739 { target ilp32 } } } */ /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 9716 { target lp64 } } } */ /* { dg-final { scan-assembler-times {(?n)^\s+blr} 3375 } } */ -/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 3081 { target lp64 } } } */ +/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 3090 { target lp64 } } } */ /* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3197 { target ilp32 } } } */ -/* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3093 { target lp64 } } } */ +/* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3084 { target lp64 } } } */ /* { dg-final { scan-assembler-times {(?n)^\s+rotlwi} 154 } } */ /* { dg-final { scan-assembler-times {(?n)^\s+srwi} 13 { target ilp32 } } } */ /* { dg-final { scan-assembler-times {(?n)^\s+srdi} 13 { target lp64 } } } */
Re: [PATCH] rs6000: Update instruction counts due to combine changes [PR112103]
On 2/20/24 3:29 AM, Kewen.Lin wrote: > on 2024/2/20 06:35, Peter Bergner wrote: >> rs6000: Update instruction counts due to combine changes [PR112103] >> >> The PR91865 combine fix changed instruction counts slightly for rlwinm-0.c. >> Adjust expected instruction counts accordingly. >> >> This passed on both powerpc64le-linux and powerpc64-linux running the >> testsuite in both 32-bit and 64-bit modes. Ok for trunk? > > OK for trunk, thanks for fixing! Ok, pushed. Thanks. >> FYI, I will open a new bug to track the removing of the superfluous >> insns detected in PR112103. > > Hope this test case will become not fragile any more once this filed > issue gets fixed. :) I think this will become less fragile after we fix PR114004 which is the bug I opened to track fixing the superfluous insn that was emitted that we found in this bug. The fragility was due to the superfluous insn being different before and after Roger's patch. Once we don't emit it anymore, this test case should be less fragile. Peter
Re: [PATCH] rs6000: Neuter option -mpower{8,9}-vector [PR109987]
On 2/20/24 3:27 AM, Kewen.Lin wrote: > on 2024/2/20 02:45, Segher Boessenkool wrote: >> On Tue, Jan 16, 2024 at 10:50:01AM +0800, Kewen.Lin wrote: >>> it consists of some aspects: >>> - effective target powerpc_p{8,9}vector_ok are removed >>> and replaced with powerpc_vsx_ok. >> >> So all such testcases already arrange to have p8 or p9 some other way? Shouldn't that be replaced with powerpc_vsx instead of powerpc_vsx_ok? That way we know VSX code gen is enabled for the options being used, even those in RUNTESTFLAGS. I thought we agreed that powerpc_vsx_ok was almost always useless and we always want to use powerpc_vsx. ...or did I miss that we removed the old powerpc_vsx_ok and renamed powerpc_vsx to powerpc_vsx_ok? >>> - Some test cases are updated with explicit -mvsx. >>> - Some test cases with those two option mixed are adjusted >>> to keep the test points, like -mpower8-vector >>> -mno-power9-vector are updated with -mdejagnu-cpu=power8 >>> -mvsx etc. >> >> -mcpu=power8 implies -mvsx already. Then we can omit the explicit -msx option, correct? Ie, if the user forces -mno-vsx in RUNTESTFLAGS, then we'll just skip the test case as UNSUPPORTED rather than trying to compile some vsx test case with vsx disabled via the options. Peter
Re: [PATCH] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]
On 2/26/24 4:49 AM, Kewen.Lin wrote: > on 2024/2/26 14:18, jeevitha wrote: >> Hi All, >> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md >> index 6111cc90eb7..e5688ff972a 100644 >> --- a/gcc/config/rs6000/vsx.md >> +++ b/gcc/config/rs6000/vsx.md >> @@ -4660,7 +4660,7 @@ >> (define_expand "vsx_splat_" >>[(set (match_operand:VSX_D 0 "vsx_register_operand") >> (vec_duplicate:VSX_D >> - (match_operand: 1 "input_operand")))] >> + (match_operand: 1 "splat_input_operand")))] >>"VECTOR_MEM_VSX_P (mode)" >> { >>rtx op1 = operands[1]; > > This hunk actually does force_reg already: > > ... > else if (!REG_P (op1)) > op1 = force_reg (mode, op1); > > but it's assigning to op1 unexpectedly (an omission IMHO), so just > simply fix it with: > > else if (!REG_P (op1)) > -op1 = force_reg (mode, op1); > +operands[1] = force_reg (mode, op1); I agree op1 was an oversight and it should be operands[1]. That said, I think using more precise predicates is a good thing, so I think we should use both Jeevitha's predicate change and your operands[1] change. I'll note that Jeevitha originally had the operands[1] change, but I didn't look closely enough at the issue or the pattern and mentioned that these kinds of bugs can be caused by too loose constraints and predicates, which is when she found the updated predicate to use. I believe she already even bootstrapped and regtested the operands[1] only change. Jeevitha??? >> +/* PR target/113950 */ >> +/* { dg-do compile } */ > > We need an effective target to ensure vsx support, for now it's > powerpc_vsx_ok. > ie: /* { dg-require-effective-target powerpc_vsx_ok } */ Agreed. >> +/* { dg-options "-O1" } */ I think we should also use a -mcpu=XXX option to ensure VSX is enabled when compiling these VSX built-in functions. I'm fine using any CPU (power7 or later) where the ICE exists with an unpatched compiler. Otherwise, testing will be limited to our server systems that have VSX enabled by default. Peter
Re: [PATCH] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]
On 2/26/24 7:55 PM, Kewen.Lin wrote: > on 2024/2/26 23:07, Peter Bergner wrote: >> so I think we should use both Jeevitha's predicate change and >> your operands[1] change. > > Since either the original predicate change or operands[1] change > can fix this issue, I think it's implied that only either of them > is enough, so we can remove "else if (!REG_P (op1))" arm (or even > replaced with one else arm to assert REG_P (op1))? splat_input_operand allows, mem, reg and subreg, so I don't think we can just assert on REG_P (op1), since op1 could be a subreg. I do agree we can remove the "if (!REG_P (op1))" test on the else branch, since force_reg() has an early exit for regs, so a simple: ... else operands[1] = force_reg (mode, op1); ..should work. > Good point, or maybe just an explicit -mvsx like some existing ones, which > can avoid to only test some fixed cpu type. If a simple "-O1 -vsx" is enough to expose the ICE on an unpacthed compiler and a PASS on a patched compiler, then I'm all for it. Jeevitha, can you try confirming that? Peter
Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]
On 2/27/24 6:40 AM, Segher Boessenkool wrote: > On Tue, Feb 27, 2024 at 02:02:38AM +0530, jeevitha wrote: >> There is no immediate value splatting instruction in Power. Currently, those >> values need to be stored in a register or memory. To address this issue, I >> have updated the predicate for the second operand in vsx_splat to >> splat_input_operand and corrected the assignment of op1 to operands[1]. >> These changes ensure that operand1 is stored in a register. > > input_operand allows a lot of things that splat_input_operand does not, > not just immediate operands. NAK. > > (For example, *all* memory is okay for input_operand, always). > > I'm not saying we do not want to restrict these things, but a commit > that doesn't discuss this at all is not okay. Sorry. So it seems you're not NAKing the use of splat_input_operand, but just that it needs more explanation in the git log entry, correct? Yes, input_operand accepts a lot more things than splat_input_operand does, but the multiple define_insns this define_expand feeds, uses gpc_reg_operand, memory_operand and splat_input_operand for their operands[1] operand (splat_input_operand accepts reg and mem too), so it seems to match better what the patterns will be accepting and I always thought that using predicates that more accurately reflect what the define_insns expect/accept lead to better code gen. Mike, was it just an oversight to not use splat_input_operand for the vsx_splat_ expander or was input_operand a conscious decision? If input_operand was used purposely, then we can just fall back to the s/op1/operands[1]/ change which we already know fixes the bug. Peter
Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]
On 2/28/24 8:31 AM, Segher Boessenkool wrote: > On Tue, Feb 27, 2024 at 04:50:02PM -0600, Peter Bergner wrote: >> So it seems you're not NAKing the use of splat_input_operand, but >> just that it needs more explanation in the git log entry, correct? > > I NAK the patch. _Of course_ there needs to be *something* done, there > is a bug after all, it needs to be fixed. > > But no, there are big questions about if splat_input_operand is correct > as well. This needs to be justified in the patch submission. Ok, then Jeevitha, repost the patch with the s/op1/operands[1]/ only change. Jeevitha has already bootstrapped and regtested that change and it does fix the bug. Clearly, the splat_input_operand change needs more discussion and would be a follow-on patch...if we decide to do it at all. Peter
Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.
On 1/11/24 11:29 AM, Michael Meissner wrote: > This is version 2 of the patch. The only difference is I made the test case > simpler to read. [snip] > gcc/ > > PR target/112886 > * config/rs6000/rs6000.cc (print_operand): Add %S output modifier. > * doc/md.texi (Modifiers): Mention %S can be used like %x. > > gcc/testsuite/ > > PR target/112886 > * /gcc.target/powerpc/pr112886.c: New test. This resolves my issue with the first patch, so LGTM. Peter
Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.
On 1/23/24 8:30 PM, Kewen.Lin wrote: >> -output_operand_lossage ("invalid %%x value"); >> +output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : >> 'x')); > > Nit: Seems simpler with > > output_operand_lossage ("invalid %%%c value", (char) code); Agreed, good catch. >> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112886.c >> b/gcc/testsuite/gcc.target/powerpc/pr112886.c >> new file mode 100644 >> index 000..4e59dcda6ea >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/pr112886.c >> @@ -0,0 +1,29 @@ >> +/* { dg-do compile } */ >> +/* { dg-require-effective-target power10_ok } */ > > I think this needs one more: > > /* { dg-require-effective-target powerpc_vsx_ok } */ I agree with this... >> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ > > ... and > > /* { dg-options "-mdejagnu-cpu=power10 -O2 -mvsx" } */ > > , otherwise with explicit -mno-vsx, this test case would fail. But not with this. The -mdejagnu-cpu=power10 option already enables -mvsx. If the user explcitly forces -mno-vsx via RUNTESTFLAGS, then let them. The options set in RUNTESTFLAGS come after the options in the dg-options line, so even adding -mvsx like the above won't help the test case PASS if we didn't have the powerpc_vsx_ok test. In other words, the -mvsx option doesn't help with anything. All we need is the new powerpc_vsx_ok check and that will guard against the FAIL in the case the user forces -mno-vsx. In that case, we'll just get an UNSUPPORTED and that is fine. Peter
Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.
On 1/24/24 12:04 AM, Kewen.Lin wrote: > on 2024/1/24 11:11, Peter Bergner wrote: >> But not with this. The -mdejagnu-cpu=power10 option already enables -mvsx. >> If the user explcitly forces -mno-vsx via RUNTESTFLAGS, then let them. >> The options set in RUNTESTFLAGS come after the options in the dg-options >> line, so even adding -mvsx like the above won't help the test case PASS > > But this is NOT true, at least on one of internal Power10 machine > (ltcden2-lp1). > > With the command below: > > make check-gcc-c RUNTESTFLAGS="--target_board=unix/-mno-vsx > powerpc.exp=pr112886.c" > > this test case fails without the explicit -mvsx in dg-options. > > From the verbose dumping, the compilation command looks like: > > /home/linkw/gcc/build/gcc-test-debug/gcc/xgcc > -B/home/linkw/gcc/build/gcc-test-debug/gcc/ > /home/linkw/gcc/gcc-test/gcc/testsuite/gcc.target/powerpc/pr112886.c > -mno-vsx > -fdiagnostics-plain-output -mdejagnu-cpu=power10 -O2 -ffat-lto-objects > -fno-ident -S > -o pr112886.s > > "-mno-vsx" comes **before** "-mdejagnu-cpu=power10 -O2" rather than **after**. > > I guess it might be due to different behaviors of different versions of > runtest framework? That is confusing, unless as you say, the behavior changed. The whole reason we added -mdejagnu-cpu= (and the dg-skip usage before that) was due to encountering problems when the test case wanted a specific -mcpu= value and the user overrode it in their RUNTESTFLAGS and that can only happen when its options come last on the command line. Then again, why didn't the powerpc_vsx_ok test not save us here? > So there can be two cases with user explicitly specified -mno-vsx: > > 1) RUNTESTFLAGS comes after dg-options (assuming same order for -mvsx in > powerpc_vsx_ok) > > powerpc_vsx_ok test failed, so UNSUPPORTED > > // with explicit -mvsx does nothing as you said. > > 2) RUNTESTFLAGS comes before dg-options > > powerpc_vsx_ok test succeeds, but FAIL. > > // with suggested -mvsx, make it match the powerpc_vsx_ok checking and the > case not fail. > > As above I think we still need to append the "-mvsx" explicitly. As > tested/verified, it > does help the case not to fail on ltcden2-lp1. I'd like to verify that the behavior did change before we enforce adding that option. The problem is, there are a HUGE number of older test cases that would need updating to "fix" them too. ...and not just adding -mnsx, but -maltivec and basically any -mfoo option where the test case is expecting the feature foo to be used/tested. It would be a huge mess. Peter
Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)
On 12/13/23 2:05 AM, Jakub Jelinek wrote: > On Wed, Dec 13, 2023 at 08:51:16AM +0100, Richard Biener wrote: >> On Tue, 12 Dec 2023, Peter Bergner wrote: >> >>> On 12/12/23 8:36 PM, Jason Merrill wrote: >>>> This test is failing for me below C++17, I think you need >>>> >>>> // { dg-do compile { target c++17 } } >>>> or >>>> // { dg-require-effective-target c++17 } >>> >>> Sorry about that. Should we do the above or should we just add >>> -std=c++17 to dg-options? ...or do we need to do both? >> >> Just do the above, the C++ testsuite iterates over all standards, >> adding -std=c++17 would just run that 5 times. But the above >> properly skips unsupported cases. > > I believe if one uses explicit -std=gnu++17 or -std=c++17 in dg-options > then it will not iterate: > # If the testcase specifies a standard, use that one. > # If not, run it under several standards, allowing GNU extensions > # if there's a dg-options line. > if ![search_for $test "-std=*++"] { > and otherwise how many times exactly it iterates depends on what the user > asked for or what effective target is there (normally the default is > to iterate 4 times (98,14,17,20), one can use e.g. > GXX_TESTSUITE_STDS=98,11,14,17,20,23,26 to iterate 7 times, or the > default also changes if c++23, c++26 or c++11_only effective targets > are present somewhere in the test. > > But sure, if the test is valid in C++17, 20, 23, 26, then > // { dg-do compile { target c++17 } } > is best (unless the test is mostly language version independent and > very expensive to compile or run). I confirmed the test case builds with C++17, 20, 23, 26 and errors out with C++11, so I went with your solution. Thanks for the input and sorry for the breakage. Pushed. Peter testsuite: Add dg-do compile target c++17 directive for testcase [PR112822] Add dg-do compile target directive that limits the test case to being built on c++17 compiles or greater. 2023-12-13 Peter Bergner gcc/testsuite/ PR tree-optimization/112822 * g++.dg/pr112822.C: Add dg-do compile target c++17 directive. diff --git a/gcc/testsuite/g++.dg/pr112822.C b/gcc/testsuite/g++.dg/pr112822.C index d1490405493..a8557522467 100644 --- a/gcc/testsuite/g++.dg/pr112822.C +++ b/gcc/testsuite/g++.dg/pr112822.C @@ -1,4 +1,5 @@ /* PR tree-optimization/112822 */ +/* { dg-do compile { target c++17 } } */ /* { dg-options "-w -O2" } */ /* Verify we do not ICE on the following noisy creduced test case. */
Re: [PATCH] rs6000: Disassemble opaque modes using subregs to allow optimizations [PR109116]
On 11/24/23 3:28 AM, Kewen.Lin wrote: >> + int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode); > > Is it intentional to keep GET_MODE_SIZE (V16QImode) instead of 16? > I think if one day NUM_POLY_INT_COEFFS isn't 1 on rs6000 any more, > we have to add one explicit .to_constant () here. So I prefer this > to use 16 directly, maybe one comment above to indicate what's for > the value 16. I normally don't like hard coding constants in the code, so used GET_MODE_SIZE (V16QImode) as the number of bytes of a vector register, but if that's going to cause an issue in the future, I'm fine using 16. Changed. >> + int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode); > > Likewise. Changed. >> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc >> index 5f56c3ed85b..f2efa46c147 100644 >> --- a/gcc/config/rs6000/rs6000.cc >> +++ b/gcc/config/rs6000/rs6000.cc >> @@ -1964,9 +1964,12 @@ rs6000_hard_regno_mode_ok (unsigned int regno, >> machine_mode mode) >> static bool >> rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2) >> { >> - if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode >> - || mode2 == PTImode || mode2 == OOmode || mode2 == XOmode) >> -return mode1 == mode2; >> + if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode >> + || mode2 == PTImode || mode2 == XOmode) >> + return mode1 == mode2; >> + >> + if (mode2 == OOmode) >> +return ALTIVEC_OR_VSX_VECTOR_MODE (mode1); > > I vaguely remembered that Segher mentioned it's unexpected for opaque > modes to have tieable modes excepting for themselves, but if this is the > only way to get rid of those extra moves, I guess we can special-case > them here. Looking forward to Segher's comments on this part. To be honest, my original patch didn't have this. I think it was Mike who said we want or need this. Mike, why do we want/need this again? That said, the documentation for TARGET_MODES_TIEABLE_P says: This hook returns true if a value of mode mode1 is accessible in mode mode2 without copying. Given OOmode (ie, __vector_pair) under the covers is two consecutive vector registers, and we use them/initialize them with two vectors, then mode1 being a vector mode could be accesible from an OOmode mode2 without copying, meaning we could access it directly from the registers holding mode2. Segher, your input to the above an the subreg portion of the patch in general? Peter
Re: [PATCH V2] rs6000: Change GPR2 to volatile & non-fixed register for function that does not use TOC [PR110320]
On 7/16/23 10:40 PM, P Jeevitha via Gcc-patches wrote: > Normally, GPR2 is the TOC pointer and is defined as a fixed and non-volatile > register. However, it can be used as volatile for PCREL addressing. Therefore, > modified r2 to be non-fixed in FIXED_REGISTERS and set it to fixed if it is > not > PCREL and also when the user explicitly requests TOC or fixed. If the register > r2 is fixed, it is made as non-volatile. Changes in register preservation > roles > can be accomplished with the help of available target hooks > (TARGET_CONDITIONAL_REGISTER_USAGE). > > 2023-07-12 Jeevitha Palanisamy > > gcc/ > PR target/PR110320 > * config/rs6000/rs6000.cc (rs6000_conditional_register_usage): Change > GPR2 to volatile and non-fixed register for PCREL. > > gcc/testsuite/ > PR target/PR110320 > * gcc.target/powerpc/pr110320-1.c: New testcase. > * gcc.target/powerpc/pr110320-2.c: New testcase. > * gcc.target/powerpc/pr110320-3.c: New testcase. > > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index 44b448d2ba6..9aa04ec5d57 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -10193,9 +10193,13 @@ rs6000_conditional_register_usage (void) > for (i = 32; i < 64; i++) >fixed_regs[i] = call_used_regs[i] = 1; > > + /* For non PC-relative code, GPR2 is unavailable for register allocation. > */ > + if (FIXED_R2 && !rs6000_pcrel_p ()) > +fixed_regs[2] = 1; > + >/* The TOC register is not killed across calls in a way that is > visible to the compiler. */ > - if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) > + if (fixed_regs[2] && (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)) > call_used_regs[2] = 0; Segher and Ke Wen, As we discussed on my PR111045 patch that disabled PCREL on everything other than ELFv2 and Segher said, well, it should work on ELFv1, modulo fixing bugs. Segher said we should attempt to fix those bugs before we ship the next release and if we miss that, we can push my PR111045 patch then to disable it. On a related note, Jeevitha's patch above allows using r2 for normal register allocation if r2 is not fixed and pcrel is enabled. Given pcrel with this patch enables pcrel on ELFv1, that means this patch can also enable using r2 for normal register allocation on ELFv1. Is that safe? Should we add a check above where we set fixed_regs[2] = 1, to also check for whether this is not an ELFv2 compile? ...or Segher, should we leave this as is and add it to the things to check for non-ELFv2 compiles before the next release and possible disable it then if we know/aren't sure whether it legal? So I guessing I'm wondering, should Jeevitha push the above approved patch as is, or should we modify it so r2 is only available for RA on ELFv2 and pcrel? Peter
Re: [PATCH V2] rs6000: Change GPR2 to volatile & non-fixed register for function that does not use TOC [PR110320]
On 12/14/23 9:57 PM, Peter Bergner wrote: > On 7/16/23 10:40 PM, P Jeevitha via Gcc-patches wrote: >> + /* For non PC-relative code, GPR2 is unavailable for register allocation. >> */ >> + if (FIXED_R2 && !rs6000_pcrel_p ()) >> +fixed_regs[2] = 1; [snip] > On a related note, Jeevitha's patch above allows using r2 for normal register > allocation if r2 is not fixed and pcrel is enabled. Given pcrel with this > patch > enables pcrel on ELFv1, that means this patch can also enable using r2 for > normal > register allocation on ELFv1. Nevermind, I'm daft and r2 usage is not allowed on ELFv1. The rs6000_pcrel_p() call above is always false for non ELFv2 compiles, so we'll mark r2 as fixed for ELFv1. Move along, nothing to see. :-) That said, I think we need a "dg-require-effective-target powerpc_elfv2" for the first test case where we're checking that we do use r2 for normal RA. That'll only be true on ELFv2 compiles, hence the need for the extra target requirement. I've asked Jeevitha to add that to the pr111045-1.c test case and verify it fixes the failure of that test case on her BE run. Peter
Re: [PATCH] PR target/112886, Add %S to print_operand for vector pair support
On 1/5/24 4:18 PM, Michael Meissner wrote: > @@ -14504,13 +14504,17 @@ print_operand (FILE *file, rtx x, int code) > print_operand (file, x, 0); >return; > > +case 'S': > case 'x': > - /* X is a FPR or Altivec register used in a VSX context. */ > + /* X is a FPR or Altivec register used in a VSX context. %x prints > + the VSX register number, %S prints the 2nd register number for > + vector pair, decimal 128-bit floating and IBM 128-bit binary floating > + values. */ >if (!REG_P (x) || !VSX_REGNO_P (REGNO (x))) > - output_operand_lossage ("invalid %%x value"); > + output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : > 'x')); >else > { > - int reg = REGNO (x); > + int reg = REGNO (x) + (code == 'S' ? 1 : 0); > int vsx_reg = (FP_REGNO_P (reg) >? reg - 32 >: reg - FIRST_ALTIVEC_REGNO + 32); The above looks good to me. However: > +: "=v" (*p) > +: "v" (*q), "v" (*r)); These really should use "wa" rather than "v", since these are VSX instructions... or did you use those to ensure you got Altivec registers numbers assigned? > +/* { dg-final { scan-assembler-times {\mxvadddp > (3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3])\M} > 2 } } */ ...and this is really ugly and hard to read/understand. Can't we use register variables to make it simpler? Something like the following which tests having both FPR and Altivec reg numbers assigned? ... void test (__vector_pair *ptr) { register __vector_pair p asm ("vs10"); register __vector_pair q asm ("vs42"); register __vector_pair r asm ("vs44"); q = ptr[1]; r = ptr[2]; __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %S0,%S1,%S2" : "=wa" (p) : "wa" (q), "wa" (r)); ptr[2] = p; } /* { dg-final { scan-assembler-times {\mxvadddp 10,42,44\M} 1 } } */ /* { dg-final { scan-assembler-times {\mxvadddp 11,43,45\M} 1 } } */ ... Peter
Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)
On 12/12/23 10:50 AM, Martin Jambor wrote: > The testcase has reasonable size but it is specific to ppc64le and its > altivec vectors. My plan is to ask the bug reporter to massage it into > a target specific testcase in bugzilla. Alternatively I can try to > craft a testcase from scratch but that will take time. I rewrote the Altivec specific part of the testcase to use generic C code and it still ICEs for me on ppc64le using an unpatched compiler. Therefore, I think we can just add the updated testcase to the generic g++ tests. I'll note I was wrong in the bugzilla comments, -O3 -mcpu=power10 is not required to hit the ICE. A simple -O2 on ppc64le is enough to hit the ICE. Ok for trunk? Peter testsuite: Add testcase for already fixed PR [PR112822] gcc/testsuite/ PR tree-optimization/112822 * g++.dg/pr112822.C: New test. diff --git a/gcc/testsuite/g++.dg/pr112822.C b/gcc/testsuite/g++.dg/pr112822.C new file mode 100644 index 000..3921d5c1bbe --- /dev/null +++ b/gcc/testsuite/g++.dg/pr112822.C @@ -0,0 +1,369 @@ +/* PR target/112822 */ +/* { dg-options "-w -O2" } */ + +/* Verify we do not ICE on the following noisy creduced test case. */ + +namespace b { +typedef int c; +template struct d; +template struct d { using f = e; }; +template struct aa; +template struct aa { using f = h; }; +template using ab = typename d::f; +template using n = typename aa::f; +template class af { +public: + typedef __complex__ ah; + template af operator+=(e) { +ah o; +x = o; +return *this; + } + ah x; +}; +} // namespace b +namespace { +enum { p }; +enum { ac, ad }; +struct ae; +struct al; +struct ag; +typedef b::c an; +namespace ai { +template struct ak { typedef aj f; }; +template using ar = typename ak::f; +template struct am { + enum { at }; +}; +template struct ao { + enum { at }; +}; +template struct ap; +template struct aq { + enum { at }; +}; +} // namespace ai +template struct ay; +template class as; +template class ba; +template class aw; +template class be; +template class az; +namespace ai { +template struct bg; +template ::bd> +struct bk; +template struct bf; +template struct bm; +template struct bh; +template ::bj>::at> struct bp { + typedef bi f; +}; +template struct br { + typedef typename bp::f>::f f; +}; +template struct bn; +template struct bn { + typedef aw f; +}; +template struct bx { + typedef typename bn::bs, aj ::bo>::f f; +}; +template struct bt { typedef b::n<0, aj, aj> f; }; +template ::f> struct cb { + enum { bw }; + typedef b::n::f> f; +}; +template ::bs> struct by { + typedef be f; +}; +template struct bz { + typedef typename by::f f; +}; +template struct ch; +template struct ch { typedef ci bd; }; +} // namespace ai +template > struct cg; +template struct cg { typedef aj cn; }; +namespace ai { +template cj cp; +template void cl(bu *cr, cj cs) { ct(cr, cs); } +typedef __attribute__((altivec(vector__))) double co; +void ct(double *cr, co cs) { *(co *)cr = cs; } +struct cq { + co q; +}; +template <> struct bm> { typedef cq f; }; +template <> struct bh { typedef cq bj; }; +void ct(b::af *cr, cq cs) { ct((double *)cr, cs.q); } +template struct cx { + template void cu(cw *a, cj) { +cl(a, cp); + } +}; +} // namespace ai +template class ba : public ay { +public: + typedef ai::ap bu; + typedef b::n::bo, bu, b::n::at, bu, bu>> cv; + typedef ay db; + db::dc; + cv coeff(an dd, an col) const { return dc().coeff(dd, col); } +}; +template class cz : public ba::at> { +public: + ai::ap b; + enum { da, dg, dh, bv, bq, di = dg, bo }; +}; +template class be : public cz { +public: + typedef typename ai::ap::bu bu; + typedef cz db; + db::dc; + template cd &operator+=(const be &); + template az df(de); +}; +template struct ay { + cd &dc() { return *static_cast(this); } + cd dc() const; +}; +template class dl; +namespace ai { +template struct ap> { + typedef bb dj; + typedef bc r; + typedef ap s; + typedef ap t; + typedef typename cg::cn bu; + typedef typename ch::bd>::bd cf; + enum { bo }; +}; +} // namespace ai +template +class az : public dl, ai::ap, ai::bg::bd>> { +public: + typedef dk bb; + typedef Rhs_ bc; + typedef typename ai::bt::f LhsNested; + typedef typename ai::bt::f dn; + typedef ai::ar u; + typedef ai::ar RhsNestedCleaned; + u lhs(); + RhsNestedCleaned rhs(); +}; +template +class dl : public ai::bz, al>::f {}; +namespace ai { +template struct v { typedef ag w; }; +template struct evaluator_traits_base { + typedef typename v::cf>::w w; +}; +template struct ax : evaluator_traits_base {}; +template struct y { static const bool at = false; }; +template class plainobjectbase_evaluator_data { +public: + plainobjectbase_evaluator_data(bu *ptr, an) : data(ptr) {} + an outerStride() { return z; } + bu *data; +}; +template struct evaluator { +
Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)
On 12/12/23 12:45 PM, Peter Bergner wrote: > +/* PR target/112822 */ Oops, this should be: /* PR tree-optimization/112822 */ It's fixed on my end. Peter
Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)
On 12/12/23 1:26 PM, Richard Biener wrote: >> Am 12.12.2023 um 19:51 schrieb Peter Bergner : >> >> On 12/12/23 12:45 PM, Peter Bergner wrote: >>> +/* PR target/112822 */ >> >> Oops, this should be: >> >> /* PR tree-optimization/112822 */ >> >> It's fixed on my end. > > Ok Pushed now that Martin has pushed his fix. Thanks! Peter
Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)
On 12/12/23 8:36 PM, Jason Merrill wrote: > This test is failing for me below C++17, I think you need > > // { dg-do compile { target c++17 } } > or > // { dg-require-effective-target c++17 } Sorry about that. Should we do the above or should we just add -std=c++17 to dg-options? ...or do we need to do both? Peter
[PATCH] configure: Only create serdep.tmp if needed
There's no reason to create this file if none of the serial configure options are passed. ChangeLog: * configure: Regenerate. * configure.ac: Only create serdep.tmp if needed Signed-off-by: Peter Foley --- configure| 2 ++ configure.ac | 2 ++ 2 files changed, 4 insertions(+) diff --git a/configure b/configure index 85883099410..47caaeb23fe 100755 --- a/configure +++ b/configure @@ -9918,7 +9918,9 @@ esac # These force 'configure's to be done one at a time, to avoid problems # with contention over a shared config.cache. rm -f serdep.tmp +if "x${enable_serial_build_configure}" = xyes || "x${enable_serial_host_configure}" = xyes || "x${enable_serial_target_configure}" = xyes ; then echo '# serdep.tmp' > serdep.tmp +fi olditem= test "x${enable_serial_build_configure}" = xyes && for item in ${build_configdirs} ; do diff --git a/configure.ac b/configure.ac index 2b612dce6e9..847df0c99b5 100644 --- a/configure.ac +++ b/configure.ac @@ -3071,7 +3071,9 @@ esac # These force 'configure's to be done one at a time, to avoid problems # with contention over a shared config.cache. rm -f serdep.tmp +if [ "x${enable_serial_build_configure}" = xyes || "x${enable_serial_host_configure}" = xyes || "x${enable_serial_target_configure}" = xyes ]; then echo '# serdep.tmp' > serdep.tmp +fi olditem= test "x${enable_serial_build_configure}" = xyes && for item in ${build_configdirs} ; do -- 2.39.0
[PATCH v2] configure: Only create serdep.tmp if needed
There's no reason to create this file if none of the serial configure options are passed. v2: Use test instead of [ to avoid running afoul of autoconf quoting. ChangeLog: * configure: Regenerate. * configure.ac: Only create serdep.tmp if needed Signed-off-by: Peter Foley --- configure| 2 ++ configure.ac | 2 ++ 2 files changed, 4 insertions(+) diff --git a/configure b/configure index 85883099410..0494e2fa2bf 100755 --- a/configure +++ b/configure @@ -9918,7 +9918,9 @@ esac # These force 'configure's to be done one at a time, to avoid problems # with contention over a shared config.cache. rm -f serdep.tmp +if test "x${enable_serial_build_configure}" = xyes || test "x${enable_serial_host_configure}" = xyes || test "x${enable_serial_target_configure}" = xyes; then echo '# serdep.tmp' > serdep.tmp +fi olditem= test "x${enable_serial_build_configure}" = xyes && for item in ${build_configdirs} ; do diff --git a/configure.ac b/configure.ac index 2b612dce6e9..f5cce5830bc 100644 --- a/configure.ac +++ b/configure.ac @@ -3071,7 +3071,9 @@ esac # These force 'configure's to be done one at a time, to avoid problems # with contention over a shared config.cache. rm -f serdep.tmp +if test "x${enable_serial_build_configure}" = xyes || test "x${enable_serial_host_configure}" = xyes || test "x${enable_serial_target_configure}" = xyes; then echo '# serdep.tmp' > serdep.tmp +fi olditem= test "x${enable_serial_build_configure}" = xyes && for item in ${build_configdirs} ; do -- 2.39.0
[PATCH] ada: Respect GNATMAKE
Use the GNATMAKE variables consistently. Avoids failures when bootstraping with a custom GNATMAKE value. gcc/ada/ChangeLog: * Make-generated.in: Use GNATMAKE. * gcc-interface/Makefile.in: Ditto. Signed-off-by: Peter Foley --- gcc/ada/Make-generated.in | 6 +++--- gcc/ada/gcc-interface/Makefile.in | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/gcc/ada/Make-generated.in b/gcc/ada/Make-generated.in index 948fc508a56..34c86b2cd63 100644 --- a/gcc/ada/Make-generated.in +++ b/gcc/ada/Make-generated.in @@ -18,7 +18,7 @@ GEN_IL_FLAGS = -gnata -gnat2012 -gnatw.g -gnatyg -gnatU $(GEN_IL_INCLUDES) ada/seinfo_tables.ads ada/seinfo_tables.adb ada/sinfo.h ada/einfo.h ada/nmake.ads ada/nmake.adb ada/seinfo.ads ada/sinfo-nodes.ads ada/sinfo-nodes.adb ada/einfo-entities.ads ada/einfo-entities.adb: ada/stamp-gen_il ; @true ada/stamp-gen_il: $(fsrcdir)/ada/gen_il* $(MKDIR) ada/gen_il - cd ada/gen_il; gnatmake -q -g $(GEN_IL_FLAGS) gen_il-main + cd ada/gen_il; $(GNATMAKE) -q -g $(GEN_IL_FLAGS) gen_il-main # Ignore errors to work around finalization issues in older compilers - cd ada/gen_il; ./gen_il-main $(fsrcdir)/../move-if-change ada/gen_il/seinfo_tables.ads ada/seinfo_tables.ads @@ -39,14 +39,14 @@ ada/stamp-gen_il: $(fsrcdir)/ada/gen_il* # would cause bootstrapping with older compilers to fail. You can call it by # hand, as a sanity check that these files are legal. ada/seinfo_tables.o: ada/seinfo_tables.ads ada/seinfo_tables.adb - cd ada ; gnatmake $(GEN_IL_INCLUDES) seinfo_tables.adb -gnatU -gnatX + cd ada ; $(GNATMAKE) $(GEN_IL_INCLUDES) seinfo_tables.adb -gnatU -gnatX ada/snames.h ada/snames.ads ada/snames.adb : ada/stamp-snames ; @true ada/stamp-snames : ada/snames.ads-tmpl ada/snames.adb-tmpl ada/snames.h-tmpl ada/xsnamest.adb ada/xutil.ads ada/xutil.adb -$(MKDIR) ada/bldtools/snamest $(RM) $(addprefix ada/bldtools/snamest/,$(notdir $^)) $(CP) $^ ada/bldtools/snamest - cd ada/bldtools/snamest; gnatmake -q xsnamest ; ./xsnamest + cd ada/bldtools/snamest; $(GNATMAKE) -q xsnamest ; ./xsnamest $(fsrcdir)/../move-if-change ada/bldtools/snamest/snames.ns ada/snames.ads $(fsrcdir)/../move-if-change ada/bldtools/snamest/snames.nb ada/snames.adb $(fsrcdir)/../move-if-change ada/bldtools/snamest/snames.nh ada/snames.h diff --git a/gcc/ada/gcc-interface/Makefile.in b/gcc/ada/gcc-interface/Makefile.in index da6a56fcec8..c8c38acf447 100644 --- a/gcc/ada/gcc-interface/Makefile.in +++ b/gcc/ada/gcc-interface/Makefile.in @@ -616,7 +616,7 @@ OSCONS_EXTRACT=$(GCC_FOR_ADA_RTS) $(GNATLIBCFLAGS_FOR_C) -S s-oscons-tmplt.i -$(MKDIR) ./bldtools/oscons $(RM) $(addprefix ./bldtools/oscons/,$(notdir $^)) $(CP) $^ ./bldtools/oscons - (cd ./bldtools/oscons ; gnatmake -q xoscons) + (cd ./bldtools/oscons ; $(GNATMAKE) -q xoscons) $(RTSDIR)/s-oscons.ads: ../stamp-gnatlib1-$(RTSDIR) s-oscons-tmplt.c gsocket.h ./bldtools/oscons/xoscons $(RM) $(RTSDIR)/s-oscons-tmplt.i $(RTSDIR)/s-oscons-tmplt.s -- 2.39.0
Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]
On 8/27/23 9:06 PM, Kewen.Lin wrote: > Assuming we only have ELFv2_ABI_CHECK in PCREL_SUPPORTED_BY_OS, we > can have either TARGET_PCREL or !TARGET_PCREL after the checking. > For the latter, it's fine and don't need any checks. For the former, > if it's implicit, for !TARGET_PREFIXED we will clean it silently; > while if it's explicit, for !TARGET_PREFIXED we will emit an error. > TARGET_PREFIXED checking has considered Power10, so it's also > concerned accordingly. [snip] > Yeah, looking forward to their opinions. IMHO, with the current proposed > change, pcrel doesn't look like a pure Power10 hardware feature, it also > quite relies on ABIs, that's why I thought it seems good not to turn it > on by default for Power10. Ok, how about the patch below? This removes OPTION_MASK_PCREL from the power10 flags, so instead of our options override code needing to disable PCREL on the systems that don't support it, we now enable it only on those systems that do support it. Jeevitha, can you test this patch to see whether it fixes the testsuite issue caused by your earlier patch that was approved, but not yet pushed? That was the use GPR2 for register allocation, correct? Note, you'll need to update the patch to replace the rs6000_pcrel_p() usage with just TARGET_PCREL, since this patch removes rs6000_pcrel_p(). If testing is clean and everyone is OK with the patch, I'll officially submit it for review with git log entry, etc. Peter gcc/ * config/rs6000/linux64.h (PCREL_SUPPORTED_BY_OS): Only test the ABI. * config/rs6000/rs6000-cpus.def (RS6000_CPU): Remove OPTION_MASK_PCREL from power10. * config/rs6000/predicates.md: Use TARGET_PCREL. * config/rs6000/rs6000-logue.cc (rs6000_decl_ok_for_sibcall): Likewise. (rs6000_global_entry_point_prologue_needed_p): Likewise. (rs6000_output_function_prologue): Likewise. * config/rs6000/rs6000.md: Likewise. * config/rs6000/rs6000.cc (rs6000_option_override_internal): Rework the logic for enabling PCREL by default. (rs6000_legitimize_tls_address): Use TARGET_PCREL. (rs6000_call_template_1): Likewise. (rs6000_indirect_call_template_1): Likewise. (rs6000_longcall_ref): Likewise. (rs6000_call_aix): Likewise. (rs6000_sibcall_aix): Likewise. (rs6000_pcrel_p): Remove. * config/rs6000/rs6000-protos.h (rs6000_pcrel_p): Likewise. diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h index 98b7255c95f..5b77bd7fd51 100644 --- a/gcc/config/rs6000/linux64.h +++ b/gcc/config/rs6000/linux64.h @@ -563,8 +563,5 @@ extern int dot_symbols; #define TARGET_FLOAT128_ENABLE_TYPE 1 /* Enable using prefixed PC-relative addressing on POWER10 if the ABI - supports it. The ELF v2 ABI only supports PC-relative relocations for - the medium code model. */ -#define PCREL_SUPPORTED_BY_OS (TARGET_POWER10 && TARGET_PREFIXED \ -&& ELFv2_ABI_CHECK \ -&& TARGET_CMODEL == CMODEL_MEDIUM) + supports it. */ +#define PCREL_SUPPORTED_BY_OS (ELFv2_ABI_CHECK) diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def index 4f350da378c..fe01a2312ae 100644 --- a/gcc/config/rs6000/rs6000-cpus.def +++ b/gcc/config/rs6000/rs6000-cpus.def @@ -256,7 +256,8 @@ RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER | OPTION_MASK_HTM) RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER | OPTION_MASK_HTM) -RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64 | ISA_3_1_MASKS_SERVER) +RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64 + | (ISA_3_1_MASKS_SERVER & ~OPTION_MASK_PCREL)) RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0) RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, OPTION_MASK_PPC_GFXOPT | MASK_POWERPC64) diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index ef7d3f214c4..0b76541fc0a 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -1216,7 +1216,7 @@ && SYMBOL_REF_DECL (op) != NULL && TREE_CODE (SYMBOL_REF_DECL (op)) == FUNCTION_DECL && (rs6000_fndecl_pcrel_p (SYMBOL_REF_DECL (op)) -!= rs6000_pcrel_p ()))"))) +!= TARGET_PCREL))"))) ;; Return 1 if this operand is a valid input for a move insn. (define_predicate "input_operand" diff --git a/gcc/config/rs6000/rs6000-logue.cc b/gcc/config/rs6000/rs6000-logue.cc index 98846f781ec..9e08d9bb4d2 100644 --- a/gcc/config/rs6000/rs6000-logue.cc +++ b/gcc/config/rs6000/rs6000-lo
Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]
On 8/25/23 6:20 AM, Kewen.Lin wrote: > btw, I was also expecting that we don't implicitly set > OPTION_MASK_PCREL any more for Power10, that is to remove > OPTION_MASK_PCREL from OTHER_POWER10_MASKS. So my patch removes the flag from the default power10 flags, like you want. However, it doesn't remove it from OTHER_POWER10_MASKS, since that is used to set ISA_3_1_MASKS_SERVER and I didn't want to change how rs6000_machine_from_flags() behaves, so instead, I just explicitly mask it off when defining the power10 default flags. Peter
Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]
On 11/13/23 8:33 PM, Kewen.Lin wrote: >> if (PCREL_SUPPORTED_BY_OS) > >> + else >> +{ >> + if (TARGET_PCREL >> + && (rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0) >> +error ("use of %qs is invalid for this target", "-mpcrel"); >>rs6000_isa_flags &= ~OPTION_MASK_PCREL; >> } > > now, I think it should be fine not to explicitly mask it off Power10 > default flags? Then leave Power10 default flags unchanged seems more > consistent. The others look good to me! Thanks again. Ah, good catch, yes, I'll remove the clearing of the bit, since we know it's either already clear (because that is the default value) or if it's set, then that's the error condition we're catching here. That also means that we can remove the test for rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0 too, since at this point, TARGET_PCREL can only be true if it was explicitly enabled with -mpcrel by the user. Therefor, the code can now look like: if (PCREL_SUPPORTED_BY_OS) else if (TARGET_PCREL) error ("use of %qs is invalid for this target", "-mpcrel"); I'll make that change, redo the bootstrap and regtesting and then officially submit the patch. Thanks! Peter
Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce
On 11/12/23 6:08 AM, Lehua Ding wrote: > V3 Changes: > 1. fix three ICE. > 2. rebase I tested this on powerpc64le-linux and powerpc64-linux. The LE build bootstrapped fine and it looks like only one testsuite FAIL which I have to look into why it's FAILing. The BE build did bootstrap, but the 32-bit and 64-bit testsuite runs both had lots of FAILs (over 100 between them both) which I have yet to look into what is happening. I'll also note I have done no performance testing yet until I have an idea of what the testsuite failures are. I think a patch like this that can affect the performance of all architectures needs some performance testing to ensure we don't have unintended performance degradations. I'll have someone on my team kick off some builds once I have a handle on the testsuite FAILs. Peter
Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce
On 11/13/23 11:37 PM, Lehua Ding wrote: > On 2023/11/14 3:37, Vladimir Makarov wrote: >> Also besides testing major targets I'd recommend testing at least one big >> endian target (I'd recommend ppc64be. gcc110.fsfrance.org could be used >> for this). Plenty RA issues occur because BE targets are not tested. > > You said the address looks a bit wrong, it should be this gcc110.fsffrance.org > right? I looked for it and it looks like you have to go to portal.cfarm.net > first to apply for an account on this site, I'll try that, thanks a lot. The compile farm just went through with a domain name change, so the Power7 BE gcc110.fsffrance.org system is now reachable via cfarm110.cfarm.net. You are correct on the address for requesting a cfarm account. That said, I posted results using your V3 patches for both LE and BE Power in my other reply. Peter
[PATCH] rs6000: Only enable PCREL on supported ABIs [PR111045]
PCREL data accesses are only officially supported on ELFv2. We currently incorrectly enable PCREL on all Power10 compiles in which prefix instructions are also enabled. Rework the option override code so we only enable PCREL for those ABIs that actually support it. Jeevitha has confirmed this patch fixes the testsuite fallout seen with her PR110320 patch. This has been bootstrapped and regtested with no regressions on the following builds: powerpc64le-linux, powerpc64le-linux --with-cpu=power10 and powerpc64-linux - testsuite run in both 32-bit and 64-bit modes. Ok for trunk? Ok for the release branches after some burn-in on trunk? Peter gcc/ PR target/111045 * config/rs6000/linux64.h (PCREL_SUPPORTED_BY_OS): Only test the ABI. * config/rs6000/rs6000-cpus.def (RS6000_CPU): Remove OPTION_MASK_PCREL from power10. * config/rs6000/predicates.md: Use TARGET_PCREL. * config/rs6000/rs6000-logue.cc (rs6000_decl_ok_for_sibcall): Likewise. (rs6000_global_entry_point_prologue_needed_p): Likewise. (rs6000_output_function_prologue): Likewise. * config/rs6000/rs6000.md: Likewise. * config/rs6000/rs6000.cc (rs6000_option_override_internal): Rework the logic for enabling PCREL by default. (rs6000_legitimize_tls_address): Use TARGET_PCREL. (rs6000_call_template_1): Likewise. (rs6000_indirect_call_template_1): Likewise. (rs6000_longcall_ref): Likewise. (rs6000_call_aix): Likewise. (rs6000_sibcall_aix): Likewise. (rs6000_pcrel_p): Remove. * config/rs6000/rs6000-protos.h (rs6000_pcrel_p): Likewise. gcc/testsuite/ PR target/111045 * gcc.target/powerpc/pr111045.c: New test. * gcc.target/powerpc/float128-constant.c: Add instruction counts for non-pcrel compiles. diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h index 98b7255c95f..5b77bd7fd51 100644 --- a/gcc/config/rs6000/linux64.h +++ b/gcc/config/rs6000/linux64.h @@ -563,8 +563,5 @@ extern int dot_symbols; #define TARGET_FLOAT128_ENABLE_TYPE 1 /* Enable using prefixed PC-relative addressing on POWER10 if the ABI - supports it. The ELF v2 ABI only supports PC-relative relocations for - the medium code model. */ -#define PCREL_SUPPORTED_BY_OS (TARGET_POWER10 && TARGET_PREFIXED \ -&& ELFv2_ABI_CHECK \ -&& TARGET_CMODEL == CMODEL_MEDIUM) + supports it. */ +#define PCREL_SUPPORTED_BY_OS (ELFv2_ABI_CHECK) diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def index 4f350da378c..fe01a2312ae 100644 --- a/gcc/config/rs6000/rs6000-cpus.def +++ b/gcc/config/rs6000/rs6000-cpus.def @@ -256,7 +256,8 @@ RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER | OPTION_MASK_HTM) RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER | OPTION_MASK_HTM) -RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64 | ISA_3_1_MASKS_SERVER) +RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64 + | (ISA_3_1_MASKS_SERVER & ~OPTION_MASK_PCREL)) RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0) RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, OPTION_MASK_PPC_GFXOPT | MASK_POWERPC64) diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index ef7d3f214c4..0b76541fc0a 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -1216,7 +1216,7 @@ && SYMBOL_REF_DECL (op) != NULL && TREE_CODE (SYMBOL_REF_DECL (op)) == FUNCTION_DECL && (rs6000_fndecl_pcrel_p (SYMBOL_REF_DECL (op)) -!= rs6000_pcrel_p ()))"))) +!= TARGET_PCREL))"))) ;; Return 1 if this operand is a valid input for a move insn. (define_predicate "input_operand" diff --git a/gcc/config/rs6000/rs6000-logue.cc b/gcc/config/rs6000/rs6000-logue.cc index 98846f781ec..9e08d9bb4d2 100644 --- a/gcc/config/rs6000/rs6000-logue.cc +++ b/gcc/config/rs6000/rs6000-logue.cc @@ -1106,7 +1106,7 @@ rs6000_decl_ok_for_sibcall (tree decl) r2 for its caller's TOC. Such a function may make sibcalls to any function, whether local or external, without restriction based on TOC-save/restore rules. */ - if (rs6000_pcrel_p ()) + if (TARGET_PCREL) return true; /* Otherwise, under the AIX or ELFv2 ABIs we can't allow sibcalls @@ -2583,7 +2583,7 @@ rs6000_global_entry_point_prologue_needed_p (void) return false; /* PC-relative functions never generate a global entry point prologue. */ - if (rs6000_pcrel_p ()) + if (TARGET_PCREL) return f
Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce
On 11/14/23 9:12 PM, Lehua Ding wrote: > I've applied for machine permissions on the compile farm, can you give > me the way to compile and run tests on PPC64BE machine? I'll take a look > at it too, thanks a lot. That's an old system, with too old system libgmp, etc. Let me attempt a build there so I can give you correct build directions for that system. That said, unfortunately, that system is currently almost out of available disk space: [bergner@gcc1-power7 ~]$ df -h Filesystem Size Used Avail Use% Mounted on ... /dev/md41.6T 1.6T 9.0G 100% /home Segher, can you please send out an admin note for people to clean up unneeded space on cfarm110? Thanks. Peter
[PATCH] rs6000: Disassemble opaque modes using subregs to allow optimizations [PR109116]
PR109116 exposes an issue where using unspecs to access each vector component of an opaque mode variable leads to unneeded register copies, because our rtl optimizers cannot handle unspecs. Instead, use subregs to access each vector component of the opaque mode variable, which our optimizers know how to handle. I did not include a test case with the patch, since writing a test case that attempts to ensure we don't emit unneeded register copies is nearly impossible since those copies can still be generated for reasons other than the causes in this patch. I have verified that this patch does improve code generation for some unit tests and our AI libraries team has confirmed that performance of their tests improved when using this patch. This passed bootstrap and regtesting with no regressions on powerpc64le-linux and powerpc64-linux. Ok for trunk? Peter gcc/ PR target/109116 * config/rs6000/mma.md (vsx_disassemble_pair): Expand into a vector register sized subreg. * config/rs6000/mma.md (*vsx_disassemble_pair): Delete. (mma_disassemble_acc): Expand into a vector register sized subreg. (*mma_disassemble_acc): Delete. * config/rs6000/rs6000.cc (rs6000_modes_tieable_p): Allow vector modes to tie with OOmode. diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md index 575751d477e..2ca405469e2 100644 --- a/gcc/config/rs6000/mma.md +++ b/gcc/config/rs6000/mma.md @@ -398,29 +398,8 @@ (define_expand "vsx_disassemble_pair" (match_operand 2 "const_0_to_1_operand")] "TARGET_MMA" { - rtx src; - int regoff = INTVAL (operands[2]); - src = gen_rtx_UNSPEC (V16QImode, - gen_rtvec (2, operands[1], GEN_INT (regoff)), - UNSPEC_MMA_EXTRACT); - emit_move_insn (operands[0], src); - DONE; -}) - -(define_insn_and_split "*vsx_disassemble_pair" - [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa") - (unspec:V16QI [(match_operand:OO 1 "vsx_register_operand" "wa") - (match_operand 2 "const_0_to_1_operand")] - UNSPEC_MMA_EXTRACT))] - "TARGET_MMA - && vsx_register_operand (operands[1], OOmode)" - "#" - "&& reload_completed" - [(const_int 0)] -{ - int reg = REGNO (operands[1]); - int regoff = INTVAL (operands[2]); - rtx src = gen_rtx_REG (V16QImode, reg + regoff); + int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode); + rtx src = simplify_gen_subreg (V16QImode, operands[1], OOmode, regoff); emit_move_insn (operands[0], src); DONE; }) @@ -472,29 +451,8 @@ (define_expand "mma_disassemble_acc" (match_operand 2 "const_0_to_3_operand")] "TARGET_MMA" { - rtx src; - int regoff = INTVAL (operands[2]); - src = gen_rtx_UNSPEC (V16QImode, - gen_rtvec (2, operands[1], GEN_INT (regoff)), - UNSPEC_MMA_EXTRACT); - emit_move_insn (operands[0], src); - DONE; -}) - -(define_insn_and_split "*mma_disassemble_acc" - [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa") - (unspec:V16QI [(match_operand:XO 1 "fpr_reg_operand" "d") - (match_operand 2 "const_0_to_3_operand")] - UNSPEC_MMA_EXTRACT))] - "TARGET_MMA - && fpr_reg_operand (operands[1], XOmode)" - "#" - "&& reload_completed" - [(const_int 0)] -{ - int reg = REGNO (operands[1]); - int regoff = INTVAL (operands[2]); - rtx src = gen_rtx_REG (V16QImode, reg + regoff); + int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode); + rtx src = simplify_gen_subreg (V16QImode, operands[1], XOmode, regoff); emit_move_insn (operands[0], src); DONE; }) diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 5f56c3ed85b..f2efa46c147 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -1964,9 +1964,12 @@ rs6000_hard_regno_mode_ok (unsigned int regno, machine_mode mode) static bool rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2) { - if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode - || mode2 == PTImode || mode2 == OOmode || mode2 == XOmode) -return mode1 == mode2; + if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode + || mode2 == PTImode || mode2 == XOmode) + return mode1 == mode2; + + if (mode2 == OOmode) +return ALTIVEC_OR_VSX_VECTOR_MODE (mode1); if (ALTIVEC_OR_VSX_VECTOR_MODE (mode1)) return ALTIVEC_OR_VSX_VECTOR_MODE (mode2);
[PATCH] diagnostics: Follow DECL_ABSTRACT_ORIGIN links in lhd_decl_printable_name [PR102061]
Currently, if a warning references a cloned function, the name of the cloned function will be emitted in the "In function 'xyz'" part of the diagnostic, which users aren't supposed to see. This patch follows the DECL_ABSTRACT_ORIGIN links until encountering the original function. gcc/ChangeLog: PR diagnostics/102061 * langhooks.cc (lhd_decl_printable_name): Follow DECL_ABSTRACT_ORIGIN links to the source Signed-off-by: Peter Damianov --- I would add a testcase but I'm not familiar with that process, and would need some help. I also did not bootstrap or test this patch, I'm posting to see if the CI will do it for me. I used "while" because I'm not sure if there can be clones of clones or not. The second check is because I see comments elsewhere that say: "DECL_ABSTRACT_ORIGIN can point to itself", so I want to avoid a potential infinite loop. gcc/langhooks.cc | 2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/langhooks.cc b/gcc/langhooks.cc index 61f2b676256..89a89b74535 100644 --- a/gcc/langhooks.cc +++ b/gcc/langhooks.cc @@ -223,6 +223,8 @@ lhd_get_alias_set (tree ARG_UNUSED (t)) const char * lhd_decl_printable_name (tree decl, int ARG_UNUSED (verbosity)) { + while (DECL_ABSTRACT_ORIGIN(decl) && DECL_ABSTRACT_ORIGIN(decl) != decl) +decl = DECL_ABSTRACT_ORIGIN(decl); gcc_assert (decl && DECL_NAME (decl)); return IDENTIFIER_POINTER (DECL_NAME (decl)); } -- 2.39.2
[PATCH v2] diagnostics: Follow DECL_ORIGIN in lhd_decl_printable_name [PR102061]
Currently, if a warning references a cloned function, the name of the cloned function will be emitted in the "In function 'xyz'" part of the diagnostic, which users aren't supposed to see. This patch follows the DECL_ORIGIN link to get the name of the original function. gcc/ChangeLog: PR diagnostics/102061 * langhooks.cc (lhd_decl_printable_name): Follow DECL_ORIGIN link Signed-off-by: Peter Damianov --- v2: use DECL_ORIGIN instead of DECL_ABSTRACT_ORIGIN and remove loop gcc/langhooks.cc | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/langhooks.cc b/gcc/langhooks.cc index 61f2b676256..943b8345a95 100644 --- a/gcc/langhooks.cc +++ b/gcc/langhooks.cc @@ -223,6 +223,7 @@ lhd_get_alias_set (tree ARG_UNUSED (t)) const char * lhd_decl_printable_name (tree decl, int ARG_UNUSED (verbosity)) { + decl = DECL_ORIGIN(decl); gcc_assert (decl && DECL_NAME (decl)); return IDENTIFIER_POINTER (DECL_NAME (decl)); } -- 2.39.2
Re: [PATCH] rs6000: ROP - Emit hashst and hashchk insns on Power8 and later [PR114759]
On 7/3/24 4:01 AM, Kewen.Lin wrote: >> - if (TARGET_POWER10 >> + if (TARGET_POWER8 >>&& info->calls_p >>&& DEFAULT_ABI == ABI_ELFv2 >>&& rs6000_rop_protect) > > Nit: I noticed that this is the only place to change > info->rop_hash_size to non-zero, and ... > >> @@ -3277,7 +3277,7 @@ rs6000_emit_prologue (void) >>/* NOTE: The hashst isn't needed if we're going to do a sibcall, >> but there's no way to know that here. Harmless except for >> performance, of course. */ >> - if (TARGET_POWER10 && rs6000_rop_protect && info->rop_hash_size != 0) >> + if (TARGET_POWER8 && rs6000_rop_protect && info->rop_hash_size != 0) > > ... this condition and ... > >> { >>gcc_assert (DEFAULT_ABI == ABI_ELFv2); >>rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); >> @@ -5056,7 +5056,7 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type) >> >>/* The ROP hash check must occur after the stack pointer is restored >> (since the hash involves r1), and is not performed for a sibcall. */ >> - if (TARGET_POWER10 >> + if (TARGET_POWER8>&& rs6000_rop_protect >>&& info->rop_hash_size != 0 > > ... here, both check info->rop_hash_size isn't zero, I think we can drop these > two TARGET_POWER10 (TARGET_POWER8) and rs6000_rop_protect checks? Instead > just > update the inner gcc_assert (now checking DEFAULT_ABI == ABI_ELFv2) by extra > checkings on TARGET_POWER8 && rs6000_rop_protect? > > The other looks good to me, ok for trunk with this nit tweaked (if you agree > with it and re-tested well), thanks! I agree with you, because the next patch I haven't submitted yet (waiting on this to get in), makes that simplification as part of the adding earlier checking of invalid options. :-) The follow-on patch will not only remove the TARGET_* and the 2nd/3rd rs6000_rop_protect usage, but will also remove the test and asserts of ELFv2...because we've already verified valid option usage earlier in the normal options handling code. Therefore, I'd like to keep this patch as simple as possible and limited to the TARGET_POWER10 -> TARGET_POWER8 change and the cleanup of those tests is coming in the next patch...which has already been tested. Peter
[PING*2][PATCH v2] rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]
Ping * 2. [Message-ID: <1e003d78-3b2e-4263-830a-7c00a3e9d...@linux.ibm.com>] Segher, this resolves the issues you mentioned in your review. Peter On 6/18/24 5:59 PM, Peter Bergner wrote: > Updated patch. This passed bootstrap and regtesting on powerpc64le-linux > with no regressions. Ok for trunk? > > Changes from v1: > 1. Moved the disabling of shrink-wrapping to rs6000_emit_prologue >and beefed up comment. Used a more accurate test. > 2. Added comment to the test case on why rop_ok is needed. > > Peter > > > rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759] > > Only disable shrink-wrapping when using -mrop-protect when we know we > will be emitting the ROP-protect hash instructions (ie, non-leaf functions). > > 2024-06-17 Peter Bergner > > gcc/ > PR target/114759 > * config/rs6000/rs6000.cc (rs6000_override_options_after_change): Move > the disabling of shrink-wrapping from here > * config/rs6000/rs6000-logue.cc (rs6000_emit_prologue): ...to here. > > gcc/testsuite/ > PR target/114759 > * gcc.target/powerpc/pr114759-1.c: New test. > --- > gcc/config/rs6000/rs6000-logue.cc | 5 + > gcc/config/rs6000/rs6000.cc | 4 > gcc/testsuite/gcc.target/powerpc/pr114759-1.c | 16 > 3 files changed, 21 insertions(+), 4 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-1.c > > diff --git a/gcc/config/rs6000/rs6000-logue.cc > b/gcc/config/rs6000/rs6000-logue.cc > index 193e2122c0f..c384e48e378 100644 > --- a/gcc/config/rs6000/rs6000-logue.cc > +++ b/gcc/config/rs6000/rs6000-logue.cc > @@ -3018,6 +3018,11 @@ rs6000_emit_prologue (void) > && (lookup_attribute ("no_split_stack", > DECL_ATTRIBUTES > (cfun->decl)) > == NULL)); > + /* If we are inserting ROP-protect hash instructions, disable shrink-wrap > + until the bug where the hashst insn is emitted in the wrong location > + is fixed. See PR101324 for details. */ > + if (info->rop_hash_size) > +flag_shrink_wrap = 0; > >frame_pointer_needed_indeed > = frame_pointer_needed && df_regs_ever_live_p > (HARD_FRAME_POINTER_REGNUM); > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index e4dc629ddcc..fd6e013c346 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -3427,10 +3427,6 @@ rs6000_override_options_after_change (void) > } >else if (!OPTION_SET_P (flag_cunroll_grow_size)) > flag_cunroll_grow_size = flag_peel_loops || optimize >= 3; > - > - /* If we are inserting ROP-protect instructions, disable shrink wrap. */ > - if (rs6000_rop_protect) > -flag_shrink_wrap = 0; > } > > #ifdef TARGET_USES_LINUX64_OPT > diff --git a/gcc/testsuite/gcc.target/powerpc/pr114759-1.c > b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c > new file mode 100644 > index 000..579e08e920f > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c > @@ -0,0 +1,16 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect > -fdump-rtl-pro_and_epilogue" } */ > +/* { dg-require-effective-target rop_ok } Only enable on supported ABIs. */ > + > +/* Verify we still attempt shrink-wrapping when using -mrop-protect > + and there are no function calls. */ > + > +long > +foo (long arg) > +{ > + if (arg) > +asm ("" ::: "r20"); > + return 0; > +} > + > +/* { dg-final { scan-rtl-dump-times "Performing shrink-wrapping" 1 > "pro_and_epilogue" } } */
[PING*3][PATCH v2] rs6000: ROP - Do not disable shrink-wrapping for, leaf functions [PR114759]
Ping * 3. [Message-ID: <1e003d78-3b2e-4263-830a-7c00a3e9d...@linux.ibm.com>] Segher, this resolves the issues you mentioned in your review. Peter On 6/18/24 5:59 PM, Peter Bergner wrote: > Updated patch. This passed bootstrap and regtesting on powerpc64le-linux > with no regressions. Ok for trunk? > > Changes from v1: > 1. Moved the disabling of shrink-wrapping to rs6000_emit_prologue >and beefed up comment. Used a more accurate test. > 2. Added comment to the test case on why rop_ok is needed. > > Peter > > > rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759] > > Only disable shrink-wrapping when using -mrop-protect when we know we > will be emitting the ROP-protect hash instructions (ie, non-leaf functions). > > 2024-06-17 Peter Bergner > > gcc/ > PR target/114759 > * config/rs6000/rs6000.cc (rs6000_override_options_after_change): Move > the disabling of shrink-wrapping from here > * config/rs6000/rs6000-logue.cc (rs6000_emit_prologue): ...to here. > > gcc/testsuite/ > PR target/114759 > * gcc.target/powerpc/pr114759-1.c: New test. > --- > gcc/config/rs6000/rs6000-logue.cc | 5 + > gcc/config/rs6000/rs6000.cc | 4 > gcc/testsuite/gcc.target/powerpc/pr114759-1.c | 16 > 3 files changed, 21 insertions(+), 4 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-1.c > > diff --git a/gcc/config/rs6000/rs6000-logue.cc > b/gcc/config/rs6000/rs6000-logue.cc > index 193e2122c0f..c384e48e378 100644 > --- a/gcc/config/rs6000/rs6000-logue.cc > +++ b/gcc/config/rs6000/rs6000-logue.cc > @@ -3018,6 +3018,11 @@ rs6000_emit_prologue (void) > && (lookup_attribute ("no_split_stack", > DECL_ATTRIBUTES > (cfun->decl)) > == NULL)); > + /* If we are inserting ROP-protect hash instructions, disable shrink-wrap > + until the bug where the hashst insn is emitted in the wrong location > + is fixed. See PR101324 for details. */ > + if (info->rop_hash_size) > +flag_shrink_wrap = 0; > >frame_pointer_needed_indeed > = frame_pointer_needed && df_regs_ever_live_p > (HARD_FRAME_POINTER_REGNUM); > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index e4dc629ddcc..fd6e013c346 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -3427,10 +3427,6 @@ rs6000_override_options_after_change (void) > } >else if (!OPTION_SET_P (flag_cunroll_grow_size)) > flag_cunroll_grow_size = flag_peel_loops || optimize >= 3; > - > - /* If we are inserting ROP-protect instructions, disable shrink wrap. */ > - if (rs6000_rop_protect) > -flag_shrink_wrap = 0; > } > > #ifdef TARGET_USES_LINUX64_OPT > diff --git a/gcc/testsuite/gcc.target/powerpc/pr114759-1.c > b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c > new file mode 100644 > index 000..579e08e920f > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c > @@ -0,0 +1,16 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect > -fdump-rtl-pro_and_epilogue" } */ > +/* { dg-require-effective-target rop_ok } Only enable on supported ABIs. */ > + > +/* Verify we still attempt shrink-wrapping when using -mrop-protect > + and there are no function calls. */ > + > +long > +foo (long arg) > +{ > + if (arg) > +asm ("" ::: "r20"); > + return 0; > +} > + > +/* { dg-final { scan-rtl-dump-times "Performing shrink-wrapping" 1 > "pro_and_epilogue" } } */
[PATCH] rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns [PR114759]
Hi Kewen, Here is that promised cleanup patch we discussed in the previous patch review. I'll note this patch is dependent on the previous patch you approved. I have not pushed that yet (in case you looked) since I'm waiting on Segher to approve the updated patch for not disabling shrink-wrapping for leaf-functions in the presence of -mrop-protect first. We currently silently ignore the -mrop-protect option for old CPUs we don't support with the ROP hash insns, but we throw an error for unsupported ABIs. This patch treats unsupported CPUs and ABIs similarly by throwing an error for both. This matches clang behavior and allows us to simplify our ROP tests in the code that generates our prologue and epilogue code. This passed bootstrap and regtesting on powerpc64le-linux and powerpc64-linux with no regressions. Ok for trunk? I'll note I did not create a test case for unsupported ABIs, since I'll be working on adding ROP support for powerpc-linux and powerpc64-linux next. Peter 2024-06-26 Peter Bergner gcc/ PR target/114759 * config/rs6000/rs6000.cc (rs6000_override_options_after_change): Disallow CPUs and ABIs that do no support the ROP protection insns. * config/rs6000/rs6000-logue.cc (rs6000_stack_info): Remove now unneeded tests. (rs6000_emit_prologue): Likewise. Remove unneeded gcc_assert. (rs6000_emit_epilogue): Likewise. * config/rs6000/rs6000.md: Likewise. gcc/testsuite/ PR target/114759 * gcc.target/powerpc/pr114759-3.c: New test. --- gcc/config/rs6000/rs6000.cc | 11 ++ gcc/config/rs6000/rs6000-logue.cc | 22 +-- gcc/config/rs6000/rs6000.md | 4 ++-- gcc/testsuite/gcc.target/powerpc/pr114759-3.c | 19 4 files changed, 38 insertions(+), 18 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-3.c diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index fd6e013c346..e9642fd5310 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -3427,6 +3427,17 @@ rs6000_override_options_after_change (void) } else if (!OPTION_SET_P (flag_cunroll_grow_size)) flag_cunroll_grow_size = flag_peel_loops || optimize >= 3; + + if (rs6000_rop_protect) +{ + /* Disallow CPU targets we don't support. */ + if (!TARGET_POWER8) + error ("%<-mrop-protect%> requires %<-mcpu=power8%> or later"); + + /* Disallow ABI targets we don't support. */ + if (DEFAULT_ABI != ABI_ELFv2) + error ("%<-mrop-protect%> requires the ELFv2 ABI"); +} } #ifdef TARGET_USES_LINUX64_OPT diff --git a/gcc/config/rs6000/rs6000-logue.cc b/gcc/config/rs6000/rs6000-logue.cc index bd363b625a4..fdb6414f486 100644 --- a/gcc/config/rs6000/rs6000-logue.cc +++ b/gcc/config/rs6000/rs6000-logue.cc @@ -716,17 +716,11 @@ rs6000_stack_info (void) info->calls_p = (!crtl->is_leaf || cfun->machine->ra_needs_full_frame); info->rop_hash_size = 0; - if (TARGET_POWER8 - && info->calls_p - && DEFAULT_ABI == ABI_ELFv2 - && rs6000_rop_protect) + /* If we want ROP protection and this function makes a call, indicate + we need to create a stack slot to save the hashed return address in. */ + if (rs6000_rop_protect + && info->calls_p) info->rop_hash_size = 8; - else if (rs6000_rop_protect && DEFAULT_ABI != ABI_ELFv2) -{ - /* We can't check this in rs6000_option_override_internal since -DEFAULT_ABI isn't established yet. */ - error ("%qs requires the ELFv2 ABI", "-mrop-protect"); -} /* Determine if we need to save the condition code registers. */ if (save_reg_p (CR2_REGNO) @@ -3277,9 +3271,8 @@ rs6000_emit_prologue (void) /* NOTE: The hashst isn't needed if we're going to do a sibcall, but there's no way to know that here. Harmless except for performance, of course. */ - if (TARGET_POWER8 && rs6000_rop_protect && info->rop_hash_size != 0) + if (info->rop_hash_size) { - gcc_assert (DEFAULT_ABI == ABI_ELFv2); rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); rtx addr = gen_rtx_PLUS (Pmode, stack_ptr, GEN_INT (info->rop_hash_save_offset)); @@ -5056,12 +5049,9 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type) /* The ROP hash check must occur after the stack pointer is restored (since the hash involves r1), and is not performed for a sibcall. */ - if (TARGET_POWER8 - && rs6000_rop_protect - && info->rop_hash_size != 0 + if (info->rop_hash_size && epilogue_type != EPILOGUE_TYPE_SIBCALL) { - gcc_assert (DEFAULT_ABI == ABI_ELFv2);
Re: [PATCH] rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns [PR114759]
On 7/10/24 1:01 AM, Kewen.Lin wrote: >> + if (rs6000_rop_protect) >> +{ >> + /* Disallow CPU targets we don't support. */ >> + if (!TARGET_POWER8) >> +error ("%<-mrop-protect%> requires %<-mcpu=power8%> or later"); >> + >> + /* Disallow ABI targets we don't support. */ >> + if (DEFAULT_ABI != ABI_ELFv2) >> +error ("%<-mrop-protect%> requires the ELFv2 ABI"); >> +} > > I wonder if there is some reason to put this hunk here. IMHO we want the hunk > in rs6000_option_override_internal instead since no optimization options can > affect cpu type and DEFAULT_ABI? So the original code that used to disable shrink-wrapping that looked like: /* If we are inserting ROP-protect instructions, disable shrink wrap. */ if (rs6000_rop_protect) flag_shrink_wrap = 0; ...used to be in rs6000_option_override_internal, but was moved to rs6000_override_options_after_change as part of PR101324 (commit cff7879a381d). I guess I just placed this code here since it was the correct location for that old usage (it's changed in the patch I'm waiting for Segher to re-review), but I will look at moving it to rs6000_option_override_internal. I think I thought we could use a target attribute to enable -mrop-protect, but looking closer, we don't actually allow that option there. > And we probably want to unset rs6000_rop_protect to align with the handlings > on other options? I'm not sure I know what you mean? Why would we unset rs6000_rop_protect? Either we've concluded the current target options allow ROP code gen or not and for the cases where we don't/can't allow ROP, we want to give the user and error to match clang's behavior and how we already handle unsupported ABIs. So what is it you're trying to describe here? Peter
[PATCH v2] rs6000: Fix .machine cpu selection w/ altivec [PR97367]
René's patch seems to have stalled, so here is an updated version of the patch with the requested changes to his patch. I'll note I have added an additional code change, which is to also emit a ".machine altivec" if Altivec is enabled. The problem this fixes is for cpus like the G5, which is basically a power4 plus an Altivec unit, its ".machine power4" doesn't enable the assembler to recognize Altivec insns. That isn't a problem if you use gcc -mcpu=G5 to assemble the assembler file, since gcc passes -maltivec to the assembler. However, if you try to assemble the assembler file with as by hand, you'll get "unrecognized opcode" errors. I did not do the same for VSX, since all ".machine " for cpus that support VSX already enable VSX insn recognition, so it's not needed. rs6000: Fix .machine cpu selection w/ altivec [PR97367] There are various non-IBM CPUs with altivec, so we cannot use that flag to determine which .machine cpu to use, so ignore it. Emit an additional ".machine altivec" if Altivec is enabled so that the assembler doesn't require an explicit -maltivec option to assemble any Altivec instructions for those targets where the ".machine cpu" is insufficient to enable Altivec. For example, -mcpu=G5 emits a ".machine power4". This passed bootstrap and regtesting on powrpc64-linux (running the testsuite in both 32-bit and 64-bit modes) with no regressions. Ok for trunk and the release branches after some trunk burn-in time? Peter 2024-07-12 René Rebe Peter Bergner gcc/ PR target/97367 * config/rs6000/rs6000.c (rs6000_machine_from_flags): Do not consider OPTION_MASK_ALTIVEC. (emit_asm_machine): For Altivec compiles, emit a ".machine altivec". gcc/testsuite/ PR target/97367 * gcc.target/powerpc/pr97367.c: New test. Signed-of-by: René Rebe --- gcc/config/rs6000/rs6000.cc| 5 - gcc/testsuite/gcc.target/powerpc/pr97367.c | 13 + 2 files changed, 17 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97367.c diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 2cbea6ea2d7..2cb8f35739b 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -5888,7 +5888,8 @@ rs6000_machine_from_flags (void) HOST_WIDE_INT flags = rs6000_isa_flags; /* Disable the flags that should never influence the .machine selection. */ - flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | OPTION_MASK_ISEL); + flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | OPTION_MASK_ISEL +| OPTION_MASK_ALTIVEC); if ((flags & (ISA_3_1_MASKS_SERVER & ~ISA_3_0_MASKS_SERVER)) != 0) return "power10"; @@ -5913,6 +5914,8 @@ void emit_asm_machine (void) { fprintf (asm_out_file, "\t.machine %s\n", rs6000_machine); + if (TARGET_ALTIVEC) +fprintf (asm_out_file, "\t.machine altivec\n"); } #endif diff --git a/gcc/testsuite/gcc.target/powerpc/pr97367.c b/gcc/testsuite/gcc.target/powerpc/pr97367.c new file mode 100644 index 000..f9118dbcdec --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr97367.c @@ -0,0 +1,13 @@ +/* PR target/97367 */ +/* { dg-options "-mdejagnu-cpu=G5" } */ + +/* Verify we emit a ".machine power4" and ".machine altivec" rather + than a ".machine power7". */ + +int dummy (void) +{ + return 0; +} + +/* { dg-final { scan-assembler {\.\mmachine power4\M} } } */ +/* { dg-final { scan-assembler {\.\mmachine altivec\M} } } */ -- 2.45.2
Re: [PATCH] rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns [PR114759]
On 7/11/24 1:24 AM, Kewen.Lin wrote: > Sorry for the confusion, I meant for most target options when we emit some > error > message meanwhile we also unset it, such as: > > if (TARGET_CRYPTO && !TARGET_ALTIVEC) > { > if (rs6000_isa_flags_explicit & OPTION_MASK_CRYPTO) > error ("%qs requires %qs", "-mcrypto", "-maltivec"); > rs6000_isa_flags &= ~OPTION_MASK_CRYPTO; > } That is not what is happening here though. The code here to disable crypto is for the case where crypto is enabled implicitly, via say -mcpu=power8, but the user has also explicitly disabled Altivec which crypto depends on. In that case, we do not emit an error or warning and we silently disable crypto. This is similar to -mcpu=power8 -mno-altivec where we silently disable VSX. The other cases you showed are of similar scenarios. In my ROP code, we know ROP was explicitly enabled (it is never turned on implicitly with any -mcpu= option) and the target cpu and/or ABI does not support it, so there's nothing more to do, other than to emit an error. There is no matching implicit use case where we silently disable ROP as there was in your case above. Therefore, I think the code as I showed it is correct as is...other than the code snipit location, which I have moved and am testing now. Peter
[PATCH v2] rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns [PR114759]
Hi Kewen, Here's the updated patch per your review comments, minus your suggestion to disable the ROP mask which I mentioned isn't needed in my other reply. This passed bootstrap and regtesting with no regressions on powerpc64le-linux. Ok for trunk? Peter Changes from v1: * Moved checks for invalid targets from rs6000_override_options_after_change to rs6000_option_override_internal. rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns [PR114759] We currently silently ignore the -mrop-protect option for old CPUs we don't support with the ROP hash insns, but we throw an error for unsupported ABIs. This patch treats unsupported CPUs and ABIs similarly by throwing an error both both. This matches clang behavior and allows us to simplify our tests in the code that generates our prologue and epilogue code. 2024-07-15 Peter Bergner gcc/ PR target/114759 * config/rs6000/rs6000.cc (rs6000_option_override_internal): Disallow CPUs and ABIs that do no support the ROP protection insns. * config/rs6000/rs6000-logue.cc (rs6000_stack_info): Remove now unneeded tests. (rs6000_emit_prologue): Likewise. Remove unneeded gcc_assert. (rs6000_emit_epilogue): Likewise. * config/rs6000/rs6000.md: Likewise. gcc/testsuite/ PR target/114759 * gcc.target/powerpc/pr114759-3.c: New test. --- gcc/config/rs6000/rs6000-logue.cc | 22 +-- gcc/config/rs6000/rs6000.cc | 12 ++ gcc/config/rs6000/rs6000.md | 4 ++-- gcc/testsuite/gcc.target/powerpc/pr114759-3.c | 19 4 files changed, 39 insertions(+), 18 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-3.c diff --git a/gcc/config/rs6000/rs6000-logue.cc b/gcc/config/rs6000/rs6000-logue.cc index bd363b625a4..fdb6414f486 100644 --- a/gcc/config/rs6000/rs6000-logue.cc +++ b/gcc/config/rs6000/rs6000-logue.cc @@ -716,17 +716,11 @@ rs6000_stack_info (void) info->calls_p = (!crtl->is_leaf || cfun->machine->ra_needs_full_frame); info->rop_hash_size = 0; - if (TARGET_POWER8 - && info->calls_p - && DEFAULT_ABI == ABI_ELFv2 - && rs6000_rop_protect) + /* If we want ROP protection and this function makes a call, indicate + we need to create a stack slot to save the hashed return address in. */ + if (rs6000_rop_protect + && info->calls_p) info->rop_hash_size = 8; - else if (rs6000_rop_protect && DEFAULT_ABI != ABI_ELFv2) -{ - /* We can't check this in rs6000_option_override_internal since -DEFAULT_ABI isn't established yet. */ - error ("%qs requires the ELFv2 ABI", "-mrop-protect"); -} /* Determine if we need to save the condition code registers. */ if (save_reg_p (CR2_REGNO) @@ -3277,9 +3271,8 @@ rs6000_emit_prologue (void) /* NOTE: The hashst isn't needed if we're going to do a sibcall, but there's no way to know that here. Harmless except for performance, of course. */ - if (TARGET_POWER8 && rs6000_rop_protect && info->rop_hash_size != 0) + if (info->rop_hash_size) { - gcc_assert (DEFAULT_ABI == ABI_ELFv2); rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); rtx addr = gen_rtx_PLUS (Pmode, stack_ptr, GEN_INT (info->rop_hash_save_offset)); @@ -5056,12 +5049,9 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type) /* The ROP hash check must occur after the stack pointer is restored (since the hash involves r1), and is not performed for a sibcall. */ - if (TARGET_POWER8 - && rs6000_rop_protect - && info->rop_hash_size != 0 + if (info->rop_hash_size && epilogue_type != EPILOGUE_TYPE_SIBCALL) { - gcc_assert (DEFAULT_ABI == ABI_ELFv2); rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM); rtx addr = gen_rtx_PLUS (Pmode, stack_ptr, GEN_INT (info->rop_hash_save_offset)); diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index fd6e013c346..1cee9c2011d 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -4825,6 +4825,18 @@ rs6000_option_override_internal (bool global_init_p) } } + /* We only support ROP protection on certain targets. */ + if (rs6000_rop_protect) +{ + /* Disallow CPU targets we don't support. */ + if (!TARGET_POWER8) + error ("%<-mrop-protect%> requires %<-mcpu=power8%> or later"); + + /* Disallow ABI targets we don't support. */ + if (DEFAULT_ABI != ABI_ELFv2) + error ("%<-mrop-protect%> requires the ELFv2 ABI"); +} + /* Initialize all of the registers. */ rs6
Re: [PATCH] rs6000, update effective target for tests builtins-10*.c and, vec_perm-runnable-i128.c
On 7/15/24 5:43 PM, Carl Love wrote: > -/* { dg-do run } */ > +/* { dg-do run { target { lp64 } && { int128 } } } */ Why isn't this just: /* { dg-do run { target int128 } } */ ??? The int128 test should disable this on 32-bit systems just fine. Peter
Re: [PATCH] rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns [PR114759]
On 7/15/24 9:19 PM, Kewen.Lin wrote: > on 2024/7/16 04:30, Peter Bergner wrote: >> On 7/11/24 1:24 AM, Kewen.Lin wrote: >>> Sorry for the confusion, I meant for most target options when we emit some >>> error >>> message meanwhile we also unset it, such as: >>> >>> if (TARGET_CRYPTO && !TARGET_ALTIVEC) >>> { >>> if (rs6000_isa_flags_explicit & OPTION_MASK_CRYPTO) >>> error ("%qs requires %qs", "-mcrypto", "-maltivec"); >>> rs6000_isa_flags &= ~OPTION_MASK_CRYPTO; >>> } >> >> That is not what is happening here though. The code here to disable >> crypto is for the case where crypto is enabled implicitly, via say >> -mcpu=power8, but the user has also explicitly disabled Altivec which >> crypto depends on. In that case, we do not emit an error or warning and > > But if it's just for the case implicit enabling, the unmask should be > in an else arm and not for both implicit and explicit, the code still does > unmasking for explicit enabling. Since it's very unlikely to cause some > unexpected behaviors in following processing even in future, it's your call. > :) I think that is just Mike's coding style. Yes, the clearing of the flag could be in an else block, but clearing it even in the error case is harmless, since we're going to emit an error and exit anyway. Peter
Re: [PATCH ver 2] rs6000, update effective target for tests builtins-10*.c and, vec_perm-runnable-i128.c
On 7/16/24 6:19 PM, Carl Love wrote: > use __int128 types that are not supported on all platforms. The > __int128 type is only supported on 64-bit platforms. Need to check that > the platform is 64-bits and support the __int128 type. Add the int128 and > lp64 flags to the target test. The test cases themselves look good, but you need to update your git log entry to not mention the lp64/64-bits since you removed them. Yes, currently, only 64-bit targets support __int128, but our hope is that one day, even 32-bit targets will as well. So how about the following text instead? ... use __int128 types that are not supported on all platforms. Update the tests to check int128 effective target to avoid unsupported type errors on unsupported platforms. Peter
Re: [PATCH] rs6000: Relax some FLOAT128 expander condition for FLOAT128_IEEE_P [PR105359]
On 7/17/24 4:09 AM, Kewen.Lin wrote: > * config/rs6000/rs6000.md (@extenddf2): Remove condition > TARGET_LONG_DOUBLE_128 for FLOAT128_IEEE_P modes. This all LGTM, except this ChangeLog fragment doesn't match the code changes below. Rather than removing TARGET_LONG_DOUBLE_128, you've added FLOAT128_IEEE_P (mode)). > - "TARGET_HARD_FLOAT && TARGET_LONG_DOUBLE_128" > + "TARGET_HARD_FLOAT > + && (TARGET_LONG_DOUBLE_128 || FLOAT128_IEEE_P (mode))" Peter
Re: [PATCH] rs6000: Relax some FLOAT128 expander condition for FLOAT128_IEEE_P [PR105359]
On 7/18/24 12:19 AM, Kewen.Lin wrote: > I guess you meant "Remove" is expected to remove some code explicitly and > can be misleading here, if so how about "Don't check TARGET_LONG_DOUBLE_128 > for FLOAT128_IEEE_P modes"? Yeah, I think that is more clear. Thanks! Peter
Re: [PATCH v2] rs6000: Fix .machine cpu selection w/ altivec [PR97367]
On 7/18/24 4:14 AM, Kewen.Lin wrote: >> +/* { dg-final { scan-assembler {\.\mmachine power4\M} } } */ >> +/* { dg-final { scan-assembler {\.\mmachine altivec\M} } } */ > > Nit: Both \m looks useless and can be removed. Fine with me. Is that because the \. acts like a \m? >> Ok for trunk and the release branches after some trunk burn-in time? > > OK for all with/without the below minor nit tweaked. Great, thanks! Peter
Re: [PATCH 3/3] Add power11 tests
On 7/18/24 8:23 AM, Segher Boessenkool wrote: > Everything in gcc.target/powerpc is only run for target powerpc*-*-* > anyway, so make this just > > /* { dg-do compile } */ > > please. (Or nothing, it is the default after all, but you may want to > have it explicit). Personally, I do like seeing the /* { dg-do compile } */ even though that is the default in this testsuite directory. >> +/* { dg-require-effective-target powerpc_vsx_ok } */ > > Why is this needed? It needs a comment saying that. After Kewen's testsuite patch, powerpc_vsx_ok is pretty useless as it only checks whether the assembler knows about vsx, and what people normally want is powerpc_vsx which verifies that the compiler options used to compile the test support generating VSX insns. That said, do we really need that here? Can't we compile this test case with -mcpu=power4 as the default (ie, no VSX) and the clones should be able to generate power* and VSX just fine, correct? I don't understand the requirement on VSX here. > Saying that this is "to test if can use the target attr" is nonsense. > That does not need Linux either btw. Target selection might not work > correctly currently on some other systems, but this is not a run test! Agreed. Also, if the clones stuff really doesn't work for those targets even in a compile test, then rather than diabling this way, I think I'd like to see a target-supports clones_ok test or similar. Peter
Re: [PATCH v2] rs6000: Fix .machine cpu selection w/ altivec [PR97367]
On 7/18/24 9:14 AM, Peter Bergner wrote: > On 7/18/24 4:14 AM, Kewen.Lin wrote: >>> +/* { dg-final { scan-assembler {\.\mmachine power4\M} } } */ >>> +/* { dg-final { scan-assembler {\.\mmachine altivec\M} } } */ >> >> Nit: Both \m looks useless and can be removed. > > Fine with me. Is that because the \. acts like a \m? > > > >>> Ok for trunk and the release branches after some trunk burn-in time? >> >> OK for all with/without the below minor nit tweaked. Ok, I pushed the commit with the tweak to trunk and will let it burn-in there for a bit before backporting to the release branches. Thanks René and Kewen! Peter
[PATCH, Obvious] rs6000: Catch unsupported ABI errors when using -mrop-protect [PR114759,PR115988]
I'm not sure how my testing of the pr114759-3.c test case didn't see the excess errors on BE, as I did test there. Anyway, the following obvious patches fixes the problem Bill saw. Tested on LE and 32-bit and 64-bit BE. I'll push this tomorrow if there are no objections. Peter rs6000: Catch unsupported ABI errors when using -mrop-protect [PR114759,PR115988] 2024-07-18 Peter Bergner gcc/testsuite/ PR target/114759 PR target/115988 * gcc.target/powerpc/pr114759-3.c: Catch unsupported ABI errors. --- gcc/testsuite/gcc.target/powerpc/pr114759-3.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/powerpc/pr114759-3.c b/gcc/testsuite/gcc.target/powerpc/pr114759-3.c index 6770a9aec3b..e2f1d42e111 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr114759-3.c +++ b/gcc/testsuite/gcc.target/powerpc/pr114759-3.c @@ -2,7 +2,8 @@ /* { dg-do compile } */ /* { dg-options "-O2 -mdejagnu-cpu=power7 -mrop-protect" } */ -/* Verify we emit an error if we use -mrop-protect with an unsupported cpu. */ +/* Verify we emit an error if we use -mrop-protect with an unsupported cpu + or ABI. */ extern void foo (void); @@ -17,3 +18,4 @@ bar (void) in the final line (which is all that dg-error inspects). Hence, we have to tell dg-error to ignore the line number. */ /* { dg-error "'-mrop-protect' requires '-mcpu=power8'" "PR114759" { target *-*-* } 0 } */ +/* { dg-error "'-mrop-protect' requires the ELFv2 ABI" "PR114759" { target { ! rop_ok } } 0 } */
Re: [REPOST 1/3] Add support for -mcpu=power11
On 7/19/24 12:34 PM, Michael Meissner wrote: > On Thu, Jul 18, 2024 at 08:08:44AM -0500, Segher Boessenkool wrote: >>> --- a/gcc/config/rs6000/ppc-auxv.h >>> +++ b/gcc/config/rs6000/ppc-auxv.h >>> @@ -47,9 +47,8 @@ >>> #define PPC_PLATFORM_PPC47612 >>> #define PPC_PLATFORM_POWER813 >>> #define PPC_PLATFORM_POWER914 >>> - >>> -/* This is not yet official. */ >>> #define PPC_PLATFORM_POWER10 15 >>> +#define PPC_PLATFORM_POWER11 16 >> >> Please add a comment where the official thing is? >> >> It is in glibc dl-procinfo.h, and there *cannot* be more than 16 >> platforms currently, so how can this work? Or do we get data >> directly from the kernel, or what? >> >> But you tested it, so you do obviously get PPC_PLATFORM_POWER11 from >> somewhere. I cannot see from where though? > > In other discussions, I was told that 16 with be the platform number for the > kernel in the future. The AT_PLATFORM from the kernel is a string, not an integer. To make __builtin_cpu_is ("power11") work efficiently, GLIBC stores an integer representing each cpu AT_PLATFORM string in the TCB. It is therefore GLIBC which "owns" the integer versions of the platform values and yes, 16 is the correct value for Power11 and that value exists in upstream GLIBC. Peter
Re: [PATCH] rs6000: Don't pass -many to the assembler [PR112868]
On 5/21/24 8:27 AM, jeevitha wrote: > The following patch has been bootstrapped and regtested with default > configuration > [--enable-checking=yes] and with --enable-checking=release on > powerpc64le-linux. > > This patch removes passing the -many assembler option for release builds. Now, > GCC no longer passes -many under any conditions to the assembler. > > 2024-05-15 Jeevitha Palanisamy > > gcc/ > PR target/112868 > * config/rs6000/rs6000.h (ASM_OPT_ANY): Removed Define. > (ASM_CPU_SPEC): Remove ASM_OPT_ANY usage. You are missing a ChangeLog entry for the target-supports.exp change plus there is no mention of why it's needed in the git log entry. Otherwise, the rest LGTM. Peter
Re: [PATCH-1v2, rs6000] Implement optab_isinf for SFDF and IEEE128
On 5/19/24 10:28 PM, HAO CHEN GUI wrote: > +(define_expand "isinf2" > + [(use (match_operand:SI 0 "gpc_reg_operand")) > + (use (match_operand:SFDF 1 "gpc_reg_operand"))] > + "TARGET_HARD_FLOAT && TARGET_P9_VECTOR" > +{ > + emit_insn (gen_xststdcp (operands[0], operands[1], GEN_INT (0x30))); > + DONE; > +}) Is there a reason not to use the vsx_register_operand predicate for op1 which matches the predicate for the operand of the xststdcp pattern we're passing op1 to? Ditto for the other optab patches you've submitted. Peter
[PATCH] .gitattributes: disable crlf translation
By default, git has the "autocrlf" """feature""" enabled. This causes the files to have CRLF line endings when checked out on windows, which in the case of configure, causes confusing errors like: ./gcc/configure: line 14: $'\r': command not found ./gcc/configure: line 29: syntax error near unexpected token `newline' '/gcc/configure: line 29: ` ;; when it is invoked. Any files damaged in this way can be fixed with: $ git config core.autocrlf false $ git reset $ git checkout . But, it's better to simply avoid this problem in the first place. This behavior is never helpful or desired for gcc. Signed-off-by: Peter Damianov --- .gitattributes | 3 +++ 1 file changed, 3 insertions(+) diff --git a/.gitattributes b/.gitattributes index e75bfc595bf..1e116987c98 100644 --- a/.gitattributes +++ b/.gitattributes @@ -8,3 +8,6 @@ ChangeLog whitespace=indent-with-non-tab,space-before-tab,trailing-space # Use together with git config diff.md.xfuncname '^\(define.*$' # which is run by contrib/gcc-git-customization.sh too. *.md diff=md + +# Disable lf -> crlf translation on windows. +* -crlf -- 2.39.2
[PATCH] libcpp: Correct typo 'r' -> '\r'
libcpp/ChangeLog: * lex.cc (do_peek_prev): Correct typo in argument to __builtin_expect() Signed-off-by: Peter Damianov --- libcpp/lex.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libcpp/lex.cc b/libcpp/lex.cc index c9e44e6..de752bdc9c8 100644 --- a/libcpp/lex.cc +++ b/libcpp/lex.cc @@ -5038,7 +5038,7 @@ do_peek_prev (const unsigned char *peek, const unsigned char *bound) unsigned char c = *--peek; if (__builtin_expect (c == '\n', false) - || __builtin_expect (c == 'r', false)) + || __builtin_expect (c == '\r', false)) { if (peek == bound) return peek; -- 2.39.2
[PATCH v3 2/3] diagnostics: Don't hardcode auto_enable_urls to false for mingw hosts
Windows terminal and mintty both have support for link escape sequences, and so auto_enable_urls shouldn't be hardcoded to false. For older versions of the windows console, mingw_ansi_fputs's console API translation logic does mangle these sequences, but there's nothing useful it could do even if this weren't the case, so check if the ansi escape sequences are supported at all. conhost.exe doesn't support link escape sequences, but printing them does not cause any problems. gcc/ChangeLog: * diagnostic-color.cc (auto_enable_urls): Don't hardcode to return false on mingw hosts. * diagnostic-color.cc (auto_enable_urls): Return true if console supports ansi escape sequences. Signed-off-by: Peter Damianov --- v3: Fix minor comment formatting nit. gcc/diagnostic-color.cc | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/gcc/diagnostic-color.cc b/gcc/diagnostic-color.cc index 261a14b6c5f..184419cea36 100644 --- a/gcc/diagnostic-color.cc +++ b/gcc/diagnostic-color.cc @@ -384,9 +384,6 @@ parse_env_vars_for_urls () static bool auto_enable_urls () { -#ifdef __MINGW32__ - return false; -#else const char *term, *colorterm; /* First check the terminal is capable of printing color escapes, @@ -394,6 +391,21 @@ auto_enable_urls () if (!should_colorize ()) return false; +#ifdef __MINGW32__ + HANDLE handle; + DWORD mode; + + handle = GetStdHandle (STD_ERROR_HANDLE); + if ((handle == INVALID_HANDLE_VALUE) || (handle == NULL)) +return false; + + /* If ansi escape sequences aren't supported by the console, then URLs will + print mangled from mingw_ansi_fputs's console API translation. It wouldn't + be useful even if this weren't the case. */ + if (GetConsoleMode (handle, &mode) && !(mode & ENABLE_VIRTUAL_TERMINAL_PROCESSING)) +return false; +#endif + /* xfce4-terminal is known to not implement URLs at this time. Recently new installations (0.8) will safely ignore the URL escape sequences, but a large number of legacy installations (0.6.3) print @@ -428,7 +440,6 @@ auto_enable_urls () return false; return true; -#endif } /* Determine if URLs should be enabled, based on RULE, -- 2.39.2
[PATCH v3 1/3] diagnostics: Enable escape sequence processing on windows consoles
Since windows 10 release v1511, the windows console has had support for VT100 escape sequences. We should try to enable this, and utilize it where possible. gcc/ChangeLog: * diagnostic-color.cc (should_colorize): Enable processing of VT100 escape sequences on windows consoles Signed-off-by: Peter Damianov --- Pinging these patches. The wine ordeal is a problem, but disabling the diagnostics colors with the environment variable should be resolving that. I don't want to intentionally make testing harder, but until wine fixes: https://bugs.winehq.org/show_bug.cgi?id=49780 There is really nothing I can do. gcc/diagnostic-color.cc | 21 - 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/gcc/diagnostic-color.cc b/gcc/diagnostic-color.cc index cbe57ce763f..261a14b6c5f 100644 --- a/gcc/diagnostic-color.cc +++ b/gcc/diagnostic-color.cc @@ -300,12 +300,23 @@ should_colorize (void) pp_write_text_to_stream() in pretty-print.cc calls fputs() on that stream. However, the code below for non-Windows doesn't seem to care about it either... */ - HANDLE h; - DWORD m; + HANDLE handle; + DWORD mode; + BOOL isconsole = false; - h = GetStdHandle (STD_ERROR_HANDLE); - return (h != INVALID_HANDLE_VALUE) && (h != NULL) - && GetConsoleMode (h, &m); + handle = GetStdHandle (STD_ERROR_HANDLE); + + if ((handle != INVALID_HANDLE_VALUE) && (handle != NULL)) +isconsole = GetConsoleMode (handle, &mode); + + if (isconsole) +{ + /* Try to enable processing of VT100 escape sequences */ + mode |= ENABLE_PROCESSED_OUTPUT | ENABLE_VIRTUAL_TERMINAL_PROCESSING; + SetConsoleMode (handle, mode); +} + + return isconsole; #else char const *t = getenv ("TERM"); /* emacs M-x shell sets TERM="dumb". */ -- 2.39.2
[PATCH v3 3/3] pretty-print: Don't translate escape sequences to windows console API
Modern versions of windows (after windows 10 v1511) support VT100 escape sequences, so translation for them is not necessary. The translation also mangles embedded warning documentation links. gcc/ChangeLog: * pretty-print.cc (mingw_ansi_fputs): Don't translate escape sequences if the console has ENABLE_VIRTUAL_TERMINAL_PROCESSING. Signed-off-by: Peter Damianov --- v3: fix minor comment formatting nit. gcc/pretty-print.cc | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/gcc/pretty-print.cc b/gcc/pretty-print.cc index eb59bf424b7..505230ed4d6 100644 --- a/gcc/pretty-print.cc +++ b/gcc/pretty-print.cc @@ -674,8 +674,9 @@ mingw_ansi_fputs (const char *str, FILE *fp) /* Don't mess up stdio functions with Windows APIs. */ fflush (fp); - if (GetConsoleMode (h, &mode)) -/* If it is a console, translate ANSI escape codes as needed. */ + if (GetConsoleMode (h, &mode) && !(mode & ENABLE_VIRTUAL_TERMINAL_PROCESSING)) +/* If it is a console, and doesn't support ANSI escape codes, translate + them as needed. */ for (;;) { if ((esc_code = find_esc_head (&prefix_len, &esc_head, read)) == 0) -- 2.39.2
[PATCH v3] diagnostics: Follow DECL_ORIGIN in lhd_print_error_function [PR102061]
Currently, if a warning references a cloned function, the name of the cloned function will be emitted in the "In function 'xyz'" part of the diagnostic, which users aren't supposed to see. This patch follows the DECL_ORIGIN link to get the name of the original function, so the internal compiler details aren't exposed. gcc/ChangeLog: PR diagnostics/102061 * langhooks.cc (lhd_print_error_function): Follow DECL_ORIGIN links. * gcc.dg/pr102061.c: New testcase. Signed-off-by: Peter Damianov --- v3: also follow DECL_ORIGIN when emitting "inlined from" warnings, I missed this before. Add testcase. gcc/langhooks.cc| 3 +++ gcc/testsuite/gcc.dg/pr102061.c | 35 + 2 files changed, 38 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/pr102061.c diff --git a/gcc/langhooks.cc b/gcc/langhooks.cc index 61f2b676256..7a2a66b3c39 100644 --- a/gcc/langhooks.cc +++ b/gcc/langhooks.cc @@ -395,6 +395,8 @@ lhd_print_error_function (diagnostic_context *context, const char *file, else fndecl = current_function_decl; + fndecl = DECL_ORIGIN(fndecl); + if (TREE_CODE (TREE_TYPE (fndecl)) == METHOD_TYPE) pp_printf (context->printer, _("In member function %qs"), @@ -439,6 +441,7 @@ lhd_print_error_function (diagnostic_context *context, const char *file, } if (fndecl) { + fndecl = DECL_ORIGIN(fndecl); expanded_location s = expand_location (*locus); pp_comma (context->printer); pp_newline (context->printer); diff --git a/gcc/testsuite/gcc.dg/pr102061.c b/gcc/testsuite/gcc.dg/pr102061.c new file mode 100644 index 000..dbdd23965e7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr102061.c @@ -0,0 +1,35 @@ +/* { dg-do compile } */ +/* { dg-options "-Wall -O2" } */ +/* { dg-message "inlined from 'bar'" "" { target *-*-* } 0 } */ +/* { dg-excess-errors "" } */ + +static inline void +foo (char *p) +{ + __builtin___memcpy_chk (p, "abc", 3, __builtin_object_size (p, 0)); +} +static void +bar (char *p) __attribute__((noinline)); +static void +bar (char *p) +{ + foo (p); +} +void f(char*) __attribute__((noipa)); +char buf[2]; +void +baz (void) __attribute__((noinline)); +void +baz (void) +{ + bar (buf); + f(buf); +} + +void f(char*) +{} + +int main(void) +{ +baz(); +} -- 2.39.2
Re: [PATCH] rs6000: Add TARGET_P10_VECTOR for Power10 vector insns [PR116266]
On 8/9/24 4:50 AM, Kewen.Lin wrote: > As PR116266 shows, we miss TARGET_P10_VECTOR to guard those > Power10 related vector instructions as well as their > according built-in functions. This patch is to introduce... LGTM. The only change I would suggest is s/according/associated/ in the sentence above. Thanks for fixing this! Peter
Re: [PATCH] rs6000: Add TARGET_P10_VECTOR for Power10 vector insns [PR116266]
On 8/9/24 12:54 PM, Segher Boessenkool wrote: >> --- a/gcc/config/rs6000/altivec.md >> +++ b/gcc/config/rs6000/altivec.md >> @@ -623,7 +623,7 @@ (define_insn "altivec_eqv1ti" >>[(set (match_operand:V1TI 0 "altivec_register_operand" "=v") >> (eq:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v") >>(match_operand:V1TI 2 "altivec_register_operand" "v")))] >> - "TARGET_POWER10" >> + "TARGET_P10_VECTOR" >>"vcmpequq %0,%1,%2" >>[(set_attr "type" "veccmpfx")]) > > This very first one is incorrect, already. This is a Vector insn > (it needs MSR[VEC]=1), not a VSX insn (for which MSR[VSX]=1 is needed). > > We test TARGET_ALTIVEC for that, not TARGET_VSX. I guess you are correct that *_VECTOR is not specific enough because yeah, we could have -mcpu=power10 -maltivec -mno-vsx so we'd need two macros, TARGET_P10_ALTIVEC and TARGET_P10_VSX rather than one catch-all. That said... > In general, we want to get rid of TARGET_Pxxx_VECTOR, not introduce new > stuff like it! I'm fine with the TARGET_P10_* macro, since it's more readable than saying TARGET_POWER10 && TARGET_ALTIVEC && TARGET_VSX, especially when we use the negated version. What we really wanted to get rid of, was having a separate OPTION_MASK_Pxxx_* flag for things like this which we used to have. I agree that was bad. This isn't that though. This is just a convenience macro to make the code more (IMHO) readable. Peter
Re: [PATCH] lra: emit caller-save register spills before call insn [PR116028]
On 8/9/24 12:02 PM, Vladimir Makarov wrote: > I believe your should reverse the original patch and all the patches you > submitted to fix the issues with the original patch. I agree this commit should be reverted and Kyrill has pushed that already, so bootstrap should be restored now. Thank you Kyrill. Surya was testing her revert commit too, but was waiting for her bootstrap to complete before pushing (per process rules). It's just the cfarm systems she has access to were being slow. I do *not* agree that all the previous patches should be reverted, since they fixed *real* bugs and really only exposed other latent issues which Surya has been working on. > You should do a better testing you patches It is not correct to presume she did not test her patches thoroughly. I know she has, but with the large number of supported architectures and ABIs on those architectures, she cannot possibly test everything. That's why we have stage1 and our patch revert process. Peter
[PATCH v4] diagnostics: Follow DECL_ORIGIN in lhd_print_error_function [PR102061]
Currently, if a warning references a cloned function, the name of the cloned function will be emitted in the "In function 'xyz'" part of the diagnostic, which users aren't supposed to see. This patch follows the DECL_ORIGIN link to get the name of the original function, so the internal compiler details aren't exposed. gcc/ChangeLog: PR diagnostics/102061 * langhooks.cc (lhd_print_error_function): Follow DECL_ORIGIN links. * gcc.dg/pr102061.c: New testcase. Signed-off-by: Peter Damianov --- v4: Address formatting nits, add comment. v4: Rework testcase. It is now shorter and covers more, including both "inlined from" and "in function" parts of diagnostics. I would still appreciate a review regarding cp_print_error_function, perhaps with more info about whether it needs adjustment or whatever circumstances it is used in. gcc/langhooks.cc| 6 ++ gcc/testsuite/gcc.dg/pr102061.c | 31 +++ 2 files changed, 37 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/pr102061.c diff --git a/gcc/langhooks.cc b/gcc/langhooks.cc index 61f2b676256..270f7aee1c1 100644 --- a/gcc/langhooks.cc +++ b/gcc/langhooks.cc @@ -395,6 +395,11 @@ lhd_print_error_function (diagnostic_context *context, const char *file, else fndecl = current_function_decl; + // Follow DECL_ORIGIN link, in case this is a cloned function. + // Otherwise, we will emit names like "foo.constprop" or "bar.isra" + // in the diagnostic. + fndecl = DECL_ORIGIN (fndecl); + if (TREE_CODE (TREE_TYPE (fndecl)) == METHOD_TYPE) pp_printf (context->printer, _("In member function %qs"), @@ -439,6 +444,7 @@ lhd_print_error_function (diagnostic_context *context, const char *file, } if (fndecl) { + fndecl = DECL_ORIGIN (fndecl); expanded_location s = expand_location (*locus); pp_comma (context->printer); pp_newline (context->printer); diff --git a/gcc/testsuite/gcc.dg/pr102061.c b/gcc/testsuite/gcc.dg/pr102061.c new file mode 100644 index 000..aff1062ca5a --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr102061.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-Wall -O2" } */ +/* { dg-message "In function 'foo'" "" { target *-*-* } 0 } */ +/* { dg-message "inlined from 'bar'" "" { target *-*-* } 0 } */ +/* { dg-message "isra" "" { xfail *-*-* } 0 } */ +/* { dg-excess-errors "" } */ + +// The warnings generated here should not contain names of clones, like +// 'foo.isra' and 'bar.isra' + +// Emit warning with "In function 'foo'" +__attribute__((noinline)) +static int foo(char* p) { +__builtin_strncpy(p, p, 1); +return 0; +} + +// Emit warning with "inlined from 'bar'" +// For some reason, this function needs to be infinite recursive +// for the warning to show up in an isra clone. +static int bar(char* p) { +__builtin_strncpy(p, p, 1); +bar(p); +return 0; +} + +void baz() { +char c[0]; +foo(c); +bar(c); +} -- 2.39.2