Hi Segher,

on 2022/11/25 23:46, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Nov 25, 2022 at 09:21:21PM +0800, Jiufu Guo wrote:
>> "Kewen.Lin" <li...@linux.ibm.com> writes:
>>> on 2022/9/15 16:30, Jiufu Guo wrote:
>>>> For a complicated 64-bit constant, below is one instruction sequence to
>>>> build it:
>>>>    lis 9,0x800a
>>>>    ori 9,9,0xabcd
>>>>    sldi 9,9,32
>>>>    oris 9,9,0xc167
>>>>    ori 9,9,0xfa16
>>>>
>>>> while we can also use the sequence below to build it:
>>>>    lis 9,0xc167
>>>>    lis 10,0x800a
>>>>    ori 9,9,0xfa16
>>>>    ori 10,10,0xabcd
>>>>    rldimi 9,10,32,0
>>>> This sequence uses two registers to build the high and low parts first,
>>>> and then merges them.
>>>> Because the two halves are built independently, this sequence exposes
>>>> more parallelism and would be faster.  (Of course, it uses one more
>>>> register, which may add register pressure.)
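
For readers who want to check that the two sequences quoted above build the same value, here is a minimal C sketch that emulates each instruction on 64-bit integers. The helper names (lis, ori, oris, sldi, rldimi_32_0, build_serial, build_parallel) are made up for illustration and are not part of the patch; rldimi is modelled only for the sh=32, mb=0 case used here.

```c
#include <stdint.h>

/* lis rD,imm: rD = imm << 16, sign-extended from 32 to 64 bits.  */
uint64_t lis (uint16_t imm)
{
  uint64_t v = (uint64_t) imm << 16;
  if (v & 0x80000000ULL)
    v |= 0xffffffff00000000ULL;
  return v;
}

/* ori rA,rS,imm: OR imm into the low 16 bits.  */
uint64_t ori (uint64_t rs, uint16_t imm)
{
  return rs | imm;
}

/* oris rA,rS,imm: OR imm into bits 16..31, counting from the LSB.  */
uint64_t oris (uint64_t rs, uint16_t imm)
{
  return rs | ((uint64_t) imm << 16);
}

/* sldi rA,rS,n: shift left by n bits.  */
uint64_t sldi (uint64_t rs, unsigned n)
{
  return rs << n;
}

/* rldimi rA,rS,32,0 (this special case only): rotate rS left by 32 and
   insert it into the high 32 bits of rA, keeping the low 32 bits of rA.  */
uint64_t rldimi_32_0 (uint64_t ra, uint64_t rs)
{
  return (rs << 32) | (ra & 0xffffffffULL);
}

/* First sequence: one register, every insn depends on the previous one.  */
uint64_t build_serial (void)
{
  uint64_t r9 = lis (0x800a);
  r9 = ori (r9, 0xabcd);
  r9 = sldi (r9, 32);
  r9 = oris (r9, 0xc167);
  r9 = ori (r9, 0xfa16);
  return r9;
}

/* Second sequence: r9 and r10 are built independently, then merged.  */
uint64_t build_parallel (void)
{
  uint64_t r9 = lis (0xc167);
  uint64_t r10 = lis (0x800a);
  r9 = ori (r9, 0xfa16);
  r10 = ori (r10, 0xabcd);
  return rldimi_32_0 (r9, r10);
}
```

Both functions return 0x800aabcdc167fa16.  The dependency chain in build_serial is five insns long, while in build_parallel the two lis/ori pairs do not depend on each other, so the critical path is three insns.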
> 
> And crucially this patch only uses two registers if can_create_pseudo_p.
> Please mention that.
> 
>>>>    * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Update 64bit
>>>>    constant build.
> 
> If you don't give details of what this does, just say "Update." please.
> But update to what?
> 
> "Generate more parallel code if can_create_pseudo_p." maybe?
> 
>>>> +    rtx H = gen_reg_rtx (DImode);
>>>> +    rtx L = gen_reg_rtx (DImode);
> 
> Please don't use all-uppercase variable names, those are for macros.  In
> fact, don't use uppercase in variable (and function etc.) names at all,
> unless there is a really good reason to.
> 
> Just call it "high" and "low", or "hi" and "lo", or something?
> 
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
>>>> @@ -0,0 +1,27 @@
>>>> +/* { dg-do run } */
>>>> +/* { dg-options "-O2 -mdejagnu-cpu=power7  -save-temps" } */
>>>
>>> Why do we need power7 here?
>> power8/9 are also ok for this case.  Actually, I just want to
>> avoid using new p10 instructions, like "pli", so I selected
>> an older arch option.
> 
> Why does it need _at least_ p7, is the question (as I understand it).
> 

Yeah, that's what I intended to ask, since the insns to be scanned
don't actually require Power7 or later.

> To prohibit pli etc. you can do -mno-prefixed (which works on all older
> CPUs just as well), or skip the test if prefixed insns are enabled, or
> scan for the then generated code as well.  The first option is by far
> the simplest.

Yeah, using -mno-prefixed is perfect here, nice!
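
Just to illustrate, the test header could then look roughly like the sketch below.  This is only an assumption about the shape of the file, not the test from the patch: the function name get_const, the body of main and the scan-assembler-not line are made up for the example.

```c
/* Hypothetical sketch only; get_const, main and the scan below are
   assumptions for illustration, not taken from the patch.  */
/* { dg-do run } */
/* { dg-options "-O2 -mno-prefixed -save-temps" } */

/* With prefixed insns disabled, no "pli" should be generated.  */
/* { dg-final { scan-assembler-not {\mpli\M} } } */

#include <stdlib.h>

/* noinline keeps the constant materialization in its own function.  */
unsigned long long __attribute__ ((noinline))
get_const (void)
{
  return 0x800aabcdc167fa16ULL;
}

int
main (void)
{
  if (get_const () != 0x800aabcdc167fa16ULL)
    abort ();
  return 0;
}
```

No -mdejagnu-cpu= option is needed in this form, so the test is not tied to a particular CPU level.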

BR,
Kewen
