Hi Segher, on 2022/11/25 23:46, Segher Boessenkool wrote: > Hi! > > On Fri, Nov 25, 2022 at 09:21:21PM +0800, Jiufu Guo wrote: >> "Kewen.Lin" <li...@linux.ibm.com> writes: >>> on 2022/9/15 16:30, Jiufu Guo wrote: >>>> For a complicate 64bit constant, blow is one instruction-sequence to >>>> build: >>>> lis 9,0x800a >>>> ori 9,9,0xabcd >>>> sldi 9,9,32 >>>> oris 9,9,0xc167 >>>> ori 9,9,0xfa16 >>>> >>>> while we can also use below sequence to build: >>>> lis 9,0xc167 >>>> lis 10,0x800a >>>> ori 9,9,0xfa16 >>>> ori 10,10,0xabcd >>>> rldimi 9,10,32,0 >>>> This sequence is using 2 registers to build high and low part firstly, >>>> and then merge them. >>>> In parallel aspect, this sequence would be faster. (Ofcause, using 1 more >>>> register with potential register pressure). > > And crucially this patch only uses two registers if can_create_pseudo_p. > Please mention that. > >>>> * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Update 64bit >>>> constant build. > > If you don't give details of what this does, just say "Update." please. > But update to what? > > "Generate more parallel code if can_create_pseudo_p." maybe? > >>>> + rtx H = gen_reg_rtx (DImode); >>>> + rtx L = gen_reg_rtx (DImode); > > Please don't use all-uppercase variable names, those are for macros. In > fact, don't use uppercase in variable (and function etc.) names at all, > unless there is a really good reason to. > > Just call it "high" and "low", or "hi" and "lo", or something? > >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c >>>> @@ -0,0 +1,27 @@ >>>> +/* { dg-do run } */ >>>> +/* { dg-options "-O2 -mdejagnu-cpu=power7 -save-temps" } */ >>> >>> Why do we need power7 here? >> power8/9 are also ok for this case. Actually, O just want to >> avoid to use new p10 instruction, like "pli", and then selected >> an old arch option. > > Why does it need _at least_ p7, is the question (as I understand it). >
Yeah, that's what I was intended to ask, since those insns to be scanned don't actually require Power7 or later. > To prohibit pli etc. you can do -mno-prefixed (which works on all older > CPUs just as well), or skip the test if prefixed insns are enabled, or > scan for the then generated code as well. The first option is by far > the simplest. Yeah, using -mno-prefixed is perfect here, nice! BR, Kewen