Hello Segher:
On 21/02/23 4:34 pm, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Feb 21, 2023 at 02:18:25PM +0530, Ajit Agarwal wrote:
>> This patch replaces fmr instruction 6 cycles with 2 cycles xxlor instruction
>> for p7 and p8 architecture.
>>
>> I have implemented with switch and cases otherwise it is difficult to
>> accommodate
>> xxlor with p7 and p8 and fmr for other architectures.
>
> Please domn't use a switch, it isn't needed. Instead use the "isa"
> attribute (with p7v here), and put the preferred alternative first.
I am not sure how this is possible without switch and using only "isa".
>
>> rs6000: fmr gets used instead of faster xxlor [PR93571]
>
> rs6000: Use xxlor instead of fmr where possible
>
>> This patch replaces 6 cycles fmr instruction with xxlor
>> 2 cycles in p8 and p7 architecture.
>
> No, it also does it on all later architectures.
>
> Do you have any actual timings (i.e. from hardware, not documentation)?
>
>> * config/rs6000/rs6000.md (*movdf_hardfloat64): Replace fmr with xxlor
>> instruction.
>
> Line too long. And, that is not what the patch does. Changelog should
> be totally boring just saying what the patch changes. If the patch
> changes things other than what thechangelog says your reviewer will
> think something went missin somewhere :-)
I will correct this.
>
>> - "@
>> - stfd%U0%X0 %1,%0
>> - lfd%U1%X1 %0,%1
>> - xxlor %0,%1,%1
>
> That is not what is currently in trunk, so your patch cannot apply.
>
>> + switch (which_alternative) {
>> + case 0 : return "stfd%U0%X0 %1,%0";
>> + case 1 : return "lfd%U1%X1 %0,%1";
>
> Formatting is all incorrect. We dom't need or want a switch at all, but
> correct would be:
> switch (which_alternative)
> {
> case 0:
> return "stfd%U0%X0 %1,%0";
> case 1:
> return "lfd%U1%X1 %0,%1";
>
> etc.
I will correct that.
>
>> + case 2 : if ((TARGET_VSX || TARGET_P8_VECTOR)
>> + && !TARGET_P9_VECTOR
>> + && !TARGET_POWER10)
>> + return "xxlor %0,%1,%1";
>> + else
>> + return "fmr %0,%1";
>
> Ah, so you are excluding p9 and p10 here. Hrm. That should be written
> TARGET_VSX && !TARGET_P9_VECTOR, none of the rest is needed; but is that
> a good idea at all?
>
> Please use %xN for VSX arguments whenever possible. If this alternative
> allows only the low numbered vector registers, that is a hint that you
> probably should write this differently (and %xN is harmless then).
>
>> + return "unreachable";
>
> No, never do that. There is "gcc_unreachable ()" if you need it.
>
I will also correct this.
> So, let's first do actual timings, and see if it is better on p9 and
> p10 as well (or at least not worse).
>
>
> Segher
Thanks & Regards
Ajit