Hi! On Tue, Feb 21, 2023 at 02:18:25PM +0530, Ajit Agarwal wrote: > This patch replaces fmr instruction 6 cycles with 2 cycles xxlor instruction > for p7 and p8 architecture. > > I have implemented with switch and cases otherwise it is difficult to > accommodate > xxlor with p7 and p8 and fmr for other architectures.
Please domn't use a switch, it isn't needed. Instead use the "isa" attribute (with p7v here), and put the preferred alternative first. > rs6000: fmr gets used instead of faster xxlor [PR93571] rs6000: Use xxlor instead of fmr where possible > This patch replaces 6 cycles fmr instruction with xxlor > 2 cycles in p8 and p7 architecture. No, it also does it on all later architectures. Do you have any actual timings (i.e. from hardware, not documentation)? > * config/rs6000/rs6000.md (*movdf_hardfloat64): Replace fmr with xxlor > instruction. Line too long. And, that is not what the patch does. Changelog should be totally boring just saying what the patch changes. If the patch changes things other than what thechangelog says your reviewer will think something went missin somewhere :-) > - "@ > - stfd%U0%X0 %1,%0 > - lfd%U1%X1 %0,%1 > - xxlor %0,%1,%1 That is not what is currently in trunk, so your patch cannot apply. > + switch (which_alternative) { > + case 0 : return "stfd%U0%X0 %1,%0"; > + case 1 : return "lfd%U1%X1 %0,%1"; Formatting is all incorrect. We dom't need or want a switch at all, but correct would be: switch (which_alternative) { case 0: return "stfd%U0%X0 %1,%0"; case 1: return "lfd%U1%X1 %0,%1"; etc. > + case 2 : if ((TARGET_VSX || TARGET_P8_VECTOR) > + && !TARGET_P9_VECTOR > + && !TARGET_POWER10) > + return "xxlor %0,%1,%1"; > + else > + return "fmr %0,%1"; Ah, so you are excluding p9 and p10 here. Hrm. That should be written TARGET_VSX && !TARGET_P9_VECTOR, none of the rest is needed; but is that a good idea at all? Please use %xN for VSX arguments whenever possible. If this alternative allows only the low numbered vector registers, that is a hint that you probably should write this differently (and %xN is harmless then). > + return "unreachable"; No, never do that. There is "gcc_unreachable ()" if you need it. So, let's first do actual timings, and see if it is better on p9 and p10 as well (or at least not worse). Segher