On 05/02/2018 03:55 AM, Nesterovskiy, Alexander wrote:
> This patch fixes false dependencies for vmovss, vmovsd, vrcpss, vrsqrtss,
> vsqrtss and vsqrtsd instructions.
>
> Tested on x86-64/Linux, no new test fails, some SPEC 2006/2017 performance
> gains.
> Please let me know if something is wrong here and should be changed.
>
> --
> Alexander Nesterovskiy
>
>
> falsedep.patch
>
>
> --- i386.md (revision 259756)
> +++ i386.md (working copy)
> @@ -3547,7 +3547,7 @@
>  {
>      case MODE_DF:
>        if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
> -        return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
> +        return "%vmovsd\t{%d1, %0|%0, %d1}";
>        return "%vmovsd\t{%1, %0|%0, %1}";
>
>      case MODE_V4SF:
> @@ -3748,7 +3748,7 @@
>  {
>      case MODE_SF:
>        if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
> -        return "vmovss\t{%1, %0, %0|%0, %0, %1}";
> +        return "%vmovss\t{%d1, %0|%0, %d1}";
>        return "%vmovss\t{%1, %0|%0, %1}";

So what I'm confused about is in the original output template operand 0 is
duplicated. In the new template operand 1 is duplicated.
Presumably what you're trying to accomplish is avoiding a false read on operand 0 (the destination)? Can you please confirm? Knowing that should also help me evaluate the changes to rcp and rsqrt, since they're being changed to the same style of encoding when operating strictly on registers.

Thanks,
jeff