kromanova added inline comments.

================
Comment at: emmintrin.h:1607
+///
+/// This intrinsic corresponds to the <c> VMOVSD / MOVSD </c> instruction.
+///
----------------
RKSimon wrote:
> kromanova wrote:
> > kromanova wrote:
> > > kromanova wrote:
> > > > probinson wrote:
> > > > > should this be VMOVQ/MOVQ instead?
> > > > Probably yes. Let me know if you have a different opinion.
> > > >  
> > > > If I use this intrinsic by itself, clang generates VMOVSD instruction. 
> > > > It happens because the default domain is chooses to generate smaller 
> > > > instruction code. 
> > > > I got confused because I couldn't find Intel's documentation about 
> > > > _mm_loadu_si64, so I just wrote a test like the one below and looked 
> > > > what instructions got generated.
> > > > 
> > > > ```
> > > > __m128i foo22 (void const * __a)
> > > > {
> > > >   return _mm_loadu_si64 (__a);
> > > > }
> > > > ```
> > > > 
> > > > However, if I change the test and use an intrisic to add 2 64-bit 
> > > > integers after the load intrinsics, I can see that VMOVQ instruction 
> > > > gets generated.
> > > > 
> > > > ```
> > > > __m128d foo44 (double const * __a)
> > > > {
> > > >   __m128i first  = _mm_loadu_si64 (__a);
> > > >   __m128i second = _mm_loadu_si64 (__a);
> > > >   return _mm_add_epi64(first, second);
> > > > 
> > > > }
> > > > ```
> > > > 
> > > > So, as you see clang could generate either VMOVSD/MOVSD or 
> > > > VMOVSQ/MOVSQ. I think it makes sense to change the documentation as 
> > > > Paul suggested:
> > > > 
> > > > /// This intrinsic corresponds to the VMOVSQ/MOVSQ.
> > > > 
> > > > Or, alternatively, we could list all the instructions that correspond 
> > > > to this intrinsics:
> > > > 
> > > > /// This intrinsic corresponds to the VMOVSQ/MOVSQ/VMOVSD/MOVSD.
> > > > 
> > > >           
> > > It will be interesting to hear Asaf Badoug opinion, since he added this 
> > > intrisic. He probably has access to Intel's documentation for this 
> > > intrinsic too (which I wasn't able to find online).
> > There is a similar situation for one intrisic just a few lines above, 
> > namely _mm_loadu_pd. It could generate either VMOVUPD / MOVUPD or 
> > VMOVUPS/MOVUPS instructions. 
> > I have actually asked Simon question about it offline just a couple of days 
> > ago. 
> > 
> > I decided to kept referring to VMOVUPD / MOVUPD as a corresponding 
> > instruction for _mm_loadu_pd. However, if we end up doing things 
> > differently for _mm_loadu_si64, we need to do a similar change to 
> > _mm_loadu_pd (and probably to some other intrinsics).
> It should be VMOVQ/MOVQ (note NOT VMOVSQ/MOVSQ!). Whatever the domain fixup 
> code does to it, that was the original intent of the code and matches what 
> other compilers says it will (probably) be.
Yep, sorry, inaccurate editing after copy and paste. Thank you for noticing.
I agree should say VMOVQ/MOVQ (similar to what is done for _mm_loadu_pd that we 
discussed a few days ago).

I will do this change and reload the review shortly.


Repository:
  rL LLVM

https://reviews.llvm.org/D28503



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to