Hi,
I am working on the LLVM IR generated for x86 vector intrinsics. In the
target-specific header file <xmmintrin.h>, some intrinsics are defined using
GCC builtins, and others are implemented using the vector support provided by
LLVM, which I guess is the preferred method whenever possible.
What determines when to use LLVM vector operations and when to use GCC
builtins?
For example, the store low value intrinsic is defined as:
static __inline__ void __attribute__((__always_inline__))
_mm_store_ss(float *__p, __m128 __a)
{
  struct __mm_store_ss_struct {
    float __u;
  } __attribute__((__packed__, __may_alias__));
  ((struct __mm_store_ss_struct*)__p)->__u = __a[0];
}
But the store packed intrinsic is defined as:
static __inline__ void __attribute__((__always_inline__, __nodebug__))
_mm_storeu_ps(float *__p, __m128 __a)
{
  __builtin_ia32_storeups(__p, __a);
}
Why is it not defined like this instead?
static __inline__ void __attribute__((__always_inline__, __nodebug__))
_mm_storeu_ps(float *__p, __m128 __a)
{
  struct __mm_store_ps_struct {
    __m128 __u;
  } __attribute__((__packed__, __may_alias__));
  ((struct __mm_store_ps_struct*)__p)->__u = __a;
}
Loads are already defined this way. Defining the store the same way would
generate more consistent LLVM IR: at the moment, vector loads are translated
into native vector operations in LLVM IR, but vector stores are translated
into calls to external intrinsics.
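To make the proposal concrete, here is a small self-contained sketch of the same packed, may_alias struct pattern applied to both a store and a load. It uses the generic GCC/Clang vector extension instead of __m128 so that it does not depend on <xmmintrin.h>. The names v4sf, storeu_v4sf and loadu_v4sf are my own, for illustration only; they are not part of any header.

```c
/* Generic 4 x float vector type (GCC/Clang vector extension).
   Hypothetical stand-in for __m128. */
typedef float v4sf __attribute__((__vector_size__(16)));

/* Unaligned store: the packed struct has alignment 1, so the compiler
   must not assume that p is 16-byte aligned. __may_alias__ lets the
   access alias the underlying float array. */
static inline void storeu_v4sf(float *p, v4sf a)
{
    struct storeu_struct {
        v4sf u;
    } __attribute__((__packed__, __may_alias__));
    ((struct storeu_struct *)p)->u = a;
}

/* Unaligned load written with the same pattern, for symmetry. */
static inline v4sf loadu_v4sf(const float *p)
{
    struct loadu_struct {
        v4sf u;
    } __attribute__((__packed__, __may_alias__));
    return ((const struct loadu_struct *)p)->u;
}
```

With this form I would expect both functions to lower to plain load and store instructions with align 1 in the IR rather than to a call, although I have not checked every compiler version.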
Many thanks in advance,
Victoria