RE: [PATCH] D12052: [X86][SSE] Add _mm_undefined_* intrinsics

2015-08-18 Thread Kuperstein, Michael M via cfe-commits
I’m not sure how much people actually use these, but the AVX-512 versions of 
these, at least, can be very useful internally to implement AVX-512 intrinsics.
For AVX-512, we use the same GCC builtin for all 3 versions of the intrinsic 
(pass-through masked, set to zero masked, and unmasked). This is the same 
implementation that’s used in GCC, and is fairly clean, since the only 
difference is in the desired pass-through values (actual value, zero, or undef).

However, since we don’t actually have the undef intrinsics right now, we put a 
zero in the unmasked version as well, which is definitely a pessimization.
The plan is to change them to use undef once the undef intrinsics are 
implemented.


From: Eric Christopher [mailto:echri...@gmail.com]
Sent: Monday, August 17, 2015 21:33
To: reviews+d12052+public+a6057f04f570e...@reviews.llvm.org; 
llvm-...@redking.me.uk; craig.top...@gmail.com; Kuperstein, Michael M
Cc: david.majne...@gmail.com; Badouh, Asaf; cfe-commits@lists.llvm.org; Richard 
Smith
Subject: Re: [PATCH] D12052: [X86][SSE] Add _mm_undefined_* intrinsics


On Sun, Aug 16, 2015 at 3:05 AM Simon Pilgrim 
mailto:llvm-...@redking.me.uk>> wrote:
RKSimon added a comment.

Yes using that uninitialized value has worried me as well. I originally set it 
to zero (and considered using __ LINE __ or __ COUNTER __) but both introduce 
defined behaviour that I could see causing all sorts of problems further down 
the line in debug vs release builds. How undefined do we want our undefined to 
be? ;-)

Yeah, this is why I hadn't implemented them yet either.

I can create __builtin_ia32_undef64mmx / __builtin_ia32_undef128 / 
__builtin_ia32_undef256 / __builtin_ia32_undef512 if nobody can think of a 
better alternative?

This seems fairly heavyweight, but I don't have any better ideas. I'll assume 
we don't want to try to expose undef as a value in C (making it as something we 
could just add), if not then this seems to make the most sense. It's pretty 
painful/ugly though.

Are people using these or did they just notice for completeness? We probably 
_could_ define them to zero and leave it at that. It's not pleasant and slower 
than it needs to be but not crazy.

-eric


Repository:
  rL LLVM

http://reviews.llvm.org/D12052


-
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


RE: r245923 - [X86] Expose the various _rot intrinsics on non-MS platforms

2015-08-25 Thread Kuperstein, Michael M via cfe-commits
Argh.
I'll revert, and figure out what to do with it later.
Sorry about the breakage.

Michael

-Original Message-
From: İsmail Dönmez [mailto:ism...@i10z.com] 
Sent: Tuesday, August 25, 2015 14:17
To: Kuperstein, Michael M
Cc: cfe-commits@lists.llvm.org
Subject: Re: r245923 - [X86] Expose the various _rot intrinsics on non-MS 
platforms

Hi,

On Tue, Aug 25, 2015 at 10:21 AM, Michael Kuperstein via cfe-commits 
 wrote:
> Author: mkuper
> Date: Tue Aug 25 02:21:33 2015
> New Revision: 245923
>
> URL: http://llvm.org/viewvc/llvm-project?rev=245923&view=rev
> Log:
> [X86] Expose the various _rot intrinsics on non-MS platforms
>
> _rotl, _rotwl and _lrotl (and their right-shift counterparts) are 
> official x86 intrinsics, and should be supported regardless of 
> environment. This is in contrast to _rotl8, _rotl16, and _rotl64 which are 
> MS-specific.
>
> Note that the MS documentation for _lrotl is different from the Intel 
> documentation. Intel explicitly documents it as a 64-bit rotate, while 
> for MS, since sizeof(unsigned long) for MSVC is always 4, a 32-bit rotate is 
> implied.
>
> Differential Revision: http://reviews.llvm.org/D12271
>
> Added:
> cfe/trunk/test/CodeGen/x86-rot-intrinsics.c   (with props)
> Modified:
> cfe/trunk/lib/Headers/Intrin.h
> cfe/trunk/lib/Headers/immintrin.h

This seems to break clang + mingw-w64 :

λ echo "#include " | clang -x c -target x86_64-w64-mingw32 
--sysroot C:/mingw-w64-6.0.0 - In file included from :1:
In file included from
C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\winsock2.h:23:
In file included from
C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\windows.h:69:
In file included from C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\windef.h:8:
In file included from
C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\minwindef.h:163:
In file included from
C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\winnt.h:1516:
In file included from C:\Program
Files\LLVM\bin\..\lib\clang\3.8.0\include\x86intrin.h:29:
C:\Program Files\LLVM\bin\..\lib\clang\3.8.0\include\immintrin.h:164:1:
error: static declaration of '_rotl' follows non-static declaration 
_rotl(unsigned int _Value, int _Shift) { ^
C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\stdlib.h:581:24: note:
previous declaration is here
  unsigned int __cdecl _rotl(unsigned int _Val,int _Shift);
   ^
-
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


RE: [PATCH] D13015: [X86] Make f16c intrinsics accessible through emmintrin.h, per Intel docs

2015-11-30 Thread Kuperstein, Michael M via cfe-commits
Hi Paul,

I'd rather not move them into immintrin.h. The current situation is that if you 
include emmintrin.h you get "too many" intrinsics w.r.t to the Intel docs (and 
immintrin.h is just right, since it recursively includes emmintrin.h), while if 
we move it to immintrin.h, including emmintrin.h would provide "not enough" 
intrinsics. So, in terms of compatibility, I think this is slightly better.
Regarding MSVC and GCC - MSVC doesn't have an emmintrin.h (or any other 
specialized intrinsic file) and puts everything in immintrin.h. GCC, in my 
opinion, should probably also be fixed. :-)

On the other hand, I undersand why providing _mm256 stuff in emmintrin.h may be 
a problem. Perhaps we can split the file in two (or move the relevant 
intrinsics directly into emm/immintrin)?
That would means our header structure is slightly different from GCC's, but 
since this file should not be included directly anyway, that doesn't really 
bother me.
Does that sound reasonable to you?

Michael

-Original Message-
From: Paul Robinson [mailto:paul_robin...@playstation.sony.com] 
Sent: Monday, November 30, 2015 23:01
To: Kuperstein, Michael M; Badouh, Asaf
Cc: cfe-commits@lists.llvm.org
Subject: Re: [PATCH] D13015: [X86] Make f16c intrinsics accessible through 
emmintrin.h, per Intel docs

probinson added a subscriber: probinson.
probinson added a comment.

Regarding _mm[256]_cvtps_ph and ...cvtph_ps, I can find an Intel doc that say 
the _mm_* functions are in emmintrin.h and the _mm256_* are in immintrin.h, so 
putting them all with emmintrin.h is not consistent with what Intel says.  I 
can find a Microsoft doc that say these are all available with immintrin.h (not 
emmintrin.h).  It looks like gcc puts them with immintrin.h also.

Can we move the #include of f16cintrin.h from emmintrin.h to immintrin.h?  It 
would be more compatible with everybody else, and equally inconsistent with 
Intel's doc AFAICT.


Repository:
  rL LLVM

http://reviews.llvm.org/D13015



-
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits