RE: [PATCH] D12052: [X86][SSE] Add _mm_undefined_* intrinsics
I’m not sure how much people actually use these, but the AVX-512 versions of these, at least, can be very useful internally to implement AVX-512 intrinsics. For AVX-512, we use the same GCC builtin for all 3 versions of the intrinsic (pass-through masked, set to zero masked, and unmasked). This is the same implementation that’s used in GCC, and is fairly clean, since the only difference is in the desired pass-through values (actual value, zero, or undef). However, since we don’t actually have the undef intrinsics right now, we put a zero in the unmasked version as well, which is definitely a pessimization. The plan is to change them to use undef once the undef intrinsics are implemented. From: Eric Christopher [mailto:echri...@gmail.com] Sent: Monday, August 17, 2015 21:33 To: reviews+d12052+public+a6057f04f570e...@reviews.llvm.org; llvm-...@redking.me.uk; craig.top...@gmail.com; Kuperstein, Michael M Cc: david.majne...@gmail.com; Badouh, Asaf; cfe-commits@lists.llvm.org; Richard Smith Subject: Re: [PATCH] D12052: [X86][SSE] Add _mm_undefined_* intrinsics On Sun, Aug 16, 2015 at 3:05 AM Simon Pilgrim mailto:llvm-...@redking.me.uk>> wrote: RKSimon added a comment. Yes using that uninitialized value has worried me as well. I originally set it to zero (and considered using __ LINE __ or __ COUNTER __) but both introduce defined behaviour that I could see causing all sorts of problems further down the line in debug vs release builds. How undefined do we want our undefined to be? ;-) Yeah, this is why I hadn't implemented them yet either. I can create __builtin_ia32_undef64mmx / __builtin_ia32_undef128 / __builtin_ia32_undef256 / __builtin_ia32_undef512 if nobody can think of a better alternative? This seems fairly heavyweight, but I don't have any better ideas. I'll assume we don't want to try to expose undef as a value in C (making it as something we could just add), if not then this seems to make the most sense. It's pretty painful/ugly though. Are people using these or did they just notice for completeness? We probably _could_ define them to zero and leave it at that. It's not pleasant and slower than it needs to be but not crazy. -eric Repository: rL LLVM http://reviews.llvm.org/D12052 - Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
RE: r245923 - [X86] Expose the various _rot intrinsics on non-MS platforms
Argh. I'll revert, and figure out what to do with it later. Sorry about the breakage. Michael -Original Message- From: İsmail Dönmez [mailto:ism...@i10z.com] Sent: Tuesday, August 25, 2015 14:17 To: Kuperstein, Michael M Cc: cfe-commits@lists.llvm.org Subject: Re: r245923 - [X86] Expose the various _rot intrinsics on non-MS platforms Hi, On Tue, Aug 25, 2015 at 10:21 AM, Michael Kuperstein via cfe-commits wrote: > Author: mkuper > Date: Tue Aug 25 02:21:33 2015 > New Revision: 245923 > > URL: http://llvm.org/viewvc/llvm-project?rev=245923&view=rev > Log: > [X86] Expose the various _rot intrinsics on non-MS platforms > > _rotl, _rotwl and _lrotl (and their right-shift counterparts) are > official x86 intrinsics, and should be supported regardless of > environment. This is in contrast to _rotl8, _rotl16, and _rotl64 which are > MS-specific. > > Note that the MS documentation for _lrotl is different from the Intel > documentation. Intel explicitly documents it as a 64-bit rotate, while > for MS, since sizeof(unsigned long) for MSVC is always 4, a 32-bit rotate is > implied. > > Differential Revision: http://reviews.llvm.org/D12271 > > Added: > cfe/trunk/test/CodeGen/x86-rot-intrinsics.c (with props) > Modified: > cfe/trunk/lib/Headers/Intrin.h > cfe/trunk/lib/Headers/immintrin.h This seems to break clang + mingw-w64 : λ echo "#include " | clang -x c -target x86_64-w64-mingw32 --sysroot C:/mingw-w64-6.0.0 - In file included from :1: In file included from C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\winsock2.h:23: In file included from C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\windows.h:69: In file included from C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\windef.h:8: In file included from C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\minwindef.h:163: In file included from C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\winnt.h:1516: In file included from C:\Program Files\LLVM\bin\..\lib\clang\3.8.0\include\x86intrin.h:29: C:\Program Files\LLVM\bin\..\lib\clang\3.8.0\include\immintrin.h:164:1: error: static declaration of '_rotl' follows non-static declaration _rotl(unsigned int _Value, int _Shift) { ^ C:/mingw-w64-6.0.0\x86_64-w64-mingw32\include\stdlib.h:581:24: note: previous declaration is here unsigned int __cdecl _rotl(unsigned int _Val,int _Shift); ^ - Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
RE: [PATCH] D13015: [X86] Make f16c intrinsics accessible through emmintrin.h, per Intel docs
Hi Paul, I'd rather not move them into immintrin.h. The current situation is that if you include emmintrin.h you get "too many" intrinsics w.r.t to the Intel docs (and immintrin.h is just right, since it recursively includes emmintrin.h), while if we move it to immintrin.h, including emmintrin.h would provide "not enough" intrinsics. So, in terms of compatibility, I think this is slightly better. Regarding MSVC and GCC - MSVC doesn't have an emmintrin.h (or any other specialized intrinsic file) and puts everything in immintrin.h. GCC, in my opinion, should probably also be fixed. :-) On the other hand, I undersand why providing _mm256 stuff in emmintrin.h may be a problem. Perhaps we can split the file in two (or move the relevant intrinsics directly into emm/immintrin)? That would means our header structure is slightly different from GCC's, but since this file should not be included directly anyway, that doesn't really bother me. Does that sound reasonable to you? Michael -Original Message- From: Paul Robinson [mailto:paul_robin...@playstation.sony.com] Sent: Monday, November 30, 2015 23:01 To: Kuperstein, Michael M; Badouh, Asaf Cc: cfe-commits@lists.llvm.org Subject: Re: [PATCH] D13015: [X86] Make f16c intrinsics accessible through emmintrin.h, per Intel docs probinson added a subscriber: probinson. probinson added a comment. Regarding _mm[256]_cvtps_ph and ...cvtph_ps, I can find an Intel doc that say the _mm_* functions are in emmintrin.h and the _mm256_* are in immintrin.h, so putting them all with emmintrin.h is not consistent with what Intel says. I can find a Microsoft doc that say these are all available with immintrin.h (not emmintrin.h). It looks like gcc puts them with immintrin.h also. Can we move the #include of f16cintrin.h from emmintrin.h to immintrin.h? It would be more compatible with everybody else, and equally inconsistent with Intel's doc AFAICT. Repository: rL LLVM http://reviews.llvm.org/D13015 - Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits