This is the third major contribution of X86 intrinsic equivalent headers for PPC64LE.

X86 SSE technology was the second X86 SIMD extension; it added wider 128-bit vector (XMM) registers and single-precision float capability. It also addressed missing MMX capabilities and provided transfer (move, pack, unpack) operations between MMX and XMM registers. This was embodied in the <xmmintrin.h> header (in part 2/3). The implementation also provided the mm_malloc.h API to allow for correct 16-byte alignment where the system malloc may only provide 8-byte alignment. PowerPC64LE can assume PowerPC quadword (16-byte) alignment, but we provide this header and API to ease the application porting process. The mm_malloc.h header is implicitly included by xmmintrin.h.
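For reference, application code ported from X86 can keep using this API unchanged. A minimal usage sketch (hypothetical application code, not part of this patch):

#include <mm_malloc.h>
#include <stdint.h>
#include <stdio.h>

int
main (void)
{
  /* Request a quadword (16-byte) aligned buffer for vector
     loads/stores.  With this header, a 16-byte request is satisfied
     by plain malloc on PowerPC64; other alignments go through
     posix_memalign.  */
  float *buf = (float *) _mm_malloc (1024 * sizeof (float), 16);
  if (buf == NULL)
    return 1;
  printf ("16-byte aligned: %d\n", (int) (((uintptr_t) buf % 16) == 0));
  _mm_free (buf);
  return 0;
}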
In general the SSE (__m128) intrinsics are a better match to the PowerISA VMX/VSX 128-bit vector facilities. This allows a direct mapping of the __m128 type to the PowerPC __vector float type and natural handling of parameter passing, return values, and SIMD float operations.

However, while both ISAs support float scalars in vector registers, X86_64 and PowerPC64LE use different formats (and bits within the vector register) for float scalars. This requires extra PowerISA operations to exactly match the X86 scalar float (intrinsics ending in *_ss) semantics. The intent is to provide a functionally correct implementation at some reduction in performance, as the sketch below illustrates.
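To illustrate (a rough sketch only, not the committed xmmintrin.h implementation): an *_ss operation must compute on element [0] while preserving the upper elements of the first operand. Assuming __m128 is typedef'ed to __vector float, as part 2/3 will do (the type and function names below are hypothetical), the technique looks roughly like:

#include <altivec.h>

typedef __vector float v4sf_sketch;	/* stand-in for the eventual __m128 */

/* Emulate _mm_add_ss-style semantics: add element [0] of A and B,
   keep elements [1]-[3] of A.  Splatting element 0 first keeps the
   don't-care upper lanes from raising spurious FP exceptions.  */
static inline v4sf_sketch
add_ss_sketch (v4sf_sketch A, v4sf_sketch B)
{
  const __vector unsigned int mask = { 0xffffffff, 0, 0, 0 };
  v4sf_sketch a = vec_splat (A, 0);
  v4sf_sketch b = vec_splat (B, 0);
  v4sf_sketch c = a + b;
  /* Select element 0 from the sum, elements 1-3 from A.  */
  return vec_sel (A, c, mask);
}

The vec_splat/vec_sel pair is the extra PowerISA work mentioned above; a full-vector add has no such overhead.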
This patch (part 1/3) just adds the mm_malloc.h header, which will be needed by xmmintrin.h, and cleans up some noisy warnings from the previous MMX commit. Part 2 adds the xmmintrin.h header and the associated config.gcc and x86intrin.h changes; part 3 adds the associated DG test cases.

./gcc/ChangeLog:

2017-08-16  Steven Munroe  <munro...@gcc.gnu.org>

	* config/rs6000/mm_malloc.h: New file.

[gcc/testsuite]

2017-07-21  Steven Munroe  <munro...@gcc.gnu.org>

	* gcc.target/powerpc/mmx-packuswb-1.c [NO_WARN_X86_INTRINSICS]:
	Define.  Suppress warning during tests.

Index: gcc/testsuite/gcc.target/powerpc/mmx-packuswb-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/mmx-packuswb-1.c	(revision 250986)
+++ gcc/testsuite/gcc.target/powerpc/mmx-packuswb-1.c	(working copy)
@@ -3,6 +3,8 @@
 /* { dg-require-effective-target lp64 } */
 /* { dg-require-effective-target p8vector_hw } */
 
+#define NO_WARN_X86_INTRINSICS 1
+
 #ifndef CHECK_H
 #define CHECK_H "mmx-check.h"
 #endif
Index: gcc/config/rs6000/mm_malloc.h
===================================================================
--- gcc/config/rs6000/mm_malloc.h	(revision 0)
+++ gcc/config/rs6000/mm_malloc.h	(revision 0)
@@ -0,0 +1,62 @@
+/* Copyright (C) 2004-2017 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _MM_MALLOC_H_INCLUDED
+#define _MM_MALLOC_H_INCLUDED
+
+#include <stdlib.h>
+
+/* We can't depend on <stdlib.h> since the prototype of posix_memalign
+   may not be visible.  */
+#ifndef __cplusplus
+extern int posix_memalign (void **, size_t, size_t);
+#else
+extern "C" int posix_memalign (void **, size_t, size_t) throw ();
+#endif
+
+static __inline void *
+_mm_malloc (size_t size, size_t alignment)
+{
+  /* PowerPC64 ELF V2 ABI requires quadword alignment.  */
+  size_t vec_align = sizeof (__vector float);
+  /* Linux GLIBC malloc alignment is at least 2 X ptr size.  */
+  size_t malloc_align = (sizeof (void *) + sizeof (void *));
+  void *ptr;
+
+  if (alignment == malloc_align && alignment == vec_align)
+    return malloc (size);
+  if (alignment < vec_align)
+    alignment = vec_align;
+  if (posix_memalign (&ptr, alignment, size) == 0)
+    return ptr;
+  else
+    return NULL;
+}
+
+static __inline void
+_mm_free (void * ptr)
+{
+  free (ptr);
+}
+
+#endif /* _MM_MALLOC_H_INCLUDED */