http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56504
Bug #: 56504
Summary: -mveclibabi=... Support AMD's LibM 3.0 (successor of
ACML)
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: middle-end
AssignedTo: [email protected]
ReportedBy: [email protected]
GCC currently supports:
-mveclibabi=type
Specifies the ABI type to use for vectorizing intrinsics
[...] and acml for the AMD math core library. [...]
[...] and "__vrd2_sin",
"__vrd2_cos", "__vrd2_exp", "__vrd2_log", "__vrd2_log2",
"__vrd2_log10", "__vrs4_sinf", "__vrs4_cosf", "__vrs4_expf",
"__vrs4_logf", "__vrs4_log2f", "__vrs4_log10f" and
"__vrs4_powf" for the corresponding function type when
-mveclibabi=acml is used.
The current AMD LibM version, however, supports much more:
http://developer.amd.com/tools/cpu-development/libm/
From the release notes:
Vector Functions
----------------
Exponential
-----------
* vrs4_expf, vrs4_exp2f, vrs4_exp10f, vrs4_expm1f
* vrsa_expf, vrsa_exp2f, vrsa_exp10f, vrsa_expm1f
* vrd2_exp, vrd2_exp2, vrd2_exp10, vrd2_expm1
* vrda_exp, vrda_exp2, vrda_exp10, vrda_expm1
Logarithmic
-----------
* vrs4_logf, vrs4_log2f, vrs4_log10f, vrs4_log1pf
* vrsa_logf, vrsa_log2f, vrsa_log10f, vrsa_log1pf
* vrd2_log, vrd2_log2, vrd2_log10, vrd2_log1p
* vrda_log, vrda_log2, vrda_log10, vrda_log1p
Trigonometric
-------------
* vrs4_cosf, vrs4_sinf
* vrsa_cosf, vrsa_sinf
* vrd2_cos, vrd2_sin
* vrda_cos, vrda_sin
* vrd2_sincos, vrda_sincos
* vrs4_sincosf, vrsa_sincosf
* vrd2_tan, vrs4_tanf
* vrd2_cosh
Power
-----
* vrs4_cbrtf, vrd2_cbrt, vrs4_powf, vrs4_powxf
* vrsa_cbrtf, vrda_cbrt, vrsa_powf, vrsa_powxf
* vrd2_pow
The vector functions have the known signatures (cf. include/amdlibm.h):
__m128d amd_vrd2_exp (__m128d x);
__m128 amd_vrs4_expf (__m128 x);
etc.
while the array versions use:
void amd_vrsa_expf (int len, float *src, float *dst);
void amd_vrda_exp2 (int len, double *src, double *dst);
void amd_vrda_exp (int len, double *src, double *dst);
void amd_vrsa_expf (int len, float *src, float *dst);
Unfortunately, no further documentation is available that says whether,
e.g., src and dst may point to the same array.
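Since the aliasing rules are undocumented, a caller (or a compiler emitting such calls) can stay on the safe side by never passing overlapping pointers. The following is only a sketch under that assumption. The names vrda_fn, apply_no_alias and standin_square are hypothetical and not part of AMD LibM. The stand-in routine merely mimics the (int len, double *src, double *dst) signature quoted above so that the example builds without the library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Signature shared by the double-precision array routines quoted
 * above (e.g. amd_vrda_exp, amd_vrda_exp2). The typedef name is
 * made up for this sketch. */
typedef void (*vrda_fn)(int len, double *src, double *dst);

/* Apply fn to buf "in place" without ever handing fn overlapping
 * src/dst pointers: compute into a temporary, then copy back.
 * Returns 0 on success, -1 if the temporary cannot be allocated. */
int apply_no_alias(vrda_fn fn, int len, double *buf)
{
    assert(fn != NULL);
    if (len <= 0)
        return 0;

    double *tmp = malloc((size_t)len * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    fn(len, buf, tmp);                       /* src and dst are distinct */
    memcpy(buf, tmp, (size_t)len * sizeof *tmp);
    free(tmp);
    return 0;
}

/* Stand-in with the same signature, NOT an AMD LibM function.
 * It squares each element so the sketch can be run anywhere. */
void standin_square(int len, double *src, double *dst)
{
    for (int i = 0; i < len; i++)
        dst[i] = src[i] * src[i];
}
```

With the library installed, one would presumably pass amd_vrda_exp in place of standin_square. The extra copy costs memory bandwidth, so it is only worthwhile until the aliasing behaviour is documented.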
Note that AMD LibM now uses "amd_" as the prefix for the vector functions. It
still provides the old names as weak symbols, but only for these functions:
0000000000000340 W __vrd2_cos
00000000000000e0 W __vrd2_exp
00000000000001a0 W __vrd2_log
00000000000001c0 W __vrd2_log10
00000000000001b0 W __vrd2_log2
0000000000000330 W __vrd2_sin
0000000000000390 W __vrs4_cosf
00000000000000a0 W __vrs4_expf
0000000000000200 W __vrs4_log10f
00000000000001f0 W __vrs4_log2f
00000000000001e0 W __vrs4_logf
00000000000003a0 W __vrs4_sinf