On Mon, 11 Nov 2024, Jan Hubicka wrote:

> > We currently support generating vectorized math calls to the AMD core
> > math library (ACML) (-mveclibabi=acml). That library is end-of-life and
> > its successor is the math library from AMD Optimizing CPU Libraries
> > (AOCL).
> >
> > This patch adds support for AOCL (-mveclibabi=aocl). That significantly
> > broadens the range of vectorized math functions optimized for AMD CPUs
> > that GCC can generate calls to.
> >
> > See the edit to invoke.texi for a complete list of added functions.
> > Compared to the list of functions in AOCL LibM docs I left out the
> > sincos, linearfrac, powx, sqrt and fabs operations. I also left out all
> Why those are out?
> > the functions working with arrays and amd_vrd2_expm1() (the AMD docs
> > list the function but I wasn't able to link calls to it with the current
> > version of the library).
> >
> > gcc/ChangeLog:
> >
> >     PR target/56504
> >     * config/i386/i386-options.cc (ix86_option_override_internal):
> >     Add ix86_veclibabi_type_aocl case.
> >     * config/i386/i386-options.h (ix86_veclibabi_aocl): Add extern
> >     ix86_veclibabi_aocl().
> >     * config/i386/i386-opts.h (enum ix86_veclibabi): Add
> >     ix86_veclibabi_type_aocl into the ix86_veclibabi enum.
> >     * config/i386/i386.cc (ix86_veclibabi_aocl): New function.
> >     * config/i386/i386.opt: Add the 'aocl' type.
> >     * doc/invoke.texi: Document -mveclibabi=aocl.
> >
> > gcc/testsuite/ChangeLog:
> >
> >     PR target/56504
> >     * gcc.target/i386/vectorize-aocl1.c: New test.
> >
> > +      new_fndecl = build_decl (BUILTINS_LOCATION,
> > +                               FUNCTION_DECL, get_identifier (name), fntype);
> > +      TREE_PUBLIC (new_fndecl) = 1;
> > +      DECL_EXTERNAL (new_fndecl) = 1;
> > +      DECL_IS_NOVOPS (new_fndecl) = 1;
> > +      TREE_READONLY (new_fndecl) = 1;
>
> I see that NOVOPS is copied from the older implementation. I think
> const (which is specified by TREE_READONLY = 1) should be sufficient.
>
> Checking this theory I noticed that tree-ssa-operands does:
>
>   if (!(call_flags & ECF_NOVOPS))
>     {
>       /* A 'pure' or a 'const' function never call-clobbers anything. */
>       if (!(call_flags & (ECF_PURE | ECF_CONST)))
>         add_virtual_operand (opf_def);
>       else if (!(call_flags & ECF_CONST))
>         add_virtual_operand (opf_use);
>     }
>
> It is not clear to me why ECF_CONST functions need opf_use.

It doesn't, you missed the !

> tree-core documents NOVOPS as weaker than CONST
>
> /* Function does not read or write memory (but may have side effects, so
>    it does not necessarily fit ECF_CONST). */
> #define ECF_NOVOPS (1 << 9)
>
> Richi, why this is the case?

IIRC we use it for example for __builtin_prefetch, which we do not want
to DCE. So basically ECF_NOVOPS is equivalent to ECF_CONST from an
alias analysis perspective but it isn't DCEable - it's kind of a
volatile const call.

ECF_NOVOPS still feels ugly, I hope we do not use it too much for
"real" calls, so the above usage in the patch is indeed quite bad.

Richard.

> But this is an independent issue. Patch is OK.
>
> Thanks,
> Honza
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
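
P.S. To make the point about the missed `!` concrete, here is a minimal standalone sketch that mirrors the condition chain quoted from tree-ssa-operands and reports which virtual operand each flag combination would get. It is a model, not GCC code: `vop_kind`, `virtual_operand_for` and `check_and_print_table` are hypothetical names, and the values of ECF_CONST and ECF_PURE are assumed for illustration (only ECF_NOVOPS is taken from the define quoted above).

```c
#include <assert.h>
#include <stdio.h>

/* Flag values for this sketch only.  ECF_NOVOPS matches the define quoted
   in the thread; ECF_CONST and ECF_PURE are assumed for illustration.  */
#define ECF_CONST  (1 << 0)
#define ECF_PURE   (1 << 1)
#define ECF_NOVOPS (1 << 9)

/* Hypothetical stand-in for "no operand", opf_use and opf_def.  */
enum vop_kind { VOP_NONE, VOP_USE, VOP_DEF };

/* Same condition chain as the quoted excerpt, but returning the kind of
   virtual operand instead of calling add_virtual_operand.  */
enum vop_kind
virtual_operand_for (int call_flags)
{
  if (!(call_flags & ECF_NOVOPS))
    {
      /* Neither pure nor const: the call may clobber memory.  */
      if (!(call_flags & (ECF_PURE | ECF_CONST)))
        return VOP_DEF;
      /* Not const, hence pure: the call may read memory.  */
      else if (!(call_flags & ECF_CONST))
        return VOP_USE;
    }
  /* Const or novops: no virtual operand at all.  */
  return VOP_NONE;
}

/* Check and print the four interesting cases.  */
void
check_and_print_table (void)
{
  static const char *const names[] = { "none", "opf_use", "opf_def" };

  assert (virtual_operand_for (0) == VOP_DEF);
  assert (virtual_operand_for (ECF_PURE) == VOP_USE);
  assert (virtual_operand_for (ECF_CONST) == VOP_NONE);
  assert (virtual_operand_for (ECF_NOVOPS) == VOP_NONE);

  printf ("plain call : %s\n", names[virtual_operand_for (0)]);
  printf ("ECF_PURE   : %s\n", names[virtual_operand_for (ECF_PURE)]);
  printf ("ECF_CONST  : %s\n", names[virtual_operand_for (ECF_CONST)]);
  printf ("ECF_NOVOPS : %s\n", names[virtual_operand_for (ECF_NOVOPS)]);
}
```

In this model ECF_CONST and ECF_NOVOPS both end up with no virtual operand, which matches the statement that the two are equivalent from an alias analysis perspective. The DCE difference between them is not visible at this level.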