On Mon, 11 Nov 2024, Jan Hubicka wrote:

> > We currently support generating vectorized math calls to the AMD core
> > math library (ACML) (-mveclibabi=acml).  That library is end-of-life and
> > its successor is the math library from AMD Optimizing CPU Libraries
> > (AOCL).
> > 
> > This patch adds support for AOCL (-mveclibabi=aocl).  That significantly
> > broadens the range of vectorized math functions optimized for AMD CPUs
> > that GCC can generate calls to.
> > 
> > See the edit to invoke.texi for a complete list of added functions.
> > Compared to the list of functions in AOCL LibM docs I left out the
> > sincos, linearfrac, powx, sqrt and fabs operations.  I also left out all
> Why are those left out?
> > the functions working with arrays and amd_vrd2_expm1() (the AMD docs
> > list the function but I wasn't able to link calls to it with the current
> > version of the library).
> > 
> > gcc/ChangeLog:
> > 
> >     PR target/56504
> >     * config/i386/i386-options.cc (ix86_option_override_internal):
> >     Add ix86_veclibabi_type_aocl case.
> >     * config/i386/i386-options.h (ix86_veclibabi_aocl): Add extern
> >     ix86_veclibabi_aocl().
> >     * config/i386/i386-opts.h (enum ix86_veclibabi): Add
> >     ix86_veclibabi_type_aocl into the ix86_veclibabi enum.
> >     * config/i386/i386.cc (ix86_veclibabi_aocl): New function.
> >     * config/i386/i386.opt: Add the 'aocl' type.
> >     * doc/invoke.texi: Document -mveclibabi=aocl.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> >     PR target/56504
> >     * gcc.target/i386/vectorize-aocl1.c: New test.
> > 
> > +  new_fndecl = build_decl (BUILTINS_LOCATION,
> > +                      FUNCTION_DECL, get_identifier (name), fntype);
> > +  TREE_PUBLIC (new_fndecl) = 1;
> > +  DECL_EXTERNAL (new_fndecl) = 1;
> > +  DECL_IS_NOVOPS (new_fndecl) = 1;
> > +  TREE_READONLY (new_fndecl) = 1;
> 
> I see that NOVOPS is copied from the older implementation.  I think
> const (which is specified by TREE_READONLY = 1) should be sufficient.
> 
> Checking this theory I noticed that tree-ssa-operands does:
> 
>   if (!(call_flags & ECF_NOVOPS))
>     {
>       /* A 'pure' or a 'const' function never call-clobbers anything.  */
>       if (!(call_flags & (ECF_PURE | ECF_CONST)))
>         add_virtual_operand (opf_def);
>       else if (!(call_flags & ECF_CONST))
>         add_virtual_operand (opf_use);
>     }
> 
> It is not clear to me why ECF_CONST functions need opf_use.

It doesn't, you missed the '!': the opf_use branch is only taken when
ECF_CONST is not set, i.e. for pure but non-const calls.
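
To make the branch structure easier to follow, here is a minimal stand-alone
model of the quoted snippet.  It is only a sketch, not the actual
tree-ssa-operands code: the enum, the names call_virtual_operand and
check_model, and the values chosen for ECF_CONST and ECF_PURE are
assumptions made for illustration.

```c
#include <assert.h>

/* Illustrative flag values for this stand-alone model only.  */
#define ECF_CONST   (1 << 0)
#define ECF_PURE    (1 << 1)
#define ECF_NOVOPS  (1 << 9)

/* Which virtual operand the modelled call statement would get.  */
enum vop { VOP_NONE, VOP_USE, VOP_DEF };

/* Mirrors the if/else chain quoted above, returning the operand kind
   instead of calling add_virtual_operand.  */
enum vop
call_virtual_operand (int call_flags)
{
  if (!(call_flags & ECF_NOVOPS))
    {
      /* Neither pure nor const: the call may clobber memory.  */
      if (!(call_flags & (ECF_PURE | ECF_CONST)))
        return VOP_DEF;
      /* Pure but not const: the call may read memory.  */
      else if (!(call_flags & ECF_CONST))
        return VOP_USE;
    }
  /* Const or novops: no virtual operand at all.  */
  return VOP_NONE;
}

/* Expected results for each flag combination of interest.  */
void
check_model (void)
{
  assert (call_virtual_operand (0) == VOP_DEF);
  assert (call_virtual_operand (ECF_PURE) == VOP_USE);
  assert (call_virtual_operand (ECF_CONST) == VOP_NONE);
  assert (call_virtual_operand (ECF_NOVOPS) == VOP_NONE);
  assert (call_virtual_operand (ECF_NOVOPS | ECF_CONST) == VOP_NONE);
}
```

In this model a const call and a novops call end up with the same (empty)
set of virtual operands, so setting both flags on one decl adds nothing as
far as this particular check is concerned.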

> tree-core.h documents NOVOPS as weaker than CONST:
> 
> /* Function does not read or write memory (but may have side effects, so
>    it does not necessarily fit ECF_CONST).  */
> #define ECF_NOVOPS                (1 << 9)
> 
> Richi, why is this the case?

IIRC we use it, for example, for __builtin_prefetch, which we do not want
to DCE.  So basically ECF_NOVOPS is equivalent to ECF_CONST from an
alias analysis perspective, but it isn't DCEable; it's kind of a
volatile const call.  ECF_NOVOPS still feels ugly.  I hope we do
not use it too much for "real" calls, so the above usage in the
patch is indeed quite bad.

Richard.

> But this is an independent issue.  Patch is OK.
> 
> Thanks,
> Honza
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)