On Tue, 31 Dec 2019, Jakub Jelinek wrote:

> On Tue, Dec 31, 2019 at 05:47:54PM +0100, Richard Biener wrote:
> > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > Ok. 
> 
> Thanks.
> 
> > >One thing I haven't done anything about yet is that there is
> > >FAIL: gcc.dg/tree-ssa/popcount4ll.c scan-tree-dump-times optimized ".POPCOUNT" 1
> > >before/after this patch with -m32/-march=skylake-avx512.  That is because
> > >the popcountll effective target tests that we don't emit a call for
> > >__builtin_popcountll, which we don't on ia32 skylake-avx512, but
> > >direct_internal_fn_supported_p isn't true - that is because we expand the
> > >double word popcount using 2 word popcounts + addition.  Shall the match.pd
> > >case handle that case too by allowing the optimization even if there is a
> > >type with half precision for which direct_internal_fn_supported_p is true?
> > 
> > You mean emitting a single builtin call, or an add of two ifns?
> 
> I meant to do in the match.pd condition what expand_unop will do, i.e.
> -     && direct_internal_fn_supported_p (IFN_POPCOUNT, type,
> -                                        OPTIMIZE_FOR_BOTH))
> +     && (direct_internal_fn_supported_p (IFN_POPCOUNT, type,
> +                                         OPTIMIZE_FOR_BOTH)
> +         /* expand_unop can handle double-word popcount using
> +            two word popcounts and addition.  */
> +         || (TREE_CODE (type) == INTEGER_TYPE
> +             && TYPE_PRECISION (type) == 2 * BITS_PER_WORD
> +             && (optab_handler (popcount_optab, word_mode)
> +                 != CODE_FOR_nothing))))
> or so.

OK, that would work for me (maybe add a predicate to the optabs code
close to the actual expander).

Richard.
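
For readers unfamiliar with the double-word expansion discussed above, here is
a minimal sketch in plain C of "two word popcounts + addition", assuming a
target with 32-bit words and a 64-bit double word.  The function name
popcount_doubleword is hypothetical and is not part of GCC; the real expansion
happens on RTL inside expand_unop, not in source code like this.

```c
#include <stdint.h>

/* Hypothetical illustration, not GCC code: popcount of a 64-bit value
   computed as two 32-bit word popcounts plus an addition.  Assumes
   unsigned int is at least 32 bits wide, as on ia32.  */
static int
popcount_doubleword (uint64_t x)
{
  uint32_t lo = (uint32_t) x;          /* low word */
  uint32_t hi = (uint32_t) (x >> 32);  /* high word */

  /* __builtin_popcount takes an unsigned int, so each call is a
     single-word popcount.  */
  return __builtin_popcount (lo) + __builtin_popcount (hi);
}
```

The result should agree with __builtin_popcountll for every input, which is
why the match.pd condition can accept the double-word case whenever the
word_mode popcount optab is available.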
