On 22/03/2023 10:09, Richard Biener wrote:
> On Tue, Mar 21, 2023 at 6:00 PM Andrew Stubbs <a...@codesourcery.com> wrote:
>>
>> Hi all,
>>
>> I want to be able to vectorize divide operators (softfp and integer),
>> but amdgcn only has hardware instructions suitable for -ffast-math.
>>
>> We have recently implemented vector versions of all the libm functions,
>> but the libgcc functions aren't builtins and therefore don't use those
>> hooks.
>>
>> What's the best way to achieve this? Add a new __builtin_div (and
>> __builtin_mod) that tree-vectorize can find, perhaps? Or something else?
>
> What do you want to do?  Vectorize the out-of-line libgcc copy?  Or
> emit inline vectorized code for int/softfp operations?  In the latter
> case just emit the code from the pattern expanders?

I'd like to investigate having vectorized versions of the libgcc instruction functions, like we do for libm.

The inline code expansion is certainly an option, but I think there's quite a lot of code in those routines. At least I know how to implement that option (except, perhaps, the errno handling, which I don't see how to do without making assumptions about the C runtime).
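
To give a rough idea of what an inline expansion has to carry, here is a minimal sketch of a lane-wise unsigned divide using the classic shift-and-subtract scheme. It is not the libgcc implementation; the name udiv_lanes and the LANES width are hypothetical, and divisors are assumed to be non-zero. Masks replace branches so that every lane follows the same control flow, which is what a vector expansion would need.

```c
#include <stdint.h>

#define LANES 8  /* hypothetical vector width, for illustration only */

/* Lane-wise 32-bit unsigned division by shift-and-subtract.
   Assumes every d[l] is non-zero.  Each lane executes the same
   32 steps; a mask selects whether the subtraction takes effect. */
void udiv_lanes(uint32_t q[LANES], const uint32_t n[LANES],
                const uint32_t d[LANES])
{
    uint32_t r[LANES];

    for (int l = 0; l < LANES; l++) {
        q[l] = 0;
        r[l] = 0;
    }

    for (int bit = 31; bit >= 0; bit--) {
        for (int l = 0; l < LANES; l++) {
            /* Bring down the next bit of the numerator. */
            r[l] = (r[l] << 1) | ((n[l] >> bit) & 1u);

            /* All-ones mask when the partial remainder covers the divisor. */
            uint32_t ge = (uint32_t)-(r[l] >= d[l]);

            r[l] -= d[l] & ge;
            q[l] |= (1u << bit) & ge;
        }
    }
}
```

Even this simplest unsigned case needs 32 dependent steps per element; the signed, 64-bit, and softfp variants add sign fix-ups, rounding, and special cases on top.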

Basically, the -ffast-math instructions will always be the fastest way, but the goal is that the default optimization shouldn't just disable vectorization entirely for any loop that has a divide in it.
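
For illustration, the kind of loop in question might look like the sketch below. The function name scale_down is hypothetical; the iterations are independent, so the divide is the only thing standing in the way of vectorization.

```c
#include <stddef.h>

/* Hypothetical example: element-wise integer divide.
   Every iteration is independent, so the loop would vectorize
   if the divide itself had a vector implementation.
   Each den[i] is assumed to be non-zero. */
void scale_down(int *restrict out, const int *restrict num,
                const int *restrict den, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = num[i] / den[i];
}
```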

Andrew
