Oluwatamilore Adebayo <[email protected]> writes:
>> It would be good to mark all of these functions with __attribute__((noipa)),
>> since I think interprocedural optimisations might otherwise defeat the
>> runtime test in abd_run_1.c (in the sense that we might end up folding
>> things at compile time and not testing the vector versions of the functions).
>
> Done.
>
>> There are 14 tests, and it looks like 6 of them are expected to produce
>> ABD instructions while 8 aren't. It isn't really clear which tests are
>> which though.
>>
>> I think it'd help to split the file into two:
>>
>> - one containing only the tests that should produce ABD, so that the
>> scan-assembler counts sum up to the number of tests
>>
>> - one containing only the tests that cannot use ABD, with:
>>
>> { dg-final { scan-assembler-not {\tsabd\t} } }
>> { dg-final { scan-assembler-not {\tuabd\t} } }
>>
>> to enforce that
>
> After adjustments made to the vectoriser part, all tests now use an abd
> instruction.
Ah, that's a problem. Sorry, I didn't review 1/2 closely enough.
For:
> + /* Failed to find a widen operation so we check for a regular MINUS_EXPR.
> */
> + if (diff
> + && gimple_assign_rhs_code (diff) == MINUS_EXPR
> + && (TYPE_UNSIGNED (abs_type) || TYPE_OVERFLOW_UNDEFINED (abs_type)))
> + {
> + *half_type = NULL_TREE;
> + return true;
> + }
the condition should instead be:
if (diff
&& gimple_assign_rhs_code (diff) == MINUS_EXPR
&& TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (abs_oprnd)))
{
*half_type = NULL_TREE;
return true;
}
That is, we rely on overflow being undefined, so we need to check
TYPE_OVERFLOW_UNDEFINED on the type of the subtraction (rather than
abs_type, which is the type of ABS input, and at this point can be
different from TREE_TYPE (abs_oprnd)).
Then fn_unsigned_int and fn_unsigned_char_int_short correctly
avoid using SABD. The output for the other tests looks right.
It would be good to add a:
/* { dg-final { scan-assembler-not {\tabs\t} } } */
to be the positive tests, to make it more obvious that all separate
ABS instructions are elided.
Thanks,
Richard