On 19/11/14 18:22, Sandra Loosemore wrote:
> On 11/13/2014 10:47 AM, Andrew Pinski wrote:
>> On Thu, Nov 13, 2014 at 9:42 AM, Sandra Loosemore
>> <[email protected]> wrote:
>>> On 11/13/2014 10:27 AM, Richard Earnshaw wrote:
>>>>
>>>> On 13/11/14 17:05, Ramana Radhakrishnan wrote:
>>>>>
>>>>> On Thu, Nov 13, 2014 at 4:55 PM, Sandra Loosemore
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> This patch to the AArch64 back end adds a couple of additional bics
>>>>>> patterns
>>>>>> to match code of the form
>>>>>>
>>>>>> if ((x & y) == x) ...;
>>>>>>
>>>>>> This is testing whether the bits set in x are a subset of the bits set
>>>>>> in y;
>>>>>> or, that no bits in x are set that are not set in y. So, it is
>>>>>> equivalent
>>>>>> to
>>>>>>
>>>>>> if ((x & ~y) == 0) ...;
>>>>>>
>>>>>> Presently this generates code like
>>>>>> and x21, x21, x20
>>>>>> cmp x21, x20
>>>>>> b.eq c0 <main+0xc0>
>>>>>>
>>>>>> and this patch allows it to be written more concisely as:
>>>>>> bics x21, x20, x21
>>>>>> b.eq c0 <main+0xc0>
>>>>>>
>>>>>> Since the bics instruction sets the condition codes itself, no explicit
>>>>>> comparison is required and the result of the bics computation can be
>>>>>> discarded.
>>>>>>
>>>>>> Regression-tested on aarch64-linux-gnu. OK to commit?
>>>>>
>>>>>
>>>>> Is this not a duplicate of
>>>>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00943.html ?
>>>>>
>>>>>
>>>> I don't think so. However, I think it is something that should be
>>>> caught in generic simplification code
>>>>
>>>> ie map ((a & b) == b) ==> ((~a & b) == 0), etc
>>>>
>>>> Bit-clear operations are not that uncommon. Furthermore, A may be a
>>>> constant.
>>>
>>>
>>> Alex posted his patch when I already had Chris's in my regression test
>>> queue, but I've just confirmed that it does not fix the test case I
>>> included.
>>>
>>> I already thought a little about making this a generic simplification, but
>>> it seemed to me like it was only useful on targets that have a bit-clear
>>> instruction that happens to set condition codes, and that it would pessimize
>>> code on targets that don't have a bit-clear instruction at all (by inserting
>>> the extra complement operation). So to me it seemed reasonable to do it in
>>> the back end.
>>
>> But can't you do this in simplify-rtx.c and allow for the cost model
>> to do the correct thing?
>
> OK, here is a revised patch to apply the identity there. This version
> depends on Alex's aarch64 BICS patch for the included test case to pass,
> though.
>
> In addition to the aarch64 testing, I bootstrapped and regression-tested
> the target-inspecific part of the patch on x86_64-linux-gnu. Is this
> OK? Should I hold off on committing it until Alex's patch is in?
>
> -Sandra
>
>
> 2014-11-19 Sandra Loosemore <[email protected]>
>
> gcc/
> * simplify-rtx.c (simplify_relational_operation_1): Handle
> simplification identities for BICS patterns.
>
> gcc/testsuite/
> * gcc.target/aarch64/bics_4.c: New.
>
Looks sensible to me. Eric, are you happy?
R.
>
> bics2.patch
>
>
> Index: gcc/simplify-rtx.c
> ===================================================================
> --- gcc/simplify-rtx.c (revision 217322)
> +++ gcc/simplify-rtx.c (working copy)
> @@ -4551,6 +4551,32 @@ simplify_relational_operation_1 (enum rt
> simplify_gen_binary (XOR, cmp_mode,
> XEXP (op0, 1), op1));
>
> + /* (eq/ne (and x y) x) simplifies to (eq/ne (and (not y) x) 0), which
> + can be implemented with a BICS instruction on some targets, or
> + constant-folded if y is a constant. */
> + if ((code == EQ || code == NE)
> + && op0code == AND
> + && rtx_equal_p (XEXP (op0, 0), op1)
> + && !side_effects_p (op1))
> + {
> + rtx not_y = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 1),
> cmp_mode);
> + rtx lhs = simplify_gen_binary (AND, cmp_mode, not_y, XEXP (op0, 0));
> +
> + return simplify_gen_relational (code, mode, cmp_mode, lhs, const0_rtx);
> + }
> +
> + /* Likewise for (eq/ne (and x y) y). */
> + if ((code == EQ || code == NE)
> + && op0code == AND
> + && rtx_equal_p (XEXP (op0, 1), op1)
> + && !side_effects_p (op1))
> + {
> + rtx not_x = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 0),
> cmp_mode);
> + rtx lhs = simplify_gen_binary (AND, cmp_mode, not_x, XEXP (op0, 1));
> +
> + return simplify_gen_relational (code, mode, cmp_mode, lhs, const0_rtx);
> + }
> +
> /* (eq/ne (bswap x) C1) simplifies to (eq/ne x C2) with C2 swapped. */
> if ((code == EQ || code == NE)
> && GET_CODE (op0) == BSWAP
> Index: gcc/testsuite/gcc.target/aarch64/bics_4.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/bics_4.c (revision 0)
> +++ gcc/testsuite/gcc.target/aarch64/bics_4.c (revision 0)
> @@ -0,0 +1,87 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 --save-temps -fno-inline" } */
> +
> +extern void abort (void);
> +
> +int
> +bics_si_test1 (int a, int b, int c)
> +{
> + if ((a & b) == a)
> + return a;
> + else
> + return c;
> +}
> +
> +int
> +bics_si_test2 (int a, int b, int c)
> +{
> + if ((a & b) == b)
> + return b;
> + else
> + return c;
> +}
> +
> +typedef long long s64;
> +
> +s64
> +bics_di_test1 (s64 a, s64 b, s64 c)
> +{
> + if ((a & b) == a)
> + return a;
> + else
> + return c;
> +}
> +
> +s64
> +bics_di_test2 (s64 a, s64 b, s64 c)
> +{
> + if ((a & b) == b)
> + return b;
> + else
> + return c;
> +}
> +
> +int
> +main ()
> +{
> + int x;
> + s64 y;
> +
> + x = bics_si_test1 (0xf00d, 0xf11f, 0);
> + if (x != 0xf00d)
> + abort ();
> +
> + x = bics_si_test1 (0xf11f, 0xf00d, 0);
> + if (x != 0)
> + abort ();
> +
> + x = bics_si_test2 (0xf00d, 0xf11f, 0);
> + if (x != 0)
> + abort ();
> +
> + x = bics_si_test2 (0xf11f, 0xf00d, 0);
> + if (x != 0xf00d)
> + abort ();
> +
> + y = bics_di_test1 (0x10001000f00dll, 0x12341000f00dll, 0ll);
> + if (y != 0x10001000f00dll)
> + abort ();
> +
> + y = bics_di_test1 (0x12341000f00dll, 0x10001000f00dll, 0ll);
> + if (y != 0)
> + abort ();
> +
> + y = bics_di_test2 (0x10001000f00dll, 0x12341000f00dll, 0ll);
> + if (y != 0)
> + abort ();
> +
> + y = bics_di_test2 (0x12341000f00dll, 0x10001000f00dll, 0ll);
> + if (y != 0x10001000f00dll)
> + abort ();
> +
> + return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times "bics\twzr, w\[0-9\]+, w\[0-9\]+" 2 } }
> */
> +/* { dg-final { scan-assembler-times "bics\txzr, x\[0-9\]+, x\[0-9\]+" 2 } }
> */
> +/* { dg-final { cleanup-saved-temps } } */
>