On 12 June 2020 20:55 Andrew Pinski wrote:
> Subject: Re: [PATCH][GCC][Aarch64]: Fix for PR94880: Failure to recognize
> andn pattern
>
> On Fri, Jun 12, 2020 at 7:50 AM Przemyslaw Wirkus
> <przemyslaw.wir...@arm.com> wrote:
> >
> > Hi all,
> >
> > Pattern "(x | y) - y" can be optimized to simple "(x & ~y)" andn pattern.
>
> Isn't it better to do this transformation on the gimple level and not in a
> target specific form?  Or at least do it in the RTL level in a generic form
> rather than adding target specific patterns.

Yes, I will rework this and add a simplification pattern on the gimple level.

Cheers,
Przemyslaw Wirkus

> Thanks,
> Andrew Pinski
>
> >
> > So, for the example code:
> >
> > $ cat main.c
> > int
> > f_i(int x, int y)
> > {
> >   return (x | y) - y;
> > }
> >
> > long long
> > f_l(long long x, long long y)
> > {
> >   return (x | y) - y;
> > }
> >
> > typedef int v4si __attribute__ ((vector_size (16)));
> > typedef long long v2di __attribute__ ((vector_size (16)));
> >
> > v4si
> > f_v4si(v4si a, v4si b) {
> >   return (a | b) - b;
> > }
> >
> > v2di
> > f_v2di(v2di a, v2di b) {
> >   return (a | b) - b;
> > }
> >
> > void
> > f(v4si *d, v4si *a, v4si *b) {
> >   for (int i=0; i<N; i++)
> >     d[i] = (a[i] | b[i]) - b[i];
> > }
> >
> > Before this patch:
> > $ ./aarch64-none-linux-gnu-gcc -S -O2 main.c -dp
> >
> > f_i:
> >         orr     w0, w0, w1      // 8 [c=4 l=4]  iorsi3/0
> >         sub     w0, w0, w1      // 14 [c=4 l=4]  subsi3
> >         ret             // 24 [c=0 l=4]  *do_return
> > f_l:
> >         orr     x0, x0, x1      // 8 [c=4 l=4]  iordi3/0
> >         sub     x0, x0, x1      // 14 [c=4 l=4]  subdi3/0
> >         ret             // 24 [c=0 l=4]  *do_return
> > f_v4si:
> >         orr     v0.16b, v0.16b, v1.16b  // 8 [c=8 l=4]  iorv4si3/0
> >         sub     v0.4s, v0.4s, v1.4s     // 14 [c=8 l=4]  subv4si3
> >         ret             // 24 [c=0 l=4]  *do_return
> > f_v2di:
> >         orr     v0.16b, v0.16b, v1.16b  // 8 [c=8 l=4]  iorv2di3/0
> >         sub     v0.2d, v0.2d, v1.2d     // 14 [c=8 l=4]  subv2di3
> >         ret             // 24 [c=0 l=4]  *do_return
> >
> > After this patch:
> > $ ./aarch64-none-linux-gnu-gcc -S -O2 main.c -dp
> >
> > f_i:
> >         bic     w0, w0, w1      // 13 [c=8 l=4]  *bic_and_not_si3
> >         ret             // 23 [c=0 l=4]  *do_return
> > f_l:
> >         bic     x0, x0, x1      // 13 [c=8 l=4]  *bic_and_not_di3
> >         ret             // 23 [c=0 l=4]  *do_return
> > f_v4si:
> >         bic     v0.16b, v0.16b, v1.16b  // 13 [c=16 l=4]  *bic_and_not_simd_v4si3
> >         ret             // 23 [c=0 l=4]  *do_return
> > f_v2di:
> >         bic     v0.16b, v0.16b, v1.16b  // 13 [c=16 l=4]  *bic_and_not_simd_v2di3
> >         ret             // 23 [c=0 l=4]  *do_return
> >
> > Bootstrapped and tested on aarch64-none-linux-gnu.
> >
> > OK for master?
> >
> > Cheers,
> > Przemyslaw
> >
> > gcc/ChangeLog:
> >
> >         PR tree-optimization/94880
> >         * config/aarch64/aarch64.md (bic_and_not_<mode>3): New define_insn.
> >         * config/aarch64/aarch64-simd.md (bic_and_not_simd_<mode>3): New
> >         define_insn.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR tree-optimization/94880
> >         * gcc.target/aarch64/bic_and_not_di3.c: New test.
> >         * gcc.target/aarch64/bic_and_not_si3.c: New test.
> >         * gcc.target/aarch64/bic_and_not_v2di3.c: New test.
> >         * gcc.target/aarch64/bic_and_not_v4si3.c: New test.
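
For reference, the identity behind the patch, "(x | y) - y" == "(x & ~y)", can be checked with a small standalone sketch. It holds because every bit set in y is also set in (x | y), so the subtraction never borrows and simply clears those bits. The helper names below (or_sub, andn, count_mismatches_8bit) are made up for illustration and are not part of the patch or its tests; unsigned types are used here only to keep the check free of signed-overflow concerns.

```c
#include <stdint.h>

/* Form as written in the example code: (x | y) - y. */
static uint32_t or_sub(uint32_t x, uint32_t y) { return (x | y) - y; }

/* Equivalent andn form: x & ~y (a single BIC on AArch64). */
static uint32_t andn(uint32_t x, uint32_t y) { return x & ~y; }

/* Compare both forms over every pair of 8-bit operands.
   Returns the number of pairs where the results differ.  */
unsigned count_mismatches_8bit(void)
{
  unsigned mismatches = 0;
  for (uint32_t x = 0; x < 256; x++)
    for (uint32_t y = 0; y < 256; y++)
      if (or_sub(x, y) != andn(x, y))
        mismatches++;
  return mismatches;
}
```

Calling count_mismatches_8bit() from a small driver and checking that it returns 0 is enough to exercise the sketch; it says nothing about how the gimple-level simplification itself should be written.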