Re: [AArch64][SVE2] Support for EOR3 and variants of BSL

Segher Boessenkool Wed, 16 Oct 2019 14:35:16 -0700

On Wed, Oct 16, 2019 at 09:04:18PM +0100, Richard Sandiford wrote:
> Segher Boessenkool <seg...@kernel.crashing.org> writes:
> > This isn't canonical RTL.  Does combine not simplify this?
> >
> > Or, rather, it should not be what we canonicalise to: nothing is defined
> > here.
> 
> But when nothing is defined, let's match what we get :-)


Of course.

> If someone wants to add a new canonical form then the ports should of
> course adapt, but until then I think the patch is doing the right thing.

We used to generate this, until GCC 5.  There aren't many ports that have
adapted yet.

> > If the mask is not a constant, we really shouldn't generate a totally
> > different form.  The xor-and-xor form is very hard to handle, too.
> >
> > Expand currently generates this, because gimple thinks this is simpler.
> > I think this should be fixed.
> 
> But the constant form is effectively folding away the NOT.
> Without it the equivalent rtl uses 4 operations rather than 3:
> 
>   (ior (and A C) (and B (not C)))

RTL canonicalisation rules are not based around number of ops.  For example,
we do  (and (not A) (not B))  rather than  (not (ior (A B))  .  Instead,
there are other rules (like here: push "not"s inward, which can be applied
locally with the wanted result).

> And folding 4 operations gets us into 4-insn combinations, which are
> obviously more limited (for good reason).

But on most machines it doesn't need to combine more than two or three
insns to get here.  Reducing the depth of the tree is more useful...  That
is 3 in both cases here, but "andc" is common on many machines, so that
makes it only two deep.

> As you say, it's no accident that we get this form, it's something
> that match.pd specifically chose.  And I think there should be a
> strong justification for having an RTL canonical form that reverses
> a gimple decision.  RTL isn't as powerful as gimple and so isn't going
> to be able to undo the gimple transforms in all cases.

Canonical RTL is different in many ways, already.

"Not as powerful", I have no idea what you mean, btw.  RTL is much closer
to the real machine, so is a lot *more* powerful than Gimple for modelling
machine instructions (where Gimple is much nicer for higher-level
optimisations).  We need both.


Segher

Re: [AArch64][SVE2] Support for EOR3 and variants of BSL

Reply via email to