Michael Meissner wrote:

> In particular, go to pages 5-7 of my tutorial (chapter 6) where I talk about
> scratch registers and allocating a new pseudo in split1 which is now always
> run.

Ah great! That's the missing link. The pseudo hatches in split1 from
scratch.

This makes combine even more powerful, and split1 can be seen as an expand2 :-)

imho, combine could be slightly improved by some minor modifications:

I.
In an old 3.4.x-based private port I introduced a target hook in
combine, just prior to where recog_for_combine gets called. The hook
did some canonicalization of the rtx and thereby considerably reduced
the number of patterns that would otherwise have been necessary. It
transformed some unspecs.

A second use case of such a hook could look as follows: imagine a
combined pattern that does not match, but that would be legal if the
backend knew that some register contains some specific kind of value,
e.g. that it is non-negative, that it is zero or the store-flag
value, or that the combiner has proved that certain bits are always
set or always cleared; that kind of thing. The hook could accept the
pattern when the backend is allowed to see such a proof (pattern OK)
and reject it when there is no proof (pattern rejected), provided it
also passes recog_for_combine. In that setup the hook dispatches to
recog_for_combine itself, so it has the opportunity to canonicalize
the pattern or to take decisions depending on the insn that matched.
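
To make the accept/reject part a bit more tangible, here is a small,
self-contained toy model of that decision in C. All names and the
struct are made up for illustration; the real hook would of course
work on rtx, call recog_for_combine, and consult facts the combiner
already tracks (e.g. via nonzero_bits) instead of a plain bit mask.

#include <stdbool.h>
#include <stdio.h>

/* Toy model of the hook's decision.  The "proof" is modelled as a
   mask of the bits that may possibly be set in the register.  The
   hypothetical insn is only correct when the operand is in {0,1},
   i.e. when no bit above bit 0 can ever be set.  */

struct candidate
{
  bool matches;                 /* stand-in for the recog result */
  unsigned possibly_set_bits;   /* proof about the operand's value */
};

static bool
hook_accepts (const struct candidate *c)
{
  if (!c->matches)
    return false;                           /* does not match at all */
  return (c->possibly_set_bits & ~1u) == 0; /* proof: operand in {0,1} */
}

int
main (void)
{
  struct candidate proved   = { true, 0x01 }; /* operand known to be 0 or 1 */
  struct candidate unproved = { true, 0xff }; /* operand only known < 256 */

  printf ("with proof:    %s\n", hook_accepts (&proved) ? "accept" : "reject");
  printf ("without proof: %s\n", hook_accepts (&unproved) ? "accept" : "reject");
  return 0;
}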

In the current setup you can write more complex patterns to make the
combiner look deeper into things. That implies writing an insn that
carries the information explicitly in the pattern. Most probably you
are then going to split that stuff again in split1. To be a little
more explicit:

Suppose f(x) is the pattern in question, just cooked up by combine.
You have an insn that can do f(x) under the condition that x \in
{0,1}, but not f(x) in general, and for some reason combine does not
look deeper into things. In a case where combine can prove that x
will always be 0 or 1 it is legal to accept f(x), otherwise not.

The current situation, however, goes like this: in the case where the
source happens to look like x &= 1; f(x) and there is a combine
bridge that can do f(x & 1), combine might give it a try. f(x & 1)
matches, and in split1 things get split again into x &= 1; f(x). Now
you have the knowledge that x is either 0 or 1, so you can replace
the sequence with (or directly split it as) x &= 1; g(x), where g(x)
is an insn that is cheaper than f(x).

Note that the first case is more general and needs no combine bridge.
The only thing that has to be done is to replace f(x) with g(x),
which is legal under the assumption x \in {0,1}.
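
As a concrete (made-up) instance of such an f/g pair: take f(x) to be
the multiplication x * y with some other register y, which is
expensive on a target without a cheap multiplier, and g(x) = (-x) & y,
which computes the same value whenever x is 0 or 1 because -x is then
either 0 or all-ones. A minimal check in C, purely for illustration:

#include <assert.h>
#include <stdint.h>

/* f(x): the general pattern combine cooked up; valid for every x,
   but assumed to be expensive on the target (a multiply).  */
static uint32_t
f (uint32_t x, uint32_t y)
{
  return x * y;
}

/* g(x): the cheaper insn; only valid under the side condition
   x \in {0,1}, because 0u - x is then either 0 or all-ones and the
   AND selects either 0 or y.  */
static uint32_t
g (uint32_t x, uint32_t y)
{
  return (0u - x) & y;
}

int
main (void)
{
  for (uint32_t x = 0; x <= 1; x++)
    for (uint32_t y = 0; y < 1000; y++)
      assert (f (x, y) == g (x, y));   /* identical for x \in {0,1} */
  return 0;
}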


II.
Suppose insns A, B and C with costs f(A) = f(B) = f(C) = 2.

Combine combines A+B and sees cost f(A+B) = 5. This makes combine
reject the pattern without looking deeper. If f(A+B+C) = 2, you will
either never see A+B+C, or you have to lie about the cost and claim
f(A+B) = 2. OK, as combine is myopic (I do not blame combine here, it
is complex enough), you go and write the combine bridge A+B. But in
situations where A+B matches and there is nothing like C, that is a
pessimization. split1 could fix it by splitting A+B back into A and
B, but that clutters up the machine description. Combine could
instead proceed like this (a rough sketch in C follows the list):

a) f(A+B) = 5 is more expensive than f(A) + f(B) = 4
b) do the combination A+B anyway
c) look deeper
d1) there is a match A+B+C with cost f(A+B+C) = 2 => accept
d2) there is a match A+B+C with cost f(A+B+C) = 12 => costly => reject
d3) there is no match A+B+C => A+B is costly => reject
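
A toy model of that decision in C (numbers and names made up; the
real thing would of course use rtx costs and the real recog
machinery):

#include <stdbool.h>
#include <stdio.h>

/* Accept the intermediate combination A+B even though it looks more
   expensive on its own, as long as a deeper match A+B+C exists that
   is no more expensive than the original three insns.  */
static bool
accept_combination (int cost_a, int cost_b, int cost_ab,
                    bool abc_matches, int cost_c, int cost_abc)
{
  if (cost_ab <= cost_a + cost_b)
    return true;                 /* already a win: accept as today */
  if (abc_matches && cost_abc <= cost_a + cost_b + cost_c)
    return true;                 /* d1: "tunnel" through the costly A+B */
  return false;                  /* d2/d3: deeper match costly or missing */
}

int
main (void)
{
  /* f(A) = f(B) = f(C) = 2, f(A+B) = 5 in all three cases.  */
  printf ("d1, f(A+B+C) = 2:  %d\n",
          accept_combination (2, 2, 5, true, 2, 2));   /* 1: accept */
  printf ("d2, f(A+B+C) = 12: %d\n",
          accept_combination (2, 2, 5, true, 2, 12));  /* 0: reject */
  printf ("d3, no A+B+C:      %d\n",
          accept_combination (2, 2, 5, false, 2, 0));  /* 0: reject */
  return 0;
}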

Skimming combine dumps, I see that most patterns get rejected because
they do not match; only a few insns get rejected because of higher
cost. So looking deeper even when the cost is higher may give better
code while increasing compile time only a little.

Case II is similar to the tunneling effect known from physics.

Georg


