Re: Help with comparison-merging optimization pass

Jakub Jelinek via Gcc Wed, 06 Aug 2025 05:52:17 -0700

On Wed, Aug 06, 2025 at 12:29:09PM +0000, Thomas de Bock wrote:
> I've looked at the pattern LLVM recognizes and there is indeed a lot of 
> different
> ways we could recognize the chains and generalize the optimization.
> The way I do the pattern recognition now is by changing the conditions to ==,


You can't rely on just == even for equality, it needs to handle both == and
!= and based on which edge has EDGE_TRUE_VALUE vs. EDGE_FALSE_VALUE and
whether it was EQ_EXPR or NE_EXPR decide what edge is for equality and what
is for non-equality.  Whether the comparisons are EQ_EXPR or NE_EXPR can be
quite random.

> redirecting the edges if needed. Since memcmp will only give us whether all 
> fields
> were equal or not, a mix of these operators or their outgoing edges wouldn't 
> work.

That is not true.

"The memcmp() function returns an integer less than, equal to, or greater
than zero if the first n bytes of s1 is found, respectively, to be less
than, to match, or be greater than the first n bytes of s2.

For a nonzero return value, the sign is determined by the sign of the
difference between the first pair of bytes (interpreted as unsigned char)
that differ in s1 and s2."

So, for equality only you can pattern match eq/ne comparison of integral
types with no padding (TYPE_SIZE in multiples of BITS_PER_UNIT,
TYPE_PRECISION equal to TYPE_SIZE), but you can also handle the <=> cases
if the comparisons are unsigned char or little endian above mentioned
unsigned larger types.
For <=> it would need to canonicalize the memcmp result (which can return
any negative, zero or any positive, to -1/0/1 or whatever other constants
are used for the PHIs for the <, == and > cases).
Even the equality doesn't have to have 0/1 in PHIs, it can be 1/0 or
42/53 etc., just look at return *a == *b ? 42 : 53;

> Then discovering chains where:
> 1. All blocks in the chain perform 2 loads, then a condition of their 
> equality.
>   The PHI argument value corresponding to each block in the chain should be 
> the
>   same, except for possibly a TRUE edge from the final block in the chain
>   (possibly indirectly, by a path of empty blocks with fallthru edges)
> 2. If the condition returns false, an edge is followed into the PHI node 
> (possibly
>   indirectly). If the condition returns true, an edge is taken into the next
>   block in the chain, which we follow to attempt to keep expanding the known
>   chain this way.
> 3. The final comparison block in the chain does not need to lead to PHI in its
>   TRUE case, I believe this is the case in LLVM, but I don't see how it's
>   needed. Since either way we can store the edge, merge the comparisons,
>   then reattach the edge to the new merged comparison block, since the
>   condition for the edge to be taken will be that all compared fields
>   were equal either way, so control flow is not affected.
>   This helps with recognizing partial chains of comparisons that precede e.g.
>   an array comparison that loops (which it does not recognize currently).
> Since there is no clear start or end to any chain initially, I pick them in
> the order they were listed in the PHI node, then discover the successors and
> predecessors. This doesn't recognize the pattern generated by <=>, but
> I tried to implement it to be as general as possible and recognize the chains
> in any case where it does not affect the program's logic (in the case that
> all sizes are known and sufficient memory is allocated).
> I think it's best to keep the recognizing implementation general, before
> adding clear restrictions around its use (like being applied to defaulted
> equality operators for example, or at least distinguishing the pattern 
> restrictions
> from the general case). I think limiting its application by just
> adding restricting logic to the pattern recognizer logic could prove 
> unpredictable.

There might not be any PHI node at all, there could be just branches (see
the two functions which just guard some function call on the structure
comparisons), it should handle that as well (so either PHI or all non-equal
edges with the same destinations).
E.g. reassoc maybe_optimize_range_tests can start from any stmt in the
sequence and just walks it backwards and forward from there to find the
first and last stmt (or bb) in the chain.  That one is about optimizing
say if (a && b && c && d && e && f && g && h)
where the a-h expressions could have some cases optimized.  Normally reassoc
will handle just the single bb a & b & c & d & e & f & g & h cases, but
expansion of the && on some targets is e.g. one expression per bb or two
expressions per bb.
I'd really suggest you look through what that code does because this
pattern recognition is similar (but not the same).
Also note that even in the PHI case, the last bb could be different from the
others.

        Jakub

Re: Help with comparison-merging optimization pass

Reply via email to