https://gcc.gnu.org/g:23db87301b623ecf162c9df718ce82ed9aa354a8

commit r15-1059-g23db87301b623ecf162c9df718ce82ed9aa354a8
Author: Hongyu Wang <hongyu.w...@intel.com>
Date:   Tue Apr 9 16:05:26 2024 +0800

    [APX CCMP] Adjust strategy for selecting ccmp candidates
    
    For the general ccmp scenario, the tree sequence looks like
    
    _1 = (a < b)
    _2 = (c < d)
    _3 = _1 & _2
    
    The current ccmp expansion tries both compare orders for _1 and _2,
    computes the costs cost1 and cost2 of expanding _1 or _2 first, and
    returns the sequence with the lower cost.
    
    It is possible that one expansion succeeds and the other fails.
    For example, x86 has int ccmp but not fp ccmp, so a combined fp and
    int comparison must be ordered such that the fp comparison happens
    first.  The costs are not meaningful for failed expansions.
    
    Check the expand_ccmp_next results ret and ret2, and return the
    valid one before comparing costs.
    
    gcc/ChangeLog:
    
            * ccmp.cc (expand_ccmp_expr_1): Check ret and ret2 of
            expand_ccmp_next, and return the valid one first instead
            of comparing cost.

Diff:
---
 gcc/ccmp.cc | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/ccmp.cc b/gcc/ccmp.cc
index 7cb525addf4..4d50708d986 100644
--- a/gcc/ccmp.cc
+++ b/gcc/ccmp.cc
@@ -247,7 +247,15 @@ expand_ccmp_expr_1 (gimple *g, rtx_insn **prep_seq, rtx_insn **gen_seq)
              cost2 = seq_cost (prep_seq_2, speed_p);
              cost2 += seq_cost (gen_seq_2, speed_p);
            }
-         if (cost2 < cost1)
+
+         /* It's possible that one expansion succeeds and the other
+            fails.
+            For example, x86 has int ccmp but not fp ccmp, and so a
+            combined fp and int comparison must be ordered such that
+            the fp comparison happens first. The costs are not
+            meaningful for failed expansions.  */
+
+         if (ret2 && (!ret || cost2 < cost1))
            {
              *prep_seq = prep_seq_2;
              *gen_seq = gen_seq_2;
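
The selection rule in the hunk above can be tried in isolation with a
minimal standalone sketch.  This is not GCC code: the Expansion struct and
both helper functions are hypothetical names introduced only for
illustration.  The `ok' field stands in for a non-null ret or ret2, and
`cost' stands in for cost1 or cost2.

```cpp
// Hypothetical stand-in for the outcome of one expansion order.
// `ok` models a non-null ret/ret2; `cost` models cost1/cost2.
struct Expansion
{
  bool ok;
  unsigned cost;
};

// Pre-patch rule, `cost2 < cost1`: looks only at the costs, so it can
// pick or keep an order whose expansion actually failed.
bool
prefer_second_old (const Expansion &first, const Expansion &second)
{
  return second.cost < first.cost;
}

// Post-patch rule, `ret2 && (!ret || cost2 < cost1)`: the second order
// is chosen only if it succeeded, and then either because the first
// order failed or because the second one is strictly cheaper.
bool
prefer_second (const Expansion &first, const Expansion &second)
{
  return second.ok && (!first.ok || second.cost < first.cost);
}
```

With a failed first order of cost 1 and a successful second order of
cost 5, the old rule keeps the failed first order while the new rule
switches to the second.  When both orders succeed the two rules agree,
and a tie in cost keeps the first order.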
