https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100328

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |linkw at gcc dot gnu.org
   Last reconfirmed|                            |2021-06-24
     Ever confirmed|0                           |1

--- Comment #4 from Kewen Lin <linkw at gcc dot gnu.org> ---
Created attachment 51059
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51059&action=edit
ira-respect-matching-constraint-v2

This v2 handles the situation where, for one preferred register class,
there are two or more alternatives, one of which has the matching
constraint while another does not.  For the given operand, even if it is
assigned a hardware reg which doesn't meet the matching constraint, it
can simply use the alternative without the matching constraint, so no
register copy is needed.  One typical case is
define_insn *mov<mode>_internal2 on rs6000:

(define_insn "*mov<mode>_internal2"
  [(set (match_operand:CC 2 "cc_reg_operand" "=y,x,?y")
        (compare:CC (match_operand:P 1 "gpc_reg_operand" "0,r,r")
                    (const_int 0)))
   (set (match_operand:P 0 "gpc_reg_operand" "=r,r,r") (match_dup 1))]
  ""
  "@
   cmp<wd>i %2,%0,0
   mr. %0,%1
   #"

So we shouldn't create a constraint copy for it.  For fma-style insns on
rs6000, there are also several alternatives for the preferred regclass,
and again only one of them has the matching constraint.  The difference
is that, in an alternative where the given operand has no matching
constraint, the matching constraint is applied to another input operand
of that same alternative.  In other words, when one matching constraint
can be applied to more than one input operand, the insn has to provide
several alternatives like this.  Creating constraint copies for all of
these input operands is fine: once the matching constraint is honored on
one input operand, it implicitly disables the others due to the
interference relationship.  So this patch records and checks all the
other alternatives, those which don't have the matching constraint but do
use the preferred class, to see whether another input operand has the
same matching constraint.
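
The check described above can be sketched roughly as follows.  This is
only an illustrative model in Python, not the code in the attached patch:
the function names, the dictionary layout, and the simplified fma-style
constraint strings are all assumptions made for the example, and each
constraint is treated as a single class name after stripping modifiers.

```python
MODIFIERS = "=+&%?!"  # constraint modifier characters ignored by this sketch


def bare(constraint):
    """Strip leading modifiers, leaving a class name or a matching digit."""
    return constraint.lstrip(MODIFIERS)


def should_create_constraint_copy(alts, op, out, pref_class, inputs):
    """Hypothetical helper: decide whether input operand `op` deserves a
    constraint copy with output operand `out`.

    alts maps an operand index to its per-alternative constraint strings.
    """
    matched = False
    for a in range(len(alts[op])):
        c = bare(alts[op][a])
        if c == str(out):
            matched = True          # this alternative ties op to out
            continue
        if c != pref_class:
            continue                # not a preferred-class alternative
        # Preferred-class alternative without the match on `op`: the copy is
        # only still useful if another input carries the same match here.
        others = (i for i in inputs if i != op)
        if not any(bare(alts[i][a]) == str(out) for i in others):
            return False
    return matched


def parse(spec):
    """Split comma-separated constraint strings per operand."""
    return {idx: s.split(",") for idx, s in spec.items()}


# Constraints taken from *mov<mode>_internal2 quoted above.
mov_internal2 = parse({0: "=r,r,r", 1: "0,r,r", 2: "=y,x,?y"})

# Simplified, assumed fma-style constraints (not copied from rs6000.md).
fma_like = parse({0: "=wa,wa", 1: "wa,wa", 2: "wa,0", 3: "0,wa"})

mov_result = should_create_constraint_copy(mov_internal2, 1, 0, "r", [1])
fma_result_op2 = should_create_constraint_copy(fma_like, 2, 0, "wa", [1, 2, 3])
fma_result_op3 = should_create_constraint_copy(fma_like, 3, 0, "wa", [1, 2, 3])
```

With these inputs, operand 1 of *mov<mode>_internal2 gets no constraint
copy, because alternative 1 accepts any "r" register with no input tied
to operand 0, while operands 2 and 3 of the fma-like insn both get one.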

It also considers a possible free register move within the same register
class, and disables this if so, since the register copy needed to meet
the constraint is considered free.

I re-evaluated SPEC2017 performance with the option set Ofast unroll;
benchmarks 508.namd_r and 519.lbm_r were observed to improve by 2.4% to
3.8% on Power8 and Power9.

As mentioned before, it's bootstrapped/regtested on powerpc64le-linux-gnu P9
and x86_64-redhat-linux, but it hit some regression failures on aarch64.  I am
still investigating the only PASS->FAIL (the others are XFAIL->XPASS):

PASS->FAIL: gcc.target/aarch64/sve/acle/general/pr94683.c -march=armv8.2-a+sve
-moverride=tune=none  check-function-bodies test

In this case, the newly created constraint copy is expected (it was previously
a shuffle copy), but this change somehow affects the full cost of register 92
due to its conflict with reg 102.  This needs more digging.
