On 6/28/23 16:00, 钟居哲 wrote:
You can see here:
https://godbolt.org/z/d78646hWb
The first case can't generate vfwmul.vv but the second case succeeds.
Failed to match this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
(if_then_else:VNx2DF (unspec:VNx2BI [
(const_vector:VNx2BI repeat [
(const_int 1 [0x1])
])
(reg:DI 153)
(const_int 2 [0x2]) repeated x2
(const_int 1 [0x1])
(const_int 7 [0x7])
(reg:SI 66 vl)
(reg:SI 67 vtype)
(reg:SI 69 N/A)
] UNSPEC_VPREDICATE)
(mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 149 [ vect__5.45 ]))
(reg:VNx2DF 148 [ vect__8.49 ]))
(unspec:VNx2DF [
(reg:SI 0 zero)
] UNSPEC_VUNDEF)))
Right. We try combining:
24 -> 27
25 -> 27
23, 24 -> 27
22, 25 -> 27
All of which fail, as expected. 24 -> 27 and 25 -> 27 only put an
extension on one operand of the mult, so the result has a float_extend
on just one side. The other two try to substitute a float extend of an
if-then-else, which I fully expect to fail.
The next one that gets tried is:
Trying 25, 24 -> 27:
25: r149:VNx2DF=float_extend(r141:VNx2SF)
REG_DEAD r141:VNx2SF
24: r148:VNx2DF=float_extend(r139:VNx2SF)
REG_DEAD r139:VNx2SF
27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r148:VNx2DF*r149:VNx2DF:unspec[zero:SI] 68}
REG_DEAD r149:VNx2DF
REG_DEAD r148:VNx2DF
REG_DEAD N/A:SI
REG_DEAD zero:SI
REG_EQUAL r148:VNx2DF*r149:VNx2DF
Successfully matched this instruction:
(set (reg:VNx2DF 150 [ vect__11.50 ])
(if_then_else:VNx2DF (unspec:VNx2BI [
(const_vector:VNx2BI repeat [
(const_int 1 [0x1])
])
(reg:DI 153)
(const_int 2 [0x2]) repeated x2
(const_int 1 [0x1])
(const_int 7 [0x7])
(reg:SI 66 vl)
(reg:SI 67 vtype)
(reg:SI 69 N/A)
] UNSPEC_VPREDICATE)
(mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 141 [ vect__4.44 ]))
(float_extend:VNx2DF (reg:VNx2SF 139 [ vect__7.48 ])))
(unspec:VNx2DF [
(reg:SI 0 zero)
] UNSPEC_VUNDEF)))
allowing combination of insns 24, 25 and 27
original costs 4 + 4 + 4 = 12
replacement cost 4
Note how it replaced both operands of the mult with extended versions
and the pattern matches, as expected.
The point being that I don't think those helper patterns are needed to
handle the problem you suggested they were there to handle. Combine
knows how to handle multiple substitutions just fine.
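For reference, here is a minimal C sketch of the kind of loop this
discussion is about: both float inputs are extended to double before the
multiply. It is an assumption for illustration only; the function name
and signature are hypothetical and it is not the test case from the
godbolt link. When auto-vectorized for RVV, a loop of this shape would be
expected to give combine two float_extend insns feeding one mult, as in
the 25, 24 -> 27 attempt above.

```c
#include <stddef.h>

/* Hypothetical example, not the godbolt test case: each float input
   is widened to double before the multiply, so both operands of the
   product carry an extension.  */
void
widen_mul (double *restrict dst, const float *restrict a,
           const float *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (double) a[i] * (double) b[i];
}
```
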
Right now I don't see a need for this patch.
Jeff