On 04/11/15 11:37, Kyrill Tkachov wrote:
On 02/11/15 22:31, Jeff Law wrote:
On 11/02/2015 07:15 AM, Kyrill Tkachov wrote:
Hi all,
This patch attempts to restrict combine from transforming ZERO_EXTEND
and SIGN_EXTEND operations into and-bitmask
and weird SUBREG expressions when they appear inside MULT expressions.
This is because a MULT rtx containing these
extend operations is usually a well-understood widening multiply operation.
However, if the costs for simple zero- or sign-extend moves happen to
line up in a particular way,
expand_compound_operation will end up mangling a perfectly innocent
extend+mult+add rtx like:
(set (reg/f:DI 393)
     (plus:DI (mult:DI (sign_extend:DI (reg/v:SI 425 [ selected ]))
                       (sign_extend:DI (reg:SI 606)))
              (reg:DI 600)))
into:
(set (reg/f:DI 393)
     (plus:DI (mult:DI (and:DI (subreg:DI (reg:SI 606) 0)
                               (const_int 202 [0xca]))
                       (sign_extend:DI (reg/v:SI 425 [ selected ])))
              (reg:DI 600)))
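For reference, the kind of source that typically produces the original
extend+mult+add rtx is a widening multiply-accumulate. A minimal C sketch
(the function name is just for illustration; only the shape of the
generated RTL matters):
<code>
/* Illustrative only: on a 64-bit target this typically expands to
   (plus:DI (mult:DI (sign_extend:DI ...) (sign_extend:DI ...)) (reg:DI ...)),
   i.e. a single widening multiply-accumulate instruction.  */
long long
widening_madd (int a, int b, long long acc)
{
  return acc + (long long) a * (long long) b;
}
</code>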
Going to leave the review side of this for Segher.
If you decide to go forward, there's a section in md.texi WRT canonicalization of these
RTL codes that probably would need updating. Just search for "canonicalization"
and read down a ways.
You mean document a canonical form for these operations as (mult (extend op1)
(extend op2))?
Looking at the issue more deeply, I think the concrete problem is the code in
expand_compound_operation:
<code>
  /* If we reach here, we want to return a pair of shifts.  The inner
     shift is a left shift of BITSIZE - POS - LEN bits.  The outer
     shift is a right shift of BITSIZE - LEN bits.  It is arithmetic or
     logical depending on the value of UNSIGNEDP.

     If this was a ZERO_EXTEND or ZERO_EXTRACT, this pair of shifts will be
     converted into an AND of a shift.

     We must check for the case where the left shift would have a negative
     count.  This can happen in a case like (x >> 31) & 255 on machines
     that can't shift by a constant.  On those machines, we would first
     combine the shift with the AND to produce a variable-position
     extraction.  Then the constant of 31 would be substituted in
     to produce such a position.  */

  modewidth = GET_MODE_PRECISION (GET_MODE (x));
  if (modewidth >= pos + len)
    {
      machine_mode mode = GET_MODE (x);
      tem = gen_lowpart (mode, XEXP (x, 0));
      if (!tem || GET_CODE (tem) == CLOBBER)
        return x;
      tem = simplify_shift_const (NULL_RTX, ASHIFT, mode,
                                  tem, modewidth - pos - len);
      tem = simplify_shift_const (NULL_RTX, unsignedp ? LSHIFTRT : ASHIFTRT,
                                  mode, tem, modewidth - len);
    }
  else if (unsignedp && len < HOST_BITS_PER_WIDE_INT)
    tem = simplify_and_const_int (NULL_RTX, GET_MODE (x),
                                  simplify_shift_const (NULL_RTX, LSHIFTRT,
                                                        GET_MODE (x),
                                                        XEXP (x, 0), pos),
                                  ((unsigned HOST_WIDE_INT) 1 << len) - 1);
  else
    /* Any other cases we can't handle.  */
    return x;
</code>
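To make the two rewrites concrete, here is an illustrative C-level sketch
(not GCC code) of what the shift pair and the AND form compute for a zero
extension of the low 8 bits of a 32-bit value (POS == 0, LEN == 8):
<code>
/* Illustrative only: the low 8 bits of x, zero-extended to 32 bits.  */

/* The pair of shifts: ASHIFT by BITSIZE - POS - LEN (24), then a
   logical LSHIFTRT by BITSIZE - LEN (24).  */
unsigned int
extract_as_shift_pair (unsigned int x)
{
  return (x << 24) >> 24;
}

/* The AND form that the shift pair gets simplified into for
   ZERO_EXTEND/ZERO_EXTRACT: mask with (1 << LEN) - 1.  */
unsigned int
extract_as_and_mask (unsigned int x)
{
  return x & ((1u << 8) - 1);
}
</code>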
This code ends up unconditionally transforming (zero_extend:DI (reg:SI 125))
into:
(and:DI (subreg:DI (reg:SI 125) 0)
        (const_int 1 [0x1]))
which then gets substituted as an operand of the mult, and nothing matches
after that.
Archaeology shows this code has been there since at least 1992. I guess my
question is: why do we do this unconditionally? Should we be checking whether
the zero_extend form is cheaper than whatever simplify_shift_const returns?
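Something along these lines, perhaps (sketch only, not a patch;
extend_rtx_cost is a hypothetical stand-in for whatever RTX cost query
would actually be appropriate here):
<code>
/* Sketch only.  'rtx' is GCC's usual RTL handle; extend_rtx_cost is a
   hypothetical stand-in for the real cost query.  */
typedef struct rtx_def *rtx;
extern int extend_rtx_cost (rtx);

static rtx
choose_cheaper_form (rtx extend_form, rtx rewritten_form)
{
  /* Keep the original (zero|sign)_extend unless the shifts/AND rewrite
     is strictly cheaper.  */
  if (extend_rtx_cost (rewritten_form) < extend_rtx_cost (extend_form))
    return rewritten_form;
  return extend_form;
}
</code>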
I don't think what expand_compound_operation is trying to do here is
canonicalisation...
Thanks,
Kyrill
Jeff