On 01/13/14 00:34, Jakub Jelinek wrote:
On Sun, Jan 12, 2014 at 11:21:59PM -0700, Jeff Law wrote:
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -297,6 +297,13 @@ combine_set_extension (ext_cand *cand, rtx curr_insn, rtx
*orig_set)
else
new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
+ /* We're going to be widening the result of DEF_INSN, ensure that doing so
+ doesn't change the number of hard registers needed for the result. */
+ if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
+ != HARD_REGNO_NREGS (REGNO (SET_SRC (*orig_set)),
Note you can use orig_src instead of SET_SRC (*orig_set) here.
Yea. A little manual CSE never hurts. Fixed.
+ GET_MODE (SET_DEST (*orig_set))))
+ return false;
+
/* Merge constants by directly moving the constant into the register under
some conditions. Recall that RTL constants are sign-extended. */
if (GET_CODE (orig_src) == CONST_INT
Are you sure the above is needed even for the
REGNO (new_reg) == REGNO (SET_DEST (*orig_set))
&& REGNO (new_reg) == REGNO (orig_src) case?
It wasn't clear when I was reviewing the code, so I took the
conservative approach of always rejecting when the # hard regs changes
as a result of widening.
(set (reg:SI 3) (something:SI))
(set (reg:SI 2) (expression:SI)) // def_insn
(use (reg:SI 3))
(set (reg:DI 3) (sign_extend:DI (reg:SI 2)))
So, perhaps if we wanted to handle the HARD_REGNO_NREGS != HARD_REGNO_NREGS
case when all 3 REGNOs are the same, we'd need to limit it to the case where
cand->insn and curr_insn are in the same bb, DF_INSN_LUID of curr_insn
is smaller than DF_INSN_LUID of cand->insn and the extra hard regs aren't
used between the two. Perhaps not worth it?
I doubt it's worth it. A size change here is pretty unusual.
BTW, I'm surprised to hear that it triggers in the testsuite already (for
the 3 REGNOs the same case, or different?), is that on x86_64 or i?86?
Do you have an example? I'm surprised that we'd have post reload a pattern
that extends into multiple hard registers.
There were only a couple on x86_64, both related to handling vector
regs. They would actually fail later, but they tripped the changing
size check.
Here's one example:
(insn 16 15 17 2 (set (reg:V4SI 21 xmm0 [99])
(vec_select:V4SI (reg:V8SI 22 xmm1 [orig:85 vect__3.443 ] [85])
(parallel [
(const_int 4 [0x4])
(const_int 5 [0x5])
(const_int 6 [0x6])
(const_int 7 [0x7])
]))) j.c:26 1926 {vec_extract_hi_v8si}
(nil))
(insn 17 16 18 2 (set (reg:V4DI 21 xmm0 [orig:98 vect__4.444 ] [98])
(sign_extend:V4DI (reg:V4SI 21 xmm0 [99]))) j.c:26 2583
{avx2_sign_extendv4siv4di2}
(nil))
Changing (reg:V4SI xmm0) to (reg:V4DI xmm0) changes the number of hard
regs and thus tripped the check.
The transformation would actually fail later when it tried to actually
combine the extension and vec_select into this insn:
(insn 16 15 17 2 (set (reg:V4DI 21 xmm0)
(sign_extend:V4DI (vec_select:V4SI (reg:V8SI 22 xmm1 [orig:85
vect__3.443 ] [85])
(parallel [
(const_int 4 [0x4])
(const_int 5 [0x5])
(const_int 6 [0x6])
(const_int 7 [0x7])
])))) j.c:26 -1
I guess I could run a -m32 multilib test and see if I can get it to
trigger in on something that will create a valid insn.
Jeff