Jeff Law <[email protected]> writes:
>>> Focusing on this insn:
>>>
>>>> (insn 77 75 80 6 (parallel [
>>>> (set (reg:DI 75 [ _32 ])
>>>> (plus:DI (reg:DI 73 [ _31 ])
>>>> (subreg:DI (reg/v:SI 41 [ __n ]) 0)))
>>>> (clobber (scratch:SI))
>>>> ]) "j.C":50:38 discrim 1 155 {adddi3}
>>>> (expr_list:REG_DEAD (reg:DI 73 [ _31 ])
>>>> (expr_list:REG_DEAD (reg/v:SI 41 [ __n ])
>>>> (nil))))
>>>
>>> Not surprisingly we're focused on the subreg expression in there.
>>>
>>> The first checkpoint in my mind is IRA's allocation where we assign it
>>> to reg 0.
>>>
>>>
>>>> Popping a0(r41,l0) -- assign reg 0
>>>
>>>
>>> So given the use inside a paradoxical subreg, do we consider this valid?
>>>
>>> After the discussion from last week, I'm leaning a bit more towards no
>>> than before.
>>
>> I thought it wasn't valid. AIUI, there are two mechanisms that try
>> to prevent it:
>>
>> - valid_mode_changes_for_regno, which says which hard registers can
>> form all subregs required by a pseudo. This is only used to restrict
>> class choices though, rather than forbid individual registers.
>>
>> - This code in ira_build_conflicts:
>>
>> /* Now we deal with paradoxical subreg cases where certain registers
>> cannot be accessed in the widest mode. */
>> machine_mode outer_mode = ALLOCNO_WMODE (a);
>> machine_mode inner_mode = ALLOCNO_MODE (a);
>> if (paradoxical_subreg_p (outer_mode, inner_mode))
>> {
>> enum reg_class aclass = ALLOCNO_CLASS (a);
>> for (int j = ira_class_hard_regs_num[aclass] - 1; j >= 0; --j)
>> {
>> int inner_regno = ira_class_hard_regs[aclass][j];
>> int outer_regno = simplify_subreg_regno (inner_regno,
>> inner_mode, 0,
>> outer_mode);
>> if (outer_regno < 0
>> || !in_hard_reg_set_p (reg_class_contents[aclass],
>> outer_mode, outer_regno))
>> {
>> SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj),
>> inner_regno);
>> SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj),
>> inner_regno);
>> }
>> }
>> }
>>
>> which operates at the level of individual registers.
>>
>> So yeah, I think the first question is why ira_build_conflicts isn't
>> kicking in for this register or (if it is) why we still get register 0.
> So pulling on this thread leads me into the code that sets up
> ALLOCNO_WMODE in create_insn_allocnos:
>
>> if ((a = ira_curr_regno_allocno_map[regno]) == NULL)
>> {
>> a = ira_create_allocno (regno, false, ira_curr_loop_tree_node);
>> if (outer != NULL && GET_CODE (outer) == SUBREG)
>> {
>> machine_mode wmode = GET_MODE (outer);
>> if (partial_subreg_p (ALLOCNO_WMODE (a), wmode))
>> ALLOCNO_WMODE (a) = wmode;
>> }
>> }
> Note how we only set ALLOCNO_MODE only at allocno creation, so it'll
> work as intended if and only if the first reference is via a SUBREG.
Huh, yeah, I agree that that looks wrong.
> ISTM the fix here is to always do the check and set ALLOCNO_WMODE.
>
> The other bug I see is that we may potentially have paradoxicals in
> different modes. ie, on a 32 bit target, we could in theory have a
> paradoxical in DI and another in TI. So in addition to pulling that
> code out of the conditional so that it executes every time, the
> assignment would look like
>
> if (partial_subreg_p (ALLCONO_WMODE (a), wmode)
> && wmode > ALLOCNO_WMODE (a))
> ALLOCNO_WMODE (a) = wmode;
>
> Or something along those lines.
Not sure about this part though. The construct:
if (partial_subreg_p (ALLCONO_WMODE (a), wmode))
ALLOCNO_WMODE (a) = wmode;
is effectively:
ALLOCNO_WMODE (a) = MAX_SIZE (ALLOCNO_WMODE (a), wmode);
and so already picks the single widest mode, if there is one.
For things like DI vs DF, it will use the existing mode as a tie-breaker.
So ISTM that moving the code out of the "if (... == NULL)" should be
enough on its own.
> And it all makes sense that you caught this. You and another colleague
> at ARM were trying to address this exact problem ~11 years ago ;-)
Heh, thought it sounded familiar :)
Richard