Jeff Law <jeffreya...@gmail.com> writes: >>> Focusing on this insn: >>> >>>> (insn 77 75 80 6 (parallel [ >>>> (set (reg:DI 75 [ _32 ]) >>>> (plus:DI (reg:DI 73 [ _31 ]) >>>> (subreg:DI (reg/v:SI 41 [ __n ]) 0))) >>>> (clobber (scratch:SI)) >>>> ]) "j.C":50:38 discrim 1 155 {adddi3} >>>> (expr_list:REG_DEAD (reg:DI 73 [ _31 ]) >>>> (expr_list:REG_DEAD (reg/v:SI 41 [ __n ]) >>>> (nil)))) >>> >>> Not surprisingly we're focused on the subreg expression in there. >>> >>> The first checkpoint in my mind is IRA's allocation where we assign it >>> to reg 0. >>> >>> >>>> Popping a0(r41,l0) -- assign reg 0 >>> >>> >>> So given the use inside a paradoxical subreg, do we consider this valid? >>> >>> After the discussion from last week, I'm leaning a bit more towards no >>> than before. >> >> I thought it wasn't valid. AIUI, there are two mechanisms that try >> to prevent it: >> >> - valid_mode_changes_for_regno, which says which hard registers can >> form all subregs required by a pseudo. This is only used to restrict >> class choices though, rather than forbid individual registers. >> >> - This code in ira_build_conflicts: >> >> /* Now we deal with paradoxical subreg cases where certain registers >> cannot be accessed in the widest mode. */ >> machine_mode outer_mode = ALLOCNO_WMODE (a); >> machine_mode inner_mode = ALLOCNO_MODE (a); >> if (paradoxical_subreg_p (outer_mode, inner_mode)) >> { >> enum reg_class aclass = ALLOCNO_CLASS (a); >> for (int j = ira_class_hard_regs_num[aclass] - 1; j >= 0; --j) >> { >> int inner_regno = ira_class_hard_regs[aclass][j]; >> int outer_regno = simplify_subreg_regno (inner_regno, >> inner_mode, 0, >> outer_mode); >> if (outer_regno < 0 >> || !in_hard_reg_set_p (reg_class_contents[aclass], >> outer_mode, outer_regno)) >> { >> SET_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), >> inner_regno); >> SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), >> inner_regno); >> } >> } >> } >> >> which operates at the level of individual registers. >> >> So yeah, I think the first question is why ira_build_conflicts isn't >> kicking in for this register or (if it is) why we still get register 0. > So pulling on this thread leads me into the code that sets up > ALLOCNO_WMODE in create_insn_allocnos: > >> if ((a = ira_curr_regno_allocno_map[regno]) == NULL) >> { >> a = ira_create_allocno (regno, false, ira_curr_loop_tree_node); >> if (outer != NULL && GET_CODE (outer) == SUBREG) >> { >> machine_mode wmode = GET_MODE (outer); >> if (partial_subreg_p (ALLOCNO_WMODE (a), wmode)) >> ALLOCNO_WMODE (a) = wmode; >> } >> } > Note how we only set ALLOCNO_MODE only at allocno creation, so it'll > work as intended if and only if the first reference is via a SUBREG.
Huh, yeah, I agree that that looks wrong. > ISTM the fix here is to always do the check and set ALLOCNO_WMODE. > > The other bug I see is that we may potentially have paradoxicals in > different modes. ie, on a 32 bit target, we could in theory have a > paradoxical in DI and another in TI. So in addition to pulling that > code out of the conditional so that it executes every time, the > assignment would look like > > if (partial_subreg_p (ALLCONO_WMODE (a), wmode) > && wmode > ALLOCNO_WMODE (a)) > ALLOCNO_WMODE (a) = wmode; > > Or something along those lines. Not sure about this part though. The construct: if (partial_subreg_p (ALLCONO_WMODE (a), wmode)) ALLOCNO_WMODE (a) = wmode; is effectively: ALLOCNO_WMODE (a) = MAX_SIZE (ALLOCNO_WMODE (a), wmode); and so already picks the single widest mode, if there is one. For things like DI vs DF, it will use the existing mode as a tie-breaker. So ISTM that moving the code out of the "if (... == NULL)" should be enough on its own. > And it all makes sense that you caught this. You and another colleague > at ARM were trying to address this exact problem ~11 years ago ;-) Heh, thought it sounded familiar :) Richard