On Fri, Feb 12, 2021 at 10:59 AM Richard Sandiford <richard.sandif...@arm.com> wrote: > > Richard Biener <richard.guent...@gmail.com> writes: > > On Thu, Feb 11, 2021 at 4:33 PM Richard Sandiford via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > >> > >> df_lr_bb_local_compute has: > >> > >> FOR_EACH_INSN_INFO_DEF (def, insn_info) > >> /* If the def is to only part of the reg, it does > >> not kill the other defs that reach here. */ > >> if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))) > >> > >> However, as noted in the comment in the patch and below, almost all > >> partial definitions have an associated use. This means that the > >> confluence function: > >> > >> IN = (OUT & ~DEF) | USE > >> > >> is unaffected by whether partial definitions are in DEF or not. > >> > >> Even though the choice doesn't matter for the LR problem itself, > >> it's IMO much more convenient for consumers if DEF contains all the > >> definitions in the block. The only pre-RTL-SSA code that tries to > >> consume DEF directly is shrink-wrap.c, which already has to work > >> around the incompleteness of the information: > >> > >> /* DF_LR_BB_INFO (bb)->def does not comprise the DF_REF_PARTIAL > >> and > >> DF_REF_CONDITIONAL defs. So if DF_LIVE doesn't exist, i.e. > >> at -O1, just give up searching NEXT_BLOCK. */ > >> > >> I hit the same problem when trying to fix the RTL-SSA part of PR98863. > >> > >> This patch treats partial definitions as both a def and a use, > >> just like the df_ref records almost always do. > >> > >> To show that partial definitions almost always have uses: > >> > >> DF_REF_CONDITIONAL: > >> > >> Added by: > >> > >> case COND_EXEC: > >> df_defs_record (collection_rec, COND_EXEC_CODE (x), > >> bb, insn_info, DF_REF_CONDITIONAL); > >> break; > >> > >> Later, df_get_conditional_uses creates uses for all DF_REF_CONDITIONAL > >> definitions. > >> > >> DF_REF_PARTIAL: > >> > >> In total, there are 4 locations at which we add partial definitions. > >> > >> Case 1: > >> > >> if (GET_CODE (dst) == STRICT_LOW_PART) > >> { > >> flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL | > >> DF_REF_STRICT_LOW_PART; > >> > >> loc = &XEXP (dst, 0); > >> dst = *loc; > >> } > >> > >> Corresponding use: > >> > >> case STRICT_LOW_PART: > >> { > >> rtx *temp = &XEXP (dst, 0); > >> /* A strict_low_part uses the whole REG and not just the > >> SUBREG. */ > >> dst = XEXP (dst, 0); > >> df_uses_record (collection_rec, > >> (GET_CODE (dst) == SUBREG) ? &SUBREG_REG (dst) : > >> temp, > >> DF_REF_REG_USE, bb, insn_info, > >> DF_REF_READ_WRITE | DF_REF_STRICT_LOW_PART); > >> } > >> break; > >> > >> Case 2: > >> > >> if (GET_CODE (dst) == ZERO_EXTRACT) > >> { > >> flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL | > >> DF_REF_ZERO_EXTRACT; > >> > >> loc = &XEXP (dst, 0); > >> dst = *loc; > >> } > >> > >> Corresponding use: > >> > >> case ZERO_EXTRACT: > >> { > >> df_uses_record (collection_rec, &XEXP (dst, 1), > >> DF_REF_REG_USE, bb, insn_info, flags); > >> df_uses_record (collection_rec, &XEXP (dst, 2), > >> DF_REF_REG_USE, bb, insn_info, flags); > >> if (GET_CODE (XEXP (dst,0)) == MEM) > >> df_uses_record (collection_rec, &XEXP (dst, 0), > >> DF_REF_REG_USE, bb, insn_info, > >> flags); > >> else > >> df_uses_record (collection_rec, &XEXP (dst, 0), > >> DF_REF_REG_USE, bb, insn_info, > >> DF_REF_READ_WRITE | DF_REF_ZERO_EXTRACT); > >> ----------------------------^^^^^^^^^^^^^^^^^ > >> } > >> break; > >> > >> Case 3: > >> > >> else if (GET_CODE (dst) == SUBREG && REG_P (SUBREG_REG (dst))) > >> { > >> if (read_modify_subreg_p (dst)) > >> flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL; > >> > >> flags |= DF_REF_SUBREG; > >> > >> df_ref_record (DF_REF_REGULAR, collection_rec, > >> dst, loc, bb, insn_info, DF_REF_REG_DEF, flags); > >> } > >> > >> Corresponding use: > >> > >> case SUBREG: > >> if (read_modify_subreg_p (dst)) > >> { > >> df_uses_record (collection_rec, &SUBREG_REG (dst), > >> DF_REF_REG_USE, bb, insn_info, > >> flags | DF_REF_READ_WRITE | DF_REF_SUBREG); > >> break; > >> } > >> > >> Case 4: > >> > >> /* If this is a multiword hardreg, we create some extra > >> datastructures that will enable us to easily build REG_DEAD > >> and REG_UNUSED notes. */ > >> if (collection_rec > >> && (endregno != regno + 1) && insn_info) > >> { > >> /* Sets to a subreg of a multiword register are partial. > >> Sets to a non-subreg of a multiword register are not. */ > >> if (GET_CODE (reg) == SUBREG) > >> ref_flags |= DF_REF_PARTIAL; > >> ref_flags |= DF_REF_MW_HARDREG; > >> > >> Corresponding use: > >> > >> None. However, this case should be rare to non-existent on most > >> targets, and the current handling seems suspect. See the comment > >> in the patch for more details. > >> > >> Tested so far on aarch64-linux-gnu, both individually and with > >> the follow-on rtl-ssa patch. WDYT? I realise it's not ideal > >> to be applying this in stage 4, but it would be expensive > >> to work around. > >> > >> FWIW, I also tried a variation: > >> > >> df_insn_info *insn_info = DF_INSN_INFO_GET (insn); > >> FOR_EACH_INSN_INFO_DEF (def, insn_info) > >> { > >> unsigned int dregno = DF_REF_REGNO (def); > >> bitmap_set_bit (&bb_info->def, dregno); > >> bitmap_clear_bit (&bb_info->use, dregno); > >> } > >> > >> FOR_EACH_INSN_INFO_USE (use, insn_info) > >> /* Add use to set of uses in this BB. */ > >> bitmap_set_bit (&bb_info->use, DF_REF_REGNO (use)); > >> > >> if (CHECKING_P) > >> FOR_EACH_INSN_INFO_DEF (def, insn_info) > >> if (DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) > >> gcc_assert (bitmap_bit_p (&bb_info->use, DF_REF_REGNO (def))); > >> > >> This obviously triggers case 4 above, as seen in the testcase. > >> It nevertheless survived bootstrap & regression test. This is what > >> I expected, because case 4 should never trigger on most targets, > >> including AArch64. > >> > >> I think this shows that the LR in/out sets are very likely to be > >> unaffected by the patch. > > > > I'd say it should nevertheless be a correctness fix. I suppose you > > ran into the cases where it was _not_ unaffected by the patch? > > (the multi-hard-reg subreg case?) > > The problems I ran into without the patch were from cases (1)-(3). > E.g. I was trying to place phis using a combination of the def sets > and dominance frontiers, but strict_lowpart definitions weren't in > the def sets, and so we were missing necessary phis.
Ah, I see. I guess I'd consider the local problem bitmaps "internal" (and they're just kept to support incrementally updating the problem) ... > I then wanted to go with the asserting variant above, but when > I looked at the df-scan.c code, it was obvious that case (4) would > trigger the assert. > > I don't know a way of triggering case (4) for real code (for any > target). The only way I could think of was to use an artificial > rtl test like the one in the patch. That did trigger the assert > above, but compiles correctly with the patch as posted (and compiled > correctly before the patch). > > So… > > > The comment mentions that the bug might be elsewhere though (in > > df-scan I suppose), so is isn't it better to fix the missing USE or > > making it two KILLs there? I realize the fallout of that might be > > bigger at this point. > > …yeah, I agree the right long-term fix would be to remove: > > /* Sets to a subreg of a multiword register are partial. > Sets to a non-subreg of a multiword register are not. */ > if (GET_CODE (reg) == SUBREG) > ref_flags |= DF_REF_PARTIAL; > > from case 4 and see if anything breaks. But that feels quite risky > at this stage, given that it's so hard to trigger. I agree. > I don't know of a way of constructing a test that is actually > miscompiled because of the current handling of case (4). The rtl test > was compiled correctly, it just had surprising LR information before > the patch. > > That said, if you think it's worth trying now, I'd be happy to remove > the code above and go for the asserting version. I think your original patch is fine if it helps solving the fwprop issue. Thanks, Richard. > Thanks, > Richard