On Fri, Feb 12, 2021 at 10:59 AM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Richard Biener <richard.guent...@gmail.com> writes:
> > On Thu, Feb 11, 2021 at 4:33 PM Richard Sandiford via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> df_lr_bb_local_compute has:
> >>
> >>       FOR_EACH_INSN_INFO_DEF (def, insn_info)
> >>         /* If the def is to only part of the reg, it does
> >>            not kill the other defs that reach here.  */
> >>         if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
> >>
> >> However, as noted in the comment in the patch and below, almost all
> >> partial definitions have an associated use.  This means that the
> >> confluence function:
> >>
> >>   IN = (OUT & ~DEF) | USE
> >>
> >> is unaffected by whether partial definitions are in DEF or not.
> >>
> >> Even though the choice doesn't matter for the LR problem itself,
> >> it's IMO much more convenient for consumers if DEF contains all the
> >> definitions in the block.  The only pre-RTL-SSA code that tries to
> >> consume DEF directly is shrink-wrap.c, which already has to work
> >> around the incompleteness of the information:
> >>
> >>           /* DF_LR_BB_INFO (bb)->def does not comprise the DF_REF_PARTIAL 
> >> and
> >>              DF_REF_CONDITIONAL defs.  So if DF_LIVE doesn't exist, i.e.
> >>              at -O1, just give up searching NEXT_BLOCK.  */
> >>
> >> I hit the same problem when trying to fix the RTL-SSA part of PR98863.
> >>
> >> This patch treats partial definitions as both a def and a use,
> >> just like the df_ref records almost always do.
> >>
> >> To show that partial definitions almost always have uses:
> >>
> >>   DF_REF_CONDITIONAL:
> >>
> >>     Added by:
> >>
> >>       case COND_EXEC:
> >>         df_defs_record (collection_rec, COND_EXEC_CODE (x),
> >>                         bb, insn_info, DF_REF_CONDITIONAL);
> >>         break;
> >>
> >>     Later, df_get_conditional_uses creates uses for all DF_REF_CONDITIONAL
> >>     definitions.
> >>
> >>   DF_REF_PARTIAL:
> >>
> >>     In total, there are 4 locations at which we add partial definitions.
> >>
> >>     Case 1:
> >>
> >>       if (GET_CODE (dst) == STRICT_LOW_PART)
> >>         {
> >>           flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL | 
> >> DF_REF_STRICT_LOW_PART;
> >>
> >>           loc = &XEXP (dst, 0);
> >>           dst = *loc;
> >>         }
> >>
> >>     Corresponding use:
> >>
> >>       case STRICT_LOW_PART:
> >>         {
> >>           rtx *temp = &XEXP (dst, 0);
> >>           /* A strict_low_part uses the whole REG and not just the
> >>              SUBREG.  */
> >>           dst = XEXP (dst, 0);
> >>           df_uses_record (collection_rec,
> >>                           (GET_CODE (dst) == SUBREG) ? &SUBREG_REG (dst) : 
> >> temp,
> >>                           DF_REF_REG_USE, bb, insn_info,
> >>                           DF_REF_READ_WRITE | DF_REF_STRICT_LOW_PART);
> >>         }
> >>         break;
> >>
> >>     Case 2:
> >>
> >>       if (GET_CODE (dst) == ZERO_EXTRACT)
> >>         {
> >>           flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL | 
> >> DF_REF_ZERO_EXTRACT;
> >>
> >>           loc = &XEXP (dst, 0);
> >>           dst = *loc;
> >>         }
> >>
> >>     Corresponding use:
> >>
> >>       case ZERO_EXTRACT:
> >>         {
> >>           df_uses_record (collection_rec, &XEXP (dst, 1),
> >>                           DF_REF_REG_USE, bb, insn_info, flags);
> >>           df_uses_record (collection_rec, &XEXP (dst, 2),
> >>                           DF_REF_REG_USE, bb, insn_info, flags);
> >>           if (GET_CODE (XEXP (dst,0)) == MEM)
> >>             df_uses_record (collection_rec, &XEXP (dst, 0),
> >>                             DF_REF_REG_USE, bb, insn_info,
> >>                             flags);
> >>           else
> >>             df_uses_record (collection_rec, &XEXP (dst, 0),
> >>                             DF_REF_REG_USE, bb, insn_info,
> >>                             DF_REF_READ_WRITE | DF_REF_ZERO_EXTRACT);
> >> ----------------------------^^^^^^^^^^^^^^^^^
> >>         }
> >>         break;
> >>
> >>     Case 3:
> >>
> >>       else if (GET_CODE (dst) == SUBREG && REG_P (SUBREG_REG (dst)))
> >>         {
> >>           if (read_modify_subreg_p (dst))
> >>             flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL;
> >>
> >>           flags |= DF_REF_SUBREG;
> >>
> >>           df_ref_record (DF_REF_REGULAR, collection_rec,
> >>                          dst, loc, bb, insn_info, DF_REF_REG_DEF, flags);
> >>         }
> >>
> >>     Corresponding use:
> >>
> >>       case SUBREG:
> >>         if (read_modify_subreg_p (dst))
> >>           {
> >>             df_uses_record (collection_rec, &SUBREG_REG (dst),
> >>                             DF_REF_REG_USE, bb, insn_info,
> >>                             flags | DF_REF_READ_WRITE | DF_REF_SUBREG);
> >>             break;
> >>           }
> >>
> >>     Case 4:
> >>
> >>       /*  If this is a multiword hardreg, we create some extra
> >>           datastructures that will enable us to easily build REG_DEAD
> >>           and REG_UNUSED notes.  */
> >>       if (collection_rec
> >>           && (endregno != regno + 1) && insn_info)
> >>         {
> >>           /* Sets to a subreg of a multiword register are partial.
> >>              Sets to a non-subreg of a multiword register are not.  */
> >>           if (GET_CODE (reg) == SUBREG)
> >>             ref_flags |= DF_REF_PARTIAL;
> >>           ref_flags |= DF_REF_MW_HARDREG;
> >>
> >>     Corresponding use:
> >>
> >>       None.  However, this case should be rare to non-existent on most
> >>       targets, and the current handling seems suspect.  See the comment
> >>       in the patch for more details.
> >>
> >> Tested so far on aarch64-linux-gnu, both individually and with
> >> the follow-on rtl-ssa patch.  WDYT?  I realise it's not ideal
> >> to be applying this in stage 4, but it would be expensive
> >> to work around.
> >>
> >> FWIW, I also tried a variation:
> >>
> >>    df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
> >>    FOR_EACH_INSN_INFO_DEF (def, insn_info)
> >>      {
> >>        unsigned int dregno = DF_REF_REGNO (def);
> >>        bitmap_set_bit (&bb_info->def, dregno);
> >>        bitmap_clear_bit (&bb_info->use, dregno);
> >>      }
> >>
> >>    FOR_EACH_INSN_INFO_USE (use, insn_info)
> >>      /* Add use to set of uses in this BB.  */
> >>      bitmap_set_bit (&bb_info->use, DF_REF_REGNO (use));
> >>
> >>    if (CHECKING_P)
> >>      FOR_EACH_INSN_INFO_DEF (def, insn_info)
> >>        if (DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))
> >>          gcc_assert (bitmap_bit_p (&bb_info->use, DF_REF_REGNO (def)));
> >>
> >> This obviously triggers case 4 above, as seen in the testcase.
> >> It nevertheless survived bootstrap & regression test.  This is what
> >> I expected, because case 4 should never trigger on most targets,
> >> including AArch64.
> >>
> >> I think this shows that the LR in/out sets are very likely to be
> >> unaffected by the patch.
> >
> > I'd say it should nevertheless be a correctness fix.  I suppose you
> > ran into the cases where it was _not_ unaffected by the patch?
> > (the multi-hard-reg subreg case?)
>
> The problems I ran into without the patch were from cases (1)-(3).
> E.g. I was trying to place phis using a combination of the def sets
> and dominance frontiers, but strict_lowpart definitions weren't in
> the def sets, and so we were missing necessary phis.

Ah, I see.  I guess I'd consider the local problem bitmaps "internal"
(and they're just kept to support incrementally updating the problem) ...

> I then wanted to go with the asserting variant above, but when
> I looked at the df-scan.c code, it was obvious that case (4) would
> trigger the assert.
>
> I don't know a way of triggering case (4) for real code (for any
> target).  The only way I could think of was to use an artificial
> rtl test like the one in the patch.  That did trigger the assert
> above, but compiles correctly with the patch as posted (and compiled
> correctly before the patch).
>
> So…
>
> > The comment mentions that the bug might be elsewhere though (in
> > df-scan I suppose), so is isn't it better to fix the missing USE or
> > making it two KILLs there?  I realize the fallout of that might be
> > bigger at this point.
>
> …yeah, I agree the right long-term fix would be to remove:
>
>           /* Sets to a subreg of a multiword register are partial.
>              Sets to a non-subreg of a multiword register are not.  */
>           if (GET_CODE (reg) == SUBREG)
>             ref_flags |= DF_REF_PARTIAL;
>
> from case 4 and see if anything breaks.  But that feels quite risky
> at this stage, given that it's so hard to trigger.

I agree.

> I don't know of a way of constructing a test that is actually
> miscompiled because of the current handling of case (4).  The rtl test
> was compiled correctly, it just had surprising LR information before
> the patch.
>
> That said, if you think it's worth trying now, I'd be happy to remove
> the code above and go for the asserting version.

I think your original patch is fine if it helps solving the fwprop issue.

Thanks,
Richard.

> Thanks,
> Richard

Reply via email to