Richard Biener <richard.guent...@gmail.com> writes:
> On Mon, Aug 18, 2025 at 10:51 AM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> This patch fixes an internal disagreement in gcse about how to
>> handle partial clobbers.  Like many passes, gcse doesn't track
>> the modes of live values, so if a call clobbers only part of
>> a register, the pass has to make conservative assumptions.
>> As the comment in the patch says, this means:
>>
>> (1) ignoring partial clobbers when computing liveness and reaching
>>     definitions
>>
>> (2) treating partial clobbers as full clobbers when computing
>>     availability
>>
>> DF is mostly concerned with (1), so ignores partial clobbers.
>>
>> compute_hash_table_work did (2) when calculating kill sets,
>> but compute_transp didn't do (2) when computing transparency.
>> This led to a nonsensical situation of a register being in both
>> the transparency and kill sets.
>>
>> Andreas fixed the PR itself with a target-specific change, which is why
>> there is no new testcase.  But I was supposed to post this patch when
>> the next stage 1 opened and it looks like I never did.
>>
>> Tested on aarch64-linux-gnu.  OK to install?
>>
>> Richard
>>
>>
>> gcc/
>>         PR rtl-optimization/97497
>>         * function-abi.h (predefined_function_abi::only_partial_reg_clobbers)
>>         (function_abi::only_partial_reg_clobbers): New member functions.
>>         * gcse-common.c: Include regs.h and function-abi.h.
>>         (compute_transp): Check for partially call-clobbered registers
>>         and treat them as not being transparent in blocks with calls.
>> ---
>>  gcc/function-abi.h | 39 +++++++++++++++++++++++++++++++++++++++
>>  gcc/gcse-common.cc | 40 ++++++++++++++++++++++++++++++++--------
>>  2 files changed, 71 insertions(+), 8 deletions(-)
>>
>> diff --git a/gcc/function-abi.h b/gcc/function-abi.h
>> index edf83c1cb39..6cb552fbcac 100644
>> --- a/gcc/function-abi.h
>> +++ b/gcc/function-abi.h
>> @@ -99,6 +99,39 @@ public:
>>      return m_full_and_partial_reg_clobbers;
>>    }
>>
>> +  /* Return the set of registers that a function call is allowed to alter
>> +     partially but not fully; i.e. those in full_and_partial_reg_clobbers ()
>> +     but not in full_reg_clobbers ().
>> +
>> +     If a register X is in this set and if we don't know which parts of
>> +     X are live (typically because we don't bother to track X's mode),
>> +     then the conservative assumptions are:
>> +
>> +     - to ignore the call when computing reaching definitions
>> +     - to treat X as clobbered when computing availability
>> +
>> +     For example, if we have:
>> +
>> +       A: X := Y
>> +       B: call that partially clobbers X
>> +       C: ... := ... X ...
>> +
>> +     and don't track the mode of X when computing reaching definitions,
>> +     then the conservative assumption is that A's definition survives
>> +     until C.  But if we have:
>> +
>> +       A: X := Y
>> +       B: call that partially clobbers X
>> +       C: ... := ... Y ...
>> +
>> +     and don't track the mode of X when computing availability, then the
>> +     conservative assumption is that Y is not available in X at C.  */
>> +  HARD_REG_SET
>> +  only_partial_reg_clobbers () const
>> +  {
>> +    return full_and_partial_reg_clobbers () & ~full_reg_clobbers ();
>> +  }
>> +
>>    /* Return the set of registers that cannot be used to hold a value of
>>       mode MODE across a function call.  That is:
>>
>> @@ -165,6 +198,12 @@ public:
>>      return m_mask & m_base_abi->full_and_partial_reg_clobbers ();
>>    }
>>
>> +  HARD_REG_SET
>> +  only_partial_reg_clobbers () const
>> +  {
>> +    return m_mask & m_base_abi->only_partial_reg_clobbers ();
>> +  }
>> +
>>    HARD_REG_SET
>>    mode_clobbers (machine_mode mode) const
>>    {
>> diff --git a/gcc/gcse-common.cc b/gcc/gcse-common.cc
>> index 888e4d6356c..9ce9bb57148 100644
>> --- a/gcc/gcse-common.cc
>> +++ b/gcc/gcse-common.cc
>> @@ -27,7 +27,8 @@
>>  #include "rtl.h"
>>  #include "df.h"
>>  #include "gcse-common.h"
>> -
>> +#include "regs.h"
>> +#include "function-abi.h"
>>
>>  /* Record all of the canonicalized MEMs of record_last_mem_set_info's insn.
>>     Note we store a pair of elements in the list, so they have to be
>> @@ -129,13 +130,36 @@ compute_transp (const_rtx x, int indx, sbitmap *bmap,
>>    switch (code)
>>      {
>>      case REG:
>> -       {
>> -         df_ref def;
>> -         for (def = DF_REG_DEF_CHAIN (REGNO (x));
>> -              def;
>> -              def = DF_REF_NEXT_REG (def))
>> -           bitmap_clear_bit (bmap[DF_REF_BB (def)->index], indx);
>> -       }
>> +      {
>> +       df_ref def;
>> +       for (def = DF_REG_DEF_CHAIN (REGNO (x));
>> +            def;
>> +            def = DF_REF_NEXT_REG (def))
>> +         bitmap_clear_bit (bmap[DF_REF_BB (def)->index], indx);
>> +
>> +       /* Check for hard registers that are partially clobbered (but not
>> +          fully clobbered) by calls.  Such partial clobbers do not have
>> +          an associated definition or use in the DF representation,
>> +          but they do prevent the register from being transparent.
>> +
>> +          ??? The check here is fairly crude.  We could instead maintain
>> +          separate blocks_with_calls bitmaps for each ABI.  */
>> +       if (HARD_REGISTER_P (x))
>
> It's probably cheap enough in that hard-reg uses in expressions are
> seldom.

Yeah, hopefully.  And uses of only-part-clobbered registers should be
even rarer.

> One does of course wonder about the cache-infriendlyness
> of compute_transp in general (also looking at the MEM case...).

Having blocks be the outer index and expressions be the inner index does
seem a bit inefficient in this context, but I suppose it's better when
solving the dataflow problem.  I wonder whether it'd be more efficient
to reformat the data after computing transparency?  It might not be
in practice though.

And yeah, the MEM case does look a bit expensive :)

Thanks for the review.

Richard

>
> OK.
>
> Thanks,
> Richard.
>
>> +         for (unsigned int i = 0; i < NUM_ABI_IDS; ++i)
>> +           {
>> +             const predefined_function_abi &abi = function_abis[i];
>> +             if (abi.initialized_p ()
>> +                 && overlaps_hard_reg_set_p (abi.only_partial_reg_clobbers 
>> (),
>> +                                             GET_MODE (x), REGNO (x)))
>> +               {
>> +                 bitmap_iterator bi;
>> +                 unsigned bb_index;
>> +                 EXECUTE_IF_SET_IN_BITMAP (blocks_with_calls, 0, bb_index, 
>> bi)
>> +                   bitmap_clear_bit (bmap[bb_index], indx);
>> +                 break;
>> +               }
>> +           }
>> +      }
>>
>>        return;
>>
>> --
>> 2.43.0
>>

Reply via email to