On Thu, Jul 11, 2024 at 10:30 AM Ajit Agarwal <aagar...@linux.ibm.com> wrote:
>
> Hello All:
>
> Unroll factor is determined with max distance across loop iterations.
> The logic for determining the loop unroll factor is based on
> data dependency across loop iterations.
>
> The max distance across loop iterations is the unrolling factor
> that helps in predictive commoning.

The old comment in the code says

> -      /* The best unroll factor for this chain is equal to the number of
> -        temporary variables that we create for it.  */

why is that wrong and why is the max dependence distance more correct?

Do you have a testcase that shows how this makes a (positive) difference?

Richard.

> Bootstrapped and regtested on powerpc64-linux-gnu.
>
> Thanks & Regards
> Ajit
>
> tree-optimization, predcom: Improve unroll factor for predictive commoning
>
> Unroll factor is determined with max distance across loop iterations.
> The logic for determining the loop unroll factor is based on
> data dependency across loop iterations.
>
> The max distance across loop iterations is the unrolling factor
> that helps in predictive commoning.
>
> 2024-07-11  Ajit Kumar Agarwal  <aagar...@linux.ibm.com>
>
> gcc/ChangeLog:
>
>         * tree-predcom.cc: Change in determining unroll factor with
>         data dependence across loop iterations.
> ---
>  gcc/tree-predcom.cc | 51 ++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 39 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/tree-predcom.cc b/gcc/tree-predcom.cc
> index 9844fee1e97..029b02f5990 100644
> --- a/gcc/tree-predcom.cc
> +++ b/gcc/tree-predcom.cc
> @@ -409,6 +409,7 @@ public:
>    /* Perform the predictive commoning optimization for chains, make this
>       public for being called in callback execute_pred_commoning_cbck.  */
>    void execute_pred_commoning (bitmap tmp_vars);
> +  unsigned determine_unroll_factor (const vec<chain_p> &chains);
>
>  private:
>    /* The pointer to the given loop.  */
> @@ -2400,13 +2401,46 @@ pcom_worker::execute_pred_commoning_chain (chain_p 
> chain,
>     copies as possible.  CHAINS is the list of chains that will be
>     optimized.  */
>
> -static unsigned
> -determine_unroll_factor (const vec<chain_p> &chains)
> +unsigned
> +pcom_worker::determine_unroll_factor (const vec<chain_p> &chains)
>  {
>    chain_p chain;
> -  unsigned factor = 1, af, nfactor, i;
> +  unsigned factor = 1, i;
>    unsigned max = param_max_unroll_times;
> +  struct data_dependence_relation *ddr;
> +  unsigned nfactor = 0;
> +  int nzfactor = 0;
> +
> +  /* Best unroll factor is the maximum distance across loop
> +     iterations.  */
> +  FOR_EACH_VEC_ELT (m_dependences, i, ddr)
> +    {
> +      for (unsigned j = 0; j < DDR_NUM_DIST_VECTS (ddr); j++)
> +       {
> +         lambda_vector vec = DDR_DIST_VECT (ddr, j);
> +         widest_int distance = vec[j];
> +         unsigned offset = distance.to_uhwi ();
> +         if (offset == 0)
> +           continue;
> +
> +         int dist = offset - nzfactor;
> +         if (dist  == 0)
> +           continue;
>
> +         if (nfactor == 0)
> +           {
> +             nfactor = offset;
> +             nzfactor = offset;
> +           }
> +         else if (dist <= nzfactor)
> +           nfactor = offset;
> +
> +         if (nfactor > 0 && nfactor <= max)
> +           factor = nfactor;
> +       }
> +    }
> +
> +  int max_use = 0;
>    FOR_EACH_VEC_ELT (chains, i, chain)
>      {
>        if (chain->type == CT_INVARIANT)
> @@ -2427,17 +2461,10 @@ determine_unroll_factor (const vec<chain_p> &chains)
>           continue;
>         }
>
> -      /* The best unroll factor for this chain is equal to the number of
> -        temporary variables that we create for it.  */
> -      af = chain->length;
>        if (chain->has_max_use_after)
> -       af++;
> -
> -      nfactor = factor * af / gcd (factor, af);
> -      if (nfactor <= max)
> -       factor = nfactor;
> +       max_use++;
>      }
> -
> +  factor += max_use;
>    return factor;
>  }
>
> --
> 2.43.5
>

Reply via email to