On Wed, Feb 5, 2025 at 1:29 PM Richard Biener <rguent...@suse.de> wrote:
>
> The PR shows fold-mem-offsets taking ages and a lot of memory computing
> DU/UD chains as that requires the RD problem.  The issue is not so much
> the memory required for the pruned sets but the high CFG connectivity
> (and that the CFG is cyclic) which makes solving the dataflow problem
> expensive.
>
> The following adds the same limit as the one imposed by GCSE and CPROP.
>
> Bootstrap and regtest ongoing on x86_64-unknown-linux-gnu, this
> reduces the compile-time of the PR26854 testcase from 480s to 150s.
>
> OK?

Thank you for sending this.
I was working on something like this as well, but I could not
find a reasonable threshold expression to disable the pass.

>
> Thanks,
> Richard.
>
>         PR rtl-optimization/117922
>         * fold-mem-offsets.cc (pass_fold_mem_offsets::execute):
>         Do nothing for a highly connected CFG.
> ---
>  gcc/fold-mem-offsets.cc | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/gcc/fold-mem-offsets.cc b/gcc/fold-mem-offsets.cc
> index a816006e207..c1c94472a07 100644
> --- a/gcc/fold-mem-offsets.cc
> +++ b/gcc/fold-mem-offsets.cc
> @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "df.h"
>  #include "tree-pass.h"
>  #include "cfgrtl.h"
> +#include "diagnostic-core.h"
>
>  /* This pass tries to optimize memory offset calculations by moving constants
>     from add instructions to the memory instructions (loads / stores).
> @@ -841,6 +842,23 @@ do_commit_insn (rtx_insn *insn)
>  unsigned int
>  pass_fold_mem_offsets::execute (function *fn)
>  {
> +  /* Computing UD/DU chains for flow graphs which have a high connectivity
> +     will take a long time and is unlikely to be particularly useful.
> +
> +     In normal circumstances a cfg should have about twice as many
> +     edges as blocks.  But we do not want to punish small functions
> +     which have a couple switch statements.  Rather than simply
> +     threshold the number of blocks, uses something with a more
> +     graceful degradation.  */
> +  if (n_edges_for_fn (fn) > 20000 + n_basic_blocks_for_fn (fn) * 4)
> +    {
> +      warning (OPT_Wdisabled_optimization,
> +              "fold-mem-offsets: %d basic blocks and %d edges/basic block",
> +              n_basic_blocks_for_fn (cfun),
> +              n_edges_for_fn (cfun) / n_basic_blocks_for_fn (cfun));
> +      return 0;
> +    }
> +
>    df_set_flags (DF_EQ_NOTES + DF_RD_PRUNE_DEAD_DEFS + DF_DEFER_INSN_RESCAN);
>    df_chain_add_problem (DF_UD_CHAIN + DF_DU_CHAIN);
>    df_analyze ();
> --
> 2.43.0

Reply via email to