On Wed, Jul 31, 2024 at 10:15 AM Mariam Arutunian
<[email protected]> wrote:
>
> This patch adds a new compiler pass aimed at identifying naive CRC
> implementations,
> characterized by the presence of a loop calculating a CRC (polynomial long
> division).
> Upon detection of a potential CRC, the pass prints an informational message.
>
> Performs CRC optimization if optimization level is >= 2,
> besides optimizations for size and if fno_gimple_crc_optimization given.
>
> This pass is added for the detection and optimization of naive CRC
> implementations,
> improving the efficiency of CRC-related computations.
>
> This patch includes only initial fast checks for filtering out non-CRCs,
> detected possible CRCs verification and optimization parts will be provided
> in subsequent patches.
+fgimple-crc-optimization
Please do not put IL names in flags exposed to users, this should be
-foptimize-crc
+This flag is enabled by default at @option{-O2} and higher
+if @option{-Os} is not also specified.
Please leave out "if @option{-Os} is not also specified". I wonder why
for example if the CPU has a CRC instruction and the polynomial matches
we would disable this when optimizing for size? A CRC instruction should
be smaller?
+/* This pass performs CRC optimization. */
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "gimple.h"
+#include "tree-pass.h"
+#include "ssa.h"
+#include "gimple-iterator.h"
+#include "tree-cfg.h"
+#include "tree-ssa-loop-niter.h"
please try to prune the set of #include (I guess you've
copied this from elsewhere).
+ /* Don't continue searching if encountered the loop's beginning. */
+ if (bb_loop_header_p (gimple_bb (stmt)))
+ return nullptr;
gimple_bb (stmt) == m_crc_loop->header
it looks like this loop ignores other uses of xored_crc, like in
calls? Shouldn't
there be a
else if (!is_gimple_debug (stmt))
return nullptr;
in the FOR_EACH_IMM_USE_FAST loop?
It looks like loop_may_calculate_crc walks all loops XOR stmts and
for each xor_calculates_crc you call set_bbs_stmts_not_visited.
This looks like it would be quadratic in complexity considering a loop
with 1000s of XOR stmts _not_ computing a CRC? This would be
solved by aborting the search when the first found XOR is not
a CRC for example. Or by doing the initialization only once.
It also seems that in xor_calculates_crc finding the
shift after the XOR is much cheaper than the one before
as that collects dependent stmts (but we unset visited on all
stmts _again_). May I suggest to use a bitmap of SSA def
SSA_NAME_VERSIONs instead of using the gimple visited flag?
You are collecting a possibly very large use-def chain but
only very simple checks are later done on it.
Richard.
> gcc/
>
> * Makefile.in (OBJS): Add gimple-crc-optimization.o.
> * common.opt (fgimple-crc-optimization): New option.
> * common.opt.urls: Regenerate to add
> fgimple-crc-optimization.
> * doc/invoke.texi (-fgimple-crc-optimization): Add documentation.
> * gimple-crc-optimization.cc: New file.
> * gimple.cc (set_phi_stmts_not_visited): New function.
> (set_gimple_stmts_not_visited): Likewise.
> (set_bbs_stmts_not_visited): Likewise.
> * gimple.h (set_gimple_stmts_not_visited): New extern function
> declaration.
> (set_phi_stmts_not_visited): New extern function declaration.
> (set_bbs_stmts_not_visited): New extern function declaration.
> * opts.cc (default_options_table): Add OPT_fgimple_crc_optimization.
> (enable_fdo_optimizations): Enable gimple-crc-optimization.
> * passes.def (pass_crc_optimization): Add new pass.
> * timevar.def (TV_GIMPLE_CRC_OPTIMIZATION): New timevar.
> * tree-pass.h (make_pass_crc_optimization): New extern function
> declaration.
>
> Signed-off-by: Mariam Arutunian <[email protected]>