On Mon, Nov 03, 2014 at 03:27:57PM +0100, Marek Polacek wrote: > I moved all of this into new sanopt.c file. > (I guess that file includes some headers that we in fact don't > need, but the header flattening doesn't make it easy to check, > there are too many of them.)
Well, in theory you could just script removing them one by one and just make sanopt.o after each step to see if the header is or is not needed, perhaps with some manual tweeks. > +/* This is used to carry information about basic blocks. It is > + attached to the AUX field of the standard CFG block. */ > + > +struct sanopt_info > +{ > + /* True if this BB has been visited. */ > + bool visited_p; > +}; > + > + > +/* Try to optimize away redundant UBSAN_NULL checks. > + > + We walk blocks in the CFG via a depth first search of the dominator > + tree; we push unique UBSAN_NULL statements into a vector in the > + NULL_CHECK_MAP as we enter the blocks. When leaving a block, we > + mark the block as visited; then when checking the statements in the > + vector, we ignore statements that are coming from already visited > + blocks, because these cannot dominate anything anymore. */ > + > +static void > +sanopt_optimize_walker (basic_block bb, > + hash_map<tree, auto_vec<gimple> > *map) Perhaps in preparation for future optimizations (other UBSAN_* calls, and ASAN_CHECK and tsan builtins), you should consider putting the hash_map into some structure and pass address of that structure instead, so that you have all the pass context at the same spot. You could put asan_num_accesses in there too, see below. > + /* We already have recorded a UBSAN_NULL check > + for this pointer. Perhaps we can drop this one. > + But only if this check doesn't specify stricter > + alignment. */ > + int i; > + gimple g; > + > + FOR_EACH_VEC_ELT (v, i, g) > + { > + /* Remove statements for BBs that have been > + already processed. */ > + sanopt_info *si = (sanopt_info *) gimple_bb (g)->aux; > + if (si->visited_p) > + { > + /* ??? This might be unneccesary; we could just > + skip the stale statements. */ > + v.unordered_remove (i); > + continue; > + } I think it would be better to walk the vector backwards instead of forward. 1) if you have the same SSA_NAME there multiple times, ignoring the already unnecessary stmts, the only case where you'd have multiple stmts is if the earlier stmts dominate the following stmts and for some reason weren't optimized away; that for some reason currently should be just higher required alignment in the dominated stmt (e.g. say have UBSAN_NULL (foo_23, 0); UBSAN_NULL (foo_23, 2); UBSAN_NULL (foo_23, 4); where the first stmt dominates the second two and second stmt dominates the last one. 2) All the si->visited_p stmts should be always at the end of the vector IMHO, preceeded only by !si->visited_p stmts. Thus, when walking backwards, first remove the stmts with bb->visited_p, once you hit one that doesn't have it set, I believe there shouldn't be any further. And then in theory it should be fine to just compare the last stmt in the vector that was left (if any). > +unsigned int > +pass_sanopt::execute (function *fun) > +{ > + basic_block bb; > + > + /* Try to remove redundant checks. */ > + if (optimize > + && (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT))) > + sanopt_optimize (fun); Perhaps you could return the asan_num_accesses value computed during sanopt_optimize (just count IFN_ASAN_CHECK calls that you don't optimize away), and do this only in else if (i.e. when sanopt_optimize has not been run). That way, you'd save one extra IL walk. > + if (flag_sanitize & SANITIZE_ADDRESS) > + { > + gimple_stmt_iterator gsi; > + FOR_EACH_BB_FN (bb, fun) > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > + { > + gimple stmt = gsi_stmt (gsi); > + if (is_gimple_call (stmt) && gimple_call_internal_p (stmt) > + && gimple_call_internal_fn (stmt) == IFN_ASAN_CHECK) > + ++asan_num_accesses; > + } > + } Otherwise LGTM, thanks for working on this. Jakub