> > Just to quantify this, on tramp3d there are 2263 calls declared dead
> > and 34206 declared live (by my mini-dce) in fnsummary1. So 6%.
> > There are overall 3797 unnecessary stmts and 97263 necessary ones.
> > So most of the dead stuff is actually calls.
> >
> > In IPA fnsummary there are 4 dead calls, all of them .part clones.
> > This looks like a missed optimization pre ipa-fnsplit.
> >
> > Those are the most frequent ones:
> >      12 skipping unnecesary stmt Evaluator<RemoteMultiPatchEvaluatorTag>::Evaluator (&evaluator);
> >      12 skipping unnecesary stmt Pooma::DummyMutex::lock (_1);
> >      16 skipping unnecesary stmt ForEach<Scalar<double>, DomainFunctorTag, DomainFunctorTag>::apply (_2, f_7(D), c_8(D));
> >      23 skipping unnecesary stmt operator delete (_5, _2);
> >      24 skipping unnecesary stmt Evaluator<RemoteMultiPatchEvaluatorTag>::~Evaluator (&evaluator);
> >      28 skipping unnecesary stmt ForEach<Scalar<double>, DomainFunctorTag, DomainFunctorTag>::apply (_1, f_6(D), c_7(D));
> >      37 skipping unnecesary stmt _37 = Field<UniformRectilinearMesh<MeshTraits<3, double, UniformRectilinearTag, CartesianTag, 3> >, double, BrickView>::engine (_9);
> >      37 skipping unnecesary stmt _39 = engineFunctor<Engine<3, double, BrickView>, DataObjectRequest<BlockAffinity> > (_10, &getAffinity);
> >      43 skipping unnecesary stmt DataObjectRequest<BlockAffinity>::DataObjectRequest (&getAffinity);
> >      43 skipping unnecesary stmt MultiArgEvaluator<SinglePatchEvaluatorTag>::MultiArgEvaluator (&speval);
> >      43 skipping unnecesary stmt Smarts::Iterate<Smarts::Stub>::hintAffinity (_8, _11);
> >      86 skipping unnecesary stmt MultiArgEvaluator<SinglePatchEvaluatorTag>::~MultiArgEvaluator (&speval);
> >     227 skipping unnecesary stmt PoomaCTAssert<true>::test ();
> >
> > Mostly those are functions which are empty at release_ssa time. Perhaps
> > we could special case pure/const calls with no LHS and get rid of them
> > during early inline. This is cheaper than doing an actual inline, which
> > triggers a lot of logic in tree-inline...
>
> That's an interesting idea. Note the original thinking was that CFG cleanup
> gets rid of most dead code (from C++ templates). forwprop has a very simple
> "CCP" and "DCE" as part of its RPO walk with lattice and folding; CCP can
> be somewhat expensive due to it using the SSA propagator, DCE can be
> expensive due to its weak DSE support.

I remember doing some experiments with this when the SSA inliner was new
(approx. 2005?). At that time we were not able to DCE away function calls
(we did not detect const/pure functions at that stage), so most benefits
were derived from CCP on constant conditionals, which in turn eliminated
some calls.

We do not have alias info built yet, so any pass run here would need to
live with that, which would also make it cheaper. I think dominator opts
were intended as a cheap cleanup pass for such situations...

There isn't much reason why my "mini-dce" would not also include a sweep
phase, except that it is a bit odd for the early inliner to remove random
statements. Also I wonder how often people use a call to an empty function
and expect to be able to set a breakpoint in it.

Honza
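
P.S. To make the mark/sweep distinction concrete, here is a toy model in Python, not GCC code. It assumes a single straight-line block with no control flow, treats every pure/const call as removable (ignoring looping pure/const functions), and uses no alias info. The `Stmt` class, its fields and the `compute` callee are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Stmt:
    text: str                       # printable form, used only for display
    lhs: Optional[str] = None       # SSA name defined by the statement, if any
    uses: Tuple[str, ...] = ()      # SSA names read by the statement
    is_call: bool = False
    pure_or_const: bool = False     # callee assumed free of side effects
    has_side_effects: bool = False  # stores, returns, volatile accesses, ...


def inherently_necessary(s: Stmt) -> bool:
    # A call is a root unless the callee is known pure/const;
    # any other statement is a root only if it has side effects.
    if s.is_call:
        return not s.pure_or_const
    return s.has_side_effects


def mark(stmts: List[Stmt]) -> Set[int]:
    """Mark phase: indices of statements reachable from the roots via uses."""
    def_of = {s.lhs: i for i, s in enumerate(stmts) if s.lhs is not None}
    necessary: Set[int] = set()
    worklist = [i for i, s in enumerate(stmts) if inherently_necessary(s)]
    while worklist:
        i = worklist.pop()
        if i in necessary:
            continue
        necessary.add(i)
        for name in stmts[i].uses:
            d = def_of.get(name)    # None for parameters such as x_3(D)
            if d is not None and d not in necessary:
                worklist.append(d)
    return necessary


def sweep(stmts: List[Stmt], necessary: Set[int]) -> List[Stmt]:
    """Sweep phase: drop everything the mark phase did not reach."""
    return [s for i, s in enumerate(stmts) if i in necessary]


body = [
    Stmt("_1 = &mutex", lhs="_1"),
    Stmt("Pooma::DummyMutex::lock (_1)", uses=("_1",),
         is_call=True, pure_or_const=True),
    Stmt("PoomaCTAssert<true>::test ()", is_call=True, pure_or_const=True),
    Stmt("_2 = compute (x_3(D))", lhs="_2", uses=("x_3(D)",),
         is_call=True, pure_or_const=True),
    Stmt("return _2", uses=("_2",), has_side_effects=True),
]

necessary = mark(body)
for s in sweep(body, necessary):
    print(s.text)
```

The mark phase alone is enough to produce the dead/live counts; the sweep is the step that actually deletes statements. Note that once the call with no LHS is dead, the statement that only fed its argument becomes dead as well.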