https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701
Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenther at suse dot de,
                   |                            |vmakarov at redhat dot com

--- Comment #10 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
This is on clean mainline and a bdver1 machine. With the patch reverted, the
runtime is:

real    0m50.714s
user    0m50.402s
sys     0m0.356s

and now with different inliner settings:

(talos4)$ sh compile
real    1m4.636s
user    1m4.270s
sys     0m0.420s

(talos4)$ sh compile --param large-function-insns=1000
real    0m51.063s
user    0m50.742s
sys     0m0.364s

(talos4)$ sh compile --param large-function-insns=100000 --param large-stack-frame=100000
real    1m1.369s
user    1m1.012s
sys     0m0.407s

(talos4)$ sh compile -fno-tree-vectorize
real    1m0.629s
user    1m0.299s
sys     0m0.381s

(talos4)$ sh compile -fno-tree-vectorize --param large-function-insns=1000
real    0m53.375s
user    0m53.053s
sys     0m0.367s

(talos4)$ sh compile -fno-tree-vectorize --param large-function-insns=100000 --param large-stack-frame=100000
real    0m55.131s
user    0m54.826s
sys     0m0.351s

--param large-function-insns=1000 is thus the winner, but apparently by
accident. It seems that the tree vectorizer actually makes the code slower
when more inlining and SRA happen. Richard, perhaps with your
vect-costmodel-fu you can take a look? It may also be just an RA issue, but I
do not see particularly many spills in the internal loops.

To completely flatten the whole benchmark, one also needs to bump up
max-inline-insns-auto. This seems to further degrade performance both with
and without the vectorizer, so it may also be just a register pressure and
IRA issue.

Richard, since this is the second time we have run into large-function-insns
being beneficial, I wonder if you can patch frescobaldi or czerny (so we have
the C++ benchmarks and LTO SPEC covered) with a changed parameter value? The
current value was never really tuned; it is quite possibly just too large.

I will see if I can get anything useful out of firefox benchmarks.
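
To make the comparison above easier to read, here is a minimal sketch that turns the quoted "real" times into slowdowns relative to the patch-reverted run. The numbers are copied from the runs above; the label "(no extra flags)" and the helper names slowdown_percent and ranked are mine, purely for illustration, and this is single-run data, so small differences may well be noise.

```python
# Wall-clock ("real") times in seconds, copied from the runs quoted above.
# BASELINE is the run with the patch reverted.
BASELINE = 50.714

# Keys are the extra flags passed to "sh compile"; "(no extra flags)" is an
# illustrative label for the plain "sh compile" run.
RUNS = {
    "(no extra flags)": 64.636,
    "--param large-function-insns=1000": 51.063,
    "--param large-function-insns=100000 --param large-stack-frame=100000": 61.369,
    "-fno-tree-vectorize": 60.629,
    "-fno-tree-vectorize --param large-function-insns=1000": 53.375,
    "-fno-tree-vectorize --param large-function-insns=100000 --param large-stack-frame=100000": 55.131,
}


def slowdown_percent(seconds, baseline=BASELINE):
    """Return how much slower a run is than the baseline, in percent."""
    return (seconds / baseline - 1.0) * 100.0


def ranked(runs=RUNS):
    """Return (flags, seconds) pairs sorted from fastest to slowest."""
    return sorted(runs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for flags, seconds in ranked():
        print(f"{seconds:7.3f}s  {slowdown_percent(seconds):+6.2f}%  {flags}")
```

Sorting this way puts --param large-function-insns=1000 first in both the vectorized and the -fno-tree-vectorize group, which matches the observation above.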