https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53957
--- Comment #23 from rguenther at suse dot de <rguenther at suse dot de> ---
On Sun, 28 Jun 2020, prop_design at protonmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53957
>
> --- Comment #22 from Anthony <prop_design at protonmail dot com> ---
> (In reply to Thomas Koenig from comment #21)
> > Another question: Is there anything left to be done with the
> > vectorizer, or could we remove that dependency?
>
> Thanks for looking into this again for me. I'm surprised it worked the same
> on Linux, but knowing that at least helps debug this issue some more. I'm
> not sure about the vectorizer question; maybe it was intended for someone
> else. The runtimes seem good as is, though. I doubt auto-parallelization
> will add much speed, but it's an interesting feature that I've always hoped
> would work. I've never gotten it to work, though. The only compiler that
> actually did anything was Intel Fortran: it parallelized one trivial loop,
> but that slowed the code down instead of speeding it up. The output from
> gfortran shows more loops it wants to run in parallel. They aren't
> important ones, but something would be better than nothing. If it slowed
> the code down, I would just not use it.

GCC adds runtime checks for a minimal number of iterations before dispatching
to the parallelized code - I guess we simply never hit that threshold. The
threshold is configurable via --param parloops-min-per-thread (the default is
100), and the default number of threads is determined the same way as for
OpenMP, so you can probably tune it via OMP_NUM_THREADS.
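For reference, a minimal sketch of how one might experiment with these knobs,
assuming a gfortran recent enough to accept --param parloops-min-per-thread.
The program, array size, flag values, and thread count below are illustrative
assumptions, not taken from this report:

    ! paradd.f90 - illustrative only; loop is simple enough for the
    ! auto-parallelizer, and the iteration count is chosen to be large.
    program paradd
      implicit none
      integer, parameter :: n = 10000000
      real, allocatable :: a(:), b(:)
      integer :: i
      allocate(a(n), b(n))
      b = 1.0
      ! Parallelizable loop; at runtime it is only dispatched to the
      ! threaded version if the iteration count passes the parloops check.
      do i = 1, n
         a(i) = 2.0 * b(i) + real(i)
      end do
      print *, sum(a)
    end program paradd

    # Example build/run commands (values are examples, not recommendations):
    gfortran -O2 -ftree-parallelize-loops=4 \
             --param parloops-min-per-thread=25 paradd.f90 -o paradd
    OMP_NUM_THREADS=4 ./paradd

Lowering parloops-min-per-thread makes the runtime check pass for smaller
trip counts, which may help confirm whether the threshold is why the
parallel paths never trigger in the original code.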