>> > These numbers are useful to try and ensure the overhead (scaling factor)
>> > is reasonable, thanks.
>>
>> A nice improvement indeed. The patched result is 15 times faster
>> than the serial unpatched run. So there is room for improvement
>
> Note, the box used was oldish AMD 16-core, no ht, box, haven't tried it on
> anything

On a 32-core box, no ht, I see these timings:

time make -j32 -k check >& log.check32 ; time make -j8 -k check >& log.check8

real    18m14.562s
user    260m21.578s
sys     264m26.042s

real    41m33.210s
user    233m4.563s
sys     72m11.429s

So it is not quite reaching the ideal 4x speedup: the -j32 run is only about
2.3x faster than the -j8 run in wall-clock time. Counting the number of
'expect' processes, they stay nicely at around 32 and 8 for the full test,
with only a very short tail near the end. So there might be some overhead
somewhere. Total user time is similar, but time in sys goes up, from about
72 minutes to about 264 minutes.
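
As a rough cross-check of the timings above, here is a small Python sketch that converts the `time` output to seconds and computes the ratios. It is not part of the patch or the testsuite; the helper name `to_seconds` and the variable names are made up for illustration, and the only inputs are the six values reported for the -j32 and -j8 runs.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '18m14.562s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Timings copied from the two runs reported in this mail, keyed by -j value.
runs = {
    32: {"real": "18m14.562s", "user": "260m21.578s", "sys": "264m26.042s"},
    8: {"real": "41m33.210s", "user": "233m4.563s", "sys": "72m11.429s"},
}
secs = {
    jobs: {kind: to_seconds(value) for kind, value in times.items()}
    for jobs, times in runs.items()
}

# Wall-clock speedup of -j32 over -j8, against the ideal 32/8 = 4x.
speedup = secs[8]["real"] / secs[32]["real"]
ideal = 32 / 8
fraction_of_ideal = speedup / ideal

# How much CPU time grew when going from -j8 to -j32.
user_ratio = secs[32]["user"] / secs[8]["user"]
sys_ratio = secs[32]["sys"] / secs[8]["sys"]

print(f"wall-clock speedup -j8 to -j32: {speedup:.2f}x (ideal {ideal:.0f}x)")
print(f"fraction of ideal speedup:      {fraction_of_ideal:.0%}")
print(f"user time ratio (-j32 / -j8):   {user_ratio:.2f}")
print(f"sys time ratio  (-j32 / -j8):   {sys_ratio:.2f}")
```

The ratios are consistent with the observation above: user time grows only slightly between the two runs, while sys time grows by more than a factor of three, so whatever overhead there is appears to be on the kernel side.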