Hans Petter Selasky wrote this message on Wed, May 13, 2015 at 10:35 +0200:
> On 05/13/15 10:27, David Chisnall wrote:
> > On 13 May 2015, at 09:03, John-Mark Gurney <j...@funkthat.com> wrote:
> >>
> >> Poul-Henning Kamp wrote this message on Tue, May 12, 2015 at 06:31 +0000:
> >>> --------
> >>> In message <20150512032307.gp37...@funkthat.com>, John-Mark Gurney writes:
> >>>
> >>>> Also, you'd probably see even better performance by increasing the
> >>>> size to 64k, [...]
> >>>
> >>> easy:
> >>>     8K on 32bit
> >>>     64k on 64bit
> >>
> >> Sounds good to me... Just for people who care... I did a quick set of
> >> benchmarks on sha256.. This is using my preliminary patch to use sse4
> >> optimized sha256... But this should be the same for others...
> >>
> >> The numbers in ministat output are the time in seconds it takes my
> >> 3.4GHz AMD A10-5700 APU running HEAD to process a 512MB file, so lower
> >> numbers are better.. I've processed them into easier to read format:
> >> BUFSIZ:  145MB/sec
> >> 8k:      193MB/sec
> >> 16k:     198MB/sec
> >> 64k:     202MB/sec
> >> 128k:    202MB/sec
> >> -t:      211MB/sec
> >
> > It looks like most of the benefit is gained at 16KB. Did you try running
> > the benchmark with something else running at the same time to see if there
> > is any advantage in trashing the caches a bit less (simple case, what
> > happens if you run two instances of the same benchmark at once)?
> >
> > I suspect that you're about right anyway - I recently did some tests
> > while playing with JavaScript FFI generation with a multithreaded process
> > JavaScript environment calling out to OpenSSL to do SHA calculations and
> > having each of 8 threads reading in 128KB chunks gave the fastest
> > performance (Core i7, 4 cores + hyperthreading), with only a negligible
> > gain over 64KB. In all cases, the JavaScript implementation was
> > significantly faster than the openssl tool, which used 8KB buffers.
>
> You should also try this using a USB disk. The performance numbers
> depend heavily on the hardware's interrupt moderation values.

This shouldn't matter... I wasn't flushing the buffer cache between runs,
so the reads were served entirely from the buffer cache. What is being
measured here is purely syscall+copy overhead. No matter what your source
is (NFS, USB disk), you'll always have this overhead.

-- 
  John-Mark Gurney                              Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
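
For anyone who wants to see the syscall+copy effect for themselves, here is a rough sketch in Python. It is not the patch or the benchmark used above. The function name sha256_file, the 8 MiB scratch file and the use of Python's hashlib are all assumptions made for illustration. It hashes the same cached file with several read sizes and counts the unbuffered read calls issued. Interpreter overhead means the absolute MB/sec figures will not match the ones quoted in this thread; only the trend across buffer sizes is meaningful.

```python
import hashlib
import os
import tempfile
import time


def sha256_file(path, bufsize):
    """Hash `path` with SHA-256 using unbuffered reads of `bufsize` bytes.

    Returns (hex digest, number of read calls issued, elapsed seconds).
    """
    h = hashlib.sha256()
    reads = 0
    start = time.perf_counter()
    # buffering=0 yields a raw file object, so each readinto() is a
    # single read of at most `bufsize` bytes with no extra buffering.
    with open(path, "rb", buffering=0) as f:
        view = memoryview(bytearray(bufsize))
        while True:
            n = f.readinto(view)
            reads += 1  # the final read that reports EOF is counted too
            if not n:
                break
            h.update(view[:n])
    return h.hexdigest(), reads, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical scratch file. It has just been written, so it should
    # still be in the buffer cache, as in the runs described above.
    size = 8 * 1024 * 1024
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as out:
            out.write(os.urandom(size))
        for kib in (8, 16, 64, 128):
            digest, reads, secs = sha256_file(path, kib * 1024)
            rate = size / max(secs, 1e-9) / 1e6
            print(f"{kib:>4}k: {reads:>5} reads, {rate:8.1f} MB/sec")
    finally:
        os.remove(path)
```

The digest is the same for every buffer size; only the number of reads, and with it the per-call overhead, changes.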