Hans Petter Selasky wrote this message on Wed, May 13, 2015 at 10:35 +0200:
> On 05/13/15 10:27, David Chisnall wrote:
> > On 13 May 2015, at 09:03, John-Mark Gurney <j...@funkthat.com> wrote:
> >>
> >> Poul-Henning Kamp wrote this message on Tue, May 12, 2015 at 06:31 +0000:
> >>> --------
> >>> In message <20150512032307.gp37...@funkthat.com>, John-Mark Gurney writes:
> >>>
> >>>> Also, you'd probably see even better performance by increasing the
> >>>> size to 64k, [...]
> >>>
> >>> easy:
> >>>   8K on 32bit
> >>>   64k on 64bit
> >>
> >> Sounds good to me...  Just for people who care... I did a quick set of
> >> benchmarks on sha256.. This is using my preliminary patch to use sse4
> >> optimized sha256...  But this should be the same for others...
> >>
> >> The numbers in ministat output are the time in seconds it takes my
> >> 3.4GHz AMD A10-5700 APU running HEAD to process a 512MB file, where
> >> lower is better.  I've converted them to throughput (higher is better):
> >> BUFSIZ:    145MB/sec
> >> 8k:        193MB/sec
> >> 16k:       198MB/sec
> >> 64k:       202MB/sec
> >> 128k:      202MB/sec
> >> -t:        211MB/sec
> >
> > It looks like most of the benefit is gained at 16KB.  Did you try running 
> > the benchmark with something else running at the same time to see if there 
> > is any advantage in trashing the caches a bit less (simple case, what 
> > happens if you run two instances of the same benchmark at once)?
> >
> > I suspect that you're about right anyway - I recently did some tests 
> > while playing with JavaScript FFI generation with a multithreaded process 
> > JavaScript environment calling out to OpenSSL to do SHA calculations and 
> > having each of 8 threads reading in 128KB chunks gave the fastest 
> > performance (Core i7, 4 cores + hyperthreading), with only a negligible 
> > gain over 64KB.  In all cases, the JavaScript implementation was 
> > significantly faster than the openssl tool, which used 8KB buffers.
> 
> You should also try this using a USB disk. The performance numbers 
> depend heavily on the hardware's interrupt moderation values.

This shouldn't matter..  I wasn't flushing the buffer cache between
runs, so the data came entirely from the buffer cache...  What is being
measured here is purely syscall+copy overhead...  No matter what your
source is (NFS, USB disk), you'll always have this overhead...
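
For anyone who wants to see the syscall+copy effect for themselves, here
is a rough sketch of the kind of measurement discussed above. It is not
the benchmark that produced the numbers in this thread: it is written in
Python rather than C, the helper names (make_test_file, hash_file) are
made up for illustration, and it uses a small scratch file instead of a
512MB one. It hashes the same cached file with different read sizes and
reports how many read calls each size needs and the resulting throughput.

```python
import hashlib
import os
import tempfile
import time


def make_test_file(size):
    """Create a scratch file of `size` bytes and return its path (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\xa5" * size)
    return path


def hash_file(path, bufsize):
    """SHA-256 a file, asking for at most `bufsize` bytes per read.

    Returns (hex digest, number of reads that returned data, elapsed seconds).
    """
    digest = hashlib.sha256()
    reads = 0
    start = time.perf_counter()
    # buffering=0 returns an unbuffered file object, so each read() call
    # is passed straight to the OS instead of going through Python's own buffer.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(bufsize)
            if not chunk:
                break
            digest.update(chunk)
            reads += 1
    return digest.hexdigest(), reads, time.perf_counter() - start


if __name__ == "__main__":
    size = 64 * 1024 * 1024  # scratch file size; an arbitrary choice
    path = make_test_file(size)
    try:
        hash_file(path, 64 * 1024)  # warm-up pass so later runs hit the cache
        for kib in (1, 8, 16, 64, 128):
            _, reads, secs = hash_file(path, kib * 1024)
            mb_per_sec = size / (1024 * 1024) / secs
            print(f"{kib:4d}k: {reads:6d} reads, {mb_per_sec:8.1f} MB/sec")
    finally:
        os.unlink(path)
```

The read count drops in proportion to the buffer size, while the hashing
work stays the same, so whatever speedup shows up comes from fewer trips
through the kernel. Interpreter overhead will make the absolute numbers
differ from a C tool, so treat the output as illustrative only.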

-- 
  John-Mark Gurney                              Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
