Mark Knecht posted on Sat, 22 Jun 2013 18:48:15 -0700 as excerpted:

> Duncan,

Again, following up now that it's my "weekend" and I have a chance...

> Actually, using your idea of piping things to /dev/null it appears
> that the random number generator itself is only capable of 15MB/S on my
> machine.  It doesn't change much based on block size or number of bytes
> I pipe.

=:^(

Well, you tried.

> If this speed is representative of how well that works then I think
> I have to use a file. It appears this guy gets similar values:
> 
> http://www.globallinuxsecurity.pro/quickly-fill-a-disk-with-random-bits-without-dev-urandom/

Wow, that's a very nice idea he has there!  I'll have to remember that!  
The same idea should work for creating any relatively large random file, 
regardless of final use.  Just cryptsetup the thing and dd /dev/zero 
into it.
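
Something like this ought to do it, I'd think (untested sketch, with 
/dev/sdX standing in for whatever the real target is; plain-mode crypt 
with a throwaway key straight from urandom, so there's nothing to save):

# map the device with a one-shot random key, never written anywhere
cryptsetup open --type plain -d /dev/urandom /dev/sdX wipeme

# zeros in, ciphertext (effectively random data) out to the real device
dd if=/dev/zero of=/dev/mapper/wipeme bs=$((1024*1024))

cryptsetup close wipeme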

FWIW, you're doing better than my system does, however.  I seem to run 
about 13 MB/s from /dev/urandom (up to 13.7 depending on block size).  And 
back to the random vs urandom discussion, random totally blocked here 
after a few dozen bytes, waiting for more random data to be generated.  
So the fact that you actually got a usefully sized file out of it does 
indicate that you must have a hardware random number generator and that 
it's apparently 
working well.
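
If you're curious what's feeding the pool on a given box, the kernel 
exposes it (standard /proc and /sys locations; the hw_random node only 
shows up at all if the kernel found a hardware RNG):

cat /proc/sys/kernel/random/entropy_avail
cat /sys/class/misc/hw_random/rng_available
cat /sys/class/misc/hw_random/rng_current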

> On the other hand, piping /dev/zero appears to be very fast -
> basically the speed of the processor I think:
> 
> $ dd if=/dev/zero of=/dev/null bs=4096 count=$[1000]
> 1000+0 records in
> 1000+0 records out
> 4096000 bytes (4.1 MB) copied, 0.000622594 s, 6.6 GB/s

What's most interesting to me when I tried that here is that unlike 
urandom, zero's output varies DRAMATICALLY by blocksize.  With
bs=$((1024*1024)) (aka 1MB), I get 14.3 GB/s, tho at the default bs=512, 
I get only 1.2 GB/s.  (Trying a few more values: 1024*64 is down at 
13.2 GB/s, 1024*128 gives 13.9, 1024*256 gives 14.1, and 1024*512 a very 
similar 14.5 GB/s, while on the high side 1024*1024*2 is already back 
down to 10.2 GB/s.  So a quarter MB to one MB seems the ideal range, on 
my hardware.)
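
For anyone wanting to repeat that sweep without all the retyping, a quick 
loop does it; the sizes below are just examples, each pass piping ~8 GiB 
so it stays fast:

for bs in 512 $((64*1024)) $((256*1024)) $((1024*1024)) $((4*1024*1024)); do
  echo "bs=$bs"
  # dd reports its throughput on stderr, so grab just that last line
  dd if=/dev/zero of=/dev/null bs=$bs count=$((8*1024*1024*1024/bs)) 2>&1 | tail -n1
done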

But of course, if your device is compressible-data speed-sensitive, as 
are say the sandforce-controller-based ssds, /dev/zero isn't going to 
give you anything like the real-world benchmark that random data would 
(tho it should be a great best-case compressible-data test).  It's 
unlikely to matter on most spinning rust, AFAIK, or on SSDs like my 
Corsair Neutrons (Link_A_Media/LAMD-based controller), which list being 
data-compression agnostic as a bullet-point feature, unlike the sandforce-
based SSDs.

Since /dev/zero is so fast, I'd probably do a few initial tests to 
determine whether compressible data makes a difference on what you're 
testing, then use /dev/zero if it doesn't appear to, to get a reasonable 
base config, then finally double-check that against random data again.
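
That first check could be as simple as timing a zeros write against a 
random write on the same target (device path hypothetical, obviously only 
on something you can scribble over; conv=fdatasync so the flush is 
included in the reported rate):

# best case: maximally compressible data
dd if=/dev/zero of=/dev/test/target bs=$((1024*1024)) count=4096 conv=fdatasync

# same size again, but incompressible (the urandom-sourced file I create below)
dd if=/tmp/10gig.testfile of=/dev/test/target bs=$((1024*1024)) count=4096 conv=fdatasync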

Meanwhile, here's another idea for random data, seeing as /dev/urandom is 
speed limited.  Up to your memory constraints anyway, you should be able 
to dd if=/dev/urandom of=/some/file/on/tmpfs .  Then you can
dd if=/tmpfs/file of=/dev/test/target, or if you want a bigger file than 
a direct tmpfs file will let you use, try something like this:

cat /tmpfs/file /tmpfs/file /tmpfs/file | dd of=/dev/test/target

... which would give you 3X the data size of /tmpfs/file.
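
Or, same idea without pasting the name N times, if a bigger multiple is 
wanted (5x here, just as an example):

for i in $(seq 1 5); do cat /tmpfs/file; done | dd of=/dev/test/target bs=$((1024*1024))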

(Man, testing that with a 10 GB tmpfs file (on a 12 GB tmpfs /tmp), I can 
see how slow that 13 MB/s /dev/urandom actually is as I'm creating 
it! OUCH!  I waited awhile before I started typing this comment... I've 
been typing slowly and looking at the usage graph as I type, and I'm 
still only at maybe 8 gigs, depending on where my cache usage was when I 
started, right now!)

cd /tmp

dd if=/dev/urandom of=/tmp/10gig.testfile bs=$((1024*1024)) count=10240

(10240 records, 10737418240 bytes, but it says 11 GB copied; I guess dd 
uses 10^3 multipliers.  Anyway, ~783 s, 13.7 MB/s.)

ls -l 10gig.testfile

(confirm the size, 10737418240 bytes)

cat 10gig.testfile 10gig.testfile 10gig.testfile \
10gig.testfile 10gig.testfile | dd of=/dev/null

(that's 5x, yielding 50 GiB (power-of-2 GB), 104857600+0 records, 
53687091200 bytes, ~140 s, 385 MB/s at the default 512-byte blocksize)

Wow, what a difference block size makes there, too!  Trying the above cat/
dd with bs=$((1024*1024)) (1MB) yields ~30s, 1.8 GB/s!

1GB block size (1024*1024*1024) yields about the same, 30s, 1.8 GB/s.

LOL dd didn't like my idea to try a 10 GB buffer size!

dd: memory exhausted by input buffer of size 10737418240 bytes (10 GiB)

(No wonder, as that'd be 10GB in tmpfs/cache and a 10GB buffer, and I'm 
/only/ running 16 gigs RAM and no swap!  But it won't take 2 GB either.  
Checking, looks like as my normal user I'm running a ulimit of 1-gig 
memory size, 2-gig virtual-size, so I'm sort of surprised it took the 1GB 
buffer... maybe that counts against virtual only or something?)
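
The limits in question are easy enough to check with the bash builtins 
(values reported in KiB):

ulimit -m    # max resident set size
ulimit -v    # max virtual memory size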

Low side again: ~90s, 599 MB/s at 1KB (1024-byte) bs, already a dramatic 
improvement from the 140s / 385 MB/s of the default 512-byte block.

2KB bs yields 52s, 1 GB/s

16KB bs yields 31s, 1.7 GB/s, near optimum already.

High side again, 1024*1024*4 (4MB) bs appears to be best-case, just under 
29s, 1.9 GB/s.  Going to 8MB takes another second, back to 1.8 GB/s, so 
4MB looks like the peak performance point on this hardware.

FWIW, cat seems to run at just over one core's worth of CPU while dd 
seems to run just under, at 97% or so.

Running two instances in parallel (using the peak 4MB block size, 1.9 GB/
s with a single run) seems to cut performance some, but not nearly in 
half.  (I got 1.5 GB/s and 1.6 GB/s, but I started one then switched to a 
different terminal to start the other, so they only overlapped for maybe 
30s or so of the ~35s each run took.)
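
Next time I'd launch both from the same shell so they fully overlap, 
something like:

cat 10gig.testfile 10gig.testfile 10gig.testfile \
    10gig.testfile 10gig.testfile | dd of=/dev/null bs=$((4*1024*1024)) &
cat 10gig.testfile 10gig.testfile 10gig.testfile \
    10gig.testfile 10gig.testfile | dd of=/dev/null bs=$((4*1024*1024)) &
wait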

OK, so that's all memory/cpu since neither end is actual storage, but 
that does give me a reasonable base against which to benchmark actual 
storage (rust or ssd), if I wished.

What's interesting is that by, I guess pure coincidence, my 385 MB/s 
original 512-byte blocksize figure is reasonably close to what the SSD 
read benchmarks are with hdparm.  IIRC the hdparm/ssd numbers were 
somewhat higher, but not by much (470 MB/sec I just tested).  But the bus 
speed 
maxes out not /too/ far above that (500-600 MB/sec, theoretically 600 MB/
sec on SATA-600, but real world obviously won't /quite/ hit that, IIRC 
best numbers I've seen anywhere are 585 or so).
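
For reference, the buffered device-read timing there would be something 
like this, device name standing in for whichever one the ssd actually is:

hdparm -t /dev/sdX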

So now I guess I send this and do some more testing of real device, now 
that you've provoked my curiosity and I have the 50 GB (mostly) 
pseudorandom file sitting in tmpfs already.  Maybe I'll post those 
results later.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

