Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-25 Thread Pádraig Brady
On 24/10/15 03:24, Pádraig Brady wrote:
> On 23/10/15 12:15, Pádraig Brady wrote:
>> On 22/10/15 20:47, Paolo Bonzini wrote:
>>>
>>> On 22/10/2015 19:39, Radim Krčmář wrote:
>>>> 2015-10-22 18:14+0200, Paolo Bonzini:
>>>>> On 22/10/2015 18:02, Eric Blake wrote:
>>>>>> I see a bug in there:

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-23 Thread Pádraig Brady
On 23/10/15 12:15, Pádraig Brady wrote:
> On 22/10/15 20:47, Paolo Bonzini wrote:
>>
>> On 22/10/2015 19:39, Radim Krčmář wrote:
>>> 2015-10-22 18:14+0200, Paolo Bonzini:
>>>> On 22/10/2015 18:02, Eric Blake wrote:
>>>>> I see a bug in there:
>>>>
>>>> Of course. You shouldn't have told me what

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-23 Thread Pádraig Brady
On 22/10/15 20:47, Paolo Bonzini wrote:
>
> On 22/10/2015 19:39, Radim Krčmář wrote:
>> 2015-10-22 18:14+0200, Paolo Bonzini:
>>> On 22/10/2015 18:02, Eric Blake wrote:
>>>> I see a bug in there:
>>>
>>> Of course. You shouldn't have told me what the bug was, I deserved
>>> to look for it myse

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-23 Thread Paolo Bonzini
On 23/10/2015 13:12, Pádraig Brady wrote:
> On 22/10/15 20:47, Paolo Bonzini wrote:
>>
>> On 22/10/2015 19:39, Radim Krčmář wrote:
>>> 2015-10-22 18:14+0200, Paolo Bonzini:
>>>> On 22/10/2015 18:02, Eric Blake wrote:
>>>>> I see a bug in there:
>>>>
>>>> Of course. You shouldn't have told me

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-23 Thread Bernhard Voelker
On 10/22/2015 05:55 PM, Eric Blake wrote:
On 10/22/2015 09:47 AM, Bernhard Voelker wrote:
Also I suspect the extra conditions involved in using longs
for just the first 16 bytes would outweigh the benefits?
I.E. the first simple loop probably breaks early, and if not
has the added benefit of "p

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-22 Thread Paolo Bonzini
On 22/10/2015 19:39, Radim Krčmář wrote:
> 2015-10-22 18:14+0200, Paolo Bonzini:
>> On 22/10/2015 18:02, Eric Blake wrote:
>>> I see a bug in there:
>>
>> Of course. You shouldn't have told me what the bug was, I deserved
>> to look for it myself. :)
>
> It rather seems that you don't want spoi

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-22 Thread Radim Krčmář
2015-10-22 18:14+0200, Paolo Bonzini:
> On 22/10/2015 18:02, Eric Blake wrote:
>> I see a bug in there:
>
> Of course. You shouldn't have told me what the bug was, I deserved
> to look for it myself. :)
It rather seems that you don't want spoilers, :)
I see two bugs now.
> bool memeqzero4_paol

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-22 Thread Paolo Bonzini
On 22/10/2015 18:02, Eric Blake wrote:
> On 10/22/2015 09:31 AM, Paolo Bonzini wrote:
>
>> Only if your machine cannot do unaligned loads. If it can, you can
>> align the length instead of the buffer. memcmp will take care of
>> aligning the buffer (with some luck it won't have to, e.g. if buf

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-22 Thread Eric Blake
On 10/22/2015 09:31 AM, Paolo Bonzini wrote:
> Only if your machine cannot do unaligned loads. If it can, you can
> align the length instead of the buffer. memcmp will take care of
> aligning the buffer (with some luck it won't have to, e.g. if buf is
> 0x12340002 and length = 4094). On x86 una
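The "align the length instead of the buffer" idea quoted above might look roughly like the following. This is only a sketch under assumptions: the function name `is_zero_len_aligned` is hypothetical, and it is not the code actually posted in the thread (which is truncated here). It checks a short byte-wise head so that the remaining length is a whole number of `unsigned long` words, checks one word through `memcpy` (safe for unaligned addresses), and leaves the rest to an overlapping `memcmp`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the code from the thread. */
bool is_zero_len_aligned(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    const size_t W = sizeof(unsigned long);
    size_t head = len % W;   /* bytes to peel off so len becomes a multiple of W */
    unsigned long w;

    assert(buf != NULL || len == 0);

    /* Byte-wise check of the head; this rounds the length, not the pointer. */
    for (size_t i = 0; i < head; i++)
        if (p[i])
            return false;
    p += head;
    len -= head;
    if (len == 0)
        return true;

    /* Load one word via memcpy, which is valid even if p is unaligned. */
    memcpy(&w, p, W);
    if (w)
        return false;

    /* The first W bytes are zero, so the rest is zero exactly when each
       byte equals the byte W positions before it. */
    return memcmp(p, p + W, len - W) == 0;
}
```

With the values from the quoted example (buf = 0x12340002, length = 4094) and an 8-byte `unsigned long`, the head is 6 bytes, so the word-sized part happens to start at the aligned address 0x12340008, which is the "with some luck" case mentioned above.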

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-22 Thread Eric Blake
[adding qemu]
On 10/22/2015 08:00 AM, Pádraig Brady wrote:
> * src/system.h (is_nul): Reimplement with a version
> that doesn't require a sentinel after the buffer,
> and which calls down to (the system optimized) memcmp.
> Performance analyzed at http://rusty.ozlabs.org/?p=560
> /* Return wheth
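For readers following the index, here is a minimal sketch of the technique the commit message describes: no sentinel byte after the buffer, a manual check of the first 16 bytes, then a call down to `memcmp`. The loop follows the patch fragment quoted elsewhere in this thread. The final `memcmp` line is an assumption based on the commit message, since the snippet is truncated before it. The name `is_nul_sketch` is hypothetical and this is not the actual `src/system.h` definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name; a sketch, not the coreutils implementation. */
bool is_nul_sketch(const void *buf, size_t bufsize)
{
    const unsigned char *p = buf;
    size_t len;

    assert(buf != NULL || bufsize == 0);

    /* Check first 16 bytes manually; most non-zero blocks exit here. */
    for (len = 0; len < 16; len++) {
        if (!bufsize)
            return true;
        if (*p)
            return false;
        p++;
        bufsize--;
    }

    /* Assumption: compare the buffer with itself shifted by 16 bytes.
       The first 16 bytes are known to be zero, so equality here means
       every remaining byte is zero as well. */
    return memcmp(buf, p, bufsize) == 0;
}
```

The overlapping comparison never reads past the end of the buffer, which is why no sentinel is needed: the last byte read is at offset 16 + (bufsize - 16) - 1, the final byte of the original range.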

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-22 Thread Bernhard Voelker
On 10/22/2015 05:17 PM, Pádraig Brady wrote:
On 22/10/15 15:44, Paolo Bonzini wrote:
On 22/10/2015 16:37, Eric Blake wrote:
+  /* Check first 16 bytes manually. */
+  for (len = 0; len < 16; len++)
+    {
+      if (! bufsize)
+        return true;
+      if (*p)
+        return false;
+

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-22 Thread Eric Blake
On 10/22/2015 09:47 AM, Bernhard Voelker wrote:
>> Also I suspect the extra conditions involved in using longs
>> for just the first 16 bytes would outweigh the benefits?
>> I.E. the first simple loop probably breaks early, and if not
>> has the added benefit of "priming the pumps" for the subsequ

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-22 Thread Paolo Bonzini
On 22/10/2015 17:47, Bernhard Voelker wrote:
>> Note the above does break early if non zero detected in first 16 bytes.
>>
>> Also I suspect the extra conditions involved in using longs
>> for just the first 16 bytes would outweigh the benefits?
>> I.E. the first simple loop probably breaks early

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-22 Thread Pádraig Brady
On 22/10/15 15:44, Paolo Bonzini wrote:
>
> On 22/10/2015 16:37, Eric Blake wrote:
+  /* Check first 16 bytes manually. */
+  for (len = 0; len < 16; len++)
+    {
+      if (! bufsize)
+        return true;
+      if (*p)
+        return false;
+      p

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-22 Thread Paolo Bonzini
On 22/10/2015 17:17, Pádraig Brady wrote:
>> > Nice trick indeed. On the other hand, the first 16 bytes are enough to
>> > rule out 99.99% (number out of thin hair) of the non-zero blocks, so
>> > that's where you want to optimize. Checking them an unsigned long at a
>> > time, or fetching a fe

Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection

2015-10-22 Thread Paolo Bonzini
On 22/10/2015 16:37, Eric Blake wrote:
>> > +  /* Check first 16 bytes manually. */
>> > +  for (len = 0; len < 16; len++)
>> > +    {
>> > +      if (! bufsize)
>> > +        return true;
>> > +      if (*p)
>> > +        return false;
>> > +      p++;
>> > +      bufsize--;
>> > +    }
>> > +