On 01/02/2018 04:52 PM, Grazvydas Ignotas wrote: > On Tue, Jan 2, 2018 at 11:38 PM, Ian Romanick <i...@freedesktop.org> wrote: >> On 12/28/2017 05:56 PM, Grazvydas Ignotas wrote: >>> zlib provides a faster slice-by-4 CRC32 implementation than the >>> traditional single byte lookup one used by mesa. As most supported >>> platforms now link zlib unconditionally, we can easily use it. >>> For small buffers the old implementation is still used as it's faster >>> with cold cache (first call), as indicated by some throughput >>> benchmarking (avg MB/s, n=100, zlib 1.2.8): >>> >>> i5-6600K C2D E4500 >>> size mesa zlib mesa zlib >>> 4 66 43 -35% +/- 4.8% 43 22 -49% +/- 9.6% >>> 32 193 171 -11% +/- 5.8% 129 49 -61% +/- 7.2% >>> 64 256 267 4% +/- 4.1% 171 63 -63% +/- 5.4% >>> 128 317 389 22% +/- 5.8% 253 89 -64% +/- 4.2% >>> 256 364 596 63% +/- 5.6% 304 166 -45% +/- 2.8% >>> 512 401 838 108% +/- 5.3% 338 296 -12% +/- 3.1% >>> 1024 420 1036 146% +/- 7.6% 375 461 23% +/- 3.7% >>> 1M 443 1443 225% +/- 2.1% 403 1175 191% +/- 0.9% >>> 100M 448 1452 224% +/- 0.3% 406 1214 198% +/- 0.3% >>> >>> With hot cache (repeated calls) zlib almost always wins on both CPUS. >>> It has been verified the calculation results stay the same after this >>> change. >>> >>> Signed-off-by: Grazvydas Ignotas <nota...@gmail.com> >>> --- >>> src/util/crc32.c | 13 +++++++++++++ >>> 1 file changed, 13 insertions(+) >>> >>> diff --git a/src/util/crc32.c b/src/util/crc32.c >>> index f2e01c6..0cffa49 100644 >>> --- a/src/util/crc32.c >>> +++ b/src/util/crc32.c >>> @@ -31,12 +31,20 @@ >>> * >>> * @author Jose Fonseca >>> */ >>> >>> >>> +#ifdef HAVE_ZLIB >>> +#include <zlib.h> >>> +#endif >>> #include "crc32.h" >>> >>> +/* For small buffers it's faster to avoid the library call. >>> + * The optimal threshold depends on CPU characteristics, it is hoped >>> + * the choice below is reasonable for typical modern CPU. >>> + */ >>> +#define ZLIB_SIZE_THRESHOLD 64 >> >> For the actual users of this function in Mesa, is it even possible to >> pass less than 64 bytes (I'm assuming that's the units)? > > Hmm why wouldn't it be? The unit is a byte, and you can compute CRC32 > of a single byte.
It can be done, but that's not my question. My question is: would any of the existing users actually do this? I thought the main user was the various disk cache / GetProgramBinary kind of things. You won't have a shader binary that's a single byte... less than 64 also seems unlikely. Unless there are other users? >>> >>> static const uint32_t >>> util_crc32_table[256] = { >>> 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, >>> 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3, >>> @@ -112,10 +120,15 @@ uint32_t >>> util_hash_crc32(const void *data, size_t size) >>> { >>> const uint8_t *p = data; >>> uint32_t crc = 0xffffffff; >>> >>> +#ifdef HAVE_ZLIB >>> + if (size >= ZLIB_SIZE_THRESHOLD && (uInt)size == size) >> ^^^^^^^^^^^^^^^^^^ >> This comparison is always true (unless sizeof(uInt) != sizeof(size_t)?). >> I'm not 100% sure what you're trying to accomplish here, but I think >> you want 'size > 0'. I'm not familiar with this zlib function, so I >> don't know what it's expectations for size are. > > zlib's uInt is always 32bit while size_t is 64bit on most (all?) 64bit > architectures, so if someone decides to CRC32 >= 4GiB buffer, this > function will do the wrong thing without such check. Newer zlib has > crc32_z that takes size_t, but I was trying to avoid build system > complications of detecting that function... Ok. That makes sense. I'll bet this adds a warning about tautological compares only on 32-bit, then. I was going to suggest comparing with 0xffffffff instead, but GCC emits worse code for that... and it probably still results in the warning on 32-bit. Maybe just a comment (based on your reply) that describes what's happening for the next person that reads the code. > GraÅžvydas _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev