On Tue, Jan 2, 2018 at 11:38 PM, Ian Romanick <[email protected]> wrote:
> On 12/28/2017 05:56 PM, Grazvydas Ignotas wrote:
>> zlib provides a faster slice-by-4 CRC32 implementation than the
>> traditional single byte lookup one used by mesa. As most supported
>> platforms now link zlib unconditionally, we can easily use it.
>> For small buffers the old implementation is still used as it's faster
>> with cold cache (first call), as indicated by some throughput
>> benchmarking (avg MB/s, n=100, zlib 1.2.8):
>>
>> i5-6600K C2D E4500
>> size mesa zlib mesa zlib
>> 4 66 43 -35% +/- 4.8% 43 22 -49% +/- 9.6%
>> 32 193 171 -11% +/- 5.8% 129 49 -61% +/- 7.2%
>> 64 256 267 4% +/- 4.1% 171 63 -63% +/- 5.4%
>> 128 317 389 22% +/- 5.8% 253 89 -64% +/- 4.2%
>> 256 364 596 63% +/- 5.6% 304 166 -45% +/- 2.8%
>> 512 401 838 108% +/- 5.3% 338 296 -12% +/- 3.1%
>> 1024 420 1036 146% +/- 7.6% 375 461 23% +/- 3.7%
>> 1M 443 1443 225% +/- 2.1% 403 1175 191% +/- 0.9%
>> 100M 448 1452 224% +/- 0.3% 406 1214 198% +/- 0.3%
>>
>> With hot cache (repeated calls) zlib almost always wins on both CPUS.
>> It has been verified the calculation results stay the same after this
>> change.
>>
>> Signed-off-by: Grazvydas Ignotas <[email protected]>
>> ---
>> src/util/crc32.c | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/src/util/crc32.c b/src/util/crc32.c
>> index f2e01c6..0cffa49 100644
>> --- a/src/util/crc32.c
>> +++ b/src/util/crc32.c
>> @@ -31,12 +31,20 @@
>> *
>> * @author Jose Fonseca
>> */
>>
>>
>> +#ifdef HAVE_ZLIB
>> +#include <zlib.h>
>> +#endif
>> #include "crc32.h"
>>
>> +/* For small buffers it's faster to avoid the library call.
>> + * The optimal threshold depends on CPU characteristics, it is hoped
>> + * the choice below is reasonable for typical modern CPU.
>> + */
>> +#define ZLIB_SIZE_THRESHOLD 64
>
> For the actual users of this function in Mesa, is it even possible to
> pass less than 64 bytes (I'm assuming that's the units)?
Hmm why wouldn't it be? The unit is a byte, and you can compute CRC32
of a single byte.
>
>>
>> static const uint32_t
>> util_crc32_table[256] = {
>> 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba,
>> 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3,
>> @@ -112,10 +120,15 @@ uint32_t
>> util_hash_crc32(const void *data, size_t size)
>> {
>> const uint8_t *p = data;
>> uint32_t crc = 0xffffffff;
>>
>> +#ifdef HAVE_ZLIB
>> + if (size >= ZLIB_SIZE_THRESHOLD && (uInt)size == size)
> ^^^^^^^^^^^^^^^^^^
> This comparison is always true (unless sizeof(uInt) != sizeof(size_t)?).
> I'm not 100% sure what you're trying to accomplish here, but I think
> you want 'size > 0'. I'm not familiar with this zlib function, so I
> don't know what it's expectations for size are.
zlib's uInt is always 32bit while size_t is 64bit on most (all?) 64bit
architectures, so if someone decides to CRC32 >= 4GiB buffer, this
function will do the wrong thing without such check. Newer zlib has
crc32_z that takes size_t, but I was trying to avoid build system
complications of detecting that function...
GraÅžvydas
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev