Robert Muir created LUCENE-9403:
-----------------------------------

             Summary: tune BufferedChecksum.DEFAULT_BUFFERSIZE
                 Key: LUCENE-9403
                 URL: https://issues.apache.org/jira/browse/LUCENE-9403
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Robert Muir
         Attachments: LUCENE-9403.patch

This is currently set to 256 bytes, so that's the amount of data we pass to 
crc.update() at once. 

I tried different sizes with https://github.com/benalexau/hash-bench and JDK14:

{noformat}
HashBench.withArray         crc32-jre       256  avgt    5   81.349 ±  8.364  ns/op
HashBench.withArray         crc32-jre       512  avgt    5   95.204 ± 10.057  ns/op
HashBench.withArray         crc32-jre      1024  avgt    5  120.081 ±  8.471  ns/op
HashBench.withArray         crc32-jre      2048  avgt    5  173.505 ±  8.857  ns/op
HashBench.withArray         crc32-jre      8192  avgt    5  487.721 ± 11.435  ns/op
{noformat}
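
To make the trend in these numbers easier to read, here is a small sketch that divides each mean score by its size. It assumes the third column is the number of bytes hashed per operation; the class and method names are made up for illustration. The per-byte cost falls from roughly 0.32 ns at 256 bytes to roughly 0.12 ns at 1024 bytes, and the gains flatten out after that.

```java
public class PerByteCost {
    // Sizes (assumed to be bytes per op) and mean ns/op, copied from the benchmark output.
    static final int[] SIZES = {256, 512, 1024, 2048, 8192};
    static final double[] NS_PER_OP = {81.349, 95.204, 120.081, 173.505, 487.721};

    // Amortized cost of checksumming one byte at the i-th size.
    static double nsPerByte(int i) {
        return NS_PER_OP[i] / SIZES[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < SIZES.length; i++) {
            System.out.printf("%5d bytes: %.3f ns/byte%n", SIZES[i], nsPerByte(i));
        }
    }
}
```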

Based on this, let's bump the buffer size from 256 to 1024 bytes? I think we 
want to avoid huge buffers but still keep the CPU overhead low. The change only 
affects ChecksumIndexInputs (e.g. the speed of checkIntegrity() calls at merge), 
because IndexOutputs do not need this buffer.
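
For readers unfamiliar with the idea, here is a minimal sketch of how a buffering wrapper batches small updates before handing them to crc.update(). It is not Lucene's BufferedChecksum source: the class name, fields, and the bypass rule for large arrays are assumptions made for illustration, built only on java.util.zip.Checksum and CRC32.

```java
import java.util.zip.CRC32;
import java.util.zip.Checksum;

/** Hypothetical sketch (not Lucene's class): collects small updates and flushes them in batches. */
public class BufferingChecksumSketch implements Checksum {
    private final Checksum in;    // wrapped checksum, e.g. CRC32
    private final byte[] buffer;  // pending bytes not yet passed to 'in'
    private int upto;             // number of pending bytes

    public BufferingChecksumSketch(Checksum in, int bufferSize) {
        if (bufferSize <= 0) throw new IllegalArgumentException("bufferSize must be > 0");
        this.in = in;
        this.buffer = new byte[bufferSize];
    }

    @Override
    public void update(int b) {
        if (upto == buffer.length) flush();
        buffer[upto++] = (byte) b;
    }

    @Override
    public void update(byte[] b, int off, int len) {
        if (len >= buffer.length) {
            // Large input: flush pending bytes first to keep ordering, then bypass the buffer.
            flush();
            in.update(b, off, len);
        } else {
            if (upto + len > buffer.length) flush();
            System.arraycopy(b, off, buffer, upto, len);
            upto += len;
        }
    }

    @Override
    public long getValue() {
        flush();
        return in.getValue();
    }

    @Override
    public void reset() {
        upto = 0;
        in.reset();
    }

    private void flush() {
        if (upto > 0) in.update(buffer, 0, upto);
        upto = 0;
    }

    public static void main(String[] args) {
        byte[] data = new byte[5000];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;

        CRC32 direct = new CRC32();
        direct.update(data, 0, data.length);

        // Byte-at-a-time updates reach the wrapped CRC32 in 1024-byte batches.
        Checksum buffered = new BufferingChecksumSketch(new CRC32(), 1024);
        for (byte b : data) buffered.update(b);

        System.out.println(direct.getValue() == buffered.getValue());
    }
}
```

In this sketch the buffer size sets how many bytes reach the underlying update() per call, which is the knob the proposal tunes.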



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

