Re: [PR] Improve checksum calculations [lucene]

2024-11-25 Thread via GitHub
jpountz commented on PR #13989: URL: https://github.com/apache/lucene/pull/13989#issuecomment-2498257183 Thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsub

Re: [PR] Improve checksum calculations [lucene]

2024-11-25 Thread via GitHub
jpountz merged PR #13989: URL: https://github.com/apache/lucene/pull/13989

Re: [PR] Improve checksum calculations [lucene]

2024-11-23 Thread via GitHub
jfboeuf commented on code in PR #13989: URL: https://github.com/apache/lucene/pull/13989#discussion_r1855164834 ## lucene/core/src/test/org/apache/lucene/store/TestBufferedChecksum.java: ## @@ -63,4 +67,127 @@ public void testRandom() { } assertEquals(c1.getValue(), c2

Re: [PR] Improve checksum calculations [lucene]

2024-11-23 Thread via GitHub
jfboeuf commented on code in PR #13989: URL: https://github.com/apache/lucene/pull/13989#discussion_r1855164624 ## lucene/core/src/test/org/apache/lucene/store/TestBufferedChecksum.java: ## @@ -63,4 +67,127 @@ public void testRandom() { } assertEquals(c1.getValue(), c2

Re: [PR] Improve checksum calculations [lucene]

2024-11-22 Thread via GitHub
jpountz commented on code in PR #13989: URL: https://github.com/apache/lucene/pull/13989#discussion_r1854638074 ## lucene/core/src/test/org/apache/lucene/store/TestBufferedChecksum.java: ## @@ -63,4 +67,127 @@ public void testRandom() { } assertEquals(c1.getValue(), c2

Re: [PR] Improve checksum calculations [lucene]

2024-11-13 Thread via GitHub
jfboeuf commented on code in PR #13989: URL: https://github.com/apache/lucene/pull/13989#discussion_r1840804892 ## lucene/core/src/java/org/apache/lucene/store/BufferedChecksum.java: ## @@ -60,6 +64,37 @@ public void update(byte[] b, int off, int len) { } } + void upd

Re: [PR] Improve checksum calculations [lucene]

2024-11-12 Thread via GitHub
jpountz commented on code in PR #13989: URL: https://github.com/apache/lucene/pull/13989#discussion_r1839685981 ## lucene/core/src/java/org/apache/lucene/store/BufferedChecksum.java: ## @@ -60,6 +64,37 @@ public void update(byte[] b, int off, int len) { } } + void upd

Re: [PR] Improve checksum calculations [lucene]

2024-11-12 Thread via GitHub
jfboeuf commented on PR #13989: URL: https://github.com/apache/lucene/pull/13989#issuecomment-2471157022 @jpountz [I modified the benchmark to make it more realistic by adding a header to the `IndexOutput` ](https://github.com/apache/lucene/commit/8dc6eac23b3a1158ef4c82860d8574c779bad04

Re: [PR] Improve checksum calculations [lucene]

2024-11-12 Thread via GitHub
rmuir commented on PR #13989: URL: https://github.com/apache/lucene/pull/13989#issuecomment-2471135059 OK, I see @jfboeuf, thank you for the explanation. My only concern with the optimization is testing. If there is a bug here, the user will get CorruptIndexException. Could we

Re: [PR] Improve checksum calculations [lucene]

2024-11-12 Thread via GitHub
jpountz commented on PR #13989: URL: https://github.com/apache/lucene/pull/13989#issuecomment-2470814754 The change makes sense to me and looks like it could speed up loading live docs. > The benchmark shows the single-long approach performs better on small arrays. [...] It can be im

Re: [PR] Improve checksum calculations [lucene]

2024-11-12 Thread via GitHub
jfboeuf commented on PR #13989: URL: https://github.com/apache/lucene/pull/13989#issuecomment-2470657083 Thank you for your feedback. Perhaps I misunderstood your point, but the implementation I propose only calls `Checksum.update(byte[])`. The change resides in how the buffer is fed to avo

Re: [PR] Improve checksum calculations [lucene]

2024-11-12 Thread via GitHub
rmuir commented on PR #13989: URL: https://github.com/apache/lucene/pull/13989#issuecomment-2470068010 This is actually slower: we only want to call `updateBytes(byte[])`, or the checksum calculation is very slow (not vectorized).

[PR] Improve checksum calculations [lucene]

2024-11-12 Thread via GitHub
jfboeuf opened a new pull request, #13989: URL: https://github.com/apache/lucene/pull/13989 Take advantage of the existing buffer in `BufferedChecksum` to speed up reads for Longs, Ints, Shorts, and Long arrays by avoiding byte-by-byte reads. Use the faster `readLongs()` method to decode
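
For readers skimming the thread, here is a minimal sketch of the idea described in the PR summary: write a whole long into the existing buffer in one step, then hand the buffer to the wrapped checksum through `Checksum.update(byte[], int, int)` only. The class name `BufferedChecksumSketch`, the method `updateLong`, the 1024-byte buffer size, and the little-endian byte order are assumptions made for illustration; this is not the code from the PR.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

/** Hypothetical sketch, not Lucene's BufferedChecksum. */
final class BufferedChecksumSketch implements Checksum {
  // Assumption: values are laid out little-endian in the buffer.
  private static final VarHandle LONG_LE =
      MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

  private final Checksum in;
  private final byte[] buffer = new byte[1024]; // assumed buffer size
  private int upto;

  BufferedChecksumSketch(Checksum in) {
    this.in = in;
  }

  @Override
  public void update(int b) {
    if (upto == buffer.length) {
      flush();
    }
    buffer[upto++] = (byte) b;
  }

  @Override
  public void update(byte[] b, int off, int len) {
    if (len >= buffer.length) {
      // Large input: bypass the buffer and use the bulk update directly.
      flush();
      in.update(b, off, len);
    } else {
      if (upto + len > buffer.length) {
        flush();
      }
      System.arraycopy(b, off, buffer, upto, len);
      upto += len;
    }
  }

  /** Writes all eight bytes of v into the buffer with a single store. */
  void updateLong(long v) {
    if (upto + Long.BYTES > buffer.length) {
      flush();
    }
    LONG_LE.set(buffer, upto, v);
    upto += Long.BYTES;
  }

  @Override
  public long getValue() {
    flush();
    return in.getValue();
  }

  @Override
  public void reset() {
    upto = 0;
    in.reset();
  }

  /** The wrapped checksum only ever sees bulk byte[] updates. */
  private void flush() {
    if (upto > 0) {
      in.update(buffer, 0, upto);
    }
    upto = 0;
  }
}
```

Under these assumptions, `new BufferedChecksumSketch(new CRC32())` should produce the same value as feeding the same bytes to a plain `CRC32` one at a time, least significant byte first. A comparison of that kind is the sort of test the reviewers ask for in the thread.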