Re: [PR] [WIP] first cut at bounding the NodeHash size during FST compilation [lucene]
mikemccand commented on PR #12633: URL: https://github.com/apache/lucene/pull/12633#issuecomment-1751999625 Here are the results from running `test_all_sizes.py` then `results_to_md.py`: |NodeHash size|FST (mb)|RAM (mb)|FST build time (sec)| |-|||| |0|577.4|0.0|35.2| |4|586.5|0.0|43.2| |8|587.0|0.0|46.4| |16|585.2|0.0|44.8| |32|582.0|0.0|45.9| |64|578.8|0.0|45.4| |128|573.0|0.0|45.9| |256|563.6|0.0|46.1| |512|551.2|0.0|45.4| |1024|537.5|0.0|45.7| |2048|523.4|0.0|46.0| |4096|509.5|0.1|45.6| |8192|495.8|0.1|45.2| |16384|481.8|0.2|46.3| |32768|461.1|0.5|45.2| |65536|447.2|1.0|45.7| |131072|432.4|2.0|46.3| |262144|418.6|4.0|46.3| |524288|402.4|8.0|46.9| |1048576|391.0|16.0|50.0| |2097152|380.8|32.0|55.2| |4194304|371.4|64.0|58.3| |8388608|362.5|128.0|59.9| |16777216|356.1|256.0|59.3|
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
ChrisHegarty commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752003494 Thanks for looking into this @rmuir, I've been thinking similar myself (just didn't get around to anything other than the thinking! ) On my Mac M2. JDK 20.0.2. ``` Benchmark (size) Mode Cnt Score Error Units BinaryDotProductBenchmark.dotProductNew 1024 thrpt5 6.590 ± 0.098 ops/us BinaryDotProductBenchmark.dotProductNewNew1024 thrpt5 7.769 ± 0.102 ops/us BinaryDotProductBenchmark.dotProductOld 1024 thrpt5 3.159 ± 0.034 ops/us ``` JDK 21 ``` Benchmark (size) Mode Cnt Score Error Units BinaryDotProductBenchmark.dotProductNew 1024 thrpt5 6.546 ± 0.054 ops/us BinaryDotProductBenchmark.dotProductNewNew1024 thrpt5 7.696 ± 0.103 ops/us BinaryDotProductBenchmark.dotProductOld 1024 thrpt5 2.893 ± 0.306 ops/us ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
ChrisHegarty commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752024575 ``` // sum into accumulators Vector prod16 = prod16_1.add(prod16_2); acc = acc.add(prod16.convert(VectorOperators.S2I, 0)); acc = acc.add(prod16.convert(VectorOperators.S2I, 1)); ``` What is the maximum value that we can see in the input bytes? Can they every hold `-128`? Do we need to handle "overflow" in the short accumulation and subsequent conversion to `int`? If so then we can use `VectorOperators.ZERO_EXTEND_S2I` (rather than `S2I`). ( this is just a question, rather than a suggestion, since I only thought of this after leaving my keyboard) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Enable rank-unsafe optimization of top-k hit computations by quantizing scores. [lucene]
mikemccand commented on PR #12628: URL: https://github.com/apache/lucene/pull/12628#issuecomment-1752028823 Very cool, surprisingly impactful! > I ran the Tantivy benchmark with TOP_10 and TOP_100 commands This is the Tantivy benchmark tooling, but you are comparing Lucene (main) to Lucene (your PR) right? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
ChrisHegarty commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752029230 And of course, `ZERO_EXTEND_S2I`, will work in the maximum boundary case, but not in others. So the question is then just about the maximum value of the bytes in these input arrays. :-( -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Lucene's FST Builder should have a simpler "knob" to trade off memory/CPU required against minimality [lucene]
mikemccand commented on issue #12542: URL: https://github.com/apache/lucene/issues/12542#issuecomment-1752030874 Talking to @sokolovm at Community Over Code 2023 he suggested another idea here: instead of a (RAM hungry) hash table, couldn't we use the growing FST itself to lookup suffixes? If we added the reversed (incoming) transitions to each FST node then we could do such a suffix lookup in reverse of the normal forward only FST lookup. Maybe we could have these additional reverse transitions in a separate RAM efficient structure, just used during construction? (Because the written FST only needs the forwards-only lookup). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Write MSB VLong for better outputs sharing in block tree index [lucene]
mikemccand commented on PR #12631: URL: https://github.com/apache/lucene/pull/12631#issuecomment-1752031210 > sum | 31606784 | 27188690 | -13.98% WHOA, wow! This is a massive gain for such a tiny change :) I'll try to review soon! Nice to revisit ancient `TODO`s in the source code ... :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
rmuir commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752033176 > What is the maximum value that we can see in the input bytes? All possible values is how i test > Can they every hold `-128`? Yes! > Do we need to handle "overflow" in the short accumulation and subsequent conversion to `int`? No! integer fma won't overflow here. and i know that isn't obvious and probably needs a code comment lol. that's why it only works for these two methods and not the square. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
ChrisHegarty commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752035773 Ok, cool. If there is not already one, we should add a test to the Panama / scalar unit test for the boundary values. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
rmuir commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752036396 yeah agreed: we should test the boundaries for all 3 functions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Write MSB VLong for better outputs sharing in block tree index [lucene]
mikemccand commented on code in PR #12631: URL: https://github.com/apache/lucene/pull/12631#discussion_r1349699402 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/FieldReader.java: ## @@ -99,6 +102,26 @@ public final class FieldReader extends Terms { */ } + long readLongOutput(DataInput in) throws IOException { Review Comment: Maybe rename to `readVLongOutput`? Make it clearly we are doing some form of vlong encoding? ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Lucene90BlockTreeTermsWriter.java: ## @@ -460,6 +460,19 @@ public String toString() { return "BLOCK: prefix=" + brToString(prefix); } +private static void writeMSBVLong(long i, DataOutput scratchBytes) throws IOException { Review Comment: Maybe rename `i` to `value`? And then you can use `i` for the for loop below? ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Lucene90BlockTreeTermsReader.java: ## @@ -81,8 +81,11 @@ public final class Lucene90BlockTreeTermsReader extends FieldsProducer { /** Initial terms format. */ public static final int VERSION_START = 0; + /** Version that uses MSB VLong encoded output */ + public static final int VERSION_MSB_VLONG_OUTPUT = 1; + /** Current terms format. */ - public static final int VERSION_CURRENT = VERSION_START; + public static final int VERSION_CURRENT = VERSION_MSB_VLONG_OUTPUT; Review Comment: Add a comment explaining what changed, and link to the GitHub issue? ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Lucene90BlockTreeTermsWriter.java: ## @@ -460,6 +460,19 @@ public String toString() { return "BLOCK: prefix=" + brToString(prefix); } +private static void writeMSBVLong(long i, DataOutput scratchBytes) throws IOException { + assert i >= 0; + // Keep zero bits on most significant byte to have more chance to get prefix bytes shared. + // e.g. we expect 0x7FFF stored as [0x81, 0xFF, 0x7F] but not [0xFF, 0xFF, 0x40] + int LSBVLongBytes = (Long.SIZE - Long.numberOfLeadingZeros(i) - 1) / 7 + 1; Review Comment: Maybe rename to `bytesNeeded`? (Full disclosure this was suggested by [Bard](https://bard.google.com)). ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Lucene90BlockTreeTermsWriter.java: ## @@ -460,6 +460,19 @@ public String toString() { return "BLOCK: prefix=" + brToString(prefix); } +private static void writeMSBVLong(long i, DataOutput scratchBytes) throws IOException { + assert i >= 0; + // Keep zero bits on most significant byte to have more chance to get prefix bytes shared. + // e.g. we expect 0x7FFF stored as [0x81, 0xFF, 0x7F] but not [0xFF, 0xFF, 0x40] + int LSBVLongBytes = (Long.SIZE - Long.numberOfLeadingZeros(i) - 1) / 7 + 1; + i <<= Long.SIZE - LSBVLongBytes * 7; + for (int j = 1; j < LSBVLongBytes; j++) { +scratchBytes.writeByte((byte) (((i >>> 57) & 0x7FL) | 0x80)); Review Comment: Maybe @uschindler can help review this heavy math :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
rmuir commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752039360 yeah, you are right, i am wrong. the trick only works in the unsigned case, Byte.MIN_VALUE is a problem :( -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] add tests for vectorutils integer boundaries [lucene]
rmuir opened a new pull request, #12634: URL: https://github.com/apache/lucene/pull/12634 Let's improve the testing for the boundary cases and check them explicitly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
rmuir commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752041404 at least we can improve the testing out of this: https://github.com/apache/lucene/pull/12634 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Write MSB VLong for better outputs sharing in block tree index [lucene]
gf2121 commented on code in PR #12631: URL: https://github.com/apache/lucene/pull/12631#discussion_r1349705693 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Lucene90BlockTreeTermsReader.java: ## @@ -81,8 +81,11 @@ public final class Lucene90BlockTreeTermsReader extends FieldsProducer { /** Initial terms format. */ public static final int VERSION_START = 0; + /** Version that uses MSB VLong encoded output */ + public static final int VERSION_MSB_VLONG_OUTPUT = 1; + /** Current terms format. */ - public static final int VERSION_CURRENT = VERSION_START; + public static final int VERSION_CURRENT = VERSION_MSB_VLONG_OUTPUT; Review Comment: I wrote some comments on `VERSION_MSB_VLONG_OUTPUT`. I prefer to leave comments there, as `VERSION_CURRENT` could probably link to other version in feature. WDYT? Issue added :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
rmuir commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752049654 don't worry, i have a plan B. it is just frustrating due to the nightmare of operating on the mac, combined with the fact this benchmark and lucene source is a separate repo. it makes the situation very slow and error-prone -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
rmuir commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752050233 see latest commit for the idea. on my mac it gives a decent boost. it uses "32-bit" vector by loading 64-bit vector from array but only processing half of it. The tests should fail as i need to fix the other functions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Write MSB VLong for better outputs sharing in block tree index [lucene]
mikemccand commented on code in PR #12631: URL: https://github.com/apache/lucene/pull/12631#discussion_r1349711457 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Lucene90BlockTreeTermsReader.java: ## @@ -81,8 +81,11 @@ public final class Lucene90BlockTreeTermsReader extends FieldsProducer { /** Initial terms format. */ public static final int VERSION_START = 0; + /** Version that uses MSB VLong encoded output */ + public static final int VERSION_MSB_VLONG_OUTPUT = 1; + /** Current terms format. */ - public static final int VERSION_CURRENT = VERSION_START; + public static final int VERSION_CURRENT = VERSION_MSB_VLONG_OUTPUT; Review Comment: Oh, sorry, yes - `VERSION_MSB_VLONG_OUTPUT` is the right place for the comment! Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Write MSB VLong for better outputs sharing in block tree index [lucene]
mikemccand commented on PR #12631: URL: https://github.com/apache/lucene/pull/12631#issuecomment-1752050479 I kicked off a `luceneutil` run ... I'll post results here soonish. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] add tests for vectorutils integer boundaries [lucene]
rmuir merged PR #12634: URL: https://github.com/apache/lucene/pull/12634 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
rmuir commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752063622 ok on my mac i see: ``` Benchmark (size) Mode Cnt Score Error Units BinaryCosineBenchmark.cosineDistanceNew 1024 thrpt5 2.261 ± 0.007 ops/us BinaryCosineBenchmark.cosineDistanceNewNew1024 thrpt5 3.708 ± 0.034 ops/us BinaryDotProductBenchmark.dotProductNew 1024 thrpt5 6.138 ± 0.021 ops/us BinaryDotProductBenchmark.dotProductNewNew1024 thrpt5 6.645 ± 0.021 ops/us ``` and it passes new boundary tests (but no sneakiness with boundary values, instead another type of sneakiness). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Write MSB VLong for better outputs sharing in block tree index [lucene]
mikemccand commented on PR #12631: URL: https://github.com/apache/lucene/pull/12631#issuecomment-1752064474 `luceneutil` results on `wikimediumall` look good -- looks like all noise (even for `PKLookup`), or, any signal (change) is very low, making the ~15% reduction very much worth it. ``` TaskQPS base StdDevQPS msb_vlong StdDevPct diff p-value BrowseMonthTaxoFacets8.34 (4.4%)8.24 (3.6%) -1.2% ( -8% -7%) 0.344 HighIntervalsOrdered 11.37 (3.9%) 11.24 (4.5%) -1.2% ( -9% -7%) 0.382 OrHighLow 394.28 (1.6%) 390.47 (2.1%) -1.0% ( -4% -2%) 0.099 MedSpanNear 24.41 (3.9%) 24.23 (3.9%) -0.8% ( -8% -7%) 0.537 MedIntervalsOrdered9.96 (4.0%)9.89 (4.3%) -0.7% ( -8% -7%) 0.602 LowSpanNear 13.82 (3.2%) 13.73 (3.2%) -0.6% ( -6% -5%) 0.526 AndHighHigh 97.28 (1.9%) 96.71 (1.2%) -0.6% ( -3% -2%) 0.259 BrowseRandomLabelSSDVFacets9.22 (4.5%)9.17 (4.3%) -0.6% ( -8% -8%) 0.688 LowIntervalsOrdered 28.78 (4.0%) 28.64 (4.1%) -0.5% ( -8% -7%) 0.720 PKLookup 250.58 (0.9%) 249.44 (1.0%) -0.5% ( -2% -1%) 0.118 IntNRQ 315.77 (2.5%) 314.47 (2.1%) -0.4% ( -4% -4%) 0.570 BrowseDayOfYearSSDVFacets 12.75 (7.3%) 12.70 (10.5%) -0.4% ( -16% - 18%) 0.887 HighPhrase 55.21 (3.5%) 55.00 (2.8%) -0.4% ( -6% -6%) 0.712 MedPhrase 77.25 (2.0%) 77.04 (1.9%) -0.3% ( -4% -3%) 0.658 HighSloppyPhrase6.90 (3.9%)6.89 (3.4%) -0.2% ( -7% -7%) 0.841 LowPhrase 463.60 (1.4%) 462.90 (1.1%) -0.2% ( -2% -2%) 0.701 OrNotHighMed 424.68 (2.6%) 424.06 (2.5%) -0.1% ( -5% -5%) 0.857 AndHighLow 680.76 (1.3%) 679.77 (1.6%) -0.1% ( -3% -2%) 0.757 HighSpanNear 27.40 (1.5%) 27.37 (1.3%) -0.1% ( -2% -2%) 0.787 AndHighMed 200.30 (1.3%) 200.12 (1.3%) -0.1% ( -2% -2%) 0.826 OrNotHighLow 622.80 (1.2%) 622.27 (1.1%) -0.1% ( -2% -2%) 0.816 LowSloppyPhrase7.73 (2.9%)7.73
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
ChrisHegarty commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752098666 I get similar bench results, the new impl is faster. ``` Benchmark (size) Mode Cnt Score Error Units BinaryDotProductBenchmark.dotProductNew 1024 thrpt5 6.499 ± 0.299 ops/us BinaryDotProductBenchmark.dotProductNewNew1024 thrpt5 7.017 ± 0.165 ops/us BinaryDotProductBenchmark.dotProductOld 1024 thrpt5 3.135 ± 0.060 ops/us ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
ChrisHegarty commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752099845 My sense here is that accessing a `part` other than `0` is less performant that just reloading the data, which seems a little off. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
rmuir commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752100681 > My sense here is that accessing a `part` other than `0` is less performant that just reloading the data, which seems a little off. It seems to have a heavy cost no matter how i do it, like a bounds check that i can't trick it into eliminating? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
rmuir commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752101786 btw, another crazy avenue to possibly explore here another day, since we seem bottlenecked on integer multiply. We could try it on arm too. It is faster than the current binary code on my machine at least, but not by much so it needs tweaking. Need to also look at vector size limits and think about it as whole integers can only be exactly represented up to some size. Maybe accumulator has to be integer, needs some thought. ``` // optimized 256/512 bit implementation, uses FPU int upperBound = PREFERRED_BYTE_SPECIES.loopBound(a.length - 3*PREFERRED_BYTE_SPECIES.length()); FloatVector acc1 = FloatVector.zero(FloatVector.SPECIES_PREFERRED); FloatVector acc2 = FloatVector.zero(FloatVector.SPECIES_PREFERRED); FloatVector acc3 = FloatVector.zero(FloatVector.SPECIES_PREFERRED); FloatVector acc4 = FloatVector.zero(FloatVector.SPECIES_PREFERRED); for (; i < upperBound; i += 4*PREFERRED_BYTE_SPECIES.length()) { ByteVector va8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, a, i); ByteVector vb8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, b, i); Vector va32 = va8.convertShape(VectorOperators.B2F, FloatVector.SPECIES_PREFERRED, 0); Vector vb32 = vb8.convertShape(VectorOperators.B2F, FloatVector.SPECIES_PREFERRED, 0); acc1 = acc1.add(va32.mul(vb32)); ByteVector vc8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, a, i + PREFERRED_BYTE_SPECIES.length()); ByteVector vd8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, b, i + PREFERRED_BYTE_SPECIES.length()); Vector vc32 = vc8.convertShape(VectorOperators.B2F, FloatVector.SPECIES_PREFERRED, 0); Vector vd32 = vd8.convertShape(VectorOperators.B2F, FloatVector.SPECIES_PREFERRED, 0); acc2 = acc2.add(vc32.mul(vd32)); ByteVector ve8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, a, i + 2*PREFERRED_BYTE_SPECIES.length()); ByteVector vf8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, b, i + 2*PREFERRED_BYTE_SPECIES.length()); Vector ve32 = ve8.convertShape(VectorOperators.B2F, FloatVector.SPECIES_PREFERRED, 0); Vector vf32 = vf8.convertShape(VectorOperators.B2F, FloatVector.SPECIES_PREFERRED, 0); acc3 = acc3.add(ve32.mul(vf32)); ByteVector vg8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, a, i + 3*PREFERRED_BYTE_SPECIES.length()); ByteVector vh8 = ByteVector.fromArray(PREFERRED_BYTE_SPECIES, b, i + 3*PREFERRED_BYTE_SPECIES.length()); Vector vg32 = vg8.convertShape(VectorOperators.B2F, FloatVector.SPECIES_PREFERRED, 0); Vector vh32 = vh8.convertShape(VectorOperators.B2F, FloatVector.SPECIES_PREFERRED, 0); acc3 = acc3.add(vg32.mul(vh32)); } // TODO: vector tail opto // reduce FloatVector res1 = acc1.add(acc2); FloatVector res2 = acc3.add(acc4); res += (int) res1.add(res2).reduceLanes(VectorOperators.ADD) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[I] Should we have an interface VectorValues which would be implemented by [Byte/Float]VectorValues classes [lucene]
shubhamvishu opened a new issue, #12635: URL: https://github.com/apache/lucene/issues/12635 ### Description Currently, there is lot of code duplication due to [ByteVectorValues](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/ByteVectorValues.java) and [FloatVectorValues](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/FloatVectorValues.java) (Example - [one](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsWriter.java#L48-L73), [two](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsWriter.java#L109-L301) etc.) because they are handled separately as one returns byte[] and other float[] for `vectorValue` method. The idea is if we should have a common interface `VectorValues`? that provides the common functionality and we could eventually get rid of this duplication. As a first step maybe we could try to move the below methods(common in above 2 classes) into its different interface `VectorValues`? - int dimension() - int size() - byte[]/float[] vectorValue() -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] Add interface VectorValues to be implemented by [Float/Byte]VectorValues [lucene]
shubhamvishu opened a new pull request, #12636: URL: https://github.com/apache/lucene/pull/12636 ### Description The classes [ByteVectorValues](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/ByteVectorValues.java) and [FloatVectorValues](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/FloatVectorValues.java) share the common functionalities/methods listed below. This PR moves those methods to a new interface `VectorValues`. - int dimension() - int size() - byte[]/float[] vectorValue() Since the name is clashing, created two new names `vectorByteValue` and `vectorFloatValue`. Also retained the `vectorValue` method whose returned value needs to be casted appropriately. There could be some other ideas as well like 1. Making `DISI` implementing an interface which `VectorValues`(in this PR) would extend making further room to remove duplication (or) 2. Making a wrapper over `ByteVectorValues` and `FloatVectorValues` etc?. Looking for more ideas or suggestions on how to better handle this and avoid duplicating code. Addresses #12635 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speedup integer functions for 128-bit neon vectors [lucene]
rmuir commented on PR #12632: URL: https://github.com/apache/lucene/pull/12632#issuecomment-1752107370 The other thought I had around conversion costs would be to look into reinterpret+shuffle/shift/mask crap ourselves, which seems really crazy but i'm running low on ideas. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] LUCENE-10241: Updating OpenNLP to 1.9.4. [lucene]
epugh commented on PR #448: URL: https://github.com/apache/lucene/pull/448#issuecomment-1752112078 It would be nice if this was updated to the awesome new OpenNLP 2.x line! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Enable rank-unsafe optimization of top-k hit computations by quantizing scores. [lucene]
jpountz commented on PR #12628: URL: https://github.com/apache/lucene/pull/12628#issuecomment-1752152301 I'll try to give a bit more context how I ended up here. With recent work on vector search and excitement around it, I can't prevent myself from thinking that all users who are happy to trade some effectiveness for better efficiency for their vector searches should be happy to do it with their term-based searches as well. There is some literature on rank-unsafe evaluation of top-hits, the main one I'm familiar with is the original [WAND paper](https://www.researchgate.net/profile/David-Carmel-3/publication/221613425_Efficient_query_evaluation_using_a_two-level_retrieval_process/links/02bfe50e685450015300/Efficient-query-evaluation-using-a-two-level-retrieval-process.pdf) where authors suggest to configure a threshold factor F and run the WAND operator so that it only considers matches whose sum of maximum scores across query terms is at least F times the score of the k-th score in the priority queue. When F is equal to 1 or less, the returned top hits are the same as the ones produced by exhaustive evaluation. Setting F greater than 1 allows trading effectiveness for efficiency. Authors share some experiments that show how F=2 affects P@10 and MAP, as well as the number of evaluated hits. I'd like to expose this kind of trade-off in Lucene. One problem I have with the above approach is that it is very WAND-specific. A similar idea that wouldn't be WAND-specific would be to configure the minimum score (`Scorer#setMinimumCompetitiveScore`) as the k-th score in the heap times some factor. This feels a bit too low-level for something to be exposed as an option on `IndexSearcher`, so I ended up trying this idea of quantizing scores, which is again very similar in terms of why it helps, and that I find easier to explain: it computes accurate top hits on quantized scores. To get a bit of time thinking of how to expose it on `IndexSearcher`, I'm starting with this collector wrapper which doesn't require making changes to lucene-core and would help put it in the hands of users who would be interested in trying it out. For reference, https://github.com/apache/lucene/pull/12446 is another PR where I'm exploring exposing rank-unsafe optimizations in Lucene. > Would the typical usage of this scorer be as an initial collection of k docs, which would then get re-scored over their full floating point scores? If someone wanted to do that, they could do it in a single pass by using a similar collector wrapper as the one I'm adding, and only changing scores passed to `Scorer#setMinimumCompetitiveScore`, not the ones returned by `Scorer#score`. I think the idea is rather for it to be used as a drop-in replacement for traditional top-k collection, just trading a bit of effectiveness for better efficiency. It's probably especially acceptable with high `k` values, when hits are expected to be fed to a reranker in a later stage. > Are there any numbers around how this effects recall or does the Tantivy benchmark account for actually getting the correct document? No, there are no measures of the effect on recall or other measures of effectiveness. > Very cool, surprisingly impactful! For reference, I noticed that it especially helps in the following cases: - Stop words mixed with high-scoring clauses: in that case the quantization helps the score contribution of stop words become a rounding error more than an actual score contribution, which in-turn helps the pruning logic consider almost only the high-scoring clauses. A good such example from the benchmark is `the progressive`. The query `the incredibles` is an extreme case which is due to the fact that `incredibles` has an extremely low term frequency, so the query looks like a query on `the` in practice where a few hits get boosted because they contain `incredibles`. - Clauses with similar maximum scores. These queries are complicated because it's rare that the same document yields the maximum score in the block for both clauses. So actual scores produced by such queries are often quite behind the sum of maximum scores of each clause. They can run like conjunctions in practice, but they can rarely skip blocks across all clauses. Quantization makes this more likely. A good example from the benchmark is `american south`. - Conjunctions. Conjunctions suffer from the same problem as above, it's rare that all clauses yield their maximum score on the same document. This makes skipping whole blocks rare in practice. Quantization makes it more likely by creating more ties. > This is the Tantivy benchmark tooling, but you are comparing Lucene (main) to Lucene (your PR) right? Actually since this only introduces a single class, I copied this class to the source folder of the `engine` querying logic. So the baseline is Lucene 9.8.0 and t
Re: [PR] [WIP] first cut at bounding the NodeHash size during FST compilation [lucene]
mikemccand commented on PR #12633: URL: https://github.com/apache/lucene/pull/12633#issuecomment-1752165322 For comparison, this is how the curve (RAM required during construction vs final FST size) looks on trunk, using the god-like parameters as best I could. I sorted the results in reverse `FST (mb)` order but not that this results in a confusing mix of the first three columns (the god-like parameters). I'll try to turn these two tables into a single chart comparing RAM used and final FST size and maybe build time: |Share suffix?|Share non-singleton?|Max tail length|FST (mb)|RAM (mb)|Build time (sec)| |-|||||| |True|True|1|584.3|0.0|30.5| |True|False|1|584.3|0.0|30.5| |True|True|0|577.4|0.0|30.1| |True|False|0|577.4|0.0|30.2| |False|True|0|577.4|0.0|29.9| |False|False|0|577.4|0.0|29.8| |True|False|2|576.4|0.0|31.4| |True|True|2|569.9|14.5|32.4| |True|False|3|523.7|3.6|32.2| |True|True|3|509.7|29.0|34.1| |True|False|4|486.2|7.0|32.9| |True|True|4|468.9|58.0|36.7| |True|False|5|456.1|14.0|34.1| |True|True|5|437.6|56.0|37.8| |True|False|6|435.4|28.0|35.3| |True|False|7|419.8|58.0|36.6| |True|True|6|416.1|116.0|41.2| |True|False|8|408.1|56.0|37.6| |True|True|7|399.8|116.0|41.7| |True|False|9|398.8|116.0|39.9|
Re: [PR] Add interface VectorValues to be implemented by [Float/Byte]VectorValues [lucene]
benwtrent commented on PR #12636: URL: https://github.com/apache/lucene/pull/12636#issuecomment-1752194821 It was sort of this way before but we decided to switch it as a common interface required either: - having to use generics - an API where things weren't fully implemented or required casting. If we are worried about internal duplication there may be better options around using a better abstraction for scoring (see RandomVectorScorer) and writing to a flat file (on flush and merges). Those are the only two places we care about reading vectors during indexing and search. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[I] segmentInfos.replace() doesn't set userData [lucene]
Shibi-bala opened a new issue, #12637: URL: https://github.com/apache/lucene/issues/12637 ### Description Found that the [replace method](https://github.com/qcri/solr-6/blob/master/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L875-L878) doesn't set `userData` with the new user data from `other`. Unsure if this is an oversight, but if it is, I have a PR up [here ](https://github.com/apache/lucene/pull/12626). Existing: ``` void replace(SegmentInfos other) { rollbackSegmentInfos(other.asList()); lastGeneration = other.lastGeneration; } ``` Fix: ``` void replace(SegmentInfos other) { rollbackSegmentInfos(other.asList()); lastGeneration = other.lastGeneration; userData = other.userData; } ``` ### Version and environment details Lucene version 9.7 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Avoid NPEx if the end of the stream has been reached without reading any characters [lucene]
pzygielo commented on PR #12611: URL: https://github.com/apache/lucene/pull/12611#issuecomment-1752377046 Thanks for checking. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] Explicitly return needStats flag in TermStates [lucene]
yugushihuang opened a new pull request, #12638: URL: https://github.com/apache/lucene/pull/12638 ### Description A simple API in TermStates to expose the `needStats` flag. Addresses #12617 # -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Avoid NPEx if the end of the stream has been reached without reading any characters [lucene]
dweiss merged PR #12611: URL: https://github.com/apache/lucene/pull/12611 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Avoid NPEx if the end of the stream has been reached without reading any characters [lucene]
dweiss commented on PR #12611: URL: https://github.com/apache/lucene/pull/12611#issuecomment-1752397871 I've applied this to main and branch_9x (9.9). Thank you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Explicitly return needStats flag in TermStates [lucene]
jpountz commented on PR #12638: URL: https://github.com/apache/lucene/pull/12638#issuecomment-1752414836 Can you explain how/when you plan to use this new API? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] [WIP] first cut at bounding the NodeHash size during FST compilation [lucene]
dweiss commented on PR #12633: URL: https://github.com/apache/lucene/pull/12633#issuecomment-1752416032 I didn't get into all the details but I think this looks good. Your questions are indeed intriguing - I can't provide any explanation off the top of my head, really. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org