[GitHub] [lucene] javanna merged pull request #12325: Parallelize knn query rewrite across slices rather than segments

2023-05-26 Thread via GitHub
javanna merged PR #12325: URL: https://github.com/apache/lucene/pull/12325 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

[GitHub] [lucene] javanna commented on pull request #12325: Parallelize knn query rewrite across slices rather than segments

2023-05-26 Thread via GitHub
javanna commented on PR #12325: URL: https://github.com/apache/lucene/pull/12325#issuecomment-1563919463 Thanks @zhaih for the review!

[GitHub] [lucene] javanna opened a new pull request, #12335: Don't generate stacktrace for TimeExceededException

2023-05-26 Thread via GitHub
javanna opened a new pull request, #12335: URL: https://github.com/apache/lucene/pull/12335 The exception is package private and never rethrown, we can avoid generating a stacktrace for it.
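
As a rough illustration of the idea described in #12335 (not the PR's actual change), one common way to skip stack trace generation in Java is to override `Throwable.fillInStackTrace()` so that it returns without walking the stack. The class names `StacklessExceptionDemo` and `StacklessTimeout` below are hypothetical stand-ins and are not Lucene classes.

```java
public class StacklessExceptionDemo {

    // Hypothetical control-flow exception; NOT Lucene's TimeExceededException.
    static final class StacklessTimeout extends RuntimeException {
        StacklessTimeout() {
            super("time exceeded");
        }

        @Override
        public synchronized Throwable fillInStackTrace() {
            // Skip the stack walk: the exception is only used as a signal
            // and its trace is never inspected.
            return this;
        }
    }

    // Throws and catches the exception, then reports how many frames were recorded.
    static int stackDepth() {
        try {
            throw new StacklessTimeout();
        } catch (StacklessTimeout e) {
            return e.getStackTrace().length;
        }
    }

    public static void main(String[] args) {
        System.out.println("recorded frames: " + stackDepth());
    }
}
```

This only makes sense when the exception is caught internally and used purely as a signal, which matches the "package private and never rethrown" condition stated above. An exception that can reach callers would lose useful debugging information.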

[GitHub] [lucene] javanna commented on pull request #12270: Don't generate stacktrace in CollectionTerminatedException

2023-05-26 Thread via GitHub
javanna commented on PR #12270: URL: https://github.com/apache/lucene/pull/12270#issuecomment-1564135251 I opened #12335 for TimeExceededException.

[GitHub] [lucene] javanna commented on a diff in pull request #12328: Optimize ConjunctionDISI.createConjunction

2023-05-26 Thread via GitHub
javanna commented on code in PR #12328: URL: https://github.com/apache/lucene/pull/12328#discussion_r1206530279 ## lucene/CHANGES.txt: ## @@ -76,6 +76,10 @@ Optimizations * GITHUB#11857, GITHUB#11859, GITHUB#11893, GITHUB#11909: Hunspell: improved suggestion performance (Pet

[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12314: Multi-value support for KnnVectorField

2023-05-26 Thread via GitHub
alessandrobenedetti commented on code in PR #12314: URL: https://github.com/apache/lucene/pull/12314#discussion_r1206532595 ## lucene/core/src/java/org/apache/lucene/index/DocsWithFieldSet.java: ## @@ -22,6 +22,8 @@ import org.apache.lucene.util.FixedBitSet; import org.apache.

[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12314: Multi-value support for KnnVectorField

2023-05-26 Thread via GitHub
alessandrobenedetti commented on code in PR #12314: URL: https://github.com/apache/lucene/pull/12314#discussion_r1206533493 ## lucene/core/src/java/org/apache/lucene/index/DocsWithFieldSet.java: ## @@ -32,8 +34,14 @@ public final class DocsWithFieldSet extends DocIdSet {

[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12314: Multi-value support for KnnVectorField

2023-05-26 Thread via GitHub
alessandrobenedetti commented on code in PR #12314: URL: https://github.com/apache/lucene/pull/12314#discussion_r1206534495 ## lucene/core/src/java/org/apache/lucene/codecs/lucene95/Lucene95HnswVectorsWriter.java: ## @@ -762,13 +776,18 @@ private void writeMeta( meta.writ

[GitHub] [lucene] original-brownbear commented on a diff in pull request #12328: Optimize ConjunctionDISI.createConjunction

2023-05-26 Thread via GitHub
original-brownbear commented on code in PR #12328: URL: https://github.com/apache/lucene/pull/12328#discussion_r1206536107 ## lucene/CHANGES.txt: ## @@ -76,6 +76,10 @@ Optimizations * GITHUB#11857, GITHUB#11859, GITHUB#11893, GITHUB#11909: Hunspell: improved suggestion perfo

[GitHub] [lucene] alessandrobenedetti commented on pull request #12314: Multi-value support for KnnVectorField

2023-05-26 Thread via GitHub
alessandrobenedetti commented on PR #12314: URL: https://github.com/apache/lucene/pull/12314#issuecomment-1564148395 I proceeded with some additional refactoring and refinements that you can find in the latest commits. The diff is down to 25 classes, query time has been simplified, and explicit

[GitHub] [lucene] alessandrobenedetti commented on pull request #12314: Multi-value support for KnnVectorField

2023-05-26 Thread via GitHub
alessandrobenedetti commented on PR #12314: URL: https://github.com/apache/lucene/pull/12314#issuecomment-1564165188 > Thanks for sharing and working on a prototype @alessandrobenedetti ! > > I have additional questions and comments ;) Starting with the devil advocate but I'd like to

[GitHub] [lucene] alessandrobenedetti commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-26 Thread via GitHub
alessandrobenedetti commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1564190226 Just out of curiosity, do we tolerate this sort of class in Lucene? Are some of them auto-generated? (for example lucene/core/src/java20/org/apache/lucene/util/VectorUtilPa

[GitHub] [lucene] jimczi commented on pull request #12314: Multi-value support for KnnVectorField

2023-05-26 Thread via GitHub
jimczi commented on PR #12314: URL: https://github.com/apache/lucene/pull/12314#issuecomment-1564208967 > That was the initial approach, it was explicit at index and query time, and they were separate code paths from the single valued use case. So it was not affecting the single valued scen

[GitHub] [lucene] original-brownbear commented on pull request #12328: Optimize ConjunctionDISI.createConjunction

2023-05-26 Thread via GitHub
original-brownbear commented on PR #12328: URL: https://github.com/apache/lucene/pull/12328#issuecomment-1564256307 npnp + thanks Luca!

[GitHub] [lucene] javanna merged pull request #12328: Optimize ConjunctionDISI.createConjunction

2023-05-26 Thread via GitHub
javanna merged PR #12328: URL: https://github.com/apache/lucene/pull/12328

[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-26 Thread via GitHub
uschindler commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1564310637 Hi @alessandrobenedetti, the code shown here is indeed crazy to read, but this is more a problem of the APIs in general. The Java Vector API is very low level and you have to exactly

[GitHub] [lucene] joegallo commented on pull request #12328: Optimize ConjunctionDISI.createConjunction

2023-05-26 Thread via GitHub
joegallo commented on PR #12328: URL: https://github.com/apache/lucene/pull/12328#issuecomment-1564371710 Does it make sense to backport this to 9.x? (or, perhaps, what is the process for doing that?)

[GitHub] [lucene] javanna commented on pull request #12328: Optimize ConjunctionDISI.createConjunction

2023-05-26 Thread via GitHub
javanna commented on PR #12328: URL: https://github.com/apache/lucene/pull/12328#issuecomment-1564372833 heya @joegallo that was already the plan, it's done now.

[GitHub] [lucene] joegallo commented on pull request #12328: Optimize ConjunctionDISI.createConjunction

2023-05-26 Thread via GitHub
joegallo commented on PR #12328: URL: https://github.com/apache/lucene/pull/12328#issuecomment-1564374279 Ah, outstanding! Thank you!

[GitHub] [lucene] gsmiller commented on issue #12317: Option for disabling term dictionary compression

2023-05-26 Thread via GitHub
gsmiller commented on issue #12317: URL: https://github.com/apache/lucene/issues/12317#issuecomment-1564397702 @jainankitk thanks! To clarify my question a little bit, my understanding is that you'd like to explore the idea of making this compression optional based on memory usage profiling

[GitHub] [lucene] gsmiller commented on a diff in pull request #12334: Fix searchafter query high latency when after value is out of range for segment

2023-05-26 Thread via GitHub
gsmiller commented on code in PR #12334: URL: https://github.com/apache/lucene/pull/12334#discussion_r1206813319 ## lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java: ## @@ -204,13 +200,21 @@ private void updateCompetitiveIterator() throws IOExcep

[GitHub] [lucene] mikemccand commented on pull request #12320: Add "direct to binary" option for DaciukMihovAutomatonBuilder and use it in TermInSetQuery#visit

2023-05-26 Thread via GitHub
mikemccand commented on PR #12320: URL: https://github.com/apache/lucene/pull/12320#issuecomment-1564480240 > Resolving the class naming conflicts from `main` was a bit of a hassle with an incremental git history. Woops, sorry!

[GitHub] [lucene] msokolov commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-26 Thread via GitHub
msokolov commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1564492407 hm I looked more closely at the test I ran and it seems I managed to create a file full of identical vectors -- so this is going to lead to crazy results. Will follow up once I've manag

[GitHub] [lucene] mikemccand commented on a diff in pull request #12320: Add "direct to binary" option for DaciukMihovAutomatonBuilder and use it in TermInSetQuery#visit

2023-05-26 Thread via GitHub
mikemccand commented on code in PR #12320: URL: https://github.com/apache/lucene/pull/12320#discussion_r1206892745 ## lucene/core/src/java/org/apache/lucene/util/UnicodeUtil.java: ## @@ -477,38 +477,60 @@ public static int UTF8toUTF32(final BytesRef utf8, final int[] ints) {

[GitHub] [lucene] uschindler commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-26 Thread via GitHub
uschindler commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1564518097 Hi, I changed the CHANGES.txt entry in main and 9.x to correctly refer to ARM's chipset feature (NEON). @rmuir asked me to correct it. See: https://en.wikipedia.org/wiki/ARM_architectu

[GitHub] [lucene] uschindler commented on pull request #12268: add BitSet.clear()

2023-05-26 Thread via GitHub
uschindler commented on PR #12268: URL: https://github.com/apache/lucene/pull/12268#issuecomment-1564522169 Hi, sorry this went out of my view. Could you please add a CHANGES.txt entry in the 9.7 part of the file?

[GitHub] [lucene] uschindler commented on pull request #12268: add BitSet.clear()

2023-05-26 Thread via GitHub
uschindler commented on PR #12268: URL: https://github.com/apache/lucene/pull/12268#issuecomment-1564523245 I will then press the merge button and cherry-pick it in 9.x branch for next release 9.7.

[GitHub] [lucene] uschindler commented on pull request #12293: Capture build scans on ge.apache.org to benefit from deep build insights

2023-05-26 Thread via GitHub
uschindler commented on PR #12293: URL: https://github.com/apache/lucene/pull/12293#issuecomment-1564543093 I am fine with the changes (mostly), but I still don't understand why this needs to be on top-top level and can't be inside the `gradle/` subfolder. Also please make it conditio

[GitHub] [lucene] clayburn commented on pull request #12293: Capture build scans on ge.apache.org to benefit from deep build insights

2023-05-26 Thread via GitHub
clayburn commented on PR #12293: URL: https://github.com/apache/lucene/pull/12293#issuecomment-1564550262 > Is there a solution for 3rd party build Servers not having any CI secret. > I am fine with the changes (mostly), except. > Also please make it conditionally when running o

[GitHub] [lucene] gashutos commented on a diff in pull request #12334: Fix searchafter query high latency when after value is out of range for segment

2023-05-26 Thread via GitHub
gashutos commented on code in PR #12334: URL: https://github.com/apache/lucene/pull/12334#discussion_r1206958937 ## lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java: ## @@ -204,13 +200,21 @@ private void updateCompetitiveIterator() throws IOExcep

[GitHub] [lucene] gsmiller commented on a diff in pull request #12320: Add "direct to binary" option for DaciukMihovAutomatonBuilder and use it in TermInSetQuery#visit

2023-05-26 Thread via GitHub
gsmiller commented on code in PR #12320: URL: https://github.com/apache/lucene/pull/12320#discussion_r1206966623 ## lucene/core/src/java/org/apache/lucene/util/UnicodeUtil.java: ## @@ -477,38 +477,60 @@ public static int UTF8toUTF32(final BytesRef utf8, final int[] ints) {

[GitHub] [lucene] alessandrobenedetti commented on pull request #12311: Integrate the Incubating Panama Vector API

2023-05-26 Thread via GitHub
alessandrobenedetti commented on PR #12311: URL: https://github.com/apache/lucene/pull/12311#issuecomment-1564575142 thanks @uschindler for the explanation, I appreciate the work you are doing!

[GitHub] [lucene] gsmiller commented on a diff in pull request #12320: Add "direct to binary" option for DaciukMihovAutomatonBuilder and use it in TermInSetQuery#visit

2023-05-26 Thread via GitHub
gsmiller commented on code in PR #12320: URL: https://github.com/apache/lucene/pull/12320#discussion_r1206972753 ## lucene/core/src/java/org/apache/lucene/util/UnicodeUtil.java: ## @@ -477,38 +477,60 @@ public static int UTF8toUTF32(final BytesRef utf8, final int[] ints) {

[GitHub] [lucene] alessandrobenedetti commented on pull request #12314: Multi-value support for KnnVectorField

2023-05-26 Thread via GitHub
alessandrobenedetti commented on PR #12314: URL: https://github.com/apache/lucene/pull/12314#issuecomment-1564579302 > My main worry is the change to `FloatVectorValue`, moving to a multivalued iterator changes the access pattern so I don't find it right to change the interface and the mean

[GitHub] [lucene] jbellis commented on pull request #12268: add BitSet.clear()

2023-05-26 Thread via GitHub
jbellis commented on PR #12268: URL: https://github.com/apache/lucene/pull/12268#issuecomment-1564588827 done!

[GitHub] [lucene] gsmiller commented on pull request #12320: Add "direct to binary" option for DaciukMihovAutomatonBuilder and use it in TermInSetQuery#visit

2023-05-26 Thread via GitHub
gsmiller commented on PR #12320: URL: https://github.com/apache/lucene/pull/12320#issuecomment-1564595369 Thanks @mikemccand! Did a pass to address your comments. Much appreciated! I also added some testing around the minimization aspect of the automaton building. I think all feedback has b

[GitHub] [lucene] gsmiller merged pull request #12331: GH#12321: Reduce visibility of StringsToAutomaton

2023-05-26 Thread via GitHub
gsmiller merged PR #12331: URL: https://github.com/apache/lucene/pull/12331

[GitHub] [lucene] gsmiller merged pull request #12332: GH#12321: Marked DaciukMihovAutomatonBuilder as deprecated

2023-05-26 Thread via GitHub
gsmiller merged PR #12332: URL: https://github.com/apache/lucene/pull/12332

[GitHub] [lucene] uschindler merged pull request #12268: add BitSet.clear()

2023-05-26 Thread via GitHub
uschindler merged PR #12268: URL: https://github.com/apache/lucene/pull/12268

[GitHub] [lucene] uschindler commented on pull request #12332: GH#12321: Marked DaciukMihovAutomatonBuilder as deprecated

2023-05-26 Thread via GitHub
uschindler commented on PR #12332: URL: https://github.com/apache/lucene/pull/12332#issuecomment-1564636151 Thanks. Was also backported to 9.x and will be released with 9.7.

[GitHub] [lucene] uschindler commented on pull request #12268: add BitSet.clear()

2023-05-26 Thread via GitHub
uschindler commented on PR #12268: URL: https://github.com/apache/lucene/pull/12268#issuecomment-1564636408 Thanks. Was also backported to 9.x and will be released with 9.7.

[GitHub] [lucene] jimczi commented on pull request #12314: Multi-value support for KnnVectorField

2023-05-26 Thread via GitHub
jimczi commented on PR #12314: URL: https://github.com/apache/lucene/pull/12314#issuecomment-1564665498 > nothing in this PR is final nor I have any strong opinion about it. Sure, we're just discussing the approach, no worries. > In regards to your main worry, can you point me t

[GitHub] [lucene] gsmiller commented on issue #12321: Can we make `DaciukMihovAutomatonBuilder` pkg-private?

2023-05-26 Thread via GitHub
gsmiller commented on issue #12321: URL: https://github.com/apache/lucene/issues/12321#issuecomment-1564712751 Merged on `main` (#12331) and also added some deprecation notices on 9.x (#12332).

[GitHub] [lucene] gsmiller closed issue #12321: Can we make `DaciukMihovAutomatonBuilder` pkg-private?

2023-05-26 Thread via GitHub
gsmiller closed issue #12321: Can we make `DaciukMihovAutomatonBuilder` pkg-private? URL: https://github.com/apache/lucene/issues/12321

[GitHub] [lucene] dsmiley commented on pull request #12306: Make MAX_DIMENSIONS configurable via a system property.

2023-05-26 Thread via GitHub
dsmiley commented on PR #12306: URL: https://github.com/apache/lucene/pull/12306#issuecomment-1565223689 The PR has evolved from the first iteration. Are there remaining concerns with the PR as it is today? It shows that 2048 dimensions is tested, works, and thus is *supportable*. It wou