[GitHub] [lucene] jpountz commented on pull request #12334: Fix searchafter query high latency when after value is out of range for segment
jpountz commented on PR #12334: URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590631585 We just upgraded Elasticsearch to a Lucene snapshot that has this change, and this triggered major speedups on some queries. In my opinion, the PR title and description don't do justice to this change, since it helps not only when `after` is out of range but also when `after` is within the range and filtering on the `after` value alone significantly reduces the number of hits to evaluate. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
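The two cases described in the comment above (a segment lying entirely at or below `after`, and a segment where only part of the values exceed `after`) can be illustrated with a toy model. This is a sketch under stated assumptions, not Lucene's collector code: the class `SearchAfterSkipSketch`, its `Page` type, and the representation of a segment as an ascending `long[]` are all hypothetical, and the binary search merely stands in for whatever index structure lets a real implementation skip non-competitive documents.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model only: each "segment" is an ascending array of sort values. Not Lucene code. */
final class SearchAfterSkipSketch {

  /** One result page plus the number of documents that had to be looked at. */
  static final class Page {
    final List<Long> values;
    final int evaluated;

    Page(List<Long> values, int evaluated) {
      this.values = values;
      this.evaluated = evaluated;
    }
  }

  /** Index of the first element strictly greater than {@code after} in an ascending array. */
  static int firstGreater(long[] sorted, long after) {
    int lo = 0;
    int hi = sorted.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (sorted[mid] <= after) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return lo;
  }

  /** Returns the {@code n} smallest values strictly greater than {@code after}. */
  static Page nextPage(List<long[]> segments, long after, int n) {
    List<Long> candidates = new ArrayList<>();
    int evaluated = 0;
    for (long[] segment : segments) {
      // Case 1: the whole segment is at or below `after`, so nothing in it can compete.
      if (segment.length == 0 || segment[segment.length - 1] <= after) {
        continue;
      }
      // Case 2: `after` falls inside the segment's range; start at the first competitive value.
      for (int i = firstGreater(segment, after); i < segment.length; i++) {
        candidates.add(segment[i]);
        evaluated++;
      }
    }
    Collections.sort(candidates);
    List<Long> top = new ArrayList<>(candidates.subList(0, Math.min(n, candidates.size())));
    return new Page(top, evaluated);
  }
}
```

In this model the filter on `after` applies from the first document onward, before any result queue has filled up, which is the behaviour the retitled PR describes.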
[GitHub] [lucene] gashutos commented on pull request #12334: Fix searchafter query high latency when after value is out of range for segment
gashutos commented on PR #12334: URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590638213 > We just upgraded Elasticsearch to a Lucene snapshot that has this change, and this triggered major speedups on some queries. In my opinion, the PR title and description don't do justice to this change, since it helps not only when after is out of range but also when after is within the range and filtering on the after value alone significantly reduces the number of hits to evaluate. @jpountz Agreed!
[GitHub] [lucene] jpountz commented on pull request #12334: Fix searchafter query high latency when after value is out of range for segment
jpountz commented on PR #12334: URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590663201 @gashutos I think we should make users aware of this optimization, would you be up for opening another PR that adds a CHANGES entry?
[GitHub] [lucene] jpountz commented on pull request #12334: Fix searchafter query high latency when after value is out of range for segment
jpountz commented on PR #12334: URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590666069 Let's also update the title/description of this PR?
[GitHub] [lucene] gashutos opened a new pull request, #12367: Add CHANGES.txt for #12334 Honor after value for skipping documents even if queue is not full for PagingFieldCollector
gashutos opened a new pull request, #12367: URL: https://github.com/apache/lucene/pull/12367 ### Description Adding a CHANGES.txt entry in the Improvements section.
[GitHub] [lucene] gashutos closed pull request #12367: Add CHANGES.txt for #12334 Honor after value for skipping documents even if queue is not full for PagingFieldCollector
gashutos closed pull request #12367: Add CHANGES.txt for #12334 Honor after value for skipping documents even if queue is not full for PagingFieldCollector URL: https://github.com/apache/lucene/pull/12367
[GitHub] [lucene] gashutos opened a new pull request, #12368: Add CHANGES.txt for #12334 Honor after value for skipping documents e…
gashutos opened a new pull request, #12368: URL: https://github.com/apache/lucene/pull/12368 Adding CHANGES.txt for #12334
[GitHub] [lucene] gashutos commented on pull request #12334: Honor after value for skipping documents even if queue is not full for PagingFieldCollector
gashutos commented on PR #12334: URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590701060 Sure, changed the title/description, LMK if it looks good. CHANGES.txt PR: https://github.com/apache/lucene/pull/12368
[GitHub] [lucene] javanna commented on issue #12347: Allow extensions of IndexSearcher to provide custom SliceExecutor and slices computation
javanna commented on issue #12347: URL: https://github.com/apache/lucene/issues/12347#issuecomment-1590702834
heya @sohami thanks a lot for sharing more context.
> With custom slice computation to control the max slices per request/index the limiting factor in SliceExecutor will not be needed.
Good point, agreed. Also, QueueSizeBasedExecutor is quite opinionated and non-configurable, and it gets applied based on an instanceof check on the provided executor, which is not fantastic. Another thought on my end: executing sometimes on the caller thread and sometimes on the executor makes things hard to reason about: how do you size the two thread pools if you can't easily tell what load they are subjected to?
Instead of making the slice executor configurable then, I would consider removing it entirely, and forcing the collection to always happen on the separate thread pool. I think we'll need to figure out how to handle rejections from the executor thread pool, as today the collection happens on the caller thread whenever there's a rejection, which I don't think is a behaviour we want to keep. We could also leave this to the executor implementation that is provided.
I believe that the QueueSizeBasedExecutor was contributed by OpenSearch: would the approach suggested above be feasible for you folks? I am thinking it would simplify things and provide a better user experience for Lucene users.
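The concern about work landing on the caller thread after a rejection can be reproduced with plain `java.util.concurrent`, independently of Lucene. The sketch below is illustrative only: `RejectionPolicySketch` and `runOverflowTask` are hypothetical names, and the two JDK rejection handlers stand in for the two behaviours under discussion (run on the caller versus surface the rejection); it does not reflect how `SliceExecutor` or `QueueSizeBasedExecutor` are implemented.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Illustrative only: shows where an overflow task ends up under two JDK rejection policies. */
final class RejectionPolicySketch {

  /**
   * Saturates a pool with one worker and a one-slot queue, then submits one more task.
   * Returns the name of the thread that ran the extra task, or "rejected" if it never ran.
   */
  static String runOverflowTask(boolean callerRuns) {
    RejectedExecutionHandler handler =
        callerRuns ? new ThreadPoolExecutor.CallerRunsPolicy() : new ThreadPoolExecutor.AbortPolicy();
    ThreadPoolExecutor pool =
        new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1), handler);
    CountDownLatch release = new CountDownLatch(1);
    Runnable blocker =
        () -> {
          try {
            release.await(); // keep the worker busy until the experiment is over
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        };
    String[] ranOn = {"rejected"};
    try {
      pool.execute(blocker); // occupies the single worker thread
      pool.execute(blocker); // fills the single queue slot
      try {
        // The pool is saturated, so this submission goes to the rejection handler.
        pool.execute(() -> ranOn[0] = Thread.currentThread().getName());
      } catch (RejectedExecutionException e) {
        // AbortPolicy: the caller is told about the rejection and decides what to do.
      }
    } finally {
      release.countDown();
      pool.shutdown();
    }
    return ranOn[0];
  }
}
```

With `CallerRunsPolicy` the extra task runs on the submitting thread, so that thread's load depends on how saturated the pool is; with `AbortPolicy` the rejection is visible and the policy choice moves to the caller or to the executor implementation.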
[GitHub] [lucene] jpountz commented on pull request #12334: Honor after value for skipping documents even if queue is not full for PagingFieldCollector
jpountz commented on PR #12334: URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590703139 Looks great, thanks!
[GitHub] [lucene] jpountz merged pull request #12368: Add CHANGES.txt for #12334 Honor after value for skipping documents e…
jpountz merged PR #12368: URL: https://github.com/apache/lucene/pull/12368
[GitHub] [lucene] LuXugang commented on pull request #12349: CompetitiveIterator should be null if sort field does not exist in TermOrdValComparator
LuXugang commented on PR #12349: URL: https://github.com/apache/lucene/pull/12349#issuecomment-1590716969
```java
public void test111() throws IOException {
  Directory dir = newDirectory();
  IndexWriterConfig iwc = new IndexWriterConfig(new MockAnalyzer(random()));
  RandomIndexWriter indexWriter = new RandomIndexWriter(random(), dir, iwc);
  Document doc;
  Random random = new Random();
  int count = 0;
  while (count++ < 10) {
    doc = new Document();
    doc.add(new SortedSetDocValuesField("sortedSet", new BytesRef("a")));
    doc.add(new StringField("name", String.valueOf(random.nextInt(100)), StringField.Store.YES));
    indexWriter.addDocument(doc);
  }
  indexWriter.commit();
  IndexReader reader = indexWriter.getReader();
  IndexSearcher searcher = newSearcher(reader);
  assert reader.maxDoc() == 10;
  Query query = new MatchAllDocsQuery();
  Sort sort = new Sort(new SortedSetSortField("field no exist ", false));
  TopDocs noSearchField = searcher.search(query, 2000);
  assert noSearchField.totalHits.value == 2001;
  TopDocs hasSearchField = searcher.search(query, 2000, sort);
  // if the sort field does not exist, should we terminate early once the top 2000 are collected?
  assert hasSearchField.totalHits.value == 10;
  indexWriter.close();
  reader.close();
  dir.close();
}
```
If the search sort field does not exist, should we early terminate collection after TopN collected?
[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors
alessandrobenedetti commented on code in PR #12253: URL: https://github.com/apache/lucene/pull/12253#discussion_r1229310619 ## lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/ConstKnnFloatValueSource.java: ## @@ -0,0 +1,74 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.queries.function.valuesource; + +import java.io.IOException; +import java.util.Arrays; +import java.util.Map; +import java.util.Objects; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.queries.function.FunctionValues; +import org.apache.lucene.queries.function.ValueSource; +import org.apache.lucene.util.VectorUtil; + +/** Function that returns a constant float vector value for every document. */ +public class ConstKnnFloatValueSource extends ValueSource { + private final float[] vector; + + public ConstKnnFloatValueSource(float[] constVector) { +this.vector = VectorUtil.checkFinite(Objects.requireNonNull(constVector, "constVector")); Review Comment: "constVector" -> maybe a better message? "the input constant vector is null" for example? 
I struggled to read this code, thinking it was a reference to some variable or constant, but it was just a message (this also applies to the other non-null check).
[GitHub] [lucene] uschindler commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors
uschindler commented on code in PR #12253: URL: https://github.com/apache/lucene/pull/12253#discussion_r1229323827 ## lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/ConstKnnFloatValueSource.java: ## @@ -0,0 +1,74 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.queries.function.valuesource; + +import java.io.IOException; +import java.util.Arrays; +import java.util.Map; +import java.util.Objects; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.queries.function.FunctionValues; +import org.apache.lucene.queries.function.ValueSource; +import org.apache.lucene.util.VectorUtil; + +/** Function that returns a constant float vector value for every document. */ +public class ConstKnnFloatValueSource extends ValueSource { + private final float[] vector; + + public ConstKnnFloatValueSource(float[] constVector) { +this.vector = VectorUtil.checkFinite(Objects.requireNonNull(constVector, "constVector")); Review Comment: Actually, there are inconsistencies in how it is used. For parameter checks I tend to use just the variable name of the parameter. I have no strong preference. It is also not harmonized in Lucene.
The JDK uses the "variable name" approach.
[GitHub] [lucene] uschindler commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors
uschindler commented on code in PR #12253: URL: https://github.com/apache/lucene/pull/12253#discussion_r1229323827 ## lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/ConstKnnFloatValueSource.java: ## @@ -0,0 +1,74 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.queries.function.valuesource; + +import java.io.IOException; +import java.util.Arrays; +import java.util.Map; +import java.util.Objects; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.queries.function.FunctionValues; +import org.apache.lucene.queries.function.ValueSource; +import org.apache.lucene.util.VectorUtil; + +/** Function that returns a constant float vector value for every document. */ +public class ConstKnnFloatValueSource extends ValueSource { + private final float[] vector; + + public ConstKnnFloatValueSource(float[] constVector) { +this.vector = VectorUtil.checkFinite(Objects.requireNonNull(constVector, "constVector")); Review Comment: Actually, there are inconsistencies in how it is used. For parameter checks I tend to use just the variable name of the parameter. I have no strong preference. It is also not harmonized in Lucene. The JDK also uses both approaches...
[GitHub] [lucene] uschindler commented on a diff in pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors
uschindler commented on code in PR #12253: URL: https://github.com/apache/lucene/pull/12253#discussion_r1229329251 ## lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/ConstKnnFloatValueSource.java: ## @@ -0,0 +1,74 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.queries.function.valuesource; + +import java.io.IOException; +import java.util.Arrays; +import java.util.Map; +import java.util.Objects; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.queries.function.FunctionValues; +import org.apache.lucene.queries.function.ValueSource; +import org.apache.lucene.util.VectorUtil; + +/** Function that returns a constant float vector value for every document. 
*/ +public class ConstKnnFloatValueSource extends ValueSource { + private final float[] vector; + + public ConstKnnFloatValueSource(float[] constVector) { +this.vector = VectorUtil.checkFinite(Objects.requireNonNull(constVector, "constVector")); Review Comment: - https://github.com/openjdk/jdk/blob/bd79db3930f192f6742e29a63a6d1c3bc3dd3385/src/java.base/share/classes/java/nio/channels/Channels.java#L87 - https://github.com/openjdk/jdk/blob/bd79db3930f192f6742e29a63a6d1c3bc3dd3385/src/java.base/share/classes/java/util/StringJoiner.java#L126
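For readers unfamiliar with the two-argument form being discussed: the second argument of `Objects.requireNonNull(T, String)` becomes the message of the `NullPointerException`, so the choice between the bare parameter name and a full sentence only affects what the caller sees in the exception. A minimal sketch follows; the helper class `NullCheckMessageSketch` is hypothetical, and the sentence-style message is the one proposed in the review comment above.

```java
import java.util.Objects;

/** Illustrative only: shows what each message style produces when the argument is null. */
final class NullCheckMessageSketch {

  /** Runs a null check on a null array with the given message and returns the exception text. */
  static String messageFor(String message) {
    float[] constVector = null; // deliberately null to trigger the check
    try {
      Objects.requireNonNull(constVector, message);
      return "no exception";
    } catch (NullPointerException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    // "Variable name" style, as used in the PR.
    System.out.println(messageFor("constVector"));
    // Sentence style, as suggested in the review.
    System.out.println(messageFor("the input constant vector is null"));
  }
}
```

Both styles behave identically at runtime; the difference is purely in how readable the call site and the resulting stack trace are.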
[GitHub] [lucene] jpountz commented on pull request #12349: CompetitiveIterator should be null if sort field does not exist in TermOrdValComparator
jpountz commented on PR #12349: URL: https://github.com/apache/lucene/pull/12349#issuecomment-1590865303 I agree that we should fix this comparator so that the last call to `IndexSearcher.search` in your test only collects 2000 hits. This doesn't seem to be what your PR does though?
[GitHub] [lucene] jpountz merged pull request #12366: Move TermAndBoost back to its original location.
jpountz merged PR #12366: URL: https://github.com/apache/lucene/pull/12366
[GitHub] [lucene] javanna opened a new pull request, #12369: Increased the likelihood of leveraging inter-segment concurrency in tests
javanna opened a new pull request, #12369: URL: https://github.com/apache/lucene/pull/12369 We have recently increased the likelihood of leveraging inter-segment search concurrency in tests when newSearcher is used to create the index searcher (see #959). When parallel execution is enabled though, an executor is only set 50% of the time, and parallel execution is dependent on the number of documents and segments indexed. That means that out of 1000 test runs that use RandomIndexWriter to index a random number of docs up to 1000, we will effectively parallelize only a couple of times. This commit increases the likelihood of running concurrent searches by further lowering the slice thresholds and setting the executor frequently instead of 50% of the time.
[GitHub] [lucene] javanna commented on a diff in pull request #12369: Increased the likelihood of leveraging inter-segment concurrency in tests
javanna commented on code in PR #12369: URL: https://github.com/apache/lucene/pull/12369#discussion_r1229356704 ## lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java: ## @@ -1965,9 +1966,9 @@ public static IndexSearcher newSearcher( .addClosedListener(cacheKey -> TestUtil.shutdownExecutorService(ex)); } IndexSearcher ret; + int maxDocPerSlice = random.nextBoolean() ? 1 : 1 + random.nextInt(1000); + int maxSegmentsPerSlice = random.nextBoolean() ? 1 : 1 + random.nextInt(10); Review Comment: This may be too aggressive, as we may end up with way too many slices depending on how many docs and segments tests have. An alternative would be to have a different value distribution that is closer to the lower bound of the range. Another option could be to make this configurable so that tests that want a behaviour that is close to production can override it?
[GitHub] [lucene] javanna commented on a diff in pull request #12369: Increased the likelihood of leveraging inter-segment concurrency in tests
javanna commented on code in PR #12369: URL: https://github.com/apache/lucene/pull/12369#discussion_r1229359014 ## lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java: ## @@ -1965,9 +1966,9 @@ public static IndexSearcher newSearcher( .addClosedListener(cacheKey -> TestUtil.shutdownExecutorService(ex)); } IndexSearcher ret; + int maxDocPerSlice = random.nextBoolean() ? 1 : 1 + random.nextInt(1000); + int maxSegmentsPerSlice = random.nextBoolean() ? 1 : 1 + random.nextInt(10); Review Comment: I do think that when `useThreads` is true, we should do our best to leverage concurrency in at least half of the runs, rather than 0.2% of the runs. Being this dependent on the number of docs and segments makes it particularly challenging to come up with a good default value. Possibly the proposed behaviour is good for tests that index a low number of docs, which is the majority of the Lucene tests?
[GitHub] [lucene] jpountz commented on a diff in pull request #12369: Increased the likelihood of leveraging inter-segment concurrency in tests
jpountz commented on code in PR #12369: URL: https://github.com/apache/lucene/pull/12369#discussion_r1229481989 ## lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java: ## @@ -1941,7 +1940,7 @@ public static IndexSearcher newSearcher( } else { int threads = 0; final ThreadPoolExecutor ex; - if (r.getReaderCacheHelper() == null || random.nextBoolean()) { + if (r.getReaderCacheHelper() == null || rarely()) { Review Comment: I'd prefer to keep this one a `random.nextBoolean()` as the semantics of `useThreads` to me are about whether the test _may_ use threads. The point is to allow some tests to disable threading by passing `useThreads = false`. ## lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java: ## @@ -1965,9 +1966,9 @@ public static IndexSearcher newSearcher( .addClosedListener(cacheKey -> TestUtil.shutdownExecutorService(ex)); } IndexSearcher ret; + int maxDocPerSlice = random.nextBoolean() ? 1 : 1 + random.nextInt(1000); + int maxSegmentsPerSlice = random.nextBoolean() ? 1 : 1 + random.nextInt(10); Review Comment: It looks ok to me, worst-case scenario it will create one slice per segment, which shouldn't be an adversarial case.
[GitHub] [lucene] javanna commented on a diff in pull request #12369: Increased the likelihood of leveraging inter-segment concurrency in tests
javanna commented on code in PR #12369: URL: https://github.com/apache/lucene/pull/12369#discussion_r1229508836 ## lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java: ## @@ -1941,7 +1940,7 @@ public static IndexSearcher newSearcher( } else { int threads = 0; final ThreadPoolExecutor ex; - if (r.getReaderCacheHelper() == null || random.nextBoolean()) { + if (r.getReaderCacheHelper() == null || rarely()) { Review Comment: I see, maybe it's too many changes at the same time. Not setting the executor half of the time though lowers the likelihood quite a bit, and it is lowered further by the slice thresholds. I agree that we should not guarantee that we always parallelize when we may use threads, yet I am trying to have that happen at least 50% of the time, instead of a couple of times every 1000 runs.
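The threshold expression quoted in the diff, `random.nextBoolean() ? 1 : 1 + random.nextInt(1000)`, yields the minimum value 1 slightly more than half of the time (the `nextBoolean()` branch plus the 1-in-1000 chance in the other branch). The sketch below only checks that distribution empirically with `java.util.Random`; the class `SliceThresholdSketch`, its method names, and the seed are hypothetical, and it says nothing about how often a real test run ends up with more than one slice, since that also depends on the executor and on the documents and segments indexed.

```java
import java.util.Random;

/** Illustrative only: empirical distribution of the threshold expression from the diff. */
final class SliceThresholdSketch {

  /** Same shape as the diff: 1 half of the time, otherwise uniform in [1, bound]. */
  static int pick(Random random, int bound) {
    return random.nextBoolean() ? 1 : 1 + random.nextInt(bound);
  }

  /** Fraction of draws that hit the minimum value 1. */
  static double fractionAtMinimum(long seed, int bound, int trials) {
    Random random = new Random(seed);
    int atMinimum = 0;
    for (int i = 0; i < trials; i++) {
      if (pick(random, bound) == 1) {
        atMinimum++;
      }
    }
    return (double) atMinimum / trials;
  }

  public static void main(String[] args) {
    // Expected to be close to 0.5 for both bounds used in the diff (1000 docs, 10 segments).
    System.out.println(fractionAtMinimum(42L, 1000, 100_000));
    System.out.println(fractionAtMinimum(42L, 10, 100_000));
  }
}
```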
[GitHub] [lucene] LuXugang commented on pull request #12349: CompetitiveIterator should be null if sort field does not exist in TermOrdValComparator
LuXugang commented on PR #12349: URL: https://github.com/apache/lucene/pull/12349#issuecomment-1591110183
> This doesn't seem to be what your PR does though?
It indeed has no relation to this PR.
> If search sort field does not exist, should we early terminate collection after TopN collected?
I would like to open an issue for this.
[GitHub] [lucene] jpountz commented on pull request #12349: CompetitiveIterator should be null if sort field does not exist in TermOrdValComparator
jpountz commented on PR #12349: URL: https://github.com/apache/lucene/pull/12349#issuecomment-159992 +1
[GitHub] [lucene] uschindler commented on pull request #12281: Add checks in KNNVectorField / KNNVectorQuery to only allow non-null, non-empty and finite vectors
uschindler commented on PR #12281: URL: https://github.com/apache/lucene/pull/12281#issuecomment-1591112679 I did not see any slowdowns in last night's @mikemccand benchmark caused by the check during indexing and when building the query.
[GitHub] [lucene] uschindler commented on issue #12358: Optimize `count()` for BooleanQuery disjunction
uschindler commented on issue #12358: URL: https://github.com/apache/lucene/issues/12358#issuecomment-1591117808 Hi, thanks for crosschecking. So a 1-hour warmup does not change anything. Anyway, I'd use a newer JDK like 20.
[GitHub] [lucene] nreimers commented on issue #12342: Prevent VectorSimilarity.DOT_PRODUCT from returning negative scores
nreimers commented on issue #12342: URL: https://github.com/apache/lucene/issues/12342#issuecomment-1591318871 @msokolov The index / vector DB should return the dot product score as is. No scaling, no truncation. Using dot product is tremendously useful for embedding models: in asymmetric settings, where you want to map a short search query to a longer relevant document (the most common case in search), they perform much better than with cosine similarity or Euclidean distance. But here the index should return the values as is, and it should then be up to the user to truncate negative scores or to normalize these scores to pre-defined ranges.
[GitHub] [lucene] uschindler commented on issue #12342: Prevent VectorSimilarity.DOT_PRODUCT from returning negative scores
uschindler commented on issue #12342: URL: https://github.com/apache/lucene/issues/12342#issuecomment-1591331493 > @msokolov The index / vector DB should return the dot product score as is. No scaling, no truncation. > > Using dot product is tremendously useful for embedding models, they perform in asymmetric settings where you want to map a short search query to a longer relevant document (which is the most common case in search) much better than cosine similarity or euclidean distance. > > But here the index should return the values as is and it should then be up to the user to truncate negative scores or to normalize these scores to pre-defined ranges. The problem is that this is not compatible with Lucene.
[GitHub] [lucene] benwtrent commented on issue #12342: Prevent VectorSimilarity.DOT_PRODUCT from returning negative scores
benwtrent commented on issue #12342: URL: https://github.com/apache/lucene/issues/12342#issuecomment-1591347442 I would think that as long as more negative values are scored lower, we will retrieve documents in a sane manner. Scaling negative values to restrict them while not scaling positive values at all could work. The `_score` wouldn't always be exactly the dot product, but this allows KNN search to find the most relevant information, even if all of the dot products are negative when compared with the query vector. This brings us back to @jmazanec15's suggestion on scaling scores.
[GitHub] [lucene] msokolov commented on issue #12342: Prevent VectorSimilarity.DOT_PRODUCT from returning negative scores
msokolov commented on issue #12342: URL: https://github.com/apache/lucene/issues/12342#issuecomment-1591355715 Yeah, after consideration, I think we could maybe argue for changing the scaling of negative values given that they were documented as unsupported, even though it would break back-compat in the sense that scores would change. But I think we ought to preserve the scaling of non-negative values in case people have scaling factors they use for combining scores with other queries' scores. So we could go with @jmazanec15's suggestion, except leaving the scaling by 1/2 in place?
[GitHub] [lucene] msokolov commented on issue #12342: Prevent VectorSimilarity.DOT_PRODUCT from returning negative scores
msokolov commented on issue #12342: URL: https://github.com/apache/lucene/issues/12342#issuecomment-1591379022 Yeah. Another thing we could consider is doing this scaling in KnnVectorQuery and/or its Scorer. These have the ultimate responsibility of complying with the Scorer contract. If we did it there we wouldn't have to change the output of VectorSimilarity. However it's messy to do it there since this is specific to a particular similarity implementation, so on balance doing it in the similarity makes more sense to me.
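The scaling idea discussed in this thread can be sketched roughly as follows. This is only an illustration, not Lucene's implementation and not necessarily @jmazanec15's exact proposal: the class name `DotProductScoreSketch`, the method `scale`, and the formula used for the negative branch are assumptions made here. The sketch keeps the `(1 + dot) / 2` scaling for non-negative dot products and compresses negative ones below 0.5, so that more negative values still score lower.

```java
final class DotProductScoreSketch {
  private DotProductScoreSketch() {}

  /** Maps a raw dot product to a non-negative score that preserves ranking order. */
  static float scale(float dot) {
    if (dot >= 0f) {
      // Non-negative values keep the "scale by 1/2" mapping, so score
      // combinations that rely on it are unaffected.
      return (1f + dot) / 2f;
    }
    // Assumed mapping for negative values: increasing in dot, never negative,
    // and below 0.5, so it meets the non-negative branch at dot = 0.
    return 1f / (2f * (1f - dot));
  }
}
```

With this mapping, `scale(0f)` is 0.5 on both branches, so the function is continuous at zero, and a document with dot product -3 still ranks below one with dot product -1.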
[GitHub] [lucene] alessandrobenedetti merged pull request #12253: GITHUB-12252: Add function queries for computing similarity scores between knn vectors
alessandrobenedetti merged PR #12253: URL: https://github.com/apache/lucene/pull/12253
[GitHub] [lucene] alessandrobenedetti closed issue #12252: Add function queries for computing vector similarity between knn vectors
alessandrobenedetti closed issue #12252: Add function queries for computing vector similarity between knn vectors URL: https://github.com/apache/lucene/issues/12252
[GitHub] [lucene] sohami commented on issue #12347: Allow extensions of IndexSearcher to provide custom SliceExecutor and slices computation
sohami commented on issue #12347: URL: https://github.com/apache/lucene/issues/12347#issuecomment-1591452446 @javanna Thanks for your input. > Another thought on my end: executing sometimes on the caller thread, and sometimes on the executor makes things hard to reason about: how do you size the two thread pools if you can't easily tell what load they are subjected to? > Instead of making the slice executor configurable then, I would consider removing it entirely, and forcing the collection to always happen on the separate thread pool. I think we'll need to figure out how to handle rejections from the executor thread pool, as today the collection happens on the caller thread whenever there's a rejection which I don't think is a behaviour we want to keep. We could also leave this to the executor implementation that is provided. As you mentioned earlier (and I agree), it is hard to know which default works best for all usages. So providing a way to customize it will give users the flexibility to fit their use cases. That way we can see which custom mechanisms work well across users and then change the default later as needed. I would also like to try just removing the limiting factor while keeping the mechanism that executes the last slice on the caller thread, so a `SliceExecutor`-type interface will still be useful. For now, can we split the issue into two? We can potentially make the change for the first one now and follow up with the second. Thoughts? 1. Take the `LeafSlice[]` in the constructor to allow for custom slice computation. 2. Discuss different options to customize `SliceExecutor`, or whether we want to replace it with some other interface.
[GitHub] [lucene] Jackyrie2 opened a new pull request, #12371: [Draft] #12236 Lazily compute similarity score
Jackyrie2 opened a new pull request, #12371: URL: https://github.com/apache/lucene/pull/12371 ### Description Per @zhaih's suggestion in #12236, this PR moves the computation of the similarity score from `initalizedFromGraph` to a later time, when the `NeighborArray` needs to be sorted to pop out the worst non-diverse node. A new abstract class `ScoringFunction` is created to hold the context necessary to compute the similarity score, and is passed into the `addOutofOrder` function. Let me know if this solution works, as it puts extra strain on memory usage. I will work on writing unit tests, but the changes in this PR pass the current unit tests in the hnsw test directory.
[GitHub] [lucene] benwtrent commented on pull request #12371: [Draft] #12236 Lazily compute similarity score
benwtrent commented on PR #12371: URL: https://github.com/apache/lucene/pull/12371#issuecomment-1591770109 Hey @Jackyrie2 this does add some extra memory overhead, 4 new object references. It would be good if it was justified with a benchmark. Could you share some benchmarking on indexing throughput and segment merging? I expect those two places to be where we see improvement if any.
[GitHub] [lucene] javanna commented on issue #12347: Allow extensions of IndexSearcher to provide custom SliceExecutor and slices computation
javanna commented on issue #12347: URL: https://github.com/apache/lucene/issues/12347#issuecomment-1591822866 > Take the LeafSlice[] in constructor to allow for custom slice computation. Sounds good, I'll happily review that change. > Discuss different options to customize SliceExecutor or we will want to replace it with some other interface OK with discussing. I do think that making things pluggable is a change that's difficult to revert in terms of backwards compatibility, and I think we should put some effort into changing the current behaviour before we add new public abstractions.
[GitHub] [lucene] atris commented on issue #12347: Allow extensions of IndexSearcher to provide custom SliceExecutor and slices computation
atris commented on issue #12347: URL: https://github.com/apache/lucene/issues/12347#issuecomment-1591837122 > > Take the LeafSlice[] in constructor to allow for custom slice computation. > > Sounds good, I'll happily review that change. > > > Discuss different options to customize SliceExecutor or we will want to replace it with some other interface > > Ok to discussing, I do think that making things pluggable is a change that's difficult to revert in terms of backwards compatibility, and I think we should put some effort into changing the current behaviour before we add new public abstractions. Strong -1 to replacing the interface. I think it has worked well for many users for a while, and it would break backward compatibility to serve a specific use case. I am just catching up on this thread -- why does the current SliceExecutor not work for extension in this case?
[GitHub] [lucene] jbellis opened a new pull request, #12372: Reuse neighborqueue during hnsw index build (attempt 2)
jbellis opened a new pull request, #12372: URL: https://github.com/apache/lucene/pull/12372 This changes HnswGraphBuilder to re-use the same candidates queues for adding nodes by allocating them in the Builder instance. This saves about 2.5% of build time and takes memory allocations of NQ long[] from 25% of total to 0%. JFR runs are attached. The difference from the first attempt (which actually made things slower) is that it preserves the original code's behavior of using a 1-sized queue for the search in the levels above where the node actually gets added. [main.jfr.gz](https://github.com/apache/lucene/files/11749837/main.jfr.gz) [nq2.jfr.gz](https://github.com/apache/lucene/files/11749838/nq2.jfr.gz)
[GitHub] [lucene] jbellis commented on pull request #12372: Reuse neighborqueue during hnsw index build (attempt 2)
jbellis commented on PR #12372: URL: https://github.com/apache/lucene/pull/12372#issuecomment-1591859337 Additionally, the original change only re-used the candidates queues within a single addNode call, so this is improved in that respect as well.
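The reuse pattern described in this PR can be sketched in isolation. The following is a hypothetical illustration, not the code from the PR: `QueueReuseSketch`, `LongQueue`, `Builder`, and the `allocations` counter are all made up here, and the sketch does not model the 1-sized queue used for the upper levels. It only shows a builder that allocates its candidate queue once and clears it for each node instead of allocating a new backing `long[]` per call.

```java
final class QueueReuseSketch {
  /** Counts backing-array allocations so the effect of reuse is observable. */
  static int allocations = 0;

  /** Hypothetical fixed-capacity buffer of longs; stands in for a candidate queue. */
  static final class LongQueue {
    private final long[] items;
    private int size;

    LongQueue(int capacity) {
      items = new long[capacity];
      allocations++;
    }

    void add(long value) {
      items[size++] = value;
    }

    int size() {
      return size;
    }

    /** Resets the queue without releasing the backing array. */
    void clear() {
      size = 0;
    }
  }

  /** Hypothetical builder that owns one queue for its whole lifetime. */
  static final class Builder {
    private final LongQueue candidates = new LongQueue(16);

    /** Clears and refills the shared queue instead of allocating a new one. */
    int addNode(long node) {
      candidates.clear();
      candidates.add(node);
      return candidates.size();
    }
  }
}
```

Calling `addNode` any number of times on one `Builder` leaves the allocation counter at 1, which is the effect the PR reports for the queue's `long[]` allocations.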
[GitHub] [lucene] sohami commented on issue #12347: Allow extensions of IndexSearcher to provide custom SliceExecutor and slices computation
sohami commented on issue #12347: URL: https://github.com/apache/lucene/issues/12347#issuecomment-1591876384 @atris To summarize, there are two separate pieces of functionality I am looking to add: 1) Custom slice computation which the extension can provide. For this we can provide a constructor in `IndexSearcher` which takes in a `LeafSlice` array from the extension. I think there is probably no concern with this. 2) A mechanism to provide a custom `SliceExecutor` implementation, or to deprecate it in favor of some other mechanism. I would ideally like to provide a mechanism for extensions to be able to give a custom implementation of it. The default implementation takes into consideration a certain limiting factor to apply back pressure, which will not be needed in all cases (as shared [above](https://github.com/apache/lucene/issues/12347#issuecomment-1589876811)); a custom implementation will also simplify the reasoning about which `slices` got executed on which thread pool. So keeping the existing default and giving the flexibility to customize it is what I guess will be helpful here. This is still being discussed, and it would be great to hear your feedback as well.
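To make item 1 concrete, here is a minimal sketch of what a custom slice computation might look like. It deliberately avoids Lucene types so that it stays self-contained: `SliceSketch`, `computeSlices`, and the `maxDocsPerSlice` parameter are hypothetical, and leaves are represented only by their document counts. In an actual extension, each group of leaf indices would presumably be turned into a `LeafSlice` and handed to the proposed `IndexSearcher` constructor.

```java
import java.util.ArrayList;
import java.util.List;

final class SliceSketch {
  private SliceSketch() {}

  /**
   * Greedily packs leaves, in the given order, into slices. A new slice is started
   * whenever adding the next leaf would push the current slice past maxDocsPerSlice.
   * A leaf that is larger than the limit on its own ends up alone in a slice.
   * Returns, for each slice, the indices of the leaves it contains.
   */
  static List<List<Integer>> computeSlices(int[] leafDocCounts, int maxDocsPerSlice) {
    List<List<Integer>> slices = new ArrayList<>();
    List<Integer> current = new ArrayList<>();
    long docsInCurrent = 0;
    for (int i = 0; i < leafDocCounts.length; i++) {
      if (!current.isEmpty() && docsInCurrent + leafDocCounts[i] > maxDocsPerSlice) {
        slices.add(current);
        current = new ArrayList<>();
        docsInCurrent = 0;
      }
      current.add(i);
      docsInCurrent += leafDocCounts[i];
    }
    if (!current.isEmpty()) {
      slices.add(current);
    }
    return slices;
  }
}
```

An extension could swap in any other policy here (for example, one slice per leaf, or balancing by estimated cost) without touching how the slices are executed, which is why the two items can be discussed separately.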
[GitHub] [lucene] jbellis opened a new pull request, #12373: require that float vector components are smaller than 1E17 to prevent overflowing to Infinity
jbellis opened a new pull request, #12373: URL: https://github.com/apache/lucene/pull/12373 Following up to PR #12281
[GitHub] [lucene] jbellis commented on pull request #12373: require that float vector components are smaller than 1E17 to prevent overflowing to Infinity
jbellis commented on PR #12373: URL: https://github.com/apache/lucene/pull/12373#issuecomment-1591954514 cc @uschindler
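The motivation in the PR title can be illustrated with a small sketch. It is not the check added by the PR: the class `VectorBoundSketch`, the constant `MAX_COMPONENT`, and both methods are hypothetical. It shows that summing squares in `float` arithmetic overflows to Infinity for very large components, whereas components below 1E17 leave ample headroom, since each square is then below about 1E34 while the largest finite `float` is about 3.4E38.

```java
final class VectorBoundSketch {
  private VectorBoundSketch() {}

  // Assumed bound, taken from the PR title; the actual check may differ in detail.
  static final float MAX_COMPONENT = 1e17f;

  /** Sum of squared components, computed in float arithmetic. */
  static float squaredNorm(float[] v) {
    float sum = 0f;
    for (float x : v) {
      sum += x * x;
    }
    return sum;
  }

  /** True if every component is strictly smaller in magnitude than the bound. */
  static boolean withinBound(float[] v) {
    for (float x : v) {
      // The negated comparison also rejects NaN, for which every comparison is false.
      if (!(Math.abs(x) < MAX_COMPONENT)) {
        return false;
      }
    }
    return true;
  }
}
```

For instance, a two-component vector of `1e20f` values fails `withinBound`, and its `squaredNorm` is infinite because `1e20f * 1e20f` already exceeds the `float` range.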
[GitHub] [lucene] sohami opened a new pull request, #12374: Provide constructor to accept the LeafSlice computed by extensions
sohami opened a new pull request, #12374: URL: https://github.com/apache/lucene/pull/12374 ### Description Add a constructor which takes in the slices computed by extensions and uses them for running the search concurrently on the provided executor. This is based on the discussion on the issue https://github.com/apache/lucene/issues/12347
[GitHub] [lucene] sohami commented on issue #12347: Allow extensions of IndexSearcher to provide custom SliceExecutor and slices computation
sohami commented on issue #12347: URL: https://github.com/apache/lucene/issues/12347#issuecomment-1591995488 @javanna @atris I have created a PR (#12374) for item 1 above for now.