[GitHub] [lucene] jpountz commented on pull request #11840: GITHUB-11838 Add api to allow concurrent query rewrite

2022-10-18 Thread GitBox


jpountz commented on PR #11840:
URL: https://github.com/apache/lucene/pull/11840#issuecomment-1282004123

   > Of course in the case that somebody writes a subclass that only implements 
the IndexReader variant of rewrite it won't use the searcher. If this user 
subclass then rewrites subqueries on its own it will call 
"rewrite(IndexReader)" and so its subqueries won't use parallelization. This 
is not so dramatic as a new query in user's code will just never pass 
parallelization down to its subqueries.
   
   Using the same reasoning, I wonder if we'd solve 90% of the problem by 
propagating `Query#rewrite(IndexSearcher)` in `BooleanQuery`, 
`ConstantScoreQuery` and `DisjunctionMaxQuery` (the main queries that can be 
inner nodes in a query tree).
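
To make the propagation argument concrete, here is a minimal, self-contained sketch. It does not use Lucene: `Reader`, `Searcher`, `Query`, `Leaf`, `PropagatingNode` and `LegacyNode` are hypothetical stand-ins invented for illustration, and the default delegation from the searcher variant to the reader variant is an assumption about how such an API could be layered.

```java
import java.util.List;

public class RewritePropagationSketch {
  /** Hypothetical stand-in for an index reader (not the Lucene class). */
  static class Reader {}

  /** Hypothetical stand-in for a searcher that wraps a reader. */
  static class Searcher {
    final Reader reader = new Reader();
  }

  abstract static class Query {
    /** Legacy entry point: the searcher is not available here. */
    Query rewrite(Reader reader) {
      return this;
    }

    /** New entry point: falls back to the legacy variant unless overridden. */
    Query rewrite(Searcher searcher) {
      return rewrite(searcher.reader);
    }
  }

  /** Leaf query that records whether the searcher variant reached it. */
  static class Leaf extends Query {
    boolean sawSearcher;

    @Override
    Query rewrite(Searcher searcher) {
      sawSearcher = true;
      return this;
    }
  }

  /** Inner node that passes the searcher down to its children. */
  static class PropagatingNode extends Query {
    final List<Query> children;

    PropagatingNode(List<Query> children) {
      this.children = children;
    }

    @Override
    Query rewrite(Searcher searcher) {
      for (Query child : children) {
        child.rewrite(searcher);
      }
      return this;
    }
  }

  /** User-written inner node that only implements the legacy variant. */
  static class LegacyNode extends Query {
    final List<Query> children;

    LegacyNode(List<Query> children) {
      this.children = children;
    }

    @Override
    Query rewrite(Reader reader) {
      for (Query child : children) {
        child.rewrite(reader);
      }
      return this;
    }
  }

  /** Rewrites a one-leaf tree and reports whether the leaf saw the searcher. */
  static boolean leafSeesSearcher(boolean legacyParent) {
    Leaf leaf = new Leaf();
    Query root =
        legacyParent ? new LegacyNode(List.of(leaf)) : new PropagatingNode(List.of(leaf));
    root.rewrite(new Searcher());
    return leaf.sawSearcher;
  }

  public static void main(String[] args) {
    System.out.println("propagating parent: " + leafSeesSearcher(false));
    System.out.println("legacy parent: " + leafSeesSearcher(true));
  }
}
```

In this sketch the leaf only sees the searcher when every inner node above it forwards the searcher variant, which is why covering the common inner nodes would handle most query trees.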


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] jpountz commented on a diff in pull request #11840: GITHUB-11838 Add api to allow concurrent query rewrite

2022-10-18 Thread GitBox


jpountz commented on code in PR #11840:
URL: https://github.com/apache/lucene/pull/11840#discussion_r997895966


##
lucene/highlighter/src/test/org/apache/lucene/search/vectorhighlight/TestFieldQuery.java:
##
@@ -40,12 +41,23 @@
 
 public class TestFieldQuery extends AbstractTestCase {
   private float boost;
+  private IndexSearcher searcher;
 
   /** Set boost to a random value each time it is called. */
   private void initBoost() {
     boost = usually() ? 1F : random().nextFloat() * 1;
   }
 
+  @Override
+  public void setUp() throws Exception {
+    super.setUp();
+    if (reader == null) {
+      searcher = null;
+    } else {
+      searcher = newSearcher(reader);
+    }
+  }

Review Comment:
   Does it work? It looks like the reader is not instantiated in the setup, but 
in some protected helper methods that run after the setup? Should we instead 
modify the parent class to always instantiate an IndexSearcher when 
instantiating a reader?






[GitHub] [lucene] jpountz commented on pull request #11832: Added static factory method for loading VectorValues

2022-10-18 Thread GitBox


jpountz commented on PR #11832:
URL: https://github.com/apache/lucene/pull/11832#issuecomment-1282020859

   @shubhamvishu I'm sorry that your work didn't result in a merged commit, but 
it feels like it would be better not to merge this change. Thank you again for 
looking into this!





[GitHub] [lucene] jpountz closed pull request #11832: Added static factory method for loading VectorValues

2022-10-18 Thread GitBox


jpountz closed pull request #11832: Added static factory method for loading 
VectorValues
URL: https://github.com/apache/lucene/pull/11832





[GitHub] [lucene] jpountz commented on pull request #11722: Binary search the entries when all suffixes have the same length in a leaf block.

2022-10-18 Thread GitBox


jpountz commented on PR #11722:
URL: https://github.com/apache/lucene/pull/11722#issuecomment-1282060420

   @mikemccand You might want to have a look at this change since (I think) you 
are one of the people most familiar with the original code.





[GitHub] [lucene] jpountz merged pull request #11722: Binary search the entries when all suffixes have the same length in a leaf block.

2022-10-18 Thread GitBox


jpountz merged PR #11722:
URL: https://github.com/apache/lucene/pull/11722





[GitHub] [lucene] donnerpeter opened a new pull request, #11859: hunspell: speed up GeneratingSuggester by not deserializing non-suggestible roots

2022-10-18 Thread GitBox


donnerpeter opened a new pull request, #11859:
URL: https://github.com/apache/lucene/pull/11859

   We discard entries with NOSUGGEST (and some other) flags anyway, so let's 
bail out of processing them at an earlier stage.
   This speeds up suggestions for relatively short German words by about 20% 
for me.





[GitHub] [lucene] jpountz commented on pull request #11722: Binary search the entries when all suffixes have the same length in a leaf block.

2022-10-18 Thread GitBox


jpountz commented on PR #11722:
URL: https://github.com/apache/lucene/pull/11722#issuecomment-1282305919

   I had to revert this change because of test failures, e.g. this seed 
reproduces on the main branch:
   
   ```
   gradlew test --tests TestNumericDocValuesUpdates.testSortedIndex -Dtests.seed=4C6E977E1F29E069 -Dtests.locale=khq -Dtests.timezone=Asia/Hong_Kong -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   ```
   
   It seems like this test exercises some logic that 
`BasePostingsFormatTestCase` doesn't but I haven't figured out what yet.





[GitHub] [lucene] chatman commented on issue #10342: Integer overflow in total count in grouping results [LUCENE-9302]

2022-10-18 Thread GitBox


chatman commented on issue #10342:
URL: https://github.com/apache/lucene/issues/10342#issuecomment-1282339116

   I plan to update this PR to merge against the main branch shortly.





[GitHub] [lucene] stefanvodita commented on pull request #11780: GH#11601: Add ability to compute reader states after refresh

2022-10-18 Thread GitBox


stefanvodita commented on PR #11780:
URL: https://github.com/apache/lucene/pull/11780#issuecomment-1282371150

   That makes sense. Maybe I'm not addressing the right problem. @gsmiller - as 
the issue's author, what do you think?





[GitHub] [lucene] harishankar-gopalan commented on issue #11354: Reuse HNSW graphs when merging segments? [LUCENE-10318]

2022-10-18 Thread GitBox


harishankar-gopalan commented on issue #11354:
URL: https://github.com/apache/lucene/issues/11354#issuecomment-1282390590

   Hi @jmazanec15, I have a quick question. How do segment merges currently 
work for the HNSW graph in Lucene? Is the graph reconstructed from 
scratch?





[GitHub] [lucene] msokolov commented on a diff in pull request #11852: Luke Webapp

2022-10-18 Thread GitBox


msokolov commented on code in PR #11852:
URL: https://github.com/apache/lucene/pull/11852#discussion_r998344033


##
lucene/luke/src/java/org/apache/lucene/luke/app/web/LukeWebMain.java:
##
@@ -17,31 +17,78 @@
 
 package org.apache.lucene.luke.app.web;
 
+import java.net.InetSocketAddress;
 import java.util.concurrent.CountDownLatch;
+import java.util.HashMap;
+import java.util.Map;
 import org.apache.lucene.luke.app.IndexHandler;
 import org.apache.lucene.luke.util.LoggerFactory;
 
 /** Entry class for web Luke */
-public class LukeWebMain {
+public final class LukeWebMain {
+
+  private LukeWebMain() {
+  }
 
   static {
     LoggerFactory.initGuiLogging();
   }
 
   public static void main(String[] args) throws Exception {
-    String index = null;
-    if (args.length == 2 && args[0].equals("--index")) {
-      index = args[1];
-    } else {
-      System.err.println("usage: LukeWebMain --index ");
-      Runtime.getRuntime().exit(1);
+    Map parsed = null;
+    try {
+      parsed = parseArgs(args);
+    } catch (Exception e) {
+      usage(e.getMessage());
     }
-
     IndexHandler indexHandler = IndexHandler.getInstance();
-    indexHandler.open(index, "org.apache.lucene.store.FSDirectory", true, true, false);
+    indexHandler.open(getIndex(parsed), "org.apache.lucene.store.FSDirectory", true, true, false);
     CountDownLatch tombstone = new CountDownLatch(1);
-    HttpService httpService = new HttpService(indexHandler, tombstone);
+    HttpService httpService = new HttpService(getSockAddr(parsed), indexHandler, tombstone);
     httpService.start();
     tombstone.await();
   }
+
+  private static String getIndex(Map args) {
+    String index = (String) args.get("index");
+    if (index == null) {
+      usage("index arg is required");
+    }
+    return index;
+  }
+
+  private static InetSocketAddress getSockAddr(Map args) {
+    String host = (String) args.get("host");
+    int port = (Integer) args.getOrDefault("port", 8080);
+    if (host == null) {
+      return new InetSocketAddress(port);
Review Comment:
   I guess that's OK. SSH tunneling is always an option for ad hoc users (which 
I expect is the main use case?)






[GitHub] [lucene] benwtrent opened a new pull request, #11860: GITHUB-11830 Better optimize storage for vector connections

2022-10-18 Thread GitBox


benwtrent opened a new pull request, #11860:
URL: https://github.com/apache/lucene/pull/11860

   Vector search is much faster when the graph can fit in memory. Consequently, 
improvements in vector storage can translate to faster searches on larger 
graphs.
   
   One area of size reduction is node connections. Currently, they are stored 
as regular `int` values, but each connection usually needs fewer bits than a 
full `int` provides. 
   
   This commit proposes storing node connections within the graph with 
`PackedInts`. This adds a new codec reader/writer for the HNSW graph. 
   
   Node connections are stored in a `PackedInts` stream, using the maximal 
possible value of connections as the upper limit. Additionally, the packed ints 
version is written so the reader uses the appropriate PackedInts version when 
reading the data.
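
As a rough illustration of the idea (not the PR's code and not Lucene's `PackedInts` implementation), the sketch below derives a bit width from the largest value that can occur and packs non-negative values back to back into a `long[]`. The class and method names are hypothetical, and the example sizes in `main` are assumptions chosen only to show the arithmetic.

```java
public class PackedNeighborsSketch {
  /** Number of bits needed to represent every value in [0, maxValue]. */
  static int bitsRequired(long maxValue) {
    return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
  }

  /** Packs non-negative values, bitsPerValue bits each, contiguously into a long[]. */
  static long[] pack(int[] values, int bitsPerValue) {
    long totalBits = (long) values.length * bitsPerValue;
    long[] blocks = new long[(int) ((totalBits + 63) / 64)];
    for (int i = 0; i < values.length; i++) {
      long bitPos = (long) i * bitsPerValue;
      int block = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      blocks[block] |= ((long) values[i]) << shift;
      if (shift + bitsPerValue > 64) {
        // The value straddles two blocks: store the high bits in the next block.
        blocks[block + 1] |= ((long) values[i]) >>> (64 - shift);
      }
    }
    return blocks;
  }

  /** Reads the value at the given index back out of the packed blocks. */
  static int get(long[] blocks, int index, int bitsPerValue) {
    long bitPos = (long) index * bitsPerValue;
    int block = (int) (bitPos >>> 6);
    int shift = (int) (bitPos & 63);
    long value = blocks[block] >>> shift;
    if (shift + bitsPerValue > 64) {
      value |= blocks[block + 1] << (64 - shift);
    }
    return (int) (value & ((1L << bitsPerValue) - 1));
  }

  public static void main(String[] args) {
    // Assumed example: node ids in a graph with one million nodes.
    int bits = bitsRequired(999_999);
    int[] neighbors = {0, 5, 999_999, 123_456, 7};
    long[] packed = pack(neighbors, bits);
    System.out.println("bits per value: " + bits);
    System.out.println("packed bytes: " + packed.length * Long.BYTES
        + " vs int[] bytes: " + neighbors.length * Integer.BYTES);
    System.out.println("value at index 2: " + get(packed, 2, bits));
  }
}
```

The saving in a real index depends on the actual maximum value and the number of connections per node, so the figures printed here are not comparable to the measurements in this PR.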
   
   
   This change yielded, on average, a 30% space savings with minimal change in 
queries per second (QPS).
   
   There are probably even better storage optimization options; if anybody 
knows of any (I am new to the Lucene world), please let me know!
   
   An in-depth investigation of QPS is available here: 
https://github.com/apache/lucene/issues/11830#issuecomment-1279207529
   
   closes: https://github.com/apache/lucene/issues/11830





[GitHub] [lucene] shahrs87 commented on pull request #907: LUCENE-10357 Ghost fields and postings/points

2022-10-18 Thread GitBox


shahrs87 commented on PR #907:
URL: https://github.com/apache/lucene/pull/907#issuecomment-1282791420

   @jpountz  Can you please review this patch again? Thank you





[GitHub] [lucene] jtibshirani opened a new pull request, #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-18 Thread GitBox


jtibshirani opened a new pull request, #11861:
URL: https://github.com/apache/lucene/pull/11861

   When reading large segments, the vectors format can fail with a validation
   error:
   
   ```
   java.lang.IllegalStateException: Vector data length 3070061568 not matching
   size=999369 * dim=768 * byteSize=4 = -1224905728
   ```
   
   The problem is that we use an integer to represent the size, which is too
   small to hold it. The bug snuck in during the work to enable int8 values,
   which switched a long value to an int.
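
The numbers in the error message can be reproduced with plain Java arithmetic. The sketch below is only an illustration of the overflow, not the code from the reader; the class and method names are hypothetical.

```java
public class VectorDataLengthSketch {
  /** 32-bit arithmetic: the product silently wraps around when it exceeds Integer.MAX_VALUE. */
  static int lengthAsInt(int size, int dim, int byteSize) {
    return size * dim * byteSize;
  }

  /** Widening the first operand to long makes the whole product 64-bit. */
  static long lengthAsLong(int size, int dim, int byteSize) {
    return (long) size * dim * byteSize;
  }

  public static void main(String[] args) {
    System.out.println(lengthAsInt(999369, 768, 4));  // prints -1224905728
    System.out.println(lengthAsLong(999369, 768, 4)); // prints 3070061568
  }
}
```

Where failing loudly is preferable to widening, `Math.multiplyExact` throws an `ArithmeticException` on overflow instead of wrapping.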
   
   Closes #11858.





[GitHub] [lucene] jtibshirani commented on pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-18 Thread GitBox


jtibshirani commented on PR #11861:
URL: https://github.com/apache/lucene/pull/11861#issuecomment-1282872442

   As a note, this only touches the read codepath and has no effect on data 
format, so it's safe to fix the current codec directly.
   
   I tried to add a test but didn't see a good way:
   * The bug is only triggered when you add over 2GB of vector data -- this is 
much too slow for unit tests!
   * I also tried exposing `validateFieldEntry`, removing tricky dependencies 
like `FieldEntry`, and testing that directly... but it didn't seem very helpful.
   
   Let me know if you have suggestions on testing.





[GitHub] [lucene] iverase commented on pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-18 Thread GitBox


iverase commented on PR #11861:
URL: https://github.com/apache/lucene/pull/11861#issuecomment-1282966111

   I had to change this test not too long ago to index 4B points instead of 
2B to trigger a bug as well. Maybe something like that, as a Monster test, might 
work for you?
   
https://github.com/apache/lucene/blob/fe8d11254a8a768608d7bb5e2bf8dcfd2c2c9310/lucene/core/src/test/org/apache/lucene/util/bkd/Test4BBKDPoints.java





[GitHub] [lucene] jtibshirani commented on pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-18 Thread GitBox


jtibshirani commented on PR #11861:
URL: https://github.com/apache/lucene/pull/11861#issuecomment-1282994720

   Thanks @iverase ! Do the monster tests get run regularly (perhaps during 
nightly builds)?





[GitHub] [lucene] jtibshirani commented on a diff in pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2022-10-18 Thread GitBox


jtibshirani commented on code in PR #11860:
URL: https://github.com/apache/lucene/pull/11860#discussion_r998757995


##
lucene/core/src/java/org/apache/lucene/codecs/lucene95/Lucene95HnswVectorsReader.java:
##
@@ -0,0 +1,505 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.codecs.lucene95;
+
+import static org.apache.lucene.search.DocIdSetIterator.NO_MORE_DOCS;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.lucene.codecs.CodecUtil;
+import org.apache.lucene.codecs.KnnVectorsReader;
+import org.apache.lucene.index.*;
+import org.apache.lucene.search.ScoreDoc;
+import org.apache.lucene.search.TopDocs;
+import org.apache.lucene.search.TotalHits;
+import org.apache.lucene.store.ChecksumIndexInput;
+import org.apache.lucene.store.DataInput;
+import org.apache.lucene.store.IndexInput;
+import org.apache.lucene.util.Bits;
+import org.apache.lucene.util.IOUtils;
+import org.apache.lucene.util.RamUsageEstimator;
+import org.apache.lucene.util.hnsw.HnswGraph;
+import org.apache.lucene.util.hnsw.HnswGraphSearcher;
+import org.apache.lucene.util.hnsw.NeighborQueue;
+import org.apache.lucene.util.packed.DirectMonotonicReader;
+import org.apache.lucene.util.packed.PackedInts;
+
+/**
+ * Reads vectors from the index segments along with index data structures supporting KNN search.
+ *
+ * @lucene.experimental
+ */
+public final class Lucene95HnswVectorsReader extends KnnVectorsReader {
+
+  private final FieldInfos fieldInfos;
+  private final Map fields = new HashMap<>();
+  private final IndexInput vectorData;
+  private final IndexInput vectorIndex;
+
+  Lucene95HnswVectorsReader(SegmentReadState state) throws IOException {
+    this.fieldInfos = state.fieldInfos;
+    int versionMeta = readMetadata(state);
+    boolean success = false;
+    try {
+      vectorData =
+          openDataInput(
+              state,
+              versionMeta,
+              Lucene95HnswVectorsFormat.VECTOR_DATA_EXTENSION,
+              Lucene95HnswVectorsFormat.VECTOR_DATA_CODEC_NAME);
+      vectorIndex =
+          openDataInput(
+              state,
+              versionMeta,
+              Lucene95HnswVectorsFormat.VECTOR_INDEX_EXTENSION,
+              Lucene95HnswVectorsFormat.VECTOR_INDEX_CODEC_NAME);
+      success = true;
+    } finally {
+      if (success == false) {
+        IOUtils.closeWhileHandlingException(this);
+      }
+    }
+  }
+
+  private int readMetadata(SegmentReadState state) throws IOException {
+    String metaFileName =
+        IndexFileNames.segmentFileName(
+            state.segmentInfo.name, state.segmentSuffix, Lucene95HnswVectorsFormat.META_EXTENSION);
+    int versionMeta = -1;
+    try (ChecksumIndexInput meta = state.directory.openChecksumInput(metaFileName, state.context)) {
+      Throwable priorE = null;
+      try {
+        versionMeta =
+            CodecUtil.checkIndexHeader(
+                meta,
+                Lucene95HnswVectorsFormat.META_CODEC_NAME,
+                Lucene95HnswVectorsFormat.VERSION_START,
+                Lucene95HnswVectorsFormat.VERSION_CURRENT,
+                state.segmentInfo.getId(),
+                state.segmentSuffix);
+        readFields(meta, state.fieldInfos);
+      } catch (Throwable exception) {
+        priorE = exception;
+      } finally {
+        CodecUtil.checkFooter(meta, priorE);
+      }
+    }
+    return versionMeta;
+  }
+
+  private static IndexInput openDataInput(
+      SegmentReadState state, int versionMeta, String fileExtension, String codecName)
+      throws IOException {
+    String fileName =
+        IndexFileNames.segmentFileName(state.segmentInfo.name, state.segmentSuffix, fileExtension);
+    IndexInput in = state.directory.openInput(fileName, state.context);
+    boolean success = false;
+    try {
+      int versionVectorData =
+          CodecUtil.checkIndexHeader(
+              in,
+              codecName,
+              Lucene95HnswVectorsFormat.VERSION_START,
+              Lucene95HnswVectorsFormat.VERSION_CURRENT,
+              state.segmentInfo.getId(),
+              state.segmentSuffix);
+      if (versionMeta != versionV

[GitHub] [lucene] stevenschlansker commented on pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-18 Thread GitBox


stevenschlansker commented on PR #11822:
URL: https://github.com/apache/lucene/pull/11822#issuecomment-1283082251

   I updated this PR to rename the field to include `Ms`.
   I added a test case for both no timeout (0) and 1000ms. I verified the test 
fails (doesn't terminate) without the new configuration option set.
   
   I don't think this is a great candidate for a random test - randomization is 
wonderful for perturbing data and finding edge cases in algorithms. In this 
case, it is just an int we compare to a clock, so randomizing doesn't seem 
likely to uncover any helpful edge cases. Please let me know if this is still 
desired.
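
For readers unfamiliar with the pattern, a bounded wait of this kind can be sketched as below. This is not the `PrimaryNode` implementation: the class, method and parameter names are hypothetical, and treating a timeout of 0 as "wait indefinitely" is an assumption based on the "no timeout (0)" case mentioned above.

```java
import java.util.function.BooleanSupplier;

public class CloseTimeoutSketch {
  /**
   * Polls until allClosed reports true or the timeout elapses.
   *
   * @param timeoutMs maximum time to wait in milliseconds; 0 is assumed to mean no timeout
   * @return true if everything closed, false if the wait timed out or was interrupted
   */
  static boolean waitForClose(BooleanSupplier allClosed, long timeoutMs) {
    long startNs = System.nanoTime();
    while (!allClosed.getAsBoolean()) {
      long elapsedMs = (System.nanoTime() - startNs) / 1_000_000;
      if (timeoutMs > 0 && elapsedMs >= timeoutMs) {
        return false;
      }
      try {
        Thread.sleep(10);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Already closed: returns immediately even with no timeout.
    System.out.println(waitForClose(() -> true, 0));
    // Never closes: gives up after roughly 50 ms.
    System.out.println(waitForClose(() -> false, 50));
  }
}
```

The only input is a duration compared against a clock, which is consistent with the point above that fixed cases (no timeout, and a short timeout against a remote that never closes) cover the behavior.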





[GitHub] [lucene] zhaih commented on a diff in pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-18 Thread GitBox


zhaih commented on code in PR #11822:
URL: https://github.com/apache/lucene/pull/11822#discussion_r998806414


##
lucene/CHANGES.txt:
##
@@ -44,6 +44,8 @@ New Features
 * LUCENE-10626 Hunspell: add tools to aid dictionary editing:
   analysis introspection, stem expansion and stem/flag suggestion (Peter 
Gromov)
 
+* GITHUB#11822: Configure replicator PrimaryNode replica shutdown timeout. 
(Steven Schlansker)

Review Comment:
   Could we put it under 9.5 so it can be released in the next release and 
doesn't need to wait until 10?






[GitHub] [lucene] stevenschlansker commented on a diff in pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-18 Thread GitBox


stevenschlansker commented on code in PR #11822:
URL: https://github.com/apache/lucene/pull/11822#discussion_r998813121


##
lucene/CHANGES.txt:
##
@@ -44,6 +44,8 @@ New Features
 * LUCENE-10626 Hunspell: add tools to aid dictionary editing:
   analysis introspection, stem expansion and stem/flag suggestion (Peter 
Gromov)
 
+* GITHUB#11822: Configure replicator PrimaryNode replica shutdown timeout. 
(Steven Schlansker)

Review Comment:
   Sounds great to me! I moved the CHANGES entry.






[GitHub] [lucene] zhaih merged pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-18 Thread GitBox


zhaih merged PR #11822:
URL: https://github.com/apache/lucene/pull/11822





[GitHub] [lucene] zhaih closed issue #11674: PrimaryNode close waits for replicas to close, but there is no guarantee they ever will [LUCENE-10638]

2022-10-18 Thread GitBox


zhaih closed issue #11674: PrimaryNode close waits for replicas to close, but 
there is no guarantee they ever will [LUCENE-10638]
URL: https://github.com/apache/lucene/issues/11674





[GitHub] [lucene] zhaih commented on a diff in pull request #11840: GITHUB-11838 Add api to allow concurrent query rewrite

2022-10-18 Thread GitBox


zhaih commented on code in PR #11840:
URL: https://github.com/apache/lucene/pull/11840#discussion_r998828034


##
lucene/highlighter/src/test/org/apache/lucene/search/vectorhighlight/TestFieldQuery.java:
##
@@ -40,12 +41,23 @@
 
 public class TestFieldQuery extends AbstractTestCase {
   private float boost;
+  private IndexSearcher searcher;
 
   /** Set boost to a random value each time it is called. */
   private void initBoost() {
     boost = usually() ? 1F : random().nextFloat() * 1;
   }
 
+  @Override
+  public void setUp() throws Exception {
+    super.setUp();
+    if (reader == null) {
+      searcher = null;
+    } else {
+      searcher = newSearcher(reader);
+    }
+  }

Review Comment:
   It works, but with flaws: if one of the unit tests alters the reader so that 
it becomes non-null, the searcher stays null. (Although I personally don't 
like the pattern where a shared member is altered within a test.)
   
   > Should we instead modify the parent class to always instantiate an 
IndexSearcher when instantiating a reader?
   
   Yeah that's better, I changed accordingly, thanks!






[GitHub] [lucene] zhaih commented on pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-18 Thread GitBox


zhaih commented on PR #11822:
URL: https://github.com/apache/lucene/pull/11822#issuecomment-1283191722

   I merged it, but it seems there are test failures:
   ```
   org.apache.lucene.index.TestIndexFileDeleter > test suite's output saved to /home/runner/work/lucene/lucene/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.index.TestIndexFileDeleter.txt, copied below:
  > org.apache.lucene.store.AlreadyClosedException: ReaderPool is already closed
  > at __randomizedtesting.SeedInfo.seed([FF62209E9305A732:16FF57ACE5CC40CF]:0)
  > at app//org.apache.lucene.index.ReaderPool.get(ReaderPool.java:400)
  > at app//org.apache.lucene.index.IndexWriter.writeReaderPool(IndexWriter.java:3922)
  > at app//org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:592)
  > at app//org.apache.lucene.index.IndexWriter$4.getReader(IndexWriter.java:6479)
  > at app//org.apache.lucene.tests.index.RandomIndexWriter.getReader(RandomIndexWriter.java:488)
  > at app//org.apache.lucene.tests.index.RandomIndexWriter.getReader(RandomIndexWriter.java:420)
  > at app//org.apache.lucene.index.TestIndexFileDeleter.testExcInDecRef(TestIndexFileDeleter.java:485)
  > at java.base@17.0.4.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  > at java.base@17.0.4.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
  > at java.base@17.0.4.1/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  > at java.base@17.0.4.1/java.lang.reflect.Method.invoke(Method.java:568)
  > at app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  > at app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  > at app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  > at app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  > at app//org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
  > at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  > at app//org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  > at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  > at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  > at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
  > at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  > at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  > at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
  > at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
  > at app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  > at app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  > at app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  > at app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  > at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  > at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  > at app//org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  > at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  > at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  > at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  > at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  > at app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  > at 
app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAf