[GitHub] [lucene] iverase opened a new issue, #11824: Performance regression on LatLonPoint#newPolygonQuery

2022-09-27 Thread GitBox


iverase opened a new issue, #11824:
URL: https://github.com/apache/lucene/issues/11824

   ### Description
   
   I just noticed a big performance regression on polygon queries using 
LatLonPoint fields in the [Lucene geo 
benchmarks](https://home.apache.org/~mikemccand/geobench.html):
   
   https://user-images.githubusercontent.com/29038686/192466543-419575d8-e1c2-483c-81e4-c122a92a694f.png
   
   I checked and the regression was introduced by this change: 
https://github.com/apache/lucene/pull/1017. 
   
   My suspicion is that before this change, SpatialQuery called the method 
`#getSpatialVisitor()` once for the whole index, but the new version 
calls it once per segment. This method might be expensive for LatLonPoint 
queries, hence the regression.
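
To illustrate the suspected cause, here is a minimal plain-Java sketch of the difference between building an expensive helper once per segment and once per index. All names (`VisitorHoistingSketch`, `buildVisitor`, `use`) are hypothetical stand-ins, not Lucene's classes, and the string `"visitor"` merely stands in for whatever `#getSpatialVisitor()` constructs.

```java
import java.util.List;

// Hypothetical illustration only; these are not Lucene's actual classes.
class VisitorHoistingSketch {
  // Counts how many times the (assumed expensive) build step runs.
  static int builds = 0;

  // Stands in for an expensive setup step that does not depend on the segment.
  static String buildVisitor() {
    builds++;
    return "visitor";
  }

  // Placeholder for the per-segment matching work.
  static void use(String visitor, String segment) {
    // intentionally empty
  }

  // Variant 1: the expensive step is repeated for every segment.
  static int searchPerSegment(List<String> segments) {
    builds = 0;
    for (String segment : segments) {
      String visitor = buildVisitor();
      use(visitor, segment);
    }
    return builds;
  }

  // Variant 2: the expensive step runs once and the result is shared.
  static int searchPerIndex(List<String> segments) {
    builds = 0;
    String visitor = buildVisitor();
    for (String segment : segments) {
      use(visitor, segment);
    }
    return builds;
  }

  public static void main(String[] args) {
    List<String> segments = List.of("_0", "_1", "_2", "_3");
    System.out.println("per-segment builds: " + searchPerSegment(segments));
    System.out.println("per-index builds: " + searchPerIndex(segments));
  }
}
```

With many segments, the first variant multiplies the setup cost by the segment count, which would be consistent with the regression described above if the setup is indeed expensive.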
   
   @nknize FYI
   
   
   
   ### Version and environment details
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] jpountz commented on pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-27 Thread GitBox


jpountz commented on PR #11796:
URL: https://github.com/apache/lucene/pull/11796#issuecomment-1259137841

   > Interesting point.. Thinking how/when we'd like to track the impact of 
temp output files. From what I understand, they won't be a part of commit and 
fsync. So if we're trying to measure increased disk or remote store I/O, we 
probably want to skip them?
   
   Indeed, temporary files are never part of a commit and never fsynced, but this may 
also be the case for a number of flushed segments: if flushed segments get 
included in a merge before the next commit, then they would never be part of a 
commit or fsynced either.
   
   > Although we delete the temp files right after, on a small box, maybe 
it gives us a sense of increased file writes or page faults.
   
   Some temporary files are also not necessarily short-lived, like the ones we 
create for stored fields when index sorting is enabled.
   
   I'm considering exposing write amplification separately for flushes (as 
`flushedBytes / totalIndexSize`), merges (as `(totalIndexSize + mergedBytes) / 
totalIndexSize`) and temporary files (as `(totalIndexSize + tempBytes) / 
totalIndexSize`) and pushing the responsibility to users of whether and how 
they would like to combine these various metrics?
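
As a rough sketch of the three ratios proposed above, the following plain-Java helper computes each one exactly as written in the comment. The class and method names are hypothetical, the byte counts in `main` are made-up example inputs, and nothing here reflects the API that the PR actually exposes.

```java
// Hypothetical helper; names and sample numbers are illustrative assumptions.
class WriteAmplificationSketch {
  // flushedBytes / totalIndexSize
  static double flushAmplification(long flushedBytes, long totalIndexSize) {
    requirePositive(totalIndexSize);
    return (double) flushedBytes / totalIndexSize;
  }

  // (totalIndexSize + mergedBytes) / totalIndexSize
  static double mergeAmplification(long mergedBytes, long totalIndexSize) {
    requirePositive(totalIndexSize);
    return (double) (totalIndexSize + mergedBytes) / totalIndexSize;
  }

  // (totalIndexSize + tempBytes) / totalIndexSize
  static double tempAmplification(long tempBytes, long totalIndexSize) {
    requirePositive(totalIndexSize);
    return (double) (totalIndexSize + tempBytes) / totalIndexSize;
  }

  private static void requirePositive(long totalIndexSize) {
    if (totalIndexSize <= 0) {
      throw new IllegalArgumentException("totalIndexSize must be > 0");
    }
  }

  public static void main(String[] args) {
    long totalIndexSize = 1_000; // made-up example values, in bytes
    System.out.println("flush: " + flushAmplification(1_200, totalIndexSize));
    System.out.println("merge: " + mergeAmplification(3_000, totalIndexSize));
    System.out.println("temp:  " + tempAmplification(500, totalIndexSize));
  }
}
```

Keeping the three values separate leaves the choice of how to combine them to the caller, which matches the intent stated in the comment.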
   
   





[GitHub] [lucene] iverase opened a new pull request, #11825: Build SpatialVisitor once per index

2022-09-27 Thread GitBox


iverase opened a new pull request, #11825:
URL: https://github.com/apache/lucene/pull/11825

   see https://github.com/apache/lucene/issues/11824
   
   I am not adding a CHANGES entry, as this bug has not been released. If the fix does 
not make it into Lucene 9.4, then we should add one.





[GitHub] [lucene] jpountz commented on pull request #11825: Build SpatialVisitor once per index

2022-09-27 Thread GitBox


jpountz commented on PR #11825:
URL: https://github.com/apache/lucene/pull/11825#issuecomment-1259183151

   Have you managed to confirm that it addressed the performance regression?





[GitHub] [lucene] iverase commented on pull request #11825: Build SpatialVisitor once per index

2022-09-27 Thread GitBox


iverase commented on PR #11825:
URL: https://github.com/apache/lucene/pull/11825#issuecomment-1259185529

   Yes, I did 





[GitHub] [lucene] iverase merged pull request #11825: Build SpatialVisitor once per index

2022-09-27 Thread GitBox


iverase merged PR #11825:
URL: https://github.com/apache/lucene/pull/11825





[GitHub] [lucene] uschindler commented on pull request #11825: Build SpatialVisitor once per index

2022-09-27 Thread GitBox


uschindler commented on PR #11825:
URL: https://github.com/apache/lucene/pull/11825#issuecomment-1259209942

   Thanks!





[GitHub] [lucene] uschindler commented on pull request #11823: GH-11819: Fix the Eclipse part to support development of the MR-JAR in IDE

2022-09-27 Thread GitBox


uschindler commented on PR #11823:
URL: https://github.com/apache/lucene/pull/11823#issuecomment-1259214503

   This PR also removes an obsolete XSL file used by Ant.





[GitHub] [lucene] uschindler merged pull request #11823: GH-11819: Fix the Eclipse part to support development of the MR-JAR in IDE

2022-09-27 Thread GitBox


uschindler merged PR #11823:
URL: https://github.com/apache/lucene/pull/11823





[GitHub] [lucene] uschindler commented on issue #11819: Fix Java 19 MR-JAR compilation in IDEs

2022-09-27 Thread GitBox


uschindler commented on issue #11819:
URL: https://github.com/apache/lucene/issues/11819#issuecomment-1259220386

   I merged the Eclipse part. Robert tried to get IDEA working yesterday, and 
it looks like it is "mostly working" if you set the Java versions correctly (to 
19). IDEA appears to pick the correct Java version per source folder, but 
compilation was still messed up.
   
   Of course, in both cases you need Java 19 added as a JDK in Eclipse and IDEA 
if you want to edit the Java 19 features.





[GitHub] [lucene] uschindler opened a new pull request, #11826: Let smoketester initialize local settings before running any checks (like Github CI or Jenkins)

2022-09-27 Thread GitBox


uschindler opened a new pull request, #11826:
URL: https://github.com/apache/lucene/pull/11826

   This closes #11820.





[GitHub] [lucene] uschindler commented on issue #11820: smoketester needs to run 'gradlew localSettings' first

2022-09-27 Thread GitBox


uschindler commented on issue #11820:
URL: https://github.com/apache/lucene/issues/11820#issuecomment-1259230934

   I have a simple PR: #11826
   
   We can introduce more advanced solutions like editing gradlew.sh/bat later. 
This should fix the issue.





[GitHub] [lucene] iverase closed issue #11824: Performance regression on LatLonPoint#newPolygonQuery

2022-09-27 Thread GitBox


iverase closed issue #11824: Performance regression on 
LatLonPoint#newPolygonQuery
URL: https://github.com/apache/lucene/issues/11824





[GitHub] [lucene] iverase commented on issue #11824: Performance regression on LatLonPoint#newPolygonQuery

2022-09-27 Thread GitBox


iverase commented on issue #11824:
URL: https://github.com/apache/lucene/issues/11824#issuecomment-1259240627

   Closed by https://github.com/apache/lucene/pull/11825





[GitHub] [lucene] uschindler closed issue #11820: smoketester needs to run 'gradlew localSettings' first

2022-09-27 Thread GitBox


uschindler closed issue #11820: smoketester needs to run 'gradlew 
localSettings' first
URL: https://github.com/apache/lucene/issues/11820





[GitHub] [lucene] uschindler merged pull request #11826: Let smoketester initialize local settings before running any checks (like Github CI or Jenkins)

2022-09-27 Thread GitBox


uschindler merged PR #11826:
URL: https://github.com/apache/lucene/pull/11826





[GitHub] [lucene] rmuir commented on issue #11819: Fix Java 19 MR-JAR compilation in IDEs

2022-09-27 Thread GitBox


rmuir commented on issue #11819:
URL: https://github.com/apache/lucene/issues/11819#issuecomment-125935

   i tried wrestling with it, but i'm not sure i even held it properly. 
ultimately i was able to get it to work by setting entire project to java 19. 
this works for now, because "mr-jar" is not "really" used (there are no 
duplicate classes)





[GitHub] [lucene] rmuir commented on issue #11820: smoketester needs to run 'gradlew localSettings' first

2022-09-27 Thread GitBox


rmuir commented on issue #11820:
URL: https://github.com/apache/lucene/issues/11820#issuecomment-1259358447

   thanks @uschindler ! it works now without any hassles.





[GitHub] [lucene] gsmiller commented on pull request #11803: DrillSideways optimizations

2022-09-27 Thread GitBox


gsmiller commented on PR #11803:
URL: https://github.com/apache/lucene/pull/11803#issuecomment-1259557102

   > Changes LGTM, do we need to add some unit tests?
   
   Thanks @zhaih. Let me consider some specific tests for this. I know our 
randomized testing for DrillSideways covers these code paths, but maybe some 
specific, non-random tests would be useful.





[GitHub] [lucene] iverase opened a new issue, #11827: Release manager should review lucene benchmarks before building release candidates

2022-09-27 Thread GitBox


iverase opened a new issue, #11827:
URL: https://github.com/apache/lucene/issues/11827

   ### Description
   
   We were about to release Lucene 9.4.0 with an important performance 
regression. This regression was showing up in the benchmarks, but we 
missed it because nobody reviewed them. The regression was actually caught during the daily 
monitoring of Elasticsearch development instead.
   
   Speaking with other Lucene committers, we thought that it may be a good idea to 
add a new task to the release manager handbook: review the Lucene benchmarks 
before building the release candidates. We are not expecting the 
release manager to find all regressions, but catching something as obvious as the one in 
Lucene 9.4.0 should have been possible.
   
   wdyt?
   





[GitHub] [lucene] gsmiller opened a new pull request, #11828: TermInSetQuery optimization when all docs in a field match a term

2022-09-27 Thread GitBox


gsmiller opened a new pull request, #11828:
URL: https://github.com/apache/lucene/pull/11828

   ### Description
   
   This changes the optimization present in `TermInSetQuery` to mimic the one 
in `MultiTermQueryConstantScoreWrapper`, bringing parity to the two approaches. 
More specifically, it optimizes the case where all docs with a value for the 
referenced field contain a given term (rather than requiring all docs in the 
segment to contain the term). The solution for 
`MultiTermQueryConstantScoreWrapper` was discussed in PR #11738 for reference.
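
The difference between the two conditions described above can be sketched in plain Java. This is only an illustration of the comparison, not the code in the PR: the class and method names are hypothetical, and the integer arguments stand in for statistics that, in Lucene, would come from calls such as `TermsEnum#docFreq()`, `Terms#getDocCount()` and `IndexReader#maxDoc()`.

```java
// Hypothetical sketch of the two checks; not the actual TermInSetQuery code.
class AllDocsMatchSketch {
  // Stricter check: the term must appear in every document of the segment.
  static boolean termMatchesWholeSegment(int docFreq, int maxDoc) {
    return docFreq == maxDoc;
  }

  // Relaxed check: the term must appear in every document that has
  // at least one value for the field.
  static boolean termMatchesAllDocsWithField(int docFreq, int fieldDocCount) {
    return docFreq == fieldDocCount;
  }

  public static void main(String[] args) {
    int maxDoc = 100;       // documents in the segment (made-up number)
    int fieldDocCount = 60; // documents with a value for the field (made-up number)
    int docFreq = 60;       // documents containing the term (made-up number)

    System.out.println(termMatchesWholeSegment(docFreq, maxDoc));
    System.out.println(termMatchesAllDocsWithField(docFreq, fieldDocCount));
  }
}
```

In this example the stricter check fails because 40 documents have no value for the field, while the relaxed check succeeds, so a sparse field can still benefit from the shortcut.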
   





[GitHub] [lucene] jpountz commented on issue #11592: Relax the maximum dirtiness for stored fields and term vectors? [LUCENE-10556]

2022-09-27 Thread GitBox


jpountz commented on issue #11592:
URL: https://github.com/apache/lucene/issues/11592#issuecomment-1259677572

   Updated numbers, now that TMP (TieredMergePolicy) no longer runs quadratic 
merges, and with BEST_COMPRESSION:
   
   |Dirtiness|Indexing time (msec)|
   |-|-|
   |0%|27219|
   |1%|26897|
   |5%|27196|
   |20%|27043|
   |33%|26188|
   |50%|24729|
   |75%|20993|
   |100%|9971|
   
   So the tolerable dirtiness doesn't affect indexing time too much unless 
it is unlimited (100%). My understanding is that because we merge 10 segments at 
once, an optimized merge typically increases dirtiness by 10x, so you need to 
bump the tolerable dirtiness very significantly to start seeing benefits. This 
benchmark suggests we wouldn't get very significant benefits from raising the 
maximum tolerable dirtiness.





[GitHub] [lucene] jpountz closed issue #11592: Relax the maximum dirtiness for stored fields and term vectors? [LUCENE-10556]

2022-09-27 Thread GitBox


jpountz closed issue #11592: Relax the maximum dirtiness for stored fields and 
term vectors? [LUCENE-10556]
URL: https://github.com/apache/lucene/issues/11592





[GitHub] [lucene] dweiss commented on issue #11819: Fix Java 19 MR-JAR compilation in IDEs

2022-09-27 Thread GitBox


dweiss commented on issue #11819:
URL: https://github.com/apache/lucene/issues/11819#issuecomment-1259684070

   (away until the end of the week, guys - out of reach). bb on sunday.





[GitHub] [lucene] nknize commented on issue #11824: Performance regression on LatLonPoint#newPolygonQuery

2022-09-27 Thread GitBox


nknize commented on issue #11824:
URL: https://github.com/apache/lucene/issues/11824#issuecomment-1259733493

   Just seeing this. That's exactly what it would be! It snuck in with one of those 
last commits on the long-running PR. Thanks for refactoring and merging, 
@iverase!





[GitHub] [lucene] rmuir commented on issue #11827: Release manager should review lucene benchmarks before building release candidates

2022-09-27 Thread GitBox


rmuir commented on issue #11827:
URL: https://github.com/apache/lucene/issues/11827#issuecomment-1259797860

   i think a good step is to just make sure that we review the benchmark 
results before releasing and send an email out about anything that looks 
suspicious.





[GitHub] [lucene] jpountz commented on a diff in pull request #11722: Binary search the entries when all suffixes have the same length in a leaf block.

2022-09-27 Thread GitBox


jpountz commented on code in PR #11722:
URL: https://github.com/apache/lucene/pull/11722#discussion_r981512784


##
lucene/test-framework/src/java/org/apache/lucene/tests/index/BasePostingsFormatTestCase.java:
##
@@ -367,6 +367,49 @@ public void testGhosts() throws Exception {
     dir.close();
   }
 
+  public void testBinarySearchTermLeaf() throws Exception {
+    Directory dir = newDirectory();
+
+    IndexWriterConfig iwc = newIndexWriterConfig(null);
+    iwc.setCodec(getCodec());
+    iwc.setMergePolicy(newTieredMergePolicy());
+    IndexWriter iw = new IndexWriter(dir, iwc);
+
+    for (int i = 10; i <= 100400; i++){
+      // only add odd number
+      if (i % 2 == 1) {
+        Document document = new Document();
+        document.add(new StringField("id", i + "", Field.Store.NO));
+        iw.addDocument(document);
+      }
+    }
+    iw.commit();
+    iw.forceMerge(1);
+
+    DirectoryReader reader = DirectoryReader.open(iw);
+    TermsEnum termsEnum = getOnlyLeafReader(reader).terms("id").iterator();
+    // test seekExact
+    for (int i = 10; i <= 100400; i++){
+      if (i % 2 == 1) {
+        assertTrue(termsEnum.seekExact(new BytesRef(i + "")));

Review Comment:
   maybe assert that the value of `termsEnum.term()` is correct after this call



##
lucene/test-framework/src/java/org/apache/lucene/tests/index/BasePostingsFormatTestCase.java:
##
@@ -367,6 +367,49 @@ public void testGhosts() throws Exception {
     dir.close();
   }
 
+  public void testBinarySearchTermLeaf() throws Exception {
+    Directory dir = newDirectory();
+
+    IndexWriterConfig iwc = newIndexWriterConfig(null);
+    iwc.setCodec(getCodec());
+    iwc.setMergePolicy(newTieredMergePolicy());
+    IndexWriter iw = new IndexWriter(dir, iwc);
+
+    for (int i = 10; i <= 100400; i++){
+      // only add odd number
+      if (i % 2 == 1) {
+        Document document = new Document();
+        document.add(new StringField("id", i + "", Field.Store.NO));
+        iw.addDocument(document);
+      }
+    }
+    iw.commit();
+    iw.forceMerge(1);
+
+    DirectoryReader reader = DirectoryReader.open(iw);
+    TermsEnum termsEnum = getOnlyLeafReader(reader).terms("id").iterator();
+    // test seekExact
+    for (int i = 10; i <= 100400; i++){
+      if (i % 2 == 1) {
+        assertTrue(termsEnum.seekExact(new BytesRef(i + "")));
+      } else {
+        assertFalse(termsEnum.seekExact(new BytesRef(i + "")));
+      }
+    }
+    // test seekCeil
+    for (int i = 10; i < 100400; i++){
+      if (i % 2 == 1) {
+        assertEquals(SeekStatus.FOUND, termsEnum.seekCeil(new BytesRef(i + "")));

Review Comment:
   maybe assert that the value of `termsEnum.term()` is correct after this call
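
To show what such a position check could look like without depending on Lucene, here is a plain-Java analog that uses a `TreeSet<String>` as a hypothetical stand-in for the sorted term dictionary; `SeekPositionSketch`, `oddTerms` and `seekCeil` are illustrative names, and `TreeSet#ceiling` only approximates the "smallest term greater than or equal to the target" behaviour of a ceiling seek.

```java
import java.util.TreeSet;

// Plain-Java analog of the review suggestion; TreeSet stands in for the
// term dictionary and is NOT how Lucene implements TermsEnum.
class SeekPositionSketch {
  // Builds the same term set as the test: odd numbers rendered as strings.
  static TreeSet<String> oddTerms(int from, int to) {
    TreeSet<String> terms = new TreeSet<>();
    for (int i = from; i <= to; i++) {
      if (i % 2 == 1) {
        terms.add(Integer.toString(i));
      }
    }
    return terms;
  }

  // Returns the smallest term >= target, or null if there is none.
  static String seekCeil(TreeSet<String> terms, String target) {
    return terms.ceiling(target);
  }

  public static void main(String[] args) {
    TreeSet<String> terms = oddTerms(10, 100400);
    for (int i = 10; i <= 100400; i++) {
      if (i % 2 == 1) {
        String target = Integer.toString(i);
        // Check the position, not only that the seek succeeded.
        if (!target.equals(seekCeil(terms, target))) {
          throw new IllegalStateException("wrong position for " + target);
        }
      }
    }
    // Terms sort as strings, so the ceiling of an absent term is not
    // necessarily the numerically next odd number.
    System.out.println(seekCeil(terms, "10")); // prints 100001
  }
}
```

The last line is why checking the positioned term is worthwhile: because terms are ordered lexicographically rather than numerically, the term an enum lands on after a ceiling seek for an absent value can be surprising, and a bug in the binary search would be easy to miss if only the return status were asserted.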






[GitHub] [lucene] elliotzlin commented on pull request #11724: LUCENE-10520 / #11556 HTMLStripCharFilter bugfix

2022-09-27 Thread GitBox


elliotzlin commented on PR #11724:
URL: https://github.com/apache/lucene/pull/11724#issuecomment-1259830640

   @rmuir would you be able to help with running the workflows?





[GitHub] [lucene] msokolov commented on issue #11827: Release manager should review lucene benchmarks before building release candidates

2022-09-27 Thread GitBox


msokolov commented on issue #11827:
URL: https://github.com/apache/lucene/issues/11827#issuecomment-1259830880

   yup. Possibly, too, if Mike M is bored he could implement an alerting system 
:) or export the data somehow so we could bolt one on the side?
   





[GitHub] [lucene] kaivalnp closed pull request #1059: Refactor KnnGraphTester

2022-09-27 Thread GitBox


kaivalnp closed pull request #1059: Refactor KnnGraphTester
URL: https://github.com/apache/lucene/pull/1059

