jpountz merged PR #13359:
URL: https://github.com/apache/lucene/pull/13359
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apa
github-actions[bot] commented on PR #13359:
URL: https://github.com/apache/lucene/pull/13359#issuecomment-2215694348
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
jpountz commented on PR #13359:
URL: https://github.com/apache/lucene/pull/13359#issuecomment-2186240846
I will merge soon if there are no objections.
jpountz commented on PR #13359:
URL: https://github.com/apache/lucene/pull/13359#issuecomment-2176590274
I pushed a new approach. Instead of `prepareSeekExact` returning `void`, it
now returns a `Supplier` and forbids calling any other method on `TermsEnum`
until the `Supplier` has been con
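
A minimal sketch of the contract described in this comment, assuming a toy in-memory enum rather than Lucene's actual `TermsEnum`. `ToyTermsEnum`, its `term()` accessor, and the use of `java.util.function.BooleanSupplier` as the returned supplier type are hypothetical stand-ins chosen for illustration; the comment only says that a `Supplier` is returned and that other methods are forbidden until it has been consumed.

```java
import java.util.List;
import java.util.TreeSet;
import java.util.function.BooleanSupplier;

public class PrepareSeekSketch {

  /** Hypothetical stand-in for a terms enum; this is not Lucene's TermsEnum. */
  static final class ToyTermsEnum {
    private final TreeSet<String> terms;
    private boolean pending; // true while a prepared seek has not been consumed yet
    private String current;  // term the enum is positioned on, or null

    ToyTermsEnum(List<String> terms) {
      this.terms = new TreeSet<>(terms);
    }

    /** Starts a seek (real code could issue a prefetch here) and defers the lookup. */
    BooleanSupplier prepareSeekExact(String text) {
      checkNoPendingSeek();
      pending = true;
      return () -> {
        // Consuming the supplier completes the seek and unlocks the enum again.
        pending = false;
        boolean found = terms.contains(text);
        current = found ? text : null;
        return found;
      };
    }

    /** Returns the current term; rejected while a prepared seek is outstanding. */
    String term() {
      checkNoPendingSeek();
      return current;
    }

    private void checkNoPendingSeek() {
      if (pending) {
        throw new IllegalStateException("consume the supplier from prepareSeekExact first");
      }
    }
  }

  public static void main(String[] args) {
    // Two enums stand in for two segments: prepare both seeks before consuming either.
    ToyTermsEnum segment0 = new ToyTermsEnum(List.of("apple", "banana"));
    ToyTermsEnum segment1 = new ToyTermsEnum(List.of("cherry"));
    BooleanSupplier seek0 = segment0.prepareSeekExact("apple");
    BooleanSupplier seek1 = segment1.prepareSeekExact("apple");
    System.out.println("segment0 has term: " + seek0.getAsBoolean());
    System.out.println("segment1 has term: " + seek1.getAsBoolean());
  }
}
```

Splitting the seek into a prepare step and a deferred completion step lets a caller start lookups on several enums before paying for any of them, while the pending flag keeps each enum from being used in a half-seeked state.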
github-actions[bot] commented on PR #13359:
URL: https://github.com/apache/lucene/pull/13359#issuecomment-2174672285
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
mikemccand commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1624448703
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,30 @@ private boolean setEOF() {
return true;
}
vsop-479 commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1623732526
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,30 @@ private boolean setEOF() {
return true;
}
jpountz commented on PR #13359:
URL: https://github.com/apache/lucene/pull/13359#issuecomment-2132902792
Now that #13408 has been merged, I could update the benchmark to simply call
IndexSearcher#search.
```java
import java.io.IOException;
import java.io.UncheckedIOE
mikemccand commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1613219485
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,30 @@ private boolean setEOF() {
return true;
}
mikemccand commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1613174241
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,30 @@ private boolean setEOF() {
return true;
}
vsop-479 commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1611094333
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,30 @@ private boolean setEOF() {
return true;
}
mikemccand commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609713219
##
lucene/core/src/java/org/apache/lucene/index/TermsEnum.java:
##
@@ -61,6 +62,21 @@ public enum SeekStatus {
*/
public abstract boolean seekExact(BytesRef
mikemccand commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609895554
##
lucene/core/src/java/org/apache/lucene/index/TermsEnum.java:
##
@@ -61,6 +62,21 @@ public enum SeekStatus {
*/
public abstract boolean seekExact(BytesRef
jpountz commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609879874
##
lucene/core/src/java/org/apache/lucene/index/TermsEnum.java:
##
@@ -61,6 +62,21 @@ public enum SeekStatus {
*/
public abstract boolean seekExact(BytesRef tex
jpountz commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609878850
##
lucene/core/src/java/org/apache/lucene/index/TermsEnum.java:
##
@@ -61,6 +62,21 @@ public enum SeekStatus {
*/
public abstract boolean seekExact(BytesRef tex
jpountz commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609839123
##
lucene/core/src/java/org/apache/lucene/search/TermQuery.java:
##
@@ -150,7 +170,12 @@ public Scorer get(long leadCost) throws IOException {
@Override
jpountz commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609830164
##
lucene/core/src/java/org/apache/lucene/search/BlendedTermQuery.java:
##
@@ -19,6 +19,7 @@
import java.io.IOException;
import java.util.Arrays;
import java.util.L
jpountz commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609822485
##
lucene/core/src/java/org/apache/lucene/index/TermsEnum.java:
##
@@ -61,6 +62,21 @@ public enum SeekStatus {
*/
public abstract boolean seekExact(BytesRef tex
jpountz commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609822235
##
lucene/core/src/java/org/apache/lucene/index/TermsEnum.java:
##
@@ -61,6 +62,21 @@ public enum SeekStatus {
*/
public abstract boolean seekExact(BytesRef tex
jpountz commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609821173
##
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##
@@ -3754,13 +3754,17 @@ public static Status.TermVectorStatus testTermVectors(
Ter
mikemccand commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609703874
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,30 @@ private boolean setEOF() {
return true;
}
mikemccand commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609702603
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,30 @@ private boolean setEOF() {
return true;
}
mikemccand commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609700636
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,30 @@ private boolean setEOF() {
return true;
}
jpountz commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609634889
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,30 @@ private boolean setEOF() {
return true;
}
mikemccand commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1609620348
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,30 @@ private boolean setEOF() {
return true;
}
jpountz commented on PR #13359:
URL: https://github.com/apache/lucene/pull/13359#issuecomment-2122746505
It creates a 50GB terms dictionary while my machine only has ~28GB of RAM
for the page cache, so many terms dictionary lookups result in page faults.
mikemccand commented on PR #13359:
URL: https://github.com/apache/lucene/pull/13359#issuecomment-2122733760
> But I created a benchmark that starts looking like running a Lucene query
that is encouraging
Was this with a forced-cold index?
rmuir commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1604776531
##
lucene/core/src/java/org/apache/lucene/index/TermsEnum.java:
##
@@ -61,6 +62,15 @@ public enum SeekStatus {
*/
public abstract boolean seekExact(BytesRef text)
jpountz commented on PR #13359:
URL: https://github.com/apache/lucene/pull/13359#issuecomment-2112625165
I iterated a bit on this change:
- `TermsEnum#prepareSeekExact` is introduced, which only prefetches data
which is later going to be needed by `TermsEnum#seekExact`.
- `TermStates
jpountz commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1598128358
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,31 @@ private boolean setEOF() {
return true;
}
rmuir commented on code in PR #13359:
URL: https://github.com/apache/lucene/pull/13359#discussion_r1597522761
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -307,6 +309,31 @@ private boolean setEOF() {
return true;
}
+
jpountz commented on PR #13359:
URL: https://github.com/apache/lucene/pull/13359#issuecomment-2105658311
This is a draft as I need to do more work on tests and making sure that this
new method cannot corrupt the state of the `SegmentTermsEnum`.
But I created a benchmark that start