[GitHub] [lucene] jpountz commented on issue #11773: Could `PointRangeQuery`'s boundary values used for `NumericComparator` to calculate `estimatedNumberOfMatches`

2022-09-22 Thread GitBox
jpountz commented on issue #11773: URL: https://github.com/apache/lucene/issues/11773#issuecomment-1254622883 Thanks, I had not fully understood that you were after the case when both the filter and the sort would be on the same field. You are right that the collector could do better by bein

[GitHub] [lucene] wjp719 commented on pull request #687: LUCENE-10425:speed up IndexSortSortedNumericDocValuesRangeQuery#BoundedDocSetIdIterator construction using bkd binary search

2022-09-22 Thread GitBox
wjp719 commented on PR #687: URL: https://github.com/apache/lucene/pull/687#issuecomment-1254624579 > I would rather not add this option and make the binary search logic a bit more complex/inefficient. OK thanks, when the index sorts in descending order, I have tried bkd binary search

[GitHub] [lucene] dweiss closed pull request #11802: fix sentence iteration in opennlp package

2022-09-22 Thread GitBox
dweiss closed pull request #11802: fix sentence iteration in opennlp package URL: https://github.com/apache/lucene/pull/11802 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [lucene] dweiss commented on pull request #11802: fix sentence iteration in opennlp package

2022-09-22 Thread GitBox
dweiss commented on PR #11802: URL: https://github.com/apache/lucene/pull/11802#issuecomment-1254626299 Duplicated in #11734

[GitHub] [lucene] dweiss commented on pull request #11734: Fix repeating token sentence boundary bug

2022-09-22 Thread GitBox
dweiss commented on PR #11734: URL: https://github.com/apache/lucene/pull/11734#issuecomment-1254627495 I don't know what happened there but I'm sure it's going to be fixable. Let me take a look later today or tomorrow morning (I'm out of office today).

[GitHub] [lucene] rmuir commented on issue #11788: Upgrade ANTLR to version 4.11.1

2022-09-22 Thread GitBox
rmuir commented on issue #11788: URL: https://github.com/apache/lucene/issues/11788#issuecomment-1254645510 looks like an antlr problem, if they broke backwards compat, they prolly should have named it `5.x`? let's be careful about upgrading to new versions. newer antlr versions have

[GitHub] [lucene] uschindler commented on issue #11788: Upgrade ANTLR to version 4.11.1

2022-09-22 Thread GitBox
uschindler commented on issue #11788: URL: https://github.com/apache/lucene/issues/11788#issuecomment-1254658670 Thanks Robert. I would have said the same. In the worst case we should (like most projects do for ASM, e.g. forbidden apis) shade the antlr runtime to Lucene's package name and i

[GitHub] [lucene] vigyasharma commented on a diff in pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
vigyasharma commented on code in PR #11796: URL: https://github.com/apache/lucene/pull/11796#discussion_r977305054 ## lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [lucene] jpountz commented on issue #11799: Indexing method for learned sparse retrieval

2022-09-22 Thread GitBox
jpountz commented on issue #11799: URL: https://github.com/apache/lucene/issues/11799#issuecomment-1254691652 > we want a single Field containing a list of key-value pairs or a json formatted Note that you can add one `FeatureField` field to your Lucene document for every key/value p

[GitHub] [lucene] jpountz commented on a diff in pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
jpountz commented on code in PR #11796: URL: https://github.com/apache/lucene/pull/11796#discussion_r977363543 ## lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or

[GitHub] [lucene] jpountz commented on pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
jpountz commented on PR #11796: URL: https://github.com/apache/lucene/pull/11796#issuecomment-1254705837 This class feels like it'd be a good fit for the `misc` module rather than `core`?

[GitHub] [lucene] jpountz commented on pull request #687: LUCENE-10425:speed up IndexSortSortedNumericDocValuesRangeQuery#BoundedDocSetIdIterator construction using bkd binary search

2022-09-22 Thread GitBox
jpountz commented on PR #687: URL: https://github.com/apache/lucene/pull/687#issuecomment-1254731765 I'm (maybe naively) assuming that we could work around this case at the inner node level by skipping inner nodes whose max value is equal to the min value if we have already seen this value

[GitHub] [lucene] jpountz commented on a diff in pull request #11722: Binary search the entries when all suffixes have the same length in a leaf block.

2022-09-22 Thread GitBox
jpountz commented on code in PR #11722: URL: https://github.com/apache/lucene/pull/11722#discussion_r977400678 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java: ## @@ -646,6 +648,84 @@ public SeekStatus scanToTermLeaf(BytesRef target

[GitHub] [lucene] wjp719 commented on pull request #687: LUCENE-10425:speed up IndexSortSortedNumericDocValuesRangeQuery#BoundedDocSetIdIterator construction using bkd binary search

2022-09-22 Thread GitBox
wjp719 commented on PR #687: URL: https://github.com/apache/lucene/pull/687#issuecomment-1254778654 > I'm (maybe naively) assuming that we could work around this case at the inner node level by skipping inner nodes whose max value is equal to the min value if we have already seen this value

[GitHub] [lucene] thongnt99 commented on issue #11799: Indexing method for learned sparse retrieval

2022-09-22 Thread GitBox
thongnt99 commented on issue #11799: URL: https://github.com/apache/lucene/issues/11799#issuecomment-1254781175 @jpountz Great. Thank you very much. I will try it out and see if there is any difference in the scores.

[GitHub] [lucene] gcbaptista commented on issue #11800: INVALID_SYNTAX_CANNOT_PARSE for at sign (@)

2022-09-22 Thread GitBox
gcbaptista commented on issue #11800: URL: https://github.com/apache/lucene/issues/11800#issuecomment-1254813633 Hey again, So if I want my queries to support `@`, what should be my approach to keep the parsing compatibility from this version on? If there is no way to parse it right no

[GitHub] [lucene] reta commented on issue #11788: Upgrade ANTLR to version 4.11.1

2022-09-22 Thread GitBox
reta commented on issue #11788: URL: https://github.com/apache/lucene/issues/11788#issuecomment-1254977405 @rmuir @uschindler thanks guys > looks like an antlr problem, if they broke backwards compat, they prolly should have named it 5.x? Sadly I don't know the story, I believe

[GitHub] [lucene] rmuir commented on issue #11788: Upgrade ANTLR to version 4.11.1

2022-09-22 Thread GitBox
rmuir commented on issue #11788: URL: https://github.com/apache/lucene/issues/11788#issuecomment-1255064053 i'd prefer not changing anything without addressing the testing. I need to reiterate just how insanely trappy antlr v4 is. for painless to work with v4 and prevent insanely slow perf

[jira] [Updated] (LUCENE-9089) FST.Builder with fluent-style constructor

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-9089: Reporter: Bruno Roustant (was: Bruno Roustant) > FST.Builder with fluent-style constructor >

[jira] [Updated] (LUCENE-8983) PhraseWildcardQuery - new query to control and optimize wildcard expansions in phrase

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-8983: Reporter: Bruno Roustant (was: Bruno Roustant) > PhraseWildcardQuery - new query to control and o

[jira] [Updated] (LUCENE-9049) Remove FST cachedRootArcs now redundant with direct-addressing

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-9049: Reporter: Bruno Roustant (was: Bruno Roustant) > Remove FST cachedRootArcs now redundant with dir

[jira] [Updated] (LUCENE-9045) Do not use TreeMap/TreeSet in BlockTree and PerFieldPostingsFormat

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-9045: Reporter: Bruno Roustant (was: Bruno Roustant) > Do not use TreeMap/TreeSet in BlockTree and PerF

[jira] [Updated] (LUCENE-9064) Can we remove the FST cache in Kuromoji and Nori analyzers?

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-9064: Reporter: Bruno Roustant (was: Bruno Roustant) > Can we remove the FST cache in Kuromoji and Nori

[GitHub] [lucene] gsmiller commented on pull request #11738: Optimize MultiTermQueryConstantScoreWrapper for case when a term matches all docs in a segment.

2022-09-22 Thread GitBox
gsmiller commented on PR #11738: URL: https://github.com/apache/lucene/pull/11738#issuecomment-1255173279 @rmuir did you have any other feedback or opposition to this change? Sorry, it dropped off my plate for a bit but picking it up now and looking to get it merged. Thanks again!

[GitHub] [lucene] gsmiller commented on pull request #11744: Remove LongValueFacetCounts#getTopChildrenSortByCount since it provides redundant functionality

2022-09-22 Thread GitBox
gsmiller commented on PR #11744: URL: https://github.com/apache/lucene/pull/11744#issuecomment-1255176819 @mikemccand I tagged you as a potential reviewer on this if you have some time. Thought you might have a good opinion as you authored it originally. (Also tagged you in #11746, which is

[GitHub] [lucene] gsmiller opened a new pull request, #11804: FacetsCollector#collect is no longer final to allow extension

2022-09-22 Thread GitBox
gsmiller opened a new pull request, #11804: URL: https://github.com/apache/lucene/pull/11804 ### Description I'd like to propose removing the `final` restriction on `FacetsCollector#collect` to allow extension. I have a use-case where I'd like to be able to throw a `CollectionTermina

[GitHub] [lucene] mikemccand commented on pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
mikemccand commented on PR #11796: URL: https://github.com/apache/lucene/pull/11796#issuecomment-1255217103 I love this approach/idea! It's simple so we should start with this ... but it will necessarily be a lagging indicator since merging takes some time to kick off and run to comp

[GitHub] [lucene] mikemccand commented on pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
mikemccand commented on PR #11796: URL: https://github.com/apache/lucene/pull/11796#issuecomment-1255225299 An alternative implementation would be to add the bytes only in the `IndexOutput.close` method instead of on each method that writes bytes? It might be less error-prone, but, also l

[GitHub] [lucene] mikemccand commented on a diff in pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
mikemccand commented on code in PR #11796: URL: https://github.com/apache/lucene/pull/11796#discussion_r977831498 ## lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [lucene] mdmarshmallow commented on a diff in pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
mdmarshmallow commented on code in PR #11796: URL: https://github.com/apache/lucene/pull/11796#discussion_r977879810 ## lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] [lucene] mdmarshmallow commented on a diff in pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
mdmarshmallow commented on code in PR #11796: URL: https://github.com/apache/lucene/pull/11796#discussion_r977881217 ## lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] [lucene] mdmarshmallow commented on a diff in pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
mdmarshmallow commented on code in PR #11796: URL: https://github.com/apache/lucene/pull/11796#discussion_r977890148 ## lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] [lucene] mdmarshmallow commented on pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
mdmarshmallow commented on PR #11796: URL: https://github.com/apache/lucene/pull/11796#issuecomment-1255305951 So by doing this on `IndexOutput.close()`, we would avoid including half-done merges/flushes in the write amplification factor? As you said, this does track all-time WAF so I guess

[GitHub] [lucene] dan2097 commented on issue #11761: Expand TieredMergePolicy deletePctAllowed limits

2022-09-22 Thread GitBox
dan2097 commented on issue #11761: URL: https://github.com/apache/lucene/issues/11761#issuecomment-1255309927 I have also run into this on our patent search system. In our index the problem is exaggerated by the larger documents tending to be more frequently reindexed so the 20% deleted doc

[GitHub] [lucene] caohassl opened a new issue, #11805: Add a InterruptedCollector to received thread interrupt request and exit search task early

2022-09-22 Thread GitBox
caohassl opened a new issue, #11805: URL: https://github.com/apache/lucene/issues/11805 ### Description Hi, I submit a Lucene search task using multiple threads, and when I cancel the search thread, the search task still completes normally. But some search tasks are time-consu

[GitHub] [lucene] vigyasharma commented on pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
vigyasharma commented on PR #11796: URL: https://github.com/apache/lucene/pull/11796#issuecomment-1255316059 I see you'd already responded to a bunch of my comments. I should've refreshed my PR page. Will resolve those.

[GitHub] [lucene] caohassl opened a new pull request, #11806: GITHUB#11728: Add a InterruptedCollector to received thread interrupt request and exit search task early

2022-09-22 Thread GitBox
caohassl opened a new pull request, #11806: URL: https://github.com/apache/lucene/pull/11806 ### Description ISSUE: #11805 1. Add an InterruptedCollector class to delegate to a collector 2. By default, when LeafReaderContext is traversed, determine whether there is an interrupt reque

[GitHub] [lucene] Yuti-G commented on pull request #11768: Fix tie-break bug in various Facets implementations

2022-09-22 Thread GitBox
Yuti-G commented on PR #11768: URL: https://github.com/apache/lucene/pull/11768#issuecomment-1255341964 Thanks @gsmiller for discovering this issue! The changes look good to me. I am curious if the `index` in `LongIntCursor` works similarly to `ordinals` in other faceting implementati

[jira] [Updated] (LUCENE-8292) Fix FilterLeafReader.FilterTermsEnum to delegate all seekExact methods

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-8292: Reporter: Bruno Roustant (was: Bruno Roustant) > Fix FilterLeafReader.FilterTermsEnum to delegate

[jira] [Updated] (LUCENE-8753) New PostingFormat - UniformSplit

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-8753: Reporter: Bruno Roustant (was: Bruno Roustant) > New PostingFormat - UniformSplit > -

[jira] [Updated] (LUCENE-9078) Term vectors options should not be configurable per-doc

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-9078: Reporter: Bruno Roustant (was: Bruno Roustant) > Term vectors options should not be configurable

[jira] [Updated] (LUCENE-8906) Lucene50PostingsReader.postings() casts BlockTermState param to private IntBlockTermState

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-8906: Reporter: Bruno Roustant (was: Bruno Roustant) > Lucene50PostingsReader.postings() casts BlockTer

[jira] [Updated] (LUCENE-8836) Optimize DocValues TermsDict to continue scanning from the last position when possible

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-8836: Reporter: Bruno Roustant (was: Bruno Roustant) > Optimize DocValues TermsDict to continue scannin

[jira] [Updated] (LUCENE-8159) Add a copy constructor in AutomatonQuery to copy directly the compiled automaton

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-8159: Reporter: Bruno Roustant (was: Bruno Roustant) > Add a copy constructor in AutomatonQuery to copy

[jira] [Updated] (LUCENE-8921) IndexSearcher.termStatistics should not require TermStates but docFreq and totalTermFreq

2022-09-22 Thread Drew Foulks (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Foulks updated LUCENE-8921: Reporter: Bruno Roustant (was: Bruno Roustant) > IndexSearcher.termStatistics should not require

[GitHub] [lucene] gautamworah96 commented on pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
gautamworah96 commented on PR #11796: URL: https://github.com/apache/lucene/pull/11796#issuecomment-1255423884 For folks more familiar with WAF calculations for Search applications, is the formula of `(flushedBytes + mergedBytes) / flushedBytes` always correct? For example, does the
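The formula quoted in this comment, `(flushedBytes + mergedBytes) / flushedBytes`, can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not taken from the pull request; returning NaN when nothing has been flushed yet is also an assumption made only for this illustration.

```java
public final class WriteAmplificationSketch {

  private WriteAmplificationSketch() {}

  /**
   * Computes (flushedBytes + mergedBytes) / flushedBytes as a double.
   * Hypothetical helper: returns NaN when no bytes have been flushed,
   * because the ratio is undefined in that case.
   */
  public static double writeAmplification(long flushedBytes, long mergedBytes) {
    if (flushedBytes < 0 || mergedBytes < 0) {
      throw new IllegalArgumentException("byte counts must be non-negative");
    }
    if (flushedBytes == 0) {
      return Double.NaN;
    }
    // Cast before dividing so the division is done in floating point.
    return (double) (flushedBytes + mergedBytes) / flushedBytes;
  }

  public static void main(String[] args) {
    // 100 bytes flushed and 300 bytes rewritten by merges: (100 + 300) / 100
    System.out.println(writeAmplification(100L, 300L)); // prints 4.0
  }
}
```

With no merges at all the ratio is 1.0, which is the lower bound of this formula.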

[GitHub] [lucene] gsmiller commented on pull request #11768: Fix tie-break bug in various Facets implementations

2022-09-22 Thread GitBox
gsmiller commented on PR #11768: URL: https://github.com/apache/lucene/pull/11768#issuecomment-1255483139 @Yuti-G could you help me understand what faceting implementation or part of the code you're referring to? Thanks!

[GitHub] [lucene] Yuti-G commented on pull request #11768: Fix tie-break bug in various Facets implementations

2022-09-22 Thread GitBox
Yuti-G commented on PR #11768: URL: https://github.com/apache/lucene/pull/11768#issuecomment-1255500611 Sure, I just updated the previous comment with links. Thanks!

[GitHub] [lucene] dweiss commented on issue #11800: INVALID_SYNTAX_CANNOT_PARSE for at sign (@)

2022-09-22 Thread GitBox
dweiss commented on issue #11800: URL: https://github.com/apache/lucene/issues/11800#issuecomment-1255521641 You can escape the at character: ``` am\@zing ``` or you can quote the term: ``` "am\@zing" ``` Or you can set up flexible query parser with your own syntax par
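The first suggestion in this comment (escaping the at sign as `am\@zing`) can be applied to user input before it reaches the parser. The sketch below is plain Java with a hypothetical helper, `escapeAtSign`, that is not part of Lucene; it only prefixes unescaped `@` characters with a backslash and makes no claim about how any particular Lucene query parser treats other special characters.

```java
public final class AtSignEscapeSketch {

  private AtSignEscapeSketch() {}

  /**
   * Hypothetical helper: returns the term with every unescaped '@'
   * prefixed by a backslash. Existing backslash escapes are copied
   * through unchanged so that input is not escaped twice.
   */
  public static String escapeAtSign(String term) {
    StringBuilder sb = new StringBuilder(term.length() + 2);
    for (int i = 0; i < term.length(); i++) {
      char c = term.charAt(i);
      if (c == '\\' && i + 1 < term.length()) {
        // Keep an existing escape sequence (backslash plus next char) as-is.
        sb.append(c).append(term.charAt(i + 1));
        i++;
        continue;
      }
      if (c == '@') {
        sb.append('\\');
      }
      sb.append(c);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(escapeAtSign("am@zing")); // prints am\@zing
  }
}
```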

[GitHub] [lucene] gsmiller commented on pull request #11768: Fix tie-break bug in various Facets implementations

2022-09-22 Thread GitBox
gsmiller commented on PR #11768: URL: https://github.com/apache/lucene/pull/11768#issuecomment-1255562840 @Yuti-G thanks for the links. In this case, the contract is that we break ties by the value (of the long) itself (low-to-high), which the PQ is already doing. So this appears to be corr
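The tie-break contract described in this comment (equal counts are ordered by the long value itself, low to high) can be sketched with a comparator. This is an illustration only: it sorts a list rather than using the priority queue the comment refers to, and the `Entry` class and `topValues` method are hypothetical names, not Lucene's.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public final class FacetTieBreakSketch {

  private FacetTieBreakSketch() {}

  /** Hypothetical (value, count) pair standing in for one facet entry. */
  public static final class Entry {
    public final long value;
    public final int count;

    public Entry(long value, int count) {
      this.value = value;
      this.count = count;
    }
  }

  /** Highest count first; ties broken by the value itself, low to high. */
  public static final Comparator<Entry> ORDER =
      Comparator.comparingInt((Entry e) -> e.count)
          .reversed()
          .thenComparingLong(e -> e.value);

  /** Returns the values of the top n entries under ORDER. */
  public static List<Long> topValues(List<Entry> entries, int n) {
    List<Entry> sorted = new ArrayList<>(entries);
    sorted.sort(ORDER);
    List<Long> result = new ArrayList<>();
    for (int i = 0; i < Math.min(n, sorted.size()); i++) {
      result.add(sorted.get(i).value);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Entry> entries = new ArrayList<>();
    entries.add(new Entry(5L, 3));
    entries.add(new Entry(2L, 3)); // same count as value 5, smaller value wins the tie
    entries.add(new Entry(9L, 7));
    System.out.println(topValues(entries, 3)); // prints [9, 2, 5]
  }
}
```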

[GitHub] [lucene-solr] joshsouza opened a new pull request, #2671: Add sts support

2022-09-22 Thread GitBox
joshsouza opened a new pull request, #2671: URL: https://github.com/apache/lucene-solr/pull/2671 As discovered in https://github.com/apache/solr-operator/issues/475 the `s3-repository` contrib module is missing a dependency on the `software.amazon.awssdk:sts` module in order to enable aut

[GitHub] [lucene] Yuti-G commented on pull request #11768: Fix tie-break bug in various Facets implementations

2022-09-22 Thread GitBox
Yuti-G commented on PR #11768: URL: https://github.com/apache/lucene/pull/11768#issuecomment-1255662264 I see.. Thanks for the explanation of indexes!

[GitHub] [lucene] vigyasharma commented on a diff in pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
vigyasharma commented on code in PR #11796: URL: https://github.com/apache/lucene/pull/11796#discussion_r978239743 ## lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [lucene] vigyasharma commented on a diff in pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
vigyasharma commented on code in PR #11796: URL: https://github.com/apache/lucene/pull/11796#discussion_r978242377 ## lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [lucene] vigyasharma commented on a diff in pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
vigyasharma commented on code in PR #11796: URL: https://github.com/apache/lucene/pull/11796#discussion_r978243220 ## lucene/core/src/java/org/apache/lucene/store/ByteTrackingIndexOutput.java: ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [lucene] vigyasharma commented on pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
vigyasharma commented on PR #11796: URL: https://github.com/apache/lucene/pull/11796#issuecomment-1255778326 > An alternative implementation would be to add the bytes only in the `IndexOutput.close` method instead of on each method that writes bytes? It might be less error-prone, but, also

[GitHub] [lucene] vigyasharma commented on pull request #11796: GITHUB#11795: Add FilterDirectory to track write amplification factor

2022-09-22 Thread GitBox
vigyasharma commented on PR #11796: URL: https://github.com/apache/lucene/pull/11796#issuecomment-1255779217 Thanks for persisting with this @mdmarshmallow. I think we're close now, just a couple of discussion threads to resolve. This change will be super useful :)

[GitHub] [lucene] vsop-479 commented on pull request #11722: Binary search the entries when all suffixes have the same length in a leaf block.

2022-09-22 Thread GitBox
vsop-479 commented on PR #11722: URL: https://github.com/apache/lucene/pull/11722#issuecomment-1255837607 @jpountz Thanks for your review. I did a simple performance test, which indexed 1M random UUID's substring(2, 8), got 10 segments, and picked up 1K terms to search. Average Result

[GitHub] [lucene] LuXugang commented on a diff in pull request #687: LUCENE-10425:speed up IndexSortSortedNumericDocValuesRangeQuery#BoundedDocSetIdIterator construction using bkd binary search

2022-09-22 Thread GitBox
LuXugang commented on code in PR #687: URL: https://github.com/apache/lucene/pull/687#discussion_r978314526 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/search/IndexSortSortedNumericDocValuesRangeQuery.java: ## @@ -214,12 +221,172 @@ public int count(LeafReaderContext co