[GitHub] [lucene] rmuir commented on pull request #12089: Modify TermInSetQuery to "self optimize" if doc values are available

2023-02-07 Thread via GitHub
rmuir commented on PR #12089: URL: https://github.com/apache/lucene/pull/12089#issuecomment-1420793280 > 100%. The issue here is that `TermInSetQuery` gets rewritten to a `BooleanQuery` because there are fewer than 16 terms, so it doesn't have a chance to "self-optimize" to use doc values.

[GitHub] [lucene] rmuir commented on pull request #12089: Modify TermInSetQuery to "self optimize" if doc values are available

2023-02-07 Thread via GitHub
rmuir commented on PR #12089: URL: https://github.com/apache/lucene/pull/12089#issuecomment-1420797458 I spent my time here just to prevent code from being duplicated and to prevent a mess, nothing more. I got no skin in the game and could care less about these stupid abusive "joins" that people

[GitHub] [lucene] rmuir commented on pull request #12054: Introduce a new `KeywordField`.

2023-02-07 Thread via GitHub
rmuir commented on PR #12054: URL: https://github.com/apache/lucene/pull/12054#issuecomment-1420805369 > * added a `newSetQuery` that creates a `TermInSetQuery` and hopefully soon benefits from @gsmiller 's optimization You can make it `new IndexOrDocValuesQuery(new TermInSetQuery

[GitHub] [lucene] gsmiller commented on pull request #12089: Modify TermInSetQuery to "self optimize" if doc values are available

2023-02-07 Thread via GitHub
gsmiller commented on PR #12089: URL: https://github.com/apache/lucene/pull/12089#issuecomment-1421121653 @rmuir thanks for the feedback and spending time having a look. I'm going to try summarizing where we've landed to make sure we're on the same page. I think we both agree on the followi

[GitHub] [lucene] gsmiller commented on pull request #12054: Introduce a new `KeywordField`.

2023-02-07 Thread via GitHub
gsmiller commented on PR #12054: URL: https://github.com/apache/lucene/pull/12054#issuecomment-1421143189 > You can make it new IndexOrDocValuesQuery(new TermInSetQuery, SortedSetDocValuesField.newSlowSetQuery()) right now and it performs better than what is on that PR. +1 to using `

[GitHub] [lucene] jpountz merged pull request #12054: Introduce a new `KeywordField`.

2023-02-07 Thread via GitHub
jpountz merged PR #12054: URL: https://github.com/apache/lucene/pull/12054 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

[GitHub] [lucene] gsmiller closed issue #11736: Promote DocValuesTermsQuery functionality from sandbox module

2023-02-07 Thread via GitHub
gsmiller closed issue #11736: Promote DocValuesTermsQuery functionality from sandbox module URL: https://github.com/apache/lucene/issues/11736

[GitHub] [lucene] gsmiller commented on issue #11736: Promote DocValuesTermsQuery functionality from sandbox module

2023-02-07 Thread via GitHub
gsmiller commented on issue #11736: URL: https://github.com/apache/lucene/issues/11736#issuecomment-1421239346 This was done as part of #12129. Resolving.

[GitHub] [lucene] gsmiller commented on issue #11740: Can we improve cost estimation in TermInSetQuery's ScoreSupplier?

2023-02-07 Thread via GitHub
gsmiller commented on issue #11740: URL: https://github.com/apache/lucene/issues/11740#issuecomment-1421245215 As a different approach, the idea of a "self-optimizing" `TermInSetQuery` as explored in #12089, working around the problem of trying to provide an up-front cost estimation to be u

[GitHub] [lucene] benwtrent merged pull request #12050: Reuse HNSW graph for intialization during merge

2023-02-07 Thread via GitHub
benwtrent merged PR #12050: URL: https://github.com/apache/lucene/pull/12050

[GitHub] [lucene] benwtrent closed issue #11354: Reuse HNSW graphs when merging segments? [LUCENE-10318]

2023-02-07 Thread via GitHub
benwtrent closed issue #11354: Reuse HNSW graphs when merging segments? [LUCENE-10318] URL: https://github.com/apache/lucene/issues/11354

[GitHub] [lucene] benwtrent commented on pull request #12050: Reuse HNSW graph for intialization during merge

2023-02-07 Thread via GitHub
benwtrent commented on PR #12050: URL: https://github.com/apache/lucene/pull/12050#issuecomment-1421381273 @jmazanec15 merged and I backported to branch_9x (some minor changes for java version stuff around switch statements). Good stuff!

[GitHub] [lucene] jmazanec15 commented on pull request #12050: Reuse HNSW graph for intialization during merge

2023-02-07 Thread via GitHub
jmazanec15 commented on PR #12050: URL: https://github.com/apache/lucene/pull/12050#issuecomment-1421527701 Thanks @benwtrent!

[GitHub] [lucene] zhaih commented on a diff in pull request #12126: Refactor part of IndexFileDeleter and ReplicaFileDeleter into a common utility class

2023-02-07 Thread via GitHub
zhaih commented on code in PR #12126: URL: https://github.com/apache/lucene/pull/12126#discussion_r1099470288 ## lucene/replicator/src/java/org/apache/lucene/replicator/nrt/CopyJob.java: ## @@ -206,7 +206,7 @@ private synchronized void _transferAndCancel(CopyJob prevJob) throws

[GitHub] [lucene] AKafakA commented on issue #11862: Concurrent rewrite for KnnVectorQuery

2023-02-07 Thread via GitHub
AKafakA commented on issue #11862: URL: https://github.com/apache/lucene/issues/11862#issuecomment-1421715239 Hey, this is Wei from LinkedIn. I am interested in this issue and will try to work on it. Thanks