[GitHub] [lucene] mmatela commented on issue #12080: SynonymGraphFilter: wrong output token position when input positions overlap

2023-02-17 Thread via GitHub
mmatela commented on issue #12080: URL: https://github.com/apache/lucene/issues/12080#issuecomment-1434301348 Turns out my initial solution led to exceptions when a synonym appears at the beginning of the query or there are more tokens after the synonym. After some trial and error, it seem

[GitHub] [lucene] LuXugang opened a new pull request, #12153: Unrelated code in TestIndexSortSortedNumericDocValuesRangeQuery

2023-02-17 Thread via GitHub
LuXugang opened a new pull request, #12153: URL: https://github.com/apache/lucene/pull/12153 `SortedSetDocValuesField.newSlowRangeQuery` appears in `TestIndexSortSortedNumericDocValuesRangeQuery#toString` for no apparent reason. -- This is an automated message from the Apache Git Servi

[GitHub] [lucene] rmuir merged pull request #12132: Implement ScorerSupplier for Sorted(Set)DocValuesField#newSlowRangeQuery

2023-02-17 Thread via GitHub
rmuir merged PR #12132: URL: https://github.com/apache/lucene/pull/12132

[GitHub] [lucene] rmuir commented on issue #12151: Benchmark Current Approaches for TermInSetQuery Evaluation

2023-02-17 Thread via GitHub
rmuir commented on issue #12151: URL: https://github.com/apache/lucene/issues/12151#issuecomment-1434652878 > First, imagine searching over a catalog of products, where products have been assigned a [UNSPSC](https://en.wikipedia.org/wiki/UNSPSC) categorization identifier. You can

[GitHub] [lucene] iverase opened a new pull request, #12154: Implement ScorerSupplier for LatLonDocValuesQuery

2023-02-17 Thread via GitHub
iverase opened a new pull request, #12154: URL: https://github.com/apache/lucene/pull/12154 Similar to https://github.com/apache/lucene/pull/12132, implement ScorerSupplier for LatLonDocValuesQuery and move the creation of the Component2D in there.

[GitHub] [lucene] gsmiller commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
gsmiller commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1434796647 @jpountz I've found that applying this same idea to `TermInSetQuery` is really helpful for performance in our use-cases at Amazon product search. It's nice because the behavior of `Term

[GitHub] [lucene] jpountz commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
jpountz commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1434808901 My attention has moved to a few other things, feel free to do whatever you want with this PR, I'll be happy to review. +1 on the nice property of gradually moving from a lazy disju

[GitHub] [lucene] gsmiller commented on a diff in pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
gsmiller commented on code in PR #12055: URL: https://github.com/apache/lucene/pull/12055#discussion_r1109995482 ## lucene/core/src/java/org/apache/lucene/search/MultiTermQueryConstantScoreWrapper.java: ## @@ -183,23 +182,31 @@ private WeightOrDocIdSet rewrite(LeafReaderContext

[GitHub] [lucene] gsmiller commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
gsmiller commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1434825645 Thanks @jpountz. I pushed my `TermInSetQuery` changes, but still need to address a couple of Robert's comments on the original implementation. I'll update here when I think it's ready f

[GitHub] [lucene] iverase closed pull request #12154: Implement ScorerSupplier for LatLonDocValuesQuery

2023-02-17 Thread via GitHub
iverase closed pull request #12154: Implement ScorerSupplier for LatLonDocValuesQuery URL: https://github.com/apache/lucene/pull/12154

[GitHub] [lucene] iverase commented on pull request #12154: Implement ScorerSupplier for LatLonDocValuesQuery

2023-02-17 Thread via GitHub
iverase commented on PR #12154: URL: https://github.com/apache/lucene/pull/12154#issuecomment-1434848174 Yikes, Adrien just made me realise that now I might be creating one of those Component2D objects per segment. I am going to close it for now.

[GitHub] [lucene] gsmiller commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
gsmiller commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1435332248 OK, I think I've addressed the previous feedback and also brought in the same changes to `TermInSetQuery`. This should be ready for feedback @jpountz (whenever you have a free moment).

[GitHub] [lucene] rmuir commented on a diff in pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
rmuir commented on code in PR #12055: URL: https://github.com/apache/lucene/pull/12055#discussion_r1110449800 ## lucene/core/src/java/org/apache/lucene/search/MultiTermQueryConstantScoreWrapper.java: ## @@ -183,23 +182,31 @@ private WeightOrDocIdSet rewrite(LeafReaderContext co

[GitHub] [lucene] rmuir commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
rmuir commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1435401885 I don't see any of my feedback addressed. I'll repeat what I said before: * We shouldn't be forming boolean queries from a `FILTER` rewrite; this is wrong to do and it causes some slowdow

[GitHub] [lucene] rmuir commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
rmuir commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1435403774 I think it would help, a lot, to look through the history and see how "constant score auto rewrite" was implemented years ago, and then removed, before adding it back again. But I'm fi

[GitHub] [lucene] rmuir commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
rmuir commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1435411850 Also, dropping the postings reuse is going to cause a big performance degradation in many situations. For example, with NIOFSDirectory, a new postings reader means indexinput.clone() calls, buf

[GitHub] [lucene] rmuir commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
rmuir commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1435417952 I suggest an easy path to success here: * Keep the simple filter rewrite without crazy boolean auto-optimizations, that does what the javadocs in MultiTermQuery say it does and nothing

[GitHub] [lucene] gsmiller commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
gsmiller commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1435469343 @rmuir thanks for the feedback. Let me see if I can respond to all of it here: > postings reuse problems Can you help me with where you see this as a problem? I went throug

[GitHub] [lucene] gsmiller commented on pull request #12055: Better skipping for multi-term queries with a FILTER rewrite.

2023-02-17 Thread via GitHub
gsmiller commented on PR #12055: URL: https://github.com/apache/lucene/pull/12055#issuecomment-1435477300 Found the issue where the "constant score auto rewrite" implementation was removed: [LUCENE-5938](https://issues.apache.org/jira/browse/LUCENE-5938). If I'm understanding the history, i