[GitHub] [lucene] gf2121 opened a new issue, #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField
gf2121 opened a new issue, #12028:
URL: https://github.com/apache/lucene/issues/12028

### Description

Today `TermInSetQuery` can be rewritten to a disjunction `BooleanQuery`, which lazily materializes the query result, if the term count is < 16. This can significantly improve query performance in cases like `selective_clause AND low_cardinality_field in (xxx)`.

Recently we added IntField, LongField, FloatField and DoubleField, which index both points and doc values (https://github.com/apache/lucene/issues/11199). `xxxField#newExactQuery` can now take advantage of `IndexOrDocValuesQuery` to match with doc values when there is a selective conjunction clause. I wonder if we can have an `xxxField#newSetQuery` that generates a disjunction `BooleanQuery` when the point count is < 16?

For example:

```
public static Query newSetQuery(String field, long... values) {
  if (values.length < 16) {
    BooleanQuery.Builder builder = new BooleanQuery.Builder();
    for (long value : values) {
      // SHOULD clauses form the disjunction; FILTER would require every value to match.
      builder.add(newExactQuery(field, value), Occur.SHOULD);
    }
    return builder.build();
  }
  // Larger sets fall back to the points-based set query.
  return LongPoint.newSetQuery(field, values);
}
```

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] gf2121 commented on issue #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField
gf2121 commented on issue #12028:
URL: https://github.com/apache/lucene/issues/12028#issuecomment-1361021593

I benchmarked some queries like `_id = '1' AND cardinality_8_field in (1, 2, 3)` on 1M docs. Here is the result:

```
Benchmark       Mode  Cnt   Score    Error   Units
fieldSetQuery  thrpt   10  48.025 ± 16.741  ops/ms
pointSetQuery  thrpt   10   5.514 ±  0.159  ops/ms
```

`fieldSetQuery` is using `LongField#newSetQuery` (see the example above) while `pointSetQuery` is using `LongPoint#newSetQuery`.
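The gap between the two benchmark rows can be pictured with a toy model. The sketch below is not Lucene code: `LazySetMatchDemo`, `Counter`, `indexByValue`, `eager` and `lazy` are hypothetical names, and plain Java collections stand in for points and doc values. It assumes that the eager strategy must visit every document matching the value set before intersecting, while the lazy strategy only checks the documents that the selective clause already produced.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class LazySetMatchDemo {
  /** Work counter so both strategies can report how many entries they touched. */
  static final class Counter {
    int visited;
  }

  /** 1000 docs whose value cycles through 0..7 (a cardinality-8 field). */
  static long[] sampleDocValues() {
    long[] docValues = new long[1000];
    for (int doc = 0; doc < docValues.length; doc++) {
      docValues[doc] = doc % 8;
    }
    return docValues;
  }

  /** Groups doc IDs by value; a stand-in for an index structure, not for real points. */
  static Map<Long, List<Integer>> indexByValue(long[] docValues) {
    Map<Long, List<Integer>> index = new HashMap<>();
    for (int doc = 0; doc < docValues.length; doc++) {
      index.computeIfAbsent(docValues[doc], v -> new ArrayList<>()).add(doc);
    }
    return index;
  }

  /** Eager: collect every doc matching the set into a bit set, then intersect. */
  static List<Integer> eager(
      Map<Long, List<Integer>> index, int maxDoc, int[] candidates, Set<Long> wanted, Counter counter) {
    BitSet matches = new BitSet(maxDoc);
    for (long value : wanted) {
      for (int doc : index.getOrDefault(value, List.of())) {
        matches.set(doc);
        counter.visited++;
      }
    }
    List<Integer> result = new ArrayList<>();
    for (int doc : candidates) {
      if (matches.get(doc)) {
        result.add(doc);
      }
    }
    return result;
  }

  /** Lazy: only look up the per-document value of each candidate doc. */
  static List<Integer> lazy(long[] docValues, int[] candidates, Set<Long> wanted, Counter counter) {
    List<Integer> result = new ArrayList<>();
    for (int doc : candidates) {
      counter.visited++;
      if (wanted.contains(docValues[doc])) {
        result.add(doc);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    long[] docValues = sampleDocValues();
    int[] candidates = {10}; // the selective clause matched a single doc
    Set<Long> wanted = Set.of(1L, 2L, 3L);
    Counter eagerWork = new Counter();
    Counter lazyWork = new Counter();
    List<Integer> eagerHits = eager(indexByValue(docValues), docValues.length, candidates, wanted, eagerWork);
    List<Integer> lazyHits = lazy(docValues, candidates, wanted, lazyWork);
    System.out.println(eagerHits + " after visiting " + eagerWork.visited + " entries");
    System.out.println(lazyHits + " after visiting " + lazyWork.visited + " entries");
  }
}
```

Both strategies return the same hits; only the amount of work differs, which is the effect the benchmark numbers suggest. The real cost model in Lucene depends on the index structures involved and is not captured here.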
[GitHub] [lucene] alessandrobenedetti opened a new pull request, #12029: KnnVectorQuery introduce getters/setters
alessandrobenedetti opened a new pull request, #12029:
URL: https://github.com/apache/lucene/pull/12029

### Description

Knn queries are currently locked; it would be beneficial for applications using them to have access to getters and setters. An example is how filter queries are managed in Apache Solr: the processing of pre-filters and post-filters could benefit from opening up access to such variables. In particular, the pre-filter support introduced in Solr 9.1 could benefit greatly from being able to set the filter after the query has been parsed. See: https://github.com/apache/solr/pull/1245

If there are no objections, I would simply remove the `final` modifiers and add the getters/setters. I may consider alternatives if there is a valid concern with doing that.
[GitHub] [lucene] rmuir commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters
rmuir commented on PR #12029: URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361503436 Queries should be immutable; see the Query.java documentation. Hence I don't think we should add getters/setters or remove `final` keywords.
[GitHub] [lucene] alessandrobenedetti commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters
alessandrobenedetti commented on PR #12029: URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361523810 Thanks @rmuir for the prompt answer. I took a look at Query.java and couldn't find any particular reason for the Knn query to be immutable (aside from historical reasons?). If you can elaborate, I am happy to consider alternatives (I could bring back `final` and just add getters, if that is any better). The Knn query has a sub-query that it uses as an internal filter, and having access to it can make the Solr side of filter/post-filter processing much easier and more performant.
[GitHub] [lucene] rmuir commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters
rmuir commented on PR #12029: URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361552026 All queries need to be immutable for the query cache to work correctly and consistently. You are right, the docs need help here. Unfortunately docs on immutability were attached to deprecated methods and disappeared! https://github.com/apache/lucene-solr/blob/releases/lucene-solr/5.5.5/lucene/core/src/java/org/apache/lucene/search/Query.java#L95-L107
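The cache argument can be illustrated with a plain `HashMap`. This is a simplified model, not Lucene's query cache: `QueryCacheDemo` and `MutableQuery` are hypothetical names, and a string stands in for the filter. The sketch assumes only that the cache looks entries up by the query's `equals`/`hashCode`, which is what breaks when a cached key is mutated.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class QueryCacheDemo {
  /** Hypothetical mutable query whose equals/hashCode depend on a changeable field. */
  static final class MutableQuery {
    private String filter;

    MutableQuery(String filter) {
      this.filter = filter;
    }

    void setFilter(String filter) {
      this.filter = filter;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof MutableQuery && Objects.equals(filter, ((MutableQuery) o).filter);
    }

    @Override
    public int hashCode() {
      return Objects.hashCode(filter);
    }
  }

  /** Caches a result, mutates the key, then looks the same object up again. */
  static boolean lookupSurvivesMutation() {
    Map<MutableQuery, String> cache = new HashMap<>();
    MutableQuery query = new MutableQuery("category:a");
    cache.put(query, "hits for category:a");
    query.setFilter("category:b"); // the hash code changes while the key sits in the map
    return cache.containsKey(query);
  }

  /** Caches a result, mutates the key, then looks up a fresh query equal to the original. */
  static boolean freshEqualKeyFindsEntry() {
    Map<MutableQuery, String> cache = new HashMap<>();
    MutableQuery query = new MutableQuery("category:a");
    cache.put(query, "hits for category:a");
    query.setFilter("category:b");
    return cache.containsKey(new MutableQuery("category:a"));
  }

  public static void main(String[] args) {
    System.out.println(lookupSurvivesMutation()); // false: hashed under the old value
    System.out.println(freshEqualKeyFindsEntry()); // false: the stored key no longer equals it
  }
}
```

After the mutation the entry is still in the map but can no longer be reached through either lookup, which is one way a mutable query can make a cache behave inconsistently.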
[GitHub] [lucene] rmuir commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters
rmuir commented on PR #12029: URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361577899 The `getTarget()` getters are unsafe, as they return mutable things (`float[]`, `BytesRef`).
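A minimal sketch of why a getter that hands out the internal array is unsafe, and how a defensive copy avoids it. `TargetGetterDemo`, `VectorQuery`, `getTargetUnsafe` and `getTargetCopy` are hypothetical names chosen for illustration; this is not the code from the pull request.

```java
import java.util.Arrays;

final class TargetGetterDemo {
  /** Hypothetical stand-in for a vector query; not Lucene's KnnVectorQuery. */
  static final class VectorQuery {
    private final float[] target;

    VectorQuery(float[] target) {
      this.target = target.clone(); // copy on the way in as well
    }

    /** Leaks the internal array: callers can change the query's state. */
    float[] getTargetUnsafe() {
      return target;
    }

    /** Returns a copy: callers can only change their own array. */
    float[] getTargetCopy() {
      return target.clone();
    }

    @Override
    public int hashCode() {
      return Arrays.hashCode(target);
    }
  }

  public static void main(String[] args) {
    VectorQuery query = new VectorQuery(new float[] {1f, 2f});
    int before = query.hashCode();

    query.getTargetCopy()[0] = 9f; // modifies the copy only
    System.out.println(before == query.hashCode()); // true

    query.getTargetUnsafe()[0] = 9f; // modifies the query itself
    System.out.println(before == query.hashCode()); // false
  }
}
```

The `final` modifier on the field protects the reference, not the array contents, so the hash code of a supposedly immutable object can still change through the leaked array.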
[GitHub] [lucene] alessandrobenedetti commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters
alessandrobenedetti commented on PR #12029: URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361578619 Thanks @rmuir for the explanation; it seems reasonable and I definitely won't argue with that. For my needs, just the getters would be fine (and I'll clone the query, changing the filter). I checked around and getters seem to be tolerated (see `org.apache.lucene.search.FuzzyQuery`). Any problem with this? I updated the pull request.
[GitHub] [lucene] alessandrobenedetti commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters
alessandrobenedetti commented on PR #12029: URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361606803 @rmuir you are right again. I gave it another try using copies; this should be safe. If there are still concerns, I may move to a Builder or constructor approach, to be able to build a KnnVectorQuery starting from an old one.
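One way to "build a query starting from an old one" without setters is a copy method that returns a new instance. The sketch below is an assumption about what such an approach could look like, not the API that was merged: `CopyWithFilterDemo`, `VectorQuery` and `withFilter` are hypothetical, and a `String` stands in for the filter query.

```java
final class CopyWithFilterDemo {
  /** Hypothetical immutable query; every field is final and arrays are copied. */
  static final class VectorQuery {
    private final String field;
    private final float[] target;
    private final int k;
    private final String filter; // stand-in for a filter Query; may be null

    VectorQuery(String field, float[] target, int k, String filter) {
      this.field = field;
      this.target = target.clone();
      this.k = k;
      this.filter = filter;
    }

    String getField() {
      return field;
    }

    float[] getTarget() {
      return target.clone(); // defensive copy on the way out
    }

    int getK() {
      return k;
    }

    String getFilter() {
      return filter;
    }

    /** Returns a new query with the given filter; this instance stays unchanged. */
    VectorQuery withFilter(String newFilter) {
      return new VectorQuery(field, target, k, newFilter);
    }
  }

  public static void main(String[] args) {
    VectorQuery parsed = new VectorQuery("vector", new float[] {0.1f, 0.2f}, 10, null);
    VectorQuery filtered = parsed.withFilter("category:a");
    System.out.println(parsed.getFilter()); // null
    System.out.println(filtered.getFilter()); // category:a
  }
}
```

The caller gets the flexibility of changing the filter after parsing, while any instance that is already in use elsewhere keeps its original state.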
[GitHub] [lucene] epotyom commented on pull request #12025: Issue #11582 Update Faceting user guide
epotyom commented on PR #12025:
URL: https://github.com/apache/lucene/pull/12025#issuecomment-1361631708

> At the same time, for demo module we already have a method for compile-time safety of examples that doesn't rely upon this new `@snippet`. See IndexFiles/SearchFiles where we simply include the source code in the javadocs

There is a check that referenced `region`s exist in the source file. Region support is really the only benefit compared to links to source code, as in your IndexFiles/SearchFiles example, because you don't need to scroll through source files to find the referenced example. But it's true that this doesn't outweigh the disadvantages you listed.

I updated the pull request to link to source files instead of using snippets.
[GitHub] [lucene] twosom opened a new pull request, #12030: fix typo in BaseSynonymParserTestCase
twosom opened a new pull request, #12030: URL: https://github.com/apache/lucene/pull/12030 ### Description
[GitHub] [lucene] rmuir merged pull request #12025: Issue #11582 Update Faceting user guide
rmuir merged PR #12025: URL: https://github.com/apache/lucene/pull/12025
[GitHub] [lucene] rmuir commented on pull request #12025: Issue #11582 Update Faceting user guide
rmuir commented on PR #12025: URL: https://github.com/apache/lucene/pull/12025#issuecomment-1361698706 Thank you @epotyom for this! I'll backport to 9.5.
[GitHub] [lucene] rmuir closed issue #11582: Update Faceting user guide [LUCENE-10546]
rmuir closed issue #11582: Update Faceting user guide [LUCENE-10546] URL: https://github.com/apache/lucene/issues/11582
[GitHub] [lucene] rmuir commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters
rmuir commented on PR #12029: URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361726280 `BytesRef.clone` won't do what we want here. It is a shallow clone.
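The difference between a shallow clone and a deep copy can be shown with a stand-in class. `ShallowCloneDemo` and `Ref` are hypothetical; `Ref` only mirrors the general shape of a byte slice (array, offset, length) and is not Lucene's `BytesRef`. For the real class, `BytesRef.deepCopyOf` is the method that copies the underlying bytes.

```java
import java.util.Arrays;

final class ShallowCloneDemo {
  /** Hypothetical byte slice: a view over an array, described by offset and length. */
  static final class Ref {
    final byte[] bytes;
    final int offset;
    final int length;

    Ref(byte[] bytes, int offset, int length) {
      this.bytes = bytes;
      this.offset = offset;
      this.length = length;
    }

    /** Shallow: a new object that still shares the same backing array. */
    Ref shallowClone() {
      return new Ref(bytes, offset, length);
    }

    /** Deep: a new object with its own copy of the referenced bytes. */
    Ref deepCopy() {
      return new Ref(Arrays.copyOfRange(bytes, offset, offset + length), 0, length);
    }
  }

  public static void main(String[] args) {
    Ref original = new Ref(new byte[] {1, 2, 3}, 0, 3);
    Ref shallow = original.shallowClone();
    Ref deep = original.deepCopy();

    original.bytes[0] = 42; // mutate the backing array after cloning

    System.out.println(shallow.bytes[0]); // 42: the shallow clone sees the change
    System.out.println(deep.bytes[0]); // 1: the deep copy is isolated
  }
}
```

A getter that returns a shallow clone therefore still exposes the query's internal bytes to the caller, which is the concern raised in the comment.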
[GitHub] [lucene] dsmiley commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters
dsmiley commented on PR #12029: URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361753981 Completely agree with Robert -- Query subclasses ought to be immutable and the javadocs ought to be updated to scream this. Nasty/hard bugs happen when a Query is mutable.
[GitHub] [lucene] rmuir merged pull request #12030: fix typo in BaseSynonymParserTestCase
rmuir merged PR #12030: URL: https://github.com/apache/lucene/pull/12030
[GitHub] [lucene] rmuir commented on pull request #12030: fix typo in BaseSynonymParserTestCase
rmuir commented on PR #12030: URL: https://github.com/apache/lucene/pull/12030#issuecomment-1361838583 Thank you @twosom !
[GitHub] [lucene] jpountz commented on pull request #12011: Tune the amount of memory that is allocated to sorting postings upon flushing.
jpountz commented on PR #12011: URL: https://github.com/apache/lucene/pull/12011#issuecomment-1362512365 I plan on merging it soon if there are no objections.
[GitHub] [lucene] alessandrobenedetti commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters
alessandrobenedetti commented on PR #12029: URL: https://github.com/apache/lucene/pull/12029#issuecomment-1362512997 Thanks again @rmuir, moved to a deep copy!