[GitHub] [lucene] gf2121 opened a new issue, #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField

2022-12-21 Thread GitBox


gf2121 opened a new issue, #12028:
URL: https://github.com/apache/lucene/issues/12028

   ### Description
   
   Today `TermInSetQuery` can be rewritten to a disjunction `BooleanQuery` when
it has fewer than 16 terms, so that the query result is materialized lazily.
This can significantly improve query performance in cases like
`selective_clause AND low_cardinality_field in (xxx)`.
   
Recently we added `IntField`, `LongField`, `FloatField` and `DoubleField`,
which index both points and doc values
(https://github.com/apache/lucene/issues/11199). `xxxField#newExactQuery` can
now take advantage of `IndexOrDocValuesQuery` to match with doc values when
there is a selective conjunction clause. I wonder if we could have an
`xxxField#newSetQuery` that generates a disjunction `BooleanQuery` when the
number of points is below 16?

For example:

```
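public static Query newSetQuery(String field, long... values) {
  // For small sets, build a disjunction of exact queries so that each clause
  // can match lazily through doc values when the conjunction is selective.
  if (values.length < 16) {
    BooleanQuery.Builder builder = new BooleanQuery.Builder();
    for (long value : values) {
      // SHOULD rather than FILTER: a document matches if it has any of the values.
      builder.add(newExactQuery(field, value), Occur.SHOULD);
    }
    return builder.build();
  }
  // Larger sets fall back to the points-based set query.
  return LongPoint.newSetQuery(field, values);
}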
```
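
The intuition behind lazy matching can be illustrated with a toy model. This is only a sketch and not Lucene code: plain arrays stand in for the index, `LazySetMatchSketch` and its method names are hypothetical, and the eager path is modeled as a full scan, which is cruder than what a real points query does. It shows that when a selective clause yields very few candidate documents, checking only those candidates against per-document values touches far less data than materializing every match of the set first.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Toy model only: arrays stand in for an index; none of this is Lucene API. */
final class LazySetMatchSketch {

  /** Eager: visit every document to build the full match set, then intersect. */
  static int eagerCount(long[] docValues, int[] candidates, long[] sortedSet) {
    BitSet matches = new BitSet(docValues.length);
    for (int doc = 0; doc < docValues.length; doc++) {
      if (contains(sortedSet, docValues[doc])) {
        matches.set(doc);
      }
    }
    int count = 0;
    for (int doc : candidates) {
      if (matches.get(doc)) {
        count++;
      }
    }
    return count;
  }

  /** Lazy: only look up the values of documents the selective clause produced. */
  static int lazyCount(long[] docValues, int[] candidates, long[] sortedSet) {
    int count = 0;
    for (int doc : candidates) {
      if (contains(sortedSet, docValues[doc])) {
        count++;
      }
    }
    return count;
  }

  /** Membership test; the set must be sorted for binary search to be valid. */
  private static boolean contains(long[] sortedSet, long value) {
    return Arrays.binarySearch(sortedSet, value) >= 0;
  }

  public static void main(String[] args) {
    long[] docValues = new long[1_000_000];
    for (int doc = 0; doc < docValues.length; doc++) {
      docValues[doc] = doc % 8; // a field with 8 distinct values
    }
    int[] candidates = {42};    // a very selective clause: one candidate document
    long[] set = {1, 2, 3};
    // Both strategies agree on the count; only the amount of work differs.
    System.out.println(eagerCount(docValues, candidates, set));
    System.out.println(lazyCount(docValues, candidates, set));
  }
}
```

In this model the lazy path does one lookup while the eager path does a million, which is the kind of gap the proposal hopes to exploit for small sets.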


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] gf2121 commented on issue #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField

2022-12-21 Thread GitBox


gf2121 commented on issue #12028:
URL: https://github.com/apache/lucene/issues/12028#issuecomment-1361021593

   I benchmarked some queries like `_id = '1' AND cardinality_8_field in (1, 2,
3)` on 1M docs. Here is the result:
   ```
   Benchmark       Mode  Cnt   Score    Error  Units
   fieldSetQuery  thrpt   10  48.025 ± 16.741  ops/ms
   pointSetQuery  thrpt   10   5.514 ±  0.159  ops/ms
   ```
   `fieldSetQuery` uses `LongField#newSetQuery` (see the example above),
while `pointSetQuery` uses `LongPoint#newSetQuery`.





[GitHub] [lucene] alessandrobenedetti opened a new pull request, #12029: KnnVectorQuery introduce getters/setters

2022-12-21 Thread GitBox


alessandrobenedetti opened a new pull request, #12029:
URL: https://github.com/apache/lucene/pull/12029

   ### Description
   Knn queries are currently locked down; it would be beneficial for
applications that use them to have access to getters and setters.
   An example is how filter queries are managed in Apache Solr:
   the processing of pre-filters and post-filters could benefit from opening up
access to these fields.
   In particular, the pre-filter support introduced in Solr 9.1 would benefit
greatly from being able to set the filter after the query has been parsed.
   See:
   https://github.com/apache/solr/pull/1245
   
   If there are no objections, I would simply remove the `final` modifiers and
add the getters/setters.
   I may consider alternatives if there are valid concerns about doing that.





[GitHub] [lucene] rmuir commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters

2022-12-21 Thread GitBox


rmuir commented on PR #12029:
URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361503436

   Queries should be immutable; see the Query.java documentation. Hence I don't
think we should add getters/setters or remove the `final` keywords.





[GitHub] [lucene] alessandrobenedetti commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters

2022-12-21 Thread GitBox


alessandrobenedetti commented on PR #12029:
URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361523810

   Thanks @rmuir for the prompt answer. I took a look at Query.java and
couldn't find any particular reason for the Knn query to be immutable (aside
from historical reasons?).
   If you can elaborate, I am happy to consider alternatives (I could bring
back `final` and just add getters, if that is any better).
   
   The Knn query has a sub-query that it uses as an internal filter, and having
access to it can make the Solr side of filter/post-filter processing much
easier and more performant.
   





[GitHub] [lucene] rmuir commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters

2022-12-21 Thread GitBox


rmuir commented on PR #12029:
URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361552026

   All queries need to be immutable for the query cache to work correctly and 
consistently.
   
   You are right, the docs need help here. Unfortunately docs on immutability 
were attached to deprecated methods and disappeared! 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/5.5.5/lucene/core/src/java/org/apache/lucene/search/Query.java#L95-L107
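
The hazard can be sketched in plain Java, assuming a cache that looks entries up by the query's `equals`/`hashCode`. `MutableQuery` and `CacheKeyHazard` are hypothetical names for illustration, and a `HashMap` stands in for the cache; this is not how Lucene's query cache is implemented, only the general failure mode of a mutable key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical mutable query; real queries should not be written this way. */
final class MutableQuery {
  String term; // deliberately non-final to demonstrate the hazard

  MutableQuery(String term) {
    this.term = term;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof MutableQuery && Objects.equals(term, ((MutableQuery) o).term);
  }

  @Override
  public int hashCode() {
    return Objects.hashCode(term);
  }
}

final class CacheKeyHazard {

  /** Returns whether the cached entry is still reachable after the key is mutated. */
  static boolean stillCachedAfterMutation() {
    Map<MutableQuery, String> cache = new HashMap<>();
    MutableQuery query = new MutableQuery("lucene");
    cache.put(query, "cached result for 'lucene'");

    // Mutating the key changes its hash code, but the entry was stored under
    // the old hash, so the lookup no longer finds it.
    query.term = "solr";
    return cache.containsKey(query);
  }

  public static void main(String[] args) {
    System.out.println(stillCachedAfterMutation()); // prints "false"
  }
}
```

The entry is also unreachable through a fresh `new MutableQuery("lucene")`, because the stored key no longer equals it, so the cache holds a result that nothing can look up or invalidate reliably.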





[GitHub] [lucene] rmuir commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters

2022-12-21 Thread GitBox


rmuir commented on PR #12029:
URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361577899

   The `getTarget()` getters are unsafe, as they return mutable things
(`float[]`, `BytesRef`).
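
A minimal sketch of the concern, using a hypothetical `VectorQuerySketch` class rather than the real `KnnVectorQuery`: a getter that returns the internal `float[]` lets callers change the query's state after construction, while a getter that returns a copy does not.

```java
/** Hypothetical stand-in for a vector query; not the real KnnVectorQuery. */
final class VectorQuerySketch {
  private final float[] target;

  VectorQuerySketch(float[] target) {
    this.target = target.clone(); // copy on the way in
  }

  /** Unsafe: hands out the internal array, so callers can mutate the query. */
  float[] getTargetUnsafe() {
    return target;
  }

  /** Safe: copy on the way out, so the internal state cannot be changed. */
  float[] getTargetCopy() {
    return target.clone();
  }

  public static void main(String[] args) {
    VectorQuerySketch query = new VectorQuerySketch(new float[] {1f, 2f});

    query.getTargetCopy()[0] = 9f;   // modifies a throwaway copy only
    System.out.println(query.getTargetCopy()[0]);

    query.getTargetUnsafe()[0] = 9f; // silently changes the query itself
    System.out.println(query.getTargetCopy()[0]);
  }
}
```

The `final` keyword on the field only protects the reference, not the array contents, which is why the copy is needed.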





[GitHub] [lucene] alessandrobenedetti commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters

2022-12-21 Thread GitBox


alessandrobenedetti commented on PR #12029:
URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361578619

   Thanks @rmuir for the explanation. It seems reasonable, and I definitely
won't argue with that.
   For my needs, just the getters would be fine (I'll clone the query and
change the filter).
   I checked around and getters seem to be tolerated (see
org.apache.lucene.search.FuzzyQuery).
   Any problem with this?
   I updated the pull request.





[GitHub] [lucene] alessandrobenedetti commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters

2022-12-21 Thread GitBox


alessandrobenedetti commented on PR #12029:
URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361606803

   @rmuir you are right again. I gave it another try using copies; this should
be safe.
   If there are still concerns, I may move to a Builder or constructor-based
approach, to be able to build a KnnVectorQuery starting from an existing one.
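
One shape such an approach could take is sketched below. Everything here is an assumption for illustration: `FilteredVectorQuery`, its fields, and `withFilter` are hypothetical and are not part of the pull request or of Lucene, and a `String` stands in for the filter query. The idea is that the original instance stays immutable and a changed filter produces a new instance.

```java
/** Hypothetical immutable query with a "copy with a different filter" method. */
final class FilteredVectorQuery {
  private final String field;
  private final float[] target;
  private final int k;
  private final String filter; // stand-in for a filter query; may be null

  FilteredVectorQuery(String field, float[] target, int k, String filter) {
    this.field = field;
    this.target = target.clone(); // defensive copy keeps the instance immutable
    this.k = k;
    this.filter = filter;
  }

  String getField() {
    return field;
  }

  int getK() {
    return k;
  }

  String getFilter() {
    return filter;
  }

  /** Returns a new query that differs only in its filter; this one is unchanged. */
  FilteredVectorQuery withFilter(String newFilter) {
    return new FilteredVectorQuery(field, target, k, newFilter);
  }

  public static void main(String[] args) {
    FilteredVectorQuery parsed = new FilteredVectorQuery("vec", new float[] {1f}, 10, null);
    FilteredVectorQuery filtered = parsed.withFilter("category:book");
    System.out.println(parsed.getFilter());   // the original still has no filter
    System.out.println(filtered.getFilter());
  }
}
```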





[GitHub] [lucene] epotyom commented on pull request #12025: Issue #11582 Update Faceting user guide

2022-12-21 Thread GitBox


epotyom commented on PR #12025:
URL: https://github.com/apache/lucene/pull/12025#issuecomment-1361631708

   > At the same time, for demo module we already have a method for 
compile-time safety of examples that doesn't rely upon this new `@snippet`. See 
IndexFiles/SearchFiles where we simply include the source code in the javadocs
   
   There is a check that the referenced `region`s exist in the source file.
Region support is really the only benefit compared to links to source code, as
in your IndexFiles/SearchFiles example, because you don't need to scroll
through source files to find the referenced example. But it's true that this
doesn't outweigh the disadvantages you listed. I updated the pull request to
link to source files instead of using snippets.
   
   





[GitHub] [lucene] twosom opened a new pull request, #12030: fix typo in BaseSynonymParserTestCase

2022-12-21 Thread GitBox


twosom opened a new pull request, #12030:
URL: https://github.com/apache/lucene/pull/12030

   ### Description
   
   
   





[GitHub] [lucene] rmuir merged pull request #12025: Issue #11582 Update Faceting user guide

2022-12-21 Thread GitBox


rmuir merged PR #12025:
URL: https://github.com/apache/lucene/pull/12025





[GitHub] [lucene] rmuir commented on pull request #12025: Issue #11582 Update Faceting user guide

2022-12-21 Thread GitBox


rmuir commented on PR #12025:
URL: https://github.com/apache/lucene/pull/12025#issuecomment-1361698706

   Thank you @epotyom for this! I'll backport to 9.5.





[GitHub] [lucene] rmuir closed issue #11582: Update Faceting user guide [LUCENE-10546]

2022-12-21 Thread GitBox


rmuir closed issue #11582: Update Faceting user guide [LUCENE-10546]
URL: https://github.com/apache/lucene/issues/11582





[GitHub] [lucene] rmuir commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters

2022-12-21 Thread GitBox


rmuir commented on PR #12029:
URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361726280

   `BytesRef.clone` won't do what we want here; it is a shallow clone.
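
The difference can be sketched with a hypothetical `ByteSlice` type, modeled loosely on a (bytes, offset, length) reference; it is not the real `BytesRef`. A shallow copy creates a new wrapper around the same backing array, so later writes to that array show through, while a deep copy owns its bytes. In Lucene itself, `BytesRef.deepCopyOf` is the method that copies the bytes.

```java
import java.util.Arrays;

/** Hypothetical slice type; a simplified stand-in, not Lucene's BytesRef. */
final class ByteSlice {
  final byte[] bytes;
  final int offset;
  final int length;

  ByteSlice(byte[] bytes, int offset, int length) {
    this.bytes = bytes;
    this.offset = offset;
    this.length = length;
  }

  /** Shallow copy: a new wrapper that still shares the backing array. */
  ByteSlice shallowCopy() {
    return new ByteSlice(bytes, offset, length);
  }

  /** Deep copy: a new wrapper with its own copy of the referenced bytes. */
  ByteSlice deepCopy() {
    return new ByteSlice(Arrays.copyOfRange(bytes, offset, offset + length), 0, length);
  }

  public static void main(String[] args) {
    byte[] backing = {1, 2, 3, 4};
    ByteSlice original = new ByteSlice(backing, 1, 2);
    ByteSlice shallow = original.shallowCopy();
    ByteSlice deep = original.deepCopy();

    backing[1] = 42; // someone reuses the shared buffer

    System.out.println(shallow.bytes[shallow.offset]); // sees the change
    System.out.println(deep.bytes[deep.offset]);       // keeps the old value
  }
}
```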





[GitHub] [lucene] dsmiley commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters

2022-12-21 Thread GitBox


dsmiley commented on PR #12029:
URL: https://github.com/apache/lucene/pull/12029#issuecomment-1361753981

   Completely agree with Robert -- Query subclasses ought to be immutable and 
the javadocs ought to be updated to scream this.  Nasty/hard bugs happen when a 
Query is mutable.





[GitHub] [lucene] rmuir merged pull request #12030: fix typo in BaseSynonymParserTestCase

2022-12-21 Thread GitBox


rmuir merged PR #12030:
URL: https://github.com/apache/lucene/pull/12030





[GitHub] [lucene] rmuir commented on pull request #12030: fix typo in BaseSynonymParserTestCase

2022-12-21 Thread GitBox


rmuir commented on PR #12030:
URL: https://github.com/apache/lucene/pull/12030#issuecomment-1361838583

   Thank you @twosom !





[GitHub] [lucene] jpountz commented on pull request #12011: Tune the amount of memory that is allocated to sorting postings upon flushing.

2022-12-21 Thread GitBox


jpountz commented on PR #12011:
URL: https://github.com/apache/lucene/pull/12011#issuecomment-1362512365

   I plan on merging it soon if there are no objections.





[GitHub] [lucene] alessandrobenedetti commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters

2022-12-21 Thread GitBox


alessandrobenedetti commented on PR #12029:
URL: https://github.com/apache/lucene/pull/12029#issuecomment-1362512997

   Thanks again @rmuir, moved to a deep copy!

