[GitHub] [lucene] gsmiller commented on pull request #12135: Avoid duplicate sorting and prefix-encoding in KeywordField#newSetQuery

via GitHub Wed, 08 Feb 2023 09:53:31 -0800


gsmiller commented on PR #12135:
URL: https://github.com/apache/lucene/pull/12135#issuecomment-1423017019


   Just to make sure I understand the suggestion here, it sounds like the idea 
would be to use a `Stream<BytesRef>` in place of `PrefixCodedTerms` in both 
`TermInSetQuery` and `SortedSetDocValuesSetQuery`, right? And these streams 
would be backed by the same `BytesRef[]` provided to 
`KeywordField#newSetQuery`? I think there are a couple of challenges with that 
(but maybe I'm misunderstanding the suggestion completely):
   1. Wouldn't we still be sorting the data twice since we would need to 
provide separate streams to each query? Or I suppose we could maybe share a 
single stream across the queries if we knew for sure that only one would 
consume it?
   2. Within each query implementation, we may end up iterating over the terms 
multiple times (e.g., if someone calls `toString` or `visit`, creates a weight, 
etc.). I don't have a whole lot of experience using streams, but my 
understanding is that they provide a "single consumption" model, so I'm not 
sure how this would work.
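
   To make the two concerns above concrete, here is a minimal sketch in plain 
Java. It is only an illustration, not the PR's implementation: `String` stands 
in for `BytesRef` so it runs without Lucene, and the names 
`SingleUseStreamSketch`, `secondTraversalFails`, and `sortedOnce` are 
hypothetical. It shows that a `java.util.stream.Stream` fails on a second 
traversal, and that one possible workaround is to sort the array once and hand 
each consumer a `Supplier` that opens a fresh stream over that shared array.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SingleUseStreamSketch {

  // Traverses the same Stream twice and reports whether the second
  // traversal was rejected. Standard streams are single-use, so the
  // second terminal operation throws IllegalStateException.
  static boolean secondTraversalFails(Stream<String> terms) {
    terms.forEach(t -> {}); // first pass, e.g. something like toString
    try {
      terms.forEach(t -> {}); // second pass, e.g. something like visit
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  // Hypothetical helper: copies and sorts the terms once, then returns a
  // Supplier that opens a new Stream over the same sorted array on each
  // call. Several consumers can share it without sorting again.
  static Supplier<Stream<String>> sortedOnce(String... terms) {
    String[] sorted = terms.clone();
    Arrays.sort(sorted);
    return () -> Arrays.stream(sorted);
  }

  public static void main(String[] args) {
    System.out.println(secondTraversalFails(Stream.of("b", "a")));

    Supplier<Stream<String>> shared = sortedOnce("b", "a", "c");
    // Two independent traversals over the array that was sorted once.
    List<String> firstPass = shared.get().collect(Collectors.toList());
    List<String> secondPass = shared.get().collect(Collectors.toList());
    System.out.println(firstPass.equals(secondPass));
  }
}
```

   Whether a `Supplier`-style API would fit the two query classes is an open 
question; the sketch only shows that repeated iteration needs something that 
can be re-opened, which a bare `Stream` cannot.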
   
   I suspect I'm misunderstanding the suggestion, so just trying to clarify and 
figure out where I've gotten confused. Thanks for the input! I like the idea, 
I'm just trying to figure out how to make it workable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

