uschindler commented on PR #12135:
URL: https://github.com/apache/lucene/pull/12135#issuecomment-1424863546

   I disagree with Robert, who says "only arrays". I agree with you that we can 
also allow passing collections, but only if we do it as proposed before:
   
   We can have 2 ctors, one with `Collection<BytesRef>` and one with 
`BytesRef[]`. But both should call a third hidden ctor taking 
`Stream<BytesRef>`.
   
   This internal implementation would call `stream.sorted()` (and possibly 
also `.distinct()`) and just operate on the stream. If you pass in a SortedSet 
(e.g. TreeSet), the sorted and distinct calls will be no-ops. It will not sort 
the terms again, IF (and only IF) the TreeSet/SortedSet uses exactly the order 
we use, i.e. the natural ordering of `BytesRef` with no custom comparator. With 
any other comparator the stream is simply sorted again, so the result is correct 
either way.
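
To make the no-op claim checkable, here is a minimal sketch that inspects the spliterator flags a stream pipeline can use to skip `sorted()` and `distinct()`. It uses `String` instead of `BytesRef` to stay self-contained, and the class and method names (`SpliteratorFlagsDemo`, `reportsNaturalSort`, `reportsDistinct`) are hypothetical. That the pipeline actually elides the sort when these flags are set is my understanding of the OpenJDK implementation, not something the `Stream` Javadoc guarantees.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Spliterator;
import java.util.TreeSet;

// Hypothetical helper, for illustration only; not part of Lucene.
final class SpliteratorFlagsDemo {

  /** True if the spliterator reports SORTED with a null comparator (natural order). */
  static boolean reportsNaturalSort(Collection<?> c) {
    Spliterator<?> s = c.spliterator();
    // getComparator() throws IllegalStateException unless SORTED is reported,
    // so the characteristic has to be checked first.
    return s.hasCharacteristics(Spliterator.SORTED) && s.getComparator() == null;
  }

  /** True if the spliterator reports DISTINCT (no duplicate elements). */
  static boolean reportsDistinct(Collection<?> c) {
    return c.spliterator().hasCharacteristics(Spliterator.DISTINCT);
  }

  public static void main(String[] args) {
    TreeSet<String> natural = new TreeSet<>(List.of("b", "a", "c"));
    TreeSet<String> reversed = new TreeSet<>(Comparator.<String>reverseOrder());
    reversed.addAll(natural);
    List<String> list = new ArrayList<>(List.of("b", "a", "a"));

    System.out.println("TreeSet (natural):  sorted=" + reportsNaturalSort(natural)
        + " distinct=" + reportsDistinct(natural));
    System.out.println("TreeSet (reversed): sorted=" + reportsNaturalSort(reversed)
        + " distinct=" + reportsDistinct(reversed));
    System.out.println("ArrayList:          sorted=" + reportsNaturalSort(list)
        + " distinct=" + reportsDistinct(list));
  }
}
```

Only the naturally ordered set reports both flags; the reverse-ordered set and the list would still be sorted by the stream, which costs time but does not change the result.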
   
   I hope Adrien understands that this is the best way to avoid duplicate 
sorting:
   
   ```java
   public TermInSetQuery(String field, BytesRef[] terms) {
     this(field, Arrays.stream(terms));
   }
   
   public TermInSetQuery(String field, Collection<BytesRef> terms) {
     this(field, terms.stream());
   }
   
   private TermInSetQuery(String field, Stream<BytesRef> stream) {
     super(field); // and so on
     stream.sorted().distinct().forEachOrdered(term -> {
        // process the terms coming in natural order
     });
   }
   ```
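
For anyone who wants to try the pattern outside Lucene, the following is a runnable sketch of the same three-constructor shape. It is only an illustration under assumptions: `SortedTermSet` is a hypothetical class, `String` stands in for `BytesRef`, and collecting into a list stands in for whatever the real query does with each term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical stand-in for the query class; String replaces BytesRef.
final class SortedTermSet {
  private final String field;
  private final List<String> terms = new ArrayList<>();

  SortedTermSet(String field, String[] terms) {
    this(field, Arrays.stream(terms));
  }

  SortedTermSet(String field, Collection<String> terms) {
    this(field, terms.stream());
  }

  // Single hidden entry point: sorting and de-duplication happen in one place.
  private SortedTermSet(String field, Stream<String> stream) {
    this.field = field;
    // Terms arrive in natural order with duplicates removed.
    stream.sorted().distinct().forEachOrdered(terms::add);
  }

  String field() {
    return field;
  }

  List<String> terms() {
    return Collections.unmodifiableList(terms);
  }
}
```

Both public constructors end up with the same sorted, duplicate-free term list, for example `new SortedTermSet("id", new String[] {"b", "a", "b"}).terms()` and `new SortedTermSet("id", new TreeSet<>(List.of("a", "b"))).terms()` contain the same two terms in the same order.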
   
   This would be my favorite way to implement this query. I don't think 
Adrien's reservations about streams count as an argument without a benchmark.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

