[ 
https://issues.apache.org/jira/browse/LUCENE-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243094#comment-17243094
 ] 

Ignacio Vera edited comment on LUCENE-9619 at 12/3/20, 10:34 AM:
-----------------------------------------------------------------

{quote}my thinking was that this API could be used to fill an int[] buffer only 
one leaf at a time 
{quote}
 

With the visitor pattern there is no way to control visiting one leaf at a 
time. I am wondering if it makes more sense to have an intersectAll method that 
just returns a DocIdSetIterator. In addition, the intersect method can return a 
DocIdSetIterator as well and accept a Predicate function. Something like:

{code}
/**
 * Visit all (document, value) pairs under the current node and return only the ones
 * that match the given predicate. The iterator will contain at most {@link #size()}
 * elements.
 */
public abstract DocIdSetIterator intersect(Predicate<byte[]> matchesPredicate);

/**
 * Return all documents under the current node. The iterator will contain
 * {@link #size()} elements.
 */
public abstract DocIdSetIterator intersectAll();
{code}
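
To make the idea a bit more concrete, here is a rough, self-contained sketch of how a caller might fill an int[] buffer from a single leaf with such an API. Everything in it is an assumption for illustration only: DocIdIterator is a minimal stand-in for Lucene's DocIdSetIterator so the snippet compiles on its own, and PointsCursorSketch, ArrayLeafCursor and LeafBufferDemo are hypothetical names that do not exist in Lucene. The leaf is just a pair of in-memory arrays, not a real BKD leaf.

```java
import java.util.function.Predicate;

/** Minimal stand-in for Lucene's DocIdSetIterator (hypothetical, for this sketch only). */
interface DocIdIterator {
  int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Returns the next doc ID, or NO_MORE_DOCS once the iterator is exhausted. */
  int nextDoc();
}

/** Hypothetical cursor positioned on one node of the points tree. */
abstract class PointsCursorSketch {
  /** Number of (document, value) pairs under the current node. */
  abstract long size();

  /** Doc IDs under the current node whose value matches the predicate. */
  abstract DocIdIterator intersect(Predicate<byte[]> matchesPredicate);

  /** All doc IDs under the current node. */
  abstract DocIdIterator intersectAll();
}

/** Toy leaf backed by parallel arrays of doc IDs and packed values. */
final class ArrayLeafCursor extends PointsCursorSketch {
  private final int[] docIds;
  private final byte[][] packedValues;

  ArrayLeafCursor(int[] docIds, byte[][] packedValues) {
    this.docIds = docIds;
    this.packedValues = packedValues;
  }

  @Override
  long size() {
    return docIds.length;
  }

  @Override
  DocIdIterator intersect(Predicate<byte[]> matchesPredicate) {
    return new DocIdIterator() {
      private int next = 0;

      @Override
      public int nextDoc() {
        // Skip pairs whose value does not match; values are tested lazily.
        while (next < docIds.length) {
          int idx = next++;
          if (matchesPredicate.test(packedValues[idx])) {
            return docIds[idx];
          }
        }
        return NO_MORE_DOCS;
      }
    };
  }

  @Override
  DocIdIterator intersectAll() {
    // No per-value check is needed when everything matches.
    return intersect(value -> true);
  }
}

final class LeafBufferDemo {
  /** Copies doc IDs into buffer until it is full or the iterator ends; returns the count. */
  static int fill(DocIdIterator it, int[] buffer) {
    int n = 0;
    while (n < buffer.length) {
      int doc = it.nextDoc();
      if (doc == DocIdIterator.NO_MORE_DOCS) {
        break;
      }
      buffer[n++] = doc;
    }
    return n;
  }
}
```

With this shape the caller decides when to pull the next document, so sizing the buffer from size() and filling it one leaf at a time becomes straightforward. How a real implementation would move between leaves is left open here.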
  

I am actually more concerned about the optimisation where we added an exact 
bounding box to the leaf nodes, as it needs to be treated carefully.


> Move Points from a visitor API to a cursor-style API?
> -----------------------------------------------------
>
>                 Key: LUCENE-9619
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9619
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>
> Points' visitor API works well but there are a couple of things we could make 
> better if we moved to a cursor API, e.g.
>  - Term queries could return a DocIdSetIterator without having to materialize 
> a BitSet.
>  - Nearest-neighbor search could work on top of the regular API instead of 
> casting to BKDReader 
> https://github.com/apache/lucene-solr/blob/6a7131ee246d700c2436a85ddc537575de2aeacf/lucene/sandbox/src/java/org/apache/lucene/sandbox/document/FloatPointNearestNeighbor.java#L296
>  - We could optimize counting the number of matches of a query by adding the 
> number of points in a leaf, without visiting documents, when there are no 
> deleted documents and the leaf fully matches the query.



