scampi commented on issue #11702:
URL: https://github.com/apache/lucene/issues/11702#issuecomment-1307096355

   I was involved in a [previous 
issue](https://issues.apache.org/jira/browse/LUCENE-10449) that is related to 
this one. The problem was a performance drop when scanning 
`SortedSetDocValues` (i.e., the keyword field in Elasticsearch). The 
solution, rightly, was to use `BinaryDocValues` for this kind of access 
pattern (i.e., a full scan of the column).
   
   We therefore built a prototype that implements multi-valued binary 
doc values, and it works well. However, having support for this use case 
directly in Lucene would be preferable, whether as a new doc values type or as 
tooling along the lines proposed by @jpountz.
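   
   For illustration only (this is not the prototype's code): a minimal sketch 
of one way to pack several values into a single byte array, which could then be 
stored per document in a single-valued binary doc values field. The class name 
`MultiBinaryCodec` and the length-prefixed layout are assumptions made for this 
example; a real implementation would likely use a more compact encoding.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: packs several binary values into
// one byte[] so they fit in a single-valued binary doc values field.
public final class MultiBinaryCodec {

  private MultiBinaryCodec() {}

  // Assumed layout: [count:int], then for each value [length:int][bytes].
  public static byte[] encode(List<byte[]> values) {
    int size = Integer.BYTES;
    for (byte[] v : values) {
      size += Integer.BYTES + v.length;
    }
    ByteBuffer buf = ByteBuffer.allocate(size);
    buf.putInt(values.size());
    for (byte[] v : values) {
      buf.putInt(v.length);
      buf.put(v);
    }
    return buf.array();
  }

  // Reads the values back in the order they were written.
  public static List<byte[]> decode(byte[] packed) {
    ByteBuffer buf = ByteBuffer.wrap(packed);
    int count = buf.getInt();
    List<byte[]> values = new ArrayList<>(count);
    for (int i = 0; i < count; i++) {
      byte[] v = new byte[buf.getInt()];
      buf.get(v);
      values.add(v);
    }
    return values;
  }
}
```

   With a layout like this, a full column scan reads one blob per document and 
decodes it sequentially, with no per-value ordinal lookups.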
   
   Performance issues when scanning multi-valued binary data would probably 
affect other use cases as well, e.g., the ESQL query 
language/engine proposed by Elastic.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

