scampi commented on issue #11702: URL: https://github.com/apache/lucene/issues/11702#issuecomment-1307096355
I was involved in a [previous issue](https://issues.apache.org/jira/browse/LUCENE-10449) that is related to this one. The problem there was a performance drop when scanning `SortedSetDocValues` (i.e., the keyword field in Elasticsearch). The right solution was to use `BinaryDocValues` for this kind of access pattern (i.e., a full scan of the column). We therefore created a prototype that implements multi-valued binary doc values, and it works well. However, having support for this use case directly in Lucene would be preferable, be it a new doc values type or some tooling as proposed by @jpountz. Performance issues when scanning multi-valued binary data would probably affect other use cases as well, e.g., the ESQL query language/engine proposed by Elastic.
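To make the idea of multi-valued binary doc values concrete, here is a minimal sketch of one possible way to pack several values into a single per-document byte array. It is not the prototype's actual format, which is not described in this comment. The class name `MultiValuedBinaryCodec`, its methods, and the layout (a variable-length count followed by length-prefixed values) are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: packs many byte[] values into one blob and back.
// Assumed layout: vint(count), then vint(length) + raw bytes for each value.
final class MultiValuedBinaryCodec {

    // Encode all values of one document into a single byte array.
    static byte[] encode(List<byte[]> values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, values.size());
        for (byte[] v : values) {
            writeVInt(out, v.length);
            out.write(v, 0, v.length);
        }
        return out.toByteArray();
    }

    // Decode a blob produced by encode() back into the individual values.
    static List<byte[]> decode(byte[] packed) {
        int[] pos = {0}; // mutable cursor shared with readVInt
        int count = readVInt(packed, pos);
        List<byte[]> values = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            int len = readVInt(packed, pos);
            byte[] v = new byte[len];
            System.arraycopy(packed, pos[0], v, 0, len);
            pos[0] += len;
            values.add(v);
        }
        return values;
    }

    // Write a non-negative int using 7 bits per byte, low-order group first;
    // the high bit of each byte marks that more bytes follow.
    private static void writeVInt(ByteArrayOutputStream out, int i) {
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        out.write(i);
    }

    // Read an int written by writeVInt, advancing the cursor.
    private static int readVInt(byte[] buf, int[] pos) {
        int result = 0;
        int shift = 0;
        byte b;
        do {
            b = buf[pos[0]++];
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }
}
```

Under this assumption, the encoded array could be stored as the single binary value of a document (for example, wrapped in a `BytesRef` for a `BinaryDocValuesField`) and decoded while scanning the column. A full scan then reads each document's bytes directly and never resolves ordinals to terms.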