MarcusSorealheis commented on PR #874: URL: https://github.com/apache/lucene/pull/874#issuecomment-1295674939
I think slow indexing throughput is a pain that customers ought to surface themselves. If they find that they mostly use vectors for use cases that don't have NRT-scaling and replication requirements, that should drive our decision on whether to limit the maximum number of dimensions. I have seen multiple OpenAI and Hugging Face customers flock to other search engines because we impose this limit. 4096 is the number that keeps getting thrown around, but I have seen one case that needed more. On the other hand, if there are stability concerns at a particular level of dimensionality, we should cap there. Not all customers have the same needs for indexing throughput. Plus, we can work on indexing throughput in the future as an incremental improvement to the feature.