navneet1v commented on issue #13403: URL: https://github.com/apache/lucene/issues/13403#issuecomment-2132043000
+1 on the idea of bringing a dimensionality reduction technique to Lucene. One problem I have seen with PQ, though, is that you need enough vectors to build the codebooks. Building a segment-level codebook can therefore be challenging, since a segment may contain only a few documents.

**Some ways to overcome the above problem:**

1. To work around the shortage of vectors, we can build a global codebook outside the segments and let segments use it during merge and flush. I don't know how we would do this in Lucene; maybe via a new KNNVectorFormatReader that can supply it (based on my limited knowledge).
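To make the global-codebook idea concrete, here is a minimal sketch in plain Java. It is not a Lucene API: `GlobalCodebook`, `train`, and `encode` are hypothetical names, and the training is a few rounds of plain Lloyd's k-means per subspace, chosen only for brevity. It shows the codebook being trained once on a sample drawn from outside any single segment and then reused to encode vectors, and it rejects training sets smaller than the number of centroids, which is the situation a small segment would hit.

```java
import java.util.Random;

/** Hypothetical sketch, not a Lucene class: a PQ codebook trained once, outside any segment. */
final class GlobalCodebook {
  final int subspaces;         // number of sub-vectors each vector is split into
  final int subDim;            // dimensions per sub-vector
  final float[][][] centroids; // [subspace][centroid][subDim]

  private GlobalCodebook(int subspaces, int subDim, float[][][] centroids) {
    this.subspaces = subspaces;
    this.subDim = subDim;
    this.centroids = centroids;
  }

  /** Trains k centroids per subspace with a fixed number of Lloyd's k-means iterations. */
  static GlobalCodebook train(float[][] sample, int subspaces, int k, int iterations, long seed) {
    if (sample.length < k) {
      throw new IllegalArgumentException(
          "need at least " + k + " training vectors, got " + sample.length);
    }
    int dim = sample[0].length;
    if (dim % subspaces != 0) {
      throw new IllegalArgumentException("dimension must be divisible by subspaces");
    }
    int subDim = dim / subspaces;
    Random rnd = new Random(seed);
    float[][][] centroids = new float[subspaces][k][subDim];
    for (int m = 0; m < subspaces; m++) {
      int offset = m * subDim;
      // Initialise each centroid from a randomly chosen training vector.
      for (int c = 0; c < k; c++) {
        System.arraycopy(sample[rnd.nextInt(sample.length)], offset, centroids[m][c], 0, subDim);
      }
      for (int it = 0; it < iterations; it++) {
        float[][] sums = new float[k][subDim];
        int[] counts = new int[k];
        // Assignment step: attach every sub-vector to its nearest centroid.
        for (float[] v : sample) {
          int c = nearest(centroids[m], v, offset, subDim);
          counts[c]++;
          for (int d = 0; d < subDim; d++) {
            sums[c][d] += v[offset + d];
          }
        }
        // Update step: move each non-empty centroid to the mean of its members.
        for (int c = 0; c < k; c++) {
          if (counts[c] == 0) {
            continue; // an empty centroid stays where it is
          }
          for (int d = 0; d < subDim; d++) {
            centroids[m][c][d] = sums[c][d] / counts[c];
          }
        }
      }
    }
    return new GlobalCodebook(subspaces, subDim, centroids);
  }

  /** Encodes a vector as one centroid index per subspace. */
  int[] encode(float[] vector) {
    int[] codes = new int[subspaces];
    for (int m = 0; m < subspaces; m++) {
      codes[m] = nearest(centroids[m], vector, m * subDim, subDim);
    }
    return codes;
  }

  /** Returns the index of the centroid with the smallest squared Euclidean distance. */
  private static int nearest(float[][] cents, float[] v, int offset, int subDim) {
    int best = 0;
    float bestDist = Float.MAX_VALUE;
    for (int c = 0; c < cents.length; c++) {
      float dist = 0f;
      for (int d = 0; d < subDim; d++) {
        float diff = v[offset + d] - cents[c][d];
        dist += diff * diff;
      }
      if (dist < bestDist) {
        bestDist = dist;
        best = c;
      }
    }
    return best;
  }
}
```

In this sketch a flush or merge would only call `encode` against the shared instance, so even a segment with a handful of documents gets usable codes. How such an object would be supplied to a segment writer in Lucene is left open here, as it is in the comment above.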