gautamworah96 opened a new issue, #13403:
URL: https://github.com/apache/lucene/issues/13403

   ### Description
   
   I am opening this issue as a discussion topic. With the recent int8 and int4 
quantized vector storage, my understanding is that Lucene takes unquantized 
vectors as input, computes the quantized values itself, and then indexes 
them.
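
To make the quantization step concrete, here is a minimal sketch of linear scalar quantization. It is an illustration only and not Lucene's actual implementation: the function names, the `lo`/`hi` bounds, and the choice of 7 bits are assumptions made for this example.

```python
import math

def quantize(vector, lo, hi, bits):
    """Linearly map floats in [lo, hi] to integer codes in [0, 2**bits - 1].

    Values outside [lo, hi] are clamped. lo and hi are assumed to be chosen
    by the caller (for example from the observed value range of the data).
    """
    levels = (1 << bits) - 1
    scale = levels / (hi - lo)
    return [math.floor((min(max(x, lo), hi) - lo) * scale + 0.5) for x in vector]

def dequantize(codes, lo, hi, bits):
    """Map integer codes back to approximate float values."""
    step = (hi - lo) / ((1 << bits) - 1)
    return [lo + c * step for c in codes]

# Hypothetical example: a 5-dimensional vector quantized to 7 bits.
vec = [-1.0, -0.5, 0.0, 0.5, 1.0]
codes = quantize(vec, -1.0, 1.0, bits=7)
approx = dequantize(codes, -1.0, 1.0, bits=7)
```

Each code takes fewer bits than a 32-bit float, at the cost of a rounding error of at most half a quantization step per dimension for values inside `[lo, hi]`.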
   
   Another technique that experimenters use to improve vector search is to 
reduce the number of dimensions. In practical terms, this translates to using 
PCA (Principal Component Analysis) or other techniques.
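
For reference, the user-side preprocessing could look roughly like the sketch below. It is a dependency-free PCA using power iteration with deflation, written only to illustrate the idea; the names `fit_pca` and `project` are hypothetical and nothing here is an existing Lucene API. A real pipeline would more likely use a numerical library.

```python
import random

def fit_pca(rows, k, iters=200, seed=0):
    """Return (mean, components) for the top-k principal directions of rows."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix (dim x dim); requires n >= 2.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        # Power iteration: repeatedly apply cov and renormalize.
        v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0.0:
                break  # no variance left in the remaining directions
            v = [x / norm for x in w]
        # Rayleigh quotient estimates the eigenvalue for this direction.
        length_sq = sum(x * x for x in v)
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(dim))
                  for i in range(dim)) / length_sq
        components.append(v)
        # Deflation: remove the found direction so the next pass finds a new one.
        cov = [[cov[i][j] - lam * v[i] * v[j] / length_sq for j in range(dim)]
               for i in range(dim)]
    return mean, components

def project(vector, mean, components):
    """Center a vector and project it onto the learned components."""
    return [sum((vector[j] - mean[j]) * c[j] for j in range(len(vector)))
            for c in components]

# Hypothetical data: 3-dimensional points that all lie on one line.
rows = [[t, 2.0 * t, 0.0] for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
mean, components = fit_pca(rows, k=1)
reduced = [project(r, mean, components) for r in rows]
```

The reduced vectors are what would then be indexed, for example as the `float[]` given to a `KnnFloatVectorField`. Query vectors would need to be projected with the same `mean` and `components` before searching.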
   
   Should Lucene implement support for PCA or other dimensionality reduction 
techniques (or add a hook maybe) internally? Or can we rely on the user 
preprocessing their vectors and supplying them?
   
   I am undecided on whether a "search" and "information retrieval" library 
should add advanced statistics functionality (if I may call PCA that).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
