shbhar commented on PR #15903: URL: https://github.com/apache/lucene/pull/15903#issuecomment-4238623650
> Add a datablind scalar quantized flat vectors format that is OSQ based. I took a stab at this in the existing SQ format and it's quite tricky so we may want it to be a completely separate format.

Makes sense. I guess with this we could also give users the option not to store fp32 vectors at all.

> Add support for rotation. This could be a stand-alone class folks can use to rotate their vectors before ingestion and querying, or it could be internalized in the quantized codec (I think there are arguments for both methods)

Do you mean adding rotation inside the existing OSQ while still keeping optimizeIntervals and the 14-byte correction? I have not yet run an experiment that disables optimizeIntervals and the 14-byte correction to see whether they still help on top of rotation alone.

My understanding of why the QJL correction also doesn't work is that, while it reduces per-vector reconstruction error (MSE), we don't directly care about reconstruction error. In NN search we only care about ranking via dot products, so if a correction adds more noise to the ranking, it might actually make recall worse.

Next I will try disabling optimizeIntervals and the 14-byte correction, then compare OSQ with and without correction on precentered and prerotated vectors to see whether the correction helps, hurts, or is neutral for recall.
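To make the MSE-versus-ranking distinction concrete, here is a minimal sketch that measures both quantities side by side. It is not Lucene's OSQ: a plain uniform scalar quantizer with a fixed clipping range stands in for it, the data is synthetic Gaussian, and every name (`quantize`, `recall_at_k`, `evaluate`) is hypothetical and chosen only for illustration. The point is just that reconstruction error and dot-product recall are separate measurements, so a correction should be judged on the second one.

```python
import random


def quantize(vec, bits, lo, hi):
    """Uniformly quantize each component to 2**bits levels in [lo, hi], then dequantize."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    out = []
    for x in vec:
        clipped = min(max(x, lo), hi)
        out.append(lo + round((clipped - lo) / step) * step)
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mse(docs, approx):
    """Mean squared reconstruction error per component."""
    total = sum((x - y) ** 2 for d, a in zip(docs, approx) for x, y in zip(d, a))
    return total / (len(docs) * len(docs[0]))


def top_k(query, docs, k):
    """Indices of the k documents with the largest dot product against the query."""
    scores = [dot(query, d) for d in docs]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def recall_at_k(queries, docs, approx, k):
    """Fraction of the true top-k (exact vectors) recovered when ranking with approx vectors."""
    hits = sum(len(top_k(q, docs, k) & top_k(q, approx, k)) for q in queries)
    return hits / (k * len(queries))


def evaluate(bits, n_docs=200, n_queries=20, dim=32, k=10, seed=0):
    """Return (reconstruction MSE, recall@k) for one bit width on synthetic Gaussian data."""
    rng = random.Random(seed)
    docs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_docs)]
    queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_queries)]
    # Assumption: a fixed [-3, 3] clipping range instead of per-vector optimized intervals.
    approx = [quantize(d, bits, -3.0, 3.0) for d in docs]
    return mse(docs, approx), recall_at_k(queries, docs, approx, k)


if __name__ == "__main__":
    for bits in (1, 2, 4, 8):
        err, rec = evaluate(bits)
        print(f"bits={bits} mse={err:.5f} recall@10={rec:.3f}")
```

Swapping in two variants of the same quantizer (for example, with and without a correction term) and comparing the two columns is the kind of check described above: a variant can lower `mse` without raising `recall_at_k`.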
