mccullocht commented on PR #15903: URL: https://github.com/apache/lucene/pull/15903#issuecomment-4165359031
I've played with TQ a bit over the last week and wrote a less sophisticated implementation covering 1- and 2-bit encodings. I came to the conclusion that there was a small recall improvement on modern embeddings (voyage-3.5 in my case). I think testing the flat case is a good idea, since it gives an upper bound on the improvement.

One worry I have with TQ in Lucene is per-segment overhead at query time. The transforms can be addressed by pushing them up to the query layer, but an efficient scoring implementation would likely use lookup tables that are expensive to compute and may not have a good implementation on Panama, depending on how well `Vector.shuffle()` is implemented.
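
To make the lookup-table concern concrete, here is a minimal scalar sketch of table-based scoring for 2-bit codes. It is not taken from the PR or from Lucene: the class name `LutScorerSketch`, the packing layout (four codes per byte, lowest bits first), and the four reconstruction `levels` are all assumptions chosen for illustration. It only shows why the tables are costly to build relative to a single score.

```java
// Hypothetical sketch, not Lucene code: scalar lookup-table scoring for 2-bit codes.
final class LutScorerSketch {
  // Assumed layout: 4 dimensions per byte, dimension j of a group in bits 2j..2j+1.
  static final int DIMS_PER_BYTE = 4;

  /**
   * Builds one 256-entry table per packed byte position. Entry b of group g holds the
   * dot product of the 4 query values in that group with the 4 levels encoded by byte b.
   * Cost is 256 * query.length multiply-adds, paid once per query (and again for every
   * segment if the tables cannot be shared across segments).
   */
  static float[] buildTables(float[] query, float[] levels) {
    if (query.length % DIMS_PER_BYTE != 0 || levels.length != 4) {
      throw new IllegalArgumentException("need dims % 4 == 0 and exactly 4 levels");
    }
    int groups = query.length / DIMS_PER_BYTE;
    float[] tables = new float[groups * 256];
    for (int g = 0; g < groups; g++) {
      for (int b = 0; b < 256; b++) {
        float sum = 0f;
        for (int j = 0; j < DIMS_PER_BYTE; j++) {
          int code = (b >>> (2 * j)) & 0x3; // 2-bit code for dimension j of this group
          sum += query[g * DIMS_PER_BYTE + j] * levels[code];
        }
        tables[g * 256 + b] = sum;
      }
    }
    return tables;
  }

  /** Scores one packed document vector with one table lookup per byte. */
  static float score(float[] tables, byte[] packed) {
    float sum = 0f;
    for (int g = 0; g < packed.length; g++) {
      sum += tables[g * 256 + (packed[g] & 0xFF)]; // mask to treat the byte as unsigned
    }
    return sum;
  }
}
```

Scoring a document is cheap once the tables exist, so the build cost only pays off when it is spread over many documents. A vectorized variant would presumably rely on in-register table lookups instead of the array reads above, which is where the quality of the shuffle implementation would matter.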
