benwtrent commented on PR #12064: URL: https://github.com/apache/lucene/pull/12064#issuecomment-1375565264
> E.g. could hamming distance reuse the byte[] API by introducing a new distance function and half-float/bfloat16 reuse the float[] API?

Hamming distance for binary vectors will be a bitwise operation. So, we store the binary vectors as some numerical type (some folks use `int`). If we use `byte`, that means we use 4x as many operations vs `int`. We can cross that bridge when we come to it, I suppose.

> Yes, let's design for today.

OK, I will rewrite to remove the abstract class.

> Personally I will push back against new vector types/functions as long as performance is in its current state.

I agree. We need to address this. Makes me wonder about the work done here: https://github.com/apache/lucene/issues/10177. Seems promising: the cost of flush increases (because of clustering), but the data structure seems to fit WAY better inside of Lucene.
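
To make the `byte` vs `int` point above concrete, here is a minimal sketch of Hamming distance over bit-packed vectors. The class and method names (`HammingSketch`, `hammingBytes`, `hammingInts`) are made up for illustration and are not part of the Lucene API or this PR; only `Integer.bitCount` is a real JDK call. It assumes both vectors pack the same bits in the same order.

```java
// Hypothetical illustration only: not Lucene code and not the API proposed in this PR.
final class HammingSketch {
    private HammingSketch() {}

    /** Bits packed 8 per element: one XOR + popcount per byte. */
    static int hammingBytes(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector lengths differ");
        }
        int distance = 0;
        for (int i = 0; i < a.length; i++) {
            // byte operands are promoted to int with sign extension;
            // masking keeps only the low 8 bits so no spurious 1-bits are counted.
            distance += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
        }
        return distance;
    }

    /** Bits packed 32 per element: one XOR + popcount per int, so 4x fewer iterations. */
    static int hammingInts(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector lengths differ");
        }
        int distance = 0;
        for (int i = 0; i < a.length; i++) {
            distance += Integer.bitCount(a[i] ^ b[i]);
        }
        return distance;
    }
}
```

For the same 32 bits, `hammingBytes` runs four loop iterations where `hammingInts` runs one, which is the 4x figure mentioned above. Whether that difference survives JIT auto-vectorization is not something this sketch measures.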