benwtrent opened a new issue, #12700: URL: https://github.com/apache/lucene/issues/12700
### Description

`VectorSimilarityFunction` might return negative scores in extreme circumstances.

This could happen if `VectorUtil#cosine` returns something like `-1.0000001` instead of exactly `-1` for antipodal vectors. The similarity score would then be `-0.0000005` (these numbers are made up and don't reflect a scenario I have actually seen). We already know that floating-point error compounds on larger vectors and when using Panama.

Should we snap vector scores to `0` to ensure this doesn't happen? Or rely on users of the library to do so?

Here is a related ES bug: https://github.com/elastic/elasticsearch/issues/100975

NOTE: That bug involves 1536-dimensional vectors, which is above the Lucene limit of 1024. However, it seems to me that this is possible even with 1024 dimensions.

I am fine if the consensus is "library users just handle it". But it seems like something every user would potentially be concerned about.

### Version and environment details

_No response_
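To make the question about snapping scores to `0` concrete, here is a minimal sketch of what such a clamp could look like. It assumes the cosine score is derived as `(1 + cosine) / 2`; the class and method names (`ScoreClampSketch`, `unclampedScore`, `clampedScore`) are hypothetical and not part of Lucene, and the raw cosine value is hand-picked to imitate rounding error rather than taken from a real run.

```java
// Hypothetical illustration only; not Lucene code.
final class ScoreClampSketch {

    // Assumed mapping from a raw cosine in [-1, 1] to a score in [0, 1].
    static float unclampedScore(float cosine) {
        return (1 + cosine) / 2;
    }

    // Same mapping, but snapped to 0 so rounding error cannot produce a negative score.
    static float clampedScore(float cosine) {
        return Math.max((1 + cosine) / 2, 0f);
    }

    public static void main(String[] args) {
        // A cosine that overshoots -1 slightly, as could happen through accumulated rounding error.
        float overshoot = -1.0000001f;
        System.out.println("unclamped: " + unclampedScore(overshoot));
        System.out.println("clamped:   " + clampedScore(overshoot));
    }
}
```

The clamp only changes results that would otherwise be negative, so scores for well-behaved inputs stay as they were. Whether this belongs in the library or in calling code is the open question above.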