benwtrent closed issue #13281: Deprecate `COSINE` before Lucene 10 release
URL: https://github.com/apache/lucene/issues/13281
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To
benwtrent commented on issue #13281:
URL: https://github.com/apache/lucene/issues/13281#issuecomment-2276245454
Agreed, I think we will miss the window.
jpountz commented on issue #13281:
URL: https://github.com/apache/lucene/issues/13281#issuecomment-2276232929
@benwtrent Shall we close this issue as "won't fix"?
benwtrent commented on issue #13281:
URL: https://github.com/apache/lucene/issues/13281#issuecomment-2245953147
@msokolov @jmazanec15
I don't know of many `int8` models/datasets out there that require cosine.
But, I did a benchmark with Cohere's int8 embeddings here:
https://hugging
jmazanec15 commented on issue #13281:
URL: https://github.com/apache/lucene/issues/13281#issuecomment-2237366786
> I am not sure what to do for users who quantize their own vectors & rely
on cosine.
I think I am on the same page as @msokolov. Users could "float_vector ->
norm_float_vecto
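
The normalization idea mentioned in this comment can be illustrated with a minimal sketch. It is not taken from the thread or from Lucene: the helper names `l2_normalize`, `dot`, and `cosine` are hypothetical, and plain Python lists stand in for float vectors. It only shows that the dot product of L2-normalized vectors matches the cosine similarity of the original vectors.

```python
import math


def l2_normalize(v):
    """Return a unit-length copy of v (hypothetical helper, not a Lucene API)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed directly from the unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]

cos_ab = cosine(a, b)
# Normalize once up front, then score with a plain dot product.
dot_normalized = dot(l2_normalize(a), l2_normalize(b))

print(cos_ab, dot_normalized)
```

Under these assumptions, normalizing float vectors once before indexing lets a dot-product similarity stand in for cosine at query time.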
msokolov commented on issue #13281:
URL: https://github.com/apache/lucene/issues/13281#issuecomment-2236548966
It would be interesting to know how many actual users of COSINE there are. I
agree there may be no workaround, but that does not mean we need to continue to
support it, either. One qu
benwtrent commented on issue #13281:
URL: https://github.com/apache/lucene/issues/13281#issuecomment-2236364825
I cannot think of an adequate workaround at all for `byte` folks. The
linear transformation of bytes will indeed cause potentially non-uniform
magnitudes and could break scoring
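
One possible reading of the magnitude concern, sketched under stated assumptions: the quantization scheme below (scale by 127, round, clamp) and the helper names `quantize_int8` and `norm` are hypothetical and are not taken from the thread or from Lucene. The sketch shows two unit-length float vectors ending up with different magnitudes after conversion to byte-range integers, so a plain dot product over the byte vectors would no longer be a fixed multiple of the cosine.

```python
import math


def quantize_int8(v, scale=127):
    """Hypothetical scheme: scale, round to nearest integer, clamp to [-127, 127]."""
    return [max(-127, min(127, round(x * scale))) for x in v]


def norm(v):
    """Euclidean length of v."""
    return math.sqrt(sum(x * x for x in v))


# Two float vectors of (approximately) unit length.
unit_vectors = [[1.0, 0.0], [0.6, 0.8]]

quantized = [quantize_int8(v) for v in unit_vectors]
norms = [norm(q) for q in quantized]

for original, q, n in zip(unit_vectors, quantized, norms):
    print(original, "->", q, "magnitude:", n)
```

With this toy scheme the rounding error is small, but it is enough to make the byte vectors non-uniform in length; other linear transformations may shift magnitudes further.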
jmazanec15 commented on issue #13281:
URL: https://github.com/apache/lucene/issues/13281#issuecomment-2059602419
Thanks @benwtrent that makes sense
jmazanec15 commented on issue #13281:
URL: https://github.com/apache/lucene/issues/13281#issuecomment-2059318306
@benwtrent Is the main reason to deprecate to stop enabling users to set up
non-optimal configurations? Or are there limitations cosine similarity imposes
on implementation/optimi