rmuir commented on PR #14433: URL: https://github.com/apache/lucene/pull/14433#issuecomment-2774215695
I think the history is just that this norm can contain an arbitrary value, which was previously encoded suboptimally into a single byte. There was a ValueSource that assumed it was a single byte, so that was moved to only work with TFIDF for backwards compatibility purposes. Elsewhere, the norm was extended and generalized to be an opaque 64-bit value. Depending upon the Similarity's index-time `computeNorm()` implementation, it might not even be possible to decode it to a float.

But the default encoding was also fixed to be practical, by @jpountz, whilst still using a single byte. So in practice all the built-in Similarities use the same encoding and can work with this: it just won't work if you extend Similarity to do something else.

Any confusion can be solved with documentation:

* It should be clear that this only works if your similarity uses the default implementation of `computeNorm()`.
* I don't think PositionLength is a good name; the norm is not that (see discountOverlaps as an example).

Also, I would ask if we really need this `EMPTY` instance: it would be good to keep polymorphism under wraps.
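To illustrate why decoding only works when the norm was written with the expected encoding, here is a minimal, self-contained sketch. It does not use Lucene: `NormSketch`, `encodeLength`, `decodeLength`, `defaultStyleNorm`, `customNorm`, and `decodeAsLength` are hypothetical names, and the toy float-like byte encoding is an assumption made for illustration, not a copy of Lucene's actual norm encoding.

```java
final class NormSketch {
    // Toy lossy encoding (an assumption, not Lucene's code): keep the 4 most
    // significant bits of the length and store how far they were shifted.
    static int encodeLength(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be >= 0");
        }
        int numBits = 32 - Integer.numberOfLeadingZeros(length);
        if (numBits < 4) {
            return length; // values 0..7 are stored exactly
        }
        int shift = numBits - 4;
        int mantissa = (length >>> shift) & 0x07; // leading 1 bit is implicit
        return ((shift + 1) << 3) | mantissa;     // always fits in one byte
    }

    static int decodeLength(int code) {
        int mantissa = code & 0x07;
        int shift = (code >>> 3) - 1;
        return shift == -1 ? mantissa : (mantissa | 0x08) << shift;
    }

    // Stand-in for a default-style norm: encoded length in the low byte.
    static long defaultStyleNorm(int fieldLength) {
        return encodeLength(fieldLength);
    }

    // Stand-in for a custom similarity that uses the 64 bits differently.
    static long customNorm(int fieldLength) {
        return Double.doubleToLongBits(1.0 / Math.sqrt(fieldLength));
    }

    // A reader that assumes the default-style encoding.
    static int decodeAsLength(long norm) {
        return decodeLength((int) (norm & 0xFF));
    }

    public static void main(String[] args) {
        // Approximate but meaningful: 1000 decodes to 960.
        System.out.println(decodeAsLength(defaultStyleNorm(1000)));
        // Meaningless: the low byte of the bits of 0.5 is 0, so this prints 0.
        System.out.println(decodeAsLength(customNorm(4)));
    }
}
```

The reader has no way to tell from the 64-bit value alone which scheme produced it, which is why the documentation needs to state the assumption.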