msokolov edited a comment on pull request #1930: URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-703872279
> Thank you for ... the tests catching mis-use where user tries to change dimension or scoring function in an existing field.

Thanks to @mocobeta for those; I was able to carry that forward from her earlier patch.

> I see you implemented the two score functions, but are they ever exercised in tests

True - this was extracted from a bigger change that included usage of those methods as part of KNN search, but they deserve their own unit tests - I'll add them.

> I would love to see a "Vector Overview" javadoc somewhere ...

Yes - I'll add it to the VectorValues/VectorField class javadocs; I think that's the most natural/visible place.

> I am curious how the basic vector usage performs -- just indexing one vector field, and retrieving it at search time. We can (separately) enable luceneutil to support testing vectors, somehow. But I wonder where we'll get semi-realistic vectors derived from Wikipedia content

Agreed that benchmarking is needed. I think we can use http://ann-benchmarks.com/ as a guide for some standardized test vectors. They won't be related to Wikipedia, though? If we get to wanting that, we could also make use of something like https://fasttext.cc/docs/en/pretrained-vectors.html, which is trained on ngrams taken from Wikipedia (for many languages). I don't know how well suited it is; I just found it in a Google search. For that, we'd have to compute document/query vectors based on an ngram-vector dictionary. I think a simple approach is to sum the ngram-vectors for all the ngrams in a document / query.
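
To make the summing idea concrete, here is a minimal Java sketch. It is only an illustration under assumptions: the class `NgramVectorSum` and its `embed` method are hypothetical names (not part of Lucene or fastText), the ngrams are taken to be fixed-length character ngrams, the dictionary is assumed to be already loaded into a `Map`, and ngrams missing from the dictionary are simply skipped.

```java
import java.util.Map;

/** Hypothetical helper for illustration only; not a Lucene or fastText API. */
final class NgramVectorSum {

  private NgramVectorSum() {}

  /**
   * Builds a document/query vector by summing the dictionary vectors of every
   * character ngram of length n found in the text.
   *
   * Assumptions: every dictionary vector has at least `dimension` entries, and
   * ngrams that are absent from the dictionary contribute nothing.
   */
  static float[] embed(String text, int n, Map<String, float[]> dictionary, int dimension) {
    float[] sum = new float[dimension];
    // Slide a window of length n over the text, one character at a time.
    for (int i = 0; i + n <= text.length(); i++) {
      float[] vector = dictionary.get(text.substring(i, i + n));
      if (vector == null) {
        continue; // unknown ngram: skip it
      }
      for (int d = 0; d < dimension; d++) {
        sum[d] += vector[d];
      }
    }
    return sum;
  }
}
```

One thing to keep in mind with a plain sum is that longer texts tend to produce vectors with larger magnitude, so some form of normalization (or averaging) may be worth considering, depending on which scoring function is used.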