msokolov commented on pull request #1930: URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-706185147
> Agreed that benchmarking is needed. I think we can use http://ann-benchmarks.com/ as a guide for some standardized test vectors.

Hmm, I tried to get that benchmarking suite to run, and it requires some major Python-fu. That package relies on docker, scipy, scikit-learn, h5py, and matplotlib, which in turn rely on a lot of native libraries, and all the versions have to be just right. I didn't have the right versions in my package manager's repos, so I had to install from source, and I was never able to get the right combination, so I finally gave up on that approach. Maybe someday we can use it to compare the performance of this solution with state-of-the-art native libraries, but not today!

I'll try having a look at the Wikipedia-derived vectors to see if we can at least develop our own internal benchmarks in luceneutil.