msokolov commented on issue #13647: URL: https://github.com/apache/lucene/issues/13647#issuecomment-2695517144
There are other vector data files. I think the key one that has become a reference point is Cohere 768d, trained on Wikipedia-derived docs, but I'm not sure where the nightly benchmarks get it from; maybe just a locally cached copy? I have a local copy I could share, but can we attach 3 GB files here? And TBH it's possible the nightlies are using an even larger one with the full 33M docs (I may have truncated the copy I have, since I almost never test with more than 2M docs).