Hi Pengkai,
My experience is based on http://www.findfiles.net/, which holds more
than 700 million documents, each about 2 KB in size.
A single index containing that kind of data should hold fewer than 80
million documents. If you have complex queries with lots of facets,
sorting, or function queries, then even 50 million documents per index
could be your upper limit. On very fast hardware with a warmed index
you might deliver results within 1 second on average.
For documents above 5 KB in size, those numbers will not necessarily
hold.
Test with your own documents by generating them (NOT copying the same
document over and over) and indexing them in vast numbers. After every
10 million documents, measure the average response time with the caches
switched off. Once the average response time hits your threshold, the
number of documents in the index at that point is your per-index limit.
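If it helps, here is a rough SolrJ sketch of that loop. It is untested;
the server URL, field names, sample queries and the 1 second threshold
are placeholders you have to adapt to your own schema, and the caches
should be disabled in solrconfig.xml (e.g. by commenting out the
filterCache, queryResultCache and documentCache entries) so you measure
raw index performance:

import java.util.UUID;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

public class IndexLimitTest {
    private static final int BATCH = 10 * 1000 * 1000; // 10 million docs per step

    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        // placeholder queries -- use ones representative of your real traffic
        String[] sampleQueries = { "title:files", "body:search",
                "body:internet AND title:find" };
        long indexed = 0;
        while (true) {
            // generate unique documents -- copying the same one over and
            // over would make the index unrealistically small and fast
            for (int i = 0; i < BATCH; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", UUID.randomUUID().toString());
                doc.addField("title", "generated title " + indexed);
                doc.addField("body", "unique generated content " + UUID.randomUUID());
                server.add(doc);
                indexed++;
            }
            server.commit();
            // average QTime over the sample queries
            long totalQTime = 0;
            for (String q : sampleQueries) {
                QueryResponse rsp = server.query(new SolrQuery(q));
                totalQTime += rsp.getQTime();
            }
            double avgMillis = (double) totalQTime / sampleQueries.length;
            System.out.println(indexed + " docs, avg QTime: " + avgMillis + " ms");
            if (avgMillis > 1000) {
                break; // threshold reached: 'indexed' is your per-index limit
            }
        }
    }
}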
Scaling out is no problem. AFAIK 20 to 50 indexes should be fine within
a distributed production system.
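Once the data is split across indexes, a distributed query is the same
SolrJ call with an extra "shards" parameter listing the cores to fan
out to (the host names below are placeholders):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DistributedQueryTest {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://solr1:8983/solr");
        SolrQuery query = new SolrQuery("title:files");
        // fan the query out over all shards; Solr merges the partial results
        query.set("shards", "solr1:8983/solr,solr2:8983/solr,solr3:8983/solr");
        QueryResponse rsp = server.query(query);
        System.out.println("hits: " + rsp.getResults().getNumFound()
                + ", QTime: " + rsp.getQTime() + " ms");
    }
}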
Kind Regards
Gregor
On 09/29/2011 12:14 PM, Pengkai Qin wrote:
Hi all,
Now I'm doing research on Solr distributed search, and it is said that
distributed search makes sense once you have more than one million
documents. So I want to know: does anyone have test results (such as
time cost) comparing a single index with distributed search on more
than one million documents? I need the test results very urgently,
thanks in advance!
Best Regards,
Pengkai
--
How to find files on the Internet? FindFiles.net <http://findfiles.net>!