We are currently indexing 5 million books in Solr, scaling up over the next few 
years to 20 million.  At present, each Solr document is an entire book.  
We are evaluating the possibility of indexing individual pages instead, as there 
are some use cases where users want the most relevant pages regardless of which 
book they occur in.  However, we estimate that this would mean somewhere 
between 1 and 6 billion pages, and we are concerned about whether Solr will 
scale to this level.

Does anyone have experience using Solr with 1-6 billion Solr documents?

The Lucene file formats document 
(http://lucene.apache.org/java/3_0_1/fileformats.html#Limitations) mentions a 
limit of about 2 billion document ids.  I assume this refers to the internal 
Lucene document id and would therefore be a per-index/per-shard limit.  Is this 
correct?
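
To put rough numbers on that question, here is a back-of-the-envelope sketch. It assumes the limit really is per index/shard and that it equals the largest signed 32-bit integer (2^31 - 1), which is my reading of "about 2 billion"; the function name and constant are only illustrative. It gives a lower bound on the shard count from the id limit alone and says nothing about memory, disk, or query latency.

```python
# Assumption: the per-index document id limit is the largest signed
# 32-bit integer. This is an upper bound, not a recommended shard size.
MAX_DOCS_PER_INDEX = 2**31 - 1  # 2,147,483,647


def min_shards(total_docs, max_docs_per_shard=MAX_DOCS_PER_INDEX):
    """Smallest number of shards such that no shard exceeds the id limit."""
    if total_docs < 0 or max_docs_per_shard <= 0:
        raise ValueError("total_docs must be >= 0 and the limit must be > 0")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return (total_docs + max_docs_per_shard - 1) // max_docs_per_shard


if __name__ == "__main__":
    # The low and high ends of the page estimate in this message.
    for pages in (1_000_000_000, 6_000_000_000):
        print(f"{pages:,} pages -> at least {min_shards(pages)} shard(s)")
```

Under these assumptions, 1 billion pages would fit under the id limit of a single index, while 6 billion would need at least 3 shards.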


Tom Burton-West.


