Obviously, as the number of documents increases, the index size must increase to some degree -- I think linearly? But what index size will result for 7M documents over 50K words, where we're talking just 2 fields per doc: one id field and one OCR field of ~1.4M? Ballpark?

Regarding single word queries, do you think, say, 0.5 sec/query to return 7M score-ranked IDs is possible/reasonable in this scenario?


The only real advice I can add is to give it a try. If you have test data, index it and see what happens. Half-second queries are likely possible with the right hardware and settings -- but run a few tests before signing any contracts ;) If the index is really large, SOLR-303 should help make it more manageable.
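
A test like that can be as simple as timing a handful of single-word queries against a test index. Below is a minimal sketch of that, not anything shipped with Solr. It assumes a Solr instance at http://localhost:8983/solr, a searchable field named "ocr", a unique key field named "id", and the JSON response writer; all of those, along with the helper names build_select_url, http_fetch and time_queries, are made up for illustration and need adjusting to the real schema.

```python
import time
import urllib.parse
import urllib.request


def build_select_url(base_url, field, word, rows):
    """Build a single-term query URL that asks for ids and scores only."""
    params = urllib.parse.urlencode({
        "q": f"{field}:{word}",   # field name is an assumption
        "fl": "id,score",         # return only the id and the relevance score
        "rows": rows,             # how many ranked ids to return
        "wt": "json",
    })
    return f"{base_url}/select?{params}"


def http_fetch(url):
    """Read the whole response body so transfer time is part of the timing."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def time_queries(urls, fetch=http_fetch):
    """Return the elapsed wall-clock seconds for each URL, in order."""
    timings = []
    for url in urls:
        start = time.perf_counter()
        fetch(url)
        timings.append(time.perf_counter() - start)
    return timings


# Example use against an assumed local test instance:
#   base = "http://localhost:8983/solr"
#   words = ["history", "london", "river"]   # made-up sample terms
#   urls = [build_select_url(base, "ocr", w, rows=1000) for w in words]
#   for word, seconds in zip(words, time_queries(urls)):
#       print(f"{word}: {seconds:.3f} s")
```

Running each query more than once is worthwhile, since the first hit on a cold index is usually much slower than later ones. Raising rows toward the full result size shows how much of the time goes to returning ids rather than to the search itself.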

Let us know how things go, and add your data to:
http://wiki.apache.org/solr/SolrPerformanceData

ryan


