Hi y’all,

I’m a newbie to Solr, and was looking for advice on whether Solr is the 
best choice for this project. 

I need to be able to search through terabytes of existing data.  Documents 
may vary in size from 20 KB to 10 MB.  At some point I'll also need to 
feed in approximately 1-5 million new documents a day. 

With this in mind…

Has anyone used Solr to conduct searches over terabytes of data?  If so, 
are there any configuration parameters I should pay particular attention 
to, such as JVM heap size, mergeFactor, etc.?
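For context on what I mean: as I understand it, mergeFactor lives in 
solrconfig.xml, while the heap is set on the JVM command line (e.g. 
-Xms/-Xmx) when starting the servlet container.  A sketch of the kind of 
settings I'm asking about — the values here are placeholders, not 
recommendations:

```xml
<!-- solrconfig.xml (fragment) — illustrative values only -->
<indexDefaults>
  <!-- Higher mergeFactor: faster indexing, more segments, slower searches -->
  <mergeFactor>10</mergeFactor>
  <!-- Buffer more documents in RAM before flushing a new segment -->
  <maxBufferedDocs>1000</maxBufferedDocs>
</indexDefaults>
```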

Is there a limit to the number of shards Solr can handle?  I don't 
think there's any way I can do this without some sort of distributed 
search.
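From what I've read, distributed search is driven by a shards request 
parameter listing the hosts to fan out to — something like the following, 
with hypothetical host names:

```
http://host1:8983/solr/select?q=foo&shards=host1:8983/solr,host2:8983/solr,host3:8983/solr
```

I'd love to hear how many entries in that list is realistic in practice.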

I've read that Solr indexes can reach millions, if not billions, of 
documents.  However, at what point does the index size become 
impractical?  I know this is a bit open-ended, but does Solr have a limit 
to the number of documents that can be in a single index? 

Has anyone looked into either of the search engines below, and are there 
any other engines that would be better suited, such as FAST or Autonomy?
http://mg4j.dsi.unimi.it/
http://www.egothor.org/performance.shtml

I know I asked quite a bit in this post, but any help/suggestions would be 
much appreciated.



Regards,

Willie
