There's a whole heap of information missing here, like what you plan on storing vs indexing and, yes, QPS too. My short answer: try with one server until it falls over, then start adding more.
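To put rough numbers on "until it falls over", here is a back-of-the-envelope sketch in Python using the figures from the original post (300+ websites a month, ~5-10 pages each, crawled about every 30 minutes). The function name and its parameters are made up for illustration, and the history case assumes every crawl stores a fresh copy of every page, which may not match what Nutch is actually configured to do.

```python
def estimate_docs(months, sites_per_month=300, pages_per_site=10,
                  keep_history=False, crawls_per_day=48, days_per_month=30):
    """Rough count of documents in the index after `months` months.

    Hypothetical helper: all defaults are guesses taken from the
    original post, not measured values.
    """
    pages = 0   # distinct pages being crawled
    total = 0   # documents held in the index
    for _ in range(months):
        # New sites added this month bring new pages with them.
        pages += sites_per_month * pages_per_site
        if keep_history:
            # Assumption: every crawl of every page stores a new document.
            total += pages * crawls_per_day * days_per_month
        else:
            # Each crawl replaces the old copy, so the index holds one
            # document per page.
            total = pages
    return total


if __name__ == "__main__":
    print("replace old copies:", estimate_docs(12))
    print("keep full history: ", estimate_docs(12, keep_history=True))
```

Under these assumptions, a year of replacing old copies comes to about 36,000 documents, while keeping every crawl as history comes to roughly 337 million. Document count is only part of the picture (stored fields and query load matter too), but it shows why the replace-vs-history question changes the answer so much.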
When you say multiple-server setup, do you mean multiple servers where each server acts as a slave holding the entire index, so that queries are load-balanced across them, OR multiple servers where each server stores a portion of the data? If it's the former, a simple master/slave setup in Solr 4.x is sometimes enough; the latter may mean SolrCloud. Master/slave is easy, but I don't know much about SolrCloud.

Questions to think about (this is not exhaustive by any means):

1) When you say 5-10 pages per website (300+ websites) that you are crawling 2x per hour, are you *replacing* the old copy of the web page in your index, or storing some form of history for some reason?
2) What are you planning on storing vs indexing? That will dictate your memory requirements.
3) You mentioned you don't know QPS, but having some guess would help. Is it mostly for storage and occasional lookup (where slow responses are probably tolerable), or is this powering a real user-facing website (where low latency is probably desired)?

Again, I like to start simple and use one server until it dies, then expand from there.

Cheers
Amit

On Thu, Apr 4, 2013 at 7:58 AM, imehesz <imeh...@gmail.com> wrote:

> hello,
>
> I'm using a single server setup with Nutch (1.6) and Solr (4.2)
>
> I plan to trigger the Nutch crawling process every 30 minutes or so and add
> about 300+ websites a month with (~5-10 pages each). At this point I'm not
> sure about the query requests/sec.
>
> Can I run this on a single server (how long)?
> If not, what would be the best and most efficient way to have multiple
> server setup?
>
> thanks,
> --iM