A lot of information is missing here, such as what you plan to store vs.
index and, yes, QPS too. My short answer: try with one server until it
falls over, then start adding more.
When you say multiple-server setup, do you mean multiple servers where each
server acts as a slave holding the entire index, so you can load balance
across them, OR do you mean multiple servers where each server stores a
portion of the data? If it's the former, a simple master/slave setup in
Solr 4.x is sometimes enough, but the latter may mean SolrCloud.
Master/slave is easy to set up, but I don't know much about SolrCloud.

Questions to think about (this is not exhaustive by any means):
1) When you say 5-10 pages per website (300+ websites) that you are
crawling 2x per hour, are you *replacing* the old copy of the web page in
your index, or storing some form of history for some reason?
2) What are you planning on storing vs. indexing? That will dictate your
memory requirements.
3) You mentioned you don't know QPS, but having some guess would help. Is
it mostly for storage and occasional lookup (where slow responses are
probably tolerable), or is this powering a real user-facing website (where
low latency is probably desired)?
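
To make question 1 concrete, here is a rough back-of-envelope sketch of how
the index could grow in the two cases. It is only an illustration: the
function name and parameters are made up for this example, "every 30
minutes" is read as 48 crawls per day, months are taken as 30 days, and the
history case assumes the worst case where every crawl stores a new version
of every page.

```python
def projected_docs(months, sites_per_month=300, pages_per_site=(5, 10),
                   crawls_per_day=48, days_per_month=30, keep_history=False):
    """Return a (low, high) estimate of documents in the index after `months`.

    All defaults are assumptions taken from the numbers in the original post:
    300 new sites a month, 5-10 pages each, one crawl every 30 minutes.
    """
    results = []
    for pages in pages_per_site:
        pages_added_per_month = sites_per_month * pages
        if keep_history:
            # Worst case: every crawl of every page is kept as a new document.
            versions_per_page_month = crawls_per_day * days_per_month
            # Pages added in month m keep being crawled for the remaining
            # (months - m + 1) months.
            total = sum(
                pages_added_per_month * versions_per_page_month * (months - m + 1)
                for m in range(1, months + 1)
            )
        else:
            # Replacing the old copy: one document per page, no matter how
            # often it is crawled.
            total = pages_added_per_month * months
        results.append(total)
    return tuple(results)


if __name__ == "__main__":
    print("replace after 12 months:", projected_docs(12))
    print("history after 12 months:", projected_docs(12, keep_history=True))
```

Under these assumptions, replacing pages keeps the index in the tens of
thousands of documents after a year, while keeping every crawl pushes it
into the hundreds of millions, which is why the answer to question 1
matters so much for the single-server decision.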

Again, I like to start simple and use one server until it dies then expand
from there.

Cheers
Amit


On Thu, Apr 4, 2013 at 7:58 AM, imehesz <imeh...@gmail.com> wrote:

> hello,
>
> I'm using a single server setup with Nutch (1.6) and Solr (4.2)
>
> I plan to trigger the Nutch crawling process every 30 minutes or so and add
> about 300+ websites a month with (~5-10 pages each). At this point I'm not
> sure about the query requests/sec.
>
> Can I run this on a single server (how long)?
> If not, what would be the best and most efficient way to have multiple
> server setup?
>
> thanks,
> --iM
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-4-2-single-server-limitations-tp4053829.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
