Thanks Toke.  Your input has been informative and valuable.
I will go through the links you provided and will let you know what we end
up going with.

On Sat, Dec 5, 2015 at 5:02 AM, Toke Eskildsen <t...@statsbiblioteket.dk>
wrote:

> Gaurav Patel <gaura...@gmail.com> wrote:
> > 3 Physical Machines with 60 cpu cores and 512 GB RAM each.
> > EMC Isilon Appliance with PB storage. It can be accessed via HDFS or NFS.
>
> We have experimented a little bit with smaller machines, backed by EMC
> Isilon over NFS. That worked surprisingly well, but ultimately did not
> scale for us as we could not justify paying for enterprise SSDs for the
> Isilon. There is a write-up at
> https://sbdevel.wordpress.com/2013/12/06/danish-webscale/
>
> > Can we use solr cloud for this setup?
>
> Yes. That is independent of the backing storage.
>
> > How many instances of Solr are recommended per physical machine,
> > and how much RAM should be allocated to each?
>
> "That depends".
>
> http://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>
> The amount of RAM for JVMs should be whatever is needed. Or to put it
> another way: There are some explicitly configured internal caches in Solr,
> but just setting Xmx to a very high number will not help performance. On
> the contrary, it will lead to long garbage collection pauses and eat from
> the precious disk cache.
>
> There are some rules of thumb for running Solr, but my own meta rule of
> thumb is that their applicability goes down when scale goes up. One of the
> rules of thumb is to have 1 Solr instance per machine. But running JVMs
> with very large heaps (100GB+) has the potential of extremely long garbage
> collection pauses and also implies a larger memory overhead due to internal
> pointer size.
>
> > Should ZooKeeper be installed along with Solr on each box, or should it
> > be installed by itself on 2 separate virtual machines?
>
> I have no opinion on that.
>
> > Can we run Kafka and Cassandra along with Solr on each physical machine?
>
> Sure, but they will of course compete with Solr for resources.
>
> > Anybody running Solr with HDFS in production?
>
> It is a recurring theme on this mailing list at least. It can be searched
> at
> https://www.mail-archive.com/solr-user@lucene.apache.org/
>
> - Toke Eskildsen
>
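
For reference, the heap-versus-disk-cache trade-off described above can be
sketched as simple arithmetic. This is only a rough illustration: the
function name, the instance counts, the heap sizes and the 32 GB reserved
for other services are all made-up assumptions, not recommendations. The
only figure taken from the thread is the 512 GB of RAM per machine. The
32 GB threshold reflects the HotSpot default of using compressed object
pointers only for heaps below roughly that size, which is one reason very
large heaps carry extra pointer overhead.

```python
# Rough memory budget for one machine running several Solr JVMs.
# Everything except the 512 GB of RAM is an illustrative assumption.

# HotSpot uses compressed object pointers by default only for heaps below
# roughly 32 GB; larger heaps fall back to full 64-bit pointers.
COMPRESSED_OOPS_LIMIT_GB = 32


def memory_budget(total_ram_gb, instances, heap_gb_per_instance, other_gb=0):
    """Return total heap, RAM left for the OS disk cache, and whether each
    JVM heap is small enough to keep compressed object pointers."""
    heap_total = instances * heap_gb_per_instance
    disk_cache = total_ram_gb - heap_total - other_gb
    if disk_cache < 0:
        raise ValueError("heaps and other services exceed physical RAM")
    return {
        "heap_total_gb": heap_total,
        "disk_cache_gb": disk_cache,
        "compressed_oops": heap_gb_per_instance < COMPRESSED_OOPS_LIMIT_GB,
    }


if __name__ == "__main__":
    # Compare one large JVM with several smaller ones on a 512 GB machine.
    for instances, heap in [(1, 128), (4, 30)]:
        print(instances, heap, memory_budget(512, instances, heap, other_gb=32))
```

Whatever is not claimed by JVM heaps or other services (Kafka, Cassandra)
remains available to the operating system as disk cache, so the useful
comparison is how much each candidate layout leaves over.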
