On 12/22/2013 09:48 PM, Shawn Heisey wrote:
On 12/22/2013 2:10 PM, David Santamauro wrote:
My goal is to have a redundant copy of all 8 currently running but
non-redundant shards. This setup (8 nodes with no replicas) was a test
and it has proven quite functional from a performance perspective.
Loading, though, takes almost 3 weeks so I'm really not in a position to
redesign the distribution, though I can add nodes.

I have acquired another resource, a very large machine that I'd like to
use to hold the replicas of the 8 currently deployed nodes.

I realize I can run 8 jetty/tomcats and accomplish my goal but that is a
maintenance headache and is really a last resort. I really would just
like to be able to deploy this big machine with 'numShards=8'.

Is that possible or do I really need to have 8 other nodes running?

You don't want to run more than one container or Solr instance per
machine.  Things can get very confused, and it's too much overhead.

With existing collections, you can simply run the CoreAdmin CREATE
action on the new node with more resources.

So you'd do something like this, once for each of the 8 existing parts:

http://newnode:port/solr/admin/cores?action=CREATE&name=collname_shard1_replica2&collection=collname&shard=shard1

It will automatically replicate the shard from its current leader.
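
For anyone repeating this for all 8 shards, the URLs can be generated rather than typed by hand. The following is only a sketch: it assumes the shards are named shard1 through shard8, that the new cores follow the collname_shardN_replica2 pattern shown above, and that the new node listens on port 8983 (the thread only says "port"). The helper name build_create_urls is made up for illustration.

```python
from urllib.parse import urlencode


def build_create_urls(base_url, collection, num_shards, replica_num=2):
    """Build one CoreAdmin CREATE URL per shard (hypothetical helper)."""
    urls = []
    for n in range(1, num_shards + 1):
        shard = f"shard{n}"  # assumed shard naming: shard1..shardN
        params = {
            "action": "CREATE",
            "name": f"{collection}_{shard}_replica{replica_num}",
            "collection": collection,
            "shard": shard,
        }
        urls.append(f"{base_url}/admin/cores?{urlencode(params)}")
    return urls


# "newnode" comes from the example above; port 8983 is an assumption.
for url in build_create_urls("http://newnode:8983/solr", "collname", 8):
    print(url)
```

Each printed URL can then be requested one at a time (browser, curl, or similar), waiting for a shard to finish replicating before starting the next if disk or network load is a concern.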

Fantastic! Clearly my understanding of "collection" vs. "core" vs. "shard" was lacking, but now I see the relationship better.


One thing to be aware of: With 1.4TB of index data, it might be
impossible to keep enough of the index in RAM for good performance,
unless the machine has a terabyte or more of RAM.

Yes, I'm well aware of the performance implications, many of which are mitigated by 2TB of SSD and 512GB RAM.
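
As a rough back-of-the-envelope check on those numbers, the share of the index that could sit in the OS page cache can be estimated as below. This is a sketch under assumptions: 1.4TB is treated as 1400GB, and the 32GB reserved for the JVM heap and the OS is a made-up figure, not something stated in this thread.

```python
def cacheable_fraction(index_gb, ram_gb, reserved_gb):
    """Upper bound on the fraction of the index that fits in the page cache."""
    available_gb = max(ram_gb - reserved_gb, 0)  # RAM left after heap/OS
    return min(available_gb / index_gb, 1.0)


# 1400 GB index and 512 GB RAM are from the thread; reserved_gb=32 is assumed.
fraction = cacheable_fraction(index_gb=1400, ram_gb=512, reserved_gb=32)
print(f"Roughly {fraction:.0%} of the index could be cached at best")
```

The remainder has to be read from disk on demand, which is where the SSD storage helps.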

Thanks for the nudge in the right direction. The first node/shard1 is replicating right now.

David


