Re: Solr Collection API doesn't seem to be working

Mark Miller Thu, 03 Jan 2013 06:59:19 -0800

MAX_INT is just a placeholder for a high value, given that this guy wants to 
add replicas for as many machines as he adds down the line. You are taking it 
too literally.
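
To make the "placeholder" point concrete, here is a minimal sketch of the idea, not anything Solr 4.0 does: the function name `target_replicas` is hypothetical, and it assumes at most one replica of a given shard per node. A very large requested value and a fixed cap like 10 behave the same until the cluster outgrows the cap.

```python
import sys


def target_replicas(requested: int, live_nodes: int) -> int:
    """Hypothetical helper: replicas per shard that could be placed right now.

    Assumes at most one replica of a shard per node, so the requested
    replicationFactor is capped by the number of live nodes.
    """
    if requested < 1:
        raise ValueError("requested replicationFactor must be >= 1")
    if live_nodes < 0:
        raise ValueError("live_nodes must be >= 0")
    return min(requested, live_nodes)


# sys.maxsize stands in for "MAX_INT": it only means "as many as there are nodes".
for nodes in (3, 10, 20):
    print(nodes, target_replicas(sys.maxsize, nodes), target_replicas(10, nodes))
```

With 3 or 10 nodes both requests yield the same count; only at 20 nodes do they diverge, which is where the disagreement in this thread actually lies.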


- Mark

On Jan 3, 2013, at 9:02 AM, Per Steffensen <st...@designware.dk> wrote:

> On 1/3/13 2:50 AM, Mark Miller wrote:
>> Unfortunately, for 4.0, the collections API was pretty bare bones. You don't 
>> actually get back responses currently - you just pass off the create command 
>> to zk for the Overseer to pick up and execute.
>> 
>> So you actually have to check the logs of the Overseer to see what the 
>> problem may be. I'm working on making sure we address this for 4.1.
>> 
>> If you look at the admin UI, in the zk tree, you should be able to see what 
>> node is the overseer (look for its election node). The logs for that node 
>> should indicate the problem.
>> 
>> FYI, if I remember right, replication factor is not currently optional.
> Actually I believe it is.
>> 
>> In the future, I'd like it so you can say something like 
>> replicationFactor=max_int, and the overseer will periodically try to match 
>> that given the nodes it sees - but we don't have that yet.
> Uhhhh, but why!
> 
> It would be nice if you could say replicationFactor=X where X is higher than 
> your current number of nodes, and the overseer then periodically checks 
> whether it can honor your original request for replicationFactor X (it will 
> be able to once you eventually have X nodes in your cluster).
> 
> But specifying a MAX_INT value is IMHO a bad idea. Maintaining double the 
> number of replicas requires double the resource usage, so you don't want more 
> replicas than necessary relative to your risk/HA profile. I can't imagine a 
> setup where you want a replica of each shard on every node no matter how 
> many nodes you add to your cluster. Of course you can always give a 
> replicationFactor of 10 (or something high), and if you know (or currently 
> believe) that you will never add more than 10 nodes to your cluster, then 
> you basically achieve what you wanted to do with MAX_INT. But if things 
> evolve and you end up having 20 or 100 nodes in your cluster, you probably do 
> not want more than 10 replicas anyway.
>> 
>> When you add new nodes, to add them to a current collection you will either 
>> have to use the CoreAdmin API or preconfigure the cores in solr.xml. All you 
>> need is to specify a matching collection name for the new core.
>> 
>> - Mark
>
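
For the CoreAdmin route mentioned above, a minimal sketch of building the request URL follows. The base URL, core name, and collection name are placeholders, and the helper `core_create_url` is hypothetical; the `action=CREATE`, `name`, `collection`, and optional `shard` parameters are the CoreAdmin ones as I understand them for SolrCloud, so check them against your Solr version.

```python
from typing import Optional
from urllib.parse import urlencode


def core_create_url(base_url: str, core_name: str, collection: str,
                    shard: Optional[str] = None) -> str:
    """Hypothetical helper: build a CoreAdmin CREATE URL for a new core.

    The new core joins an existing collection simply by naming it in the
    `collection` parameter.
    """
    params = {"action": "CREATE", "name": core_name, "collection": collection}
    if shard is not None:
        # Optionally pin the core to a specific shard.
        params["shard"] = shard
    return f"{base_url.rstrip('/')}/admin/cores?{urlencode(params)}"


# Placeholder host and names; send the URL with any HTTP client on the new node.
print(core_create_url("http://localhost:8983/solr",
                      "mycollection_replica2", "mycollection"))
```

The request has to go to the node that should host the new core, since CoreAdmin acts on the local Solr instance.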
