Any help on this would be much appreciated. Is it better to run ZooKeeper
on machines with more cores (as opposed to a 1-core machine)?



On Wed, Mar 12, 2014 at 4:28 PM, Chris W <chris1980....@gmail.com> wrote:

> Hi Furkan
>
> Load on the network is very low when there is a read workload on the
> cluster. During indexing, a few of my commits hang forever while the Solr
> nodes are trying to get a connection to ZooKeeper. The peer communication
> between the zk nodes is very good and I haven't seen any issues. The
> network transfer is around 15-20 mBps when I restart a Solr node.
>
> *Infrastructure*: 10 node SolrCloud cluster with a 3 node zk ensemble
> (m1.medium instances with a 1 core CPU, 1.5 GB of heap out of a total of
> 3 GB RAM). Solr logs are on the same mount as the Solr data and tlogs. Zk
> logs are also on the same mount as the zk data. I have 80+ collections,
> which can easily grow to 150-200.
>
> *Regarding ZK Data*
>
> Why does 50MB pose a problem if none of the system parameters are in an
> alarming state? I have 80+ collections in Solr, and every collection has
> the same schema but a different solrconfig.xml. Hence I am bundling each
> schema and config into a different zk folder and pushing that as a
> separate config. Is there a way in Solr/ZooKeeper to use one config for
> common files (like velocity templates, schema) and push just the
> solrconfig.xml into another config directory? Of the 50MB, I am sure that
> at least 90% of the data is duplicated across configs.
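
The 90% estimate above can be checked before changing anything. A minimal sketch, assuming the config directories that get uploaded to ZooKeeper also exist under one local folder (the function name and the layout are hypothetical, not part of Solr):

```python
import hashlib
import os
from collections import defaultdict


def duplicate_bytes(config_root):
    """Return (total_bytes, duplicate_bytes) for all files under config_root.

    Files are grouped by content hash; every copy after the first one in a
    group is counted as duplicate data.
    """
    sizes_by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(config_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                data = fh.read()
            digest = hashlib.sha256(data).hexdigest()
            sizes_by_digest[digest].append(len(data))

    total = sum(sum(sizes) for sizes in sizes_by_digest.values())
    duplicate = sum(sum(sizes[1:]) for sizes in sizes_by_digest.values())
    return total, duplicate
```

Calling `duplicate_bytes` on the parent folder of the per-collection config directories gives the total size and how much of it is byte-identical copies.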
>
> Kindly advise and thanks for your response
>
> On Wed, Mar 12, 2014 at 4:08 PM, Furkan KAMACI <furkankam...@gmail.com> wrote:
>
>> Hi;
>>
>> FAQ page says that:
>>
>> *Q: I'm seeing lots of session timeout exceptions - what to do?*
>> *A: Try raising the ZooKeeper session timeout by editing solr.xml - see
>> the zkClientTimeout attribute. The minimum session timeout is 2 times your
>> ZooKeeper defined tickTime. The maximum is 20 times the tickTime. The
>> default tickTime is 2 seconds. You should avoid raising this for no good
>> reason, but it should be high enough that you don't see a lot of false
>> session timeouts due to load, network lag, or garbage collection pauses.
>> The default timeout is 15 seconds, but some environments might need to go
>> as high as 30-60 seconds.*
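
A note on the numbers in the FAQ answer above: the ZooKeeper server grants a session timeout clamped between a minimum and a maximum, which default to 2 and 20 times tickTime unless minSessionTimeout / maxSessionTimeout are set in zoo.cfg. Below is a minimal sketch of that arithmetic only. The function name is made up for illustration and this is not ZooKeeper or Solr code.

```python
def negotiated_session_timeout_ms(requested_ms, tick_time_ms=2000,
                                  min_session_timeout_ms=None,
                                  max_session_timeout_ms=None):
    """Return the session timeout a server would grant for a client request.

    The request is clamped into [minSessionTimeout, maxSessionTimeout];
    when those are unset they default to 2 * tickTime and 20 * tickTime.
    """
    lo = 2 * tick_time_ms if min_session_timeout_ms is None else min_session_timeout_ms
    hi = 20 * tick_time_ms if max_session_timeout_ms is None else max_session_timeout_ms
    return max(lo, min(requested_ms, hi))


if __name__ == "__main__":
    # A 3 minute request against the default tickTime of 2000 ms.
    print(negotiated_session_timeout_ms(180_000))   # 40000
    # The default 15 second Solr timeout fits inside the 4-40 second window.
    print(negotiated_session_timeout_ms(15_000))    # 15000
    # Raising maxSessionTimeout on the server side lets 3 minutes through.
    print(negotiated_session_timeout_ms(180_000, max_session_timeout_ms=180_000))  # 180000
```

Under these defaults, a client-side timeout of around 3 minutes would still be granted as 40 seconds unless the server's tickTime or maxSessionTimeout is raised as well.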
>>
>> So when that happens, what is the load on your network? Do you get those
>> timeouts during heavy indexing or at idle time? If not during heavy
>> indexing, there is likely a network problem. Could you check whether a
>> problem exists between the nodes of your ZooKeeper ensemble? Also, could
>> you give some more information about your infrastructure and Solr logs?
>> (PS: 50 MB of data *may* cause a problem for your architecture)
>>
>> Thanks;
>> Furkan KAMACI
>>
>>
>> 2014-03-13 0:57 GMT+02:00 Chris W <chris1980....@gmail.com>:
>>
>> > Hi
>> >
>> >   I have a 3 node zk ensemble. I see very high latency for zk responses
>> > and also a lot of outstanding requests (on the order of 30-40).
>> >
>> > I also see that the requests are not going to all zookeeper nodes equally.
>> > One node has more requests/connections than the others. I see that CPU/Mem
>> > and disk usage are well within normal limits (under 30% cpu, disk reads on
>> > the order of kb, jvm size is 2 Gb but it hasn't even reached 30% usage).
>> > The size of data in zk is around 50MB.
>> >
>> > I also see a few zk timeouts for solrcloud nodes, causing them to be shown
>> > as "dead" in the cloud view. I have increased the connection timeout to
>> > around 3 minutes and still the same issue seems to be happening.
>> >
>> > How do I make zk respond faster to requests, and where does zk usually
>> > spend time while dealing with incoming requests?
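
On the latency and outstanding-request numbers above: each ZooKeeper server answers the four-letter command `mntr` on its client port with tab-separated counters such as `zk_avg_latency` and `zk_outstanding_requests` (available in 3.4.x; newer releases need it enabled through `4lw.commands.whitelist`). A rough sketch for collecting them per node, so that the uneven request distribution can be compared side by side. The helper names and the choice of counters in `summarize` are assumptions made for illustration.

```python
import socket


def send_four_letter_word(host, port, word, timeout=5.0):
    """Send a four-letter command (e.g. 'mntr') and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(word.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Parse 'mntr' output (one key<TAB>value pair per line) into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def summarize(stats):
    """Pick the counters most relevant to slow responses (None if absent)."""
    keys = (
        "zk_server_state",
        "zk_avg_latency",
        "zk_max_latency",
        "zk_outstanding_requests",
        "zk_num_alive_connections",
        "zk_approximate_data_size",
    )
    return {key: stats.get(key) for key in keys}


# Example use (host names are placeholders for the three ensemble members):
# for host in ("zk1", "zk2", "zk3"):
#     print(host, summarize(parse_mntr(send_four_letter_word(host, 2181, "mntr"))))
```

Comparing `zk_server_state` with the connection and latency counters shows whether the busiest node is the leader or a follower.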
>> >
>> > Any pointers on how to move forward will be great
>> >
>> > --
>> > Best
>> > --
>> > C
>> >
>>
>
>
>
> --
> Best
> --
> C
>



-- 
Best
-- 
C
