Hi Furkan

Load on the network is very low when a read workload is running on the
cluster. During indexing, a few of my "commits" hang forever and the Solr
nodes keep trying to get a connection from ZooKeeper. The peer communication
between the ZK nodes is very good and I haven't seen any issues there. The
network transfer is around 15-20 mBps when I restart a Solr node.

*Infrastructure*: 10-node SolrCloud cluster with a 3-node ZK ensemble
(m1.medium instances with a 1-core CPU and 1.5 GB of heap out of a total of
3 GB RAM). Solr logs are on the same mount as the Solr data and tlogs. ZK
logs are also on the same mount as the ZK data. I have 80+ collections, which
can easily grow to 150-200.

*Regarding ZK Data*

Why does 50 MB pose a problem if none of the system parameters are in an
alarming state? I have 80+ collections in Solr, and every collection has the
same schema but a different solrconfig.xml. Hence I am bundling each schema
and config into a separate ZK folder and pushing that as a separate config.
Is there a way in Solr/ZooKeeper to use one config for common files (like the
Velocity templates and schema) and push just the solrconfig.xml into another
config directory? Of the 50 MB, I am sure that at least 90% of the data is
duplicated across configs.
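
To put a number on that 90% estimate, something like the following could be
run against a local copy of the config directories before they are pushed to
ZK. This is only a minimal sketch: the function name `duplicate_report` and
the layout (one sub-directory per config under a common root) are assumptions
made for illustration, not something Solr or ZooKeeper provides.

```python
import hashlib
import os
from collections import defaultdict


def duplicate_report(root):
    """Summarise how much file content under `root` is duplicated.

    Assumes (hypothetically) that `root` holds one sub-directory per
    config, mirroring what gets uploaded to ZooKeeper.
    """
    sizes = {}                    # content hash -> size in bytes
    copies = defaultdict(list)    # content hash -> paths with that content

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                data = fh.read()
            digest = hashlib.sha256(data).hexdigest()
            sizes[digest] = len(data)
            copies[digest].append(path)

    # Total counts every copy; unique counts each distinct content once.
    total = sum(sizes[d] * len(paths) for d, paths in copies.items())
    unique = sum(sizes.values())
    duplicate = total - unique
    return {
        "total_bytes": total,
        "unique_bytes": unique,
        "duplicate_bytes": duplicate,
        "duplicate_ratio": (duplicate / total) if total else 0.0,
    }
```

Calling `duplicate_report` on the directory that holds the local configs
returns the total, unique, and duplicated byte counts, so the share of the
50 MB that is redundant can be measured rather than guessed.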

Kindly advise, and thanks for your response.

On Wed, Mar 12, 2014 at 4:08 PM, Furkan KAMACI <furkankam...@gmail.com> wrote:

> Hi;
>
> FAQ page says that:
>
> *Q: I'm seeing lots of session timeout exceptions - what to do?*
> *A: Try raising the ZooKeeper session timeout by editing solr.xml - see the
> zkClientTimeout attribute. The minimum session timeout is 2 times your
> ZooKeeper defined tickTime. The maximum is 20 times the tickTime. The
> default tickTime is 2 seconds. You should avoid raising this for no good
> reason, but it should be high enough that you don't see a lot of false
> session timeouts due to load, network lag, or garbage collection pauses.
> The default timeout is 15 seconds, but some environments might need to go
> as high as 30-60 seconds*.
>
> So when that happens, what is the load on your network? Do you get those
> timeouts during heavy indexing or at idle time? If they occur at idle time,
> there may be a network problem. Could you check whether a problem exists
> "between" the nodes of your ZooKeeper ensemble? Also, could you give some
> more information about your infrastructure and Solr logs? (PS: 50 MB of
> data *may* cause a problem for your architecture)
>
> Thanks;
> Furkan KAMACI
>
>
> 2014-03-13 0:57 GMT+02:00 Chris W <chris1980....@gmail.com>:
>
> > Hi
> >
> > I have a 3-node ZK ensemble. I see very high latency for ZK responses
> > and also a lot of outstanding requests (on the order of 30-40).
> >
> > I also see that the requests are not going to all ZooKeeper nodes
> > equally. One node has more requests/connections than the others. CPU,
> > memory, and disk usage are all normal (under 30% CPU, disk reads on the
> > order of KB, JVM size is 2 GB but it hasn't even reached 30% usage). The
> > size of the data in ZK is around 50 MB.
> >
> > I also see a few ZK timeouts for SolrCloud nodes, causing them to be
> > shown as "dead" in the cloud view. I have increased the connection
> > timeout to around 3 minutes and the same issue still seems to be
> > happening.
> >
> > How do I make ZK respond faster to requests, and where does ZK usually
> > spend time while dealing with incoming requests?
> >
> > Any pointers on how to move forward would be great.
> >
> > --
> > Best
> > --
> > C
> >
>



-- 
Best
-- 
C
