Hello,

After more tests, we have identified our indexing problem (Solr 4.0.0): we are 
hitting OutOfMemoryErrors. Suspecting Zookeeper connection problems was a 
mistake; we had thought so because the OOMEs sometimes appear in the logs after 
errors during Zookeeper leader election.

Indexing fails when we define several Solr schemas in Zookeeper.
With a single schema, indexing works well; we have tested this with one Solr 
node in the cluster and with two.
The problems appear when we upload several configurations to Zookeeper: we can 
create an index for one collection, but OutOfMemoryErrors are thrown when we 
try to create an index for a second collection that uses another schema.
The garbage collection logs show a rapid increase in memory consumption, 
followed by the OutOfMemory errors.

Can we define a distinct schema for each collection?
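
For context, here is roughly how we upload the configurations and create the 
collections (a sketch, not our exact commands; the paths, config names, host 
and port are placeholders for our actual setup):

  cloud-scripts/zkcli.sh -zkhost server1:2188,server2:2188,server3:2188 \
    -cmd upconfig -confdir /path/to/conf1 -confname config1
  cloud-scripts/zkcli.sh -zkhost server1:2188,server2:2188,server3:2188 \
    -cmd upconfig -confdir /path/to/conf2 -confname config2

Each collection is then created against its own config via the Collections API:

  http://server1:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=1&collection.configName=config1
  http://server1:8983/solr/admin/collections?action=CREATE&name=collection2&numShards=1&collection.configName=config2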

Thanks!

Joel Gaspard



From: GASPARD Joel [mailto:joel.gasp...@cegedim.com]
Sent: Tuesday, January 22, 2013 16:30
To: solr-user@lucene.apache.org
Subject: Indexing problems

Hello,

We are facing problems when indexing with Solr 4.0.0 on more than one server 
node, and we cannot find a way to solve them.
We have 2 SolrCloud nodes. They run against a Zookeeper ensemble (version 
3.4.4) of 3 servers (another application is deployed on the third server).
We are trying to index a collection with 1 shard stored on the 2 nodes.
2 other collections with a single shard have already been indexed. The logs for 
that first indexing have been lost, but there may have been only a single Solr 
node in the cluster when it was done. Each collection contains about 3,000,000 
documents (16 GB).

When we start adding documents, failures occur very fast, perhaps after 2000 
documents, and the Solr servers can no longer be accessed.
I have attached part of the logs to this mail.

When we run SolrCloud with only one node in a single Zookeeper ensemble, we 
don't encounter any problems.



Some details on our configuration:
We send about 400 documents per minute.
The documents are added to Solr by two threads in our application, using the 
CloudSolrServer class.
These threads never call the commit method; we rely only on the Solr 
configuration to commit. The solrconfig.xml currently defines:

<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

No soft commit.

We have also tried:

<autoCommit>
  <maxTime>600000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>
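
In case it is useful, the indexing threads essentially do the following (a 
simplified sketch, not our actual code; the collection and field names are 
placeholders):

  import org.apache.solr.client.solrj.impl.CloudSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class Indexer {
      public static void main(String[] args) throws Exception {
          // One CloudSolrServer shared by both threads, pointing at the
          // same ensemble as the -DzkHost option.
          final CloudSolrServer server =
              new CloudSolrServer("server1:2188,server2:2188,server3:2188");
          server.setDefaultCollection("collection1"); // placeholder name

          Runnable worker = new Runnable() {
              public void run() {
                  try {
                      SolrInputDocument doc = new SolrInputDocument();
                      doc.addField("id", java.util.UUID.randomUUID().toString());
                      doc.addField("text", "some content"); // placeholder field
                      // No commit() here: autoCommit in solrconfig.xml handles it.
                      server.add(doc);
                  } catch (Exception e) {
                      e.printStackTrace();
                  }
              }
          };

          new Thread(worker).start();
          new Thread(worker).start();
      }
  }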

The Solr servers are launched with these options:
-Xmx12G -Xms4G
-XX:MaxPermSize=256m -XX:MaxNewSize=356m
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseParNewGC
-XX:+CMSClassUnloadingEnabled
-XX:MinHeapFreeRatio=10
-XX:MaxHeapFreeRatio=25
-DzkHost=server1:2188,server2:2188,server3:2188

The solr.xml contains zkClientTimeout="60000", and zoo.cfg defines a tickTime 
of 3000 ms.
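
For reference, Zookeeper by default negotiates client session timeouts between 
2 x tickTime and 20 x tickTime, so with a 3000 ms tickTime our 60000 ms 
zkClientTimeout is exactly the maximum the server will grant. The relevant 
zoo.cfg line (the explicit bound is commented out, as we rely on the default):

  tickTime=3000
  # maxSessionTimeout=60000  (default is 20 * tickTime)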

The Solr servers on which we are facing these problems also contain old 
collections and old cores created for earlier tests.



Could you give us some pointers?
Is this a problem in our Solr or Zookeeper configuration?
How could we detect network problems?
Is there a problem with the JVM parameters? Should we analyse some garbage 
collection logs?
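
In case it helps, we could capture more detailed GC logs by adding the 
standard HotSpot flags to the options above (a sketch; the log file path is a 
placeholder):

  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/solr/gc.log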

Thanks in advance.

Joel Gaspard
