Hi Markus; I am already using the existing functionality in Nutch. I have measured the effect of the batch size and I think the map tasks should be tuned up as well.
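Concretely, what I am experimenting with looks roughly like the sketch below. The solr.commit.size property is the stock Nutch batch-size setting and mapred.child.java.opts assumes a Hadoop 1.x cluster; if your SolrCloud index writer plugin uses its own batch-size knob, substitute that, and the values here are only placeholders:

    <!-- nutch-site.xml: keep update batches small so the indexer does not
         buffer too many documents before sending them to Solr -->
    <property>
      <name>solr.commit.size</name>
      <value>250</value>
    </property>

    <!-- mapred-site.xml (Hadoop 1.x): give each map task more heap -->
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1024m</value>
    </property>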
Thanks;
Furkan KAMACI

2014-02-27 17:21 GMT+02:00 Markus Jelsma <markus.jel...@openindex.io>:

> Something must be eating your memory in your SolrCloud indexer in Nutch.
> We have our own SolrCloud indexer in Nutch and it uses extremely little
> memory. You either have a leak or your batch size is too large.
>
> -----Original message-----
> > From: Furkan KAMACI <furkankam...@gmail.com>
> > Sent: Thursday 27th February 2014 16:04
> > To: solr-user@lucene.apache.org
> > Subject: How To Test SolrCloud Indexing Limits
> >
> > Hi;
> >
> > I'm trying to index 2 million documents into SolrCloud via MapReduce jobs
> > (a really small number of documents for my system). However, I get the
> > following error in the tasks when I increase the size of the added
> > document batch:
> >
> > java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to java.lang.Exception
> >     at org.apache.solr.client.solrj.impl.CloudSolrServer$RouteException.<init>(CloudSolrServer.java:484)
> >     at org.apache.solr.client.solrj.impl.CloudSolrServer.directUpdate(CloudSolrServer.java:351)
> >     at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:510)
> >     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
> >     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
> >     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
> >     at org.apache.nutch.indexwriter.solrcloud.SolrCloudIndexWriter.close(SolrCloudIndexWriter.java:95)
> >     at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:114)
> >     at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:54)
> >     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:649)
> >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:363)
> >     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> >
> > I am using Solr 4.5.1 and I do not get any errors on my SolrCloud nodes.
> > I want to test my indexing capability and I have changed some parameters
> > to tune it up. Is there any guidance on the autocommit/softcommit
> > maxTime and maxDocs parameters to test with? I don't need exact numbers;
> > I just want to follow a policy such as: increase autocommit maxDocs,
> > don't use softcommit or maxTime (or maybe there is no free lunch and I
> > should try everything!).
> >
> > I am not asking this question for production purposes; I know that I
> > should test more parameters and tune my system for that. I just want to
> > test my indexing limits.
> >
> > Thanks;
> >
> > Furkan KAMACI
> >
>
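For anyone following along, the pattern Markus describes boils down to flushing documents in fixed-size batches instead of buffering everything until close(). A minimal sketch with the SolrJ 4.x CloudSolrServer API; the ZooKeeper address, collection name, batch size and field names are placeholders, not my actual setup:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.impl.CloudSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BoundedBatchIndexer {

        // Flush after this many buffered documents instead of holding everything
        // until close(); this keeps the mapper's heap usage roughly constant.
        private static final int BATCH_SIZE = 250;

        public static void main(String[] args) throws Exception {
            CloudSolrServer server = new CloudSolrServer("zkhost1:2181,zkhost2:2181,zkhost3:2181");
            server.setDefaultCollection("collection1");

            List<SolrInputDocument> buffer = new ArrayList<SolrInputDocument>(BATCH_SIZE);
            try {
                for (int i = 0; i < 2000000; i++) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", "doc-" + i);
                    doc.addField("content", "sample content " + i);
                    buffer.add(doc);

                    if (buffer.size() >= BATCH_SIZE) {
                        server.add(buffer);   // send the batch and drop the references
                        buffer.clear();
                    }
                }
                if (!buffer.isEmpty()) {
                    server.add(buffer);       // flush whatever is left
                }
                server.commit();
            } finally {
                server.shutdown();
            }
        }
    }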
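On the autocommit/softcommit question from the original mail, the policy I am starting from for a pure throughput test is sketched below; the numbers are only placeholders to show the shape of the config, not recommendations:

    <!-- solrconfig.xml: hard commit fairly often so the transaction log stays
         small, but keep openSearcher=false so commits stay cheap while loading -->
    <autoCommit>
      <maxDocs>25000</maxDocs>
      <maxTime>60000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>

    <!-- Soft commits only control search visibility; for a pure indexing
         throughput test they can be disabled (-1) or set very high -->
    <autoSoftCommit>
      <maxTime>-1</maxTime>
    </autoSoftCommit>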