Here is what I'm doing:

    SolrServer server = new StreamingUpdateSolrServer(url, 1000, 5);
    server.addBeans(dataList); // dataList is a List<some_obj> with 10K elements

I run two threads, each using the same server object, and each calls
server.addBeans(...).

I'm able to get 50K/sec inserted using that, but the commit after that
(after 100k records) takes 70 sec, which messes up the average time.

There are two problems here:

1) Once in a while I get a "connection reset" error:

Caused by: java.net.SocketException: Connection reset
        at java.net.SocketInputStream.read(SocketInputStream.java:168)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
        at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)

Note: if I use CommonsHttpSolrServer I get the buffer error instead
("Unbuffered entity enclosing request can not be repeated").

2) The commit takes way too long for every 100k records (I may commit more
often if this cannot be improved).

I'm trying to fix the error, which happens only if I run two threads
both calling addBeans (10k at a time). One thread works fine. I'm not sure
how I can use the MultiThreadedHttpConnectionManager to create the
StreamingUpdateSolrServer, or whether it would help.

Thanks,
-vivek

2009/4/9 Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@gmail.com>:
> using a single request is the fastest
>
> http://wiki.apache.org/solr/Solrj#head-2046bbaba3759b6efd0e33e93f5502038c01ac65
>
> I could index at the rate of 10,000 docs/sec using this and
> BinaryRequestWriter
>
> On Thu, Apr 9, 2009 at 10:36 PM, vivek sar <vivex...@gmail.com> wrote:
>> I'm inserting 10K in a batch (using addBeans method). I read somewhere
>> in the wiki that it's better to use the same instance of SolrServer
>> for better performance. Would MultiThreadedHttpConnectionManager help? How
>> do I use it?
>>
>> I also wanted to know how I can use EmbeddedSolrServer - does my app
>> need to be running in the same JVM as the Solr webapp?
>>
>> Thanks,
>> -vivek
>>
>> 2009/4/9 Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@gmail.com>:
>>> how many documents are you inserting?
>>> maybe you can create multiple instances of CommonsHttpSolrServer and
>>> upload in parallel
>>>
>>> On Thu, Apr 9, 2009 at 11:58 AM, vivek sar <vivex...@gmail.com> wrote:
>>>> Thanks Shalin and Paul.
>>>>
>>>> I'm not using MultipartRequest. I do share the same SolrServer between
>>>> two threads. I'm not using MultiThreadedHttpConnectionManager. I'm
>>>> simply using CommonsHttpSolrServer to create the SolrServer. I've also
>>>> tried StreamingUpdateSolrServer, which works much faster, but does
>>>> throw a "connection reset" exception once in a while.
>>>>
>>>> Do I need to use MultiThreadedHttpConnectionManager? I couldn't find
>>>> anything about it on the Wiki.
>>>>
>>>> I was also thinking of using EmbeddedSolrServer - in what case would I
>>>> be able to use it? Do my application and the Solr web app need to
>>>> run in the same JVM for this to work? How would I use the
>>>> EmbeddedSolrServer?
>>>>
>>>> Thanks,
>>>> -vivek
>>>>
>>>> On Wed, Apr 8, 2009 at 10:46 PM, Shalin Shekhar Mangar
>>>> <shalinman...@gmail.com> wrote:
>>>>> Vivek, do you share the same SolrServer instance between your two threads?
>>>>> If so, are you using the MultiThreadedHttpConnectionManager when creating
>>>>> the HttpClient instance?
>>>>>
>>>>> On Wed, Apr 8, 2009 at 10:13 PM, vivek sar <vivex...@gmail.com> wrote:
>>>>>
>>>>>> single thread everything works fine. Two threads are fine too for a
>>>>>> while and all of a sudden the problem starts happening.
>>>>>>
>>>>>> I tried indexing using REST services as well (instead of Solrj), but
>>>>>> with that too I get following error after a while,
>>>>>>
>>>>>> 2009-04-08 10:04:08,126 ERROR [indexerThreadPool-5] Indexer -
>>>>>> indexData()-> Failed to index
>>>>>> java.net.SocketException: Broken pipe
>>>>>>        at java.net.SocketOutputStream.socketWrite0(Native Method)
>>>>>>        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>>>>>>        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>>>>>>        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>>>>>>        at java.io.FilterOutputStream.write(FilterOutputStream.java:80)
>>>>>>        at org.apache.commons.httpclient.methods.StringRequestEntity.writeRequest(StringRequestEntity.java:145)
>>>>>>        at org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:499)
>>>>>>        at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114)
>>>>>>        at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
>>>>>>        at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
>>>>>>        at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>>>>>>        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>>>>>        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>>>>>
>>>>>> Note, I'm using "simple" lock type. I'd tried "single" type before
>>>>>> that once caused index corruption so I switched to "simple".
>>>>>>
>>>>>> Thanks,
>>>>>> -vivek
>>>>>>
>>>>>> 2009/4/8 Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@gmail.com>:
>>>>>> > do you see the same problem when you use a single thread?
>>>>>> >
>>>>>> > what is the version of SolrJ that you use?
>>>>>> >
>>>>>> > On Wed, Apr 8, 2009 at 1:19 PM, vivek sar <vivex...@gmail.com> wrote:
>>>>>> >> Hi,
>>>>>> >>
>>>>>> >> Any ideas on this issue? I ran into this again - once it starts
>>>>>> >> happening it keeps happening. One of the thread keeps failing. Here
>>>>>> >> are my SolrServer settings,
>>>>>> >>
>>>>>> >>        int socketTO = 0;
>>>>>> >>        int connectionTO = 100;
>>>>>> >>        int maxConnectionPerHost = 10;
>>>>>> >>        int maxTotalConnection = 50;
>>>>>> >>        boolean followRedirects = false;
>>>>>> >>        boolean allowCompression = true;
>>>>>> >>        int maxRetries = 1;
>>>>>> >>
>>>>>> >> Note, I'm using two threads to simultaneously write to the same index.
>>>>>> >>
>>>>>> >> org.apache.solr.client.solrj.SolrServerException:
>>>>>> >> org.apache.commons.httpclient.ProtocolException: Unbuffered entity
>>>>>> >> enclosing request can not be repeated.
>>>>>> >>        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:470)
>>>>>> >>        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:242)
>>>>>> >>        at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:259)
>>>>>> >>        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
>>>>>> >>        at org.apache.solr.client.solrj.SolrServer.addBeans(SolrServer.java:57)
>>>>>> >>
>>>>>> >> Thanks,
>>>>>> >> -vivek
>>>>>> >>
>>>>>> >> On Sat, Apr 4, 2009 at 1:07 AM, vivek sar <vivex...@gmail.com> wrote:
>>>>>> >>> Hi,
>>>>>> >>>
>>>>>> >>> I'm sending 15K records at once using Solrj (server.addBeans(...))
>>>>>> >>> and have two threads writing to same index. One thread goes fine, but
>>>>>> >>> the second thread always fails with,
>>>>>> >>>
>>>>>> >>> org.apache.solr.client.solrj.SolrServerException:
>>>>>> >>> org.apache.commons.httpclient.ProtocolException: Unbuffered entity
>>>>>> >>> enclosing request can not be repeated.
>>>>>> >>>        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:470)
>>>>>> >>>        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:242)
>>>>>> >>>        at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:259)
>>>>>> >>>        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
>>>>>> >>>        at org.apache.solr.client.solrj.SolrServer.addBeans(SolrServer.java:57)
>>>>>> >>>        at com.apple.afterchat.indexer.solr.handler.BeanIndexHandler.indexData(BeanIndexHandler.java:44)
>>>>>> >>>        at com.apple.afterchat.indexer.Indexer.indexData(Indexer.java:77)
>>>>>> >>>        at com.apple.afterchat.indexer.Indexer.run(Indexer.java:39)
>>>>>> >>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>>>> >>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>>>> >>>        at java.lang.Thread.run(Thread.java:637)
>>>>>> >>> Caused by: org.apache.commons.httpclient.ProtocolException:
>>>>>> >>> Unbuffered entity enclosing request can not be repeated.
>>>>>> >>>        at org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:487)
>>>>>> >>>        at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114)
>>>>>> >>>        at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
>>>>>> >>>        at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
>>>>>> >>>        at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>>>>>> >>>        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>>>>> >>>        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>>>>> >>>        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:417)
>>>>>> >>>
>>>>>> >>> Does anyone know what could be the problem?
>>>>>> >>>
>>>>>> >>> Thanks,
>>>>>> >>> -vivek
>>>>>> >
>>>>>> > --
>>>>>> > --Noble Paul
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Shalin Shekhar Mangar.
>>>
>>> --
>>> --Noble Paul
>
> --
> --Noble Paul
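
To put the commit cost mentioned at the top of this message in numbers, here is a minimal sketch. It assumes the adds and the commit simply run back to back (adds at 50K docs/sec, then one 70-second commit per 100k records); the class and method names are made up for illustration and are not part of SolrJ.

```java
// Illustrative only: CommitOverhead and effectiveRate are hypothetical names,
// not part of SolrJ. Assumes adds and the commit run strictly one after the other.
final class CommitOverhead {

    /** Docs per second once the commit pause is included in the elapsed time. */
    static double effectiveRate(long docsPerCommit, double addRatePerSec, double commitSeconds) {
        double addSeconds = docsPerCommit / addRatePerSec; // time spent adding documents
        return docsPerCommit / (addSeconds + commitSeconds);
    }

    public static void main(String[] args) {
        // Figures reported in this thread: 50K docs/sec, commit every 100k docs, 70 sec per commit.
        double rate = effectiveRate(100_000, 50_000.0, 70.0);
        System.out.printf("effective rate: %.0f docs/sec%n", rate);
    }
}
```

Under that assumption, 100k records take about 2 seconds to add and 70 seconds to commit, so the sustained rate is roughly 1,389 docs/sec and the commit, not the add path, dominates the average.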