Re: to reduce indexing time

2014-03-05 Thread Shawn Heisey
On 3/5/2014 2:58 PM, Toby Lazar wrote: OK, I was using HttpSolrServer since I haven't yet migrated to CloudSolrServer. I added the line: solrServer.setRequestWriter(new BinaryRequestWriter()) after creating the server object and now see the difference through wireshark. Is it fair to assu

Re: to reduce indexing time

2014-03-05 Thread Toby Lazar
OK, I was using HttpSolrServer since I haven't yet migrated to CloudSolrServer. I added the line: solrServer.setRequestWriter(new BinaryRequestWriter()) after creating the server object and now see the difference through wireshark. Is it fair to assume that this usage is multi-thread safe?

Re: to reduce indexing time

2014-03-05 Thread Shawn Heisey
On 3/5/2014 2:31 PM, Toby Lazar wrote: I believe SolrJ uses XML under the covers. If so, I don't think you would improve performance by switching to SolrJ, since the client would convert it to XML before sending it on the wire. Until recently, SolrJ always used XML by default for requests and

Re: to reduce indexing time

2014-03-05 Thread Toby Lazar
Thanks, Ahmet, for the correction. I used Wireshark to capture an UpdateRequest to Solr and saw this XML: 123blah and figured that javabin was only for the responses. Does wt apply to how SolrJ sends requests to Solr? Could this HTTP content be in javabin format? Toby On Wed, Mar 5, 2014

Re: to reduce indexing time

2014-03-05 Thread Ahmet Arslan
Hi Toby, SolrJ uses javabin by default. Ahmet On Wednesday, March 5, 2014 11:31 PM, Toby Lazar wrote: I believe SolrJ uses XML under the covers.  If so, I don't think you would improve performance by switching to SolrJ, since the client would convert it to XML before sending it on the wire. T

Re: to reduce indexing time

2014-03-05 Thread Toby Lazar
I believe SolrJ uses XML under the covers. If so, I don't think you would improve performance by switching to SolrJ, since the client would convert it to XML before sending it on the wire. Toby *** Toby Lazar Capital Technology Group Email: tla...@capitaltg.com

Re: to reduce indexing time

2014-03-05 Thread Ahmet Arslan
Hi, One thing to consider: I think SolrNet uses XML updates, which carry XML parsing overhead. Switching to SolrJ or CSV can give an additional gain. http://wiki.apache.org/lucene-java/ImproveIndexingSpeed Ahmet On Wednesday, March 5, 2014 10:13 PM, sweety wrote: I will surely read about

Re: to reduce indexing time

2014-03-05 Thread sweety
I will surely read about JVM garbage collection. Thanks a lot, all of you. But is the time required for my indexing good enough? I don't know what the ideal timings are. I think that my indexing is taking more time than it should.

Re: to reduce indexing time

2014-03-05 Thread Greg Walters
It doesn't sound like you have much of an understanding of Java's garbage collection. You might read http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html to get a better understanding of how it works and why you're seeing different levels of memory utilization at any g

Re: to reduce indexing time

2014-03-05 Thread sweety
I have now batch indexed, with batches of 250 documents. These were the results. After 7,000 documents: QTime: 46894, System time: 00:00:55.9384892, JVM memory: 249.02 MB, 24.8%. This shows quite a reduction in timing. After 70,000 documents: QTime: 480435, System time: 00:09:29.5206727, System memor
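To put the timings quoted in this thread on a common scale, here is a minimal Java sketch that converts each run into documents per second. It assumes the reported QTime values are total elapsed milliseconds for each run, which the messages do not state explicitly. The class name IndexingThroughput and the method docsPerSecond are illustrative only, not part of Solr or SolrJ.

```java
import java.util.Locale;

class IndexingThroughput {
    /** Documents indexed per second, given a document count and elapsed milliseconds. */
    static double docsPerSecond(long docs, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return docs * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        // {documents, QTime} pairs as quoted in the thread.
        // Assumption: QTime is the total elapsed time of the run, in milliseconds.
        long[][] runs = {
            {700, 8122},      // first run reported in the thread
            {7000, 46894},    // batches of 250
            {70000, 480435},  // batches of 250
        };
        for (long[] run : runs) {
            System.out.printf(Locale.ROOT, "%d docs in %d ms: %.1f docs/s%n",
                    run[0], run[1], docsPerSecond(run[0], run[1]));
        }
    }
}
```

Under that assumption, the 700-document run works out to roughly 86 docs/s, while the batched runs land at roughly 149 and 146 docs/s, so the batched rate stays about flat as the document count grows tenfold.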

Re: to reduce indexing time

2014-03-05 Thread Shawn Heisey
On 3/5/2014 7:47 AM, sweety wrote:
> Before indexing , this was the memory layout,
>
> System Memory : 63.2% ,2.21 gb
> JVM Memory : 8.3% , 81.60mb of 981.38mb
>
> I have indexed 700 documents of total size 12MB.
> Following are the results i get :
> Qtime: 8122, System time : 00:00:12.7318648
>

Re: to reduce indexing time

2014-03-05 Thread Ahmet Arslan
Hi, Batch/bulk indexing is the way to go for speed.
* Disable the autoSoftCommit feature for the bulk indexing.
* Disable the transaction log for the bulk indexing.
After you finish bulk indexing, you can re-enable both. Again you are too generous with 1 second refresh rate (autoSoftCommit maxTime).  He
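The batching advice above can be sketched in plain Java as follows. This is only an illustration of grouping documents into fixed-size batches before handing them to an indexer. The class BatchIndexer, the method sendInBatches, and the sink callback are hypothetical names; in a SolrJ client the sink would typically wrap a call such as solrServer.add(batch), which is not shown here so that the sketch runs without any Solr dependency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchIndexer {
    /**
     * Passes docs to the sink in batches of at most batchSize items.
     * Returns the number of batches handed to the sink.
     */
    static <T> int sendInBatches(Iterable<T> docs, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        int batches = 0;
        for (T doc : docs) {
            batch.add(doc);
            if (batch.size() == batchSize) {
                sink.accept(batch);
                batches++;
                // Start a fresh list so the sink may keep a reference to the old one.
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            // Flush the final, possibly smaller, batch.
            sink.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 7000; i++) {
            docs.add("doc-" + i); // stand-in for real documents
        }
        // Hypothetical sink: a real client would send each batch to the server here.
        int sent = sendInBatches(docs, 250, batch -> { });
        System.out.println(sent + " batches sent");
    }
}
```

With 7,000 documents and a batch size of 250, as in the numbers reported elsewhere in the thread, this hands 28 batches to the sink instead of making 7,000 individual calls.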