Here are some specs for my indexer. The indexer is custom Java code that reads data from the DB and other services, builds the Solr document, and submits it using SolrJ over HTTP. The indexer does a bit of work to build each document; that overhead is around 30 to 40 ms. For every document added, Solr takes around 150 to 200 ms. I tried the bulk addition approach with 1000 documents at a time, but found that Solr takes about the same amount of time. I commit and optimize only once, at the end. I currently use 32 threads in the production environment to get the indexing time down to 2 hrs.
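For reference, the batched, multi-threaded submission described above can be sketched roughly as below. This is only an illustration under assumptions: `BatchSender` is a hypothetical stand-in for whatever SolrJ call submits a list of documents, and the class, method, and parameter names are made up here, not taken from the actual indexer. The single commit and optimize would happen after `indexAll` returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the SolrJ call that submits a list of documents.
// It is NOT part of SolrJ; the real indexer would wrap its client here.
interface BatchSender<T> {
    void send(List<T> batch) throws Exception;
}

class BatchIndexerSketch {

    // Split the documents into consecutive batches of at most batchSize.
    static <T> List<List<T>> partition(List<T> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            int end = Math.min(i + batchSize, docs.size());
            batches.add(new ArrayList<>(docs.subList(i, end)));
        }
        return batches;
    }

    // Submit every batch on a fixed-size thread pool and wait for all of them.
    // Returns the number of batches sent. No commit is issued here; the caller
    // commits once after this method returns.
    static <T> int indexAll(List<T> docs, int batchSize, int threads,
                            BatchSender<T> sender) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<T> batch : partition(docs, batchSize)) {
                pending.add(pool.submit(() -> {
                    sender.send(batch);
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // rethrows any failure from a batch
            }
            return pending.size();
        } finally {
            pool.shutdown();
        }
    }
}
```

With 1000 documents per batch, the HTTP round trip is paid once per batch rather than once per document, so the thread count and batch size can be tuned separately.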
Thanks,
Kalyan Manepalli

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Wednesday, July 01, 2009 3:11 PM
To: solr-user@lucene.apache.org
Subject: Re: Tips on speeding the indexing process

Kalyan,

Using SolrJ? Use the StreamingServer, it's nice and fast. Alternatively, start multiple indexing threads (match the number of Solr server CPU cores) and index from there. Send batches of docs, not one by one. Don't commit or optimize until you are done.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: "Manepalli, Kalyan" <kalyan.manepa...@orbitz.com>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Wednesday, July 1, 2009 3:42:45 PM
> Subject: Tips on speeding the indexing process
>
> Hi,
> I have a very generic question regarding indexing. In my current app, I
> have about 450,000 docs each doc size around 2k. The total indexing time
> is around 2hrs.
> Now due to multi language support, the number of documents is increasing
> to 2.0 million. The total indexing time is exceeding 6 hrs.
> I wanted to know if there are any general tips to speedup the indexing
> process.
>
> Thanks,
> Kalyan Manepalli