I don't know exactly what this 3G RAM buffer is used for. What I noticed was that both the index size and the number of files kept increasing, but then it got stuck at the commit.
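To tell a slow commit from a real hang, the index-directory check suggested in the quoted reply below (are file sizes and file counts still changing?) could be scripted roughly like this. It is only a minimal sketch in Python: the function names and the example path are my own placeholders, not anything that ships with Solr.

```python
import os
import time


def snapshot(index_dir):
    """Return (file_count, total_bytes) for the regular files in index_dir."""
    count = 0
    total = 0
    for entry in os.scandir(index_dir):
        try:
            if entry.is_file():
                count += 1
                total += entry.stat().st_size
        except FileNotFoundError:
            # A segment file can disappear mid-scan while a merge is running.
            pass
    return count, total


def is_still_writing(before, after):
    """True if either the file count or the total size changed."""
    return before != after


def watch(index_dir, interval_s=10.0, rounds=6):
    """Print a snapshot every interval_s seconds and flag any change."""
    previous = snapshot(index_dir)
    for _ in range(rounds):
        time.sleep(interval_s)
        current = snapshot(index_dir)
        status = "changing" if is_still_writing(previous, current) else "no change"
        print(f"files={current[0]} bytes={current[1]} ({status})")
        previous = current
```

Calling `watch("/path/to/solr/data/index")` (a hypothetical path) prints one line per interval. If the numbers stop changing while the CPU stays at 100%, a thread dump via `kill -QUIT` on the JVM pid, as suggested below, is probably the next thing to look at.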
--- On Fri, 5/22/09, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:

> From: Otis Gospodnetic <otis_gospodne...@yahoo.com>
> Subject: Re: How to index large set data
> To: solr-user@lucene.apache.org
> Date: Friday, May 22, 2009, 7:26 AM
>
> Hi,
>
> Those settings are a little "crazy". Are you sure you want to give
> Solr/Lucene 3G to buffer documents before flushing them to disk? Are
> you sure you want to use the mergeFactor of 1000? Checking the logs to
> see if there are any errors. Look at the index directory to see if Solr
> is actually still writing to it? (file sizes are changing, number of
> files is changing). kill -QUIT the JVM pid to see where things are
> "stuck" if they are stuck...
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> > From: Jianbin Dai <djian...@yahoo.com>
> > To: solr-user@lucene.apache.org; noble.p...@gmail.com
> > Sent: Friday, May 22, 2009 3:42:04 AM
> > Subject: Re: How to index large set data
> >
> > about 2.8 m total docs were created. only the first run finishes. In
> > my 2nd try, it hangs there forever at the end of indexing, (I guess
> > right before commit), with cpu usage of 100%. Total 5G (2050) index
> > files are created. Now I have two problems:
> > 1. why it hangs there and failed?
> > 2. how can i speed up the indexing?
> >
> > Here is my solrconfig.xml
> >
> > false
> > 3000
> > 1000
> > 2147483647
> > 10000
> > false
> >
> > --- On Thu, 5/21/09, Noble Paul നോബിള് नोब्ळ् wrote:
> >
> > > From: Noble Paul നോബിള് नोब्ळ्
> > > Subject: Re: How to index large set data
> > > To: solr-user@lucene.apache.org
> > > Date: Thursday, May 21, 2009, 10:39 PM
> > >
> > > what is the total no:of docs created ? I guess it may not be memory
> > > bound. indexing is mostly an IO bound operation. You may be able to
> > > get a better perf if a SSD is used (solid state disk)
> > >
> > > On Fri, May 22, 2009 at 10:46 AM, Jianbin Dai wrote:
> > > >
> > > > Hi Paul,
> > > >
> > > > Thank you so much for answering my questions. It really helped.
> > > > After some adjustment, basically setting mergeFactor to 1000 from
> > > > the default value of 10, I can finished the whole job in 2.5
> > > > hours. I checked that during running time, only around 18% of
> > > > memory is being used, and VIRT is always 1418m. I am thinking it
> > > > may be restricted by JVM memory setting. But I run the data import
> > > > command through web, i.e.,
> > > > http://:/solr/dataimport?command=full-import,
> > > > how can I set the memory allocation for JVM?
> > > > Thanks again!
> > > >
> > > > JB
> > > >
> > > > --- On Thu, 5/21/09, Noble Paul നോബിള് नोब्ळ् wrote:
> > > >
> > > >> From: Noble Paul നോബിള് नोब्ळ्
> > > >> Subject: Re: How to index large set data
> > > >> To: solr-user@lucene.apache.org
> > > >> Date: Thursday, May 21, 2009, 9:57 PM
> > > >>
> > > >> check the status page of DIH and see if it is working properly.
> > > >> and if, yes what is the rate of indexing
> > > >>
> > > >> On Thu, May 21, 2009 at 11:48 AM, Jianbin Dai wrote:
> > > >> >
> > > >> > Hi,
> > > >> >
> > > >> > I have about 45GB xml files to be indexed. I am using
> > > >> > DataImportHandler. I started the full import 4 hours ago, and
> > > >> > it's still running.....
> > > >> > My computer has 4GB memory. Any suggestion on the solutions?
> > > >> > Thanks!
> > > >> >
> > > >> > JB
> > > >>
> > > >> --
> > > >> -----------------------------------------------------
> > > >> Noble Paul | Principal Engineer| AOL | http://aol.com
> > >
> > > --
> > > -----------------------------------------------------
> > > Noble Paul | Principal Engineer| AOL | http://aol.com