DIH can read item by item. Did you set stream="true" on the XPathEntityProcessor entity?
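
For reference, a minimal data-config.xml sketch with streaming enabled might look like the following. The file path, the /items/item element structure, and the field names are placeholders I made up for illustration, not taken from your setup, so adjust them to match your XML.

```xml
<dataConfig>
  <!-- FileDataSource reads the XML from the local file system -->
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <!-- stream="true" asks XPathEntityProcessor to emit rows as it parses,
         rather than collecting every row from the file first.
         url and forEach below are hypothetical placeholders. -->
    <entity name="item"
            processor="XPathEntityProcessor"
            stream="true"
            url="/path/to/large-file.xml"
            forEach="/items/item">
      <!-- hypothetical fields; one Solr document is built per matched forEach element -->
      <field column="id"    xpath="/items/item/id" />
      <field column="title" xpath="/items/item/title" />
    </entity>
  </document>
</dataConfig>
```

With a setup along these lines, memory use should depend on the size of a single item rather than on the size of the whole file, though I have not tested this against your 6GB files.
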
On Sun, Jun 21, 2009 at 9:20 AM, Jianbin Dai <djian...@yahoo.com> wrote:
>
> Can DIH read item by item instead of the whole file before indexing? my
> biggest file size is 6GB, larger than the JVM max ram value.
>
> --- On Sat, 6/20/09, Erik Hatcher <e...@ehatchersolutions.com> wrote:
>
> > From: Erik Hatcher <e...@ehatchersolutions.com>
> > Subject: Re: Use DIH with large xml file
> > To: solr-user@lucene.apache.org
> > Date: Saturday, June 20, 2009, 6:52 PM
> >
> > How are you configuring DIH to read
> > those files? It is likely that you'll need at least as
> > much RAM to the JVM as the largest file you're processing,
> > though that depends entirely on how the file is being
> > processed.
> >
> > Erik
> >
> > On Jun 20, 2009, at 9:23 PM, Jianbin Dai wrote:
> >
> > > Hi,
> > >
> > > I have about 50GB of data to be indexed each day using
> > > DIH. Some of the files are as large as 6GB. I set the JVM
> > > Xmx to be 3GB, but the DIH crashes on those big files. Is
> > > there any way to handle it?
> > >
> > > Thanks.
> > >
> > > JB

--
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com