Prasi, as per the ticket I linked to earlier, I was running into GC
issues. It may be worth investigating; take a look at the GC settings
I'm running with in the ticket.
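For reference, GC tuning for a large Solr heap usually revolves around options like the following. These are illustrative values only, not the specific settings from the ticket (which you should read for the actual flags); era-appropriate CMS flags are shown, and paths and heap sizes are assumptions to tune against your own GC logs:

```shell
# Illustrative JVM GC options for a Solr/Tomcat JVM.
# Values are examples, NOT the settings from the ticket.
JAVA_OPTS="$JAVA_OPTS \
  -Xms8g -Xmx8g \
  -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=75 \
  -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
  -Xloggc:/var/log/solr/gc.log"
```

Long stop-the-world pauses on a large heap would match the "hangs for an hour or two" symptom, so GC logs are the first thing to check.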

Cheers,
Chris


On 22 October 2013 10:25, Prasi S <prasi1...@gmail.com> wrote:

> bq: ...three different files each with a partial set
> of data.
>
> We have to index around 170 metadata fields: around 120 fields are in the
> first file, 50 in the second file and 6 in the third file. All three
> files have the same unique key. We use SolrJ to push these files to Solr.
> First, we index the first file for all 220 million records. Then we take
> the second file and do a partial update on the existing 220M; the same is
> repeated for the third file.
>
> We commit in batches. Each batch consists of 20,000 records. Once 5 such
> batches are sent to Solr, we send a commit from the code. We have
> disabled soft commit. The hard commit is configured as below.
>
>      <autoCommit>
>        <maxTime>${solr.autoCommit.maxTime:600000}</maxTime>
>        <openSearcher>false</openSearcher>
>      </autoCommit>
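The cadence described above (batches of 20,000 documents, an explicit commit after every 5 batches) can be sketched as below. The class and method names are hypothetical stand-ins, not the actual indexing code; the real SolrJ calls are indicated in comments:

```java
// Sketch of the commit cadence described above (assumed names, not the
// actual indexing code): send batches of 20,000 documents and issue an
// explicit commit after every 5 batches, i.e. every 100,000 documents.
public class CommitCadenceSketch {
    static final int BATCH_SIZE = 20_000;
    static final int BATCHES_PER_COMMIT = 5;

    // Counts the explicit commits the loop would issue for totalDocs documents.
    static int commitsFor(int totalDocs) {
        int commits = 0;
        int batchesSinceCommit = 0;
        for (int sent = 0; sent < totalDocs; sent += BATCH_SIZE) {
            // In real SolrJ code: client.add(nextBatch());
            batchesSinceCommit++;
            if (batchesSinceCommit == BATCHES_PER_COMMIT) {
                // In real SolrJ code: client.commit();
                commits++;
                batchesSinceCommit = 0;
            }
        }
        return commits;
    }

    public static void main(String[] args) {
        // 1,000,000 docs -> 50 batches -> 10 explicit commits
        System.out.println(commitsFor(1_000_000));
    }
}
```

With this cadence, 220 million documents produce 2,200 explicit commits on top of the time-based hard autoCommit, which is worth keeping in mind when reasoning about commit frequency.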
>
>
> Thanks,
> Prasi
>
>
> On Tue, Oct 22, 2013 at 2:34 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > This is not a lot of data really.
> >
> > bq: ...three different files each with a partial set
> > of data.
> >
> > OK, what does this mean? Are you importing CSV files or
> > something? Are you trying to commit tens of millions of documents at once?
> >
> > This shouldn't be merging since you're on 4.4, unless you're committing
> > far too frequently.
> >
> > What are your commit settings? Both soft and hard? How are you
> > committing?
> >
> > In short, there's not a lot of information to go on here; you need to
> > provide a number of details.
> >
> > Best,
> > Erick
> >
> >
> > On Tue, Oct 22, 2013 at 9:25 AM, Prasi S <prasi1...@gmail.com> wrote:
> >
> > > Hi all,
> > > We are using SolrCloud 4.4 (SolrCloud with external ZooKeeper, 2
> > > Tomcats, 2 Solr instances, 1 in each Tomcat) for indexing delimited
> > > files. Our index counts 220 million records. We have three different
> > > files, each with a partial set of data.
> > >
> > > We index the first file completely. Then the second and third files
> > > are partial updates.
> > >
> > > 1. While we are testing indexing performance, we notice that Solr
> > > hangs frequently after 2 days. It just hangs for about an hour or
> > > two, and then if we hit the admin URL, it comes back and resumes
> > > indexing. Why does this happen?
> > >
> > > We have noticed that in the last 12 hours the hanging was so frequent
> > > that it was in a hung state for almost 6 hours.
> > >
> > > 2. Also, the commit time increases for the partial updates.
> > >
> > >
> > > Do we need to tweak any parameters, or is this the expected behavior
> > > of Cloud with huge volumes of data?
> > >
> > >
> > > Thanks,
> > > Prasi
> > >
> >
>
