Thanks guys!
I will start by splitting the file into chunks of 5M documents (10 chunks)
and reduce the chunk size further if needed.
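
For the record, here is a rough sketch of the chunked load I have in mind
(a minimal, untested example; the Solr URL, input file name, and exact
chunk size are placeholders):

import requests  # third-party HTTP client library

SOLR_URL = "http://localhost:8983/solr/update/csv"  # placeholder endpoint
CHUNK_DOCS = 5000000  # 5M docs per chunk; shrink if the load is too heavy

def post_chunk(header, rows):
    # Re-attach the CSV header so every chunk is a complete CSV payload,
    # and hard-commit after each chunk as Walter suggested.
    body = "\n".join([header] + rows) + "\n"
    resp = requests.post(
        SOLR_URL,
        params={"commit": "true"},
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/csv"},
    )
    resp.raise_for_status()

with open("documents.csv") as f:   # placeholder input file
    header = next(f).rstrip("\n")  # first line holds the column names
    rows = []
    for line in f:
        rows.append(line.rstrip("\n"))
        if len(rows) >= CHUNK_DOCS:
            post_chunk(header, rows)
            rows = []
    if rows:                       # flush the final partial chunk
        post_chunk(header, rows)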

Thanks,
-Utkarsh


On Wed, Nov 13, 2013 at 9:08 AM, Walter Underwood <wun...@wunderwood.org> wrote:

> Don't load 50M documents in one shot. Break it up into reasonable chunks
> (100K?) with commits at each point.
>
> You will have a bottleneck somewhere, usually disk or CPU. Yours appears
> to be disk. If you get faster disks, it might become the CPU.
>
> wunder
>
> On Nov 13, 2013, at 8:22 AM, Utkarsh Sengar <utkarsh2...@gmail.com> wrote:
>
> > Bumping this one again, any suggestions?
> >
> >
> > On Tue, Nov 12, 2013 at 3:58 PM, Utkarsh Sengar <utkarsh2...@gmail.com> wrote:
> >
> >> Hello,
> >>
> >> I load data from CSV into Solr via UpdateCSV. There are about 50M
> >> documents with 10 columns each. The index size is about 15GB and I am
> >> using a 3-node distributed Solr cluster.
> >>
> >> While loading the data, disk I/O goes to 100%. If the load balancer in
> >> front of Solr hits the machine that is doing the processing, the
> >> request times out. But in general, requests to all the machines become
> >> slow. I have attached a screenshot of the disk I/O and CPU usage.
> >>
> >> Is there a setting in Solr that can throttle the load, or could it be
> >> due to the MergePolicy? How can I debug Solr to find the exact cause?
> >>
> >> --
> >> Thanks,
> >> -Utkarsh
> >>
> >
> >
> >
> > --
> > Thanks,
> > -Utkarsh
>
> --
> Walter Underwood
> wun...@wunderwood.org
>
>
>
>


-- 
Thanks,
-Utkarsh
