Thanks for the info, Shawn!

On Mon, Mar 5, 2012 at 6:49 AM, Shawn Heisey <s...@elyograg.org> wrote:
> On 3/4/2012 3:31 AM, Sphene Software wrote:
>
>> Folks,
>>
>> I am planning to use DIH for an index of 10 million records.
>>
>> I would like to know the following:
>> - Can DIH scale to an index of this size?
>> - If DIH is a bottleneck, what is the specific issue and how can it be
>>   addressed?
>
> My entire index is about 67 million documents. There are a total of seven
> shards; six of them have over 11 million documents each. I can do a full
> dataimport (from MySQL) of those six shards simultaneously in less than
> three hours. The seventh shard is less than 500,000 documents and builds
> after the others during a full rebuild. It is rare that we have to do a
> full rebuild; it's mostly at schema change time.
>
> I use SolrJ for updates. My experience with that so far suggests that
> doing the full import with my SolrJ code would take significantly longer
> than three hours.
>
> Thanks,
> Shawn