GC is behaving the way I think it should, but I am short on memory.  I am
just surprised because indexing performs fine (documents going in) but
deletions perform really badly (documents coming out).

Is it possible these deletes are hitting many segments, each of which I
assume must be rebuilt?  And if there isn't much slack memory lying
around to begin with, would that cause a bunch of contention/swapping?

Thanks Shawn!
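
P.S. For reference, here is a minimal sketch of the delete from the quoted
message below, sent as a POST with an explicit commit instead of through
stream.body.  The host, collection, and query come from the original message.
The helper names (build_delete_request, send) and the 300 second timeout are
my own assumptions, not anything Solr provides.  This only shows the mechanics
of the request; it is not expected to change how expensive the delete is on
the cluster.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen
from xml.sax.saxutils import escape


def build_delete_request(base_url, collection, query, commit=True):
    """Build (without sending) a POST request for a Solr delete-by-query.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    url = f"{base_url}/solr/{collection}/update"
    if commit:
        # commit=true asks Solr to commit as part of this update request,
        # so the deletion becomes visible without a separate commit call.
        url += "?" + urlencode({"commit": "true"})
    # Escape the query so characters like & or < cannot break the XML body.
    body = f"<delete><query>{escape(query)}</query></delete>".encode("utf-8")
    headers = {"Content-Type": "text/xml; charset=utf-8"}
    return Request(url, data=body, headers=headers, method="POST")


def send(request, timeout=300):
    """Send the request and return the HTTP status code.

    The generous timeout is an assumption, chosen because the delete in
    this thread took a couple of minutes to finish.
    """
    with urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    req = build_delete_request("http://hostname:8983", "my_collection", "source:foo")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # send(req)  # uncomment to actually issue the delete against a live node
```

Building the request separately from sending it makes it easy to check the
URL and body before anything touches the cluster.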

On Wed, May 20, 2015 at 4:50 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/20/2015 5:41 PM, Ryan Cutter wrote:
> > I have a collection with 1 billion documents and I want to delete 500 of
> > them.  The collection has a dozen shards and a couple replicas.  Using
> > Solr 4.4.
> >
> > Sent the delete query via HTTP:
> >
> > http://hostname:8983/solr/my_collection/update?stream.body=
> > <delete><query>source:foo</query></delete>
> >
> > Took a couple minutes and several replicas got knocked into Recovery
> > mode.  They eventually came back and the desired docs were deleted but
> > the cluster wasn't thrilled (high load, etc).
> >
> > Is this expected behavior?  Is there a better way to delete documents
> > that I'm missing?
>
> That's the correct way to do the delete.  Before you'll see the change,
> a commit must happen in one way or another.  Hopefully you already knew
> that.
>
> I believe that your setup has some performance issues that are making it
> very slow and knocking out your Solr nodes temporarily.
>
> The most common root problems with SolrCloud and indexes going into
> recovery are:  1) Your heap is enormous but your garbage collection is
> not tuned.  2) You don't have enough RAM, separate from your Java heap,
> for adequate index caching.  With a billion documents in your
> collection, you might even be having problems with both.
>
> Here's a wiki page that includes some info on both of these problems,
> plus a few others:
>
> http://wiki.apache.org/solr/SolrPerformanceProblems
>
> Thanks,
> Shawn
>
>
