A Lucene index is made up of segments. Each commit writes a new
segment. Sometimes, upon commit, some segments are merged together into
one to reduce the overall segment count, since having too many segments
hurts search performance. Upon optimisation, all segments are
(typically) merged into a single segment.
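
How aggressively segments get merged is controlled in solrconfig.xml,
and an optimise can be triggered with an update message. A rough sketch
(the value and URL below are only illustrative):

  <!-- solrconfig.xml: roughly how many segments accumulate before a merge -->
  <mergeFactor>10</mergeFactor>

  # ask Solr to merge everything down to a single segment
  curl 'http://localhost:8983/solr/update' -H 'Content-type:text/xml' \
       --data-binary '<optimize/>'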

Replication copies any new segments from the master to the slave,
whether they be new segments arriving from a commit, or new segments
that are a result of a segment merge. The result is a set of index files
on disk that are a clean mirror of the master.
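
For reference, if you're using the Java-based ReplicationHandler
(rather than the old rsync scripts), the slave side is configured
roughly like this in solrconfig.xml (the master URL and poll interval
are just placeholders):

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <str name="masterUrl">http://master:8983/solr/replication</str>
      <str name="pollInterval">00:00:60</str>
    </lst>
  </requestHandler>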

Then, when your replication process has finished syncing changed
segments, it fires a commit on the slave. This causes Solr to create a
new index reader. 

When the first query comes in, it triggers Solr to populate its caches.
Whoever is unfortunate enough to cause that cache population will see
much slower responses (we've seen 40s rather than 1s).

The solution to this is to set up an autowarming query in
solrconfig.xml. This query is executed against the new index reader,
causing caches to populate from the updated files on disk. Only once
that autowarming query has completed will the index reader be made
available to Solr for answering search queries.

There's some cleverness, whose details I can't remember, that specifies
how much to keep from the existing caches and how much to rebuild from
the files on disk. If I recall correctly, it is all configured in
solrconfig.xml.
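
For what it's worth, I believe the knob is the autowarmCount attribute
on each cache in solrconfig.xml; the sizes below are only examples:

  <filterCache class="solr.FastLRUCache"
               size="512"
               initialSize="512"
               autowarmCount="128"/>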

You ask a good question: will a commit be triggered if the sync brought
over no new files (i.e. if the previous one did, but this one didn't)?
I'd imagine that Solr would compare the maximum segment ID on disk with
the one in memory to make such a decision, in which case it would spot
the changes from the previous sync and still work. The best way to be
sure is to try it!

The simplest way to try it (as I would do it) would be the following
(rough example commands after the list):

1) switch off post-commit replication
2) post some content to solr
3) commit on the master
4) use rsync to copy the indexes from the master to the slave
5) do another (empty) commit on the master
6) trigger replication via an HTTP request to the slave
7) see if your posted content is now available on your slave
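
If it helps, steps 2 to 6 might look roughly like this (host names,
ports and index paths are placeholders, and step 6 assumes the Java
ReplicationHandler is enabled on the slave):

  # 2) post some content to the master
  curl 'http://master:8983/solr/update' -H 'Content-type:text/xml' \
       --data-binary '<add><doc><field name="id">test-1</field></doc></add>'

  # 3) commit on the master
  curl 'http://master:8983/solr/update' -H 'Content-type:text/xml' \
       --data-binary '<commit/>'

  # 4) copy the index files across by hand
  rsync -av master:/var/solr/data/index/ slave:/var/solr/data/index/

  # 5) another (empty) commit on the master
  curl 'http://master:8983/solr/update' -H 'Content-type:text/xml' \
       --data-binary '<commit/>'

  # 6) trigger replication on the slave
  curl 'http://slave:8983/solr/replication?command=fetchindex'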

Maybe someone else here can tell you what is actually going on and save
you the effort!

Does that help you understand what is going on?

Upayavira

On Tue, 14 Dec 2010 09:15 -0500, "Jonathan Rochkind" <rochk...@jhu.edu>
wrote:
> But the entirety of the old indexes (no longer on disk) wasn't cached in
> memory, right?  Or is it?  Maybe this is me not understanding lucene
> enough. I thought that portions of the index were cached in memory, but
> that sometimes the index reader still has to go to disk to get things
> that aren't currently in caches.  If this is true (tell me if it's not!),
> we have an index reader that was based on indexes that... are no longer
> on disk. But the index reader is still open. What happens when it has to
> go to disk for info?
> 
> And the second replication will trigger a commit even if there are in
> fact no new files to be transferred over to the slave, because there have been
> no changes since the prior sync with failed commit?
> ________________________________________
> From: Upayavira [...@odoko.co.uk]
> Sent: Tuesday, December 14, 2010 2:23 AM
> To: solr-user@lucene.apache.org
> Subject: RE: OutOfMemory GC: GC overhead limit exceeded - Why isn't
> WeakHashMap getting collected?
> 
> The second commit will bring in all changes, from both syncs.
> 
> Think of the sync part as a glorified rsync of files on disk. So the
> files will have been copied to disk, but the in memory index on the
> slave will not have noticed that those files have changed. The commit is
> intended to remedy that - it causes a new index reader to be created,
> based upon the new on disk files, which will include updates from both
> syncs.
> 
> Upayavira
> 
> On Mon, 13 Dec 2010 23:11 -0500, "Jonathan Rochkind" <rochk...@jhu.edu>
> wrote:
> > Sorry, I guess I don't understand the details of replication enough.
> >
> > So slave tries to replicate. It pulls down the new index files. It tries
> > to do a commit but fails.  But "the next commit that does succeed will
> > have all the updates." Since it's a slave, it doesn't get any commits of
> > its own. But then some amount of time later, it does another replication
> > pull. There are at this time maybe no _new_ changes since the last failed
> > replication pull. Does this trigger a commit that will get those previous
> > changes actually added to the slave?
> >
> > In the meantime, between commits.. are those potentially large pulled new
> > index files sitting around somewhere but not replacing the old slave
> > index files, doubling disk space for those files?
> >
> > Thanks for any clarification.
> >
> > Jonathan
> > ________________________________________
> > From: ysee...@gmail.com [ysee...@gmail.com] On Behalf Of Yonik Seeley
> > [yo...@lucidimagination.com]
> > Sent: Monday, December 13, 2010 10:41 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: OutOfMemory GC: GC overhead limit exceeded - Why isn't
> > WeakHashMap getting collected?
> >
> > On Mon, Dec 13, 2010 at 9:27 PM, Jonathan Rochkind <rochk...@jhu.edu>
> > wrote:
> > > Yonik, how will maxWarmingSearchers in this scenario effect replication?  
> > > If a slave is pulling down new indexes so quickly that the warming 
> > > searchers would ordinarily pile up, but maxWarmingSearchers is set to 
> > > 1.... what happens?
> >
> > Like any other commits, this will limit the number of searchers
> > warming in the background to 1.  If a commit is called, and that tries
> > to open a new searcher while another is already warming, it will fail.
> >  The next commit that does succeed will have all the updates though.
> >
> > Today, this maxWarmingSearchers check is done after the writer has
> > closed and before a new searcher is opened... so calling commit too
> > often won't affect searching, but it will currently affect indexing
> > speed (since the IndexWriter is constantly being closed/flushed).
> >
> > -Yonik
> > http://www.lucidimagination.com
> >
> 
