Yeah, I understand basically how caches work.

What I don't understand is what happens in replication if the new segment files are successfully copied, but the actual commit fails due to maxWarmingSearchers. The new files are on disk... but the commit could not succeed and there is NOT a new index reader, because the commit failed. And there is potentially a long gap before a future successful commit.

1. Will the existing index searcher have problems because the files have been changed out from under it?

2. Will a future replication -- one at which NO new files are available on the master -- still trigger a commit on the slave?

Maybe these are obvious to everyone but me; I keep asking this question, and the answer I keep getting just describes the basics of replication, as if that obviously answers my question.

Or maybe the answer isn't obvious or clear to anyone, including me, in which case the only way to get one is to test it myself. That's a bit complicated to do, at least at my level of knowledge, as I'm not sure exactly what I'd be looking for to answer either of those questions.

Jonathan

On 12/14/2010 9:53 AM, Upayavira wrote:
A Lucene index is made up of segments. Each commit writes a segment.
Sometimes, upon commit, some segments are merged together into one, to
reduce the overall segment count, as too many segments hinder
performance. Upon optimisation, all segments are (typically) merged into
a single segment.
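For what it's worth, that merge behaviour is tunable in solrconfig.xml.
On a 1.4-style config the knob I'm thinking of looks roughly like this
(the value is just the usual default from the example config, not a
recommendation):

  <indexDefaults>
    <!-- roughly: how many similarly-sized segments are allowed to
         accumulate before they get merged into one larger segment -->
    <mergeFactor>10</mergeFactor>
  </indexDefaults>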

Replication copies any new segments from the master to the slave,
whether they be new segments arriving from a commit, or new segments
that are a result of a segment merge. The result is a set of index files
on disk that are a clean mirror of the master.

Then, when your replication process has finished syncing changed
segments, it fires a commit on the slave. This causes Solr to create a
new index reader.
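(I'm assuming the Java-based ReplicationHandler here rather than the
older rsync scripts. On a 1.4-style setup the relevant bits of
solrconfig.xml look roughly like the following; the host name and poll
interval are just placeholders.)

  <!-- on the master -->
  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <!-- publish a new index version after each commit or optimize -->
      <str name="replicateAfter">commit</str>
      <str name="replicateAfter">optimize</str>
    </lst>
  </requestHandler>

  <!-- on the slave -->
  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <!-- placeholder URL: point at the master's replication handler -->
      <str name="masterUrl">http://master-host:8983/solr/replication</str>
      <!-- poll for changes every 60 seconds (hh:mm:ss) -->
      <str name="pollInterval">00:00:60</str>
    </lst>
  </requestHandler>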

When the first query comes in, this triggers Solr to populate its
caches. Whoever was unfortunate enough to cause that cache population
will see far poorer response times (we've seen 40s responses rather
than 1s).

The solution to this is to set up an autowarming query in
solrconfig.xml. This query is executed against the new index reader,
causing caches to populate from the updated files on disk. Only once
that autowarming query has completed will the index reader be made
available to Solr for answering search queries.
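Something along these lines in solrconfig.xml is what I mean. The query
itself is only illustrative; use warming queries that exercise the
sorts, filters and facets you actually rely on:

  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <!-- run against the new reader before it is made live -->
      <lst>
        <str name="q">*:*</str>
        <!-- 'price' is just an example field name -->
        <str name="sort">price asc</str>
      </lst>
    </arr>
  </listener>

There is a similar "firstSearcher" event for the very first searcher
after startup, when there is no old searcher to warm from.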

There's also some cleverness, whose details I can't remember, for
specifying how much to keep from the existing caches and how much to
rebuild from the files on disk. If I recall correctly, it is all
configured in solrconfig.xml.
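I think the knobs I'm half-remembering are the autowarmCount attributes
on the caches, e.g. (sizes and counts below are only illustrative):

  <!-- seed the new searcher's caches from the most recently used
       entries of the old searcher's caches -->
  <filterCache class="solr.FastLRUCache"
               size="512" initialSize="512" autowarmCount="128"/>
  <queryResultCache class="solr.LRUCache"
                    size="512" initialSize="512" autowarmCount="32"/>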

You ask a good question: whether a commit will be triggered if the sync
brought over no new files (i.e. if the previous one did, but this one
didn't). I'd imagine that Solr would compare the maximum segment ID on
disk with the one in memory to make such a decision, in which case Solr
would spot the changes from the previous sync and still work. The best
way to be sure is to try it!

The simplest way to try it (as I would do it; there's a rough sketch in
commands after the list) would be to:

1) switch off post-commit replication
2) post some content to solr
3) commit on the master
4) use rsync to copy the indexes from the master to the slave
5) do another (empty) commit on the master
6) trigger replication via an HTTP request to the slave
7) see if your posted content is available on your slave.
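Roughly, in commands (host names, ports, paths and the test document ID
are all placeholders, and step 6 assumes the Java ReplicationHandler):

  # 2/3) post some content and commit on the master
  curl "http://master:8983/solr/update?commit=true" \
       -H "Content-Type: text/xml" --data-binary @test-docs.xml

  # 4) copy the index files from master to slave by hand
  rsync -av master:/path/to/master/solr/data/index/ \
            /path/to/slave/solr/data/index/

  # 5) another (empty) commit on the master
  curl "http://master:8983/solr/update" \
       -H "Content-Type: text/xml" --data-binary "<commit/>"

  # 6) trigger replication on the slave
  curl "http://slave:8983/solr/replication?command=fetchindex"

  # 7) see whether the posted content shows up on the slave
  curl "http://slave:8983/solr/select?q=id:test-doc-1"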

Maybe someone else here can tell you what is actually going on and save
you the effort!

Does that help you get some understanding of what is going on?

Upayavira

On Tue, 14 Dec 2010 09:15 -0500, "Jonathan Rochkind"<rochk...@jhu.edu>
wrote:
But the entirety of the old index files (no longer on disk) wasn't
cached in memory, right?  Or was it?  Maybe this is me not understanding
Lucene enough. I thought that portions of the index were cached in
memory, but that sometimes the index reader still has to go to disk to
get things that aren't currently in caches.  If this is true (tell me if
it's not!), we have an index reader that was based on index files
that... are no longer on disk. But the index reader is still open. What
happens when it has to go to disk for info?

And will the second replication trigger a commit even if there are in
fact no new files to be transferred over to the slave, because there
have been no changes since the prior sync whose commit failed?
________________________________________
From: Upayavira [...@odoko.co.uk]
Sent: Tuesday, December 14, 2010 2:23 AM
To: solr-user@lucene.apache.org
Subject: RE: OutOfMemory GC: GC overhead limit exceeded - Why isn't
WeakHashMap getting collected?

The second commit will bring in all changes, from both syncs.

Think of the sync part as a glorified rsync of files on disk. So the
files will have been copied to disk, but the in-memory index on the
slave will not have noticed that those files have changed. The commit is
intended to remedy that - it causes a new index reader to be created,
based upon the new on-disk files, which will include updates from both
syncs.
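If you want to force that by hand on the slave rather than wait for the
next replication-triggered commit, an explicit empty commit over HTTP
should do it (host and port are placeholders):

  curl "http://slave:8983/solr/update" \
       -H "Content-Type: text/xml" --data-binary "<commit/>"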

Upayavira

On Mon, 13 Dec 2010 23:11 -0500, "Jonathan Rochkind"<rochk...@jhu.edu>
wrote:
Sorry, I guess I don't understand the details of replication well enough.

So the slave tries to replicate. It pulls down the new index files. It
tries to do a commit but fails. But "the next commit that does succeed
will have all the updates." Since it's a slave, it doesn't get any
commits of its own. But then some amount of time later, it does another
replication pull. There are at that point maybe no _new_ changes since
the last, failed replication pull. Does this trigger a commit that will
get those previous changes actually added to the slave?

In the meantime, between commits... are those potentially large newly
pulled index files sitting around somewhere without replacing the old
slave index files, doubling the disk space used for those files?

Thanks for any clarification.

Jonathan
________________________________________
From: ysee...@gmail.com [ysee...@gmail.com] On Behalf Of Yonik Seeley
[yo...@lucidimagination.com]
Sent: Monday, December 13, 2010 10:41 PM
To: solr-user@lucene.apache.org
Subject: Re: OutOfMemory GC: GC overhead limit exceeded - Why isn't
WeakHashMap getting collected?

On Mon, Dec 13, 2010 at 9:27 PM, Jonathan Rochkind<rochk...@jhu.edu>
wrote:
Yonik, how will maxWarmingSearchers in this scenario affect replication?  If a
slave is pulling down new indexes so quickly that the warming searchers would
ordinarily pile up, but maxWarmingSearchers is set to 1... what happens?

As with any other commit, this will limit the number of searchers
warming in the background to 1.  If a commit is called, and it tries
to open a new searcher while another is already warming, it will fail.
The next commit that does succeed will have all the updates, though.

Today, this maxWarmingSearchers check is done after the writer has
closed and before a new searcher is opened... so calling commit too
often won't affect searching, but it will currently affect indexing
speed (since the IndexWriter is constantly being closed/flushed).
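For reference, that setting lives in solrconfig.xml (typically in the
<query> section); 2 is the value shipped in the example config:

  <!-- max number of searchers that may be warming in the background
       at once; a commit that would exceed this fails as above -->
  <maxWarmingSearchers>2</maxWarmingSearchers>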

-Yonik
http://www.lucidimagination.com
