Well, when I wasn't sending regular commits I was getting out-of-memory
exceptions from Solr fairly often, which I assume is due to the size of the
documents I'm sending.  I'd love to set autocommit in solrconfig.xml and
not worry about sending commits from the client side, but autocommit doesn't
seem to work at all for me.  I'm aware of the maxTime bug, and I've tried
setting maxTime to 0 as well as leaving it out of solrconfig.xml
altogether, but no matter what I try my Solr server never does an autocommit.
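
For reference, this is the kind of autoCommit block I've been experimenting
with in the <updateHandler> section of solrconfig.xml (the maxDocs and
maxTime values below are just examples, not what I'd necessarily run with):

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>1000</maxDocs>   <!-- commit after this many docs -->
      <maxTime>60000</maxTime>  <!-- or after this many milliseconds -->
    </autoCommit>
  </updateHandler>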

Yes, I'm currently using a single Solr instance for both indexing and
searching, but this is still just a development environment.  We only
started looking at Solr about a month ago and haven't gone to production
with anything yet; we will probably have separate instances for indexing
and searching by the time we go live.


Mark

On Dec 24, 2007 4:20 PM, Otis Gospodnetic <[EMAIL PROTECTED]>
wrote:

> Mark,
>
> Another question to ask is: do you *really* need to be calling commit
> every 300 docs?  Unless you really need searchers to see your 300 new docs,
> you don't need to commit.  Just optimize + commit at the end of your whole
> batch.  Lowering the mergeFactor is the right thing to do.  Out of
> curiosity, are you using a single instance of Solr for both indexing and
> searching?
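>
> For example, with the XML update interface that could mean posting your
> batches of docs with no <commit/> in between, and then finishing the
> whole run with a single
>
>   <optimize/>
>   <commit/>
>
> posted to the update handler.  (Just a sketch of the idea; adapt it to
> however you are actually sending updates.)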
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> From: Mark Baird <[EMAIL PROTECTED]>
> To: Solr Mailing List <solr-user@lucene.apache.org>
> Sent: Monday, December 24, 2007 7:25:00 PM
> Subject: An observation on the "Too Many Files Open" problem
>
> Running our Solr server (latest 1.2 Solr release) on a Linux machine, we
> ran into the "Too Many Open Files" issue quite a bit.  We've since raised
> the ulimit max filehandle setting and lowered the Solr mergeFactor
> setting, and haven't run into the problem since.  However, we are seeing
> some behavior from Solr that seems a little odd to me.  When we are in
> the middle of our batch index process and run lsof, we see a lot of open
> file handles hanging around that reference Solr index files that have
> already been deleted by Solr and no longer exist on the filesystem (lsof
> flags these entries as "(deleted)").
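>
> For context, the mergeFactor change was in the <indexDefaults> section of
> solrconfig.xml, along these lines (the exact value here is illustrative,
> not a recommendation):
>
>   <indexDefaults>
>     <mergeFactor>4</mergeFactor>  <!-- down from the default of 10 -->
>     ...
>   </indexDefaults>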
>
> The documents we are indexing are potentially very large, so due to
> various memory constraints we only send 300 docs to Solr at a time, with
> a commit between each set of 300 documents.  One thing I've read that can
> cause old file handles to hang around is an old IndexReader left open and
> still pointing at those old files.  However, whenever you issue a commit
> to the server, it is supposed to close the old IndexReader and open a new
> one.
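>
> Concretely, each batch we send is roughly an <add> message containing 300
> <doc> elements followed by a <commit/>, posted to the update handler (the
> field names here are made up for illustration):
>
>   <add>
>     <doc><field name="id">1</field><field name="text">...</field></doc>
>     <!-- ... 299 more docs ... -->
>   </add>
>   <commit/>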
>
> So my question is: when the reader is being closed due to a commit, what
> exactly is happening?  Is it just being set to null and a new instance
> being created?  I'm thinking the reader may be sitting around in memory
> for a while before the garbage collector finally gets to it, and in that
> time it is still holding those files open.  Perhaps an explicit method
> call that closes any open file handles should occur before setting the
> reference to null?
>
> After looking at the code, it looks like reader.close() is explicitly
> being called as long as the closeReader property in SolrIndexSearcher is
> set to true, but I'm not sure how to verify that it is always getting set
> to true.  There is one constructor of SolrIndexSearcher that sets it to
> false.
>
> Any insight here would be appreciated.  Are stale file handles something
> I should just expect from the JVM?  I've never run into the "Too Many
> Files Open" exception before, so this is my first time looking at the
> lsof command.  Perhaps I'm reading too much into the data it's showing
> me.
>
>
> Mark Baird
