On 5/6/2013 1:32 AM, Rogowski, Britta wrote:
> Hi!
> 
> When I write from our database to an HttpSolrServer (using a
> LinkedBlockingQueue to write just one document at a time), I run into memory
> problems (due to various constraints, I have to remain on a 32-bit system, so
> I can use at most 2 GB RAM).
> 
> If I use an EmbeddedSolrServer (to write locally), I have no such problems.
> Just now, I tried out ConcurrentUpdateSolrServer (with a queue size of 1, but
> 3 threads to be safe), and that worked out fine too. I played around with
> various GC options and monitored memory with jconsole and jmap, but only
> found out that there are lots of byte arrays, SolrInputFields, and Strings
> hanging around.
> 
> Since ConcurrentUpdateSolrServer works, I'm happy, but I was wondering if 
> people were aware of the memory issue around HttpSolrServer.

Is it memory usage within the JVM that you are looking at, or the OS
allocation for the Java process?
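
The distinction matters, because the OS number includes the entire heap
the JVM has reserved, not just live objects.  If it helps, you can log
what the JVM itself thinks it is using; this is plain java.lang.Runtime,
nothing SolrJ-specific:

    public class HeapCheck {
      public static void main(String[] args) {
        // Rough view of heap usage from inside the JVM.
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory();  // heap currently reserved from the OS
        long free  = rt.freeMemory();   // unused part of that reserved heap
        long max   = rt.maxMemory();    // the -Xmx ceiling
        System.out.printf("used=%dMB reserved=%dMB max=%dMB%n",
            (total - free) >> 20, total >> 20, max >> 20);
      }
    }

Compare the "used" figure there against what the OS reports for the
whole process.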

There are no known memory problems with current versions of SolrJ, and
none that I know about with older versions.  At the time you wrote this,
4.2.1 was the latest version, but now, several hours later, 4.3.0 has
been released.

I have a SolrJ app that I've been using since 3.5.0, currently using
4.2.1.  It creates 32 separate HttpSolrServer instances to keep all my
shards up to date.  It runs for weeks or months at a time and is
currently using about 25 MB of RAM within the JVM.  When special reindex
requests happen, memory usage may briefly go up to a few hundred MB.  It
will typically allocate the entire 1 GB heap at the OS level, but I could
run it with a smaller heap and have no trouble.

After I gathered those numbers, I restarted the application.  Memory
usage is still low, and the OS shows only 106 MB in use.

I suspect that your Java code may have a memory leak.  I'm not sure why
the leak isn't happening with the concurrent object; that is very odd,
because ConcurrentUpdateSolrServer uses HttpSolrServer internally.  When
you use HttpSolrServer, are you reusing one object or creating a new one
for every request?  You should create one HttpSolrServer object for each
separate Solr core and then use that object for the life of your
application.  It is completely thread safe.  The sketch below shows what
I mean.
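
A minimal example (the URL and core name are placeholders; adjust them
for your setup):

    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class IndexExample {
      // Create ONE instance per core at startup and share it everywhere.
      // It is thread safe, so all of your indexing threads can use it.
      private static final HttpSolrServer SERVER =
          new HttpSolrServer("http://localhost:8983/solr/mycore");

      public static void main(String[] args) throws Exception {
        // Reuse the same instance for every request.
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "1");
        SERVER.add(doc);
        SERVER.commit();
      }
    }

If you instead call "new HttpSolrServer(...)" for every request, each
instance creates its own internal HttpClient and connection pool, and
garbage piles up quickly.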

There is a large caveat with ConcurrentUpdateSolrServer.  If you are
using try/catch blocks to trap request errors and take action, be aware
that this object will never throw an exception for a failed update.
Even if a request fails or your Solr server is down, your application
will never know.  You can override its handleError method to at least
record failures; see the sketch below.
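
One way to get some visibility (the URL, queue size, and thread count
here are just example values):

    import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer;
    import java.util.concurrent.atomic.AtomicInteger;

    public class UpdateExample {
      // Count failures ourselves, since add()/commit() won't throw for
      // failed updates.
      static final AtomicInteger FAILURES = new AtomicInteger();

      static final ConcurrentUpdateSolrServer SERVER =
          new ConcurrentUpdateSolrServer(
              "http://localhost:8983/solr/mycore", 1, 3) {
            @Override
            public void handleError(Throwable ex) {
              FAILURES.incrementAndGet();
              ex.printStackTrace();  // or use your logging framework
            }
          };
    }

You still won't know which document failed, but at least your
application can tell that something went wrong.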

Why do I need 32 HttpSolrServer objects?  I have 2 index chains, 7 shards
per chain, with a live core and a build core per shard.  That is 28
separate cores (2 x 7 x 2).  There are four Solr servers, so I need four
additional objects for CoreAdmin requests.

Thanks,
Shawn
