>> and a ramBufferSize of 3GB

If you had actually used greater than 2GB of it, you would have seen
problems as an int overflowed -
which is why it's now hard limited:

    if (mb > 2048.0) {
      throw new IllegalArgumentException("ramBufferSize " + mb
          + " is too large; should be comfortably less than 2048");
    }

On Fri, Feb 19, 2010 at 5:03 AM, Glen Newton <glen.new...@gmail.com> wrote:
> I've run Lucene with heap sizes as large as 28GB of RAM (on a 32GB
> machine, 64bit, Linux) and a ramBufferSize of 3GB. While I haven't
> noticed the GC issues Mark mentioned in this configuration, I have
> seen them in the ranges he discusses (on 1.6 <update 18).
>
> You may consider using LuSql[1] to create the indexes, if your source
> content is in a JDBC-accessible db. It is quite a bit faster than
> Solr, as it is a tool specifically created and tuned for Lucene
> indexing. But it is command-line, not RESTful like Solr. The released
> version of LuSql only runs on a single machine (though designed for many
> threads); the new release will allow distributing indexing across any
> number of machines (with each machine building a shard). The new
> release also has pluggable sources, so it is not restricted to JDBC.
>
> -Glen
> [1] http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql
>
> On 18 February 2010 21:34, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
> > Hi Tom,
> >
> > It wouldn't. I didn't see the mention of parallel indexing in the
> > original email. :)
> >
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Hadoop ecosystem search :: http://search-hadoop.com/
> >
> > ----- Original Message ----
> >> From: Tom Burton-West <tburtonw...@gmail.com>
> >> To: solr-user@lucene.apache.org
> >> Sent: Thu, February 18, 2010 3:30:05 PM
> >> Subject: Re: What is largest reasonable setting for ramBufferSizeMB?
> >>
> >> Thanks Otis,
> >>
> >> I don't know enough about Hadoop to understand the advantage of using Hadoop
> >> in this use case. How would using Hadoop differ from distributing the
> >> indexing over 10 shards on 10 machines with Solr?
> >>
> >> Tom
> >>
> >> Otis Gospodnetic wrote:
> >> >
> >> > Hi Tom,
> >> >
> >> > 32MB is very low, 320MB is medium, and I think you could go higher;
> >> > just pick whichever garbage collector is good for throughput. I know
> >> > Java 1.6 update 18 also has some Hotspot and maybe also GC fixes, so
> >> > I'd use that. Finally, this sounds like a good use case for
> >> > reindexing with Hadoop!
> >> >
> >> > Otis
> >> > ----
> >> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> >> > Hadoop ecosystem search :: http://search-hadoop.com/
> >>
> >> --
> >> View this message in context:
> >> http://old.nabble.com/What-is-largest-reasonable-setting-for-ramBufferSizeMB--tp27631231p27645167.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.

--
- Mark
http://www.lucidimagination.com
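[Editor's note] The overflow Mark describes is straightforward to demonstrate: 2048 MB expressed in bytes is 2^31, one past Integer.MAX_VALUE, so int-based byte accounting wraps negative above that point. This standalone sketch is an illustration of the arithmetic only (not Lucene's actual internals), with a guard in the spirit of the check quoted above:

```java
// Why a ramBufferSize above 2048 MB overflows a signed 32-bit int:
// 2048 * 1024 * 1024 bytes == 2^31, which exceeds Integer.MAX_VALUE (2^31 - 1).
public class RamBufferOverflowDemo {
    public static void main(String[] args) {
        double mb = 3072.0; // the 3GB buffer from Glen's configuration
        long bytes = (long) (mb * 1024 * 1024);
        int asInt = (int) bytes; // what 32-bit byte accounting would see

        System.out.println("bytes as long: " + bytes);  // 3221225472
        System.out.println("bytes as int:  " + asInt);  // -1073741824 (wrapped)

        // A guard in the spirit of the check quoted above:
        if (mb > 2048.0) {
            System.out.println("rejected: ramBufferSize " + mb
                + " should be comfortably less than 2048");
        }
    }
}
```

The negative value is why buffer sizes above 2GB "worked" only while less than 2GB was actually used: the overflow bites the moment the accounted bytes cross 2^31.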