I've run Lucene with heap sizes as large as 28GB (on a 32GB, 64-bit
Linux machine) and a ramBufferSize of 3GB. While I haven't noticed the
GC issues Mark mentioned in this configuration, I have seen them in
the ranges he discusses (on Java 1.6 before update 18).

You may want to consider using LuSql[1] to create the indexes, if your
source content is in a JDBC-accessible database. It is quite a bit
faster than Solr, as it is a tool specifically created and tuned for
Lucene indexing. But it is command-line, not RESTful like Solr. The
released version of LuSql runs only on a single machine (though it is
designed for many threads); the new release will allow distributing
indexing across any number of machines (with each machine building a
shard). The new release also has pluggable sources, so it is not
restricted to JDBC.
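
To make the "each machine building a shard" idea concrete, here is a
minimal sketch of one common way to split source records across
machines: hash each record's ID and take it modulo the number of
shards. This is only an illustration under my own assumptions, not how
LuSql or Solr actually assigns documents; the names shard_for and
partition are hypothetical.

```python
import zlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document ID to a shard number (hypothetical helper)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # crc32 gives the same value on every machine and every run,
    # unlike Python's built-in hash() for strings.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def partition(doc_ids, num_shards):
    """Group document IDs into one list per shard."""
    shards = [[] for _ in range(num_shards)]
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_shards)].append(doc_id)
    return shards


if __name__ == "__main__":
    # Example: spread 1000 made-up IDs over 10 indexing machines.
    ids = [f"doc-{i}" for i in range(1000)]
    for n, shard in enumerate(partition(ids, 10)):
        print(f"shard {n}: {len(shard)} documents")
```

Each machine would then index only the IDs in its own list, and the
same function can tell you later which shard holds a given document.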

-Glen
[1] http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql

On 18 February 2010 21:34, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
> Hi Tom,
>
> It wouldn't.  I didn't see the mention of parallel indexing in the original 
> email. :)
>
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Hadoop ecosystem search :: http://search-hadoop.com/
>
>
>
> ----- Original Message ----
>> From: Tom Burton-West <tburtonw...@gmail.com>
>> To: solr-user@lucene.apache.org
>> Sent: Thu, February 18, 2010 3:30:05 PM
>> Subject: Re: What is largest reasonable setting for ramBufferSizeMB?
>>
>>
>> Thanks Otis,
>>
>> I don't know enough about Hadoop to understand the advantage of using Hadoop
>> in this use case.  How would using Hadoop differ from distributing the
>> indexing over 10 shards on 10 machines with Solr?
>>
>> Tom
>>
>>
>>
>> Otis Gospodnetic wrote:
>> >
>> > Hi Tom,
>> >
>> > 32MB is very low, 320MB is medium, and I think you could go higher, just
>> > pick whichever garbage collector is good for throughput.  I know Java 1.6
>> > update 18 also has some Hotspot and maybe also GC fixes, so I'd use that.
>> > Finally, this sounds like a good use case for reindexing with Hadoop!
>> >
>> >  Otis
>> > ----
>> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>> > Hadoop ecosystem search :: http://search-hadoop.com/
>> >
>> >
>>
>> --
>> View this message in context:
>> http://old.nabble.com/What-is-largest-reasonable-setting-for-ramBufferSizeMB--tp27631231p27645167.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


