Been there, done that.

Indexing into the smaller cores will be faster.
You will be able to spread the load across multiple machines.
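For example, a query against one small core only has to touch that core's
index files. Here is a rough SolrJ sketch, not tested code: the host, port,
and core name are made up, and HttpSolrClient comes from a newer SolrJ
release than the 1.4-era CommonsHttpSolrServer.

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class PerCoreQuery {
    public static void main(String[] args) throws Exception {
        // Hypothetical layout: one core per data segment instead of one
        // 500GB monolith; host and core name are invented for illustration.
        String coreUrl = "http://solr1:8983/solr/segment_42";
        try (SolrClient client = new HttpSolrClient.Builder(coreUrl).build()) {
            // The query only touches this core's small index, so term
            // lookups and disk seeks stay within a small set of files.
            QueryResponse rsp = client.query(new SolrQuery("title:lucene"));
            System.out.println("hits: " + rsp.getResults().getNumFound());
        }
    }
}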

There are other advantages:
You will not have a half-terabyte set of files to worry about.
You will not need 1.1TB free in one partition to run an optimize.
You will not need 12+ hours to run an optimize.
It will not take half an hour to copy the newly optimized index to a query server.
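You can then optimize the small cores one at a time instead of the whole
half-terabyte at once. Again a rough SolrJ sketch, with invented URLs and
core names:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class OptimizeCores {
    public static void main(String[] args) throws Exception {
        // Invented per-core URLs; in practice read these from config.
        String[] coreUrls = {
            "http://solr1:8983/solr/core0",
            "http://solr1:8983/solr/core1",
        };
        for (String url : coreUrls) {
            try (SolrClient client = new HttpSolrClient.Builder(url).build()) {
                // Optimizing one small core needs free disk on the order
                // of that core's size, not 1.1TB, and each run finishes
                // far sooner than a 12-hour whole-index optimize.
                client.optimize();
            }
        }
    }
}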

On Mon, Nov 16, 2009 at 7:14 PM, Otis Gospodnetic
<otis_gospodne...@yahoo.com> wrote:
> If an index fits in memory, I am guessing you'll see the speed change roughly 
> proportionally to the size of the index.  If an index does not fit into 
> memory (i.e. disk head has to run around the disk to look for info), then the 
> improvement will be even greater.  I haven't explicitly tested this and am 
> hoping somebody will correct me if this is wrong.
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> ----- Original Message ----
>> From: Phil Hagelberg <p...@hagelb.org>
>> To: solr-user@lucene.apache.org
>> Sent: Mon, November 16, 2009 8:42:49 PM
>> Subject: core size
>>
>>
>> I'm planning out a system with large indexes and wondering what kind
>> of performance boost I'd see if I split out documents into many cores
>> rather than using a single core and splitting by a field. I've got about
>> 500GB worth of indexes ranging from 100MB to 50GB each.
>>
>> I'm assuming if we split them out to multiple cores we would see the
>> most dramatic benefit in searches on the smaller cores, but I'm just
>> wondering what level of speedup I should expect. Eventually the cores
>> will be split up anyway; I'm just trying to determine how to prioritize
>> it.
>>
>> thanks,
>> Phil
>
>



-- 
Lance Norskog
goks...@gmail.com
