Been there, done that. Indexing into the smaller cores will be faster, and
you will be able to spread the load across multiple machines.
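To make the query-side difference concrete, here is a minimal SolrJ sketch
(using the Solr 1.4-era CommonsHttpSolrServer client). The URLs, the core
names (all_docs, project_a), and the dataset/body fields are hypothetical
stand-ins, not taken from this thread:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class CoreSplitSketch {
    public static void main(String[] args) throws Exception {
        // Single big core: every query runs against the full index,
        // restricted to one dataset by a filter query on a field.
        SolrServer bigCore =
            new CommonsHttpSolrServer("http://localhost:8983/solr/all_docs");
        SolrQuery q = new SolrQuery("body:lucene");
        q.addFilterQuery("dataset:project_a"); // hypothetical split field
        QueryResponse fromBig = bigCore.query(q);

        // Per-dataset core: the same query only touches the (much smaller)
        // index for that dataset, and the core can live on its own machine.
        SolrServer smallCore =
            new CommonsHttpSolrServer("http://localhost:8983/solr/project_a");
        QueryResponse fromSmall = smallCore.query(new SolrQuery("body:lucene"));

        System.out.println("big core hits:   " + fromBig.getResults().getNumFound());
        System.out.println("small core hits: " + fromSmall.getResults().getNumFound());
    }
}

Either way it is one HTTP request per query; the win comes from how little
index the per-dataset query has to read.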
There are other advantages: you will not have a half-terabyte set of files
to worry about. You will not need 1.1 TB free in one partition to run an
optimize. You will not need 12+ hours to run an optimize. It will not take
half an hour to copy the newly optimized index to a query server.

On Mon, Nov 16, 2009 at 7:14 PM, Otis Gospodnetic
<otis_gospodne...@yahoo.com> wrote:
> If an index fits in memory, I am guessing you'll see the speed change
> roughly in proportion to the size of the index. If an index does not fit
> into memory (i.e. the disk head has to seek around the disk to find the
> data), then the improvement will be even greater. I haven't explicitly
> tested this and am hoping somebody will correct me if this is wrong.
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> ----- Original Message ----
>> From: Phil Hagelberg <p...@hagelb.org>
>> To: solr-user@lucene.apache.org
>> Sent: Mon, November 16, 2009 8:42:49 PM
>> Subject: core size
>>
>>
>> I'm planning out a system with large indexes and wondering what kind
>> of performance boost I'd see if I split out documents into many cores
>> rather than using a single core and splitting by a field. I've got
>> about 500GB worth of indexes, ranging from 100MB to 50GB each.
>>
>> I'm assuming if we split them out to multiple cores we would see the
>> most dramatic benefit in searches on the smaller cores, but I'm just
>> wondering what level of speedup I should expect. Eventually the cores
>> will be split up anyway; I'm just trying to determine how to
>> prioritize it.
>>
>> thanks,
>> Phil
>

--
Lance Norskog
goks...@gmail.com