Hi Otis,

Thanks a lot for your interest.
The main thing I can't understand is this: if I have 8 machines acting as searchers, for example, why would they have a higher hardware cost with one big index? If I have 10 smaller indexes I will need to search over all of them anyway, so wouldn't that require the same hardware? I understand that if I could search only a subset of the index it would be better to split it, but I must always search the entire index. I can add new searcher machines, so I think my hardware problem is the RAM; is that right? I'm probably missing something; sorry if my question has an obvious answer.

2008/6/15 Otis Gospodnetic <[EMAIL PROTECTED]>:

> Hi Roberto,
>
> SAN is a fine choice, if that's what you were worried about. There is no
> way to tell exactly how fast your searches will be, as that depends on a
> lot of factors -- benchmarking with your own data, hardware, and queries
> is the best way to go.
>
> As for the cost of multiple smaller machines versus one large one (if
> that's what's needed): I *think* the price of hardware goes up
> significantly when you start working with high-end hardware, and that
> cost may be higher than the cost of N smaller servers combined. That's
> the cost difference I was trying to point out. That's for your IT people
> to figure out after you tell them what type of hardware you need and what
> the options are.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
> ----- Original Message ----
> > From: Roberto Nieto <[EMAIL PROTECTED]>
> > To: solr-user@lucene.apache.org
> > Sent: Saturday, June 14, 2008 5:05:54 PM
> > Subject: Re: doubt with an index of 300gb
> >
> > Hi Otis,
> >
> > Thanks for your fast answer.
> >
> > I understand your points perfectly. I will explain my limitations...
> >
> > -- "Multiple smaller indices you can split across several servers, but
> > you can't do that with a monolithic index."
> > The index will be allocated on a SAN that is not my choice. I can
> > decide to split the index or use a monolithic one, but not its
> > allocation.
> >
> > -- "With multiple smaller indices you can choose to search only a
> > subset of them, should that make sense for your app."
> > -- "How much does it cost to have 1 server with a LOT of RAM that
> > serving this index will need? Maybe it's cheaper to have multiple
> > smaller machines."
> > This index will be public and I will always need to search the entire
> > index. I understand the RAM problem, but if I use multiple indexes and
> > then search all of them, will I use less RAM? The index will have 10
> > fields; all of them except the content will be small, and I will only
> > sort by score. If anyone has experience with how much RAM I will need,
> > or with the response times of this kind of index, it would be very
> > useful to me.
> >
> > -- "How long does it take you to rebuild one big index, should it get
> > corrupted, vs. rebuilding only a subset of your data?"
> > This is a very important aspect, but my primary objective must be
> > response time. I thought about using different indexes with different
> > Solr instances, but the problem is merging the results and how to sort
> > them... so I think (but am not sure) that using only one index will be
> > faster, knowing that I will always need to search the entire index.
> >
> > Any help or suggestion will be very useful.
> >
> > Thank you very much for your attention.
> >
> >
> > 2008/6/14 Otis Gospodnetic:
> >
> > > Roberto,
> > >
> > > Here is some food for thought...
> > >
> > > Multiple smaller indices you can split across several servers, but
> > > you can't do that with a monolithic index.
> > >
> > > With multiple smaller indices you can choose to search only a subset
> > > of them, should that make sense for your app.
> > > How much does it cost to have 1 server with a LOT of RAM that serving
> > > this index will need? Maybe it's cheaper to have multiple smaller
> > > machines.
> > >
> > > How long does it take you to rebuild one big index, should it get
> > > corrupted, vs. rebuilding only a subset of your data?
> > > How long does it take you to copy the index around the network after
> > > you optimize it, vs. copying only a subset, or multiple subsets in
> > > parallel?
> > >
> > > etc.
> > >
> > > Otis
> > > --
> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >
> > >
> > > ----- Original Message ----
> > > > From: Roberto Nieto
> > > > To: solr-user@lucene.apache.org
> > > > Sent: Saturday, June 14, 2008 7:31:28 AM
> > > > Subject: doubt with an index of 300gb
> > > >
> > > > Hi users,
> > > >
> > > > I'm going to create a big index of 300 GB on a SAN where I have
> > > > 4 TB. I read many entries on the mailing list about using multiple
> > > > indexes with multicore. I would like to know what kind of benefit I
> > > > can get from using multiple indexes instead of one big index if I
> > > > don't have problems with the disk. I know that optimizes and
> > > > commits would be faster with smaller indexes, but what about
> > > > search? Would the RAM use be the same with 10 indexes of 30 GB as
> > > > with 1 index of 300 GB? Any suggestion or experience will be very
> > > > useful to me.
> > > >
> > > > Thanks in advance.
> > > >
> > > > Rober.
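On the "mixture of results and how to sort them" concern: Solr's distributed search (the `shards` request parameter, introduced with Solr 1.3) performs this merge server-side, returning one list sorted by score across all shards. The sketch below is a minimal, illustrative version of that merge step, not Solr's actual code; the shard names and scores are made up for the example. It assumes each shard has already returned its own hits sorted by descending score.

```python
import heapq

# Sketch of merging per-shard hit lists into one globally score-sorted
# list -- the same idea a distributed search applies when a query is
# federated across several smaller indexes. Illustrative only.

def merge_shard_results(shard_results, rows=10):
    """Merge hit lists (each already sorted by descending score) into a
    single top-`rows` list, still sorted by descending score."""
    # heapq.merge does an N-way merge of already-sorted iterables;
    # negating the score in the key yields descending order.
    merged = heapq.merge(*shard_results, key=lambda hit: -hit["score"])
    return list(merged)[:rows]

# Hypothetical top hits from three 30 GB shards:
shard_a = [{"id": "a1", "score": 9.2}, {"id": "a2", "score": 4.1}]
shard_b = [{"id": "b1", "score": 7.8}, {"id": "b2", "score": 6.5}]
shard_c = [{"id": "c1", "score": 8.9}, {"id": "c2", "score": 0.3}]

top = merge_shard_results([shard_a, shard_b, shard_c], rows=4)
print([hit["id"] for hit in top])  # ['a1', 'c1', 'b1', 'b2']
```

The point of the sketch is that the merge only touches the top `rows` hits from each shard, so splitting the index does not by itself change what must be searched, only where the RAM and CPU for each sub-search live.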