Matthew,

Thanks, a very good point.

Andrey.

> -----Original Message-----
> From: Matthew Runo [mailto:[EMAIL PROTECTED]
> Sent: Thursday, September 18, 2008 11:38 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Hardware config for SOLR
>
> I can't speak to a lot of this - but regarding the servers I'd go with
> the more powerful ones, if only for the amount of ram. Your index will
> likely be larger than 1 gig, and with only two you'll have a lot of
> your index not stored in ram, which will slow down your QPS.
>
> Thanks for your time!
>
> Matthew Runo
> Software Engineer, Zappos.com
> [EMAIL PROTECTED] - 702-943-7833
>
> On Sep 17, 2008, at 3:32 PM, Andrey Shulinskiy wrote:
>
> > Hello,
> >
> > We're planning to use SOLR for our project, got some questions.
> >
> > So I asked some Qs yesterday, got no answers whatsoever. Wondering if
> > they didn't make sense, or if the e-mail was too long... :-)
> >
> > Anyway, I'll try to ask them again and hope for some answers this
> > time.
> >
> > It's a very new experience for us so any help is really appreciated.
> >
> > First, some numbers we're expecting.
> >
> > - The average size of a doc: ~100K
> > - The number of indexes: 1
> > - The query response time we're looking for: < 200 - 300ms
> > - The number of stored docs:
> >     1st year: 500K - 1M
> >     2nd year: 2-3M
> > - The estimated number of concurrent users per second
> >     1st year: 15 - 25
> >     2nd year: 40 - 60
> > - The estimated number of queries
> >     1st year: 15 - 25
> >     2nd year: 40 - 60
> >
> > Now the questions
> >
> > 1) Should we do sharding or not?
> >
> > If we start without sharding, how hard will it be to enable it?
> > Is it just some config changes + the index rebuild or is it more?
> > My personal opinion is to go without sharding at first and enable it
> > later if do get a lot of documents.
> >
> > 2) How should we organize our clusters to ensure redundancy?
> >
> > Should we have 2 or more identical Masters (means that all the
> > updates/optimisations/etc. are done for every one of them)?
> > An alternative, afaik, is to reconfigure one slave to become the new
> > Master, how hard is that?
> >
> > 3) Basically, we can get servers of two kinds:
> >
> > * Single Processor, Dual Core Opteron 2214HE
> > * 2 GB DDR2 SDRAM
> > * 1 x 250 GB (7200 RPM) SATA Drive(s)
> >
> > * Dual Processor, Quad Core 5335
> > * 16 GB Memory (Fully Buffered)
> > * 2 x 73 GB (10k RPM) 2.5" SAS Drive(s), RAID 1
> >
> > The second - more powerful - one is more expensive, of course.
> >
> > How can we take advantage of the multiprocessor/multicore servers?
> > Is there some special setup required to make, say, 2 instances of SOLR
> > run on the same server using different processors/cores?
> >
> > 4) Does it make much difference to get a more powerful Master?
> >
> > Or, on the contrary, as slaves will be queried more often, they should
> > be the better ones? Maybe just the HDDs for the slaves should be as
> > fast as possible?
> >
> > 5) How many slaves does it make sense to have per one Master?
> >
> > What's (roughly) the performance gain from 1 to 2, 2 -> 3, etc?
> > When does it stop making sense to add more slaves?
> > As far as I understand, it depends mainly on the size of the index.
> > However, I'd guess the time required to do a push for too many slaves
> > can be a problem too, correct?
> >
> > Thanks,
> >
> > Andrey.
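
The RAM argument in the reply can be checked with rough arithmetic on the numbers quoted in the thread. The sketch below is only a back-of-the-envelope estimate, not a measurement: it reads "~100K" as 100 KB per document, and the index-to-raw-data ratio is a made-up placeholder (the thread gives no such figure), so replace it with a value measured on a sample index. All names in the code are hypothetical.

```python
# Rough sizing sketch based on the figures quoted in the thread.
# Assumption: "~100K" average document size means about 100 KB.
KB = 1000
GB = 1000 ** 3

AVG_DOC_BYTES = 100 * KB

# Document counts taken from the original question.
DOC_COUNTS = {
    "year 1 (low)": 500_000,
    "year 1 (high)": 1_000_000,
    "year 2 (low)": 2_000_000,
    "year 2 (high)": 3_000_000,
}

# RAM of the two server options from the original question, in GB.
SERVER_RAM_GB = {"small": 2, "large": 16}


def raw_corpus_gb(doc_count, avg_doc_bytes=AVG_DOC_BYTES):
    """Total size of the raw documents in GB."""
    return doc_count * avg_doc_bytes / GB


def index_gb(doc_count, index_ratio):
    """Estimated index size in GB for a given index/raw ratio.

    index_ratio is an assumption supplied by the caller; it depends on
    which fields are stored and indexed and must be measured.
    """
    return raw_corpus_gb(doc_count) * index_ratio


def fits_in_ram(index_size_gb, ram_gb):
    """True if the whole index could be held in the given amount of RAM."""
    return index_size_gb <= ram_gb


if __name__ == "__main__":
    assumed_ratio = 0.3  # placeholder only, not a figure from the thread
    for label, count in DOC_COUNTS.items():
        size = index_gb(count, assumed_ratio)
        verdict = ", ".join(
            f"{name}: {'fits' if fits_in_ram(size, ram) else 'does not fit'}"
            for name, ram in SERVER_RAM_GB.items()
        )
        print(f"{label}: raw {raw_corpus_gb(count):.0f} GB, "
              f"index ~{size:.0f} GB ({verdict})")
```

Under the 100 KB reading, the raw documents alone come to about 50-100 GB in the first year and 200-300 GB in the second, so whether the index fits in RAM on either server depends heavily on the real index-to-raw ratio.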