RE: shards and performance

2008-08-21 Thread Lance Norskog
21, 2008 10:59 AM
To: solr-user@lucene.apache.org
Subject: Re: shards and performance

2008/8/21 Otis Gospodnetic <[EMAIL PROTECTED]>
> Uh uh. 6 instances per node all pointing to the same index?
> Yes, this can increase performance, but only because it essentially
> gives

Re: shards and performance

2008-08-21 Thread Alexander Ramos Jardim
olr-user@lucene.apache.org
> > Sent: Wednesday, August 20, 2008 9:49:04 AM
> > Subject: Re: shards and performance
> >
> > Another thing to consider on your sharding is the access rate you want to
> > guarantee.
> >
> > In the project I am working on, I need to g

Re: shards and performance

2008-08-21 Thread Otis Gospodnetic
gust 20, 2008 9:49:04 AM
> Subject: Re: shards and performance
>
> Another thing to consider on your sharding is the access rate you want to
> guarantee.
>
> In the project I am working on, I need to guarantee at least 200 hits/second
> with various facets in all queries.
>

Re: shards and performance

2008-08-20 Thread Alexander Ramos Jardim
2008/8/20 Ian Connor <[EMAIL PROTECTED]>
> So, because the OS is doing the caching in RAM, I could have
> 6 Jetty servers per machine all pointing to the same data. Once the
> index is built, I can load up some more servers on different ports and
> it will boost performance.
>
> That does

Re: shards and performance

2008-08-20 Thread Ian Connor
So, because the OS is doing the caching in RAM, I could have 6 Jetty servers per machine all pointing to the same data. Once the index is built, I can load up some more servers on different ports and it will boost performance. That does sound promising - thanks for the tip. What made you

Re: shards and performance

2008-08-20 Thread Alexander Ramos Jardim
Another thing to consider on your sharding is the access rate you want to guarantee. In the project I am working on, I need to guarantee at least 200 hits/second with various facets in all queries. I am not using sharding, but I have 6 Solr instances per cluster node, and I have 3 nodes, to a total o
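The figures quoted in this message (200 hits/second, 6 Solr instances per node, 3 nodes) allow a rough per-instance capacity estimate. The sketch below is only back-of-the-envelope arithmetic: it assumes a load balancer that spreads queries evenly and one query in flight per instance at a time, neither of which the message states. The function names are made up for illustration.

```python
# Figures quoted in the message above.
TARGET_QPS = 200          # required queries ("hits") per second
INSTANCES_PER_NODE = 6    # Solr instances on each cluster node
NODES = 3                 # cluster nodes


def per_instance_qps(target_qps, instances_per_node, nodes):
    """Average query rate each instance must sustain.

    ASSUMPTION: queries are spread evenly over all instances.
    """
    return target_qps / (instances_per_node * nodes)


def latency_budget_ms(instance_qps, in_flight=1):
    """Mean time available per query on one instance, in milliseconds.

    ASSUMPTION: `in_flight` queries are processed concurrently per instance.
    """
    return 1000.0 * in_flight / instance_qps


if __name__ == "__main__":
    qps = per_instance_qps(TARGET_QPS, INSTANCES_PER_NODE, NODES)
    print(f"per-instance rate: {qps:.1f} queries/second")
    print(f"per-query budget:  {latency_budget_ms(qps):.1f} ms")
```

Under these assumptions each instance has to answer roughly 11 queries per second, which leaves about 90 ms per faceted query if an instance serves them one at a time.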

Re: shards and performance

2008-08-20 Thread Ian Connor
I have based my machines on bare bones servers (I call them ghetto servers). I essentially have motherboards in a rack sitting on catering trays (heat resistance is key). http://web.mac.com/iconnor/iWeb/Site/ghetto-servers.html Motherboards: GIGABYTE GA-G33M-S2L (these are small mATX with 4 RAM s

Re: shards and performance

2008-08-19 Thread Alexander Ramos Jardim
As long as Solr/Lucene make smart use of memory (and they do, in my experience), it is really easy to calculate how long a huge query/update will take when you know how long the smaller ones take. Just keep in mind that the resource consumption of memory and disk space is almost always propo

Re: shards and performance

2008-08-19 Thread Mike Klaas
On 19-Aug-08, at 12:58 PM, Phillip Farber wrote: So your experience differs from Mike's. Obviously it's an important decision as to whether to buy more machines. Can you (or Mike) weigh in on what factors led to your different take on local shards vs. shards distributed across machines?

Re: shards and performance

2008-08-19 Thread Phillip Farber
Thanks, Ian, for the considered reply. See below. Ian Connor wrote: I have not seen any boost by having an index split into shards on the same machine. However, when you split it into smaller shards on different machines (cpu/ram/hdd), the performance boost is worth it. So your experience differs

Re: shards and performance

2008-08-19 Thread Ian Connor
I have not seen any boost by having an index split into shards on the same machine. However, when you split it into smaller shards on different machines (cpu/ram/hdd), the performance boost is worth it. At least for building the index, the number of shards really does help. To index Medline (1.6e7 do
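For readers unfamiliar with how shards on different machines are queried together: Solr's distributed search takes a `shards` request parameter listing the shard cores as comma-separated `host:port/path` entries. A minimal sketch of building such a request URL follows. The host names, the `/select` handler path, and the helper name are illustrative assumptions, not details taken from this thread.

```python
from urllib.parse import urlencode


def distributed_query_url(coordinator, shards, query, rows=10):
    """Build a Solr select URL that fans the query out to `shards`.

    coordinator: "host:port/path" of the instance receiving the request
                 (hypothetical value supplied by the caller).
    shards:      list of "host:port/path" strings, one per shard.
    """
    params = {
        "q": query,
        "rows": rows,
        # Solr expects the shard list as a single comma-separated value.
        "shards": ",".join(shards),
    }
    # Keep ',', '/' and ':' unescaped so the shard list stays readable.
    return f"http://{coordinator}/select?{urlencode(params, safe=',/:')}"


if __name__ == "__main__":
    # Hypothetical hosts, one shard per machine.
    shard_hosts = ["shard1.example:8983/solr", "shard2.example:8983/solr"]
    print(distributed_query_url(shard_hosts[0], shard_hosts, "cancer"))
```

Any instance can act as the coordinator; it forwards the query to every listed shard and merges the results, which is why shards on separate machines can work on the same query in parallel.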

Re: shards and performance

2008-08-19 Thread Mike Klaas
On 19-Aug-08, at 10:18 AM, Phillip Farber wrote: I'm trying to understand how splitting a monolithic index into shards improves query response time. Please tell me if I'm on the right track here. Where does the increase in performance come from? Is it that in-memory arrays are smaller