Re: scaling / sharding questions

2008-06-18 Thread Yonik Seeley
On Wed, Jun 18, 2008 at 5:53 PM, Phillip Farber <[EMAIL PROTECTED]> wrote: > Does this mean that the Lucene scoring algorithm is computed without the idf > factor, i.e. we just get term frequency scoring? No, it means that the idf calculation is done locally on a single shard. With a big index tha
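To illustrate the point about idf being computed per shard, here is a rough sketch comparing shard-local idf with the idf a single combined index would produce. The formula is the classic Lucene one, 1 + ln(numDocs / (docFreq + 1)); the shard names and document counts are made-up numbers for illustration only, not figures from this thread.

```python
import math

def idf(doc_freq, num_docs):
    """Classic Lucene-style idf: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical shards: name -> (documents in shard, documents containing the term)
shards = {
    "shard_a": (1000, 10),   # term is rare on this shard
    "shard_b": (1000, 200),  # term is common on this shard
}

# Each shard only sees its own statistics when it scores.
local_idf = {name: idf(df, n) for name, (n, df) in shards.items()}

# A single unsharded index would see the combined statistics.
total_docs = sum(n for n, _ in shards.values())
total_df = sum(df for _, df in shards.values())
global_idf = idf(total_df, total_docs)

for name, value in sorted(local_idf.items()):
    print(f"{name}: local idf = {value:.3f}")
print(f"combined index: idf = {global_idf:.3f}")
```

The more evenly a term is spread across shards, the closer each local value gets to the combined one.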

Re: scaling / sharding questions

2008-06-18 Thread Phillip Farber
ta in a backing store, but are storing all data in the index itself. We have found this "challenging". Cheers, Lance Norskog -Original Message- From: Jeremy Hinegardner [mailto:[EMAIL PROTECTED] Sent: Friday, June 13, 2008 3:36 PM To: solr-user@lucene.apache.org Subject:

RE: scaling / sharding questions

2008-06-17 Thread Norskog, Lance
[mailto:[EMAIL PROTECTED] Sent: Sunday, June 15, 2008 10:24 PM To: solr-user@lucene.apache.org Subject: Re: scaling / sharding questions Yep got that. Thanks. /M On Sun, Jun 15, 2008 at 8:42 PM, Otis Gospodnetic < [EMAIL PROTECTED]> wrote: > With Lance's MD5 schema you'd do thi

Re: scaling / sharding questions

2008-06-15 Thread Marcus Herou
use this for several things. If we use this for shards, we have a query that matches a shard's contents. The Solr/Lucene syntax does not support modular arithmetic, and so it will not let you query a subset that
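As a sketch of how an MD5-based uid can give "a query that matches a shard's contents" without modular arithmetic: if the uid is the hex MD5 digest, its first hex digit splits documents into 16 buckets, and each bucket can be selected with a plain prefix query. The field name `uid` and the helper names below are assumptions for illustration, not taken from the thread.

```python
import hashlib

def md5_uid(raw_id):
    """Hex MD5 digest of a document's natural key, used as its uid."""
    return hashlib.md5(raw_id.encode("utf-8")).hexdigest()

def bucket_of(uid):
    """Bucket number 0..15, taken from the first hex digit of the uid."""
    return int(uid[0], 16)

def prefix_query(bucket, field="uid"):
    """Prefix query selecting one 1/16th of the documents (field name is hypothetical)."""
    return f"{field}:{bucket:x}*"

uid = md5_uid("doc-42")
print(uid, "->", prefix_query(bucket_of(uid)))
```

Because MD5 output is close to uniform, each prefix should cover roughly the same number of documents.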

Re: scaling / sharding questions

2008-06-15 Thread Otis Gospodnetic
Marcus Herou <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Cc: [EMAIL PROTECTED] > Sent: Saturday, June 14, 2008 5:53:35 AM > Subject: Re: scaling / sharding questions > > Hi. > > We as well use md5 as the uid. > > I guess by saying each 1/16th is

Re: scaling / sharding questions

2008-06-14 Thread Marcus Herou
. > > It sounds like you're not storing the data in a backing store, but are > storing all data in the index itself. We have found this "challenging". > > Cheers, > > Lance Norskog > > -Original Message- > From: Jeremy Hinegardner [mailto:[EMAIL PROTECTED]

RE: scaling / sharding questions

2008-06-13 Thread Lance Norskog
ng the data in a backing store, but are storing all data in the index itself. We have found this "challenging". Cheers, Lance Norskog -Original Message- From: Jeremy Hinegardner [mailto:[EMAIL PROTECTED] Sent: Friday, June 13, 2008 3:36 PM To: solr-user@lucene.apache.org Subject

Re: scaling / sharding questions

2008-06-13 Thread Jeremy Hinegardner
> Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > - Original Message > > From: Marcus Herou <[EMAIL PROTECTED]> To: > > solr-user@lucene.apache.org; [EMAIL PROTECTED] Sent: Friday, June 6, > > 2008 9:14:10 AM Subject: Re: scalin

Re: scaling / sharding questions

2008-06-13 Thread Jeremy Hinegardner
Sorry for not keeping this thread alive, let's see what we can do... One option I've thought of for 'resharding' would be splitting an index into two by just copying it, then deleting 1/2 the documents from one, doing a commit, and deleting the other 1/2 from the other index and committing. That is: 1) T
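The copy-then-delete idea can be sketched with plain dictionaries standing in for the two index copies. How the two halves are chosen is not visible in this excerpt, so the rule below (first hex digit of the MD5 of the document id) is only an assumption, and `split_index` is a hypothetical helper rather than a Solr or Lucene API.

```python
import hashlib

def in_first_half(doc_id):
    """Assumed split rule: MD5 digests starting with 0-7 go to the first half."""
    return int(hashlib.md5(doc_id.encode("utf-8")).hexdigest()[0], 16) < 8

def split_index(index):
    """Copy the index twice, then delete the complementary half from each copy."""
    copy_a = dict(index)  # stands in for copying the index directory
    copy_b = dict(index)
    for doc_id in list(copy_a):
        if not in_first_half(doc_id):
            del copy_a[doc_id]  # delete the second half from copy A, then "commit"
    for doc_id in list(copy_b):
        if in_first_half(doc_id):
            del copy_b[doc_id]  # delete the first half from copy B, then "commit"
    return copy_a, copy_b

index = {f"doc-{i}": {"body": f"text {i}"} for i in range(200)}
shard_a, shard_b = split_index(index)
print(len(shard_a), len(shard_b))
```

Every document ends up in exactly one of the two copies, which is the property the resharding step needs; the original index is left untouched until both copies are committed.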

Re: scaling / sharding questions

2008-06-06 Thread Otis Gospodnetic
Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Marcus Herou <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org; [EMAIL PROTECTED] > Sent: Friday, June 6, 2008 9:14:10 AM > Subject: Re: scaling / sharding questions > > Cool

Re: scaling / sharding questions

2008-06-06 Thread Marcus Herou
Cool sharding technique. We are also thinking about how to "move" docs from one index to another, because we need to re-balance the docs when we add new nodes to the cluster. We only store IDs in the index; otherwise we could have moved stuff around with IndexReader.document(x) or so. Luke (http:

scaling / sharding questions

2008-06-05 Thread Jeremy Hinegardner
Hi all, This may be a bit rambling, but let's see how it goes. I'm not a Lucene or Solr guru by any means; I have been prototyping with Solr and working to understand how all the pieces and parts fit together. We are migrating our current document storage infrastructure to a decent-sized Solr cluster, usi