I cannot facet on one huge index; it runs out of RAM when it attempts to allocate a giant array. If I store several shards in one JVM, there is no problem.
Are there any performance benefits to a large index vs. several small indexes?

Lance

-----Original Message-----
From: Marcus Herou [mailto:[EMAIL PROTECTED]
Sent: Sunday, June 15, 2008 10:24 PM
To: solr-user@lucene.apache.org
Subject: Re: scaling / sharding questions

Yep got that. Thanks.

/M

On Sun, Jun 15, 2008 at 8:42 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

> With Lance's MD5 schema you'd do this:
>
> 1 shard:   0-f*
> 2 shards:  0-8*, 9-f*
> 3 shards:  0-5*, 6-a*, b-f*
> 4 shards:  0-3*, 4-7*, 8-b*, c-f*
> ...
> 16 shards: 0*, 1*, 2* ... d*, e*, f*
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> > From: Marcus Herou <[EMAIL PROTECTED]>
> > To: solr-user@lucene.apache.org
> > Cc: [EMAIL PROTECTED]
> > Sent: Saturday, June 14, 2008 5:53:35 AM
> > Subject: Re: scaling / sharding questions
> >
> > Hi.
> >
> > We as well use MD5 as the UID.
> >
> > I guess each shard being 1/16th is because the MD5 is hex, right? (0-f). Thinking about MD5 sharding:
> >
> > 1 shard: 0-f
> > 2 shards: 0-7, 8-f
> > 3 shards: problem!
> > 4 shards: 0-3 ...
> >
> > This technique would require that you double the number of shards each time you split, right?
> >
> > Split by delete sounds really smart, damn that I didn't think of that :)
> >
> > Anyway, over time the technique of moving the whole index to a new shard and then deleting would probably be more than challenging.
> >
> > I will never ever store the data in Lucene, mainly because of bad experience, and since I want to create modules which are fast, scalable and flexible, storing the data alongside the index does not match that, for me at least.
> >
> > So yes, I will have the need to do a "foreach id in ids get document" approach in the searcher code, but at least I can optimize the retrieval of docs myself and let Lucene do what it's good at: indexing and searching, not storage.
> >
> > I am more and more thinking in terms of having different levels of searching instead of searching all shards at the same time.
> >
> > Let's say you start with 4 shards where each document is replicated 4 times based on publishdate. Since all shards have the same data you can load-balance the query to any of the 4 shards.
> >
> > One day you find that 4 shards is not enough because of search performance, so you add 4 new shards. Now you only index these 4 new shards with the new documents, making the old ones read-only.
> >
> > The searcher would then prioritize the new shards and only query the old shards if the query returns less than X results.
> >
> > This has a nice side effect of keeping the most relevant/recent entries in the index which is searched the most. Since the old shards will be mostly idle, you can as well convert 2 of the old shards to "new" shards, reducing the need for buying new servers.
> >
> > What I'm trying to say is that you will end up with an architecture which has many nodes on top, each with few documents, and fewer and fewer nodes as you go down the architecture, but where each node stores more documents since search speed gets less and less relevant.
> >
> > Something like this:
> >
> > xxxxxxxx - Primary: 10M docs per shard, make sure 95% of the results come from here.
> > yyyy - Standby: 100M docs per shard - merges of 10 primary indices.
> > zz - Archive: 1000M docs per shard - merges of 10 standby indices.
> >
> > Search top-down. The numbers are just speculative. The drawback with this architecture is that you get no indexing benefit at all if the architecture drawn above is the same as the one you use for indexing. I personally think you should use X indexers which then merge indices (MapReduce) for max performance and lay them out as described above.
> >
> > I think Google does something like this.
> >
> > Kindly
> >
> > //Marcus
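For what it's worth, the "search top-down" idea above can be expressed as a small loop around Solr 1.3's distributed search. The sketch below assumes hypothetical shard URLs and an arbitrary hit threshold; only the shards parameter itself is standard Solr, the rest is illustration.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TieredSearch {
    // Hypothetical tier layout: newest/smallest shards first, oldest/biggest last.
    static final String[] TIERS = {
        "primary1:8983/solr,primary2:8983/solr",   // Primary: recent, small shards
        "standby1:8983/solr,standby2:8983/solr",   // Standby: merged older shards
        "archive1:8983/solr"                       // Archive: big read-only shards
    };
    static final long MIN_HITS = 20; // the "X results" cutoff; pure guess

    // Query each tier in turn; stop as soon as a tier returns enough hits.
    public static QueryResponse search(SolrServer solr, String userQuery) throws Exception {
        QueryResponse rsp = null;
        for (String tier : TIERS) {
            SolrQuery q = new SolrQuery(userQuery);
            q.set("shards", tier);          // distribute only over this tier's shards
            rsp = solr.query(q);
            if (rsp.getResults().getNumFound() >= MIN_HITS) {
                break;                      // enough recent hits, skip the older tiers
            }
        }
        return rsp;  // last tier's response if no tier reached the threshold
    }

    public static void main(String[] args) throws Exception {
        SolrServer solr = new CommonsHttpSolrServer("http://primary1:8983/solr");
        QueryResponse rsp = search(solr, "title:lucene");
        System.out.println(rsp.getResults().getNumFound() + " hits");
    }
}

Whether the threshold should count per-tier hits or cumulative hits across tiers is a design choice; the sketch simply stops at the first tier that looks "good enough".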
> > On Sat, Jun 14, 2008 at 2:27 AM, Lance Norskog wrote:
> >
> > > Yes, I've done this split-by-delete several times. The halved index still uses as much disk space until you optimize it.
> > >
> > > As to splitting policy: we use an MD5 signature as our unique ID. This has the lovely property that we can wildcard. 'contentid:f*' denotes 1/16 of the whole index. This 1/16 is a very random sample of the whole index. We use this for several things. If we use this for shards, we have a query that matches a shard's contents.
> > >
> > > The Solr/Lucene syntax does not support modular arithmetic, and so it will not let you query a subset that matches one of your shards.
> > >
> > > We also found that searching a few smaller indexes via the Solr 1.3 Distributed Search feature is actually faster than searching one large index, YMMV. So for us, a large pile of shards will be optimal anyway, and we have no need to "rebalance".
> > >
> > > It sounds like you're not storing the data in a backing store, but are storing all data in the index itself. We have found this "challenging".
> > >
> > > Cheers,
> > >
> > > Lance Norskog
> > >
> > > -----Original Message-----
> > > From: Jeremy Hinegardner [mailto:[EMAIL PROTECTED]
> > > Sent: Friday, June 13, 2008 3:36 PM
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: scaling / sharding questions
> > >
> > > Sorry for not keeping this thread alive, let's see what we can do...
> > >
> > > One option I've thought of for 'resharding' would be splitting an index into two by just copying it, then deleting 1/2 the documents from one, doing a commit, and deleting the other 1/2 from the other index and committing. That is:
> > >
> > > 1) Take original index
> > > 2) copy to b1 and b2
> > > 3) delete docs from b1 that match a particular query A
> > > 4) delete docs from b2 that do not match a particular query A
> > > 5) commit b1 and b2
> > >
> > > Has anyone tried something like that?
> > >
> > > As for how to know where each document is stored, generally we're considering unique_document_id % N. If we rebalance we change N and redistribute, but that probably will take too much time. That makes us move more towards a staggered, age-based approach where the most recent docs filter down to "permanent" indexes based upon time.
> > >
> > > Another thought we've had recently is to have many, many physical shards on the indexing writer side, but then merge groups of them into logical shards which are snapshotted to reader solrs on a frequent basis. I haven't done any testing along these lines, but logically it seems like an idea worth pursuing.
> > >
> > > enjoy,
> > >
> > > -jeremy
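Jeremy's five steps map almost directly onto delete-by-query calls once the index has been copied, and Lance's MD5 prefixes give a natural "query A". A rough sketch follows; the host names and the contentid field are made up, and steps 1-2 (the file-level copies) happen outside Solr.

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class SplitByDelete {
    public static void main(String[] args) throws Exception {
        // Steps 1-2 happen outside Solr: b1 and b2 start as byte-for-byte
        // copies of the original index, served by two (hypothetical) instances.
        SolrServer b1 = new CommonsHttpSolrServer("http://shard-b1:8983/solr");
        SolrServer b2 = new CommonsHttpSolrServer("http://shard-b2:8983/solr");

        // "Query A": documents whose MD5 id starts with 0-7, i.e. the lower
        // half of the hex keyspace (a 2-shard split: 0-7* and 8-f*).
        String queryA    = "contentid:0* OR contentid:1* OR contentid:2* OR contentid:3* OR "
                         + "contentid:4* OR contentid:5* OR contentid:6* OR contentid:7*";
        // Its complement, spelled out as prefixes rather than a negated query.
        String queryNotA = "contentid:8* OR contentid:9* OR contentid:a* OR contentid:b* OR "
                         + "contentid:c* OR contentid:d* OR contentid:e* OR contentid:f*";

        b1.deleteByQuery(queryA);      // step 3: b1 keeps only the 8-f half
        b2.deleteByQuery(queryNotA);   // step 4: b2 keeps only the 0-7 half

        b1.commit();                   // step 5
        b2.commit();

        // Until this point both halves still occupy the full disk space;
        // optimizing actually drops the deleted documents.
        b1.optimize();
        b2.optimize();
    }
}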
> > > On Fri, Jun 06, 2008 at 03:14:10PM +0200, Marcus Herou wrote:
> > > > Cool sharding technique.
> > > >
> > > > We as well are thinking of how to "move" docs from one index to another, because we need to re-balance the docs when we add new nodes to the cluster. We only store IDs in the index, otherwise we could have moved stuff around with IndexReader.document(x) or so. Luke (http://www.getopt.org/luke/) is able to reconstruct the indexed Document data, so it should be doable. However, I'm thinking of actually just deleting the docs from the old index and adding new Documents to the new node. It would be cool not to waste CPU cycles by reindexing already-indexed stuff, but...
> > > >
> > > > And we as well will have data amounts in the range you are talking about. We perhaps could share ideas?
> > > >
> > > > How do you plan to store where each document is located? I mean, you probably need to store info about the Document and its location somewhere, perhaps in a clustered DB? We will probably go for HBase for this.
> > > >
> > > > I think the number of documents is less important than the actual data size (just speculating). We currently search 10M (will get much, much larger) indexed blog entries on one machine where the JVM has 1G heap, the index size is 3G and response times are still quite fast. This is a read-only node though, and it is updated every morning with a freshly optimized index. Someone told me that you probably need twice the RAM if you plan to both index and search at the same time. If I were you I would just test indexing X entries of your data and then start searching the index with lower JVM settings each round; when response times get too slow or you hit an OOM, you get a rough estimate of the bare minimum RAM needed for Y entries.
> > > >
> > > > I think we will do with something like 2G per 50M docs, but I will need to test it out.
> > > >
> > > > If you get an answer in this matter please let me know.
> > > >
> > > > Kindly
> > > >
> > > > //Marcus
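Both routing schemes mentioned in this thread, the MD5 hex prefix (Otis, Lance) and unique_document_id % N (Jeremy), reduce to a small function from document id to shard number; which one you pick is part of the answer to "where is each document located". A sketch, with arbitrary shard counts and a made-up document key:

import java.math.BigInteger;
import java.security.MessageDigest;

public class ShardRouter {
    // Otis/Lance style: route on the leading hex digits of the MD5 id.
    // With shard counts that split the prefix space evenly (2, 4, 8, 16 ...)
    // this matches the wildcard ranges listed above; for other counts the
    // split is only approximately even.
    static int shardByMd5Prefix(String md5Hex, int numShards) {
        int prefix = Integer.parseInt(md5Hex.substring(0, 4), 16); // first 16 bits
        return prefix * numShards / 0x10000;
    }

    // Jeremy's unique_document_id % N. Works for any N, but changing N moves
    // almost every document - the rebalancing pain discussed above.
    static int shardByModulo(String id, int numShards) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5").digest(id.getBytes("UTF-8"));
        return new BigInteger(1, digest).mod(BigInteger.valueOf(numShards)).intValue();
    }

    public static void main(String[] args) throws Exception {
        String key = "http://example.com/some-blog-entry";   // hypothetical document key
        byte[] d = MessageDigest.getInstance("MD5").digest(key.getBytes("UTF-8"));
        String md5Hex = String.format("%032x", new BigInteger(1, d));
        System.out.println("prefix shard of 4: " + shardByMd5Prefix(md5Hex, 4));
        System.out.println("modulo shard of 5: " + shardByModulo(key, 5));
    }
}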
> > > > On Fri, Jun 6, 2008 at 7:21 AM, Jeremy Hinegardner wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > This may be a bit rambling, but let's see how it goes. I'm not a Lucene or Solr guru by any means; I have been prototyping with Solr and understanding how all the pieces and parts fit together.
> > > > >
> > > > > We are migrating our current document storage infrastructure to a decent-sized Solr cluster, using 1.3 snapshots right now. Eventually this will be in the billion+ documents, with about 1M new documents added per day.
> > > > >
> > > > > Our main sticking point right now is that a significant number of our documents will be updated, at least once, but possibly more than once. The volatility of a document decreases over time.
> > > > >
> > > > > With this in mind, we've been considering using a cascading series of shard clusters. That is:
> > > > >
> > > > > 1) A cluster of shards holding recent data (most recent week or two): smaller indexes that take a small amount of time to commit updates and optimise, since this will hold the most volatile documents.
> > > > >
> > > > > 2) Following that, another cluster of shards that holds relatively recent (3-6 months?), but not super volatile, documents; these are items that could potentially receive updates, but generally do not.
> > > > >
> > > > > 3) A final set of 'archive' shards holding the final resting place for documents. These would not receive updates. These would be online for searching and analysis "forever".
> > > > >
> > > > > We are not sure if this is the best way to go, but it is the approach we are leaning toward right now. I would like some feedback from the folks here on whether you think that is a reasonable approach.
> > > > >
> > > > > One of the other things I'm wondering about is how to manipulate indexes. We'll need to roll documents around between indexes over time, or at least migrate indexes from one set of shards to another as the documents 'age', and merge/aggregate them with more 'stable' indexes. I know about merging complete indexes together, but what about migrating a subset of documents from one index into another index?
> > > > >
> > > > > In addition, what is generally considered a 'manageable' index of large size? I was attempting to find some information on the relationship between search response times, the amount of memory used for a search, and the number of documents in an index, but I wasn't having much luck.
> > > > >
> > > > > I'm not sure if I'm making sense here, but just thought I would throw this out there and see what people think. There is the distinct possibility that I am not asking the right questions or considering the right parameters, so feel free to correct me, or ask questions as you see fit.
> > > > >
> > > > > And yes, I will report how we are doing things when we get this all figured out, and if there are items that we can contribute back to Solr we will. If nothing else there will be a nice article on how we manage TB of data with Solr.
> > > > >
> > > > > enjoy,
> > > > >
> > > > > -jeremy
> > > > >
> > > > > --
> > > > > Jeremy Hinegardner   [EMAIL PROTECTED]
> > > >
> > > > --
> > > > Marcus Herou CTO and co-founder Tailsweep AB
> > > > +46702561312
> > > > [EMAIL PROTECTED]
> > > > http://www.tailsweep.com/
> > > > http://blogg.tailsweep.com/
> > >
> > > --
> > > Jeremy Hinegardner   [EMAIL PROTECTED]
> >
> > --
> > Marcus Herou CTO and co-founder Tailsweep AB
> > +46702561312
> > [EMAIL PROTECTED]
> > http://www.tailsweep.com/
> > http://blogg.tailsweep.com/

--
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com/
http://blogg.tailsweep.com/
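On Jeremy's question about migrating a subset of documents from one index into another: if the relevant fields are stored (which, as discussed above, not everyone in this thread does), one low-tech option is to page the aging documents out of the source core, re-add them to the target core, and only then delete them from the source. A SolrJ sketch, where the URLs, the publishdate field and the cutoff query are all hypothetical:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;

public class MigrateSubset {
    public static void main(String[] args) throws Exception {
        SolrServer source = new CommonsHttpSolrServer("http://recent-shard:8983/solr");
        SolrServer target = new CommonsHttpSolrServer("http://archive-shard:8983/solr");

        // Hypothetical "aging" criterion: everything older than six months.
        String aged = "publishdate:[* TO NOW-6MONTHS]";

        int page = 0, rows = 1000;
        while (true) {
            SolrQuery q = new SolrQuery(aged);
            q.setStart(page * rows);
            q.setRows(rows);
            SolrDocumentList hits = source.query(q).getResults();
            if (hits.isEmpty()) break;

            // Copy each stored document into the target core. This only works
            // if the fields are stored; otherwise reindex from the backing store.
            for (SolrDocument doc : hits) {
                SolrInputDocument in = new SolrInputDocument();
                for (String field : doc.getFieldNames()) {
                    in.addField(field, doc.getFieldValue(field));
                }
                target.add(in);
            }
            page++;
        }
        target.commit();

        // Only after the copies are safely committed, drop them from the source.
        source.deleteByQuery(aged);
        source.commit();
        source.optimize();   // reclaim the space held by the deleted documents
    }
}

This assumes the source index is not receiving updates for the aged range while the copy runs; on the "read-only after a certain age" tiers discussed above that should hold.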