Re: feedback on Solr 4.x LotsOfCores feature

Yago Riveiro Mon, 07 Oct 2013 07:13:51 -0700

I assume that the LotsOfCores feature doesn't use ZooKeeper.

I tried to simulate the cores as collections, but then clusterstate.json
grows bigger than 1 MB, and -Djute.maxbuffer is needed to raise ZooKeeper's
1 MB limit.


A naive question: why isn't clusterstate.json split per collection?

--  
Yago Riveiro


On Monday, October 7, 2013 at 1:33 PM, Erick Erickson wrote:

> Thanks for the great writeup! It's always interesting to see how
> a feature plays out "in the real world". A couple of questions
> though:
>  
> bq: We added 2 Cores options :
> Do you mean you patched Solr? If so, are you willing to share the code
> back? If both are "yes", please open a JIRA, attach the patch and assign
> it to me.
>  
> bq: the number of file descriptors, it used a lot (need to increase global
> max and per process fd)
>  
> Right, this makes sense since you have a bunch of cores all with their
> own descriptors open. I'm assuming that you hit a rather high max
> number and it stays pretty steady....
>  
> bq: the overhead to parse solrconfig.xml and load dependencies to open
> each core
>  
> Right, I tried to look at sharing the underlying solrconfig object but
> it seemed pretty hairy. There are some extensive comments in the
> JIRA of the problems I foresaw. There may be some action on this
> in the future.
>  
> bq: lotsOfCores doesn’t work with SolrCloud
>  
> Right, we haven't concentrated on that, it's an interesting problem.
> In particular it's not clear what happens when nodes go up/down,
> replicate, resynch, all that.
>  
> bq: When you start, it spend a lot of times to discover cores due to a big
>  
> How long? I tried 15K cores on my laptop and I think I was getting 15
> second delays or roughly 1K cores discovered/second. Is your delay
> on the order of 50 seconds with 50K cores?
>  
> I'm not sure how you could do that in the background, but I haven't
> thought about it much. I tried multi-threading core discovery and that
> didn't help (SSD disk), I assumed that the problem was mostly I/O
> contention (but didn't prove it). What if a request came in for a core
> before you'd found it? I'm not sure what the right behavior would be
> except perhaps to block on that request until core discovery was
> complete. Hmmmmm. How would that work for your case? That
> seems do-able.
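
The blocking idea above (hold a request until core discovery has finished, rather than failing it) could be sketched roughly as follows. This is only an illustration in Python, not Solr code: CoreRegistry, discover, start_background_discovery and get_core are hypothetical names, and a real implementation would sit in Solr's Java request dispatch path.

```python
import threading


class CoreRegistry:
    """Hypothetical registry: discovery runs in the background, lookups wait for it."""

    def __init__(self):
        self._cores = {}
        self._discovered = threading.Event()  # set once discovery is complete

    def discover(self, core_names):
        # Stand-in for walking the disk; here we just register the given names.
        for name in core_names:
            self._cores[name] = {"name": name, "loaded": False}
        self._discovered.set()  # wake up every request blocked in get_core()

    def start_background_discovery(self, core_names):
        worker = threading.Thread(
            target=self.discover, args=(core_names,), daemon=True
        )
        worker.start()
        return worker

    def get_core(self, name, timeout=None):
        # Block until discovery has finished instead of failing the request.
        if not self._discovered.wait(timeout):
            raise TimeoutError("core discovery still running")
        return self._cores.get(name)  # None if the core does not exist
```

With a timeout, a request that arrives during a very long discovery fails after a bounded wait instead of hanging; whether that is preferable to blocking indefinitely depends on the deployment.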
>  
> BTW, so far you get the prize for the most cores on a node I think.
>  
> Thanks again for the great feedback!
>  
> Erick
>  
> On Mon, Oct 7, 2013 at 3:53 AM, Soyez Olivier
> <olivier.so...@worldline.com> wrote:
> > Hello,
> >  
> > At my company, we use Solr in production to offer full-text search on
> > mailboxes.
> > We host dozens of millions of mailboxes, but only webmail users have
> > this feature (a few million).
> > Our use case is the following:
> > - non-static indexes with more updates (indexing and deleting) than
> > select requests (ratio 7:1)
> > - homogeneous configuration for all indexes
> > - few concurrent users
> >  
> > We started to index mailboxes with Solr 1.4 in 2010, on a subset of
> > 400,000 users.
> > - we had a cluster of 50 servers, 4 Solr instances per server, 2000
> > users per Solr instance
> > - we grew to 6000 users per Solr instance, 8 Solr instances per server,
> > 60 GB per index (~2 million users)
> > - we upgraded to Solr 3.5 in 2012
> > As the indexes grew, IOPS and response times kept increasing.
> >  
> > The index size was mainly due to stored fields (large .fdt files).
> > Retrieving these fields from the index was costly because of the many
> > seeks in large files, with no way to limit usage.
> > There is also an overhead on queries: too many results have to be
> > filtered to keep only those belonging to one user.
> > For these reasons and others (users no longer pooled, hardware savings,
> > better scoring, some requests that do not support filtering), we
> > decided to use the LotsOfCores feature.
> >  
> > Our goal was to change the I/O pattern: from lots of random I/O
> > on huge segments to mostly sequential I/O on small segments.
> > For our use case, it is not a big deal that the first query to a core
> > that is not yet loaded is slow.
> > And we don’t need to fit all the cores into memory at once.
> >  
> > We started from the SOLR-1293 issue and the LotsOfCores wiki page, and
> > ended up running a patched Solr 4.2.1 with LotsOfCores in production
> > (1 user = 1 core).
> > We no longer need to run so many Solr instances per node. We can now
> > have around 50,000 cores per Solr instance, and we plan to grow to
> > 100,000 cores per instance.
> > At first, we used the solr.xml persistence. All cores have the
> > loadOnStartup="false" and transient="true" attributes, so a cold start
> > is very quick. Response times were better than ever compared with the
> > poor response times we had before using LotsOfCores.
> >  
> > We added 2 Cores options:
> > - "numBuckets", to create a subdirectory based on hash(core name)
> > % numBuckets in the core Datadir, because all cores cannot live in the
> > same directory
> > - "Auto", with 3 different values:
> > 1) false: default behaviour
> > 2) createLoad: create the core if it does not exist, and load it on the
> > fly on the first incoming request (update, select)
> > 3) onlyLoad: load the core on the fly on the first incoming request
> > (update, select), if it exists on disk
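
For readers who want to picture the "numBuckets" layout described above, here is a minimal sketch in Python. It is not the patch itself: the patch's hash function is not given in the message, so crc32 is an assumption, as are the function names and the <data_dir>/<bucket>/<core_name> layout.

```python
import zlib
from pathlib import PurePosixPath


def bucket_for(core_name: str, num_buckets: int) -> int:
    """Map a core name to a bucket index in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    # crc32 is stable across processes (unlike Python's built-in hash() for str).
    return zlib.crc32(core_name.encode("utf-8")) % num_buckets


def core_dir(data_dir: str, core_name: str, num_buckets: int) -> PurePosixPath:
    """Return <data_dir>/<bucket>/<core_name> so no directory holds every core."""
    bucket = bucket_for(core_name, num_buckets)
    return PurePosixPath(data_dir) / str(bucket) / core_name
```

Because the bucket is derived only from the core name, the directory can be recomputed on any request without keeping a lookup table. Changing num_buckets later would move cores to different buckets, so it has to stay fixed once data exists.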
> >  
> > Then, to improve performance and avoid synchronization in the solr.xml
> > persistence, we disabled it.
> > The drawback is that the admin core status command no longer lists all
> > available cores, only those that are warmed up.
> > In the end, we achieve very good performance with Solr LotsOfCores:
> > - Index 5 emails (avg) + commit + search: 4.9x faster response time
> > (mean), 5.4x faster (95th percentile)
> > - Delete 5 documents (avg): 8.4x faster response time (mean), 7.4x
> > faster (95th percentile)
> > - Search: 3.7x faster response time (mean), 4x faster (95th percentile)
> >  
> > In fact, the better performance is mainly due to the small size of each
> > index, but also to the isolation between cores (updates and queries on
> > many mailboxes don’t have side effects on each other).
> > With the LotsOfCores feature, it is important to take care of:
> > - the number of file descriptors: it uses a lot of them (the global max
> > and the per-process fd limit need to be increased)
> > - the value of transientCacheSize, which depends on the RAM size and the
> > allocated PermGen size
> > - the ClassLoader leak that increases minor GC times when CMS GC is
> > enabled (use -XX:+CMSClassUnloadingEnabled)
> > - the overhead of parsing solrconfig.xml and loading dependencies to
> > open each core
> > - LotsOfCores doesn’t work with SolrCloud, so we store index locations
> > outside of Solr. We have Solr proxies to route requests to the right
> > instance.
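
To illustrate why transientCacheSize interacts with RAM and file descriptors, here is a rough sketch of a bounded cache of loaded cores that closes the least recently used one when the limit is exceeded. It is written in Python for brevity and is an assumption-laden model, not Solr's implementation: TransientCoreCache and the open_core / close_core callbacks are hypothetical names.

```python
from collections import OrderedDict


class TransientCoreCache:
    """Hypothetical LRU cache of loaded cores, bounded like transientCacheSize."""

    def __init__(self, max_size, open_core, close_core):
        self._max_size = max_size
        self._open_core = open_core    # callback: name -> core object
        self._close_core = close_core  # callback: releases descriptors, memory
        self._cores = OrderedDict()    # least recently used entry comes first

    def get(self, name):
        if name in self._cores:
            self._cores.move_to_end(name)  # mark as most recently used
            return self._cores[name]
        core = self._open_core(name)       # slow path: first request loads the core
        self._cores[name] = core
        if len(self._cores) > self._max_size:
            _, evicted = self._cores.popitem(last=False)  # drop the LRU core
            self._close_core(evicted)
        return core

    def loaded(self):
        """Names of currently loaded cores, least recently used first."""
        return list(self._cores)
```

In this model the cache size caps how many cores hold resources at once, so a larger value trades memory and descriptors for fewer slow first-request loads.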
> >  
> > Outside of production, we tried the core discovery feature in Solr 4.4
> > with lots of cores.
> > At startup, it spends a lot of time discovering cores because of their
> > large number; meanwhile, all requests fail (SolrDispatchFilter.init()
> > not done yet). It would be great to have, for example, an option to run
> > core discovery in the background, or just to be able to disable it, as
> > we do in our use case.
> >  
> > If anyone is interested in these new options for the LotsOfCores
> > feature, just tell me.
> >  
> >  
