Ok, great. Thanks for helping out.

Thanks,
Himanshu
On Sun, Jul 6, 2014 at 9:35 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> [C] I've rarely seen situations where the document cache has a very high hit rate. For that to happen, the queries would need to be returning the exact same documents, which isn't usually the case. I wouldn't increase this very far. The recommendation is that it be (total simultaneous queries you expect) * (page size). The response chain sometimes has separate components that operate on the docs being returned (i.e. the "rows" parameter). The docs in the cache can then be accessed by successive components without going back out to disk. But by and large this isn't used by different queries.
>
> And, in fact, with the MMapDirectory stuff it's not even clear to me that this cache is as useful as it was when it was conceived.
>
> Bottom line: I wouldn't worry about this one, and I certainly wouldn't allocate lots of memory to it until and unless I was able to measure performance gains....
>
> Best,
> Erick
>
> On Sun, Jul 6, 2014 at 2:31 AM, Himanshu Mehrotra <himanshu.mehro...@snapdeal.com> wrote:
> > Erick, first up thanks for thoroughly answering my questions.
> >
> > [A] I had read the blog mentioned, and yet failed to 'get it'. Now I understand the flow.
> >
> > [B] The automatic, heuristic-based approach, as you said, will be difficult to get right; that is why I thought a 'beefiness' index configuration, similar to a load balancer's weights, might help get the same effective result for the most part. I guess it is a feature that most people won't need, only the ones in the process of upgrading their machines.
> >
> > [C] I will go through the blog and do an empirical analysis. Speaking of caches, I see that for us the filter cache hit ratio is a good 97% while the document cache hit ratio is below 10%. Does that mean that the document cache (size=4096) is not big enough and I should increase the size, or does it mean that we are getting queries that result in too wide a result set, and hence we would probably be better off switching off the document cache altogether if we could do it?
> >
> > Thanks,
> > Himanshu
> >
> > On Sun, Jul 6, 2014 at 5:27 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >> Question 1, both sub-cases.
> >>
> >> You're off on the wrong track here; you have to forget about replication.
> >>
> >> When documents are added to the index, they get forwarded to _all_ replicas. So the flow is like this:
> >> 1> leader gets update request
> >> 2> leader indexes docs locally, adds them to the (local) transaction log, _and_ forwards the request to all followers
> >> 3> followers add docs to their tlogs and index locally
> >> 4> followers ack back to the leader
> >> 5> leader acks back to the client
> >>
> >> There is no replication in the old sense at all in this scenario. I'll add parenthetically that old-style replication _is_ still used to "catch up" a follower that is waaaaaay behind, but the follower is in the "recovering" state if this ever occurs.
> >>
> >> About commit. If you commit from the client, the commit is forwarded to all followers (actually, all nodes in the collection). If you have autoCommit configured, each of the replicas will fire its commit when the time period expires.
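> >>
> >> For concreteness, an autoCommit setup in solrconfig.xml looks something like the sketch below. The element names are the standard ones; the times are purely illustrative, not a recommendation. Since every replica reads the same config, each one fires these commits on its own local timer:
> >>
> >>   <updateHandler class="solr.DirectUpdateHandler2">
> >>     <autoCommit>
> >>       <!-- hard commit: flushes segments to disk and truncates the tlog -->
> >>       <maxTime>60000</maxTime>
> >>       <openSearcher>false</openSearcher>
> >>     </autoCommit>
> >>     <autoSoftCommit>
> >>       <!-- soft commit: makes recently indexed docs visible to searchers -->
> >>       <maxTime>5000</maxTime>
> >>     </autoSoftCommit>
> >>   </updateHandler>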
> >>
> >> Here's a blog that might help:
> >> http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >>
> >> [B] Right, SolrCloud really supposes that the machines are pretty similar, so it doesn't provide any way to do what you're asking. Really, you're asking for some way to assign "beefiness" to a node in terms of the load sent to it... I don't know of a way to do that, and I'm not sure it's on the roadmap either.
> >>
> >> What you'd really want, though, is some kind of heuristic that was automatically applied. That would take into account transient load problems, i.e. replica N happened to get a really nasty query to run and is just slow for a while. I can see this being very tricky to get right, though. Would a GC pause get weighted as "slow" even though the pause could be over already? Anyway, I don't think this is on the roadmap at present, but I could well be wrong.
> >>
> >> In your specific example, though (this works because of the convenient 2x....), you could host 2x the number of shards/replicas on the beefier machines.
> >>
> >> [C] Right, memory allocation is difficult. The general recommendation is that the memory allocated to Solr in the JVM should be as small as possible, leaving the op system free to use the rest of the memory for MMapDirectory. See the excellent blog here: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html. If you over-allocate memory to the JVM, your GC profile worsens...
> >>
> >> Generally, when people throw "memory" around they're talking about JVM memory...
> >>
> >> And don't be misled by the notion of "the index fitting into memory". You're absolutely right that when you get into a swapping situation, performance will suffer. But there are some very interesting tricks played to keep JVM consumption down. For instance, only every 128th term is stored in JVM memory; the other terms are read as needed, through OS memory, via the MMapDirectory implementations....
> >>
> >> Your GC stats look quite reasonable. You can get a snapshot of memory usage by attaching, say, jConsole to the running JVM and seeing what the memory usage is after a forced GC. Sounds like you've already seen this, but in case not: http://searchhub.org/2011/03/27/garbage-collection-bootcamp-1-0/. It was written before there was much mileage on the new G1 garbage collector, which has received mixed reviews.
> >>
> >> Note that the stored fields kept in memory are controlled by the documentCache in solrconfig.xml. I think of this as just something that holds documents while assembling the return list; it really doesn't have anything to do with searching per se, just with keeping disk seeks down during processing for a particular query. I.e. for a query returning 10 rows, only 10 docs will be kept here, not the 5M rows that matched.
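> >>
> >> For reference, that's the documentCache entry in solrconfig.xml, along these lines (the sizes here are illustrative only):
> >>
> >>   <!-- rough sizing: (max simultaneous queries) * (max rows per query).
> >>        autowarmCount is 0 because entries are keyed by internal Lucene
> >>        doc ids, which change on commit, so this cache can't be autowarmed. -->
> >>   <documentCache class="solr.LRUCache"
> >>                  size="4096"
> >>                  initialSize="1024"
> >>                  autowarmCount="0"/>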
> >>
> >> Whether 4G is sufficient is.... not answerable. I've doubled the memory requirements by changing the query without changing the index. Here's a blog outlining why we can't predict, and how to get an answer empirically:
> >> http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
> >>
> >> Best,
> >> Erick
> >>
> >> On Sat, Jul 5, 2014 at 1:57 AM, Himanshu Mehrotra <himanshumehrotra2...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > I had three questions/doubts regarding Solr and SolrCloud functionality. Can anyone help clarify these? I know these are a bit long; please bear with me.
> >> >
> >> > [A] Replication related - As I understand it, before SolrCloud, under a classic master/slave replication setup, every 'X' minutes the slaves poll the master and pull the updated index (index segments added and deleted/merged away). And when a client explicitly issues a 'commit', only the master solr closes/finalizes the current index segment and creates a new current index segment. As part of this, index segment merges, as well as an 'fsync' ensuring the data is on disk, also happen.
> >> >
> >> > I read the documentation regarding replication on SolrCloud, but unfortunately it is still not very clear to me.
> >> >
> >> > Say I have a solr cloud setup of 3 solr servers with just a single shard. Let's call them L (the leader) and F1 and F2, the followers.
> >> >
> >> > Case 1: We are not using autoCommits, and explicitly issue 'commit' via the Client. How does replication happen now? Does each update to leader L that goes into the tlog get replicated to followers F1 and F2 (where they also put the update in their tlogs) before the client sees a response from leader L? What happens when the client issues a 'commit'? Do the creation of a new segment, the merging of index segments if required, and the fsync happen on all three solrs, or does that happen just on leader L, with followers F1, F2 simply syncing the post-commit state of the index? Moreover, does leader L wait for the fsync on followers F1, F2 before responding successfully to the Client? If yes, does it update F1 and then F2 sequentially, or is the process concurrent/parallel via threads?
> >> >
> >> > Case 2: We use autoCommit every 'X' minutes and do not issue 'commit' via the Client. Is this setup similar to classic master/slave in terms of data/index updates? As in, since autoCommit happens every 'X' minutes, replication will happen after each commit and every 'X' minutes the followers get the updated index. But do simple updates, the ones that go into the tlog, get replicated immediately to the followers' tlogs?
> >> >
> >> > Another thing I noticed in the Solr Admin UI is that replication is set to afterCommit. What are the other possible settings for this knob, and what behaviour do we get out of them?
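> >> >
> >> > (For reference, I believe this knob comes from the ReplicationHandler section of solrconfig.xml; a classic master/slave setup is configured roughly like the sketch below, where the masterUrl and pollInterval values are hypothetical.)
> >> >
> >> >   <requestHandler name="/replication" class="solr.ReplicationHandler">
> >> >     <lst name="master">
> >> >       <str name="replicateAfter">commit</str>
> >> >     </lst>
> >> >     <lst name="slave">
> >> >       <str name="masterUrl">http://master-host:8983/solr/core1/replication</str>
> >> >       <!-- the every-'X'-minutes polling mentioned above -->
> >> >       <str name="pollInterval">00:05:00</str>
> >> >     </lst>
> >> >   </requestHandler>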
> >> >
> >> > [B] Load balancing related - In a traditional master/slave setup we use a load balancer to distribute the search query load equally over the slaves. In case one of the slave solrs is running on a 'beefier' machine (say more RAM or CPU or both) than the others, load balancers allow distributing load by weights, so that we can distribute load proportional to perceived machine capacity.
> >> >
> >> > With a solr cloud setup, let's take an example: 2 shards, 3 replicas per shard, totaling 6 solr servers. Say servers S1L1, S1F1, S1F2 host the replicas of shard1, and servers S2L1, S2F1, S2F2 host the replicas of shard2. S1L1 and S2L1 happen to be the leaders of their respective shards. And let's say S1F2 and S2F1 happen to be twice as big machines as the others (twice the RAM and CPU).
> >> >
> >> > Ideally speaking, in such a case we would want S2F1 and S1F2 to handle twice the search query load of their peers. That is, if 100 search queries come in, we know each shard will receive these 100 queries. So we want S1L1 and S1F1 to handle 25 queries each, and S1F2 to handle 50 queries. Similarly, we would want S2L1 and S2F2 to handle 25 queries each, and S2F1 to handle 50 queries.
> >> >
> >> > As far as I understand, this is not possible via the smart client provided in SolrJ. All solr servers will handle 33% of the query load.
> >> >
> >> > The alternative is to use a dumb client and a load balancer over all the servers. But even then I guess we won't get the correct/desired distribution of queries. Say we put the following weights on each server:
> >> >
> >> > 1 - S1L1
> >> > 1 - S1F1
> >> > 2 - S1F2
> >> > 1 - S2L1
> >> > 2 - S2F1
> >> > 1 - S2F2
> >> >
> >> > Now 1/4 of the total number of requests go to S1F2 directly, plus it now receives 1/6 (1/2 * 1/3) of the requests that went to some server on shard 2. This totals up to 10/24 of the request load, not half as we would expect.
> >> >
> >> > One way could be to choose weights y and x such that y/(2*(y + 2x)) + 1/6 = 1/2, i.e. the direct share y/(2*(y + 2x)) must equal 1/3, so 3y = 2y + 4x, giving y = 4x. It seems too much trouble to work them out (y = 4 and x = 1 here), and every time we add/remove/upgrade servers we would need to recalculate the weights.
> >> >
> >> > A simpler alternative, it appears, would be for each solr node to register its 'query_weight' with ZooKeeper on joining the cluster. This 'query_weight' could be a property, similar to 'solr.solr.home' or 'zkHosts', that we specify on the startup command line for the solr server. All smart clients and solr servers would then honour that weight when they distribute load. Is there such a feature planned for Solr Cloud?
> >> >
> >> > [C] GC/Memory usage related - From the documentation and videos available on the internet, it appears that solr performs best if the index fits into memory and the stored fields fit in memory too. Holding just the index in memory (without the stored fields) degrades solr's performance somewhat, not having enough memory to hold even the index makes solr slower still, and the moment the java process hits swap space solr will slow to a crawl.
> >> >
> >> > My question is: what is the 'memory' being talked about here? Is it the Java heap we specify via the Xmx and Xms options, or is it the free, buffered, or cached memory as reported by the free command on *nix systems? And how do we know whether our index and stored fields will fit in memory? For example, say the data directory for the core/collection occupies 200MB on disk (150,000 live documents and 180,000 max documents per the Solr UI); is an 8GB machine, with solr configured with Xmx 4G, going to be sufficient?
> >> >
> >> > Are there any guidelines for configuring the java heap and total RAM, given an index size and the expected query rate (queries per minute/second)?
> >> >
> >> > On a production system I observed via the gc logs that minor collections happen at a rate of 20 per minute and a full gc happens every seven to ten minutes. Are these too high or too low, given that the direct search query load on that solr node is about 2500 requests per minute? What kind of GC behaviour should I expect from a healthy and fast/optimal solr node in a solr-cloud setup? Is the answer "it depends on your specific response time and throughput requirements", or is there some kind of rule of thumb that can be followed irrespective of the situation? Or should I just see whether any improvements can be made via regular measure, tweak, measure cycles?
> >> >
> >> > Thanks,
> >> > Himanshu