[C] I've rarely seen situations where the document cache has a very high hit rate. For that to happen, the queries would need to be returning the exact same documents, which isn't usually the case. I wouldn't increase this very far. The recommendation is that it be (total simultaneous queries you expect) * (page size). The response chain sometimes has separate components that operate on the docs being returned (i.e., the "rows" parameter's worth of documents). The docs in the cache can then be accessed by successive components without going back out to disk. But by and large this cache isn't shared across different queries.
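For illustration, here is a minimal documentCache entry in solrconfig.xml following that rule of thumb, assuming roughly 50 simultaneous queries and a page size (rows) of 20; both numbers are hypothetical, not a recommendation for any particular setup:

  <!-- solrconfig.xml, inside the <query> section; values are illustrative only -->
  <!-- sized as (expected simultaneous queries) * (rows), e.g. 50 * 20 = 1000 -->
  <documentCache class="solr.LRUCache"
                 size="1000"
                 initialSize="1000"
                 autowarmCount="0"/>

If the hit ratio stays low even at a size like that, it usually just confirms that successive queries aren't returning overlapping documents, and keeping the cache small is the cheaper option.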
And, in fact, with the MMapDirectory stuff it's not even clear to me that this cache is as useful as it was when it was conceived. Bottom line: I wouldn't worry about this one, and I certainly wouldn't allocate lots of memory to it until and unless I was able to measure performance gains....

Best,
Erick

On Sun, Jul 6, 2014 at 2:31 AM, Himanshu Mehrotra <himanshu.mehro...@snapdeal.com> wrote:
> Erick, first up thanks for thoroughly answering my questions.
>
> [A] I had read the blog mentioned, and yet failed to 'get it'. Now I understand the flow.
>
> [B] The automatic, heuristic-based approach, as you said, will be difficult to get right; that is why I thought a 'beefiness' index configuration, similar to a load balancer's weights, might give the same effective result for the most part. I guess it is a feature that most people won't need, only the ones in the process of upgrading their machines.
>
> [C] I will go through the blog and do empirical analysis. Speaking of caches, I see that for us the filter cache hit ratio is a good 97% while the document cache hit ratio is below 10%. Does that mean the document cache (size=4096) is not big enough and I should increase its size, or does it mean we are getting queries that return too wide a result set, and hence we would probably be better off switching off the document cache altogether if we could?
>
> Thanks,
> Himanshu
>
> On Sun, Jul 6, 2014 at 5:27 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>> Question 1, both sub-cases.
>>
>> You're off on the wrong track here; you have to forget about replication.
>>
>> When documents are added to the index, they get forwarded to _all_ replicas. So the flow is like this...
>> 1> leader gets the update request
>> 2> leader indexes the docs locally, adds them to its (local) transaction log, _and_ forwards the request to all followers
>> 3> followers add the docs to their tlogs and index them locally
>> 4> followers ack back to the leader
>> 5> leader acks back to the client
>>
>> There is no replication in the old sense at all in this scenario. I'll add parenthetically that old-style replication _is_ still used to "catch up" a follower that is waaaaaay behind, but the follower is in the "recovering" state if this ever occurs.
>>
>> About commit. If you commit from the client, the commit is forwarded to all followers (actually, all nodes in the collection). If you have autocommit configured, each of the replicas will fire its commit when the time period expires.
>>
>> Here's a blog that might help:
>> http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>>
>> [B] Right, SolrCloud really supposes that the machines are pretty similar, so it doesn't provide any way to do what you're asking. Really, you're asking for some way to assign "beefiness" to a node in terms of the load sent to it... I don't know of a way to do that, and I'm not sure it's on the roadmap either.
>>
>> What you'd really want, though, is some kind of heuristic that was automatically applied. That would take into account transient load problems, i.e. replica N happened to get a really nasty query to run and is just slow for a while. I can see this being very tricky to get right, though. Would a GC pause get weighted as "slow" even though the pause could be over already? Anyway, I don't think this is on the roadmap at present, but I could well be wrong.
>>
>> In your specific example, though (this works because of the convenient 2x...), you could host 2x the number of shards/replicas on the beefier machines.
>> [C] Right, memory allocation is difficult. The general recommendation is that the memory allocated to Solr in the JVM should be as small as possible, letting the op system use the rest of the memory for MMapDirectory. See the excellent blog here:
>> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html.
>> If you over-allocate memory to the JVM, your GC profile worsens...
>>
>> Generally, when people throw "memory" around they're talking about JVM memory...
>>
>> And don't be misled by the notion of "the index fitting into memory". You're absolutely right that when you get into a swapping situation, performance will suffer. But there are some very interesting tricks played to keep JVM consumption down. For instance, only every 128th term is stored in JVM memory; other terms are then read as needed, and held in OS memory via the MMapDirectory implementations....
>>
>> Your GC stats look quite reasonable. You can get a snapshot of memory usage by attaching, say, jConsole to the running JVM and seeing what memory usage is after a forced GC. Sounds like you've already seen this, but in case not:
>> http://searchhub.org/2011/03/27/garbage-collection-bootcamp-1-0/. It was written before there was much mileage on the new G1 garbage collector, which has received mixed reviews.
>>
>> Note that the stored fields kept in memory are controlled by the documentCache in solrconfig.xml. I think of this as just something that holds documents when assembling the return list; it really doesn't have anything to do with searching per se, just with keeping disk seeks down during processing for a particular query. I.e. for a query returning 10 rows, only 10 docs will be kept here, not the 5M rows that matched.
>>
>> Whether 4G is sufficient is.... not answerable. I've doubled the memory requirements by changing the query without changing the index. Here's a blog outlining why we can't predict, and how to get an answer empirically:
>> http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>>
>> Best,
>> Erick
>>
>> On Sat, Jul 5, 2014 at 1:57 AM, Himanshu Mehrotra <himanshumehrotra2...@gmail.com> wrote:
>> > Hi,
>> >
>> > I had three questions/doubts regarding Solr and SolrCloud functionality. Can anyone help clarify these? I know these are a bit long, please bear with me.
>> >
>> > [A] Replication related - As I understand it, before SolrCloud, under a classic master/slave replication setup, every 'X' minutes the slaves will pull/poll the updated index (index segments added and deleted/merged away). And when a client explicitly issues a 'commit', only the master Solr closes/finalizes the current index segment and creates a new current segment. As part of this, index segment merges happen, as well as an 'fsync' ensuring the data is on disk.
>> >
>> > I read the documentation regarding replication on SolrCloud, but unfortunately it is still not very clear to me.
>> >
>> > Say I have a SolrCloud setup of 3 Solr servers with just a single shard. Let's call them L (the leader) and F1 and F2, the followers.
>> >
>> > Case 1: We are not using autoCommits, and explicitly issue 'commit' via the client. How does replication happen now?
>> > Does each update to leader L that goes into the tlog get replicated to followers F1 and F2 (where they also put the update in their tlogs) before the client sees a response from leader L? What happens when the client issues a 'commit'? Does the creation of a new segment, merging of index segments if required, and fsync happen on all three Solrs, or does that just happen on leader L while followers F1 and F2 simply sync the post-commit state of the index? Moreover, does leader L wait for the fsync on followers F1 and F2 before responding successfully to the client? If yes, does it update F1 and then F2 sequentially, or is the process concurrent/parallel via threads?
>> >
>> > Case 2: We use autoCommit every 'X' minutes and do not issue 'commit' via the client. Is this setup similar to classic master/slave in terms of data/index updates? That is, since autoCommit happens every 'X' minutes, replication happens after each commit, so every 'X' minutes the followers get the updated index. But do simple updates, the ones that go into the tlog, get replicated immediately to the followers' tlogs?
>> >
>> > Another thing I noticed in the Solr Admin UI is that replication is set to afterCommit. What are the other possible settings for this knob, and what behaviour do we get out of them?
>> >
>> > [B] Load balancing related - In a traditional master/slave setup we use a load balancer to distribute the search query load equally over the slaves. In case one of the slave Solrs is running on a 'beefier' machine (say more RAM or CPU or both) than the others, load balancers allow distributing load by weights, so we can distribute load in proportion to perceived machine capacity.
>> >
>> > With a SolrCloud setup, let's take an example: 2 shards, 3 replicas per shard, totaling 6 Solr servers. Say we have servers S1L1, S1F1, S1F2 hosting replicas of shard1 and servers S2L1, S2F1, S2F2 hosting replicas of shard2. S1L1 and S2L1 happen to be the leaders of their respective shards. And let's say S1F2 and S2F1 happen to be twice as big as the other machines (twice the RAM and CPU).
>> >
>> > Ideally, in such a case we would want S2F1 and S1F2 to handle twice the search query load of their peers. That is, if 100 search queries come in, we know each shard will receive these 100 queries. So we want S1L1 and S1F1 to handle 25 queries each, and S1F2 to handle 50 queries. Similarly, we would want S2L1 and S2F2 to handle 25 queries each, and S2F1 to handle 50 queries.
>> >
>> > As far as I understand, this is not possible via the smart client provided in SolrJ. All Solr servers will handle 33% of the query load.
>> >
>> > The alternative is to use a dumb client and a load balancer over all the servers. But even then I guess we won't get the correct/desired distribution of queries. Say we put the following weights on each server:
>> >
>> > 1 - S1L1
>> > 1 - S1F1
>> > 2 - S1F2
>> > 1 - S2L1
>> > 2 - S2F1
>> > 1 - S2F2
>> >
>> > Now 1/4 of the total number of requests goes to S1F2 directly, plus it receives 1/6 (1/2 * 1/3) of the requests that went to some server on shard 2. This totals 10/24 of the request load, not the half we would expect.
>> >
>> > One way could be to choose weights y and x such that y/(2*(y + 2x)) + 1/6 = 1/2. It seems too much trouble to work them out (y = 4 and x = 1), and every time we add/remove/upgrade servers we would need to recalculate the weights.
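Writing that arithmetic out, with x the weight on the ordinary shard-1 nodes, y the weight on S1F2, total weight across all six nodes 2(2x + y), and the traffic forwarded from shard-2 nodes assumed to be spread uniformly over the three shard-1 replicas:

\[
\frac{y}{2(2x + y)} + \frac{1}{2}\cdot\frac{1}{3} = \frac{1}{2}
\;\Longrightarrow\;
\frac{y}{2(2x + y)} = \frac{1}{3}
\;\Longrightarrow\;
3y = 4x + 2y
\;\Longrightarrow\;
y = 4x.
\]

With weights 1, 1, 2 the left-hand side is 2/8 + 1/6 = 10/24; with x = 1 and y = 4 it is 4/12 + 1/6 = 1/2, the even split per shard that was wanted.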
>> > A simpler alternative, it appears, would be for each Solr node to register its 'query_weight' with ZooKeeper on joining the cluster. This 'query_weight' could be a property similar to 'solr.solr.home' or 'zkHosts' that we specify on the startup command line for the Solr server.
>> >
>> > All smart clients and Solr servers would then honour that weight when they distribute load. Is such a feature planned for SolrCloud?
>> >
>> > [C] GC/Memory usage related - From the documentation and videos available on the internet, it appears that Solr performs well if the index and the stored fields fit into memory. Holding just the index in memory already degrades Solr's performance somewhat, if we don't have enough memory to hold the index Solr is slower still, and the moment the Java process hits swap space Solr will slow to a crawl.
>> >
>> > My question is: what is the 'memory' being talked about? Is it the Java heap we specify via the Xmx and Xms options, or is it the free, buffered, or cached memory as shown in the output of the free command on *nix systems? And how do we know if our index and stored fields will fit into memory? For example, say the data directory for the core/collection occupies 200MB on disk (150,000 live documents and 180,000 max documents per the Solr UI); is an 8GB machine with Solr configured with Xmx 4G going to be sufficient?
>> >
>> > Are there any guidelines for configuring the Java heap and total RAM given an index size and an expected query rate (queries per minute/second)? On a production system I observed via GC logs that minor collections happen at a rate of 20 per minute and a full GC happens every seven to ten minutes; are these too high or too low, given that the direct search query load on that Solr node is about 2500 requests per minute? What kind of GC behaviour should I expect from a healthy and fast/optimal Solr node in a SolrCloud setup? Is the answer "it depends on your specific response time and throughput requirements", or is there some kind of rule of thumb that can be followed irrespective of the situation? Or should I just see whether improvements can be made via regular measure, tweak, measure cycles?
>> >
>> > Thanks,
>> > Himanshu
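As an aside on Case 2 above: autoCommit is configured per replica in solrconfig.xml, and (as Erick notes) each replica fires it on its own timer rather than anything being "replicated" at commit time. A minimal sketch of the relevant section, with purely illustrative interval values:

  <!-- solrconfig.xml; values are illustrative only -->
  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- hard commit: flush segments to disk every 60s without opening a new searcher -->
    <autoCommit>
      <maxTime>60000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <!-- soft commit: make newly indexed documents visible to searches every 5s -->
    <autoSoftCommit>
      <maxTime>5000</maxTime>
    </autoSoftCommit>
  </updateHandler>

The transaction-log blog linked above covers how these two settings interact with the tlog and with document visibility in SolrCloud.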