Ok, great. Thanks for helping out.

Thanks,
Himanshu
On Sun, Jul 6, 2014 at 9:35 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> [C] I've rarely seen situations where the document cache has a very high hit rate. For that to happen, the queries would need to be returning the exact same documents, which isn't usually the case. I wouldn't increase this very far. The recommendation is that it be (total simultaneous queries you expect) * (page size). The response chain sometimes has separate components that operate on the docs being returned (i.e. the "rows" parameter). The docs in the cache can then be accessed by successive components without going back out to disk. But by and large this isn't used by different queries.
>
> And, in fact, with the MMapDirectory stuff it's not even clear to me that this cache is as useful as it was when it was conceived.
>
> Bottom line: I wouldn't worry about this one, and I certainly wouldn't allocate lots of memory to it until and unless I was able to measure performance gains....
>
> Best,
> Erick
>
> On Sun, Jul 6, 2014 at 2:31 AM, Himanshu Mehrotra <himanshu.mehro...@snapdeal.com> wrote:
> > Erick, first up thanks for thoroughly answering my questions.
> >
> > [A] I had read the blog mentioned, and yet failed to 'get it'. Now I understand the flow.
> >
> > [B] The automatic, heuristic-based approach, as you said, will be difficult to get right; that is why I thought a 'beefiness' index configuration, similar to a load balancer's weights, might help get the same effective result for the most part. I guess it is a feature that most people won't need, only the ones in the process of upgrading their machines.
> >
> > [C] I will go through the blog and do an empirical analysis. Speaking of caches, I see that for us the filter cache hit ratio is a good 97% while the document cache hit ratio is below 10%. Does that mean that the document cache (size=4096) is not big enough and I should increase the size, or does it mean that we are getting queries that result in too wide a result set, and hence we would probably be better off switching off the document cache altogether if we could do it?
> >
> > Thanks,
> > Himanshu
> >
> > On Sun, Jul 6, 2014 at 5:27 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >> Question 1, both sub-cases.
> >>
> >> You're off on the wrong track here; you have to forget about replication.
> >>
> >> When documents are added to the index, they get forwarded to _all_ replicas. So the flow is like this:
> >> 1> leader gets update request
> >> 2> leader indexes docs locally, adds them to the (local) transaction log, _and_ forwards the request to all followers
> >> 3> followers add docs to their tlogs and index locally
> >> 4> followers ack back to the leader
> >> 5> leader acks back to the client
> >>
> >> There is no replication in the old sense at all in this scenario. I'll add parenthetically that old-style replication _is_ still used to "catch up" a follower that is waaaaaay behind, but the follower is in the "recovering" state if this ever occurs.
> >>
> >> About commit. If you commit from the client, the commit is forwarded to all followers (actually, all nodes in the collection). If you have autoCommit configured, each of the replicas will fire its commit when the time period expires.
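> >>
> >> For concreteness, an autoCommit setup in solrconfig.xml looks something like the sketch below. The element names are the standard ones; the times are purely illustrative, not a recommendation. Since every replica reads the same config, each one fires these commits on its own local timer:
> >>
> >>   <updateHandler class="solr.DirectUpdateHandler2">
> >>     <autoCommit>
> >>       <!-- hard commit: flushes segments to disk and truncates the tlog -->
> >>       <maxTime>60000</maxTime>
> >>       <openSearcher>false</openSearcher>
> >>     </autoCommit>
> >>     <autoSoftCommit>
> >>       <!-- soft commit: makes recently indexed docs visible to searchers -->
> >>       <maxTime>5000</maxTime>
> >>     </autoSoftCommit>
> >>   </updateHandler>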
> >>
> >> Here's a blog that might help:
> >> http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >>
> >> [B] Right, SolrCloud really supposes that the machines are pretty similar, so it doesn't provide any way to do what you're asking. Really, you're asking for some way to assign "beefiness" to a node in terms of the load sent to it... I don't know of a way to do that, and I'm not sure it's on the roadmap either.
> >>
> >> What you'd really want, though, is some kind of heuristic that was automatically applied. That would take into account transient load problems, i.e. replica N happened to get a really nasty query to run and is just slow for a while. I can see this being very tricky to get right, though. Would a GC pause get weighted as "slow" even though the pause could be over already? Anyway, I don't think this is on the roadmap at present, but I could well be wrong.
> >>
> >> In your specific example, though (this works because of the convenient 2x....), you could host 2x the number of shards/replicas on the beefier machines.
> >>
> >> [C] Right, memory allocation is difficult. The general recommendation is that the memory allocated to Solr in the JVM should be as small as possible, leaving the op system free to use the rest of the memory for MMapDirectory. See the excellent blog here: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html. If you over-allocate memory to the JVM, your GC profile worsens...
> >>
> >> Generally, when people throw "memory" around they're talking about JVM memory...
> >>
> >> And don't be misled by the notion of "the index fitting into memory". You're absolutely right that when you get into a swapping situation, performance will suffer. But there are some very interesting tricks played to keep JVM consumption down. For instance, only every 128th term is stored in JVM memory; the other terms are read as needed, through OS memory, via the MMapDirectory implementations....
> >>
> >> Your GC stats look quite reasonable. You can get a snapshot of memory usage by attaching, say, jConsole to the running JVM and seeing what the memory usage is after a forced GC. Sounds like you've already seen this, but in case not: http://searchhub.org/2011/03/27/garbage-collection-bootcamp-1-0/. It was written before there was much mileage on the new G1 garbage collector, which has received mixed reviews.
> >>
> >> Note that the stored fields kept in memory are controlled by the documentCache in solrconfig.xml. I think of this as just something that holds documents while assembling the return list; it really doesn't have anything to do with searching per se, just with keeping disk seeks down during processing for a particular query. I.e. for a query returning 10 rows, only 10 docs will be kept here, not the 5M rows that matched.
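> >>
> >> For reference, that's the documentCache entry in solrconfig.xml, along these lines (the sizes here are illustrative only):
> >>
> >>   <!-- rough sizing: (max simultaneous queries) * (max rows per query).
> >>        autowarmCount is 0 because entries are keyed by internal Lucene
> >>        doc ids, which change on commit, so this cache can't be autowarmed. -->
> >>   <documentCache class="solr.LRUCache"
> >>                  size="4096"
> >>                  initialSize="1024"
> >>                  autowarmCount="0"/>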
> >>
> >> Whether 4G is sufficient is.... not answerable. I've doubled the memory requirements by changing the query without changing the index. Here's a blog outlining why we can't predict, and how to get an answer empirically:
> >> http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
> >>
> >> Best,
> >> Erick
> >>
> >> On Sat, Jul 5, 2014 at 1:57 AM, Himanshu Mehrotra <himanshumehrotra2...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > I had three questions/doubts regarding Solr and SolrCloud functionality. Can anyone help clarify these? I know these are a bit long; please bear with me.
> >> >
> >> > [A] Replication related - As I understand it, before SolrCloud, under a classic master/slave replication setup, every 'X' minutes the slaves poll the master and pull the updated index (index segments added and deleted/merged away). And when a client explicitly issues a 'commit', only the master solr closes/finalizes the current index segment and creates a new current index segment. As part of this, index segment merges, as well as an 'fsync' ensuring the data is on disk, also happen.
> >> >
> >> > I read the documentation regarding replication on SolrCloud, but unfortunately it is still not very clear to me.
> >> >
> >> > Say I have a solr cloud setup of 3 solr servers with just a single shard. Let's call them L (the leader) and F1 and F2, the followers.
> >> >
> >> > Case 1: We are not using autoCommits, and explicitly issue 'commit' via the Client. How does replication happen now? Does each update to leader L that goes into the tlog get replicated to followers F1 and F2 (where they also put the update in their tlogs) before the client sees a response from leader L? What happens when the client issues a 'commit'? Do the creation of a new segment, the merging of index segments if required, and the fsync happen on all three solrs, or does that happen just on leader L, with followers F1, F2 simply syncing the post-commit state of the index? Moreover, does leader L wait for the fsync on followers F1, F2 before responding successfully to the Client? If yes, does it update F1 and then F2 sequentially, or is the process concurrent/parallel via threads?
> >> >
> >> > Case 2: We use autoCommit every 'X' minutes and do not issue 'commit' via the Client. Is this setup similar to classic master/slave in terms of data/index updates? As in, since autoCommit happens every 'X' minutes, replication will happen after each commit and every 'X' minutes the followers get the updated index. But do simple updates, the ones that go into the tlog, get replicated immediately to the followers' tlogs?
> >> >
> >> > Another thing I noticed in the Solr Admin UI is that replication is set to afterCommit. What are the other possible settings for this knob, and what behaviour do we get out of them?
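> >> >
> >> > (For reference, I believe this knob comes from the ReplicationHandler section of solrconfig.xml; a classic master/slave setup is configured roughly like the sketch below, where the masterUrl and pollInterval values are hypothetical.)
> >> >
> >> >   <requestHandler name="/replication" class="solr.ReplicationHandler">
> >> >     <lst name="master">
> >> >       <str name="replicateAfter">commit</str>
> >> >     </lst>
> >> >     <lst name="slave">
> >> >       <str name="masterUrl">http://master-host:8983/solr/core1/replication</str>
> >> >       <!-- the every-'X'-minutes polling mentioned above -->
> >> >       <str name="pollInterval">00:05:00</str>
> >> >     </lst>
> >> >   </requestHandler>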
> >> >
> >> > [B] Load balancing related - In a traditional master/slave setup we use a load balancer to distribute the search query load equally over the slaves. In case one of the slave solrs is running on a 'beefier' machine (say more RAM or CPU or both) than the others, load balancers allow distributing load by weights, so that we can distribute load proportional to perceived machine capacity.
> >> >
> >> > With a solr cloud setup, let's take an example: 2 shards, 3 replicas per shard, totaling 6 solr servers. Say servers S1L1, S1F1, S1F2 host the replicas of shard1, and servers S2L1, S2F1, S2F2 host the replicas of shard2. S1L1 and S2L1 happen to be the leaders of their respective shards. And let's say S1F2 and S2F1 happen to be twice as big machines as the others (twice the RAM and CPU).
> >> >
> >> > Ideally speaking, in such a case we would want S2F1 and S1F2 to handle twice the search query load of their peers. That is, if 100 search queries come in, we know each shard will receive these 100 queries. So we want S1L1 and S1F1 to handle 25 queries each, and S1F2 to handle 50 queries. Similarly, we would want S2L1 and S2F2 to handle 25 queries each, and S2F1 to handle 50 queries.
> >> >
> >> > As far as I understand, this is not possible via the smart client provided in SolrJ. All solr servers will handle 33% of the query load.
> >> >
> >> > The alternative is to use a dumb client and a load balancer over all the servers. But even then I guess we won't get the correct/desired distribution of queries. Say we put the following weights on each server:
> >> >
> >> > 1 - S1L1
> >> > 1 - S1F1
> >> > 2 - S1F2
> >> > 1 - S2L1
> >> > 2 - S2F1
> >> > 1 - S2F2
> >> >
> >> > Now 1/4 of the total number of requests go to S1F2 directly, plus it now receives 1/6 (1/2 * 1/3) of the requests that went to some server on shard 2. This totals up to 10/24 of the request load, not half as we would expect.
> >> >
> >> > One way could be to choose weights y and x such that y/(2*(y + 2x)) + 1/6 = 1/2, i.e. the direct share y/(2*(y + 2x)) must equal 1/3, so 3y = 2y + 4x, giving y = 4x. It seems too much trouble to work them out (y = 4 and x = 1 here), and every time we add/remove/upgrade servers we would need to recalculate the weights.
> >> >
> >> > A simpler alternative, it appears, would be for each solr node to register its 'query_weight' with ZooKeeper on joining the cluster. This 'query_weight' could be a property, similar to 'solr.solr.home' or 'zkHosts', that we specify on the startup command line for the solr server. All smart clients and solr servers would then honour that weight when they distribute load. Is there such a feature planned for Solr Cloud?
> >> >
> >> > [C] GC/Memory usage related - From the documentation and videos available on the internet, it appears that solr performs best if the index fits into memory and the stored fields fit in memory too. Holding just the index in memory (without the stored fields) degrades solr's performance somewhat, not having enough memory to hold even the index makes solr slower still, and the moment the java process hits swap space solr will slow to a crawl.
> >> >
> >> > My question is: what is the 'memory' being talked about here? Is it the Java heap we specify via the Xmx and Xms options, or is it the free, buffered, or cached memory as reported by the free command on *nix systems? And how do we know whether our index and stored fields will fit in memory? For example, say the data directory for the core/collection occupies 200MB on disk (150,000 live documents and 180,000 max documents per the Solr UI); is an 8GB machine, with solr configured with Xmx 4G, going to be sufficient?
> >> >
> >> > Are there any guidelines for configuring the java heap and total RAM, given an index size and the expected query rate (queries per minute/second)?
> >> >
> >> > On a production system I observed via the gc logs that minor collections happen at a rate of 20 per minute and a full gc happens every seven to ten minutes. Are these too high or too low, given that the direct search query load on that solr node is about 2500 requests per minute? What kind of GC behaviour should I expect from a healthy and fast/optimal solr node in a solr-cloud setup? Is the answer "it depends on your specific response time and throughput requirements", or is there some kind of rule of thumb that can be followed irrespective of the situation? Or should I just see whether any improvements can be made via regular measure, tweak, measure cycles?
> >> >
> >> > Thanks,
> >> > Himanshu