I did not see answers... I am not an authority, but will tell you what I think....
Did you get some answers?

On 5/6/11 2:52 PM, "Ravi Solr" <ravis...@gmail.com> wrote:

>Hello,
>        Pardon me if this has been already answered somewhere and I
>apologize for a lengthy post. I was wondering if anybody could help me
>understand Replication internals a bit more. We have a single
>master-slave setup (solr 1.4.1) with the configurations as shown
>below. Our environment is quite commit heavy (almost 100s of docs
>every 5 minutes), and all indexing is done on Master and all searches
>go to the Slave. We are seeing that the slave replication performance
>gradually decreases and the speed decreases < 1kbps and ultimately
>gets backed up. Once we reload the core on slave it will be work fine
>for sometime and then it again gets backed up. We have mergeFactor set
>to 10 and ramBufferSizeMB is set to 32MB and solr itself is running
>with 2GB memory and locktype is simple on both master and slave.

How big is your index? How many rows and GB? Every time you replicate, there are several resets on caching. So if you are constantly indexing, you need to be careful about how that performance impact will apply.

>
>I am hoping that the following questions might help me understand the
>replication performance issue better (Replication Configuration is
>given at the end of the email)
>
>1. Does the Slave get the whole index every time during replication or
>just the delta since the last replication happened ?

It depends. If you do an OPTIMIZE every time you index, then you will be sending the whole index down. If more than 10 segments (your mergeFactor) accumulate between replications, I believe that might also trigger a whole-index transfer, since you have cycled through all the segments. In that case I think you might want to increase the mergeFactor.

>
>2. If there are huge number of queries being done on slave will it
>affect the replication ? How can I improve the performance ? (see the
>replications details at he bottom of the page)

It seems that might be one way that you get the index.* directories.
At least I see it more frequently when there is a huge load and you are trying to replicate. You could replicate less frequently.

>
>3. Will the segment names be same be same on master and slave after
>replication ? I see that they are different. Is this correct ? If it
>is correct how does the slave know what to fetch the next time i.e.
>the delta.

Yes, they had better be. In the old days you could just rsync the data directory from master to slave and reload the core, and that worked fine.

>
>4. When and why does the index.<TIMESTAMP> folder get created ? I see
>this type of folder getting created only on slave and the slave
>instance is pointing to it.

I would love to know all the conditions... I believe it is supposed to replicate to index.*, then reload to point to it. But sometimes it gets stuck in index.* land and never goes back to the plain index directory. There are several bug fixes for this in 3.1.

>
>5. Does replication process copy both the index and index.<TIMESTAMP>
>folder ?

I believe it is supposed to copy the segment or whole index/ from master to index.* on slave.

>
>6. what happens if the replication kicks off even before the previous
>invocation has not completed ? will the 2nd invocation block or will
>it go through causing more confusion ?

That is not supposed to happen; if a replication is in process, it should not copy again until that one is complete. Try it: just delete the data/*, restart SOLR, and force a replication; while it is syncing, force it again. It does not seem to work for me.

>
>7. If I have to prep a new master-slave combination is it OK to copy
>the respective contents into the new master-slave and start solr ? or
>do I have have to wipe the new slave and let it replicate from its new
>master ?

If you shut down the slave, copy the data/* directory, and restart, you should be fine. That is how we fix the data/ dir when there is corruption.

>
>8.
>Doing an 'ls | wc -l' on index folder of master and slave gave 194
>and 17968 respectively...I slave has lot of segments_xxx files. Is
>this normal ?

Several bugs were fixed in 3.1 for this one. Not a good thing.... You are getting leftover segments or index.* directories.

>
>MASTER
>
><requestHandler name="/replication" class="solr.ReplicationHandler" >
>    <lst name="master">
>        <str name="replicateAfter">startup</str>
>        <str name="replicateAfter">commit</str>
>        <str name="replicateAfter">optimize</str>
>
>        <str name="confFiles">schema.xml,stopwords.txt</str>
>        <str name="commitReserveDuration">00:00:10</str>
>    </lst>
></requestHandler>
>
>
>SLAVE
>
><requestHandler name="/replication" class="solr.ReplicationHandler" >
>    <lst name="slave">
>        <str name="masterUrl">master core url</str>
>        <str name="pollInterval">00:03:00</str>
>        <str name="compression">internal</str>
>        <str name="httpConnTimeout">5000</str>
>        <str name="httpReadTimeout">10000</str>
>    </lst>
></requestHandler>
>
>
>REPLICATION DETAILS FROM PAGE
>
>Master master core url
>Poll Interval 00:03:00
>Local Index Index Version: 1296217104577, Generation: 20190
>    Location: /data/solr/core/search-data/index.20110429042508
>    Size: 2.1 GB
>    Times Replicated Since Startup: 672
>    Previous Replication Done At: Fri May 06 15:41:01 EDT 2011
>    Config Files Replicated At: null
>    Config Files Replicated: null
>    Times Config Files Replicated Since Startup: null
>    Next Replication Cycle At: Fri May 06 15:44:00 EDT 2011
>Current Replication Status Start Time: Fri May 06 15:41:00 EDT 2011
>    Files Downloaded: 43 / 197
>    Downloaded: 477.08 KB / 588.82 MB [0.0%]
>    Downloading File: _hdm.prx, Downloaded: 9.3 KB / 9.3 KB [100.0%]
>    Time Elapsed: 967s, Estimated Time Remaining: 1221166s, Speed: 505 bytes/s
>
>
>Ravi Kiran Bhaskar
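
For what it is worth, the numbers on that replication details page can be sanity-checked with a few lines of Python. This is only a rough sketch: the figures are copied from the page quoted above, and I am assuming 1 KB = 1024 bytes and 1 MB = 1024 KB, which may not be exactly how Solr formats them. The names (remaining_seconds, avg_speed, eta) are mine and have nothing to do with Solr itself.

```python
# Sanity check of the figures on the quoted replication details page.
# Assumption: 1 KB = 1024 bytes and 1 MB = 1024 KB.

KB = 1024
MB = 1024 * KB


def remaining_seconds(total_bytes, done_bytes, bytes_per_second):
    """Estimated seconds left if the transfer speed stays constant."""
    return (total_bytes - done_bytes) / bytes_per_second


total = 588.82 * MB      # "Downloaded: 477.08 KB / 588.82 MB"
done = 477.08 * KB
elapsed = 967            # "Time Elapsed: 967s"
reported_speed = 505     # "Speed: 505 bytes/s"
poll_interval = 3 * 60   # pollInterval 00:03:00, in seconds

# Average speed since this fetch started.
avg_speed = done / elapsed

# Time left at the reported speed.
eta = remaining_seconds(total, done, reported_speed)

print("average speed: %.1f bytes/s" % avg_speed)
print("time remaining: %.0f s (%.1f days)" % (eta, eta / 86400))
print("poll intervals that would pass: %d" % (eta // poll_interval))
```

The average comes out at about 505 bytes/s and the remaining time at about 14 days, both in line with what the page reports. So the speed shown looks like a plain average over the whole fetch, and at that rate a single fetch has no chance of finishing inside the 3-minute poll interval.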