Igor Radic wrote on November 11, 2010 9:48 PM
>
> [...]
> 1. Problem 1 - merging range influences merging results
> It is logical that merging range does NOT influence merging results.
> Meaning, I should get the same results when merging 100 revisions at
> once or 10 times 10 revisions.
> But we have noticed it is not so - in all cases the problem was somehow
> related with a deletion of file/directory.

Please look into issue 3324, which has been solved recently. If this is your problem, it will be fixed in one of the upcoming releases. See:
http://subversion.tigris.org/issues/show_bug.cgi?id=3324

> Usually the difference is only seen in mergeinfo.
> But once we even had very strange case where two branches were created
> and the same range was merged from TRUNK (but in different steps).
> We ended up with 2 exact branches (content and mergeinfo) - at least
> according to Tortoise repo-browser comparison.
> But one branch could be re-integrated into TRUNK, and the other one
> could NOT.
> Are their any known limitations when choosing merging ranges?
> Our current rule is: when choosing merging range, the revision which
> deletes at least one file/folder must be first in the range.
> Does this make sense?

I assume that the two branches were created from the same trunk revision (otherwise, the mergeinfo would very likely be different). If the same merges have been made on both branches, but in finer steps on one of them, the reintegrate merges to the trunk should work the same way. If there was a difference (e.g. you got a tree conflict originating from the deletion when trying to reintegrate one branch but not when reintegrating the other one), this might have been caused by the bug described in issue 3324.

You did not specify what reintegrate problem you encountered. If the problem was not in the final merge step but already in the early phase, when --reintegrate checks the reintegrate source, this could have been caused by branching off from different trunk revisions. There is less chance for these checks to fail in the branch copied from the newer revision. But this does not really fit your description of identical content and mergeinfo (unless you manipulated the latter).

> 2. Problem 2 - order of actions influences mergeinfo
> I have a very simple example.
> I have TRUNK and two branches (X and Y) created from TRUNK.
> Now the work is finished in both branches and I want to re-integrate
> them back to TRUNK.

So you have merged everything from the trunk up to the HEAD revision into both branches.

> But in the meantime, one file/folder in branch Y has mergeinfo (although
> it has no mergeinfo in TRUNK).

So Y has some subtree mergeinfo originating from some merge to a subdirectory or file within Y.

> If I first merge branch X and then branch Y, the same file/folder in
> TRUNK will get mergeinfo - the one as in branch Y (except for TRUNK
> entries, of course), plus the final information about the merge of
> branch Y.

Yes, the subtree mergeinfo of your file/subdirectory is taken over from Y into the corresponding file/subdirectory in the trunk, supplemented by the mergeinfo for the reintegrate merge of Y to the trunk you are performing. Note that any later merge to the root directory of the trunk will also change the mergeinfo in that file/subdirectory.

> But if I first merge branch Y and then branch X, the mergeinfo will also
> contain information about merge of branch X.

For the same reason: as any later merge after the merge of Y will update the subtree mergeinfo, you will also see there information about the reintegrate merge of X if you reintegrate X later than Y.

> I understand why this happens, but I don't understand why such behaviour
> is wanted.
> I mean, what exactly do I get with this information when it is obvious
> that in branch X this particular mergeinfo was not added/changed?

To understand why this happens: whenever you do some operation that needs mergeinfo ("svn log -g", "svn blame -g", "svn merge") on a subdirectory or file within your trunk, it looks for explicit mergeinfo on that subdirectory/file. If there is none, it looks at the parent directory, and so on upwards. This is why "svn log -g" works on a subdirectory/file that has no explicit mergeinfo when there is mergeinfo on the root directory of your trunk.
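
The lookup just described can be sketched in a few lines of Python. This is only a toy model, not Subversion's implementation: the function name `nearest_mergeinfo`, the paths, and the revision numbers are made up for illustration, and details such as path adjustment of inherited mergeinfo are ignored.

```python
from posixpath import dirname


def nearest_mergeinfo(path, explicit):
    """Return (owner_path, mergeinfo) for the nearest path, starting at
    `path` and walking up, that carries explicit mergeinfo.

    `explicit` maps a path to its explicit mergeinfo (hypothetical data).
    """
    current = path
    while True:
        if current in explicit:
            return current, explicit[current]
        parent = dirname(current)
        if parent == current:          # reached the root, nothing found
            return None, {}
        current = parent


# Hypothetical trunk: mergeinfo on the root and on one file only.
explicit = {
    "/trunk": {"/branches/X": {20}, "/branches/Y": {25}},
    "/trunk/lib/util.c": {"/branches/Y": {12, 25}},
}

# No explicit mergeinfo on this file, so the trunk root answers.
inherited_owner, inherited_info = nearest_mergeinfo("/trunk/src/main.c", explicit)

# Explicit mergeinfo on this file, so the walk never reaches the root.
own_owner, own_info = nearest_mergeinfo("/trunk/lib/util.c", explicit)
```

In this toy data, `/trunk/src/main.c` is answered by `/trunk`, while `/trunk/lib/util.c` is answered by its own property and therefore never sees the `/branches/X` entry recorded on the root.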

But if you now use "svn log -g" on that subdirectory/file within your trunk that has got some explicit mergeinfo originating from a former subtree merge to the corresponding subdirectory/file in Y, the search for explicit mergeinfo on the parent directories stops, as it has already found this one. Therefore, this mergeinfo needs to have the complete information about all merges, including the reintegrate of X done later than that of Y.

In the above explanation I simplified a bit: provided that the later merge of X did not affect that subdirectory/file, the mergeinfo for the reintegrate merge of X does not have to be propagated to the subtree mergeinfo, and I think newer clients do optimize a bit here.

> 3. Problem 3 - how should "only mark as merged" option work?
> Again, a small example: there is branch X, then branch Y created from
> it, and branch Z created from branch Y.
> At first, all 3 branches have exactly the same contents and no mergeinfo.
> Now, I perform either test 1 or test 2, getting 2 different results.
> - Test 1:
> revision 10: change file A in branch X
> revision 11: merge revision 10 (branch X) to branch Y -> this changes
> file A and adds mergeinfo for X:10
> revision 12: merge revision 11 (branch Y) to branch Z -> this changes
> file A and adds mergeinfo for X:10 and Y:11

This merge from Y to Z does three things:
a) it merges the change on A from Y to Z
b) it merges the mergeinfo change (X:10) from Y to Z
c) it adds mergeinfo for the merge being done here: Y:11

> revision 13: re-integrate branch Z to branch Y -> no content is changed,
> mergeinfo is added for Z:12
>
> - Test 2:
> revision 10: change file A in branch X
> revision 11: merge revision 10 (branch X) to branch Y -> this changes
> file A and adds mergeinfo for X:10
> revision 12: accidentally perform the same change of file A in branch Z
> revision 13: now that it is clear that this is the same change as in
> Y:11, only mark as merged Y:11 -> no content is changed, mergeinfo is
> added for Y:11 (but not for X:10)

Revision 11 in test 2 consists of two changes: the change to file A in Y obtained by the merge from X, and the mergeinfo X:10 that was added to Y. In the previous step (revision 12 in test 2), you only replayed the first of these changes to Z, but not the second one. Revision 12 in test 2 performed only operation a) from revision 12 in test 1.

Before you can make the result of revision 12 in test 2 look like a merge from Y to Z as in test 1, you must first replay the missing change b). So first do a record-only merge of X:10 to Z to add the same mergeinfo a real merge would have got from merging the X:10 mergeinfo on Y to Z (this replays action b from test 1), and then do a record-only merge of Y:11 to Z to make it look like a real merge from Y to Z (this replays action c from test 1).

The --record-only option does just what it says: it does not merge any changes (so it does not merge mergeinfo changes from the source branch), it only records as merged what you specified.
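
To make the difference between actions a), b) and c) concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not how Subversion stores or merges anything: a branch is a dict of file contents plus a mergeinfo dict, and the helper names (`new_branch`, `real_merge`, `record_only_merge`) are invented for this illustration.

```python
def new_branch():
    """A branch is modelled as file contents plus mergeinfo {source: {revs}}."""
    return {"files": {"A": "old"}, "mergeinfo": {}}


def add_mergeinfo(branch, source, revs):
    branch["mergeinfo"].setdefault(source, set()).update(revs)


def real_merge(target, source_name, rev, change):
    target["files"].update(change["files"])        # a) content change
    for src, revs in change["mergeinfo"].items():  # b) merged mergeinfo change
        add_mergeinfo(target, src, revs)
    add_mergeinfo(target, source_name, {rev})      # c) record this merge


def record_only_merge(target, source_name, rev):
    add_mergeinfo(target, source_name, {rev})      # c) only, nothing is merged


# Revision 11 on Y: file A changed and mergeinfo X:10 added.
y11 = {"files": {"A": "new"}, "mergeinfo": {"X": {10}}}

# Test 1: a real merge of Y:11 to Z.
z_test1 = new_branch()
real_merge(z_test1, "Y", 11, y11)

# Test 2: same edit made by hand (r12), then Y:11 marked as merged (r13).
z_test2 = new_branch()
z_test2["files"]["A"] = "new"
record_only_merge(z_test2, "Y", 11)
missing = set(z_test1["mergeinfo"]) - set(z_test2["mergeinfo"])

# The extra record-only merge of X:10 replays action b).
record_only_merge(z_test2, "X", 10)
print(missing, z_test1 == z_test2)
```

In this model, `missing` contains only `X` after the two steps of test 2, and the two versions of Z become identical once X:10 is also recorded on Z.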

> revision 14: re-integrate branch Z to branch Y -> no content is changed,
> mergeinfo is added for Z:12-13, but also removed for X:10

The reintegrate merge does NOT replay all changes on Z one after the other to Y. It is much simpler: it computes the difference between Y and Z and applies this difference to Y. In order not to remove former changes on Y, it is very important that ALL changes on Y have been merged to Z before reintegrating Z to Y. Otherwise, the difference, when applied to Y, would not only add changes from Z to Y but also remove former changes on Y which have not been merged to Z. (I am simplifying to make this easier to understand.)

The reintegrate merge performs some safety checks to ensure that you have not left out any revisions from Y in your merges to Z. It uses the merge tracking information to do so.

Now in test 2, you have actually skipped such a merge (Y:11) (and instead made a separate file change with the same modification in revision 12). This would have prevented the reintegrate merge from doing anything (to be precise: provided that another merge from Y to Z was done after revision 13). But then in revision 13, you promised by a record-only merge that you actually did a merge of Y:11, although you forgot about action b. The reintegrate merge believes you and does its job. The result is that the thing (mergeinfo) you forgot to add on Z is thrown out on Y.

> In both tests the content is exactly the same.
> But in second test mergeinfo for X:10 is missing - meaning branch X was
> not merged to branch Y.
> Is this behaviour wanted?
> Why is it like that?
> What if we want also to keep mergeinfo changes when marking a revision
> as merged?
> Should we then also mark those other changes as merged?
> In other words - is this a bug in Subversion or it is exactly how it
> should work and instead we should use it a bit differently to achieve
> what we expect?

I think the questions are answered by the explanation above.
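
The simplified "difference between Y and Z applied to Y" view can be sketched as follows. This models only the mergeinfo part and follows the simplification used in this mail, not the real algorithm; the function names are invented, and dropping entries that name the target itself is an assumption made here so that the result matches what was reported in the question.

```python
def flatten(mergeinfo):
    """Turn {source: {revs}} into a set of (source, rev) pairs."""
    return {(src, rev) for src, revs in mergeinfo.items() for rev in revs}


def reintegrate_mergeinfo(target, target_name, source, source_name, source_revs):
    """Apply the difference target -> source to target, then record the merge."""
    target_set, source_set = flatten(target), flatten(source)
    added = source_set - target_set        # on the source only
    removed = target_set - source_set      # on the target only: thrown out!
    result = (target_set - removed) | added
    result |= {(source_name, rev) for rev in source_revs}
    # Assumption: mergeinfo naming the target itself is not kept.
    result = {(src, rev) for src, rev in result if src != target_name}
    return result, removed


y = {"X": {10}}                            # Y after revision 11

# Test 1: Z received X:10 and Y:11 through a real merge.
result1, removed1 = reintegrate_mergeinfo(
    y, "Y", {"X": {10}, "Y": {11}}, "Z", [12])

# Test 2: Z only has the record-only entry Y:11.
result2, removed2 = reintegrate_mergeinfo(
    y, "Y", {"Y": {11}}, "Z", [12, 13])

print(sorted(result1), sorted(removed1))
print(sorted(result2), sorted(removed2))
```

In the test 1 case nothing is removed and Z:12 is added next to X:10. In the test 2 case X:10 ends up in `removed`, because Z never received it and the difference therefore takes it away from Y.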

This is not a bug in Subversion. If you do an indirect merge (merging a merge result: (X->Y)->Z), this is most easily understood by treating the mergeinfo change made by the merge revision X->Y as one of the changes being merged to Z; the mergeinfo for the merge to Z itself is then simply added at the end of the operation.

> 4. Problem 4 - corrupt and obsolete mergeinfo
> Well, this is the most important question of all.
> In our project we have reached approximately revision 28000.
> In the beginning of our project we didn't know exactly how we should use
> merges and we made some bad actions.

Like removing the trunk and then moving/renaming a branch to be the new trunk, instead of using a 2-URL merge for reintegrating. I am really embarrassed about that. :-) But this was in Subversion 1.2 times, so this does not affect merge tracking, which was introduced in Subversion 1.5.

> We have also removed some mergeinfos or manipulated it manually.
> Also lately we made some suspicious merges and corrections of mergeinfo.

As Paul Burba said in a presentation about "Merging and Merge Tracking": "DO NOT hand edit or remove svn:mergeinfo properties unless you are sure you know what you are doing (and recheck yourself)."

So you should not manipulate the mergeinfo. You might think some mergeinfo is incorrect although it is not. (For example, when you do all merges on the root directory, but in one branch a subdirectory has been deleted and then copied again (not merged!) from a different branch, it gets very complicated.)

OK, to "know what you are doing", read the Subversion book (nightly 1.6 version) thoroughly, and read all white papers about mergeinfo. This is a lot of work, but otherwise you will come to wrong conclusions. Even then, as Paul said, recheck yourself!

> Our mergeinfo contains information since the beginning in revision 1.

Actually later, since merge tracking was only introduced in Subversion 1.5.

> Let's say that we need mergeinfo for last 6 months of work and
> everything that is older than that can be considered obsolete.

Be careful with such assumptions. The mergeinfo can be helpful particularly for very old changes everyone in your team forgot about. For instance, if you have doubts about some area in the code, you can use "svn blame -g" to see where each line came from (even when it was reintegrated from a branch that has been deleted long ago), and you can get information from the log messages about why the line was changed and what other changes were made along with it. (On the TortoiseSVN blame dialog, choose "Include merge info".)

Of course, even without mergeinfo, log messages from the deleted branches are still available in the repository. But it would be much harder to find them if you don't keep the references to where the merge results came from.

You are not forced to make use of the mergeinfo, but you should not prevent your colleagues from using it. You may make their life unnecessarily difficult, and you may cause unneeded costs for your company.

Generally, it is a bad idea to remove information from a source control system, which is made to keep information from history.

> Is it possible that we somehow remove some obsolete information?

As there is no obsolete information, this is not possible. But if you like, just ignore the information.

> Is such action dangerous?

That depends on several factors: what additional effort these actions will cause in future, whether the cause of these actions is recognised by anybody in your team, whether this information is passed to your boss, whether he likes you, and how he reacts.

Such actions will not crash the repository. Such actions may make the work of your team less efficient. Source control is intended to make that work more efficient.

> If we are allowed to do that, what are the limitations or rules when
> doing that?

Do not ask anybody for permission except your boss, so you can blame him later.
But be honest and tell him that you want to do things in your special way not documented in the Subversion book and that you want to remove Subversion-internal information maintained or used by the Subversion commands documented in the Subversion book (maintained by "svn merge", used by "svn log -g", "svn blame -g", "svn mergeinfo"). If you ignore all warnings, there are no limitations or rules.

> I suppose we should do such action in TRUNK first, am I right?

Not at all. The trunk is where your colleagues will continue to work all the time, and this is the place where they benefit most from the mergeinfo on log and blame commands (when using the -g option).

> I also suppose we should only remove mergeinfo about old dead branches,
> not the ones that are still active and being merged to.

Removing mergeinfo in branches being merged to would be very silly. You would have to track what has been merged yourself to avoid conflicts on re-merging revisions that have already been merged. And reintegrating such a branch would be dangerous.

You should also not remove mergeinfo about dead branches, because this information may be very helpful (svn log -g, svn blame -g), and there would be no benefit from removing it.

A general remark: we have two kinds of information about the history in a Subversion repository. The first is the ancestry: where a file comes from, usually from a previous revision on the same path, but sometimes from a previous revision on a different path, e.g. when a branch has been taken. The ancestry is not so helpful for a merge revision, because the ancestry of the merge source tells more about that change than the ancestry of the merge target. The ancestry builds the copied-from information, the mergeinfo builds the merged-from information, which connects to the ancestry of the merge source. Both the copied-from and merged-from information are helpful.
The merged-from information can be removed (or manipulated), although it is not wise to do so, but the copied-from information can't - hmm... I think there is a way... but I won't betray it. :-)

Finally, let's talk about bogus mergeinfo left behind from 1.5 and early 1.6 clients. Can this be removed?

First answer: yes, it can, although there is no need to do so. If you do, leave this to someone who read and understood Paul's white papers about mergeinfo and "knows what (s)he is doing (and rechecks him/herself)", particularly as there is such a person in your team.

Second answer: Why bother about 1% of bogus mergeinfo when it does no harm and when you can benefit from the 99% of correct mergeinfo? I prefer to benefit from the mergeinfo in 99% of the cases when doing "svn log -g", even if I would need some extra time to interpret the results in the 1% where the mergeinfo is wrong, because there is a big time saving overall. This is much better than suffering from removed mergeinfo in 100% of the cases when I use "svn log -g". And Subversion does recognize (some) bogus mergeinfo, so I won't even get wrong information for the whole 1% of it. For more about this, see below.

Third answer: There is no problem in removing (or correcting) bogus mergeinfo (only the bogus one!!), if it is done by someone "who knows what (s)he is doing". Then you could benefit from 100% correct mergeinfo.

Another remark about bogus mergeinfo. I have had the experience that in some cases an "svn log -g" did not supply the log entries from the merged revisions. This happens when Subversion recognizes bogus mergeinfo and ignores it. This behavior has recently been improved: in future, Subversion will only ignore the bogus part of the mergeinfo but honor the rest of it. See issue 3270.
http://subversion.tigris.org/issues/show_bug.cgi?id=3270

> In advance, thanks a lot for your answers.
> I hope you can help me and my colleagues.

I hope I did.

- Servatius