Josh, thank you very much for stepping in and for your explanations!
A few more comments below.

15.04.2014 03:06, Josh Durgin wrote:
> On 04/14/2014 02:43 AM, Dmitry Smirnov wrote:
[]
>> Specifically in regards to Ceph for a moment I feel much more comfortable
>> with
>> "dh_makeshlibs -V" not only due to lack of confidence in C++ symbols approach
>> but mostly because of rapid upstream development. The amount of changes
>> between 0.72.2 and 0.79 is huge. I would feel quite uncomfortable if "qemu"
>> would be built successfully with 0.79 but wouldn't pull newer libraries when
>> used with 0.72.2... For example Ceph cluster has to be completely upgraded to
>> 0.79 as it just doesn't work with mix of 0.72.2 and 0.79 components (i.e.
>> OSD,MON,MDS)...
>
> As a ceph developer, the mixed-cluster issue is a bug (possibly fixed
> already since 0.79 is undergoing heavy testing and fixes before the
> next long term stable release, 0.80, is out). If you have more details
> we'd be happy to hear about them.

No, this is not about the mixed-cluster issue (to be fair, I'm not really
sure what you're talking about, but I _think_ you're referring to a
situation where different parts of the cluster are using different
versions of the software).

This is about build-time vs run-time version/interface differences in
3rd party software. The bug in question is debian-specific, but it shows
a deeper issue within the mentioned libraries, which you confirm below.

When building everything from source on the target machines, the problem
does not exist. The problem comes when you build different system
components on different machines or at different times, and run a
combination of all this, again, on different machines.

I do not mean that different parts of the ceph software stack have
different versions. I mean the case where you build 3rd party software
using one version of rbd.h/librbd.so, but run it against a
system-installed librbd.so.1 from a different build of librbd.

In this situation, a distribution needs to ensure the runtime
librbd.so.1 has the right ABI.

ABI does not change _that_ often, and it is quite normal for an app
built against, say, version 0.79 of librbd.so to work happily with
librbd.so.1 version 0.72 (as long as it does NOT use any symbols which
are present in 0.79 but didn't exist in 0.72). When a 3rd party app
does use symbols which are present in 0.79 but not in earlier versions,
a distribution should include metadata for this 3rd party application
saying that it needs librbd.so.1 of AT LEAST version 0.79, to satisfy
library symbol (and hopefully ABI) requirements.

When might an app be built against a more recent version of a library
but run against an older one? For example, most distributions have some
"unstable" branch where the newest versions of all software are
uploaded. On the other hand, users may run stable branches with older
versions of everything. Imagine a ceph cluster running version 0.72,
where the user wants to try qemu (a 3rd party app) from the unstable
branch, because their "stable" qemu has a bug with their guest OS. The
easiest way here is just to install qemu from the unstable branch, with
all the rest of the cluster still running the same 0.72 version, without
mixed-cluster issues.

So we need a mechanism to tell the package management system the
minimum version of a library an app requires. For this, we need a
.symbols file listing all exported symbols together with the library
version in which each first appeared. And this is what this whole talk
is about, nothing more.

One possible way here, as suggested by Dmitry, is to record the build
version of the library as the minimum required for the app linked to it
(dh_makeshlibs -V does just that). This way, qemu built against version
0.72 of librbd will require librbd >= 0.72 according to package manager
metadata, even if it actually uses only symbols introduced in version
0.44 of librbd and before.

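To make the difference between the two approaches concrete, here is a
minimal Python sketch. It is an illustration only, not how dpkg-shlibdeps
is implemented: the "first appeared in" versions in the table are invented
for the example, the helper names are hypothetical, and the version
comparison is a simplified numeric one rather than real Debian version
ordering.

```python
# Hypothetical symbol table: the names are librbd C functions, but the
# "first appeared in" versions are made up for illustration.
SYMBOLS = {
    "rbd_open": "0.44",
    "rbd_close": "0.44",
    "rbd_read": "0.44",
    "rbd_aio_flush": "0.72",
}


def version_key(version):
    """Turn '0.72.2' into (0, 72, 2) so versions compare numerically.

    Simplification: real Debian version ordering has more rules.
    """
    return tuple(int(part) for part in version.split("."))


def minimum_version(used_symbols, symbols=SYMBOLS):
    """Symbols-file approach: the lowest library version that provides
    every symbol the application actually uses."""
    return max((symbols[name] for name in used_symbols), key=version_key)


def build_version_dependency(build_version):
    """'dh_makeshlibs -V' style approach: the dependency is simply the
    version of the library the application was built against."""
    return build_version


# An application built against 0.79 that only uses old symbols.
app_symbols = ["rbd_open", "rbd_read", "rbd_close"]
print("symbols-based: librbd >=", minimum_version(app_symbols))
print("build-version: librbd >=", build_version_dependency("0.79"))
```

With the symbol table, the dependency tracks what the application really
uses. With the build-version approach, it always tracks whatever library
version happened to be installed at build time.
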
This approach works, but, as you've shown below, not for ceph: ceph
does not tolerate a mixed cluster, so once you update any 3rd party
software that is not part of the cluster but uses it, you'll have to
upgrade the whole cluster too. According to the package manager, you'll
have to update (in this case) librbd to the latest version, which will
most likely require updating the whole ceph stack on that node, which
of course requires upgrading the ceph stack on the other nodes as well.

Dmitry: this is a good example of why the naive `dh_makeshlibs -V'
approach sometimes should NOT be used...

> Regarding library symbols, the ceph libraries each have C++ as well as
> C interfaces, and there's been some suggestions to move to
> visibility=hidden by default, to avoid some of the hairier problems
> with C++ libraries [1]. It seems like this would make .symbols files approach
> more tenable, since passing through all C++ symbols would
> not be as bad if only the desired ones are exported in the first place.
> This isn't done yet, but in the mean time the "dh_makeshlibs -V" approach
> seems fine to me.
>
> If there's anything we could do upstream to make this easier, let us know.
>
> Thanks!
> Josh
>
> [1] http://marc.info/?l=ceph-devel&m=138842618710279

This is _exactly_ what we're talking about. Whether the library has a
C++ interface or not is not very important once you talk about
default-hidden visibility. And once you hide most internal symbols, it
should be more or less easy to expose only the necessary symbols, and
their count will be rather small, too (much smaller than in the
default-visible case).

For now, I think, the current approach taken by Dmitry is the best: he
gave all C symbols a version, so that any non-C++ program will use the
versioned .symbols file correctly.
He also marked all C++ symbols as belonging to the "current" version (I
think it should be possible to use a variable in the .symbols file, like
${VERSION}, somehow), so that all C++ users of the library will always
require the "latest" version. We don't have many C++ users of the
library in the Debian archive (if any), so this should not be that bad
as a start. Later on, when the above changes land in ceph, it will be
possible to expand the C++ regex in the symbols file into the actual
list of exported symbols.

Thank you!

/mjt