Josh, thank you very much for stepping in and for your explanations!
A few more comments below.

15.04.2014 03:06, Josh Durgin wrote:
> On 04/14/2014 02:43 AM, Dmitry Smirnov wrote:
[]
>> Specifically in regards to Ceph for a moment I feel much more comfortable
>> with "dh_makeshlibs -V" not only due to lack of confidence in C++ symbols approach
>> but mostly because of rapid upstream development. The amount of changes
>> between 0.72.2 and 0.79 is huge. I would feel quite uncomfortable if "qemu"
>> would be built successfully with 0.79 but wouldn't pull newer libraries when
>> used with 0.72.2... For example Ceph cluster has to be completely upgraded to
>> 0.79 as it just doesn't work with mix of 0.72.2 and 0.79 components (i.e.
>> OSD,MON,MDS)...
> 
> As a ceph developer, the mixed-cluster issue is a bug (possibly fixed
> already since 0.79 is undergoing heavy testing and fixes before the
> next long term stable release, 0.80, is out). If you have more details
> we'd be happy to hear about them.

No, this is not about the mixed-cluster issue (to be fair, I'm not really
sure what you're talking about, but I _think_ you're referring to a
situation where different parts of the cluster are using different
versions of the software).

This is about a build-time vs run-time version/interface difference in
3rd-party software.

The bug in question is Debian-specific, but it shows a deeper issue
within the mentioned libraries, which you confirm below.  When building
everything from source on the target machines, the problem does not
exist.  The problem comes when you build different system components on
different machines or at different times, and then run a combination of
all this, again, on different machines.  I don't mean that different
parts of the ceph software stack have different versions; I mean that
you build 3rd-party software using one version of rbd.h/librbd.so, but
run it against a system-installed librbd.so.1 from a different build of
librbd.

In this situation, a distribution needs to ensure the runtime librbd.so.1
has the right ABI.  The ABI does not change _that_ often, and it is quite
normal for an app which was built against, say, the 0.79 version of
librbd.so to work happily with librbd.so.1 version 0.72 (as long as it
does NOT actually use any symbols which are present in 0.79 but didn't
exist in 0.72).

So when a 3rd-party app actually uses symbols which are present in
0.79 but not in earlier versions, a distribution should include
metadata for this 3rd-party application saying that it needs
librbd.so.1 of AT LEAST version 0.79, to satisfy library symbol (and
hopefully ABI) requirements.

When might we have a situation where an app is built against a more
recent version of a library but is run against an older one?  For
example, most distributions have some "unstable" branch to which the
newest versions of all software are uploaded.  On the other hand, users
may run stable branches with older versions of everything.  Now imagine
a ceph cluster running version 0.72, and a user who wants to try qemu
(a 3rd-party app) from the unstable branch, because their "stable" qemu
has a bug with their guest OS.  The easiest way here is just to install
qemu from the unstable branch, with all the rest of the cluster still
running the same 0.72 version, without mixed-cluster issues.

So we need a mechanism to tell the package management system the minimum
version of a library an app requires.  For this, we need a .symbols
file with a list of all exported symbols together with the library
version at which each first appeared.  And this is what this whole talk
is about, nothing more.
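
To make the idea concrete, here is a rough sketch in Python of what such
a table lets the packaging tools compute.  The symbol names and versions
are made up for illustration (they are not real librbd data), and the
version parsing is a simplification of Debian's real version ordering.

```python
# Hypothetical table: exported symbol -> library version where it first
# appeared.  Names and versions are invented for illustration only.
FIRST_SEEN = {
    "rbd_example_open": "0.44",
    "rbd_example_snap": "0.72",
    "rbd_example_new_call": "0.79",
}


def parse(version):
    """Turn '0.72.2' into (0, 72, 2); a simplified version ordering."""
    return tuple(int(part) for part in version.split("."))


def minimum_version(used_symbols, table=FIRST_SEEN):
    """Return the lowest library version providing every used symbol."""
    return max((table[name] for name in used_symbols), key=parse)
```

An app that uses only rbd_example_open would need >= 0.44 here, while
one that also uses rbd_example_new_call would need >= 0.79.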

One possible way here, as suggested by Dmitry, is to record the build
version of the library as the minimum required for the app linked to it
(dh_makeshlibs -V does just that).  This way, qemu built against the
0.72 version of librbd will require librbd >= 0.72 according to the
package manager metadata, even if it actually uses only symbols
introduced in the 0.44 version of librbd and before.
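
A small sketch of the difference, again in Python.  The version numbers
follow the examples in this mail; the claim that the app only needs
0.44-era symbols is an assumption for illustration, and the version
parsing is simplified compared to real Debian version comparison.

```python
def parse(version):
    """Turn '0.72.2' into (0, 72, 2); a simplified version ordering."""
    return tuple(int(part) for part in version.split("."))


def satisfied(installed, minimum):
    """True if the installed library version meets the recorded minimum."""
    return parse(installed) >= parse(minimum)


installed = "0.72.2"        # librbd on the stable cluster node
built_against = "0.79"      # librbd on the (unstable) build machine
needed_by_symbols = "0.44"  # assumed: newest symbol the app really uses

# "dh_makeshlibs -V" style: the dependency is the build-time version.
ok_with_build_version = satisfied(installed, built_against)

# .symbols style: the dependency is derived from the symbols in use.
ok_with_symbols = satisfied(installed, needed_by_symbols)
```

With these numbers the build-version dependency cannot be satisfied by
the installed 0.72.2, while the symbol-derived one can.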

This approach works, but, as you've shown below, not for ceph, because
ceph does not tolerate a mixed cluster.  Once you update any 3rd-party
software that is not part of the cluster but uses it, you'll have to
upgrade the whole cluster too: according to the package manager, you'll
have to update (in this case) librbd to the latest version, which will
most likely require updating the whole ceph stack on that node, which
of course requires upgrading the ceph stack on the other nodes as well.

Dmitry: this is a good example of why the naive `dh_makeshlibs -V'
approach sometimes should NOT be used...

> Regarding library symbols, the ceph libraries each have C++ as well as
> C interfaces, and there's been some suggestions to move to
> visibility=hidden by default, to avoid some of the hairier problems
> with C++ libraries [1]. It seems like this would make .symbols files
> approach more tenable, since passing through all C++ symbols would
> not be as bad if only the desired ones are exported in the first place.
> This isn't done yet, but in the mean time the "dh_makeshlibs -V"
> approach seems fine to me.
> 
> If there's anything we could do upstream to make this easier, let us know.
> 
> Thanks!
> Josh
> 
> [1] http://marc.info/?l=ceph-devel&m=138842618710279

This is _exactly_ what we're talking about.  Whether the library has a
C++ interface or not is not very important once you talk about
default-hidden visibility.  And once you hide most internal symbols, it
should be more or less easy to expose only the necessary symbols, and
their count will be rather small, too (much smaller than in the
default-visible case).

For now, I think, the current approach taken by Dmitry is the best:
he gave all C symbols a version, so that any non-C++ program will
use the versioned .symbols file correctly, and marked all C++ symbols
as belonging to the "current" version (I think it should be possible to
use a variable in the .symbols file, like ${VERSION}, somehow), so that
all C++ users of the library will always require the "latest" version.
We don't have many C++ users in the Debian archive (if any), so this
should not be that bad as a start.  Later on, once the above changes
land in ceph, it will be possible to expand the C++ regex in the
symbols file into an actual list of exported symbols.
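
The effect of that split can be sketched as follows.  This is only an
illustration of the rule, not how dpkg implements it: it assumes C++
symbols can be recognised by the "_Z" prefix of Itanium-ABI mangled
names, and the symbol names and versions are hypothetical.

```python
# Hypothetical per-symbol versions for the C interface only.
C_FIRST_SEEN = {
    "rbd_example_open": "0.44",
    "rbd_example_snap": "0.72",
}


def required_version(symbol, c_first_seen, current_version):
    """C symbols get their own version; C++ symbols get the current one."""
    if symbol.startswith("_Z"):  # assumed marker of a mangled C++ name
        return current_version
    return c_first_seen[symbol]
```

So a C caller of rbd_example_open would depend on >= 0.44, while any
user of a C++ symbol such as _ZN7example5Image4openEv would depend on
whatever version the library package currently has.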

Thank you!

/mjt

