[CCing Panu Matilainen, the maintainer of rpm, or, at least rpm 4.*, which is what all major distributions are using AIUI]
On Mon, 2011-03-21 at 10:50 +0100, "Martin v. Löwis" wrote:
> On 21.03.2011 07:37, Prashant Kumar wrote:
> > Hello,
> > My name is Prashant Kumar and I've worked on porting few python
> > libraries(distutils2, configobj) and I've been looking at the ideas
> > list for GSoC for a project related to porting.
> >
> > I came across [1] and found it interesting. It mentions that some

Hi Prashant!  Thanks for the interest.

Panu: [1] is http://wiki.python.org/moin/RPMOnPython3 , a Google Summer
of Code proposal to work on the Python 3 bindings to RPM.

> > of the work has already been done; I would like to look at the code
> > repository for the same, could someone provide me the link for the
> > same?

> Not so much the code but the person who did the porting. This was Dave
> Malcolm (CC'ed); please get in touch with him. Please familiarize
> yourself with the existing Python bindings (in the latest RPM 4 release
> from rpm.org). You'll notice that this already has Python 3 support;
> not sure whether that's the most recent code, though.

Panu Matilainen also worked on the python 3 port of the librpm python
bindings.

For the rpm source code, see: http://rpm.org/wiki/GetSource (the python
bindings are in a subdirectory of the main source tree).

My initial patchbomb landed on the mailing list here:
http://lists.rpm.org/pipermail/rpm-maint/2009-October/002528.html
and Panu committed and fixed up the patches around then.

My understanding is that the current status is that the bindings work,
but all values that were formerly exposed to Python 2 as "str" are now
exposed to Python 3 as "bytes", which would require changing all
consumers of the code.

I believe Panu has also been working on a rewrite of the Python
bindings, since the existing code is a little messy.  Panu, am I
remembering this correctly?

The idea is that these types are fundamentally string-like, but
unfortunately rpm has always been a bit loose in its interpretation of
the encoding of byte values in package files and package databases.
There are millions of rpm files out there, and millions of rpm
databases, and all of these are in _some_ encoding.  I have seen
specfiles in which parts of the file were encoded in UTF-8 and other
parts were encoded in Latin-1 (this broke one of my python scripts
horribly).

Martin and I discussed this last week at PyCon.  I believe the proposal
that we came up with was:
  - try to interpret bytes as UTF-8, using the "surrogateescape"
    mechanism, so that if it fails, we can at least preserve the exact
    bytes and round-trip

Ultimately, this does mean trying to impose some kind of encoding
standard on rpm files and rpm databases, which I think would be a Good
Thing, but is perhaps something of scope creep compared to what the
proposal at [1] says.  See e.g. http://rpm.org/ticket/30

Other ideas that occur:
  - does rpmlint check for encoding yet?
  - what to do e.g. about canonicalization?  What happens if one rpm
    provides a feature named "café" (where the "é" is U+00E9) and
    another rpm requires a feature named "café" (where the "é" is
    U+0065 LATIN SMALL LETTER E + U+0301 COMBINING ACUTE ACCENT)?  IIRC
    we ruled that rpms in Fedora had to have ASCII names, and I'm
    guessing this applies to metadata, but we do allow UTF-8 filenames
    within package payloads (again, IIRC)

I should mention that I'm drowning in email, and more likely to receive
email to which I am directly listed in the "To" or "CC".

Alas, it's also worth mentioning that there was a hostile fork of rpm,
at rpm5.org, and that the "#rpm" channel on Freenode relates to that
fork.
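
To make the "surrogateescape" proposal and the canonicalization question
above a bit more concrete, here is a minimal Python 3 sketch.  The
function names (decode_rpm_value, encode_rpm_value, same_feature) are
hypothetical and are not part of the rpm bindings; comparing names after
NFC normalization is only one possible policy, not something that has
been decided.

```python
import unicodedata


def decode_rpm_value(raw: bytes) -> str:
    """Decode as UTF-8; bytes that are not valid UTF-8 become lone
    surrogates (U+DC80..U+DCFF) instead of raising UnicodeDecodeError."""
    return raw.decode("utf-8", errors="surrogateescape")


def encode_rpm_value(text: str) -> bytes:
    """Inverse of decode_rpm_value: escaped surrogates are turned back
    into the original bytes, so the value round-trips exactly."""
    return text.encode("utf-8", errors="surrogateescape")


def same_feature(a: str, b: str) -> bool:
    """Hypothetical comparison policy: treat two names as equal if they
    match after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A Latin-1 encoded value is not valid UTF-8, but still round-trips.
latin1_bytes = "caf\u00e9".encode("latin-1")
text = decode_rpm_value(latin1_bytes)
assert encode_rpm_value(text) == latin1_bytes

# Precomposed vs. decomposed spellings differ as raw strings,
# but compare equal once both are normalized.
composed = "caf\u00e9"
decomposed = "cafe\u0301"
assert composed != decomposed
assert same_feature(composed, decomposed)
```

Note that a string containing escaped surrogates cannot be encoded with
the default strict error handler, so consumers that write such values
out again would need to use the same error handler.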
I would advise not bothering with the rpm5 code; my understanding is
that all major Linux distributions that use rpm use the rpm 4.* code
hosted at rpm.org, not the rpm5 fork (and I have no personal interest in
a GSOC project to work on python 3 support there).  I doublechecked that
fork in its CVS repository, and it does not yet have any of the Python 3
support.

Hope this is helpful
Dave

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev