I'm starting to review and test Colin's patches. Here are some comments. Where I don't mention anything (the majority of the code), then I think it looks great.
Meta-question: do you think it makes sense to turn on `from __future__ import unicode_literals`? 0002-Avoid-various diff --git a/lib/debian/deb822.py b/lib/debian/deb822.py index 7e8d0a6..f838da8 100644 --- a/lib/debian/deb822.py +++ b/lib/debian/deb822.py @@ -234,8 +234,8 @@ class Deb822Dict(object, UserDict.DictMixin): return '{%s}' % ', '.join(['%r: %r' % (k, v) for k, v in self.items()]) def __eq__(self, other): - mykeys = self.keys(); mykeys.sort() - otherkeys = other.keys(); otherkeys.sort() + mykeys = sorted(self.keys()) + otherkeys = sorted(other.keys()) if not mykeys == otherkeys: return False While the above code is fine, a perhaps more idiomatic way of writing it would be: mykeys = sorted(self) otherkeys = sorted(other) since .keys() is the default iteration protocol for mappings. diff --git a/lib/debian/debian_support.py b/lib/debian/debian_support.py index f0577ac..a5d5a3d 100644 --- a/lib/debian/debian_support.py +++ b/lib/debian/debian_support.py @@ -53,9 +53,9 @@ class ParseError(Exception): return self.msg def __repr__(self): - return "ParseError(%s, %d, %s)" % (`self.filename`, + return "ParseError(%s, %d, %s)" % (repr(self.filename), self.lineno, - `self.msg`) + repr(self.msg)) def print_out(self, file): """Writes a machine-parsable error message to file.""" @@ -337,7 +337,7 @@ class PseudoEnum: self._name = name self._order = order def __repr__(self): - return '%s(%s)'% (self.__class__._name__, `name`) + return '%s(%s)'% (self.__class__._name__, repr(name)) def __str__(self): return self._name def __cmp__(self, other): @@ -392,7 +392,7 @@ def patches_from_ed_script(source, for line in i: match = re_cmd.match(line) if match is None: - raise ValueError, "invalid patch command: " + `line` + raise ValueError("invalid patch command: " + repr(line)) (first, last, cmd) = match.groups() first = int(first) @@ -561,7 +561,7 @@ def update_file(remote, local, verbose=None): continue if verbose: - print "update_file: field %s ignored" % `field` + print "update_file: field %s ignored" % repr(field) if not patches_to_apply: if verbose: @@ -569,17 +569,17 @@ def update_file(remote, local, verbose=None): return download_file(remote, local) for patch_name in patches_to_apply: - print "update_file: downloading patch " + `patch_name` + print "update_file: downloading patch " + repr(patch_name) patch_contents = download_gunzip_lines(remote + '.diff/' + patch_name + '.gz') - if read_lines_sha1(patch_contents ) <> patch_hashes[patch_name]: - raise ValueError, "patch %s was garbled" % `patch_name` + if read_lines_sha1(patch_contents ) != patch_hashes[patch_name]: + raise ValueError("patch %s was garbled" % repr(patch_name)) patch_lines(lines, patches_from_ed_script(patch_contents)) new_hash = read_lines_sha1(lines) - if new_hash <> remote_hash: - raise ValueError, ("patch failed, got %s instead of %s" - % (new_hash, remote_hash)) + if new_hash != remote_hash: + raise ValueError("patch failed, got %s instead of %s" + % (new_hash, remote_hash)) replace_file(lines, local) return lines You might consider using %r in these cases, which calls the repr() of the object for string interpolation. E.g. def __repr__(self): return '%s(%r)'% (self.__class__._name__, name) Note though in that example, 'name' is undefined here. I'm guessing you meant self._name. @@ -593,8 +593,6 @@ def merge_as_sets(*args): for x in args: for y in x: s[y] = True - l = s.keys() - l.sort() - return l + return sorted(s.keys()) Again, you can probably just `return sorted(s)` 0003-Use-Python-3-style-print diff --git a/lib/debian/arfile.py b/lib/debian/arfile.py diff --git a/lib/debian/debfile.py b/lib/debian/debfile.py diff --git a/lib/debian/debian_support.py b/lib/debian/debian_support.py diff --git a/lib/debian/debtags.py b/lib/debian/debtags.py diff --git a/lib/debian/doc-debtags b/lib/debian/doc-debtags You need to add `from __future__ import print_function` to these files. 0005-Use-iterkeys Why make these changes, given that .iterkeys() and .iteritems() doesn't exist in Python 3? I see later you use `six` for Python version compatibility, but why not just use .keys() and .items() instead? Are these collections so big that making concrete lists in Python 2 will cause serious memory pressure? 0006-Use-absolute I like this change a lot! Would it make sense to add a __future__ import of absolute_import too? 0007-Use-Python-3-style-print This patch is also inconsistent about adding from __future__ import print_function. It should add them to all the files where print() is being used. 0011-Implement-rich This looks fine. If you wanted to, and you can support nothing less that Python 2.7 (not yet true for Debian), then you could define just __lt__ and __eq__ and let the @functools.total_ordering class decorator fill in the rest. Just FYI. 0019-Use-six-to-paper-over-int-long My suspicion is that because all you care about is 0L here, you probably don't need six for that. IOW, you could just return 0 without coercing to long(). I haven't tested that though. 0022-Be-much-more-careful diff --git a/tests/test_deb822.py b/tests/test_deb822.py index eae7418..d44477e 100755 --- a/tests/test_deb822.py +++ b/tests/test_deb822.py @@ -757,8 +763,10 @@ Description: python modules to work with Debian-related data formats warnings.filterwarnings(action='ignore', category=UnicodeWarning) filename = 'test_Sources.mixed_encoding' - for paragraphs in [deb822.Sources.iter_paragraphs(open(filename)), - deb822.Sources.iter_paragraphs(open(filename), + f1 = open(filename, 'rb') + f2 = open(filename, 'rb') + for paragraphs in [deb822.Sources.iter_paragraphs(f1), + deb822.Sources.iter_paragraphs(f2, use_apt_pkg=False)]: p1 = paragraphs.next() self.assertEqual(p1['maintainer'], @@ -766,6 +774,8 @@ Description: python modules to work with Debian-related data formats p2 = paragraphs.next() self.assertEqual(p2['uploaders'], u'Frank Küster <fr...@debian.org>') + f2.close() + f1.close() def test_bug597249_colon_as_first_value_character(self): """Colon should be allowed as the first value character. See #597249. It's a test, so this doesn't bother me too much, and also because while you could use contextlib.nested() in Python 2, that doesn't exist in Python 3. There, and in Python 2.7, the with-statement itself supports multiple context managers, so you could write: with open(filename, 'rb') as f1, open(filename, 'rb') as f2: Unfortunately, that doesn't work in Python 2.6. :( 0023-Use-six-to-paper-over-iterator.next I think this is unnecessary. Built-in next() appeared in Python 2.6, and it's available in 2.7 and 3.2 also. So instead of calling e.g. `sequence_iter.next()` or `six.advance_iterator(sequence_iter)` you can just call `next(sequence_iter)` everywhere. That's it for now. I'm going to do some additional testing and will follow up if I find anything. Otherwise, I'd say, go for it. Even without the few minor comments above, Colin's patch is fantastic.
signature.asc
Description: PGP signature