Hi Michael,

On Thu, Jun 10, 2010 at 10:54:39AM +0200, Michael Vogt wrote:
> Package: python-debian
> Version: 0.1.16
> Severity: normal
>
> It appears that the deb822.Deb822.iter_paragraph method gets confused
> if there are bogus entries (like a single line) in the file. Below is
> a test that shows the behavior. Depending on the policy the expected
> value is either 2 or 3 (2 if we want to discard invalid stanzas).

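To make the two policies concrete, here is a rough sketch of the "2 or 3" question. This is not the deb822 code: iter_stanzas, its keep_invalid flag and the SAMPLE input are all made up for illustration, and the toy splitter only understands "Key: value" lines plus indented continuation lines.

```python
import warnings


def iter_stanzas(lines, keep_invalid=True):
    """Toy paragraph splitter (hypothetical; NOT the python-debian parser).

    Yields one dict per blank-line-separated stanza. A stanza containing a
    line that is neither "Key: value" nor a continuation line is bogus: it
    is yielded as {} with a warning if keep_invalid is true, else dropped.
    """
    stanza, valid, seen, last_key = {}, True, False, None
    # The trailing "" acts as an explicit end-of-input marker, so the last
    # stanza is flushed even when the input lacks a final blank line.
    for raw in list(lines) + [""]:
        line = raw.rstrip("\n")
        if line.strip():
            seen = True
            if line[0] in " \t" and last_key is not None:
                # Continuation of the previous field.
                stanza[last_key] += "\n" + line.strip()
            elif ":" in line and not line[0].isspace():
                key, _, value = line.partition(":")
                last_key = key.strip()
                stanza[last_key] = value.strip()
            else:
                valid = False  # e.g. a single line without a colon
            continue
        if seen:
            # A blank line (or the end marker) closes the current stanza.
            if valid:
                yield stanza
            elif keep_invalid:
                warnings.warn("malformed stanza yielded as empty dict")
                yield {}
            stanza, valid, seen, last_key = {}, True, False, None


SAMPLE = """\
Package: foo
Version: 1.0

this line is bogus

Package: bar
Version: 2.0
"""

if __name__ == "__main__":
    print(len(list(iter_stanzas(SAMPLE.splitlines(), keep_invalid=True))))   # 3
    print(len(list(iter_stanzas(SAMPLE.splitlines(), keep_invalid=False))))  # 2
```

With keep_invalid=True the bogus line shows up as an empty stanza in the middle (3 results); with keep_invalid=False it is silently dropped (2 results). Either way, iteration continues past the bogus line instead of stopping there.
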
What is your use case for this? I'm having a hard time seeing a good way
to handle bogus data consistently. What should the parser yield when it
encounters a bogus stanza?

> It appears that the problem is "while len(x) != 0" in deb822.py, that
> will make the parser stop on the first bogus line. Attached is a
> possible patch for this that makes the EOF handling explicit.

It should be noted that this behavior is specific to the native parser
(which is used when you specify use_apt_pkg=False or you don't have
python-apt installed). When iter_paragraphs uses apt_pkg, it returns a
bogus Deb822 object for the bogus line. Because of apt_pkg's TagParser
implementation, it may appear to have a key corresponding to the bogus
line, but actually trying to get the value for that key will raise
KeyError. This is not good behavior - it breaks the map interface - but
unless we check for validity of the data (which would defeat the purpose
of using apt_pkg), I don't know how to make it better.

Interestingly, with your patch, the native parser returns an empty
Deb822 object (essentially {}). This is probably the best behavior we
can ask for - although I think it should at least raise a warning, and
the behavior should be documented.

And...it would be really nice if we could make the apt_pkg one do the
same thing. Any ideas?

-- 
John Wright <j...@debian.org>