control: retitle -1 Fails on non breaking space

Hi,
On Tue, Mar 15, 2022 at 01:36:11PM +0100, Carsten Schoenert wrote:
> Control: tags -1 help
> 
> Hi,
> 
> On Sun, Feb 13, 2022 at 12:51:47PM +0100, Guido Günther wrote:
>  
> > I moved this here since nothing did change on the gbp site in that
> > area however it seems pydoctor didn't change either.
> > 
> > The python 3.8 changelog has
> > 
> > Python3 changelog and no sax parser change popped out that 
> > python3's sax parser changes didn't 
> > 
> > could that trigger it? I would expect for this to get noticed earlier
> > then though. I didn't look at changes in twisted yet.
> > 
> > Please move the issue back to gbp if you think it should be fixed there.
> 
> I think the root of the issue is mainly the updated pydoctor package.
> No other dependency of pydoctor has received a significant update, and
> no other involved package has been updated since the build of
> git-buildpackage started failing.
> 
> To dive into the problem I've prepared a local pbuilder chroot with the
> data from the day before pydoctor 21.12.1-1 was uploaded (that was 15
> Jan 2022 08:49:16 +0000) to unstable by using snapshot.d.o.
> 
> Building git-buildpackage with packages from 14 Jan 2022 works as
> expected.
> 
> Using data from snapshot.d.o. right after the upload of pydoctor
> 21.12.1-1 makes the build of git-buildpackage fail with the exact same
> error log visible in the starting email of this report.
> I've added some hacky debug printing into
> usr/lib/python3.9/xml/sax/expatreader.py around line 221 to see what
> exactly is the string the build is complaining about. This leads to
> this output (data is the second argument within 'def feed()' in line
> 206 of expatreader.py):
> 
> ----%<----
> +++debug+++
> data= 
> --><div><tt class="rst-docutils literal">str</tt>&nbsp;or <tt 
> class="rst-docutils literal">list</tt>&nbsp;of <tt class="rst-docutils 
> literal">str</tt></div><--
> undefined entity: line 1, column 46
> +++debug+++
> Traceback (most recent call last):
>   File "/usr/lib/python3.9/xml/sax/expatreader.py", line 219, in feed
>     self._parser.Parse(data, isFinal)
> xml.parsers.expat.ExpatError: undefined entity: line 1, column 46
> 
> During handling of the above exception, another exception occurred:
> 
> Traceback (most recent call last):
>   File "/usr/lib/python3/dist-packages/twisted/web/_flatten.py", line 390, in 
> _flattenTree
>     element = next(stack[-1])
> ---->%----
> 
> To me the data field looks like valid HTML/CSS code...

Carsten extracted this:

  <div><tt class="rst-docutils literal">str</tt>&nbsp;or <tt 
class="rst-docutils literal">list</tt>&nbsp;of <tt class="rst-docutils 
literal">str</tt></div>

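and it fails on &nbsp;. That is valid HTML, but XML predefines only
&amp;, &lt;, &gt;, &apos; and &quot;, so expat reports &nbsp; as an
undefined entity. The &nbsp; came about through a non-breaking space
(U+00A0, C2 A0 in UTF-8) in gbp's source code. Removing these makes
gbp's API build pass again, but it still looks like a regression in
pydoctor.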
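
For anyone who wants to reproduce the parser error without a full gbp
build, here is a minimal sketch. It feeds the fragment from the debug
output straight to expat, which is only an approximation of what happens
inside twisted and pydoctor. The helper name try_parse is made up for
this mail, and the &#160; replacement just shows that a numeric
character reference is accepted where the named entity is not:

```python
import xml.parsers.expat

# The fragment from the debug output, joined back onto one line.
SNIPPET = (
    '<div><tt class="rst-docutils literal">str</tt>&nbsp;or '
    '<tt class="rst-docutils literal">list</tt>&nbsp;of '
    '<tt class="rst-docutils literal">str</tt></div>'
)


def try_parse(markup):
    """Return None if expat accepts the markup, else the error text.

    Hypothetical helper for illustration; not part of gbp or pydoctor.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # Second argument True marks this as the final chunk of input.
        parser.Parse(markup, True)
    except xml.parsers.expat.ExpatError as err:
        return str(err)
    return None


if __name__ == "__main__":
    # Named HTML entity: rejected because no DTD declares it.
    print(try_parse(SNIPPET))
    # Numeric character reference for U+00A0: accepted by expat.
    print(try_parse(SNIPPET.replace("&nbsp;", "&#160;")))
    # A literal non-breaking space is accepted as well.
    print(try_parse(SNIPPET.replace("&nbsp;", "\u00a0")))
```

So the non-breaking space itself is fine for the XML parser; the trouble
only starts once it has been turned into the named entity.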

Cheers,
 -- Guido
