Package: python-beautifulsoup
Version: 3.0.4-1
Followup-For: Bug #479414

This problem seems to be fixed upstream (3.0.7, http://www.crummy.com/software/BeautifulSoup/).

$ python2.5 t.py   # script from original report
<html>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<h1>Here is a Latin-1 entity: ®</h1>
</html>

Traceback (most recent call last):
  File "t.py", line 21, in <module>
    print BeautifulSoup(input2)
  File "/var/lib/python-support/python2.5/BeautifulSoup.py", line 1282, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "/var/lib/python-support/python2.5/BeautifulSoup.py", line 946, in __init__
    self._feed()
  File "/var/lib/python-support/python2.5/BeautifulSoup.py", line 971, in _feed
    SGMLParser.feed(self, markup)
  File "/usr/lib/python2.5/sgmllib.py", line 99, in feed
    self.goahead(0)
  File "/usr/lib/python2.5/sgmllib.py", line 133, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.5/sgmllib.py", line 291, in parse_starttag
    self.finish_starttag(tag, attrs)
  File "/usr/lib/python2.5/sgmllib.py", line 340, in finish_starttag
    self.handle_starttag(tag, method, attrs)
  File "/usr/lib/python2.5/sgmllib.py", line 376, in handle_starttag
    method(attrs)
  File "/var/lib/python-support/python2.5/BeautifulSoup.py", line 1372, in start_meta
    self._feed(self.declaredHTMLEncoding)
  File "/var/lib/python-support/python2.5/BeautifulSoup.py", line 971, in _feed
    SGMLParser.feed(self, markup)
  File "/usr/lib/python2.5/sgmllib.py", line 99, in feed
    self.goahead(0)
  File "/usr/lib/python2.5/sgmllib.py", line 133, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.5/sgmllib.py", line 285, in parse_starttag
    self._convert_ref, attrvalue)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xae in position 0: ordinal not in range(128)

$ wget http://www.crummy.com/software/BeautifulSoup/download/BeautifulSoup.py
--2008-06-24 23:25:53--  http://www.crummy.com/software/BeautifulSoup/download/BeautifulSoup.py
Resolving www.crummy.com... 66.160.141.133
Connecting to www.crummy.com|66.160.141.133|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 77799 (76K) [text/x-python]
Saving to: `BeautifulSoup.py'

100%[======================================>] 77,799 114K/s in 0.7s

2008-06-24 23:25:54 (114 KB/s) - `BeautifulSoup.py' saved [77799/77799]

$ python2.5 t.py
<html>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<h1>Here is a Latin-1 entity: ®</h1>
</html>

<html>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="Description" content="Here is a Latin-1 entity: ®" />
</html>

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)

Kernel: Linux 2.6.25-2-686 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages python-beautifulsoup depends on:
ii  python          2.5.2-1  An interactive high-level object-o
ii  python-support  0.8.3    automated rebuilding support for P

python-beautifulsoup recommends no packages.

-- no debconf information
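
For anyone trying to make sense of the UnicodeDecodeError in the traceback: below is a minimal sketch that reproduces the same error message outside of BeautifulSoup. It is written for Python 3 and uses an explicit decode; my assumption (not verified against the 3.0.4 source) is that under Python 2.5 the same decode happens implicitly when sgmllib combines a byte string holding the converted entity with an already-decoded unicode attribute value. The variable names are made up for illustration.

```python
# Python 3 sketch; the session in this report used Python 2.5, where the
# ASCII decode is assumed to happen implicitly rather than explicitly.
raw = b"\xae"  # the single byte named in the error message

try:
    raw.decode("ascii")  # 0xae is outside the 7-bit ASCII range
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Decoded as Latin-1, the same byte is U+00AE, the registered sign
# that appears in the test document.
registered = raw.decode("latin-1")
print(registered == "\u00ae")
```

The byte 0xae is the Latin-1 encoding of the registered sign, which matches the entity used in the test input.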