Package: urlscan
Version: 0.5.5
Severity: normal

I don't know where this is going but... urlscan doesn't handle URLs that
contain UTF-8 characters. I just received an email which contains the
following URL, which urlscan fails to parse correctly:

  http://www.pantherhouse.com/newshelton/my-wife-thinks-i’m-a-swan/

When I paste it into Firefox directly, it successfully opens

  http://www.pantherhouse.com/newshelton/my-wife-thinks-i%E2%80%99m-a-swan/
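
For reference, what Firefox does here looks like plain percent-encoding of
the UTF-8 bytes of each non-ASCII character. A minimal sketch of that
conversion follows. It is written for Python 3's urllib.parse rather than
the Python 2.4 this package currently depends on, the function name
percent_encode_url is made up for illustration, and none of this is taken
from urlscan's own code. It only covers the encoding step, not the harder
problem of finding where such a URL ends in the message text.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_url(url):
    """Percent-encode non-ASCII characters in the path, query and fragment.

    Hypothetical helper, not part of urlscan. Non-ASCII host names are
    left alone here; they would need IDNA handling instead.
    """
    parts = urlsplit(url)
    # quote() encodes str input as UTF-8 by default. Keeping "%" in the
    # safe set avoids double-encoding escapes that are already present.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%/?:+")
    fragment = quote(parts.fragment, safe="%/?")
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


if __name__ == "__main__":
    raw = "http://www.pantherhouse.com/newshelton/my-wife-thinks-i\u2019m-a-swan/"
    print(percent_encode_url(raw))
```

With the URL from the email, this yields the same address that Firefox
opens. URLs that are already pure ASCII pass through unchanged.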

I'm not quite sure how to handle this, since it essentially means that
virtually any character can appear in a URL. Maybe you have a good idea.

-- System Information:
Debian Release: 4.0
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-3-686
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)

Versions of packages urlscan depends on:
ii  python                       2.4.4-2     An interactive high-level object-o
ii  python-central               0.5.12      register and build utility for Pyt
ii  python-urwid                 0.9.7.1-1   curses-based UI/widget library for

urlscan recommends no packages.

-- no debconf information

-- 
Martin Michlmayr
http://www.cyrius.com/