Package: urlscan
Version: 0.5.5
Severity: normal

I'm not sure where this is going, but urlscan doesn't handle URLs
that contain non-ASCII (UTF-8 encoded) characters.  I just received an
email containing the following URL, which urlscan fails to parse correctly:
   http://www.pantherhouse.com/newshelton/my-wife-thinks-i’m-a-swan/
When I paste it into Firefox directly, it successfully opens
   http://www.pantherhouse.com/newshelton/my-wife-thinks-i%E2%80%99m-a-swan/

I'm not quite sure how to handle this since it essentially means that
virtually any character can appear in a URL.  Maybe you have a good
idea.
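
One possible direction, offered only as a rough sketch and not as urlscan's
actual code: once a candidate URL has been extracted, percent-encode its
non-ASCII characters as UTF-8, which appears to be what Firefox does with the
URL above.  The helper name percent_encode_url is hypothetical, the sketch
uses the Python 3 standard library (not the Python 2.4 listed below), and it
leaves the host name alone, so internationalized domain names are out of
scope.  It does not address the harder problem of deciding where such a URL
ends in running text.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_url(url):
    """Percent-encode non-ASCII characters in the path, query and fragment.

    Hypothetical helper for illustration; not part of urlscan.
    '%' is kept as a safe character so that already-encoded URLs
    are not encoded a second time.
    """
    parts = urlsplit(url)
    # quote() encodes characters as UTF-8 by default.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&+/?%")
    fragment = quote(parts.fragment, safe="/?%")
    # The scheme and host (netloc) are passed through unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


if __name__ == "__main__":
    raw = "http://www.pantherhouse.com/newshelton/my-wife-thinks-i\u2019m-a-swan/"
    print(percent_encode_url(raw))
```

Applied to the URL from the email, this should give the same percent-encoded
form that Firefox opens.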



-- System Information:
Debian Release: 4.0
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-3-686
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)

Versions of packages urlscan depends on:
ii  python                        2.4.4-2    An interactive high-level object-o
ii  python-central                0.5.12     register and build utility for Pyt
ii  python-urwid                  0.9.7.1-1  curses-based UI/widget library for

urlscan recommends no packages.

-- no debconf information

-- 
Martin Michlmayr
http://www.cyrius.com/
