Sam Varshavchik <[email protected]> writes: > char *p=strdup("example.com\xe3"); > err=idna_to_unicode_8z8z(p, &utf8_ptr, 0); ... > When g_utf8_next_char() gets 0xe3, this loop will merrily skip over > the trailing \0 in the C string, and off it goes, into merry-land.
Right.

> Yes, idna_to_unicode_8z8z() is documented as taking valid UTF-8 for
> input.  But, is it unreasonable for me to take an address from an
> E-mail header, and feed it to idna_to_unicode_8z8z(), without having
> to validate it for proper UTF-8-ness?

It has to be validated for proper UTF-8-ness.  The UTF-8 functions in
libidn (copied from glib) assume valid UTF-8 strings.

I agree it is way too easy to end up using libidn the way you did.  I'm
split between improving the documentation to explain the issue and
adding input sanitization to all libidn functions that accept UTF-8
data.  I know IDNA operations are a performance bottleneck in some
environments, and validating UTF-8 takes some CPU time.  But probably
not that much...

/Simon

_______________________________________________
Help-libidn mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/help-libidn
