Hi everyone. I have an IDNA2008 implementation that is starting to work (it passes some test vectors for the lookup algorithm), and I want to publish an early version at this point to get public review of the API.

First I want to explain why this is a separate library rather than an extension of libidn.so: libidn.so is an IDNA2003 implementation plus some other stuff. It is quite big, around 200kb if you optimize it for size. That is already a size problem for embedded devices today. By having libidn2 be a separate project, libidn won't get larger. People who don't need IDNA2003 do not need the libidn baggage, and people who don't need IDNA2008 do not need the libidn2 baggage. Further, eventually IDNA2003 may just go away, replaced by IDNA2008, and at that point it would be very painful (impossible) to remove the IDNA2003 stuff from Libidn. Taken together, this makes me believe that IDNA2008 should be in a separate shared library.

However, there is no reason the libidn2.so shared library couldn't be part of the 'GNU Libidn' project umbrella eventually. Adding it today would just slow down development though, since it is still changing significantly both internally and externally.

The webpage for this project will be:

  http://josefsson.org/libidn2/

I have uploaded a GTK-DOC generated API manual at:

  http://josefsson.org/libidn2/reference/idn2-idn2.html

In particular the essential API looks like this:

  /* IDNA2008 with UTF-8 input. */
  extern IDN2_API int
  idn2_lookup_u8 (const uint8_t *src, uint8_t **lookupname, int flags);

  extern IDN2_API int
  idn2_register_u8 (const uint8_t *ulabel, const uint8_t *alabel,
                    uint8_t **insertname, int flags);

  /* IDNA2008 with locale encoded inputs. */
  extern IDN2_API int
  idn2_lookup_ul (const char *src, char **lookupname, int flags);

  extern IDN2_API int
  idn2_register_ul (const char *ulabel, const char *alabel,
                    char **insertname, int flags);

I want to stress that these interfaces are not final and I want your input on how to make them better. There are no ABI guarantees for the shared library now.

As you can see, there is one interface for passing in UTF-8 strings and one for passing in locale encoded strings.
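To make the calling convention concrete, here is a rough caller-side sketch. It does not use libidn2 at all: demo_lookup_u8 is a hypothetical stand-in with the same parameter shape as idn2_lookup_u8, and it only copies its input instead of doing any IDNA processing. The return values (zero for success, non-zero for failure) and the use of free() to release the result are assumptions made for illustration, not something the interfaces above promise.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in with the same shape as idn2_lookup_u8.
   It performs NO IDNA processing: it only copies the input, so that
   the caller-side handling of the output pointer can be shown. */
int
demo_lookup_u8 (const uint8_t *src, uint8_t **lookupname, int flags)
{
  size_t len;
  uint8_t *copy;

  (void) flags;                 /* unused in this sketch */
  assert (src != NULL && lookupname != NULL);

  len = strlen ((const char *) src);
  copy = malloc (len + 1);
  if (copy == NULL)
    return -1;                  /* assumed: non-zero means failure */

  memcpy (copy, src, len + 1);  /* includes the terminating NUL */
  *lookupname = copy;           /* ownership passes to the caller */
  return 0;                     /* assumed: zero means success */
}

/* Caller side: pass the address of a pointer, check the return code,
   copy the result into the caller's own storage, then release it. */
int
demo_caller (const char *name, char *dst, size_t dstlen)
{
  uint8_t *out = NULL;
  int rc = demo_lookup_u8 ((const uint8_t *) name, &out, 0);

  if (rc != 0)
    return rc;

  if (strlen ((const char *) out) >= dstlen)
    {
      free (out);               /* result does not fit in dst */
      return -1;
    }

  strcpy (dst, (const char *) out);
  free (out);                   /* assumed: released with free() */
  return 0;
}
```

With a 256-byte destination, demo_caller ("example.org", buf, sizeof buf) leaves "example.org" in buf. The sketch is only meant to show the extra allocate, copy and free steps a caller goes through when the function always returns newly allocated memory.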
The locale interface will convert the string to UTF-8 and NFC-normalize it. I'm not sure how useful the idn2_register_ul interface is -- accepting non-UTF-8 and non-NFC inputs to the register process is error-prone. For the lookup process it is natural.

Note that the "register" interface takes only one label, not an entire domain name. This is per the suggested interface in RFC 5891. I'm not sure how useful this is -- maybe it should accept an entire domain name. Thoughts?

Possibly there should be a way to pass a pre-allocated buffer and let the function populate the buffer with the output domain name, instead of forcing callers to copy the newly allocated name into its proper place. My proposal on how to achieve this is to let the code inspect the *lookupname or *insertname value: if it is non-NULL, the output is copied into that buffer location rather than into a newly allocated buffer. Of course, the buffer must have room for 256 bytes (255 characters + 1 NUL), which is the largest possible domain name. I'm not sure a size parameter is needed; 256 bytes is such a small buffer size anyway that the caller could be required to always allocate a 256-byte buffer.

Archives of the actual implementation will be available at:

  http://josefsson.org/libidn2/releases/

This work is sponsored by DENIC. If you know others who are interested in supporting this effort, please let me know!

/Simon

_______________________________________________
Help-libidn mailing list
http://lists.gnu.org/mailman/listinfo/help-libidn
