[ 
http://jira.codehaus.org/browse/DOXIA-239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=133205#action_133205
 ] 
Lukas Theussl commented on DOXIA-239:
-------------------------------------

Some links:

[http://www.w3.org/TR/html4/struct/links.html#h-12.2.1]
[http://www.w3.org/TR/html4/appendix/notes.html#non-ascii-chars]

I think encodeId() should replace non-ASCII characters according to the 
recommendation in the second link above (HTML 4.01, Appendix B.2.1): 
represent each character as UTF-8 bytes and escape each byte as %HH.
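
To make the W3C recommendation concrete, here is a minimal sketch of that escaping step. The class and method names (AnchorEscapeSketch, escapeNonAscii) are hypothetical and are not part of Doxia or HtmlTools. The sketch only covers the non-ASCII part and leaves every ASCII character untouched.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Doxia code: escapes non-ASCII characters the way
// HTML 4.01 Appendix B.2.1 suggests (UTF-8 bytes, each written as %HH).
class AnchorEscapeSketch {

    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            int length = Character.charCount(codePoint);
            if (codePoint < 0x80) {
                // ASCII is passed through unchanged in this sketch.
                out.append((char) codePoint);
            } else {
                // Encode the whole code point as UTF-8, then escape each byte.
                byte[] bytes = text.substring(i, i + length)
                                   .getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
            }
            i += length;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "Stra\u00DFe" is "Strasse" written with a sharp s (U+00DF).
        System.out.println(escapeNonAscii("Stra\u00DFe")); // prints Stra%C3%9Fe
    }
}
```

Note that '%' is itself outside the pattern [A-Za-z][A-Za-z0-9:_.-]* quoted below, so encodeId() would still need to decide how to represent the escape character in an id. That choice is left open here.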

> Handle non-ASCII characters in anchors and id's
> -----------------------------------------------
>
>                 Key: DOXIA-239
>                 URL: http://jira.codehaus.org/browse/DOXIA-239
>             Project: Maven Doxia
>          Issue Type: Bug
>          Components: Core, Documentation, Modules, Sink API
>            Reporter: Lukas Theussl
>
> From DOXIA-236:
> The javadoc for the method HtmlTools.encodeId() mentions the pattern 
> [A-Za-z][A-Za-z0-9:_.-]* for its output. To me, this looks like the term 
> "letter" is meant to refer to ASCII characters in this context. However, the 
> employed method Character.isLetter() will classify characters according to 
> the Unicode data file. For instance, the characters "ä" and "ß" are letters 
> in the Unicode sense. encodeId() will pass these through to its output, 
> violating the ASCII-only pattern stated in its javadoc.
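
The mismatch described in the quoted report can be reproduced with a short sketch. The class and method names (LetterCheckSketch, isAsciiLetter, matchesDocumentedPattern) are hypothetical; only Character.isLetter() and the pattern come from the report itself.

```java
// Hypothetical illustration, not Doxia code: contrasts the Unicode-aware
// Character.isLetter() with an ASCII-only letter test.
class LetterCheckSketch {

    // True only for A-Z and a-z, which is what the javadoc pattern implies.
    static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    // Checks a candidate id against the pattern stated in the javadoc.
    static boolean matchesDocumentedPattern(String id) {
        return id.matches("[A-Za-z][A-Za-z0-9:_.-]*");
    }

    public static void main(String[] args) {
        char aUmlaut = '\u00E4'; // a with diaeresis
        char sharpS = '\u00DF';  // sharp s

        System.out.println(Character.isLetter(aUmlaut)); // prints true
        System.out.println(Character.isLetter(sharpS));  // prints true
        System.out.println(isAsciiLetter(aUmlaut));      // prints false
        System.out.println(isAsciiLetter(sharpS));       // prints false

        // An id that keeps these characters does not match the documented pattern.
        System.out.println(matchesDocumentedPattern("gr\u00F6\u00DFe")); // prints false
    }
}
```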


