Hi Antonio,

thanks for your reply!

Antonio Gallardo wrote:
We hit the same issue some years ago and we found a more pragmatic solution:

In org.apache.cocoon.components.serializers.encoding.XHTMLEncoder add
the line marked with a + sign:


    private static final char ENCODINGS[][][] = {
+    { { 39 } , "&#39;".toCharArray() },
       { { 160 } , " ".toCharArray() },

Actually this patch is already in the 2.1 branch :)

Unfortunately it doesn't work for me. The XHTML source then contains the NCR (&#39;) for the ' character, which also causes a JavaScript error.

To make it work, it would have to look like this:

    private static final char ENCODINGS[][][] = {
        { { 34 } , "\"".toCharArray() },
        { { 39 } , "'".toCharArray() },

But this contradicts the very purpose of the XHTMLEncoder, doesn't it?
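
To illustrate the difference, here is a minimal standalone sketch. It is not the actual Cocoon code: the class EncodingTableSketch, its methods, and the fallback escaping are made up for this mail, and only the row shape { { code }, replacement } is taken from the patch. It assumes that a table row takes precedence over the default escaping.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a table-driven encoder; NOT the real XHTMLEncoder.
class EncodingTableSketch {

    // Row shape as in the patch: { { character code }, replacement }.
    static final char[][][] NCR_TABLE = {
        { { 39 }, "&#39;".toCharArray() },
    };

    // Variant that maps both quote characters to themselves.
    static final char[][][] LITERAL_TABLE = {
        { { 34 }, "\"".toCharArray() },
        { { 39 }, "'".toCharArray() },
    };

    // Assumed default escaping for characters without a table row.
    static String fallback(char c) {
        switch (c) {
            case '"': return "&quot;";
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '&': return "&amp;";
            default:  return String.valueOf(c);
        }
    }

    // Encodes text; a matching table row wins over the fallback.
    static String encode(String text, char[][][] table) {
        Map<Character, String> rows = new HashMap<>();
        for (char[][] row : table) {
            rows.put(row[0][0], new String(row[1]));
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String replacement = rows.get(c);
            out.append(replacement != null ? replacement : fallback(c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String script = "alert('a' + \"b\");";
        // prints: alert(&#39;a&#39; + &quot;b&quot;);
        System.out.println(encode(script, NCR_TABLE));
        // prints: alert('a' + "b");
        System.out.println(encode(script, LITERAL_TABLE));
    }
}
```

With the NCR row the inline script still ends up with references in place of its quotes; only the literal rows leave it untouched, and at that point the table no longer escapes anything for those characters.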

-- Andreas




See:
http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Entities_representing_special_characters_in_XHTML

Please let me know if this fixes the issue; I will gladly commit the fix.

Best Regards,

Antonio Gallardo.


Andreas Hartmann wrote:
Hi Cocoon devs,

this issue has already been discussed several times, e.g. [1], but
AFAIK has not been resolved yet.

The XHTMLSerializer, or, more specifically, the XHTMLEncoder, from the
serializers block in Cocoon 2.1.x escapes all characters with a
corresponding HTML 4.0 character entity reference into this entity
reference. This causes issues with inline JavaScript, since e.g. the
double quotes are transformed to &quot;, which causes a JavaScript
parsing error. Another minor negative effect is the increased document
size.

If I understand the W3C correctly, see e.g. [2], the recommended
approach is to use the character set of the encoding as far as possible,
and use escapes only in exceptional circumstances. I didn't find a
reason why the XHTMLSerializer uses escapes, but I suspect that it is
related to browser compatibility issues.

Do you think it would make sense to make this behaviour configurable,
e.g.

  <use-entities>true|false</use-entities>

Does the XHTMLSerializer in Cocoon 2.2 show a different behaviour?

TIA for any comments!

-- Andreas


[1]
http://www.nabble.com/Problem-with-XHTMLSerializers-to1311360.html#a1311360

[2] http://www.w3.org/International/tutorials/tutorial-char-enc/






--
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
Tel.: +41 (0) 43 818 57 01
