[ https://issues.apache.org/jira/browse/XERCESC-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Blackton updated XERCESC-2120:
-------------------------------------
Description:
When attempting to write an XML document containing valid UTF-16 surrogate
pairs, an error occurs during validation, which causes the write to fail.
This issue appears to have been introduced with
https://issues.apache.org/jira/browse/XERCESC-1854 in the following commit:
http://svn.apache.org/viewvc/xerces/c/trunk/src/xercesc/dom/impl/DOMLSSerializerImpl.cpp?r1=768978&r2=1226891.
I have supplied a reproducer and a potential patch. The string validator
should be responsible for determining whether a code point is part of a
surrogate pair. However, I would also argue that this may not be the right
place to perform string validation, because a failure here leaves the output
document in an inconsistent (half-written) state.
was:
When attempting to write an XML document containing valid UTF-16 surrogate
pairs, an error occurs during validation, which causes the write to fail.
This issue appears to have been introduced with
https://issues.apache.org/jira/browse/XERCESC-1854 in the following commit:
http://svn.apache.org/viewvc/xerces/c/trunk/src/xercesc/dom/impl/DOMLSSerializerImpl.cpp?r1=768978&r2=1226891.
I have supplied a reproducer and a potential patch. The string validator
should be responsible for determining whether a code point is part of a
surrogate pair.
> DOM Serialization does not correctly validate Surrogate Pairs
> -------------------------------------------------------------
>
> Key: XERCESC-2120
> URL: https://issues.apache.org/jira/browse/XERCESC-2120
> Project: Xerces-C++
> Issue Type: Bug
> Components: DOM
> Affects Versions: 3.2.0
> Reporter: Andrew Blackton
> Attachments: DOMCharacterValidationTest.cpp, DomStringValidation.patch
>
>
> When attempting to write an XML document containing valid UTF-16 surrogate
> pairs, an error occurs during validation, which causes the write to fail.
> This issue appears to have been introduced with
> https://issues.apache.org/jira/browse/XERCESC-1854 in the following commit:
> http://svn.apache.org/viewvc/xerces/c/trunk/src/xercesc/dom/impl/DOMLSSerializerImpl.cpp?r1=768978&r2=1226891.
> I have supplied a reproducer and a potential patch. The string validator
> should be responsible for determining whether a code point is part of a
> surrogate pair. However, I would also argue that this may not be the right
> place to perform string validation, because a failure here leaves the output
> document in an inconsistent (half-written) state.
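
For illustration only, the surrogate-pair check that the report asks the
string validator to perform might look roughly like the sketch below. This is
not the attached DomStringValidation.patch and not Xerces-C++ code: the
function names are hypothetical, and char16_t is assumed to stand in for
XMLCh. The character ranges follow the XML 1.0 Char production.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration; none of these are Xerces-C++ APIs.
inline bool isHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool isLowSurrogate(char16_t c)  { return c >= 0xDC00 && c <= 0xDFFF; }

// XML 1.0 Char production, restricted to non-surrogate BMP code units.
inline bool isXml10BmpChar(char16_t c)
{
    return c == 0x9 || c == 0xA || c == 0xD
        || (c >= 0x20 && c <= 0xD7FF)
        || (c >= 0xE000 && c <= 0xFFFD);
}

// Returns true when every code unit is either a legal XML 1.0 BMP character
// or half of a well-formed surrogate pair (high surrogate followed by low).
inline bool isValidXmlUtf16(const char16_t* text, std::size_t length)
{
    assert(text != nullptr || length == 0);
    for (std::size_t i = 0; i < length; ++i) {
        const char16_t c = text[i];
        if (isHighSurrogate(c)) {
            // A high surrogate must be followed immediately by a low one.
            if (i + 1 >= length || !isLowSurrogate(text[i + 1]))
                return false;
            // A well-formed pair encodes U+10000..U+10FFFF, which XML 1.0
            // allows, so the pair is accepted and both units are consumed.
            ++i;
        } else if (isLowSurrogate(c)) {
            return false;  // low surrogate with no preceding high surrogate
        } else if (!isXml10BmpChar(c)) {
            return false;  // e.g. a disallowed control character
        }
    }
    return true;
}
```

Running a check of this kind over the whole string before any bytes are
emitted is one possible way to avoid the half-written output mentioned above.
Whether that fits the serializer's design is a question for the maintainers.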