Hi Joost…
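hmm, unfortunately that is not what happens in practice. A literal newline or
carriage return in an attribute value is normalized to a space when the
document is parsed (the XML spec calls this attribute-value normalization), so
a literal \r\n comes back as a single space and the escaped and unescaped
versions are not equivalent. This is also what happens here with
clojure.data.xml: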
(prn (-> (xml/emit-str (xml/element :foo {:bar "Baz\r\nquux"}))
         (xml/parse-str)))
;; => #xml/element{:tag :foo, :attrs {:bar "Baz quux"}}
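
To make the difference concrete, here is a minimal sketch (not a definitive fix) that parses both forms and then round-trips a value through the post-processing workaround quoted below. It assumes clojure.data.xml is on the classpath; bar-attr, escape-attr-whitespace and emit-str-keeping-whitespace are hypothetical helper names, not part of the library.

```clojure
(require '[clojure.data.xml :as xml]
         '[clojure.string :as str])

;; Literal CR/LF inside the attribute value.
(def literal-doc "<foo bar=\"Baz\r\nquux\"/>")

;; The same value with CR/LF written as character references.
(def escaped-doc "<foo bar=\"Baz&#13;&#10;quux\"/>")

(defn bar-attr
  "Hypothetical helper: parse an XML string and return the root's :bar attribute."
  [s]
  (get-in (xml/parse-str s) [:attrs :bar]))

(prn (bar-attr literal-doc))  ;; => "Baz quux"      (normalized by the parser)
(prn (bar-attr escaped-doc))  ;; => "Baz\r\nquux"   (character references survive)

(defn escape-attr-whitespace
  "Hypothetical helper: replace tab, LF and CR with XML character references."
  [s]
  (str/escape s {\tab "&#9;" \newline "&#10;" \return "&#13;"}))

(defn emit-str-keeping-whitespace
  "Hypothetical helper: emit the element, then undo the ampersand escaping
  for exactly the three whitespace references produced above."
  [element]
  (-> (xml/emit-str element)
      (str/replace #"&amp;#(9|10|13);" "&#$1;")))

;; Round trip: the CR/LF in the attribute value survives emit + parse.
(prn (-> (xml/element :foo {:bar (escape-attr-whitespace "Baz\r\nquux")})
         emit-str-keeping-whitespace
         bar-attr))
;; => "Baz\r\nquux"
```

Restricting the replacement to the three whitespace references is a bit narrower than replacing every "&amp;#", but it is still a string hack: an attribute that really contained the literal text &#13; would be rewritten as well.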
Ciao
…Jochen
On Thursday, November 9, 2017 at 11:56:18 AM UTC+1, Joost wrote:
>
> Hi Jochen
>
> Since newlines and crs are allowed in attribute values, you don't need to
> escape them. The correctly escaped version and the unescaped version of the
> XML are exactly equivalent.
>
> Joost.
>
> On Thursday, November 9, 2017 at 9:48:07 AM UTC+1, Jochen wrote:
>>
>> Hi…
>>
>> an unexpected problem that I currently face is about newline and return
>> characters in xml attribute values.
>>
>> I first found it in good old clojure.xml and thought to fix it with
>> clojure.data.xml (0.0.8 and 0.2.0-alpha3 tried), but it did not help.
>>
>> When I read in some external xml with escaped cr/lf (&#13;&#10;) in an
>> attribute value, it is parsed correctly, but when I write it out again the
>> \r\n appears in the output unescaped, breaking the attribute value on next
>> read.
>>
>> Here is some real stuff showing the issue:
>> ;; cr lf are not escaped:
>> (prn (xml/emit-str (xml/element :foo {:bar "Baz\r\nquux"})))
>> ;; => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo bar=\"Baz\r\nquux\"></foo>"
>>
>> ;; if we escape manually, the ampersand is escaped:
>> (prn (xml/emit-str (xml/element :foo {:bar "Baz&#13;&#10;quux"})))
>> ;; => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo bar=\"Baz&amp;#13;&amp;#10;quux\"></foo>"
>>
>> ;; we can fix after the fact with some string replacer, but this feels
>> ;; really hacky:
>> (prn (-> (xml/emit-str (xml/element :foo {:bar "Baz&#13;&#10;quux"}))
>>          (str/replace "&amp;#" "&#")))
>> ;; => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo bar=\"Baz&#13;&#10;quux\"></foo>"
>>
>> ;; although it is then reparsed correctly:
>> (prn (-> (xml/emit-str (xml/element :foo {:bar "Baz&#13;&#10;quux"}))
>>          (str/replace "&amp;#" "&#")
>>          (xml/parse-str)))
>> ;; => #clojure.data.xml.Element{:tag :foo, :attrs {:bar "Baz\r\nquux"}, :content ()}
>>
>> Looking into the emit code, it seems like this is a Java XMLStreamWriter
>> issue?!
>>
>> Any idea how to fix this in a clean way?
>>
>> Ciao
>>
>> …Jochen
>>
>>
>
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en