Hi…
An unexpected problem I am currently facing concerns newline and carriage
return characters in XML attribute values.
I first ran into it with good old clojure.xml and hoped to fix it by switching
to clojure.data.xml (tried 0.0.8 and 0.2.0-alpha3), but that did not help.
When I read in some external XML with escaped CR/LF (&#13;&#10;) in an
attribute value, it is parsed correctly, but when I write it out again the
\r\n appears in the output unescaped, which breaks the attribute value on the
next read.
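To make the "breaking on next read" part concrete, here is a minimal sketch that uses only the JDK DOM parser through interop, so it does not depend on clojure.data.xml. The helper name attr-value is made up for this illustration. It relies on the attribute-value normalization rules of XML 1.0, under which a literal line break in an attribute value is read back as a space, while character references survive:

```clojure
(import '(javax.xml.parsers DocumentBuilderFactory)
        '(java.io ByteArrayInputStream))

(defn attr-value
  "Hypothetical helper: parses xml-str with the JDK DOM parser and returns
  the value of attribute attr-name on the root element."
  [^String xml-str ^String attr-name]
  (let [builder (.newDocumentBuilder (DocumentBuilderFactory/newInstance))
        doc     (.parse builder (ByteArrayInputStream. (.getBytes xml-str "UTF-8")))]
    (.getAttribute (.getDocumentElement doc) attr-name)))

;; Literal CR LF inside the attribute: the line break is normalized away.
(attr-value "<foo bar=\"Baz\r\nquux\"/>" "bar")
;; => "Baz quux"

;; Character references: CR and LF are preserved.
(attr-value "<foo bar=\"Baz&#13;&#10;quux\"/>" "bar")
;; => "Baz\r\nquux"
```

So the unescaped output is still well-formed XML, but the attribute value no longer round-trips.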
Here is some real stuff showing the issue:
(require '[clojure.data.xml :as xml]
         '[clojure.string :as str])

;; CR/LF are not escaped:
(prn (xml/emit-str (xml/element :foo {:bar "Baz\r\nquux"})))
;; => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo bar=\"Baz\r\nquux\"></foo>"

;; If we escape manually, the ampersand itself gets escaped:
(prn (xml/emit-str (xml/element :foo {:bar "Baz&#13;&#10;quux"})))
;; => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo bar=\"Baz&amp;#13;&amp;#10;quux\"></foo>"

;; We can fix it after the fact with a string replacement, but this feels
;; really hacky:
(prn (-> (xml/emit-str (xml/element :foo {:bar "Baz&#13;&#10;quux"}))
         (str/replace "&amp;#" "&#")))
;; => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo bar=\"Baz&#13;&#10;quux\"></foo>"

;; although it is then reparsed correctly:
(prn (-> (xml/emit-str (xml/element :foo {:bar "Baz&#13;&#10;quux"}))
         (str/replace "&amp;#" "&#")
         (xml/parse-str)))
;; => #clojure.data.xml.Element{:tag :foo, :attrs {:bar "Baz\r\nquux"}, :content ()}
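For comparison, this is roughly the escaping an attribute writer would have to apply for the round trip to work. It is only a sketch: escape-attr is a hypothetical name, not part of clojure.data.xml, and it is only useful when attributes are serialized by hand, because passing its result to xml/element would get the ampersands escaped again as shown above.

```clojure
(require '[clojure.string :as str])

(defn escape-attr
  "Hypothetical helper: escapes a string for use inside a double-quoted
  XML attribute value, writing tab, LF and CR as character references."
  [s]
  ;; str/escape looks up each character once, so the ampersands in the
  ;; replacement strings are not escaped a second time.
  (str/escape s {\&       "&amp;"
                 \<       "&lt;"
                 \"       "&quot;"
                 \tab     "&#9;"
                 \newline "&#10;"
                 \return  "&#13;"}))

(escape-attr "Baz\r\nquux")
;; => "Baz&#13;&#10;quux"
```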
Looking into the emit code, it seems like this is a Java XMLStreamWriter
issue?!
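One way to check that suspicion without clojure.data.xml in the picture is to drive the StAX writer directly through interop. This is only a sketch (emit-attr is a made-up name), and I am deliberately not showing an expected result, since whether CR/LF come out escaped may depend on which StAX implementation is on the classpath:

```clojure
(import '(javax.xml.stream XMLOutputFactory XMLStreamWriter)
        '(java.io StringWriter Writer))

(defn emit-attr
  "Hypothetical helper: writes <foo bar=\"...\"> with a plain
  XMLStreamWriter and returns the serialized string."
  [^String value]
  (let [sw (StringWriter.)
        ^XMLStreamWriter w (.createXMLStreamWriter (XMLOutputFactory/newInstance)
                                                   ^Writer sw)]
    (.writeStartElement w "foo")
    (.writeAttribute w "bar" value)
    (.writeEndElement w)
    (.close w)
    (str sw)))

;; Inspect how the attribute value is written:
(prn (emit-attr "Baz\r\nquux"))
```

If the raw \r\n shows up here as well, the behaviour comes from the stream writer rather than from the Clojure emit code.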
Any idea how to fix this in a clean way?
Ciao
…Jochen