I've got an issue where the clojure.xml/parse and /emit functions are
not symmetric with respect to how attributes are read and written.
The parser decodes entities (e.g. &amp; -> &), but the emitter
does not re-encode them:
user> (require ['clojure.xml :as 'xml])
nil
user> (xml/emit (xml/parse (org.xml.sax.InputSource.
                            (java.io.StringReader.
                             "<?xml version='1.0' encoding='UTF-8'?><whatever name='Stuff &amp; Things' />"))))
<?xml version='1.0' encoding='UTF-8'?>
<whatever name='Stuff & Things'/>
nil
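To make the consequence concrete, here is a rough check of what happens when the emitted text is fed back to the parser. It is only a sketch, assuming the JDK's default SAX parser; `parse-str`, `emitted` and `reparse-result` are names made up for illustration.

```clojure
(require '[clojure.xml :as xml])

;; Hypothetical helper: parse an XML document held in a string.
(defn parse-str [s]
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

;; Capture what xml/emit prints, so the output can be parsed again.
(def emitted
  (with-out-str
    (xml/emit (parse-str "<whatever name='Stuff &amp; Things'/>"))))

;; The attribute is written with a bare "&", so the emitted text is no
;; longer well-formed and the second parse is expected to throw.
(def reparse-result
  (try
    (parse-str emitted)
    (catch org.xml.sax.SAXException _
      :not-well-formed)))
```

With the current emitter, `reparse-result` should come back as `:not-well-formed` rather than the original element map, so the output of emit cannot be round-tripped through parse.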
As the decoding seems to be done in the Sax parser, I suppose the
easiest way to handle this issue is to re-encode the attributes before
they are written. I wrote this (not saying it's the best way by any
means):
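(defn encode-html-entities [s]
  ;; "&" has to be replaced first, otherwise the ampersands introduced
  ;; by the later replacements would be encoded a second time.
  (loop [ret s
         [[char replacement] & rest] [["&" "&amp;"]
                                      ["'" "&apos;"]
                                      ["\"" "&quot;"]
                                      ["<" "&lt;"]
                                      [">" "&gt;"]]]
    (if (nil? char)
      ret
      (recur (.replaceAll ret char replacement)
             rest))))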
This could be put in clojure.xml/emit-element around the call to (val
attr).
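As a rough illustration of that placement, the sketch below mirrors the general shape of an element emitter and wraps the attribute value in an escaping call. It is not the actual clojure.xml source; `escape-attr` and `emit-element-escaped` are hypothetical names, and the helper assumes `clojure.string/escape` is available (Clojure 1.2 or later).

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: escape the five predefined XML entities.
;; clojure.string/escape looks at each character once, so the order of
;; the map entries does not matter (unlike chained replaceAll calls).
(defn escape-attr [s]
  (str/escape (str s) {\& "&amp;"
                       \' "&apos;"
                       \" "&quot;"
                       \< "&lt;"
                       \> "&gt;"}))

;; Sketch of an emitter in the style of clojure.xml/emit-element.
;; The only point of interest is escape-attr around (val attr).
;; String content is still printed as-is here.
(defn emit-element-escaped [e]
  (if (string? e)
    (println e)
    (do
      (print (str "<" (name (:tag e))))
      (doseq [attr (:attrs e)]
        (print (str " " (name (key attr)) "='" (escape-attr (val attr)) "'")))
      (if (seq (:content e))
        (do
          (println ">")
          (doseq [c (:content e)]
            (emit-element-escaped c))
          (println (str "</" (name (:tag e)) ">")))
        (println "/>")))))
```

For example, `(emit-element-escaped {:tag :whatever :attrs {:name "Stuff & Things"}})` prints `<whatever name='Stuff &amp; Things'/>`, which the parser can read back.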
Thoughts?
-Wayne