On Wed, Aug 12, 2009 at 16:22, Meikel Brandmeyer<[email protected]> wrote:
>
> Hi Stephen,
>
> On Aug 12, 3:57 pm, "Stephen C. Gilardi" <[email protected]> wrote:
>> > I have to parse some XML files with c.x/parse. However the files
>> > contain UTF-8 characters, which end up as '?' after being parsed by
>> > c.x/parse. Is there some possibility to correctly parse the files? I
>> > suspect there is some settings somewhere in my Clojure/JVM/System
>> > which makes the whole thing fail, but I have no clue how to find out
>> > where to look...
>
>> Does this help get you going:
>>
>> http://groups.google.com/group/clojure/msg/0f6dc9ec66b852fe
>
> Thanks for the tip. Unfortunately, it doesn't help. Now everything is
> completely chopped to pieces.
>
>> More generally, you should also be able to specify the encoding by
>> arranging for an InputStreamReader with a properly specified
>> "charset" (like "UTF8") to wrap your input byte source.
>
> I tried, but c.x/parse only accepts an InputStream. I didn't find
> a way to set the charset and that one...

You shouldn't have to. XML is funny that way: an InputStream is a
stream of *bytes*, not characters, so the XML parser does the decoding
itself. It will parse the bytes as UTF-8 unless it finds a byte order
mark or an <?xml ... ?> header specifying some other encoding. So in
your case it should "just work", unless the files you believe to be
UTF-8 aren't actually UTF-8.

// Ben
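
To make the point about bytes versus characters concrete, here is a minimal sketch rather than a diagnosis of your files. It assumes clojure.xml/parse is handed a plain InputStream over an in-memory byte array. The sample document and the names sample-body, parse-bytes, utf8-text and latin1-text are made up for illustration. The non-ASCII characters are written as \u escapes so the example does not depend on the encoding of the source file itself.

```clojure
(require '[clojure.xml :as xml])
(import '(java.io ByteArrayInputStream))

;; Hypothetical sample content: "Grüße", written with \u escapes.
(def sample-body "<greeting>Gr\u00fc\u00dfe</greeting>")

(defn parse-bytes
  "Encodes the string s with the named charset and parses the resulting
  bytes through an InputStream, so the XML parser does the decoding."
  [^String s ^String charset]
  (xml/parse (ByteArrayInputStream. (.getBytes s charset))))

;; No encoding declared: the parser should decode the bytes as UTF-8.
(def utf8-text
  (first (:content
          (parse-bytes (str "<?xml version=\"1.0\"?>" sample-body)
                       "UTF-8"))))

;; Encoding declared in the header: the parser should follow it instead.
(def latin1-text
  (first (:content
          (parse-bytes (str "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                            sample-body)
                       "ISO-8859-1"))))

;; Both should come back as the same string.
(println (= utf8-text latin1-text))
```

If both values match here but your real files still come out wrong, the bytes in those files (or a mismatched <?xml ... ?> header) would be the next thing to check.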
