Henning Thielemann wrote:
> I want to parse and process HTML lazily. I use HXT because the HTML parser
> is very liberal. However, it uses Parsec and is thus strict. HaXML has a
> so-called lazy parser, but it is not what I consider lazy:
>
> *Text.XML.HaXml.Html.ParseLazy> Text.XML.HaXml.Pretty.document $ htmlParse "text" $
>     "<html><head></head><body>"++undefined++"</body></html>"
> *** Exception: Prelude.undefined
>
> *Text.XML.HaXml.Html.ParseLazy> Text.XML.HaXml.Pretty.document $ htmlParse "text" $
>     "<html><head></head><body>&</body></html>"
> *** Exception: Expected "</" but found &
>   at file text at line 1 col 26
>
> If it were lazy, it would return some HTML code before the error.

Are you sure that it is the parser that is not lazy, and not the
pretty printer that is overly strict?
From the evidence above, the parser could be returning some results
before the error, with the pretty printer strictly slurping them all up
to the error and only then dying.
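
To illustrate the distinction, here is a minimal sketch that uses only base
and does not touch HaXml at all. The Token type and the tokens function are
hypothetical stand-ins for a lazy parser, not HaXml's API. The same lazy
producer looks lazy or strict depending on how its result is consumed:

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Toy token type: a stand-in for a parser's output (not HaXml's types).
data Token = Tag String | Text String
  deriving (Eq, Show)

-- A lazy tokenizer: each token depends only on a finite prefix of the input.
tokens :: String -> [Token]
tokens [] = []
tokens ('<' : rest) =
  let (name, rest') = break (== '>') rest
  in Tag name : tokens (drop 1 rest')
tokens s =
  let (txt, rest) = break (== '<') s
  in Text txt : tokens rest

main :: IO ()
main = do
  let input = "<html><head></head><body>" ++ undefined ++ "</body></html>"
  -- Lazy consumer: only the first four tokens are demanded,
  -- so the undefined part of the input is never reached.
  print (take 4 (tokens input))
  -- Strict consumer: length walks the whole list and hits the undefined.
  r <- try (evaluate (length (tokens input))) :: IO (Either SomeException Int)
  putStrLn (either (const "strict consumer hit the error") show r)
```

The first print succeeds and the second consumer fails, even though the
producer is the same in both cases. To check the real parser the same way,
one would have to inspect the result of htmlParse with a consumer that
demands only a prefix of the document, rather than with the pretty printer.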
Jules
_______________________________________________
Haskell-Cafe mailing list
http://www.haskell.org/mailman/listinfo/haskell-cafe