Hi

Depending on exactly what you want, TagSoup may be of interest to you.
It is lazy, but it doesn't return a tree. It is very tolerant of
errors, and will simply never "fail to parse" something.

http://www-users.cs.york.ac.uk/~ndm/tagsoup/
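
To give a rough idea of what "no tree" means in practice, here is a small
sketch. It assumes the tagsoup package is installed and uses parseTags and
the Tag type from Text.HTML.TagSoup; openTagNames is just a name made up
for this illustration. The parser hands back a flat list of tags, so you
consume as much of it as you need.

```haskell
import Text.HTML.TagSoup (Tag(..), parseTags)

-- Hypothetical helper: names of all opening tags, in document order.
-- parseTags yields a flat list of tags rather than a tree, so this is
-- an ordinary list comprehension over that list.
openTagNames :: String -> [String]
openTagNames html = [name | TagOpen name _ <- parseTags html]
```

For instance, openTagNames "<html><head></head><body>&</body></html>"
gives the opening tag names even though the input contains a stray "&".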

Thanks

Neil

On 5/11/07, Henning Thielemann <[EMAIL PROTECTED]> wrote:

I want to parse and process HTML lazily. I use HXT because its HTML parser
is very liberal. However, it uses Parsec and is thus strict. HaXML has a
so-called lazy parser, but it is not what I consider lazy:

*Text.XML.HaXml.Html.ParseLazy> Text.XML.HaXml.Pretty.document $ htmlParse "text" $ 
"<html><head></head><body>"++undefined++"</body></html>"
*** Exception: Prelude.undefined
*Text.XML.HaXml.Html.ParseLazy> Text.XML.HaXml.Pretty.document $ htmlParse "text" $ 
"<html><head></head><body>&</body></html>"
*** Exception: Expected "</" but found &
  at file text  at line 1 col 26

If it were lazy, it would return some HTML code before the error.
HaXML uses the Polyparse package for parsing, which contains a so-called
lazy parser. However, it has return type (Either String a). That is, to
decide whether the parse was successful, the document has to be
parsed completely.

*Text.ParserCombinators.PolyLazy> runParser (exactly 4 (satisfy Char.isAlpha)) 
("abc104"++undefined)
("*** Exception: Parse.satisfy: failed

If it had return type (String, a), it could return both a partial
value of type 'a' and the error message as a String. It would be even better
if it had some handling for incorrect input texts and returned ([String],
a), where [String] is the type of a list of warnings and error messages
and 'a' is the type of a total value of parse output.
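
A minimal sketch of the ([String], a) shape described above, using only
base. This is a toy and not the API of Polyparse or any other library:
parseDigits and its warning format are invented for illustration. It turns
a string of digits into a list of Ints, skips anything that is not a
digit, and records a warning for each skipped character. Both components
of the result are produced incrementally.

```haskell
import Data.Char (digitToInt, isDigit)

-- Hypothetical toy parser of type String -> ([String], a) with a = [Int].
-- Non-digit characters are skipped and reported as warnings instead of
-- aborting the parse.
parseDigits :: String -> ([String], [Int])
parseDigits = go (0 :: Int)
  where
    go _ [] = ([], [])
    go pos (c:cs)
      | isDigit c = (warns, digitToInt c : vals)
      | otherwise = (warning : warns, vals)
      where
        -- A tuple pattern in a where binding is matched lazily, so the
        -- rest of the input is only inspected when a consumer demands it.
        (warns, vals) = go (pos + 1) cs
        warning = "unexpected " ++ show c ++ " at position " ++ show pos
```

Because the recursion is tied together lazily,
take 3 (snd (parseDigits ("123" ++ undefined))) can be evaluated without
touching the undefined tail, which is the behaviour asked for here.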

Is there some parser of this type? Unfortunately
 http://www.haskell.org/haskellwiki/Applications_and_libraries/Compiler_tools
   does not compare the laziness of the mentioned parsers.
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe