Re: [Haskell-cafe] Techniques for ensuring parser correctness?

S. Doaitse Swierstra Mon, 26 Jul 2010 00:03:31 -0700

On 26 jul 2010, at 03:51, Jason Dagit wrote:

> Hello,
> 
> I find that parser correctness is often hard to verify.  Therefore, I'm 
> interested in techniques that others have used successfully, especially with 
> Haskell.


It seems to me that you are not so much trying to verify parsers, but more 
specifically Parsec parsers. Since in Parsec-based parsers you control the 
backtracking explicitly such parsers can get very complicated semantics. Now 
the question arises: what does it mean for a (Parsec) parser to be correct? Do 
you have another description of the  language which is to be recognised, e.g. a 
context-free grammar. Only then can you give meaning to the word "correctness".

In general I think that the more your parser combinators deviate from 
context-free grammars in terms of expressiveness, the more problems you will 
encounter. If you make heavy use of the monadic part, you will not only have to 
prove the correctness of static parsers, but even of parsers which are 
generated dynamically. If you use the backtrack-controlling features, your 
proofs will become even more complicated, since it is unlikely that your more 
abstract formalism in which you have specified your language does not have a 
similar construct: here comes in about 50 years on research on parsing 
techniques and grammar analysis. If your grammar is e.g. LL(1) then you can 
verify that some of the back-tracking-controlling features in your Parser 
parser have been used in a sound way, i.e., that you will be able to parse any 
sentence that your grammar describes.

If you have a context-free grammar, and you want to be relatively sure that the 
parser is correct and you do not want to go through large verification efforts 
I suggest you use the uu-parsinglib; the only restriction there is is that your 
grammar should fulfill certain modest "well-formedness" criteria, such as being 
non-left-recursive and non-ambiguous. Then  the semantics of the combinators 
are exactly what you want, i.e. your parsers and your grammars are isomorphic. 
If you have however an "incorrect formal specification", i.e., a specification 
which contains ambiguous non-terminals like p* constructs where p can reduce to 
an empty string   things break. The first problem one is not recognised and 
will lead to a non-terminating parser, whereas the second problem is detected 
by the grammars analysing themselves while being used, and leading to a 
run-time error message once you reach that part of the grammar during parsing.

If you insist on using left-recursive parsers you may use the left-corner 
transform from the 

http://hackage.haskell.org/packages/archive/ChristmasTree/0.2/doc/html/Text-GRead-Transformations-LeftCorner.html

package, or use a parser generator like happy; parser generators usually do 
some form of analysis (i.e. proving properties), which captures many mistakes 
in the design of a language.

Furthermore you may take a look at: 
@inproceedings{DBLP:conf/mpc/BrinkHL10,
  author    = {Kasper Brink and
               Stefan Holdermans and
               Andres L{\"o}h},
  title     = {Dependently Typed Grammars},
  booktitle = {MPC},
  year      = {2010},
  pages     = {58-79},
  ee        = {http://dx.doi.org/10.1007/978-3-642-13321-3_6},
  crossref  = {DBLP:conf/mpc/2010},
  bibsource = {DBLP, http://dblp.uni-trier.de}

Doaitse Swierstra










> 
> Techniques I'm aware of:
>   * Round trip checks: Generate a datastructure, render as a string, parse 
> back, and compare.  Quickcheck can be used to automate this.
>   * Fuzz testing:  What tools exist to help me?
>   * Formal verification: Has anyone been using this with Haskell parsers?  
> Other than general theorem provers, say Isabelle, what tools exist?
> 
> My specific need:
> The immediate challenge I have is that I'm modifying the parser that Darcs 
> uses and we would like to improve the parser's test suite as well.  The 
> parser abstraction used in this case follows parsec's API.  Left to my own 
> devices I would use round trip checks, written with quickcheck, for this 
> exercise.  Because we're using a parsec style parser, I don't have a nice 
> neat grammar handy.
> 
> Thanks in advance for any advice you have!
> 
> Thanks,
> Jason
> _______________________________________________
> Haskell-Cafe mailing list
> [email protected]
> http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Techniques for ensuring parser correctness?

Reply via email to