I took your advice and limited it to the parser generator portion. I also made the schedule much more specific and moved the grammar DSL bits near the end. I submitted the updated proposal to Melange at http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/tylercurtis/1 . I'd still love any feedback/suggestions/questions/criticisms.
-- Tyler Curtis On Tue, Apr 5, 2011 at 12:29 PM, Andrew Whitworth <[email protected]> wrote: > I'm excited to see you interested in participating in GSoC again, > Tyler. Thanks for sending in this proposal! The application deadline > is on Friday (Hint to all other GSoC students who haven't submitted > yet!) > > The project you've picked here is very big. In fact, I would suggest > it is far too large for a single student to complete over the summer. > I don't doubt that you've got some MAD SKILLZ, but this is a hell of a > lot of work for anybody. It's especially daunting for a student who > hasn't had much prior exposure to lexing with finite automatons or > parsing with LALR generators. > > I recommend you pick one of the two. Either create a lex-like lexer > which breaks a string up into a sequence of tokens, or create an LALR > parser-generator which operates on a sequence of input tokens. Doing > both is way too much work and is a recipe for failure. > > I suggest you pick one or the other (both would be equally valuable), > and focus a proposal on that. > > --Andrew Whitworth > > > > On Tue, Apr 5, 2011 at 1:33 AM, Tyler Curtis <[email protected]> wrote: >> Here's a draft of my proposal for a LALR Parser Generator and Lexical >> Analyzer Generator for Parrot. I'd welcome any feedback or questions. >> >> Personal Information >> >> Name: Tyler Curtis >> Email: [email protected] >> IRC: tcurtis >> >> LALR Parser and Lexer Generators for Parrot >> >> Abstract >> >> An LALR parser generator and a lexical analyzer generator will be >> developed that will convert declarative specifications into Winxed >> code to parse and lexically analyzer input. >> >> Benefits to the Parrot VM and Open Source Community >> >> Parrot currently has a very convenient to use set of tools for the >> development of compilers. However, the Perl 6 grammars used by PCT >> have some performance disadvantages compared to LALR(1) grammars, when >> they can be expressed as LALR(1) grammars. They are also less familiar >> to users of conventional parsing tools, such as lex and yacc. The >> creation of more conventional parsing tools will make it easier for >> compiler writers to apply their prior knowledge to targetting Parrot. >> They could even potentially easily adapt portions of their existing >> parsers to use the developed tools. LALR(1) parser generators also >> make expressing some grammars easier. It will also be provider a more >> language-agnostic option for compiler development on Parrot. >> >> Deliverables >> >> A LALR parser generator executable which translates a grammar >> specification into a parser in Winxed. >> A lexical analyzer generator to tokenize the input to the parser. >> Some example grammars. >> A test suite. >> >> Project Details >> >> LALR(Look-Ahead Left-to-Right) parsers are a variation of >> LR(Left-to-Right) parsers which require much smaller parsing tables. >> They merge states from an LR parsing table with identical sets of >> first components. They take time linear in the size of their input to >> run, which is quite fast. However, they are also very complicated. >> Writing them by hand is not practical. Instead, a number of LALR >> parser generators have been written, which convert a declarative >> specification of an LALR grammar (possibly including semantic actions >> to build a parse tree or perform other actions in the process of >> parsing) into a LALR parser. >> >> This is a rather convenient way of writing parsers. There are LALR >> parser generators producing parsers in a great number of programming >> languages now. Parrot has a compiler toolkit which uses Perl 6 regexes >> to express a language's grammar; however, Parsing expression grammar >> (PEGs, which are analogous to Perl 6 regexes) parsers require either >> more memory or more time to parse input than LALR parsers. They also >> do not handle left recursive grammars as well as LALR parsers. >> >> My project would consist of developing a lexical analyzer generator >> and parser generator that generate lexical analyzers and parsers, >> respectively, as Parrot classes. The parsers will present an interface >> similar to HLL::Grammar objects(with parse and parsefile methods). >> Instead of embedding semantic actions into the grammar specification, >> the parse/parsefile methods will accept an optional "actions" named >> parameter as HLL::Grammar does. The lexer generator would create >> classes with an analogous interface, also using an action parameter. >> >> Some sources I intend to read about the theory of LALR(1) parsing: >> The Dragon Book >> Parsing Techniques - A Practical Guide >> "Practical LR9k) Translators" by Frank DeRemer. >> "Efficient Computation of LALR(1) Look-Ahead Sets" by Frank >> DeRemer >> and Thomas Pennello. >> "On the Translation of Languages from Left to Right" by >> Donald Knuth. >> >> Project Schedule >> >> The portion of the schedule prior to June 11 is intended to be less >> demanding in time because I will still be in classes until then. >> >> April 8 - May 22: Study the theory behind LALR(1) parsing and >> deterministic finite automata (for the lexical analyzer generator). >> Look at the code of existing LALR parser generators. Begin designing >> the lexer and grammar specification languages. >> >> May 23 - May 31: Write a parser for the lexical analyzer >> specifications. >> >> May 31 - June 11: Implement generation of a lexical analyzer from a >> parsed specification. >> >> June 12 - June 18: Write documentation and tests for the lexical >> analyzer generator. >> >> June 19 - June 25: Further design and research for the parser >> generator, including specifying the schedule of the rest of the >> summerer in more detail. >> >> June 26 - July 2: Implement the parser for the grammar specifications. >> >> July 3 - July 10: Begin work on the parser generator. >> >> July 11 - July 15: Work on mid-term evaluation. >> >> July 16 - August 6: Complete implementation of the parser generator. >> >> August 7 - August 22: Write more tests and documentation for the >> parser generator. >> >> August 22: Hard "pencils down" date. >> >> August 22 - August 26: Work on final evaluation. >> >> References and Likely Mentors >> >> I have spoken with Whiteknight, and he is interested in mentoring. >> >> License >> >> Artistic License 2.0 >> >> Bio >> >> I am a first-year student at the University of Chicago studying >> Computer Science and Mathematics. I did Google Summer of Code last >> year for Parrot. Development tools, particularly compilers, parsers, >> and object systems, are my primary interests in Computer Science. I do >> not have much previous knowledge of the theory of LALR parsing, but, >> if I am accepted, I plan to spend the remaining time before GSoC >> begins studying it. >> >> Eligibility: I am an eligible student. >> >> -- >> Tyler Curtis >> _______________________________________________ >> http://lists.parrot.org/mailman/listinfo/parrot-dev >> > _______________________________________________ http://lists.parrot.org/mailman/listinfo/parrot-dev
