CS 61B Lecture 37: Expression Parsing. Parsing expression grammar. Unlike CFGs, PEGs cannot be ambiguous; if a string parses, it has exactly one valid parse tree.
It is conjectured that there exist context-free languages that cannot be parsed by a PEG, but this is not yet proven. PEGs are well suited to parsing computer languages, but not natural languages, where the performance of PEG algorithms is comparable to that of general CFG algorithms such as the Earley algorithm.

Definition

Syntax

Formally, a parsing expression grammar consists of:
    A finite set N of nonterminal symbols.
    A finite set Σ of terminal symbols that is disjoint from N.
    A finite set P of parsing rules.
    An expression e_S termed the starting expression.
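As a rough illustration of why a PEG cannot be ambiguous, here is a minimal sketch of a recognizer for the operators in this definition (sequence, ordered choice, zero-or-more, one-or-more, optional). The tagged-tuple encoding and the names `match`, `parses`, `rules`, and `start` are assumptions made for this example; they do not come from any PEG library. It has no memoization and does not guard against left-recursive rules.

```python
from typing import Optional

# Hypothetical encoding of parsing expressions as tagged tuples:
#   ("lit", "a")     terminal          ("ref", "Name")  nonterminal
#   ("eps",)         empty string      ("seq", e1, e2)  sequence
#   ("alt", e1, e2)  ordered choice    ("star", e)      zero-or-more
#   ("plus", e)      one-or-more       ("opt", e)       optional

def match(expr, rules, text: str, pos: int = 0) -> Optional[int]:
    """Return the position just after the match, or None on failure."""
    tag = expr[0]
    if tag == "lit":
        return pos + len(expr[1]) if text.startswith(expr[1], pos) else None
    if tag == "eps":
        return pos
    if tag == "ref":
        return match(rules[expr[1]], rules, text, pos)
    if tag == "seq":
        mid = match(expr[1], rules, text, pos)
        return None if mid is None else match(expr[2], rules, text, mid)
    if tag == "alt":
        # Ordered choice: the second alternative is tried only if the first
        # fails, so at most one result is ever produced.
        first = match(expr[1], rules, text, pos)
        return first if first is not None else match(expr[2], rules, text, pos)
    if tag == "star":
        while True:
            nxt = match(expr[1], rules, text, pos)
            if nxt is None or nxt == pos:  # stop on failure or on an empty match
                return pos
            pos = nxt
    if tag == "plus":
        return match(("seq", expr[1], ("star", expr[1])), rules, text, pos)
    if tag == "opt":
        return match(("alt", expr[1], ("eps",)), rules, text, pos)
    raise ValueError(f"unknown expression tag: {tag}")

def parses(start, rules, text: str) -> bool:
    """True if the starting expression consumes the whole input."""
    return match(start, rules, text) == len(text)

# P: the parsing rules (Sum <- Num ('+' Num)*, Num <- ('0' / '1')+).
rules = {
    "Num": ("plus", ("alt", ("lit", "0"), ("lit", "1"))),
    "Sum": ("seq", ("ref", "Num"),
            ("star", ("seq", ("lit", "+"), ("ref", "Num")))),
}
start = ("ref", "Sum")  # the starting expression
```

With these rules, `parses(start, rules, "10+1+11")` succeeds and `parses(start, rules, "10+")` fails. Because choice is ordered, `("alt", ("lit", "a"), ("lit", "ab"))` stops after the `a` of `"ab"` and never reconsiders the second alternative, which is where PEGs differ from the unordered alternation of a CFG.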
An atomic parsing expression consists of:
    any terminal symbol,
    any nonterminal symbol, or
    the empty string ε.

Given any existing parsing expressions e, e1, and e2, a new parsing expression can be constructed using the following operators:
    Sequence: e1 e2
    Ordered choice: e1 / e2
    Zero-or-more: e*
    One-or-more: e+
    Optional: e?

JFlex - User's Manual. The Fast Lexical Analyser Generator. Copyright ©1998-2014 by Gerwin Klein, Steve Rowe, and Régis Décamps.
JFlex User's Manual, Version 1.5.1, March 21, 2014

JFlex is a lexical analyser generator for Java written in Java.

Design goals

The main design goals of JFlex are:
    Full Unicode support
    Fast generated scanners
    Fast scanner generation
    Convenient specification syntax
    Platform independence
    JLex compatibility

About this manual

This manual gives a brief but complete description of the tool JFlex. The next section of this manual describes installation procedures for JFlex.

Installing JFlex

Windows

To install JFlex on Windows 95/98/NT/XP, follow these three steps: Unzip the file you downloaded into the directory you want JFlex in (using something like WinZip).

Unix with tar archive

To install JFlex on a Unix system, follow these two steps:

You can verify the integrity of the downloaded file with the MD5 checksum available on the JFlex download page.
Syntactic_recognition. Treetop grammars are written in a custom language based on parsing expression grammars.
Literature on the subject of parsing expression grammars (PEGs) is useful in writing Treetop grammars. PEGs have no separate lexical analyser (since the algorithm has the same time-complexity guarantees as the best lexical analysers), so all whitespace and other lexical niceties (like comments) must be explicitly handled in the grammar. A further benefit is that multiple PEG grammars may be seamlessly composed into a single parser. Treetop grammars look like this:

    require "my_stuff"

    grammar GrammarName
      include Module::Submodule

      rule rule_name
        ...
      end

      rule rule_name
        ...
      end

      ...
    end

The main keywords are:

grammar : This introduces a new grammar. A grammar may be surrounded by one or more nested module statements, which provide a namespace for the generated Ruby parser.
Treetop will emit a module called GrammarName and a parser class called GrammarNameParser (in the module namespace, if specified). The If Works - Talk: Writing a language in 15 minutes. I gave a talk at London Ruby User Group yesterday, based on the work I’ve been doing on Heist, my Scheme interpreter project.
I wrote the core of a basic Scheme interpreter in about 15 minutes as a live-coded demo (well, kind of – the coding was pre-recorded so I could focus on talking), which seemed to go down pretty well. If you missed it (or if you were there and want to watch it again in slow motion), here’s the slides and the video (just code, no narrative (sorry)). (Side note: I think Lisp may be affecting my writing style.)
The slides first: lrug-scheme-15.zip. They are S5-format HTML, introducing the Scheme language features I implement during the talk. Scheme interpreter in 15 minutes from James Coglan on Vimeo. Video is also available from Skills Matter if you want the narrative. Some relevant links: