
Python


Spynner - Programmatic web browsing module for Python with JavaScript/AJAX support.

Simple Top-Down Parsing in Python. Fredrik Lundh | July 2008

In Simple Iterator-based Parsing, I described a way to write simple recursive-descent parsers in Python, by passing around the current token and a token generator function. A recursive-descent parser consists of a series of functions, usually one for each grammar rule. Such parsers are easy to write, and are reasonably efficient, as long as the grammar is “prefix-heavy”; that is, as long as it’s usually sufficient to look at a token at the beginning of a construct to figure out which parser function to call.

For example, if you’re parsing Python code, you can identify most statements simply by looking at the first token. However, recursive-descent is less efficient for expression syntaxes, especially for languages with lots of operators at different precedence levels. For example, here’s an excerpt from Python’s expression grammar. In the early seventies, Vaughan Pratt published an elegant improvement to recursive-descent in his paper Top-down Operator Precedence.
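To make the idea of operator precedence parsing concrete, here is a minimal sketch of a Pratt-style parser for arithmetic. It is not the code from the article: the names `BINDING_POWER`, `Parser`, and `evaluate` are hypothetical, the binding-power values are arbitrary choices that only need to be ordered correctly, and the sketch evaluates the expression directly instead of building a syntax tree.

```python
import re

# Numbers, the two-character "**" operator, or any other single character.
TOKEN_RE = re.compile(r"\s*(\d+(?:\.\d+)?|\*\*|\S)")

# Left binding powers (assumed values): a higher number binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20, "**": 30}
END = "(end)"


class Parser:
    def __init__(self, source):
        self.tokens = iter(TOKEN_RE.findall(source) + [END])
        self.token = next(self.tokens)

    def advance(self, expected=None):
        if expected is not None and self.token != expected:
            raise SyntaxError(f"expected {expected!r}, got {self.token!r}")
        self.token = next(self.tokens, END)

    def expression(self, rbp=0):
        """Parse while the next operator binds tighter than rbp."""
        token = self.token
        self.advance()
        left = self.prefix(token)
        while rbp < BINDING_POWER.get(self.token, 0):
            token = self.token
            self.advance()
            left = self.infix(token, left)
        return left

    def prefix(self, token):
        """Handle a token that starts an expression (Pratt's 'nud')."""
        if token == "(":
            value = self.expression()
            self.advance(")")
            return value
        if token == "-":
            # Unary minus: tighter than "*", looser than "**".
            return -self.expression(25)
        try:
            return float(token) if "." in token else int(token)
        except ValueError:
            raise SyntaxError(f"unexpected token {token!r}") from None

    def infix(self, token, left):
        """Handle an operator that follows a left operand (Pratt's 'led')."""
        bp = BINDING_POWER[token]
        if token == "**":
            # Right-associative: parse the right side with a lower power.
            return left ** self.expression(bp - 1)
        right = self.expression(bp)
        if token == "+":
            return left + right
        if token == "-":
            return left - right
        if token == "*":
            return left * right
        return left / right


def evaluate(source):
    parser = Parser(source)
    value = parser.expression()
    if parser.token != END:
        raise SyntaxError(f"unexpected token {parser.token!r}")
    return value
```

A single `expression` loop replaces the one-function-per-precedence-level structure of plain recursive descent; adding an operator means adding one table entry and one branch. For instance, `evaluate("1 + 2 * 3")` and `evaluate("(1 + 2) * 3")` group differently purely because of the binding powers.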

Simple Iterator-based Parsing. Fredrik Lundh | November 2005 | Originally posted to online.effbot.org

Iterator-based parsing is an efficient and straightforward way of writing recursive-descent parsers in Python. Here’s an outline:

- use an iterator to split the source into a stream of tokens or token descriptors.
- pass the iterator’s next method and the first token to the toplevel parser class.
- use separate functions, where appropriate, for individual grammar rules. pass the next method and the current token on to these functions as well.
- to check the current token, inspect the token argument. to fetch the next token, call the next method.

Here’s a simple example. This is a limited but hopefully safe version of Python’s eval function, which handles strings, floating point values, integers, and tuples only. Tuples can be nested.

    import cStringIO, tokenize

    def atom(next, token):
        if token[1] == "(":
            out = []
            token = next()
            while token[1] !
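The listing above is cut off in this excerpt. As a stand-in, the outline can be sketched in Python 3 roughly as follows. This is an illustrative reconstruction under stated assumptions, not the article's code: the name `simple_eval_sketch` is hypothetical, string tokens are decoded with `ast.literal_eval` (an assumption; the original targets Python 2), and the check for trailing tokens is a design choice of this sketch.

```python
import ast
import io
import tokenize


def atom(next_token, token):
    """Parse one value; `token` is the current token, next_token() fetches more."""
    kind, text = token[0], token[1]
    if text == "(":
        items = []
        token = next_token()
        while token[1] != ")":
            items.append(atom(next_token, token))
            token = next_token()
            if token[1] == ",":
                token = next_token()
        return tuple(items)
    if kind == tokenize.STRING:
        # Assumption: decode the literal with ast.literal_eval, which only
        # accepts literals and does not execute code.
        return ast.literal_eval(text)
    if kind == tokenize.NUMBER:
        try:
            return int(text, 0)
        except ValueError:
            return float(text)
    raise SyntaxError(f"malformed expression ({text!r})")


def simple_eval_sketch(source):
    """Evaluate strings, numbers, and (nested) tuples only."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    # Line-structure tokens carry no value here, so drop them.
    tokens = (t for t in tokens if t[0] not in (tokenize.NL, tokenize.NEWLINE))
    next_token = tokens.__next__
    result = atom(next_token, next_token())
    # Sketch-specific choice: reject anything left over after the value.
    if next_token()[0] != tokenize.ENDMARKER:
        raise SyntaxError("trailing data after expression")
    return result
```

Each grammar rule (here only `atom`) receives the current token plus the callable that yields the next one, so no parser object or lookahead buffer is needed.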

Here’s the code in action:

    >>> simple_eval("'hello')))")
    'hello'

23.2. shlex - Simple lexical analysis - Python v2.7.4 documentation.

New in version 1.5.2. Source code: Lib/shlex.py

The shlex class makes it easy to write lexical analyzers for simple syntaxes resembling that of the Unix shell. This will often be useful for writing minilanguages (for example, in run control files for Python applications) or for parsing quoted strings. Prior to Python 2.7.3, this module did not support Unicode input.

The shlex module defines the following functions:

shlex.split(s[, comments[, posix]])
    Split the string s using shell-like syntax.

    New in version 2.3. Changed in version 2.6: Added the posix parameter.

    Note: Since the split() function instantiates a shlex instance, passing None for s will read the string to split from standard input.

The shlex module defines the following class:

class shlex.shlex([instream[, infile[, posix]]])
    A shlex instance or subclass instance is a lexical analyzer object.

See also: Module ConfigParser - Parser for configuration files similar to the Windows .ini files.

23.2.1. shlex Objects

shlex.get_token()
shlex.wordchars
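A short sketch of how these pieces fit together, written for Python 3 (the excerpt documents Python 2.7, where the same calls exist). The helper `tokenize_setting` and the sample strings are made up for illustration.

```python
import shlex

# split() handles shell-style quoting, so the quoted file name stays whole.
command_words = shlex.split('cp "my file.txt" /tmp/backup')

# With comments=True, everything after "#" is dropped.
without_comment = shlex.split("ls -l  # list files", comments=True)

# With posix=False, quote characters are kept as part of the token.
raw_quotes = shlex.split('say "hello world"', posix=False)


def tokenize_setting(line, extra_wordchars=""):
    """Hypothetical helper: tokenize one 'key = value' line of a minilanguage."""
    lexer = shlex.shlex(line, posix=True)
    # wordchars decides which characters may accumulate into one word token.
    lexer.wordchars += extra_wordchars
    return list(lexer)  # iterating calls get_token() until end of input


plain = tokenize_setting("log-level = 'very verbose'")
joined = tokenize_setting("log-level = 'very verbose'", extra_wordchars="-")

print(command_words)
print(without_comment)
print(raw_quotes)
print(plain)
print(joined)
```

By default "-" is not a word character, so `plain` splits the key into three tokens, while `joined` keeps `log-level` together once "-" is added to `wordchars`.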

Pyparsing.

Twisted.

Twisted is an event-driven networking engine written in Python and licensed under the open source MIT license. Twisted runs on Python 2 and an ever-growing subset also works with Python 3. Twisted makes it easy to implement custom network applications. Here's a TCP server that echoes back everything that's written to it:

    from twisted.internet import protocol, reactor, endpoints

    class Echo(protocol.Protocol):
        def dataReceived(self, data):
            self.transport.write(data)

    class EchoFactory(protocol.Factory):
        def buildProtocol(self, addr):
            return Echo()

    endpoints.serverFromString(reactor, "tcp:1234").listen(EchoFactory())
    reactor.run()

Learn more about writing servers, writing clients and the core networking libraries, including support for SSL, UDP, scheduled events, unit testing infrastructure, and much more.

Twisted includes an event-driven web server. Learn more about web application development, templates and Twisted's HTTP client. Twisted includes a sophisticated IMAP4 client library.