background preloader

NLTK

NLTK
NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning. Thanks to a hands-on guide introducing programming fundamentals alongside topics in computational linguistics, NLTK is suitable for linguists, engineers, students, educators, researchers, and industry users alike. NLTK is available for Windows, Mac OS X, and Linux. Best of all, NLTK is a free, open source, community-driven project. NLTK has been called “a wonderful tool for teaching, and working in, computational linguistics using Python,” and “an amazing library to play with natural language.”

http://nltk.org/

pyquery pyquery allows you to make jquery queries on xml documents. The API is as much as possible the similar to jquery. pyquery uses lxml for fast xml and html manipulation. This is not (or at least not yet) a library to produce or interact with javascript code.

nltk.googlecode.com/svn/trunk/doc/book/ch00.html This is a book about Natural Language Processing. By "natural language" we mean a language that is used for everyday communication by humans; languages like English, Hindi or Portuguese. In contrast to artificial languages such as programming languages and mathematical notations, natural languages have evolved as they pass from generation to generation, and are hard to pin down with explicit rules. We will take Natural Language Processing — or NLP for short — in a wide sense to cover any kind of computer manipulation of natural language. At one extreme, it could be as simple as counting word frequencies to compare different writing styles.

pyquery pyquery allows you to make jquery queries on xml documents. The API is as much as possible the similar to jquery. pyquery uses lxml for fast xml and html manipulation. This is not (or at least not yet) a library to produce or interact with javascript code. I just liked the jquery API and I missed it in python so I told myself “Hey let’s make jquery in python”.

sjbrown's Guide To Writing Games By Shandy Brown. Please send comments / corrections via email to tutorial@ezide.com Last Update: March 2011 Table Of Contents Pupil Pupil is an eye tracking hardware and software platform that started as a thesis project at MIT. Pupil is a project in active, community driven development. For noncommercial use, the hardware is accessible, hackable, and affordable. The software is open source and written in Python and C where speed is an issue. Our vision is to create a tool kit for a diverse group of people interested in learning about eye tracking and conducting their eye tracking projects. Headset TextBlob: Simplified Text Processing — TextBlob 0.6.0 documentation Release v0.8.4. (Changelog) TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.

Twisted Twisted is an event-driven networking engine written in Python and licensed under the open source ​MIT license. Twisted runs on Python 2 and an ever growing subset also works with Python 3. Twisted makes it easy to implement custom network applications. Here's a TCP server that echoes back everything that's written to it: from twisted.internet import protocol, reactor, endpoints class Echo(protocol.Protocol): def dataReceived(self, data): self.transport.write(data) class EchoFactory(protocol.Factory): def buildProtocol(self, addr): return Echo() endpoints.serverFromString(reactor, "tcp:1234").listen(EchoFactory()) reactor.run() Learn more about ​writing servers, ​writing clients and the ​core networking libraries , including support for SSL, UDP, scheduled events, unit testing infrastructure, and much more.

Simple Top-Down Parsing in Python Fredrik Lundh | July 2008 In Simple Iterator-based Parsing, I described a way to write simple recursive-descent parsers in Python, by passing around the current token and a token generator function. A recursive-descent parser consists of a series of functions, usually one for each grammar rule. Such parsers are easy to write, and are reasonably efficient, as long as the grammar is “prefix-heavy”; that is, that it’s usually sufficient to look at a token at the beginning of a construct to figure out what parser function to call. For example, if you’re parsing Python code, you can identify most statements simply by looking at the first token. However, recursive-descent is less efficient for expression syntaxes, especially for languages with lots of operators at different precedence levels.

Related: