
TextBlob: Simplified Text Processing — TextBlob 0.6.0 documentation

Release v0.8.4. (Changelog)
TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.
Features: Noun phrase extraction; Part-of-speech tagging; Sentiment analysis; Classification (Naive Bayes, Decision Tree); Language translation and detection powered by Google Translate; Tokenization (splitting text into words and sentences); Word and phrase frequencies; Parsing; n-grams; Word inflection (pluralization and singularization) and lemmatization; Spelling correction; JSON serialization; Add new models or languages through extensions; WordNet integration.
Get it now:
    $ pip install -U textblob
    $ python -m textblob.download_corpora
Ready to dive in?

GStreamer
The Evolution of a Haskell Programmer
Fritz Ruehr, Willamette University
Freshman Haskell programmer
    fac n = if n == 0 then 1 else n * fac (n-1)
Sophomore Haskell programmer, at MIT (studied Scheme as a freshman)
    fac = (\(n) -> (if ((==) n 0) then 1 else ((*) n (fac ((-) n 1)))))
Junior Haskell programmer (beginning Peano player)
    fac 0 = 1
    fac (n+1) = (n+1) * fac n
Another junior Haskell programmer (read that n+k patterns are “a disgusting part of Haskell” [1] and joined the “Ban n+k patterns”-movement [2])
    fac 0 = 1
    fac n = n * fac (n-1)
Senior Haskell programmer (voted for Nixon Buchanan Bush — “leans right”)
    fac n = foldr (*) 1 [1..n]
Another senior Haskell programmer (voted for McGovern Biafra Nader — “leans left”)
    fac n = foldl (*) 1 [1..n]
Yet another senior Haskell programmer (leaned so far right he came back left again!)
    -- using foldr to simulate foldl
    fac n = foldr (\x g n -> g (x*n)) id [1..n] 1
Memoizing Haskell programmer (takes Ginkgo Biloba daily)
    facs = scanl (*) 1 [1..]
    fac n = facs !! n
(studied at Oxford) Ph.D. Tenured professor
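For readers who do not read Haskell, the three fold-based definitions above can be sketched in Python. This is only an illustration, not part of Ruehr's article: the helper names `foldl`, `foldr`, and the `fac_*` functions are hypothetical, and the last one mirrors the "using foldr to simulate foldl" trick by folding the list into a chain of functions that is then applied to the starting accumulator.

```python
from functools import reduce


def foldl(f, acc, xs):
    """Left fold: f(f(f(acc, x1), x2), x3)."""
    return reduce(f, xs, acc)


def foldr(f, acc, xs):
    """Right fold: f(x1, f(x2, f(x3, acc)))."""
    for x in reversed(list(xs)):
        acc = f(x, acc)
    return acc


def fac_foldr(n):
    # Mirrors: fac n = foldr (*) 1 [1..n]
    return foldr(lambda x, acc: x * acc, 1, range(1, n + 1))


def fac_foldl(n):
    # Mirrors: fac n = foldl (*) 1 [1..n]
    return foldl(lambda acc, x: acc * x, 1, range(1, n + 1))


def fac_foldr_as_foldl(n):
    # Mirrors: fac n = foldr (\x g n -> g (x*n)) id [1..n] 1
    # Each step wraps the continuation g built so far; the identity
    # function is the starting continuation, and 1 is the initial
    # accumulator passed in at the end.
    def step(x, g):
        return lambda acc: g(x * acc)

    return foldr(step, lambda acc: acc, range(1, n + 1))(1)
```

All three return the same value for any non-negative `n`; they differ only in the order in which the multiplications are grouped.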

Pupil
Pupil is an eye tracking hardware and software platform that started as a thesis project at MIT. Pupil is a project in active, community driven development. For noncommercial use, the hardware is accessible, hackable, and affordable. Our vision is to create a tool kit for a diverse group of people interested in learning about eye tracking and conducting their eye tracking projects.
Headset
Capture Software
Visualization Software
Discussion Forum
The main forum for PUPIL discussion is the pupil-discuss group.
Pupil in 3D
Pupil3d uses Pupil for experimental 3D tracking of visual attention using structure from motion.
NUMA (Non-Uniform Memory Access): An Overview - ACM Queue
Christoph Lameter, Ph.D.
NUMA (non-uniform memory access) is the phenomenon that memory at various points in the address space of a processor has different performance characteristics. At current processor speeds, the signal path length from the processor to memory plays a significant role. Increased signal path length not only increases latency to memory but also quickly becomes a throughput bottleneck if the signal path is shared by multiple processors. Today, processors are so fast that they usually require memory to be directly attached to the socket that they are on. As the trend toward improving system performance by bringing memory nearer to processor cores continues, NUMA will play an increasingly important role in system performance. NUMA systems today (2013) are mostly encountered on multisocket systems. Performance-sensitive applications can require complex logic to handle memory with diverging performance characteristics.
How Operating Systems Handle NUMA Memory
NODE LOCAL.

pymc
Go for System Administrators - blog dot lusis
"If I never directly touch a Go concurrency primitive, I’m convinced I’m going to write all my cli apps with it just for ease of deployment."
This is something I said the other day. I figured it deserved a more detailed blog post. Most people who know me professionally know two things about me:
I’m fairly pragmatic and somewhat conservative about technology decisions
I’m a language tourist
This second one is something Bryan Berry attributed to me in an early FoodFight episode. I love learning new programming languages. So it’s weird that I find myself 18 years later having a working knowledge of ruby, python, perl, java and a few other languages to a lesser degree. This leads me to picking up Go. If you haven’t heard of Go, there are countless articles, blog posts and a shitload of new tooling written in it. Mind you I don’t pick up languages based on popularity. I actually attempted that route working on a PAM module for StormPath. So why Go now?
On Pragmatism
Tooling in Go
The syntax is easy.

pyquery
pyquery allows you to make jquery queries on xml documents. The API is as similar as possible to jquery. pyquery uses lxml for fast xml and html manipulation. This is not (or at least not yet) a library to produce or interact with javascript code. I just liked the jquery API and I missed it in python so I told myself “Hey let’s make jquery in python”. It can be used for many purposes; one idea that I might try in the future is to use it for templating with pure http templates that you modify using pyquery. The project is being actively developed in a git repository on Github. Please report bugs on the github issue tracker.
You can use the PyQuery class to load an xml document from a string, an lxml document, a file or a url.
Now d is like the $ in jquery:
    >>> d("#hello")
    [<p#hello.hello>]
    >>> p = d("#hello")
    >>> print(p.html())
    Hello world !
    >>> d('p:first')
    [<p#hello.hello>]
First there is the Sphinx documentation here.

nu7hatch/gmail
Astropython
its-not-software - steveyegge2
You don't work in the software industry. The software industry has been around a lot longer than ours, and it continues to thrive in parallel to ours. There's some overlap, just as the hardware and software industries have some overlap. But it's a lot less than you probably realize. Not knowing that we're not in the software industry is hurting you every day. But it's also hurting us in that any competitor who does understand that it's a different industry is going to start coding circles around us, to whatever extent they've figured it out.
Our Sister Industry
So what's the software industry, and how do we differ from it? Well, the software industry is what you learn about in school, and it's what you probably did at your previous company. So it includes pretty much everything that Microsoft does: Windows and every application you download for it, including your browser.
Servware
Servware is stuff that lives on your own servers.
Software Lifecycle
Broken/Incomplete Models
Documentation
Yawn.

pygame
Cap'n Proto: Introduction
Cap’n Proto is an insanely fast data interchange format and capability-based RPC system. Think JSON, except binary. Or think Protocol Buffers, except faster. In fact, in benchmarks, Cap’n Proto is INFINITY TIMES faster than Protocol Buffers. This benchmark is, of course, unfair.
But doesn’t that mean the encoding is platform-specific? NO!
Doesn’t that make backwards-compatibility hard? Not at all!
Won’t fixed-width integers, unset optional fields, and padding waste space on the wire? Yes. When bandwidth really matters, you should apply general-purpose compression, like zlib or Snappy, regardless of your encoding format.
Are there other advantages? Glad you asked! Incremental reads: It is easy to start processing a Cap’n Proto message before you have received all of it since outer objects appear entirely before inner objects (as opposed to most encodings, where outer objects encompass inner objects).
Why do you pick on Protocol Buffers so much? I no longer work for Google.

NLTK
Deconstructing Deferred
I apologize for the incomplete post. I pressed the wrong button and the first half of this post went out by accident, somewhat unreviewed. I'm happy enough about what got sent for ideas 1-3, although I had wanted to insert some more links into the Twisted docs and code for Deferred. Here is the rest, starting with the section on Chaining, which I hadn't fully fleshed out when I accidentally posted.
Idea 4: Chaining Deferreds
This is a five-star idea!
    def sync_and_read_bookmarks():
        sync_bookmarks()
But how do we write this if all operations return Deferreds?
    d = sync_bookmarks()
    d.addCallback(lambda unused_result: read_bookmarks())
    return d
What is happening here? (A note about the ugly lambda: it is needed because all callbacks are called with a result value, but read_bookmarks() takes no arguments.) Here's the code.
    def callback(self, result):
        for (cb, eb) in self.callbacks:
            if isinstance(result, Failure):
                cb = eb  # Use errback
            try:
                result = cb(result)
            except:
                result = Failure()
            ...
Dizzy?
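To make the chaining idea concrete, here is a minimal sketch of a toy Deferred in Python. It is an assumption-laden illustration, not Twisted's implementation and not the code from the post: the class body, the `_run` and `_resume` helpers, and the `sync_d`, `read_d`, `seen`, and `pending_after_sync` names are all hypothetical, and error handling (errbacks, Failure) is left out entirely. The only behavior it tries to show is that when a callback returns another Deferred, the outer chain pauses until the inner one fires.

```python
class Deferred:
    """Toy Deferred supporting callbacks and chaining only (no errbacks)."""

    def __init__(self):
        self.callbacks = []   # callbacks still waiting to run, in order
        self.fired = False    # True while a result is available
        self.result = None

    def addCallback(self, fn):
        self.callbacks.append(fn)
        if self.fired:
            self._run()
        return self

    def callback(self, result):
        self.fired = True
        self.result = result
        self._run()

    def _run(self):
        while self.callbacks:
            fn = self.callbacks.pop(0)
            self.result = fn(self.result)
            if isinstance(self.result, Deferred):
                # Chaining: pause this chain until the inner Deferred
                # fires, then resume with the inner result.
                inner = self.result
                self.fired = False
                inner.addCallback(self._resume)
                return

    def _resume(self, result):
        self.callback(result)
        return result


# Hypothetical stand-ins for the two asynchronous operations.
sync_d = Deferred()
read_d = Deferred()


def sync_bookmarks():
    return sync_d


def read_bookmarks():
    return read_d


def sync_and_read_bookmarks():
    d = sync_bookmarks()
    d.addCallback(lambda unused_result: read_bookmarks())
    return d


seen = []
d = sync_and_read_bookmarks()
d.addCallback(seen.append)

sync_d.callback("synced")        # sync finishes; the read is still pending
pending_after_sync = list(seen)  # nothing delivered yet

read_d.callback(["python.org", "haskell.org"])  # read finishes
```

After the sync fires, the final callback has still not run, because the chain is waiting on the Deferred returned by `read_bookmarks()`. Only when that inner Deferred fires does `seen` receive the bookmark list.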