
Pidgin project


Pattern. Pattern is a web mining module for the Python programming language.


It has tools for data mining (Google, Twitter and Wikipedia API, a web crawler, an HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and <canvas> visualization. The module is free, well-documented and bundled with 50+ examples and 350+ unit tests.

Installation. Pattern is written for Python 2.5+ (no support for Python 3 yet). To install Pattern so that the module is available in all Python scripts, run from the command line:

> cd pattern-2.6
> python setup.py install

If you have pip, you can automatically download and install from the PyPI repository. If none of the above works, you can make Python aware of the module in three ways.

Quick overview. The pattern.en module is a natural language processing (NLP) toolkit for English.

Pattern.web. The pattern.web module has tools for online data mining: asynchronous requests, a uniform API for web services (Google, Bing, Twitter, Facebook, Wikipedia, Wiktionary, Flickr, RSS), an HTML DOM parser, HTML tag stripping functions, a web crawler, webmail, caching, Unicode support.


It can be used by itself or with other pattern modules: web | db | en | search | vector | graph.

URLs. The URL object is a subclass of Python's urllib2.Request that can be used to connect to a web address. One of its methods can be used to retrieve the content (e.g., HTML source code). Query data can be sent in two ways. GET: query data is encoded in the URL string (usually for retrieving data). POST: query data is encoded in the message body (for posting data). URL() expects a string that starts with a valid protocol.

URL downloads. The download() function takes a URL string and returns the retrieved data.
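The GET/POST distinction described above can be illustrated without Pattern itself. The sketch below uses only the Python 3 standard library (urllib.request.Request, the successor of the urllib2.Request class that the URL object subclasses). The endpoint and query values are hypothetical, and nothing is sent over the network; the requests are only built and inspected.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and query, for illustration only; no request is sent.
BASE = "http://www.example.com/search"
QUERY = {"q": "web mining", "page": 1}


def build_get(base, query):
    """GET: the query data becomes part of the URL string."""
    return Request(base + "?" + urlencode(query), method="GET")


def build_post(base, query):
    """POST: the URL stays bare; the query data travels in the message body."""
    return Request(base, data=urlencode(query).encode("utf-8"), method="POST")


get_req = build_get(BASE, QUERY)
post_req = build_post(BASE, QUERY)

print(get_req.get_method(), get_req.full_url, get_req.data)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

With GET the encoded query is visible in the URL and the body is empty; with POST the URL is unchanged and the same encoded string is carried as bytes in the body.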

Introducing SPARQL: Querying the Semantic Web. By Leigh Dodds, November 16, 2005. An Introduction to SPARQL. This tutorial, the first of a three-part series, introduces SPARQL, a query language and data access protocol for the Semantic Web.


SPARQL is defined in terms of the W3C's RDF data model and will work for any data source that can be mapped into RDF. The specification is under development by the RDF Data Access Working Group (DAWG) and has recently reached Last Call Working Draft. At this point in its life cycle the specification is stable enough that developers can begin seriously exploring its capabilities.

But what if you're a lot more interested in Web 2.0, which is practical and real, than in the Semantic Web, about which opinions vary widely? However, SPARQL has a much wider potential audience. The goal of these tutorials is to enable developers to quickly become productive with SPARQL.

Technical Notes. Encoding a BCD Number. Definition: BCD represents each digit of an unsigned decimal number as its 4-bit binary equivalent.


Unpacked BCD. Unpacked BCD representation contains only one decimal digit per byte. The digit is stored in the least significant 4 bits; the most significant 4 bits are not relevant to the value of the represented number.

Packed BCD. Packed BCD representation packs two decimal digits into a single byte.

Invalid BCD Numbers. These binary numbers are not allowed in the BCD code: 1010, 1011, 1100, 1101, 1110, 1111.

Packing a Two-Byte BCD. To pack a two-byte unpacked BCD number into a single byte, creating a packed BCD number, shift the upper byte left by four bits, then OR the result with the lower byte.

Converting between Decimal and BCD. From Decimal to Unpacked BCD: to convert a decimal number into an unpacked BCD number, assign each decimal digit its 8-bit binary equivalent.
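The packing and decimal-to-unpacked steps above can be sketched in a few lines of Python. The function names to_unpacked_bcd and pack_pair are hypothetical, chosen for this illustration. Masking each byte to its low nibble is an assumption that follows from the note that the upper 4 bits of an unpacked byte do not affect its value.

```python
def to_unpacked_bcd(n):
    """Return one byte per decimal digit of n, most significant digit first."""
    if n < 0:
        raise ValueError("BCD here encodes unsigned decimals only")
    return bytes(int(d) for d in str(n))


def pack_pair(upper, lower):
    """Pack two unpacked BCD bytes into one packed BCD byte."""
    hi = upper & 0x0F  # only the low nibble carries the digit
    lo = lower & 0x0F
    if hi > 9 or lo > 9:
        # 1010 through 1111 are not valid BCD codes
        raise ValueError("invalid BCD digit")
    # shift the upper digit left by four bits, then OR in the lower digit
    return (hi << 4) | lo


unpacked = to_unpacked_bcd(92)
packed = pack_pair(unpacked[0], unpacked[1])

print([format(b, "08b") for b in unpacked])
print(format(packed, "08b"))
```

For the decimal number 92 this gives the unpacked bytes 00001001 and 00000010 and the packed byte 10010010.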

Converting between Binary and BCD. 00001001 00000010 (unpacked BCD) = 01011100 (base 2).

How could I write "hello world" in binary?
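The unpacked BCD to binary equality given above (00001001 00000010 = 01011100) can be checked with a short Python sketch. The function name unpacked_bcd_to_int is hypothetical, and the code assumes the bytes are ordered most significant digit first, as in that example.

```python
def unpacked_bcd_to_int(digits):
    """Convert unpacked BCD bytes (most significant digit first) to an int."""
    value = 0
    for byte in digits:
        digit = byte & 0x0F  # the upper 4 bits are not part of the value
        if digit > 9:
            raise ValueError("invalid BCD digit")
        value = value * 10 + digit
    return value


value = unpacked_bcd_to_int(bytes([0b00001001, 0b00000010]))
print(value, format(value, "08b"))  # 92 01011100
```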