
Data Munging


How to use Python's enumerate and zip to iterate over two lists and their indices. Python split. Ubuntu Start Page. Iteration - How do I use Python's itertools.groupby()

An Introduction to Python Lists. You can use the list type to implement simple data structures, such as stacks and queues.

    stack = []
    stack.append(object)
    object = stack.pop()

    queue = []
    queue.append(object)
    object = queue.pop(0)

The list type isn’t optimized for this, so this works best when the structures are small (typically a few hundred items or smaller).
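As a minimal runnable sketch of the stack and queue idioms from the list introduction above, the following uses placeholder string items (an assumption for illustration) and is written for Python 3. It also shows `collections.deque` from the standard library as one alternative for the queue case, since `list.pop(0)` has to shift every remaining element.

```python
from collections import deque

# Stack: append and pop both work at the end of the list.
stack = []
stack.append("a")
stack.append("b")
top = stack.pop()            # last in, first out

# Queue on a plain list: pop(0) removes the oldest item,
# but every remaining element is shifted one slot left.
queue = []
queue.append("a")
queue.append("b")
first = queue.pop(0)         # first in, first out

# The same queue on a deque, which removes from the left without shifting.
fast_queue = deque()
fast_queue.append("a")
fast_queue.append("b")
fast_first = fast_queue.popleft()

print(top, first, fast_first)
```

For a few hundred items the plain list is usually fine; the deque variant mainly matters as the queue grows.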

An Introduction to Python Lists

For larger structures, you may need a specialized data structure, such as collections.deque. Another data structure for which a list works well in practice, as long as the structure is reasonably small, is an LRU (least-recently-used) container.

    lru.remove(item)
    lru.append(item)

If you do the above every time you access an item in the LRU list, the least recently used items will move towards the beginning of the list.

Searching Lists

The in operator can be used to check if an item is present in the list:

    if value in L:
        print "list contains", value

To get the index of the first matching item, use index:

Command line arguments. More on python flatten. Hands-On Python A Tutorial Introduction for Beginners. Hands-On Python A Tutorial Introduction for Beginners Contents Chapter 1.
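The LRU idiom and the list lookups from the list introduction above can be sketched as follows. This is an illustration in Python 3, not the original article's code: the helper name `touch` and the sample items are hypothetical, and the membership guard is an addition because `list.remove` raises `ValueError` when the item is absent.

```python
def touch(lru, item):
    """Mark item as most recently used (hypothetical helper name)."""
    if item in lru:
        lru.remove(item)     # drop the old position, if there is one
    lru.append(item)         # the most recently used item goes to the end


lru = ["a", "b", "c"]
touch(lru, "a")              # "a" moves to the end
touch(lru, "d")              # a new item is simply appended
least_recent = lru[0]        # the least recently used item sits at the front

# Searching: `in` tests membership, index() finds the first match.
L = ["x", "y", "z", "y"]
has_y = "y" in L
first_y = L.index("y")

# index() raises ValueError when nothing matches, so guard the call.
try:
    missing_index = L.index("missing")
except ValueError:
    missing_index = None

print(lru, least_recent, has_y, first_y, missing_index)
```

Both `remove` and `in` scan the list from the front, which is why this approach suits small containers only.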

Hands-On Python A Tutorial Introduction for Beginners

Python In One Easy Lesson. Nick Parlante Nov 2010 This is a one-hour introduction to Python used for Stanford's CS107.

Python In One Easy Lesson

This material should work as an introduction for any experienced programmer. We'll look at some core Python features and get a feel for how it compares to other languages. Flattening lists. Sending gmail through python code.

Screen Scraping Web Pages. This tutorial shows how to programmatically retrieve a stock quote from Google Finance. It uses Python's high level Web API and screen scraping with regular expressions. First, let's look at the page we want to get our content from. To get finance data for the ticker "IBM", we use a URL like this: If you enter this URL in your browser, you can see the page we are going to scrape from. To retrieve the content of the page, we can use Python's urllib module: import urllib content = urllib.urlopen(" Now that we have the content stored, we want to scrape some data from it. <span class="pr" id="ref_18241_l">116.26</span>
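To illustrate the regular-expression step of the screen-scraping excerpt above, here is a small Python 3 sketch that pulls the quote out of a fragment shaped like the `<span class="pr" ...>` element shown. The HTML string is hard-coded rather than fetched, and the pattern is an assumption for illustration, not necessarily the one the tutorial uses.

```python
import re

# Stand-in for the downloaded page content (assumed shape, taken from the excerpt).
content = '<div><span class="pr" id="ref_18241_l">116.26</span></div>'

# Capture the text of the first span whose class is "pr".
# [^>]* skips the remaining attributes, such as the id.
pattern = re.compile(r'<span class="pr"[^>]*>([0-9.,]+)</span>')

match = pattern.search(content)
# Strip thousands separators before converting; None means no quote was found.
quote = float(match.group(1).replace(",", "")) if match else None

print(quote)
```

A pattern like this breaks as soon as the page markup changes, so checking for `None` before using the result is worth the extra line.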

csv – Comma-separated value files. The csv module is useful for working with data exported from spreadsheets and databases into text files formatted with fields and records, commonly referred to as comma-separated value (CSV) format because commas are often used to separate the fields in a record.

csv – Comma-separated value files

Note: The Python 2.5 version of csv does not support Unicode data. There are also “issues with ASCII NUL characters”. Using UTF-8 or printable ASCII is recommended.

Reading

Use reader() to create an object for reading data from a CSV file.

    import csv
    import sys

    f = open(sys.argv[1], 'rt')
    try:
        reader = csv.reader(f)
        for row in reader:
            print row
    finally:
        f.close()

The first argument to reader() is the source of text lines.

The csv Module. (New in 2.3) The csv module is used to read data files in the CSV (comma-separated values) format, as used by Microsoft Excel and many other applications.
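For readers on Python 3, a rough equivalent of the csv reading excerpt above might look like this. The sample data is made up for illustration and is read from an in-memory `io.StringIO` buffer so the sketch runs without a file on disk.

```python
import csv
import io

# Made-up sample data; the quoted field contains a comma on purpose.
sample = io.StringIO('name,qty\n"Smith, J",3\nLee,5\n')

# csv.reader accepts any iterable of text lines and yields each row as a list of strings.
rows = list(csv.reader(sample))

for row in rows:
    print(row)
```

With a real file, the csv documentation recommends opening it with `newline=""`, for example `with open(path, newline="") as f:`, and passing `f` to `csv.reader` in the same way.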

The csv Module

A CSV file contains a number of rows, each containing a number of columns, usually separated by commas. Webscraping with Python and BeautifulSoup. Recently my life has been a hype; partly due to my upcoming Python addiction.

Webscraping with Python and BeautifulSoup

There’s simply no way around it; so I had better confess it in public. I’m in love with Python. It’s not only mature, businessproof and performant, but also benefits from sleekness, great performance and is just so much fun to write. It’s as if I were in Star Trek and only had to tell the computer what I wanted; never minding how the job is actually done. Even my favourite comic artist (besides Scott Adams, of course...) picked up on it; so my feelings have to be honest. Ed Hellen's Python notes and examples.

For Linux the packages numpy, scipy, Gnuplot, and matplotlib should be in usr/lib/Python2.x/site-packages. 1-d Arrays, Matrices, Numerical Integration, Numerical Solution of ODEs, Curve Fitting, Fit to line, Reading and Writing Array files, Finding zeros of functions, Graphing with Gnuplot, Fast Fourier Transform, Waveforms: Square, Sawtooth, Time Delay, Noise, Create Postscript Graph, Simple Plots with matplotlib, Plot Functions and Data, Interactive Plots with matplotlib, Plotting with log or linear axes, Subplots, 2 Y axes, Inset Graph.

Ed Hellen's Python notes and examples

GroupBy-fu: improvements in grouping and aggregating data in pandas. A couple of weeks ago in my inaugural blog post I wrote about the state of GroupBy in pandas and gave an example application.

GroupBy-fu: improvements in grouping and aggregating data in pandas

However, I was dissatisfied with the limited expressiveness (see the end of the article), so over the last 2 weeks I invested some serious time in the groupby functionality in pandas, beefing up what you can do. So this article is part show-and-tell, part quick tutorial on the new features. Note that I haven’t added a lot of this to the official documentation yet.

GroupBy primer

GroupBy may be one of the least well-understood features in pandas. Here, the index (row labels) contains dates and the columns are names for each time series. Perform an aggregation, like computing the sum or mean of each group.

Xah's Perl & Python Tutorial. This is a tutorial on Perl, Python, and a few other dynamic languages. The tutorial is concrete, example based, covering a practical subset of the language, using universal language features. The goal is to get you quickly started. You can start coding toy programs after a few hours. Then, you'll be able to understand & use the official language reference for details & advanced features. New languages and snippets are added continuously.
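Returning to the GroupBy primer above: the "split into groups, then aggregate each group" idea can be sketched in plain Python 3 without pandas. This is a stand-in for illustration, not the pandas implementation, and the keys and values are made up. In pandas the same aggregation is roughly what `df.groupby(key).sum()` or `.mean()` expresses.

```python
from collections import defaultdict

# Made-up (group key, value) pairs standing in for rows of a table.
records = [("A", 1.0), ("B", 2.0), ("A", 3.0), ("B", 6.0)]

# Split: collect the values that share a key.
groups = defaultdict(list)
for key, value in records:
    groups[key].append(value)

# Apply and combine: one aggregate per group.
sums = {key: sum(values) for key, values in groups.items()}
means = {key: sum(values) / len(values) for key, values in groups.items()}

print(sums)
print(means)
```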