
Dataviz - Python



28 Jupyter Notebook tips, tricks and shortcuts

This post is based on a post that originally appeared on Alex Rogozhnikov’s blog, ‘Brilliantly Wrong’. We have expanded the post and will continue to do so over time - if you have a suggestion please let us know in the comments. Thanks to Alex for graciously letting us republish his work here.

Jupyter Notebook

Jupyter notebook, formerly known as the IPython notebook, is a flexible tool that helps you create readable analyses, as you can keep code, images, comments, formulae and plots together.

Jupyter is quite extensible, supports many programming languages and is easily hosted on your computer or on almost any server: you only need SSH or HTTP access.

The Jupyter interface.

Project Jupyter was born out of the IPython project as the project evolved to become a notebook that could support multiple languages, hence its historical name as the IPython notebook.

Advanced Jupyter Notebook Tricks - Part I - Data Science Blog by Domino

By roos on November 3rd, 2015

I love Jupyter notebooks!


They’re great for experimenting with new ideas or data sets, and although my notebook “playgrounds” start out as a mess, I use them to crystallize a clear idea for building my final projects. Jupyter is so great for interactive exploratory analysis that it’s easy to overlook some of its other powerful features and use cases. I wanted to write a blog post on some of the lesser-known ways of using Jupyter, but there are so many that I broke the post into two parts.

Getting Started with Plotly for Python

Plotly for Python can be configured to render locally inside Jupyter (IPython) notebooks, locally inside your web browser, or remotely in your online Plotly account.


Remote hosting on Plotly is free for public use. For private use, view our paid plans.

Offline Use

Standalone HTML

Offline mode will save an HTML file locally and open it inside your web browser. Learn more by calling help:

    import plotly
    help(plotly.offline.plot)

Inside Jupyter/IPython Notebooks

Learn more about offline mode

Hosting on Plotly

Plotly provides a web-service for hosting graphs. In the terminal, copy and paste the following to install the Plotly library and set your user credentials.

Visualizing Summer Travels - Geoff Boeing

This is a series of posts about visualizing spatial data.


I spent a couple of months traveling in Europe this summer and collected GPS location data throughout the trip with the OpenPaths app. I explored different web mapping technologies such as CartoDB, Leaflet, Mapbox, and Tilemill to plot my travels. I also used Python and matplotlib to run some descriptive statistics and visualize other aspects of my trip. Here is the series of posts: My Python code is available in this GitHub repo. This series serves as an introduction and tutorial for these various technologies and methods.

Interactive maps

Here are some brief highlights. I also visualized this spatial data as an interactive map using the Leaflet JavaScript library, and by rolling my own set of web map tiles, then rendering them with Tilemill and Mapbox.
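As a rough illustration of the kind of descriptive statistic one might compute from GPS location data, here is a minimal sketch that sums great-circle (haversine) distances along a track. It is not taken from the author's GitHub repo: the function names `haversine_km` and `track_length_km`, the spherical-Earth radius, and the three sample coordinates are all assumptions made for this example.

```python
import math

# Assumption: treat the Earth as a sphere with its mean radius.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def track_length_km(points):
    """Sum the distances between consecutive (lat, lon) points of a track."""
    return sum(haversine_km(*p, *q) for p, q in zip(points, points[1:]))


# Hypothetical track: approximate city-centre coordinates, not real trip data.
sample_track = [
    (48.8566, 2.3522),   # Paris
    (50.8503, 4.3517),   # Brussels
    (52.3676, 4.9041),   # Amsterdam
]

first_leg = haversine_km(*sample_track[0], *sample_track[1])
total = track_length_km(sample_track)
print(f"first leg: {first_leg:.0f} km, whole track: {total:.0f} km")
```

With real data, the list of points would come from the location export rather than being typed in by hand, and the same helper could feed a matplotlib plot of distance covered per day.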

Step by step Kaggle competition tutorial – Datanice

Kaggle is a Data Science community where thousands of Data Scientists compete to solve complex data problems.


In this article we are going to see how to go through a Kaggle competition step by step. The contest explored here is the San Francisco Crime Classification contest. The goal is to classify a crime occurrence knowing the time and place it happened. Here, the objectives are fixed by Kaggle. In general, when starting a Data Science project, one of the most important steps is the business understanding and the definition of the scope and objectives of the project.

For data exploration I like to use IPython Notebook, which allows you to run your scripts line by line:

    import pandas as pd
    df = pd.read_csv('train.csv')
    len(df)  # 884262
    df.head()

We have 800k data points in our training set covering about ten years of crime.
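To make the exploration step concrete without needing the competition file, the sketch below runs the same kind of first-look calls on a tiny in-memory CSV. The rows are made up, and the column names (`Dates`, `Category`, `PdDistrict`, `X`, `Y`) are an assumption about the layout of `train.csv`, so adjust them to whatever the real file contains.

```python
import io

import pandas as pd

# Hypothetical stand-in for train.csv: invented rows, assumed column names.
SAMPLE_CSV = """Dates,Category,PdDistrict,X,Y
2015-05-13 23:53:00,WARRANTS,NORTHERN,-122.4258,37.7745
2015-05-13 23:33:00,LARCENY/THEFT,PARK,-122.4387,37.7715
2015-05-12 08:10:00,LARCENY/THEFT,NORTHERN,-122.4263,37.8004
2014-01-02 14:00:00,ASSAULT,MISSION,-122.4194,37.7599
"""

# Parse the timestamp column up front so time-based features are easy to derive.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["Dates"])

n_rows = len(df)                                  # number of records
category_counts = df["Category"].value_counts()   # how often each label occurs

# "Time" features derived from the timestamp; "place" is already in PdDistrict/X/Y.
df["Hour"] = df["Dates"].dt.hour
df["Year"] = df["Dates"].dt.year

print(n_rows)
print(df.head())
print(category_counts)
```

Looking at label counts and a couple of derived time columns early on is one reasonable way to get a feel for a classification target that depends on when and where an event happened.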

A simple pandas function that helps find outliers in the data is:

Bokeh Docs

Interactive Plotting in IPython Notebook (Part 1/2): Bokeh

Summary

In this post I will talk about interactive plotting packages that support the IPython Notebook and allow you to zoom, pan, resize, or even hover and get values off your plots directly from an IPython Notebook.


This post will focus on Bokeh while the next post will be about Plotly. I will also provide some very rudimentary examples that should allow you to get started straight away.

Interactive Plots: +1 for convenience

Anyone who’s delved into ‘exploratory’ data analysis requiring a depiction of their results will inevitably have come to the point where they need to fiddle with plotting settings just to make the result legible (much more work is required to make it attractive). Well, with interactive plotting the days of the static plot are dwindling.

Bokeh

Bokeh is a package by Continuum Analytics, authors of the Anaconda distribution of which I spoke in this previous post.

Creating interactive crime maps with Folium

You can see this Domino project here.

I get very excited about a nice map.


But when it comes to creating maps in Python, I have struggled to find the right library in the ever-changing jungle of Python libraries.