

Data Visualization is a big part of a data scientist’s job.

In the early stages of a project, you’ll often be doing an Exploratory Data Analysis (EDA) to gain some insights into your data. Creating visualizations really helps make things clearer and easier to understand, especially with larger, high-dimensional datasets. Towards the end of your project, it’s important to be able to present your final results in a clear, concise, and compelling manner that your audience, who are often non-technical clients, can understand.

R.

Cheatsheet - Python & R codes for common Machine Learning Algorithms

In his famous book Think and Grow Rich, Napoleon Hill narrates the story of Darby, who, after digging for a gold vein for a few years, walked away when he was just three feet from it! Now, I don’t know whether the story is true or false. But I surely know of a few Data Darbys around me. These people understand the purpose of machine learning and its execution, but use just a set of 2–3 algorithms on whatever problem they are working on.


AWS Training and Certification.

Top 5 Python IDEs For Data Science (article)

IDE stands for Integrated Development Environment. It’s a coding tool that allows you to write, test, and debug your code more easily; IDEs typically offer code completion or code insight through highlighting, resource management, debugging tools, and more. And even though the IDE is a strictly defined concept, it’s starting to be redefined as other tools such as notebooks gain more and more features that traditionally belong to IDEs. For example, debugging your code is also possible in Jupyter Notebook. You can probably see this evolution most clearly in the results of the Stack Overflow Developer Survey below, which also includes these new tools next to the traditional IDEs that you might already know; they all fall under the section “development environment”.


Graph and its representations.

Cron format

Cron format is a simple yet powerful and flexible way to define the time and frequency of various actions. nnCron makes active use of cron format in both classic and extended modes.
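As a rough illustration of the classic Unix form of this format, the sketch below splits an expression into its five whitespace-separated fields. The field names (minute, hour, day of month, month, day of week) follow the common Unix crontab convention and are not listed in the text here; the helper name `parse_cron` and the sample expression are hypothetical, and nnCron's extended mode is not covered.

```python
from typing import Dict

# Field names follow the common Unix crontab convention (an assumption here;
# the surrounding text does not list them).
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_cron(expression: str) -> Dict[str, str]:
    """Split a classic five-field cron expression into named fields."""
    parts = expression.split()  # splits on any run of whitespace
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# Hypothetical sample: 09:30 on day_of_week 1 (Monday in the usual convention).
schedule = parse_cron("30 9 * * 1")
print(schedule)
```

The function only names the fields; it does not validate ranges, lists, or step values inside each field.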

Traditional (inherited from Unix) cron format consists of five fields separated by white spaces:

Terminal 101: Creating Cron Jobs

Every Monday, we'll show you how to do something new and simple with Apple's built-in command line application. You don't need any fancy software or any knowledge of coding to do any of these. All you need is a keyboard to type 'em out! There are many times when you need to run a shell script or command at regular intervals.

MySQL Database Export - Backup Methods

The simplest way of exporting table data into a text file is to use the SELECT...INTO OUTFILE statement, which exports a query result directly into a file on the server host.
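A minimal sketch of what such a SELECT...INTO OUTFILE statement might look like, built as a plain string in Python. The helper name `build_outfile_query`, the table name `orders`, and the path `/tmp/orders.csv` are hypothetical, and the comma-separated FIELDS/LINES options are just one possible choice, not something the text above prescribes. The code only builds the statement; it does not connect to a server.

```python
import re


def build_outfile_query(table: str, outfile: str) -> str:
    """Return a SELECT ... INTO OUTFILE statement for one table (hypothetical helper)."""
    # Allow only simple identifiers so the table name cannot smuggle in SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unsupported table name: {table!r}")
    # Escape backslashes and single quotes inside the quoted file path.
    path = outfile.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"SELECT * FROM `{table}` "
        f"INTO OUTFILE '{path}' "
        "FIELDS TERMINATED BY ',' "
        "LINES TERMINATED BY '\\n'"
    )


# Hypothetical table and path; the file would be written on the server host.
print(build_outfile_query("orders", "/tmp/orders.csv"))
```

The resulting string would then be run through whatever MySQL client or connector is already in use.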



D3. C++ Xcode. Latex. Unix.

Getting Started: Building a Chrome Extension

Extensions are made of different, but cohesive, components. Components can include background scripts, content scripts, an options page, UI elements and various logic files. Extension components are created with web development technologies: HTML, CSS, and JavaScript. An extension's components will depend on its functionality and may not require every option. This tutorial will build an extension that allows the user to change the background color of any page on

It will use many core components to give an introductory demonstration of their relationships. To start, create a new directory to hold the extension's files. The completed extension can be downloaded here. Extensions start with their manifest. The directory holding the manifest file can be added as an extension in developer mode in its current state. Ta-da! Although the extension has been installed, it has no instruction. Navigate back to the extension management page and click the Reload link.

Os_setup. Online Web Tutorials. Meta-meta. Free Online IDE and Terminal.

Get started with Hadoop and Spark in 10 minutes

With the big 3 Hadoop vendors (Cloudera, Hortonworks and MapR) each providing their own Hadoop sandbox virtual machines (VMs), trying out Hadoop today has become extremely easy. For a developer, it is extremely useful to download one of these VMs and try out Hadoop to practice data science right away. However, along with the core Apache Hadoop, these vendors package their own software into their distributions, mostly for orchestration and management, which can be a pain due to the multiple scattered open-source projects within the Hadoop ecosystem. For example, Hortonworks includes the open-source Ambari, while Cloudera includes its own Cloudera Manager for orchestrating Hadoop installations and managing multi-node clusters.

Moreover, most of these distributions today require a 64-bit machine and sometimes a large amount of memory (for a laptop). For example, running Cloudera Manager with a full-blown Cloudera Hadoop Distribution (CDH) 5.x requires at least 10GB RAM.

Coding Tutorials. SQL Teaching - The easiest way to learn SQL. R inferno. Sublime Text: The text editor you'll fall in love with.

DATA VISUALIZATION. Web Design Terminology: HTML vs CSS vs JavaScript etc. Big Data University. Compile and Execute Programs Online.