
Data Science


Machine Learning

Data Science Podcasts : datascience. Artificial Intelligence: A Modern Approach. Time/location: Lectures: Tue/Thu 9-10:15am in NVIDIA Auditorium. Sections: will happen based on need; see calendar. Office hours: see calendar. Course assistants: Gabor Angeli, Naran Bayanbat, Thiraphat Charoensripongsa, Rohan Kamath, Tiffany Low, Tyler O'Neil, Aju Thalappillil Scaria, Cameron Schaeffer, Hao Su. Calendar: look here for dates/times of all lectures, sections, office hours, due dates.

Course description: Problems in game playing, natural language processing, computer vision, and robotics are challenging due to the inherent noise/uncertainty and computational complexity. This course provides the mathematical and algorithmic framework for tackling these sorts of problems. CS221. Instructor: Chris Piech. Course assistants: Contact: Please use Piazza for all questions related to lectures, homeworks, and projects.

CS221

For private questions, email: cs221-sum1213-staff@lists.stanford.edu. Office Hours: See the office hour calendar. Reproducible Research. Contributing Editors: Keith A.

Reproducible Research

Baggerly is professor of bioinformatics and computational biology at The University of Texas MD Anderson Cancer Center in Houston, Texas. Donald A. Berry is head of the division of quantitative sciences and chair and professor of the department of biostatistics at The University of Texas MD Anderson Cancer Center. Research is reproducible if it can be reproduced by others. Of course, rerunning an experiment will give different results, an observation that gave rise to the development of statistics as a discipline. In the 1990s, the geophysicist Jon Claerbout became frustrated with new students having great difficulty duplicating previous students’ research. Lprr.html. Literate Programming: The term "literate programming" was introduced by Don Knuth in the early 1980s, with the idea that a computer program should be documented in a manner that is readable by humans.

lprr.html

Reproducible Research: When a project is complete (e.g., a paper is published), the computational tools and data should be preserved in a manner that allows one to reproduce the final products (e.g., figures, tables, error values) and to later understand the methods used and the implementation (including all parameter values). The paper should contain all details that can be reasonably included. Other details should be available online. Reproducible Research Introduction: Reproducibility Toolkit. Overview. Teaching: 30 min. Exercises: 10 min. Questions: What tools will we be using?
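As a rough illustration of the Reproducible Research note above (preserving parameter values alongside the final products), here is a minimal Python sketch. It is not taken from any of the linked resources: the function names `run_analysis` and `build_manifest`, the toy analysis, and every parameter value are hypothetical, chosen only to show the idea of saving a manifest that lets a result be regenerated and checked.

```python
import hashlib
import json
import platform
import random
import sys


def run_analysis(params):
    """Toy stand-in for a real analysis: the mean of seeded pseudo-random draws."""
    # A local generator means the seed alone determines the sequence of draws.
    rng = random.Random(params["seed"])
    draws = [rng.gauss(params["mu"], params["sigma"]) for _ in range(params["n"])]
    return sum(draws) / len(draws)


def build_manifest(params, result):
    """Collect the details needed to rerun the analysis and check the result."""
    params_json = json.dumps(params, sort_keys=True)
    return {
        "params": params,
        # Fingerprint of the exact parameter set that produced the result.
        "params_sha256": hashlib.sha256(params_json.encode("utf-8")).hexdigest(),
        "result": result,
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
    }


# Hypothetical parameter values, for illustration only.
params = {"seed": 2024, "n": 1000, "mu": 0.0, "sigma": 1.0}
result = run_analysis(params)
manifest = build_manifest(params, result)

# In a real project this text would be written to a file kept with the paper.
manifest_text = json.dumps(manifest, indent=2, sort_keys=True)

# Rerun from the stored parameters and compare with the stored result.
rerun = run_analysis(json.loads(manifest_text)["params"])
print("reproduced:", rerun == result)
```

The comparison at the end only checks a rerun in the same interpreter. Recording the Python version and platform in the manifest is what helps someone else understand a mismatch on a different setup.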

Reproducible Research Introduction: Reproducibility Toolkit

How can we use these tools to improve reproducibility? Objectives: Learn to use R, RStudio and RMarkdown. Our reproducibility toolkit: R + RStudio. Why R? Guide to reproducible code. 5 free resources every data scientist should start using today. Before jumping into popular MOOCs or purchasing recommended books on Amazon, I started by subscribing to various data science and data engineering newsletters.

5 free resources every data scientist should start using today

At first, I was reading every single article and taking notes, but over time learned to recognize the important links shared in multiple newsletters and focus on a few. Newsletters are great to stay up to date with new tools, academic research, and popular blog posts shared by large internet giants (e.g. Data Science - youcubed. Data Science in a Box. Data Science Fundamentals. Five Cognitive Biases In Data Science (And how to avoid them) Retrain your mind.

Five Cognitive Biases In Data Science (And how to avoid them)

Recently, I was reading Rolf Dobelli’s The Art of Thinking Clearly, which made me think about cognitive biases in a way I never had before. I realized how deeply seated some cognitive biases are. In fact, we often don’t even consciously realize when our thinking is being affected by one. For data scientists, these biases can really change the way we work with data and make our day-to-day decisions, and generally not for the better.

Data science is, despite the seeming objectivity of all the facts we work with, surprisingly subjective in its processes. As data scientists, our job is to make sense of the facts. As a result, we data scientists need to be extremely careful, because all humans are very much susceptible to cognitive biases. In this piece, I want to point out five of the most common types of cognitive biases.

Technology Tools

6 Open Source Data Science Projects to Impress Hiring Managers. 10 Time-Saving Data Exploration Hacks, Tips and Tricks! Data Ethics. Journal on Mathematics of Data Science (SIMODS). Two-Year College Data Science Summit. May 10-11, 2018, Washington, DC metro area. Agenda. Participants. View April 18 TYCDSS Introductory Webinar. With funding from the National Science Foundation, this workshop will bring together a diverse group of participants to make recommendations for two-year college data science programs, keeping in mind the needs of each of three student populations: those seeking employment following an associate’s degree; those seeking transfer to four-year programs; and those seeking certificate programs and college-level courses in data science for professional development. The following products are the desired outputs of the summit: Capacity for the workshop has been reached, but please fill out this Google form to be updated on developments, discussions, and products.

Two-Year College Data Science Summit

The steering committee includes the following: We will update this webpage with additional materials as they become available. We thank the sponsors of this workshop: To become a sponsor, contact Steve Pierson. Resources.