
Data Science


Universities Offer Courses in a Hot New Field - Data Science.

10 Fascinating Data Visualization Projects.

The way we digest information is evolving, and for good reason: the sheer amount of data we accumulate and compile each day is staggering. In fact, according to IBM, 90% of data in the world today has been created only within the last two years. With this kind of growth, sometimes static numbers or a simple rundown of statistics can fall flat. That's why people from data scientists to artists are using visuals to relay information. In the gallery above, we've compiled 10 data visualization projects that prove actually seeing information can be effective, as well as really cool.

Cameron Marlow | cameronmarlow.com.

These are the skills you need to be a data scientist at Facebook | VentureBeat | News | by christinafarr.

Finally, a clear description of what it means to be a data scientist! Facebook is seeking a “data scientist” to join the team of a dozen researchers in the Menlo Park HQ tasked with uncovering insights from the most extensive data set on human relationships ever assembled.

The job is lumped into the general category of “software engineering careers,” but is a far more product-driven role than you might expect. According to the job description, the ideal candidate is a strong communicator with an appreciation for products, someone with expertise in the full gamut of technical skills. Who is this wunderkind?

UseR! 2014 Los Angeles.

Etalab, the task force responsible for opening up public data and developing the French Open Data platform.

Projects.

The Changing American Diet: See what we ate on an average day, for the past several decades.
Who is Older and Younger than You: Here's a chart to show you how long you have until you start to feel your age.
Working Parents: Here are the mothers and fathers who work like you.
Shifting Parent Work Hours, Mom vs. Articles about stay-at-home dads and parents with even work loads might make it seem like dads are putting in a lot of hours in the household these days.
Divorce Rates for Different Groups: We know when people usually get married.
Comparing ggplot2 and R Base Graphics: Figure out which is best with a side-by-side comparison.
Never Been Married: Some people never get married, and some wait longer than others.
Marrying Age: People get married at various ages, but there are definite trends that vary across demographic groups.
Million to One Shot, Doc: Between 2009 and 2014, there were an estimated 17,968 visits to the emergency room for things stuck in a rectum.
Why People Visit the Emergency Room.

QuantifiedSelf Paris.

Information is Beautiful Award Winners 2013.

In an information-obsessed society, a new crop of artists is representing visual data in engaging, imaginative ways. In June, we were pleased to announce this year's Information is Beautiful Awards. It's a competition, open to the public, that was founded last year by data visualizer David McCandless and data investment agency Kantar's creative director Aziz Cami.

With the aim of recognizing the most excellent and beautiful work in data visualization, infographics and data journalism, Information is Beautiful's judges (chaired by McCandless and Cami, and including the President of the Rhode Island School of Design, John Maeda; editor of Creative Review, Patrick Burgoyne; Eric Rodenbeck and George Oates of famed San Francisco data-viz studio Stamen; and London-based designer and data artist Stefanie Posavec) selected from hundreds of global submissions. The online community also had a voice, as thousands of votes were cast to represent one additional judging place.

Celebrating Excellence in Data Visualization and Information Design.

Data - Facebook II - Mapping the Internet.

Edges may appear and disappear as routing patterns change over time. Some nodes are peers (A->B means A can exchange traffic through B for free) and some are connected by regular edges (A->B means A pays B to route their traffic). Peers have weights of zero in the training data and regular edges have weights of one. We have shuffled the AS numbers and replaced them with distorted versions of the AS names. In other words, the names are arbitrary with respect to reality, but consistent within the data. For example, if an AS were originally 26 CORNELL - Cornell University, it would be remapped to a random new number and its corresponding name, say (hypothetically), 11 HARVARD - Harvard University, with the text altered in some fashion.
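As a rough illustration of the weighting described here (zero for peer links, one for paid links), the sketch below parses pipe-separated `name | name | weight` lines and finds a least-weight path with Dijkstra's algorithm. It is a minimal sketch under stated assumptions, not the competition's method: it assumes edges are directed, that an "optimal" path is the one with the lowest total weight, and that names contain no stray `|` characters on the right-hand side. The function names and the sample AS names are hypothetical.

```python
import heapq
from collections import defaultdict


def parse_edges(lines):
    """Parse 'name1 | name2 | weight' lines into a directed adjacency map.

    Assumption: each line holds exactly one edge, and the weight is an
    integer (0 for a peer link, 1 for a paid link).
    """
    graph = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split from the right so the last two fields are always dst and weight.
        src, dst, weight = (part.strip() for part in line.rsplit("|", 2))
        graph[src][dst] = int(weight)
    return graph


def cheapest_path(graph, start, goal):
    """Return (total_weight, path) for a least-weight path, or None if unreachable."""
    queue = [(0, start, [start])]
    best = {start: 0}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor, path + [neighbor]))
    return None


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real training set.
    sample = [
        "AS-A | AS-B | 1",
        "AS-A | AS-C | 0",
        "AS-C | AS-B | 0",
        "AS-B | AS-D | 1",
    ]
    print(cheapest_path(parse_edges(sample), "AS-A", "AS-D"))
```

Because the weights are only zero or one, a 0-1 breadth-first search would also work; Dijkstra is used here simply because it is the more familiar general-purpose choice.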

For the prediction task, you are given a path which, at one point in the training time period, was an optimal path from node A to B. The training set graphs give links in the form: Messy AS Name 1 | Messy AS Name 2 | Edge weight. The test set paths have the form:

Getting started with Hadoop, Hive, and Sqoop « {5} Setfive – Talking to the World.

I apologize for the buzzword-heavy title but it was the best I could do. I couldn’t find a good quick start explaining how to get started with Hive so I thought I’d share my experiences. Anyway, a client of ours came to us needing to analyze a dataset of about 200 million rows collected over 6 months, currently growing by about 10 million rows a week, with that rate increasing. From a reporting standpoint, they were looking to run aggregate counts and group bys over the data and then display the results on charts.
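To make the reporting requirement concrete, here is a minimal sketch of an aggregate count with a GROUP BY. It uses Python's built-in `sqlite3` module with an in-memory database purely as a stand-in so the example runs anywhere; it is not Hive, and the table name `events`, its columns, and the sample rows are all hypothetical. HiveQL supports the same `COUNT(*) ... GROUP BY` shape, but the client's actual schema and queries are not given in the text.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (event_day TEXT, category TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [
            ("2012-01-01", "click"),
            ("2012-01-01", "view"),
            ("2012-01-02", "click"),
        ],
    )
    return conn


def counts_by_category(conn):
    """Run an aggregate count grouped by category, sorted for stable output."""
    query = (
        "SELECT category, COUNT(*) "
        "FROM events "
        "GROUP BY category "
        "ORDER BY category"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for category, total in counts_by_category(build_demo_db()):
        print(category, total)
```

On a few rows this returns instantly; the point of the setup described in the post is that the same kind of query over hundreds of millions of rows needs a system that scales with the data.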

They were also looking to select subsets of the data and use them later, basically SELECT * FROM table WHERE x AND y AND z. Obviously, doing the calculations in real time was out of the question, so we knew we were looking for a solution that would be easy to use, support the necessary requirements, and scale predictably with the increasing rate of data generation. With requirements in hand we hit the Internet and finally arrived at Hive running on top of Hadoop.

Most popular porn searches, by state.

We've seen that we can learn from what people search for, through the eyes of Google suggestions: state stereotypes, national stereotypes, and even the insecurities of age.

Do we see anything when we look at porn searches? PornHub released a small dataset on the three most popular searches for each state. Their map only shows the top search, which is limiting, but the chart above incorporates all queries. If a term was in a state's top three, the state is shaded black. There are interesting regional patterns in there, as you sweep from more widely spread searches to the more specific. PornHub also provided average time spent on the site per visit, but I pretty much ignored that column. Just note that the bar chart by PornHub starts at 10 minutes, which makes Rhode Island's average duration look a lot shorter.
