Data science

We’ve all heard it: according to Hal Varian, statistics is the next sexy job. Five years ago, in What is Web 2.0, Tim O’Reilly said that “data is the next Intel Inside.” But what does that statement mean? Why do we suddenly care about statistics and about data? In this post, I examine the many sides of data science: the technologies, the companies and the unique skill sets. The web is full of “data-driven apps.” One of the earlier data products on the Web was the CDDB database. Google is a master at creating data products. Google’s breakthrough was realizing that a search engine could use input other than the text on the page. With Flu Trends, Google was able to spot trends in the Swine Flu epidemic roughly two weeks before the Centers for Disease Control by analyzing searches that people were making in different regions of the country. Google isn’t the only company that knows how to use data. In the last few years, there has been an explosion in the amount of data that’s available.

http://radar.oreilly.com/2010/06/what-is-data-science.html

TreeSheets A "hierarchical spreadsheet" that is a great replacement for spreadsheets, mind mappers, outliners, PIMs, text editors and small databases. Suitable for any kind of data organization, such as todo lists, calendars, project management, brainstorming, organizing ideas, planning, requirements gathering, presentation of information, etc. It's like a spreadsheet, immediately familiar, but much more suitable for complex data because it's hierarchical.

Will We Exploit Data or Will Data Exploit Us? Interest in big data and open data is understandably growing all over the world. The combination of several technology innovations, in areas like social media, cloud computing and analytics, offers scenarios that we could hardly imagine in the past. And the trend toward greater transparency and openness that is being championed by many governments and NGOs is almost creating a “perfect storm” around the ability to extract wealth from the growing masses of data that are freely available over the Internet. It is not just about data that was previously kept behind boundaries and that governments are liberating through their various “data.gov” initiatives.

Institute for Advanced Analytics The Data Science Lab (or “DataLab”) is the research arm of the Institute for Advanced Analytics. It brings together investigators from various disciplines to focus on the intersection of analytical and computational challenges organizations face in extracting meaningful insights from a vast quantity and variety of data. Areas of Interest: The detection of anomalous patterns or rare events within and across various realms of data, and determining whether such patterns are meaningful outliers or artifacts.

What Is Data Science? Data Scientists perform data science. They use technology and skills to increase awareness, clarity and direction for those working with data. The data scientist role exists to accommodate the rapid changes that occur in our modern-day environment, and data scientists are given the task of minimising the disruption that technology and data are having on the way we work, play and learn.

Why Census matters to you A census in any country is important in making major policy decisions and can affect your day-to-day life, but it's not always obvious how. Leading up to the August 9 Australia Census, the Australian Bureau of Statistics put together an interactive called Spotlight, which helps its citizens understand the data a little better. Spotlight takes some of the data from the last Census, conducted in 2006, and turns it into a simple interactive movie, to show just a few of the interesting things that the Census can tell us about Australia's people and population. As you go through the interactive, it asks you a few things about yourself, such as gender and where you live, and then tells you what the Census says about you and what's around you. It also zooms out to put things in perspective. The voice-over helps to make it extra playful.

Building data startups: Fast, big, and focused This is a written follow-up to a talk presented at a recent Strata online event. A new breed of startup is emerging, built to take advantage of the rising tides of data across a variety of verticals and the maturing ecosystem of tools for its large-scale analysis. These are data startups, and they are the sumo wrestlers on the startup stage. The weight of data is a source of their competitive advantage.

Setting up a Data Science Laboratory There is no better way of understanding new data processing, retrieval, analysis or visualising techniques than actually trying things out. In order to do this, it is best to use a server that acts as a data science lab, with all the basic tools and sample data in place. Buck Woody discusses his system, and the configuration he chose.

Media Map Shows Impact of Information (December 5, 2011) How do we know that media make a difference? We’ve looked at the numbers. The Media Map Project, a multi-faceted research collaboration between Internews and The World Bank Institute, has made 25 data sets available to the public for download and analysis. Collectively, they touch on every country in the world and cover up to 30 years' worth of information. Visualizations on the site allow viewers to interact with and test the data in different ways.

Department of Computer Science - Viterbi School of Engineering - Data Science The Master of Science in Computer Science (Data Science) provides students with a core background in Computer Science and specialized algorithmic, statistical, and systems expertise in acquiring, storing, accessing, analyzing and visualizing large, heterogeneous and real-time data associated with diverse real-world domains including energy, the environment, health, media, medicine, and transportation. Curriculum: You must take the following required courses: CS 570 - Analysis of Algorithms 3 Units - Fall, Spring, Summer.
