
What is data science


The Big Data Brain Drain: Why Science is in Trouble. Regardless of what you might think of the ubiquity of the "Big Data" meme, it's clear that the growing size of datasets is changing the way we approach the world around us.


This is true in fields from industry to government to media to academia and virtually everywhere in between. Our increasing ability to gather, process, visualize, and learn from large datasets is helping to push the boundaries of our knowledge. But where scientific research is concerned, this recently accelerated shift to data-centric science has a dark side, which boils down to this: the skills required to be a successful scientific researcher are increasingly indistinguishable from the skills required to be successful in industry. While academia, with typical inertia, gradually shifts to accommodate this, the rest of the world has already begun to embrace and reward these skills to a much greater degree.

The story of contemporary innovation is Big Data. This week's reading comes from the venerable magazine The Atlantic. It is by Erik Brynjolfsson, economist at the Sloan School of Management and director of the Digital Productivity group at the Massachusetts Institute of Technology's Center for Digital Business, and Andrew McAfee, the authors of Race Against the Machine ("How the Digital Revolution Is Accelerating Innovation, Driving Productivity, and Irreversibly Transforming Employment and the Economy").


The Human Face of Big Data. Big Data, Big Hype: Big Deal. ‘Big data’ is dead. What’s next?

This is a guest post by technology executive John De Goes. “Big data” is dead. Big data is dead, long live big data: Thoughts heading to Strata. A recent VentureBeat article argues that “Big Data” is dead.


It’s been killed by marketers. That’s an understandable frustration (and a little ironic to read about it in that particular venue). As I said sarcastically the other day, “Put your Big Data in the Cloud with a Hadoop.” You don’t have to read much industry news to get the sense that “big data” is sliding into the trough of Gartner’s hype curve. That’s natural. Research paper: What big data can do for the cultural sector « Cross Innovation. These days, data sizes are almost infinite, and organizations learn a lot about their position and successful strategies by simply analyzing data.


However, most cultural industries have not yet implemented this concept. Anthony Lilley and Paul Moore give their views on how data can benefit creative industries. This paper argues the value of big data analysis for creative institutions, but also that most of them are not even taking online data into account. The paper is a collaboration between Paul Moore, professor at the University of Ulster and researcher on (theory and practice of) the creative industries, and Anthony Lilley, media practitioner and creative concept developer with international experience in the creative industries.

Dr. Brian Lowe, SUNY Oneonta – Analyzing "Big Data". In today’s Academic Minute, Dr. Brian Lowe of the State University of New York Oneonta explains why "Big Data" is becoming a focus of academic inquiry. Brian Lowe is an associate professor of sociology at the State University of New York Oneonta, where his research and teaching interests include sociological theories, animal and society, cultural and comparative-historical sociology, and spectacular conflicts. His work has appeared in a number of peer-reviewed journals, and in 2006 he published Emerging Moral Vocabularies: The Creation and Establishment of New Forms of Moral and Ethical Meanings.

Why you should never trust a data visualisation. First of all, let me be clear: the headline of this article is a reference to Pete Warden's post, and should be read in the same way - as a caution against blind acceptance, rather than the wholesale condemnation of data visualisation.


An excellent blogpost has been receiving a lot of attention over the last week. Pete Warden, an experienced data scientist and author for O'Reilly on all things data, writes: The wonderful thing about being a data scientist is that I get all of the credibility of genuine science, with none of the irritating peer review or reproducibility worries ... I thought I was publishing an entertaining view of some data I'd extracted, but it was treated like a scientific study. This is an important acknowledgement of a very real problem, but in my view Warden has the wrong target in his crosshairs. But there is: humans are visual creatures. What am I doing about it? Ultimately, I believe the solution is a two-way street. Where do you sit on this debate?

Big Data Ethics. Big Data is "bullshit". Harper Reed, chief technology officer of Barack Obama's 2012 campaign, has something to say about Big Data. And it is not praise, at least as far as the IT industry's use of the term is concerned.

Employment put to the test by algorithms. By Hubert Guillaud, 3 May 2013.


Big Data: a new stage in the computerization of the world. By Hubert Guillaud, 14 May 2013. Viktor Mayer-Schönberger, professor at the Oxford Internet Institute, and Kenneth Cukier, data editor at The Economist, recently published Big Data: A Revolution That Will Transform How We Live, Work, and Think (see the book's dedicated site).


The book is interesting in more than one respect, but above all for what it tells us about the change the world is undergoing. To Hypothesize or Not to Hypothesize. "Not everything that counts can be counted, and not everything that can be counted counts." - Albert Einstein. Data Science is in the early stage of development and needs to develop canons to guide us. There is a brewing debate about the use of established scientific methods in the practice of data science. Some suggest traditional scientific methods must be used while others assert new scientific methods must be developed - especially considering algorithms, machine learning and future artificial intelligence. Part of that debate includes whether it is necessary to form a hypothesis. I suggest the answer is it depends. ... (2) Deduction of meaning from the data and different data relationships. (3) Formation of hypothesis. (4) Experimental or observational testing of the validity of the hypothesis.

Forget big data, small data is the real revolution. There is a lot of talk about "big data" at the moment. For example, this is Big Data Week, which will see events about big data in dozens of cities around the world. But the discussions around big data miss a much bigger and more important picture: the real opportunity is not big data, but small data. Not centralized "big iron", but decentralized data wrangling. Not "one ring to rule them all" but "small pieces loosely joined". Universities Offer Courses in a Hot New Field - Data Science. The Institute for Data Science takes part in the International Year of Statistics. For the Data Science Institute, these new professions, which we group under the term Data Science, combine precisely three components: statistics, computer science, and communication.

We therefore quite naturally chose to join this International Year of Statistics in order to promote this knowledge and the professions associated with it. We are not alone! Far from it: more than 1,400 organizations from 108 countries have already joined the initiative. Data Jujitsu: The art of turning data into product. Having worked in academia, government and industry, I’ve had a unique opportunity to build products in each sector. Much of this product development has been around building data products. Just as methods for general product development have steadily improved, so have the ideas for developing data products. Thanks to large investments in the general area of data science, many major innovations (e.g., Hadoop, Voldemort, Cassandra, HBase, Pig, Hive, etc.) have made data products easier to build.

Nonetheless, data products are unique in that they are often extremely difficult, and seemingly intractable for small teams with limited funds. Yet, they get solved every day. How? The 3 Vs of Big Data: Volume, Velocity, and Variety. The volume of data is exploding. In a 2010 report devoted to Big Data, McKinsey predicted a 60% increase in operating margin for retailers that made full use of these enormous volumes of data. A Data Scientist's Real Job: Storytelling - Jeff Bladt and Bob Filbin. Every morning at DoSomething.org, our computers greet us with a report containing over 350 million data points tracking our organization’s performance.

Our challenge as data scientists is to translate this haystack of information into guidance for staff so they can make smart decisions, whether it’s choosing the right headline for today’s email blast (should we ask our members to “take action now” or “learn more”?) or determining the purpose of our summer volunteer campaign (food donation drive or recycling campaign?). In short, we’re tasked with transforming data into directives. Good analysis parses numerical outputs into an understanding of the organization.

Is Big Data Dying? Has big data been the victim of too much hype? Has it failed to live up to its promise? The newsonomics of a news company of the future. What will news companies look like in 2018? How will they operate differently? That future is coming into focus. While many publishers’ vision is still quite blurry, it’s the Financial Times that is clearest-eyed about its roadmap and its future. The FT’s clarity first struck me when I sat down for an introductory talk with FT.com managing director Rob Grimshaw in London in fall 2009. His office, just off Southwark Bridge, offered a view of the Thames that forced you to think about the long history of newspapering in that city, a business then in a deep, deep recession along with the rest of the global economy.

Today, Grimshaw is headquartered in New York City. Make no mistake: The FT hasn’t quite cracked the code yet. All that said, given its digital transition, I believe it is far more likely to successfully cross over to the new age than other publishers. Look beyond the product it creates, and look at how it creates it, and the lessons tumble out. Make the company connection. No, data is not oil... Not a week goes by without a special report headlined "data, the oil of the 21st century", "data is the new oil", "data, the new black gold", "your personal data is worth 315 billion euros", "seize the opportunities of big data", or even a "hidden treasure", and so on.

We get the metaphor: personal data, public data, and data from the Internet of Things would be like oil, a natural resource that is fluid, open to all sorts of transformations, and carries enormous potential value. More than that, they would be the leaven of a new industrial revolution, destined to bend the world economy to their powerful industrial potential. We get the metaphor, but it is no less tiresome for that.

Competitions. Why becoming a data scientist is NOT actually easier than you think - josephmisiti's posterous. Software engineer’s guide to getting started with data science. Many of my software engineer friends ask me about learning data science. There are many articles on this subject from renowned data scientists (Dataspora, Gigaom, Quora, Hilary Mason). This post captures my journey (as a software engineer) in learning Statistics and Data Visualization. I'm mid-way in my 5-year journey to become proficient in data science and my learning program has included self-learning (books, blogs, toy problems), projects at work, class-room training (Stanford), teaching/presentations, conferences (UseR, Strata).

Here's what I've done so far and what worked and what didn't... a) Self-learning (2 - 4 months) Explore if data science is for you. This is the key to getting started. b) Class-room training (9 - 12 months) If you're serious about learning, enroll in a formal program. If you're serious about picking up this skill, then opt for a course. Introduction to Data Science. About the Course: Commerce and research are being transformed by data-driven discovery and prediction. Skills required for data analytics at massive levels – scalable data management on and off the cloud, parallel algorithms, statistical modeling, and proficiency with a complex ecosystem of tools and platforms – span a variety of disciplines and are not easy to obtain through conventional curricula.

Tour the basic techniques of data science, including both SQL and NoSQL solutions for massive data management (e.g., MapReduce and contemporaries), algorithms for data mining (e.g., clustering and association rule mining), and basic statistical modeling (e.g., linear and non-linear regression). Recommended Background. Career Advice: How do I become a data scientist. » Getting Started with Data Science hilarymason. I get quite a few e-mail messages from very smart people who are looking to get started in data science. Here’s what I usually tell them: The best way to get started in data science is to DO data science! First, data scientists do three fundamentally different things: math, code (and engineer systems), and communicate. Figure out which one of these you’re weakest at, and do a project that enhances your capabilities.

Then figure out which one of these you’re best at, and pick a project which shows off your abilities. Second, get to know other data scientists! Third, put your projects out in public. So you want to be a data scientist? : Nature Jobs Blog. Big Data. Big Data. Peasant Muse: From Data Self to Data Serf. "We belong to you, but the land belongs to us." - Russian Peasant Refrain "But even when I am at a loss to define the essence of freedom I know full well the meaning of captivity." - Adam Zagajewski, 'Freedom' "You can't be what you were So you better start being just what you are." The Promise and Peril of the 'Data-Driven Society'. Data from credit card purchases and cell phone use provide a wealth of information about personal behavior.

A small group of academics, business executives and journalists gathered at the M.I.T. #2 Jeff Hammerbacher, Chief Scientist, Cloudera and DJ Patil, Entrepreneur-in-Residence, Greylock Ventures - Tim O’Reilly: The World’s 7 Most Powerful Data Scientists. Data Science Blogs. The #1 Career Mistake Capable People Make. Volume 1, Issue 1. Learning To Be A Data Scientist. Columbia University to Create Data Sciences Institute in NYC. New York University wants to train the next generation of data scientists. How to think and talk like a collaborative data scientist.

Are You a Data Whisperer? The curse of big data. Data Scientists: The Definition of Sexy. Harvard Business Review: Data Scientist Is The 'Sexiest Job Of The 21st Century' Data Scientist: The Sexiest Job of the 21st Century. Microsoft: Big Data Dominated By Marketing, Sales. Culture & Big Data: Four Essential Questions. Big Data Is Great, but Don’t Forget Intuition. A dive into the heart of the Web. Big Data, the great illusion. Dizzying "big data". Big Data’s Big Problem: Little Talent - Tech Europe. Specials : Nature. Keep Your Data Scientist…Send Me A Data Artist! — International Institute for Analytics. Internet, Politics, Policy 2012: Big Data, Big Challenges?

International Journal of Internet Science, Volume 7, Issue 1. RESEARCH CENTER FOR DATAOLOGY AND DATASCIENCE. Search Results » burrell. Big Data under analysis: what is Big Data? (1/5) « Rendez-vous à Vheissu. Top 5 Myths About Big Data. A Very Short History of Data Science. A Taxonomy of Data Science. Big Data’s Impact in the World. A data science cheat sheet. Big Data in 2012: Hadoop, Big Data Apps, Data Science Tools, Cloud Collision and More. How Big Data Gets Real. Why the term "data science" is flawed but useful. Data science is a pipeline between academic disciplines.

Data Science: a literature review. What is "Data Science" Anyway? Six months after "What is data science?" What is Data Science. Data Science Central. What is Data Science and Why Should I Care? « The Slalom Blog. What is a data scientist? The Evolution of "What is Data Science?" Rise of the Data Scientist. Hal Varian on how the Web challenges managers - McKinsey Quarterly - Strategy - Innovation. The Three Sexy Skills of Data Geeks « Dataspora. For Today’s Graduate, Just One Word - Statistics. (9) What is data science. What is data science?