
Six Provocations for Big Data

The era of Big Data has begun. Computer scientists, physicists, economists, mathematicians, political scientists, bio-informaticists, sociologists, and many others are clamoring for access to the massive quantities of information produced by and about people, things, and their interactions. Diverse groups argue about the potential benefits and costs of analyzing information from Twitter, Google, Verizon, 23andMe, Facebook, Wikipedia, and every space where large groups of people leave digital traces and deposit data. Significant questions emerge. This essay offers six provocations that we hope can spark conversations about the issues of Big Data. (This paper was presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011.)

Private conversation on Facebook The media have dubbed this bug, which Facebook has so far continued to deny, the "Facebook nightmare": private messages, received between 2007 and 2009, that randomly reappear on users' timelines. Their public timelines. Once again (we have lost count), the social network finds itself at the center of internet users' cyber-existential anxieties. That is understandable. But at the same time, isn't it finally time to become aware of the risks inherent in social networks when it comes to personal data? Facebook is not there to protect your privacy Facebook is a tool for sharing and publishing; its purpose is not to secure its members' exchanges on the internet. As the journalist Jean-Marc Manach, a specialist in privacy issues, sums it up very well, "there is no 'private life' on Facebook: on a 'social network', one leads a 'social life', or even a 'public life'". Facebook forgets nothing

The evolution of data products In “What is Data Science?,” I started to talk about the nature of data products. Since then, we’ve seen a lot of exciting new products, most of which involve data analysis to an extent that we couldn’t have imagined a few years ago. It’s an old problem: the geeky engineer wants something cool with lots of knobs, dials, and fancy displays. Disappearing data We’ve become accustomed to virtual products, but it’s only appropriate to start by appreciating the extent to which data products have replaced physical products. But while we’re accustomed to the displacement of physical products by virtual products, the question of how we take the next step — where data recedes into the background — is surprisingly tough. A list may be an appropriate way to deliver potential contacts, and a spreadsheet may be an appropriate way to edit music metadata. These projects suggest the next step in the evolution toward data products that deliver results rather than data. We can push even further. Interfaces

Ethnography armed with statistics 1 I thank Agnès Gramain, a statistician and econometrician trained in sociology, whose remarks (...) 2 Some elements toward such a history can be found, among others, in M. Pollak, « Paul Lazarsf (...) 3 In particular, the quantitative use of ethnographic surveys stems from a misunderstanding of (...) 4 B. Lepetit, « Architecture, géographie, histoire : usages de l’échelle », Genèses, 13, 1993, p. 11 (...) 1. Opposing the production and use of statistics to ethnographic observation and the in-depth interview, like the quantitative to the qualitative, the "macro" to the "micro", and ultimately sociology to anthropology (note 1): these commonplaces do not withstand an examination of the history of the social sciences (note 2), nor a research practice that draws, in succession, on observation, questionnaires, and national statistics. 5 The first results can be found in: M. Pluvinage & F. 9 Cf. Le travail à-côté entre ressource et contrainte

«Humanités Digitales» Text revised on Monday, October 28, 2013. The use in French of the expression «Humanités Digitales» began at the Université de Bordeaux 3 as early as 2008, and at the Université de Lausanne in 2010, without any contact between the research groups at the two universities. Since 2013, Switzerland has had 3 Humanités Digitales laboratories, at the Université de Bâle, at EPFL, and at the Université de Lausanne. The appearance of a neologism is a social phenomenon, and it will take a long time to analyze. I will simply point out here that «ordinateur», or the English «computer», designates a cerebral concept. Moreover, the researcher in Humanités Digitales is akin to Homo Faber and truly makes, creates, the new human and social sciences. Bibliographic references (list to be completed, all suggestions welcome!) Michel Serres, Petite Poucette, Paris: Le Pommier, 2012. Claire Clivaz, «Common Era 2.0. Proceedings of THATCamp Saint-Malo 2013, workshop 1: forthcoming online soon Claire Clivaz

The New Big Data Top scientists from companies such as Google and Yahoo are gathered alongside leading academics at the 17th Association for Computing Machinery (ACM) conference on Knowledge Discovery and Data Mining (KDD) in San Diego this week. They will present the latest techniques for wresting insights from the deluge of data produced nowadays, and for making sense of information that comes in a wider variety of forms than ever before. Twenty years ago, the only people who cared about so-called “big data”—the only ones who had enormous data sets and the motivation to try to process them—were members of the scientific community, says Usama Fayyad, executive chair of ACM’s Special Interest Group on Knowledge Discovery and Data Mining and former chief data officer at Yahoo. The explosive growth of the Internet, however, changed everything. These days, Internet giants make their money from the information they collect about users and the insights they gain from mining it.

OpenRefine

Debates in the Digital Humanities Encompassing new technologies, research methods, and opportunities for collaborative scholarship and open-source peer review, as well as innovative ways of sharing knowledge and teaching, the digital humanities promises to transform the liberal arts, and perhaps the university itself. Indeed, at a time when many academic institutions are facing austerity budgets, digital humanities programs have been able to hire new faculty, establish new centers and initiatives, and attract multimillion-dollar grants. Clearly the digital humanities has reached a significant moment in its brief history. But what sort of moment is it? Debates in the Digital Humanities brings together leading figures in the field to explore its theories, methods, and practices and to clarify its multiple possibilities and tensions.

Why We Should Learn the Language of Data Illustration: Ellen Lupton “How can global warming be real when there’s so much snow?” Hearing that question, repeatedly, this past February drove Joseph Romm nuts. A massive snowstorm had buried Washington, DC, and all across the capital, politicians and pundits who dispute the existence of climate change were cackling. The family of Oklahoma senator Jim Inhofe built an igloo near the Capitol and put up a sign reading “Al Gore’s New Home.” The planet can’t be warming, they said; look at all this white stuff! Romm, a physicist and climate expert with the Center for American Progress, spent a week explaining to reporters why this line of reasoning is so wrong. Statistics is hard. Consider the economy: Is it improving or not? Problem is, to calculate that stat, economists remove stores that have closed from their sample. Or take the raging debate over childhood vaccination, where well-intentioned parents have drawn disastrous conclusions from anecdotal information.

Reconciling market segments and personas Market segmentation and personas are two different techniques that are often perceived as conflicting methods, but they are actually complementary tools that organizations can use to design and sell successful products. The value of market segmentation The marketing profession has taken much of the guesswork out of determining what motivates people to buy. One of the most powerful tools for doing so is market segmentation, which groups people by their distinct needs to determine what types of consumers will be most receptive to a particular product or marketing message. To develop these models, marketers classify consumers according to a set of demographics and geographic variables such as age, race, education, and location. However, understanding why somebody wants to buy something is not the same thing as actually defining the product—what it is, how it will work, and how it will be used. The value of personas How do you select the right personas? An example Example survey chart

The two faces of the tool that is forging the new digital humanity LE MONDE | By Paul Mathias, expert at the Hadopi labs and former program director at the Collège international de philosophie. Inextricably linked to social networks and to Facebook, the "Arab Spring" made a deep impression, but it was not the first event to result from an expert and sustained use of networks. The demonstrations in Genoa against the G8, in 2001, established the idea that "smart mobs", made up of interconnected and mobile individuals, formed a highly effective instrument of protest, capable of neutralizing the containment techniques deployed by police forces. Yet the marriage of democracy with digital and network technologies poses a problem. The difference is that mediation does not happen here as it does through ink and paper, or through voice and image and their analog diffusion. The titan Facebook therefore raises questions, while leaving open the uncertainties that accompany it.

How much information is there in the world? Think you're overloaded with information? Not even close. A study appearing on Feb. 10 in Science Express, an electronic journal that provides select Science articles ahead of print, calculates the world's total technological capacity -- how much information humankind is able to store, communicate and compute. "We live in a world where economies, political freedom and cultural growth increasingly depend on our technological capabilities," said lead author Martin Hilbert of the USC Annenberg School for Communication & Journalism. So how much information is there in the world? Prepare for some big numbers: Looking at both digital memory and analog devices, the researchers calculate that humankind is able to store at least 295 exabytes of information. Telecommunications grew 28 percent annually, and storage capacity grew 23 percent a year. "These numbers are impressive, but still minuscule compared to the order of magnitude at which nature handles information," Hilbert said.
