
Data analysis and visualization


Datablog.owni.fr. Journalism in the Age of Data: A Video Report on Data Visualization by Geoff McGhee. How a Science Journalist Created a Data Visualization to Show the Magnitude of the Haiti Earthquake. On the one-year anniversary of the Haiti earthquake, journalist Peter Aldhous created a data visualization that shows how the Caribbean country's earthquake, relatively modest in seismic terms, caused as many fatalities as all but one earthquake over a time span of almost 40 years.

The data visualization is striking but also a study in how journalists are increasingly telling stories that leverage datasets that are freely available to the public. Peter Aldhous, San Francisco Bureau Chief for New Scientist magazine, created the interactive graphics. We asked him to explain how he created the visualizations which compare seismic activity to fatalities caused by earthquakes over the span of four decades. Aldhous posted the data visualizations on his Web site with the following explanation: The earthquake that struck near the Haitian capital, Port-au-Prince, on 12 January 2010, was unremarkable in seismic terms -- barely making the year's top 20 most powerful quakes. The bar graph marks fatalities. The Results. Notabilia – Visualizing Deletion Discussions on Wikipedia. The Joy of Stats. About the video Hans Rosling says there’s nothing boring about stats, and then goes on to prove it.

A one-hour-long documentary produced by Wingspan Productions and broadcast by BBC, 2010. A DVD is available to order from Wingspan Productions. Director & Producer: Dan Hillman, Executive Producer: Archie Baron. The change from large to small families reflects dramatic changes in people's lives. Hans Rosling asks: Has the UN gone mad? Hans Rosling explains a very common misunderstanding about the world: that saving the poor children leads to overpopulation. Stats With Cats Blog | … for when you can't solve life's problems with statistics alone. Ten Fatal Flaws in Data Analysis | Stats With Cats Blog. 1. Where’s the Beef? In a way, the worst flaw a data analysis can have is no analysis at all. Instead, you get data lists, sorts and queries, and maybe some simple descriptive statistics but nothing that addresses objectives, answers questions, or tells a story.

If that’s all you want, that’s fine. But a data report is not a data analysis. Reports provide information; analyses provide knowledge. 2. If there were to be a fatal flaw in an analysis, it would probably involve how well the samples represent the population. 3. Sometimes the population is real and well defined, but the samples don’t represent it adequately. 4. The number of samples always seems to be an issue in statistical studies. 5. Most people don’t appreciate variance. 6. NASA uses checklists to ensure that every astronaut does things correctly, completely, and consistently. 7. 8. Here’s where you have to use your gut feel. 9. 10. Any Questions? Visualising Data.

Information Is Beautiful | Ideas, issues, concepts, subjects - v. Info=Beautiful (infobeautiful) Meeting David McCandless » Article » OWNI, Digital Journalism. The Guardian journalist runs the site "Information is beautiful", where he stages all kinds of data. An interview about the issues raised by data visualization. For anyone interested in data visualization, having tea with David McCandless of Information is beautiful is a bit like sharing a joint with your favorite rockers when you are a groupie. I smile blissfully while he grumbles about his new house, which he finds far too big and too cold. David puts water on to boil and I notice that even his teapot is wrapped in a little woollen cover. A few moments later I follow him, without sugar or milk, up the stairs that lead to his office. Work in progress. There, he shows me an infographic on exoplanets that he is currently finishing for The Guardian.

The notion of scale is fundamental for me; I believe it is truly the key to data visualization, because it provides both context and meaning. Google NGram Experiments. With Google’s new tool Ngram Viewer, you can visualise the rise and fall of particular keywords across 5 million books and 500 years! See how big cocaine was in Victorian times. The spirit of inquiry over the ages. The spirit of inquiry over the ages II (NGram is case-sensitive). The Battle Of The Brains. What happened around 1700??? Age-old debates (by Andy, James Rooney, Nick, Bidzubido, Jacqui, Gary, Stefan Lasiewski, Mark) Got any more? Debtris. Horoscoped. Do horoscopes really all just say the same thing? We scraped & analysed 22,000 to see.

See our completed meta-horoscope chart and make up your own mind. We’ve also created a single meta-prediction out of the most common words.. How do you gather 22,000 horoscopes? Obviously you could manually cut and paste them from one of the many online Zodiac pages. But that, we calculated, would take about a week of solid work (84.44 hours). So we engaged the services of arch-coder Thomas Winnigham to do a bit of hacking. Yahoo Shine kindly archive their daily predictions in a simple and very hackable format (example).

Well, it’s not quite that easy. We can’t share the 9.5MB spreadsheet with you because it’s Yahoo’s copyright. So every different type of horoscope got sucked up – career, teen, love, daily overview. We used an online tool called TagCrowd to find the most common words. You can see the full data in a Google spreadsheet here. FlowingData | Data Visualization, Infographics, and Statistics. Nathan Yau (flowingdata) Nathan Yau: I have the top search resu... Picturing social order. 10 Best Data Visualization Projects of the Year – 2010. Data visualization and all things related continued its ascent this year with projects popping up all over the place.

Some were good, and a lot were not so good. More than anything, I noticed a huge wave of big infographics this year. It was amusing at first, but then it kind of got out of hand when online education and insurance sites started to game the system. Although it's died down a lot ever since the new Digg launched. That's what stuck out in my mind initially as I thought about the top projects of the year. One of the major themes for 2010 was using data not just for analysis or business intelligence, but for telling stories. So here are the top 10 visualization projects of the year, listed from bottom to top. 10. Scott Manley of the Armagh Observatory visualized 30 years of asteroid discoveries. 9. Hannah Fairfield, former editor for The New York Times, and now graphics director for The Washington Post, had a look at gas prices versus miles driven per capita. 8. 7. 6. 5. 4. 3.

Facebook worldwide friendships mapped. As we all know, people all over the world use Facebook to stay connected with friends and family. You meet someone. You friend him or her on Facebook to keep in touch. These friendships began within universities, but today there are friendships that connect countries. Facebook engineering intern Paul Butler visualizes these connections: I defined weights for each pair of cities as a function of the Euclidean distance between them and the number of friends between them. Then I plotted lines between the pairs by weight, so that pairs of cities with the most friendships between them were drawn on top of the others. In other words, for each pair of countries with a friend in one country and a friend in the other, a line was drawn. It might remind you of Chris Harrison's maps that show interconnectedness via router configurations. In areas of high density it looks more or less like population density.
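Butler's description (a weight per city pair based on distance and friend count, then lines drawn in weight order) can be illustrated with a small sketch. The excerpt does not give the actual weighting function, so `pair_weight` below is a hypothetical stand-in, and the `CityPair` type and projected `(x, y)` coordinates are assumptions made only for illustration.

```python
import math
from typing import List, NamedTuple, Tuple


class CityPair(NamedTuple):
    """Hypothetical record for two cities and the friendships between them."""
    a: Tuple[float, float]  # (x, y) of the first city in projected map coordinates
    b: Tuple[float, float]  # (x, y) of the second city
    friends: int            # number of friendships linking the two cities


def pair_weight(pair: CityPair) -> float:
    """Assumed weighting: friend count discounted by Euclidean distance.

    The source only says the weight is "a function of" these two
    quantities; this particular formula is a guess.
    """
    distance = math.dist(pair.a, pair.b)
    return pair.friends / (1.0 + distance)


def drawing_order(pairs: List[CityPair]) -> List[CityPair]:
    """Sort ascending by weight so the heaviest pairs are drawn last, on top."""
    return sorted(pairs, key=pair_weight)


pairs = [
    CityPair((0.0, 0.0), (3.0, 4.0), 10),
    CityPair((0.0, 0.0), (0.0, 1.0), 10),
    CityPair((0.0, 0.0), (3.0, 4.0), 60),
]
ordered = drawing_order(pairs)
```

A plotting loop would then iterate over `ordered` and draw one line per pair. Because later lines paint over earlier ones, the pairs with the highest weight end up most visible.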

Information aesthetics - Information Visualization & Visual Communication. Andrew Vande Moere (infosthetics) What is Data Visualization? Data journalism and data visualization | News. Guardian Datastore (datastore) Data journalism and data visualization from the Datablog | News. How to be a data journalist | News. Data journalism is huge. I don't mean 'huge' as in fashionable - although it has become that in recent months - but 'huge' as in 'incomprehensibly enormous'. It represents the convergence of a number of fields which are significant in their own right - from investigative research and statistics to design and programming.

The idea of combining those skills to tell important stories is powerful - but also intimidating. Who can do all that? The reality is that almost no one is doing all of that, but there are enough different parts of the puzzle for people to easily get involved in, and go from there. 1. 'Finding data' can involve anything from having expert knowledge and contacts to being able to use computer assisted reporting skills or, for some, specific technical skills such as MySQL or Python to gather the data for you. 2. 3. 4. Tools such as ManyEyes for visualisation, and Yahoo! How to begin? So where does a budding data journalist start? Play around. And you know what? Data journalism pt1: Finding data (draft – comments invited) The following is a draft from a book about online journalism that I’ve been working on.

I’d really appreciate any additions or comments you can make – particularly around sources of data and legal considerations. The first stage in data journalism is sourcing the data itself. Often you will be seeking out data based on a particular question or hypothesis (for a good guide to forming a journalistic hypothesis see Mark Hunter’s free ebook Story-Based Inquiry (2010)).

On other occasions, it may be that the release or discovery of data itself kicks off your investigation. There are a range of sources available to the data journalist, both online and offline, public and hidden. Typical sources include: national and local government; bodies that monitor organisations (such as regulators or consumer bodies); scientific and academic institutions; health organisations; charities and pressure groups; business; and the media itself. Private companies and charities. Regulators, researchers and the media. Data journalism pt2: Interrogating data. This is a draft from a book chapter on data journalism (the first, on gathering data, is here). I’d really appreciate any additions or comments you can make – particularly around ways of spotting stories in data, and mistakes to avoid. UPDATE: It has now been published in The Online Journalism Handbook. “One of the most important (and least technical) skills in understanding data is asking good questions.

An appropriate question shares an interest you have in the data, tries to convey it to others, and is curiosity-oriented rather than math-oriented. Once you have the data you need to see if there is a story buried within it. The first stage in this process, then, is making sure the data is in the right format to be interrogated. If the information is already online you can sometimes ‘scrape’ it – that is, automatically copy the relevant information into a separate document. Insert: Cleaning up data Some tips for cleaning your data include: Use a spellchecker to check for misspellings. Data journalism pt3: visualising data – charts and graphs (comments wanted) This is a draft from a book chapter on data journalism (the first, on gathering data, is here; the section on interrogating data is here). I’d really appreciate any additions or comments you can make – particularly around considerations in visualisation.

A further section on visualisation tools can be found here. UPDATE: It has now been published in The Online Journalism Handbook. “At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers – even a very large set – is to look at pictures of those numbers.” (Edward Tufte, The Visual Display of Quantitative Information, 2001) Visualisation is the process of giving a graphic form to information which is often otherwise dry or impenetrable.

Broadly speaking there are two typical reasons for visualising data: to find a story; or to tell one. In most cases, however, the story will not be as immediately visible. Types of visualisation. Data journalism pt4: visualising data – tools and publishing (comments wanted) This is a draft from a book chapter on data journalism (here are parts 1; two; and three, which looks at the charts side of visualisation). I’d really appreciate any additions or comments you can make – particularly around tips and tools. UPDATE: It has now been published in The Online Journalism Handbook. Visualisation tools. So if you want to visualise some data or text, how do you do it? Thankfully there are now dozens of free and cheap pieces of software that you can use to quickly turn your tables into charts, graphs and clouds. The best-known tool for creating word clouds is Wordle (wordle.net).

Simply paste a block of text into the site, or the address of an RSS feed, and the site will generate a word cloud whose fonts and colours you can change to your preferences. ManyEyes (manyeyes.alphaworks.ibm.com/manyeyes/) also allows you to create word clouds and tag clouds – as well as word trees and phrase nets that allow you to see common phrases. Publishing your visualisation. Data journalism pt5: Mashing data (comments wanted) This is a draft from a book chapter on data journalism (part 1 looks at finding data; part 2 at interrogating data; part 3 at visualisation, and 4 at visualisation tools). I’d really appreciate any additions or comments you can make – particularly around tips and tools. UPDATE: It has now been published in The Online Journalism Handbook.

Mashing data Wikipedia defines a mashup particularly succinctly, as “a web page or application that uses or combines data or functionality from two or many more external sources to create a new service.” Those sources may be online spreadsheets or tables; maps; RSS feeds (which could be anything from Twitter tweets, blog posts or news articles to images, video, audio or search results); or anything else which is structured enough to ‘match’ against another source. This ‘match’ is typically what makes a mashup. Why make a mashup? Some web developers have built entire sites that are mashups. Mashup tools Yahoo! Mashups and APIs Box-out: Anatomy of a feed. How to: get to grips with data journalism. A graph showing the number of IEDs cleared from the Afghanistan War Logs Only a couple of years ago, the idea that journalists would need to know how to use a spreadsheet would have been laughed out of the newsroom.

Now those benighted days are way behind us and extracting stories out of data is part of every journalist's toolkit of skills. Some people say the answer is to become a sort of super hacker, write code and immerse yourself in SQL. If you decide to take that approach, you can find a load of resources here. Of course, you could just ignore the whole thing, hope it'll go away and you can get back to longing to write colour pieces. 1) Sourcing the data. This is a much undervalued skill - with many journalists simply outsourcing it to research departments and work experience students. But broadly, the general approach is to look for the most authoritative place for your data. GDP - from the Office for National Statistics. Adobe PDF files are the enemy of open data. 3) Keep the codes. Ebook: the OpenData 2010 notebook » Article » OWNI, Digital Journalism. LOPPSI vs Open Data.

Tools, resources