Disinformation Visualization: How to lie with datavis
By Mushon Zer-Aviv, January 31, 2014

Seeing is believing. When working with raw data we're often encouraged to present it differently, to give it a form, to map it or visualize it. It all sounds very sinister, and indeed sometimes it is. Over the past year I've had a few opportunities to run Disinformation Visualization workshops, encouraging activists, designers, statisticians, analysts, researchers, technologists and artists to visualize lies. Centuries before big data, computer graphics and social media collided and gave us the datavis explosion, visualization was mostly a scientific tool for inquiry and documentation.

Reproducing Lies

Let's set up some rules. We don't spread visual lies by presenting false data. To deconstruct possible lies I suggest we use the content / structure / presentation model as three lenses through which we can analyse a graphic. Framing alone can flip an answer: Should we legalize the killing of babies? I would hope most of you would say: No. Should women have the right to their own bodies?
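A minimal sketch (my illustration, not the article's) of a classic lie told through structure rather than content: the truncated y-axis. The poll figures below are made up purely to show the effect.

```python
# The same numbers plotted twice: a truncated axis makes a 4-point
# gap look like a landslide; a full axis shows it as marginal.
import matplotlib.pyplot as plt

groups = ["Yes", "No"]
support = [52, 48]  # hypothetical poll results

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

ax1.bar(groups, support)
ax1.set_ylim(47, 53)            # truncated axis: the distortion
ax1.set_title("Truncated axis")

ax2.bar(groups, support)
ax2.set_ylim(0, 100)            # full axis: the honest view
ax2.set_title("Full axis")

plt.tight_layout()
plt.show()
```

The data (content) is identical in both panels; only the structure of the chart changes, which is exactly the kind of manipulation the content / structure / presentation lenses are meant to catch.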

about earth

A visualization of global weather conditions, forecast by supercomputers, updated every three hours. Ocean surface current estimates updated every five days; ocean surface temperatures and anomaly from daily average (1981-2011) updated daily; ocean waves. Aerosols and Chemistry | GEOS-5 (Goddard Earth Observing System), GMAO / NASA.

Atmospheric pressure corresponds roughly to altitude, and several pressure layers are meteorologically interesting; they show data assuming the earth is completely smooth. Note: 1 hectopascal (hPa) ≡ 1 millibar (mb).

1000 hPa | ~100 m, near sea level conditions
700 hPa | ~3,500 m, planetary boundary, high
10 hPa | ~26,500 m, even more stratosphere

The "Surface" layer represents conditions at ground or water level; this layer follows the contours of mountains, valleys, etc. Overlays show another dimension of data using color. Some overlays are valid at a specific height while others are valid for the entire thickness of the atmosphere.

Wind | wind speed at specified height
Temp |
Peak Wave Period |
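As a rough illustration of why pressure corresponds to altitude, here is a sketch using the textbook International Standard Atmosphere barometric formula (my choice of formula, not anything from the earth project itself; it is strictly valid only in the troposphere, below about 11 km, so the 10 hPa value is a crude extrapolation).

```python
# Convert a pressure level (hPa) to an approximate altitude (m)
# using the ISA barometric formula with standard constants.
def pressure_to_altitude_m(p_hpa):
    T0 = 288.15      # sea-level standard temperature, K
    L  = 0.0065      # temperature lapse rate, K/m
    P0 = 1013.25     # sea-level standard pressure, hPa
    g  = 9.80665     # gravitational acceleration, m/s^2
    M  = 0.0289644   # molar mass of dry air, kg/mol
    R  = 8.31447     # universal gas constant, J/(mol*K)
    return (T0 / L) * (1 - (p_hpa / P0) ** (R * L / (g * M)))

for p in (1000, 700, 10):
    print(f"{p:5d} hPa ~ {pressure_to_altitude_m(p):,.0f} m")
# roughly 111 m, 3,012 m, and 25,920 m (the last one extrapolated
# beyond the formula's valid range), in line with the table above
```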

7 tools for scraping - use for data journalism & insightful content

I've been creating a lot of (data-driven) creative content lately, and one of the things I like to do is gather as much data as I can from public sources. In some cases it costs too much time to create and run database queries, and my personally built PHP scraper is faster, so I wanted to share some tools that could be helpful. A short disclaimer: use these tools at your own risk!

1. Scraper is a simple data-mining extension for Google Chrome™ that is useful for online research when you need to quickly analyze data in spreadsheet form. You can select a specific data point, a price, a rating, etc., then use your browser menu: click Scrape Similar and you will get multiple options to export or copy your data to Excel or Google Docs.
2. – Click here to download the example script.
3.
4.
5.
6.
7.

But in the end, building your own project-specific scrapers will always be more effective than using predefined scrapers.
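For comparison, here is a minimal hand-rolled scraper sketch in Python (the author's own scraper is PHP; this is an analogous example, and the URL and CSS selectors are hypothetical placeholders, not a real site's markup).

```python
# Fetch a page, pull name/price pairs out of it, write them to CSV.
import csv
import requests
from bs4 import BeautifulSoup

resp = requests.get("https://example.com/products", timeout=10)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, "html.parser")

rows = []
for item in soup.select(".product"):       # hypothetical selector
    name = item.select_one(".name")
    price = item.select_one(".price")
    if name and price:
        rows.append([name.get_text(strip=True),
                     price.get_text(strip=True)])

with open("products.csv", "w", newline="") as f:
    csv.writer(f).writerows([["name", "price"]] + rows)
```

A project-specific script like this stays effective precisely because you control the selectors and the output format, which is the point the article closes on.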

Data Discrimination Means the Poor May Experience a Different Internet

Data analytics are being used to implement a subtle form of discrimination, while anonymous data sets can be mined to reveal health data and other private information, a Microsoft researcher warned this morning at MIT Technology Review's EmTech conference. Kate Crawford, principal researcher at Microsoft Research, argued that these problems could be addressed with new legal approaches to the use of personal data. In a new paper, she and a colleague propose a system of "due process" that would give people more legal rights to understand how data analytics are used in determinations made against them, such as denial of health insurance or a job. "It's the very start of a conversation about how to do this better," Crawford, who is also a visiting professor at the MIT Center for Civic Media, said in an interview before the event. During her talk this morning, Crawford added that with big data, "you will never know what those discriminations are, and I think that's where the concern begins."

A statistical graphics course and statistical graphics advice

Dean Eckles writes:

Some of my coworkers at Facebook and I have worked with Udacity to create an online course on exploratory data analysis, including using data visualizations in R as part of EDA. The course has now launched, so anyone can take it for free. And Kaiser Fung has reviewed it. So definitely feel free to promote it! Criticism is also welcome (we are still fine-tuning things and adding more notes throughout). I wrote some more comments about the course here, including highlighting the interviews with my great coworkers.

I didn't have a chance to look at the course, so instead I responded with some generic comments about EDA and visualization (in no particular order):

- Think of a graph as a comparison.
- For example, Tukey described EDA as the search for the unexpected (or something like that, I don't remember the exact quote).
- No need to cram all information onto a single graph.
- I like line plots.
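A small sketch of the "graph as a comparison" advice combined with the preference for line plots (the course itself uses R; this analogous example uses matplotlib, and the data is made up purely for illustration).

```python
# Two series on one line plot: the chart's whole point is the
# comparison between groups, not either series in isolation.
import matplotlib.pyplot as plt

years = [2010, 2011, 2012, 2013]
group_a = [3.1, 3.4, 3.9, 4.6]   # hypothetical series
group_b = [3.0, 3.2, 3.3, 3.5]   # hypothetical series

plt.plot(years, group_a, marker="o", label="Group A")
plt.plot(years, group_b, marker="o", label="Group B")
plt.xlabel("Year")
plt.ylabel("Outcome")
plt.legend()
plt.title("A line plot framed as a comparison between groups")
plt.show()
```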

Tabula: Extract Tables from PDFs

Open Data Cities & DataGM

Monday 21 February 2011 saw the launch of DataGM, the Greater Manchester Datastore, a partnership between FutureEverything and Trafford Council in Greater Manchester that developed out of FutureEverything's Open Data Cities innovation lab. The move to open up publicly held datasets is gaining momentum across the globe, and has been led by cities such as Vancouver. Open Data is a gateway to a data-based and digital society: it adds layers of accessibility to a huge amount of public information, on everything from the location of buses to census data. Open Data enables citizens to have meaningful interaction with the information that surrounds them. It can spark an innovation ecology, as people are able to build applications and services and create value (social and economic) by using the data. The embryonic idea behind Open Data Cities was: how would cities evolve if all data were open? Open Data Cities / DataGM was nominated for a Big Chip Award 2011 and a UK Public Sector Digital Award 2011.

Exploratory Data Analysis Using R

Lesson 1: What is EDA? (1 hour) - We'll start by learning about what exploratory data analysis (EDA) is and why it is important. You'll meet the amazing instructors for the course and find out about the course structure and final project.
Lesson 2: R Basics (3 hours) - EDA, which comes before formal hypothesis testing and modeling, makes use of visual methods to analyze and summarize data sets.
Lesson 3: Explore One Variable (4 hours) - We perform EDA to understand the distribution of a variable and to check for anomalies and outliers. Problem Set 3 (2 hours)
Lesson 4: Explore Two Variables (4 hours) - EDA allows us to identify the most important variables and relationships within a data set before building predictive models. Problem Set 4 (2 hours)
Lesson 5: Explore Many Variables (4 hours) - Data sets can be complex. Problem Set 5 (2 hours)
Lesson 6: Diamonds and Price Predictions (2 hours) - Investigate the diamonds data set alongside Facebook Data Scientist Solomon Messing.
Final Project (10+ hours)
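A univariate exploration like Lesson 3's can be sketched outside R as well; here is an analogous pandas/matplotlib version (the file path and column name are hypothetical stand-ins for an export of the diamonds data, not course materials).

```python
# Summarize one variable's distribution and flag outliers
# with the common 1.5 * IQR rule, then plot a histogram.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("diamonds.csv")   # hypothetical CSV export
x = df["price"]

print(x.describe())                # center, spread, extremes

q1, q3 = x.quantile([0.25, 0.75])
iqr = q3 - q1
outliers = x[(x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)]
print(f"{len(outliers)} potential outliers by the 1.5*IQR rule")

x.hist(bins=50)
plt.xlabel("price")
plt.show()
```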

Kimono: Turn websites into structured APIs from your browser in seconds

How Much Data is Created Every Minute?

The Internet has become a place where massive amounts of information and data are generated every day. Big data isn't just some abstract concept created by the IT crowd, but a continually growing stream of digital activity pulsating through cables and airwaves across the world. This data never sleeps: every minute, giant amounts of it are generated by every phone, website and application across the Internet. The question: how much is created, and where does it all come from? To put things into perspective, this infographic by DOMO breaks down the amount of data generated on the Internet every minute. YouTube users upload 48 hours of video, Facebook users share 684,478 pieces of content, Instagram users share 3,600 new photos, and Tumblr sees 27,778 new posts published.
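Scaling those per-minute figures to a full day is simple arithmetic (multiply by 60 * 24 = 1,440 minutes), as this back-of-the-envelope sketch shows, using only the numbers quoted above.

```python
# Per-minute figures from the DOMO infographic, scaled to one day.
per_minute = {
    "YouTube video uploaded (hours)": 48,
    "Facebook pieces of content": 684_478,
    "Instagram photos": 3_600,
    "Tumblr posts": 27_778,
}
for what, n in per_minute.items():
    print(f"{what}: {n * 1_440:,} per day")
# e.g. 48 hours of video per minute is 69,120 hours every day
```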

All the data of a revolution

At first glance, you don't really know what you're looking at. It could be strands of mitochondrial DNA, or sperm fertilizing an egg, or perhaps the product of an incongruous union between a sea creature and a ball of fur. As the thing grows, it passes through a series of increasingly complex mutations, ending up as a kind of Death Star under construction. What you are looking at is data, or to be more precise, a visual representation of tweets and retweets published over a period of a few hours on February 11, 2011, just after the resignation of Egyptian president Hosni Mubarak.

The omnipresence of digital data

More than ever before, we live in a universe of data. This explosion of data has been accompanied by new ways of representing it. Of all the social phenomena that invite analysis, few are as complex or as volatile as revolutions.

Who's the most influential in a social graph? New software recognizes key influencers faster than ever

(Phys.org) At an airport, many people are essential for planes to take off. Gate staff, refueling crews, flight attendants and pilots are in constant communication with each other as they perform required tasks. But it's the air traffic controller who talks with every plane, coordinating departures and runways. Communication must run through her in order for an airport to run smoothly and safely. In computational terms, the air traffic controller has the highest "betweenness centrality": she sits on the most communication paths in the system. Determining the most influential person on a social media network (or, in computer terms, a graph) is more complex. Georgia Tech has developed a new algorithm that quickly determines betweenness centrality for streaming graphs. "Unlike existing algorithms, our system doesn't restart the computational process from scratch each time a new edge is inserted into a graph," said College of Computing Professor David Bader, the project's leader.
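To make the metric concrete, here is a small sketch computing betweenness centrality on a static toy graph with networkx (this illustrates the measure itself, not Georgia Tech's streaming algorithm, which is not shown in the article).

```python
# In a star graph, the hub lies on every shortest path between
# the spokes, like the air traffic controller in the analogy.
import networkx as nx

G = nx.star_graph(5)              # node 0 is the hub, 1-5 are spokes
bc = nx.betweenness_centrality(G)
print(max(bc, key=bc.get), bc)    # node 0 has the highest score

# A streaming algorithm would update these scores incrementally as
# edges arrive; with plain networkx we must recompute from scratch.
G.add_edge(1, 2)
print(nx.betweenness_centrality(G))
```

The recomputation in the last two lines is exactly the cost Bader's group says their system avoids when new edges are inserted.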

How NASA Makes Scientific Data Beautiful

How do you make education interesting and, more importantly, beautiful? When it comes to the work of NASA, attracting enthusiasts isn't difficult with the usual visuals of bright stars and colorful planets on hand. Look no further than the recent awe over Mars rover Curiosity's high-res pictures to see proof of humanity's fascination with space. But not all of NASA's data is packaged into neat little photos. NASA's Scientific Visualization Studio (SVS) is not only an active and creative tool for NASA outreach — it has even gone viral. "I think scientists have an amazing internal world — they think about these things and how they work," says Dr. Mitchell. Mashable spoke with Mitchell about Perpetual Ocean and how to bring beauty to educational information.

How did Perpetual Ocean come about?

We're tasked to visualize massive results of all kinds for the purposes of public outreach. One of those things we thought we couldn't do was anything that involved a flow field — an ocean current or wind, for example.

What's next for SVS?
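For a sense of what a flow-field visualization involves at its simplest, here is a toy sketch (nothing like SVS's production pipeline; the rotational vector field below is synthetic, chosen only because it is easy to generate).

```python
# A quiver plot is the most basic way to show a flow field:
# one arrow per grid point, pointing along the local current.
import numpy as np
import matplotlib.pyplot as plt

y, x = np.mgrid[-2:2:20j, -2:2:20j]
u, v = -y, x                      # a simple rotational "current"

plt.quiver(x, y, u, v)
plt.title("Synthetic flow field (rotation)")
plt.gca().set_aspect("equal")
plt.show()
```

Pieces like Perpetual Ocean trace particles through real current data instead of drawing static arrows, which is what makes them read as flowing water rather than a diagram.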
