Data Sets.

How to: get to grips with data journalism.

A graph showing the number of IEDs cleared from the Afghanistan War Logs.

Only a couple of years ago, the idea that journalists would need to know how to use a spreadsheet would have been laughed out of the newsroom.
Now those benighted days are way behind us, and extracting stories from data is part of every journalist's toolkit. Some people say the answer is to become a sort of super hacker, write code and immerse yourself in SQL. If you decide to take that approach, you can find a load of resources here. But a lot of the work we do is just in Excel, and that's what I'll deal with here. Of course, you could just ignore the whole thing, hope it'll go away and get back to longing to write colour pieces.

1) Sourcing the data

This is a much undervalued skill - with many journalists simply outsourcing it to research departments and work experience students. But broadly, the general approach is to look for the most authoritative place for your data.

3) Keep the codes.
I don't mean 'huge' as in fashionable - although it has become that in recent months - but 'huge' as in 'incomprehensibly enormous'. It represents the convergence of a number of fields which are significant in their own right - from investigative research and statistics to design and programming. The idea of combining those skills to tell important stories is powerful - but also intimidating. Who can do all that?

The reality is that almost no one is doing all of that, but the puzzle has enough different parts that people can easily get involved in one and go from there.

1. 'Finding data' can involve anything from having expert knowledge and contacts to being able to use computer-assisted reporting skills or, for some, specific technical skills such as MySQL or Python to gather the data for you.
2.
3.
4.
Tools such as ManyEyes for visualisation, and Yahoo!

How to begin?

DataMasher.

The 70 Online Databases that Define Our Planet.

Back in April, we looked at an ambitious European plan to simulate the entire planet.
The idea is to exploit the huge amounts of data generated by financial markets, health records, social media and climate monitoring to model the planet's climate, societies and economy. The vision is that a system like this can help to understand and predict crises before they occur, so that governments can take appropriate measures in advance. There are numerous challenges here. Nobody yet has the computing power necessary for such a task, nor are there models that can accurately simulate even much smaller systems. But before any of that is possible, researchers must gather the economic, social and technological data needed to feed this machine.
Today, we get a grand tour of this challenge from Dirk Helbing and Stefano Balietti at the Swiss Federal Institute of Technology in Zurich. These and other pursuits are now producing massive amounts of data, much of which is freely available on the web.