
Maps Data


Publishers Can Afford Data Journalism, Says ProPublica’s Scott Klein. This winter, Scott Klein made a prediction at the Nieman Lab that drew some attention: “in 2014, you will be scooped by a reporter who knows how to program.” As I noted at this blog, he was proven correct within the month, as enterprising journalists applied their data skills to creating scoops and audiences.

Yesterday, The New York Times’ promising new data journalism venture, The Upshot, published the most popular story at nytimes.com, confirming that original data and reporting presented with context and a strong narrative remain a powerful, popular combination. It’s still atop the most-emailed, most-viewed and most-shared leaderboards this morning. So, Klein was right. Again. That’s not a huge surprise to me, nor to anyone else in the data journalism world.

Klein, an assistant managing editor at ProPublica, draws on years of hands-on experience working with data, reporters and developers at one of the most important nonprofit news organizations in the world.

Top 5 Data-Scraping Tools for Would-Be Data Journalists. This post originally appeared on Northwestern University Knight Lab’s blog. This past fall, I spent time with the NPR News Apps team (now known as NPR Visuals) coding up some projects, working mainly as a visual/interaction designer. But in the last few months, I’ve been working on a project that involves scraping newspaper articles and Twitter APIs for data. I was a relative beginner with Python — I’d pair-coded a bit with others and made some basic programs, but nothing too complicated. I knew I needed to develop more in-depth web-scraping and data-parsing skills, and of course I took to the web for help.
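For context, the core of that kind of article scrape takes only a few lines of Python. Here is a minimal sketch using the requests and BeautifulSoup libraries; the URL and the CSS selectors are hypothetical placeholders, and any real site needs its own selectors:

    # Minimal article-scraping sketch (requires the requests and
    # beautifulsoup4 packages; URL and selectors are hypothetical).
    import requests
    from bs4 import BeautifulSoup

    url = "http://example.com/news/some-article"  # placeholder URL
    response = requests.get(url, timeout=10)
    response.raise_for_status()

    soup = BeautifulSoup(response.text, "html.parser")
    headline = soup.find("h1")
    paragraphs = soup.select("div.article-body p")  # site-specific selector

    print(headline.get_text(strip=True) if headline else "no headline found")
    for p in paragraphs:
        print(p.get_text(strip=True))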

Along the way, I found a few tools that were exceptionally useful for expanding my knowledge. So if you too are just starting out with scraping, here are five of the most useful tools I’ve encountered while learning and working on my project. 1. Scraper: a free Chrome browser extension, Scraper is a handy, easy-to-use tool when you need to extract plain text from websites.

Mapsbynik: Nobody lives here: The nearly 5 million Census...

We are drowning in data about readers and attention, but which metrics really matter? You won’t like the answer. Thanks to the web and real-time measurement tools, the media industry has gone from having virtually no hard data on readers and attention to an embarrassment of riches — not only can we measure what people click on, but we can measure how far down the page they got when they were reading, whether they posted a comment, which social networks they came from, and a hundred other pieces of data.

The only problem is that this is very much a double-edged sword. New York Times media writer David Carr recently looked at some examples of media companies that are rewarding their writers based on traffic statistics and other measurements, including The Oregonian — whose efforts I wrote about here. But is paying your journalists based on pageviews or other metrics a smart way to align their incentives with your goals as a business, or does it poison the well when it comes to enhancing or encouraging creativity?

Be careful what kind of incentives you use. Who is your customer — reader or advertiser?

Je Veux Savoir - Submit and search document access requests.

OpenPrism.

World-class map collection tools - HERE Three Sixty. Whereas others just take pictures, the HERE vehicles collect 700,000 3D data points at a time, and up to 140 gigabytes are collected in just one day to create an exact model of the street-level environment. This technology automatically detects street signs and captures a myriad of other details to help us build a map that even carmakers trust for built-in navigation. With the acquisition of earthmine last November we further developed the way we collect and process data, particularly strengthening our 3D capabilities.

Now we’re incorporating that enhanced reality capture technology into our new fleet of collection vehicles. We’re deploying these vehicles globally to create an exact digital representation of the real world. The cameras we use to map streets have doubled in resolution – each can now capture 16.8 megapixels – bringing the total image size to 68 megapixels. Last week, we invited a number of journalists to see the rollout of our new fleet, and the reviews are in.

Oakland Police Beat applies data-driven investigative journalism in California. One of the explicit connections I’ve made over the years lies between data-driven investigative journalism and government or corporate accountability. In debugging the backlash to data journalism, I highlighted the work of The Los Angeles Times Data Desk, which has analyzed government performance data for accountability, among other notable projects. I could also have pointed to the Chicago Sun-Times, which applied data-driven investigative methods to determine that the City of Chicago’s 911 dispatch times vary widely depending on where you live, publishing an interactive map online for context, or to a Pulitzer Prize-winning story on speeding cops in Florida. This week, there’s a new experiment in applying data journalism to local government accountability in Oakland, California, where Oakland Police Beat has gone online. Oakland Police Beat is squarely aimed at shining sunlight on the practices of Oakland’s law enforcement officers.

So, what exactly did you launch?

Why aren’t political reporters asking the right questions about polls? :: Ryerson Review of Journalism. Welcome to the poll on polls. To begin, please press 1. “What is a poll?” David Akin asks in the makeup room at the Sun News Network studio in downtown Toronto. He doesn’t need to think about his answer. “It is a snapshot backward in time.” This photo of public opinion is a hallmark of Akin’s show, Battleground, which specializes in election coverage. It’s a big night for David Coletto of the polling firm Abacus Data. It’s also an important night for journalists. But with the polling industry under fire for poor performance, he thought there was still a place for his work.

He’s about to find out. Thorough coverage from Battleground and a perfect showing from Abacus would be a good start, but fixing poll reporting will require much more. This doesn’t necessarily mean journalists should give up trying. If you think the golden age of poll stories is over, please press 2. An unscientific survey of journalists and pollsters suggests that poll reporting wasn’t always bad. Newspapers and broadcasters bought into Gregg and his ilk.

Live Video / Audio Feed Map.

Margins of error, uncertainty and probabilities | Si la tendance se maintient. As we head into the home stretch of this campaign, this post aims to show how much confidence you should place in the polls and the projections. I will show not only that the model works, but that the probabilities accurately reflect the uncertainty created by the polls and by the voting system.
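To make that concrete, here is a toy Monte Carlo sketch of the kind of exercise described, not the blog’s actual model: perturb province-wide poll shares within a margin of error many times, pick a winner in each riding, and count seats. The party shares, riding leans, and error model are all invented for illustration:

    # Toy seat-projection simulation (illustrative only; the party
    # supports, riding leans and error model are all invented).
    import random

    PARTIES = ["PQ", "PLQ", "CAQ"]
    POLL = {"PQ": 0.32, "PLQ": 0.31, "CAQ": 0.23}  # hypothetical shares
    MOE = 0.03  # rough margin of error on each province-wide share
    # Hypothetical per-riding leans relative to the province-wide vote.
    RIDINGS = [{p: random.gauss(0, 0.05) for p in PARTIES} for _ in range(125)]

    def simulate_once():
        # Perturb the province-wide shares within the margin of error.
        swing = {p: POLL[p] + random.uniform(-MOE, MOE) for p in PARTIES}
        seats = {p: 0 for p in PARTIES}
        for lean in RIDINGS:
            winner = max(PARTIES, key=lambda p: swing[p] + lean[p])
            seats[winner] += 1
        return seats

    totals = {p: [] for p in PARTIES}
    for _ in range(1000):
        result = simulate_once()
        for p in PARTIES:
            totals[p].append(result[p])

    for p in PARTIES:
        runs = sorted(totals[p])
        print(p, "median seats:", runs[len(runs) // 2])

Across many such runs, the spread of seat totals, rather than any single projection, is what conveys the uncertainty.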

I have done this before, but never in as much detail. In the last Quebec election, based on the polls, I projected 66 PQ, 33 PLQ, 24 CAQ and 2 QS. The actual results were 54 PQ, 50 PLQ, 19 CAQ and 2 QS. But using the correct percentages, the model would have projected 51 PQ, 48 PLQ, 24 CAQ and 2 QS. The model would also have correctly predicted the winner in 105 of 125 ridings. So, both overall and riding by riding, this shows that it is indeed possible to translate percentages into seats. What my model lacked was a sense of the uncertainty involved. Let’s run a few simulations to illustrate this.

How publishers share traffic data with writers (and why some don’t). The Verge recently raised eyebrows for its practice of not giving its reporters access to their own traffic data — a rarity among publishers in a metrics-crazy time. For most digitally savvy publishers, metrics are now critical tools, not just for evaluating staff but for planning future stories.

They’re even going so far as to build out proprietary dashboards that point writers toward what might pop. The idea is not just to find the latest viral video but to help reporters learn quickly what’s working and what’s not. After all, just about any business you can imagine is using data as a guide. Should journalism be any different? At BuzzFeed, for instance, all staffers have access to a personal dashboard that shows them how much traffic their stories are earning on the “viral Web” — basically any site that isn’t BuzzFeed — which is the ultimate target audience.
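As a rough illustration of what such a dashboard computes, here is a toy Python sketch that splits a story’s pageviews into on-site and “viral Web” referrals. The log records and the domain classification are invented:

    # Toy referrer split of the kind a traffic dashboard might show
    # (the log records and the domain classification are invented).
    from collections import Counter
    from urllib.parse import urlparse

    pageviews = [
        {"story": "quiz-which-city", "referrer": "https://www.facebook.com/"},
        {"story": "quiz-which-city", "referrer": "https://www.buzzfeed.com/trending"},
        {"story": "longform-profile", "referrer": "https://twitter.com/somebody"},
    ]

    ONSITE = {"www.buzzfeed.com"}  # everything else counts as the "viral Web"

    viral = Counter()
    onsite = Counter()
    for view in pageviews:
        host = urlparse(view["referrer"]).netloc
        bucket = onsite if host in ONSITE else viral
        bucket[view["story"]] += 1

    for story in {v["story"] for v in pageviews}:
        print(story, "viral:", viral[story], "on-site:", onsite[story])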

On the extreme end of the spectrum, in terms of sharing stats with staff, is Forbes. Not all publishers go that far.

Internet - History, Travel, Arts, Science, People, Places.

Finally, a framework for a digital cultural strategy. The Quebec Ministère de la Culture et des Communications announced its digital cultural strategy on Monday (PDF). The strategy gives a large place to the strategic recommendations for the digital shift of Quebec’s cultural industry submitted by SODEC in 2011.

These recommendations, which I had the honor of drafting with members of SODEC, grew out of many meetings with people in the cultural industry in 2010 and 2011. One inescapable conclusion emerged: cultural content that is not digitized, not distributed online, not findable through search engines, and not aggregated by websites or on social networks is content that does not exist in the eyes of consumers. This lag in digital practices reduces the ability of Quebec’s cultural businesses to compete with the omnipresent foreign offering. It is not a problem of talent, but of reach. Occupying the digital space. For example:

This Simple Data-Scraping Tool Could Change How Apps Are Made | Wired Design. The number of web pages on the internet is somewhere north of two billion, perhaps as many as double that.

It’s a huge amount of raw information. By comparison, there are only roughly 10,000 web APIs – the virtual pipelines that let developers access, process, and repackage that data. In other words, to do anything new with the vast majority of the stuff on the web, you need to scrape it yourself. Even for the people who know how to do that, it’s tedious. For the last five months, Rowe and Ranade have been building out Kimono, a web app that lets you slurp data from any website and turn it instantly into an API. Excitement’s already bubbling around the potential. Eliminating the Bottleneck. The idea for Kimono was born out of Rowe’s time as a developer at the design consultancy Frog, where he continually ran into the same frustrating problem. It’s about letting artists, historians and sociologists cull and combine content. Democratizing Data Scraping.
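The payoff of a tool like Kimono is that a scraped page becomes an ordinary JSON endpoint that any program can call. Here is a hedged sketch of consuming such an endpoint in Python; the URL, API key, and response shape are hypothetical stand-ins, not Kimono’s documented format:

    # Consuming a hypothetical scrape-generated JSON API
    # (endpoint URL, key and response shape are invented for illustration).
    import requests

    API_URL = "https://api.example.com/v1/my-scraped-collection"
    params = {"apikey": "YOUR_KEY_HERE"}

    resp = requests.get(API_URL, params=params, timeout=10)
    resp.raise_for_status()
    payload = resp.json()

    # Suppose the API returns {"results": [{"title": ..., "price": ...}, ...]}
    for row in payload.get("results", []):
        print(row.get("title"), row.get("price"))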

Edit fiddle.

Google Announces An Online Data Interpretation Class For The General Public. Google has launched its own Massive Open Online Course (MOOC) to teach the general public how to understand surveys, research, and data. Called “Making Sense of Data” and running from March 18 to April 4, the course will be open to the public and, like most MOOCs, will be taught through a series of video lectures, interactive projects, and the support of community TAs. Users who complete the final capstone homework assignment will even have the option of receiving a certificate of completion. With this course, Google joins the growing ranks of for-profit online education providers who are answering the White House’s call for more data-science-literate workers. This year, both of the major MOOC companies, Coursera and Udacity, announced data-science program tracks complete with paid certificates of completion.

It’s potentially a big win for Google.

New MIT Media Lab Tool Lets Anyone Visualize Unwieldy Government Data | Co.Design. In the four years since the U.S. government created data.gov, the first national repository for open data, more than 400,000 datasets have become available online from 175 agencies like the USDA, the Department of Energy, and the EPA. Governments all over the world have taken steps to make their data more transparent and available to the public. But in practice, much of that data, accessible as spreadsheets through sites like data.gov, is incomprehensible to the average person, who might not know how to wrangle huge data sets.
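For a sense of what that wrangling involves, here is a minimal pandas sketch; the file name and column names are hypothetical placeholders for any CSV downloaded from a portal like data.gov:

    # Minimal data-wrangling sketch with pandas (the file name and
    # column names are hypothetical placeholders).
    import pandas as pd

    df = pd.read_csv("dataset_from_data_gov.csv")

    print(df.shape)            # how big is this table?
    print(df.columns.tolist()) # what fields does it actually contain?
    print(df.head())

    # Collapse a never-ending table into something readable:
    # e.g., average a value column by a category column.
    summary = df.groupby("state")["value"].mean().sort_values(ascending=False)
    print(summary.head(10))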

Never-ending tables mean next to nothing to me, even if I know that they might be hiding some interesting relationship within their numbers, like how income stacks up with happiness. To wade through what César Hidalgo, director of the Macro Connections group at the MIT Media Lab, calls "the last 10 inches" separating people from their government's incoherent tables and spreadsheets, Hidalgo turned to visualization.

DataViva.info.

Historical Metropolitan Populations of the United States - Peakbagger.com. The graph and tables on this page attempt to show how the urban hierarchy of the United States has developed over time. The statistic used here is the population of the metropolitan area (the contiguous urbanized area surrounding a central city), not the population of an individual city.

Metropolitan area population is much more useful than city population as an indicator of the size and importance of a city, since the official boundaries of a city are usually arbitrary and often do not include vast suburban areas. For example, in 2000 San Antonio was the 10th-largest city in the U.S., larger than Boston or San Francisco, but its metro area ranked only about 30th. The same thing was happening even back in 1790: New York was the biggest single city, but Philadelphia plus its suburbs of Northern Liberties and Southwark made it the biggest metro area. The top 20 metro areas in the United States, 1790-2010: approximate populations in thousands. Six main sources for population data were used:

Public ice skating in Ottawa and Gatineau - schedules and locations.

Afpfr: Burglaries in France...

Plotly | Analyze and visualize data, together.

The United States of income tax, in one map.

Interactive: The World of Seven Billion. The map shows population density; the brightest points are the highest densities. Each country is colored according to its average annual gross national income per capita, using categories established by the World Bank. Some nations — like economic powerhouses China and India — have an especially wide range of incomes.

But as the two most populous countries, both fall in the lower-middle income category when income is averaged per capita.

The Most Amazing, Beautiful and Viral Maps of the Year - Wired Science.

Beautiful Maps, and the Lies They Tell: An Op-Ed From RunKeeper. This Fun Maps post is guest-written by Margaret McKenna, Head of Data and Analytics at RunKeeper, a free app for running, cycling and other fitness activities. Recently the Flowing Data blog released a series of captivating maps created using public running routes from RunKeeper. Many media outlets picked up the maps, and local newspapers were happy to see a reflection of their hometowns. But tension arose around what the maps meant. The Flowing Data editor suggested that the maps could be used by public officials for city planning. RunKeeper’s map, left, with random data pulled from 2013 in New York City; Flowing Data’s original map at right.

Much has been written about how data visualizations — particularly beautiful, eye-catching ones — can distort the truth, but maps in particular can be a fraught area. There are two things that should immediately tip someone off to the trouble with Flowing Data’s New York City map: 1) there are no running routes in Central Park, and 2) seasonal bias.

Behind the dialect map interactive: How an intern created The New York Times’ most popular piece of content in 2013.

Nicolas.kruchten.com - Zoomable Map for Montreal Election Results.

MapBox Enables Amazing Custom Maps for Sites and Apps.

How to Design a Viral Map and Still Respect Yourself in the Morning - Wired Science.