Big data

Premise: Open Data: How Not To Cock It Up.

These days I find myself giving advice on an occasional basis to the governments of four different countries, including the UK. I like doing this, and it makes for a refreshing counterbalance to the nitty-gritty of running mySociety - spreadsheets, UX design, contracts, discussions about features - all that stuff. Recently, however, I've started to think about what citizens themselves need to do to help make their governments keen on open data, or how to sustain any interest the government of their country might be showing. So, last week I took the opportunity of speaking at the excellent DataCamp in London to put my thoughts into a short talk. I'm sure there'll be a video online soon, but here is a slightly more polished version of the speech I read from on the day. I hope you find it thought-provoking. "We live in quite extraordinary times. We should celebrate the fact that the political classes are paying attention to open data.

What do I mean by cocking up open data? So there we are.

Strata Week: Overcharging algorithms.

Here are a few of the data stories that caught my eye this week.

When algorithms overcharge on Amazon

A postdoc in Michael Eisen’s lab at UC Berkeley logged in to Amazon a couple of weeks ago in order to purchase a copy of Peter Lawrence’s “The Making of a Fly.”

Although out of print, the book is a classic in the field of evolutionary biology, and there were several copies available, both new and used. The used copies were on sale for roughly $35. The two new copies were priced a bit higher: $1.7 and $2.1 million. Although he assumed at first it was a mistake, when Eisen returned to the page the next day, he found the price had gone up, with both books for sale around $2.8 million. By the end of the day, the price of one was raised again, to more than $3.5 million.

Some folks got creative in response to the multi-million-dollar price tag attached to “The Making of a Fly.”

It’s obvious why one vendor would establish an algorithm to perpetually undercut the competition. Eisen wrote: U.S.

Data is a currency.

If I talk about data marketplaces, you probably think of large resellers like Bloomberg or Thomson Reuters.

Or startups like InfoChimps. What you probably don’t think of is that we as consumers trade in data. Since the advent of computers in enterprises, our interaction with business has caused us to leave a data imprint. In return for this data, we might get lower prices or some other service. When I use Facebook I’m trading my data for their service. So let’s guard our privacy by all means, but recognize this is a bargain and a marketplace we enter into. Is this all one-way traffic? The maturity of the data currency will be signalled by personal data bank accounts that give us, the consumers, control and traceability. Who runs the data banks themselves will be another point of control in the struggle for data ownership.

Related: LOLCats Get Serious: Comedy Network Hires Prominent Data Scientist.

“Analyzing large data sets – so-called big data – will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus, as long as the right policies and enablers are in place,” so writes giant consulting firm McKinsey Global Institute in a major new report this month on the topic. In LOLCat speak, that might read as: Ceiling Cat sees what you are doing and thinks there’s much to be gained by analyzing it. Those two ways of seeing big data will come together with today’s hire of the first data scientist at the Cheezburger Network, Sean Power, co-author of the O’Reilly book Complete Web Monitoring and a leading voice on the extraction of value from data online. With more than 400 million pageviews per month, Cheezburger offers a lot of data to analyze.

The company is smart to make this hire now – the talent crunch in this field is expected to grow much worse, quickly. “For Cheezburger to hire a data scientist, that seems perfectly normal.

Big data leads to a big revival for display ads.

It looks like rumors of display advertising’s death were greatly exaggerated. Nearly all the tech industry’s biggest players (AOL, Facebook, Google, Microsoft and Yahoo) are expected to report growth in their display ad businesses for 2011, according to a report released Tuesday from research outfit eMarketer. And these display ad businesses aren’t just legacy cash cows for older tech firms; Facebook is expected to nudge out Yahoo to become the biggest seller of display ads this year. By 2012, Facebook will likely account for one dollar of every five spent on display ads, according to eMarketer’s research.

Why are display and banner ads, long criticized for being inaccurate, annoying, and easily ignored, currently on the upswing? It could be because of new technologies that employ big data processing and real-time analytics capabilities to make display ads more efficient, and more lucrative, than they’ve ever been. Take TellApart. The strategy is paying off.