A special report on managing information: Data, data everywhere

WHEN the Sloan Digital Sky Survey started work in 2000, its telescope in New Mexico collected more data in its first few weeks than had been amassed in the entire history of astronomy. Now, a decade later, its archive contains a whopping 140 terabytes of information. A successor, the Large Synoptic Survey Telescope, due to come on stream in Chile in 2016, will acquire that quantity of data every five days. Such astronomical amounts of information can be found closer to Earth too. Wal-Mart, a retail giant, handles more than 1m customer transactions every hour, feeding databases estimated at more than 2.5 petabytes, the equivalent of 167 times the books in America's Library of Congress (see article for an explanation of how data are quantified). Facebook, a social-networking website, is home to 40 billion photos. All these examples tell the same story: that the world contains an unimaginably vast amount of digital information which is getting ever vaster ever more rapidly.

http://www.economist.com/node/15557443
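
To put the figures quoted in the excerpt above on a common scale, here is a minimal sketch of the arithmetic. It assumes decimal units (1 petabyte = 1,000 terabytes), and the function names are made up for illustration; only the input numbers come from the excerpt.

```python
# Figures taken from the excerpt; decimal (SI) units are an assumption.
TB_PER_PB = 1000

SLOAN_ARCHIVE_TB = 140           # Sloan archive size after a decade
LSST_DAYS_PER_ARCHIVE = 5        # LSST is expected to gather that much every five days
WALMART_DB_PB = 2.5              # estimated size of Wal-Mart's databases
LOC_MULTIPLES = 167              # "167 times the books in the Library of Congress"
WALMART_TX_PER_HOUR = 1_000_000  # customer transactions per hour


def lsst_tb_per_day():
    """Daily data rate implied for the Large Synoptic Survey Telescope."""
    return SLOAN_ARCHIVE_TB / LSST_DAYS_PER_ARCHIVE


def implied_loc_size_tb():
    """Size of the Library of Congress book collection implied by the comparison."""
    return WALMART_DB_PB * TB_PER_PB / LOC_MULTIPLES


def walmart_tx_per_second():
    """Average transaction rate per second."""
    return WALMART_TX_PER_HOUR / 3600


if __name__ == "__main__":
    print(f"LSST: {lsst_tb_per_day():.1f} TB per day")
    print(f"Implied Library of Congress size: {implied_loc_size_tb():.2f} TB")
    print(f"Wal-Mart: {walmart_tx_per_second():.0f} transactions per second")
```

On these assumptions the new telescope gathers 28 terabytes a day, and the Library of Congress comparison implies a book collection of roughly 15 terabytes.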

Lessons of the Victorian data revolution Ken Cukier recently wrote about how useful analogies from the past are in explaining the potential of the current data revolution. Science as we know it was consciously created in the 19th century, and in many ways the current wave of data techniques feels like an echo of that first flood of innovations. It’s fascinating to read histories of the era like “The Philosophical Breakfast Club” and spot the parallels. Take tides, for example. You’ve probably never worried about the timing or height of the sea, but for Victorian sailors figuring out the tides was a life-or-death problem.

Linked Data Research Centre In an attempt to reach out to Web developers (called 'Hacker Joe' here ;) I've compiled a screen-cast on how one can understand the Web as a huge database. The screen-cast starts out with a brief explanation of some essentials, but the bigger part of it is dedicated to two hands-on examples: first we query and use data from DBPedia (the linked data version of Wikipedia), and then we look into a heavily distributed form of linked data, that is, starting from a FOAF profile we again query and use the data found there. Note that the accompanying slides are available as well via slideshare.
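
As a rough illustration of what "querying DBPedia" can look like in practice, the sketch below builds a request URL for DBpedia's public SPARQL endpoint. It is not taken from the screen-cast: the helper name, the example query, the chosen resource and the `format` parameter (a convention of the endpoint's server software rather than of the SPARQL protocol itself) are all assumptions made for illustration.

```python
from urllib.parse import urlencode

# Public DBpedia SPARQL endpoint (assumed to be reachable at this address).
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query: fetch the English abstract of one resource.
# The resource dbr:Linked_data is picked as an example only.
EXAMPLE_QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?abstract WHERE {
  dbr:Linked_data dbo:abstract ?abstract .
  FILTER (lang(?abstract) = "en")
}
LIMIT 1
"""


def build_sparql_url(query, endpoint=DBPEDIA_ENDPOINT):
    """Return a GET URL that sends `query` to a SPARQL endpoint.

    The `query` parameter follows the SPARQL protocol; the `format`
    parameter asking for JSON results is an endpoint-specific assumption.
    """
    params = {
        "query": query,
        "format": "application/sparql-results+json",
    }
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    # Only prints the URL; fetching it requires network access.
    print(build_sparql_url(EXAMPLE_QUERY))
```

Opening the resulting URL in a browser, or fetching it with any HTTP client, should return the query results, which is the sense in which the Web can be treated as a database.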

With M2M, the machines do all the talking The shift from transporting voice to delivering data has transformed the business of mobile carriers, but there’s yet another upheaval on the horizon: machine-to-machine (M2M) communications. In M2M, devices and sensors communicate with each other or with a central server rather than with human beings. These devices often use an embedded SIM card to communicate over the mobile network. Applications include automotive, smart grid, healthcare and environmental uses. M2M traffic differs from human-generated voice and data traffic. Mobile carriers are adapting by creating entirely new companies for M2M, such as Telenor’s M2M carrier Telenor Connexion, and m2o city, Orange’s joint venture with water giant Veolia.

With Big Data Comes Big Responsibilities The reams of data that many modern businesses collect, dubbed “big data”, can provide powerful insights. It is the key to Netflix’s recommendation engines, Facebook’s social ads, and even Amazon’s methods for speeding up the new Web browser, Silk, which comes with its new Fire tablet. But big data is like any powerful tool. Using it carelessly can have dangerous results. A new paper presented at a recent Symposium on the Dynamics of the Internet and Society spells out the reasons that businesses and academics should proceed with caution.

Bracing for the Data Deluge From Facebook to the Department of Motor Vehicles, the world is catalogued in databases. No one knows it better than MIT adjunct professor and entrepreneur Michael Stonebraker, who has spent the last 25 years developing the technology that made it so. He got his big break by inventing and commercializing technology that underlies most of the databases, known as relational databases, that rule today. But Stonebraker now happily calls his earlier inventions largely obsolete. He’s working on a new generation of database technology that can handle the flood of digital data that is starting to overwhelm established methods.

Open government data gathers momentum in Switzerland The opendata.ch 2011 conference was inaugurated by Edith Graf-Litscher, National Councillor and Co-Chair of the Parliamentarian Group for Digital Sustainability, and Andreas Kellerhals, Director of the Swiss Federal Archives. The opening address was given by Nigel Shadbolt, Professor at the University of Southampton and member of the UK’s Public Sector Transparency Board. In an inspiring speech he highlighted the far-reaching transformative potential of open government data for people and governments alike, both now and in the future. Other speakers, including Jean-Philippe Amstein, Director of the Federal Office of Topography swisstopo, Hans-Peter Thür, Federal Data Protection and Information Commissioner, and Peter Fischer, the Delegate for Federal IT Strategy, echoed Shadbolt’s sentiments but also pointed to the challenges for Switzerland in dealing with freely accessible government data.

The next, next big thing In my old age, at least for the computing industry, I’m getting more irritated by smart young things who preach today’s big thing, or tomorrow’s next big thing, as the best and only solution to my computing problems. Those who fail to learn from history are doomed to repeat it, and the smart young things need to pay more attention, because the trends underlying today’s computing should be evident to anyone with a sufficiently good grasp of computing history. Depending on the state of technology, the computer industry oscillates between thin- and thick-client architectures. Either the bulk of our compute power and storage is hidden away in racks of (sometimes distant) servers, or it is spread across a mass of distributed systems closer to home.

Open Data Challenge on Datavisualization European public bodies produce thousands upon thousands of datasets every year – about everything from how our tax money is spent to the quality of the air we breathe. With the Open Data Challenge, the Open Knowledge Foundation and the Open Forum Academy are challenging designers, developers, journalists and researchers to come up with something useful, valuable or interesting using open public data. Anyone from the EU can submit an idea, app, visualization or dataset to the competition between 5th April and 5th June. The winners will be announced in mid-June at the European Digital Assembly in Brussels. A total of €20,000 in prizes could be another motivator if you’re still undecided.