David B. Sparks, a fifth-year PhD candidate in the Department of Political Science at Duke University, has today published a fascinating set of experiments using ‘Isarithmic’ maps to visualise US party identification. Isarithmic maps are essentially topographic/contour maps and offer an alternative approach to plotting geo-spatial data using choropleth maps. This is a particularly interesting approach for the US with its extreme population patterns.
by Maria Popova From Darwin’s marginalia to Voltaire’s correspondence, or what Dalí’s controversial World’s Fair pavilion has to do with digital myopia. Despite our remarkable technological progress in the past century and the growth of digital culture in the past decade, a large portion of humanity’s richest cultural heritage remains buried in analog archives. Bridging the disconnect is a fledgling discipline known as the Digital Humanities, bringing online historical materials and using technologies like infrared scans, geolocation mapping, and optical character recognition to enrich these resources with related information or make entirely new discoveries about them.
This collection represents the full spectrum of data-related content we’ve published on O’Reilly Radar over the last year. Mike Loukides kicked things off in June 2010 with “What is data science?” and from there we’ve pursued the various threads and themes that naturally emerged. Now, roughly a year later, we can look back over all we’ve covered and identify a number of core data areas: Data issues -- The opportunities and ambiguities of the data space are evident in discussions around privacy, the implications of data-centric industries, and the debate about the phrase “data science” itself.
<img src="http://radar.oreilly.com/2011/08/25/0811-mysociety.png" border="0" alt="mySociety" width="270" style="float: right;margin: 3px 0 10px 10px" /> There has been much hand-wringing of late about whether the explosion of government-run app contests over the last couple of years has generated any real value for the public. With only one of the Apps for Democracy projects still running, it’s easy to see the entire movement being written off as an overly optimistic fad. The organisation that I’m lucky enough to lead, mySociety, didn’t come from the world of app contests, but it does build the kind of open-source, open-data-grounded civic apps that such contests are supposed to produce. I believe that mySociety’s story shows that it’s possible to build meaningful, impactful civic and democratic web apps, to grow them to a scale where they’re unambiguously a good use of time and money, and then to sustain them for years at a time.
In this, my first Visualization Deconstructed post, I’m expanding the scope to examine one of the most popular contemporary visualization techniques: animation of geospatial data over time. The beauty of photo versus the wonder of film <img src="http://blogs.oreilly.com/wp/wp-content/uploads/2011/10/1011-facebook-viz1.jpg" border="0" alt="Paul Butler's visualizing frienships" width="300" style="float: right;margin: 3px 0 10px 10px" /> In a previous post, Sebastien Pierre provided some excellent analysis about the illuminating visualization produced by Paul Butler, which examined the relationships between Facebook users around the world.
The world has changed. And some things that should not have been forgotten, were lost. I found these words from the Lord of the Rings echoing in my head as I listened to a fascinating presentation by Luiz André Barroso, Distinguished Engineer at Google, concerning Google's legendary past, golden present, and apocryphal future. His talk, Warehouse-Scale Computing: Entering the Teenage Decade, was given at the Federated Computing Research Conference. Luiz clearly knows his stuff and was early at Google, so he has a deep and penetrating perspective on the technology.
Government is releasing data at a breakneck pace, and it is just getting started. One interesting side effect of our National Data Catalog is that we're regularly parsing all of the data on data.gov, and we're able to do interesting things with the aggregate metadata. By parsing out the release date for each dataset on data.gov and grouping the releases by quarter, it's easy to see that since the second quarter of 2009, when Data.gov was launched, the federal government has released more raw datasets than it ever has in the past. Take a look at what's happened after Data.gov launched: Now, granted, like all government data, it's a little messy. These are bulk, aggregate conclusions and haven't been reviewed, but they point to a trend regardless of their accuracy.
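The quarterly grouping described above can be sketched roughly as follows. This is only an illustration, not the National Data Catalog's actual code: the function names and the sample dates are made up, and it assumes each dataset's release date is already available as an ISO `YYYY-MM-DD` string.

```python
from collections import Counter
from datetime import date


def quarter_label(d: date) -> str:
    """Return a label such as '2009-Q2' for the quarter containing d."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def releases_per_quarter(release_dates):
    """Count dataset releases per calendar quarter.

    release_dates: iterable of ISO 'YYYY-MM-DD' strings (an assumed input
    format; real catalog metadata may need messier parsing).
    """
    counts = Counter(quarter_label(date.fromisoformat(s)) for s in release_dates)
    # Sort by label so the quarters come out in chronological order.
    return dict(sorted(counts.items()))


# Hypothetical sample data, not real data.gov release dates.
sample = ["2009-05-21", "2009-06-30", "2009-07-01", "2010-01-15"]
quarterly = releases_per_quarter(sample)
print(quarterly)
```

Plotting the resulting counts as a bar chart, one bar per quarter, would give the kind of trend view the post refers to.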
One of the joys of the last few years has been the flood of real-world datasets being released by all sorts of organizations. These usually involve some record of individuals’ activities, so to assuage privacy fears, the distributors will claim that any personally-identifying information (PII) has been stripped. The idea is that this makes it impossible to match any record with the person it’s recording. Something that my friend Arvind Narayanan has taught me, both with theoretical papers and repeated practical demonstrations, is that this anonymization process is an illusion.
From Facebook to the Department of Motor Vehicles, the world is catalogued in databases. No one knows it better than MIT adjunct professor and entrepreneur Michael Stonebraker, who has spent the last 25 years developing the technology that made it so. He got his big break by inventing and commercializing the technology that underlies relational databases, the kind of database that rules today.
The visualization below highlights something only recently possible on the web: a dynamic, interactive canvas. Titled “Disaster Strikes: A World In Sight”, it visualizes a century of floods, fires, droughts, and earthquakes around the globe. (Below is a snapshot of 1996, an apparently costly year for disasters.) It’s not a passively animated graphic, but one that users can actively engage with, freezing or pivoting dimensions to reveal new views of the data. It’s a harbinger of a new class of documents, which digital publishers are beginning to embrace, to provide a richer information experience for readers.
Supply chains come in all shapes and sizes. A supply chain grows more complex as it becomes larger, more geographically extended, or more data intensive. Lora Cecere, a partner at Altimeter Group, recently wrote a post focused on "the big data supply chain." ["User in the Era: Big Data Supply Chains," Supply Chain Shaman, 1 June 2011]. Since "big data" may be a new term for some readers, Cecere begins her post by explaining what she means by "big data." She writes:
In my old age, at least for the computing industry, I’m getting more irritated by smart young things that preach today’s big thing, or tomorrow’s next big thing, as the best and only solution to my computing problems. Those that fail to learn from history are doomed to repeat it, and the smart young things need to pay more attention. Because the trends underlying today’s computing should be evident to anyone with a sufficiently good grasp of computing history.
<img src="http://radar.oreilly.com/2011/05/16/0511-m2m.png" width="250" border="0" alt="M2M screenshot" style="float: right;margin: 3px 0 10px 10px" /> The shift from transporting voice to delivering data has transformed the business of mobile carriers, but there’s yet another upheaval on the horizon: machine to machine communications (M2M). In M2M, devices and sensors communicate with each other or a central server rather than with human beings. These devices often use an embedded SIM card for communication over the mobile network.
<img src="http://radar.oreilly.com/2011/05/16/0511-steampunk.png" border="0" alt="SteamPunk Frankenstein - By D. Mattocks by SteamPunk Frankenstein, on Flickr" style="float: right;margin: 3px 0 10px 10px" /> Ken Cukier recently wrote about how useful analogies from the past are in explaining the potential of the current data revolution.