Linking data

The Web as a CMS: How BBC joined Linked Open Data. I was looking at the slides from a recent talk by Paul Rissen, Senior Data Architect at the BBC, about the history of Linked Data usage at the organisation. One of his slides, number 20 to be exact, reminded me of how quietly revolutionary the work at the BBC has been. The slide was titled ‘The Web as a Content Management System’. First Successes. Early on, the BBC decided not to mint their own IDs but to use existing URIs for musical artists from MusicBrainz, a freely available database. For the uninitiated, a URI (Uniform Resource Identifier) is a way for a computer to identify a thing, and it is one of the basic concepts in the Linked Data paradigm. Firstly, this instantly gave them a database of 50 million artists, albums and songs.

That’s the ‘magic’ of linked data. Fast, Cheap and Out of Control. I can imagine the conversations with the heads of editorial when the techies suggested the idea of using ‘wild’ data, more often called open data. The Quiet Revolution of Linked Open Data. A Simple Knowledge Organisation Tool. NetAppVoice: How The Semantic Web Changes Everything. Again! The “semantic Web” is hugely important to tomorrow’s business. Do not underestimate its significance: It truly changes everything. Embrace it, or risk extinction. But what is it? And what does it mean for your business? “Semantic” is the latest buzzword to hit the online world. It’s come to mean everything and nothing. From semantic search to the semantic Web, and from semantic marketing to semantic technologies, it seems like everyone wants to ride the semantic train.

But let’s take things from the beginning. So What? It marks the transition into a new phase of the Web, where we stop searching and start finding. In other words, we discover not just the information that matches the keywords we search for, but the information that we really wanted to find. This is exactly what is happening with Google’s semantic search, which finds content in direct response to the intent of our search query.

New Products; New Services. The Age Of Checkbox Marketing Is Over. So how’s it achieved? Taxonomy and Recirculation. The taxonomy’s main job is to help users explore the site, and it can only do that if the tags and categories are used consistently. We wanted a system that was easy for editors to manage, with clear guidelines for how to apply tags and categories to posts. When we looked at the original taxonomy on The Toast, we found, well, the exact opposite of what we wanted. The Old Toast: Like your closet, but with more piles of random stuff. Peeking inside The Toast uncovered a big ol’ stew of tags. When we started this process, The Toast had around 4,500 posts sorted into 60 categories. We discovered 8,182 unique tags, 6,152 of which were applied to only a single post.

The complete tags list, as you can imagine, was a hot mess. Some tags were topical: disability, the great outdoors, travel. Some grouped posts into recurring series: two monks inventing things, texts from, watching downton abbey with an historian. Some were funny. The New Toast: Your closet after a visit to the Container Store. Popularity recircs. Small Pieces Loosely Joined: How smarter use of technology and data can deliver real reform of local government. Local authorities could save up to £10 billion by 2020 through smarter and more collaborative use of technology and data. Small Pieces Loosely Joined highlights how every year councils lose more than £1 billion by failing to identify where fraud has taken place.

The paper also sheds light on how a lack of data sharing and collaboration between many local authorities, as well as the use of bespoke IT systems, keeps the cost of providing public services unsustainably high. The report sets out three ways in which local authorities could not only save billions of pounds, but also provide better, more coordinated public services: Using data to predict and prevent fraud. Each year councils lose in excess of £1.3 billion through Council Tax fraud, benefit fraud and housing tenancy fraud (such as illegal subletting). Testimonials Local Government Minister Kris Hopkins: Richard Copley, Corporate ICT Manager Rotherham Metropolitan Borough Council: "Local Government has spent 5 years cutting back. Hancock must bring a platform for transparent thinking to the Cabinet Office. David Bicknell. Published 13 May 2015. The economic case for Government as a Platform (GaaP) and ensuring public confidence in the Cabinet Office's data are key tasks facing new minister Matt Hancock. The introductions have been made, the bust of St Francis of 70 Whitehall has been paid due respect, and now the work begins for Matthew Hancock and Oliver Letwin in the Cabinet Office.

Hancock's named responsibilities include public sector efficiency and reform, civil service issues, industrial relations strategy in the public sector, government transparency, civil contingencies, the civil society, cyber security, and UK statistics. His remit will also include the ongoing role - and performance - of the Government Digital Service (GDS) whose focus will markedly be on Government as a Platform (GaaP), which is still largely a "vision for digital government", offering a common core infrastructure of shared digital systems, technology and processes on which to build user-centric government services. The CDO: It's a role, not a title. Immediacy is becoming an expectation in the world of digital business, especially as user interaction data becomes just as important as transactional data. One customer even tells me that user data more than 15 minutes old borders on irrelevant. With the shift to mobile, the rise of cloud computing, and the explosion of data, a new point of control has formed at the edge of the enterprise.

Leading businesses strive for tight connections and speedy interaction here, because it’s where customers, employees and partners are. Armed with their mobile devices, or plugged in to the internet of everything, they expect to have relevant information or services at their fingertips, without delay. Delivering this requires understanding the user’s context, delivered from apps to the enterprise. This person is sometimes called a chief digital officer. But it’s important to view the CDO as a role, not a title. Which channels do our customers prefer? The Elephant was a Trojan Horse: On the Death of Map-Reduce at Google : Paper Trail. Note: this is a personal blog post, and doesn’t reflect the views of my employers at Cloudera. Map-Reduce is on its way out. But we shouldn’t measure its importance in the number of bytes it crunches, but in the fundamental shift in data processing architectures it helped popularise.

This morning, at their I/O Conference, Google revealed that they’re not using Map-Reduce to process data internally at all any more. We shouldn’t be surprised. The writing has been on the wall for Map-Reduce for some time. It was known for decades that generalised dataflow engines adequately capture the map-reduce model as a fairly trivial special case.
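To see why the map-reduce model can be read as a special case of a more general dataflow, here is a minimal in-memory sketch. It is not how Google or Hadoop implement it; the function names (`map_reduce`, `word_mapper`, `count_reducer`) and the sample input are assumptions made for illustration. The whole model is a fixed three-stage pipeline: map each record to key/value pairs, group the pairs by key, then reduce each group.

```python
from collections import defaultdict
from itertools import chain


def map_reduce(records, mapper, reducer):
    """Run map, shuffle (group by key), then reduce over in-memory records."""
    # Map stage: each record yields zero or more (key, value) pairs.
    pairs = chain.from_iterable(mapper(record) for record in records)

    # Shuffle stage: collect all values that share a key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)

    # Reduce stage: collapse each group of values into a single result.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Hypothetical mapper: emit (word, 1) for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def count_reducer(key, values):
    """Hypothetical reducer: sum the counts for one word."""
    return sum(values)


# Illustrative input only.
counts = map_reduce(["the web of data", "the web as a CMS"],
                    word_mapper, count_reducer)
```

A general dataflow engine lets stages like these be chained in arbitrary graphs; map-reduce fixes the graph to exactly this one shape.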

Map-Reduce has served a great purpose, though: many, many companies, research labs and individuals are successfully bringing Map-Reduce to bear on problems to which it is suited: brute-force processing with an optional aggregation. In the public domain, Hadoop would not have had any success without Map-Reduce to sell it. Drupal For Dummies.

Data terminology

Open Knowledge Foundation Annual Report 2011-2012. Web of Linked Data. Here follows a quick introduction to the notions of Semantic Web, Linked Data and the Linking Open Data initiative. Semantic Web The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. This fosters the opportunity of creating a next generation world-wide web of structured data which are not only understandable to humans (like the typical HTML page), but also understandable by computers. The data on the Semantic Web have explicitly defined structure (like in the databases) and semantics (like in the ontologies).
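As a rough illustration of what "explicitly defined structure" can look like, the sketch below stores facts as subject, predicate, object triples and looks them up by pattern. It uses plain Python tuples, not a real RDF library, and every URI and name in it is a hypothetical placeholder.

```python
# Each fact is a (subject, predicate, object) triple.
# All URIs below are placeholders, not real identifiers.
TRIPLES = [
    ("http://example.org/artist/1", "http://example.org/vocab/name", "Example Artist"),
    ("http://example.org/artist/1", "http://example.org/vocab/memberOf", "http://example.org/band/9"),
    ("http://example.org/band/9", "http://example.org/vocab/name", "Example Band"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Every triple that states a name, whatever the subject.
names = match(TRIPLES, p="http://example.org/vocab/name")

# Everything known about one subject.
band_facts = match(TRIPLES, s="http://example.org/band/9")
```

Because subjects and objects are URIs, a triple from one dataset can point at a thing described in another, which is the linking that Linked Data refers to.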

This allows computers to perform structured queries (like those in SQL) and infer new facts. In short, the Semantic Web is an extension of the current WWW, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. Linked Data 1. URI Design Principles: Creating Persistent URIs for Government Linked Data. Version: 23 October 2013. Table of Contents: Goals for Persistent Open Government Data URIs; URI Design Overview; Example Persistent URIs for Government Linked Data; References & Resources. 1.

Goals for Persistent Open Government Data URIs As an increasing number of governments and government agencies have begun publishing linked open government data, policies and best practices will emerge for Uniform Resource Identifier (URI) design[0] in open government data release. RPI TWC is working extensively with key players in the United States and international linked open government data initiatives to develop URI schemes that are useful today and in the future. The URI scheme demonstrated by the TWC Instance Hub provides rich descriptive information about entities for humans examining open government data, while encouraging intuitive navigation and exploration of this data. The principles recommended in this document should produce URIs with the following characteristics: URIs that are easily re-hosted. Linked Data: Evolving the Web into a Global Data Space.