Music and Literary Apps

Sometimes it takes a little human intervention to make semantic applications easier to build.

The Guardian newspaper has augmented its Open Platform API with unique identifiers for bands and books. In turn, the company has simplified the process of creating a mashup that uses multiple sources to focus on a single work, which may help its content spread farther. The introductory post helps explain the process: “We… extended the Guardian’s Content API to include non-Guardian identifiers. At the moment, we have populated data for two types of identifiers, ISBNs and MusicBrainz ids. ISBNs are available chiefly on our book review articles, about 2,800 or so of them as I speak. On the music side, about 17,000 items of content and 600 artists have been marked up with identifiers from MusicBrainz.”
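To make the mashup idea concrete, here is a minimal sketch of how content from two sources might be grouped under a shared identifier. Everything in it is an assumption for illustration: the records are invented, the field names (`title`, `refs`, `label`, `ref`) are not the Guardian API's actual schema, and the ISBN and MusicBrainz values are placeholders rather than real identifiers.

```python
from collections import defaultdict

# All records below are invented placeholders; the field names are
# assumptions for this sketch, not the Guardian API's schema.
BAND_A = ("musicbrainz", "00000000-0000-0000-0000-00000000000a")
BAND_B = ("musicbrainz", "00000000-0000-0000-0000-00000000000b")
BOOK = ("isbn", "9780000000002")

guardian_items = [
    {"title": "Book review", "refs": [BOOK]},
    {"title": "Album review", "refs": [BAND_A]},
    {"title": "Festival round-up", "refs": [BAND_A, BAND_B]},
]

other_source = [
    {"label": "Bookshop listing", "ref": BOOK},
    {"label": "Discography for band A", "ref": BAND_A},
]


def build_mashup(items, external):
    """Group titles and labels from both sources under each shared identifier."""
    by_ref = defaultdict(lambda: {"guardian": [], "external": []})
    for item in items:
        for ref in item["refs"]:
            by_ref[ref]["guardian"].append(item["title"])
    for record in external:
        by_ref[record["ref"]]["external"].append(record["label"])
    return dict(by_ref)


mashup = build_mashup(guardian_items, other_source)
for (scheme, value), sources in mashup.items():
    print(scheme, value, sources)
```

Because both sources carry the same identifier, the join is an exact dictionary lookup rather than a fuzzy match on band names or book titles.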

“The big vision statement inspiring all of this kind of stuff is the idea that we intend to weave the Guardian into the fabric of the Internet,” said Matt McAlister, Head of Guardian Developer Network.

Value to the news industry?

Almost every talk about Linked Data I've seen inevitably at some point shows the 'linked data' universe bubble diagram.

Every time I see it, it has grown in size. However, the first time I saw it, I noticed a glaring omission. None of our major UK print-based news organisations featured on it, and that fact is yet to change. We now know that, whatever the outcome of the next election, we are only going to see more Government- and state-gathered data published, not less. So how, as the news industry, are we going to respond to this, and what does the digital news media look like in a world with a high level of semantic state data available? To imagine how it could work, let us look at a non-news example from a news organisation. There are a couple of points to note. Pages are performing very well in SEO terms.

Linked Data at the Guardian

The semantic web is given a rough raking by the syntactic web, and it is not impossible to see why when you first get taken down the SPARQL/RDF/Ontology rabbit hole.

It is not great fun learning to develop with the semantic web today. (As an aside, using a semi-SQL model as a primary metaphor in SPARQL did not help me personally. But then, SQL has always seemed like an assembly language designed by Prolog programmers.) But the capability to use semantic data to accurately join data is fantastically powerful. Down that particular rabbit hole is a warm cosy realm, an existence where mashups never have flaky data interconnections. More seriously, you decide your music application could benefit from a bit of descriptive text and some mashed up functionality. But in general, ours is a cruel universe. In the MusicBrainz/Wikipedia case, there is a deeper semantic option. There are two things happening here, two sides of the semantic question.

Data is Journalism: Politics API from The Guardian

UK newspaper The Guardian is expanding its Open Platform. Today it launched a useful new government API that covers information about politicians and elections in the UK, with many details going back to 1992. It also contains limited older data, as far back as 1945. The newspaper announced the API, noting that it was built from a pre-existing internal system and that there is more to come: “This is the first of what will become a series of thematic APIs; ones which expose content in a tightly organised facet of our subject expertise. Our editorial staff have long since used and contributed to a resource known as Aristotle.” With an election coming in the next few months, The Guardian is urging developers to use its platform to create applications. In order to encourage wide use of the new API, the barrier has been lowered.

Another way the newspaper has made it easy to use the new API is by providing a sample application, complete with a walk-through.

Simon Rogers

An interview by L'Atelier des Médias with one of the leading data journalists. A few days ago, RFI's L’Atelier des Médias took advantage of a stop in Paris by Simon Rogers, the Guardian's “Mr. Data”, to ask him about his career and about this new trend in journalism: data journalism. Simon Rogers edits the Guardian's Data Blog. For his newspaper, he took part in working through the Wikileaks leaks on Afghanistan and Iraq.

Despite his modesty and the simplicity with which he sets out his views, he is certainly one of the leading figures and driving forces of data journalism and data visualisation worldwide.

Atelier des Médias (ADM): To begin with, can you introduce your blog and tell us about your career?