
Using Public Data to Fight a War. How does a technology built for apartment-hunting end up being evaluated by the U.S. Army for use in Afghanistan? Cazoodle is using public data sources like Flickr and OpenStreetMap to build detailed guidebooks for American soldiers. Last week at Strata I sat down with company CTO Govind Kabra to find out how they do it. Cazoodle's project for the Army is to build a detailed database of information about places in Afghanistan, using only public sources on the Web. The goal is to describe the towns and cities in detail, covering everything from names, locations and populations to lists and coordinates for schools, mosques, banks and hotels. The military already collects this sort of information, but it relies on traditional offline sources gathered through groups like the National Geospatial-Intelligence Agency. Sending personnel door to door for research in war-torn countries is slow and dangerous, and though the agency's budget is classified, presumably very expensive.

Origins Results. An Intro To The Semantic Web: Why You Need To Know About It Sooner Than Later | Web Central Station. Case Studies Fusionex Read about how Fusionex, a provider of business intelligence solutions, leveraged Windows Azure HDInsight Service and Hadoop to help a customer better analyze its data, cut its reporting time from three months to less than 30 minutes, and reduce its energy consumption by 20 percent.

Mind Palette Learn how Japanese app maker Mind Palette cut database costs by 20 percent and received improved technical support by migrating its Linux servers from Amazon Web Services to Windows Azure. Equifax Discover how financial services firm Equifax is reacting faster and slashing IT costs with a Microsoft hybrid cloud solution plus SUSE Linux Enterprise Server. City of Barcelona Learn how the City of Barcelona is improving the lives of its citizens and creating a Smart-city plan with its Windows Azure HDInsight Service and Hadoop Big Data solution. Teletica. SilverStripe Discover how SilverStripe, an open source content management system, grew by working with Microsoft.

DreamFactory Software. Data mining and data warehousing. Scalable ontological EAI and e ... Manuel sur la communication et la ... Gephi Tutorial Visualization. Open Knowledge Foundation Blog » Blog Archive » Playing around with Open Linked Data: data.totl.net. Making Connections Real. Refining UMBEL’s Linking and Mapping Predicates with Wikipedia We are only days away from releasing the first commercial version 1.00 of UMBEL (Upper Mapping and Binding Exchange Layer) [1]. To recap, UMBEL has two purposes, both aimed to promote the interoperability of Web-accessible content.

First, it provides a general vocabulary of classes and predicates for describing domain ontologies and external datasets. Second, UMBEL is a coherent framework of 28,000 broad subjects and topics (the “reference concepts”), which can act as binding nodes for mapping relevant content. This last iteration of development has focused on the real-world test of mapping UMBEL to Wikipedia [2]. The result, to be more fully described upon release, has led to two major changes. There is a huge diversity of organizational structure and world views on the Web; the linking and mapping predicates to fulfill this purpose must also capture that diversity.
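The post gives no concrete mapping examples here; as an illustrative sketch of predicates with graded strength, the following uses standard SKOS matching predicates, but the concept pairings themselves are invented, not UMBEL's actual Wikipedia mappings:

```python
# Illustrative sketch only: linking predicates of graded strength.
# skos:exactMatch, skos:closeMatch and skos:broadMatch are real SKOS
# predicates; the pairings below are invented examples.
MAPPINGS = [
    ("umbel:Bank", "skos:exactMatch", "wikipedia:Bank"),
    ("umbel:FinancialOrganization", "skos:broadMatch", "wikipedia:Credit_union"),
    ("umbel:Mosque", "skos:closeMatch", "wikipedia:Mosque"),
]

def by_predicate(predicate):
    # Select (source, target) pairs linked by a given mapping predicate.
    return [(s, o) for s, p, o in MAPPINGS if p == predicate]

print(by_predicate("skos:exactMatch"))
```

The point of keeping several predicates rather than one blanket "equivalent" link is exactly the diversity argument above: not every correspondence between two vocabularies is an exact one.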

A Comparison of Mapping Predicates Equivalent Properties. The INSEMTIVES ISWC2010 Tutorial - 10 ways to make your semantic app addictive. Co-located with the 9th International Semantic Web Conference (ISWC 2010), 7th November 2010, Room 3D, Shanghai, China. Summary: Useful semantic content cannot be created fully automatically, but motivating people to become an active part of this endeavor is still more art than science. In this tutorial we revisit fundamental design issues of semantic-content authoring technology, asking which incentives motivate people to engage with the Semantic Web, and how those incentives can be transferred into technology design. We present a combination of methods from areas as diverse as community support, participation management, usability engineering, and incentives theory, which can be applied to analyze semantically enabled systems and applications and to design incentivized variants of them, as well as empirically grounded best practices for encouraging large-scale user participation.

Meditating on Perl, Python and the Semantic Web. I have been working on a project with the National Labs over the past several months. A central element of that work has been an effort to bring much of the information and computation power of the Labs into a Service Oriented Architecture paradigm, and Semantic Web constructs are at the heart of that part of the work.

Not being that familiar with Semantic Web constructs, I embarked on a concerted effort to teach myself about them. So I have read through several books. But, for me, there is no substitute for doing some actual programming on a topic to really do a "deep dive" into the material. So I latched onto Segaran, Evans and Taylor's "Programming the Semantic Web" (published by O'Reilly). I like O'Reilly's books in general, and learned a lot from their similar book "Programming Collective Intelligence."

Out of these last few months' efforts, two questions have emerged for me. First, why is Python so prevalent in this genre of discourse? Just curious. Web - python semantic proxy/server, which framework to use. Visualizing the Semantic Web (9781852335762): Vladimir Geroimenko, Chaomei Chen. Towards the Semantic Web: Ontology-Driven Knowledge Management (9780470848678): John Davies, Dieter Fensel, Frank van Harmelen. Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential (9780262062329): Dieter Fensel, Wolfgang Wahlster, Henry Lieberman, James Hendler. Social commerce | When e-commerce meets today's Web | Julien Chaumond. An Introduction to Linked Data. Seth Grimes's tweet about Sandro Hawke's video presentation caught my attention. The presentation, entitled An Introduction to Linked Data, was recorded on June 8, 2010 at the Cambridge Semantic Web Gathering, held at the Massachusetts Institute of Technology (MIT) in Cambridge, MA.

Sandro works at the World Wide Web Consortium, an international community where Member organizations, a full-time staff, and the public work together to develop Web standards. From the summary: "Although the first Semantic Web standards are more than ten years old, only recently have we begun to actually see machines sharing data on the Web.

The key turning point was the acceptance of the core Linked Data principle: that object identifiers should also work with Web protocols to access useful information. This talk will cover the basic concepts and techniques of publishing and using Linked Data, assuming some familiarity with programming and the Web. The slides of the presentation (PDF) are also available. How-to create a Linked Data site. Planète Web Sémantique. Semantic Search Gets On The Map - semanticweb.com. Présentation du web de données. Use Google Refine to Export JSON. Google Refine is great at cleaning large sets of data.

But one amazing, under-documented feature is the ability to design and output JSON files. With Google Refine, you can turn a simple spreadsheet into a straightforward JSON dataset or multidimensional array quickly and easily. If you haven’t used Refine before, here are some videos to get you started. Basic JSON structure: JSON stands for JavaScript Object Notation and is an increasingly common way of incorporating data in JavaScript. Let’s start with a simple spreadsheet in Refine. When converted to JSON, each row of data becomes an element in an array. Converting your spreadsheet to JSON is easy. A new window will open showing you the default export template. Clicking Export saves a text file. Export JSON to match templates: the real power of Refine is that it gives you control over how the JSON will be formatted.
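Outside of Refine, the row-to-array conversion described above can be sketched in a few lines of Python (the rows below are invented for illustration, not from the tutorial):

```python
import json

# Invented rows standing in for a Refine project's spreadsheet.
rows = [
    {"name": "Alice", "city": "Boston"},
    {"name": "Bob", "city": "Chicago"},
]

def rows_to_json(rows):
    # Each row becomes one element of a JSON array, mirroring what
    # Refine's default export template produces.
    return json.dumps(rows, indent=2)

print(rows_to_json(rows))
```

Refine's export template then lets you reshape this default structure, for instance by editing the prefix, row template and suffix.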

To get the same thing in our file, we edit the prefix. Henry Thompson: Are URIs really names? The Twitter data extraction begins! « Laurens goes semantic… Today I started on the implementation of the extraction package. The package contains two models: one for the user’s profile and one for the user’s tweets. These will then be annotated in the “Annotator” module and converted into a list of simple triples in the “Triplifier” module.
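The Triplifier's code isn't shown in the post; as a hedged sketch of the idea (the field names and example.org URI scheme are my assumptions, not the author's actual design), a user profile could be flattened into simple triples like this:

```python
def triplify(profile):
    # Flatten a user-profile dict into (subject, predicate, object)
    # triples. The example.org URIs are invented for illustration.
    subject = "http://example.org/user/" + profile["screen_name"]
    return [
        (subject, "http://example.org/vocab/" + field, str(value))
        for field, value in profile.items()
    ]

profile = {"screen_name": "laurens", "followers_count": 42}
for triple in triplify(profile):
    print(triple)
```

Each profile field becomes one triple sharing the same subject, which is what lets the tweets later be connected to the profile with another triple.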

It is being implemented in PHP, and there is a very good API, PHP Twitter with OAuth, to help with this task. Extraction Package: I started with the User Profile Model. First I had to get familiar with the Twitter API and revise the PHP basics. I grabbed data from Twitter with PHP and converted it to an HTML table; the next step was to convert this table into triples, viewed as an HTML table. After that, the next step is to grab the tweets from the Grabeteer and connect them to the user profile with another triple. Featured Tools and Technologies. Encyclopedia and Glossary. I only have one big research question, but I attack it from a lot of different angles.

The question is representation. How do people make, see and use things that carry meaning? The angles from which I attack my question include various ways in which representations are applied (including design processes, interacting with technology, computer programming, visualisation), various methods by which I collect research data (including controlled experiments, prototype construction, ethnographic observation), and the theoretical perspectives of various academic disciplines (including computer science, cognitive psychology, engineering, architecture, music, anthropology). If you are based in Cambridge, you may like to attend the following talks on human-computer interaction. Présentation du web de données. Object-Oriented PHP for Beginners. Recherche à facettes.

Faceted search (also called faceted navigation or faceted browsing) is an information-retrieval technique in which access to information is based on a faceted classification. It gives users the means to filter a collection of data by choosing one or more criteria (the facets). It is therefore not so much search as filtering (a raw, taxonomic search can be used as a complement).

A faceted classification associates with each item in the search space a number of explicit filtering axes, for example keywords drawn from a text analysis, metadata stored in a database, and so on. Category-based faceted search can be found, for example, on many e-commerce sites. Types of facets: facets may be linked. The main efforts have been oriented toward:
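As a toy sketch of the filtering described above (the item data and facet names are invented, in the spirit of the e-commerce example), each item carries explicit facet values and a query keeps only the items matching every chosen facet:

```python
# Toy faceted filter over an invented product catalogue.
items = [
    {"name": "T-shirt", "color": "red", "brand": "Acme"},
    {"name": "Hoodie", "color": "red", "brand": "Globex"},
    {"name": "Cap", "color": "blue", "brand": "Acme"},
]

def facet_filter(items, **facets):
    # An item survives only if it matches every selected facet value.
    return [it for it in items
            if all(it.get(k) == v for k, v in facets.items())]

print(facet_filter(items, color="red", brand="Acme"))
```

Selecting a second facet value narrows the result set further, which is exactly the filtering (rather than searching) behaviour the article describes.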

SSH tunnel, HTTP proxy [NoJhan - personal site] By nojhan, 3 September 2008. When using networked applications, you may find yourself blocked by fussy (and therefore competent) system administrators who close every port except the web ones and force all traffic through an HTTP proxy. In this article, I detail one of the (many) possible methods of getting around this kind of restriction: the SSH tunnel.
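As a hedged sketch of the technique (the hostnames, proxy address and the `corkscrew` helper are my assumptions, not commands taken from the article), the tunnel boils down to something like:

```shell
# Sketch only: tunnel SSH through an HTTP proxy via CONNECT, using the
# corkscrew helper, and expose a local SOCKS proxy for applications.
# Hostnames, ports and user are invented placeholders.
ssh -p 443 \
    -o ProxyCommand='corkscrew proxy.example.lan 8080 %h %p' \
    -D 1080 \
    user@outside.example.org
# Applications then use the SOCKS proxy at localhost:1080.
```

Running the outside machine's SSH daemon on port 443 matters: the proxy usually only allows CONNECT to the HTTPS port.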

Configuration: the commands are detailed here for Linux, but the method itself, once understood, can work for any combination of operating systems. Imagine you are on a local network, behind an HTTP proxy, and in theory can only reach the Internet on the ports associated with the web: 80 (HTTP) and 443 (HTTPS). This is the case in the overwhelming majority of cybercafés, for example. As soon as you have access to a machine outside this network, you can in theory reach whatever you want on the Internet. The principle of the tunnel. Strata Conference 2011, Day 2 Keynotes. Day 2, and after yesterday’s tutorials the conference is really getting going. Here’s a stream of consciousness from the morning’s keynotes at this sold-out event. “In the same way that the industrial revolution changed what it meant to be human, the data revolution is changing what it means to be alive.”

The first of this morning’s keynotes: Hilary Mason from link shortener bit.ly, on data and the people who work with data. “The state of the data union is strong.” Data scientists have an identity – a place to rally around – with Strata. We have accomplished much, begging, borrowing and stealing from lots of domains. The most important thing we have now that we didn’t have before is momentum. There are still challenges, though, and opportunities (expressed in the context of bit.ly): bit.ly gets lots of data from people shrinking web links.

Now that we have all this data, it offers a window onto the world. Next, Mark Madsen from Third Nature, talking about ‘the Mythology of Big Data.’ The Semantic Link Podcast RSS Feed is live! Episode 2 of The Semantic Link podcast discusses Drupal and more. Episode 2 of our new Semantic Link podcast went up on SemanticWeb.com this evening, and it’s another good one. Not that I’m biased or anything. The whole team is present once more, and we start the show discussing the implications of Drupal 7 and its newly formalised RDFa-publishing capabilities. Unlike typical semantic technology solutions, which someone has to consciously procure as such, Drupal is first and foremost a (popular, free) content management system; the semantic smarts come for free, and therefore reach a massive new audience.

From there, we move into a broader discussion of the ways in which semantic projects take root within organisations. Have a listen, and let us know what you think. @theSemanticLink is the show’s Twitter id, which we will use to invite questions ahead of shows on particular topics. The Semantic Link is open: at the end of last month, I wrote about the new Semantic Link podcast that I'm involved with for SemanticWeb.com. MapReduce from the basics to the actually useful (in under 30 minutes)

Semanticweb.com - The Voice of Semantic Web Business. When supercomputers meet the Semantic Web – Post. Easy Semantic Solution Is At Hand! – Post. Mapping Wikileaks’ Cablegate using Python, mongoDB and Gephi – Saturday, 5 February 2011. Need faster machine learning? Take a set-oriented approach. We recently faced the type of big data challenge we expect to become increasingly common: scaling up the performance of a machine learning classifier for a large set of unstructured data. Machine learning algorithms can help make sense of data by classifying, clustering and summarizing items in a data set.

In general, performance has limited the opportunities to apply machine learning to understanding big or messy data sets. Analysts need to budget time for speeding up off-the-shelf algorithms, or even to question whether a machine learning pass will complete in a timely manner. While using smaller random samples can help mitigate performance issues, some data sets yield improved results when more of the data is used.
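The post shares no code; as a generic, invented sketch of the kind of classifier being scaled (the categories and keywords below are made up, not the authors' data), classification by keyword-set overlap looks like this:

```python
# Invented sketch of a simple classifier over a data set: score each
# document by keyword-set overlap and pick the best category.
CATEGORIES = {
    "sports": {"game", "score", "team"},
    "finance": {"market", "stock", "bank"},
}

def classify(document):
    words = set(document.lower().split())
    # Choose the category whose keyword set overlaps the document most.
    return max(CATEGORIES, key=lambda c: len(CATEGORIES[c] & words))

print(classify("The team won the game"))
```

Applied one document at a time over millions of items, even this cheap scoring loop becomes the bottleneck, which is the performance problem a set-oriented (whole-collection-at-once) approach attacks.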

Here we share our experience implementing a set-oriented approach to machine learning that led to huge performance increases (more detail is available in a related post at O’Reilly Answers). We had a tricky data set with categories that could be only subtly different. Developer Week in Review. Rescue teams this week uncovered the body of your faithful WIR reporter, buried under more than 50 feet of snow. A note found next to the body read: “Supplies failing. iPhone battery nearly dead. Still, must put out Week in Review.”

News of the mobile: SDK edition. New goodies in the mobile developer basket for both major camps this week. The folks at Google have issued unto the masses the Android 3.0 SDK preview, which includes a platform build. The 3.0 drop will be the one that the new generation of Android tablets will be deployed with. Meanwhile, the third beta of iOS 4.3 has appeared. Last one out at Microsoft, please shut off the lights: for a company that was the hot place to work just a few years ago, Microsoft seems to be experiencing some major brain drain at the top end of the pay grade. It’s common to see turnover at the highest ranks of any large company. The Interwebs, circa 1994: want a reminder of just how far the Internet has come in a mere decade and a half? Handling RDF on Your Own System – Quick Start.

Tags Associated With Other Tags on Delicious Bookmarked Resources.