Named-entity recognition

Named-entity recognition (NER), also known as entity identification, entity chunking, and entity extraction, is a subtask of information extraction that seeks to locate and classify elements in text into predefined categories such as the names of persons, organizations, and locations, expressions of time, quantities, monetary values, and percentages. Most research on NER systems has been structured as taking an unannotated block of text, such as this one:

Jim bought 300 shares of Acme Corp. in 2006.

and producing an annotated block of text that highlights the names of entities:

[Jim]Person bought 300 shares of [Acme Corp.]Organization in [2006]Time.

In this example, a one-token person name, a two-token company name, and a temporal expression have been detected and classified.
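The annotated output in the example can be represented as labeled character spans over the original text. The following is a minimal sketch of that representation only, not a recognizer: the spans are supplied by hand, and the helper names `find_span` and `annotate` are hypothetical, introduced here purely for illustration.

```python
from typing import List, Tuple

# A span is (start, end, label) with an exclusive end offset.
Span = Tuple[int, int, str]


def find_span(text: str, phrase: str, label: str) -> Span:
    """Locate the first occurrence of `phrase` and tag it with `label`."""
    start = text.index(phrase)  # raises ValueError if the phrase is absent
    return (start, start + len(phrase), label)


def annotate(text: str, spans: List[Span]) -> str:
    """Render non-overlapping spans in the [entity]Label style used above."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        pieces.append(text[cursor:start])            # untouched text before the entity
        pieces.append(f"[{text[start:end]}]{label}")  # bracketed entity plus its label
        cursor = end
    pieces.append(text[cursor:])                      # trailing text after the last entity
    return "".join(pieces)


if __name__ == "__main__":
    sentence = "Jim bought 300 shares of Acme Corp. in 2006."
    entities = [
        find_span(sentence, "Jim", "Person"),
        find_span(sentence, "Acme Corp.", "Organization"),
        find_span(sentence, "2006", "Time"),
    ]
    print(annotate(sentence, entities))
```

A real NER system would produce the spans itself; keeping them as offsets rather than inline markup leaves the original text unchanged and makes the token counts easy to check.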
OPAC2.0 – OpenCalais meets our museum collection / auto-tagging and semantic parsing of collection data

Today we went live with another of the new experimental features of our collection database – auto-generation of tags based on semantic parsing. Throughout the Museum’s collection database you will now find, in the right-hand column of the more recently acquired objects (see a quick sample list), a new cluster of content titled “Auto-generated tags”. We have been experimenting with Reuters’ OpenCalais web service since it launched in January.
Web 3.0

A short story about the Semantic Web. Some Internet experts believe the next generation of the Web - Web 3.0 - will make tasks like your search for movies and food faster and easier. Instead of multiple searches, you might type a complex sentence or two in your Web 3.0 browser, and the Web will do the rest. For example, you could type "I want to see a funny movie and then eat at a good Mexican restaurant. What are my options?"
LingPipe Home

What is LingPipe? LingPipe is a tool kit for processing text using computational linguistics. LingPipe is used to do tasks like:

Content Manager

Submitted by Anonymous on Mon, 04/14/2008 - 15:49. For a general overview of Calais please take a moment to read the About section. If you’d just like to jump in and learn how Calais is relevant to developers, read on.

Content and Collection Management

This is a big area that covers everything from corporate knowledge management to librarians to collections at museums. Given the range of needs for this group as a whole, we’re simply going to try and point you in some useful directions.
AutoMap

AutoMap is a text mining tool developed by CASOS at Carnegie Mellon. Input: one or more unstructured texts. Output: DyNetML files and CS files.

Developer

Submitted by Anonymous on Mon, 04/14/2008 - 15:50. For a general overview of Calais please take a moment to read the About section. If you’d just like to jump in and learn how Calais is relevant to developers, read on.

Thoughts on Google Plus: The Magic Isn’t Social, It’s Semantic

It’s been said that I’ve called Google Plus “one of the subtlest and most user-friendly ontology development systems we’ve ever seen.” I did, and you can listen for yourselves on the Semantic Link podcast. Why did I do so? Well, G+ follows some of the basic principles of linked data: it uses persistent HTTP URIs for people, Sparks (concepts) and posts.
Software Provider

Submitted by Anonymous on Mon, 04/14/2008 - 15:44. For a general overview of Calais please take a moment to read the About section. If you’d just like to jump in and learn how Calais is relevant to software providers, read on. Do you build content-driven software? If you do, the Calais team wants to work with you to incorporate Calais functionality in your tools.

REST

Submitted by Anonymous on Wed, 02/25/2009 - 13:19. Calais’s latest release of an improved REST interface is the simplest and fastest way to submit your documents. Here’s how you can invoke a request with this API.

Web service URL for improved REST API is located at
You should create an HTTP POST request.
Document content should be passed as the body of the HTTP request.
Submitted content should be UTF-8 encoded.
Your Calais license and the different processing and user options are specified as HTTP headers (key-value pairs) of the request.

The following headers are mandatory:

x-calais-licenseID: the value of this header is your license key.
content-type: the value of this header is the content type of the submitted content, whether it's text/raw, text/html, etc., as documented here.
accept or outputformat: possible values are the expected MIME types of the response, e.g., xml/rdf, application/json, etc., as documented here.

Specification of all other processing and user options is optional. Sample Java code is here.
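The request described above (an HTTP POST with a UTF-8 body and the three mandatory headers) might be assembled as in the following sketch. It uses only Python's standard library and builds the request without sending it. The endpoint URL is not given in the text, so `endpoint_url` is a caller-supplied parameter and the address in the usage lines is a placeholder, not the real service URL. The function name `build_calais_request` and the license key value are likewise assumptions made for illustration.

```python
import urllib.request


def build_calais_request(
    endpoint_url: str,
    license_key: str,
    document: str,
    content_type: str = "text/raw",
    accept: str = "application/json",
) -> urllib.request.Request:
    """Assemble (but do not send) a POST request with the mandatory headers."""
    body = document.encode("utf-8")  # submitted content should be UTF-8 encoded
    headers = {
        "x-calais-licenseID": license_key,  # your license key
        "content-type": content_type,       # type of the submitted content
        "accept": accept,                   # expected MIME type of the response
    }
    # The document content travels as the body of the HTTP request.
    return urllib.request.Request(
        endpoint_url, data=body, headers=headers, method="POST"
    )


if __name__ == "__main__":
    # Placeholder endpoint and key: substitute the documented values.
    request = build_calais_request(
        "https://example.invalid/calais",
        "YOUR-LICENSE-KEY",
        "Jim bought 300 shares of Acme Corp. in 2006.",
    )
    print(request.get_method())
    for name, value in request.header_items():
        print(f"{name}: {value}")
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`. Note that `urllib` normalizes the capitalization of header names, which is harmless because HTTP header names are case-insensitive.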
Metadata

Metadata is "data about data".[1] There are two "metadata types": structural metadata, about the design and specification of data structures, or "data about the containers of data"; and descriptive metadata, about individual instances of application data, or the data content. The main purpose of metadata is to facilitate the discovery of relevant information, more often classified as resource discovery. Metadata also helps organize electronic resources, provides digital identification, and supports archiving and preservation of the resource. Metadata assists in resource discovery by "allowing resources to be found by relevant criteria, identifying resources, bringing similar resources together, distinguishing dissimilar resources, and giving location information."[2]
Beyond Social: Read/Write in The Era of Internet of Things

This blog was founded in 2003 on the philosophy of a read/write Web - a Web in which people can create content as easily as they consume it. This trend eventually came to be known as Web 2.0 - although others preferred Social Web - and was popularized by activities like blogging and social networking. It would be easy to say that the 'social' element is still the primary part of today's Web, since the popular products of this era enable you to say what's on your mind (Facebook), what's happening (Twitter), or where you are (Foursquare).

3.0 Semantic Web

The Semantic Web is a collaborative movement led by international standards body the World Wide Web Consortium (W3C).[1] The standard promotes common data formats on the World Wide Web. By encouraging the inclusion of semantic content in web pages, the Semantic Web aims at converting the current web, dominated by unstructured and semi-structured documents, into a "web of data". The Semantic Web stack builds on the W3C's Resource Description Framework (RDF).[2]
Forecast 2020: Web 3.0+ and Collective Intelligence « simple processes

“We know what we are, but we know not what we may become” – Shakespeare

The ancient Chinese curse or saying, “May you live in interesting times,” is upon us. We are in the midst of a new revolution fueled by advancements in the Internet and technology. Currently, there is an abundance of information, and social interaction has reached a colossal scale. Within a span of just one generation, the availability of information and our access to it have changed dramatically from scarcity to surplus. What humans will do or try to do with such a powerful surplus of information will be the main topic of this article.