
The START Natural Language Question Answering System

START, the world's first Web-based question answering system, has been online and continuously operating since December 1993. It has been developed by Boris Katz and his associates in the InfoLab Group at the MIT Computer Science and Artificial Intelligence Laboratory. Unlike information retrieval systems (e.g., search engines), START aims to supply users with "just the right information," instead of merely providing a list of hits. Currently, the system can answer millions of English questions about places (e.g., cities, countries, lakes, coordinates, weather, maps, demographics, political and economic systems), movies (e.g., titles, actors, directors), people (e.g., birth dates, biographies), dictionary definitions, and much more.

http://start.csail.mit.edu/

ECML/PKDD'02 Tutorial on Text Mining and Internet Content Filtering. José María Gómez Hidalgo, Departamento de Inteligencia Artificial, Universidad Europea de Madrid. In recent years, we have witnessed impressive growth in the availability of information in electronic format, mostly in the form of text, due to the Internet and the increasing number and size of digital and corporate libraries. This overwhelming amount of text is hard for an average human being to consume, which leads to an information overload problem.

American English Pronunciation Lesson: Wh-question Pitch Boundaries. Introduction to wh-questions: A wh-question begins with the words who, what, why, when, where, and how. These types of questions seek information and cannot be answered with "yes" or "no." Wh-questions can end with a rising or falling pitch boundary, depending on whether the speaker is truly asking a question, or is masking a suggestion as a question. Rising pitch boundary in wh-questions: When the speaker holds no assumption as to what the answer will be, and the topic is new, the wh-question is likely to have a rising pitch boundary.

Wolfram

Natural language processing: Natural language processing (NLP) is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages. As such, NLP is related to the area of human–computer interaction. Many challenges in NLP involve natural language understanding, that is, enabling computers to derive meaning from human or natural language input, and others involve natural language generation.

Dictionary of American Regional English: "Arthur the Rat". "Arthur the Rat" is a short tale devised to obtain phonetic representation from throughout the country of all phonemes in American English.

MontyLingua: a free, commonsense-enriched natural language understander. Recent bugfixes: Version 2.1 (6 Aug 2004) includes a new MontyNLGenerator component that generates sentences and summaries; Version 2.0.1 fixes an API bug in version 2.0 that prevented the Java API from being callable. What is MontyLingua? MontyLingua is a free*, commonsense-enriched, end-to-end natural language understander for English.

ConceptNet. What is ConceptNet? ConceptNet is a freely available commonsense knowledgebase and natural-language-processing toolkit which supports many practical textual-reasoning tasks over real-world documents out of the box (without additional statistical training), including topic-gisting (e.g. a news article containing the concepts "gun," "convenience store," "demand money" and "make getaway" might suggest the topics "robbery" and "crime"), affect-sensing (e.g. this email is sad and angry), analogy-making (e.g. "scissors," "razor," "nail clipper," and "sword" are perhaps like a "knife" because they are all "sharp," and can be used to "cut something"), text summarization, contextual expansion, causal projection, cold document classification, and other context-oriented inferences. The ConceptNet knowledgebase is a semantic network presently available in two versions: concise (200,000 assertions) and full (1.6 million assertions).

The `Bow' Toolkit. Bow (or libbow) is a library of C code useful for writing statistical text analysis, language modeling and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document retrieval (arrow) and document clustering (crossbow). The library and its front-ends were designed and written by Andrew McCallum, with some contributions from several graduate and undergraduate students. The name of the library rhymes with `low', not `cow'.

Maximum Entropy Modeling Using SharpEntropy. This article presents a maximum entropy modeling library called SharpEntropy, and discusses its usage, first by way of a simple example of predicting outcomes, and secondly, by presenting a way of splitting English sentences into constituent tokens (useful for natural language processing). Please note that because most of the code is a conversion based on original Java libraries published under the LGPL license, the source code available for download with this article is also released under the LGPL license. This means it can freely be used in software that is released under any sort of license, but if you make changes to the library itself and those changes are not for your private use, you must release the source code to those changes.
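To make the idea of "predicting outcomes" with a maximum entropy model more concrete, here is a minimal sketch in plain Python. It does not use SharpEntropy or reproduce its API; the function names (`train_maxent`, `predict`), the toy weather data, and the choice of simple gradient ascent for training are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def predict(weights, outcomes, features):
    """Return P(outcome | features) under a maximum entropy (log-linear) model."""
    scores = {o: sum(weights.get((f, o), 0.0) for f in features) for o in outcomes}
    max_score = max(scores.values())  # subtract the max for numerical stability
    exps = {o: math.exp(s - max_score) for o, s in scores.items()}
    total = sum(exps.values())
    return {o: e / total for o, e in exps.items()}


def train_maxent(events, iterations=200, learning_rate=0.5):
    """Fit one weight per (feature, outcome) pair by gradient ascent on the log-likelihood.

    events is a list of (features, outcome) pairs. This is a hypothetical helper,
    not part of any library.
    """
    outcomes = sorted({outcome for _, outcome in events})
    weights = defaultdict(float)
    for _ in range(iterations):
        gradient = defaultdict(float)
        for features, observed in events:
            probs = predict(weights, outcomes, features)
            for outcome in outcomes:
                # Gradient term: observed count minus the model's expected count.
                delta = (1.0 if outcome == observed else 0.0) - probs[outcome]
                for feature in features:
                    gradient[(feature, outcome)] += delta
        for key, value in gradient.items():
            weights[key] += learning_rate * value / len(events)
    return weights, outcomes


# Toy training data (invented for this sketch): context features and an outcome.
events = [
    (["sunny", "warm"], "outdoor"),
    (["sunny", "cool"], "outdoor"),
    (["rainy", "warm"], "indoor"),
    (["rainy", "cool"], "indoor"),
]

weights, outcomes = train_maxent(events)
probs = predict(weights, outcomes, ["sunny", "warm"])
print(probs)
```

In this toy data only the "sunny"/"rainy" feature separates the outcomes, so the model should assign most of the probability mass to "outdoor" for a sunny context. Production libraries typically use more robust training procedures than the plain gradient ascent shown here.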

Brill POS Tagger for Win32 (Paul Maddox)

LDC - Linguistic Data Consortium - Current Projects
