Apache OpenNLP - Welcome to Apache OpenNLP

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron based machine learning.

http://opennlp.apache.org/

Related: Concept extraction, Outils linguistiques (linguistic tools)

Language Computer - Cicero On-Demand API
Cicero On-Demand provides a RESTful interface that wraps LCC's CiceroLite and other NLP components. The same API is used whether the server is the one hosted at LCC or is run locally on your machine. You can access a free, rate-limited version online at demo.languagecomputer.com. For more information on service plans, contact support. The REST calls are valid for both the hosted and local modes.
Checking the server status

SONDY - University of Lyon 2
SONDY (SOcial Network DYnamics) is open source software written in Java for collecting, analyzing, and mining data generated by social media. Its main focus is on event detection and influence analysis. The goal of the software is to identify events to which people react and the most influential people within those events.

Topic Modeling Toolbox
The first step in using the Topic Modeling Toolbox on a data file (CSV or TSV, e.g. as exported by Excel) is to tell the toolbox where to find the text in the file. This section describes how the toolbox converts a column of text from a file into a sequence of words. The process of extracting and preparing text from a CSV file can be thought of as a pipeline, where a raw CSV file goes through a series of stages that ultimately result in something that can be used to train the topic model.
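
The pipeline idea described above can be illustrated with a minimal Python sketch. This is not the Topic Modeling Toolbox's own API: the function name `column_to_words`, the zero-based column index, and the simple lowercase-and-split tokenizer are all assumptions chosen for illustration.

```python
import csv
import io
import re


def column_to_words(csv_text, column, delimiter=","):
    """Turn one column of a CSV/TSV document into lists of words.

    Hypothetical helper, not part of any real toolbox. The stages are:
    1. parse the raw text into rows,
    2. pick the cell that holds the text,
    3. lowercase it and split it into word tokens.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    documents = []
    for row in reader:
        if column >= len(row):
            continue  # skip rows that lack the requested column
        text = row[column].lower()
        documents.append(re.findall(r"\w+", text))
    return documents


if __name__ == "__main__":
    raw = "1,Hello World\n2,Topic models!\n"
    print(column_to_words(raw, column=1))
```

For a TSV file, the same sketch would be called with `delimiter="\t"`.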

Metaweb video
From Freebase. On July 16th 2010, when Metaweb announced their acquisition by Google, they also launched a video that explains what Metaweb/Freebase does, what entities are, etc. You know what drives me crazy about words?

The Stanford NLP (Natural Language Processing) Group
Stanford NER is a Java implementation of a Named Entity Recognizer.

French stemming algorithm
Letters in French include the following accented forms: â à ç ë é ê è ï î ô û ù. The following letters are vowels: a e i o u y â à ë é ê è ï î ô û ù. Assume the word is in lower case. Then put into upper case u or i preceded and followed by a vowel, and y preceded or followed by a vowel. u after q is also put into upper case.
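
The upper-casing rule described above can be sketched in Python as follows. This covers only the marking step quoted here, not a full stemmer. The function name `mark_non_vowels` is hypothetical, and the sketch assumes that neighbouring letters are checked against the original lower-case word and that the rules are tried in the order the text lists them.

```python
# Vowel set exactly as listed in the text above.
VOWELS = set("aeiouyâàëéêèïîôûù")


def mark_non_vowels(word):
    """Upper-case u, i and y where the text says to treat them specially.

    Assumption: neighbours are tested against the original lower-case
    letters, not against letters already upper-cased by this function.
    """
    chars = list(word.lower())
    marked = chars[:]
    for i, ch in enumerate(chars):
        prev_is_vowel = i > 0 and chars[i - 1] in VOWELS
        next_is_vowel = i + 1 < len(chars) and chars[i + 1] in VOWELS
        if ch in "ui" and prev_is_vowel and next_is_vowel:
            marked[i] = ch.upper()   # u or i between two vowels
        elif ch == "y" and (prev_is_vowel or next_is_vowel):
            marked[i] = "Y"          # y next to at least one vowel
        elif ch == "u" and i > 0 and chars[i - 1] == "q":
            marked[i] = "U"          # u after q
    return "".join(marked)


if __name__ == "__main__":
    for w in ("jouer", "ennuie", "yeux", "quand"):
        print(w, "->", mark_non_vowels(w))
```

Marking these letters in upper case lets later steps treat them as consonants, since the vowel set contains only lower-case letters.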

Text Analysis API
The Saplo API lets you build applications on top of our text analysis technology platform. Take a look at our Text Analysis API documentation. Through the Saplo API it is possible to automatically extract entities found in text. This service can automatically define the meaning of words and identify each tag as a company, person or location. Support is implemented for English and Swedish texts, though new languages can be added on demand.

WordNet
WordNet is a lexical database for the English language.[1] It groups English words into sets of synonyms called synsets, provides short, general definitions, and records the various semantic relations between these synonym sets. The purpose is twofold: to produce a combination of dictionary and thesaurus that is more intuitively usable, and to support automatic text analysis and artificial intelligence applications. The database and software tools have been released under a BSD-style license and can be downloaded and used freely. The database can also be browsed online.

Introducing fise, the Open Source RESTful Semantic Engine
Edit: fise is now known as the Stanbol Enhancer component of the Apache Stanbol incubating project. As a member of the IKS European project, Nuxeo contributes to the development of an open source software project named fise, whose goal is to help bring new and trendy semantic features to CMS by giving developers a stack of reusable HTTP semantic services to build upon. As such concepts might be new to some readers, the first part of this blog post is presented as a Q&A. What is a Semantic Engine?

For Academics - Sentiment140 - A Twitter Sentiment Analysis Tool
Is the code open source? Unfortunately the code isn't open source. There are a few tutorials with open source code that have similar implementations to ours.

Format
The data file format has 6 fields:
0 - the polarity of the tweet (0 = negative, 2 = neutral, 4 = positive)
1 - the id of the tweet (2087)
2 - the date of the tweet (Sat May 16 23:58:44 UTC 2009)
3 - the query (lyx). If there is no query, then this value is NO_QUERY.
4 - the user that tweeted (robotickilldozr)
5 - the text of the tweet (Lyx is cool)
If you use this data, please cite Sentiment140 as your source.
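
The six-field format above could be read with a short Python sketch like the one below. It assumes the file is comma-separated with quoted fields, which the description does not state. The `Tweet` type, the `parse_rows` function, and the sample row (built from the example values in the field list, with polarity 4 picked arbitrarily) are hypothetical.

```python
import csv
import io
from typing import NamedTuple


class Tweet(NamedTuple):
    polarity: int   # 0 = negative, 2 = neutral, 4 = positive
    tweet_id: str
    date: str
    query: str      # "NO_QUERY" when there is no query
    user: str
    text: str


POLARITY_LABELS = {0: "negative", 2: "neutral", 4: "positive"}

# Illustrative row only: the polarity value 4 is an assumption.
SAMPLE = (
    '"4","2087","Sat May 16 23:58:44 UTC 2009",'
    '"lyx","robotickilldozr","Lyx is cool"\n'
)


def parse_rows(csv_text):
    """Parse comma-separated rows into Tweet records, skipping malformed rows."""
    tweets = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 6:
            continue  # not the six-field layout described above
        polarity, tweet_id, date, query, user, text = row
        tweets.append(Tweet(int(polarity), tweet_id, date, query, user, text))
    return tweets


if __name__ == "__main__":
    for tweet in parse_rows(SAMPLE):
        print(tweet.user, POLARITY_LABELS[tweet.polarity], tweet.text)
```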

An Gramadóir
Kevin P. Scannell
This is the home page for An Gramadóir, an open source grammar checking engine. It is intended as a platform for the development of sophisticated natural language processing tools for languages with limited computational resources. It is currently implemented for the Irish language (Gaeilge); this is, to the best of my knowledge, the first grammar checker developed for any minority language.

Natural language processing
Natural language processing (NLP) is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages. As such, NLP is related to the area of human–computer interaction. Many challenges in NLP involve natural language understanding, that is, enabling computers to derive meaning from human or natural language input, and others involve natural language generation.
History
The history of NLP generally starts in the 1950s, although work can be found from earlier periods. In 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence.

Related:  AI