Apache UIMA - Apache UIMA


Apache Stanbol - Welcome to Apache Stanbol!

Knowledge from Information by Matthias Broecheler

index

maui-indexer - Maui - Multi-purpose automatic topic indexing

Summary
Maui automatically identifies main topics in text documents. Depending on the task, topics are tags, keywords, keyphrases, vocabulary terms, descriptors, index terms or titles of Wikipedia articles. Maui performs the following tasks: term assignment with a controlled vocabulary (or thesaurus), subject indexing, topic indexing with terms from Wikipedia, keyphrase extraction, terminology extraction, and automatic tagging. It can also be used for terminology extraction and semi-automatic topic indexing.

Domain and language independence
Maui has been successfully tested on computer science, agricultural, medicine, physics, biology, and bioinformatics documents, as well as on blog posts and news articles. Examples are provided in Maui's Wiki pages.

Background
Maui has been developed by Olena Medelyan as part of her PhD project, under supervision of Ian H.

Another Word For It

The Stanford NLP (Natural Language Processing) Group
A Suite of Core NLP Tools

About
Stanford CoreNLP provides a set of natural language analysis tools which can take raw text input and give the base forms of words, their parts of speech, and whether they are names of companies, people, etc.; normalize dates, times, and numeric quantities; mark up the structure of sentences in terms of phrases and word dependencies; indicate which noun phrases refer to the same entities; indicate sentiment; etc. Stanford CoreNLP is an integrated framework. Its goal is to make it very easy to apply a bunch of linguistic analysis tools to a piece of text.

Citing Stanford CoreNLP
If you're just running the CoreNLP pipeline, please cite this CoreNLP demo paper.

Download
Download Stanford CoreNLP version 3.5.2.
GitHub: Here is the Stanford CoreNLP GitHub site.

Kea

1. Documents - Kea gets a directory name and processes all documents in this directory that have the extension ".txt". The default language and encoding are set to English, but this can be changed as long as a corresponding stopword file and a stemmer are provided.
4. TFxIDF is a measure describing the specificity of a term for the document under consideration, compared to all other documents in the corpus.

Computational Creativity

Apache Mahout: Scalable machine learning and data mining

NERD: Named Entity Recognition and Disambiguation

This version: 2012-11-07 - v0.5 [ n3 ]
History: 2011-10-04 - v0.4 [ n3 ], 2011-08-31 - v0.3 [ n3 ], 2011-08-01 - v0.2 [ n3 ], 2011-06-24 - v0.1 [ n3 ]
Authors: Giuseppe Rizzo, Raphaël Troncy
Copyright © 2011-2012 Giuseppe Rizzo and Raphaël Troncy. This work is licensed under a Creative Commons Attribution License.

Abstract
The NERD ontology is a set of mappings established manually between the taxonomies of named entity types.

Status of this Document
The NERD ontology has been evolving gradually since its creation in 2011.

Table of Contents
1. An alphabetical index of NERD terms, by class and by property (relationships, attributes), is given below.
2. The NERD ontology is composed of two building blocks: the NERD core and the NERD inferred axioms. The core classes host bi-directional references to the classes defined by the vocabularies of the external extractors. The inferred NERD axioms are generated from the long tail of the most frequent classes of the extractors involved in the NERD project.
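
The TFxIDF measure mentioned in the Kea description above can be illustrated with a small sketch. This is one common formulation (term frequency within the document multiplied by the negative log of the fraction of corpus documents containing the term); the function name `tf_idf` and the toy corpus are hypothetical, and Kea's own implementation may weight or smooth the score differently.

```python
import math
from collections import Counter


def tf_idf(term, doc_tokens, corpus):
    """Score how specific `term` is to one document.

    `doc_tokens` is a list of tokens for the document under consideration;
    `corpus` is a list of token lists, one per document (assumed layout).
    """
    if not doc_tokens:
        return 0.0
    # Term frequency: share of the document's tokens that are `term`.
    tf = Counter(doc_tokens)[term] / len(doc_tokens)
    # Document frequency: number of corpus documents containing `term`.
    df = sum(1 for tokens in corpus if term in tokens)
    if df == 0:
        return 0.0
    # Inverse document frequency: rarer terms receive a larger weight.
    idf = -math.log2(df / len(corpus))
    return tf * idf


# Toy corpus, invented purely for illustration.
corpus = [
    ["topic", "indexing", "with", "keyphrases"],
    ["keyphrases", "for", "news", "articles"],
    ["stemmer", "and", "stopword", "file"],
    ["topic", "models"],
]
doc = corpus[0]
scores = {term: tf_idf(term, doc, corpus) for term in set(doc)}
```

In this toy corpus, a term that occurs only in the first document ("indexing") scores higher than one shared with another document ("topic"), which is the specificity that the measure is meant to capture.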

Alexandre Bouchard-Côté

General
Email: bouchard AT stat.ubc.ca
Assistant Professor in the Department of Statistics at UBC
Path: McGill -> UCB -> UBC.
AKA: Alex, Bouchard, or 卜利森. See also: how to typeset my last name.
Office: ESB, Room 3124
Resumé (last updated: Nov. '13)

Research Interests
My main field of research is statistical machine learning. On the methodology side, I am interested in Monte Carlo methods such as SMC and MCMC, graphical models, non-parametric Bayesian statistics, randomized algorithms, and variational inference. My favorite applications, both in linguistics and biology, are related to phylogenetics in one way or another. In the past, I also did some work on machine translation, on logical characterization and approximation of labeled Markov processes, and on reinforcement learning.

Refereed Publications
Alexandre Bouchard-Côté. (2014) Sequential Monte Carlo (SMC) for Bayesian phylogenetics.