
Natural Language Processing


School of Engineering - Stanford Engineering Everywhere.

WhereNext: a Location Predictor on Trajectory Pattern Mining. The pervasiveness of mobile devices and location-based services is leading to an increasing volume of mobility data.


This side effect provides the opportunity for innovative methods that analyze the behaviors of movements.

Treeler: Open-source Structured Prediction for NLP.

Syntactic Approaches for Natural Language Processing.

Probabilistic Non-Linear Principal Component Analysis with Gaussian Process Latent Variables. It is known that Principal Component Analysis has an underlying probabilistic representation based on a latent variable model.


Principal component analysis (PCA) is recovered when the latent variables are integrated out and the parameters of the model are optimised by maximum likelihood. It is less well known that the dual approach of integrating out the parameters and optimising with respect to the latent variables also leads to PCA. The marginalised likelihood in this case takes the form of Gaussian process mappings, with linear covariance functions, from a latent space to an observed space, which we refer to as a Gaussian Process Latent Variable Model (GPLVM).
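The two marginalisations described in this abstract can be written out side by side. The block below is a sketch in assumed notation (the symbols Y, X, W, beta, N, D and q are not given in the abstract): a linear-Gaussian model, the usual likelihood obtained by integrating out the latent variables, and the dual likelihood obtained by integrating out the mapping parameters, each under a unit Gaussian prior.

```latex
% Assumed notation: Y is N x D (centred data), X is N x q (latent points),
% W is D x q (linear mapping), beta is the noise precision.
\begin{align}
  \mathbf{y}_n &= \mathbf{W}\mathbf{x}_n + \boldsymbol{\epsilon}_n,
  \qquad \boldsymbol{\epsilon}_n \sim \mathcal{N}(\mathbf{0}, \beta^{-1}\mathbf{I}) \\
  % Probabilistic PCA: integrate out X with x_n ~ N(0, I), then optimise W
  p(\mathbf{Y} \mid \mathbf{W}, \beta) &= \prod_{n=1}^{N}
    \mathcal{N}\!\left(\mathbf{y}_{n} \mid \mathbf{0},\,
    \mathbf{W}\mathbf{W}^{\top} + \beta^{-1}\mathbf{I}_D\right) \\
  % Dual form: integrate out W with rows w_d ~ N(0, I), then optimise X
  p(\mathbf{Y} \mid \mathbf{X}, \beta) &= \prod_{d=1}^{D}
    \mathcal{N}\!\left(\mathbf{y}_{:,d} \mid \mathbf{0},\,
    \mathbf{X}\mathbf{X}^{\top} + \beta^{-1}\mathbf{I}_N\right)
\end{align}
```

Each factor in the last line is a Gaussian process over the N data points whose covariance matrix is built from the inner products of the latent points, which is the sense in which the abstract speaks of Gaussian process mappings with linear covariance functions.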

This dual probabilistic PCA is still a linear latent variable model, but by looking beyond the inner product kernel as a covariance function we can develop a non-linear probabilistic PCA.

POWERSET - Natural Language and the Semantic Web.

Panel: Priors, Deep Architectures and NLP: YOU ARE DOING EVERYTHING WRONG!!

Only Connect! Two Minor Explorations in Using Graphs for IF and NLP.

NLP Interchange Format (NIF).

Machine Learning for Natural Languages Processing.

Machine Learning Applications / Challenges in Natural Language Parsing.

Linked Data in Linguistics for NLP and Web Annotation. This presentation introduces three major data pools that have recently been made freely available as Linked Data by a collaborative community process: (1) the DBpedia Internationalization committee is concerned with the extraction of RDF from the language-specific Wikipedia editions; (2) the creation of a configurable extractor based on DBpedia and able to extract information from all languages of Wiktionary with manageable effort; (3) the Working Group for Open Linguistic Data, an Open Knowledge Foundation group with the goal of converting Open Linguistics data sets to RDF and interlinking them.


The presentation highlights and stresses the role of Open Licences and RDF for the sustenance of such pools. It also provides a short update on the recent progress of NIF (Natural Language Processing Interchange Format) by the LOD2-EU project. The transcript of the Q&A session "Linking Resources" is available here.

Combining Shallow and Deep NLP Methods for Recognizing Textual Entailment.

Approximate Inference in Natural Language Processing. I'll start out by presenting an idealized version of the natural language processing problem of parsing.


I will brazenly suggest that most of NLP is reducible to variations on parsing problems. I'll show how dynamic programming solves the idealized version of the problem, both for calculating modes and marginals over parse trees, exploiting some key independence assumptions about the structure of natural language sentences. I will then discuss two approximate inference methods that let us build more powerful models of parsing. Neither comes with strong theoretical guarantees, but both are demonstrated to perform strongly in experiments on real NLP data. The first method builds on the dynamic programming representation, combining max-product and sum-product methods to produce, approximately, the k-best parses and a residual sum over the rest of the parses, useful when incorporating features that violate the usual independence assumptions.

Toward Large-Scale Shallow Semantics for Higher-Quality NLP.
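The dynamic programming step in the approximate-inference abstract above can be illustrated with a small CKY chart. This is a minimal sketch, not the speaker's implementation: the toy grammar, its probabilities, and the function name `cky` are all hypothetical, and the grammar is assumed to be in Chomsky normal form. The same chart recursion gives the score of the best parse (max-product) or the total probability of all parses (sum-product), depending on how competing analyses of a span are aggregated.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (illustration only).
BINARY = {  # (parent, left child, right child) -> rule probability
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 0.7,
    ("VP", "VP", "PP"): 0.3,
    ("NP", "NP", "PP"): 0.2,
    ("PP", "P", "NP"): 1.0,
}
LEXICAL = {  # (label, word) -> rule probability
    ("NP", "she"): 0.3,
    ("NP", "stars"): 0.3,
    ("NP", "telescopes"): 0.2,
    ("V", "saw"): 1.0,
    ("P", "with"): 1.0,
}


def cky(words, aggregate):
    """Fill a CKY chart and return the score of an S spanning the sentence.

    aggregate(old, new) decides how alternative derivations of the same
    labelled span are combined: max gives the best-parse score, addition
    gives the total probability summed over all parses.
    """
    n = len(words)
    chart = defaultdict(float)  # (start, end, label) -> score
    # Width-1 spans come straight from the lexical rules.
    for i, word in enumerate(words):
        for (label, rule_word), prob in LEXICAL.items():
            if rule_word == word:
                chart[i, i + 1, label] = prob
    # Wider spans are built from two adjacent, already-scored sub-spans.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (parent, left, right), prob in BINARY.items():
                    score = prob * chart[i, k, left] * chart[k, j, right]
                    if score > 0.0:
                        chart[i, j, parent] = aggregate(chart[i, j, parent], score)
    return chart[0, n, "S"]


words = "she saw stars with telescopes".split()
best = cky(words, max)                   # max-product: best parse score
total = cky(words, lambda a, b: a + b)   # sum-product: mass of all parses
print(best, total, best / total)
```

The sentence has two parses under this toy grammar (the prepositional phrase attaches to the verb phrase or to the noun phrase), and the ratio of the two results is the posterior probability of the better one. Recovering the tree itself, or per-span marginals, would need back-pointers or an outside pass, which the sketch leaves out.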

Building on the successes of the past decade’s work on statistical methods, there are signs that continued quality improvement for QA, summarization, information extraction, and possibly even machine translation requires more-elaborate and possibly even (shallow) semantic representations of text meaning.


But how can one define a large-scale shallow semantic representation system and contents adequate for NLP applications, and how can one create the corpus of shallow semantic representation structures that would be required to train machine learning algorithms? This talk addresses the components required (including a symbol definition ontology and a corpus of (shallow) meaning representations) and the resources and methods one needs to build them (including existing ontologies, human annotation procedures, and a verification methodology).

Some thoughts on prior knowledge, deep architectures and NLP.

Incorporating Prior Knowledge into NLP with Markov Logic.

Natural language processing.

Structured Prediction for Natural Language Processing. This tutorial will discuss the use of structured prediction methods from machine learning in natural language processing.


The field of NLP has, in the past two decades, come to simultaneously rely on and challenge the field of machine learning. Statistical methods now dominate NLP, and have moved the field forward substantially, opening up new possibilities for the exploitation of data in developing NLP components and applications.

NLP at Google.

Deep Learning in Natural Language Processing. This tutorial will describe recent advances in deep learning techniques for Natural Language Processing (NLP).


Traditional NLP approaches favour shallow systems, possibly cascaded, with adequate hand-crafted features. In contrast, we are interested in end-to-end architectures: these systems include several feature layers, with increasing abstraction at each layer. Compared to shallow systems, these feature layers are learnt for the task of interest, and do not require any engineering.

Structured Prediction Problems in Natural Language Processing.

Unsupervised Learning for Natural Language Processing. Given the abundance of text data, unsupervised approaches are very appealing for natural language processing.


We present three latent variable systems which achieve state-of-the-art results in domains previously dominated by fully supervised systems. For syntactic parsing, we describe a grammar induction technique which begins with coarse syntactic structures and iteratively refines them in an unsupervised fashion. The resulting coarse-to-fine grammars admit efficient coarse-to-fine inference schemes and have produced the best parsing results in a variety of languages.

Statistical NLP / corpus-based computational linguistics resources. Contents: Tools (Machine Translation, POS Taggers, NP chunking, Sequence models, Parsers, Semantic Parsers/SRL, NER, Coreference, Language models, Concordances, Summarization, Other); Corpora (Large collections, Particular languages, Treebanks, Discourse, WSD, Literature, Acquisition); Dictionaries; Lexical/morphological resources.


Natural Language Processing. This is a book about Natural Language Processing. By natural language we mean a language that is used for everyday communication by humans; languages like English, Hindi or Portuguese. In contrast to artificial languages such as programming languages and mathematical notations, natural languages have evolved as they pass from generation to generation, and are hard to pin down with explicit rules.

We will take Natural Language Processing (or NLP for short) in a wide sense to cover any kind of computer manipulation of natural language. At one extreme, it could be as simple as counting the number of times the letter t occurs in a paragraph of text. At the other extreme, NLP involves "understanding" complete human utterances, at least to the extent of being able to give useful responses to them.

The Dashboard - NLP: Everyday, Analytical & Unusual Uses.

Michael Collins.
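The simple extreme mentioned in the book excerpt above, counting how often the letter t occurs in a paragraph, fits in a few lines of Python. This is only an illustrative sketch: the function name `count_letter` is hypothetical, and treating upper and lower case as the same letter is an assumption the excerpt does not state.

```python
def count_letter(text, letter="t"):
    """Count occurrences of a single letter in text, ignoring case (an assumption)."""
    return text.lower().count(letter.lower())


paragraph = "The cat sat on the mat."
print(count_letter(paragraph))
```

Everything beyond this kind of surface counting, such as tagging, parsing, or giving a useful response to an utterance, needs the linguistic models that the other resources in this collection describe.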