
machine-learning
hadoop
hashing
ai
ica
SOM tutorial part 1
Home Page of Thorsten Joachims
· Special Issue on Learning to Rank for IR, Information Retrieval Journal, Hang Li, Tie-Yan Liu, Cheng Xiang Zhai, T. Joachims, Springer, 2009.
· Special Issue on Automated Text Categorization, Journal on Intelligent Information Systems, T. Joachims and F. Sebastiani, Kluwer, Vol. 2, 2002.
· Redundancy, Diversity, and Interdependent Document Relevance (IDR), P.
Ashutosh Saxena - Assistant Professor - Cornell - Computer Scien
In statistics, latent Dirichlet allocation (LDA) is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. For example, if observations are words collected into documents, it posits that each document is a mixture of a small number of topics and that each word's creation is attributable to one of the document's topics. LDA is an example of a topic model and was first presented as a graphical model for topic discovery by David Blei, Andrew Ng, and Michael Jordan in 2002. [1]

Topics in LDA

In LDA, each document may be viewed as a mixture of various topics. This is similar to probabilistic latent semantic analysis (pLSA), except that in LDA the topic distribution is assumed to have a Dirichlet prior.
Latent Dirichlet allocation - Wikipedia, the free encyclopedia
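The generative story in the LDA snippet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-topic table `TOPIC_WORD`, the `alpha` values, the document length, and the helper names are all made up for the example, and the per-topic word distributions are fixed by hand rather than learned or drawn from a prior.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topic_word, alpha, length, rng):
    """Generate one document: a topic mixture, then a topic and a word per position."""
    theta = sample_dirichlet(alpha, rng)  # per-document topic mixture (Dirichlet prior)
    vocab_size = len(topic_word[0])
    words = []
    for _ in range(length):
        # Pick the topic responsible for this word, then pick the word from that topic.
        z = rng.choices(range(len(theta)), weights=theta)[0]
        w = rng.choices(range(vocab_size), weights=topic_word[z])[0]
        words.append((z, w))
    return theta, words

# Hypothetical toy model: topic 0 only emits words 0 and 1, topic 1 only words 2 and 3.
TOPIC_WORD = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]

rng = random.Random(0)
theta, doc = generate_document(TOPIC_WORD, alpha=[0.5, 0.5], length=20, rng=rng)
print(theta)
print(doc)
```

Each entry of `doc` is a (topic, word) pair, so every word can be traced to the single topic that produced it, while `theta` records how strongly the document leans toward each topic.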
CRF Project Page
The CRF package is a Java implementation of Conditional Random Fields for sequential labeling developed by Sunita Sarawagi of IIT Bombay. The package is distributed with the hope that it will be useful for researchers working in information extraction or related areas. We have attempted to keep the core CRF package compact and barebones for ease of deployment. However, we have packaged additional supporting classes for generating features, managing model structure and dictionary of words in the training data.
ls | About
Motivation: With the exceptional increase in computing power, storage capacity and network bandwidth of the past decades, ever growing datasets are collected in fields such as bioinformatics (Splice Sites, Gene Boundaries, etc.), IT-security (Network traffic) or Text-Classification (Spam vs. Non-Spam), to name but a few.
The International Machine Learning Society is a non-profit organisation whose main aim is to foster machine learning research and whose main activity is the coordination of the annual International Conference on Machine Learning (ICML).
machinelearning.org - Home
Popular Ensemble Methods: An Empirical Study
My research is in machine learning and statistics, with basic research on theory, methods, and algorithms. Areas of focus include nonparametric methods, sparsity, the analysis of high-dimensional data, graphical models, information theory, and applications in language processing, computer vision, and information retrieval. Perspectives on several research topics in statistical machine learning appeared in this Statistica Sinica commentary. Estimating a high dimensional regression function is notoriously difficult, due to the curse of dimensionality; this curse can be characterized rigorously using minimax theory. We developed a new method for simultaneously performing bandwidth selection and variable selection in nonparametric regression that can beat the curse of dimensionality when the underlying function is sparse.
John Lafferty
Active Learning with Statistical Models
Amos Storkey - Research - Belief Networks
As an introduction to the research I have been doing using and developing belief network approaches, I thought it might be useful to provide a basic introduction to what I am talking about. If you find this tutorial useful then please put a link to it on your home page. To switch to a printable form of this document hit here. Reload to return to the original form.
Henry Rowleys Home Page
I graduated from the VASC group in the Computer Science Department at Carnegie Mellon University. My advisor was Dr. Takeo Kanade. My research interests are in computer vision and pattern recognition and how machine learning can be applied to such problems.
The Non-linearity and Complexity Research Group has high international visibility in the areas of pattern analysis, probabilistic methods, non-linear dynamics and the application of methods from statistical physics to the analysis of complex systems. The underpinning methodology used includes principled approaches from probabilistic modelling, Bayesian statistics, statistical mechanics and non-linear stochastic and deterministic differential equations. Particularly significant application domains include Biomedical Information Engineering and Signal Processing, Health Informatics, Environmental Modelling and Weather Forecasting, Error-correcting Codes and Multi-user Communication, Complex Systems and Networks, Solitons and Optical Fibers, and Chaos and Turbulence.
Neural Computing Research Group: The GTM H
The term "Pareto principle" can also refer to Pareto efficiency. The Pareto principle (also known as the 80–20 rule, the law of the vital few, and the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes. [1][2] Business-management consultant Joseph M. Juran suggested the principle and named it after Italian economist Vilfredo Pareto, who observed in 1906 that 80% of the land in Italy was owned by 20% of the population; he developed the principle by observing that 20% of the pea pods in his garden contained 80% of the peas. [2] It is a common rule of thumb in business; e.g., "80% of your sales come from 20% of your clients".
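As a rough illustration of the 80–20 rule of thumb, the share of a total held by the top 20% of contributors can be computed directly. The helper `top_share` and the `sales` figures below are invented for this sketch and do not come from any real dataset.

```python
def top_share(values, top_fraction=0.2):
    """Return the fraction of the total contributed by the largest top_fraction of values."""
    ordered = sorted(values, reverse=True)
    k = max(1, round(len(ordered) * top_fraction))  # number of top contributors
    return sum(ordered[:k]) / sum(ordered)

# Hypothetical sales per client, chosen so the top two of ten clients dominate.
sales = [400, 400, 50, 50, 25, 25, 20, 15, 10, 5]
share = top_share(sales)
print(share)  # 0.8 for this made-up data
```

Real data rarely splits at exactly 80/20, so a check like this is best read as a measure of how concentrated the total is, not as a test of the principle.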

