machine-learning

hadoop

hashing

ai

ica

http://www.ai-junkie.com/ann/som/som1.html This tutorial is the first of two related to self-organising feature maps. Initially, this was just going to be one big comprehensive tutorial, but work demands and other time constraints have forced me to divide it into two. Nevertheless, part one should provide you with a pretty good introduction. Certainly more than enough to whet your appetite anyway! I will appreciate any feedback you are willing to give, good or bad.

SOM tutorial part 1

http://www.cs.cornell.edu/People/tj/

Home Page of Thorsten Joachims

· Special Issue on Learning to Rank for IR, Information Retrieval Journal, Hang Li, Tie-Yan Liu, Cheng Xiang Zhai, T. Joachims, Springer, 2009.
· Special Issue on Automated Text Categorization, Journal on Intelligent Information Systems, T. Joachims and F. Sebastiani, Kluwer, Vol. 2, 2002.
· Redundancy, Diversity, and Interdependent Document Relevance (IDR), P.

Ashutosh Saxena - Assistant Professor - Cornell - Computer Scien

http://www.cs.cornell.edu/%7Easaxena/ Jiang/Zheng/Lim, Labutov/Yosinski, Yang/Low/Cong, Ly, and Anand/Koppula to present their respective works at the RSS workshops in Los Angeles. Congcong Li, TP Wong and Norris Xu build a "shoe finder robot" in a day using FeCCMs, featured in ICRA 2011 videos. Also mentioned in New Scientist. Congcong Li and Adarsh Kowdle's paper on Holistic Scene Understanding was published in NIPS 2010 and ECCV workshop on parts and attributes. Make3D and Grasping featured in Nilsson's book "The Quest for Artificial Intelligence", which promises to be a definitive history of the field.
In statistics, latent Dirichlet allocation (LDA) is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. For example, if observations are words collected into documents, it posits that each document is a mixture of a small number of topics and that each word's creation is attributable to one of the document's topics. LDA is an example of a topic model and was first presented as a graphical model for topic discovery by David Blei, Andrew Ng, and Michael Jordan in 2002. [1] Topics in LDA: In LDA, each document may be viewed as a mixture of various topics. This is similar to probabilistic latent semantic analysis (pLSA), except that in LDA the topic distribution is assumed to have a Dirichlet prior.

Latent Dirichlet allocation - Wikipedia, the free encyclopedia

http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation
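
The generative story in the LDA excerpt above (each document is a mixture of topics, each word is attributed to one topic, and the mixture has a Dirichlet prior) can be sketched roughly as follows. This is only an illustration of sampling from the model, not an inference algorithm; the vocabulary, the two topic-word distributions, the `alpha` values, and all function names are hypothetical choices made for the example.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word_probs, vocab, alpha, n_words, rng):
    """Generate one document following the LDA generative story."""
    # Per-document topic mixture, drawn from the Dirichlet prior.
    theta = sample_dirichlet(alpha, rng)
    topics = list(range(len(topic_word_probs)))
    words = []
    for _ in range(n_words):
        # Pick the topic responsible for this word, then pick the word itself.
        z = rng.choices(topics, weights=theta, k=1)[0]
        w = rng.choices(vocab, weights=topic_word_probs[z], k=1)[0]
        words.append((w, z))
    return theta, words


# Hypothetical toy setup: two topics over a four-word vocabulary.
vocab = ["gene", "dna", "ball", "team"]
topic_word_probs = [
    [0.5, 0.5, 0.0, 0.0],  # assumed "biology" topic
    [0.0, 0.0, 0.5, 0.5],  # assumed "sports" topic
]
rng = random.Random(0)
theta, doc = generate_document(
    topic_word_probs, vocab, alpha=[0.5, 0.5], n_words=8, rng=rng
)
print(theta)
print(doc)
```

Small `alpha` values tend to produce documents dominated by one topic, which matches the "small number of topics" wording in the excerpt.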

CRF Project Page

The CRF package is a Java implementation of Conditional Random Fields for sequential labeling developed by Sunita Sarawagi of IIT Bombay. The package is distributed with the hope that it will be useful for researchers working in information extraction or related areas. We have attempted to keep the core CRF package compact and barebones for ease of deployment. However, we have packaged additional supporting classes for generating features, managing model structure and dictionary of words in the training data. http://crf.sourceforge.net/
http://largescale.ml.tu-berlin.de/about/

ls | About

Motivation: With the exceptional increase in computing power, storage capacity and network bandwidth of the past decades, ever growing datasets are collected in fields such as bioinformatics (Splice Sites, Gene Boundaries, etc.), IT-security (Network traffic) or Text-Classification (Spam vs. Non-Spam), to name but a few.
The International Machine Learning Society is a non-profit organisation whose main aim is to foster machine learning research and whose main activity is the coordination of the annual International Conference on Machine Learning (ICML). http://www.machinelearning.org/

machinelearning.org - Home

Popular Ensemble Methods: An Empirical Study

http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume11/opitz99a-html/paper.html An ensemble consists of a set of individually trained classifiers (such as neural networks or decision trees) whose predictions are combined when classifying novel instances. Previous research has shown that an ensemble is often more accurate than any of the single classifiers in the ensemble. Bagging [Breiman1996a] and Boosting [Freund Schapire1996, Schapire1990] are two relatively new but popular methods for producing ensembles. In this paper we evaluate these methods on 23 data sets using both neural networks and decision trees as our classification algorithm.
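
As a rough illustration of the ensemble idea in the excerpt above, here is a minimal Bagging sketch: each classifier is trained on a bootstrap resample of the data and predictions are combined by majority vote. The base learner is a one-dimensional threshold "stump" chosen only for brevity (the paper itself uses neural networks and decision trees), and the toy data and function names are assumptions made for this example.

```python
import random
from collections import Counter


def train_stump(sample):
    """Fit a 1-D threshold classifier that predicts 1 when x > threshold."""
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum((x > threshold) != bool(y) for x, y in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(data, n_models, rng):
    """Train n_models stumps, each on a bootstrap resample of the data."""
    models = []
    for _ in range(n_models):
        # Sample with replacement, same size as the original data set.
        bootstrap = [rng.choice(data) for _ in data]
        models.append(train_stump(bootstrap))
    return models


def bagging_predict(models, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(int(x > threshold) for threshold in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (feature, label) pairs separable around 0.5.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
rng = random.Random(0)
models = bagging_fit(data, n_models=25, rng=rng)
print(bagging_predict(models, 0.0), bagging_predict(models, 1.0))
```

Boosting differs in that later classifiers are trained with more weight on examples the earlier ones got wrong; that variant is not shown here.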
My research is in machine learning and statistics, with basic research on theory, methods, and algorithms. Areas of focus include nonparametric methods, sparsity, the analysis of high-dimensional data, graphical models, information theory, and applications in language processing, computer vision, and information retrieval. Perspectives on several research topics in statistical machine learning appeared in this Statistica Sinica commentary. Estimating a high dimensional regression function is notoriously difficult, due to the curse of dimensionality; this curse can be characterized rigorously using minimax theory. We developed a new method for simultaneously performing bandwidth selection and variable selection in nonparametric regression that can beat the curse of dimensionality when the underlying function is sparse.

John Lafferty

http://www.cs.cmu.edu/~lafferty/research.html
http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume4/cohn96a-html/statmodels.html For many types of machine learning algorithms, one can compute the statistically "optimal" way to select training data. In this paper, we review how optimal data selection techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are computationally expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate. Empirically, we observe that the optimality criterion sharply decreases the number of training examples the learner needs in order to achieve good performance.

Active Learning with Statistical Models

Amos Storkey - Research - Belief Networks

As an introduction to the research I have been doing using and developing belief network approaches, I thought it might be useful to provide a basic introduction to what I am talking about. If you find this tutorial useful then please put a link to it on your home page.

Henry Rowley's Home Page

I graduated from the VASC group in the Computer Science Department at Carnegie Mellon University. My advisor was Dr. Takeo Kanade. My research interests are in computer vision and pattern recognition and how machine learning can be applied to such problems.
The Non-linearity and Complexity Research Group has high international visibility in the areas of pattern analysis, probabilistic methods, non-linear dynamics and the application of methods from statistical physics to the analysis of complex systems. The underpinning methodology used includes principled approaches from probabilistic modelling, Bayesian statistics, statistical mechanics and non-linear stochastic and deterministic differential equations. Particularly significant application domains include Biomedical Information Engineering and Signal Processing, Health Informatics, Environmental Modelling and Weather Forecasting, Error-correcting Codes and Multi-user Communication, Complex Systems and Networks, Solitons and Optical Fibers, and Chaos and turbulence.

Neural Computing Research Group: The GTM H

The term "Pareto principle" can also refer to Pareto efficiency. The Pareto principle (also known as the 80–20 rule, the law of the vital few, and the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes. [1][2] Business-management consultant Joseph M. Juran suggested the principle and named it after Italian economist Vilfredo Pareto, who observed in 1906 that 80% of the land in Italy was owned by 20% of the population; he developed the principle by observing that 20% of the pea pods in his garden contained 80% of the peas. [2] It is a common rule of thumb in business; e.g., "80% of your sales come from 20% of your clients".

Pareto principle - Wikipedia, the free encyclopedia
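
The "80% of your sales come from 20% of your clients" rule of thumb from the excerpt above can be checked on any list of per-client totals with a few lines of arithmetic. The sketch below is only an illustration; the `sales` figures are invented for the example and the function name `top_share` is hypothetical.

```python
def top_share(values, top_fraction=0.2):
    """Return the fraction of the total contributed by the largest items."""
    ordered = sorted(values, reverse=True)
    # Number of items in the top group (at least one).
    k = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical per-client sales totals (10 clients, total 1000).
sales = [400, 400, 50, 40, 30, 25, 20, 15, 10, 10]
print(top_share(sales))  # prints 0.8: the top 2 of 10 clients hold 800 of 1000
```

Real data rarely lands on exactly 80/20; the function simply reports whatever concentration the data shows.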