Machine learning, a branch of artificial intelligence, is about the construction and study of systems that can learn from data.
In machine learning, unsupervised learning refers to the problem of trying to find hidden structure in unlabeled data.
In data mining, k-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean.
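As a rough illustration of the nearest-mean idea, the sketch below uses Lloyd's iteration, a common heuristic for this objective rather than a method named in the text. The function name `kmeans`, its parameters, the squared Euclidean distance and the seeded random initialization are all assumptions made for the example.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's heuristic: alternate an assignment step and a mean-update step."""
    rng = random.Random(seed)
    means = rng.sample(points, k)  # assumed initialization: k of the observations
    assignment = None
    for _ in range(iterations):
        # Assignment step: each observation joins the cluster with the nearest mean.
        new_assignment = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, means[j])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no observation changed cluster, so the means are stable
        assignment = new_assignment
        # Update step: move each mean to the centroid of its current members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous mean
                means[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return means, assignment
```

Calling `kmeans(points, 2)` on a list of coordinate tuples returns the k means and, for each observation, the index of its cluster. The result can depend on the initialization, so in practice the procedure is often restarted several times.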
The canopy clustering algorithm is an unsupervised pre-clustering algorithm, often used as a preprocessing step for the k-means algorithm or the hierarchical clustering algorithm. It is intended to speed up clustering operations on large data sets, where using another algorithm directly may be impractical due to the size of the data set. The algorithm proceeds as follows:
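The individual steps are not reproduced in this text. A minimal sketch of the procedure as it is commonly described, assuming a loose threshold `t1` and a tight threshold `t2` with `t1 > t2` and a cheap distance function, might look like this. The names `canopy`, `t1`, `t2` and `distance` are illustrative, and picking the first remaining point as the next center is an assumption.

```python
def canopy(points, t1, t2, distance):
    """Group points into possibly overlapping canopies (requires t1 > t2)."""
    if not t1 > t2:
        raise ValueError("the loose threshold t1 must exceed the tight threshold t2")
    remaining = list(points)
    canopies = []
    while remaining:
        # Remove one point from the set and start a new canopy around it.
        center = remaining.pop(0)
        # Every point still in the set within the loose distance joins the canopy.
        members = [center] + [p for p in remaining if distance(center, p) < t1]
        canopies.append((center, members))
        # Points within the tight distance cannot become centers of later canopies.
        remaining = [p for p in remaining if distance(center, p) >= t2]
    return canopies
```

A more expensive algorithm such as k-means can then be restricted to comparing points that share a canopy, which is where the speed-up on large data sets comes from.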
Principal component analysis (PCA) is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. (Figure caption: PCA of a multivariate Gaussian distribution centered at (1,3) with a standard deviation of 3 in roughly the (0.878, 0.478) direction and of 1 in the orthogonal direction. The vectors shown are the eigenvectors of the covariance matrix scaled by the square root of the corresponding eigenvalue, and shifted so their tails are at the mean.)
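For two-dimensional data the principal components can be computed by hand, since the eigenvalues and eigenvectors of a 2x2 covariance matrix have a closed form. The sketch below assumes exactly that restricted case, and the function name `pca_2d` is made up for the example. Data with more dimensions would normally go through a linear algebra library instead.

```python
import math

def pca_2d(data):
    """Return (mean, [(eigenvalue, unit_eigenvector), ...]), largest variance first."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    # Sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    c = sum((y - my) ** 2 for _, y in data) / (n - 1)
    b = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix: half the trace plus or minus a radius.
    half_trace = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    eigenvalues = (half_trace + radius, half_trace - radius)
    if abs(b) > 1e-12:
        # (b, lam - a) satisfies (C - lam * I) v = 0 whenever b is nonzero.
        vectors = [(b, lam - a) for lam in eigenvalues]
    else:
        # Already uncorrelated: the coordinate axes are the principal directions.
        vectors = [(1.0, 0.0), (0.0, 1.0)] if a >= c else [(0.0, 1.0), (1.0, 0.0)]
    components = []
    for lam, (vx, vy) in zip(eigenvalues, vectors):
        norm = math.hypot(vx, vy)
        components.append((lam, (vx / norm, vy / norm)))
    return (mx, my), components
```

Scaling each unit eigenvector by the square root of its eigenvalue and drawing it from the mean gives the kind of arrows described in the figure caption.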
Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples.
In probability and statistics, a generative model is a model for randomly generating observable data, typically given some hidden parameters. It specifies a joint probability distribution over observation and label sequences.
A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. An HMM can be considered as the simplest dynamic Bayesian network. The mathematics behind the HMM was developed by L.
A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions.
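To make the independence assumption concrete, here is a small sketch for categorical features: the score of a label is its log prior plus the sum of per-feature log likelihoods, each estimated separately. The function names, the data layout and the add-one smoothing are assumptions chosen for the example, not details from the text.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (features, label); features is a tuple of categorical values."""
    label_counts = Counter(label for _, label in examples)
    # value_counts[label][i][value]: how often feature i took `value` under `label`.
    value_counts = defaultdict(lambda: defaultdict(Counter))
    vocab = defaultdict(set)  # distinct values seen for each feature position
    for features, label in examples:
        for i, value in enumerate(features):
            value_counts[label][i][value] += 1
            vocab[i].add(value)
    return label_counts, value_counts, vocab

def predict(model, features):
    """Return the label with the highest posterior score under the naive assumption."""
    label_counts, value_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior of the label
        for i, value in enumerate(features):
            # Each feature contributes independently; add-one smoothing (an
            # assumption here) keeps unseen values from zeroing the product.
            numerator = value_counts[label][i][value] + 1
            denominator = count + len(vocab[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Working in log space avoids numerical underflow when many feature likelihoods are multiplied together.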
In natural language processing, latent Dirichlet allocation (LDA) is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar.
Discriminative models, also called conditional models, are a class of models used in machine learning for modeling the dependence of an unobserved variable
Conditional random fields (CRFs) are a class of statistical modelling methods often applied in pattern recognition and machine learning, where they are used for structured prediction. Whereas an ordinary classifier predicts a label for a single sample without regard to "neighboring" samples, a CRF can take context into account; e.g., the linear chain CRF popular in natural language processing predicts sequences of labels for sequences of input samples. CRFs are a type of discriminative undirected probabilistic graphical model.
Linear discriminant analysis (LDA) and the related Fisher's linear discriminant are methods used in statistics, pattern recognition and machine learning to find a linear combination of features which characterizes or separates two or more classes of objects or events.
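For two classes, Fisher's linear discriminant has a well-known closed form: the separating direction is proportional to the inverse within-class scatter matrix applied to the difference of the class means. The sketch below assumes two-dimensional points so that the 2x2 inverse can be written out by hand. The name `fisher_direction` is illustrative.

```python
def fisher_direction(class_a, class_b):
    """Two-class Fisher discriminant for 2-D points: w = S_w^{-1} (m_a - m_b)."""
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        # Entries (sxx, sxy, syy) of the class scatter matrix around its mean.
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    m_a, m_b = mean(class_a), mean(class_b)
    # Within-class scatter S_w is the sum of the two class scatter matrices.
    sxx, sxy, syy = (u + v for u, v in zip(scatter(class_a, m_a), scatter(class_b, m_b)))
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m_a[0] - m_b[0], m_a[1] - m_b[1]
    # Inverse of [[sxx, sxy], [sxy, syy]] is [[syy, -sxy], [-sxy, sxx]] / det.
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)
```

Projecting each point onto the returned vector (a dot product) yields the single linear combination of features along which the two classes are compared.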
In machine learning, support vector machines (SVMs, also support vector networks [1]) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier.
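The phrase "non-probabilistic binary linear classifier" can be read off the decision rule of an already trained linear SVM: the predicted class is the sign of a weighted sum, with no probability attached. The sketch below shows only that rule and the hinge loss usually associated with it. It does not train anything, and the weights `w` and bias `b` are hypothetical inputs.

```python
def svm_score(w, b, x):
    """Signed score w . x + b; its sign says which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def svm_predict(w, b, x):
    """Binary decision: +1 on or above the hyperplane, -1 below it."""
    return 1 if svm_score(w, b, x) >= 0 else -1

def hinge_loss(w, b, x, y):
    """Zero when x is classified as y (+1 or -1) with margin at least 1."""
    return max(0.0, 1.0 - y * svm_score(w, b, x))
```

With the hypothetical weights `w = (1.0, -1.0)` and `b = 0.0`, the point `(2.0, 1.0)` is assigned to class +1 and `(1.0, 3.0)` to class -1.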
An artificial neural network, often just named a neural network, is a mathematical model inspired by biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation.
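A small forward pass shows what "an interconnected group of artificial neurons" amounts to computationally: each neuron applies an activation function to a weighted sum of the previous layer's outputs. The sketch below is illustrative only. The layer format, the ReLU activation and the hand-picked weights in `XOR_NET` are assumptions, and no learning takes place.

```python
def relu(v):
    return max(0.0, v)

def identity(v):
    return v

def neuron(weights, bias, inputs, activation):
    """One artificial neuron: activation of a weighted sum plus a bias."""
    return activation(sum(w * x for w, x in zip(weights, inputs)) + bias)

def forward(layers, inputs):
    """layers: list of (weight_rows, biases, activation), applied in order."""
    values = list(inputs)
    for weight_rows, biases, activation in layers:
        values = [neuron(w, b, values, activation)
                  for w, b in zip(weight_rows, biases)]
    return values

# Hand-set weights (not learned) for a 2-2-1 network that computes XOR.
XOR_NET = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu),  # hidden layer
    ([[1.0, -2.0]], [0.0], identity),               # output layer
]

print(forward(XOR_NET, [1.0, 0.0]))  # [1.0]
print(forward(XOR_NET, [1.0, 1.0]))  # [0.0]
```

In practice the weights are learned from data rather than set by hand. The fixed values here only serve to make the computation easy to follow.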