background preloader

Self-organizing map

Self-organizing map
A self-organizing map (SOM) or self-organizing feature map (SOFM) is a type of artificial neural network (ANN) that is trained using unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a map. Self-organizing maps are different from other artificial neural networks in the sense that they use a neighborhood function to preserve the topological properties of the input space. This makes SOMs useful for visualizing low-dimensional views of high-dimensional data, akin to multidimensional scaling. The model was first described as an artificial neural network by the Finnish professor Teuvo Kohonen, and is sometimes called a Kohonen map or network.[1][2] Like most artificial neural networks, SOMs operate in two modes: training and mapping. A self-organizing map consists of components called nodes or neurons. Large SOMs display emergent properties. Learning algorithm[edit] Variables[edit] Related:  Machine Learningà mettre en ligne

Multilayer perceptron A multilayer perceptron (MLP) is a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. A MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. Except for the input nodes, each node is a neuron (or processing element) with a nonlinear activation function. Theory[edit] Activation function[edit] If a multilayer perceptron has a linear activation function in all neurons, that is, a linear function that maps the weighted inputs to the output of each neuron, then it is easily proved with linear algebra that any number of layers can be reduced to the standard two-layer input-output model (see perceptron). The two main activation functions used in current applications are both sigmoids, and are described by in which the former function is a hyperbolic tangent which ranges from -1 to 1, and the latter, the logistic function, is similar in shape but ranges from 0 to 1. Layers[edit] in the

Growing self-organizing map A growing self-organizing map (GSOM) is a growing variant of the popular self-organizing map (SOM). The GSOM was developed to address the issue of identifying a suitable map size in the SOM. It starts with a minimal number of nodes (usually 4) and grows new nodes on the boundary based on a heuristic. All the starting nodes of the GSOM are boundary nodes, i.e. each node has the freedom to grow in its own direction at the beginning. Node growth options in GSOM: (a) one new node, (b) two new nodes and (c) three new nodes. The algorithm[edit] The GSOM process is as follows: where the Learning Rate , is a sequence of positive parameters converging to zero as . , are the weight vectors of the node before and after the adaptation and is the neighbourhood of the winning neuron at the th iteration. Approximation of a spiral with noise by 1D SOM (the upper row) and GSOM (the lower row) with 50 (the first column) and 100 (the second column) nodes. Applications[edit] References[edit] See also[edit]

Deep learning Branch of machine learning Deep learning (also known as deep structured learning or differential programming) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised.[1][2][3] Deep learning architectures such as deep neural networks, deep belief networks, recurrent neural networks and convolutional neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design, medical image analysis, material inspection and board game programs, where they have produced results comparable to and in some cases surpassing human expert performance.[4][5][6] Artificial neural networks (ANNs) were inspired by information processing and distributed communication nodes in biological systems. Definition[edit] Overview[edit] History[edit]

Connectionism Connectionism is a set of approaches in the fields of artificial intelligence, cognitive psychology, cognitive science, neuroscience, and philosophy of mind, that models mental or behavioral phenomena as the emergent processes of interconnected networks of simple units. There are many forms of connectionism, but the most common forms use neural network models. Basic principles[edit] The central connectionist principle is that mental phenomena can be described by interconnected networks of simple and often uniform units. Spreading activation[edit] In most connectionist models, networks change over time. Neural networks[edit] Neural networks are by far the most commonly used connectionist model today.Though there are a large variety of neural network models, they almost always follow two basic principles regarding the mind: Most of the variety among neural network models comes from: Biological realism[edit] Learning[edit] By formalizing learning in such a way, connectionists have many tools.

Protein Secondary Structure Prediction with Neural Nets: Feed-Forward Networks Introduction to feed-forward nets Feed-forward nets are the most well-known and widely-used class of neural network. The popularity of feed-forward networks derives from the fact that they have been applied successfully to a wide range of information processing tasks in such diverse fields as speech recognition, financial prediction, image compression, medical diagnosis and protein structure prediction; new applications are being discovered all the time. In common with all neural networks, feed-forward networks are trained, rather than programmed, to carry out the chosen information processing tasks. The feed-forward architecture Feed-forward networks have a characteristic layered architecture, with each layer comprising one or more simple processing units called artificial neurons or nodes. Diagram of 2-Layer Perceptron Feed-forward nets are generally implemented with an additional node - called the bias unit - in all layers except the output layer. Training a feed-forward net 1. 2. 3.

Singular Value Decomposition of a Matrix Description Compute the singular-value decomposition of a rectangular matrix. Usage svd(x, nu = min(n, p), nv = min(n, p), LINPACK = FALSE) La.svd(x, nu = min(n, p), nv = min(n, p)) Arguments Details The singular value decomposition plays an important role in many statistical techniques. svd and La.svd provide two slightly different interfaces. Computing the singular vectors is the slow part for large matrices. Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code (most often 1): these can only be interpreted by detailed study of the FORTRAN code but mean that the algorithm failed to converge. Value The SVD decomposition of the matrix as computed by LAPACK, \bold{X = U D V'}, where \bold{U} and \bold{V} are orthogonal, \bold{V'} means V transposed, and \bold{D} is a diagonal matrix with the singular values D[i,i]. The returned value is a list with components For La.svd the return value replaces v by vt, the (conjugated if complex) transpose of v.

Restricted Boltzmann machine Diagram of a restricted Boltzmann machine with three visible units and four hidden units (no bias units). A restricted Boltzmann machine (RBM) is a generative stochastic neural network that can learn a probability distribution over its set of inputs. RBMs were initially invented under the name Harmonium by Paul Smolensky in 1986,[1] but only rose to prominence after Geoffrey Hinton and collaborators invented fast learning algorithms for them in the mid-2000s. RBMs have found applications in dimensionality reduction,[2] classification,[3] collaborative filtering, feature learning[4] and topic modelling.[5] They can be trained in either supervised or unsupervised ways, depending on the task. Restricted Boltzmann machines can also be used in deep learning networks. Structure[edit] associated with the connection between hidden unit and visible unit , as well as bias weights (offsets) for the visible units and for the hidden units. or, in vector form, where visible units and and See also[edit]

Recurrent neural network A recurrent neural network (RNN) is a class of neural network where connections between units form a directed cycle. This creates an internal state of the network which allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Architectures[edit] Fully recurrent network[edit] This is the basic architecture developed in the 1980s: a network of neuron-like units, each with a directed connection to every other unit. For supervised learning in discrete time settings, training sequences of real-valued input vectors become sequences of activations of the input nodes, one input vector at a time. Hopfield network[edit] The Hopfield network is of historic interest although it is not a general RNN, as it is not designed to process sequences of patterns. A variation on the Hopfield network is the bidirectional associative memory (BAM). Elman networks and Jordan networks[edit] The Elman SRN Where:

Feature learning Feature learning or representation learning[1] is a set of techniques in machine learning that learn a transformation of "raw" inputs to a representation that can be effectively exploited in a supervised learning task such as classification. Feature learning algorithms themselves may be either unsupervised or supervised, and include autoencoders,[2] dictionary learning, matrix factorization,[3] restricted Boltzmann machines[2] and various form of clustering.[2][4][5] When the feature learning can be performed in an unsupervised way, it enables a form of semisupervised learning where first, features are learned from an unlabeled dataset, which are then employed to improve performance in a supervised setting with labeled data.[6][7] Clustering as feature learning[edit] K-means clustering can be used for feature learning, by clustering an unlabeled set to produce k centroids, then using these centroids to produce k additional features for a subsequent supervised learning task. See also[edit]