Machine learning « Follow the Data While preparing for our next podcast recording, here are some interesting recent machine learning developments. Machine learning as a service. The Protocols and Structures for Inference (PSI) project aims to develop an architecture for presenting machine learning algorithms, their inputs (data) and outputs (predictors) as resource-oriented RESTful web services in order to make machine learning technology accessible to a broader range of people than just machine learning researchers. Why? Currently, many machine learning implementations (e.g., in toolkits such as Weka, Orange, Elefant, Shogun, SciKit.Learn, etc.) are tied to specific choices of programming language, and data sets to particular formats (e.g., CSV, svmlight, ARFF). I think it seems promising. BigML, which has been mentioned in passing on this blog, has now published some videos of what the interface actually looks like.
Multilinear principal component analysis Multilinear principal component analysis (MPCA) is a mathematical procedure that uses multiple orthogonal transformations to convert a set of multidimensional objects into another set of multidimensional objects of lower dimensions. There is one orthogonal (linear) transformation for each dimension (mode): hence multilinear. This transformation aims to capture as high a variance as possible, accounting for as much of the variability in the data as possible, subject to the constraint of mode-wise orthogonality. MPCA is a multilinear extension of principal component analysis (PCA). The major difference is that PCA needs to reshape a multidimensional object into a vector, while MPCA operates directly on multidimensional objects through mode-wise processing. E.g., for 100x100 images, PCA operates on vectors of 10000x1 while MPCA operates on vectors of 100x1 in two modes. MPCA is a basic algorithm for dimension reduction via multilinear subspace learning. The algorithm
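To make the mode-wise idea concrete, here is a minimal sketch in plain Python. It is a simplification, not the full MPCA algorithm: it keeps only one component per mode so that a power iteration can stand in for a full eigendecomposition, and every function name (`mpca_rank_one`, `top_eigenvector`, and so on) is made up for this illustration. The point to notice is that the scatter matrices are rows x rows and columns x columns (100x100 each for 100x100 images), never (rows*columns) x (rows*columns).

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def scatter(vectors):
    """Sum of outer products v v^T over a list of equally sized vectors."""
    n = len(vectors[0])
    total = [[0.0] * n for _ in range(n)]
    for vec in vectors:
        for i in range(n):
            for j in range(n):
                total[i][j] += vec[i] * vec[j]
    return total


def top_eigenvector(sym, iters=100):
    """Power iteration for the dominant eigenvector of a symmetric PSD matrix."""
    v = [1.0 / (i + 1) for i in range(len(sym))]  # arbitrary fixed start (assumption)
    for _ in range(iters):
        w = mat_vec(sym, v)
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # degenerate scatter matrix: give up and keep the current v
            break
        v = [x / norm for x in w]
    return v


def mpca_rank_one(samples, sweeps=5):
    """Alternately fit one projection vector per mode for a list of 2-D samples."""
    count, rows, cols = len(samples), len(samples[0]), len(samples[0][0])
    mean = [[sum(s[i][j] for s in samples) / count for j in range(cols)]
            for i in range(rows)]
    centered = [[[s[i][j] - mean[i][j] for j in range(cols)] for i in range(rows)]
                for s in samples]
    u2 = [1.0 / cols ** 0.5] * cols  # initial mode-2 direction (assumption)
    u1 = [1.0 / rows ** 0.5] * rows
    for _ in range(sweeps):
        # Mode 1: project each sample along mode 2, then maximise variance over rows.
        u1 = top_eigenvector(scatter([mat_vec(x, u2) for x in centered]))
        # Mode 2: project each sample along mode 1, then maximise variance over columns.
        u2 = top_eigenvector(scatter([mat_vec(transpose(x), u1) for x in centered]))
    features = [sum(a * b for a, b in zip(u1, mat_vec(x, u2))) for x in centered]
    return u1, u2, features
```

Keeping more than one component per mode would mean taking several leading eigenvectors of each scatter matrix instead of one, which in practice is easier with a linear algebra library.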
Operations, machine learning and premature babies Julie Steele and I recently had lunch with Etsy’s John Allspaw and Kellan Elliott-McCrea. I’m not sure how we got there, but we made a connection that was (to me) astonishing between web operations and medical care for premature infants. I’ve written several times about IBM’s work in neonatal intensive care at the University of Toronto. IBM discovered that by applying machine learning to the full data stream, they were able to diagnose some dangerous infections a full day before any symptoms were noticeable to a human. That observation strikes me as revolutionary. In our conversation, we started wondering how this applied to web operations. We talked a bit about whether it was possible to alarm on the first (and second) derivatives of some key parameters, and of course it is. Web operations has been at the forefront of “big data” since the beginning.
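Alarming on derivatives can be sketched with finite differences over evenly spaced samples. The function name, parameters and thresholds below are all hypothetical, chosen only to illustrate the idea; a real monitoring system would also need smoothing, since raw differences amplify noise.

```python
def derivative_alarms(samples, interval, max_slope, max_accel):
    """Return the indices where the first or second finite difference of a
    metric exceeds its threshold. `samples` are readings taken every
    `interval` time units; the thresholds are illustrative assumptions."""
    alarms = []
    for i in range(2, len(samples)):
        slope = (samples[i] - samples[i - 1]) / interval           # first derivative
        prev_slope = (samples[i - 1] - samples[i - 2]) / interval
        accel = (slope - prev_slope) / interval                    # second derivative
        if abs(slope) > max_slope or abs(accel) > max_accel:
            alarms.append(i)
    return alarms
```

For example, `derivative_alarms(latencies, interval=60, max_slope=0.5, max_accel=0.05)` would flag the samples where a latency series is rising, or accelerating, faster than the chosen limits, even if the absolute value is still inside its normal range.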
Parallel coordinates Parallel coordinates is a common way of visualizing high-dimensional geometry and analyzing multivariate data. This visualization is closely related to time series visualization, except that it is applied to data where the axes do not correspond to points in time, and therefore do not have a natural order. Therefore, different axis arrangements may be of interest. History Parallel coordinates are often said to have been invented by Philbert Maurice d'Ocagne in 1885, but even though the words "Coordonnées parallèles" appear in the book title, this work has nothing to do with the visualization technique of the same name; the book only describes a method of coordinate transformation. Higher dimensions Adding more dimensions in parallel coordinates (often abbreviated ||-coords or PCs) involves adding more axes.
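As a rough illustration of how a record becomes a line in such a plot, the sketch below turns each record into a polyline with one vertex per axis. The function name, the dictionary-based record format and the min-max scaling of each axis are assumptions made for this example, not part of any particular plotting library.

```python
def to_polylines(records, axis_order):
    """Map each record (a dict of numeric fields) to a polyline with one
    vertex per axis: x is the axis position, y the value scaled to [0, 1]."""
    bounds = {}
    for axis in axis_order:
        values = [record[axis] for record in records]
        bounds[axis] = (min(values), max(values))

    polylines = []
    for record in records:
        line = []
        for position, axis in enumerate(axis_order):
            low, high = bounds[axis]
            # A constant axis has no spread, so place its points at mid-height.
            y = 0.5 if high == low else (record[axis] - low) / (high - low)
            line.append((position, y))
        polylines.append(line)
    return polylines
```

Because the axes have no natural order, calling the function again with a different `axis_order` gives a different arrangement of the same data, and adding a dimension simply means adding another axis name to the list.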
Tom's Hardware US Researchers at Imperial College London believe that magnets could be used to develop future processors with far greater processing capacity than today's CPUs. According to a study published in the journal Science, a honeycomb pattern of tiny, nano-sized magnets that are submerged in a material known as spin ice could solve a complex computational problem in a single step. In fact, clusters of such magnet arrays function similarly to a neural network: It is more "similar to how our brains work than to the way in which traditional computers process information," the researchers said. Exploiting the potential of magnets gets more difficult the closer they are located to each other, as they interfere with each other's magnetic fields. However, the scientists found that their honeycomb patterns create competition between magnets and "reduces the problems caused by these interactions by two-thirds." Honeycomb magnet processors are very much science fiction at this point.
Terminology in Data Analytics As data continue to grow at a faster rate than either population or economic activity, so do organizations' efforts to deal with the data deluge, and use it to capture value. And so do the methods used to analyze data, which creates an expanding set of terms (including some buzzwords) used to describe these methods. This is a field in flux, and different people may have different conceptions of what terms mean. Comments on this page and its "definitions" are welcome. Since many of these terms are subsets of others, or overlapping, the clearest approach is to start with the more specific terms and move to the more general. Predictive modeling: Used when you seek to predict a target (outcome) variable (feature) using records (cases) where the target is known. Predictive analytics: Basically the same thing as predictive modeling, but less specific and technical. Supervised Learning: Another synonym for predictive modeling. Unsupervised Learning: Business intelligence: Data mining: Text mining:
Judea Pearl Judea Pearl (born 1936) is an Israeli-born American computer scientist and philosopher, best known for championing the probabilistic approach to artificial intelligence and the development of Bayesian networks (see the article on belief propagation). He is also credited with developing a theory of causal and counterfactual inference based on structural models (see article on causality). He is the 2011 winner of the ACM Turing Award, the highest distinction in computer science, "for fundamental contributions to artificial intelligence through the development of a calculus for probabilistic and causal reasoning". Judea Pearl is the father of journalist Daniel Pearl, who was kidnapped and murdered by militants in Pakistan connected with Al-Qaeda and the International Islamic Front in 2002 for his American and Jewish heritage. Biography Pearl is currently a professor of computer science and statistics and director of the Cognitive Systems Laboratory at UCLA.
Behavioral Targeting: the most underused technique in today’s marketing Posted in How To on May 30th, 2012 We recently launched a geo-behavioral targeting feature in Visual Website Optimizer. (We also launched a usability testing module; our vision is to offer all the tools and techniques a marketer would need for conversion rate optimization). People use A/B testing, multivariate testing, analytics and usability studies for improving sales and conversions. However, I feel behavioral targeting is massively underused. Part of the reason could be the difficulty of implementation, but with tools like ours (and others in the market), it is becoming easier by the day to get started with all sorts of targeting and personalization campaigns. What is behavioral targeting? Different visitors behave differently on your website. In my landing page optimization tips article, I recommended knowing who your target customer is and having only one clear call to action button (and thereby neglecting other visitors). Def. So, behavioral targeting must be hard? Actually, no!
20 lines of code that beat A/B testing every time Zwibbler.com is a drop-in solution that lets users draw on your web site. A/B testing is used far too often, for something that performs so badly. It is defective by design: Segment users into two groups. In recent years, hundreds of the brightest minds of modern civilization have been hard at work not curing cancer. With a simple 20-line change to how A/B testing works, that you can implement today, you can always do better than A/B testing -- sometimes, two or three times better. It can reasonably handle more than two options at once. The Multi-armed bandit problem The multi-armed bandit problem takes its terminology from a casino. Like many techniques in machine learning, the simplest strategy is hard to beat. def choose(): if random.random() < 0.1: # exploration! Why does this work? Let's say we are choosing a colour for the "Buy now!" Then a web site visitor comes along and we have to show them a button. Another visitor comes along. But suddenly, someone clicks on the orange button!
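The strategy the excerpt begins to describe (explore 10% of the time, otherwise show whatever has worked best so far) is usually called epsilon-greedy. Below is a minimal, self-contained sketch of it. Only the 0.1 exploration rate comes from the excerpt; the class name, the method names and the choice to score an untried option as 1.0 so that it gets shown at least once are assumptions for this illustration, and may differ from the post's own 20 lines.

```python
import random


class EpsilonGreedy:
    """Epsilon-greedy chooser over a fixed set of options (e.g. button colours)."""

    def __init__(self, options, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.trials = {option: 0 for option in options}
        self.rewards = {option: 0 for option in options}

    def expectation(self, option):
        # Assumption: an untried option is scored optimistically so it gets tried.
        if self.trials[option] == 0:
            return 1.0
        return self.rewards[option] / self.trials[option]

    def choose(self):
        if self.rng.random() < self.epsilon:
            # Exploration: pick any option uniformly at random.
            return self.rng.choice(list(self.trials))
        # Exploitation: pick the option with the best observed click rate.
        return max(self.trials, key=self.expectation)

    def record(self, option, clicked):
        """Call once per impression, with whether the visitor clicked."""
        self.trials[option] += 1
        if clicked:
            self.rewards[option] += 1
```

A page handler would call `choose()` to decide which button to render and `record(option, clicked)` once the outcome is known; more than two options work without any change.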
Perceptrons in Lisp (A simple machine learning exercise) So having missed Stanford's Machine Learning course (mostly out of laziness - I'm sure it was great) I'm trying to learn this stuff on my own. I'm going through MIT's Machine Learning notes on OpenCourseWare. They're easy [for me] to digest without being insulting, and they help me avoid searching for "The right book" to learn from (a task that would delay my learning anything but make me feel busy). After reading the first two lectures I decided I should stop and practice what I've learned: a simple perceptron learning algorithm. What's a Perceptron anyway? It sounds like a Transformer. We want to choose the variables so that the above term is positive when we'll have a storm, and negative otherwise. More generally, say we have a vector of characteristics. How do we find the weights to apply to it? Our learning algorithm will tell us how to choose that weight vector. We're going to start with any ol' weight vector, say just a vector with 1's in all positions. Let's see it in practice. I [rather foolishly] created my own training data.
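The post itself works in Lisp and its code is not part of this excerpt, so what follows is only a sketch of the standard perceptron update rule, written in Python. The names `theta`, `train_perceptron` and `predict`, the +1/-1 label encoding and the epoch limit are my own assumptions; the all-ones starting vector is the one detail taken from the text.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_perceptron(examples, max_epochs=100):
    """examples: list of (features, label) pairs with label +1 or -1.
    Returns a weight vector that separates the data if one is found in time."""
    theta = [1.0] * len(examples[0][0])  # start with 1's in all positions
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            # A point is misclassified when the sign of theta . x disagrees
            # with its label (or the point sits exactly on the boundary).
            if label * dot(theta, features) <= 0:
                theta = [t + label * x for t, x in zip(theta, features)]
                mistakes += 1
        if mistakes == 0:  # a full pass with no corrections: done
            break
    return theta


def predict(theta, features):
    return 1 if dot(theta, features) > 0 else -1
```

With no bias term, this version can only learn boundaries that pass through the origin; a common workaround is to append a constant 1 to every feature vector. If the data are not linearly separable the loop simply stops after `max_epochs` passes.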
Robots master skills with ‘deep learning’ technique UC Berkeley researchers have developed new algorithms that enable robots to learn motor tasks by trial and error, using a process that more closely approximates the way humans learn. They demonstrated their technique, a type of reinforcement learning, by having a robot complete various tasks (putting a clothes hanger on a rack, assembling a toy plane, screwing a cap on a water bottle, and more) without pre-programmed details about its surroundings. A new AI approach “What we’re reporting on here is a new approach to empowering a robot to learn,” said Professor Pieter Abbeel of UC Berkeley’s Department of Electrical Engineering and Computer Sciences. The work is part of a new People and Robots Initiative at UC’s Center for Information Technology Research in the Interest of Society (CITRIS).
Welcome to AITopics | AITopics