
Machine Learning Cheat Sheet (for scikit-learn)

jakevdp/sklearn_scipy2013 · enthought/pyql · GUESS: The Graph Exploration System

Visualizing the stock market structure
This example employs several unsupervised learning techniques to extract the stock market structure from variations in historical quotes. The quantity used is the daily variation in quote price: quotes that are linked tend to co-fluctuate during a day.

Learning a graph structure: we use sparse inverse covariance estimation to find which quotes are correlated conditionally on the others. Specifically, sparse inverse covariance gives us a graph, that is, a list of connections. For each symbol, the symbols it is connected to are those useful for explaining its fluctuations.

Clustering: we use clustering to group together quotes that behave similarly. Note that this gives a different indication than the graph: the graph reflects conditional relations between variables, while the clustering reflects marginal properties. Variables clustered together can be considered as having a similar impact at the level of the full stock market.

Embedding in 2D space · Visualization
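The two learning steps described above can be sketched in scikit-learn. This is a minimal illustration on synthetic data rather than real quotes (the actual example operates on standardized daily price variations): `GraphicalLassoCV` learns the sparse conditional-dependence graph, and affinity propagation clusters the correlation matrix.

```python
# Minimal sketch of the pipeline above, on synthetic stand-in data:
# sparse inverse covariance for the graph, affinity propagation for clusters.
import numpy as np
from sklearn.covariance import GraphicalLassoCV
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))   # stand-in for daily variations of 5 symbols
X -= X.mean(axis=0)
X /= X.std(axis=0)                  # standardize, as the example does

model = GraphicalLassoCV().fit(X)
precision = model.precision_        # sparse inverse covariance matrix
# Nonzero off-diagonal entries are the edges of the conditional graph.
edges = np.abs(np.triu(precision, k=1)) > 0.02

# Cluster on the (marginal) correlation matrix, not the conditional graph.
d = np.sqrt(np.diag(model.covariance_))
corr = model.covariance_ / np.outer(d, d)
labels = AffinityPropagation(affinity="precomputed", random_state=0).fit(corr).labels_
print(precision.shape, labels.shape)
```

On real data the edge threshold and the number of clusters found would of course differ; the values here are illustrative.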

jakevdp/sklearn_pycon2013 · a free/open-source library for quantitative finance · GitHub - airbnb/caravel: Caravel is a data exploration platform designed to be visual, intuitive, and interactive · python - Correcting matplotlib colorbar ticks

Multi-armed bandit (a resource-allocation problem in machine learning)
In probability theory and machine learning, the multi-armed bandit problem (sometimes called the K-armed or N-armed bandit problem) is a problem in which a decision maker iteratively selects one of multiple fixed choices (i.e., arms or actions) when the properties of each choice are only partially known at the time of allocation, and may become better understood as time passes. A fundamental aspect of bandit problems is that choosing an arm does not affect the properties of that arm or of the other arms. The multi-armed bandit problem is a classic reinforcement learning problem that exemplifies the exploration–exploitation tradeoff: in contrast to general RL, the selected actions in bandit problems do not affect the reward distributions of the arms. Each machine provides a random reward drawn from a probability distribution specific to that machine, which is not known a priori.
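The exploration–exploitation tradeoff described above can be sketched with an epsilon-greedy policy, one of the simplest bandit strategies (not mentioned in the excerpt itself; the arm probabilities and epsilon below are illustrative choices):

```python
# Epsilon-greedy policy for a K-armed Bernoulli bandit: with probability eps
# explore a random arm, otherwise exploit the arm with the best estimated mean.
import random

def epsilon_greedy(true_probs, steps=5000, eps=0.1, seed=0):
    rng = random.Random(seed)
    k = len(true_probs)
    counts = [0] * k
    values = [0.0] * k          # running mean reward per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < eps:                              # explore
            arm = rng.randrange(k)
        else:                                               # exploit
            arm = max(range(k), key=values.__getitem__)
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total += reward
    return values, total

values, total = epsilon_greedy([0.2, 0.5, 0.8])
best = max(range(3), key=values.__getitem__)
print(best)  # with these settings the policy settles on the 0.8 arm (index 2)
```

The key bandit property from the text is visible here: pulling an arm only reveals information about that arm's distribution, it never changes any arm's distribution.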

StatsModels: Statistics in Python — statsmodels 0.6.0.dev-455510c documentation

statsmodels is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests and statistical data exploration. An extensive list of result statistics is available for each estimator. The results are tested against existing statistical packages to ensure that they are correct. The package is released under the open-source Modified BSD (3-clause) license.

Since version 0.5.0 of statsmodels, you can use R-style formulas together with pandas data frames to fit your models:

    import numpy as np
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Load data
    dat = sm.datasets.get_rdataset("Guerry", "HistData").data

    # Fit regression model (using the natural log of one of the regressors)
    results = smf.ols('Lottery ~ Literacy + np.log(Pop1831)', data=dat).fit()

    # Inspect the results
    print(results.summary())

You can also use numpy arrays instead of formulas:
