Common Crawl produces and maintains a repository of web crawl data that is openly accessible to everyone. The crawl currently covers 6 billion pages, and the repository includes valuable metadata. The crawl data is stored on Amazon's Public Data Sets, allowing it to be bulk downloaded as well as directly accessed for map-reduce processing in EC2.
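The map-reduce processing mentioned above can be sketched locally. Everything below is illustrative: the toy page strings and helper names are assumptions, not Common Crawl's actual record format or API, but the map/combine/reduce shape is the same one a real job over the crawl data would use.

```python
from collections import Counter
from functools import reduce

# Toy stand-ins for crawl records; a real job would stream WARC/ARC
# records from the public dataset instead of using inline strings.
pages = [
    "common crawl open web data",
    "open data for web research",
]

def map_page(text):
    """Map step: emit partial word counts for one page."""
    return Counter(text.split())

def reduce_counts(acc, partial):
    """Reduce step: merge a partial count into the running total."""
    acc.update(partial)
    return acc

# Run the map step over every page, then fold the partials together.
totals = reduce(reduce_counts, map(map_page, pages), Counter())
```

In a real EC2 job the map step would run in parallel across workers, each holding a shard of the crawl, with the reduce step merging per-worker totals.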
Recently, via Twitter, I came across “Gibbs Sampling for the Uninitiated” by Philip Resnik and Eric Hardisty, a tutorial that shows how to use Gibbs sampling of a Naive Bayes model to estimate the labels on a set of documents. The paper works through the algebra in great detail and concludes with pseudocode. Resnik and Hardisty do such a good job of making it look easy that I decided to write my own Gibbs sampler.
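A sampler of the kind described can be sketched as follows. This is a simplified, uncollapsed variant, not Resnik and Hardisty's exact pseudocode: it alternates between drawing the class prior, the two class word distributions, and every document label, with symmetric Beta/Dirichlet hyperparameters. The document encoding (lists of integer word ids) and all parameter names are my own assumptions.

```python
import math
import random
from collections import Counter

def gibbs_naive_bayes(docs, vocab_size, iters=150,
                      gamma_pi=1.0, gamma_theta=1.0, seed=1):
    """Gibbs-sample binary labels for docs, each a list of word ids.

    Uncollapsed sampler: pi ~ Beta, per-class theta ~ Dirichlet,
    then every label given pi and theta. Labels are only identified
    up to a global flip of 0 and 1.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(2) for _ in docs]
    counts = [Counter(d) for d in docs]
    for _ in range(iters):
        # Sample pi | labels  ~  Beta(C1 + gamma_pi, C0 + gamma_pi).
        c1 = sum(labels)
        pi = rng.betavariate(c1 + gamma_pi, len(docs) - c1 + gamma_pi)
        # Sample theta_x | labels, docs  ~  Dirichlet(word counts + gamma_theta),
        # drawn via normalized Gamma variates.
        theta = []
        for x in (0, 1):
            wc = Counter()
            for lab, c in zip(labels, counts):
                if lab == x:
                    wc.update(c)
            g = [rng.gammavariate(wc[w] + gamma_theta, 1.0)
                 for w in range(vocab_size)]
            s = sum(g)
            theta.append([v / s for v in g])
        # Sample each label | pi, theta from its posterior log-odds.
        for j, c in enumerate(counts):
            lp = [math.log(1.0 - pi), math.log(pi)]
            for x in (0, 1):
                for w, n in c.items():
                    lp[x] += n * math.log(theta[x][w])
            m = max(lp)
            p1 = math.exp(lp[1] - m) / (math.exp(lp[0] - m) + math.exp(lp[1] - m))
            labels[j] = 1 if rng.random() < p1 else 0
    return labels

# Toy corpus: three docs built from words {0, 1}, three from {2, 3};
# the sampler should split them into two groups (up to label flipping).
docs = [[0, 0, 1, 1] * 10, [0, 1, 1, 0] * 10, [1, 0, 0, 1] * 10,
        [2, 2, 3, 3] * 10, [2, 3, 3, 2] * 10, [3, 2, 2, 3] * 10]
labels = gibbs_naive_bayes(docs, vocab_size=4)
```

Because the labels are exchangeable, runs are judged on whether documents cluster together, not on which cluster gets which label.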
This is the home page for the book, "Bayesian Data Analysis," by Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin.
(March 26th Update: Video now available) Last night, I moderated our Bay Area R Users Group kick-off event with a panel discussion entitled “The R and Science of Predictive Analytics”, co-located with the Predictive Analytics World conference here in SF.
Information graphics are visual representations of information, data, or knowledge.