Data science (aka Data mining)
ELKI is a framework for implementing data-mining algorithms with support for index structures. It includes a wide variety of clustering and outlier detection methods.
The book has now been published by Cambridge University Press.
Datasets for http://www.datadotgc.ca/. DataDotGC, which launched in February 2010, is a Canadian citizen-led effort to promote open data and help share data that has already been...
2011 Report to Congress on White House Staff (Government; tags: whitehouse, salary, government, congress, ...). Since 1995, the White House has been required to deliver a report to Congress listing the title and salary of every White House Office employee.
Google Data Science tools
Data pre-processing and cleansing
Oluolu is an open-source query log mining tool that runs on Hadoop.
Pattern is a web mining module for the Python programming language. It bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider, HTML DOM parser), text analysis (rule-based shallow parser, WordNet interface, syntactical + semantical n-gram search algorithm, tf-idf + cosine similarity + LSA metrics), clustering and classification (k-means, KNN, SVM), and data visualization (graph networks).
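To make the tf-idf and cosine similarity metrics mentioned above more concrete, here is a minimal sketch in plain Python. It deliberately does not use Pattern's own API; the function names `tfidf_vectors` and `cosine_similarity` are hypothetical, and whitespace tokenization and the unsmoothed `log(N / df)` weighting are simplifying assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (weight = tf * idf).

    Illustrative helper, not part of Pattern. Tokenization is a plain
    lowercase whitespace split, which is an assumption of this sketch.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An all-zero vector is treated as similar to nothing.
    return dot / (norm_a * norm_b)
```

Documents that share many distinctive terms score close to 1, while documents with no terms in common score 0. A term that appears in every document gets weight 0 under this weighting, so it does not contribute to the similarity.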
In the 11 February 2011 issue, Science joins with colleagues from Science Signaling, Science Translational Medicine, and Science Careers to provide a broad look at the issues surrounding the increasingly huge influx of research data.