
Data Mining


Apache Mahout: Scalable machine learning and data mining.

5 of the Best Free and Open Source Data Mining Software. Data mining is the process of extracting patterns from data. Modern businesses regard it as an essential tool because it converts data into business intelligence, giving them an informational edge.

At present, it is widely used in profiling practices such as surveillance, marketing, scientific discovery, and fraud detection. Data mining normally involves four kinds of tasks:

* Classification - generalizing known structure to apply to new data.
* Clustering - finding groups and structures in the data that are in some way similar, without using known structures in the data.
* Association rule learning - looking for relationships between variables.
* Regression - finding a function that models the data with the least error.

For those of you who are looking for data mining tools, here are five of the best open-source data mining software packages that you can get for free: Orange, RapidMiner, Weka, JHepWork.

R and Data Mining.

Introduction to Data Mining.

MSISS ST4003 : Data Mining - Louis Aslett. MSISS ST4003 : Data Mining 2010-11. 2009-2010 ST4003 Data Mining lab material. This is the labs page for the fourth year undergraduate course in data mining for MSISS and mathematics students, lectured by Dr Myra O'Reagan.
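The four task types described earlier (classification, clustering, association rule learning, regression) can each be illustrated with a few lines of plain Python. This is only a minimal sketch on made-up one-dimensional data. The function names and the simple methods chosen (nearest neighbour, two-cluster k-means, rule confidence, a least-squares line) are assumptions for illustration and are not taken from any of the tools or courses linked on this page.

```python
from statistics import mean


def nearest_neighbor_classify(train, point):
    """Classification: give a new point the label of its closest known example.

    `train` is a list of (value, label) pairs (hypothetical toy format).
    """
    closest = min(train, key=lambda example: abs(example[0] - point))
    return closest[1]


def two_means(values, iterations=10):
    """Clustering: split 1-D values into two groups with a bare-bones k-means (k=2).

    No labels are used; groups emerge from distances to the two centers.
    """
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[idx].append(v)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


def rule_confidence(baskets, a, b):
    """Association rule learning: confidence of the rule a -> b.

    Fraction of baskets containing `a` that also contain `b`.
    """
    with_a = [basket for basket in baskets if a in basket]
    if not with_a:
        return 0.0
    return sum(1 for basket in with_a if b in basket) / len(with_a)


def fit_line(xs, ys):
    """Regression: slope and intercept of the ordinary least-squares line."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx
```

For example, `fit_line([0, 1, 2, 3], [1, 3, 5, 7])` recovers the line y = 2x + 1 exactly, because those toy points lie on it. Real tools such as the ones listed on this page use far more robust versions of these ideas.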

Useful Links

* Introduction to R
* R reference card
* RSeek, Google powered search engine of R resources

Labs

* Lab 1 - Examining Data
* Lab 2 - A Basic Tree Classifier
* Lab 3 - More Trees
* Lab 4 - More Programming Concepts and Model Evaluation
* Lab 5 - Introduction to Neural Networks
* Lab 6 - Random Forests
* Lab 7 - Introduction to Support Vector Machines

Data Sets

* Telecom Customer Churn Data (small version)
* Titanic Survivor Data
* Cheese Taste Data
* ESL SVM simulated data