background preloader

Data Mining: What is Data Mining?

Data Mining: What is Data Mining?
Overview Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cuts costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Continuous Innovation Although data mining is a relatively new term, the technology is not. Example For example, one Midwest grocery chain used the data mining capacity of Oracle software to analyze local buying patterns. Data, Information, and Knowledge Data Data are any facts, numbers, or text that can be processed by a computer. Information Knowledge Data Warehouses What can data mining do?

Supports de cours -- Data Mining Cette page recense les supports utilisés pour mes enseignements de Machine Learning, Data Mining et de Data Science au sein du Département Informatique et Statistique (DIS) de l'Université Lyon 2, principalement en Master 2 Statistique et Informatique pour la Science des donnéEs (SISE), formation en data science, dans le cadre du traitement statistique des données et de la valorisation des big data. Je suis très attentif à la synergie forte entre l'informatique et les statistiques dans ce diplôme, ce sont là les piliers essentiels du métier de data scientist. Attention, pour la majorité, il s'agit de « slides » imprimés en PDF, donc très peu formalisés, ils mettent avant tout l'accent sur le fil directeur du domaine étudié et recensent les points importants. Cette page est bien entendu ouverte à tous les statisticiens, data miner et data scientist, étudiants ou pas, de l'Université Lyon 2 ou d'ailleurs. Nous vous remercions par avance. Ricco Rakotomalala – Université Lyon 2

Data Mining: Text Mining, Visualization and Social Media 5 of the Best Free and Open Source Data Mining Software The process of extracting patterns from data is called data mining. It is recognized as an essential tool by modern business since it is able to convert data into business intelligence thus giving an informational edge. At present, it is widely used in profiling practices, like surveillance, marketing, scientific discovery, and fraud detection. There are four kinds of tasks that are normally involve in Data mining: * Classification - the task of generalizing familiar structure to employ to new data* Clustering - the task of finding groups and structures in the data that are in some way or another the same, without using noted structures in the data.* Association rule learning - Looks for relationships between variables.* Regression - Aims to find a function that models the data with the slightest error. For those of you who are looking for some data mining tools, here are five of the best open-source data mining software that you could get for free: Orange RapidMiner Weka JHepWork