AI

General. Email: bouchard AT stat.ubc.ca. Assistant Professor in the Department of Statistics at UBC. Path: McGill -> UCB -> UBC. AKA: Alex, Bouchard, or 卜利森. See also: how to typeset my last name. Office: ESB, Room 3124. Resumé (last updated: Oct. '12). http://www.stat.ubc.ca/~bouchard/

Alexandre Bouchard-Côté

http://en.wikipedia.org/wiki/ELKI ELKI (for Environment for DeveLoping KDD-Applications Supported by Index-Structures) is a knowledge discovery in databases (KDD, "data mining") software framework developed for use in research and teaching by the database systems research unit of Professor Hans-Peter Kriegel at the Ludwig Maximilian University of Munich, Germany. It aims at allowing the development and evaluation of advanced data mining algorithms and their interaction with database index structures. Description: The ELKI framework is written in Java and built around a modular architecture. Most currently included algorithms belong to clustering, outlier detection [1] and database indexes. A key concept of ELKI is to allow the combination of arbitrary algorithms, data types, distance functions and indexes, and to evaluate these combinations.

ELKI

http://jfuzzylogic.sourceforge.net/html/index.html What is jFuzzyLogic? jFuzzyLogic is a fuzzy logic package written in Java (as you might have guessed). It implements the Fuzzy Control Language (FCL) specification (IEC 61131 part 7). jFuzzyLogic features:
- Implements the Fuzzy Control Language (FCL) IEC-61131-7 specification.
- Parametric optimization algorithms.
- Membership functions:
  - Continuous: GenBell, Sigmoidal, Trapetzoidal, Gaussian, PieceWiseLinear, Triangular, Cosine, Dsigm
  - Discrete: Singleton, GenericSingleton
  - Custom membership functions can be defined
- Defuzzifiers:
  - Continuous: CenterOfGravity, RightMostMax, CenterOfArea, LeftMostMax, MeanMax
  - Discrete: CenterOfGravitySingletons
  - Custom defuzzifiers can be easily created
  - Function-based defuzzifiers (e.g.

jFuzzyLogic: Open Source Fuzzy Logic (Java)
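
To make two of the listed features concrete, here is a minimal Python sketch of a triangular membership function and a center-of-gravity defuzzifier. It does not use jFuzzyLogic or FCL; the function names `triangular` and `center_of_gravity`, the sampling scheme, and the example parameters are all assumptions chosen for illustration.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at or outside [a, c], rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def center_of_gravity(membership, lo, hi, steps=1000):
    """Approximate the centroid of a membership function on [lo, hi].

    Uses midpoint sampling; this is an illustrative numerical scheme,
    not the one any particular library uses.
    """
    width = (hi - lo) / steps
    xs = [lo + (i + 0.5) * width for i in range(steps)]
    weights = [membership(x) for x in xs]
    total = sum(weights)
    if total == 0:
        raise ValueError("membership is zero everywhere on the interval")
    return sum(x * w for x, w in zip(xs, weights)) / total


# Hypothetical fuzzy set peaking at 5 on the range [0, 10].
cog = center_of_gravity(lambda x: triangular(x, 0.0, 5.0, 10.0), 0.0, 10.0)
```

Because the example set is symmetric around its peak, the defuzzified value lands at the peak; an asymmetric set would pull the centroid toward its heavier side.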

R

Applying complex system entropy cluster algorithm to mining principle of herbal combinations in traditional Chinese medicine

Five Viscera Tonifying Method (FVTM) was established by Prof. Gao Zhongying, a nationally prestigious and experienced practitioner of Traditional Chinese Medicine (TCM). This method extends the implication of the tonifying method while making a breakthrough in traditional TCM theory. http://dl.acm.org/citation.cfm?id=1656055
The process of extracting patterns from data is called data mining. It is recognized as an essential tool by modern business, since it can convert data into business intelligence and thus provide an informational edge. At present, it is widely used in profiling practices such as surveillance, marketing, scientific discovery, and fraud detection. Four kinds of tasks are normally involved in data mining:
* Classification - the task of generalizing a known structure to apply to new data.
* Clustering - the task of finding groups and structures in the data that are in some way similar, without using known structures in the data.
* Association rule learning - looks for relationships between variables.
* Regression - aims to find a function that models the data with the least error.
For those of you who are looking for data mining tools, here are five of the best open-source data mining packages that you can get for free:

5 of the Best Free and Open Source Data Mining Software

http://www.junauza.com/2010/11/free-data-mining-software.html
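
As an illustration of the regression task described in this entry (finding a function that models the data with the least error), here is a minimal Python sketch of an ordinary least-squares line fit. The article names no specific method, so the choice of least squares, the function name `fit_line`, and the sample data are assumptions.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-variation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
```

With noisy data the fitted line no longer passes through every point, but it still minimizes the total squared vertical distance to them.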
http://en.wikipedia.org/wiki/Factor_graph In probability theory and its applications, a factor graph is a particular type of graphical model, with applications in Bayesian inference, that enables efficient computation of marginal distributions through the sum-product algorithm. One of the important success stories of factor graphs and the sum-product algorithm is the decoding of capacity-approaching error-correcting codes, such as LDPC and turbo codes. A factor graph is an example of a hypergraph, in that an arrow (i.e., a factor node) can connect more than one (normal) node. When there are no free variables, the factor graph of a function f is equivalent to the constraint graph of f, which is an instance of a constraint satisfaction problem.

Factor graph
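
To show how the sum-product algorithm computes a marginal, here is a minimal Python sketch on a tiny factor graph with two binary variables A and B, a unary factor on A, and a pairwise factor on (A, B). The factor values and all names are hypothetical; the brute-force function is included only to check the message-passing result.

```python
# Hypothetical factors over binary variables A and B.
f_A = [0.6, 0.4]            # f_A[a]
f_AB = [[0.9, 0.1],         # f_AB[a][b]
        [0.2, 0.8]]


def marginal_b_sum_product():
    """Marginal of B obtained by passing messages along A's chain to B."""
    # A leaf factor sends its own values as its message to the variable.
    msg_fa_to_a = f_A
    # Variable A forwards the product of its other incoming messages
    # (here only one) to the factor f_AB.
    msg_a_to_fab = msg_fa_to_a
    # Factor f_AB multiplies by the incoming message and sums out A.
    msg_fab_to_b = [
        sum(f_AB[a][b] * msg_a_to_fab[a] for a in range(2))
        for b in range(2)
    ]
    z = sum(msg_fab_to_b)
    return [m / z for m in msg_fab_to_b]


def marginal_b_brute_force():
    """Marginal of B obtained by enumerating the full joint, for comparison."""
    joint = [[f_A[a] * f_AB[a][b] for b in range(2)] for a in range(2)]
    z = sum(sum(row) for row in joint)
    return [sum(joint[a][b] for a in range(2)) / z for b in range(2)]


by_messages = marginal_b_sum_product()
by_enumeration = marginal_b_brute_force()
```

On a graph this small both routes cost the same; the benefit of message passing shows on larger tree-structured graphs, where enumerating the joint grows exponentially in the number of variables.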

Playgrounds

Because we want to give kick-ass product recommendations. I'm showing you how to find related items based on a really simple formula. If you pay attention, this technique is used all over the web (like on Amazon) to personalize the user experience and increase conversion rates. https://www.bionicspirit.com/blog/2012/01/16/cosine-similarity-euclidean-distance.html

Data Mining: Finding Similar Items and Users
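
The linked post's URL names cosine similarity and Euclidean distance as the measures for finding related items. Here is a minimal Python sketch of both; it is not the post's code, and the rating vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors (0.0 = identical)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical ratings given by two users to the same three items.
alice = [5.0, 3.0, 0.0]
bob = [4.0, 2.0, 1.0]
similarity = cosine_similarity(alice, bob)
distance = euclidean_distance(alice, bob)
```

Cosine similarity ignores vector length, so a user who rates everything twice as high still counts as similar; Euclidean distance does not, which is one reason to choose between them deliberately.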

http://wiki.xkcd.com/irc/Bucket#Bucket_Overview Bucket has an outer shell of metal [ citation needed ] ; within the metal is a protective layer of high density plastic [ citation needed ] , in which may or may not reside pure HOH [ citation needed ] . There [ citation needed ] can only be speculation about what else the Bucket contains. [ citation needed ] Do not make our Bucket stupid or mean. Any stupiding of the Bucket will get you warned, kicked, and then banned. Bucket's replies should generally not be meaner or more asshole-y than the lines that trigger them. Do not alter factoids that are not your own.

Bucket - XKCD Wiki

http://blog.jmacoe.com/gestion_ti/base_de_datos/5-mejores-software-mineria-datos-codigo-libre-abierto/#more-1975

5 of the Best Free and Open Source Data Mining Software | El rincón de JMACOE

The process of extracting patterns from data is called data mining. It is recognized as an essential tool of modern business, since it can convert data into business intelligence and thus provide an informational edge. At present, it is widely used in profiling practices such as surveillance, marketing, scientific discovery, and fraud detection. Four kinds of tasks are normally involved in data mining: Classification – the task of generalizing a known structure to apply to new data. Clustering – the task of finding groups and structures in the data that are in some way similar, without using known structures in the data. Association rule learning – looks for relationships between variables. Regression – aims to find a function that models the data with the least error.
How Can We Help You? Get the latest version: Free and Paid Licenses/Downloads Learn how to use LingPipe: Tutorials Get expert help using LingPipe: Services Join us on Facebook What is LingPipe?

LingPipe Home

http://alias-i.com/lingpipe/
Open source is a great choice for many text analytics users, especially folks who have programming skills, who need custom capabilities or who are trying to get a feel for possibilities before committing themselves. Excellent options are available for all these users. Tools such as Gate, NLTK, R and RapidMiner share the low cost, power, flexibility and community that have driven adoption of open-source software by individual users and enterprises alike. RapidMiner even combines text processing with business intelligence (BI) and visualization functions. This article will look at open source text analytics, focusing on those four tools. (UIMA, the open source Unstructured Information Management Architecture, is a rich topic in itself, one that merits its own article.) I will suggest a number of resources that will help you get started.

Open Source Text Analytics by Seth Grimes

Inside Google, MapReduce is used for 80% of all the data processing needs. That includes indexing web content, running the clustering engine for Google News, generating reports for popular queries (Google Trends), processing satellite imagery, language model processing for statistical machine translation and even mundane tasks like data backup and restore. The other 20% is handled by a lesser known infrastructure called “Pregel”, which is optimized to mine relationships from “graphs”. According to Wikipedia, a “graph” is a collection of vertices or ‘nodes’ and a collection of ‘edges’ that connect pairs of ‘nodes’. Depending on the requirements, a ‘graph’ can be undirected, which means there is no distinction between the two ‘nodes’ joined by an edge, or it can be directed from one ‘node’ to another.

Pregel: Google’s other data-processing infrastructure | Scalable web architectures
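
The directed versus undirected distinction in this entry can be sketched with a plain adjacency structure in Python. This is not Pregel's API; the function name `build_adjacency` and the edge list are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges, directed):
    """Map each node to the sorted list of nodes reachable over one edge."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        # Touch the target so it appears even when it has no outgoing edges.
        adjacency[target]
        if not directed:
            # An undirected edge makes no distinction between its two ends.
            adjacency[target].add(source)
    return {node: sorted(neighbours) for node, neighbours in adjacency.items()}


# Hypothetical edge list over three nodes.
edges = [("a", "b"), ("b", "c")]
directed_graph = build_adjacency(edges, directed=True)
undirected_graph = build_adjacency(edges, directed=False)
```

In the directed version "c" has no neighbours because no edge starts there, while in the undirected version every edge is visible from both of its ends.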

In the past, I’ve written about Google Pregel. At the time, there was no implementation of anything like Pregel available of any kind, let alone an open-source one. Now things have changed, so I’d like to give a quick list of the projects out there that might help you get started with this technology, as I see that people very often ask what the difference is between all of them.

claudio martella

ROC

Books

Vizz

Mondrian (software)

Mondrian is a general-purpose statistical data-visualization system. It features outstanding visualization techniques for data of almost any kind, and has its particular strength compared to other tools when working with Categorical Data, Geographical Data and LARGE Data. All plots in Mondrian are fully linked, and offer various interactions and queries.
Global Warming in My Lifetime – July’s Story. NASA GISS has a great web application that lets users generate maps of global monthly temperature anomalies in 2-degree grids. I’ve made a 27-second video of 7 decade maps for July to see how global temperature anomalies have progressed through my life so far. Here’s the link to NASA GISS’s map application page.

Interesting Data Tools | Climate Charts & Graphs