
Data Mining Map


Data Mining - PPDM Wiki Traditional data analysis is done by inserting data into standard or customized models. In either case, it is assumed that the relationships among various system variables are well known and can be expressed mathematically. However, in many cases, relationships may not be known.

Big data has big implications for knowledge management A goal of knowledge management over the years has been the ability to integrate information from multiple perspectives to provide the insights required for valid decision-making. Organizations do not make decisions based on just one factor, such as revenue, employee salaries or interest rates for commercial loans. The total picture is what should drive decisions, such as where to invest marketing dollars, how much to invest in R&D or whether to expand into a new geographic market.

Perceptual Edge - Library Information Dashboard Design: Displaying data for at-a-glance monitoring, Second Edition, Stephen Few, $40.00 (U.S.), Analytics Press, 2013. This book alone addresses the visual design of dashboards. Don't be misled by the title.

Data Mining Image: Detail of sliced visualization of thirty video samples of Downfall remixes. As part of my postdoctoral research for The Department of Information Science and Media Studies at the University of Bergen, Norway, I am using cultural analytics techniques to analyze YouTube video remixes. My research is done in collaboration with the Software Studies Lab at the University of California, San Diego. A big thank you to CRCA at Calit2 for providing a space for daily work during my stays in San Diego. The following is an excerpt from an upcoming paper titled “Modular Complexity and Remix: The Collapse of Time and Space into Search,” to be published in the peer-reviewed journal AnthroVision, Vol 1.1.

Tertiary data: Big data's hidden layer Big data isn’t just about multi-terabyte datasets hidden inside eventually-concurrent distributed databases in the cloud, or enterprise-scale data warehousing, or even the emerging market in data. It’s also about the hidden data you carry with you all the time; the slowly growing datasets on your movements, contacts and social interactions. Until recently, most people’s understanding of what can actually be done with the data collected about us by our own cell phones was theoretical. There were few real-world examples. But over the last couple of years, this has changed dramatically.

Crossfilter Fast Multidimensional Filtering for Coordinated Views Crossfilter is a JavaScript library for exploring large multivariate datasets in the browser. Crossfilter supports extremely fast (<30ms) interaction with coordinated views, even with datasets containing a million or more records; we built it to power analytics for Square Register, allowing merchants to slice and dice their payment history fluidly. Since most interactions only involve a single dimension, and then only small adjustments are made to the filter values, incremental filtering and reducing is significantly faster than starting from scratch.
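The point about incremental filtering can be illustrated with a minimal Python sketch. This is not Crossfilter's code or API: the `RangeFilter` class, its methods, and the `payments` data are hypothetical names invented for illustration. It assumes a single dimension kept in sorted order, so that a range filter is a contiguous slice and a small adjustment only touches the records at the slice's edges.

```python
from bisect import bisect_left


class RangeFilter:
    """Toy range filter on one dimension with an incrementally updated sum."""

    def __init__(self, records, key, value):
        # Sort once by the dimension so any range filter is a contiguous slice.
        self.records = sorted(records, key=key)
        self.keys = [key(r) for r in self.records]
        self.value = value
        # Start unfiltered: the selected slice is the whole sorted array.
        self.lo, self.hi = 0, len(self.records)
        self.count = len(self.records)
        self.total = sum(value(r) for r in self.records)
        self.touched = 0  # records visited by the latest set_range call

    def _apply(self, start, stop, sign):
        # Add (sign=+1) or remove (sign=-1) records[start:stop] from the totals.
        for i in range(start, stop):  # empty when stop <= start
            self.total += sign * self.value(self.records[i])
            self.count += sign
            self.touched += 1

    def set_range(self, low, high):
        """Select records with low <= key < high, visiting only the changes."""
        new_lo = bisect_left(self.keys, low)
        new_hi = max(new_lo, bisect_left(self.keys, high))
        old_lo, old_hi = self.lo, self.hi
        self.touched = 0
        # Records leaving the selection: in the old slice but not the new one.
        self._apply(old_lo, min(old_hi, new_lo), -1)
        self._apply(max(old_lo, new_hi), old_hi, -1)
        # Records entering the selection: in the new slice but not the old one.
        self._apply(new_lo, min(new_hi, old_lo), +1)
        self._apply(max(new_lo, old_hi), new_hi, +1)
        self.lo, self.hi = new_lo, new_hi


# Hypothetical data: (hour, amount) pairs standing in for payment records.
payments = [(hour, 10 * hour) for hour in range(1000)]
by_hour = RangeFilter(payments, key=lambda p: p[0], value=lambda p: p[1])

by_hour.set_range(100, 200)   # first filter: many records change state
first_touched = by_hour.touched
by_hour.set_range(101, 201)   # small adjustment: only the edges change
second_touched = by_hour.touched
```

In this toy setup the first call visits every record that leaves the selection, while the second call, which shifts the range by one hour, visits only the one record that drops out and the one that comes in. The running `count` and `total` are updated in place, not recomputed from scratch.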

Scraping for Journalism: A Guide for Collecting Data Photo by Dan Nguyen/ProPublica Our Dollars for Docs news application lets readers search pharmaceutical company payments to doctors. We’ve written a series of how-to guides explaining how we collected the data. Most of the techniques are within the ability of the moderately experienced programmer. The most difficult-to-scrape site was actually a previous Adobe Flash incarnation of Eli Lilly’s disclosure site.

Visualising Ad Hoc Tweeted Link Communities, via BackType So you’ve tweeted a link as part of your social media/event amplification strategy, and it’s job done, right? Or is there maybe some way you can learn something about who else found that interesting? Notwithstanding the appearance of yet another patent of the bleedin’ obvious, here’s one way I’ve been experimenting with for tracking informal, ad hoc communities around a link. (In part this harkens back to some of my previous “social life of a URL” doodles such as delicious URL History – Hyperbolic Tree Visualisation, More Hyperbolic Tree Visualisations – delicious URL History: Users by Tag.) In part inspired by a comment by Chris Jobling on one of my flickr Twitter network images, here’s a recipe for identifying a core community that may be interested in a retweeted link:
