
BigData


[1508.03843] The Gremlin Graph Traversal Machine and Language. Big Data Analyst: The next job for an MBB? Many times I wonder what my next career will be, if this Lean Six Sigma Master Black Belt ever runs its course.

Big Data Analyst: The next job for an MBB?

Since MBBs, by their nature, are generally good at data analysis, there has been a lot of talk about a new field that is developing around the analysis of big data. The Harvard Business Review had an article about it this week: “Data is useless without the skills to analyze it” by Jeanne Harris, 13 Sep 2012. In this article they try to define the skills of a Big Data Analyst, and they sound like a good MBB! Ready and Willing to Experiment: They talk about hypothesis testing, population sampling, and inference (we can do that). Adept at Mathematical Reasoning: Excerpt from the article: “Business users don’t need to be statisticians, but they need to understand the proper usage of statistical methods.”
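The three skills named above (population sampling, hypothesis testing, inference) can be illustrated with a small sketch. This is not from the HBR article; it is a minimal two-sided permutation test for a difference in means, and the function name `permutation_test`, its parameters, and the simulated cycle-time data are all hypothetical choices made for illustration.

```python
import random
import statistics


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random relabeling under the null hypothesis
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_resamples + 1)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical populations: process cycle times before and after a change.
    before = [rng.gauss(50.0, 5.0) for _ in range(100_000)]
    after = [rng.gauss(48.0, 5.0) for _ in range(100_000)]
    # Population sampling: draw a small random sample from each population.
    sample_before = rng.sample(before, 40)
    sample_after = rng.sample(after, 40)
    # Inference: a small p-value suggests the difference is not just noise.
    print("p-value:", permutation_test(sample_before, sample_after, 2_000))
```

A permutation test is used here because it needs no distributional assumptions and no third-party library; a t-test would be the more familiar Six Sigma choice for the same question.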

This is quite interesting to me. What this article does not say is which tools you should be proficient with in order to perform big data analysis. Kernel Based Algorithms for Mining Huge Data Sets: Supervised, Semi-supervised, and Unsupervised Learning. Software: new LinearSVM: The newest, extremely fast machine learning (data mining) algorithm for solving multiclass classification problems on ultra-large data sets. It implements an original proprietary version of a cutting plane algorithm for designing a linear support vector machine.

Kernel Based Algorithms for Mining Huge Data Sets: Supervised, Semi-supervised, and Unsupervised Learning

LinearSVM is a linearly scalable routine, meaning that it creates an SVM model in CPU time that scales linearly with the size of the training data set. Chapter 3 ISDA: A support vector machines tool with a nice GUI for solving large-scale classification and regression problems. If you use results or analysis obtained with the ISDA software in your publications, please cite: Huang T. MATLAB Codes for ISDA Chapter 5 SemiL: The software SemiL is the first program that implements graph-based semi-supervised learning techniques for large-scale problems.
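To make the idea of a linear SVM whose training cost grows linearly with the data more concrete, here is a minimal sketch. It is not the LinearSVM software and does not implement its proprietary cutting plane algorithm; it swaps in a different, well-known technique (Pegasos-style stochastic subgradient descent on the hinge loss), where each pass over the data costs time proportional to the number of training points. The names `train_linear_svm` and `predict` and the parameters `lam` and `epochs` are hypothetical, and the sketch handles only binary labels in {-1, +1}, not the multiclass case.

```python
import random


def _augment(x):
    """Append a constant feature so the bias is learned as an extra weight."""
    return list(x) + [1.0]


def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Pegasos-style SGD for a binary linear SVM (labels must be -1 or +1).

    Each epoch visits every training point once, so the cost per epoch is
    linear in the number of points times the number of features.
    """
    rng = random.Random(seed)
    data = [_augment(x) for x in X]
    w = [0.0] * len(data[0])
    order = list(range(len(data)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, data[i]))
            # Shrink the weights: gradient step on the L2 regularizer.
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                # Point violates the margin: step along the hinge subgradient.
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, data[i])]
    return w


def predict(w, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    score = sum(wj * xj for wj, xj in zip(w, _augment(x)))
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    rng = random.Random(1)
    X, y = [], []
    for _ in range(200):
        # Two hypothetical, well-separated clusters.
        X.append([2 + rng.uniform(-0.5, 0.5), 2 + rng.uniform(-0.5, 0.5)])
        y.append(1)
        X.append([-2 + rng.uniform(-0.5, 0.5), -2 + rng.uniform(-0.5, 0.5)])
        y.append(-1)
    w = train_linear_svm(X, y)
    correct = sum(predict(w, x) == label for x, label in zip(X, y))
    print("training accuracy:", correct / len(X))
```

The bias is folded into the weight vector (and therefore regularized) to keep the update rule in its simplest form; that is a design shortcut for the sketch, not a recommendation.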

Chapter 6.

Cluster Manager

Pattern-Based Analytics. Innovation and Big Data in Corporations: A Roadmap. The bleeding edge of insight innovation is around next generation digital consumer experience.

Innovation and Big Data in Corporations: A Roadmap

Consumer behaviors are rapidly evolving: always connected, always sharing, always aware. Obviously, new technology like Big Data drives and transforms consumer behavior and empowerment. With the influx of money, attention and entrepreneurial energy, there is a massive amount of innovation taking place to solve data-centric problems (such as the high cost of collecting, cleaning, curating, analyzing, maintaining, and predicting) in new ways. There are two distinct patterns in data-centric innovation: Disruptive innovation like predictive search, which brings a very different value proposition to tasks like discover, engage, explore and buy, and/or creates new markets. With either pattern, the managerial challenge is moving from big-picture strategy to day-to-day execution. Airline loyalty programs are a great example of multi-year evolving competitive roadmaps. British Airways “Know Me” Project Summary.

Visualization

Many Eyes. TIBCO Spotfire - Business Intelligence Analytics Software & Data Visualization. Pangool - Hadoop API made easy. Why the days are numbered for Hadoop as we know it — Cloud Computing News. Intuitive Data Analysis. Lucky Lion Studios – Royalty-Free Video Game Music Library. DRM, Video Optimization, Digital Copy Protection & Conditional Access - Widevine Technologies.

Data Gravity. Blog - looking with cassandra into the future of atlas? It’s no surprise we use MongoDB almost everywhere. And it’s no surprise either that we’re planning for world domination, ingesting more and more data from different media sources into our Atlas platform: which is leading us to hit MongoDB scalability limits, discover obscure bugs, and question some of its design choices.

That’s why, for the upcoming Atlas 4.0 version, we’re going to switch gears and move towards one of the most dominant players in the non-relational database space: Cassandra. Is it because you’re all crazy hipsters? The reason we’re moving from MongoDB to Cassandra is not that we love the latest and coolest technology, whatever that means. Rather, given the large growth we’re experiencing on the Atlas platform in terms of data and users: So, we chose to go with Cassandra for a number of reasons: But we didn’t stop at mere theoretical considerations: once settled on Cassandra, it was time to do some prototyping and testing.

Tests passed with flying colours. The end?