Big Data

MongoDB 2.0 Should Have Been 1.0 | Luigi Montanez. No open source project has received more criticism in recent years than MongoDB. Most of the flak has revolved around technical implementation decisions made for the project, and perceptions of how the company behind MongoDB, 10gen, has used advantages gained from those design decisions in marketing their product. It seems that more blog posts have been written on why one should not use MongoDB than on why one should use it.

In a now-404ed July 2010 entry (mirror), Mikeal Rogers, at the time an employee of the company behind CouchDB, wrote a highly trafficked critique of MongoDB’s asynchronous writes and lack of single-server durability. Because of the design decisions made regarding persisting to disk, sending a hard kill signal to a standalone MongoDB instance could result in the database becoming corrupted beyond recovery.

This was no good, of course. A few months later, Ethan Gunderson wrote a post titled “Two Reasons You Shouldn’t Use MongoDB”. Configurable fsync time in 1.2.

Deadline Extended: Data Engineering Meets the Semantic Web (DESWEB2012) from Parreira, Josiane on 2011-10-31 (semantic-web from October 2011)

An Open Data Ecosystem. Rufus Pollock recently shared his thoughts on scaling what he sees as the current open data ecosystem. He writes, “The last several decades the world has seen an explosion of digital technologies which have the potential to transform the way knowledge is disseminated. This world is rapidly evolving and one of its more striking possibilities is the creation of an open data ecosystem in which information is freely used, extended and built on. The resulting open data ‘commons’ is valuable in and of itself, but also, and perhaps even more importantly, because the social and commercial benefits it generates — whether in helping us to understand climate change; speeding the development of life-saving drugs; or improving governance and public services.”

Pollock continues, “In developing this open data ecosystem there are three key things are needed: material, tools and people.

Www.mckinsey.com/mgi/publications/big_data/pdfs/MGI_big_data_full_report.pdf.

Are you ready for the era of ‘big data’? - McKinsey Quarterly - Strategy - Innovation. The top marketing executive at a sizable US retailer recently found herself perplexed by the sales reports she was getting. A major competitor was steadily gaining market share across a range of profitable segments. Despite a counterpunch that combined online promotions with merchandizing improvements, her company kept losing ground. When the executive convened a group of senior leaders to dig into the competitor’s practices, they found that the challenge ran deeper than they had imagined.

The competitor had made massive investments in its ability to collect, integrate, and analyze data from each store and every sales unit and had used this ability to run myriad real-world experiments. At the same time, it had linked this information to suppliers’ databases, making it possible to adjust prices in real time, to reorder hot-selling items automatically, and to shift items from store to store easily.

But over the last few years, the volume of data has exploded.

Strata NY 2011 [Data Visualizations] - The Subjectivity of Fact. This post was written by Mimi Rojanasakul. She is an artist and designer based in New York, currently pursuing her MFA in Communications Design at Pratt Institute. Say hello or follow her @mimiosity.

Last but not least, here's a round-up of talks that were focused on data visualization at Strata. Most of the presentations covered the standard do's and don'ts, parading around humorously incomprehensible examples from those who forget to label their axes or who use a pie chart to show change over time.

Naomi Robbins, the consummate modernist, spends her presentation extolling clarity, objectivity, and a form-follows-function philosophy that comes with a number of simple guidelines. DO: make the data stand out, eliminate unnecessary dimensions, or try a dot plot instead of a bar graph sometime. Robbins makes a point of distinguishing more technical graphics from art.

[ A set of donut graphs meant to convey change over time, and an amended line-graph version by Iliinsky ]