
Big Data and NOSQL


What’s Next for Apache Hadoop Data Management and Governance: Cloudera Naviga...

Learn about the new functionality coming aboard Cloudera Navigator, the trail-blazing solution for metadata management and lineage in Apache Hadoop.

More than two years ago, Cloudera introduced Cloudera Navigator 1.0, the first offering to unify auditing across enterprise Apache Hadoop deployments. About a year later, Cloudera released Cloudera Navigator 2.0, which introduced another first for Hadoop: comprehensive metadata management and lineage. Today, more than 200 customers across numerous industries use Cloudera Navigator in production to deliver trust and visibility to their Hadoop deployments. We are also announcing exciting news for Cloudera Navigator: it has joined the Cloudera Accelerator Program, a partner program designed to expedite the development and certification of partner applications.

Lambda Architecture

NoSQL Databases and Polyglot Persistence: A Curated Guide

Big Data Right Now: Five Trendy Open Source Technologies

Big Data is on every CIO’s mind this quarter, and for good reason. Companies will have spent $4.3 billion on Big Data technologies by the end of 2012. But here’s where it gets interesting.

Drill

Speed is Key: Leveraging an efficient columnar storage format, an optimistic execution engine and a cache-conscious memory layout, Apache Drill is blazing fast.


Coordination, query planning, optimization, scheduling, and execution are all distributed throughout nodes in a system to maximize parallelization.

Liberate Nested Data: Perform interactive analysis on all of your data, including nested and schema-less.

The Database as a Value

Map Reduce

p1150-stonebraker

CAP theorem

XRX

NOSQL Databases

The NoSQL movement

In a conversation last year, Justin Sheehy, CTO of Basho, described NoSQL as a movement, rather than a technology. This description immediately felt right; I’ve never been comfortable talking about NoSQL, which, when taken literally, extends from the minimalist Berkeley DB (commercialized as Sleepycat, now owned by Oracle) to the big iron HBase, with detours into software as fundamentally different as Neo4J (a graph database) and FluidDB (which defies description). But what does it mean to say that NoSQL is a movement rather than a technology? We certainly don’t see picketers outside Oracle’s headquarters.


Big Table

Hadoop

Marriage of Hadoop and OLAP: Best of both worlds to make sense of 200 Terabytes of data

Like many other companies in the social networking world, Zoosk inherits a vast amount of data every day from user interactions, web logs, financial transactions, as well as standard business metric data. Making sense of the data and turning it into actionable intelligence is of utmost importance to Zoosk, where we are constantly trying to optimize our product offerings and business processes. The question is: how do we most effectively leverage our data, and turn it into business intelligence? There are a few typical approaches to answer this question.

Visual Guide to NoSQL Systems - Nathan Hurst's Blog

There are so many NoSQL systems these days that it's hard to get a quick overview of the major trade-offs involved when evaluating relational and non-relational systems in non-single-server environments.

I've developed this visual primer with quite a lot of help (see credits at the end), and it's still a work in progress, so let me know if you see anything misplaced or missing, and I'll fix it. Without further ado, here's what you came here for (with further explanation after the visual). Note: RDBMSs (MySQL, Postgres, etc.) are only featured here for comparison purposes. Also, some of these systems can vary their features by configuration (I use the default configuration here, but will try to delve into others later).