
Schema


Big Data in Real-Time at Twitter

Cassandra By Example

Cassandra has received a lot of attention of late, and more people are now evaluating it for their organizations. As these folks work to get up to speed, the shortcomings in our documentation become all the more apparent. Easily the worst of these is how poorly it explains the data model to those with an existing background in relational databases.

The problem is that Cassandra’s data model is different enough from that of a traditional database to readily cause confusion, and the ways that well-intentioned people try to correct the misconceptions are just as numerous as the misconceptions themselves. Some folks will describe the model as a map of maps or, in the case of super columns, a map of maps of maps.
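To make the "map of maps" description concrete, here is a minimal sketch using plain Python dictionaries. It is only an illustration of the mental model, not Cassandra's API or storage format; the row keys, column names, and helper functions (`get_column`, `get_subcolumn`) are all hypothetical.

```python
# Illustrative only: plain dicts standing in for the "map of maps" mental model.
# Row keys, column names, and values below are made up for the example.

# Standard column family: row key -> {column name -> value}
users = {
    "alice": {"email": "alice@example.com", "city": "Austin"},
    "bob": {"email": "bob@example.com"},  # rows need not share the same columns
}

# Super column family: row key -> {super column -> {column name -> value}}
address_book = {
    "alice": {
        "home": {"street": "1 Main St", "zip": "78701"},
        "work": {"street": "9 Elm Ave", "zip": "78702"},
    },
}


def get_column(column_family, row_key, column_name):
    """Return one value from the map of maps, or None if it is absent."""
    return column_family.get(row_key, {}).get(column_name)


def get_subcolumn(super_cf, row_key, super_column, column_name):
    """Return one value from the map of maps of maps, or None if it is absent."""
    return super_cf.get(row_key, {}).get(super_column, {}).get(column_name)
```

Under this view, `get_column(users, "bob", "city")` simply returns `None`: a missing column is an absent key, not a NULL cell in a fixed table layout.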

Schema in Cassandra 1.1

The evolution of schema in Cassandra

When Cassandra was first released several years ago, it closely followed the data model outlined in Google’s Bigtable paper (with the notable addition of SuperColumns; more on these later): ColumnFamilies, which group related columns, had to be defined up front, but column names were just byte arrays interpreted by the application. It would be fair to characterize this early Cassandra data model as “schemaless.”
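The "schemaless" arrangement described here can be sketched as a toy in-memory store. This is an assumption-laden illustration, not Cassandra code: the `Timeline` column family, the `insert` and `read_row` helpers, and the choice of big-endian timestamps as column names are all hypothetical. The point is only that the store knows the column family in advance while the meaning of each column name lives entirely in the application.

```python
import struct

# Hypothetical toy store: only the column family is declared up front.
column_families = {"Timeline": {}}


def insert(cf_name, row_key, column_name: bytes, value: bytes):
    """Store a column; the store treats names and values as opaque bytes."""
    if cf_name not in column_families:
        raise KeyError(f"column family {cf_name!r} was not defined up front")
    column_families[cf_name].setdefault(row_key, {})[column_name] = value


# The application, not the store, decides that a column name is a
# big-endian signed 64-bit timestamp.
def encode_ts(ts: int) -> bytes:
    return struct.pack(">q", ts)


def decode_ts(raw: bytes) -> int:
    return struct.unpack(">q", raw)[0]


def read_row(cf_name, row_key):
    """Return (timestamp, value) pairs, ordered by the raw bytes of the name."""
    row = column_families[cf_name][row_key]
    return [(decode_ts(name), value) for name, value in sorted(row.items())]


insert("Timeline", b"alice", encode_ts(1325376060), b"world")
insert("Timeline", b"alice", encode_ts(1325376000), b"hello")
```

Because the names are fixed-width big-endian encodings of non-negative integers, sorting the raw bytes also sorts the timestamps, so `read_row("Timeline", b"alice")` yields the two entries in chronological order.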