Synchronizing Clocks In a Cassandra Cluster, Pt. 2: Solutions This article was originally written by Viliam Holub. This is the second part of a two-part series. Before you read this, you should go back and read the original article, “Synchronizing Clocks In a Cassandra Cluster Pt. 1 – The Problem.” In it, I covered how important clocks are and how bad clocks can be in virtualized systems (like Amazon EC2) today. In today’s installment, I’m going to cover some disadvantages of off-the-shelf NTP installations, and how to overcome them.
Synchronizing Clocks In a Cassandra Cluster, Pt. 1: The Problem This article was originally written by Viliam Holub. Cassandra is a highly distributable NoSQL database with tunable consistency. What makes it highly distributable also makes it, in part, vulnerable: the whole deployment must run on synchronized clocks. It’s quite surprising that, given how crucial this is, it is not covered sufficiently in the literature. And, if it is, the coverage simply refers to installing an NTP daemon on each node which – if followed blindly – leads to really bad consequences.
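As a rough illustration of why the excerpt above treats clock synchronization as crucial: Cassandra resolves conflicting writes to the same column by comparing write timestamps and keeping the larger one. The sketch below is not Cassandra code; the `Write`, `resolve` and `stamp` names and the clock offsets are hypothetical, chosen only to show how a node with a fast clock can make an older write win.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    timestamp_us: int  # timestamp attached to the write, in microseconds


def resolve(a: Write, b: Write) -> Write:
    """Last-write-wins: keep the write that carries the larger timestamp."""
    return a if a.timestamp_us >= b.timestamp_us else b


def stamp(value: str, true_time_us: int, clock_offset_us: int) -> Write:
    """Stamp a write with the (possibly skewed) local clock of a node."""
    return Write(value, true_time_us + clock_offset_us)


# Hypothetical scenario: node A's clock runs 500 ms fast, node B's is accurate.
first = stamp("old", 1_000_000, 500_000)  # written first in real time, on node A
second = stamp("new", 1_200_000, 0)       # written 200 ms later, on node B

winner = resolve(first, second)
print(winner.value)  # the earlier write wins because of the skewed clock
```

With both offsets set to zero, the later write wins as expected, which is the behavior synchronized clocks are meant to preserve.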
The History of Apache Cassandra
HBase vs Cassandra
Making Things Easier with Cassandra GUI 2.0 Cassandra GUI has evolved from its first version, and the new version includes bug fixes and enhanced features. New features: complete pagination for the Row view of the explorer; search rows by their names.
cassandra-user - frequent client exceptions on 0.7.0 Hello, We were occasionally experiencing client exceptions with 0.6.3, so we upgraded to 0.7.0 a couple of weeks ago, but unfortunately we now get more client exceptions, and more frequently. Also, occasionally nodetool ring will show a node Down even though cassandra is still running and the node will be up again shortly. We run nodetool ring every half hour or so for monitoring, otherwise we probably would not have noticed. I'm trying to determine whether we are hitting some bugs, just don't have enough hardware for our application, or have made some error in configuration. I would be happy to provide any more information or run tests to narrow down the problem.
Cassandra Indexing: The good, the bad and the ugly Within NoSQL, the operations of indexing, fetching and searching for information are intimately tied to the physical storage mechanisms. It is important to remember that rows are stored across hosts, but a single row is stored on a single host (with replicas). Column families are stored in sorted order, which makes querying a set of columns efficient (provided you are spanning rows). The Bad: Partitioning
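To make the point about sorted storage concrete, here is a minimal sketch of why a sorted layout makes fetching a contiguous set of columns cheap: a range of column names can be located with two binary searches instead of a full scan. This is plain Python, not the Cassandra API; `column_slice` and the date-style column names are hypothetical and used only for illustration.

```python
from bisect import bisect_left, bisect_right


def column_slice(sorted_names, start, end):
    """Return the column names in [start, end] from an already sorted list.

    Two binary searches find the boundaries, so the cost is O(log n)
    plus the size of the returned slice.
    """
    lo = bisect_left(sorted_names, start)
    hi = bisect_right(sorted_names, end)
    return sorted_names[lo:hi]


# Hypothetical row whose column names are ISO dates, kept in sorted order.
row = sorted(["2011-01-03", "2011-01-01", "2011-02-10", "2011-01-17"])

january = column_slice(row, "2011-01-01", "2011-01-31")
print(january)
```

The same lookup over unsorted names would have to inspect every column, which is the cost the sorted layout avoids.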
Cloud Architecture Tutorial - Running in the Cloud (3of3)
Announcing Astyanax AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder() .forCluster("ClusterName") .forKeyspace("KeyspaceName")
Compressed families not created on new node
Cassandra NYC 2011: Nathan Milford - Cassandra for System Admins
Cassandra for LOBS | Ruby Zone Database storage is expensive. This is especially true if you build a traditional SAN-based M+N cluster. The storage array, fiber channel switches, and fiber channel interfaces drive the cost per terabyte into the thousands quite easily. And while storage costs in general are plummeting, SAN storage costs are falling at a slower rate, widening the gap between SAN and direct-attached storage. Given the cost of SAN storage, it would be unfortunate to waste it, which is what we discovered we were doing. Our platform makes a lot of 3rd party service calls.
"Building on Quicksand" Paper for CIDR (Conference on Innovative Database Research) - PatHelland's WebLog
Upgrading Cassandra: 0.8.x to 1.0.x | DataStax Cassandra 1.0 Documentation This section describes how to upgrade Cassandra 0.8.x to 1.0.x and how to upgrade between minor releases of Cassandra 1.0.x. The procedures also apply to DataStax Community Edition. Best Practices for Upgrading Cassandra The following steps are recommended before upgrading Cassandra:
What’s new in Cassandra 1.0: Compression Cassandra 1.0 introduces support for data compression on a per-ColumnFamily basis, one of the most-requested features since the project started. Compression maximizes the storage capacity of your Cassandra nodes by reducing the volume of data on disk. In addition to the space-saving benefits, compression also reduces disk I/O, particularly for read-dominated workloads. Compression benefits Besides data size, compression typically improves both read and write performance. Cassandra is able to quickly find the location of rows in the SSTable index, and only decompresses the relevant row chunks.
Netflix Benchmarks on AWS Show Cassandra NoSQL Still Has the Goods A little more than a year ago, Apache Cassandra's reputation was untouchable. It was blowing other NoSQL data stores out of the water in benchmarks and in our very own DZone popularity poll. What else would you expect from the data solution that was originally designed to handle the data on Facebook? How could it not be the top solution out there?
Cassandra Welcome to Apache Cassandra The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages. The Apache Cassandra Project
Intro — Hector v0.8.x documentation
sebgiroux/Cassandra-Cluster-Admin - GitHub README.mkd Cassandra Cluster Admin by Sébastien Giroux Cassandra Cluster Admin is a GUI tool to help people administrate their Apache Cassandra cluster. If you're like me and used MySQL for a while (and still using it!)
Tuning Cassandra | DataStax Cassandra 0.8 Documentation Effective tuning depends not only on the types of operations your cluster performs most frequently, but also on the shape of the data itself. For example, Cassandra’s memtables have overhead for index structures on top of the actual data they store. If the size of the values stored in the columns is small compared to the number of columns and rows themselves (sometimes called skinny rows), this overhead can be substantial. Thus, the optimal tuning for this type of data is quite different than the optimal tuning for a small number of columns with more data (fat rows).
zznate/cassandra-stress - GitHub README.mdown Overview cassandra-stress is modeled after the stress.py script in Apache Cassandra's source distribution. cassandra-stress is built on top of Hector, a well-tested and widely deployed Java client for Apache Cassandra. The benefits of using Hector for a tool like this are many:
NodeTool - Cassandra Wiki
Cassandra Write Performance – A quick look inside Application Performance
MX4J - Open Source Java Management Extensions
Linux performance basics
What’s new in Cassandra 0.7: expiring columns
Contents | DataStax Cassandra 0.8 Documentation
Clustering | DataStax Cassandra 0.7 Documentation
Operations - Cassandra Wiki
cassandra load balancing
database design - What's The Best Practice In Designing A Cassandra Data Model
Cassandra: RandomPartitioner vs OrderPreservingPartitioner « Dominic Williams
driftx/chiton - GitHub
Apache Cassandra Glossary
API - Cassandra Wiki
User Guide - GitHub
Hector – a Java Cassandra client | PrettyPrint.me
zznate/cassandra-tutorial - GitHub
up and running with cassandra