
Cassandra


Hbase

Redis

[CASSANDRA-192] Load balancing

The Karger/Ruhl paper (really Ruhl's, since it is based on his thesis) gives two load balancing algorithms.

One is based again on each machine having several virtual nodes, but the load is balanced by activating only one node per machine. Each machine picks its active node based on how evenly it partitions the address space. This would be easy to implement in Cassandra for our random hash-based partitioner: since only one node is active at a time, changing nodes maps essentially to a token change in Cassandra, with no further changes necessary. It does not help order-preserving partitioning, however, where we cannot tell how evenly the address space (which is the same as the key space) is partitioned.

The second Ruhl algorithm assumes neither the ability to measure the address space nor virtual nodes.

I ran across "A case study in building layered DHT applications" while doing some research on implementing load balancing in Cassandra.
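To make the first algorithm above a bit more concrete, here is a rough sketch of the "several virtual nodes, only one active" idea on a hash ring normalized to [0, 1). Everything in it is an assumption for illustration: the helper names (`candidate_tokens`, `largest_arc`, `rebalance`), the use of SHA-256 to derive candidate positions, and above all the greedy selection rule (each machine moves to whichever of its candidates minimizes the largest arc owned by any machine). That rule is a simple stand-in for "how evenly it partitions the address space", not the selection rule from Ruhl's thesis, and none of this is Cassandra code.

```python
import hashlib
from typing import Dict, List


def candidate_tokens(machine: str, count: int) -> List[float]:
    """Hypothetical helper: derive `count` virtual-node positions in [0, 1)."""
    tokens = []
    for i in range(count):
        digest = hashlib.sha256(f"{machine}:{i}".encode()).digest()
        tokens.append(int.from_bytes(digest[:8], "big") / 2**64)
    return tokens


def largest_arc(tokens: List[float]) -> float:
    """Largest share of the unit ring owned by any single active token."""
    if len(tokens) == 1:
        return 1.0  # a lone node owns the whole ring
    ring = sorted(tokens)
    # Each token owns the arc back to its predecessor; index -1 wraps around.
    return max((ring[i] - ring[i - 1]) % 1.0 for i in range(len(ring)))


def rebalance(machines: List[str], per_machine: int = 4,
              rounds: int = 3) -> Dict[str, float]:
    """Greedy stand-in: each machine activates exactly one of its candidates."""
    candidates = {m: candidate_tokens(m, per_machine) for m in machines}
    # Start with every machine on its first candidate position.
    active = {m: candidates[m][0] for m in machines}
    for _ in range(rounds):
        for m in machines:
            others = [t for name, t in active.items() if name != m]
            # Switching the active node is just a token change for machine m.
            active[m] = min(candidates[m],
                            key=lambda t: largest_arc(others + [t]))
    return active


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(8)]
    before = largest_arc([candidate_tokens(m, 4)[0] for m in nodes])
    after = largest_arc(list(rebalance(nodes).values()))
    print(f"largest arc before: {before:.3f}, after: {after:.3f}")
```

Because a machine's current position is always among its own candidates, each step can only keep or shrink the largest arc, so the sketch never makes the balance worse. It also shows why this approach depends on measuring the address space: `largest_arc` only means something when tokens are spread by a hash, which is exactly what an order-preserving partitioner lacks.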

Why you won't be building your killer app on a distributed

The question addressed is, "Are DHTs a general-purpose tool that you can build more sophisticated services on?" Short version: no. A few specialized applications can be, and have been, built on a plain DHT, but most applications built on DHTs have ended up having to customize the DHT's internals to achieve their functional or performance goals. This paper describes the results of attempting to build a relatively complex data structure (prefix hash trees, for range queries) on top of OpenDHT.

The result was mostly failure: