
Consistent-hashing



Package Index: hash_ring 1.2. Implements consistent hashing in Python, using md5 as the hash function. It is intended for cases where the number of server nodes can increase or decrease (as in memcached). The hash ring is built using the same algorithm as libketama. Consistent hashing is a scheme that provides hash table functionality in such a way that adding or removing one slot does not significantly change the mapping of keys to slots. More about hash_ring can be read in a blog post that explains the idea in greater detail: Consistent hashing implemented simply in python. There is also a wrapper, MemcacheRing, that extends python-memcache to use consistent hashing for key distribution. The package page includes a basic usage example (for managing memcached instances), an example using weights, and an example of how to use MemcacheRing. The code should be clean and simple.
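To make the description above concrete, here is a minimal sketch of an md5-based ring with per-node weights. It is not the hash_ring package's code or API: the class name `SketchRing`, the `replicas` parameter, the `"node-i"` point labels, and the `cache-a`/`cache-b` addresses are all assumptions made for illustration, and the point placement does not attempt to match libketama's.

```python
import bisect
import hashlib


class SketchRing:
    """Illustrative consistent-hash ring; NOT the hash_ring package's API."""

    def __init__(self, nodes, weights=None, replicas=100):
        weights = weights or {}
        points = []
        for node in nodes:
            # A node with weight w is placed on the ring w * replicas times,
            # so it owns a proportionally larger share of the key space.
            for i in range(replicas * weights.get(node, 1)):
                points.append((self._hash(f"{node}-{i}"), node))
        points.sort()
        self._points = points
        self._hashes = [point_hash for point_hash, _ in points]

    @staticmethod
    def _hash(value):
        # md5 is used here only to spread values evenly, not for security.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def get_node(self, key):
        """Return the node owning `key`, or None if the ring is empty."""
        if not self._points:
            return None
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._points)
        return self._points[idx][1]


# Hypothetical memcached addresses; cache-a is given twice the weight.
ring = SketchRing(
    ["cache-a:11211", "cache-b:11211"],
    weights={"cache-a:11211": 2},
)
server = ring.get_node("my_key")
```

With many points per node, a weight of 2 should give `cache-a` roughly two thirds of the keys, though the exact split depends on where the hashes happen to fall.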

Web Caching with Consistent Hashing. By David Karger, Alex Sherman, Andy Berkheimer, Bill Bogstad, Rizwan Dhanidina, Ken Iwamoto, Brian Kim, Luke Matkins, Yoav Yerushalmi. MIT Laboratory for Computer Science. Abstract: A key performance measure for the World Wide Web is the speed with which content is served to users. An important issue in many caching systems is how to decide what is cached where at any given time. In this paper, we offer a new web caching strategy based on consistent hashing. Consistent hashing provides an alternative to multicast and directory schemes, and has several other advantages in load balancing and fault tolerance.

Keywords: Caching, Hashing, Consistent

Tom White: Consistent Hashing. I've bumped into consistent hashing a couple of times lately. The paper that introduced the idea (Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web by David Karger) appeared ten years ago, although recently it seems the idea has quietly been finding its way into more and more services, from Amazon's Dynamo to memcached (courtesy of Last.fm).

So what is consistent hashing and why should you care? It would be nice if, when a cache machine was added, it took its fair share of objects from all the other cache machines. Equally, when a cache machine was removed, it would be nice if its objects were shared between the remaining machines. This is exactly what consistent hashing does: it maps objects to the same cache machine, as far as is possible, at least. The basic idea behind the consistent hashing algorithm is to hash both objects and caches using the same hash function. Demonstration. Let's look at this in more detail.

Consistent hashing and random trees: Distributed caching protoco. Abstract: We describe a family of caching protocols for distributed networks that can be used to decrease or eliminate the occurrence of hot spots in the network.
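The basic idea quoted above, hashing both objects and caches with the same hash function, can be sketched in a few lines. This is an illustration only, not code from any of the excerpted pages: the function names `ring_hash` and `owner`, the cache names `A` to `D`, and the choice of md5 are assumptions. Each cache gets a single point on the ring here, so a new cache takes objects from just one neighbour; spreading the load across all caches would need several points per cache.

```python
import bisect
import hashlib


def ring_hash(value):
    # The same hash function is applied to cache names and object keys.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def owner(key, caches):
    """Return the cache whose point is first clockwise from the key's point."""
    ring = sorted((ring_hash(cache), cache) for cache in caches)
    hashes = [point for point, _ in ring]
    idx = bisect.bisect(hashes, ring_hash(key)) % len(ring)
    return ring[idx][1]


keys = [f"object-{i}" for i in range(1000)]
before = {k: owner(k, ["A", "B", "C"]) for k in keys}
after = {k: owner(k, ["A", "B", "C", "D"]) for k in keys}

# Objects whose cache changed when D was added.
moved = [k for k in keys if before[k] != after[k]]

# Every object that moved now lives on the new cache D;
# nothing was reshuffled among A, B, and C.
assert all(after[k] == "D" for k in moved)
```

The final assertion holds by construction: adding a point to the ring only splits one existing arc, so the only objects that change owner are those that now fall just before the new point.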

Understanding Consistent Hashing | Spiteful.com. Next up in the toolbox series is an idea so good it deserves an entire article all to itself: consistent hashing. Let’s say you’re a hot startup and your database is starting to slow down. You decide to cache some results so that you can render web pages more quickly. If you want your cache to use multiple servers (scale horizontally, in the biz), you’ll need some way of picking the right server for a particular key. If you only have 5 to 10 minutes allocated for this problem on your development schedule, you’ll end up using what is known as the naïve solution: put your N server IPs in an array and pick one using key % N. I kid, I kid; I know you don’t have a development schedule.
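The naïve key % N scheme is easy to write down, and a small experiment shows what happens to it when N changes. The sketch below is illustrative: the server IPs and the `user:` key names are made up, and hashing string keys to integers with md5 before taking the modulus is an assumption (the excerpt only says key % N).

```python
import hashlib


def pick_server(key, servers):
    # Turn the string key into an integer, then take it modulo N.
    digest = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return servers[digest % len(servers)]


servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]  # hypothetical IPs
keys = [f"user:{i}" for i in range(10_000)]

# Map every key with 4 servers, then again after adding a fifth.
before = {k: pick_server(k, servers) for k in keys}
after = {k: pick_server(k, servers + ["10.0.0.5"]) for k in keys}

moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.0%} of keys changed server")
```

For uniformly distributed hashes, a key keeps its server when going from 4 to 5 servers only if its hash modulo 20 is less than 4, so about 80% of keys should be expected to move. That mass remapping is the cost that consistent hashing is designed to avoid.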

That’s OK. You’re a startup. Anyway, this ultra simple solution has some nice characteristics and may be the right thing to do. You’ll have a second problem if your cache is read-through or you have some sort of processing occurring alongside your cached data. As I said, though, that might be OK.