Map/Reduce. Overview[edit] MapReduce is a framework for processing parallelizable problems across huge datasets using a large number of computers (nodes), collectively referred to as a cluster (if all nodes are on the same local network and use similar hardware) or a grid (if the nodes are shared across geographically and administratively distributed systems, and use more heterogenous hardware).
Processing can occur on data stored either in a filesystem (unstructured) or in a database (structured). MapReduce can take advantage of locality of data, processing it on or near the storage assets in order to reduce the distance over which it must be transmitted. Consistent hashing. Consistent hashing is a special kind of hashing such that when a hash table is resized and consistent hashing is used, only keys need to be remapped on average, where is the number of keys, and.