hadoop

mapreduce

replication

If you’re using random numbers in your MapReduce jobs, you could be suffering from data loss. The cause is subtle: it stems from how Hadoop handles TaskTrackers that are lost in the middle of a job.

Using random numbers in Hadoop MapReduce is dangerous | Engineer

http://blog.rapleaf.com/dev/2009/08/14/using-random-numbers-in-mapreduce-is-dangerous/
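
A toy illustration of how this kind of loss can arise. This is a plain Python simulation, not Hadoop code, and every name in it is hypothetical. It assumes the mechanism is roughly this: a map task splits its records across reducers, one reducer fetches its share, the node holding the map output is lost, the map task is re-executed, and the remaining reducer fetches from the second attempt. If the split depends on random numbers, the two attempts disagree and some records can be dropped or delivered twice.

```python
import random
import zlib
from collections import Counter

# Hypothetical input for a single map task.
RECORDS = [f"record-{i}" for i in range(64)]
NUM_REDUCERS = 2


def map_attempt(records, partition_fn):
    """One execution of the map task: bucket each record by reducer index."""
    buckets = [[] for _ in range(NUM_REDUCERS)]
    for rec in records:
        buckets[partition_fn(rec)].append(rec)
    return buckets


def random_partition(rec):
    # Non-deterministic: a re-run of the task may route the record elsewhere.
    return random.randrange(NUM_REDUCERS)


def hashed_partition(rec):
    # Deterministic: the reducer index depends only on the record itself.
    return zlib.crc32(rec.encode("utf-8")) % NUM_REDUCERS


def run_job(partition_fn):
    """Simulate a lost node (assumed scenario): reducer 0 reads attempt 1,
    the map task is re-executed, and reducer 1 reads attempt 2."""
    attempt1 = map_attempt(RECORDS, partition_fn)
    attempt2 = map_attempt(RECORDS, partition_fn)
    return attempt1[0] + attempt2[1]


def audit(delivered):
    """Return (lost, duplicated) records relative to the original input."""
    counts = Counter(delivered)
    lost = [r for r in RECORDS if counts[r] == 0]
    duplicated = [r for r in RECORDS if counts[r] > 1]
    return lost, duplicated


if __name__ == "__main__":
    for name, fn in [("random", random_partition), ("hashed", hashed_partition)]:
        lost, duplicated = audit(run_job(fn))
        print(f"{name}: {len(lost)} lost, {len(duplicated)} duplicated")
```

In this sketch the hashed split gives the same buckets on every attempt, so nothing is lost or duplicated. Deriving any randomness from the record or from a fixed per-task seed is one way to get the same property; the linked post is the place to check what its authors actually recommend.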

Running Hadoop On Ubuntu Linux (Single-Node Cluster) - Michael G

In this tutorial, I will describe how to set up a single-node Hadoop cluster.

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/