TechAlpine – All About Technology

www.techalpine.com

What is Map/Reduce in Hadoop?

Overview: Processing vast amounts of data (multi-terabyte data sets) is a major concern in real-life projects. As data volumes grow day by day, applications find it increasingly difficult to process that data in a reliable, secure and fault-tolerant way.

This is where the Hadoop Map/Reduce framework comes in. It makes it easier to write applications that process vast amounts of data while addressing all of these concerns. Its biggest advantage is that Map/Reduce allows parallel processing on large clusters of commodity hardware.

The main concepts behind Map/Reduce are:

a) Split the job into independent tasks (known as Map tasks).
b) Process the tasks in parallel.
c) Sort the output of the Map tasks and send it as input to the Reduce tasks.
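
The three steps above can be sketched in plain Python, using word counting as the job. This is a minimal in-memory illustration, not the Hadoop API: the function names (map_task, shuffle_and_sort, reduce_task, run_job) and the sample chunks are assumptions made for this example, and a thread pool stands in for the nodes of a cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_task(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_and_sort(map_outputs):
    """Group all mapped values by key and return the groups sorted by key."""
    groups = defaultdict(list)
    for pairs in map_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return sorted(groups.items())


def reduce_task(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(chunks, workers=4):
    # a) + b): each chunk is an independent map task, run in parallel
    with ThreadPoolExecutor(max_workers=workers) as pool:
        map_outputs = list(pool.map(map_task, chunks))
    # c): sort the map output and feed each key group to a reduce task
    return dict(reduce_task(k, v) for k, v in shuffle_and_sort(map_outputs))


if __name__ == "__main__":
    chunks = ["big data big cluster", "data node", "big node"]
    print(run_job(chunks))  # {'big': 3, 'cluster': 1, 'data': 2, 'node': 2}
```

Because each chunk is mapped independently, the map step can run on as many workers as there are chunks. Only the sort-and-group step needs to see the output of all the maps.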

In short, Map/Reduce processes chunks of data in parallel and then combines the partial results to produce the end result. The framework takes care of scheduling the tasks, monitoring them and re-executing any tasks that fail.
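
The re-execution of failed tasks mentioned above can be illustrated with a small Python sketch. It shows only the general retry idea, not Hadoop's actual scheduler: run_with_retries, make_flaky_task and the limit of three attempts are hypothetical names and values chosen for this example.

```python
def run_with_retries(task, chunk, max_attempts=3):
    """Run task(chunk); re-execute it if it raises, up to max_attempts times.

    Returns the task result together with the number of attempts used.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task(chunk), attempt
        except Exception as error:  # a real scheduler would inspect the failure
            last_error = error
    raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error


def make_flaky_task(failures):
    """Hypothetical task that fails `failures` times and then succeeds."""
    state = {"remaining": failures}

    def task(chunk):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise IOError("simulated node failure")
        return len(chunk.split())  # the "work": count the words in the chunk

    return task


if __name__ == "__main__":
    result, attempts = run_with_retries(make_flaky_task(2), "big data big cluster")
    print(result, attempts)  # 4 3
```

Retrying is safe here because each task depends only on its own input chunk, so running it again produces the same result without affecting the other tasks.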

Note: Commodity hardware is a term for affordable, widely available devices that are generally compatible with other such devices.