Phases of MapReduce

A MapReduce program consists of three phases:

Mapper
Sort and Shuffle
Reducer

Of the three phases, the Mapper and the Reducer are the ones we implement directly in code, whereas the Sort and Shuffle phase acts as the glue between them and is handled by the framework. As developers, we are responsible for writing the code for the Mapper and Reducer phases. The following figure shows how MapReduce processes data.

Data processing in MapReduce

The figure above shows that the Mapper phase takes its input as key-value pairs (K,V) and generates its output as key-value pairs (K,V) as well.

The Sort and Shuffle phase takes (K,V) pairs as input and generates output in the form of a key and the list of values for that key, (K, List(v)).

The Reducer phase takes (K, List(v)) as input and generates (K,V) pairs as output. The output of the Reducer phase is the final output of the job.
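To make the (K,V) flow concrete, here is a minimal word-count sketch in plain Python. It is only a single-process simulation of the three phases, not the Hadoop API: the function names (mapper, sort_and_shuffle, reducer, run_job) and the sample records are made up for illustration, and in a real job the framework performs the Sort and Shuffle step for us.

```python
from collections import defaultdict


def mapper(key, value):
    """Mapper: takes one (K,V) pair and emits zero or more (K,V) pairs."""
    # Here the key is a line number (unused) and the value is the line text.
    for word in value.split():
        yield (word.lower(), 1)


def sort_and_shuffle(pairs):
    """Sort and Shuffle: turns (K,V) pairs into sorted (K, List(v)) pairs."""
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    # Keys are unique after grouping, so this sorts by key.
    return sorted(groups.items())


def reducer(key, values):
    """Reducer: takes (K, List(v)) and emits one (K,V) pair."""
    return (key, sum(values))


def run_job(records):
    """Chain the three phases in order (hypothetical driver for this sketch)."""
    mapped = [pair for k, v in records for pair in mapper(k, v)]
    grouped = sort_and_shuffle(mapped)
    return [reducer(k, vs) for k, vs in grouped]


# Sample input: (line number, line text) pairs, invented for this example.
records = [(0, "big data big"), (1, "data tools")]
print(run_job(records))
```

For the sample input, the Mapper emits pairs such as ("big", 1), the Sort and Shuffle step groups them into ("big", [1, 1]), and the Reducer sums each list to produce the final count per word.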

The diagram may not be fully clear at this point, but a rough idea of the flow is enough for now. The picture will become clearer in future posts.
