Phases of MapReduce

Posted at  21:52  |  in  MapReduce

A MapReduce program consists of three phases:
  • Mapper
  • Sort and Shuffle
  • Reducer
Of the three phases, only the Mapper and the Reducer are implemented directly in code, and writing them is the developer's responsibility. The Sort and Shuffle phase is handled by the framework and acts as the glue between the Mapper and the Reducer. The following figure shows how MapReduce processes data.
 
[Figure: Data processing in MapReduce]
As the figure above shows, the Mapper phase takes its input as key-value pairs (K,V) and generates its output as key-value pairs (K,V) as well.
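
To make the Mapper's (K,V) in, (K,V) out contract concrete, here is a minimal sketch in plain Python. It is not Hadoop code. It assumes a word-count job, which this post does not specify, and the function name `map_phase` and the use of a line offset as the input key are hypothetical choices made only for illustration.

```python
def map_phase(offset, line):
    """Take one input pair (K, V) = (line offset, line text) and
    emit one output pair (K, V) = (word, 1) per word in the line.

    Assumption: a word-count job; the offset key is ignored here.
    """
    for word in line.split():
        yield (word.lower(), 1)


# One input pair goes in, several output pairs come out.
pairs = list(map_phase(0, "the cat saw the dog"))
print(pairs)
```

Note that the same key ("the") can be emitted more than once. Bringing those duplicates together is the job of the next phase.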

The Sort and Shuffle phase takes (K,V) pairs as input and generates output in the form of a key and a list of values, (K, List(V)).
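
The grouping step described above can be imitated in a few lines of plain Python. This is only a single-machine sketch of the idea: in a real cluster the framework does this work across many nodes. The function name `sort_and_shuffle` and the sample pairs are hypothetical.

```python
from collections import defaultdict


def sort_and_shuffle(mapped_pairs):
    """Group (K, V) pairs by key and return (K, List(V)) pairs,
    sorted by key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    # Keys are unique after grouping, so sorting the items orders them by key.
    return sorted(groups.items())


# Hypothetical mapper output for a word-count job.
mapped = [("the", 1), ("cat", 1), ("saw", 1), ("the", 1), ("dog", 1)]
grouped = sort_and_shuffle(mapped)
print(grouped)
```

Every value that shares a key ends up in one list, so the two ("the", 1) pairs become a single ("the", [1, 1]) entry.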

The Reducer phase takes (K, List(V)) as input and generates (K,V) pairs as output. The output of the Reducer phase is the final output.
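
Putting the three phases together, the whole flow can be sketched on a single machine in plain Python. This is an illustration under assumptions, not how Hadoop itself is written: the word-count job, the function names (`map_phase`, `sort_and_shuffle`, `reduce_phase`, `run_job`) and the sample input are all hypothetical, and a line index stands in for the input key.

```python
from collections import defaultdict


def map_phase(index, line):
    # (K, V) in: (line index, line text). (K, V) out: (word, 1).
    for word in line.split():
        yield (word.lower(), 1)


def sort_and_shuffle(mapped_pairs):
    # (K, V) in, (K, List(V)) out, sorted by key.
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(word, counts):
    # (K, List(V)) in: (word, [1, 1, ...]). (K, V) out: (word, total).
    yield (word, sum(counts))


def run_job(lines):
    """Chain the three phases: Mapper, Sort and Shuffle, Reducer."""
    mapped = []
    for index, line in enumerate(lines):
        mapped.extend(map_phase(index, line))
    final_output = []
    for key, values in sort_and_shuffle(mapped):
        final_output.extend(reduce_phase(key, values))
    return final_output


final_output = run_job(["the cat saw the dog", "the dog ran"])
print(final_output)
```

Only `map_phase` and `reduce_phase` contain job-specific logic here, which mirrors the point that the developer writes the Mapper and the Reducer while the grouping in between stays generic.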
The diagram may not be entirely clear at this point. For now, a general idea is enough; later posts will give a clearer picture.
                                    


