Phases of MapReduce

Posted at 21:52 | in MapReduce

A MapReduce program consists of three phases:
  • Mapper
  • Sort and Shuffle
  • Reducer
Of the three phases, only the Mapper and the Reducer are implemented directly in code, whereas the Sort and Shuffle phase acts as the glue between them. As developers, we are responsible for writing the code for the Mapper and Reducer phases. The following figure shows how MapReduce processes data.
 
Figure: Data processing in MapReduce
The figure above shows that the Mapper phase takes its input as key-value pairs (K, V) and generates its output as key-value pairs (K, V) as well.

The Sort and Shuffle phase takes (K, V) pairs as input and generates its output in the form of a key and the list of values for that key (K, List(V)).

The Reducer phase takes (K, List(V)) as input and generates (K, V) as output. The output of the Reducer phase is the final output.
The diagram may not be fully clear at this point; for now, just get the general idea. Future posts will give a clearer picture.
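To make the (K, V) flow through the three phases more concrete, here is a minimal sketch in plain Java that imitates them for a word count. It is only an illustration under stated assumptions: it does not use the Hadoop API, it runs in a single process, and the class and method names (`PhasesSketch`, `map`, `sortAndShuffle`, `reduce`, `run`) are hypothetical names chosen for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process imitation of the three phases (not the Hadoop API).
class PhasesSketch {

    // Mapper: one input pair (line offset, line text) -> zero or more (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word.toLowerCase(), 1));
            }
        }
        return out;
    }

    // Sort and Shuffle: group the (K, V) pairs by key, keys in sorted order -> (K, List(V)).
    static TreeMap<String, List<Integer>> sortAndShuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reducer: (K, List(V)) -> (K, V); here V is the sum of the list.
    static Map.Entry<String, Integer> reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return new SimpleEntry<>(key, sum);
    }

    // Chain the three phases over a list of input lines.
    static TreeMap<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        long offset = 0;
        for (String line : lines) {
            mapped.addAll(map(offset, line));
            offset += line.length() + 1; // assumed one-character line terminator
        }
        TreeMap<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : sortAndShuffle(mapped).entrySet()) {
            Map.Entry<String, Integer> reduced = reduce(group.getKey(), group.getValue());
            result.put(reduced.getKey(), reduced.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Hadoop stores data", "Hadoop processes data");
        System.out.println(run(lines)); // prints {data=2, hadoop=2, processes=1, stores=1}
    }
}
```

In this sketch, only `map` and `reduce` correspond to code a developer would write; `sortAndShuffle` stands in for the work the framework does between them.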
                                    



MapReduce Introduction

Posted at 20:51 | in MapReduce

The Hadoop framework is built mainly on two components: HDFS and MapReduce. HDFS is meant for storage and MapReduce is meant for processing. So far we have seen the storage part, HDFS; now let us look at the processing part, MapReduce.

In the Hadoop world, MapReduce is considered one of the major components. It is responsible for processing the huge amounts of data stored on HDFS, and it does this processing in parallel. MapReduce achieves parallelism by means of splits: the data is divided into multiple chunks, and the chunks are processed in parallel.
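The idea of splits can be sketched in plain Java as follows. This is an illustration only, under loud assumptions: a real Hadoop job spreads its splits over the machines of a cluster, whereas this sketch uses threads on one machine through a parallel stream, and the names `SplitSketch`, `split`, `countWords` and `countWordsInParallel` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical single-machine illustration of "divide into chunks, process in parallel".
class SplitSketch {

    // Divide the input into chunks ("splits") of at most splitSize records each.
    static <T> List<List<T>> split(List<T> data, int splitSize) {
        List<List<T>> splits = new ArrayList<>();
        for (int i = 0; i < data.size(); i += splitSize) {
            int end = Math.min(i + splitSize, data.size());
            splits.add(new ArrayList<>(data.subList(i, end)));
        }
        return splits;
    }

    // The work done on a single split: count the words in its lines.
    static long countWords(List<String> lines) {
        long count = 0;
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                count += trimmed.split("\\s+").length;
            }
        }
        return count;
    }

    // Process the splits in parallel and combine the partial counts.
    static long countWordsInParallel(List<String> lines, int splitSize) {
        return split(lines, splitSize)
                .parallelStream()
                .mapToLong(SplitSketch::countWords)
                .sum();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "HDFS is meant for storage",
                "MapReduce is meant for processing",
                "data is divided into chunks");
        // With splitSize 2 the three lines form two splits.
        System.out.println(countWordsInParallel(lines, 2)); // prints 15
    }
}
```

Because each split is processed independently of the others, the partial results can be computed in any order and combined at the end.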

MapReduce is a programming model for data processing. Although it involves programming, the model itself is very simple. Hadoop accepts MapReduce (MR) programs written in different languages, but most people write them in Java.

Copyright © 2013 Hadoop Tutor.