Differences between MapReduce and Pig:
Pig and MapReduce are both used to process data, and they ultimately do the same work: when a Pig program is executed, it is converted internally into one or more MapReduce jobs that process the data. Some of the differences between MapReduce (MR) and Pig are listed below.
| MapReduce | Pig |
| --- | --- |
| Writing the business logic requires programming-language skills (typically Java). | Little programming skill is needed; the entire program is built from Pig transformations. |
| The amount of code is very large. | The amount of code is much smaller: roughly 10 to 15 lines of Pig script can be equivalent to about 200 lines of MapReduce code. |
| A MapReduce program is compiled and executed directly. | A Pig script is converted internally into MapReduce jobs, which are then executed. |
| Writing and executing a MapReduce program is a fairly complex task. | Writing and executing a Pig script is simple by comparison. |
Different modes of Pig Execution:
Pig has two execution modes or types. They are:
- Local Mode
- MapReduce Mode
Now let us see each execution mode in detail.
Local Mode:
In local mode, Pig runs in a single JVM and works against the local file system: all input data is read from the local file system, and the output is written back to it. This mode is suitable only for small datasets and for trying out Pig. To start Pig in local mode, run the following command:
pig -x local
This command starts Grunt, the Pig interactive shell.
MapReduce Mode / HDFS Mode / Cluster Mode:
In this mode, Apache Pig reads its input from HDFS paths and, after processing the data, writes the output files to HDFS. Pig translates the queries into MapReduce jobs and runs them on a Hadoop cluster.
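The execution mode is normally chosen with the -x option of the pig command (pig -x local or pig -x mapreduce). As a minimal sketch, the hypothetical helper below only builds the command line for a given mode. It does not run Pig, and the script name wordcount.pig is a made-up example.

```python
# Execution modes described in this section; Pig selects one with "-x".
VALID_MODES = ("local", "mapreduce")


def pig_command(mode="mapreduce", script=None):
    """Return the argument list for starting Pig in the given mode.

    Without a script, the command opens the interactive Grunt shell.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown Pig execution mode: {mode!r}")
    command = ["pig", "-x", mode]
    if script is not None:
        command.append(script)  # run a script file instead of opening Grunt
    return command


if __name__ == "__main__":
    print(" ".join(pig_command("local")))
    print(" ".join(pig_command("mapreduce", "wordcount.pig")))
```

Passing the returned list to subprocess.run would launch Pig, assuming the pig executable is installed and on the PATH.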
Apache Pig Introduction:
Apache Pig is a transformation-based language and platform for processing data. Pig was originally developed at Yahoo Research around 2006 and entered the Apache Software Foundation (ASF) through the Apache Incubator in 2007. Pig is highly productive compared to hand-written MapReduce because it raises the level of abstraction for processing big data.
Apache Pig is one of the components of the Hadoop ecosystem. It provides a high-level language on top of MapReduce and processes data through a series of transformations. Because the flow of data in a Pig program is defined by these transformations, Pig is described as a transformation language or data flow language.
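The idea of a data flow built from transformations can be sketched in Python, where each step takes a relation (here, a list of tuples) and produces a new one. This is an analogy rather than Pig itself: the sample records and the names filter_rows, group_by, and sum_per_group are hypothetical, and the docstrings name the Pig Latin operator each step loosely corresponds to.

```python
from collections import defaultdict


def filter_rows(rows, predicate):
    """Keep only rows that satisfy the predicate (loosely, Pig's FILTER)."""
    return [row for row in rows if predicate(row)]


def group_by(rows, key_index):
    """Collect rows under their key (loosely, Pig's GROUP ... BY)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row)
    return dict(groups)


def sum_per_group(groups, value_index):
    """Aggregate one field per group (loosely, FOREACH ... GENERATE SUM)."""
    return {
        key: sum(row[value_index] for row in rows)
        for key, rows in groups.items()
    }


if __name__ == "__main__":
    # Hypothetical (city, amount) records standing in for a loaded relation.
    sales = [("delhi", 120), ("pune", 40), ("delhi", 30), ("pune", 200)]
    large = filter_rows(sales, lambda row: row[1] >= 40)  # drop small amounts
    by_city = group_by(large, key_index=0)                # group by city
    totals = sum_per_group(by_city, value_index=1)        # total per city
    print(totals)
```

Each intermediate name (large, by_city, totals) plays the role of a Pig relation: the output of one transformation is the input of the next.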
The language used to write Pig programs is called Pig Latin. Compared to MapReduce, Pig greatly reduces the amount of code: as a rough rule of thumb, 10 to 15 lines of Pig Latin can replace nearly 200 lines of MapReduce code. When a Pig script is run, it is converted internally into MapReduce jobs.
As part of this tutorial you can have a look at the following topics:
- Modes of Pig Execution
- Pig Programming example
- Pig UDFs