Nowadays we generate huge amounts of data (terabytes to petabytes), and storing and processing data at this scale is a serious problem. This huge amount of data is called BIG DATA, and it is a growing challenge that organizations face today.
To address this problem, Google released white papers on GFS (the Google File System) and MapReduce in the early 2000s. Based on those papers, Doug Cutting developed a new framework called HADOOP. It got its name from a yellow toy elephant named Hadoop that Doug Cutting's son used to play with. It was officially released to the market on 15th Feb, 2011.
Hadoop is an Apache open-source software framework. It includes a number of components that were specifically designed to solve large-scale distributed data storage, analysis, and retrieval tasks. The following are the main components/ecosystems of Hadoop:
- HDFS
- MapReduce
- Apache Pig
- Hive
- Sqoop
- HBase, etc.
Among all the above components, HDFS (Hadoop Distributed File System) and MapReduce are the two we must understand first. HDFS deals with storage, whereas MapReduce deals with processing. Since Hadoop is designed for both storage and processing, these two play the most important roles. A minimal example of how they fit together is sketched below.
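To make the storage/processing split concrete, here is a minimal sketch of the classic word-count job using the standard org.apache.hadoop.mapreduce API. The job reads text files from an HDFS input path, the mapper emits a (word, 1) pair for every token, and the reducer sums the counts for each word. The input and output paths are passed as command-line arguments and are purely illustrative.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: for each line of input, emit (word, 1) for every token.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sum all the 1s emitted for the same word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: wires the mapper and reducer together and points the job
  // at HDFS input/output paths taken from the command line.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input dir
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output dir
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Assuming the class is packaged into a jar, a run would look something like `hadoop jar wordcount.jar WordCount /user/demo/input /user/demo/output` (the jar name and HDFS paths are hypothetical). HDFS stores the input and output blocks across the cluster, while MapReduce moves the computation to where those blocks live.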
Big Data is commonly characterized by three V's, all of which Hadoop is designed to handle:
Volume: The volume of data that Hadoop can store and process is very high.
Variety: Hadoop can process different varieties of data (structured, semi-structured, and unstructured).
Velocity: Hadoop can process huge amounts of data with high velocity, i.e., very quickly.
Using Hadoop we can process three types of data:
- Structured Data (e.g., RDBMS tables)
- Semi-structured Data (e.g., Excel sheets)
- Unstructured Data (e.g., plain text files)