Hadoop Ecosystem: Part I

When we talk about Hadoop, we are talking about Big Data, which is commonly characterized by four V’s:

  • Velocity, Variety, Veracity and Volume.

Hadoop handles the following types of data:

  • Structured, semi-structured and unstructured.

Hadoop is defined by the following features:

  • It is an open-source framework for processing large data sets.
  • It is built for commodity hardware, removing the need for specialized hardware.
  • It has high fault tolerance.
  • It is highly scalable.
  • It replicates each block of data 3 times by default.

 

Components of Hadoop

  • Hadoop Common
  • HDFS
  • YARN
  • MapReduce

Hadoop Common: This module consists of the common utilities that support the other Hadoop modules.

HDFS (Hadoop Distributed File System): HDFS is a distributed file system with a master/slave architecture. It has two types of nodes: a NameNode and one or more DataNodes. The NameNode stores the metadata and the log of changes for all the data living on the DataNodes, whereas the DataNodes store the data itself in blocks.
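To make the block and replica idea concrete, here is a small sketch in plain Python. It is not Hadoop code: the function name plan_blocks and the DataNode names are made up for illustration, the 128 MB block size is an assumption (a common default in recent Hadoop versions), and the round-robin placement is a simplification of the rack-aware policy that real HDFS uses.

```python
import math

BLOCK_SIZE_MB = 128   # assumption: a common default block size, not stated in this post
REPLICATION = 3       # the default replication factor mentioned above


def plan_blocks(file_size_mb, datanodes,
                block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Toy NameNode-style metadata: block index -> DataNodes holding a replica."""
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    placement = {}
    for block in range(num_blocks):
        # Hypothetical round-robin placement; each replica lands on a different node.
        placement[block] = [
            datanodes[(block + r) % len(datanodes)] for r in range(replication)
        ]
    return placement


# A 300 MB file on a hypothetical four-node cluster.
placement = plan_blocks(300, ["dn1", "dn2", "dn3", "dn4"])
raw_storage_mb = 300 * REPLICATION  # every byte is stored three times

for block, nodes in placement.items():
    print(f"block {block}: {nodes}")
print(f"raw storage used: {raw_storage_mb} MB")
```

The dictionary plays the role of the NameNode's metadata, while the lists of node names stand in for the DataNodes that actually hold the blocks. Losing any single node still leaves two copies of every block, which is where the fault tolerance comes from.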

YARN (Yet Another Resource Negotiator): YARN is the component that takes over the roles of resource manager and job scheduler. This framework comprises the ResourceManager as well as the NodeManager.
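The split between the two roles can be pictured with a toy model in plain Python. This is only a sketch, not YARN's API: the class and method names are hypothetical, resources are reduced to memory alone, and the "pick the node with the most free memory" rule is an assumption standing in for YARN's real pluggable schedulers.

```python
class NodeManager:
    """Toy stand-in for a per-machine agent that reports free memory."""

    def __init__(self, name, memory_mb):
        self.name = name
        self.free_mb = memory_mb


class ResourceManager:
    """Toy stand-in for the cluster-wide arbiter of resources."""

    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, app, memory_mb):
        # Hypothetical policy: choose the node with the most free memory.
        node = max(self.nodes, key=lambda n: n.free_mb)
        if node.free_mb < memory_mb:
            return None  # no node can host this container right now
        node.free_mb -= memory_mb
        return (app, node.name, memory_mb)


rm = ResourceManager([NodeManager("nm1", 4096), NodeManager("nm2", 2048)])
c1 = rm.allocate("job-1", 2048)  # goes to nm1, which has the most free memory
c2 = rm.allocate("job-1", 2048)  # tie at 2048 MB each; max() keeps the first, nm1
c3 = rm.allocate("job-2", 2048)  # nm1 is now full, so nm2 is chosen
c4 = rm.allocate("job-2", 1024)  # cluster is full, request is refused
print(c1, c2, c3, c4)
```

The point of the sketch is the division of labour: one ResourceManager decides where work may run, and each NodeManager only tracks what is available on its own machine.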

MapReduce: MapReduce is the software framework that handles the parallel processing of large data sets on large clusters running HDFS. MapReduce uses key-value pairs to carry out processing. A MapReduce job mostly involves three steps: Map, Shuffle and Reduce. Here is an example for better understanding.

[Image: MapReduce example]

(You can read about MapReduce in detail, with an example, on Apache’s MapReduce page.)
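As a rough illustration of the Map, Shuffle and Reduce steps, here is a minimal word-count sketch in plain Python. It runs on a single machine and does not use Hadoop's API; the function names and the three input lines are made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) key-value pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["Deer Bear River", "Car Car River", "Deer Car Bear"])
print(counts)  # {'deer': 2, 'bear': 2, 'river': 2, 'car': 3}
```

On a real cluster each input line (or block) would be mapped on a different node and the shuffle would move pairs across the network, but the flow of key-value pairs is the same.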

If you think you can add to the topic, feel free to comment below.

Happy Learning!!

Reference: Apache Hadoop
