Interview Questions
MapReduce
To enable parallel processing of large datasets stored in the Hadoop Distributed File System (HDFS), the Apache MapReduce software framework is integrated into the Hadoop architecture. A typical Hadoop environment comprises thousands of nodes built from commodity hardware, so applications written with the MapReduce framework are needed to process that data reliably and in a fault-tolerant way.
The Tasks and Terminology
The name MapReduce comes from the two specific tasks carried out by the software framework: the map task and the reduce task.
Map Task
In this initial task, the data read from the data source is converted into key/value pairs. These pairs form the base data format that is passed on to the next step, the reduce task.
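For instance, a mapper for the classic word-count job might look like the following minimal sketch using Hadoop's Java MapReduce API (the class name WordCountMapper is illustrative, not part of Hadoop): each input line is split into words, and each word is emitted as a (word, 1) key/value pair.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Mapper that turns each input line into (word, 1) key/value pairs.
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Split the line into tokens and emit each token with a count of 1.
        StringTokenizer tokens = new StringTokenizer(value.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);
        }
    }
}
```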
Reduce Task
After the map task, the generated key/value pairs are combined into a smaller set of tuples, grouped by key.
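A matching reducer sketch for the same word-count example (again illustrative; the class name WordCountReducer is assumed) sums the counts emitted by the mapper for each word:

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Reducer that sums the counts emitted by the mapper for each word.
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        result.set(sum);
        context.write(key, result);   // e.g. ("hadoop", 42)
    }
}
```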
While these tasks run, the framework itself takes care of scheduling, monitoring, executing, and re-executing them.
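This is visible in the driver program: a minimal sketch (assuming Hadoop 2.x or later and the illustrative WordCountMapper and WordCountReducer classes above) only wires the mapper and reducer together and submits the job, leaving scheduling, monitoring, and re-execution of failed tasks to the framework.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Driver that configures the job; the framework handles scheduling,
// monitoring, and re-execution of the individual tasks.
public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Submit the job and block until it finishes.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```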
Master and Slave
The MapReduce framework consists of a single master JobTracker and one slave TaskTracker per cluster node. Together they are responsible for task management and execution.
Master JobTracker
As the master, the JobTracker is responsible for creating the tasks of a MapReduce job and assigning them to TaskTrackers so that they run on the DataNodes holding the data. After running a task, the TaskTracker reports its status back to the JobTracker. The JobTracker also keeps track of the resources available and the resources consumed across the cluster. Because the JobTracker is the single point through which MapReduce runs its jobs, its failure brings all MapReduce processing to a halt.
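As an illustration of the JobTracker's resource-tracking role, the classic MRv1 client API can be used to ask it for a snapshot of cluster resources. The sketch below assumes a Hadoop 1.x installation where the JobClient and ClusterStatus classes are available.

```java
import org.apache.hadoop.mapred.ClusterStatus;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Queries the JobTracker for a snapshot of cluster-wide resources.
public class ClusterStatusCheck {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        JobClient client = new JobClient(conf);        // connects to the JobTracker
        ClusterStatus status = client.getClusterStatus();

        System.out.println("Active TaskTrackers : " + status.getTaskTrackers());
        System.out.println("Running map tasks   : " + status.getMapTasks());
        System.out.println("Map task capacity   : " + status.getMaxMapTasks());
        System.out.println("Running reduce tasks: " + status.getReduceTasks());
        System.out.println("Reduce task capacity: " + status.getMaxReduceTasks());
    }
}
```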
Slave TaskTracker
The TaskTracker executes the tasks assigned to it by the JobTracker. While running them, it keeps the JobTracker informed of the task status at regular intervals through heartbeat messages.