Working of Hadoop
The following are the stages Hadoop goes through when executing a job.
Submitting a Job to the Hadoop Job Client
First, to run a job in Hadoop, the user submits the job request to the Hadoop job client along with all the information essential to run it. The following details are required (see the sketch after this list):
- The location of the input files and the location in the distributed file system where the output should be stored.
- The JAR file containing the Java classes that implement the map and reduce functions.
- The configuration parameters of the job.
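As a concrete illustration, here is a minimal sketch of a classic MRv1 driver that supplies these three pieces of information through the org.apache.hadoop.mapred API. The HDFS paths are placeholders, and the identity mapper/reducer stand in for the user's own classes packaged in the job JAR.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class PassThroughDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(PassThroughDriver.class);
        conf.setJobName("pass-through");

        // (1) Input location and output location in the distributed file system.
        FileInputFormat.setInputPaths(conf, new Path("/user/hadoop/input"));
        FileOutputFormat.setOutputPath(conf, new Path("/user/hadoop/output"));

        // (2) The classes implementing map and reduce; a real job would plug in
        // its own Mapper/Reducer here instead of the identity implementations.
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(IdentityReducer.class);

        // (3) Configuration parameters of the job, e.g. the output key/value
        // types (LongWritable offsets and Text lines under TextInputFormat).
        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);

        // Hand the fully described job to the Hadoop job client and block
        // until the JobTracker reports completion.
        JobClient.runJob(conf);
    }
}
```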
Forwarding the Job to the JobTracker
Once the job is submitted to the Hadoop job client, it is forwarded to the master JobTracker along with the configuration settings. Thereafter, the JobTracker takes care of the remaining stages of the job: it schedules tasks on the TaskTrackers (slaves), distributes the necessary configuration and JAR/executable files to them, and provides status and diagnostic information back to the job client.
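For instance, a job can ask the framework to ship side files and extra JARs to the slave nodes before its tasks start. A minimal sketch, assuming the MRv1 DistributedCache API and hypothetical HDFS paths:

```java
import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class SideFileSetup {
    // Called on the client before job submission; the paths are placeholders.
    static void addSideFiles(JobConf conf) throws Exception {
        // The framework copies this HDFS file to every slave node that will
        // run a task, so the tasks can read it locally.
        DistributedCache.addCacheFile(new URI("/user/hadoop/lookup.dat"), conf);
        // Extra JARs can be put on the tasks' classpath the same way.
        DistributedCache.addFileToClassPath(new Path("/user/hadoop/extra-lib.jar"), conf);
    }
}
```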
Execution by the TaskTracker
A TaskTracker runs on every node connected to the Hadoop cluster and is controlled by the JobTracker. Once a task is assigned by the JobTracker, the TaskTracker runs it and stores the result at the desired output location. During this stage, the JobTracker is notified of the task's status periodically.
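A minimal sketch of how a client can observe this periodic status reporting, assuming the MRv1 JobClient API and a conf configured as in the earlier driver:

```java
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class JobMonitor {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(JobMonitor.class);
        // ... input/output paths and mapper/reducer classes set as above ...

        JobClient client = new JobClient(conf);
        RunningJob job = client.submitJob(conf);  // non-blocking submission

        // Poll the progress the JobTracker aggregates from the periodic
        // TaskTracker status reports.
        while (!job.isComplete()) {
            System.out.printf("map %.0f%%  reduce %.0f%%%n",
                    job.mapProgress() * 100, job.reduceProgress() * 100);
            Thread.sleep(5000);
        }
        System.out.println(job.isSuccessful() ? "Job succeeded" : "Job failed");
    }
}
```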