Thursday, March 14, 2013

Adding Scheduler to Hadoop Cluster

 

As we know we we execute task or jobs on hadoop it follows FIFO Scheduling, but if you are in multi user hadoop environment the you will need better scheduler for the consistency and correctness of the task scheduling.

Hadoop comes with other schedulers too those are:

Fair Schedulers : This defines pools and over time; each pool gets around the same amount of resources.

Capacity Schedulers : This defines queues, and each queue has a guaranteed capacity. The capacity scheduler shares computer resources allocated to a queue with other queues if those resources are not in use.

For changing the scheduler you need to take your cluster offline and make some configuration changes, first make sure that the correct scheduler jar files are there. In older version of hadoop you need to put the jar file if not ther in lib directory but from hadoop 1 these jars available in the lib folder and if you are using the newer hadoop good news for you Smile

Steps will be:

  1. Shutdown your cluster
  2. Add this property to the file mapred-site.xml
    1. <property>
      <name>mapred.jobtracker.taskScheduler</name>
      <value>org.apache.hadoop.mapred.FairScheduler</value>
      </property>
  3. Start the cluster.
  4. To verify go to browser and type
    1.  http://namenodehostname:50030/scheduler

That’s it Surprised smile Yes Smile Enjoy Scheduling.

No comments:

Post a Comment

Thank you for Commenting Will reply soon ......

Featured Posts

#Linux Commands Unveiled: #date, #uname, #hostname, #hostid, #arch, #nproc

 #Linux Commands Unveiled: #date, #uname, #hostname, #hostid, #arch, #nproc Linux is an open-source operating system that is loved by millio...