Infinite Programming Tips: September 2012

Tuesday, September 25, 2012

ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried

ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 times

Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:

There may be more than one probable cause; here is one of them :)

Check whether the NameNode is in safe mode. If it is, either wait for the NameNode to leave safe mode on its own or, after waiting for about a minute, ask it to leave safe mode with this command:

bin/hadoop dfsadmin -safemode leave

and then try working with HBase again.
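The check-then-leave step above can be sketched as a pair of small shell functions. This is only a sketch under a few assumptions: `HADOOP_BIN` is a hypothetical variable pointing at the hadoop launcher, and the match string "Safe mode is ON" is what `dfsadmin -safemode get` prints on the Hadoop versions I have seen, so verify it against your own installation.

```shell
# Path to the hadoop launcher (hypothetical variable; adjust to your install).
HADOOP_BIN="${HADOOP_BIN:-bin/hadoop}"

# Succeeds (exit status 0) when the NameNode reports that safe mode is ON.
# Assumes "dfsadmin -safemode get" prints a line containing "Safe mode is ON".
safemode_is_on() {
  "$HADOOP_BIN" dfsadmin -safemode get | grep -q "Safe mode is ON"
}

# Asks the NameNode to leave safe mode, but only if it is currently in it.
leave_safemode_if_on() {
  if safemode_is_on; then
    echo "NameNode is in safe mode; asking it to leave"
    "$HADOOP_BIN" dfsadmin -safemode leave
  else
    echo "NameNode is not in safe mode"
  fi
}
```

Calling `leave_safemode_if_on` before starting the HBase shell avoids forcing the NameNode out of safe mode when it has already left on its own.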

Monday, September 24, 2012

Apache Hadoop NextGen MapReduce (YARN)

MapReduce has undergone a complete overhaul in hadoop-0.23 and we now have, what we call, MapReduce 2.0 (MRv2) or YARN.

The fundamental idea of MRv2 is to split up the two major functionalities of the JobTracker, resource management and job scheduling/monitoring, into separate daemons. The idea is to have a global ResourceManager (RM) and per-application ApplicationMaster (AM). An application is either a single job in the classical sense of Map-Reduce jobs or a DAG of jobs.

The ResourceManager and per-node slave, the NodeManager (NM), form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system.

The per-application ApplicationMaster is, in effect, a framework specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.

Check this LINK for more detail

Hadoop High Availability (HA hadoop Cluster)

Note: Currently, only manual failover is supported, which means relying on the operator to initiate a failover by hand. Automatic failure detection and initiation of a failover will be implemented in future versions.

Even the best technology has a demerit, and for Hadoop it is the availability of the NameNode: if the NameNode is down, the whole cluster is down. So Hadoop has added a High Availability (HA) feature, which keeps the cluster available for use even when a NameNode fails.

(I could not think of any other word.)

The High Availability feature addresses this problem by providing the option of running two redundant NameNodes in the same cluster in an Active/Passive configuration with a hot standby. This allows a fast failover to a new NameNode if a machine crashes, or a graceful administrator-initiated failover for planned maintenance.

How is it organised?
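An administrator-initiated failover of the kind described above might look like the sketch below. It assumes the `hdfs haadmin` subcommands `-failover` and `-getServiceState` from the HDFS HA documentation; `HDFS_BIN` and the NameNode IDs `nn1` and `nn2` are hypothetical and must match whatever is configured in your cluster.

```shell
# Path to the hdfs launcher (hypothetical variable; adjust to your install).
HDFS_BIN="${HDFS_BIN:-bin/hdfs}"

# Fails over from the active NameNode ($1) to the standby ($2),
# then prints the state each NameNode reports afterwards.
manual_failover() {
  "$HDFS_BIN" haadmin -failover "$1" "$2" || return 1
  echo "$1: $("$HDFS_BIN" haadmin -getServiceState "$1")"
  echo "$2: $("$HDFS_BIN" haadmin -getServiceState "$2")"
}
```

For example, `manual_failover nn1 nn2` would hand the active role from `nn1` to `nn2` before taking `nn1` down for maintenance.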

Read It In Detail ===>>

Sunday, September 23, 2012

Demystifying Hadoop concepts Series: Safe mode

What is Hadoop's safe mode? Many times we come across the exception "org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException", or some other exception that mentions safe mode.

First, let me explain what safe mode means in the context of Hadoop. As we all know, the NameNode holds the fsimage (metadata) for the data present on the cluster, which can be large or small depending on the size of the cluster and the amount of data stored on it. When the NameNode starts, it loads this fsimage and the edit logs from disk into main memory (RAM) for fast processing. After loading, it waits for the DataNodes to report the blocks present on them. During this whole process, that is, while it loads the fsimage and edit logs and waits for the DataNodes to report their blocks, the NameNode stays in safe mode, which is a read-only mode. This is done to maintain the consistency of the data; it is like saying "I will not accept anything until I know what I already have." During this period no modifications to the file blocks are allowed, so that the data stays correct.
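Because writes are refused while the NameNode is in safe mode, a startup script can wait for safe mode to end before it modifies anything. The sketch below uses `dfsadmin -safemode wait`, which blocks until the NameNode has left safe mode; `HADOOP_BIN` and the example path are hypothetical.

```shell
# Path to the hadoop launcher (hypothetical variable; adjust to your install).
HADOOP_BIN="${HADOOP_BIN:-bin/hadoop}"

# Blocks until the NameNode has left safe mode, then runs the given
# hadoop subcommand (typically one that writes to HDFS).
run_after_safemode() {
  "$HADOOP_BIN" dfsadmin -safemode wait && "$HADOOP_BIN" "$@"
}
```

For example, `run_after_safemode fs -mkdir /tmp/demo` creates the directory only after the NameNode accepts modifications again, instead of failing with a SafeModeException.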