Monday, September 24, 2012

Hadoop High Availability (HA Hadoop Cluster)

Note: Currently, only manual failover is supported, meaning the operator must manually initiate a failover. Automatic failure detection and failover initiation will be implemented in future versions.

Even the best technology has a weak point, and for Hadoop it is the NameNode: if the NameNode goes down, the whole cluster goes down. To remove this single point of failure, Hadoop has added an HA feature that keeps the cluster available.
The High Availability feature addresses this problem by providing the option of running two redundant NameNodes in the same cluster in an Active/Passive configuration with a hot standby. This allows a fast failover to a new NameNode in the case that a machine crashes, or a graceful administrator-initiated failover for the purpose of planned maintenance.

How is it organised?


In HA mode, two machines are configured as NameNodes. At any time, one machine acts as the Active NameNode and the other as the Standby NameNode. The Active NameNode is responsible for all client operations in the cluster, while the Standby simply acts as a slave, maintaining enough state to provide a fast failover if necessary. The namespace metadata is kept on a central shared storage directory so that both the Active and Standby nodes can access the latest data at the same time. When any namespace modification is performed by the Active node, it durably logs a record of the modification to an edit log file stored in the shared directory. The Standby node constantly watches this directory for edits, and as it sees them, it applies them to its own namespace. In the event of a failover, the Standby will ensure that it has read all of the edits from the shared storage before promoting itself to the Active state. This ensures that the namespace state is fully synchronized before a failover occurs.
In order to provide a fast failover, it is also necessary that the Standby node have up-to-date information regarding the location of blocks in the cluster. In order to achieve this, the DataNodes are configured with the location of both NameNodes, and send block location information and heartbeats to both.
What do you require?
  1. NameNode machines: two machines with the same configuration, one for the Active and one for the Standby NameNode. (In an HA cluster the Standby NameNode also performs checkpointing of the namespace, so a separate Secondary NameNode is not required.)
  2. Shared storage: a shared directory with read/write access for both of these machines (for example, an NFS mount); a sample mount setup is sketched below.
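For example, if the shared directory is an NFS export, mounting it on both NameNode machines might look roughly like the sketch below. The server name, export path, and user/group here are placeholders, not anything Hadoop mandates; the mount point matches the dfs.namenode.shared.edits.dir example used later in this post.
# Run on BOTH NameNode machines.
sudo mkdir -p /mnt/filer1/dfs
sudo mount -t nfs filer1.example.com:/export/dfs /mnt/filer1/dfs
# Create the shared edits directory (harmless if it already exists) and make
# sure the user that runs the NameNode can read and write it.
sudo mkdir -p /mnt/filer1/dfs/ha-name-dir-shared
sudo chown hdfs:hadoop /mnt/filer1/dfs/ha-name-dir-shared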

How will we do the configuration for that?
To configure HA NameNodes, you must add several configuration options to your hdfs-site.xml configuration file.
The order in which you set these configurations is unimportant, but the values you choose for dfs.federation.nameservices and dfs.ha.namenodes.[nameservice ID] will determine the keys of those that follow. Thus, you should decide on these values before setting the rest of the configuration options.
  • dfs.federation.nameservices - the logical name for this new nameservice
    Choose a logical name for this nameservice, for example "mycluster", and use this logical name for the value of this config option. The name you choose is arbitrary. It will be used both for configuration and as the authority component of absolute HDFS paths in the cluster.
    Note: If you are also using HDFS Federation, this configuration setting should also include the list of other nameservices, HA or otherwise, as a comma-separated list.
    <property>
      <name>dfs.federation.nameservices</name>
      <value>mycluster</value>
    </property>
  • dfs.ha.namenodes.[nameservice ID] - unique identifiers for each NameNode in the nameservice
    Configure with a list of comma-separated NameNode IDs. This will be used by DataNodes to determine all the NameNodes in the cluster. For example, if you used "mycluster" as the nameservice ID previously, and you wanted to use "nn1" and "nn2" as the individual IDs of the NameNodes, you would configure this as such:
    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>

    Note: Currently, only a maximum of two NameNodes may be configured per nameservice.
  • dfs.namenode.rpc-address.[nameservice ID].[name node ID] - the fully-qualified RPC address for each NameNode to listen on
    For both of the previously-configured NameNode IDs, set the full address and IPC port of the NameNode process. Note that this results in two separate configuration options. For example:
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>machine1.example.com:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>machine2.example.com:8020</value>
    </property>

    Note: You may similarly configure the "servicerpc-address" setting if you so desire.
  • dfs.namenode.http-address.[nameservice ID].[name node ID] - the fully-qualified HTTP address for each NameNode to listen on
    Similarly to rpc-address above, set the addresses for both NameNodes' HTTP servers to listen on. For example:
    <property>
      <name>dfs.namenode.http-address.mycluster.nn1</name>
      <value>machine1.example.com:50070</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.mycluster.nn2</name>
      <value>machine2.example.com:50070</value>
    </property>

    Note: If you have Hadoop's security features enabled, you should also set the https-address similarly for each NameNode.
  • dfs.namenode.shared.edits.dir - the location of the shared storage directory
    This is where one configures the path to the remote shared edits directory which the Standby NameNode uses to stay up-to-date with all the file system changes the Active NameNode makes. You should only configure one of these directories. This directory should be mounted r/w on both NameNode machines. The value of this setting should be the absolute path to this directory on the NameNode machines. For example:
    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>file:///mnt/filer1/dfs/ha-name-dir-shared</value>
    </property>
  • dfs.client.failover.proxy.provider.[nameservice ID] - the Java class that HDFS clients use to contact the Active NameNode
    Configure the name of the Java class which will be used by the DFS Client to determine which NameNode is the current Active, and therefore which NameNode is currently serving client requests. The only implementation which currently ships with Hadoop is the ConfiguredFailoverProxyProvider, so use this unless you are using a custom one. For example:
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
  • dfs.ha.fencing.methods - a list of scripts or Java classes which will be used to fence the Active NameNode during a failover
    It is critical for correctness of the system that only one NameNode be in the Active state at any given time. Thus, during a failover, we first ensure that the Active NameNode is either in the Standby state, or the process has terminated, before transitioning the other NameNode to the Active state. In order to do this, you must configure at least one fencing method. These are configured as a carriage-return-separated list, which will be attempted in order until one indicates that fencing has succeeded. There are two methods which ship with Hadoop: shell and sshfence. For information on implementing your own custom fencing method, see the org.apache.hadoop.ha.NodeFencer class.

    • sshfence - SSH to the Active NameNode and kill the process
      The sshfence option SSHes to the target node and uses fuser to kill the process listening on the service's TCP port. In order for this fencing option to work, it must be able to SSH to the target node without providing a passphrase. Thus, one must also configure the dfs.ha.fencing.ssh.private-key-files option, which is a comma-separated list of SSH private key files. For example:
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
      </property>
      
      <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/exampleuser/.ssh/id_rsa</value>
      </property>

      Optionally, one may configure a non-standard username or port to perform the SSH. One may also configure a timeout, in milliseconds, for the SSH, after which this fencing method will be considered to have failed. It may be configured like so:
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence([[username][:port]])</value>
      </property>
      <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
      </property>
    • shell - run an arbitrary shell command to fence the Active NameNode
      The shell fencing method runs an arbitrary shell command. It may be configured like so:
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>shell(/path/to/my/script.sh arg1 arg2 ...)</value>
      </property>

      The string between '(' and ')' is passed directly to a bash shell and may not include any closing parentheses.
      The shell command will be run with an environment set up to contain all of the current Hadoop configuration variables, with the '_' character replacing any '.' characters in the configuration keys. The configuration used has already had any namenode-specific configurations promoted to their generic forms -- for example dfs_namenode_rpc-address will contain the RPC address of the target node, even though the configuration may specify that variable as dfs.namenode.rpc-address.ns1.nn1.
      Additionally, the following variables referring to the target node to be fenced are also available:
      $target_host - hostname of the node to be fenced
      $target_port - IPC port of the node to be fenced
      $target_address - the above two, combined as host:port
      $target_nameserviceid - the nameservice ID of the NN to be fenced
      $target_namenodeid - the namenode ID of the NN to be fenced
      These environment variables may also be used as substitutions in the shell command itself. For example:
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>shell(/path/to/my/script.sh --nameservice=$target_nameserviceid $target_host:$target_port)</value>
      </property>

      If the shell command returns an exit code of 0, the fencing is determined to be successful. If it returns any other exit code, the fencing was not successful and the next fencing method in the list will be attempted.
      Note: This fencing method does not implement any timeout. If timeouts are necessary, they should be implemented in the shell script itself (e.g. by forking a subshell to kill its parent in some number of seconds).
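      As a rough sketch of how a custom shell fencing script might use these variables, the script below is an illustration only: the script path, SSH user, and kill strategy are assumptions, not something Hadoop ships.
      #!/usr/bin/env bash
      # Hypothetical fencing script, e.g. /path/to/my/script.sh, invoked via
      # dfs.ha.fencing.methods as:
      #   shell(/path/to/my/script.sh --nameservice=$target_nameserviceid $target_host:$target_port)
      # Assumes passphraseless SSH to the target host as a user allowed to kill the NameNode.

      nameservice="$1"     # e.g. --nameservice=mycluster
      target="$2"          # e.g. machine1.example.com:8020
      host="${target%%:*}"
      port="${target##*:}"

      # Kill whatever is listening on the NameNode IPC port on the target host.
      ssh "$host" "fuser -k -n tcp $port"

      # Exit 0 means fencing succeeded; any other code makes Hadoop try the
      # next method in dfs.ha.fencing.methods.
      exit $?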

  • fs.defaultFS - the default path prefix used by the Hadoop FS client when none is given
    Optionally, you may now configure the default path for Hadoop clients to use the new HA-enabled logical URI. If you used "mycluster" as the nameservice ID earlier, this will be the value of the authority portion of all of your HDFS paths. This may be configured like so, in your core-site.xml file:
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://mycluster</value>
    </property>
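    With this in place, clients can refer to the cluster by its logical name instead of a specific NameNode host. A couple of illustrative commands (the paths are only examples):
    # The client resolves "mycluster" through the failover proxy provider, so
    # these commands work regardless of which NameNode is currently Active.
    hdfs dfs -ls hdfs://mycluster/
    hdfs dfs -put localfile.txt hdfs://mycluster/user/exampleuser/
    # With fs.defaultFS set, the scheme and authority can be omitted:
    hdfs dfs -ls /user/exampleuser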

Deployment details

After all of the necessary configuration options have been set, one must initially synchronize the two HA NameNodes' on-disk metadata. If you are setting up a fresh HDFS cluster, you should first run the format command (hdfs namenode -format) on one of the NameNodes. If you have already formatted the NameNode, or are converting a non-HA-enabled cluster to be HA-enabled, you should now copy over the contents of your NameNode metadata directories to the other, unformatted NameNode using scp or a similar utility. The location of the directories containing the NameNode metadata is configured via the configuration options dfs.namenode.name.dir and/or dfs.namenode.edits.dir. At this time, you should also ensure that the shared edits dir (as configured by dfs.namenode.shared.edits.dir) includes all recent edits files which are in your NameNode metadata directories.
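For a fresh cluster, the initial synchronization might look roughly like this (the metadata path /data/dfs/nn below is only a placeholder for whatever dfs.namenode.name.dir points to):
# On nn1 (machine1.example.com) only, for a brand-new cluster:
hdfs namenode -format

# Copy the formatted metadata to the other, still unformatted NameNode:
scp -r /data/dfs/nn/current machine2.example.com:/data/dfs/nn/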
At this point you may start both of your HA NameNodes as you normally would start a NameNode.
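With the stock Hadoop scripts this is typically something like the following (the exact script location depends on how Hadoop is installed):
# Run on each NameNode machine (machine1 and machine2):
$HADOOP_PREFIX/sbin/hadoop-daemon.sh start namenode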
You can visit each of the NameNodes' web pages separately by browsing to their configured HTTP addresses. You should notice that next to the configured address will be the HA state of the NameNode (either "standby" or "active".) Whenever an HA NameNode starts, it is initially in the Standby state.
Administrative commands

Now that your HA NameNodes are configured and started, you will have access to some additional commands to administer your HA HDFS cluster. Specifically, you should familiarize yourself with all of the subcommands of the "hdfs haadmin" command. Running this command without any additional arguments will display the following usage information:
Usage: DFSHAAdmin [-ns <nameserviceId>]
    [-transitionToActive <serviceId>]
    [-transitionToStandby <serviceId>]
    [-failover [--forcefence] [--forceactive] <serviceId> <serviceId>]
    [-getServiceState <serviceId>]
    [-checkHealth <serviceId>]
    [-help <command>]

This guide describes high-level uses of each of these subcommands. For specific usage information of each subcommand, you should run "hdfs haadmin -help <command>".

  • transitionToActive and transitionToStandby - transition the state of the given NameNode to Active or Standby
    These subcommands cause a given NameNode to transition to the Active or Standby state, respectively. These commands do not attempt to perform any fencing, and thus should rarely be used. Instead, one should almost always prefer to use the "hdfs haadmin -failover" subcommand.
  • failover - initiate a failover between two NameNodes
    This subcommand causes a failover from the first provided NameNode to the second. If the first NameNode is in the Standby state, this command simply transitions the second to the Active state without error. If the first NameNode is in the Active state, an attempt will be made to gracefully transition it to the Standby state. If this fails, the fencing methods (as configured by dfs.ha.fencing.methods) will be attempted in order until one succeeds. Only after this process will the second NameNode be transitioned to the Active state. If no fencing method succeeds, the second NameNode will not be transitioned to the Active state, and an error will be returned.
  • getServiceState - determine whether the given NameNode is Active or Standby
    Connect to the provided NameNode to determine its current state, printing either "standby" or "active" to STDOUT appropriately. This subcommand might be used by cron jobs or monitoring scripts which need to behave differently based on whether the NameNode is currently Active or Standby.
  • checkHealth - check the health of the given NameNode
    Connect to the provided NameNode to check its health. The NameNode is capable of performing some diagnostics on itself, including checking if internal services are running as expected. This command will return 0 if the NameNode is healthy, non-zero otherwise. One might use this command for monitoring purposes. Example invocations of these haadmin subcommands are shown after this list.
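For example, assuming the NameNode IDs nn1 and nn2 configured earlier:
# Gracefully fail over from nn1 to nn2 (fencing is attempted if the graceful transition fails):
hdfs haadmin -failover nn1 nn2

# Print "active" or "standby" for each NameNode:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Health check: exit code 0 if healthy, non-zero otherwise.
hdfs haadmin -checkHealth nn1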
