Thursday, July 4, 2013

How to copy files from one Hadoop Cluster to another ?

Suppose if you want to copy files from hadoop clusters you have three options :

1 : copy file to local and then copy from local using
  
 copyToLocal and then copyFromLocal
  
 -get and -put

But not a good option.

So we have another option:

-cp and distcp

Distcp will require Map-reduce to be running if you dont want to run Mapreduce on your cluster you have other option that is -cp

Uses:
hadoop dfs -cp hdfs://<source> hdfs://<destination>

if you want faster copy use distcp for that your job tracker and task tracker must be running.

distcp uses:

hadoop distcp hdfs://<source> hdfs://<destination>.

No comments:

Post a Comment

Thank you for Commenting Will reply soon ......

Featured Posts

Enable shared folders in ubuntu in vmware?

 To enable Shared Folders in Ubuntu (VM) on VMware , follow these steps: Step 1: Enable Shared Folders in VMware Settings Power...