All the questions that scared me, now I am trying to scare them, so that they can't scare others :)
Monday, February 24, 2014
Comparison between Big Data and RDBMS
| | RDBMS | Big Data |
|---|---|---|
| Data size | Gigabytes | Petabytes |
| Access | Interactive and batch | Batch |
| Updates | Read and write many times | Write once, read many times |
| Structure | Static schema | Dynamic schema |
| Integrity | High | Low |
| Scaling | Nonlinear | Linear |
Sunday, February 23, 2014
HBase Backup
Even though Hadoop and HBase provide replication and redundancy, we still need an offline backup of our HBase tables taken at some point in time. HBase offers several backup options for this, which fall into two categories:
Online Backup:
There are three online methods:
- Replication: this method needs a second cluster, which keeps a replica of the data from the first cluster.
- Hadoop/HBase Export command: this runs a MapReduce job that copies a table to the same cluster or to another Hadoop cluster. It does not require any downtime while the data is being exported. The data is written out to HDFS on the target cluster, and to restore it we run Import against the exported files.
- CopyTable: this is also an online method; it copies a table from one cluster to another cluster or to the same cluster.
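The Export, Import and CopyTable methods above can be sketched as command lines. This is only a rough sketch: the table name `my_table`, the HDFS path `/backups/my_table` and the `run` dry-run helper are assumptions made up for illustration, and the exact options may differ between HBase versions. By default the script only prints the commands; set DRY_RUN=0 on a machine with the HBase client installed to actually run them.

```shell
#!/bin/sh
# Sketch only: the table name and HDFS path are hypothetical placeholders.
TABLE="my_table"
BACKUP_DIR="/backups/my_table"

# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Export: MapReduce job that writes the table's data to an HDFS directory.
run hbase org.apache.hadoop.hbase.mapreduce.Export "$TABLE" "$BACKUP_DIR"

# Import: loads a previously exported directory back into a table
# (the target table has to exist before the import).
run hbase org.apache.hadoop.hbase.mapreduce.Import "$TABLE" "$BACKUP_DIR"

# CopyTable: copies the table to a new table on the same cluster.
# Adding --peer.adr=<zk quorum>:<zk port>:<znode parent> targets another cluster.
run hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name="${TABLE}_copy" "$TABLE"
```

Keeping the commands behind a dry-run switch makes it easy to review what will be launched before starting a MapReduce job on a live cluster.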
Offline Backup:
- DistCp: this is a file-system-level backup; it copies a directory from HDFS to the same cluster or to another cluster.
- copyToLocal: this is a less reliable way of copying directories from HDFS to a local backup drive. With a large amount of data, a lot of Hadoop tuning is needed for the copy to succeed.
Offline backup methods are full-shutdown methods: to copy HBase's files you need to stop the HBase cluster first. While the cluster is online its files are continuously being moved and modified, so copying them in that state may fail.
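The full-shutdown procedure can be sketched as an ordered list of commands. Again this is only a sketch under assumptions: `/hbase` as the HBase root directory, the `backup-nn:8020` NameNode address and the local path `/mnt/backup/hbase` are hypothetical placeholders, and the function only prints the plan rather than executing anything. DistCp and copyToLocal are alternatives, so in practice you would pick one of the two copy steps.

```shell
#!/bin/sh
# Sketch only: paths and the NameNode address are hypothetical placeholders.
SRC="/hbase"
REMOTE_DEST="hdfs://backup-nn:8020/hbase-backup"
LOCAL_DEST="/mnt/backup/hbase"

# Prints the steps of a full-shutdown backup in order; nothing is executed.
offline_backup_plan() {
  echo "stop-hbase.sh"                              # 1. stop HBase so the files stop changing
  echo "hadoop distcp $SRC $REMOTE_DEST"            # 2a. copy to another HDFS location, or
  echo "hadoop fs -copyToLocal $SRC $LOCAL_DEST"    # 2b. copy to a local backup drive
  echo "start-hbase.sh"                             # 3. bring HBase back up
}

offline_backup_plan
```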
Monday, February 10, 2014
Linux: Crontab - Brief
I always get confused whenever I want to set up a new cron job. The confusion is about which options to set!
For those who are new to 'cron': it is nothing but a job scheduler in Linux. That means you can schedule any script to run at any time you want. The only requirement is that the system/server is up and running!
Cron jobs are specific to each user in Linux/Unix, so you can't see another user's crontab unless you have the necessary privileges or sudo/root access (root can list it with crontab -u <user> -l).
Anyway, here are the options in cron:
To check cron jobs:
[root@localhost kiran]# crontab -l
no crontab for root
To set cron jobs:
[root@localhost kiran]# crontab -e
After adding, here is how it looks:
[root@localhost kiran]# crontab -l
##Script to test
00 */2 1-31 * 0,2,3 sh /home/kiran/test.sh >> /dev/null
Every cron job must be given 5 time fields, in this order:
- minute -> 0-59
- hour -> 0-23
- day of month -> 1-31
- month -> 1-12
- day of week -> 0-7 (0 or 7 is Sunday)
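In the above example:
00 -- 0th minute
*/2 -- Every 2nd hour (0, 2, 4, ... 22)
1-31 -- Every day of the month (1 to 31)
* -- Every month
0,2,3 -- Sunday, Tuesday, Wednesday
Note: in Vixie cron and its derivatives (the cron found on most Linux systems), when neither the day-of-month field nor the day-of-week field is *, the job runs when either of the two matches. Since 1-31 matches every day, this entry actually runs every day; use * for day of month if you want it to run only on Sunday, Tuesday and Wednesday.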
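To make the field order easier to remember, here is a small sketch that splits a schedule into its five labelled fields. The function name `explain_cron` is made up for this post; it only labels the fields and does not validate them or work out when the job would run.

```shell
#!/bin/sh
# Hypothetical helper: label the five time fields of a cron schedule.
explain_cron() {
  set -f        # turn off globbing so a literal * is not expanded to file names
  set -- $1     # split the schedule on whitespace into $1..$5
  set +f
  echo "minute=$1 hour=$2 day-of-month=$3 month=$4 day-of-week=$5"
}

# The schedule from the crontab entry above:
explain_cron "00 */2 1-31 * 0,2,3"
# prints: minute=00 hour=*/2 day-of-month=1-31 month=* day-of-week=0,2,3
```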