Thursday, December 17, 2015

Read thousands of books for $5

Friday, September 4, 2015

Change desktop screen size using terminal linux inside VM



xrandr -s 1440x900

xrandr -s <desired resolution>

You can use the

xrandr --verbose

option to list the available screen resolutions.


Wednesday, September 2, 2015

Install GNOME Desktop Ubuntu

Using the following commands we can install and enable a desktop GUI on an Ubuntu Server installation, if needed.

sudo apt-get install ubuntu-gnome-desktop
sudo service gdm restart

Tuesday, September 1, 2015

org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist

Add the following entries to yarn-site.xml if they are not present, and restart the YARN services.
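The entries themselves did not survive in the post; for reference, the properties that fix this error on a typical Hadoop 2.x installation are the following (the exact class property name can differ slightly between Hadoop versions):

```xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
```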



Sunday, August 30, 2015

Enable passwordless sudo for a user

We can add the following line to /etc/sudoers to allow the user to run sudo without being prompted for a password.
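The line itself was omitted from the post; for a user named username (a placeholder), it would typically be:

```
username ALL=(ALL) NOPASSWD:ALL
```

Edit the file with visudo rather than directly, so that syntax errors are caught before they lock you out of sudo.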


Monday, August 24, 2015

Comparison between RDBMS and Map-Reduce

Scaling:        RDBMS scales up; Map-Reduce scales out.
Data size:      RDBMS handles gigabytes; Map-Reduce handles PB and more.
Access:         RDBMS supports batch and interactive access; Map-Reduce is batch.
Update type:    RDBMS is write many, read many; Map-Reduce is write once, read many.
Structure:      RDBMS holds structured data (schema first, write later); Map-Reduce holds non-structured data (write first, schema later).
Query language: RDBMS supports SQL; Map-Reduce has no SQL natively, though add-on tools provide SQL support too.
Response time:  RDBMS is faster for less data and slows once size increases; Map-Reduce is faster for more data in comparison.
Note: this gap is slowly being filled as new Hadoop sub-projects are developed; people are building abstraction layers on top of the Map-Reduce and YARN frameworks that accept SQL and convert it into Map-Reduce/YARN jobs.

Tuesday, June 16, 2015

Extract All Tar Files in a directory in Linux

This first lists the files with the tar.gz extension; awk then picks out the file name, which is column 9 in the

ls -lrth output,

NF > 2 skips the blank and summary lines, and

tar -xvzf extracts the file whose name is held in the variable $i.

In the same way we can perform various other operations, such as renaming all files with a specific extension, and the command can be fiddled with to achieve various goals.

for i in `ls -lrth *.tar.gz | awk 'NF>2 {print $9}'`; do tar -xvzf "$i"; done
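An equivalent sketch that avoids parsing ls output entirely (shell globbing handles file names with spaces, which the ls | awk pipeline does not):

```shell
# Extract every .tar.gz archive in the current directory.
# The glob expands to the file names directly, so no awk is needed.
for archive in *.tar.gz; do
  tar -xvzf "$archive"
done
```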

Thursday, June 4, 2015

All about the Hadoop Balancer

Hadoop Data Balancing

Hadoop Balancer:

This is a tool provided to balance disk usage throughout the Hadoop cluster. It may sometimes happen that some of the nodes in the cluster become over-utilized or under-utilized; this occurs, for example, when new nodes are added (the newly added nodes are underutilized) or when data is concentrated on a small number of nodes (overutilization). Note that HDFS permits only one balancer instance to run against the cluster at a time, and balancing consumes network bandwidth, so it should be scheduled with care.
This tool requires administrator rights on the Hadoop cluster to run.

Syntax of the balancer:

bin/start-balancer.sh [-threshold <threshold>]

Here start-balancer.sh resides in the bin directory of the Hadoop installation. The threshold parameter decides the target of the balancing: a node counts as balanced when its utilization is within the threshold of the cluster average. It is a percentage between 0 and 100, and the default value is 10 if nothing is passed.

This process transfers blocks between the nodes, which generates network activity, and on a production cluster it must be used cautiously, as it can result in missing-block errors or slow responses from the cluster.
This process can be stopped at any time, if required, using the following command:
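The command itself was omitted above; in a standard Hadoop layout the balancer is started and stopped with the scripts below (the -threshold value is only an example):

```shell
# Start balancing, aiming for each datanode's utilization to be
# within 5 percentage points of the cluster average (example value)
bin/start-balancer.sh -threshold 5

# Stop a running balancer at any time
bin/stop-balancer.sh
```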

Monday, September 8, 2014

WPF Architecture

The WPF architecture is a multilayered architecture. It has three layers: managed code, unmanaged code, and the core operating system. We can think of these layers as the sets of assemblies that build up the entire framework.

The major components of the WPF architecture are PresentationFramework, PresentationCore, and the Media Integration Layer (milcore).


Friday, August 22, 2014

Limit Disk Usage on Hadoop Datanodes

In some scenarios the disks attached to a datanode may become over-utilized, and you become unable to perform any operation on the datanode because no space is left on the system. So we have the option of defining a limit on the space that the datanode daemons can use, via the following configuration:

 <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.</description>
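The description above belongs to the dfs.datanode.du.reserved property in hdfs-site.xml; a sketch of the full entry (the 10 GB value is just an example):

```xml
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- 10 GB reserved per volume for non-DFS use; example value -->
  <value>10737418240</value>
  <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.</description>
</property>
```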