Saturday, May 4, 2013

Set up a Hadoop cluster on EC2

I followed: http://blog.cloudera.com/blog/2012/10/set-up-a-hadoophbase-cluster-on-ec2-in-about-an-hour/

A few things are missing here and there, but otherwise it's a great help.

Here are the missing parts:

1. Get the EC2 command-line tools: wget http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
Set EC2_HOME to the directory where you unzip this file.
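
As a rough sketch of this step: the helper below looks for the unzipped tools directory so that EC2_HOME does not have to be typed by hand. The name find_ec2_home is made up for illustration, /mnt/aws is just the directory used elsewhere in this post, and the version in the directory name depends on the zip you download.

```shell
# Sketch only: find_ec2_home is a hypothetical helper, not part of the EC2 tools.
# Prints the first ec2-api-tools-* directory found under the given parent directory.
find_ec2_home() {
    parent="$1"
    for dir in "$parent"/ec2-api-tools-*/; do
        # If the glob matched nothing, $dir is the literal pattern and -d fails.
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    echo "no ec2-api-tools-* directory under $parent" >&2
    return 1
}

# Typical use (commented out because it downloads from the network):
#   cd /mnt/aws
#   wget http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
#   unzip ec2-api-tools.zip
#   export EC2_HOME="$(find_ec2_home /mnt/aws)"
```

The printed path keeps its trailing slash, which matches how EC2_HOME is written in this post.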

2. Here are all the lines to put in your ~/.bash_profile:

export AWS_ACCESS_KEY_ID=BKIAISLDPUEUJN2ILNTP
export AWS_SECRET_ACCESS_KEY=n+v7BZhFy5CwUqpC27C/q8/vJiz5+vy0YH4Z8yyV
export EC2_PRIVATE_KEY=/mnt/aws/aws-pk.pem
export EC2_CERT=/mnt/aws/aws-cert.pem
export EC2_HOME=/mnt/aws/ec2-api-tools-1.6.7.3/
export JAVA_HOME=/usr
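
Before running any of the EC2 tools, it may help to confirm that all six variables are actually set in the current shell. This is a minimal sketch; check_ec2_env is a hypothetical name, and it only checks that each variable is non-empty, not that the keys or file paths are valid.

```shell
# Sketch only: check_ec2_env is a hypothetical helper name.
# Prints each required variable that is unset or empty; returns 1 if any are missing.
check_ec2_env() {
    missing=0
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY \
                EC2_PRIVATE_KEY EC2_CERT EC2_HOME JAVA_HOME; do
        # Look up the variable whose name is stored in $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, after opening a new login shell: check_ec2_env && echo "environment looks complete"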

3. After you launch your Ubuntu instance, go to the AWS console, open the instance's security group, and add an inbound rule for SSH (TCP port 22) so that you can log in to the box.
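
I did this in the console, but the same rule can probably be added from the command line with ec2-authorize, which ships with the EC2 API tools from step 1. The sketch below only prints the command rather than running it. The helper name ssh_rule_command and the default CIDR are assumptions for illustration; restrict the source range to your own IP where you can.

```shell
# Sketch only: ssh_rule_command is a hypothetical helper. It prints, but does not
# run, an ec2-authorize call that opens TCP port 22 for one source CIDR.
ssh_rule_command() {
    group="$1"
    cidr="${2:-0.0.0.0/0}"   # assumption: open to any address unless a CIDR is given
    printf 'ec2-authorize %s -P tcp -p 22 -s %s\n' "$group" "$cidr"
}

# Example: print the command for the "default" group and a single address.
ssh_rule_command default 203.0.113.7/32
```

Review the printed line, then paste it into a shell where the variables from step 2 are set.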

4. On the Ubuntu box, put these two lines in .bashrc:
export AWS_ACCESS_KEY_ID=BKIAISLDPUEUJN2ILNTP
export AWS_SECRET_ACCESS_KEY=n+v7BZhFy5CwUqpC27C/q8/vJiz5+vy0YH4Z8yyV


5. Here is your hadoop.properties file for Whirr:
whirr.cluster-name=whirrly
whirr.instance-templates=6 noop
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.cluster-user=huser
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
whirr.env.repo=cdh4
whirr.hardware-id=m1.large
whirr.image-id=us-east-1/ami-1db20274
whirr.location-id=us-east-1
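
A quick sanity check on the properties file before launching can save a failed run. This is a minimal sketch: check_whirr_props is a hypothetical helper, and the list of keys it treats as required is my own assumption based on the file above, not something Whirr defines.

```shell
# Sketch only: check_whirr_props is a hypothetical helper name.
# Lists any of the keys below that have no "key=" line in the given file;
# returns 1 if at least one is missing.
check_whirr_props() {
    file="$1"
    missing=0
    for key in whirr.cluster-name whirr.instance-templates whirr.provider \
               whirr.identity whirr.credential; do
        # Dots in the key match any character in grep; close enough for a sanity check.
        if ! grep -q "^${key}=" "$file"; then
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}
```

For example: check_whirr_props hadoop.properties && whirr launch-cluster --config hadoop.properties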

