Sunday, March 14, 2010

Backup to Amazon EC2 Using Spot Instances

After following this nice tutorial on automating backups to Amazon EC2 with rsync, I decided to improve its script a little and add support for spot instances.

Spot instances are like any other EC2 instance, except that the price you pay per hour is determined by demand. In practice it should stay below the regular price: if the spot price rose above the price of a regular instance, people would switch away from spot instances and the price would drop (in short: supply and demand). More on spot instances here: http://aws.amazon.com/ec2/spot-instances/. Today, for example, the cheapest regular instance costs $0.085 per hour, while a spot instance costs about $0.031, a saving of more than 60%. If you have a lot of data to back up over a slow internet connection, this can save you a significant amount.
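
To check the saving for other prices, here is a small sketch of the arithmetic in bash. The function name savings_pct is my own invention for illustration, and it assumes awk is available:

```bash
#!/bin/bash
# Hypothetical helper: percentage saved by paying the spot price
# instead of the regular hourly price.
# Usage: savings_pct REGULAR_PRICE SPOT_PRICE
savings_pct() {
  awk -v regular="$1" -v spot="$2" \
    'BEGIN { printf "%.1f\n", (regular - spot) / regular * 100 }'
}

# The two hourly prices quoted above
savings_pct 0.085 0.031   # prints 63.5
```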

To use the script, first read the tutorial on the Black Pepper blog, then replace its script with the one below. Save the script as backup.bash if you want all features to work. See the inline comments for more details:

#!/bin/bash

# Amazon login parameters
export EC2_PRIVATE_KEY=~/.ec2/pk-.pem
export EC2_CERT=~/.ec2/cert-.pem

# EC2 tools path
export JAVA_HOME=/usr/lib64/jvm/sun-java-5.0u17/jre
export EC2_HOME=/root/ec2-api-tools-1.3-46266/

### Machine image you want to use as the base for the machine you want to start up, more on this later.
# Ubuntu 9.10 32-bit for me
export amiid="ami-bb709dd2"

### SSH key pair to set up the machine with. In the EC2 console you need to create an SSH key pair that you can use to connect to your new machine, since by default no other means of access is allowed. This is my own key name in the console.
export key="backup console"

### Where to launch your machine. N. Virginia for me, since it's cheaper.
export zone="us-east-1a"

### Local SSH key to connect to machine with. Location of the actual SSH key that you also put in the EC2 console.
export id_file="/root/backup_to_ec2/backupconsole.pem"

### Volume to mount to machine. The volume that we've previously created.
export vol_name="vol-12345678"

## Where to mount the volume on our new machine.
export mount_point="/mnt/vol"

### Device name for the mount
export device_name="/dev/sdf"

### Security group. To help me identify my machine, I use security groups as EC2 doesn't have real instance labels.
export group="backup"

### Maximum price for amazon spot instance
export price=".05"

# See if the backup is still running. If this script is already running then abort. This is necessary if you are running a cron job and want only one instance of this script at a time. You may delete it if you don't care for this check. It is important to name the script file 'backup.bash' for this check to work.
export pidof_out=`/sbin/pidof -x backup.bash`
num=`echo $pidof_out | wc -w`
if [ "$num" != "1" ]
then
echo "backup.bash is already running"
exit
fi
echo backup.bash is not running

#
# Start the instance
# Capture the output so that
# we can grab the INSTANCE ID field
# and use it to determine when
# the instance is running
#

echo Requesting spot instance ${amiid} with price of ${price}

${EC2_HOME}/bin/ec2-request-spot-instances ${amiid} --price ${price} -z ${zone} -k "${key}" --group ${group} > /tmp/a
if [ $? != 0 ]; then
echo Error requesting spot instance for image ${amiid}
exit 1
fi

export rid=`cat /tmp/a | grep SPOTINSTANCEREQUEST | cut -f2`

#
# Loop until the status changes to 'active'
#

sleep 30
echo Checking request ${rid}
export ACTIVE="active"
export done="false"
while [ $done == "false" ]
do
export request=`${EC2_HOME}/bin/ec2-describe-spot-instance-requests ${rid} | grep SPOTINSTANCEREQUEST`
export status=`echo $request | cut -f6 -d' '`
if [ $status == ${ACTIVE} ]; then
export done="true"
export iid=`echo $request | cut -f8 -d' '`
else
echo Waiting...
sleep 60
fi
done
echo Request ${rid} is active

#
# Loop until instance is running
#

echo Waiting for instance to start...
export done="false"
export RUNNING="running"
while [ $done == "false" ]
do
  # Pipe the describe output through grep (the pipe was missing), then take the state field
  export status=`${EC2_HOME}/bin/ec2-describe-instances ${iid} | grep INSTANCE | cut -f6`
  if [ "$status" == ${RUNNING} ]; then
    export done="true"
  else
    echo Waiting...
    sleep 10
  fi
done
echo Instance ${iid} is running.
#
# Attach the volume to the running instance
#
echo Attaching volume ${vol_name}

${EC2_HOME}/bin/ec2-attach-volume ${vol_name} -i ${iid} -d ${device_name}
sleep 15

#
# Loop until the volume status changes
# to 'attached'
#

export ATTACHED="attached"
export done="false"
while [ $done == "false" ]
do
export status=`${EC2_HOME}/bin/ec2-describe-volumes | grep ATTACHMENT | grep ${iid} | cut -f5`
if [ "$status" == ${ATTACHED} ]; then
export done="true"
else
echo Waiting...
sleep 10
fi
done

echo Volume ${vol_name} is attached

export EC2_HOST=`${EC2_HOME}/bin/ec2-describe-instances | grep "${iid}" | tr '\t' '\n' \
| grep amazonaws.com`

### Important trick here.
### 1. Because you will be starting up a different machine every time you run this script, SSH would ask you to accept the changed host key. The options here make sure that doesn't happen, so you can run this completely automated without human interaction.
### 2. Since we don't want to save the host SSH key, we will redirect the known hosts list to a temp file
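# Double quotes so that $$ expands to this script's PID; -f so rm stays quiet if the file does not exist yet
export KNOWN_HOSTS="/tmp/known_hosts.$$"
rm -f "$KNOWN_HOSTS"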

### This line logs on and mounts our volume to our machine.
ssh -i ${id_file} -o "StrictHostKeyChecking no" -o "UserKnownHostsFile=$KNOWN_HOSTS" ubuntu@$EC2_HOST "sudo mkdir /mnt/data-store && sudo mount ${device_name} /mnt/data-store"

### Run rsync, whatever options you'd like, here are a couple of examples I use.
rsync -e "ssh -i ${id_file} -o 'UserKnownHostsFile=$KNOWN_HOSTS'" --rsync-path "sudo rsync" --delete -avz /mnt/vg1/lv0/backup ubuntu@$EC2_HOST:/mnt/data-store/backup

### Clean up. Disconnect the volume
ssh -i ${id_file} -o "UserKnownHostsFile=$KNOWN_HOSTS" ubuntu@$EC2_HOST "sudo umount -d ${device_name}"

### Detach volume from machine
${EC2_HOME}/bin/ec2-detach-volume ${vol_name} -i ${iid}

### Shutdown instance
${EC2_HOME}/bin/ec2-terminate-instances ${iid}
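
The script repeats the same poll-and-sleep loop three times, and each loop runs forever if the expected state never arrives (for example, if the spot price stays above your maximum price and the request is never fulfilled). Below is a minimal sketch of one way to factor the loop out and add a timeout. The names wait_for and instance_status are hypothetical and not part of the original script, and the wrapper assumes the same tab-separated output that the script already parses with cut -f6.

```bash
#!/bin/bash
# Hypothetical helper (not part of the original script): run a command
# repeatedly until its output equals an expected value, or give up once
# the timeout has been reached.
# Usage: wait_for EXPECTED INTERVAL_SECONDS TIMEOUT_SECONDS COMMAND [ARGS...]
wait_for() {
  local expected="$1" interval="$2" timeout="$3"
  shift 3
  local waited=0 status
  while true; do
    status=$("$@")
    if [ "$status" = "$expected" ]; then
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      echo "Timed out after ${waited}s waiting for '${expected}' (last status: '${status}')" >&2
      return 1
    fi
    echo "Waiting..."
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Hypothetical wrapper around the same command the script uses to read
# the instance state.
instance_status() {
  "${EC2_HOME}/bin/ec2-describe-instances" "$1" | grep INSTANCE | cut -f6
}

# Possible use in place of the "Loop until instance is running" block:
# wait_for running 10 600 instance_status "${iid}" || exit 1
```

With something like this, a cron run that never gets its instance exits with an error after ten minutes instead of hanging until the next run is blocked by the pidof check.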
