You may or may not know that there is quite a cheap way to archive your backup jobs to tape, without actually having a tape drive.
While I was looking at options to store my backup jobs off-site, I researched a number of things:
- Buying a tape library
- Buying another enterprise NAS (Network Attached Storage)
- Using a JBOD (Just a bunch of disks) system like Microsoft Storage Spaces
All of these are expensive. All of these require some sort of hardware plus additional network and configuration. All of these are a pain in the ass.
Our setup consists of the following (across data centers):
- VMware 5.5/6.5
- Veeam 9/9.5
We currently replicate between data centers using Veeam's built-in functionality, mainly for disaster recovery. This approach has a few drawbacks:
- Duplicate data at each end
- Additional hardware required to hold the backups
- Complexities of a WAN for Veeam
What sparked my interest in a better way of doing things was the recent bout of crypto viruses, which are now targeting backup devices, either wiping them clean or encrypting them. If that happens, you’re really screwed.
I decided on AWS simply because of cost and simplicity. Backup should be simple. Simple is good.
If we look at it purely from a cost perspective, it is as follows (see AWS costs in full here):
|Storage type|Used for|Price|
|---|---|---|
|Virtual Tape Storage (S3)|Initial storage when the tape backups are uploaded, before they go to Glacier|$0.025 per GB-month of data stored|
|Virtual Tape Storage Archive (Glacier)|Storage once tapes are archived|$0.005 per GB-month of data stored|
There is also the cost of the gateway appliance itself. You are billed for the data written to the gateway, but the charge is capped. The AWS calculator says the following: Up to a maximum of $125.00 per gateway per month. The first 100 GB per account is free.
So once you push past the free 100GB, you will be looking at up to $125.00 per gateway, per month. In our case, we run a gateway at each data center.
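To put rough numbers on this, here is a minimal Python sketch of the monthly bill. It uses only the prices quoted above and assumes the worst case, where each gateway hits the full $125.00 cap. The function name and the example sizes are made up for illustration, so check the current AWS pricing before relying on the result.

```python
# Prices quoted above, in USD. AWS pricing changes, so treat these as assumptions.
VIRTUAL_TAPE_PER_GB_MONTH = 0.025   # tapes still sitting in the virtual tape library
ARCHIVE_PER_GB_MONTH = 0.005        # tapes archived to Glacier
GATEWAY_CAP_PER_MONTH = 125.00      # maximum gateway charge per gateway per month


def monthly_cost(library_gb, archived_gb, gateways=1):
    """Worst-case monthly cost in USD: storage plus the full gateway cap."""
    storage = (library_gb * VIRTUAL_TAPE_PER_GB_MONTH
               + archived_gb * ARCHIVE_PER_GB_MONTH)
    return round(storage + gateways * GATEWAY_CAP_PER_MONTH, 2)


# Hypothetical example: 500GB waiting in the library, 2TB archived, two gateways.
print(monthly_cost(library_gb=500, archived_gb=2000, gateways=2))
```

Most of the worst-case bill is the gateway cap, not the storage, which is why archiving tapes as soon as a job finishes keeps the storage side so small.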
The cost of tape archive is too good to pass up. That’s what really makes the AWS VTL Gateway a no-brainer.
The setup of the VTL Gateway is being done on VMware 6.5 with Veeam Backup and Replication 9.5 Update 2.
I am going to assume you already have an AWS account.
Step 1 – Download and configure the VTL appliance
- From the AWS services, select the Storage Gateway service
- You will likely be presented with a wizard on your first run through. Otherwise, select Create Gateway.
- Select Tape Gateway
- Select VMware ESXi and download the image. You can also follow this guide for Hyper-V because, apart from the hypervisor, the steps are mostly the same.
- Once the image has downloaded, deploy it to your infrastructure as a new OVF template.
- Once the image has been deployed to your environment, go to the console and log in. The default username is sguser and the password is sgpassword. The server will pick up an address from DHCP, but you will likely want to assign it a static IP address and gateway. Note that your VTL Gateway appliance needs to be on the same subnet as your Veeam server.
- Once you have configured the network information, you need to attach two disk drives. One is for cache, the other is for an upload buffer. It is up to you what you allocate here, but the cache drive needs to be bigger than your largest backup set. I’ve gone with 500GB for cache and 500GB for upload buffer. Realistically, it should be around 2TB for cache and 200GB for upload buffer. Once the drives are added, reboot the appliance.
- Once the device comes back online, go back to the AWS console, which will ask for the IP address of the appliance to connect to. You do not need to open any ports. The connection is established through your web browser (nifty).
- Once you do this, it will ask to configure the drives you have added
- While you are in the AWS console, create some tapes. For the purpose of this guide, I recommend creating some 100GB tapes, which are the minimum.
- You are now good to go. Let’s move on to the Veeam configuration.
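The tapes can also be created from a script instead of the console. The sketch below assumes you have boto3 installed and AWS credentials configured. The gateway ARN, region and barcode prefix are placeholders, and the helper names are mine. It builds the request for the Storage Gateway `create_tapes` call and only talks to AWS when you call `create_tapes()` yourself.

```python
import uuid

GIB = 1024 ** 3
MIN_TAPE_GIB = 100  # smallest tape size mentioned in this guide


def create_tapes_request(gateway_arn, size_gib, count, barcode_prefix):
    """Build the keyword arguments for the Storage Gateway CreateTapes call."""
    if size_gib < MIN_TAPE_GIB:
        raise ValueError(f"tapes must be at least {MIN_TAPE_GIB} GiB")
    return {
        "GatewayARN": gateway_arn,
        "TapeSizeInBytes": size_gib * GIB,
        "ClientToken": str(uuid.uuid4()),  # idempotency token, unique per request
        "NumTapesToCreate": count,
        "TapeBarcodePrefix": barcode_prefix,
    }


def create_tapes(request, region="us-east-1"):
    """Send the request. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the builder above works without boto3

    client = boto3.client("storagegateway", region_name=region)
    return client.create_tapes(**request)["TapeARNs"]


# Placeholder ARN: substitute the one shown for your gateway in the console.
request = create_tapes_request(
    "arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-EXAMPLE",
    size_gib=100, count=2, barcode_prefix="VEM",
)
print(request["TapeSizeInBytes"])
```

Keeping the request builder separate from the network call means you can check the sizes and counts before anything is created (and billed).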
Step 2 – Configure Veeam
Your Veeam server needs to be on the same subnet as the AWS VTL Gateway Appliance.
- On your Veeam management server, open Control Panel and select iSCSI Initiator. Enter the IP address of your AWS VTL Gateway Appliance and click Quick Connect.
- Connect to every discovered target. These are your tape drives and the media changer.
- Once all your targets have been connected, in Veeam, select Tape Infrastructure and add a new tape server
- Run through the wizard. Everything should be pretty straightforward.
- Your new tape server should now be ready!
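Since the same-subnet requirement is easy to get wrong, here is a small Python check you could run before touching iSCSI. It is only a sketch: the addresses are hypothetical and you need to supply each machine's IP together with its prefix length.

```python
import ipaddress


def same_subnet(host_a, host_b):
    """True if two addresses in CIDR form (e.g. 10.0.0.5/24) share a network."""
    a = ipaddress.ip_interface(host_a)
    b = ipaddress.ip_interface(host_b)
    return a.network == b.network


# Hypothetical addresses for the Veeam server and the gateway appliance.
print(same_subnet("192.168.10.20/24", "192.168.10.50/24"))  # True
print(same_subnet("192.168.10.20/24", "192.168.20.50/24"))  # False
```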
Step 3 – Create GFS Media Pools
The next step is to create media pools. Media pools control how you are going to store your media. This is very important when using AWS, as it also controls where your virtual tapes are stored, either in S3 or in Glacier.
In the following example, I am going to store 1 year of monthly backups. I am not going to select Weekly, Quarterly, or Yearly. You can choose these options if you wish.
- From Veeam, select Media Pools and select Add GFS Media Pool
- When it comes to tapes, you can simply select to add free tapes. However, I usually add tapes specifically to my media pool depending on the backup size. For instance, if I am backing up a 400GB VM, I would add 500GB tapes (just in case), specifically for this media pool. The reason is to save on cost when pulling these tapes back from AWS: you are charged for pulling back an entire tape, even if you only need a few files from it. So it’s best to create your media pools accordingly.
- My GFS Media Set is as follows. 0 disables the media set
- Click on the Advanced button under media set. There is a very important option which must be selected. You want to move all finished jobs to a Vault. This archives the tape at the end of the job (Glacier storage). You will need to do this for all media sets you create unless you want to leave them on S3 storage.
- Select encryption if you want to (recommended for AWS)
- If you have created tapes in the first step, add them to this media pool now. If they are not showing, right click on your tape library and import tapes.
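The tape-sizing rule of thumb above can be written down as a few lines of Python. This is just my own way of expressing it: the 25% headroom and the rounding to the next 100GB are assumptions chosen so that a 400GB backup lands on a 500GB tape, as in the example.

```python
import math

MIN_TAPE_GB = 100  # minimum tape size mentioned earlier in this guide


def tape_size_gb(backup_gb, headroom=0.25, step_gb=100):
    """Round the backup size plus headroom up to the next multiple of step_gb."""
    needed = backup_gb * (1 + headroom)
    size = math.ceil(needed / step_gb) * step_gb
    return max(size, MIN_TAPE_GB)


print(tape_size_gb(400))  # 500, matching the 400GB VM example above
```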
Step 4 – Creating GFS Tape Jobs
Now that we have our media pool and vault created, we can now create tape jobs.
Just a note on GFS tape jobs: the job will start and then wait for the next backup to occur. The tape job checks for new backup files every hour, starting at midnight (12:00 AM). Once a backup job has finished, you will see the tape job activate. Manually running the tape job will finish without any files being backed up unless Active Full is selected. This isn’t stated anywhere and was a real pain in the ass when testing this.
- Select New Backup to tape job
- Under backup files, select your backup job. You can select a backup copy job here. Select the latest backup, not incrementals (if you are only doing full backups like me).
- Select the media pool you created before. If you added tapes to the media pool, the free space will be displayed
- This is the most important step. Make sure you select the following options:
Ejecting the media and exporting them will add the media to the vault you created earlier and archive the tapes. Since you are creating virtual tapes, having a ton of monthly tapes isn’t really an issue.
- For the schedule, since it’s monthly, I go for the last Saturday each month.
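If you want to know in advance which dates that schedule lands on (for example, to line the tape job up with a monthly full backup), the last Saturday of a month is easy to compute. This is a small standalone Python sketch and has nothing to do with Veeam itself.

```python
import calendar
import datetime


def last_saturday(year, month):
    """Return the date of the last Saturday in the given month."""
    last_day = calendar.monthrange(year, month)[1]
    d = datetime.date(year, month, last_day)
    # weekday(): Monday is 0, Saturday is 5. Step back to the nearest Saturday.
    offset = (d.weekday() - calendar.SATURDAY) % 7
    return d - datetime.timedelta(days=offset)


print(last_saturday(2024, 1))  # 2024-01-27
```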
That’s it. You’re done. If you configured it right, your backup will finish and your tape will be archived:
Some common questions
- I create a media pool per set of backups or per customer to cut down on AWS fees
- For each backup set, eject and archive your tapes. This will cut down on your AWS costs
- If you delete an archived tape from AWS within 90 days, you will still be charged for the rest of the 90 days (an early deletion fee)
- Size your tapes for the backup job so you don’t cause massive data charges when you only need a sub-set of files for one VM
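To avoid getting caught by the 90-day rule, you can check how long a tape has been archived before you delete it. A minimal Python sketch, with a made-up archive date:

```python
import datetime

MIN_ARCHIVE_DAYS = 90  # the 90-day window mentioned above


def days_until_free_delete(archived_on, today):
    """Days left before an archived tape passes the 90-day window (0 if passed)."""
    age = (today - archived_on).days
    return max(MIN_ARCHIVE_DAYS - age, 0)


archived = datetime.date(2024, 1, 27)  # hypothetical archive date
print(days_until_free_delete(archived, datetime.date(2024, 3, 1)))  # 56
```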
If you have any further questions, please let me know.