Synology Gmail Backup


January 2015 update: I reinstalled GMVault again, and David's comment below was worth detailing in a new post. Read the updated post.

Update: I kept getting IMAP abort errors along the lines of "Gmvault ssl socket error: EOF. Connection lost, reconnect.", and a GitHub thread suggested I turn off IMAP compression in gmvault_defaults.conf. That fixed the problem for me, and now I've got completely clean, error-free backups.
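For anyone hitting the same error, here's a minimal sketch of that change as a shell function. It assumes the setting appears in gmvault_defaults.conf as a line starting with enable_imap_compression=, which is how I remember it; check your own file before trusting the pattern. The function name disable_imap_compression is made up for this example, and it keeps a .bak copy of the original file.

```shell
#!/bin/sh
# Sketch: switch off IMAP compression in a gmvault defaults file.
# Assumption: the setting is a line of the form enable_imap_compression=True.

disable_imap_compression() {
    conf="$1"
    # Keep a copy of the original so the change is easy to undo.
    cp "$conf" "$conf.bak"
    # Rewrite only the compression line; every other line passes through as-is.
    sed 's/^enable_imap_compression[[:space:]]*=.*/enable_imap_compression=False/' \
        "$conf.bak" > "$conf"
}

# Example (the path assumes gmvault was authenticated as root):
# disable_imap_compression /root/.gmvault/gmvault_defaults.conf
```

If the key isn't in your file at all, the function leaves the file unchanged, so a quick grep for enable_imap_compression afterwards is a sensible sanity check.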

A Synology DS214 recently made its way into our home, and I've been busy devising ways to grow our new private cloud. After pulling and crimping some CAT6, installing some nice drives, and a rather effortless setup thanks to DSM 5.0 (Synology's own surprisingly nice software), I've got a headless server with plenty of RAID-protected disk space and dirt cheap, off-site Glacier backups. It does a bunch of cool things out of the box, and I decided my first challenge would be to automate Google Apps email backups.

Background

I have separate Google Apps accounts for personal and business use, and for business I've been using Backupify for almost a year and a half. For $3/month, Backupify required zero setup and faithfully kept all my Google Apps data backed up and available to restore at a moment's notice. There's nothing I'd change about the service; I just did the math and decided I'd take matters into my own hands. For "fun." And with all the savings, I can afford an extra two-thirds of a cup of coffee every month!

Now here's the best part: I managed to do this without installing ipkg. That's a well-known and well-documented bootstrap for Synology devices that gives you access to a whole world of updated, third-party Linux packages. I'm still attempting to avoid it, only because I've read that firmware updates can require re-integrating ipkg, and I'm wary of jumping in after earlier regrets over iPhone jailbreaking. In other words, I'm a baby.

But avoid I can, because DSM 5.0 comes with cron, shell access, and one-click installers for Python 2.7 and Python 3. This means that all I have to do is get GMVault running and move on to the next challenge!

Jimmy Bonney's post got me most of the way there, and a comment from David Cumps and some old-fashioned trial and error saw me through the rest of the way.

The Setup

I have shared folders for work and personal use, and my setup in each is identical. There could be a better way to do this, but so far all's working well. For each shared folder, I have...

  • mail-backup/: a directory that GMVault can use for its repository, which is a series of dated year-month folders containing two files per message: JSON metadata and a compressed package with the message body and attachments
  • utilities/: the directory I used to install GMVault and where I've placed backup scripts for cron
  • utilities/log/mail-backup/: a directory that gets dated plain text logs every time the backup scripts run
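The layout above can be created in one go. This is just a sketch: the function name make_backup_layout is invented for the example, and /volume1/yourfolder stands in for whatever your shared folder is actually called.

```shell
#!/bin/sh
# Sketch: create the three directories described above under one shared folder.

make_backup_layout() {
    base="$1"
    # -p creates missing parents, so utilities/ appears along with its log dir.
    mkdir -p "$base/mail-backup" \
             "$base/utilities/log/mail-backup"
}

# Example, once per shared folder:
# make_backup_layout /volume1/yourfolder
```

Running it twice is harmless, since mkdir -p doesn't complain about directories that already exist.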

I've cobbled together two shell scripts, one that runs daily and syncs in quick mode, and one that runs monthly to ensure a full sync.

email-backup-full.sh: Mark and timestamp the beginning and end of the sync process, print environment variables at runtime (helpful for debugging permissions issues if you don't run as root), and log gmvault's output while it attempts to sync.

#!/bin/sh
# Monthly full sync: log start/end times, the environment, and gmvault's output.

NOW=$(date +"%Y-%m-%d")
LOGFILE="/volume1/yourfolder/utilities/log/mail-backup/log-$NOW.log"
CURTIME=$(date +"%r")

echo "------------------------------------" >> "$LOGFILE"
echo "$CURTIME: Starting email sync..." >> "$LOGFILE"
echo "------------------------------------" >> "$LOGFILE"
printf "\n" >> "$LOGFILE"

# Dump shell variables; handy for debugging permissions when not running as root.
set >> "$LOGFILE"

# Append stderr as well as stdout so any error messages land in the log too.
sh /volume1/yourfolder/utilities/gmvault_env/bin/gmvault sync --emails-only -t full --db-dir /volume1/yourfolder/mail-backup/ you@gmail.com >> "$LOGFILE" 2>&1

CURTIME=$(date +"%r")

printf "\n" >> "$LOGFILE"
echo "------------------------------------" >> "$LOGFILE"
echo "$CURTIME: email sync finished." >> "$LOGFILE"
echo "------------------------------------" >> "$LOGFILE"

email-backup-quick.sh: Ditto the first script, just put gmvault in quick mode instead of full.

#!/bin/sh
# Daily quick sync: log start/end times, the environment, and gmvault's output.

NOW=$(date +"%Y-%m-%d")
LOGFILE="/volume1/yourfolder/utilities/log/mail-backup/log-$NOW.log"
CURTIME=$(date +"%r")

echo "------------------------------------" >> "$LOGFILE"
echo "$CURTIME: Starting email sync..." >> "$LOGFILE"
echo "------------------------------------" >> "$LOGFILE"
printf "\n" >> "$LOGFILE"

# Dump shell variables; handy for debugging permissions when not running as root.
set >> "$LOGFILE"

# Append stderr as well as stdout so any error messages land in the log too.
sh /volume1/yourfolder/utilities/gmvault_env/bin/gmvault sync --emails-only -t quick --db-dir /volume1/yourfolder/mail-backup/ you@gmail.com >> "$LOGFILE" 2>&1

CURTIME=$(date +"%r")

printf "\n" >> "$LOGFILE"
echo "------------------------------------" >> "$LOGFILE"
echo "$CURTIME: email sync finished." >> "$LOGFILE"
echo "------------------------------------" >> "$LOGFILE"
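Since the two scripts differ only in the value passed to -t, they could be folded into one script that takes the sync type as an argument. The sketch below only builds and prints the command line (a dry run) so the idea is easy to check before wiring in the logging from the scripts above. The function name gmvault_command and the script name email-backup.sh are made up for this example; the paths and address are the same placeholders used above.

```shell
#!/bin/sh
# Sketch: one script for both sync types. Intended use: email-backup.sh quick|full
# The three values below are the same placeholders used in the scripts above.
GMVAULT="/volume1/yourfolder/utilities/gmvault_env/bin/gmvault"
DB_DIR="/volume1/yourfolder/mail-backup/"
ACCOUNT="you@gmail.com"

# Print the gmvault command line for a sync type, rejecting anything
# other than quick or full. Printing only; nothing is executed here.
gmvault_command() {
    case "$1" in
        quick|full) ;;
        *) echo "usage: email-backup.sh quick|full" >&2; return 1 ;;
    esac
    printf 'sh %s sync --emails-only -t %s --db-dir %s %s\n' \
        "$GMVAULT" "$1" "$DB_DIR" "$ACCOUNT"
}

# Dry run: show what the monthly job would execute.
gmvault_command full
```

Rejecting unknown arguments up front means a typo in a scheduled task fails loudly instead of passing something unexpected to gmvault.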

Process

  1. Install Python 2.7 (labeled "Python") from the DSM Package Center.
  2. Create directories in a shared folder for your backup repository, logs, and a place to set up GMVault. (Mine are mail-backup, utilities/log/mail-backup, and utilities respectively.)
  3. SSH in as root, cd to your utilities directory, and run curl -O https://raw.githubusercontent.com/pypa/virtualenv/master/virtualenv.py -k to download virtualenv.py. [1]
  4. Run python virtualenv.py gmvault_env to create the gmvault_env virtualenv that GMVault gets installed into. [2]
  5. sh /volume1/yourfolder/utilities/gmvault_env/bin/gmvault should now run GMVault and give you an error for having too few arguments.
  6. Place your full and quick backup scripts somewhere—I put mine right in the utilities folder. Remember the full path to each script for step 8.
  7. Authenticate with Google Apps once, running some variant of sh /volume1/yourfolder/utilities/gmvault_env/bin/gmvault sync --emails-only -t quick --db-dir /volume1/yourfolder/mail-backup/ you@gmail.com. Read the prompt, hit enter, paste the unique URL into your browser, and allow access from the browser. You can end the gmvault instance prematurely once the authentication's done. Auth tokens will be stored in /root/.gmvault, along with gmvault defaults that you can edit.
  8. Schedule these scripts to run automatically with Task Scheduler, which you'll find in the Control Panel. Create a new User-defined script and give it a name, run as root, and in the User-defined script textarea add: sh /volume1/yourfolder/utilities/email-backup-full.sh. Schedule it to run when you'd like (or temporarily at short intervals to test), then repeat with the email-backup-quick.sh script.
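While testing the schedule at short intervals, a quick way to confirm a run made it all the way through is to look for the start and finish markers the backup scripts write. Here's a rough sketch; the function name run_completed is invented for the example. Note that it only shows the script reached its last line, not that gmvault itself succeeded, because the scripts write the finish marker regardless of how the sync went.

```shell
#!/bin/sh
# Sketch: check a backup log for matching start and finish markers.

run_completed() {
    [ -f "$1" ] || return 1
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    started=$(grep -c 'Starting email sync' "$1" || true)
    finished=$(grep -c 'email sync finished' "$1" || true)
    # At least one run, and every run that started also finished.
    [ "$started" -gt 0 ] && [ "$started" -eq "$finished" ]
}

# Example, checking today's log:
# run_completed "/volume1/yourfolder/utilities/log/mail-backup/log-$(date +"%Y-%m-%d").log" \
#     && echo "backup script ran to completion"
```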

That's it! I'll probably try and figure out how to detect failures and fire off an email, but at the moment I've got working daily backups of two Google Apps email accounts. The initial sync for each ~4GB of mail took 3-4 hours, and subsequent quick syncs only take a few minutes. As a bonus, when you allow GMVault to compress messages the total size of the backup will be a fraction of what Google reports for your disk usage.
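One possible starting point for the failure detection: wrap the gmvault call so its exit status gets captured and written to the log. This is a sketch under an assumption I haven't verified, namely that gmvault exits with a non-zero status when a sync fails. The function name run_logged is made up, and the send_alert in the usage comment is a hypothetical placeholder for whatever ends up sending the email.

```shell
#!/bin/sh
# Sketch: run a command, append its output to a log, and record failures.
# Assumption: the wrapped command signals failure with a non-zero exit status.

run_logged() {
    logfile="$1"
    shift
    status=0
    # Run the remaining arguments as the command, capturing stdout and stderr.
    "$@" >> "$logfile" 2>&1 || status=$?
    if [ "$status" -ne 0 ]; then
        echo "$(date +"%r"): sync FAILED with exit status $status" >> "$logfile"
    fi
    return "$status"
}

# Example (send_alert is hypothetical, not something DSM or gmvault provides):
# run_logged "$LOGFILE" sh /volume1/yourfolder/utilities/gmvault_env/bin/gmvault \
#     sync --emails-only -t quick --db-dir /volume1/yourfolder/mail-backup/ you@gmail.com \
#     || send_alert "$LOGFILE"
```

Returning the original status keeps the wrapper transparent, so the calling script can decide what to do about a failure.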

Check out GMVault's options for more ideas, and let me know if you improve my humble first attempts to get this working! As usual, my Python and shell scripting is a bit on the infantile side.


  1. As of January 2015, I had to specifically use wget https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.10.1.tar.gz and tar xzf virtualenv-1.10.1.tar.gz to get an older version of virtualenv that worked. 

  2. I ended up on a side quest here, running easy_install pip, pip install virtualenv, virtualenv --no-site-packages gmvault-1.7-beta, and finally easy_install gmvault, but I'm not sure that having pip or easy_installing gmvault contributed anything aside from confusion. Trying to run any gmvault without prepending sh would (and still does) result in an error: "env: bash: No such file or directory". 
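That error message suggests the gmvault launcher's first line asks env to find bash, which a stock DSM install doesn't seem to have, and that would explain why prepending sh works. I haven't confirmed that's exactly what the launcher contains, so here's a small sketch for checking any script's first line yourself. The function name needs_bash is invented for the example.

```shell
#!/bin/sh
# Sketch: report whether a script's first line (its shebang) asks for bash.

needs_bash() {
    case "$(head -n 1 "$1")" in
        '#!'*bash*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# needs_bash /volume1/yourfolder/utilities/gmvault_env/bin/gmvault \
#     && echo "launcher wants bash; run it with sh instead"
```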

* * *