Migrating virtual machines from Amazon EC2 to Google Compute Engine

My Amazon EC2 discount contract is almost up, and I’ve been playing with Google Compute Engine (GCE). Initial impressions are that it’s faster and costs less money, particularly if you don’t want to pay up-front for EC2 reserved instances. Google’s web console is more modern than Amazon’s, though slightly less sophisticated. Google’s CLI tools are much faster and don’t require Java. Google’s API uses JSON instead of XML.

In terms of capabilities, GCE is not as advanced as EC2, but it’s vastly more powerful than Linode, DigitalOcean, and the like. One exception is that Google doesn’t permit sending SMTP directly from GCE instances. They have a partnership with SendGrid for that. I’m using Mandrill instead, and so far I’m very pleased with that choice.

Migration from EC2 to GCE without re-installation

It’s possible to migrate virtual machines from EC2 to GCE. This post explains how I migrated my production Ubuntu 12.04 LTS instance. It’s not a detailed guide. If you possess a good amount of Linux operations knowledge, I hope the information here will help you do your own migration quickly.

Assumptions

Important differences between EC2 and GCE

EC2 uses Xen for virtualization. GCE uses KVM.

Most EC2 instances are paravirtualized (PV). They do not emulate actual PC hardware, and depend on Xen support in the kernel. Most of the time, EC2 instances use PVGRUB to boot. PVGRUB is part of the Amazon Kernel Image (aki-xxxxxxxx) associated with your instance. PVGRUB basically parses a GRUB configuration file in your root filesystem, figures out what kernel you want to boot, and tells Xen to boot it. You never actually run GRUB inside your instance.

With KVM, you have a full hardware virtual machine that emulates a PC. It requires a functioning bootloader in your boot disk image. Without one, you won’t boot. Fixing this, and using a kernel with the proper support, are the two main obstacles in migrating a machine from EC2 to GCE.

Let’s get started.

On EC2:

  • Snapshot your system before you do anything else. If you’re paranoid, create the snapshot while your system isn’t running.
  • Install a recent kernel. The Ubuntu 12.04 LTS kernel images don’t have the virtio SCSI driver needed by GCE. I used HPA’s 3.13.11 generic kernel. (These days it isn’t necessary to use a “virtual” kernel image. The generic ones have all the paravirtualized drivers and Xen/KVM guest support.)
  • Make sure your EC2 system still boots! If it doesn’t boot on EC2, it won’t do much good on GCE.
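Before copying anything, it can help to confirm that the kernel you installed really includes the virtio SCSI driver. Here is a minimal sketch, assuming your distribution ships kernel config files as /boot/config-* (Ubuntu does) and that the driver is controlled by the CONFIG_SCSI_VIRTIO option. The function name has_virtio_scsi is made up for this example.

```shell
#!/bin/sh
# Succeeds if the given kernel config file enables the virtio SCSI driver,
# either built in (=y) or as a module (=m).
has_virtio_scsi() {
    grep -Eq '^CONFIG_SCSI_VIRTIO=[ym]$' "$1"
}

# Check every installed kernel that ships a config file in /boot.
for cfg in /boot/config-*; do
    [ -e "$cfg" ] || continue   # skip the literal pattern when nothing matches
    if has_virtio_scsi "$cfg"; then
        echo "$cfg: virtio SCSI enabled"
    else
        echo "$cfg: virtio SCSI not enabled"
    fi
done
```

If the driver is built as a module, it also has to end up in the initramfs so the root disk is visible at boot.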

On GCE:

  • Create and boot a new (temporary) instance on GCE using one of their existing distribution bundles.
  • Create a new volume large enough to receive the boot volume you have at EC2, and attach it to your temporary instance.
  • Create an MBR partition table on the target volume, partition it, and create a root filesystem.
  • Mount your new filesystem.
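The disk preparation on the temporary instance might look roughly like this. It is a sketch, not the exact commands I ran: the device name /dev/sdb, the label rootfs, and the mount point /mnt/target are all assumptions, and the run and prepare_disk helpers are invented for the example. Because these commands destroy data, the script only prints them unless you set DRY_RUN=0.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# prepare_disk DISK LABEL MOUNTPOINT
# Assumes partition 1 of DISK is named DISK followed by "1" (true for /dev/sdX).
prepare_disk() {
    disk=$1 label=$2 mnt=$3
    run parted -s "$disk" mklabel msdos                   # MBR partition table
    run parted -s "$disk" mkpart primary ext4 1MiB 100%   # one partition for /
    run mkfs.ext4 -L "$label" "${disk}1"                  # labeled root filesystem
    run mkdir -p "$mnt"
    run mount "${disk}1" "$mnt"
}

prepare_disk /dev/sdb rootfs /mnt/target
```

Whatever label you pick here is the one that has to appear after root=LABEL= in the GRUB configuration later on.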

On EC2:

  • Copy data to your new GCE filesystem. Use any method you like; consider creating a volume on EC2 from the snapshot you just created and using that as your source. That will make sure you copy device nodes and other junk you might overlook otherwise. Remember to use a method that preserves hard links, sparse files, extended attributes, ACLs, and so on.
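One way to do the copy is rsync over SSH. The sketch below assumes the snapshot volume is mounted at /mnt/source on the EC2 side, that you run it as root there, and that the login user on the temporary GCE instance can use sudo. The host name gce-temp, the user name, and the copy_root helper are placeholders. As written it only prints the command; set DRY_RUN=0 to run it.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints the command; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# copy_root SRC_DIR DEST (DEST is in rsync's user@host:/path form)
copy_root() {
    # -a  archive mode: permissions, owners, times, symlinks, device nodes
    # -H  hard links    -A  ACLs    -X  extended attributes    -S  sparse files
    # --numeric-ids     keep uid/gid numbers instead of mapping them by name
    # --rsync-path      run the receiving rsync under sudo so it can set owners
    run rsync -aHAXS --numeric-ids --rsync-path='sudo rsync' -e ssh "$1/" "$2/"
}

copy_root /mnt/source user@gce-temp:/mnt/target
```

The trailing slashes matter: they copy the contents of the source directory into the target rather than creating a nested directory.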

On GCE:

  • Verify you received your data on your target volume and everything looks OK.
  • Bind-mount /proc and /dev into your target volume and chroot into it.
  • Install grub2 and grub-pc (or whatever provides GRUB 2 on your distribution).
  • Remove any legacy grub ec2 packages you might have.
  • Remove /boot/grub/menu.lst
  • Add and edit the following in /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0,38400n8 ro root=LABEL=(your root fs label)"
GRUB_TERMINAL=serial
GRUB_SERIAL_COMMAND="serial --speed=38400 --unit=0 --word=8 --parity=no --stop=1"
  • Run update-grub
  • Install grub onto your new volume (probably grub-install /dev/sdb).
  • Edit your fstab to disable any other disks you haven’t migrated over
  • Edit the hostname (/etc/hostname)
  • Edit /etc/resolv.conf to use a valid resolver
  • Uninstall any ec2-specific software packages.
  • Exit the chroot
  • Unmount the bind mounts and the target filesystem
  • Detach the target disk from the temporary instance
  • Create a new GCE instance that uses the target disk as its boot disk, and boot!
  • If it boots, destroy your temporary instance. If it doesn’t, re-attach the target disk to it and see what went wrong.
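The bootloader part of that list can be sketched as a script. Treat it as an outline under assumptions rather than a tested procedure: /mnt/target and /dev/sdb are placeholders, grub-legacy-ec2 is my guess at the legacy package name on Ubuntu cloud images, and the run, install_grub_packages, and write_bootloader helpers are invented for the example. Package installation inside the chroot needs working DNS, and apt may still ask questions. The script prints the commands unless you set DRY_RUN=0.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Step 1: bind-mount /proc and /dev, then swap the GRUB packages in the chroot.
install_grub_packages() {
    target=$1
    run mount --bind /proc "$target/proc"
    run mount --bind /dev "$target/dev"
    run chroot "$target" apt-get install -y grub-pc
    run chroot "$target" apt-get remove -y grub-legacy-ec2   # assumed package name
    run rm -f "$target/boot/grub/menu.lst"
}

# Step 2: run this only after editing $target/etc/default/grub by hand.
write_bootloader() {
    target=$1 disk=$2
    run chroot "$target" update-grub
    run chroot "$target" grub-install "$disk"
    run umount "$target/dev"
    run umount "$target/proc"
    run umount "$target"
}

install_grub_packages /mnt/target
write_bootloader /mnt/target /dev/sdb
```

The remaining edits (fstab, hostname, resolv.conf, removing EC2 packages) fit between the two steps, while the chroot mounts are still in place.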

These are the minimum changes required to boot the image on GCE. You’ll still want to clean things up and make changes according to Google’s suggestions.

Troubleshooting

Check the serial console output. Is the kernel starting?

... KVM messages omitted ...
Booting from Hard Disk...
Booting from 0000:7c00
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.14.3-031403-generic (apw@gomeisa) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201405061153 SMP Tue May 6 15:54:50 UTC 2014

If you don’t see anything after “Booting from 0000:7c00” then you haven’t installed GRUB properly.

If the kernel starts but the root filesystem doesn’t mount, make sure you see the root disk being detected. Make sure the root disk label is properly set in the filesystem and the GRUB configuration.
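A label mismatch is easy to check from the temporary instance once the target disk is re-attached and mounted. Here is a small sketch that compares the label in the GRUB defaults file with the one stored in the filesystem. The function names are made up, and it assumes the kernel command line uses the root=LABEL=... form shown earlier.

```shell
#!/bin/sh
# Print the label named by root=LABEL=... in a GRUB defaults file,
# ignoring commented-out lines.
grub_root_label() {
    sed -n -e '/^[[:space:]]*#/d' \
           -e 's/.*root=LABEL=\([^ "]*\).*/\1/p' "$1" | head -n 1
}

# check_root_label GRUB_FILE PARTITION
# Compare the configured label with the one blkid reads from the partition.
check_root_label() {
    want=$(grub_root_label "$1")
    have=$(blkid -s LABEL -o value "$2")
    if [ -n "$want" ] && [ "$want" = "$have" ]; then
        echo "OK: root label '$want' matches $2"
    else
        echo "MISMATCH: GRUB wants '$want', $2 has '$have'"
        return 1
    fi
}
```

For example, run check_root_label /mnt/target/etc/default/grub /dev/sdb1 as root, substituting your own mount point and partition.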

Please help me improve this post. Leave a comment below!

Secure browsing on open Wi-Fi hotspots

I frequently connect to insecure Wi-Fi networks on my iOS devices and my Mac. Aside from the risk of eavesdropping and malware when connecting to these hotspots, they frequently block access to services, insert advertisements in web pages, or worse.

To work around these problems, I’ve tried numerous virtual private network (VPN) services. My experience with most of them has been awful. They tend to connect slowly or not at all, and I frequently can’t access anything on the Internet once the VPN connection is made. Many services don’t offer automatic connections, particularly on iOS. The software tends to be clunky and confusing.

Cloak VPN is an exception. I’ve been using Cloak for several months, and it’s been rock solid. It’s also affordable, at $3/mo for 5GB of data transfer or $10 for unlimited transfer. If you don’t want a subscription, Cloak also offers the ability to buy non-renewing, unlimited passes for a week, a month, or a year.

Cloak automatically detects when you’re connecting to insecure Wi-Fi and protects your connection. One account can be used to protect all your computers and iDevices.

Cloak released version 2.0 today for iOS, which is a significant upgrade. You can now identify trusted networks, such as your home or cellular network, and Cloak will stay out of the way when you use those networks. This means you can set it up and pretty much forget about it. (Cloak for Mac already offers this capability.)

Like any VPN, using Cloak can cause issues. Cloaked connections are sometimes misidentified by servers as coming from a “bot” instead of a human. This isn’t Cloak’s fault, but a consequence of well-intentioned but misguided system administrators. Some sites won’t let you connect at all, while others, such as Wells Fargo, may ask an extra question when you sign in.

With a few clicks or taps, you can disable Cloak and connect to problem sites. In practice, I’ve only found one or two websites that were completely blocked while using Cloak. I’ve also had outgoing iMessages get blocked sporadically. In all, the issues have been minor, and far outweighed by the benefits of the service.

Cloak has very responsive customer service and is sometimes able to work around blocks by re-routing traffic for certain websites.

I highly encourage you to learn more about Cloak and get started with a free 30-day trial. You don’t need to hand over a credit card to get started.

Cloak provides small data kickbacks to users who tout them on Twitter. I don’t spam my followers so that I can get free stuff. I’m posting this because I rely on Cloak, and I think everyone should check it out.

Use Dropbox to host public files on your own domain name

I’ve been using a Dropbox public folder and some Apache trickery to share files directly from Dropbox on my own domain at pub.noxon.cc. Dropbox is drag-and-drop file sharing at its finest, and by sharing my files on pub.noxon.cc instead of on dl.dropboxusercontent.com, my files are accessible to corporate folks who would otherwise find themselves blocked by an over-zealous web filter. Last but not least, if one of my files becomes too popular, Dropbox won’t shut down my account.

Dropbox doesn’t offer a custom hosting service, so I had to build it. I already have an Apache server, so I created a new virtual host and added some reverse proxy magic. I set up my virtual host as the origin server for the Amazon CloudFront content distribution network, ensuring a minimal load on my own server and the ability to handle virtually unlimited amounts of traffic.

Here’s a recipe for Apache 2.2, mod_proxy, and mod_rewrite:

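# Don't look for index files in directories
DirectoryIndex disabled

# No forward proxying; the reverse proxy rules below still work
ProxyRequests off

RewriteEngine on
# Proxy to Dropbox only when no local file or directory matches
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-d
RewriteRule ^/(.*) http://dl.dropboxusercontent.com/u/xxxxxx/$1 [P,L]
# Rewrite redirect headers from Dropbox to point back at this host
ProxyPassReverse / http://dl.dropboxusercontent.com/u/xxxxxx/

# Replace the upstream caching headers with my own
Header unset cache-control
Header unset Pragma
Header merge cache-control max-age=3600
Header merge cache-control must-revalidate
# Dropbox blocks the CloudFront User-Agent, so send a generic one
RequestHeader set User-Agent Mozilla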

The cache-control settings dictate that CloudFront should cache my content for an hour (3600 seconds). CloudFront currently ignores the specified max-age for 404 results, instead preferring to cache them for about 10 minutes. I’d prefer a shorter lifetime for failed requests, but that’s not easy with Apache 2.2; with 2.4, it’s doable.

The requesting User-Agent override is necessary because Dropbox blocks requests from the Amazon CloudFront User-Agent.

Using mod_rewrite makes it possible to host overlapping content outside of Dropbox. If a file exists on the server, it gets served locally; if it’s missing, Apache tries to fetch it from Dropbox. I locally host the favicon, robots.txt, a 404 handler, and a couple of other things.

If you want to use your own 404 handler, you’ll need this:

ProxyErrorOverride On
ErrorDocument 404 /path/to/404.html

Before you deploy something like this, carefully consider the security implications and make the necessary adjustments. Do you want PHP code in a Dropbox folder running on your server?

Dropbox public folders are not available to users who signed up for Dropbox after July 31, 2012.