Migrating virtual machines from Amazon EC2 to Google Compute Engine

My Amazon EC2 discount contract is almost up, and I’ve been playing with Google Compute Engine (GCE). Initial impressions are that it’s faster and costs less money, particularly if you don’t want to pay up-front for EC2 reserved instances. Google’s web console is more modern than Amazon’s, though slightly less sophisticated. Google’s CLI tools are much faster and don’t require Java. Google’s API uses JSON instead of XML.

In terms of capabilities, GCE is not as advanced as EC2, but it’s vastly more powerful than Linode, Digital Ocean, and the like. One exception is that Google doesn’t permit sending SMTP directly from GCE instances. They have a partnership with Sendgrid for that. I’m using Mandrill instead, and so far I’m very pleased with that choice.

Migration from EC2 to GCE without re-installation

It’s possible to migrate virtual machines from EC2 to GCE. This post explains how I migrated my production Ubuntu 12.04 LTS instance. It’s not a detailed guide. If you possess a good amount of Linux operations knowledge, I hope the information here will help you do your own migration quickly.

Important differences between EC2 and GCE

EC2 uses Xen for virtualization. GCE uses KVM.

Most EC2 instances are paravirtualized (PV). They do not emulate actual PC hardware, and depend on Xen support in the kernel. Most of the time, EC2 instances use PVGRUB to boot. PVGRUB is part of the Amazon Kernel Image (aki-xxxxxxxx) associated with your instance. PVGRUB basically parses a GRUB configuration file in your root filesystem, figures out what kernel you want to boot, and tells Xen to boot it. You never actually run GRUB inside your instance.

With KVM, you have a full hardware virtual machine that emulates a PC. It requires a functioning bootloader in your boot disk image. Without one, you won’t boot. Fixing this, and using a kernel with the proper support, are the two main obstacles in migrating a machine from EC2 to GCE.

Let’s get started.

On EC2:

  • Snapshot your system before you do anything else. If you’re paranoid, create the snapshot while your system isn’t running.
  • Install a recent kernel. The Ubuntu 12.04 LTS kernel images don’t have the virtio SCSI driver needed by GCE. I used HPA’s 3.13.11 generic kernel. (These days it isn’t necessary to use a “virtual” kernel image. The generic ones have all the paravirtualized drivers and Xen/KVM guest support.)
  • Make sure your EC2 system still boots! If it doesn’t boot on EC2, it won’t do much good on GCE.
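For reference, installing a mainline kernel on Ubuntu amounts to downloading the .deb packages and installing them with dpkg. The URL and file names below are placeholders, not the exact builds I used; pick a current mainline build with virtio-scsi support for your architecture:

```shell
# Sketch: install a newer kernel on the EC2 instance before migrating.
# Replace the placeholder path/filenames with a real mainline build.
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/vVERSION/linux-image-VERSION-generic_amd64.deb
sudo dpkg -i linux-image-*.deb

# On a PV instance, confirm the boot menu now lists the new kernel,
# then reboot and verify the instance still comes up.
grep ^title /boot/grub/menu.lst
sudo reboot
```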

On GCE:

  • Create and boot a new (temporary) instance on GCE using one of their existing distribution bundles.
  • Create a new volume large enough to receive the boot volume you have at EC2, and attach it to your temporary instance.
  • Create an MBR partition table on the target volume, partition it, and create a root filesystem.
  • Mount your new filesystem.
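A minimal sketch of the partitioning steps above, assuming the new volume shows up as /dev/sdb on the temporary instance and using an example filesystem label of "rootfs" (adjust both to your setup):

```shell
# Create an MBR (msdos) partition table, one full-size partition,
# and an ext4 root filesystem on the target volume.
sudo parted /dev/sdb --script mklabel msdos
sudo parted /dev/sdb --script mkpart primary ext4 1MiB 100%
sudo mkfs.ext4 -L rootfs /dev/sdb1   # "rootfs" is an example label
sudo mkdir -p /mnt/target
sudo mount /dev/sdb1 /mnt/target
```

The filesystem label you choose here is the one you will later reference with root=LABEL=... in /etc/default/grub.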

On EC2:

  • Copy data to your new GCE filesystem. Use any method you like; consider creating a volume on EC2 from the snapshot you just created and using that as your source. That ensures you copy device nodes and other things you might otherwise overlook. Remember to use a method that preserves hard links, sparse files, extended attributes, ACLs, and so on.
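One way to do the copy, run from an EC2 instance with a volume created from the snapshot attached. The device name, mount point, and the gce-temp SSH host alias are all examples:

```shell
# Mount the snapshot volume read-only as the copy source.
sudo mkdir -p /mnt/snapshot
sudo mount -o ro /dev/xvdf1 /mnt/snapshot

# -a archive, -H hard links, -A ACLs, -X xattrs, -S sparse files,
# -x don't cross filesystem boundaries, --numeric-ids keep raw uids/gids.
sudo rsync -aHAXSx --numeric-ids -e ssh \
    /mnt/snapshot/ gce-temp:/mnt/target/
```

If the remote user isn't root, you may also need --rsync-path="sudo rsync" so the receiving side can create device nodes and set ownership.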

On GCE:

  • Verify the data arrived on your target volume and everything looks OK.
  • Bind-mount /proc and /dev into your target volume and chroot into it.
  • Install grub-pc (or whatever package provides GRUB 2 on your distribution).
  • Remove any legacy EC2 GRUB packages you might have (e.g. grub-legacy-ec2).
  • Remove /boot/grub/menu.lst.
  • Add and edit the following in /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0,38400n8 ro root=LABEL=(your root fs label)"
GRUB_TERMINAL=serial
GRUB_SERIAL_COMMAND="serial --speed=38400 --unit=0 --word=8 --parity=no --stop=1"
  • Run update-grub
  • Install grub onto your new volume (probably grub-install /dev/sdb).
  • Edit your fstab to disable any other disks you haven’t migrated over
  • Edit the hostname (/etc/hostname)
  • Edit /etc/resolv.conf to use a valid resolver
  • Uninstall any ec2-specific software packages.
  • Exit the chroot
  • Un-mount the bind mounts and target fs
  • Detach the target fs
  • Create a new GCE instance using the target fs, and boot!
  • If it boots, destroy your temporary instance. If it doesn’t, re-attach the target disk to it and see what went wrong.
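The chroot portion of the list above, as one sketch. It assumes the target filesystem is mounted at /mnt/target, the target volume is /dev/sdb, and an Ubuntu/Debian-style package set:

```shell
# Bind-mount /dev and /proc so GRUB and the package tools work in the chroot.
sudo mount --bind /dev  /mnt/target/dev
sudo mount --bind /proc /mnt/target/proc
sudo chroot /mnt/target /bin/bash

# --- inside the chroot ---
apt-get update
apt-get install grub-pc            # provides GRUB 2
apt-get purge grub-legacy-ec2      # if present
rm -f /boot/grub/menu.lst
update-grub                        # reads /etc/default/grub
grub-install /dev/sdb              # install GRUB to the target volume's MBR
exit

# --- back on the temporary instance ---
sudo umount /mnt/target/proc /mnt/target/dev
sudo umount /mnt/target
```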

These are the minimum changes required to boot the image on GCE. You’ll still want to clean things up and make changes according to Google’s suggestions.

Troubleshooting

Check the serial console output. Is the kernel starting?

... KVM messages omitted ...
Booting from Hard Disk...
Booting from 0000:7c00
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.14.3-031403-generic (apw@gomeisa) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201405061153 SMP Tue May 6 15:54:50 UTC 2014

If you don’t see anything after “Booting from 0000:7c00”, then you haven’t installed GRUB properly.

If the kernel starts but the root filesystem doesn’t mount, make sure you see the root disk being detected. Make sure the root disk label is properly set in the filesystem and the GRUB configuration.
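If you use the gcloud SDK, the serial console can also be pulled from the command line; the instance and zone names below are examples (older SDK versions used gcutil getserialportoutput instead):

```shell
# Dump the serial console output of the new instance.
gcloud compute instances get-serial-port-output migrated-instance \
    --zone us-central1-a
```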

Please help me improve this post. Leave a comment below!

3 thoughts on “Migrating virtual machines from Amazon EC2 to Google Compute Engine”

  1. steven

    Great work! I have software currently running on EC2 in a paravirtualized (PV) instance with a reservation that is about to expire. Before renewing the reservation, I wanted to test other offerings.

    I used the steps outlined above to successfully migrate the instance to both GCE and an EC2 HVM instance so I could run some benchmarks and compare. I was surprised to find that my software (both CPU- and I/O-intensive) performed very similarly on Amazon’s and Google’s platforms. In the end, EC2 (PV) performance was ever so slightly (~5%) better, but again, that is with my software running for a few days with the same data sets in all environments.

    My biggest stumbling block was that I had no output from the EC2 console until I finally figured out how to make it work; basically I was working blind. With GCE, the console was there to help me figure out some small details.

  2. steven

    Update to the test mentioned in the previous comment. The difference in performance between EC2’s PV and HVM instances may have been due to random factors such as neighbor activity. Further testing (running for almost 24 hours) has shown that I’m getting virtually identical performance out of both EC2 environments, which continue to be about 5% better than GCE.

    1. Jeff Noxon (post author)

      It’s probably worth noting that I’ve been testing the small instances (m1.small on EC2 and g1-small on GCE) … and for those, GCE is noticeably faster… 30% or more.

