Moving to the cloud is a big-ticket item for most development teams at the moment. The cloud presents a variety of advantages over traditional web hosting, such as scalability, availability and durability. It’s not always the case, however, that we can move to the cloud without a few headaches, both in how we design our systems and in how our development teams interact with them. Guides from providers and those in the know set out best practices to light the way, but often these demand large rewrites of our applications and changes to the way we work. For enterprise customers this might fit into the normal pathway of change management, but smaller companies, agencies or startups might well find the cloud a hard proposition to sell to management if it requires months of application changes and upskilling of a development team.
So, is there a better way? I believe there is, and I followed this pathway during a migration onto AWS. We looked at the minimum case for moving and kept that as a primary commitment when planning the work. At every stage the questions were: is this necessary? Are there ways around it? If there are, what is the negative impact? This way we can be on AWS, taking advantage of many of its features, without having had to rewrite applications or retrain a team. From there we can slowly add in those changes and upskill the team over time, making this a pathway that gets the benefits of the cloud with fewer of the major challenges for the business.
Along the way we’ll encounter some challenges. I’ve previously covered this topic in conference talks under the title “What You’ll Miss on AWS and How to Find It”, but for the more detailed explanation in this blog post a better title is, as above, “How to Migrate to AWS Without Changing Everything”.
If you’re currently on a traditional hosting environment such as a VPS, a dedicated box or co-location, you might have one beefy machine that does everything - stores your content, provides a database and runs your code (both web serving and background jobs). With AWS you’re not going to want to do that, for a few reasons:
AWS splits its service into geographic regions, all around the world. Within each region are what are known as Availability Zones: one or more data centres treated as a single location where you can choose to place a virtual resource, such as a server, a network interface or a file system.
These Availability Zones are connected and can share data between them, but Amazon reserve the right to carry out planned maintenance, and a single AZ can go unexpectedly offline. That means if you’re relying on services in that AZ you’ll experience loss of service if it goes down, which removes one of the benefits of being in the cloud. So we need a solution spanning these Availability Zones.
Components of your application will likely scale at different rates. If you run a high-traffic website like a news site, web serving is likely your primary scaling driver, followed maybe by the database. If you’re running a SaaS platform, the database likely takes the brunt of the load, or possibly background job handling. For agencies the biggest scaling need might be storage or serving of media.
By splitting your infrastructure into the logical parts of your service, you can scale the right parts at the right time, and save costs where a component doesn’t need to grow - rather than scaling everything together, as you would with a traditional VPS provider.
Moving away from the concept of one server doing everything lets you take full advantage of the services cloud providers offer. In AWS everything has an acronym, so let’s look at some of those now, and the services behind the letters:
- EC2 - we’ve just said we’ll move away from single servers, but if you’re running a traditional web serving environment (e.g. Apache/Nginx and CGI scripts, or server processes such as Express for Node) you’ll still need these virtual servers, which are referred to as “instances” (they can run a variety of Linux flavours or Windows). EC2, by the way, stands for Elastic Compute Cloud.
- EBS - this is a service you might miss, but it’s essential to EC2. EBS stands for Elastic Block Store, and is a fancy way of saying hard disks. EBS volumes are a bit better than the hard disk in your server, though: they can be detached from instances (usually when those aren’t running) and reattached to other instances for more flexible storage, resized (again, while detached) and snapshotted (even while running, if you don’t mind risking a little consistency). Some EC2 instance types also provide ephemeral “instance storage”, but its contents are lost when the instance stops - use EBS instead.
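That detach/resize/snapshot lifecycle can all be driven from the AWS CLI. Here’s a hedged sketch - the volume and instance IDs are placeholders, and the commands assume you have credentials with EC2 permissions configured:

```shell
# Detach a volume, grow it, and reattach it to another instance.
# vol-0abc12345 and i-0def67890 are placeholder IDs, not real resources.
aws ec2 detach-volume --volume-id vol-0abc12345
aws ec2 modify-volume --volume-id vol-0abc12345 --size 100   # grow to 100 GiB
aws ec2 attach-volume --volume-id vol-0abc12345 \
    --instance-id i-0def67890 --device /dev/xvdf

# Snapshots can be taken even while the volume is attached and in use
aws ec2 create-snapshot --volume-id vol-0abc12345 \
    --description "pre-upgrade backup"
```

Note that after growing a volume you still need to grow the filesystem on it from inside the instance (e.g. `resize2fs` for ext4).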
- RDS - these are basically locked-down EC2 instances running database server software. They offer a bunch of flavours of SQL (MySQL, PostgreSQL, MariaDB, Oracle, MS SQL Server) and extra features such as automated backups, point-in-time restore (using snapshots and binary log replay) and easy-to-spin-up read replicas. They can also operate across Availability Zones, using a synchronous standby replica so you can survive the sudden loss of an AZ.
- S3 - this was the flagship AWS service and is one of the most commonly used by people who don’t run the rest of their infrastructure on AWS. S3 stores files and costs next to nothing. It calls files “objects” and is a nearly-flat storage system: it’s split into buckets, which are kind of like directories except that their names must be unique (globally, not just within your account), and within a bucket each object has a unique name. An object’s name may contain a “/”, at which point the AWS console gives you the appearance of a directory, but there are no actual directories - so you can’t search for a directory name, except as a “prefix” that includes all parent “directories”. As an extra tool S3 can also serve websites, but be careful with permissions - barely a day goes by without some company being found to have made an S3 bucket of sensitive data public!
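Because the namespace is flat, “listing a directory” is really a prefix match on the full key. This tiny, AWS-free shell sketch mimics what `aws s3api list-objects-v2 --prefix` does; the key names are invented for illustration:

```shell
# Four flat object keys - the slashes are just characters in the name
keys='reports/2018/january.pdf
reports/2018/february.pdf
reports/2019/january.pdf
logo.png'

# "List the 2018 directory" really means "list keys starting with reports/2018/"
echo "$keys" | grep '^reports/2018/'
```

You can find everything under `reports/2018/`, but you can’t ask for “every key in a directory called 2018” without knowing the full prefix.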
- EFS - don’t get this confused with EBS; this is still a volume to store files on, but it’s a bit different. EFS is Amazon’s networked file system, the key difference being that it’s accessible via a network mount from EC2 instances in any Availability Zone in your region (and from other regions too, if you want to do fun network peering and love stress). It’s also a “pay for what you use” system, meaning you’re billed for what’s stored rather than for what’s provisioned. We’ll discuss EFS more later, because it solves some problems and creates some new ones.
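Mounting an EFS filesystem from an instance looks like any other NFSv4 mount. A sketch with placeholder filesystem ID and region - your instance’s security group must also allow NFS traffic (port 2049):

```shell
# Mount an EFS filesystem (fs-12345678 and eu-west-1 are placeholders)
# using the mount options AWS recommends for NFSv4.1
sudo mkdir -p /mnt/efs
sudo mount -t nfs4 \
    -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
    fs-12345678.efs.eu-west-1.amazonaws.com:/ /mnt/efs
```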
- ELB/CLB/ALB/NLB - the LB stands for Load Balancer, with the E being Elastic, C being Classic, A being Application and N being Network. The story here is that there used to be just ELB, but it was expanded for different use cases, and the original E was renamed C for those who still had configurations using it. The Application Load Balancer is what most HTTP(S) applications will use, while Network Load Balancers handle rawer TCP traffic, such as might be used for FTP, websocket systems, signalling servers or ping servers.
That’s just scratching the surface of what AWS offers, but it’s a pretty good list of tools we’ll want to use to move our application onto the cloud. There are a few more specific services that we’ll cover later on, as we have need of them.
Back to our availability problem - AWS offers an SLA of 99.95%. That equals about 4.4 hours of downtime per year, which isn’t bad. However, the small print requires you to be running a redundant set-up for this to apply. That means servers in multiple Availability Zones, which leads to the question of how you accomplish that.
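The arithmetic behind that figure is easy to check:

```shell
# 99.95% uptime leaves 0.05% of the year as permitted downtime
awk 'BEGIN { printf "%.2f hours\n", 365.25 * 24 * (1 - 0.9995) }'
# prints "4.38 hours"
```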
You could have one server and move it if its AZ has a problem, but you don’t want to do this:
- Servers take time to build from an image and boot up. That means some downtime, even if it’s only minutes.
- If an AZ goes down, lots of people will have the same idea. That means the few minutes it normally takes to start a server might become 10-20 minutes in the event of an AZ failure.
So you want a minimum of two servers for a production application. Then, if one AZ goes down, you still want to start up another so you’ve still got two servers. Doing this yourself would be time-consuming and mean being on-call all the time. Fortunately another acronym can save the day: ASG, or Auto Scaling Group.
An ASG requires what’s known as a Launch Configuration, and Launch Configurations require yet another acronym, an AMI or Amazon Machine Image.
Amazon offer a selection of pre-made images for Linux and Windows, and there are also community AMIs (which anyone can make) and marketplace AMIs (which you pay for). Most likely you’ll want to take a vanilla Linux image, install the software you need, configure it and package that up as your AMI. There are external tools that can help with the provisioning (such as Chef, Puppet or Ansible) and with creating the image (Packer), but for now here’s a simple way to do it:
- Create a new instance, choosing a Linux AMI when asked (I use Ubuntu)
- Pick a small instance (t2.micro is free for a year from when you set up your account)
- Attach a small EBS volume (8GB should do for an Apache server) and let it be deleted if the instance is terminated
- Open it up to all network traffic (you’re not going to production with this)
- Add an SSH key
Once you complete the wizard you’ll see your instance starting, and within a few minutes you’ll be able to SSH in as the default user with the key you provided.
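If you prefer the CLI to the console wizard, the same launch looks roughly like this - the AMI, key pair and security group IDs are placeholders you’d swap for your own:

```shell
# Launch a single t2.micro from an Ubuntu AMI (all IDs are placeholders)
aws ec2 run-instances \
    --image-id ami-0abcd1234 \
    --instance-type t2.micro \
    --key-name my-ssh-key \
    --security-group-ids sg-0abcd1234 \
    --count 1
```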
From this point follow whatever normal patterns you would to configure a server, but here are some suggestions:
- Update the instance - many AMIs will be on older kernels or core system tools
- Create users for your team, add their SSH keys, give sudo to people you trust
- Prevent people signing in to the instance via SSH with a password (keys only) - you can also firewall ports later in the EC2 network settings
- Install the AWS CloudWatch Agent (this is different from the older CloudWatch Logs Agent or CloudWatch metrics agent) and set it to:
- Send standard metrics
- Stream your auth log at the very least, and possibly syslog and the logs of any web server you install
- Don’t install a database server - you’ll use RDS for that. You might want a database client though.
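The CloudWatch agent reads a JSON config file. Here’s a minimal sketch covering the agent suggestions above - the log group names are assumptions, and the file paths are Ubuntu’s (adjust for your distribution):

```json
{
  "metrics": {
    "metrics_collected": {
      "mem":  { "measurement": ["mem_used_percent"] },
      "disk": { "measurement": ["used_percent"], "resources": ["/"] }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          { "file_path": "/var/log/auth.log",
            "log_group_name": "my-app/auth-log",
            "log_stream_name": "{instance_id}" },
          { "file_path": "/var/log/syslog",
            "log_group_name": "my-app/syslog",
            "log_stream_name": "{instance_id}" }
        ]
      }
    }
  }
}
```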
If you are web serving, set up the environment for serving your application, but don’t worry about deploying code or loading any data just yet.
Once you’re done, shut the instance down (from within it or from the console). Once the console reports it’s stopped, you can create an image of the instance. This generates an AMI (which records key hardware configuration, kernel and network settings) and a snapshot (essentially a full image of the EBS volume).
Once you have an AMI you can terminate the instance, which removes it from your account and deletes the attached EBS volume. (If you just left the instance stopped you could start it again later to make further changes - you don’t pay for stopped instances, but you do pay for their EBS volumes.) Now our account is clean, but we have a custom AMI, ready to create a Launch Configuration.
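The stop/image/terminate sequence can be scripted too. A sketch with a placeholder instance ID - `create-image` prints the new AMI’s ID, which you’ll want to note down:

```shell
# Stop the build instance and wait for it to fully stop (placeholder ID)
aws ec2 stop-instances --instance-ids i-0abc12345
aws ec2 wait instance-stopped --instance-ids i-0abc12345

# Creates the AMI plus the EBS snapshot that backs it
aws ec2 create-image --instance-id i-0abc12345 --name "web-base-v1"

# The build instance is no longer needed
aws ec2 terminate-instances --instance-ids i-0abc12345
```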
The Launch Configuration wizard will feel very familiar after you’ve created an instance - it’s basically the same, except that at the end of the process you won’t have an instance. When choosing an EBS volume size in your Launch Configuration you can pick something larger than your instance’s snapshot, but not smaller. You can also choose to encrypt the volume, even if the original wasn’t encrypted.
Once your Launch Configuration is done you finally get to start some servers. Creating an Auto Scaling Group is the simplest process so far: choose a configuration, specify how many instances you want (start with a minimum of 2 and a maximum of 3, and ignore “desired”) and leave everything else at its default. As soon as your ASG is created you’ll see it start to create instances for you, up to the minimum specified. In future you could use metrics to measure how these are performing, and a CloudWatch Alarm to ask the group to increase its number of instances, but for now you’re safely provisioned across multiple AZs.
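Both steps have CLI equivalents. A sketch with placeholder IDs - the two subnets should sit in different Availability Zones, which is what spreads the group across AZs:

```shell
# Launch Configuration: ties your custom AMI to an instance type and key
aws autoscaling create-launch-configuration \
    --launch-configuration-name web-lc-v1 \
    --image-id ami-0abcd1234 \
    --instance-type t2.micro \
    --key-name my-ssh-key \
    --security-groups sg-0abcd1234

# The group itself: min 2 / max 3, spread over two subnets (= two AZs)
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name web-asg \
    --launch-configuration-name web-lc-v1 \
    --min-size 2 --max-size 3 \
    --vpc-zone-identifier "subnet-0aaa1111,subnet-0bbb2222"
```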
Creating a database instance is much simpler - follow the wizard, choose Single-AZ for testing or Multi-AZ for production, and AWS does the rest. You can connect to it as you’d connect to any database from your application; just remember you’ll be connecting to the host name shown in the console rather than “localhost” (and you’ll need to ensure your database users aren’t bound to a specific host).
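The equivalent CLI call, again with placeholder values - `--multi-az` is the flag that buys you the synchronous standby:

```shell
# A small Multi-AZ MySQL instance; the identifier, credentials and size
# are placeholders - pick your own, and keep passwords out of shell history
aws rds create-db-instance \
    --db-instance-identifier my-app-db \
    --db-instance-class db.t2.micro \
    --engine mysql \
    --allocated-storage 20 \
    --master-username appadmin \
    --master-user-password 'change-me-please' \
    --multi-az
```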
This article never got finished; instead it ended up as the basis for a talk at PHPUK 2018, slides for which are here. That then formed the basis of a workshop I ran at PHPUK 2019, the repository for which is available here. I may write about this in more detail in future - drop me a message!