Deploy a secure static site with AWS & Terraform

Originally published at AWS Advent. Look out soon for a new post describing how you can auto-publish your site directly from GitHub using GitHub Actions.

There are many uses for static websites. A static site is, of course, the simplest form of website. Every website consists of delivering HTML, CSS and other resources to a browser, but with a static website, that HTML and CSS are delivered the same to every user, regardless of how they’ve interacted with your site previously. There’s no database, authentication or anything else associated with sending the site to the user – just a straight HTTPS connection and some text content. This content can benefit from things like edge caching for faster delivery and poses little risk of bugs, as there’s not much in the way of code being sent down the wire.

Until recently that’s all a static site would do. They were favoured mainly by small businesses happy to write HTML by hand (rather than use a content management system like WordPress), by throw-away websites for company promotions (often known as brochureware) and by hobbyist sites.

Of course, with the advent of MVC JavaScript frameworks, microservices and serverless computing, the static website has another use: as a delivery tool for JavaScript + API based web services whose initial load may just require some HTML and the JS files which power the application, but whose code then makes API requests to handle all the other aspects of user interaction – authentication, data submission, search and so on.

Whilst people wanting to deploy static sites may therefore fall into two quite different categories – for one the site is the whole of their business; for the other it’s a very minor part compared to the API – their requirements will be broadly the same. In this article we’re going to explore deploying a static site which meets the following criteria:

  • Must work at the root domain of a business, e.g. example.com
  • Must redirect from the common (but unnecessary) www. subdomain to the root domain
  • Must be served via HTTPS (and upgrade HTTP to HTTPS)
  • Must support “pretty” canonical URLs – e.g. example.com/about-us rather than example.com/about-us.html
  • Must not cost anything when not being accessed (except for domain name costs)

What AWS offers

AWS can achieve these aims through use of the following services:

  • S3
  • Cloudfront
  • ACM (AWS Certificate Manager)
  • Route53
  • Lambda

That might seem like quite a lot of services to host a simple static website; let’s look at that list again but summarise why each item is being used:

  • S3 – object storage; allows you to put files in the cloud. Other AWS users or AWS services may be permitted access to these files. They may also be made public. S3 supports website hosting, but only via HTTP. For HTTPS you need…
  • Cloudfront – content delivery system; can sit in front of an S3 bucket or a website served via any other domain (doesn’t need to be on AWS) and deliver files from servers close to users, caching them if allowed. Allows you to import HTTPS certificates managed by…
  • ACM – generates and stores certificates (you can also upload your own). It will automatically renew certificates which it generates. To generate a certificate your domain must be validated by adding custom CNAME records. This can be done automatically in…
  • Route53 – AWS nameservers and DNS service. R53 replaces your domain provider’s nameservers (at a cost of $0.50 per month per domain) and allows both traditional DNS records (A, CNAME, MX, TXT etc) and “alias” records which map to a specific other AWS service – such as S3 websites or Cloudfront distributions. Thus an A record on your root domain can link directly to Cloudfront, and your CNAMEs to validate your ACM certificate can also be automatically provisioned
  • Lambda – functions as a service. Lambda lets you run custom code on events, which can come directly or from a variety of other AWS services. Crucially you can put a Lambda function into Cloudfront, manipulating requests or responses as they’re received from or sent to your users. This is how we’ll make our URLs look nice
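As a taste of how the Lambda-to-CloudFront wiring fits together, here is a Terraform sketch. It is illustrative only – the resource names are invented and much of the configuration is omitted; the real configuration lives in the modules covered later:

```hcl
# Sketch: attach a Lambda@Edge function to a CloudFront distribution so it
# can rewrite incoming requests (e.g. /about-us -> /about-us.html).
# Resource names here are illustrative, not the ones used in the repo.
resource "aws_cloudfront_distribution" "site" {
  # ... origin, certificate and domain configuration omitted ...

  default_cache_behavior {
    # ... caching configuration omitted ...

    lambda_function_association {
      event_type = "origin-request"  # runs before CloudFront contacts S3
      lambda_arn = aws_lambda_function.rewrite.qualified_arn
    }
  }
}
```

Note that Lambda@Edge requires a versioned function ARN, hence `qualified_arn` rather than `arn`.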

Hopefully that gives you some understanding of the services – you could cut out Cloudfront and ACM if you didn’t care about HTTPS, but you’d be very wrong to not care about HTTPS – it really shouldn’t be a choice any more.

Update, 4th May 2021: AWS have announced CloudFront Functions which may become a simpler/cheaper way to handle the URL rewrite than using Lambda@Edge. I will update this guide if I modify the website to use that instead

All this is well and good, but whilst AWS is powerful their console leaves much to be desired, and setting up one site can take some time – replicating it for multiple sites is as much an exercise in memory and box ticking as it is in technical prowess. What we need is a way to do this once, or even better have somebody else do this once, and then replicate it as many times as we need.

Enter Terraform

One of the most powerful parts of AWS isn’t clear when you first start using the console to manage your resources. AWS has a super powerful API that drives pretty much everything. It’s key to so much of their own automation, to the entirety of their security model and to third-party tools like Terraform.

Terraform is an “Infrastructure-as-Code” (IaC) tool. It lets you define resources on a variety of cloud providers and then runs commands to:

  • Check the current state of your environment
  • Make required changes such that your actual environment matches the code you’ve written

In code form, Terraform uses blocks of code called resources which look like this:

resource "aws_s3_bucket" "some-internal-reference" {
  bucket = "my-bucket-name"
}

Each resource can include variables (documented on the provider’s website) and these can be text, numbers, true/false, lists (of the above) or maps (basically like sub resources with their own variables).
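For example, a single (hypothetical) resource might mix several of these types – the bucket below is illustrative and not one from this article’s repository:

```hcl
# Illustrative only – exact arguments depend on your AWS provider version.
resource "aws_s3_bucket" "assets" {
  bucket        = "mycompany-assets"   # text (string)
  force_destroy = false                # true/false (bool)

  cors_rule {                          # a sub-block with its own variables
    allowed_methods = ["GET", "HEAD"]  # list of text values
    allowed_origins = ["https://example.com"]
  }

  tags = {                             # map of text values
    Project = "static-site"
  }
}
```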

Terraform is distributed as pre-built binaries (it’s also open source, written in Go, so you can build it yourself) that you can run simply by downloading them, making them executable and then executing them. In order to work with AWS you need to define a “provider”, which is formatted similarly to a resource:

provider "aws" {
}

To call any AWS API (via the command line, Terraform or a language of your choice) you’ll need to generate an access key and secret key for the account you’d like to use. That’s beyond the scope of this article, but since you should avoid hardcoding those credentials into Terraform – and since the AWS CLI is well worth having anyway – skip over to the AWS CLI setup instructions and configure it with the correct keys before continuing.

(NB: in this step you’re best provisioning an account with admin rights, or at least full access to IAM, S3, Route53, Cloudfront, ACM & Lambda. However don’t be tempted to create access keys for your root account – AWS recommends against this)
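With the CLI configured, your provider block will typically name the region and the credentials profile to use. The profile name below is an assumption – use whatever you called yours in ~/.aws/credentials:

```hcl
provider "aws" {
  region  = "eu-west-1"    # pick the region you'll deploy to
  profile = "static-site"  # assumed profile name from ~/.aws/credentials
}
```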

Now that you’ve got your system set up to use AWS programmatically, installed Terraform and been introduced to the basics of its syntax, it’s a good time to take a look at our code on GitHub.

Clone the repository above; you’ll see we have one file in the root (main.tf.example) and then a directory called modules. One of the best parts of Terraform is modules and how they behave. Modules allow one user to define a specific set of infrastructure that may either relate directly to each other or interact by being on the same account. These modules can define variables allowing some aspects (names, domains, tags) to be customised, whilst other items that may be necessary for the module to function (like a certain configuration of a CloudFront distribution) are fixed.

To start off, run bash ./setup, which will copy the example file to main.tf and also ensure your local Terraform installation has the correct providers (AWS and file archiving), as well as set up the modules. In main.tf you’ll then see a suggested set-up using three modules. Of course, you’d be free to remove main.tf entirely and use each module in its own right, but for this tutorial it helps to have a complete picture.

At the top of the main.tf file are defined three variables which you’ll need to fill in correctly:

  • The first is the domain you wish to use – it can be your root domain (example.com) or any sort of subdomain (my-site.example.com).
  • Second, you’ll need the Zone ID associated with your domain on Route 53. Each Route 53 domain gets a zone ID which relates to AWS’ internal domain mapping system. To find your Zone ID visit the Route53 Hosted Zones page whilst signed in to your AWS account and check the right-hand column next to the root domain you’re interested in using for your static site.
  • Finally choose a region; if you already use AWS you may have a preferred region, otherwise, choose one from the AWS list nearest to you. As a note, it’s generally best to avoid us-east-1 where possible, as on balance this tends to have more issues arise due to its centrality in various AWS services.
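Filled in, the top of main.tf will look something like this. The variable names below are hypothetical – match them to the ones actually declared in the file:

```hcl
# Hypothetical variable names – check main.tf for the real ones.
variable "site_domain" {
  default = "example.com"  # the domain the site will be served from
}

variable "route53_zone_id" {
  default = "Z0123456789ABCDEFGHIJ"  # from the Route53 Hosted Zones console
}

variable "aws_region" {
  default = "eu-west-1"  # any region; us-east-1 best avoided
}
```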

Now for the fun part. Run terraform plan – if your AWS CLI environment is set up the plan should execute and show the creation of a whole list of resources – S3 Buckets, CloudFront distributions, a number of DNS records and even some new IAM roles & policies. If this bit fails entirely, check that the provider entity in main.tf is using the right profile name based on your ~/.aws/credentials file.

Once the plan has run and told you it’s creating resources (it shouldn’t say updating or destroying at this point), you’re ready to go. Run terraform apply – this basically does another plan, but at the end, you can type yes to have Terraform create the resources. This can take a while as Terraform has to call various AWS APIs and some are quicker than others – DNS records can be slightly slower, and ACM generation may wait until it’s verified DNS before returning a positive response. Be patient and eventually it will inform you that it’s finished, or tell you if there have been problems applying.

If the plan or apply options have problems you may need to change some of your variables based on the following possible issues:

  • Names of S3 buckets should be globally unique – so if anyone in the world has a bucket with the name you want, you can’t have it. A good system is to prefix buckets with your company name, or suffix them with random characters.
  • You must not already have an A record for your root or www. domain in Route53.
  • You must not already have an ACM certificate for your root domain.


Go into the AWS console and browse S3, Cloudfront, Route53 and you should see your various resources created. You can also view the Lambda function and ACM but be aware that for the former you’ll need to be in the specific region you chose to run in, and for the latter you must select us-east-1 (N. Virginia)

What now?

It’s time to deploy a website. This is the easy part – you can use the S3 console to drag and drop files (remember to use the website bucket and not the logs or www redirect buckets), use the AWS CLI to upload files yourself (via aws s3 cp or aws s3 sync), or run the example bash script provided in the repo, which takes one argument: a directory of all the files you want to upload. Be aware – any files uploaded to your bucket will immediately be public on the internet if somebody knows the URL!

If you don’t have a website, check the “example-website” directory – running the bash script above without any arguments will deploy this for you. Once you’ve deployed something, visit your domain and all being well you should see your site. Cloudfront distributions have a variable time to set up so in some cases it might be 15ish minutes before the site works as expected.

Note also that Cloudfront is set to cache files for 5 minutes; even a hard refresh won’t reload resource files like CSS or JavaScript, as Cloudfront won’t fetch them again from your bucket until 5 minutes after first fetching them. During development you may wish to turn this off – you can do this in the Cloudfront console by setting the TTL values to 0. Once you’re ready to go live, run terraform apply again and it will reconfigure Cloudfront to the recommended settings.
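That caching behaviour corresponds to the TTL settings on the CloudFront distribution’s cache behavior block – in Terraform terms, something like the sketch below (values and surrounding configuration illustrative):

```hcl
# Sketch of the relevant cache settings – a 5 minute default TTL.
# Setting all three to 0 effectively disables caching during development.
default_cache_behavior {
  # ... other behaviour settings omitted ...
  min_ttl     = 0
  default_ttl = 300  # 5 minutes, as described above
  max_ttl     = 300
}
```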

In summary

With a minimal amount of work we now have a framework that can deploy a secure static site to any domain we choose in a matter of minutes. We could use this to rapidly deploy websites for marketing clients, publish a blog generated with a static site builder like Jekyll, or use it as the basis for a serverless web application using ReactJS delivered to the client and a back-end provided by AWS Lambda accessed via AWS API Gateway or (newly released) an AWS Application Load Balancer.

Bonus: yes this site was deployed using the exact tools from this article. Boom!