Handling environment variables in Lambda@Edge

I recently came across the issue of managing environment variables in Lambda functions when they are deployed @Edge, i.e. are made part of a cache behaviour for a CloudFront distribution. It turns out it is not possible to use environment variables when a function is used in this manner. Fortunately there is a workaround which may handle some common requirements, if you happen to be using Terraform.

Lambda environment variables

Environment variables allow you to avoid hardcoding information in your Lambda runtime code, and instead you save them as part of your infrastructure layer. For example instead of this code:

const PASSWORD = 'p@ssword'

You would write:

const PASSWORD = process.env.PASSWORD

And then add the value when you create the function. If you are using Terraform to create your function then your code might look like this:

resource "aws_lambda_function" "my-lambda" {
  function_name    = "myLambda"
  role             = aws_iam_role.my-role.arn
  filename         = data.archive_file.my-function.output_path
  source_code_hash = data.archive_file.my-function.output_base64sha256
  runtime          = "nodejs12.x"
  handler          = "index.handler"

  environment {
    variables = {
      PASSWORD = "p@ssword"
    }
  }
}

Why use environment variables anyway?

Like environment variables in our terminals, containers or programs, env vars in Lambda increase the flexibility of your code, and decouple parts of the implementation from resources such as keys, file names, domains etc.

  • Secrets: you may want to keep an API key on a need-to-know basis, permissioned to whoever manages your infrastructure but not to anybody writing code
  • Reusability: the same Lambda function may be deployed in multiple places; but if there’s any external access it’s likely that resource names may need to be different between deployments
  • Testable code: hardcoded strings that refer to some external entity tend to make testing harder, so code that receives its variables from the surrounding scope is easier to test

The specific use case I was addressing was loading a secret from a named JSON file in a bucket. When the Lambda needed to be reused all of the logic worked, but it was no longer possible to hardcode a bucket name, because S3 bucket names are globally unique: two accounts cannot own a bucket with the same name.

What’s this “@Edge” business?

As with many AWS names, some descriptions cause more confusion than they seem worth. Lambda@Edge is a way to run Lambda functions as part of a CloudFront distribution. CloudFront is AWS' global content delivery network; it's commonly used to serve static content from S3 buckets, but can also be pointed at any other web server (not necessarily hosted on AWS) to act as a caching layer, serving content at locations close to users - edge locations.

As well as caching data and allowing you to put tools like WAF or SSL in front of other websites, CloudFront can execute Lambda functions on either the user's request (before it's sent to the "origin") or the response that comes back from the origin. CloudFront can do some minor response modifications natively (changing cache headers or compressing content), but by adding Lambda you have the opportunity to build out custom middleware for your content delivery.

Deploying your Lambda functions to the edge comes with a few differences: they must be created in the us-east-1 region, and must publish a numbered version. This is because, unlike regular Lambdas, Lambdas at the edge need to be explicitly deployed to each edge location (most CloudFront settings can take tens of minutes to be applied globally). There is a limited choice of runtimes - only Node.js and Python - no layers, and no X-Ray traces. Oh, and for some reason they can't use environment variables.
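In Terraform terms those constraints usually look something like the sketch below: a provider alias pinned to us-east-1 plus publish = true (the alias name is my own choice, and the role and archive resources are assumed to exist as in the earlier examples):

```terraform
# Lambda@Edge functions must live in us-east-1, so a provider alias
# is commonly used when the rest of the stack is in another region.
provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

resource "aws_lambda_function" "my-edge-lambda" {
  provider      = aws.us_east_1
  function_name = "myEdgeLambda"
  role          = aws_iam_role.my-role.arn
  filename      = data.archive_file.my-function.output_path
  runtime       = "nodejs12.x"
  handler       = "index.handler"

  # A published, numbered version is required before the function
  # can be associated with a CloudFront cache behaviour.
  publish = true
}
```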

Using Terraform to give Lambda@Edge back its environment variables

Terraform supports uploading Lambda functions by zipping together some content:

data "archive_file" "my-function" {
  type        = "zip"
  output_path = "./my-function.zip"

  source {
    content  = templatefile("./index.js", { password = "p@ssword" })
    filename = "index.js"
  }
}

resource "aws_lambda_function" "my-lambda" {
  function_name    = "myLambda"
  role             = aws_iam_role.my-function.output_path
  filename         = data.archive_file.my-function.output_path
  source_code_hash = data.archive_file.my-function.output_base64sha256
  runtime          = "nodejs12.x"
  handler          = "index.handler"

  # This is also needed for Lambda@Edge
  publish = true

  # No environment variables now :-(
}

And so we’ve modified the top of our index.js file:

const PASSWORD = '${password}';

This will bake the environment variables into your code at the point it's deployed. It does slightly change the access model: a user who can read the code but not the environment variables can now see your "hardcoded" values. But it doesn't seem like a common use case for someone to have such specific read access to a Lambda function - especially not when the function is being deployed via Terraform anyway.

The only other deficiency of this mechanism is that, reading your JS code, each environment variable is now just a templated string, which the language's own tooling can't help you manage. You can prevent this becoming too complex by declaring all your env vars as consts at the top of your index file.

An example

Talk is cheap, show me the code, right? Visit my example repository on GitHub to see this for yourself; for best results run it in Terraform for real and explore the zip file to see how the code changes when rendered via Terraform’s template method.

Reckon this is all nonsense and there’s a better way? Hit me up on Twitter: @m1ke