Deploying AWS Lambda Functions with Ansible

N.B. This is a cross-post from the YPlan tech blog.

AWS Lambda is a hip “serverless” technology that everyone is talking about. While I find ‘serverless’ a bit of a funny name—it just means a managed environment—I like using Lambda functions for infrastructure glue or running small internal applications. Putting these things on individually managed servers of their own has always been a pain, so Lambda is a perfect fit.

Here’s what we’re currently running on Lambda:

Deploying Lambda functions was a mixed experience at first, but it is now relatively easy. When we started using Lambda, there wasn’t really a good way of automating deployment, so we’d create functions in the console and then update them with CLI commands. However, there is now Cloudformation support for all the necessary pieces, which is what we’re using.

Ansible has also made this more pleasant, supporting Cloudformation stacks as YAML files since v2.0. It’s about ten times easier to read and edit than JSON, without needing to turn to tools like Troposphere.

All of our Lambda functions use Python; our backend is Python/Django, so it makes sense to stick with what we know. To deploy one, we run an Ansible playbook that builds it into a zip file, uploads that to S3, then creates/updates the relevant Cloudformation stack as necessary.

We’ll show you what that looks like for an example function. All the code and supporting files are available, ready to run, in our Github repo ansible-python-lambda if you want to follow along!
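
To give an idea of how the pieces fit together, the layout we’ll be working with looks roughly like this (the playbook filename is only an example; the other paths match those used below):

deploy_my_lambda_function.yml     # top-level playbook
my_lambda_function/               # the function's source code
includes/
    init_workspace.yml
    build_zip.yml
    copy_to_s3.yml
files/
    my_lambda_function_stack.yml  # Cloudformation template
workspace/                        # git-ignored, rebuilt on each run
build/                            # git-ignored, holds the built zips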

An Example Playbook

At the top level we have an Ansible playbook that looks something like this:

- name: deploy my_lambda_function
  hosts: 127.0.0.1
  connection: local
  vars:
    aws_region: eu-west-1
    code_path: my_lambda_function/
    s3_bucket: my-lambda-bucket
    zip_name: my_lambda_function.zip
  tasks:
    - include: includes/init_workspace.yml
    - include: includes/build_zip.yml
    - include: includes/copy_to_s3.yml

    - name: cloudformation stack
      cloudformation:
        region: '{{ aws_region }}'
        stack_name: my-lambda-function
        template: files/my_lambda_function_stack.yml
        template_format: yaml
        template_parameters:
          S3Bucket: '{{ s3_bucket }}'
          S3Key: '{{ zip_name }}'
          S3ObjectVersion: '{{ s3_version_id }}'

This declares some variables we use to build the Lambda function, includes several pieces that build the zip file and upload it to S3, then calls cloudformation to create/update our stack. We’ve split the building and uploading of the zip file into several includes because different Lambda functions need slightly different handling; for example, our site monitor function is deployed in multiple regions, so copy_to_s3 and the cloudformation task need to be called multiple times with different arguments.
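
For instance, a second region could be handled by calling the include again with different variables, roughly like this (the extra region and bucket name here are hypothetical; note that Lambda requires its code bucket to be in the same region as the function):

- include: includes/copy_to_s3.yml
  vars:
    aws_region: us-east-1
    s3_bucket: my-lambda-bucket-us-east-1  # code bucket must live in the target region

- name: cloudformation stack in us-east-1
  cloudformation:
    region: us-east-1
    stack_name: my-lambda-function
    template: files/my_lambda_function_stack.yml
    template_format: yaml
    template_parameters:
      S3Bucket: my-lambda-bucket-us-east-1
      S3Key: '{{ zip_name }}'
      S3ObjectVersion: '{{ s3_version_id }}'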

Let’s look at each include in turn now…

init_workspace

We use a workspace folder to build the zip file, taking a copy of the code and then adding other stuff such as dependencies or unencrypted config files. We do this in a temporary directory that is ignored by git.
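
Both this workspace directory and the build directory used later for the zip files are kept out of version control; the .gitignore entries amount to:

workspace/
build/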

This playbook looks like:

- name: remove workspace dir
  file:
    path: workspace/
    state: absent
    force: yes

- name: create workspace dir
  file:
    path: workspace/
    state: directory

- name: copy in code
  synchronize:
    src: '{{ code_path }}'
    dest: workspace/
    delete: yes

- name: setup.cfg for no prefix
  copy:
    dest: workspace/setup.cfg
    content: |
      [install]
      prefix=

- name: get absolute path of workspace
  changed_when: false
  command: pwd
  args:
    chdir: workspace
  register: abs_workspace_path

- name: check for requirements.txt
  changed_when: false
  stat:
    path: '{{ abs_workspace_path.stdout }}/requirements.txt'
  register: requirements_result

- name: install dependencies
  when: requirements_result.stat.exists
  pip:
    chdir: '{{ abs_workspace_path.stdout }}'
    extra_args: '-t .'  # install here, no virtualenv
    requirements: requirements.txt

- name: erase .pyc files
  command: find . -type f -name "*.py[co]" -delete
  args:
    chdir: '{{ abs_workspace_path.stdout }}'

This way of installing dependencies, using pip with an empty prefix=, is taken directly from the AWS Lambda Python Tutorial. We’ve made the pip invocation conditional on requirements.txt existing, since we have a couple of Lambda functions that depend on nothing more than boto3, which is available in the Lambda environment by default.

As mentioned, after this step the individual Lambdas’ playbooks often add other files, such as decrypted configuration.
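
For example, a function that needs credentials might render them from an Ansible Vault variable into the workspace before zipping. A minimal sketch, where my_lambda_config is a hypothetical vaulted dictionary and config.json an arbitrary filename:

- name: add decrypted config to workspace
  copy:
    dest: workspace/config.json
    content: '{{ my_lambda_config | to_nice_json }}'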

build_zip

This step is relatively simple, but it’s important to invoke the zip utility from inside the workspace so that the handler module (main.py here) and its dependencies end up at the top level of the archive, which is where Lambda expects them. We keep our zip files in another git-ignored build directory, rather than e.g. /tmp, so they can easily be inspected during debugging.

- name: remove any old zip
  file:
    path: build/{{ zip_name }}
    state: absent

- name: zip package
  command: zip -r ../build/{{ zip_name }} .
  args:
    chdir: workspace/

copy_to_s3

Uploading to S3 can be done with Ansible’s s3 module, but unfortunately it doesn’t give back the version id of the object it uploads, and you need that exact version id to update Lambda correctly. To get it we have to use the AWS CLI (pip install awscli) for just one API call. We create this S3 bucket in another playbook that controls all of our S3 configuration, so if you’re following along you’ll need to make sure the bucket exists and has versioning enabled before running.

- name: copy package to s3
  s3:
    mode: put
    src: build/{{ zip_name }}
    region: '{{ aws_region }}'
    bucket: '{{ s3_bucket }}'
    object: '{{ zip_name }}'  # versioning on for bucket

- name: find the s3 version we just uploaded
  command: >
    aws s3api head-object
      --region {{ aws_region }}
      --bucket {{ s3_bucket }}
      --key {{ zip_name }}
  register: head_object_output

- name: set s3_version_id
  set_fact:
    s3_version_id: '{{ (head_object_output.stdout|from_json)["VersionId"] }}'
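
If you’d rather manage the bucket from Ansible too, the s3_bucket module (also new in 2.0) can ensure it exists with versioning turned on; a minimal sketch:

- name: ensure versioned bucket for lambda code
  s3_bucket:
    name: '{{ s3_bucket }}'
    region: '{{ aws_region }}'
    versioning: yes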

Cloudformation stack

After the three includes above, the cloudformation module is invoked to make sure the Lambda function and any accompanying resources exist and are up to date. These stacks vary between our individual Lambda functions, but there are some best practices we’ve settled on:

  • Always create an IAM role for the individual function, rather than rely on a generic one. This ensures that permissions are looked at and can be as granular as possible.
  • Always create a Cloudwatch Alarm to track errors in the function, hooked up to SNS to email whoever is responsible for it. We haven’t settled on hooking up something like Sentry for exception catching yet, so at least this way we get notified about failures; by default you get nothing except manually checking Cloudwatch metrics/logs.
  • Encode as many of the function’s triggers as possible in the stack or Ansible code. This might not be possible if you need to integrate with a third-party service or some cutting-edge AWS feature not yet in Cloudformation, but keeping as much infrastructure as code as possible is always a win.

Here’s an example stack that invokes the function every day at 08:00 UTC:

Description: My Lambda Function
Parameters:
  S3Bucket:
    Description: S3 Bucket where the Lambda code is
    Type: String
  S3Key:
    Description: S3 Key where the Lambda code is
    Type: String
  S3ObjectVersion:
    Description: Version of the S3 Key to use
    Type: String
Resources:
  IAMRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: 'sts:AssumeRole'
      Policies:
        - PolicyName: Logging
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - 'logs:CreateLogGroup'
                  - 'logs:CreateLogStream'
                  - 'logs:PutLogEvents'
                Resource: 'arn:aws:logs:*:*:*'
  LambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        S3Bucket: {Ref: S3Bucket}
        S3Key: {Ref: S3Key}
        S3ObjectVersion: {Ref: S3ObjectVersion}
      Handler: main.lambda_handler
      MemorySize: 128
      Role: {'Fn::GetAtt': [IAMRole, Arn]}
      Runtime: python2.7
      Timeout: 300
  Schedule:
    Type: AWS::Events::Rule
    Properties:
      Description: Every day at 0800
      ScheduleExpression: cron(0 8 * * ? *)
      Targets:
        - Id: TriggerMyLambdaFunction
          Arn: {'Fn::GetAtt': [LambdaFunction, Arn]}
  LambdaPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: {'Fn::GetAtt': [LambdaFunction, Arn]}
      Principal: events.amazonaws.com
      SourceArn: {'Fn::GetAtt': [Schedule, Arn]}
  EmailCreator:
    Type: AWS::SNS::Topic
    Properties:
      DisplayName: Tell the creator about failures in this lambda function.
      Subscription:
        - Protocol: email
          Endpoint: creator@example.com
  LambdaFunctionFailures:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmActions:
      - Ref: EmailCreator
      Namespace: AWS/Lambda
      Dimensions:
      - Name: FunctionName
        Value: {Ref: LambdaFunction}
      MetricName: Errors
      EvaluationPeriods: '1'
      Statistic: Sum
      Period: '60'
      ComparisonOperator: GreaterThanThreshold
      Threshold: '0'

It’s a bit involved, but it ensures all the necessary pieces are automated and isolated from anything else that might be running on our account!

Conclusion

We hope you enjoyed the above walkthrough of our current Ansible-Lambda deployment method. Try it out from our Github repo, and send us any feedback in the issues or with a pull request!

