
Managing Missing CloudFormation Support with the AWS CDK

Simon Bracegirdle

You may have experienced this scenario: you're trying to write infrastructure as code (IaC) that interacts with a lesser-known AWS service, so you open the AWS CloudFormation documentation, search for the service, and are faced with a horrifying reality: there are no CloudFormation resources for what you need.

You have a difficult decision to make: you can either develop a Custom Resource yourself by writing a Lambda function, or you can refactor your IaC to use Terraform or another alternative.

Neither of these options is ideal, but if you're using the AWS Cloud Development Kit (CDK) then you have another option up your sleeve that is easier and safer to use. In this post I'll introduce the AWS SDK Custom Resource wrappers provided by the AWS CDK, which ease the creation of safe Custom Resources.

SIDE NOTE: My examples will use Python, but are adaptable to TypeScript, Java and other CDK-supported languages.

The Pain of Custom Resources

To understand the benefits of the CDK wrappers, let's review the dangers of implementing Custom Resources the traditional way.

The traditional way to build a Custom Resource is to develop a Lambda function that handles the CloudFormation resource events and responds asynchronously. For example, upon resource creation your function can call the AWS SDK to create resources and then call the CloudFormation (CFN) callback to report the result.

A simple example of Lambda code for a custom resource is:

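import boto3
import cfnresponse  # helper module for sending the result back to CloudFormation

client = boto3.client('guardduty')


def handler(event, context):
    try:
        # Look up the first GuardDuty detector in this region, if there is one.
        response = client.list_detectors(MaxResults=1)
        detector_ids = response.get('DetectorIds', [])
        data = {"ID": detector_ids[0]} if detector_ids else {}
        cfnresponse.send(event, context, cfnresponse.SUCCESS, data)
    except Exception as e:
        # Always report the failure, otherwise CloudFormation waits for a timeout.
        print(e)
        cfnresponse.send(event, context, cfnresponse.FAILED, {})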

At a glance it doesn't look too bad, but there are some issues with this approach. Firstly, if you forget to catch an exception then the Custom Resource will hang. You'll need to wait for the Custom Resource to time out, which takes one hour by default. You will face the same issue if you forget to call the cfnresponse.send method after completing the work.
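One way to reduce the risk of a hung resource is to wrap the handler so that a response is always sent, whatever happens inside it. The following is a minimal sketch of that idea, not an AWS-provided utility: `always_respond`, `fake_send` and `find_detector` are hypothetical names, and `send_response` stands in for `cfnresponse.send` so the example runs without AWS access.

```python
import functools

SUCCESS = "SUCCESS"
FAILED = "FAILED"


def always_respond(send_response):
    """Decorator factory: make sure one response is sent per invocation.

    `send_response` is a hypothetical stand-in for cfnresponse.send and is
    called as send_response(event, context, status, data).
    """
    def decorator(work):
        @functools.wraps(work)
        def handler(event, context):
            try:
                # The wrapped function only returns data; it never responds itself.
                data = work(event, context) or {}
            except Exception as exc:
                print(f"custom resource failed: {exc}")
                send_response(event, context, FAILED, {})
            else:
                send_response(event, context, SUCCESS, data)
        return handler
    return decorator


# Usage with a fake sender that records what would go back to CloudFormation.
sent = []


def fake_send(event, context, status, data):
    sent.append((status, data))


@always_respond(fake_send)
def find_detector(event, context):
    if event.get("Boom"):
        raise RuntimeError("simulated SDK error")
    return {"ID": "detector-123"}


find_detector({}, None)              # records a SUCCESS response
find_detector({"Boom": True}, None)  # records a FAILED response
```

A wrapper like this helps, but you still have to write, deploy and test it yourself.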

Creating a Custom Resource also involves a fair amount of boilerplate: you need to create an IAM execution role, the IAM policy, the Lambda function and the Custom Resource provider. This verbosity is okay for a one-off activity, but if you need to polyfill more than one resource it can add up, making your code messy and hard to maintain. Don't discount the effort of writing tests for this function either.

How does the AWS CDK help with this?

The AWS CDK provides a Custom Resource module out of the box, designed to make the process of creating CloudFormation Custom Resources easier.

If you're looking to polyfill missing CloudFormation support by talking to the SDK directly, the AwsCustomResource construct is your friend. This construct allows you to make SDK calls based on CloudFormation resource events (e.g. on resource create / delete / update).

For example, if you want to create an IoT Thing Group, you'd call the Iot SDK method createThingGroup on CloudFormation create, updateThingGroup on update and deleteThingGroup on delete.

The benefit is that you don't need to build a Lambda yourself, which reduces the risk of missing the cfnresponse callback and reduces the number of resources that you need to define. It's syntactic sugar, but it also helps to create safe and dependable code.
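Conceptually, the construct routes each CloudFormation resource event to the SDK call configured for it. The sketch below is a simplified model of that routing, not the CDK's actual handler code. `pick_call` and the plain dictionaries are hypothetical stand-ins, and the fallback from create to update reflects my reading of the CDK documentation.

```python
def pick_call(request_type, on_create=None, on_update=None, on_delete=None):
    """Return the configured call for a CloudFormation request type, or None."""
    calls = {
        # Assumption from the CDK docs: on_create falls back to on_update.
        "Create": on_create or on_update,
        "Update": on_update,
        "Delete": on_delete,
    }
    if request_type not in calls:
        raise ValueError(f"unknown request type: {request_type}")
    return calls[request_type]


create = {"service": "Iot", "action": "createThingGroup"}
delete = {"service": "Iot", "action": "deleteThingGroup"}

chosen = pick_call("Create", on_create=create, on_delete=delete)
skipped = pick_call("Update", on_create=create, on_delete=delete)  # no call configured
```

When no call is configured for an event, nothing is invoked for it, which is why a resource that only needs cleanup can define a delete call alone.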

How do I use the AwsCustomResource construct?

Now that we understand the benefits, let's learn how to use the module.

Start by importing custom_resources:

from aws_cdk import (
    custom_resources as cr
)

The minimal set of properties for creating a custom resource is as follows:

cr.AwsCustomResource(self, 'MyResource',
                     policy=custom_resource_policy,
                     on_create=on_create_call,
                     on_update=on_update_call,
                     on_delete=on_delete_call)

Under the hood this resource will create a Lambda, IAM role, IAM policies and Custom Resource provider for you. But that's all hidden away from you. You don't need to write any Lambda code yourself.

The policy parameter allows you to define the permissions assigned to the underlying Lambda function. The module provides a couple of convenient ways to provide this.

Firstly, it can infer permissions from the SDK calls specified in on_create, on_update and on_delete. This example will allow the function to access any resource for those SDK calls:

policy=cr.AwsCustomResourcePolicy.from_sdk_calls(
    resources=cr.AwsCustomResourcePolicy.ANY_RESOURCE
)
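To give a feel for what "inferring permissions" means, here is a rough sketch of how an SDK call could be translated into an IAM action. It is an approximation for illustration only: the function names are hypothetical, and the real CDK logic also consults service metadata for services whose IAM prefix differs from the SDK service name.

```python
def sdk_call_to_iam_action(service, action):
    """Approximate the IAM action for an SDK call, e.g. iot:CreateThingGroup."""
    iam_service = service.lower()
    iam_action = action[:1].upper() + action[1:]
    return f"{iam_service}:{iam_action}"


def inferred_statement(calls, resources):
    """Build a plain-dict IAM statement covering every (service, action) pair."""
    actions = sorted({sdk_call_to_iam_action(s, a) for s, a in calls})
    return {"Effect": "Allow", "Action": actions, "Resource": resources}


statement = inferred_statement(
    calls=[("Iot", "createThingGroup"), ("Iot", "deleteThingGroup")],
    resources=["*"],  # same meaning as allowing any resource
)
```

The resulting statement allows only the actions the Custom Resource actually calls, though on any resource.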

If you want more control you can provide a full policy document. Here's an example for an IoT role alias that allows the role alias CRUD operations, as well as iam:PassRole:

policy=cr.AwsCustomResourcePolicy.from_statements(
    statements=[iam.PolicyStatement(
        effect=iam.Effect.ALLOW,
        actions=[
            "iot:CreateRoleAlias",
            "iot:UpdateRoleAlias",
            "iot:DeleteRoleAlias"
        ],
        resources=["*"]
    ), iam.PolicyStatement(
        effect=iam.Effect.ALLOW,
        actions=[
            "iam:PassRole"
        ],
        resources=[role_arn]
    )]
)

For each of the resource events (on_create, on_update and on_delete) you can provide an AwsSdkCall object which defines the AWS SDK method to call.

For example, to create an IoT Thing Group on the create event:

on_create=cr.AwsSdkCall(
    action="createThingGroup",
    service="Iot",
    parameters={
        'thingGroupName': thing_group_name
    },
    physical_resource_id=cr.PhysicalResourceId.of(thing_group_name)
)

This will call the createThingGroup AWS SDK method at resource creation time and pass the parameters to the SDK call verbatim.

physical_resource_id is a way of identifying the underlying physical resource created by the Custom Resource. This is important because it's not always a one-to-one mapping between logical resource actions and changes to the physical resource. The physical resource ID determines how the update/delete behaviour works.

For example, if the physical resource ID changes when updating the logical resource, then CloudFormation will attempt to delete the old physical resource with the previous physical resource ID (replacement).

This is useful in cases where there's no way to update the physical resource. For example, with AWS Certificate Manager certificates, if the user updates the domain name of a certificate we need to delete the old certificate and replace it with a new one for the new domain.
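The replacement rule can be modelled in a few lines. This is a simplified illustration of the behaviour described above, not CloudFormation's implementation. `cleanup_events` and the sample IDs are made up for the example.

```python
def cleanup_events(old_physical_id, new_physical_id):
    """Model the cleanup that follows an Update, given the IDs before and after."""
    if new_physical_id != old_physical_id:
        # Replacement: the old physical resource is deleted during cleanup.
        return [{"RequestType": "Delete", "PhysicalResourceId": old_physical_id}]
    # Same ID: the update is treated as in-place, so there is nothing to clean up.
    return []


in_place = cleanup_events("cert-for-example.com", "cert-for-example.com")
replaced = cleanup_events("cert-for-example.com", "cert-for-example.org")
```

Here `in_place` is empty, while `replaced` contains a single Delete event that targets the old ID.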

But in most simple cases it'll be a one-to-one mapping between logical and physical, so the physical resource ID shouldn't change between create and update. In this scenario the sensible option is to use the ARN as the physical resource ID, which you can get from most API responses.

With the CDK Custom Resources module you can use the PhysicalResourceId.from_response helper to specify the response path to use as the physical resource ID. For example, for IoT Thing Groups:

cr.PhysicalResourceId.from_response('thingGroupArn')

Or, if you want more control then you can specify your own ID based on any arbitrary string:

cr.PhysicalResourceId.of('some_unique_id')
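To make the response path idea concrete, here is a sketch of looking up a path in an SDK response. It assumes that nested keys and list indices are joined with dots to form the path. `flatten` and the sample response values are invented for illustration and are not taken from the CDK source.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into {'a.b.0.c': leaf} style keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


# Invented sample shaped like a createThingGroup response.
sample_response = {
    "thingGroupName": "sensors",
    "thingGroupArn": "arn:aws:iot:ap-southeast-2:111122223333:thinggroup/sensors",
    "thingGroupId": "example-id",
}

# The response path selects which field becomes the physical resource ID.
physical_id = flatten(sample_response)["thingGroupArn"]
```

For a nested response, the same helper would produce keys such as `Items.0.Name`.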

Here's a full example of a custom resource for an AWS IoT Thing Group:

cr.AwsCustomResource(self, 'ThingGroup',
                     policy=cr.AwsCustomResourcePolicy.from_sdk_calls(
                         resources=cr.AwsCustomResourcePolicy.ANY_RESOURCE
                     ),
                     on_create=cr.AwsSdkCall(
                         action="createThingGroup",
                         service="Iot",
                         parameters={
                             'thingGroupName': thing_group_name
                         },
                         physical_resource_id=cr.PhysicalResourceId.from_response('thingGroupArn')
                     ),
                     on_update=cr.AwsSdkCall(
                         action="createThingGroup",
                         service="Iot",
                         parameters={
                             'thingGroupName': thing_group_name
                         },
                         physical_resource_id=cr.PhysicalResourceId.from_response('thingGroupArn')
                     ),
                     on_delete=cr.AwsSdkCall(
                         action="deleteThingGroup",
                         service="Iot",
                         parameters={
                             'thingGroupName': thing_group_name
                         },
                     ))

This resource will create a thing group on resource creation and then delete the thing group on resource deletion. The AwsCustomResource construct will generate a Lambda function to take care of this internally, which we don't need to worry about.

Notice the on_update action also creates a new thing group. This is because there's no way to update the group name without creating a new thing group entirely.

By returning a new physical resource ID in the update action, CloudFormation will trigger a deletion of the previous thing group when updating the thingGroupName property. The user of the Custom Resource doesn't need to know about this. All they see after the update is a Thing Group with the name they gave it.

Summary

In this post we learned about the CDK Custom Resources module and specifically the AwsCustomResource construct which makes it easy to wrap AWS SDK calls in a Custom Resource. We also learned about the importance of the physical resource ID in controlling Custom Resource behaviour.

Ideally, AWS would find a way to improve CloudFormation support across all its services, so that we wouldn't need to polyfill it for services we want to use from infrastructure as code. Since we don't have that, the CDK Custom Resources module is a nice fallback for those who are using CDK for their infrastructure.

Need help with your infrastructure as code? Then get in contact with the Perth IaC gurus — Mechanical Rock.