Setting up a Docassemble shell in AWS

Paul Hands
14 min read · Jun 30, 2022

Docassemble is a widely used tool for gathering information through interviews (or external API calls) and inserting it into Markdown, Word documents and PDFs, among other formats. It’s a handy tool, and it’s possible to install a version “quasi-locally” inside an AWS ecosystem so that multiple users access the same instance, with consistent interviews, templates and packages. This guide walks you step by step through hosting your own docassemble instance in AWS on EC2 instances behind a load balancer, so that you can send users to e.g. docassemble.example.com via DNS and use it as your own standalone version.

Note that I don’t include tags anywhere in this guide. I would recommend tagging all resources, as it saves a lot of headaches later on in AWS Cost Explorer!

Setting up the ECS cluster

We need to have an ECS cluster running in AWS which groups our application and service endpoints, allowing them to correctly run on EC2 instances and have the working version of docassemble. The easiest way to do this is to use the construction wizard and the defaults, and then delete the automatically created pieces we don’t need.

First navigate to ECS in AWS and select “Get Started”.

On the first page, ensure that sample-app is selected, don’t edit any of the task definitions, and hit Next.

On the next page, leave all the options at their defaults. We will be deleting all the sample services and tasks anyway, so nothing needs changing. Ensure the load balancer type is “None”, then hit Next.

In the next section, change the cluster name to “docassemble” and then hit next.

On the next section (which is the review) scroll to the bottom and hit create.

This will create our new docassemble cluster with all the default values.

We now need to “undo” this by deleting the newly created service, task definition and VPC.

First off, open up the service “sample-app-service” and select Update.

On the configuration page, change the number of tasks to zero, select “Skip to review”, and then update the service. This tells the service we have just created that it shouldn’t run any tasks and therefore shouldn’t use any EC2 instances. We do this so that when we stop and delete the task and its definition, the service doesn’t automatically try to boot another one up.

We can now return to the service and delete “sample-app-service” (the screen will look the same as for the update above). Next select Task Definitions on the left hand side menu and select “first-run-task-definition”. There should only be one revision, but there may be more depending on how often you have run the ECS wizard to create a cluster from scratch. Select all revisions and then Actions → Deregister.

Once this is done, navigate to EC2 and, if applicable, terminate the instance that was created along with the cluster; AWS may have done this for you when you deleted the service. Next navigate to VPC and find the VPC titled “ECS docassemble - VPC”. Select and delete it, and AWS will delete all associated subnets and security groups.

Once this is done, the Cluster is in the state we want it to be to establish the other areas needed for Docassemble.

S3 Bucket and IAM role

The next step is to set up an S3 bucket in which our version of docassemble can store files, backups, templates and interviews.

Navigate to S3 and select “Create bucket”. Give it a name like docassemble-example-com or similar and set the region to be the closest one to you or your application. Leave everything else as the default and create the bucket. Make a note of the bucket name, as you will need it later.

We now need to create an IAM role in AWS that we can assign to any EC2 instances. This will allow them to access the docassemble bucket to store and retrieve data, and also to communicate with the ECS tasks we are going to create to run docassemble. While we could handle access manually, that would involve passing long and complicated secret keys to the Docker containers that the docassemble application runs in on our EC2 instances. It is easier to add the role here and not worry about it.

Navigate to IAM service in AWS and select Roles from the left hand side of the menu, select “Create role”.

In the first page, select “AWS service” and “EC2” as the use case, select next.

On the next page we determine which permissions the role has. The easiest way to do this is to select “AmazonEC2ContainerServiceforEC2Role” and “AmazonS3FullAccess”. This gives the role access to the required buckets and services, but it also means the role can in theory access any and all buckets in your AWS account. If you prefer, you can instead create an inline S3 policy that only allows access to the bucket in question. Select Next.
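If you go for the narrower option, an inline policy limited to a single bucket might look something like the sketch below. The bucket name is the example from earlier, and the particular set of actions is my assumption about what a file store needs (list, read, write, delete), so treat it as a starting point rather than a definitive policy.

```python
import json


def bucket_policy(bucket_name: str) -> dict:
    """Build an IAM policy document limited to one S3 bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Object-level actions apply to the keys inside the bucket.
                # This action list is an assumption, not taken from the guide.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


policy = bucket_policy("docassemble-example-com")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console's JSON policy editor in place of “AmazonS3FullAccess”.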

Name the role “docassembleInstanceRole” and select “Create role”

Security groups, Launch configuration and Auto scaling

Next we are going to set things up so that our applications run in our default VPC, behind a security group that allows AWS applications to communicate with one another. We then add a launch configuration and an auto scaling group, which take care of ensuring we always have the correct number of EC2 instances running with the associated role to handle docassemble.

First, navigate to the VPC console and note the IPv4 CIDR of the default VPC. It will look something like 172.31.0.0/16.

Inside the VPC service, select Security groups on the left hand side menu. We need to create 2 different security groups:

  1. A security group that allows access to the docassemble application
  2. A security group that allows docassemble access to the load balancer

First create a new security group called “docassembleSg”, with a description like that of 1 above. The VPC should be the default, and we need to add three inbound rules:

  1. “Type” of SSH with the source set to Anywhere-IPv4
  2. “Type” of SSH with the source set to Anywhere-IPv6
  3. “Type” of All traffic with a custom source. In the text box add the IPv4 CIDR we noted earlier from the VPC. Add descriptions if you want to.

Select “Create Security Group” to add this to your AWS system.

The next security group is for the load balancer. Add it as above but name this one “docassembleLbSg”. The inbound rules we need are “Type” HTTP and “Type” HTTPS, each with the source set to Anywhere for both IPv4 and IPv6, making 4 rules in total.
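To double-check the two rule sets, here they are written out as data. This is only a sketch: the dictionaries follow the IpPermissions shape used by the EC2 API for security group ingress rules as I understand it, the CIDR passed in is just an example value, and nothing here talks to AWS.

```python
import ipaddress

ANYWHERE_V4 = [{"CidrIp": "0.0.0.0/0"}]
ANYWHERE_V6 = [{"CidrIpv6": "::/0"}]


def docassemble_sg_rules(vpc_cidr: str) -> list:
    """Inbound rules for docassembleSg: SSH from anywhere, all traffic from the VPC."""
    # Raises ValueError if the CIDR was mistyped when noting it down.
    cidr = str(ipaddress.ip_network(vpc_cidr))
    return [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": ANYWHERE_V4, "Ipv6Ranges": ANYWHERE_V6},
        # "-1" means all protocols and all ports.
        {"IpProtocol": "-1", "IpRanges": [{"CidrIp": cidr}]},
    ]


def docassemble_lb_sg_rules() -> list:
    """Inbound rules for docassembleLbSg: HTTP and HTTPS from anywhere."""
    return [
        {"IpProtocol": "tcp", "FromPort": port, "ToPort": port,
         "IpRanges": ANYWHERE_V4, "Ipv6Ranges": ANYWHERE_V6}
        for port in (80, 443)
    ]


def count_console_rules(permissions: list) -> int:
    """The console lists one rule per source range, so count the ranges."""
    return sum(len(p.get("IpRanges", [])) + len(p.get("Ipv6Ranges", []))
               for p in permissions)


app_rules = docassemble_sg_rules("172.31.0.0/16")
lb_rules = docassemble_lb_sg_rules()
print(count_console_rules(app_rules), count_console_rules(lb_rules))
```

The counts line up with the console: three rules for docassembleSg and four for docassembleLbSg.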

Next we want to create a Launch Configuration to allow for the correct creation of EC2 instances in AWS to run our services on.

Inside the EC2 service, select Launch Configurations on the left hand side menu and select “Create Launch Configuration”. Name the configuration “docassembleLc” and select an ECS-optimised AMI (Amazon Machine Image, essentially a template that determines what the EC2 instance boots up with). It might be worth searching through the AMIs to see if a newer one exists; amzn-ami images are generally good to use. Set the IAM instance profile to “docassembleInstanceRole” (which we created earlier). For the instance type, select one that has at least 2 GB of RAM (t2.medium is recommended).

Open up the advanced details, leave the Kernel ID and RAM disk as default, and don’t change or insert anything for Metadata. In the User data option, replace the text with the text below. The first part makes the email-receiving feature work, which we do not use but still need to configure in our use of docassemble. The ECS_CLUSTER line specifies that these instances should be used for the “docassemble” cluster we created. The final part expands the capacity of the Docker container drive from 10 GB to 20 GB, as the default size can be a bit small for some servers.

Note the long line that is cut off should be (Medium doesn’t seem to allow me to type code in!):

cloud-init-per once docker_options echo 'OPTIONS="${OPTIONS} --storage-opt dm.basesize=20G"' >> /etc/sysconfig/docker

The input is not already base64 encoded so leave that box blank. In the security groups section underneath select to “Select an existing security group” and choose “docassembleSg”, select to proceed without a key pair for log in and tick the acknowledgement and then select “Create launch configuration”.
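Since the user data ends up being run as shell commands, typographic characters picked up while copying from a web page (curly quotes, long dashes) will quietly break it. The small check below is my own addition, not part of the docassemble setup: it only reports such characters and their positions so you can retype them by hand. The function name and sample strings are illustrative.

```python
# Characters that blogging platforms and word processors substitute for
# plain ASCII quotes and hyphens. A shell does not treat them as syntax.
TYPOGRAPHIC = {
    "\u2018": "left single quote",
    "\u2019": "right single quote",
    "\u201c": "left double quote",
    "\u201d": "right double quote",
    "\u2013": "en dash",
    "\u2014": "em dash",
}


def find_typographic(text: str) -> list:
    """Return (index, description) for every suspicious character in text."""
    return [(i, TYPOGRAPHIC[ch]) for i, ch in enumerate(text) if ch in TYPOGRAPHIC]


pasted = "echo \u2018OPTIONS=\u201d${OPTIONS} \u2014 storage-opt dm.basesize=20G\u201d\u2019"
clean = "echo 'OPTIONS=\"${OPTIONS} --storage-opt dm.basesize=20G\"'"

for index, description in find_typographic(pasted):
    print(f"character {index}: {description}")
```

An empty result means the text is safe to paste as far as quotes and dashes are concerned; it says nothing about whether the commands themselves are right.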

In the “Auto Scaling Groups” option on the left hand side of the EC2 menu, select to “Create an Auto Scaling Group”, call it “docassembleAsg” and switch to launch configuration and select the launch configuration we have just created, select Next.

In Network select the default VPC (it may be prepopulated) and then in subnets select all available subnets, select Next.

In load balancing select No load balancer, and update the Health check grace period to be 600 seconds. Select Next.

In group size set the Desired, Minimum and Maximum capacity to 3, select None for scaling policies and leave instance scale-in protection unchecked. Select “Skip to review” unless you wish to add SNS notifications (which we don’t) or add a tag (left up to the reader). Then create your auto scaling group. This should start booting up EC2 instances and assigning them to the cluster. If this isn’t working, check whether the T&Cs of the AMI need to be accepted (this depends on the AMI you chose in the Launch Configuration).
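For reference, the same auto scaling settings can be written down as a single request. This is a sketch only: the keys follow the parameter names of the Auto Scaling CreateAutoScalingGroup call as exposed by boto3, to the best of my knowledge, and the subnet IDs are made-up placeholders. The code builds the dictionary and sends nothing.

```python
def asg_request(subnet_ids: list, size: int = 3) -> dict:
    """Collect the auto scaling group settings from the console walkthrough."""
    if not subnet_ids:
        raise ValueError("at least one subnet is required")
    return {
        "AutoScalingGroupName": "docassembleAsg",
        "LaunchConfigurationName": "docassembleLc",
        # Desired, minimum and maximum are all pinned to the same value,
        # so the group keeps a fixed number of instances.
        "MinSize": size,
        "MaxSize": size,
        "DesiredCapacity": size,
        "HealthCheckGracePeriod": 600,  # seconds
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


# Placeholder subnet IDs; use the ones from your default VPC.
request = asg_request(["subnet-aaaa1111", "subnet-bbbb2222"])
print(request)
```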

Application Load Balancer

Next we need to create a load balancer which will handle requests and send them to the correct place (our application task created in the next step). First we need to create some “Target groups” for the load balancer to use. Select Target Groups on the left hand side menu of the EC2 service. We need to create 3 target groups that are very similar. For all three groups we want to use the “Instances” target type.

Name the three target groups “docassemble-web”, “docassemble-websocket” and “docassemble-http-redirect” respectively. The VPC should be the default for your account, and the protocol will always be HTTP. The health check should use the HTTP protocol, and the health check path should be “/health_check”, which is simply a page that returns an OK (used purely to check that the services are running correctly). For the port option:

  • “web” and “websocket” should be left at port 80
  • “http-redirect” should be 8081

Only while creating the http-redirect target group, open the Advanced health check settings and update the healthy threshold to 10, the timeout to 10 seconds and the Success codes to 301,307.

Once created, select the docassemble-websocket group, select the attributes tab and edit. Tick the stickiness option and select Load balancer generated cookie, and make the stickiness duration 3 days.
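Because the three target groups differ only in a few fields, it can help to see them side by side. The sketch below lays them out as data, using the parameter names of the ELBv2 CreateTargetGroup and ModifyTargetGroupAttributes calls as I understand them. The VPC ID is a placeholder and nothing is sent to AWS.

```python
def target_group_specs(vpc_id: str) -> list:
    """The three target groups, sharing everything except name, port and health check tuning."""
    common = {
        "Protocol": "HTTP",
        "VpcId": vpc_id,
        "TargetType": "instance",
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPath": "/health_check",
    }
    return [
        {**common, "Name": "docassemble-web", "Port": 80},
        {**common, "Name": "docassemble-websocket", "Port": 80},
        {
            **common,
            "Name": "docassemble-http-redirect",
            "Port": 8081,
            # The redirect endpoint answers with a redirect, not a 200.
            "HealthyThresholdCount": 10,
            "HealthCheckTimeoutSeconds": 10,
            "Matcher": {"HttpCode": "301,307"},
        },
    ]


# Stickiness attributes for docassemble-websocket only (3 days in seconds).
WEBSOCKET_STICKINESS = [
    {"Key": "stickiness.enabled", "Value": "true"},
    {"Key": "stickiness.type", "Value": "lb_cookie"},
    {"Key": "stickiness.lb_cookie.duration_seconds", "Value": str(3 * 24 * 60 * 60)},
]

specs = target_group_specs("vpc-placeholder")  # placeholder VPC ID
for spec in specs:
    print(spec["Name"], spec["Port"])
```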

Select the Load Balancers option in the EC2 left hand side menu, select Create, and choose to create an Application Load Balancer. Give it a name of docassembleLb, keep the Scheme as internet-facing and the IP address type as IPv4.

Inside Network mapping ensure the default VPC is selected and tick all subnets.

In Security groups delete the default and select the docassembleLbSg we created earlier. Assign the HTTP:80 action to forward to the “docassemble-http-redirect” target group.

Once the load balancer is created, you need to make a few manual changes to it.

In the “Load Balancers” section, select the docassembleLb load balancer, and open the “Listeners” tab. Select the “HTTP : 80” listener and click Edit. Under “Default action(s),” make sure that step 1 is “Forward to docassemble-http-redirect.”

Once those changes (if any) are saved, select the “HTTPS : 443” listener and click Edit.

Under “Default action(s),” it will incorrectly say that requests should be forwarded to the http-redirect target group. This is the proper setting for HTTP (port 80), but not for HTTPS (port 443), so you need to change it. Click the edit button (pencil icon) and change it so that it forwards to the web target group. Then press “Update” to save your changes.

Then go back to the list of listeners, and under “HTTPS : 443,” click “View/edit rules.” Click the “+” button at the top of the screen to add a new rule. Make it the first rule in the list of rules. Construct the new rule so that it says, in effect, “if the path is /ws/*, forward the request to the websocket target group.” Then click “Save.”

Now the load balancer will listen to port 443 and act on requests according to two “Rules.” The first rule says that if the path of the HTTPS request starts with /ws/, which is the path for socket communication, then the request will be forwarded using the “docassemble-websocket” group which has the “stickiness” feature enabled. The second rule says that all other traffic will use the “docassemble-web” group for which “stickiness” is not enabled. Your load balancer will also listen to port 80 and forward all those requests to port 8081 on your web servers, which will respond by redirecting the user to port 443 of your load balancer.
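The routing just described can be modelled in a few lines, which is handy for checking which target group a given request should end up in. This is a local simulation of the rules and not how the load balancer is configured. It uses Python's fnmatchcase to stand in for the “/ws/*” path pattern, which is an approximation of the load balancer's own matching.

```python
from fnmatch import fnmatchcase


def route(port: int, path: str) -> str:
    """Return the target group that a request on this listener port and path is forwarded to."""
    if port == 80:
        # Plain HTTP always goes to the redirect endpoint (port 8081 on the servers).
        return "docassemble-http-redirect"
    if port == 443:
        # First rule: websocket traffic goes to the sticky target group.
        if fnmatchcase(path, "/ws/*"):
            return "docassemble-websocket"
        # Default rule: everything else goes to the web target group.
        return "docassemble-web"
    raise ValueError(f"no listener on port {port}")


for port, path in [(443, "/ws/socket.io/"), (443, "/interview"), (80, "/interview")]:
    print(port, path, "->", route(port, path))
```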

Finally, go to the “Description” tab for the load balancer and make note of the “DNS name.” It will be something like docassemblelb-174526082.us-west-1.elb.amazonaws.com. Use this name with your DNS provider (for example as the target of a CNAME record) so that e.g. docassemble.example.com points at the load balancer.

Task definitions

We need to create two different task definitions in the service we created. Navigate to the ECS console in AWS and select Task Definitions and create a new definition called “docassemble-backend”, which is an EC2 task definition.

Scroll down and select Configure via JSON and add the following:

JS Fiddle for backend

You will need to update the S3 bucket (docassemble-corterum-com) to match the one created earlier, and the hostname (http://docassemble.corterum.com) to match your own domain. You can also update the TIMEZONE if you want to, but it isn’t essential.
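If you would rather not edit the JSON by hand, a small helper can swap the values in. This is a hedged sketch: it assumes the settings live in the standard ECS containerDefinitions / environment list of name and value pairs, and the variable names S3BUCKET and DAHOSTNAME, the container name “backend” and the tiny stand-in task definition are my assumptions for illustration. Check them against the actual JSON before relying on this.

```python
import copy


def with_environment(task_def: dict, container_name: str, overrides: dict) -> dict:
    """Return a copy of task_def with environment variables replaced or added on one container."""
    updated = copy.deepcopy(task_def)
    for container in updated["containerDefinitions"]:
        if container["name"] != container_name:
            continue
        env = {item["name"]: item["value"] for item in container.get("environment", [])}
        env.update(overrides)
        container["environment"] = [{"name": k, "value": v} for k, v in env.items()]
        return updated
    raise KeyError(f"no container named {container_name!r}")


# Stand-in for the real task definition, cut down to the relevant fields.
example_task_def = {
    "family": "docassemble-backend",
    "containerDefinitions": [
        {
            "name": "backend",  # hypothetical container name
            "environment": [
                {"name": "S3BUCKET", "value": "docassemble-corterum-com"},
                {"name": "DAHOSTNAME", "value": "docassemble.corterum.com"},
            ],
        }
    ],
}

patched = with_environment(
    example_task_def,
    "backend",
    {"S3BUCKET": "docassemble-example-com", "DAHOSTNAME": "docassemble.example.com"},
)
print(patched["containerDefinitions"][0]["environment"])
```

The original dictionary is left untouched, so you can compare before and after.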

Once this is done, create a second task definition using the same concept, but called “docassemble-app” which uses the JSON below:

JS Fiddle for app

Note that the docassemble-app task definition is much briefer than the docassemble-backend task definition. This is because the backend service will save a configuration file in the S3 bucket. All that the app service needs to do is retrieve that configuration file. Note also that it is not necessary to include any secret keys in the JSON configuration. This is because the Launch Configuration includes the IAM Role that we created earlier; the virtual machines themselves are authorised to access the S3 bucket.

Once these are created go into the docassemble ECS cluster we made, and create a new service. This needs to be an EC2 service that pulls the docassemble-backend task definition. Give it a name of docassemble-backend and set the number of tasks to be 2. Leave everything else as default and select Next. On the next page keep the “Load balancer type” as None, and select Next. On Set Auto Scaling leave the option as “Do not adjust the service’s desired count”. Go to review and create service.

Once this is done, create another EC2 service called docassemble-app, which has 2 tasks and uses the task definition matching its name. This time, however, select “Application Load Balancer” as the Load balancer type. The service IAM role needs to be ecsServiceRole, and the Load balancer name should be the one we created earlier (in this example, docassembleLb). Under Container to load balance, select “docassemble-app:80:80” and then click “Add to load balancer”. In the menu that appears, under target group name select “docassemble-web”. Further up, change the Health check grace period to 1200, then create the service.
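The difference between the two services is easier to see as data. The sketch below builds the two sets of settings using the parameter names of the ECS CreateService call as I understand them. The target group ARN is a placeholder, the container name is inferred from the “docassemble-app:80:80” entry above, and nothing is sent to AWS.

```python
from typing import Optional


def service_request(name: str, desired_count: int,
                    target_group_arn: Optional[str] = None) -> dict:
    """Settings for one ECS service; load balancer fields are added only when an ARN is given."""
    request = {
        "cluster": "docassemble",
        "serviceName": name,
        "taskDefinition": name,  # each service uses the task definition of the same name
        "desiredCount": desired_count,
        "launchType": "EC2",
    }
    if target_group_arn is not None:
        request.update({
            "role": "ecsServiceRole",
            "healthCheckGracePeriodSeconds": 1200,
            "loadBalancers": [{
                "targetGroupArn": target_group_arn,
                "containerName": name,  # assumed from "docassemble-app:80:80"
                "containerPort": 80,
            }],
        })
    return request


backend = service_request("docassemble-backend", 2)
app = service_request("docassemble-app", 2, "<docassemble-web target group ARN>")
print(sorted(set(app) - set(backend)))
```

The printed keys are exactly the extra settings that the app service needs and the backend service does not.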

Just one more thing needs to be done to make the docassemble server fully functional: we need to associate the other two Target Groups with the same EC2 instances that are now associated with the “docassemble-web” Target Group. Go back to the EC2 menu and select Target Groups from the left hand side section, select the “docassemble-web” Target Group, go to the “Targets” tab, and note the Instance IDs of the “Registered Instances.” Select the docassemble-websocket group and click the “Edit” button within the “Targets” tab. On the “Register and deregister instances” page that appears, select the instances you just noted and click the “Add to registered” button. Then do the same with docassemble-http-redirect.
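Copying instance IDs between target groups by hand is easy to get wrong, so here is a sketch of the same step as data. It assumes a response shaped like the ELBv2 DescribeTargetHealth output (a TargetHealthDescriptions list with a Target Id in each entry) and builds RegisterTargets style requests from it. The instance IDs and ARNs are placeholders, and no AWS calls are made.

```python
def instance_ids(target_health: dict) -> list:
    """Pull the registered instance IDs out of a DescribeTargetHealth style response."""
    return [d["Target"]["Id"] for d in target_health["TargetHealthDescriptions"]]


def register_requests(ids: list, target_group_arns: list) -> list:
    """One RegisterTargets style request per target group, all with the same instances."""
    targets = [{"Id": instance_id} for instance_id in ids]
    return [{"TargetGroupArn": arn, "Targets": targets} for arn in target_group_arns]


# Placeholder response for the docassemble-web target group.
web_health = {
    "TargetHealthDescriptions": [
        {"Target": {"Id": "i-0aaa", "Port": 80}},
        {"Target": {"Id": "i-0bbb", "Port": 80}},
    ]
}

requests_to_send = register_requests(
    instance_ids(web_health),
    ["<docassemble-websocket ARN>", "<docassemble-http-redirect ARN>"],
)
for req in requests_to_send:
    print(req["TargetGroupArn"], [t["Id"] for t in req["Targets"]])
```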

You should now be able to navigate to your DNS domain and see a working version of docassemble!
