AWS Lambda Architecture Best Practices
With the evolution of technology from mainframe computers to personal computers and cloud computing, the one thing that is constant is the need to make technology more efficient, convenient and affordable.
Serverless architecture has gained ground all over the world and is now a favoured option for many companies. It takes two main forms: Backend as a Service (BaaS), such as the authentication services offered by providers like Facebook, and Function as a Service (FaaS), where server-side application logic runs in stateless containers that are completely managed by third-party providers.
Leading technology companies are now offering their own serverless implementations. Our main focus will be AWS Lambda.
AWS Lambda is a serverless computing platform, implemented on AWS platforms like EC2 and S3.
AWS Lambda invokes your code only when needed and automatically scales to match the rate of incoming requests without requiring you to configure anything. You do not have to provision capacity for a given request volume, although account-level concurrency quotas still apply.
AWS Lambda can be used alongside other AWS services, for example to process lifecycle events from Amazon Elastic Compute Cloud (Amazon EC2) and manage Amazon EC2 resources. Amazon EC2 sends lifecycle events to Amazon CloudWatch Events, such as when an instance changes state, when an Amazon Elastic Block Store volume snapshot completes, or when a spot instance is scheduled to be terminated. You configure CloudWatch Events to forward those events to a Lambda function for processing.
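As a rough illustration of the flow just described, a handler for an EC2 instance state-change event might look like the sketch below. The `detail-type`, `instance-id` and `state` fields follow the event shape EC2 publishes; the reaction (printing a log line) and the return value are assumptions made for illustration only.

```python
import json


def lambda_handler(event, context):
    """Handle an EC2 state-change event forwarded by CloudWatch Events."""
    # Ignore anything that is not an instance state-change notification.
    if event.get("detail-type") != "EC2 Instance State-change Notification":
        return {"handled": False, "reason": "unexpected event type"}

    detail = event.get("detail", {})
    instance_id = detail.get("instance-id")
    state = detail.get("state")

    # Hypothetical reaction: log the change. A real function might tag the
    # instance, notify a team, or clean up related resources here.
    print(json.dumps({"instance": instance_id, "state": state}))
    return {"handled": True, "instance": instance_id, "state": state}
```

Because the handler only reads the event dictionary, it can be exercised locally by passing a hand-written sample event and `None` as the context.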
Lambda functions can be built using Go, Python, Ruby, Node.js, Java, and C#. When you create a Lambda function, you specify the runtime environment, the execution role, the memory allocation and the handler method to execute. Once these are provided, AWS Lambda deploys the code, administers it, and handles maintenance, security patches and monitoring. Still, there are some best practices you can employ to get the most out of each AWS Lambda deployment.
By default, a Lambda function runs in a Lambda-managed VPC, which has internet access and can reach public AWS service endpoints such as S3 and DynamoDB. However, it won't have access to any private VPC, including AWS resources that run inside one.
If a function runs in a Lambda-managed VPC, Lambda is responsible for its availability and runs it across multiple Availability Zones (AZs) in that region.
Another key point is to keep your Lambda compute capacity distributed across Availability Zones, which makes your functions inherently fault-tolerant in case of a data center failure.
In Lambda, memory and CPU go hand in hand: if you increase memory, the CPU allocation increases as well. So when we need to reduce the execution time of a function, we would try increasing memory/CPU to process it faster. But here is the catch: if we experiment in detail, we will find that beyond a certain point increasing the memory no longer reduces the execution time, yet it still increases the cost. A balance is therefore required between performance and cost.
There are a few open source tools available which claim to help you find the best power configuration. However, I prefer to monitor memory usage and execution time through CloudWatch logs and then adjust the configuration accordingly. Even a small increase or decrease can make a big difference to the overall AWS cost.
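The trade-off can be made concrete with a little arithmetic. The sketch below assumes compute is billed as memory (in GB) multiplied by duration (in seconds) at a flat per-GB-second rate; the rate and the measurements are illustrative assumptions, not figures from this article, and per-request charges are ignored.

```python
# Assumed illustrative rate per GB-second; check current AWS pricing.
PRICE_PER_GB_SECOND = 0.0000166667


def invocation_cost(memory_mb, duration_ms, price=PRICE_PER_GB_SECOND):
    """Approximate compute cost of one invocation (request fee excluded)."""
    gigabytes = memory_mb / 1024
    seconds = duration_ms / 1000
    return gigabytes * seconds * price


def cheapest(measurements):
    """Pick the (memory_mb, avg_duration_ms) pair with the lowest cost."""
    return min(measurements, key=lambda m: invocation_cost(*m))


# Hypothetical averages, as one might read them from CloudWatch logs.
measurements = [(128, 2400), (512, 500), (1024, 450), (2048, 440)]
best_memory, best_duration = cheapest(measurements)
```

With these made-up numbers, going from 512 MB to 2048 MB saves only 60 ms per call while roughly tripling the compute cost, which is the plateau described above.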
When we invoke a Lambda function for the first time, Lambda downloads the code from S3, downloads all the dependencies, creates a container, and starts the application before it executes the code. This whole duration (excluding the execution of the code itself) is known as the cold start time. Once the container is up and running, subsequent invocations find Lambda already initialized, so it only needs to execute the application logic; that duration is called the warm start time.
So should we be worried about cold start time or warm start time? Cold start usually accounts for a significant share of the full execution time, so most of the emphasis is on reducing it. However, warm start time can also be reduced by following good coding practices.
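One way to see the difference in practice is to record whether an invocation is the first one in its container. The sketch below relies on the fact that module-level code runs once per container; the counter name and the returned fields are hypothetical.

```python
import time

# Module-level code runs once per container, as part of the cold start.
_CONTAINER_STARTED_AT = time.time()
_invocation_count = 0


def lambda_handler(event, context):
    """Report whether this invocation hit a cold or a warm container."""
    global _invocation_count
    _invocation_count += 1
    return {
        "cold_start": _invocation_count == 1,
        "invocations_in_container": _invocation_count,
        "container_age_seconds": time.time() - _CONTAINER_STARTED_AT,
    }
```

Logging these fields makes it easier to tell how often cold starts actually occur for a given function.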
Now, let's discuss how we can improve Lambda performance overall:
- Choose interpreted languages like Python or Node.js rather than Go, Java or C# to reduce the cold start time.
- Use the default network environment unless you need a VPC resource with a private IP, because setting up an elastic network interface (ENI) takes significant time and adds to the cold start time. With the upcoming release of AWS Lambda, more improvement is expected here.
- Remove all dependencies that are not required to run the function. Keep only the ones needed at runtime.
- Use global/static variables and singleton objects. These remain alive until the container goes down, so subsequent calls do not need to reinitialize them.
- Define your database connections at a global level so that they can be reused by subsequent invocations.
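The last two points can be sketched as follows. The example uses an in-memory `sqlite3` connection purely as a stand-in for a real database driver, and `get_connection` is a hypothetical helper name.

```python
import sqlite3

# Held at module level so it survives across warm invocations.
_connection = None


def get_connection():
    """Create the connection on first use, then reuse it."""
    global _connection
    if _connection is None:
        # Stand-in for a real database; a production function would
        # connect to its actual data store here instead.
        _connection = sqlite3.connect(":memory:")
    return _connection


def lambda_handler(event, context):
    conn = get_connection()
    value = conn.execute("SELECT 1").fetchone()[0]
    return {"value": value, "connection_id": id(conn)}
```

Creating the connection lazily rather than at import time keeps the cold start short for invocations that never touch the database.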
With Lambda being able to access almost anything, security becomes a major consideration. What can invoke the Lambda function (function policies)? What can the Lambda function access (execution roles)?
One IAM role per function: each IAM role should be mapped to only one function, even when multiple functions need the same IAM policies. This helps preserve least privilege when the security policies for a specific function change later.
As Lambda would be running in a shared VPC, it is not good practice to keep AWS credentials in code.
- In most cases, the IAM execution role is sufficient to connect to AWS services by just using the AWS SDK.
- In cases where a function needs to call cross-account services, it might need credentials. In that case, use the AssumeRole API of AWS Security Token Service (STS) to retrieve temporary credentials.
- In cases where a function needs long-lived credentials to be stored, such as DB credentials or access keys, use either environment variables with encryption helpers or AWS Systems Manager.
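For the cross-account case, a minimal sketch might look like this. It assumes the caller passes in an STS client (for example one created with `boto3.client("sts")`); the function name, the default session name and the shape of the returned dictionary are choices made for illustration.

```python
def get_temporary_credentials(sts_client, role_arn,
                              session_name="lambda-cross-account"):
    """Assume a role and return its temporary credentials."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
    )
    creds = response["Credentials"]
    # Keys are named so the dict can be unpacked into a boto3 client call.
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

The returned dictionary could then be unpacked into another client, for instance `boto3.client("s3", **credentials)`. Passing the STS client in as a parameter also lets the function be tested with a stub.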
AWS Lambda is all about your code running in the cloud. So how should we test it locally?
Lambda doesn't provide any endpoint URL to test directly. It always depends on the event source systems to initiate.
We can use AWS SAM for local testing of a Lambda function. Its CLI provides a Lambda-like execution environment locally. We can also get a localhost URL for API Gateway that calls the Lambda function locally.
We can use the open source LocalStack project to create a local environment with most AWS resources/services available. This can be used to run Lambda along with other AWS services. You can integrate AWS SAM with LocalStack as well, since LocalStack exposes all the services as APIs, running as a Docker container in the backend.
Put business logic outside of the Lambda handler. The handler function should be used only to retrieve the inputs and pass them to other functions/methods. Those functions/methods should parse the inputs into variables related to our application and use them. This separates the business logic from the handler, so it can be tested within the context of the objects and functions we have created.
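A small sketch of this separation is shown below. The order-total calculation and the event fields (`items`, `tax_rate`) are hypothetical and only serve to show where the boundary sits.

```python
def calculate_order_total(items, tax_rate):
    """Pure business logic: no Lambda event or context involved."""
    subtotal = sum(item["price"] * item["quantity"] for item in items)
    return round(subtotal * (1 + tax_rate), 2)


def lambda_handler(event, context):
    """Thin handler: extract the inputs, delegate, shape the response."""
    items = event["items"]
    tax_rate = event.get("tax_rate", 0.0)
    return {"total": calculate_order_total(items, tax_rate)}
```

`calculate_order_total` can now be unit tested with plain Python values, without constructing a Lambda event at all.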
Lambda has versioning and alias features as well. We can publish multiple versions of a function, and each version can be invoked in parallel in a separate container. By default, the version is $LATEST. We can use these versions during development to create multiple environments like dev/UAT. However, using versions directly is not recommended for production, because every time we upload new code the version is incremented and clients need to point to the new one. That's where aliases come into the picture.
An alias refers to a particular version of the function. If the code changes and a newer version is published, the event source still points to the same alias, and the alias is updated to refer to the newer version. This helps when planning a blue/green deployment: we can test the newer version with sample events and, once it works fine, point the alias at it to switch the traffic over. The same mechanism can be used to roll back to the original version if any issues are found.
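The switch-over and rollback steps can be sketched with the Lambda API's `get_alias` and `update_alias` operations. The helper below takes the client as a parameter (for example `boto3.client("lambda")`); the helper name and the function and alias names in the usage note are hypothetical.

```python
def promote_alias(lambda_client, function_name, alias_name, new_version):
    """Point an alias at new_version and return the version it replaced."""
    current = lambda_client.get_alias(
        FunctionName=function_name,
        Name=alias_name,
    )
    previous_version = current["FunctionVersion"]

    lambda_client.update_alias(
        FunctionName=function_name,
        Name=alias_name,
        FunctionVersion=new_version,
    )
    # Keep the old version number so the caller can roll back if needed.
    return previous_version
```

A deployment script might call `previous = promote_alias(client, "orders", "prod", "7")` and, if problems appear, roll back with `promote_alias(client, "orders", "prod", previous)`.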
One of the best ways to enhance your AWS Lambda experience is to integrate it with CloudWatch, which works very well with Lambda and gives you detailed information about each execution. Lambda automatically tracks the number of requests, the execution duration per request, and the number of requests resulting in an error, and publishes the associated CloudWatch metrics. You can use these metrics to set custom CloudWatch alarms as well.
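As an illustration, the parameters for an alarm on a function's error count could be assembled as in the sketch below. The `AWS/Lambda` namespace, the `Errors` metric and the `FunctionName` dimension are the ones Lambda publishes; the alarm name, threshold and period are assumptions chosen for the example.

```python
def error_alarm_params(function_name, threshold=1, period_seconds=60):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"{function_name}-errors",  # hypothetical naming scheme
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }
```

With boto3 available, the alarm would be created with something like `boto3.client("cloudwatch").put_metric_alarm(**error_alarm_params("my-function"))`. In practice you would usually also pass `AlarmActions` so that someone is notified.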
We can also use X-Ray to identify potential bottlenecks in our Lambda execution. X-Ray is useful for visualizing where a function spends its execution time. It also helps to trace all the downstream systems the function connects to across the complete flow.
General positive feedback about Lambda is that it’s simple to set up, pricing is excellent, and it integrates with other internal AWS products such as RDS and ElastiCache.
When it comes to drawbacks of the solution, there are two main areas of criticism:
“Cold Start”: Creating a temporary container (that is subsequently destroyed) can take anywhere from 100 milliseconds to 2 minutes, and this delay is referred to as a “cold start”. There are various workarounds to mitigate this, but it is something important to be aware of.
Computational Restrictions: Being based on temporary containers means that usable memory is limited, so functions requiring a lot of processing cannot be handled by AWS Lambda. Again, workarounds are available, such as using AWS Step Functions.
Additionally, there is an element of “lock-in”, as choosing to go with AWS invariably means you’ll be integrating with (and becoming reliant on) other AWS tools and products in the Amazon ecosystem.
Like other AWS services, Lambda can be a great help: a useful tool for your project or an integral part of your development stack. To get the most out of the service, however, one must be acquainted with its best practices. Used skillfully, the end results usually exceed expectations.