Paper Review: Serverless computation with OpenLambda

- May 06, 2017

This paper provides a great, accessible review and evaluation of the AWS Lambda architecture. It is by Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau, and it appeared at HotCloud '16.

Virtual machines virtualized and shared the hardware so that multiple VMs could be colocated on the same machine. This allowed consolidation of machines, prevented the server sprawl problem, reduced costs, and improved manageability.

Containers virtualized and shared the operating system, and avoided the overheads of VMs. They provided fast startup times for application servers. By "fast" we mean about 25 seconds of preparation time.

In both VMs and containers, there is a "server" waiting for a client to serve. Applications are defined as a collection of servers and services.

"Serverless" takes virtualization a step further: it virtualizes and shares the runtime, and now the unit of deployment is a function. Applications are now defined as a set of functions (i.e., lambda handlers) with access to a common data store.

Lambda handlers from different customers share common pools of runtimes managed by the cloud provider, so developers need not worry about server management. Handlers are typically written in interpreted languages such as JavaScript or Python. By sharing the runtime environment across functions, the code specific to a particular application will typically be small, and hence it is inexpensive to send the handler code to any worker in a cluster.
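To make the "set of functions with access to a common data store" idea concrete, here is a minimal sketch in Python. The `(event, context)` signature follows the AWS Lambda convention for Python handlers; everything else (the `STORE` dictionary standing in for a shared data store, the handler names, and the event fields `post_id` and `text`) is a hypothetical illustration, not something taken from the paper.

```python
# STORE is a hypothetical in-memory stand-in for the common data store.
# A real deployment would use an external service, since handlers may run
# on any worker and cannot rely on local state.
STORE = {}


def add_comment(event, context):
    """Handler 1: append a comment to a post and return the new count."""
    comments = STORE.setdefault(event["post_id"], [])
    comments.append(event["text"])
    return {"count": len(comments)}


def list_comments(event, context):
    """Handler 2: return all comments stored for a post."""
    return {"comments": list(STORE.get(event["post_id"], []))}
```

Note how little application-specific code there is: the runtime and the data store are shared, so only these small functions would need to be shipped to a worker.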

Performance evaluation on AWS Lambda

Handlers can execute on any worker; in AWS, startup time for a handler on a new worker is approximately 1-2 seconds. Upon a load burst, a load balancer can start a Lambda handler on a new worker to service a queued RPC call without incurring excessive latencies. Figure 2 shows that 100 lambda workers are generated in a short time to serve 100 outstanding RPC requests.

Figure 5 shows more details about lambda handler initialization. Unpausing a paused lambda function incurs only a small delay (about 1 ms), but starting one from scratch takes several hundred milliseconds.

On the systems research side, one problem to investigate is building better execution engines. Under light load, Lambdas are significantly slower than containers as Figure 4 shows.

Lambdas are great for performance tuning. Because billing is based on invocations, you can see which functions are called and how many times. This helps you tune the performance of your applications.

On the topic of billing, in AWS, the cost of an invocation is proportional to the memory cap (not the actual memory consumed) multiplied by the actual execution time, rounded up to the nearest 100 ms. But many RPCs are shorter than 100 ms, so they cost several times more than they would if charging were more fine-grained.
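The rounding effect is easy to see with a little arithmetic. The sketch below assumes only the billing rule as described here (memory cap times execution time, rounded up to a 100 ms quantum); the function names and the MB-ms unit are my own, and no actual prices are modeled.

```python
import math

BILLING_QUANTUM_MS = 100  # rounding granularity described in the text


def billed_duration_ms(actual_ms, quantum_ms=BILLING_QUANTUM_MS):
    """Round the execution time up to the next multiple of the quantum."""
    return max(1, math.ceil(actual_ms / quantum_ms)) * quantum_ms


def billed_units(memory_cap_mb, actual_ms):
    """Billed quantity in MB-ms: memory cap (not memory used) times billed time."""
    return memory_cap_mb * billed_duration_ms(actual_ms)


def overcharge_factor(actual_ms):
    """How many times more is billed than under perfectly fine-grained billing."""
    return billed_duration_ms(actual_ms) / actual_ms
```

For example, a 20 ms RPC is billed as 100 ms, so `overcharge_factor(20)` is 5.0, while a 101 ms call is billed as 200 ms.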

The authors are working on building an open-source lambda computing platform, OpenLambda.

Discussion and future directions

So what kind of functions are suitable for "lambdatization"?

Currently, RPC calls from web apps are being lambdatized. But as the paper observed, those may be too small: they last less than 100 ms, the unit of billing. A common use case is when an app puts an image into S3, and this triggers a call to a lambda handler that processes the image and creates a thumbnail. That is a better fit.
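A rough sketch of the thumbnail scenario follows. It relies on the shape of S3 event notifications (a `Records` list with `s3.bucket.name` and a URL-encoded `s3.object.key`); the `thumbnails/` prefix and the helper name `thumbnail_key` are hypothetical, and the actual download, resize, and upload steps are left as a comment because they would need libraries outside the standard library.

```python
from urllib.parse import unquote_plus


def thumbnail_key(key):
    """Hypothetical naming scheme for the generated thumbnail."""
    return "thumbnails/" + key


def handler(event, context):
    """Collect one thumbnail job per uploaded object named in the event."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # A real handler would fetch the image, resize it, and write the
        # result back to the data store here.
        jobs.append({"bucket": bucket, "source": key, "target": thumbnail_key(key)})
    return jobs
```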

It is also best if the input to the lambda function is not very big; otherwise, transferring large amounts of data wastes or duplicates work. Lambda is for computation, and the computation should compensate for the cost of network use. Lambda excels at handling bursty traffic by autoscaling extremely quickly.

Picking up on this last observation, I think a very beneficial way to employ lambdatization is to address the bottlenecks of big data processing applications and platforms that surface at runtime. In a sense, I am advocating using lambdatization to virtualize even the application at the unit of functions! For this, we can ask the developer to provide tags that label some functions as lambda-offloadable. Then we can use a preprocessor and a shim layer to deploy these functions as lambda functions, so they can be auto-scaled based on the feedback/directives from the underlying distributed systems middleware.
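Here is one way the tagging and shim idea might look, as a sketch only. All names (`lambda_offloadable`, `Shim`, `set_offload`, `invoke_remote`, `word_count`) are hypothetical, a decorator plays the role of the preprocessor, and the remote invocation is an injected callable rather than a real cloud API.

```python
OFFLOADABLE = {}  # registry of functions the developer has tagged


def lambda_offloadable(func):
    """Tag: mark a function as eligible to run as a lambda function."""
    OFFLOADABLE[func.__name__] = func
    return func


class Shim:
    """Routes calls locally or to lambdas, based on middleware directives."""

    def __init__(self, invoke_remote):
        # invoke_remote(name, args, kwargs) is supplied by the platform.
        self.invoke_remote = invoke_remote
        self.offloaded = set()

    def set_offload(self, name, enabled):
        """Directive from the middleware: offload (or stop offloading) a function."""
        if name not in OFFLOADABLE:
            raise KeyError(name + " is not tagged as lambda-offloadable")
        if enabled:
            self.offloaded.add(name)
        else:
            self.offloaded.discard(name)

    def call(self, name, *args, **kwargs):
        """Run a tagged function remotely if offloaded, else locally."""
        func = OFFLOADABLE[name]
        if name in self.offloaded:
            return self.invoke_remote(name, args, kwargs)
        return func(*args, **kwargs)


@lambda_offloadable
def word_count(text):
    return len(text.split())
```

The application always goes through `shim.call("word_count", ...)`; when the middleware detects a bottleneck, it flips `set_offload("word_count", True)` and subsequent calls are routed to lambdas without any change to application code.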