The Future of Cloud lies in revisiting the designs and limitations of today’s notion of ‘serverless computing’, say UC Berkeley researchers

Last week, researchers at the UC Berkeley released a research paper titled ‘Serverless Computing: One Step Forward, Two Steps Back’, which highlights some pitfalls in the current serverless architectures. Researchers have also explored the challenges that should be addressed to utilize the complete potential that the cloud can offer to innovative developers.

Cloud isn’t being used to the fullest

The researchers have termed cloud as “the biggest assemblage of data capacity and distributed computing power ever available to the general public, managed as a service”.

The cloud today is being used as an outsourcing platform for standard enterprise data services. In order to leverage the actual potential of the cloud to the fullest, creative developers need programming frameworks. The majority of cloud services are simply multi-tenant, easier-to-administer clones of legacy enterprise data services such as object storage, databases, queueing systems, and web/app servers.

Off late, the buzz for serverless computing--a platform in the cloud where developers simply upload their code, and the platform executes it on their behalf as needed at any scale--is on the rise. This is because public cloud vendors have started offering new programming interfaces under the banner of serverless computing. The researchers support this with a Google search trend comparison where the term “serverless” recently matched the historic peak of popularity of the phrase “Map Reduce” or “MapReduce”.

the-future-of-cloud-lies-in-revisiting-the-designs-and-limitations-of-todays-notion-of-serverless-computing-say-uc-berkeley-researchers-img-0

Source: arxiv.org

They point out that the notion of serverless computing is vague enough to allow optimists to project any number of possible broad interpretations on what it might mean.

Hence, in this paper, they have assessed the field based on the serverless computing services that vendors are actually offering today and also see why these services are a disappointment given that the cloud has a bigger potential.

A Serverless architecture based on FaaS (Function-as-a-Service)

Functions-as-a-Service (FaaS) is the commonly used and more descriptive name for the core
of serverless offerings from the public cloud providers. Typical FaaS offerings today support a variety of languages (e.g., Python, Java, Javascript, Go), allow programmers to register functions with the cloud provider, and enable users to declare events that trigger each function. The FaaS infrastructure monitors the triggering events, allocates a runtime for the function, executes it, and persists the results. The user is billed only for the computing resources used during function invocation.

Building applications on FaaS not only requires data management in both persistent and temporary storage but also mechanisms to trigger and scale function execution. According to the researchers, cloud providers are quick to emphasize that serverless is not only FaaS, but it is, FaaS supported by a “standard library”: the various multi-tenanted, autoscaling services provided by the vendor; for instance, S3 (large object storage), DynamoDB (key-value storage), SQS (queuing services), and more.

However, current FaaS solutions are good for simple workloads of independent tasks such as parallel tasks embedded in Lambda functions, or jobs to be run by the proprietary cloud services. However, when it comes to use cases that involve stateful tasks, these FaaS have a surprisingly high latency. These realities limit the attractive use cases for FaaS today, discouraging new third-party programs that go beyond the proprietary service offerings
from the vendors.

Limitations of the current FaaS offering

No recoverability

Function invocations are shut down by the Lambda infrastructure automatically after 15 minutes. Lambda may keep the function’s state cached in the hosting VM in order to support a ‘warm start’ state. However, there is no way to ensure that subsequent invocations are run on the same VM. Hence functions must be written assuming that state will not be recoverable across invocations.

I/O Bottlenecks

Lambdas usually connect to cloud services or shared storage across a network interface. This means moving data across nodes or racks. With FaaS, things appear even worse than the network topology would suggest. Recent studies show that a single Lambda function
can achieve on average 538 Mbps network bandwidth. This is an order of magnitude slower than a single modern SSD. Worse, AWS appears to attempt to pack Lambda functions from the
same user together on a single VM, so the limited bandwidth is shared by multiple functions. The result is that as compute power scales up, per-function bandwidth shrinks proportionately. With 20 Lambda functions, average network bandwidth was 28.7Mbps—2.5 orders of magnitude slower than a single SSD.

Communication Through Slow Storage

Lambda functions can only communicate through an autoscaling intermediary service. As a corollary, a client of Lambda cannot address the particular function instance that handled the client’s previous request: there is no “stickiness” for client connections. Hence maintaining state across client calls require writing the state out to slow storage, and reading it back on every subsequent call.

No Specialized Hardware

FaaS offerings today only allow users to provision a time slice of a CPU hyperthread and some
amount of RAM; in the case of AWS Lambda, one determines the other. There is no API or mechanism to access specialized hardware. These constraints, combined with some significant shortcomings in the standard library of FaaS offerings, substantially limit the scope of feasible serverless applications.

The researchers conclude, “We see the future of cloud programming as far, far brighter than the promise of today’s serverless FaaS offerings. Getting to that future requires revisiting the designs and limitations of what is being called ‘serverless computing’ today.” They believe cloud programmers need to build a programmable framework that goes beyond FaaS, to dynamically manage the allocation of resources in order to meet user-specified performance goals for both compute and for data. The program analysis and scheduling issues are likely to open up significant opportunities for more formal research, especially for data-centric programs.

To know more this research in detail, read the complete research paper.

Introducing GitLab Serverless to deploy cloud-agnostic serverless functions and applications

Introducing ‘Pivotal Function Service’ (alpha): an open, Kubernetes based, multi-cloud serverless framework for developer workloads

Introducing numpywren, a system for linear algebra built on a serverless architecture