5 min read

Google researchers presented a paper on Google’s consistent global authorization system known as Zanzibar. The paper focuses on the design, implementation, and deployment of Zanzibar for storing and evaluating access control lists (ACL). Zanzibar offers a uniform data model and configuration language for providing a wide range of access control policies from hundreds of client services at Google. The client services include Cloud, Drive, Calendar, Maps, YouTube and Photos.

Zanizibar authorization decisions respect causal ordering of user actions and thus provide external consistency amid changes to access control lists and object contents. It scales to trillions of access control lists and millions of authorization requests per second to support services used by billions of people. It has maintained 95th-percentile latency of less than 10 milliseconds and availability of greater than 99.999% over 3 years of production use.

Here’s a list of the authors who contributed to the paper, Ruoming Pang, Ramon C ´aceres, Mike Burrows, Zhifeng Chen, Pratik Dave, Nathan Germer, Alexander Golynski, Kevin Graney, Nina Kang, Lea Kissner, Jeffrey L. Korn, Abhishek Parmar, Christopher D. Richards and Mengzhi Wang.

What are the goals of Zanzibar system

Researchers have certain goals for the Zanzibar system which are as follows:

  • Correctness: The system must ensure consistency of access control decisions.
  • Flexibility: Zanzibar system should also support access control policies for consumer and enterprise applications.
  • Low latency: The system should quickly respond because authorization checks are usually in the critical path of user interactions. And low latency is important for serving search results that often require tens to hundreds of checks.
  • High availability: Zanzibar system should reliably respond to requests Because in the absence of explicit authorization, client services would be forced to deny their user access.
  • Large scale: The system should protect billions of objects that are shared by billions of users. The system should be deployed around the globe so that it becomes easier for its clients and the end users.

To achieve the above-mentioned goals, Zanzibar involves a combination of features. For example, for flexibility, the system pairs a simple data model with a powerful configuration language that allows clients to define arbitrary relations between users and objects. The Zanzibar system employs an array of techniques for achieving low latency and high availability and for consistency, it stores the data in normalized forms.

Zanzibar replicates ACL data across multiple data centers

The Zanzibar system operates at a global scale and stores more than two trillion ACLs (Access Control Lists) and also performs millions of authorization checks per second. But the ACL data does not lend itself to geographic partitioning as the authorization checks for an object can actually come from anywhere in the world. This is the reason why, Zanzibar replicates all of its ACL data in multiple geographically distributed data centers and then also distributes the load across thousands of servers around the world.

Zanzibar’s architecture includes a main server organized in clusters

Image source:  Zanzibar: Google’s Consistent, Global Authorization System

The acl servers are the main server type in this system and they are organized in clusters so that they respond to Check, Read, Expand, and Write requests. When the requests arrive at any server in a cluster, the server passes on the work to other servers in the cluster and those servers may then contact other servers for computing intermediate results. The initial server is the one that gathers the final result and returns it to the client.

The Zanzibar system stores the ACLs and their metadata in Spanner databases. There is one database for storing relation tuples for each client namespace and one database for holding all namespace configurations. And there is one changelog database that is shared across all namespaces.

So the acl servers basically read and write those databases while responding to client requests. Then there are a specialized server type that respond to Watch requests, they are known as the watchservers. These servers tail the changelog and serve namespace changes to clients in real time.

The Zanzibar system runs a data processing pipeline for performing a variety of offline functions across all Zanzibar data in Spanner. For example, producing dumps of the relation tuples in each namespace at a known snapshot time.

Zanzibar uses an indexing system for optimizing operations on large and deeply nested sets, known as Leopard. It is responsible for reading periodic snapshots of ACL data and for watching the changes between snapshots. It also performs transformations on data, such as denormalization, and then responds to requests coming from acl servers.

The researchers concluded by stating that Zanzibar system is simple, flexible data model and offers configuration language support. According to them, Zanzibar’s external consistency model allows authorization checks to be evaluated at distributed locations without the need for global synchronization. It also offers low latency, scalability, and high availability.

People are finding this paper very interesting and also the facts involved are surprising for them. A user commented on HackerNews, “Excellent paper. As someone who has worked with filesystems and ACLs, but never touched Spanner before.” Another user commented, “What’s interesting to me here is not the ACL thing, it’s how in a way ‘straight forward’ this all seems to be.”

Another comment reads, “I’m surprised by all the numbers they give out: latency, regions, operation counts, even servers. The typical Google paper omits numbers on the Y axis of its most interesting graphs. Or it says “more than a billion”, which makes people think “2B”, when the actual number might be closer to 10B or even higher.”

Few others think that the name of the project wasn’t Zanzibar initially and it was called ‘Spice’.

To know more about this system, check out the paper Zanzibar: Google’s Consistent, Global Authorization System.

Read Next

Google researchers propose building service robots with reinforcement learning to help people with mobility impairment

Researchers propose a reinforcement learning method that can hack Google reCAPTCHA v3

Researchers input rabbit-duck illusion to Google Cloud Vision API and conclude it shows orientation-bias