
How Facebook data scientists use Bayesian optimization for tuning their online systems


In 2017, Facebook data scientists released a paper, Constrained Bayesian Optimization with Noisy Experiments, in which they describe using Bayesian optimization to design rounds of A/B tests based on prior test results. An A/B test is a randomized experiment used to determine which of two variants, A or B, is more “effective”. Such tests are used to improve a product.

Facebook has a large array of backend systems serving billions of people every day. These systems have a large number of internal parameters that must be tuned carefully using live, randomized experiments (A/B tests). Individual experiments may take a week or longer, so the challenge is to optimize a set of parameters with as few experiments as possible.

Bayesian optimization

Bayesian optimization is a technique for solving optimization problems where the objective function (the online metric of interest) has no analytic expression and can only be evaluated through a time-consuming operation such as a randomized experiment. Compared with grid search or manual tuning, it allows more parameters to be tuned jointly with fewer experiments, and it helps find better values.

The Gaussian process (GP) is a Bayesian model that works well for Bayesian optimization. A GP provides good uncertainty estimates of how an online metric varies with the parameters of interest. It is illustrated as follows:

Source: Facebook research blog
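
To make the GP's uncertainty estimates concrete, here is a minimal sketch of GP regression on one parameter, written with NumPy. It is not the code used at Facebook: the squared-exponential kernel, the fixed hyperparameters, and the data points are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of points (an assumed choice)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=0.01):
    """Posterior mean and standard deviation of the latent metric at x_test."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    k_ss = rbf_kernel(x_test, x_test)
    # A Cholesky factorization is the usual numerically stable way to solve with k_tt.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha
    v = np.linalg.solve(chol, k_ts)
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Hypothetical data: four tested parameter values and the noisy metric observed for each.
x_train = np.array([0.0, 1.0, 2.5, 4.0])
y_train = np.array([0.2, 0.9, 0.4, -0.6])
x_test = np.linspace(0.0, 5.0, 6)  # 0, 1, 2, 3, 4, 5

mean, std = gp_posterior(x_train, y_train, x_test)
for x, m, s in zip(x_test, mean, std):
    print(f"x={x:.1f}  mean={m:+.3f}  std={s:.3f}")
```

The standard deviation is small at parameter values that have already been tested and grows away from them, which is the signal Bayesian optimization uses to decide where the next experiment is worth running.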

The work in the paper was motivated by three challenges in using Bayesian optimization to tune online systems: noise, constraints, and batch experimentation.

To handle observation noise, the authors take a Bayesian approach and include the posterior uncertainty induced by noise in the expectation computed by expected improvement (EI). Instead of computing the expectation of the improvement I(x) under the posterior of f(x) alone, it is computed under the joint posterior of f(x) and f(x*), where x* is the current best point. This expectation no longer has a closed form as EI does, but samples of the values at past observations f(x_1), …, f(x_n) can easily be drawn from the GP posterior, and the conditional distribution f(x) | f(x_1), …, f(x_n) has a closed form.
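
The sampling scheme just described can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation: it assumes the metric is being maximized, a unit-variance squared-exponential kernel with fixed hyperparameters, plain Monte Carlo sampling in place of whatever sampling scheme the paper uses, and made-up data. The function names are hypothetical.

```python
import math
import numpy as np

def rbf(a, b, length_scale=1.0):
    """Unit-variance squared-exponential kernel (an assumed choice)."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale**2)

def norm_pdf(z):
    return np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def expected_improvement(mean, std, best):
    """Closed-form EI for maximization when the best value so far is known exactly."""
    std = np.maximum(std, 1e-12)
    z = (mean - best) / std
    ei = (mean - best) * norm_cdf(z) + std * norm_pdf(z)
    return np.maximum(ei, 0.0)  # guard against tiny negative round-off

def noisy_ei(x_obs, y_obs, x_cand, noise_var=0.05, n_samples=256, seed=0):
    """Monte Carlo average of EI over samples of the latent values f(x_1), ..., f(x_n)."""
    rng = np.random.default_rng(seed)
    n = len(x_obs)
    k_oo = rbf(x_obs, x_obs)
    k_oc = rbf(x_obs, x_cand)
    jitter = 1e-8 * np.eye(n)
    # GP posterior of the latent values at the past observations, given noisy y.
    gain = np.linalg.solve(k_oo + noise_var * np.eye(n), k_oo).T
    post_mean = gain @ y_obs
    post_cov = k_oo - gain @ k_oo
    post_cov = 0.5 * (post_cov + post_cov.T) + jitter
    f_samples = rng.multivariate_normal(post_mean, post_cov, size=n_samples)
    # Noiseless conditional f(x) | f(x_1), ..., f(x_n): its variance does not depend on the sample.
    weights = np.linalg.solve(k_oo + jitter, k_oc)
    cond_var = 1.0 - np.sum(k_oc * weights, axis=0)
    cond_std = np.sqrt(np.maximum(cond_var, 0.0))
    total = np.zeros(len(x_cand))
    for f in f_samples:
        # Each sample fixes the past values, so the best value f.max() is known and EI is closed-form.
        total += expected_improvement(f @ weights, cond_std, f.max())
    return total / n_samples

# Hypothetical noisy observations of a metric at four parameter values.
x_obs = np.array([0.0, 1.0, 2.0, 3.0])
y_obs = np.array([0.1, 0.8, 0.5, -0.3])
x_cand = np.linspace(0.0, 4.0, 9)

scores = noisy_ei(x_obs, y_obs, x_cand)
print("next point to test:", x_cand[np.argmax(scores)])
```

Averaging the closed-form EI over sampled past values is what lets the uncertainty about the current best value enter the acquisition score, instead of treating a noisy observation as the true optimum.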

The results

The approach described in the paper is used to optimize various systems at Facebook, and the paper describes two such optimizations. The first tuned six parameters of one of Facebook’s ranking systems. The second tuned seven numeric compiler flags for the HipHop Virtual Machine (HHVM), which the web servers powering Facebook use to serve requests.

The goal of the HHVM optimization was to reduce CPU usage on the web servers, subject to a constraint on peak memory usage. The following figure shows the CPU usage of each of the 100 configurations tested, along with the probability that each point satisfied the memory constraint:

Source: Facebook research blog

The first 30 iterations, marked by the green line, were randomly generated configurations. After that, Bayesian optimization was used to identify the parameter configurations to evaluate. It found better configurations that were also more likely to satisfy the constraints.
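
One common way to combine an objective with a constraint like the memory limit is to weight each candidate's improvement score by its probability of being feasible. The sketch below shows that idea only. It is not taken from the paper: the Gaussian model of predicted peak memory, the function names, and every number are assumptions for illustration.

```python
import math

def prob_feasible(mean, std, limit):
    """P(c(x) <= limit) when the constraint metric c(x) ~ Normal(mean, std**2)."""
    if std <= 0.0:
        return 1.0 if mean <= limit else 0.0
    z = (limit - mean) / std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def constrained_score(ei, mean_c, std_c, limit):
    """Improvement score weighted by the probability of satisfying the constraint."""
    return ei * prob_feasible(mean_c, std_c, limit)

# Hypothetical candidates: (name, EI for CPU reduction, predicted peak memory mean, std).
candidates = [
    ("a", 0.30, 9.0, 1.0),
    ("b", 0.50, 11.5, 1.0),
    ("c", 0.20, 8.0, 0.5),
]
limit = 10.0  # hypothetical peak-memory limit, in the same units as the predictions

scores = {name: constrained_score(ei, m, s, limit) for name, ei, m, s in candidates}
best = max(scores, key=scores.get)
print(best, scores)
```

Candidate "b" has the largest raw improvement but is predicted to exceed the limit, so its weighted score drops below the others and "a" is selected.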

The authors conclude that Bayesian optimization is an effective and robust tool for optimizing via noisy experiments.

For full details, visit the Facebook research blog. You can also take a look at the research paper.


Prasad Ramesh
