
This ICLR 2018 accepted paper, Deep Mean Field Games for Learning Optimal Behavior Policy of Large Populations, deals with inference in models of collective behavior, specifically with how to infer the parameters of a mean field game (MFG) representation of collective behavior. The paper is authored by Jiachen Yang, Xiaojing Ye, Rakshit Trivedi, Huan Xu, and Hongyuan Zha. The 6th annual ICLR conference is scheduled for April 30 – May 3, 2018.

Mean field game theory is the study of decision making in very large populations of small interacting agents. It models the behavior of many agents, each individually trying to optimize their position in space and time, but with preferences that are partly determined by the choices of all the other agents.
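For readers who want the mathematics behind this description, the classic continuous-time formulation couples a backward Hamilton-Jacobi-Bellman equation for a representative agent's value function with a forward Fokker-Planck equation for the population density. The form below is a textbook version with one common sign convention, not the paper's discrete-time notation:

```latex
% Value function u(x,t) of a representative agent (backward HJB) and
% population density m(x,t) (forward Fokker-Planck), coupled through
% the running cost f(x,m) and the Hamiltonian H:
-\partial_t u - \nu \Delta u + H(x, \nabla u) = f(x, m)
\partial_t m - \nu \Delta m - \operatorname{div}\!\big( m\, \nabla_p H(x, \nabla u) \big) = 0
```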

Estimating the optimal behavior policy of large populations with Deep Mean Field Games

What problem is the paper attempting to solve?

The paper considers the problem of representing and learning the behavior of a large population of agents, to construct an effective predictive model of the behavior. For example, a population’s behavior directly affects the ranking of a set of trending topics on social media, represented by the global population distribution over topics. Each user’s observation of this global state influences their choice of the next topic in which to participate, thereby contributing to future population behavior.
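As a toy illustration of this feedback loop (the topic count, transition matrix, and function name below are our own inventions, not the paper's), the population distribution over topics can be advanced one step at a time by a row-stochastic matrix whose rows act as per-topic policies:

```python
import numpy as np

def step_population(pi, P):
    """Advance the population distribution one step.

    pi : (d,) distribution over d topics (entries sum to 1)
    P  : (d, d) row-stochastic matrix; P[i, j] is the fraction of users
         currently on topic i who move to topic j at the next step.
    """
    return pi @ P

# Toy example with 3 topics: users slowly drift toward topic 2.
pi = np.array([0.5, 0.3, 0.2])
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.7, 0.2],
              [0.0, 0.1, 0.9]])
for _ in range(5):
    pi = step_population(pi, P)
print(pi)  # population distribution after 5 steps
```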

Classical predictive methods such as time series analysis can also be used to build predictive models from data. However, these models do not treat the behavior as the result of optimizing a reward function, and so they may not provide insight into the motivations that produce a population’s behavior policy. Alternatively, methods that employ the underlying population network structure assume that nodes are influenced only by a local neighborhood and do not include a representation of a global state. Hence, they face difficulty in explaining events as the result of uncontrolled implicit optimization.

Mean field games overcome the limitations of these alternative predictive methods by determining how a system naturally behaves according to its underlying optimal control policy. The paper proposes a novel approach for estimating the parameters of an MFG. Its main contribution is to relate the theories of MFGs and reinforcement learning within the classic context of Markov Decision Processes (MDPs). The suggested method uses inverse RL to learn both the reward function and the forward dynamics of the MFG from data.

Paper summary

The paper covers the problem in three sections: theory, algorithm, and experiment. The theoretical contribution begins by transforming a continuous-time MFG formulation into a discrete-time formulation, and then relates the MFG to an associated MDP problem.
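To make this reduction concrete, here is the discrete-time MFG written as a single MDP, in our own notation (a paraphrase of the construction, not the paper's exact statement): the MDP's state is the population distribution over the d topics, and its action is the matrix of per-topic policies.

```latex
% State: population distribution \pi^n in the probability simplex.
% Action: a row-stochastic matrix P^n whose row P^n_i is the policy
% followed by agents currently at topic i.
% Forward dynamics and population-level reward:
\pi^{n+1} = \pi^{n} P^{n}, \qquad
R(\pi^{n}, P^{n}) = \sum_{i=1}^{d} \pi^{n}_i \, r_i\!\big(\pi^{n}, P^{n}_i\big)
```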
In the algorithm phase, an RL solution to the MFG problem is proposed. The authors relate solving an optimization problem on the MDP of a single agent to solving the inference problem of the (population-level) MFG. This leads to learning a reward function from demonstrations using a maximum likelihood approach, with the reward represented by a deep neural network. The policy is learned through an actor-critic algorithm, based on gradient descent with respect to the policy parameters. A simplified sketch of this training loop appears below.
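The following is a minimal, self-contained PyTorch sketch of such a loop, under heavy simplifying assumptions of our own: the action is collapsed to a proposed next distribution rather than a full transition matrix, the demonstrations are synthetic, and the actor update is a deterministic (DDPG-flavored) stand-in for the paper's actor-critic. It illustrates the structure, not the authors' released code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 5  # number of topics (illustrative)

def sample_demonstration(batch=32):
    """Synthetic stand-in for demonstration data: distributions that
    drift toward topic 0."""
    pi = torch.softmax(torch.randn(batch, d), dim=-1)
    drift = torch.eye(d)[0].expand(batch, d)
    pi_next = 0.9 * pi + 0.1 * drift  # observed next-step distributions
    return pi, pi_next

# Reward r_theta(pi, a), actor (proposes next distribution), critic V(pi).
reward_net = nn.Sequential(nn.Linear(2 * d, 64), nn.ReLU(), nn.Linear(64, 1))
actor = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, d))
critic = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 1))

opt_r = torch.optim.Adam(reward_net.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

def reward(pi, a):
    return reward_net(torch.cat([pi, a], dim=-1)).squeeze(-1)

def policy(pi):
    return torch.softmax(actor(pi), dim=-1)  # proposed next distribution

gamma = 0.99
for step in range(500):
    pi, pi_next = sample_demonstration()

    # Maximum-likelihood (MaxEnt-IRL-style) reward update: raise reward
    # on demonstrated transitions, lower it on the policy's own samples.
    loss_r = -(reward(pi, pi_next).mean() - reward(pi, policy(pi).detach()).mean())
    opt_r.zero_grad(); loss_r.backward(); opt_r.step()

    # Critic: TD(0) regression against the learned reward.
    with torch.no_grad():
        a = policy(pi)
        target = reward(pi, a) + gamma * critic(a).squeeze(-1)
    loss_c = (critic(pi).squeeze(-1) - target).pow(2).mean()
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()

    # Actor: ascend learned reward plus discounted critic value of the
    # proposed next state (deterministic policy-gradient-style update).
    a = policy(pi)
    loss_a = -(reward(pi, a) + gamma * critic(a).squeeze(-1)).mean()
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()
```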
The algorithm is first compared with previous approaches on toy problems with artificially created reward functions. The authors then demonstrate it on real-world social data, with the aim of recovering the reward function and predicting the future trajectory.

Key Takeaways

  • This paper describes a data-driven method to solve a mean field game model of population evolution, by proving a connection between Mean Field Games and Markov Decision Processes and building on methods in reinforcement learning.
  • This method is scalable to arbitrarily large populations because the Mean Field Games framework represents population density rather than individual agents.
  • In experiments on real data, the Mean Field Game framework emerges as a powerful tool for learning a reward and policy that can predict the trajectories of a real-world population more accurately than alternatives.

Reviewer feedback summary

Overall Score: 26/30
Average Score: 8.66

The reviewers were unanimous in finding the work in this paper highly novel and significant. According to the reviewers, there is still little work at the intersection of machine learning and collective behavior, and this paper could help stimulate the growth of that intersection. On the flip side, surprisingly, one review criticized the paper, stating that the “scientific content of the work has critical conceptual flaws”. However, the authors’ rebuttals persuaded the reviewers that these concerns were largely addressed.

