DeepMind open sources TRFL, a new library of reinforcement learning building blocks

The DeepMind team announced yesterday that they’re open sourcing a new library, named TRFL, that comprises useful building blocks for writing reinforcement learning (RL) agents in TensorFlow. The TRFL library was created by the research engineering team at DeepMind.

The TRFL library is a collection of key algorithmic components used in a large number of DeepMind’s agents, such as DQN, DDPG, and the Importance Weighted Actor Learner Architecture.

A typical deep reinforcement learning agent comprises a large number of interacting components, including at least the environment and some deep network representing values or policies. Many agents also include components such as a learned model of the environment, pseudo-reward functions, or a replay system. These components interact in subtle ways, which makes it difficult to identify bugs in the resulting large computational graphs.

One way to address this issue is to open-source complete agent implementations. Although such large agent codebases are useful for reproducing research, they are hard to modify and extend. A different and complementary approach is to provide reliable, well-tested implementations of common building blocks, which can then be used in a variety of different RL agents.

TRFL takes the second approach: it includes functions for implementing both classical RL algorithms and more cutting-edge techniques. The loss functions and other operations that come with TRFL are implemented in pure TensorFlow. They are not complete algorithms; instead, they are implementations of RL-specific mathematical operations that are needed when building fully functional RL agents.

For value-based reinforcement learning, TRFL provides TensorFlow ops for learning in discrete action spaces, such as TD-learning, Sarsa, Q-learning, and their variants. It also offers ops for implementing continuous control algorithms such as DPG, as well as ops for learning distributional value functions.
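
To make the idea of a loss-function building block concrete, here is a minimal sketch of the one-step Q-learning and Sarsa losses for a single transition. It is written in plain Python rather than TensorFlow, and the function names, argument names, and the 0.5 * squared-error form are assumptions chosen for illustration; it is not TRFL's implementation or API.

```python
# Illustrative sketch only: plain-Python versions of two value-based
# RL losses. All names here are hypothetical, not taken from TRFL.

def qlearning_loss(q_tm1, a_tm1, r_t, pcont_t, q_t):
    """One-step Q-learning loss for a single transition.

    q_tm1:   action values in the previous state (list of floats)
    a_tm1:   index of the action taken in the previous state
    r_t:     reward received after taking that action
    pcont_t: discount (0.0 if the episode ended at this step)
    q_t:     action values in the current state (list of floats)
    """
    # Bootstrap from the best action value in the current state.
    target = r_t + pcont_t * max(q_t)
    td_error = target - q_tm1[a_tm1]
    return 0.5 * td_error ** 2, td_error


def sarsa_loss(q_tm1, a_tm1, r_t, pcont_t, q_t, a_t):
    """One-step Sarsa loss; a_t is the action actually taken next."""
    # Bootstrap from the value of the action the agent really chose.
    target = r_t + pcont_t * q_t[a_t]
    td_error = target - q_tm1[a_tm1]
    return 0.5 * td_error ** 2, td_error


if __name__ == "__main__":
    q_prev, q_next = [1.0, 2.0], [2.0, 4.0]
    print(qlearning_loss(q_prev, 0, 1.0, 0.5, q_next))
    print(sarsa_loss(q_prev, 0, 1.0, 0.5, q_next, a_t=0))
```

The two functions differ only in the bootstrap value: Q-learning uses the maximum action value in the current state, while Sarsa uses the value of the action that was actually taken. A library of such ops lets an agent author swap one update rule for another without touching the rest of the agent.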

Finally, TRFL also comes with an implementation of the auxiliary pseudo-reward functions used by UNREAL, which, according to DeepMind, improve data efficiency in a wide range of domains.

“This is not a one-time release. Since this library is used extensively within DeepMind, we will continue to maintain it as well as add new functionalities over time. We are also eager to receive contributions to the library by the wider RL community”, mentioned the DeepMind team.

For more information, check out the official DeepMind blog.