
Are Recurrent Neural Networks capable of warping time?

2 min read

‘Can recurrent neural networks warp time?’ is a paper authored by Corentin Tallec and Yann Ollivier, to be presented at ICLR 2018.

This paper explains that plain RNNs cannot account for warpings, leaky RNNs can account for uniform time scalings but not irregular warpings, and gated RNNs can adapt to irregular warpings.
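For reference, the three families of update rules being compared look roughly as follows (a sketch in standard notation; the paper's exact formulation may differ in details):

```latex
% Plain RNN: no mechanism for rescaling its own dynamics in time
h_{t+1} = \tanh(W_x x_t + W_h h_t + b)

% Leaky RNN: a fixed leak rate \alpha acts as a single, global time constant,
% so it can absorb a uniform time scaling but not an irregular warping
h_{t+1} = \alpha \tanh(W_x x_t + W_h h_t + b) + (1 - \alpha)\, h_t

% Gated RNN (LSTM/GRU-style): learnable, input-dependent gates f_t and i_t
% act as local time constants and can adapt to irregular warpings
h_{t+1} = f_t \odot h_t + i_t \odot \tanh(W_x x_t + W_h h_t + b)
```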

Relating the gating mechanism of LSTMs (and GRUs) to time invariance and warping

What problem is the paper trying to solve?

In this paper, the authors prove that learnable gates in a recurrent model formally provide quasi-invariance to general time transformations in the input data. Further, the authors try to recover part of the LSTM architecture from a simple axiomatic approach. This leads to a new way of initializing gate biases in LSTMs and GRUs. Experimentally, this new chrono initialization is shown to greatly improve learning of long-term dependencies, with minimal implementation effort.
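As a concrete illustration, the chrono initialization amounts to setting the forget-gate bias to log(u) with u drawn uniformly from [1, T_max − 1], where T_max is the longest time dependency expected in the data, and the input-gate bias to its negative. Below is a minimal sketch assuming PyTorch's nn.LSTM; the chrono_init helper and the choice of t_max are illustrative, not taken from the authors' code:

```python
import torch
import torch.nn as nn

def chrono_init(lstm: nn.LSTM, t_max: float) -> None:
    """Chrono initialization for a single-layer nn.LSTM (sketch).

    Forget-gate bias: b_f = log(u), u ~ Uniform(1, t_max - 1)
    Input-gate bias:  b_i = -b_f
    All other biases are zeroed. PyTorch stores gates in (i, f, g, o) order
    and splits the bias between bias_ih and bias_hh; here the whole chrono
    bias is placed in bias_ih and bias_hh is left at zero.
    """
    h = lstm.hidden_size
    for name, param in lstm.named_parameters():
        if "bias" in name:
            nn.init.zeros_(param)
    with torch.no_grad():
        b_f = torch.log(torch.empty(h).uniform_(1.0, t_max - 1.0))
        lstm.bias_ih_l0[h:2 * h] = b_f   # forget gate
        lstm.bias_ih_l0[0:h] = -b_f      # input gate

# Example: dependencies expected over roughly 784 steps (hypothetical setting)
lstm = nn.LSTM(input_size=1, hidden_size=128)
chrono_init(lstm, t_max=784)
```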

Paper summary

The authors derive the self-loop (feedback) gating mechanism of recurrent networks from first principles, via a postulate of invariance to time warpings. Gated connections turn out to regulate the local time constants in recurrent models. With this in mind, they introduce the chrono initialization, a principled way of initializing gate biases in LSTMs. Experimentally, chrono initialization is shown to bring notable benefits when facing long-term dependencies.
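The "local time constant" reading can be made concrete with a back-of-the-envelope argument (a hedged sketch consistent with the paper's reasoning, not a quote from it): if the forget gate stays near its bias value, memory decays geometrically at that rate, so the bias directly sets the characteristic forgetting time.

```latex
f \approx \sigma(b_f), \qquad
\text{memory after } t \text{ steps} \sim f^{\,t}, \qquad
\tau \approx \frac{1}{1 - f}
\;\Longrightarrow\;
b_f = \log(T - 1) \;\text{ gives }\; f = \frac{T - 1}{T},\ \ \tau \approx T .
```

This is exactly the time scale targeted by the chrono initialization sketched above.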

Key takeaways

  • In this paper, the authors show that postulating invariance to time transformations in the data (taking invariance to time warping as an axiom) necessarily leads to a gate-like mechanism in recurrent models.
  • The paper provides precise prescriptions on how to initialize gate biases depending on the range of time dependencies to be captured.
  • The authors test the empirical benefits of the new initialization on both synthetic and real-world data.
  • The authors also observed a substantial improvement with long-term dependencies, and slight gains or no change when short-term dependencies dominate.

Reviewer comments summary

Overall Score: 25/30

Average Score: 8

According to a reviewer, the core insight of the paper is the link between recurrent network design and its effect on how the network reacts to time transformations; the reviewer finds this insight simple, elegant, and valuable. A minor complaint is that the paper has an unnecessarily large number of paragraph breaks, which makes reading slightly jarring.

Read Next:

Recurrent neural networks and the LSTM architecture

Build generative chatbot using recurrent neural networks (LSTM RNNs)

How to recognize Patterns with Neural Networks in Java

 

Savia Lobo

A Data science fanatic. Loves to be updated with the tech happenings around the globe. Loves singing and composing songs. Believes in putting the art in smart.
