3 min read

This is a quick summary of the research paper titled Making Neural Programming Architectures Generalize via Recursion by Jonathon Cai, Richard Shin, and Dawn Song, published on 6 November 2016.

The idea of solving a common task is central to developing any algorithm or system. The primary challenge in designing such a system is generalization: simply put, the same system should keep predicting accurate results when the inputs grow large and vary across different domains. This is where most artificial neural network (ANN) systems fail. The researchers claim that recursion, which is inherent in many algorithms, can, if introduced explicitly into the architecture, yield a system that predicts accurate results on inputs of effectively unlimited size. This technique is called the recursive neural program. For more on this and on the different neural network programs, you can refer to the original research paper.

[Figure: a sample neural network program]

The Problem with Learned Neural Networks

The most common technique applied to date was the learned neural network: a method where a program is trained on increasingly complex instances of a task, for example grade-school addition, in simpler words, adding two multi-digit numbers. The problem with this approach was that the program kept solving the task correctly only as long as the numbers had few digits. When the digit count increased, the results became chaotic: some were correct and some were not, the reason being that the program had learned an overly complicated method for a problem of growing complexity. The real reason behind this lay in the architecture, which stayed the same as the complexity of the problem increased; the program could not adapt in the end and gave chaotic responses.

The Solution: Recursion

The essence of recursion is that it helps the system break a problem down into smaller pieces and then solve those pieces separately. This means that irrespective of how complex the problem is, the recursive process breaks it down into standard units, i.e., the solution remains uniform and consistent.
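To make this concrete, here is a minimal sketch in Python (our own illustration, not code from the paper) of how recursion reduces grade-school addition, one of the tasks discussed below, to a single bounded step applied over and over; the function name and digit-list representation are assumptions of this sketch:

```python
# A minimal sketch (not from the paper) of how recursion keeps a solution
# uniform: grade-school addition reduced to one small, repeatable step.
def add_digits(a, b, carry=0):
    """Add two numbers given as digit lists (most-significant digit first)."""
    # Base case: no digits left, emit the final carry if there is one.
    if not a and not b:
        return [carry] if carry else []
    # One bounded unit of work: add the least-significant digits plus carry.
    da = a[-1] if a else 0
    db = b[-1] if b else 0
    s = da + db + carry
    # Recurse on the strictly smaller remainder of the problem.
    return add_digits(a[:-1], b[:-1], s // 10) + [s % 10]

# The same short procedure handles any number of digits: 987 + 45 = 1032.
print(add_digits([9, 8, 7], [4, 5]))  # -> [1, 0, 3, 2]
```

No matter how many digits the inputs have, the program only ever executes this one standard unit, which is exactly the property the recursive neural program exploits.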

Keeping the theory of recursion in mind, the researchers implemented it in their neural network program by building recursion into an existing architecture called the Neural Programmer-Interpreter (NPI).

[Figure: the different algorithms and techniques used to create neural-network-based programs]

The present system is based on the May 2016 formulation proposed by Reed et al.

The system is trained with supervised recursion: the execution traces it learns from contain steps where a function calls itself on a smaller piece of the problem, with each step checked against the desired result. This self-calling of the program automatically induces recursion, which decomposes the problem into multiple smaller units, and hence the results are more accurate than with other techniques.
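As an illustration of what such supervision might look like, the sketch below shows a hypothetical execution trace for bubble sort in which the top-level program calls itself. The subprogram names BUBBLE and RESET follow the bubble-sort decomposition used in the NPI literature, but the trace format here is a simplification of our own:

```python
# A hypothetical, simplified execution trace used as supervision: each entry
# is one program call the model must learn to emit. The key feature is the
# recursive call, which lets the model learn one short, repeatable pattern
# instead of an ever-longer sequence as the array grows.
trace = [
    "BUBBLESORT",    # top-level call on the full array
    "  BUBBLE",      # one pass of adjacent compare-and-swap operations
    "  RESET",       # move the pointers back to the start
    "  BUBBLESORT",  # recursive call: the same pattern on a smaller problem
    # ... and so on, until a base case (nothing left to sort) is reached.
]
```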

The researchers have successfully applied this technique to four common tasks, namely:

  1. Grade School Addition
  2. Bubble Sort
  3. Topological Sort
  4. Quick Sort

They found that the recursive architecture gives a 100 percent success rate in predicting correct results on all four of the above-mentioned tasks. Notably, because the learned programs are recursive, their generalization can be argued by verifying a finite set of base cases and reduction rules rather than by testing unboundedly many inputs. The flip side of this technique is still the amount of supervision required while performing the tasks; this will be the subject of further investigation and research.
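As a rough illustration of that style of argument (a toy check of our own, not the paper's verification procedure), the snippet below reuses the add_digits sketch from earlier. Because the function performs only one bounded step before recursing, exhaustively checking every single-step configuration covers inputs of any length:

```python
# Toy verification in the spirit of the paper, reusing add_digits from the
# earlier sketch. A single step sees only a pair of digits plus an incoming
# carry; if every such configuration is handled correctly, induction over
# the recursion gives correct answers for numbers of any length.
assert all(
    add_digits([da], [db], carry) == [int(c) for c in str(da + db + carry)]
    for da in range(10)
    for db in range(10)
    for carry in (0, 1)
)
print("all single-step configurations check out")
```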

For a more detailed account of the different neural network programs and their performance, please refer to the original research paper.
