
Recurrent neural networks and the LSTM architecture


A recurrent neural network (RNN) is a class of artificial neural networks in which the connections between nodes form a directed graph containing cycles, so the output of a node at one step can feed back into the network at the next. These nodes can be classified as input, hidden, or output. Input nodes receive data from outside the network, hidden nodes transform that data while carrying state from one step to the next, and output nodes provide the intended results. RNNs are well known for their extensive use in NLP tasks.

The video tutorial above has been taken from Natural Language Processing with Python.

Why are recurrent neural networks well suited for NLP?

What makes RNNs so popular and effective for natural language processing tasks is that they operate sequentially, processing one element of the input at a time. For example, a movie review is a sequence of words and characters of arbitrary length, which the RNN can take as its input. The hidden and output layers are likewise capable of working with sequences.
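To make that concrete, here is a minimal NumPy sketch of a recurrent step applied across a sequence. The weight names, sizes, and dummy data are illustrative assumptions rather than anything taken from the tutorial: the hidden nodes combine the current input with the previous hidden state, and the output nodes read from the result.

```python
import numpy as np

# Minimal sketch of a vanilla RNN step (illustrative sizes and random weights).
rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 8, 16, 2

W_xh = rng.standard_normal((hidden_size, input_size)) * 0.1   # input -> hidden
W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.1  # hidden -> hidden (the recurrent loop)
W_hy = rng.standard_normal((output_size, hidden_size)) * 0.1  # hidden -> output
b_h, b_y = np.zeros(hidden_size), np.zeros(output_size)

def rnn_step(x_t, h_prev):
    """Update the hidden state from one input element, then read an output."""
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)
    y_t = W_hy @ h_t + b_y
    return h_t, y_t

# The same cell is applied at every position, carrying the hidden state forward.
h = np.zeros(hidden_size)
for x_t in rng.standard_normal((5, input_size)):   # five dummy time steps
    h, y = rnn_step(x_t, h)
```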

In a basic sentiment analysis example, you might just have a binary output, such as classifying movie reviews as positive or negative. RNNs can do more than this: they can also produce a sequential output, such as taking an input sentence in English and generating its Spanish translation. This ability to consume and produce sequences is what makes recurrent neural networks so well suited to NLP tasks.
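As an illustration of the binary case, here is a hedged sketch of a sentiment classifier, assuming PyTorch and made-up names and sizes (the tutorial itself may use different tools): an embedding layer turns token ids into vectors, an RNN reads the review one token at a time, and the final hidden state is squashed into a single "positive review" probability.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: vocabulary size, dimensions, and names are assumptions.
class SentimentRNN(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.classify = nn.Linear(hidden_dim, 1)      # one score: positive vs. negative

    def forward(self, token_ids):                     # token_ids: (batch, seq_len)
        embedded = self.embed(token_ids)              # (batch, seq_len, embed_dim)
        _, h_n = self.rnn(embedded)                   # final hidden state: (1, batch, hidden_dim)
        return torch.sigmoid(self.classify(h_n[-1]))  # (batch, 1) probability of "positive"

model = SentimentRNN()
dummy_reviews = torch.randint(0, 10_000, (2, 20))     # two dummy 20-token reviews
print(model(dummy_reviews))                           # untrained, so outputs are arbitrary
```

For a sequence-to-sequence task such as translation, the model would instead emit an output at every step, or hand its encoded state to a decoder RNN, rather than producing a single label.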

RNNs and long short-term memory

Recurrent neural networks can become unstable during training: as errors are propagated back through many time steps, the gradients flowing along the recurrent connections tend to vanish or explode. That’s where the LSTM architecture helps. LSTM introduces something called a memory cell. The memory cell keeps what could be an incredibly complex update manageable by using a series of gates to govern the way its state changes within the network (a short code sketch of these gates follows the list below):

  • The input gate manages how much new information is written into the memory cell
  • The output gate manages how much of the memory cell is exposed to the rest of the network
  • A self-recurrent connection keeps the memory cell in a consistent state between steps
  • The forget gate allows the memory cell to ‘forget’ its previous state when it is no longer useful
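Taken together, these gates update the cell at every time step. Here is a minimal NumPy sketch of a single LSTM step; the weight shapes, initialisation, and variable names are illustrative assumptions rather than the exact formulation used in the video tutorial.

```python
import numpy as np

# Illustrative sketch of one LSTM step (assumed sizes and random weights).
rng = np.random.default_rng(0)
input_size, hidden_size = 8, 16

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per gate, each acting on [previous hidden state, current input].
W_i, W_f, W_o, W_c = (rng.standard_normal((hidden_size, hidden_size + input_size)) * 0.1
                      for _ in range(4))
b = np.zeros(hidden_size)   # zero biases, shared here purely to keep the sketch short

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([h_prev, x_t])
    i = sigmoid(W_i @ z + b)         # input gate: how much new information is written
    f = sigmoid(W_f @ z + b)         # forget gate: how much of the old cell state survives
    o = sigmoid(W_o @ z + b)         # output gate: how much of the cell is exposed
    c_tilde = np.tanh(W_c @ z + b)   # candidate values to write into the cell
    c_t = f * c_prev + i * c_tilde   # self-recurrent memory cell update
    h_t = o * np.tanh(c_t)           # new hidden state read out through the output gate
    return h_t, c_t

# Carry both the hidden state and the cell state across time steps.
h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x_t in rng.standard_normal((5, input_size)):   # five dummy time steps
    h, c = lstm_step(x_t, h, c)
```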


Richard Gall

Co-editor of the Packt Hub. Interested in politics, tech culture, and how software and business are changing each other.
