What is a recurrent neural network (RNN)?

Category: AI & Machine Learning

Short answer

A recurrent neural network (RNN) processes sequential data by maintaining a hidden state that carries information forward from previous time steps, which makes it suitable for language, audio, and time-series tasks.

How it works

At each time step, an RNN cell receives the current input and the previous hidden state, and produces a new hidden state and, optionally, an output. The same weights are reused at every step, and this recurrent connection is what allows the network to model temporal dependencies. Standard RNNs struggle with long-term dependencies because gradients tend to vanish as they are propagated back through many time steps. LSTM and GRU variants address this with gating mechanisms that control information flow, letting the network remember or forget information over many steps.
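The recurrence described above can be sketched by hand in PyTorch. This is a minimal illustration of a plain (non-gated) RNN step, not a reference implementation: the names W_xh, W_hh, b_h, and rnn_step, the tanh activation, and all sizes are assumptions chosen for the example, and the weights are random rather than trained.

```python
import torch

torch.manual_seed(0)

# Illustrative sizes (assumptions, not taken from the text).
input_size, hidden_size, seq_len = 4, 3, 5

# Hypothetical parameters, randomly initialised for illustration only.
W_xh = torch.randn(hidden_size, input_size) * 0.1   # input -> hidden
W_hh = torch.randn(hidden_size, hidden_size) * 0.1  # hidden -> hidden (the recurrent connection)
b_h = torch.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    """One recurrence step: h_t = tanh(W_xh x_t + W_hh h_prev + b_h)."""
    return torch.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

xs = torch.randn(seq_len, input_size)  # a toy input sequence
h = torch.zeros(hidden_size)           # initial hidden state
states = []
for x_t in xs:
    # The same weights are reused at every step; only h changes.
    h = rnn_step(x_t, h)
    states.append(h)

states = torch.stack(states)  # shape: (seq_len, hidden_size)
```

Each row of states is the hidden state after one more input has been read, so the last row summarizes the whole sequence. PyTorch's built-in nn.RNN, nn.LSTM, and nn.GRU layers wrap this kind of loop, with the latter two adding the gates mentioned above.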

Example

An RNN trained for sentiment analysis processes a sentence word by word. The hidden state carries context forward from earlier words, so by the time the network reads "good" it still holds the negation introduced by "not." This lets it classify "not good" as negative despite the positive word "good."
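A word-by-word sentiment classifier of this kind might be wired up as follows. This is a sketch under stated assumptions: the SentimentRNN class, the three-word vocabulary, and the layer sizes are hypothetical, and the model is untrained, so its output is arbitrary until it has been fitted to labeled data. It uses an LSTM layer, one of the gated variants, in place of a plain RNN.

```python
import torch
import torch.nn as nn

class SentimentRNN(nn.Module):
    """Hypothetical classifier: embedding -> LSTM -> linear head."""

    def __init__(self, vocab_size, embed_dim=16, hidden_dim=32, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embed(token_ids)   # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.rnn(embedded)   # h_n: (num_layers, batch, hidden_dim)
        # Classify from the final hidden state, which has seen every word.
        return self.head(h_n[-1])          # (batch, num_classes)

# Toy vocabulary, assumed for illustration only.
vocab = {"<pad>": 0, "not": 1, "good": 2}

model = SentimentRNN(vocab_size=len(vocab))
tokens = torch.tensor([[vocab["not"], vocab["good"]]])  # batch of one sentence
logits = model(tokens)  # untrained, so the scores carry no meaning yet
```

Because the classifier reads only the final hidden state, word order matters: "not good" and "good not" pass through the LSTM in different orders and generally yield different states.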

Why it matters

Before transformers, RNNs were the dominant architecture for NLP and signal processing. They remain relevant for low-latency applications and resource-constrained environments: an RNN processes one step at a time with a fixed-size state, whereas full attention over a long sequence can be too expensive. Understanding RNNs also provides intuition for how sequence models capture temporal structure.

Code example

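import torch
import torch.nn as nn

# Feed-forward classifier: 784 input features -> 256 hidden units -> 10 classes.
# There is no recurrent layer here, so no state is carried between inputs.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(0.2),   # randomly zero 20% of activations during training
    nn.Linear(256, 10) # raw class scores (logits)
)
# CrossEntropyLoss expects raw logits, so no softmax layer is needed above.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)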

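This snippet defines a simple feed-forward network with dropout for regularization, a cross-entropy loss, and the Adam optimizer in PyTorch. It is not itself recurrent: it has no hidden state carried across time steps. The same loss and optimizer setup applies unchanged when the model contains a recurrent layer such as nn.RNN, nn.LSTM, or nn.GRU.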