<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>LSTM on Ujjal</title><link>https://ujjalkumarmaity.github.io/tags/lstm/</link><description>Recent content in LSTM on Ujjal</description><generator>Hugo -- 0.154.5</generator><language>en-us</language><lastBuildDate>Tue, 30 May 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://ujjalkumarmaity.github.io/tags/lstm/index.xml" rel="self" type="application/rss+xml"/><item><title>LSTM (Long Short-Term Memory)</title><link>https://ujjalkumarmaity.github.io/blogs/lstm/</link><pubDate>Tue, 30 May 2023 00:00:00 +0000</pubDate><guid>https://ujjalkumarmaity.github.io/blogs/lstm/</guid><description>&lt;h2 id="lstm-long-short-term-memory"&gt;LSTM (Long Short-Term Memory)&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;LSTM is a type of RNN architecture that is widely used for sequence modeling tasks.&lt;/li&gt;
&lt;li&gt;LSTM overcomes a key RNN limitation (the vanishing gradient problem) by introducing a memory cell and three gating mechanisms.&lt;/li&gt;
&lt;li&gt;The memory cell allows an LSTM to store and access information over long sequences.&lt;/li&gt;
&lt;li&gt;LSTMs use a series of gates that control how information enters the network, is stored in it, and leaves it. They are:
&lt;ul&gt;
&lt;li&gt;forget gate&lt;/li&gt;
&lt;li&gt;input gate&lt;/li&gt;
&lt;li&gt;output gate&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
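The gating scheme above can be sketched as a single LSTM time step. This is a minimal NumPy illustration, not a production implementation; the weight names (W, b keyed by gate) and sizes are assumptions for the sketch, not from the post.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.
    x: input vector; h_prev, c_prev: previous hidden and cell state.
    W, b: dicts of weight matrices/biases over [h_prev, x], one per gate."""
    z = np.concatenate([h_prev, x])
    f = sigmoid(W["f"] @ z + b["f"])        # forget gate: how much of c_prev to keep
    i = sigmoid(W["i"] @ z + b["i"])        # input gate: how much new info to write
    c_tilde = np.tanh(W["c"] @ z + b["c"])  # candidate cell contents
    c = f * c_prev + i * c_tilde            # updated memory cell
    o = sigmoid(W["o"] @ z + b["o"])        # output gate: how much of the cell to expose
    h = o * np.tanh(c)                      # new hidden state
    return h, c

# Tiny usage example with random weights (hidden size 4, input size 3).
rng = np.random.default_rng(0)
nh, nx = 4, 3
W = {k: rng.standard_normal((nh, nh + nx)) * 0.1 for k in "fico"}
b = {k: np.zeros(nh) for k in "fico"}
h, c = lstm_step(rng.standard_normal(nx), np.zeros(nh), np.zeros(nh), W, c_prev := np.zeros(nh)) if False else lstm_step(rng.standard_normal(nx), np.zeros(nh), np.zeros(nh), W, b)
print(h.shape, c.shape)  # (4,) (4,)
```

Because the forget and input gates are sigmoids, each element of the cell state is scaled by a value in (0, 1), which is exactly the "keep some / keep all / forget all" behaviour described below.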
&lt;h3 id="application"&gt;Application&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;NLP tasks&lt;/strong&gt; - named entity recognition, sentiment analysis, machine translation etc.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Speech Recognition&lt;/strong&gt; - automatic speech recognition, speech-to-text conversion etc.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Time Series Analysis and Forecasting&lt;/strong&gt; - stock market prediction, weather forecasting etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="architecture"&gt;Architecture&lt;/h3&gt;
&lt;h4 id="forget-gate-layer"&gt;Forget gate layer&lt;/h4&gt;
&lt;p&gt;First step in the process is Forgot gate. This gate telling the LSTM how much information keep from previous state. Output of this gate is between 0 and 1. Output of this forgot gate multiply with previous LSTM output.
&lt;!-- raw HTML omitted --&gt;&lt;!-- raw HTML omitted --&gt;
output of forgot gate is 0 implies &lt;code&gt;-&amp;gt;&lt;/code&gt; Forget all previous memory&lt;!-- raw HTML omitted --&gt;
output of forgot gate is 1 implies &lt;code&gt;-&amp;gt;&lt;/code&gt; Keep all previous memory&lt;!-- raw HTML omitted --&gt;
output of forgot gate is 0.5 implies &lt;code&gt;-&amp;gt;&lt;/code&gt; Keep some of previous memory&lt;!-- raw HTML omitted --&gt;&lt;/p&gt;</description></item></channel></rss>