LSTM (Long Short-Term Memory)
LSTM is a type of RNN architecture that is widely used for sequence modeling tasks. LSTMs overcome a key limitation of plain RNNs, the vanishing gradient problem, by introducing a memory cell and three gating mechanisms. The memory cell allows the network to store and access information over long sequences. The gates control how the information in a sequence of data comes into, is stored in, and leaves the network. They are:
- forget gate
- input gate
- output gate

Applications
- NLP tasks: named entity recognition, sentiment analysis, machine translation, etc.
- Speech recognition: automatic speech recognition, speech-to-text conversion, etc.
- Time series analysis and forecasting: stock market prediction, weather forecasting, etc.

Architecture

Forget gate layer
The first step in the process is the forget gate. This gate tells the LSTM how much information to keep from the previous cell state. Its output lies between 0 and 1 and is multiplied element-wise with the previous cell state:
- output of the forget gate is 0 -> forget all previous memory
- output of the forget gate is 1 -> keep all previous memory
- output of the forget gate is 0.5 -> keep some of the previous memory
...
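The gating mechanism described above can be sketched as a single LSTM time step. This is a minimal NumPy sketch, not any particular library's implementation; the function name `lstm_step` and the layout of the stacked weight matrix `W` are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    # Squashes pre-activations into (0, 1), the range of gate outputs.
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step (sketch).

    W stacks the four gate weight matrices row-wise and maps the
    concatenated [h_prev; x_t] vector to the four gate pre-activations.
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b

    f = sigmoid(z[0:H])        # forget gate: how much of c_prev to keep (0..1)
    i = sigmoid(z[H:2*H])      # input gate: how much new content to write
    o = sigmoid(z[2*H:3*H])    # output gate: how much of the cell to expose
    g = np.tanh(z[3*H:4*H])    # candidate cell content

    c = f * c_prev + i * g     # forget gate scales the previous cell state
    h = o * np.tanh(c)         # new hidden state (the LSTM's output)
    return h, c
```

Note how `f * c_prev` implements the forget-gate behavior listed above: `f = 0` erases the previous memory, `f = 1` keeps it all, and intermediate values keep it partially.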