<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>LSTM on Ujjal</title><link>https://ujjalkumarmaity.github.io/tags/lstm/</link><description>Recent content in LSTM on Ujjal</description><generator>Hugo -- 0.154.5</generator><language>en-us</language><lastBuildDate>Tue, 30 May 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://ujjalkumarmaity.github.io/tags/lstm/index.xml" rel="self" type="application/rss+xml"/><item><title>LSTM (Long Short-Term Memory)</title><link>https://ujjalkumarmaity.github.io/blogs/lstm/</link><pubDate>Tue, 30 May 2023 00:00:00 +0000</pubDate><guid>https://ujjalkumarmaity.github.io/blogs/lstm/</guid><description>&lt;h2 id="lstm-long-short-term-memory"&gt;LSTM (Long Short-Term Memory)&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;LSTM is a type of RNN architecture that is widely used for sequence modeling tasks.&lt;/li&gt;
&lt;li&gt;LSTM overcomes a key RNN limitation (the vanishing gradient problem) by introducing a memory cell and three gating mechanisms.&lt;/li&gt;
&lt;li&gt;The memory cell allows an LSTM to store and access information over long sequences.&lt;/li&gt;
&lt;li&gt;LSTMs use a series of gates that control how information enters the network, is stored in it, and leaves it. They are:
&lt;ul&gt;
&lt;li&gt;forget gate&lt;/li&gt;
&lt;li&gt;input gate&lt;/li&gt;
&lt;li&gt;output gate&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
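The gating scheme above can be sketched as a single LSTM time step. This is a minimal NumPy illustration, not a production implementation; the weight names (W, b keyed by gate) and sizes are assumptions for the sketch, not from the post.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.
    x: input vector; h_prev, c_prev: previous hidden and cell state.
    W, b: dicts of weight matrices/biases over [h_prev, x], one per gate."""
    z = np.concatenate([h_prev, x])
    f = sigmoid(W["f"] @ z + b["f"])        # forget gate: how much of c_prev to keep
    i = sigmoid(W["i"] @ z + b["i"])        # input gate: how much new info to write
    c_tilde = np.tanh(W["c"] @ z + b["c"])  # candidate cell contents
    c = f * c_prev + i * c_tilde            # updated memory cell
    o = sigmoid(W["o"] @ z + b["o"])        # output gate: how much of the cell to expose
    h = o * np.tanh(c)                      # new hidden state
    return h, c

# Tiny usage example with random weights (hidden size 4, input size 3).
rng = np.random.default_rng(0)
nh, nx = 4, 3
W = {k: rng.standard_normal((nh, nh + nx)) * 0.1 for k in "fico"}
b = {k: np.zeros(nh) for k in "fico"}
h, c = lstm_step(rng.standard_normal(nx), np.zeros(nh), np.zeros(nh), W, c_prev := np.zeros(nh)) if False else lstm_step(rng.standard_normal(nx), np.zeros(nh), np.zeros(nh), W, b)
print(h.shape, c.shape)  # (4,) (4,)
```

Because the forget and input gates are sigmoids, each element of the cell state is scaled by a value in (0, 1), which is exactly the "keep some / keep all / forget all" behaviour described below.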
&lt;h3 id="application"&gt;Application&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;NLP tasks&lt;/strong&gt; - named entity recognition, sentiment analysis, machine translation etc.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Speech Recognition&lt;/strong&gt; - automatic speech recognition, speech-to-text conversion etc.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Time Series Analysis and Forecasting&lt;/strong&gt; - stock market prediction, weather forecasting etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="architecture"&gt;Architecture&lt;/h3&gt;
&lt;h4 id="forget-gate-layer"&gt;Forget gate layer&lt;/h4&gt;
&lt;p&gt;First step in the process is Forgot gate. This gate telling the LSTM how much information keep from previous state. Output of this gate is between 0 and 1. Output of this forgot gate multiply with previous LSTM output.
&lt;!-- raw HTML omitted --&gt;&lt;!-- raw HTML omitted --&gt;
output of forgot gate is 0 implies &lt;code&gt;-&amp;gt;&lt;/code&gt; Forget all previous memory&lt;!-- raw HTML omitted --&gt;
output of forgot gate is 1 implies &lt;code&gt;-&amp;gt;&lt;/code&gt; Keep all previous memory&lt;!-- raw HTML omitted --&gt;
output of forgot gate is 0.5 implies &lt;code&gt;-&amp;gt;&lt;/code&gt; Keep some of previous memory&lt;!-- raw HTML omitted --&gt;&lt;/p&gt;</description></item></channel></rss>