An LSTM network has special 'gates' that help it remember or forget information over long text sequences. Which gate decides what new information gets saved into the memory cell?
The gate that decides what new information gets saved into the memory cell is the input gate. The memory cell, also known as the cell state, serves as the central memory unit of an LSTM network, capable of carrying information across long sequences. The input gate determines which new information, derived from the current input and the previous hidden state, should be added to this cell state. It does so through two components working together.

First, a sigmoid layer processes the current input and previous hidden state, producing values between 0 and 1. These values act as a filter over the candidate update, where a value near 1 means "let this through" and a value near 0 means "block this".

Second, a tanh layer processes the same input and previous hidden state to produce a vector of candidate values, each ranging from -1 to 1, representing the new information that could be written to memory.

To decide what actually gets saved, the sigmoid filter is multiplied element-wise with the candidate values from the tanh layer. This filtered update is then added to the memory cell, so only the selected, relevant new information modifies the network's long-term memory.
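The two-step computation described above can be sketched in NumPy. This is a minimal illustration of just the input-gate path with randomly initialized weights; the forget and output gates are omitted, and the names (`W_i`, `W_c`, etc.) are illustrative, not from any particular library.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
hidden_size, input_size = 4, 3

# Hypothetical weights for illustration; a real LSTM learns these.
W_i = rng.normal(size=(hidden_size, hidden_size + input_size))  # input-gate weights
b_i = np.zeros(hidden_size)
W_c = rng.normal(size=(hidden_size, hidden_size + input_size))  # candidate weights
b_c = np.zeros(hidden_size)

h_prev = np.zeros(hidden_size)      # previous hidden state
c_prev = np.zeros(hidden_size)      # previous cell state
x_t = rng.normal(size=input_size)   # current input

z = np.concatenate([h_prev, x_t])   # combine previous hidden state and input

i_t = sigmoid(W_i @ z + b_i)        # sigmoid filter: values in (0, 1)
c_tilde = np.tanh(W_c @ z + b_c)    # candidate values: in (-1, 1)

# Element-wise product selects which candidates update the cell state
# (a full LSTM would first scale c_prev by the forget gate).
c_t = c_prev + i_t * c_tilde
```

Note how the element-wise product `i_t * c_tilde` implements the filtering: wherever the sigmoid output is near zero, the corresponding candidate value is suppressed before it ever reaches the cell state.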