After processing text with an LSTM, the common layer used to squish all the information from the sequence into a single, fixed-size set of numbers before the final output layer is a global pooling layer, most typically Global Average Pooling or Global Max Pooling. An LSTM, which stands for Long Short-Term Memory, is a type of recurrent neural network designed to process sequences of data. When an LSTM processes an input sequence, such as a sentence or a document, it produces a sequence of hidden state vectors. Each hidden state vector contains....
Log in to view the answer