This kind of pooling is called max pooling. It helps the network do two main things. First, it performs dimensionality reduction. After a convolution layer processes input data and generates feature maps (grids highlighting detected features), max pooling reduces the spatial size of these maps by selecting only the most prominent activation from each small, non-overlapping region. For example, a 2x2 area of values in a feature map might be replaced by a single value: ....
Log in to view the answer