Internal covariate shift refers to the change in the distribution of layer inputs as the parameters of previous layers change during training. As a neural network learns, the weights in earlier layers are updated, which causes the activations fed into subsequent layers to shift in range and distribution. This forces later layers to constantly adapt to new input statistics...
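
To make the effect concrete, here is a minimal NumPy sketch, not taken from the original answer. It assumes a toy network whose first layer is a ReLU layer, and it uses a simple rescaling of the weights as a stand-in for a real gradient update. The names `hidden_stats`, `W1`, and `W1_updated` are hypothetical and chosen only for illustration.

```python
import numpy as np


def hidden_stats(W1, X):
    """Return the mean and std of the activations that the next layer receives."""
    H = np.maximum(0.0, X @ W1)  # ReLU activations of the first layer
    return H.mean(), H.std()


rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))              # fixed input batch, never changes
W1 = rng.normal(scale=0.5, size=(8, 16))   # first-layer weights before the update

mean_before, std_before = hidden_stats(W1, X)

# Assumption: rescaling stands in for a training step that changes W1.
W1_updated = 2.0 * W1

mean_after, std_after = hidden_stats(W1_updated, X)

print(f"before update: mean={mean_before:.3f}, std={std_before:.3f}")
print(f"after update:  mean={mean_after:.3f}, std={std_after:.3f}")
```

The input batch `X` is identical in both calls, yet the statistics seen by the next layer differ because only the earlier layer's weights changed. That is the shift the later layer has to keep adapting to.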