
Which activation function is most susceptible to the vanishing gradient problem, and why?



The sigmoid activation function is most susceptible to the vanishing gradient problem. The sigmoid function outputs values between 0 and 1. During backpropagation, the chain rule multiplies the local derivatives of each layer together as the gradient is passed backward through the network. The derivative of the sigmoid function has a maximum value of 0.25, reached at an input of 0, so every sigmoid layer scales its activation factor in that product by at most a quarter. When the input to a sigmoid function is very large or v....
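The arithmetic behind the 0.25 bound can be checked with a minimal Python sketch. The function names here (`sigmoid`, `sigmoid_derivative`, `activation_gradient_bound`) are illustrative, not from the original answer, and the bound covers only the activation-derivative factors: it assumes, for simplicity, that the weights contribute a factor of magnitude 1 at every layer.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: maps any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x: float) -> float:
    """Derivative of the sigmoid, s(x) * (1 - s(x)); it peaks at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def activation_gradient_bound(num_layers: int) -> float:
    """Upper bound on the product of sigmoid derivatives across num_layers.

    Assumption (for illustration only): weight terms are ignored, i.e. each
    is treated as having magnitude 1, so only the 0.25 cap per layer remains.
    """
    return 0.25 ** num_layers


if __name__ == "__main__":
    # The derivative at 0 is the maximum; it shrinks as |x| grows.
    for x in (0.0, 2.0, 5.0, 10.0):
        print(f"x = {x:5.1f}  sigmoid'(x) = {sigmoid_derivative(x):.6f}")

    # Even in the best case, the bound shrinks geometrically with depth.
    for depth in (1, 5, 10, 20):
        print(f"layers = {depth:2d}  bound = {activation_gradient_bound(depth):.3e}")
```

Under these assumptions, the bound shrinks geometrically with depth, which is why the earliest layers of a deep sigmoid network can receive almost no gradient signal.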



