
Describe the fundamental difference between stochastic gradient descent (SGD) and the Adam optimizer.



The fundamental difference between Stochastic Gradient Descent (SGD) and the Adam optimizer lies in how they update a model's parameters during training. SGD applies a single, global learning rate to every parameter. At each step it estimates the gradient of the loss function with respect to the parameters from a single randomly selected data point (or a small batch of data points), then moves the parameters in the direction opposite to the gradient, scaled by the learning rate. The Adam optimizer,....
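To make the SGD update concrete, here is a minimal sketch in plain Python. The function names `sgd_step` and `adam_step`, the list-based parameter layout, and the default hyperparameters are illustrative assumptions, not part of the original answer. Because the answer above is cut off before it describes Adam, `adam_step` follows the commonly published formulation of Adam (per-parameter running averages of the gradient and its square, with bias correction) and is included only for contrast.

```python
import math


def sgd_step(params, grads, lr=0.01):
    """One SGD update: every parameter shares the same learning rate."""
    return [p - lr * g for p, g in zip(params, grads)]


def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (standard formulation, assumed here).

    m, v hold per-parameter running averages of the gradient and the
    squared gradient; t is the 1-based step count used for bias correction.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m_i / (1 - beta1 ** t)             # bias-corrected moments
        v_hat = v_i / (1 - beta2 ** t)
        # The effective step size differs per parameter via sqrt(v_hat).
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


if __name__ == "__main__":
    params = [1.0, 2.0]
    grads = [0.5, -1.0]  # hypothetical mini-batch gradient
    print(sgd_step(params, grads, lr=0.1))
    print(adam_step(params, grads, m=[0.0, 0.0], v=[0.0, 0.0], t=1)[0])
```

In this sketch the SGD step size is proportional to each gradient component, while the first Adam step moves every parameter by roughly `lr` regardless of gradient magnitude, which illustrates the per-parameter scaling.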



