What is the difference between gradient descent and stochastic gradient descent?
A:
In (batch) gradient descent you compute the exact gradient of the cost function, which requires a sum over all training points, and take one update step per full pass. In stochastic gradient descent you instead *estimate* the gradient from a randomly chosen subset of the training points (a single example in the strictest form, or a small mini-batch), so each update is much cheaper to compute but noisier.
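A minimal sketch of the difference, using a hypothetical least-squares problem (all names and parameter values here are illustrative assumptions, not from the original answer). The full-batch gradient averages over every training point; the stochastic version averages over a random mini-batch only:

```python
import numpy as np

# Hypothetical setup: least-squares cost J(w) = (1/2n) * ||Xw - y||^2
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=200)

def full_gradient(w, X, y):
    # Exact gradient: sums (averages) over all n training points.
    return X.T @ (X @ w - y) / len(y)

def stochastic_gradient(w, X, y, batch_size=8):
    # Gradient *estimate*: averages over a random mini-batch only.
    idx = rng.choice(len(y), size=batch_size, replace=False)
    return X[idx].T @ (X[idx] @ w - y[idx]) / batch_size

# Gradient descent: every step uses the exact gradient.
w_gd = np.zeros(3)
for _ in range(500):
    w_gd -= 0.1 * full_gradient(w_gd, X, y)

# Stochastic gradient descent: every step uses the cheap estimate.
w_sgd = np.zeros(3)
for _ in range(500):
    w_sgd -= 0.1 * stochastic_gradient(w_sgd, X, y)

print("GD error: ", np.linalg.norm(w_gd - true_w))
print("SGD error:", np.linalg.norm(w_sgd - true_w))
```

Both runs approach the same minimizer; the SGD path is noisier per step, but each step touches only 8 points instead of all 200, which is why SGD scales to large datasets.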