Hi there, I am writing a tool that involves implementing logistic regression. With batch gradient descent, convergence is guaranteed since the problem is convex. However, I find that stochastic gradient descent typically converges to seemingly random points, i.e., points not very close to the minimum found by the batch method. I have tried different ways of decreasing the learning rate and different starting weights, yet the resulting performance metrics (e.g., accuracy, precision/recall) remain comparable to the batch method. A minimal sketch of the setup follows below.
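
To make the question concrete, here is a minimal, self-contained sketch of what I mean (simulated data, batch gradient descent, and plain fixed-step SGD). The data, constants, and function names are made up for illustration, not my actual tool:

## Simulated logistic regression data
set.seed(1)
n <- 1000; p <- 3
X <- cbind(1, matrix(rnorm(n * (p - 1)), n))          # design matrix with intercept
w_true <- c(-0.5, 1, -2)
y <- rbinom(n, 1, plogis(as.numeric(X %*% w_true)))   # simulated 0/1 labels

sigmoid <- function(z) 1 / (1 + exp(-z))

## Batch gradient descent: full-data gradient at every step
batch_gd <- function(X, y, eta = 0.5, iters = 5000) {
  w <- rep(0, ncol(X))
  for (t in seq_len(iters)) {
    grad <- as.numeric(crossprod(X, sigmoid(as.numeric(X %*% w)) - y)) / nrow(X)
    w <- w - eta * grad
  }
  w
}

## Plain SGD: one observation per step, fixed learning rate
sgd <- function(X, y, eta = 0.1, epochs = 50) {
  w <- rep(0, ncol(X))
  for (e in seq_len(epochs)) {
    for (i in sample(nrow(X))) {
      grad <- (sigmoid(sum(X[i, ] * w)) - y[i]) * X[i, ]
      w <- w - eta * grad
    }
  }
  w
}

## Compare the two against glm() as a reference
cbind(batch = batch_gd(X, y),
      sgd   = sgd(X, y),
      glm   = coef(glm(y ~ X - 1, family = binomial)))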
I understand this is possible, since SGD (stochastic gradient descent) uses a noisy approximation of the full-data gradient at each step. Does it matter? I suspect it does, since otherwise the interpretation of the weights would not make much sense, even if the accuracy is comparable. If it matters, I wonder whether you have suggestions on how to make SGD converge, or at least get close, to the global optimum. Thanks!
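
For reference, this is the kind of decreasing-learning-rate schedule I have been experimenting with, reusing sigmoid(), X, and y from the sketch above. The schedule eta0 / (1 + lambda * t) and its constants are only one example, and the running average of the iterates at the end is something I have only read about as a possible remedy, not verified:

sgd_decay <- function(X, y, eta0 = 0.5, lambda = 0.01, epochs = 50) {
  w <- rep(0, ncol(X))
  w_avg <- w
  t <- 0
  for (e in seq_len(epochs)) {
    for (i in sample(nrow(X))) {
      t <- t + 1
      eta <- eta0 / (1 + lambda * t)          # decaying step size
      grad <- (sigmoid(sum(X[i, ] * w)) - y[i]) * X[i, ]
      w <- w - eta * grad
      w_avg <- w_avg + (w - w_avg) / t        # running average of iterates
    }
  }
  list(last = w, averaged = w_avg)
}

sgd_decay(X, y)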