Date: 31/01/2022
14:00
Zoom videoconference – link TBP
We investigate a general class of stochastic gradient descent (SGD) algorithms, called conditioned SGD, based on a preconditioning of the gradient direction. Under mild assumptions, we establish almost sure convergence and asymptotic normality for a broad class of conditioning matrices. In particular, when the conditioning matrix is an estimate of the inverse Hessian at the optimal point, the algorithm is proved to be asymptotically optimal. The benefits of this approach are validated on simulated and real datasets. As an extension of the conditioned SGD framework, we conclude by presenting a new class of methods in which, at each iteration, a single descent direction is selected at random.
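For readers who want a concrete picture of the preconditioned update before the talk, here is a minimal sketch of one possible conditioned SGD iteration on a toy quadratic objective. It is an illustration only, not the authors' implementation. The function names (`matvec`, `conditioned_sgd`, `noisy_grad`), the step size 1/k, the Gaussian gradient noise, and the use of the exact inverse Hessian in place of an estimate are all assumptions made for the example.

```python
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def conditioned_sgd(noisy_grad, conditioner, x0, n_iter, rng):
    """Iterate x_{k+1} = x_k - gamma_k * C_k * g_k.

    noisy_grad(x, rng) returns a stochastic gradient g_k at x.
    conditioner(k, x) returns the conditioning matrix C_k.
    The step size gamma_k = 1/k is an assumption of this sketch.
    """
    x = list(x0)
    for k in range(1, n_iter + 1):
        g = noisy_grad(x, rng)
        direction = matvec(conditioner(k, x), g)
        step = 1.0 / k
        x = [x_i - step * d_i for x_i, d_i in zip(x, direction)]
    return x


# Toy problem (hypothetical): f(x) = 0.5 * (x - x*)^T H (x - x*).
X_STAR = [1.0, -2.0]
H = [[10.0, 0.0], [0.0, 1.0]]
H_INV = [[0.1, 0.0], [0.0, 1.0]]  # exact inverse Hessian, assumed known here


def noisy_grad(x, rng):
    """Exact gradient H (x - x*) plus standard Gaussian noise."""
    diff = [x_i - s_i for x_i, s_i in zip(x, X_STAR)]
    return [g_i + rng.gauss(0.0, 1.0) for g_i in matvec(H, diff)]


rng = random.Random(0)
x_final = conditioned_sgd(noisy_grad, lambda k, x: H_INV, [0.0, 0.0], 5000, rng)
error = max(abs(a - b) for a, b in zip(x_final, X_STAR))
print("final iterate:", x_final)
print("max coordinate error:", error)
```

In this toy setting the conditioning matrix is fixed. In the setting described in the abstract it would instead be an estimate of the inverse Hessian at the optimum, which `conditioner(k, x)` could return as a matrix that changes with the iteration.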
This is joint work with Rémi Leluc (https://remileluc.github.io/)
Link to the paper: https://arxiv.org/abs/2006.02745