Description
Title: Adaptive Optimization
Abstract: Optimization is one of the core components of modern machine learning. Choosing the right optimization algorithm, and choosing the right parameters to tune that algorithm, can have a huge impact on the quality of the final model produced during training. Current practice is for humans to simply try many possible optimizers and parameter settings in order to find the empirically best one, which is slow, expensive, and tedious. In this talk, I will present new techniques that avoid this costly manual search. These algorithms do not require any tuning of parameters, and are able to automatically leverage hidden structure in the data, interpolating smoothly between worst-case performance on adversarial data and much better performance on the "easier" data that is encountered in practice. This ability is called adaptivity.
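(As a minimal, illustrative sketch of what "adaptivity" can look like in practice — a standard AdaGrad-style update, not the specific tuning-free algorithms presented in the talk — per-coordinate step sizes shrink where gradients have historically been large, so sparse or rarely-updated coordinates keep larger effective learning rates. The base rate lr and constant eps below are assumed illustrative values.)

    import numpy as np

    def adagrad_step(x, grad, accum, lr=1.0, eps=1e-8):
        # AdaGrad-style update (textbook form, for illustration only):
        # coordinates that have seen large gradients take smaller steps,
        # while rarely-updated (sparse) coordinates keep larger ones.
        accum = accum + grad ** 2                     # running sum of squared gradients
        x = x - lr * grad / (np.sqrt(accum) + eps)    # per-coordinate scaled step
        return x, accum

    # usage sketch: minimize f(x) = ||x||^2, whose gradient is 2x
    x, accum = np.array([1.0, 2.0, 3.0]), np.zeros(3)
    for _ in range(100):
        x, accum = adagrad_step(x, 2 * x, accum)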
For the case of convex losses, I will present algorithms that automatically detect and take advantage of different kinds of structure in the data, such as sparsity, smoothness, or strong convexity. In the non-convex setting, I will discuss the popular momentum heuristic employed in practical training of deep learning models. I will present two new results that, for the first time, suggest a strong theoretical advantage to employing momentum in non-convex optimization.
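(For reference, the momentum heuristic mentioned above is commonly written as the heavy-ball update sketched below; this is the standard textbook form for illustration, not the new theoretical results of the talk. The step size lr and momentum parameter beta are assumed illustrative hyperparameters — precisely the kind of knobs that tuning-free methods aim to remove.)

    def momentum_step(x, v, grad, lr=0.1, beta=0.9):
        # Heavy-ball (Polyak) momentum: v accumulates an exponentially
        # weighted average of past gradients, which can damp oscillations
        # and speed progress along consistently descending directions.
        v = beta * v - lr * grad
        x = x + v
        return x, v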
Bio: Ashok obtained a PhD in computer science from Stanford in 2018 and has been a research scientist at Google since then. His research focuses on optimization algorithms that automatically tune themselves without human intervention.