Seminar

Location:
MDS 220
Speaker(s) / Presenter(s):
Dr. Qiang Ye, University of Kentucky

Title: Preconditioning for Accelerated Gradient Descent Optimization and Regularization

Abstract: Accelerated training algorithms, such as adaptive learning rates and various normalization methods, are widely used in deep learning but are not fully understood. When regularization is introduced, standard optimizers such as adaptive learning rates may not perform effectively. This raises the need for alternative regularization approaches and the question of how to properly combine regularization with preconditioning. In this talk, we present preconditioning as a unified mathematical framework for understanding various acceleration techniques and for deriving appropriate regularization schemes. We will explain how preconditioning with AdaGrad, RMSProp, and Adam accelerates training; discuss the interaction between regularization and preconditioning; and demonstrate how normalization methods accelerate training and how this perspective can lead to new preconditioned training algorithms.
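
To make the preconditioning viewpoint concrete, here is a minimal illustrative sketch (not the speaker's method): Adam's per-coordinate rescaling by the square root of a running second-moment estimate acts as a diagonal preconditioner, which helps on ill-conditioned problems where plain gradient descent must use a small step size. The quadratic objective and all parameter values below are chosen for illustration only.

```python
import numpy as np

# Toy objective f(x) = 0.5 * sum(diag * x^2), with an ill-conditioned
# diagonal Hessian. Its gradient is simply diag * x.
def grad(x, diag):
    return diag * x

def gd(x0, diag, lr=0.01, steps=200):
    """Plain gradient descent; lr is limited by the largest curvature."""
    x = x0.copy()
    for _ in range(steps):
        x -= lr * grad(x, diag)
    return x

def adam(x0, diag, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=200):
    """Adam viewed as preconditioned gradient descent: each coordinate's
    step is scaled by 1/(sqrt(v_hat) + eps), a diagonal preconditioner."""
    x = x0.copy()
    m = np.zeros_like(x)  # first-moment (momentum) estimate
    v = np.zeros_like(x)  # second-moment estimate -> diagonal preconditioner
    for t in range(1, steps + 1):
        g = grad(x, diag)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)  # preconditioned step
    return x

diag = np.array([1.0, 100.0])  # condition number 100
x0 = np.array([1.0, 1.0])
print("GD:  ", gd(x0, diag))
print("Adam:", adam(x0, diag))
```

Because Adam equalizes the effective step size across coordinates, it makes progress in the flat direction without the step-size restriction that the steep direction imposes on plain gradient descent.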