An Optimal Control Framework for Efficient Training of Deep Neural Networks

We present a new mathematical framework that simplifies the design, training, and analysis of deep neural networks. The framework interprets deep learning as a dynamic optimal control problem. The learning problem can thus be cast as a continuous problem and viewed either as a dynamic inverse problem (comparable to, e.g., electromagnetic imaging or optical flow) or as an optimal control problem (similar to, e.g., optimal mass transport or path planning). We show by example how understanding the underlying dynamical systems helps to design, analyze, and train deep neural networks. The talk focuses on ways to ensure the stability of the dynamics in both the continuous and the discrete setting, which yields a well-posed learning problem that can be solved effectively by iterative methods. Throughout the talk, we illustrate the impact of stability and discretization on the performance of both stochastic and deterministic iterative optimization algorithms.
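One common way to make the continuous interpretation concrete is to read a residual network as a forward Euler discretization of an ordinary differential equation. The sketch below assumes that reading; the abstract itself does not specify the architecture. It measures how a small perturbation of the input features grows through the layers. The tanh activation, the step size `h`, the antisymmetric weight constraint, and the names `forward_propagate` and `antisymmetric` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def antisymmetric(A):
    """Return the antisymmetric part of A, i.e. 0.5 * (A - A^T)."""
    return 0.5 * (A - A.T)

def forward_propagate(y0, weights, biases, h):
    """Forward Euler steps y_{j+1} = y_j + h * tanh(K_j y_j + b_j).

    Assumed residual-network form; each (K_j, b_j) pair is one layer.
    """
    y = np.asarray(y0, dtype=float)
    for K, b in zip(weights, biases):
        y = y + h * np.tanh(K @ y + b)
    return y

rng = np.random.default_rng(0)
n_features, n_layers, h = 4, 50, 0.1  # hypothetical sizes and step size

# Random layer weights (not trained) and zero biases, for illustration only.
raw = [rng.standard_normal((n_features, n_features)) for _ in range(n_layers)]
biases = [np.zeros(n_features) for _ in range(n_layers)]

y0 = rng.standard_normal(n_features)
dy = 1e-6 * rng.standard_normal(n_features)  # small input perturbation

candidates = [
    ("unconstrained", raw),
    ("antisymmetric", [antisymmetric(K) for K in raw]),
]
for name, weights in candidates:
    out = forward_propagate(y0, weights, biases, h)
    out_pert = forward_propagate(y0 + dy, weights, biases, h)
    # Ratio of output perturbation to input perturbation.
    growth = np.linalg.norm(out_pert - out) / np.linalg.norm(dy)
    print(f"{name}: perturbation growth factor = {growth:.3f}")
```

The printed growth factor is one rough proxy for the stability of the forward dynamics: values far above one indicate that small input changes are amplified through the layers, and values near zero indicate that differences between inputs are lost. Varying `h` and `n_layers` in this sketch shows how the discretization affects that behavior.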

This presentation is part of Minisymposium “MS50 - Analysis, Optimization, and Applications of Machine Learning in Imaging (3 parts)”
organized by: Michael Moeller (University of Siegen), Gitta Kutyniok (Technische Universität Berlin).

Authors:
Lars Ruthotto (Department of Mathematics and Computer Science, Emory University)
Eldad Haber (University of British Columbia)
Keywords:
deep learning, machine learning, nonlinear optimization, partial differential equation models