PDE-based Algorithms for Convolution Neural Networks
MS36

This talk presents a new framework for image classification that exploits the relationship between training deep convolutional neural networks (CNNs) and optimally controlling a system of nonlinear partial differential equations (PDEs). This interpretation leads to a variational model for CNNs, which provides new theoretical insight into CNNs and new approaches for designing learning algorithms. We illustrate the benefits of the continuous network in three ways. First, we show how to scale deep CNNs across image resolutions using multigrid methods. Second, we show how to scale the depth of deep CNNs in a shallow-to-deep manner to gradually increase the flexibility of the classifier. Third, we analyze the stability of CNNs and present stable variants that are also reversible (i.e., information can be propagated from the input to the output layer and vice versa); combined, stability and reversibility allow training arbitrarily deep networks with limited computational resources.
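
To make the reversibility property concrete, the following is a minimal sketch of one reversible, Hamiltonian-inspired layer update. It is an illustration under stated assumptions, not necessarily the architecture presented in the talk: the state is split into two parts `y` and `z`, small dense matrices stand in for convolution operators, `tanh` is the activation, and the names `forward`, `backward`, `weights`, and the step size `h` are hypothetical. Each layer is a pair of explicit updates that can be undone algebraically, so the input can be recovered from the output without storing intermediate states.

```python
import numpy as np

def forward(y, z, weights, h):
    """Propagate (y, z) from the input layer to the output layer."""
    for k1, k2 in weights:
        y = y + h * k1.T @ np.tanh(k1 @ z)   # update y using the current z
        z = z - h * k2.T @ np.tanh(k2 @ y)   # update z using the new y
    return y, z

def backward(y, z, weights, h):
    """Undo forward(): recover the input state from the output state."""
    for k1, k2 in reversed(weights):
        z = z + h * k2.T @ np.tanh(k2 @ y)   # invert the z update first
        y = y - h * k1.T @ np.tanh(k1 @ z)   # then invert the y update
    return y, z

# Hypothetical sizes and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
n, depth, h = 4, 20, 0.1
weights = [
    (rng.standard_normal((n, n)) / np.sqrt(n),
     rng.standard_normal((n, n)) / np.sqrt(n))
    for _ in range(depth)
]

y0 = rng.standard_normal(n)
z0 = rng.standard_normal(n)

yN, zN = forward(y0, z0, weights, h)
y_rec, z_rec = backward(yN, zN, weights, h)

# Largest deviation between the recovered and the original input state.
max_err = max(np.abs(y_rec - y0).max(), np.abs(z_rec - z0).max())
print("reconstruction error:", max_err)
```

In this sketch the backward pass recomputes earlier states from later ones, which is the mechanism that lets memory use stay independent of depth. The reconstruction is exact in exact arithmetic; in floating point it is accurate only up to accumulated round-off, which is one reason stability matters alongside reversibility.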

This presentation is part of Minisymposium “MS36 - Computational Methods for Large-Scale Machine Learning in Imaging (2 parts)”
organized by: Matthias Chung (Virginia Tech), Lars Ruthotto (Department of Mathematics and Computer Science, Emory University).

Authors:
Eldad Haber (University of British Columbia)
Lars Ruthotto (Department of Mathematics and Computer Science, Emory University)
Keywords:
deep learning, machine learning, partial differential equation models