Global Optimality in Matrix, Tensor Factorization, and Deep Learning

The past few years have seen a dramatic increase in the performance of recognition systems thanks to the introduction of deep networks for representation learning. However, the mathematical reasons for this success remain elusive. A key issue is that the neural network training problem is non-convex, hence optimization algorithms may not return a global minimum. Building on ideas from convex relaxations of matrix factorizations, this work proposes a general framework that allows the analysis of a wide range of non-convex factorization problems, including matrix factorization, tensor factorization, and deep neural network training. The talk will describe sufficient conditions under which a local minimum of the non-convex optimization problem is a global minimum, and show that if the size of the factorized variables is large enough, then from any initialization it is possible to find a global minimizer using a local descent algorithm. Joint work with Ben Haeffele.
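As a rough illustration of the kind of objective this framework addresses, a minimal sketch (the notation below is assumed from the authors' related work on structured matrix factorization and is not spelled out in this abstract) is a problem of the form

\[
\min_{X^1,\dots,X^K} \; \ell\big(\Phi(X^1,\dots,X^K)\big) \;+\; \lambda\,\Theta(X^1,\dots,X^K),
\]

where $\ell$ is a convex loss, $\Phi$ maps the factors to the model output (for example, $\Phi(U,V) = UV^\top$ in matrix factorization), and $\Theta$ is a regularizer on the factors. The key structural assumption in that line of work is that $\Phi$ and $\Theta$ are built from positively homogeneous functions of matching degree; under assumptions of this type, the sufficient condition referred to above is, roughly, that a local minimum at which one column (or slice) of the factors is zero is also a global minimum.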

This presentation is part of the Minisymposium "MS50 - Analysis, Optimization, and Applications of Machine Learning in Imaging (3 parts)",
organized by Michael Moeller (University of Siegen) and Gitta Kutyniok (Technische Universität Berlin).

Authors:
Rene Vidal (Johns Hopkins University)
Benjamin Haeffele (Johns Hopkins University)
Keywords:
deep learning, machine learning, nonlinear optimization