End-to-end learning of CNN features in discrete optimization models for motion and stereo
MS36

For many years, discrete optimization models such as conditional random fields (CRFs) have defined the state of the art for classical correspondence problems such as motion and stereo. One of the most important ingredients in those models is the choice of the feature transform that is used to compute the similarity between image patches. For a long time, hand-crafted features such as the celebrated scale-invariant feature transform (SIFT) defined the state of the art. Given the recent success of convolutional neural networks (CNNs), it is quite natural to learn such a feature transform from data. In this talk, I will show how to efficiently learn such CNN features from data using an end-to-end learning approach. It turns out that our learned models yield state-of-the-art results on a number of established benchmark databases.
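
To make the role of the feature transform concrete, the following is a minimal sketch of how per-pixel features can be turned into stereo matching costs along one scanline. It is an illustration under stated assumptions, not the method presented in the talk: the function names (`normalize`, `matching_costs`, `winner_take_all`), the cosine-based cost, and the toy feature vectors are all hypothetical. In the setting of the abstract, the features would come from a learned CNN and the costs would feed a CRF with pairwise terms, which this sketch omits.

```python
import math

def normalize(v):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def matching_costs(left_row, right_row, max_disp):
    """Cost of matching left pixel x to right pixel x - d, for d = 0..max_disp.

    Each row is a list of per-pixel feature vectors. The cost is
    1 - cosine similarity; candidates outside the image get infinite cost.
    """
    left = [normalize(f) for f in left_row]
    right = [normalize(f) for f in right_row]
    costs = []
    for x, f in enumerate(left):
        row = []
        for d in range(max_disp + 1):
            if x - d < 0:
                row.append(math.inf)
            else:
                row.append(1.0 - sum(a * b for a, b in zip(f, right[x - d])))
        costs.append(row)
    return costs

def winner_take_all(costs):
    """Pick the lowest-cost disparity per pixel (no smoothness term, unlike a CRF)."""
    return [min(range(len(row)), key=row.__getitem__) for row in costs]

# Toy scanline (assumed data): the left row is the right row shifted by one pixel,
# padded on the left with a zero feature.
right_row = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
left_row = [[0.0, 0.0]] + right_row[:3]

disparities = winner_take_all(matching_costs(left_row, right_row, max_disp=2))
print(disparities)
```

On this toy input, every pixel except the padded border pixel is assigned a disparity of 1, which is the shift used to build the left row. A discrete optimization model would replace the per-pixel minimum with joint inference over all pixels.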

This presentation is part of Minisymposium “MS36 - Computational Methods for Large-Scale Machine Learning in Imaging (2 parts)”
organized by: Matthias Chung (Virginia Tech), Lars Ruthotto (Department of Mathematics and Computer Science, Emory University).

Authors:
Thomas Pock (Graz University of Technology)
Patrick Knöbelreiter (Graz University of Technology)
Gottfried Munda (Graz University of Technology)
Christian Reinbacher (Amazon)
Alexander Shekhovtsov (TU Prague)
Keywords:
computer vision, deep learning, machine learning, nonlinear optimization