Project Description

Research in artificial intelligence and machine learning aims to develop computational systems that match or exceed human performance in highly complex cognitive tasks. In recent years, we have witnessed a number of impressive breakthroughs in these fields. In 2016, for the first time ever, a world-class human Go player was defeated by a computer. Advances in speech and object recognition have far exceeded previous expectations and are greatly contributing to the applicability and robustness of new technologies such as voice control and autonomous driving. Most of these breakthroughs are due to a particular approach to designing and training computational systems known as deep learning. The basic idea of deep learning is to adapt strongly simplified models of layered network structures in the human nervous system such that they succeed at specific tasks such as digit recognition. While the success of deep neural networks is evident, the underlying mechanisms that are responsible for their groundbreaking performance remain largely mysterious. This has sparked a renewed interest in the rigorous mathematical analysis of deep neural network architectures with the goal of uncovering these mechanisms. In the project "Depth and Discriminability in Deep Learning Architectures", we will perform a fundamental mathematical analysis of a highly important property of deep learning architectures, namely their discriminatory behavior. Imagine a neural network that is supposed to correctly recognize handwritten letters. In order to succeed at this task, it has to be capable of distinguishing the letter "u" from the letter "v" even though they look very similar in many handwriting styles. Our goal is to better understand how depth, that is, the number of layers, and other characteristics of a neural network influence its ability to distinguish between elements from different classes that, at first sight, are very similar.
We hope that our work will not only contribute to a better understanding of why deep learning works so well but also provide guidelines on how to optimally design deep learning architectures to succeed at a given task.