Michael Schelling, a former member of the Visual Computing research group, has defended his dissertation, titled "Viewpoint Guided Learning for 3D Scene Understanding and Representations". The reviewers were Prof. Dr. Timo Ropinski (Media Informatics, Ulm University) and Prof. Dr. Andreas Kolb (University of Siegen). The elected members were apl. Prof. Dr. Rainer Michalzik and Prof. Dr. Heiko Neumann (both Ulm University), and Prof. Dr. Matthias Tichy served as a further member of the examination committee (chair and minutes).
Abstract:
Deep learning has become the dominant technique in many domains, including visual computing. Visual perception, in both humans and machines, is rooted in 2D space, as visual sensors are ultimately only able to capture a two-dimensional representation from their specific viewpoint. Even 3D sensors, such as indirect Time-of-Flight (iToF) cameras, can only perceive information that is visible from their respective viewpoint, and thus can only model the observed 3D space as a 2D manifold, which is commonly referred to as a 2.5D image.
Consequently, the visual observation is highly dependent on both the observed 3D geometry and the chosen viewpoint. This interdependency of the viewpoint and the observed 3D structure can be exploited to improve the performance of neural networks for visual tasks. To this end, I present algorithmic optimizations for neural network training, which allow for the use of viewpoint information when addressing 3D problems and, vice versa, the use of 3D information when addressing viewpoint-related problems.
First, a dynamic labeling strategy is presented, which enables 3D point networks to identify informative viewpoints for 3D models. Second, it is shown how integrating information about the view direction into 3D point convolutional networks can improve the error correction rates for iToF cameras. Lastly, a training method is derived for compensating motion artefacts in iToF images, caused for example by changes of the viewpoint, which allows for the supervision of 2D networks via the reconstructed 2.5D depth image.
With this step completed, we wish him all the best, and may his path lead wherever things are interesting.