Teaching a machine what it ‘sees’ has been a long-standing goal in computer vision, which is not surprising, since 3D scene understanding algorithms would be of tremendous value for applications. For example, robots (including vehicles such as cars) could interact autonomously and intelligently with their environment, images and videos could be automatically indexed based on their spatial arrangement and on semantic tags, and missing parts in 3D reconstructions could be completed based on what a plausible scene looks like. Although this task seems easy for us humans, computers still struggle with it.
However, 3D computer vision (multiple-view geometry, visual SLAM, structure-from-motion, …) has matured and is nowadays a well-established technique for metric 3D reconstruction. Moreover, we have seen considerable progress in 2D image and video content analysis (segmentation, object and activity recognition, …). 3D reconstruction and 2D scene understanding have mostly evolved independently, though. The two problems are clearly intertwined, and a joint approach would benefit both. The major goal of our research is precisely the development of mathematical formulations and algorithms which combine scene understanding and 3D reconstruction in a joint framework. Since this is a very ambitious task, our research will initially focus on low-level geometric concepts before ultimately tackling the higher-level 3D scene understanding problem. Starting from known concepts in 3D computer vision, we follow an interdisciplinary approach, drawing mostly upon geometric computing and data-driven methods.
Roland Angst joined ASUS Corp. in Taipei, Taiwan, in December 2015.