Blanz, Volker


Prof. Dr. Volker Blanz

Learning-Based Modeling of Objects

Personal Homepage:

Research Mission

As technology in Computer Graphics becomes more and more powerful, a tremendous increase in the complexity of rendered scenes and a high demand for realism have posed new challenges for modeling and rendering. It has become essential to replace as much as possible of artists’ and designers’ manual work with automated algorithms, so that they can create scenes and objects on a higher, more abstract level.

The overall approach of the research group is to use learning-based methods in Computer Graphics in order to capture the typical properties of classes of objects, such as human faces. This involves three main steps that are addressed by our projects: (1) data collection, (2) statistical data analysis, and (3) methods for application of the class-specific information in Computer Graphics and Vision. The close relationship between Graphics and Vision is reflected in our previous work: We combine methods from both fields, and our results can be used for face modeling [3] and animation [1], but also for face recognition [4]. Building on the technology that we developed in previous years, we plan to exploit new sources of data, such as time sequences of 3D scans, and explore new techniques for data analysis.

Our long-term vision for Computer Graphics is a technology that captures existing objects, scenes and events automatically, and converts them into a mathematical representation that allows users to manipulate and interact with the scene on a high level of abstraction. To achieve this, the measured data have to be converted into a representation that reflects the mental representation in high-level stages of the human visual system. The user interface has to provide some of the cognitive concepts that are meaningful to users, such as object identity, material properties, scene parameters and motion patterns.
We have addressed this problem in previous work by separating the identity of a person from the scene parameters of an image [4], and by manipulating meaningful attributes of faces such as gender or body weight, while keeping each person’s identity unchanged [3]. On the way to implementing the long-term vision, a variety of interesting problems for Computer Vision and Machine Learning can be defined, ranging from low-level preprocessing to object recognition.

Our approach to Computer Graphics is example-based, unlike the state-of-the-art methods of manual design of objects, material properties and motions applied in the production of movies, and unlike physical simulations presented in research. Physical simulations of phenomena, such as mechanical deformations of faces during speech or the interaction of light with matter, involve assumptions about the internal structure and the physical properties of objects. Simulations of reasonably complex phenomena require a large number of parameters that are difficult to measure. Any simplifications on this level are likely to produce unrealistic results.
Therefore, even though physical simulations are based on general physical laws, they cannot completely avoid empirical measurements. In contrast, our inductive approach is entirely data-driven and fully automated. By measuring and modeling only quantities that are directly perceivable, such as the deformation of faces or the radiance of reflected light, our method directly maximizes the realism in reproducing the measurements and generalizing to new viewing conditions.
The statistical methods represent inherent physical laws and common properties of objects in an implicit way. For example, the symmetry of human faces is captured by a high correlation between structures on the left and right side of the faces in the database.
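The symmetry example can be illustrated with a minimal sketch in Python with NumPy. The data below are synthetic and purely hypothetical (the distance of one landmark from the facial midline, generated for the left and right side of each face); they are not taken from our database, and the function name is chosen only for this illustration. The point is that no symmetry rule is coded anywhere: the relationship appears only as a correlation in the examples.

```python
import numpy as np

def left_right_correlation(left, right):
    """Pearson correlation between corresponding left and right measurements."""
    return float(np.corrcoef(left, right)[0, 1])

# Synthetic stand-in for a face database (assumption, not real measurements).
rng = np.random.default_rng(0)
num_faces = 500

# A face-specific component shared by both sides (e.g. overall face width).
shared = rng.normal(loc=30.0, scale=3.0, size=num_faces)

# Each side deviates slightly and independently from the shared component.
left = shared + rng.normal(scale=0.5, size=num_faces)
right = shared + rng.normal(scale=0.5, size=num_faces)

symmetry_score = left_right_correlation(left, right)
print(f"left/right correlation: {symmetry_score:.3f}")
```

A statistical model estimated from such examples inherits this correlation, so new instances generated from the model tend to be symmetric without an explicit constraint.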

Statistical learning has become a very active field of research in the last decade, introducing important methods such as Neural Networks and Support Vector Machines that learn general properties of data from examples by induction. The general properties learned from data can, for example, be an estimate of a functional relationship (regression), or the probability density of examples in an appropriate representation (parameter estimation).
We have addressed the latter problem in previous work in terms of a Linear Object Class [3]: Elements of a class of objects, such as faces, cars or teeth, are converted into a vector space representation, and their probability density in this space is estimated with a Principal Component Analysis. Within the linear span of examples and the region with a high estimated probability, all vectors describe admissible elements of the object class. We have used regression to learn the difference between male and female faces, and other attributes of faces from examples.
Finally, we have used the concept of Bayesian estimators for image analysis in model fitting and face recognition. The powerful techniques of statistical learning provide a promising basis for future research on object representation, image analysis and synthesis.
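The vector space representation, the density estimate with a Principal Component Analysis, and the regression on attributes described above can be sketched as follows. This is a minimal illustration in Python with NumPy under stated assumptions, not the implementation used in [3]: all function names are hypothetical, the density is assumed to be Gaussian in the PCA subspace, and the attribute direction is taken as the least-squares slope of each vector coordinate with respect to a numeric attribute label.

```python
import numpy as np

def fit_linear_object_class(examples, num_components):
    """PCA on example vectors: returns mean, principal axes and standard deviations."""
    X = np.asarray(examples, dtype=float)       # shape (m, d), one example per row
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal axes.
    _, singular_values, vt = np.linalg.svd(X - mean, full_matrices=False)
    axes = vt[:num_components]
    sigmas = singular_values[:num_components] / np.sqrt(X.shape[0] - 1)
    return mean, axes, sigmas

def synthesize(mean, axes, sigmas, coeffs):
    """Build an object vector from coefficients given in units of standard deviations."""
    return mean + (np.asarray(coeffs, dtype=float) * sigmas) @ axes

def mahalanobis_sq(mean, axes, sigmas, x):
    """Squared distance from the mean in the PCA subspace.

    Under the Gaussian assumption, small values correspond to a high
    estimated probability density.
    """
    c = axes @ (np.asarray(x, dtype=float) - mean) / sigmas
    return float(c @ c)

def attribute_direction(examples, labels):
    """Least-squares slope of every vector coordinate with respect to the label."""
    X = np.asarray(examples, dtype=float)
    y = np.asarray(labels, dtype=float)
    y_centered = y - y.mean()
    return (y_centered @ (X - X.mean(axis=0))) / (y_centered @ y_centered)

# Synthetic stand-in for a set of example vectors (assumption, not real scans).
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)           # hypothetical binary attribute
examples = (rng.normal(size=(200, 3)) @ rng.normal(size=(3, 12))
            + np.outer(labels, np.linspace(-1.0, 1.0, 12)) + 5.0)

mean, axes, sigmas = fit_linear_object_class(examples, num_components=3)
new_object = synthesize(mean, axes, sigmas, [1.0, -0.5, 0.0])
modified = new_object + attribute_direction(examples, labels)
print(mahalanobis_sq(mean, axes, sigmas, new_object))
```

Adding the attribute direction to an object vector shifts the attribute by one label unit. In a real system, the vectors would have to be in dense point-to-point correspondence before any of these linear operations are meaningful.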


  1. V. Blanz, C. Basso, T. Poggio, and T. Vetter. Reanimating faces in images and video. In P. Brunet and D. Fellner,
    editors, Computer Graphics Forum, Vol. 22, No. 3 (EUROGRAPHICS 2003), pages 641-650, Granada, Spain, 2003.
  2. V. Blanz, B. Schölkopf, H. Bülthoff, C. Burges, V. Vapnik, and T. Vetter. Comparison of view-based object recognition
    algorithms using realistic 3D models. In C. von der Malsburg, W. von Seelen, J.C. Vorbrüggen, and B. Sendhoff, editors,
    Artificial Neural Networks – ICANN96, pages 251-256, Springer, Lecture Notes in Computer Science 1112, 1996.
  3. V. Blanz and T. Vetter. A morphable model for the synthesis of 3D faces. In Computer Graphics Proc. SIGGRAPH’99,
    pages 187-194, Los Angeles, 1999.
  4. V. Blanz and T. Vetter. Face recognition based on fitting a 3D morphable model. IEEE Trans. on Pattern Analysis
    and Machine Intell., 25(9):1063-1074, 2003.

Mentor in Saarbrücken: Professor Hans-Peter Seidel
Mentor in Stanford: Professor Bernd Girod