Geometry and Learning
Personal Homepage: http://theinf2.informatik.uni-jena.de/People/Joachim+Giesen.html
Learning of Geometry: Given samples obtained from a shape we want to learn some of its geometric and topological characteristics. A popular example that fits in this framework is surface reconstruction: to obtain a digital model of some solid one samples its surface, e.g., by using a laser range scanner, and applies a reconstruction algorithm to the sample that outputs a continuous model of the surface. In general shape sampling is a very powerful paradigm that does not only allow to compute digital models of shapes but also to simplify these models, to compute fingerprints for efficient retrieval in shape databases and to segment and partially match shapes. The latter application is particularly interesting in biological applications where a macromolecule is modeled as a union of balls that represent the molecule’s atoms.
We are interested in data structures and algorithms for sample based geometry processing that provided some sampling condition holds give results with provable guarantees. A major challenge is posed by kinetic data sets, i.e., samples of shapes that move or deform over time. Motion is fundamental in many disciplines that model the physical world, such as robotics, computer graphics, or computational biology.
Geometry of Learning: Some powerful machine learning techniques like support vector machines and spectral clustering are essentially geometric techniques. Support vector machines are mainly used in a supervised setting, i.e., the data come with labels and a classifier is computed from the labeled data. The classifier should generalize to data whose labels are not known in advance. Clustering is an example of unsupervised learning. Here the goal is to attach labels to the given data, i.e., the labels are not part of the input as it is the case in the supervised setting. Instead pairwise similarity/dissimilarity values are given. The goal is to assign labels such that data with the same label are similar and dissimilar otherwise.
Geometry enters in support vector machines via the so called kernel trick that embeds the original data into some Hilbert space. In Hilbert space the the data can be processed by exploiting its geometry, e.g., by the fact that we can measure distances and angles.
In spectral clustering the original data are embedded into some Euclidean space. The embedding is derived from the spectral properties of the pairwise similarity matrix, i.e., its eigenvalues and eigenspaces. Right now the embedding is not really well understood but in many applications it allows standard geometric clustering algorithms like the k-means algorithm to perform well on the embedded data.
We hope to get a better understanding of the success but also of the limitations of these techniques by looking at the geometric and topological aspects of the corresponding embeddings.