Research Training Group 2088

Project A2: Wasserstein Metrics in Statistics: Algorithms

Exact computation of the Wasserstein W_p metric (see project A1) between pixel images remains a challenging problem. While some progress has been made for the case p=2 by employing smoothness properties and a convenient geometrical result, the general case is largely unresolved.

The most natural approach to computing Wasserstein metrics is by solving a large discrete transportation problem by means of combinatorial optimization. In the framework of a newly developed shortlist method we were recently able to design an algorithm that solves medium sized problems considerably faster than commercial solvers.

The goal of this project is to enhance this method, and to develop and implement further algorithms which take the special structure of the underlying data into account, e.g. by using multiscale approaches and column generation techniques. We would like to be able to solve large-scale transportation problems of an order of magnitude of 10^5 sources and sinks.

The PhD student working on this project can rely on an established code base in C and in the statistical programming language R. The successful candidate has a strong background in discrete optimization and substantial programming experience in C or C++. Previous skills in R and knowledge in probability theory or statistics is an asset.

Methods: discrete and linear optimization, quantization of finite measures
Applications: image retrieval, biochemical substrate transfer

← Previous Project Overview Next Projekt →