Geiger A.,MPI for Intelligent Systems | Geiger A.,Karlsruhe Institute of Technology | Lauer M.,Karlsruhe Institute of Technology | Wojek C.,MPI for Informatics | And 2 more authors.
IEEE Transactions on Pattern Analysis and Machine Intelligence | Year: 2014

In this paper, we present a novel probabilistic generative model for multi-object traffic scene understanding from movable platforms which reasons jointly about the 3D scene layout as well as the location and orientation of objects in the scene. In particular, the scene topology, geometry, and traffic activities are inferred from short video sequences. Inspired by the impressive driving capabilities of humans, our model does not rely on GPS, lidar, or map knowledge. Instead, it takes advantage of a diverse set of visual cues in the form of vehicle tracklets, vanishing points, semantic scene labels, scene flow, and occupancy grids. For each of these cues, we propose likelihood functions that are integrated into a probabilistic generative model. We learn all model parameters from training data using contrastive divergence. Experiments conducted on videos of 113 representative intersections show that our approach successfully infers the correct layout in a variety of very challenging scenarios. To evaluate the importance of each feature cue, experiments using different feature combinations are conducted. Furthermore, we show how by employing context derived from the proposed method we are able to improve over the state-of-the-art in terms of object detection and object orientation estimation in challenging and cluttered urban environments. © 2013 IEEE.


Garrido P.,MPI for Informatics | Valgaerts L.,MPI for Informatics | Rehmsen O.,MPI for Informatics | Thormahlen T.,University of Marburg | And 2 more authors.
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | Year: 2014

We propose an image-based, facial reenactment system that replaces the face of an actor in an existing target video with the face of a user from a source video, while preserving the original target performance. Our system is fully automatic and does not require a database of source expressions. Instead, it is able to produce convincing reenactment results from a short source video captured with an off-the-shelf camera, such as a webcam, where the user performs arbitrary facial gestures. Our reenactment pipeline is conceived as part image retrieval and part face transfer: The image retrieval is based on temporal clustering of target frames and a novel image matching metric that combines appearance and motion to select candidate frames from the source video, while the face transfer uses a 2D warping strategy that preserves the user's identity. Our system excels in simplicity as it does not rely on a 3D face model, it is robust under head motion and does not require the source and target performance to be similar. We show convincing reenactment results for videos that we recorded ourselves and for low-quality footage taken from the Internet. © 2014 IEEE.


Bonifaci V.,CNR Institute for System Analysis and Computer Science Antonio Ruberti | Mehlhorn K.,MPI for Informatics | Varma G.,Tata Institute of Fundamental Research
Journal of Theoretical Biology | Year: 2012

Physarum polycephalum is a slime mold that is apparently able to solve shortest path problems. A mathematical model has been proposed by Tero et al. (Journal of Theoretical Biology, 244, 2007, pp. 553-564) to describe the feedback mechanism used by the slime mold to adapt its tubular channels while foraging two food sources s0 and s1. We prove that, under this model, the mass of the mold will eventually converge to the shortest s0-s1 path of the network that the mold lies on, independently of the structure of the network or of the initial mass distribution. This matches the experimental observations by Tero et al. and can be seen as an example of a "natural algorithm", that is, an algorithm developed by evolution over millions of years. © 2012 Elsevier Ltd.
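
The abstract above does not restate the dynamics, so the following is only a minimal sketch under stated assumptions. It assumes the form usually attributed to the Tero et al. model: each edge e has a length L_e and a conductivity D_e, a unit flow is pushed from s0 to s1 as in an electrical network, and the conductivities evolve as dD_e/dt = |Q_e| - D_e, where Q_e is the flow on edge e. The forward-Euler step size `h`, the function names `solve_linear` and `physarum`, and the four-node example graph are illustrative choices, not taken from the paper.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def physarum(num_nodes, edges, source, sink, steps=500, h=0.1):
    """Simulate the assumed dynamics; edges is a list of (u, v, length).

    Returns the conductivity of every edge after `steps` Euler steps.
    """
    d = [1.0] * len(edges)  # assumed uniform initial conductivities
    free = [v for v in range(num_nodes) if v != sink]  # sink potential is fixed at 0
    index = {v: i for i, v in enumerate(free)}
    for _ in range(steps):
        # Build the reduced graph Laplacian weighted by D_e / L_e.
        lap = [[0.0] * len(free) for _ in free]
        for (u, v, length), cond in zip(edges, d):
            w = cond / length
            for a, b in ((u, v), (v, u)):
                if a != sink:
                    lap[index[a]][index[a]] += w
                    if b != sink:
                        lap[index[a]][index[b]] -= w
        # One unit of flow enters at the source and leaves at the sink.
        rhs = [1.0 if v == source else 0.0 for v in free]
        p_free = solve_linear(lap, rhs)
        p = [0.0] * num_nodes
        for v, i in index.items():
            p[v] = p_free[i]
        # Euler step of dD_e/dt = |Q_e| - D_e with Q_e = D_e (p_u - p_v) / L_e.
        for k, (u, v, length) in enumerate(edges):
            q = d[k] * (p[u] - p[v]) / length
            d[k] += h * (abs(q) - d[k])
    return d


# Hypothetical example: two s0-s1 paths, of total length 2 (via node 1)
# and total length 4 (via node 2); s0 is node 0 and s1 is node 3.
EXAMPLE_EDGES = [(0, 1, 1.0), (1, 3, 1.0), (0, 2, 2.0), (2, 3, 2.0)]
```

Calling `physarum(4, EXAMPLE_EDGES, 0, 3)` on this toy graph should drive the conductivities of the two short-path edges toward 1 and those of the two long-path edges toward 0, which is the behaviour the convergence result describes. The sketch does not guard against a node losing all of its conductivity, which would make the linear system singular.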


Sikka K.,University of California at San Diego | Sharma G.,MPI for Informatics | Sharma G.,Indian Institute of Technology Kanpur | Bartlett M.,University of California at San Diego
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | Year: 2016

We study the problem of facial analysis in videos. We propose a novel weakly supervised learning method that models the video event (expression, pain etc.) as a sequence of automatically mined, discriminative sub-events (e.g. onset and offset phase for smile, brow lower and cheek raise for pain). The proposed model is inspired by the recent works on Multiple Instance Learning and latent SVM/HCRF; it extends such frameworks to approximately model the ordinal or temporal aspect of the videos. We obtain consistent improvements over relevant competitive baselines on four challenging and publicly available video based facial analysis datasets for prediction of expression, clinical pain and intent in dyadic conversations. In combination with complementary features, we report state-of-the-art results on these datasets.


Bhattarai B.,University of Caen Lower Normandy | Sharma G.,MPI for Informatics | Sharma G.,Indian Institute of Technology Kanpur | Jurie F.,University of Caen Lower Normandy
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | Year: 2016

We propose a novel Coupled Projection multi-task Metric Learning (CP-mtML) method for large scale face retrieval. In contrast to previous works which were limited to low dimensional features and small datasets, the proposed method scales to large datasets with high dimensional face descriptors. It utilises pairwise (dis-)similarity constraints as supervision and hence does not require exhaustive class annotation for every training image. While, traditionally, multi-task learning methods have been validated on the same dataset but different tasks, we work on the more challenging setting with heterogeneous datasets and different tasks. We show empirical validation on multiple face image datasets of different facial traits, e.g. identity, age and expression. We use classic Local Binary Pattern (LBP) descriptors along with the recent Deep Convolutional Neural Network (CNN) features. The experiments clearly demonstrate the scalability and improved performance of the proposed method on the tasks of identity and age based face image retrieval compared to competitive existing methods, on the standard datasets and in the presence of a million distractor face images.


Scherbaum K.,Saarland University | Petterson J.,Commonwealth Bank of Australia | Feris R.S.,IBM | Blanz V.,University of Siegen | Seidel H.-P.,MPI for Informatics
Proceedings of the IEEE International Conference on Computer Vision | Year: 2013

Face detection is an important task in computer vision and often serves as the first step for a variety of applications. State-of-the-art approaches use efficient learning algorithms and train on large amounts of manually labeled imagery. Acquiring appropriate training images, however, is very time-consuming and does not guarantee that the collected training data is representative in terms of data variability. Moreover, available data sets are often acquired under controlled settings, restricting, for example, scene illumination or 3D head pose to a narrow range. This paper takes a look into the automated generation of adaptive training samples from a 3D morphable face model. Using statistical insights, the tailored training data guarantees full data variability and is enriched by arbitrary facial attributes such as age or body weight. Moreover, it can automatically adapt to environmental constraints, such as illumination or viewing angle of recorded video footage from surveillance cameras. We use the tailored imagery to train a new many-core implementation of Viola Jones' AdaBoost object detection framework. The new implementation is not only faster but also enables the use of multiple feature channels such as color features at training time. In our experiments we trained seven view-dependent face detectors and evaluated these on the Face Detection Data Set and Benchmark (FDDB). Our experiments show that the use of tailored training imagery outperforms state-of-the-art approaches on this challenging dataset. © 2013 IEEE.


Reinhard E.,MPI for Informatics | Pouli T.,MPI for Informatics | Kunkel T.,Dolby Laboratories Inc. | Long B.,University of Bristol | And 2 more authors.
ACM Transactions on Graphics | Year: 2012

Managing the appearance of images across different display environments is a difficult problem, exacerbated by the proliferation of high dynamic range imaging technologies. Tone reproduction is often limited to luminance adjustment and is rarely calibrated against psychophysical data, while color appearance modeling addresses color reproduction in a calibrated manner, albeit over a limited luminance range. Only a few image appearance models bridge the gap, borrowing ideas from both areas. Our take on scene reproduction reduces computational complexity with respect to the state-of-the-art, and adds a spatially varying model of lightness perception. The predictive capabilities of the model are validated against all psychophysical data known to us, and visual comparisons show accurate and robust reproduction for challenging high dynamic range scenes. © 2012 ACM.


Even G.,Tel Aviv University | Medina M.,MPI for Informatics
Algorithmica | Year: 2016

We present deterministic and randomized algorithms for the problem of online packet routing in grids in the competitive network throughput model (Aiello et al. in SODA, pp 771–780 2003). In this model the network has nodes with bounded buffers and bounded link capacities. The goal in this model is to maximize the throughput, i.e., the number of delivered packets. Our deterministic algorithm is the first online algorithm with an (Formula presented.) competitive ratio for uni-directional grids (where n denotes the size of the network). The deterministic online algorithm is centralized and handles packets with deadlines. This algorithm is applicable to various ranges of values of buffer sizes and communication link capacities. In particular, it holds for buffer size and communication link capacity in the range (Formula presented.). Our randomized algorithm achieves an expected competitive ratio of (Formula presented.) for the uni-directional line. This algorithm is applicable to a wide range of buffer sizes and communication link capacities. In particular, it holds also for unit size buffers and unit capacity links. This algorithm improves the best previous (Formula presented.)-competitive ratio of Azar and Zachut (ESA, pp 484–495, 2005). © 2016 The Author(s)


Lenzen C.,MPI for Informatics | Patt-Shamir B.,Tel Aviv University
Proceedings of the Annual ACM Symposium on Principles of Distributed Computing | Year: 2015

We study approximate distributed solutions to the weighted all-pairs-shortest-paths (APSP) problem in the CONGEST model. We obtain the following results. A deterministic (1 + ε)-approximation to APSP with running time O(ε^-2 n log n) rounds. The best previously known algorithm was randomized and slower by a (log n) factor. In many cases, routing schemes involve relabeling, i.e., assigning new names to nodes that are used in distance and routing queries. It is known that relabeling is necessary to achieve running times of o(n/log n). In the relabeling model, we obtain the following results. A randomized O(k)-approximation to APSP, for any integer k > 1, running in O(n^(1/2+1/k) + D) rounds, where D is the hop diameter of the network. This algorithm simplifies the best previously known result and reduces its approximation ratio from O(k log k) to O(k). Also, the new algorithm uses O(log n)-bit labels, which is asymptotically optimal. A randomized O(k)-approximation to APSP, for any integer k > 1, running in time O((nD)^(1/2) · n^(1/k) + D) and producing compact routing tables of size (Formula presented.); labels consist of O(k log n) bits. This improves on the approximation ratio of (k^2) for tables of that size achieved by the best previously known algorithm, which terminates faster, in O(n^(1/2+1/k) + D) rounds. In addition, we improve on the time complexity of the best known deterministic algorithm for distributed approximate Steiner forest. © Copyright 2015 ACM.


Valgaerts L.,MPI for Informatics | Wu C.,MPI for Informatics | Wu C.,Intel Corporation | Bruhn A.,University of Stuttgart | And 2 more authors.
ACM Transactions on Graphics | Year: 2012

Recent progress in passive facial performance capture has shown impressively detailed results on highly articulated motion. However, most methods rely on complex multi-camera set-ups, controlled lighting or fiducial markers. This prevents them from being used in general environments, outdoor scenes, during live action on a film set, or by freelance animators and everyday users who want to capture their digital selves. In this paper, we therefore propose a lightweight passive facial performance capture approach that is able to reconstruct high-quality dynamic facial geometry from only a single pair of stereo cameras. Our method succeeds under uncontrolled and time-varying lighting, and also in outdoor scenes. Our approach builds upon and extends recent image-based scene flow computation, lighting estimation and shading-based refinement algorithms. It integrates them into a pipeline that is specifically tailored towards facial performance reconstruction from challenging binocular footage under uncontrolled lighting. In an experimental evaluation, the strong capabilities of our method become explicit: We achieve detailed and spatio-temporally coherent results for expressive facial motion in both indoor and outdoor scenes, even from low quality input images recorded with a hand-held consumer stereo camera. We believe that our approach is the first to capture facial performances of such high quality from a single stereo rig and we demonstrate that it brings facial performance capture out of the studio, into the wild, and within the reach of everybody. © 2012 ACM.
