
Jones M.,Mitsubishi Electric | Geng Y.,Boston University | Nikovski D.,MERL | Hirata T.,Mitsubishi Electric
IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC

We study the problem of predicting travel times for links (road segments) using floating car data. We present four different methods for predicting travel times and discuss the differences in predicting on congested and uncongested roads. We show that estimates of the current travel time are mainly useful for prediction on links that get congested. Then we examine the problem of predicting link travel times when no recent probe car data is available for estimating current travel times. This is a serious problem that arises when using probe car data for prediction. Our solution, which we call geospatial inference, uses floating car data from nearby links to predict travel times on the desired link. We show that geospatial inference leads to improved travel time estimates for congested links compared to standard methods. © 2013 IEEE.
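The geospatial inference idea above can be sketched in a few lines: when a link has no recent probe data, borrow observations from nearby links. This is a hypothetical illustration only; the inverse-distance weighting and the data layout are assumptions, not the paper's exact method.

```python
# Sketch of geospatial inference for link travel-time estimation.
# neighbor_obs holds (distance_to_target_link, travel_time_sec) pairs
# from nearby links that do have recent probe car data.

def geospatial_estimate(neighbor_obs):
    """Estimate a link's current travel time from nearby links' probe data,
    weighting each observation by inverse distance (assumed scheme)."""
    if not neighbor_obs:
        return None  # no nearby data either; caller falls back to history
    weights = [1.0 / max(d, 1e-6) for d, _ in neighbor_obs]
    total = sum(weights)
    return sum(w * t for w, (_, t) in zip(weights, neighbor_obs)) / total
```

With two equidistant neighbors the estimate reduces to their plain average, which matches the intuition that closer links should dominate only when distances differ.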

Bedri H.,Massachusetts Institute of Technology | Feigin M.,Massachusetts Institute of Technology | Boufounos P.T.,MERL | Raskar R.,Massachusetts Institute of Technology
Proceedings of the IEEE International Conference on Computer Vision

SONAR imaging can detect reflecting objects in the dark and around corners; however, many SONAR systems require large phased arrays and immobile equipment. To enable sound imaging with a mobile device, one can move a microphone and speaker through the air to form a large synthetic aperture. We demonstrate resolution-limited audio images, captured with a moving microphone and speaker, of a mannequin in free space and a mannequin located around a corner. This paper also explores the 2D resolution limit due to aperture size as well as the time resolution limit due to bandwidth, and proposes Continuous Basis Pursuits (CBP) to super-resolve. © 2015 IEEE.
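The synthetic-aperture idea can be illustrated with a minimal delay-and-sum back-projection: recordings from several speaker/microphone positions are summed at each pixel's round-trip delay. The geometry, sample rate, and co-located transducer assumption are illustrative, not the paper's actual setup.

```python
# Hypothetical delay-and-sum sketch for synthetic-aperture audio imaging.
import math

C = 343.0   # speed of sound in air, m/s
FS = 48000  # sample rate, Hz

def backproject(recordings, sensor_positions, pixel):
    """Coherently sum echo traces at a candidate scene point.

    recordings: list of echo traces (one per aperture position)
    sensor_positions: (x, y) of the co-located speaker/mic for each trace
    pixel: (x, y) scene point being imaged
    """
    total = 0.0
    for trace, (sx, sy) in zip(recordings, sensor_positions):
        dist = math.hypot(pixel[0] - sx, pixel[1] - sy)
        delay = int(round(2 * dist / C * FS))  # round-trip delay in samples
        if delay < len(trace):
            total += trace[delay]
    return total
```

A true reflector produces aligned echoes that add constructively across aperture positions, while clutter sums incoherently; a larger aperture therefore sharpens the image, which is the resolution limit the paper analyzes.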

Barker J.,University of Sheffield | Marxer R.,University of Sheffield | Vincent E.,French Institute for Research in Computer Science and Automation | Watanabe S.,MERL
2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings

The CHiME challenge series aims to advance far-field speech recognition technology by promoting research at the interface of signal processing and automatic speech recognition. This paper presents the design and outcomes of the 3rd CHiME Challenge, which targets the performance of automatic speech recognition in a real-world, commercially motivated scenario: a person talking to a tablet device that has been fitted with a six-channel microphone array. The paper describes the data collection, the task definition, and the baseline systems for data simulation, enhancement, and recognition. The paper then presents an overview of the 26 systems that were submitted to the challenge, focusing on the strategies that proved to be most successful relative to the MVDR array processing and DNN acoustic modeling reference system. Challenge findings related to the role of simulated data in system training and evaluation are discussed. © 2015 IEEE.
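The MVDR beamformer used as the array-processing reference computes weights w = R⁻¹d / (dᴴR⁻¹d), minimizing noise power subject to unit gain in the look direction. A minimal sketch, with shapes and values purely illustrative:

```python
# Minimal MVDR (minimum variance distortionless response) weight sketch.
import numpy as np

def mvdr_weights(R, d):
    """Compute MVDR beamformer weights.

    R: (M, M) noise spatial covariance matrix
    d: (M,) steering vector toward the target
    Returns (M,) weights satisfying the distortionless constraint
    w^H d = 1 while minimizing output noise power w^H R w.
    """
    Rinv_d = np.linalg.solve(R, d)          # R^{-1} d without explicit inverse
    return Rinv_d / (d.conj() @ Rinv_d)     # normalize for unit target gain
```

For spatially white noise (R = I) the weights reduce to a scaled delay-and-sum beamformer, a useful sanity check on any implementation.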

Sharma A.,University of Maryland College Park | Tuzel O.,MERL | Jacobs D.W.,University of Maryland College Park
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

This paper proposes a learning-based approach to scene parsing inspired by the deep Recursive Context Propagation Network (RCPN). RCPN is a deep feed-forward neural network that utilizes the contextual information from the entire image, through bottom-up followed by top-down context propagation via random binary parse trees. This improves the feature representation of every super-pixel in the image for better classification into semantic categories. We analyze RCPN and propose two novel contributions to further improve the model. First, we analyze the learning of RCPN parameters and discover the presence of bypass error paths in the computation graph of RCPN that can hinder contextual propagation. We propose to tackle this problem by including the classification loss of the internal nodes of the random parse trees in the original RCPN loss function. Second, we use an MRF on the parse tree nodes to model the hierarchical dependency present in the output. Both modifications provide performance boosts over the original RCPN, and the new system achieves state-of-the-art performance on the Stanford Background, SIFT-Flow, and Daimler urban datasets. © 2015 IEEE.
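The bottom-up/top-down propagation over a binary parse tree can be sketched minimally: leaf (super-pixel) features are combined pairwise up to a root summary, which is then mixed back into each leaf. The averaging "combiner" below is a stand-in for RCPN's learned layers, not the paper's actual network.

```python
# Toy sketch of bottom-up/top-down context propagation on a binary tree.
import numpy as np

def propagate(leaf_feats):
    """Enhance leaf features with whole-image context.

    leaf_feats: (n, d) super-pixel features, n a power of two.
    Returns (n, d) context-enhanced features.
    """
    levels = [np.asarray(leaf_feats, dtype=float)]
    # Bottom-up pass: average sibling pairs level by level up to the root
    # (RCPN instead applies a learned combiner at each internal node).
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append(0.5 * (cur[0::2] + cur[1::2]))
    root = levels[-1][0]
    # Top-down pass: blend the global root context into every leaf.
    return 0.5 * levels[0] + 0.5 * root
```

Even in this toy form, every output feature depends on every input feature, which is the property that lets a super-pixel's classification use context from the entire image.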

Sturm P.,French Institute for Research in Computer Science and Automation | Ramalingam S.,MERL | Tardif J.-P.,Carnegie Mellon University | Gasparini S.,French Institute for Research in Computer Science and Automation | Barreto J.,University of Coimbra
Foundations and Trends in Computer Graphics and Vision

This survey is mainly motivated by the increased availability and use of panoramic image acquisition devices in computer vision and many of its applications. Different technologies exist, along with different computational models of them, and algorithms and theoretical studies for geometric computer vision ("structure-from-motion") are often re-developed without highlighting common underlying principles. One of the goals of this survey is to give an overview of image acquisition methods used in computer vision and especially of the vast number of camera models that have been proposed and investigated over the years, where we try to point out similarities between different models. Results on epipolar and multi-view geometry for different camera models are reviewed, as well as various calibration and self-calibration approaches, with an emphasis on non-perspective cameras. We finally describe what we consider are fundamental building blocks for geometric computer vision or structure-from-motion: epipolar geometry, pose and motion estimation, 3D scene modeling, and bundle adjustment. The main goal here is to highlight the main principles of these, which are independent of specific camera models.
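The epipolar geometry named above as a fundamental building block rests on one bilinear constraint: corresponding image points satisfy x₂ᵀ F x₁ = 0 for the fundamental matrix F. A toy numeric check, using the F of a rectified stereo pair (pure x-translation, so corresponding points share a row); the values are illustrative only:

```python
# Numeric check of the epipolar constraint x2^T F x1 = 0.
import numpy as np

# Fundamental matrix of a rectified stereo pair (baseline along x):
# the residual reduces to y1 - y2, zero iff the points share a row.
F = np.array([[0.0,  0.0, 0.0],
              [0.0,  0.0, 1.0],
              [0.0, -1.0, 0.0]])

def epipolar_residual(x1, x2):
    """x1, x2: homogeneous image points (3,). Zero iff the pair
    satisfies the epipolar constraint for this F."""
    return float(np.asarray(x2) @ F @ np.asarray(x1))
```

The same residual form holds for any central camera pair; only the entries of F change, which is the model-independence the survey emphasizes.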
