Schönau am Königssee, Germany

Habets E.A.P.,International Audio Laboratories Erlangen | Benesty J.,University of Québec
2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11 | Year: 2011

In this paper, a two-stage beamforming approach is presented for dereverberation and noise reduction. The first stage comprises a delay-and-sum (DS) beamformer that generates a reference signal that contains a spatially filtered version of the desired speech and interference. In general, the desired speech component at the output of the DS beamformer contains less reverberation than the reverberant speech signal received at the microphones. The second stage uses the filtered microphone signals and the noisy reference signal to estimate the desired speech component at the output of the DS beamformer. A major advantage over classical approaches is that the proposed approach is able to dereverberate the received desired signal with very low speech distortion. The dereverberation and noise reduction performance is evaluated for a circular microphone array. © 2011 IEEE.
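The first stage named in the abstract, a delay-and-sum beamformer, can be illustrated with a minimal sketch. It assumes integer sample delays that are already known and uses a synthetic unit pulse as the desired signal; the function name `delay_and_sum` and the test data are hypothetical and are not taken from the paper, and the second (estimation) stage is not shown.

```python
def delay_and_sum(signals, delays):
    """Align each channel by its integer sample delay, then average.

    signals: list of equal-length sample lists, one per microphone.
    delays:  arrival delay of the desired source at each microphone,
             in samples (assumed known here).
    """
    n = len(signals[0])
    out = [0.0] * n
    for x, d in zip(signals, delays):
        for t in range(n):
            src = t + d  # advance the channel by d samples to undo its delay
            if 0 <= src < n:
                out[t] += x[src]
    return [v / len(signals) for v in out]


# A unit pulse that reaches three microphones at samples 2, 3 and 5.
pulse_at = [2, 3, 5]
mics = [[1.0 if t == p else 0.0 for t in range(8)] for p in pulse_at]

# Delays relative to the earliest microphone: [0, 1, 3].
delays = [p - min(pulse_at) for p in pulse_at]

aligned = delay_and_sum(mics, delays)       # pulses add coherently
unaligned = delay_and_sum(mics, [0, 0, 0])  # pulses stay spread out
```

With the correct delays the three pulses coincide and the averaged output keeps the full pulse amplitude; with zero delays each pulse is attenuated to one third, which is the spatial filtering effect a DS beamformer relies on.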


Levin D.,Bar-Ilan University | Habets E.A.P.,International Audio Laboratories Erlangen | Gannot S.,Bar-Ilan University
Journal of the Acoustical Society of America | Year: 2012

A vector-sensor consisting of a monopole sensor collocated with orthogonally oriented dipole sensors is used for direction of arrival (DOA) estimation in the presence of an isotropic noise-field or internal device noise. A maximum likelihood (ML) DOA estimator is derived and subsequently shown to be a special case of DOA estimation by means of a search for the direction of maximum steered response power (SRP). The problem of SRP maximization with respect to a vector-sensor can be solved with a computationally inexpensive algorithm. The ML estimator achieves asymptotic efficiency and thus outperforms existing estimators with respect to the mean square angular error (MSAE) measure. The beampattern associated with the ML estimator is shown to be identical to that used by the minimum power distortionless response beamformer for the purpose of signal enhancement. © 2012 Acoustical Society of America.


Thiergart O.,International Audio Laboratories Erlangen | Habets E.A.P.,International Audio Laboratories Erlangen
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Year: 2013

Extracting sound sources in noisy and reverberant conditions remains a challenging task that is commonly found in modern communication systems. In this work, we consider the problem of obtaining a desired spatial response for at most L simultaneously active sound sources. The proposed spatial filter is obtained by minimizing the diffuse plus self-noise power at the output of the filter subject to L linear constraints. In contrast to earlier works, the L constraints are based on instantaneous narrowband direction-of-arrival estimates. In addition, a novel estimator for the diffuse-to-noise ratio is developed that exhibits a sufficiently high temporal and spectral resolution to achieve both dereverberation and noise reduction. The presented results demonstrate that an optimal tradeoff between maximum white noise gain and maximum directivity is achieved. © 2013 IEEE.


Taseska M.,International Audio Laboratories Erlangen | Habets E.A.P.,International Audio Laboratories Erlangen
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics | Year: 2013

Extracting sounds that originate from a specific location while reducing noise and interferers is required in many hands-free communication systems. We propose a spotforming approach that uses distributed microphone arrays and aims at extracting sounds that originate from a pre-defined spot of interest (SOI), while reducing background noise and sounds that originate from outside the SOI. The spotformer is realized as a linear spatial filter, which is based on the signal statistics of sounds from the SOI, the signal statistics of sounds outside the SOI, and the background noise signal statistics. The required signal statistics are estimated from the microphone signals, while taking into account the uncertainty in the location estimates of the desired and the interfering sound sources. The applicability of the method is demonstrated by simulations, and the quality of the extracted signal is evaluated in different scenarios. © 2013 IEEE.


Thiergart O.,International Audio Laboratories Erlangen | Del Galdo G.,International Audio Laboratories Erlangen | Habets E.A.P.,International Audio Laboratories Erlangen
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Year: 2012

The signal-to-reverberant ratio (SRR) is an important parameter in several applications such as speech enhancement, dereverberation, and parametric spatial audio coding. In this contribution, an SRR estimator is derived from the direction-of-arrival dependent complex spatial coherence function computed via two omnidirectional microphones. It is shown that by employing a computationally inexpensive DOA estimator, the proposed SRR estimator outperforms existing approaches. © 2012 IEEE.
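One way to see how an SRR can follow from the spatial coherence is the sketch below. It rests on assumptions that the abstract does not state: the direct sound is a single plane wave, the reverberation is an ideal spherically isotropic diffuse field, the two omnidirectional microphones are noise-free, and the DOA `theta` is known. Under those assumptions the observed coherence is a power-weighted mix of the direct and diffuse coherences, which can be inverted for the ratio. All function names are hypothetical, and this is not claimed to be the estimator derived in the paper.

```python
import cmath
import math


def diffuse_coherence(k, d):
    """Coherence of an ideal diffuse field between two omni microphones."""
    x = k * d
    return 1.0 if x == 0 else math.sin(x) / x


def direct_coherence(k, d, theta):
    """Coherence of one plane wave arriving at angle theta to the array axis."""
    return cmath.exp(1j * k * d * math.cos(theta))


def mixed_coherence(srr, k, d, theta):
    """Forward model: power-weighted mix of direct and diffuse coherence."""
    return (srr * direct_coherence(k, d, theta) + diffuse_coherence(k, d)) / (srr + 1.0)


def srr_from_coherence(gamma, k, d, theta):
    """Invert the forward model; undefined if gamma equals the direct coherence."""
    g_dir = direct_coherence(k, d, theta)
    g_diff = diffuse_coherence(k, d)
    return ((g_diff - gamma) / (gamma - g_dir)).real


# Hypothetical setup: 1 kHz, 5 cm spacing, 60 degree DOA, linear SRR of 4.
k = 2.0 * math.pi * 1000.0 / 343.0  # wavenumber, speed of sound 343 m/s
d = 0.05
theta = math.radians(60.0)
gamma = mixed_coherence(4.0, k, d, theta)
estimate = srr_from_coherence(gamma, k, d, theta)
```

In this noise-free round trip the estimate recovers the SRR used to generate the coherence. With measured coherence the ratio is generally complex and noisy, so taking the real part is only one possible choice.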


Schinkel-Bielefeld N.,International Audio Laboratories Erlangen
2016 8th International Conference on Quality of Multimedia Experience, QoMEX 2016 | Year: 2016

Audio quality evaluation for audio material of intermediate and high quality requires expert listeners. Compared with non-experts, expert listeners are not only more critical in their ratings but also employ different evaluation strategies. In particular, they concentrate on shorter sections of the audio signal and compare against the reference more often than inexperienced listeners do. © 2016 IEEE.


Talmon R.,Yale University | Habets E.A.P.,International Audio Laboratories Erlangen
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Year: 2013

The reverberation time (RT) is a very important measure that quantifies the acoustic properties of a room and provides information about the quality and intelligibility of speech recorded in that room. Moreover, information about the RT can be used to improve the performance of automatic speech recognition systems and speech dereverberation algorithms. In a recent study, it has been shown that existing methods for blind estimation of the RT are highly sensitive to additive noise. In this paper, a novel method is proposed to blindly estimate the RT based on the decay rate distribution. Firstly, a data-driven representation of the underlying decay rates of several training rooms is obtained via the eigenvalue decomposition of a specially-tailored kernel. Secondly, the representation is extended to a room under test and used to estimate its decay rate (and hence its RT). The presented results show that the proposed method outperforms a competing method and is significantly more robust to noise. © 2013 IEEE.
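The link between a decay rate and the RT mentioned in the abstract can be made concrete with a small sketch. It assumes the common exponential model in which the energy envelope decays as exp(-2·rho·t), so a 60 dB drop takes 3·ln(10)/rho seconds; the paper may define the decay rate differently. The decay rate is obtained here by a plain least-squares line fit on a known, noise-free decay curve, which is a stand-in for illustration and not the blind kernel-based method the paper proposes. All names are hypothetical.

```python
import math


def rt60_from_decay_rate(rho):
    """RT60 in seconds, assuming the energy envelope decays as exp(-2*rho*t)."""
    return 3.0 * math.log(10.0) / rho


def decay_rate_from_energy(times, energy):
    """Least-squares slope of log-energy over time; rho = -slope / 2."""
    logs = [math.log(e) for e in energy]
    t_mean = sum(times) / len(times)
    y_mean = sum(logs) / len(logs)
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    den = sum((t - t_mean) ** 2 for t in times)
    return -(num / den) / 2.0


# Synthetic, noise-free decay curve for a room with an RT60 of 0.5 s.
true_rho = 3.0 * math.log(10.0) / 0.5
times = [i * 0.01 for i in range(50)]
energy = [math.exp(-2.0 * true_rho * t) for t in times]

rho_hat = decay_rate_from_energy(times, energy)
rt_hat = rt60_from_decay_rate(rho_hat)
```

On this idealized curve the fitted decay rate maps back to the 0.5 s used to generate it. Additive noise flattens the tail of a real decay curve and biases such a fit, which is the sensitivity the abstract refers to.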


Backstrom T.,International Audio Laboratories Erlangen
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics | Year: 2013

Over the last decade, speech and audio coding have converged into an increasingly unified technology. This contribution discusses one of the remaining fundamental differences between the speech and audio paradigms, namely, windowing of the input signal. Audio codecs generally use lapped transforms and apply a perceptual model in the transform domain, whereby temporal continuity is achieved by windowing and overlap-add. Speech codecs, on the other hand, achieve temporal continuity by using linear predictive filtering, whereby windowing is applied in the residual domain. Despite these fundamental differences, we demonstrate that the two windowing approaches, combined with perceptual modeling, perform very similarly in terms of both perceptual quality and theoretical properties. © 2013 IEEE.


Habets E.A.P.,International Audio Laboratories Erlangen | Benesty J.,University of Quebec at Montréal
IEEE Transactions on Audio, Speech and Language Processing | Year: 2012

Signals captured by a set of microphones in a speech communication system are mixtures of desired signals and noise. In this paper, a different perspective on frequency-domain beamformers in room acoustics is provided. Specifically, the observed noise signals are divided into coherent and incoherent signal components, while no assumptions are made regarding the number of coherent noise sources and the noise sound field. From this perspective, performance measures are defined and existing beamformers are deduced. In addition, a new and general tradeoff beamformer is proposed that enables a compromise between noise reduction and speech distortion on the one hand, and between coherent and incoherent noise reduction on the other hand. The presented performance evaluation shows how existing beamformers and the tradeoff beamformer perform in different scenarios. © 2006 IEEE.


Thiergart O.,International Audio Laboratories Erlangen | Del Galdo G.,International Audio Laboratories Erlangen | Habets E.A.P.,International Audio Laboratories Erlangen
Journal of the Acoustical Society of America | Year: 2012

Many applications in spatial sound recording and processing model the sound scene as a sum of directional and diffuse sound components. The power ratio between both components, i.e., the signal-to-diffuse ratio (SDR), represents an important measure for algorithms that aim at performing robustly in reverberant environments. This contribution discusses SDR estimation from the spatial coherence between two arbitrary first-order directional microphones. First, the spatial coherence is expressed as a function of the SDR. For most microphone setups, the spatial coherence is a complex function where both the absolute value and phase contain relevant information on the SDR. Second, the SDR estimator is derived from the spatial coherence function. The estimator is discussed for different practical microphone setups, including coincident setups of arbitrary first-order directional microphones and spaced setups of identical first-order directional microphones. Unbiased SDR estimation requires noiseless coherence estimates as well as information on the direction-of-arrival of the directional sound, which usually has to be estimated. Nevertheless, measurement results verify that the proposed estimator is applicable in practice and provides accurate results. © 2012 Acoustical Society of America.
