Benesty J.,University of Quebec at Montreal |
Chen J.,Northwestern Polytechnical University |
Huang Y.,WEVOICE, Inc.
IEEE Transactions on Audio, Speech and Language Processing | Year: 2011
Binaural noise reduction with a stereophonic (or simply stereo) setup has become an important problem as stereo sound systems and devices are increasingly deployed in modern voice communications. This problem is very challenging since it requires not only the reduction of the noise at the stereo inputs, but also the preservation of the spatial information embodied in the two channels, so that after noise reduction the listener can still localize the sound source from the binaural outputs. As a result, simply applying a traditional single-channel noise reduction technique to each channel individually may not work, as the spatial effects may be destroyed. In this paper, we present a new formulation of the binaural noise reduction problem in stereo systems. We first form a complex signal from the stereo inputs, with one channel being its real part and the other being its imaginary part. By doing so, the binaural noise reduction problem can be processed by a single-channel widely linear filter. The widely linear estimation theory is then used to derive optimal noise reduction filters that can fully take advantage of the noncircularity of the complex speech signal to achieve noise reduction while preserving the desired signal (speech) and spatial information. With this new formulation, the Wiener, minimum variance distortionless response (MVDR), maximum signal-to-noise ratio (SNR), and tradeoff filters are derived. Experiments are provided to verify the effectiveness of these filters. © 2011 IEEE.
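The complex-signal formulation described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function names, the FIR structure, and the coefficient lists `h` and `h_conj` are assumptions made for the example. It packs the two channels into one complex sequence, measures its noncircularity, and applies a widely linear filter, which combines the signal and its complex conjugate.

```python
def stereo_to_complex(left, right):
    """Pack two real channels into one complex signal (left = real, right = imaginary)."""
    return [complex(l, r) for l, r in zip(left, right)]


def complex_to_stereo(z):
    """Unpack a complex signal back into its two real channels."""
    return [v.real for v in z], [v.imag for v in z]


def circularity_quotient(z):
    """|E[z^2]| / E[|z|^2]: 0 for a circular signal, 1 for a maximally noncircular one."""
    variance = sum(abs(v) ** 2 for v in z) / len(z)
    pseudo_variance = sum(v * v for v in z) / len(z)
    return abs(pseudo_variance) / variance if variance else 0.0


def widely_linear_filter(y, h, h_conj):
    """Widely linear FIR estimate: out(k) = sum_i conj(h[i]) y(k-i) + conj(h_conj[i]) conj(y(k-i)).

    h and h_conj are hypothetical coefficient lists of equal length (assumption).
    """
    out = []
    for k in range(len(y)):
        acc = 0j
        for i in range(len(h)):
            if k - i >= 0:
                acc += h[i].conjugate() * y[k - i]
                acc += h_conj[i].conjugate() * y[k - i].conjugate()
        out.append(acc)
    return out


# Toy stereo pair: the right channel is a scaled copy of the left one.
left = [1.0, 2.0, 3.0, -1.0]
right = [0.5, 1.0, 1.5, -0.5]
y = stereo_to_complex(left, right)
print(circularity_quotient(y))

# With h = [1] and h_conj = [0] the filter reduces to the identity.
identity = widely_linear_filter(y, [1 + 0j], [0j])
print(complex_to_stereo(identity))
```

A strictly linear filter would use only the `h` term; the `h_conj` term is what lets the estimator exploit a nonzero pseudo-variance, which is the noncircularity the abstract refers to.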
Benesty J.,University of Quebec |
Chen J.,Northwestern Polytechnical University |
Huang Y.,WEVOICE, Inc. |
Gaensler T.,Mh Acoustics LLC
Journal of the Acoustical Society of America | Year: 2012
This paper addresses the problem of noise reduction in the time domain where the clean speech sample at every time instant is estimated by filtering a vector of the noisy speech signal. Such a clean speech estimate consists of both the filtered speech and residual noise (filtered noise) as the noisy vector is the sum of the clean speech and noise vectors. Traditionally, the filtered speech is treated as the desired signal after noise reduction. This paper proposes to decompose the clean speech vector into two orthogonal components: one is correlated and the other is uncorrelated with the current clean speech sample. While the correlated component helps estimate the clean speech, it is shown that the uncorrelated component interferes with the estimation, just as the additive noise does. Based on this orthogonal decomposition, the paper presents a way to define the error signal and cost functions and addresses the issue of how to design different optimal noise reduction filters by optimizing these cost functions. Specifically, it discusses how to design the maximum SNR filter, the Wiener filter, the minimum variance distortionless response (MVDR) filter, the tradeoff filter, and the linearly constrained minimum variance (LCMV) filter. It demonstrates that the maximum SNR, Wiener, MVDR, and tradeoff filters are identical up to a scaling factor. It also shows from the orthogonal decomposition that many performance measures can be defined, which seem to be more appropriate than the traditional ones for the evaluation of the noise reduction filters. © 2012 Acoustical Society of America.
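The orthogonal decomposition described in this abstract can be illustrated numerically. The sketch below is an assumption-laden example, not the paper's code: the function name `decompose`, the sample-average estimate of the correlation vector, and the toy signal are all chosen for illustration. Each length-L clean vector is split into a part proportional to the current sample and a remainder that is uncorrelated with that sample.

```python
import math


def decompose(x, L):
    """Split each length-L vector [x(k), x(k-1), ..., x(k-L+1)] into
    rho * x(k) (correlated with the current sample) plus a remainder
    that is uncorrelated with x(k), using sample averages."""
    frames = [[x[k - i] for i in range(L)] for k in range(L - 1, len(x))]
    n = len(frames)
    power = sum(f[0] ** 2 for f in frames) / n
    # Normalized correlation between x(k-i) and x(k); rho[0] is 1 by construction.
    rho = [sum(f[i] * f[0] for f in frames) / n / power for i in range(L)]
    uncorrelated = [[f[i] - rho[i] * f[0] for i in range(L)] for f in frames]
    return rho, frames, uncorrelated


# Toy "clean speech": a sum of two sinusoids (illustrative only).
x = [math.sin(0.3 * k) + 0.5 * math.sin(1.1 * k) for k in range(200)]
rho, frames, uncorrelated = decompose(x, 3)

# The remainder has (numerically) zero correlation with the current sample.
for i in range(3):
    cross = sum(u[i] * f[0] for u, f in zip(uncorrelated, frames)) / len(frames)
    print(i, rho[i], cross)
```

Because the remainder carries no information about the current sample in the second-order sense, it is natural to group it with the noise when defining the error signal, which is the point the abstract makes.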
Huang Y.A.,WEVOICE, Inc. |
Benesty J.,University of Quebec at Montreal
IEEE Transactions on Audio, Speech and Language Processing | Year: 2012
This paper focuses on the class of single-channel noise reduction methods that are performed in the frequency domain via the short-time Fourier transform (STFT). The simplicity and relative effectiveness of this class of approaches make them the dominant choice in practical systems. Over the past years, many popular algorithms have been proposed. These algorithms, no matter how they are developed, have one feature in common: the solution is eventually formulated as a gain function applied to the STFT of the noisy signal only in the current frame, implying that the interframe correlation is ignored. This assumption is not accurate for speech enhancement since speech is a highly self-correlated signal. In this paper, by taking the interframe correlation into account, a new linear model for speech spectral estimation and some optimal filters are proposed. They include the multi-frame Wiener and minimum variance distortionless response (MVDR) filters. With these filters, both the narrowband and fullband signal-to-noise ratios (SNRs) can be improved. Furthermore, with the MVDR filter, speech distortion at the output can be zero. Simulations present promising results in support of the claimed merits obtained by theoretical analysis. © 2011 IEEE.
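To make the zero-distortion claim for the MVDR filter concrete, here is a minimal sketch for one frequency bin and two consecutive STFT frames. It uses the standard MVDR form h = Φ⁻¹γ / (γᴴΦ⁻¹γ), which is assumed here rather than taken from the paper; the interference-plus-noise covariance `phi_in` and the interframe correlation vector `gamma` are made-up values, and the exact conjugation conventions in the paper may differ.

```python
def inv2(m):
    """Inverse of a 2x2 (complex) matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def mvdr(phi_in, gamma):
    """Minimize output interference-plus-noise power subject to h^H gamma = 1."""
    w = matvec(inv2(phi_in), gamma)
    scale = inner(gamma, w)  # gamma^H phi_in^{-1} gamma, real for Hermitian phi_in
    return [wi / scale for wi in w]


# Hypothetical statistics for a single frequency bin and two frames.
phi_in = [[2 + 0j, 0.5 + 0.5j],
          [0.5 - 0.5j, 1 + 0j]]   # Hermitian, positive definite
gamma = [1 + 0j, 0.6 - 0.2j]      # first entry 1: the current frame itself

h = mvdr(phi_in, gamma)
print(inner(h, gamma))  # distortionless constraint, numerically close to 1
```

A single-frame gain function corresponds to using only the first coefficient; the extra coefficients on past frames are what the interframe correlation makes available.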
Agency: National Aeronautics and Space Administration | Branch: | Program: SBIR | Phase: Phase II | Award Amount: 599.56K | Year: 2009
For in-helmet voice communication, the currently used Communication-Cap-based Audio (CCA) systems have a number of recognized logistical issues and inconveniences that cannot be resolved with incremental improvements to the basic design of the CCA systems. The objective of this research project is to develop an Integrated Spacesuit Audio (ISA) system that offers performance similar to that of a CCA system while providing users with inherent comfort and ease of use. In Phase I, the feasibility of using microphone array beamforming or multichannel noise reduction plus a single-channel postfilter to combat a variety of types of in-helmet noise was validated. Comparative simulations indicated that novel multichannel noise reduction is more practical and more effective than traditional microphone array beamforming for ISA systems. Phase II will pursue advanced development and prototyping of the proposed technical solution for the ISA system. Directions for improvement that were established in Phase I will be carefully followed, subjective evaluation will be carried out, and the ISA designs will be further optimized. Finally, a real-time demo system will be built using either DSP or FPGA. It should be ready for testing and use by NASA at the end of Phase II.
Agency: National Aeronautics and Space Administration | Branch: | Program: SBIR | Phase: Phase I | Award Amount: 97.06K | Year: 2011
Acoustic surveys are now performed using hand-held devices once every two months on the International Space Station (ISS). This takes a considerable amount of precious crew time, and such sporadic monitoring is not adequate. This Phase I proposal is concerned with developing an automated sound level and noise exposure monitoring system running on a ZigBee-compliant wireless sensor network. In the proposed research, we will focus on a preliminary design of the monitoring terminal that integrates the functionalities of microphone, data sampling, and signal processing along with data communication through a ZigBee wireless channel. Sufficient compliance of the developed sound level meter and noise dosimeter with the related ANSI standards will be tested and demonstrated. This plan takes advantage of our broad knowledge of acoustic signal processing and ZigBee wireless sensor networks, and will benefit from our experience and skills in the development of embedded digital signal processing systems using either FPGA (field programmable gate array) or DSP (digital signal processor). The Phase I effort will provide a foundation for prototype design to be conducted in Phase II.
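The two quantities such a monitoring terminal reports, sound level and noise dose, can be sketched as follows. This is an illustrative calculation only and not the proposed system's design: the function names are hypothetical, no frequency weighting (such as A-weighting) is applied, and the default criterion level (85 dB), exchange rate (3 dB), and criterion duration (8 hours) are assumed parameters that a real dosimeter would take from the applicable standard.

```python
import math

P_REF = 20e-6  # reference sound pressure in pascals (20 micropascals)


def equivalent_level_db(pressures_pa):
    """Equivalent continuous sound pressure level of a block of pressure samples (unweighted)."""
    mean_square = sum(p * p for p in pressures_pa) / len(pressures_pa)
    return 10.0 * math.log10(mean_square / P_REF ** 2)


def noise_dose_percent(segments, criterion_db=85.0, exchange_db=3.0, criterion_hours=8.0):
    """Noise dose as a percentage of the allowed daily exposure.

    segments: list of (level_db, duration_hours) pairs.
    The default criterion parameters are assumptions for this example.
    """
    dose = 0.0
    for level_db, hours in segments:
        # Allowed time halves for every exchange_db above the criterion level.
        allowed_hours = criterion_hours / 2.0 ** ((level_db - criterion_db) / exchange_db)
        dose += hours / allowed_hours
    return 100.0 * dose


# A constant 1 Pa pressure corresponds to roughly 94 dB SPL.
print(equivalent_level_db([1.0, -1.0, 1.0, -1.0]))

# Four hours at 88 dB uses the same allowance as eight hours at 85 dB.
print(noise_dose_percent([(88.0, 4.0)]))
```

In an embedded terminal the level would be computed block by block from the sampled microphone signal, and the dose accumulated over the monitoring period before being sent over the wireless link.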