Analysis Synthesis Team

Paris, France


Obin N., Analysis Synthesis Team | Rodet X., Analysis Synthesis Team | Lacheret A., University of Paris Ouest Nanterre La Défense
Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 | Year: 2010

This paper presents a study on the use of deep syntactical features to improve prosody modeling. A French linguistic processing chain based on linguistic preprocessing, morpho-syntactical labeling, and deep syntactical parsing is used to extract syntactical features from an input text. These features are used to define syntactical feature sets of varying depth. Such feature sets are compared on the basis of an HMM-based prosodic structure model. High-level syntactical features are shown to significantly improve the performance of the model (up to a 21% error reduction combined with a 19% BIC reduction). © 2010 ISCA.
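
As a hedged illustration of the model-comparison methodology described above, the sketch below fits a Gaussian HMM to two candidate feature matrices and ranks them by BIC. It assumes the hmmlearn package; the feature matrices, state count, and parameter count are hypothetical stand-ins, not the paper's actual setup.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def bic_score(X, n_states=4):
    """Fit a Gaussian HMM to feature sequence X and return its BIC."""
    model = GaussianHMM(n_components=n_states, covariance_type="diag")
    model.fit(X)
    log_l = model.score(X)               # total log-likelihood
    n, d = X.shape
    # Free parameters: transitions + initial probs + means + diag covariances
    k = n_states * (n_states - 1) + (n_states - 1) + 2 * n_states * d
    return -2.0 * log_l + k * np.log(n)

# Toy stand-ins for feature matrices produced by the processing chain;
# lower BIC = better fit/complexity trade-off.
rng = np.random.default_rng(0)
X_shallow = rng.standard_normal((500, 3))   # e.g. POS-tag features (hypothetical)
X_deep = rng.standard_normal((500, 8))      # e.g. deep-syntax features (hypothetical)
print(bic_score(X_shallow), bic_score(X_deep))
```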


Liuni M., Analysis Synthesis Team | Röbel A., Analysis Synthesis Team | Matusiak E., University of Vienna | Romito M., University of Pisa | Rodet X., Analysis Synthesis Team
IEEE Transactions on Audio, Speech and Language Processing | Year: 2013

We present an algorithm for sound analysis and re-synthesis with local automatic adaptation of time-frequency resolution. The reconstruction formula we propose is highly efficient and gives a good approximation of the original signal from analyses with different time-varying resolutions within complementary frequency bands: this is a typical case where perfect reconstruction cannot in general be achieved with fast algorithms, leaving a reconstruction error to be minimized. We provide a theoretical upper bound for the reconstruction error of our method, and an example of automatic adaptive analysis and re-synthesis of a music sound. © 2006-2012 IEEE.
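
To make the complementary-bands idea concrete, here is a minimal sketch (not the paper's algorithm): the same signal is analyzed with a long-window STFT for the low band and a short-window STFT for the high band, each band is inverted separately, and the sum approximates the original, leaving exactly the kind of reconstruction error the paper bounds. The 1 kHz split frequency and window sizes are arbitrary choices for the example.

```python
import numpy as np
from scipy.signal import stft, istft

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

def fix_len(y, n):
    """Trim or zero-pad y to exactly n samples."""
    return y[:n] if len(y) >= n else np.pad(y, (0, n - len(y)))

def band_limited(x, nperseg, keep):
    """Analyze with one resolution, keep one frequency band, invert."""
    f, _, Z = stft(x, fs=fs, nperseg=nperseg)
    Z[~keep(f), :] = 0.0                  # zero out the complementary band
    _, xr = istft(Z, fs=fs, nperseg=nperseg)
    return fix_len(xr, len(x))

x_low = band_limited(x, nperseg=1024, keep=lambda f: f < 1000)    # fine frequency resolution
x_high = band_limited(x, nperseg=128, keep=lambda f: f >= 1000)   # fine time resolution
x_hat = x_low + x_high

err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
print(f"relative reconstruction error: {err:.3f}")    # nonzero: not perfect
```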


Lagrange M., Analysis Synthesis Team | Scavone G., McGill University | Depalle P., McGill University
IEEE Transactions on Audio, Speech and Language Processing | Year: 2010

This paper introduces an analysis/synthesis scheme for the reproduction of sounds generated by sustained contact between rigid bodies. This scheme is rooted in a Source/Filter decomposition of the sound, where the filter is described as a set of poles and the source as a set of impulses representing the energy transfer between the interacting objects. Compared to single impacts, sustained contact interactions like rolling and sliding make the estimation of the parameters of the Source/Filter model challenging for two reasons. First, the objects are almost continuously interacting. Second, the source is generally unknown and therefore has to be modeled in a generic way. To tackle these issues, the proposed analysis/synthesis scheme combines advanced analysis techniques for the estimation of the filter parameters with a flexible model of the source, allowing the modeling of a wide range of sounds. Examples are presented for objects of various shapes and sizes, rolling or sliding over plates of different materials. To demonstrate the versatility of the approach, the system is also applied to the modeling of sounds produced by percussive musical instruments. © 2006 IEEE.
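
A toy sketch of the Source/Filter idea follows, under stated assumptions: the filter (pole) part is estimated here with ordinary autocorrelation LPC rather than the paper's advanced techniques, the "recorded impact" is synthetic, and the source is a random sparse impulse train standing in for a rolling interaction.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter

def lpc(x, order):
    """All-pole (LPC) coefficients via the autocorrelation method."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    a = solve_toeplitz((r[:order], r[:order]), r[1:order + 1])
    return np.concatenate(([1.0], -a))    # A(z) = 1 - sum_k a_k z^-k

fs = 16000
rng = np.random.default_rng(0)

# Synthetic "impact" recording: impulse response of a two-pole resonator
# (pole radius 0.98 at 800 Hz), plus a little noise for conditioning.
impulse = np.zeros(4000)
impulse[0] = 1.0
x = lfilter([1.0], [1.0, -1.96 * np.cos(2 * np.pi * 800 / fs), 0.9604], impulse)
x += 1e-6 * rng.standard_normal(len(x))

a = lpc(x, order=8)                       # estimated filter (pole) part

# Generic source model: a sparse train of random impulses standing in
# for the energy transfers of a rolling interaction.
source = np.zeros(fs)
hits = rng.integers(0, fs, size=200)
source[hits] = rng.uniform(0.2, 1.0, size=200)
y = lfilter([1.0], a, source)             # resynthesized sustained-contact sound
```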


Yeh C., Analysis Synthesis Team | Röbel A., Analysis Synthesis Team
Proceedings of the 9th International Conference on Digital Audio Effects, DAFx 2006 | Year: 2006

We describe a novel algorithm for the estimation of the colored noise level in audio signals with mixed noise and sinusoidal components. The noise envelope model is based on the assumptions that the envelope varies slowly with frequency and that the magnitudes of the noise peaks obey a Rayleigh distribution. Our method extends a recently proposed approach to classifying spectral peaks as sinusoids or noise, taking a noise envelope model into account to improve the detection of sinusoidal peaks. By iteratively evaluating and adapting the noise envelope model, the classification of noise and sinusoidal peaks is refined until the detected noise peaks are coherently explained by the noise envelope model. Examples of estimating both white and colored noise levels are demonstrated.
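
The sketch below illustrates the iterative peak-classification loop in a much-simplified form (an assumption, not the paper's exact procedure): spectral peaks currently labelled as noise define a smoothed noise envelope, peaks far above that envelope are relabelled as sinusoids, and the loop repeats until the labels stabilize. The median filter, interpolation, and the ad hoc factor of 4 are illustrative choices.

```python
import numpy as np
from scipy.signal import find_peaks, medfilt

fs = 16000
rng = np.random.default_rng(1)
n = 4096
x = 0.1 * rng.standard_normal(n)             # noise floor
for f0 in (500, 1200, 3000):                 # three sinusoidal components
    x += np.sin(2 * np.pi * f0 * np.arange(n) / fs)

mag = np.abs(np.fft.rfft(x * np.hanning(n)))
peaks, _ = find_peaks(mag)
is_noise = np.ones(len(peaks), dtype=bool)   # start: everything is noise

for _ in range(10):
    # Noise envelope: median-smoothed magnitudes of the current noise peaks.
    env = np.interp(peaks, peaks[is_noise], medfilt(mag[peaks[is_noise]], 21))
    # Rayleigh-motivated threshold: noise peak magnitudes rarely exceed a
    # few times their local mean level (factor 4 chosen ad hoc here).
    new_noise = mag[peaks] < 4.0 * env
    if np.array_equal(new_noise, is_noise):  # labels stable -> done
        break
    is_noise = new_noise

print(f"{np.count_nonzero(~is_noise)} peaks classified as sinusoidal")
```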


Caetano M., Analysis Synthesis Team | Rodet X., Analysis Synthesis Team
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Year: 2012

The model used to represent musical instrument sounds plays a crucial role in the quality of sound transformations. Ideally, the representation should be compact and accurate, while its parameters should give the flexibility to independently manipulate perceptually related features of the sounds. This work describes a source-filter model for musical instrument sounds based on the sinusoidal plus residual decomposition. The sinusoidal component is modeled as sinusoidal partial tracks (source) and a time-varying spectral envelope (filter), and the residual is represented as white noise (source) shaped by a time-varying spectral envelope (filter). This article presents estimation and representation techniques that give fully independent and intuitive control of the spectral envelope model and the frequencies of the partials to perform perceptually related sound transformations. The result of a listening test confirmed that, in general, the sounds resynthesized from the source-filter model are perceptually similar to the original recordings. © 2012 IEEE.
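
As a sketch of the residual branch only (a simplification assumed here, not the paper's estimator), the code below measures a time-varying spectral envelope on a stand-in residual by cepstral smoothing and re-imposes it on white noise, i.e. the noise-source/envelope-filter half of the model.

```python
import numpy as np
from scipy.signal import stft, istft

def smooth_env(frame_mag, n_cep=30):
    """Crude cepstral smoothing of one magnitude spectrum."""
    log_mag = np.log(frame_mag + 1e-12)
    cep = np.fft.irfft(log_mag)
    cep[n_cep:-n_cep] = 0.0                  # keep low quefrencies only
    return np.exp(np.fft.rfft(cep).real)

fs = 16000
rng = np.random.default_rng(2)
residual = rng.standard_normal(fs)           # stand-in for an extracted residual
f, t, Z = stft(residual, fs=fs, nperseg=512)

# Time-varying envelope: one smoothed spectrum per frame.
env = np.array([smooth_env(np.abs(Z[:, i])) for i in range(Z.shape[1])]).T

# Source: white noise; filter: the measured time-varying envelope.
noise = rng.standard_normal(fs)
_, _, N = stft(noise, fs=fs, nperseg=512)
_, y = istft(N * env, fs=fs, nperseg=512)    # filtered-noise residual model
```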


O'Leary S., Analysis Synthesis Team | Röbel A., Analysis Synthesis Team
IEEE/ACM Transactions on Audio, Speech and Language Processing | Year: 2016

Sound texture synthesis has applications in creating audio scenes for film and video games. In this paper, a novel algorithm for sound texture synthesis is presented. The goal of this algorithm is to produce new examples of a given sampled texture, the synthesized textures being of any desired duration. The algorithm is based on a montage approach to synthesis: the original sample is cut into small pieces, referred to as atoms, and these atoms are concatenated together in a new sequence that preserves certain structures of the original texture. The sequence modelling of the atoms has two levels: atoms are concatenated to create segments, and segments are concatenated, based on their history, to create textures. This approach addresses the problems of repetition associated with sampling-based sound texture synthesis techniques. Listening tests show that the results of the synthesis are very promising for a broad range of textures, including quasi-periodic and more random textures. © 2014 IEEE.
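
Here is a deliberately reduced sketch of the montage idea, with loud assumptions: fixed-size atoms, spectral Euclidean distance as similarity, and a single-level random walk instead of the paper's two-level, history-based sequence model.

```python
import numpy as np

def synthesize_texture(x, atom_len=1024, n_out_atoms=200, k=5, seed=0):
    """Concatenate atoms of x in a new order that follows local structure."""
    rng = np.random.default_rng(seed)
    n = len(x) // atom_len
    atoms = x[:n * atom_len].reshape(n, atom_len)
    specs = np.abs(np.fft.rfft(atoms, axis=1))     # per-atom spectra
    out = [int(rng.integers(0, n))]
    for _ in range(n_out_atoms - 1):
        # Spectrum that "should" come next in the original sequence...
        target = specs[(out[-1] + 1) % n]
        # ...and the k atoms that resemble it: jump to one of them.
        d = np.linalg.norm(specs - target, axis=1)
        out.append(int(rng.choice(np.argsort(d)[:k])))
    return np.concatenate([atoms[i] for i in out])

fs = 16000
texture = np.random.default_rng(3).standard_normal(fs * 2)   # toy "texture"
y = synthesize_texture(texture)              # 200 atoms, any desired duration
```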


Caetano M., Analysis Synthesis Team | Rodet X., Analysis Synthesis Team
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Year: 2011

The amplitude modulations of musical instrument sounds and speech are important perceptual cues. Accurately estimating the amplitude (or, equivalently, energy) envelope of a time-domain signal (waveform) is, however, not a trivial task. Ideally, the amplitude envelope should outline the waveform, connecting the main peaks while avoiding overfitting. In this work we propose a method to obtain a smooth function that approximately matches the main peaks of the waveform using true envelope estimation, dubbed the true amplitude envelope. True envelope is a cepstral smoothing technique that has been shown to outperform traditional envelope estimation techniques both in accuracy of estimation and ease of order selection. The true amplitude envelope gives a reliable estimate that closely follows sudden variations in amplitude and avoids ripples in more stable regions, with near-optimal order selection depending on the fundamental frequency of the signal. © 2011 IEEE.
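
The following sketch shows one reading of the true-envelope iteration transplanted to the time domain, as described above: the log amplitude is low-pass smoothed, and the smoothing target is repeatedly replaced by the pointwise maximum of signal and smoothed curve, so the final envelope rides on the main peaks without ripple. The cutoff and iteration count are ad hoc choices.

```python
import numpy as np
from scipy.signal import hilbert

def true_amplitude_envelope(x, n_keep=64, n_iter=50):
    """Iterative low-pass smoothing that ends up riding on the main peaks."""
    a = np.log(np.abs(hilbert(x)) + 1e-9)    # log amplitude of analytic signal
    target = a.copy()
    for _ in range(n_iter):
        spec = np.fft.rfft(target)
        spec[n_keep:] = 0.0                  # low-pass smoothing step
        smooth = np.fft.irfft(spec, n=len(target))
        target = np.maximum(a, smooth)       # envelope must stay above the peaks
    return np.exp(smooth)

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 220 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
env = true_amplitude_envelope(x)             # smooth curve over the waveform peaks
```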


Caetano M., Analysis Synthesis Team | Rodet X., Analysis Synthesis Team
13th International Conference on Digital Audio Effects, DAFx 2010 Proceedings | Year: 2010

The aim of sound morphing is to obtain a sound that falls perceptually between two (or more) sounds. Ideally, we want to morph perceptually relevant features of sounds and be able to manipulate them independently. In this work we present a method to obtain perceptually intermediate spectral envelopes guided by high-level spectral shape descriptors, and a technique that employs evolutionary computation to independently manipulate the timbral features captured by the descriptors. High-level descriptors are measures of the acoustic correlates of salient timbre dimensions derived from perceptual studies, such that the manipulation of the descriptors corresponds to potentially interesting timbral variations.
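
One small piece of the idea can be sketched in hedged form: given two toy spectral envelopes, a 1-D grid search finds the interpolation weight whose spectral centroid best matches the linearly interpolated target centroid. The paper uses evolutionary computation over several descriptors; a single descriptor and grid search are used here purely for illustration.

```python
import numpy as np

def centroid(env, freqs):
    """Spectral centroid of an envelope, in Hz."""
    return np.sum(freqs * env) / np.sum(env)

freqs = np.linspace(0, 8000, 257)
env_a = np.exp(-freqs / 1000.0)                       # toy source envelope
env_b = np.exp(-((freqs - 3000) / 800.0) ** 2)        # toy target envelope

morph_factor = 0.5
target_c = ((1 - morph_factor) * centroid(env_a, freqs)
            + morph_factor * centroid(env_b, freqs))  # descriptor value to hit

# 1-D grid search for the weight whose centroid best matches the target.
alphas = np.linspace(0, 1, 101)
best = min(alphas,
           key=lambda a: abs(centroid((1 - a) * env_a + a * env_b, freqs) - target_c))
env_morph = (1 - best) * env_a + best * env_b
print(f"weight {best:.2f} gives centroid {centroid(env_morph, freqs):.1f} Hz "
      f"(target {target_c:.1f})")
```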


Burred J.J., Analysis Synthesis Team | Burred J.J., Audionamix | Röbel A., Analysis Synthesis Team
13th International Conference on Digital Audio Effects, DAFx 2010 Proceedings | Year: 2010

We propose a new statistical model of musical timbre that handles the different segments of the temporal envelope (attack, sustain and release) separately in order to account for their different spectral and temporal behaviors. The model is based on a reduced-dimensionality representation of the spectro-temporal envelope. Temporal coefficients corresponding to the attack and release segments are subjected to explicit trajectory modeling based on a non-stationary Gaussian Process. Coefficients corresponding to the sustain phase are modeled as a multivariate Gaussian. A compound similarity measure associated with the segmental model is proposed and successfully tested in instrument classification experiments. Apart from its use in a statistical framework, the modeling method allows intuitive and informative visualizations of the characteristics of musical timbre.
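
A minimal sketch of the two segment models, under stated assumptions: the reduced-dimensionality spectro-temporal coefficients are simulated random data, the sustain frames are fit with a multivariate Gaussian (mean and covariance), and the attack trajectory with scikit-learn's Gaussian process regressor, which stands in for the paper's non-stationary Gaussian Process.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(4)

# Sustain segment: frames of reduced envelope coefficients (simulated),
# modelled as a multivariate Gaussian.
sustain = rng.standard_normal((200, 5))          # 200 frames x 5 dimensions
mu = sustain.mean(axis=0)
cov = np.cov(sustain, rowvar=False)

# Attack segment: explicit trajectory model over normalized time.
t_attack = np.linspace(0, 1, 40).reshape(-1, 1)
traj = np.sin(3 * t_attack) + 0.1 * rng.standard_normal(t_attack.shape)
gp = GaussianProcessRegressor().fit(t_attack, traj)
mean_traj, std_traj = gp.predict(t_attack, return_std=True)
```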


Caetano M., Analysis Synthesis Team | Rodet X., Analysis Synthesis Team
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Year: 2011

The goal of sound morphing by feature interpolation is to obtain sounds whose feature values are intermediate between those of the source and target sounds. To do this, we must be able to resynthesize sounds that present a set of predefined feature values, a notoriously difficult problem. In this work, we present morphing techniques to obtain hybrid musical instrument sounds whose feature values match the ideal interpolated values as closely as possible. When the features capture perceptually relevant information, a morphed sound whose features are interpolated is perceptually intermediate. The features we use are acoustic correlates of salient timbre dimensions derived from perceptual studies, such that a sound whose feature values lie between those of two sounds would be placed between them in the underlying timbre space. We measure the perceptual impact of the morphed sounds directly by the feature values, using them as an objective measure with which to evaluate the results. Thus we consider that the morphed sounds change perceptually linearly when the corresponding feature values vary linearly. © 2011 IEEE.
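
The evaluation logic lends itself to a short sketch (the spectral-centroid feature and the naive linear envelope morph are illustrative stand-ins, not the paper's features or morphing technique): sweep the morph factor, measure the feature on each morphed sound, and compare it against the straight line between source and target feature values.

```python
import numpy as np

freqs = np.linspace(0, 8000, 257)
env_src = np.exp(-freqs / 1200.0)                     # toy source envelope
env_tgt = np.exp(-((freqs - 2500) / 900.0) ** 2)      # toy target envelope

def feat(env):
    """Spectral centroid, Hz: one stand-in perceptual feature."""
    return np.sum(freqs * env) / np.sum(env)

f_src, f_tgt = feat(env_src), feat(env_tgt)
for alpha in np.linspace(0, 1, 5):
    morph = (1 - alpha) * env_src + alpha * env_tgt   # naive envelope morph
    ideal = (1 - alpha) * f_src + alpha * f_tgt       # ideal interpolated feature
    # Deviation from the ideal line serves as the objective error measure.
    print(f"alpha={alpha:.2f}  centroid={feat(morph):7.1f}  ideal={ideal:7.1f}")
```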
