Panda A.,TCS Innovation Labs Mumbai
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | Year: 2015

This paper addresses the problem of speaker verification in the presence of additive noise. We propose a fast implementation of the Psychoacoustic Model Compensation (Psy-Comp) scheme for static features, along with model-domain mean and variance normalization, for robust speaker recognition in noisy conditions. The proposed algorithms are validated through experiments on the noise-corrupted NIST-2000 speaker recognition database. We show that the Psy-Comp scheme, together with model-domain mean and variance normalization, provides a significant performance gain over the Vector Taylor Series (VTS) scheme and the feature-domain cepstral mean and variance normalization scheme. Moreover, the computational cost of the proposed method is significantly lower than that of the VTS scheme. Copyright © 2015 ISCA.
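
For readers unfamiliar with the feature-domain baseline named in the abstract above, the following is a minimal sketch of per-utterance cepstral mean and variance normalization. It is not the paper's Psy-Comp or model-domain method; the function name `cmvn`, the `eps` guard, and the synthetic 13-coefficient input are assumptions made purely for illustration.

```python
import numpy as np


def cmvn(features, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization.

    features: array of shape (num_frames, num_coeffs), e.g. cepstral
    coefficients for one utterance (the layout is an assumption).
    Returns an array of the same shape in which each coefficient has
    approximately zero mean and unit variance across frames.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)   # per-coefficient mean over frames
    std = features.std(axis=0)     # per-coefficient standard deviation
    return (features - mean) / (std + eps)  # eps avoids division by zero


# Synthetic stand-in for real cepstral features: 200 frames, 13 coefficients.
rng = np.random.default_rng(0)
frames = rng.normal(loc=3.0, scale=2.0, size=(200, 13))
normalized = cmvn(frames)
```

Normalizing each utterance in this way removes a constant offset and scale from every coefficient, which is why it is a common baseline against which noise-compensation schemes are compared.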


De A.,Tata Consultancy Services Ltd. | Kopparapu S.K.,TCS Innovation Labs Mumbai
Proceedings of the 2013 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2013 | Year: 2013

Supervised learning techniques have long been used to analyze unstructured natural language text documents. However, supervised learning techniques are not only computationally intensive but also often require large training corpora. Supervised techniques often fail when such training corpora are either (a) not available or (b) available but not statistically significant enough to enable learning. In many practical scenarios, unsupervised learning techniques become the de facto choice because no training corpus is available. In this paper we describe an unsupervised text analysis technique and demonstrate its usefulness in a real-life application: aggregating the ideas posted on our company's Ideas Portal website. © 2013 IEEE.


Mishra N.,TCS Innovation Labs Delhi | Kopparapu S.K.,TCS Innovation Labs Mumbai
Advances in Intelligent Systems and Computing | Year: 2014

Insignia identification is an important task, especially as a self-help application on mobile phones that can be used in museums. We propose a knowledge-driven rule-based approach and a learning-based approach using an artificial neural network (ANN) for insignia recognition. Both approaches share a common insignia image segmentation step followed by extraction of simple yet effective features. The features are chosen to require frugal processing and computation, to suit the computing power of mobile devices. In both approaches we identify each extracted segment in the insignia; correct recognition of the segments, followed by post-processing, results in identification of the insignia. Experimental results show that both approaches work equally well, with recognition accuracy of over 90% for identification of the segments and 100% for identification of the actual insignia. © Springer International Publishing Switzerland 2014.


Ahmed I.,TCS Innovation Labs Mumbai | Kopparapu S.K.,TCS Innovation Labs Mumbai
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | Year: 2013

A frugal approach to constructing speech corpora, especially for resource-deficient languages, is to exploit collections of speech and corresponding text data available in audio books, news, and lectures. However, using these resources to build speech corpora requires aligning the long-duration speech data with the accompanying text data. Existing techniques for automatic speech-text alignment of long audio files assume the availability of a basic speech recognition engine and hence cannot be used directly for resource-deficient languages. In this paper, we propose a novel technique for sentence-level alignment of long speech-text data that exploits the syllable information in the speech and text data. The proposed technique does not depend on the availability of any speech recognition models and hence can be used for resource-deficient languages. Copyright © 2013 ISCA.


Imran A.,TCS Innovation Labs Mumbai | Sunil K.,TCS Innovation Labs Mumbai
2012 IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2012 | Year: 2012

Building speech recognition applications for resource-deficient languages is a challenge because of the unavailability of a speech corpus. A speech corpus is a central element for training the acoustic models used in a speech recognition engine. Constructing a speech corpus for a language is an expensive, time-consuming, and laborious process. This paper presents a mechanism for developing an inexpensive (frugal) speech corpus for the resource-deficient languages Indian English and Hindi by exploiting existing collections of online speech data. For the purpose of demonstration, we use online audio news archives to build the frugal speech corpus. We then use this speech corpus to train acoustic models and evaluate the performance of speech recognition on Indian English and Hindi speech. © 2012 IEEE.
