Hong Kong

Ghoshal A.,Saarland University | Povey D.,Microsoft | Agarwal M.,IIIT Allahabad | Akyazi P.,Bogazici University | And 9 more authors.
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Year: 2010

In this paper, we present a novel approach for estimating feature-space maximum likelihood linear regression (fMLLR) transforms for full-covariance Gaussian models by directly maximizing the likelihood function by repeated line search in the direction of the gradient. We do this in a pre-transformed parameter space such that an approximation to the expected Hessian is proportional to the unit matrix. The proposed algorithm is at least as efficient as standard approaches, and is more flexible because it can naturally be combined with sets of basis transforms and with full covariance and subspace precision and mean (SPAM) models. ©2010 IEEE.


Goel N.,Go Vivace Inc. | Thomas S.,Johns Hopkins University | Agarwal M.,IIIT Allahabad | Akyazi P.,Bogazici University | And 9 more authors.
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Year: 2010

Preparation of a lexicon for speech recognition systems can be a significant effort in languages where the written form is not exactly phonetic. On the other hand, in languages where the written form is quite phonetic, some common words are often mispronounced. In this paper, we use a combination of lexicon learning techniques to explore whether a lexicon can be learned when only a small lexicon is available for bootstrapping. We discover that for a phonetic language such as Spanish, a lexicon learned this way can be better than one obtained from generic rules or hand-crafted pronunciations. For a more complex language such as English, we find that it is still possible but with some loss of accuracy. ©2010 IEEE.


Li H.,Hong Kong UST | Wei L.-Y.,Microsoft | Sander P.V.,Hong Kong UST | Fu C.-W.,Nanyang Technological University
ACM Transactions on Graphics | Year: 2010

Blue noise sampling is widely employed for a variety of imaging, geometry, and rendering applications. However, existing research has focused mainly on isotropic sampling, and challenges remain for the anisotropic scenario both in sample generation and quality verification. We present anisotropic blue noise sampling to address these issues. On the generation side, we extend dart throwing and relaxation, the two classical methods for isotropic blue noise sampling, to the anisotropic setting, while ensuring both high-quality results and efficient computation. On the verification side, although Fourier spectrum analysis has been one of the most powerful and widely adopted tools, so far it has been applied only to uniform isotropic samples. We introduce approaches based on warping and sphere sampling that allow us to extend Fourier spectrum analysis to adaptive and/or anisotropic samples; thus, we can detect problems in alternative anisotropic sampling techniques that were not found via prior verification. We present several applications of our technique, including stippling, visualization, surface texturing, and object distribution. © 2010 ACM.
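For readers unfamiliar with dart throwing, the following is a minimal sketch of the classical isotropic version that the abstract names as a starting point. It is not the paper's anisotropic extension: the unit-square domain, the Euclidean distance test, and the names `dart_throwing`, `radius`, and `max_attempts` are all assumptions made for illustration.

```python
import math
import random


def dart_throwing(radius, max_attempts=2000, seed=0):
    """Classical (isotropic) dart throwing in the unit square.

    A random candidate is accepted only if it is at least `radius`
    away from every sample accepted so far. This is a brute-force
    sketch with no acceleration grid.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    samples = []
    for _ in range(max_attempts):
        candidate = (rng.random(), rng.random())
        # Reject the dart if it lands too close to an existing sample.
        if all(math.dist(candidate, s) >= radius for s in samples):
            samples.append(candidate)
    return samples


points = dart_throwing(radius=0.1)
```

An anisotropic variant would replace the Euclidean test with a direction-dependent distance; how the paper defines that distance is not stated in the abstract, so it is left out here.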


Golin M.,Hong Kong UST | Zhang Y.,Hong Kong UST
IEEE Transactions on Information Theory | Year: 2010

The state of the art in length-limited Huffman coding (LLHC) algorithms is the Θ(nD)-time, Θ(n)-space one of Hirschberg and Larmore, where n is the size of the code and D ≤ n is the length restriction on the codewords. This is a very clever, very problem-specific technique. This paper presents a simple dynamic-programming (DP) method that solves the problem with the same time and space bounds. The fact that there was a Θ(nD)-time DP algorithm was previously known; it is a straightforward DP with the Monge property (which permits an order of magnitude speedup). It was not interesting, though, because it also required Θ(nD) space. The main result of this paper is the technique developed for reducing the space. It is quite simple and applicable to many other problems modeled by DPs with the Monge property. This is illustrated with examples from web-proxy design and wireless mobile paging. © 2006 IEEE.
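To make the LLHC problem concrete, here is a minimal sketch of the package-merge idea usually credited to Larmore and Hirschberg. It is not the DP method of this paper, and it is deliberately naive: it re-sorts at every level and stores index lists, so it does not reach the Θ(nD)-time, Θ(n)-space bounds quoted above. The function name `length_limited_code_lengths` and its parameters are hypothetical.

```python
def length_limited_code_lengths(weights, max_len):
    """Return optimal codeword lengths with every length <= max_len.

    Naive package-merge: each list entry is (weight, symbol indices).
    A symbol's code length is the number of selected entries that
    contain its index.
    """
    n = len(weights)
    if 2 ** max_len < n:
        raise ValueError("max_len is too small to encode n symbols")

    # One single-symbol entry per symbol, in ascending weight order.
    leaves = sorted(((w, [i]) for i, w in enumerate(weights)),
                    key=lambda p: p[0])
    items = leaves
    for _ in range(max_len - 1):
        # Package: pair adjacent entries; an unpaired last entry is dropped.
        packages = [
            (items[k][0] + items[k + 1][0], items[k][1] + items[k + 1][1])
            for k in range(0, len(items) - 1, 2)
        ]
        # Merge: combine the packages with a fresh copy of the leaves.
        items = sorted(leaves + packages, key=lambda p: p[0])

    lengths = [0] * n
    # The 2n - 2 cheapest entries determine the code lengths.
    for _, symbols in items[: 2 * n - 2]:
        for i in symbols:
            lengths[i] += 1
    return lengths


print(length_limited_code_lengths([1, 2, 4, 8], 3))  # [3, 3, 2, 1]
print(length_limited_code_lengths([1, 2, 4, 8], 2))  # [2, 2, 2, 2]
```

With the limit at 3 the result matches the unrestricted Huffman lengths; tightening the limit to 2 forces a fixed-length code at higher total cost.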


Scherzer D.,MPI Informatik | Yang L.,Hong Kong UST | Mattausch O.,Vienna University of Technology | Nehab D.,IMPA | And 3 more authors.
Computer Graphics Forum | Year: 2012

Nowadays, there is a strong trend towards rendering to higher-resolution displays and at high frame rates. This development aims at delivering more detail and better accuracy, but it also comes at a significant cost. Although graphics cards continue to evolve with an ever-increasing amount of computational power, the speed gain is easily counteracted by increasingly complex and sophisticated shading computations. For real-time applications, the direct consequence is that image resolution and temporal resolution are often the first candidates to bow to the performance constraints (e.g. although full HD is possible, PS3 and Xbox often render at lower resolutions). In order to achieve high-quality rendering at a lower cost, one can exploit temporal coherence (TC). The underlying observation is that a higher resolution and frame rate do not necessarily imply a much higher workload, but a larger amount of redundancy and a higher potential for amortizing rendering over several frames. In this survey, we investigate methods that make use of this principle and provide practical and theoretical advice on how to exploit TC for performance optimization. These methods not only allow incorporating more computationally intensive shading effects into many existing applications, but also offer exciting opportunities for extending high-end graphics applications to lower-spec consumer-level hardware. To this end, we first introduce the notion and main concepts of TC, including an overview of historical methods. We then describe a general approach, image-space reprojection, with several implementation algorithms that facilitate reusing shading information across adjacent frames. We also discuss data-reuse quality and performance related to reprojection techniques. Finally, in the second half of this survey, we demonstrate various applications that exploit TC in real-time rendering. © 2012 The Eurographics Association and Blackwell Publishing Ltd.
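As a rough illustration of image-space reprojection, the sketch below maps a surface point into the previous frame and reuses the cached shading only if the stored depth agrees. It is not taken from the survey. The row-major 4x4 matrix layout, the [-1, 1] normalized device coordinates, the nearest-neighbour lookup, the tolerance `eps`, and every function and parameter name are assumptions made for this example.

```python
def reproject(world_pos, prev_view_proj, width, height):
    """Project a world-space point with the previous frame's matrix.

    Returns (pixel_x, pixel_y, depth), or None if the point was not
    visible on screen in the previous frame.
    """
    p = (world_pos[0], world_pos[1], world_pos[2], 1.0)
    clip = [sum(prev_view_proj[r][c] * p[c] for c in range(4))
            for r in range(4)]
    if clip[3] <= 0.0:
        return None  # behind the previous camera
    ndc_x, ndc_y, ndc_z = (clip[i] / clip[3] for i in range(3))
    if not (-1.0 <= ndc_x <= 1.0 and -1.0 <= ndc_y <= 1.0):
        return None  # outside the previous viewport
    return ((ndc_x * 0.5 + 0.5) * width,
            (ndc_y * 0.5 + 0.5) * height,
            ndc_z)


def reuse_or_reshade(world_pos, prev_view_proj, prev_depth, prev_color,
                     shade, eps=1e-3):
    """Return cached shading on a cache hit, else call shade(world_pos)."""
    height, width = len(prev_depth), len(prev_depth[0])
    hit = reproject(world_pos, prev_view_proj, width, height)
    if hit is not None:
        px, py, depth = hit
        ix = min(int(px), width - 1)   # nearest-neighbour lookup
        iy = min(int(py), height - 1)
        # A depth mismatch means another surface occupied that pixel
        # in the previous frame, so the cached value cannot be reused.
        if abs(prev_depth[iy][ix] - depth) < eps:
            return prev_color[iy][ix]
    return shade(world_pos)
```

The depth comparison is what keeps stale shading from leaking across disocclusions; a production renderer would typically filter the lookup rather than use the nearest texel.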
