IFLYTEK Research

Hefei, China

Chen Y.,Hefei University of Technology | Li X.,iFlyTek Research | Li L.,Wuhan University of Technology | Liu G.,Hefei University of Technology | Xu G.,University of Technology, Sydney
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2016

The pervasive use of Location-based Social Networks calls for precise and personalized Point-of-Interest (POI) recommendation to predict which places users prefer. Modeling user mobility, as an important component of understanding user preference, plays an essential role in POI recommendation. However, existing methods mainly model user mobility by analyzing check-in data and fitting a distribution, without considering why a user checks in at a specific place from a psychological perspective. In this paper, we propose a POI recommendation algorithm that models user mobility by considering check-in data and geographical information. Specifically, with check-in data, we propose a novel probabilistic latent factor model to formulate user psychological behavior from the perspective of utility theory, which could help reveal the inner information underlying the comparative choice behaviors of users. The geographical behavior of all historical check-ins, captured by a power-law distribution, is then combined with the probabilistic latent factor model to form the POI recommendation algorithm. Extensive evaluation experiments conducted on two real-world datasets confirm the superiority of our approach over state-of-the-art methods. © Springer International Publishing Switzerland 2016.
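As an illustration of the power-law idea mentioned in this abstract, the sketch below fits p(d) = a * d^b to (distance, check-in probability) pairs by least squares in log-log space. The abstract does not give the functional form or the fitting procedure, so the formula, the function names, and the fitting method are all assumptions made for illustration.

```python
import math

def fit_power_law(distances, probabilities):
    """Fit p = a * d**b by ordinary least squares on (log d, log p).

    Hypothetical helper: the paper's actual estimation method is not
    described in the abstract.
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)    # intercept in log-log space = log a
    return a, b

def power_law_score(distance, a, b):
    """Geographical score for a candidate POI at the given distance."""
    return a * distance ** b
```

With a negative exponent b, nearby places receive a higher geographical score than distant ones, and that score could then be combined with a preference score from a latent factor model.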


Guo H.,Hefei University of Technology | Li X.,IFLYTEK Research | He M.,Hefei University of Technology | Zhao X.,Hefei University of Technology | And 2 more authors.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2016

The pervasive use of Location-based Social Networks calls for more precise Point-of-Interest recommendation. The probability of a user’s visit to a target place is influenced by multiple factors. Though there are several fusion models in such fields, heterogeneous information is not considered comprehensively. To this end, we propose a novel probabilistic latent factor model by jointly considering the social correlation, geographical influence and users’ preference. To be specific, a variant of Latent Dirichlet Allocation is leveraged to extract the topics of both users and POIs from reviews, which are denoted as explicit interest. Then, a Probabilistic Latent Factor Model is introduced to depict the implicit interest. Moreover, Kernel Density Estimation and friend-based Collaborative Filtering are leveraged to model the user’s geographic allocation and social correlation respectively. Thus, we propose CoSoLoRec, a fusion framework, to improve the recommendation. Experiments on two real-world datasets show the superiority of our approach over the state-of-the-art methods. © Springer International Publishing AG 2016.
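To make the kernel density estimation step concrete, here is a minimal sketch of a two-dimensional Gaussian KDE over one user's past check-in coordinates. The isotropic Gaussian kernel, the fixed bandwidth, and the function name are assumptions; the abstract does not say which kernel or bandwidth rule CoSoLoRec uses.

```python
import math

def gaussian_kde_2d(checkins, bandwidth):
    """Return a density function estimated from (x, y) check-in points.

    Hypothetical helper using an isotropic Gaussian kernel with a single
    fixed bandwidth (an assumption, not taken from the paper).
    """
    norm = len(checkins) * 2.0 * math.pi * bandwidth ** 2

    def density(x, y):
        total = 0.0
        for cx, cy in checkins:
            sq_dist = ((x - cx) ** 2 + (y - cy) ** 2) / bandwidth ** 2
            total += math.exp(-0.5 * sq_dist)
        return total / norm

    return density
```

A candidate POI located where the user has checked in often gets a higher density value than one far from all past check-ins.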


Liu C.,IFlytek Research | Hu Y.,IFlytek Research | Dai L.-R.,Hefei University of Technology | Jiang H.,York University
IEEE Transactions on Audio, Speech and Language Processing | Year: 2011

In this paper, we have proposed two novel optimization methods for discriminative training (DT) of hidden Markov models (HMMs) in speech recognition based on an efficient global optimization algorithm used to solve the so-called trust region (TR) problem, where a quadratic function is minimized under a spherical constraint. In the first method, maximum mutual information estimation (MMIE) of Gaussian mixture HMMs is formulated as a standard TR problem so that the efficient global optimization method can be used in each iteration to maximize the auxiliary function of discriminative training for speech recognition. In the second method, we propose to construct a new auxiliary function for DT of HMMs by adding a quadratic penalty term. The new auxiliary function is constructed to serve as a first-order approximation as well as a lower bound of the original discriminative objective function within a locality constraint. Due to the lower-bound property, the optimal point found for the new auxiliary function is guaranteed to improve the original discriminative objective function until it converges to a local optimum or stationary point of the objective function. Both TR-based optimization methods have been investigated on two standard large-vocabulary continuous speech recognition tasks, using the WSJ0 and Switchboard databases. Experimental results have shown that the proposed TR methods outperform the conventional EBW method in terms of convergence behavior as well as recognition performance. © 2011 IEEE.
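The trust region problem named in this abstract (a quadratic minimized under a spherical constraint) can be illustrated on a toy case. The sketch below assumes a diagonal Hessian and ignores the degenerate "hard case"; it is a generic textbook-style illustration with hypothetical names, not the authors' algorithm for HMM training.

```python
import math

def solve_diagonal_trust_region(g, h, radius):
    """Minimize sum(g_i*x_i + 0.5*h_i*x_i**2) subject to ||x|| <= radius.

    Assumes a diagonal Hessian h and that the degenerate "hard case"
    does not occur (both are simplifying assumptions).
    """
    def step(lam):
        # Stationary point of the Lagrangian for multiplier lam.
        return [-gi / (hi + lam) for gi, hi in zip(g, h)]

    def norm(x):
        return math.sqrt(sum(v * v for v in x))

    if min(h) > 0:
        x = step(0.0)
        if norm(x) <= radius:
            return x  # unconstrained minimizer already lies inside the ball

    # Otherwise the solution is on the boundary: find lam above
    # max(0, -min(h)) such that ||x(lam)|| == radius, by bisection.
    lo = max(0.0, -min(h))
    hi = lo + 1.0
    while norm(step(hi)) > radius:
        hi = lo + 2.0 * (hi - lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:
            break  # interval has collapsed to floating-point resolution
        if norm(step(mid)) > radius:
            lo = mid
        else:
            hi = mid
    return step(hi)
```

Because the step norm shrinks monotonically as the multiplier grows past the lower bound, a one-dimensional search is enough to locate the boundary solution even when the quadratic is not convex.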


Du J.,Microsoft | Hu Y.,IFlytek Research | Jiang H.,York University
IEEE Transactions on Audio, Speech and Language Processing | Year: 2011

In this paper, we apply the well-known boosted mixture learning (BML) method to learn Gaussian mixture HMMs in speech recognition. BML is an incremental method to learn mixture models for classification problems. In each step of BML, one new mixture component is estimated according to the functional gradient of an objective function to ensure that it is added along the direction that maximizes the objective function. Several techniques have been proposed to extend BML from simple mixture models like the Gaussian mixture model (GMM) to the Gaussian mixture hidden Markov model (HMM), including Viterbi approximation for state segmentation, weight decay and sampling boosting to initialize sample weights to avoid overfitting, combination between partial updating and global updating to refine model parameters in each BML iteration, and use of the Bayesian Information Criterion (BIC) for parsimonious modeling. Experimental results on two large-vocabulary continuous speech recognition tasks, namely the WSJ-5k and Switchboard tasks, have shown that the proposed BML yields significant performance gain over the conventional training procedure, especially for small model sizes. © 2006 IEEE.
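For readers unfamiliar with the Bayesian Information Criterion mentioned above, the following sketch shows the standard definition, BIC = k ln(n) - 2 ln(L), and how it might be used to choose a mixture size. The candidate tuples and the selection helper are hypothetical; the abstract does not describe how BIC is applied inside the BML procedure.

```python
import math

def bic(log_likelihood, num_params, num_samples):
    """Standard BIC: lower is better; extra parameters are penalized."""
    return num_params * math.log(num_samples) - 2.0 * log_likelihood

def select_num_components(candidates, num_samples):
    """Pick the mixture size with the lowest BIC.

    candidates: list of (num_components, num_params, log_likelihood)
    tuples (a hypothetical data layout chosen for this sketch).
    """
    best = min(candidates, key=lambda c: bic(c[2], c[1], num_samples))
    return best[0]
```

Adding a component is accepted only when the gain in log-likelihood outweighs the penalty on the additional parameters, which is what keeps the model parsimonious.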


Ding H.,National School of Technology | Pan J.,IFlytek Research | Shen M.,University of Konstanz | Shen M.,South China University of Technology
2015 IEEE International Conference on Information and Automation, ICIA 2015 - In conjunction with 2015 IEEE International Conference on Automation and Logistics | Year: 2015

Objective measures are favored and widely used by many researchers in evaluating the quality of noise-suppressed speech. A good and reliable objective measure should evaluate speech quality consistently and correlate well with subjective ratings. In this paper, several widely used objective measures are applied to speech signals in Chinese languages, including Mandarin and Cantonese. The correlations between objective measure outputs and perceptual-subjective ratings are reported and analyzed. The experimental results show that the correlations for Mandarin and Cantonese are lower than those for English, and that objective measures behave differently in Mandarin, Cantonese and English. Detailed discussion and conclusions are presented as well. © 2015 IEEE.


Ding J.,Hefei University of Technology | Chen Y.,Hefei University of Technology | Li X.,IFlyTek Research | Liu G.,Hefei University of Technology | And 2 more authors.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2016

Personalized recommendation has drawn greater attention in academia and industry as it can help people filter out massive amounts of useless information. Several existing recommender techniques exploit social connections, i.e., friends or trust relations, as auxiliary information to improve recommendation accuracy. However, opinion leaders in each circle tend to have greater impact on recommendation than friends with different tastes. So we devise two unsupervised methods to identify opinion leaders, who are defined as experts. In this paper, we incorporate the influence of experts into circle-based personalized recommendation. Specifically, we first build explicit and implicit social networks by utilizing users’ friendships and similarity respectively. Then we identify experts on both social networks. Further, we propose a circle-based personalized recommendation approach by fusing experts’ influences into the matrix factorization technique. Extensive experiments conducted on two datasets demonstrate that our approach outperforms existing methods, particularly in handling the cold-start problem. © Springer International Publishing Switzerland 2016.


Xia X.-J.,Anhui University of Science and Technology | Ling Z.-H.,Anhui University of Science and Technology | Jiang Y.,IFLYTEK Research | Dai L.-R.,Anhui University of Science and Technology
Speech Communication | Year: 2014

This paper presents a hidden Markov model (HMM) based unit selection speech synthesis method using log likelihood ratios (LLR) derived from perceptual data. The perceptual data is collected by judging the naturalness of each synthetic prosodic word manually. Two acoustic models which represent the natural speech and the unnatural synthetic speech are trained respectively. At synthesis time, the LLRs are derived from the estimated acoustic models and integrated into the unit selection criterion as target cost functions. The experimental results show that our proposed method can synthesize more natural speech than the conventional method using likelihood functions. Due to the inadequacy of the acoustic model estimated for the unnatural synthetic speech, utilizing the LLR-based target cost functions to rescore the pre-selection results or the N-best sequences can achieve better performance than substituting them for the original target cost functions directly. © 2014 Elsevier B.V. All rights reserved.
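The log likelihood ratio idea in this abstract can be sketched with two one-dimensional Gaussians standing in for the "natural" and "unnatural" acoustic models. The real system uses HMM-based acoustic models over speech features; the single-Gaussian models, the parameter values in the usage note, and the function names here are simplifying assumptions.

```python
import math

def gaussian_log_pdf(x, mean, variance):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * variance)
                   + (x - mean) ** 2 / variance)

def llr_target_cost(feature, natural, unnatural):
    """Negative LLR as a cost: lower means the unit looks more natural.

    natural / unnatural are (mean, variance) pairs, a toy stand-in for
    the two trained acoustic models described in the abstract.
    """
    llr = (gaussian_log_pdf(feature, *natural)
           - gaussian_log_pdf(feature, *unnatural))
    return -llr
```

For example, with a natural model (0.0, 1.0) and an unnatural model (3.0, 1.0), a candidate unit whose feature is 0.0 gets a lower cost than one whose feature is 3.0, so it would be preferred when rescoring pre-selected units or N-best sequences.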


Du J.,Anhui University of Science and Technology | Hu J.-S.,IFlytek Research | Zhu B.,IFlytek Research | Wei S.,IFlytek Research | Dai L.-R.,Anhui University of Science and Technology
Proceedings of International Conference on Frontiers in Handwriting Recognition, ICFHR | Year: 2014

This paper presents a novel approach to writer adaptation using bottleneck features and discriminative linear regression for the recognition of online handwritten Chinese characters. First, bottleneck features extracted from a bottleneck layer of a deep neural network representing a nonlinear and discriminative transformation of the input features are verified to be much more effective in adaptation of writing styles than the conventional features after linear discriminant analysis transformation. Second, discriminative linear regression via a so-called sample separation margin based minimum classification error criterion is adopted for writer adaptation. The experiments on an in-house developed online Chinese handwriting corpus with a vocabulary of 15,167 characters and testing data collected from user inputs of Smartphones show that our proposed approach can achieve very significant improvements of recognition accuracy compared with a state-of-the-art adaptation approach for writer adaptation. © 2014 IEEE.


Du J.,Anhui University of Science and Technology | Hu J.-S.,IFlytek Research | Zhu B.,IFlytek Research | Wei S.,IFlytek Research | Dai L.-R.,Anhui University of Science and Technology
Proceedings - International Conference on Pattern Recognition | Year: 2014

This paper presents a study of designing compact classifiers using deep neural networks for recognition of online handwritten Chinese characters. Two schemes are investigated based on practical considerations. First, deep neural networks are adopted purely as a classifier with a state-of-the-art feature extractor of online handwritten Chinese characters. Second, the so-called bottleneck features extracted from a bottleneck layer of deep neural networks are fed to the prototype-based classifier. The experiments on an in-house developed online Chinese handwriting corpus with a vocabulary of 15,167 characters show that, compared with the prototype-based classifier widely developed on mobile devices, the deep neural network based classifier can yield significant improvements in recognition accuracy with an acceptable increase in footprint and latency, while the bottleneck-feature approach can bring a more compact classifier with an observable performance gain. © 2014 IEEE.


Du J.,Anhui University of Science and Technology | Zhai J.-F.,Anhui University of Science and Technology | Hu J.-S.,IFlytek Research | Zhu B.,IFlytek Research | And 2 more authors.
Proceedings of the International Conference on Document Analysis and Recognition, ICDAR | Year: 2015

This paper presents a novel approach to writer adaptation based on a convolutional neural network (CNN) as a feature extractor and improved discriminative linear regression for online handwritten Chinese character recognition. First, the proposed recognizer, consisting of a CNN-based feature extractor and a prototype-based classifier, can achieve performance comparable with the state-of-the-art CNN-based classifier while it can be designed to be more compact and efficient as a practical solution. Second, the writer adaptation is performed via a linear transformation of the features extracted from the CNN. The transformation parameters are optimized with a so-called sample separation margin based minimum classification error criterion, which can be further improved by using more synthesized adaptation data and a simple regularization method. The experiments on the data collected from user inputs of smartphones with a vocabulary of 20,936 characters demonstrate that our writer adaptation approach can yield significant improvements in recognition accuracy over a high-performance baseline system and also outperform a state-of-the-art approach based on style transfer mapping, especially with increased adaptation data. © 2015 IEEE.
