Ma A.J.,Hong Kong Baptist University | Yuen P.C.,Hong Kong Baptist University | Yuen P.C.,BNU HKBU United International College | Zou W.W.W.,Hong Kong Baptist University | And 2 more authors.
IEEE Transactions on Circuits and Systems for Video Technology | Year: 2013

Supervised manifold learning has been successfully applied to action recognition, in which class label information can improve the recognition performance. However, the learned manifold may not preserve well both the local structure and the global constraint of temporal labels in action sequences. To overcome this problem, this paper proposes a new supervised manifold learning algorithm called supervised spatio-temporal neighborhood topology learning (SSTNTL) for action recognition. By analyzing the topological characteristics in the context of action recognition, we propose to construct the neighborhood topology using both supervised spatial and temporal pose correspondence information. Employing the property of locality preserving projection (LPP), SSTNTL solves the generalized eigenvalue problem to obtain the best projections, which not only separate data points from different classes but also preserve the local structures and temporal pose correspondence of sequences from the same class. Experimental results demonstrate that SSTNTL outperforms manifold embedding methods with other topologies or local discriminant information. Moreover, compared with state-of-the-art action recognition algorithms, SSTNTL gives convincing performance for both human and gesture action recognition. © 1991-2012 IEEE.
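The LPP-style projection step above can be sketched as follows. This is a minimal illustration using a generic k-nearest-neighbor affinity graph, not SSTNTL's supervised spatio-temporal topology; all function names and parameters here are illustrative.

```python
import numpy as np

def lpp_projection(X, n_neighbors=3, n_components=2):
    """Locality preserving projection: solve X'LX a = lambda X'DX a."""
    n = X.shape[0]
    # pairwise squared distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # binary k-NN affinity, symmetrized
    idx = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), n_neighbors)
    W[rows, idx.ravel()] = 1.0
    W = np.maximum(W, W.T)
    D = np.diag(W.sum(axis=1))
    L = D - W                                    # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-8 * np.eye(X.shape[1])  # regularized for stability
    # generalized eigenproblem via B^{-1}A; smallest eigenvalues preserve locality
    vals, vecs = np.linalg.eig(np.linalg.solve(B, A))
    order = np.argsort(vals.real)
    return vecs[:, order[:n_components]].real
```

Projecting data as `X @ lpp_projection(X)` gives the low-dimensional embedding; SSTNTL replaces the k-NN graph with a supervised spatio-temporal one.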

Xie X.,Sun Yat Sen University | Xie X.,Guangdong Province Key Laboratory of Information Security | Lai J.,Sun Yat Sen University | Lai J.,Guangdong Province Key Laboratory of Information Security | Zheng W.-S.,Queen Mary, University of London
Pattern Recognition | Year: 2010

Face recognition under varying lighting conditions is challenging, especially for single-image-based recognition systems. Extracting illumination-invariant features is an effective approach to this problem. However, it is hard for existing methods to extract multi-scale and multi-directional geometrical structures at the same time, which are important for capturing the intrinsic features of a face image. In this paper, we propose to utilize the logarithmic nonsubsampled contourlet transform (LNSCT) to estimate the reflectance component from a single face image and refer to it as the illumination-invariant feature for face recognition, where NSCT is a fully shift-invariant, multi-scale, and multi-direction transform. LNSCT can extract strong edges, weak edges, and noise from a face image using NSCT in the logarithm domain. We show that, in the logarithm domain, the low-pass subband of a face image and the low-frequency part of strong edges can be regarded as the illumination effects, while the weak edges and the high-frequency part of strong edges can be considered as the reflectance component. Moreover, even when a face image is polluted by noise (in particular multiplicative noise), the reflectance component can still be well estimated while the noise is removed. LNSCT can be applied flexibly, as it requires neither an assumption on the lighting condition nor information about 3D shape. Experimental results show the promising performance of LNSCT for face recognition on the Extended Yale B and CMU-PIE databases. © 2010 Elsevier Ltd. All rights reserved.
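The log-domain decomposition idea can be sketched with a plain separable low-pass filter standing in for the NSCT; the split is far cruder than the paper's, and every name here is illustrative.

```python
import numpy as np

def box_blur(img, k=7):
    """Separable box-filter low-pass (zero-padded at the borders)."""
    kern = np.ones(k) / k
    out = np.apply_along_axis(lambda r: np.convolve(r, kern, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, kern, mode="same"), 0, out)

def illumination_invariant(img, k=7):
    """High-frequency residual of the log image, a reflectance-like feature.

    In the log domain, multiplicative illumination becomes additive and mostly
    low-frequency, so subtracting a low-pass estimate removes much of it.
    """
    log_img = np.log1p(img.astype(float))
    return log_img - box_blur(log_img, k)
```

A uniformly lit flat region yields a near-zero feature map away from the borders, since the low-pass estimate matches the log image there.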

Chen S.-Z.,Sun Yat Sen University | Chen S.-Z.,Guangdong Province Key Laboratory of Information Security | Guo C.-C.,Guangdong Province Key Laboratory of Information Security | Guo C.-C.,Sun Yat Sen University | And 2 more authors.
IEEE Transactions on Image Processing | Year: 2016

This paper proposes a novel approach to person re-identification, a fundamental task in distributed multi-camera surveillance systems. Although a variety of powerful algorithms have been presented in the past few years, most of them focus on designing hand-crafted features and learning metrics either individually or sequentially. Different from previous works, we formulate a unified deep ranking framework that jointly tackles both of these key components to maximize their strengths. We start from the principle that the correct match of the probe image should be positioned in the top rank within the whole gallery set. An effective learning-to-rank algorithm is proposed to minimize the cost corresponding to the ranking disorders of the gallery. The ranking model is solved with a deep convolutional neural network (CNN) that builds the relation between input image pairs and their similarity scores through joint representation learning directly from raw image pixels. The proposed framework allows us to get rid of feature engineering and does not rely on any assumption. An extensive comparative evaluation is given, demonstrating that our approach significantly outperforms all the state-of-the-art approaches, including both traditional and CNN-based methods, on the challenging VIPeR, CUHK-01, and CAVIAR4REID datasets. In addition, our approach generalizes better across datasets without fine-tuning. © 2015 IEEE.
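The top-rank principle can be written as a margin-based ranking loss over similarity scores. This sketch is a generic pairwise hinge loss, not the paper's exact cost; the names are ours.

```python
import numpy as np

def ranking_loss(sim_pos, sim_negs, margin=1.0):
    """Hinge penalty whenever a wrong gallery match scores within `margin`
    of the correct match; zero when the true match is safely top-ranked."""
    sim_negs = np.asarray(sim_negs, dtype=float)
    return float(np.maximum(0.0, margin - (sim_pos - sim_negs)).sum())
```

During training, a CNN producing the similarity scores would be updated to drive this loss toward zero, pushing the correct match to the top rank.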

Liang Y.,Sun Yat Sen University | Liang Y.,Guangdong Province Key Laboratory of Information Security | Lai J.-H.,Sun Yat Sen University | Lai J.-H.,Guangdong Province Key Laboratory of Information Security | And 3 more authors.
Proceedings - International Conference on Pattern Recognition | Year: 2010

In this paper we propose to convert the task of face hallucination into an image decomposition problem, and then use morphological component analysis (MCA) to hallucinate a single face image, based on a novel three-step framework. First, a low-resolution input image is up-sampled by interpolation. Then, MCA is employed to decompose the interpolated image into a high-resolution image and an unsharp-mask component, as MCA can properly decompose a signal into distinct parts according to typical dictionaries. Finally, a residue compensation, based on the neighbor reconstruction of patches, is performed to enhance the facial details. The proposed method can effectively exploit facial properties for face hallucination from the image decomposition perspective. Experimental results demonstrate the effectiveness of our method in terms of the visual quality of the hallucinated face images. © 2010 IEEE.
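The three-step pipeline can be sketched with simple stand-ins: nearest-neighbor up-sampling for interpolation, a box-filter split for MCA, and a plain detail boost for residue compensation. All parameters and names are illustrative, not the paper's method.

```python
import numpy as np

def box_blur(img, k=3):
    kern = np.ones(k) / k
    out = np.apply_along_axis(lambda r: np.convolve(r, kern, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, kern, mode="same"), 0, out)

def hallucinate(lr, scale=2, boost=1.2):
    # step 1: up-sample (nearest-neighbor stand-in for interpolation)
    hr = np.kron(lr.astype(float), np.ones((scale, scale)))
    # step 2: split into a smooth part and an unsharp-mask detail part
    smooth = box_blur(hr)
    detail = hr - smooth
    # step 3: enhance details (crude stand-in for residue compensation)
    return smooth + boost * detail
```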

Xie X.,Sun Yat Sen University | Xie X.,Guangdong Province Key Laboratory of Information Security | Xie X.,Concordia University at Montréal | Lai J.,Sun Yat Sen University | And 4 more authors.
Signal Processing | Year: 2011

The Quotient Image (QI) algorithm has been widely used in face recognition and re-rendering under varying illumination conditions. One source of inaccuracy in the QI algorithm is the "Ideal Class" assumption that all faces have the same surface normals (3D shape). In practice, however, this assumption is often not true. To reduce this inaccuracy, the Non-Ideal Class Non-Point Light source QI (NIC-NPL-QI), which drops the "Ideal Class" assumption, is developed in this paper for face relighting. Unlike the basic QI algorithm, which uses a fixed reference object for all test objects, the NIC-NPL-QI algorithm constructs a specific reference object for each test object, so that the test and reference objects have similar illumination images, achieving the same effect as the "Ideal Class" assumption. In the proposed method, a wavelet algorithm is introduced to estimate the illumination image. Furthermore, the proposed NIC-NPL-QI algorithm can handle harmonic light and shadows. Experiments on the Extended Yale B and CMU-PIE databases show that the NIC-NPL-QI algorithm obtains better quality in synthesizing face images compared with state-of-the-art algorithms. © 2010 Elsevier B.V. All rights reserved.
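The quotient idea can be illustrated in its simplest self-quotient form, dividing an image by a smoothed (illumination-like) version of itself; the actual NIC-NPL-QI reference construction is considerably more involved, and these names are ours.

```python
import numpy as np

def box_blur(img, k=5):
    kern = np.ones(k) / k
    out = np.apply_along_axis(lambda r: np.convolve(r, kern, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, kern, mode="same"), 0, out)

def self_quotient(img, k=5, eps=1e-6):
    """Quotient of an image over its smoothed version; smooth illumination
    largely cancels in the ratio, leaving reflectance-like structure."""
    img = img.astype(float)
    return img / (box_blur(img, k) + eps)
```

On a uniformly lit flat patch the quotient is close to 1 away from the borders, since numerator and smoothed denominator agree there.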

Tan J.,Sun Yat Sen University | Tan J.,Guangdong Province Key Laboratory of Information Security | Xie X.,CAS Shenzhen Institutes of Advanced Technology | Zheng W.-S.,Sun Yat Sen University | And 3 more authors.
International Journal of Pattern Recognition and Artificial Intelligence | Year: 2012

Each Chinese character is composed of radicals: a single character contains one radical, while a compound character contains more than one. From a human cognitive perspective, a Chinese character can be recognized by identifying its radicals and their spatial relationship, and computer recognition may follow the same cognitive principle. However, extracting Chinese character radicals automatically by computer remains an unsolved problem. In this paper, we propose an improved sparse matrix factorization that integrates an affine transformation, namely affine sparse matrix factorization (ASMF), for automatically extracting radicals from Chinese characters. Here the affine transformation is vitally important because it addresses the poor alignment of characters that may be caused by the internal diversity of radicals and by image segmentation. Consequently, we develop a radical-based Chinese character recognition model. Because the number of radicals is much smaller than the number of Chinese characters, radical-based recognition classifies over far fewer categories than whole-character-based recognition, resulting in a more robust recognition system. Experiments on standard Chinese character datasets show that the proposed method achieves higher recognition rates than related Chinese character recognition methods. © 2012 World Scientific Publishing Company.
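As a rough illustration of the underlying factorization machinery (without the affine transformation or the explicit sparsity term that define ASMF), a plain multiplicative-update non-negative matrix factorization looks like this; all names and parameters are ours.

```python
import numpy as np

def nmf(V, rank=2, iters=200, seed=0):
    """Multiplicative-update NMF: approximate V ~ W @ H, all entries >= 0.

    For character images, columns of W act like part (radical-like)
    templates and H gives per-sample activations.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    for _ in range(iters):
        # standard Lee-Seung updates; the 1e-9 terms avoid division by zero
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H
```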

Xie X.-H.,Sun Yat Sen University | Xie X.-H.,Guangdong Province Key Laboratory of Information Security | Lai J.-H.,Sun Yat Sen University | Lai J.-H.,Guangdong Province Key Laboratory of Information Security | Zheng W.-S.,Queen Mary, University of London
Tien Tzu Hsueh Pao/Acta Electronica Sinica | Year: 2010

We propose to describe the relationship between the pixel grey values of face images under frontal and non-frontal illumination conditions using a second-order polynomial model. Correspondingly, an illumination normalization method based on this nonlinear model is derived. The proposed method learns the illumination variations in a statistical manner using a regression model, without any prior physical knowledge. Furthermore, in order to improve the visual quality, a PCA-based weighting compensation for the normalized face image is proposed. Experimental results on the Extended Yale B and CMU-PIE face databases show that the proposed method attains good visualization for face images and significantly improves face recognition performance.
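The second-order polynomial mapping between pixel grey values can be sketched as a least-squares quadratic fit; this omits the paper's PCA-based compensation, and the function names are illustrative.

```python
import numpy as np

def fit_quadratic(x, y):
    """Least-squares fit of y ~ a*x**2 + b*x + c (returns [a, b, c])."""
    A = np.stack([x ** 2, x, np.ones_like(x)], axis=1)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def apply_quadratic(x, coef):
    """Map grey values x through the learned second-order model."""
    a, b, c = coef
    return a * x ** 2 + b * x + c
```

In the paper's setting, `x` would be pixel values under non-frontal lighting and `y` the corresponding values under frontal lighting, learned statistically from training pairs.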

Wu J.-S.,Sun Yat Sen University | Wu J.-S.,SYSU CMU Shunde International Joint Research Institute | Zheng W.-S.,Sun Yat Sen University | Zheng W.-S.,Guangdong Province Key Laboratory of Computational Science | And 2 more authors.
Neural Networks | Year: 2015

Kernel competitive learning (KCL) has been successfully used to achieve robust clustering. However, KCL is not scalable to large-scale data processing, because (1) it has to calculate and store the full kernel matrix, which is too large to compute and keep in memory, and (2) it cannot be computed in parallel. In this paper we develop a framework of approximate kernel competitive learning for processing large-scale datasets. The proposed framework consists of two parts. First, it derives an approximate kernel competitive learning (AKCL) method, which performs kernel competitive learning in a subspace via sampling. We provide solid theoretical analysis on why the proposed approximation model works for kernel competitive learning, and furthermore we show that the computational complexity of AKCL is largely reduced. Second, we propose a pseudo-parallel approximate kernel competitive learning (PAKCL) method based on a set-based kernel competitive learning strategy, which overcomes the obstacle to parallel programming in kernel competitive learning and significantly accelerates approximate kernel competitive learning for large-scale clustering. Empirical evaluation on publicly available datasets shows that the proposed AKCL and PAKCL perform comparably to KCL, with a large reduction in computational cost. The proposed methods also achieve higher clustering precision than related approximate clustering approaches. © 2014 Elsevier Ltd.
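The sampling-based approximation idea can be illustrated with a Nyström-style sketch, which avoids forming the full n x n kernel matrix from scratch by using m sampled landmark points; the paper's AKCL construction differs in detail, and the kernel and landmark count here are illustrative.

```python
import numpy as np

def rbf(A, B, gamma=2.0):
    """Gaussian (RBF) kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_kernel(X, m, gamma=2.0, seed=0):
    """Nystrom approximation of the full kernel matrix from m landmarks:
    K ~ C @ pinv(Wm) @ C.T, costing O(n*m) kernel evaluations instead of O(n^2)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    C = rbf(X, X[idx], gamma)          # n x m cross-kernel
    Wm = rbf(X[idx], X[idx], gamma)    # m x m landmark kernel
    return C @ np.linalg.pinv(Wm) @ C.T
```

When m equals n the approximation recovers the full kernel matrix exactly (up to numerical error); with m much smaller than n it trades accuracy for memory and speed.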

Ma A.J.,Hong Kong Baptist University | Yuen P.C.,Hong Kong Baptist University | Yuen P.C.,BNU HKBU United International College | Lai J.-H.,Sun Yat Sen University | Lai J.-H.,Guangdong Province Key Laboratory of Information Security
IEEE Transactions on Pattern Analysis and Machine Intelligence | Year: 2013

This paper addresses the independence assumption issue in the fusion process. In the last decade, dependency modeling techniques were developed under a specific distribution of classifiers or by estimating the joint distribution of the posteriors. This paper proposes a new framework to model the dependency between features without any assumption on the feature/classifier distribution, and overcomes the difficulty of estimating the high-dimensional joint density. We prove that feature dependency can be modeled by a linear combination of the posterior probabilities under some mild assumptions. Based on this linear combination property, two methods, namely Linear Classifier Dependency Modeling (LCDM) and Linear Feature Dependency Modeling (LFDM), are derived and developed for dependency modeling at the classifier level and feature level, respectively. The optimal models for LCDM and LFDM are learned by maximizing the margin between the genuine and impostor posterior probabilities. Both synthetic data and real datasets are used for experiments. Experimental results show that LCDM and LFDM with dependency modeling outperform existing classifier-level and feature-level combination methods under non-normal distributions and on four real databases, respectively. Comparing the classifier-level and feature-level fusion methods, LFDM gives the best performance. © 1979-2012 IEEE.
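The linear-combination-of-posteriors property can be illustrated by a minimal weighted fusion of classifier posteriors; learning the weights by margin maximization, as LCDM does, is omitted, and all names here are ours.

```python
import numpy as np

def linear_fusion(posteriors, weights):
    """Fuse per-classifier posterior vectors by a learned linear combination.

    posteriors: (n_classifiers, n_classes) array of posterior probabilities.
    weights:    (n_classifiers,) combination weights (here given, not learned).
    Returns the fused score vector and the predicted class index.
    """
    P = np.asarray(posteriors, dtype=float)
    w = np.asarray(weights, dtype=float)
    fused = w @ P
    return fused, int(np.argmax(fused))
```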

Xie X.,Sun Yat Sen University | Zheng W.-S.,Sun Yat Sen University | Zheng W.-S.,Queen Mary, University of London | Lai J.,Sun Yat Sen University | And 3 more authors.
IEEE Transactions on Image Processing | Year: 2011

A face image can be represented by a combination of large- and small-scale features. It is well known that variations in illumination mainly affect the large-scale features (low-frequency components) and not so much the small-scale features. Therefore, relevant existing methods extract only the small-scale features as illumination-invariant features for face recognition, while the large-scale intrinsic features are always ignored. In this paper, we argue that both the large- and small-scale features of a face image are important for face restoration and recognition. Moreover, we suggest that illumination normalization should be performed mainly on the large-scale features of a face image rather than on the original face image. A novel method of normalizing both the Small- and Large-scale (S&L) features of a face image is proposed. In this method, a single face image is first decomposed into large- and small-scale features. After that, illumination normalization is performed mainly on the large-scale features, and only a minor correction is made on the small-scale features. Finally, a normalized face image is generated by combining the processed large- and small-scale features. In addition, an optional visual compensation step is suggested for improving the visual quality of the normalized image. Experiments on the CMU-PIE, Extended Yale B, and FRGC 2.0 face databases show that the proposed method obtains significantly better recognition performance and visual results than related state-of-the-art methods. © 2011 IEEE.
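The S&L idea, normalize mainly the large-scale part and keep the details, can be sketched with a box-filter decomposition; the paper's actual decomposition and normalization steps are more sophisticated, and all names and parameters here are illustrative.

```python
import numpy as np

def box_blur(img, k=7):
    kern = np.ones(k) / k
    out = np.apply_along_axis(lambda r: np.convolve(r, kern, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, kern, mode="same"), 0, out)

def normalize_sl(img, k=7, target_mean=0.5):
    """Split an image into large-scale and small-scale features, normalize
    only the large-scale (illumination-carrying) part, then recombine."""
    img = img.astype(float)
    large = box_blur(img, k)                      # large-scale (low-frequency)
    small = img - large                           # small-scale (details)
    large = large - large.mean() + target_mean    # normalization on large scale only
    return large + small
```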
