Huang Y.,Key Laboratory of Machine Perception MoE |
Duan D.,Key Laboratory of Machine Perception MoE |
Cui J.,Key Laboratory of Machine Perception MoE |
Davoine F.,LIAMA Sino European Laboratory |
And 2 more authors.
2014 IEEE International Conference on Image Processing, ICIP 2014 | Year: 2014
Head pose is an important indicator of a person's visual focus of attention (VFoA). A traditional way to recognize VFoA is to rely on accurate head pose or gaze estimates. However, these estimates usually degrade drastically in medium- or low-resolution video data. In this paper, a joint estimation of head pose and VFoA is proposed to address this issue; both head pose and VFoA are iteratively refined until convergence. This approach is evaluated in a specific scenario involving children around a table playing together with toys. Datasets are acquired and annotated by psychologists at Peking University. The experimental results demonstrate the usefulness of the joint estimation process for recognizing visual focus of attention in medium-resolution video sequences. © 2014 IEEE.
Wang C.,Peking University |
Wang C.,Key Laboratory of Machine Perception MOE |
Fang Y.,Peking University |
Fang Y.,Key Laboratory of Machine Perception MOE |
And 6 more authors.
IEEE Transactions on Intelligent Transportation Systems | Year: 2016
Vision-based approaches have been extensively studied for on-road vehicle detection; however, they face great challenges because the visual appearance of a vehicle may change greatly across viewpoints, and because vehicles are sometimes only partially observed due to occlusions from infrastructure or scene dynamics and/or a limited camera field of view. This paper presents a vision-based on-road vehicle detection algorithm for multilane traffic scenes. A probabilistic inference framework based on part models is proposed to overcome the challenges of multiview and partial observation. Geometric models are learned for each dominant viewpoint to describe the configuration of vehicle parts and their spatial relations in probabilistic representations. Viewpoint maps are generated from knowledge of the road structure and driving patterns, and predict the viewpoint of a vehicle whenever it appears at a given location. Extensive experiments are conducted using an onboard camera on multilane motorways in Beijing. A large-scale data set is developed that contains more than 30 000 labeled ground truths for both fully and partially observed vehicles in different viewpoints across scenes of varying traffic density. The data set will be made publicly available together with this publication. © 2015 IEEE.
Liu M.,Key Laboratory of Machine Perception MOE |
Luo Y.,Center for Quantum Computation and Intelligent Systems |
Luo Y.,Nanyang Technological University |
Tao D.,Center for Quantum Computation and Intelligent Systems |
And 2 more authors.
Proceedings of the National Conference on Artificial Intelligence | Year: 2015
Multi-label image classification is of significant interest due to its major role in real-world web image analysis applications such as large-scale image retrieval and browsing. Recently, matrix completion (MC) has been developed to deal with multi-label classification tasks. MC has distinct advantages, such as robustness to missing entries in the feature and label spaces and a natural ability to handle multi-label problems. However, current MC-based multi-label image classification methods only consider data represented by a single-view feature and therefore do not precisely characterize images that contain several semantic concepts. An intuitive way to utilize multiple features taken from different views is to concatenate the different features into a long vector; however, this concatenation is prone to over-fitting and leads to high time complexity in MC-based image classification. Therefore, we present a novel multi-view learning model for MC-based image classification, called low-rank multi-view matrix completion (lrMMC), which first seeks a low-dimensional common representation of all views by utilizing the proposed low-rank multi-view learning (lrMVL) algorithm. In lrMVL, the common subspace is constrained to be low rank so that it is suitable for MC. In addition, combination weights are learned to explore complementarity between different views. An efficient solver based on fixed-point continuation (FPC) is developed for optimization, and the learned low-rank representation is then incorporated into MC-based image classification. Extensive experimentation on the challenging PASCAL VOC'07 dataset demonstrates the superiority of lrMMC compared to other multi-label image classification approaches. Copyright © 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
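As a rough illustration of the matrix completion idea mentioned in this abstract, the sketch below fills in missing entries of a matrix by repeatedly soft-thresholding its singular values (a generic soft-impute style iteration). It is not the authors' model or their FPC solver; the function name `soft_impute`, the threshold `lam`, and the synthetic data are assumptions chosen only for the example.

```python
import numpy as np

def soft_impute(M, mask, lam=0.1, n_iters=300):
    """Estimate a low-rank matrix from the entries of M where mask is True.

    Hypothetical helper for illustration: each iteration keeps the observed
    entries, fills the rest with the current estimate, and shrinks the
    singular values by lam to encourage low rank.
    """
    X = np.where(mask, M, 0.0)
    for _ in range(n_iters):
        filled = np.where(mask, M, X)           # observed data + current guess
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - lam, 0.0)            # soft-threshold singular values
        X = (U * s) @ Vt                        # low-rank reconstruction
    return X

# Synthetic rank-2 matrix with roughly 30% of its entries hidden.
rng = np.random.default_rng(0)
truth = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 15))
mask = rng.random(truth.shape) < 0.7
estimate = soft_impute(truth, mask)

# Relative error measured only on the hidden entries.
rel_err = np.linalg.norm((estimate - truth)[~mask]) / np.linalg.norm(truth[~mask])
print(f"relative error on missing entries: {rel_err:.3f}")
```

In an MC-based classifier, the matrix would typically stack features and labels, with the unknown test labels playing the role of the hidden entries; how lrMMC builds that matrix from multiple views is not specified in the abstract.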
Gu S.C.,Key Laboratory of Machine Perception MOE |
Tan Y.,Key Laboratory of Machine Perception MOE |
He X.G.,Key Laboratory of Machine Perception MOE
Science China Information Sciences | Year: 2010
In this paper, we investigate how to extract the lowest frequency features from an image. A novel Laplacian smoothing transform (LST) is proposed to transform an image into a sequence, from which the low frequency features of the image can be easily extracted and passed to a discriminant learning method for face recognition. More generally, the LST can serve as an efficient dimensionality reduction method for face recognition problems. Extensive experimental results show that the LST method performs better than other pre-processing methods, such as discrete cosine transform (DCT), principal component analysis (PCA) and discrete wavelet transform (DWT), on the ORL, Yale and PIE face databases. Under the leave-one-out strategy, the best performance on the ORL and Yale face databases is 99.75% and 99.4%, respectively; in this paper, we improve both to 100% for the first time, using a fast linear feature extraction method. © Science China Press and Springer-Verlag Berlin Heidelberg 2010.
Zhang P.,Key Laboratory of Machine Perception MOE |
Tan Y.,Key Laboratory of Machine Perception MOE
2013 IEEE 3rd International Conference on Information Science and Technology, ICIST 2013 | Year: 2013
This paper proposes a new feature-goodness criterion named class-wise information gain (CIG). The CIG measures the goodness of a feature for recognizing a specific class, and thereby helps to select the features with the highest information content for that class. To confirm the effectiveness of the CIG, a CIG-based malware detection method is proposed. Eight groups of experiments on three public malware datasets are carried out to evaluate the performance of the proposed method through cross-validation. Comprehensive experimental results suggest that the CIG is an effective feature-goodness criterion, and that the proposed CIG-based malware detection method is effective at detecting malware loaders and infected executables. This method outperforms the information gain (IG)-based malware detection method by about 26% in detecting infected executables, with no decrease in detecting malware loaders, while its memory requirement is empirically about 60% less than that of the IG-based method. © 2013 IEEE.
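For context, the standard information gain (IG) criterion that the abstract uses as its baseline can be sketched as follows. The abstract does not define CIG itself, so this example covers only ordinary IG; the function names and the toy labels are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    n = len(labels)
    gain = entropy(labels)
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

# Toy data: one feature separates the classes perfectly, the other not at all.
labels = ["malware", "malware", "benign", "benign"]
perfect = [1, 1, 0, 0]
useless = [1, 0, 1, 0]

print(information_gain(perfect, labels))
print(information_gain(useless, labels))
```

A feature-selection step would rank candidate features by such a score and keep the top ones; according to the abstract, CIG differs in that it scores a feature with respect to one specific class rather than all classes at once.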