
Jiang J.,CAS Shenzhen Institutes of Advanced Technology | Jiang J.,Chinese University of Hong Kong | Jiang J.,Shenzhen Key Laboratory of Computer Vision and Pattern Recognition | Cheng J.,CAS Shenzhen Institutes of Advanced Technology | And 3 more authors.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2013

Real-time 3D sensing plays a critical role in robotic navigation, video surveillance and human-computer interaction, etc. When computing 3D structures of dynamic scenes from stereo sequences, spatiotemporal stereo and scene flow methods can produce temporally coherent disparity. However, most existing methods do not utilize the previous disparity map sufficiently to compute the next disparity map, and the searching space of correspondences limits the speed of disparity computation for each image pair. This paper proposes an effective scheme to predict disparity maps from stereo sequences. In particular, we apply a robust 3D registration algorithm based on the angular-invariant feature to estimate the ego-motion of the stereo rig between consecutive frames, and present the transformation between consecutive disparity maps. The scheme can produce a sequence of temporally coherent disparity maps rapidly. We apply the new scheme to real outdoor scenes, and thorough empirical studies indicate the effectiveness of the new scheme for practical applications. © 2013 Springer-Verlag.
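The abstract does not give the transformation between consecutive disparity maps. A minimal sketch of how such a prediction could work for one pixel is shown below, assuming a rectified pinhole stereo rig with focal length `f` (pixels), baseline `B` (meters), principal point `(cx, cy)`, and a rigid motion `R`, `t` that maps point coordinates from the previous camera frame into the current one. The function name and all parameters are illustrative assumptions, not the authors' formulation.

```python
def predict_disparity(u, v, d, f, B, cx, cy, R, t):
    """Predict where a pixel (u, v) with disparity d lands in the next frame.

    Assumes a rectified pinhole stereo rig and a rigid motion (R, t) that maps
    point coordinates from the previous camera frame to the current one.
    Returns (u2, v2, d2), or None if the point ends up behind the camera.
    """
    # Back-project the pixel to a 3D point using depth Z = f * B / d.
    Z = f * B / d
    X = (u - cx) * Z / f
    Y = (v - cy) * Z / f

    # Apply the rigid motion: P2 = R * P + t.
    X2 = R[0][0] * X + R[0][1] * Y + R[0][2] * Z + t[0]
    Y2 = R[1][0] * X + R[1][1] * Y + R[1][2] * Z + t[1]
    Z2 = R[2][0] * X + R[2][1] * Y + R[2][2] * Z + t[2]
    if Z2 <= 0:
        return None

    # Re-project and convert the new depth back to a disparity.
    u2 = f * X2 / Z2 + cx
    v2 = f * Y2 / Z2 + cy
    d2 = f * B / Z2
    return u2, v2, d2


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Points move 1 m closer to the camera (the rig moved forward by 1 m).
prediction = predict_disparity(320.0, 240.0, 10.0, 500.0, 0.1,
                               320.0, 240.0, IDENTITY, (0.0, 0.0, -1.0))
```

A predicted disparity of this kind could be used to narrow the correspondence search range for the next stereo pair, which is the speed-up the abstract describes.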


Li X.,Southwest Jiaotong University | He H.,Southwest Jiaotong University | Yin Z.,Southwest Jiaotong University | Yin Z.,CAS Institute of Remote Sensing | And 3 more authors.
Neurocomputing | Year: 2015

The kernel partial least squares (KPLS) algorithm for super-resolution (SR) builds a regression model to estimate a high-resolution (HR) feature patch from its corresponding low-resolution (LR) feature patch using a training database. However, KPLS may be time-consuming in the neighbor search and in the use of principal components. In this paper we propose a clustering and weighted boosting (CWB) framework to improve the efficiency of KPLS regression model construction without reducing SR reconstruction quality. First, the training LR-HR feature patch pairs are divided into a certain number of clusters. For each test LR feature patch, the neighbor search in the selected cluster costs less computation than a search over the whole training database. Second, a weighted boosting scheme is used to adaptively construct the KPLS regression model with the best number of principal components (BNPC). Experimental results on natural scene images suggest that the proposed CWB method can effectively improve the efficiency of the KPLS-based SR method while preserving reconstruction quality, and achieve better performance than the conventional KPLS method. © 2014 Elsevier B.V.


Li X.,Southwest Jiaotong University | He H.,Southwest Jiaotong University | Yin Z.,Southwest Jiaotong University | Yin Z.,CAS Institute of Remote Sensing | And 3 more authors.
Neurocomputing | Year: 2014

In this paper, we present a novel learning-based single image super-resolution algorithm to address the problems of inefficient learning and improper estimation in coping with nonlinear high-dimensional feature data. Our method, named subspace projection and neighbor embedding (SPNE), first projects the high-dimensional data into two different subspaces, i.e., the kernel principal component analysis (KPCA) subspace and the modified locality preserving projection (MLPP) subspace, to obtain the global and local structures of the data. In an optimal low-dimensional feature space, the k-nearest neighbors of each input low-resolution (LR) image patch can be found for efficient learning. Then, with similarity measures and proportional factors, the k embedding weights are used to estimate high-frequency information from a training dataset. Finally, we apply iterative back projection (IBP) to further enhance the super-resolution results. Experiments on simulated and real LR images demonstrate that the proposed approach outperforms the existing NE-based super-resolution methods in terms of visual quality and some selected objective metrics. © 2014 Elsevier B.V.


Guo J.,CAS Shenzhen Institutes of Advanced Technology | Guo J.,Chinese University of Hong Kong | Cheng J.,CAS Shenzhen Institutes of Advanced Technology | Cheng J.,Chinese University of Hong Kong | And 5 more authors.
Applied Mechanics and Materials | Year: 2013

In this paper, we present a dynamic gesture recognition system. We focus on visual sensory information to recognize human activity in the form of hand movements from a small, predefined vocabulary. First, a fast and effective method for hand detection and tracking is presented for trajectory extraction. A novel, simple but effective method is then applied to correct the trajectory. Gesture recognition is achieved by a matching technique that determines the distance between the unknown input direction code sequence and a set of previously defined templates. A dynamic time warping (DTW) algorithm is used to perform the time alignment and normalization by computing a temporal transformation that allows the two signals to be matched. Experimental results show that the proposed gesture recognition system achieves good results in real time. © (2013) Trans Tech Publications, Switzerland.
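Template matching with DTW over direction code sequences can be sketched as follows. This is a generic textbook DTW, not the authors' implementation; the 8-direction chain code, the circular cost function, and the template names are assumptions made for illustration.

```python
def direction_cost(x, y, n_dirs=8):
    """Circular distance between two direction codes (assumed 8-direction chain code)."""
    diff = abs(x - y) % n_dirs
    return min(diff, n_dirs - diff)


def dtw_distance(a, b, cost=direction_cost):
    """Classic dynamic time warping distance between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # d[i][j] holds the best alignment cost of a[:i] against b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(a[i - 1], b[j - 1])
            d[i][j] = step + min(d[i - 1][j],      # a advances, b[j-1] is reused
                                 d[i][j - 1],      # b advances, a[i-1] is reused
                                 d[i - 1][j - 1])  # both advance
    return d[n][m]


def classify(sequence, templates):
    """Return the name of the template with the smallest DTW distance."""
    return min(templates, key=lambda name: dtw_distance(sequence, templates[name]))


# Hypothetical templates: code 0 = right, 2 = up, 4 = left.
templates = {"right_then_up": [0, 2], "left": [4, 4]}
label = classify([0, 0, 1, 2, 2], templates)
```

Because DTW lets one template element absorb several input elements, the same gesture drawn slowly or quickly yields a similar distance, which is the time normalization the abstract refers to.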


Zhang J.,CAS Shenzhen Institutes of Advanced Technology | Zhang J.,University of Chinese Academy of Sciences | Zhang J.,Chinese University of Hong Kong | Zhang J.,Guangdong Provincial Key Laboratory of Robotics and Intelligent System | And 5 more authors.
IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems | Year: 2012

Because the visual system is susceptible to changes in lighting conditions and surroundings, object localization in robot grasping systems based on visual servoing is rather inaccurate, which leads to a low grasping success rate and poor robustness of the whole system. To address this, we propose a method that fuses a binocular camera with monocular vision, IR sensors, tactile sensors and encoders to design a reliable and robust grasping system that offers real-time feedback. To avoid cases in which the robot grasps nothing, we use binocular vision supplemented by a monocular camera and IR sensors to locate the object accurately. By analyzing the contact model and the pressure between the gripper and the object, a durable, non-slip rubber coating is designed to increase the fingertip's friction. Moreover, a Fuzzy Neural Network (FNN) method is applied to fuse the information from the multiple sensors in our robot system. By continuously monitoring force and position information during grasping, the system can reduce slippage and crushing of the object and greatly improve grasping stability. The experimental results show the effectiveness of our system. © 2012 IEEE.


Cheng J.,CAS Shenzhen Institutes of Advanced Technology | Cheng J.,Chinese University of Hong Kong | Cheng J.,Guangdong Provincial Key Laboratory of Robotics and Intelligent System | Bian W.,Intelligent Systems Technology, Inc. | Tao D.,Intelligent Systems Technology, Inc.
Information Sciences | Year: 2013

Gesture recognition plays an important role in human-machine interactions (HMIs) for multimedia entertainment. In this paper, we present a dimension reduction based approach for dynamic real-time hand gesture recognition. The hand gestures are recorded as acceleration signals by using a handheld device with a 3-axis accelerometer sensor installed, and represented by discrete cosine transform (DCT) coefficients. To recognize different hand gestures, we develop a new dimension reduction method, locally regularized sliced inverse regression (LR-SIR), to find an effective low-dimensional subspace in which different hand gestures are well separable, after which recognition can be performed by using simple and efficient classifiers, e.g., nearest mean, the k-nearest-neighbor rule and the support vector machine. LR-SIR is built upon the well-known sliced inverse regression (SIR), but overcomes its limitation of ignoring the local geometry of the data distribution. In addition, LR-SIR can be solved effectively and efficiently by eigen-decomposition. Finally, we apply the LR-SIR based gesture recognition to control our recently developed dance robot for multimedia entertainment. Thorough empirical studies on 'digits'-gesture recognition suggest the effectiveness of the new gesture recognition scheme for HMI. © 2012 Elsevier Inc. All rights reserved.


Li X.,CAS Shenzhen Institutes of Advanced Technology | Li X.,Guangdong Provincial Key Laboratory of Robotics and Intelligent System | Zhou Y.,CAS Shenzhen Institutes of Advanced Technology | Liang G.,CAS Shenzhen Institutes of Advanced Technology | And 4 more authors.
2015 IEEE International Conference on Information and Automation, ICIA 2015 - In conjunction with 2015 IEEE International Conference on Automation and Logistics | Year: 2015

This paper provides a closed-form analytical solution for TDOA three-dimensional (3D) acoustic source localization with four synchronized ultrasonic generators or receivers. A purely analytical solution for Geometric Dilution of Precision (GDOP) with four signal generators or receivers is then derived instead of a least-squares solution. The proposed GDOP model removes the effect of clock drift and thus shows the characteristics of GDOP more directly. © 2015 IEEE.


Xiao Q.,Chinese University of Hong Kong | Cheng J.,Chinese University of Hong Kong | Cheng J.,Guangdong Provincial Key Laboratory of Robotics and Intelligent System
2013 IEEE International Conference on Information and Automation, ICIA 2013 | Year: 2013

In this paper, we propose a framework that fuses multiple features for action recognition in depth sequences. Fusing multiple features is important for recognizing actions, since a representation based on a single feature is inadequate to capture the variations. Hence, we use two types of features: i) a quantized vocabulary of the local spatio-temporal descriptor HOG3D, and ii) a global projection-based descriptor that computes the HOG from the Depth Motion Maps. To combine these features optimally, we input them to different classifiers, where SVM is applied to estimate the probabilities of action labels. We then weight those probabilities and sum them to find the action label with the maximum score. The proposed approach is tested on the publicly available MSR Action3D dataset, which demonstrates that fusing multiple features significantly improves performance, outperforming Li et al.[1] in most cases. © 2013 IEEE.
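The weighted late-fusion step can be sketched as follows, assuming each classifier already outputs a probability per action label. The function name, the label names, and the weight values are hypothetical; the abstract does not state how the weights are chosen.

```python
def fuse_scores(prob_local, prob_global, w_local=0.5, w_global=0.5):
    """Weighted sum of per-label probabilities from two classifiers.

    prob_local / prob_global: dicts mapping action label -> probability,
    e.g. from one classifier on HOG3D features and one on DMM-HOG features.
    Returns (best_label, fused_scores).
    """
    # Only labels scored by both classifiers can be fused.
    labels = prob_local.keys() & prob_global.keys()
    fused = {
        label: w_local * prob_local[label] + w_global * prob_global[label]
        for label in labels
    }
    best_label = max(fused, key=fused.get)
    return best_label, fused


# Hypothetical classifier outputs for two action labels.
p_local = {"wave": 0.6, "kick": 0.4}
p_global = {"wave": 0.3, "kick": 0.7}

equal_choice, _ = fuse_scores(p_local, p_global)              # equal weights
local_choice, _ = fuse_scores(p_local, p_global, 0.8, 0.2)    # trust local more
```

With equal weights the global descriptor's strong vote for "kick" wins; shifting the weight toward the local descriptor flips the decision to "wave", which shows why the choice of weights matters.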


Jiang J.,CAS Shenzhen Institutes of Advanced Technology | Jiang J.,Chinese University of Hong Kong | Jiang J.,The Shenzhen Key Laboratory of Computer Vision and Pattern Recognition | Cheng J.,CAS Shenzhen Institutes of Advanced Technology | And 6 more authors.
Neurocomputing | Year: 2014

Real-time stereo matching in image sequences is important in video monitoring, robotic navigation and intelligent vehicles. Spatiotemporal stereo and scene flow can be used to produce temporally coherent disparity of dynamic scenes. However, most methods do not use the previous disparity map sufficiently to compute the current one. Thus, the disparity range limits the speed of disparity computation for each stereo pair. This paper integrates the temporal information into the stereo computation, and presents the relationship between consecutive disparity maps, which makes the disparity prediction reasonable. The scheme can produce a sequence of temporally coherent disparity maps rapidly. The tests performed on simulated and real stereo sequences confirm the validity of our approach. © 2014 Elsevier B.V.


Xu D.,CAS Shenzhen Institutes of Advanced Technology | Xu D.,Guangdong Provincial Key Laboratory of Robotics and Intelligent System | Chen Y.-L.,CAS Shenzhen Institutes of Advanced Technology | Chen Y.-L.,Guangdong Provincial Key Laboratory of Robotics and Intelligent System | And 7 more authors.
2012 IEEE International Conference on Robotics and Biomimetics, ROBIO 2012 - Conference Digest | Year: 2012

Natural human-robot interaction based on dynamic hand gestures has become a popular research topic in the past few years. Traditional dynamic gesture recognition methods are usually restricted by illumination conditions, varying color and cluttered backgrounds. Recognition performance can be improved by using hand-worn devices, but this is not a natural, barrier-free interaction. To overcome these shortcomings, a depth perception algorithm based on the Kinect depth sensor is introduced to carry out 3D hand tracking. We propose a novel start/end point detection method for segmenting the 3D hand gesture from the hand motion trajectory. Hidden Markov Models (HMMs) are then implemented to model and classify the hand gesture sequences, and the recognized gestures are converted to control commands for interaction with the robot. Seven different hand gestures performed by two hands are sufficient to navigate the robot. Experiments show that the proposed dynamic hand gesture interaction system works effectively in complex environments and in real time, with an average recognition rate of 98.4%. Further experiments on robot navigation also verify the robustness of our system. © 2012 IEEE.
