Key Laboratory of Advanced Displays and System Application

Shanghai, China

Guan Y.-P.,Shanghai University | Guan Y.-P.,Key Laboratory of Advanced Displays and System Application
Proceedings - 2010 13th IEEE International Conference on Computational Science and Engineering, CSE 2010 | Year: 2010

Among the gestures used in non-verbal communication, the pointing gesture is one of the most natural human-computer interfaces. Vision-based hand pointing is an optimal model for human-computer interaction (HCI). A key problem in vision-based pointing is how to recognize the pointing gesture. To address limitations of methods in the literature, a novel method is developed to estimate pointing gestures using non-calibrated cameras. Multiple uncalibrated cameras determine the pointing target from pointing features extracted from each camera and a support vector machine (SVM) classifier. No explicit constraints are placed on camera placement. The pointing user can move freely inside a wide interaction environment while pointing at targets. The approach requires neither that the pointing surface be flat nor that the target be visible to the cameras. Edge detection based on a multi-scale wavelet transform is used to extract pointing objects from a cluttered background. Comparative experiments show that the developed approach is efficient for pointing recognition. © 2010 IEEE.


Huang Y.,Shanghai University | Guan Y.,Shanghai University | Guan Y.,Key Laboratory of Advanced Displays and System Application
Engineering Applications of Artificial Intelligence | Year: 2015

We study the challenging problem of classifying samples into a large number of classes and propose using different dimensionality-reduction (DR) projections for different classes of samples. Based on this idea, traditional Linear Discriminant Analysis (LDA) and trace-ratio LDA are reformulated with corresponding multi-subspace objectives. We show that our multi-subspace DR methods naturally achieve a form of class-adaptive feature selection. Experiments on seven datasets show that our multi-subspace trace-ratio LDA outperforms its ratio-trace and single-subspace counterparts, and that its advantage is more apparent when the number of classes is large. © 2015 Elsevier Ltd.
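
For orientation, the two single-subspace baselines named in the abstract are usually written as follows. This is the standard textbook form, not the paper's multi-subspace objective, which the abstract does not spell out. The symbols are assumptions introduced here: S_b and S_w are the between-class and within-class scatter matrices, and W is the projection matrix.

```latex
% Ratio-trace LDA (the classical form, solvable by a generalized eigenproblem)
W^{*}_{\mathrm{RT}} = \arg\max_{W} \; \operatorname{tr}\!\left[ \left( W^{\top} S_w W \right)^{-1} \left( W^{\top} S_b W \right) \right]

% Trace-ratio LDA (typically solved iteratively under an orthogonality constraint)
W^{*}_{\mathrm{TR}} = \arg\max_{W^{\top} W = I} \; \frac{\operatorname{tr}\!\left( W^{\top} S_b W \right)}{\operatorname{tr}\!\left( W^{\top} S_w W \right)}
```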


Xie S.,Shanghai University | Guan Y.,Shanghai University | Guan Y.,Key Laboratory of Advanced Displays and System Application
Multimedia Tools and Applications | Year: 2015

Automatically detecting anomalies in surveillance videos is a crucial issue for public security. An unsupervised, online method for detecting abnormal behavior based on motion instability is developed. Motion instability combines the direction randomness and the motion intensity of particles obtained by optical-flow-based extraction of consecutive motion features. The direction randomness is computed as a weighted average of the circular variance of all particles. The motion intensity is determined from the average energy of all particles, taking the camera perspective effect into account. A feature-tracking scheme is employed to extract spatio-temporal motion features from videos, which increases processing speed. An adaptive dynamic thresholding strategy detects deviations of the track from the patterns observed in both direction randomness and motion intensity. In addition, a double-threshold inference strategy determines the range of the motion instability. A state transition model is used to confirm anomalies and reduce false alarms. Anomalies are detected quickly and online on ordinary hardware in cluttered scenes, without any prior assumptions about scene content. A comparative study with state-of-the-art methods indicates the superior performance of the developed approach. © 2015 Springer Science+Business Media New York
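
The direction-randomness measure can be illustrated with a minimal sketch of a weighted circular variance over particle flow vectors. The function names and the choice of flow magnitude as the weight are assumptions made for illustration; the abstract does not state the weighting the authors use.

```python
import math

def circular_variance(angles, weights=None):
    """Return 1 - R, where R is the length of the weighted mean unit vector.

    0 means all directions agree; values near 1 mean directions are scattered.
    """
    if weights is None:
        weights = [1.0] * len(angles)
    total = sum(weights)
    if total == 0:
        return 0.0  # no motion at all: treat as perfectly stable
    c = sum(w * math.cos(a) for a, w in zip(angles, weights)) / total
    s = sum(w * math.sin(a) for a, w in zip(angles, weights)) / total
    return 1.0 - math.hypot(c, s)

def direction_randomness(flows):
    """flows: list of (dx, dy) particle displacements from optical flow.

    Assumption: each particle is weighted by its flow magnitude.
    """
    angles = [math.atan2(dy, dx) for dx, dy in flows]
    magnitudes = [math.hypot(dx, dy) for dx, dy in flows]
    return circular_variance(angles, magnitudes)
```

Under this sketch, a crowd moving in one direction yields a value near 0, while particles scattering in all directions yield a value near 1.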


Guan Y.-P.,Shanghai University | Guan Y.-P.,Key Laboratory of Advanced Displays and System Application
Tien Tzu Hsueh Pao/Acta Electronica Sinica | Year: 2014

Using pointing gestures for human-computer interaction (HCI) lets people draw fully on their everyday skills and frees them from the constraints of ordinary input devices. A key problem is how to reliably recognize the pointing user in an HCI scene with a cluttered background. A novel method based on spatio-temporal motion has been developed. The multi-scale wavelet transform (MWT), which has outstanding local characteristics in both the spatial and temporal domains, is adopted to extract the moving foreground subject from the cluttered scene. This overcomes several disadvantages, including restrictions on environmental conditions, sensitivity to dynamic environment variation, and reliance on a priori assumptions. An MWT-based gradient integral graph is used to obtain HOG feature vectors of the pointing hand, which are learned and classified by machine learning. The pointing user is recognized from the spatial relationship between the pointing hand and its corresponding subject. Experimental results show that the proposed method is efficient and viable. © 2014, Chinese Institute of Electronics. All rights reserved.


Guan Y.-P.,Shanghai University | Guan Y.-P.,Key Laboratory of Advanced Displays and System Application
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2012

A novel algorithm is developed to detect moving objects and remove cast shadows by exploiting textural and spatio-temporal features. A multi-scale wavelet transform is used to segment moving objects based on spatial properties. Textural and spectral features, namely color-ratio differences between adjacent pixels in four directions, are used to remove cast shadows. The RGB color space is used directly, rather than introducing more complex color models, to segment moving objects and eliminate shadows. The proposal requires much less effort than currently available methods: it needs no complex supervised training phase and no manual calibration, and it makes no prior hypotheses. Comparative experiments show that the proposal is robust and efficient at segmenting moving objects and suppressing shadows. © 2012 Springer-Verlag.
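
The intuition behind a color-ratio cue can be sketched as follows: a cast shadow roughly scales neighboring pixels by the same factor, so the ratio between adjacent pixels stays close to that of the background, whereas a real object changes it. The direction set, the log-ratio form, and all names below are assumptions for illustration, not the paper's exact features.

```python
import math

# Assumed neighbor offsets (dy, dx): right, down, and the two downward diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]

def log_ratio(img, y, x, dy, dx, eps=1e-6):
    """Per-channel log ratio between pixel (y, x) and its neighbor at (y+dy, x+dx)."""
    p, q = img[y][x], img[y + dy][x + dx]
    return [math.log((a + eps) / (b + eps)) for a, b in zip(p, q)]

def ratio_difference(frame, background, y, x):
    """Sum of absolute color-ratio differences between frame and background
    over the four directions. Small values suggest shadow, large values object."""
    total = 0.0
    for dy, dx in DIRECTIONS:
        rf = log_ratio(frame, y, x, dy, dx)
        rb = log_ratio(background, y, x, dy, dx)
        total += sum(abs(a - b) for a, b in zip(rf, rb))
    return total
```

Images here are nested lists of RGB tuples, and the pixel (y, x) is assumed to be an interior pixel so that all four neighbors exist.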


Guan Y.,Shanghai University | Guan Y.,Key Laboratory of Advanced Displays and System Application | Huang Y.,Shanghai University
Engineering Applications of Artificial Intelligence | Year: 2015

In this paper, a novel system is developed to detect and track multiple heads in multiple poses, with efficient human head validation based on ellipse detection. A particle filter is employed for each tracked head. The head appearance model is updated by fusing a color histogram with an oriented-gradient histogram. Regardless of its pose, a head is modeled as an ellipse, and we propose an objective function for fitting the elliptical equation. A robust supervised distance-function learning framework, using an Expectation-Maximization algorithm, has been developed to recover missed detections and suppress false detections. A comparative study with state-of-the-art methods indicates the good performance of the proposed method. © 2014 Elsevier Ltd.


Huang Y.,Shanghai University | Guan Y.,Shanghai University | Guan Y.,Key Laboratory of Advanced Displays and System Application
Multimedia Tools and Applications | Year: 2015

In this paper, we propose a philosophy different from that of the well-known Locality-Sensitive Hashing (LSH): if two data points are close, we want the probability that they fall into the same hash bucket to be high, whereas if two data points are far apart, we do not care about the probability that they fall into the same hash bucket. This philosophy relaxes the LSH requirement by ignoring the side effects of placing differently labeled data points into the same hash bucket. Based on this relaxation, a new hashing method, Laplacian Hashing, is derived, which naturally incorporates any kernel function and weakly supervised “similar” / “dissimilar” information. Another contribution of this paper is that, for the first time, a fast hashing method is applied to the midway processing in a cascaded face detection structure. Experimental results show that our method is on average no worse than the state of the art in accuracy, but is much faster and can therefore handle much larger training datasets within reasonable computation time. © 2015 Springer Science+Business Media New York
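
As background for the LSH requirement being relaxed, here is a minimal sketch of classical random-hyperplane hashing, in which nearby vectors tend to share hash bits and distant ones tend not to. This is the standard sign-of-projection scheme, not the Laplacian Hashing method of the paper; all function names are hypothetical.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(x, planes):
    """One bit per hyperplane: 1 if x lies on the non-negative side, else 0."""
    return tuple(
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) >= 0 else 0
        for w in planes
    )

def hamming(a, b):
    """Number of differing bits between two hash codes."""
    return sum(u != v for u, v in zip(a, b))
```

Two vectors pointing in the same direction receive identical codes, and the expected Hamming distance grows with the angle between them. The relaxation described in the abstract keeps the first property and drops any guarantee for far-apart points.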


Huang Y.,Shanghai University | Guan Y.,Shanghai University | Guan Y.,Key Laboratory of Advanced Displays and System Application
Multimedia Tools and Applications | Year: 2015

In this paper, we present a non-uniform 1D ruler model and apply it to various image classification and image recognition scenarios, some of which are military applications. The model is simple, elegant, and original, and it is solved by convex quadratic programming. It has wide applications in pattern recognition and intelligent multimedia data analysis. We believe this opens a new research topic, numeric calibration, which is parallel to dimensionality reduction, feature selection, and metric learning. Our methods can be used as a pre-processing step for metric learning methods, with the learned calibrated feature space serving as their input. Combining our methods with metric learning methods may lead to interesting new research problems. © 2015 Springer Science+Business Media New York


Guan Y.-P.,Shanghai University | Guan Y.-P.,Key Laboratory of Advanced Displays and System Application
Tien Tzu Hsueh Pao/Acta Electronica Sinica | Year: 2013

A novel human-computer interaction (HCI) method based on multimodal visual features is developed to address some current limitations. A two-dimensional Gabor wavelet is adopted to extract visual features of global face orientation, which overcomes difficulties such as extracting distinctive facial features and discriminating among different facial orientations. An efficient and fast approach to locating the eye centers is proposed based on facial geometric distributions, independent of facial resolution, whether the eyes are open or closed, and what the user is wearing. After the performance of the extracted visual features is evaluated, the most prominent multimodal features are selected for machine learning and training to determine the pointing target. This enables a natural, non-wearable HCI mode in which users can move freely, without wearing any markers, while pointing at targets, and can fully draw on their everyday skills during HCI. Experimental results indicate that the developed approach is efficient and can be used for natural, non-wearable HCI.
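
A two-dimensional Gabor wavelet of the kind mentioned above is a Gaussian envelope multiplied by an oriented sinusoid. The sketch below builds the real part of such a kernel using the common textbook parameterization; the parameter names and defaults are assumptions and are not taken from the paper.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Real part of a 2D Gabor kernel as a size x size nested list.

    size: odd kernel width; wavelength: period of the sinusoid in pixels;
    theta: orientation in radians; sigma: Gaussian envelope width;
    gamma: spatial aspect ratio; psi: phase offset.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel
```

Convolving a face image with a bank of such kernels at several orientations gives responses that vary with the dominant edge direction, which is why Gabor features are a common choice for orientation cues.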


Yang C.,Key Laboratory of Advanced Displays and System Application | An P.,Key Laboratory of Advanced Displays and System Application | Liu D.,Key Laboratory of Advanced Displays and System Application | Shen L.,Key Laboratory of Advanced Displays and System Application
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Year: 2016

Multi-view video plus depth (MVD) is a 3D video representation. In MVD, the depth map provides scene distance information and is used to render virtual views through the Depth Image Based Rendering (DIBR) technique. Depth map coding errors induce distortion in the rendered virtual views. This paper proposes a mathematical model that estimates the synthesized virtual view distortion induced by depth map compression, and applies the model to rate-distortion optimization (RDO) in depth map coding. Based on the rendered virtual view quality, a Lagrangian optimization adjustment scheme at the Coding Unit (CU) level is proposed to improve depth map encoding efficiency. Experimental results demonstrate that the proposed method improves the BD-PSNR of the virtual view by 0.62 dB and reduces encoding complexity compared with the view synthesis optimization (VSO) technique in the 3D-HEVC Test Model (HTM). © 2016 IEEE.
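
Rate-distortion optimization selects, for each block, the coding choice that minimizes the Lagrangian cost J = D + λR. The sketch below applies that generic rule with an estimated synthesized-view distortion as D. The candidate names and numbers are made up for illustration, and the paper's distortion model and CU-level λ adjustment are not reproduced here.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits

def choose_mode(candidates, lam):
    """Pick the candidate with the lowest RD cost.

    candidates: list of (name, synthesized_view_distortion, rate_bits).
    lam: Lagrange multiplier; a CU-level scheme would adjust it per coding unit.
    """
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))[0]

# Hypothetical candidates for one coding unit.
modes = [("skip", 100.0, 2), ("intra", 40.0, 30), ("inter", 60.0, 10)]
low_lambda_choice = choose_mode(modes, 0.5)    # distortion dominates the cost
high_lambda_choice = choose_mode(modes, 10.0)  # rate dominates the cost
```

A small λ favors the low-distortion, high-rate mode, while a large λ favors the cheap mode, which is the trade-off a per-CU adjustment would tune.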
