Shenzhen Key Laboratory of Information Science and Technology

Laboratory of, China


Lu Z.,Tsinghua University | Lu Z.,Shenzhen Key Laboratory of Information Science and Technology | Zhang W.,Tsinghua University | Zhang W.,Shenzhen Key Laboratory of Information Science and Technology | And 2 more authors.
2016 IEEE Symposium Series on Computational Intelligence, SSCI 2016 | Year: 2016

Deep learning has been highly successful for pedestrian detection. However, we find that it is barely satisfactory for multi-scale detection. Meanwhile, various solutions based on traditional methods, such as multi-scale classifiers, have been developed to handle this situation. With this in mind, we propose a scale-discriminative classifier (SDC) layer that contains numerous classifiers to cope with different scales. To expand the capacity for small-scale pedestrian detection, we construct a full-scale layer that merges high-level semantic features with low-level features. Together, these components form a scale-discriminative network (SDN) for pedestrian detection. We apply this network to the Caltech pedestrian dataset, and the experimental results show that the SDN achieves state-of-the-art performance. © 2016 IEEE.


Liu S.,Shenzhen Key Laboratory of Information Science and Technology | Zhou F.,Shenzhen Key Laboratory of Information Science and Technology | Liao Q.,Shenzhen Key Laboratory of Information Science and Technology
IEEE Transactions on Image Processing | Year: 2016

Defocus map estimation (DME) is highly important in many computer vision applications. Nearly all existing approaches for DME from a single image are based on a one-parameter defocus model, which does not allow for the variation of depth over edges. In this paper, a novel two-parameter model of defocused edges is proposed for DME from a single image. With this model, we can estimate the defocus amount on each side of an edge and generate the confidence that the edge is a pattern edge, that is, an edge over which the depth remains the same. Then, we modify the TV-L1 algorithm for structure-texture decomposition by taking advantage of this confidence to eliminate pattern edges while preserving structural ones. Finally, the defocus amounts estimated at the edge positions are used as initial values, and the structure component is employed as guidance in the subsequent Laplacian matting procedure to avoid the influence of pattern edges on the final defocus map. Experimental results show that the proposed method can effectively eliminate the influence of pattern edges compared with the state-of-the-art method. Furthermore, the estimated defocus map is usable in applications such as depth estimation and foreground/background segmentation. © 2016 IEEE.


Wang B.,Tsinghua University | Wang B.,Shenzhen Key Laboratory of Information Science and Technology | Li W.,Tsinghua University | Li W.,Shenzhen Key Laboratory of Information Science and Technology | And 3 more authors.
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Year: 2013

Recent research has shown that the collaborative representation-based classifier (CRC) can lead to promising results for the classification of face images. However, CRC is conducted in the original image space rather than in the nonlinear high-dimensional feature space, in which features belonging to the same class are better grouped together and are thus more easily separable. To address this problem, this paper presents a novel classifier, the Kernel Collaborative Representation-based Classifier (KCRC), which incorporates the kernel trick into the framework of CRC. Extensive experiments on both the AT&T and the FERET face databases demonstrate the superiority of KCRC over CRC and several state-of-the-art methods. © 2013 IEEE.
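
The idea of moving CRC into a kernel feature space can be sketched as follows. This is a minimal illustration, not the authors' implementation: the RBF kernel, the values of `lam` and `gamma`, the plain (unnormalized) class-wise residual, and all function names are assumptions made for this example. The probe is coded over all gallery samples by solving (K + lam·I)·alpha = k(y), and the class whose samples reconstruct the probe best in feature space wins.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length feature vectors (assumed kernel)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def kcrc_predict(gallery, labels, probe, lam=0.01, gamma=0.5):
    """Hypothetical kernel-CRC sketch: ridge coding in feature space, then class residuals."""
    n = len(gallery)
    K = [[rbf_kernel(gallery[i], gallery[j], gamma) for j in range(n)] for i in range(n)]
    k_y = [rbf_kernel(g, probe, gamma) for g in gallery]
    A = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = solve(A, k_y)  # collaborative coefficients over the whole gallery
    k_yy = rbf_kernel(probe, probe, gamma)
    best_label, best_res = None, float("inf")
    for c in sorted(set(labels)):
        idx = [i for i, l in enumerate(labels) if l == c]
        # ||phi(y) - Phi_c alpha_c||^2 expanded using kernel evaluations only
        res = (k_yy - 2 * sum(alpha[i] * k_y[i] for i in idx)
               + sum(alpha[i] * alpha[j] * K[i][j] for i in idx for j in idx))
        if res < best_res:
            best_label, best_res = c, res
    return best_label

gallery = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
labels = ["a", "a", "b", "b"]
print(kcrc_predict(gallery, labels, [0.2, 0.5]))
```

Because every quantity is expressed through kernel evaluations, the feature map itself never has to be computed, which is the point of the kernel trick mentioned in the abstract.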


Wu Y.,Tsinghua University | Wu Y.,Shenzhen Key Laboratory of Information Science and Technology | Jiang Y.,Tsinghua University | Jiang Y.,Shenzhen Key Laboratory of Information Science and Technology | And 7 more authors.
Neurocomputing | Year: 2014

Robust face recognition under uncontrolled illumination conditions is one of the key challenges for real-time face recognition systems. Weber-face (WF) is an illumination-insensitive face representation based on Weber's law. In this letter, we develop a generalized Weber-face (GWF), which extracts the statistics of multi-scale information from face images. By assigning different weights to the inner-ground and outer-ground, we further develop a weighted GWF (wGWF) version. Based on our experiments on the extended Yale-B and FERET face databases, we show that the proposed methods are robust to illumination variations and obtain promising performance comparable with existing approaches. © 2014 Elsevier B.V.
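
To illustrate why a Weber's-law ratio is insensitive to illumination, the sketch below computes a basic single-scale Weber-face response. It is an assumption-laden illustration of the baseline WF idea only: the 3x3 neighborhood, the arctangent compression, the value of `alpha`, and the function name are choices made for this example, and the generalized and weighted variants (GWF, wGWF) are not specified in the abstract and are not reproduced here. Intensities are assumed strictly positive.

```python
import math

def weber_face(img, alpha=2.0):
    """Hypothetical single-scale Weber-face: relative intensity differences to the
    3x3 neighbors, summed and compressed with arctan. Border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            c = img[y][x]  # assumed > 0
            total = sum((c - img[y + dy][x + dx]) / c
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = math.atan(alpha * total)
    return out

img = [[10.0, 20.0, 30.0],
       [40.0, 50.0, 60.0],
       [70.0, 80.0, 95.0]]
brighter = [[3.0 * v for v in row] for row in img]  # same scene, 3x the illumination
print(weber_face(img)[1][1], weber_face(brighter)[1][1])
```

Each term is a ratio of an intensity difference to the center intensity, so a multiplicative change in illumination cancels out of the response.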


Zhou F.,Tsinghua University | Zhou F.,Shenzhen Key Laboratory of Information Science and Technology | Liao Q.,Tsinghua University | Liao Q.,Shenzhen Key Laboratory of Information Science and Technology
IET Image Processing | Year: 2015

In this study, the authors consider the problem of image super-resolution (SR) in terms of perceptual criteria. Existing SR methods treat the traditional mean-squared error (MSE) as an irreplaceable objective function. However, MSE has been widely criticised because it is inconsistent with human visual perception. Perceptual criteria, including the structural similarity (SSIM) index and the feature similarity (FSIM) index, have been reported to be more effective in assessing image quality. Therefore, SSIM and FSIM are included for the SR task in this study. Specifically, the authors first propose to reform principal component analysis (PCA) by adopting SSIM as the objective function; the result is named visual perceptual PCA (VP-PCA). Subsequently, to accomplish the SR task, the authors cluster the training data and perform VP-PCA on each cluster to calculate the coefficients. Finally, based on the principle of FSIM, the traditional SR results and the SR results using VP-PCA are combined to form the fused results. Experimental results are provided to show the superiority of the proposed method over several state-of-the-art methods in both quantitative and visual comparisons. © The Institution of Engineering and Technology 2014.
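
The mismatch between MSE and a perceptual criterion can be seen in a small example. The sketch below computes the standard SSIM index over a single window (in practice SSIM is usually averaged over local windows) using the common constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2; the sample data and function names are assumptions for illustration, and this is not the VP-PCA method itself.

```python
def mse(x, y):
    """Mean-squared error between two equal-length signals."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM index with the usual stabilizing constants."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

reference = [50.0, 100.0, 150.0, 200.0]
shifted = [60.0, 110.0, 160.0, 210.0]      # uniform brightness shift of +10
alternating = [60.0, 90.0, 160.0, 190.0]   # +10 / -10 alternating distortion

print(mse(reference, shifted), mse(reference, alternating))
print(ssim(reference, shifted), ssim(reference, alternating))
```

Both distortions have the same MSE, yet SSIM rates the uniform shift as closer to the reference than the alternating distortion, which changes the local structure.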


Wang B.,Tsinghua University | Wang B.,Shenzhen Key Laboratory of Information Science and Technology | Li W.,Tsinghua University | Li W.,Shenzhen Key Laboratory of Information Science and Technology | And 3 more authors.
Neurocomputing | Year: 2013

The single sample per person (SSPP) problem is quite common in real-world face recognition applications. In such circumstances, the lack of sufficient training samples often results in poor generalization ability for the majority of existing state-of-the-art methods. To address this problem, this paper presents a fairly simple but effective approach, called the adaptive linear regression classifier (ALRC), based on the simple observation that similar subjects have similar intra-personal variations. ALRC is a linear model that represents a probe image as a linear combination of the single class-specific gallery image and the intra-personal variations adaptively pulled from his/her k nearest neighbors (kNNs) in an auxiliary generic training set with multiple samples per person. ALRC can be easily implemented with a regularized least-squares estimator, and the decision is made in favor of the class with the minimum reconstruction error. Experimental results on the AR and FERET face datasets show that ALRC outperforms several state-of-the-art approaches and demonstrates promising robustness against variations including expression, illumination, and disguise. © 2013 Elsevier B.V.


Lu H.,Tsinghua University | Zhang S.,Shenzhen Key Laboratory of Information Science and Technology | Lin X.,Shenzhen Key Laboratory of Information Science and Technology
IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC | Year: 2012

Vehicle-to-infrastructure (V2I) communications enable various transportation applications and help improve road safety, transportation efficiency, and travel comfort. As vehicles change their point of attachment frequently, seamless handover becomes one of the most challenging research issues in supporting these applications. In this paper, we propose a fast handover scheme for Proxy Mobile IPv6 (PMIPv6) to provide seamless mobility for vehicles. The proposed scheme tackles the spatial and temporal prediction issues in Fast Handover for Proxy Mobile IPv6 (FPMIPv6) with the aid of mobility information, pre-stored access point (AP) placement information, and received signal strength (RSS). It also uses a pre-established backup binding and a tunnel between the previous mobile access gateway (PMAG) and the new mobile access gateway (NMAG) to reduce handover latency and packet loss. Analytical and simulation results show that the proposed scheme significantly reduces the overall handover latency and packet loss compared with other related schemes. © 2012 IEEE.


Yang W.,Shenzhen Key Laboratory of Information Science and Technology | Yang W.,Visual Information Processing Laboratory | Huang X.,Shenzhen Key Laboratory of Information Science and Technology | Huang X.,Visual Information Processing Laboratory | And 3 more authors.
Information Sciences | Year: 2014

In this paper, we present a multimodal personal identification system that fuses finger vein and finger dorsal images at the feature level. First, we design an image acquisition device that can synchronously capture finger vein and finger dorsal images. A small dataset of these images has also been established for algorithm testing and evaluation. Secondly, to utilize the intrinsic positional relationship between the finger veins and the finger dorsal, we perform a special registration on the two kinds of images. Subsequently, the regions of interest (ROIs) of both kinds of images are extracted and normalized in both size and intensity. Thirdly, we develop a magnitude-preserved competitive code feature extraction method, which is applied to both the finger vein and finger dorsal images. Furthermore, according to the preserved magnitude, a comparative competitive code (C2Code) is explored for finger vein and dorsal fusion at the feature level. The proposed C2Code feature map, which contains new features of junction points and positions from the finger vein and finger dorsal image pairs, is extremely informative for identification. Finally, the C2Code feature map is fed into a nearest neighbor (NN) classifier to carry out personal authentication. Experimentally, we compare the performance of the proposed fusion strategy with that of state-of-the-art unimodal biometrics on the established dataset and find that it achieves higher identification accuracy and lower equal error rates (EERs). © 2013 Elsevier Inc. All rights reserved.


Wang B.,Tsinghua University | Wang B.,Shenzhen Key Laboratory of Information Science and Technology | Li W.,Tsinghua University | Li W.,Shenzhen Key Laboratory of Information Science and Technology | And 2 more authors.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2013

This paper focuses on enhancing the Sparse Representation based Classifier (SRC) in single-sample face recognition tasks under varying illumination conditions. The major contribution is two-fold. Firstly, we present an interesting observation based on the Lambertian reflectance model: in the logarithmic domain, the identity information is canceled out in pair-wise difference images from the same subject, and only the subject-independent illumination variation remains. Secondly, inspired by this observation, we propose to "borrow" illumination variations from any generic subject by constructing an illumination variation dictionary composed of pair-wise difference images of generic subjects in the logarithmic domain, which covers the possible illumination variations between test and gallery samples. Experimental results on the Extended Yale B and FERET face databases demonstrate the superiority of our method. © Springer-Verlag 2013.
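
The cancellation described in this observation follows in one step from a simplified Lambertian model. The derivation below is a sketch under stated assumptions: the symbols (albedo \rho, surface normal \mathbf{n}, light directions \mathbf{s}_1 and \mathbf{s}_2) are notation introduced here, and attached and cast shadows are ignored so that the shading term stays positive.

```latex
% Two images of the same subject under light sources s_1 and s_2 (no shadows assumed):
I_k(x) = \rho(x)\,\mathbf{n}(x)^{\top}\mathbf{s}_k, \qquad k = 1, 2.

% Taking logarithms separates albedo from shading:
\log I_k(x) = \log \rho(x) + \log\!\big(\mathbf{n}(x)^{\top}\mathbf{s}_k\big).

% In the pair-wise difference the albedo term cancels:
\log I_1(x) - \log I_2(x)
  = \log\!\big(\mathbf{n}(x)^{\top}\mathbf{s}_1\big)
  - \log\!\big(\mathbf{n}(x)^{\top}\mathbf{s}_2\big).
```

The albedo carries much of the identity information and drops out exactly, while the remaining term depends on the lighting and on the surface normals; treating that term as subject-independent relies on the additional assumption that face shapes are roughly similar across people.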


Li F.,Tsinghua University | Li F.,Shenzhen Key Laboratory of Information Science and Technology | Yang W.,Tsinghua University | Yang W.,Shenzhen Key Laboratory of Information Science and Technology | And 2 more authors.
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Year: 2016

This work considers the detection and localization of abnormal activities. An efficient approach called oriented GMM (OGMM) is proposed. The approach uses optical flow as the low-level feature and quantizes the orientation of the optical flow into 8 sections. In the training stage, the approach learns a Gaussian mixture model (GMM) at each orientation section and each position. In the testing stage, it estimates the probability that a position is abnormal using a likelihood method. The proposed approach is a local method and can both detect and locate anomalies. Moreover, the same process is applied to each position with little interaction between positions, which makes the approach well suited to parallel computing and to large-scale tasks in the Big Data era. The experiments verify that the proposed approach is efficient and effective. © 2016 IEEE.
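
The orientation quantization step can be sketched as follows. This is a minimal illustration under assumptions: the bin layout (bin 0 starting at the positive x-axis, counter-clockwise), the function names, and the sample flow vectors are choices made for this example, and the per-section, per-position GMM fitting is not shown.

```python
import math

def orientation_bin(u, v, n_bins=8):
    """Map an optical-flow vector (u, v) to one of n_bins equal angular sections."""
    angle = math.atan2(v, u) % (2 * math.pi)  # wrap to [0, 2*pi)
    # The final modulo guards against angle rounding up to exactly 2*pi.
    return int(angle / (2 * math.pi / n_bins)) % n_bins

def group_by_orientation(flows, n_bins=8):
    """Group flow magnitudes by orientation section, e.g. as input for a
    separate per-section model at one image position."""
    groups = {b: [] for b in range(n_bins)}
    for u, v in flows:
        groups[orientation_bin(u, v, n_bins)].append(math.hypot(u, v))
    return groups

flows = [(1.0, 0.1), (1.0, 2.0), (-1.0, 0.5), (-1.0, -0.5), (0.5, -1.0)]
print([orientation_bin(u, v) for u, v in flows])
```

Since each position and each orientation section is handled independently, the grouping above can be run per position in parallel, in line with the locality the abstract emphasizes.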
