Yang S.,Shanghai JiaoTong University | Yang H.,Shanghai JiaoTong University | Yang H.,Shanghai Key Laboratory of Digital Media Processing and Transmission | Li J.,Shanghai JiaoTong University | And 3 more authors.
Communications in Computer and Information Science | Year: 2017

For public security, an intelligent video surveillance system that can analyze large-scale crowd scenes has become an urgent need. In this paper, we propose a system that integrates multiple crowd properties, including stationary and dynamic features, local and global characteristics, and historical statistics analysis, in a unified framework. Specifically, our system consists of four modules. The crowd density module describes the global density level and the local density distribution with a sparse spatial-temporal local binary pattern. The crowd segmentation module presents both global crowd grouping and local moving directions based on spatial-temporal dynamics. In the crowd saliency module, salient regions are detected to raise alarms on abnormal behaviors. Finally, a historical statistics analysis module is introduced to analyze the historical features of the video stream. Experiments on different crowd datasets show that our system is robust and feasible, and satisfies the requirements of video surveillance applications. © Springer Nature Singapore Pte Ltd. 2017.


Xue G.,Shanghai JiaoTong University | Xue G.,Shanghai Key Laboratory of Digital Media Processing and Transmission | Sun J.,Shanghai JiaoTong University | Sun J.,Shanghai Key Laboratory of Digital Media Processing and Transmission | And 2 more authors.
Proceedings - International Conference on Image Processing, ICIP | Year: 2010

Effective foreground detection under sudden illumination change is an active research topic. However, most existing background subtraction approaches are intensity based and fail to handle this situation. In this paper, we propose a novel background modeling method that overcomes this limitation by relying on statistical models that use pixel phase instead of intensity. We first extract the phase feature of each pixel using Gabor filters. Then, a phase-based background subtraction approach is proposed. In this approach, each phase feature is modeled independently by a mixture of Gaussians and updated with a novel scheme. Since foreground pixels are scattered in the preliminary detection result, a distance transform is applied to the binary image, converting it into a distance map. We then segment the distance map with a threshold to obtain the final result. Experiments on two challenging sequences demonstrate the effectiveness and robustness of our method. © 2010 IEEE.


Yu S.,Shanghai JiaoTong University | Zheng S.,Shanghai JiaoTong University | Zheng S.,Shanghai Key Laboratory of Digital Media Processing and Transmission | Yang H.,Shanghai JiaoTong University | And 2 more authors.
2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance, AVSS 2013 | Year: 2013

Recognizing vehicle manufacturer logos is a crucial and very challenging problem for which few effective methods have been published. This paper proposes a new fast and reliable system for Vehicle Logo Recognition (VLR) based on Bag-of-Words (BoW). In our system, vehicle logo images are represented as histograms of visual words and classified by an SVM in three steps: first, extract dense-SIFT features; second, quantize the features into visual words by 'soft-assignment'; third, build histograms of visual words with spatial information. Experimental results show that, compared with traditional VLR methods, our proposed system achieves higher recognition accuracy with less processing time. The proposed system is evaluated on a dataset of 840 low-resolution vehicle logo images of about 30×30 pixels, which verifies that our system is practical and effective. © 2013 IEEE.
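
The quantization step described above can be illustrated with a minimal sketch of soft-assignment in a Bag-of-Words pipeline. This is not the authors' implementation: the Gaussian kernel, the `sigma` parameter, and the function name `soft_assign_histogram` are assumptions made for illustration, and the dense-SIFT extraction and spatial-information steps are omitted.

```python
import math

def soft_assign_histogram(descriptors, codebook, sigma=1.0):
    """Build a normalized visual-word histogram with soft assignment.

    Each descriptor contributes a fractional vote to every visual word,
    weighted by a Gaussian kernel on its distance to that word
    (the kernel choice and sigma are illustrative assumptions).
    """
    hist = [0.0] * len(codebook)
    for d in descriptors:
        # Kernel weight of this descriptor for every visual word.
        weights = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(d, w)) / (2.0 * sigma ** 2))
            for w in codebook
        ]
        total = sum(weights)
        if total == 0.0:
            continue  # all weights underflowed; skip this descriptor
        for k, wgt in enumerate(weights):
            hist[k] += wgt / total  # votes of one descriptor sum to 1
    n = sum(hist)
    return [h / n for h in hist] if n > 0 else hist

# Toy usage: two visual words, three 2-D "descriptors".
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[0.0, 0.0], [0.0, 0.0], [10.0, 10.0]]
hist = soft_assign_histogram(descriptors, codebook)
```

With hard assignment each descriptor would vote for its nearest word only; soft assignment spreads the vote, which tends to reduce quantization error for descriptors that lie between words.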


Zhu J.,Shanghai JiaoTong University | Zhu J.,Shanghai Key Laboratory of Digital Media Processing and Transmission | Wang B.,Microsoft | Yang X.,Shanghai JiaoTong University | And 4 more authors.
Proceedings of the IEEE International Conference on Computer Vision | Year: 2013

With improved access to an exploding amount of video data and growing demand across a wide range of video analysis applications, video-based action recognition/classification has become an increasingly important task in computer vision. In this paper, we propose a two-layer structure for action recognition to automatically exploit a mid-level "acton" representation. The actons are learned via a new max-margin multi-channel multiple instance learning framework. The learned actons (which require no detailed manual annotations) are therefore compact, informative, discriminative, and easy to scale. This differs from the standard unsupervised (e.g. k-means) or supervised (e.g. random forests) coding strategies in action recognition. Applying the learned actons in our two-layer structure yields state-of-the-art classification performance on the YouTube and HMDB51 datasets. © 2013 IEEE.


Wang Z.,CAS Shanghai Advanced Research Institute | Wang Z.,University of Chinese Academy of Sciences | Wang P.,CAS Shanghai Advanced Research Institute | Zhang H.,CAS Shanghai Advanced Research Institute | And 3 more authors.
IEICE Transactions on Information and Systems | Year: 2014

High Efficiency Video Coding (HEVC) is the latest video coding standard that is supported by JCT-VC. In this letter, an encoding algorithm for early termination of Coding Unit (CU) and Prediction Unit (PU) decisions based on the texture direction is proposed for HEVC intra prediction. Experimental results show that the proposed algorithm reduces total encoding time by 40% on average with negligible rate-distortion loss. © 2014 The Institute of Electronics, Information and Communication Engineers.


Cheng N.,Tongji University | Lu N.,University of Waterloo | Wang P.,Tongji University | Wang P.,Shanghai Key Laboratory of Digital Media Processing and Transmission | And 2 more authors.
IEEE Vehicular Networking Conference, VNC | Year: 2011

RSU-assisted VANETs should provide vehicles and mobile users with reliable safety applications as well as various non-safety services with different QoS levels. We present a dedicated multi-channel MAC plus a QoS-provision channel allocation scheme based on the EDCA channel throughput analysis to improve the QoS performance of non-safety services for the RSU-assisted centralized VANETs. Simulation results show that the proposed scheme outperforms the other two existing centralized MAC schemes. © 2011 IEEE.


Wang P.,Tongji University | Wang P.,Shanghai Key Laboratory of Digital Media Processing and Transmission | Han J.,Shanghai Key Laboratory of Digital Media Processing and Transmission | Liu F.,Shanghai Key Laboratory of Digital Media Processing and Transmission | And 2 more authors.
Journal of Networks | Year: 2011

A centralized resource allocation algorithm for multi-cell OFDM systems is studied, which aims at improving the performance of wireless communication systems and enhancing the spectral efficiency of users at the cell edge. The proposed resource allocation algorithm is divided into two steps. The first step is sub-carrier allocation based on matrix searching within a single cell, and the second is joint power allocation across multiple cells based on cooperative game theory. Compared with traditional resource allocation algorithms in the multi-cell scenario, the proposed algorithm has lower computational complexity and good fairness performance. © 2011 ACADEMY PUBLISHER.


Han J.,Tongji University | Wang P.,Tongji University | Wang P.,Shanghai Key Laboratory of Digital Media Processing and Transmission | Liu F.,Tongji University | Zhu Y.,Tongji University
Journal of Communications | Year: 2011

This paper studies power allocation in coordinated multi-point (CoMP) transmission in the 3GPP LTE-Advanced system under power constraints on the remote radio units (RRUs). We apply block diagonal (BD) precoding to downlink transmission, and assume perfect knowledge of the downlink channels and transmit messages at each transmit point. We propose a modified water-filling (MWF) power allocation algorithm that maximizes the downlink sum capacity while keeping complexity low. The interior-point method is also used to solve the optimization problem. Simulations show that the interior-point method converges after only a few iterations and that the resulting system capacity is near-optimal. In terms of complexity and power efficiency, MWF achieves a good compromise. © 2011 ACADEMY PUBLISHER.
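
For orientation, the following is a minimal sketch of classic water-filling, the textbook scheme that the MWF algorithm modifies. It is not the paper's MWF: it assumes a single sum-power constraint rather than per-RRU constraints, and the function names and the effective channel gains `gains` are hypothetical.

```python
import math

def water_filling(gains, total_power):
    """Classic water-filling (not the paper's MWF variant).

    Maximizes sum(log2(1 + p_i * g_i)) subject to sum(p_i) == total_power
    and p_i >= 0. The solution is p_i = max(0, level - 1/g_i).
    """
    # Sort channels by "floor height" 1/g_i, strongest channel first.
    order = sorted(range(len(gains)), key=lambda i: 1.0 / gains[i])
    floors = [1.0 / gains[i] for i in order]
    powers = [0.0] * len(gains)
    # Try serving the k strongest channels, shrinking k until the water
    # level lies above the floor of the weakest served channel.
    for k in range(len(floors), 0, -1):
        level = (total_power + sum(floors[:k])) / k
        if level > floors[k - 1]:
            for j in range(k):
                powers[order[j]] = level - floors[j]
            break
    return powers

def sum_capacity(gains, powers):
    """Sum capacity in bits per channel use for the given allocation."""
    return sum(math.log2(1.0 + p * g) for p, g in zip(powers, gains))

# Toy usage: three parallel channels, unit total power.
gains = [2.0, 1.0, 0.1]
powers = water_filling(gains, 1.0)
```

In this toy case the weakest channel receives no power, because its floor lies above the water level; per-unit power constraints, as in the paper's setting, would additionally cap what each transmit point can contribute.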


Wang C.,East China Normal University | Shen M.,South China University of Technology | Yao C.,Third Security | Yao C.,Shanghai Key Laboratory of Digital Media Processing and Transmission
Multimedia Tools and Applications | Year: 2016

Dynamic weather conditions, such as rain and snow, often produce strong intensity discontinuities between frames and thus seriously degrade visual quality and compression performance. Removing these artifacts is a challenging task that has been intensively studied recently. The state-of-the-art algorithms detect these scratches before removing them from the scene. The visual effect of rain or snow is complex and difficult to distinguish from the background; hence the precision of its detection and segmentation by hard decision is usually unsatisfactory. Because an anisotropic filter performs well in removing structural noise, such as linear, planar, and isotropic noise, it is used in this paper to analyze image content and suppress scratch noise simultaneously. Compared with the state-of-the-art algorithms, the proposed algorithm is better and more robust in dynamic scenes. © 2016 Springer Science+Business Media New York


Wang C.,East China Normal University | Shen M.,University of Konstanz | Shen M.,South China University of Technology | Yao C.,Third Security | Yao C.,Shanghai Key Laboratory of Digital Media Processing and Transmission
Journal of Visual Communication and Image Representation | Year: 2015

A blind/no-reference (NR) method is proposed in this paper for image quality assessment (IQA) of images compressed in the discrete cosine transform (DCT) domain. When an image is measured by structural similarity (SSIM), two statistics, i.e. the mean intensity and the variance of the image, are used as features. However, these parameters of the original image are unavailable in NR applications; hence SSIM is not widely applicable. To extend SSIM to general cases, we apply a Gaussian model to fit the quantization noise in the spatial domain, and directly estimate the noise distribution from the compressed version. Thanks to this rearrangement, the revised SSIM does not require the original image as a reference. Heavy compression always results in some zero-valued DCT coefficients, which need to be compensated for to obtain a more accurate parameter estimate. By studying the quantization process, a machine-learning based algorithm is proposed to estimate the quantization noise while taking image content into consideration. Compared with state-of-the-art algorithms, the proposed IQA is more heuristic and efficient. Experimental results verify that the proposed algorithm (given no reference image) achieves efficacy comparable to some full-reference (FR) methods (given the reference image), such as SSIM. © 2015 Elsevier Inc. All rights reserved.
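
For reference, the commonly used full-reference SSIM index between two image patches x and y is written below. This is the standard definition, not the revised no-reference variant proposed in the paper. Here the symbols μ denote mean intensities, σ² variances, σ_xy the covariance, and C_1, C_2 small stabilizing constants.

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x \mu_y + C_1)\,(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}
```

The statistics of the reference patch x are exactly what is unavailable in the NR setting, which is why the abstract describes estimating them from the compressed image instead.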
