Tian R.,Beijing Key Laboratory of Digital Media | Zhang Y.,Beijing Key Laboratory of Digital Media | Zhang Y.,Beihang University | Fan R.,Beijing Key Laboratory of Digital Media | Wang G.,Beijing Key Laboratory of Digital Media
2016 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2016 | Year: 2016

The latest High Efficiency Video Coding (HEVC) standard offers higher performance than existing video coding standards - up to 50% bit-rate reduction at equal perceptual quality, but with a significant increase in encoder complexity. For intra prediction, HEVC defines a set of 35 intra prediction modes to enhance intra coding performance. However, the high complexity makes HEVC difficult to apply in real-time applications. To reduce the complexity of intra prediction while maintaining coding performance, this paper proposes an adaptive fast mode decision algorithm for HEVC intra coding, which efficiently reduces the number of candidate modes for rate-distortion optimization (RDO) and thus the intra coding time. First, the relation observed between the costs of two candidate modes is exploited to improve the efficiency of prediction. Second, both the texture consistency of the neighborhood and the texture characteristics of the current prediction unit (PU) are considered. Experimental results demonstrate that the proposed algorithm saves 30.12% of intra encoding time on average without incurring noticeable performance degradation, and outperforms state-of-the-art fast intra mode decision algorithms by achieving better RD performance with comparable encoding time savings. © 2016 IEEE.

Bai X.,Beihang University | Bai X.,Beijing Key Laboratory of Digital Media | Zhou F.,Beihang University | Zhou F.,Beijing Key Laboratory of Digital Media | And 2 more authors.
Applied Optics | Year: 2012

Increasing the contrast of an image is one effective way of enhancing it. To enhance an image well while suppressing the noise produced in the result, an algorithm based on the multiscale top-hat selection transform is proposed; it extracts bright and dark image regions and increases the contrast between them. First, the multiscale top-hat selection transform is discussed and then used to extract the bright and dark image regions at each scale. Second, the final bright and dark image regions are obtained through a maximum operation over the regions extracted at all scales. Finally, using a weighting strategy, the image is enhanced by adding the final bright regions to, and subtracting the final dark regions from, the original image. The weight parameters adjust the strength of the enhancement. Because the multiscale top-hat selection transform effectively extracts the final image regions and discriminates possible noise regions, the image is well enhanced and some noise is suppressed. Experimental results on different types of images show that the algorithm performs well for noise-suppressed image enhancement and is useful for different applications. © 2012 Optical Society of America.
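
The pipeline in this abstract (extract bright and dark regions per scale, take the maximum over scales, then add and subtract with weights) can be illustrated with a minimal sketch. It uses the classical white and black top-hat transforms on a 1-D signal, not the paper's top-hat selection transform, whose definition is not given here. The function names, the scales, and the weights `w_bright` and `w_dark` are all assumptions made for illustration.

```python
def erode(signal, size):
    """Grey-scale erosion: minimum over a flat window of half-width `size`."""
    n = len(signal)
    return [min(signal[max(0, i - size):min(n, i + size + 1)]) for i in range(n)]


def dilate(signal, size):
    """Grey-scale dilation: maximum over a flat window of half-width `size`."""
    n = len(signal)
    return [max(signal[max(0, i - size):min(n, i + size + 1)]) for i in range(n)]


def opening(signal, size):
    return dilate(erode(signal, size), size)


def closing(signal, size):
    return erode(dilate(signal, size), size)


def enhance(signal, scales=(1, 2), w_bright=1.0, w_dark=1.0):
    """Classical multiscale top-hat contrast enhancement (illustrative only)."""
    n = len(signal)
    bright = [0] * n  # maximum white top-hat response over all scales
    dark = [0] * n    # maximum black top-hat response over all scales
    for s in scales:
        opened = opening(signal, s)
        closed = closing(signal, s)
        for i in range(n):
            bright[i] = max(bright[i], signal[i] - opened[i])
            dark[i] = max(dark[i], closed[i] - signal[i])
    # Add bright regions to, and subtract dark regions from, the original.
    return [signal[i] + w_bright * bright[i] - w_dark * dark[i] for i in range(n)]


# Hypothetical signal: a bright bump at index 3 and a dark dip at index 6.
signal = [10, 10, 10, 14, 10, 10, 6, 10, 10, 10]
enhanced = enhance(signal)
```

On this toy signal the flat background is unchanged, while the bump and the dip are both pushed further from it. Larger scales pick up wider structures, so the choice of scales controls what counts as a region rather than background.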

Zhang H.,Beihang University | Zhang H.,Beijing Key Laboratory of Digital Media | Jiang Z.,Beihang University | Jiang Z.,Beijing Key Laboratory of Digital Media
Electronics Letters | Year: 2016

Pose estimation is a critical problem in visual object recognition. An alternative model, the agreement function (AF), is proposed to solve this problem. It is essentially a generative model, since it is learned to represent the joint probability distribution of the inputs and their poses. Estimated poses of unseen samples can be obtained by maximising the AF conditional on the given samples. Extensive experiments on several challenging datasets validate the authors' model, which achieves state-of-the-art results. © 2016 The Institution of Engineering and Technology.

Luo J.,Beihang University | Jiang Z.,Beijing Key Laboratory of Digital Media
Proceedings - International Conference on Pattern Recognition | Year: 2014

This paper addresses the problem of learning semantic compact binary codes for efficient retrieval in large-scale image collections. Our contributions are three-fold. Firstly, we introduce semantic codes, in which each bit corresponds to an attribute that describes a property of an object (e.g. dogs are furry). Secondly, we propose to use matrix factorization (MF) to learn the semantic codes by encoding attributes. Unlike traditional PCA-based encoding methods, which quantize data into orthogonal bases, MF places no constraints on the bases, which is consistent with the fact that attributes are correlated. Finally, to augment the semantic codes, MF is extended to encode extra non-semantic codes that preserve similarity in the original data space. Evaluations on the a-Pascal dataset show that our method is comparable to the state of the art when using Euclidean distance as ground truth, and even outperforms the state of the art when using class labels as ground truth. Furthermore, in experiments, our method can retrieve images that share the same semantic properties as the query image, which can be used for other vision tasks, e.g. re-training classifiers. © 2014 IEEE.
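
To make the retrieval side of this abstract concrete, here is a minimal sketch of ranking items by Hamming distance between binary codes whose bits stand for attributes. It does not cover how the codes are learned (the paper uses matrix factorization). The attribute meanings, the item names, and the 4-bit codes are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")


def retrieve(query_code, database, top_k=2):
    """Return the names of the top_k items closest to the query in Hamming distance."""
    ranked = sorted(
        database.items(),
        key=lambda item: (hamming(query_code, item[1]), item[0]),  # ties broken by name
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical attribute bits, most significant first:
# furry, has tail, has wheels, metallic.
database = {
    "dog": 0b1100,
    "horse": 0b0100,
    "cart": 0b0010,
    "car": 0b0011,
}
query = 0b1100  # "furry, has tail"
results = retrieve(query, database)
```

Because each bit carries a meaning, the nearest codes are items that share attributes with the query, which is the property the abstract relies on for reuse in other vision tasks.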

Wu J.,Beihang University | Jiang Z.,Beihang University | Yang J.,Beijing Key Laboratory of Digital Media | Luo J.,Beijing Key Laboratory of Digital Media
Proceedings - 2013 7th International Conference on Image and Graphics, ICIG 2013 | Year: 2013

The identification of shadow and shading boundaries is a key step towards reducing the imaging effects that are caused by direct illumination from the light source in the scene. Discriminating shadow boundaries in images of natural scenes has been widely applied in computer vision, in areas such as object recognition, intelligent monitoring and image understanding. In this paper, we propose a method to identify shadow boundaries based on multiple kernel learning. We first extract all possible candidate boundaries and then analyze their properties. Unlike previously proposed methods, which simply combine features into a vector, we choose the optimal kernel function for every feature and learn the weights of the different features from a training database. Finally, we link shadow boundary fragments together to obtain longer and more complete shadow boundaries. The experimental results show that the proposed method works well for shadow boundary identification. © 2013 IEEE.

Zhang H.,Beihang University | Zhang H.,Beijing Key Laboratory of Digital Media | Jiang Z.,Beihang University | Jiang Z.,Beijing Key Laboratory of Digital Media | Cheng Y.,93707 PLA Troops
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives | Year: 2016

Obtaining an accurate difference map remains an open challenge in change detection. To tackle this problem, we propose a change detection method based on saliency detection and wavelet transformation. We apply frequency-tuned saliency detection to the initial difference image (IDI), obtained by logarithm ratio, to get a salient difference image (SDI). Then, we calculate the local entropy of the SDI to obtain an entropic salient difference image (ESDI). The final difference image (FDI) is the wavelet fusion of the IDI and the ESDI, and Otsu thresholding is used to extract the difference map from the FDI. Experimental results validate the effectiveness and feasibility of the method.
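
The first and last stages named in this abstract, the logarithm-ratio difference image and Otsu thresholding, can be sketched as below. This is a simplified illustration on flat lists of pixel intensities: the saliency, local entropy, and wavelet fusion stages are omitted, so the threshold is applied directly to the log-ratio values. The `+ 1.0` offset, the bin count, and the sample data are assumptions.

```python
import math


def log_ratio(before, after):
    """Absolute log-ratio difference; the +1 offset (an assumption) avoids log(0)."""
    return [abs(math.log((a + 1.0) / (b + 1.0))) for b, a in zip(before, after)]


def otsu_threshold(values, bins=64):
    """Otsu's method: pick the threshold maximising between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]          # pixels in the low class (bins 0..i)
        sum0 += i * hist[i]
        w1 = total - w0        # pixels in the high class
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_bin = between, i
    return lo + (best_bin + 1) * width


# Hypothetical co-registered pixel intensities at two acquisition times.
before = [50, 52, 49, 51, 50, 48, 200, 52]
after = [51, 50, 50, 52, 120, 125, 60, 51]
diff = log_ratio(before, after)
threshold = otsu_threshold(diff)
change_map = [d > threshold for d in diff]
```

The log ratio responds to relative rather than absolute change, so both the brightening at indices 4 and 5 and the darkening at index 6 are marked as changed, while small fluctuations are not.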

Zhang H.,Beihang University | Zhang H.,Beijing Key Laboratory of Digital Media | El-Gaaly T.,Rutgers University | Elgammal A.,Rutgers University | And 2 more authors.
Computer Vision and Image Understanding | Year: 2015

Due to large variations in shape, appearance, and viewing conditions, object recognition is a key precursory challenge in the fields of object manipulation and robotic/AI visual reasoning in general. Recognizing object categories, particular instances of objects and viewpoints/poses of objects are three critical subproblems robots must solve in order to accurately grasp/manipulate objects and reason about their environments. Multi-view images of the same object lie on intrinsic low-dimensional manifolds in descriptor spaces (e.g. visual/depth descriptor spaces). These object manifolds share the same topology despite being geometrically different. Each object manifold can be represented as a deformed version of a unified manifold. The object manifolds can thus be parameterized by their homeomorphic mapping/reconstruction from the unified manifold. In this work, we develop a novel framework to jointly solve the three challenging recognition sub-problems, by explicitly modeling the deformations of object manifolds and factorizing them in a view-invariant space for recognition. We perform extensive experiments on several challenging datasets and achieve state-of-the-art results. © 2015 Elsevier Inc. All rights reserved.

Feng H.,Beihang University | Feng H.,Beijing Key Laboratory of Digital Media | Jiang Z.,Beihang University | Jiang Z.,Beijing Key Laboratory of Digital Media | And 8 more authors.
IEEE Transactions on Instrumentation and Measurement | Year: 2014

The detection of fastener defects is an important task in railway inspection systems, and it is frequently performed to ensure the safety of train traffic. Traditional inspection is usually carried out by trained workers who walk along railway lines to search for potential risks. However, manual inspection is very slow, costly, and dangerous. This paper proposes an automatic visual inspection system for detecting partially worn and completely missing fasteners using a probabilistic topic model. Specifically, our method is able to simultaneously model diverse types of fasteners with different orientations and illumination conditions using unlabeled data. To assess the damage, the test fasteners are compared with the trained models and automatically ranked into three levels based on the likelihood probability. The experimental results demonstrate the effectiveness of this method. © 1963-2012 IEEE.

Zhang H.,Beihang University | Zhang H.,Beijing Key Laboratory of Digital Media | Jiang Z.,Beihang University | Jiang Z.,Beijing Key Laboratory of Digital Media
Chinese Journal of Aeronautics | Year: 2014

The application of high-performance imaging sensors in space-based space surveillance systems makes it possible to recognize space objects and estimate their poses using vision-based methods. In this paper, we propose a kernel regression-based method for joint multi-view space object recognition and pose estimation. We built a new simulated satellite image dataset named BUAA-SID 1.5 to test our method using different image representations. We evaluated our method on recognition-only tasks, pose estimation-only tasks, and joint recognition and pose estimation tasks. Experimental results show that our method outperforms the state of the art in space object recognition, and can recognize space objects and estimate their poses effectively and robustly against noise and varying lighting conditions. © 2014 Production and hosting by Elsevier Ltd. on behalf of CSAA & BUAA.

Zhang H.,Beihang University | Zhang H.,Beijing Key Laboratory of Digital Media | Jiang Z.,Beihang University | Jiang Z.,Beijing Key Laboratory of Digital Media | Elgammal A.,Rutgers University
Acta Astronautica | Year: 2013

Imaging sensors have recently become widely used in aerospace. In this paper, a vision-based approach for estimating the pose of cooperative space objects is proposed. We learn a generative model for each space object based on homeomorphic manifold analysis. A conceptual manifold is used to represent the pose variation of captured images of the object in visual space, and nonlinear functions mapping between the conceptual manifold representation and the visual inputs are learned. Given such a learned model, we estimate the pose of a new image by minimizing a reconstruction error via a traversal procedure along the conceptual manifold. Experimental results on the simulated image dataset show that our approach is effective for 1D and 2D pose estimation. © 2013 IAA.
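
The estimation step described here, traversing the conceptual manifold and keeping the pose with the smallest reconstruction error, can be sketched for the 1D case as follows. The unit circle stands in for the conceptual manifold, and `render` is a hypothetical hand-written placeholder for the learned nonlinear mapping from the manifold to visual features; neither the features nor the step count come from the paper.

```python
import math


def render(theta):
    """Hypothetical stand-in for the learned mapping from a point on the
    conceptual manifold (unit circle, angle theta) to a feature vector."""
    x, y = math.cos(theta), math.sin(theta)
    return [x, y, x * y, x * x - y * y]


def estimate_pose(features, steps=360):
    """Traverse the circle on a uniform grid and keep the angle whose
    reconstruction is closest (squared error) to the observed features."""
    best_theta, best_err = 0.0, float("inf")
    for k in range(steps):
        theta = 2.0 * math.pi * k / steps
        recon = render(theta)
        err = sum((f - r) ** 2 for f, r in zip(features, recon))
        if err < best_err:
            best_theta, best_err = theta, err
    return best_theta, best_err


# Simulated observation: features of a 137 degree pose with a small offset as noise.
true_theta = math.radians(137.0)
observed = [v + 0.01 for v in render(true_theta)]
theta_hat, err = estimate_pose(observed)
```

An exhaustive traversal is affordable because the manifold is low-dimensional. For 2D pose the same idea would apply on a grid over two angles, at a cost that grows with the square of the step count.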
