State Key Laboratory of Digital Publishing Technology

Beijing, China


Shi C.,Beijing Institute of Technology | Xiao J.,Beijing Institute of Technology | Jia W.,Beijing Institute of Technology | Xu C.,Beijing Institute of Technology | And 2 more authors.
Communications in Computer and Information Science | Year: 2012

Prior knowledge of Chinese calligraphy is modeled in this paper, and the hierarchical relationship of strokes and radicals is represented by a novel five-layer framework. A calligraphist's unique calligraphy skill is analyzed, and his particular strokes, radicals and layout patterns provide the raw elements for the proposed five layers. Criteria of visual aesthetics based on Marr's vision assumption are built for the proposed algorithm for automatic generation of Chinese characters. Bayesian statistics is introduced to characterize the character generation process as a Bayesian dynamic model, in which the parameters to translate, rotate and scale strokes and radicals are controlled by the state equation, and the proposed visual aesthetics is employed by the measurement equation. Experimental results show that the automatically generated characters have almost the same visual acceptance as the calligraphist's artwork. © 2012 Springer-Verlag.

Gao L.,Peking University | Qi X.,Peking University | Tang Z.,State Key Laboratory of Digital Publishing Technology | Lin X., | Liu Y.,Korea Advanced Institute of Science and Technology
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries | Year: 2012

Considering the tremendous value of citation metadata, many methods have been proposed to automate Citation Metadata Extraction (CME). The existing methods primarily rely on content analysis of the citation text. However, the results from such content-based methods are often unreliable. Moreover, the extracted citation metadata is only a small part of the relevant metadata that is spread across the Internet. As opposed to the content-based CME methods, this paper proposes a Web-based CME approach and a citation enriching system, called BibAll, which is capable of correcting the parsing results of content-based CME methods and augmenting citation metadata by leveraging relevant bibliographic data from digital repositories and cited-by publications on the Web. BibAll consists of four main components: citation parsing, Web-based bibliographic data retrieval, irrelevant bibliographic data filtering, and relevant bibliographic data integration. The system has been tested on the publicly available FLUX-CIM dataset. Experimental results show that BibAll significantly improves the citation parsing accuracy and augments the metadata of the original citation. © 2012 ACM.

Zhang X.,State Key Laboratory of Digital Publishing Technology | Zhang X.,Huazhong University of Science and Technology | Zhang Z.,Huazhong University of Science and Technology | Zhang C.,Huazhong University of Science and Technology | Bai X.,Huazhong University of Science and Technology
Proceedings - International Conference on Pattern Recognition | Year: 2017

Scene text detection and recognition have become active research topics in computer vision. In this paper, we focus on the detection of text proposals from wild images. Text proposal methods attempt to generate a relatively small set of bounding box proposals that are most likely to contain text. Unlike previous methods that merge similar regions based on properties of individual regions, we assume that text words bear a strong symmetry property. We propose a new algorithm that exploits the symmetry property to directly generate word-level proposals. The proposal generation process uses the region features, and the ranking process makes use of the symmetry structures in text groups. Experiments on two standard datasets demonstrate that the proposed algorithm achieves state-of-the-art performance, especially when the number of proposals is small. © 2016 IEEE.

Jin X.-B.,State Key Laboratory of Digital Publishing Technology | Jin X.-B.,Beijing Technology and Business University | Dou C.,Beijing Technology and Business University
Proceedings of the World Congress on Intelligent Control and Automation (WCICA) | Year: 2016

For practical big time series data, eliminating the noise and obtaining the dynamic information of the data is an important step. An extraction model and a transform model for the series data are given in this paper, and the sampling interval of the models is discussed to guarantee convergence of the estimated system. The dynamic information is analyzed according to the estimated dynamic characteristic. The experimental results show that extracting the dynamic information of time series data with a Kalman filter is feasible, and that this information can be used to analyze the trend characteristic effectively. © 2016 IEEE.
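
As a rough illustration of the kind of filtering the abstract above refers to, the following is a minimal sketch of a scalar Kalman filter applied to a noisy series. It is not the extraction or transform model from the paper: the random-walk state model, the function name `kalman_filter_1d`, and the default variances are all assumptions chosen for illustration.

```python
def kalman_filter_1d(measurements, process_var=1e-3, measurement_var=1.0):
    """Scalar Kalman filter with an assumed random-walk state model.

    Returns one filtered estimate per measurement. The model and the
    default variances are illustrative, not taken from the paper.
    """
    if not measurements:
        return []
    estimate = measurements[0]  # start the state at the first observation
    error_var = 1.0             # initial estimate uncertainty (assumed)
    filtered = []
    for z in measurements:
        # Predict: the state is assumed unchanged, so only uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered


# Example: a constant level of 5.0 with alternating +/-1 disturbances.
noisy = [5.0 + (1.0 if i % 2 == 0 else -1.0) for i in range(200)]
smoothed = kalman_filter_1d(noisy)
```

A smaller `process_var` relative to `measurement_var` makes the filter trust its own prediction more, which smooths the output at the cost of slower reaction to real trend changes.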

Lin X.,Beijing Institute of Technology | Gao L.,Beijing Institute of Technology | Tang Z.,Beijing Institute of Technology | Tang Z.,State Key Laboratory of Digital Publishing Technology | And 2 more authors.
Proceedings - 10th IAPR International Workshop on Document Analysis Systems, DAS 2012 | Year: 2012

This paper presents a performance evaluation system for mathematical formula identification. First, a ground-truth dataset is constructed to facilitate the performance comparison of different mathematical formula identification algorithms. Statistical analysis shows that the diversity of the dataset reflects real-world documents. Second, a performance evaluation metric for mathematical formula identification is proposed, including error type definitions and scenario-adjustable scoring. The proposed metric enables in-depth analysis of mathematical formula identification systems in different scenarios. Finally, based on the proposed evaluation metric, a tool is developed to automatically evaluate mathematical formula identification results. It is worth noting that the ground-truth dataset and the evaluation tool are freely available for academic purposes. © 2012 IEEE.

Chen L.,Wuhan University of Science and Technology | Tian J.,Wuhan University of Science and Technology | Tian J.,State Key Laboratory of Digital Publishing Technology | Liu Z.,Zaozhuang University
International Journal of Electronics | Year: 2014

Many scalable video compression techniques utilise a mixed-resolution scheme, which down-samples some frames at the encoder to produce reduced-resolution frames while keeping the other frames at full resolution, in order to achieve higher compression gain. In this mixed-resolution set-up, an image enlargement technique is required at the decoder to recover the original full-resolution frames. This article proposes a Bayesian approach that enlarges the reduced-resolution frame via its maximum a posteriori estimation, using the information from the observed reduced-resolution frame plus more detailed information extracted from the available neighbouring full-resolution frames. Experiments are conducted to show that the proposed approach outperforms a few conventional approaches. © 2013 Taylor & Francis.

Fang J.,Beijing Institute of Technology | Gao L.,Beijing Institute of Technology | Bai K.,IBM | Qiu R.,State Key Laboratory of Digital Publishing Technology | And 2 more authors.
Proceedings of the International Conference on Document Analysis and Recognition, ICDAR | Year: 2011

Table detection is an important task in document analysis and recognition. In this paper, we propose a novel and effective table detection method for PDF documents that uses visual separators and geometric content layout information. The visual separators include not only graphic ruling lines but also white space, so that tables with or without ruling lines can be handled. Furthermore, we detect page columns in order to assist table region delimitation in pages with complex layouts. Evaluations of our algorithm on an e-Book dataset and a scientific document dataset show competitive performance. It is noteworthy that the proposed method has been successfully incorporated into a commercial software package for large-scale Chinese e-Book production. © 2011 IEEE.

Li L.,Beijing Institute of Technology | Wang Y.,Beijing Institute of Technology | Tang Z.,Beijing Institute of Technology | Tang Z.,State Key Laboratory of Digital Publishing Technology | Gao L.,Beijing Institute of Technology
Multimedia Tools and Applications | Year: 2014

Comic page segmentation aims to automatically decompose scanned comic images into storyboards (frames), and is the key technique for producing digital comic documents that are suitable for reading on mobile devices. In this paper, we propose a novel method for comic page segmentation that finds the quadrilateral enclosing box of each storyboard. We first acquire the edge image of the input comic image, and then extract line segments with a heuristic line segment detection algorithm. We perform line clustering to merge the overlapping line segments and remove the redundant ones. Finally, we perform another round of line clustering and post-processing to compose the obtained line segments into complete quadrilateral enclosing boxes of the storyboards. The proposed method is tested on 2,237 comic images from 12 different printed comic series, and the experimental results demonstrate that our method is effective for comic image segmentation and outperforms the existing methods. © 2012 Springer Science+Business Media, LLC.

Zhu X.-S.,Yangzhou University | Zhu X.-S.,State Key Laboratory of Digital Publishing Technology | Ding J.,Yangzhou University
Jisuanji Xuebao/Chinese Journal of Computers | Year: 2012

This paper presents a novel quantization-based watermarking method. The method embeds the watermark information by modulating a feature signal generated from the host signal. We suggest using the normalized correlation between the host signal and a random signal as the feature signal. Information modulation is carried out on the generated feature signal by selecting a code word from the codebook associated with the embedded information. The structured codebooks are designed using uniform quantizers for M-ary modulation. The watermarked signal is produced so that it yields the modulated feature while minimizing the embedding distortion. Meanwhile, we derive expressions for the embedding distortion and for the minimal channel distortion needed to remove the hidden message. Based on these expressions, the optimal code word can be found in the codebook to improve the watermarking performance. The proposed scheme is theoretically invariant to valumetric scaling and can resist stronger noise than the well-known spread transform dither modulation. Numerical simulations on real images show that it achieves good imperceptibility and strong robustness against a wide range of attacks, and significantly outperforms other state-of-the-art watermarking methods.
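
To make the idea of quantization-based embedding concrete, here is a minimal sketch of plain binary quantization index modulation (QIM) on a single scalar. This is the textbook technique, not the feature-modulation scheme of the paper: it does not compute a normalized correlation, does not use M-ary codebooks, and is not invariant to valumetric scaling. The function names `qim_embed` and `qim_detect` and the step size are assumptions for illustration.

```python
def qim_embed(value, bit, step=1.0):
    """Quantize `value` onto the lattice associated with `bit` (0 or 1).

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice shifted
    by half a step. This is generic binary QIM, assumed for illustration.
    """
    offset = bit * step / 2.0
    return round((value - offset) / step) * step + offset


def qim_detect(value, step=1.0):
    """Return the bit whose lattice has the point nearest to `value`."""
    dist0 = abs(value - qim_embed(value, 0, step))
    dist1 = abs(value - qim_embed(value, 1, step))
    return 0 if dist0 <= dist1 else 1


# Example: embed a bit, perturb the result slightly, then detect it.
marked = qim_embed(3.14, 1)
recovered = qim_detect(marked + 0.1)
```

In this sketch the detector recovers the bit as long as the perturbation stays below a quarter of the step, which is the usual trade-off in QIM: a larger step tolerates more noise but introduces more embedding distortion.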

Zhang B.,State Key Laboratory of Digital Publishing Technology | Yu J.Z.,University of Shanghai for Science and Technology
Applied Mechanics and Materials | Year: 2013

The detection of elements in the top part of a research paper is very important, because these elements are often used as search items by users. This paper provides a mixed method for automatic detection of the top part of a research paper. Keyword, layout and content-similarity features of the paper are combined to accurately find the area of the top part and recognize the elements in it. Experiments show the advantage of our method over existing methods, and future work is also described in the paper. © (2013) Trans Tech Publications, Switzerland.
