HP Labs India

Bangalore, India


Peng X.,Entrance | Setlur S.,Entrance | Govindaraju V.,Entrance | Sitaram R.,HP Labs India
International Journal on Document Analysis and Recognition | Year: 2013

The convenience of search, both on the personal computer hard disk and on the web, is still limited mainly to machine printed text documents and images because of the poor accuracy of handwriting recognizers. The focus of research in this paper is the segmentation of handwritten text and machine printed text from annotated documents, sometimes referred to as the task of "ink separation", to advance the state of the art in realizing search of hand-annotated documents. We propose a method which contains two main steps: patch level separation and pixel level separation. In the patch level separation step, the entire document is modeled as a Markov Random Field (MRF). Three different classes (machine printed text, handwritten text and overlapped text) are initially identified using G-means based classification followed by an MRF based relabeling procedure. An MRF based classification approach is then used to separate overlapped text into machine printed text and handwritten text using pixel level features, forming the second step of the method. Experimental results on a set of machine-printed documents which have been annotated by multiple writers in an office/collaborative environment show that our method is robust and provides good text separation performance. © 2011 Springer-Verlag.
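
The MRF based relabeling idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a 4-connected grid of patches, a Potts smoothness term, and iterated conditional modes (ICM) as the optimizer, none of which the abstract specifies. The names `icm_relabel`, `unary`, and `beta` are hypothetical.

```python
def icm_relabel(labels, unary, beta=1.0, sweeps=5):
    """Relabel a grid of patches with iterated conditional modes (ICM).

    labels: 2-D list of initial integer labels, one per patch.
    unary:  unary[r][c][k] is the cost of giving label k to patch (r, c).
    beta:   Potts penalty paid for each 4-neighbour with a different label.
    All three are illustrative assumptions, not taken from the paper.
    """
    rows, cols = len(labels), len(labels[0])
    num_labels = len(unary[0][0])
    current = [row[:] for row in labels]  # work on a copy
    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                # Labels of the in-bounds 4-neighbours of patch (r, c).
                neighbours = [
                    current[nr][nc]
                    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= nr < rows and 0 <= nc < cols
                ]

                def energy(k):
                    # Local energy: data cost plus disagreement with neighbours.
                    return unary[r][c][k] + beta * sum(n != k for n in neighbours)

                best = min(range(num_labels), key=energy)
                if best != current[r][c]:
                    current[r][c] = best
                    changed = True
        if not changed:  # stop once a full sweep changes nothing
            break
    return current
```

With labels 0, 1 and 2 standing in for machine printed, handwritten and overlapped text, an isolated patch whose label disagrees with all of its neighbours is flipped unless its own data cost strongly supports the label.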

Audhya G.K.,BSNL Kolkata | Sinha K.,HP Labs India | Mandal K.,University of Waterloo | Dattagupta R.,Jadavpur University | And 2 more authors.
IEEE Transactions on Mobile Computing | Year: 2013

This paper presents a novel method for solving channel assignment problems (CAPs) in hexagonal cellular networks with nonhomogeneous demands in a 2-band buffering system (where channel interference does not extend beyond two cells). The CAP with nonhomogeneous demand is first partitioned into a sequence of smaller subproblems, each of which has a homogeneous demand from a subset of the nodes of the original network. The solution to such a subproblem constitutes an assignment phase, where multiple homogeneous demands are assigned to the nodes corresponding to the subproblem, satisfying all the frequency separation constraints. The whole assignment process for the original network consists of a succession of multiple homogeneous assignments for all the subproblems. Based on this concept, we present a polynomial time approximation algorithm for solving the CAP for cellular networks having nonhomogeneous demands. Our proposed assignment algorithm, when executed on well-known benchmark instances, produces an assignment whose bandwidth is always within about 6 percent of the optimal bandwidth, but requires a very small execution time (less than 5 milliseconds on an HP xw8400 workstation). The proposed algorithm is well suited to real-life situations, where fast channel assignment is of primary importance and a marginal deviation (6 percent) from the optimal bandwidth can be tolerated. © 2013 IEEE.
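
The frequency separation constraints of a 2-band buffering system can be made concrete with a small checker. This is only a sketch of the constraint test, not the assignment algorithm from the paper. It assumes cells are addressed by axial hexagonal coordinates `(q, r)`, and the minimum separations for distance 0, 1 and 2 are supplied by the caller. The names `hex_distance`, `is_valid_assignment`, and `separation` are hypothetical.

```python
from itertools import combinations


def hex_distance(a, b):
    """Cell-to-cell distance on a hexagonal grid in axial coordinates (q, r)."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2


def is_valid_assignment(assignment, separation):
    """Check a channel assignment against 2-band frequency separation rules.

    assignment: dict mapping a cell (q, r) to the list of channels it uses.
    separation: (s0, s1, s2), the minimum channel gap required between two
                calls in the same cell, in adjacent cells, and in cells at
                distance two. Values are caller-supplied assumptions.
    """
    # Channels inside one cell must be at least s0 apart.
    for channels in assignment.values():
        for f, g in combinations(channels, 2):
            if abs(f - g) < separation[0]:
                return False
    # Channels in cells at distance 1 or 2 must respect s1 or s2.
    for (cell_a, chans_a), (cell_b, chans_b) in combinations(assignment.items(), 2):
        d = hex_distance(cell_a, cell_b)
        if d > 2:
            continue  # interference does not extend beyond two cells
        for f in chans_a:
            for g in chans_b:
                if abs(f - g) < separation[d]:
                    return False
    return True
```

For example, with `separation=(3, 1, 1)`, two cells at distance two may not share a channel, while cells three or more apart may reuse any channel freely.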

Sankarasubramaniam Y.,HP Labs India | Ramanathan K.,HP Labs India | Ghosh S.,SAS Institute
Information Processing and Management | Year: 2014

Automatic text summarization has been an active field of research for many years. Several approaches have been proposed, ranging from simple position and word-frequency methods to learning and graph based algorithms. The advent of human-generated knowledge bases like Wikipedia offers a further possibility in text summarization: they can be used to understand the input text in terms of salient concepts from the knowledge base. In this paper, we study a novel approach that leverages Wikipedia in conjunction with graph-based ranking. Our approach is to first construct a bipartite sentence-concept graph, and then rank the input sentences using iterative updates on this graph. We consider several models for the bipartite graph, and derive convergence properties under each model. Then, we take up personalized and query-focused summarization, where the sentence ranks additionally depend on user interests and queries, respectively. Finally, we present a Wikipedia-based multi-document summarization algorithm. An important feature of the proposed algorithms is that they enable real-time incremental summarization: users can first view an initial summary, and then request additional content if interested. We evaluate the performance of our proposed summarizer using the ROUGE metric, and the results show that leveraging Wikipedia can significantly improve summary quality. We also present results from a user study, which suggest that using incremental summarization can help in better understanding news articles. © 2014 Elsevier Ltd. All rights reserved.
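
Ranking sentences by iterative updates on a bipartite sentence-concept graph can be sketched as follows. The abstract does not give the update rules, so this sketch assumes a HITS-style mutual reinforcement scheme: concept scores are weighted sums of sentence scores and vice versa, with normalization after each round. The function name `rank_sentences` and the `weights` matrix layout are hypothetical.

```python
def rank_sentences(weights, iterations=50):
    """Rank sentences on a bipartite sentence-concept graph.

    weights[i][j] is the (assumed non-negative) strength of the edge between
    sentence i and concept j. Returns (order, scores), where order lists
    sentence indices from highest to lowest score.
    """
    num_sentences = len(weights)
    num_concepts = len(weights[0])
    sentence_scores = [1.0 / num_sentences] * num_sentences
    for _ in range(iterations):
        # A concept is salient if salient sentences mention it.
        concept_scores = [
            sum(weights[i][j] * sentence_scores[i] for i in range(num_sentences))
            for j in range(num_concepts)
        ]
        # A sentence is salient if it mentions salient concepts.
        sentence_scores = [
            sum(weights[i][j] * concept_scores[j] for j in range(num_concepts))
            for i in range(num_sentences)
        ]
        total = sum(sentence_scores)
        if total == 0:
            break  # graph has no edges; nothing to rank
        sentence_scores = [s / total for s in sentence_scores]
    order = sorted(range(num_sentences), key=lambda i: sentence_scores[i], reverse=True)
    return order, sentence_scores
```

An incremental summary in this setting would simply take the first few entries of `order` and extend the selection when the reader asks for more.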

Peng X.,State University of New York at Buffalo | Setlur S.,State University of New York at Buffalo | Govindaraju V.,State University of New York at Buffalo | Sitaram R.,HP Labs India
Proceedings of SPIE - The International Society for Optical Engineering | Year: 2011

Document binarization is one of the initial and critical steps for many document analysis systems. With the success and popularity of hand-held devices, considerable effort is now devoted to converting documents into digital format using hand-held cameras. In this paper, we propose a Bayesian maximum a posteriori (MAP) estimation algorithm to binarize camera-captured document images. A novel adaptive segmentation surface estimation and normalization method is proposed as the preprocessing step in our work, followed by a Markov Random Field based refinement procedure to remove noise and smooth the binarized result. Experimental results show that our method performs better than other algorithms on document images with bad or uneven illumination. © 2011 SPIE-IS&T.

Peng X.,State University of New York at Buffalo | Setlur S.,State University of New York at Buffalo | Govindaraju V.,State University of New York at Buffalo | Ramachandrula S.,HP Labs India
Pattern Recognition Letters | Year: 2012

A boosted tree classifier is proposed to segment machine printed, handwritten and overlapping text from documents with handwritten annotations. Each node of the tree-structured classifier is a binary weak learner. Unlike a standard decision tree (DT) which only considers a subset of training data at each node and is susceptible to over-fitting, we boost the tree using all available training data at each node with different weights. The proposed method is evaluated on a set of machine-printed documents which have been annotated by multiple writers in an office/collaborative environment. The experimental results show that the proposed algorithm outperforms other methods on an imbalanced data set. © 2011 Elsevier B.V. All rights reserved.

Madhavan M.,Indian Institute of Technology Madras | Thangaraj A.,Indian Institute of Technology Madras | Sankarasubramaniam Y.,HP Labs India | Viswanathan K.,HP Labs India
IEEE International Symposium on Information Theory - Proceedings | Year: 2010

The Hopper-Blum (HB) protocol, which uses noisy linear parities of a shared key for authentication, has been proposed for light-weight applications such as RFID. Recently, algorithms for decoding linear codes have been specially designed for use in passive attacks on the HB protocol. These linear coding attacks have resulted in the need for long keys in the HB protocol, making the protocol too complex for RFID in some cases. In this work, we propose the NLHB protocol, which is a non-linear variant of the HB protocol. The non-linearity is such that passive attacks on the NLHB protocol continue to be provably hard by reduction. However, the linear coding attacks cannot be directly adapted to the proposed NLHB protocol because of the non-linearity. Hence, smaller key sizes appear to be sufficient in the NLHB protocol for the same level of security as the HB protocol. We construct specific instances of the NLHB protocol and show that they can be significantly less complex to implement than the HB protocol, in spite of the non-linearity. Further, we propose an extension, called the NLHB+ protocol, that is provably secure against a class of active attack models. © 2010 IEEE.
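
The linear parity mechanism of the basic HB protocol (not the NLHB variant proposed in the paper) can be sketched in a few lines. In each round the verifier sends a random binary challenge, the prover returns the parity of the challenge with the shared key, flipped with some noise probability, and the verifier accepts if few enough responses disagree. The parameter names `eta`, `rounds`, and `threshold` are illustrative choices, not values from the paper.

```python
import random


def parity(a, x):
    """Inner product of two bit vectors modulo 2."""
    return sum(ai & xi for ai, xi in zip(a, x)) % 2


def prover_response(challenge, key, eta, rng):
    """Noisy parity: the correct bit, flipped with probability eta."""
    noise = 1 if rng.random() < eta else 0
    return parity(challenge, key) ^ noise


def authenticate(prover_key, verifier_key, rounds, eta, threshold, rng):
    """Run the given number of HB rounds; accept if mismatches <= threshold."""
    errors = 0
    for _ in range(rounds):
        # Verifier draws a fresh random challenge as long as the key.
        challenge = [rng.randint(0, 1) for _ in verifier_key]
        response = prover_response(challenge, prover_key, eta, rng)
        if response != parity(challenge, verifier_key):
            errors += 1
    return errors <= threshold
```

A typical call would be `authenticate(key, key, rounds=100, eta=0.125, threshold=25, rng=random.Random())`. The deliberate noise is what turns key recovery by a passive eavesdropper into a problem of decoding a noisy linear code, which is the attack surface the abstract discusses.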

Sankarasubramaniam Y.,HP Labs India | Narayanan B.,HP Labs India | Viswanathan K.,HP Labs India | Kuchibhotla A.,HP Labs India
Proceedings of SPIE - The International Society for Optical Engineering | Year: 2010

This paper presents an algorithm called CIPDEC (Content Integrity of Printed Documents using Error Correction), which identifies any modifications made to a printed document. CIPDEC uses an error correcting code for accurate detection of addition/deletion of even a few pixels. A unique advantage of CIPDEC is that it works blind - it does not require the original document for such detection. Instead, it uses fiducial marks and error correcting code parities. CIPDEC is also robust to paper-world artifacts like photocopying, annotations, stains, folds, tears and staples. Furthermore, by working at a pixel level, CIPDEC is independent of language, font, software, and graphics that are used to create paper documents. As a result, any changes made to a printed document can be detected long after the software, font, and graphics have fallen out of use. The utility of CIPDEC is illustrated in the context of tamper-proofing of printed documents and ink extraction for form-filling applications. © 2009 Copyright SPIE - The International Society for Optical Engineering.

Jain H.P.,Indian Institute of Technology Madras | Subramanian A.,HP Labs India | Das S.,Indian Institute of Technology Madras | Mittal A.,Indian Institute of Technology Madras
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2011

Automatic detection and pose estimation of humans is an important task in Human-Computer Interaction (HCI), user interaction and event analysis. This paper presents a model based approach for detecting and estimating human pose by fusing depth and RGB color data from a monocular view. The proposed system uses Haar cascade based detection and template matching to track the most reliably detectable parts, namely the head and torso. A stick figure model is used to represent the detected body parts. The fitting is then performed independently for each limb, using the weighted distance transform map. Fitting each limb independently speeds up the fitting process and makes it robust, avoiding the combinatorial complexity problems that are common with these types of methods. The output is a stick figure model consistent with the pose of the person in the given input image. The algorithm works in real time, is fully automatic, and can detect multiple non-intersecting people. © 2011 Springer-Verlag Berlin Heidelberg.

Mandalapu D.,HP Labs India | Subramanian S.,University of Bristol
ACM International Conference Proceeding Series | Year: 2011

Pressure is a useful medium for interaction as it can be used in different contexts such as for navigating through depth in 3-D, for time-series visualizations, and in zoomable interfaces. We propose pressure based input as an alternative to repetitive multi-touch interactions, such as expanding/pinching to zoom. While most user interface controls for zooming or scrolling are bidirectional, pressure is primarily a one-way continuous parameter (from zero to positive). Human ability to control pressure from positive to zero is limited, and this limitation needs to be overcome to make the medium accessible to various interactive tasks. We first carry out an experiment to measure the effectiveness of various pressure control functions for controlling pressure in both directions (from zero to positive and positive to zero). Based on this preliminary knowledge, we compare the performance of a pressure based zooming system with a multi-touch expand/pinch gesture based zooming system. Our results show that pressure input is an improvement over multi-touch interactions that involve multiple invocations, such as the one presented in this paper. Copyright © 2011 ACM.

Madhavan M.,Indian Institute of Technology Madras | Thangaraj A.,Indian Institute of Technology Madras | Viswanathan K.,HP Labs India | Sankarasubramaniam Y.,HP Labs India
Proceedings of 16th National Conference on Communications, NCC 2010 | Year: 2010

In this paper, we propose a light-weight provably-secure authentication protocol called the NLHB protocol, which is a variant of the HB protocol [6]. The HB protocol uses the complexity of decoding linear codes for security against passive attacks. In contrast, security for the NLHB protocol is proved by reducing the provably hard problem of decoding a class of nonlinear codes to passive attacks. We demonstrate that the existing passive attacks ([10], [3]) on the HB protocol family, which have contributed to a considerable reduction in its effective key size, do not work against the NLHB protocol. From the evidence, we conclude that smaller key sizes are sufficient for the NLHB protocol to achieve the same level of passive attack security as the HB protocol. Further, for this choice of parameters, we provide an implementation instance of the NLHB protocol for which the Prover/Verifier complexity is lower than that of the HB protocol, enabling authentication on very low-cost devices like RFID tags. © 2010 IEEE.
