Illinois at Singapore Pte Ltd.

Singapore, Singapore

According to various embodiments, there is provided an electric meter including a sensor circuit configured to provide a plurality of instantaneous magnetic field measurements; a processing circuit configured to generate a time-series of magnetic field vectors, each magnetic field vector of the time-series including the plurality of instantaneous magnetic field measurements; and a total current determination circuit configured to determine a total current, wherein the total current is the sum of the currents of each branch of a plurality of branches of a power distribution network; wherein the processing circuit is further configured to compute a de-mixing matrix based on the determined total current and the time-series of magnetic field vectors, and further configured to linearly transform each magnetic field vector using the de-mixing matrix to determine the current of each branch.
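The de-mixing step described in the claim amounts to a linear transform of each magnetic field vector. A minimal sketch, assuming a known mixing matrix purely for illustration (in the patent, the de-mixing matrix is estimated from the measured total current, not from a known mixing):

```python
import numpy as np

# Hypothetical setup: 3 sensors observe fields produced by 3 branch currents.
rng = np.random.default_rng(0)
true_currents = rng.normal(size=(3, 1000))    # one row per branch, columns are time samples
mixing = rng.uniform(0.5, 2.0, size=(3, 3))   # unknown field-coupling (mixing) matrix
fields = mixing @ true_currents               # time-series of magnetic field vectors

# Once a de-mixing matrix W is available (here we simply invert the known
# mixing for illustration), each branch current is recovered by a linear
# transform of the field vectors, as the claim describes.
W = np.linalg.inv(mixing)
recovered = W @ fields
assert np.allclose(recovered, true_currents)

# The calibration constraint: the recovered branch currents must sum to the
# measured total current at every time instant.
total_current = true_currents.sum(axis=0)
assert np.allclose(recovered.sum(axis=0), total_current)
```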

Feng J.,National University of Singapore | Ni B.,Illinois at Singapore Pte Ltd. | Tian Q.,University of Texas at San Antonio | Yan S.,National University of Singapore
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | Year: 2011

Modern visual classification models generally include a feature pooling step, which aggregates local features over a region of interest into a single statistic through a spatial pooling operation. The two most commonly used operations are average and max pooling. However, recent theoretical analysis has indicated that neither of these two pooling techniques is guaranteed to be optimal. Moreover, we show in this work that more severe limitations of these two pooling methods stem from the unrecoverable loss of spatial information during the statistical summarization and from the underlying over-simplified assumption about the feature distribution. We address these inherent issues and generalize previous pooling methods as follows. We define a weighted ℓp-norm spatial pooling function tailored to the class-specific spatial distribution of features. Moreover, a sensible prior for the feature spatial correlation is incorporated. Optimizing this pooling function for optimal class separability yields the so-called geometric ℓp-norm pooling (GLP) method. The GLP method preserves the class-specific spatial/geometric information in the pooled features and significantly boosts the discriminating capability of the resultant features for image classification. Comprehensive evaluations on several image benchmarks demonstrate that the proposed GLP method can boost image classification performance with a single type of feature to outperform or match the state of the art. © 2011 IEEE.
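A weighted ℓp-norm pooling function of the kind described can be sketched as follows; the feature values, weights, and the choice of p are illustrative stand-ins, not the paper's learned, class-specific quantities:

```python
import numpy as np

def weighted_lp_pool(features, weights, p):
    """Weighted ℓp-norm pooling over a spatial region.

    features: (n,) local feature responses in the region
    weights:  (n,) spatial weights (class-specific in the paper; sum to 1 here)
    p:        pooling order; p=1 gives a weighted average, large p approaches max
    """
    features = np.abs(np.asarray(features, dtype=float))
    weights = np.asarray(weights, dtype=float)
    return (weights @ features ** p) ** (1.0 / p)

x = np.array([0.1, 0.9, 0.3, 0.5])
w = np.full(4, 0.25)                  # uniform weights for illustration
avg = weighted_lp_pool(x, w, p=1)     # equals the plain average, 0.45
big = weighted_lp_pool(x, w, p=50)    # approaches max(x) = 0.9
```

Average and max pooling are thus the two extremes of one parametric family; the paper's contribution is choosing the weights and p to preserve class-specific spatial information.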

Xu J.,Northeastern University China | Zhang Z.,Illinois at Singapore Pte. Ltd. | Tung A.K.H.,National University of Singapore | Yu G.,Northeastern University China
VLDB Journal | Year: 2012

Advances in geographical tracking, multimedia processing, information extraction, and sensor networks have created a deluge of probabilistic data. While similarity search is an important tool for manipulating probabilistic data, it raises new challenges for traditional relational databases. The problem stems from the limited effectiveness of the distance metrics employed by existing database systems. On the other hand, several more sophisticated distance operators have proven their value for better distinguishing ability in specific probabilistic domains. In this paper, we discuss the similarity search problem with respect to the Earth Mover's Distance (EMD). EMD is the most successful distance metric for comparing probability distributions, but it is an expensive operator with cubic time complexity. We present a new database indexing approach to answer EMD-based similarity queries, including range queries and k-nearest-neighbor queries, on probabilistic data. Our solution utilizes primal-dual theory from linear programming and employs a group of B+-trees for effective candidate pruning. We also apply our filtering technique to the processing of continuous similarity queries, with particular application to frame copy detection in real-time videos. Extensive experiments show that our proposals dramatically improve the usefulness and scalability of probabilistic data management. © 2011 Springer-Verlag.
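The cost argument can be made concrete: in one dimension EMD has a closed form, while the general case requires solving a linear program (a min-cost flow), which is what makes it cubic-time and motivates index-based pruning. A small sketch on hypothetical histograms:

```python
import numpy as np

def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms of equal total mass.

    In one dimension EMD reduces to the L1 distance between the cumulative
    distributions; no linear program is needed in this special case.
    """
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.abs(np.cumsum(p - q)).sum()

p = np.array([0.5, 0.5, 0.0])
q = np.array([0.0, 0.5, 0.5])
# Moving 0.5 of mass from bin 0 to bin 1, and 0.5 from bin 1 to bin 2,
# costs 0.5 + 0.5 = 1.0.
d = emd_1d(p, q)
```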

Chen B.,Illinois at Singapore Pte. Ltd | Zhou Z.,National University of Singapore | Yu H.,National University of Singapore
Proceedings of the Annual International Conference on Mobile Computing and Networking, MOBICOM | Year: 2013

Counting the number of RFID tags, or RFID counting, is needed by a wide array of important wireless applications. Motivated by its paramount practical importance, researchers have developed an impressive arsenal of techniques to improve the performance of RFID counting (i.e., to reduce the time needed to do the counting). This paper aims to gain deeper, fundamental insights into this subject to facilitate future research on the topic. As our central thesis, we find that the overlooked key design aspect enabling RFID counting protocols to achieve near-optimal performance is a conceptual separation of a protocol into two phases: the first phase uses a small overhead to obtain a rough estimate, and the second phase uses the rough estimate to reach a target accuracy. Our thesis also indicates that the other performance-enhancing techniques and ideas proposed in the literature are only of secondary importance. Guided by our central thesis, we design near-optimal protocols that are more efficient than existing ones and, at the same time, simpler than most of them. © 2013 by the Association for Computing Machinery, Inc.
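The two-phase idea can be illustrated with a toy slotted-frame simulation; the frame-growth rule, threshold, and constants below are illustrative choices, not any specific protocol from the paper:

```python
import math
import random

random.seed(1)

def run_frame(n_tags, frame_size):
    """One slotted frame: each tag transmits in a uniformly random slot;
    the reader observes how many slots stay empty."""
    slots = [0] * frame_size
    for _ in range(n_tags):
        slots[random.randrange(frame_size)] += 1
    return sum(1 for s in slots if s == 0)

def rough_estimate(n_tags_true):
    """Phase 1: grow the frame geometrically (small total overhead) until
    enough empty slots appear, then invert E[empty] = f * (1 - 1/f)^n."""
    f = 4
    while True:
        empty = run_frame(n_tags_true, f)
        if empty > 0.2 * f:
            return f * math.log(f / empty)
        f *= 2

def refined_estimate(n_tags_true, refine_factor=16):
    """Phase 2: size a much larger frame from the rough estimate and
    re-estimate, which drives the error down toward an accuracy target."""
    rough = max(rough_estimate(n_tags_true), 1.0)
    f = int(refine_factor * rough)
    empty = max(1, run_frame(n_tags_true, f))
    return f * math.log(f / empty)

est = refined_estimate(1000)  # typically close to the true count of 1000
```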

Cai R.,Guangdong University of Technology | Cai R.,Nanjing University | Zhang Z.,Illinois at Singapore Pte. Ltd. | Hao Z.,Guangdong University of Technology
Neural Networks | Year: 2013

With the advances in biomedical techniques over the last decade, the costs of human genome sequencing and genomic activity monitoring are coming down rapidly. To support the huge genome-based business expected in the near future, researchers are eager to find killer applications based on human genome information. Causal gene identification is one of the most promising applications: it may help potential patients estimate the risk of certain genetic diseases and locate the target gene for further genetic therapy. Unfortunately, existing pattern recognition techniques, such as Bayesian networks, cannot be directly applied to find accurate causal relationships between genes and diseases, mainly due to the insufficient number of samples and the extremely high dimensionality of the gene space. In this paper, we present the first practical solution to causal gene identification, based on a new combinatorial formulation over the V-structures commonly used in conventional Bayesian networks, which explores combinations of significant V-structures. We prove the NP-hardness of this combinatorial search problem under a general setting of the significance measure on the V-structures, and present a greedy algorithm that finds sub-optimal results. Extensive experiments show that our proposal is both scalable and effective, with particularly interesting findings on causal genes over real human genome data. © 2013 Elsevier Ltd.

Zhang J.,Nanyang Technological University | Zhang Z.,Illinois at Singapore Pte. Ltd. | Xiao X.,Nanyang Technological University | Yang Y.,Illinois at Singapore Pte. Ltd. | And 2 more authors.
Proceedings of the VLDB Endowment | Year: 2012

ε-differential privacy is the state-of-the-art model for releasing sensitive information while protecting privacy. Numerous methods have been proposed to enforce ε-differential privacy in various analytical tasks, e.g., regression analysis. Existing solutions for regression analysis, however, are either limited to non-standard types of regression or unable to produce accurate regression results. Motivated by this, we propose the functional mechanism, a differentially private method designed for a large class of optimization-based analyses. The main idea is to enforce ε-differential privacy by perturbing the objective function of the optimization problem rather than its results. As case studies, we apply the functional mechanism to the two most widely used regression models, namely linear regression and logistic regression. Both theoretical analysis and thorough experimental evaluations show that the functional mechanism is highly effective and efficient, and that it significantly outperforms existing solutions. © 2012 VLDB Endowment.
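The core idea, perturbing the objective's polynomial coefficients rather than the final result, can be sketched for linear regression; the noise scale below is a simplified stand-in for the paper's sensitivity analysis, not its exact bound:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data; features and labels scaled to [-1, 1], as the functional
# mechanism assumes bounded attributes.
n, d = 2000, 3
X = rng.uniform(-1, 1, size=(n, d))
w_true = np.array([0.5, -0.3, 0.8])
y = np.clip(X @ w_true + rng.normal(0, 0.05, n), -1, 1)

# The objective sum_i (y_i - x_i.w)^2 is a degree-2 polynomial in w whose
# coefficients are sums over records: A = sum x x^T and b = sum y x
# (the constant term sum y^2 does not affect the minimizer).
A = X.T @ X
b = X.T @ y

# Functional-mechanism idea: inject Laplace noise into these polynomial
# coefficients. The scale must match the coefficients' sensitivity and the
# privacy budget eps; the constant here is illustrative only.
eps = 1.0
scale = 2 * (d + 1) ** 2 / eps
A_noisy = A + rng.laplace(0, scale, size=(d, d))
b_noisy = b + rng.laplace(0, scale, size=d)

# Minimize the perturbed objective instead of perturbing the result:
# grad = (A + A^T) w - 2 b = 0. (The paper additionally handles the case
# where the noisy quadratic form is not positive definite.)
w_priv = np.linalg.solve(A_noisy + A_noisy.T, 2 * b_noisy)
```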

Xu J.,Northeastern University China | Zhang Z.,Illinois at Singapore Pte. Ltd | Xiao X.,Nanyang Technological University | Yang Y.,Illinois at Singapore Pte. Ltd | And 2 more authors.
VLDB Journal | Year: 2013

Differential privacy (DP) is a promising scheme for releasing the results of statistical queries on sensitive data, with strong privacy guarantees against adversaries with arbitrary background knowledge. Existing studies on differential privacy mostly focus on simple aggregations such as counts. This paper investigates the publication of DP-compliant histograms, an important analytical tool for showing the distribution of a random variable, e.g., hospital bill sizes for certain patients. Compared to simple aggregations whose results are purely numerical, a histogram query is inherently more complex, since it must also determine its structure, i.e., the ranges of the bins. As we demonstrate in the paper, a DP-compliant histogram with finer bins may actually lead to significantly lower accuracy than a coarser one, since the former requires stronger perturbations in order to satisfy DP. Moreover, the histogram structure itself may reveal sensitive information, which further complicates the problem. Motivated by this, we propose two novel mechanisms, NoiseFirst and StructureFirst, for computing DP-compliant histograms. Their main difference lies in the relative order of the noise injection and histogram structure computation steps. NoiseFirst has the additional benefit that it can improve the accuracy of an already published DP-compliant histogram computed using a naive method. For each of the proposed mechanisms, we design algorithms for computing the optimal histogram structure under two different objectives: minimizing the mean squared error and minimizing the mean absolute error. Going one step further, we extend both mechanisms to answer arbitrary range queries. Extensive experiments on several real datasets confirm that our two proposals output highly accurate query answers and consistently outperform existing competitors. © 2013 Springer-Verlag Berlin Heidelberg.
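A NoiseFirst-style pipeline (noise injection first, structure computation second) can be sketched on toy counts; the fixed merge width below replaces the paper's dynamic-programming structure search and is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical true counts over 16 fine-grained bins (e.g., bill-size ranges).
counts = np.array([2, 3, 2, 3, 40, 42, 41, 39,
                   5, 4, 6, 5, 20, 21, 19, 22], dtype=float)

eps = 0.1
# Adding or removing one record changes one count by 1, so per-bin Laplace
# noise of scale 1/eps satisfies eps-differential privacy for the counts.
noisy = counts + rng.laplace(0, 1 / eps, size=counts.size)

# NoiseFirst-style post-processing (simplified): merge each run of 4 adjacent
# bins and spread the group average back, trading a little bias for a large
# reduction in noise variance. The paper instead finds the optimal merge
# structure by dynamic programming.
merged = noisy.reshape(-1, 4).mean(axis=1)
smoothed = np.repeat(merged, 4)

mse_raw = np.mean((noisy - counts) ** 2)
mse_smooth = np.mean((smoothed - counts) ** 2)
# When bins within a group have similar true counts, averaging divides the
# noise variance by the group size, so mse_smooth is usually well below mse_raw.
```

This also illustrates the abstract's point that finer bins are not automatically better: each extra bin carries its own noise, so coarser structures can win.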

Fu T.Z.J.,Illinois at Singapore Pte Ltd | Song Q.,Chinese University of Hong Kong | Chiu D.M.,Chinese University of Hong Kong
Scientometrics | Year: 2014

By means of their academic publications, authors form a social network. Instead of sharing casual thoughts and photos (as in Facebook), authors select co-authors and reference papers written by other authors. Thanks to various efforts (such as Microsoft Academic Search and DBLP), the data necessary for analyzing the academic social network is becoming more available on the Internet. What types of information and queries would be useful for users to discover, beyond the search queries already available from services such as Google Scholar? In this paper, we explore this question by defining a variety of ranking metrics on different entities—authors, publication venues, and institutions. We go beyond traditional metrics such as paper counts, citations, and h-index. Specifically, we define metrics such as influence, connections, and exposure for authors. An author gains influence by receiving more citations, but especially by receiving citations from influential authors. An author increases his or her connections by co-authoring with other authors, especially with authors who themselves have high connections. An author receives exposure by publishing in selective venues where publications have received high citations in the past, and the selectivity of these venues in turn depends on the influence of the authors who publish there. We discuss the computational aspects of these metrics and the similarity between different metrics. With additional information on author-institution relationships, we are able to study institution rankings based on the corresponding authors' rankings, for each type of metric as well as for different domains. We are prepared to demonstrate these ideas with a web site built from millions of publications and authors. © 2014, Akadémiai Kiadó, Budapest, Hungary.
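The recursive definition of influence (citations weighted by the citers' own influence) suggests a PageRank-style iteration. A sketch under that assumption, on a hypothetical citation matrix; this is one plausible formalization, not necessarily the paper's exact metric:

```python
import numpy as np

# Hypothetical citation counts: C[i, j] = citations author i gives to author j.
C = np.array([
    [0, 3, 1],
    [1, 0, 2],
    [4, 2, 0],
], dtype=float)

def influence(C, damping=0.85, iters=100):
    """PageRank-style fixed point: an author's score is fed by incoming
    citations, each weighted by the citing author's own score."""
    n = C.shape[0]
    P = C / C.sum(axis=1, keepdims=True)  # row-normalize (every author here cites someone)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - damping) / n + damping * (r @ P)
    return r / r.sum()

scores = influence(C)  # author 0 benefits from the heavy citations by author 2
```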

Yuan G.,South China University of Technology | Yang Y.,Khalifa University | Zhang Z.,Illinois at Singapore Pte. Ltd. | Hao Z.,Foshan University
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining | Year: 2016

Differential privacy enables organizations to collect accurate aggregates over sensitive data with strong, rigorous guarantees on individuals' privacy. Previous work has found that under differential privacy, computing multiple correlated aggregates as a batch, using an appropriate strategy, may yield higher accuracy than computing each of them independently. However, finding the best strategy that maximizes result accuracy is non-trivial, as it involves solving a complex constrained optimization program that appears to be non-convex. Hence, much past effort has been devoted to solving this non-convex optimization program. Existing approaches include various sophisticated heuristics and expensive numerical solutions; none of them, however, is guaranteed to find the optimal solution. This paper points out that under (ϵ, δ)-differential privacy, the optimal solution of the above constrained optimization problem in search of a suitable strategy can be found, rather surprisingly, by solving a simple and elegant convex optimization program. We then propose an efficient algorithm based on Newton's method, which we prove always converges to the optimal solution with a linear global convergence rate and a quadratic local convergence rate. Empirical evaluations demonstrate the accuracy and efficiency of the proposed solution. © 2016 ACM.
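Newton's method with the stated convergence behavior can be sketched on a simple separable convex objective; the objective below is a toy stand-in, not the paper's constrained strategy-optimization program:

```python
import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=50):
    """Plain Newton's method for a smooth convex objective: take the step
    H(x)^{-1} grad(x), which converges quadratically near the optimum."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(hess(x), grad(x))
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

# Toy convex objective f(x) = sum(exp(x_i) - x_i), minimized at x = 0.
grad = lambda x: np.exp(x) - 1
hess = lambda x: np.diag(np.exp(x))
x_star = newton_minimize(grad, hess, x0=np.array([2.0, -1.0]))
```

A handful of iterations suffice here, illustrating why a convex reformulation plus Newton's method beats heuristic search on a non-convex surrogate.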

On B.-W.,Illinois at Singapore Pte Ltd | Lee I.,Troy University | Lee D.,Pennsylvania State University
Knowledge and Information Systems | Year: 2012

When non-unique values are used as the identifiers of entities, confusion can occur due to homonyms. In particular, when (parts of) the "names" of entities are used as their identifiers, the problem is often referred to as the name disambiguation problem, where the goal is to sort out the erroneous entities caused by name homonyms (e.g., if only the last name is used as the identifier, one cannot distinguish "Masao Obama" from "Norio Obama"). In this paper, we study the scalability of the name disambiguation problem: when (1) a small number of entities with large contents or (2) a large number of entities become indistinguishable due to homonyms. First, we carefully examine two state-of-the-art solutions to the name disambiguation problem and point out their limitations with respect to scalability. Then, we propose two scalable graph partitioning algorithms, multi-level graph partitioning and multi-level graph partitioning and merging, to solve the large-scale name disambiguation problem. Our claim is validated empirically: our proposal shows orders-of-magnitude improvement in performance while maintaining equivalent or reasonable accuracy compared to competing solutions. © 2011 Springer-Verlag London Limited.
