Center for Computing Sciences

Bowie, MD, United States



Streib A.P., Center for Computing Sciences | Streib N., Center for Computing Sciences
14th Workshop on Analytic Algorithmics and Combinatorics 2017, ANALCO 2017 | Year: 2017

A very challenging problem from statistical physics is computing the partition function of the ferromagnetic Ising model, even in the relatively simple case of no applied field. In this case, the partition function can be written as a function of the subgraphs of the underlying graph in which all vertices have even degree. In their seminal work, Jerrum and Sinclair showed that this quantity can be approximated by a rapidly converging Markov chain on all subgraphs. However, their chain frequently leaves the space of even subgraphs. Our aim is to devise and analyze a new class of Markov chains that do not leave this space, in the hopes of finding a faster sampling algorithm. We define Markov chains by viewing the even subgraphs as a vector space (often called the cycle space) whose transitions are defined by the addition of basis elements. The rate of convergence depends on the basis chosen, and our analysis proceeds by dividing bases into two types, short and long. The classical single-site update Markov chain known as Glauber dynamics is a special case of our short cycle basis Markov chains. We show that for any graph and any long basis, there is a temperature for which the corresponding Markov chain requires an exponential time to mix. Moreover, we show that for d-dimensional grids with d ≥ 2 - those of the most physical importance - all fundamental bases (a natural class of bases derived from spanning trees) are long. For the 2-dimensional grid on the torus, we show that there is a temperature for which the Markov chain requires exponential time for any chosen basis. © Copyright 2016 Amanda Pascoe Streib, Noah Streib.
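The basis-move dynamics described above can be sketched concretely. The following toy sampler is our own illustration, not the authors' code: it uses the unit-square faces of an n × n grid as a short cycle basis and runs a Metropolis chain whose moves XOR a random basis cycle into the current even subgraph, with stationary weight λ^|E'| for a subgraph E' (λ = tanh(βJ) in the high-temperature expansion).

```python
import random

def grid_face_basis(n):
    """Unit-square faces of an n x n grid: a short cycle basis."""
    faces = []
    for i in range(n - 1):
        for j in range(n - 1):
            a, b, c, d = (i, j), (i + 1, j), (i + 1, j + 1), (i, j + 1)
            faces.append(frozenset([frozenset([a, b]), frozenset([b, c]),
                                    frozenset([c, d]), frozenset([d, a])]))
    return faces

def sample_even_subgraph(n, lam, steps, rng=random.Random(0)):
    """Metropolis chain on even subgraphs: each move takes the symmetric
    difference with a basis cycle, which preserves even degree at every
    vertex.  Target weight of a subgraph E' is lam ** |E'|."""
    basis = grid_face_basis(n)
    state = set()                      # the empty subgraph is even
    for _ in range(steps):
        cyc = rng.choice(basis)
        new = state ^ cyc              # XOR a basis element
        if rng.random() < min(1.0, lam ** (len(new) - len(state))):
            state = new
    return state
```

Because every move is the addition of a cycle-space element, the chain never leaves the space of even subgraphs, which is the point of the construction.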


Nau D.S., University of Maryland University College | Lustrek M., Jozef Stefan Institute | Parker A., University of Maryland University College | Parker A., Center for Computing Sciences | And 2 more authors.
Artificial Intelligence | Year: 2010

In situations where one needs to make a sequence of decisions, it is often believed that looking ahead will help produce better decisions. However, it was shown 30 years ago that there are "pathological" situations in which looking ahead is counterproductive. Two long-standing open questions are (a) what combinations of factors have the biggest influence on whether lookahead pathology occurs, and (b) whether it occurs in real-world decision-making. This paper includes simulation results for several synthetic game-tree models, and experimental results for three well-known board games: two chess endgames, kalah (with some modifications to facilitate experimentation), and the 8-puzzle. The simulations show the interplay between lookahead pathology and several factors that affect it; and the experiments confirm the trends predicted by the simulation models. The experiments also show that lookahead pathology is more common than has been thought: all three games contain situations where it occurs. © 2010 Elsevier B.V. All rights reserved.
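The lookahead procedure under study is ordinary depth-limited minimax with a (possibly noisy) static evaluator at the search frontier; pathology means that increasing the depth can make the root decision less reliable. A minimal sketch, with tree encoding and names of our own choosing:

```python
def minimax(node, depth, maximize, evaluate, children):
    """Depth-limited minimax lookahead.  Pathology occurs when raising
    `depth` makes the move chosen at the root *less* reliable."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)   # static (heuristic) evaluation at the frontier
    values = [minimax(k, depth - 1, not maximize, evaluate, children)
              for k in kids]
    return max(values) if maximize else min(values)

# Toy encoding: an internal node is a list of subtrees, a leaf is its true
# value; the heuristic returns a placeholder 0 for unresolved internal nodes.
children = lambda node: node if isinstance(node, list) else []
evaluate = lambda node: node if isinstance(node, (int, float)) else 0
```

For the tree [[3, 5], [2, 9]], a depth-2 search by the maximizing player returns 3 (each minimizing child picks its smaller leaf). The paper's synthetic models add evaluation noise to trees like this to measure when deeper search stops helping.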


Jain S., National University of Singapore | Moelius III S.E., Center for Computing Sciences | Zilles S., University of Regina
Theoretical Computer Science | Year: 2013

Iterative learning is a model of language learning from positive data, due to Wiehagen. When compared to a learner in Gold's original model of language learning from positive data, an iterative learner can be thought of as memory-limited. However, an iterative learner can memorize some input elements by coding them into the syntax of its hypotheses. A main concern of this paper is: to what extent are such coding tricks necessary? One means of preventing some such coding tricks is to require that the hypothesis space used be free of redundancy, i.e., that it be 1-1. In this context, we make the following contributions. By extending a result of Lange and Zeugmann, we show that many interesting and non-trivial classes of languages can be iteratively identified using a Friedberg numbering as the hypothesis space. (Recall that a Friedberg numbering is a 1-1 effective numbering of all computably enumerable sets.) An example of such a class is the class of pattern languages over an arbitrary alphabet. On the other hand, we show that there exists an iteratively identifiable class of languages that cannot be iteratively identified using any 1-1 effective numbering as the hypothesis space. We also consider an iterative-like learning model in which the computational component of the learner is modeled as an enumeration operator, as opposed to a partial computable function. In this new model, there are no hypotheses, and, thus, no syntax in which the learner can encode what elements it has or has not yet seen. We show that there exists a class of languages that can be identified under this new model, but that cannot be iteratively identified. On the other hand, we show that there exists a class of languages that cannot be identified under this new model, but that can be iteratively identified using a Friedberg numbering as the hypothesis space. © 2012 Elsevier B.V. All rights reserved.
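The model's defining constraint, that each new conjecture depends only on the previous conjecture and the current datum, can be made concrete with a toy class of our own (not from the paper): languages L_n = {0, …, n}, where hypothesis n names L_n. The hypothesis itself carries all the memory the learner has, which is exactly the "coding" channel the paper examines.

```python
def iterative_step(hypothesis, datum):
    """One step of an iterative learner: no access to the full history of
    examples, only the previous conjecture and the newest positive datum."""
    return max(hypothesis, datum)

def learn(text, init=0):
    """Feed a text (sequence of positive examples) through the learner and
    return its final conjecture."""
    h = init
    for x in text:
        h = iterative_step(h, x)
    return h
```

On any text for L_5, e.g. [2, 0, 5, 3], the learner converges to hypothesis 5 once the maximum element has appeared, and never changes its mind afterwards.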


Harris D.G., University of Maryland University College | Sullivan F., Center for Computing Sciences
Leibniz International Proceedings in Informatics, LIPIcs | Year: 2015

The all-terminal reliability polynomial of a graph counts its connected subgraphs of various sizes. Algorithms based on sequential importance sampling (SIS) have been proposed to estimate a graph's reliability polynomial. We show upper bounds on the relative error of three sequential importance sampling algorithms. We use these to create a hybrid algorithm, which selects the best SIS algorithm for a particular graph G and particular coefficient of the polynomial. This hybrid algorithm is particularly effective when G has low degree. For graphs of average degree ≤ 11, it is the fastest known algorithm; for graphs of average degree ≤ 45 it is the fastest known polynomial-space algorithm. For example, when a graph has average degree 3, this algorithm estimates to error ε in time O(1.26^n ε^-2). Although the algorithm may take exponential time, in practice it can have good performance even on medium-scale graphs. We provide experimental results that show quite practical performance on graphs with hundreds of vertices and thousands of edges. By contrast, alternative algorithms are either not rigorous or are completely impractical for such large graphs. © David G. Harris and Francis Sullivan.
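To fix the quantity being estimated, here is a plain Monte Carlo estimator of all-terminal reliability at a fixed edge probability p. This is explicitly not the paper's SIS scheme, which is far more efficient; it is only a baseline sketch of the definition.

```python
import random

def all_terminal_reliability_mc(n, edges, p, trials, rng=random.Random(1)):
    """Naive Monte Carlo estimate of all-terminal reliability: the
    probability that the subgraph keeping each edge independently with
    probability p connects all n vertices."""
    hits = 0
    for _ in range(trials):
        kept = [e for e in edges if rng.random() < p]
        parent = list(range(n))        # union-find for connectivity
        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]   # path halving
                v = parent[v]
            return v
        for u, v in kept:
            parent[find(u)] = find(v)
        if len({find(v) for v in range(n)}) == 1:
            hits += 1
    return hits / trials
```

The SIS estimators bounded in the paper target the polynomial's individual coefficients (counts of connected subgraphs by size) with provable relative error, which naive sampling like this cannot provide when reliability is close to 0 or 1.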


Mardziel P., University of Maryland University College | Mardziel P., Center for Computing Sciences | Hicks M., University of Maryland University College | Srivatsa M., IBM
Journal of Computer Security | Year: 2013

This paper explores the idea of knowledge-based security policies, which are used to decide whether to answer queries over secret data based on an estimation of the querier's (possibly increased) knowledge given the results. Limiting knowledge is the goal of existing information release policies that employ mechanisms such as noising, anonymization, and redaction. Knowledge-based policies are more general: they increase flexibility by not fixing the means to restrict information flow. We enforce a knowledge-based policy by explicitly tracking a model of a querier's belief about secret data, represented as a probability distribution, and denying any query that could increase knowledge above a given threshold. We implement query analysis and belief tracking via abstract interpretation, which allows us to trade off precision and performance through the use of abstraction. We have developed an approach to augment standard abstract domains to include probabilities, and thus define distributions. We focus on developing probabilistic polyhedra in particular, to support numeric programs. While probabilistic abstract interpretation has been considered before, our domain is the first whose design supports sound conditioning, which is required to ensure that estimates of a querier's knowledge are accurate. Experiments with our implementation show that several useful queries can be handled efficiently, particularly compared to exact (i.e., sound) inference involving sampling. We also show that, for our benchmarks, restricting constraints to octagons or intervals, rather than full polyhedra, can dramatically improve performance while incurring little to no loss in precision. © 2013 IOS Press and the authors.
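The policy check can be illustrated over a tiny, explicitly enumerated belief; this is a toy stand-in for the paper's abstract-interpretation machinery (probabilistic polyhedra), with names of our own choosing.

```python
def answer_query(prior, secret, query, threshold):
    """Knowledge-based policy sketch: refuse the query if *any* possible
    output would raise the querier's maximum posterior probability on the
    secret above `threshold`.  Deciding over all outputs, not just the
    actual one, keeps the refusal itself from leaking information."""
    by_output = {}
    for s, p in prior.items():         # group secrets by the output produced
        by_output.setdefault(query(s), []).append(p)
    for probs in by_output.values():
        if max(probs) / sum(probs) > threshold:
            return None                # refuse: some output is too revealing
    return query(secret)
```

With a uniform prior over four secrets, a parity query leaves the posterior at 1/2 per consistent secret and is answered under a 0.6 threshold, while an identity query (posterior 1) is refused. Exhaustive enumeration like this is only viable for tiny domains, which is why the paper's abstract domains matter.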


Case J., University of Delaware | Moelius III S.E., Center for Computing Sciences
Information and Computation | Year: 2011

Gold's original paper on inductive inference introduced a notion of an optimal learner. Intuitively, a learner identifies a class of objects optimally iff there is no other learner that: requires as little of each presentation of each object in the class in order to identify that object, and, for some presentation of some object in the class, requires less of that presentation in order to identify that object. Beick considered this notion in the context of function learning, and gave an intuitive characterization of an optimal function learner. Jantke and Beick subsequently characterized the classes of functions that are algorithmically, optimally identifiable. Herein, Gold's notion is considered in the context of language learning. It is shown that a characterization of optimal language learners analogous to Beick's does not hold. It is also shown that the classes of languages that are algorithmically, optimally identifiable cannot be characterized in a manner analogous to that of Jantke and Beick. Other interesting results concerning optimal language learning include the following. It is shown that strong non-U-shapedness, a property involved in Beick's characterization of optimal function learners, does not restrict algorithmic language learning power. It is also shown that, for an arbitrary optimal learner F of a class of languages L, F optimally identifies a subclass K of L iff F is class-preserving with respect to K. © 2011 Elsevier Inc. All rights reserved.


Davis S.T., Center for Computing Sciences | Conroy J.M., Center for Computing Sciences | Schlesinger J.D., Center for Computing Sciences
Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012 | Year: 2012

OCCAMS is a new algorithm for the Multi-Document Summarization (MDS) problem. We use Latent Semantic Analysis (LSA) to produce term weights which identify the main theme(s) of a set of documents. These are used by our heuristic for extractive sentence selection which borrows techniques from combinatorial optimization to select a set of sentences such that the combined weight of the terms covered is maximized while redundancy is minimized. OCCAMS outperforms CLASSY11 on DUC/TAC data for nearly all years since 2005, where CLASSY11 is the best human-rated system of TAC 2011. OCCAMS also delivers higher ROUGE scores than all human-generated summaries for TAC 2011. We show that if the combinatorial component of OCCAMS, which computes the extractive summary, is given true weights of terms, then the quality of the summaries generated outperforms all human generated summaries for all years using ROUGE-2, ROUGE-SU4, and a coverage metric. We introduce this new metric based on term coverage and demonstrate that a simple bi-gram instantiation achieves a statistically significant higher Pearson correlation with overall responsiveness than ROUGE on the TAC data. © 2012 IEEE.
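The combinatorial core, covering maximum term weight within a length budget while avoiding redundancy, can be sketched with a simple greedy budgeted-coverage heuristic. This is a simplified stand-in of our own, not the paper's selection procedure, and it measures the budget in distinct terms rather than words for brevity.

```python
def select_sentences(sentences, weights, budget):
    """Greedy extractive selection: repeatedly add the sentence with the
    largest weight of not-yet-covered terms per unit length, until no
    sentence fits the remaining budget.  `sentences` is a list of term
    sets; `weights` maps terms to (e.g. LSA-derived) weights."""
    covered, chosen, used = set(), [], 0
    while True:
        best, best_gain = None, 0.0
        for i, terms in enumerate(sentences):
            if i in chosen or used + len(terms) > budget:
                continue
            # counting only uncovered terms is what suppresses redundancy
            gain = sum(weights.get(t, 0.0) for t in terms - covered)
            gain /= max(len(terms), 1)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            return chosen
        chosen.append(best)
        covered |= sentences[best]
        used += len(sentences[best])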


Case J., University of Delaware | Moelius III S.E., Center for Computing Sciences
Theory of Computing Systems | Year: 2012

Intuitively, a recursion theorem asserts the existence of self-referential programs. Two well-known recursion theorems are Kleene's Recursion Theorem (krt) and Rogers' Fixpoint Recursion Theorem (fprt). Does one of these two theorems better capture the notion of program self-reference than the other? In the context of the partial computable functions over the natural numbers (PC), fprt is strictly weaker than krt, in that fprt holds in any effective numbering of PC in which krt holds, but not vice versa. It is shown that, in this context, the existence of self-reproducing programs (a. k. a. quines) is assured by krt, but not by fprt. Most would surely agree that a self-reproducing program is self-referential. Thus, this result suggests that krt is better than fprt at capturing the notion of program self-reference in PC. A generalization of krt to arbitrary constructive Scott subdomains is then given. (For fprt, a similar generalization was already known.) Surprisingly, for some such subdomains, the two theorems turn out to be equivalent. A precise characterization is given of those constructive Scott subdomains in which this occurs. For such subdomains, the two theorems capture the notion of program self-reference equally well. © 2011 Springer Science+Business Media, LLC.
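The self-reproducing programs whose existence krt guarantees are precisely the classic quines, and the standard Python instance makes the idea tangible:

```python
# A quine: a program whose output is its own source code.  Kleene's
# Recursion Theorem assures that such self-referential programs exist in
# any acceptable programming system; this is the classic Python instance.
s = 's = %r\nprint(s %% s)'
print(s % s)
```

Running it prints the two lines of source exactly: `%r` re-inserts the string's own repr, and `%%` collapses back to the literal `%` that performs the formatting.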


Moelius III S.E., Center for Computing Sciences | Zilles S., University of Regina
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2010

Iterative learning is a model of language learning from positive data, due to Wiehagen. When compared to a learner in Gold's original model of language learning from positive data, an iterative learner can be thought of as memory-limited. However, an iterative learner can memorize some input elements by coding them into the syntax of its hypotheses. A main concern of this paper is: to what extent are such coding tricks necessary? One means of preventing some such coding tricks is to require that the hypothesis space used be free of redundancy, i.e., that it be 1-1. By extending a result of Lange & Zeugmann, we show that many interesting and non-trivial classes of languages can be iteratively identified in this manner. On the other hand, we show that there exists a class of languages that cannot be iteratively identified using any 1-1 effective numbering as the hypothesis space. We also consider an iterative-like learning model in which the computational component of the learner is modeled as an enumeration operator, as opposed to a partial computable function. In this new model, there are no hypotheses, and, thus, no syntax in which the learner can encode what elements it has or has not yet seen. We show that there exists a class of languages that can be identified under this new model, but that cannot be iteratively identified. On the other hand, we show that there exists a class of languages that cannot be identified under this new model, but that can be iteratively identified using a Friedberg numbering as the hypothesis space. © 2010 Springer-Verlag.


Moelius III S.E., Center for Computing Sciences
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2012

The Rogers semilattice of effective programming systems (epses) is the collection of all effective numberings of the partial computable functions ordered such that θ ≤ ψ whenever θ-programs can be algorithmically translated into ψ-programs. Herein, it is shown that an eps ψ is minimal in this ordering if and only if, for each translation function t into ψ, there exists a computably enumerable equivalence relation (ceer) R such that (i) R is a subrelation of ψ's program equivalence relation, and (ii) R equates each ψ-program to some program in the range of t. It is also shown that there exists a minimal eps for which no single such R does the work for all such t. In fact, there exists a minimal eps ψ such that, for each ceer R, either R contradicts ψ's program equivalence relation, or there exists a translation function t into ψ such that the range of t fails to intersect infinitely many of R's equivalence classes. © 2012 Springer-Verlag.
