Beer M., University of Liverpool |
Ferson S., Applied Biomathematics, Inc. |
Kreinovich V., University of Texas at El Paso
Mechanical Systems and Signal Processing | Year: 2013
Probabilistic uncertainty and imprecision in structural parameters and in environmental conditions and loads are challenging phenomena in engineering analyses. They require appropriate mathematical modeling and quantification to obtain realistic results when predicting the behavior and reliability of engineering structures and systems. But the modeling and quantification are complicated by the characteristics of the available information, which involves, for example, sparse data, poor measurements and subjective information. This raises the question of whether the available information is sufficient for probabilistic modeling or rather suggests a set-theoretical approach. The framework of imprecise probabilities provides a mathematical basis to deal with these problems, which involve both probabilistic and non-probabilistic information. A common feature of the various concepts of imprecise probabilities is the consideration of an entire set of probabilistic models in one analysis. The theoretical differences between the concepts mainly concern the mathematical description of the set of probabilistic models and the connection to the probabilistic models involved. This paper provides an overview of developments which involve imprecise probabilities for the solution of engineering problems. Evidence theory, probability bounds analysis with p-boxes, and fuzzy probabilities are discussed with emphasis on their key features and on their relationships to one another. This paper was especially prepared for this special issue and reflects, in various ways, the thinking and presentation preferences of the authors, who are also the guest editors for this special issue. © 2013 Elsevier Ltd.
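To make the link between evidence theory and p-boxes concrete, the following is a minimal sketch, not taken from the paper. It assumes a Dempster-Shafer structure whose focal elements are closed intervals with attached masses; the function name `belief_plausibility` and the numbers are hypothetical. For the event X <= x, belief sums the masses of focal elements lying entirely inside the event, and plausibility sums the masses of focal elements that merely intersect it, so the pair bounds the cumulative probability at x.

```python
def belief_plausibility(focal_elements, threshold):
    """Bound P(X <= threshold) for a Dempster-Shafer structure.

    focal_elements: list of ((lo, hi), mass) pairs, masses summing to 1.
    Returns (belief, plausibility) of the event X <= threshold.
    """
    bel = 0.0
    pl = 0.0
    for (lo, hi), mass in focal_elements:
        if hi <= threshold:   # interval lies entirely inside the event
            bel += mass
        if lo <= threshold:   # interval intersects the event
            pl += mass
    return bel, pl


# Hypothetical body of evidence: three interval-valued focal elements.
structure = [((1.0, 3.0), 0.5), ((2.0, 5.0), 0.3), ((4.0, 6.0), 0.2)]
bel, pl = belief_plausibility(structure, 3.0)
print(f"{bel:.2f} <= P(X <= 3) <= {pl:.2f}")  # 0.50 <= P(X <= 3) <= 0.80
```

Evaluating the function over a grid of thresholds traces out a lower and an upper cumulative distribution, which is the p-box induced by this structure.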
Balch M.S., Applied Biomathematics, Inc.
International Journal of Approximate Reasoning | Year: 2012
This paper introduces a new mathematical object: the confidence structure. A confidence structure represents inferential uncertainty in an unknown parameter by defining a belief function whose output is commensurate with Neyman-Pearson confidence. Confidence structures on a group of input variables can be propagated through a function to obtain a valid confidence structure on the output of that function. The theory of confidence structures is created by enhancing the extant theory of confidence distributions with the mathematical generality of Dempster-Shafer evidence theory. Mathematical proofs grounded in random set theory demonstrate the operative properties of confidence structures. The result is a new theory which achieves the holistic goals of Bayesian inference while maintaining the empirical rigor of frequentist inference. © 2012 Elsevier Inc. All rights reserved.
Sentz K., Los Alamos National Laboratory |
Ferson S., Applied Biomathematics, Inc.
Reliability Engineering and System Safety | Year: 2011
The current challenge of nuclear weapon stockpile certification is to assess the reliability of complex, high-consequence, and aging systems without the benefit of full-system test data. In the absence of full-system testing, disparate kinds of information are used to inform certification assessments, such as archival data, experimental data on partial systems, data on related or similar systems, computer models and simulations, and expert knowledge. In some instances, data can be scarce and information incomplete. The challenge of Quantification of Margins and Uncertainties (QMU) is to develop a methodology to support decision-making in this informational context. Given the difficulty presented by mixed and incomplete information, we contend that the uncertainty representation for the QMU methodology should be expanded to include more general characterizations that reflect imperfect information. One type of generalized uncertainty representation, known as probability bounds analysis, constitutes the union of probability theory and interval analysis, where a class of distributions is defined by two bounding distributions. This has the advantage of rigorously bounding the uncertainty when inputs are imperfectly known. We argue for the inclusion of probability bounds analysis as one of many tools that are relevant for QMU and demonstrate its usefulness as compared to other methods in a reliability example with imperfect input information. © 2011 Elsevier Ltd. All rights reserved.
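The idea of a class of distributions defined by two bounding distributions can be illustrated with a small sketch. This is not the reliability example from the paper. It assumes, hypothetically, a component with an exponential lifetime whose failure rate is only known to lie in an interval; the function names and numbers are illustrative. Because the exponential CDF increases with the rate, the CDFs at the two endpoint rates bound every distribution in the class.

```python
import math


def exponential_cdf(t, rate):
    """P(T <= t) for an exponential lifetime with the given failure rate."""
    return 1.0 - math.exp(-rate * t) if t > 0 else 0.0


def pbox_exponential(t, rate_lo, rate_hi):
    """Lower and upper bounds on P(T <= t) when rate is in [rate_lo, rate_hi].

    The CDF is monotone increasing in the rate, so the endpoint rates
    give the bounding distributions of the p-box.
    """
    return exponential_cdf(t, rate_lo), exponential_cdf(t, rate_hi)


# Hypothetical input: failure rate between 0.001 and 0.002 per hour.
lower, upper = pbox_exponential(100.0, 0.001, 0.002)
print(f"P(failure by 100 h) in [{lower:.4f}, {upper:.4f}]")
# P(failure by 100 h) in [0.0952, 0.1813]
```

The width of the resulting interval reflects the imprecision in the input, which a single fitted distribution would hide.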
Agency: Department of Health and Human Services | Branch: | Program: SBIR | Phase: Phase I | Award Amount: 480.87K | Year: 2012
DESCRIPTION (provided by applicant): Patient data collected during health care delivery and public health surveys possess a great deal of information that could be used in biomedical and epidemiological research. Access to these data, however, is usually limited because of the private nature of most personal health records. Methods of balancing the informativeness of data for research with the information loss required to minimize disclosure risk are needed before these data can be used to improve public health. Current methods are primarily focused on protecting privacy, but focusing on protecting privacy alone is inadequate. In statistical disclosure control techniques, information truthfulness is not well preserved, so unreliable results may be released. In generalization-based anonymization approaches, there is information loss due to attribute generalization, and existing techniques do not provide sufficient control for maintaining data utility. What is currently needed are methods that protect both the privacy of individuals represented in the data and the integrity of relationships studied by researchers. The problem is that there is an inherent tradeoff between protecting the privacy of individuals and protecting the informativeness of the data set. Protecting the privacy of individuals always results in a loss of information, and it is the information contained in the data set that affects the power of a statistical test. For a given anonymization strategy, however, there are often multiple ways of masking the data that meet the disclosure risk criteria provided. This can be taken advantage of to choose the solution that best preserves statistical information while meeting the disclosure risk criteria provided. This project will develop the first integrated software system that provides solutions for problems faced in all three stages in the release of sensitive health care data: 1. anonymize a data set by intervalizing/generalizing data to satisfy currently available anonymization strategies, 2. provide sufficient controls within anonymization procedures to satisfy constraints on statistical usefulness of the data, and 3. compute statistical tests for the anonymized data intervals. There are two main challenges facing this effort. The first is that, based on existing research results, integrating our proposed new control processes into anonymization procedures is expected to be computationally difficult. We will overcome this challenge by developing efficient and practically useful greedy algorithms, approximation algorithms, or algorithms working for realistic situations (if not for general cases). The other primary challenge facing this effort is the fact that statistical calculations with interval data sets are known to be computationally difficult, and these calculations are necessary both for control processes within anonymization procedures and for subsequent statistical computation and tests. We will overcome this challenge with efficient algorithms that exploit the structure present in data sets intervalized for privacy. The software will be tested on medical data sets of various sizes and structures to demonstrate the feasibility of the approach and to characterize the scalability of the algorithms with data set size.
PUBLIC HEALTH RELEVANCE: Patient health records possess a great deal of information that is useful in medical research, but access to these data is usually limited because of the private nature of most personal health records. Methods of balancing the informativeness of data for research with the information loss required to minimize disclosure risk are needed before these data can be used to improve public health. This project will develop the first integrated software system that provides solutions for intervalizing/generalizing data, controlling data utility, and performing analyses using interval statistics.
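The intervalize-then-analyze workflow described above can be sketched in a few lines. This is an illustrative assumption, not the project's software: the helper names `generalize` and `interval_mean`, the fixed bin width, and the sample ages are all hypothetical. Each value is replaced by the bin that contains it, and the mean of the masked data is then reported as an interval that is guaranteed to enclose the mean of the original values.

```python
def generalize(value, width):
    """Replace a value by the bin [lo, lo + width) that contains it."""
    lo = (value // width) * width
    return (lo, lo + width)


def interval_mean(intervals):
    """Bounds on the sample mean of interval-valued data.

    The mean is monotone in every data point, so averaging the lower
    endpoints and the upper endpoints gives the tightest enclosure.
    """
    n = len(intervals)
    mean_lo = sum(lo for lo, _ in intervals) / n
    mean_hi = sum(hi for _, hi in intervals) / n
    return mean_lo, mean_hi


# Hypothetical ages, masked into 10-year bins before release.
ages = [23, 37, 41, 58]
masked = [generalize(a, 10) for a in ages]
mean_lo, mean_hi = interval_mean(masked)
print(masked)              # [(20, 30), (30, 40), (40, 50), (50, 60)]
print(mean_lo, mean_hi)    # 35.0 45.0
```

The mean is the easy case. Other statistics, such as the upper bound on the sample variance of interval data, are computationally hard in general, which is consistent with the difficulty the description refers to.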
Agency: National Aeronautics and Space Administration | Branch: | Program: SBIR | Phase: Phase II | Award Amount: 599.88K | Year: 2007
This project extends Probability Bounds Analysis to model epistemic and aleatory uncertainty during early design of engineered systems in an Integrated Concurrent Engineering environment. This method uses efficient analytic and semi-analytic calculations, is more rigorous than probabilistic Monte Carlo simulation, and provides comprehensive and (often) best possible bounds on mission-level risk as a function of uncertainty in each parameter. Phase 1 demonstrated the capability to robustly model uncertainty during early design. Phase 2 will build on the Phase 1 work by 1) Implementing the PBA technology in Excel-mediated computing tools, 2) Fashioning an interface for these tools that enables fast and robust elicitation of expert knowledge, 3) Initiating the development of a library of such elicitations, 4) Demonstrating the application of the tools, interface and library in an interactive, distributed-computing environment, 5) Developing case studies, and 6) Creating tutorial documentation. Important applications of these new tools include the ability to rapidly and rigorously explore uncertainty regarding alternate designs, determine risk-based margins that are robust to surprise, and incorporate qualitatively described risks in quantitative analyses. This suite of capabilities is not currently available to systems engineers and cannot be provided by more traditional probabilistic risk assessment methods.
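One way bounds of this kind can be obtained without Monte Carlo sampling is through the Fréchet inequalities, which bound the probability of a conjunction or disjunction of two events when nothing is assumed about their dependence. The sketch below is an illustration of that general technique and is not claimed to be the project's implementation; the event names and interval values are hypothetical.

```python
def frechet_and(a, b):
    """Bounds on P(A and B) for interval probabilities a, b, any dependence."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return (max(0.0, a_lo + b_lo - 1.0), min(a_hi, b_hi))


def frechet_or(a, b):
    """Bounds on P(A or B) for interval probabilities a, b, any dependence."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return (max(a_lo, b_lo), min(1.0, a_hi + b_hi))


# Hypothetical elicited failure probabilities, each known only to an interval.
valve_fails = (0.10, 0.20)
sensor_fails = (0.05, 0.10)

both_lo, both_hi = frechet_and(valve_fails, sensor_fails)
either_lo, either_hi = frechet_or(valve_fails, sensor_fails)
print(f"P(both fail)   in [{both_lo:.2f}, {both_hi:.2f}]")     # [0.00, 0.10]
print(f"P(either fails) in [{either_lo:.2f}, {either_hi:.2f}]")  # [0.10, 0.30]
```

These bounds are the tightest possible given only the marginal intervals, so any additional knowledge about dependence can only narrow them.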