Human Genetics Foundation Turin
Baldassi C.,Polytechnic University of Turin |
Baldassi C.,Human Genetics Foundation Turin
Journal of Statistical Mechanics: Theory and Experiment | Year: 2017
We present a method for Monte Carlo sampling on systems with discrete variables (focusing on the Ising case), introducing a prior on the candidate moves in a Metropolis-Hastings scheme which can significantly reduce the rejection rate, called the reduced-rejection-rate (RRR) method. The method employs the same probability distribution for the choice of the moves as rejection-free schemes such as the method proposed by Bortz, Kalos and Lebowitz (BKL) (1975 J. Comput. Phys. 17 10-8); however, it uses it as a prior in an otherwise standard Metropolis scheme: it is thus not fully rejection-free, but in a wide range of scenarios it is nearly so. This allows the method to be extended to cases for which rejection-free schemes become inefficient, in particular when the graph connectivity is not sparse, but the energy can nevertheless be expressed as a sum of two components, one of which is computed on a sparse graph and dominates the measure. As examples of such instances, we demonstrate that the method yields excellent results when performing Monte Carlo simulations of quantum spin models in the presence of a transverse field in the Suzuki-Trotter formalism, and when exploring the so-called robust ensemble which was recently introduced in Baldassi et al (2016 Proc. Natl Acad. Sci. 113 E7655-62). Our code for the Ising case is publicly available (RRR Monte Carlo code https://github.com/carlobaldassi/RRRMC.jl), and extensible to user-defined models: it provides efficient implementations of standard Metropolis, the RRR method, the BKL method (extended to the case of continuous energy spectra), and the waiting time method by Dall and Sibani (2001 Comput. Phys. Commun. 141 260-7). © 2017 IOP Publishing Ltd and SISSA Medialab srl.
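The idea of using a BKL-style distribution as the proposal of a Metropolis-Hastings step can be illustrated with a minimal Python sketch. This is not the RRRMC.jl implementation: the function names, the periodic 1D Ising chain, and the brute-force recomputation of energies are all assumptions made for illustration. The sketch assumes that spin i is proposed with probability w_i / z, where w_i = min(1, exp(-beta * dE_i)) and z is the sum of the w_i; under that assumption the Metropolis-Hastings acceptance probability works out to min(1, z / z_new), which the last function checks against the Boltzmann distribution by exhaustive enumeration.

```python
import itertools
import math
import random


def energy(spins, coupling=1.0):
    """Energy of a periodic 1D Ising chain: E = -J * sum_i s_i * s_{i+1}."""
    n = len(spins)
    return -coupling * sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def flip(spins, i):
    """Return a copy of the configuration (a tuple) with spin i flipped."""
    return spins[:i] + (-spins[i],) + spins[i + 1:]


def move_weights(spins, beta):
    """BKL-style weights w_i = min(1, exp(-beta * dE_i)) for each single-spin flip."""
    e0 = energy(spins)
    return [min(1.0, math.exp(-beta * (energy(flip(spins, i)) - e0)))
            for i in range(len(spins))]


def rrr_step(spins, beta, rng):
    """One step: propose spin i with probability w_i / z, accept with min(1, z / z_new)."""
    w = move_weights(spins, beta)
    i = rng.choices(range(len(spins)), weights=w)[0]
    candidate = flip(spins, i)
    z, z_new = sum(w), sum(move_weights(candidate, beta))
    return candidate if rng.random() < min(1.0, z / z_new) else spins


def stationarity_error(n, beta):
    """Max deviation of (pi P) from pi, where pi is the Boltzmann distribution."""
    states = list(itertools.product((-1, 1), repeat=n))
    boltz = {s: math.exp(-beta * energy(s)) for s in states}
    total = sum(boltz.values())
    pi = {s: b / total for s, b in boltz.items()}
    out = {s: 0.0 for s in states}
    for s in states:
        w = move_weights(s, beta)
        z = sum(w)
        for i, wi in enumerate(w):
            t = flip(s, i)
            accept = min(1.0, z / sum(move_weights(t, beta)))
            out[t] += pi[s] * (wi / z) * accept          # accepted move s -> t
            out[s] += pi[s] * (wi / z) * (1.0 - accept)  # rejected move, stay at s
    return max(abs(out[s] - pi[s]) for s in states)


if __name__ == "__main__":
    rng = random.Random(0)
    state = (1, -1, 1, -1, 1, -1)
    for _ in range(1000):
        state = rrr_step(state, beta=0.5, rng=rng)
    print(state, stationarity_error(4, 0.5))
```

In this toy version every step recomputes all energy differences from scratch, which is only viable for tiny systems; an efficient implementation would update the weights locally after each flip.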
Lucibello C.,Polytechnic University of Turin |
Lucibello C.,Human Genetics Foundation Turin |
Parisi G.,CNR Institute for Chemical and Physical Processes |
Sicuro G.,Brazilian Center for Research in Physics (CBPF)
Physical Review E - Statistical, Nonlinear, and Soft Matter Physics | Year: 2017
The matching problem is a notorious combinatorial optimization problem that has attracted the attention of the statistical physics community for many years. Here we analyze the Euclidean version of the problem, i.e., the optimal matching problem between points randomly distributed on a d-dimensional Euclidean space, where the cost to minimize depends on the points' pairwise distances. Using Mayer's cluster expansion we write a formal expression for the replicated action that is suitable for a saddle point computation. We give the diagrammatic rules for each term of the expansion, and we analyze in detail the one-loop diagrams. A characteristic feature of the theory, when diagrams are perturbatively computed around the mean field part of the action, is the vanishing of the mass at zero momentum. In the non-Euclidean case of uncorrelated costs, instead, we predict and numerically verify an anomalous scaling for the sub-sub-leading correction to the asymptotic average cost. © 2017 American Physical Society.
Braunstein A.,Polytechnic University of Turin |
Braunstein A.,Human Genetics Foundation Turin |
Muntoni A.P.,Polytechnic University of Turin |
Pagnani A.,Polytechnic University of Turin |
And 2 more authors.
Nature Communications | Year: 2017
Assuming a steady-state condition within a cell, metabolic fluxes satisfy an underdetermined linear system of stoichiometric equations. Characterizing the space of fluxes that satisfy such equations along with given bounds (and possibly additional relevant constraints) is considered of utmost importance for the understanding of cellular metabolism. Extreme values for each individual flux can be computed with linear programming (as in flux balance analysis), and their marginal distributions can be approximately computed with Monte Carlo sampling. Here we present an approximate analytic method for the latter task based on expectation propagation equations that does not involve sampling and can achieve much better predictions than other existing analytic methods. The method is iterative, and its computation time is dominated by one matrix inversion per iteration. Compared with sampling, we show through extensive simulation that it has some advantages, including computation time and the ability to efficiently fix empirically estimated distributions of fluxes. © The Author(s) 2017.
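The linear-programming step mentioned in the abstract (extreme values of each flux under the steady-state constraint and flux bounds) can be sketched in Python with `scipy.optimize.linprog`. The three-reaction network, its bounds, and the function name `flux_range` are hypothetical and chosen only for illustration; the expectation propagation method of the paper is not reproduced here.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the paper):
#   v0: uptake -> A,   v1: A -> B,   v2: B -> export
# Rows are metabolites (A, B), columns are reactions (v0, v1, v2).
S = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
BOUNDS = [(0.0, 10.0), (0.0, 5.0), (0.0, 100.0)]


def flux_range(stoich, bounds, j):
    """Minimum and maximum of flux j subject to stoich @ v = 0 and the bounds."""
    n_metabolites, n_reactions = stoich.shape
    b_eq = np.zeros(n_metabolites)
    extremes = []
    for sign in (1.0, -1.0):
        c = np.zeros(n_reactions)
        c[j] = sign  # minimize v_j (sign=+1) or minimize -v_j (sign=-1)
        res = linprog(c, A_eq=stoich, b_eq=b_eq, bounds=bounds, method="highs")
        if not res.success:
            raise RuntimeError(res.message)
        extremes.append(sign * res.fun)
    return extremes[0], extremes[1]


if __name__ == "__main__":
    for j in range(S.shape[1]):
        print(j, flux_range(S, BOUNDS, j))
```

In this toy network the steady-state condition forces all three fluxes to be equal, so the tightest upper bound (on v1) limits every flux.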
PubMed | Civile Mp Arezzo Hospital, The Second University of Naples, Heinrich Heine University Düsseldorf, Wellcome Trust Sanger Institute and 23 more.
Type: Journal Article | Journal: Nature | Year: 2016
Approximately 1.5 billion people worldwide are overweight or affected by obesity, and are at risk of developing type 2 diabetes, cardiovascular disease and related metabolic and inflammatory disturbances. Although the mechanisms linking adiposity to associated clinical conditions are poorly understood, recent studies suggest that adiposity may influence DNA methylation, a key regulator of gene expression and molecular phenotype. Here we use epigenome-wide association to show that body mass index (BMI; a key measure of adiposity) is associated with widespread changes in DNA methylation (187 genetic loci with P<110
Baldassi C.,Polytechnic University of Turin |
Baldassi C.,Human Genetics Foundation Turin |
Zamparo M.,Polytechnic University of Turin |
Zamparo M.,Human Genetics Foundation Turin |
And 8 more authors.
PLoS ONE | Year: 2014
In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino-acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to that achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partners in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code. © 2014 Baldassi et al.
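The core of a Gaussian treatment of sequence data, namely that couplings follow from the inverse of the covariance matrix of the encoded alignment, can be sketched as follows. This is an illustrative simplification and not the authors' implementation: the one-hot encoding with the last state dropped, the ridge term added to the covariance (standing in for the prior or pseudocount a real method would use), the Frobenius-norm score, and all function names are assumptions.

```python
import numpy as np


def gaussian_pair_scores(msa, q, ridge=0.1):
    """Score column pairs of an integer-coded alignment via the inverse covariance.

    msa   : array of shape (n_sequences, length) with entries in 0..q-1
    q     : alphabet size
    ridge : assumed regularization added to the diagonal of the covariance
    """
    msa = np.asarray(msa)
    n_seq, length = msa.shape
    # One-hot encode each column, dropping the last state to avoid exact degeneracy.
    onehot = (msa[:, :, None] == np.arange(q - 1)[None, None, :]).astype(float)
    x = onehot.reshape(n_seq, length * (q - 1))
    cov = np.cov(x, rowvar=False, bias=True) + ridge * np.eye(x.shape[1])
    # For a multivariate Gaussian, the couplings are minus the inverse covariance.
    couplings = -np.linalg.inv(cov)
    blocks = couplings.reshape(length, q - 1, length, q - 1)
    scores = np.sqrt((blocks ** 2).sum(axis=(1, 3)))  # Frobenius norm per column pair
    np.fill_diagonal(scores, 0.0)
    return scores


def top_pair(scores):
    """Return the column pair (i, j), i < j, with the largest score."""
    rows, cols = np.triu_indices(scores.shape[0], k=1)
    best = np.argmax(scores[rows, cols])
    return int(rows[best]), int(cols[best])


if __name__ == "__main__":
    # Toy alignment: columns 0 and 1 always agree, column 2 varies independently.
    toy_msa = [[0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]]
    print(top_pair(gaussian_pair_scores(toy_msa, q=2)))
```

In the toy alignment the two covarying columns receive the only non-negligible coupling, which is the behavior one would expect from a contact predictor of this kind.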
Pardini B.,Human Genetics Foundation Turin |
Rosa F.,Human Genetics Foundation Turin |
Barone E.,University of Pisa |
Di Gaetano C.,Human Genetics Foundation Turin |
And 15 more authors.
Clinical Cancer Research | Year: 2013
Purpose: Colorectal cancer is routinely treated with a 5-fluorouracil (5-FU)-based chemotherapy. 5-FU incorporates into DNA, and the base excision repair (BER) pathway specifically recognizes such damage. We investigated the association of single-nucleotide polymorphisms (SNP) in the 3′-untranslated regions (UTR) of BER genes, and potentially affecting the microRNA (miRNA) binding, on the risk of colorectal cancer, its progression, and prognosis. SNPs in miRNA-binding sites may modulate the posttranscriptional regulation of gene expression operated by miRNAs and explain interindividual variability in BER capacity and response to 5-FU. Experimental Design: We tested 12 SNPs in the 3′-UTRs of five BER genes for colorectal cancer susceptibility in a case-control study (1,098 cases and 1,459 healthy controls). Subsequently, we analyzed the role of these SNPs on clinical outcomes of patients (866 in the Training set and 232 in the Replication set). Results: SNPs in the SMUG1 and NEIL2 genes were associated with overall survival. In particular, SMUG1 rs2233921 TT carriers showed increased survival compared with those with GT/GG genotypes [HR, 0.54; 95% confidence interval (CI), 0.36-0.81; P = 0.003] in the Training set and after pooling results from the Replication set. The association was more significant following stratification for 5-FU-based chemotherapy (P = 5.6 × 10⁻⁵). A reduced expression of the reporter gene for the T allele of rs2233921 was observed when compared with the common G allele by in vitro assay. None of the genotyped BER polymorphisms were associated with colorectal cancer risk. Conclusions: We provide the first evidence that variations in miRNA-binding sites in BER genes 3′-UTR may modulate colorectal cancer prognosis and therapy response. © 2013 American Association for Cancer Research.