Odense, Denmark

Hiller D.,Center for Epigenetics | Wong W.H.,Stanford University
Statistics in Biosciences | Year: 2013

RNA sequencing is a recent technology which has seen an explosion of methods addressing all levels of analysis, from read mapping to transcript assembly to differential expression modeling. In particular the discovery of isoforms at the transcript assembly stage is a complex problem and current approaches suffer from various limitations. For instance, many approaches use graphs to construct a minimal set of isoforms which covers the observed reads, then perform a separate algorithm to quantify the isoforms, which can result in a loss of power. Current methods also use ad-hoc solutions to deal with the vast number of possible isoforms which can be constructed from a given set of reads. Finally, while the need of taking into account features such as read pairing and sampling rate of reads has been acknowledged, most existing methods do not seamlessly integrate these features as part of the model. We present Montebello, an integrated statistical approach which performs simultaneous isoform discovery and quantification by using a Monte Carlo simulation to find the most likely isoform composition leading to a set of observed reads. We compare Montebello to Cufflinks, a popular isoform discovery approach, on a simulated data set and on 46.3 million brain reads from an Illumina tissue panel. On this data set Montebello appears to offer a modest improvement over Cufflinks when considering discovery and parsimony metrics. In addition Montebello mitigates specific difficulties inherent in the Cufflinks approach. Finally, Montebello can be fine-tuned depending on the type of solution desired. © 2012 International Chinese Statistical Association.


McCall M.N.,University of Rochester | Murakami P.N.,Center for Epigenetics | Lukk M.,EMBL EBI Functional Genomics Group | Lukk M.,Cancer Research UK Research Institute | Huber W.,EMBL Genome Biology Unit
BMC Bioinformatics | Year: 2011

Background: Microarray technology has become a widely used tool in the biological sciences. Over the past decade, the number of users has grown exponentially, and with the number of applications and secondary data analyses rapidly increasing, we expect this rate to continue. Various initiatives such as the External RNA Control Consortium (ERCC) and the MicroArray Quality Control (MAQC) project have explored ways to provide standards for the technology. For microarrays to become generally accepted as a reliable technology, statistical methods for assessing quality will be an indispensable component; however, there remains a lack of consensus in both defining and measuring microarray quality. Results: We begin by providing a precise definition of microarray quality and reviewing existing Affymetrix GeneChip quality metrics in light of this definition. We show that the best-performing metrics require multiple arrays to be assessed simultaneously. While such multi-array quality metrics are adequate for bench science, as microarrays begin to be used in clinical settings, single-array quality metrics will be indispensable. To this end, we define a single-array version of one of the best multi-array quality metrics and show that this metric performs as well as the best multi-array metrics. We then use this new quality metric to assess the quality of microarray data available via the Gene Expression Omnibus (GEO) using more than 22,000 Affymetrix HGU133a and HGU133plus2 arrays from 809 studies. Conclusions: We find that approximately 10 percent of these publicly available arrays are of poor quality. Moreover, the quality of microarray measurements varies greatly from hybridization to hybridization, study to study, and lab to lab, with some experiments producing unusable data. Many of the concepts described here are applicable to other high-throughput technologies. © 2011 McCall et al; licensee BioMed Central Ltd.


Feinberg A.,Center for Epigenetics
Genome Medicine | Year: 2014

Andrew Feinberg shares his views on the field of cancer epigenetics, from its beginnings to the most exciting recent findings. © 2014 Feinberg; licensee BioMed Central Ltd.


Lee H.,Center for Epigenetics | Jaffe A.E.,Center for Epigenetics | Feinberg J.I.,Center for Epigenetics | Tryggvadottir R.,Center for Epigenetics | And 7 more authors.
International Journal of Epidemiology | Year: 2012

Background: Gestational age at birth strongly predicts neonatal, adolescent and adult morbidity and mortality through mostly unknown mechanisms. Identification of specific genes that are undergoing regulatory change prior to birth, such as through changes in DNA methylation, would increase our understanding of developmental changes occurring during the third trimester and consequences of pre-term birth (PTB). Methods: We performed a genome-wide analysis of DNA methylation (using microarrays, specifically CHARM 2.0) in 141 newborns collected in Baltimore, MD, using novel statistical methodology to identify genomic regions associated with gestational age at birth. Bisulphite pyrosequencing was used to validate significant differentially methylated regions (DMRs), and real-time PCR was performed to assess functional significance of differential methylation in a subset of newborns. Results: We identified three DMRs at genome-wide significance levels adjacent to the NFIX, RAPGEF2 and MSRB3 genes. All three regions were validated by pyrosequencing, and RAPGEF2 also showed an inverse correlation between DNA methylation levels and gene expression levels. Although the three DMRs appear very dynamic with gestational age in our newborn sample, adult DNA methylation levels at these regions are stable and of equal or greater magnitude than the oldest neonate, directionally consistent with the gestational age results. Conclusions: We have identified three differentially methylated regions associated with gestational age at birth. All three nearby genes play important roles in the development of several organs, including skeletal muscle, brain and haematopoietic system. Therefore, they may provide initial insight into the basis of PTB's negative health outcomes. The genome-wide custom DNA methylation array technology and novel statistical methods employed in this study could constitute a model for epidemiologic studies of epigenetic variation. Published by Oxford University Press on behalf of the International Epidemiological Association © The Author 2012; all rights reserved.


Feinberg A.P.,Center for Epigenetics | Feinberg A.P.,Johns Hopkins University | Irizarry R.A.,Center for Epigenetics | Fradin D.,Center for Epigenetics | And 14 more authors.
Science Translational Medicine | Year: 2010

The epigenome consists of non-sequence-based modifications, such as DNA methylation, that are heritable during cell division and that may affect normal phenotypes and predisposition to disease. Here, we have performed an unbiased genome-scale analysis of ∼4 million CpG sites in 74 individuals with comprehensive array-based relative methylation (CHARM) analysis. We found 227 regions that showed extreme interindividual variability [variably methylated regions (VMRs)] across the genome, which are enriched for developmental genes based on Gene Ontology analysis. Furthermore, half of these VMRs were stable within individuals over an average of 11 years, and these VMRs defined a personalized epigenomic signature. Four of these VMRs showed covariation with body mass index consistently at two study visits and were located in or near genes previously implicated in regulating body weight or diabetes. This work suggests an epigenetic strategy for identifying patients at risk of common disease.
