Jordan I.K.,Georgia Institute of Technology | Jordan I.K.,PanAmerican Bioinformatics Institute
Biology Direct | Year: 2016

Background: The term "Columbian Exchange" refers to the massive transfer of life between the Afro-Eurasian and American hemispheres that was precipitated by Columbus' voyage to the New World. The Columbian Exchange is widely appreciated by historians, social scientists and economists as a major turning point that had profound and lasting effects on the trajectory of human history and development. Presentation of the hypothesis: I propose that the Columbian Exchange should also be appreciated by biologists for its role in the creation of novel human genomes that have been shaped by rapid adaptive evolution. Specifically, I hypothesize that the process of human genome evolution stimulated by the Columbian Exchange was based in part on selective sweeps of introgressed haplotypes from ancestral populations, many of which possessed pre-evolved adaptive utility based on regional-specific fitness and health effects. Testing the hypothesis: Testing of this hypothesis will require comparative analysis of genome sequences from putative ancestral source populations, with genomes from modern admixed populations, in order to identify ancestry-specific introgressed haplotypes that exist at higher frequencies in admixed populations than can be expected by chance alone. Investigation of such ancestry-enriched genomic regions can be used to provide clues as to the functional roles of the genes therein and the selective forces that have acted to increase their frequency in the population. Implications of the hypothesis: Critical interrogation of this hypothesis could serve to underscore the important role of introgression as a source of adaptive alleles and as a driver of evolutionary change, and it would highlight the role of admixture in facilitating rapid human evolution. Reviewers: This article was reviewed by Frank Eisenhaber, Lakshminarayan Iyer and Igor B. Rogozin © 2016 Jordan.
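
The comparison described under "Testing the hypothesis" can be pictured, in its simplest form, as a one-sided binomial test per locus: given the genome-wide fraction of haplotypes from one ancestry, how surprising is the count observed at a given locus? The sketch below is only an illustration under stated assumptions. The locus names, counts and threshold are hypothetical, and a plain binomial null ignores drift, linkage and demographic history, so it should not be read as the author's proposed method.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def enriched_loci(locus_counts, n_haplotypes, genome_wide_fraction, alpha=0.001):
    """List loci whose ancestry-specific haplotype count exceeds chance expectation.

    locus_counts maps a locus name to the number of sampled haplotypes
    (out of n_haplotypes) assigned to the ancestry of interest.
    """
    hits = []
    for locus, count in locus_counts.items():
        p_value = binomial_upper_tail(count, n_haplotypes, genome_wide_fraction)
        if p_value < alpha:
            hits.append((locus, p_value))
    return hits


# Hypothetical data: 100 sampled haplotypes, 30% genome-wide ancestry fraction.
counts = {"locus_A": 32, "locus_B": 58, "locus_C": 29}
hits = enriched_loci(counts, n_haplotypes=100, genome_wide_fraction=0.30)
for locus, p_value in hits:
    print(locus, p_value)
```

In a real analysis the null distribution would more plausibly come from simulation or from the empirical genome-wide distribution of local ancestry, with correction for the number of loci tested.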

Conley A.B.,Georgia Institute of Technology | Jordan I.K.,Georgia Institute of Technology | Jordan I.K.,PanAmerican Bioinformatics Institute
Mobile DNA | Year: 2012

Background: Transposable elements (TEs) encode sequences necessary for their own transposition, including signals required for the termination of transcription. TE sequences within the introns of human genes show an antisense orientation bias, which has been proposed to reflect selection against TE sequences in the sense orientation owing to their ability to terminate the transcription of host gene transcripts. While there is evidence in support of this model for some elements, the extent to which TE sequences actually terminate transcription of human genes across the genome remains an open question. Results: Using high-throughput sequencing data, we have characterized over 9,000 distinct TE-derived sequences that provide transcription termination sites for 5,747 human genes across eight different cell types. Rarefaction curve analysis suggests that there may be twice as many TE-derived termination sites (TE-TTS) genome-wide among all human cell types. The local chromatin environment for these TE-TTS is similar to that seen for 3′ UTR canonical TTS and distinct from the chromatin environment of other intragenic TE sequences. However, those TE-TTS located within the introns of human genes were found to be far more cell type-specific than the canonical TTS. TE-TTS were much more likely to be found in the sense orientation than other intragenic TE sequences of the same TE family, and TE-TTS in the sense orientation terminate transcription more efficiently than those found in the antisense orientation. Alu sequences were found to provide a large number of relatively weak TTS, whereas LTR elements provided a smaller number of much stronger TTS. Conclusions: TE sequences provide numerous termination sites to human genes, and TE-derived TTS are particularly cell type-specific. Thus, TE sequences provide a powerful mechanism for the diversification of transcriptional profiles between cell types and among evolutionary lineages, since most TE-TTS are evolutionarily young. The extent of transcription termination by TEs seen here, along with the preference for sense-oriented TE insertions to provide TTS, is consistent with the observed antisense orientation bias of human TEs. © 2012 Conley and Jordan; licensee BioMed Central Ltd.

Wang J.,Georgia Institute of Technology | Lunyak V.V.,Buck Institute for Age Research | King Jordan I.,Georgia Institute of Technology | King Jordan I.,PanAmerican Bioinformatics Institute
Nucleic Acids Research | Year: 2012

We report on the development of an unsupervised algorithm for the genome-wide discovery and analysis of chromatin signatures. Our Chromatin-profile Alignment followed by Tree-clustering algorithm (ChAT) employs dynamic programming of combinatorial histone modification profiles to identify locally similar chromatin sub-regions and provides complementary utility with respect to existing methods. We applied ChAT to genomic maps of 39 histone modifications in human CD4+ T cells to identify both known and novel chromatin signatures. ChAT was able to detect chromatin signatures previously associated with transcription start sites and enhancers as well as novel signatures associated with a variety of regulatory elements. Promoter-associated signatures discovered with ChAT indicate that complex chromatin signatures, made up of numerous co-located histone modifications, facilitate cell-type specific gene expression. The discovery of novel L1 retrotransposon-associated bivalent chromatin signatures suggests that these elements influence the mono-allelic expression of human genes by shaping the chromatin environment of imprinted genomic regions. Analysis of long gene-associated chromatin signatures points to a role for the H4K20me1 and H3K79me3 histone modifications in transcriptional pause release. The novel chromatin signatures and functional associations uncovered by ChAT underscore the ability of the algorithm to yield novel insight on chromatin-based regulatory mechanisms. © 2012 The Author(s).
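
To make the idea of dynamic programming over combinatorial histone modification profiles concrete, here is a generic Smith-Waterman-style local alignment in which each genomic bin is represented by the set of marks present in it. This is a sketch under assumptions, not ChAT itself: the abstract does not give ChAT's scoring scheme or its tree-clustering step, so the Jaccard-based match score, the `offset` and `gap` values, and the example profiles are all hypothetical.

```python
def jaccard(marks_a, marks_b):
    """Similarity of two sets of histone marks (1.0 when both bins are empty)."""
    if not marks_a and not marks_b:
        return 1.0
    return len(marks_a & marks_b) / len(marks_a | marks_b)


def local_align(profile_x, profile_y, gap=-1.0, offset=0.5):
    """Smith-Waterman-style local alignment of two chromatin profiles.

    Each profile is a list of sets, one set of marks per genomic bin.
    Returns (best_score, end_index_x, end_index_y) with 1-based end indices.
    """
    rows, cols = len(profile_x), len(profile_y)
    score = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = (0.0, 0, 0)
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            # Shift Jaccard so dissimilar bins are penalised (assumed scheme).
            match = jaccard(profile_x[i - 1], profile_y[j - 1]) - offset
            score[i][j] = max(
                0.0,                          # start a new local alignment
                score[i - 1][j - 1] + match,  # align bin i with bin j
                score[i - 1][j] + gap,        # gap in profile_y
                score[i][j - 1] + gap,        # gap in profile_x
            )
            if score[i][j] > best[0]:
                best = (score[i][j], i, j)
    return best


# Hypothetical profiles: the shared promoter-like run should align.
region_x = [{"H3K4me3"}, {"H3K4me3", "H3K27ac"}, {"H3K36me3"}]
region_y = [{"H3K9me3"}, {"H3K4me3"}, {"H3K4me3", "H3K27ac"}]
best = local_align(region_x, region_y)
print(best)
```

The locally similar sub-region is recovered by tracing back from the reported end indices until the score returns to zero; that traceback is omitted here to keep the sketch short.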

Wang J.,Georgia Institute of Technology | Lunyak V.V.,Buck Institute for Age Research | Jordan I.K.,Georgia Institute of Technology | Jordan I.K.,PanAmerican Bioinformatics Institute
Nucleic Acids Research | Year: 2012

Boundary elements partition eukaryotic chromatin into active and repressive domains, and can also block regulatory interactions between domains. Boundary elements act via diverse mechanisms, making accurate feature-based computational predictions difficult. Therefore, we developed an unbiased algorithm that predicts the locations of human boundary elements based on the genomic distributions of chromatin and transcriptional states, as opposed to any intrinsic characteristics that they may possess. Application of our algorithm to ChIP-seq data for histone modifications and RNA Pol II-binding data in human CD4+ T cells resulted in the prediction of 2542 putative chromatin boundary elements genome-wide. Predicted boundary elements display two distinct features: first, position-specific open chromatin and histone acetylation that is coincident with the recruitment of sequence-specific DNA-binding factors such as CTCF, EVI1 and YY1, and second, a directional and gradual increase in histone lysine methylation across predicted boundaries coincident with a gain of expression of non-coding RNAs, including examples of boundaries encoded by tRNA and other non-coding RNA genes. Accordingly, a number of the predicted human boundaries may function via the synergistic action of sequence-specific recruitment of transcription factors leading to non-coding RNA transcriptional interference and the blocking of facultative heterochromatin propagation by transcription-associated chromatin remodeling complexes. © The Author(s) 2011. Published by Oxford University Press.

Conley A.B.,Georgia Institute of Technology | King Jordan I.,Georgia Institute of Technology | King Jordan I.,PanAmerican Bioinformatics Institute
Nucleic Acids Research | Year: 2012

Mammalian genomes encode numerous cis-natural antisense transcripts (cis-NATs). The extent to which these cis-NATs are actively regulated and ultimately functionally relevant, as opposed to transcriptional noise, remains a matter of debate. To address this issue, we analyzed the chromatin environment and RNA Pol II binding properties of human cis-NAT promoters genome-wide. Cap analysis of gene expression data were used to identify thousands of cis-NAT promoters, and profiles of nine histone modifications and RNA Pol II binding for these promoters in ENCODE cell types were analyzed using chromatin immunoprecipitation followed by sequencing (ChIP-seq) data. Active cis-NAT promoters are enriched with activating histone modifications and occupied by RNA Pol II, whereas weak cis-NAT promoters are depleted for both activating modifications and RNA Pol II. The enrichment levels of activating histone modifications and RNA Pol II binding show peaks centered around cis-NAT transcriptional start sites, and the levels of activating histone modifications at cis-NAT promoters are positively correlated with cis-NAT expression levels. Cis-NAT promoters also show highly tissue-specific patterns of expression. These results suggest that human cis-NATs are actively transcribed by the RNA Pol II and that their expression is epigenetically regulated, prerequisites for a functional potential for many of these non-coding RNAs. © 2012 The Author(s).

Garzon-Martinez G.A.,Colombian Corporation for Agricultural Research CORPOICA | Zhu Z.I.,U.S. National Center for Biotechnology Information | Landsman D.,U.S. National Center for Biotechnology Information | Barrero L.S.,Colombian Corporation for Agricultural Research CORPOICA | And 4 more authors.
BMC Genomics | Year: 2012

Background: Physalis peruviana, commonly known as Cape gooseberry, is a member of the Solanaceae family that has an increasing popularity due to its nutritional and medicinal values. A broad range of genomic tools is available for other Solanaceae, including tomato and potato. However, limited genomic resources are currently available for Cape gooseberry. Results: We report the generation of a total of 652,614 P. peruviana Expressed Sequence Tags (ESTs), using 454 GS FLX Titanium technology. ESTs, with an average length of 371 bp, were obtained from a normalized leaf cDNA library prepared using a Colombian commercial variety. De novo assembly was performed to generate a collection of 24,014 isotigs and 110,921 singletons, with an average length of 1,638 bp and 354 bp, respectively. Functional annotation was performed using NCBI's BLAST tools and Blast2GO, which identified putative functions for 21,191 assembled sequences, including gene families involved in all the major biological processes and molecular functions as well as defense response and amino acid metabolism pathways. Gene model predictions in P. peruviana were obtained by using the genomes of Solanum lycopersicum (tomato) and Solanum tuberosum (potato). We predict 9,436 P. peruviana sequences with multiple-exon models and conserved intron positions with respect to the potato and tomato genomes. Additionally, to study species diversity we developed 5,971 SSR markers from assembled ESTs. Conclusions: We present the first comprehensive analysis of the Physalis peruviana leaf transcriptome, which will provide valuable resources for development of genetic tools in the species. Assembled transcripts with gene models could serve as potential candidates for marker discovery with a variety of applications including: functional diversity, conservation and improvement to increase productivity and fruit quality. P. peruviana was estimated to be phylogenetically branched out before the divergence of five other Solanaceae family members, S. lycopersicum, S. tuberosum, Capsicum spp, S. melongena and Petunia spp. © 2012 Garzón-Martínez et al; licensee BioMed Central Ltd.

Zhao Y.-Q.,Buck Institute for Research on Aging | Zhao Y.-Q.,Wuhan University | Jordan I.K.,Georgia Institute of Technology | Jordan I.K.,PanAmerican Bioinformatics Institute | Lunyak V.V.,Buck Institute for Research on Aging
Neurotherapeutics | Year: 2013

This review highlights recent discoveries that have shaped the emerging viewpoints in the field of epigenetic influences in the central nervous system (CNS), focusing on the following questions: i) How is the CNS shaped during development when precursor cells transition into morphologically and molecularly distinct cell types, and is this event driven by epigenetic alterations?; ii) How do epigenetic pathways control CNS function?; iii) What happens to "epigenetic memory" during aging processes, and do these alterations cause CNS dysfunction?; iv) Can one restore normal CNS function by manipulating the epigenome using pharmacologic agents, and will this ameliorate aging-related neurodegeneration? These and other still unanswered questions remain critical to understanding the impact of multifaceted epigenetic machinery on the age-related dysfunction of CNS. © 2013 The American Society for Experimental NeuroTherapeutics, Inc.

Wang J.,Georgia Institute of Technology | Lunyak V.V.,Buck Institute for Age Research | King Jordan I.,Georgia Institute of Technology | King Jordan I.,PanAmerican Bioinformatics Institute
Bioinformatics | Year: 2013

Although some histone modification chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) signals show abrupt peaks across narrow and specific genomic locations, others have diffuse distributions along chromosomes, and their large contiguous enrichment landscapes are better modeled as broad peaks. Here, we present BroadPeak, an algorithm for the identification of such broad peaks from diffuse ChIP-seq datasets. We show that BroadPeak is a linear time algorithm that requires only two parameters, and we validate its performance on real and simulated histone modification ChIP-seq datasets. BroadPeak calls peaks that are highly coincident with both the underlying ChIP-seq tag count distributions and relevant biological features, such as the gene bodies of actively transcribed genes, and it shows superior overall recall and precision of known broad peaks from simulated datasets. Availability: The source code and documentation are available at http://jordan.biology.gatech.edu/page/software/broadpeak/. © 2013 The Author.

Rishishwar L.,Georgia Institute of Technology | Rishishwar L.,PanAmerican Bioinformatics Institute | Petit III R.A.,Emory University | Kraft C.S.,Emory University | And 2 more authors.
Journal of Bacteriology | Year: 2014

Vancomycin is the mainstay of treatment for patients with Staphylococcus aureus infections, and reduced susceptibility to vancomycin is becoming increasingly common. Accordingly, the development of rapid and accurate assays for the diagnosis of vancomycin-intermediate S. aureus (VISA) will be critical. We developed and applied a genome-based machine-learning approach for discrimination between VISA and vancomycin-susceptible S. aureus (VSSA) using 25 whole-genome sequences. The resulting machine-learning model, based on 14 gene parameters, including 3 molecular typing markers and 11 genes implicated in reduced vancomycin susceptibility, is able to unambiguously distinguish between the VISA and VSSA isolates analyzed here despite the fact that they do not form evolutionarily distinct groups. As such, the model is able to discriminate based on specific genomic markers of antibiotic susceptibility rather than overall sequence relatedness. Subsequent evaluation of the model using leave-one-out validation yielded a classification accuracy of 84%. The machine-learning approach described here provides a generalized framework for the application of genome sequence analysis to the classification of bacteria that differ with respect to clinically relevant phenotypes and should be particularly useful in defining the genomic features that underlie antibiotic resistance. © 2014, American Society for Microbiology.
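
Leave-one-out validation, as mentioned above, trains the model on all isolates but one, predicts the held-out isolate, and repeats this for every isolate. The sketch below shows the procedure with a nearest-centroid classifier. The classifier choice, the two-feature toy data and the function names are assumptions made for illustration; the abstract does not state which learning method or feature encoding the authors used.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Predict the label whose training centroid is closest to the query."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        totals = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            totals[k] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        return sum((s / counts[label] - q) ** 2 for s, q in zip(sums[label], query))

    # sorted() makes tie-breaking deterministic.
    return min(sorted(sums), key=squared_distance)


def leave_one_out_accuracy(samples, labels):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for held_out in range(len(samples)):
        train_x = samples[:held_out] + samples[held_out + 1:]
        train_y = labels[:held_out] + labels[held_out + 1:]
        predicted = nearest_centroid_predict(train_x, train_y, samples[held_out])
        if predicted == labels[held_out]:
            correct += 1
    return correct / len(samples)


# Hypothetical two-feature encodings of six isolates.
samples = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
labels = ["VSSA", "VSSA", "VSSA", "VISA", "VISA", "VISA"]
accuracy = leave_one_out_accuracy(samples, labels)
print(accuracy)
```

For scale: if each of the 25 genomes was held out exactly once, an accuracy of 84% would correspond to 21 correct calls out of 25.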

Jjingo D.,Georgia Institute of Technology | Conley A.B.,Georgia Institute of Technology | Yi S.V.,Georgia Institute of Technology | Lunyak V.V.,Buck Institute for Research on Aging | And 2 more authors.
Oncotarget | Year: 2012

DNA methylation of promoter sequences is a repressive epigenetic mark that down-regulates gene expression. However, DNA methylation is more prevalent within gene-bodies than seen for promoters, and gene-body methylation has been observed to be positively correlated with gene expression levels. This paradox remains unexplained, and accordingly the role of DNA methylation in gene-bodies is poorly understood. We addressed the presence and role of human gene-body DNA methylation using a meta-analysis of human genome-wide methylation, expression and chromatin data sets. Methylation is associated with transcribed regions as genic sequences have higher levels of methylation than intergenic or promoter sequences. We also find that the relationship between gene-body DNA methylation and expression levels is non-monotonic and bell-shaped. Mid-level expressed genes have the highest levels of gene-body methylation, whereas the most lowly and highly expressed sets of genes both have low levels of methylation. While gene-body methylation can be seen to efficiently repress the initiation of intragenic transcription, the vast majority of methylated sites within genes are not associated with intragenic promoters. In fact, highly expressed genes initiate the most intragenic transcription, which is inconsistent with the previously held notion that gene-body methylation serves to repress spurious intragenic transcription to allow for efficient transcriptional elongation. These observations lead us to propose a model to explain the presence of human gene-body methylation. This model holds that the repression of intragenic transcription by gene-body methylation is largely epiphenomenal, and suggests that gene-body methylation levels are predominantly shaped via the accessibility of the DNA to methylating enzyme complexes. © Jjingo et al.
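
A non-monotonic, bell-shaped relationship of this kind can be checked by ranking genes by expression, splitting them into equal-sized bins, and comparing mean gene-body methylation across bins. The following sketch assumes a hypothetical list of (expression, methylation) pairs and an arbitrary choice of three bins; it illustrates the binning idea only and is not the meta-analysis pipeline used in the study.

```python
def mean_methylation_by_expression_bin(genes, n_bins=3):
    """Rank genes by expression and return mean methylation per bin.

    genes is a list of (expression_level, gene_body_methylation) pairs and
    must contain at least n_bins entries.
    """
    ranked = sorted(genes, key=lambda gene: gene[0])
    bin_size = len(ranked) / n_bins
    means = []
    for b in range(n_bins):
        chunk = ranked[int(b * bin_size):int((b + 1) * bin_size)]
        means.append(sum(methylation for _, methylation in chunk) / len(chunk))
    return means


# Hypothetical values chosen so that mid-level expression has the highest methylation.
genes = [(1, 0.2), (2, 0.3), (10, 0.8), (12, 0.7), (100, 0.3), (150, 0.2)]
low, mid, high = mean_methylation_by_expression_bin(genes)
print(low, mid, high)
```

A bell-shaped pattern shows up as the middle bin exceeding both the lowest and highest expression bins; with real data, more bins and a measure of uncertainty per bin would be needed before drawing that conclusion.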
