Silva L.L.,Institute Nacional Of Ciencia E Tecnologia Em Doencas Tropicais Fundacao Oswaldo Cruz Fiocruz | Silva L.L.,Federal University of Minas Gerais | Marcet-Houben M.,Center for Genomic Regulation | Marcet-Houben M.,University Pompeu Fabra | And 5 more authors.
BMC Genomics | Year: 2012

Background: Schistosoma mansoni is one of the causative agents of schistosomiasis, a neglected tropical disease that affects about 237 million people worldwide. Despite recent efforts, we still lack a general understanding of the relevant host-parasite interactions, and the possible treatments are limited by the emergence of resistant strains and the absence of a vaccine. The S. mansoni genome has been completely sequenced and is still under continuous annotation. Nevertheless, more than 45% of the encoded proteins remain without experimental characterization or even functional prediction. To improve our knowledge of the biology of this parasite, we conducted a proteome-wide evolutionary analysis to provide a broad view of S. mansoni proteome evolution and to improve its functional annotation.
Results: Using a phylogenomic approach, we reconstructed the S. mansoni phylome, which comprises the evolutionary histories of all parasite proteins and their homologs across 12 other organisms. The analysis of a total of 7,964 phylogenies allowed a deeper understanding of genomic complexity and evolutionary adaptations to a parasitic lifestyle. In particular, the identification of lineage-specific gene duplications pointed to the diversification of several protein families that are relevant for host-parasite interaction, including proteases, tetraspanins, fucosyltransferases, venom allergen-like proteins, and tegumental-allergen-like proteins. In addition to the evolutionary knowledge, the phylome data enabled us to automatically re-annotate 3,451 proteins through a phylogeny-based approach rather than sequence similarity searches alone. To allow further exploitation of these data, all information has been made available at PhylomeDB (http://www.phylomedb.org).
Conclusions: In this study, we used an evolutionary approach to assess S. mansoni parasite biology, improve genome/proteome functional annotation, and provide insights into host-parasite interactions. Taking advantage of a proteome-wide perspective rather than focusing on individual proteins, we found that this parasite has experienced specific gene duplication events, particularly affecting genes that are potentially related to the parasitic lifestyle. These innovations may be related to the mechanisms that protect S. mansoni against host immune responses and may be important adaptations for parasite survival in a potentially hostile environment. Continuing this work, a comparative analysis involving genomic, transcriptomic, and proteomic data from other helminth parasites, other parasites, and vectors will supply more information regarding parasite biology as well as host-parasite interactions. © 2012 Silva et al.; licensee BioMed Central Ltd.

Zerlotini A.,National Institute for Science and Technology in Tropical Diseases FIOCRUZ Minas | Zerlotini A.,Laboratorio Multiusuario Of Bioinformatica | Aguiar E.R.G.R.,National Institute for Science and Technology in Tropical Diseases FIOCRUZ Minas | Yu F.,Shanghai Center for Bioinformation Technology | And 10 more authors.
Nucleic Acids Research | Year: 2013

The new release of SchistoDB (http://SchistoDB.net) provides a rich resource of genomic data for key blood flukes (genus Schistosoma), which cause disease in hundreds of millions of people worldwide. SchistoDB integrates whole-genome sequence and annotation of three species of the genus and provides enhanced bioinformatics analyses and data-mining tools. A simple yet comprehensive web interface provided through the Strategies Web Development Kit is available for the mining and visualization of the data. Genomic-scale data can be queried based on BLAST searches, annotation keywords and gene ID searches, gene ontology terms, sequence motifs, protein characteristics and phylogenetic relationships. Search strategies can be saved within a user's profile for future retrieval and may also be shared with other researchers using a unique web address. © The Author(s) 2012.

Aguiar E.R.G.R.,Federal University of Minas Gerais | Aguiar E.R.G.R.,French National Center for Scientific Research | Olmo R.P.,Federal University of Minas Gerais | Olmo R.P.,French National Center for Scientific Research | And 13 more authors.
Nucleic Acids Research | Year: 2015

Virus surveillance in vector insects is potentially of great benefit to public health. Large-scale sequencing of small and long RNAs has previously been used to detect viruses, but without any formal comparison of different strategies. Furthermore, the identification of viral sequences largely depends on similarity searches against reference databases. Here, we developed a sequence-independent strategy based on virus-derived small RNAs produced by the host response, such as the RNA interference pathway. In insects, we compared sequences of small and long RNAs, demonstrating that viral sequences are enriched in the small RNA fraction. We also noted that the small RNA size profile is a unique signature for each virus and can be used to identify novel viral sequences without known relatives in reference databases. Using this strategy, we characterized six novel viruses in the viromes of laboratory fruit flies and wild populations of two insect vectors: mosquitoes and sandflies. We also show that the small RNA profile could be used to infer viral tropism for ovaries among other aspects of virus biology. Additionally, our results suggest that virus detection utilizing small RNAs can also be applied to vertebrates, although not as efficiently as to plants and insects. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
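
The idea of a small RNA size profile as a per-sequence signature can be illustrated with a minimal sketch. The abstract does not state how the authors computed or compared profiles, so everything below is an assumption made for illustration: the 18-30 nt window, the function names `size_profile` and `pearson`, and the use of Pearson correlation as the similarity measure.

```python
from collections import Counter
from math import sqrt


def size_profile(reads, min_len=18, max_len=30):
    """Fraction of reads at each length in [min_len, max_len].

    `reads` is an iterable of small RNA sequences assumed to map to one
    contig. The length window is an illustrative choice, not a value
    taken from the abstract.
    """
    counts = Counter(len(r) for r in reads if min_len <= len(r) <= max_len)
    total = sum(counts.values())
    return [counts[n] / total if total else 0.0
            for n in range(min_len, max_len + 1)]


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else float("nan")


if __name__ == "__main__":
    # Toy reads: one contig dominated by 21 nt reads, one by longer reads.
    contig_a = ["A" * 21] * 30 + ["A" * 22] * 5
    contig_b = ["A" * 27] * 20 + ["A" * 28] * 10
    print(pearson(size_profile(contig_a), size_profile(contig_b)))
```

Under these assumptions, contigs whose profiles correlate strongly could be grouped as candidates for the same viral origin without any database similarity search.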

Lobo F.P.,Laboratorio Multiusuario Of Bioinformatica | Hilario H.O.,Federal University of Minas Gerais | Souza R.A.,Fundacao Ezequiel Dias | Tauch A.,Bielefeld University | And 2 more authors.
Nucleic Acids Research | Year: 2012

Enrichment analysis is a standard procedure to interpret 'omics' experiments that generate large gene lists as outputs, such as transcriptomics and proteomics. However, despite the huge success of enrichment analysis in these classes of experiments, there is a surprising lack of application of this methodology to survey other categories of large-scale biological data available. Here, we report Kegg Orthology enrichMent-Online DetectiOn (KOMODO), a web tool to systematically investigate groups of monophyletic genomes in order to detect significantly enriched groups of homologous genes in one taxon when compared with another. The results are displayed in their proper biochemical roles in a visual, explorative way, allowing users to easily formulate and investigate biological hypotheses regarding the taxonomic distribution of genomic elements. We validated KOMODO by analyzing portions of central carbon metabolism in two taxa extensively studied regarding their carbon metabolism profile (Enterobacteriaceae family and Lactobacillales order). Most significantly biased enzymatic activities were related to known key metabolic traits in these taxa, such as the distinct fates of pyruvate (the known tendency of lactate production in Lactobacillales and its complete oxidation in Enterobacteriaceae), demonstrating that KOMODO could detect biologically meaningful differences in the frequencies of shared genomic elements among taxa. KOMODO is freely available at http://komodotool.org. © 2012 The Author(s).
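
As an illustration of what "significantly enriched in one taxon when compared with another" can mean, the sketch below runs a generic one-sided hypergeometric (Fisher-type) over-representation test on presence/absence counts. The abstract does not say which statistical test KOMODO uses, so this is not a description of the tool; the function name `enrichment_p_value` and the toy counts are hypothetical.

```python
from math import comb


def enrichment_p_value(hits_a, size_a, hits_b, size_b):
    """One-sided hypergeometric p-value for over-representation in taxon A.

    hits_a / hits_b: number of genomes in taxon A / B that carry the
    homologous group of interest; size_a / size_b: genomes per taxon.
    Returns P(X >= hits_a) when size_a genomes are drawn without
    replacement from the pooled genomes.
    """
    total = size_a + size_b
    total_hits = hits_a + hits_b
    upper = min(size_a, total_hits)
    # math.comb(n, k) returns 0 when k > n, so impossible tables add nothing.
    tail = sum(
        comb(total_hits, k) * comb(total - total_hits, size_a - k)
        for k in range(hits_a, upper + 1)
    )
    return tail / comb(total, size_a)


if __name__ == "__main__":
    # Toy counts: group present in 18 of 20 genomes of A and 3 of 25 of B.
    print(enrichment_p_value(18, 20, 3, 25))
```

A real analysis across many homologous groups would also need a multiple-testing correction, which this sketch omits.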

Coutinho T.J.D.,Federal University of Minas Gerais | Franco G.R.,Federal University of Minas Gerais | Lobo F.P.,Laboratorio Multiusuario Of Bioinformatica
Computational and Structural Biotechnology Journal | Year: 2015

A mainstream procedure to analyze the wealth of genomic data available nowadays is the detection of homologous regions shared across genomes, followed by the extraction of biological information from the patterns of conservation and variation observed in such regions. Although of pivotal importance, comparative genomic procedures that rely on homology inference are obviously not applicable if no homologous regions are detectable. This fact excludes a considerable portion of "genomic dark matter" with no significant similarity - and, consequently, no inferred homology to any other known sequence - from several downstream comparative genomic methods. In this review, we compile several sequence metrics that do not rely on homology inference and can be used to compare nucleotide sequences and extract biologically meaningful information from them. These metrics comprise several compositional parameters calculated from sequence data alone, such as GC content, dinucleotide odds ratio, and several codon bias metrics. They also share other interesting properties, such as pervasiveness (patterns persist on smaller scales) and phylogenetic signal. We also cite examples where these homology-independent metrics have been successfully applied to support several bioinformatics challenges, such as taxonomic classification of biological sequences without homology inference. They were also used to detect higher-order patterns of interactions in biological systems, ranging from detecting coevolutionary trends between the genomes of viruses and their hosts to characterization of gene pools of entire microbial communities. We argue that, if correctly understood and applied, homology-independent metrics can add important layers of biological information in comparative genomic studies without prior homology inference. © 2015 The Authors.
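
Two of the compositional metrics named above, GC content and the dinucleotide odds ratio, can be computed from sequence alone, as the following minimal sketch shows. It uses the plain single-strand odds ratio, rho_xy = f_xy / (f_x * f_y); the strand-symmetrized variant often used in the literature is not implemented. The function names and the example sequence are illustrative assumptions.

```python
from collections import Counter

BASES = "ACGT"


def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in BASES)
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def dinucleotide_odds_ratio(seq):
    """Observed/expected frequency ratio for each of the 16 dinucleotides.

    Expected frequency is the product of the two mononucleotide
    frequencies. Dinucleotides containing ambiguous bases are skipped.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    n_mono = sum(mono.values())
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_di = sum(di.values())
    ratios = {}
    for x in BASES:
        for y in BASES:
            expected = (mono[x] / n_mono) * (mono[y] / n_mono) if n_mono else 0.0
            observed = di[x + y] / n_di if n_di else 0.0
            ratios[x + y] = observed / expected if expected > 0 else float("nan")
    return ratios


if __name__ == "__main__":
    example = "ATGCGCGATATTAGCGCGCTATAT"  # arbitrary toy sequence
    print(gc_content(example))
    print(dinucleotide_odds_ratio(example)["CG"])
```

A ratio well below 1 marks an under-represented dinucleotide and a ratio well above 1 an over-represented one; the vector of 16 ratios is the kind of homology-independent signature that can be compared between sequences.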

