Nova Petrópolis, Brazil

Mendes N.D.,CNRS Biometry and Evolutionary Biology Laboratory | Mendes N.D.,University of Lisbon | Mendes N.D.,French Institute for Research in Computer Science and Automation | Freitas A.T.,University of Lisbon | And 3 more authors.
BMC Genomics

Background: Efforts using computational algorithms towards the enumeration of the full set of miRNAs of an organism have been limited by strong reliance on arguments of precursor conservation and feature similarity. However, miRNA precursors may arise anew or be lost across the evolutionary history of a species and a newly sequenced genome may be evolutionarily too distant from other genomes for an adequate comparative analysis. In addition, the learning of intricate classification rules based purely on features shared by miRNA precursors that are currently known may reflect a perpetuating identification bias rather than a sound means to tell true miRNAs from other genomic stem-loops. Results: We show that there is a strong bias amongst annotated pre-miRNAs towards robust stem-loops in the genomes of Drosophila melanogaster and Anopheles gambiae and we propose a scoring scheme for precursor candidates which combines four robustness measures. Additionally, we identify several known pre-miRNA homologs in the newly sequenced Anopheles darlingi and show that most are found amongst the top-scoring precursor candidates. Furthermore, a comparison of the performance of our approach is made against two single-genome pre-miRNA classification methods. Conclusions: In this paper we present a strategy to sieve through the vast amount of stem-loops found in metazoan genomes in search of pre-miRNAs, significantly reducing the set of candidates while retaining most known miRNA precursors. This approach makes no use of conservation data and relies solely on properties derived from our knowledge of miRNA biogenesis. © 2010 Mendes et al; licensee BioMed Central Ltd.

T'Hoen P.A.C.,Leiden University | T'Hoen P.A.C.,Netherlands Bioinformatics Center | Friedlander M.R.,Center for Genomic Regulation | Friedlander M.R.,University Pompeu Fabra | And 29 more authors.
Nature Biotechnology

RNA sequencing is an increasingly popular technology for genome-wide analysis of transcript sequence and abundance. However, understanding of the sources of technical and interlaboratory variation is still limited. To address this, the GEUVADIS consortium sequenced mRNAs and small RNAs of lymphoblastoid cell lines of 465 individuals in seven sequencing centers, with a large number of replicates. The variation between laboratories appeared to be considerably smaller than the already limited biological variation. Laboratory effects were mainly seen in differences in insert size and GC content and could be adequately corrected for. In small-RNA sequencing, the microRNA (miRNA) content differed widely between samples owing to competitive sequencing of rRNA fragments. This did not affect relative quantification of miRNAs. We conclude that distributing RNA sequencing among different laboratories is feasible, given proper standardization and randomization procedures. We provide a set of quality measures and guidelines for assessing technical biases in RNA-seq data. © 2013 Nature America, Inc.

Efimov D.,Moscow State University | Zaki N.,Bioinformatics Laboratory | Berengueres J.,Media Laboratory
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

High-throughput experimental techniques have made available large datasets of experimentally detected protein-protein interactions. However, experimentally determined protein complex datasets are neither exhaustive nor reliable. Protein complexes play a key role in disease development. Therefore, the identification and characterization of the protein complexes involved is crucial to the understanding of the molecular events under normal and abnormal physiological conditions. In this paper, we propose a novel graph mining algorithm to identify protein complexes. The algorithm first checks the quality of the interaction data, then predicts protein complexes based on the concept of weighted clustering coefficient. To demonstrate the effectiveness of our proposed method, we present experimental results on yeast protein interaction data. The level of accuracy achieved is a strong argument in favor of the proposed method. Novel protein complexes were also predicted to assist biologists in their search for protein complexes. The datasets and programs are freely available from http://faculty.uaeu.ac.ae/nzaki/PE-WCC.htm. Copyright 2012 ACM.
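To make the idea of a weighted clustering coefficient concrete, the following is a minimal sketch that assumes one common definition (the Barrat-style coefficient, in which each closed triangle around a node is weighted by the two edges that touch the node). The abstract does not state which definition the authors use, so the formula, the function names `build_graph` and `weighted_clustering`, and the toy interaction weights are all illustrative assumptions rather than the published method.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected weighted graph as a dict of dicts.

    edges: iterable of (protein_u, protein_v, weight) tuples.
    """
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def weighted_clustering(graph, node):
    """Barrat-style weighted clustering coefficient of one node.

    Assumed definition (not taken from the abstract):
        C(i) = sum over connected neighbour pairs (j, h) of (w_ij + w_ih)
               divided by strength(i) * (degree(i) - 1)
    Nodes with fewer than two neighbours get 0.0.
    """
    neighbours = graph[node]
    degree = len(neighbours)
    if degree < 2:
        return 0.0
    strength = sum(neighbours.values())
    closed = 0.0
    for j, h in combinations(neighbours, 2):
        if h in graph[j]:  # j and h interact, so (node, j, h) is a triangle
            closed += neighbours[j] + neighbours[h]
    return closed / (strength * (degree - 1))


# Hypothetical toy network: a tight triangle A-B-C plus a weak link C-D.
toy = build_graph([
    ("A", "B", 1.0),
    ("B", "C", 1.0),
    ("A", "C", 1.0),
    ("C", "D", 0.5),
])
scores = {protein: weighted_clustering(toy, protein) for protein in toy}
```

In this toy network, proteins embedded in the triangle score higher than the peripheral protein D, which is the kind of signal a complex-prediction method could threshold on.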

Shen W.,Chongqing Medical University | Chen M.,Chongqing Medical University | Wei G.,Chongqing Medical University | Li Y.,Chongqing Medical University | Li Y.,Bioinformatics Laboratory

Predicting miRNAs is an arduous task, due to the diversity of the precursors and the complexity of the enzymatic processes. Although several prediction approaches have reached impressive performance, few of them can achieve a full-function recognition of mature miRNA directly from the candidate hairpins across species. Therefore, researchers continue to seek a more powerful model that is closer to the biological recognition of miRNA structure. In this report, we describe a novel miRNA prediction algorithm, known as FOMmiR, using a fixed-order Markov model based on the secondary structural pattern. For a training dataset containing 809 human pre-miRNAs and 6441 human pseudo-miRNA hairpins, the model's parameters were defined and evaluated. The results showed that FOMmiR reached 91% accuracy on the human dataset through 5-fold cross-validation. Moreover, for the independent test datasets, FOMmiR gave outstanding predictions in human and other species including vertebrates, Drosophila, worms and viruses, and even plants, in contrast to the well-known algorithms and models. Notably, FOMmiR was able not only to distinguish the miRNA precursors from the hairpins, but also to locate the position and strand of the mature miRNA. Therefore, this study provides a new generation of miRNA prediction algorithm, which successfully realizes a full-function recognition of the mature miRNAs directly from the hairpin sequences. It also presents a new understanding of the biological recognition based on the strongest signal's location detected by FOMmiR, which might be closely associated with the enzyme cleavage mechanism during miRNA maturation. © 2012 Shen et al.
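As a rough illustration of how a fixed-order Markov model can separate two classes of hairpins, the sketch below trains one model on positive examples and one on negative examples, then scores a candidate by the log-likelihood ratio. It is not FOMmiR itself: the dot-bracket alphabet, the order of 2, the add-one smoothing, the function names, and the tiny training strings are all assumptions made for the example, and the sketch does not attempt to locate the mature miRNA.

```python
import math
from collections import defaultdict

ALPHABET = "(.)"  # assumed structure symbols (dot-bracket notation)


def train(structures, order=2, pseudocount=1.0):
    """Count symbol transitions given the preceding `order` symbols."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in structures:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1.0
    # Freeze into plain dicts so later lookups never add new keys.
    frozen = {ctx: dict(nxt) for ctx, nxt in counts.items()}
    return {"order": order, "pseudocount": pseudocount, "counts": frozen}


def log_likelihood(model, structure):
    """Log-probability of a structure string under a trained model."""
    order = model["order"]
    pseudo = model["pseudocount"]
    total_lp = 0.0
    for i in range(order, len(structure)):
        nxt = model["counts"].get(structure[i - order:i], {})
        denom = sum(nxt.values()) + pseudo * len(ALPHABET)
        total_lp += math.log((nxt.get(structure[i], 0.0) + pseudo) / denom)
    return total_lp


def score(pos_model, neg_model, structure):
    """Log-likelihood ratio; positive values favour the positive class."""
    return log_likelihood(pos_model, structure) - log_likelihood(neg_model, structure)


# Hypothetical toy training data, far smaller than any real dataset.
pos_model = train(["((((....))))", "(((.....)))"])
neg_model = train(["............", "..(..)..(.)."])

hairpin_score = score(pos_model, neg_model, "((((...))))")
flat_score = score(pos_model, neg_model, "...........")
```

With these toy inputs the clean hairpin receives a positive score and the unstructured string a negative one; a real classifier would need thousands of training structures and a validated decision threshold.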

Hernandez L.G.,Mass Spectrometry Group | Hernandez L.G.,University of Brasilia | Lu B.,Scripps Research Institute | Da Cruz G.C.N.,University of Brasilia | And 7 more authors.
Journal of Proteome Research

A large-scale mapping of the worker honeybee brain proteome was achieved by MudPIT. We identified 2742 proteins from forager and nurse honeybee brain samples; 17% of the total proteins were found to be differentially expressed by spectral count sampling statistics and a G-test. Sequences were compared with the EuKaryotic Orthologous Groups (KOG) catalog set using BLASTX and then categorized into the major KOG categories of most similar sequences. According to this categorization, nurse brain showed increased expression of proteins implicated in translation, ribosomal structure, and biogenesis (14.5%) compared with forager (1.8%). Experienced foragers overexpressed proteins involved in energy production and conversion, showing an extensive difference in this set of proteins (17%) in relation to the nurse subcaste (0.6%). Examples of proteins selectively expressed in each subcaste were analyzed. A comparison between these MudPIT experiments and previous 2-DE experiments revealed nine coincident proteins differentially expressed in both methodologies. © 2011 American Chemical Society.
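For readers unfamiliar with the G-test on spectral counts, here is a minimal sketch of a generic two-sample likelihood-ratio test for a single protein. It assumes the expected counts are the pooled count split in proportion to each sample's total spectra, and it uses the chi-square distribution with one degree of freedom for the p-value. The abstract does not describe the authors' exact procedure, so the function name `g_test`, its parameters, and the example numbers are hypothetical, and no multiple-testing correction is applied.

```python
import math


def g_test(count_a, count_b, total_a, total_b):
    """Two-sample G-test for one protein's spectral counts.

    count_a, count_b: spectra assigned to the protein in samples A and B.
    total_a, total_b: total spectra identified in samples A and B.
    Returns (G statistic, p-value with 1 degree of freedom).
    """
    pooled = count_a + count_b
    if pooled == 0:
        return 0.0, 1.0
    grand_total = total_a + total_b
    expected_a = pooled * total_a / grand_total
    expected_b = pooled * total_b / grand_total
    g = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to G
            g += 2.0 * observed * math.log(observed / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding errors
    # Chi-square survival function for 1 degree of freedom.
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Hypothetical counts: 30 spectra in one sample versus 10 in the other,
# with equal sequencing depth of 1000 spectra per sample.
g_stat, p_val = g_test(30, 10, 1000, 1000)
g_null, p_null = g_test(20, 20, 1000, 1000)
```

Equal counts at equal depth give G = 0, while a threefold difference at these depths gives a G of roughly 10.5, well past the usual 0.05 significance cutoff for one degree of freedom.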
