Orsini M.,CRS4 Bioinformatics |
Travaglione A.,CRS4 Bioinformatics |
Capobianco E.,University of Miami |
Capobianco E.,CNR Institute of Clinical Physiology
Computer Methods and Programs in Biomedicine | Year: 2013
Translational research in cancer genomics assigns a fundamental role to bioinformatics in supporting candidate gene prioritization for both biomarker discovery and target identification in drug development. Efforts in both directions rely on the existence and constant updating of large repositories of gene expression data and omics records obtained from a variety of experiments. Users who interactively interrogate such repositories may have trouble retrieving sample fields with limited associated information, due for instance to incomplete entries or, occasionally, unusable files. Cancer-specific data sources present similar problems. Given that source integration usually improves data quality, one objective is to keep computational complexity low enough to allow optimal assimilation and mining of all the information. In particular, the aim of integrating intraomics data can be to improve the exploration of gene co-expression landscapes, while the aim of integrating interomics sources can be to establish genotype-phenotype associations. Both kinds of integration are relevant to cancer biomarker meta-analysis, as this study demonstrates. Our approach is based on re-annotating cancer-specific data available at the EBI's ArrayExpress repository and building a data warehouse aimed at biomarker discovery and validation studies. Cancer genes are organized by tissue, with biomedical and clinical evidence combined to increase the reproducibility and consistency of results. For better comparative evaluation, multiple queries have been designed to efficiently address all types of experiments and platforms, and to allow retrieval of sample-related information such as cell line, disease state and clinical aspects. © 2013 Elsevier Ireland Ltd.
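The abstract does not describe the warehouse schema or its query interface, so the following is only a minimal sketch of the kind of tissue-based retrieval it mentions, with incomplete entries filtered out. The record layout, the field names (`tissue`, `cell_line`, `disease_state`), the accession labels and both helper functions are assumptions made for illustration; they are not the ArrayExpress schema or the authors' implementation.

```python
from typing import Dict, List, Optional

# Hypothetical sample record: field names are illustrative only.
Sample = Dict[str, Optional[str]]

# Toy records standing in for re-annotated samples (accessions are made up).
SAMPLES: List[Sample] = [
    {"accession": "S1", "tissue": "breast", "cell_line": "MCF-7", "disease_state": "carcinoma"},
    {"accession": "S2", "tissue": "breast", "cell_line": None, "disease_state": "carcinoma"},
    {"accession": "S3", "tissue": "lung", "cell_line": "A549", "disease_state": ""},
]


def missing_fields(sample: Sample, required: List[str]) -> List[str]:
    """Return the required fields that are absent, None, or empty in a sample."""
    return [field for field in required if not sample.get(field)]


def query_samples(samples: List[Sample], tissue: str, required: List[str]) -> List[Sample]:
    """Return samples of the given tissue whose required fields are all filled in."""
    return [
        s for s in samples
        if s.get("tissue") == tissue and not missing_fields(s, required)
    ]


if __name__ == "__main__":
    required = ["cell_line", "disease_state"]
    for s in query_samples(SAMPLES, "breast", required):
        print(s["accession"], s["cell_line"], s["disease_state"])
```

Reporting which fields are missing, rather than silently dropping a sample, is one way to surface the incomplete entries that the abstract identifies as a retrieval problem.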
Capobianco E.,CRS4 Bioinformatics |
Marras E.,CRS4 Bioinformatics |
Travaglione A.,CRS4 Bioinformatics
Statistical Applications in Genetics and Molecular Biology | Year: 2011
Inference methods applied to biological networks face one main criticism: because such networks reflect associations measured under static conditions, an evaluation of the underlying modular organization can be biologically meaningful only if the dynamics are also taken into consideration. The same limitation is present in protein interactome networks. Given the substantial uncertainty characterizing protein interactions, we identify at least three aspects that must be considered for inference purposes: (1) coverage, which for most organisms is only partial; (2) stochasticity, which affects both the high-throughput experimental and the computational settings from which the interactions are determined, and leads to suboptimal measurement accuracy; (3) information variety, due to the heterogeneity of the technological and biological sources generating the data. Consequently, advances in inference methods require adequate treatment of both system uncertainty and dynamical aspects. Feasible solutions are often made possible by data (omic) integration procedures that complement the experimental design and the computational approaches for network modeling. We present a multiscale stochastic approach to deal with protein interactions involved in a well-known signaling network, and show that, based on some topological network features, it is possible to identify timescales (or resolutions) that characterize complex pathways. © 2011 De Gruyter. All rights reserved.