Galveston, TX, United States

Ju H.,University of Texas Medical Branch | Ju H.,Institute for Translational science | Brasier A.R.,Sealy Center for Molecular Medicine | Brasier A.R.,Institute for Translational science
BMC Research Notes | Year: 2013

Background: The choice of selection methods to identify important variables for binary classification modeling is critical to producing stable models that are interpretable, generate accurate predictions, and have minimum bias. This work is motivated by data on clinical and laboratory features of severe dengue infections (dengue hemorrhagic fever, DHF) obtained from 51 individuals enrolled in a prospective observational study of acute human dengue infections. Results: We carry out a comprehensive performance comparison of several classification models for DHF on the dengue data set. We compare variable selection results from Multivariate Adaptive Regression Splines, Learning Ensemble, Random Forest, Bayesian Model Averaging, Stochastic Search Variable Selection, and Generalized Regularized Logistic Regression. Model averaging methods (bagging, boosting and ensemble learners) have higher accuracy, but the generalized regularized regression model has the highest predictive power because the linearity assumptions of candidate predictors are strongly satisfied according to deviance chi-square testing procedures. Bootstrapping is applied to evaluate the predictive regression coefficients of the regularized regression model. Conclusions: Feature reduction methods introduce inherent biases and are therefore data-type dependent. We propose that these limitations can be overcome using an exhaustive approach for searching feature space. Using this approach, our results suggest that IL-10, platelet and lymphocyte counts are the major features for predicting DHF on the basis of blood chemistries and cytokine measurements. © 2013 Ju and Brasier; licensee BioMed Central Ltd.
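The bootstrap evaluation of regression coefficients mentioned in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the authors' method or data: the toy data are synthetic, the model is a one-predictor logistic regression with a ridge (L2) penalty fitted by plain gradient descent, and the names fit_ridge_logistic and bootstrap_coef_interval are hypothetical.

```python
import math
import random

def fit_ridge_logistic(xs, ys, lam=0.1, lr=0.5, steps=200):
    """Fit y ~ sigmoid(b0 + b1 * x) by gradient descent, with an L2 penalty on b1."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += p - y
            g1 += (p - y) * x
        b0 -= lr * g0 / n
        b1 -= lr * (g1 / n + lam * b1)
    return b0, b1

def bootstrap_coef_interval(xs, ys, n_boot=100, seed=0):
    """Percentile interval (about 95%) for the slope b1 from resampled refits."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        # Resample subjects with replacement, then refit the penalized model.
        idx = [rng.randrange(n) for _ in range(n)]
        _, b1 = fit_ridge_logistic([xs[i] for i in idx], [ys[i] for i in idx])
        slopes.append(b1)
    slopes.sort()
    return slopes[int(0.025 * n_boot)], slopes[int(0.975 * n_boot) - 1]

# Synthetic toy data (not the dengue data set): a standardized predictor
# that is negative for class 0 and positive for class 1.
xs = [-2.0 + 0.1 * i for i in range(20)] + [0.1 + 0.1 * i for i in range(20)]
ys = [0] * 20 + [1] * 20

lo, hi = bootstrap_coef_interval(xs, ys)
print(f"bootstrap interval for slope: [{lo:.3f}, {hi:.3f}]")
```

An interval that excludes zero suggests the predictor's coefficient is stable under resampling; with real data, the penalty strength and the number of bootstrap replicates would need to be chosen with care.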


Zhao Y.,Sealy Center for Molecular Medicine | Brasier A.R.,Sealy Center for Molecular Medicine | Brasier A.R.,University of Texas Medical Branch
Current Proteomics | Year: 2011

Recent advances in global-scale proteomic technology enable identification of hundreds of candidate biomarkers. However, very few candidates so identified can reach the high bar of FDA approval for clinical use. The low efficiency of biomarker approval reflects the challenges of taking candidate biomarkers identified in discovery research through the long and difficult pipeline required for biomarker development. The greatest challenge in biomarker development is the lack of reliable assays for use in the verification and validation phases. This paper reviews methodologies and challenges for biomarker assay development with emphasis on stable isotope dilution coupled with multiple reaction monitoring-mass spectrometry (SID-MRM-MS). Because of its sensitivity, quantification abilities, and specificity, SID-MRM-MS has the potential to bridge the critical rate-limiting gaps between the biomarker discovery and validation phases. A workflow for generation of a specific SID-MRM-MS assay is presented. We conclude that the SID-MRM-MS assay is currently a promising technology for biomarker verification and validation. To move the technology toward an FDA-approvable platform, more stringent evaluation must be performed, and these future studies will require a joint effort of the clinical proteomics community, the regulatory agency and major mass spectrometer manufacturers. © 2011 Bentham Science Publishers Ltd.
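As a rough illustration of the quantification principle behind stable isotope dilution, the sketch below estimates the amount of a native ("light") peptide from its peak-area ratio to a spiked, isotope-labeled ("heavy") standard of known amount. This single-point ratio calculation is a simplifying assumption, not the workflow described in the paper; the function name sid_quantify and the example numbers are hypothetical.

```python
def sid_quantify(light_area, heavy_area, heavy_spiked_fmol):
    """Estimate native peptide amount (fmol) from the light/heavy peak-area ratio.

    Assumes the light and heavy forms respond identically in the instrument,
    so the area ratio equals the molar ratio.
    """
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    return light_area / heavy_area * heavy_spiked_fmol

# Hypothetical example: light peak twice as large as a 50 fmol heavy spike.
amount = sid_quantify(light_area=2.0e5, heavy_area=1.0e5, heavy_spiked_fmol=50.0)
print(amount)  # 100.0
```

In practice, assays of this kind are usually calibrated over a range of concentrations and several transitions per peptide, which this sketch omits.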


Bertolusso R.,Rice University | Tian B.,University of Texas Medical Branch | Zhao Y.,University of Texas Medical Branch | Zhao Y.,Sealy Center for Molecular Medicine | And 10 more authors.
PLoS ONE | Year: 2014

We present an integrated dynamical cross-talk model of the epithelial innate immune response (IIR) incorporating RIG-I and TLR3 as the two major pattern recognition receptors (PRR) converging on the RelA and IRF3 transcriptional effectors. bioPN simulations reproduce biologically relevant gene and protein abundance measurements in response to time course, gene silencing and dose-response perturbations at both the population and single-cell level. Our computational predictions suggest that RelA and IRF3 are under auto- and cross-regulation. We predict, and confirm experimentally, that RIG-I mRNA expression is controlled by IRF7. We also predict the existence of a TLR3-dependent, IRF3-independent transcription factor (or factors) that control(s) expression of MAVS, IRF3 and members of the IKK family. Our model confirms the observed dsRNA dose-dependence of oscillatory patterns in single cells, with periods of 1-3 hr. Model fitting to time series, matched by knockdown data, suggests that the NF-κB module operates in a different regime (with different coefficient values) than in the TNFα-stimulation experiments. In future studies, this model will serve as a foundation for identification of virus-encoded IIR antagonists and examination of stochastic effects of viral replication. Our model generates simulated time series, which reproduce the noisy oscillatory patterns of activity (with 1-3 hour period) observed in individual cells. Our work supports the hypothesis that the IIR is a phenomenon that emerged by evolution despite highly variable responses at an individual cell level. © 2014 Bertolusso et al.


Spratt H.,University of Texas Medical Branch | Spratt H.,Sealy Center for Molecular Medicine | Spratt H.,Institute for Translational science | Ju H.,University of Texas Medical Branch | And 3 more authors.
Methods | Year: 2013

Biological experiments in the post-genome era can generate a staggering amount of complex data that challenges experimentalists to extract meaningful information. Increasingly, the success of an appropriately controlled experiment relies on a robust data analysis pipeline. In this paper, we present a structured approach to the analysis of multidimensional data that relies on a close, two-way communication between the bioinformatician and experimentalist. A sequential approach employing data exploration (visualization, graphical and analytical study), pre-processing, feature reduction and supervised classification using machine learning is presented. This standardized approach is illustrated by an example from a proteomic data analysis that has been used to predict the risk of infectious disease outcome. Strategies for model selection and post hoc model diagnostics are presented and applied to the case illustration. We discuss some of the practical lessons we have learned applying supervised classification to multidimensional data sets, one of which is the importance of feature reduction in achieving optimal modeling performance. © 2013 Elsevier Inc.


Brasier A.R.,University of Texas Medical Branch | Brasier A.R.,Sealy Center for Molecular Medicine | Brasier A.R.,Institute for Translational science | Zhao Y.,University of Texas Medical Branch | And 14 more authors.
Journal of Clinical Virology | Year: 2015

Objectives: Dengue virus (DENV) infection is a significant risk to over a third of the human population and causes a wide spectrum of illness, ranging from sub-clinical disease, through an intermediate syndrome of vascular complications called dengue fever complicated (DFC), to severe dengue hemorrhagic fever (DHF). Methods for discriminating outcomes will impact clinical trials and understanding of disease pathophysiology. Study design: We integrated a proteomics discovery pipeline with a heuristics approach to develop a molecular classifier to identify an intermediate phenotype of DENV-3 infectious outcome. Results: 121 differentially expressed proteins were identified in plasma from DHF vs dengue fever (DF), and informative candidates were selected using nonparametric statistics. These were combined with markers that measure complement activation, acute phase response, cellular leak, granulocyte differentiation and viral load. From this, we applied quantitative proteomics to select a 15-member panel of proteins that accurately predicted DF, DHF, and DFC using a random forest classifier. The classifier primarily relied on acute phase (A2M), complement (CFD), platelet counts and cellular leak (TPM4) to produce an 86% accuracy of prediction with an area under the receiver operating characteristic curve of >0.9 for DHF and DFC vs DF. Conclusions: Integrating discovery and heuristic approaches to sample distinct pathophysiological processes is a powerful approach in infectious disease. Early detection of intermediate outcomes of DENV-3 will speed clinical trials evaluating vaccines or drug interventions. © 2015 Elsevier B.V.
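The area under the receiver operating characteristic curve reported in this abstract can be computed for any binary comparison from classifier scores and true labels. The sketch below uses the pairwise (Mann-Whitney) definition: the fraction of positive/negative pairs in which the positive case scores higher, with ties counted as one half. The function name roc_auc and the example scores are hypothetical and unrelated to the study data; a multi-class problem such as DF, DHF and DFC would need one such comparison per class grouping.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical scores: one of the four positive/negative pairs is misordered.
auc = roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
print(auc)  # 0.75
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of the size typical in such studies; a rank-based formula gives the same value more efficiently.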
