Baurley J.W.,Binus University |
Baurley J.W.,Biorealm |
Conti D.V.,University of Southern California
BMC Bioinformatics | Year: 2013
Background: Testing for marginal associations between numerous genetic variants and disease may miss complex relationships among variables (e.g., gene-gene interactions). Bayesian approaches can model multiple variables together and offer advantages over conventional model building strategies, including using existing biological evidence as modeling priors and acknowledging that many models may fit the data well. With many candidate variables, Bayesian approaches to variable selection rely on algorithms to approximate the posterior distribution of models, such as Markov-Chain Monte Carlo (MCMC). Unfortunately, MCMC is difficult to parallelize and requires many iterations to adequately sample the posterior. We introduce a scalable algorithm called PEAK that improves the efficiency of MCMC by dividing a large set of variables into related groups using a rooted graph that resembles a mountain peak. Our algorithm takes advantage of parallel computing and existing biological databases when available. Results: By using graphs to manage a model space with more than 500,000 candidate variables, we were able to improve MCMC efficiency and uncover the true simulated causal variables, including a gene-gene interaction. We applied PEAK to a case-control study of childhood asthma with 2,521 genetic variants. We used an informative graph for oxidative stress derived from Gene Ontology and identified several variants in ERBB4, OXR1, and BCL2 with strong evidence for associations with childhood asthma. Conclusions: We introduced an extremely flexible analysis framework capable of efficiently performing Bayesian variable selection on many candidate variables. The PEAK algorithm can be provided with an informative graph, which can be advantageous when considering gene-gene interactions, or a symmetric graph, which simply divides the model space into manageable regions. The PEAK framework is compatible with various model forms, allowing for the algorithm to be configured for different study designs and applications, such as pathway or rare-variant analyses, by simple modifications to the model likelihood and proposal functions. © 2013 Baurley and Conti; licensee BioMed Central Ltd.
Figueiredo J.C.,University of Southern California |
Levine A.J.,University of Southern California |
Crott J.W.,Tufts University |
Baurley J.,Biorealm |
Haile R.W.,University of Southern California
Molecular Nutrition and Food Research | Year: 2013
Scope: The metabolism of folate involves a complex network of polymorphic enzymes that may explain a proportion of the risk associated with colorectal neoplasia. Over 60 observational studies primarily in non-Hispanic White populations have been conducted on selected genetic variants in specific genes, MTHFR, MTR, MTRR, CBS, TCNII, RFC, GCPII, SHMT, TYMS, and MTHFD1, including five meta-analyses on MTHFR 677C>T (rs1801133) and MTHFR 1298C>T (rs1801131); two meta-analyses on MTR-2756A>C (rs1805087); and one for MTRR 66A>G (rs1801394). Methods and results: This systematic review synthesizes these data, highlighting the consistent inverse association between MTHFR 677TT genotype and risk of colorectal cancer (CRC) and its null association with adenoma risk. Results for other variants varied across individual studies; in our meta-analyses we observed some evidence for an association with CRC risk for SHMT 1420C>T (rs1979277) (odds ratio (OR) = 0.85; 95% confidence interval (CI) = 0.73-1.00 for TT v. CC) and for the TYMS 5' 28 bp repeat (rs34743033) (OR = 0.84; 95% CI = 0.75-0.94 for 2R/3R v. 3R/3R and OR = 0.82; 95% CI = 0.69-0.98 for 2R/2R v. 3R/3R). Conclusion: Gaining further insight into the role of folate variants in colorectal neoplasia will require incorporating measures of the metabolites, including B-vitamin cofactors, homocysteine and S-adenosylmethionine, and innovative statistical methods to better approximate the folate one-carbon metabolism pathway. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Baurley J.W.,Biorealm |
Edlund C.K.,University of Southern California |
Pardamean B.,Binus University
Advances in Intelligent and Soft Computing | Year: 2012
With the increasing availability and affordability of genome-wide genotyping and sequencing technologies, biomedical researchers face growing computational challenges in managing and analyzing large quantities of genetic data. Previously, this data-intensive research required computing and personnel resources accessible only to large institutions. Cloud computing allows researchers to analyze their data without a local computing infrastructure. We evaluated the feasibility of cloud computing for association analysis of genome-wide data. Our approach utilized the MapReduce model, which divides the analysis into independent units and distributes the work to a computing cloud. We evaluated our approach by modeling the relationships between genetic variants and disease in a simulated genome-wide association study. We generated several data sets of 100,000 subjects and varying numbers of genetic variants, and demonstrated that our analysis approach is scalable and provides an attractive alternative to establishing and maintaining a local computing cluster. © 2012 Springer-Verlag GmbH.
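The MapReduce idea in the abstract above, one independent unit of work per genetic variant followed by a step that combines the results, can be illustrated with a minimal single-machine sketch. This is not the authors' implementation. The function names, the toy genotype data, and the choice of an allelic chi-square statistic as the per-variant test are all assumptions made for illustration. On a real cluster the built-in map would be replaced by a distributed map.

```python
from functools import reduce

def map_variant(record):
    """Map step (hypothetical): score one variant independently of all others.

    record = (variant_id, genotypes, status), where genotypes are minor-allele
    counts (0, 1, or 2) and status is 1 for cases and 0 for controls.
    """
    variant_id, genotypes, status = record
    n_case = sum(status)
    n_ctrl = len(status) - n_case
    # 2x2 allele-count table: a, b = minor/major in cases; c, d = in controls.
    a = sum(g for g, s in zip(genotypes, status) if s == 1)
    c = sum(g for g, s in zip(genotypes, status) if s == 0)
    b = 2 * n_case - a
    d = 2 * n_ctrl - c
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # Pearson chi-square for a 2x2 table; 0.0 when a margin is empty.
    chi2 = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return variant_id, chi2

def reduce_top(best, scored):
    """Reduce step (hypothetical): keep the variant with the largest statistic."""
    return scored if best is None or scored[1] > best[1] else best

def run_association_scan(records):
    # Each record is scored without reference to any other record, so this
    # map could be distributed across workers without changing the result.
    return reduce(reduce_top, map(map_variant, records), None)

# Toy data, invented for illustration only: 4 cases, 4 controls, 2 variants.
status = [1, 1, 1, 1, 0, 0, 0, 0]
records = [
    ("rs_a", [2, 2, 1, 2, 0, 0, 1, 0], status),
    ("rs_b", [1, 0, 1, 0, 1, 0, 1, 0], status),
]
top_hit = run_association_scan(records)
```

Because the map step touches only one record at a time, adding variants increases the number of units of work without changing the code for any single unit, which is the property that lets this kind of analysis scale out.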
Figueiredo J.C.,University of Southern California |
Ly S.,Childrens Hospital Los Angeles |
Ly S.,University of California at Los Angeles |
Magee K.S.,Smile Inc |
And 19 more authors.
Birth Defects Research Part A - Clinical and Molecular Teratology | Year: 2015
Background: Several lifestyle and environmental exposures have been suspected as risk factors for oral clefts, although few have been convincingly demonstrated. Studies across globally diverse populations could offer additional insight given varying types and levels of exposures. Methods: We performed an international case-control study in the Democratic Republic of the Congo (133 cases, 301 controls), Vietnam (75 cases, 158 controls), the Philippines (102 cases, 152 controls), and Honduras (120 cases, 143 controls). Mothers were recruited from hospitals and their exposures were collected from interviewer-administered questionnaires. We used logistic regression modeling to estimate odds ratios (OR) and 95% confidence intervals (CI). Results: Family history of clefts was strongly associated with increased risk (maternal: OR = 4.7; 95% CI, 3.0-7.2; paternal: OR = 10.5; 95% CI, 5.9-18.8; siblings: OR = 5.3; 95% CI, 1.4-19.9). Advanced maternal age (5 year OR = 1.2; 95% CI, 1.0-1.3), pregestational hypertension (OR = 2.6; 95% CI, 1.3-5.1), and gestational seizures (OR = 2.9; 95% CI, 1.1-7.4) were statistically significant risk factors. Lower maternal (secondary school OR = 1.6; 95% CI, 1.2-2.2; primary school OR = 2.4; 95% CI, 1.6-2.8) and paternal education (OR = 1.9; 95% CI, 1.4-2.5; and OR = 1.8; 95% CI, 1.1-2.9, respectively) and paternal tobacco smoking (OR = 1.5; 95% CI, 1.1-1.9) were associated with an increased risk. No other significant associations between maternal and paternal factors were found; some environmental factors including rural residency, indoor cooking with wood, chemicals and water source appeared to be associated with an increased risk in adjusted models. Conclusion: Our study represents one of the first international studies investigating risk factors for clefts among multiethnic underserved populations. Our findings suggest a multifactorial etiology including both maternal and paternal factors. © 2015 Wiley Periodicals, Inc.
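For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the sketch below computes both from a 2x2 exposure table using the Wald interval on the log-odds scale. It is a simplification: the study above used logistic regression, which also supports adjusted models, whereas this is only the unadjusted case for a single binary exposure, where the regression estimate reduces to the cross-product ratio. The function name and the counts are hypothetical and are not taken from the study.

```python
import math

def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    All four counts must be positive. z = 1.96 gives an approximate 95% CI.
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, invented for illustration only.
or_estimate, ci_lower, ci_upper = odds_ratio_ci(20, 80, 10, 90)
```

An interval that excludes 1.0 corresponds to a statistically significant association at roughly the 5% level, which is how the reported intervals above are usually read.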
Agency: Department of Health and Human Services | Branch: National Institutes of Health | Program: SBIR | Phase: Phase II | Award Amount: 2.01M | Year: 2015