Biostat Solutions Inc.

Mount Airy, MD, United States

Stevens J.R.,Utah State University | Masud A.A.,Utah State University | Masud A.A.,Indiana University | Suyundikov A.,Utah State University | Suyundikov A.,BioStat Solutions Inc.
PLoS ONE | Year: 2017

In high-dimensional data analysis (such as gene expression, spatial epidemiology, or brain imaging studies), we often test thousands or more hypotheses simultaneously. As the number of tests increases, the chance of observing some statistically significant tests is very high even when all null hypotheses are true. Consequently, we could reach incorrect conclusions regarding the hypotheses. Researchers frequently use multiplicity adjustment methods to control type I error rates, primarily the family-wise error rate (FWER) or the false discovery rate (FDR), while still desiring high statistical power. In practice, such studies may have dependent test statistics (or p-values) as tests can be dependent on each other. However, some commonly used multiplicity adjustment methods assume independent tests. We perform a simulation study comparing several of the most common adjustment methods involved in multiple hypothesis testing, under varying degrees of block-correlation positive dependence among tests. © 2017 Stevens et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
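
As a rough illustration of the two error rates named in the abstract above, the following sketch computes Bonferroni-adjusted p-values (FWER control) and Benjamini-Hochberg-adjusted p-values (FDR control). It is not the simulation design of the paper: the function names and the five p-values are made up for illustration, and no dependence structure among tests is simulated.

```python
def bonferroni(pvals):
    """FWER control: multiply each p-value by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """FDR control: Benjamini-Hochberg step-up adjusted p-values."""
    m = len(pvals)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values, for illustration only (not data from the paper).
pvals = [0.001, 0.008, 0.02, 0.03, 0.20]
bonf = bonferroni(pvals)
bh = benjamini_hochberg(pvals)

alpha = 0.05
print("Bonferroni rejections:", sum(p < alpha for p in bonf))
print("Benjamini-Hochberg rejections:", sum(p < alpha for p in bh))
```

On these illustrative inputs the FDR-controlling procedure rejects more hypotheses than the FWER-controlling one, which is the usual power trade-off between the two criteria.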

Schifano E.D.,University of Connecticut | Li L.,Harvard University | Li L.,BioStat Solutions Inc. | Christiani D.C.,Harvard University | Lin X.,Harvard University
American Journal of Human Genetics | Year: 2013

There is increasing interest in the joint analysis of multiple phenotypes in genome-wide association studies (GWASs), especially for the analysis of multiple secondary phenotypes in case-control studies and in detecting pleiotropic effects. Multiple phenotypes often measure the same underlying trait. By taking advantage of similarity across phenotypes, one could potentially gain statistical power in association analysis. Because continuous phenotypes are likely to be measured on different scales, we propose a scaled marginal model for testing and estimating the common effect of single-nucleotide polymorphism (SNP) on multiple secondary phenotypes in case-control studies. This approach improves power in comparison to individual phenotype analysis and traditional multivariate analysis when phenotypes are positively correlated and measure an underlying trait in the same direction (after transformation) by borrowing strength across outcomes with a one degree of freedom (1-DF) test and jointly estimating outcome-specific scales along with the SNP and covariate effects. To account for case-control ascertainment bias for the analysis of multiple secondary phenotypes, we propose weighted estimating equations for fitting scaled marginal models. This weighted estimating equation approach is robust to departures from normality of continuous multiple phenotypes and the misspecification of within-individual correlation among multiple phenotypes. Statistical power improves when the within-individual correlation is correctly specified. We perform simulation studies to show the proposed 1-DF common effect test outperforms several alternative methods. We apply the proposed method to investigate SNP associations with smoking behavior measured with multiple secondary smoking phenotypes in a lung cancer case-control GWAS and identify several SNPs of biological interest. © 2013 The American Society of Human Genetics.

Houston J.P.,Eli Lilly and Company | Houston J.P.,Indiana University | Houston J.P.,INC Research | Kohler J.,BioStat Solutions Inc. | And 7 more authors.
Journal of Clinical Psychiatry | Year: 2012

Objective: Pharmacogenomic analyses of weight gain during treatment with second-generation antipsychotics have resulted in a number of associations with variants in ankyrin repeat and kinase domain containing 1 (ANKK1)/dopamine D2 receptor (DRD2) and serotonin 2C receptor (HTR2C) genes. These studies primarily assessed subjects with schizophrenia who had prior antipsychotic exposure that may have influenced the amount of weight gained from subsequent therapies. We assessed the relationships between single-nucleotide polymorphisms (SNPs) in these genes with weight gain during treatment with olanzapine in a predominantly antipsychotic-naive population. Method: The association between 5 ANKK1, 54 DRD2, and 11 HTR2C SNPs and weight change during 8 weeks of olanzapine treatment was assessed in 4 pooled studies of 205 white patients with diagnoses other than schizophrenia who were generally likely to have had limited previous antipsychotic exposure. Results: The A allele of DRD2 rs2440390(A/G) was associated with greater weight gain in the entire study sample (P = .0473). Three HTR2C SNPs in strong linkage disequilibrium, rs6318, rs2497538, and rs1414334, were associated with greater weight gain in women but not in men (P = .0032, .0012, and .0031, respectively). A significant association with weight gain for 2 HTR2C SNPs previously reported associated with weight gain, -759C/T (rs3813929) and -697G/C (rs518147), was not found. Conclusions: Associations between weight gain and HTR2C and DRD2 variants in whites newly exposed to olanzapine may present opportunities for the individualization of medication selection and development based on differences in adverse events observed across genotype groups. Trial Registration: ClinicalTrials.gov identifiers: Study A: NCT00088036, Study B: NCT00091650, Study C: NCT00094549, Study D: NCT00035321. © Copyright 2012 Physicians Postgraduate Press, Inc.

Qu L.,BioStat Solutions Inc. | Nettleton D.,Iowa State University | Dekkers J.C.M.,Iowa State University
Biometrics | Year: 2012

For analysis of genomic data, e.g., microarray data from gene expression profiling experiments, the two-component mixture model has been widely used in practice to detect differentially expressed genes. However, it naïvely imposes strong exchangeability assumptions across genes and does not make active use of a priori information about intergene relationships that is currently available, e.g., gene annotations through the Gene Ontology (GO) project. We propose a general strategy that first generates a set of covariates that summarizes the intergene information and then extends the two-component mixture model into a hierarchical semiparametric model utilizing the generated covariates through latent nonparametric regression. Simulations and analysis of real microarray data show that our method can outperform the naïve two-component mixture model. © 2012, The International Biometric Society.
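
For orientation, the plain two-component mixture that the abstract above takes as its starting point can be sketched as a local false discovery rate calculation: each statistic comes from a null component with probability pi0 and from an alternative component otherwise. The sketch below is not the authors' hierarchical semiparametric model. It assumes z-scale statistics, a standard normal null, a normal alternative, and known mixture parameters; all function and parameter names are hypothetical.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(z, pi0, alt_mean, alt_sd=1.0):
    """Posterior probability that a statistic z came from the null component.

    Assumed mixture: f(z) = pi0 * N(0, 1) + (1 - pi0) * N(alt_mean, alt_sd^2).
    """
    null_part = pi0 * normal_pdf(z)
    alt_part = (1.0 - pi0) * normal_pdf(z, alt_mean, alt_sd)
    return null_part / (null_part + alt_part)


# Illustrative parameter values only: 90% null genes, alternative centred at 3.
for z in (0.0, 2.0, 4.0):
    print(z, local_fdr(z, pi0=0.9, alt_mean=3.0))
```

Because pi0 is shared by every gene here, all genes are treated as exchangeable; the approach described in the abstract relaxes this by letting covariates derived from intergene information enter the model.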

Qu L.,Biostat Solutions Inc. | Nettleton D.,Iowa State University | Dekkers J.C.M.,Iowa State University
Biometrics | Year: 2012

Given a large number of t-statistics, we consider the problem of approximating the distribution of noncentrality parameters (NCPs) by a continuous density. This problem is closely related to the control of false discovery rates (FDR) in massive hypothesis testing applications, e.g., microarray gene expression analysis. Our methodology is similar to, but improves upon, the existing approach by Ruppert, Nettleton, and Hwang (2007, Biometrics, 63, 483-495). We provide parametric, nonparametric, and semiparametric estimators for the distribution of NCPs, as well as estimates of the FDR and local FDR. In the parametric situation, we assume that the NCPs follow a distribution that leads to an analytically available marginal distribution for the test statistics. In the nonparametric situation, we use convex combinations of basis density functions to estimate the density of the NCPs. A sequential quadratic programming procedure is developed to maximize the penalized likelihood. The smoothing parameter is selected with the approximate network information criterion. A semiparametric estimator is also developed to combine both parametric and nonparametric fits. Simulations show that, under a variety of situations, our density estimates are closer to the underlying truth and our FDR estimates are improved compared with alternative methods. Data-based simulations and the analyses of two microarray datasets are used to evaluate the performance in realistic situations. © 2012, The International Biometric Society.

Marshall S.L.,BioStat Solutions Inc. | Guennel T.,BioStat Solutions Inc. | Kohler J.,BioStat Solutions Inc. | Man M.,Eli Lilly and Company | Fossceco S.,Eli Lilly and Company
Pharmacogenomics | Year: 2013

Aim: This article aims to evaluate the performance of a recent method to estimate heritability of continuous and binary traits, specifically in the context of pharmacogenetic studies. Materials & methods: The approach to be evaluated was designed to estimate heritability in large-scale disease studies. Extensive simulation studies designed to emulate common scenarios seen in pharmacogenetic studies were performed to elucidate the potential utility of this approach outside of disease genetics. The simulations cover continuous and binary traits with small-to-moderate heritability values across a variety of sample sizes in genome-wide, as well as candidate gene, settings. Results: On a genome-wide scale, a combination of relatively large sample sizes (i.e., n ≥ 1000) and at least moderate underlying heritability (i.e., ≥0.25) is needed in order to attain reasonable statistical power. However, in candidate gene studies, reasonable power can be attained across a broader range of scenarios. Conclusion: Our simulation studies show that the proposed approach has clear utility in the context of pharmacogenetic studies, especially in candidate gene settings, and provides novel supplementary information that can be used to inform decision-making in the pharmaceutical industry. Original submitted 1 November 2012; Revision submitted 24 January 201. © 2013 Future Medicine Ltd.

Li L.,BioStat Solutions Inc. | Guennel T.,BioStat Solutions Inc. | Marshall S.,BioStat Solutions Inc. | Cheung L.W.-K.,Eli Lilly and Company
Pharmacogenomics Journal | Year: 2014

Delivering on the promise of personalized medicine has become a focus of the pharmaceutical industry as the era of the blockbuster drug is fading. Central to realizing this promise is the need for improved analytical strategies for effectively integrating information across various biological assays (for example, copy number variation and targeted protein expression) toward identification of a treatment-specific subgroup: identifying the right patients. We propose a novel combination of elastic net followed by a maximal χ² and semiparametric bootstrap. The combined approaches are presented in a two-stage strategy that estimates patient-specific multi-marker molecular signatures (MMMS) to identify and directly test for a biomarker-driven subgroup with enhanced treatment effect. This flexible strategy provides for incorporation of business-specific needs, such as confining the search space to a subgroup size that is commercially viable, ultimately resulting in actionable information for use in empirically based decision making. © 2014 Macmillan Publishers Limited.

Qu L.,Wright State University | Guennel T.,BioStat Solutions Inc. | Marshall S.,BioStat Solutions Inc.
Biometrics | Year: 2013

Summary: Following the rapid development of genome-scale genotyping technologies, genetic association mapping has become a popular tool to detect genomic regions responsible for certain (disease) phenotypes, especially in early-phase pharmacogenomic studies with limited sample size. In response to such applications, a good association test needs to be (1) applicable to a wide range of possible genetic models, including, but not limited to, the presence of gene-by-environment or gene-by-gene interactions and non-linearity of a group of marker effects, (2) accurate in small samples, fast to compute on the genomic scale, and amenable to large scale multiple testing corrections, and (3) reasonably powerful to locate causal genomic regions. The kernel machine method represented in linear mixed models provides a viable solution by transforming the problem into testing the nullity of variance components. In this study, we consider score-based tests by choosing a statistic linear in the score function. When the model under the null hypothesis has only one error variance parameter, our test is exact in finite samples. When the null model has more than one variance parameter, we develop a new moment-based approximation that performs well in simulations. Through simulations and analysis of real data, we demonstrate that the new test possesses most of the aforementioned characteristics, especially when compared to existing quadratic score tests or restricted likelihood ratio tests. © 2013, The International Biometric Society.

Houston J.P.,Eli Lilly and Company | Houston J.P.,Indiana University | Kohler J.,BioStat Solutions Inc. | Ostbye K.M.,BioStat Solutions Inc. | And 2 more authors.
Psychiatry Research | Year: 2011

Single-nucleotide and diplotype associations with 17-item Hamilton Depression Rating Scale (HAMD17) total score changes were examined, based on catechol-O-methyltransferase (COMT) rs165599 status in duloxetine-treated, self-identified white patients with major depressive disorder. COMT rs165737 and a diplotype containing COMT rs165599 and COMT rs165737 were associated with HAMD17 total score changes. © 2011 Elsevier Ltd.

PubMed | Eli Lilly and Company and BioStat Solutions Inc.
Type: Journal Article | Journal: Journal of Clinical Pharmacology | Year: 2015

Atomoxetine, which is indicated for treatment of attention-deficit hyperactivity disorder (ADHD), is predominantly metabolized by genetically polymorphic cytochrome P450 2D6 (CYP2D6). Based on identified CYP2D6 genotypes, individuals can be categorized into 4 phenotypic metabolizer groups as ultrarapid, extensive, intermediate, and poor. Previous studies have focused on observed differences between poor and extensive metabolizers, but it is not well understood whether the safety profile of intermediate metabolizers differs from that of ultrarapid and extensive metabolizers. This study compared safety and tolerability among the different CYP2D6 metabolizer groups in the 12-week open-label phase of an atomoxetine study in adult patients with ADHD. Genotyping identified 1039 patients as extensive/ultrarapid metabolizers, 780 patients as intermediate metabolizers, and 117 patients as poor metabolizers. Common (≥5% frequency) treatment-emergent adverse events did not significantly differ between extensive/ultrarapid and intermediate metabolizers (odds ratios were <2.0 or >0.5). Poor metabolizers had higher frequencies of dry mouth, erectile dysfunction, hyperhidrosis, insomnia, and urinary retention compared with the other metabolizer groups. There were no significant differences between extensive/ultrarapid and intermediate metabolizers in changes from baseline in vital signs. These results suggest that data from CYP2D6 intermediate and extensive/ultrarapid metabolizers can be combined when considering safety analyses related to atomoxetine.
