Minneapolis, MN, United States

Paunic V.,University of Minnesota | Steinbach M.,University of Minnesota | Madbouly A.,Bioinformatics Research | Kumar V.,University of Minnesota
2013 ACM Conference on Bioinformatics, Computational Biology and Biomedical Informatics, ACM-BCB 2013 | Year: 2013

The Human Leukocyte Antigen (HLA) gene system plays a crucial role in hematopoietic stem cell transplantation, where patients and donors are matched with respect to their HLA genes in order to maximize the chances of a successful transplant. It is the most polymorphic region of the human genome with some of the strongest associations with autoimmune, infectious, and inflammatory diseases. The availability of HLA data is, therefore, of high importance to clinicians and researchers. However, due to its high polymorphism, obtaining it is time- and cost-prohibitive. We previously described a method for the prediction of HLA genes from widely available Single Nucleotide Polymorphism (SNP) data. In this paper we show that using HLA gene dependency information improves prediction performance on multiple real-world data sets. More specifically, we propose and evaluate different approaches for integrating HLA gene dependency into the prediction process. The results from experiments on two real data sets show that adding dependency information is a valuable asset for HLA gene prediction, particularly for smaller data sets. Copyright © 2007 by the Association for Computing Machinery.

Paunic V.,University of Minnesota | Steinbach M.,University of Minnesota | Kumar V.,University of Minnesota | Maiers M.,Bioinformatics Research
Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012 | Year: 2012

Variation in the Human Leukocyte Antigen (HLA) gene system is very important. It is one of the most polymorphic regions of the human genome and one of the most extensively studied regions due to its association with autoimmune, infectious, and inflammatory diseases, such as rheumatoid arthritis, celiac disease, multiple sclerosis and Type I diabetes. The HLA gene system also plays a crucial role in hematopoietic stem cell transplantation, where patients and donors are matched with respect to their HLA genes in order to maximize the chances of a successful transplant. Having complete HLA data is therefore of great use to clinicians and researchers. However, due to its polymorphism, obtaining it is highly time- and cost-prohibitive. Genome-wide association studies that find strong associations within the HLA region would ideally identify the exact HLA alleles responsible for the association in order to determine the causal genes/variants. Here we propose a method to infer HLA alleles from widely available and affordable SNP genotype data. Our method takes into account the high linkage disequilibrium that exists in the region. We demonstrate that this additional information is an important asset in the HLA prediction problem. © 2012 IEEE.

Hollenbach J.A.,University of California at San Francisco | Saperstein A.,Stanford University | Albrecht M.,Bioinformatics Research | Vierra-Green C.,Center for International Blood and Marrow Transplant Research | and 3 more authors.
PLoS ONE | Year: 2015

We conducted a nationwide study comparing self-identification to genetic ancestry classifications in a large cohort (n = 1752) from the National Marrow Donor Program. We sought to determine how various measures of self-identification intersect with genetic ancestry, with the aim of improving matching algorithms for unrelated bone marrow transplant. Multiple dimensions of self-identification, including race/ethnicity and geographic ancestry, were compared to classifications based on ancestry informative markers (AIMs), and the human leukocyte antigen (HLA) genes, which are required for transplant matching. Nearly 20% of responses were inconsistent between reporting race/ethnicity versus geographic ancestry. Despite strong concordance between AIMs and HLA, no measure of self-identification shows complete correspondence with genetic ancestry. In certain cases geographic ancestry reporting matches genetic ancestry not reflected in race/ethnicity identification, but in other cases geographic ancestries show little correspondence to genetic measures, with important differences by gender. However, when respondents assign ancestry to grandparents, we observe sub-groups of individuals with well-defined genetic ancestries, including important differences in HLA frequencies, with implications for transplant matching. While we advocate for tailored questioning to improve accuracy of ancestry ascertainment, collection of donor grandparents' information will improve the chances of finding matches for many patients, particularly for mixed-ancestry individuals. © 2015 Hollenbach et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Paunic V.,Bioinformatics Research | Paunic V.,University of Minnesota | Gragert L.,Bioinformatics Research | Madbouly A.,Bioinformatics Research | and 2 more authors.
PLoS ONE | Year: 2012

In hematopoietic stem cell transplantation, donor selection is based primarily on matching donor and patient HLA genes. These genes are highly polymorphic and their typing can result in exact allele assignment at each gene (the resolution at which patients and donors are matched), but it can also result in a set of ambiguous assignments, depending on the typing methodology used. To facilitate rapid identification of matched donors, registries employ statistical algorithms to infer HLA alleles from ambiguous genotypes. Linkage disequilibrium information encapsulated in haplotype frequencies is used to facilitate prediction of the most likely haplotype assignment. An HLA typing with less ambiguity produces fewer high-probability haplotypes and a more reliable prediction. We estimated ambiguity for several HLA typing methods across four continental populations using an information theory-based measure, Shannon's entropy. We used allele and haplotype frequencies to calculate entropy for different sets of 1,000 subjects with simulated HLA typing. Using allele frequencies we calculated an average entropy in Caucasians of 1.65 for serology, 1.06 for allele family level, 0.49 for a 2002-era SSO kit, and 0.076 for single-pass SBT. When using haplotype frequencies in entropy calculations, we found average entropies of 0.72 for serology, 0.73 for allele family level, 0.05 for SSO, and 0.002 for single-pass SBT. Application of haplotype frequencies further reduces HLA typing ambiguity. We also estimated expected confirmatory typing mismatch rates for simulated subjects. In a hypothetical registry with all donors typed using the same method, the entropy values based on haplotype frequencies correspond to confirmatory typing mismatch rates of 1.31% for SSO versus only 0.08% for SBT. Intermediate-resolution single-pass SBT contains the least ambiguity of the methods we evaluated and therefore the most certainty in allele prediction. 
The presented measure objectively evaluates HLA typing methods and can help define acceptable HLA typing for donor recruitment. © 2012 Paunić et al.
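
As a rough illustration of the entropy measure described in this abstract, the sketch below computes Shannon's entropy over the alleles that remain possible after an ambiguous typing. It is not the authors' implementation: the function name `typing_entropy`, the base-2 logarithm, and the example frequencies are all assumptions made for illustration.

```python
import math


def typing_entropy(candidate_freqs, base=2.0):
    """Shannon entropy of the alleles still consistent with a typing result.

    candidate_freqs maps each candidate allele to its population frequency.
    The frequencies are renormalized over the candidates, so the entropy
    measures how much ambiguity the typing leaves unresolved.
    The log base is an assumption; the abstract does not state one.
    """
    total = sum(candidate_freqs.values())
    if total <= 0:
        raise ValueError("candidate frequencies must sum to a positive value")
    entropy = 0.0
    for freq in candidate_freqs.values():
        p = freq / total
        if p > 0:
            entropy -= p * math.log(p, base)
    return entropy


# Hypothetical frequencies, chosen only to show the behavior.
unambiguous = {"A*02:01": 0.25}
ambiguous = {"A*02:01": 0.25, "A*02:05": 0.25}

print(typing_entropy(unambiguous))
print(typing_entropy(ambiguous))
```

A typing that leaves a single candidate has zero entropy, while two equally likely candidates give one bit. Averaging such values over many simulated subjects would give a per-method ambiguity score in the spirit of the figures quoted above.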

Madbouly A.,Bioinformatics Research | Gragert L.,Bioinformatics Research | Freeman J.,Bioinformatics Research | Leahy N.,Bioinformatics Research | and 5 more authors.
Tissue Antigens | Year: 2014

Genetic matching for loci in the human leukocyte antigen (HLA) region between a donor and a patient in hematopoietic stem cell transplantation (HSCT) is critical to outcome; however, methods for HLA genotyping of donors in unrelated stem cell registries often yield results with allelic and phase ambiguity and/or do not query all clinically relevant loci. We present and evaluate a statistical method for in silico imputation of HLA alleles and haplotypes in large ambiguous population data from the Be The Match® Registry. Our method builds on haplotype frequencies estimated from registry populations and exploits patterns of linkage disequilibrium (LD) across HLA haplotypes to infer high resolution HLA assignments. We performed validation on simulated and real population data from the Registry with non-trivial ambiguity content. While real population datasets caused some predictions to deviate from expectation, validations still showed high percent recall for imputed results with average recall >76% when imputing HLA alleles from registry data. We simulated ambiguity generated by several HLA genotyping methods to evaluate the imputation performance on several levels of typing resolution. On average, imputation percent recall of allele-level HLA haplotypes was >95% for allele-level typing, >92% for intermediate resolution typing and >58% for serology (low-resolution) typing. Thus, allele-level HLA assignments can be imputed through the application of a set of statistical and population genetics inferences and with knowledge of haplotype frequencies and self-identified race and ethnicities. © 2014 John Wiley & Sons A/S.
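
The general idea of haplotype-frequency-based imputation can be sketched as follows. This is a minimal illustration under stated assumptions, not the method evaluated in the paper: it assumes Hardy-Weinberg proportions, represents a typing as a set of allowed allele pairs per locus, and uses made-up haplotype frequencies. The function name `impute_haplotype_pairs` is hypothetical.

```python
from itertools import combinations_with_replacement


def impute_haplotype_pairs(hap_freqs, typing):
    """Rank haplotype pairs consistent with an ambiguous typing.

    hap_freqs: dict mapping a haplotype (tuple of alleles, one per locus)
               to its population frequency.
    typing:    one set per locus of allowed genotypes, each genotype given
               as a sorted tuple of two alleles.
    Returns a dict mapping each consistent haplotype pair to its posterior
    probability, assuming Hardy-Weinberg proportions (an assumption).
    """
    weights = {}
    for h1, h2 in combinations_with_replacement(sorted(hap_freqs), 2):
        consistent = all(
            tuple(sorted((a1, a2))) in allowed
            for a1, a2, allowed in zip(h1, h2, typing)
        )
        if not consistent:
            continue
        weight = hap_freqs[h1] * hap_freqs[h2]
        if h1 != h2:
            weight *= 2  # two phase orderings for a heterozygous pair
        weights[(h1, h2)] = weight
    total = sum(weights.values())
    if total == 0:
        return {}
    return {pair: w / total for pair, w in weights.items()}


# Hypothetical two-locus frequencies with strong linkage disequilibrium.
hap_freqs = {
    ("A*01:01", "B*08:01"): 0.10,
    ("A*02:01", "B*07:02"): 0.05,
    ("A*01:01", "B*07:02"): 0.01,
    ("A*02:01", "B*08:01"): 0.01,
}
# Alleles are known at both loci, but their phase is not.
typing = [
    {("A*01:01", "A*02:01")},
    {("B*07:02", "B*08:01")},
]

posterior = impute_haplotype_pairs(hap_freqs, typing)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

In this toy case the pairing built from the two common haplotypes dominates, which is how linkage disequilibrium helps resolve phase ambiguity. Registry-scale imputation additionally needs population-specific frequencies and a far more efficient search than this exhaustive enumeration.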
