News Article | May 18, 2017
New research presented at this year's European Congress on Obesity (ECO) in Porto, Portugal (17-20 May) shows that middle-aged people who spend the most nights in hospital (and thus have the highest healthcare burden) have on average much higher levels of visceral fat (internal fat that surrounds their organs) and fat within their thigh muscles than those who spend no nights in hospital. The study is by Dr Olof Dahlqvist Leinhard, Chief Technology Officer, Advanced MR Analytics AB, Linköping, Sweden. There are significant economic costs associated with obesity. Yet the consensus definition of obesity, based on body mass index (BMI), lacks detail on precise body composition and the distribution of fat compartments within the body. Magnetic resonance imaging (MRI), currently the gold standard for body composition profiling, allows for accurate quantification of body fat content and distribution. Previous research has shown a strong association between fat distribution and the risk of type 2 diabetes and cardiovascular disease. In this study, the authors wanted to understand the association of body fat distribution with general health. Their theory was that health burden metrics are useful because hospitalisation nights are distinct decisions made by a physician indicating a certain severity of disease. Thus this study aimed to determine the association between body composition measures and prior health care burden (HCB), measured as the number of nights of hospitalisation, and to characterise subjects with the highest prior HCB. The study included 2,864 males and 3,157 females, age range at imaging 46 to 78 years, from the UK Biobank imaging cohort. Visceral adipose tissue index (VATi = VAT/height², l/m²), abdominal subcutaneous adipose tissue index (ASATi = ASAT/height², l/m²) and intra-muscular adipose tissue in anterior thighs (IMAT, %) were measured using an MRI scanner. The MR images were analysed using AMRA® Profiler research (AMRA, Sweden).
The HCB was derived from UK Biobank hospital in-patient data gathered prior to, and during the imaging study. Computer modelling was used to establish the relationships between body fat distribution and HCB. All models were adjusted for age, sex, smoking, alcohol intake, and physical activity. Finally, body composition was determined for subjects in the 90th percentile of prior HCB (meaning the 10% of participants with the highest HCB) and compared to a group with no hospital nights matched on sex, age, and BMI. Their data showed that HCB was associated with increased VATi and increased IMAT (both statistically significant). Association with ASATi was not significant. The group with highest prior HCB consisted of 382 females and 292 males, median age at imaging 64 years, who were hospitalised for at least nine nights. Comparing to subjects with no hospital nights, VATi and IMAT in those with the most hospital nights were substantially higher (both statistically significant findings). Dr Leinhard concludes: "This study demonstrated that internal fat around organs and thigh fat are associated with more days of prior hospitalisation. Subcutaneous fat did not show a significant relationship." He adds: "The findings indicate that visceral obesity should be the focus rather than subcutaneous fat and body mass index (BMI) for achieving better health. The findings related to muscle fat infiltration are more difficult to interpret but highlight the importance of musculoskeletal health. More research is needed to understand the underlying cause for increased infiltration of fat into thigh muscle."
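The height-normalised indices used in the study (VATi and ASATi, each a fat volume in litres divided by height squared in metres) can be illustrated with a minimal sketch. The function name and the participant values below are hypothetical and are not taken from the study.

```python
def fat_index(volume_litres: float, height_m: float) -> float:
    """Return a height-normalised fat index in l/m^2 (volume / height^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return volume_litres / height_m ** 2


# Hypothetical participant (illustrative values only, not study data).
vati = fat_index(volume_litres=4.5, height_m=1.75)   # visceral fat index
asati = fat_index(volume_litres=7.0, height_m=1.75)  # subcutaneous fat index
print(f"VATi = {vati:.2f} l/m^2, ASATi = {asati:.2f} l/m^2")
```

Dividing by height squared mirrors the way BMI normalises weight, so that fat volumes can be compared between people of different stature.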
News Article | February 27, 2017
OXFORD, England, February 27, 2017 /PRNewswire/ -- Around 20% of adults have fatty liver disease, putting them at increased risk of heart disease, stroke and cancer, Oxford and University of Westminster researchers reveal in a paper published in PLOS One today. (Logo...
News Article | February 15, 2017
A genomic study of baldness identified more than 200 genetic regions involved in this common but potentially embarrassing condition. These genetic variants could be used to predict a man's chance of severe hair loss. The study, led by Saskia Hagenaars and W. David Hill of The University of Edinburgh, United Kingdom, is published February 14th, 2017 in PLOS Genetics. Before this new study, only a handful of genes related to baldness had been identified. The University of Edinburgh scientists examined genomic and health data from over 52,000 male participants of the UK Biobank, performing a genome-wide association study of baldness. They pinpointed 287 genetic regions linked to the condition. The researchers created a formula to try and predict the chance that a person will go bald, based on the presence or absence of certain genetic markers. Accurate predictions for an individual are still some way off, but the results can help to identify sub-groups of the population for which the risk of hair loss is much higher. The study is the largest genetic analysis of male pattern baldness to date. Many of the identified genes are related to hair structure and development. They could provide possible targets for drug development to treat baldness or related conditions. Saskia Hagenaars, a PhD student from The University of Edinburgh's Centre for Cognitive Ageing and Cognitive Epidemiology, who jointly led the research, said: "We identified hundreds of new genetic signals. It was interesting to find that many of the genetics signals for male pattern baldness came from the X chromosome, which men inherit from their mothers." Dr David Hill, who co-led the research, said: "In this study, data were collected on hair loss pattern but not age of onset; we would expect to see an even stronger genetic signal if we were able to identify those with early-onset hair loss." 
The study's principal investigator, Dr Riccardo Marioni, from The University of Edinburgh's Centre for Genomic and Experimental Medicine, said: "We are still a long way from making an accurate prediction for an individual's hair loss pattern. However, these results take us one step closer. The findings pave the way for an improved understanding of the genetic causes of hair loss." In your coverage please use this URL to provide access to the freely available article in PLOS Genetics: http://journals. Funding: This research was conducted, using the UK Biobank Resource, in The University of Edinburgh Centre for Cognitive Ageing and Cognitive Epidemiology, part of the cross-council Lifelong Health and Wellbeing Initiative (MR/K026992/1). Funding from the Biotechnology and Biological Sciences Research Council (BBSRC) and Medical Research Council (MRC) is gratefully acknowledged. WDH is supported by a grant from Age UK (Disconnected Mind Project). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The authors have declared that no competing interests exist.
News Article | February 15, 2017
The discovery cohort consisted of 147 studies comprising 458,927 adult individuals of the following ancestries: (1) European descent (n = 381,625); (2) African (n = 27,494); (3) South Asian (n = 29,591); (4) East Asian (n = 8,767); (5) Hispanic (n = 10,776) and (6) Saudi Arabian (n = 695). All participating institutions and coordinating centres approved this project, and informed consent was obtained from all subjects. Discovery meta-analysis was carried out in each ancestry group (except the Saudi Arabian) separately as well as in the All group. Validation was undertaken in individuals of European ancestry only (Supplementary Tables 1–3). Conditional analyses were undertaken only in the European descent group (106 studies, n = 381,625). The SNPs we identify are available from the NCBI dbSNP database of short genetic variations (https://www.ncbi.nlm.nih.gov/projects/SNP/). No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment. Height (in centimetres) was corrected for age and the genomic principal components (derived from GWAS data, the variants with a MAF > 1% on ExomeChip (http://genome.sph.umich.edu/wiki/Exome_Chip_Design), or ancestry-informative markers available on the ExomeChip), as well as any additional study-specific covariates (for example, recruiting centre), in a linear regression model. For studies with non-related individuals, residuals were calculated separately by sex, whereas for family-based studies sex was included as a covariate in the model. Additionally, residuals for case/control studies were calculated separately. Finally, residuals were subject to inverse normal transformation. The majority of studies followed a standardized protocol and performed genotype calling using the designated manufacturer’s software, which was then followed by zCall30. 
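The last step of the phenotype preparation described above, the inverse normal transformation of the regression residuals, can be sketched as follows. This is a minimal rank-based version using only the Python standard library. The text does not state which rank offset was used, so the offset of 0.5 and the handling of ties by average rank are assumptions, and the function name is hypothetical.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse normal transformation.

    Each value is replaced by the standard normal quantile of
    (rank - offset) / (n - 2 * offset + 1); ties share their average rank.
    The default offset of 0.5 is an assumption, not taken from the paper.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


# Hypothetical residuals (for example, height corrected for age and PCs).
residuals = [3.1, -0.4, 0.9, -2.2, 0.0]
print(inverse_normal_transform(residuals))
```

The transformation preserves the ordering of individuals while forcing the trait onto a standard normal scale, which makes effect sizes comparable across studies.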
For ten studies participating in the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium, the raw intensity data for the samples from seven genotyping centres were assembled into a single project for joint calling11. Study-specific quality-control measures of the genotyped variants were implemented before association analysis (Supplementary Tables 1–2). Individual cohorts were analysed separately for each ancestry population, with either RAREMETALWORKER (http://genome.sph.umich.edu/wiki/RAREMETALWORKER) or RVTEST (http://zhanxw.github.io/rvtests/), to associate inverse normal transformed height data with genotype data taking potential cryptic relatedness (kinship matrix) into account in a linear mixed model. These programs are designed to perform score-statistics based rare-variant association analysis, can accommodate both unrelated and related individuals, and provide single-variant results and a variance-covariance matrix. The covariance matrix captures linkage disequilibrium relationships between markers within 1 Mb, which is used for gene-level meta-analyses and conditional analyses31. Single-variant analyses were performed for both additive and recessive models (for the alternate allele). The individual study data were investigated for the potential existence of ancestry population outliers based on the 1000 Genomes Project phase 1 ancestry reference populations. A centralized quality control procedure implemented in EasyQC32 was applied to individual study association summary statistics to identify outlying studies: (1) assessment of possible problems in height transformation; (2) comparison of allele frequency alignment against 1000 Genomes Project phase 1 reference data to pinpoint any potential strand issues; and (3) examination of quantile–quantile plots per study to identify any problems arising from population stratification, cryptic relatedness and genotype biases.
We excluded variants if they had a call rate <95%, Hardy–Weinberg equilibrium P < 1 × 10−7, or large allele frequency deviations from reference populations (>0.6 for all ancestry analyses and >0.3 for ancestry-specific population analyses). We also excluded from downstream analyses markers not present on the Illumina ExomeChip array 1.0, variants on the Y chromosome or the mitochondrial genome, indels, multiallelic variants, and problematic variants based on the Blat-based sequence alignment analyses. Meta-analyses were carried out in parallel by two different analysts at two sites. We conducted single-variant meta-analyses in a discovery sample of 458,927 individuals of different ancestries using both additive and recessive genetic models (Extended Data Fig. 1 and Supplementary Tables 1–4). Significance for single-variant analyses was defined at an array-wide level (P < 2 × 10−7, Bonferroni correction for 250,000 variants). The combined additive analyses identified 1,455 unique variants that reached array-wide significance (P < 2 × 10−7), including 578 non-synonymous and splice-site variants (Supplementary Tables 5–7). Under the additive model, we observed a high genomic inflation of the test statistics (for example, a λ of 2.7 in European ancestry studies for common markers, Extended Data Fig. 2 and Supplementary Table 8), although validation results (see below) and additional sensitivity analyses (see below) suggested that it is consistent with polygenic inheritance as opposed to population stratification, cryptic relatedness, or technical artefacts (Extended Data Fig. 2). The majority of these 1,455 association signals (1,241; 85.3%) were found in the European ancestry meta-analysis (85.5% of the discovery sample size) (Extended Data Fig. 2). 
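The array-wide significance level quoted above follows from a Bonferroni correction. The short sketch below reproduces that arithmetic, assuming a family-wise error rate of 0.05 (the conventional value, which the text does not state explicitly). The same function also reproduces the two other Bonferroni thresholds quoted later in the Methods. The function name is hypothetical.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test P-value threshold for a family-wise error rate of alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# alpha = 0.05 is an assumption; the test counts are taken from the text.
array_wide = bonferroni_threshold(250_000)      # single-variant tests
lead_snps = bonferroni_threshold(561)           # validation of 561 lead SNPs
gene_based = bonferroni_threshold(25_000 * 4)   # 25,000 genes x four tests

print(f"array-wide: {array_wide:.1e}")
print(f"561 tests:  {lead_snps:.1e}")
print(f"gene-based: {gene_based:.1e}")
```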
Nevertheless, we discovered eight associations within five loci in our all-ancestry analyses that are driven by African studies (including one missense variant in the growth hormone gene GH1 (rs151263636), Extended Data Fig. 3), three height variants found only in African studies, and one rare missense marker associated with height in South Asians only (Supplementary Table 7). We observed a marked genomic inflation of the test statistics even after adequate control for population stratification (linear mixed model) arising mainly from common markers; λ in European ancestry was 1.2 and 2.7 for all and common markers, respectively (Extended Data Fig. 2 and Supplementary Table 8). Such inflation is expected for a highly polygenic trait like height, and is consistent with our very large sample size3, 33. To confirm this, we applied the recently developed linkage disequilibrium score regression method to our height ExomeChip results34, with the caveat that the method was developed (and tested) with >200,000 common markers available. We restricted our analyses to 15,848 common variants (MAF ≥ 5%) from the European-ancestry meta-analysis, and matched them to pre-computed linkage disequilibrium scores for the European reference dataset34. The intercept of the regression of the χ2 statistics from the height meta-analysis on the linkage disequilibrium score estimates the inflation in the mean χ2 that is due to confounding bias, such as cryptic relatedness or population stratification. The intercept was 1.4 (s.e.m. = 0.07), which is small when compared to the λ of 2.7. Furthermore, we also confirmed that the linkage disequilibrium score regression intercept is estimated upward because of the small number of variants on the ExomeChip and the selection criteria for these variants (that is, known GWAS hits).
The ratio statistic of (intercept − 1)/(mean χ2 − 1) is 0.067 (s.e.m. = 0.012), well within the normal range34, suggesting that most of the inflation (~93%) observed in the height association statistics is due to polygenic effects (Extended Data Fig. 2). Furthermore, to exclude the possibility that some of the observed associations between height and rare and low-frequency variants could be due to allele calling problems in the smaller studies, we performed a sensitivity meta-analysis with primarily European ancestry studies totalling >5,000 participants. We found very concordant effect sizes, suggesting that smaller studies do not bias our results (Extended Data Fig. 2). The RAREMETAL R package35 and the GCTA v1.24 (ref. 36) software were used to identify independent height association signals across the European descent meta-analysis results. RAREMETAL performs conditional analyses by using covariance matrices in order to distinguish true signals from those driven by linkage disequilibrium at adjacent known variants. First, we identified the lead variants (P < 2 × 10−7) based on a 1-Mb window centred on the most significantly associated variant and performed linkage disequilibrium pruning (r2 < 0.3) to avoid downstream problems in the conditional analyses due to co-linearity. We then conditioned on the linkage disequilibrium-pruned set of lead variants in RAREMETAL and kept new lead signals at P < 2 × 10−7. The process was repeated until no additional signal emerged below the pre-specified P-value threshold. The use of a 1-Mb window in RAREMETAL can obscure dependence between conditional signals in adjacent intervals in regions of extended linkage disequilibrium. To detect such instances, we performed joint analyses using GCTA with the ARIC and UK ExomeChip reference panels, both of which comprise >10,000 individuals of European descent. 
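The ratio statistic quoted above can be checked with a few lines of arithmetic. The sketch below uses the reported intercept (1.4) and ratio (0.067). The mean χ2 is not given in the text, so the value computed here is only the one implied by those two numbers, and the function names are hypothetical.

```python
def ldsc_ratio(intercept: float, mean_chi2: float) -> float:
    """Share of the inflation in mean chi-square attributed to confounding."""
    return (intercept - 1.0) / (mean_chi2 - 1.0)


def implied_mean_chi2(intercept: float, ratio: float) -> float:
    """Invert the ratio formula to recover the mean chi-square it implies."""
    return 1.0 + (intercept - 1.0) / ratio


reported_intercept = 1.4   # from the text
reported_ratio = 0.067     # from the text

# Derived value, not reported in the text.
mean_chi2 = implied_mean_chi2(reported_intercept, reported_ratio)
polygenic_fraction = 1.0 - reported_ratio

print(f"implied mean chi-square: {mean_chi2:.2f}")
print(f"fraction attributed to polygenic effects: {polygenic_fraction:.1%}")
```

One minus the ratio gives the roughly 93% of the inflation that the authors attribute to polygenic effects rather than confounding.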
With the exception of a handful of variants in a few genomic regions with extended linkage disequilibrium (for example, the HLA region on chromosome 6), the two pieces of software identified the same independent signals (at P < 2 × 10−7). To discover new height variants, we conditioned the height variants found in our ExomeChip study on the previously published GWAS height variants3 using the first release of the UK Biobank imputed dataset and regression methodology implemented in BOLT-LMM37. Because of the difference between the sample size of our discovery set (n = 458,927) and the UK Biobank (first release, n = 120,084), we applied a threshold of P < 0.05 to declare a height variant as independent in this analysis. We also explored an alternative approach based on approximate conditional analysis36. This latter method (SSimp) relies on summary statistics available from the same cohort, thus we first imputed summary statistics38 for exome variants, using summary statistics from a previous study3. Conversely, we imputed the top variants from this study3 using the summary statistics from the ExomeChip. Subsequently, we calculated effect sizes for each exome variant conditioned on the top variants of this study3 in two ways. First, we conditioned the imputed summary statistics of the exome variant on the summary statistics of the top variants that fell within 5 Mb of the target ExomeChip variant. Second, we conditioned the summary statistics of the ExomeChip variant on the imputed summary statistics of the hits of this study3. We then selected the option that yielded a higher imputation quality. For poorly tagged variants ( < 0.8), we simply used up-sampled HapMap summary statistics for the approximate conditional analysis. Pairwise SNP-by-SNP correlations were estimated from the UK10K data (TwinsUK39 and ALSPAC40 studies, n = 3,781). 
Several studies, totalling 252,501 independent individuals of European ancestry, became available after the completion of the discovery analyses, and were thus used for validation of our experiment. We validated the single-variant association results in eight studies, totalling 59,804 participants, genotyped on the ExomeChip using RAREMETAL31. We sought additional evidence for association for the top signals in two independent studies in the UK (UK Biobank) and Iceland (deCODE), comprising 120,084 and 72,613 individuals, respectively. We used the same quality control and analytical methodology as described above. Genotyping and study descriptions are provided in Supplementary Tables 1–3. For the combined analysis, we used the inverse-variance-weighted fixed effects meta-analysis method using METAL41. Significant associations were defined as those with a combined meta-analysis (discovery and validation) P < 2 × 10−7. We considered 81 variants with suggestive association in the discovery analyses (2 × 10−7 < P ≤ 2 × 10−6). Of those 81 variants, 55 reached significance after combining discovery and replication results based on a P < 2 × 10−7 (Supplementary Table 9). Furthermore, recessive modelling confirmed seven new independent markers with a P < 2 × 10−7 (Supplementary Table 10). One of these recessive signals is due to a rare X-linked variant in the AR gene (rs137852591, MAF = 0.21%). Because of its frequency, we only tested hemizygous men (we did not identify homozygous women for the minor allele) so we cannot distinguish between a true recessive mode of inheritance or a sex-specific effect for this variant. To test the independence and integrate all height markers from the discovery and validation phase, we used conditional analyses and GCTA ‘joint’ modelling36 in the combined discovery and validation set. This resulted in the identification of 606 independent height variants, including 252 non-synonymous or splice-site variants (Supplementary Table 11). 
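The inverse-variance-weighted fixed effects method used for the combined analysis can be sketched in a few lines. This is a generic illustration of the technique, not the METAL implementation. The function name and the per-study effect sizes and standard errors are hypothetical.

```python
from math import erfc, sqrt


def ivw_fixed_effects(betas, standard_errors):
    """Inverse-variance-weighted fixed effects meta-analysis.

    Returns the pooled effect, its standard error, the Z score and the
    two-sided P value under a normal approximation.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = sqrt(1.0 / total_weight)
    z = beta / se
    # erfc keeps precision for very small P values (two-sided tail area).
    p_value = erfc(abs(z) / sqrt(2.0))
    return beta, se, z, p_value


# Hypothetical discovery and validation estimates for one variant.
beta, se, z, p = ivw_fixed_effects([0.030, 0.026], [0.005, 0.008])
print(f"beta = {beta:.4f}, se = {se:.4f}, z = {z:.2f}, P = {p:.2e}")
print("significant at P < 2e-7:", p < 2e-7)
```

Each study is weighted by the inverse of its variance, so larger and more precise studies contribute more to the pooled estimate.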
If we consider only the initial set of lead SNPs with P < 2 × 10−7, we identified 561 independent variants. Of these 561 variants (selected without the validation studies), 560 have concordant direction of effect between the discovery and validation studies, and 548 variants have a P < 0.05 (466 variants with P < 8.9 × 10−5, Bonferroni correction for 561 tests), suggesting a very low false discovery rate (Supplementary Table 11). For the gene-based analyses, we applied two different sets of criteria to select variants, based on coding variant annotation from five prediction algorithms (PolyPhen2 HumDiv and HumVar, LRT, MutationTaster and SIFT)42. The mask labelled ‘broad’ included variants with a MAF < 0.05 that are nonsense, stop-loss, splice site, as well as missense variants that are annotated as damaging by at least one program mentioned above. The mask labelled ‘strict’ included only variants with a MAF < 0.05 that are nonsense, stop-loss, splice-site, as well as missense variants annotated as damaging by all five algorithms. We used two tests for gene-based testing, namely the SKAT43 and VT44 tests. Statistical significance for gene-based tests was set at a Bonferroni-corrected threshold of P < 5 × 10−7 (threshold for 25,000 genes and four tests). The gene-based discovery results were validated (same test and variants, when possible) in the same eight studies genotyped on the ExomeChip (n = 59,804 participants) that were used for the validation of the single-variant results (see above, and Supplementary Tables 1–3). Gene-based conditional analyses were performed in RAREMETAL. We accessed ExomeChip data from GIANT (BMI, waist:hip ratio), GLGC (total cholesterol, triglycerides, HDL-cholesterol, LDL-cholesterol), IBPC (systolic and diastolic blood pressure), MAGIC (glycaemic traits), REPROGEN (age at menarche and menopause), and DIAGRAM (type 2 diabetes) consortia. 
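The 'broad' and 'strict' variant masks described above can be expressed as a simple filter. The sketch below is one reading of the stated criteria. The function name, the consequence labels and the example variants are hypothetical, and the count of damaging predictions is assumed to have been computed beforehand from the five algorithms.

```python
# Consequence classes included in both masks regardless of predictions.
LOSS_OF_FUNCTION = {"nonsense", "stop-loss", "splice-site"}
N_ALGORITHMS = 5        # PolyPhen2 HumDiv, PolyPhen2 HumVar, LRT, MutationTaster, SIFT
MAF_CUTOFF = 0.05


def in_mask(maf: float, consequence: str, n_damaging: int, mask: str) -> bool:
    """Return True if a variant qualifies for the 'broad' or 'strict' mask."""
    if mask not in ("broad", "strict"):
        raise ValueError("mask must be 'broad' or 'strict'")
    if maf >= MAF_CUTOFF:
        return False
    if consequence in LOSS_OF_FUNCTION:
        return True
    if consequence != "missense":
        return False
    # Broad: damaging by at least one algorithm. Strict: damaging by all five.
    required = 1 if mask == "broad" else N_ALGORITHMS
    return n_damaging >= required


# Hypothetical variants: (MAF, consequence, number of damaging predictions).
variants = [(0.010, "missense", 2), (0.002, "nonsense", 0), (0.200, "missense", 5)]
for maf, consequence, n_damaging in variants:
    print(consequence, maf,
          in_mask(maf, consequence, n_damaging, "broad"),
          in_mask(maf, consequence, n_damaging, "strict"))
```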
For coronary artery disease, we accessed 1000 Genomes Project-imputed GWAS data released by CARDIoGRAMplusC4D45. DEPICT (http://www.broadinstitute.org/mpg/depict/) is a computational framework that uses probabilistically defined reconstituted gene sets to perform gene set enrichment and gene prioritization15. For a description of gene set reconstitution, refer to refs 15, 46. In brief, reconstitution was performed by extending pre-defined gene sets (such as Gene Ontology terms, canonical pathways, protein-protein interaction subnetworks and rodent phenotypes) with genes co-regulated with genes in these pre-defined gene set using large-scale microarray-based transcriptomics data. In order to adapt the gene set enrichment part of DEPICT for ExomeChip data (https://github.com/RebeccaFine/height-ec-depict), we made two principal changes. First, because DEPICT for GWAS incorporates all genes within a given linkage disequilibrium block around each index SNP, we modified DEPICT to take as input only the gene directly impacted by the coding SNP. Second, we adapted the way DEPICT adjusts for confounders (such as gene length) by generating null ExomeChip association results using Swedish ExomeChip data (Malmö Diet and Cancer (MDC), All New Diabetics in Scania (ANDIS), and Scania Diabetes Registry (SDR) cohorts, n = 11,899) and randomly assigning phenotypes from a normal distribution before conducting association analysis (see Supplementary Information). For the gene set enrichment analysis of the ExomeChip data, we used significant non-synonymous variants statistically independent of known GWAS hits (and that were present in the null ExomeChip data; see Supplementary Information for details). For gene set enrichment analysis of the GWAS data, we used all loci with a non-coding index SNP and that did not contain any of the novel ExomeChip genes. 
In visualizing the analysis, we used affinity propagation clustering47 to group the most similar reconstituted gene sets based on their gene memberships (see Supplementary Information). Within a ‘meta-gene set’, the best P value of any member gene set was used as representative for comparison. DEPICT for ExomeChip was written using the Python programming language and the code can be found at https://github.com/RebeccaFine/height-ec-depict. We also applied the PASCAL (http://www2.unil.ch/cbg/index.php?title=Pascal) pathway analysis tool16 to association summary statistics for all coding variants. In brief, the method derives gene-based scores (both SUM and MAX statistics) and subsequently tests for the over-representation of high gene scores in predefined biological pathways. We used standard pathway libraries from KEGG, REACTOME and BIOCARTA, and also added dichotomized (Z score > 3) reconstituted gene sets from DEPICT15. To accurately estimate SNP-by-SNP correlations even for rare variants, we used the UK10K data (TwinsUK39 and ALSPAC40 studies, n = 3781). To separate the contribution of regulatory variants from the coding variants, we also applied PASCAL to association summary statistics of only regulatory variants (20 kb upstream, gene body excluded) from a previous study3. In this way, we could classify pathways driven principally by coding, regulatory or mixed signals. For the generation of STC2 mutants (R44L and M86I), wild-type STC2 cDNA contained in pcDNA3.1/Myc-His(−) (Invitrogen)23 was used as a template. Mutagenesis was carried out using Quickchange (Stratagene), and all constructs were verified by sequence analysis. Recombinant wild-type STC2 and variants were expressed in human embryonic kidney (HEK) 293T cells (293tsA1609neo, ATCC CRL-3216) maintained in high-glucose DMEM supplemented 10% fetal bovine serum, 2 mM glutamine, nonessential amino acids, and gentamicin. The cells are routinely tested for mycoplasma contamination. 
Cells (6 × 10⁶) were plated onto 10-cm dishes and transfected 18 h later by calcium phosphate co-precipitation using 10 μg plasmid DNA. Medium was collected 48 h after transfection, cleared by centrifugation, and stored at −20 °C until use. Protein concentrations (58–66 nM) were determined by TRIFMA using antibodies described previously23. PAPP-A was expressed stably in HEK293T cells as previously reported48. Expressed levels of PAPP-A (27.5 nM) were determined by a commercial ELISA (AL-101, Ansh Labs). Culture supernatants containing wild-type STC2 or variants were adjusted to 58 nM, mixed with an equal volume of culture supernatant containing PAPP-A, corresponding to a 2.1-fold molar excess, and incubated at 37 °C. Samples were taken at 1, 2, 4, 6, 8, 16, and 24 h and stored at −20 °C. Specific proteolytic cleavage of ¹²⁵I-labeled IGFBP-4 is described in detail elsewhere49. In brief, the PAPP-A–STC2 complex mixtures were diluted (1:190) to a concentration of 72.5 pM PAPP-A and mixed with pre-incubated ¹²⁵I-IGFBP4 (10 nM) and IGF-1 (100 nM) in 50 mM Tris-HCl, 100 mM NaCl, 1 mM CaCl₂. Following 1 h incubation at 37 °C, reactions were terminated by the addition of SDS–PAGE sample buffer supplemented with 25 mM EDTA. Substrate and co-migrating cleavage products were separated by 12% non-reducing SDS–PAGE and visualized by autoradiography using a storage phosphor screen (GE Healthcare) and a Typhoon imaging system (GE Healthcare). Band intensities were quantified using ImageQuant TL 8.1 software (GE Healthcare). STC2 and covalent complexes between STC2 and PAPP-A were blotted onto PVDF membranes (Millipore) following separation by 3–8% SDS–PAGE. The membranes were blocked with 2% Tween-20, and equilibrated in 50 mM Tris-HCl, 500 mM NaCl, 0.1% Tween-20; pH 9 (TST). For STC2, the membranes were incubated with goat polyclonal anti-STC2 (R&D systems, AF2830) at 0.5 μg ml⁻¹ in TST supplemented with 2% skimmed milk for 1 h at 20 °C. 
For PAPP-A–STC2 complexes, the membranes were incubated with rabbit polyclonal anti-PAPP-A50 at 0.63 μg ml⁻¹ in TST supplemented with 2% skimmed milk for 16 h at 20 °C. Membranes were washed with TST and subsequently incubated with polyclonal rabbit anti-goat IgG–horseradish peroxidase (DAKO, P0449) or polyclonal swine anti-rabbit IgG–horseradish peroxidase (DAKO, P0217), respectively, diluted 1:2,000 in TST supplemented with 2% skimmed milk for 1 h at 20 °C. Following washing with TST, membranes were developed using enhanced chemiluminescence (ECL Prime, GE Healthcare). Images were captured using an ImageQuant LAS 4000 instrument (GE Healthcare). Summary genetic association results are available on the GIANT website (http://portals.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium).
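The concentrations quoted for the PAPP-A–STC2 incubation and the subsequent proteolysis assay can be cross-checked with simple dilution arithmetic. The sketch below assumes that "1:190" denotes a 190-fold dilution and that the 2.1-fold molar excess refers to STC2 over PAPP-A. Both readings are assumptions, and the function name is hypothetical.

```python
def mix_equal_volumes(conc_a_nm: float, conc_b_nm: float):
    """Concentrations (nM) of two solutes after mixing equal volumes."""
    return conc_a_nm / 2.0, conc_b_nm / 2.0


# Starting concentrations taken from the text (nM).
stc2_nm, pappa_nm = mix_equal_volumes(58.0, 27.5)

# Molar excess of STC2 over PAPP-A in the mixture (assumed direction).
molar_excess = stc2_nm / pappa_nm

# Assumed 190-fold dilution; convert nM to pM.
dilution_factor = 190
pappa_pm = pappa_nm / dilution_factor * 1000.0

print(f"STC2:PAPP-A molar ratio = {molar_excess:.2f}")
print(f"PAPP-A after dilution = {pappa_pm:.1f} pM")
```

Under these assumptions the computed values land close to the 2.1-fold excess and the 72.5 pM PAPP-A stated in the protocol.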
News Article | February 24, 2017
WASHINGTON, D.C.—When it comes to genome sequencing, visionaries like to throw around big numbers: There’s the UK Biobank, for example, which promises to decipher the genomes of 500,000 individuals, or Iceland’s effort to study the genomes of its entire human population. Yesterday, at a meeting here organized by the Smithsonian Initiative on Biodiversity Genomics and the Shenzhen, China–based sequencing powerhouse BGI, a small group of researchers upped the ante even more, announcing their intent to, eventually, sequence “all life on Earth.” Their plan, which does not yet have funding dedicated to it specifically but could cost at least several billions of dollars, has been dubbed the Earth BioGenome Project (EBP). Harris Lewin, an evolutionary genomicist at the University of California, Davis, who is part of the group that came up with this vision 2 years ago, says the EBP would take a first step toward its audacious goal by focusing on eukaryotes—the group of organisms that includes all plants, animals, and single-celled organisms such as amoebas. That strategy, and the EBP’s overall concept, found a receptive audience at BioGenomics2017, a gathering this week of conservationists, evolutionary biologists, systematists, and other biologists interested in applying genomics to their work. “This is a grand idea,” says Oliver Ryder, a conservation biologist at the San Diego Zoo Institute for Conservation Research in California. “If we really want to understand how life evolved, genome biology is going to be part of that.” Ryder and others drew parallels between the EBP and the Human Genome Project, which began as an ambitious, controversial, and, at the time, technically impossible proposal more than 30 years ago. That earlier effort eventually led not only to the sequencing of the first human genome, but also to entirely new DNA technologies that are at the center of many medical frontiers and the basis for a $20 billion industry. 
“People have learned from the human genome experience that [sequencing] is a tremendous advance in biology,” Lewin says. Many details about the EBP are still being worked out. But as currently proposed, the first step would be to sequence in great detail the DNA of a member of each eukaryotic family (about 9000 in all) to create reference genomes on par or better than the reference human genome. Next would come sequencing to a lesser degree a species from each of the 150,000 to 200,000 genera. Finally, EBP participants would get rough genomes of the 1.5 million remaining known eukaryotic species. These lower resolution genomes could be improved as needed by comparing them with the family references or by doing more sequencing, says EBP co-organizer Gene Robinson, a behavioral genomics researcher and director of the Carl R. Woese Institute for Genomic Biology at the University of Illinois in Urbana. The entire eukaryotic effort would likely cost about the same as it did to sequence that first human genome, estimate Lewin, Robinson, and EBP co-organizer John Kress, an evolutionary biologist at the Smithsonian National Museum of Natural History here. It took about $2.7 billion to read and order the 3 billion bases composing the human genome, about $4.8 billion in today’s dollars. With a comparable amount of support, the EBP’s eukaryotic work might be done in a decade, its organizers suggest. Such optimism arises from ever-decreasing DNA sequencing costs—one meeting presenter from Complete Genomics, based in Mountain View, California, says his company plans to be able to roughly sequence whole eukaryotic genomes for about $100 within a year—and improvements in sequencing technology that make possible higher quality genomes, at reasonable prices. “It became apparent to me that at a certain point, it would be possible to sequence all life on Earth,” Lewin says. 
Although some may find the multibillion-dollar price tag hard to justify for researchers not studying humans, the fundamentals of matter, or the mysteries of the universe, the EBP has a head start, thanks to the work of several research communities pursuing their own ambitious sequencing projects. These include the Genome 10K Project, which seeks to sequence 10,000 vertebrate genomes, one from each genus; i5K, an effort to decipher 5000 arthropods; and B10K, which expects to generate genomes for all 10,500 bird species. The EBP would help coordinate, compile, and perhaps fund these efforts. “The [EBP] concept is a community of communities,” Lewin says. There are also sequencing commitments from giants in the genomics field, such as China’s BGI, and the Wellcome Trust Sanger Institute in the United Kingdom. But at a planning meeting this week, it became clear that significant challenges await the EBP, even beyond funding. Although researchers from Brazil, China, and the United Kingdom said their nations are eager to participate in some way, the 20 people in attendance emphasized the need for the effort to be more international, with developing countries, particularly those with high biodiversity, helping shape the project’s final form. They proposed that the EBP could help develop sequencing and other technological experts and capabilities in those regions. The Global Genome Biodiversity Network, which is compiling lists and images of specimens at museums and other biorepositories around the world, could supply much of the DNA needed, but even broader participation is important, says Thomas Gilbert, an evolutionary biologist at the Natural History Museum of Denmark in Copenhagen. The planning group also stressed the need to develop standards to ensure high-quality genome sequences and to preserve associated information for each organism sequenced, such as where it was collected and what it looked like. 
Getting DNA samples from the wild may ultimately be the biggest challenge—and the biggest cost, several people noted. Not all museum specimens yield DNA preserved well enough for high-quality genomes. Even recently collected and frozen plant and animal specimens are not always handled correctly for preserving their DNA, says Guojie Zhang, an evolutionary biologist at BGI and the University of Copenhagen. And the lack of standards could undermine the project’s ultimate utility, notes Erich Jarvis, a neurobiologist at The Rockefeller University in New York City: “We could spend money on an effort for all species on the planet, but we could generate a lot of crap.” But Lewin is optimistic that won’t happen. After he outlined the EBP in the closing talk at BioGenomics2017, he was surrounded by researchers eager to know what they could do to help. “It’s good to try to bring together the tribes,” says Jose Lopez, a biologist from Nova Southeastern University in Fort Lauderdale, Florida, whose “tribe” has mounted “GIGA,” a project to sequence 7000 marine invertebrates. “It’s a big endeavor. We need lots of expertise and lots of people who can contribute.”
Agency: GTR | Branch: MRC | Program: | Phase: Intramural | Award Amount: 9.60M | Year: 2013
In UK Biobank, questionnaire data, physical measurements and biological samples have been collected from 500,000 men and women aged 40-69, and their health is now being followed long-term. A prospective cohort like UK Biobank allows reliable assessment of the relevance of many different exposures to the development of many different diseases. However, such studies need to be big because only a relatively small proportion of participants will develop any particular disease. It is now planned to conduct specialised imaging of the brain, heart, large blood vessels, abdomen, bone and joints in 100,000 UK Biobank participants. Although imaging has been done in some other studies, these have involved only small numbers of people (typically fewer than 5,000) and have focussed on imaging particular parts of the body. By contrast, the combination of imaging data from different parts of the body in 100,000 UK Biobank participants with the detailed non-imaging data already collected will provide a unique resource for researchers from around the world to investigate the causes of different diseases. (For example, dementia may be related to imaging measures not only from the brain but also from other parts of the body, as well as to genetic, biochemical or environmental information.)
Agency: GTR | Branch: MRC | Program: | Phase: Intramural | Award Amount: 10.00M | Year: 2013
During the next 18 months, it is intended to measure about 600,000 genetic markers in the DNA extracted from blood samples that have already been collected from each of the 500,000 participants in UK Biobank. When these “genotype” measurements are combined with whole genome sequence data from a few thousand or tens of thousands of UK individuals, it will be possible to “impute” (i.e. estimate) very many more genetic variants in the region of the DNA adjacent to the variants that have been measured. The combination of these detailed genotyping data with the extensive range of known biochemical risk factors that are currently being measured in blood and urine samples from the UK Biobank participants, along with the detailed information from questionnaires and physical measurements conducted at the initial assessment visits and from linkage to health records about the development of disease during long-term follow-up, will make UK Biobank uniquely rich as a resource for researchers from all areas of health to conduct studies of the relevance of genes to disease rapidly and cost-effectively. Hence, these detailed genotype data will facilitate research that harnesses the full power of UK Biobank to help understand the causes of many different diseases.
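The idea of imputation mentioned above can be illustrated with a deliberately simplified sketch. Real imputation software fits statistical models over large reference panels of sequenced genomes; the toy below merely copies the missing alleles from the single reference haplotype that best matches the measured markers. The function name, the 0/1 allele coding, and the tiny panel are all hypothetical and are not taken from UK Biobank.

```python
from typing import List, Optional, Sequence


def impute_from_reference(
    target: Sequence[Optional[int]],
    reference_panel: Sequence[Sequence[int]],
) -> List[int]:
    """Fill unmeasured positions (None) using the closest reference haplotype.

    Closeness is the number of mismatches at the positions that were
    actually genotyped in the target.
    """
    typed = [i for i, allele in enumerate(target) if allele is not None]

    def mismatches(haplotype: Sequence[int]) -> int:
        return sum(haplotype[i] != target[i] for i in typed)

    best = min(reference_panel, key=mismatches)
    return [best[i] if allele is None else allele
            for i, allele in enumerate(target)]


# Hypothetical fully sequenced reference haplotypes (0/1 alleles at 6 sites).
reference_panel = [
    [0, 0, 1, 0, 1, 1],
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 1],
]

# A participant genotyped at sites 0, 2 and 4 only; None marks unmeasured sites.
target = [1, None, 0, None, 0, None]

imputed = impute_from_reference(target, reference_panel)
print(imputed)  # [1, 1, 0, 1, 0, 0]
```

The measured alleles match the second reference haplotype exactly, so its alleles fill the three gaps. The sketch shows why a few thousand sequenced individuals can extend roughly 600,000 measured markers to many more estimated variants: nearby variants tend to be inherited together.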
Agency: GTR | Branch: MRC | Program: | Phase: Intramural | Award Amount: 9.37M | Year: 2016
The challenges of understanding the determinants of common life-threatening and disabling diseases are substantial. Such conditions are typically caused by many different exposures that each have moderate effects and interact with each other in complex ways. Prospective cohorts, such as UK Biobank, have advantages for the comprehensive and reliable quantification of the combined effects of different types of risk factor on health outcomes. In particular, exposures can be assessed before they are affected by disease or its management, and diseases can be assessed that are not readily investigated by retrospective studies (e.g. dementia). Moreover, all of the beneficial and adverse effects of a specific factor on the life-time risks of different diseases can be considered. Prospective studies do, however, need to involve large numbers of participants because only a relatively small proportion will develop any particular condition. UK Biobank has involved the collection of extensive baseline questionnaire data, physical measurements and biological samples from 500,000 men and women aged 40-69 at baseline, and their health is now being followed long-term. This proposal is for enhanced phenotyping of 100,000 of the participants with a set of imaging modalities that have been carefully chosen to provide considerable additional information that is likely to be relevant to many different health outcomes.
News Article | February 15, 2017
Already feeling drained so early in the year? Genes might contribute in a small but significant way to whether people report being tired and low in energy. This is according to UK researchers led by Vincent Deary of Northumbria University, Newcastle, and Saskia Hagenaars of the University of Edinburgh, in a paper in Springer Nature's journal Molecular Psychiatry. They found that genetics accounts for about eight percent of people's differences in self-reported tiredness/low energy; this implies that the vast majority of people's differences in self-reported tiredness are environmental in origin. The researchers found that the small genetic contributions to self-reported tiredness overlapped with genetic contributions to a range of mental and physical health conditions, to whether people smoke or carry too much weight, and to longevity. Their large-scale study analyzed genetic information of 111,749 participants who all indicated whether they felt tired or low in energy in the two weeks before their data were collected in the UK Biobank study. The large UK Biobank resource is used to identify the reasons behind certain diseases occurring in middle-aged and older people. It includes genetic samples as well as information about participants' physical and mental health, personality and cognitive functioning. The researchers working together on the study conducted various statistical analyses, including genome-wide associations, heritability estimates, and testing genetic associations between tiredness and more than 25 health-related variables. The researchers took factors such as age and gender into account. The findings suggest that it was genetic proneness to some illnesses, not just presence of these illnesses, that had an association with self-reports of tiredness. For instance, the researchers looked at people who were genetically prone to diabetes but did not have the condition, and the small genetic link with tiredness remained intact. 
Indeed, genetic overlap was found to exist between tiredness and a general tendency to poor health. "Being genetically predisposed to a range of mental and physical health complaints also predisposes people to report that they are more tired or lacking in energy," added Hagenaars. This applied to people with a higher genetic tendency to symptoms of the so-called metabolic syndrome, such as high cholesterol levels, and a high waist-to-hip ratio or obesity. According to the research team, these links raise the possibility of a genetic link between tiredness and vulnerability to physiological stress. Genetic associations were also found between tiredness and longevity, and between tiredness and a higher genetic tendency to weak grip strength, smoking, depression, and schizophrenia. The findings also suggest that people who have a tendency to experience more mental and emotional distress are more likely to report being tired. Overall, the findings confirm that self-reported tiredness is a partly heritable, complex phenomenon. It has genetic associations with various health, physiological, cognitive, personality, and affective processes. But the researchers emphasized that most of people's differences in tiredness are probably environmental. The genetic data available accounted for only 8.4 percent of people's differences in tiredness. "Although tiredness is largely causally heterogeneous, there may be a small but significant direct genetic contribution to tiredness proneness," Deary said, summarizing the findings of this part of the study. The research team foresees that more tests to find such links will be done in future as more genome-wide genotyping data becomes available.
News Article | February 15, 2017
A genetic predisposition to higher waist-to-hip ratio adjusted for body mass index (a measure of abdominal adiposity [fat]) was associated with an increased risk of type 2 diabetes and coronary heart disease, according to a study appearing in the February 14 issue of JAMA. Obesity, typically defined on the basis of body mass index (BMI), is a leading cause of type 2 diabetes and coronary heart disease (CHD). However, for any given BMI, body fat distribution can vary substantially; some individuals store proportionally more fat around their visceral organs (abdominal adiposity) than on their thighs and hips. In observational studies, abdominal adiposity has been associated with type 2 diabetes and CHD. Whether these associations represent causal relationships remains uncertain. Sekar Kathiresan, M.D., of Massachusetts General Hospital, Harvard Medical School, Boston, and colleagues examined whether a genetic predisposition to increased waist-to-hip ratio adjusted for BMI was associated with cardiometabolic quantitative traits (i.e., lipids, insulin, glucose, and systolic blood pressure), type 2 diabetes and CHD. Estimates for cardiometabolic traits were based on a combined data set consisting of summary results from 4 genome-wide association studies conducted from 2007 to 2015, including up to 322,154 participants, as well as individual-level, cross-sectional data from the UK Biobank collected from 2007 to 2011, including 111,986 individuals. The researchers found that genetic predisposition to higher waist-to-hip ratio adjusted for BMI was associated with increased levels of quantitative risk factors (lipids, insulin, glucose, and systolic blood pressure) as well as a higher risk for type 2 diabetes and CHD. "These results permit several conclusions. First, these findings lend human genetic support to previous observations associating abdominal adiposity with cardiometabolic disease," the authors write. 
"Second, these results suggest that body fat distribution, beyond simple measurement of BMI, could explain part of the variation in risk of type 2 diabetes and CHD noted across individuals and subpopulations. ... Third, waist-to-hip ratio adjusted for BMI might prove useful as a biomarker for the development of therapies to prevent type 2 diabetes and CHD." Editor's Note: Please see the article for additional information, including other authors, author contributions and affiliations, financial disclosures, funding and support, etc. Related material: The editorial, "When Will Mendelian Randomization Become Relevant for Clinical Practice and Public Health?" by George Davey Smith, M.D., D.Sc., of the University of Bristol, United Kingdom, and colleagues also is available at the For The Media website. To place an electronic embedded link to this study in your story This link will be live at the embargo time: http://jamanetwork.