Institute For Medizinische Biometrie Und Statistik

fur, Germany


Hengl T.,ISRIC World Soil Information | De Jesus J.M.,ISRIC World Soil Information | Heuvelink G.B.M.,ISRIC World Soil Information | Gonzalez M.R.,ISRIC World Soil Information | And 15 more authors.
PLoS ONE | Year: 2017

This paper describes the technical development and accuracy assessment of the most recent and improved version of the SoilGrids system at 250 m resolution (June 2016 update). SoilGrids provides global predictions for standard numeric soil properties (organic carbon, bulk density, cation exchange capacity (CEC), pH, soil texture fractions and coarse fragments) at seven standard depths (0, 5, 15, 30, 60, 100 and 200 cm), in addition to predictions of depth to bedrock and distribution of soil classes based on the World Reference Base (WRB) and USDA classification systems (ca. 280 raster layers in total). Predictions were based on ca. 150,000 soil profiles used for training and a stack of 158 remote sensing-based soil covariates (primarily derived from MODIS land products, SRTM DEM derivatives, climatic images and global landform and lithology maps), which were used to fit an ensemble of machine learning methods (random forest and gradient boosting and/or multinomial logistic regression) as implemented in the R packages ranger, xgboost, nnet and caret. The results of 10-fold cross-validation show that the ensemble models explain between 56% (coarse fragments) and 83% (pH) of variation, with an overall average of 61%. Improvements in the relative accuracy considering the amount of variation explained, in comparison to the previous version of SoilGrids at 1 km spatial resolution, range from 60 to 230%. Improvements can be attributed to: (1) the use of machine learning instead of linear regression, (2) considerable investments in preparing finer resolution covariate layers and (3) the insertion of additional soil profiles. Further development of SoilGrids could include refinement of methods to incorporate input uncertainties and derivation of posterior probability distributions (per pixel), and further automation of spatial modeling so that soil maps can be generated for potentially hundreds of soil variables. Another area of future research is the development of methods for multiscale merging of SoilGrids predictions with local and/or national gridded soil products (e.g. up to 50 m spatial resolution) so that increasingly more accurate, complete and consistent global soil information can be produced. SoilGrids are available under the Open Data Base License. © 2017 Hengl et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
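
As a rough illustration of what "variation explained under 10-fold cross-validation" means in the abstract above, the following Python sketch pools out-of-fold predictions and reports 1 - SSE/SST. It is not the SoilGrids workflow: the paper fits an ensemble in R (ranger, xgboost, nnet, caret), whereas this sketch swaps in a one-covariate least-squares line, and the data, function names and seed are all hypothetical.

```python
import random


def kfold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cv_variance_explained(xs, ys, k=10, seed=0):
    """Fraction of variation explained, 1 - SSE/SST, from out-of-fold predictions."""
    n = len(xs)
    preds = [0.0] * n
    for fold in kfold_indices(n, k, seed):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        # Fit on the other k-1 folds only, then predict the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = a + b * xs[i]
    mean_y = sum(ys) / n
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst


# Synthetic (hypothetical) data: one covariate, linear signal plus Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 1.0) for x in xs]
r2 = cv_variance_explained(xs, ys)
print(f"variation explained (10-fold CV): {100 * r2:.0f}%")
```

Because every prediction comes from a model that never saw the corresponding observation, the resulting figure is a more conservative accuracy estimate than the in-sample fit.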

Dichgans M.,Ludwig Maximilians University of Munich | Dichgans M.,Synergy Systems | Malik R.,Ludwig Maximilians University of Munich | Konig I.R.,Institute For Medizinische Biometrie Und Statistik | And 43 more authors.
Stroke | Year: 2014

Background and Purpose: Ischemic stroke (IS) and coronary artery disease (CAD) share several risk factors and each has a substantial heritability. We conducted a genome-wide analysis to evaluate the extent of shared genetic determination of the two diseases. Methods: Genome-wide association data were obtained from the METASTROKE, Coronary Artery Disease Genomewide Replication and Meta-analysis (CARDIoGRAM), and Coronary Artery Disease (C4D) Genetics consortia. We first analyzed common variants reaching a nominal threshold of significance (P<0.01) for CAD for their association with IS and vice versa. We then examined specific overlap across phenotypes for variants that reached a high threshold of significance. Finally, we conducted a joint meta-analysis on the combined phenotype of IS or CAD. Corresponding analyses were performed restricted to the 2167 individuals with the ischemic large artery stroke (LAS) subtype. Results: Common variants associated with CAD at P<0.01 were associated with a significant excess risk for IS and for LAS and vice versa. Among the 42 known genome-wide significant loci for CAD, 3 and 5 loci were significantly associated with IS and LAS, respectively. In the joint meta-analyses, 15 loci passed genome-wide significance (P<5×10⁻⁸) for the combined phenotype of IS or CAD and 17 loci passed genome-wide significance for LAS or CAD. Because these loci had prior evidence for genome-wide significance for CAD, we specifically analyzed the respective signals for IS and LAS and found evidence for association at chr12q24/SH2B3 (P_IS=1.62×10⁻⁷) and ABO (P_IS=2.6×10⁻⁴), as well as at HDAC9 (P_LAS=2.32×10⁻¹²), 9p21 (P_LAS=3.70×10⁻⁶), RAI1-PEMT-RASD1 (P_LAS=2.69×10⁻⁵), EDNRA (P_LAS=7.29×10⁻⁴), and CYP17A1-CNNM2-NT5C2 (P_LAS=4.9×10⁻⁴). Conclusions: Our results demonstrate substantial overlap in the genetic risk of IS and particularly the LAS subtype with CAD. © 2013 American Heart Association, Inc.

Preuss M.,Institute For Medizinische Biometrie Und Statistik | Preuss M.,University of Lübeck | Konig I.R.,Institute For Medizinische Biometrie Und Statistik | Thompson J.R.,University of Leicester | And 42 more authors.
Circulation: Cardiovascular Genetics | Year: 2010

Background: Recent genome-wide association studies (GWAS) of myocardial infarction (MI) and other forms of coronary artery disease (CAD) have led to the discovery of at least 13 genetic loci. In addition to the effect size, power to detect associations is largely driven by sample size. Therefore, to maximize the chance of finding novel susceptibility loci for CAD and MI, the Coronary ARtery DIsease Genome-wide Replication And Meta-analysis (CARDIoGRAM) consortium was formed. Methods and Results: CARDIoGRAM combines data from all published and several unpublished GWAS in individuals with European ancestry; includes >22 000 cases with CAD, MI, or both and >60 000 controls; and unifies samples from the Atherosclerotic Disease VAscular functioN and genetiC Epidemiology study, CADomics, Cohorts for Heart and Aging Research in Genomic Epidemiology, deCODE, the German Myocardial Infarction Family Studies I, II, and III, Ludwigshafen Risk and Cardiovascular Health Study/AtheroRemo, MedStar, Myocardial Infarction Genetics Consortium, Ottawa Heart Genomics Study, PennCath, and the Wellcome Trust Case Control Consortium. Genotyping was carried out on Affymetrix or Illumina platforms followed by imputation of genotypes in most studies. On average, 2.2 million single nucleotide polymorphisms were generated per study. The results from each study are combined using meta-analysis. As proof of principle, we meta-analyzed risk variants at 9p21 and found that rs1333049 confers a 29% increase in risk for MI per copy (P=2×10⁻²⁰). Conclusion: CARDIoGRAM is poised to contribute to our understanding of the role of common genetic variation on risk for CAD and MI. © 2010 American Heart Association, Inc.
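
The abstract above says that per-study results are combined by meta-analysis but does not say which model is used. As a minimal sketch, assuming a fixed-effect inverse-variance model (a common choice for GWAS, not confirmed by the source), the Python below pools per-study log odds ratios for one variant. The study estimates and standard errors are hypothetical placeholders, not CARDIoGRAM data.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical per-study log odds ratios (per risk-allele copy) and standard errors.
study_betas = [0.22, 0.27, 0.25]
study_ses = [0.05, 0.04, 0.06]

beta, se = fixed_effect_meta(study_betas, study_ses)
odds_ratio = math.exp(beta)       # per-copy odds ratio under a log-additive model
p_value = two_sided_p(beta / se)
print(f"pooled OR per copy = {odds_ratio:.2f}, P = {p_value:.1e}")
```

Under this log-additive reading, a per-copy odds ratio of 1.29 corresponds to the "29% increase in risk per copy" wording in the abstract. Precise studies (small standard errors) dominate the pooled estimate, which is why adding samples sharpens the P value even when the effect size stays the same.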

Davies R.W.,University of Ottawa | Wells G.A.,University of Ottawa | Stewart A.F.R.,Medizinische Klinik II | Erdmann J.,Duke University | And 29 more authors.
Circulation: Cardiovascular Genetics | Year: 2012

Background: Recent genome-wide association studies (GWAS) have identified several novel loci that reproducibly associate with coronary artery disease (CAD) and/or myocardial infarction risk. However, known common CAD risk variants explain only 10% of the predicted genetic heritability of the disease, suggesting that important genetic signals remain to be discovered. Methods and Results: We performed a discovery meta-analysis of 5 GWAS involving 13 949 subjects (7123 cases, 6826 control subjects) imputed at approximately 5 million single nucleotide polymorphisms, using pilot 1000 Genomes-based haplotypes. Promising loci were followed up in an additional 5 studies with 11 032 subjects (5211 cases, 5821 control subjects). A novel CAD locus on chromosome 6p21.3 in the major histocompatibility complex (MHC) between HCG27 and HLA-C was identified and achieved genome-wide significance in the combined analysis (rs3869109; P_discovery=3.3×10⁻⁷, P_replication=5.3×10⁻⁴, P_combined=1.12×10⁻⁹). A subanalysis combining discovery GWAS showed an attenuation of significance when stringent corrections for European population structure were used (P=4.1×10⁻¹⁰ versus 3.2×10⁻⁷), suggesting that the observed signal is partly confounded due to population stratification. This gene-dense region plays an important role in inflammation, immunity, and self-cell recognition. To determine whether the underlying association was driven by MHC class I alleles, we statistically imputed common HLA alleles into the discovery subjects; however, no single common HLA type contributed significantly or fully explained the observed association. Conclusions: We have identified a novel locus in the MHC associated with CAD. MHC genes regulate inflammation and T-cell responses that contribute importantly to the initiation and propagation of atherosclerosis. Further laboratory studies will be required to understand the biological basis of this association and identify the causative allele(s). © 2012 American Heart Association, Inc.
