Ober U.,University of Göttingen |
Ayroles J.F.,North Carolina State University |
Ayroles J.F.,Harvard University |
Stone E.A.,North Carolina State University |
And 8 more authors.
PLoS Genetics | Year: 2012
Predicting organismal phenotypes from genotype data is important for plant and animal breeding, medicine, and evolutionary biology. Genomic-based phenotype prediction has been applied for single-nucleotide polymorphism (SNP) genotyping platforms, but not using complete genome sequences. Here, we report genomic prediction for starvation stress resistance and startle response in Drosophila melanogaster, using ~2.5 million SNPs determined by sequencing the Drosophila Genetic Reference Panel population of inbred lines. We constructed a genomic relationship matrix from the SNP data and used it in a genomic best linear unbiased prediction (GBLUP) model. We assessed predictive ability as the correlation between predicted genetic values and observed phenotypes by cross-validation, and found a predictive ability of 0.239±0.008 (0.230±0.012) for starvation resistance (startle response). The predictive ability of BayesB, a Bayesian method with internal SNP selection, was not greater than GBLUP. Selection of the 5% SNPs with either the highest absolute effect or variance explained did not improve predictive ability. Predictive ability decreased only when fewer than 150,000 SNPs were used to construct the genomic relationship matrix. We hypothesize that predictive power in this population stems from the SNP-based modeling of the subtle relationship structure caused by long-range linkage disequilibrium and not from population structure or SNPs in linkage disequilibrium with causal variants. We discuss the implications of these results for genomic prediction in other organisms. © 2012 Ober et al.
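The GBLUP step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes genotypes coded 0/1/2, a VanRaden-style scaling of the genomic relationship matrix, a simple training-set mean as the only fixed effect, and a known variance ratio `lam` (residual over genetic variance). The function names and the toy data in the usage note are hypothetical.

```python
import numpy as np


def genomic_relationship_matrix(genotypes):
    """Build a genomic relationship matrix from an (n_lines, n_snps) array.

    Assumes allele counts coded 0/1/2 and VanRaden-style scaling; the
    abstract does not say which formulation the authors used.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0              # allele frequency per SNP
    Z = M - 2.0 * p                       # center each SNP column
    denom = 2.0 * np.sum(p * (1.0 - p))   # scales G to the additive variance
    if denom == 0:
        raise ValueError("all SNPs are monomorphic")
    return Z @ Z.T / denom


def gblup_predict(G, y, train_idx, test_idx, lam):
    """Predict phenotypes of test lines from training lines.

    lam is the assumed ratio of residual to genetic variance; in practice
    it would be estimated (for example by REML), which is omitted here.
    """
    y = np.asarray(y, dtype=float)
    G_tt = G[np.ix_(train_idx, train_idx)]   # training x training block
    G_vt = G[np.ix_(test_idx, train_idx)]    # test x training block
    mu = y[train_idx].mean()                 # crude estimate of the mean
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train_idx)),
                            y[train_idx] - mu)
    return mu + G_vt @ alpha
```

In a cross-validation loop, predictive ability in the sense used above would then be the correlation between `gblup_predict(...)` and the observed phenotypes of the held-out lines, for example via `np.corrcoef(pred, y[test_idx])[0, 1]`.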
Wolc A.,Iowa State University |
Wolc A.,Dallas Center |
Zhao H.H.,Iowa State University |
Arango J.,Dallas Center |
And 10 more authors.
Genetics Selection Evolution | Year: 2015
Background: Genomic selection (GS) using estimated breeding values (GS-EBV) based on dense marker data is a promising approach for genetic improvement. A simulation study was undertaken to illustrate the opportunities offered by GS for designing breeding programs. It consisted of a selection program for a sex-limited trait in layer chickens, which was developed by deterministic predictions under different scenarios. Later, one of the possible schemes was implemented in a real population of layer chickens. Methods: In the simulation, the aim was to double the response to selection per year by reducing the generation interval by 50%, while maintaining the same rate of inbreeding per year. We found that GS with retraining could achieve the set objectives while requiring 75% fewer reared birds and 82% fewer phenotyped birds per year. A multi-trait GS scenario was subsequently implemented in a real population of brown egg laying hens. The population was split into two sub-lines: one was submitted to conventional phenotypic selection, and the other was selected based on genomic prediction. At the end of the 3-year experiment, the two sub-lines were compared for multiple performance traits that are relevant for commercial egg production. Results: Birds that were selected based on genomic prediction outperformed those that were submitted to conventional selection for most of the 16 traits that were included in the index used for selection. However, although the two programs were designed to achieve the same rate of inbreeding per year, the realized inbreeding per year assessed from pedigree was higher in the genomic selected line than in the conventionally selected line. Conclusions: The results demonstrate that GS is a promising alternative to conventional breeding for genetic improvement of layer chickens. © 2015 Wolc et al.
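The link between halving the generation interval and doubling the annual response can be illustrated with the standard breeder's equation, in which response per year is selection intensity times accuracy times additive genetic standard deviation, divided by the generation interval. The abstract does not state this formula or any of its inputs, so the function name and all numeric values below are hypothetical, and the sketch assumes intensity and accuracy stay unchanged when the interval is shortened.

```python
def annual_response(intensity, accuracy, sigma_a, generation_interval):
    """Expected genetic gain per year under the breeder's equation.

    All arguments are illustrative assumptions, not values from the study.
    """
    if generation_interval <= 0:
        raise ValueError("generation interval must be positive")
    return intensity * accuracy * sigma_a / generation_interval


# Hypothetical inputs: same intensity, accuracy and genetic SD in both cases.
baseline = annual_response(1.4, 0.6, 1.0, generation_interval=1.0)
halved = annual_response(1.4, 0.6, 1.0, generation_interval=0.5)
```

With everything else held equal, `halved` is twice `baseline`. In practice the accuracy of genomic predictions for young birds may differ from that of conventional estimates, so the realized ratio depends on how the accuracy term changes.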
Baes C.F.,Bern University of Applied Sciences |
Baes C.F.,Qualitas AG |
Dolezal M.A.,Polytechnic of Milan |
Dolezal M.A.,University of Veterinary Medicine Vienna |
And 13 more authors.
BMC Genomics | Year: 2014
Background: Advances in human genomics have allowed unprecedented productivity in terms of algorithms, software, and literature available for translating raw next-generation sequence data into high-quality information. The challenges of variant identification in organisms with lower-quality reference genomes are less well documented. We explored the consequences of commonly recommended preparatory steps and the effects of single- and multi-sample variant identification methods using four publicly available software applications (Platypus, HaplotypeCaller, Samtools and UnifiedGenotyper) on whole genome sequence data of 65 key ancestors of Swiss dairy cattle populations. Accuracy of calling next-generation sequence variants was assessed by comparison to the same loci from medium and high-density single nucleotide variant (SNV) arrays. Results: The total number of SNVs identified varied by software and method, with single (multi) sample results ranging from 17.7 to 22.0 (16.9 to 22.0) million variants. Computing time varied considerably between software. Preparatory realignment of insertions and deletions and subsequent base quality score recalibration had only minor effects on the number and quality of SNVs identified by different software, but increased computing time considerably. Average concordance for single (multi) sample results with high-density chip data was 58.3% (87.0%) and average genotype concordance in correctly identified SNVs was 99.2% (99.2%) across software. The average quality of SNVs identified, measured as the ratio of transitions to transversions, was higher using single-sample methods than multi-sample methods. A consensus approach using results of different software generally provided the highest variant quality in terms of transition/transversion ratio.
Conclusions: Our findings serve as a reference for variant identification pipeline development in non-human organisms and help assess the implication of preparatory steps in next-generation sequencing pipelines for organisms with incomplete reference genomes (pipeline code is included). Benchmarking this information should prove particularly useful in processing next-generation sequencing data for use in genome-wide association studies and genomic selection.
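Two of the quality measures mentioned in this abstract, the transition/transversion ratio and genotype concordance with array data, can be sketched as below. This is a simplified illustration rather than the authors' pipeline code: it assumes biallelic SNVs given as (reference, alternate) base pairs, and genotypes stored in plain dictionaries keyed by a locus string. The function names and the toy inputs in the usage note are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref.upper(), alt.upper()}
    return pair == PURINES or pair == PYRIMIDINES


def ts_tv_ratio(snvs):
    """Ratio of transitions to transversions over (ref, alt) pairs.

    Entries where ref equals alt are ignored; returns infinity when
    there are no transversions.
    """
    ts = tv = 0
    for ref, alt in snvs:
        if ref.upper() == alt.upper():
            continue                      # not a substitution
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    return ts / tv if tv else float("inf")


def genotype_concordance(called, array):
    """Fraction of loci present in both sets whose genotypes agree.

    Genotypes are compared as unordered allele pairs, so "AG" equals "GA".
    """
    shared = set(called) & set(array)
    if not shared:
        raise ValueError("no shared loci")
    matches = sum(sorted(called[k]) == sorted(array[k]) for k in shared)
    return matches / len(shared)
```

For example, `ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")])` counts two transitions and one transversion, and `genotype_concordance({"1:100": "AG", "1:200": "CC"}, {"1:100": "GA", "1:200": "CT", "1:300": "TT"})` compares only the two loci that appear in both inputs.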