Dairy Futures Cooperative Research Center

Melbourne, Australia

Daetwyler H.D.,Australian Department of Primary Industries and Fisheries | Daetwyler H.D.,La Trobe University | Daetwyler H.D.,Dairy Futures Cooperative Research Center | Capitan A.,French National Institute for Agricultural Research | And 35 more authors.
Nature Genetics | Year: 2014

The 1000 bull genomes project supports the goal of accelerating the rates of genetic gain in domestic cattle while at the same time considering animal health and welfare by providing the annotated sequence variants and genotypes of key ancestor bulls. In the first phase of the 1000 bull genomes project, we sequenced the whole genomes of 234 cattle to an average of 8.3-fold coverage. This sequencing includes data for 129 individuals from the global Holstein-Friesian population, 43 individuals from the Fleckvieh breed and 15 individuals from the Jersey breed. We identified a total of 28.3 million variants, with an average of 1.44 heterozygous sites per kilobase for each individual. We demonstrate the use of this database in identifying a recessive mutation underlying embryonic death and a dominant mutation underlying lethal chondrodysplasia. We also performed genome-wide association studies for milk production and curly coat, using imputed sequence variants, and identified variants associated with these traits in cattle. © 2014 Nature America, Inc.


Hayes B.J.,Australian Department of Primary Industries and Fisheries | Hayes B.J.,Dairy Futures Cooperative Research Center | Hayes B.J.,La Trobe University | Lewin H.A.,University of California at Davis | And 3 more authors.
Trends in Genetics | Year: 2013

As the global population and global wealth both continue to increase, so will the demand for livestock products, especially those that are highly nutritious. However, competition with other uses for land and water resources will also intensify, necessitating more efficient livestock production. In addition, as climate change escalates, reduced methane emissions from cattle and sheep will be a critical goal. Application of new technologies, including genomic selection and advanced reproductive technologies, will play an important role in meeting these challenges. Genomic selection, which enables prediction of the genetic merit of animals from genome-wide SNP markers, has already been adopted by dairy industries worldwide and is expected to double genetic gains for milk production and other traits. Here, we review these gains. We also discuss how the use of whole-genome sequence data should both accelerate the rate of gain and enable rapid discovery and elimination of genetic defects from livestock populations. © 2012 Elsevier Ltd.


Pryce J.E.,Australian Department of Primary Industries and Fisheries | Pryce J.E.,Dairy Futures Cooperative Research Center | Hayes B.J.,Australian Department of Primary Industries and Fisheries | Hayes B.J.,Dairy Futures Cooperative Research Center | And 4 more authors.
Journal of Dairy Science | Year: 2012

In this study, 3 strategies for controlling progeny inbreeding in mating plans were compared. The strategies used information from pedigree inbreeding coefficients, genomic relationships, or shared runs of homozygosity. The strategies were compared for the reduction in genetic gain and progeny inbreeding that would be expected from selected matings, and for the decrease of homozygosity of deleterious recessive alleles. Using real pedigree, genotype [43,115 single nucleotide polymorphism (SNP) markers], and estimated breeding value data from Holstein cattle, mating plans were derived for herds of 300 cows with 20 sires available for mating, replicated 50 times. Each of the 300 individuals allocated as dams were matched to 1 of 20 sires to maximize genetic merit minus the penalty for estimated progeny inbreeding, and given the restriction that the sire could not be mated to more than 10% of the cows. The strategy that used a genomic relationship matrix (GRM) was the most effective in reducing average progeny inbreeding; this strategy also resulted in fewer homozygous SNP out of 1,000 low-frequency SNP compared with the strategy using pedigree information. In the future, large numbers of cattle may be genotyped for low-density SNP panels. A GRM constructed using 3,123 SNP produced results similar to a GRM constructed using the full 43,115 SNP. These results demonstrate that using GRM information, a 1% reduction in progeny inbreeding (valued at around $5 per cow) can be made with very little compromise in the overall breeding objective. These results and the availability of low-cost, low-density genotyping make it attractive to apply mating plans that use genomic information in commercial dairy herds. © 2012 American Dairy Science Association.
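The mating-plan idea in the abstract above can be illustrated with a minimal sketch. It assumes a VanRaden-style genomic relationship matrix and a simple greedy allocation; the abstract does not state which optimiser the authors used, so the function names, the `penalty` weight and the greedy rule are all hypothetical. Expected progeny inbreeding is taken as half the genomic relationship between the two parents.

```python
def vanraden_grm(genotypes):
    """Genomic relationship matrix from allele counts (0, 1 or 2 per SNP).

    genotypes: one list of allele counts per animal, all of equal length.
    Assumes at least one SNP is polymorphic, so the scaling term is non-zero.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each SNP.
    p = [sum(g[j] for g in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    # Centre each genotype by twice the allele frequency.
    z = [[g[j] - 2 * p[j] for j in range(m)] for g in genotypes]
    return [[sum(a * b for a, b in zip(z[i], z[k])) / scale
             for k in range(n)] for i in range(n)]


def greedy_mating_plan(merit, relationship, penalty, max_per_sire):
    """Assign each dam to one sire, best-scoring pairs first (a heuristic).

    merit[d][s]: expected genetic merit of progeny from dam d and sire s.
    relationship[d][s]: genomic relationship between dam d and sire s.
    penalty: hypothetical cost per unit of expected progeny inbreeding.
    max_per_sire: cap on matings per sire (e.g. 10% of the herd).
    """
    def score(d, s):
        # Expected progeny inbreeding is half the parents' relationship.
        return merit[d][s] - penalty * 0.5 * relationship[d][s]

    n_dams, n_sires = len(merit), len(merit[0])
    pairs = sorted(((score(d, s), d, s)
                    for d in range(n_dams) for s in range(n_sires)),
                   reverse=True)
    used = [0] * n_sires
    plan = {}
    for _, d, s in pairs:
        if d not in plan and used[s] < max_per_sire:
            plan[d] = s
            used[s] += 1
    return plan


# Toy herd: sire 0 has higher merit but is related to dams 0 and 1.
merit = [[10, 9]] * 4
relationship = [[0.5, 0.0], [0.5, 0.0], [0.0, 0.0], [0.0, 0.0]]
plan = greedy_mating_plan(merit, relationship, penalty=10, max_per_sire=2)
```

A greedy pass is not guaranteed to find the best plan; an exact assignment solver would be the natural replacement if optimality matters.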


Hayes B.J.,Australian Department of Primary Industries and Fisheries | Hayes B.J.,Dairy Futures Cooperative Research Center
Journal of Dairy Science | Year: 2011

Large numbers of dairy cattle are now routinely genotyped for dense single nucleotide polymorphism (SNP) arrays for the purpose of predicting genomic estimated breeding values. Such SNP arrays contain very good information for parentage assignment and pedigree reconstruction. The main challenge in using this information for parentage assignment and pedigree reconstruction is development of computationally efficient strategies that enable a candidate animal to be assigned its sire and dam with the large volume of data. Here we describe an efficient algorithm for parentage assignment with SNP data and demonstrate very accurate assignment with 50,000-SNP and 3,000-SNP panels. The computer code implementing the algorithm is given in the Appendix. © 2011 American Dairy Science Association.
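The algorithm from the paper's Appendix is not reproduced on this page. As an illustration only, the sketch below uses a widely known exclusion test, counting opposing homozygotes, which may or may not match the authors' method. Genotypes are assumed to be coded as allele counts (0, 1, 2) with `None` for missing calls, and `max_conflicts` is a hypothetical tolerance for genotyping errors.

```python
def opposing_homozygotes(offspring, parent):
    """Count SNPs where the two animals are homozygous for different alleles.

    A true parent-offspring pair should show (almost) none, because the
    offspring must carry one allele from the parent at every locus.
    Missing calls (None) never match {0, 2}, so they are ignored.
    """
    return sum(1 for o, p in zip(offspring, parent) if {o, p} == {0, 2})


def assign_parent(offspring, candidates, max_conflicts=1):
    """Return the candidate id with the fewest conflicts, or None.

    candidates: dict mapping candidate id to its genotype list.
    """
    best_id, best_count = None, None
    for cand_id, genotype in candidates.items():
        count = opposing_homozygotes(offspring, genotype)
        if best_count is None or count < best_count:
            best_id, best_count = cand_id, count
    if best_count is None or best_count > max_conflicts:
        return None
    return best_id


calf = [0, 1, 2, 2, 0]
sires = {"sireA": [0, 1, 2, 1, 1], "sireB": [2, 2, 0, 2, 0]}
chosen = assign_parent(calf, sires)
```

With dense panels the count separates true parents from unrelated animals sharply, which is why a small fixed tolerance is often enough in practice.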


Druet T.,University of Liège | Macleod I.M.,University of Melbourne | Hayes B.J.,La Trobe University | Hayes B.J.,Australian Department of Primary Industries and Fisheries | Hayes B.J.,Dairy Futures Cooperative Research Center
Heredity | Year: 2014

Genomic prediction from whole-genome sequence data is attractive, as the accuracy of genomic prediction is no longer bounded by extent of linkage disequilibrium between DNA markers and causal mutations affecting the trait, given the causal mutations are in the data set. A cost-effective strategy could be to sequence a small proportion of the population, and impute sequence data to the rest of the reference population. Here, we describe strategies for selecting individuals for sequencing, based on either pedigree relationships or haplotype diversity. Performance of these strategies (number of variants detected and accuracy of imputation) was evaluated in sequence data simulated through a real Belgian Blue cattle pedigree. A strategy (AHAP), which selected a subset of individuals for sequencing that maximized the number of unique haplotypes (from single-nucleotide polymorphism panel data) sequenced gave good performance across a range of variant minor allele frequencies. We then investigated the optimum number of individuals to sequence by fold coverage given a maximum total sequencing effort. At 600 total fold coverage (x 600), the optimum strategy was to sequence 75 individuals at eightfold coverage. Finally, we investigated the accuracy of genomic predictions that could be achieved. The advantage of using imputed sequence data compared with dense SNP array genotypes was highly dependent on the allele frequency spectrum of the causative mutations affecting the trait. When this followed a neutral distribution, the advantage of the imputed sequence data was small; however, when the causal mutations all had low minor allele frequencies, using the sequence data improved the accuracy of genomic prediction by up to 30%. © 2014 Macmillan Publishers Limited All rights reserved.
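The trade-off between number of animals and depth under a fixed sequencing budget is simple arithmetic, sketched here. The function name and the list of candidate depths are hypothetical; only the 600-fold budget and the 75 animals at eightfold coverage come from the abstract.

```python
def sequencing_designs(total_fold, coverages):
    """Number of animals that can be sequenced at each per-animal depth.

    total_fold: total sequencing effort, in fold coverage summed over animals.
    coverages: candidate per-animal fold coverages (positive integers).
    """
    return {c: total_fold // c for c in coverages}


designs = sequencing_designs(600, [2, 4, 8, 12])
```

Here `designs[8]` is 75, matching the design reported as optimal. Choosing among the designs still requires the simulation results, since variant detection and imputation accuracy both depend on depth.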


Brondum R.F.,University of Aarhus | Su G.,University of Aarhus | Lund M.S.,University of Aarhus | Bowman P.J.,Australian Department of Primary Industries and Fisheries | And 5 more authors.
BMC Genomics | Year: 2012

Background: The accuracy of genomic prediction is highly dependent on the size of the reference population. For small populations, including information from other populations could improve this accuracy. The usual strategy is to pool data from different populations; however, this has not proven as successful as hoped for with distantly related breeds. BayesRS is a novel approach to share information across populations for genomic predictions. The approach allows information to be captured even where the phase of SNP alleles and causative mutation alleles is reversed across populations, or the actual causative mutation is different between the populations but affects the same gene. Proportions of a four-distribution mixture for SNP effects in segments of fixed size along the genome are derived from one population and set as location specific prior proportions of distributions of SNP effects for the target population. The model was tested using dairy cattle populations of different breeds: 540 Australian Jersey bulls, 2297 Australian Holstein bulls and 5214 Nordic Holstein bulls. The traits studied were protein-, fat- and milk yield. Genotypic data were Illumina 777K SNPs, real or imputed. Results: Results showed an increase in accuracy of up to 3.5% for the Jersey population when using BayesRS with a prior derived from Australian Holstein compared to a model without location specific priors. The increase in accuracy was, however, lower than was achieved when reference populations were combined to estimate SNP effects, except in the case of fat yield. The small size of the Jersey validation set meant that these improvements in accuracy were not significant using a Hotelling-Williams t-test at the 5% level. An increase in accuracy of 1-2% for all traits was observed in the Australian Holstein population when using a prior derived from the Nordic Holstein population compared to using no prior information. These improvements were significant (P<0.05) using the Hotelling-Williams t-test for protein- and fat yield. Conclusion: For some traits the method might be advantageous compared to pooling of reference data for distantly related populations, but further investigation is needed to confirm the results. For closely related populations the method does not perform better than pooling reference data. However, it does give an increased accuracy compared to analysis based on only one reference population, without an increased computational burden. The approach described here provides a general setup for inclusion of location specific priors: the approach could be used to include biological information in genomic predictions. © 2012 Brøndum et al.; licensee BioMed Central Ltd.


Khatkar M.S.,University of Sydney | Khatkar M.S.,Dairy Futures Cooperative Research Center | Moser G.,University of Sydney | Moser G.,Dairy Futures Cooperative Research Center | And 4 more authors.
BMC Genomics | Year: 2012

Background: We investigated strategies and factors affecting accuracy of imputing genotypes from lower-density SNP panels (Illumina 3K, 7K, Affymetrix 15K and 25K, and evenly spaced subsets) up to one medium (Illumina 50K) and one high-density (Illumina 800K) SNP panel. We also evaluated the utility of imputed genotypes on the accuracy of genomic selection using Australian Holstein-Friesian cattle data from 2727 and 845 animals genotyped with 50K and 800K SNP chips, respectively. Animals were divided into reference and test sets (genotyped with higher and lower density SNP panels, respectively) for evaluating the accuracies of imputation. For the accuracy of genomic selection, a comparison of direct genetic values (DGV) was made by dividing the data into training and validation sets under a range of imputation scenarios. Results: Of the three methods compared for imputation, IMPUTE2 outperformed Beagle and fastPhase for almost all scenarios. Higher SNP densities in the test animals, larger reference sets and higher relatedness between test and reference animals increased the accuracy of imputation. 50K specific genotypes were imputed with moderate allelic error rates from 15K (2.85%) and 25K (2.75%) genotypes. Using IMPUTE2, SNP genotypes up to 800K were imputed with low allelic error rate (0.79% genome-wide) from 50K genotypes, and with moderate error rate from 3K (4.78%) and 7K (2.00%) genotypes. The error rate of imputing up to 800K from 3K or 7K was further reduced when an additional middle tier of 50K genotypes was incorporated in a 3-tiered framework. Accuracies of DGV for five production traits using imputed 50K genotypes were close to those obtained with the actual 50K genotypes and higher compared to using 3K or 7K genotypes. The loss in accuracy of DGV was small when most of the training animals also had imputed (50K) genotypes. Additional gains in DGV accuracies were small when SNP densities increased from 50K to imputed 800K. Conclusion: Population-based genotype imputation can be used to predict and combine genotypes from different low, medium and high-density SNP chips with a high level of accuracy. Imputing genotypes from low-density SNP panels to at least 50K SNP density increases the accuracy of genomic selection. © 2012 Khatkar et al.; licensee BioMed Central Ltd.
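The abstract reports allelic error rates without giving the formula. One common definition, assumed here, is the share of alleles imputed incorrectly: the summed absolute difference between true and imputed allele counts, divided by twice the number of genotypes compared. The function name and the `None` coding for missing calls are hypothetical.

```python
def allelic_error_rate(true_genotypes, imputed_genotypes):
    """Fraction of alleles imputed incorrectly.

    Genotypes are allele counts (0, 1 or 2); None marks a missing call
    and the SNP is skipped. Each compared genotype contributes 2 alleles.
    """
    wrong = 0
    total = 0
    for t, i in zip(true_genotypes, imputed_genotypes):
        if t is None or i is None:
            continue
        wrong += abs(t - i)  # 1 for a het/hom mix-up, 2 for opposite homozygotes
        total += 2
    if total == 0:
        raise ValueError("no genotypes to compare")
    return wrong / total


rate = allelic_error_rate([0, 1, 2, 2], [0, 1, 1, 0])
```

In this toy case three of eight alleles are wrong, so `rate` is 0.375. The rates in the abstract (for example 0.79% from 50K to 800K) would be computed over masked SNPs in the test animals.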


Rodriguez-Ramilo S.T.,Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA) | Garcia-Cortes L.A.,Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA) | Gonzalez-Recio O.,Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA) | Gonzalez-Recio O.,Australian Department of Primary Industries and Fisheries | Gonzalez-Recio O.,Dairy Futures Cooperative Research Center
PLoS ONE | Year: 2014

Genome-enhanced genotypic evaluations are becoming popular in several livestock species. For this purpose, the combination of the pedigree-based relationship matrix with a genomic similarities matrix between individuals is a common approach. However, the weight placed on each matrix has been so far established with ad hoc procedures, without formal estimation thereof. In addition, when using marker- and pedigree-based relationship matrices together, the resulting combined relationship matrix needs to be adjusted to the same scale in reference to the base population. This study proposes a semi-parametric Bayesian method for combining marker- and pedigree-based information on genome-enabled predictions. A kernel matrix from a reproducing kernel Hilbert spaces regression model was used to combine genomic and genealogical information in a semi-parametric scenario, avoiding inversion and adjustment complications. In addition, the weights on marker- versus pedigree-based information were inferred from a Bayesian model with Markov chain Monte Carlo. The proposed method was assessed involving a large number of SNPs and a large reference population. Five phenotypes, including production and type traits of dairy cattle were evaluated. The reliability of the genome-based predictions was assessed using the correlation, regression coefficient and mean squared error between the predicted and observed values. The results indicated that when a larger weight was given to the pedigree-based relationship matrix the correlation coefficient was lower than in situations where more weight was given to genomic information. Importantly, the posterior means of the inferred weight were near the maximum of 1. The behavior of the regression coefficient and the mean squared error was similar to the performance of the correlation, that is, more weight to the genomic information provided a regression coefficient closer to one and a smaller mean squared error. Our results also indicated a greater accuracy of genomic predictions when using a large reference population. © 2014 Rodríguez-Ramilo et al.


Moser G.,University of Queensland | Lee S.H.,University of Queensland | Hayes B.J.,Australian Department of Primary Industries and Fisheries | Hayes B.J.,Dairy Futures Cooperative Research Center | And 4 more authors.
PLoS Genetics | Year: 2015

Gene discovery, estimation of heritability captured by SNP arrays, inference on genetic architecture and prediction analyses of complex traits are usually performed using different statistical models and methods, leading to inefficiency and loss of power. Here we use a Bayesian mixture model that simultaneously allows variant discovery, estimation of genetic variance explained by all variants and prediction of unobserved phenotypes in new samples. We apply the method to simulated data of quantitative traits and Wellcome Trust Case Control Consortium (WTCCC) data on disease and show that it provides accurate estimates of SNP-based heritability, produces unbiased estimators of risk in new samples, and that it can estimate genetic architecture by partitioning variation across hundreds to thousands of SNPs. We estimated that, depending on the trait, 2,633 to 9,411 SNPs explain all of the SNP-based heritability in the WTCCC diseases. The majority of those SNPs (>96%) had small effects, confirming a substantial polygenic component to common diseases. The proportion of the SNP-based variance explained by large effects (each SNP explaining 1% of the variance) varied markedly between diseases, ranging from almost zero for bipolar disorder to 72% for type 1 diabetes. Prediction analyses demonstrate that for diseases with major loci, such as type 1 diabetes and rheumatoid arthritis, Bayesian methods outperform profile scoring or mixed model approaches. © 2015 Moser et al.


Hand M.L.,La Trobe University | Hand M.L.,Dairy Futures Cooperative Research Center | Cogan N.O.I.,La Trobe University | Cogan N.O.I.,Dairy Futures Cooperative Research Center | And 2 more authors.
Theoretical and Applied Genetics | Year: 2012

Allohexaploid tall fescue (Festuca arundinacea Schreb. syn. Lolium arundinaceum [Schreb.] Darbysh.) is an agriculturally important grass cultivated for pasture and turf world-wide. Genetic improvement of tall fescue could benefit from the use of non-domesticated germplasm to diversify breeding populations through the incorporation of novel and superior allele content. However, such potential germplasm must first be characterised, as three major morphotypes (Continental, Mediterranean and rhizomatous) with varying degrees of hybrid interfertility are commonly described within this species. As hexaploid tall fescue is also a member of a polyploid species complex that contains tetraploid, octoploid and decaploid taxa, it is also possible that germplasm collections may have inadvertently sampled some of these sub-species. In this study, 1,040 accessions from the publicly available United States Department of Agriculture tall fescue and meadow fescue germplasm collections were investigated. Sequence of the chloroplast genome-located matK gene and the nuclear ribosomal DNA internal transcribed spacer (rDNA ITS) permitted attribution of accessions to the three previously known morphotypes and also revealed the presence of tall fescue sub-species of varying ploidy levels, as well as other closely related species. The majority of accessions were, however, identified as Continental hexaploid tall fescue. Analysis using 34 simple sequence repeat markers was able to further investigate the level of genetic diversity within each hexaploid tall fescue morphotype group. At least two genetically distinct sub-groups of Continental hexaploid tall fescue were identified which are probably associated with palaeogeographic range expansion of this morphotype. 
This work has comprehensively characterised a large and complex germplasm collection and has identified genetically diverse accessions which may potentially contribute valuable alleles at agronomic loci for tall fescue cultivar improvement programs. © 2012 Springer-Verlag.
