The Dimensionality of Genomic Information and Its Effect on Genomic Prediction.

Title: The Dimensionality of Genomic Information and Its Effect on Genomic Prediction.
Publication Type: Journal Article
Year of Publication: 2016
Authors: Pocrnic, I, Lourenco, DAL, Masuda, Y, Legarra, A, Misztal, I
Date Published: 2016 May

The genomic relationship matrix (GRM) can be inverted with the algorithm for proven and young (APY), which is based on recursion on a random subset of animals. While a regular inverse has a cubic cost, the cost of the APY inverse can be close to linear. Theory for the APY assumes that the optimal size of the subset (the size that maximizes the accuracy of genomic predictions) is due to a limited dimensionality of the GRM, which is a function of the effective population size (Ne). The objective of this study was to evaluate these assumptions by simulation. Six populations were simulated with approximate Ne from 20 to 200. Each population consisted of 10 nonoverlapping generations, with 25,000 animals per generation and phenotypes available for generations 1-9. The last 3 generations were fully genotyped assuming genome length L = 30. The GRM was constructed for each population and analyzed for the distribution of its eigenvalues. Genomic estimated breeding values (GEBV) were computed by single-step GBLUP, using either a direct or an APY inverse of the GRM. The size of the subset in APY was set to the number of the largest eigenvalues explaining x% of the variation in the GRM (EIGx, x = 90, 95, 98, 99). Accuracies of GEBV for the last generation with the APY inverse peaked at EIG98 and were slightly lower with EIG95, EIG99, or the direct inverse. Most information in the GRM is contained in the ∼NeL largest eigenvalues, with no information beyond 4NeL. Genomic predictions with the APY inverse of the GRM are more accurate than those with the regular inverse.
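
The two computations named in the abstract, counting the largest eigenvalues that explain x% of the variation in the GRM and building an APY inverse from a core subset, can be sketched in a few lines of NumPy. This is a minimal dense-matrix illustration, not the authors' software: the function names `n_eigenvalues_explaining` and `apy_inverse` are hypothetical, the APY form used is the commonly stated one (exact inverse for the core block, diagonal conditional variances for noncore animals), and a real implementation would avoid forming the full dense matrix.

```python
import numpy as np


def n_eigenvalues_explaining(G, fraction):
    """Count the largest eigenvalues of symmetric G that together
    explain at least `fraction` (0-1) of the total variation (trace)."""
    eigvals = np.linalg.eigvalsh(G)[::-1]            # descending order
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cumulative, fraction)) + 1
    return min(k, len(eigvals))


def apy_inverse(G, core):
    """APY-style inverse of G (illustrative sketch, hypothetical helper).

    `core` holds the indices of the core animals; all other animals are
    treated as noncore and assumed conditionally independent given the core.
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])   # only the core block is inverted
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn                                # regression of noncore on core

    # Diagonal of M: g_ii - g_ic Gcc^{-1} g_ci for each noncore animal i
    m = np.diag(G)[noncore] - np.sum(Gcn * P, axis=0)

    G_inv = np.zeros((n, n))
    G_inv[np.ix_(core, core)] = Gcc_inv + (P / m) @ P.T
    G_inv[np.ix_(core, noncore)] = -P / m
    G_inv[np.ix_(noncore, core)] = (-P / m).T
    G_inv[noncore, noncore] = 1.0 / m                # diagonal noncore block
    return G_inv
```

Under these assumptions, a core size in the spirit of EIG98 would be `n_eigenvalues_explaining(G, 0.98)`, with that many animals drawn at random as `core`. Only the core block needs a dense inverse here; the noncore block is diagonal, which is where the cost savings over a direct inverse come from.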

Alternate Journal: Genetics
PubMed ID: 26944916
PubMed Central ID: PMC4858800