Practical Use of Methods for Imputation of HLA Alleles from SNP Genotype Data
Motyer A., Vukcevic D., Dilthey A., Donnelly P., McVean G., Leslie S.
Abstract
The human leukocyte antigen (HLA) genes play an essential role in immune function. Typing of HLA alleles is critical for transplantation and is informative for many disease associations. The high cost of accurate lab-based HLA typing has precluded its use in large-scale disease-association studies. The development of statistical methods that type alleles using linkage disequilibrium with nearby SNPs, called HLA imputation, has allowed large cohorts of individuals to be typed accurately, so that massive numbers of affected individuals and controls may be studied. This has resulted in many important findings. Several HLA imputation methods have been widely used; however, their relative performance has not been adequately addressed. We have conducted a comprehensive study to evaluate the most widely used HLA imputation methods. We assembled a multi-ethnic panel of 10,561 individuals with SNP genotype data and lab-based typing of alleles at 11 HLA genes at two-field resolution, and used it to train and validate each method. Use of this panel leads to imputation accuracy far superior to what is currently publicly available. We present a highly accurate new imputation method, HLA*IMP:03. We address the question of optimal use of HLA imputations in tests of genetic association, showing that it is usually not necessary to apply a probability threshold to achieve maximal power. We also investigated the effect on accuracy of SNP density and of population stratification at the continental level, and show that neither is a significant concern.