Haplotype estimation for biobank-scale data sets.
O'Connell J., Sharp K., Shrine N., Wain L., Hall I., Tobin M., Zagury J-F., Delaneau O., Marchini J.
The UK Biobank (UKB) has recently released genotypes on 152,328 individuals together with extensive phenotypic and lifestyle information. We present a new phasing method, SHAPEIT3, that can handle such biobank-scale data sets and results in switch error rates as low as ∼0.3%. The method exhibits O(N log N) scaling with sample size N, enabling fast and accurate phasing of even larger cohorts.
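
For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of one common way to compute a switch error rate for a single individual. It assumes that true haplotypes are available, that alleles are coded 0/1, and that the genotypes themselves are error-free. The function name `switch_error_rate` and its arguments are hypothetical illustrations and do not come from SHAPEIT3; the evaluation protocol used by the authors may differ in detail.

```python
from typing import Sequence


def switch_error_rate(
    true_h0: Sequence[int],
    true_h1: Sequence[int],
    est_h0: Sequence[int],
) -> float:
    """Fraction of consecutive heterozygous-site pairs whose phase is inconsistent.

    true_h0, true_h1: the two true haplotypes (0/1 alleles), hypothetical inputs.
    est_h0: one of the two estimated haplotypes; the other is implied at
            heterozygous sites when genotypes are assumed correct.
    """
    if not (len(true_h0) == len(true_h1) == len(est_h0)):
        raise ValueError("all haplotypes must have the same length")

    # Only heterozygous sites carry phase information.
    # orientation[i] is True when est_h0 carries the true_h0 allele at that site.
    orientation = [
        e == a for a, b, e in zip(true_h0, true_h1, est_h0) if a != b
    ]

    # With fewer than two heterozygous sites there is nothing to switch between.
    if len(orientation) < 2:
        return 0.0

    # A switch occurs whenever the orientation changes between adjacent het sites.
    switches = sum(
        1 for prev, cur in zip(orientation, orientation[1:]) if prev != cur
    )
    return switches / (len(orientation) - 1)


# Toy example: five heterozygous sites, one switch after the second het site.
truth_a = [0, 1, 0, 1, 1, 0]
truth_b = [1, 0, 1, 1, 0, 1]
estimate = [0, 1, 1, 1, 0, 1]
rate = switch_error_rate(truth_a, truth_b, estimate)
```

In this toy example one switch occurs among four adjacent pairs of heterozygous sites, so the rate is 0.25. Under this definition, a rate of about 0.3% corresponds to roughly three switches per thousand adjacent heterozygous pairs.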