A substantial investment has been made in the generation of large public resources designed to enable the identification of tag SNP sets, but data establishing the adequacy of the sample sizes used are limited. Using large-scale empirical and simulated data sets, we found that the sample sizes used in the HapMap project are sufficient to capture common variation, but that performance declines substantially for variants with minor allele frequencies of <5%.

Original publication




Journal article


Nat Genet

Publication Date

Pages

1320 - 1322


Chromosome Mapping; Databases, Nucleic Acid; Diabetes Mellitus, Type 2; Gene Frequency; Genetic Predisposition to Disease; Genome, Human; Humans; Linkage Disequilibrium; Polymorphism, Single Nucleotide; Sample Size