insights gained from sequencing the whole genomes of 2,636 Icelanders to a median depth of 20×. We found 20 million SNPs and 1.5 million insertions-deletions (indels). We describe the density and frequency spectra of sequence variants in relation to their functional annotation, gene position, pathway and conservation score. We demonstrate an excess of homozygosity and rare protein-coding variants in Iceland. We imputed these variants into 104,220 individuals down to a minor allele frequency of 0.1% and found a recessive frameshift mutation in MYL4 that causes early-onset atrial fibrillation, several mutations in ABCB4 that increase risk of liver diseases and an intronic variant in GNAS associating with increased thyroid-stimulating hormone levels when maternally inherited. These data provide a study design that can be used to determine how variation in the sequence of the human genome gives rise to human diversity.
The advent of high-throughput genotyping and sequencing has revolutionized the ability to investigate how diversity in the sequence of the human genome affects human diversity. Large-scale genotyping of common variants led to an avalanche of discoveries of variants associating with common and complex diseases. Now studies based on whole-genome and exome sequencing are beginning to yield rare variants associating with common diseases. They also provide unprecedented information about human sequence diversity and insights into the structure and history of human populations. Several large-scale sequencing projects are ongoing or in the planning stages, foremost among them the 1000 Genomes Project and the Exome Sequencing Project (ESP), which have already provided valuable information about human genome diversity and tools to use in genetic discovery.
Our efforts at studying the human genome and its impact on diseases and other traits have focused on the Icelandic population. Genetic studies of the Icelandic population benefit from a genealogy of the nation reaching centuries back in time, a founder effect and broad access to nationwide healthcare information. The transition from genome-wide association studies (GWAS) based on common SNPs on microarrays to those based on a vast number of rare variants identified by whole-genome and exome sequencing presents new opportunities and challenges.
Here we describe the insights gained from sequencing the whole genomes of 2,636 Icelanders. First, we describe the density and frequency spectra of sequence variants in relation to their annotation. Second, we examine the geographical variation in sequence diversity in Iceland. Third, we show how variants down to a frequency of 0.1% can be imputed into the genomes of individuals who are only genotyped on microarray platforms and how the phenotypes of first- and second-degree relatives can be incorporated into analysis using the genealogy. Finally, we provide three examples of how rare variants in these data can be mined for associations with an extensive set of phenotypes and one example of how these data can be used to analyze clinical problems.
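To give a sense of scale for the 0.1% frequency threshold mentioned above, the following minimal sketch works through the arithmetic for a biallelic autosomal variant in diploid individuals. It is illustrative only and is not the pipeline used in this study; the function names `minor_allele_frequency` and `copies_at_frequency` are hypothetical, and only the sample sizes (2,636 sequenced and 104,220 imputed individuals) and the 0.1% threshold are taken from the text.

```python
# Illustrative arithmetic only; not the authors' analysis pipeline.
# Assumes biallelic autosomal variants in diploid individuals.

SEQUENCED = 2_636      # whole-genome sequenced individuals (from the text)
IMPUTED = 104_220      # individuals with imputed genotypes (from the text)
MAF_THRESHOLD = 0.001  # 0.1% minor allele frequency (from the text)


def minor_allele_frequency(alt_allele_count: int, n_individuals: int) -> float:
    """Return the minor allele frequency given the count of alternate alleles."""
    n_chromosomes = 2 * n_individuals  # two copies of each autosome per person
    if not 0 <= alt_allele_count <= n_chromosomes:
        raise ValueError("allele count must lie between 0 and 2 * n_individuals")
    alt_freq = alt_allele_count / n_chromosomes
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def copies_at_frequency(frequency: float, n_individuals: int) -> float:
    """Return the expected number of allele copies at a given frequency."""
    return frequency * 2 * n_individuals


if __name__ == "__main__":
    for label, n in (("sequenced", SEQUENCED), ("imputed", IMPUTED)):
        copies = copies_at_frequency(MAF_THRESHOLD, n)
        print(f"{label}: {2 * n} chromosomes, about {copies:.1f} copies at 0.1%")
```

Under these assumptions, a variant at 0.1% frequency corresponds to roughly five copies among the 5,272 sequenced chromosomes but to roughly two hundred copies among the 208,440 chromosomes of the imputed set, which suggests why imputation into the larger sample matters for association testing of rare variants.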