01 November 2019

Genomics

'The Personal Genome Project-UK, an open access resource of human multi-omics data' by Olga Chervova, Lucia Conde, José Afonso Guerra-Assunção, Ismail Moghul, Amy P. Webster, Alison Berner, Elizabeth Larose Cadieux, Yuan Tian, Vitaly Voloshin, Tiago F. Jesus, Rifat Hamoudi, Javier Herrero and Stephan Beck in (2019) 6 Scientific Data 257 comments
Integrative analysis of multi-omics data is a powerful approach for gaining functional insights into biological and medical processes. Conducting these multifaceted analyses on human samples is often complicated by the fact that the raw sequencing output is rarely available under open access. The Personal Genome Project UK (PGP-UK) is one of few resources that recruits its participants under open consent and makes the resulting multi-omics data freely and openly available. As part of this resource, we describe the PGP-UK multi-omics reference panel consisting of ten genomic, methylomic and transcriptomic data. Specifically, we outline the data processing, quality control and validation procedures which were implemented to ensure data integrity and exclude sample mix-ups. In addition, we provide a REST API to facilitate the download of the entire PGP-UK dataset. The data are also available from two cloud-based environments, providing platforms for free integrated analysis. In conclusion, the genotype-validated PGP-UK multi-omics human reference panel described here provides a valuable new open access resource for integrated analyses in support of personal and medical genomics.
The authors state
The Personal Genome Project UK (PGP-UK) is a member of the global PGP network together with the PGPs in the United States, Canada, Austria and China. The PGP network aims to provide multi-omics and trait data under open access to the community. This contributes to personalised medicine by advancing our understanding of how phenotypes and the development of diseases are influenced by genetic, epigenetic, environmental and lifestyle factors. While all five PGP centres generate whole-genome sequencing (WGS), some PGPs, such as PGP-UK, produce additional multi-omics data. 
To participate in this study, volunteers must pass the eligibility criteria (e.g. be a UK citizen or permanent resident), sign the open consent form and pass a very thorough entrance exam. The objective of the exam is to ensure that the participant understands the key PGP-UK procedures and the potential risks of being involved in a project of this nature. At present, 1100 subjects have successfully enrolled in the project, and over a hundred of them have had their genomes sequenced. Once enrolled, participants are invited for sample collection which involves giving a blood or saliva sample or both for DNA and RNA extraction. DNA sequencing is then performed followed by data analysis. The results are reported back to the participants in the form of a Genome Report that is made publicly available after a grace period of one month. However, the participant is able to withdraw from the project at any time. DNA methylation data is generated using the Illumina HumanMethylation450 BeadChip array (450 k) and results are displayed in a freely available Methylome Report, a unique feature of the UK branch of the project. The preparation of both Genome and Methylome reports is discussed in more details in the Usage Notes Section. ... 
A pilot cohort of ten members of the public make up the PGP-UK multi-omics reference panel. For this cohort, we collected whole-genome bisulfite sequencing (WGBS) and RNA sequencing (RNA-seq) in addition to WGS and 450 k data. Figure 1 shows a schematic of the PGP-UK workflow. More information about PGP-UK can be found in refs 1,2 and on the project’s website www.personalgenomes.org.uk. ... 
While controlled access multi-omics data can be submitted into a single public repository (e.g. EGA in Europe or dbGaP in the USA), there is currently no single public repository for open access multi-omics data. Consequently, the different types of datasets (WGS, WGBS, RNA-seq, 450 k) were submitted to the corresponding repositories (European Nucleotide Archive (ENA), European Variation Archive (EVA), ArrayExpress) at EMBL-EBI. The details are given in the Data Records section and direct data download links are provided on the PGP-UK data web page www.personalgenomes.org.uk/data. For convenience, we offer a web API to download all the available PGP-UK data (see Data Records). The cumulative size of the PGP-UK multi-omics reference panel exceeds 2TB, which means that it would take over 3 days (more than 85 hours) to download (with mean UK download speed of 54.2 Mbps, Ofcom 2018). To overcome this limitation, we collaborated with two cloud platform providers (Seven Bridges Genomics and Lifebit) to host PGP-UK data in their respective clouds for unrestricted access as briefly described in Data Records section. 
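The download-time estimate quoted above can be checked with simple arithmetic. The following is a minimal sketch, not the authors' calculation: the function name and the two size conventions (decimal TB versus binary TiB) are assumptions introduced here, while the 54.2 Mbps figure is the one cited in the passage.

```python
def download_hours(size_bytes: float, speed_mbps: float) -> float:
    """Hours needed to transfer size_bytes at speed_mbps (megabits per second)."""
    bits = size_bytes * 8
    seconds = bits / (speed_mbps * 1_000_000)
    return seconds / 3600


TB = 10**12   # decimal terabyte, in bytes
TIB = 2**40   # binary tebibyte, in bytes

# Exactly 2 TB (or 2 TiB) at the mean UK download speed cited in the passage.
hours_decimal = download_hours(2 * TB, 54.2)
hours_binary = download_hours(2 * TIB, 54.2)

print(f"2 TB:  {hours_decimal:.1f} h ({hours_decimal / 24:.1f} days)")
print(f"2 TiB: {hours_binary:.1f} h ({hours_binary / 24:.1f} days)")
```

Exactly 2 TB works out to roughly 82 hours and 2 TiB to roughly 90 hours, so the quoted figure of more than 85 hours (over 3 days) is consistent with a panel somewhat larger than 2 TB.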
In this paper, we describe the PGP-UK multi-omics human reference panel derived from 10 participants. We followed best practices to perform various quality control (QC) checks to ensure the quality of the pilot WGS, WGBS, RNA-seq and 450 k datasets as described in the Technical Validation section. Finally, we describe the methods employed for multi-omics data matching, which ensures that samples are mapped to the correct participant.
'Direct to consumer genetic testing' by Rachel Horton, Gillian Crawford, Lindsey Freeman, Angela Fenwick, Caroline F Wright and Anneke Lucassen in (2019) 367 BMJ l5688 offers cautions about DTC genomics. The authors note
 Direct-to-consumer (DTC) genetic tests are sold online and in shops as a way to “find out what your DNA says.” Testing kits typically contain instructions and equipment for collecting a saliva sample, which customers post to the DTC company for analysis. 
Some DTC genetic tests promise insights into ancestry or disease risks; others claim to provide information on personality, athletic ability, and child talent. However, interpretation of genetic data is complex and context dependent, and DTC genetic tests may produce false positive and false negative results.
Anyone concerned about a result from a DTC genetic test might turn to their general practitioner (GP) or other primary healthcare provider for advice. This practice pointer aims to help clinicians in this scenario and explains what sort of health information is provided by these tests, their limitations, and how clinicians 
... What health information do DTC genetic tests provide? 
DTC genetic tests might provide a range of health information: 
• Polygenic risk scores—combine many different common variants across the genome to place someone in a broad risk category, eg, “your genes predispose you to weigh about 3% more than average.” The validity and utility of these risk scores for predictive clinical purposes is hotly debated. In our opinion, although polygenic scores may be useful in researching the causes of disease, or stratifying populations into higher and lower risks, they are rarely able to usefully predict disease.
• Genotype at specific points—looks at specific variants that influence the chance of developing particular diseases, eg, “you have two copies of the ε4 variant in the APOE gene. People with this result have an increased risk of developing late onset Alzheimer’s disease.” This type of testing can also be used to identify variants that affect drug metabolism. 
• Carrier screening—looks at specific variants to identify people who are carriers for particular recessive genetic conditions, eg, “one variant detected in the CFTR gene. If you and your partner are both carriers, each child may have a 25% chance of having this condition.” Many carrier tests are ancestry specific: they test for specific carrier variants common in a particular ancestral group. If someone with a different ancestry were a carrier, this would probably not be detected as it would likely be due to a different carrier variant (which the test would not check). 
• Uninterpreted “raw” genetic data—some DTC genetic test companies provide access to uninterpreted genetic data. Customers can download their data and seek an interpretation using third party services. These usually work by cross referencing the data against freely available genetic databases and constructing a report based on interpretations in these databases (which may not be up to date). They may report variants and disease risks that were not reported or referred to by the original DTC genetic test company, and might repurpose raw data from tests designed to answer other questions, such as ancestry, to try to provide health information. 
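Two of the ideas in the list above can be sketched in a few lines. This is an illustration under stated assumptions, not how any DTC company computes its reports: the variant names and weights are hypothetical, the polygenic score is shown in its simplest form (a weighted sum of risk-allele counts), and the carrier calculation assumes simple Mendelian recessive inheritance.

```python
from fractions import Fraction
from itertools import product

# Hypothetical variant IDs and per-allele effect weights, for illustration only.
WEIGHTS = {"variant_A": 0.12, "variant_B": -0.05, "variant_C": 0.30}


def polygenic_score(allele_counts: dict[str, int]) -> float:
    """Weighted sum of risk-allele counts (0, 1 or 2 copies per variant)."""
    return sum(WEIGHTS[v] * allele_counts.get(v, 0) for v in WEIGHTS)


def affected_child_probability(parent1: str, parent2: str) -> Fraction:
    """Chance a child inherits the recessive allele 'a' from both parents.

    Each parent is a two-letter genotype such as 'Aa' (carrier) or 'AA'.
    """
    outcomes = list(product(parent1, parent2))  # one allele from each parent
    affected = sum(1 for a, b in outcomes if a == "a" and b == "a")
    return Fraction(affected, len(outcomes))


score = polygenic_score({"variant_A": 2, "variant_B": 1, "variant_C": 0})
print(f"polygenic score: {score:.2f}")
print("two carriers:", affected_child_probability("Aa", "Aa"))
print("one carrier: ", affected_child_probability("Aa", "AA"))
```

With two carrier parents the enumeration gives 1/4, matching the “25% chance” wording in the carrier screening example; with only one carrier parent it gives 0.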
Saliently, the practice note asks 'What are the limitations of DTC genetic tests?', noting

  • Predictive value is low when there is no family history of disease.
  • Identifying a “high genetic risk” via DTC genetic testing does not mean that the person will definitely develop the condition.
  • False positives are common, especially where third party interpretation services are used.
  • The quality control for DTC genetic tests is variable. Results found via third party interpretation services need particular care.
Further cautions for consumers are:
  • Does the DTC genetic testing company have real people you can talk to? Are they qualified (eg, genetic counsellors) to advise in response to your clinical concerns? 
  • Have you read all the information and small print about the test? Sometimes tests have substantial limitations. 
  • If you are worried about a genetic condition in your family, is the DTC genetic test you are thinking about thorough enough to properly check this? 
  • Could your decision to have a test affect your family? DTC genetic tests sometimes reveal information that could be relevant to your family—such as a health risk that might run in the family, or that family relationships are different from what you expected. Have you told your family that you are thinking about having a genetic test? 
  • Are you happy with what the DTC company might do with your data? DTC companies might collect, store, sell, or undertake research on your genetic data. Do you find that acceptable? Do you know who might have access to your data? 
  • Reassuring results can be false negatives 
  • DTC genetic tests tend to prioritise breadth over detail. For example, the 23andMe “genetic health risk” report for BRCA1 and BRCA2 currently only checks for three disease-causing variants mainly relevant for people with Ashkenazi Jewish ancestry; this approach would miss in the region of 80% of people with disease-causing BRCA variants in the general population 
  • DTC genetic tests are sold as providing answers, and patients may understandably expect that their results will be clearly predictive of future health. These expectations, driven by marketing and media coverage, leave people at risk of over-interpreting results from DTC genetic testing. One common pitfall is to compare the result to a “zero risk,” rather than population risk.
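The last point, comparing a result with population risk rather than with “zero risk,” is easier to see with numbers. The figures below are hypothetical and chosen only for illustration; the function name and the simple multiplication of baseline risk by relative risk are assumptions of this sketch, not a method taken from the article.

```python
def absolute_risk(population_risk: float, relative_risk: float) -> float:
    """Approximate absolute risk implied by a relative risk, capped at 1.0."""
    return min(1.0, population_risk * relative_risk)


# Hypothetical numbers, for illustration only.
population_risk = 0.02  # 2% of the general population develop the condition
relative_risk = 1.5     # a report describes a "50% increased risk"

risk = absolute_risk(population_risk, relative_risk)

print(f"population risk:   {population_risk:.1%}")
print(f"risk with result:  {risk:.1%}")
print(f"absolute increase: {risk - population_risk:.1%}")
```

Under these assumed figures, a “50% increased risk” moves someone from about 2% to about 3%. Read against a baseline of zero the result sounds alarming; read against the population risk it is a one percentage point change.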