I spent most of last week at the Genomics of Common Diseases meeting held by Nature Genetics and the Wellcome Trust. Below I summarize the big themes of the conference (with some editorializing, biased by my own opinions), and give the main takeaway I got from each of the talks.

Big themes

Before the meeting, Orli Bahcall presented a list of “big questions” regarding the state and future of genomics of common diseases. While going through the questions and absorbing the talks, I came up with the following big themes:

Challenges in discovery and interpretation of susceptibility loci

GWAS has clearly made a lot of progress in identifying genetic variants associated with disease. We saw many examples of this, including from Martin Tobin in COPD, from Peter Gregersen in rheumatoid arthritis, and from Judy Cho in IBD. GWAS methods have become pretty solid, and sample sizes are continuously expanding to allow large meta-analyses to identify more and more susceptibility loci (see the recent progress in schizophrenia, for example). One of the big challenges going forward will be interpreting genetic associations and determining the biology driving them.

Speakers highlighted several ways we can improve association studies and gain better insight into the genetic architecture of disease:

There have been some heroic efforts to track down individual GWAS hits all the way to the causal variant and its mechanism. For instance, Hoskins traced a GWAS hit for pancreatic cancer risk to an indel that creates a novel binding site, repressing a distant gene through a long-range interaction. These efforts are an extremely important next step in GWAS. Yet fully investigating each of these loci is a huge task. To make progress at scale, I think we're going to have to develop better computational methods for this kind of follow-up. A large part of that will be integrating many layers of genomics datasets, as mentioned above. But I'm personally excited by the following classes of methods that will help generate meaningful predictions about the effects of variants:

A need for better infrastructure for genomics data sharing

As David Altshuler pointed out, in many cases to understand a single genome we need access to the genomes of thousands or millions of people. He mentioned the cases of rare mutations in families, estimating penetrance, ethnic diversity, and cancer. We are now capable of approaching sample sizes of this scale, and it is not news that genomics is generating tons of data. We saw presentations from large projects including GTEx, 1000 Genomes, and Genomics England's 100K Genomes Project, plus a talk by Teri Manolio about many different clinical genomics projects from around the world. In a talk titled "Big Data: Transforming Cancer Research and Care", Lynda Chin presented work on enormous datasets that included genomic, transcriptomic, epigenomic, and other data from many cancer patients. She emphasized the need to develop resources for crowdsourcing analysis to maximize the use of these datasets.

While many of the datasets mentioned above and elsewhere are available publicly or with approved access, it is still difficult to ask questions like "at this location in the genome, what fraction of people have allele A" or "across all eQTL studies ever done, what is the effect of this variant." While it is possible to answer some of these questions, doing so requires a huge amount of logistics to query many diverse datasets. If all publicly available genomics data were accessible and queryable within the same "interoperable framework", it would be much easier to ask these kinds of questions, and it would make existing datasets much more powerful. This is a major goal of the Global Alliance for Genomics and Health, and I think its importance should be emphasized.
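To make the "what fraction of people have allele A" question concrete: once every dataset exposes the same summary statistics for a site, answering it is just pooling allele counts. Here is a minimal sketch in Python; the dataset names, numbers, and the summary-record shape are all hypothetical, not any real API of the Global Alliance.

```python
from dataclasses import dataclass

@dataclass
class AlleleCount:
    """Summary one (hypothetical) dataset reports for a single variant."""
    dataset: str
    allele_count: int   # copies of the allele observed
    allele_number: int  # total chromosomes genotyped at the site

def pooled_allele_frequency(counts):
    """Pool per-dataset counts into a single frequency, as an
    interoperable framework could do across studies."""
    ac = sum(c.allele_count for c in counts)
    an = sum(c.allele_number for c in counts)
    return ac / an if an else None

# Made-up counts from three studies for one variant:
counts = [
    AlleleCount("study_1", 120, 5000),
    AlleleCount("study_2", 300, 10000),
    AlleleCount("study_3", 30, 1000),
]
print(pooled_allele_frequency(counts))  # 450 / 16000 = 0.028125
```

The hard part, of course, is not this arithmetic but getting every dataset to expose compatible summaries in the first place, which is exactly the interoperability problem described above.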

Genome Editing

Although this topic did not come up as much as I expected during the meeting, clearly one of the most transformative new technologies in genomics is the ability to precisely edit genomes in place with the use of the CRISPR-Cas9 system. Feng Zhang gave a great overview of the technology, and some exciting new developments they are working on. He was followed by Kiran Musunuru who talked about using CRISPR to follow up a specific GWAS locus.

This technology obviously has tons of use cases. For the first time, instead of just identifying associations, we can directly assess the effects of specific variants within their original genomic context. This will allow us to validate and study the effects of GWAS loci and QTLs, easily generate gene knock-outs in tissue-specific models, and more. However, there are still many challenges. Musunuru pointed out issues in throughput: he wants to test all variants at a GWAS locus, and performing separate CRISPR experiments on each variant using current methods would be a huge task. Instead, he outlined a plan in which he will generate a pooled library of guide RNAs for each mutation in a cell line with his gene of interest tagged with GFP. He will then assay expression using FACS and analyze the extremes of the expression distribution to determine which variants lead to increases and decreases in expression. I think this is the direction we need to be thinking in. His example covers only a single locus; imagine now that we'd like to interrogate variants at thousands of loci across the genome. There are still technical challenges, such as isolating specific populations of cells that received specific CRISPR edits, that make genome editing on such a large scale infeasible.
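The readout of a pooled screen like the one Musunuru described typically comes down to comparing guide RNA read counts between the high- and low-expression FACS bins. A rough sketch of that comparison, with entirely made-up guide names and counts (this is my illustration of the general approach, not his analysis):

```python
import math

def guide_log2_enrichment(high_counts, low_counts, pseudocount=1.0):
    """For each guide RNA, compare its read count in the high-GFP and
    low-GFP sorted bins, normalized for sequencing depth. A positive
    score suggests the corresponding edit increases expression; a
    negative score suggests it decreases expression."""
    high_total = sum(high_counts.values())
    low_total = sum(low_counts.values())
    scores = {}
    for guide in high_counts:
        h = (high_counts[guide] + pseudocount) / high_total
        l = (low_counts.get(guide, 0) + pseudocount) / low_total
        scores[guide] = math.log2(h / l)
    return scores

# Hypothetical read counts per guide in each sorted bin:
high = {"guide_A": 900, "guide_B": 100, "guide_C": 500}
low  = {"guide_A": 100, "guide_B": 900, "guide_C": 500}
scores = guide_log2_enrichment(high, low)
# guide_A scores positive (expression-increasing), guide_B negative,
# guide_C near zero (no effect on expression).
```

In a real screen one would also model sampling noise and use replicates, but the core logic of reading variant effects off the extremes of the sorted expression distribution is the same.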

Challenges in bringing genomics to the clinic

Howard Jacob joked (or not) that the future of common disease is “phenotyping using an Apple watch and clinical sequencing”. However, there are many barriers to overcome. Two key areas were discussed regarding what needs to happen to make clinical genomics a reality:

I really enjoyed seeing the work of deCODE, presented by Hreinn Stefansson, in which they took CNVs implicated in schizophrenia and autism and asked how they behave in controls. These types of studies are crucial for understanding complex traits: we have to look not only at individuals with the disease phenotype, but also at individuals without it. Interestingly, they found that CNVs for schizophrenia indeed showed a cognitive phenotype in healthy controls, suggesting the variant plays a role in cognition separate from its role in the manifest disease. This study is important both for learning about penetrance and because it allows better elucidation of the phenotype conferred by that specific variant.

David Altshuler passionately advocated for the importance of penetrance in variant interpretation, both in many of the Q&A sessions and in his talk about the Global Alliance. The big issue is that to assess the impact of a variant in an individual, we really need to look at all other individuals who harbor the same variant and ask what their phenotypes are.
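The penetrance estimate itself is a simple ratio: of all observed carriers of the variant, what fraction manifest the phenotype? The numbers below are purely illustrative; the point is that the estimate is only as good as the number of carriers you can observe, which is why aggregating across cohorts matters so much.

```python
def penetrance(carriers_affected, carriers_total):
    """Estimated penetrance: fraction of variant carriers who manifest
    the phenotype. Only meaningful when carriers_total is large, hence
    the need for data sharing across many cohorts."""
    if carriers_total == 0:
        raise ValueError("no carriers observed")
    return carriers_affected / carriers_total

# Illustrative: 12 affected out of 400 observed carriers -> 3%
print(penetrance(12, 400))  # 0.03
```

With only a handful of carriers, ascertained through affected families, the same variant can look almost fully penetrant; population-scale data is what corrects that bias.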

Big messages from each talk

(Note that a few are missing: some talks were closed to social media, and a couple I didn't attend, sorry!)