Coffee, or how to get science into the popular press
Sometimes one gets unexpected news, as I did a couple of days ago when one of our collaborators, Nicola Pirastu, informed me that the press release for one of our recent papers had apparently been picked up by the ‘regular’ press. At the time of writing, a search on news.google.com returns four pages of results, covering more than 100 news sites, including Time magazine, the Huffington Post, the Daily Mail, The Sun, and, on the other side of the globe, the New Zealand Herald. In the Netherlands, Nu.nl picked up the story.
So, what was the reason for all this buzz? The answer is simple: coffee! In our paper “Non-additive genome-wide association scan reveals a new gene associated with habitual coffee consumption” in Nature Scientific Reports, we describe how we found several genetic variants (SNPs) in the PDSS2 gene on chromosome 6 that are associated with coffee consumption. For example, people from the Italian cohorts who are homozygous for the G allele of the top hit within that gene (a SNP named rs6568479) consume on average 1.2 fewer cups of coffee per day than others.
However, in my opinion, the most important part of the paper is not the association between coffee consumption and the PDSS2 gene itself, but rather the fact that we went beyond the usual additive genetic model and also tested for association using dominant and recessive models (the notes at the end of this post explain what each model assumes). The association with the PDSS2 gene only shows up when using the recessive model.
A brief overview of the paper
The project was split into a discovery phase and a replication phase. For the former we used samples from two (genetically isolated) Italian populations: 370 people from the village of Carlantino in the South (known as INGI-CARL) and 843 people from six villages in the North East (INGI-FVG). In these populations coffee consumption was assessed by interviews. For the replication phase we used data from 1731 people in the Erasmus Rucphen Family Study (ERF), a Dutch genetically isolated study population. In this group coffee consumption was self-reported. In both cases the unit of measurement was cups per day.
For each of these populations we had (imputed) genetic data available. We first did a genome-wide association scan (GWAS) on the Italian samples using mixed-model linear regression. As pointed out above, we didn’t only run the scan for the additive genetic model, but also for the dominant and recessive models. After meta-analysis of the Italian data (the discovery phase) we obtained genome-wide significant associations between coffee consumption and 21 genetic loci. There were no associations when using the dominant model. Some loci popped up with both the recessive and the additive model, but for all of these variants the p-values were less significant under the additive model than under the recessive one.
For the replication stage we took these 21 markers and ran the same association tests in the ERF data, to see which associations are more likely to be true because they are also found in an independent cohort. Out of the 21 initial genetic variants, five remained significant in this phase. However, none of these turned out to be coding variants. We also looked at gene expression in various tissues for these SNPs using GTEx, which showed a negative correlation, meaning that people with a higher daily coffee consumption have a lower expression of PDSS2. All in all, this suggests that the genetic variants we identified act on coffee consumption by regulating the expression of the PDSS2 gene, which, in turn, suggests a possible functional explanation, maybe in the caffeine metabolism pathway.
- The additive model assumes that the effect of the SNP on the trait increases linearly with each copy of the effect allele. For example, consider an A/G SNP and let’s call the G allele the effect allele. When using the additive model, a person with the AA genotype will have the smallest effect; let’s put that at zero. If the association test shows that the effect increases by 1.5 (whatever units) per G allele, then a person with the AG genotype will have an effect size of 1.5 and a person with the GG genotype will have an effect size of 2 × 1.5 = 3.
- The dominant genetic model assumes that there is no difference in effect whether you have one or two copies of the effect allele. So, following the previous example, a person with the AA genotype would have an effect of 0, and people with either the AG or the GG genotype would have the same effect size (e.g. 1.5).
- The recessive model assumes that you need two copies of the effect allele to observe a change in phenotype. Following the previous examples, a person with AA or AG genotype would have zero effect, but an individual with GG genotype would have an effect size of 1.5.
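
To make the three notes above a bit more concrete, here is a minimal Python sketch of how a genotype could be turned into the number that goes into the regression under each model. This is only an illustration and not the code we used for the paper: the function name `encode_genotype`, the two-letter genotype strings and the effect size of 1.5 are assumptions taken from the A/G example above.

```python
# Illustrative sketch only: genotype coding under three genetic models.
# The names and the effect size below are assumptions for this example,
# not taken from the actual analysis pipeline.

def encode_genotype(genotype: str, model: str, effect_allele: str = "G") -> int:
    """Return the predictor value for a two-letter genotype such as 'AG'."""
    copies = genotype.count(effect_allele)  # 0, 1 or 2 copies of the effect allele
    if model == "additive":
        return copies                       # effect scales with the allele count
    if model == "dominant":
        return 1 if copies >= 1 else 0      # one copy is enough
    if model == "recessive":
        return 1 if copies == 2 else 0      # two copies are needed
    raise ValueError(f"unknown model: {model}")


EFFECT_SIZE = 1.5  # hypothetical effect, in whatever units the trait has

for model in ("additive", "dominant", "recessive"):
    effects = {g: EFFECT_SIZE * encode_genotype(g, model) for g in ("AA", "AG", "GG")}
    print(model, effects)
```

For the recessive model this gives an effect of 0 for AA and AG and 1.5 for GG, which is the kind of pattern an additive-only scan can easily miss.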