
User:Zimmer


http://carlzimmer.com/

https://www.reddit.com/r/science/comments/4tfbyl/science_ama_series_im_carl_zimmer_and_im_here_to/d5gygg1

https://zimmerome.gersteinlab.org/2016/05/06/part01_gerstein/


When processing

http://archive.gersteinlab.org/proj/zimmerome/part05/data/variants/Z.variantCall.SNPs.vcf

this Promethease report is produced:

https://promethease-ondemand.s3.amazonaws.com/view/a1468302610-af87a456-a891-4012-8a24-1ce42f6d9289/promethease.html?Signature=mfiLfrsoZfWeM9geFYxzS6yw4ZE%3D&Expires=1472190629&AWSAccessKeyId=AKIAJ6EKJ5QCIXQJCNTA

Most of it is relatively typical, but Promethease tries to emphasize the atypical, so the top result is

rs79556279(G;T)

with the text

Behçet's disease HLA-B*51 the primary risk in Behçet's disease and rs79556279=T has the strongest association of any SNP in the HLA-B region.

That sounds alarming, but in reality Promethease was indicating an increased risk of Behçet's disease, not direct causation. I've tweaked the text in SNPedia so future reports will make that clearer.

Regardless of wording, what triggers this claim? The file at

http://archive.gersteinlab.org/proj/zimmerome/part05/data/variants/Z.variantCall.SNPs.vcf

contains this line

6	31329846	.	G	T	1169.77	PASS	AC=1;AF=0.500;AN=2;BaseQRankSum=2.521;ClippingRankSum=-0.027;DP=54;FS=3.854;MLEAC=1;MLEAF=0.500;MQ=60.00;MQ0=0;MQRankSum=-0.615;POSITIVE_TRAIN_SITE;QD=21.66;ReadPosRankSum=0.686;VQSLOD=1.22;culprit=QD	GT:AD:DP:GQ:PL	0/1:25,28:53:99:1198,0,1517

as well as

##fileformat=VCFv4.1

and

##reference=file:///gpfs/scratch/fas/gerstein/common/personalGenome/zimmerome/newAlign/hs37d5.fa
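To make the trigger concrete, here is a minimal Python sketch that decodes the data line quoted above. It assumes the standard tab-separated VCF column order and a single sample; the helper name `decode_genotype` is my own invention and is not part of Promethease.

```python
# The data line quoted above, with fields joined by tabs as in the VCF.
RECORD = "\t".join([
    "6", "31329846", ".", "G", "T", "1169.77", "PASS",
    "AC=1;AF=0.500;AN=2;BaseQRankSum=2.521;ClippingRankSum=-0.027;DP=54;"
    "FS=3.854;MLEAC=1;MLEAF=0.500;MQ=60.00;MQ0=0;MQRankSum=-0.615;"
    "POSITIVE_TRAIN_SITE;QD=21.66;ReadPosRankSum=0.686;VQSLOD=1.22;culprit=QD",
    "GT:AD:DP:GQ:PL",
    "0/1:25,28:53:99:1198,0,1517",
])


def decode_genotype(line):
    """Hypothetical helper: return (chrom, pos, called_alleles, allele_depths)."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, ref, alt = cols[0], cols[1], cols[3], cols[4]
    fmt, sample = cols[8], cols[9]
    alleles = [ref] + alt.split(",")  # index 0 is REF, 1 and up are ALT
    fields = dict(zip(fmt.split(":"), sample.split(":")))
    # GT separates allele indexes with '/' (unphased) or '|' (phased).
    indexes = fields["GT"].replace("|", "/").split("/")
    called = [alleles[int(i)] for i in indexes if i != "."]
    depths = [int(d) for d in fields["AD"].split(",")]
    return chrom, int(pos), called, depths


chrom, pos, called, depths = decode_genotype(RECORD)
print(f"chr{chrom}:{pos} ({';'.join(called)}) read depths REF,ALT = {depths}")
```

The 0/1 genotype decodes to one reference G and one alternate T, i.e. (G;T), supported by 25 and 28 reads respectively, which is what a heterozygous call should look like.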

hs37d5 is not one of the reference genome names I recognize, but presumably it is similar to the more familiar GRCh37.p13. If so, then

https://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=79556279

tells us that rs79556279 is at

chr6:31329846

and

http://www.snpedia.com/index.php/Rs79556279


points out that

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4066492/

https://www.ncbi.nlm.nih.gov/pubmed/24876276

contains this sentence.

Conditioning on HLA-B*51 and rs79556279, the strongest associated SNPs, 4.5 kb upstream of HLA-B loci showed no other significant SNPs. Moreover HLA-B*51 and rs79556279 are in strong linkage disequilibrium. Two other regions, one telomeric to HLA-C and the second in a region that includes HLA-A, were associated with BD, but the former was lost on conditioning for rs79556279.


https://www.snpedia.com/index.php/HLA-B

says

HLA-B51 is associated with Behçet's disease and other inflammatory joint and skin diseases.

Two associated tag SNPs for HLA-B51 are rs79556279 and rs116799036. [PMID 24821759]


https://www.ncbi.nlm.nih.gov/pubmed/24821759


says


Among these was the most strongly BD-associated SNP in the study, rs79556279 [p_additive = 2.2 × 10^−50, OR 2.7 (95% confidence interval, CI, 2.3, 3.1)], which was located 4.9 kb 5′ of HLA-B. After controlling for the effect of rs79556279, we found that no other SNP in the HLA-B/MICA region was significantly associated with BD (Fig. 1A). Association testing of MHC-region SNPs conditioned on the effect of HLA-B*51 similarly identified no significant residual association in the HLA-B/MICA region (Fig. 1B), and, moreover, rs79556279 was in strong LD with HLA-B*51 [expectation–maximization r^2 (r^2_EM) = 0.92; expectation-maximization pairwise linkage disequilibrium (D′_EM) = 0.96], indicating that the effect of HLA-B*51 underlies the observed effect of rs79556279.

The condition is more common among Turks, Sephardic Jews, and people of Arab and Armenian ancestry.

He's also a carrier for rs28940579(C;T), which he wrote about as follows:

For example, I have rare SNPs in a gene called MEFV. At one location in that gene, the vast majority of people have a base called thymine. But one of my copies of the MEFV gene has a cytosine at that spot. This variant gives me the rare distinction of being a carrier for a disease called familial Mediterranean fever, which causes runaway inflammation. (You need two copies to actually get the disease.)

Both of these seem quite consistent with his apparent Middle Eastern Jewish heritage.


He's also online at

https://www.openhumans.org/CarlZimmer/

and we can see the report for that at

https://promethease-ondemand.s3.amazonaws.com/view/a1468857295-4c9a19e2-5fd0-42a0-a585-44f01b56594a/promethease.html?Signature=SbSCclNiiNJJobPCQw5bGyI1%2Bc0%3D&Expires=1472745300&AWSAccessKeyId=AKIAJ6EKJ5QCIXQJCNTA

and it's nearly identical. This time the top line is

##fileformat=VCFv4.2

so it's a newer VCF format. This one supports 'gVCF' which allows you to encode the normal (aka '0|0') calls via 'END=' tags. Promethease can use that to produce a much richer report, but sadly this file doesn't actually have any END= tags so the report is only for positions that vary from the reference.
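Checking whether a file carries such reference blocks is easy to script. The following is a rough Python sketch, not how Promethease does it: it counts data lines whose INFO column contains an END key. The function name `count_end_tags` and the tiny inline example lines are made up for illustration.

```python
def count_end_tags(lines):
    """Count VCF data lines whose INFO column (8th field) has an END= key."""
    count = 0
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a well-formed data line
        info_keys = [item.split("=", 1)[0] for item in cols[7].split(";")]
        if "END" in info_keys:
            count += 1
    return count


# Made-up example: one reference block (gVCF style) and one plain SNP line.
example = [
    "##fileformat=VCFv4.2",
    "6\t31329800\t.\tA\t<NON_REF>\t.\t.\tEND=31329845\tGT:DP\t0/0:50",
    "6\t31329846\t.\tG\tT\t1169.77\tPASS\tAC=1;AF=0.500\tGT\t0/1",
]
print(count_end_tags(example))
```

For a real file, an open file handle can be passed in place of the list. A count of 0 means the file only describes positions that differ from the reference.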

The header also contains

##reference=file:///seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.fasta

which reinforces the idea that while it may be the same sequencing, it's a very different assembly.


This report is nearly identical, with the same top hits. There are a few extra genotypes in this Open Humans file (17,222 vs 17,739), but the differences seem minor.

Promethease is able to make a report for both files simultaneously, and now shows a 'conflicts' checkbox which allows us to highlight only the positions which disagree between the two reports. Here is that report about both

https://promethease-ondemand.s3.amazonaws.com/view/a1468858678-37dda57c-4295-4ee6-b058-98107bd56198/promethease.html?Signature=MzOMs9WmJT9EENeJHgQPDJklv9o%3D&Expires=1472746683&AWSAccessKeyId=AKIAJ6EKJ5QCIXQJCNTA


Only 6 positions with any information in SNPedia disagree between the two files:

in the gene OXTR

not in a gene


in HLA-A

in MICB

in GABRB3
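The idea behind the 'conflicts' view can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not Promethease's implementation: it indexes each file's calls by (chromosome, position) and reports positions present in both files where the called alleles differ. The function names and the sample lines are invented.

```python
def load_calls(lines):
    """Map (chrom, pos) -> sorted tuple of called alleles (single-sample VCF)."""
    calls = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip headers and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # no sample column to read
        alleles = [cols[3]] + cols[4].split(",")  # REF first, then ALTs
        gt = dict(zip(cols[8].split(":"), cols[9].split(":"))).get("GT", "")
        indexes = gt.replace("|", "/").split("/")
        if not gt or "." in indexes:
            continue  # missing or partial call
        calls[(cols[0], int(cols[1]))] = tuple(sorted(alleles[int(i)] for i in indexes))
    return calls


def find_conflicts(calls_a, calls_b):
    """Return positions called in both inputs whose alleles disagree."""
    shared = calls_a.keys() & calls_b.keys()
    return {k: (calls_a[k], calls_b[k]) for k in sorted(shared) if calls_a[k] != calls_b[k]}


# Invented example lines; real input would be the two VCF files.
file_a = ["1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1", "1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t1/1"]
file_b = ["1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1", "1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t0/1"]
conflicts = find_conflicts(load_calls(file_a), load_calls(file_b))
print(conflicts)
```

Positions found in only one file are deliberately not counted as conflicts here, because without END= blocks a missing line cannot be told apart from a reference call.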

A number of other variations influence metabolism.

  • He's a CYP2C19 Ultra-Fast Metabolizer
  • rs2228093(T;T) suggests alcohol-induced hypersensitivity