
User talk:JohnLloydScharf

From SNPedia

Please discontinue posting your personal data on the talk pages.

I welcome any genuine discussion about your genotypes, but a raw data dump using non-dbSNP-oriented genotypes is going to confuse too many people. --- cariaso 14:30, 5 September 2011 (UTC)

Are you saying not to post that on my user page, or just not on the pages for SNPs? I posted personal info on SNPs to show you they are FTDNA. You changed several, deleting the statement that they are FTDNA2. I found I do have those from the Illumina chip or FTDNA2. John Lloyd Scharf 14:39, 5 September 2011 (UTC)

On your user page, you are encouraged to continue as you have been. But the SNP and genotype Talk pages should involve actual discussion, not just raw data. I believe I only removed your FTDNA text from genotype pages. Genotypes should not be considered on chip; only SNPs should be on a chip. Regardless, that requires no brain. Let the bots do what they do well. Let humans do what they do well. --- cariaso 15:17, 5 September 2011 (UTC)

See Talk:Rs10521339.

I've made a small addition to User:JohnLloydScharf/Alzheimer.27s_Disease which shows a new syntax you might find helpful (or quite possibly not).

{{I am an rs|669|A|G}}


I am an rs669(A;G) 23andme equivalent rs669(C;T) (chr12 9,079,672 minus) possibly increased risk for Alzheimer's
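The template output above gives the same genotype in two orientations. The relationship can be sketched as follows, assuming that the "minus" orientation simply means each allele is replaced by its complementary base. The function name `flip_genotype` is hypothetical and is not part of SNPedia or any testing company's tooling.

```python
# Map each base to its complement on the opposite DNA strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip_genotype(genotype):
    """Return the genotype as read from the opposite strand.

    `genotype` is a string such as "(A;G)". The flipped alleles are
    sorted so the result is written in alphabetical order.
    """
    alleles = genotype.strip("()").split(";")
    flipped = sorted(COMPLEMENT[allele] for allele in alleles)
    return "(" + ";".join(flipped) + ")"


print(flip_genotype("(A;G)"))  # (C;T)
```

For C/G or A/T SNPs the flipped genotype uses the same letters, so the orientation cannot be inferred from the alleles alone.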

Your edit to rs2547547 was not appreciated.

  1. Pasting an abstract, while offering no original content, improvement, simplification, or even hyperlinks to the relevant rs#s.
  2. Pasting the above as a summary for the entire SNP. The summary should be reserved for a short summary of everything we know. A summary of the paper is welcome in page text.
  3. While the paper is fine to add, it has a mere 100 samples and controls. The sample size is too small to offer much certainty.

--- cariaso 01:13, 6 September 2011 (UTC)

Okay, show me how you would have done it and tell me what sample size is large enough. I do not believe the reference for the Baldness has 100 samples, so I must be missing something.

Correct, the baldness one was not what I was referring to. The 100 samples were in rs2547547. You can see what I would suggest as a welcomed edit of that snp.
  • You listed the baldness SNP allele and it was a sample that consisted of 95 families in which at least two brothers had early-onset AGA (391 genotyped individuals, including 201 affected men). John Lloyd Scharf 06:07, 6 September 2011 (UTC)
  • You also listed the research, [PMID 20731616] , for Rs2547547, for which I quoted. John Lloyd Scharf 06:07, 6 September 2011 (UTC)
"One hundred and ninety-six Chinese subjects consisting of 97 HCC patients and 99 controls were enrolled in this study. Nine polymorphisms of the HDAC1, HDAC2, and HDAC3 gene (rs2530223, rs1741981, rs2547547, rs13204445, rs6568819, rs10499080, rs11741808, rs2475631, rs11391) were examined using Applied Biosystems SNaP-Shot and TaqMan technology." That is your source for the 2.2x. Your edit does not reveal which is the risk allele. The CEU - European is only 180 samples of Utah residents, which is 16 fewer than this. The CHB - Han Chinese is only 90 samples, which is fewer than the 97 affected in this case. If the sample size was too small with 196 samples, that leaves questions to be answered:
  1. What are the criteria for the minimum population needed?
There are no firm limits. But lower quality information is best labeled as such. In time, site best practices and guidelines will emerge.
  2. Why cite the research if it had too few samples to be considered valid?
It isn't too low to be valid. It's just too low to be worthy of the SNP summary. If well summarized, 100 samples could be an excellent and welcome paper. Nearly everything published is welcome in SNPedia, if it is well explained. Copying and pasting an abstract is not welcome.
  3. How is it helpful to say what the risk is if the wild and risk alleles are not named?
It is better to say what is wild and risk. However, in this case, it appears to be a somewhat rare combination of 2 SNPs which offers 2.2x reduced risk. It was not yet clear to me how to classify one form as risk, when risk may be quite normal. In time I would hope to add that, but SNPedia works best by making many small improvements, not one 'perfect' edit. It would help the page if each genotype was labeled as risk or normal, but dumping all 3 into the SNP page forces readers to do an unnecessary amount of re-interpretation.

Also, you deleted the reference to the fact that it is FTDNA2.

In fact, the history shows I've deleted it twice, on Sept 3 & 4. And it also shows the SNPediaBot has added it back each time. But there was a problem with both of those FTDNA2 loads, as you pointed out. For that reason rs1126809 is still on my talk page, as it has not yet been resolved. However, over at SNPedia:FAQ#How_many_SNPs_are_in_SNPedia.3F you'll see that the FTDNA2 count still looks a little 'low'. And that's because SNPediaBot is currently making its 3rd attempt at the load.
The bot did not delete that. You did when you edited out my edit. Keep in mind, FTDNA is avoiding SNPs that have a medical interest and concentrating on those that give the most bang for the buck on ancestor populations, i.e., Family Finder. John Lloyd Scharf 06:12, 6 September 2011 (UTC)
Right. The bot added it, and a bunch of others it shouldn't have, such as rs1126809. The best way for me to fix it was to remove everything and reload again. I've now done that 2x. The 3rd run has now completed successfully. --- cariaso 11:03, 6 September 2011 (UTC)

I am going to stop making any edits to the Rs pages until you are more certain what you want. John Lloyd Scharf 01:21, 6 September 2011 (UTC)

  • I am still stalled on edits. John Lloyd Scharf 06:20, 6 September 2011 (UTC)

Please discontinue posting your personal opinions on my User page. It is not appreciated.

Error 1


I think you are indicating that one time you were tested and told you were a (C;C). Then you were tested a second time and were told you were a (C;T). Having a few of these (~20) seems to be not uncommon, either between 2 different companies, or even between 2 different versions of the 23andMe chip. This is because the decision to call you as a (C;T) is based on seeing a certain minimum number of Cs, and a certain number of Ts. Across the ~1M SNPs tested, a few will just barely make the threshold of heterozygosity one time, and will just barely miss it the next. It is an error in your raw data due to the limitations of microarrays, not an error in SNPedia or Promethease. --- cariaso 19:10, 4 October 2011 (UTC)
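The threshold effect described above can be illustrated with a deliberately simplified sketch. It assumes a C/T SNP, two raw signal counts, and an illustrative cutoff of 20% of total signal for an allele to count as present. The function name, the counts, and the cutoff are all hypothetical. Real array software clusters intensity values rather than counting bases, so this is only a toy model of the idea.

```python
def call_genotype(count_c, count_t, min_fraction=0.2):
    """Call a C/T SNP from two allele signal counts.

    An allele is treated as present when it contributes at least
    `min_fraction` of the total signal (an illustrative value only).
    Returns None when there is no signal at all (a no-call).
    """
    total = count_c + count_t
    if total == 0:
        return None  # no call
    has_c = count_c / total >= min_fraction
    has_t = count_t / total >= min_fraction
    if has_c and has_t:
        return "(C;T)"
    return "(C;C)" if has_c else "(T;T)"


# The same sample measured twice, with the T signal landing just
# below the cutoff on one run and just above it on the other.
print(call_genotype(81, 19))  # (C;C)
print(call_genotype(79, 21))  # (C;T)
```

A tiny shift in measured signal is enough to change the call, which is why a handful of discordant genotypes across roughly a million SNPs is plausible under this kind of rule.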

It is an error on the part of FTDNA. Your comment was a distraction from my User Page and inappropriate. If you are justifying two results for the same SNP, that cannot be justified to any customer. My guess is FTDNA discovered their AFFY chips had too many errors and too many "--" results. I have over 200 "--" results for Chromosome 6 alone. John Lloyd Scharf 22:04, 4 October 2011 (UTC)

Error 2

My FTDNA v1 information predates your involvement in SNPedia, and at that time I was given FTDNA data which did include rs11391, rs11276, rs11701 and rs1205. If it's not in your data, then Family Tree DNA has probably since decided to remove it from the results they release. Keeping up with their changes is substantially more trouble than it's worth. Feel free to continue documenting which SNPs have been removed, and in time I may try to address them in a batch, but understand that I will not be dealing with this issue in the short term. --- cariaso 19:25, 4 October 2011 (UTC)

Your response is wrong regarding this, as you were given the "FTDNA2" and you posted this after I gave you the Illumina chip results. There are only Affy results and Illumina chip results. I do not know of any FTDNA v1 designation. Your response has nothing to do with the fact that SNPedia indicators are faulty and cannot be used by the typical Family Finder customer.

If you remove the "FTDNA" and "FTDNA2" indicators it would be more appropriate than your faulty indicator causing people not to look up data just because you say FTDNA does or does not test it. John Lloyd Scharf 22:04, 4 October 2011 (UTC)

Perhaps I'll respond to the rest of this in time, but for the moment I'll just say that the section title did make me smile. Thanks. --- cariaso

No response is necessary. I ordered 23andMe. Some of your statements have caused me great concern and regret in making that decision, since you called into question the results with your generalizations. I think I need to contact them with my concerns. John Lloyd Scharf 04:48, 5 October 2011 (UTC)

New pages

Why are you creating pages like rs5000996 and rs1418706 which seem to lack any publications? --- cariaso 07:38, 2 November 2011 (UTC)

I do not understand the question. How do they differ from this


John Lloyd Scharf 19:01, 2 November 2011 (UTC)

Please use the [[rs1234]] form instead of http://www.snpedia.com/index.php/Rs104894002 form where possible.
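A small sketch of how that conversion could be automated, assuming the full URLs always follow the `http://www.snpedia.com/index.php/Rs...` shape shown above. The function name `to_wikilink` is hypothetical and this is not an existing SNPedia tool.

```python
import re

# Matches full SNPedia page URLs for rs pages, capturing the page name.
URL_PATTERN = re.compile(
    r"https?://www\.snpedia\.com/index\.php/(rs\d+)", re.IGNORECASE
)


def to_wikilink(text):
    """Replace full SNPedia rs page URLs with [[rs...]] wiki links."""
    return URL_PATTERN.sub(lambda m: "[[" + m.group(1).lower() + "]]", text)


print(to_wikilink("http://www.snpedia.com/index.php/Rs104894002"))
# [[rs104894002]]
```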

The page you've pointed to (rs104894002) has a link to the OMIM article about that SNP under the 007 label. The goal of SNPedia is to cover rs#s which have some associated literature. --- cariaso 02:00, 3 November 2011 (UTC)

Rs5000966 has a population and is tested by FTDNA, which is documentation of a sort that rs104894002 lacks. If I had mentioned rs104894002, I would have referred directly to:

H. H. Klünemann, B. H. Ridha, L. Magy, J. R. Wherrett, D. M. Hemelsoet, R. W. Keen, J. L. De Bleecker, M. N. Rossor, J. Marienhagen, H. E. Klein, L. Peltonen, and J. Paloneva
The genetic causes of basal ganglia calcification, dementia, and bone cysts: DAP12 and TREM2
Neurology May 10, 2005 64:1502-1507
or http://www.ncbi.nlm.nih.gov/pubmed/16505336/

John Lloyd Scharf 14:13, 3 November 2011 (UTC)

10M+ snps have a population. 700k+ snps are tested by FTDNA. Neither of those is sufficiently notable, and I hope you do not intend to put them all in. The Pubmed paper you've cited would be welcome information. However I believe you found it via the OMIM link, which goes to show why the OMIM citations are worthy. The OMIM information was bulk loaded by a bot, which did not have any way to intelligently follow into the PMIDs. --- cariaso 16:30, 3 November 2011 (UTC)

That must be why so many SNPs appear useless to those who test. It seems to be a flaw in the use of the bot you might want to resolve. My posting of those was based on their populations being more unique than some others and being on Chromosome 6, which is the home of many HLA haplotypes. They may have as yet undocumented effects on antibodies, which I am looking into. I am following the line of 6p25.3 at this point. John Lloyd Scharf 16:39, 3 November 2011 (UTC)

Your Edit Errors

rs1935952: Gene is FOXO3 not FOXO3A (obsolete). Alleles are C/G not C/T. To allow for backwards compatibility, population template MUST use exact order for the first four groups: CEU, HCB, JPT, YRI.

rs805264 I'm not sure if you meant to add this one as the rsid you used is for rs805297. Currently the population template is designed only for HapMap data sources. It will need to be redesigned before other sources such as POPU2 can be used. Any new sources must have good coverage and reputation for accuracy.

rs9352774 and rs11039588 and rs4959270 Template is case sensitive. "Orientation" is "plus" or "minus" not "Plus" or "Minus".

rs10947055 and rs1109324 and rs1155974 and rs34490746 Change "HapMapRevision" or remove it if you can't figure out the data source. (If you can't figure it out, there's probably a reason it shouldn't be changed.)

rs884460 It isn't necessary to add http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=rs884460 as it is already included in the template.

rs34490746 You got the orientation wrong. Should be plus. Also the hapmap data is suspect as dbsnp claims it is from r27 but the source data from hapmap does not include it. This is usually because someone submitted the QC- results. In any case, I've edited it for order and removed the incorrect r28 designation.

In general it is usually best to just add new SNPs with just the rsid and let the bots fill in the details later to avoid simple typos as well as other problems of manually adding entries. If there is something the bots get wrong or leave out then the change should be documented to show why the added information is more up to date or correct. This allows the bots to be improved. Jlick 20:15, 6 November 2011 (UTC)
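Two of the rules listed above (lowercase orientation values, and a fixed order for the first four population groups) are easy to check mechanically before saving a hand-edited entry. The sketch below is hypothetical: `check_entry` and its arguments are illustrative names, not part of the SNPedia templates or bots, and it assumes the entry's values have already been extracted into Python objects.

```python
# Values taken from the rules stated in the comment above.
REQUIRED_ORDER = ["CEU", "HCB", "JPT", "YRI"]
VALID_ORIENTATIONS = {"plus", "minus"}


def check_entry(orientation, populations):
    """Return a list of problems found in a hand-edited SNP entry.

    `orientation` is the Orientation value as typed; `populations` is
    the list of population group names in the order they appear.
    """
    problems = []
    if orientation not in VALID_ORIENTATIONS:
        problems.append(
            "Orientation must be 'plus' or 'minus' (case sensitive), "
            "got %r" % orientation
        )
    if populations[:4] != REQUIRED_ORDER:
        problems.append(
            "First four populations must be " + ", ".join(REQUIRED_ORDER)
        )
    return problems


for problem in check_entry("Plus", ["CEU", "JPT", "HCB", "YRI"]):
    print(problem)
```

An empty list means neither rule was violated; it says nothing about whether the alleles or gene name are correct.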

  • The orientation issue was a typo.
  • "Gene is FOXO3 not FOXO3A (obsolete). Alleles are C/G not C/T. To allow for backwards compatibility, population template MUST use exact order for the first four groups: CEU, HCB, JPT, YRI."
  • If "HapMapRevision" is important, you should be able to point out what the source is. If you cannot document it, how do we know it exists? That something is documented by NCBI should be enough.
  • I did not name the FOXO3A gene, and it is cited by the article. If FOXO3A is obsolete, why did you use it in the first place? If you are saying FOXO3A is obsolete, then isn't the research obsolete?
  • I do not know what importance or relevance "backward compatibility," has for a user of SNPedia and nothing fell apart when I changed the order, except to make the differences more visible between groups...i.e., more information.
  • Let me know what criteria you are using for your decisions rather than blaming the "bot" for it. You know and admit it has flaws. If you create it, you are responsible for it, but "bot" is not an authoritative source. Y