Thursday, June 14, 2012

Summarize 23andMe SNP Categories

The primary goal of this script is to provide statistics about your 23andMe SNPs (number of annotated SNPs, number of homozygous / heterozygous disease associations, number of coding SNPs, etc.).


Step #1: Create a Combined SNP File
  • Prepare combined SNP file (click here for details)
  • This will also work for filtered files (check here for details)
Step #2: Produce Summary Statistics

  • Download the perl script 23andMe_stats.pl
  • There is one parameter that you need to enter:
    • inputfile = file containing 23andMe SNPs with both SeattleSNP and GWAS Catalog annotations (click here for details)
  • PC Users
    • Open a terminal window (type "cmd" in Run, for example)
    • Move to the folder where your 23andMe data is saved.
      • Basic commands:
        • cd = change folder
          • If the data is not on your C:\ drive, you can type "cd /d D:"
        • cd .. = move up one folder
    • Type "perl 23andMe_GWAS_stats.pl" and enter the required inputfile parameter. See the example below.

  • Mac Users
    • Open Terminal (in Applications/Utilities, for example)
    • Basic commands:
      • cd = change folder
      • cd .. = move up one folder
    • Type "perl 23andMe_GWAS_stats.pl" and enter the required inputfile parameter. See the example below.
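
As a rough illustration of the kind of counting such a summary involves, here is a minimal Python sketch. It is not the Perl script described above and does not reproduce its output. The layout of the annotated input file is not described in this post, so the sketch assumes the plain 23andMe raw data layout instead (tab-separated rsid, chromosome, position, genotype, with header lines starting with "#"). The function name summarize_genotypes and the sample rsIDs are hypothetical, and the sketch only counts genotype categories, not disease associations or coding SNPs.

```python
from collections import Counter


def summarize_genotypes(lines):
    """Count genotype categories in 23andMe-style raw data lines.

    Assumes tab-separated columns: rsid, chromosome, position, genotype.
    (Hypothetical helper for illustration; not part of the Perl script.)
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and '#' header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            counts["malformed"] += 1
            continue
        genotype = fields[3].strip()
        counts["total"] += 1
        if not genotype or "-" in genotype:
            counts["no_call"] += 1       # e.g. "--"
        elif len(genotype) == 1:
            counts["hemizygous"] += 1    # single-allele call
        elif genotype[0] == genotype[1]:
            counts["homozygous"] += 1    # e.g. "AA"
        else:
            counts["heterozygous"] += 1  # e.g. "AG"
    return counts


# Made-up sample rows (hypothetical rsIDs and positions).
sample = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs0000001\t1\t1000\tAA",
    "rs0000002\t1\t2000\tAG",
    "rs0000003\t2\t3000\t--",
    "rs0000004\tX\t4000\tC",
]
stats = summarize_genotypes(sample)
for category in ("total", "homozygous", "heterozygous", "hemizygous", "no_call"):
    print(category, stats[category])
```

To run this on a real file, the list of sample rows could be replaced with an open file handle, since the function only needs something it can loop over line by line.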

I have tested my Perl scripts on a PC and a Mac, but I cannot guarantee that they will work on every possible platform. These scripts may also need modifications as file formats change; so far, I have confirmed that they work with v2 and v3 arrays using genomes from Genomes Unzipped. If you have any questions or comments, please post them below and I will do my best to help troubleshoot.

Creative Commons License
My Biomedical Informatics Blog by Charles Warden is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.