All PGA Variation Data
Bulk Download of All Variation Data Files
WARNING: This is a very large file and will take several minutes
to download. The file is a compressed and "tarred" Unix archive
containing the entire directory of text data files. These are the
same text data files which appear in the data pages for each candidate
gene on our Finished Genes List. Please see our Usage Statement
if this work is used in a publication. A complete description of
the contents of each of these files is found in the README.txt file.
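As a rough illustration of working with the bulk archive without unpacking it by hand, the sketch below uses Python's standard tarfile module to list the files in a compressed tar archive and to read README.txt from it. The archive path is supplied by the caller, the function names are hypothetical, and the location of README.txt inside the archive is an assumption, so the code searches for it by file name.

```python
import tarfile
from pathlib import PurePosixPath
from typing import List


def list_data_files(archive_path: str) -> List[str]:
    """Return the sorted names of all regular files in the archive."""
    # "r:*" lets tarfile detect the compression (gzip, bzip2, ...) itself.
    with tarfile.open(archive_path, "r:*") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


def read_readme(archive_path: str, name: str = "README.txt") -> str:
    """Return the text of the first member whose file name matches `name`.

    Assumption: the README may sit in any directory of the archive,
    so members are matched on their base name only.
    """
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive.getmembers():
            if member.isfile() and PurePosixPath(member.name).name == name:
                handle = archive.extractfile(member)
                return handle.read().decode("utf-8", errors="replace")
    raise FileNotFoundError(name)
```

Calling list_data_files on the downloaded archive gives an overview of the per-gene text files before anything is extracted to disk.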
Download of Variation Data (Single File)
This is a tab-delimited text file in our "prettybase"
format, which describes all SNP sites discovered by the SeattleSNPs
PGA. The format of this file is:
<chromosome position-HUGO_NAME-chromosome> <PGA Sample ID> <Allele1> <Allele2>
Example: 74772592-PLAU-10 D001 G T
The 'chromosome position' is generated by mapping to the most recent genome assembly available from the UCSC Genome Assembly.
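The "prettybase" layout described above can be read with a few lines of Python. This is a minimal sketch, not the SeattleSNPs tooling: the record type and function name are hypothetical, and it assumes that fields are separated by whitespace (tabs in the file) and that the first field always has the form position-HUGO_NAME-chromosome.

```python
from typing import NamedTuple


class PrettybaseRecord(NamedTuple):
    position: int      # chromosome position
    gene: str          # HUGO gene name
    chromosome: str    # kept as text so that "X" and "Y" are valid
    sample_id: str     # PGA Sample ID
    allele1: str
    allele2: str


def parse_prettybase_line(line: str) -> PrettybaseRecord:
    """Parse one line of the form
    <position-HUGO_NAME-chromosome> <PGA Sample ID> <Allele1> <Allele2>.
    """
    site, sample_id, allele1, allele2 = line.split()
    # Split the position off the front and the chromosome off the back,
    # so a gene name that itself contains a hyphen stays intact.
    position, rest = site.split("-", 1)
    gene, chromosome = rest.rsplit("-", 1)
    return PrettybaseRecord(int(position), gene, chromosome,
                            sample_id, allele1, allele2)


record = parse_prettybase_line("74772592-PLAU-10\tD001\tG\tT")
print(record.gene, record.chromosome, record.position)
```

Splitting the first field from both ends is a deliberate choice: it avoids assuming that the gene name contains no hyphens.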
Download of PGA Variation Data by Chromosome
These are tab-delimited text files in our "prettybase"
format, which describe all SNP sites discovered by the SeattleSNPs PGA,
separated into files by chromosome. The format of each file is:
<chromosome position-HUGO_NAME-chromosome> <PGA Sample ID> <Allele1> <Allele2>
Example: 74772592-PLAU-10 D001 G T
The chromosome position is generated by mapping to Genome Build 36, hg18:
UCSC Genome Assembly
Chromosome 1 |
PGA SIFT/PolyPhen Data
Putative changes in the function of each candidate gene's protein were assessed by running both SIFT and PolyPhen on the gene's nonsynonymous coding SNPs (cSNPs). In general, each nonsynonymous amino acid change is analyzed in the context of other evolutionarily similar proteins to estimate how likely the change is to affect protein function, and is then statistically classified. SIFT classifies each coding SNP as tolerant or intolerant; PolyPhen classifies it as benign, possibly damaging, or probably damaging.
|Combined SIFT/PolyPhen Data for PGA Nonsynonymous SNPs|
|Combined SIFT/PolyPhen Data for PGA Nonsynonymous SNPs (Intolerant or Potentially Damaging)|
|SIFT Data for PGA Nonsynonymous SNPs (Potentially Intolerant)|
|PolyPhen Data for PGA Nonsynonymous Genes (Potentially Damaging)|
Download of dbSNP rs IDs For All Variations (Single File)
This is a tab-delimited text file that lists the NCBI dbSNP rs IDs as of June 4, 2007. In addition, the chromosome positions for UCSC Browser builds hg17 and hg18 are given.
Following several header lines, the columns are:
<gene name> <PGA local ID> <chromosome number> <hg17 position> <hg18 position> <rs ID>
Example: sftpb SFTPB-002233 chr2 85805543 85747396 rs3024799
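The six-column layout above can be read in a similar way. The sketch below is illustrative only: the names are hypothetical, and because the exact form of the header lines is not described here, it assumes that any line without exactly six whitespace-separated fields and numeric hg17 and hg18 positions is a header and can be skipped.

```python
from typing import Iterable, Iterator, NamedTuple


class RsIdRecord(NamedTuple):
    gene: str
    pga_local_id: str
    chromosome: str
    hg17_position: int
    hg18_position: int
    rs_id: str


def parse_rs_id_lines(lines: Iterable[str]) -> Iterator[RsIdRecord]:
    """Yield one record per data row, skipping header and blank lines."""
    for line in lines:
        fields = line.split()
        # Assumption: a data row has six fields and numeric positions;
        # anything else is treated as a header line and ignored.
        if len(fields) != 6:
            continue
        if not (fields[3].isdigit() and fields[4].isdigit()):
            continue
        gene, local_id, chrom, hg17, hg18, rs_id = fields
        yield RsIdRecord(gene, local_id, chrom, int(hg17), int(hg18), rs_id)


sample = [
    "# hypothetical header line",
    "sftpb\tSFTPB-002233\tchr2\t85805543\t85747396\trs3024799",
]
records = list(parse_rs_id_lines(sample))
print(records[0].rs_id, records[0].hg18_position)
```

Note that under this assumption a row with a missing column would be silently skipped, so a stricter check may be preferable once the real header and row layout are confirmed against the file.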