We have developed an efficient selection algorithm (LDSelect) that
is based on the linkage disequilibrium statistic r² and does not
require direct haplotype inference (Carlson et al. 2004). This algorithm
selects a subset of variants that efficiently describe all common patterns
of variation in a gene, based on two primary criteria: 1) the
minor allele frequency (MAF) of a SNP and 2) the minimum
level of association between assayed and unassayed SNPs, measured by
the linkage disequilibrium statistic r². Given these parameters, LDSelect
identifies bins of SNPs such that one tagSNP per bin can be genotyped.
All SNPs above the MAF threshold will either be directly genotyped
or will exceed the specified level of allelic association with a SNP
that is genotyped. For the large-scale genotyping we have selected
tagSNPs from our representative European (CEPH) and African-American
samples (i.e., panel 1) used for the initial four years of the SeattleSNPs
project. Binning criteria for tagSNP selection were a MAF cutoff
of 5% and an r² threshold of 0.65.
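
The binning step described above can be illustrated with a minimal Python sketch of greedy r²-based binning. This is not the LDSelect code itself: the function name greedy_ld_bins, the input layout (a dict of MAFs and a dict of pairwise r² values keyed by SNP pairs), and the use of "greater than or equal to" at both thresholds are assumptions made for illustration. Pairwise r² values are taken as given, and pairs missing from the table are treated as r² = 0.

```python
def greedy_ld_bins(maf, r2, maf_cutoff=0.05, r2_threshold=0.65):
    """Greedy r2-based binning (illustrative sketch, not LDSelect itself).

    maf: dict mapping SNP id -> minor allele frequency.
    r2:  dict mapping frozenset({snp_a, snp_b}) -> pairwise r2.
         Missing pairs are assumed to have r2 = 0 (an assumption).
    """
    def ld(a, b):
        # A SNP is in perfect LD with itself.
        return 1.0 if a == b else r2.get(frozenset((a, b)), 0.0)

    # Only SNPs at or above the MAF cutoff are binned.
    remaining = {snp for snp, freq in maf.items() if freq >= maf_cutoff}
    bins = []
    while remaining:
        def partners(snp):
            # Unbinned SNPs (including snp itself) meeting the threshold.
            return {other for other in remaining
                    if ld(snp, other) >= r2_threshold}

        # Seed the bin with the SNP that captures the most unbinned SNPs;
        # sorting makes tie-breaking deterministic.
        seed = max(sorted(remaining), key=lambda snp: len(partners(snp)))
        members = partners(seed)
        # Any member meeting the threshold with every other member
        # could serve as the tagSNP for this bin.
        tags = sorted(snp for snp in members
                      if all(ld(snp, other) >= r2_threshold
                             for other in members))
        bins.append({"members": sorted(members), "tag_snps": tags})
        remaining -= members
    return bins
```

Called with the criteria stated above (maf_cutoff=0.05, r2_threshold=0.65), each returned bin lists its member SNPs and the candidates from which one tagSNP could be chosen for genotyping; a SNP with no partner above the threshold ends up in a bin of its own.
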

The Illumina BeadArray technology provides a robust and accurate genotyping
platform using highly multiplexed ligation assays with multiple levels
of specificity to obtain optimal results (Fan et al. 2003). Currently,
Illumina produces a scalable, high-specificity, multiplexed system
(384 to 1536 SNPs/assay) based on self-assembling bead arrays (Oliphant
et al. 2002). The processing of 96 samples within a standard plate
format also makes this system amenable to high-throughput genotyping.