A summary of sequencing done for each of the three pilot projects is available here. The list of samples and allocations is provided in a spreadsheet.

Variant Calls
Our variant calls are always released in VCF format. The releases can be found in the release directory (EBI|NCBI).

Alignments
The main project alignments are available in BAM format.

Raw sequence files
The main project raw sequence data is available in FASTQ format.

PLINK

This page describes some basic file formats, convenience functions and analysis options for rare copy number variant (CNV) data. Support for common copy number polymorphisms (CNPs) is described here. Copy number variants are represented as segments.

Basic support for segmental CNV data

The basic command for reading a list of segmental CN variants is

    plink --cnv-list mydata.cnv --fam mydata.fam --map

which can be abbreviated

    plink --cfile mydata

(note that the map file must have the map extension).

The CNV list file has the following fields:

    FID     Family ID
    IID     Individual ID
    CHR     Chromosome
    BP1     Start position (base-pair)
    BP2     End position (base-pair)
    TYPE    Type of variant, e.g. 0,1 or 3,4 copies
    SCORE   Confidence score associated with variant
    SITES   Number of probes in the variant

Having a header row is optional; if the first line starts with FID it will be ignored. The FAM file format is the first 6 fields of a PED file, described here; this file lists the sex, phenotype and founder status of each individual. and then
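To make the CNV list layout concrete, here is a minimal Python sketch that reads records with the eight fields listed above (FID, IID, CHR, BP1, BP2, TYPE, SCORE, SITES) and skips an optional header line beginning with FID. This is not part of PLINK: the names `CnvSegment` and `parse_cnv_list` are hypothetical, and the sketch assumes whitespace-delimited columns, an integer TYPE and a numeric SCORE.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class CnvSegment:
    """One segmental CNV record (hypothetical container, not a PLINK type)."""
    fid: str        # FID: family ID
    iid: str        # IID: individual ID
    chrom: str      # CHR: kept as text so labels such as "X" are accepted
    bp1: int        # BP1: start position (base-pair)
    bp2: int        # BP2: end position (base-pair)
    cnv_type: int   # TYPE: copy number of the variant
    score: float    # SCORE: confidence score
    sites: int      # SITES: number of probes in the variant


def parse_cnv_list(lines: Iterable[str]) -> Iterator[CnvSegment]:
    """Yield one CnvSegment per data line.

    Assumes whitespace-delimited columns. A first line starting with
    "FID" is treated as a header and ignored; blank lines are skipped.
    """
    for line_number, line in enumerate(lines):
        if line_number == 0 and line.lstrip().startswith("FID"):
            continue  # optional header row
        fields = line.split()
        if not fields:
            continue  # blank line
        if len(fields) != 8:
            raise ValueError(
                f"line {line_number + 1}: expected 8 fields, got {len(fields)}"
            )
        fid, iid, chrom, bp1, bp2, cnv_type, score, sites = fields
        yield CnvSegment(fid, iid, chrom, int(bp1), int(bp2),
                         int(cnv_type), float(score), int(sites))


if __name__ == "__main__":
    # Made-up records, for illustration only.
    example = [
        "FID IID CHR BP1 BP2 TYPE SCORE SITES",
        "F1 I1 1 1000 5000 1 20.5 12",
    ]
    for segment in parse_cnv_list(example):
        print(segment.iid, segment.chrom, segment.bp2 - segment.bp1)
```

Keeping the chromosome as a string avoids guessing how non-numeric chromosome labels are coded; convert it only if your data is known to use numeric codes throughout.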

The Human Genome Project (HGP) was one of the great feats of exploration in history - an inward voyage of discovery rather than an outward exploration of the planet or the cosmos; an international research effort to sequence and map all of the genes - together known as the genome - of members of our species, Homo sapiens. Completed in April 2003, the HGP gave us the ability, for the first time, to read nature's complete genetic blueprint for building a human being.

