Eight consanguineous Arab families with novel autosomal recessive disorders were mapped using Illumina 700K SNP arrays. All relevant positional candidate genes were screened for pathogenic mutations, but none were identified. Because no significant LOD scores could be obtained, multiple homozygosity intervals were retained for each family. Whole exome sequencing was performed on the ABI SOLiD4 platform for one affected individual from each family. Mapping and annotation were performed with LifeScope software. Data were validated manually for each linkage interval by visual inspection of read depth and bead number coverage. On average, 30,000 sequence variations were detected in each sample, including novel variants, known polymorphisms and exome sequencing errors. For each chromosome with a linkage interval, the data were exported to Excel spreadsheets and visually inspected to exclude variants outside the linkage intervals. The number of variants within the linkage intervals of each family was between 400 and 1300. Of these, between 50 and 300 were homozygous, including 15-30 novel variants. Whether a variant was homozygous or heterozygous, and novel or annotated, was determined manually by visual inspection of the Excel spreadsheets. Each novel variant was manually classified as exonic, splice site or intronic. Each annotated variant was manually checked for association with a disease phenotype relevant to the family's disease. Minor genotype frequency was investigated for annotated variants if they represented disease states. All novel exonic variants were tested in silico with the PolyPhen and SIFT protein modeling software to assess their effect on protein function. All damaging variants (novel or annotated exonic, and splice site) were validated by Sanger sequencing and tested for co-segregation with the disease. An identical approach is essential to assess the pathogenic effects of insertion/deletion variants within each linkage interval.
This approach is tedious, involves a tremendous amount of manual work and is prone to oversight errors. The development of software tools that automate next-generation sequencing data analysis is essential to eliminate manual work and to identify pathogenic mutations among the plethora of existing variants. Such automation is also applicable to cases in which no linkage intervals are available to limit the number of variants under consideration.
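
As an illustration only, the manual filtering steps described above (restriction to linkage intervals, selection of homozygous variants, separation of novel from annotated variants, and classification by location) could be sketched in a few lines of Python. This is not the software used in the study. The record layout, the field names (`zygosity`, `annotation_id`, `region`) and the function names are hypothetical assumptions and do not correspond to the LifeScope output format.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Interval:
    """A linkage (homozygosity) interval; coordinates are inclusive."""
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; field names are assumptions."""
    chrom: str
    pos: int
    zygosity: str                 # "hom" or "het"
    annotation_id: Optional[str]  # None means the variant is novel
    region: str                   # "exonic", "splice_site" or "intronic"


def in_any_interval(variant: Variant, intervals: Iterable[Interval]) -> bool:
    """Return True if the variant lies inside at least one linkage interval."""
    return any(
        iv.chrom == variant.chrom and iv.start <= variant.pos <= iv.end
        for iv in intervals
    )


def novel_candidates(variants: Iterable[Variant],
                     intervals: List[Interval]) -> List[Variant]:
    """Keep homozygous, novel, exonic or splice-site variants in the intervals.

    Annotated variants linked to a relevant disease phenotype would need a
    separate lookup step, which is not covered by this sketch.
    """
    return [
        v for v in variants
        if in_any_interval(v, intervals)
        and v.zygosity == "hom"
        and v.annotation_id is None
        and v.region in {"exonic", "splice_site"}
    ]
```

The variants returned by such a filter would still need in silico assessment, Sanger validation and co-segregation testing, as described above. When no linkage intervals are available, the interval check could simply be skipped, at the cost of a much longer candidate list.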

