Background and Objectives: Recent next-generation sequencing studies of different cancers reveal a varied spectrum of mutations: individual patients carry dozens to hundreds of mutations, with little overlap between patients. It remains difficult to determine which of the observed mutations contribute to tumorigenesis. While several approaches have been used to determine significantly mutated genes, these approaches do not calculate random expectation mutation frequencies. Here, we develop a simulation of random mutagenesis and compare observed mutation frequencies (Fo) to random expectation frequencies (Fr) to identify genes that are likely to be selected for or against in tumors.

Methods: Our random simulation method, implemented in Matlab, applies different mutation probabilities to A or T bases, G or C bases, and CpG repeats. Mutations were simulated on Agilent's 50 Mb exon library and are reported at the gene level. Each simulation was run for 100 trials, with each trial consisting of 316 repeats. The 100 trials enable calculation of standard deviations for random mutation frequencies. We also carried out a bootstrap analysis of the observed data to estimate standard deviations of observed mutations. We calculated differences between Fo and Fr using ratio and rank comparisons.

Results: We applied our approach to ovarian cancer data reported by The Cancer Genome Atlas in 2011. We found a significant difference between observed mutations and random expectation mutations. Random mutations correlate well with the length of a gene (R² = 0.54 for one trial), while almost no correlation is seen with observed mutations (R² = 0.11). We also found one set of genes that is mutated at higher frequencies than expected and another set that is mutated at lower frequencies than expected.

Conclusions: Our simulated mutagenesis method is a novel approach to determining the significance of observed mutations in cancer. Application of this model to ovarian cancer data shows a significant discrepancy between expected and observed mutations, possibly indicating that most observed gene mutations are selected for. In addition, our approach reveals specific genes that may be implicated in tumorigenesis. These genes are good candidates for further study.
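
The random-expectation step described in the Methods can be illustrated with a short sketch. This is a minimal Python illustration under stated assumptions, not the authors' Matlab implementation. The relative weights in WEIGHTS, the toy gene sequences, the number of mutations per repeat, and the definition of a gene-level frequency as the fraction of repeats in which a gene is hit at least once are all hypothetical choices made for the example. Only the 100 trials, the 316 repeats per trial, and the three base classes come from the text.

```python
import random
import statistics

# Hypothetical relative mutation weights per base class (illustrative values only).
WEIGHTS = {"AT": 1.0, "GC": 2.0, "CpG": 10.0}


def gene_weight(seq):
    """Sum the per-base mutation weights over one gene sequence."""
    seq = seq.upper()
    total = 0.0
    for i, base in enumerate(seq):
        if base in "AT":
            total += WEIGHTS["AT"]
        elif base in "GC":
            # A C followed by G, or a G preceded by C, counts as a CpG site.
            in_cpg = (base == "C" and seq[i + 1:i + 2] == "G") or (
                base == "G" and i > 0 and seq[i - 1] == "C"
            )
            total += WEIGHTS["CpG"] if in_cpg else WEIGHTS["GC"]
    return total


def simulate_trial(names, weights, n_repeats, muts_per_repeat, rng):
    """One trial: per gene, the fraction of repeats with at least one hit."""
    hits = dict.fromkeys(names, 0)
    for _ in range(n_repeats):
        # Place mutations on genes with probability proportional to gene weight.
        mutated = set(rng.choices(names, weights=weights, k=muts_per_repeat))
        for gene in mutated:
            hits[gene] += 1
    return {gene: hits[gene] / n_repeats for gene in names}


def random_expectation(genes, n_trials=100, n_repeats=316, muts_per_repeat=3, seed=0):
    """Return {gene: (mean Fr, standard deviation of Fr)} across trials."""
    rng = random.Random(seed)
    names = list(genes)
    weights = [gene_weight(genes[name]) for name in names]
    trials = [
        simulate_trial(names, weights, n_repeats, muts_per_repeat, rng)
        for _ in range(n_trials)
    ]
    result = {}
    for name in names:
        values = [trial[name] for trial in trials]
        result[name] = (statistics.mean(values), statistics.stdev(values))
    return result


def ratio_to_random(observed, expected):
    """Fo / Fr for every gene whose expected frequency is non-zero."""
    return {
        gene: observed[gene] / expected[gene][0]
        for gene in observed
        if expected[gene][0] > 0
    }


if __name__ == "__main__":
    # Toy sequences, not real genes; they differ in length and CpG content.
    toy_genes = {
        "GENE_A": "ATATATATAT" * 5,
        "GENE_B": "ACGTACGT" * 20,
        "GENE_C": "CGCGCGCG" * 5,
    }
    expected = random_expectation(toy_genes)
    # Made-up observed frequencies, used only to show the ratio comparison.
    toy_observed = {"GENE_A": 0.5, "GENE_B": 0.5, "GENE_C": 0.5}
    for gene, ratio in ratio_to_random(toy_observed, expected).items():
        print(gene, round(expected[gene][0], 3), round(ratio, 2))
```

In this sketch, a ratio well above 1 marks a gene mutated more often than the random model predicts, and a ratio well below 1 marks one mutated less often. Repeating the simulation over many trials is what makes a per-gene standard deviation of Fr available for the comparison.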

