Personalized medicine uses information about an individual's genes, proteins, environment, and phenotype to prevent, diagnose, and treat disease. The discovery of innovative biomarkers, which is key to personalized medicine across multiple tumor types, has unlocked new information about cancer biology by providing critical insights into biological, pathogenic, and pharmacologic responses to treatment. The completion of the Human Genome Project, when a complete sequence was published for the first time, created the potential to identify a large set of single nucleotide polymorphisms (SNPs) across the entire genome. This has opened the door to great improvements in diagnosis and therapeutics. In addition, the availability of massive amounts of genome-wide association study (GWAS) data has necessitated the development of new data mining and machine learning methods for quality control, imputation, and analysis, including multiple testing, predictive modeling for chronic diseases, and the discovery of variants that could lead to a particular trait or disease. Currently, personalized medicine faces multiple issues when trying to predict complex diseases such as cardiovascular disease, cancer, and asthma. Disease prediction is still based on SNPs and a few environmental factors, whereas complex diseases are usually affected by gene-to-gene interactions and by many environmental factors, which have a significant impact on the predicted outcomes. The current challenge is therefore to develop a personalized medicine system that can discover which tumors have unique pathologic and molecular characteristics that may warrant different treatment strategies. This research is based on the announcement of the Qatar national genome project, which aims to map the genome of the entire population of Qatar in order to deliver personalized medicine.
The goal is to develop a genomics data hub and to establish advanced big data analytics that combine modern data mining and predictive modeling with high performance computing, supporting memory-intensive genomic and variant analysis and data-intensive clinical analytics over petabytes of phenotype and omics databases. By understanding specific differences in tumor biology, researchers are identifying biomarkers for many tumor types, which is helping them to develop treatments that target the underlying disease pathways. With these targeted therapies, clinicians can develop a more specific, and potentially more effective, treatment strategy for some individuals based on the individual's tumor characteristics. The experimental and simulated genome-wide SNP data provided by Genetic Analysis Workshops 16 and 17 will be used to investigate the new machine learning technique. These data have afforded an opportunity to analyze the applicability and benefit of current machine learning methods, namely penalized regression, ensemble learning methods, and network analyses, which resulted in several new findings, while known and simulated genetic risk variants were also identified. Strategies for integrating the phenotype and omics databases, their implementation, and the learning processes are briefly proposed. The motivation of this research is to identify and discuss the GWAS challenges that will require breakthrough, innovative big data analytics and advanced predictive modeling frameworks to handle massive GWAS data and bring personalized medicine to the bedside. The ultimate goal is to deliver the right treatments to the right patients at the right time.

