Qatar is accumulating substantial local expertise in biomedical data analytics. In particular, QCRI is forming a multidisciplinary scientific computing group with a focus on machine learning, statistical modeling and bioinformatics. We are now in a strong position to address the computational needs of biomedical researchers in Qatar and to prepare a new generation of scientists with multidisciplinary expertise.

The goal of genomics, proteomics and metabolomics is to identify and characterize the function of genes, proteins and small molecules that participate in chemical reactions and are essential for maintaining life. This research area is expanding rapidly and holds great promise for the discovery of risk factors and potential biomarkers of diseases such as obesity and diabetes, two areas of increasing concern in Qatar's population.

In this paper, we develop new statistical techniques for clustering large biomedical datasets (proteomics and metabolomics), based on mixture models with model selection. Both deterministic and Bayesian approaches are used. The new approach is formulated within multivariate mixture-model cluster analysis to handle both normal (Gaussian) and non-normal (non-Gaussian) high-dimensional data.
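
As a rough illustration of mixture-model clustering, the sketch below fits a one-dimensional Gaussian mixture by expectation-maximization (EM) and assigns each observation to its most probable component. It is a simplified stand-in, not the method developed in this paper: the function names `fit_gmm_1d` and `hard_labels`, the quantile-based initialization, the variance floor and the synthetic data are all assumptions made for the example, and the multivariate, non-Gaussian and Bayesian cases are not covered.

```python
import math
import random


def _log_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def fit_gmm_1d(data, k, n_iter=200, tol=1e-8, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative only).

    Returns (weights, means, variances, loglik, resp), where resp[i][j] is
    the responsibility of component j for data[i] and loglik is the
    log-likelihood computed in the last E-step.
    """
    n = len(data)
    xs = sorted(data)
    # Deterministic start (an assumption): means at evenly spaced quantiles.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    loglik = -math.inf
    resp = []
    for _ in range(n_iter):
        # E-step: responsibilities, using log-sum-exp for stability.
        resp = []
        new_loglik = 0.0
        for x in data:
            logs = [math.log(weights[j]) + _log_pdf(x, means[j], variances[j])
                    for j in range(k)]
            top = max(logs)
            total = top + math.log(sum(math.exp(v - top) for v in logs))
            new_loglik += total
            resp.append([math.exp(v - total) for v in logs])
        converged = abs(new_loglik - loglik) < tol
        loglik = new_loglik
        if converged:
            break
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)  # guard against empty components
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, var_floor)  # floor avoids variance collapse
    return weights, means, variances, loglik, resp


def hard_labels(resp):
    """Assign each observation to the component with the largest responsibility."""
    return [max(range(len(r)), key=lambda j: r[j]) for r in resp]


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic data, not from the study
    sample = [rng.gauss(-5.0, 1.0) for _ in range(100)]
    sample += [rng.gauss(5.0, 1.0) for _ in range(100)]
    w, m, v, ll, r = fit_gmm_1d(sample, k=2)
    print("weights:", w, "means:", m, "log-likelihood:", ll)
```

The returned log-likelihood is the quantity that a model selection criterion would penalize when comparing fits with different numbers of components.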

To choose the number of mixture components (clusters), we develop model selection based on the information measure of complexity (ICOMP) criterion of the estimated inverse-Fisher information matrix. Our promising preliminary results suggest that the algorithm can be used to identify obesity susceptibility genes in humans in a genome-wide association study and in mass spectrum data generated for adipocyte tissue for an obesity study.
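
For orientation, the ICOMP criterion based on the inverse-Fisher information matrix is commonly written in the general form below. The abstract does not state the formula, so the notation is an assumption: $\hat{\theta}_K$ denotes the maximum likelihood estimate for a mixture with $K$ components, $L$ the likelihood, $\hat{\mathcal{F}}^{-1}$ the estimated inverse-Fisher information matrix, and $C_1$ the first-order maximal entropic complexity of a covariance matrix.

```latex
\mathrm{ICOMP}(\mathrm{IFIM})
  = -2 \log L(\hat{\theta}_K)
  + 2\, C_1\!\left( \hat{\mathcal{F}}^{-1}(\hat{\theta}_K) \right),
\qquad
C_1(\Sigma)
  = \frac{s}{2} \log\!\left[ \frac{\operatorname{tr}(\Sigma)}{s} \right]
  - \frac{1}{2} \log \lvert \Sigma \rvert,
\quad s = \operatorname{rank}(\Sigma).
```

Under this form, the number of clusters is taken to be the $K$ that minimizes the criterion: the first term rewards goodness of fit, while the second penalizes the complexity of the parameter estimates' covariance structure rather than only the parameter count.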

