Proteins are the most important molecules of life and take part in almost every biological process. Knowledge of protein functions plays an essential role in understanding biological cells, which ultimately has a significant impact on human life in areas such as personalized medicine, better crops, and improved therapeutic interventions. Conventional approaches for classifying protein functions are essentially based on two-way classification decisions, i.e., a function is decided as being either positively or negatively annotated to a protein. These two-way classification approaches have two basic shortcomings. First, they classify every case irrespective of the available information. As a result, cases with little associated information may be misclassified, which lowers accuracy. Second, these approaches have no mechanism to incorporate and take advantage of the continuously evolving biological information that results from technological advancements.


In this work, we propose and evaluate a three-way decision making approach to classify protein functions. The essential idea is to extend the two-way decision making approach by adding a third decision option, deferment. Because technological advancements continuously refine and update the available biological information, we argue that the three-way approach can overcome both shortcomings of the two-way approaches. First, we can defer a decision whenever we do not have sufficient evidence to reach a definite conclusion, which can reduce some of the misclassifications. Second, by explicitly identifying the cases for which immediate decisions may not be possible, we make room for integrating anticipated future biological information, which should make those decisions clearer.
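The three decision options can be illustrated with a minimal sketch. It is not the method evaluated in this work: it assumes a probabilistic rough set style rule in which proteins with identical descriptions form an equivalence class, the fraction of positively annotated proteins in that class serves as the evidence, and two thresholds decide the outcome. The function names, the thresholds `alpha` and `beta`, and their default values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def three_way_decide(prob, alpha=0.7, beta=0.3):
    """Map an estimated probability to accept / reject / defer.

    alpha and beta are illustrative thresholds (assumptions, not values
    taken from the paper); they must satisfy 0 <= beta < alpha <= 1.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    if prob >= alpha:
        return "accept"   # function positively annotated to the protein
    if prob <= beta:
        return "reject"   # function negatively annotated to the protein
    return "defer"        # evidence insufficient, postpone the decision


def classify(train, queries, alpha=0.7, beta=0.3):
    """Decide each query from the label ratio of its equivalence class.

    train   : list of (features, label) pairs, features being a hashable
              tuple of attribute values and label a bool
    queries : list of feature tuples to decide on
    """
    counts = defaultdict(lambda: [0, 0])  # features -> [positives, total]
    for features, label in train:
        counts[features][0] += int(label)
        counts[features][1] += 1

    decisions = []
    for features in queries:
        positives, total = counts.get(features, (0, 0))
        if total == 0:
            # No matching training proteins: no evidence, so defer.
            decisions.append("defer")
        else:
            decisions.append(three_way_decide(positives / total, alpha, beta))
    return decisions


# Toy data with made-up attribute values.
train = [
    (("kinase", "nucleus"), True),
    (("kinase", "nucleus"), True),
    (("kinase", "nucleus"), True),
    (("kinase", "nucleus"), False),
    (("transporter", "membrane"), True),
    (("transporter", "membrane"), False),
    (("ligase", "cytosol"), False),
    (("ligase", "cytosol"), False),
]
queries = [
    ("kinase", "nucleus"),
    ("transporter", "membrane"),
    ("ligase", "cytosol"),
    ("unknown", "unknown"),
]
decisions = classify(train, queries)
```

In this toy setting, deferred cases are those whose equivalence class is mixed or empty. Adding more descriptive attributes would split mixed classes into purer ones, which is the intuition behind expecting fewer deferments as biological information accumulates.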


We evaluated the proposed three-way decision making approach on a dataset of Saccharomyces cerevisiae proteins obtained from the UniProt database, with the corresponding functional classes extracted from the Gene Ontology database. For our experiments, we considered rough set based models for inducing three-way decisions. The results indicate that increasing the level of biological information associated with proteins reduces the number of deferred cases while maintaining the same level of accuracy. We comprehensively benchmark our scheme under these settings and conclude that the classification becomes crisper as the knowledge of associated biological information matures.

