Towards Integrating the Gene Ontology and the Hierarchical Bayesian Network Classification Model: An Empirical Case Study

Hasanein Alharbi


Data Mining (DM) is a knowledge-intensive process that can be significantly enhanced by integrating domain knowledge. Recent research has claimed that ontologies can play various roles in the DM process and can also facilitate different steps of Bayesian Network (BN) construction. To this end, this paper investigates the advantages of consolidating the Gene Ontology (GO) and the Hierarchical Bayesian Network (HBN) classifier in a flexible framework that preserves the advantages of both ontology and Bayesian theory. The proposed Semantically Aware Hierarchical Bayesian Network (SAHBN) classification model systematically consolidates domain knowledge, in the form of an ontology, with the DM process. Furthermore, it establishes a solid foundation for exploring the integration of more comprehensive ontological knowledge into the DM process. SAHBN is tested on three biomedical datasets to predict the effect of DNA repair genes on the human ageing process. DNA repair genes are classified as either ageing-related or non-ageing-related based on their GO biological process terms. Six performance criteria were used to evaluate the proposed SAHBN model. Overall, the SAHBN classifier shows very competitive performance compared with existing Bayesian-based classification algorithms, outperforming them in more than 50% of the experiments.
