DSpace Repository

The impact of pre-clustering on classification of heterogeneous protein data

dc.contributor.author Altartouri, Haneen
dc.contributor.author Tamimi, Hashem
dc.contributor.author Ashhab, Yaqoub
dc.date.accessioned 2021-12-13T11:19:13Z
dc.date.accessioned 2022-05-22T08:55:50Z
dc.date.available 2021-12-13T11:19:13Z
dc.date.available 2022-05-22T08:55:50Z
dc.date.issued 2021-09-14
dc.identifier.uri http://localhost:8080/xmlui/handle/123456789/8406
dc.description.abstract The aim of this paper is to evaluate the improvement in the classification of protein sequence data gained by introducing clustering as a preprocessing step. Cluster analysis is used to discover possible sub-clusters that might have different patterns within the same protein class. A classification learning algorithm is then applied to each cluster to enhance classification accuracy. Two standard benchmark datasets were used to examine the proposed approach: caspase 3 human substrates, which include cleaved and non-cleaved peptides, and membrane proteins (inner and α-helical proteins). Different descriptors based on the physicochemical properties of amino acids were extracted from the protein sequence data, and two encoding methods were used to represent the protein sequences with these descriptors. The results show that applying a clustering step prior to classification gives higher prediction accuracy than using classification alone. In addition, timing results show that the proposed approach significantly reduces the training time of the classification process while maintaining prediction accuracy. en_US
dc.language.iso en en_US
dc.publisher Network Modeling Analysis in Health Informatics and Bioinformatics - Springer en_US
dc.subject Protein sequence data en_US
dc.subject Classification en_US
dc.subject Clustering en_US
dc.subject Physico-chemical properties en_US
dc.title The impact of pre-clustering on classification of heterogeneous protein data en_US
dc.title.alternative The impact of pre-clustering on classification of heterogeneous protein data en_US
dc.type Article en_US
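
The abstract describes a two-stage pipeline: cluster the training data first, then train a separate classifier on each cluster. The record names neither the clustering algorithm nor the classifier, so the minimal sketch below substitutes a plain k-means and a nearest-class-centroid classifier as stand-ins. The function names (kmeans, fit, predict) and the 2-D toy vectors are hypothetical illustrations, not the authors' descriptors or data.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def nearest(point, centers, allowed=None):
    """Index of the closest center, optionally restricted to `allowed` indices."""
    idx = range(len(centers)) if allowed is None else allowed
    return min(idx, key=lambda i: sq_dist(point, centers[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (assumed stand-in for the paper's clustering step)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the previous center if a cluster ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

def fit(X, y, k):
    """Cluster X, then build one nearest-class-centroid model per cluster."""
    centers = kmeans(X, k)
    models = [{} for _ in range(k)]
    grouped = [{} for _ in range(k)]
    for x, label in zip(X, y):
        grouped[nearest(x, centers)].setdefault(label, []).append(x)
    for ci, by_label in enumerate(grouped):
        models[ci] = {label: mean(pts) for label, pts in by_label.items()}
    return centers, models

def predict(x, centers, models):
    """Route x to its nearest non-empty cluster, then classify within it."""
    usable = [i for i, m in enumerate(models) if m]
    model = models[nearest(x, centers, usable)]
    return min(model, key=lambda label: sq_dist(x, model[label]))

# Hypothetical toy data: each class has two sub-clusters (heterogeneous classes).
X = [(0, 0), (1, 0), (10, 10), (11, 10),   # class "A"
     (0, 3), (1, 3), (10, 13), (11, 13)]   # class "B"
y = ["A"] * 4 + ["B"] * 4

single = fit(X, y, k=1)       # classification alone (one global model)
clustered = fit(X, y, k=2)    # pre-clustering, then one model per cluster

query = (0.5, 3.0)            # lies inside a class "B" sub-cluster
print("single model:  ", predict(query, *single))
print("pre-clustered: ", predict(query, *clustered))
```

On this toy data the single global model averages the two sub-clusters of each class into one centroid, while the per-cluster models keep them separate. Whether the same effect holds on real protein descriptors depends on the data and on the algorithms actually used in the paper.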

