Using clustering to enhance protein sequence classification
| dc.contributor.advisor | Khader, Sameer | |
| dc.contributor.author | Altartouri, Haneen | |
| dc.date.accessioned | 2022-04-07T10:26:37Z | |
| dc.date.accessioned | 2022-05-11T05:33:03Z | |
| dc.date.available | 2022-04-07T10:26:37Z | |
| dc.date.available | 2022-05-11T05:33:03Z | |
| dc.date.issued | 5/1/2013 | |
| dc.description | No. of pages: 109, 26547, Informatics 2/2013, in the store | |
| dc.description.abstract | We introduce a new approach for improving the prediction of biological attributes from protein sequences by combining classification algorithms with clustering analysis. Before classification, clustering analysis is used to find clusters of similar proteins; a classification algorithm is then applied to each cluster. The proposed approach is suited to large datasets where high classification accuracy and fast convergence are required. Several descriptors based on the physicochemical properties of amino acids are used; some are native properties and others are derived properties. Two encoding methods are used to represent the protein sequences with these descriptors. The descriptors and encoding methods are analyzed to enhance the performance of the proposed approach. Three standard benchmark datasets, Caspase, Major Histocompatibility Complex class II (MHC-II), and membrane proteins, are used to evaluate the approach. Many experiments with different parameters are performed, and the results are cross-validated. The results show that applying clustering prior to classification gives higher prediction accuracy than classification without clustering, especially on the membrane proteins and Caspase datasets. In addition, the result of time performance, especially when using the MHC-II ... | en_US |
| dc.identifier.uri | http://test.ppu.edu/handle/123456789/3081 | |
| dc.language.iso | en | en_US |
| dc.publisher | Palestine Polytechnic University - Informatics | en_US |
| dc.subject | enhance protein sequence | en_US |
| dc.title | Using clustering to enhance protein sequence classification | en_US |
| dc.type | Other | en_US |
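
The cluster-then-classify idea described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the record does not name the clustering algorithm, the classifier, the descriptors, or the encoding methods, so k-means, a 1-nearest-neighbour classifier, the Kyte-Doolittle hydropathy index, and a charged-residue fraction are stand-ins chosen here. All function names (`encode`, `kmeans`, `fit`, `predict`) are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy index. An assumed descriptor: the record does
# not list the physicochemical properties actually used.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
CHARGED = set("DEKR")


def encode(seq):
    """Map a sequence to (mean hydropathy, fraction of charged residues)."""
    residues = [r for r in seq.upper() if r in HYDROPATHY]
    if not residues:
        raise ValueError("sequence has no standard amino acids")
    mean_h = sum(HYDROPATHY[r] for r in residues) / len(residues)
    charged = sum(r in CHARGED for r in residues) / len(residues)
    return (mean_h, charged)


def nearest(point, candidates):
    """Index of the candidate closest to point (Euclidean distance)."""
    return min(range(len(candidates)),
               key=lambda i: math.dist(point, candidates[i]))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means, seeded with the first k points for determinism."""
    centroids = list(points[:k])
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def fit(sequences, labels, k):
    """Cluster the encoded training set, then store each cluster's members."""
    points = [encode(s) for s in sequences]
    centroids = kmeans(points, k)
    members = [[] for _ in range(k)]
    for p, y in zip(points, labels):
        members[nearest(p, centroids)].append((p, y))
    return centroids, members


def predict(model, seq):
    """Route a sequence to its cluster, then apply 1-NN inside that cluster."""
    centroids, members = model
    p = encode(seq)
    local = members[nearest(p, centroids)]
    if not local:
        # Empty cluster: fall back to every training point.
        local = [m for group in members for m in group]
    return local[nearest(p, [q for q, _ in local])][1]
```

In this sketch each query is compared only against the training sequences in its own cluster, which is one plausible reading of why clustering before classification could help on large datasets. The toy feature vector and the fixed seeding are simplifications made for brevity.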
