Parallel Scatter Search Algorithm for Feature Selection in High Dimensional Datasets

Adi, Safa

dc.contributor.advisor	Aldasht, Mohammed
dc.contributor.author	Adi, Safa
dc.date.accessioned	2018-05-27T09:03:48Z
dc.date.accessioned	2022-05-11T05:32:43Z
dc.date.available	2018-05-27T09:03:48Z
dc.date.available	2022-05-11T05:32:43Z
dc.date.issued	5/31/2017
dc.identifier.uri	http://test.ppu.edu/handle/123456789/718
dc.description	CD, no. of pages 60, 30130, Informatics 2/2017
dc.description.abstract	Feature selection is a key success factor for classification problems in high-dimensional datasets. It aims to select discriminative subsets of features in order to improve classification performance and reduce learning time. In this thesis we introduce an approach to the classification problem in high-dimensional datasets that uses the scatter search algorithm with a wrapper model. We implemented both sequential and parallel versions of the scatter search algorithm. Five benchmark datasets, all two-class classification problems, are used to evaluate the approach: Mushroom, Spambase, Arcene, Madelon and Gisette. Three of them (Arcene, Madelon and Gisette) are feature selection challenge datasets. The classification performance of the two versions is similar on the Mushroom, Madelon, Gisette and Spambase datasets. On the Arcene dataset, the parallel version improves classification performance from 0.93 to 0.94 compared with the sequential version. The results indicate that the proposed approach is very efficient for feature selection in high-dimensional datasets, since the scatter search algorithm reduces execution time and improves classification performance. A comparative study is conducted against other work in the literature that uses evolutionary algorithms for the classification problem in high-dimensional datasets. The proposed method reduced execution time on all the datasets used in our experiments and improved the classification results, which range from 0.92 to 1.0 across the five datasets.	en_US
dc.language.iso	en	en_US
dc.publisher	Palestine Polytechnic University - Informatics
dc.relation.ispartofseries	30130;74
dc.subject	Vector, Random, Dimension reduction, Parallel programming, Benchmark	en_US
dc.title	Parallel Scatter Search Algorithm for Feature Selection in High Dimensional Datasets	en_US
dc.type	Other	en_US