Parallel Scatter Search Algorithm for Feature Selection in High Dimensional Datasets

Adi, Safa

Files

Parallel Scatter Search Algorithm for Feature Selection in High Dimensional Datasets-30130-2017.pdf (1003.41 KB)

Date

5/31/2017

Authors

Adi, Safa

Publisher

Palestine Polytechnic University - Informatics

Abstract

Feature selection is a key success factor for classification problems in high-dimensional datasets. It aims to select discriminative subsets of features in order to improve classification performance and reduce learning time. In this thesis we introduce an approach to the classification problem in high-dimensional datasets that uses the scatter search algorithm with a wrapper model. We implemented both sequential and parallel versions of the scatter search algorithm. Five benchmark datasets are used to evaluate the approach, all of them two-class classification problems: Mushroom, Spambase, Arcene, Madelon and Gisette. Three of them (Arcene, Madelon and Gisette) come from a feature selection challenge. The classification performance of the two versions is similar for the Mushroom, Madelon, Gisette and Spambase datasets. For the Arcene dataset, the parallel version improves the classification performance from 0.93 to 0.94 compared with the sequential version. The results indicate that the proposed approach is efficient for feature selection in high-dimensional datasets, since the scatter search algorithm reduces execution time and improves classification performance. A comparative study is conducted with other research in the literature that uses evolutionary algorithms to handle the classification problem in high-dimensional datasets. The proposed method reduced the execution time for all the datasets used in our experiments and improved the classification results, which range from 0.92 to 1.0 across the five datasets.

Description

CD, no. of pages 60, 30130, informatics 2/2017

Keywords

Vector, Random, Dimension reduction, Parallel programming, Benchmark

URI

http://test.ppu.edu/handle/123456789/718

Collections

Master of Informatics
