Abstract:
This project aims to apply computer science concepts to the field of biotechnology. The
idea is to apply machine learning algorithms such as Fuzzy C-Means clustering, subtractive
clustering, and genetic algorithms.
A set of pathogenic and non-pathogenic bacteria is selected and clustered based on
genomic duplication features. The clustering is done by extracting a set of features from the
genomic duplications in the DNA sequence of each bacterium. The correlation between
the clusters and a group of biological features is then calculated. To select the best combination of
duplication features, a genetic algorithm is used: each clustering run is evaluated and its fitness
is calculated, and the genetic algorithm selects the combination with the best fitness.
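
The selection loop described above can be sketched roughly as follows. This is only an illustrative sketch, not the project's implementation: the bit-mask encoding, the use of absolute Pearson correlation with a single pathogenicity label as the fitness, the GA settings, and the toy data are all assumptions, and the names fuzzy_c_means, fitness, and select_features are hypothetical.

```python
import random

def fuzzy_c_means(points, c=2, m=2.0, iters=50):
    """Minimal Fuzzy C-Means (c >= 2); returns the hard label (largest membership) per point."""
    n, dim = len(points), len(points[0])
    # Assumption: deterministic start, initial centers taken from evenly spaced rows.
    centers = [list(points[(k * (n - 1)) // (c - 1)]) for k in range(c)]
    u = []
    for _ in range(iters):
        u = []
        for x in points:
            # Euclidean distance to each center, clamped to avoid division by zero.
            d = [max(sum((a - b) ** 2 for a, b in zip(x, v)) ** 0.5, 1e-12) for v in centers]
            u.append([1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                      for j in range(c)])
        for j in range(c):
            w = [row[j] ** m for row in u]
            total = sum(w)
            centers[j] = [sum(wi * x[t] for wi, x in zip(w, points)) / total for t in range(dim)]
    return [max(range(c), key=lambda j: row[j]) for row in u]

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def fitness(mask, data, target):
    """Assumed fitness: |correlation| between cluster labels and a biological label."""
    columns = [i for i, bit in enumerate(mask) if bit]
    if not columns:
        return 0.0  # an empty feature subset cannot be clustered
    points = [[row[i] for i in columns] for row in data]
    return abs(pearson(fuzzy_c_means(points), target))

def select_features(data, target, pop_size=12, generations=15, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    n_feat = len(data[0])
    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    pop[0] = [1] * n_feat  # keep the full feature set as a baseline individual
    score = lambda mask: fitness(mask, data, target)
    best = max(pop, key=score)
    for _ in range(generations):
        parents = sorted(pop, key=score, reverse=True)[:pop_size // 2]  # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)  # one-point crossover
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in a[:cut] + b[cut:]]
            children.append(child)
        pop = parents + children
        best = max(pop + [best], key=score)
    return best, score(best)

# Toy data (made up): rows are bacteria, columns are duplication features.
data = [[0.10, 0.5, 0.3], [0.20, 0.4, 0.6], [0.15, 0.6, 0.4],
        [5.00, 0.5, 0.5], [5.20, 0.4, 0.3], [4.90, 0.6, 0.6]]
target = [0, 0, 0, 1, 1, 1]  # hypothetical label: 1 = pathogenic
best_mask, best_score = select_features(data, target)
print(best_mask, best_score)
```

In this sketch a single binary label stands in for the group of biological features; with several features, one option would be to average the per-feature correlations inside fitness.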
Hierarchical clustering is also applied to each duplication feature separately, so that each
feature can be analyzed in one dimension. The output of the hierarchical clustering is analyzed manually.
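
A one-dimensional hierarchical clustering of a single feature might look like the sketch below. The linkage criterion is not stated in this record, so single linkage is assumed here; the function name agglomerative_1d, the feature values, and the bacterium names are hypothetical.

```python
def agglomerative_1d(values, names):
    """Single-linkage agglomerative clustering of one feature.

    Returns the merge history as (distance, members_a, members_b) tuples,
    which is the information a dendrogram would display.
    """
    clusters = [([name], [value]) for name, value in zip(names, values)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i][1] for b in clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((d, clusters[i][0], clusters[j][0]))
        merged = (clusters[i][0] + clusters[j][0], clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Made-up values of one duplication feature for five bacteria.
names = ["A", "B", "C", "D", "E"]
values = [0.10, 0.12, 0.50, 0.55, 0.90]
merges = agglomerative_1d(values, names)
for distance, left, right in merges:
    print(round(distance, 2), left, right)
```

The printed merge order is what would then be inspected by hand, one feature at a time.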
Description:
No. of pages: 57, 23346, Information Technology 13/2009, in the store