DSpace Repository

Genome Database Indexing Using A Modified Wavelet Transformation And Btree

dc.contributor.advisor Tahboub, Kareem
dc.contributor.author Wohoush, Samer
dc.date.accessioned 2022-04-11T06:10:09Z
dc.date.accessioned 2022-05-11T05:33:13Z
dc.date.available 2022-04-11T06:10:09Z
dc.date.available 2022-05-11T05:33:13Z
dc.date.issued 6/1/2011
dc.identifier.uri http://test.ppu.edu/handle/123456789/3089
dc.description No. of pages: 100, 25719, 26546, informatics 3/2011, 4/2011
dc.description.abstract The main problem in searching genome DNA sequences is their large size and their very long, highly variable lengths. Several methods have been used to speed up sequence searching, such as database indexing in place of direct access to sequence files. Our main idea is to provide an access method for genome DNA sequences that is efficient in both time and space for searching and comparison, taking into account the size of the data and of the index. A genome database search system should offer compact data representation and compression, accurate output, and practical usability, and should minimize the number of I/O operations. I/O operations are needed mainly in the last step, to eliminate false positives (sequences that appear to be related to the query but are not). Pruning reduces the number of candidate sequences that must be checked through database I/O, so the whole database need not be searched. In this thesis, we propose an approach for building a complete index structure that is suitable for searching large databases with efficient storage space and search time. We represent genome DNA sequences using an n-gram Haar wavelet transformation and convert the coefficients to integers. An index structure built upon a modified BTree index holds the integer representation after transformation. We also introduce enhancements that increase system efficiency by decreasing the index storage size. Our structure is called the Modified Wavelet Transformation and BTree (M-WTBT). The M-WTBT structure allows a set of parameters to be tuned so that the index structure suits the available resources. We implemented the structure using a dataset used previously by other researchers, to verify its features and to show its advantages. The M-WTBT is also shown to be effective when compared with a set of previous studies.
Keywords: Sequence transformation, sequence compression, large database indexing, Haar Wavelet Transformation, Genome DNA Sequence searching and indexing en_US
dc.language.iso en en_US
dc.publisher Palestine Polytechnic University - informatics en_US
dc.subject Genome en_US
dc.subject Btree en_US
dc.subject Wavelet en_US
dc.title Genome Database Indexing Using A Modified Wavelet Transformation And Btree en_US
dc.type Other en_US
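
The abstract describes transforming n-grams of a DNA sequence with a Haar wavelet, converting the coefficients to integers, and storing them in a tree index. The following is a minimal sketch of that general idea, not the thesis's M-WTBT implementation. The nucleotide encoding, the n-gram length of 4, the non-overlapping windows, the scaling factor, and all function names are assumptions made for illustration, and a sorted list searched with `bisect` stands in for the modified BTree.

```python
from bisect import bisect_left, insort

# Hypothetical nucleotide encoding; the abstract does not specify one.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


def haar(values):
    """Full Haar decomposition (pairwise average and difference).

    The input length must be a power of two. Returns the overall
    average followed by detail coefficients, coarsest level first.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    details = []
    current = list(values)
    while len(current) > 1:
        pairs = list(zip(current[0::2], current[1::2]))
        level = [(a - b) / 2 for a, b in pairs]
        current = [(a + b) / 2 for a, b in pairs]
        details = level + details
    return current + details


def ngram_keys(sequence, n=4):
    """Yield (integer_key, offset) for each non-overlapping n-gram."""
    # With integer inputs, every coefficient is a multiple of 1/n,
    # so multiplying by n gives exact integers (an assumed scaling).
    for offset in range(0, len(sequence) - n + 1, n):
        gram = sequence[offset:offset + n]
        coeffs = haar([CODES[c] for c in gram])
        yield tuple(round(c * n) for c in coeffs), offset


def build_index(sequences, n=4):
    """Sorted list of (key, seq_id, offset); a stand-in for a BTree."""
    index = []
    for seq_id, seq in sequences.items():
        for key, offset in ngram_keys(seq, n):
            insort(index, (key, seq_id, offset))
    return index


def candidates(index, query, n=4):
    """Return (seq_id, offset) entries whose key equals the query's first n-gram key."""
    key, _ = next(ngram_keys(query, n))
    i = bisect_left(index, (key,))
    hits = []
    while i < len(index) and index[i][0] == key:
        hits.append(index[i][1:])
        i += 1
    return hits


if __name__ == "__main__":
    idx = build_index({"s1": "ACGTAAAA", "s2": "TTTTACGT"})
    print(candidates(idx, "ACGT"))
```

In this sketch the lookup only matches the first n-gram of the query, so the returned entries are candidates: a longer query would still have to be verified against the stored sequences, which corresponds to the final I/O step the abstract mentions.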

