Recent Patents on DNA & Gene Sequences
ISSN (Print): 1872-2156
ISSN (Online): 2212-3431
DOI: 10.2174/1872215611307020003

Pattern Matching in Indeterminate and Arc-Annotated Sequences

Author(s): Md Tanvir Islam Aumi, Tanaeem M. Moosa and M. Sohel Rahman
Pages 96-104 (9)
In this paper, we present efficient algorithms for finding indeterminate arc-annotated patterns in indeterminate arc-annotated references. Our algorithms run in O(m + nm/w) time, where n and m are the lengths of the reference and pattern strings, respectively, and w is the target machine word size. We assume a constant alphabet size because indeterminate arc-annotated sequences are used to model biological sequences. For short patterns (m = O(w)), the nm/w term reduces to O(n), so our algorithms run in linear time; efficient algorithms for matching short patterns against reference genomes have many applications in practical settings. We have also applied our algorithms to scanning for ncRNAs without pseudoknots. We scanned three whole human chromosomes, and scanning one whole chromosome for one ncRNA family took only 2.5 to 4 minutes. Some relevant patents are discussed in [1, 2].
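The O(m + nm/w) bound is characteristic of bit-parallel matching, where one machine word tracks up to w pattern positions at once. The sketch below illustrates that idea with the classical Shift-And technique extended to indeterminate symbols; it is not the authors' algorithm and it ignores arc annotations entirely. The function name `shift_and_indeterminate`, the use of IUPAC nucleotide codes as the indeterminate alphabet, and the rule that two symbols match when their base sets intersect are all assumptions made for illustration. Python integers are arbitrary precision, so the word size w is only simulated here.

```python
# IUPAC nucleotide codes: each symbol stands for a set of concrete bases.
# Using IUPAC as the indeterminate alphabet is an assumption of this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def shift_and_indeterminate(text, pattern):
    """Return the start positions at which pattern matches text.

    Two symbols are treated as matching when their base sets intersect
    (an assumed matching rule; arc annotations are not modelled).
    """
    m = len(pattern)
    if m == 0:
        return []

    # Preprocessing: masks[c] has bit i set when a text symbol c is
    # compatible with pattern[i]. The alphabet is constant, so this is O(m).
    masks = {}
    for symbol, bases in IUPAC.items():
        mask = 0
        for i, p in enumerate(pattern):
            if set(bases) & set(IUPAC[p]):
                mask |= 1 << i
        masks[symbol] = mask

    accept = 1 << (m - 1)  # bit that signals a full-length match
    state = 0              # bit i set: pattern[0..i] matches a suffix of the text read so far
    hits = []
    for j, symbol in enumerate(text):
        # Extend every partial match by one symbol and start a new one at bit 0.
        state = ((state << 1) | 1) & masks[symbol]
        if state & accept:
            hits.append(j - m + 1)
    return hits


if __name__ == "__main__":
    print(shift_and_indeterminate("ACGTNACGA", "ACGR"))
```

Each text symbol costs one shift, one OR, and one AND on an m-bit state. With w-bit words that is about m/w word operations per symbol, which is where the nm/w term comes from; when the pattern fits in a single word, the scan is linear in n.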
Keywords: Indeterminate Sequence, Arc-Annotated Sequence, Sequence Matching, Bioinformatics, Patent.
Affiliation: AℓEDA Group, Department of CSE, BUET, Dhaka-1000, Bangladesh.